
The issue of false positives and false negatives is glaring in data sets of these sizes.

5% of France is about 3.3 million unique records.

From what I could see online, the false positive rates of various DNA tests are between 0.01% [0] and 40% [1]. Let's call it somewhere in between and say 20% are false positives, or 1 in 5 people.

This is just a lead in a case, sure, but that means that 1 out of 5 cases has a false lead, wasting a lot of time and resources. I've no idea what the false positive rate for a 'normal' case is like; it could very easily be higher.

Granted, this is today's false positive rate, and it should get a lot better over time. But down to what percentage, and how long will that take?
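To put those rates next to the database size, here is a rough sketch. It assumes, purely for illustration, that a single search compares against every one of the 3.3 million records and that the false positive rate applies independently to each comparison. That is my assumption, not something the linked sources state, and `expected_false_matches` is just a made-up helper name.

```python
# Illustrative only: assumes one search touches every record and that the
# false positive rate applies independently to each comparison.
RECORDS = 3_300_000  # "5% of France", the figure used above


def expected_false_matches(records: int, false_positive_rate: float) -> float:
    """Expected number of records wrongly flagged as a match."""
    return records * false_positive_rate


# The low and high ends of the quoted range, plus the 20% guess.
for label, rate in [("0.01%", 0.0001), ("20%", 0.20), ("40%", 0.40)]:
    count = expected_false_matches(RECORDS, rate)
    print(f"false positive rate {label}: ~{count:,.0f} false matches")
```

Under that assumption the low end gives about 330 false matches per search and the 20% guess gives about 660,000, which is why the rate matters so much at this scale.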

Then you have the much more pressing issue of the false negative rate. I did not look too hard, but that rate wasn't simple to find, and I've no idea what it is. In DNA cases, a high false negative rate means a lot of potentially dangerous people could fall through cracks in the system. Let's pull a number straight out of nowhere and say that the false negative and false positive rates are the same, about 20%. That would mean your odds of flagging the 'right' criminal (sensitivity [2]) are about 80%, and your odds of clearing people who are actually innocent (specificity) are also about 80%. Meaning that for any random criminal case using the DNA database as a lead, you only have about a 64% (0.8 × 0.8) chance of getting useful information out of the DNA database [3].
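The arithmetic behind that 64% can be sketched as follows. It treats "useful" as "the true source gets flagged and an innocent person gets cleared", and it multiplies the two rates as if they were independent. Both are simplifying assumptions on my part, and `lead_quality` is a hypothetical name.

```python
def lead_quality(false_positive_rate: float, false_negative_rate: float):
    """Return (sensitivity, specificity, product) for the given error rates."""
    sensitivity = 1 - false_negative_rate  # true source correctly flagged
    specificity = 1 - false_positive_rate  # innocent person correctly cleared
    # Multiplying assumes the two outcomes are independent (an assumption).
    return sensitivity, specificity, sensitivity * specificity


sens, spec, both = lead_quality(false_positive_rate=0.20, false_negative_rate=0.20)
print(f"sensitivity {sens:.0%}, specificity {spec:.0%}, both {both:.0%}")
```

With both error rates at the guessed 20%, the product comes out to roughly 64%. Plugging in other guesses shows how quickly that figure moves.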

[0]https://www.ptclabs.com/relationship-dna/more-information/fa...

[1]https://www.nature.com/articles/gim201838

[2]https://en.wikipedia.org/wiki/Sensitivity_and_specificity

[3]https://meta.wikimedia.org/wiki/Cunningham%27s_Law


