Hacker News

"Anonymization," in the sense of transforming a dataset so that it's still useful but doesn't significantly reduce the privacy of the people it describes, is usually impossible, or at least beyond the state of the art. People start out with just a few tens of bits of anonymity, and identifying bits are everywhere.
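To put rough numbers on "a few tens of bits", here is a back-of-the-envelope sketch. The population figure and the attribute cardinalities are illustrative assumptions, and it pretends every attribute is uniformly distributed and independent of the others, which real data is not.

```python
import math

# Assumed round figure for the world population (illustrative only).
WORLD_POPULATION = 8_000_000_000

# Hypothetical attributes and the number of equally likely values each
# is assumed to take. Real distributions are skewed and correlated.
ATTRIBUTES = {
    "birth_day_of_year": 365,
    "birth_year": 80,
    "sex": 2,
}


def bits_to_identify(population: int) -> float:
    """Bits needed to single out one person in a population."""
    return math.log2(population)


def bits_revealed(num_equally_likely_values: int) -> float:
    """Bits leaked by learning one uniformly distributed attribute."""
    return math.log2(num_equally_likely_values)


budget = bits_to_identify(WORLD_POPULATION)
leaked = sum(bits_revealed(n) for n in ATTRIBUTES.values())
remaining = budget - leaked

print(f"anonymity budget: {budget:.1f} bits")
print(f"leaked by {len(ATTRIBUTES)} attributes: {leaked:.1f} bits")
print(f"remaining: {remaining:.1f} bits")
```

Under these assumptions the whole budget is just under 33 bits, and three mundane attributes already use up about half of it.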

You probably have a better chance of creating your own secure block cipher than of achieving this goal. In a similar way, your inability to see what's wrong with your scheme is not evidence that it works.

I don't like to be negative, and I'm all for continued research, but at this point the conservative thing to do with data that you need to "anonymize" is delete it.



Agreed. The more alarming angle to consider is that the more a particular dataset describes somebody, 1) the more valuable it is in the context of surveillance and advertising, 2) the more work good-faith actors should put into anonymising it, and most importantly, 3) the easier it is to de-anonymise through correlation with other datasets.

~~People just aren't the unique snowflakes our mothers told us we are.~~ Most people, for example, can be uniquely (and easily) identified with just a DOB, first name, and suburb.

Edit: maybe the problem is actually that we are too unique :)
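
For anyone who wants to check the DOB, first name, and suburb claim against their own data, a minimal sketch: count how many records share each combination of those fields and see what fraction sit in a group of one. The field names and the toy records below are made up for illustration; nothing here comes from a real dataset.

```python
from collections import Counter

# Made-up toy records (hypothetical field names and values).
records = [
    {"first_name": "Alice", "dob": "1990-01-01", "suburb": "Newtown"},
    {"first_name": "Alice", "dob": "1990-01-01", "suburb": "Newtown"},
    {"first_name": "Bob", "dob": "1985-05-05", "suburb": "Glebe"},
    {"first_name": "Carol", "dob": "1990-01-01", "suburb": "Newtown"},
]


def unique_fraction(rows, keys):
    """Fraction of rows whose values for `keys` appear exactly once."""
    if not rows:
        return 0.0
    # Group rows by their combination of quasi-identifier values.
    combos = [tuple(row[k] for k in keys) for row in rows]
    group_sizes = Counter(combos)
    unique_rows = sum(1 for combo in combos if group_sizes[combo] == 1)
    return unique_rows / len(rows)


all_three = unique_fraction(records, ("first_name", "dob", "suburb"))
without_name = unique_fraction(records, ("dob", "suburb"))

print(f"unique on name + DOB + suburb: {all_three:.0%}")
print(f"unique on DOB + suburb only:   {without_name:.0%}")
```

A row in a group of size one is uniquely identified by those fields alone, which is the same idea as checking for k-anonymity with k = 1. Dropping or coarsening a field can only merge groups, so it never increases the number of distinct combinations.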



