Hacker News

With regard to privacy, what’s the difference between your email’s text stored on a server, and your email’s text alongside the output of the text processed through a LLM? If “they” can already look at the text, what more privacy is there to lose?


There's a great deal of privacy in simply being a needle in a haystack. Part of the processing that's possible with an LLM is filtering.

Imagine you've sent an email about transporting a friend's daughter across state lines to get a medically-necessary abortion. Or if you prefer, imagine you've arranged via email to "lose" some firearms which don't comply with your state's new assault weapons ban.

Pre-LLMs, finding these sorts of emails was very hard. A simple text search for "abortion" or "gun" turns up far more emails in which two family members got into a political debate than emails about lawbreaking. Big Brother will find a few such emails here and there by chance, but the vast majority of incriminating emails are simply lost in the pile.

Enter LLMs. Big Brother can feed the few incriminating emails found by chance into a training dataset alongside a batch of non-incriminating ones, teach a model to distinguish the two, and then run that model over the entire mail corpus to get a neatly filtered list of only the incriminating emails, further tuning the model by adding any emails it misclassifies to the training dataset as they are discovered.
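To make the loop concrete, here's a minimal sketch of that filter-and-retrain cycle using a toy bag-of-words naive Bayes classifier in place of an LLM (the example emails and the stand-in model are illustrative assumptions, not anything from the comment):

```python
# Toy sketch of the described pipeline: train a classifier on a few
# emails labeled "of interest" vs. "mere political debate", score a
# corpus, and fold misclassified examples back into the training set.
# A real adversary would fine-tune an LLM; naive Bayes is a stand-in.
from collections import Counter
import math

def tokens(text):
    return text.lower().split()

class NaiveBayes:
    def __init__(self):
        self.counts = {0: Counter(), 1: Counter()}
        self.docs = {0: 0, 1: 0}

    def train(self, text, label):
        self.counts[label].update(tokens(text))
        self.docs[label] += 1

    def score(self, text):
        # log-odds that the email is "of interest" (label 1),
        # with Laplace smoothing so unseen words don't zero out
        total = {c: sum(self.counts[c].values()) for c in (0, 1)}
        vocab = len(set(self.counts[0]) | set(self.counts[1]))
        s = math.log((self.docs[1] + 1) / (self.docs[0] + 1))
        for w in tokens(text):
            p1 = (self.counts[1][w] + 1) / (total[1] + vocab)
            p0 = (self.counts[0][w] + 1) / (total[0] + vocab)
            s += math.log(p1 / p0)
        return s

# seed labels: the handful of emails found "by chance" (hypothetical)
seed = [
    ("we need to lose those rifles before the ban takes effect", 1),
    ("driving her across state lines for the procedure friday", 1),
    ("uncle bob started another gun control debate at dinner", 0),
    ("the abortion segment on the news got everyone arguing", 0),
]
clf = NaiveBayes()
for text, label in seed:
    clf.train(text, label)

# apply the model to the whole corpus and keep only positive scores
corpus = [
    "another heated political debate about guns with dad",
    "can you help me lose the rifles this weekend",
]
flagged = [e for e in corpus if clf.score(e) > 0]

# the feedback loop: any email the model gets wrong is hand-labeled,
# fed back via clf.train(...), and the corpus is re-scored
```

The filtering step is what changes the economics: a keyword search returns the haystack, while the retrained model returns only the needles, getting sharper with every correction.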



