Access to tech is different from the handling of personal data, though -- the EU's GDPR rules around that are clear and fair.
People have a right to know where their personal data is going, what is being stored, and what it is being used for, and they should have a mechanism to correct and delete it.
The wider challenge is how that is handled in a compliant way with LLMs and generative tools, which vendors do not seem to be taking particularly seriously yet.
> The wider challenge is how that is handled in a compliant way with LLMs and generative tools, which vendors do not seem to be taking particularly seriously yet.
I'm curious as to why people would want to train LLMs on personally identifiable information. What's the benefit of an LLM that has a large collection of names, addresses, dates of birth, etc.?
Free-form text like Reddit posts contains a whole load of PII. Since there is absolutely no regard for what goes into an LLM, the models naturally end up containing this PII as well.