That seems even worse: they had intent to steal and now they're trying to make sure it is properly legislated so nobody else can do it, thus reducing competition.

GPT can't get retroactively untrained on stolen data.




Google actually can “untrain,” afaik. My limited understanding is that they have good controls over their data and its sources, because they know it could be important in the future. For GPT, I'm not sure.

I’m not sure what you mean by “steal,” because it’s a relative term now. Me reading your book isn’t stealing if I paid for it and it inspires me to write my own novel with a totally new story. And if you posted your book online, as of right now the legal precedent is that you didn’t make any claims to it (anyone could read it for free), so it’s fair game to train on, just like the text I’m writing now, which also has no protections.

Nearly all of Reddit's history up to a certain date is available for download online now; only after they changed their policies did they start having tighter controls over how their data could be used.



