
But I can't legally obtain the book to read and learn from without me (or a library) paying for it. Let's start there first.


Yes, but the learning isn't constrained by those laws. If I steal a book and read it, I'm guilty of the crime of theft. You can put me in jail, try me before a jury, fine me, and put me in prison according to whatever laws I broke.

Nothing in my sentence constrains my ability to teach someone else the stuff I learned, though! In fact, the First Amendment makes it pretty damn clear that nothing can constrain that freedom.

Also, note that the example is malformed: in almost all these cases, Meta et al. aren't "stealing" anything anyway. They're downloading and reading stuff on the internet that is available for free. If you or I can't be prosecuted for reading a preprint from arXiv.org or whatever, it's a very hard case to make that an AI can.

Again, copyright isn't the tool here. We need better laws.


Sure, but OpenAI (same as Google, and Facebook, and all the others) is illegally copying the book, and they want this to be legal for them.

It's perhaps arguable whether it's OK for an LLM to be trained on freely available but licensed works, such as the Linux source code. There you can get into arguments about learning vs. machine processing, whether the LLM is a derived work, etc.

But it's not arguable that copying a book that you have not even bought to store in your corporate data lake to later use for training is a blatant violation of basic copyright. It's exactly like borrowing a book from a library, photocopying it, and then putting it in your employee-only corporate library.


> copyright isn't the tool here

It's not the only tool. I agree that "use for ML" should be an additional right.

What people are pissed about is that copyright only ever serves to constrain the little guys.

> If I steal a book and read it, I'm guilty of the crime of theft

You or I would never dare to do this in the first place.


> Meta et al. aren't "stealing" anything anyway

They were caught downloading the entirety of libgen.


Downloading a pirated copy and reading it yourself is one thing; running a business based on downloading millions of pirated works is another.



