But I can't legally obtain the book to read and learn from without me (or a libr...

ajross · 2025-03-13T17:22:51 1741886571

Yes, but the learning isn't constrained by those laws. If I steal a book and read it, I'm guilty of the crime of theft. You can put me in jail, try me before a jury, fine me, and put me in prison according to whatever laws I broke.

Nothing in my sentence constrains my ability to teach someone else the stuff I learned, though! In fact, the First Amendment makes it pretty damn clear that nothing can constrain that freedom.

Also, note that the example is malformed: in almost all these cases, Meta et al. aren't "stealing" anything anyway. They're downloading and reading stuff on the internet that is available for free. If you or I can't be prosecuted for reading a preprint from arXiv.org or whatever, it's a very hard case to make that an AI can.

Again, copyright isn't the tool here. We need better laws.

tsimionescu · 2025-03-13T17:39:34 1741887574

Sure, but OpenAI (same as Google, and Facebook, and all the others) is illegally copying the book, and they want this to be legal for them.

It's perhaps arguable whether it's OK for an LLM to be trained on freely available but licensed works, such as the Linux source code. There you can get into arguments about learning vs. machine processing, whether the LLM is a derived work, etc.

But it's not arguable that copying a book that you have not even bought to store in your corporate data lake to later use for training is a blatant violation of basic copyright. It's exactly like borrowing a book from a library, photocopying it, and then putting it in your employee-only corporate library.

triceratops · 2025-03-13T17:37:41 1741887461

> copyright isn't the tool here

It's not the only tool. I agree that "use for ML" should be an additional right.

What people are pissed about is that copyright only ever serves to constrain the little guys.

> If I steal a book and read it, I'm guilty of the crime of theft

You or I would never dare to do this in the first place.

riversflow · 2025-03-13T17:43:07 1741887787

> Meta et al. aren't "stealing" anything anyway

They were caught downloading the entirety of libgen.

codedokode · 2025-03-14T01:43:25 1741916605

Downloading a pirated copy and reading it yourself is one thing; running a business based on downloading millions of pirated works is another.