Some people are more concerned about a world so dumb, or acting in such bad faith, that it can create and tolerate confusion between "questionable possession" and "theft".
Which, paradoxically, calls for more and more intellectual practice, which is a key purpose of the access to culture we have valued for millennia.
(A similar confusion lies in the idea mentioned above that Meta did something wrong in processing texts - we can access all available texts.)
Either Meta did nothing wrong and therefore individuals who pirate ebooks do nothing wrong as well and the concept of piracy should not exist/not be illegal;
Or piracy is actually theft (as it supposedly is when individuals do it) and Meta did millions of counts of it and therefore should pay trillions in damages, be dissolved, have Zuck go to jail, or all three.
What is the accusation: having had an automaton read a million books? I repeat from the previous post: we are entitled to have read all the available published books. (And more than entitled: encouraged to.) That is what libraries are for.
"Piracy" in that context is coming into possession of something you are not entitled to own. This latter point is thin, a stub, just to say that they are different things; the one above is not (it could be expanded, but that would not change it).
I think any answer to that question needs to be considered carefully, at least in a legal context, since it could end up having unintended consequences.
LLMs ingest works but do not regurgitate them, so the product can be considered transformative. From my understanding of these models, they do not retain the original works. (There are probably reasons for the companies to retain the original works, but that is an entirely different matter.) So equating a trained model with a copyright violation is akin to suggesting that the knowledge, rather than the content, is copyrightable. Do we really want to enter that territory?
The other route of attack is via how the materials were acquired. This can create problems from several perspectives. If companies had to purchase each work in order to train a model, the process would only be accessible to very well-financed corporations. Libraries as well, since they are essentially in the business of purchasing works (albeit for an entirely different purpose). If you allowed borrowed works to be used while training models, the notion of lending would likely come under attack. I'm not sure we want to go there either. Then there is the question of online materials that are freely available. What would protect them?
I'm not a fan of AI and I am even less of a fan of Meta. I would love to see them have the book thrown at them. I'm just uncomfortable with the potential repercussions of throwing the book at them.
Critical points nailed. This current weird state of the world is missing the basic principles, which must be stressed. I find them trivial, but the social (and political) issue is that they are not trivial to many actors.
There is a "right to learn". There is a "right to access". And there are values to pursue, and urgencies to tackle (a world collapsing on its own cognitive faults)...