> They consumed publicly available material on the Internet
I agree that there are some important distinctions and word-choices to be made here, and that there are problems with equating training to "stealing", and that copyright infringement is not theft, etc.
That said, if you zoom out to the overall conduct, it's fair to argue that the companies are doing something unethical, the same as if they paid an army of humans to memorize other people's work and then regurgitate slightly-reworded copies.
> That said, if you zoom out to the overall conduct, it's fair to argue that the companies are doing something unethical, the same as if they paid an army of humans to memorize other people's work and then regurgitate slightly-reworded copies.
I would use the analogy of those humans learning from the material, like reading books in the library.
"Regurgitate slightly-reworded copies": in my (not insubstantial) experience using LLMs, that is an unfairly pejorative take on what they do.