Because, unlike humans, LLMs can reliably reproduce exact excerpts from their training data. It's very easy to get image-generation models to spit out near-verbatim stills from movies.
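
For concreteness, here's a minimal sketch of one way to flag that kind of verbatim reproduction: index every fixed-length character window of a training corpus, then report any window of the model's output that matches the index exactly. The function names, the toy corpus, and the window size n are illustrative assumptions, not any real system's API.

    # Sketch only: a toy exact-match detector, not any vendor's actual tooling.

    def build_ngram_index(corpus_docs, n=50):
        # Index every length-n character window that appears in the corpus.
        index = set()
        for doc in corpus_docs:
            for i in range(len(doc) - n + 1):
                index.add(doc[i:i + n])
        return index

    def verbatim_offsets(output_text, index, n=50):
        # Offsets in output_text whose n-char window matches the corpus exactly.
        return [i for i in range(len(output_text) - n + 1)
                if output_text[i:i + n] in index]

    # Usage: any hit means an exact n-character excerpt was copied.
    corpus = ["int main(void) { return 0; }  /* ...rest of a GPL'd file... */"]
    index = build_ngram_index(corpus, n=20)
    hits = verbatim_offsets("int main(void) { return 0; }", index, n=20)
    print(len(hits), "exact 20-char excerpts found")

Exact n-gram matching like this only catches literal copying; lightly paraphrased output slips through, which is part of why this kind of plagiarism is hard to detect at scale.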

That doesn't mean all output from an LLM trained on GPL code is a derivative work (and therefore GPL'd too).

A model that provably engages in systematic, difficult-to-detect plagiarism must itself be considered plagiaristic.