
I think they had the advantage of being ahead of the law in this regard. To my knowledge, reading copyrighted material isn't (or wasn't) illegal, and it remains a legal grey area.

Distilling weights from prompts and responses is even more of a legal grey area. The legal system cannot respond quickly to such technological advancements so things necessarily remain a wild west until technology reaches the asymptotic portion of the curve.

In my view the most interesting thing is, do we really need vast data centers and innumerable GPUs for AGI? In other words, if intelligence is ultimately a function of power input, what is the shape of the curve?



The main issue is that they've had plenty of instances where the LLM outputted copyrighted content verbatim, as happened with the New York Times and some book authors. And then there's DALL-E, which is baked into ChatGPT and, before all the guardrails went up, was clearly trained on copyrighted content, to the point that it reproduced people's watermarks as well as their styles, just as Stable Diffusion mixes can do (if you don't prompt it out).

As you've put it, it's still a somewhat grey area, and I personally have nothing against them (or anyone else) using copyrighted content to train models.

I do find it annoying that they're so closed-off about their tech when it's built on the shoulders of openness and other people's hard work. And then they turn around and throw hissy fits when someone copies their homework, allegedly.


> Distilling weights from prompts and responses is even more of a legal grey area.

Actually, unless the law changes, this is fairly settled territory in US law. AI output is not copyrightable and is therefore in the public domain. The only legal avenue of attack OpenAI has is a Terms of Service violation, which is a much weaker claim than copyright infringement, if it even holds.


> if intelligence is ultimately a function of power input, what is the shape of the curve?

According to a quick Google search, the human body consumes ~145W of power averaged over 24h (eating 3000 kcal/day). The brain needs ~20% of that, so 29W/day. That's much less than our current designs of software and (especially) hardware for AI.
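The arithmetic behind those figures can be sketched as a quick back-of-envelope check (assuming the 3000 kcal/day intake and ~20% brain share cited above):

```python
# Back-of-envelope: convert daily food energy into average power draw.
KCAL_TO_J = 4184             # 1 kcal = 4184 joules
SECONDS_PER_DAY = 24 * 3600

daily_kcal = 3000            # assumed daily intake from the comment above
body_watts = daily_kcal * KCAL_TO_J / SECONDS_PER_DAY
brain_watts = 0.20 * body_watts   # brain uses roughly 20% of the body's energy budget

print(f"body:  ~{body_watts:.0f} W")   # ~145 W
print(f"brain: ~{brain_watts:.0f} W")  # ~29 W
```

So 3000 kcal/day works out to about 145 W of continuous power, and a fifth of that is roughly 29 W for the brain.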


I think you mean the brain uses 29W (i.e., not 29W/day). Also, I suspect that burgers are a higher-entropy energy source than electricity, so perhaps it is even less than that.


Illegally acquiring copyrighted material has always been highly illegal in France, and I'm sure in most other countries. Disney is another example of how it is not grey at all.



