
Yeah, no. We can move an arbitrary amount of data around the world at breakneck speed. Netflix does this for a living. It's not practical to hand out the training material because of the massive, rampant copyright violations.


If a research group downloads material in order to train a model, is there some significant difference in copyright violation if they hand it to a second research group in order to fulfill the same purposes?


Yes, because of a key word in a lot of copyright laws... "distribution". Using that copyrighted material themselves to train the model still gives them plausible deniability. Handing the copyrighted material to another group starts to run afoul of other laws and also removes the plausible deniability that the original group can claim regarding their training data.


The training data is not necessarily kept. It's possible that the data is consumed, incorporated into the weights, and then discarded.


If only we'd figured out a technology that let us move huge torrents of bits around.

If only there was a catchy name for it. Something like bit-torrent perhaps?



