
I would happily contribute all of my public source code under whatever license to a dataset that required models to also be open sourced. I am not OK with Microsoft creating a derivative work (Copilot's model) off of GPL code and not releasing the weights under the GPL.


I tend to agree, but to play devil's advocate: if we were talking about a biological neural network (a person) training themselves by looking at GPL code, the GPL would of course not apply, in general, to code they release later.


> if we were talking about a biological neural network (person) training themselves by looking at GPL code...

The thing is, said person both reads far less code than a non-biological neural network and emits its derivations based on many inputs beyond the code it ingested via its high-resolution, multi-focal, adaptive light sensors: experiences, communication with other biological neural networks, human-machine code translators (compilers), daily unpredictable hormone fluctuations, and countless other inputs that affect every aspect of its cranial muscle and its daily living circumstances and choices.

IOW, a neural network is not a person, nor does it learn, derive, or emit the same way one does.

This is akin to claiming that a Furby is a person just because it can babble and blink.


I'm not saying all neural networks are people; I'm saying people are, by definition, a subset of neural networks. We don't have any idea how consciousness works, and our brains are essentially still black boxes.

In a similar way, no one really has an intuitive understanding of how these ML models actually work (we treat them as mostly black boxes in practice), in contrast to looking at an equation, for instance. I have played with some of these text generation models, and frankly we are already at the point where deciding whether or not they pass the Turing test depends on the details rather than the spirit of the rules for the test. It may not be a coincidence that NNs designed to replicate our own brain structure also replicate important aspects of our cognition.


These are not living beings though. They are programs. No one is arguing against humans learning.

>Neural networks approximate the function represented by your data.
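To make that quoted claim concrete, here is a minimal sketch, assuming PyTorch and an arbitrary toy target (I picked sin; the layer sizes and learning rate are likewise arbitrary), of a tiny network fit purely to sampled (x, y) pairs:

    import torch
    import torch.nn as nn

    # The network never sees "sin"; it only sees sampled (x, y) pairs.
    x = torch.linspace(-3.14, 3.14, 256).unsqueeze(1)
    y = torch.sin(x)

    # A small feed-forward net; one hidden layer suffices for a smooth 1-D target.
    model = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)

    for step in range(2000):
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(x), y)
        loss.backward()
        opt.step()

    print(loss.item())  # small MSE: the net now approximates sin on [-pi, pi]

Tanh and 32 hidden units are not special here; any reasonable small MLP does the same job of approximating the function represented by the data.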


The amount of data (not just code) we would need sign-off on is prohibitively large. Once you account for all the stakeholders, this won't be easy at all.

Meanwhile, the institutions will leap ahead of us. Their models and annotated datasets will forever be out of reach, and open source equivalents will lag far behind the state of the art.


Strongly disagree. Institutions have an advantage in already having the data, but they lack the capacity to create more.

An intentional open source dataset could target new domains that there was no institutional will to pursue. I strongly believe that the open source community's capabilities far exceed those of any single large corporation.



