
Llama isn't open source either. But if I understand your point correctly, you're saying that the commercial use axis is what is important to people, and it's orthogonal to freeware vs open source. In the present environment, I agree. But I don't think we should let companies get away with poisoning the term open source for things which are not. I also believe that actual open source models have the near-term opportunity to make an impact and shape the future landscape, with RedPajama and others in the works. The distinction could be important in the near term, given the rate at which this field is developing.


Neural network weights are better viewed as source code because they specify what function the network computes. Since we're dealing purely with feed-forward networks, there are no loops. Therefore, the weights fully describe everything relevant for executing the function they represent on inputs. Weights can be seen as a sort of intermediate language (with lots of stored data and partially computed states) interpretable by some deep learning library.
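To make that concrete, here's a minimal sketch in NumPy (the two-layer net and all its numbers are made up for illustration): once the arrays below are fixed, the computed function is fully determined, with no loops or control flow, just the stored numbers.

    import numpy as np

    # Hypothetical weights for a tiny two-layer feed-forward net.
    # Once these arrays are fixed, the function below is fully determined.
    W1 = np.array([[0.5, -1.0], [1.5, 0.2]])   # first-layer weights
    b1 = np.array([0.1, -0.3])                 # first-layer bias
    W2 = np.array([[1.0], [-0.7]])             # second-layer weights
    b2 = np.array([0.05])                      # second-layer bias

    def forward(x):
        # No control flow: just fixed arithmetic on the inputs,
        # which is why the weights play the role of the "program".
        h = np.maximum(0.0, x @ W1 + b1)       # ReLU hidden layer
        return h @ W2 + b2

    print(forward(np.array([1.0, 2.0])))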

The network architecture itself is not source code, but a rough specification constraining the optimizer, which searches for possible program descriptions that, within the specified constraints, minimize some loss function with respect to the data.

Neither the data nor the network architecture is the actual source; they are better seen as recipes which, if followed (at great expense), allow finding behaviorally similar programs. As you can see, the standard ideas of open source don't quite carry over, because the actual "source code" is not human-interpretable.
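A toy version of that recipe framing (all names and numbers here are made up): architecture plus data plus optimizer is a search procedure, and the weights it finds are the actual program. Rerunning the recipe from a different random seed yields a behaviorally similar but not identical one.

    import numpy as np

    rng = np.random.default_rng(0)

    # "Data": noisy samples of y = 3x + 1 (fabricated for illustration).
    x = rng.uniform(-1, 1, size=100)
    y = 3 * x + 1 + 0.1 * rng.normal(size=100)

    # "Architecture": a one-parameter linear model y_hat = w*x + b.
    # The recipe: start from random weights and let the optimizer search
    # for values that minimize squared error on the data.
    w, b = rng.normal(), rng.normal()
    lr = 0.1
    for _ in range(500):
        y_hat = w * x + b
        grad_w = 2 * np.mean((y_hat - y) * x)   # d(loss)/dw
        grad_b = 2 * np.mean(y_hat - y)         # d(loss)/db
        w -= lr * grad_w
        b -= lr * grad_b

    # The found (w, b) are the actual "source" that gets executed;
    # the data and model shape were only the recipe for finding them.
    print(w, b)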


> Weights can be seen as a sort of intermediate language (with lots of stored data and partially computed states) interpretable by some deep learning library.

I've often talked about weights being the equivalent of assembly; your note seems to map to a similar intuition. In that sense, provided we ever solve the interpretability problem, we could in theory disassemble the weights to achieve outcomes similar to asm-to-C decompilation. It's an interesting thought experiment: if the weights ought not be classified as open source (notwithstanding your first point, which I agree with), can the disassembled output be classified as open source?
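As a toy version of that thought experiment (everything below is hypothetical): a single threshold neuron whose weights happen to implement AND, plus a "decompiler" that reads the raw weights back into a human-readable rule, the way asm-to-C recovers structure from machine code.

    import numpy as np

    # Hypothetical weights for one threshold neuron that happens to
    # implement AND on two binary inputs: fires only when x1 + x2 > 1.5.
    w = np.array([1.0, 1.0])
    bias = -1.5

    def neuron(x):
        return float(x @ w + bias > 0)

    def disassemble(w, bias):
        # Toy "decompiler": recover a readable rule from the raw weights.
        terms = " + ".join(f"{wi:g}*x{i+1}" for i, wi in enumerate(w))
        return f"fires iff {terms} > {-bias:g}"

    for a in (0, 1):
        for b in (0, 1):
            print(a, b, neuron(np.array([a, b])))
    print(disassemble(w, bias))   # -> fires iff 1*x1 + 1*x2 > 1.5

For real models the hard part is, of course, that no such closed-form rule is known to exist, which is exactly the interpretability problem.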


> But I don't think we should let companies get away with poisoning the term open source for things which are not.

That's totally fair. And you're correct that I was making an argument for positive outcomes being orthogonal to the semantic distinction.

> I also believe that actual open source models have the near-term opportunity to make an impact and shape the future landscape, with RedPajama and others in the works. The distinction could be important in the near term, given the rate at which this field is developing.

I think Falcon and MPT support your point as well, but those models were still trained on very small budgets relative to Llama or GPT-3/4. There's a clear quality delta, though that gap is closing. Through that lens, I think having a large, well-funded org doing the pre-training work for the OSS community and releasing the weights permissively is a net positive.


Sen. Marsha Blackburn said “fair use” protections have become a “fairly useful way to steal” intellectual property. Some people would like to use this situation to get rid of "fair use".



