It started by accident, with the original llama weights being leaked by two separate employees. Meta has since embraced opening the weights, which I'm all for.
As for why? I have a theory: Meta is not in a position to capitalize on the model itself. Yes, they can use it internally, and maybe their competitors can copy it too - but there are no real competitors to Facebook or Instagram that could benefit from it enough to make it a differentiating factor.
Thus, releasing stuff for open source does two things:
1) Makes them more attractive to research talent (Apple famously started publishing research recently because its traditional secrecy was causing issues with hiring top talent), and...
2) Continues to undermine the ability to make $$$ off of the model alone, driving it toward being a commodity rather than a long-term profit engine for other companies.
Wrong. FAIR has been open-sourcing ML models and source code for 10+ years; it did not start with llama. Also, llama was not leaked by employees, but by people in the broader community with whom the weights were shared.
For example:
Mask R-CNN - state-of-the-art image segmentation, released in 2017.
This is very true. But we should also note that Facebook has sadly been, and remains, on a negative trajectory of openness. As someone working closely with them, I saw a culture of nearly complete openness in the early years of their existence: research was promptly shared in its entirety, and licensing was compliant with open science. However, as the "AI boom" has grown, there is an increasing internal culture of holding parts of research back (my understanding is that this pressure comes from the C-suite). Licensing that was previously open-source compliant has had non-commercial clauses added more and more frequently, and even non-standard, complex agreements, as we have seen with LLaMA 2 and 3. This is sad, and the culture of openness is ultimately at risk as they become more and more like OpenAI, Google DeepMind, etc.
As I frequently point out, Facebook is free to decide on its own culture as it sees fit, and I am not entitled to their work. But it saddens me that they believe compromising on their initial ideals is the way forward, rather than sticking to them through thick and thin. This ultimately makes it more and more difficult for me, as an academic who believes in these ideals, to work with them.
You can hardly call that a "leak" when they were sending the weights to thousands of people who applied for access. It's not as if they kept them secret.