> We release all our models to the research community.
This is yet more evidence for the "AI isn't a competitive advantage" thesis. The state of the art is a public resource, so competing on AI alone offers no "moat".
In terms of medieval warfare, what Facebook is doing here looks like filling the moat with rocks and dirt. OpenAI is worth billions, Microsoft is spending billions to retrofit most of their big offerings with AI, Google is doubtless also spending billions to integrate AI in their products. Their moat in all cases is “a blob of X billion weights trained on Y trillion tokens”. Facebook here is spending mere _millions_ to make and release competitive models, effectively filling in the moat by giving it to everyone.
I don't see such grand stratagems as a likely explanation. It seems more likely that a bunch of dorks are running around unsupervised, trying their best to make lemonade with whatever budgets they are given, since nobody can realistically manage AI research. Or at least, it seems that, much like in political governance, events are suddenly outpacing corporate's ability to react.
In the case of OpenAI it is "nudge humanity in a more wholesome direction", because a lot of them went on an acid-fueled toxic effective altruism bender. And in this case it is "release all the things while Zuck is still obsessed with VR -- for science!".
I like this other group better. But it is disturbing that it could stop at any moment. That is probably why they are doing it while they still can.
I think there is definitely a component of competition here.
Facebook has no real way to monetize this (e.g. they won't release an API à la OpenAI, and they don't own a search engine). Since they can't monetize it… why not provide a bit of kindling to lower the barrier for everyone else to compete with your competitors. This strategy is called "commoditize your complement".
If Facebook makes it easier to develop a Google alternative, especially by doing something that doesn't hurt them, then they have just weakened a competitor. See Facebook releasing datasets for mapping. Think of the panic ChatGPT caused Google. It only cost a few million to train, but it's probably costing Google more than that already.
> Facebook has no real way to monetize this (e.g. they won't release an API à la OpenAI, and they don't own a search engine). Since they can't monetize it… why not provide a bit of kindling to lower the barrier for everyone else to compete with your competitors. This strategy is called "commoditize your complement".
Your analysis is good, but that isn't what "commoditize your complement" means.
Strictly speaking, a search engine isn't a complement for FB's revenue streams. Relatively little of FB's revenue can be attributed to search engine traffic leading into FB's walled garden, where they can show the user ads.
Complements are generally required products or services that enable your core business but aren't revenue-generating for you. Examples for FB are data centers (hence their participation in the Open Compute Project[0]) and mobile operating systems (which Google, for its own reasons, already made a commodity with Android).
What FB is doing here is commoditizing their competitors' core offering (or rather, a promising future one). That's just the tactic, though; there are several strategies it can enable, from undermining the barriers to entry into their competitors' search markets, to actually fragmenting that market by encouraging a diversity of specialized chat interfaces over one big chat model. You can see hints of both in this announcement.
Final note: FB is also protecting itself from relying on a competitor as a supplier, should chat become a preferred user interface for the content on social networks. It hasn't, but if it ever did, this would count as "commoditizing their complement". Even then, I would expect FB to switch to a mostly proprietary approach (so not much openness on having LLMs operate on social graphs and the like), keeping open only the foundation they rely on, which undermines and prevents gatekeeping by their advertising competitors.
I don't think that's the best interpretation of complement. Think of a complement as "does the existence of the other thing benefit your users' experience". I think search qualifies as something that enhances the Facebook user's experience (e.g. my cousin mentioned a thing they like doing today; how do I find out more about that thing?).
Given FB's optimization of the feed for dopamine hits, I don't think search across posts is a high priority (yes, it is table stakes, but whatever they currently use seems to be commoditized enough for their purposes). Group and page recommendation and discovery is another matter; perhaps an LLM may help make that more engaging. It might also be useful for helping users (and brands) moderate their groups and so on. These are also complements, but they aren't external ones.
Facebook maybe can't make money from it, but they arguably could save money with it, for instance by automating some of the fact-checking and moderation activities that they currently spend quite a bit of money on.
If you’re going to ask an AI to do fact checking with today’s technology, I would urge you to start by asking the AI a simple test question: Which is heavier? A pound of feathers or two pounds of lead?
I was going to provide evidence that this works correctly, but I was surprised to find that it gets severely tripped up on this particular question, even when told to “show working step by step”.
ChatGPT: "One pound of feathers and two pounds of lead weigh the same, which is one pound or 16 ounces. The difference is in their volume, where a pound of feathers takes up more space than two pounds of lead. This is because the density of feathers is much less than that of lead, so even though the weight is the same, the amount of space they occupy is quite different."
Your question makes it sound like you believe it succeeds, but the answer you pasted is incorrect.
As for why it fails: it is likely a bias arising from the question appearing far more often in the corpus with equal masses than with distinct ones, increasing the weight the model gives to an answer expressing equality.
I believe current LLMs lack some common sense at an architectural level. They learn both specialized facts and general deduction in the same set of weights; in my mind, they should separate their world model from their instance model.
They almost certainly spent at least a few million dollars on this research project. It is hard to say when the decision was made to open-source it (from the outset, or after it started showing results), but the decision was conscious and calculated. Nothing this high-profile is going to escape the strategic decision-making processes of upper management.
The researchers and engineers and other assorted dorks who built it weren’t thinking about moats, for sure, I agree with you there. But I guarantee you that the metaphor of medieval warfare was on the minds of the executives and legal team deciding whether to let the team release their work in this way.
> Today we're releasing a new state-of-the-art AI large language model called LLaMA designed to help researchers advance their work. LLMs have shown a lot of promise in generating text, having conversations, summarizing written material, and more complicated tasks like solving math theorems or predicting protein structures. Meta is committed to this open model of research and we'll make our new model available to the AI research community.
I don't know what that means or if he even wrote/read it tbh. I hope it literally just means Meta is actually committed to this open model of research (for now).
Maybe he is being a Machiavellian moat-filler; I stand corrected. I think (and hope) they don't really have a plan to counter OpenAI yet, because I am afraid this attitude won't last once they do, and this stuff has lately started moving quickly.
100%. This kind of announcement is for the street. Investors need to be reassured that Meta is keeping up with the cool AI stuff over at Google and Microsoft. A release like this will cause a flurry of downstream news coverage. I'm sure they are hoping the model will be picked up by researchers doing interesting things, who might generate further good coverage.
I don't know what my bias is supposed to be; I called them dorks affectionately, for one. The other replies are literally arguing that they are extremely closely supervised, whereas I am speculating that they are just eager to share their work for the right reasons, and that the eye of Sauron has yet to turn upon them.
Inside knowledge I never claimed. Anything else I can help you with today? :)
I think Noah Kagan's blog is literally called "OK Dork", and I always assumed the title was self-deprecating (in a fun/positive way) rather than negative.
Which kind of suggests Microsoft made a really bad move antagonizing the open-source community with GitHub Copilot.
They got a few years of lead time in the "AI codes for you" market, but in exchange permanently soured a significant fraction of their potential user base, who will turn to open-source alternatives soon anyway.
I wonder if they'd have been better served by focusing on selling Azure usage and releasing Copilot as an open-source product.
How did Microsoft sour developers with Copilot? I know dozens of people who pay for it (myself included), and I feel like it is widely regarded as a "no-brainer" for the price it's offered at.
The company that tried to kill Linux in the 90s, owned by the world's most famously rich man, is now stealing my code and selling it back to me? Yeah, fuck that.
It's not selling you back your code. It's different code, adapted to a different task; your own code is forever free for you, and you don't need anyone to give it to you.
Given the cost of running these models, and the dedication needed to train them, I think it is worth it. GPUs cost money; electricity costs money. They can't serve the world for free and still offer good latency.
I mean, that's like saying an author steals the open source alphabet and charges you for reading their ordering of letters, as if the ordering of letters isn't where all the value is.
They didn't. There is a small group of people who are always looking for the latest reason to be outraged, so they can point at any of the big tech companies and go "aha! They are evil!" Copilot's AI was trained on GitHub projects, and so these people are taking turns clutching their pearls inside their little bubble.
I'd bet that more than 95% of devs haven't even heard of this "controversy", and even if they had, they wouldn't care.
I do think the controversy is stupid, but inside my own company we significantly delayed migrating some projects to GitHub, because people were concerned that the way Microsoft handled Copilot meant GitHub wasn't a safe long-term host for an open-source project (and yes, I'm aware of all the reasons that's irrational).
Even if the people angry about Copilot are a minority, it might still have been a bad move. Trust accumulates slowly over years, but mistrust builds up over only a few events. People still remember Microsoft's anticompetitive practices from 20 years ago. The mistakes it makes now might stick for a long time.
Presumably because they trained Copilot, without permission, on billions of lines of code, much of it licensed, and Copilot has a tendency to regurgitate that code verbatim, without said license.
For a specific example, some variation of "fast inverse square root" will usually get you the exact GPL-licensed code from Quake III, comments included.
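For anyone who hasn't seen it: the snippet approximates 1/sqrt(x) by reinterpreting the float's bits as an integer, applying a magic constant, and doing one Newton-Raphson step. The disputed Copilot output is the original GPL'd C from Quake III's q_math.c, comments included; purely to illustrate what that code computes, here is a rough Python port (mine, not anything Copilot emits):

    import struct

    def q_rsqrt(number: float) -> float:
        """Rough Python port of Quake III's fast inverse square root.
        Only meaningful for positive inputs."""
        x2 = number * 0.5
        # Reinterpret the float's bits as a signed 32-bit integer.
        i = struct.unpack('<i', struct.pack('<f', number))[0]
        i = 0x5F3759DF - (i >> 1)  # the famous magic-constant first guess
        y = struct.unpack('<f', struct.pack('<i', i))[0]
        return y * (1.5 - x2 * y * y)  # one Newton-Raphson refinement

    print(q_rsqrt(4.0))  # ~0.499, vs the exact 0.5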
Do you mean the same code that has its own Wikipedia page, where the exact code is written, comments included, and that has probably been copy-pasted into hundreds of other projects?
Do you see that notice at the top of the file? It says:
===
This file is part of Quake III Arena source code.
Quake III Arena source code is free software; you can redistribute it
and/or modify it under the terms of the GNU General Public License as
published by the Free Software Foundation; either version 2 of the License,
or (at your option) any later version.
===
but because it's been laundered by Microsoft, you think it's okay to steal free software and make it proprietary?
How is it made proprietary? The Quake III Arena code is no more proprietary now than if it were stored on GitHub's proprietary web servers. Copilot is just a fancy code index that sometimes returns the original code and other times gives you a modified copy.
Because, as you say, it provides original or modified code but doesn't provide provenance or license information. It's copyright laundering. After decades of fighting the community in the courts over shit like this, Microsoft just turns around and says, well, it's okay when we do it? Foh.
"The algorithm was often misattributed to John Carmack, but in fact the code is based on an unpublished paper by William Kahan and K.C. Ng circulated in May 1986"
The point is that it's charging for something that was trained on open source code. What you're saying agrees with that, but your triumphant tone seems to imply the opposite. Which did you mean?
> that Copilot has a tendency to regurgitate verbatim, without said license.
A "tendency" is overstating it. I'm not aware of any example that would have been likely to occur if the author wasn't specifically trying to get the regurgitated code.
Isn't this a classic "commoditize your complements" play? Facebook's value is in their social graph and users' attachment to it via various apps. TikTok and others are not trying to replicate that; they're creating similar volumes of user attachment via models. You can easily imagine LLMs being applied by competitors in a comparable way. If models become commodities, then Facebook continues to hold the one advantage nobody else has.
My guess is that the primary strategic motivation here is recruiting and maintaining a competent org of people who know how to build large models. I don't think that the actual release is as calculated as all that. It's more about proving to themselves and the world that they can train big models.
i think maybe the LLM team at facebook was bummed out because twitter bullied them on their last public release (and they didn't ignore it), and this time they decided to sit down and nerd flex by doing some undeniably excellent performance work that reduces resource requirements by 10x and limits itself only to publicly available training data.
maybe they care about moats and elon muskcrosoft's closedai or whatever, but i kinda doubt it. again, it feels more like a nerd flex probably for the purposes of raising morale internally and pushing the field as a whole in a good direction by reducing resource requirements.
excellent paper! easy on the eyes and i really like the angle.
> We release all our models to the research community.
And from the FB blog post [0]:
"Request Form
Thank you for your interest in Meta AI’s LLaMA (Large Language Model Meta AI) models. To request access to the models, please fill out this form, and we'll review and let you know if your use case is approved. The information you provide below will be used solely to assess eligibility to access these models."
So much for "releasing" the model to the research community.
Glad to see you still on HN! You've done amazing work in this domain!
I'd argue that this goes even further back, to the word2vec/GloVe days. I was working for a company in 2018 that leveraged my skills in fine-tuning word2vec/fastText models, even before the BERT / "Attention Is All You Need" era took hold.