> We release all our models to the research community.
This is yet more evidence for the "AI isn't a competitive advantage" thesis. The state of the art is a public resource, so competing on AI alone offers no "moat".
In terms of medieval warfare, what Facebook is doing here looks like filling the moat with rocks and dirt. OpenAI is worth billions, Microsoft is spending billions to retrofit most of their big offerings with AI, Google is doubtless also spending billions to integrate AI in their products. Their moat in all cases is “a blob of X billion weights trained on Y trillion tokens”. Facebook here is spending mere _millions_ to make and release competitive models, effectively filling in the moat by giving it to everyone.
I don't see such grand stratagems as a likely explanation. It seems more likely that a bunch of dorks are running around unsupervised, trying their best to make lemonade with whatever budgets they are given, since nobody can realistically manage AI research. Or at least, it seems that, much like in political governance, events are suddenly outpacing corporate's ability to react.
In the case of OpenAI it is "nudge humanity in a more wholesome direction", because a lot of them went on an acid-fueled toxic effective altruism bender. And in this case it is "release all the things while Zuck is still obsessed with VR -- for science!".
I like this other group better. But it is disturbing that it could stop at any moment. That is probably why they are doing it while they still can.
I think there is definitely a component of competition here.
Facebook has no real way to monetize this (e.g. they won't release an API à la OpenAI, and they don't own a search engine). Since they can't monetize it… why not provide a bit of kindling to lower the barrier for everyone else to compete with your competitors. This strategy is called "commoditize your complement".
If Facebook makes it easier to develop a Google alternative, especially by doing something that doesn't hurt them, then they have just weakened a competitor. See Facebook releasing datasets for mapping. Think of the panic ChatGPT caused Google. It only cost a few million to train, but it's probably costing Google more than that already.
> Facebook has no real way to monetize this (e.g. they won't release an API à la OpenAI, and they don't own a search engine). Since they can't monetize it… why not provide a bit of kindling to lower the barrier for everyone else to compete with your competitors. This strategy is called "commoditize your complement".
Your analysis is good, but that isn't what "commoditize your complement" means.
Strictly speaking, a search engine isn't a complement for FB's revenue streams. Relatively little of FB's revenue can be attributed to search engine traffic leading into FB's walled garden, where they can show the user ads.
Complements are generally required products or services that enable your core business but aren't revenue-generating for you. Examples for FB are data centers (hence their participation in the Open Compute Project[0]) and mobile operating systems (which Google, for its own reasons, already made a commodity with Android).
What FB is doing here is commoditizing their competitors' core offering (or rather, a promising future one). That's just the tactic, though; there are several strategies it can enable, from undermining the barriers to entry into their competitors' search markets, to actually fragmenting that market by encouraging a diversity of specialized chat interfaces over one big chat model. You can see hints of both in this announcement.
Final note: FB is also protecting itself from relying on a competitor as a supplier, should chat become a preferred user interface for the content on social networks. It hasn't, but if it ever did, this would count as "commoditizing their complement". Even then, I would expect FB to switch to a mostly proprietary approach (so not much openness on having LLMs operate on social graphs and the like), keeping open only the foundation they rely on, which undermines and prevents gatekeeping by their advertising competitors.
I don't think that's the best interpretation of complement. Think of a complement as "does the existence of the other thing benefit your users' experience". I think search qualifies as something that enhances the Facebook user's experience (e.g. my cousin mentioned a thing they like doing today; how do I find out more about that thing?).
Given FB's optimization of the feed for dopamine hits, I don't think search across posts is a high priority (yes, it is table stakes, but whatever they currently use seems to be commoditized enough for their purposes). Group and page recommendation and discovery is another matter; perhaps an LLM may help make that more engaging. It might also be useful for helping users (and brands) moderate their groups and so on. These are also complements, but they aren't external ones.
Facebook maybe can't make money from it, but they arguably could save money with it, for instance by automating some of the fact-checking and moderation activities that they currently spend quite a bit of money on.
If you’re going to ask an AI to do fact checking with today’s technology, I would urge you to start by asking the AI a simple test question: Which is heavier? A pound of feathers or two pounds of lead?
I was going to provide evidence that this works correctly, but I was surprised to find that it gets severely tripped up on this particular question, even when told to “show working step by step”.
ChatGPT: "One pound of feathers and two pounds of lead weigh the same, which is one pound or 16 ounces. The difference is in their volume, where a pound of feathers takes up more space than two pounds of lead. This is because the density of feathers is much less than that of lead, so even though the weight is the same, the amount of space they occupy is quite different."
Your question makes it sound like you believe it succeeds, but the answer you pasted is incorrect.
As for why it fails: it is likely a bias arising from the question appearing far more often in the corpus with equal masses than with distinct ones, increasing the weight the model gives to an answer expressing equality.
I believe current LLMs lack some common sense at an architectural level. They learn both specialized facts and general deduction in the same set of weights; in my mind, they should separate their world model from their instance model.
They almost certainly spent at least a few million dollars on this research project. It is hard to say when the decision was made to open-source it (from the outset, or after it started showing results), but the decision was conscious and calculated. Nothing this high-profile is going to escape the strategic decision-making processes of upper management.
The researchers and engineers and other assorted dorks who built it weren’t thinking about moats, for sure, I agree with you there. But I guarantee you that the metaphor of medieval warfare was on the minds of the executives and legal team deciding whether to let the team release their work in this way.
> Today we're releasing a new state-of-the-art AI large language model called LLaMA designed to help researchers advance their work. LLMs have shown a lot of promise in generating text, having conversations, summarizing written material, and more complicated tasks like solving math theorems or predicting protein structures. Meta is committed to this open model of research and we'll make our new model available to the AI research community.
I don't know what that means or if he even wrote/read it tbh. I hope it literally just means Meta is actually committed to this open model of research (for now).
Maybe he is being a Machiavellian moat-filler; I stand corrected. I think (and hope) they don't really have a plan to counter OpenAI yet, because I am afraid this attitude won't last once they do, and this stuff has lately started moving quickly.
100%. This kind of announcement is for the street. Investors need to be reassured that Meta is keeping up with the cool AI stuff over at Google and Microsoft. A release like this will cause a flurry of downstream news coverage. I'm sure they are hoping the model will be picked up by researchers doing interesting things, who might generate further good coverage.
I don't know what my bias is supposed to be; I called them dorks affectionately, for one. The other replies are literally arguing that they are extremely closely supervised, whereas I am speculating that they are just eager to share their work for the right reasons, and that the eye of Sauron has yet to turn upon them.
Inside knowledge I never claimed. Anything else I can help you with today? :)
I think Noah Kagan's blog is literally called "OK Dork", and I always assumed the title was self-deprecating (in a fun/positive way) rather than negative.
Which kind of suggests Microsoft made a really bad move antagonizing the open-source community with GitHub Copilot.
They got a few years of lead time in the "AI codes for you" market, but in exchange permanently soured a significant fraction of their potential user base, who will turn to open-source alternatives soon anyway.
I wonder if they'd have been better served by focusing on selling Azure usage and releasing Copilot as an open-source product.
How did Microsoft sour developers with Copilot? I know dozens of people who pay for it (myself included), and I feel like it is widely regarded as a "no-brainer" for the price it's offered at.
The company that tried to kill Linux in the 90s, owned by the world's most famously rich man, is now stealing my code and selling it back to me? Yeah, fuck that.
It's not selling you back your code. It's different code, adapted to a different task; your own code is forever free for you, and you don't need anyone to give it to you.
Given the cost of running these models, and the dedication needed to train them, I think it is worth it. GPUs cost money; electricity costs money. They can't serve the world for free and still offer good latency.
I mean, that's like saying an author steals the open source alphabet and charges you for reading their ordering of letters, as if the ordering of letters isn't where all the value is.
They didn't. There is a small group of people who are always looking for the latest reason to be outraged, so they can point at any of the big tech companies and go "aha! They are evil!" Copilot's AI was trained on GitHub projects, and so these people are taking turns clutching their pearls inside their little bubble.
I'd bet that more than 95% of devs haven't even heard of this "controversy", and even if they had, they wouldn't care.
I do think the controversy is stupid, but inside my own company we significantly delayed migrating some projects to GitHub, because people were concerned that the way Microsoft handled Copilot meant GitHub wasn't a safe long-term host for an open-source project (and yes, I'm aware of all the reasons that's irrational).
Even if the people angry about Copilot are a minority, it might still have been a bad move. Trust accumulates slowly over years, but mistrust builds up over only a few events. People still remember Microsoft's anticompetitive practices from 20 years ago. The mistakes it makes now might stick for a long time.
Presumably because they trained Copilot, without permission, on billions of lines of code, much of it licensed, and Copilot has a tendency to regurgitate that code verbatim, without said license.
For a specific example, some variation of "fast inverse square root" will usually get you the exact GPL-licensed code from Quake III, comments included.
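For anyone who hasn't seen it: the snippet approximates 1/sqrt(x) by reinterpreting the float's bits as an integer, applying a magic constant, and doing one Newton-Raphson step. The disputed Copilot output is the original GPL'd C from Quake III's q_math.c, comments included; purely to illustrate what that code computes, here is a rough Python port (mine, not anything Copilot emits):

    import struct

    def q_rsqrt(number: float) -> float:
        """Rough Python port of Quake III's fast inverse square root.
        Only meaningful for positive inputs."""
        x2 = number * 0.5
        # Reinterpret the float's bits as a signed 32-bit integer.
        i = struct.unpack('<i', struct.pack('<f', number))[0]
        i = 0x5F3759DF - (i >> 1)  # the famous magic-constant first guess
        y = struct.unpack('<f', struct.pack('<i', i))[0]
        return y * (1.5 - x2 * y * y)  # one Newton-Raphson refinement

    print(q_rsqrt(4.0))  # ~0.499, vs the exact 0.5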
Do you mean the same code that has its own Wikipedia page, where the exact code is written, comments included, and that has probably been copy-pasted into hundreds of other projects?
Do you see that notice at the top of the file? It says:
===
This file is part of Quake III Arena source code.
Quake III Arena source code is free software; you can redistribute it
and/or modify it under the terms of the GNU General Public License as
published by the Free Software Foundation; either version 2 of the License,
or (at your option) any later version.
===
but because it's been laundered by Microsoft, you think it's okay to steal free software and make it proprietary?
How is it made proprietary? The Quake III Arena code is no more proprietary now than if it were stored on GitHub's proprietary web servers. Copilot is just a fancy code index that sometimes returns the original code and other times gives you a modified copy.
Because, as you say, it provides original or modified code but doesn't provide provenance or license information. It's copyright laundering. After decades of fighting the community in the courts over shit like this, Microsoft just turns around and says, well, it's okay when we do it? Foh.
"The algorithm was often misattributed to John Carmack, but in fact the code is based on an unpublished paper by William Kahan and K.C. Ng circulated in May 1986"
The point is that it's charging for something that was trained on open source code. What you're saying agrees with that, but your triumphant tone seems to imply the opposite. Which did you mean?
> that Copilot has a tendency to regurgitate verbatim, without said license.
A "tendency" is overstating it. I'm not aware of any example that would have been likely to occur if the author wasn't specifically trying to get the regurgitated code.
Isn't this a classic "commoditize your complements" play? Facebook's value is in their social graph and users' attachment to it via various apps. TikTok and others are not trying to replicate that; they're creating similar volumes of user attachment via models. You can easily imagine LLMs being applied by competitors in a comparable way. If models become commodities, then Facebook continues to hold the one advantage nobody else has.
My guess is that the primary strategic motivation here is recruiting and maintaining a competent org of people who know how to build large models. I don't think that the actual release is as calculated as all that. It's more about proving to themselves and the world that they can train big models.
i think maybe the LLM team at facebook was bummed out because twitter bullied them on their last public release (and they didn't ignore it), and this time they decided to sit down and nerd flex by doing some undeniably excellent performance work that reduces resource requirements by 10x and limits itself only to publicly available training data.
maybe they care about moats and elon muskcrosoft's closedai or whatever, but i kinda doubt it. again, it feels more like a nerd flex probably for the purposes of raising morale internally and pushing the field as a whole in a good direction by reducing resource requirements.
excellent paper! easy on the eyes and i really like the angle.
> We release all our models to the research community.
And from the FB blog post [0]:
"Request Form
Thank you for your interest in Meta AI’s LLaMA (Large Language Model Meta AI) models. To request access to the models, please fill out this form, and we'll review and let you know if your use case is approved. The information you provide below will be used solely to assess eligibility to access these models."
So much for "releasing" the model to the research community.
Glad to see you still on HN! You've done amazing work in this domain!
I'd argue that this goes even further back, to the word2vec/GloVe days. I was working for a company in 2018 that leveraged my skills in fine-tuning word2vec/fastText models, even before the BERT / "Attention Is All You Need" era took hold.