In industry research, someone in a chief position like LeCun should know how to balance long-term research with short-term projects. However, for whatever reason, he consistently shows hostility toward LLMs and engineering projects, even though Llama and PyTorch are two of the most influential projects from Meta AI. His attitude doesn’t really match what is expected from a Chief position at a product company like Facebook. When Llama 4 got criticized, he distanced himself from the project, stating that he only leads FAIR and that the project falls under a different organization. That kind of attitude doesn’t seem suitable for the face of AI at the company. It's not a surprise that Zuck tried to demote him.
These are the types that want academic freedom in a cut-throat industry setup and conversely never fit into academia because their profiles and growth ambitions far exceed what an academic research lab can afford (barring some marquee names). It's an unfortunate paradox.
The Bell Labs we look back on was only the result of government intervention in the telecom monopoly. The 1956 consent decree forced Bell to license thousands of its patents, royalty free, to anyone who wanted to use them. Any patent not listed in the consent decree was to be licensed at "reasonable and nondiscriminatory rates."
The US government basically forced AT&T to use revenue from its monopoly to do fundamental research for the public good. Could the government do the same thing to our modern megacorps? Absolutely! Will it? I doubt it.
There used to be Google X. Not sure at what scale it operated.
But if any state/central bank were clever, they would subsidize this.
That's a better trickle-down strategy.
Until we get to AGI and all new discoveries are autonomously led by AI, that is :p
> Google X is a complete failure
- Google Brain
- Google Watch/Wear OS
- Gcam/Pixel Camera
- Insight (indoor GMaps)
- Waymo
- Verily
It is a moonshot factory after all, not a "we're only going to do things that are likely to succeed" factory. It's an internal startup space, which comes with high failure rates. But these successes seem pretty successful. Even the failed Google Glass seems to have led to learning, though they probably should have kept the team going, considering the success of the Meta Ray-Bans and things like Snap's glasses.
Didn't the current LLMs stem from this...? Or it might be Google Brain instead. For Google X, there is Waymo? I know a lot of stuff didn't pan out. This is expected. These were 'moonshots'.
But the principle is there. I think that when a company sits on a load of cash, that's what they should do. Either that or become a kind of alternative investments allocator. These are risky bets. But they should be incentivized to take those risks. From a fiscal policy standpoint for instance.
Well it probably is the case already via lower taxation of capital gains and so on.
But there should probably exist a more streamlined framework to make sure incentives are aligned.
And/or assigned government projects?
Besides implementing their Cloud infrastructure that is...
It seems DeepMind is the closest thing to a well-funded blue-sky AI research group, even after the merger with Google Brain and the shift toward more of a product focus.
Google DeepMind is the closest lab to that idea because Google is the only entity big enough to get close to the scale of AT&T. I was skeptical that the DeepMind and Google Brain merger would be successful, but it seems to have worked surprisingly well. They are killing it with LLMs and image editing models. They are also backing the fastest-growing cloud business in the world and collecting Nobel prizes along the way.
I thought that was Google. Regulators pretend not to notice their monopoly, they probably get large government contracts for social engineering and surveillance laundered through advertising, and the "don't be evil" part is that they make some open-source contributions.
I'd argue SSI and Thinking Machines Lab are close to the environment you are thinking about: industry labs that focus on research without immediate product requirements.
I don't think that quite matches because those labs have very clear directions of research in LLMs. The theming is a bit more constrained and I don't know if a line of research as vague as what LeCun is pursuing would be funded by those labs.
> A pipe dream sustaining the biggest stock market bubble in history
This is why we're losing innovation.
Look at electric cars, batteries, solar panels, rare earths, and many more. Bubble or struggle for survival? Right, because if the US has no AI, the world will have no AI? That's the real bubble - being stuck in an ancient world view.
Meta's stock has already tanked for "over" investing in AI. Bubble, where?
> 2 Trillion dollars in Capex to get code generators with hallucinations
You assume that's the only use of it.
And are people not using these code generators?
Is this an issue with a lost generation that forgot what Capex is? We've moved from Capex to Opex and now the notion is lost, is it? You can hire an army of software developers but you can't build hardware.
Is it better when everyone buys DeepSeek or a non-US version? Well, then you don't need to spend the Capex, but you won't have the revenue either.
And that $2T you're referring to includes infrastructure like energy, data centers, servers, and many other things. DeepSeek rents from others. Someone is paying.
Man, why did no one tell the people who invented bronze that they weren’t allowed to do it until they had a correct definition for metals and understood how they worked? I guess the person saying something can’t be done should stay out of the way of the people doing it.
>> I guess the person saying something can’t be done should stay out of the way of the people doing it.
I'll happily step out of the way once someone simply tells me what it is you're trying to accomplish. Until you can actually define it, you can't do "it".
The big tech companies are trying to make machines that replace all human labor. They call it artificial intelligence. Feel free to argue about definitions.
I'm not sure what 'inventing bronze' is supposed to be. 'Inventing' AGI is pretty much equivalent to creating new life, from scratch. And we don't have an idea on how to do that either, or how life came to be.
Intelligence and human health can't be defined neatly. They are what we call suitcase words. If there is a physiological tradeoff in medical research between living to 500 years and being able to lift 1000 kg in one's youth, those are different dimensions/directions along which we can make progress. The same happens for intelligence. I think we are on the right track.
I don't think the bar exam is scientifically designed to measure intelligence, so that was an odd example. Citing the bar exam is like saying it passes a "Game of Thrones trivia" exam, so it must be intelligent.
As for IQ tests and the like, to the extent they are "scientific", they are designed based on empirical observations of humans. They are not designed to measure the intelligence of a statistical system containing a compressed version of the internet.
Or does this just prove lawyers are artificially intelligent?
Yes, a glib response, but think about it: we define an intelligence test for humans, which by definition is an artificial construct. If we then get a computer to do well on the test, we haven't proved it's on par with human intelligence, just that both meet some of the markers that the test makers are using as rough proxies for human intelligence. Maybe this helps signal or judge whether AI is a useful tool for specific problems, but it doesn't mean AGI.
Hi there! :) Just wanted to gently flag that one of the terms (beginning with the letter "r") in your comment isn't really aligned with the kind of inclusive language we try to encourage across the community. Totally understand it was likely unintentional - happens to all of us! Going forward, it'd be great to keep things phrased in a way that ensures everyone feels welcome and respected. Thanks so much for taking the time to share your thoughts here!
I became interested in the matter reading this thread and vaguely remember reading a couple of the articles. Saved them all in NotebookLM to get an audio overview and to read later. Thanks!
I always take a bird's eye kind of view on things like that, because however close I get, it always loops around to make no sense.
> is massively monopolistic and have unbounded discretionary research budget
That is the case for most megacorps, if you look at all the financial instruments.
Modern monopolies are not equal to single-corporation domination. Modern monopolies are portfolios of companies that do business using the same methods and strategies.
The problem is that private interests strive mostly for control, not money or progress. If they have to spend a lot of money to stay in control of (their (share of the)) segments, they will do that, which is why stuff like the current graph of investments of, by, and for AI companies and the industries works.
A modern equivalent and "breadth" of a Bell Labs (et al.) kind of R&D speed could not be controlled and would 100% result in actual Artificial Intelligence vs. all the white-label AI toys we get now.
Post-WWI and WWII "business psychology" has built a culture that cannot thrive in a free world (free as in undisturbed and left to all devices available) for a variety of reasons, but mostly because of elements with a medieval/dark-age kind of aggressive tendency to come to power and maintain it that way.
In other words: not having a Bell Labs kind of setup anymore ensures that the variety of approaches taken at large scales, aka industry-wide or systemic, remains narrow enough.
More importantly, even if you do want it, and there are business situations that support your ambitions, you still have to get into the managerial powerplay, which quite honestly takes a separate kind of skill set, time, and effort - which I'm guessing the academia-oriented people aren't willing to put in.
It's pretty much dog-eat-dog at top management positions.
It's not exactly a space for free-thinking timelines.
It is not a free-thinking paradise in academia either. Different groups fighting for hiring, promotions, and influence exist there, too. And it tends to be more pronounced: it is much easier in industry to find a comparable job to escape a toxic environment, so a lot of problems in academic settings fester forever.
But the skill sets needed to avoid and survive personnel issues in academia are different from those in industry. My 2c.
> It's not exactly a space for free-thinking timelines.
Same goes for academia. People's visions compete for other people's financial budgets, time and other resources. Some dogs get to eat, study, train at the frontier and with top tools in top environments while the others hope to find a good enough shelter.
As I understand it, Bell Labs' mandate was to improve the network, which had tons of great threads to pull on: plastics for handsets, transistors for amplification, information theory for capacity on fixed copper.
Google and Meta are ads businesses with a lot less surface area for such a mandate to have similar impact and, frankly, exciting projects people want to do.
Meanwhile they still have tons of cash so, why not, throw money at solving Atari or other shiny programs.
Also, for cultural reasons, there’s been a huge shift to expensive monolithic “moonshot programs” whose expenses need on-demand progress to justify and are simply slower and way less innovative.
Three passionate designers hiding deep inside Apple can side-hustle up the key gestures that make multi-touch baked enough to see a path to an iPhone - long before the iPhone was any sort of endgame direction they were being managed toward.
Innovation thrives on lots of small teams mostly failing in the search for something worth doubling down on.
Google et al. have a new approach - aim for the moon, budget and staff for the moon, then burn cash while no one ever really polishes up the fundamental enabling pieces that, in hindsight, they needed to succeed.
I would pose the question differently: under his leadership, did Meta achieve a good outcome?
If the answer is yes, then it's better to keep him, because he has already proved himself and you can win in the long term. With Meta's pockets, you can always create a new department specifically for short-term projects.
If the answer is no, then nothing to discuss here.
Meta did exactly that, kept him but reduced his scope. Did the broader research community benefit from his research? Absolutely. But did Meta achieve a good outcome? Probably not.
If you follow LeCun on social media, you can see that the way FAIR’s results are assessed is very narrow-minded and still follows the academic mindset. He mentioned that his research is evaluated by: "Research evaluation is a difficult task because the product impact may occur years (sometimes decades) after the work. For that reason, evaluation must often rely on the collective opinion of the research community through proxies such as publications, citations, invited talks, awards, etc."
But as an industry researcher, he should know how his research fits with the company vision and be able to assess that easily. If the company's vision is to be the leader in AI, then as of now, he seems to have failed that objective, even though he has been at Meta for more than 10 years.
Also, he always sounds like "I know this will not work." Dude, are you a researcher? You're supposed to experiment and follow the results. That's what separates you from oracles and freaking philosophers or whatever.
If academia is in question, then so are their titles.
When I see "PhD", I read "we decided that he was at least good enough for the cause" PhD, or PhD (he fulfilled the criteria).
He's speaking to the entire feedforward Transformer-based paradigm. He sees little point in continuing to try to squeeze more blood out of that stone and would rather move on to more appropriate ways to model ontologies per se, rather than the crude-for-what-we-use-them-for embedding-based methods that are popular today.
I really resonate with his view due to my background in physics and information theory. I for one welcome his new experimentation in other realms while so many still hack away at their LLMs in pursuit of SOTA benchmarks.
If the LLM hype doesn't cool down fast, we're probably looking at another AI winter. Appears to me like he's just trying to ensure he'll have funding for chasing the global maximum going forward.
> If the LLM hype doesn't cool down fast, we're probably looking at another AI winter.
Is the real bubble ignorance? Maybe you'll cool down but the rest of the world? There will just be more DeepSeek and more advances until the US loses its standing.
Yeah that stuff generated embarrassingly wrong scientific 'facts' and citations.
That kind of hallucination is somewhat acceptable for something marketed as a chatbot, less so for an assistant helping you with scientific knowledge and research.
I thought it was weird at the time how much hate Galactica got for its hallucinations compared to hallucinations of competing models. I get your point and it partially explains things. But it's not a fully satisfying explanation.
Meta had a two-pronged AI approach: a product-focused group working on LLMs, and blue-sky research (FAIR) working on alternate approaches, such as LeCun's JEPA.
It seems they've given up on the research and are now doubling down on LLMs.
None of Meta's revenue has anything to do with AI at all. (Other than GenAI slop in old people's feeds.) Meta is in the strange position of investing very heavily in multiple fields where they have no successful product: VR, hardware devices, and now AI. Ad revenue funds it all.
LeCun truly believes the future is in world models. He’s not alone. Good for him to now be in the position he’s always wanted and hopefully prove out what he constantly talks about.
He seems stuck in the GOFAI development philosophy where they just decide humans have something called a "world model" because they said so, and then decide that if they just develop some random thing and call it a "world model" it'll create intelligence because it has the same name as the thing they made up.
And of course it doesn't work. Humans don't have world models. There's no such thing as a world model!
I do agree humans don't have a world model. It is really more than that. We exist in the world. We don't need a world model because we exist in the world.
It is like saying a fish has a water model. It makes no sense when the fish existence is intertwined with water.
That is not to say that a computer that has a model of the world would not most likely be extremely useful vs something like the LLM that has none. The world model would be the best we could do to create a machine that simulates being in the world.
I don't think the focus is really on world models so much as on animal intelligence based around predicting the real world - but to predict it, you need to model it in some sense.
IMO the issue is that animals can't have a specific "world model" system, because if you create a model ahead of time you will mostly waste energy because most of the model is not used.
And animals' main concern is energy conservation, so they must be doing something else.
There are many factors playing into "survival of the fittest", and energy conservation is only one. Animals build mental models to predict the world because this superpower of seeing into the future is critical to survival - predict where the water is in a drought, where the food is, and how to catch it, etc, etc.
The animal learns as it encounters learning signals - prediction failure - which is the only way to do it. Of course you need to learn/remember something before you can use it in the future, so in that sense it's "ahead of time", but the reason it's done that way is that evolution has found that learning patterns will ultimately prove beneficial.
Right - I've no idea how LeCun thinks about it, but I don't see that an animal needs or would have any more of a "world model" than something like an LLM. I'm sure all the research into rats in mazes etc. has something to say about their representations of location and so on, but given a goal of prediction it seems that all that is needed is a combination of pattern recognition and sequence prediction - not an actual explicit "declarative" model.
It seems that things like place cells and grandmother cells are a part of the pattern recognition component, but recognizing landmarks and other predictive-relevant information doesn't mean we have a complete coherent model of the environments we experience - perhaps more likely a fragmented one of task-relevant memories. It seems like our subjective experience of driving is informative - we don't have a mental road map but rather familiarity with specific routes and landmarks. We know to turn right at the gas station, etc.
LLM hostility was warranted. The overhyped, downright charlatan nature of AI hype and marketing threatens another AI winter. It happened to cybernetics; it'll happen to us too. The finance folks will be fine, they'll move to the next big thing to overhype; it is the researchers who suffer the fall-out. I am considered anti-LLM (anti-transformer, anyway) for this reason. I like the architecture, it is cool and rather capable at its problem set, which is a unique set, but it isn't going to deliver any of what has been promised, any more than a plain DNN or a CNN will.
Meta is in last place among the big tech companies making an AI push because of LeCun's LLM hostility. Refusing to properly invest in the biggest product breakthrough this century was not even a little bit warranted. He had more than enough resources available to do the research he wanted and create a fantastic open-source LLM.
Meta has made some fantastic LLMs publicly available, many of which continue to outperform all but the Qwen series in real-world applications.
LLMs cannot do any of the major claims made for them, so competing at the current frontier is a massive resource waste.
Right now a locally running 8B model with a large context window (10k+ tokens) beats Google/OpenAI models easily on any task you like.
Why would anyone then pay for something that is possible to run on consumer hardware with higher tokens/second throughput and better performance? What exactly have the billions invested given Google/OpenAI in return? Nothing more than an existential crisis, I'd say.
Companies aren't trying to force AI costs into their subscription models in dishonest ways because they've got a winning product.
I don't really agree with your perception of current LLMs, but the point is it doesn't even matter. This is a PR war. LeCun lost it for Meta. Meta needs to be thought of as an AI leader to gain traction in their metaverse stuff. They can live with everyone thinking they're evil, but if everyone thinks they're lame has-beens, they are fucked.
Are they thought of as lame has-beens? Or even on a trajectory to be thought of that way? I don't think that's true, at least not in my circles. Like you said, evil, sure, but not has-beens.
This is the right take. He is obviously a pioneer and much more knowledgeable than Wang in the field, but if you don't have the product mind to serve the company's business interests in both a short-term and long-term capacity anymore, you may as well stay in academia and be your own research director, rather than a chief executive at one of the largest public companies.
It's very hard (and almost irreconcilable) to lead both Applied Research -- that optimizes for product/business outcomes -- and Fundamental Research -- that optimizes for novel ideas -- especially at the scale of Meta.
LeCun had chosen to focus on the latter. He can't be blamed for not having worn the second hat.
Yes he can. If he wanted to focus on fundamental research he shouldn’t have accepted a leadership position at a product company. He knew going in that releasing products was part of his job and largely blew it.
Yann was in charge of FAIR, which has nothing to do with Llama 4 or the product-focused AI orgs. In general, your comment is filled with misrepresentations. Sad.
To be fair, transformers, from more of a developmental perspective, are hugely wasteful. They're long-range stable, sure, but the whole training process requires so much power/data compared to even slightly simpler model designs that I can see why people are drawn to alternative, complex model designs that down-play the reliance on pure attention.
I totally agree. He appeared to act against his employer and actively undermined Meta's effort to attract talent by his behavior visible on X.
And I stopped reading him, since he - in my opinion - trashed on autopilot everything the other 99% did, and those 99% were already beyond two standard deviations of greatness.
It is even more problematic if you have absolutely no results, e.g. products, to back your claims.
> It was strange to me that there was no log-as-service with the qualities that make it suitable for building higher-level systems like durable execution
Indeed. We are trying to democratize that secret sauce. Since it is backed by object storage, the latencies are not what AWS enjoys with its internal Journal service, but we intend to get there with an NVMe-based tier later. In the meantime there is an existing large market for event streaming where a "truly serverless" (https://erikbern.com/2021/04/19/software-infrastructure-2.0-...) API has been missing.
My guess is that from the customer's perspective, DSQL seems to have too many limitations, making it feel more like an enhanced version of a NoSQL database with SQL semantics, ACID transactions, and multi-region capabilities rather than a truly distributed version of a relational database. It seems the DSQL team doesn't fully understand why relational databases remain widely popular today. What makes them great is their flexibility, which lets them handle all kinds of use cases and adapt as things change. But all the limits on transaction size, column size, etc., and too many missing capabilities pretty much take away the big advantages relational databases usually offer.
AWS tends to prioritize performance and scalability over functionality, which is reflected in the design of DynamoDB, SimpleDB, and now DSQL. I'm also not a big fan of this style. It doesn't give customers the flexibility to choose their own trade-offs like Spanner does and assumes that customers can't make these kinds of decisions on their own.
It's because of the way most companies build their status dashboards. There are usually at least 2 dashboards, one internal dashboard and one external dashboard. The internal dashboard is the actual monitoring dashboard, where it will be hooked up with other monitoring data sources. The external status dashboard is just for customer communication. Only after the outage/degradation is confirmed internally, then the external dashboard will be updated to avoid flaky monitors and alerts. It will also affect SLAs so it needs multiple levels of approval to change the status, that's why there are some delays.
> The external status dashboard is just for customer communication. Only after the outage/degradation is confirmed internally, then the external dashboard will be updated to avoid flaky monitors and alerts. It will also affect SLAs so it needs multiple levels of approval to change the status, that's why there are some delays.
This defeats the purpose of a status dashboard and is effectively useless in practice most of the time from a consumer's point of view.
From a business perspective, I think given the choice to lie a little bit or be brutally honest with your customers, lying a bit is almost always the correct choice.
My ideal would be regulation making it necessary that downtime metrics be reported, with at most somewhere between a 10- and 30-minute delay, as a "suspected reliability issue".
If your reliability metrics have lots of false positives, that's on you and you'll have to write down some reason why those false positives exist every time.
Then that company could decide for itself whether to update manually with "not a reliability issue because X".
This lets consumers avoid being gaslit, and businesses don't technically have to call it downtime.
It's not just a meme. The developer experience in Java is worse compared to some other popular programming languages like JavaScript, Go, Python, etc. The language is a bit verbose, and the compiling speed is slow, hence the development speed is slower. Java developers tend to overly abstract things, so the code tends to be unnecessarily complicated. The JVM also has a high memory footprint, the startup speed is slow, and it requires warming up the JIT. Some popular libraries and frameworks overuse reflection and annotation, they are nice to use but are nightmares to debug when issues happen. This is why GraalVM and Kotlin have been gaining popularity recently, as they aim to address several issues with the JVM and Java. The biggest strengths of Java are its ecosystem and community.
I'm not sure if any of this is true, and you seem to be contradicting yourself. Java is far less verbose than Go, and the compiler is leagues faster than Kotlin's, GraalVM's native compiler, and probably most other languages', and I'm sure it's faster than Babel. javac doesn't do any optimizations; it just emits bytecode. Why is it acceptable for Go to be verbose and kotlinc to be slow?
I'm not saying that Go or Kotlin is better than Java in all of those aspects. I'm just saying that Kotlin and GraalVM exist because Java and the JVM have certain issues or limitations. As for Go, I'm just saying that the developer experience is better. It's not just my personal experience; you can find the same result in any developer survey.
I clearly wasn't surveyed. I don't even like Java, but I would need an extender to my ten-foot pole before touching Go. (Unless we are talking microcontroller embedded software, and I could only choose between Java and Go.)
I initially disliked Go as well, but it's a language that can only be truly appreciated when you start using it. Many of its advantages come from its simplicity, explicitness, fast compilation speed, single executable binary, and some opinionated choices made by its authors. Unless you're working with a legacy system, Go (or Rust) is the preferred choice nowadays for distributed systems and systems software (Kubernetes, TiDB, and Traefik, to name a few). It's also the default choice for numerous internet companies like Uber, ByteDance, and Monzo.
Here is a survey if you'd like to participate this year, but I don't think it will significantly alter the results.
A Python codebase never gets to the size of a Java one for the equivalent functionality. There's a reason the other ongoing mocking of Java is about its abundance of IAbstractGeneratorFactoryFactory classes.
This is true, but Python codebases never reach the feature parity of large Java codebases. I like Python up to about 10 kLOC or so; after that I tend to forget what is going on where, but in Java the IDE just knows what is going on where.
I hear you, and if you're using anything other than PyCharm (or Eclipse with PyDev back in the day), then that's what you'll find. But try it with PyCharm: it works flawlessly, with near-perfect refactoring on million-line Python codebases.
> The developer experience in Java is worse compared to some other popular programming languages like JavaScript, Go, Python,
Wait, what?
Go, maybe.
But the dev experience in languages that are only able to catch errors at runtime, like JavaScript and Python, is painful!
Looking at existing Python/Node.js codebases, half the automated tests are there simply to catch errors that statically typed languages catch for free, and even those tests aren't a match for a statically typed language anyway.
I hate, hate, hate, HATE working on languages where my only options are:
1. Pray that no future code calls this function I just wrote with the wrong parameter types.
2. Write tests for all the callers of that function, to defend against some caller being invoked with some combination of arguments that results in the function being called with the wrong types, while knowing full well I can't cover all possible cases like I would in a statically typed language. (A sketch of this failure mode is below.)
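A minimal sketch of that failure mode in plain Python (the function name and values are hypothetical, purely for illustration): nothing stops the bad call from shipping, and it only blows up when that exact line runs, i.e. only if some test happens to exercise it.

    # pricing.py - no type information, so nothing stops a bad caller
    def apply_discount(price, percent):
        # intended contract: price is a float, percent is a number between 0 and 100
        return price - price * (percent / 100)

    # Somewhere else in the codebase, months later:
    total = apply_discount("19.99", 10)  # wrong type slips straight through
    # Fails only when this line executes:
    # TypeError: can't multiply sequence by non-int of type 'float'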
Honestly, the first step in writing software is modelling the data types and structures[1]. In Node.js and Python you can model all you want but enforcement is left to developer discretion.
At this point in time, having done a few Node.js and Python backends, the dev experience in C11 is superior.
In order of least painful to most painful, in my experience of writing backends:
1. Go
2. Java
3. C#
4. C
5. C++
6. PHP, Python, Node, Ruby, etc.
Those languages in #6 above are popular because they allow you to hodge-podge your system together.
[1] The second step is modelling the data flow, of course.
Perhaps your experience with them was a long time ago. I agree that working with dynamic typing languages can be painful, but Python has had type hints since version 3.5, and JavaScript has Flow, or you can use TypeScript.
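To make the type-hints point concrete, here's a minimal sketch (the apply_discount function and its values are hypothetical, purely for illustration): once the signature is annotated, a static checker such as mypy flags the bad call before the code ever runs, reporting roughly what's shown in the trailing comment.

    # pricing.py - the same idea, now with PEP 484 type hints
    def apply_discount(price: float, percent: float) -> float:
        """Return price reduced by percent (expected range 0-100)."""
        return price - price * (percent / 100)

    total = apply_discount("19.99", 10)  # flagged statically, not at runtime

    # $ mypy pricing.py
    # error: Argument 1 to "apply_discount" has incompatible type "str"; expected "float"

Of course this only helps if the annotations are kept honest and the checker actually runs in CI, which is exactly the "enforcement is left to developer discretion" caveat raised above.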
You can't be serious comparing ASP.NET Core and EF Core to the abomination that is writing a full back-end, including DB access, in Go or Java (assuming Spring Boot and Hibernate).
I think the industry's criticism of AWS is understandable, msw. I believe it is time for AWS to come up with a more sustainable method to support the open-source community. By sustainable, I mean financial support and dedicated resources for contributing back to open source. Given your position, I hope you can initiate this type of change. Allocating 0.5 or 1% of AWS's revenue or even profit from each service that utilizes open-source software is unlikely to significantly affect the financial statements, yet it would represent a significant contribution to the open-source community.
What I meant is a systematic approach to review and reconsider the support mechanisms for all of AWS's current open-source offerings, including those that AWS uses behind the scenes but does not disclose to the public, not just a few services or examples.