
I am not an expert, but I have a serious counterpoint.

While training LLMs to replicate the human output, the intelligence and understanding EMERGES in the internal layers.

It seems trivial to do unsupervised training on scientific data, such as star movements, and discover closed-form analytic models for those movements. Deriving Kepler's laws and Newton's equations should be fast and trivial, and by that afternoon you'd have much more profound models with 500+ variables which humans would struggle to understand but which explain the data.
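
To make that concrete, here is a toy sketch of the kind of rediscovery I mean (numpy, standard textbook values for six planets, no LLM involved; a real discovery system would of course have to find the functional form on its own):

    import numpy as np

    # Semi-major axis (AU) and orbital period (years) for six planets.
    a = np.array([0.387, 0.723, 1.000, 1.524, 5.203, 9.537])
    T = np.array([0.241, 0.615, 1.000, 1.881, 11.862, 29.457])

    # Fit log T = k * log a + c; Kepler's third law says k should be 1.5.
    k, c = np.polyfit(np.log(a), np.log(T), 1)
    print(f"fitted exponent: {k:.3f}")  # comes out at roughly 1.5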

AGI is what, Artificial General Intelligence? What exactly do we mean by general? Mark Twain said “we are all idiots, just on different subjects”. These LLMs are already better than 90% of humans at understanding any subject, in the sense of answering questions about that subject and carrying on meaningful and reasonable discussion. Yes occasionally they stumble or make a mistake, but overall it is very impressive.

And remember, if we care about practical outcomes: as soon as ONE model can do something, ALL COPIES OF IT CAN. So you can reliably get unlimited agents that are better than 90% of humans at understanding every subject. That is a very powerful baseline for replacing most jobs, isn't it?



Anthropomorphization is doing a lot of heavy lifting in your comment.

> While training LLMs to replicate the human output, the intelligence and understanding EMERGES in the internal layers.

Is it intelligence and understanding that emerges, or is it that applying clever statistics to the sum of human knowledge can surface patterns in the data that humans have never considered?

If this were truly intelligence we would see groundbreaking advancements in all industries even at this early stage. We've seen a few, which is expected when the approach is to brute force these systems into finding actually valuable patterns in the data. The rest of the time they generate unusable garbage that passes for insightful because most humans are not domain experts, and verifying correctness is often labor intensive.

> These LLMs are already better than 90% of humans at understanding any subject, in the sense of answering questions about that subject and carrying on meaningful and reasonable discussion.

Again, exceptional pattern matching does not imply understanding. Just because these tools are able to generate patterns that mimic human-made patterns, doesn't mean they understand anything about what they're generating. In fact, they'll be able to tell you this if you ask them.

> Yes occasionally they stumble or make a mistake, but overall it is very impressive.

This can still be very impressive, no doubt, and can have profound impact on many industries and our society. But it's important to be realistic about what the technology is and does, and not repeat what some tech bros whose income depends on this narrative tell us it is and does.


Well, you have to define what you mean by "intelligence".

I think there just hasn't been enough time. We can take the current LLM technology and put it in a pipeline that runs 24/7, checking its work and building up knowledge bases.

A lot of the stuff that you think of as "new and original ideas" is really just prompting an LLM to "come up with 20 original variations" or "20 original ways to combine" some building blocks it has already been trained on or that have been added to its context. If you do this frequently enough, and make sure to run acceptance tests (e.g. unit tests or whatever applies in your domain), then you can get quite far. In fact, you can generate the tests themselves as well. What's missing, essentially, is autonomous incremental improvement: acceptance testing and curation, not just generation, just like a GAN does when it generates novel images.
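
Roughly, the loop I have in mind looks like this (generate, accept, and score are placeholders you'd wire up to an LLM call, your acceptance tests, and whatever quality metric fits your domain; this is not any particular framework's API):

    def improve(seed, generate, accept, score, rounds=10, keep=5):
        """Generate variations, keep only those that pass acceptance tests,
        retain the best few, and repeat."""
        pool = [seed]
        for _ in range(rounds):
            # e.g. generate(item, n=20) = "come up with 20 original variations of <item>"
            candidates = [c for item in pool for c in generate(item, n=20)]
            survivors = [c for c in candidates if accept(c)]  # curation, not just generation
            if survivors:
                pool = sorted(survivors, key=score, reverse=True)[:keep]
        return pool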

"Exceptional pattern matching does not imply understanding." - You'll have to define what you mean by "understanding". I think we have to revisit the Chinese Room argument by John Searle. After all, if the book used by the person in the room is the result of training on Chinese, then "the whole Chinese room" with the book and operator may be said to "understand" Chinese.

It's not just pattern matching: emergent structures form in the model, which is a non-von-Neumann architecture, as it is being trained. Those structures are able to manipulate symbols in ways that are extremely useful and practical for an enormously wide range of applications!

If by "understand" we mean "meaningfully manipulate symbols and helpfully answer a wide range of queries" about something, then why would you say LLMs don't understand the subject matter? Because they sometimes make a mistake?

The idea that artificial intelligence or machines have to understand things exactly in the same way as humans, while arriving at the same or better answers, has been around for quite some time. Have you seen this gem by Richard Feynman from the mid 1980s? https://www.youtube.com/watch?v=ipRvjS7q1DI ("Can Machines Think?")


> Well, you have to define what you mean by "intelligence".

The burden of defining these concepts should be on the people who wield them, not on those who object to them. But if pressed, I would describe them in the context of humans. So here goes...

Human understanding involves a complex web of connections formed in our brains that are influenced by our life experiences via our senses, by our genetics, epigenetics, and other inputs and processes we don't fully understand yet; all of which contribute to forming a semantic web of abstract concepts by which we can say we "understand" the world around us.

Human intelligence is manifested by referencing this semantic web in different ways that are also influenced by our life experiences, genetics, and so on; applying creativity, ingenuity, intuition, memory, and many other processes we don't fully understand yet; and forming thoughts and ideas that we communicate to other humans via speech and language.

Notice that there is a complex system in place before communication finally happens. That is only the last step of the entire process.

All of this isn't purely theoretical. It has very practical implications in how we manifest and perceive intelligence.

Elsewhere in the thread someone brought up how Ramanujan achieved brilliant things based only on a basic education and a few math books. He didn't require the sum of human knowledge to advance it. It all happened in ways we can't explain, and that only a few humans are capable of.

This isn't to say that this is the only way understanding and intelligence can exist. But it's the one we're most familiar with.

In stark contrast, the current generation of machines don't do any of this. The connections they establish aren't based on semantics or abstract concepts. They don't have ingenuity or intuition, nor do they accrue experience. What we perceive as creativity depends on a random number generator. What we perceive as intelligence and understanding works by breaking down language written by humans into patterns of data, assigning numbers to specific patterns based on an incredibly large set of data manually pre-processed by humans, and outputting those patterns by applying statistics and probability.
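
Concretely, the "creativity" step is a weighted dice roll over candidate outputs. A toy sketch with made-up scores (not any particular model's numbers):

    import numpy as np

    logits = np.array([3.0, 2.5, 0.5, -1.0])  # model scores for four candidate next tokens
    temperature = 0.8                          # higher temperature = more "creative" draws

    probs = np.exp(logits / temperature)
    probs /= probs.sum()                       # softmax: scores become probabilities
    next_token = np.random.default_rng().choice(len(logits), p=probs)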

Describing that system as anything close to human understanding and intelligence is dishonest and confusing at best. It's also dangerous, as it can be interpreted by humans to have far greater capability and meaning than it actually does. So the language used to describe these systems accurately is important, otherwise words lose all meaning. We can call them "magical thinking machines", or "god" for that matter, and it would have the same effect.

So maybe "MatMul with interspersed nonlinearities"[1] is too literal and technical to be useful, and we need new terminology to describe what these systems do.
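
For what it's worth, taken at face value that phrase does describe the core computation; here's a toy forward pass in numpy with random, untrained weights:

    import numpy as np

    rng = np.random.default_rng(0)
    W1 = rng.standard_normal((8, 4))
    W2 = rng.standard_normal((4, 2))

    def forward(x):
        h = np.maximum(x @ W1, 0.0)  # MatMul, then a nonlinearity (ReLU)
        return h @ W2                # another MatMul

    print(forward(rng.standard_normal(8)))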

> I think we have to revisit the Chinese Room argument by John Searle.

I wasn't familiar with this, thanks for mentioning it. From a cursory read, I do agree with Searle. The current generation of machines don't think. Which isn't to say that they're incapable of thinking, or that we'll never be able to create machines that think, but right now they simply don't.

What the current generation does much better than previous generations is mimicking how thoughts are rendered as text. They've decisively passed the Turing test, and can fool most humans into thinking they're human via text communication. This is a great advancement, but it's not a sign of intelligence. The Turing test was never meant to be a showcase of intelligence; it's simply an Imitation Game.

> Those structures are able to manipulate symbols in ways that are extremely useful and practical for an enormously wide range of applications!

I'm not saying that these systems can't be very useful. In the right hands, absolutely. A probabilistic pattern matcher could even expose novel ideas that humans haven't thought about before. All of this is great. I simply think that using accurate language to describe these systems is very important.

> Have you seen this gem by Richard Feynman from the mid 1980s?

I haven't seen it, thanks for sharing. Feynman is insightful and captivating as usual, but also verbose as usual, so I don't think he answers any of the questions with any clarity.

It's interesting how he describes pattern matching and reinforcement learning back when those ideas were novel and promising, but we didn't have the compute available to implement them.

I agree with the point that machines don't have to mimic the exact processes of human intelligence to showcase intelligence. Planes don't fly like birds, cars don't run like cheetahs, and calculators don't solve problems like humans, yet they're still very useful. Same goes for the current generation of "AI" technology. It can have a wide array of applications that solve real world problems better than any human would.

The difference between those examples and intelligence is that something either takes off the ground and maintains altitude, or it doesn't. It either moves on the ground, or it doesn't. It either solves arithmetic problems, or it doesn't. I.e. those are binary states we can easily describe. How this is done is an implementation detail and not very important. Whereas something like intelligence is very fuzzy to determine, as you point out, and we don't have good definitions of it. We have some very basic criteria by which we can somewhat judge whether something is intelligent or not, but they're far from reliable or useful.

So in the same way that it would be unclear to refer to airplanes as "magical gravity-defying machines", even though that is what they look like, we label what they do as "flight" since we have a clear mental model of what that is. Calling them something else could potentially imply wrong ideas about their capabilities, which is far from helpful when discussing them.

And, crucially, the application of actual intelligence is responsible for all advancements throughout human history. Considering that current machines only excel at data generation, and at showing us interesting data patterns we haven't considered yet, not only is this a sign that they're not intelligent, but it's a sign that this isn't the right path to Artificial General Intelligence.

Hopefully this clarifies my arguments. Thanks for coming to my TED talk :)

[1]: https://news.ycombinator.com/item?id=44484682


Indeed, 90% of problems can be solved by googling, and that's what LLMs do. AGI is expected to be something more than a talking encyclopedia.



