So the makers proudly say
Will optimize its program
In an almost human way.
And truly, the resemblance
Is uncomfortably strong:
It isn't merely thinking,
It is even thinking wrong.
Piet Hein wrote that in reference to the first operator-free elevators, some 70+ years ago.
What you call hallucination, I call misremembering. Humans do it too. The LLM failure modes are very similar to human failure modes, including making stuff up, being tricked into doing something they shouldn't, and even getting mad at their interlocutors. Indeed, they're not merely thinking, they're even thinking wrong.
I don't think it's very salient that LLMs make stuff up, or can be manipulated into saying something they have been trained not to say. An LLM applies a statistical model to the problem of probability assignment over a range of tokens; a token of high probability is selected and the process repeats. This is not what humans do when humans think.
Given that GPT-4 is simply a large collection of numbers that combine with their inputs via arithmetic manipulation, resulting in a sequence of numbers, I find it hard to understand how it's "thinking".
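To make that concrete, here's a deliberately toy sketch of the loop being described. Everything in it is made up for illustration (a tiny vocabulary, random weights, conditioning on only the previous token); a real LLM conditions on the whole context with billions of weights, but the shape of the computation is the same: arithmetic on a grid of numbers yields a probability for every token, a likely token is picked, and the process repeats.

```python
import math
import random

# Toy, made-up illustration of the loop described above: the "model" is just
# a grid of numbers (weights) combined with its input by plain arithmetic,
# yielding a probability for every token in the vocabulary; a likely token
# is picked and the process repeats with the longer sequence.
VOCAB = ["the", "cat", "sat", "on", "mat", "."]
WEIGHTS = [[random.uniform(-1.0, 1.0) for _ in VOCAB] for _ in VOCAB]

def next_token_probs(last_token):
    # For brevity this conditions only on the previous token (a bigram model);
    # a real LLM conditions on the entire context, but the step from
    # arithmetic to a probability distribution is the same idea.
    scores = WEIGHTS[VOCAB.index(last_token)]
    exps = [math.exp(s) for s in scores]  # softmax: scores -> probabilities
    total = sum(exps)
    return [e / total for e in exps]

def generate(start_token, steps=8):
    tokens = [start_token]
    for _ in range(steps):
        probs = next_token_probs(tokens[-1])
        # Select a high-probability token (here, sampled in proportion to
        # its probability), append it, and repeat.
        tokens.append(random.choices(VOCAB, weights=probs)[0])
    return " ".join(tokens)

print(generate("the"))
```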
Are you sure? Our senses have gaps that are constantly being filled in all day long; it just gets more noticeable when our brain is exhausted and makes errors.
For example, when sleep-deprived, people will see things that aren't there, but in my own experience those things are far more likely to be things that could be there and make sense in context. I was walking around tired last night and saw a cockroach (because I'd been thinking about cockroaches, having killed one earlier), but on closer inspection it was a shadow. The same has happened with other things in the past: jackets on a chair, people while driving, etc. It seems to me that, at least when my brain is struggling, it fills in the gaps with things it has seen before in similar situations. That sounds a lot like probabilistic extrapolation from possibilities. I could see this capacity extending to novel thought with a few tweaks.
> Given that GPT-4 is simply a large collection of numbers that combine with their inputs via arithmetic manipulation, resulting in a sequence of numbers, I find it hard to understand how it's "thinking".
Reduce a human to atoms and identify which ones cause consciousness or thought. That is the fundamental paradox here, and it's why people think thought is a consequence of the system as a whole, something that could apply to technology as well.
We talk about "statistical models", and even "numbers" but really those things are just abstractions that are useful for us to talk about things (and more importantly, design things). They don't technically exist.
What exists are voltage levels that cause different stuff to happen. And we can't say much more about what humans do when humans think. You can surely assign abstractions to that too: interpret neural spiking patterns as exotic biological ways to approximate numbers, or whatever.
As it happens, I do think our differences from computers matter. But they're not due to our implementation details.