> Moreover, a hallucination is a pathology. It's something that happens when systems are not working properly.
> When an LLM fabricates a falsehood, that is not a malfunction at all. The machine is doing exactly what it has been designed to do: guess, and sound confident while doing it.
> When LLMs get things wrong they aren't hallucinating. They are bullshitting.
Very important distinction, and again it shows the marketing bias toward making these systems seem different from what they are.
If we want to be pedantic about language, they aren't bullshitting either. Bullshitting implies an intent to deceive, whereas LLMs are simply trying their best to predict text. Nobody gains anything from using terms closely tied to human agency and intention.
The authors of this website have published one of the well-known books on the topic[0] (along with a course), and their definition is as follows:
"Bullshit involves language, statistical figures, data graphics, and other forms of presentation intended to persuade by impressing and overwhelming a reader or listener, with a blatant disregard for truth and logical coherence."
It does not imply an intent to deceive, just a disregard for whether the BS is true or not. In this case, I can see how the definition applies to LLMs in the sense that they are just doing their best to predict the most likely response.
If you provide them with training data where the majority of inputs agree on a common misconception, they will output similar content as well.
The authors have a specific definition of bullshit that they contrast with lying. In their definition, lying involves intent to deceive; bullshitting involves not caring if you’re deceiving.
Lesson 2, The Nature of Bullshit: “BULLSHIT involves language or other forms of communication intended to appear authoritative or persuasive without regard to its actual truth or logical consistency.”
I think the first part of that statement requires more evidence or argumentation, especially since models have shown the ability to practice deception. (you are right that they don't _always_ know what they know)
But sometimes when humans make things up, they also don't know that they may be wrong. It's like the reference to "known unknowns" and "unknown unknowns", or Dunning-Kruger personified. Basically you have three categories:
(1) Liars know something is false and have an intent to deceive (LLMs don't do this)
(2) Bullshitters may not know/care whether something is false, but they are aware they don't know
(3) Bullshitters may not know something is false, because they don't know all the things they don't know
At the pZombie level, we look at the LLM as if it were a black box and simply account for its behavior. At this level, LLMs claim to have knowledge and also claim knowledge of their limited knowledge ("I can't actually taste"). Approached from this direction, we are in (2): they have awareness that there are some things they don't know, but this awareness doesn't prevent them from pretending to have that knowledge.
If we consider it from the perspective of knowing what's happening inside LLMs, then I think the picture is different. The LLM is doing next-word prediction with constant compute time per token; the algorithm is quite clear. We know this is true because it runs on llama.cpp or mlx on our MacBooks as well as on the farms of B200s that we fear will destroy the atmosphere. So LLMs don't have any actual operational knowledge of the logic of their utterances (Dunning-Kruger, Dunning-Kruger...). What I mean is that the LLM can't and isn't analysing what it says; it's just responding to stimulus. Humans do this as well: it's easy to just chatter away to other people like a canary, but humans can also analyse what they are saying and strategically shape the messages they create. So I would say that LLMs cannot be concerned about what they do or don't know. The concern rests with us when we challenge them (or not) by asking "how can you know that chocolate tastes better than strawberry - you have never tasted either".
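To make that concrete, here is a minimal sketch of the constant-compute-per-token decoding loop being described. It's toy code under obvious assumptions: `logits_fn` is a hypothetical stand-in for one forward pass of a model, not any real library's API, and the point is only the shape of the loop, which runs identically whether the output ends up accurate or confidently wrong.

```python
import math
import random

def softmax(logits):
    # Numerically stable softmax over a plain list of floats.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def generate(logits_fn, prompt_tokens, max_new_tokens, temperature=1.0):
    # Autoregressive decoding: one forward pass per new token, the same
    # amount of compute each step. Nothing in this loop checks whether the
    # sampled token makes the output true - it only follows the distribution.
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        logits = logits_fn(tokens)                        # hypothetical model call
        probs = softmax([l / temperature for l in logits])
        next_token = random.choices(range(len(probs)), weights=probs)[0]
        tokens.append(next_token)
    return tokens
```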
If you make an LLM whose design goal is to state "I do not know" for any answer that is not directly in its training set, then none of the above statements hold.