Some people point at LLMs confabulating, as if this wasn’t something humans are already widely known for doing.
I consider it highly plausible that confabulation is inherent to scaling intelligence. To run computation on data whose dimensionality makes direct computation infeasible, you will most likely need to create a lower-dimensional representation and compute on that. Collapsing the dimensionality is going to be lossy, which means there will be gaps between what the model thinks reality is and what it actually is.
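A toy illustration of the claim above (my own sketch, not the commenter's): project high-dimensional data onto a low-dimensional representation and reconstruct it, and some information is unavoidably gone. The numbers and rank choice here are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 50))          # 100 points in 50 dimensions

# Best rank-5 approximation via SVD: a 5-dimensional representation.
mean = X.mean(axis=0)
U, S, Vt = np.linalg.svd(X - mean, full_matrices=False)
k = 5
X_low = (X - mean) @ Vt[:k].T           # the compressed representation
X_rec = X_low @ Vt[:k] + mean           # reconstruction from it

# Relative reconstruction error: nonzero, because the collapse was lossy.
err = np.linalg.norm(X - X_rec) / np.linalg.norm(X)
print(f"relative reconstruction error: {err:.2f}")
```

The gap between `X` and `X_rec` is the "gap between what it thinks is the reality and what is" in miniature: anything answered from the 5-dimensional representation can be wrong about details the compression threw away.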
The concern for me about LLMs confabulating is not that humans don't do it. It's that the massive scale at which LLMs will inevitably be deployed makes even the smallest confabulation extremely risky.
I don't understand this. Many small errors distributed across a large deployment sounds a lot like normal mode of error prone humans / cogs / whatevers distributed over a wide deployment.
There's a difference between 1000 diverse humans with varied traits making errors that should cancel out because of the law of large numbers vs 10 AI with the same training data making errors that would likely correlate and compound upon each other.
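A rough simulation of this point (my own sketch; the agent counts and correlation level are arbitrary): independent errors shrink when averaged, while errors sharing a common component, such as the same training data, do not cancel.

```python
import random

random.seed(1)

def mean_abs_error(n_agents, n_trials, shared_fraction):
    """Average |mean error| across agents whose individual errors mix an
    independent component with a component shared by all agents."""
    total = 0.0
    for _ in range(n_trials):
        shared = random.gauss(0, 1)  # error source common to every agent
        errors = [shared_fraction * shared +
                  (1 - shared_fraction) * random.gauss(0, 1)
                  for _ in range(n_agents)]
        total += abs(sum(errors) / n_agents)
    return total / n_trials

independent = mean_abs_error(1000, 200, shared_fraction=0.0)  # diverse agents
correlated  = mean_abs_error(1000, 200, shared_fraction=0.9)  # shared training
print(independent, correlated)
```

With no shared component, the law of large numbers drives the aggregate error toward zero; with a strongly shared component, adding more agents barely helps, which is the compounding-correlated-errors worry.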
Let's say a given B2B system deployment typically requires 100 custom behaviours/scripts and 3 years worth of effort. A team of ten people can execute such a deployment in 3-4 months. The team has the capacity to fix up issues caused by small human errors as they arise, since they show up roughly once a week.
With the advent of LLMs, a new deployment now takes 3 days. Consequently, errors requiring human attention crop up several times a day.
I have yet to see a comparison of human vs. LLM confabulation errors at scale.
"Many small errors" makes a presumption about LLM confabulation/hallucination that seems unwarranted. Pre-LLM humans (and our computers) have managed vast nuclear arsenals, bioweapons research, and ubiquitous global transport - as a few examples - without any catastrophic mistakes, so far. What can we reasonably expect as a likely worst case scenario if LLMs replace all the relevant expertise and execution?
Your project vue-skuilder has 6 github action steps devoted to checking the work you do before it's allowed to go out. You do not trust yourself to get things right 100% of the time.
I am watching people trust LLM-based analysis and actions 100% of the time without checking.
If you want to call it that, I find the confabulation in LLMs extreme. That level of confabulation would most likely be diagnosed as dementia in humans.[0] Hence, it is considered a bug not a feature in humans as well.
Now imagine a high-skilled software engineer with dementia coding safety-critical software...
> Some people point at LLMs confabulating, as if this wasn’t something humans are already widely known for doing.
I think we need to start rejecting anthropomorphic statements like this out of hand. They are lazy, typically wrong, and are always delivered as a dismissive defense of LLM failure modes. Anything can be anthropomorphized, and it's always problematic to do so - that's why the word exists.
This rhetorical technique always follows the form of "this LLM behavior can be analogized in terms of some human behavior, thus it follows that LLMs are human-like" which then opens the door to unbounded speculation that draws on arbitrary aspects of human nature and biology to justify technical reasoning.
In this case, you've deliberately conflated a technical term of art (LLM confabulation) with the concept of human memory confabulation and used that as a foundation to argue that confabulation is thus inherent to intelligence. There is a lot that's wrong with this reasoning, but the most obvious is that it's a massive category error. "Confabulation" in LLMs and "confabulation" in humans have basically nothing in common; they are comparable only in an extremely superficial sense. To then go on to suggest that confabulation might be inherent to intelligence isn't even really a coherent argument, because you've created ambiguity in the meaning of the word confabulate.
>this LLM behavior can be analogized in terms of some human behavior, thus it follows that LLMs are human-like
No, the argument is "this behavior is similar enough to human behavior that using it as evidence against <claim regarding LLM capability that humans have> is specious"
>"Confabulation" in LLMs and "confabulation" in humans have basically nothing in common
I don't know why you think this. They seem to have a lot in common. I call it sensible nonsense. Humans are prone to this when self-reflective neural circuits break down. LLMs are characterized by a lack of self-reflective information. When critical input is missing, the algorithm will craft a narrative around the available, but insufficient, information, resulting in sensible nonsense (e.g. neural disorders such as somatoparaphrenia).
> No, the argument is "this behavior is similar enough to human behavior that using it as evidence against <claim regarding LLM capability that humans have> is specious"
I'm not really following. LLM capabilities are self-evident, comparing them to a human doesn't add any useful information in that context.
> LLMs are characterized by a lack of self-reflective information. When critical input is missing, the algorithm will craft a narrative around the available, but insufficient information resulting in sensible nonsense (e.g. neural disorders such as somatoparaphrenia)
You're just drawing lines between superficial descriptions from disparate concepts that have a metaphorical overlap. It's also wrong. LLMs do not "craft a narrative around available information when critical input is missing", LLM confabulations are statistical, not a consequence of missing information or damage.
This is undermined by all the disagreement about what LLMs can do and/or how to characterize it.
>LLM confabulations are statistical, not a consequence of missing information or damage.
LLMs aren't statistical in any substantive sense. LLMs are a general purpose computing paradigm. They are circuit builders, the converged parameters define pathways through the architecture that pick out specific programs. Or as Karpathy puts it, LLMs are a differentiable computer[1]. So yes, narrative crafting in terms of leveraging available putative facts into a narrative is an apt characterization of what LLMs do.
We tried that. It was called Cyc. It never got even close to the level of capabilities a modern LLM has in an agentic harness — even on common sense and reasoning problems!
The key capability that humans have that I've yet to see in an LLM is the ability to recognize when they would not be capable of doing a task well, and to refuse to do it poorly instead. The only times I've ever seen LLMs give up on a problem are when the prompting is very explicitly crafted to elicit a response like that when necessary, or after very long back-and-forth exchanges where they get repeated feedback about unsatisfactory results. I think this has pretty dire implications for the consequences of deploying them in any scenario where failure has significant risk or the output can't be immediately audited for correctness.
> Some people point at LLMs confabulating, as if this wasn’t something humans are already widely known for doing.
Are you seriously making the argument that AI "hallucinations" are comparable and interchangeable to mistakes, omissions and lies made by humans?
You understand that calling AI errors "hallucinations" and "confabulations" is a metaphor to relate them to human language? The technical term would be "mis-prediction", which suddenly isn't something humans ever do when talking, because we don't predict words, we communicate with intent.
There are AI researchers who wrote blog posts that reached the HN front page about spiky spheres (I won't link the original blog post making that claim, to avoid hurt sentiments). Here's 3blue1brown correcting those AI/ML researchers' intuitions.
No. LLMs do not confabulate; they bullshit. There is a big difference. AIs do not care, cannot care, have no capacity to care about the output. String tokens in, string tokens out. Even if they have all the data perfectly recorded, they will still fail to use it for a coherent output.
> Collapsing the dimensionality is going to be lossy, which means it will have gaps between what it thinks is the reality and what is.
Confabulation has to do with degradation of biological processes and information storage.
There is no equivalent in an LLM. Once the data is recorded, it will be recalled exactly the same, down to the bit. An LLM representation is immutable. You can download a model 1000 times, run it for 10 years, etc., and the data is the same. The closest you get is if you store the data on a faulty disk, but that is not why LLM output is so awful; that would be a trivial problem to solve with current technology (like having a RAID and a few checksums).
I don't even think they bullshit, since that requires conscious effort that they do not and cannot possess. They just simply interpret things incorrectly sometimes, like any of us meatbags.
They make incorrect predictions of text to respond to prompts.
The neat thing about LLMs is they are very general models that can be used for lots of different things. The downside is they often make incorrect predictions, and what's worse, it isn't even very predictable to know when they make incorrect predictions.
I think this is leaning on the "lies are when you tell falsehoods on purpose; bullshit is when you simply don't care at all whether what you're saying is true" definition of bullshit. Cf. On Bullshit.
So, they can't lie, but they can (and, in fact, exclusively do) bullshit.
The correct answer is that the U and _ in the mdstat output cannot be mapped to the rest of the output by either position or the indexes in square brackets, so you can't tell the exact nature of the failure from the mdstat output alone (for the record, the failed disk was sda).
So all of the "analysis" was bullshit, including "it's probably multiple partitions from multiple drives". But there are so many juicy numbered and indexed bits of info to pattern match on!
Notice how for the follow-up question it "thought" for 4 minutes, going in circles trying to impose some sort of order on an essentially random ordering, and then bullshitted its way to "it is sdb".
> No. LLMs do not confabulate; they bullshit. There is a big difference. AIs do not care, cannot care, have no capacity to care about the output. String tokens in, string tokens out. Even if they have all the data perfectly recorded, they will still fail to use it for a coherent output.
Isn't "caring" a necessary pre-requisite for bullshitting? One either bullshits because they care, or don't care, about the context.
They're presumably referring to the Harry Frankfurt definition of bullshit: "speech intended to persuade without regard for truth. The liar cares about the truth and attempts to hide it; the bullshitter doesn't care whether what they say is true or false."
The bullshitter does have an objective in mind however. There is some ultimate purpose to his bullshitting. LLMs don't even have that. They just spew words.
The suggestion is that it is an intrinsic quality and therefore neither a feature nor a bug.
It's like saying, computation requires nonzero energy. Is that a feature or a bug? Neither, it's irrelevant, because it's a physical constant of the universe that computation will always require nonzero energy.
If confabulation is a physical constant of intelligence, then like energy per computation, all we can do is try to minimize it, while knowing it can never go to zero.
The test isn’t whether humans also create bullshit, but whether an honest actor knows when they are doing this and doesn’t do it on purpose. As the article points out, LLMs don’t say “I don’t know.” If you demand they do something that never appears in the training data, they just forge ahead, generate words, and make something up according to the statistical probabilities they have in the model weights. A human knows that he doesn’t know. That seems missing with current AIs.
Yes, and to me the evolution of life sure looks like an evolution of more truthful models of the universe in service of energy profit. Better model -> better predictions -> better profit.
I'm extremely skeptical that all of life evolved intelligence to be closer to truth only for us to digitize intelligence and then have the opposite happen. Makes no sense.
My understanding is that this is the opposite of what is typically understood to be true - organisms with less truthful (more reductive/compressed) perception survive better than those with more complete perception. "Fitness beats truth."
Fitness is effective truth prediction, appropriately scoped.
A frog doesn't need to understand quantum physics to catch a fly. But if the frog's model of fly movement was trained on lies, it will have a model that predicts poorly, won't catch flies, and will die.
There is another level to this in that the more complex and changing the environment the more beneficial a wider scoped model / understanding of truth.
However, if you are going to lean fully into Hoffman and accept that by default consciousness constructs rather than approximates reality, I think we will have to agree to disagree. Personally I subscribe to Karl Friston's free energy principle.
The question is not whether the alternative is perfect, the question is can it be made better than the status quo. It’s not that hard to come up with potential mitigations for the problems you state.
- A taxable threshold, so people who can’t afford lawyers and accountants don’t need to deal with it. Works well for family gifting.
- You don’t need to tax immediately; tax when the profit is realized, e.g. when you sell that art.
- Taking out a loan against an asset at an increased valuation should trigger a taxable event. (Eg. Stocks go from 1b to 2b valuation and you take out a 500m loan. You are realizing 250m of gains and should pay tax on that gain.)
- Eliminate stepped up cost basis. This is a ridiculous give away.
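The loan-triggers-realization proposal above can be sketched as arithmetic (my reading of the commenter's rule: gains are deemed realized in proportion to the unrealized gain embedded in the collateral; the function name and figures are the comment's hypothetical):

```python
def realized_gain_on_loan(cost_basis, current_value, loan_amount):
    """Gain deemed realized when borrowing against an appreciated asset:
    the loan amount times the fraction of current value that is gain."""
    gain_fraction = (current_value - cost_basis) / current_value
    return loan_amount * gain_fraction

# Stocks go from 1b to 2b valuation; a 500m loan is taken out.
# Half of the collateral's value is unrealized gain, so half the loan
# counts as realized gain: 250m.
gain = realized_gain_on_loan(cost_basis=1e9, current_value=2e9, loan_amount=5e8)
print(f"taxable gain realized: {gain / 1e6:.0f}m")
```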
So if I email the company a TOS and say that continuing to allow me to use the tool should be considered acceptance of my new TOS that should be valid? Sometimes it's amazing to see the legal contortions people use to justify bad behavior on the part of companies.
When did Ben Thompson go so far down the path to autocratic sympathizer? This is such an anti-democratic, anti-free market, anti-free speech view on this whole situation.
First, everything revolves around a core conceit that "might makes right". The idea that entities might push back with the tools at their disposal is treated as a fool's errand; you should just acquiesce.
The role of the legislative branch in deciding what private entities are allowed, or not allowed to do is treated as a side note. He equates the dictates of the executive branch as if it was the will of the United States itself, above even the Constitution.
It's dismissive of the rights of private companies and individuals to make decisions for themselves about the actions they take and whether or how they choose to transact within the law with parts of the executive branch.
He acts as if it's a foregone conclusion that every AI company should be considered an arm of the executive branch of the US government. The analogy to nuclear weapons is super flawed, there are multiple laws on the books (written into law by Congress) specifically regulating Nuclear research and development.
And most astonishingly he ends it dropping an implied threat of violence towards Anthropic (and assumedly anyone else who doesn't agree with his point of view):
> I don’t want that, and, more pertinently, the ones with guns aren’t going to tolerate it.
Yep, once you accept “might makes right” the laws in a democracy become polite suggestions. Oh, your town is in the way of hydropower? Too bad the gov’t has more guns than you. That’s how you get the Three Gorges Dam in China. Nevertheless, the Trump Mafia is demonstrating how paper thin democracy and rule of law really is in the US.
It’s a non-clause that is written to sound like they are doing something to prevent these uses when they aren’t. “You are not allowed to do illegal things” is meaningless, since they already can’t legally do illegal things. Plus the administration itself gets to decide if it meets legal use.
> “You are not allowed to do illegal things” is meaningless, since they already can’t legally do illegal things.
That's not quite right.
First off, I don't expect that "you used my service to commit a crime" is in and of itself enough to break a contract, so having your contract state that you're not allowed to use my service to commit a crime does give me tools to cut you off.
Second, I don't want the contract to say "if you're convicted of committing a crime using my service", I want it to say "if you do these specific things". This is for two reasons. First, because I don't want to depend on criminal prosecutors to act before I have standing. Second, because I want to only have to meet the balance of probabilities ("preponderance of evidence" if you're American) standard of evidence in civil court, rather than needing a conviction secured under "beyond a reasonable doubt" standard. IANAL, but I expect that having this "you can't do these illegal things except when they aren't illegal" language in the contract does put me in that position.
I don’t think the language does, or is intended to, give OpenAI any special standing in the courts.
They literally asked the DoD to continue as is.
There is no safety enforcement standing created because there is no safety enforcement intended.
It is transparently written, as a completely reactive response to Anthropic’s stand, in an attempt to create a perception that they care. And reduce perceived contrast with Anthropic.
If they had any interest in safety or ethics, Anthropic’s stand just made that far easier than they could have imagined. Just join Anthropic and together set a new bar of expectations for the industry and public as a whole.
They could collaborate with Anthropic on a common expectation, if they have a different take on safety.
The upside safety culture impact of such collaboration by two competitive leaders in the industry would be felt globally. Going far beyond any current contracts.
But, no. Nothing.
Except the legalese and an attempt to misleadingly pass it off as “more stringent”. These are not the actions of anyone who cares at all about the obvious potential for governmental abuse, or creating any civil legal leverage for safe use.
This is such a bad faith question, it's annoying to see it come up again as if there's any utility to asking it.
The question itself never specifies that the car you would be driving is the same one that you need to be washed. The car that needs to be washed could be waiting in the parking lot of the car wash already. It doesn't state that you plan on washing your car at the car wash. Perhaps the car wash sells car cleaning equipment that you can bring back to wash your car at home?
The question is designed to be ambiguous so the llm answers it in a way that seems facially absurd to the people who are in on the scheme. What it's actually showing is a failure of imagination for those asking the question.
Do you want your chatbot to be suspicious of you trying to trick it? To me this seems patently unhelpful outside of LLMs tuned for roleplay or to operate in a highly adversarial environment.
Do you want it to assume you are an idiot asking the question because you didn't realize you need to have the car at the car wash to wash it?
Or do you want it to take the best faith assumption as to what you are asking and try to be as helpful as possible given the poor question?
I've already mentioned disambiguations in the replies to myself. The ideal response would be the llm stating the most important assumptions it is making.
Given that these tendencies are not evenly distributed throughout the population, you can have structures that leverage the large mean to mitigate the worst tendencies of the extreme tails. Given that the natural state of things is that power begets more power, these are harder to build and maintain, but it can be done. In particular, Democracies and Republics are major historical examples of this.
An important point though is that llm code generation changes that tradeoff. The time/opportunity cost goes way down while the productivity penalty starts accumulating very fast. Outcomes can diverge very quickly.
When it comes to new emerging technologies everyone is searching the space of possibilities, exploring new ways to use said technologies, and seeing where it applies and creates value. In situations such as this, a positive sign is worth way more than a negative. The chances of many people not using it the right way are much much higher when no one really knows what the “right” way is.
It then shows hubris and a lack of imagination for someone in such a situation to think they can apply their negative results to extrapolate to the situation at large. Especially when so many are claiming to be seeing positive utility.
IMO this is a mistake, for basically the same reason you justify it with. Since most people just want the code to work, and the chances of any specific repo being malicious is low, especially when a lot of the repos you work with are trusted or semi-trusted, it easily becomes a learned behavior to just auto accept this.
Trust in code operates on a spectrum, not a binary. Different code bases have vastly different threat profiles, and this approach does close to nothing to accommodate that.
In addition, code bases change over time, and full auditing is near impossible. Even if you manually audit the code, most code is constantly changing. You can pull an update from git, and the audited repo you trusted can be no longer trustworthy.
An up-front, binary, persistent trust-or-don't-trust model isn't a particularly good match for either user behavior or the potential threats most users will face.
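One alternative the comments above gesture at could be sketched like this (my own hypothetical, not any real tool's API): tie trust to the exact audited content of a repo, so that trust is automatically invalidated when the tree changes, rather than persisting as a one-time binary flag.

```python
import hashlib

# repo name -> sha256 of the tree that was actually audited
audits = {}

def record_audit(repo: str, content: bytes) -> None:
    """Mark this exact content of the repo as audited/trusted."""
    audits[repo] = hashlib.sha256(content).hexdigest()

def still_trusted(repo: str, content: bytes) -> bool:
    """Trust does not survive content changes: a pull that alters the
    tree silently drops the repo back to untrusted."""
    return audits.get(repo) == hashlib.sha256(content).hexdigest()

record_audit("example/repo", b"v1 source tree")
print(still_trusted("example/repo", b"v1 source tree"))  # True
print(still_trusted("example/repo", b"v2 source tree"))  # False: needs re-audit
```

This addresses the "you can pull an update and the audited repo is no longer trustworthy" problem, though it still treats trust as per-revision rather than the fuller spectrum the comment asks for.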