
I find it hard to get too excited by tests like "Which episode of Gilligan’s Island was about mind reading?" because they reflect a desire for a world in which the goal is to keep on growing LLMs until they can answer even questions like that one entirely from their trained model weights.

This seems like a wasteful exercise to me. Are we really going to retrain our largest models on a weekly basis to teach them about what's happened recently?

I'm much more interested in learning about the smallest, fastest model we can create that can effectively manipulate language, "reason" about things, summarize and drive tools.

I want a model that can answer any question accurately because it knows how to look up extra information from reliable sources, and evaluate that information effectively once it finds it.




That's not the point of the Gilligan's Island test. If the LLMs had said something to the effect of "I don't know, I wasn't trained to answer trivia about television" then I would agree with your comment.

Instead they all confabulated! Either they made up an episode title or insisted the episode doesn't exist. That's a serious problem: if the LLM doesn't properly recognize that it doesn't know something, how will it properly recognize it needs to use a tool to obtain that knowledge?

It seems like the only answer we have is for a user to flag that GPT-4+Web flubbed the Gilligan test, OpenAI dispatches a data contractor for RLHF whack-a-mole + synthetic data generation, GPT learns to Google answers to Gilligan's Island prompts, then we cross our fingers and hope transformers are smart enough to transfer that knowledge to the Sanford and Son benchmark.


>It seems like the only answer we have is for a user to flag that GPT-4+Web flubbed the Gilligan test OpenAI dispatches a data contractor for RLHF whack-a-mole + synthetic data generation, GPT learns to Google answers to Gilligan's Island prompts, then we cross our fingers and hope transformers are smart enough to transfer that knowledge to the Sanford and Son benchmark.

Or maybe... let me run this up the flagpole and see if anyone salutes it... maybe we accept that LLMs have fundamental architectural limitations and can't do certain things like "math" or "anything requiring an awareness of context or factual accuracy" and don't use them for everything?

So like instead of that, which wouldn't even work because search engines have already been polluted by AI generated garbage so they would just be eating their own shit, we have search engines that actually work again, and people just look that stuff up? And LLMs get relegated to whatever niche they're actually useful in, rather than the current plan of rebuilding our entire technological society on top of them as soon as possible because money?


I know when I'm lying. It doesn't seem totally insane to think that there's a node/tensor (not calling it a neuron) inside the model that is activated when it's confabulating, which we could find, highlight, and program our way into preventing.


From what I understand of the way LLMs work, they don't "know" when they're confabulating or not. All of the text they generate is arbitrary, albeit not entirely random. Whether or not any particular response is useful is a matter of human interpretation.

The problem is the tendency to assume LLMs behave the same way humans do. You know you're lying when you're lying. LLMs don't even have a concept of a "lie." Even though you can ask one what a lie is, and it responds with an accurate answer, that's still just an arbitrary statistically based response. It doesn't actually know.


It's crazy to me how, even in here, there are so many people who just don't understand. Even the article we're discussing points this out:

> The implications are that LLMs do not perform reasoning over data in the way that most people conceive or desire.

> There is no self-reflection of its information; it does not know what it knows and what it does not. The line between hallucination and truth is simply a probability factored by the prevalence of training data and post-training processes like fine-tuning

It's not thinking, it's not conscious, it's just a mathematical function that's too complex for us to understand.


I think the best way to think of it is that it’s an estimation of the outputs of reasoning and knowledge. Of course this means that the models do need to model the reasoning process in some way, but it in no way implies that they model it in a way that’s rigorous or reliable.


I just got an interesting[0] response from Claude 3 Opus:

> I don't believe there was an episode of Gilligan's Island that centered around mind reading. The show ran for 3 seasons from 1964-1967 and featured the comic adventures of 7 castaways on an uncharted island. Typical plotlines involved their efforts to get rescued and zany schemes by Gilligan that would go awry. But to my knowledge, none of the 98 episodes had a storyline focused on mind reading or telepathic abilities.

It's probably the closest I can remember seeing an LLM get to saying "I don't know". "I don't believe there was" at least acknowledges the possibility of being incorrect and should prompt a careful user to do further research.

[0] Also interesting is that one of the article's comments shows Opus giving the correct episode title but incorrect details. So ... mixed bag.


> If the LLMs had said something to the effect of "I don't know, I wasn't trained to answer <question type> about <question topic>" ...

This is so very much the answer I want from an LLM every single time it's not reasonably certain about an answer to a query. Not "hallucinate" a plausible sounding answer and then confidently spew lies at me as if it's gospel truth.

I can accept "I don't know" from a human just fine, so I can damn sure accept it from a machine made by humans; but I'm far less tolerant of lies from a machine than I am from humans. Humans will lie for a great many reasons, a fair few of which are easily enough forgivable / understandable. A machine will generally "lie" for only a comparatively very few possible reasons (a physical flaw or defect in the machine; faulty data, be it purposeful or accidental; a human designed it to be untruthful, etc.), most of which are just plain largely unacceptable on multiple levels.

Even better than "I don't know" would be my fifth grade teacher's favorite answer ("way back in ye golden olden days") in situations where he didn't know the answer: "I don't know, but let's find out together." and then the research would proceed apace. An LLM should be capable of such quite easily one would think. They make great "research assistants" when trained on a relevant data set and guided properly with a well crafted system prompt, and they're centered around / trained upon human language so should be able to guide a human to available resources with pretty near zero hassle. :)


I think part of the point of the article is that LLMs don't lie because they are designed to just give you the next word, based on making a credible-sounding sentence or sequence of sentences. Expecting them to do more is an expectations problem based on the hype around GenAI.

I don't think we have the correct word for what LLMs do, but "lie" and "hallucination" are not really correct.


Saying "I don't know" doesn't require too much of a change. This isn't a different mode of operation where it's introspecting about its own knowledge - it's just the best continuation prediction in a context where the person/entity being questioned is not equipped to answer.

LLMs create quite deep representations of the input on which they base their next word prediction (text continuation), and it has been shown that they already sometimes do know when something they are generating is low confidence or false, so maybe with appropriate training data they could better attend to this and predict "I don't know" or "I'm not sure".

Improving the ability of LLMs to answer like this requires them to have a better idea of what is true or not. Humans do this by remembering where they learnt something: was it first-hand experience, or from a textbook or trusted friend, or from a less trustworthy source? LLMs' ability to discern the truth could be boosted by giving them the sources of their training data, maybe together with a trustworthiness rating (although they may be able to learn that for themselves).


I think hallucination is pretty close. It represents what happens when you give an answer based on what you think you remember even if that memory is not correct.

How many people would agree that P.T. Barnum said “There’s a sucker born every minute”? That would be a hallucination.

The quote is from Adam Forepaugh.


The best argument I have found against using "lie" or "hallucination" to describe LLMs' actions is that it humanizes them to people who don't know the inner workings of LLMs. Saying they lie implies intent, which is pretty bad, but even "hallucination" humanizes them unnecessarily. "Bullshitting" seems the best word to describe it, but even then intent can be assumed when there isn't any.


I said "confabulate" in my original post. "Confabulation" is a neurological symptom commonly seen in dementia patients, where a person isn't telling the truth because of errors in memory formation/recall or some other non-psychological problem in the brain. In particular, people who confabulate aren't aware their words are false and therefore it doesn't make sense to say that they're "lying." Likewise it's a problem with memory, not perception, so "hallucination" doesn't work either.

"Confabulation" still isn't great because humans confabulate with non-verbal memories and then express those confabulations in words; human confabulation mostly affects biographical memory, not subject matter knowledge. But considering how weird it is to even be talking about "memory" with a being that isn't aware of the passage of time, I think "confabulate" is the best option short of inventing a brand new word.


“Bullshit” is the _perfect_ term. Philosopher Harry Frankfurt wrote a book called On Bullshit where he defines the term as speech or writing intended to persuade without regard for the truth. This is _exactly_ what LLMs do. They produce text that tries to reproduce the average properties of texts in their training data and the user preferences encoded in their RLHF training. None of that has anything to do with the truth. At best you could say they are engineered to try to give the users what they want (e.g. what the engineers building these systems think we want), which is, again, a common motive of bullshitters.


"Bullshit" doesn't work because it requires a psychological "intent to persuade," but LLMs are not capable of having intentions. People intentionally bullshit because they want to accomplish specific goals and adopt a cynical attitude towards the truth; LLMs incidentally bullshit because they aren't capable telling the difference between true and false.

Specifically: bullshitters know they are bullshitting and hence they are intentionally deceptive. They might not know whether their words are false, but they know that their confidence is undeserved and that "the right thing to do" is to confess their ignorance. But LLMs aren't even aware of their own ignorance. To them, "bullshitting" and "telling the truth" are precisely the same thing: the result of shallow token prediction, by a computer which does not actually understand human language.

That's why I prefer "confabulate" to "bullshit" - confabulation occurs when something is wrong with the brain, but bullshitting occurs when someone with a perfectly functioning brain takes a moral shortcut.


I don’t like “confabulate” because it has a euphemistic quality. I think most people hear it as a polite word for lying (no matter the dictionary definition). And this is a space that needs, desperately needs, direct talk that regular people can understand. (I also think “confabulate” implies intention just as much as “bullshit” to most people.)


You’re right about the model’s agency. To be precise I’d say that LLMs spew bullshit but that the bullshitters in that case are those who made the LLMs and claimed (in the worst piece of bullshit in the whole equation) that they are truthful and should be listened to.

In that sense you could describe LLMs as industrial-strength bullshit machines. The same way a meat processing plant produces pink slime at the design of its engineers, so too do LLMs produce bullshit at the design of theirs.


> I don't think we have the correct word for what LLMs do but lie and hallucinations are not really correct.

I believe 'bullshit' is accurate, as in "The chatbot didn't know the answer, so it started bullshitting".


What does it even mean to lie for a text generator? It outputs the most probable continuation of the given input and that continuation is indeed the most probable in its training dataset. We don't say that DNA sequences are true or false.


Good prediction involves modelling the data generator, including hidden state such as level of knowledge or uncertainty, motivation, or tendency to lie.

If you ask a question of someone/something who is ill equipped to answer, then (assuming you have not modeled them as inveterate bullshitter) a good predicted response is "I don't know".

Deliberately lying due to a motivation to deceive is different from the default LLM mode of "just keep talking" bullshitting. The only "motivation" an LLM has is to predict next word, but if it knows that this requires lying then it will do so (e.g. give it a setup where it is someone motivated to lie).


> if the LLM doesn't properly recognize that it doesn't know something, how will it properly recognize it needs to use a tool to obtain that knowledge?

Maybe by doing it all/most of the time, the way that LLM/search hybrids like Perplexity and Bing/Copilot already do?

Ideally an LLM would be trained to know (or, better, learn for itself) when it's appropriate to use different types of tool. Web search (or offline Wikipedia lookup) could be the default.
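
Roughly the shape I have in mind, as a sketch; call_llm and web_search here are hypothetical stand-ins for whatever model and search API you'd actually wire up:

  # Hypothetical helpers standing in for a real model and search API.
  def call_llm(prompt: str) -> str:
      raise NotImplementedError

  def web_search(query: str) -> str:
      raise NotImplementedError

  def answer(question: str) -> str:
      # First ask the model to route the question: answer from weights, or look it up.
      decision = call_llm(
          "Reply with exactly 'search' or 'answer'. Should the following "
          f"question be answered via a web search rather than from memory?\n{question}"
      )
      if decision.strip().lower().startswith("search"):
          snippets = web_search(question)
          return call_llm(
              "Using only these sources, answer the question.\n"
              f"Sources:\n{snippets}\n\nQuestion: {question}"
          )
      return call_llm(question)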


It should be possible to come up with generalizable algorithms to determine a confidence score for any output, something akin to strong or weak connections in the brain. Is the response to the prompt supported by robust connections and strong weights? Or is it flitting around in weird, weak pathways? If the confidence score is below a certain level, some sort of ‘fact check’ feedback loop kicks in. Isn’t it roughly that simple?
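
Something like this, as a sketch; generate_with_logprobs and fact_check are hypothetical stand-ins for a real inference API and a real retrieval/verification step, and average token log-probability is at best a crude proxy for "confidence":

  # Hypothetical helpers: wire these to a real inference API and a real
  # retrieval / fact-checking step.
  def generate_with_logprobs(prompt: str) -> tuple[str, list[float]]:
      raise NotImplementedError

  def fact_check(prompt: str, draft: str) -> str:
      raise NotImplementedError

  def answer(prompt: str, threshold: float = -1.5) -> str:
      # threshold is an arbitrary cutoff for illustration only.
      draft, token_logprobs = generate_with_logprobs(prompt)
      avg_logprob = sum(token_logprobs) / len(token_logprobs)
      if avg_logprob < threshold:
          # Weak, low-probability pathway: kick off the 'fact check' loop
          # instead of returning the draft as-is.
          return fact_check(prompt, draft)
      return draft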


That is already exactly how it works. The problem is that how confident it is has no relation to whether it actually knows. I can confidently spout bullshit all day long, if need be.


Maybe it's training data bias -> very few documents claim not to know something.


If you trained an LLM on questions where the answer “I don’t know” is possible, it would likely learn to answer any non yes/no question with “I don’t know”, since it’s probably the most common answer outside of yes/no


...and +1 on "confabulate"

to invent experiences or events that did not really happen


If I'm not mistaken, it also involves not knowing that you are doing so.

> fabricate imaginary experiences as compensation for loss of memory


> This seems like a wasteful exercise to me.

Except TFA is specifically about non-existent reasoning, self-reflection, emergent capabilities (like insights, discoveries, theories) in SoTA LLMs, and laments the misplaced hype, especially since it is instead accelerating erosion of privacy / societal values, and the distortion of truth / reality.

  substantially ironic that LLMs are failing at the primary use cases that are attracting billions of investment, but are rather proficient at the use cases we do not desire, such as destruction of privacy and liberty, a post-truth society, social manipulation, the severance of human connection, fountains of noise, the devaluation of meaning, and a plethora of other societal issues.


True, but it'd be nice if they could just answer "I don't know" unless they are able to use RAG to retrieve an answer.


The tooling around the model could be a lot better. The LLM is just a statistical model and the tooling takes the most likely token at each step (or samples from some of the most likely). Instead it could say "There are no high-probability completions here". Or you could give it a list of actual episode titles and it would select the most likely one.
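
A sketch of that last idea, scoring a list of candidate titles by the model's own likelihood (using a small stand-in model via Hugging Face transformers; the candidate titles here are illustrative, not a vetted episode list):

  import torch
  from transformers import AutoTokenizer, AutoModelForCausalLM

  # Small stand-in model; any causal LM with accessible logits would do.
  tok = AutoTokenizer.from_pretrained("gpt2")
  model = AutoModelForCausalLM.from_pretrained("gpt2")

  prompt = "The Gilligan's Island episode about mind reading is titled"
  # Illustrative candidates only; in practice this would be the full episode list.
  candidates = [' "Seer Gilligan"', ' "The Postman Cometh"', ' "Gilligan vs. Gilligan"']

  def continuation_logprob(prompt: str, continuation: str) -> float:
      # Sum the model's log-probabilities for the continuation tokens,
      # conditioned on the prompt.
      ids = tok(prompt + continuation, return_tensors="pt")["input_ids"]
      prompt_len = tok(prompt, return_tensors="pt")["input_ids"].shape[1]
      with torch.no_grad():
          logprobs = torch.log_softmax(model(ids).logits, dim=-1)
      total = 0.0
      for i in range(prompt_len, ids.shape[1]):
          total += logprobs[0, i - 1, ids[0, i]].item()
      return total

  scores = {c.strip(): continuation_logprob(prompt, c) for c in candidates}
  print(max(scores, key=scores.get))

If even the best candidate scores poorly, that would be the "no high-probability completions here" signal.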


It does seem like some would rather build a fish-giving vending machine where they can load it up with the fish discovered to date and get it to spit them back out vs a fishing machine that catches fish and distributes them.

But to me this post exemplifies a pet peeve with AI discussions to date, which is a tendency to want a single model to do it all.

Our brains are a network of specialized functions. Damage the hippocampus and your human also won't know the episode name.

But somehow if a model uses an external store it's 'cheating' and not just networking specialized tools, even though that's how our own brains work.


Exactly, we don't need more expansive (and expensive!) models, we need more accurate ones without hallucinations, and which are robust wrt prompts


I would love an LLM that says "I don't know" when it doesn't know, rather than extremely firmly saying "this is the answer", only for that to be 100% incorrect, not even sort of correct.


That would require awareness it doesn't have - kind of the point of the article.


It seems unrealistic to anticipate stronger AI that doesn't hallucinate. We're chasing a human-style intelligence and that is known to hallucinate like crazy (a lot of the most intelligent humans turn out to be crackpots - Bobby Fischer was one of the best meat-based chess engines for example).


The vast majority of humans - even intelligent humans - do not "hallucinate like crazy."

Given a list of episode descriptions of Gilligan's Island, the vast majority of humans - even intelligent humans - would either be able to discern the correct answer or say they don't know.

I understand why there is this drive to present the normal human mental and psychological baseline as being just as unstable as LLMs; there is just too much money behind LLMs not to want to aggressively normalize their faults as much as possible (just as with the faults in autonomous driving). But any human being who hallucinated or confabulated with as much regularity as LLMs would be considered severely mentally ill.


> any human being who hallucinated or confabulated with as much regularity as LLMs would be considered severely mentally ill.

ie, it is common enough that we have a label for it. And the stats on how many people have a mental illness are not encouraging. If you put a little fence around the people hallucinating and dehumanise them then sure, humans don't hallucinate. The problem with that argument is they are actually still people.


>ie, it is common enough that we have a label for it.

Having a label for something doesn't imply that it's common. We have labels for plenty of rare things as well.

Also, "mental illness" is a far more broad category than what's being discussed, which is specifically symptoms that resemble the hallucinations and confabulations of LLMs, at the frequency with which LLMs display them. Most mental illness doesn't involve hallucinations or confabulations That is not common in humans, in LLMs it's normal.

>If you put a little fence around the people hallucinating and dehumanise them then sure, humans don't hallucinate.

I'm not dehumanizing anyone, this isn't a rational argument, it's just an ad hominem.

> The problem with that argument is they are actually still people.

The problem is that isn't the argument, and you can't attack the argument on its merits.

The simple, plain, demonstrable non-prejudiced fact is LLMs confabulate and hallucinate far more than human beings. About 17% to 38% of normal, healthy people experience at least one visual hallucination in their lifetime. But hearing voices and seeing things, alone, still isn't what we're talking about. A healthy, rational human can understand when they see something that isn't supposed to be there. Their concept of reality and ability to judge it doesn't change. When it does change, that is schizophrenia, which would more accurately model what happens with LLMs. About 24 million people have schizophrenia - 0.32% of the population. And not even all schizophrenics experience the degree of reality dysfunction present in LLMs.

You are claiming that, in essence, all human beings have dementia and schizophrenia, and exhibit the worst-case symptoms all the time. We wouldn't even be able to maintain the coherence necessary to create an organized, much less technological, society if that were the case. And you're claiming that the only reason to believe otherwise must be bigotry against the mentally ill. Even your assertion upthread, that "a lot of the most intelligent humans turn out to be crackpots", isn't true.

Stop it. Stop white knighting software. Stop normalizing the premise that it isn't worth being concerned about the negative externalities of LLMs because humans are always worse, and thus deserve the consequences. The same attitude that leads people to state that it doesn't matter how many people autonomous cars kill, humans are categorically worse drivers anyway. I can't think of many attitudes more dehumanizing than that.


> I'm not dehumanizing anyone, this isn't a rational argument, it's just an ad hominem.

Well, you led with "The vast majority of humans - even intelligent humans - do not 'hallucinate like crazy'" and then followed up by identifying a vast category of humans who do, literally, hallucinate like crazy. Unless you want to argue that mental illness is actually the appropriate mindset for viewing the world, you probably want to include an argument for why you think it is OK to exclude them.

Humans hallucinate continuously. If you test them in any way it is common to get nonsense answers. The difference is that it isn't polite to ask humans questions that expose the madness; people tend to shy away from topics that others routinely get wrong.

It is quite hard to explain a typical scholastic test without hallucinations, particularly mistakes in maths, spelling, and the sciences. It isn't like there is some other correct answer to a math problem that someone could be confused by; people just invent operations that don't exist when questioned.

> The simple, plain, demonstrable non-prejudiced fact is LLMs confabulate and hallucinate far more than human beings.

That isn't true; the opposite is true. Humans couldn't answer the breadth of questions an LLM does without making up substantially more garbage. The only reason it isn't more obvious to you is because we structure society around not pressuring humans to answer arbitrary questions that test their understanding.


Which suggests the current approach based on LLMs might be a dead end and we need to explore others.


"Are we really going to retrain our largest models on a weekly basis to teach them about what's happened recently?"

Gilligan's Island was 60 years ago.


This is irrelevant to the point in the parent comment


All of the other parent comments have already been properly addressed in other comments.


Architecture isn't there yet for reasoning (extrapolation), just really good interpolation. To be fair, most people operate at an interpolation level as well.


I'm curious as to why Llama 3 specifically denies the existence of that episode, though.


What is the point of using an LLM with no prompt (or just the question alone for a prompt)? It sounds like... it would say something, but it would just be grammatically well-formed gibberish.


Unfortunately our time of having trustable information on the internet is rapidly dwindling as LLMs are going to shove garbage everywhere people can get it


>> I want a model that can answer any question accurately

Do you think it is possible Simon? Will we achieve that? Genuine question.


"I want a model that can answer any question accurately because it knows how to look up extra information from reliable sources"

So that very much depends on the "reliable sources" that we can grant it access to!

Honestly, even just giving models the ability to search Wikipedia (the live version, not some stale copy) goes a very long way already.
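
Even the Wikipedia tool itself is tiny; a sketch against the live MediaWiki search API (the snippets come back with HTML markup, so a real version would clean them up before handing them to the model):

  import requests

  def search_wikipedia(query: str, limit: int = 3) -> list[tuple[str, str]]:
      # Live MediaWiki search endpoint: returns (title, snippet) for the top hits,
      # which can then be fed back to the model as context.
      resp = requests.get(
          "https://en.wikipedia.org/w/api.php",
          params={
              "action": "query",
              "list": "search",
              "srsearch": query,
              "srlimit": limit,
              "format": "json",
          },
          timeout=10,
      )
      resp.raise_for_status()
      return [(hit["title"], hit["snippet"])
              for hit in resp.json()["query"]["search"]]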



