Isn’t it true that the only thing that LLMs do is “hallucinate”?
The only way to know if it did “hallucinate” is to already know the correct answer. If you can make a system that knows when an answer is right or not, you no longer need the LLM!
Hallucination implies a failure of an otherwise sound mind. What current LLMs do is better described as bullshitting. As the bullshitting improves, it happens to be correct a greater and greater percentage of the time.
Sometimes when I am narrating a story I don't care that much about trivial details but focus on the connection between those details. Is there an LLM counterpart to such behaviour? In this case, one could say I was bullshitting on the trivial details.
It has nothing to do with ratio and everything to do with intent. Bullshitting is what we say you do when you just spin a story with no care for the truth and just make up stuff that sounds plausible. That is what LLMs do today, and what they will always do as long as we don't train them to care about the truth.
You can have a generative model that cares about the truth when it tries to generate responses; it's just that the current LLMs don't.
You can program a concept of truth into them, or maybe punish them for making mistakes instead of just rewarding them for replicating text. Nobody knows how to do that in a way that gets intelligent results today, but we know how to code things that output or check truths in other contexts: Wolfram Alpha is capable of solving tons of things and isn't wrong.
> (or any concepts at all).
Nobody here said that; that is your interpretation. Not everyone who is skeptical of current LLM architectures' future potential as AGI thinks that computers are unable to solve these things. Most people here who argue against LLMs don't think the problems are unsolvable, just not solvable by the current style of LLMs.
> You can program a concept of truth into them, ...
The question was: how do you do that?
> Nobody here said that, that is your interpretation.
What is my interpretation?
I don't think that the problems are unsolvable, but we don't know how to do it now. Saying "just program the truth into them" shows a lack of understanding of the magnitude of the problem.
Personally I'm convinced that we'll never reach any kind of AGI with LLMs. They lack any kind of model of the world that can be used to reason, and any concept of reasoning.
And I answered: we don't know how to do that, which is why we currently don't.
> Personally I'm convinced that we'll never reach any kind of AGI with LLMs. They lack any kind of model of the world that can be used to reason, and any concept of reasoning.
Well, for some definition of LLM we probably could. But probably not the way they are architected today. Nothing stops a large language model from having different things added to its training steps to enable new kinds of reasoning.
> What is my interpretation?
Well, I read your post as being on the other side. I believe it is possible to make a model that can reason about truthiness, but I don't think current style LLMs will lead there. I don't know exactly what will take us there, but I wouldn't rule out an alternate way to train LLMs that looks more like how we teach students in school.
Keywords like "epistemology" in the prompt. ChatGPT generally outperforms humans in epistemology substantially in my experience, and it seems to "understand" the concept much more clearly and deeply, and without aversion (lack of an ego or sense of self, values, goals, desires, etc?).
> It has nothing to do with ratio and everything to do with intent. Bullshitting is what we say you do when you just spin a story with no care for the truth and just make up stuff that sounds plausible
Do you people hear yourselves? You're discussing the state of mind of a pseudo-RNG...
An ML model's intent is its reward function. It strives to maximize rewards, just like a human does. There is nothing strange about this.
Humans are much more complex than these models, so they have many more concepts, which is why we need psychology. But some core aspects work the same in ML and in human thinking. In those cases it is helpful to use the same terminology for humans and machine learning models, because that helps transfer understanding from one domain to the other.
Does every thread about this topic have to have someone quibbling about the word “hallucination”, which is already an established term of art with a well understood meaning? It’s getting exhausting.
The term hallucination is a fundamental misunderstanding of how LLMs work, and continuing to use it will ultimately result in a confused picture of what AI and AGI are and what is "actually happening" under the hood.
Wanting to use accurate language isn't exhausting, it's a requirement if you want to think about and discuss problems clearly.
"Arguing about semantics" implies that there is no real difference between calling something A vs. calling it B.
I don't think that's the case here: there is a very real difference between describing something with a model that implies one (false) thing vs. a model that doesn't have that flaw.
If you don't find that convincing, then consider this: by taking the time to properly define things at the beginning, you'll save yourself a ton of time later on down the line – as you don't need to untangle the mess that resulted from being sloppy with definitions at the start.
This is all a long way of saying that aiming to clarify your thoughts is not the same as arguing pointlessly over definitions.
"Computer" used to mean the job done by a human being. We chose to use the meaning to refer to machines that did similar tasks. Nobody quibbles about it any more.
Words can mean more than one thing. And sometimes the new meaning is significantly different but once everyone accepts it, there's no confusion.
You're arguing that we shouldn't accept the new meaning - not that "it doesn't mean that" (because that's not how language works).
I think it's fine - we'll get used to it and it's close enough as a metaphor to work.
I'd be willing to bet that people did quibble about what "computer" meant at the time the meaning was transitioning.
It feels like you're assuming that we're already 60 years past re-defining "hallucination" and the consensus is established, but the fact that people are quibbling about it right now is a sign that the definition is currently in transition/ has not reached consensus.
What value is there in trying to shut down the consensus-seeking discussion that gave us "computer"? The same logic could be used to argue that "computers" should actually be called "calculators", so why are people still trying to call them "computers"?
you stole a term which means something else in an established domain and now assert that the ship has sailed, whereas a perfectly valid term in both domains exists. don't be a lazy smartass.
That's actually what the paper is about. I don't know why they didn't use that in the title.
> Here we develop new methods grounded in statistics, proposing entropy-based uncertainty estimators for LLMs to detect a subset of hallucinations—confabulations—which are arbitrary and incorrect generations.
If there's any forum which can influence a more correct name for a concept it's this one, so please excuse me while I try to point out that contemporary LLMs confabulate and hallucinating should be reserved for more capable models.
It’s well understood in the field. It’s not well understood by laymen. This is not a problem that people working in the field need to address in their literature.
> We need systems that try to be coherent, not systems that try to be unequivocally right, which wouldn't be possible.
The fact that it isn't possible to be right about 100% of things doesn't mean that you shouldn't try to be right.
Humans generally try to be right and these models don't; that is a massive difference you can't ignore. The fact that humans often fail to be right doesn't mean that these models shouldn't even try to be right.
By their nature, the models don’t ‘try’ to do anything at all—they’re just weights applied during inference, and the semantic features that are most prevalent in the training set will be most likely to be asserted as truth.
They are trained to predict the next word so that it is similar to the text they have seen; that is what I call what they "try" to do here. A chess AI tries to win since that is what it was encouraged to do during training, and current LLMs try to predict the next word since that is what they are trained to do. There is nothing wrong with using that word.
This is an accurate usage of "try": ML models at their core try to maximize a score, so what that score represents is what they try to do. And there is no concept of truth in LLM training, just sequences of words; they have no score for true or false.
Edit: Humans are punished as kids for being wrong all throughout school and in most homes, and that makes humans try to be right. That is very different from these models that are just rewarded for mimicking regardless if it is right or wrong.
> That is very different from these models that are just rewarded for mimicking regardless if it is right or wrong
That's not a totally accurate characterization. The base models are just trained to predict plausible text, but then the models are fine-tuned on instruct or chat training data that encourages a certain "attitude" and correctness. It's far from perfect, but an attempt is certainly made to train them to be right.
They are trained to replicate text semantically and then given a lot of correct statements to replicate, that is very different from being trained to be correct. That makes them more useful and less incorrect, but they still don't have a concept of correctness trained into them.
Exactly. If massive data poisoning happened, would the AI be able to know what the truth is when there is as much new false information as there is real information? It won't be able to reason about it.
I think this assumption is wrong, and it's making it difficult for people to tackle this problem, because people do not, in general, produce writing with the goal of producing truthful statements. They try to score rhetorical points, they try to _appear smart_, they sometimes intentionally lie because it benefits them for so many reasons, etc. Almost all human writing is full of falsehoods ranging from unintentional misstatements of fact to out-and-out deceptions. Like forget the politically-fraught topic of journalism and just look at the writing produced in the course of doing business -- everything from PR statements down to Jira tickets is full of bullshit.
Any system that is capable of finding "hallucinations" or "confabulations" in AI-generated text in general should also be capable of finding them in human-produced text, which is probably an unsolvable problem.
I do think that since the models do have some internal representation of certitude about facts, the smaller problem of finding potentially incorrect statements in a model's own output based on what it knows about the world _is_ possible, though.
The answer is no, otherwise this paper couldn't exist. Just because you can't draw a hard category boundary doesn't mean "hallucination" isn't a coherent concept.
(the OP is referring to one of the foundational concepts relating to the entropy of a model of a distribution of things -- it's not the same terminology that I would use but the "you have to know everything and the model wouldn't really be useful" is why I didn't end up reading the paper after skimming a bit to see if they addressed it.
It's why things in this arena are a hard problem. It's extremely difficult to actually know the entropy of certain meanings of words, phrases, etc., without a comical amount of computation.
This is also why a lot of the interpretability methods people use these days have some difficult and effectively permanent challenges inherent to them. Not that they're useless, but I personally feel they are dangerous if used without knowledge of the class of side effects that comes with them.)
The idea behind this research is to generate an answer a few times; if the results are semantically vastly different from each other, they are probably wrong.
> Isn’t it true that the only thing that LLMs do is “hallucinate”?
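To make that concrete, here is a rough Python sketch of the "sample several times and compare" idea. It is not the paper's method: the paper groups answers by meaning, whereas `normalize` below is a crude hypothetical stand-in that only treats answers as equivalent if they match after lowercasing and stripping punctuation. The sample answers are made up for illustration.

```python
import math
from collections import Counter


def normalize(answer: str) -> str:
    """Crude stand-in for semantic equivalence (an assumption, not the paper's
    approach): answers count as 'the same' if they match after lowercasing
    and dropping punctuation."""
    kept = "".join(ch for ch in answer.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def cluster_entropy(answers: list[str]) -> float:
    """Shannon entropy (in nats) over clusters of equivalent answers.
    Low entropy: the samples agree. High entropy: they scatter."""
    counts = Counter(normalize(a) for a in answers)
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values())


# Hypothetical samples for the same question, asked three times.
consistent = ["Paris", "paris.", "Paris"]
scattered = ["Paris", "Lyon", "Marseille"]

print(cluster_entropy(consistent))  # one cluster, so zero entropy
print(cluster_entropy(scattered))   # three clusters, so log(3)
```

Note that this only flags inconsistency. A model that confidently repeats the same wrong answer every time would still score low entropy.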
The Boolean answer to that is "yes".
But if Boolean logic were a good representation of reality, we would have solved that AGI thing ages ago. In practice, your neural network is trained on a lot of samples that have some relation to one another, and to the extent that those relations are predictable, the NN can be perfectly able to predict similar ones.
There's an entire discipline about testing NNs to see how well they predict things. It's the other side of the coin of training them.
Then we get to this "know the correct answer" part. If the answer to a question were predictable from the words of the question, nobody would ask it. So yes, it's a defining property of NNs that they can't create answers for questions like the ones people have been asking those LLMs.
However, they do have an internal Q&A database they were trained on. Except that the current architecture can not know if an answer comes from the database either. So, it is possible to force them into giving useful answers, but currently they don't.
"The temporal modulation appeared as a spatial pattern known as a ‘phantom array’ during the saccade. The appearance of the pattern enabled the discrimination of flicker from steady light at frequencies that in 11 observers averaged 1.98 kHz."
I personally can see it at substantially higher frequencies. And note that displaying 2kHz flicker requires a 4kHz monitor.
I think we should be less worried about people believing deep-fakes are real and more worried that politicians (and others) will be able to claim that things they actually said are deep-fakes.
That depends on the relative frequency of the two modes. If 90% of videos of each presidential candidate were deepfakes good enough that you cannot tell if they are real or not, then that is a worse problem than the possibility that one real video can be denied.
In both cases, though, the solution would be to find a way to allow people to tell deepfakes from real videos. For instance by having some sort of certificate provider agencies digitally sign them.
This would still make it hard to tell a deepfake from an unsigned video, of course.
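A toy Python sketch of that signing idea, under loud assumptions: a real scheme would use public-key signatures so anyone can verify without holding a secret, but the standard library has none, so this uses an HMAC with a shared key as a stand-in. `AGENCY_KEY`, `sign_video` and `check_video` are hypothetical names, not any existing service's API.

```python
import hashlib
import hmac
from typing import Optional

# Hypothetical shared secret held by the signing agency. A real system would
# use a private/public key pair instead (assumption: HMAC is only a stand-in).
AGENCY_KEY = b"hypothetical-agency-key"


def sign_video(video_bytes: bytes, key: bytes = AGENCY_KEY) -> str:
    """Return a hex tag that binds the key holder to these exact bytes."""
    return hmac.new(key, video_bytes, hashlib.sha256).hexdigest()


def check_video(video_bytes: bytes, signature: Optional[str],
                key: bytes = AGENCY_KEY) -> str:
    """Classify a video as 'verified', 'tampered', or 'unsigned'."""
    if signature is None:
        # No signature tells us nothing: it may be real or a deepfake.
        return "unsigned"
    expected = sign_video(video_bytes, key)
    # compare_digest avoids leaking information through timing differences.
    return "verified" if hmac.compare_digest(expected, signature) else "tampered"


original = b"raw video frames"
tag = sign_video(original)

print(check_video(original, tag))               # verified
print(check_video(b"edited video frames", tag))  # tampered
print(check_video(original, None))              # unsigned
```

The third case is the limitation mentioned above: a missing signature does not distinguish a deepfake from a genuine video that simply was never signed.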
I think the important part of the technique is talking about what you see, not the actual act of seeing things. He talks about creating brain pathways between the visual and linguistic parts of your brain.
Sure, but "were they in the office" is /much/ easier to measure at scale. And I'm sure you'd throw a pretty big fit if your pay was docked because the company didn't think you did a good enough job, even though you were in the office.
Alas, “this metric doesn’t track what we care about, BUT it’s easier to collect, so let’s use it” explains a lot of really bad policies and practices in our world.
You "just" need thousands of employees to do additional labor. And then you need to process all of that data, including adjusting for the variability in how supervisors measure performance.
And it would only notify someone for human review if a certain threshold was reached; just having one or two violating images would not have tripped the system.
I emailed them last week to ask about that and they said it’s currently only available in Europe, but they’d hope to have it released in the Americas this week if all goes well. This is on iOS.
Did they say why it's not available in the US? I installed it a year or so ago before it was removed from the US app store, which is why I still have it on my phone.
It seems to me that something really dangerous or illegal needs to be going on to morally justify publicly releasing 100GB worth of (presumably unfiltered) private company data.
The data also contains salary information and private addresses of Tesla employees, which clearly don’t belong to the category of “whistleblower data”, even under EU laws.