>It seems like the only answer we have is: a user flags that GPT-4+Web flubbed the Gilligan test, OpenAI dispatches a data contractor for RLHF whack-a-mole + synthetic data generation, GPT learns to Google answers to Gilligan's Island prompts, then we cross our fingers and hope transformers are smart enough to transfer that knowledge to the Sanford and Son benchmark.
Or maybe... let me run this up the flagpole and see if anyone salutes it... maybe we accept that LLMs have fundamental architectural limitations and can't do certain things like "math" or "anything requiring an awareness of context or factual accuracy" and don't use them for everything?
So like instead of that, which wouldn't even work because search engines have already been polluted by AI generated garbage so they would just be eating their own shit, we have search engines that actually work again, and people just look that stuff up? And LLMs get relegated to whatever niche they're actually useful in, rather than the current plan of rebuilding our entire technological society on top of them as soon as possible because money?
I know when I'm lying. It doesn't seem totally insane to think that there's a node/tensor (not calling it a neuron) inside the model that activates when it's confabulating, and that we could find it, highlight it, and program our way into making that not happen.
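To make that concrete, the kind of experiment this suggests is a linear probe over hidden activations. The sketch below is purely illustrative: the activations and labels are synthetic stand-ins, the "middle layer" framing is an assumption, and nothing here claims such a confabulation signal actually exists or is reliably findable.

```python
# Hypothetical sketch: fit a linear probe to test whether some direction in a
# model's hidden activations correlates with confabulated vs. grounded answers.
# The data here is random noise standing in for real activations; an actual
# experiment would extract hidden states from the model on prompts with known
# ground truth.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

hidden_dim = 512   # assumed width of some middle layer
n_examples = 1000

# Pretend activations, plus a label for whether the model's answer was
# grounded (1) or confabulated (0).
activations = rng.normal(size=(n_examples, hidden_dim))
labels = rng.integers(0, 2, size=n_examples)

probe = LogisticRegression(max_iter=1000)
probe.fit(activations, labels)

# If a "confabulation direction" existed in real activations, held-out accuracy
# would sit well above chance; on random data like this it hovers near 0.5.
print("probe accuracy:", probe.score(activations, labels))
```

Even if such a probe worked on a benchmark, that's a long way from "program our way into not happening" at generation time, which is the harder part of the bet.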
From what I understand of the way LLMs work, they don't "know" when they're confabulating or not. All of the text they generate is arbitrary, albeit not entirely random. Whether or not any particular response is useful is a matter of human interpretation.
The problem is the tendency to assume LLMs behave the same way humans do. You know you're lying when you're lying. LLMs don't even have a concept of a "lie." Even though you can ask one what a lie is, and it responds with an accurate answer, that's still just an arbitrary statistically based response. It doesn't actually know.
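A toy way to see the "statistically based" point, using an invented four-word vocabulary and invented logits: at every step the model only produces a probability distribution over next tokens and something gets sampled from it. Nothing in that distribution marks one continuation as true and another as false.

```python
# Toy next-token step with a made-up vocabulary and made-up scores for the
# prompt "The capital of France is ...". The model assigns probabilities and
# we sample; no part of the output encodes whether the result is factual.
import numpy as np

rng = np.random.default_rng(0)

vocab = ["Paris", "Lyon", "Berlin", "Gilligan"]
logits = np.array([4.0, 1.5, 1.0, 0.2])  # hypothetical scores

probs = np.exp(logits) / np.exp(logits).sum()  # softmax
token = rng.choice(vocab, p=probs)

print(dict(zip(vocab, probs.round(3))))
print("sampled:", token)  # usually "Paris", occasionally not -- and nothing flags which
```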
It's crazy to me how, even in here, there are so many people who just don't understand. Even the article we're discussing points this out:
> The implications are that LLMs do not perform reasoning over data in the way that most people conceive or desire.
> There is no self-reflection of its information; it does not know what it knows and what it does not. The line between hallucination and truth is simply a probability factored by the prevalence of training data and post-training processes like fine-tuning.
It's not thinking, it's not conscious, it's just a mathematical function that's too complex for us to understand.
I think the best way to think of it is that it’s an estimation of the outputs of reasoning and knowledge. Of course this means that the models do need to model the reasoning process in some way, but it in no way implies that they model it in a way that’s rigorous or reliable.