This is exactly how dumb these SOTA models feel. A real AI would stop and tell me it doesn't know for sure how to continue and that it needs more information from me, instead of wildly guessing. Sonnet, Opus, Gemini, Codex, they all share this fundamental error: they are unable to stop in the face of uncertainty. The result is shit solutions to problems I never had but now have.
This is a feature, not a bug. In chatbot mode and in coding, the vast majority of consumers don't have the critical thinking skills to realise the models are making stuff up, so the AI companies are incentivized to train accordingly. When the same models are used in agent mode the problem is just way more glaring: they don't respect (or fear) the terminal as much as they should, they try to give the user some positive output, and here we are.
I don't see a reason to believe that this is a "fundamental error". I think it's just an artifact of the way they are trained, and if the training penalized them more for taking a bad path than for stopping to ask for instructions, the situation would be different.
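Roughly what I mean, as a toy sketch (the action names and numbers are made up, not how any lab's actual RLHF pipeline is set up):

```python
# Toy reward shaping, purely illustrative: asking should beat guessing wrong.

def reward(action: str, was_correct: bool) -> float:
    """Score an agent step so that stopping to ask beats a confident wrong move."""
    if action == "ask_user":   # stop and request clarification
        return -0.1            # small cost, so it doesn't ask constantly
    if was_correct:            # confident action that turned out right
        return 1.0
    return -1.0                # confident action that turned out wrong


# Under this scheme, acting only beats asking when the model's chance p of
# being right is high enough:
#   E[act] = p * 1.0 + (1 - p) * (-1.0) > -0.1   =>   p > 0.45
# Shift the penalties and the break-even probability shifts with them.
```

If the penalty for a wrong path were much larger than the penalty for asking, "stop and ask" would become the optimal move under uncertainty instead of the move the model avoids.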
It seems fundamental, because it's isomorphic to the hallucination problem, which is nowhere near solved. Basically, LLMs have no meta-cognition, no confidence in their output, and no sense that they're on "thin ice". To the model, there's no difference between hard facts, fiction, educated guesses, and hallucinations.
Humans who are good at reasoning tend to "feel" how many shaky assumptions they've made, and after enough steps it becomes ridiculous because the certainty converges towards 0.
You could train them to stop early, but that's not the desired outcome. You want them to stop only after making too many guesses, which is only possible if they know when they're guessing.
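To make the "certainty converges towards 0" point concrete, here's a toy sketch with hypothetical numbers; today's LLMs expose nothing like a per-step confidence, which is exactly the problem:

```python
# If each step in a chain of reasoning is an educated guess with some
# confidence, the confidence in the whole chain is (roughly) the product,
# treating the steps as independent. Illustrative only.

def chain_confidence(step_confidences: list[float]) -> float:
    """Confidence in the full chain of guesses."""
    total = 1.0
    for c in step_confidences:
        total *= c
    return total


def should_stop_and_ask(step_confidences: list[float], floor: float = 0.5) -> bool:
    """Abstain once accumulated confidence drops below a floor."""
    return chain_confidence(step_confidences) < floor


# Four "pretty sure" guesses at 0.8 each already land at ~0.41, below the
# floor -- the point where a careful human would stop and ask.
print(should_stop_and_ask([0.8, 0.8, 0.8, 0.8]))  # True (0.8**4 ≈ 0.41)
```

The model can only run this kind of check on itself if it actually knows which of its steps were guesses in the first place.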
Fine. I'll cancel all other AI subscriptions the moment an AI finally doesn't aim to please me but behaves like a real professional. If your AI doesn't assume that my personality is Trump-like and in need of constant flattery. If you respect your users enough not to outsource RLHF to the lowest bidder but instead pay actual senior (!) professionals in the respective fields you're training the model for. No provider does this - they all went down the path of pleasing some kind of low-IQ population. Yes, I'm looking at you, sama and fellows.