If the hypothesis is not printed out in the context, then it cannot hold it past that turn. You could prompt it to generate said hypothesis first (or set of hypotheses), and only then act on them. And then things might work.
Definitely not exactly a human. OTOH Low hanging fruit is low.
The effect is not quite what you think it is, and people don't quite take the right lessons.
Similar to the eliza effect, people still take the original reading of Clever Hans: "he couldn't really do maths, he's just taking social cues from his handler"
But what's the actual difference between Eliza, Clever Hans and RLHF? They're doing the similar things, right?
Now look at how we valued that in the 20th vs 21st century:
How much does an ALU even cost anymore? even a really good one? (it's almost never separate anymore, usually on the same silicon as the rest of the cpu/microcontroller)
Meanwhile... what's the TCO to deploy a sentiment classifier? Especially a really good one?
If "randomly sampling from a trained distribution" can't produce useful, meaningful output, then deterministic computation is even more suspect. After all, it's a strict subset. You're sampling with temperature zero from a handcrafted distribution.
(this post directionality ok, but there's many a devil in the details)
Opus is also not the worst at hacking things either. Sometimes it hacks things 'by accident' you see. If Mythos is better at it, then at some point, yeah, I can see how that might start to become a problem. Especially running unsupervised.
Definitely not exactly a human. OTOH Low hanging fruit is low.