> *yeah man, the antrophomorphization is bad.* Unfortunately, it's also the leas...

tsimionescu · 2025-06-14T10:23:38 1749896618

Anthropomorphization also leads to very wrong conclusions though. In particular, we have a theory of mind that we apply in relation to other humans, mostly on the basis of "they haven't lied to me so far, so it's unlikely they will suddenly start lying to me now". But this is dead wrong in relation to output from an LLM - just because it generated a hundred correct answers doesn't tell you anything about how likely the 101st one is to be be correct and not a fabrication. Trust is always misplaced if put into results returned by an LLM, fundamentally.

TeMPOraL · 2025-06-14T12:46:37 1749905197

> But this is dead wrong in relation to output from an LLM - just because it generated a hundred correct answers doesn't tell you anything about how likely the 101st one is to be be correct and not a fabrication.

Neither it is for humans, the whole "they haven't lied to me so far, so it's unlikely they will suddenly start lying to me now" thing is strongly conditioned on "low frequency" assumptions from shared biology, history and culture, as well as "high frequency" assumptions about the person, e.g. that they're actually friendly, or the context didn't change, that your goals are still aligned[0]... or that they aren't drunk, confused, or a kid.

Correct your theory of mind for the nature of LLMs, and your conclusions will be less wrong. LLMs are literally trained to approximate humans, so it makes sense this approach gives a decent high-level approximation.

(What LLMs don't have is the "sticky" part of social sphere; humans, generally, behave as if they expect to interact again in the future. Individual or public opinion doesn't weigh on them much. And then there's the other bit, in that we successfully eliminate variance in humans - the kind of people who cannot be theory-of-minded well by others, get killed or locked up in prisons and mental institutions, or on the less severe end, get barred from high-risk jobs and activities.)

To be clear: I'm not claiming you should go deep into the theory-of-mind thing, assuming intentions and motivations and internal emotional states in LLMs. But that's not necessary either. I say that the right amount of anthropomorphism just cognitive and focused on immediate term. "What would a human do in an equivalent situation?" gets you 90% there.

--

BTW, I looked up the theory of mind on Wikipedia to refresh some of the details, and found this by the end of the "Definition" section[1]:

> An alternative account of theory of mind is given in operant psychology and provides empirical evidence for a functional account of both perspective-taking and empathy. The most developed operant approach is founded on research on derived relational responding[jargon] and is subsumed within relational frame theory. Derived relational responding relies on the ability to identify derived relations, or relationships between stimuli that are not directly learned or reinforced; for example, if "snake" is related to "danger" and "danger" is related to "fear", people may know to fear snakes even without learning an explicit connection between snakes and fear.[20] According to this view, empathy and perspective-taking comprise a complex set of derived relational abilities based on learning to discriminate and respond verbally to ever more complex relations between self, others, place, and time, and through established relations

I then followed through to relational frame theory[2], and was surprised to find basically a paraphrase of how vector embeddings work in language models. Curious.

--

[0] - Yes, I used that word. Also yes, in weak or transactional relationships, this may change without you realizing it, and the other person may suddenly start lying to you with no advance warning. I'm pretty sure everyone experienced this at some point.

[1] - https://en.wikipedia.org/wiki/Theory_of_mind#Definition

[2] - https://en.wikipedia.org/wiki/Relational_frame_theory

tough · 2025-06-14T19:31:11 1749929471

There's also a big difference between a local llm which model's weights you fully control and have acess to, and thus can be sure it's not being changed under your nose.

Trusting OpenAI to not put sychophant 4o on, or not do funny things on their opaque CoT is a whole another thing.

The problem is 1% of us are programmers and can reason about these things as the technology they are, for the 99% of the rest that don't understand it and just -use it- it feels like magic, and magic is both good/bad from this POV

me thinks anyways