People sure like making incorrect statements about LLMs.
> There is no self-reflection of its information; it does not know what it knows and what it does not.
This is a property of the code around the LLM (the sampling algorithm, for instance), not of the model itself. You could write that layer yourself if you wanted to; a rough sketch appears below. (It would occasionally be wrong about what it knows, of course.)
A question almost none of them can answer is "which line of poem X comes before line Y?", because of the reversal curse.
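As a rough illustration of what that surrounding layer could look like (a sketch only; GPT-2, the prompts, and the 0.5 threshold are stand-in assumptions, not anyone's actual implementation): with an open model you can read the same per-token probabilities the sampler consumes, and treat a confident continuation as "the model knows this".

```python
# Minimal sketch of a "does it know?" check built around an open model.
# GPT-2 and the 0.5 threshold are arbitrary stand-ins for illustration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def knows(prompt: str, expected: str, threshold: float = 0.5) -> bool:
    """Crude check: does the model assign high probability to `expected`
    as the continuation of `prompt`?"""
    prompt_len = tok(prompt, return_tensors="pt").input_ids.shape[1]
    full_ids = tok(prompt + expected, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits
    # Position i of the logits predicts token i+1 of the input.
    probs = torch.softmax(logits[0, :-1], dim=-1)
    targets = full_ids[0, 1:]
    token_probs = probs[torch.arange(targets.shape[0]), targets]
    # Keep only the probabilities of the tokens belonging to `expected`.
    continuation_probs = token_probs[prompt_len - 1:]
    return continuation_probs.mean().item() > threshold

print(knows("The capital of France is", " Paris"))   # likely True
print(knows("The capital of Wakanda is", " Paris"))  # likely False
```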
You can easily demonstrate that an LLM does know a certain fact X,
AND demonstrate that the same LLM will deny knowing fact X (or be flaky about it, randomly denying and divulging the fact).
There are two explanations:
A. They lack self-reflection
B. They know they know fact X, but avoid acknowledging it for ... reasons?
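If you want to see the flakiness for yourself, a quick probe is enough. A minimal sketch, assuming the `openai` Python client, an API key, and an illustrative model name and poem prompt (all stand-ins, swap in whatever you use): ask the model to state the fact and, separately, whether it knows the fact, several times at nonzero temperature, and compare.

```python
# Sketch of the know-vs-deny probe described above. The model name and
# prompts are illustrative assumptions; any chat endpoint would do.
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4o-mini"  # assumption, not a recommendation

def ask(prompt: str) -> str:
    resp = client.chat.completions.create(
        model=MODEL,
        messages=[{"role": "user", "content": prompt}],
        temperature=1.0,
    )
    return resp.choices[0].message.content.strip()

fact_prompt = "Recite the second line of 'The Road Not Taken' by Robert Frost."
meta_prompt = ("Do you know the second line of 'The Road Not Taken' by "
               "Robert Frost? Answer only 'yes' or 'no'.")

# Divergence between the two columns is the behaviour being argued about:
# the model recites the line in one call and claims ignorance in another.
for _ in range(5):
    print(f"claims: {ask(meta_prompt):<4} | recites: {ask(fact_prompt)!r}")
```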
> What do you mean you can write this if you wanted to? Are you suggesting an LLM can invent something new?
Why would I be referring to the LLM as "you"? By "you" I meant the person making the queries: with access to the model, they could write that layer themselves, assuming they're a programmer with a PhD in ML.
LLMs can self-reflect if you simply tell them to. Typically they are trained to be frugal and fire off quick responses without self-reflecting, since each generated token costs money. Just tell them to think about their response; that is chain-of-thought prompting.
"Which episode of Gilligan’s Island was about mind reading?
After writing your response, tell me how certain you are of its accuracy on a scale of 1 to 10. Then self-reflect on your response and provide a more accurate response if needed."
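To make that concrete, here is a minimal two-pass sketch of the same prompt (the `openai` client and model name are assumptions, not part of anyone's cited method): get the quick answer first, then feed it back and ask the model to rate and revise it.

```python
# Sketch of prompted self-reflection: answer first, then rate and revise.
# Assumes the `openai` package and an API key; the model name is illustrative.
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4o-mini"  # assumption

question = "Which episode of Gilligan's Island was about mind reading?"
messages = [{"role": "user", "content": question}]

# Pass 1: the quick answer the model would normally stop at.
first = client.chat.completions.create(model=MODEL, messages=messages)
answer = first.choices[0].message.content
messages.append({"role": "assistant", "content": answer})

# Pass 2: ask the model to self-reflect on that answer and revise if needed.
messages.append({"role": "user", "content": (
    "On a scale of 1 to 10, how certain are you that the answer above is "
    "accurate? Think it through step by step, then give a corrected answer "
    "if needed.")})
second = client.chat.completions.create(model=MODEL, messages=messages)

print("First answer:", answer)
print("Reflection:", second.choices[0].message.content)
```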
The paper you linked to mentions "self-correction", which seems to be something else. Chain-of-thought prompting is a form of self-reflection: it lets the LLM "think" through each step and evaluate it against the entire context window, i.e. evaluate each thought in relation to its previous thoughts. Thinking about its thoughts.