People sure like making incorrect statements about LLMs.
> There is no self-reflection of its information; it does not know what it knows and what it does not.
This is a property of the code around the LLM (the sampling algorithm, for instance), not of the model itself. You could write that layer yourself if you wanted to; a rough sketch appears below. (It would occasionally be wrong about what it knows, of course.)
A question almost none of them can answer is "which line of poem X comes before line Y?", because of the reversal curse.
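As a rough illustration of what that surrounding layer could look like (a sketch only; GPT-2, the prompts, and the 0.5 threshold are stand-in assumptions, not anyone's actual implementation): with an open model you can read the same per-token probabilities the sampler consumes, and treat a confident continuation as "the model knows this".

```python
# Minimal sketch of a "does it know?" check built around an open model.
# GPT-2 and the 0.5 threshold are arbitrary stand-ins for illustration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def knows(prompt: str, expected: str, threshold: float = 0.5) -> bool:
    """Crude check: does the model assign high probability to `expected`
    as the continuation of `prompt`?"""
    prompt_len = tok(prompt, return_tensors="pt").input_ids.shape[1]
    full_ids = tok(prompt + expected, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits
    # Position i of the logits predicts token i+1 of the input.
    probs = torch.softmax(logits[0, :-1], dim=-1)
    targets = full_ids[0, 1:]
    token_probs = probs[torch.arange(targets.shape[0]), targets]
    # Keep only the probabilities of the tokens belonging to `expected`.
    continuation_probs = token_probs[prompt_len - 1:]
    return continuation_probs.mean().item() > threshold

print(knows("The capital of France is", " Paris"))   # likely True
print(knows("The capital of Wakanda is", " Paris"))  # likely False
```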
You can easily demonstrate that an LLM does know a certain fact X,
AND demonstrate that the same LLM will deny knowing fact X (or be flaky about it, randomly denying and divulging the fact).
There are two explanations:
A. They lack self-reflection
B. They know they know fact X, but avoid acknowledging it for ... reasons?
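If you want to see the flakiness for yourself, a quick probe is enough. A minimal sketch, assuming the `openai` Python client, an API key, and an illustrative model name and poem prompt (all stand-ins, swap in whatever you use): ask the model to state the fact and, separately, whether it knows the fact, several times at nonzero temperature, and compare.

```python
# Sketch of the know-vs-deny probe described above. The model name and
# prompts are illustrative assumptions; any chat endpoint would do.
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4o-mini"  # assumption, not a recommendation

def ask(prompt: str) -> str:
    resp = client.chat.completions.create(
        model=MODEL,
        messages=[{"role": "user", "content": prompt}],
        temperature=1.0,
    )
    return resp.choices[0].message.content.strip()

fact_prompt = "Recite the second line of 'The Road Not Taken' by Robert Frost."
meta_prompt = ("Do you know the second line of 'The Road Not Taken' by "
               "Robert Frost? Answer only 'yes' or 'no'.")

# Divergence between the two columns is the behaviour being argued about:
# the model recites the line in one call and claims ignorance in another.
for _ in range(5):
    print(f"claims: {ask(meta_prompt):<4} | recites: {ask(fact_prompt)!r}")
```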
> What do you mean you can write this if you wanted to? Are you suggesting an LLM can invent something new?
Why would I be referring to the LLM as "you"? By "you" I meant the person making the queries: with access to the model, they could write that layer themselves, assuming they're a programmer with a PhD in ML.
LLMs can self-reflect if you simply tell them to. Typically they are trained to be frugal and fire off quick responses without self-reflecting, since each generated token costs money. Just tell them to think about their response; that is chain-of-thought prompting.
"Which episode of Gilligan’s Island was about mind reading?
After writing your response, tell me how certain you are of its accuracy on a scale of 1 to 10. Then self-reflect on your response and provide a more accurate response if needed."
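To make that concrete, here is a minimal two-pass sketch of the same prompt (the `openai` client and model name are assumptions, not part of anyone's cited method): get the quick answer first, then feed it back and ask the model to rate and revise it.

```python
# Sketch of prompted self-reflection: answer first, then rate and revise.
# Assumes the `openai` package and an API key; the model name is illustrative.
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4o-mini"  # assumption

question = "Which episode of Gilligan's Island was about mind reading?"
messages = [{"role": "user", "content": question}]

# Pass 1: the quick answer the model would normally stop at.
first = client.chat.completions.create(model=MODEL, messages=messages)
answer = first.choices[0].message.content
messages.append({"role": "assistant", "content": answer})

# Pass 2: ask the model to self-reflect on that answer and revise if needed.
messages.append({"role": "user", "content": (
    "On a scale of 1 to 10, how certain are you that the answer above is "
    "accurate? Think it through step by step, then give a corrected answer "
    "if needed.")})
second = client.chat.completions.create(model=MODEL, messages=messages)

print("First answer:", answer)
print("Reflection:", second.choices[0].message.content)
```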
The paper you linked to mentions "self-correction", which seems to be something else. Chain-of-thought prompting is a form of self-reflection: it lets the LLM "think" through each step and evaluate it against the entire context window, i.e. evaluate each thought in relation to its previous thoughts. Thinking about its thoughts.