You're missing the fact that the model can only express its capabilities through the token generation mechanism.
The annoying "humans are autocomplete" crowd really tries its best to obscure this.
Consider the following. You take choppy notes in French, writing only keywords. Then you write your essay in English, but to express those keywords you may only use phrases you have already seen. Your teacher doesn't speak French and only looks at your essay. You can therefore do more complicated things in French, since you don't lose points for writing things the teacher hasn't taught you. However, the point deduction is so ingrained in you that, even after the teacher is gone, you still choose not to say some of the things you wrote in French.