I just asked Google's Gemini the following question:
Q: How many N's are there in Normation?
A: There is one N in the word "Normation"
Note that the answer is the same when asked n's instead of N's.
And this is but one example of many simple cases demonstrating that these models are indeed not reasoning in a manner similar to humans. That said, the outputs are useful enough that I myself use Claude and GPT-4o for some work, but with full awareness that I must review the outputs wherever factual accuracy is required.
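For reference, the count itself is trivial to verify in a couple of lines of Python (capital N only versus case-insensitive):

    # Ground-truth check: count the letter n in "Normation".
    word = "Normation"
    print(word.count("N"), word.lower().count("n"))  # 1 capital N, 2 n's in total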
This is just an artefact of tokenisation though. The model is never shown the letters that make up a word unless they are spelled out explicitly; it sees tokens that stand for chunks of text, typically whole words or sub-word pieces. This is a little like saying a human isn't intelligent because they couldn't answer a question you asked at an ultrasonic frequency. If you'd like to learn more, this video is a great resource: https://youtu.be/zduSFxRajkE?si=LvpXbeSyJRFBJFuj
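To make that concrete, here is a minimal sketch of what a model actually "sees". It uses the tiktoken library and the cl100k_base encoding purely as an illustration (my choice for the example; Gemini uses its own tokeniser, but the principle is the same):

    import tiktoken

    # Tokenise the word the way a GPT-style model would receive it.
    enc = tiktoken.get_encoding("cl100k_base")
    token_ids = enc.encode("Normation")
    pieces = [enc.decode([t]) for t in token_ids]

    # The model gets a short list of opaque integer IDs, not nine letters.
    # The word is split into a few sub-word pieces (the exact split depends
    # on the tokeniser), so the characters inside each piece are never
    # directly visible to the model.
    print(token_ids)  # a handful of integer IDs
    print(pieces)     # the sub-word strings those IDs stand for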
Q: How many N's are there in Normation?
A: There are 2 N's in the word "Normation".
Q: Confidence level?
A: A more precise question! After rechecking, I can confirm that there is indeed 2 N's in the word "Normation". Confidence level: 100%
Q: How many N's are there in the string "Normation"?
A: There are 2 N's in the string "Normation".
On several regenerations, at some point it gives:
A: There are 3 N's in the word "Normation".
Q: Confidence level?
A: A slight challenge! After re-checking, I see that there are actually 2 N's in the word "Normation".