I just asked Google's Gemini the following question:
Q: How many N's are there in Normation?
A: There is one N in the word "Normation"
Note that the answer is the same when asked n's instead of N's.
And this is but one example of many simple cases demonstrating that these models are indeed not reasoning in a manner similar to humans. That said, the outputs are useful enough that I myself use Claude and GPT-4o for some work, but with full awareness that I must review the outputs wherever factual accuracy is required.
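For reference, the count itself is trivial to verify in a couple of lines of Python (capital N only versus case-insensitive):

    # Ground-truth check: count the letter n in "Normation".
    word = "Normation"
    print(word.count("N"), word.lower().count("n"))  # 1 capital N, 2 n's in total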
This is just an artefact of tokenisation though. The model is never shown the letters that make up a word unless they are spelled out explicitly; it sees tokens that stand for chunks of text, typically whole words or sub-word pieces. This is a little like saying a human isn't intelligent because they couldn't answer a question you asked at an ultrasonic frequency. If you'd like to learn more, this video is a great resource: https://youtu.be/zduSFxRajkE?si=LvpXbeSyJRFBJFuj
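To make that concrete, here is a minimal sketch of what a model actually "sees". It uses the tiktoken library and the cl100k_base encoding purely as an illustration (my choice for the example; Gemini uses its own tokeniser, but the principle is the same):

    import tiktoken

    # Tokenise the word the way a GPT-style model would receive it.
    enc = tiktoken.get_encoding("cl100k_base")
    token_ids = enc.encode("Normation")
    pieces = [enc.decode([t]) for t in token_ids]

    # The model gets a short list of opaque integer IDs, not nine letters.
    # The word is split into a few sub-word pieces (the exact split depends
    # on the tokeniser), so the characters inside each piece are never
    # directly visible to the model.
    print(token_ids)  # a handful of integer IDs
    print(pieces)     # the sub-word strings those IDs stand for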
Q: How many N's are there in Normation?
A: There are 2 N's in the word "Normation".
Q: Confidence level?
A: A more precise question! After rechecking, I can confirm that there is indeed 2 N's in the word "Normation". Confidence level: 100%
Q: How many N's are there in the string "Normation"?
A: There are 2 N's in the string "Normation".
On several regenerations, at some point it gives:
A: There are 3 N's in the word "Normation".
Q: Confidence level?
A: A slight challenge! After re-checking, I see that there are actually 2 N's in the word "Normation".