Tokenization reorganizes information but doesn't remove it. Tasks like letter counting may be easier or harder to learn under different tokenization schemes, but the main reason models are bad at it is that there's very little text about letter counting in the training set. I.e., you could train any of the ChatGPT models to count letters in words by generating a bunch of training samples explicitly for that task; it's just not worth the bother.
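
For the sake of argument, here's a minimal sketch of what generating those samples could look like, assuming a prompt/completion JSONL format (which most fine-tuning pipelines accept) and a made-up word list:

```python
import json
import random
import string

# Hypothetical word list; in practice you'd sample from a real vocabulary.
WORDS = ["strawberry", "banana", "mississippi", "bookkeeper", "committee"]

def make_sample(word: str, letter: str) -> dict:
    """Build one prompt/completion pair for letter counting."""
    count = word.count(letter)
    return {
        "prompt": f"How many times does the letter '{letter}' appear in '{word}'?",
        "completion": str(count),
    }

def generate_samples(n: int, seed: int = 0) -> list[dict]:
    """Generate n synthetic letter-counting training samples."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        word = rng.choice(WORDS)
        # Mostly pick letters that actually occur in the word, so the
        # answers aren't overwhelmingly zero.
        if rng.random() < 0.8:
            letter = rng.choice(word)
        else:
            letter = rng.choice(string.ascii_lowercase)
        samples.append(make_sample(word, letter))
    return samples

if __name__ == "__main__":
    # Write samples as JSONL, one example per line.
    with open("letter_counting.jsonl", "w") as f:
        for sample in generate_samples(10_000):
            f.write(json.dumps(sample) + "\n")
```

Scale the word list up, fine-tune on the output, and the model learns the task fine, tokenizer and all.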