
It's not; it wouldn't make sense to have 100,000 tokens just for the first 100,000 numbers. There's a playground [1] where you can see how various LLMs tokenize a string.

12345678987654321 is tokenized on various models like so:

  GPT4                123-456-789-876-543-21
  GPT3                123-45-678-98-765-43-21
  Llama-2, Mistral    1-2-3-4-5-6-7-8-9-8-7-6-5-4-3-2-1
[1] https://huggingface.co/spaces/Xenova/the-tokenizer-playgroun...
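A minimal sketch of reproducing the GPT-4 and GPT-3 rows above with the tiktoken library (cl100k_base and r50k_base are the encodings OpenAI documents for those model families; the exact splits you see depend on the encoding version):

  import tiktoken

  text = "12345678987654321"

  for name in ("cl100k_base", "r50k_base"):  # GPT-4, GPT-3 encodings
      enc = tiktoken.get_encoding(name)
      ids = enc.encode(text)
      pieces = [enc.decode([i]) for i in ids]  # decode each token id on its own
      print(f"{name}: {'-'.join(pieces)}")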


Looks like number-string parsing may be important enough to warrant look-ahead recursive sub-parsing of candidate tokenizations, then using the most "promising" one: the tokenization that yields the highest-probability tree following that number string. See the sketch below.
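A rough illustration of the idea, not anything a current model does: enumerate candidate segmentations of the digit string whose chunks are all single tokens in the vocabulary, then pick the "best" one. A real version would score each candidate by the probability the model assigns to what follows; here a trivial fewest-tokens heuristic stands in for that score.

  # Illustrative sketch only; the scoring rule is a stand-in.
  import tiktoken

  enc = tiktoken.get_encoding("cl100k_base")  # GPT-4's encoding

  def segmentations(digits, max_chunk=3):
      """Yield all ways to split `digits` into chunks of 1..max_chunk digits."""
      if not digits:
          yield []
          return
      for k in range(1, min(max_chunk, len(digits)) + 1):
          for rest in segmentations(digits[k:], max_chunk):
              yield [digits[:k]] + rest

  def is_single_token(chunk):
      return len(enc.encode(chunk)) == 1

  number = "12345678987654321"
  candidates = [seg for seg in segmentations(number)
                if all(map(is_single_token, seg))]

  # Stand-in "promise" score: prefer the segmentation with the fewest tokens.
  best = min(candidates, key=len)
  print("-".join(best))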



