
This is "reasoning model" stuff even for humans :).



There is OCR software that analyses which language is used, and then applies heuristics for the recognized language to steer the character recognition in terms of character sequence likelihoods and punctuation rules.
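A toy sketch of that idea, not how any particular OCR product does it: guess the language from character-bigram statistics, then use the same statistics to choose between candidate readings of an ambiguous word. The sample sentences, `SAMPLES`, `bigram_model`, `detect_language` and `pick_candidate` are all made up for illustration; a real engine would use far larger models and also weigh the visual evidence.

```python
import math
from collections import Counter

# Tiny illustrative training texts (assumption: real systems use large corpora).
SAMPLES = {
    "en": "the quick brown fox jumps over the lazy dog and then the dog "
          "runs with the other dogs in the morning",
    "de": "der schnelle braune fuchs springt auf den faulen hund und dann "
          "rennt der hund mit den anderen hunden",
}


def bigram_model(text):
    """Return a function scoring strings by add-one-smoothed bigram log-likelihood."""
    padded = f" {text} "
    pairs = Counter(zip(padded, padded[1:]))   # counts of (char, next_char)
    firsts = Counter(padded[:-1])              # counts of the leading char
    vocab = len(set(padded))

    def log_likelihood(s):
        s = f" {s} "
        return sum(
            math.log((pairs[(a, b)] + 1) / (firsts[a] + vocab))
            for a, b in zip(s, s[1:])
        )

    return log_likelihood


MODELS = {lang: bigram_model(text) for lang, text in SAMPLES.items()}


def detect_language(text):
    """Pick the language whose bigram model finds the text most likely."""
    return max(MODELS, key=lambda lang: MODELS[lang](text.lower()))


def pick_candidate(candidates, lang):
    """Pick the candidate reading that is most plausible in the given language.

    Candidates should have equal length; otherwise the raw log-likelihood
    favours shorter strings and would need length normalisation.
    """
    return max(candidates, key=lambda c: MODELS[lang](c.lower()))
```

With these toy samples, `pick_candidate(["the dog", "tbe dog"], "en")` prefers "the dog", because "tb" and "be" never occur in the English sample while "th" and "he" are frequent.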

I don’t think you need a reasoning model for that, just better training; although conversely a reasoning model should hopefully notice the errors — though LLM tokenization might still throw a wrench into that.



