Jokes aside, isn't it wrong that OCR software still always produces textual resu...

kzrdude · on April 30, 2024

That question seems to be the same kind as the question in the OP. Isn't something wrong when a random scribble creates a valid execution in Perl?

jonahx · on April 30, 2024

> It looks like there's not much progress done since in the field

LLMs help here. From my own experiments, ChatGPT is a pretty good "smart, context-aware" OCR agent.

Kubuxu · on April 30, 2024

Using image embeddings and evaluating an LLM with hundreds of billions of parameters for OCR is like hunting rabbits with Yamato’s 18-inch naval guns.

manquer · on April 30, 2024

Well, using a human is like bringing an interstellar railgun to hunt rabbits, so I guess it's still better?

jonahx · on April 30, 2024

Not really. Proper OCR in the broadest sense (extracting text from arbitrary PDFs that intermingle tables, images, etc., or from handwritten artistic posters) requires a full understanding of semantic intent.

You are perhaps imagining more constrained scenarios of straight lines of consistent text on a page with well-known artifacts of "noise" (smudges, print imperfections, and so on).

petters · on April 30, 2024

Yes, there has been progress. But the featured article is meant to be fun!