Because, for example, ABBYY replaces the image with a PDF that has invisible text overlays. I think it gets a bit more complicated once you account for the formatting of documents.


ABBYY has had that ability for longer, so it may be more accurate, but this is something Tesseract supports too[1]. In fact, I'd say most OCR systems support it, with varying degrees of accuracy.

[1]:https://github.com/tesseract-ocr/tesseract/wiki/FAQ#how-do-i...


The ability to create searchable PDFs is very useful and convenient. But creating searchable PDFs does not require a deep understanding of the document layout (column detection, etc.). You just place each recognized word at the coordinates of its bounding box. You can test this, for example, here: https://ocr.space - select the option to create a searchable PDF. It works even for the most complex documents.
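
To make the "place the words at their bounding boxes" step concrete, here is a minimal sketch of the coordinate mapping only. It assumes the OCR engine reports word boxes in image pixels with the origin at the top-left, while PDF user space uses points (72 per inch) with the origin at the bottom-left. The names WordBox and to_pdf_rect are made up for illustration and are not part of any OCR library; a real writer would then draw each word at that rectangle in invisible text (PDF text rendering mode 3) on top of the scan.

```python
from dataclasses import dataclass


@dataclass
class WordBox:
    """One recognized word and its box in image pixels (hypothetical type)."""
    text: str
    left: int    # pixels from the left edge of the scan
    top: int     # pixels from the top edge of the scan
    width: int
    height: int


def to_pdf_rect(box, image_height_px, dpi):
    """Map a pixel box (origin top-left) to PDF points (origin bottom-left).

    Returns (x, y, width, height) where (x, y) is the lower-left corner.
    Assumes the page is the same size as the scanned image.
    """
    def to_points(pixels):
        return pixels * 72.0 / dpi  # 72 PDF points per inch

    x = to_points(box.left)
    # Flip the vertical axis: measure from the bottom of the page instead.
    y = to_points(image_height_px - box.top - box.height)
    return (x, y, to_points(box.width), to_points(box.height))


if __name__ == "__main__":
    word = WordBox("Invoice", left=300, top=600, width=600, height=150)
    # A US Letter page scanned at 300 DPI is 3300 pixels tall.
    print(to_pdf_rect(word, image_height_px=3300, dpi=300))
```

Note that nothing here needs to know about columns, paragraphs or reading order, which is why this works on complex layouts.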

Now, creating a Word document from a scan is a different beast because it requires layout analysis. This is where ABBYY, with its long experience, still has a good lead.


My issue was that I needed one that exposes an API for doing OCR on pictures taken from mobile devices. It was really difficult to find non-desktop packages.


Out of curiosity, what did you need beyond something like OpenCV/Metal + Tesseract?


The client needed an OCR solution for supplier invoices with a variety of layouts and a combination of printed and hand-written characters, and didn't have the budget for a bespoke solution. To be fair, it's a very hard problem. I was just surprised that, given all the much-hyped recent advancements in deep learning for computer vision, most of the solutions on the market seem to be running on decades-old technology.


Well, it is more complex than it appears.

Extracting data from documents requires a solution which uses OCR but is a different product (e.g. ABBYY FlexiCapture).

This is most commonly referred to as zonal OCR, and it comes with added functionality: handling multiple templates, defining zones/fields, specifying special rules for fields, a verification step for manual inspection (e.g. triggered when the image receives a low recognition confidence score), etc. This is different from, and more complex than, a product that does full-page OCR (e.g. ABBYY FineReader).
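
As a rough illustration of the zone and verification ideas, here is a toy sketch. It assumes an OCR pass has already produced words with a center position and a 0-100 confidence score. Zone, Word, extract_fields and the threshold of 80 are all hypothetical names and values for illustration; this is not how FlexiCapture or any other product exposes it.

```python
from dataclasses import dataclass


@dataclass
class Zone:
    """A named rectangular field on one template, in image pixels."""
    name: str
    left: int
    top: int
    right: int
    bottom: int


@dataclass
class Word:
    """A recognized word: text, center of its box, engine confidence (0-100)."""
    text: str
    x: int
    y: int
    confidence: float


def extract_fields(words, zones, min_confidence=80.0):
    """Collect the words inside each zone and flag doubtful fields.

    Returns (fields, needs_review): fields maps zone name to joined text,
    needs_review lists zones that are empty or contain a low-confidence word.
    """
    fields = {}
    needs_review = []
    for zone in zones:
        inside = [
            w for w in words
            if zone.left <= w.x <= zone.right and zone.top <= w.y <= zone.bottom
        ]
        fields[zone.name] = " ".join(w.text for w in inside)
        if not inside or min(w.confidence for w in inside) < min_confidence:
            needs_review.append(zone.name)  # route to manual verification
    return fields, needs_review


if __name__ == "__main__":
    template = [
        Zone("invoice_number", 400, 100, 700, 150),
        Zone("total", 400, 880, 700, 930),
    ]
    words = [
        Word("INV-1042", 500, 120, 96.0),
        Word("1,250.00", 520, 900, 61.0),
    ]
    print(extract_fields(words, template))
```

A real system also has to pick the right template per supplier and apply per-field rules (dates, amounts, checksums), which is where most of the extra complexity comes from.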

Handwritten OCR is a whole different story. The products that do zonal OCR will fail to recognize handwritten text, unless it's in boxes (PDF forms). I'm working on a prototype that can handle handwritten text outside of boxes too.



