chezmo's comments

Great to hear that you like Mailparser and that you have been using it for a couple of years already! I'm the founder of Mailparser and reading your comment made my day! :-)

I thought I should mention that we also launched a sister product called https://docparser.com two years ago. Docparser is basically like Mailparser, but for documents (PDFs or scanned documents).


I really like the idea and I think there might be a lot of demand for this kind of stuff. Did you already do consulting work for customers with this kind of problem? That would be a great way to kick-start the business.


Love the idea! Curious how this will evolve once people start adding more commands.


It's a fun, open experiment. We sort commands by how many times each one has been called.
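
A minimal sketch of that kind of ranking, assuming the calls are available as a plain list of command names (the `rank_commands` helper and the sample log are hypothetical, not the project's actual code):

```python
from collections import Counter


def rank_commands(call_log):
    """Return command names ordered from most-called to least-called.

    call_log is assumed to be an iterable of command names, one entry per call.
    Commands with equal counts keep the order in which they first appeared.
    """
    counts = Counter(call_log)
    return [name for name, _ in counts.most_common()]


# Hypothetical call log, for illustration only.
sample_log = ["help", "deploy", "help", "status", "deploy", "help"]
ranking = rank_commands(sample_log)
```

Counting on read like this is fine for small logs; a real service would more likely keep a running counter per command.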


Awesome feedback!

So far, it's the user who needs to decide which document goes to which parser. A routing engine is, however, on our list and will probably be one of the next features we add.

Regarding the stats, I'm not sure yet as we just launched. OCR was however one of the first things early users asked for.

For the 'unpaper' function we are using http://manpages.ubuntu.com/manpages/trusty/man1/unpaper.1.ht...

I would love to discuss things in more detail with you. Could you contact me at contact [at] docparser.com, please?


We do position-based text extraction. We do, however, add an 'unpaper' step which tries to correct misalignments and improve the quality of the scan.


What OCR library do you use? What languages does it support?


For scanned images we use https://github.com/tesseract-ocr/tesseract. For text-based PDFs we pull the text directly from the file, so all languages are supported.


That's right! The user defines a rectangular area and we then extract the raw text based on the position. For table extraction we use tabula.java under the hood.
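
To illustrate the idea of position-based extraction, here is a rough sketch. It assumes the page text is already available as words with bounding boxes (for example from OCR output or a PDF text layer); the `Word` type, the `extract_region` helper and the sample coordinates are all hypothetical and not Docparser's actual implementation:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    """A word with its bounding box in page coordinates (origin at top-left)."""
    text: str
    x0: float
    y0: float
    x1: float
    y1: float


def extract_region(words, left, top, right, bottom):
    """Return the text of all words whose centre lies inside the rectangle."""
    inside = [
        w for w in words
        if left <= (w.x0 + w.x1) / 2 <= right
        and top <= (w.y0 + w.y1) / 2 <= bottom
    ]
    # Approximate reading order: top to bottom, then left to right.
    inside.sort(key=lambda w: (round(w.y0), w.x0))
    return " ".join(w.text for w in inside)


# Made-up sample page, for illustration only.
words = [
    Word("Invoice", 50, 40, 110, 52),
    Word("No.", 115, 40, 135, 52),
    Word("12345", 140, 40, 185, 52),
    Word("Total", 50, 700, 90, 712),
]
invoice_field = extract_region(words, 40, 30, 200, 60)
```

Testing the word's centre rather than its full box makes the rule tolerant of small misalignments, which is presumably also why a deskewing step before extraction helps.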


Thanks for the heads up, I just fixed it! mailparser.io is my other product, which I launched a couple of years ago. Customers kept asking for document parsing capabilities, so I thought it would be a good idea to start Docparser. For the FAQ I copied some text and apparently forgot to proofread it properly :)


Hey guys! I'm the founder of mailparser.io and I'm super happy to see Michael's showcase on Medium. Please let me know if you have any questions about mailparser.io or how to pull data out of e-mails.


Curious about the security infrastructure and data retention policies, because emailed information is traveling over the wire and being processed in the cloud.


I love the fact that you don't need to add another app to your startup stack.


On paper it's cool; in real use it's even better.


Interesting to see that Meteor's DDP can also be used to let servers talk to each other. Until now I thought it was used solely for client/server communication.

