How do you reduce errors or hallucinations? I recently uploaded a very clear PDF to meta.ai and asked it a few very simple questions. It completely made up quotes, including page numbers, section numbers, etc.


I don't feed documents directly to an LLM. First, extract and process the data in a structured way that maintains the hierarchy and metadata of the content (this is important!). Then convert this into a schema that you control; the format doesn't really matter (JSON, XML, Markdown). From there, feed it to the LLM in chunks. This will get you most of the way there.
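
A rough sketch of what that could look like in Python, assuming some extractor has already produced (section path, page, text) tuples. The names here (Chunk, build_chunks, to_prompt_block) and the 1000-character limit are made up for illustration; they are not from any particular library.

```python
import json
from dataclasses import dataclass, asdict
from typing import List, Sequence, Tuple


@dataclass
class Chunk:
    """One piece of text plus the metadata needed to trace it back to the source."""
    doc: str
    section_path: List[str]  # e.g. ["2 Methods", "2.1 Setup"]
    page: int
    text: str


def build_chunks(
    doc_name: str,
    sections: Sequence[Tuple[Sequence[str], int, str]],
    max_chars: int = 1000,  # arbitrary illustrative limit
) -> List[Chunk]:
    """Split each section into fixed-size pieces, copying its hierarchy and page onto every piece."""
    chunks = []
    for path, page, text in sections:
        for start in range(0, len(text), max_chars):
            chunks.append(Chunk(doc_name, list(path), page, text[start:start + max_chars]))
    return chunks


def to_prompt_block(chunk: Chunk) -> str:
    """Serialize a chunk as JSON so the metadata travels with the text into the prompt."""
    return json.dumps(asdict(chunk), ensure_ascii=False)
```

Each chunk goes to the model as one to_prompt_block(...) string, so any page or section the model cites can later be compared with what was actually sent.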

There are different ways to validate, and that's why maintaining hierarchy and metadata is so important. If you track this information properly, you can cross-check responses across different LLMs!
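
One possible validation step, sketched in Python: check that a quote the model returns really appears on the page it cites. This assumes chunks are dicts with "page" and "text" keys; normalize and verify_citation are hypothetical helper names, and substring matching is just one simple way to do the check.

```python
import re
from typing import Dict, List


def normalize(s: str) -> str:
    """Collapse whitespace and lowercase so line breaks in the PDF don't cause false misses."""
    return re.sub(r"\s+", " ", s).strip().lower()


def verify_citation(quote: str, page: int, chunks: List[Dict]) -> bool:
    """Return True only if the quote appears in a chunk that came from the cited page."""
    q = normalize(quote)
    if not q:
        return False
    return any(c["page"] == page and q in normalize(c["text"]) for c in chunks)
```

The same check can be run on the output of each LLM, keeping only the quotes that verify. Note that this simple version will reject a genuine quote that straddles two chunks.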



