How do you reduce errors or hallucinations? I recently uploaded a very clear PDF...

cccybernetic · 2024-12-10T18:59:51

I don't feed documents directly to an LLM. First, extract and process the data in a structured way that preserves the hierarchy and metadata of the content (this is important!). Then convert it into a schema you control; the format doesn't really matter (JSON, XML, Markdown). From there, feed it to the LLM in chunks. This will get you most of the way there.
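
One way this could look, as a minimal sketch rather than a definitive pipeline: assume your PDF extractor already gives you a flat list of blocks tagged as headings or paragraphs. The block layout (`kind`, `level`, `text`, `page`), the `Chunk` fields, and the `build_chunks` helper are all hypothetical names made up for illustration, and the sample data is invented.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Chunk:
    chunk_id: str            # stable ID so an answer can be traced back
    heading_path: list[str]  # hierarchy, e.g. ["Annual Report", "Risks"]
    page: int                # metadata carried along from extraction
    text: str


def build_chunks(blocks, source):
    """Turn extracted blocks into chunks that keep their heading hierarchy.

    `blocks` is an assumed format: dicts with 'kind' ('heading' or
    'paragraph'), 'text', 'page', and 'level' for headings.
    """
    path = []    # current stack of headings
    chunks = []
    for block in blocks:
        if block["kind"] == "heading":
            # Drop headings at the same or a deeper level, then push this one.
            del path[block["level"] - 1:]
            path.append(block["text"])
        else:
            chunks.append(Chunk(
                chunk_id=f"{source}#{len(chunks)}",
                heading_path=list(path),
                page=block["page"],
                text=block["text"],
            ))
    return chunks


# Invented sample input, standing in for real extractor output.
blocks = [
    {"kind": "heading", "level": 1, "text": "Annual Report", "page": 1},
    {"kind": "paragraph", "text": "Revenue grew in Q4.", "page": 1},
    {"kind": "heading", "level": 2, "text": "Risks", "page": 2},
    {"kind": "paragraph", "text": "Supply costs may rise.", "page": 2},
]
chunks = build_chunks(blocks, "report.pdf")

# Each chunk is sent to the LLM as its own small JSON document.
print(json.dumps(asdict(chunks[1]), indent=2))
```

Each chunk carries its own heading path and page number, so the model sees where the text sits in the document and you can later map an answer back to its source.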

There are different ways to validate the output, and that's why maintaining hierarchy and metadata is so important. If you track this information properly, you can tie each response back to the chunk it came from and cross-check responses across different LLMs!
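
A rough sketch of what cross-checking could look like, under some loud assumptions: each model is wrapped in a plain function that takes a prompt and returns a string (the stubs below are fakes, not real API calls), every question is tied to a chunk ID carried in your metadata, and "agreement" means exact match after trimming and lowercasing. Real answers usually need a softer comparison; `cross_check` is a hypothetical helper, not part of any library.

```python
from collections import Counter


def cross_check(chunk_id, prompt, models):
    """Ask several models the same question about one chunk and compare.

    `models` maps a name to a callable(prompt) -> str. These callables are
    assumed wrappers around whatever LLM clients you actually use.
    """
    # Normalize lightly so trivial formatting differences don't count.
    answers = {name: ask(prompt).strip().lower() for name, ask in models.items()}
    top_answer, votes = Counter(answers.values()).most_common(1)[0]
    return {
        "chunk_id": chunk_id,  # lets you go back to the exact source text
        "answers": answers,
        # Accept an answer only if a strict majority of models gave it.
        "consensus": top_answer if votes > len(models) / 2 else None,
        # Any disagreement at all is flagged for a human to look at.
        "needs_review": votes < len(models),
    }


# Stub models with invented outputs, standing in for real LLM calls.
models = {
    "model_a": lambda prompt: "42 units",
    "model_b": lambda prompt: "42 Units ",
    "model_c": lambda prompt: "24 units",
}
result = cross_check("report.pdf#1", "How many units were shipped?", models)
print(result["consensus"], result["needs_review"])
```

Because the result keeps the chunk ID, a flagged disagreement points you straight at the passage to re-read instead of the whole document.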