
What differentiates this from Open WebUI? How did you design the RAG pipeline?

I had a project in the past with hundreds of PDF / HTML files of industry safety and fatality reports, which I was hoping to simply "throw in" and use with Open WebUI, but I found it wasn't effective at this even in RAG mode. I wanted to ask it questions like "How many fatalities occurred in 2020 that involved heavy machinery?", but it wasn't able to answer broad aggregate questions like that.




I think this is a fundamental issue with naive RAG implementations: they aren't accurate enough for pretty much anything.


Ultimately, the quality of OCR on PDFs is where we are bottlenecked as an industry. The problem isn't just recognizing text characters; it's also understanding structured relationships, like those in tables and graphs, and feeding them to the LLM. That's intuitive for a human but very error-prone for RAG.


That's a real issue, but it's masking some of the problems further downstream, like chunking and other context-related issues. There are some clever proposals to make this work, including some of the stuff from Anthropic and Jina. But as far as I can tell, these haven't been tested thoroughly because everyone is stuck at the OCR step (as you identified).
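To make the chunking problem concrete, here is a minimal sketch. It is not the Anthropic or Jina approach, just an illustration of how fixed-size chunking can separate a fact (the year) from the sentence it applies to, plus a deliberately naive workaround that prepends a document header to every chunk. The report text, the chunk size, and the helper names `chunk_fixed` and `chunk_with_header` are all made up for the example.

```python
def chunk_fixed(text, size):
    """Split text into consecutive fixed-size character chunks."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def chunk_with_header(text, size, header):
    """Naive mitigation: repeat a document-level header in every chunk."""
    return [f"{header}\n{chunk}" for chunk in chunk_fixed(text, size)]


# Hypothetical report: the year appears once, at the top.
report = (
    "Incident report, 2020. Site: quarry. "
    "A worker was fatally struck by an excavator during loading operations."
)

chunks = chunk_fixed(report, 40)
# Only the first chunk mentions the year; the chunk that describes
# the fatality no longer says when it happened.
chunks_missing_year = [c for c in chunks if "2020" not in c]

with_header = chunk_with_header(report, 40, "Incident report, 2020")
# Every chunk now carries the year, at the cost of some duplication.
```

A real pipeline would chunk on tokens or sentences rather than characters, but the same loss of context can happen at any chunk boundary.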


For my purposes, all of the data was also available in HTML format, so OCR wasn't a problem. I think the issue is that the RAG pipeline doesn't take the entire corpus into its context when generating a response. Instead, it uses an index to find one or more documents that it believes are relevant, then uses that small subset as part of the input.
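Here is a rough sketch of what I mean, assuming a toy retriever that scores documents by bag-of-words cosine similarity and keeps the top k. This is not how Open WebUI is implemented; the corpus, the `retrieve` helper, and k=2 are invented for illustration. The point is only that the model sees at most k reports, however many actually match the question.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, docs, k=2):
    """Return the k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(
        docs, key=lambda d: cosine(q, Counter(tokenize(d))), reverse=True
    )
    return ranked[:k]


# Hypothetical mini-corpus standing in for hundreds of reports.
REPORTS = [
    "2020 fatality: worker struck by heavy machinery at quarry",
    "2020 fatality: excavator rollover, heavy machinery involved",
    "2020 fatality: crushed by heavy machinery during loading",
    "2020 fatality: forklift, heavy machinery, warehouse",
    "2019 injury: fall from scaffolding",
]

query = "How many fatalities occurred in 2020 that involved heavy machinery?"

# What the LLM would receive as context: only k documents.
context = retrieve(query, REPORTS, k=2)

# What a correct aggregate answer needs: every matching report.
matching = [r for r in REPORTS if "2020" in r and "heavy machinery" in r]
```

With these made-up reports, `context` holds two documents while `matching` holds four, so any count the model produces from the context alone can't reflect the whole corpus.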

I'm not sure there's a way to get what a lot of people want RAG to be without actually training the model on all of your data, so that you can "chat with it" the way you can ask ChatGPT about random facts from almost any publicly available information. But I'm not an expert.


I've also observed this issue and I wonder where the industry is on it. There seem to be a lot of claims that a given approach will work here, but not a lot of provably working use cases.



