Could you elaborate a bit more about how that would work in practice?


Sure. Say you're running a customer service chatbot. You ask the customer what the problem is, then kick off RAG asynchronously to populate a proper context for a smart LLM. While that runs in the background, the chatbot keeps asking clarifying questions, which buys the RAG process time to fetch data and produce a quick summary. Then the chatbot gives some indication it's thinking, you run the full-context query on the smart LLM, generate a summary answer, feed it back to the chat LLM with something like "I may have found a solution to your problem," and switch over to the smart LLM's response.
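
A minimal sketch of that flow with Python's asyncio. Everything here is hypothetical: fast_llm, smart_llm, and retrieve are stubs standing in for whatever chat model, stronger model, and retrieval backend you actually use.

    import asyncio

    # Hypothetical stubs -- replace with real model / retrieval calls.
    async def fast_llm(prompt: str) -> str:
        await asyncio.sleep(0.1)   # cheap, fast chat model
        return f"[small-model reply to: {prompt[:40]}]"

    async def smart_llm(prompt: str) -> str:
        await asyncio.sleep(2.0)   # slower, higher-quality model
        return f"[big-model reply to: {prompt[:40]}]"

    async def retrieve(query: str) -> str:
        await asyncio.sleep(1.0)   # vector-store / keyword lookup
        return f"[documents matching: {query[:40]}]"

    async def retrieve_and_summarize(problem: str) -> str:
        # The background RAG step: fetch documents, then pre-digest them.
        docs = await retrieve(problem)
        return await smart_llm(f"Summarize these docs as support context:\n{docs}")

    async def handle_ticket(problem: str) -> str:
        # Kick off retrieval + summarization immediately, in the background.
        context_task = asyncio.create_task(retrieve_and_summarize(problem))

        # Meanwhile the cheap model asks clarifying questions, buying time.
        while not context_task.done():
            question = await fast_llm(f"Ask one clarifying question about:\n{problem}")
            print("bot:", question)
            # A real system would await the user's reply here and append
            # it to `problem`; this sketch just pauses until context is ready.
            await asyncio.sleep(0.5)

        print("bot: I may have found a solution to your problem...")
        context = context_task.result()
        return await smart_llm(f"Context:\n{context}\n\nIssue:\n{problem}")

    if __name__ == "__main__":
        print(asyncio.run(handle_ticket("My invoice shows a double charge")))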


I see what you're saying, but you're assuming that consumer products are always chatbots (and that a small language model can buy time interacting with the user while possibly gathering additional context). That said, I'd be interested to see such a system in practice: any examples you can point me to? My more general point wasn't chat-specific; much of the research around RAG uses LLMs to parse or route the user's query, improve retrieval, and so on, which often doesn't work well in practice.


This is where the opportunity for creativity comes in. You could allow chat-based refinement of search queries, or provide popup refinement buttons that narrow the search space, and build the search results iteratively rather than following the old "search" -> "results" paradigm.
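
As a toy sketch of that iterative loop (everything here is made up: SearchSession, the facet names, and the substring matching all stand in for a real search backend like Elasticsearch or a vector index):

    from dataclasses import dataclass, field

    @dataclass
    class SearchSession:
        query: str
        filters: dict = field(default_factory=dict)

        def refine(self, facet: str, value: str) -> None:
            # Called when the user clicks a refinement button.
            self.filters[facet] = value

        def results(self, corpus: list[dict]) -> list[dict]:
            # Re-run the search with whatever filters have accumulated.
            hits = [d for d in corpus if self.query.lower() in d["title"].lower()]
            for facet, value in self.filters.items():
                hits = [d for d in hits if d.get(facet) == value]
            return hits

    corpus = [
        {"title": "Reset your password", "product": "web"},
        {"title": "Reset your password on mobile", "product": "mobile"},
        {"title": "Password policy for admins", "product": "web"},
    ]

    session = SearchSession(query="password")
    print(len(session.results(corpus)))   # 3 hits -> show refinement buttons

    session.refine("product", "mobile")   # user taps the "mobile" button
    print(session.results(corpus))        # narrowed to 1 hit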



