
I just took the time to read through all the source code and docs. Nice ideas. I like to experiment with LLMs running on my local computer, so I will probably convert this example to use the lightweight Python library Rank-BM25 instead of Elasticsearch, plus a long-context model running on Ollama. I'd lose prompt caching, though.
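For anyone curious what that swap looks like: below is a minimal sketch of the BM25 ranking function that the Rank-BM25 library implements, written in pure Python so it is self-contained. The corpus strings and parameter values are hypothetical, and this uses naive whitespace tokenization rather than a real analyzer, so treat it as an illustration, not a drop-in replacement for Elasticsearch's retrieval.

```python
# Minimal BM25 (Okapi) scoring sketch -- the same ranking function
# the Rank-BM25 library provides, shown here as a stand-in for the
# Elasticsearch keyword-retrieval step. Corpus and parameters are
# illustrative only.
import math
from collections import Counter

def bm25_scores(query, corpus, k1=1.5, b=0.75):
    """Score each whitespace-tokenized document in `corpus` against `query`."""
    docs = [doc.lower().split() for doc in corpus]
    avgdl = sum(len(d) for d in docs) / len(docs)
    n = len(docs)
    # document frequency: how many docs contain each term
    df = Counter()
    for d in docs:
        df.update(set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1)
            score += idf * tf[term] * (k1 + 1) / (
                tf[term] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(score)
    return scores

corpus = [
    "prompt caching cuts latency for repeated context",
    "BM25 is a lexical ranking function over token frequencies",
    "Ollama runs large language models locally",
]
scores = bm25_scores("language models", corpus)
best = corpus[scores.index(max(scores))]
```

With the actual library, the equivalent is roughly `BM25Okapi(tokenized_corpus).get_top_n(tokenized_query, corpus, n=k)`; everything stays in-process, which is the appeal for local experiments.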

This example is well written, well documented, and easy to understand. Well done.



