Show HN: I Made an API for Persistent KV-Caching (Cache Augmented Generation) (gitbook.io)
3 points by ArnavAgrawal03 5 months ago | 1 comment
Hi HN!

I wanted to demonstrate an easy-to-use API for Cache-Augmented Generation (CAG). For any open-source LLM available in llama.cpp, we can store the KV-cache and model state after the model has processed a large corpus of documents, and then load that state back in every time we query the documents.

Because the corpus is processed only once rather than on every query, this drastically reduces latency as well as the compute and energy the model uses.
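
To make the save-once, load-per-query pattern concrete, here is a minimal sketch. It is not DataBridge's API and not a real llama.cpp binding: `ToyModel`, `tokenize`, the sample corpus, and the queries are all hypothetical stand-ins, and the "cost" is just a count of tokens evaluated. It only illustrates why restoring a saved state means each query pays for its own tokens and not for the corpus.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class ToyModel:
    """Hypothetical stand-in for an LLM runtime; not a real llama.cpp binding."""
    kv_cache: list = field(default_factory=list)  # one entry per processed token
    tokens_evaluated: int = 0                     # total work done, in tokens

    def eval(self, tokens):
        # A real model would run a forward pass and append keys/values here.
        for tok in tokens:
            self.kv_cache.append(("kv", tok))
            self.tokens_evaluated += 1

    def save_state(self):
        # Snapshot the cache so it can be restored without re-evaluating.
        return copy.deepcopy(self.kv_cache)

    def load_state(self, state):
        # Restoring a snapshot costs no token evaluations.
        self.kv_cache = copy.deepcopy(state)


def tokenize(text):
    # Whitespace split is a placeholder for a real tokenizer.
    return text.split()


corpus = "alpha beta gamma delta epsilon zeta eta theta"
queries = ["what is alpha", "summarize gamma and delta"]

model = ToyModel()

# Ingest once: process the corpus and keep the resulting state.
model.eval(tokenize(corpus))
corpus_state = model.save_state()
ingest_cost = model.tokens_evaluated

# Query many times: restore the saved state, then process only the query.
query_costs = []
for q in queries:
    model.load_state(corpus_state)
    before = model.tokens_evaluated
    model.eval(tokenize(q))
    query_costs.append(model.tokens_evaluated - before)

print("ingest cost (tokens):", ingest_cost)
print("per-query cost (tokens):", query_costs)
```

In a real setup the snapshot would be persisted to disk and the savings would depend on corpus size, model, and hardware; the toy counts above say nothing about actual latency.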

This demo is part of a larger system I'm building called DataBridge[0]. Its focus is implementing new and useful techniques for knowledge retrieval, so that developers can use the latest research in production.

I'd love to hear your feedback on DataBridge and the CAG feature. If there are papers or particular techniques you'd like to see implemented, I'd love to hear about them :)

[0] https://github.com/databridge-org/databridge-core/



Forgot to mention: we are completely open-source! Feel free to create issues or start a discussion on GitHub!



