
I think the innovation is using caching so as to make the cost of the approach manageable. The way they implemented it, each time you create a chunk you ask the LLM to create an atomic chunk from the whole context. You need to do this for all tens of thousands of chunks in your data, which costs a lot. By caching the documents, you can cut those costs substantially.
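
Roughly, that per-chunk loop can look like the sketch below. This is only an illustration, not the article's actual code: it assumes the Anthropic Messages API as one provider that offers prompt caching, and `document`, `chunks`, the model name, and the prompt wording are all placeholders. The point is that the large document is marked as a cacheable prefix, so it isn't re-billed at the full input rate for every one of the tens of thousands of chunk calls.

    import anthropic

    client = anthropic.Anthropic()

    def atomic_chunk(document: str, chunk: str) -> str:
        # The whole document goes in a system block flagged as cacheable, so
        # after the first call the provider reuses the processed prefix instead
        # of charging full input price for the document on every chunk.
        response = client.messages.create(
            model="claude-3-5-sonnet-20241022",  # example model name
            max_tokens=300,
            system=[
                {"type": "text",
                 "text": "Rewrite chunks so they stand alone, using the attached document for context."},
                {"type": "text",
                 "text": f"<document>\n{document}\n</document>",
                 "cache_control": {"type": "ephemeral"}},  # cache breakpoint
            ],
            messages=[{"role": "user", "content": f"Chunk:\n{chunk}"}],
        )
        return response.content[0].text

    atomic_chunks = [atomic_chunk(document, chunk) for chunk in chunks]

Cached prefixes are short-lived, so the calls for one document have to run close together for the cache to actually pay off.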


You could also just save the first atomic chunk the model outputs, store it, and re-use it each time yourself. Easier and more consistent.


I don't understand how that helps here. They're not regenerating each chunk every time; this is about caching the model's state after running a large doc through it. You can only do this kind of thing if you have access to the model itself, or if it's provided by the API you use.
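
For the "access to the model itself" case, here is a minimal sketch of what caching that state means, following the prompt-reuse pattern in recent versions of Hugging Face transformers (the model name, prompt wording, and the `document`/`chunks` variables are placeholders, and exact behavior depends on the library version): the document is pushed through the model once, its KV cache is kept, and each per-chunk call only processes the new tokens.

    import copy
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, DynamicCache

    model_name = "meta-llama/Llama-3.1-8B-Instruct"  # placeholder model
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(
        model_name, torch_dtype=torch.bfloat16, device_map="auto"
    )

    doc_prefix = f"<document>\n{document}\n</document>\n"

    # Run the large document through the model once and keep its KV cache.
    doc_inputs = tokenizer(doc_prefix, return_tensors="pt").to(model.device)
    with torch.no_grad():
        doc_cache = model(**doc_inputs, past_key_values=DynamicCache()).past_key_values

    # Each chunk reuses a copy of that cache, so only the chunk's tokens (and
    # the generated output) are processed. Assumes the prefix tokenizes the
    # same on its own as it does inside the full prompt.
    for chunk in chunks:
        full = tokenizer(
            doc_prefix + f"Rewrite this chunk so it stands alone:\n{chunk}\n",
            return_tensors="pt",
        ).to(model.device)
        out = model.generate(
            **full,
            past_key_values=copy.deepcopy(doc_cache),  # keep the shared cache intact
            max_new_tokens=256,
        )
        print(tokenizer.decode(out[0, full.input_ids.shape[1]:], skip_special_tokens=True))

The saving only applies while the cached prefix is byte-for-byte identical across calls.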


To be fair, that only works if you keep chunk windows static.


Yup. Caching is very nice... but the framing is weird. "Introducing", to me, connotes a product release, not a new tutorial.



