
I'm still not sold on recall at such large context window sizes. It's easy for an LLM to find a needle in a haystack, but in most RAG use-cases it's more like finding a needle in a stack of needles, and the benchmarks don't really reflect that. There are also the speed and cost implications of dumping millions of tokens into a prompt - it's prohibitively slow and expensive right now.
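
To make the distinction concrete, here's a minimal sketch of the two benchmark setups (purely illustrative - the documents, invoice numbers, and dates are all made up):

    # Classic "needle in a haystack" vs. "needle in a stack of needles".
    # In the first, distractors are unrelated filler; in the second,
    # every distractor is nearly identical to the target.
    import random

    NEEDLE = "Invoice #4821 was paid on 2024-03-17 by ACME Corp."

    def haystack_test(n_filler=10_000):
        # Unrelated filler: easy to tell apart from the needle.
        docs = [f"Meeting notes, item {i}: discussed roadmap." for i in range(n_filler)]
        docs.insert(random.randrange(len(docs) + 1), NEEDLE)
        return docs

    def stack_of_needles_test(n_distractors=10_000):
        # Distractors differ from the needle only in small details
        # (invoice number, date), so surface matching isn't enough.
        docs = [
            f"Invoice #{random.randint(1000, 9999)} was paid on "
            f"2024-03-{random.randint(1, 28):02d} by ACME Corp."
            for _ in range(n_distractors)
        ]
        docs.insert(random.randrange(len(docs) + 1), NEEDLE)
        return docs

In the second setup, neither an embedding-based retriever nor a long-context prompt can rely on the needle being the only invoice-shaped document in the pile, which is much closer to what real corpora look like than the published needle-in-a-haystack numbers.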


