
Faiss has long discussed strategies for scaling to 1B - 1T records here - https://github.com/facebookresearch/faiss/wiki/Indexing-1G-v...
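The core idea behind the strategies in that wiki page is inverted-file (IVF) partitioning: cluster the database into coarse cells, then scan only a few cells per query instead of the whole collection. A minimal NumPy sketch of that idea (not Faiss itself; cell count, `nprobe`, and the random-sample "centroids" are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
d, nb, nlist = 32, 10_000, 64  # dims, database size, number of coarse cells

xb = rng.standard_normal((nb, d)).astype("float32")

# 1. Coarse quantizer: in Faiss these are k-means centroids; a random
#    sample stands in for them here to keep the sketch short.
centroids = xb[rng.choice(nb, nlist, replace=False)]

# 2. Inverted lists: assign each vector to its nearest centroid.
assign = np.argmin(((xb[:, None, :] - centroids[None]) ** 2).sum(-1), axis=1)
invlists = {c: np.where(assign == c)[0] for c in range(nlist)}

def search(q, k=5, nprobe=8):
    """Scan only the nprobe closest cells instead of the full database."""
    cell_order = np.argsort(((centroids - q) ** 2).sum(-1))[:nprobe]
    cand = np.concatenate([invlists[c] for c in cell_order])
    dists = ((xb[cand] - q) ** 2).sum(-1)
    top = np.argsort(dists)[:k]
    return cand[top], dists[top]

ids, dists = search(xb[0])  # querying a database vector finds itself first
```

Raising `nprobe` scans more cells, which trades queries-per-second for recall; that knob is exactly where the QPS-vs-recall tension in the benchmarks comes from.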

There are plenty of options for running your own local vector database; txtai is one of them. It ultimately depends on whether you have a sizable development team. But saying it is impossible goes too far.




Even in that article, with much smaller vectors than what GPT embedding models put out (1536 dimensions), QPS drops below 100 once recall@1 exceeds 0.4. And that says nothing of the cost of rebuilding the index through incremental updates. I don't get why people on HN are so adamant that no one ever needs to scale beyond one machine.
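For anyone unfamiliar with the metric: recall@1 is just the fraction of queries whose approximate top-1 result matches the exact nearest neighbor. A tiny sketch (the ID lists are made-up toy data):

```python
import numpy as np

def recall_at_1(approx_ids, exact_ids):
    """Fraction of queries where the ANN top-1 hit equals the exact top-1."""
    approx_ids = np.asarray(approx_ids)
    exact_ids = np.asarray(exact_ids)
    return float((approx_ids == exact_ids).mean())

# Toy check: 3 of 4 queries returned the true nearest neighbor.
r = recall_at_1([7, 2, 9, 4], [7, 2, 9, 5])  # -> 0.75
```

A recall@1 of 0.4 means the index returns the true nearest neighbor for fewer than half the queries, which is the context for the QPS numbers above.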


The comment said that running an instance with 1B+ vectors yourself is impossible. Clearly that's not the case.



