Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Benchmarking for this project is a bit weird, since 1) only linear scans are supported, and 2) it's an "embeddable" vector search tool, so it doesn't make a lot of sense to benchmark against "server" vector databases like qdrant or pinecone.

That being said, ~generally~ I'd say it's faster than using numpy and tools like txtai/chromadb. Faiss and hnswlib (bruteforce) are faster because they store everything in memory and use multiple threads. But for smaller vector indexes, I don't think you'd notice much of a difference. sqlite-vec has some support for SIMD operations, which speeds things up quite a bit, but Faiss still takes the cake.



Author of txtai here - great work with this extension.

I wouldn't consider it a "this or that" decision. While txtai does combine Faiss and SQLite, it could also utilize this extension. The same task was just done for Postgres + pgvector. txtai is not tied to any particular backend components.


Ya I worded this part awkwardly - I was hinting that querying a vector index and joining with metadata with sqlite + sqlite-vec (in a single SQL join) will probably be faster than other methods, like txtai, which do the joining phase in a higher level like Python. Which isn't a fair comparison, especially since txtai can switch to much faster vector stores, but I think is fair for most embedded use-cases.

That being said, txtai offers way more than sqlite-vec, like builtin embedding models and other nice LLM features, so it's definitely apples to oranges.


I'll keep an eye on this extension.

With this, what DuckDB just added and pgvector, we're seeing a blurring of the lines. Back in 2021, there wasn't a RDBMS that had native vector support. But native vector integration makes it possible for txtai to just run SQL-driven vector queries...exciting times.

I think systems that bet on existing databases eventually catching up (as is txtai's model) vs trying to reinvent the entire database stack will win out.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: