Hacker News | gingerwizard's comments

A few in this thread have suggested ClickHouse would do well here. We tested 1 trillion rows recently, albeit with much simpler data - https://clickhouse.com/blog/clickhouse-1-trillion-row-challe...

This is a good dataset though, and the level of detail in the post is appreciated. I'll give ClickHouse a go on the same...

Disclaimer: I work for ClickHouse


Thanks for the link on the trillion row challenge, interesting read! I'm looking at queries and indexes next, and I'm hoping to include ClickHouse in that comparison.


Could you also include VictoriaMetrics in the comparison?


Author of the post here, would love to hear ideas for optimization to improve this.


Hey, author here. Yes, it's not the most recent technique, but when coupled with rescoring as we propose, it can be a simple speed-up for linear scans. It's also a pure SQL solution and requires no indices. It benefits from being easy to update and not being memory bound. We recognise both its issues and its pros in the final section. We're definitely not claiming this is groundbreaking - more a useful technique which is easy to replicate in SQL.

Pinecone https://www.pinecone.io/learn/series/faiss/locality-sensitiv... seems to promote it as a viable approach.

We definitely don't consider this to be the final solution for ANN and hence are investing in other graph-based techniques - https://clickhouse.com/docs/en/engines/table-engines/mergetr...
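
For readers who want to see the shape of "cheap linear scan, then rescore", here is a minimal Python sketch. It is not the post's SQL and not ClickHouse code. It assumes the technique is random-hyperplane locality-sensitive hashing, which the comment above does not state explicitly, and every name in it (random_hyperplanes, lsh_code, search, the candidates parameter) is hypothetical and chosen only for illustration.

```python
import random


def random_hyperplanes(n_bits, dim, seed=0):
    """Draw n_bits random hyperplanes (assumed hashing scheme, see lead-in)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def lsh_code(vec, planes):
    """Pack one bit per hyperplane: 1 if the vector lies on its positive side."""
    code = 0
    for plane in planes:
        dot = sum(p * v for p, v in zip(plane, vec))
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code


def hamming(a, b):
    """Number of differing bits between two integer codes."""
    return bin(a ^ b).count("1")


def cosine_distance(a, b):
    """Exact cosine distance, used only on the shortlist."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    return 1.0 - dot / (norm_a * norm_b)


def search(query, vectors, codes, planes, k=3, candidates=20):
    """Two-stage search: Hamming scan over bit codes, then exact rescoring."""
    q_code = lsh_code(query, planes)
    # Stage 1: linear scan over the compact codes, no index required.
    shortlist = sorted(
        range(len(vectors)), key=lambda i: hamming(q_code, codes[i])
    )[:candidates]
    # Stage 2: rescore only the shortlist with the exact distance.
    shortlist.sort(key=lambda i: cosine_distance(query, vectors[i]))
    return shortlist[:k]
```

The codes list would be computed once per stored vector with lsh_code and kept alongside the data. Adding a row only means hashing that row, which is the "easy to update" property mentioned above. A larger candidates value trades speed for recall.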


Understood - thanks!



It's about a billion rows a day. Nice idea, we could probably add a visual for possible malicious packages.



Thanks for the reply :-) but your link is only for tracking mentions on the HN website.

I was asking about how they are able to track mentions, across the web, of companies using ClickHouse. This type of info is usually listed in the tech stack section of job descriptions (and these links tend to expire once the position is filled).


I'd use clickhouse-local, ideal for this workload without needing to load into a full ClickHouse instance.

https://clickhouse.com/blog/extracting-converting-querying-l...

