Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Big fan of HLL

Apache foundation has a fantastic DataSketches library that includes HLL and many other powerful data analytics algorithms: https://datasketches.apache.org/

Lee Rhodes has done an excellent introduction to this library - explaining some of the use cases, advantages, and things to be aware of when using these techniques: https://www.youtube.com/watch?v=nO9pauS-mGQ




On sketches, there is a genre of structure for estimating histogram-like statistics (median, 99th centile, etc) in fixed space, which i really like. Two examples:

t-digest https://github.com/tdunning/t-digest

DDSketch https://github.com/DataDog/sketches-java


In the same spirit there is also Circllhist from Circonus. https://arxiv.org/abs/2001.06561

There was a good discussion on approaches to store histogram here: https://news.ycombinator.com/item?id=31379383


t-digest is used by AWS X-ray to render their charts




Consider applying for YC's Fall 2025 batch! Applications are open till Aug 4

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: