How do you test a system like this for accuracy? Is this done by simulating mill...

andreareina · on May 28, 2017

The algorithm's accuracy is known. From the wiki[1]:

    The HyperLogLog algorithm is able to estimate 
    cardinalities of > 10^9 with a typical error rate of 2%
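For context on where a figure like 2% comes from: the usual analysis of HyperLogLog gives a relative standard error of roughly 1.04/sqrt(m), where m is the number of registers. A quick sketch of that arithmetic follows; the function name and the register counts are just illustrative choices, not anything taken from Reddit's setup.

```python
import math


def hll_standard_error(num_registers):
    """Approximate relative standard error of HyperLogLog with m registers."""
    return 1.04 / math.sqrt(num_registers)


# Illustrative register counts only; pick whatever your implementation uses.
for m in (1024, 2048, 4096, 16384):
    print(f"m={m:>6}: ~{hll_standard_error(m):.2%} standard error")
```

With around 2048 registers this lands a little above 2%, which is in line with the quoted number.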

federicoponzi · on May 28, 2017

But what about the implementation accuracy? :)

zeroxfe · on May 28, 2017

Tests against both historical and synthetic datasets.
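To make the synthetic half of that concrete, here is a rough sketch of the idea: feed a counter a stream whose true cardinality you already know, then check that the estimate falls within a few standard errors. The `HyperLogLog` class below is a toy written for this example (SHA-1 as the hash, 2^12 registers, no bias tables), and `relative_error` and the `user-N` IDs are made up; in practice you would swap in the implementation under test.

```python
import hashlib
import math


class HyperLogLog:
    """Toy HyperLogLog for illustration; not a production implementation."""

    def __init__(self, p=12):
        self.p = p                     # number of index bits
        self.m = 1 << p                # number of registers
        self.registers = [0] * self.m

    def add(self, item):
        # Take 64 bits of a SHA-1 digest as the hash value.
        digest = hashlib.sha1(item.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        idx = h >> (64 - self.p)                  # top p bits pick the register
        rest = h & ((1 << (64 - self.p)) - 1)     # remaining 64 - p bits
        # Rank = 1-based position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def count(self):
        m = self.m
        alpha = 0.7213 / (1 + 1.079 / m)          # bias constant for m >= 128
        raw = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * m and zeros:
            return m * math.log(m / zeros)        # small-range: linear counting
        return raw


def relative_error(n_unique, repeats=2, p=12):
    """Insert n_unique distinct IDs, each `repeats` times, and compare."""
    hll = HyperLogLog(p)
    for _ in range(repeats):
        for i in range(n_unique):
            hll.add(f"user-{i}")                  # synthetic, made-up IDs
    return abs(hll.count() - n_unique) / n_unique


if __name__ == "__main__":
    tolerance = 5 * 1.04 / math.sqrt(1 << 12)     # 5 standard errors
    err = relative_error(100_000)
    print(f"relative error {err:.3%}, tolerance {tolerance:.3%}")
```

The historical half is the same check run against real logs, with an exact `len(set(...))` count as ground truth.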

GhostVII · on May 28, 2017

Reddit probably has enough analytics to be able to show mathematically that it will be accurate without simulating any requests.

icelancer · on May 28, 2017

Can't you just use ApacheBench (ab) and some proxies?