This is a great post, very approachable, with excellent visualizations. We use a...

This is a great post, very approachable, with excellent visualizations.

We use a variation of this sort of thing at $WORK to solve a related problem, where you want to estimate some percentile from a running stream, with the constraints that the percentile you want to choose changes from time to time but is generally static for a trillion or more iterations (and the underlying data is quasi-stationary). If you back the process by a splay tree, you can get amortized O(1) percentile estimates (higher error bars for a given RAM consumption than a number of other techniques, but very fast).

You can also play with the replacement probability to, e.g., have a "data half-life" (denominated either in time or discrete counts) and bias the estimate toward recent events, which is more suitable for some problems.