
A similar trick is used for a different reason in statistical and probabilistic calculations -- for numerical stability rather than performance.

Suppose you have n probabilities p_1, ..., p_n. Each p_i is a real number in the interval [0, 1]. Often you want to multiply probabilities to compute prod_i p_i. Probabilities p_i can be tiny floating-point values (very close to zero), and if n, the number of factors, is large, then the product will evaluate to zero due to underflow.

Software that processes probabilities often instead stores log-probabilities. Then the log of the product prod_i p_i can be evaluated as sum_i log(p_i), which is numerically stable.

Where this gets a bit trickier is if you instead want to compute the sum of probabilities rather than their product -- e.g. to renormalise probabilities so they all sum to 1. If you have encoded the probabilities as log-probabilities, you now need to take the log-sum-exp [1][2], which will underflow or overflow if done naively, and is also relatively expensive to compute, as it requires n exponentials and 1 log to combine n log-probs. Log-sum-exp can be evaluated in a stable way by first doing a pass over the data to compute the max log-prob, then doing a second pass to exponentiate each log-prob offset by the max, so each of the terms being exponentiated is at most 0.
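For concreteness, here is a minimal Python sketch of the two-pass version (the function name and the use of plain lists are my own illustration, not code from the linked posts):

    import math

    def logsumexp(log_probs):
        # Stable log(sum_i exp(x_i)): subtract the max so every exponent is <= 0.
        m = max(log_probs)                 # first pass: find the max log-prob
        if m == float("-inf"):
            return float("-inf")           # all probabilities are zero
        # second pass: exp(x - m) is in (0, 1], so nothing can overflow
        return m + math.log(sum(math.exp(x - m) for x in log_probs))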

There's also a streaming version of log-sum-exp that permits a single pass over the data [3] -- a running max is maintained on the fly, and each time a new max is identified, the running sum is multiplied by a correction factor. I'm a bit suspicious that the branches in the streaming log-sum-exp might cause a performance impact, although each exp calculation is so expensive that perhaps it doesn't make that much of a difference.
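A sketch of the single-pass version along the lines of [3] (again illustrative, and with an extra guard for log(0) inputs that isn't part of the original description):

    import math

    def streaming_logsumexp(log_probs):
        # Single pass: keep a running max and a running sum of exp(x - running_max).
        running_max = float("-inf")
        running_sum = 0.0
        for x in log_probs:
            if x == float("-inf"):
                continue                                  # exp(-inf) adds nothing
            if x <= running_max:
                running_sum += math.exp(x - running_max)  # the common, predictable branch
            else:
                # new max found: rescale the accumulated sum by the correction factor
                running_sum = running_sum * math.exp(running_max - x) + 1.0
                running_max = x
        if running_max == float("-inf"):
            return float("-inf")                          # every input was log(0)
        return running_max + math.log(running_sum)

The "new max" branch is the one I'm unsure about performance-wise.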

[1] https://en.wikipedia.org/wiki/LogSumExp

[2] https://blog.smola.org/post/987977550/log-probabilities-semi...

[3] http://www.nowozin.net/sebastian/blog/streaming-log-sum-exp-...



> I'm a bit suspicious that the branches in the streaming log-sum-exp might cause a performance impact,

Since you're just checking whether the current sample is larger than the largest sample seen so far, you're very likely to find a "large" sample early that rarely gets updated. From then on, this branch will virtually always be false, and the branch predictor will make this fast. The exception is a distribution that drifts upward over time (increasing, but not monotonically) in an unpredictable way; in that case, the branch predictor will mispredict often and the loop will be slow.

The branch predictor (and cache, for that matter) is a sort of Schrödinger's cat that makes programs both slow and fast at the same time, but you never know which until you benchmark it.


> you're very likely to find a "large" sample early that rarely gets updated. From then on, this branch will virtually always be false

Good point, provided there's a decent number of elements being reduced.

In the application I've been focusing on, many of the batched log-sum-exp reductions are over tiny arrays containing 1 to 4 log-prob elements. There's already branching to guard against the case where all elements are log-prob -inf (i.e. probability zero). I found it helpful to also special-case single-element arrays, where the reduction is the identity operation, saving both an exp and a log. There's probably branching inside exp and log as well, so it doesn't make sense to get too myopically focused on that single aspect of performance.
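For what it's worth, a Python sketch of what those special cases can look like (the function name and structure are illustrative, not the actual code from that application):

    import math

    NEG_INF = float("-inf")

    def logsumexp_small(log_probs):
        # Tiny-array version with the two special cases described above.
        if all(x == NEG_INF for x in log_probs):
            return NEG_INF                 # all probabilities are zero
        if len(log_probs) == 1:
            return log_probs[0]            # identity: no exp or log needed
        m = max(log_probs)
        return m + math.log(sum(math.exp(x - m) for x in log_probs))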



