
You are correct to point out this error in the parent comment (by SiVal). Each ball's final position is indeed the sum of "n-1" Bernoulli RVs, and the CLT does apply to these sums.

As btilly points out elsewhere, to actually obtain the correct limit, you have to normalize the sum correctly. Because of the way the scaling is done in this graphic, increasing the number of levels in effect normalizes the sum by dividing by "n". To get the right limit, you need to divide by sqrt(n) instead.
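If it helps to see the two scalings numerically, here's a minimal sketch (Python/numpy; the fair-coin assumption and the variable names are mine, not from the demo):

  import numpy as np

  rng = np.random.default_rng(0)
  for n in (10, 100, 1000, 10000):
      s = rng.binomial(n, 0.5, size=100000)    # S_n: right-bounces after n levels
      print(n,
            np.std(s / n),                     # 1/n scaling: shrinks like 0.5/sqrt(n)
            np.std((s - n / 2) / np.sqrt(n)))  # 1/sqrt(n) scaling: stays near 0.5

The first column of spreads collapses toward zero as n grows, which is why the picture narrows to a spike at the mean; the second stays put.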

In this sense, the CLT is a high-resolution version of the SLLN ("law of averages"). If you normalize the sums by 1/n, the resulting average converges to a number, the mean.

But if instead you subtract the mean of the sum (n times the per-step mean) and normalize by 1/sqrt(n) rather than 1/n, the result converges in distribution, and the limit is a normal random variable.

By normalizing less aggressively, you get information about the fluctuations rather than pounding them down to a single number, as the SLLN does.
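A sketch of that convergence in distribution, under the same fair-coin assumption (the limit here is N(0, 1/4), since the variance of one flip is 1/4; norm.ppf is scipy's inverse normal CDF):

  import numpy as np
  from scipy.stats import norm

  rng = np.random.default_rng(1)
  n = 10000
  # centered sums, scaled by 1/sqrt(n)
  z = (rng.binomial(n, 0.5, size=200000) - n / 2) / np.sqrt(n)
  for q in (0.05, 0.25, 0.5, 0.75, 0.95):
      # empirical quantiles track the N(0, 1/4) quantiles closely
      print(q, np.quantile(z, q), norm.ppf(q, scale=0.5))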

I used to TA a probability class for undergrads, and I found the demo to be perfectly reasonable.



If we're going to pick nits, it would illustrate the "Weak" LLN, not the Strong LLN. The Strong LLN still holds here, but you'd need to plot the sample paths to illustrate it.


I was actually trying my best to not pick nits, because HN has somewhat too much of that.

The reason I posted was to defend the OP against some assertions that I felt were picking at details of what started out as a simple and fun visualization. I wrote what I did about the LLN to supply some intuition to the person who created the visualization, in case they wanted to put these two results in perspective.


Thanks for this explanation. I understood most of it but could you explain why you should normalize using 1/sqrt(n) and why doing so makes the result converge in distribution?


For a sequence X_1, X_2, ... of independent random variables with the same variance, we have

  var( (1/sqrt(n)) * (X_1 + X_2 + ... + X_n) )
    = (1/n) * (var(X_1) + var(X_2) + ... + var(X_n))
    = (1/n) * n * var(X_1)
    = var(X_1)
This holds for any n, which means that if you normalize by 1/sqrt(n) instead of 1/n, the "randomness" never vanishes, even as n grows without bound. If you normalize by something bigger than 1/sqrt(n), the variance blows up; if you normalize by something smaller than 1/sqrt(n), the variance collapses to zero, so you get something concentrated at a single point.
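A quick numeric check of all three regimes (my own sketch; n^0.4 and n stand in for "bigger than" and "smaller than" 1/sqrt(n)):

  import numpy as np

  rng = np.random.default_rng(2)
  for n in (100, 10000, 1000000):
      s = rng.binomial(n, 0.5, size=50000) - n / 2  # centered sum of n flips
      print(n,
            np.var(s / n**0.4),  # milder than 1/sqrt(n): variance blows up
            np.var(s / n**0.5),  # exactly 1/sqrt(n): variance stays at 0.25
            np.var(s / n))       # harsher than 1/sqrt(n): variance -> 0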

The CLT tells us more than that: it tells us how the randomness is distributed when n gets very large, which is pretty remarkable when you think about it. (And it holds under much weaker conditions than the ones I used above; those assumptions are just probably the easiest to understand.)
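One way to see how remarkable that is: swap the coin flips for any other step distribution with finite variance, and the standardized sums still look normal. A hedged sketch with exponential steps (my arbitrary choice):

  import numpy as np
  from scipy.stats import kstest

  rng = np.random.default_rng(3)
  n = 2000
  # sums of n Exp(1) steps (mean 1, variance 1), centered and 1/sqrt(n)-scaled
  z = (rng.exponential(1.0, size=(5000, n)).sum(axis=1) - n) / np.sqrt(n)
  print(kstest(z, "norm"))  # limit is N(0, 1); KS should not reject normality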


Thanks a lot for the explanation.



