At what point do we conclude the lottery isn't as random as they claim if one pe...

chipdart · 2024-11-12T14:52:13 1731423133

> At what point do we conclude the lottery isn't as random as they claim if one person keeps winning.

The whole thing about survivorship bias is that you make a critical failure in analysis when confusing partial observations of post-facto results with causality.

bluGill · 2024-11-12T16:35:37 1731429337

The point of statistics (one of many) is to figure out how many observations we need. If someone wins the lottery 10 times with their system I will assume that they have a good system (if they have a lot of losses as well it means the system isn't perfect, but it still works), but if you only win once and never enter again I assume it is survivorship bias. Of course by winning the lottery I mean winning a large jackpot: most lotteries have smaller prizes that you have high odds of winning many times if you play often enough.

manwe150 · 2024-11-12T17:45:48 1731433548

Only if you only play 3 times though (in your previous example). Statistics is also about figuring out what sort of outliers must exist for a process to be fair (truly random). For something like a mega lottery with terrible odds, winning twice is already very unlikely. But for something easy like a coin flip, N unbiased trials should contain a run of about log2 N heads or wins in a row. For something unlikely like lotto, it is closer to the birthday paradox: the probability of one particular person winning twice is low, but the probability that there exists some person who won twice, purely at random, is high.
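To put rough numbers on the "one person vs. some person" distinction, here is a minimal sketch. The odds, number of draws, and number of players are made-up values for illustration, not figures from any real lottery, and it assumes every player enters every draw and that players' outcomes are independent of each other (only approximately true for a real jackpot).

```python
import math


def p_at_least_two_wins(p_win: float, draws: int) -> float:
    """Chance that one given player wins 2 or more of `draws` independent draws."""
    p_zero = (1 - p_win) ** draws
    p_one = draws * p_win * (1 - p_win) ** (draws - 1)
    return 1 - p_zero - p_one


def p_someone_wins_twice(p_win: float, draws: int, players: int) -> float:
    """Chance that at least one of `players` wins 2 or more times.

    Assumes players' outcomes are independent (a simplification).
    """
    q = p_at_least_two_wins(p_win, draws)
    # Same as 1 - (1 - q) ** players, written to avoid rounding trouble for tiny q.
    return -math.expm1(players * math.log1p(-q))


# Hypothetical numbers, chosen only for illustration.
P_WIN = 1e-6          # chance of a jackpot on a single entry
DRAWS = 1000          # draws each player enters
PLAYERS = 10_000_000  # number of regular players

print(f"a given player wins twice or more: {p_at_least_two_wins(P_WIN, DRAWS):.2e}")
print(f"some player wins twice or more:    {p_someone_wins_twice(P_WIN, DRAWS, PLAYERS):.3f}")
```

With these assumed inputs the first number is tiny and the second is close to 1, so a double winner showing up somewhere is not by itself evidence of a rigged draw.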

HPsquared · 2024-11-13T09:49:03 1731491343

There must be a name for "wrongly assuming everything is random and variables are independent". Like the opposite of the gambler's fallacy.

chipdart · 2024-11-13T07:24:22 1731482662

> The point of statistics (one of many) is to figure out how many observations we need.

No, you're missing the whole point. Think about the problem with survivorship bias. Imagine you are at an M&Ms factory. You decide you want to assess the color distribution of M&Ms by sampling the colors that come out of the production line. You somehow make the mistake of sampling the production line for the peanut core M&Ms, right out of the pipe that produces yellow M&Ms. You sample away and after hours you present your findings: 99.9% of yellow M&Ms have a peanut core. Based on your findings, you proceed to boldly claim that having a peanut core is a critical factor in producing yellow M&Ms. You even go as far as to rationalize it, and claim that yellow represents peanuts, and if anyone wants to create yellow-colored candy they need to start by adding peanut to the mix.

I then alert you to the fact that you made a critical failure in analysis when confusing partial observations of post-facto results with causality. Your answer:

> The point of statistics (one of many) is to figure out how many observations we need. If someone wins the lottery 10 times with their system I will assume that they have a good system (if they have a lot of losses as well it means the system isn't perfect, but it still works), but if you only win once and never enter again I assume it is survivorship bias.

You're sampling M&Ms out of the freakin' peanut M&M production line. If you fix your mistake, you'll get all kinds of M&Ms. You do not fix your mistake with more samples. Your mistake is that you're unwittingly filtering out an important subset of the problem domain, and then doing a faulty analysis on the subset you picked.
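A small simulation makes the "more samples won't save you" point concrete. This is only a sketch under invented assumptions: a hypothetical factory where 30% of all candies have a peanut core, six equally likely colors, and color independent of core. `PEANUT_SHARE`, `COLORS`, and the function names are made up for the example.

```python
import random

COLORS = ["red", "orange", "yellow", "green", "blue", "brown"]
PEANUT_SHARE = 0.3  # hypothetical share of peanut-core candies across the whole factory


def make_candy(rng: random.Random, peanut_line_only: bool) -> tuple[str, bool]:
    """Draw one candy as (color, has_peanut_core)."""
    # Sampling only the peanut line means every candy we ever see has a peanut core.
    has_peanut = True if peanut_line_only else rng.random() < PEANUT_SHARE
    return rng.choice(COLORS), has_peanut


def peanut_rate_among_yellow(n: int, peanut_line_only: bool, seed: int = 0) -> float:
    """Estimate P(peanut core | yellow) from n sampled candies."""
    rng = random.Random(seed)
    yellow = yellow_with_peanut = 0
    for _ in range(n):
        color, has_peanut = make_candy(rng, peanut_line_only)
        if color == "yellow":
            yellow += 1
            yellow_with_peanut += has_peanut
    return yellow_with_peanut / yellow if yellow else float("nan")


for n in (100, 10_000, 100_000):
    biased = peanut_rate_among_yellow(n, peanut_line_only=True)
    whole = peanut_rate_among_yellow(n, peanut_line_only=False)
    print(f"n={n:>7}: peanut line only -> {biased:.3f}, whole factory -> {whole:.3f}")
```

In this toy setup the peanut-line estimate stays pinned at 1.0 however large `n` gets, while the whole-factory estimate settles near the assumed 0.3. Sample size shrinks the noise, not the filter.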