The distinction is between ‘data peeking’, i.e. repeatedly checking the p-value you've obtained and stopping if it falls below 0.05, and repeating assays in the light of new information. Such new information can relate to the distribution of the values, the expected effect size, or any other parameter that you did not know at the outset of the study.
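To make the cost of ‘data peeking’ concrete, here is a minimal simulation sketch. Everything in it is an assumption chosen for illustration rather than anything from the paper: the null hypothesis is true (mean 0), the standard deviation is known to be 1 so a simple z-test applies, data arrive in batches of 10 up to a maximum of 100, and the function names are made up.

```python
import math
import random

def p_value_two_sided(sample_mean, n, sigma=1.0):
    """Two-sided z-test p-value for H0: mean = 0, with known sigma."""
    z = sample_mean * math.sqrt(n) / sigma
    return math.erfc(abs(z) / math.sqrt(2))

def run_study(rng, max_n=100, batch=10, peek=True, alpha=0.05):
    """Return True if the study declares significance (a false positive here)."""
    total = 0.0
    n = 0
    while n < max_n:
        for _ in range(batch):
            total += rng.gauss(0.0, 1.0)  # the null is true: mean 0, sd 1
        n += batch
        # Peeking: test after every batch and stop as soon as p < alpha.
        if peek and p_value_two_sided(total / n, n) < alpha:
            return True
    # Fixed design (or peeking that never crossed alpha): one test at max_n.
    return p_value_two_sided(total / n, n) < alpha

def false_positive_rate(peek, studies=5000, seed=1):
    rng = random.Random(seed)
    hits = sum(run_study(rng, peek=peek) for _ in range(studies))
    return hits / studies

fixed_rate = false_positive_rate(peek=False)
peeking_rate = false_positive_rate(peek=True)
print(f"fixed sample size: {fixed_rate:.3f}")
print(f"peek every batch:  {peeking_rate:.3f}")
```

With a single test at the planned sample size, the false positive rate should land close to the nominal 0.05. With a test after every batch it should come out noticeably higher, even though there is no real effect in either case.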

In ‘data peeking’, the flaw is that if you test often enough, you will eventually get a result that deviates far from the mean. This is a natural consequence of random variation in the data, i.e. not all results will be identical, and it happens whether or not there is a real effect. It's the equivalent of getting six heads or six tails in a row (which is very likely to happen at least once if you flip a coin 200 times), and then reporting your coin as biased.
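The coin claim can be checked exactly rather than taken on faith. The sketch below assumes a fair coin and independent flips, and tracks the probability that no run of six identical outcomes has occurred yet; the function name is made up for illustration.

```python
def prob_run_at_least(n_flips=200, run_len=6):
    """Exact probability that n_flips fair coin flips contain at least one
    run of run_len or more identical outcomes (heads or tails)."""
    # state[k] = probability that the current run has length k (1..run_len-1)
    # and no run of length run_len has occurred so far.
    state = [0.0] * run_len
    state[1] = 1.0  # after the first flip, the current run has length 1
    for _ in range(n_flips - 1):
        new = [0.0] * run_len
        for k in range(1, run_len):
            p = state[k]
            new[1] += p * 0.5          # outcome changes, run restarts at 1
            if k + 1 < run_len:
                new[k + 1] += p * 0.5  # same outcome, run grows by 1
            # otherwise the run reaches run_len and this mass leaves the
            # "no long run yet" states for good
        state = new
    return 1.0 - sum(state)

print(f"P(run of 6+ in 200 flips) = {prob_run_at_least(200, 6):.3f}")
```

For 200 flips the result should come out well above 0.9, so a streak of six is the expected outcome for a fair coin, not evidence of bias.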

By contrast, it is valid to repeat an assay because the distribution of the data is not what you expected, or because the likely difference between means is smaller than you expected.

Source: Angelika M. Stefan and Felix D. Schönbrodt, ‘Big little lies: a compendium and simulation of p-hacking strategies’

https://royalsocietypublishing.org/doi/10.1098/rsos.220346


