Sincere question from a non-scientist who struggles with the idea of how to make use of existing data without accidentally P-hacking:
Is it still P-hacking if you stumble upon a correlation in the historical record (after stumbling around for a while), call it a hypothesis, and then stick with it long enough to gather a statistically significant amount of _new_ data to support it?
More broadly, are there ways to "go on a fishing expedition" that are still scientifically valid?
As long as you get new data in a way that can falsify your hypothesis, that's fine. If you bias your data collection to favor your hypothesis, that's still cheating.
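To make that concrete, here is a minimal simulation sketch of the difference between the fishing step and the confirmation step. Everything in it is an assumption for illustration: the data is pure Gaussian noise, so no real relationship exists, and the names `pearson_r` and `simulate` are made up for this example. It fishes through many candidate variables in a "historical" sample, keeps the strongest correlation, then re-measures only that one hypothesis on freshly drawn data.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate(n_candidates=200, n_obs=30, seed=0):
    """Return (fished_r, fresh_r) for one simulated study.

    Assumption: every variable is independent noise, so any
    correlation found here is a false positive by construction.
    """
    rng = random.Random(seed)

    # "Historical record": one outcome and many unrelated candidates.
    outcome = [rng.gauss(0, 1) for _ in range(n_obs)]
    candidates = [
        [rng.gauss(0, 1) for _ in range(n_obs)] for _ in range(n_candidates)
    ]

    # Fishing expedition: keep whichever candidate correlates best.
    rs = [pearson_r(c, outcome) for c in candidates]
    best = max(range(n_candidates), key=lambda i: abs(rs[i]))
    fished_r = rs[best]

    # Confirmation: new, independently collected data for that one
    # pre-stated hypothesis. Still noise, since nothing real exists.
    new_x = [rng.gauss(0, 1) for _ in range(n_obs)]
    new_y = [rng.gauss(0, 1) for _ in range(n_obs)]
    fresh_r = pearson_r(new_x, new_y)

    return fished_r, fresh_r


if __name__ == "__main__":
    fished, fresh = simulate()
    print(f"best of 200 in historical data: r = {fished:+.2f}")
    print(f"same hypothesis on new data:    r = {fresh:+.2f}")
```

The fished correlation tends to look impressive because it is the maximum over many tries, while the fresh sample is a single unbiased test that is free to come out near zero. That freedom to fail is what makes the second step a legitimate test rather than more p-hacking.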
Inarguably. The folks arguing that it's p-hacking aren't taking the next step of treating the correlation as a hypothesis, testing it, and establishing causality.
Yeah, there are valid ways to use the data. Looking at the parent again, perhaps I was misreading it; I was responding to the idea that you could find the correlation and just go straight from there.