A model's need for data is roughly the reciprocal of its inductive bias strength. The more permissive your model is (it can learn anything and fit noise perfectly), the more data you need to tune it into a useful state. Conversely, the more restrictive your model is (e.g. y = ax + b), the less data you need (e.g. two points).
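As a toy illustration (my own sketch, not from the original text): two points pin down the two-parameter model y = ax + b exactly, while a much more flexible model fit to the same two points is under-determined and needs far more data before it behaves.

```python
import numpy as np

# Two observations are enough to recover a and b exactly.
xs = np.array([1.0, 3.0])
ys = 2.0 * xs + 0.5                  # true model: a = 2, b = 0.5
a, b = np.polyfit(xs, ys, deg=1)     # least-squares fit of a line to 2 points
print(a, b)                          # -> approximately 2.0, 0.5

# A degree-9 polynomial (10 free parameters) fit to the same 2 points
# also passes through them, but it is wildly under-determined (numpy
# emits a RankWarning); you would need at least 10 well-spread points
# just to pin it down, and its behaviour elsewhere is arbitrary.
coeffs = np.polyfit(xs, ys, deg=9)
print(np.polyval(coeffs, 10.0))      # extrapolation is unreliable
```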
People needed a lot of data to predict the motion of the planets (entire books of numeric tables), until the laws of gravity were figured out, at which point the problem was reduced to a couple of parameters. The same principle applies to modern AI: the more you restrict your inductive bias to the kinds of structures and dynamics you expect to capture in the wild, the less data you need for tuning.
So is "data a bad idea"? Only as bad as your world model is good. Perfect model of the world requires zero data, weak model of the world requires lots of data.