I used to think this, but no one I have read believes data is the problem.
Amodei explains that if data, model size, and compute all scale up linearly together, then the reaction happens.
I don't understand why data wouldn't be a problem, but it seems like if it were, we would have run into it already, and it has already been overcome with synthetic data.