The training techniques are a breakthrough no matter what data is used. It's not up for debate, it's an empirical question with a concrete answer. They can and did train orders of magnitude faster.
Not arguing with your point about training efficiency, but the degree to which R1 is a technical breakthrough changes if they were calling an outside API to get the answers, doesn't it?
It seems like the difference between someone doing a better writeup of (say) Wiles's proof vs. proving Fermat's Last Theorem independently.