As it says in the article, you are talking about a mere constant of proportionality, a single multiple. When you're dealing with an exponential growth curve, that stuff gets washed out so quickly that it doesn't end up mattering all that much.
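To put rough numbers on that: on an exponential curve, a one-time constant-factor gain is worth a fixed number of doubling periods, log2(factor), no matter how far along the curve you are. Here is a minimal sketch in Python. The function names, the 32x factor, and the 6-month doubling time are all made-up illustrations, not figures from the article.

```python
import math

def head_start_in_doublings(constant_factor: float) -> float:
    """Doubling periods that a one-time constant-factor gain is worth."""
    return math.log2(constant_factor)

def catch_up_time(constant_factor: float, doubling_time_months: float) -> float:
    """Months of exponential growth needed to absorb a constant-factor gap."""
    return head_start_in_doublings(constant_factor) * doubling_time_months

# Hypothetical inputs: a 32x one-time efficiency gain, and a quantity
# that doubles every 6 months. Neither number comes from the article.
factor = 32.0
doubling_time_months = 6.0

print(catch_up_time(factor, doubling_time_months))  # 30.0
```

Under those assumed numbers, a 32x multiple is five doublings, so it is absorbed after 30 months of growth. A bigger multiple only adds to that logarithmically.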
Keep in mind that the goal everyone is driving towards is AGI, not simply an incremental improvement over the latest model from OpenAI.
Their loss curve with the RL didn't level off much, though. It could be taken a lot further and scaled up to more parameters on the big Nvidia mega clusters out there. And the architecture is heavily tuned to Nvidia optimizations.
When was the last time the US got its lunch eaten in technology?
Sputnik might be a bit hyperbolic, but after using the model all day, and as someone who had been thinking of a pro subscription, I find it hard to grasp the ramifications.
There is just no good reference point that I can think of.
Yep, some CEO said they have 50K GPUs of the prior generation. They probably accumulated them through intermediaries that are basically helping Nvidia sell to sanctioned parties by proxy.