It would be interesting to know how much of this improvement over the last 25 years since he was a student comes from Moore's law, other hardware improvements, various new technologies not related to AI, the amount of money being thrown at the problem... and how much of it comes from advances in our understanding of AI.
In this work, we argue that algorithmic progress has an aspect that is both straightforward to measure and interesting: reductions over time in the compute needed to reach past capabilities. We show that the number of floating-point operations required to train a classifier to AlexNet-level performance on ImageNet has decreased by a factor of 44x between 2012 and 2019. This corresponds to algorithmic efficiency doubling every 16 months over a period of 7 years. By contrast, Moore's Law would only have yielded an 11x cost improvement. We observe that hardware and algorithmic efficiency gains multiply and can be on a similar scale over meaningful horizons, which suggests that a good model of AI progress should integrate measures from both.
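For what it's worth, those numbers are easy to sanity-check. A quick back-of-envelope calculation (treating Moore's law as a doubling every two years, which is my assumption here, not something the paper states in this passage):

```python
# Sanity check of the quoted figures: a 44x efficiency gain over the 7 years
# from 2012 to 2019, compared with Moore's law modelled as a doubling every 2 years.
import math

years = 7
algorithmic_gain = 44

doubling_time_months = 12 * years / math.log2(algorithmic_gain)
moores_law_gain = 2 ** (years / 2)                   # doubling every 2 years (assumption)
combined_gain = algorithmic_gain * moores_law_gain   # the two gains multiply

print(f"algorithmic doubling time: {doubling_time_months:.1f} months")  # ~15.4, i.e. ~16 months
print(f"Moore's law alone over 7 years: {moores_law_gain:.1f}x")        # ~11.3x
print(f"combined hardware x algorithmic gain: {combined_gain:.0f}x")    # ~498x
```

So the "doubling every 16 months" and "11x from Moore's Law" figures are consistent with each other, and multiplying the two gives a combined improvement of roughly 500x over the period.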
Reinforcement learning has achieved great success in many applications. However, sample efficiency remains a key challenge, with prominent methods requiring millions (or even billions) of environment steps to train. (...) This is the first time an algorithm achieves super-human performance on Atari games with such little data. EfficientZero's performance is also close to DQN's performance at 200 million frames while we consume 500 times less data. EfficientZero's low sample complexity and high performance can bring RL closer to real-world applicability.
A 500x improvement over the ~10 years since DQN works out to roughly a 2x improvement in sample efficiency every year.
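The exact rate depends on whether you count from the 2013 DQN preprint or the 2015 Nature paper, but either way it comes out close to 2x per year:

```python
# Sanity check of the EfficientZero claim: a 500x reduction in environment
# samples over roughly 9-10 years since DQN.
gain = 500
for years in (9, 10):
    per_year = gain ** (1 / years)
    print(f"{gain}x over {years} years = {per_year:.2f}x per year")
# 500x over 9 years  = 1.99x per year
# 500x over 10 years = 1.86x per year
```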
We compare the impact of hardware advancement and algorithm advancement for SAT solving over the last two decades. In particular, we compare 20-year-old SAT-solvers on new computer hardware with modern SAT-solvers on 20-year-old hardware. Our findings show that the progress on the algorithmic side has at least as much impact as the progress on the hardware side.
AI research also has tiny budgets compared to the biggest scientific projects:
Where did you get the $10M figure for GPT-3? That sounds awfully cheap considering the cost of compute alone: one estimate puts a single training run at $4.6M [0], while other sources [1] put it at $12M per run. I highly doubt that OpenAI nailed the training process on the second or even the first attempt, which is all your figure would allow under those estimates respectively.
So even the conservative estimate leaves room for barely two full training runs within $10M, and the higher estimate exceeds your figure on a single run; the true compute cost was almost certainly well above $10M.
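To make that concrete, here is the arithmetic under the two cited per-run estimates (taking the [0] and [1] figures at face value; they are third-party estimates, not disclosed costs):

```python
# How many full training runs would a $10M compute budget cover under each
# of the per-run cost estimates cited above?
budget = 10_000_000
for per_run in (4_600_000, 12_000_000):
    runs = budget / per_run
    print(f"${per_run/1e6:.1f}M per run -> {runs:.1f} runs within a $10M budget")
# $4.6M per run -> 2.2 runs
# $12.0M per run -> 0.8 runs
```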
A recent paper shows that empowering a language model to search a text corpus (or the internet) for additional information can improve model efficiency by roughly 25 times [1]. So you only need a small model, because it can consult the text for trivia instead of having to memorize it.
That's 25x in one go. Maybe we'll get the chance to run GPT-4 ourselves, without needing 20 GPUs and a $1M machine.
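For anyone unfamiliar with the idea, here is a minimal sketch of retrieval-augmented generation, assuming a toy TF-IDF index and a stub in place of the language model. This is not how the cited paper implements retrieval (real systems use learned embeddings over much larger corpora); it only illustrates the shape of the approach: fetch relevant passages, then let a small model condition on them.

```python
# Minimal retrieval-augmented generation sketch: retrieve the most relevant
# passages from a corpus, then pass them to a (small) language model as context.
# The TF-IDF retriever and the generate() stub are simplifications for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

corpus = [
    "AlexNet won the ImageNet competition in 2012.",
    "DQN reached human-level play on many Atari games.",
    "Moore's law describes transistor counts doubling roughly every two years.",
]

vectorizer = TfidfVectorizer()
doc_vectors = vectorizer.fit_transform(corpus)

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k corpus passages most similar to the query."""
    query_vec = vectorizer.transform([query])
    scores = cosine_similarity(query_vec, doc_vectors)[0]
    top = scores.argsort()[::-1][:k]
    return [corpus[i] for i in top]

def generate(prompt: str) -> str:
    """Stand-in for a small language model; a real system would call one here."""
    return f"[model output conditioned on]\n{prompt}"

question = "When did AlexNet win ImageNet?"
context = "\n".join(retrieve(question))
print(generate(f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"))
```

The point of the 25x claim is exactly this division of labor: the facts live in the corpus, so the model itself only needs to be big enough to reason over the retrieved text.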