They actually use distilled versions. The most egregious example is the misleading labeling of the DeepSeek-R1 distillations, which are built on a variety of vastly different base models of varying sizes, as if they were alternative versions of DeepSeek-R1 itself. To this day, many users who have only run these distillations maintain the mistaken impression that DeepSeek-R1 is overhyped and doesn't perform as well as claimed, because they have never used the actual 685B-parameter model.