They actually use distilled versions. The most egregious example is the misleading labeling of the DeepSeek-R1 distillations, which are built on a variety of vastly different base models of varying sizes, as if they were alternative versions of DeepSeek-R1 itself. To this day, many users who have only run these distillations maintain the mistaken impression that DeepSeek-R1 is overhyped and doesn't perform as well as claimed, because they have never used the actual 685B-parameter model.