DeepSeek is remarkable, but they explicitly say they built on top of Meta's Llama and Alibaba's Qwen. They scored the goal, but there were other players involved in getting there.
I feel like you're confusing this with their distilled models (i.e. the 1.5/7/8/32/70B ones), which are built on top of Llama & Qwen models. But those aren't really the remarkable models.
The truly remarkable model is DeepSeek-R1, and it's their own model, with a very particular DeepSeek architecture. Of course they build on the knowledge of other labs, just like other labs build on top of their own and others' knowledge. They are miles ahead of Meta in terms of base architecture at the moment, and you can watch them iterating throughout the last year to get to where they are now.