DeepSeek is remarkable, but they explicitly say they built on top of Meta's Llama and Alibaba's Qwen. They scored the goal, but there were other players involved in getting there.
I feel like you're confusing this with their distilled models (i.e. the 1.5/7/8/32/70B ones), which are built on top of Llama & Qwen models. But those aren't really the remarkable models.
The truly remarkable model is DeepSeek-R1, and it's their own model, with a very particular DeepSeek architecture. Of course they build on the knowledge of other labs, just like other labs build on top of their own and others' knowledge. They are miles ahead of Meta in terms of base architecture at the moment, and you can watch them iterating throughout the last year to get to where they are now.