
Isn't DeepSeek simple/small enough you can run it locally?


No, the full R1 model is ~650GB. There are quantized versions that bring it down to ~150GB.

What you can run locally are the distilled models, which are actually Llama and Qwen weights further trained on R1's output.
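A minimal sketch of what running one of those distills looks like, assuming the Hugging Face transformers and accelerate packages are installed (DeepSeek-R1-Distill-Qwen-7B is one of the published distilled checkpoints; swap in whichever size fits your hardware):

  from transformers import AutoModelForCausalLM, AutoTokenizer

  # One of the published R1 distills: Qwen weights fine-tuned on R1 output.
  model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B"

  tokenizer = AutoTokenizer.from_pretrained(model_id)
  model = AutoModelForCausalLM.from_pretrained(
      model_id,
      torch_dtype="auto",   # keep the checkpoint's native precision
      device_map="auto",    # spread weights across available GPU(s); needs accelerate
  )

  inputs = tokenizer("Why is the sky blue?", return_tensors="pt").to(model.device)
  out = model.generate(**inputs, max_new_tokens=256)
  print(tokenizer.decode(out[0], skip_special_tokens=True))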


At least a TB of VRAM to load it in fp16. They distilled it to smaller models, which do not perform as well but can be run on a single GPU. Full R1 is big, though.
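The back-of-envelope math, using R1's published 671B total parameter count (the ~2-bit row is a rough stand-in for the aggressive quantizations mentioned above; real KV cache and runtime overhead add more on top):

  # Weights-only memory estimate at different precisions.
  params = 671e9  # DeepSeek-R1 total parameter count

  for name, bits in [("fp16", 16), ("fp8", 8), ("~2-bit quant", 2)]:
      gb = params * bits / 8 / 1e9
      print(f"{name:>13}: ~{gb:,.0f} GB")

  #          fp16: ~1,342 GB   -> "at least a TB of VRAM"
  #           fp8: ~671 GB     -> matches the ~650GB figure above
  #  ~2-bit quant: ~168 GB     -> in the ballpark of the ~150GB quants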


fp16? I thought it was trained at fp8.



