
Isn't DeepSeek simple/small enough you can run it locally?


No, the full R1 model is ~650GB. There are quantized versions that bring it down to ~150GB.

What you can run locally are the distilled models, which are actually Llama and Qwen weights further trained on R1's output.
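A minimal sketch of what running one of those distills looks like, assuming the Hugging Face transformers and accelerate packages are installed (DeepSeek-R1-Distill-Qwen-7B is one of the published distilled checkpoints; swap in whichever size fits your hardware):

  from transformers import AutoModelForCausalLM, AutoTokenizer

  # One of the published R1 distills: Qwen weights fine-tuned on R1 output.
  model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B"

  tokenizer = AutoTokenizer.from_pretrained(model_id)
  model = AutoModelForCausalLM.from_pretrained(
      model_id,
      torch_dtype="auto",   # keep the checkpoint's native precision
      device_map="auto",    # spread weights across available GPU(s); needs accelerate
  )

  inputs = tokenizer("Why is the sky blue?", return_tensors="pt").to(model.device)
  out = model.generate(**inputs, max_new_tokens=256)
  print(tokenizer.decode(out[0], skip_special_tokens=True))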


At least a TB of VRAM to load it in fp16. They distilled it to smaller models, which do not perform as well but can be run on a single GPU. Full R1 is big, though.
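The back-of-envelope math, using R1's published 671B total parameter count (the ~2-bit row is a rough stand-in for the aggressive quantizations mentioned above; real KV cache and runtime overhead add more on top):

  # Weights-only memory estimate at different precisions.
  params = 671e9  # DeepSeek-R1 total parameter count

  for name, bits in [("fp16", 16), ("fp8", 8), ("~2-bit quant", 2)]:
      gb = params * bits / 8 / 1e9
      print(f"{name:>13}: ~{gb:,.0f} GB")

  #          fp16: ~1,342 GB   -> "at least a TB of VRAM"
  #           fp8: ~671 GB     -> matches the ~650GB figure above
  #  ~2-bit quant: ~168 GB     -> in the ballpark of the ~150GB quants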


fp16? I thought it was trained at fp8.



