
I kind of wonder why a lot of these places don't release "amateur"-sized models anymore, in the 18B-30B parameter range that you can run on a single 3090 or M2 Max at reasonable speed and within reasonable RAM. It's all 7B, 70B, and 400B sizing nowadays.


Just a few days ago, Mistral released a 12B model: https://mistral.ai/news/mistral-nemo/


Because you can just quantise a 70B model down to 3-4 bits and it'll perform better than a 30B model while taking up a similar amount of memory.
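
Rough math, as a sketch (weights only, ignoring quantization scales/zero-points and runtime overhead):

    def weight_gb(params_billions, bits_per_weight):
        # Raw weight storage: parameter count times bits per weight, in GB.
        return params_billions * 1e9 * bits_per_weight / 8 / 1e9

    print(weight_gb(70, 4))   # ~35 GB: 70B quantised to 4-bit
    print(weight_gb(30, 8))   # ~30 GB: 30B at 8-bit, a comparable footprint
    print(weight_gb(30, 16))  # ~60 GB: 30B at full fp16

So a 4-bit 70B lands in the same ballpark as an 8-bit 30B.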


A 70B model at 4-bit does not fit on a 24GB card. 30B models are the sweet spot for that size of card: roughly 20GB of weights, with 4GB left over so the rest of the system can still function.
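
A rough fit check for a 24GB card, as a sketch; the KV-cache shape numbers (layers, KV heads, head dim) are illustrative GQA-style values, not from any specific model:

    def kv_cache_gb(layers, kv_heads, head_dim, context_len, bytes_per_elem=2):
        # Keys + values: one K and one V tensor per layer, fp16 elements.
        return 2 * layers * kv_heads * head_dim * context_len * bytes_per_elem / 1e9

    weights_70b_4bit = 70 * 4 / 8           # 35 GB: over budget before any cache
    weights_30b_5bit = 30 * 5 / 8           # ~19 GB, matching the ~20GB above
    cache = kv_cache_gb(60, 8, 128, 8192)   # ~2 GB for an 8K context
    print(weights_30b_5bit + cache <= 24)   # True: fits with a little headroom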



