
I kind of wonder why a lot of these places don't release "amateur"-sized models anymore, in the 18B-30B parameter range that you can run on a single 3090 or M2 Max at reasonable speed and within reasonable RAM. It's all 7B, 70B, and 400B sizing nowadays.


Just a few days ago, Mistral released a 12B model: https://mistral.ai/news/mistral-nemo/


Because you can just quantise a 70B model down to 3-4 bits and it'll perform better than a 30B model while taking up a similar amount of memory.
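
Rough math, as a sketch (weights only, ignoring quantization scales/zero-points and runtime overhead):

    def weight_gb(params_billions, bits_per_weight):
        # Raw weight storage: parameter count times bits per weight, in GB.
        return params_billions * 1e9 * bits_per_weight / 8 / 1e9

    print(weight_gb(70, 4))   # ~35 GB: 70B quantised to 4-bit
    print(weight_gb(30, 8))   # ~30 GB: 30B at 8-bit, a comparable footprint
    print(weight_gb(30, 16))  # ~60 GB: 30B at full fp16

So a 4-bit 70B lands in the same ballpark as an 8-bit 30B.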


A 70B model at 4-bit does not fit on a 24GB card. 30B models are the sweet spot for that size of card: roughly 20GB of weights, with 4GB left over so the rest of the system can still function.
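
A rough fit check for a 24GB card, as a sketch; the KV-cache shape numbers (layers, KV heads, head dim) are illustrative GQA-style values, not from any specific model:

    def kv_cache_gb(layers, kv_heads, head_dim, context_len, bytes_per_elem=2):
        # Keys + values: one K and one V tensor per layer, fp16 elements.
        return 2 * layers * kv_heads * head_dim * context_len * bytes_per_elem / 1e9

    weights_70b_4bit = 70 * 4 / 8           # 35 GB: over budget before any cache
    weights_30b_5bit = 30 * 5 / 8           # ~19 GB, matching the ~20GB above
    cache = kv_cache_gb(60, 8, 128, 8192)   # ~2 GB for an 8K context
    print(weights_30b_5bit + cache <= 24)   # True: fits with a little headroom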



