Are you aware that cards with “LLM-level” VRAM (40-80 GB) cost substantially more, while the status quo for consumer cards hovers around 4-12 GB, topping out at 24 GB for the high end?
And this is exactly the way NVidia intends to keep it, methinks.
Give the consumers / gamers a consumer-priced GPU with a max of 16-24 GB VRAM for the high-end models. By consumer-priced, I mean $500-2000.
And make anyone interested in AI / ML / LLM / 3D / creatives pay $3000-10000 for GPUs that are similar in performance but have much higher VRAM.
Then top it off with six-figure (or higher) priced GPUs for the FAANG companies, which can afford them for their data centers and currently contribute the most revenue (and profit) to NVidia.
Your comment, pre-edit, had something of a severe tone given that consideration.
Having said that, I've trained/finetuned image models just fine on an RTX 2070 Super with 8 GB of VRAM. That was back when finetuning was more fruitful than simply waiting for a more robust base model to be trained in the first place. Given that better base models are now the status quo, I'm curious what sort of whole-network training you're doing that actually produces results noticeably better than few-shot prompting at inference time or LoRA finetuning? The latter brings you back into the realm of tuning on low-VRAM configs.
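To make the low-VRAM point concrete, here's a minimal sketch of what I mean by LoRA finetuning, using Hugging Face's peft library. The base model and hyperparameters are placeholders, not anything from this thread; the point is just that only the small adapter matrices are trainable, which is why this fits on an 8-16 GB consumer card:

```python
# LoRA sketch: base weights stay frozen, only low-rank adapters train.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = "gpt2"  # placeholder; swap in whatever base model you're adapting
model = AutoModelForCausalLM.from_pretrained(base)

lora_cfg = LoraConfig(
    r=8,                        # rank of the adapter matrices
    lora_alpha=16,              # scaling factor for the adapter output
    lora_dropout=0.05,
    target_modules=["c_attn"],  # gpt2's fused attention projection
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_cfg)
model.print_trainable_parameters()  # typically well under 1% of total params
```

From here you train with a normal loop or Trainer; optimizer state only exists for the adapter parameters, which is where most of the VRAM savings come from.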
In general, a single GPU's memory constraint is only one of many when training a model _from scratch_. In that case you're bottlenecked by data and by data parallelism: you don't need one or a few GPUs, you need more than would fit in a consumer setup in the first place.
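For anyone unfamiliar with what "data parallelism" means here, a rough sketch assuming PyTorch DDP launched with torchrun: every process/GPU holds a full replica of the model and sees a different shard of the data, and gradients are averaged across processes each step. The model and dataset below are stand-ins; scaling from-scratch training is mostly about adding more of these processes, not fitting everything on one card.

```python
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from torch.utils.data import DataLoader, TensorDataset
from torch.utils.data.distributed import DistributedSampler

def main():
    dist.init_process_group("nccl")
    rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(rank)

    model = torch.nn.Linear(512, 10).cuda(rank)   # stand-in model
    model = DDP(model, device_ids=[rank])
    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

    data = TensorDataset(torch.randn(4096, 512), torch.randint(0, 10, (4096,)))
    sampler = DistributedSampler(data)            # shards the data per rank
    loader = DataLoader(data, batch_size=64, sampler=sampler)

    for x, y in loader:
        x, y = x.cuda(rank), y.cuda(rank)
        loss = torch.nn.functional.cross_entropy(model(x), y)
        opt.zero_grad()
        loss.backward()                           # DDP all-reduces gradients here
        opt.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()  # launch with: torchrun --nproc_per_node=<num_gpus> this_script.py
```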