
Are you aware that cards with "LLM-level" (40-80 GB) amounts of VRAM cost substantially more, while the status quo for consumer cards hovers around 4-12 GB, only reaching 24 GB on top-end models?


And this is exactly the way NVidia intends to keep it, methinks.

Give the consumers / gamers a consumer-priced GPU with a max of 16-24 GB VRAM for the high-end models. By consumer-priced, I mean $500-2000.

And make anyone interested in AI / ML / LLMs / 3D / creative work pay $3000-10000 for GPUs with similar performance but much higher VRAM.

Then top it off with GPUs priced at six figures (or higher) for the FAANG companies, which can afford them for their data centers and currently contribute the most revenue (and profit) to NVidia.


Of course


Your comment, pre-edit, had something of a severe tone given that consideration.

Having said that, I've trained/finetuned image models just fine on an RTX 2070 Super with 8 GB of VRAM. That was back when doing so was more fruitful than simply training a more robust model in the first place. Given that this is the current status quo, I'm curious what sort of whole-network training you're doing that actually produces results noticeably better than doing something few-shot at inference time or doing LoRA finetuning. The latter brings you back into the realm of tuning on low-VRAM configs.
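
To make the LoRA point concrete, here's a minimal sketch using Hugging Face's peft and transformers libraries. The model name, adapter rank, and target modules are illustrative placeholders, not anything from the comments above; the idea is just that only the small low-rank adapters are trained, which is why this fits on low-VRAM cards:

    # Minimal LoRA finetuning setup sketch (peft + transformers).
    # Model name and hyperparameters are placeholders chosen to fit in ~8 GB.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import LoraConfig, get_peft_model

    model_name = "facebook/opt-1.3b"  # illustrative small model
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16)

    # LoRA freezes the base weights and trains small low-rank adapter matrices,
    # so gradients and optimizer state stay tiny compared to full finetuning.
    lora_config = LoraConfig(
        r=8,                                   # adapter rank
        lora_alpha=16,                         # scaling factor
        target_modules=["q_proj", "v_proj"],   # attention projections to adapt
        lora_dropout=0.05,
        task_type="CAUSAL_LM",
    )
    model = get_peft_model(model, lora_config)
    model.print_trainable_parameters()  # typically well under 1% of total params
    # From here a normal training loop (or transformers.Trainer) runs as usual.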

In general, a single GPU's memory is only one of many constraints when training a model _from scratch_. In that case you're bottlenecked by data and by data parallelism: you don't need one or a few GPUs, you need more than would fit in a consumer setup in the first place.
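
For the from-scratch case, the multi-GPU setup being alluded to is plain data-parallel training. A minimal PyTorch DistributedDataParallel sketch (the toy model, batch size, and step count are placeholders) looks roughly like this:

    # Minimal data-parallel training sketch with torch.distributed (DDP).
    # Launch with: torchrun --nproc_per_node=<num_gpus> train.py
    import os
    import torch
    import torch.distributed as dist
    from torch.nn.parallel import DistributedDataParallel as DDP

    def main():
        dist.init_process_group(backend="nccl")
        rank = int(os.environ["LOCAL_RANK"])
        torch.cuda.set_device(rank)

        model = torch.nn.Linear(1024, 1024).cuda(rank)  # stand-in for a real model
        model = DDP(model, device_ids=[rank])
        optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

        # Each rank sees its own shard of the data; gradients are all-reduced
        # during backward(), so the effective batch scales with the GPU count.
        for step in range(10):
            x = torch.randn(32, 1024, device=f"cuda:{rank}")
            loss = model(x).pow(2).mean()
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

        dist.destroy_process_group()

    if __name__ == "__main__":
        main()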



