> All three GeForce RTX 40 SUPER Series GPUs are faster than their predecessors
It's weird that they're only comparing the new cards to the RTX 30's and 20's, and not the "v1" 40's. I assume the 4080 SUPER is faster than the 4080 (based on name?) but it seems cheaper and there's absolutely no comparison data
The big difference with the 4070 Ti Super is that it's using the AD103 chip (with a full 256-bit memory bus and 16GB of VRAM) found in the 4080, which is a huge leap over the AD104 chip found in the 4070 Ti (non-Super), which only touts a 192-bit memory bus and 12GB of VRAM.
While the TFLOPS of the Super variant only sees a ~10% increase as you note, memory bandwidth jumps by 42% and memory capacity jumps by 33%, while the launch price is the same in my currency.
It basically bridges half the distance between a 4070 Ti (non-Super) and a 4080 (non-Super) for the same launch price as a 4070 Ti (non-Super).
Great card for memory intensive workloads like LLM inference with big context windows, IMO.
EDIT1: 4070 Ti Super TDP is 320W (same as 4080), higher than the anticipated 285W
EDIT2: launch price confirmed to be same as the 4070 Ti (non-Super), lower than anticipated!
Appreciate the extra insight here! I hope for the sake of purchasers it is only a 12% cost increase, but I have a suspicion that if there's more than 12% of extra value, we'll see it in the price
Just checked the CES announcement and updated my post to reflect that it actually has the same launch price that the 4070 Ti (non-Super) had! Amazing bargain!
Not only that, the RTX 4070 Ti Super gets nearly the same performance as the RTX 4080 non-Super for $400 less. But that's MSRP. I have a feeling this card will be selling for a lot more than that.
Cheers, the 4070 Ti certainly hits a sweet spot.
I got a 4090 a few months ago before the prices increased, and I'm beyond stoked with the performance for (typically triple-QHD simulation) gaming. It's just a beast.
I have a 2nd PC I'd like to upgrade too, though, and the 4070 Ti looks like it would be fantastic in it.
For running AI models etc. the 4070 Ti is the best value of the bunch by far. Memory size and bandwidth are the most important things, in that order (which makes the 4050, er, 4060 Ti 16GB a weak card)
Ergo, there's a decent chance it won't sell for MSRP.
It sure would be nice if Nvidia just named the new card 4075 or something. The whole 4070 vs 4070 Super vs 4070 Ti vs 4070 Ti Super naming scheme sucks.
It's not weird at all. Those cards aren't meant as the next buy for owners of non-Super 40xx cards. Cards are compared with the cards that potential buyers currently have.
Can that really be true? I figure most people just stick with whatever they have, then buy the best thing within their means when the old one gets too slow for their needs. I can't imagine upgrading from a 1070 to a 2070; in fact, right now most people I know who are considering upgrading are on the 900 series
You're not getting what I'm saying - people stay in the same tier when they upgrade. They might upgrade every generation, every other generation, or skip two generations, but the point is that people who have an xx70 (or xx80) will buy an xx70 (or xx80) from a newer generation.
The 10xx to 20xx upgrade made little sense to most gamers because RTX was a thing you turned on, looked at the pretty reflections, and turned off to regain the performance. The 10xx generation was a weird one for NVIDIA, and I doubt they will ever make such a consumer-friendly generation again.
I'm holding out for the "RTX 6090 Ti Super MAX XXXtreme 197Hz Mr. Manager"
I've a lot of AMD and Nvidia machines - two high-end gaming machines. The naming conventions of Nvidia cards are just odd to me, and I can't tell what anything actually means.
I get that it's supposed to be funny, but I wonder how many people still think bitcoin mining happens on GPUs, particularly Nvidia ones. Pretty sure that stopped being the case like 14 years ago or something? Anyway...
All tokens are effectively "slaved" to BTC due to the paper-thin real liquidity of all of them. Therefore GPUs were very much affected by BTC volatility, just by proxy, and likely still are. It should be obvious, really, to anyone.
Litecoin and Doge and Ethereum were mined on GPUs throughout that period. And in fact all the other coins are tied to bitcoin in the first place, bitcoin price runs also trigger huge mining booms in everything else, so yeah, it's kinda understandable that people tend to view them as a single linked thing, because they kinda are.
What, precisely, is the point of making "actually you mean crypto, not bitcoin" posts, other than demonstrating that you are, indeed, "very smart"? Like, this person doesn't even exist, it's just a "heh aren't those no-coiners dumb" strawperson that you imagine to be some big dummy.
But everybody who is upgrading is trying to figure out which card to upgrade to, and is doing comparisons between the current-gen cards, not between their old card and the new ones.
It is likely that there will be no choice between Super and non-Super.
At least the 4070 Ti and 4080 have become completely obsolete now that their Super variants are much better, and in the case of the 4080 Super, even cheaper too.
I suppose that they have stopped producing the non-Super variants, as nobody would want those when the Super cards are available.
I am still using a 2060 Super from 2019, and the same thing happened that year, when the RTX 2000 Super series replaced the original RTX 2000 series.
This stands out like a sore thumb for anyone even taking a glance at the graphs. Who there thought this was a good idea? Now I think the Supers are just going to be 10% faster while drawing 30% more power or something else hacky and desperate-seeming.
It seems they’re being very careful not to undercut their enterprise offerings or even the 4090. Assuming they’re not completely tone deaf, I can only assume this is the explanation.
5 FP32 TFLOPS; if not doing sparse low-precision inference, it seems to be about in line with mid-to-high-end 2014 Nvidia consumer card performance (GTX 980), one decade old.
For running sparsified/quantized llama2 it might be good, not sure about for fine tuning. I didn't see any FP16 numbers.
Per chip? Not the full story when discussing a system which can integrate multiple. The Orin has more memory bandwidth than an RTX 4050 even though the latter uses GDDR6. The M3 Max has double the bandwidth of the Orin, but also uses LPDDR5.
Are you aware that cards containing “LLM” (40-80GB) levels of VRAM cost substantially more and the status quo for consumer cards hovers around 4-12GB, only going to 24GB for top end cards?
And this is exactly the way NVidia intends to keep it, methinks.
Give the consumers / gamers a consumer-priced GPU with a max of 16-24 GB VRAM for the high-end models. By consumer-priced, I mean $500-2000.
And make anyone interested in AI / ML / LLM / 3D / creatives pay $3000-10000 for GPUs that are similar in performance but have much higher VRAM.
Then top it out with six-figure (or higher) priced GPUs for the FAANG companies which can afford them for their data centers and currently contribute the most revenue (and profit) to NVidia.
Your comment, pre-edit, had something of a severe tone given that consideration.
Having said that, I've trained/finetuned image models just fine on an RTX 2070 Super with 8 GB of VRAM. This was back when doing so was more fruitful than simply training a more robust model in the first place. Given that is the current status quo - I'm curious what sort of training you're doing whole-network that actually produces results that are noticeably better than doing something few-shot during inference or doing LoRA finetuning? The latter brings you back into the realm of tuning on low-VRAM configs.
In general, a single GPU's memory constraints are one of many when training a model _from scratch_. In that case, you're bottlenecked by data and data parallelism. You don't need one or a few GPUs; you need more than would fit in a consumer setup in the first place.
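To make the LoRA point above concrete, here is a minimal sketch of what a LoRA finetuning setup typically looks like with Hugging Face's peft library. The checkpoint name and target_modules are placeholders, and the 4-bit loading assumes bitsandbytes is installed; only the small adapter matrices are trained, which is what keeps this within low-VRAM configs.

    # Sketch of a LoRA finetuning setup with Hugging Face peft. The model
    # name and target_modules are placeholders; pick ones that match your
    # architecture. 4-bit loading of the frozen base keeps VRAM usage low.
    from transformers import AutoModelForCausalLM
    from peft import LoraConfig, get_peft_model

    base = AutoModelForCausalLM.from_pretrained(
        "your-org/your-7b-model",   # placeholder checkpoint
        load_in_4bit=True,          # quantize the frozen base weights
        device_map="auto",
    )

    config = LoraConfig(
        r=8,                        # rank of the low-rank adapters
        lora_alpha=16,
        lora_dropout=0.05,
        target_modules=["q_proj", "v_proj"],  # attention projections, typically
        task_type="CAUSAL_LM",
    )

    model = get_peft_model(base, config)
    model.print_trainable_parameters()  # usually well under 1% of the base model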
My impression is a lot of the open source action is around the just-about-runs-in-12GB region - lots of models coming out with 7B/13B and 4-bit quantisation, a few 70B models (which won't fit in 24GB anyway) and only limited stuff in between.
I suppose I could be getting a biased impression though, as of course many more people are in a position to recommend the more accessible models.
What sort of things are you running that take full advantage of that 24GB?
Training - at least the run I tried - requires fp16 mode. So a 7B net needs 14 GB for the model weights alone, plus some extra for the context and the stuff I don't really understand (some gradient values; oh, that makes sense now that I've written it).
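As a rough, hedged back-of-the-envelope (assuming a plain Adam-style optimizer and ignoring activations, context/KV cache, and framework overhead): fp16 weights are 2 bytes per parameter, fp16 gradients add another 2, and fp32 optimizer moments add roughly 8 more.

    # Rough VRAM estimate for naive fp16 training with an Adam-style optimizer.
    # These byte counts are common rules of thumb, not exact figures;
    # activations and context add more on top.
    def training_vram_gib(n_params_billion: float) -> dict:
        n = n_params_billion * 1e9
        to_gib = lambda nbytes: nbytes / 1024**3
        return {
            "weights_gib": to_gib(2 * n),    # fp16 weights: 2 bytes/param
            "gradients_gib": to_gib(2 * n),  # fp16 gradients: 2 bytes/param
            "optimizer_gib": to_gib(8 * n),  # fp32 Adam moments: 8 bytes/param
            "total_gib": to_gib(12 * n),
        }

    # Weights alone ~13 GiB (~14 decimal GB); ~78 GiB total before activations.
    print(training_vram_gib(7))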
This is only supported on the previous-generation NVidia 3090: it is apparently possible to combine two 3090s with 24 GB of VRAM each and 'fuse' them with NVLink so they act as a single high-powered GPU with 48 GB of VRAM combined.
NVidia no longer supports this for the 40-series; I think this is because they want anyone interested in using their GPUs for LLMs to buy the pricier models with more VRAM.
Theoretically you can use as many GPUs as you want in parallel. LLMs are easy to split and run in a model-parallel configuration (for big models which don't fit on one card), or data-parallel for performance, where the same model runs different batches on different GPUs. PyTorch has full support for both modes, afaik.
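For reference, a minimal sketch of both modes in PyTorch, assuming two CUDA devices and a toy two-layer model; real LLM serving would typically use a library that handles the sharding for you.

    # Toy illustration of model parallelism vs data parallelism in PyTorch.
    import torch
    import torch.nn as nn

    class TwoGpuModel(nn.Module):
        """Naive model parallelism: half the layers per GPU."""
        def __init__(self):
            super().__init__()
            self.part1 = nn.Linear(4096, 4096).to("cuda:0")
            self.part2 = nn.Linear(4096, 4096).to("cuda:1")

        def forward(self, x):
            x = self.part1(x.to("cuda:0"))
            return self.part2(x.to("cuda:1"))  # activations hop between cards

    # Data parallelism: the same weights replicated, batches split across GPUs.
    replicated = nn.DataParallel(nn.Linear(4096, 4096).to("cuda:0"))

    split_model_out = TwoGpuModel()(torch.randn(8, 4096))
    replicated_out = replicated(torch.randn(8, 4096).to("cuda:0"))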
With both you and GP, I would imagine the answer is that people tend to build models to the hardware that is available. If 12GB and 24GB are the hardware thresholds that people have, you'll get "open-source action" in the 12GB and 24GB models, because people want to build things that run on the hardware they own.
(Which is of course how CUDA built its success more generally, vs the "you have to buy the $5k workstation card to get started" strategy from ROCm.)
More generally you'd call this optimization and targeting the hardware that's available. No sense releasing Crysis when everyone is running a Commodore 64, after all.
I actually have a 12GB card, which I purchased specifically for AI (24GB cards are too expensive for me). You're correct that 12GB is also a sweet spot in terms of what you get per dollar spent.
When you can sell an enterprise-grade card with 40-80GB of VRAM for $50k, selling consumer cards with 24GB for $2k is almost a form of charity, by comparison.
AMD and Intel GPUs do not have the software ecosystem for AI workloads that Nvidia does, though AMD is rapidly improving. Nvidia has had an effective monopoly on the AI hardware space for the last year or so, and continues to have an effective near-monopoly, but that won't last forever as AMD and Intel catch up.
The VRAM is one of the largest differentiators of their cards. Sufficient VRAM allows you to run huge LLMs like 65B in-memory, which is orders of magnitude faster than system RAM + CPU. Smaller amounts of VRAM require swapping between VRAM and system RAM and incur a major performance penalty.
Businesses are fighting to fork over $50k+/card for 40/80GB cards with the same processor as the 24GB consumer cards - it doesn't make economic sense for Nvidia to offer more on the consumer cards, lest they start cannibalizing demand for the enterprise cards.
> When you can sell an enterprise-grade card with 40-80GB of VRAM for $50k
The RTX 6000 Ada (48GB) — an enterprise (workstation) card — is about $10K sticker price. The differentiator between those and the data center cards in the 40GB-80GB range is, obviously, not just VRAM.
Also, as samspenc points out, that RTX 6000 is using the same AD102 chip found in a consumer RTX 4090, just with marginally more CUDA cores, TMUs, ROPs, etc.
The most substantial difference between the two is that the $10k card has twice the VRAM of the $2k card.
I know it sounds outrageously oversimplified, but Nvidia can indeed more or less print money by attaching a few extra memory chips to what is otherwise a flagship consumer-oriented graphics processor.
The profit margins on the H100, for reference, are estimated to be around one thousand percent (1000%), i.e., they sell for ~10x as much as it costs Nvidia to make them, and demand for them still grossly exceeds the total supply. See: https://the-decoder.com/nvidias-h100-gpu-sells-like-hot-cake...
The RTX 6000 Ada actually performs slightly worse than a consumer-grade 4090 (~10% less performant), even though it retails for 4-5x more ($8000 for the RTX 6000 Ada compared to $1500-2000 for a 4090).
Because the 5090 ostensibly targets gaming, which only needs sufficient VRAM for 4K textures on 4K, 5K, and ultrawide monitors. A large portion of the 5090 audience is not doing ML training, and that VRAM would sit idle for a few monitor generations. As a gamer I would be kind of upset if they included that very expensive, unneeded VRAM in their already very expensive cards.
As someone who is a gamer but also wants to dabble in ML, it kind of hurts tbh that this is probably true.
The xx90 cards (3090, 4090, etc.) have always been aimed at the gamer with a lot of cash. But games aren't really designed with that hardware in mind, so you won't see games that take advantage of that much VRAM, and NVIDIA isn't inclined to increase the memory on them.
I have a 3080 right now, so I haven't really been able to play around with training a model. I'd like to see the 5090 have 32 GB, but I'm having my doubts.
The 5880 is the same generation as the 4090 and is a workstation card (it has more memory but fewer CUDA cores available). The price is expected to be around $5-6k. What I mean by the 5000 series is the non-workstation next-gen lineup.
The attention mechanism, the core of LLMs, is universal enough to be brought back to standard vision models. Which is kind of ironic, since vision models were dominated by convolutions, and the transformer is dubbed a "convolution for text".
The real reason is that it doesn't deteriorate with regard to input length in the case of text, or distant neighbourhoods in the case of vision. It's just a universal new building block that allows shallower neural networks to perform more like their bigger versions.
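For the curious, the core operation is small enough to sketch in a few lines — a minimal single-head scaled dot-product attention, with made-up dimensions; real transformer layers add multiple heads, learned projections, masking, and so on.

    # Minimal single-head scaled dot-product attention (illustrative only).
    import torch
    import torch.nn.functional as F

    def attention(q, k, v):
        # q, k, v: (sequence_length, d_model). Every position attends to every
        # other position directly, which is why distance doesn't degrade it.
        scores = q @ k.transpose(-2, -1) / (q.shape[-1] ** 0.5)
        weights = F.softmax(scores, dim=-1)
        return weights @ v

    seq_len, d_model = 16, 64
    x = torch.randn(seq_len, d_model)
    out = attention(x, x, x)   # self-attention: q = k = v = x
    print(out.shape)           # torch.Size([16, 64])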
many text generation models run on my 11 GB 1080 Ti
you can run quantized versions of these models
if you aren't running it quantized, I'd say even 24 gig is not enough
If you want to get the most bang for your buck, you definitely need to run quantized versions. Yes, there are models that run in 11G, just like there are models that run in 8G, and for any other amount of VRAM - my point is that 24G is the sweet spot.
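As a rough, hedged illustration of why quantization matters for fitting into 24 GB (weight memory only; the KV cache and runtime overhead add a few more GB on top):

    # Rough weight-memory estimate at different precisions.
    # Ignores the KV cache, activations, and runtime overhead.
    def weight_gib(n_params_billion: float, bits_per_weight: float) -> float:
        return n_params_billion * 1e9 * bits_per_weight / 8 / 1024**3

    for model_b in (7, 13, 34, 70):
        print(f"{model_b}B: fp16 ~{weight_gib(model_b, 16):.1f} GiB, "
              f"4-bit ~{weight_gib(model_b, 4):.1f} GiB")

    # 7B:  fp16 ~13.0 GiB, 4-bit ~3.3 GiB
    # 13B: fp16 ~24.2 GiB, 4-bit ~6.1 GiB
    # 34B: fp16 ~63.3 GiB, 4-bit ~15.8 GiB
    # 70B: fp16 ~130.4 GiB, 4-bit ~32.6 GiB   (still doesn't fit in 24 GB)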
This is another very good release that is also being somewhat bizarrely panned. Yeah, it's essentially a 7600 16GB/4060 16GB, and that's what people wanted a few months ago: more VRAM. It also opens up a bunch of possibilities for cheap ML (in the same way the 4060 Ti 16GB does).
IMO this pretty well displaces the 4060 8GB and 16GB - it's cheaper (than even the already below-MSRP street prices) on 4060 Ti 8GB, it's way cheaper than the 16GB model. $50 over 7600 MSRP for twice the VRAM is a very fair deal, and street prices will probably float just as much as 7600 street prices have.
Clearance-priced 6700XT is a great deal but 7600/7600XT is ultimately a 6600/6600XT replacement and it's not a knock on the 7600 that it doesn't have the wider memory bus/etc - it is a lower-tier product that is only in the same price tier due to clearance markdowns.
I maintain that people are just mad about the whole last 5 years (since RTX launched) at this point and pretty much just give an automatic thumbs-down to anything that isn't an absurdly out-of-band good product. The pandemic shortages and mining boom have embittered a fair number of people to the point that I don't think they're coming back to the hobby; instead they sit on social media and complain.
It's tough because most of the games being benchmarked by reviewers were designed to work well on 4GB for 1080p and on 8GB high settings. So the extra VRAM doesn't help much for benchmarking.
But future games are likely to run better in >8GB simply because the PS5 and XBox Series X have more than 8.
I agree that you need to optimize for the reasonably forseeable future. You are buying games for the next 5 years, not the last 5 years - although I do weight things towards the frontend, it is better to optimize for the next 3 years and 2 more of relevance than to try to aim for 5 years of relevance or 2 years of worse/3 years of relevance. Generally tail-end relevance is of low value, by the time 5 years are up things have still shifted enough it's time to upgrade anyway, unless you just don't care by that point.
I think 8GB is going to continue to be a long-lived target especially at 1080p resolutions (with whatever gains can be squeezed from upscaling etc too - although generally DLSS needs to inference against the full-quality textures etc). Series S has 8GB (of fast-partition ram, the rest is GTX 970 style slow-partition) and even Series X only has 10GB.
People also aren't giving enough credit to mesh shaders etc, the GTX 1650 is actually still in the game with 4GB in Alan Wake 2[0], it does make a difference. The "but a 2060 super isn't relevant anymore!" argument relies on the assumption that you're deciding not to turn on upscaling etc. 1650 can run AW2 on lowest-settings 1080p with 4gb with FSR2 and it looks fine, and it'd be even better with RTX/DLSS. 2060 with DLSS can do a console-like experience on AW2 zero problem.
Consoles have always been a mixed bag. Yeah, they get a lot of specific tweaking and they also have special hardware which helps somewhat. But overall you're working from a (later in the gen) fairly low baseline. 6700 non-XT performance is ok but nothing stellar, and optimization doesn't save that. But honestly what they are good at is removing "paralysis of choice", having too many choices really hurts people and the emotional feeling of having to turn the setting onto lowest hurts people, even if that's what the console does itself! It's at least pre-tweaked lowest settings etc (although often worse tech etc - FSR3 is blown away by DLSS 3.5 image quality let alone 4.0 and future iterations which aren't far away). You don't have to think about it, you just say "framerate or quality" and you probably know which you want.
That is the problem that people will struggle with. 8GB will still work. It just also will be a Series S level experience, modulo things like mesh shading that occasionally differentiate the consoles (PS5 lacks it iirc, as well as DP4a). And that can still often look fine. Will you get more from spending more? Yes. But it also doesn't take that much - series X is the 6700, series S is like APU territory. A 3080 blows away the series X, etc. But you will have to stomach through moving that slider from "native" to "performance" and the texture quality from "ultra" to "medium". Etc. People have lost touch of the world of yesterday when "can you run crysis" was an actual question and not a meme, slamming every setting to ultra is not a given when you buy an entry-level card, and people also can't handle the fact that $200-300 is now entry-level. Midrange is $500-700, high-end is $800 to "how much have you got".
And that's not NVIDIA, that's really just wafer costs. If you want to compare die sizes and MSRPs against 10+ years ago (look up GTX 670/GK104 lol), you have to bear in mind that a given die size might cost 5x what it did back then. And it increases ~30% every node-family since 28nm, more or less. It's gonna go up over time, if you aren't moving up in price you're moving down in product-design-bracket and are going to have to deal with more design compromises to hit those lower price-points in the face of rising costs. It sucks, but nobody has any better ideas - to paraphrase what someone once told me, "the industrial and creative poles of several societies and continents are laser-focused on pushing this backwards, and yet the problems only become more difficult after each success". There is no easy answer, lots of smart people are working at this.
Keep in mind that for local LLM inference, Nvidia's software support is qualitatively superior to AMD's, for the time being. That said, it's worth noting that AMD is catching up quickly.
My 3080 laptop GPU can still play 99% of games at their highest settings. Or at least they did before the latest batch of nVidia drivers went to shit. It's not the hardware that's the problem. What's the point of upgrading to a 4-series when we're left with stuttering and defects from software issues?
I didn't expect an announcement of the Supers this early in the year, nor did I expect them to be cheaper than the non-Super cards. I bought the 4080 recently with the thought that Nvidia's trend in the past 4 or 5 years has been to increase prices, so even if the Supers were announced this early in the year the performance increase would cost proportionally more money anyway.
Sucks for me, but overall I'm glad that Nvidia is getting prices under control to some extent.
> nor did I expect them to be cheaper than the non-Super cards.
to be blunt, that's because you bought into ayymd propaganda. it is so endemic that people don't even see it for what it is anymore, people are constantly bombarded with absurdly pro-AMD and absurdly anti-NVIDIA takes, it's just the sea in which we swim on social media.
you should take it as a learning experience and not constantly buy into the ayymd bandwagon of the week next time. because there will absolutely be a next time - probably people will move onto the next insane thing within a few weeks here.
Last year it was that the 4090 was going to be >900W... people talked themselves into thinking that a two-node shrink was going to result in zero efficiency gain. This ada gen is a dud, just wait for AMD, the 7900XTX is gonna blow the doors off!
And it's happened to RDNA3, Vega, Fury X, Zen2, etc, and against every single technology deployed via RTX or DLSS. The flip on framegen the day AMD released FSR3 was amazing, and instantly all the complaints about latency etc vanished within a single day, despite being significantly worse latency because of forced vsync/incompatibility with VRR, let alone the reflex-only baseline. "Possibly the best part of FSR now" etc.
it's like the runup to the iraq war or something, there were counter-voices, but why would you want to listen to them when everybody knows the truth already? Going against the grain constantly is tiresome and frames you as an iconoclast, and even if you're right people still think you're a troublemaker for having contradicted them earlier. The people who blocked you are not gonna unblock you just because you were right. It's like trying to be the voice of reason in a failing project, even if you save the project you're still a troublemaker. So eventually the dialogue just fades into an echo chamber. It is a fast road to what was eulogized as "epistemic closure" - aka "we bandwagoned too hard and drowned out all the opposing voices, and it turns out they were correct".
So here we are: green man bad, everyone knows it, and this exception really only proves the rule. Now if you'll excuse me I've got some very important posts to make about how you'll never be able to buy one for MSRP anyway, like it's still 2020 or something.
Since crypto has crashed so much and most of these are borderline useless for AI/ML, there's not likely to be too much of an issue. The MSRP is already massively marked up.
Considering the poor generational performance uplift the original 4000 series cards had vs the 3000 series - it almost seems like these super variants are what nVidia should have originally launched. :-/
The 4080/4090 are outliers of the generation though unfortunately, and also are themselves still a rather typical generational gain at that.
Most of the rest of the 40xx stack was either unexpectedly slow or unexpectedly expensive (or both), such that performance per dollar stayed flat or regressed
Ah that makes sense. I don't pay close enough attention to the mid tier or the low end. I'm sure I saw at some point that there were problems with those cards but it's hard to remember when it doesn't stand out because people are negative about every GPU.
I checked a sampling of a few games there and the only one where it's not ahead by a solid margin at 2560x1440 is Far Cry 6... the game that I gave up trying to play because it would not run smoothly on my 3090 + 5900X even with the settings turned down.
Maybe all those years where Intel was stuck on 14nm made me forget how big leaps could be generation to generation, but to me those jumps of more than 30% are huge especially considering that while the 4080 is a gen ahead of the 3090 it's also a tier down.
Also if you look at 4K performance, the % gaps for all these games are even larger (and I'm not looking at 4K now because it's better for my point. My next monitor will be one of the 4K QD-OLEDs that were announced at CES, so those charts are now more relevant to me than the 1440p ones)
lol, not going to happen. The low MSRP is too low for partners to make any money, so they'll keep the prices inflated as much as the market will bear, regardless of the availability of the chips
Waiting for Intel Battlemage; the latest rumors say 16 GB of memory, a 256-bit wide bus, and RTX 4080-level compute performance for a $450 MSRP. Wait and see, Q3 2024. Intel's software stack is more likely to challenge Nvidia's than AMD's is.
All I want is a low-power passively-cooled replacement for my aging 5GB Quadro P2000 that doesn't crash out while running the new Lightroom denoise algorithms. None of these seem likely choices in that area, sigh.
Is it a noise issue? If you take a newer open-air style card and lower the power limit it should be fairly quiet and still much faster than what you've got. You can also tinker with limiting the clock speed and maybe get even lower, enough that you could take the fans off.
Generally a wise decision when it comes to anything that has this type of release schedule, from phones to 3D printers. Buy one generation, skip one, maybe upgrade the next, depending on relevant improvements.
Yeah, my mistake buying an ultrawide monitor. My 4yo rig can push 1080 okay but 3440x1440 is a bit much. I don't dare aspire to anything higher than 60fps.
Hoping this drives down the price for the 7800XT or older used models in the near future.
That being said, these look competent enough, just still stingy with VRAM, which makes them less desirable for longer use (4+ years) in either playing games or training models.
I feel the exact opposite. There are too many! There are now nine models of the 4000-series, and that's not including laptop models or the cancelled 4080 12 GB.
IMO, there should be 5 models at the maximum. I shouldn't have to sit here and do a bunch of research to find out if the 4070 SUPER is faster than a 4070 Ti, and whether I should go for a 4070 SUPER Ti.
4060, 4070, 4080, 4090. That's all they ever needed. Budget, mid-grade, enthusiast, top-of-the-line. That's all that's needed.
They did something similar for the 30-series, I think 20-series as well.
Though given that NVidia seems to release its consumer GPUs once every 2 years (20 series in late 2018, 30 series in late 2020, 40 series in late 2022), I wonder if this is more of a marketing ploy - release the main series once every 2 years, but bring out "Super" refreshes in the middle of the cycle to make sure you're still in the news, and get some segment of consumers / gamers to upgrade to those.
3060 Ti - yes, it's still a great card that I use today
However, if you can't find a 3060 Ti at a lower price point than the 4060 Ti... then I reluctantly have to say you are probably going to be better served with the 4060 Ti.
It really does seem to be the same branding problem as the Dragon Ball series. How do you say “this version is more powerful than any before” after having said that a dozen times in a row?