kd913 · 2025-01-29T00:34:58 1738110898

Am curious if the problem impacts m4 given it came out after this was released and disclosed.

That and it moved to Arm’s 9.2 instructions.

jabwd · 2025-01-29T08:47:18 1738140438

Keep in mind that it takes at least 3 months to produce an M4, and the design has been finalized long before that. So most likely yes

saagarjha · 2025-01-29T05:06:57 1738127217

kd913 · 2025-01-27T22:26:19 1738016779

The biggest discussion I have been having on this is the implications of DeepSeek for, say, the ROI of an H100. Will a sudden spike in available GPUs and a reduction in demand (from more efficient GPU usage) dramatically shock the cost per hour to rent a GPU? This, I think, is the critical value for measuring the investment value of Blackwell now.

The price for a H100 per hour has gone from the peak of $8.42 to about $1.80.

An H100 consumes 700W; let's say $0.10 per kWh?

A H100 costs around $30000.

Given DeepSeek, can the price drop further, now that a much larger supply of usable GPUs has been shown to be unlockable (MI300X, H200s, H800s, etc.)?

Now that LLMs have effectively become a commodity, with a significant price floor, is this new rental price still above what is profitable for the card?

Given the new Blackwell is $70000, are there sufficient applications that enable customers to get an ROI on the new card?

Am curious about this, as I think I am currently ignorant of the types of applications businesses can use to outweigh the costs. I predict the cost per hour of a GPU will drop to the point where it isn't the no-brainer investment it was previously. Especially if it is now possible to unlock potential from much older platforms running at lower electricity rates.
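
For what it's worth, the H100 figures quoted above can be turned into a rough payback estimate. This is only a back-of-envelope sketch: the function names are made up for illustration, it assumes the card is rented out 100% of the time by default, and it ignores hosting, cooling, networking, and financing costs, none of which are given in the comment.

```python
# Back-of-envelope H100 rental payback, using only the figures quoted above.
# Assumptions (not from the thread): full utilization by default, the card
# draws its full 700 W around the clock, and there are no other costs.

CARD_PRICE_USD = 30_000.0        # quoted purchase price of an H100
POWER_KW = 0.7                   # quoted 700 W draw
ELECTRICITY_USD_PER_KWH = 0.10   # quoted guess at the electricity rate


def hourly_power_cost(power_kw=POWER_KW, usd_per_kwh=ELECTRICITY_USD_PER_KWH):
    """Electricity cost of running the card for one hour."""
    return power_kw * usd_per_kwh


def payback_hours(rental_usd_per_hour, utilization=1.0):
    """Wall-clock hours until rental income minus electricity repays the card."""
    margin = rental_usd_per_hour * utilization - hourly_power_cost()
    if margin <= 0:
        return float("inf")  # the card never pays for itself at this rate
    return CARD_PRICE_USD / margin


if __name__ == "__main__":
    for rate in (8.42, 1.80):
        hours = payback_hours(rate)
        print(f"${rate:.2f}/h: {hours:,.0f} h (~{hours / 24 / 365:.2f} years)")
```

Under those assumptions electricity is about $0.07 per hour, so payback is roughly five months at the $8.42 peak and roughly two years at $1.80. Lower utilization or a further price drop stretches that quickly, which is the concern being raised here.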

chatmasta · 2025-01-28T02:16:58 1738030618

Why is there this implicit assumption that more efficient training/inference will reduce GPU demand? It seems more likely - based on historical precedent in the computing industry - that demand will expand to fill the available hardware.

We can do more inference and more training on fewer GPUs. That doesn’t mean we need to stop buying GPUs. Unless people think we’re already doing the most training/inference we’ll ever need to do…

“640KB ought to be enough for anybody.”

Jensson · 2025-01-28T02:37:47 1738031867

Historically most compute went to running games in people's homes, because companies didn't see a need to run that much analytics. I don't see why that wouldn't happen now as well. There is a limit to how much value you can get out of this, since they aren't AGI yet.

chatmasta · 2025-01-28T03:08:00 1738033680

This just seems like a very bold statement to make in the first two years of LLMs. There are so many workflows where they are either not yet embedded at all, or only involved in a limited capacity. It doesn’t take much imagination to see the areas for growth. And that’s before even considering the growth in adoption. I think it’s a safe bet that LLM usage will proliferate in terms of both number of users, and number of inferences per user. And I wouldn’t be surprised if that growth is exponential on both those dimensions.

Jensson · 2025-01-28T04:23:32 1738038212

> This just seems like a very bold statement to make in the first two years of LLMs

GPT-3 is 5 years old; this tech has been looking for a problem to solve for a really long time now. Many billions have already been burned trying to find a viable business model for these, and so far nothing has been found that warrants anything even close to multi-trillion-dollar valuations.

Even when the product is free people don't use ChatGPT that much, making things cheaper will just reduce the demand for compute then.

light_hue_1 · 2025-01-28T08:49:08 1738054148

Everyone uses ChatGPT now. You too. Hundreds of times per day.

It's just not called chatgpt. Instead it is at the top of every Google search you do. Same technology.

It has basically replaced search for most people. A massive industry turned over in 5 years by a totally new technology.

Funny how the tech took over so completely it blends into the background to the point where you think it doesn't exist.

ceejayoz · 2025-01-28T13:59:08 1738072748

> It has basically replaced search for most people.

Not because it's better than search was, though.

They lost the spam battle, and internally lost the "ads should be distinct" battle, and now search sucks. It'll happen to the AI models soon enough; I fully expect to be able to buy responses for questions like "what's the best 27" monitor?" via Google AdWords.

cryptonym · 2025-01-28T10:55:29 1738061729

Using it doesn't mean people like it. It's forced on us. See recent stories about Microsoft.

diamond559 · 2025-01-28T03:49:20 1738036160

Over the long run maybe, but for the next 2 years the market will struggle to find a use for all these possible extra GPUs. There is no real consumer demand for AI products and lots of backlash whenever they are implemented, e.g. that Coca-Cola ad. It's going to be a big hit to demand in the short to medium term as the hyperscalers cut back/reassess.

light_hue_1 · 2025-01-28T08:40:09 1738053609

There's no consumer demand for AI?

In a thread full of people who have no idea what they're talking about either from the ML side or the finance side, this is the worst take here.

OpenAI alone reports hundreds of millions of MAU. That's before we talk about all of the other players. Before we talk about the immense demand in media like Hollywood and games.

Heck there's an entire new entertainment industry forming with things like character ai having more than 20M MAU. Midjourney has about the same.

Definitely. An industry in its infancy that already has hundreds of millions of MAU across it shows that there's zero demand because of some ad no one has seen.

financypants · 2025-01-28T05:50:40 1738043440

Seems like your reasoning for how the next 2 years will go is a little slanted. And everyone in this thread is neglecting any demand issues stemming from market cycles.

egillie · 2025-01-28T04:17:46 1738037866

Could even argue that the price should go up, since the amount that one GPU can do, and its potential ROI, just increased.

theptip · 2025-01-28T03:04:11 1738033451

I think training demand is what you might predict would plummet.

Inference demand might increase but you could easily believe that there’s substantial inelasticity currently.

kd913 · 2025-01-25T20:56:45 1737838605

It should be trivially easy to reproduce the results no? Just need to wait for one of the giant companies with many times the GPUs to reproduce the results.

I don't expect a #180 AUM hedge fund to have as many GPUs as Meta, MSFT or Google.

sudosysgen · 2025-01-25T21:34:42 1737840882

AUM isn't a good proxy for quantitative hedge fund performance, many strategies are quite profitable and don't scale with AUM. For what it's worth, they seemed to have some excellent returns for many years for any market, let alone the difficult Chinese markets.

kd913 · 2025-01-13T20:55:56 1736801756

I think there can be a difference here.

Was looking recently at the power requirements of an amp + subwoofer + 5 5.1 JBL surround speakers.

The setup was done a decade ago, and the power needed for it was nuts. Something like 500W for a Denon amp and 250W for a JBL subwoofer?

For reference something like a OG HomePod consumes what 45W? The Sony srs xg500 boombox can last 30hours and is a giant room shaking boombox.

The difference in power efficiency between these old and new setups is nuts. Never mind compatibility with AirPlay, streaming etc…

kristjansson · 2025-01-13T21:12:38 1736802758

> 500W

Amplifiers are quoted in peak output, not average (and play some games with other parameters e.g. resistance) to capture bigger-number-better sales. A 750w system will consume nowhere near 750w at typical listening volumes (just like your 750w PC doesn't use 18 kWh every day.)
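
The 18 kWh figure is just the rated wattage held for 24 hours. A tiny sketch of that arithmetic follows; the function name is made up for illustration and the 50 W average draw is a purely hypothetical number, not a measurement of any amp or PC.

```python
def daily_kwh(average_watts, hours=24.0):
    """Energy in kWh consumed at a constant average draw over `hours` hours."""
    return average_watts * hours / 1000.0


if __name__ == "__main__":
    # Rated (peak) figure held around the clock: 750 W * 24 h = 18 kWh.
    print(daily_kwh(750))
    # Hypothetical 50 W average draw (an assumption for illustration only).
    print(daily_kwh(50))
```

The gap between the two numbers is the point: the label describes the ceiling, while the bill follows the average draw.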

throw-qqqqq · 2025-01-13T21:12:52 1736802772

Unless you’re playing REALLY loud, I don’t think you are anywhere near 250 or 500W of consumption. I would guess it is the maximum rated power?

Even with quite old and inefficient amp + speaker combo, 30W of sound is usually a lot(!).

Tube amps are an exception. They can be very power hungry, but it’s difficult to buy such tech today compared to class D etc.

kingnothing · 2025-01-13T21:09:51 1736802591

There's also an absolutely massive difference in audio quality between a HomePod or Sonos anything and a proper amp + speakers.

jkolio · 2025-01-13T21:19:06 1736803146

Yup. Newer products use various tricks to try to fill in the gaps that their physical reality can't overcome, but ultimately there's no getting around that reality.

I will say that the Sony upright boom boxes aren't to be slept on (and, if one is active, fat chance). They're quite good for their intended use cases (parties, and closed Best Buys during clean-up/inventory).

tpm · 2025-01-13T21:12:05 1736802725

A 500W amp is probably a class A and can't really be made more efficient. It would still be 500W in 2024. Decades ago there were more efficient setups too, though of course now they sound better and also have lots more features and connectivity.

kd913 · 2025-01-02T15:48:40 1735832920

Surely the issue here is that there is no easy mechanism for backing up?

The proper solution should be secure by design and user friendly. We shouldn’t compromise the former for the latter.

kd913 · 2024-11-25T13:27:57 1732541277

What is being asked for already exists? It is called Onload.

https://github.com/Xilinx-CNS/onload

a-dub · 2024-11-25T14:05:25 1732543525

it is my understanding that io_uring is the generalized open source implementation of this, although i do not think it bypasses the kernel fib trie like openonload does...

gpderetta · 2024-11-25T17:58:23 1732557503

Aside from onload being open source, not really. AF_XDP is the generalized, hardware-agnostic version of kernel bypass.

In addition to bypass onload also provides a full IP/TCP user space stack and non-intrusive support for existing binaries using the standard BSD socket interface (incidentally onload also supports XDP now).

io_uring is really for asynchronous communication with the kernel.

a-dub · 2024-11-26T17:28:55 1732642135

interesting, didn't know that the networking stack had ring buffer infrastructure as well. (i don't think this af_xdp stuff existed when i was in this world)

the fib trie is the core of the ip stack - i was using it as proxy for total ip stack bypass.

kd913 · 2024-07-29T23:17:59 1722295079

I am confused why they are around to begin with.

Companies already trust Microsoft, they buy Windows, Office, Azure.

Why would they bother with a 3rd party here when the low-effort, low-risk solution is to pick the tool made by the OS vendor, i.e. Windows Defender?

It should be a nobody gets fired for picking IBM situation. How did this random place get so much credibility that people trust them over the manufacturer?

kccqzy · 2024-07-30T02:13:39 1722305619

Because they provide far more protection than Windows defender. You can write your own custom never-before-seen malware, and CrowdStrike will detect it purely based on behavioral signals. Windows Defender is still largely an antivirus solution.

Peanuts99 · 2024-07-30T09:38:42 1722332322

Microsoft's E5 offerings are a direct competitor to CrowdStrike's threat response products, which are a lot more than just Windows Defender on endpoints. I'd imagine many of CrowdStrike's customers will be looking to move to MS's tools instead as a result of this.

htrp · 2024-07-30T02:12:24 1722305544

crowdstrike has oracle enterprise sales model. have you ever been to one of their events?

kd913 · 2024-07-13T13:26:08 1720877168

Holy shit the level of delusion.

Crypto weakening the hold of the state over citizens? WTF kind of paint thinners are you smoking?

Who are the main stakeholders in crypto?

- A consolidated pool of a few miners (some of which are state owned because they are the only ones who own the electricity plants)
- Russia, North Korea, China, the mafia?
- A consolidated set of finance individuals who have a perverse incentive to take advantage of lax financial regulations
- A series of exchanges run by proven scum
- The largest owner being, what, the FBI and the US government?
- A series of undemocratically elected scum who openly print fake money from Bermuda (Tether?)

A type of currency where people can lose their savings on a dime, have openly no protections and ridiculous levels of fees whilst simultaneously destroying the planet.

For the people my arse, the people involved should be lynched given how 90% of it just supports slavery, drug trade, war and crime.

kd913 · 2024-06-25T16:24:14 1719332654

You do know that Microsoft, Oracle, Meta are all in on this right?

Heck I think it is being used to run ChatGPT 3.5 and 4 services.

softfalcon · 2024-06-25T16:30:49 1719333049

I feel like people forget that AMD has huge contracts with Microsoft, Valve, Sony, etc to design consoles at scale. It's an invisible provider as most folks don't even realize their Xbox and their Playstation are both AMD.

When you're providing fab designs at that scale, it makes a lot more sense to folks that companies would be willing to try a more affordable option to nVidia hardware.

My bet is that AMD figures out a service-able solution for some (not all) workloads that isn't ground breaking, but affordable to the clients that want an alternative. That's usually how this goes for AMD in my experience.

sangnoir · 2024-06-25T18:39:13 1719340753

If you read/listen to the Stratechery interview with Lisa Su, she spelled out being open to customizing AMD hardware to meet partners' needs. So if Microsoft needs more memory bandwidth and less compute, AMD will build something just for them based on what they have now. If Meta wants 10% less power consumption (and cooling) for a 5% hit in compute, AMD will hear them out too. We'll see if that hardware customization strategy works outside of consoles.
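
To put a number on the Meta example above: trading 5% of compute for 10% less power is a net gain in performance per watt. A minimal sketch, with a made-up function name and only the two percentages from the comment as inputs:

```python
def relative_perf_per_watt(perf_change, power_change):
    """Perf-per-watt relative to baseline, given fractional changes.

    perf_change=-0.05 means 5% less compute; power_change=-0.10 means
    10% less power. A result above 1.0 is an efficiency improvement.
    """
    return (1.0 + perf_change) / (1.0 + power_change)


if __name__ == "__main__":
    ratio = relative_perf_per_watt(perf_change=-0.05, power_change=-0.10)
    print(f"perf/W vs baseline: {ratio:.4f}")
```

That works out to 0.95 / 0.90, or roughly a 5.6% improvement in performance per watt, before counting any savings on cooling.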

rcxdude · 2024-06-25T23:04:19 1719356659

It certainly helps differentiate from NVIDIA's "Don't even think about putting our chips on a PCB we haven't vetted" approach.

pjmlp · 2024-06-26T08:05:47 1719389147

Yeah, but they will be using internal Microsoft and Meta software stacks, nothing that will dent CUDA.

Rinzler89 · 2024-06-25T20:23:21 1719347001

>I feel like people forget that AMD has huge contracts with Microsoft, Valve, Sony, etc to design consoles at scale.

Nobody forgets that; it's just that those console chips are super low margin, which is why Intel and Nvidia stopped catering to that market after the Xbox/PS3 generations, and only AMD took it up because they were broke and every penny mattered to them.

Nvidia did a brief stint with the Shield/Switch because they were trying to get into the Android/ARM space and also kinda gave up due to the margins.

pjmlp · 2024-06-26T07:57:28 1719388648

A market that is regularly said to be reaching its end, as newer generations aren't that much into traditional game consoles, and both Sony and Microsoft[0] have to reach out to PCs and mobile devices to achieve sales growth.

Among the gamer community the discussion of this being the last generation keeps popping up.

[0] - Nintendo is more than happy to keep redoing their hit franchises, on good enough hardware.

0cf8612b2e1e · 2024-06-25T16:29:40 1719332980

On the other hand, AMD has had a decade of watching CUDA eat their lunch and done basically nothing to change the situation.

bee_rider · 2024-06-25T17:02:37 1719334957

AMD tries to compete in hardware with Intel’s CPUs and Nvidia’s GPUs. They have to slack somewhere, and software seems to be where. It isn’t any surprise that they can’t keep up on every front, but it does mean they can freely bring in partners whose core competency is software and work with them without any caveats.

Not sure why they haven’t managed to execute on that yet, but the partners must be pretty motivated now, right? I’m sure they don’t love doing business at Nvidia’s leisure.

pjmlp · 2024-06-26T08:06:37 1719389197

Hardware is useless without software to make it show off.

bobsondugnut · 2024-06-25T17:11:19 1719335479

when was the last time AMD hardware was keeping up with NVIDIA? 2014?

0cf8612b2e1e · 2024-06-25T17:41:26 1719337286

Been a while since AMD had the top tier offering, but it has been trading blows in the middle tier segment the entire time. If you are just looking for a gamer card (ie not max AI performance), the AMD is typically cheaper and less power hungry than the equivalent Nvidia.

aurareturn · 2024-06-25T18:04:48 1719338688

It’s trading blows because AMD sells their cards at lower margins in the midrange and Nvidia lets them.

bee_rider · 2024-06-26T15:14:25 1719414865

But, the fact that Nvidia cards command higher margins also reflects their better software stack, right? Nvidia “lets them” trade blows in the midrange, or, equivalently, Nvidia is receiving the reward of their software investments: even their midrange hardware commands a premium.

bobsondugnut · 2024-06-25T18:03:24 1719338604

> the AMD is typically cheaper and less power hungry than the equivalent Nvidia

cheaper is true, but less power hungry is absolutely not true, which is kind of my point.

dralley · 2024-06-25T22:25:04 1719354304

It was true with RDNA 2. RDNA 3 regressed on this a bit, supposedly there was a hardware hiccup that prevented them from hitting frequency and voltage targets that they were hoping to reach.

In any case they're only slightly behind, not crazy far behind like Intel is.

bee_rider · 2024-06-25T18:08:55 1719338935

The MI300X sounds like it is competitive, haha

bobsondugnut · 2024-06-25T18:43:39 1719341019

competitive with H100 for inference. a 2 year old product on just one half of the ML story. H200 (and potentially B100) is the appropriate comparison based on their production in volume.

adabyron · 2024-06-25T16:32:30 1719333150

I have read in a few places that Microsoft is using AMD for inference to run ChatGPT. If I recall they said the price/performance was better.

I'm curious if that's just because they can't get enough Nvidia GPUs or if the price/performance is actually that much better.

atq2119 · 2024-06-25T17:39:10 1719337150

Most likely it really is better overall.

Think of it this way: AMD is pretty good at hardware, so there's no reason to think that the raw difference in terms of flops is significant in either direction. It may go in AMD's favor sometimes and Nvidia's other times.

What AMD traditionally couldn't do was software, so those AMD GPUs are sold at a discount (compared to Nvidia), giving you better price/performance if you can use them.

Surely Microsoft is operating GPUs at large enough scale that they can pay a few people to paper over the software deficiencies so that they can use the AMD GPUs and still end up ahead in terms of overall price/performance.

kd913 · 2024-06-14T12:08:09 1718366889

There are these new AI features, but I feel there was also a bigger reason to upgrade that was almost completely ignored.

A huge set of media companies have shifted to using AV1 and these older devices are gonna get hammered for battery.

I was planning on upgrading anyway just to get hardware AV1 decode given youtube music/youtube are one of my most frequent apps.

Surprised there wasn't more furore when Google mandated that shift.

antonkochubey · 2024-06-14T12:13:04 1718367184

>A huge set of media companies have shifted to using AV1

Such as? Most YouTube videos I am watching are still VP9 at 1080p/1440p, and there's no reason to watch 4K on phones (you still can, but lower battery life is your own choice in that case).