AI cloud startup TensorWave bets AMD can beat Nvidia (theregister.com)
35 points by LorenDB on April 16, 2024 | 30 comments



The MI300X has more than twice the memory and is 25-50 the cost of an H100. It's a great deal if they can make it work.

It could be great for everyone else that TensorWave is willing to invest in it and hopefully help drive improvements in the software.

https://www.techpowerup.com/318652/financial-analyst-outs-am...


I assume you forgot the percent sign after 25-50? Because I originally interpreted that as "25-50 times the cost" for a split second before realizing that it couldn't be right...


Hah! You are correct. Sorry for the confusion.


The part that's not clear to me is how. The value I see in AMD's participation is opening up the CUDA walled garden. That's different from what's good for TW. They would be better off with an AMD/TW walled garden that they can provide service for at better prices. But better prices alone won't be enough to get companies to move from the largest/only walled garden to a new/smaller one.

The best I could see is developing a service platform that frictionlessly and more efficiently runs CUDA workloads on AMD using a proprietary translation. Not a bad bet, IMO.


Availability is a big driver as well. If you can get an MI300x today vs. waiting a year for H100s, which do you go with?


Wait for the H100, and use Nvidia support and any Nvidia card to get started?

In my company, 90% of computers have an Nvidia card, so you can get started with CUDA immediately: start data aggregation and plan your AI training and inferencing while waiting for deployment. Totally forget that approach with AMD.

Nvidia also assists larger customers with DGX Cloud access through its large deployed supercomputers.

Nvidia's AI Workbench helps a lot here, as you can easily transfer your applications from local RTX cards to on-prem or cloud data centers.


As of late last year, AMD is now committed to AI.

The MI300x was released in December of last year, only months ago. It has 192 GB of memory, while H100s only have 80 GB.

There are at least 4 companies now providing bare metal cloud access to these cards, plus 2 more that are hyperscalers (Oracle and MSFT).

It is critically important for the long term safety of AI that we are not dependent on a single source for all of the hardware and software related to AI.

It takes a lot of effort to course-correct a large ship; give it some time.


What evidence is there that AMD's software efforts are improving?


So, HIP at a raw level is as performant as CUDA. The real problems come from the higher-level stack (the BLAS and LAPACK libraries, for example). But not all software needs the higher-level stack, so then it becomes a cost-benefit analysis.

A $15k AMD part vs. a $60k Nvidia part. For the price of 100 Nvidia GPUs, you can buy 200 AMD GPUs plus at least 2-3 engineers for 3 years at $300k to fix the specific library for that GPU. If you can make that work for a lower-level library right now, then it makes sense to sustain it in the future.
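
Rough back-of-the-envelope math on those figures (the unit prices and the $300k engineer cost are the assumptions stated above, not official pricing); a minimal sketch:

    // Sanity check of the cost-benefit numbers quoted above (all figures are
    // the commenter's assumptions, not official pricing).
    #include <cstdio>

    int main() {
        const double nvidia_unit = 60000.0;   // assumed price per Nvidia GPU
        const double amd_unit    = 15000.0;   // assumed price per AMD GPU
        const double engineer_yr = 300000.0;  // assumed fully loaded cost per engineer-year

        const double nvidia_budget = 100 * nvidia_unit;          // $6.0M for 100 Nvidia GPUs
        const double amd_spend     = 200 * amd_unit;             // $3.0M for 200 AMD GPUs
        const double left_over     = nvidia_budget - amd_spend;  // $3.0M remaining

        // $3.0M / $300k per engineer-year = 10 engineer-years,
        // i.e. roughly 3 engineers for 3 years.
        std::printf("left over: $%.1fM -> %.0f engineer-years\n",
                    left_over / 1e6, left_over / engineer_yr);
        return 0;
    }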


rocBLAS/hipBLAS are pretty solid. You're on the money with AMD's implementations of LAPACK not being up to snuff, though.
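
For illustration, a minimal hipBLAS SGEMM sketch under assumed conditions (a ROCm install with hipBLAS; the header path varies between ROCm releases); the call mirrors cublasSgemm argument for argument:

    // Toy hipBLAS example: C = A * B for 4x4 column-major matrices.
    #include <hip/hip_runtime.h>
    #include <hipblas/hipblas.h>   // older ROCm releases ship this as <hipblas.h>
    #include <vector>
    #include <cstdio>

    int main() {
        const int n = 4;
        std::vector<float> hA(n * n, 1.0f), hB(n * n, 2.0f), hC(n * n, 0.0f);

        float *dA, *dB, *dC;
        hipMalloc((void**)&dA, n * n * sizeof(float));
        hipMalloc((void**)&dB, n * n * sizeof(float));
        hipMalloc((void**)&dC, n * n * sizeof(float));
        hipMemcpy(dA, hA.data(), n * n * sizeof(float), hipMemcpyHostToDevice);
        hipMemcpy(dB, hB.data(), n * n * sizeof(float), hipMemcpyHostToDevice);

        hipblasHandle_t handle;
        hipblasCreate(&handle);
        const float alpha = 1.0f, beta = 0.0f;
        // Same argument order as cublasSgemm: C = alpha*A*B + beta*C
        hipblasSgemm(handle, HIPBLAS_OP_N, HIPBLAS_OP_N,
                     n, n, n, &alpha, dA, n, dB, n, &beta, dC, n);

        hipMemcpy(hC.data(), dC, n * n * sizeof(float), hipMemcpyDeviceToHost);
        std::printf("C[0] = %.1f (expect %.1f)\n", hC[0], 2.0f * n);

        hipblasDestroy(handle);
        hipFree(dA); hipFree(dB); hipFree(dC);
        return 0;
    }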


HIP is a thing. A lot of existing CUDA code can be translated with minimal effort. And machines get better all the time at guiding that rewrite.
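
To make that concrete, here's a toy kernel written for HIP, with comments marking the CUDA calls each line replaces; hipify-perl/hipify-clang automate exactly this renaming (a hedged sketch, not code from TensorWave's or AMD's stack):

    #include <hip/hip_runtime.h>
    #include <cstdio>

    // Same __global__/blockIdx/threadIdx builtins as CUDA.
    __global__ void scale(float* x, int n, float a) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) x[i] *= a;
    }

    int main() {
        const int n = 1024;
        float* d = nullptr;
        hipMalloc((void**)&d, n * sizeof(float));    // was cudaMalloc
        hipMemset(d, 0, n * sizeof(float));          // was cudaMemset
        scale<<<(n + 255) / 256, 256>>>(d, n, 2.0f); // hipcc keeps the <<<...>>> launch syntax
        hipDeviceSynchronize();                      // was cudaDeviceSynchronize
        hipFree(d);                                  // was cudaFree
        std::printf("kernel launched and synchronized\n");
        return 0;
    }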


I’m skeptical (without evidence). Companies that have never been good at software can’t just suddenly be good at software overnight.


AMD also did several acquihires in the recent past, and now AMD has more money than it has ever had. It has stated that AI software is one of the company's top priorities for about a year now, and the results are surfacing.


Near-constant releases of ROCm.


> TensorWave will fund its bit barn build by using its GPUs as collateral for a large round of debt financing, an approach used by other datacenter operators.

Hmmm. This seems like an odd risk for the lenders: asymmetrical, with little upside, and the security depends on continuing demand for GPUs. When the AI tide goes out, who will be left with the losses?


When these silly motorized coaches fail, what will we do with all the paved roads? The horses hate it!


I take your point, but the risk is that even further advances on a price/performance basis will lap this investment.


Also, lenders secured against H100 value face risk when the GB200 comes out, as GH prices will go down a lot.


Disclosure: competitor to TW.

I see this as super risky as well. I'm taking a totally different approach with my business. We are only growing with customer demand. First off, we are getting a decent number of GPUs to rent, which should cover our initial capacity needs. Then as we grow, we will push revenue + further investment back into more purchase orders. We can also order and deploy compute on a very short timeframe, so if we have a customer that wants a bunch of compute that we don't have today, we will get it online relatively quickly. Grandiose claims of 20k GPUs by the end of the year almost never work out the way you want them to.


> This seems like an odd risk for the lenders. Asymmetrical with little upside

Upside and risk are usually priced into the loan as the interest rate


Gutsy play on timing, but I think it'll pay off. By the end of 2024 I expect AMD to be competitive enough to make sense.


If you're deep enough into this business, it doesn't seem gutsy at all.

Disclosure: competitor to TW.


What do you think, then, as a competitor of TW? Are you guys also considering adopting AMD chips?


Yes, we are buying and deploying MI300x.


With driver quality like this, I don't share their optimism.


So why should VCs back this when they could make the same bet outright by buying AMD stock, other than as, say, a leveraged bet?


AMD already has a stock price that reflects its whole bag of products. TW's stock price is effectively zero for a pre-listing VC investment; TW is a cloud service bet, not a chip bet (i.e. higher up the stack = more value); and TW is AI-only.


During a gold rush you can invest in a mine or a shovel company.


The actual bet is whether TensorWave can use AMD to beat Nvidia, not a direct bet on AMD itself.


TensorWave doesn't compete with Nvidia but with CoreWeave. I bet they might even have the same founder lol.

CoreWeave is doing the same stuff but with Nvidia, also using its GPUs as collateral. But CoreWeave did it last year, when the H100 was way more valuable as collateral than it is today. And CoreWeave has actually been backed by Nvidia and Microsoft in some funding rounds.

On Nvidia HW, there are something like 10x as many AI startups doing what TensorWave is doing. TensorWave is taking the riskier route by going with AMD instead of Nvidia. Being among the few such startups might give them a large benefit, but it also depends a lot on AMD's support on the SW side. I wouldn't bet on that, especially not with AMD HW as collateral.



