
Can anyone comment on the TinyBox they are taking preorders for?

The tinybox

738 FP16 TFLOPS

144 GB GPU RAM

5.76 TB/s RAM bandwidth

30 GB/s model load bandwidth (big llama loads in around 4 seconds)

AMD EPYC CPU

1600W (one 120V outlet)

Runs 65B FP16 LLaMA out of the box (using tinygrad, subject to software development risks)

$15,000
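
A quick sanity check on the memory and load-time claims (a rough sketch assuming 65B parameters at 2 bytes each for FP16, ignoring activations and KV cache):

    params = 65e9            # LLaMA 65B parameter count
    bytes_per_param = 2      # FP16
    model_gb = params * bytes_per_param / 1e9
    print(model_gb)          # ~130 GB, so the weights fit in the 144 GB of GPU RAM
    print(model_gb / 30)     # ~4.3 s at 30 GB/s, matching the "around 4 seconds" claim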



What is the cost of an equivalent setup using A100s?

I have no idea what I am doing but here goes!

By [1] an A100 gives 156 FP16 TFLOPS, taking the non-"*" value (* = with sparsity). So you need 5 of them, call it $40,000, plus the other stuff, and someone to make a profit putting it together: say $50,000? (Rough arithmetic sketched below.)

So this setup is roughly 3 times cheaper for the same compute.

If I am allowed to use the sparsity value it is 1.5 times cheaper.

[1] https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Cent...
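
A minimal sketch of that arithmetic; the 156 TFLOPS figure is the one quoted above, and the per-card price is only an assumption implied by "5 cards ≈ $40,000" (actual A100 prices vary widely):

    import math

    tinybox_tflops = 738      # claimed FP16 TFLOPS for the tinybox
    a100_tflops = 156         # per-A100 figure used above (dense, no sparsity)
    a100_price = 8_000        # assumed price per card, implied by the $40,000 estimate

    cards = math.ceil(tinybox_tflops / a100_tflops)   # -> 5 cards
    gpu_cost = cards * a100_price                     # ~$40,000 before chassis, CPU, assembly
    print(cards, gpu_cost)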


A100s have a much worse performance per dollar than 3090s/4090s (which are the direct competitors to the 7900 XTX)


Fwiw, I can't think of a single popular neural net architecture that takes advantage of sparsity.
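
For context, the "*" figures refer to NVIDIA's 2:4 structured sparsity: at most 2 nonzero weights in every group of 4, and you only get the speedup if the model's weights are actually pruned to that pattern. A rough NumPy illustration of what the constraint means (not NVIDIA's pruning tooling, just a sketch):

    import numpy as np

    def prune_2_of_4(w):
        """Zero the 2 smallest-magnitude weights in every group of 4 (the 2:4 pattern)."""
        w = w.reshape(-1, 4).copy()
        idx = np.argsort(np.abs(w), axis=1)[:, :2]   # 2 smallest-magnitude entries per group
        np.put_along_axis(w, idx, 0.0, axis=1)
        return w.reshape(-1)

    w = np.random.randn(8).astype(np.float32)
    print(prune_2_of_4(w))   # each consecutive group of 4 now has exactly 2 zeros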


I like George's style and wish him well. But I'm not optimistic about their chances of selling $15k servers that are $10k in parts (or whatever the exact numbers are).

It's just too easy for anyone to throw together a Supermicro machine with 6x GPUs in it, which is what it sounds like they'll be doing.

My guess is they'll end up creating some premium extensions to the software and selling that to make money. Or maybe they can sell an enterprise cluster manager type thing that comes with support. He's good at software so it makes sense for him to sell software.

And maybe the box will sell well initially just as a "dev kit" type thing.


> selling $15k servers that are $10k in parts

Have you seen what a DGX A100 costs? It starts at $199k for 8 40GB A100s, which have a list price of $10k each. So the GPU costs are $80k. What do you get for the extra $120k? 1 TB RAM, 2x 2TB NVMe OS drives, 4x 4TB NVMe general storage, and 8x 200Gbit InfiniBand. I would guess no more than $20k for all of the remaining hardware. So that's a ~$100k computer selling for $200k. And that's with NVDA likely making massive margins already on the A100 and the InfiniBand hardware.

The reality is that companies want to buy complete solutions, not to build and manage their own hardware. A $15k computer that's $10k in parts is not a large markup at all for something like this.
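
Putting that estimate in one place (all the dollar figures here are the guesses above, not quoted prices):

    dgx_price = 199_000
    gpu_cost = 8 * 10_000     # eight 40GB A100s at ~$10k list
    other_hw = 20_000         # guess for the CPUs, RAM, NVMe, and InfiniBand
    parts = gpu_cost + other_hw
    print(dgx_price - parts)  # ~$99k left for margin, integration, support, software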


I agree the DGX A100 is a "complete solution" because it's NVIDIA selling NVIDIA customized integrated/certified/tested/supported hardware and software.

NVIDIA's advantage is that they're a proprietary company and they're the ones actually making the chips they're putting in a box.

That's very far away from a random little open source startup slapping third-party GPUs in a generic box.


To me this looks like a way to mask donations.


> And maybe the box will sell well initially just as a "dev kit" type thing.

Price: $15,000.

If they had a "lite" model that sold for $1500, and were actually shipping....


The lite model is any gaming PC with a 7900 XTX.


> It's just too easy for anyone to throw together a Supermicro machine with 6x GPUs in it, which is what it sounds like they'll be doing.

HPC compute is well advanced past just slapping GPUs into generic Supermicro servers anyway. Without semi-custom hardware and equivalents to NVLink/NVSwitch, AMD won't ever be competitive in the HPC space.


Supermicro has HGX servers. $300k buys you an 8x H100 chassis with crazy amounts of memory, storage, and CPU and GPU compute.


We can't really comment much on it because a bunch of specs are lacking. Is it using 6x 7900 XTXs? Which Epyc CPU (Epycs vary in price from $1K to $11K)?


There's a reason no one uses ATI GPUs in datacenters. Their dev support is shit.

Don't waste your money.

Buy 6 RTX 4090's and a decent ECC-memory server, and call it a day.


Didn't read the article.


I thought you weren't allowed to use Nvidia's consumer GPUs in the datacenter?


You aren't, but who's going to stop you?


Their closed source driver?


And how exactly are they going to know you're running the card in a DC?



