btw, don't bother trying to buy a bunch of Mac boxes to run LLMs in parallel bec...

ionwake · 2025-05-03T19:31:10 1746300670

Is everyone just waiting for the DGX Spark? Are they really going to ban local inference?

neuroelectron · 2025-05-03T20:29:48 1746304188

What do you mean, ban? The bandwidth between Macs isn't enough to do inference effectively.

jsjohnst · 2025-05-04T15:00:21 1746370821

> The bandwidth between Macs isn't enough to do inference effectively.

While it’s certainly nowhere near the memory bandwidth, 80Gbps is on par with most high-end, but still affordable, machine-to-machine connections. Add to that the fact that you can have hundreds of gigabytes of shared RAM on each machine.
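To put the gap in numbers, here is a rough back-of-the-envelope sketch. The 80 Gbps link figure is the one quoted above; the 800 GB/s memory bandwidth is an assumed round number for illustration only, so substitute the spec of whatever machine you actually have.

```python
def gbps_to_gigabytes_per_s(gbps: float) -> float:
    """Convert gigabits per second to gigabytes per second (8 bits per byte)."""
    return gbps / 8.0


LINK_GBPS = 80.0                 # link speed quoted in the comment above
ASSUMED_MEMORY_GB_PER_S = 800.0  # assumption: illustrative on-machine memory bandwidth

# Peak link throughput in the same unit as the memory bandwidth figure.
link_gb_per_s = gbps_to_gigabytes_per_s(LINK_GBPS)

# How many times faster local memory is than the link, under the assumption above.
ratio = ASSUMED_MEMORY_GB_PER_S / link_gb_per_s

print(f"link: {link_gb_per_s} GB/s, memory is {ratio}x faster (assumed figures)")
```

This only compares peak figures. It says nothing about how much data actually has to cross the link per token, which depends on how the model is split across machines.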

zozbot234 · 2025-05-03T20:55:04 1746305704

I'm pretty sure you can network Macs together via the latest Thunderbolt standards and get pretty decent performance overall. Sure, it will be a bottleneck to some extent but it's still useful for many purposes.

neuroelectron · 2025-05-03T23:29:07 1746314947

Yes, you can do that and shard a very large model across the devices, but it's way too slow, so you get no performance gain beyond being able to run a much larger model at all.

ionwake · 2025-05-04T08:59:43 1746349183

That's a performance gain.

neuroelectron · 2025-05-04T16:44:55 1746377095

It's paying more for less performance.

ionwake · 2025-05-04T17:33:18 1746379998

Are you a teenager or something?

neuroelectron · 2025-05-04T17:47:45 1746380865

Just look at the people who tried it on YouTube.