
btw, don't bother trying to buy a bunch of Mac boxes to run LLMs in parallel because it won't be any faster than a single box.





Is everyone just waiting for the DGX Spark? Are they really going to ban local inference?

What do you mean, ban? The bandwidth between Macs isn't enough to do inference effectively.

> The bandwidth between Macs isn't enough to do inference effectively.

While it's certainly nowhere near the memory bandwidth, 80 Gbps is on par with most high-end, but still affordable, machine-to-machine connections. Then add on the fact that you can have hundreds of gigabytes of shared RAM on each machine.
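For scale, a rough back-of-envelope (the bandwidth figures here are my own assumptions, not measurements from the thread):

    # Rough comparison, assumed figures: a Thunderbolt 5 link vs.
    # Apple-silicon unified-memory bandwidth.
    link_gbps = 80                    # Thunderbolt 5, one direction (assumed)
    link_gb_per_s = link_gbps / 8     # ~10 GB/s on the wire
    mem_gb_per_s = 800                # M-series Ultra class, approximate
    print(f"link:   ~{link_gb_per_s:.0f} GB/s")
    print(f"memory: ~{mem_gb_per_s:.0f} GB/s")
    print(f"the link is roughly {mem_gb_per_s / link_gb_per_s:.0f}x slower than local RAM")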


I'm pretty sure you can network Macs together via the latest Thunderbolt standards and get pretty decent performance overall. Sure, it will be a bottleneck to some extent, but it's still useful for many purposes.

Yes, you can do that and shard a very large model across the devices, but it's way too slow, so you get no speedup; the only gain is being able to run a much larger model at all.
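A rough sketch of why (all numbers are assumptions for illustration): with the model pipeline-sharded, decoding a single stream still has to stream every weight once per token, stage after stage, so the boxes take turns rather than working in parallel.

    # Back-of-envelope, assumed numbers: pipeline-sharding a large model
    # across several Macs doesn't raise single-stream tokens/sec, because
    # each token still reads all the weights once, one stage at a time.
    weight_gb = 140          # e.g. a ~70B-parameter model in fp16 (assumed)
    mem_gb_per_s = 800       # per-box unified-memory bandwidth (assumed)
    n_boxes = 4

    per_token_s = weight_gb / mem_gb_per_s   # weight streaming dominates decode
    print(f"1 box  : ~{1 / per_token_s:.1f} tok/s")
    print(f"{n_boxes} boxes: still ~{1 / per_token_s:.1f} tok/s per stream, plus inter-box hops")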

That's a performance gain.

It's paying more for less performance.

Are you a teenager or something?

Just look at the people who tried it on YouTube.


