The ANE and tensor cores are not comparable though. One is literally meant for l...

llm_nerd · 2025-03-05T16:20:30 1741191630

>The ANE and tensor cores are not comparable though

They're both built to do the most common computation in AI (both training and inference), which is matrix multiply-accumulate: A * B + C. The ANE is far more limited because Apple decided to spend a lot less silicon area on it, focusing on low-power inference of quantized models. It is fantastically useful for many on-device tasks, such as the photo features (e.g. subject detection, text extraction, etc).
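For anyone unfamiliar with the operation, here is a minimal sketch of multiply-accumulate in plain Python. The function name `matmul_accumulate` and the sample matrices are made up for illustration; real hardware does this on fixed-size tiles in parallel, not with nested loops.

```python
def matmul_accumulate(a, b, c):
    """Return D = A * B + C for matrices given as lists of lists."""
    rows, inner, cols = len(a), len(b), len(b[0])
    assert all(len(row) == inner for row in a), "A's columns must match B's rows"
    d = [row[:] for row in c]  # start from a copy of the accumulator C
    for i in range(rows):
        for j in range(cols):
            acc = d[i][j]
            for k in range(inner):
                acc += a[i][k] * b[k][j]  # one multiply-accumulate step
            d[i][j] = acc
    return d


# Hypothetical 2x2 integer inputs, standing in for quantized values.
A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
C = [[1, 1], [1, 1]]
D = matmul_accumulate(A, B, C)
print(D)  # [[20, 23], [44, 51]]
```

Each output element takes `inner` multiply-adds, which is why throughput for this kind of hardware is usually quoted in operations per second.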

And yes, you need to use CoreML to access it because it's so limited. In the future Apple will absolutely, with 100% certainty, make an ANE that is as flexible and powerful as tensor cores, and they force you through CoreML because it will then automatically switch to using it. Right now you submit a job to CoreML and for many jobs it will opt to use the CPU/GPU instead, or a combination thereof. It's an elegant, forward-thinking implementation. Apple's AI performance and credibility will greatly improve when that hardware arrives.

>you really need to compare against the GPU

From a raw performance perspective, the ANE is capable of more matrix multiply-accumulates than the GPU is on Apple Silicon. It's just limited to data types and contexts that make it unsuitable for training, or even for many inference tasks.

NorwegianDude · 2025-03-05T16:47:01 1741193221

So now the TOPS are not comparable because M3 is much slower than an Nvidia GPU? That's not how comparisons work.

My numbers are correct: the M3 Ultra has around 1% of the TOPS performance of an RTX 5090.

Comparing against the GPU would look even worse for Apple. Do you think Apple added the neural engine just for fun? This is exactly what the neural engine is there for.

dagmx · 2025-03-05T16:58:05 1741193885

You’re completely missing the point. The ANE is not equivalent as a component to the tensor cores. This isn’t about comparing TOPS, it’s about what each is intended for.

Try and use the ANE in the same way you would use the tensor cores. Hint: you can’t, because the hardware and software will actively block you.

They’re meant for fundamentally different use cases and power loads. Even Apple’s own ML frameworks do not use the ANE for anything except inference.