
[flagged]


[flagged]


What's insane? The claim that Recall is spyware isn't much of an exaggeration, and the claim that the NPU is hype is a reasonable opinion to have.

If that 7% of the die goes unused it's not a huge waste, but it's not very good either. If you're not using it enough to affect your battery then you might not get much benefit, because the GPU can do the same tasks about half as fast; it's just less efficient (and it would be a lot more than half as fast if the NPU die space were converted into GPU compute blocks). And for serious tasks it's not very powerful. There's a range where it's good, but not as big a range as people might expect.


If my speculation about block FP16 is correct, it might be possible to hit 24 to 48 teraflops on the NPU. This means it would be entirely memory bottlenecked even in prompt processing. There would basically be no application where you would run into the NPU's compute being the limitation. What I fear, though, is that the NPU will be gimped with a single Infinity Fabric port, which would limit it to a relatively weak 62GB/s out of 120GB/s. For crazy people who want to use 128k token context, that might turn out to be what makes or breaks it in the end.
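To put rough numbers on the "memory bottlenecked" point, here is a back-of-the-envelope sketch using only the figures quoted above (24 to 48 teraflops, 62 or 120 GB/s). The helper names and the 8 GB model size are hypothetical, picked purely for illustration, and the decode estimate assumes every generated token streams all weights from memory once.

```python
# Rough roofline arithmetic with the figures from the comment above.
# ridge_point and max_decode_tokens_per_s are illustrative helpers,
# and model_bytes below is a made-up example size, not from the thread.

def ridge_point(flops_per_s: float, bytes_per_s: float) -> float:
    """FLOPs needed per byte of memory traffic before compute becomes the limit."""
    return flops_per_s / bytes_per_s

def max_decode_tokens_per_s(bytes_per_s: float, model_bytes: float) -> float:
    """Upper bound on tokens/s if each token reads all weights from memory once."""
    return bytes_per_s / model_bytes

TFLOPS = 1e12
GB = 1e9

for tflops in (24, 48):
    for bw in (62, 120):
        r = ridge_point(tflops * TFLOPS, bw * GB)
        print(f"{tflops} TFLOPS at {bw} GB/s: ridge point ~{r:.0f} FLOP/byte")

model_bytes = 8 * GB  # hypothetical: 8 GB of weights
for bw in (62, 120):
    rate = max_decode_tokens_per_s(bw * GB, model_bytes)
    print(f"{bw} GB/s: at most {rate:.2f} tokens/s for this model size")
```

Under these assumptions a workload would need hundreds of FLOPs per byte fetched before the NPU's compute mattered, and halving the bandwidth halves the best-case decode rate.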


Whoever makes something cheap with lots of memory bandwidth and parallel compute gets my vote :)

They somehow managed to put 16GB of HBM2 with a 4096-bit bus on the Radeon VII for $700, so it's not like it's impossible. Or at least give us enough memory channels to rival Apple's M series in some meaningful way.
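For a sense of scale, peak memory bandwidth is roughly bus width times per-pin data rate. A minimal sketch, assuming about 2.0 Gbit/s per pin for the Radeon VII's HBM2 (that rate is my assumption, it isn't stated in the comment, and the function name is just illustrative):

```python
def peak_bandwidth_gb_s(bus_width_bits: int, gbit_per_pin: float) -> float:
    """Theoretical peak bandwidth in GB/s: bits per transfer * Gbit/s per pin / 8."""
    return bus_width_bits * gbit_per_pin / 8

# 4096-bit bus from the comment; 2.0 Gbit/s per pin is an assumed figure.
print(peak_bandwidth_gb_s(4096, 2.0))  # 1024.0
```

That works out to around 1 TB/s, several times the 120GB/s figure mentioned upthread.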

Besides, what do you expect the average person will be able to run on these Coral-tier NPUs? They won't be running YOLOv9 all day, and all these things can do in practice is accelerate really tiny convnets.



