Hacker News

It's mostly about avoiding the latency penalty. There's only so much ILP you can extract from the instruction stream before you get stuck waiting on a dependency. If you start executing that dependency speculatively, it completes earlier, so you can launch the next instructions sooner.

That lets you speed up single-threaded execution further by adding more functional units (since you can't really crank the clock speed any higher).



This is exactly it. Speculative execution comes from instruction set design and basic constraints of branching. The energy consumption could make speculative execution prohibitive, but it's not "the" reason we do it.


Noob question here: is this the reason specialized chips can work so much better for AI applications? That the computations needed in a neural network are entirely deterministic and there is no need for branch prediction?


Not really, it's more the massive parallelism. Branch prediction contributes something, but mostly it's the parallelism. Each instruction you issue on a GPU typically operates on a massive array in one go. On a CPU you need AVX-type instructions, and those are far more limited in how much data they can process at once.


Yes, the GPUs provide massive parallelism. An NVIDIA RTX 4090 has 16384 "cuda cores". Whatever these cuda cores are, they must be much, much smaller than a CPU core. They do computations though, and CPU cores do computations too. Why do the CPU cores need to be so much larger, so a CPU with more than 64 cores is rarely heard of, while GPUs have thousands of cores?


Read about vector instructions a little bit and you'll see what I mean in the previous comment. A CPU has many many niche instructions it supports, it's way more flexible. A GPU is just trying to multiply the largest arrays possible as fast as possible, so the architecture becomes different. I don't think there's a quick way for you to grasp this without reading more about computer architecture and instruction sets, but you seem to be interested in it, so dive in :)



