Let's say your workload consists solely of traversing a single linked list. This list fits entirely in L1.

Since an L1 load takes 4 cycles and you can't start the next load until the previous one has completed (the next node's address comes from the load you are waiting on), the CPU will sit stalled, doing nothing, for 3/4 of the cycles. A 4-way SMT could in principle make use of all the wasted cycles.
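
To make the arithmetic concrete, here is a toy model of that scenario. It is only a sketch under strong assumptions: one issue slot per cycle, every hardware thread doing nothing but a dependent chain of L1 loads, and a fixed load latency. The function name and parameters are made up for illustration and do not model any real core.

```python
def issue_utilization(load_latency: int, hw_threads: int, cycles: int = 10_000) -> float:
    """Fraction of cycles in which the single issue slot does useful work.

    Assumption: each hardware thread chases pointers, so it can only issue
    its next load once its previous load has completed.
    """
    # Cycle at which each thread's outstanding load completes.
    ready_at = [0] * hw_threads
    issued = 0
    for cycle in range(cycles):
        # One issue slot per cycle: give it to the first thread that is ready.
        for t in range(hw_threads):
            if ready_at[t] <= cycle:
                ready_at[t] = cycle + load_latency
                issued += 1
                break
    return issued / cycles


if __name__ == "__main__":
    for n in (1, 2, 4):
        print(n, "thread(s):", issue_utilization(load_latency=4, hw_threads=n))
```

In this model a single thread keeps the issue slot busy a quarter of the time, and four threads are enough to fill it, which is the "in principle" best case; a real core has more issue ports, out-of-order execution, and other work to overlap.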

Of course no workload comes even close to purely traversing a linked list, but a lot of non-HPC real-world workloads do spend a lot of time in latency-limited sections that can benefit from SMT, so it is not just cache misses.




> so it is not just cache misses.

Agreed 100%. SMT is way more complex than just cache. I was just trying to illustrate, with simple scenarios, where increasing cache would and would not be beneficial to SMT.



