
Yeah, I haven't checked more recent Intel/AMD processors within the last few years, but it used to be that on Intel CPUs only port 5 could execute shuffles, so code with fairly heavy shuffle usage could easily bottleneck on that one port.


It's better now that Ice Lake+ can do some shuffles and unpack operations on two ports, but bottlenecking on the shuffle ports can still be a problem.
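To make that concrete, here is a minimal sketch of one common workaround: swapping a shuffle for a shift when the consumer only reads part of each lane. It assumes x86-64 with SSE2; the function names (`mul_odd_shuffle`, `mul_odd_shift`, `odd_products`) are made up for illustration. Which ports each instruction actually issues to depends on the microarchitecture, so treat the port comments as assumptions to check against instruction tables for your target.

```c
#include <emmintrin.h>  /* SSE2 intrinsics, baseline on x86-64 */
#include <stdint.h>

/* Multiply the odd 32-bit elements of a and b (a[1]*b[1], a[3]*b[3]),
 * giving two 64-bit products. pmuludq only reads the low 32 bits of
 * each 64-bit lane, so the odd elements must be moved down first. */

/* Shuffle version: two pshufd per call, which compete for the shuffle
 * port(s) (assumed to be port 5 only on older Intel cores). */
static __m128i mul_odd_shuffle(__m128i a, __m128i b) {
    __m128i a_odd = _mm_shuffle_epi32(a, _MM_SHUFFLE(3, 3, 1, 1));
    __m128i b_odd = _mm_shuffle_epi32(b, _MM_SHUFFLE(3, 3, 1, 1));
    return _mm_mul_epu32(a_odd, b_odd);
}

/* Shift version: psrlq by 32 also brings the odd elements into the low
 * half of each 64-bit lane. Vector shifts typically issue to a
 * different port than shuffles (assumption: verify for your CPU). */
static __m128i mul_odd_shift(__m128i a, __m128i b) {
    return _mm_mul_epu32(_mm_srli_epi64(a, 32), _mm_srli_epi64(b, 32));
}

/* Hypothetical helper: run either version on plain arrays. */
static void odd_products(const uint32_t a[4], const uint32_t b[4],
                         uint64_t out[2], int use_shift) {
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    __m128i r = use_shift ? mul_odd_shift(va, vb)
                          : mul_odd_shuffle(va, vb);
    _mm_storeu_si128((__m128i *)out, r);
}
```

Both versions produce the same products; the shift version just leaves zeros instead of duplicates in the upper halves, which the multiply ignores anyway. Whether it is actually faster depends on what else in the loop is using those ports, so it's worth measuring rather than assuming.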



