Hacker News
pixelesque
on Feb 6, 2024
on:
Avoiding register spills in vectorized code with m...
Yeah, I haven't checked within the last few years on more recent Intel/AMD processors, but it used to be that on Intel CPUs only port 5 could execute shuffles, so code with fairly heavy shuffle usage could bottleneck on that port.
ack_complete
on Feb 6, 2024
It's better now that Ice Lake+ can do some shuffles and unpack operations on two ports, but bottlenecking on the shuffle ports can still be a problem.
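To put rough numbers on that, here is a back-of-the-envelope port-pressure model. It is only a sketch: the function name, the uop counts, and the assumption of three vector ports are all made up for illustration, and it ignores latency, the front end, and memory entirely. It also assumes every shuffle in the loop is one of the kinds that can use the second port, which per the above is only true for some shuffles.

```python
from fractions import Fraction


def min_cycles_per_iter(shuffle_uops, other_vec_uops, shuffle_ports, vec_ports):
    """Lower bound on cycles per loop iteration from port pressure alone.

    Hypothetical model, not a simulator: every uop occupies one port for
    one cycle, shuffles may only issue to `shuffle_ports` of the
    `vec_ports` vector ports, and all other vector uops may issue to any
    vector port.
    """
    if not 1 <= shuffle_ports <= vec_ports:
        raise ValueError("need 1 <= shuffle_ports <= vec_ports")
    # Shuffles alone must fit on the ports that can execute them.
    shuffle_bound = Fraction(shuffle_uops, shuffle_ports)
    # All vector uops together must fit on all vector ports.
    total_bound = Fraction(shuffle_uops + other_vec_uops, vec_ports)
    return max(shuffle_bound, total_bound)


# Illustrative loop body: 6 shuffles plus 3 other vector uops (assumed numbers).
one_port = min_cycles_per_iter(6, 3, shuffle_ports=1, vec_ports=3)   # 6
two_ports = min_cycles_per_iter(6, 3, shuffle_ports=2, vec_ports=3)  # 3
print(one_port, two_ports)
```

With a single shuffle port the six shuffles set the floor at 6 cycles per iteration even though the other ports sit mostly idle. With two shuffle ports the floor drops to 3, at which point the shuffle ports and the overall vector port budget are equally saturated, so a shuffle-heavier loop would still bottleneck on shuffles.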