
I wonder how much of that 59% gain comes from the 512-bit registers and instructions themselves, and how much comes from the new instructions and modes that come with AVX-512, which can still be used with the narrower 256-bit and 128-bit registers.

It would be interesting to modify some of the benchmarks so they are limited to 256-bit AVX-512 and see how they compare.




Mystical's report indicates much of it does come from the wider instructions, because they can saturate the core more easily. Zen 3 was front-end bottlenecked, so Zen 4 running AVX-512 can more often hit 4x256. The new instructions are useful and some help perf, but mostly only for pretty specialized stuff. Masking is nice, but I think people really exaggerate the improvement from it; vblend was only 2 cycles.
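
To make the masking vs. vblend point concrete, here is a rough scalar model in Python of what the two approaches compute. It only models lane semantics, not performance. The function names are made up for illustration; the intended mapping (my assumption) is that `masked_add` corresponds to a merge-masked AVX-512 add such as `_mm256_mask_add_epi32`, and `add_then_blend` corresponds to a plain add followed by a separate blend, as you would write it with AVX2.

```python
LANE_BITS = 32
LANE_MASK = (1 << LANE_BITS) - 1  # wrap results like 32-bit integer lanes


def masked_add(src, mask, a, b):
    """Model of a merge-masked add (hypothetical helper).

    Lane i receives a[i] + b[i] when bit i of `mask` is set,
    otherwise it keeps src[i]. One operation does both jobs.
    """
    return [
        (x + y) & LANE_MASK if (mask >> i) & 1 else s
        for i, (s, x, y) in enumerate(zip(src, a, b))
    ]


def add_then_blend(src, mask, a, b):
    """Model of the pre-AVX-512 pattern (hypothetical helper).

    Step 1: add every lane unconditionally.
    Step 2: blend, picking the sum where the mask bit is set
            and the original src lane elsewhere.
    """
    full = [(x + y) & LANE_MASK for x, y in zip(a, b)]
    return [f if (mask >> i) & 1 else s for i, (s, f) in enumerate(zip(src, full))]


def lanes_agree(src, a, b):
    """Check that both patterns give the same lanes for every possible mask."""
    n = len(src)
    return all(
        masked_add(src, m, a, b) == add_then_blend(src, m, a, b)
        for m in range(1 << n)
    )
```

Both functions produce identical lanes for any mask, which is the sense in which masking mostly saves the extra blend instruction rather than enabling something new for simple arithmetic like this.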



