"On average for the tested AVX-512 workloads, making use of the AVX-512 instructions led to around 59% higher performance compared to when artificially limiting the Ryzen 9 7950X to AVX2 / no-AVX512.
From these results I am rather impressed by the AVX-512 performance out of the AMD Ryzen 9 7950X. While initially being disappointed when hearing of their "double pumping" approach rather than going for a 512-bit data path, these benchmark results speak for themselves. For software that can effectively make use of AVX-512 (and compiled so), there is significant performance uplift to enjoy while no negative impact in terms of reduced CPU clock speeds / higher power consumption (with oneDNN being one of the only exceptions seen so far in terms of higher power draw).
AVX-512 is looking good on the Ryzen 7000 series and I'll continue running more benchmarks over the weeks ahead. These AVX-512 results make me all the more excited for AMD EPYC "Genoa" where AVX-512 can be a lot more widely-used among HPC/server workloads."
I wonder how much of that 59% gain comes from the 512-bit registers/instructions themselves, and how much comes from the new instructions and modes that come with AVX-512 and can still be used with the narrower 256-bit and 128-bit registers.
It would be interesting to modify some of the benchmarks to be limited to 256-bit AVX-512 and see how they compare.
Mystical's report indicates much of it does come from the wider instructions, because they can saturate the core more easily. Zen 3 was front-end bottlenecked, so Zen 4 running AVX-512 can more often hit 4x256.
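To put rough numbers on the front-end point, here is a back-of-the-envelope sketch. It only counts how many vector instructions have to be issued to touch an array once at each width; it says nothing about how the core actually executes them. The function name and the array length are made up for illustration.

```python
import math

def vector_instructions(n_elements: int, element_bits: int, vector_bits: int) -> int:
    """Count the vector instructions needed to touch n_elements once."""
    lanes = vector_bits // element_bits  # elements handled per instruction
    return math.ceil(n_elements / lanes)

n = 1024  # hypothetical array of 32-bit floats

ops_256 = vector_instructions(n, 32, 256)  # 8 lanes per instruction
ops_512 = vector_instructions(n, 32, 512)  # 16 lanes per instruction

print(ops_256, ops_512)  # 128 64
```

Same work, half as many instructions for the front end to fetch and decode, which is the kind of saving that matters if the front end is the bottleneck.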
The new instructions are useful and some help performance, but mostly only for pretty specialized stuff. Masking is nice, but I think people really exaggerate the improvement from it; vblend was only 2 cycles.
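For anyone not familiar with what masking replaces, here is a plain-Python lane model (not real intrinsics, and not a performance test). It assumes merge-masking semantics, as in intrinsics like `_mm256_mask_add_epi32`: lanes whose mask bit is clear keep the value from `src`. The helper names are made up for illustration.

```python
def masked_add(src, mask, a, b):
    """Model of one merge-masked add: lane i = a[i] + b[i] if mask[i], else src[i]."""
    return [x + y if m else s for s, m, x, y in zip(src, mask, a, b)]

def add_then_blend(src, mask, a, b):
    """Model of the pre-AVX-512 pattern: an unmasked add followed by a separate blend."""
    summed = [x + y for x, y in zip(a, b)]                         # plain vector add
    return [t if m else s for s, m, t in zip(src, mask, summed)]   # vblend-style select

a = [1, 2, 3, 4, 5, 6, 7, 8]
b = [10] * 8
src = [0] * 8
mask = [True, False] * 4  # update every other lane

result_masked = masked_add(src, mask, a, b)
result_blend = add_then_blend(src, mask, a, b)
print(result_masked)  # [11, 0, 13, 0, 15, 0, 17, 0]
```

Both produce the same lanes. Masking folds the select into the arithmetic instruction, so what it saves is the extra blend, which is the cost being argued about above.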
https://www.phoronix.com/review/amd-zen4-avx512