GCC will not use AVX-512 if you have CFLAGS="-march=tigerlake -O2". You will, at the very least, need CFLAGS="-march=tigerlake -O3" before it even uses AVX2, and Tiger Lake's AVX-512 implementation is poor enough (clock throttling etc.) that GCC still won't emit AVX-512 for it. AVX-512 is used if you have -march=znver4 though, so the support for autovectorizing to AVX-512 is clearly there.
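A quick way to check this on your own toolchain (a sketch; the file and function names are just illustrative, and the exact result depends on your GCC version and its per-CPU tuning defaults, which -mprefer-vector-width=512 can override):

    /* saxpy.c: trivial loop GCC can autovectorize. Build with different
     * flags and compare which vector registers show up in the assembly,
     * ymm (AVX2) vs zmm (AVX-512):
     *
     *   gcc -march=tigerlake -O2 -S -o - saxpy.c | grep -E 'ymm|zmm'
     *   gcc -march=tigerlake -O3 -S -o - saxpy.c | grep -E 'ymm|zmm'
     *   gcc -march=znver4    -O3 -S -o - saxpy.c | grep -E 'ymm|zmm'
     */
    #include <stddef.h>

    void saxpy(float *restrict y, const float *restrict x, float a, size_t n)
    {
        for (size_t i = 0; i < n; i++)
            y[i] = a * x[i] + y[i];
    }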
Is it actually that bad on Tiger Lake? Or only for the widest (512-bit) vectors? On my old Ice Lake laptop, single-core AVX-512 workloads do not decrease frequency at all even with the wider registers, and multi-core workloads only drop the clock by a small amount, maybe 100 MHz or so.
Depends on a couple of factors (e.g. Ice Lake client only has 1 FMA unit), but I'd be surprised if Tiger Lake was a major regression relative to Ice Lake. It seems like they had it in an OK spot by then.
In my experience it depends on the compiler. clang seems far more willing to autovectorise than gcc. You also have to write the code in a way that strongly hints to the compiler that it can be autovectorised. So lots of handholding, along the lines of the sketch below.
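For example (a sketch; the function name and file name are made up, and the flags are just one way to surface the vectoriser reports): restrict-qualified pointers so the compiler can rule out aliasing, an alignment hint, and a plain countable loop with no early exits.

    /* dot.c: written to make autovectorisation easy. Compare what each
     * compiler reports:
     *   gcc   -O3 -march=native -ffast-math -fopt-info-vec        -c dot.c
     *   clang -O3 -march=native -ffast-math -Rpass=loop-vectorize -c dot.c
     */
    #include <stddef.h>

    float dot(const float *restrict a, const float *restrict b, size_t n)
    {
        const float *pa = __builtin_assume_aligned(a, 64);
        const float *pb = __builtin_assume_aligned(b, 64);
        float sum = 0.0f;
        /* FP reductions generally need -ffast-math (or at least
         * -fassociative-math) before either compiler will reorder the
         * adds and vectorise this loop. */
        for (size_t i = 0; i < n; i++)
            sum += pa[i] * pb[i];
        return sum;
    }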
My question is just... does it? (And does it use AVX-512 profitably?)