Indeed a great article, well worth reading in full for anyone who uses AVX-512. ...

celrod · on Sept 26, 2022

The Intel optimization manual has a fun example where they use vpconflict for vectorizing sparse dot products: https://github.com/intel/optimization-manual/blob/main/chap1...

I benchmarked it on Intel, and it was indeed quite fast, a good improvement over the scalar version. It will be interesting to try it on AMD.
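
For anyone who hasn't used the instruction: here is a rough scalar emulation in Python of what vpconflictd computes (the `_mm512_conflict_epi32` intrinsic in AVX-512CD), plus one way the result can drive a gather/add/scatter loop when a vector of indices contains duplicates. This is only a sketch to show the idea, not the code from the Intel manual; `conflict_mask` and `scatter_add` are names I made up, and the pass-by-pass retry loop is just one possible strategy.

```python
def conflict_mask(indices):
    """Emulate VPCONFLICTD: bit j of result[i] is set when j < i
    and indices[j] == indices[i] (an earlier lane holds the same value)."""
    return [
        sum(1 << j for j in range(i) if indices[j] == indices[i])
        for i in range(len(indices))
    ]


def scatter_add(acc, indices, values):
    """Do acc[indices[i]] += values[i] for every lane, in passes.

    Each pass handles only lanes with no still-pending earlier duplicate,
    so the lanes in one pass all target distinct slots and could be done
    with a single gather, add, scatter. Returns the number of passes.
    (Hypothetical helper for illustration.)
    """
    conflicts = conflict_mask(indices)
    pending = (1 << len(indices)) - 1  # one bit per lane still to process
    passes = 0
    while pending:
        ready = [
            i for i in range(len(indices))
            if (pending >> i) & 1 and (conflicts[i] & pending) == 0
        ]
        for i in ready:  # these lanes never collide with each other
            acc[indices[i]] += values[i]
        for i in ready:
            pending &= ~(1 << i)
        passes += 1
    return passes


acc = [0.0] * 4
idx = [1, 3, 1, 1, 0, 3, 2, 1]
val = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
print(conflict_mask(idx))  # [0, 0, 1, 5, 0, 2, 0, 13]
scatter_add(acc, idx, val)
print(acc)                 # [5.0, 16.0, 7.0, 8.0]
```

With no duplicate indices everything finishes in one pass, which is the case where the vectorized version should shine; the number of passes grows with the most-repeated index in the vector.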

janwas · on Sept 27, 2022

Nice! Thanks for linking it :)