
AVX-512 has a lot of instructions that just extend existing vector operations to 512 bits and make them nicer to use with features like masking. So a perfectly valid use of it is simply to double the vectorization width.

But it also has a bunch of specialized instructions that can boost performance beyond just the 2x from the wider vectors. One of them is VPCOMPRESSB (AVX512_VBMI2), which accelerates compact encoding of sparse data. Another is GF2P8AFFINEQB (GFNI), which is targeted at specific encryption algorithms but can also be abused for general bit shuffling. Algorithms like computing a histogram can benefit significantly, but that requires reshaping the algorithm around very particular and peculiar intermediate data layouts, which is beyond the transformations a compiler can do. This doesn't literally require assembly language, though; it can often be done with intrinsics.
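To make the two instructions a bit more concrete, here is a scalar Python model of what each one computes. This is only a sketch of the semantics, not an intrinsics implementation and not fast code. The function names `compress_bytes` and `gf2p8affine_byte` are made up for illustration; in C the corresponding intrinsics would be along the lines of `_mm512_maskz_compress_epi8` and `_mm512_gf2p8affine_epi64_epi8`. The model assumes the zero-masking form of VPCOMPRESSB and handles a single byte (not a whole vector) for GF2P8AFFINEQB.

```python
def compress_bytes(src: bytes, mask: int) -> bytes:
    """Model of zero-masked VPCOMPRESSB.

    Keeps src[i] wherever bit i of mask is set, packs the kept bytes
    contiguously at the low end, and zero-fills the remaining lanes.
    """
    kept = bytes(b for i, b in enumerate(src) if (mask >> i) & 1)
    return kept + bytes(len(src) - len(kept))


def gf2p8affine_byte(matrix: int, x: int, imm8: int = 0) -> int:
    """Model of GF2P8AFFINEQB applied to one byte.

    matrix is a 64-bit integer holding an 8x8 bit matrix, one row per byte.
    Result bit i is parity(matrix byte (7 - i) AND x) XOR bit i of imm8.
    """
    out = 0
    for i in range(8):
        row = (matrix >> (8 * (7 - i))) & 0xFF
        parity = bin(row & x).count("1") & 1
        out |= (parity ^ ((imm8 >> i) & 1)) << i
    return out


# Matrix constants for the bit order used above.
IDENTITY = 0x0102040810204080     # output byte equals input byte
BIT_REVERSE = 0x8040201008040201  # output bit i equals input bit 7 - i

# Sparse data: only lanes 0, 3 and 5 carry values, so select them with the mask.
packed = compress_bytes(bytes([10, 0, 0, 20, 0, 30, 0, 0]), 0b00101001)
reversed_bits = gf2p8affine_byte(BIT_REVERSE, 0x1D)
```

Here `packed` comes out as the bytes 10, 20, 30 followed by five zeros, and `reversed_bits` is 0xB8 (0x1D with its bits mirrored). The "abuse" mentioned above amounts to picking other matrices: any fixed permutation or XOR combination of the bits within a byte is just a different 64-bit constant.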



