Anything that does number crunching can benefit from SIMD (just look at pretty much any modern compiler output on godbolt, icache be damned).
The tradeoff with using it is basically between instruction density, power (AVX-512 can pull clocks down on some CPUs), and latency (GPUs are seriously powerful, but getting the data onto them takes time and a lot of driver bullying).
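
If you want something concrete to paste into godbolt, here's a minimal sketch (my own example, not from any particular codebase; `saxpy` is just an illustrative name). Assuming GCC or Clang at something like `-O3` on x86-64, I'd expect the loop to come out vectorized, typically with a scalar tail loop for the leftover elements, which is where a lot of the extra code size comes from. Exact output depends on compiler version and flags, so treat it as a starting point rather than a guarantee.

```c
#include <stddef.h>

/* y[i] = a * x[i] + y[i] for i in [0, n).
 * `restrict` promises the compiler that x and y do not overlap,
 * so it does not have to emit a runtime aliasing check. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    for (size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}
```

Try adding `-march=native` (or an explicit AVX2 / AVX-512 target) and compare how the vector width and the amount of emitted code change.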