Indeed!
This reminds me of a fun issue I ran into years ago with SIMD code (gcc, Linux). I was experimenting with various vector sizes and found significant slowdowns for some of them. I was about to call it quits, as in 'well, I'll have to do things differently', when I realized the slowdowns didn't make any sense.
I double-checked the actual values computed by the benchmark, which turned out to be completely wrong. What I had actually found was a compiler bug!