> That puts Intel at 1.16x and 1.07x for this specific test
That's an absolutely amazing result, and it shows how wrong the current information in the article is. I hope the author sees what you did and updates his page as soon as possible.
So the updated values on the original page are now:
> Intel/M1 ratio 1.2 0.9
> As you can see, the older Intel processor is slightly superior to the Apple M1 in the minify test.
I'd consider it bigger news that in one of the two tests chosen by the author (utf8) the M1 is 10% faster than Intel, and in the other (minify) it is only 20% slower, a difference that most users won't even be able to notice for most purposes. It's quite a remarkable result. I'd surely write:
"As you can also see, in the UTF-8 validate test the M1 is superior to the older Intel processor, and in the minify test it is only 20% slower, even though Intel uses more power to calculate the result!"
-----
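(For anyone who wants to check the percentages above: a quick sketch, assuming the "Intel/M1 ratio" row is a throughput ratio where higher means faster. That is my reading of the table, not something the page states in the quoted row, and the function name is just mine for illustration.)

```python
def m1_vs_intel_percent(intel_over_m1: float) -> float:
    """M1 speed relative to Intel, in percent (negative means M1 is slower).

    Assumes the input is Intel throughput divided by M1 throughput.
    """
    return (1.0 / intel_over_m1 - 1.0) * 100.0


# Updated ratios quoted from the page: minify 1.2, UTF-8 validate 0.9.
for test, ratio in {"minify": 1.2, "utf8 validate": 0.9}.items():
    print(f"{test}: M1 is {m1_vs_intel_percent(ratio):+.1f}% relative to Intel")
```

Measured against Intel as the baseline this gives roughly -17% and +11%; measured against the M1 as the baseline the same ratios read as Intel being 20% faster and 10% slower. Either way the round figures of "about 20%" and "about 10%" hold up.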
(Additionally, I'll use the opportunity to thank again u/bacon_blood, who verified the initial claims, and u/messe, who figured out what the remaining bug in the author's sources was! Great work!)
(Edit: the ratio 1.16 is from an older native measurement. So I also made an error in the previous version of this comment! I wrongly connected that number with the code produced by Rosetta 2. I've deleted that part of this message. Still, the difference between 1.07 and 0.9 measured on two different setups is interesting, given that the other test is close enough.)
The post with egregious errors was also put up on a Sunday afternoon. And while we're all acting conciliatory now, it's pretty remarkable how biased the post was, with the author using some clearly erroneous numbers to prove their prior, baseless claim that the "M1 chip is far inferior" in some respects, when those respects were specifically SIMD. The author then became strangely defensive when some people rightly pointed out that ARM64 has 128-bit NEON and a number of other advantages.
"Far inferior" becomes... actually superior in many cases, even at SIMD.
Let’s try to be charitable, shall we? Everyone makes mistakes sometimes, even leading experts in low-level algorithm optimization. Lemire was upfront about making a mistake, and not at all defensive about it; if you are reading it that way, it’s just you.
It is clearly the case that the M1 CPU/SoC has a significant performance advantage in typical branchy single-core code, but much less of an advantage, if any, for certain kinds of heavily optimized numerics. Beyond that high-level summary, it’s good to dive into the details and spark discussions.
Everyone is just now getting their hands on these chips, learning how to work with them, and trying to figure out how to best optimize for them.