> That puts Intel at 1.16x and 1.07x for this specific test That's absolutely am...

dlemire · on Dec 13, 2020

Yes. I am working on updating it.

kergonath · on Dec 14, 2020

Thank you. The whole discussion (starting from your original post) is very useful and interesting.

acqq · on Dec 13, 2020

So the updated values on the original page are now:

> Intel/M1 ratio 1.2 0.9

> As you can see, the older Intel processor is slightly superior to the Apple M1 in the minify test.

I'd consider it bigger news that the M1 is 10% faster than Intel in one of the two tests chosen by the author (utf8), and only 20% slower in the other (minify), a difference that most users won't even be able to notice for most purposes. It's quite a remarkable result. I'd surely write:

"As you can also see, in the UTF-8 validate test the M1 is superior to the older Intel processor, and in the minify test it is only 20% slower, even if Intel uses more power to calculate the result!"

-----
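
For anyone who wants to check the percentages, here is a minimal sketch of the arithmetic. It assumes the quoted "Intel/M1 ratio" is Intel speed divided by M1 speed (higher meaning Intel is faster); the function name and the dictionary labels are made up for illustration. Read from the M1's side, the ratios 1.2 and 0.9 work out to roughly 17% slower and 11% faster, which is in the same ballpark as the rounded "20%" and "10%" figures.

```python
def m1_change_vs_intel(intel_over_m1: float) -> float:
    """Percent change in M1 speed relative to Intel.

    Assumes the ratio is (Intel speed) / (M1 speed), so the M1's
    relative speed is the reciprocal of the ratio.
    """
    return (1.0 / intel_over_m1 - 1.0) * 100.0


# Ratios quoted from the updated page; the test labels are illustrative.
ratios = {"minify": 1.2, "utf8": 0.9}
changes = {name: m1_change_vs_intel(r) for name, r in ratios.items()}

for name, pct in changes.items():
    direction = "faster" if pct > 0 else "slower"
    print(f"{name}: M1 is {abs(pct):.1f}% {direction} than Intel")
```

If the ratio were instead a ratio of running times, the direction of each result would flip, so the interpretation above depends on that assumption.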

(Additionally, I'd like to use the opportunity to again thank u/bacon_blood, who verified the initial claims, and u/messe, who figured out what the remaining bug in the author's sources was! Great work!)

(Edit: the ratio 1.16 is from an older native measurement. So I also made an error in the previous version of this comment! I wrongly connected it with the Rosetta 2 produced code, and I've deleted that part of this message. Still, the difference between 1.07 and 0.9 measured on two different setups is interesting, when the other test is close enough.)

macintux · on Dec 13, 2020

Yeah, the tone of the post still reflects the original results, and the update should be at the top for anyone returning to it later.

Still, glad this was caught.

jacobolus · on Dec 13, 2020

Thanks for a quick fix on a Sunday afternoon!

I’m impressed that the M1 can keep up on this SIMD-optimized code, likely at much lower temperature / power use.

And even the Rosetta numbers are pretty decent.

bfgoodrich · on Dec 14, 2020

The post with egregious errors was also put up on a Sunday afternoon. And while we're all acting conciliatory now, it's pretty remarkable how biased the post was, the author using some clearly erroneous numbers to prove their prior, baseless claim that the "M1 chip is far inferior" in some respects, when those respects were specifically SIMD. Then becoming strangely defensive when some people rightly pointed out that ARM64 has 128-bit NEON and a number of other advantages.

Far inferior becomes... actually superior in many cases, even at SIMD.

jacobolus · on Dec 14, 2020

Let’s try to be charitable, shall we? Everyone makes mistakes sometimes, even leading experts in low-level algorithm optimization. Lemire was upfront about making a mistake, and not at all defensive about it; if you are reading it that way, it’s just you.

It is clearly the case that the M1 CPU/SoC has a significant performance advantage in typical branchy single-core code, but much less advantage, if any, for certain kinds of heavily optimized numerics. Beyond that high-level summary, it’s good to dive into the details and spark discussions.

Everyone is just now getting their hands on these chips, learning how to work with them, and trying to figure out how to best optimize for them.