There are already specialized instructions in the Apple Silicon chips. IIRC ther...

favorited · 2025-03-05T20:23:40 1741206220

Uncontended acquire-release atomic operations are basically free on Apple Silicon, which synergizes with the Objective-C (and Swift!) runtimes, where every retain/release is an atomic increment/decrement.

https://web.archive.org/web/20201119143547/https://twitter.c...
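
For anyone who hasn't seen what a retain/release pair boils down to, here is a minimal C++ sketch of the pattern. `RefCounted`, `retain`, and `release` are hypothetical names for illustration, not the actual Objective-C or Swift runtime code. The choice of memory orderings is an assumption, borrowed from common reference-counting implementations.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical ref-counted header; NOT the real Objective-C/Swift object layout.
struct RefCounted {
    std::atomic<std::uint32_t> refs{1};  // the creator holds the first reference
};

// retain: an atomic increment. Relaxed ordering is assumed to be enough,
// because the caller already owns a reference and the object cannot be
// destroyed while it does.
inline void retain(RefCounted& obj) {
    obj.refs.fetch_add(1, std::memory_order_relaxed);
}

// release: an atomic decrement with acquire-release ordering, so writes made
// by other owners are visible to the thread that drops the last reference.
// Returns true if the caller dropped the last reference and should destroy
// the object.
inline bool release(RefCounted& obj) {
    return obj.refs.fetch_sub(1, std::memory_order_acq_rel) == 1;
}
```

In the uncontended case each call is a single atomic read-modify-write on a cache line the core already owns, which is the path the claim above is about.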

throwaway2037 · 2025-03-06T06:10:25 1741241425

    > Uncontended acquire-release atomic operations are basically free on Apple Silicon

I don't doubt you specifically, but how is this possible? To be clear, my brain is x86-wired, not ARM-wired, so I may have some things wrong. Most of the expense of atomic inc/dec is "happens before", which essentially says that before the current core reads that memory address, it is guaranteed to be updated to the latest shared value. How can this be avoided? Or is it not avoided, but just much, much faster than on x86? If the shared value was updated on a different core, some not-significant CPU cycles are required to update the L1 cache on the current core with the latest shared value.
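
To make the "happens before" part concrete, here is a small C++ sketch of a release store paired with an acquire load. `publish_and_read` is a made-up name for illustration. It shows the ordering guarantee only and says nothing about how many cycles it costs on any particular chip.

```cpp
#include <atomic>
#include <thread>

// Hypothetical helper: one thread publishes a value, another reads it.
// Returns the value the consumer thread observed.
inline int publish_and_read() {
    int payload = 0;                  // plain, non-atomic data
    std::atomic<bool> ready{false};   // flag that carries the ordering
    int seen = -1;

    std::thread producer([&] {
        payload = 42;                                  // write the data first
        ready.store(true, std::memory_order_release);  // then publish it
    });

    std::thread consumer([&] {
        // Spin until the flag is visible. The acquire load synchronizes with
        // the release store, so the write to payload happens before this read.
        while (!ready.load(std::memory_order_acquire)) { }
        seen = payload;
    });

    producer.join();
    consumer.join();
    return seen;
}
```

The consumer can only leave the loop after observing the release store, so it is guaranteed to see `payload == 42`. Here the cache line does have to move between cores, which is the contended case rather than the uncontended one the parent comment describes.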

throwaway2037 · 2025-03-06T09:27:03 1741253223

EDIT:

    > some not-significant CPU cycles

should say:

    > some not-insignificant CPU cycles