> there's no memory barrier instruction to force its new value (5) to propagate ...

ww520 · on March 3, 2024

The Windows CreateThread API documentation [1] doesn't say anything about a memory synchronization guarantee. Does it provide one internally?

[1] https://learn.microsoft.com/en-us/windows/win32/api/processt...

cesarb · on March 3, 2024

That MSDN documentation is unfortunately silent on this, but the example in the documentation (at https://learn.microsoft.com/en-us/windows/win32/procthread/c...) only makes sense if the operating system guarantees the ordering.

The C++ standard (at least a draft of it I found on a quick web search) is more explicit: it says (https://eel.is/c++draft/thread.thread.constr) "The completion of the invocation of the constructor synchronizes with the beginning of the invocation of the copy of f." (see https://eel.is/c++draft/intro.races for more detail on that "synchronizes with"). Since the code in question is using std::thread, even if the operating system did not have the relevant guarantees, the C++ standard library would have the required memory barriers.
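To make that concrete, here is a minimal sketch of what the quoted sentence guarantees. This is my own illustration, not code from the article under discussion, and the function name `read_in_new_thread` is made up for the example. A plain, non-atomic write made before constructing the `std::thread` should be visible inside the new thread without any explicit barrier:

```cpp
#include <thread>

// Hypothetical helper for illustration; the name is not from the thread above.
int read_in_new_thread() {
    int value = 0;   // plain int, deliberately not std::atomic
    int seen = -1;

    value = 5;       // ordinary write, no explicit fence

    // The completion of the std::thread constructor synchronizes with the
    // start of the lambda, so the write above happens-before the read below.
    std::thread t([&] { seen = value; });

    // join() synchronizes in the other direction: the thread's completion
    // happens-before join() returns, so reading `seen` here is race-free.
    t.join();
    return seen;
}
```

Calling `read_in_new_thread()` from `main` should return 5 on a conforming implementation. Note that the guarantee only covers writes made before the constructor call; anything written after the thread has started still needs its own synchronization.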

tom_ · on March 4, 2024

Any time somebody tells you how simple C is by comparison, point them at https://port70.net/~nsz/c/c11/n1570.html#6.2.4p5:

> An object whose identifier is declared with no linkage and without the storage-class specifier static has automatic storage duration, as do some compound literals. The result of attempting to indirectly access an object with automatic storage duration from a thread other than the one with which the object is associated is implementation-defined.

(I bet in practice almost all implementations behave as the equivalent C++ implementation would, as per your notes, so only the ordering is relevant. But people maintaining C implementations have on occasion shown themselves to be their users' enemies, so don't quote me on this!)

ajross · on March 4, 2024

More generally, the OS is going to be doing some level of synchronization on its own, likely a spinlock, during the thread creation process. Those always include memory barriers, because otherwise the locks they define don't actually work on out-of-order (OOO) systems.
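As a rough sketch of why a lock implies barriers (an assumption-laden toy, not what any real kernel does; `SpinLock` and `count_with_spinlock` are hypothetical names), here is a minimal spinlock in C++. The acquire ordering on `lock()` and the release ordering on `unlock()` are what make plain writes inside the critical section visible to the next thread that takes the lock:

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical toy spinlock for illustration only.
class SpinLock {
    std::atomic_flag flag_ = ATOMIC_FLAG_INIT;
public:
    void lock() {
        // Acquire: accesses in the critical section cannot be reordered
        // before the lock is observed as taken.
        while (flag_.test_and_set(std::memory_order_acquire)) {
            // busy-wait until the current holder clears the flag
        }
    }
    void unlock() {
        // Release: writes made while holding the lock become visible to
        // the next thread whose test_and_set acquires the flag.
        flag_.clear(std::memory_order_release);
    }
};

// Increments a plain (non-atomic) counter from several threads,
// relying only on the spinlock for ordering and mutual exclusion.
long count_with_spinlock(int threads, int iterations) {
    SpinLock lock;
    long counter = 0;
    std::vector<std::thread> workers;
    for (int i = 0; i < threads; ++i) {
        workers.emplace_back([&] {
            for (int j = 0; j < iterations; ++j) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return counter;
}
```

If the acquire and release orderings were weakened to relaxed, the flag itself would still be atomic, but the increments of `counter` would no longer be ordered with respect to it, which is the "locks don't actually work" failure described above.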