> there's no memory barrier instruction to force its new value (5) to propagate to all CPU's running the threads.

The equivalent of the memory barrier instructions is there, but it's hidden within the operating system code that creates and initializes a new thread. That is, the operating system ensures that the value in the current CPU (in this case, 5) is propagated to the CPU running the newly started thread before the thread start routine (in this case, DoStuff) is called. The value is not modified while the child threads are running (the parent thread waits for the child threads to exit before clearing the value), so there's no chance of the child threads seeing the value being set back to zero.




The documentation for the Windows CreateThread API [1] doesn't say anything about a memory synchronization guarantee. Does it provide one internally?

[1] https://learn.microsoft.com/en-us/windows/win32/api/processt...


That MSDN documentation is unfortunately silent on this, but the example in the documentation (at https://learn.microsoft.com/en-us/windows/win32/procthread/c...) only makes sense if the operating system guarantees the ordering.

The C++ standard (at least a draft of it I found on a quick web search) is more explicit: it says (https://eel.is/c++draft/thread.thread.constr) "The completion of the invocation of the constructor synchronizes with the beginning of the invocation of the copy of f." (see https://eel.is/c++draft/intro.races for more detail on that "synchronizes with"). Since the code in question is using std::thread, even if the operating system did not have the relevant guarantees, the C++ standard library would have the required memory barriers.
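To make that guarantee concrete, here is a minimal sketch of the pattern being discussed, assuming a plain non-atomic global. The names `shared_value` and `observe_in_children` are made up for illustration and are not from the original code. The write happens before the `std::thread` constructors run, and the reset happens only after every `join()`, so under the rule quoted above each child should read the value without any explicit barrier or atomic:

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for the variable in the discussion:
// deliberately plain, neither atomic nor volatile.
int shared_value = 0;

// Writes `value`, starts `n_threads` children that each read it,
// joins them all, then clears it. Returns what each child observed.
std::vector<int> observe_in_children(int value, int n_threads) {
    assert(n_threads >= 0);

    // Sequenced before every std::thread constructor below.
    shared_value = value;

    std::vector<int> seen(static_cast<std::size_t>(n_threads), -1);
    std::vector<std::thread> workers;
    workers.reserve(seen.size());

    for (std::size_t i = 0; i < seen.size(); ++i) {
        // Completion of the constructor synchronizes with the start of
        // the lambda, so the write above is visible inside it.
        // Each child writes a distinct element, so there is no race on `seen`.
        workers.emplace_back([&seen, i] { seen[i] = shared_value; });
    }

    // join() synchronizes in the other direction: the children's reads
    // (and their writes to `seen`) happen before the reset below.
    for (auto& w : workers) {
        w.join();
    }

    shared_value = 0;
    return seen;
}
```

Calling `observe_in_children(5, 4)` should return four 5s. Moving the reset to before the joins would introduce a real data race, which is the case the parent's wait rules out.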


Any time somebody tells you how simple C is by comparison, point them at https://port70.net/~nsz/c/c11/n1570.html#6.2.4p5:

> An object whose identifier is declared with no linkage and without the storage-class specifier static has automatic storage duration, as do some compound literals. The result of attempting to indirectly access an object with automatic storage duration from a thread other than the one with which the object is associated is implementation-defined.

(I bet in practice almost all implementations behave as an equivalent C++ would, as per your notes, so only the ordering is relevant. But people maintaining C implementations have on occasion shown themselves to be their users' enemies, so don't quote me on this!)


More generally, the OS is going to be doing some level of synchronization on its own, likely a spinlock, during the thread creation process. Those always include memory barriers, because otherwise the locks they define don't actually work on out-of-order (OOO) systems.
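As an illustration of why a spinlock implies barriers, here is a user-space sketch in C++. This is not how any particular kernel implements its locks. The `SpinLock` class and `count_with_spinlock` helper are hypothetical names. The acquire ordering on lock and release ordering on unlock are what keep the protected accesses from being reordered outside the critical section:

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Minimal test-and-set spinlock (illustrative only; no backoff or fairness).
class SpinLock {
    std::atomic<bool> locked_{false};

public:
    void lock() {
        // Acquire: accesses after this point cannot move before the lock.
        while (locked_.exchange(true, std::memory_order_acquire)) {
            // Spin until the previous holder releases the lock.
        }
    }

    void unlock() {
        // Release: accesses before this point cannot move after the unlock.
        locked_.store(false, std::memory_order_release);
    }
};

// Increments a plain (non-atomic) counter from several threads,
// relying only on the spinlock for mutual exclusion and ordering.
long count_with_spinlock(int n_threads, int iters) {
    assert(n_threads >= 0 && iters >= 0);

    SpinLock lock;
    long counter = 0;
    std::vector<std::thread> workers;

    for (int t = 0; t < n_threads; ++t) {
        workers.emplace_back([&lock, &counter, iters] {
            for (int i = 0; i < iters; ++i) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return counter;
}
```

With the acquire/release pair in place, `count_with_spinlock(4, 10000)` should return 40000. Weakening both operations to `std::memory_order_relaxed` would keep the flag itself atomic but drop the ordering guarantee for `counter`, which is the failure mode described above.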



