DIP 1024--Shared Atomics--Community Review Round 1

Gregor Mückl gregormueckl at gmx.de
Tue Oct 15 10:25:54 UTC 2019

On Saturday, 12 October 2019 at 20:52:45 UTC, Jonathan M Davis wrote:
> In the case of shared, in general, it's not thread-safe to read 
> or write to such a variable without either using atomics or 
> some other form of thread synchronization that is currently 
> beyond the ability of the compiler to make guarantees about and 
> will likely always be beyond the ability of the compiler to 
> make guarantees about except maybe in fairly restricted 
> circumstances.

A shared variable may not need any synchronization at all, 
depending on the algorithm it is used in.

There is a class of optimized algorithms that act like gathering 
operations. You mostly find them on GPUs because they map quite 
naturally to that hardware architecture, but they can also be 
implemented on CPUs with multiple threads. In every case, the core 
idea is that you generate a set of output values in parallel in 
such a way that each output value is written by at most one of 
the running threads. So there is no need to synchronize the 
memory writes among the threads, provided the underlying hardware 
architecture gives sufficient cache coherency guarantees. All 
the threads share the same input, which obviously must not be 
modified while the threads are running.
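
To make the pattern concrete, here is a minimal sketch in D. It is 
only an illustration: the function name gatherWindowSums, the 
windowed sum and the radius parameter are made up for this example. 
It uses std.parallelism, which allows implicit sharing between 
threads, so nothing here is typed as shared. The only thing that 
makes it race-free is that the task handling index i is the sole 
writer of output[i], while the input is only read.

```d
import std.parallelism : parallel;
import std.range : iota;

// Hypothetical gather-style kernel: output[i] is the sum of the input
// values in a window around i. Every output slot has exactly one writer.
int[] gatherWindowSums(const(int)[] input, size_t radius)
{
    auto output = new int[input.length];

    // Each index i is processed by exactly one worker task.
    foreach (i; parallel(iota(input.length)))
    {
        size_t lo = i >= radius ? i - radius : 0;
        size_t hi = i + radius + 1 < input.length ? i + radius + 1 : input.length;

        int sum = 0;
        foreach (j; lo .. hi)
            sum += input[j];  // unsynchronized reads of read-only input

        output[i] = sum;      // the only write to output[i] from any task
    }
    // The parallel foreach returns only after all tasks have finished.
    return output;
}

void main()
{
    immutable int[] input = [1, 2, 3, 4, 5];
    auto result = gatherWindowSums(input, 1);
    assert(result == [3, 6, 9, 12, 9]);
}
```

The "one writer per output element" property lives entirely in the 
indexing logic, not in any type or annotation the compiler could 
check.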

A compiler cannot possibly be smart enough to prove that 
synchronization is not required for these kinds of algorithms. 
And any form of enforced synchronization (explicit or implicit) 
would significantly affect performance. So how would you express 
to the compiler that the synchronization is implicit in the 
mathematical properties of the algorithm? Would the whole 
implementation have to be marked as 
