DIP 1024--Shared Atomics--Community Review Round 1

Sun Oct 13 03:50:33 UTC 2019

On Saturday, October 12, 2019 5:37:50 PM MDT IGotD- via Digitalmars-d wrote:
> On Saturday, 12 October 2019 at 21:28:36 UTC, Jonathan M Davis wrote:
> > Right now, we basically have #2. What this DIP is trying to do
> > is move us to #1. How libraries should work is exactly the same
> > in either case. It's just that with #1, the places where you
> > operate on shared data in a manner which isn't guaranteed to be
> > atomic, the compiler prevents you from doing it unless you use
> > core.atomic or have @system code with casts. Even if we have #2
> > and thus no such compiler errors, the code should still have
> > been doing what #1 would have required, since if it doesn't,
> > then it isn't thread-safe.
>
> With this DIP, shared integers/small types will be automatically
> atomic. For complex/large types, will you still be able to use
> them as before between threads and have to protect the type
> yourself, at least for a transitional period?
>
> "Atomic" here as I get it also means atomically updating complex
> types. This usually means that you need to guard the operations
> with some kind of mutex. The compiler can of course detect this
> and issue a warning/error to the user which doesn't seem to be
> the scope of this DIP.
>
> Correct me if I'm wrong but we have the following scenarios.
> 1. shared integer/simple type (size dependent?) -> automatically
> HW atomic operations
> 2. shared complex type -> write to any member must be protected
> with a mutex.
> 3. shared complex type -> read of any member must be protected
> with a mutex or read/write mutex allowing multiple reads.
>
> The compiler is used to detect these scenarios so that the user
> doesn't forget protecting the shared types.
>
> As I get it this DIP is just a baby step towards this bigger
> scope for the shared keyword.

When we're talking about atomic with regards to shared, we're talking about
what core.atomic does: operations that are atomic at the level of CPU
instructions. The most that those can work with is primitive types like
integers or pointers. More complex types require something like a mutex to
protect them before they can be freely mutated.
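
As a minimal sketch of what that looks like for a primitive type (the names
counter, bump, and current are made up for illustration; atomicOp,
atomicLoad, and atomicStore are the core.atomic functions):

```d
import core.atomic : atomicLoad, atomicOp, atomicStore;

shared int counter; // hypothetical shared variable

void bump()
{
    // A plain read-modify-write on counter would not be atomic, so the
    // increment goes through core.atomic instead.
    atomicOp!"+="(counter, 1);
}

int current()
{
    // Atomic read of the shared value.
    return atomicLoad(counter);
}

unittest
{
    atomicStore(counter, 0);
    bump();
    bump();
    assert(current() == 2);
}
```

Each call is atomic on its own. A sequence of such calls is not, which is
where mutexes come in for anything more complex.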

Walter needs to make the DIP clearer, but in the discussions in this thread,
he's made it clear that the intention is that read/write operations will
become illegal for all shared data, forcing code to either use core.atomic
to do atomic operations on the data or to use synchronization mechanisms
such as mutexes to allow for thread-safe reading and writing of data, with
the code needing to cast away shared in order to operate on the data
(meaning that the code will then be @system or @trusted).
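
A rough sketch of the mutex route, assuming a made-up Stats struct and
helper functions (only core.sync.mutex.Mutex is a real druntime type here).
The compiler has no idea that mtx protects stats, which is why the cast and
@trusted are needed:

```d
import core.sync.mutex : Mutex;

// Hypothetical aggregate that is too large for a single atomic instruction.
struct Stats
{
    int hits;
    int misses;
}

// @trusted: the programmer, not the compiler, vouches that mtx guards stats.
void recordHit(ref shared Stats stats, shared Mutex mtx) @trusted
{
    mtx.lock();
    scope (exit) mtx.unlock();

    // While the lock is held, cast away shared and treat the data as
    // thread-local.
    Stats* local = cast(Stats*) &stats;
    local.hits += 1;
}

int hitCount(ref shared Stats stats, shared Mutex mtx) @trusted
{
    mtx.lock();
    scope (exit) mtx.unlock();
    return (cast(Stats*) &stats).hits;
}

unittest
{
    auto mtx = new shared Mutex();
    shared Stats stats;
    recordHit(stats, mtx);
    recordHit(stats, mtx);
    assert(hitCount(stats, mtx) == 2);
}
```

The thread-local pointer must not escape the locked region. Nothing in the
type system checks that, so it has to be verified by hand.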

In principle, the compiler could allow reading and writing shared variables
by inserting the core.atomic stuff for you, but that's not the current plan.
Either way, it's very difficult to have the compiler understand
synchronization primitives well enough and have enough guarantees about what
the code is doing to be able to do something like implicitly remove shared
for you. So, it's highly unlikely that we'll ever get much in the language
that would be able to implicitly remove shared for even simple pieces of
code let alone complex types. AFAIK, the only construct along those lines
that's been proposed thus far that could work is TDPL's synchronized
classes, and they can only implicitly remove a single layer of shared - and
they can only do that much because of how restrictive they are. Having the
compiler
magically handle thread synchronization primitives for you is likely a pipe
dream.

Rather, what's likely going to tend to happen is that complex objects that
are supposed to be used as shared will handle the appropriate atomics or
mutexes internally, providing an @safe API for the user. However, the
internals will still have to do the dirty stuff and be appropriately vetted
for thread-safety. That's already what you typically get in a language like
C++. It's just that the type system doesn't have shared, so you have to
manage everything yourself and don't have a way to segregate shared data and
the functions operating on it other than by convention, whereas D's shared
enforces it via the type system.
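
To illustrate the idea, here is a sketch of such a type. SharedQueue and
producer are hypothetical names invented for this example, not anything
from druntime or Phobos. The casts and locking live in @trusted methods
that have to be vetted by hand, and callers only ever see an @safe API:

```d
import core.sync.mutex : Mutex;

// Hypothetical type, for illustration only.
final class SharedQueue
{
    private Mutex mtx;
    private int[] items;

    this() shared @trusted
    {
        mtx = new shared Mutex();
    }

    void push(int value) shared @trusted
    {
        mtx.lock();
        scope (exit) mtx.unlock();

        // The lock is held, so it's okay to strip shared temporarily.
        auto self = cast(SharedQueue) this;
        self.items ~= value;
    }

    bool tryPop(out int value) shared @trusted
    {
        mtx.lock();
        scope (exit) mtx.unlock();

        auto self = cast(SharedQueue) this;
        if (self.items.length == 0)
            return false;
        value = self.items[0];
        self.items = self.items[1 .. $];
        return true;
    }
}

// User code stays @safe and never casts or locks anything itself.
void producer(shared SharedQueue q) @safe
{
    q.push(42);
}

@safe unittest
{
    auto q = new shared SharedQueue();
    producer(q);
    int v;
    assert(q.tryPop(v) && v == 42);
    assert(!q.tryPop(v));
}
```

Only the three @trusted methods need to be reviewed for threading bugs.
Everything that calls them can be checked mechanically as @safe.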

Ultimately, shared is about segregating the code that operates on shared
data - both so that most code can just be thread-local without worrying
about it and so that you can easily determine which pieces of the program
have to be examined for threading issues - just like @system/@trusted/@safe
isolates the portions of the program where you potentially have to worry
about memory safety. shared isn't about magically handling threading stuff
for you. If we can figure out how to add mechanisms on top of shared which
make things easier, then great, but shared has to be properly locked down
first, and given that D is a systems language (thus allowing all kinds of
crazy stuff), and its type system really has no concept of ownership, I
don't think that it's very likely that we're going to be able to add much to
the language that's going to allow shared to be implicitly removed or allow
you to otherwise operate on shared data without worrying about dealing with
the synchronization primitives yourself.

- Jonathan M Davis