Shared

Wed May 15 09:33:19 UTC 2019

On Wednesday, May 15, 2019 12:59:00 AM MDT Dominikus Dittes Scherkl via 
Digitalmars-d wrote:
> On Tuesday, 14 May 2019 at 21:31:57 UTC, Jonathan M Davis wrote:
> > The point is that if that code can legally exist, then the
> > compiler simply cannot guarantee that removing shared from the
> > object is thread-safe even with the locking mechanism you're
> > proposing.
>
> with system code you can always destroy safety assumptions of any
> other written code. This is why system code should be avoided
> where ever possible and the unavoidable remains need to be
> reviewed very carefully to not spoil the guarantees that are
> valid otherwise.

@safety and thread-safety are very different. @system mechanisms such as
casting are involved with thread-safety, meaning that @system and @trusted
get involved, but what they mean and how they're verified are very
different. For @safety, if you want to mark something as @trusted, you just
need to look at that piece of code to verify that what it's doing is memory
safe. You don't have to look at other @safe parts of the program to verify
that what it's doing is correct.

On the other hand, with thread-safety, you have to look at _everywhere_ that
a particular piece of shared data is accessed if you want to be able to
guarantee that it's accessed in a thread-safe manner. You can't just assume
that other code is doing the right thing and just verify that one piece of
code that's using @system mechanisms to interact with shared data. So, you
can't rely on @safe to tell you whether anything is thread-safe.
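
To make the difference concrete, here is a minimal sketch. The names counter,
counterMutex, incrementLocked, and incrementUnlocked are all made up for
illustration. Each @trusted function is memory safe when reviewed on its own,
but only one of them holds the mutex, so reviewing incrementLocked in
isolation says nothing about whether counter is accessed in a thread-safe
manner.

```d
import core.atomic : atomicLoad;
import core.sync.mutex : Mutex;

// Hypothetical shared data and the mutex that is supposed to protect it.
shared int counter;
__gshared Mutex counterMutex;

shared static this()
{
    counterMutex = new Mutex;
}

// Looks correct when reviewed alone: shared is cast away only while
// the mutex is held.
void incrementLocked() @trusted
{
    counterMutex.lock();
    scope (exit) counterMutex.unlock();
    auto p = cast(int*) &counter;
    ++*p;
}

// Also memory safe when reviewed alone, but it ignores the mutex, so it
// can race with incrementLocked if another thread calls it.
void incrementUnlocked() @trusted
{
    auto p = cast(int*) &counter;
    ++*p;
}

void main()
{
    // Single-threaded here, so both calls behave. The compiler accepts
    // both functions and cannot tell that one of them skips the lock.
    immutable before = atomicLoad(counter);
    incrementLocked();
    incrementUnlocked();
    assert(atomicLoad(counter) == before + 2);
}
```

Nothing about the signature or attributes of incrementUnlocked tells a caller
that it bypasses the lock, which is why every access to counter has to be
examined.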

> >> > And even if this were the only mechanism for removing
> >> > shared, you could easily use it with the same object and a
> >> > completely different mutex in another piece of code
> >>
> >> Yes, the lock block need a list of vars that it allows to be
> >> modified
> >>
> >> lock(var1, var2, ...)
> >> {
> >> }
> >>
> >> two mutexes can only be executed at parallel if their
> >> parameter set is disjunct.
> >
> > Sure, but another thread could be using a completely different
> > mutex with one or more of those variables.
>
> No, it can't. Disjunct means: It cannot be called unless all of
> the given variables are free (not locked by any other mutex).

So, you're proposing that something in the runtime keeps track of which
variables are currently associated with a locked mutex in order to guarantee
that no other lock block is able to access any of those variables at the
same time? That would probably require adding a global lock used by the
runtime when any code enters or exits a lock block. I'd be _very_ surprised
if anything like that were deemed acceptable for D. And it still doesn't
solve the problem of other references referring to any part of the objects
referred to by those variables existing and potentially being used
elsewhere. If you had

lock(mutex, var1)
{
}

and elsewhere

lock(mutex, var2)
{
}

when var1 and var2 were references to the same object or when var2 referred
to a piece of data inside of var1, then the lock wouldn't be providing
thread-safety. If you only had one mutex for the entire program, and casting
away shared were illegal, then something like this could probably work, but
having one mutex for the entire program would clearly be unworkable -
especially for a systems language - and since mutexes (let alone this
particular use pattern for mutexes) aren't the only way to protect shared
data when accessing it, requiring that this particular construct be used for
accessing shared data wouldn't work anyway.

> >> ok, so we need in addition that a reference to a shared var
> >> need not be lived beyond the end of the locked block or be
> >> immutable. Bad, but seems necessary.
> >
> > A reference could already exist before your proposed locking
> > mechanism was reached in the code. If the type is a class or
> > pointer, then there could be other class references or pointers
> > to the same data in @safe code. And in @system/@trusted code,
> > the address of the object could have been taken to create a
> > pointer to the object (and that could have been done in code
> > far removed from the code that's using the lock with all of the
> > code around the lock being @safe). Heck, there could even be
> > references to data within the object rather than to the object
> > itself which are available elsewhere, meaning that part of the
> > object is protected by the lock and part isn't. If any
> > reference to any part of the data exists anywhere in the
> > program, then it's possible for another thread to access the
> > data at the same time that it's locked by the mechanism that
> > you've proposed.
>
> Ok, that whole reference stuff is always a problem. Why not
> simply forbid it? You can't reference shared variables (outside
> locked blocks), you can only copy them. We can later relax that
> rule if some safe ways to allow that are found. I can't see why
> that should hinder us to make the more practical usecases safe
> for now.

You can't forbid references to the same data. All that would be required
would be something like

shared foo = new shared(Foo)(42);
auto bar = foo;

and you have two references to the same object without doing anything
unsafe. You could also have stuff like

shared baz = foo.getBaz();

resulting in a reference to some piece of data that the foo object contains
(possibly simply a shared class object that it has a reference to as a
member). In general, to be able to verify that no other references to data
existed, we'd probably have to add some sort of ownership semantics to the
language so that the compiler could know that nothing else can possibly have
access to the data (as I understand it, Rust manages something along those
lines, but they have a much more restrictive type system).
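
As a rough, self-contained version of the snippets above: Foo, Baz, and
getBaz are hypothetical, and their definitions here are just one way they
could look. Everything is @safe, and the program still ends up with several
references to the same shared data.

```d
// Hypothetical types for illustration only.
class Baz
{
}

class Foo
{
    private int id;
    private shared Baz baz;

    this(int id) shared @safe
    {
        this.id = id;
        baz = new shared(Baz);
    }

    // Hands out a reference to data that the Foo object holds on to.
    shared(Baz) getBaz() shared @safe
    {
        return baz;
    }
}

void main() @safe
{
    shared foo = new shared(Foo)(42);
    auto bar = foo;             // second reference to the same object
    shared baz = foo.getBaz();  // reference to data inside foo

    assert(bar is foo);
    assert(baz is foo.getBaz());
}
```

A lock block that only knows about the variable foo has no way of knowing
that bar and baz exist.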

> > And it wouldn't surprise me if someone else were able to point
> > out why even that wasn't enough because of some detail I'm not
> > thinking of at the moment. Having the compiler be able to prove
> > that a piece of code is thread-safe such that shared can be
> > safely removed automatically from anything is incredibly
> > difficult.
>
> shared shouldn't be removed from an object, but it can only be
> modified if it is locked. Removing shared (with a cast) is system
> stuff and should be out of scope for any safety related proposal
> (including mine), because with system stuff you can destroy any
> kind of safety.
>
> If you don't remove shared, you can easily apply rules like
> forbid to take its address or such. If you remove it, that makes
> it much harder (and isn't useful anyway).
>
> I still think my proposal could work (provide provable
> thread-safety for shared objects) in a limited but useful way
> (only mutex, no references), and should be relatively easy to
> implement.
> If you want more complex stuff, that's still possible in the same
> way it currently is: cast shared away together with all
> guarantees and verify manually that it works, just like in C++.

As soon as casting away shared is legal (and I think that it has to be for
many common thread-synchronization idioms to be used), any mechanism like
you're suggesting isn't enough to guarantee that it's safe to remove shared
even temporarily. Either way, the ability to get multiple references to the
same data defeats what you're proposing, and I don't see how it would be
possible to make it so that there can't be multiple references to the data
given D's type system. TDPL synchronized classes are only able to do it with
the data that lives directly in the class, because they're a very
restrictive construct. But even then, what the member variables refer to
can't have shared removed, because references to the same data could have
escaped the class or been passed into the class from elsewhere. And if
something as restrictive as TDPL synchronized classes isn't able to
restrict references sufficiently to be able to just outright remove shared
from the class' member variables, there's no way that something as free-form
as locking a mutex on a set of variables that aren't encapsulated in
anything is going to be able to guarantee that other references to the same
data don't exist.

- Jonathan M Davis