The `shared` debate, from my point of view

Wed Oct 24 01:22:19 UTC 2018

On Tuesday, 23 October 2018 at 21:17:16 UTC, Steven Schveighoffer 
wrote:
> I wrote in a previous buried post that I finally understood the 
> benefits of Manu's system of shared, and why he has proposed it 
> with the implicit casting of unshared to shared. Here is the 
> expansion on that.
>
> Let me first start by trying to infer through his various posts 
> and explanations where this thing came from. I'm going to spend 
> a few paragraphs putting words in Manu's mouth, forgive me if 
> I'm wrong (and please correct if necessary!), but I think it's 
> important to understand the motivation for the proposal.
>
> Obviously Manu has a lot of experience writing C++ code, and in 
> C++, anything shared is "shared by convention". That is, you 
> can share anything you want, there are no restrictions. The end 
> result is that any pointer to any data must be treated like 
> it's shared.

C++ has strict aliasing. C++ can treat pointers like they're the 
only reference to that data in the universe if it likes.

It's not true that "data must be treated like it's shared". Data 
must be very deliberately handled if it IS shared. And if it is 
shared, and not handled, then basically, undefined behaviour.

The rules are basically the same as D.

Just to be clear, I'm not talking about C++, or contrasting with 
it, at any point. To suggest so is not fair. I'm strictly 
interested in the reality of data access.

> Treating every piece of data as if it was shared doesn't pan 
> out well, because synchronizing memory across threads to avoid 
> races isn't cheap, and pointers are used *everywhere*. So you 
> must restrict yourself to a set of rules for sharing data. What 
> I envision has been practiced in this case is that certain 
> types encapsulate COMPLETELY thread-safe behavior. This means 
> that whether the type is on the stack or on the heap, shared or 
> not, it's going to defensively use locks or atomics to make 
> sure no races can happen with that specific type. This is 
> similar to stdout in libc, which is generally thread-safe, but 
> uses locks even when only one thread is using it. (Side note: I 
> myself have used things like shared_ptr in multi-threaded 
> environments, and it does make things so much easier to have 
> thread-safe primitives)
>
> Somehow you must know which data is shared from outside the 
> thread, and which data is local. This means that you have to 
> have some way (probably by convention) for siloing this shared 
> data away from your local data. The reason is because local 
> data can be manipulated without synchronization, while shared 
> data cannot. But clearly, any shared data can only be 
> manipulated via the "thread-safe" types that are fully 
> encapsulated and can't cause problems. The other data, you are 
> not allowed to touch (again, this is C++, so by convention). 
> Knowing what data can be manipulated is easy to discern based 
> on the type of the data (e.g. Atomic<int>).
>
> To recap: the silo that contains data shared from elsewhere can 
> ONLY be manipulated through the fully anointed "thread safe" 
> types. Normal types cannot be touched, because some other 
> thread (the owner thread) can manipulate that data without sync.
>
> ------
>
> Now, let's look at the *current state* of D. In D, shared data 
> and unshared data are STRICTLY separated. One cannot simply 
> share any data (like an int *) because that would now mean that 
> int * is shared. Having a shared alias to unshared data 
> trivially causes a paradox that now will result in races. This 
> is the reason why implicit casting either way isn't allowed.
>
> But that doesn't FIT the fully encapsulated "I can use this 
> type shared or not", which can be used whether it's thread 
> local or shared. So what Manu proposes is to remove this 
> definition of shared, and instead of shared meaning "this data 
> is shared", it means "you can only operate this data IF it 
> provides a thread-safe interface". The thread-safe interface 
> comes in the form of free-functions that accept the type as 
> shared, including shared member functions.
>
> So Manu's proposal (MP) is to do 2 simple things:
> 1. Basic data that is shared CANNOT be read or written via 
> standard methods or operators. This enforces the convention of 
> not touching data in the shared silo that is NOT thread-safe.
> 2. Standard data implicitly casts to shared.
>
> The only way obviously to write or read shared basic types, 
> therefore, is to cast away shared. But the intention from this 
> plan is to only do this while inside a fully encapsulated and 
> tightly controlled type.
>
> And HERE is the key part I was missing -- those ints that have 
> no thread-safe interface, are STILL USABLE by the original 
> thread, because it still can have a non-shared reference to the 
> data. All other threads can ONLY have a shared reference to the 
> data, restricting them to the thread-safe portions of the type. 
> In other words, you can use the Atomic!int type while the 
> reference is shared, but not the int. Manu's quote (with 
> contexts by me) here explains it all:
>
>> In practise, and in my direct experience, classes tend to have 
>> exactly
>> one [thread-safe member], and either zero (pure utility), or 
>> many such [thread-local] members.
>> Threadsafe API interacts with [the thread safe member], and 
>> the rest is just normal
>> thread-local methods which interact with all members 
>> thread-locally,
>> and may also interact with [the thread safe member] while not 
>> violating any threadsafety
>> commitments.
>
> This requires a different mindset when implementing shared 
> data. You can NEVER have a function that takes a shared int * 
> and does anything with it. So all of core.atomic changes to 
> only accepting `ref int`, and not `shared ref int`. 
> Essentially, in order for a type to have an encapsulated 
> thread-safe interface, it cannot have any other thread-unsafe 
> means of manipulating the data. Obviously, this means basic 
> types are useless as shared types unless encapsulated into a 
> specially written type.

Yup.
But let's be clear; this isn't actually a feature of my proposal, 
this is a feature of reality!
There is *absolutely no world* where unregulated interaction with 
primitive data is safe.
You MUST perform custom and deliberate threadsafe handling of any 
interaction with any data. Any access to raw data from a shared 
reference can never be safe, and should be strictly banned.

atomicIncrement(shared int*) is unacceptable under any 
conceivable model, because `int` has an unsafe API (the intrinsic 
operators).

Even in the current implementation, right now, core.atomic 
functions should change to int* and require unsafe casts.
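
As an illustration, here is a minimal sketch of the kind of 
encapsulated type being described. `Atomic` is a hypothetical 
wrapper (the name only mirrors the `Atomic!int` used in this 
thread), not an existing druntime type. It is written against 
today's core.atomic, which accepts shared references; if those 
functions took plain pointers instead, the casts would sit inside 
these same few @trusted methods and nowhere else.

```d
import core.atomic : atomicLoad, atomicOp, atomicStore;

// Hypothetical wrapper type, for illustration only.
struct Atomic(T)
{
    // The raw value is private: nothing outside this type can reach
    // the unsafe intrinsic operators of T through a shared reference.
    private T value;

    // Thread-safe interface: every method is callable on a shared instance.
    T load() shared @trusted
    {
        return atomicLoad(value);
    }

    void store(T newValue) shared @trusted
    {
        atomicStore(value, newValue);
    }

    // Atomically adds `amount` and returns the updated value.
    T add(T amount) shared @trusted
    {
        return atomicOp!"+="(value, amount);
    }
}
```

User code then only ever calls `load`, `store` and `add`; it never 
sees a raw `shared int`.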

> You want these sharable types in their own modules, so it can't 
> have any unforeseen hooks into the private data, and it will 
> actually work. It is a convention, although not too hard to 
> follow. Some have mentioned that there are still loopholes 
> (like accessing tupleof) that need to be addressed, but those 
> should be addressed anyway.
>
> Therefore, the rules are simple, they are sound, and they do 
> accomplish a certain view of sharing data that will be useful 
> in many cases. And it allows Manu's current model of sharing 
> data to easily be implemented AND get rid of some of the 
> convention in C++ by using compiler guarantees in D.

**Get rid of _all_ of our convention. Our infrastructure would 
become typesafe, and @safe.

I can't think of any data that is strictly shared. We don't have 
any, and I don't know of anything like that which isn't also 
immutable.
All data has an owner, and that owner can do things to it that an 
owner should be allowed to do.

> -----------
>
> So here is my take on this: I propose that we still make basic 
> shared data unusable without casting,

Indeed, I don't see any room for debate on this. It just needs to 
be right.

> but do not allow implicit casting to shared.
>
> Manu's workflow and model is still doable without the implicit 
> casting. Simply because, if you want shared data, declare it 
> shared.
>
> That is, if you have (with MP):
>
> struct SharableType
> {
>    int x;
>    Atomic!int y;
> }
>
> just declare it:
>
> struct SharableType
> {
>    int x;
>    shared Atomic!int y;
> }

Declaring `y` shared might be a useful choice in some cases to 
help catch cases of unshared functions accidentally accessing a 
member. In this case though it's a bit lame that now an unshared 
function can't access `y`, since it's Atomic!() and therefore 
perfectly fine to do. Any unshared function that wants to access 
`y` must do an unsafe cast.

But the real problem is here:

void DoParallelFor(ref shared SharableType x)
{
   x.threadsafeMethod();
}

void fun()
{
   SharableType x;
   x.threadlocalMethod();

   DoParallelFor(x); // <- no implicit conversion; requires unsafe cast! so lame!
}

So now, fun() must be unsafe.
This requirement to perform needless unsafe casts everywhere 
means my whole program becomes unsafe!
And that goes for all forms of safety. We don't have an "allow 
unsafe shared casts, but enforce safety for other things" 
option... it's just that all things are unsafe now.

By forcing totally needless unsafe interactions into user code, 
you are making the whole user-side program unsafe. That's a 
terrible design choice.

> and share y instead of the whole thing.

But it's `SharableType` that defines interesting interactions 
with `y`. `y` is private; it's an uninteresting implementation 
detail of `SharableType`.

> You still do not have to cast anything, and realistically, the 
> other thread doesn't care about the other data it receives that 
> isn't actually accessible. I see no reason to deal with the 
> compiler preventing twiddling when it can be trivially 
> prevented by not giving it to the other thread.
>
> The objection I have seen most cited is that then the user is 
> forced to cast data to shared to share it. I don't see how -- 
> if you have the above you don't need to cast.

My examples above should convince you that casts must exist.

The casts will appear somewhere. Like I say above, sharing `y` is 
uninteresting, because it's `SharableType` that defines 
interesting interactions with `y`. I could add another layer in 
the middle, but then we just move the casts to any unshared 
methods of `SharableType` that want to call threadsafe functions 
of its member, and we've needlessly made the implementation of 
`SharableType` more complex and noisy.

Like I say, the casts will exist *somewhere*, this is just a 
matter of choice of where.
My proposal makes the (I feel; 'objective') assumption, that the 
best place for unsafe casts to appear is:
  * in the 5-10 core low-level library functions written by the 
threadsafety expert; that guy is trained to handle unsafety 
concerns properly
  * NOT in the user code, where requiring users to perform unsafe 
casts causes ALL user code to become unsafe

I'm not changing the landscape, I'm just shifting the safety 
guards into a more reasonable location.
By putting the unsafety in the *core* library, the whole program 
is safe, and the number of unsafe interactions is minimised and 
contained.

It's also a matter of unsafe casting frequency.
I've made the point that library:users is a 1:many ratio.
Shifting the unsafe bits into the '1', and NOT scattering it 
among the 'many' is just common-sense.

> Simply put, casting unshared data to shared or vice versa means 
> you have verified BY HAND that there are no other references to 
> that data from that point forward. If the compiler can prove 
> this, it can do the implicit cast. It works fine for 
> immutable/mutable transitions, and can work here too. Casting 
> to share data will not be a requirement for safe code, and will 
> be rare in user code, if anywhere.

Can you elaborate on this claim: "Casting to share data will not 
be a requirement for safe code"

I went through the process you're going through now... I've been 
all over this design landscape, but I can't produce any design 
that works other than the one I have.

> The only issue I see that can possibly cause problems is that 
> it may not be easy or possible to separate the shared parts of 
> data into its own type, which means you have to share it 
> through an artificial reference type (one that contains only 
> the sharable pieces). This can be automated and implemented via 
> introspection.

This does indeed feel very awkward to me.
Understand, you're enforcing this on a very large number of types 
in my ecosystem.

Burden of complexity is best placed on the threadsafety 
author/expert and isolated/contained in core libs, not 
distributed among all users in all code everywhere.

> One further benefit to keeping the cast explicit, is that one 
> can write specific implementations knowing that data is not 
> shared or is shared, giving a possibility of performance 
> benefit that just isn't possible with MP (at least it isn't 
> possible with compiler guarantees, obviously anything is 
> possible if you follow conventions).

I think what you mean is "one can write specific implementations 
*safely*..."
And that's literally the single useful facet of the current 
design I can identify.

My model can still implement 2 overloads to make the same 
assumptions, but it requires unsafe cast in the threadlocal 
implementation to implement the optimisation.
I am completely happy with such an optimisation being unsafe, but 
I think safety by default is the sensible option.
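
For illustration, a sketch of the two-overload idea as it can be 
written under today's rules, where a method can be overloaded on 
`shared`. `Counter` and its members are made-up names. The 
unshared overload skips synchronisation, which is only valid if no 
other thread holds a reference. With implicit conversion to shared 
the type system would no longer guarantee that, which is why the 
thread-local fast path would have to be an explicitly unsafe 
choice.

```d
import core.atomic : atomicLoad, atomicOp;

// Hypothetical type, for illustration only.
struct Counter
{
    private int count;

    // Thread-local overload: plain increment, no synchronisation.
    // Only correct while no other thread can reach this instance.
    void increment()
    {
        ++count;
    }

    // Shared overload: selected when called through a shared reference.
    void increment() shared @trusted
    {
        atomicOp!"+="(count, 1);
    }

    int get() shared @trusted
    {
        return atomicLoad(count);
    }
}
```

Overload resolution picks the version from the qualifier on the 
instance, so callers write `c.increment()` either way.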

The reality is though, that the thread-local functions are NOT 
the perf issue, almost by definition. Sharing something implies 
that its threadsafe methods will be called a great many times by 
many threads... the single instance would not represent 'the 
workload' in a shared world, it's just an arbiter, or book-keeper.

> One thing that is problematic with MP, is that you can't 
> actually pass ownership of thread-local data from one thread to 
> another.

Is this true? This feels like a problem for move semantics.
It seems like a general architectural problem, can you explain 
how MP affects this?
How does this work now that would be ruined by my proposal?

> This isn't actually possible without casting under the current 
> shared regime, but with implicit casting from unshared to 
> shared, you have introduced NO opt-in cast on the sharing side.

You say "passing ownership", why would that API receive a 
`shared` one? That's backwards.
It should receive *THE* one, ie, an unshared rvalue, and you 
would move your object.
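
A rough sketch of what such an ownership-passing API could look 
like. `Job`, `handOff` and `producer` are hypothetical names, and 
the hand-off body is a stub; the point is only that the parameter 
is an unshared value that the caller moves in, with `shared` not 
appearing anywhere.

```d
import std.algorithm.mutation : move;

// Hypothetical payload type; copying is disabled so it can only be moved.
struct Job
{
    int[] payload;
    @disable this(this);
}

// Hypothetical hand-off API: takes the Job by value, so the caller must
// give up *the* object rather than a shared view of it.
size_t handOff(Job job)
{
    // A real implementation would enqueue `job` for another thread here.
    return job.payload.length;
}

size_t producer()
{
    auto job = Job([1, 2, 3]);
    // Passing `job` directly would not compile (it is not copyable);
    // the caller has to move it explicitly.
    return handOff(move(job));
}
```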

> This makes it impossible for the compiler or code reviewer to 
> find the place at which you should be verifying the reference 
> is unique (a requirement if you want to change ownership). The 
> receiving side's cast back to thread-local can be abstracted 
> (because you can wrap it in a type that assumes uniqueness and 
> destroys the original).

I think you're mistaken to think that `shared` has anything at 
all to do with passing ownership. That case is not-shared by 
definition.

> Another thing that looks attractive from MP is you have this 
> "carved out" section of your type that's only owned by your 
> thread. This is great until you realize, you ONLY have access 
> to it from your original reference. You can't send it away, get 
> it back, and then manipulate the result. In this sense, it's 
> VERY similar to const. So really it does you no good to 
> associate the shared portions of the data with your local 
> portions for the purpose of sending it away to other threads 
> for a processing round-trip.

I don't understand this point.

You either lease it out to a cluster for processing (think 
parallel for), or if you want to 'send' it on a round-trip, then 
you are transferring ownership along the way.
Both models work fine.

> -------
>
> To summarize, I think the reality is that we ACTUALLY can 
> implement sharing as Manu wishes without implicit casting, 
> albeit via library abstraction using introspection. I can 
> easily see a library that allows you to pass a type that isn't 
> shared, as long as it has shared pieces, and have that library 
> simply restrict access to the thread-safe pieces via a wrapper. 
> We don't need the compiler's help for that. So Manu can have 
> his cake, I can eat my cake, we'll have a great big sharing of 
> cake party, where nobody is racing, and everything is roses and 
> lollipops.

I agree point #1 in my proposal must happen no matter what world 
we end up in. All of us will be happier in that world, so I think 
that's worth pursuing as a first goal.

Sadly, ignoring the #2 point on my proposal leads to virtually 
all code being unsafe.
#2 makes @safe threadsafety a thing, and that's literally the 
whole point :/