shared - i need it to be useful

Stanislav Blinov stanislav.blinov at gmail.com
Tue Oct 16 12:26:13 UTC 2018


On Tuesday, 16 October 2018 at 03:00:21 UTC, Manu wrote:

>> I don't see how an *implicit* cast can be a restriction. At 
>> all.

> Because a shared pointer can't access anything.
> You can't do anything with a shared instance, so the can be no 
> harm done.

That just doesn't compute. You obviously *can* do "anything" with 
a shared instance, per your own requirements all you need is 
methods. But at this point, the caller has an unshared reference, 
and your API possibliy stole away a shared one without the caller 
ever knowing of it happening.

>> It's like we're talking about wholly different things here. 
>> Casting should be done by the caller, i.e. a programmer that 
>> uses some API. If that API expects shared arguments, the 
>> caller better make sure they pass shared values. Implicit 
>> conversion destroys any obligations between the caller and the 
>> API.
>
> Why? What could a function do with shared arguments?

How, exactly, do you propose getting shared state from one thread 
to another? Exclusively through globals?

>> The problem that I'm suggesting is exactly that: an `int*` is 
>> not, and can not, be a `shared int*` at the same time. 
>> Substitute int for any type. But D is not Rust and it can't 
>> statically prevent that, except for disallowing trivial 
>> programming mistakes, which, with implicit conversion 
>> introduced, would also go away.
>
> Why not? The guy who receives the argument receives an argument 
> that
> *may be shared*, and as such, he's restricted access to it 
> appropriately.

But the guy that *provides* that argument may have no idea of 
this happening. This is unshared aliasing.

module yourapi;

// My code does not know about this function at all
void giveToThread(shared int* ptr) { /* ... */ }

void yourAPI(int* ptr) {
     giveToThread(ptr);
}

module mycode;

int x;

void main() {
     yourAPI(&x);
}

> Just like if you receive a const thing, you can't write to it, 
> even if the caller's thing isn't const.

You do know why those Rust guys disallowed mutable aliasing? That 
is, having both mutable and immutable references at the same 
time. That's a long known problem that in C++ and D is only 
"solved" by programmer discipline and not much else.

> If you receive a shared thing, you can't read or write to it.

You can, per your model, through methods. All the while the 
original is being read and written freely.

>> > a good chance they don't need it though, they might not  
>> > interact with a thread-unsafe portion of the class.
>>
>> Or they might.
>
> Then you will implement synchronisation, or have violated your 
> thread-safety promise.

Such a good promise it is when it's simply ignored by an implicit 
cast.

>> Because that is exactly the code that a good amount of 
>> "developers" will write. Especially those of the "don't think 
>> about it" variety. Don't be mistaken for a second: if the 
>> language allows it, they'll write it.
>
> This is not even an argument.
> Atomic!int must be used with care. Any threading of ANY KIND 
> must be
> handled with care.

Implicit casts are in another galaxy as far as handling with care 
is concerned.

> Saying we shouldn't make shared useful because someone can do 
> something wrong is like saying we shouldn't have atomic int's 
> and we shouldn't have spawn(). They're simply too dangerous to 
> give to users...

We should make `shared` useful. We shouldn't allow unshared 
aliasing.

>> Can you actually provide an example of a mixed shared/unshared 
>> class that even makes sense then? As I said, at this point I'd 
>> rather see such definitions prohibited entirely.
>
> I think this is a typical sort of construction:
>
> struct ThreadsafeQueue(T)
> {
>   void QueueItem(T*) shared;
>   T* UnqueueItem() shared;
> }

That's an interface for a multi-producer multi-consumer, yet 
you're using it as a multi-producer single-consumer. Those would 
have very different implementations.

> struct SpecialWorkList
> {
>   struct Job { ... }
>
>   void MakeJob(int x, float y, string z) shared  // <- any 
> thread may produce a job
>   {
>     Job* job = new Job; // <- this is thread-local
>     PopulateJob(job, x, y, z); // <- preparation of a job might 
> be complex, and worthy of the SpecialWorkList implementation

...except you can't call PopulateJob from MakeJob.

The two methods after compiler rewrite are (pseudocode):

void struct_method_SpecialWorkList_MakeJob(shared 
SpecialWorkList* this, int x, float y, string z);
void struct_method_SpecialWorkList_PopulateJob(SpecialWorkList* 
this, Job* job, ...);

Or are you also suggesting to allow implicitly demoting 
shared(T)* to T* ?!? Then you can just throw away `shared`.

>     jobList.QueueItem(job);  // <- QueueItem encapsulates 
> thread-safety, no need for blunt casts

None were needed here regardless.

>   void Flush() // <- not shared, thread-local consumer
>   {

I.e. this can only be called by the thread that owns this 
instance, because only that thread can have an unshared reference 
to it.

>   void GetSpecialSystemState() // <- this has NOTHING to do 
> with the threadsafe part of SpecialWorkList
>   {
>     return os.functionThatChecksSystemState();
>   }

Therefore it's either static or out of SpecialWorkList. As 
someone in strong opposition to conflation you should've seen 
that.

>   // there may be any number of utility functions that don't 
> interact with jobList.

So have a separate interface for them, why put them in the same 
namespace then?

> private:
>   void PopulateJob(ref Job job, ...)
>   {
>     // expensive function; not thread-safe, and doesn't have any
> interaction with threading.

And cannot be called from `shared` functions without at least 
casting away `shared`.

> This isn't an amazing example, but it's typical of a thing  
> that's
> mostly thread-local, and only a small controlled part of it's
> functionality is thread-safe.
> The thread-local method Flush() also deals with thread-safety
> internally... because it flushes a thread-safe queue.



> All thread-safety concerns are composed by a utility object, so 
> there's no need for locks, magic, or casts here.
>

// Casts:
auto assumeUnshared(T)(ref shared(T) x) { /* ... */ }

struct MPSCQueue(T) {
     void put(T*) shared;
     T* remove();
     // perhaps also this one:
     T* steal() shared;
}

struct Job { /* ... */ }

struct JobList {

     void addJob(Args...)(Args args) shared {
         auto job = new Job;
         this.assumeUnshared.populateJob(*job, forward!args);
         jobs.put(job);
     }

     Job* removeJob() {
         return jobs.remove();
     }

     Job* stealJob() shared {
        return jobs.steal();
     }

private:

     void populateJob(ref Job job, ...) {
         /* ... */
     }

     MPSCQueue!Job jobs;
}

struct SpecialWorkList {

     shared JobList* getJobList() { return jobList; }

     // here's an implicit cast to the actually "shared" data if 
you need one:
     //alias getJobList this;

     void flush() {
         auto list = jobList.assumeUnshared;
         Job* job;
         while ((job = list.removeJob) !is null) {
             // ...
         }
     }

     // put anything non-shared here...

private:

     // Even though your 'SpecialWorkList' is unshared,
     // you're sharing at least this reference with other threads:
     shared JobList* jobList;
}

SpecialWorkList makeSpecialWorkList() {
     SpecialWorkList result;
     // whatever, allocate the jobList, initialize other state, 
etc.
     return result;
}

void worker(shared int* keepMeltingCPU, shared JobList* jobList) {
     while (true) {
         if (!keepMeltingCPU.atomicLoad()) break;

         // Do some heavy processing...

         jobList.addJob(/* arguments */);
         jobList.addJob(/* arguments */);
     }
}

void thief(shared int* keepMeltingCPU, shared JobList* jobList) {
     while (true) {
         if (!keepMeltingCPU.atomicLoad()) break;
         jobList.addJob(/* arguments */);
         jobList.addJob(/* arguments */);
         /* ... */
         // help with jobs
         Job* job;
         while ((job = jobList.steal) !is null) {
             // ...
         }
     }
}

bool readInput() { /* ... */ }

void main() {
     Thread[2] threads;
     shared int[2] flags = [ 1, 1 ];
     SpecialWorkList list = makeSpecialWorkList();
     threads[0] = someFunctionThatMakesAThread(&worker, &flags[0], 
list.getJobList);
     threads[1] = someFunctionThatMakesAThread(&thief, &flags[1], 
list.getJobList);

     while (true) {
         list.flush();
         if (!readInput()) break;
     }

     foreach (ref flag; flags) {
         flag.atomicStore(0);
     }

     foreach (t; threads) t.join();
}

As you can see, the amount of casting is minimal. It is explicit 
and documenting the intent. And all the casts are actually the 
opposite way of what you're proposing. That's because the design 
lends to just not letting you share thread-local data. The 
"problem" here is that you can't add jobs from main thread 
without adding an unshared overload. Which is trivial, but would 
require a cast.

There are, however, deeper problems that don't have anything to 
do with the `shared` keyword, and start with the call "new Job". 
I could assume it's just for the sake of example, and you'd 
actually store the jobs:

a) not as pointers
b) in aligned and padded storage

but since we're nitpicking, and you're very specific in that 
approach, I'm not sure if I'm safe making such an assumption.

>> missing the point altogether. The way `shared` is 
>> "implemented" today, the API (`thread` function) *requires* 
>> the caller to pass a `shared int*`. Implicit conversion breaks 
>> that contract.
>
> So?
> What does it mean to pass a `shared int*`?

Exactly what it says: passing a pointer to a shared int. Which is 
what int* is not. Adding `shared` to a type is not the same thing 
as "promoting" a mutable to a const, which, the way that's done 
in D, isn't that awesome a deal to begin with.

>> At the highest level, the only reason for taking a `shared` 
>> argument is to pass that argument to another thread.
>
> Not even. This is the most un-useful application I can think 
> of. It doesn't really model the problem at all.

There's no feasible alternative in the current type system.

> Transfer of ownership is a job for move semantics.

There is no transfer of ownership when you're *sharing* data.

> shared is for interacting with objects that are *already* owned 
> by many threads.

There's only one owner, usually a thread that instantiated the 
thing in the first place. Ergo, there needs to be a way to 
"share" that thing. Explicitly.

> shared needs to model mechanics to do a limited set of 
> thread-safe
> interactions with shared objects that are shared. That would 
> make
> shared a useful thing, rather than a giant stain.

Agreed.

>> That is the *only* way to communicate that intent via the type 
>> system for the
>> time being.
>
> ...but that's shit. And it doesn't communicate that intent at 
> all. Bluntly casting attributes on things is a terrible 
> solution to that proposed problem.

It's not a solution, it's a crutch. Presumably the programmer 
would know whether or not that cast is safe in that particular 
context.
Can you propose a better way to distinguish shared and 
thread-local instances without introducing "thread" built-in type 
(at which point we should all probably just stop using D)? So 
long as threading is purely library code, there isn't exactly 
much room for that.

> Yes; everything we think about shared today is completely  
> worthless.
> Under my proposal, some existing applications might remain 
> untouched
> (they do), but they're not worth worrying about from a design 
> point of view, because they're not really 'designs'.
> Focus on making shared a useful thing, and then see where we're 
> at.

I quite like the idea of disallowing member access wholesale. For 
example, things like assignment, copy/postblit should be 
@disabled for anything `shared` by default, shared destructors 
should not exist. But this doesn't *really* solve the problem.

>> `shared` was
>> supposed to protect from unshared aliasing, not silently allow 
>> it.
>
> Inhibiting all access satisfies that protection. It doesn't 
> matter if
> a pointer is distributed if you can't access the contents.
> Now from there, we need a way to make interacting with 
> guaranteed
> thread-safe API's interesting and useful, and I'm describing 
> how to do that.

An API cannot be guaranteed thread-safe without some kind of 
contract in it's signature.

>> If you allow implicit conversion, there would literally be no 
>> way of knowing whether some API will access your data 
>> concurrently, other than plain old documentation (or sifting 
>> through it's code, which may not be available). This makes 
>> `shared` useless as a type qualifier.

> That's the whole point though.
> A thread-safe think couldn't care less if the data is shared or 
> not,
> because it's threadsafe.
> Now we're able to describe what's thread-safe, and what's not. 
> This makes shared *useful* as a type qualifier.

You're looking at it backwards, I assume because you keep 
thinking about `shared` in terms of `const`. `shared` thing cares 
if the data is shared or not, to which with implicit casting in 
place there is no compile-time check.

>> I think you will agree that passing a pointer to a 
>> thread-local variable to another thread is not always a safe 
>> thing to do.

> That's the problem I'm trying to resolve by removing all access.
> I'm trying to make that interaction safe, and that's the key to 
> moving forward as I see it.
> If the object is thread-local, then no other thread can access 
> the object in any way, and it's just a fancy int.

Not true if you allow to silently cast a pointer to that fancy 
int to a pointer to a shared int.

>> Conditions do apply, which are on you (the programmer) to 
>> uphold, and the compiler can't help you with that. The only 
>> way the compiler *can* help you here is make sure you don't do 
>> that unintentionally. Which it won't be able to do if you 
>> allow such implicit conversion.

> You need to demonstrate how the implicit conversion may lead to 
> chaos. The conversion is immensely useful, and I haven't 
> thought how it's a problem yet.

I already have. With your implicit promotions, you can literally 
pass off anything as shared. No amount of DIP1000s will safeguard 
from this, because the only ways to pass data to threads is to 
either only use globals, or escape it. Therefore the language 
needs some way of annotating that intent. Currently, `shared` 
does that. If you don't like that, there needs to be some 
alternative.


More information about the Digitalmars-d mailing list