Mallocator and 'shared'

Sun Feb 12 16:45:05 PST 2017

On Sunday, 12 February 2017 at 20:08:05 UTC, bitwise wrote:
> It seems like you're saying that 'shared' should mean both 
> 'thread safe' and 'not thread safe' depending on context, which 
> doesn't make sense.

Makes sense to me: A `shared` variable is shared among threads.
Plain accesses to it are not thread-safe. When a function has a
`shared` parameter, it's expected to access that variable in a
thread-safe manner. A `shared` method is a function with a `shared`
`this` parameter.
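
To make that concrete, here is a minimal sketch of a struct whose
`shared` methods take care of thread-safety themselves. `Counter`
and `hammer` are names I made up for illustration; only the
`core.atomic` and `core.thread` calls are real library API.

```d
import core.atomic : atomicLoad, atomicOp;
import core.thread : Thread;

struct Counter
{
    private int count;

    // `this` is shared(Counter) here, so `count` is a shared(int)
    // and has to be touched through atomic operations.
    void increment() shared
    {
        atomicOp!"+="(count, 1);
    }

    int value() shared
    {
        return atomicLoad(count);
    }
}

// Hypothetical helper: several threads bump one shared counter.
int hammer(int nThreads, int perThread)
{
    shared Counter c;
    Thread[] workers;
    foreach (_; 0 .. nThreads)
    {
        workers ~= new Thread({
            foreach (i; 0 .. perThread)
                c.increment();
        });
    }
    foreach (t; workers) t.start();
    foreach (t; workers) t.join();
    return c.value();
}
```

Callers never lock or cast anything. They call `increment` on the
shared object and the method does the rest.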

Considering the alternative, that functions are not expected to 
ensure thread-safety on their `shared` parameters, that would 
mean you have to ensure it at the call site. And then what would 
be the point of marking the parameter `shared`, if thread-safety 
is already ensured?

> Example:
>
> shared A a;
>
> struct A {
>     int x, y;
>
>     void foo() shared {
>         a.x = 1;
>     }
> }
>
> int main(string[] argv) {
>     a.x = 5;
>     a.y = 5;
>     a.foo();
>     return 0;
> }
>
> Qualifying 'a' with 'shared' means that it's shared between 
> threads, which means that accessing it is _not_ thread safe.

Yup. In my opinion, non-atomic accesses like that should be 
rejected by the compiler. But you can write race conditions with 
only atomic stores/loads, so in the end it's the programmer's 
responsibility to write correct code.
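
A sketch of what I mean by racing with only atomic loads and
stores. `claimRacy`, `claim` and the `slot` parameter are
hypothetical names; `atomicLoad`, `atomicStore` and `cas` are from
`core.atomic`.

```d
import core.atomic : atomicLoad, atomicStore, cas;

// Racy even though every single access is atomic: two threads can
// both load 0 and then both store their own id.
bool claimRacy(ref shared int slot, int id)
{
    if (atomicLoad(slot) == 0)
    {
        atomicStore(slot, id);
        return true;
    }
    return false;
}

// Compare-and-swap does the check and the write as one step, so at
// most one caller can win.
bool claim(ref shared int slot, int id)
{
    return cas(&slot, 0, id);
}
```

The compiler has no way of knowing that the check and the store in
`claimRacy` belong together. Only the programmer knows that.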

> Since the method 'foo' accesses 'a', 'foo' is also _not_ thread 
> safe.

Well, yes, that `foo` isn't thread-safe. But it should be written 
differently so that it is.

> Given that both the data and the method are 'shared', a caller 
> should know that race conditions are possible and that they 
> should acquire a lock before accessing either of them...or so it
> would seem.

But when you have the lock, you can safely call any method, 
including non-`shared` ones. I see no point in distinguishing 
`shared` and unshared methods then.

Non-`shared` methods are obviously not safe to call on `shared`
objects. So `shared` methods must be the other thing: safe.

> I imagine that qualifying a method with 'shared' should mean 
> that it can access shared data, and hence, is _not_ thread safe.

Every function/method can access shared data. They're all 
expected to do it safely. The `shared` attribute just qualifies 
the `this` reference.

> This would prevent access to 'shared' data from any non 'shared'
> context, without some kind of bridge/cast that a programmer
> would use when they knew that they had acquired the lock or
> ownership of the data. Although this is what would make sense 
> to me, it doesn't seem to match with the current implementation 
> of 'shared', or what you're saying.

It wouldn't exactly "prevent" it, would it? The compiler can't 
check that you've got the correct lock. It would be expected of 
the programmer to do so before calling the `shared` method. 
That's easy to get wrong.

When `shared` methods are safe themselves, you can't get the 
calls to them wrong. The ugly is nicely contained. To call an 
unsafe method, you have to cast and that's a good indicator that 
you're entering the danger zone.

> It seems that methods qualified with 'shared' may be what 
> you're suggesting matches up with the 'bridge' I'm trying to 
> describe, but again, using the word 'shared' to mean both 
> 'thread safe' and 'not thread safe' doesn't make sense.

Maybe don't think of it as meaning "safe" or "unsafe" then. It
just means "shared".

A `shared` variable is just that: shared. The way you deal with 
it can be thread-safe or not. Everyone is expected to deal with 
it safely, though. "Everyone" includes `shared` methods.

> Firstly, because the same keyword should not mean two strictly 
> opposite things. Also, if a 'shared' method is supposed to be 
> thread safe, then the fact that it has access to shared data is 
> irrelevant to the caller.

Non-`shared` methods are not thread-safe. They expect unshared 
data. You can still call them on shared objects, though, with a 
cast. And when you've ensured thread-safety beforehand, it's 
perfectly fine to do so.
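
Roughly like this, as a sketch. `Stats`, `stats`, `statsLock` and
`record` are hypothetical names, and I'm assuming that every access
to `stats` goes through the same lock.

```d
import core.sync.mutex : Mutex;

struct Stats
{
    int samples;
    int total;

    // Not `shared`: assumes the caller has exclusive access.
    void add(int value)
    {
        ++samples;
        total += value;
    }
}

shared Stats stats;        // visible to all threads
__gshared Mutex statsLock; // hypothetical lock guarding `stats`

shared static this()
{
    statsLock = new Mutex;
}

void record(int value)
{
    synchronized (statsLock)
    {
        // Thread-safety is ensured by the lock, so casting away
        // `shared` is fine for as long as the lock is held.
        (cast(Stats*) &stats).add(value);
    }
}
```

The cast is the part that stands out in review. If it shows up
without a lock around it, something is probably wrong.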

If `shared` methods were unsafe too, then all that would achieve
is allowing unsafe code to be called without a cast. Doesn't seem
like an improvement.

> So 'shared' as a method qualifier doesn't really make sense. 
> What would make more sense is to have a region where 'shared' 
> data could be accessed - Maybe something like this:
>
> struct S {
>     shared int x;
>     Lock lk;
>
>     private void addNum(int n) shared {
>         x += n;
>     }
>
>     int add(int a, int b)
>     {
>         shared {
>             lk.lock();
>             addNum(a);
>             addNum(b);
>             lk.unlock();
>         }
>     }
> }
>
> So above,
> 1) 'x' would be shared, and mutating it would not be thread safe.

As it is now.

> 2) 'addNum' would have access to 'shared' data, and also be 
> non-thread-safe

Today, non-`shared` methods are unsafe, and they can access 
shared data just like `shared` methods. But I imagine you'd have 
`shared` methods alter `shared` data freely, without casting.

> 3) 'x' and 'addNum' would be inaccessible from 'add' since 
> they're 'shared'

As it is now. You can't just call a `shared` method from a
non-`shared` one.

> 4) a 'shared' block inside 'add' would allow access to 'x' or 
> 'addNum', with the responsibility being on the programmer to 
> lock.

So the `shared` block as a whole is thread-safe, and it's the
programmer's duty to make sure of that. Today, a `shared` method
as a whole is thread-safe, and it's the programmer's duty to make
sure of that. Not much of a difference, is it?
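
For comparison, this is roughly how I'd write your `S` with
today's `shared`. It's only a sketch: I've made `add` return the
new value of `x`, which your version doesn't specify, and I use a
bare `synchronized` statement instead of your `Lock`. That
statement gets one global mutex of its own, so it serializes all
instances of `S` and protects `x` only as long as nothing else
touches it.

```d
struct S
{
    private int x;

    // Non-shared helper: only valid while the caller has exclusive
    // access to `this`.
    private void addNum(int n)
    {
        x += n;
    }

    // The `shared` method as a whole is the thread-safe unit.
    int add(int a, int b) shared
    {
        int result;
        synchronized
        {
            // Lock is held, so cast away `shared` for the helper.
            auto self = cast(S*) &this;
            self.addNum(a);
            self.addNum(b);
            result = self.x;
        }
        return result;
    }
}
```

The locking and the cast live inside `add`, in the same place where
your `shared` block would be.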

> 5) alternatively, 'shared' data could be accessed from within a 
> 'synchronized' block.
>
> I thought 'shared' was a finished feature, but it's starting to 
> seem like it's a WIP. This kind of feature seems like it has 
> great potential, but is mostly useless in its current state.

That may be so or not. I don't think you've made an argument for 
"unfinished" or "useless", though. You've argued "inconsistent", 
and maybe "surprising" or simply "bad". I wouldn't expect further 
development of the feature to meet your vision.

> After more testing with shared, it seems that 'shared' data is 
> mutable from many contexts, from which it would be unsafe to 
> mutate it without locking first, which basically removes any 
> guarantee that would make 'shared' useful.

There is no guarantee of thread-safety, yes. There cannot be, as 
far as I understand, because the compiler cannot know which 
operations must happen without interruption.

However, as I've said above, I'd like non-atomic accesses of
`shared` variables to be rejected. Non-atomic increments are
already rejected, so it makes no sense to me that non-atomic
writes and reads are allowed.
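
A small sketch of the inconsistency. `level`, `bump`, `reset` and
`current` are hypothetical names. The commented-out lines describe
what I see with default compiler flags; other compiler versions or
switches may behave differently.

```d
import core.atomic : atomicLoad, atomicOp, atomicStore;

shared int level; // hypothetical shared variable

int bump()
{
    // level++;           // rejected: read-modify-write on `shared`
    // level = level + 1; // accepted, but just as racy
    return atomicOp!"+="(level, 1);
}

void reset()
{
    // level = 0;         // accepted, plain write
    atomicStore(level, 0);
}

int current()
{
    // return level;      // accepted, plain read
    return atomicLoad(level);
}
```

I'd prefer it if only the `core.atomic` forms (or a cast) were
accepted for all three operations.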

Overall, I think `shared` is solid. It's not magic. It mainly 
prevents you from accidentally doing unsafe stuff by highlighting 
shared data and forcing you to cast or use special functions that 
deal safely with shared data.