valid uses of shared

Fri Jun 8 16:55:29 PDT 2012

On 06/08/12 23:59, Steven Schveighoffer wrote:
> On Fri, 08 Jun 2012 15:30:26 -0400, Artur Skawina <art.08.09 at gmail.com> wrote:
> 
>> On 06/08/12 19:13, Steven Schveighoffer wrote:
>>> On Fri, 08 Jun 2012 10:57:15 -0400, Artur Skawina <art.08.09 at gmail.com> wrote:
>>>
>>>> On 06/08/12 06:03, Steven Schveighoffer wrote:
>>>
>>>>>
>>>>> I agree that the type should be shared(int), but the type should not transfer to function calls or auto, it should be sticky to the particular variable.  Only references should be sticky typed.
>>>>
>>>> The problem with this is that it should be symmetrical, IOW the conversion
>>>> from non-shared to shared would also have to be (implicitly) allowed.
>>>> A type that converts to both would be better, even if harder to implement.
>>>
>>> It should be allowed (and is today).
>>
>> Hmm. I think it shouldn't be. This is how it is today:
>>
>>    shared Atomic!int ai;
>>    shared Atomic!(void*) ap;
>>
>>    void f(Atomic!int i) {} // Atomic() struct template temporarily made unshared for this test.
>>    void fp(Atomic!(void*) i) {}
>>
>>    void main() {
>>       f(ai);
>>       fp(ap);
>>    }
>>
>>    Error: function f (Atomic!(int) i) is not callable using argument types (shared(Atomic!(int)))
>>    Error: cannot implicitly convert expression (ai) of type shared(Atomic!(int)) to Atomic!(int)
>>    Error: function fp (Atomic!(void*) i) is not callable using argument types (shared(Atomic!(void*)))
>>    Error: cannot implicitly convert expression (ap) of type shared(Atomic!(void*)) to Atomic!(void*)
> 
> This is a bug (a regression, actually).
> 
> I tested this simple code:
> 
> struct S
> {
>     int i;
> }
> void main()
> {
>     shared(S) s;
>     S s2 = s;
> }
> 
> which fails on 2.057-2.059, but passes on 2.056
> 
> Looking at the changelog, looks like there were some changes related to shared and inout in 2.057.  Those probably had a hand in this.
> 
> I'll file a bug to document this regression.

FWIW i can't think of a specific case where allowing the implicit 
conversions would cause problems, it's just that I'm not sure they
are absolutely always safe. Things like having *both* shared and
unshared versions of a type shouldn't be impossible just because
of this.

>> It seems to work for built-in value types, which i didn't even realize, because the thought of
>> using them with 'shared' never crossed my mind. I don't really see why those should be treated
>> differently from user defined types, which should not allow implicit shared<>unshared conversions.
> 
> They shouldn't be treated differently, everything should implicitly convert.  Here is why:
> 
> shared means "shared".  If you make a copy of the value, that's your private copy, it's not shared!  So there is no reason to automatically mark it as shared (you can explicitly mark it as shared if you want, but I contend this shouldn't be valid on stack data).
> 
> Now, a shared *Reference* needs to keep the referred data as shared.  So for instance, shared(int) * should *not* implicitly cast to int *.
> 
> It's the same rules for immutable.  You can assign an immutable int to an int no problem, because you aren't affecting the original's type, and the two are not referencing the same data.

That's all obvious; it's not the trivial scenarios that i'm worried about.
It's things like having both "global" and "local" versions of a type, 
which are then accessed differently. Implicitly converting between those
would be unsafe.
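
For reference, the "trivial" case under discussion can be sketched as below.
This is only an illustration, assuming a module-level variable named `total`
(a hypothetical name) and using core.atomic for the read; it is not code from
either of our test programs.

```d
import core.atomic : atomicLoad;

shared int total;            // hypothetical module-level variable, visible to all threads

void main()
{
    // Copying the value yields a private, thread-local int.
    int snapshot = atomicLoad(total);
    snapshot += 1;                         // only the copy changes
    assert(atomicLoad(total) == 0);

    // The value conversion is implicit for built-in value types...
    static assert(is(shared(int) : int));

    // ...but a pointer to shared data keeps its qualifier on the pointee.
    shared(int)* p = &total;
    static assert(is(typeof(&total) == shared(int)*));
    static assert(!is(shared(int)* : int*));
    assert(atomicLoad(*p) == 0);
}
```

The case I am worried about, a type with separate shared and unshared
variants that are accessed differently, is not covered by this sketch.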

>>> I am talking about stripping head-shared, so shared(int *) automatically converts to shared(int)* when used as an rvalue.
>>
>> Where would the 'shared(int*)' type come from? IOW, given 'shared struct S { int i; } S s;'
>> what would the type of '&s.i' be? In your model; because right now it is 'shared(int)*'.
> 
> struct S
> {
>     int *i;
> }
> void main()
> {
>     shared(S) s;
>     auto x = s.i;
>     pragma(msg, typeof(x).stringof); // prints shared(int *)
> }

Right. Would you have a problem with disallowing accessing 's.i' like that? ;)
Because that's (part of) my point - "raw" access to shared data (the pointer
in this case) is not really safe. Which is why I'd prefer not to drop the shared
qualifier from the head unless absolutely necessary. That operation is of course
safe in itself - the issue is that it encourages writing code like in your example.
Where the better alternatives would be either wrapping the type (pointer, here),
or using some kind of accessor. Imagine auditing code that contains lots of these
direct shared accesses; could you assume that every person who touched it knew
what he/she was doing, that all required memory barriers are there etc?..
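
A minimal sketch of the accessor approach, assuming a wrapper named `Guarded`
(a hypothetical name, and not the Atomic template mentioned earlier in this
thread) that only exposes the payload through atomic operations:

```d
import core.atomic : atomicLoad, atomicStore;

// Hypothetical wrapper for illustration only.
struct Guarded(T)
{
    private T payload;

    // These methods are callable only on shared instances,
    // and every access goes through core.atomic.
    auto get() shared { return atomicLoad(payload); }
    void set(T v) shared { atomicStore(payload, v); }
}

shared Guarded!int level;    // hypothetical module-level instance

void main()
{
    level.set(3);
    assert(level.get() == 3);
}
```

With something like this, an audit only has to check the wrapper, not every
place that touches the data.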

If you're saying that forbidding direct access can't be done by default for
backward compatibility reasons, then I'm not really disagreeing. It's just
that you are proposing a new shared model, and in that case i think the order
should be 1) figuring out what a perfect one should look like and 2) making
necessary compromises. And we're IMHO not at that second stage yet. :)

>>>>> Right, I was thinking shared structs do not make sense, since I don't think shared members make sense.  Either a whole struct/class is shared or it is not.  Because you can only put classes on the heap, shared makes sense as an attribute for a class.
>>>>>
>>>>> But then again, it might make sense to say "this struct is only ever shared, so it should be required to go on the heap".  I like your idea later about identifying shared struct types that should use synchronization.
>>>>
>>>> Of course shared structs make sense, it's what allows implementing any
>>>> non-trivial shared type.
>>>>
>>>>    static Atomic!int counter;
>>>>
>>>> inside a function is perfectly fine. And, as somebody already mentioned
>>>> in this thread, omitting 'static' should cause a build failure; right
>>>> now it is accepted, even when written as
>>>>
>>>>    shared Atomic!int counter;
>>>>
>>>> The problem? 'shared' is silently dropped. Move the counter from a struct
>>>> into a function after realizing it's only accessed from one place, forget
>>>> to add 'static' - and the result will compile w/o even a warning.
>>>
>>> The difference is that static is not a type constructor.
>>
>> The problem is that 'shared' is lost, resulting in an incorrect program.
>> When you explicitly declare something as shared the compiler better treat
>> it as such, or fail to compile it; silently changing the meaning is never
>> acceptable.
> 
> later:
> 
>> That was misleading; "shared" isn't actually lost, but as the variable is
>> placed on the stack it becomes effectively thread local, which can be very
>> unintuitive. But i can't think of an easy way to prevent this mistake, while
>> still allowing shared data to be placed on the stack. And the latter can
>> be useful sometimes...
> 
> I don't think it's unintuitive at all.  shared *is* lost because it's *no longer shared*.  It makes perfect sense to me.
> 
> I also don't think it is a mistake.  I frequently use the pattern of capturing the current state of a shared variable to work with locally within a function.  Normally, in C or C++, there is no type difference between shared and unshared data, so it's just an int in both cases.  However, while I'm working with my local copy, I don't want it changing, and it shouldn't be.
> 
> A mistake is to mark it shared, because then I can send it to another thread possibly inadvertently.

The 'unintuitive' thing about it is having data typed as shared that in
reality is thread local, that's all. It's unintuitive, but not actually
wrong.

>>> e.g.:
>>>
>>> shared int x; // typeof(x) == int
>>
>> This could be made illegal, but if it is accepted then it should retain its type.
>>
>>
>>> void foo(shared int *n){...}
>>>
>>> foo(&x); // compiler error?  huh?
>>>
>>> I think this is a no-go.  Shared has to be statically disallowed for local variables.
>>
>> It's a possibility. Except _static_ local variables, those must work.
> 
> static is different, because they are not local, they are global.  Again, this comes down to a storage class vs. a type constructor.
> 
> All that is different is that the symbol is local, it's still put in the global segment.

I'm apparently not making myself clear, sorry about that. I'm only trying
to make sure that you don't propose to ban "shared" from local static data.
It seemed you wanted to disallow a lot of things for apparently no, or no
substantial, reason.
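
To illustrate what must keep working, here is a small sketch, assuming a
function named `bump` and a helper `worker` (both hypothetical names), of a
function-local static that is shared between threads:

```d
import core.atomic : atomicOp;
import core.thread : Thread;

int bump()
{
    // `static shared`: a single instance in global storage, seen by every thread.
    // A plain `static int` would be thread-local in D instead.
    static shared int hits;
    return atomicOp!"+="(hits, 1);    // returns the updated value
}

void worker()
{
    bump();
}

void main()
{
    auto t = new Thread(&worker);
    t.start();
    t.join();
    assert(bump() == 2);   // the worker's increment is visible here
}
```

Only the symbol is local to the function; the storage is not on the stack.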

>>>> If 'shared(VT)' implicitly converts to VT, then
>>>>
>>>>    auto myY = x.y; // typeof(myY) == shared(int)
>>>>
>>>> would still be fine.
>>>
>>> No, because then &myY yields a reference to shared data on the stack, which is what I think should be disallowed.
>>
>> The only problem with shared data on the stack i can think of is portability.
>> But this is something that can be decided at a much later time, it wouldn't
>> be used much in practice anyway.
> 
> It's the same problem as taking addresses of stack variables.  It's allowed, and can be valid in some cases, but it will cost you dearly if you get it wrong.  You are better off allocating on the heap, or using library constructs that know what they are doing.

In fact your argument for forbidding shared data on the stack inspired me,
so i created a relatively safe API that allows me to do exactly that...
I originally wanted to ban it too, because of the 'unintuitiveness' of it,
but now actually have code that uses it and works. Convincing me now that
it shouldn't be allowed won't work. :)

Thanks for the idea; it's not something i would have even considered
doing in another language; the fact that D makes implementing it possible
(and efficient) in about half a page of code and a one-liner in the callers
is why I'm still around here, despite all the language holes.

>>>> But I'm not sure allowing these implicit conversions is a good idea.
>>>> At least not yet. :)
>>>
>>> Implicit conversions to and from shared already are valid.  i.e. int x = sharedInt; is valid code.
>>
>> yes, but see above. shared(BVT)->BVT and shared(P*)->shared(P)* are allowed,
>> and i don't think the latter is necessarily sound. Yes, the current shared
>> model practically requires this, but i don't think raw access to shared data
>> is the best approach.
> 
> It's not raw access; as soon as you create an rvalue, it's no longer aliased to the shared data.  shared(P)* is its own copy of the pointer.  In other words, it's no longer shared, so shared should be stripped.  However, what it *points* to is still shared, and still maintains the shared attribute.

It's the act of retrieving that pointer that I'd like to make safer.

> Making shared storage illegal on the stack is somewhat orthogonal to this.  While I can see where having shared stack data is useful, it's completely incorrect to forward shared attributes on copies of data.
> 
> But it's so hard to guarantee that the stack variable storage does not go away, especially when you have now thrown it out to another thread, which may or may not tell you when it's done with it, that I think it should be made illegal.

No! ;)

> At the very least, it should be illegal in @safe code.

Most certainly.

>> Note that inside synchronized() statements such conversions would be fine.
> 
> I think you are not understanding the storage aspect.

I don't care about the storage aspect. :) We're talking about different things,
maybe my explanation above made things clearer, at least I hope it did.

The reason for which conversions inside a synchronized block are safer is
the fact that it could be seen as if the current thread "owns" the data; at 
least if there's a clear monitor<->data dependency.
But I think that approach wouldn't necessarily work, for example when dealing
with semaphores, hence i must retract that statement; they are not always "fine".

>>> I'm talking about changing the types of expressions, such that the expression type is always the tail-shared version.  In fact, simply using a shared piece of data as an rvalue will convert it into a tail-shared version.
>>
>> Could you provide an example? Because I'm not sure what problem this is supposed
>> to solve. Eg. what is "a shared piece of data" and where does it come from?
> 
> If the above replies haven't responded enough, I will elaborate, let me know (responded while reading, I probably should have read the whole post first ;)

I was just wondering if you had any other case in mind, other than directly
reading 'shared' data. Adding compiler magic to do that safely, which seems
to be something that is still seriously considered, would add way too much
overhead. It would make 'shared' fine for toy examples, but inappropriate
for real code.

artur