Something needs to happen with shared, and soon.

Wed Nov 14 07:27:10 PST 2012

On 14-11-2012 16:08, Andrei Alexandrescu wrote:
> On 11/14/12 6:39 AM, Alex Rønne Petersen wrote:
>> On 14-11-2012 15:14, Andrei Alexandrescu wrote:
>>> On 11/14/12 1:19 AM, Walter Bright wrote:
>>>> On 11/13/2012 11:56 PM, Jonathan M Davis wrote:
>>>>> Being able to have double-checked locking work would be valuable, and
>>>>> having
>>>>> memory barriers would reduce race condition weirdness when locks
>>>>> aren't used
>>>>> properly, so I think that it would be desirable to have memory
>>>>> barriers.
>>>>
>>>> I'm not saying "memory barriers are bad". I'm saying that having the
>>>> compiler blindly insert them for shared reads/writes is far from the
>>>> right way to do it.
>>>
>>> Let's not hasten. That works for Java and C#, and is allowed in C++.
>>>
>>> Andrei
>>>
>>>
>>
>> I need some clarification here: By memory barrier, do you mean x86's
>> mfence, sfence, and lfence?
>
> Sorry, I was imprecise. We need to (a) define intrinsics for loading and
> storing data with high-level semantics (a short list: acquire, release,
> acquire+release, and sequentially-consistent) and THEN (b) implement the
> needed code generation appropriately for each architecture. Indeed on
> x86 there is little need to insert fence instructions, BUT there is a
> definite need for the compiler to prevent certain reorderings. That's
> why implementing shared data operations (whether implicit or explicit)
> as sheer library code is NOT possible.

Let's continue this part of the discussion in my other reply (the one 
explaining how core.atomic is implemented in the various compilers).
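
For concreteness, here is a minimal sketch of the double-checked locking
case mentioned earlier in the thread, written against C++11's std::atomic
rather than D, since C++ already exposes the acquire/release orderings
listed above. The names Config, get_config, g_instance, g_lock and
g_constructions are made up for illustration and are not anyone's actual
API. It only shows where the acquire load and the release store would sit.

```cpp
#include <atomic>
#include <mutex>

// Hypothetical payload type, used only for this illustration.
struct Config {
    int value;
};

static std::atomic<Config*> g_instance{nullptr};  // published pointer
static std::mutex g_lock;                         // guards construction
static std::atomic<int> g_constructions{0};       // counts constructions

// Double-checked locking: the fast path is a single acquire load.
Config* get_config() {
    // Acquire: if we see a non-null pointer, we also see the fully
    // constructed Config that the release store below published.
    Config* p = g_instance.load(std::memory_order_acquire);
    if (p == nullptr) {
        std::lock_guard<std::mutex> guard(g_lock);
        // Second check under the lock; the mutex already orders this
        // load against any earlier holder, so relaxed is enough here.
        p = g_instance.load(std::memory_order_relaxed);
        if (p == nullptr) {
            p = new Config{42};  // intentionally never freed (singleton)
            g_constructions.fetch_add(1, std::memory_order_relaxed);
            // Release: the initialization above may not be reordered
            // past this store, by the compiler or by the CPU.
            g_instance.store(p, std::memory_order_release);
        }
    }
    return p;
}
```

On x86 both the acquire load and the release store compile down to plain
moves, so the cost there is in the reorderings the compiler has to give
up rather than in any fence instruction. On weaker architectures the same
source may need actual barrier instructions.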

>
>> Because as Walter said, inserting those blindly when unnecessary can
>> lead to terrible performance because it practically murders
>> pipelining.
>
> I think at this point we need to develop a better understanding of
> what's going on before issuing assessments.

I dunno. On low-end architectures like ARM, the freedom to reorder
memory operations is pretty much what makes them usable at all, because
they don't have the raw power that x86 does (I even recall an ARM
Holdings executive saying that they couldn't possibly switch to a
strong memory model without severely reducing the efficiency of ARM).
So I'm just putting that out there - it's definitely worth taking into
consideration, because very few architectures have a memory model as
strongly ordered as x86's.

>
>
> Andrei

-- 
Alex Rønne Petersen
alex at lycus.org
http://lycus.org