should pure functions accept/deal with shared data?

Wed Jun 6 19:14:25 PDT 2012

On 07-06-2012 03:55, Andrei Alexandrescu wrote:
> On 6/6/12 8:19 PM, Alex Rønne Petersen wrote:
>> On 07-06-2012 03:11, Andrei Alexandrescu wrote:
>>> On 6/6/12 6:01 PM, Alex Rønne Petersen wrote:
>>>> (At this point, I probably don't need to point out how x86-biased and
>>>> unportable shared is.....)
>>>
>>> I confess I'll need that spelled out. How is shared biased towards x86
>>> and nonportable?
>>>
>>> Thanks,
>>>
>>> Andrei
>>
>> The issue lies in its assumption that the architecture being targeted
>> supports atomic operations and/or memory barriers at all. Some
>> architectures plain don't support these, others do, but for certain data
>> sizes like 64-bit ints, they don't, etc. x86 is probably the
>> architecture that has the best support for low-level memory control as
>> far as atomicity and memory barriers go.
>
> Actually x86 is one of the more forgiving architectures (most code works
> even when written without barriers). Indeed we assume the target
> architecture supports double-word atomic load.

And if cent/ucent ever get implemented (which does seem likely, although 
they're low-prio), we'll have to assume 128-bit too. Here Be Dragons. ;)

>
>> The problem is that shared is supposed to guarantee that operations on
>> shared data *always* obey whatever atomicity/memory barrier rules we
>> end up defining for it (obviously we don't want generated code to have
>> different semantics across architectures due to subtle issues like the
>> lack of certain operations in the ISA). Right now, based on what I've
>> read in the NG and on mailing lists, people seem to assume that shared
>> will provide full-blown x86-level atomicity and/or memory barriers.
>> Providing these features on e.g. ARM is a pipe dream at best (for
>> instance, ARM has no atomic load for 64-bit values).
>
> http://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html mentions that
> there is a way to implement atomic load for 64-bit values.

You learn something new every day! When we did research for MCI's atomic 
intrinsics, we didn't notice these instructions on ARM. Thanks for the link.

This covers most significant architectures today, but I'm still worried 
about e.g. Super-H, Alpha, SPARC, MIPS, and others that are listed on 
http://dlang.org/version.html (I think that at least SPARC lacks 
double-word atomic load/store).

>
>> All this being said, shared could probably be implemented with plain old
>> locks on these architectures if correctness is the only goal. But, from
>> a more pragmatic point of view, this would completely butcher
>> performance and add potential for deadlocks, and all other issues
>> associated with thread synchronization in general. We really shouldn't
>> have such a core feature of the language fall back to a dirty hack like
>> this on low-end/embedded architectures (where performance of this kind
>> of stuff is absolutely critical), IMO.
>
> That's how C++'s atomic<T> does things, by the way. But I sympathize
> with your viewpoint that there should be no hidden locks. We could
> define shared to refuse compilation on odd machines, and THEN provide an
> atomic template with the expected performance of a lock.

That may be a reasonable approach. But if we do this, I think we need to 
revisit the core.atomic API, since it unnecessarily requires the shared 
qualifier for some things (just because shared overall isn't useful on a 
target architecture doesn't mean that e.g. a 32-bit atomic load can't be 
done on it).
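
To make the distinction concrete, here is a minimal sketch in C++ terms
(since atomic<T> came up above). It is not the core.atomic API; the
names WordCounter, has_lock_free_64 and self_check are made up for
illustration. It assumes a target where int is word-sized. The
static_assert plays the role of "refuse compilation on odd machines",
while the 64-bit case is only queried at run time, since that is the
width that may need a hidden lock:

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Compile-time gate: ATOMIC_INT_LOCK_FREE == 2 means atomic<int> is
// always lock-free on this target. Anything else fails the build
// instead of silently falling back to a lock.
static_assert(ATOMIC_INT_LOCK_FREE == 2,
              "word-sized atomics would need a hidden lock on this target");

// Hypothetical type: a word-sized atomic value that does not depend on
// any language-level "shared" qualifier to be usable.
struct WordCounter {
    std::atomic<int> value{0};

    // Acquire load: later reads in this thread cannot move before it.
    int load() const { return value.load(std::memory_order_acquire); }

    // Release store: earlier writes in this thread cannot move after it.
    void store(int v) { value.store(v, std::memory_order_release); }
};

// Hypothetical helper: reports whether 64-bit atomics are lock-free
// here. A false result means the library implements them with a lock.
inline bool has_lock_free_64() {
    std::atomic<std::uint64_t> probe{0};
    return probe.is_lock_free();
}

// Small sanity check: a stored value is read back unchanged.
inline void self_check() {
    WordCounter c;
    c.store(41);
    assert(c.load() == 41);
}
```

The point of the split is that the word-sized case can be guaranteed
up front, whereas the double-word case has to be treated as a
per-target property.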

>
>
> Andrei
>

-- 
Alex Rønne Petersen
alex at lycus.org
http://lycus.org