should pure functions accept/deal with shared data?

Alex Rønne Petersen alex at lycus.org
Wed Jun 6 16:01:59 PDT 2012


On 06-06-2012 23:39, Steven Schveighoffer wrote:
> An interesting situation: the current compiler will happily compile
> pure functions that accept shared data.
>
> I believed that when we relaxed the purity rules, shared data would be
> taboo for pure functions, even weak-pure ones. Note that, at least at
> the time, Don agreed with me:
> http://forum.dlang.org/post/i7d60m$2smf$1@digitalmars.com
>
> Now, technically, there's nothing really *horrible* about this, I mean
> you can't really have truly shared data inside a strong-pure function.
> Any data that's marked as 'shared' will not be shared because a
> strong-pure function cannot receive any shared data.
>
> So if you then were to call a weak-pure function that had shared
> parameters from a strong-pure function, you simply would be wasting
> cycles locking or using a memory-barrier on data that is not truly
> shared. I don't really see a compelling reason to have weak-pure
> functions accept shared data explicitly.
>
> *Except* that template functions which use IFTI have good reason to be
> able to be marked pure.
>
> For example:
>
> void inc(T)(ref T i) pure
> {
>     ++i;
> }
>
> Now, we have a template function that we know only will affect i, and
> the compiler enforces that.
>
> But what happens here?
>
> shared int x;
>
> void main()
> {
>     x.inc();
> }
>
> here, T == shared int.
>
> One solution (if shared isn't allowed on pure functions) is to not
> mark inc pure and let its purity be inferred. But then we lose the
> explicit contract, and with it the compiler's help in enforcing purity.
>
> I'll also point out that inc isn't a valid function for data that is
> actually shared: ++i is not atomic. So disallowing shared actually helps
> us in this regard, by refusing to compile a function that would be
> dangerous when used on shared data.

Man, shared is such a mess.

(I'm going to slightly hijack a branch of your thread because I think we 
need to address the below concerns before we can make this decision 
properly.)

We need to be crystal clear on what we're talking about here. People 
usually describe shared as being supposed to insert memory barriers; 
others say that operations on shared data are (or should be) atomic.

(And of course, neither is actually implemented in any compiler, and I 
doubt they ever will be.)

A memory barrier is what the x86 sfence, lfence, and mfence instructions 
represent. They simply make various useful guarantees about ordering of 
loads and stores. Nothing else.

Atomic operations are what the lock prefix is used for, for example 
lock add, lock cmpxchg, etc. These perform a read-modify-write on the 
most recent value at the memory location as a single indivisible step; 
historically the bus was locked for the duration, while modern CPUs 
instead hold the cache line exclusively.

Memory barriers and atomic operations are not the same thing, and we 
should avoid conflating them. Yes, they can be used together to write 
low-level, lock-free data structures, but the use of one does not 
include the other automatically.

(At this point, I probably don't need to point out how x86-biased and 
unportable shared is...)

So, my question to the community is: What should shared *really* mean?

I don't think that having shared imply memory barriers is going to be 
terribly useful to anyone. In fact, I don't know how the compiler would 
even determine where to insert memory barriers efficiently. And 
*actually*, I think memory barriers are really not what people mean at 
*all* when they refer to shared's effect on code generation. I think 
what people *really* want is atomic operations.

Steven, in your particular case, I don't agree entirely. The operation 
can be made atomic quite trivially by implementing inc() like so (for 
the shared int case):

void inc(ref shared int i) pure nothrow
{
    // just pretend the compiler emitted this (assumes 32-bit x86)
    asm
    {
        mov EDX, i;         // EDX = address of i (ref parameter)
        lock;               // make the following inc atomic
        inc int ptr [EDX];  // atomically increment the int at [EDX]
    }
}

But I may be misunderstanding you. Of course, it gets a little more 
complex if you use the result of the ++ operation afterwards, but even 
that is not impossible to do atomically. What can *not* be done is 
performing the increment and loading the result in one single atomic 
instruction (and I suspect this is what you may have been referring 
to). It's worth pointing out that most atomic operations can be 
implemented with a compare-and-swap loop or spin lock (which is 
exactly what core.atomic does for most binary operations), so while it 
cannot be done in one x86 instruction, it can be achieved through such 
a mechanism, and most real-world atomic APIs do this (see 
InterlockedIncrement on Windows, for example).

Further, if shared is going to be useful at all, stuff like this *has* 
to be atomic, IMO.

I'm still of the opinion that bringing atomicity and memory barriers 
into the type system is a horrible can of worms that we should never 
have opened, but now shared is there and we need to make up our minds 
already.

>
> The compiler *currently*, however, will simply compile this just fine.
>
> I'm strongly leaning towards this being a bug that needs to be fixed
> in the compiler.
>
> Some background of why this got brought up:
> https://github.com/D-Programming-Language/druntime/pull/147
>
> Opinions?
>
> -Steve

-- 
Alex Rønne Petersen
alex at lycus.org
http://lycus.org

