should pure functions accept/deal with shared data?
Steven Schveighoffer
schveiguy at yahoo.com
Thu Jun 7 12:55:31 PDT 2012
On Thu, 07 Jun 2012 15:16:20 -0400, Artur Skawina <art.08.09 at gmail.com>
wrote:
> On 06/07/12 20:29, Steven Schveighoffer wrote:
>> I'm not proposing disallowing mutable references, just shared
>> references.
>
> I know, but if a D function marked as "pure" takes a mutable ref (which a
> shared one has to be assumed to be), it won't be treated as really pure
> for optimization purposes (yes, I'm deliberately trying to avoid "strong"
> and "weak").
However, a mutable pure function can be *inside* an optimizable pure
function, and the optimizable function can still be optimized.
A PAS function (pure accepting shared), however, devolves to a mutable
pure function. That is, there is zero advantage of having a pure function
take shared vs. simply mutable TLS.
There is only one reason to mark a function that does not take all
immutable or value type arguments as pure -- so it can be called inside a
strong-pure function. Otherwise, it's just a normal function, and even
marked as pure will not be optimized. You gain nothing else by marking it
pure.
So let's look at two cases. I'll re-state my example, in terms of two
overloads, one which takes shared int and one which takes just int (both
of which do the right thing):
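    import core.atomic : atomicOp;

    // plain increment for thread-local data
    void inc(ref int t) pure
    {
        ++t;
    }

    // atomic increment for shared data (atomicOp takes a binary op and an operand)
    void inc(ref shared(int) t) pure
    {
        atomicOp!"+="(t, 1);
    }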
Now, let's define a strong-pure function that uses inc:
    int slowAdd(int x, int y) pure
    {
        while(y--) inc(x);
        return x;
    }
I think we can both agree that inc *cannot* be optimized away, and that
slowAdd is *fully pure*. That is, slowAdd *can* be optimized away,
even though its call to inc cannot.
Now, what about a strong-pure function using the second (shared) form? A
strong-pure function must have all parameters (and its return type)
immutable or implicitly convertible to immutable.
I'll re-define slowAdd:
    int slowAddShared(int x, int y) pure
    {
        shared int sx = x;
        while(y--) inc(sx);
        return sx;
    }
We can agree for the same reason the original slowAdd is strong-pure,
slowAddShared is strong-pure.
But what do we gain by being able to declare sx shared? We can't return
it as shared, or slowAddShared becomes weak-pure. We can't share it while
inside slowAddShared, because we have no outlet for it, and we cannot
access global variables. In essence, marking sx as shared does
*nothing*. In fact, it does worse than nothing -- we now have to contend
with shared for data that actually is *provably* unshared. In other
words, we are wasting cycles doing atomic operations on a shared type
instead of plain operations on an int. Not only that, but because there are no outlets,
declaring *any* data as shared while inside a strong-pure function is
useless, no matter how we define any PAS functions.
So if shared is useless inside a strong-pure function, and the only point
in marking a non-pure-optimizable function as pure is so it can be called
within a strong-pure function, then pure is useless as an attribute on a
function that accepts or returns shared data. *Every case* where you use
such a function inside a strong-pure function is incorrect.
But functions accepting *mutable* data *are* useful, because they allow us
to modularize pure functions. For example, sort can be (and should be)
pure. Instead of implementing a functional-style sort, or manually
sorting data inside a strong-pure function, we can simply call sort, and
it acts as a component of a strong-pure function, fully optimizable based
on pure optimization rules.
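As a rough sketch of that sort case (the function name median is made up for illustration, and this assumes std.algorithm's sort is inferred pure for a plain int[] with the default predicate, which I believe it is):

```d
import std.algorithm.sorting : sort;

// Strong-pure: the only parameter is immutable data and the result is a value.
// Hypothetical example function, not from any library.
int median(immutable(int)[] values) pure
{
    assert(values.length > 0, "median needs at least one element");
    auto copy = values.dup; // mutable copy that is local to this call
    sort(copy);             // weak-pure: mutates only the data it was handed
    return copy[$ / 2];     // upper middle element after sorting
}
```

The mutation done by sort never escapes median, so callers can still treat median as a function of its arguments only.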
> And any caller will have to obtain this shared ref either from a mutable
> argument or global state. Hence that "pure" function with shared inputs
> will *never* actually be pure. So I'm wondering what would be the gain
> from banning shared in weakly pure functions
What we gain is clarity, and more control over parameter types in
generic code.
If shared is banned, then:
    void inc(T)(ref T t) pure { ++t; }
*always* does the right thing. As the author of inc, I am done. I don't
need template constraints or documentation, or anything else, and I don't
need to worry about users abusing my function. The compiler will enforce
that nobody uses this on shared data, which would require an atomic operation.
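For contrast, here is a sketch of what the author of inc would have to write by hand to get the same guarantee without such a ban. The constraint is my own illustration, not part of the proposal:

```d
// Without a language-level ban, the template author has to reject
// shared arguments explicitly with a constraint.
void inc(T)(ref T t) pure
if (!is(T == shared))
{
    ++t;
}
```

With the constraint in place, inc(x) compiles for an int x, while inc(s) for a shared int s is rejected at compile time because no overload matches.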
> (Ugh, you made me use that word after all ;) ).
I did nothing of the sort :)
> AFAICT you're proposing to forbid something which currently is a NOOP.
It's not a NOOP; marking something as shared means you need special
handling. You can't call most functions or methods with shared data. And
if you do handle shared data, it's not just "the same" as unshared data --
you need to contend with data races, memory barriers, etc. Just because
it's marked shared doesn't mean everything about it is handled.
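A small sketch of what that special handling looks like in practice, using core.atomic (the names counter, bump and current are invented for this example):

```d
import core.atomic : atomicLoad, atomicOp;

// Hypothetical module-level counter visible to all threads.
shared int counter;

// The read-modify-write goes through atomicOp instead of a plain ++.
void bump()
{
    atomicOp!"+="(counter, 1);
}

// The read is also explicit, so the access is atomic.
int current()
{
    return atomicLoad(counter);
}
```

None of this is needed for a thread-local int, which is the point: code written for unshared data does not automatically do the right thing when handed shared data.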
> And the change
> could have consequences for templated functions or lambdas, where "pure"
> is inferred.
I would label those as *helpful* and *positive* consequences ;)
-Steve