Expanding the horizons of D purity

H. S. Teoh hsteoh at quickfur.ath.cx
Thu Oct 31 13:05:22 PDT 2013


[I actually came up with this idea last week, but decided to postpone
bringing it up until all the furor about Andrei's new allocator design
has settled a little. ;-)]

One of the neatest things about purity in D is that traditionally impure
operations like mutation and assignment can be allowed inside a pure
function, as long as the effect is invisible to the outside world. This,
of course, describes strong purity. Weak purity takes it one step
further, by allowing mutation of outside state via references to mutable
data passed in as function arguments.
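
To make the two flavours concrete, here is a minimal sketch. The function names (sumTo, fill, demo) are made up for illustration and do not come from any library.

```d
// Strongly pure: only value parameters; all mutation is on locals.
int sumTo(int n) pure
{
    int total = 0;
    foreach (i; 1 .. n + 1)
        total += i;        // mutation, but invisible to the caller
    return total;
}

// Weakly pure: may mutate the caller's data, but only through the
// mutable reference it was handed.
void fill(int[] buf, int value) pure
{
    foreach (ref e; buf)
        e = value;
}

// A strongly pure function may use a weakly pure helper on its own
// local data without losing strong purity.
int demo() pure
{
    auto buf = new int[](3);
    fill(buf, sumTo(3));
    return buf[0] + buf[1] + buf[2];
}
```

Here demo() stays strongly pure even though fill() writes through a reference, because the only memory fill() can reach is local to demo().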

I'd like to propose extending the scope of weak purity one step further:
allow weakly-pure functions to call (not necessarily pure) delegates
passed in as parameters. That is, the following code should work:

	// N.B. This is a (weakly) pure function.
	void func(scope void delegate(int) dg) pure
	{
		// N.B. This calls an *impure* delegate.
		dg(1);
	}

Before you break out the pitchforks, please allow me to justify this
proposal.

The above code is essentially equivalent to:

	void func(void *context, scope void function(void*,int) dg) pure
	{
		dg(context, 1);
	}

That is to say, passing in a delegate is essentially equivalent to
passing in a mutable reference to some outside state (the delegate's
context), and a pointer to a function that possibly mutates the outside
world through that context pointer. In a sense, this is not that much
different from a weakly pure function that directly modifies the outside
world via the context pointer.
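
The context/function-pointer decomposition can be observed directly through the .ptr and .funcptr properties of a delegate. The sketch below uses a made-up Counter struct and a made-up demoParts function. It is @system code and only illustrates the point; it is not a recommended way to build delegates.

```d
struct Counter
{
    int hits;
    void bump(int n) { hits += n; }
}

int demoParts()
{
    Counter c;
    void delegate(int) dg = &c.bump;

    // The context half of the delegate is just the address of c.
    assert(dg.ptr is cast(void*) &c);

    // Rebuild an equivalent delegate from its two halves.
    void delegate(int) rebuilt;
    rebuilt.ptr = dg.ptr;
    rebuilt.funcptr = dg.funcptr;

    dg(1);       // mutates c through the context pointer
    rebuilt(2);  // same context, same function
    return c.hits;
}
```

Both calls reach c only through the context pointer, which is exactly the kind of mutable reference a weakly pure function is already allowed to receive.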

But, I hear you cry, if func calls an *impure function* via a function
pointer, doesn't that already violate purity??!

Well, it certainly violates *strong* purity, no question about that. But
consider this code:

	int stronglyPure(int x) pure
	{
		int[] scratchpad;
		scratchpad.length = 2;

		// This is an impure delegate because it closes over
		// scratchpad. (The parameter is named i so that it does
		// not shadow stronglyPure's own x.)
		auto dg = (int i) { scratchpad[i]++; };

		// Should this work?
		func(dg);

		return scratchpad[1];
	}

Think about it.  What func does via dg can only ever affect a variable
local to stronglyPure(). It's actually impossible for stronglyPure() to
construct a delegate that modifies a global variable, because the
compiler will complain that referencing a global is not allowed inside a
pure function (verified on git HEAD). Any delegate that stronglyPure()
can construct can only ever affect its local state. The only way you
could sneak an impure delegate into func() is if stronglyPure() itself
takes an impure delegate as parameter -- but if it does so, then it is
no longer strongly pure.

IOW, if stronglyPure() is truly strongly pure, then it is actually
impossible for the call to func() to have any effect outside of
stronglyPure()'s local scope, no matter what kind of delegate
stronglyPure() passes to func(). So such a call should be permitted!
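
For comparison, here is a variation that I would expect a recent compiler to accept without any language change, on the assumption that attribute inference marks the delegate literal as pure. The difference from the proposal is that the delegate *type* carries pure, which is precisely the restriction the proposal wants to lift. The names funcPureDg and stronglyPureToday are hypothetical.

```d
// The delegate parameter is itself typed as pure here.
void funcPureDg(scope void delegate(int) pure dg) pure
{
    dg(1);
}

int stronglyPureToday() pure
{
    int[] scratchpad;
    scratchpad.length = 2;

    // Assumption: the literal touches only a local of the enclosing
    // pure function, so its purity is inferred.
    auto dg = (int i) { scratchpad[i]++; };

    funcPureDg(dg);
    return scratchpad[1];
}
```

The effect of the call is confined to scratchpad, which is the same locality argument made above for the unannotated delegate.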

Now let's consider the case where we pass a delegate to func() that
*does* modify global state:

	int global_state;
	void main() {
		func((int x) { global_state = x; });
	}

In this case, func being marked pure doesn't really cause any issues:
main() itself is already impure because it is constructing a delegate
that closes over a global variable, so the fact that the actual change
comes from calling func no longer matters. It's always OK for impure
code to call pure code, after all. It's no different from this:

	void weaklyPure(int* x) pure {
		*x = 1;	// OK
	}

	int global_state;
	void main() {
		weaklyPure(&global_state);
	}

That is to say, as long as the code that calls func() is marked pure,
the behaviour of func() is guaranteed never to affect anything outside
the local scope of the caller (and whatever the caller can reach via
mutable reference parameters). That is, func() is (at least) weakly
pure. If the caller is strongly pure (no mutable indirections in its
parameters -- and this includes delegates), then func() is guaranteed
never to cause side-effects outside its caller. Therefore, it should be
permissible to mark func() as pure.

//

Why is this important? Well, ultimately the motivation for pushing the
envelope in this direction is due to functions of this sort:

	void toString(scope void delegate(const(char)[]) dg) {
		dg(...);
	}

By allowing this function to be marked pure, we permit it to be called
from pure code (which, as argued above, remains actually pure). Or, put
another way, we permit template functions that call
toString with a delegate that updates a local variable to be inferred as
pure. This allows more parts of std.format to be pure, which in turn
expands the usability of things like std.conv.to in pure code.
Currently, to!string(3.14f) is impure due to std.format ultimately
calling a toString function like the above, but there is absolutely no
reason why computing the string representation of a float can't be made
pure. Implementing this proposal would resolve this problem.
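
As an illustration of the pattern in question, here is a sketch of a sink-style toString whose caller collects the output in a local buffer. Point and render are hypothetical names, and nothing here is marked pure, since marking it so is what the proposal is about.

```d
import std.conv : to;

struct Point
{
    int x, y;

    // Sink-style toString: emits its pieces through the delegate.
    void toString(scope void delegate(const(char)[]) sink) const
    {
        sink("(");
        sink(x.to!string);
        sink(", ");
        sink(y.to!string);
        sink(")");
    }
}

string render(Point p)
{
    char[] buf;
    // The delegate only appends to a local of render().
    p.toString((const(char)[] s) { buf ~= s; });
    return buf.idup;
}
```

Everything the delegate touches is local to render(), so from the outside render() behaves like a pure function of p, even though it cannot currently be annotated that way.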

Besides, expanding the scope of purity allows much more D code to be
made pure, thus increasing purity-based optimization opportunities.

So, in a nutshell, my proposal is:

- Functions that, besides invoking a delegate parameter, are pure,
  should be allowed to be marked as pure.

- Template functions that, besides invoking a delegate parameter,
  perform no impure operations should be inferred as pure.

- A function that takes a delegate parameter cannot be strongly pure
  (but can be weakly pure), unless the delegate itself is pure.
  (Rationale: the delegate parameter potentially carries arbitrary
  references to the outside world, so a function that calls it cannot
  be strongly pure.)


T

-- 
Gone Chopin. Bach in a minuet.
