Partially pure (Re: Fully transitive const is not necessary)

Fri Apr 4 05:14:12 PDT 2008

On 2008-04-04 01:36:42 -0400, Walter Bright <newshound1 at digitalmars.com> said:

> Don Clugston wrote:
> 
>> class C
>> {
>>   int numcalls;
>>   this() { numcalls=0; }
>>   void foo() { ++numcalls; } // Not pure - but no global side effects.
>> }
>> 
>> pure int bar(int x)
>> {
>>     C c = new C;
>>     for (int i=0; i<x; ++i) c.foo();
>>     return c.numcalls;
>> }
>> 
>> Is bar() pure or not?
> 
> I think it is pure, although it might be difficult for the compiler to 
> detect it. Like for CTFE, I think the compiler should start out rather 
> restrictive about what it regards as pure, and over time expand it as 
> it gets more capable.

Personally, I'd rather say foo is pure. I'd define pure as "always 
giving the same outputs given the same inputs and having no external 
effect". That would make foo pure, because given any object C, it'll 
always change C in the same, predictable way.

This definition allows parallelizing loops that wouldn't be 
parallelizable if foo wasn't pure (assuming you label foo as pure), 
such as this one:

	foreach (C c; arrayOfUniqueC)
		c.foo();

It also mean that purity doesn't prevent you from altering an out 
parameter and that you don't have to make all your parameters 
invariant. As long as you only alter what was given to you in the 
parameters, since the compiler knows about these parameters from the 
call site it can take the decision to parallelize or not, at the call 
site. In the example above, if you did this:

	C c;
	for (int i = 0; i < 10; ++i)
		c.foo();

it wouldn't be parallelizable because calling foo alters the state for 
the next call. But it's okay, because the compiler can figure that out 
from the call site and it won't do the parallelization. In other words, 
if the result depends on the previous result, you can't parallelize. 
That same situation can happen with a pure function returning a value:

	C c;
	for (int i = 0; i < 10; ++i)
		c = pureFunction(c);

So in other words, altering a non-const parameter isn't a side effect, 
it's a predictable effect you can deduce from the function signature, 
and I think it should be allowed in a pure function.

P.S.: I'm not sure what would happen if arrayOfUniqueC in the example 
above was to contain two or more pointers to the same object. That 
would be a problem, and that's why I put the word "unique" in the name 
to mean there are no duplicate. I'm not sure how the compiler could 
find about that though. Anyway, the worse that can happen is that the 
compiler cannot parallelize this particular case without an additional 
clue; allowing foo to be pure would still permit the optimization in 
the case of the bar function in Don's example, and in any case the 
compiler knows there are no duplicate pointers.

-- 
Michel Fortin
michel.fortin at michelf.com
http://michelf.com/