pure functions cannot be removed, actually: pure vs. total

Tue Jun 5 17:47:15 UTC 2018

On 6/5/18 10:48 AM, FeepingCreature wrote:
> I'm just posting to clear up the misunderstanding that a call to a pure 
> function can be removed. Actually, even calls to strongly pure functions 
> cannot always be removed. This is because there is one thing that a pure 
> function can do that will change program behavior if it is removed, even 
> if it does not change any global state whatsoever: it can simply never 
> return.
> 
> void foo() pure { while (true) { } }
> 
> By the way, this led to an amusing Phobos bug involving a pure function 
> (not) returning a struct with an enum member: 
> https://github.com/dlang/dmd/pull/8013#pullrequestreview-110250441 and 
> assert(false.repeat.any == true); :)
> 
> When a strongly pure function is called multiple times with the same 
> parameter, all but the first call can be removed; this is because if it 
> was going to not return, it would already have inf-looped the first 
> time. pure lets you go from n to 1, but not from 1 to 0.
> 
> A pure function that returns a value for every possible parameter is 
> called a total function. Unfortunately, there is no way to enforce 
> totality in the compiler, due to the halting problem.
> 

I'll repeat what I said in the PR where you made this similar comment, I 
don't think it is important to ensure a pure function that never returns 
is always called. Can you explain the benefit of such a thing?

We've all heard of optimizers that reduce "code that does nothing" down 
to just a return statement, foiling people who are expecting benchmarks 
to run properly, why is this any different?

I'm asking because it seems like we give up a really easy optimization 
for pure functions for the sake of pure infinite loop programs, which I 
suppose have some value, but not much value. Getting around the infinite 
loop elision would be pretty simple, just return an int.

On the flip side, if the compiler can figure out via introspection that 
some template function is strong pure and returns void, therefore it 
doesn't need to call it, that's a much bigger win than preserving the 
possible infinite loopiness that likely is a bug anyway.

Another observation: under the "infinite loops are important observable 
behavior" world-view, pure functions cannot be lazily evaluated either:

pure int foo() { /*infinite loop */}

void main(string[] args)
{
    auto a = foo;
    writeln("hi");
    if(args[1] == "printa")
       writeln(a);
}

With some optimizers, this can be rewritten:

writeln("hi");
if(args[1] == "printa")
    writeln(foo);

Which if foo is a *normally returning* function and not an infinite loop 
one, then this can save cycles in certain cases.

But under your world-view, the optimization is invalid, because foo 
might have an infinite loop, and then the observable behavior changes 
(instead of printing nothing and infinite looping, "hi" is printed, and 
infinite loops).

-Steve