bug in foreach continue

Fri Mar 17 12:05:20 PDT 2017

On Fri, Mar 17, 2017 at 03:14:08PM +0000, Hussien via Digitalmars-d-learn wrote:
[...]
> What I am talking about is
> 
> how the statement
> 
> static if (x) continue;
> pragma(msg, "called");
> 
> vs
> 
> static if (x) { } else
> pragma(msg, "not called");
> 
> 
> They are both semantically equivalent

This appears to be yet another case of the term "compile-time" causing
confusion, because it's actually an ambiguous term.  I actually think
it's a bad term that should be replaced by something more precise; its
ambiguity has caused confusion on the part of D learners more often than
not.

There are actually (at least) TWO distinct phases of compilation that
are conventionally labelled "compile time":

1) Template expansion / AST manipulation, and:

2) CTFE (compile-time function evaluation).

Not clearly understanding the distinction between the two often leads to
confusion and frustration at why the compiler isn't doing "what I want".

This confusion is compounded by the fact that these two do mutually
interact in any non-trivial D metaprogramming code, often in rather
complex ways. But the fundamental thing to understand here is:

	Template expansion / AST manipulation must be completed *before*
	CTFE can run.

CTFE is essentially a D interpreter embedded inside the compiler, one
that simulates at compile time what the code would compute at runtime.
As such, it requires the AST to be fully compiled, as in, the compiler
is ready to generate object code for it, before CTFE can work.  This is
because it makes no sense to generate code from an incomplete / partial
AST.

Furthermore, once a piece of code has made it to the CTFE stage, its AST
has already been processed, and it's now compiled into an internal
representation (analogous to bytecode), so AST-manipulating constructs
no longer make any sense.  In the CTFE stage, there is no such thing as
an AST anymore.
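
To make the CTFE half of the picture concrete, here's a minimal sketch.
The names `square` and `nine` are made up for illustration.  The point is
that the function is completely ordinary code; CTFE only kicks in because
its result is used where a constant is required:

```d
// An ordinary function; nothing about it is "compile-time".
int square(int n) {
    return n * n;
}

// Initialising an enum requires a constant, so the compiler runs
// square(3) through the CTFE interpreter.
enum nine = square(3);
static assert(nine == 9);

void main() {
    int x = 4;                // value only known at runtime
    assert(square(x) == 16);  // same function, ordinary runtime call
}
```

The same finalized function body serves both uses, which is why it has
to be fully past the AST stage before CTFE can touch it.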

Constructs like `static if` and `pragma(msg)` belong to the AST
manipulation stage. `static if` changes the effective AST that the
compiler sees when it's generating code, and pragma(msg) is essentially
a debugging construct that says "print this message if this part of the
AST makes it past the template expansion phase". This is why code like
this doesn't do what you might think it does:

	int ctfeFunc(bool b) {
		if (b) {
			pragma(msg, "hello");
			return 1;
		} else {
			pragma(msg, "goodbye");
			return 2;
		}
	}

The compiler will print *both* "hello" and "goodbye", because the
pragma(msg) is evaluated *when the AST of this function is processed*.
The variable 'b' doesn't even exist at that point, because ASTs don't
have the concept of a 'variable' -- 'b' is just an identifier node in
the tree. And 'if' is just another node in the tree that represents a
control-flow directive.  Change that to 'static if', however, and the
compiler sees it differently:

	int ctfeFunc(bool b) {
		static if (b) {
			pragma(msg, "hello");
			return 1;
		} else {
			pragma(msg, "goodbye");
			return 2;
		}
	}

This code won't work, because 'b' is a variable, but variables don't
exist at the AST manipulation stage. So you have to turn 'b' into a
so-called "compile-time argument" (boy I hate that term "compile-time",
it totally is not precise enough to convey the meaning here):

	int ctfeFunc(bool b)() {
		static if (b) {
			pragma(msg, "hello");
			return 1;
		} else {
			pragma(msg, "goodbye");
			return 2;
		}
	}

Do not be deceived by the appearances; the story here is far more
involved than "since static if is a `compile-time` construct, obviously
it needs the condition to be made from `compile-time` arguments". You
have to understand that when you move 'b' into the so-called
compile-time parameter list, you're essentially saying "here's a
template that, given a boolean value, produces an AST according to
the following pattern".  If you instantiate ctfeFunc!true, then what
effectively happens is that the compiler sees this AST:

	int ctfeFunc!true() {
		return 1;
	}

So it prints "hello". Note that the else branch is NOT EVEN SEEN past
the AST stage. It practically doesn't exist at that point. The
pragma(msg) is also not seen past that point: it is consumed in the AST
manipulation stage. The compiler sees the static if, evaluates the
condition, and basically prunes the else branch off, then sees the
pragma(msg) and emits the message (i.e., acknowledging "yes this part of
the AST makes it past the AST manipulation stage"). The pragma(msg) is
pruned from the AST after that.

By the time CTFE runs on this function, all the CTFE interpreter sees is
"return 1;".  It doesn't see the "static if" nor the "pragma(msg)",
because those things are AST manipulation constructs; past the AST stage
such things don't exist anymore.
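
Here's a small sketch of the template version that you can compile to
check this for yourself.  I've renamed the function to `pick` (a made-up
name) so it doesn't clash with the other examples; otherwise it follows
the same pattern as above:

```d
// Template with a boolean template parameter: each instantiation
// produces its own pruned AST.
int pick(bool b)() {
    static if (b) {
        pragma(msg, "hello");    // emitted while pick!true is instantiated
        return 1;
    } else {
        pragma(msg, "goodbye");  // emitted while pick!false is instantiated
        return 2;
    }
}

// After pruning, each instance is just a single return statement.
static assert(pick!true() == 1);
static assert(pick!false() == 2);

void main() {
    assert(pick!true() == 1);
    assert(pick!false() == 2);
}
```

Each message should show up in the compiler output only for the
instantiation that actually keeps that branch.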

What complicates this 2-stage picture, though, and probably what doesn't
help the confusion with the term "compile-time", is that the compiler is
smart enough to perform CTFE on-demand. Meaning that it's valid to do
this:

	int ctfeFunc(bool b) { /* N.B.: "runtime" parameter! */
		if (b) return 1;
		else return 2;
	}

	enum b = true;
	static if (ctfeFunc(b) == 1)
		struct S { int x; }
	else
		struct S { int y; }

Note that the static if here has a condition that depends on the output
of a CTFE function. This appears to be a reversal of the AST manipulation
/ CTFE stages, but strictly speaking that's actually not the case.  What
actually happens is:

1) The compiler sees the declaration of ctfeFunc(), produces an AST for it.
   Note that this means the AST of ctfeFunc gets finalised here -- any
   static if's, templates, etc., are expanded into the "final" AST for
   this function.

2) Then the compiler sees the static if, and enters the AST phase *for
   this part of the code*. It says, I need to know the value of
   ctfeFunc(b) in order to know which struct declaration to use -- so it
   *emits code* for ctfeFunc(), and then runs the CTFE engine on the
   resulting code.  Then it takes the output of that evaluation to make
   the decision.

The key point here is that ctfeFunc has *already* passed the AST
manipulation stage while the struct declaration is still in the AST
stage.  The compiler is clever enough to sequence the processing of
these two parts of the code so that it's able to compute the static if
condition.  But the principle of AST manipulation coming before CTFE
still applies, in both directions.  It's not possible for the CTFE
engine to interpret ctfeFunc if its AST is still being manipulated,
because if the AST is not finalized yet, there isn't any code to
interpret.  And it's not possible for the AST manipulation stage to read
the value of a CTFE computation that's being run on the same piece of
code, because CTFE can't run in the first place while the AST is still
being manipulated; and if the code is already ready for CTFE to
interpret, that means its AST has already been finalized, so you can't
change it anymore.
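
As a sketch of this sequencing, the following is a compilable variant of
the example above.  The names `choose` and `flag` are made up here, and
the `__traits(hasMember, ...)` checks are my addition, just to confirm
which declaration of S the static if selected:

```d
// Ordinary function with a "runtime" parameter; its AST is finalized
// before the static if below needs its value.
int choose(bool b) {
    if (b) return 1;
    else return 2;
}

enum flag = true;

// The condition forces CTFE on choose(flag); the result decides which
// declaration of S survives the AST stage.
static if (choose(flag) == 1)
    struct S { int x; }
else
    struct S { int y; }

static assert(__traits(hasMember, S, "x"));
static assert(!__traits(hasMember, S, "y"));

void main() {
    S s;
    s.x = 42;
    assert(s.x == 42);
}
```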

Coming back to your loop:

	static if (x) continue;
	foo();

In the AST manipulation stage, the compiler evaluates x, and if x is
true, then the effective AST is:

	continue;
	foo();

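By the time it gets to CTFE, it has already "forgotten" that there was
such a thing as a static if in the source code. So it will interpret the
"continue" unconditionally.  And anything after it that belongs to the
AST stage, like the pragma(msg) in your example, has already been
processed by then, whether or not a "continue" precedes it.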

Whereas if you wrote:

	static if (x) {} else { foo(); }

in the AST manipulation stage, if x is true, then the effective AST is:

	{}

and if x is false, then the effective AST is:

	foo();

which is what you intended.
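
Here's a rough sketch of the static if / else form inside a loop.  I'm
assuming a foreach over a type list built with std.meta.AliasSeq, which
may not match your actual code; `sizeOfNonStrings` and the choice of
types are made up for illustration:

```d
import std.meta : AliasSeq;

// Sums the sizes of all listed types except string.
int sizeOfNonStrings() {
    int total = 0;
    foreach (T; AliasSeq!(int, string, long)) {
        static if (is(T == string)) {
            // pruned to an empty block for string
        } else {
            // Only survives the AST stage for int and long, so the
            // message should be emitted only for those two types.
            pragma(msg, T.stringof);
            total += cast(int) T.sizeof;
        }
    }
    return total;
}

// int is 4 bytes and long is 8 bytes in D.
static assert(sizeOfNonStrings() == 12);

void main() {
    assert(sizeOfNonStrings() == 12);
}
```

Because the pragma(msg) sits inside the else branch, it is pruned
together with that branch, instead of relying on a "continue" that the
AST stage knows nothing about.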

T

-- 
A mathematician is a device for turning coffee into theorems. -- P. Erdos