Compile time and runtime grammars

Sat Oct 10 12:43:38 PDT 2015

On Sat, Oct 10, 2015 at 06:52:29PM +0000, DLangLearner via Digitalmars-d-learn wrote:
> Only now I found that most of my confusions are with D's compile time
> grammar or features. As an excuse, my confusions can be partially
> attributed to the way D is presented:
> 
> 1. There are confusing keywords:
> For example, there is a "if", there is also a "static if", there is a
> "if", and there is an "is()". For new learners like me, they cause
> confusion at least uneasiness.

I assume by the second "if" you meant "is".  It's well-known that the
syntax of is() could be better. Unfortunately, the ship has long since
sailed, and there's not much point in breaking existing code just to
make some cosmetic changes.

The "static" in "static if" is a clear indication that this isn't a
regular if-statement, but a branch that's taken at compile-time. I'm not
sure how else it can be made clearer.
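
To make the contrast concrete, here's a minimal sketch. The helper name
describe() is made up for illustration; the point is only that the
static-if, together with is(), is resolved once per template
instantiation rather than when the program runs:

	import std.stdio;

	// describe() is a hypothetical helper, not part of any library.
	string describe(T)(T value) {
		static if (is(T == int))	// resolved per instantiation
			return "an int";
		else
			return "something else";
	}

	void main() {
		writeln(describe(42));	// instantiates describe!int
		writeln(describe("hi"));	// instantiates describe!string
	}

Each instantiation contains only one of the two return statements; the
other branch is simply not part of that instance.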

> 2. Compile time grammar spreads among runtime grammar
> Most documents present D's compile time grammar and runtime grammar in
> the same time. It made me feel that D's grammar is not consistent
> because compile time grammar seem to be exceptions from runtime
> grammar. If a document talks exclusively about runtime grammar first,
> and introduces compile time grammar late, I think this will make
> readers accept those seemingly conflicting grammar. In fact without
> introducing compile time grammar, D is much similar to other
> languages, in this way the readers from other languages can find D
> more friendly.
> 
> With the understanding of D's compile time grammar, I can read D codes
> from other projects such as std packages, but I am still not easy
> about the way that D's compile time codes are not clearly
> distinguished from runtime codes. I am wondering if it is a good idea
> to clearly indicate those compile time codes with a special identifier
> say "@ct", or prefix "__" as in __traits, if so then those
> "inconsistencies" can be resolved as follows:
> 
> static if -> @ct if
> static assert -> @ct assert
> enum fileName = "list.txt" -> @ct  fileName = "list.txt"
> is (string[void]) -> @ct is (string[void])
> mixin(`writeln("Hello World!");`) -> @ct `writeln("Hello World!");`
> 
> So this post is not quite a question, just a thought in my mind after
> I am able to differentiate compile time codes from runtime codes.

This shows a misunderstanding of what D's compile-time features
actually do, and also shows that the term "compile-time" itself is a
bit misleading. This is likely the fault of the way these features are
described in the documentation.

In D, there are actually (at least) two (very!) distinct categories of
compile-time features:

There's the template system, which is mainly concerned with manipulating
the syntax tree of the code.  This provides the meta-programming
features of D, and runs quite early on in the compilation process.

There's also the CTFE system (compile-time function evaluation), which
is mainly concerned with *executing code* inside the compiler, much as
it would run at runtime. This happens after the syntax tree has been
generated, which is later in the compilation process. Obviously, this
can only be done after the syntax tree has been fixed, otherwise the
semantics of the code would be undefined or inconsistent.

The two are closely related, and the difference may seem subtle, but
understanding it is essential to using these features effectively.
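
As a rough sketch of the two categories side by side (Storage and
square are names I made up for illustration): the template decides
which declaration exists at all, whereas the CTFE'd function is a
perfectly ordinary function that the compiler happens to execute
because its value is needed for an enum.

	// Template system: static if picks which alias gets declared.
	// Storage is a hypothetical example template.
	template Storage(size_t n) {
		static if (n <= 4)
			alias Storage = uint;
		else
			alias Storage = ulong;
	}
	static assert(is(Storage!4 == uint));
	static assert(is(Storage!8 == ulong));

	// CTFE: square() is an ordinary function; the compiler runs it
	// here only because an enum needs its value during compilation.
	size_t square(size_t n) {
		return n * n;
	}
	enum squared = square(3);
	static assert(squared == 9);

	void main() {}

Note that nothing in square() marks it as "compile-time"; only the
context in which it's called does.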

For example, "static if" is a feature belonging to the template system,
and is concerned with manipulating the syntax tree of the program before
the compiler runs its semantic passes over it.  The branch is evaluated
*before* CTFE even sees the code; and that's why the following code does
*not* work:

	int func(bool x) {
		static if (x)
			return 1;
		else
			return 2;
	}
	enum y = func(true);

The first problem is that the static-if is asking the compiler to
evaluate x. Theoretically speaking, this should work, since x is known
at "compile-time", but when the static-if is being processed, the syntax
tree of func() isn't even completed yet, so the compiler has no way of
knowing what x might be referring to.

The second problem is that the value of the enum is processed by CTFE,
but since the static-if is processed before CTFE even sees the code, by
the time CTFE runs it's already too late for the static-if to decide
which branch should be taken. Static-if means that the branch of code
that isn't taken doesn't even exist in the syntax tree of the program;
it's as if the programmer deleted those lines from the source file.

So you see, the term "compile-time" is ambiguous, because there are two
distinct phases of compilation here, and intermixing them doesn't make
sense.

The correct version of the above code is:

	int func(bool x) {
		if (x)	// <--- N.B. no "static"
			return 1;
		else
			return 2;
	}
	enum y = func(true);

This works, because now the if-statement is a "normal" if-statement that
gets included in the syntax tree of the program, so now when the enum
asks for the value of func(true), CTFE kicks in and is able to emulate
the execution of the if-statement and return 1 as the value of y.

Let's take this one step further, by adding a main() function:

	import std.stdio;	// needed for writeln

	int func(bool x) {
		if (x)	// <--- N.B. no "static"
			return 1;
		else
			return 2;
	}
	enum y = func(true);

	void main() {
		writeln(func(false));
	}

Question: should the if-statement be annotated @ct or not?

The real answer is that the @ct annotation doesn't make sense, because
the if-statement, on its own, has nothing to indicate whether it will be
evaluated at "compile-time" or at runtime. Again, we see that the
terminology "compile-time" is misleading, because it makes you think of
it as a single period of time before "runtime", whereas the reality is
that "compile-time" consists of multiple, distinct stages in
compilation.

In this case, the if-statement is executed *both* at "compile-time" and
at runtime. The first time, it's running in CTFE inside the compiler;
the second time, it's running "for real" inside the compiled executable.
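
One way to observe this directly is the built-in __ctfe variable, which
is true only while a function is being evaluated by CTFE. A small
sketch (whereAmI and atCompileTime are hypothetical names, not from the
code above):

	import std.stdio;

	// whereAmI() is a made-up function for illustration.
	string whereAmI() {
		if (__ctfe)	// an ordinary if, not a static if
			return "inside the compiler";
		else
			return "inside the executable";
	}

	enum atCompileTime = whereAmI();	// the enum forces CTFE

	void main() {
		writeln(atCompileTime);	// prints "inside the compiler"
		writeln(whereAmI());	// prints "inside the executable"
	}

The same if-statement, in the same function, takes one branch under
CTFE and the other in the compiled program.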

The trouble with the @ct annotation is that it conflates the template
system (syntax tree manipulation) and CTFE (compile-time evaluation of
functions) under a single "compile-time" umbrella. But the two are very
different beasts. CTFE is "closer to runtime", in the sense that it
evaluates code that's in some sense "already compiled" and ready to run,
whereas the template system (static if) works in a much earlier stage,
when the syntax tree of the code hasn't settled down yet.

Understanding this subtle but important difference will make it clear
why some constructs don't work in CTFE, e.g., using static-if on local
variables in CTFE'd functions, even though they "ought to" because all
the required values ought to be known at "compile-time".
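
A short sketch of that last point, using __traits(compiles) to check
that a static-if on a local variable is rejected, followed by one
possible alternative: passing the value as a template parameter so that
it is available when the static-if is resolved. The name pick is made
up for this example, and this is only one way to restructure such code.

	// A static if on an ordinary local variable does not compile.
	static assert(!__traits(compiles, {
		bool x = true;
		static if (x) {}
	}));

	// Hypothetical alternative: x is a template parameter, so its
	// value is known when the static if is resolved.
	int pick(bool x)() {
		static if (x)
			return 1;
		else
			return 2;
	}

	enum fromTemplate = pick!true();
	static assert(fromTemplate == 1);
	static assert(pick!false() == 2);

	void main() {}

The trade-off is that pick!true and pick!false are now two separate
functions, and x can no longer be chosen at runtime.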

T
