CTFE for array / string / slice operations, constant propagation and bug detection at compile time

Tue May 16 13:41:59 UTC 2023

On Sat, May 13, 2023 at 07:21:30PM +0000, Cecil Ward via Digitalmars-d wrote:
> Here’s a snippet of code which was compiled with both GDC and LDC,
> latest versions currently present on the godbolt Compiler Explorer
> site, target = x86-64, command line options -O3 -release / -frelease
> 
> immutable (int)[]
> test() {
>        immutable int[] s = [1];
> 	
> 	return s[1..2];
> 	}
> 
> There is no CTFE here with either compiler, and the bug in the return
> statement is not spotted by the compilers at compile time. It’s
> clearer to see in the LDC compiler, the return value is an address off
> the end of the initialiser data array and a length of 1. If the
> compiler had CTFE’d s, had then done constant propagation and CTFE’d
> the retval then it could have spotted the bug.

I think there's a misunderstanding here about the role of CTFE. In
general, D compilers do not use CTFE unless the result of the execution
is needed at compile-time, e.g. the value is used as a template argument
or to determine some compile-time construct like the value of an enum.
Indeed, assigning the value of a function call or expression to an enum
is the standard idiom for forcing CTFE.
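
To make the idiom concrete, here is a minimal sketch. The helper
`square` and the names `atCompileTime` / `atRuntime` are made up purely
for illustration; the only thing being shown is that an enum initializer
forces compile-time evaluation, while an ordinary call does not.

```d
import std.stdio : writeln;

// Hypothetical helper, used only for illustration.
int square(int x)
{
    return x * x;
}

// An enum initializer must be a compile-time constant, so this call
// is evaluated by CTFE.
enum atCompileTime = square(7);
static assert(atCompileTime == 49);

void main()
{
    int n = 7;
    // Ordinary runtime call: CTFE is not involved here, although an
    // optimizer is free to inline or fold the call later on.
    int atRuntime = square(n);
    writeln(atCompileTime, " ", atRuntime);
}
```

The same function serves both purposes; what decides whether CTFE runs
is the context in which the call appears, not the function itself.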

Runtime code like the function shown above has nothing to do with CTFE;
the error would only be caught by VRP (value range propagation) if it
were applied across statements. Unfortunately, I don't think current D
compilers apply VRP across statements.
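
As a rough sketch of the difference: if the quoted function is forced
through CTFE, the interpreter executes the slice and its bounds check,
so the mistake shows up at compile time. The `__traits(compiles, ...)`
wrapper below is only there so that the example itself still builds; I'm
assuming the CTFE failure is reported through it in the usual way.

```d
// Same body as the function quoted above.
immutable(int)[] test()
{
    immutable int[] s = [1];
    return s[1 .. 2];   // out of bounds: s has only one element
}

// As plain runtime code the function is accepted. Requesting its value
// at compile time makes the interpreter run it, and the out-of-bounds
// slice turns into a compile error, so this enum does not compile.
static assert(!__traits(compiles, { enum e = test(); }));

void main()
{
    // Intentionally empty: test() is never called at runtime here.
}
```

Note that this works only because the function takes no runtime inputs;
it is not a general substitute for analysis of runtime code.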

> Would it be a nightmare to implement CTFE for arrays, slices and
> strings?

Again, I think there's a misunderstanding here about exactly what CTFE
is and does. CTFE is the *execution* of a function at compile-time, in
order to produce a *value* that's used by some compile-time construct.
CTFE can certainly work with arrays, slices, strings, and so forth, as
values to be manipulated at compile time, but you seem to have a
different meaning in mind here, which is outside the scope of CTFE.
CTFE does not manipulate language *constructs*; it works with actual
values, and so can only be applied to code that has already been
compiled to the point where it's ready to be turned into machine code
for runtime execution. The only difference is that execution takes place
in a D interpreter embedded inside the compiler rather than directly on
the machine at runtime.
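
For example, here is a small sketch of CTFE operating on strings, arrays
and slices as ordinary values. The functions `repeat` and `firstSquares`
are hypothetical helpers written just for this illustration.

```d
// Hypothetical helpers, used only for illustration.
string repeat(string s, size_t n)
{
    string result;
    foreach (i; 0 .. n)
        result ~= s;            // string concatenation
    return result;
}

int[] firstSquares(size_t n)
{
    int[] result;
    foreach (i; 0 .. n)
        result ~= cast(int)(i * i);   // array appending
    return result;
}

// Both calls are executed by the compiler's interpreter because their
// results are needed as enum values.
enum banner = repeat("ab", 3);
enum squares = firstSquares(4);

static assert(banner == "ababab");
static assert(squares == [0, 1, 4, 9]);
static assert(squares[1 .. 3] == [1, 4]);   // slicing at compile time

void main() {}
```

Nothing here is special-cased for compile time: the same two functions
can be called with runtime arguments as well.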

Propagating value ranges and so forth, like you describe above, has
nothing to do with CTFE; that's the domain of the compiler's semantic
analysis of the code.

> Optimisation of basic operations on these types, in respect of
> concatenation, slicing and many more operations would be extremely
> welcome.  Would there possibly be a chance to catch many bugs at
> compile-time?

This has nothing to do with CTFE; it is a matter of semantic analysis,
such as VRP, data flow analysis, etc.  D compilers do this to some
extent, but not a lot.  Walter has been resistant to implementing
full-blown data flow analysis / VRP across statements, because it would
significantly slow down compilation.  I'm not 100% sure, but I think the
LDC optimizer does this to a greater extent.  However, this happens
rather late in the compilation process (after the front-end has already
translated D code into LLVM IR), so it wouldn't catch many semantic bugs
that require analysis at a higher level of abstraction.
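
A small sketch of what "to some extent" means in practice: VRP is
applied within a single expression when checking implicit narrowing
conversions, but the range of a variable is not carried over from an
earlier statement. The function name `lowByte` is made up for the
example.

```d
// Hypothetical helper, used only for illustration.
ubyte lowByte(int x)
{
    // Accepted without a cast: within this one expression, VRP can
    // prove that (x & 0xFF) always fits in a ubyte.
    return x & 0xFF;
}

void main()
{
    assert(lowByte(1000) == 232);   // 1000 is 0x3E8, low byte is 0xE8

    // Rejected, even though y is plainly 10: the range of y is not
    // tracked from its initializer to the next statement.
    static assert(!__traits(compiles, { int y = 10; ubyte b = y; }));
}
```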

T

-- 
The early bird gets the worm. Moral: ewww...