Request for Review: DI Generation Improvements
Adam Wilson
flyboynw at gmail.com
Mon Jun 11 17:49:25 PDT 2012
On Mon, 11 Jun 2012 04:55:37 -0700, Timon Gehr <timon.gehr at gmx.ch> wrote:
> On 06/11/2012 09:37 AM, timotheecour wrote:
>> questions:
>>
>> A) as I understand it, the new di generation will systematically strip
>> out the implementation of
>> non-auto-return, non-templated functions, is that correct?
>>
>
> This is my understanding as well.
>
Correct. A lot of community consultation went into the improvements.
>> B) since there are some important differences from the current di files
>> (in terms of inlining optimization, etc), will there be a dmd
>> command-line flag to output those stripped-down di files (eg: -stripdi),
>> so the user still has a choice of which to output?
>
> You could use cp instead of dmd -H.
>
In fact, I rewrote the DRuntime makefiles to do precisely this with the
hand-crafted .di files.
https://github.com/D-Programming-Language/druntime/pull/218
>>
>> C) why can't auto-return functions be semantically analyzed and
>> resolved? (eg: auto fun(int x){ return x; } => the generated .di should
>> be: int fun(int x);)
>>
>
> Conditional compilation.
>
> version(A) int x;
> else version(B) double x;
> else static assert(0);
>
> auto foo(){return x;}
>
> would need to be stripped to
>
> version(A){
>     int x;
>     int foo();
> }else version(B){
>     double x;
>     double foo();
> }else static assert(0);
>
> which is a nontrivial transformation.
>
> This is just a simple example. Resolving the return type conditionally
> in the general case is undecidable, therefore, making it work
> satisfactorily involves a potentially unbounded amount of work.
>
>
The general explanation is that any time you rewrite the AST (such as the
operation performed above), you have to duplicate that work in DI
generation to maintain semantic cohesion (what I put in is what I get
out). Another reason is that DI generation is *required* to run prior to
the semantic analysis stage, due to the fact that your command line can
alter the analysis and the subsequent AST. In fact, this note is one of
the *very* few multi-line comments Walter put into the DMD source. In
essence, DI generation is an AST pretty-printer, and as such it must run
after parsing and prior to analysis. All my patch does is insert checks
into the printing process to stop it from printing certain parts of the
AST.

Theoretically one could build into DI generation a semantic analyzer that
didn't change its behavior based on the command line, but that would
literally require rewriting DI generation from the ground up. And you'd
still have to verify that the primary semantic analysis didn't change
anything from the DI analysis, and then write a reconciliation process.
Personally, I think that we have better places to focus our efforts.
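To illustrate what the stripped pretty-printing does, here is a rough
before/after sketch (a hypothetical module of my own, not taken from
druntime):

    // math.d -- what the library author writes
    module math;

    int twice(int x) { return 2 * x; }      // plain function
    T scale(T)(T x, T k) { return x * k; }  // template function
    auto half(int x) { return x / 2; }      // auto-return function

    // math.di -- roughly what stripped generation would emit
    module math;

    int twice(int x);                       // body stripped
    T scale(T)(T x, T k) { return x * k; }  // templates keep their bodies
    auto half(int x) { return x / 2; }      // auto returns keep theirs too

Only the plain, non-auto, non-template function loses its body; everything
the compiler still needs for instantiation or return-type inference is
printed verbatim.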
That said, what you really want is the full source embedded into the
library (similar to .NET's CIL). That would get you what you are after.
Such a thing could actually be done, except for OMF/OPTLINK: since OMF
doesn't support custom sections, there is no special place to store the
code where the compiler could easily access it. Embedded source would
enable the compiler to extract it during compilation, analyze both the
user source and the library source, and perform all possible
optimizations accordingly. I looked at adding COFF to DMD and my brain
melted. There are enough #ifdefs in there to cause permanent insanity...
*sigh*
>> D) can we have an option to strip out template functions as well? This
>> could be more or less customizable (eg with a file that contains a list
>> of template functions to strip, or simply stripping all templates). The
>> library writer would instantiate those templates for certain predefined
>> types. Eg:
>>
>> module fun;
>> T fun1(T)(T x){
>>     return 2*x;
>> }
>> void dummy_instantiate(){
>>     // instantiate to double, and repeat for all desired types,
>>     // eg with a mixin
>>     alias double T;
>>     fun1!(T)(T.init);
>> }
>> Then the library writer generates a library (static or shared), and the
>> end user uses the templates with one of the allowed types (otherwise a
>> link error happens). In many cases (eg matrix/numerical libraries), all
>> that's needed is a finite set of predefined types (eg int, float, etc).
>> Bonus points if the generated .di file automatically contained template
>> constraints reflecting the allowed types, to get compile-time errors
>> instead of link-time ones.
>> Having the ability to strip templates greatly simplifies distribution
>> of code, as it doesn't have to carry all its dependencies recursively
>> if all one wants is a few predefined types.
>
> You could use overloads instead and use templates for implementing them.
> Templates are generally only exposed in a well-designed library
> interface if they work with an unbounded number of types.
>
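Timon's overload approach would look something like this (a rough sketch,
with names made up by me):

    module fun;

    // the private template does the real work and is never exposed
    private T fun1Impl(T)(T x) { return 2*x; }

    // the public interface is a fixed set of concrete overloads
    int    fun1(int x)    { return fun1Impl(x); }
    double fun1(double x) { return fun1Impl(x); }

Only the two overload declarations would need to appear in the generated
.di; the template body stays inside the library.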
>>
>> E) btw, is there a way to force the instantiations more elegantly than
>> using dummy_instantiate? In C++ we can just write something like:
>> template double fun<double>(double); but the same doesn't work in D.
>>
>>
>
> For example:
>
> T foo(T)(T arg){return arg; pragma(msg, T);}
>
> mixin template Instantiate(alias t, T...){
>     static assert({
>         void _(){ mixin t!T; }
>         return true;
>     }());
> }
>
> mixin Instantiate!(foo,int);
> mixin Instantiate!(foo,double);
>
> The nested function named '_' should be unnecessary; it works around a
> DMD bug where mixin t!T is 'not yet implemented in CTFE'.
--
Adam Wilson
IRC: LightBender
Project Coordinator
The Horizon Project
http://www.thehorizonproject.org/
More information about the Digitalmars-d mailing list