DeRailed DSL (was Re: compile-time regex redux)

Fri Feb 9 17:56:12 PST 2007

Andrei Alexandrescu (See Website For Email) wrote:
> Tom S wrote:

>> Therefore, I'd like to see a case when a compile-time DSL processing 
>> is going to be really useful in the real world, as to provoke further 
>> complication of the compiler and its usage patterns.
> 
> I can't parse this sentence. Did you mean "as opposed to provoking" 
> instead of "as to provoke"?

I think he just meant he wants to see some real world examples that 
justify the additional complexity that will be added to the compiler and 
language.

>> The other aspect I observed in the discussions following the new dmd 
>> release, is that folks are suggesting writing full language parsers, D 
>> preprocessors and various sort of operations on complex languages, 
>> notably extended-D parsing... This may sound weird in my lips, but 
>> that's clearly abuse. What these people really want is a way to extend 
>> the language's syntax, pretty much as Nemerle does it. And frankly, if 
>> D is going to be a more powerful language, built-in text preprocessing 
>> won't cut it. Full fledged macro support and syntax extension 
>> mechanics are something that we should look at.
> 
> This is a misunderstanding. The syntax is not to be extended. It stays 
> fixed, and that is arguably a good thing. The semantics become more 
> flexible. For example, they will make it easy to write a matrix operation:
> 
> A = (I - B) * C + D
> 
> and generate highly performant code from it. (There are many reasons for 
> which that's way harder than it looks.)

This is one thing I haven't really understood in the discussion.  How do 
the current proposals help that case?  From what I'm getting you're 
going to have to write every statement like the above as something like:

mixin {
    ProcessMatrixExpr!( "A = (I - B) * C + D;" );
}

How do you get from this mixin/string processing stuff to an API that I 
might actually be willing to use for common operations?

> I think there is a lot of apprehension and misunderstanding surrounding 
> what metacode is able and supposed to do or simplify. Please, let's 
> focus on understanding _before_ forming an opinion.

I think that's exactly what Tom's getting at.  He's asking for examples 
of how this would make like better for him and others.  I think given 
your background you take it for granted that metaprogramming is the 
future.  But D is attracts folks from all kinds of walks of life, 
because it promises to be a kinder, gentler C++.  So some people here 
aren't even sure why D needs templates at all.  Fortran doesn't have 'em 
after all.  And Java just barely does.

Anyway, I think it would help get everyone on board if some specific and 
useful examples were given of how this solves real problems (and no I 
don't really consider black and white holes as solving real problems.  I 
couldn't even find any non-astronomical uses of the terms in a google 
search.)

For instance, it would be nice to see some more concrete discussion about
* the "rails" case.
* the X = A*B + C matrix/vector expressions case.
* the case of generating bindings to scripting langauges / ORB stubs
* the Spirit/parser generator case

So here's my take on vector expressions since that's the only one I know 
anything about.

*Problem statement*:
Make the expression A=(I-B)*C+D efficient, where the variables are large 
vectors (I'll leave out matrices for now).

*Why it's hard*:
The difficulty is that (ignoring SSE instruction etc) the most efficient 
way to compute that is do all operations component-wise.  So instead of 
computing I-B then multiplying by C, you compute
     A[i] = (I[i]-B[i])*C[i]+D[i];
for each i.  This eliminates the need to allocate large intermediate 
vectors.

*Existing solutions*:
   Expression templates in C++, e.g. The Blitz++ library.  Instead of 
making opSub in I-B return a new Vector object, you make opSub return an 
ExpressionTemplate object.  This is a little template struct that 
contains a reference to I and to B, and knows how to subtract the two in 
a component-wise manner.  The types of I and B are template parameters, 
LeftT and RightT. Its interface also allows it to be treated just like a 
Vector.  You can add a vector to it, subtract a Vector from it etc.

Now we go and try to multiply that result times C.  The result of that 
is a new MultExpressionTemplate with two paramters, the LeftT being our 
previous SubExpressionTemplate!(Vector,Vector) and the RightT being Vector.

Proceeding on in this way eventually the result of the math is of type:

AddExpressionTemplate!(
    MultExpressionTemplate!(
       SubExpressionTemplate!(Vector,Vector),
       Vector),
    Vector)

And you can see that we basically have a parse tree expressed as nested 
templates.  The final trick is that a Vector.opAssign that takes an 
ExpressionTemplate is provided and that method calls a method of the 
expression template to finally trigger the calculation, like 
expr.eval(this).  eval() has the top-level loop over the components of 
the answer.

*Why Existing solutions are insufficient*
For that all that effort to actually be useful the compiler has to be 
pretty agressive about inlining everything so that in the end all the 
temporary template structs and function calls go away and you're just 
left with one eval call.  It can be tricky to get the code into just the 
right configuration so that the compiler will do the inlining.  And even 
then results will depend on the quality of the compiler.  My attempts 
using MSVC++ several years ago always came out to be slower than the 
naive version, but with about 10x the amount of code.  The code is also 
pretty difficult to follow because of all those different types of 
expression templates that have to be created.

If you include matrices things are even trickier because there are 
special cases like "A*B+a*C" that can be computed efficiently by a 
single optimized routine.  You'd like to recognize such cases and turn 
them into single calls to that fast routine.  You might also want to 
recognize "A*B+C" as a special case of that with a==1.

*How it could be improved*:
??? this is what I'd like to see explained better.

--bb