DeRailed DSL (was Re: compile-time regex redux)

Fri Feb 9 18:14:08 PST 2007

Bill Baxter wrote:
> Andrei Alexandrescu (See Website For Email) wrote:
>> Tom S wrote:
> 
>>> Therefore, I'd like to see a case when a compile-time DSL processing 
>>> is going to be really useful in the real world, as to provoke further 
>>> complication of the compiler and its usage patterns.
>>
>> I can't parse this sentence. Did you mean "as opposed to provoking" 
>> instead of "as to provoke"?
> 
> I think he just meant he wants to see some real world examples that 
> justify the additional complexity that will be added to the compiler and 
> language.
> 
>>> The other aspect I observed in the discussions following the new dmd 
>>> release, is that folks are suggesting writing full language parsers, 
>>> D preprocessors and various sort of operations on complex languages, 
>>> notably extended-D parsing... This may sound weird in my lips, but 
>>> that's clearly abuse. What these people really want is a way to 
>>> extend the language's syntax, pretty much as Nemerle does it. And 
>>> frankly, if D is going to be a more powerful language, built-in text 
>>> preprocessing won't cut it. Full fledged macro support and syntax 
>>> extension mechanics are something that we should look at.
>>
>> This is a misunderstanding. The syntax is not to be extended. It stays 
>> fixed, and that is arguably a good thing. The semantics become more 
>> flexible. For example, they will make it easy to write a matrix 
>> operation:
>>
>> A = (I - B) * C + D
>>
>> and generate highly performant code from it. (There are many reasons 
>> for which that's way harder than it looks.)
> 
> This is one thing I haven't really understood in the discussion.  How do 
> the current proposals help that case?  From what I'm getting you're 
> going to have to write every statement like the above as something like:
> 
> mixin {
>    ProcessMatrixExpr!( "A = (I - B) * C + D;" );
> }
> 
> How do you get from this mixin/string processing stuff to an API that I 
> might actually be willing to use for common operations?
> 
> 
>> I think there is a lot of apprehension and misunderstanding 
>> surrounding what metacode is able and supposed to do or simplify. 
>> Please, let's focus on understanding _before_ forming an opinion.
> 
> I think that's exactly what Tom's getting at.  He's asking for examples 
> of how this would make like better for him and others.  I think given 
> your background you take it for granted that metaprogramming is the 
> future.  But D is attracts folks from all kinds of walks of life, 
> because it promises to be a kinder, gentler C++.  So some people here 
> aren't even sure why D needs templates at all.  Fortran doesn't have 'em 
> after all.  And Java just barely does.
> 
> Anyway, I think it would help get everyone on board if some specific and 
> useful examples were given of how this solves real problems (and no I 
> don't really consider black and white holes as solving real problems.  I 
> couldn't even find any non-astronomical uses of the terms in a google 
> search.)
> 
> For instance, it would be nice to see some more concrete discussion about
> * the "rails" case.
> * the X = A*B + C matrix/vector expressions case.
> * the case of generating bindings to scripting langauges / ORB stubs
> * the Spirit/parser generator case
> 
> So here's my take on vector expressions since that's the only one I know 
> anything about.
> 
> *Problem statement*:
> Make the expression A=(I-B)*C+D efficient, where the variables are large 
> vectors (I'll leave out matrices for now).
> 
> *Why it's hard*:
> The difficulty is that (ignoring SSE instruction etc) the most efficient 
> way to compute that is do all operations component-wise.  So instead of 
> computing I-B then multiplying by C, you compute
>     A[i] = (I[i]-B[i])*C[i]+D[i];
> for each i.  This eliminates the need to allocate large intermediate 
> vectors.
> 
> *Existing solutions*:
>   Expression templates in C++, e.g. The Blitz++ library.  Instead of 
> making opSub in I-B return a new Vector object, you make opSub return an 
> ExpressionTemplate object.  This is a little template struct that 
> contains a reference to I and to B, and knows how to subtract the two in 
> a component-wise manner.  The types of I and B are template parameters, 
> LeftT and RightT. Its interface also allows it to be treated just like a 
> Vector.  You can add a vector to it, subtract a Vector from it etc.
> 
> Now we go and try to multiply that result times C.  The result of that 
> is a new MultExpressionTemplate with two paramters, the LeftT being our 
> previous SubExpressionTemplate!(Vector,Vector) and the RightT being Vector.
> 
> Proceeding on in this way eventually the result of the math is of type:
> 
> AddExpressionTemplate!(
>    MultExpressionTemplate!(
>       SubExpressionTemplate!(Vector,Vector),
>       Vector),
>    Vector)
> 
> And you can see that we basically have a parse tree expressed as nested 
> templates.  The final trick is that a Vector.opAssign that takes an 
> ExpressionTemplate is provided and that method calls a method of the 
> expression template to finally trigger the calculation, like 
> expr.eval(this).  eval() has the top-level loop over the components of 
> the answer.
> 
> *Why Existing solutions are insufficient*
> For that all that effort to actually be useful the compiler has to be 
> pretty agressive about inlining everything so that in the end all the 
> temporary template structs and function calls go away and you're just 
> left with one eval call.  It can be tricky to get the code into just the 
> right configuration so that the compiler will do the inlining.  And even 
> then results will depend on the quality of the compiler.  My attempts 
> using MSVC++ several years ago always came out to be slower than the 
> naive version, but with about 10x the amount of code.  The code is also 
> pretty difficult to follow because of all those different types of 
> expression templates that have to be created.
> 
> If you include matrices things are even trickier because there are 
> special cases like "A*B+a*C" that can be computed efficiently by a 
> single optimized routine.  You'd like to recognize such cases and turn 
> them into single calls to that fast routine.  You might also want to 
> recognize "A*B+C" as a special case of that with a==1.
> 
> *How it could be improved*:
> ??? this is what I'd like to see explained better.

Ok, here's my high-level stab at how the new stuff could help.
Instead of returning "expression templates" where the the parse tree is 
represented as a hierarchical type, just make opSub (I-B) return an 
ExpressionWrapper which is just a light wrapper around the string "I-B". 
  Next the compiler gets to "ExpressionWrapper * C", have that just 
return another ExpressionWrapper that modifies the original string to be
    "(" ~ origstring ~ "*C"
In the end you still have an opAssign overload that calls eval() on the 
expression, but now the expression has a (fully parenthesized) string 
representation of the expression:  "((I-B)*C)+D".  And eval just has to 
use the new compile-time string parsing tricks to decide how best to 
evaluate that expression.  It's a pattern matching task, and thankfully 
now you have the whole pattern in one place, as opposed to the 
ExpressionTemplates, which have a hard time really getting a look at the 
whole picture.

--bb