DeRailed DSL (was Re: compile-time regex redux)

Don Clugston dac at nospam.com.au
Mon Feb 12 00:06:25 PST 2007


Andrei Alexandrescu (See Website For Email) wrote:
> Don Clugston wrote:
>> To do a perfect solution, you'd need to be able to identify when a 
>> variable is used twice, for example,
>>
>> a = 3.0 * b + (c - a * 1.5);
>> The compiler knows that 'a' occurs twice, but that information is not 
>> transferred into expression templates.
> 
> I think what you really need is aliasing information (e.g. two names 
> referring to the same vector), and that is typically not easily computable.

Ideally, yes. But there's value in knowing names which are *definitely* 
aliased, even if the majority of names have indeterminate aliasing. I 
think that most of the value comes from the simple cases, the ultimate 
example being the += operator.

Example: for x87, the most severe limitation on expression complexity is 
not the number of temporaries, but the number of distinct vectors (you 
run out of pointer registers before you run out of floating-point stack).

>> The compiler may also know the length of some of the arrays, and that 
>> is also lost.
> 
> Ah, indeed. So that would mean better error checking and probably less 
> runtime bounds checking.

Exactly. And there's implications for loop unrolling, too. (eg, loop 
unrolling is very attractive if you know the length is a multiple of 4).

>> Of course another approach would be to add
>> opAssignExpression(char [] expr)()
>> {
>> }
>> where "expr" is the verbatim expression.
>> a = b*5 + c*d;
>>
>> would become
>> mixin a.opAssignExpression!("a=b*5+c*d");
>>
>> since it seems that with expression templates, you're fighting the 
>> compiler, attempting to undo everything it has done.
> 
> Not everything - e.g., precedence of operators remains unchanged. In the 
> string-based approach you'd have to write a little parser to reimplement 
> operator precedences. But point taken.

I've thought about this a bit more. The compiler has done some useful 
constant folding for you, and parsed all the literals. So ideally you 
want it in a partially digested form.
My implementation converts the expression into a tuple of values 
(basically a stack), and a char [] of postfix operations to be performed 
on that stack. Something like this is probably close to ideal, 
especially if the compiler could remove known duplicate (aliased) values 
from the tuple.

There would need to be a canonical form for the operations, though -- 
postfix is great for Forth, but looks thoroughly out of place in D.
eg
a = 3.0 * b + (c - a * 1.5);
becomes
"v2 v3 * v4 v5 v1 * - v1 ="
with the tuple being v[1] = a, v[2] = 3.0, v[3] = b, v[4] = c, v[5] = 1.5
But perhaps a
char [] ConvertToPostfix(char [] expr) library metafunction is all 
that's required.



More information about the Digitalmars-d mailing list