Compile-time optimization
JS
js.mdnq at gmail.com
Tue Jul 23 09:21:08 PDT 2013
There seems to be a lot of improvement for programming languages
to optimize compile time aspects that are not taken into account.
With ctfe I think such optimizations are more and more relevant
in D.
I'll give a simple example:
Take a standard string joining function:
string join(string[] s)
{
string ss;
foreach(sss; s) ss ~= sss;
return ss
}
when this string function is called at run-time with literal
arguments it is extremely inefficient:
join(["a", "b", "c"]);
because the result can obviously be computed at compile-time by
the compiler.
using ctfe can solve this problem but only when used in const
results.
But there should be no reason why join(["a", "b", "c"]) can't be
optimized at compile time so that the function call is replaced
with "abc".
Now, think of join as defined like
string join(T...)(T t);
and we call it like
join("a", s, "b", "c", ss);
where s is string that is compile time evaluable and ss is a
run-time string array.
the optimal join code would be something like
r = "a"~s~"bc"
foreach(sss; ss) r ~= sss;
(and which would change depending on the argument types and if
they are compile time evaluable)
To generate such "optimal" code(one that gets the compiler to do
as much work as it can at compile time) is somewhat convoluted in
D. We can use string mixins to solve the problem but not easily
because we can't pass variadic arguments to templates.
e.g.,
join(T...)(T s)
{
mixin(tJoinH!(s))
}
fails because s is not known at compile time... even if s is
partially known.
It would be nice if we could pass the arguments to a template
even if they are not completely known at compile time.
e.g., join("a", s),
tJoinH can receive "a" and s but s's value, if not compile-time
known, is undefined.
Hence, tJoinH can optimize the arguments by attempting to join
the compile-time arguments and insert optimal code for the
run-time arguments.
That is, if we call a function with an argument list, we should
be able to pass that to a template without any errors and be able
to determine the value of each argument... if the argument is
undefined, then that becomes it's value.
Another way to see it work is with sum:
sum(1,2,x);
int sum(T...)(T i)
{
// return 1 + 2 + i[3]; for the call above, it is optimal and
our goal to generate
// return mixin(tSumH!(i)); // hypothetically generates 1 + 2
+ i[3] for the call above
// return mixin(tSumH2!(i)); // generates i[1] + i[2] + i[3].
Current possible way
// naive approach(assumes all arguments are known only at
run-time):
int y;
foreach(ii; i) y += ii;
return y;
}
I'm able to achieve this partially by passing T, rather than i,
to the template function and the variable name, i. This works but
may not produce optimal results(referencing the stack instead of
the value of a literal... which prevents it from being further
optimized).
e.g., from above, instead of return 1 + 2 + i[3]; I would
generate return i[1] + i[2] + i[3]; Not optimal because i[1] and
i[2] are treated as run-time entities even though in the above
case they are not.
If D had some model to create functions that can optimize the way
they deal with their compile-time arguments and run-time
arguments then it would be pretty easy to create such functions.
e.g.,
string join(T...)(T t) // Note join has a variadic parameter
{
ctfe {
// called when join is used in ctfe, t can be used to
get the values directly if they exist at compile time, else their
value is undefined.
// can call join here with run-time entities
}
// runtime version goes here
}
we might have something like this
string join(T...)(T t) // easy notation to constrain T on char,
string, string[]
{
ctfe
{
string code = "string s;";
foreach(k, val; t)
{
if (isDefined!val)
code ~= `s ~= "`~join(val)~`";`; // calls join as
a ctfe on val.
else
{
if (type == string[])
code ~= `foreach(ss; t[`~k~`]) s~= sss;`;
} else code ~= `s ~= t[`~k~`];`;
}
}
code ~= "return s;";
mixin code;
}
// concatenates all arguments of t as if they were run-time
arguments.
string s;
foreach(k, ss; t)
static if (T[k].type == string[])
foreach(sss; ss)
s ~= sss;
else
s ~= ss;
return s;
}
I'm not saying this works but the idea is that the ctfe block is
executed at compile time WHEN there are non-compile time
arguments(ok, maybe ctfe isn't the best keyword). The block then
figures out how to best deal with the non-ct arguments.
In the above code note how join is called in the ctfe block to
handle known compile time types. This is just standard ctfe
execution. The real "magic" happens when it is dealing with
non-ct arguments, in which case it builds up meta code which is
used as a mixin(special statement at end of ctfe block) that is
inserted.
So for join("a", s); the code would execute something like
at compile time(in ctfe block)
code = `string s;`;
"a" is defined so
code ~= `s ~= "~join("a")~`";`;
join("a") is called, since "a" is known, the non-ctfe
block is called, so we actually have
code ~= `s ~= "a";`;
next for s(which is undefined at compile time)
code ~= `s ~= t[1];`;
end
code ~= "return s;";
so the string mixin code looks like:
string s;
s ~= "a";
s ~= t[1];
return s;
or simplified,
return "a"~t[1];
which is an optimal join code for the join call join("a", s);
(it would make more sense to call a template in the ctfe block so
we don't have form code strings of code strings)
Note also that having such a ctfe block does not break existing
code structures and pre existing functions can easily be upgraded
for optimal behavior.
Having the compiler do as much optimization as possible should
produce a significant speedup in most applications.
When a function is called with any known compile time entities,
it is possible for an optimization to take place. I think if D
incorporates some mechanism in it's language to do so it will go
a long.
In fact, all this simply demonstrates that ctfe's are not as
powerful as they can be. Calling a ctfe'able function with just
one run-time variable will prevent it from being ctfe'ed and
optimized completely... even though there is still some
possibility for optimization.
Hopefully some will read this acutely and attempt to ponder its
potential. I expect the immediate naysayers who can't see past
their own noses... and those that get hung up on non-essential
details(I said that the block doesn't have to be called ctfe...
oh hell, you do realize we don't even need to use a block?).
More information about the Digitalmars-d
mailing list