Compile time loop unrolling
Bill Baxter
dnewsgroup at billbaxter.com
Wed Aug 29 01:38:16 PDT 2007
Has anyone done this before?
It's pretty similar to what Don's stuff does, and maybe Don is even
doing this in part of Blade somewhere, but anyway it's a little
different from the type of thing he's got on his web page.
Here the basic idea is to optimize templated small vector classes.
Say you've got a struct Vector(N) type. A lot of the operations look like
    values_[0] op other.values_[0];
    values_[1] op other.values_[1];
    ...
    values_[N-1] op other.values_[N-1];
//----------------------------------------------------------------------------
import std.metastrings;

// Create a string that unrolls the given expression N times, replacing
// idx in the expression with the iteration number each time
string unroll(int N, int i=0)(string expr, char idx='z') {
    static if (i<N) {
        char[] subs_expr;
        foreach (c; expr) {
            if (c==idx) {
                subs_expr ~= ToString!(i);
            } else {
                subs_expr ~= c;
            }
        }
        return subs_expr ~ "\n" ~ unroll!(N,i+1)(expr,idx);
    } else {
        // recursion ends once i reaches N
        return "";
    }
}
Then to use it to implement opAddAssign you write code like:
    alias unroll!(N) unroll_;

    void opAddAssign(ref vector_type _rhs) {
        const string expr = "values_[z] += _rhs[z];";
        //pragma(msg,unroll_(expr)); // handy for debug
        mixin( unroll_(expr) );
    }
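
For anyone reading this with a current D2 compiler, here is a minimal self-contained sketch of the same idea. It is not the code above: std.metastrings is swapped for std.conv.to, opAddAssign becomes D2's opOpAssign, the recursion becomes a plain loop evaluated at compile time, and the Vector struct with its float elements is a hypothetical stand-in for the real vector class.

```d
import std.conv : to;

// Builds N copies of expr, replacing every occurrence of idx with 0 .. N-1.
// Only plain loops and appends are used, so this can run at compile time.
string unroll(size_t N)(string expr, char idx = 'z')
{
    string result;
    foreach (i; 0 .. N)
    {
        foreach (c; expr)
        {
            if (c == idx)
                result ~= to!string(i);
            else
                result ~= c;
        }
        result ~= "\n";
    }
    return result;
}

// Hypothetical small vector type, for illustration only.
struct Vector(size_t N)
{
    float[N] values_;

    void opOpAssign(string op)(ref const Vector rhs) if (op == "+")
    {
        enum expr = "values_[z] += rhs.values_[z];";
        // pragma(msg, unroll!N(expr)); // prints the generated statements
        mixin(unroll!N(expr));
    }
}

void main()
{
    // The generated string is checked at compile time.
    static assert(unroll!2("a[z] = z;") == "a[0] = 0;\na[1] = 1;\n");

    auto a = Vector!3([1.0f, 2.0f, 3.0f]);
    auto b = Vector!3([10.0f, 20.0f, 30.0f]);
    a += b;
    assert(a.values_ == [11.0f, 22.0f, 33.0f]);
}
```

The same caveat applies as in the original: every 'z' in the expression is replaced, so the expression must not contain that character anywhere except as the index.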
Seems to work pretty well despite the braindead strategy of "replace
every 'z' with the loop number".
I suspect this would improve performance significantly when using DMD
since it can't inline anything with loops.
With D2.0 and a "static foreach(i;N)" type of construct you could
probably do this by just saying:

    static foreach(i;N) {
        values_[i] = _rhs.values_[i];
    }
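
A construct along these lines did land in later D2 releases, though it takes a range such as 0 .. N rather than a bare N. The following is only a sketch under that assumption: the Vector struct and its float elements are hypothetical, and it uses += to match the opAddAssign example rather than plain assignment.

```d
// Hypothetical small vector type, for illustration only.
struct Vector(size_t N)
{
    float[N] values_;

    void opOpAssign(string op)(ref const Vector rhs) if (op == "+")
    {
        // The body is instantiated once per index at compile time,
        // so no runtime loop remains in the generated function.
        static foreach (i; 0 .. N)
        {
            values_[i] += rhs.values_[i];
        }
    }
}

void main()
{
    auto a = Vector!3([1.0f, 2.0f, 3.0f]);
    auto b = Vector!3([10.0f, 20.0f, 30.0f]);
    a += b;
    assert(a.values_ == [11.0f, 22.0f, 33.0f]);
}
```

Compared with the string mixin, this needs no character substitution, so there is nothing like the 'z' replacement to trip over.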
I wish that were coming to D1.0.
--bb