Compile time loop unrolling

Bill Baxter dnewsgroup at billbaxter.com
Wed Aug 29 01:38:16 PDT 2007


Has anyone done this before?
It's pretty similar to what Don's stuff does, and maybe Don is even 
doing this in part of Blade somewhere, but anyway it's a little 
different from the type of thing he's got on his web page.

Here the basic idea is to optimize templated small vector classes.

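Say you've got a struct Vector(N) type.  A lot of the operations, written 
out by hand, look like
     values_[0] op other.values_[0];
     values_[1] op other.values_[1];
     ...
     values_[N-1] op other.values_[N-1];

The helper below builds those N statements as a string at compile time, 
so they can be mixed in instead of looping at run time.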

//----------------------------------------------------------------------------
import std.metastrings;

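// Create a string that unrolls the given expression N times, replacing
// every occurrence of the character idx with the iteration number
// (0 .. N-1) each time.
string unroll(int N,int i=0)(string expr, char idx='z') {
     static if(i<N) {
         char[] subs_expr;
         foreach (c; expr) {
             if (c==idx) {
                 subs_expr ~= ToString!(i);  // splice in the index as text
             } else {
                 subs_expr ~= c;
             }
         }
         // Recurse at compile time to emit the remaining iterations.
         return subs_expr ~ "\n" ~ unroll!(N,i+1)(expr,idx);
     } else {
         return "";  // i == N: nothing left to unroll
     }
}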
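
For anyone trying this on a current compiler, here is a rough sketch of the same helper ported to D2. It is not the code above: std.metastrings is swapped for std.conv.to, and N becomes an ordinary function argument evaluated through CTFE instead of a template parameter. Those choices are my assumptions. The static assert shows the string generated for N = 3.

```d
import std.conv : to;

// Hypothetical D2 port of unroll: n is a plain argument, and the whole
// function is evaluated at compile time when used in a mixin or static assert.
string unroll(size_t n, string expr, char idx = 'z')
{
    string result;
    foreach (i; 0 .. n)
    {
        foreach (char c; expr)
        {
            if (c == idx)
                result ~= to!string(i); // splice in the iteration number
            else
                result ~= c;
        }
        result ~= "\n"; // one statement per line, as in the original
    }
    return result;
}

// Checked entirely at compile time.
static assert(unroll(3, "values_[z] += _rhs[z];") ==
    "values_[0] += _rhs[0];\n" ~
    "values_[1] += _rhs[1];\n" ~
    "values_[2] += _rhs[2];\n");
```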

Then to use it to implement opAddAssign you write code like:

     alias unroll!(N) unroll_;
     void opAddAssign(ref vector_type _rhs) {
         const string expr = "values_[z] += _rhs[z];";
         //pragma(msg,unroll_(expr)); // handy for debug
         mixin( unroll_(expr) );
     }

Seems to work pretty well despite the braindead strategy of "replace 
every 'z' with the loop number", which means the expression can't 
contain a 'z' anywhere else (in an identifier, say).
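
One way around the single-character limitation is to substitute a multi-character token instead. The sketch below is D2, and the names substitute, unrollToken and the "$i" placeholder are all made up for illustration. The replacement loop is written by hand so that it clearly runs under CTFE.

```d
import std.conv : to;

// Replace every occurrence of token in expr with value.
// (Hypothetical helper, written with plain slicing so CTFE can run it.)
string substitute(string expr, string token, string value)
{
    assert(token.length > 0, "token must not be empty");
    string result;
    size_t pos = 0;
    while (pos < expr.length)
    {
        if (pos + token.length <= expr.length
            && expr[pos .. pos + token.length] == token)
        {
            result ~= value;
            pos += token.length;
        }
        else
        {
            result ~= expr[pos];
            pos++;
        }
    }
    return result;
}

// Emit expr n times, with token replaced by 0, 1, ..., n-1.
string unrollToken(size_t n, string expr, string token = "$i")
{
    string result;
    foreach (i; 0 .. n)
        result ~= substitute(expr, token, to!string(i)) ~ "\n";
    return result;
}

struct Vector(size_t N)
{
    int[N] values_;

    // D2 spelling of opAddAssign.
    void opOpAssign(string op)(const Vector rhs) if (op == "+")
    {
        mixin(unrollToken(N, "values_[$i] += rhs.values_[$i];"));
    }
}
```

With this, an expression such as "lazy[$i] = z[$i];" keeps its 'z' characters intact.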

I suspect this would improve performance significantly when using DMD, 
since DMD can't inline any function that contains a loop, whereas the 
unrolled body is straight-line code.

With D2.0 and a "static foreach(i;N)" type of construct you could 
probably do the same thing by just saying:
     static foreach(i;N) {
        values_[i] += _rhs.values_[i];
     }
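
For reference, later D2 releases did gain a static foreach, with the bounds written as a range (0 .. N) instead of the (i;N) form guessed at above. A minimal sketch of the vector case follows. The Vector layout and the int element type are my assumptions, chosen to keep the example self-contained.

```d
struct Vector(size_t N)
{
    int[N] values_;

    // D2 spelling of opAddAssign.
    void opOpAssign(string op)(const Vector rhs) if (op == "+")
    {
        // Expanded at compile time into N separate statements,
        // with i a compile-time constant in each copy.
        static foreach (i; 0 .. N)
        {
            values_[i] += rhs.values_[i];
        }
    }
}
```

No string mixin is needed here, so there is no placeholder character to worry about.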

I wish that were coming to D1.0.

--bb



More information about the Digitalmars-d mailing list