Go and generic programming on reddit, also touches on D
Robert Jacques
sandford at jhu.edu
Mon Sep 19 23:18:12 PDT 2011
On Mon, 19 Sep 2011 12:12:04 -0400, Andrei Alexandrescu <SeeWebsiteForEmail at erdani.org> wrote:
> On 9/19/11 10:46 AM, Robert Jacques wrote:
>> So, on balance, I'd say the two pointers representation is categorically
>> worse than the fat pointer representation.
>
> Benchmark. A few of your assumptions don't hold.
>
> Andrei
>
So I ran Timon Gehr's benchmark on my system 10 times, and although the Old method performed the best, the times showed a scheduling jitter of about 0.2 s, which is an order of magnitude larger than the differences between the runs. I've had some experience with removing jitter from benchmarks, so I redid it.

For those who don't want to read through all the results and code snippets below, here's the executive summary: there's no measurable difference between the optimum popFront + empty implementations as laid out by Timon Gehr and myself. However, when you consider the current slice-based implementation of popFront (i.e. a = a[1..$]), things change drastically. For an int[], ptr+ptr was 3x slower, and for an int[3][] it was 8x slower.
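For reference, here is a minimal sketch of the two representations under discussion. The struct and method names are mine and purely illustrative; the point is that the optimum empty + popFront cost the same one or two scalar operations either way:

import std.stdio;

// Fat pointer: the current representation, a pointer plus a length.
struct FatSlice(T) {
    T* ptr;
    size_t length;
    bool empty() { return length == 0; }
    void popFront() { ++ptr; --length; }
}

// Two pointers: the proposed representation, begin and end.
struct PtrPtrSlice(T) {
    T* begin, end;
    bool empty() { return begin >= end; }
    void popFront() { ++begin; }
}

void main() {
    int[4] a;
    auto s = PtrPtrSlice!int(a.ptr, a.ptr + a.length);
    size_t n;
    for(; !s.empty(); s.popFront()) ++n;
    writeln(n); // 4
}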
Timon Gehr's benchmark:
Old New Diff
4.64242 4.58722 0.0551916
4.60659 4.5482 0.058386
4.53478 4.51753 0.0172519
4.54561 4.51867 0.026935
4.46662 4.4733 -0.00668139
4.47204 4.46893 0.00310762
4.52798 4.54021 -0.0122326
4.5685 4.57667 -0.00816801
4.50138 4.49186 0.00951325
4.49764 4.49422 0.00342912
--------------------------
The revised benchmark repeats each measurement many times and keeps the minimum, which strips out the scheduling jitter:
import std.datetime, std.stdio, std.algorithm;

int[1_000_000] a;

// Fat pointer style: walk a pointer and count down a length.
void f1() {
    for(auto ptr = a.ptr, len = a.length; len; ++ptr, --len) {}
}

// Two-pointer style: walk begin up to end.
void f2() {
    for(auto b = a.ptr, e = a.ptr + a.length; b < e; b++) {}
}

void main() {
    writeln("Always remember to run benchmarks on a single core. Pause");
    readln();
    double min_f1 = int.max;
    double min_f2 = int.max;
    // Take the minimum over many short runs to strip out jitter.
    foreach(i; 0..10_000) {
        auto b = benchmark!(f1, f2)(3);
        min_f1 = min(min_f1, b[0].to!("seconds", double));
        min_f2 = min(min_f2, b[1].to!("seconds", double));
    }
    writeln("Old\tNew\tDiff");
    writeln(min_f1, "\t", min_f2, "\t", (min_f2 - min_f1)/min_f1*100.0, "%");
}
which gives, on my system:
Old New Diff
0.00127244 0.00127244 0%
This is pretty much as expected: the hand-optimized loops are indistinguishable. Now onto the more critical test: the current slice-based popFront (i.e. a = a[1..$]), with f2 writing it out by hand under the two-pointer representation.
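Before the numbers, here is a hand-written sketch of what a = a[1..$] has to compute under each layout (reusing the illustrative structs from above; this mirrors the operations a compiler must emit, not its actual output):

struct FatSlice(T) { T* ptr; size_t length; }
struct PtrPtrSlice(T) { T* begin, end; }

// Fat pointer: the length is stored, so the bounds check and the
// update are a handful of adds and subtracts.
FatSlice!T tail(T)(FatSlice!T a) {
    assert(1 <= a.length);
    return FatSlice!T(a.ptr + 1, a.length - 1);
}

// Two pointers: in source, end - begin looks free, but it compiles to
// a byte difference divided by T.sizeof. When T.sizeof is not a power
// of two (int[3] is 12 bytes), that's a genuine integer division.
PtrPtrSlice!T tail(T)(PtrPtrSlice!T a) {
    assert(1 <= a.end - a.begin);   // the bounds check needs the division
    return PtrPtrSlice!T(a.begin + 1, a.end);
}

void main() {
    int[3][4] a;
    auto s = PtrPtrSlice!(int[3])(a.ptr, a.ptr + a.length);
    while(s.end - s.begin > 0) s = tail(s); // every step pays the divide
}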
import std.datetime, std.stdio, std.algorithm, std.range, std.array;

alias int[3] T;
T[1_000_000] a;

// Current fat-pointer representation: the real slice-based popFront.
void f1() {
    auto a2 = a[];
    while(!a2.empty) { a2.popFront(); }
}

// Two-pointer representation: a = a[1..$] written out by hand,
// including the length recovery and the slice bounds checks. (The
// pointer arithmetic here is in elements, so this models the
// instruction mix of subtract, divide, multiply and compare rather
// than exact addresses.)
void f2() {
    auto ptrFront = a.ptr;
    auto ptrBack = a.ptr + a.length;
    auto i = 1;                                        // the 1 in a[1..$]
    while(!(ptrFront >= ptrBack)) {                    // empty?
        auto length = (ptrBack - ptrFront) / T.sizeof; // recover the length
        auto begin = ptrFront + T.sizeof * i;          // address of a[1]
        auto end = ptrFront + T.sizeof * length;       // address of a[$]
        if(end - begin >= 0 && ptrBack - end >= 0) {   // bounds checks
            ptrFront = begin;
            ptrBack = end;
        }
    }
}

void main() {
    writeln("Always remember to run benchmarks on a single core. Pause");
    readln();
    double min_f1 = int.max;
    double min_f2 = int.max;
    foreach(i; 0..10_000) {
        auto b = benchmark!(f1, f2)(3);
        min_f1 = min(min_f1, b[0].to!("seconds", double));
        min_f2 = min(min_f2, b[1].to!("seconds", double));
    }
    writeln("Old\tNew\tDiff");
    writeln(min_f1, "\t", min_f2, "\t", (min_f2 - min_f1)/min_f1*100.0, "%");
}
Old New Diff
0.00869062 0.0118318 36.145%
This makes sense given the division: T.sizeof is 12, which is not a power of two, so every popFront pays for a real integer divide. Switching T to an int gives
Old New Diff
0.00868968 0.00502586 -42.1629%
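For reference, a quick check of the element sizes driving that division (my snippet, not part of the original benchmarks):

import std.stdio;

void main() {
    alias int[3] T;
    writeln(int.sizeof); // 4: (end - begin) / 4 reduces to a shift
    writeln(T.sizeof);   // 12: (end - begin) / 12 needs an integer divide
}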
Ah... f2 is hand-inlined while f1 still goes through the range interface. Perhaps I should also inline f1:
// Fat-pointer representation: a = a[1..$] written out by hand,
// bounds checks included.
void f1() {
    // auto a2 = a[];
    // while(!a2.empty) { a2.popFront(); }
    auto i = 1;                        // the 1 in a[1..$]
    auto length = a.length;
    auto ptr = a.ptr;
    while(!(length == 0)) {            // empty?
        auto end = length;             // $ is stored; no division needed
        auto begin = i;
        if(end - begin >= 0 && length - end >= 0) { // bounds checks
            ptr = ptr + T.sizeof * begin;           // advance the pointer
            length = end - begin;                   // shrink the length
        }
    }
}
which results in (note that Old now matches the tight loop from the first test):
Old New Diff
0.00127244 0.00502912 295.233%
And switching T back to int[3] gives:
Old New Diff
0.00127244 0.01192 836.78%
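(Those last two rows are the numbers behind the executive summary: 295% extra time is the roughly 3x slowdown for int[], and 837% extra is the roughly 8x slowdown for int[3][].)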