Go and generic programming on reddit, also touches on D
Robert Jacques
sandford at jhu.edu
Mon Sep 19 23:18:12 PDT 2011
On Mon, 19 Sep 2011 12:12:04 -0400, Andrei Alexandrescu <SeeWebsiteForEmail at erdani.org> wrote:
> On 9/19/11 10:46 AM, Robert Jacques wrote:
>> So, on balance, I'd say the two pointers representation is categorically
>> worse than the fat pointer representation.
>
> Benchmark. A few of your assumptions don't hold.
>
> Andrei
>
So I ran Timon Gehr's benchmark on my system 10 times, and although the Old method performed the best, the times showed a scheduling jitter of about 0.2 s, which is an order of magnitude larger than the differences between the runs. I've had some experience with removing jitter from benchmarks, so I redid it.

For those who don't want to read through all the results and code snippets below, here's the executive summary: there's no measurable difference between the optimum popFront + empty implementations as laid out by Timon Gehr and myself. However, when you consider the current slice-based implementation of popFront (i.e. a = a[1..$]), things change drastically. For an int[], ptr+ptr was 3x slower, and for an int[3][] it was 8x slower.
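For reference, here is a minimal sketch of the two representations under discussion. The struct and method names are mine and purely illustrative; the point is that the optimum empty + popFront cost the same one or two scalar operations either way:

import std.stdio;

// Fat pointer: the current representation, a pointer plus a length.
struct FatSlice(T) {
    T* ptr;
    size_t length;
    bool empty() { return length == 0; }
    void popFront() { ++ptr; --length; }
}

// Two pointers: the proposed representation, begin and end.
struct PtrPtrSlice(T) {
    T* begin, end;
    bool empty() { return begin >= end; }
    void popFront() { ++begin; }
}

void main() {
    int[4] a;
    auto s = PtrPtrSlice!int(a.ptr, a.ptr + a.length);
    size_t n;
    for(; !s.empty(); s.popFront()) ++n;
    writeln(n); // 4
}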
Timon Gehr's benchmark:
Old New Diff
4.64242 4.58722 0.0551916
4.60659 4.5482 0.058386
4.53478 4.51753 0.0172519
4.54561 4.51867 0.026935
4.46662 4.4733 -0.00668139
4.47204 4.46893 0.00310762
4.52798 4.54021 -0.0122326
4.5685 4.57667 -0.00816801
4.50138 4.49186 0.00951325
4.49764 4.49422 0.00342912
--------------------------
The revised benchmark repeats each measurement many times and keeps the minimum, which strips out the scheduling jitter:
import std.datetime, std.stdio, std.algorithm;

int[1_000_000] a;

// Fat pointer style: walk a pointer and count down a length.
void f1() {
    for(auto ptr = a.ptr, len = a.length; len; ++ptr, --len) {}
}

// Two-pointer style: walk begin up to end.
void f2() {
    for(auto b = a.ptr, e = a.ptr + a.length; b < e; b++) {}
}

void main() {
    writeln("Always remember to run benchmarks on a single core. Pause");
    readln();
    double min_f1 = int.max;
    double min_f2 = int.max;
    // Take the minimum over many short runs to strip out jitter.
    foreach(i; 0..10_000) {
        auto b = benchmark!(f1, f2)(3);
        min_f1 = min(min_f1, b[0].to!("seconds", double));
        min_f2 = min(min_f2, b[1].to!("seconds", double));
    }
    writeln("Old\tNew\tDiff");
    writeln(min_f1, "\t", min_f2, "\t", (min_f2 - min_f1)/min_f1*100.0, "%");
}
which gives, on my system:
Old New Diff
0.00127244 0.00127244 0%
This is pretty much as expected: the hand-optimized loops are indistinguishable. Now onto the more critical test: the current slice-based popFront (i.e. a = a[1..$]), with f2 writing it out by hand under the two-pointer representation.
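Before the numbers, here is a hand-written sketch of what a = a[1..$] has to compute under each layout (reusing the illustrative structs from above; this mirrors the operations a compiler must emit, not its actual output):

struct FatSlice(T) { T* ptr; size_t length; }
struct PtrPtrSlice(T) { T* begin, end; }

// Fat pointer: the length is stored, so the bounds check and the
// update are a handful of adds and subtracts.
FatSlice!T tail(T)(FatSlice!T a) {
    assert(1 <= a.length);
    return FatSlice!T(a.ptr + 1, a.length - 1);
}

// Two pointers: in source, end - begin looks free, but it compiles to
// a byte difference divided by T.sizeof. When T.sizeof is not a power
// of two (int[3] is 12 bytes), that's a genuine integer division.
PtrPtrSlice!T tail(T)(PtrPtrSlice!T a) {
    assert(1 <= a.end - a.begin);   // the bounds check needs the division
    return PtrPtrSlice!T(a.begin + 1, a.end);
}

void main() {
    int[3][4] a;
    auto s = PtrPtrSlice!(int[3])(a.ptr, a.ptr + a.length);
    while(s.end - s.begin > 0) s = tail(s); // every step pays the divide
}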
import std.datetime, std.stdio, std.algorithm, std.range, std.array;

alias int[3] T;
T[1_000_000] a;

// Current fat-pointer representation: the real slice-based popFront.
void f1() {
    auto a2 = a[];
    while(!a2.empty) { a2.popFront(); }
}

// Two-pointer representation: a = a[1..$] written out by hand,
// including the length recovery and the slice bounds checks. (The
// pointer arithmetic here is in elements, so this models the
// instruction mix of subtract, divide, multiply and compare rather
// than exact addresses.)
void f2() {
    auto ptrFront = a.ptr;
    auto ptrBack = a.ptr + a.length;
    auto i = 1;                                        // the 1 in a[1..$]
    while(!(ptrFront >= ptrBack)) {                    // empty?
        auto length = (ptrBack - ptrFront) / T.sizeof; // recover the length
        auto begin = ptrFront + T.sizeof * i;          // address of a[1]
        auto end = ptrFront + T.sizeof * length;       // address of a[$]
        if(end - begin >= 0 && ptrBack - end >= 0) {   // bounds checks
            ptrFront = begin;
            ptrBack = end;
        }
    }
}

void main() {
    writeln("Always remember to run benchmarks on a single core. Pause");
    readln();
    double min_f1 = int.max;
    double min_f2 = int.max;
    foreach(i; 0..10_000) {
        auto b = benchmark!(f1, f2)(3);
        min_f1 = min(min_f1, b[0].to!("seconds", double));
        min_f2 = min(min_f2, b[1].to!("seconds", double));
    }
    writeln("Old\tNew\tDiff");
    writeln(min_f1, "\t", min_f2, "\t", (min_f2 - min_f1)/min_f1*100.0, "%");
}
Old New Diff
0.00869062 0.0118318 36.145%
This makes sense given the division: T.sizeof is 12, which is not a power of two, so every popFront pays for a real integer divide. Switching T to an int gives
Old New Diff
0.00868968 0.00502586 -42.1629%
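For reference, a quick check of the element sizes driving that division (my snippet, not part of the original benchmarks):

import std.stdio;

void main() {
    alias int[3] T;
    writeln(int.sizeof); // 4: (end - begin) / 4 reduces to a shift
    writeln(T.sizeof);   // 12: (end - begin) / 12 needs an integer divide
}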
Ah... f2 is hand-inlined while f1 still goes through the range interface. Perhaps I should also inline f1:
// Fat-pointer representation: a = a[1..$] written out by hand,
// bounds checks included.
void f1() {
    // auto a2 = a[];
    // while(!a2.empty) { a2.popFront(); }
    auto i = 1;                        // the 1 in a[1..$]
    auto length = a.length;
    auto ptr = a.ptr;
    while(!(length == 0)) {            // empty?
        auto end = length;             // $ is stored; no division needed
        auto begin = i;
        if(end - begin >= 0 && length - end >= 0) { // bounds checks
            ptr = ptr + T.sizeof * begin;           // advance the pointer
            length = end - begin;                   // shrink the length
        }
    }
}
which results in (note that Old now matches the tight loop from the first test):
Old New Diff
0.00127244 0.00502912 295.233%
And switching T back to int[3] gives:
Old New Diff
0.00127244 0.01192 836.78%
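(Those last two rows are the numbers behind the executive summary: 295% extra time is the roughly 3x slowdown for int[], and 837% extra is the roughly 8x slowdown for int[3][].)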