Performance

Marco Leise via Digitalmars-d digitalmars-d at puremagic.com
Fri May 30 22:18:00 PDT 2014


Compile this with: -O3 -frelease -fno-assert -fno-bounds-check -march=native
This way GCC and LLVM will recognize that you alternately add
p0 and p1 to the sum. They then partially unroll the loop, which
removes the condition. It takes 1.4xxxx nanoseconds per step
on my not so new 2.0 GHz notebook, so I assume your PC will
easily reach parity with your original C++ version.



import std.stdio;
import core.time;

alias ℕ = size_t;

void main()
{
	run!plus(1_000_000_000);
}

// Adds p0 on even steps and p1 on odd steps; the choice depends only on i.
double plus(ℕ steps)
{
	enum p0 = 0.0045;
	enum p1 = 1.00045452 - p0;

	double sum = 1.346346;
	foreach (i; 0 .. steps)
		sum += i % 2 ? p1 : p0;
	return sum;
}

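// Times func(steps) and prints its result plus the average nanoseconds per step.
void run(alias func)(ℕ steps)
{
	auto t1 = TickDuration.currSystemTick;
	auto output = func(steps);
	auto t2 = TickDuration.currSystemTick;
	// ticks / ticksPerSec gives seconds; scaling by 1e9 / steps gives ns per step.
	auto nanotime = 1_000_000_000.0 / steps * (t2 - t1).length / TickDuration.ticksPerSec;
	writefln("Last: %s", output);
	writefln("Time per op: %s", nanotime);
	writeln();
}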
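
For illustration, here is roughly what the partial unrolling described
above amounts to when written by hand. This is only a sketch: the names
plusBranchy and plusUnrolled are made up for this example, and the code an
optimizer actually emits will look different (it may unroll further or
vectorize).

```d
import std.stdio;

// Same constants as in plus() above.
enum p0 = 0.0045;
enum p1 = 1.00045452 - p0;

// Reference version (hypothetical name): one condition per step,
// exactly like the original loop.
double plusBranchy(size_t steps)
{
	double sum = 1.346346;
	foreach (i; 0 .. steps)
		sum += i % 2 ? p1 : p0;
	return sum;
}

// Hand-unrolled by two (hypothetical name): even steps add p0 and
// odd steps add p1, so a pair of steps needs no condition at all.
double plusUnrolled(size_t steps)
{
	double sum = 1.346346;
	foreach (pair; 0 .. steps / 2)
	{
		sum += p0; // even i
		sum += p1; // odd i
	}
	if (steps % 2)
		sum += p0; // the leftover step has an even index
	return sum;
}

void main()
{
	// Both versions add the same values in the same order,
	// so the two printed sums should agree.
	writefln("%s %s", plusBranchy(1_000_001), plusUnrolled(1_000_001));
}
```

The only condition left is the single check for an odd step count after
the loop, which is why the per-step cost drops.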

-- 
Marco


