Performance
Thomas via Digitalmars-d
digitalmars-d at puremagic.com
Sat May 31 10:44:23 PDT 2014
On Saturday, 31 May 2014 at 05:12:54 UTC, Marco Leise wrote:
> Run this with: -O3 -frelease -fno-assert -fno-bounds-check
> -march=native
> This way GCC and LLVM will recognize that you alternately add
> p0 and p1 to the sum and partially unroll the loop, thereby
> removing the condition. It takes 1.4xxxx nanoseconds per step
> on my not so new 2.0 GHz notebook, so I assume your PC will
> easily reach parity with your original C++ version.
>
>
>
> import std.stdio;
> import core.time;
>
> alias ℕ = size_t;
>
> void main()
> {
>     run!plus(1_000_000_000);
> }
>
> double plus(ℕ steps)
> {
>     enum p0 = 0.0045;
>     enum p1 = 1.00045452 - p0;
>
>     double sum = 1.346346;
>     foreach (i; 0 .. steps)
>         sum += i%2 ? p1 : p0;
>     return sum;
> }
>
> void run(alias func)(ℕ steps)
> {
>     auto t1 = TickDuration.currSystemTick;
>     auto output = func(steps);
>     auto t2 = TickDuration.currSystemTick;
>     auto nanotime = 1_000_000_000.0 / steps * (t2 - t1).length /
>         TickDuration.ticksPerSec;
>     writefln("Last: %s", output);
>     writefln("Time per op: %s", nanotime);
>     writeln();
> }
Thank you for the help. Which OS is running on your notebook?
I ask because I compiled your source code with the GCC compiler
using your settings, and the run took 3.1xxxx nanoseconds per
step. With the DMD compiler the run took 5.xxxx nanoseconds. So I
think the problem could be specific to the Linux versions of the
GCC and DMD compilers.
Thomas