std.benchmark is in reviewable state

Robert Jacques sandford at jhu.edu
Sun Sep 25 19:02:58 PDT 2011


On Sun, 25 Sep 2011 21:08:45 -0400, Andrei Alexandrescu <SeeWebsiteForEmail at erdani.org> wrote:
> I've had a good time with std.benchmark today and made it ready for
> submission to Phobos. As a perk I've added the %h format specifier,
> which formats a number in human-readable form using SI prefixes.
>
> For those interested in discussing this prior to the formal review
> process, the code is at:
>
> https://github.com/andralex/phobos/blob/benchmark/std/benchmark.d
> https://github.com/andralex/phobos/blob/benchmark/std/format.d
>
> and the dox at
>
> http://erdani.com/d/web/phobos-prerelease/std_benchmark.html
> http://erdani.com/d/web/phobos-prerelease/std_format.html
>
> Comments and suggestions are welcome. If possible, it would be great if
> this submission were given a bump in the priority queue. This is because
> with robust benchmarking in Phobos we can ask new, performance-sensitive
> submissions to add benchmarking.
>
>
> Thanks,
>
> Andrei

Andrei, good job on the framework design, but you've left in all of the design flaws of the current benchmark routine. To quote myself from last Tuesday:

>> On Tue, 20 Sep 2011 14:01:05 -0400, Timon Gehr <timon.gehr at gmx.ch> wrote:
> [snip]
>
>> Thank you for making this more meaningful! I assumed the standard
>> library benchmark function would take care of those things. Should it?
> Yes and no. Benchmark provides a good way to make a single measurement of
> a function, as for really short functions you do have to loop many times
> to be able to get a reliable reading. However, actual benchmarking requires
> a) tuning the benchmark() call time to about 10-20 ms and b) running
> benchmark() many times, taking the minimum. The idea is that on any given
> run you could hit a context switch, etc. so if you make multiple runs, one
> will get lucky and not be interrupted. Worse, if a core switch happens
> between StopWatch start and end, the number of ticks returned is random.
> Hence, the comment to manually limit execution to a single core. So, it
> might be nice if benchmark() also took a second parameter, denoting the
> number of times to benchmark the function and had some better documentation
> on how to do good benchmarking.

Wikipedia also lists some of the challenges to modern benchmarking: http://en.wikipedia.org/wiki/Time_Stamp_Counter

So, std.benchmark should
[ ] Set the affinity to a single core at start (e.g. use SetProcessAffinityMask, etc.)
[ ] Repeat the measurement N times, taking the min.
[X] Adaptively increase the number of loop iterations to ensure a valid reading.
[ ] Adaptively decrease the number of loop iterations to ensure minimal context switching.
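
To make the checklist concrete, here is a rough sketch of the measurement loop I have in mind. It is written in C++ rather than D so that it does not pretend to be the std.benchmark API, and the names timeBatch and minTimePerCall, as well as the default values, are made up for illustration. Doubling the batch size until one batch reaches the target keeps each batch at roughly one to two times the target (so about 10-20 ms with the default), which covers both adaptive items; pinning to a core is platform-specific and only noted in a comment.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <limits>

using Clock = std::chrono::steady_clock;

// Time `iterations` back-to-back calls of `fn`; returns elapsed seconds.
template <typename F>
double timeBatch(F&& fn, std::uint64_t iterations) {
    const auto start = Clock::now();
    for (std::uint64_t i = 0; i < iterations; ++i) fn();
    const std::chrono::duration<double> elapsed = Clock::now() - start;
    return elapsed.count();
}

// Hypothetical helper: best observed seconds per call of `fn`.
// Assumption: the caller has already pinned the process to one core
// (e.g. SetProcessAffinityMask on Windows, sched_setaffinity on Linux).
template <typename F>
double minTimePerCall(F&& fn, int repeats = 20, double targetSeconds = 0.010) {
    assert(repeats > 0 && targetSeconds > 0.0);

    // Grow the batch until it is long enough for a valid reading, but stop
    // as soon as it crosses the target so a batch stays short enough to
    // usually fit between context switches. The cap guards against a `fn`
    // that the optimizer has reduced to nothing.
    std::uint64_t iterations = 1;
    while (iterations < (1ull << 40) &&
           timeBatch(fn, iterations) < targetSeconds) {
        iterations *= 2;
    }

    // Repeat the measurement and keep the minimum: interruptions can only
    // add time, so the smallest batch is the least disturbed one.
    double best = std::numeric_limits<double>::infinity();
    for (int r = 0; r < repeats; ++r) {
        best = std::min(best, timeBatch(fn, iterations));
    }
    return best / static_cast<double>(iterations);
}
```

Calling minTimePerCall(someFunction) returns the best per-call time in seconds. The function under test needs an observable side effect (for example a write to a volatile variable), otherwise the compiler may remove the call entirely.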

