std.benchmark is in reviewable state

Jens Mueller jens.k.mueller at gmx.de
Mon Oct 3 06:59:32 PDT 2011


Andrei Alexandrescu wrote:
> On 9/25/11 6:08 PM, Andrei Alexandrescu wrote:
> >I've had a good time with std.benchmark today and made it ready for
> >submission to Phobos.
> [snip]
> 
> You destroyed, and I listened. I updated my benchmark branch to
> accept a configurable time budget for measurements, and a number of
> trials that take the minimum time.
> 
> Still need to do the thread affinity stuff.
> 
> Code:
> 
> https://github.com/andralex/phobos/blob/benchmark/std/benchmark.d
> https://github.com/andralex/phobos/blob/benchmark/std/format.d
> 
> Dox:
> 
> http://erdani.com/d/web/phobos-prerelease/std_benchmark.html
> http://erdani.com/d/web/phobos-prerelease/std_format.html
> 
> 
> Destroy once again.
> 
> Andrei

Overall I like this module very much.

1. I'm not convinced that prepending benchmark_ to a function name is
that nice, even though it is simple. Following the same idea I could
remove all my unittest blocks, add functions starting with test_, and
write a module std.test in the same fashion. If this approach is
considered the way to go, we could remove unittest altogether and solve
testing entirely in a library. It looks less elegant to me, but I admit
that I see no other way and it is the easiest solution after all. I just
want to stress that, seeing the code, I asked myself whether having
unittest in the language is worth it.
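
For reference, the name-prefix discovery both ideas rely on can be done
in a library. A rough sketch (runTests and the test_ prefix are made up
for illustration, not an actual std.test API):

  // walk a module's members at compile time and call every
  // parameterless function whose name starts with "test_"
  void runTests(alias mod)()
  {
      foreach (member; __traits(allMembers, mod))
      {
          static if (member.length > 5 && member[0 .. 5] == "test_")
          {
              // only call it if it is callable with no arguments
              static if (is(typeof(__traits(getMember, mod, member)())))
                  __traits(getMember, mod, member)();
          }
      }
  }

Calling runTests!(mymodule)() would then run them all, which is roughly
what the benchmark_ prefix scanning amounts to.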

2. When benchmarking code you often investigate an algorithm's scaling
behavior. That's why I think it's useful to give an example of this in
the documentation. Something like:

  import std.container, std.random, std.range;

  void benchmark_insert_array(size_t length)(uint n)
  {
      benchmarkPause();
      Array!bool array;
      array.length = length;
      benchmarkResume();

      foreach (i; 0 .. n)
          array.insertAfter(array[0 .. uniform(0, length)], true);
  }

  void benchmark_insert_slist(size_t length)(uint n)
  {
      benchmarkPause();
      SList!bool list;
      foreach (i; 0 .. length)
          list.insertFront(false);
      benchmarkResume();

      foreach (i; 0 .. n)
      {
          auto r = take(list[], uniform(0, length));
          list.insertAfter(r, true);
      }
  }

  alias benchmark_insert_slist!(10) benchmark_insert_slist_10;
  alias benchmark_insert_array!(10) benchmark_insert_array_10;
  alias benchmark_insert_slist!(100) benchmark_insert_slist_100;
  alias benchmark_insert_array!(100) benchmark_insert_array_100;

This example needs more polishing.

Input size is the most important dimension CPU time depends on, so the
documentation should show how to benchmark as a function of input size.

3.
  (r[i] / 10_000_000.).to!("nsecs", int)
should be written as
  (r[i] / 10_000_000).nsecs

Same for msecs and the other units.
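
For readers who don't know these helpers: core.time provides the dur
template and shorthand aliases such as nsecs and msecs, so the two forms
below build the same Duration (a trivial sketch, independent of
std.benchmark):

  import core.time;

  void example()
  {
      // dur!"nsecs" and the nsecs alias construct the same Duration
      auto a = dur!"nsecs"(1_500);
      auto b = nsecs(1_500);
      assert(a == b);

      // the same shorthands exist for usecs, msecs, seconds, ...
      auto c = msecs(3) + usecs(250);
  }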

4. Benchmark results should be exposed to the caller.
Usually it is enough to write the results to the console, but you may
want to post-process them, be it graphically or as XML output for
whatever reason. I think it would be beneficial to add this flexibility
and separate the benchmarking logic from generating the output.
It might also be useful to include the CPU in use (via std.cpuid) in the
output.
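
To make the separation concrete, I mean something along these lines
(BenchmarkResult and reportConsole are made-up names, not part of the
proposed API):

  // hypothetical record that one benchmark run would produce
  struct BenchmarkResult
  {
      string name;        // name of the benchmarked function
      ulong  iterations;  // number of calls that were measured
      double nsPerCall;   // average time per call in nanoseconds
  }

  // default console reporter; an XML or plotting backend could consume
  // the same BenchmarkResult[] without touching the measurement code
  void reportConsole(in BenchmarkResult[] results)
  {
      import std.stdio : writefln;
      foreach (r; results)
          writefln("%-30s %10s calls %12.1f ns/call",
                   r.name, r.iterations, r.nsPerCall);
  }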

5. In the first example I completely missed that "relative calls" is not
a single term. I read it as one column header even though I didn't know
what it was supposed to mean. Only later did I figure out that there is
a separate, empty column called "relative".

6. Wrong numbers in the documentation.
The examples in the documentation contain wrong numbers. One should pay
special attention to keeping the documentation consistent; I don't think
this can be automated.
There are also rounding errors: when I try to recompute these numbers
myself, I get results that differ in the first digit after the decimal
point. I wonder why that is. Even though the measurement may be noisy,
the derived numbers should be more precise, I think.

7. Changing units in the output.
A single benchmark report may use different units in its output; the one
in the example mixes microseconds and nanoseconds. This makes comparison
more difficult and should be avoided (at least within a group), because
it is error-prone when analyzing the numbers.
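
As a toy illustration, a report generator could pick one unit per group
up front (commonUnit is a made-up helper, not proposed API):

  // pick a single unit for a group of per-call times given in
  // nanoseconds, so that every row of the group uses the same scale
  string commonUnit(in double[] nsPerCall)
  {
      import std.algorithm : minElement;
      immutable smallest = nsPerCall.minElement;
      if (smallest >= 1_000_000) return "msecs";
      if (smallest >= 1_000)     return "usecs";
      return "nsecs";
  }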

8. I prefer writing std.benchmark.pause() over benchmarkPause() and
std.benchmark.resume() over benchmarkResume(), or alternatively
  import std.benchmark : benchmark;
and then benchmark.pause().

Supporting benchmark.start() would improve the example, since you would
only need a single line indicating where the benchmarking should start.
But this seems difficult to make work with the current design, and I
don't consider it important. It's a very minor thing.
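
For illustration, what I have in mind, sketched with a hypothetical
start() that the current API does not offer:

  import std.benchmark;              // the module under review
  import std.container : SList;

  void benchmark_insert_slist(uint n)
  {
      // setup that should not be measured
      SList!bool list;
      foreach (i; 0 .. 1000)
          list.insertFront(false);

      std.benchmark.start();         // hypothetical: measurement begins
                                     // here instead of pausing/resuming
                                     // around the setup above
      foreach (i; 0 .. n)
          list.insertFront(true);
  }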

10. The file reading/writing benchmark should be improved, since the
created file should be deleted afterwards. The file writing benchmark
should be more like:
void benchmark_fileWrite()
{
    benchmarkPause();
    auto f = File.tmpfile();   // std.stdio; deleted automatically on close
    benchmarkResume();

    f.writeln("hello, world!");
}

But this also measures the deletion of the file.
I have no idea how to fix the benchmark for reading a file, but I
believe it involves using mkstemp on POSIX and something similar on
Windows. mkstemp should be added to std.file in the long run.
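
Lacking mkstemp, a crude sketch of what I mean for the reading benchmark
(the fixed file name is only for illustration and has the obvious
collision problem):

  void benchmark_fileRead()
  {
      import std.file : remove, write;
      import std.stdio : File;

      benchmarkPause();
      enum path = "benchmark_read.tmp"; // placeholder for a mkstemp name
      write(path, "hello, world!\n");   // create input outside measurement
      auto f = File(path, "r");
      benchmarkResume();

      foreach (line; f.byLine())
      {
          // just consume the data
      }

      benchmarkPause();
      f.close();
      remove(path);                     // cleanup is not measured either
      benchmarkResume();
  }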

Jens

