Review of Andrei's std.benchmark

Mon Oct 1 18:09:52 PDT 2012

Jens Mueller wrote:
> Hi,
> 
> it's my pleasure to announce the beginning of the formal review of Andrei's
> std.benchmark. The review will start today and end in two weeks, on 1st
> of October. The review is followed by a week of voting which ends on 8th
> of October.
> 
> Quoting Andrei from his request for formal review:
> "I reworked the benchmarking framework for backward compatibility,
> flexibility, and convenience.
> 
> There are a few enhancement possibilities (such as tracking
> system/user time separately etc), but there is value in keeping things
> simple and convenient. Right now it really takes only one line of code
> and observing a simple naming convention to hook a module into the
> benchmarking framework."
> 
> Code: https://github.com/D-Programming-Language/phobos/pull/794
> Docs: http://dlang.org/phobos-prerelease/std_benchmark.html
> 
> If std.benchmark is accepted it will likely lead to a deprecation of
> std.datetime's benchmark facilities.
> 
> The code is provided as a pull request and is being (as usual) integrated
> by the auto tester for Mac OS X, FreeBSD, Linux and Windows (see
> http://d.puremagic.com/test-results/pull-history.ghtml?repoid=3&pullid=794).
> 
> In your comments you can/should address the
> * design
> * implementation
> * documentation
> * usefulness
> of the library.
> 
> Provide information regarding the depth (ranging from very brief to
> in-depth) of your review and conclude explicitly whether std.benchmark
> should or shouldn't be included in Phobos.
> 
> Post all feedback to this thread. Constructive feedback is very much
> appreciated.
> 
> To conclude in more Andrei-like words: Happy destruction!
> 
> Jens

The review of std.benchmark is over.
I'd like to give some conclusions regarding the review of the proposed
benchmarking module hoping to reach some consensus on how to proceed.

First of all I'd like to thank all the reviewers for taking part in this
review.

Let me first summarize the main point that was intensely discussed
in the review.

* benchmarking vs. profiling
  std.benchmark, as the review clarified, is primarily intended for
  micro benchmarking.
  Reviewers questioned this arguably limited scope. In particular, they
  asked for additional reductions (max, avg (with std deviation)).
  This whole initial confusion should be addressed by improving the
  documentation and making sure a broader set of use cases can be
  supported without breaking code.

  It is not entirely settled whether this broader scope of usage will be
  addressed by Andrei's proposal. Maybe the community can address this
  issue by providing additional functionality at a later point in time.
  But it needs to be made sure that those changes won't break code written
  against the first version of std.benchmark.

  The issues related to this are
  - non-deterministic/randomized functions (currently not supported)
  - benchmarking real-time software
  - benchmarking with warm vs. cold caches
  - measuring by using hardware counters

  These seem to be of particular interest to the D community.
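
To make the requested reductions concrete, here is a minimal sketch of
how min, max, avg and standard deviation could be computed from per-run
timings. It is not the API of the reviewed pull request: it uses
StopWatch from today's std.datetime.stopwatch, and the names Reductions,
summarize and timeRuns are hypothetical helpers made up for illustration.

```d
import std.algorithm : map, maxElement, minElement, sum;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.math : sqrt;
import std.stdio : writefln;

// Hypothetical result type, not part of std.benchmark.
struct Reductions
{
    double minimum, maximum, average, stdDev;
}

// Hypothetical helper: reduce a non-empty set of samples (any unit).
// stdDev is the population standard deviation.
Reductions summarize(const(double)[] samples)
{
    assert(samples.length > 0, "need at least one sample");
    immutable n = samples.length;
    immutable avg = samples.sum / n;
    immutable variance = samples.map!(s => (s - avg) ^^ 2).sum / n;
    return Reductions(samples.minElement, samples.maxElement,
                      avg, sqrt(variance));
}

// Hypothetical helper: time `fun` `runs` times, one sample per run (nsecs).
double[] timeRuns(void delegate() fun, size_t runs)
{
    auto samples = new double[](runs);
    foreach (ref s; samples)
    {
        auto sw = StopWatch(AutoStart.yes);
        fun();
        sw.stop();
        s = sw.peek.total!"nsecs";
    }
    return samples;
}

void main()
{
    auto data = new int[](10_000);
    auto r = summarize(timeRuns({ data[] += 1; }, 20));
    writefln("min=%s max=%s avg=%s stddev=%s (nsecs)",
             r.minimum, r.maximum, r.average, r.stdDev);
}
```

Keeping the raw samples separate from the reduction is one way later
reductions could be added without breaking code that only reads the
minimum.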

During the review other issues were identified which the author will
look into.
* make number of function runs configurable
* individual registration of benchmark functions in addition to naming convention
* auto detect when min is stable with some degree of statistical certainty
* specify a problem size
* make scheduleForBenchmarking a template mixin
* benchmark should return the BenchmarkResult array
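
As a rough illustration of the "stable min" and "configurable number of
runs" items, the sketch below stops once the minimum has not improved
for a given number of consecutive runs. This is a simple heuristic, not
a test with real statistical certainty, and it is not taken from the
pull request; stableMin, patience and maxRuns are hypothetical names.

```d
import core.time : Duration;
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writeln;

// Hypothetical helper: run `fun` until the minimum time has not improved
// for `patience` consecutive runs, or until `maxRuns` runs were made.
// Returns Duration.max if maxRuns is 0.
Duration stableMin(void delegate() fun,
                   size_t patience = 50, size_t maxRuns = 10_000)
{
    auto best = Duration.max;
    size_t sinceImprovement = 0;
    foreach (run; 0 .. maxRuns)
    {
        auto sw = StopWatch(AutoStart.yes);
        fun();
        immutable elapsed = sw.peek;
        if (elapsed < best)
        {
            best = elapsed;          // new minimum, restart the count
            sinceImprovement = 0;
        }
        else if (++sinceImprovement >= patience)
        {
            break;                   // minimum considered stable
        }
    }
    return best;
}

void main()
{
    auto data = new int[](10_000);
    writeln("stable min: ", stableMin({ data[] += 1; }));
}
```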

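A major issue is that std.benchmark won't work if two cyclically
dependent modules are benchmarked because it uses module/static
constructors, and the D runtime refuses to start a program in which
modules with static constructors import each other. Currently nobody
has an idea how to solve this issue.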
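
The problem can be reproduced without std.benchmark at all. In the
sketch below, a plain static this() stands in for whatever constructor
the benchmark registration generates (an assumption for illustration);
the module names a and b are made up. The two files compile and link
(e.g. dmd a.d b.d), but the runtime aborts at startup, before main
runs, reporting a cyclic dependency between the modules' constructors.

```d
// ---- a.d ----
module a;

import b;   // a imports b ...

static this()
{
    // stand-in for registering a's benchmark functions
}

void main() {}

// ---- b.d ----
module b;

import a;   // ... and b imports a, closing the cycle

static this()
{
    // stand-in for registering b's benchmark functions
}
```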

I hope this summary captures the most important issues that were
brought up. How to proceed seems difficult at this point. I believe the
issue with cyclically dependent modules has to be solved and we should
reach consensus regarding the intended use cases that should be
addressed by std.benchmark even in the long run.
For the time being I suppose there is no use in proceeding with voting.
Any comments?

Jens