Review of Andrei's std.benchmark

Fri Sep 21 14:23:44 PDT 2012

Andrei Alexandrescu wrote:
> On 9/21/12 5:39 AM, Jacob Carlborg wrote:
> >On 2012-09-21 06:23, Andrei Alexandrescu wrote:
> >
> >>For a very simple reason: unless the algorithm under benchmark is very
> >>long-running, max is completely useless, and it ruins average as well.
> >
> >I may have completely misunderstood this, but aren't we talking about
> >what to include in the output of the benchmark? In that case, if you
> >don't like max and average, just don't look at them.
> 
> I disagree. I won't include something in my design just so people
> don't look at it most of the time. Max and average are most of the
> time an awful thing to include, and will throw off people with
> bizarre results.
> 
> If it's there, it's worth looking at. Note how all columns are
> directly comparable (I might add, unlike other approaches to
> benchmarking).
> 
> >>For virtually all benchmarks I've run, the distribution of timings is a
> >>half-Gaussian very concentrated around the minimum. Say you have a
> >>minimum of e.g. 73 us. Then there would be a lot of results close to
> >>that; the mode of the distribution would be very close, e.g. 75 us, and
> >>the more measurements you take, the closer the mode is to the minimum.
> >>Then you have a few timings up to e.g. 90 us. And finally you will
> >>inevitably have a few outliers at some milliseconds. Those are orders of
> >>magnitude larger than anything of interest and are caused by system
> >>interrupts that happened to fall in the middle of the measurement.
> >>
> >>Taking those into consideration and computing the average with those
> >>outliers simply brings useless noise into the measurement process.
> >
> >After your reply to one of Manu's posts, I think I misunderstood the
> >std.benchmark module. I was thinking more of profiling. But aren't these
> >quite similar tasks? Couldn't std.benchmark work for both?
> 
> This is an interesting idea. It would delay release quite a bit
> because I'd need to design and implement things like performance
> counters and such.
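
A small sketch may make the quoted point about outliers concrete. It is
written in Python rather than D purely for brevity, and the timings are
invented to match the shape described above (a cluster just above a 73 us
minimum, a short tail up to 90 us, and two outliers in the millisecond
range). The names timings_us and summarize are hypothetical and are not
part of std.benchmark.

```python
from statistics import mean, median

# Hypothetical timings in microseconds (100 samples): most sit just above
# the minimum, a few reach 90 us, and two are interrupt-sized outliers.
timings_us = (
    [73] * 10 + [74] * 40 + [75] * 30 + [78] * 10 + [85] * 5 + [90] * 3
    + [3000, 5000]
)


def summarize(samples):
    """Return the minimum, median, mean and maximum of the samples."""
    return {
        "min": min(samples),
        "median": median(samples),
        "mean": mean(samples),
        "max": max(samples),
    }


# Statistics over all samples, outliers included.
stats = summarize(timings_us)

# The same statistics with the millisecond outliers dropped.
clean = summarize([t for t in timings_us if t < 1000])

for name, value in stats.items():
    print(f"{name:>6}: {value} us (without outliers: {clean[name]} us)")
```

With these made-up numbers, two outliers out of 100 samples are enough to
move the mean from roughly 76 us to roughly 154 us, while the minimum stays
at 73 us either way. That is only an illustration of the argument, not a
measurement.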

You mean like extending StopWatch and allowing the user to provide the
measuring code, e.g. counting the number of instructions? That would be
very useful. Is it possible to make sure that these changes can be
introduced later without breaking the API?
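
To show what I have in mind, here is a rough sketch, in Python rather than
D only to keep it short. It assumes the measuring code is simply a callable
that returns a monotonically increasing number, and that it defaults to a
wall clock so existing callers would not need to change. The names
benchmark, measure, and FakeInstructionCounter are made up for this
example and are not the std.benchmark or StopWatch API.

```python
import time


def benchmark(func, runs=10, measure=time.perf_counter_ns):
    """Run func `runs` times and return the smallest delta reported by measure.

    `measure` is any zero-argument callable returning a non-decreasing
    number. It defaults to a nanosecond wall clock.
    """
    best = None
    for _ in range(runs):
        start = measure()
        func()
        delta = measure() - start
        if best is None or delta < best:
            best = delta
    return best


class FakeInstructionCounter:
    """Hypothetical stand-in for a hardware performance counter."""

    def __init__(self):
        self.count = 0

    def retire(self, n):
        # Pretend that n instructions have been executed.
        self.count += n

    def __call__(self):
        return self.count


counter = FakeInstructionCounter()


def workload():
    total = 0
    for i in range(100):
        total += i
        counter.retire(1)  # one pretend instruction per iteration
    return total


wall_ns = benchmark(workload)                       # default: wall-clock time
instructions = benchmark(workload, measure=counter)  # user-provided measure
print(wall_ns, instructions)
```

Because the measuring callable has a default, a call like
benchmark(workload) keeps working unchanged once the extra parameter is
added, which is the kind of compatibility I am asking about. Whether the
actual design can be extended this way is of course an open question.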

Jens