std.benchmark ready for review. Manager sought after

Andrei Alexandrescu SeeWebsiteForEmail at erdani.org
Sun Apr 8 13:13:07 PDT 2012


On 4/8/12 3:03 PM, Manfred Nowak wrote:
> Andrei Alexandrescu wrote:
>
>> Clearly there is noise during normal use as well, but
>> incorporating it in benchmarks as a matter of course reduces the
>> usefulness of benchmarks
>
> On the contrary:
> 1) The "noise during normal use" has to be measured in order to detect
> the sensibility of the benchmarked program to that noise.

That sounds quite tenuous to me. How do you measure it, and what 
conclusions do you draw other than that there's more or less other stuff 
going on on the machine, and that the machine itself has complex 
interactions?

As far as I can tell, a time measurement result is:

T = A + Q + N

where:

A > 0 is the actual benchmark time

Q > 0 is quantization noise (uniform distribution)

N > 0 is various other noise (interrupts, task switching, networking, the 
CPU dynamically changing frequency, etc.). Many people jump to a Gaussian 
as an approximation, but my tests suggest it's hardly that, because there 
are a lot of jerky outliers.

How do we estimate A given T?
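For concreteness, here is a minimal sketch of one common estimator (this 
is not the std.benchmark API; the name estimateA, the trial count, and 
the use of Phobos' std.datetime.stopwatch are illustrative assumptions): 
since Q and N can only add to T, repeating the measurement many times and 
keeping the minimum converges on A from above.

import std.algorithm.comparison : min;
import std.datetime.stopwatch : AutoStart, StopWatch;

// Returns the smallest observed run time of `fun`, in hundreds of
// nanoseconds. Because Q and N can only add to T, the minimum over
// many trials is the closest observable estimate of A.
long estimateA(void delegate() fun, size_t trials = 1_000)
{
    long best = long.max;
    foreach (i; 0 .. trials)
    {
        auto sw = StopWatch(AutoStart.yes);
        fun();
        sw.stop();
        best = min(best, sw.peek.total!"hnsecs");
    }
    return best;
}

void main()
{
    import std.stdio : writeln;
    auto a = new int[1024];
    // Estimate A for a simple loop over the array.
    writeln(estimateA(delegate { foreach (ref x; a) x += 1; }));
}

The minimum is still an upper bound on A, of course; it just strips out 
as much of Q and N as the run count allows.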

> 2) The noise the benchmarked program produces has to be measured too,
> because the running benchmarked program probably increases the noise
> for all other running programs.

How would one measure that? Also, that noise does not need to be measured 
so much as eliminated to the extent possible, because the benchmark app's 
noise is a poor model of the application-induced noise.

> In addition: the noise produced by a machine under heavy load might
> bring the performance of the benchmarked program down to zero.

Of course. That's why the documentation emphasizes the necessity of 
baselines. A measurement without baselines is irrelevant.
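
To make the baseline idea concrete, a sketch along the same lines (again 
with made-up names such as timeN, using std.datetime.stopwatch rather 
than std.benchmark itself): time the body of interest and an empty body 
under the same harness and report the difference, so that fixed harness 
overhead cancels out.

import std.datetime.stopwatch : AutoStart, StopWatch;
import std.stdio : writeln;

// Times `n` back-to-back calls of `fun` and returns the total in
// hundreds of nanoseconds.
long timeN(void delegate() fun, size_t n)
{
    auto sw = StopWatch(AutoStart.yes);
    foreach (i; 0 .. n)
        fun();
    sw.stop();
    return sw.peek.total!"hnsecs";
}

void main()
{
    enum n = 100_000;
    auto a = new int[64];
    // Baseline: same harness, empty body, i.e. pure overhead plus noise.
    immutable baseline = timeN(delegate { }, n);
    // The measurement of interest, reported net of the baseline.
    immutable work = timeN(delegate { foreach (ref x; a) x += 1; }, n);
    writeln("baseline: ", baseline, " hnsecs");
    writeln("work:     ", work, " hnsecs");
    writeln("net:      ", work - baseline, " hnsecs");
}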


Andrei

