std.benchmark ready for review. Manager sought after

Mon Apr 9 07:25:27 PDT 2012

Andrei Alexandrescu wrote:

> all noise is additive (there's no noise that may make a benchmark
> appear to run faster)

This is in doubt, because you yourself wrote "the machine itself has
complex interactions". These complex interactions might also lower the
time needed for an operation of the benchmarked program.

Examples that come to mind:
a) The needed data is already in a (faster) cache because it belongs to
a memory block from which some program outside the benchmarked set
needed other data, and that block has not been replaced yet.
b) The needed data is stored on an HDD whose I/O scheduler uses the
elevator algorithm and, by pure chance, serves the request instantly
because the position of the needed data lies between two positions
accessed by programs outside the benchmarked set.

An HDD in particular, if used, will be responsible for a lot of the
noise you define as "quantization noise (uniform distribution)", even
if the head stays on the same cylinder. Not observing this noise would
only mean that the data is cached, and interpreting the only true read
from the HDD as a jerky outlier seems quite wrong.

>> 1) The "noise during normal use" has to be measured in order to
>> detect the sensibility of the benchmarked program to that noise.
> How do you measure it, and what 
> conclusions do you draw other than there's a more or less other
> stuff going on on the machine, and the machine itself has complex
> interactions? 
> 
> Far as I can tell a time measurement result is:
> 
> T = A + Q + N

For example, by running more than one instance of the benchmarked
program in parallel and using the statistics gathered that way to
split T into its additive components A, Q and N.
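
A minimal sketch of how such statistics could be gathered, written in
Python for brevity. Everything named here (workload, time_runs,
run_instances, summarize) is hypothetical and not part of std.benchmark.
Threads stand in for separate program instances, which is only an
approximation because CPython threads contend on the interpreter lock.
The "min" value is an estimate of A only under the disputed assumption
that all noise is additive.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def workload():
    # Hypothetical stand-in for the benchmarked operation.
    return sum(i * i for i in range(20_000))


def time_runs(fn, repeats):
    """Return the wall-clock durations (seconds) of `repeats` calls to fn."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return samples


def run_instances(fn, instances, repeats):
    """Time fn in `instances` concurrent threads and pool all samples."""
    with ThreadPoolExecutor(max_workers=instances) as pool:
        futures = [pool.submit(time_runs, fn, repeats) for _ in range(instances)]
        return [t for f in futures for t in f.result()]


def summarize(samples):
    """Summary statistics; min approximates A only if noise is additive."""
    return {
        "min": min(samples),
        "median": statistics.median(samples),
        "spread": max(samples) - min(samples),
    }


if __name__ == "__main__":
    # Compare the timing distribution at several levels of concurrency.
    for n in (1, 2, 4):
        print(n, summarize(run_instances(workload, n, repeats=20)))
```

How median and spread change with the number of instances gives a
rough picture of how sensitive the program is to load on the machine.
A real separation of T into A, Q and N would need a statistical model
on top of these raw samples.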

>> 2) The noise the benchmarked program produces has to be measured
>> too, because the running benchmarked program probably increases
>> the noise for all other running programs.
> 
> How to measure that?

Similar to the above note.

> Also, that noise does not need to be measured
> as much as eliminated to the extent possible.

I wouldn't define two programs to be equivalent based on the time until
completion only. That time might be identical for both programs, but if
only one of them increases the response time of the machine to an
unacceptable level, I would choose the other.
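
One way this effect on the rest of the machine could be measured is
sketched below in Python: time a small probe task while the machine is
otherwise idle, then time it again while the candidate program runs
concurrently, and compare. All names (probe, candidate,
probe_latencies, slowdown, measure_interference) are hypothetical, and
a background thread is only a rough stand-in for a separate process.

```python
import statistics
import threading
import time


def probe():
    # Hypothetical stand-in for another program whose response time matters.
    return sum(range(5_000))


def candidate():
    # Hypothetical stand-in for the benchmarked program.
    return sorted(range(50_000, 0, -1))


def probe_latencies(repeats):
    """Return the wall-clock durations (seconds) of `repeats` probe calls."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        probe()
        samples.append(time.perf_counter() - start)
    return samples


def slowdown(idle, loaded):
    """Ratio of median probe latency under load to median latency when idle."""
    return statistics.median(loaded) / statistics.median(idle)


def measure_interference(load, repeats=50):
    """Measure how much running `load` in the background slows the probe."""
    idle = probe_latencies(repeats)
    stop = threading.Event()

    def keep_running():
        # Run the load repeatedly until the measurement is finished.
        while not stop.is_set():
            load()

    worker = threading.Thread(target=keep_running)
    worker.start()
    try:
        loaded = probe_latencies(repeats)
    finally:
        stop.set()
        worker.join()
    return slowdown(idle, loaded)


if __name__ == "__main__":
    print("probe slowdown under load:", measure_interference(candidate))
```

Two programs with the same completion time could then still be told
apart by the slowdown they impose on the probe.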

-manfred