avgtime - Small D util for your everyday benchmarking needs

Andrei Alexandrescu SeeWebsiteForEmail at erdani.org
Fri Mar 23 08:33:18 PDT 2012


On 3/23/12 3:02 AM, Juan Manuel Cabo wrote:
> On Friday, 23 March 2012 at 05:16:20 UTC, Andrei Alexandrescu wrote:
> [.....]
>>> (man, the gaussian curve is everywhere, it never ceases to
>>> perplex me).
>>
>> I'm actually surprised. I'm working on benchmarking lately and the
>> distributions I get are very concentrated around the minimum.
>>
>> Andrei
>
>
> Well, the shape of the curve depends a lot on
> how the random noise gets inside the measurement.
[snip]

Hmm, well the way I see it, the observed measurements have the following 
composition:

X = T + Q + N

where T > 0 (a constant) is the "real" time taken by the processing, Q > 
0 is the quantization noise caused by the limited resolution of the 
clock (can be considered 0 if the resolution is much smaller than the 
actual time), and N is noise caused by a variety of factors (other 
processes, throttling, interrupts, networking, memory hierarchy effects, 
and many more). The challenge is estimating T given a bunch of X samples.

N can probably be approximated by a Gaussian, although for short timings 
I noticed it's more like bursts that just cause outliers. But note that 
N is always positive (therefore not 100% Gaussian), i.e. there's no 
noise that can make the code seem artificially faster. It's all additive.
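
To make this concrete, here's a little D sketch of that composition 
(the constants and the shape of the noise are made up, purely for 
illustration):

    import std.math : ceil;
    import std.random : Random, uniform;
    import std.stdio : writeln;

    enum double T = 10.0;    // hypothetical "real" time, say in milliseconds
    enum double tick = 0.25; // hypothetical clock resolution

    // One observation X = T + Q + N.
    double measure(ref Random rng)
    {
        // N: always positive -- small jitter most of the time, rare bursts
        double n = uniform(0.0, 0.5, rng);
        if (uniform(0.0, 1.0, rng) < 0.05)
            n += uniform(5.0, 20.0, rng);
        // Q: the clock only reports whole ticks, so round up to the next one
        return ceil((T + n) / tick) * tick;
    }

    void main()
    {
        auto rng = Random(42);
        foreach (i; 0 .. 10)
            writeln(measure(rng));
    }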

Taking the mode of the distribution estimates T + mode(N), which is 
informative because, after all, there's no way to eliminate noise. 
However, if the focus is on improving T, we want an estimate as close to 
T as possible. In the limit, taking the minimum over infinitely many 
measurements of X would yield T.
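
A rough sketch of the two estimators, assuming the X samples have 
already been collected into an array (the bin width below is an 
arbitrary choice):

    import std.algorithm : minElement;
    import std.math : floor;
    import std.stdio : writeln;

    // Estimate of T itself: the minimum, since all the noise is additive.
    double estimateMin(double[] xs)
    {
        return xs.minElement;
    }

    // Estimate of T + mode(N): a crude histogram-based mode.
    double estimateMode(double[] xs, double binWidth)
    {
        size_t[long] hist;
        foreach (x; xs)
        {
            immutable bin = cast(long) floor(x / binWidth);
            if (auto p = bin in hist) ++*p;
            else hist[bin] = 1;
        }
        long bestBin;
        size_t bestCount;
        foreach (bin, count; hist)
            if (count > bestCount) { bestBin = bin; bestCount = count; }
        return (bestBin + 0.5) * binWidth;
    }

    void main()
    {
        // Pretend measurements; a real run would collect many more.
        double[] xs = [10.3, 10.1, 10.2, 15.7, 10.1, 10.4, 10.2];
        writeln("min  estimate: ", estimateMin(xs));
        writeln("mode estimate: ", estimateMode(xs, 0.25));
    }

With enough samples the minimum keeps creeping down toward T, whereas 
the mode settles around T + mode(N).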


Andrei

