std.benchmark is in reviewable state

Andrei Alexandrescu SeeWebsiteForEmail at erdani.org
Mon Oct 3 08:44:51 PDT 2011


On 10/3/11 8:59 AM, Jens Mueller wrote:
[snip]

Thanks for this rich feedback. I'll act on most or all points. A few 
notes follow.

> 1. I'm not that convinced that prepending benchmark_ to a function is
> that nice even though it's simple. But following this idea I could
> remove all my unittest blocks and add functions starting with test_ and
> write a module std.test in same fashion. If this approach is considered
> the way to go we can remove unittest altogether and solve testing
> entirely in a library. After all it looks less elegant to me but I admit
> that I see no other way and it is the easiest solution after all. I just
> want to stress that when seeing the code I questioned myself whether
> having unittest in the language is nice to have.

std.benchmark relies on reflection features that only became available 
recently. A language-level unittest would be harder to justify if it 
were proposed today, but at the time it was introduced it was a good 
feature.
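
To illustrate what those reflection features buy us: a rough, 
self-contained sketch of the discovery step could look like the below. 
(This is not std.benchmark's code; it just walks the current module's 
members and times anything whose name starts with benchmark_, using 
core.time's monotonic clock.)

import core.time : MonoTime;
import std.stdio : writefln;

// Example benchmark functions, found purely by their name prefix.
void benchmark_appendBuiltin()
{
    int[] a;
    foreach (i; 0 .. 10_000) a ~= i;
}

void benchmark_appendConcat()
{
    int[] a;
    foreach (i; 0 .. 10_000) a = a ~ i;
}

void main()
{
    // Walk the current module's members and time everything named benchmark_*.
    foreach (name; __traits(allMembers, mixin(__MODULE__)))
    {
        static if (name.length > "benchmark_".length
                && name[0 .. "benchmark_".length] == "benchmark_")
        {
            immutable before = MonoTime.currTime;
            __traits(getMember, mixin(__MODULE__), name)();
            writefln("%-28s %s", name, MonoTime.currTime - before);
        }
    }
}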

> 2. When benchmarking code you often investigate the algorithm's
> properties regarding its scalability.

Yah, that definitely needs to be looked at. I think I need to add some 
support beyond mere documentation.
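
In the meantime, a low-tech way to eyeball scalability is to time the 
same operation at several input sizes and compare the growth, along 
these lines (the sizes and the operation are arbitrary):

import core.time : MonoTime;
import std.stdio : writefln;

void main()
{
    // Time the same operation at growing sizes to see how it scales.
    foreach (n; [1_000, 10_000, 100_000, 1_000_000])
    {
        immutable before = MonoTime.currTime;
        int[] a;
        foreach (i; 0 .. n)
            a ~= i;   // operation under test: built-in append
        immutable elapsed = MonoTime.currTime - before;
        writefln("n = %9s: %12s total, %s per element", n, elapsed, elapsed / n);
    }
}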

> 3.
> (r[i] / 10_000_000.).to!("nsecs", int)
> should be written as
> (r[i] / 10_000_000).nsecs
>
> Same for msecs and similar.

Noted.
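
For reference, the suffix style comes from core.time's helpers (nsecs, 
usecs, msecs, seconds, ...), which build a Duration straight from an 
integer; whether it applies verbatim depends on what r[i] is:

import core.time : Duration, msecs, nsecs;
import std.stdio : writeln;

void main()
{
    // core.time's unit helpers allow the suffix-style notation.
    Duration a = 83.nsecs;
    Duration b = 12.msecs;
    writeln(a, "  ", b, "  ", a + b);
}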

> 4. Test results should be exposed to the caller.
> Usually, it is enough to write the results to the console. But you may
> want to post process the results. Being it either graphically or to
> output XML for whatever reason.
> I think it would be beneficial to add this flexibility and separate the
> benchmarking logic from its generating the output.
> It might be useful to add the used CPU using std.cpuid to the output.

We should decouple collection from formatting. However, that complicates 
matters because we'd need to expose the structure holding the benchmark 
results. I'll think about it.
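
To make the trade-off concrete, one possible shape for the split (the 
names below are made up for illustration, not a committed API) is a 
plain result record returned by the collection pass, with console 
output as just one consumer:

import core.time : Duration, msecs;
import std.stdio : writefln;

// Hypothetical record the collection pass would hand back instead of printing.
struct BenchmarkResult
{
    string name;        // benchmarked function, e.g. "fileWrite"
    size_t iterations;  // number of calls measured
    Duration total;     // time over all iterations
}

// One consumer: the default console formatter. Others could emit XML, CSV,
// or attach CPU information from std.cpuid.
void printConsole(const BenchmarkResult[] results)
{
    foreach (r; results)
        writefln("%-24s %10s calls  %12s  %s/call",
                r.name, r.iterations, r.total, r.total / r.iterations);
}

void main()
{
    // Made-up figures, purely to exercise the formatter.
    printConsole([BenchmarkResult("appendBuiltin", 1_000_000, 83.msecs)]);
}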

> 5. In the first example I completely missed that it is not "relative
> calls". I thought this was one term even though I didn't know what it
> should mean. Later I figured that there is an empty column called
> relative.

There are two spaces there; I should add three.

> 6. Wrong numbers in documentation
> The examples in the documentation have wrong numbers. In the end one
> should pay special attention that the documentation is consistent. I
> think one cannot automate this.

Not sure how this can be addressed at all (or maybe I don't understand). 
The examples show real numbers collected from real runs on my laptop. I 
don't think one may expect to reproduce them with reasonable accuracy 
because they depend on a lot of imponderables.

> There are also rounding errors. That means when I try to compute these
> numbers I get different in the first digit after the decimal point. I
> wonder why this is. Even though the measurement may be noisy the
> computed numbers should be more precise I think.

Where are the calculations mistaken? Indeed I see e.g. that the built-in 
append has 83ns/call and 11.96M calls/s. Inverting the first with a 
"calculator" program yields 12.04 Mcalls/s, and inverting the second 
yields 83.61ns/call. I'll see what can be done.
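
For what it's worth, inverting the two published figures directly:

import std.stdio : writefln;

void main()
{
    // Invert the two reported figures; they agree to within rounding.
    enum nsPerCall = 83.0;        // reported ns/call
    enum mCallsPerSec = 11.96;    // reported million calls/s
    writefln("1 / %s ns per call   = %s M calls/s", nsPerCall, 1e3 / nsPerCall);
    writefln("1 / %s M calls per s = %s ns/call", mCallsPerSec, 1e3 / mCallsPerSec);
}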

> 7. Changing units in the output.
> One benchmark report may use different units in its output. The one in
> the example uses microseconds and nanoseconds. This makes comparison
> more difficult and should be avoided (at least for groups). It is
> error-prone when analyzing the numbers.

Problem is space and wildly varying numbers. I initially had exponential 
notation for everything, but it was super ugly and actually more 
difficult to navigate. That's why I added the compact notation. I think 
we'll need to put up with it.
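
If the compact notation stays, a possible middle ground for groups 
would be to pick one unit based on the group's largest value and format 
every row with it; a rough sketch, not what std.benchmark currently 
does:

import core.time : Duration, msecs, usecs, nsecs, seconds;
import std.algorithm : maxElement;
import std.stdio : writefln;

// Choose a single unit for a whole group from its largest entry, so every
// row in a report uses the same unit and stays directly comparable.
void printGroup(Duration[] group)
{
    immutable largest = group.maxElement;
    string unit;
    long nsPerUnit;    // nanoseconds per chosen unit
    if      (largest >= 1.seconds) { unit = "s";  nsPerUnit = 1_000_000_000; }
    else if (largest >= 1.msecs)   { unit = "ms"; nsPerUnit = 1_000_000; }
    else if (largest >= 1.usecs)   { unit = "us"; nsPerUnit = 1_000; }
    else                           { unit = "ns"; nsPerUnit = 1; }

    foreach (d; group)
        writefln("%12.3f %s", cast(double) d.total!"nsecs" / nsPerUnit, unit);
}

void main()
{
    printGroup([740.nsecs, 83.usecs, 2.msecs]);   // made-up timings
}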

> 8. I prefer writing std.benchmark.pause() over benchmarkPause() and
> std.benchmark.resume() over benchmarkResume().

OK.

> 10. The file reading/writing benchmark should be improved. Since the
> file should be deleted. The file writing benchmark should be more like:
> void benchmark_fileWrite()
> {
>      benchmarkPause();
>      auto f = File.tmpfile();
>      benchmarkResume();
>
>      f.writeln("hello, world!");
> }
>
> But this also measures the deletion of the file.

If you don't want to measure std.file.write but only the time spent in 
writeln, this should work; it measures just the time spent writing to 
the file:

void benchmark_fileWrite()
{
    benchmarkPause();            // exclude setup: creating the temporary file
    auto f = File.tmpfile();
    benchmarkResume();
    f.writeln("hello, world!");  // only this part is measured
    benchmarkPause();            // exclude teardown: the file going away at scope exit
}

I haven't yet tested the case when a function leaves the benchmark paused.

> I have no idea how to fix the benchmark for reading a file. But I
> believe it involves using mkstemp on POSIX and similar on Windows.
> mkstemp should be added to std.file in the long run.

Why is there a need to fix it? The benchmark measures the performance of 
std.file.read from soup to nuts.

In related news, I got word from a colleague that I should also use 
times() (http://linux.die.net/man/2/times) to distinguish time spent in 
user space vs. system calls.  This is because some functions must issue 
certain system calls in any case and the user is interested in how much 
they can improve the situation by optimizing their end of the deal. This 
would apply e.g. to file I/O where the overhead of formatting etc. would 
become easier to distinguish.
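
For reference, times(2) reports user and system CPU time separately, 
in clock ticks. A bare-bones, POSIX-only sketch follows; the tms 
binding is declared by hand rather than assuming a particular druntime 
module exposes it:

import core.stdc.time : clock_t;
import std.file : remove, write;
import std.stdio : writefln;

// Hand-rolled binding for POSIX times(2); see the man page linked above.
struct tms
{
    clock_t tms_utime;   // user CPU time
    clock_t tms_stime;   // system CPU time (time spent in system calls)
    clock_t tms_cutime;  // user CPU time of terminated children
    clock_t tms_cstime;  // system CPU time of terminated children
}

extern (C) clock_t times(tms* buf);

void main()
{
    tms before, after;
    times(&before);

    // Work under measurement: some user-space work plus real system calls.
    foreach (i; 0 .. 1_000)
        write("benchmark.tmp", "hello, world!");
    remove("benchmark.tmp");

    times(&after);
    // Figures are in clock ticks; divide by sysconf(_SC_CLK_TCK) for seconds.
    writefln("user: %s ticks, system: %s ticks",
            after.tms_utime - before.tms_utime,
            after.tms_stime - before.tms_stime);
}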


Andrei

