std.benchmark is in reviewable state

Jens Mueller jens.k.mueller at gmx.de
Mon Oct 3 09:06:12 PDT 2011


Andrei Alexandrescu wrote:
> On 10/3/11 8:59 AM, Jens Mueller wrote:
> [snip]
> 
> Thanks for this rich feedback. I'll act on most or all points. A few
> notes follow.
> 
> >1. I'm not that convinced that prepending benchmark_ to a function is
> >that nice, even though it's simple. But following this idea I could
> >remove all my unittest blocks, add functions starting with test_, and
> >write a module std.test in the same fashion. If this approach is
> >considered the way to go, we could remove unittest altogether and
> >solve testing entirely in a library. It looks less elegant to me, but
> >I admit that I see no other way and it is the easiest solution after
> >all. I just want to stress that when I saw the code I questioned
> >whether having unittest in the language is really worth having.
> 
> std.benchmark uses relatively new reflection features that weren't
> available until recently. Adding unittest now would be difficult to
> justify, but at the time unittest was a good feature.

I see. This is also my feeling.
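
Just to check my own understanding of the convention, here is a rough,
self-contained sketch of how I picture the prefix-based discovery. The
struct, the timing loop and all names below are mine, not what
std.benchmark actually does internally:

import std.stdio : writefln;
import core.time : MonoTime;

// Illustrative container only; std.benchmark scans modules, but a
// struct keeps this sketch self-contained.
struct Benchmarks
{
    // The benchmark_ prefix is what marks a function for measurement.
    static void benchmark_append()
    {
        int[] a;
        foreach (i; 0 .. 1_000)
            a ~= i;
    }
}

void main()
{
    // Compile-time discovery of every member whose name starts with
    // "benchmark_"; the real module adds repetitions, calibration, etc.
    foreach (name; __traits(allMembers, Benchmarks))
    {
        static if (name.length >= 10 && name[0 .. 10] == "benchmark_")
        {
            auto start = MonoTime.currTime;
            __traits(getMember, Benchmarks, name)();
            writefln("%s: %s", name, MonoTime.currTime - start);
        }
    }
}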

> >6. Wrong numbers in documentation
> >The examples in the documentation have wrong numbers. In the end one
> >should pay special attention that the documentation is consistent. I
> >think one cannot automate this.
> 
> Not sure how this can be addressed at all (or maybe I don't
> understand). The examples show real numbers collected from real runs
> on my laptop. I don't think one may expect to reproduce them with
> reasonable accuracy because they depend on a lot of imponderables.

I just mean that the relative speed-ups printed in the table do not
match the ones given in the surrounding text.

> >There are also rounding errors. That means when I try to compute these
> >numbers I get values that differ in the first digit after the decimal
> >point. I wonder why this is. Even though the measurement may be noisy,
> >the computed numbers should be more precise, I think.
> 
> Where are the calculations mistaken? Indeed I see e.g. that append
> built-in has 83ns/call and 11.96M calls/s. Inverting the first with
> a "calculator" program yields 12.04 Mcalls/s, and inverting the
> second yields 83.61ns/call. I'll see what can be done.

Yes. This is what I meant here.
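
For reference, the two columns are just reciprocals of each other, so
the mismatch can be checked with a couple of lines (83 ns and 11.96 M
are the figures quoted above):

import std.stdio : writefln;

void main()
{
    enum nsPerCall    = 83.0;    // printed ns/call
    enum mCallsPerSec = 11.96;   // printed M calls/s

    // Inverting each printed value should roughly reproduce the other
    // column; the first-decimal differences come from rounding the
    // displayed figures.
    writefln("%.2f Mcalls/s", 1e9 / nsPerCall / 1e6);
    writefln("%.2f ns/call", 1e9 / (mCallsPerSec * 1e6));
}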

> >7. Changing units in the output.
> >One benchmark report may use different units in its output. The one in
> >the example uses microseconds and nanoseconds. This makes comparison
> >more difficult and should be avoided (at least for groups). It is
> >error-prone when analyzing the numbers.
> 
> Problem is space and wildly varying numbers. I initially had
> exponential notation for everything, but it was super ugly and
> actually more difficult to navigate. That's why I added the compact
> notation. I think we'll need to put up with it.

If formatting and collecting are separated, it is fine, I guess. It's
okay for the default output.
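
For example, if the raw per-call durations were available to the
caller, normalizing a whole group to one unit would be trivial. The
figures and the shape of the data below are purely hypothetical:

import std.stdio : writefln;
import core.time : Duration, usecs, nsecs;

void main()
{
    // Hypothetical raw results: name -> time per call.
    Duration[string] results = [
        "fileWrite" : 144.usecs,
        "append"    : 83.nsecs,
    ];

    // Print every row in nanoseconds so the group stays comparable.
    foreach (name, d; results)
        writefln("%-10s %12s ns/call", name, d.total!"nsecs");
}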

> >10. The file reading/writing benchmark should be improved, since the
> >file it creates should be deleted. The file writing benchmark should
> >be more like:
> >void benchmark_fileWrite()
> >{
> >     benchmarkPause();
> >     auto f = File.tmpfile();
> >     benchmarkResume();
> >
> >     f.writeln("hello, world!");
> >}
> >
> >But this also measures the deletion of the file.
> 
> If you don't want to measure std.file.write but instead only the
> time spent in writeln, this should work and measure only the time
> spent in writing to the file:
> 
> void benchmark_fileWrite()
> {
>      benchmarkPause();
>      auto f = File.tmpfile();
>      benchmarkResume();
>      f.writeln("hello, world!");
>      benchmarkPause();
> }
> 
> I haven't yet tested the case when a function leaves the benchmark paused.

This looks good.

> >I have no idea how to fix the benchmark for reading a file. But I
> >believe it involves using mkstemp on POSIX and similar on Windows.
> >mkstemp should be added to std.file in the long run.
> 
> Why is there a need to fix it? The benchmark measures the
> performance of std.file.read from soup to nuts.

Yes. But the benchmark should clean up after itself if possible. So it
should remove the file.
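
Something along these lines is what I had in mind. The fixed file name
is only a stand-in for a proper mkstemp-style helper, and
benchmarkPause/benchmarkResume are used as in your example above:

import std.benchmark;                    // the module under review
static import std.file;

void benchmark_fileRead()
{
    benchmarkPause();
    enum name = "benchmark.tmp";         // placeholder; mkstemp would avoid clashes
    std.file.write(name, "hello, world!");
    benchmarkResume();

    auto data = std.file.read(name);     // only this call is measured

    benchmarkPause();
    std.file.remove(name);               // clean-up outside the measured region
}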

> In related news, I got word from a colleague that I should also use
> times() (http://linux.die.net/man/2/times) to distinguish time spent
> in user space vs. system calls.  This is because some functions must
> issue certain system calls in any case and the user is interested in
> how much they can improve the situation by optimizing their end of
> the deal. This would apply e.g. to file I/O where the overhead of
> formatting etc. would become easier to distinguish.

Good hint from your colleague. The more data one can get when measuring
code, the better. You could even be interested in cache misses when
benchmarking. But CPU time is usually what it boils down to. I fear you
may lose simplicity. A rough sketch of the user/system split follows
below.
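
Something like this, POSIX only, and assuming the declarations in
core.sys.posix (times, tms, sysconf, _SC_CLK_TCK) are available:

import core.sys.posix.sys.times : times, tms;
import core.sys.posix.unistd : sysconf, _SC_CLK_TCK;
import std.stdio : writefln;

void main()
{
    immutable ticksPerSec = sysconf(_SC_CLK_TCK);

    tms before, after;
    times(&before);

    // ... the benchmarked function would run here ...
    foreach (i; 0 .. 1_000_000) {}

    times(&after);

    // User time is what optimizing one's own code can shrink; system
    // time is mostly the unavoidable cost of the syscalls themselves.
    writefln("user:   %s s",
        cast(double)(after.tms_utime - before.tms_utime) / ticksPerSec);
    writefln("system: %s s",
        cast(double)(after.tms_stime - before.tms_stime) / ticksPerSec);
}
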
I'm looking forward to your changes.

Jens

