A file reading benchmark

Robert Jacques sandford at jhu.edu
Fri Feb 17 18:57:24 PST 2012


On Fri, 17 Feb 2012 19:44:04 -0600, bearophile <bearophileHUGS at lycos.com> wrote:
> A tiny little file lines reading benchmark I've just found on Reddit:
> http://www.reddit.com/r/programming/comments/pub98/a_benchmark_for_reading_flat_files_into_memory/
>
> http://steve.80cols.com/reading_flat_files_into_memory_benchmark.html
>
> The Ruby code that generates slowly the test data:
> https://raw.github.com/lorca/flat_file_benchmark/master/gen_data.rb
> But for my timings I have used only about 40% of that file, the first 1_965_800 lines, because I have less memory.
>
> My Python-Psyco version runs in 2.46 seconds, the D version in 4.65 seconds (the D version runs in 13.20 seconds if I don't disable the GC).
>
> From many other benchmarks I've seen that reading files line by line is slow in D.

Bearophile, when comparing a deque to a classic vector, of course the deque is going to win. This has nothing to do with D, and everything to do with writing a good algorithm.
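The deque-versus-vector point is language-agnostic, so here is a minimal C++ sketch of it (an illustration, not D code from the benchmark): a deque grows in fixed-size blocks and never moves elements already stored, while a vector reallocates a contiguous buffer and copies everything when it runs out of capacity.

```cpp
#include <cstddef>
#include <deque>
#include <vector>

// std::deque grows in fixed-size blocks, so push_back never moves the
// elements already stored: a reference taken before the appends still
// points at the same object afterwards (guaranteed by the standard).
bool deque_front_stable(int n) {
    std::deque<int> d{0};
    const int* before = &d.front();
    for (int i = 0; i < n; ++i)
        d.push_back(i);
    return before == &d.front();
}

// std::vector stores elements contiguously, so whenever it outgrows its
// capacity it allocates a bigger buffer and copies everything over.
// This counts how often that happens while appending n elements.
int vector_reallocations(int n) {
    std::vector<int> v;
    std::size_t cap = v.capacity();
    int reallocs = 0;
    for (int i = 0; i < n; ++i) {
        v.push_back(i);
        if (v.capacity() != cap) {
            ++reallocs;
            cap = v.capacity();
        }
    }
    return reallocs;
}
```

With n = 100000, deque_front_stable returns true, while the vector reallocates on the order of twenty times, copying its entire contents each time.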

Also, Appender has known performance problems with large appends (Issue 5813, http://d.puremagic.com/issues/show_bug.cgi?id=5813). I'm currently folding in Vladimir's suggestions and will generate a pull request.
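Why the growth policy matters can be sketched with a generic cost model (this is an illustration of amortized append cost, not a description of Appender's actual internals): geometric capacity growth keeps total element copies linear, while exact-fit growth makes them quadratic.

```cpp
// Element copies performed while appending n items one at a time.
// With geometric (doubling) growth, each reallocation copies the
// current contents, but reallocations are rare: total copies stay O(n).
// With exact-fit growth, every append reallocates: copies are O(n^2).
// A generic cost model, not a description of Appender's internals.
long long copies_with_growth(long long n, bool doubling) {
    long long cap = 0, size = 0, copies = 0;
    for (long long i = 0; i < n; ++i) {
        if (size == cap) {                       // buffer is full: grow it
            copies += size;                      // move existing elements
            cap = doubling ? (cap ? cap * 2 : 1) // double the capacity
                           : cap + 1;            // or grow by exactly one
        }
        ++size;
    }
    return copies;
}
```

For 65,536 appends, the doubling policy copies 65,535 elements in total, while exact-fit growth copies over two billion.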


More information about the Digitalmars-d mailing list