D and i/o
Jon Degenhardt
jond at noreply.com
Sun Nov 10 19:41:52 UTC 2019
On Saturday, 9 November 2019 at 23:39:09 UTC, bioinfornatics
wrote:
> Dear,
>
> In my field we are I/O bound, so I would like our tools to be
> as fast as the file can be read.
>
> So I started a dummy benchmark that counts the number of lines.
> The result is compared to the wc -l command. The line counting
> is only a pretext to evaluate the I/O; it can be swapped for any
> other I/O processing. We therefore use the buffer as much as
> possible instead of the byLine range. Moreover, such a range
> implies that the buffer has already been read once before it is
> ready to process.
>
>
> https://github.com/bioinfornatics/test_io
>
> Ideally I would like to process a shared buffer across
> multiple cores and run a SIMD computation, but that is not yet
> done.

You might also be interested in a similar I/O performance test I
created: https://github.com/jondegenhardt/dcat-perf. This one is
based on 'cat' (copy to standard output) rather than 'wc', as I'm
interested in both input and output, but the general motivation
is similar. I specifically wanted to compare the native Phobos
facilities to those in iopipe and to some Phobos covers in
tsv-utils. Most tests are by-line based, as I'm interested in
record-oriented operations, but chunk-based copying is included.
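
For illustration, the two copy styles can be sketched with plain
Phobos calls. This is only a minimal sketch, not code taken from
dcat-perf: the function names copyByLine and copyByChunk and the
64 KiB chunk size are assumptions chosen for the example.

```d
import std.stdio;

/// Hypothetical helper: copy line by line.
/// KeepTerminator.yes keeps each newline, so the copy is byte-exact.
void copyByLine(File input, File output)
{
    foreach (line; input.byLine(KeepTerminator.yes))
        output.rawWrite(line);
}

/// Hypothetical helper: copy in fixed-size chunks.
/// The 64 KiB default is an assumption, not a tuned value.
void copyByChunk(File input, File output, size_t chunkSize = 64 * 1024)
{
    foreach (chunk; input.byChunk(chunkSize))
        output.rawWrite(chunk);
}

void main(string[] args)
{
    // Read the named file, or standard input when no file is given.
    auto input = args.length > 1 ? File(args[1], "rb") : stdin;

    // A second argument of "chunk" selects the chunk-based copy.
    if (args.length > 2 && args[2] == "chunk")
        copyByChunk(input, stdout);
    else
        copyByLine(input, stdout);
}
```

Both variants should produce identical output; only the unit of
work per loop iteration differs (one line versus one chunk).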

A general observation is that if lines are involved, it's
important to measure performance on both short and long lines.
This may even affect 'wc' when reading by chunk or from
memory-mapped files; see H. S. Teoh's observations on 'wc'
performance:
https://forum.dlang.org/post/mailman.664.1571878411.8294.digitalmars-d@puremagic.com
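
One way to try this out is a small line counter that can be run
on a short-line file and a long-line file of similar size. The
following is a minimal sketch using only Phobos; the names
countByLine and countByChunk and the 64 KiB chunk size are
assumptions for the example, and it is not the code used in any
of the benchmarks mentioned here.

```d
import std.algorithm.searching : count;
import std.stdio;

/// Hypothetical helper: count lines with byLine.
/// The loop body runs once per line, however short the line is.
size_t countByLine(File input)
{
    size_t lines = 0;
    foreach (line; input.byLine)
        ++lines;
    return lines;
}

/// Hypothetical helper: count newline bytes chunk by chunk,
/// which is what 'wc -l' reports. The 64 KiB default is an
/// assumption, not a tuned value.
size_t countByChunk(File input, size_t chunkSize = 64 * 1024)
{
    size_t lines = 0;
    foreach (chunk; input.byChunk(chunkSize))
        lines += chunk.count(cast(ubyte) '\n');
    return lines;
}

void main(string[] args)
{
    if (args.length < 2)
    {
        stderr.writeln("usage: ", args[0], " FILE");
        return;
    }

    // The two counts differ by one when the last line of the file
    // has no trailing newline: byLine still yields that line.
    writeln("byLine:  ", countByLine(File(args[1], "rb")));
    writeln("byChunk: ", countByChunk(File(args[1], "rb")));
}
```

Timing each function separately on both kinds of input should
show whether the line length matters for a given approach.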

As an aside: my preliminary conclusion is that the Phobos
facilities are quite good overall (based on the tsv-utils
comparative performance benchmarks), but are non-optimal when
short lines are involved. This is the case for both input and
output. Both the tsv-utils covers and iopipe do better. iopipe is
the best for input, but it appears to need some further work on
the output side (or I don't know iopipe well enough). By
"preliminary", I mean just that: there could certainly be
mistakes or incomplete analysis in the tests.

--Jon