Reading from stdin significantly slower than reading file directly?
Steven Schveighoffer
schveiguy at gmail.com
Thu Aug 13 14:41:02 UTC 2020
On 8/12/20 6:44 PM, methonash wrote:
> Hi,
>
> Relative beginner to D-lang here, and I'm very confused by the apparent
> performance disparity I've noticed between programs that do the following:
>
> 1) cat some-large-file | D-program-reading-stdin-byLine()
>
> 2) D-program-directly-reading-file-byLine() using File() struct
>
> The D-lang difference I've noticed from options (1) and (2) is somewhere
> in the range of 80% wall time taken (7.5s vs 4.1s), which seems pretty
> extreme.
>
> For comparison, I attempted the same using Perl with the same large
> file, and I only noticed a 25% difference (10s vs 8s) in performance,
> which I imagine to be partially attributable to the overhead incurred by
> using a pipe and its buffer.
>
> So, is this difference in D-lang performance typical? Is this expected
> behavior?
>
> Was wondering if this may have anything to do with the library
> definition for std.stdio.stdin
> (https://dlang.org/library/std/stdio/stdin.html)? Does global
> file-locking significantly affect read-performance?
>
> For reference: I'm trying to build a single-threaded application; my
> present use-case cannot benefit from parallelism, because its ultimate
> purpose is to serve as a single-threaded downstream filter from an
> upstream application consuming (n-1) system threads.

Are we missing the obvious here? cat needs to read from disk, write the
results into a pipe buffer, then context-switch into your D program,
then the D program reads from the pipe buffer.

Whereas, reading from a file just needs to read from the file.

The difference does seem a bit extreme, so maybe there is another more
complex explanation.

But for sure, reading from stdin doesn't do anything different than
reading from a file if you are using the File struct.

A more appropriate test might be using the shell to feed the file into
the D program:

    dprogram < FILE

Which means the same code runs for both tests.

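
To make the two invocations concrete, here is a rough sketch of the comparison. It is only an illustration under stated assumptions: `wc -l` stands in for the D program, and a small generated file stands in for the large input, so it shows the shape of the test rather than the timings from the original report.

```shell
#!/bin/sh
# Sketch only: `wc -l` is a stand-in for the D program, and the sample
# file is generated here instead of using the original large file.
set -eu

sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
seq 1 100000 > "$sample"

# 1) cat reads the file and writes it into a pipe; the program then
#    reads from the pipe (an extra process and an extra copy).
piped=$(cat "$sample" | wc -l)

# 2) The shell opens the file and hands it to the program as stdin,
#    so the program still reads stdin but no pipe is involved.
redirected=$(wc -l < "$sample")

echo "piped=$((piped)) redirected=$((redirected))"
```

For an actual measurement, the stand-in would be replaced by the real program and the real file, and each invocation timed separately (for example with the shell's `time`). If the redirected run is close to the direct-file run, the extra cost is probably coming from cat and the pipe rather than from stdin itself.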
-Steve