stdio performance in tango, stdlib, and perl
Andrei Alexandrescu (See Website For Email)
SeeWebsiteForEmail at erdani.org
Wed Mar 21 15:56:25 PDT 2007
kris wrote:
> Andrei Alexandrescu (See Website For Email) wrote:
>> 13.9s Tango
>> 6.6s Perl
>> 5.0s std.stdio
>
>
> There's a couple of things to look at here:
>
> 1) if there's an idiom in tango.io, it would be rewriting the example
> like this: Cout.conduit.copy (Cin.conduit)
The test code assumes the program looks at each line before printing it,
so what matters is the speed of line-oriented reading and writing, not the
speed of raw I/O, which we all know how to get.
> 2) the output.newline on each line will cause a flush ~ this may or may
> not have something to do with it
Probably.
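To make the flush question concrete, here is a minimal C sketch (not Tango
or Phobos code). The helper name write_lines and its parameters are made up
for illustration. With flush_each_line set, every line is pushed to the OS
separately, which on most systems means one write call per line; without
it, stdio batches many lines into each write.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes `count` copies of `line` to `out`.
   If flush_each_line is nonzero, the stream is flushed after every
   line, which typically costs one system call per line. */
int write_lines(FILE *out, const char *line, size_t count,
                int flush_each_line)
{
    assert(out != NULL && line != NULL);

    for (size_t i = 0; i < count; i++) {
        if (fputs(line, out) == EOF)
            return -1;
        if (flush_each_line && fflush(out) == EOF)
            return -1;
    }
    /* One final flush so both modes leave the same data on disk. */
    return fflush(out) == EOF ? -1 : 0;
}
```

Both modes produce identical output; only the number of trips to the
kernel differs, which is the cost experiment (b) is meant to isolate.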
> 3) the test would appear to be stressing the parsing of lines just as
> much (if not more) than the io system itself. All part-and-parcel to a
> degree, but it may be worth investigating
I don't understand this.
> In order to track this down, we'd be interested to see the results of:
>
> a) Cout.conduit.copy (Cin.conduit);
The program wouldn't be comparable with the others.
> b) foregoing the output .newline, purely as an experiment
4.7s tcat
> c) on Linux, tango.io uses the c-lib posix.read/write functions. Is that
> what phobos uses also? (on Win32, Tango uses direct Win32 calls instead)
Then probably that could be filed as a bug in Tango. The nextLine
function should lock the file only once, thus giving each thread an
entire line, not a portion of a line. Also, using block-oriented read
for reading lines makes Tango incompatible with standard C usage (Tango
might read more than one line into its buffers; if a C-level function
tries to read from the file, it will be too late). Unfortunately there's
no public API for such stuff, so system-specific approaches must be
taken. readln on Linux uses GNU's getline(), which locks the file only
once per line. See:
http://www.gnu.org/software/libc/manual/html_node/Line-Input.html
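The read-ahead problem can be illustrated in plain C, with stdio standing
in for any block-buffered reader and a raw read() standing in for the
other consumer of the same file. This is only a sketch of the effect, not
Tango's code; the function name bytes_left_for_raw_read is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Reads the first line through stdio, then tries to read what follows
   directly from the underlying file descriptor. Returns the number of
   bytes the raw read() obtained, or -1 on error. */
long bytes_left_for_raw_read(FILE *fp)
{
    char line[256];
    char rest[256];

    assert(fp != NULL);

    if (fgets(line, sizeof line, fp) == NULL)
        return -1;

    /* stdio usually fetched a whole block to satisfy fgets(), so the
       descriptor's offset is already past the end of the first line. */
    return (long)read(fileno(fp), rest, sizeof rest);
}
```

For a small file the raw read typically finds nothing left, because the
buffered layer already consumed the second line along with the first.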
Unfortunately there's one extra copy going on - from the malloc'd
buffer into D's gc'd array. That copy could be optimized away by using
GNU's malloc hooks:
http://www.gnu.org/software/libc/manual/html_node/Hooks-for-Malloc.html
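For reference, the shape of the extra copy looks roughly like this in C.
This is a sketch, not the actual readln source: read_line_copy is a
hypothetical name, and a second malloc'd block stands in for the gc'd
array.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Reads one line with getline() and returns a caller-owned copy, or
   NULL at end of file or on error. The line length (including the
   trailing newline, if any) is stored in *len_out. */
char *read_line_copy(FILE *fp, size_t *len_out)
{
    char *buf = NULL;   /* getline() allocates this with malloc */
    size_t cap = 0;

    assert(fp != NULL && len_out != NULL);

    ssize_t n = getline(&buf, &cap, fp);
    if (n < 0) {
        free(buf);      /* the buffer must be freed even on failure */
        return NULL;
    }

    /* The extra copy: from getline's buffer into separate storage. */
    char *copy = malloc((size_t)n + 1);
    if (copy != NULL) {
        memcpy(copy, buf, (size_t)n + 1);   /* includes the '\0' */
        *len_out = (size_t)n;
    }
    free(buf);
    return copy;
}
```

The caller frees the returned string. Avoiding the memcpy would require
getline() to allocate directly in the destination storage, which is what
the malloc hooks would be for.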
Andrei