stdio line-streaming revisited
kris
foo at bar.com
Wed Mar 28 18:52:30 PDT 2007
Last week there was a series of posts regarding some optimized code
within phobos streams. A question posed was: without those same
optimizations, would tango.io be slower than the improved phobos? [1]
As these new phobos IO functions are now available, Andrei's "benchmark"
[2] was run on both Win32 and linux to see where tango.io could use some
improvement.
The results indicate:
1) on linux, the fastest variation of the revised phobos code runs 40%
slower than the generic tango.io equivalent. On the other hand, the new
phobos code seems a bit faster than perl
2) on Win32, similar testing shows tango.io to be more than six times
faster than the improved phobos code. Tweaking the tango.io library a
little makes it over eight times faster than the phobos equivalent [3]
3) On Win32, generic tango.io is more than twice as efficient as the
fastest C version identified. It's also notably faster than MinGW 'cat',
which apparently performs various under-the-cover optimizations.
4) by making some further optimizations in the phobos client-code using
setvbuf() and fputs(), the improved phobos version can be sped up
significantly; at that point tango.io is only three times faster than
phobos on Win32. These adjustments require knowledge of tweaking the
underlying C library; thus, they may belong to the group of C++ tweaks
which Walter quibbled with last week. The setvbuf() tweaks make no
noticeable difference on linux, though the fputs() improvements are
accounted for in #1 (above).
Note that tango.io is not explicitly optimized for this behaviour. While
some quick hacks to the library have been shown to make it around 20%
faster than the generic package (for this specific test), the efficiency
benefits are apparently derived through the approach more than anything
else. With some changes to a core tango.io module, similar performance
multipliers could presumably be exhibited on linux platforms also. That
is: tango.io is relatively sedate on linux, compared to its Win32 variation.
FWIW: if some of those "Language Shootout" tests are IO-bound, perhaps
tango.io might help? Can't imagine they'd apply that as a "language"
test, but stranger things have happened before.
Here's the tango.io client (same as last week):
-------------
import tango.io.Console;

void main()
{
    char[] content;

    while (Cin.nextLine (content, true))
           Cout (content);
}
------------
and here's the fastest phobos equivalent. Removing the setvbuf() code
makes it consume around twice as much time on Win32. Note that this
version is faster than the equivalent code posted last week, though
obviously more specialized and verbose:
------------
import std.stdio;
import std.cstream;

void main() {
    char[] buf = new char[1000];
    size_t len;
    const size_t BUFSIZE = 2 * 1024;

    setvbuf(stdin, null, _IOFBF, BUFSIZE);
    setvbuf(stdout, null, _IOFBF, BUFSIZE);

    while ((len = readln(buf)) != 0) {
        assert(len < 1000);
        buf[len] = '\0';
        fputs(buf.ptr, stdout);
    }
}
------------
[1] Timing measurements can be supplied to those interested.
[2] The recent changes within phobos apparently stemmed from Andrei
piping large text files through his code, and this "benchmark" is a
reflection of that process.
[3] That ~20% optimization has been removed from the generic package at
this time, since we feel it doesn't contribute very much to the overall
IO picture. It can be restored if people find that necessary, and there
is no change to client code.