stdio performance in tango, stdlib, and perl
kris
foo at bar.com
Thu Mar 22 11:05:21 PDT 2007
Andrei Alexandrescu (See Website For Email) wrote:
> kris wrote:
>
>> torhu wrote:
>>
>>> torhu wrote:
>>> <snip>
>>>
>>>> Fastest first:
>>>>
>>>> tango.io.Console, no flushing (Andrei's): ca 1.5s
>>>>
>>>> C, reusing buffer, gcc & msvc71: ca 3s
>>>>
>>>> James' C++, gcc: 3.5s
>>>>
>>>> Phobos std.cstream, reused buffer: 11s
>>>>
>>>> C w/malloc and free each line, msvc71: 23s
>>>>
>>>> Andrei's C++, gcc: 27s
>>>>
>>>> C w/malloc and free each line, gcc: 37s
>>>>
>>>> Andrei's C++, msvc71: 50s
>>>>
>>>> James' C++, msvc: 51s
>>>
>>>
>>>
>>> I've run some of the tests with more accurate timing. Andrei's Tango
>>> code uses 0.9 seconds, with no flushing, and 1.6 seconds with
>>> flushing. I also tried cat itself, from the gnuwin32 project. cat
>>> clocks in at 1.3 seconds.
>>
>>
>>
>> Just for jollies, a briefly optimized tango.io was tried also: it came
>> in at around 0.7 seconds. On a tripled file-size (3 million lines),
>> that version is around 23% faster than bog-standard tango.io
>
>
> That's great news!
>
>> Thanks for giving it a whirl, torhu :)
>>
>>
>> p.s. perhaps Andrei should be using tango for processing those vast
>> files he has?
>
>
> Is it compatible with C's stdio? IOW, would this sequence work?
>
> readln(line);
> int c = getchar();
>
> Is 'c' the first character on the next line?

Nope. Tango is for D, not C. In order to make an arguably better library,
one often has to step away from the norm. Both yourself and Walter have
been saying "it needs to be fast and simple", and that's exactly what
Tango is showing: for those who care deeply about such things, tango.io
is shown to be around four times faster than the fastest C
implementation tried (for Andrei's test under Win32), and a notable
fourteen or fifteen times faster than the shipping phobos equivalent.

If "interaction" between D & C on a shared, global file-handle becomes
some kind of issue due to buffering (and only if) we'll cross that
bridge at that point in time. I'm sure there are a number of solutions
that don't involve restricting D to using a lowest common denominator
approach. There are lots of smart people here who would be willing to help
resolve that if necessary.
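
For readers wondering why the readln/getchar sequence quoted above does not
compose, here is a minimal sketch of the general buffering problem. It is
not Tango's code or API: private_reader, private_readline and
demo_shared_descriptor are hypothetical names made up for illustration, and
the sketch assumes a POSIX system and uses a pipe in place of stdin. Any
reader that keeps its own buffer pulls more than one line from the
descriptor, so a second buffered layer (here C's stdio) never sees those
bytes.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical stand-in for any library that buffers privately. */
struct private_reader {
    int fd;
    char buf[4096];
    size_t len;   /* bytes currently held in buf */
    size_t pos;   /* next unread byte in buf */
};

/* Copy one line (without '\n') into out; return its length, or -1 at EOF. */
static int private_readline(struct private_reader *r, char *out, size_t cap)
{
    size_t n = 0;
    assert(cap > 0);
    for (;;) {
        if (r->pos == r->len) {            /* buffer empty: refill from fd */
            ssize_t got = read(r->fd, r->buf, sizeof r->buf);
            if (got <= 0)
                break;
            r->len = (size_t)got;
            r->pos = 0;
        }
        char c = r->buf[r->pos++];
        if (c == '\n') {
            out[n] = '\0';
            return (int)n;
        }
        if (n + 1 < cap)
            out[n++] = c;
    }
    out[n] = '\0';
    return n ? (int)n : -1;
}

/* Put "first\nsecond\n" in a pipe, read one line through the private
   buffer, then ask stdio for the next character on the same descriptor.
   Returns 0 on success and stores that character (or EOF) in *next. */
static int demo_shared_descriptor(char *line, size_t cap, int *next)
{
    static const char text[] = "first\nsecond\n";
    int fds[2];
    if (pipe(fds) != 0)
        return -1;
    ssize_t put = write(fds[1], text, strlen(text));
    close(fds[1]);
    if (put != (ssize_t)strlen(text)) {
        close(fds[0]);
        return -1;
    }

    struct private_reader r = { .fd = fds[0] };
    private_readline(&r, line, cap);   /* the refill swallows both lines */

    FILE *f = fdopen(fds[0], "r");     /* second, independent buffer */
    if (f == NULL) {
        close(fds[0]);
        return -1;
    }
    *next = fgetc(f);                  /* "second" is stuck in r.buf */
    fclose(f);
    return 0;
}
```

Called with a small line buffer, demo_shared_descriptor fills line with
"first" and stores EOF in *next rather than 's', because the first refill
already drained the pipe. Getting the two layers to agree would mean sharing
one buffer or reading unbuffered, which is the kind of trade-off discussed
in this thread.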

The tango.io package is intended to be clean, extensible, simple, and a
whole lot more coherent than certain others. We feel it meets those
goals, and it happens to be quite efficient at the same time. It seems a
bit like sour grapes to start looking for "issues" with that intent,
particularly when compared to an implementation that proclaims "It peeks
under the hood of C's stdio implementation, meaning it's customized for
Digital Mars' stdio, and gcc's stdio"?

Tango is not meant to be a phobos clone; it doesn't make the same claims
as phobos and it doesn't follow the same rules as phobos. If you need
phobos rules, then use phobos. If you don't like tango.io speed,
extensibility and simplicity, without all the special cases of C IO,
then use phobos. If you want both then, at some point, we'll consider
figuring out how to make your C-oriented corner-cases work with tango.io.

Walter wrote: "One of my goals with D is to fix that - the
straightforward, untuned code should get you most of the possible speed."

Andrei wrote: "Just make the clear and simple code fastest. One thing I
like about D is that it clearly strives to achieve best performance for
simply-written code."

That sentiment is very much what Tango itself is about.

You began this thread by titling it "stdio and Tango IO performance" and
noting the following: "has anyone verified that Tango's I/O performance
is up to snuff? I see it imposes the dynamic-polymorphic approach, and
unless there was some serious performance work going on, it's possible
it's even slower than stdio."

Given the results shown above, I hope we can put that to rest at this time.