stdio performance in tango, stdlib, and perl

kris foo at bar.com
Thu Mar 22 11:05:21 PDT 2007


Andrei Alexandrescu (See Website For Email) wrote:
> kris wrote:
> 
>> torhu wrote:
>>
>>> torhu wrote:
>>> <snip>
>>>
>>>> Fastest first:
>>>>
>>>> tango.io.Console, no flushing (Andrei's): ca 1.5s
>>>>
>>>> C, reusing buffer, gcc & msvc71: ca 3s
>>>>
>>>> James' C++, gcc: 3.5s
>>>>
>>>> Phobos std.cstream, reused buffer: 11s
>>>>
>>>> C w/malloc and free each line, msvc71: 23s
>>>>
>>>> Andrei's C++, gcc: 27s
>>>>
>>>> C w/malloc and free each line, gcc: 37s
>>>>
>>>> Andrei's C++, msvc71: 50s
>>>>
>>>> James' C++,  msvc: 51s
>>>
>>>
>>>
>>> I've run some of the tests with more accurate timing. Andrei's Tango 
>>> code uses 0.9 seconds, with no flushing, and 1.6 seconds with 
>>> flushing.  I also tried cat itself, from the gnuwin32 project.  cat 
>>> clocks in at 1.3 seconds.
>>
>>
>>
>> Just for jollies, a briefly optimized tango.io was tried also: it came 
>> in at around 0.7 seconds. On a tripled file-size (3 million lines), 
>> that version is around 23% faster than bog-standard tango.io.
> 
> 
> That's great news!
> 
>> Thanks for giving it a whirl, torhu :)
>>
>>
>> p.s. perhaps Andrei should be using tango for processing those vast 
>> files he has?
> 
> 
> Is it compatible with C's stdio? IOW, would this sequence work?
> 
> readln(line);
> int c = getchar();
> 
> Is 'c' the first character on the next line?


Nope. Tango is for D, not C. In order to make an arguably better library, 
one often has to step away from the norm. Both yourself and Walter have 
been saying "it needs to be fast and simple", and that's exactly what 
Tango is showing: for those who care deeply about such things, tango.io 
is shown to be around four times faster than the fastest C 
implementation tried (for Andrei's test under Win32), and a notable 
fourteen or fifteen times faster than the shipping phobos equivalent.

If "interaction" between D & C on a shared, global file-handle becomes 
some kind of issue due to buffering (and only if), we'll cross that 
bridge at that point in time. I'm sure there are a number of solutions 
that don't involve restricting D to using a lowest-common-denominator 
approach. There are lots of smart people here who would be willing to 
help resolve that if necessary.
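
To make the buffering question concrete, here is a minimal C sketch, 
assuming a POSIX system (read, mkstemp, fdopen). It is not Tango or 
Phobos code: private_readln and next_char_seen_by_stdio are hypothetical 
names, standing in for any library that keeps its own buffer over a 
descriptor that C's stdio also reads from.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical stand-in for a privately buffered reader: it pulls a whole
   block from the descriptor and hands back only the first line. The rest
   of the block is simply dropped here; a real library would keep it in
   its own buffer, where stdio still could not see it. */
static size_t private_readln(int fd, char *line, size_t cap)
{
    char block[4096];
    size_t n = 0;
    ssize_t got;

    assert(cap > 0);
    got = read(fd, block, sizeof block);   /* may consume many lines */
    if (got < 0)
        got = 0;
    while (n < (size_t)got && n + 1 < cap && block[n] != '\n') {
        line[n] = block[n];
        n++;
    }
    line[n] = '\0';
    return n;
}

/* Writes two lines to a temporary file, reads the first line through the
   private reader, then asks stdio for the next character on the same
   descriptor. Returns what getc() reports, or -2 if the setup fails
   (assuming EOF is not -2 on this platform). */
int next_char_seen_by_stdio(void)
{
    char path[] = "/tmp/sharedfdXXXXXX";
    const char *text = "first\nsecond\n";
    char line[64];
    FILE *fp;
    int c;
    int fd = mkstemp(path);

    if (fd < 0)
        return -2;
    unlink(path);                          /* file vanishes once closed */
    if (write(fd, text, strlen(text)) != (ssize_t)strlen(text) ||
        lseek(fd, 0, SEEK_SET) != 0 ||
        (fp = fdopen(fd, "r")) == NULL) {
        close(fd);
        return -2;
    }
    private_readln(fd, line, sizeof line); /* yields "first" */
    c = getc(fp);                          /* stdio reads the same fd */
    fclose(fp);                            /* also closes fd */
    return c;
}
```

Under those assumptions, next_char_seen_by_stdio() returns EOF rather 
than 's': the private read has already drained the descriptor, so stdio 
finds nothing left. A seekable file could be repositioned to work around 
that, but a pipe cannot.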

The tango.io package is intended to be clean, extensible, simple, and a 
whole lot more coherent than certain others. We feel it meets those 
goals, and it happens to be quite efficient at the same time. Seems a 
bit like sour grapes to start looking for "issues" with that intent, 
particularly when compared to an implementation that proclaims "It peeks 
under the hood of C's stdio implementation, meaning it's customized for 
Digital Mars' stdio, and gcc's stdio"?

Tango is not meant to be a phobos clone; it doesn't make the same claims 
as phobos and it doesn't follow the same rules as phobos. If you need 
phobos rules, then use phobos. If you don't like tango.io speed, 
extensibility and simplicity, without all the special cases of C IO, 
then use phobos. If you want both then, at some point, we'll consider 
figuring out how to make your C-oriented corner-cases work with tango.io.

Walter wrote: "One of my goals with D is to fix that - the 
straightforward, untuned code should get you most of the possible speed."

Andrei wrote: "Just make the clear and simple code fastest. One thing I 
like about D is that it clearly strives to achieve best performance for 
simply-written code."

That sentiment is very much what Tango itself is about.

You began this thread by titling it "stdio and Tango IO performance" and 
noting the following: "has anyone verified that Tango's I/O performance 
is up to snuff? I see it imposes the dynamic-polymorphic approach, and 
unless there was some serious performance work going on, it's possible 
it's even slower than stdio."

Given the results shown above, I hope we can put that to rest at this time.


