stdio performance in tango, stdlib, and perl

Wed Mar 21 17:11:56 PDT 2007

kris wrote:
> Andrei Alexandrescu (See Website For Email) wrote:
> [snip]
> 
>>>> 4.7s    tcat
>>>
>>>
>>> Thanks. If tango.io were to retain CR on readln, then it would come 
>>> out ahead of everything else in this particular test
>>
>>
>> Well probably but must be tested. Newlines comprise about 3% of the 
>> file size.
> 
> Yeah, I can imagine. Module tango.io.Console at line 119 should have a 
> slice in it ... if you change 'j' to be 'i+1' instead, that should 
> remove the chop

Yum.

> Tango should still come out in front, although I have to say that 
> benchmarks don't really tell very much in general i.e. doesn't mean much 
> of anything important whether tango "wins" this or not (IMO)

Why not? Programs using the standard input and output are ubiquitous, 
efficient, and extremely easy to combine. I write them all the time for 
processing huge amounts of data.

I didn't run the tests willy-nilly. I had a Perl script that took a 
night to run (it scrambles through some 20 GB of data), so I decided to 
give D a shot. The D equivalent was twice as slow. With the new 
readln, it takes 98 minutes; parallelized, it is another five times 
faster (which was impossible in the previous version because it 
used 98% CPU).

I was actually surprised that nobody has noticed Phobos' low I/O speed 
in years. It's make-or-break for me and many others.

If there's any chance that automated chopping could be removed from 
Tango, that would be awesome. Also it would be great to fix the 
incompatibility created by using read/write instead of getline.
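
To make the "no chopping" point concrete, here is a rough sketch in plain C using POSIX getline, which hands back each line with its terminator still attached, so a cat-style loop can write the buffer straight out without re-appending anything. The name copy_lines is made up for illustration; it is not from Phobos or Tango, and this is only a sketch of the idea, not how either library does it.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes getline() */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: copy `in` to `out` one line at a time.
   Returns the number of lines copied, or -1 on a write error. */
long copy_lines(FILE *in, FILE *out)
{
    assert(in != NULL && out != NULL);

    char *line = NULL;   /* buffer allocated and grown by getline() */
    size_t cap = 0;
    ssize_t len;
    long count = 0;

    while ((len = getline(&line, &cap, in)) != -1) {
        /* line[0..len) still ends in '\n' (except possibly the last
           line of the input), so it is written back unchanged. */
        if (fwrite(line, 1, (size_t)len, out) != (size_t)len) {
            free(line);
            return -1;
        }
        count++;
    }
    free(line);
    return count;
}
```

Calling copy_lines(stdin, stdout) gives a minimal cat. Since everything goes through the same FILE buffers, it can also be mixed freely with other stdio calls on the same streams.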

Andrei