stdio line-streaming revisited

kris foo at bar.com
Wed Mar 28 22:24:55 PDT 2007


Andrei Alexandrescu (See Website For Email) wrote:
> kris wrote:
> 
>> Andrei Alexandrescu (See Website For Email) wrote:
>>
>>> kris wrote:
>>>
>>>> Last week there was a series of posts regarding some optimized code 
>>>> within phobos streams. A question posed was, without those same 
>>>> optimizations, would tango.io be slower than the improved phobos? [1]
>>>>
>>>> As these new phobos IO functions are now available, Andrei's 
>>>> "benchmark" [2] was run on both Win32 and linux to see where 
>>>> tango.io could use some improvement.
>>>
>>>
>>> [snip]
>>>
>>> On my machine, Tango does 4.3 seconds and the following phobos 
>>> program (with Walter's readln) does 5.4 seconds:
>>
>>
>> On Win32, the difference is very much larger. As noted before, several 
>> times faster. Those benefits will likely translate to linux going 
>> forward.
> 
> 
> If I understand things correctly, it looks like the hope is to derive 
> more speed from further dropping phobos and C I/O compatibility, a path 
> that I personally don't consider attractive.

Nope. That's not the case at all. The expectation (or 'hope', if you 
like) is that we can make the linux version operate more like the Win32 
version.

> 
> Also, the fact that the tango version is "more than twice as efficient 
> as the fastest C version identified" suggests a problem with the testing 
> method or with the C code. Are they comparable? If you genuinely have a 
> method to push bits through two times faster than the fastest C can do, 
> you may as well go ahead and patent it. Your method would speed up 
> many programs, since many use C's I/O and are I/O bound. It's huge news. 

That's good for D then?

There's no reason why C could not take the same approach. Yet, one might 
imagine, the IO strategies exposed and the wide variety of special cases 
may 'discourage' the implementation of a more efficient approach. That's 
just pure speculation on my part, and I'm quite positive the C version 
could be sped up notably if one reimplemented a bunch of things.

> I'm not even kidding. But I doubt that that's the case.

You're most welcome to your doubts, Andrei. However, just because "C 
does it that way" doesn't mean it is, or ever was, the "best" approach.


> 
>>> Also, the Tango version has a bug. Running Tango's cat without any 
>>> pipes does not read lines from the console and output them one by 
>>> one, as it should; instead, it reads many lines and buffers them 
>>> internally, echoing them only after the user has pressed end-of-file 
>>> (^D on Linux), or possibly after the user has entered a large amount 
>>> of data (I didn't have the patience). The system cat program and the 
>>> phobos implementation correctly process each line as it was entered.
>>
>>
>> If you mean something that you've written, that could presumably be 
>> rectified by adding the isatty() test Walter had mentioned before. 
>> That has not been added to tango.io since (a) it would likely make 
>> programs behave differently depending on whether they were redirected 
>> or not. It's not yet clear whether that is an appropriate 
>> specialization, as default behaviour
> 
> 
> What is absolutely clear is that the current version has a bug. It can't 
> read a line from the user and write it back. There cannot be any 
> question that that's a problem.

Only with the way that you've written your program. In the general case, 
that is not true at all. But please do submit that bug-report :)
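
For anyone following the isatty() suggestion, here is a minimal sketch in C 
of what such a test could look like. It is not Tango or phobos code; the 
names should_flush_per_line and copy_lines are made up for illustration, 
and isatty() is assumed to be the POSIX one from <unistd.h>. The idea is 
only to flush after every line when a terminal is attached, and to leave 
the output buffered otherwise.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>   /* isatty(), POSIX */

/* Hypothetical helper: nonzero when either end is a terminal, i.e. a user
   is probably waiting to see each line echoed. */
static int should_flush_per_line(int in_fd, int out_fd) {
    return isatty(in_fd) || isatty(out_fd);
}

/* Copy text from in to out. When flush_each_line is nonzero, push every
   chunk out as soon as it has been read. Returns the number of fgets()
   chunks copied; a line longer than the buffer counts more than once. */
static long copy_lines(FILE *in, FILE *out, int flush_each_line) {
    char buf[4096];
    long chunks = 0;
    assert(in != NULL && out != NULL);
    while (fgets(buf, sizeof buf, in) != NULL) {
        fputs(buf, out);
        if (flush_each_line)
            fflush(out);
        chunks++;
    }
    return chunks;
}
```

A cat-style program would call copy_lines(stdin, stdout, 
should_flush_per_line(STDIN_FILENO, STDOUT_FILENO)). Note that this is 
exactly the "behaves differently when redirected" specialization in 
question: the same program flushes per line on a console and per buffer 
through a pipe.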


> 
>> , and (b) there has been no ticket issued for it
>>
>> Again, please submit a ticket so we don't forget about that detail. 
>> We'd be interested to hear if folk think the "isatty() test" should be 
>> default behaviour, or would perhaps lead to corner-case issues instead.
> 
> 
> I was actually pointing out a larger issue: incompatibility with phobos' 
> I/O and C I/O. Tango's version is now faster (thank God we got past the 
> \n issue and bummer it's not the default parameter of nextLine) but it 
> is incompatible with both phobos' and C's stdio. (It's possible that the 
> extra speed is derived from skipping C's stdio and using read and write 
> directly.) Probably you could reimplement phobos and bundle it with 
> Tango to give the users the option to link phobos code with Tango code 
> properly, but still C stdio compatibility is lost, and phobos code has 
> access to it.

The issue you raise here is that of interleaved and shared access to 
global entities, such as the console, where some incompatibility between 
tango.io and C IO is exhibited.

If you really dig into it, you'll perhaps conclude that (a) the number 
of real-world scenarios where this would truly become an issue is 
vanishingly small, and (b) the vast (certainly on Win32) performance 
improvement is worth that tradeoff. Even then, it is certainly possible 
to intercept C IO functions and route them to tango.io equivalents instead.
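
To make the interleaving issue concrete, here is a small sketch in plain C. 
It is not Tango code and says nothing about how tango.io is implemented; 
it only shows the general effect of one writer going through C stdio's 
buffer while another writes straight to the same descriptor. The function 
name mixed_write is hypothetical, and write() and fileno() are assumed to 
be the POSIX ones.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>   /* write(), POSIX */

/* Send "A" through C stdio and "B" through a raw write() on the same
   descriptor. Returns 0 on success, -1 if the raw write fails. */
static int mixed_write(FILE *f, int flush_between) {
    assert(f != NULL);
    fputs("A", f);                      /* sits in stdio's user-space buffer */
    if (flush_between)
        fflush(f);                      /* hand "A" to the OS first */
    if (write(fileno(f), "B", 1) != 1)  /* bypasses the stdio buffer */
        return -1;
    fflush(f);                          /* any buffered "A" goes out only now */
    return 0;
}
```

On a fully buffered stream such as a regular file, calling 
mixed_write(f, 0) typically leaves the bytes in the order "BA", whereas 
mixed_write(f, 1) gives "AB". Two libraries sharing the console without 
sharing a buffer can run into the same kind of reordering unless one of 
them flushes at the hand-over points.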

It has been said before, but is probably worth repeating:

- Tango is not a phobos clone. Nor is it explicitly designed to be 
compatible with phobos; sometimes it is worthwhile taking a different 
approach. Turns out that phobos can be run alongside tango in many 
situations.

- Tango is for D programmers; not C programmers.

- Tango, as a rule, is intended to be flexible, modular, efficient and 
practical. The goal is to provide D with an exceptional library, and we 
reserve the right to break a few eggs along the way ;)


