stdio line-streaming revisited

Andrei Alexandrescu (See Website For Email) SeeWebsiteForEmail at erdani.org
Wed Mar 28 23:31:40 PDT 2007


kris wrote:
> Andrei Alexandrescu (See Website For Email) wrote:
>> kris wrote:
>>
>>> Andrei Alexandrescu (See Website For Email) wrote:
>>>
>>>> kris wrote:
>>>>
>>>>> Last week there was a series of posts regarding some optimized 
>>>>> code within phobos streams. A question posed was, without those 
>>>>> same optimizations, would tango.io be slower than the improved 
>>>>> phobos [1]
>>>>>
>>>>> As these new phobos IO functions are now available, Andrei's 
>>>>> "benchmark" [2] was run on both Win32 and linux to see where 
>>>>> tango.io could use some improvement.
>>>>
>>>>
>>>> [snip]
>>>>
>>>> On my machine, Tango does 4.3 seconds and the following phobos 
>>>> program (with Walter's readln) does 5.4 seconds:
>>>
>>>
>>> On Win32, the difference is very much larger. As noted before, 
>>> several times faster. Those benefits will likely translate to linux 
>>> going forward.
>>
>>
>> If I understand things correctly, it looks like the hope is to derive 
>> more speed from further dropping phobos and C I/O compatibility, a 
>> path that I personally don't consider attractive.
> 
> Nope. That's not the case at all. The expectation (or 'hope', if you 
> like) is that we can make the linux version operate more like the Win32 
> version
> 
>>
>> Also, the fact that the tango version is "more than twice as efficient 
>> as the fastest C version identified" suggests a problem with the 
>> testing method or with the C code. Are they comparable? If you 
>> genuinely have a method to push bits through two times faster than the 
>> fastest C can do, you may as well go ahead and patent it. Your 
>> method would speed up many programs, since many use C's I/O and are 
>> I/O bound. It's huge news. 
> 
> That's good for D then?
> 
> There's no reason why C could not take the same approach. Yet, one might 
> imagine, the IO strategies exposed and the wide variety of special cases 
> may 'discourage' the implementation of a more efficient approach. That's 
> just pure speculation on my part, and I'm quite positive the C version 
> could be sped up notably if one reimplemented a bunch of things.
> 
>> I'm not even kidding. But I doubt that that's the case.
> 
> You're most welcome to your doubts, Andrei. However, just because "C 
> does it that way" doesn't mean it is, or ever was, the "best" approach

I think we're not on the same page here. What I'm saying is that, unless 
you cut a deal with Microsoft to provide you with a secret D I/O API 
that nobody knows about, all fast APIs in existence come with a C 
interface. It's very hard to dispute that. Probably you are referring to 
the C stdio, and I'm in agreement with that. Of course there's a variety 
of means to be faster than stdio on any given platform, at various 
compatibility costs. It's known how to do that. "Hot water has been 
invented."
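
To illustrate the kind of trade-off meant here, the following is a minimal 
sketch in C (not Tango's or phobos' actual code) of copying data with the 
POSIX read and write calls directly instead of going through stdio. The 
function name copy_fd and the 64 KiB block size are assumptions made for 
this example only.

```c
#include <errno.h>
#include <unistd.h>

/* Copy everything from file descriptor `in` to `out` using read(2) and
 * write(2) directly, with no stdio buffering in between.
 * Returns the number of bytes copied, or -1 on error. */
long long copy_fd(int in, int out)
{
    static char buf[64 * 1024];    /* assumed block size, not a tuned value */
    long long total = 0;

    for (;;) {
        ssize_t n = read(in, buf, sizeof buf);
        if (n == 0)
            return total;          /* end of input */
        if (n < 0) {
            if (errno == EINTR)
                continue;          /* interrupted by a signal: retry */
            return -1;
        }
        ssize_t off = 0;
        while (off < n) {          /* write() may accept fewer bytes than asked */
            ssize_t w = write(out, buf + off, (size_t)(n - off));
            if (w < 0) {
                if (errno == EINTR)
                    continue;
                return -1;
            }
            off += w;
        }
        total += n;
    }
}
```

A caller would use copy_fd(STDIN_FILENO, STDOUT_FILENO) for a cat-like 
program. The compatibility cost is that anything still sitting in a stdio 
buffer for the same descriptors is invisible to this loop.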

>>>> Also, the Tango version has a bug. Running Tango's cat without any 
>>>> pipes does not read lines from the console and output them one by 
>>>> one, as it should; instead, it reads many lines and buffers them 
>>>> internally, echoing them only after the user has pressed end-of-file 
>>>> (^D on Linux), or possibly after the user has entered a large amount 
>>>> of data (I didn't have the patience). The system cat program and the 
>>>> phobos implementation correctly process each line as it was entered.
>>>
>>>
>>> If you mean something that you've written, that could presumably be 
>>> rectified by adding the isatty() test Walter had mentioned before. 
>>> That has not been added to tango.io since (a) it would likely make 
>>> programs behave differently depending on whether they were redirected 
>>> or not. It's not yet clear whether that is an appropriate 
>>> specialization, as default behaviour
>>
>>
>> What is absolutely clear is that the current version has a bug. It 
>> can't read a line from the user and write it back. There cannot be any 
>> question that that's a problem.
> 
> Only with the way that you've written your program. In the general case, 
> that is not true at all. But please do submit that bug-report :)

This is the fourth time we need to discuss this. I don't understand why 
I need to _argue_ that this is a bug.

Let me spell it out again: Cin.nextLine is incorrect. It cannot be used 
(without possibly some extra incantations I don't know about) to 
implement a program that does this:

$ ./test.d
Please enter your name: Moe
Hello, Moe!
$ _
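
For reference, the behaviour in the session above can be sketched with C 
stdio as follows. This is an illustration of the expected interaction only, 
not Tango or phobos code; the function name greet and the 256-byte name 
buffer are assumptions of this sketch.

```c
#include <stdio.h>
#include <string.h>

/* Prompt on `out`, read one line from `in`, and greet the user.
 * Returns 0 on success, -1 if no line could be read. */
int greet(FILE *in, FILE *out)
{
    char name[256];                     /* assumed maximum line length */

    fputs("Please enter your name: ", out);
    fflush(out);                        /* show the prompt before blocking on input */

    if (fgets(name, sizeof name, in) == NULL)
        return -1;                      /* end-of-file or read error */
    name[strcspn(name, "\n")] = '\0';   /* drop the trailing newline, if any */

    fprintf(out, "Hello, %s!\n", name);
    fflush(out);                        /* reply right away, not at program exit */
    return 0;
}
```

Calling greet(stdin, stdout) from main gives the interaction shown: the 
reply appears as soon as the user presses Enter, without waiting for 
end-of-file.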

I don't have an account on the Tango site, and in a fraction of the time 
it would take me to create one, you can submit the bug report.

>>> , and (b) there has been no ticket issued for it
>>>
>>> Again, please submit a ticket so we don't forget about that detail. 
>>> We'd be interested to hear if folk think the "isatty() test" should 
>>> be default behaviour, or would perhaps lead to corner-case issues 
>>> instead
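
As a point of reference for the "isatty() test" under discussion, here is a 
minimal sketch in C of what such a check might look like. It is not the 
code Walter proposed and not part of tango.io; the function name 
configure_buffering is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <unistd.h>

/* Pick a buffering mode for `f`: line-buffered when it is attached to a
 * terminal, fully buffered otherwise. Must be called before any other
 * operation on the stream. Returns the mode chosen, or -1 on failure. */
int configure_buffering(FILE *f)
{
    int mode = isatty(fileno(f)) ? _IOLBF : _IOFBF;
    return setvbuf(f, NULL, mode, BUFSIZ) == 0 ? mode : -1;
}
```

This is exactly the specialization being questioned: the same program 
flushes per line at a console and per block when redirected to a file or 
pipe.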
>>
>>
>> I was actually pointing out a larger issue: incompatibility with 
>> phobos' I/O and C I/O. Tango's version is now faster (thank God we got 
>> past the \n issue and bummer it's not the default parameter of 
>> nextLine) but it is incompatible with both phobos' and C's stdio. 
>> (It's possible that the extra speed is derived from skipping C's stdio 
>> and using read and write directly.) Probably you could reimplement 
>> phobos and bundle it with Tango to give the users the option to link 
>> phobos code with Tango code properly, but still C stdio compatibility 
>> is lost, and phobos code has access to it.
> 
> The issue you raise here is that of interleaved and shared access to 
> global entities, such as the console, where some incompatibility between 
> tango.io and C IO is exhibited.
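
The interleaving problem can be reproduced without Tango at all. The sketch 
below, in C, mixes buffered stdio output with a direct write on the same 
descriptor; the function name mix_stdio_and_raw is an assumption of this 
example, and the raw write stands in for any library that bypasses stdio.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <unistd.h>

/* Write "stdio " through the buffered FILE* and then "raw " straight to the
 * underlying descriptor. Returns 0 on success, -1 on failure. */
int mix_stdio_and_raw(FILE *f)
{
    if (setvbuf(f, NULL, _IOFBF, BUFSIZ) != 0)   /* force full buffering */
        return -1;
    if (fputs("stdio ", f) == EOF)               /* sits in the stdio buffer */
        return -1;
    if (write(fileno(f), "raw ", 4) != 4)        /* bypasses that buffer */
        return -1;
    return fflush(f) == 0 ? 0 : -1;              /* buffered text lands only now */
}
```

On common implementations the text written second reaches the file first, 
so the two layers disagree about output order unless every switch between 
them is preceded by a flush.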
> 
> If you really dig into it, you'll perhaps conclude that (a) the number 
> of real-world scenarios where this would truly become an issue is 
> vanishingly small, and (b) the vast (certainly on Win32) performance 
> improvement is worth that tradeoff. Even then, it is certainly possible 
> to intercept C IO functions and route them to tango.io equivalents instead.

What Win32 primitives does tango use?

> It has been said before, but is probably worth repeating:
> 
> - Tango is not a phobos clone. Nor is it explicitly designed to be 
> compatible with phobos; sometimes it is worthwhile taking a different 
> approach. Turns out that phobos can be run alongside tango in many 
> situations.
> 
> - Tango is for D programmers; not C programmers.
> 
> - Tango, as a rule, is intended to be flexible, modular, efficient and 
> practical. The goal is to provide D with an exceptional library, and we 
> reserve the right to break a few eggs along the way ;)

Sounds great.


Andrei
