Minor std.stdio.File.ByLine rant

Steven Schveighoffer schveiguy at yahoo.com
Thu Feb 27 08:26:42 PST 2014


On Thu, 27 Feb 2014 10:04:47 -0500, H. S. Teoh <hsteoh at quickfur.ath.cx>  
wrote:

> On Thu, Feb 27, 2014 at 07:55:59AM -0500, Steven Schveighoffer wrote:
>> On Wed, 26 Feb 2014 18:44:10 -0500, H. S. Teoh
>> <hsteoh at quickfur.ath.cx> wrote:
>>
>> >First of all, the way ByLine works is kinda tricky, even in the
>> >previous releases. The underlying cause is that at least on Posix,
>> >the underlying C feof() call doesn't actually tell you whether you're
>> >really at EOF until you try to read something from the file
>> >descriptor.
>>
>> This is not a posix problem, it's a general stream problem.
>>
>> A stream is not at EOF until the write end is closed. Until then,
>> you cannot know whether it's empty until you read and don't get
>> anything back. Even if a primitive existed that allowed you to tell
>> whether the write end was closed, you can race this against the
>> other process closing its write end.
>>
>> I think the correct solution is to block on the first front call. We
>> may be able to do this without storing an additional variable.
> [...]
>
> Unfortunately, you can't. Since Phobos can't know whether the file
> (which may be a network socket, say) is at EOF without first blocking on
> read, it won't be able to return the correct value from .empty, and
> according to the range API, it's invalid to access .front unless .empty
> returns false. So this solution doesn't work. :-(

Yes, you are right!

Thinking about it, the only correct solution is to do what it already does  
-- establish the first line on construction. empty cannot depend on front,  
and doing something different on the first empty vs. every other one makes  
the range bloated and confusing.

The real point is that both construction and popFront have to be treated
as blocking operations. Streams are a tricky business indeed. I think your
solution is the only valid one. It's unfortunate that you have to do this.
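
To make the "first line on construction" idea concrete, here is a minimal
sketch of such a range. It is not the Phobos implementation: the name
PrimedLines is made up for illustration, and it only strips a trailing
'\n'. The point is just that the constructor and popFront are the only
places that can block, while empty and front never touch the stream.

```d
import std.stdio;

// Hypothetical illustration only; this is NOT how std.stdio.File.ByLine
// is actually written.
struct PrimedLines
{
    private File file;
    private char[] line;
    private bool done;

    this(File f)
    {
        file = f;
        popFront(); // blocks here to establish the first line
    }

    // Pure state checks: neither of these reads from the stream.
    @property bool empty() const { return done; }
    @property const(char)[] front() const { return line; }

    void popFront()
    {
        // File.readln returns 0 only when nothing more can be read.
        if (file.readln(line) == 0)
            done = true;
        else if (line.length && line[$ - 1] == '\n')
            line = line[0 .. $ - 1]; // drop the line terminator
    }
}

void main()
{
    // Constructing the range already waits for the first line of input.
    foreach (l; PrimedLines(stdin))
        writeln(l);
}
```

As with byLine, front is only valid until the next popFront, since the
buffer is reused.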

An interesting general solution is to pass a delegate that generates the
range. That gives an easy one-line construction without having to write a
wrapper range that lazily constructs on empty. The catch is that just
using a delegate's name does not call it. I did come up with this:

import std.stdio;
import std.range;

void foo(R)(R r)
{
     static if(isInputRange!R)
     {
         alias _r = r;
     }
     else // if is no-arg delegate and returns input range
          // (too lazy to figure this out :)
     {
         auto _r(){return r();}
     }

     foreach(x; _r)
     {
         writeln(x);
     }
}
void main()
{
     foo(() => stdin.byLine);
     foo([1,2,3]);
}

The static if at the beginning is awkward, but just allows the rest of the  
code to be identical whether you call with a delegate or a range.
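
For the check I skipped above, something along these lines might do. This
is only a sketch: isRangeFactory is a made-up name, not anything in
Phobos, and this version calls the delegate into a local variable instead
of using the alias/nested-function trick, so it is a variation on the
idea rather than the same code.

```d
import std.range : isInputRange;
import std.stdio : stdin, writeln;

// Hypothetical helper (not part of Phobos): true if a value of type R
// can be called with no arguments and the result is an input range.
template isRangeFactory(R)
{
    static if (is(typeof(R.init()) T))
        enum isRangeFactory = isInputRange!T;
    else
        enum isRangeFactory = false;
}

void foo(R)(R r)
    if (isInputRange!R || isRangeFactory!R)
{
    static if (isInputRange!R)
        auto range = r;
    else
        auto range = r(); // the range is only constructed here

    foreach (x; range)
    {
        writeln(x);
    }
}

void main()
{
    foo(() => stdin.byLine); // blocks inside foo, not at the call site
    foo([1, 2, 3]);
}
```

The template constraint rejects anything that is neither a range nor a
no-arg callable returning one, so a mistake shows up at the call to foo
instead of somewhere inside the foreach.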

-Steve

