Minor std.stdio.File.ByLine rant

Thu Feb 27 09:32:44 PST 2014

On Thu, Feb 27, 2014 at 11:26:42AM -0500, Steven Schveighoffer wrote:
> On Thu, 27 Feb 2014 10:04:47 -0500, H. S. Teoh
> <hsteoh at quickfur.ath.cx> wrote:
> 
> >On Thu, Feb 27, 2014 at 07:55:59AM -0500, Steven Schveighoffer wrote:
> >>On Wed, 26 Feb 2014 18:44:10 -0500, H. S. Teoh
> >><hsteoh at quickfur.ath.cx> wrote:
> >>
> >>>First of all, the way ByLine works is kinda tricky, even in the
> >>>previous releases. The underlying cause is that at least on Posix,
> >>>the underlying C feof() call doesn't actually tell you whether
> >>>you're really at EOF until you try to read something from the file
> >>>descriptor.
> >>
> >>This is not a posix problem, it's a general stream problem.
> >>
> >>A stream is not at EOF until the write end is closed. Until then,
> >>you cannot know whether it's empty until you read and don't get
> >>anything back. Even if a primitive existed that allowed you to tell
> >>whether the write end was closed, you can race this against the
> >>other process closing its write end.
> >>
> >>I think the correct solution is to block on the first front call. We
> >>may be able to do this without storing an additional variable.
> >[...]
> >
> >Unfortunately, you can't. Since Phobos can't know whether the file
> >(which may be a network socket, say) is at EOF without first blocking
> >on read, it won't be able to return the correct value from .empty,
> >and according to the range API, it's invalid to access .front unless
> >.empty returns false. So this solution doesn't work. :-(
> 
> Yes, you are right!
> 
> Thinking about it, the only correct solution is to do what it
> already does -- establish the first line on construction. empty
> cannot depend on front, and doing something different on the first
> empty vs. every other one makes the range bloated and confusing.
> 
> The real issue is that construction and popFront have to be treated
> as blocking. Streams are a tricky business indeed. I think your
> solution is the only valid one. Unfortunate that you have to do
> this.
> 
> An interesting general solution is to use a delegate to generate the
> range, giving an easy one-line construction without having to make a
> wrapper range that lazily constructs on empty, but just using a
> delegate name does not call it. I did come up with this:

Actually, now that I think about it, can't we just make ByLine lazily
constructed? It's already a wrapper around ByLineImpl anyway (since it's
being refcounted), so why not just make the wrapper create ByLineImpl
only when you actually attempt to use it? That would solve the problem:
you can call ByLine but it won't block until ByLineImpl is actually
created, which is the first time you call ByLine.empty.
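
Here's a rough sketch of the idea. This is not the actual Phobos code:
EagerLines is a made-up, simplified stand-in for ByLineImpl, and
LazyLines is a made-up name for the lazy wrapper. It also skips the
refcounting, so copies of the wrapper don't share state the way the real
ByLine does.

```d
import std.stdio;

/// Hypothetical stand-in for ByLineImpl: reads the first line eagerly.
struct EagerLines
{
    private File file;
    private char[] line;
    private bool done;

    this(File f)
    {
        file = f;
        popFront(); // blocks here: the first line is read on construction
    }

    @property bool empty() const { return done; }

    /// Current line, including its terminator (readln keeps it).
    @property const(char)[] front() const { return line; }

    void popFront()
    {
        // readln returns 0 only when nothing more can be read (EOF).
        if (file.readln(line) == 0)
            done = true;
    }
}

/// Hypothetical lazy wrapper: does no I/O until the range is first used.
struct LazyLines
{
    private File file;
    private EagerLines impl;
    private bool started;

    this(File f) { file = f; } // no read, so no blocking here

    private void ensureStarted()
    {
        if (!started)
        {
            impl = EagerLines(file); // the blocking read happens now
            started = true;
        }
    }

    @property bool empty() { ensureStarted(); return impl.empty; }
    @property const(char)[] front() { ensureStarted(); return impl.front; }
    void popFront() { ensureStarted(); impl.popFront(); }
}

void main()
{
    auto lines = LazyLines(stdin); // returns immediately
    writeln("range constructed, nothing read yet");
    foreach (line; lines)          // first .empty call does the first read
        write(line);
}
```

The price is one extra bool check in each of empty, front and popFront,
but the range API contract is kept: .empty is still the first thing to
touch the file, and .front is never valid before it.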

T

-- 
Don't drink and derive. Alcohol and algebra don't mix.