Why does readln include the line terminator?
Andrei Alexandrescu
SeeWebsiteForEmail at erdani.org
Tue Apr 14 18:57:58 PDT 2009
Nick Sabalausky wrote:
> "Georg Wrede" <georg.wrede at iki.fi> wrote in message
> news:gs2o15$233h$2 at digitalmars.com...
>> I can see having to use one or another line ending in the whole output
>> file, but not a situation where some lines and not some other need this or
>> that kind of line ending.
>>
>
> Source code with unescaped nl's/cr's embedded in a string literal? Though I
> admit that may not be a particularly compelling case for at least a couple
> of different reasons. (I do agree with your original point though.)

I think there are a few concerns when designing an API for reading
separated lines.

1. Reasonably complex separators should be allowed, e.g. regexes. For
streams that have lookahead = 1, only regexes without backtracking
(i.e., classic regular expressions) can be allowed.

2. Alternate separators should be allowed, and information should be
passed as to which one, if any, matched:

    readln(stream, '\n', '\r', "Brought to you by Carl's Jr.\n");

You should be able to somehow extract which one of these matched, or
whether the stream ended without having seen any. The match process is
similar to regexes, but the information returned would be difficult to
extract from a regex match.

3. Given (1) and (2), the process of eliminating the matched separator
can become rather involved. So there should be an option to just
eliminate the separator.

4. However, the separator should be made available to the caller. That
makes for programs that preserve the separator, whatever it was.
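
To make points 2 through 4 concrete, here is a minimal sketch of what
such a call could hand back. It is not an existing Phobos API:
LineResult and readLineAny are hypothetical names, the separators are
plain strings rather than regexes, and the "stream" is just an
in-memory string. When several separators match at the same position,
the one listed first wins.

```d
import std.stdio : writeln;

/// Hypothetical result type (not in Phobos): the line without its
/// separator, plus which separator, if any, ended it.
struct LineResult
{
    string line;      // content before the separator
    string separator; // matched separator, or null if input ended first
    ptrdiff_t index;  // position in the argument list, -1 if none matched
}

/// Hypothetical helper: consume one line from the front of `input`,
/// stopping at the earliest occurrence of any of `separators`.
LineResult readLineAny(ref string input, string[] separators...)
{
    foreach (pos; 0 .. input.length)
    {
        foreach (i, sep; separators)
        {
            // Compare code units directly; skip empty separators.
            if (sep.length && pos + sep.length <= input.length
                && input[pos .. pos + sep.length] == sep)
            {
                auto result = LineResult(input[0 .. pos], sep, cast(ptrdiff_t) i);
                input = input[pos + sep.length .. $];
                return result;
            }
        }
    }
    // No separator seen: the rest of the input is the final line.
    auto result = LineResult(input, null, -1);
    input = null;
    return result;
}

void main()
{
    string text = "alpha\nbeta\rgamma";
    while (text.length)
    {
        auto r = readLineAny(text, "\n", "\r");
        writeln(r.line, " (separator index ", r.index, ")");
    }
}
```

The caller gets the stripped line by default (point 3) and can still
write r.line ~ r.separator to reproduce the input exactly (point 4).
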
I plan to implement a little API around these considerations, but
haven't gotten around to it. Particularly the regex thing is rather
thorny because std.regex does not distinguish classic regular
expressions from those needing backtracking, and does not have an
implementation that works with limited-lookahead streams. I suspect
that adding one would be a major effort.
Right now readln preserves the separator. The newer File.byLine
eliminates it by default and offers to keep it by calling
File.byLine(KeepTerminator.yes). The allowed terminators are one
character or a string. See
http://erdani.dreamhosters.com/d/web/phobos/std_stdio.html#byLine
I consider such an API a reasonable start but insufficient; we need to
add to it.
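
For reference, a small sketch of the byLine behavior described above,
assuming a Phobos where File.byLine takes a KeepTerminator flag as its
first argument. The lineLengths helper and the scratch file name are
made up for the example.

```d
import std.file : remove;
import std.stdio : File, KeepTerminator, writeln;

/// Hypothetical helper: length of each line as byLine yields it.
size_t[] lineLengths(string path, KeepTerminator keep)
{
    size_t[] lengths;
    // byLine reuses its buffer between iterations, so copy a line
    // (e.g. with .idup) before storing it; here only lengths are kept.
    foreach (line; File(path).byLine(keep))
        lengths ~= line.length;
    return lengths;
}

void main()
{
    enum path = "byline_demo.txt"; // scratch file, name is arbitrary
    {
        auto f = File(path, "wb");
        f.write("one\ntwo\n");
    } // file is closed when f goes out of scope
    scope (exit) remove(path);

    writeln(lineLengths(path, KeepTerminator.no));  // terminator stripped
    writeln(lineLengths(path, KeepTerminator.yes)); // terminator kept
}
```

With KeepTerminator.yes each reported length is one larger, because the
trailing '\n' stays on the line.
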
Andrei