Why does readln include the line terminator?

Andrei Alexandrescu SeeWebsiteForEmail at erdani.org
Tue Apr 14 18:57:58 PDT 2009


Nick Sabalausky wrote:
> "Georg Wrede" <georg.wrede at iki.fi> wrote in message 
> news:gs2o15$233h$2 at digitalmars.com...
>> I can see having to use one or another line ending in the whole output 
>> file, but not a situation where some lines and not some other need this or 
>> that kind of line ending.
>>
> 
> Source code with unescaped nl's/cr's embedded in a string literal? Though I 
> admit that may not be a particularly compelling case for at least a couple 
> of different reasons. (I do agree with your original point though.) 

I think there are a few concerns when designing an API for reading 
separated lines.

1. Reasonably complex separators should be allowed, e.g. regexes. For 
streams that have lookahead = 1, only regexes without backtracking 
(i.e., classic regular expressions) can be allowed.

2. Alternate separators should be allowed, and information should be 
passed as to which one, if any, matched:

readln(stream, '\n', '\r', "Brought to you by Carl's Jr.\n");

You should be able to somehow extract which one of these matched, or 
whether the stream ended without having seen any. The match process is 
similar to regexes, but the information returned would be difficult to 
extract from a regex match.

3. Given (1) and (2), stripping the matched separator by hand can 
become rather involved. So there should be an option to have the API 
strip the separator itself.

4. However, the separator should be made available to the caller. That 
makes for programs that preserve the separator, whatever it was.
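
Points 2 through 4 can be sketched in a few lines of D. This is only an 
illustration under simplifying assumptions, not a proposed Phobos API: 
the names Record and readRecord are hypothetical, the input is an 
in-memory string rather than a stream, and the separators are plain 
strings rather than regexes. The earliest match in the input wins, and 
ties at the same position go to the separator listed first.

```d
/// One record: the text before the separator, the separator that ended
/// it (empty if the input ran out first), and that separator's index in
/// the argument list (-1 if none matched).
struct Record
{
    string line;
    string separator;
    int which;
}

/// Hypothetical helper, not part of Phobos: split the next record off
/// the front of `input`, trying every separator at every position.
Record readRecord(ref string input, string[] separators...)
{
    foreach (pos; 0 .. input.length)
    {
        foreach (i, sep; separators)
        {
            // Skip empty separators and ones that would run past the end.
            if (sep.length == 0 || input.length - pos < sep.length)
                continue;
            if (input[pos .. pos + sep.length] == sep)
            {
                auto rec = Record(input[0 .. pos], sep, cast(int) i);
                input = input[pos + sep.length .. $];
                return rec;
            }
        }
    }
    // No separator found: the rest of the input is the last record.
    auto rec = Record(input, "", -1);
    input = null;
    return rec;
}

void main()
{
    import std.stdio : writefln;

    string input = "alpha\nbeta\rgamma";
    string rebuilt;
    while (input.length)
    {
        auto rec = readRecord(input, "\n", "\r");
        writefln("line=%s which=%s", rec.line, rec.which);
        // Appending the separator back reproduces the input exactly.
        rebuilt ~= rec.line ~ rec.separator;
    }
    assert(rebuilt == "alpha\nbeta\rgamma");
}
```

Because the record carries the stripped line and the separator 
separately, a caller that wants the separator gone just ignores it, and 
a caller that wants to preserve it concatenates the two.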

I plan to implement a little API around these considerations, but 
haven't gotten around to it. The regex part in particular is rather 
thorny because std.regex does not distinguish classic regular 
expressions from those needing backtracking, and does not have an 
implementation that works with limited-lookahead streams. I suspect 
that would be a major effort.

Right now readln preserves the separator. The newer File.byLine 
eliminates it by default and offers to keep it by calling 
File.byLine(KeepTerminator.yes). The allowed terminators are one 
character or a string. See

http://erdani.dreamhosters.com/d/web/phobos/std_stdio.html#byLine
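
As a rough illustration of the two byLine modes, here is a minimal 
sketch, assuming a current Phobos where KeepTerminator is exported by 
std.stdio. The helper lineLengths and the scratch file name are 
hypothetical and exist only for this demonstration.

```d
import std.file : remove, tempDir, write;
import std.path : buildPath;
import std.stdio : File, KeepTerminator, writeln;

/// Hypothetical helper: the length of each line as byLine yields it.
size_t[] lineLengths(string path, KeepTerminator keep)
{
    size_t[] result;
    foreach (line; File(path).byLine(keep))
        result ~= line.length;
    return result;
}

void main()
{
    // Scratch file used only for this demonstration.
    auto path = buildPath(tempDir, "byline_demo.txt");
    write(path, "one\ntwo\n");
    scope (exit) remove(path);

    // Default behaviour: the '\n' is stripped, so each line has length 3.
    writeln(lineLengths(path, KeepTerminator.no));

    // With KeepTerminator.yes the '\n' stays, so each line has length 4.
    writeln(lineLengths(path, KeepTerminator.yes));
}
```

Note that byLine reuses its buffer between iterations, so a line that 
must outlive the loop body needs to be copied (for example with .idup).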

I consider such an API adequate for simple cases but insufficient in 
general; we need to add to it.


Andrei


