stream.readLine

Frits van Bommel fvbommel at REMwOVExCAPSs.nl
Tue Jan 23 10:04:29 PST 2007


bobef wrote:
> Then it is impossible to use the readLine() function to read non-utf8 streams?

InputStream.readLine (which I presume is the one you mean) returns a 
UTF-8 string. It doesn't specify what encoding the underlying data is 
read in. If someone wants to implement it to read a non-UTF-8 string 
from somewhere, convert it to UTF-8 and return the result, that's a 
perfectly valid implementation.

> If it is so this sucks ass, because I have to read the stream to convert it to utf8, because obviously I can't force any stream out there to be utf8 just because D likes it :)

A conversion stream may not be so hard to implement. Just create an 
object implementing InputStream and pass another InputStream to its 
constructor. Or you can even derive it directly from std.stream.File, 
forward the constructors, and only override the readLine* functions.
Then if you're reading a file formatted in some ASCII + extended 
codepage format, you just need a lookup table (or conversion function) 
to convert the upper 128 byte values (0x80-0xFF) to the corresponding 
Unicode code points and use std.utf.encode. For Latin-1 data it's even 
simpler: its byte values are identical to the first 256 Unicode code 
points, so you can pass each byte straight to std.utf.encode. You'll 
probably want to use the read(inout ubyte) method to read such a file.
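
To make the conversion step concrete, here is a minimal sketch of just 
the byte-to-UTF-8 part, with the stream plumbing left out. It assumes 
the std.utf.encode overload that appends to a char[], as found in 
current Phobos. The names latin1ToUtf8, codepageToUtf8 and the `upper` 
table parameter are made up for illustration and are not part of any 
library.

```d
import std.utf : encode;

// Hypothetical helper: Latin-1 byte values are identical to the first
// 256 Unicode code points, so each byte can be encoded directly.
char[] latin1ToUtf8(const(ubyte)[] data)
{
    char[] result;
    foreach (b; data)
        encode(result, cast(dchar) b); // appends the UTF-8 form of the code point
    return result;
}

// Hypothetical helper for an "ASCII + extended codepage" format.
// `upper` must hold the code points for byte values 0x80 .. 0xFF;
// the caller supplies it for whichever codepage is being read.
char[] codepageToUtf8(const(ubyte)[] data, const(dchar)[] upper)
{
    assert(upper.length == 128, "table must cover 0x80 .. 0xFF");
    char[] result;
    foreach (b; data)
    {
        // ASCII passes through unchanged; the upper half goes via the table.
        dchar c = b < 0x80 ? cast(dchar) b : upper[b - 0x80];
        encode(result, c);
    }
    return result;
}
```

An overridden readLine could then collect the raw bytes of one line 
and hand them to one of these functions before returning the result.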

The process for other text formats is probably similar, perhaps using 
other read() overloads to read it (for multi-byte encodings).


(Warning: I've never actually implemented a Stream, so the above may 
well be riddled with errors and misinformation :) )

