UTF-8 problems

Deewiant deewiant.doesnotlike.spam at gmail.com
Mon Jun 12 07:24:54 PDT 2006


import std.stream, std.cstream;

// åäöΔ

void main() {
	Stream file = new File(__FILE__, FileMode.In);
	// alternatively:
	//Stream file = din;

	while (!file.eof)
		dout.writef("%s", file.getc);
}
--

With the above code, saved as UTF-8, I expect the program to output its own
source, also in UTF-8. However, I get ASCII output, and at line three
everyone's favourite "Error: 4invalid UTF-8 sequence" appears.

Furthermore, unless I use the "alternative" where std.cstream.din is used, the
two line breaks after "std.cstream;" are not \r\n, as they should be with the
DOS line endings I use, but \r\r\n. Converting the line breaks in the source to
just \n causes them to become \r\n in the output. Whence the extra \r?
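
One possible reading of the line-break symptom, offered as an assumption and
not a confirmed diagnosis: if dout writes through a C stdout that is in text
mode, every \n written is expanded to \r\n, while a File opened on disk hands
back the bytes unmodified. The sketch below only simulates that expansion;
simulateTextModeWrite is a hypothetical helper, not part of Phobos, and it is
written against current D, not the std.stream API used above.

```d
import std.array : replace;
import std.stdio : writeln;

/// Hypothetical helper: mimics a text-mode output stream that expands
/// every "\n" to "\r\n" when writing. It does not touch any real stream.
string simulateTextModeWrite(string data)
{
    return data.replace("\n", "\r\n");
}

void main()
{
    // Bytes as read unmodified from a file with DOS line endings.
    string rawDos = "a\r\nb";
    // The "\r" already present is kept and a second one is added.
    assert(simulateTextModeWrite(rawDos) == "a\r\r\nb");

    // Bytes as read from the same file after converting it to "\n" only.
    string rawUnix = "a\nb";
    assert(simulateTextModeWrite(rawUnix) == "a\r\nb");

    writeln("simulation matches the observed line breaks");
}
```

Under that assumption both observations above fit: \r\n in the source comes
out as \r\r\n, and \n alone comes out as \r\n.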

What's strange is that if I use, e.g., readLine instead of getc, everything is
fine. Since readLine seems to use getc internally, I'm having trouble
understanding why this is the case.
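
A minimal sketch of why the two approaches might differ, assuming getc returns
one UTF-8 code unit (one byte) at a time: a single byte taken out of a
multi-byte sequence is not valid UTF-8 on its own, whereas the same bytes
collected into one buffer are. The example uses std.utf.validate from current
Phobos, not the std.stream API in the program above, and isValidUtf8 is a
helper name introduced here for illustration.

```d
import std.stdio : writeln;
import std.utf : validate, UTFException;

/// Illustrative helper: true if `s` is well-formed UTF-8.
bool isValidUtf8(const(char)[] s)
{
    try
    {
        validate(s); // throws UTFException on malformed input
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}

void main()
{
    // U+00E5 (a with ring above) is encoded in UTF-8 as the bytes 0xC3 0xA5.
    string text = "\u00E5";

    // One code unit at a time, roughly what a byte-wise read yields.
    foreach (i; 0 .. text.length)
        writeln(isValidUtf8(text[i .. i + 1])); // false for both bytes

    // Collect the code units first, as a line-based reader would.
    char[] buffer;
    foreach (char c; text)
        buffer ~= c;
    writeln(isValidUtf8(buffer)); // true
}
```

If formatting a lone char with "%s" validates it as UTF-8, this would account
for the error with getc and its absence with readLine, which hands over whole
lines.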

A bug or two, or where am I going wrong?


