Improving D's support of code-pages

Kirk McDonald kirklin.mcdonald at gmail.com
Sat Aug 18 23:43:00 PDT 2007


Walter Bright wrote:
> Kirk McDonald wrote:
> 
>> Pardon? I haven't said anything about stdio behaving differently 
>> whether it's printing to the console or not. writefln() would /always/ 
>> attempt to encode in the console's encoding.
> 
> 
> Ok, I misunderstood.
> 
> Now, what if stdout is reopened to be a file?

I've been thinking about these issues more carefully. They are harder 
than I initially thought. :-)

Ignoring my ideas of implicitly encoding writefln's output, I regard the 
encode/decode functions as vital. These alone would improve the current 
situation immensely.

Having writef() print ubyte[] arrays as the "raw bytes" they contain is 
basically nonsense, because doFormat itself is Unicode-aware and emits 
characters rather than bytes. I should have realized this sooner. 
However, you can still write raw bytes with dout.write(). This should 
be adequate.

Here is another proposal regarding implicit encoding, slightly modified 
from my first one:

The Stream class should be modified to have an encoding attribute. This 
should usually be null. If it is present, output should be encoded into 
that encoding. (To facilitate this, the encoding module should provide a 
doEncode function, analogous to the doFormat function, which has a void 
delegate(ubyte) or possibly a void delegate(ubyte[]) callback.)
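
As a rough sketch of the doEncode idea, assuming Latin-1 is the only 
supported target: nothing here is an existing Phobos API. The Sink alias, 
the doEncode signature, and the '?' substitution for unencodable 
characters are all hypothetical, chosen only to show the callback shape.

```d
// Hypothetical callback type: receives each encoded byte in turn.
alias Sink = void delegate(ubyte);

// Hypothetical doEncode, analogous in spirit to doFormat: it decodes the
// UTF-8 input one code point at a time and passes the encoded byte to sink.
// Only "latin-1" is handled in this sketch; code points above U+00FF
// cannot be represented and are replaced with '?'.
void doEncode(string text, string encoding, scope Sink sink)
{
    assert(encoding == "latin-1", "this sketch only knows latin-1");
    foreach (dchar c; text) // iterating a string as dchar decodes UTF-8
    {
        sink(c <= 0xFF ? cast(ubyte) c : cast(ubyte) '?');
    }
}

void main()
{
    ubyte[] buf;
    doEncode("caf\u00E9", "latin-1", (ubyte b) { buf ~= b; });

    // 'c', 'a', 'f', then U+00E9 as the single Latin-1 byte 0xE9.
    ubyte[] expected = [0x63, 0x61, 0x66, 0xE9];
    assert(buf == expected);
}
```

A void delegate(ubyte[]) sink would look much the same, except that 
doEncode would fill a buffer and hand over a slice, trading one call per 
byte for one call per chunk.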

Next, std.stdio.writef should be modified to write to the object 
referenced by std.cstream.dout, instead of the FILE* stdout. The next 
step is obvious: std.cstream.dout's encoding attribute should be set to 
the console's encoding. Finally, though dout should obviously remain a 
CFile instance, it should be stored in a Stream reference.

If another Stream object is substituted for dout, then the behavior of 
writefln (and anything else relying on dout) would be redirected. 
Whether the output is still implicitly encoded would depend entirely on 
this new object's encoding attribute.
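
To make the redirection concrete, here is a minimal sketch. The Stream, 
MemoryStream, dout and writeLine below are simplified, hypothetical 
stand-ins, not the real std.stream or std.cstream classes, and only 
"latin-1" is understood as an encoding.

```d
// Hypothetical stand-in for a Stream with an encoding attribute.
abstract class Stream
{
    string encoding; // null means: pass the UTF-8 bytes through unchanged

    abstract void put(ubyte b);

    void writeString(string text)
    {
        if (encoding is null)
        {
            foreach (char ch; text) // raw UTF-8 code units
                put(cast(ubyte) ch);
        }
        else
        {
            assert(encoding == "latin-1", "this sketch only knows latin-1");
            foreach (dchar c; text) // decoded code points
                put(c <= 0xFF ? cast(ubyte) c : cast(ubyte) '?');
        }
    }
}

// Collects output in memory so the result can be inspected.
class MemoryStream : Stream
{
    ubyte[] data;
    override void put(ubyte b) { data ~= b; }
}

// Stored in a Stream reference, so any Stream can be substituted.
Stream dout;

// Stand-in for writefln: always goes through whatever dout refers to.
void writeLine(string text)
{
    dout.writeString(text);
    dout.put(0x0A);
}

void main()
{
    auto console = new MemoryStream;
    console.encoding = "latin-1"; // pretend this is the console's encoding
    dout = console;
    writeLine("\u00E9");
    ubyte[] expectedLatin1 = [0xE9, 0x0A];
    assert(console.data == expectedLatin1);

    auto file = new MemoryStream; // encoding left null
    dout = file;                  // redirect all later output
    writeLine("\u00E9");
    ubyte[] expectedUtf8 = [0xC3, 0xA9, 0x0A];
    assert(file.data == expectedUtf8);
}
```

The caller of writeLine never mentions an encoding. Whether the output is 
transcoded depends only on the object dout currently refers to.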

It occurs to me that this could be somewhat slow. Examination of the 
source reveals that every printed character from dout is the result of a 
virtual method call. However, I do wonder how important the performance 
of printing to the console really is.

Thoughts? Is this a thoroughly stupid idea?

-- 
Kirk McDonald
http://kirkmcdonald.blogspot.com
Pyd: Connecting D and Python
http://pyd.dsource.org
