UTF-8 problems

Deewiant deewiant.doesnotlike.spam at gmail.com
Mon Jun 12 11:04:15 PDT 2006


Carlos Santander wrote:
> Deewiant wrote:
>> So, for instance, "c3 a4" is the UTF-8 equivalent of U+00E4, "ä". How do I 
>> combine the former two into a single "char"?
>> 
>> Say I check if the char received from getc() is greater than 127 (outside
>> ASCII) and if it is, I store it and the following char in two ubytes. Now 
>> what? How do I get a char?
> 
> Keep using readLine. The entire line should be made of valid UTF-8 characters.

That would work, but I was originally using only getc() so it's easier for me to
replace that than to change half of my input paradigm. <g>
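
For the two stored bytes, the combining step might look roughly like the sketch below. It assumes a current Phobos in which std.utf.decode takes a string and an index by reference; sequenceLength and combine are hypothetical helper names, not anything in the library. Note that a D char is a single UTF-8 code unit, so U+00E4 cannot fit in one char: the combined value has to be a dchar (or a wchar).

```d
import std.utf : decode;

// Hypothetical helper: number of bytes in a UTF-8 sequence, judged from
// its lead byte. Returns 0 for a byte that cannot start a sequence.
size_t sequenceLength(ubyte lead)
{
    if (lead < 0x80) return 1;            // 0xxxxxxx: plain ASCII
    if ((lead & 0xE0) == 0xC0) return 2;  // 110xxxxx
    if ((lead & 0xF0) == 0xE0) return 3;  // 1110xxxx
    if ((lead & 0xF8) == 0xF0) return 4;  // 11110xxx
    return 0;                             // continuation byte or invalid lead
}

// Hypothetical helper: combine the stored bytes into one code point.
dchar combine(const(ubyte)[] bytes)
{
    auto units = cast(const(char)[]) bytes; // reinterpret as UTF-8 code units
    size_t index = 0;
    return decode(units, index);            // throws UTFException if malformed
}

void main()
{
    ubyte[] stored = [0xC3, 0xA4];          // the "c3 a4" pair from above
    assert(sequenceLength(stored[0]) == 2);
    assert(combine(stored) == '\u00E4');
}
```

The lead byte says how many further bytes to read with getc() before calling combine, which keeps the change local to the place where input is read.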

> One way to address this would be to add getUTF8char, getUTF16char and
> getUTF32char, returning char[], wchar[] and dchar respectively: the first an
> array of 1 to 4 elements, the second an array of 1 or 2.
> 

Something like that would indeed be handy. It's too bad std.stream is lacking in
some respects, such as this.
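
As a rough idea of what the proposed getUTF8char could look like, here is a sketch under stated assumptions: it is not part of std.stream, the nextByte delegate is a hypothetical stand-in for a getc()-style call that yields one byte at a time, and std.utf.validate is used as it exists in current Phobos.

```d
import std.utf : validate;

// Hypothetical sketch of the proposed getUTF8char: reads one complete
// UTF-8 sequence and returns it as an array of 1 to 4 chars.
char[] getUTF8char(scope ubyte delegate() nextByte)
{
    immutable ubyte lead = nextByte();
    size_t length;
    if (lead < 0x80) length = 1;                 // ASCII
    else if ((lead & 0xE0) == 0xC0) length = 2;  // 110xxxxx
    else if ((lead & 0xF0) == 0xE0) length = 3;  // 1110xxxx
    else if ((lead & 0xF8) == 0xF0) length = 4;  // 11110xxx
    else throw new Exception("invalid UTF-8 lead byte");

    auto result = new char[length];
    result[0] = cast(char) lead;
    foreach (i; 1 .. length)
        result[i] = cast(char) nextByte();       // continuation bytes
    validate(result);                            // throws UTFException if malformed
    return result;
}

void main()
{
    ubyte[] input = [0x61, 0xC3, 0xA4];          // "a" followed by U+00E4
    size_t pos = 0;
    ubyte next() { return input[pos++]; }        // stand-in for getc()
    assert(getUTF8char(&next) == "a");
    assert(getUTF8char(&next) == "\u00E4");
}
```

Returning char[] rather than a single char follows the proposal quoted above, since one code point can span up to four UTF-8 code units.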


