std.string and unicode

Stewart Gordon smjg_1998 at yahoo.com
Sat Dec 16 17:29:13 PST 2006


Jari-Matti Mäkelä wrote:
> Frits van Bommel wrote:
>> Todor Totev wrote:
>>> Hello all,
>>> are std.string functions supposed to be UNICODE aware?
>> Yes.
> 
> No they are not.

Whether they're _supposed_ to be Unicode aware, and whether they 
actually _are_ Unicode aware, are two very different matters.

> The string constants (lowercase, letters, etc.) only
> feature ASCII characters. Some functions have "BUG: only works with
> ASCII" attached to them. Many of the functions expect that the char[]
> string consists of 8 bit characters.
<snip>

maketrans expects the char[] string to consist of 7-bit characters, 
i.e. plain ASCII.  Which functions expect it to consist of 8-bit 
characters?
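
To make the 7-bit point concrete, here is a minimal sketch.  It is written 
against a current D compiler rather than the 2006 Phobos, and it is not 
taken from std.string; the helper name countCodePoints is hypothetical.  
It shows that a non-ASCII character occupies more than one element of a 
char[], each with the high bit set, so a table indexed by a single char 
value cannot address it.

```d
import std.stdio;

// Hypothetical helper: count code points by decoding the UTF-8 string.
size_t countCodePoints(string s)
{
    size_t n;
    foreach (dchar c; s)   // foreach over dchar decodes UTF-8 on the fly
        ++n;
    return n;
}

void main()
{
    string name = "M\u00E4kel\u00E4";   // "Mäkelä", with ä written as an escape

    writeln(name.length);               // number of UTF-8 code units
    writeln(countCodePoints(name));     // number of code points

    foreach (char u; name)              // iterate over the raw code units
        if (u > 0x7F)
            writefln("non-ASCII code unit: 0x%02X", cast(uint) u);
}
```

For this name the length is 8 while the code point count is 6, since each 
ä takes two code units.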

But indeed, somebody needs to define a Unicode translation table format 
that isn't going to take up 4MB or so.  Why was the current translation 
table format adopted in the first place, considering that:
- it obviously won't work with Unicode
- a dchar[dchar] is an intuitive way to do it
?
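
For scale: a flat table with one dchar per code point up to U+10FFFF would 
need 0x110000 * 4 = 4,456,448 bytes, hence the "4MB or so".  A rough sketch 
of the dchar[dchar] alternative follows, assuming a current D compiler.  
The function name translateSparse is hypothetical and is not a std.string 
function; the table stores only the characters that actually change, and 
anything unmapped passes through untouched.

```d
import std.stdio;

// Hypothetical helper, not part of std.string: translate s one code point
// at a time using a sparse table.  Unmapped characters are copied as-is.
string translateSparse(string s, dchar[dchar] table)
{
    string result;
    foreach (dchar c; s)            // decodes UTF-8 into code points
    {
        if (auto p = c in table)    // 'in' yields a pointer to the value, or null
            result ~= *p;           // appending a dchar re-encodes it as UTF-8
        else
            result ~= c;
    }
    return result;
}

void main()
{
    dchar[dchar] table;
    table['\u00E4'] = 'a';          // ä -> a
    table['k'] = 'K';               // k -> K

    writeln(translateSparse("M\u00E4kel\u00E4", table));
}
```

The table's memory use grows with the number of mappings rather than with 
the size of the Unicode code space, at the cost of a hash lookup per 
character.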

Stewart.



More information about the Digitalmars-d mailing list