string is rarely useful as a function argument
Piotr Szturmaj
bncrbme at jadamspam.pl
Sat Dec 31 07:10:17 PST 2011
Timon Gehr wrote:
> Me too. I think the way we have it now is optimal. The only reason we
> are discussing this is because of fear that uneducated users will write
> code that does not take into account Unicode characters above code point
> 0x80.
+1
From D's string docs:
"char[] strings are in UTF-8 format. wchar[] strings are in UTF-16
format. dchar[] strings are in UTF-32 format."
I would additionally add some clarifications:
char[] is an array of 8-bit code units. A Unicode code point may take up
to 4 chars.
wchar[] is an array of 16-bit code units. A Unicode code point may take up
to 2 wchars.
dchar[] is an array of 32-bit code units. A Unicode code point always fits
into one dchar.
Each of these formats may encode any Unicode string.
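To make the code unit counts concrete, here is a rough sketch. The sample text is my own pick (U+0041, U+00F3 and U+1D11E), not something from the docs, and I am assuming a current compiler with Phobos:

```d
import std.range : walkLength;

void main()
{
    // The same three code points in each format:
    // U+0041 (ASCII), U+00F3 (BMP, non-ASCII), U+1D11E (outside the BMP).
    string  s = "A\u00F3\U0001D11E";
    wstring w = "A\u00F3\U0001D11E"w;
    dstring d = "A\u00F3\U0001D11E"d;

    // .length counts code units, not code points.
    assert(s.length == 7); // UTF-8:  1 + 2 + 4 chars
    assert(w.length == 4); // UTF-16: 1 + 1 + 2 wchars
    assert(d.length == 3); // UTF-32: 1 + 1 + 1 dchars

    // walkLength decodes while iterating, so it counts code points.
    assert(s.walkLength == 3);
    assert(w.walkLength == 3);
}
```

All three variables hold the same text; only the number of array elements differs.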
If you need indexing or slicing, use:
* char[] or string when working only with ASCII code points.
* wchar[] or wstring when working only with Basic Multilingual Plane (BMP)
code points.
* dchar[] or dstring when working with all possible code points.
If you do not need indexing or slicing, you may use any of the formats.
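A small sketch of why the choice matters for indexing. Again, the sample string is only my illustration, and the conversion through std.conv.to is one possible way to get a dstring, not the only one:

```d
import std.conv : to;

void main()
{
    // U+00F3 followed by 'A'. U+00F3 is encoded as the two
    // UTF-8 code units 0xC3 0xB3.
    string s = "\u00F3A";
    dstring d = s.to!dstring;

    // Indexing a string yields a code unit, here only the first
    // byte of the two-byte sequence.
    assert(s[0] == 0xC3);
    assert(s.length == 3);

    // Indexing a dstring yields a whole code point.
    assert(d[0] == '\u00F3');
    assert(d[1] == 'A');

    // Without indexing, any format works: foreach with a dchar
    // loop variable decodes code points on the fly.
    size_t n;
    foreach (dchar c; s)
        ++n;
    assert(n == 2);
}
```

So s[1] is not 'A' here; it is the second byte of the first code point.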