Wide characters support in D
Nick Sabalausky
a at a.a
Tue Jun 8 01:09:26 PDT 2010
"Ruslan Nikolaev" <nruslan_devel at yahoo.com> wrote in message
news:mailman.128.1275979841.24349.digitalmars-d at puremagic.com...
>>
>> Secondly, Java and Windows adopted 16-bit encodings back when many people
>> were still under the mistaken impression that would allow them to hold any
>> character in one code-unit. If that had been true, then it
>
> I doubt that it was the only reason. UTF-8 was already available before
> Windows NT was released. It would have been much easier to use UTF-8 instead
> of ANSI than to create a parallel API. Nonetheless, UTF-16 was chosen.
>
I didn't say that was the only reason. Also, you've misunderstood my point:

Their reasoning at the time:
    8-bit: Multiple code-units for some characters
    16-bit: One code-unit per character
    Therefore, use 16-bit.

Reality:
    8-bit: Multiple code-units for some characters
    16-bit: Multiple code-units for some characters
    Therefore, old reasoning not necessarily still applicable.
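To make the code-unit comparison concrete, here is a minimal sketch that counts how many code units a single character needs in each encoding form. It is written in Python purely for brevity; `code_units` is a hypothetical helper made up for this illustration, and the sample characters are arbitrary picks.

```python
# Count the code units one character occupies in each Unicode encoding form.
# code_units() is a hypothetical helper for illustration only.
def code_units(ch):
    return {
        "UTF-8": len(ch.encode("utf-8")),            # 8-bit code units
        "UTF-16": len(ch.encode("utf-16-le")) // 2,  # 16-bit code units
        "UTF-32": len(ch.encode("utf-32-le")) // 4,  # 32-bit code units
    }

# ASCII letter, accented Latin letter, euro sign, and a character outside
# the Basic Multilingual Plane (U+1D11E, MUSICAL SYMBOL G CLEF).
for ch in ("A", "\u00e9", "\u20ac", "\U0001D11E"):
    print("U+%04X" % ord(ch), code_units(ch))
```

The last sample is the case that matters: it takes four code units in UTF-8 and two in UTF-16 (a surrogate pair), so neither form gives one code unit per character.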
> In addition, C# had already been released when UTF-16 became variable
> length.
Right, like I said, C#/.NET use UTF-16 because that's what MS had already
standardized on.
> I doubt that conversion overhead (which is small compared to VM) was the
> main reason to preserve UTF-16.
I never said anything about conversion overhead being a reason to preserve
UTF-16.
>
> Concerning why I say that it's good to have conversion to UTF-32 (you
> asked somewhere):
>
> I think you did not understand correctly what I meant. It is a very common
> practice, and in fact required, to convert from both UTF-8 and UTF-16 to
> UTF-32 when you need to do character analysis (e.g. mbtowc() in C). In
> fact, it is the only place where UTF-32 is commonly used and useful.
>
I'm well aware why UTF-32 is useful. Earlier, you had started out saying
that there should only be one string type, the OS-native type. Now you're
changing your tune and saying that we do need multiple types.
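
For reference, the conversion the quoted text describes (decode UTF-8 or UTF-16 to whole code points, then analyze each one) can be sketched roughly as follows. This is an illustration in Python, not D or C; `code_points` and `categories` are hypothetical helper names, and the general-category lookup merely stands in for whatever character analysis is actually needed.

```python
import unicodedata

# Decode encoded bytes into a list of code points (i.e. UTF-32 values).
# Both helpers are hypothetical names used only for this sketch.
def code_points(data, encoding):
    return [ord(ch) for ch in data.decode(encoding)]

# Per-character analysis works on whole code points, never on code units.
def categories(data, encoding):
    return [unicodedata.category(chr(cp)) for cp in code_points(data, encoding)]

text = "a\u0416\U0001D11E"  # Latin a, Cyrillic Zhe, musical G clef
utf8 = text.encode("utf-8")
utf16 = text.encode("utf-16-le")

# The same three code points come back regardless of the source encoding.
print([hex(cp) for cp in code_points(utf8, "utf-8")])
print([hex(cp) for cp in code_points(utf16, "utf-16-le")])
print(categories(utf8, "utf-8"))
```

The encoded inputs differ in length (7 bytes of UTF-8 versus 4 UTF-16 code units), but the analysis step sees the same three code points either way.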