First Impressions!
Joakim
dlang at joakim.fea.st
Thu Nov 30 10:39:19 UTC 2017
On Thursday, 30 November 2017 at 10:19:18 UTC, Walter Bright wrote:
> On 11/27/2017 7:01 PM, A Guy With an Opinion wrote:
>> +- Unicode support is good. Although I think D's string type
>> should have probably been utf16 by default. Especially
>> considering the utf module states:
>>
>> "UTF character support is restricted to '\u0000' <= character
>> <= '\U0010FFFF'."
>>
>> Seems like the natural fit for me. Plus for the vast majority
>> of use cases I am pretty guaranteed a char = codepoint. Not
>> the biggest issue in the world and maybe I'm just being overly
>> critical here.
>
> Sooner or later your code will exhibit bugs if it assumes that
> char==codepoint with UTF16, because of surrogate pairs.
>
> https://stackoverflow.com/questions/5903008/what-is-a-surrogate-pair-in-java
>
> As far as I can tell, pretty much the only users of UTF16 are
> Windows programs. Everyone else uses UTF8 or UTF32.
>
> I recommend using UTF8.

Java, .NET, Qt, JavaScript, and a handful of others use UTF-16
too, some having started off with the earlier UCS-2:

https://en.m.wikipedia.org/wiki/UTF-16#Usage

I'm not saying either is better, as each has its flaws; just
pointing out that it's more than just Windows.
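
For anyone who wants to see the surrogate-pair problem quoted above in
practice, here is a minimal sketch in Java, one of the UTF-16 platforms
listed. The class name SurrogateDemo and its two helper methods are made
up for this illustration; only String.length, String.codePointCount,
String.charAt and Character.isHighSurrogate are standard library calls.
It shows a single code point outside the Basic Multilingual Plane taking
two UTF-16 code units, so char == codepoint does not hold:

```java
public class SurrogateDemo {
    // Number of UTF-16 code units (Java chars) in the string.
    static int codeUnits(String s) {
        return s.length();
    }

    // Number of Unicode code points in the string.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // U+1D11E MUSICAL SYMBOL G CLEF, written as its surrogate pair.
        String clef = "\uD834\uDD1E";

        System.out.println(codeUnits(clef));   // prints 2
        System.out.println(codePoints(clef));  // prints 1

        // The first char alone is only half of the character.
        System.out.println(Character.isHighSurrogate(clef.charAt(0))); // prints true
    }
}
```

Code that indexes or slices by char will split such a character in
half, which is the kind of bug that tends to show up late because most
everyday text stays inside the BMP.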