First Impressions!
Walter Bright
newshound2 at digitalmars.com
Thu Nov 30 10:19:18 UTC 2017
On 11/27/2017 7:01 PM, A Guy With an Opinion wrote:
> +- Unicode support is good. Although I think D's string type should have
> probably been utf16 by default. Especially considering the utf module states:
>
> "UTF character support is restricted to '\u0000' <= character <= '\U0010FFFF'."
>
> Seems like the natural fit for me. Plus for the vast majority of use cases I am
> pretty guaranteed a char = codepoint. Not the biggest issue in the world and
> maybe I'm just being overly critical here.
Sooner or later your code will exhibit bugs if it assumes that char==codepoint
with UTF-16, because any code point above U+FFFF is encoded as a surrogate pair,
i.e. two 16-bit code units.
https://stackoverflow.com/questions/5903008/what-is-a-surrogate-pair-in-java
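
To make the surrogate pair point concrete, here is a minimal sketch in Python (chosen only because it makes the code units easy to inspect; it is not D code). The sample string and variable names are my own illustration. It counts how many code units the same two-character text needs in each encoding:

```python
# U+1F600 lies outside the Basic Multilingual Plane (above U+FFFF).
text = "a\U0001F600"

# Python's str is indexed by code point, so this is the "real" character count.
code_points = len(text)

# Number of 16-bit code units when the text is stored as UTF-16.
utf16_units = len(text.encode("utf-16-le")) // 2

# Number of 8-bit code units when the text is stored as UTF-8.
utf8_units = len(text.encode("utf-8"))

# The non-BMP character alone becomes a high/low surrogate pair in UTF-16.
pair = "\U0001F600".encode("utf-16-le")
high = int.from_bytes(pair[0:2], "little")
low = int.from_bytes(pair[2:4], "little")

print(code_points, utf16_units, utf8_units)  # 2 3 5
print(hex(high), hex(low))                   # 0xd83d 0xde00
```

Code that treats each 16-bit unit as one character would report three characters here instead of two, and could split the string between the two surrogates.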
As far as I can tell, pretty much the only users of UTF-16 are Windows programs.
Everyone else uses UTF-8 or UTF-32.
I recommend using UTF-8.