First Impressions!

Walter Bright newshound2 at digitalmars.com
Thu Nov 30 10:19:18 UTC 2017


On 11/27/2017 7:01 PM, A Guy With an Opinion wrote:
> +- Unicode support is good. Although I think D's string type should have 
> probably been utf16 by default. Especially considering the utf module states:
> 
> "UTF character support is restricted to '\u0000' <= character <= '\U0010FFFF'."
> 
> Seems like the natural fit for me. Plus for the vast majority of use cases I am 
> pretty guaranteed a char = codepoint. Not the biggest issue in the world and 
> maybe I'm just being overly critical here.

Sooner or later your code will exhibit bugs if it assumes that char==codepoint 
with UTF16, because of surrogate pairs: any code point above U+FFFF takes two 
16-bit code units.

https://stackoverflow.com/questions/5903008/what-is-a-surrogate-pair-in-java
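
A minimal sketch of the problem in D, assuming a single test character 
(U+1D11E, MUSICAL SYMBOL G CLEF) that lies outside the Basic Multilingual 
Plane. The helper name codePoints is hypothetical and exists only for this 
illustration; it relies on Phobos decoding narrow strings when they are walked 
as ranges.

```d
import std.range : walkLength;
import std.stdio : writeln;

// Hypothetical helper for illustration: counts code points by decoding
// the string as a range rather than counting code units.
size_t codePoints(S)(S s)
{
    return s.walkLength;
}

void main()
{
    // The same single code point in each of D's three string types.
    string  s8  = "\U0001D11E";   // UTF-8
    wstring s16 = "\U0001D11E"w;  // UTF-16
    dstring s32 = "\U0001D11E"d;  // UTF-32

    // .length counts code units, not code points.
    writeln(s8.length);   // 4
    writeln(s16.length);  // 2 (a surrogate pair)
    writeln(s32.length);  // 1

    // Decoding gives one code point regardless of encoding.
    writeln(codePoints(s8));   // 1
    writeln(codePoints(s16));  // 1
    writeln(codePoints(s32));  // 1
}
```

Code that indexes or slices the wstring by code unit would split the pair 
here, which is the kind of bug that tends to surface only once such 
characters show up in real input.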

As far as I can tell, pretty much the only users of UTF16 are Windows programs. 
Everyone else uses UTF8 or UTF32.

I recommend using UTF8.
