The D Programming Language Vision Document

Mon Jul 4 07:39:46 UTC 2022

On Sunday, 3 July 2022 at 21:06:40 UTC, rikki cattermole wrote:
> We have a perfectly good Unicode handling library already.
>
> (Okay, little out of date and doesn't handle Turkic stuff, but 
> fixable).
>
> The standard one is called ICU.

Yes, that is a common one that is maintained, but maybe there are 
Boost-licensed implementations too? One can run an exhaustive test 
of, say, two-character normalization against ICU to check whether 
they are compliant.

Anyway, normalization should not happen behind your back in a 
system-level language. You might want to treat different 
encodings of the same string differently when comparing.
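
To make that concrete, here is a minimal sketch using `std.uni.normalize` from Phobos. The helper names `sameCodeUnits` and `sameAfterNFC` are made up for illustration; the point is only that the caller chooses which comparison to use, and nothing normalizes implicitly.

```d
import std.uni : normalize, NFC;

// Hypothetical helper: plain comparison, code unit by code unit.
bool sameCodeUnits(string a, string b)
{
    return a == b;
}

// Hypothetical helper: comparison after explicit NFC normalization.
bool sameAfterNFC(string a, string b)
{
    return normalize!NFC(a) == normalize!NFC(b);
}

void main()
{
    string precomposed = "\u00E9";  // U+00E9, two UTF-8 code units
    string decomposed = "e\u0301";  // U+0065 U+0301, three UTF-8 code units

    // Different encodings of the same text stay distinguishable...
    assert(!sameCodeUnits(precomposed, decomposed));
    // ...unless the caller explicitly asks for normalization.
    assert(sameAfterNFC(precomposed, decomposed));
}
```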

> Anyway, we are straying from my original point, that limiting 
> ourselves to the string alias and not supporting wstring or 
> dstring in Phobos is going to bite us.

I guess some Windows programmers want 16-bit… but I don't think 
the conversion matters all that much in that context?
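
As a rough sketch of converting only at the boundary, assuming the program keeps `string` (UTF-8) internally and needs UTF-16 only when talking to a 16-bit API: `std.utf.toUTF16` and `std.utf.toUTF8` already exist in Phobos, and the literal below is just an example value.

```d
import std.utf : toUTF16, toUTF8;

void main()
{
    string internal = "h\u00E9llo";       // UTF-8 inside the program

    // Convert once, at the point where a 16-bit API needs it.
    wstring wide = internal.toUTF16;
    assert(wide.length == 5);             // 5 UTF-16 code units
    assert(internal.length == 6);         // 6 UTF-8 code units

    // Convert back when data returns from the 16-bit side.
    string roundTrip = wide.toUTF8;
    assert(roundTrip == internal);
}
```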

> There better be a good reason for this that isn't just removing 
> templates.

The good reason would be that you can focus on fast SIMD-optimized 
algorithms that make sense for the byte encoding of UTF-8, and get 
something competitive.
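
An example of the kind of byte-level algorithm meant here, as a sketch only: counting code points in UTF-8 without decoding, by counting every code unit that is not a continuation byte (`10xxxxxx`). The function name `countCodePoints` is made up, the input is assumed to be valid UTF-8, and this is a plain scalar loop; whether it gets vectorized depends on the compiler and flags, and a hand-written SIMD version would apply the same test to many bytes at once.

```d
// Hypothetical helper: counts code points in valid UTF-8 input.
size_t countCodePoints(const(char)[] s) pure nothrow @nogc @safe
{
    size_t n = 0;
    // Iterating with `char` visits code units; no decoding takes place.
    foreach (char c; s)
    {
        // Continuation bytes have the bit pattern 10xxxxxx.
        // Every other byte starts a new code point.
        if ((c & 0xC0) != 0x80)
            ++n;
    }
    return n;
}

void main()
{
    assert(countCodePoints("hello") == 5);
    assert(countCodePoints("h\u00E9llo") == 5);  // 6 code units, 5 code points
    assert(countCodePoints("") == 0);
}
```

Because the loop only inspects individual bytes, it relies on properties specific to UTF-8, which is the sense in which fixing the encoding lets the implementation specialize.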