UTF-8 Everywhere
Charles Hixson via Digitalmars-d
digitalmars-d at puremagic.com
Sun Jun 19 23:36:09 PDT 2016
To me it seems that processing is often more efficient with
UCS-4 (what I call UTF-32). Storage is clearly more efficient with
UTF-8, but access to the n-th code point is more direct with UCS-4.
I agree that UTF-8 is generally to be preferred where it can be used
efficiently, but that's not everywhere. The problem is efficient
bidirectional conversion, which D appears to handle fairly well already
with text() and dtext(). (I don't see any utility for UTF-16. To me that
seems like a first attempt that should have been deprecated.)
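
A minimal sketch of the trade-off, assuming text() and dtext() here mean the functions in std.conv. The sample string and the helper codePointAt are hypothetical, made up only for illustration:

```d
import std.conv : dtext, text;

// Hypothetical helper (not part of Phobos): the n-th code point of a
// UTF-8 string, obtained by converting to UTF-32 first.
dchar codePointAt(string utf8, size_t n)
{
    dstring utf32 = dtext(utf8); // one dchar per code point
    return utf32[n];             // constant-time index once converted
}

void main()
{
    string s = "h\u00E9llo"; // UTF-8: U+00E9 occupies two code units
    dstring d = dtext(s);    // UTF-32: every code point is one code unit

    assert(s.length == 6);   // .length counts UTF-8 code units (bytes)
    assert(d.length == 5);   // .length counts code points
    assert(d[1] == '\u00E9'); // direct indexing by code point
    assert(text(d) == s);    // round trip back to UTF-8
    assert(codePointAt(s, 4) == 'o');
}
```

The conversion itself walks the whole string, so this probably only pays off when many indexed accesses follow one conversion. Note too that a code point is not always a user-perceived character.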
On 06/19/2016 05:49 PM, Walter Bright via Digitalmars-d wrote:
> http://utf8everywhere.org/
>
> It has a good explanation of the issues and problems, and how these
> things came to be.
>
> This is pretty much in line with my current (!) opinion on Unicode.
> What it means for us is I don't think it is that important anymore for
> algorithms to support strings of UTF-16 or UCS-4.
>