UTF-8 Everywhere
Walter Bright via Digitalmars-d
digitalmars-d at puremagic.com
Sun Jun 19 23:44:02 PDT 2016
On 6/19/2016 11:36 PM, Charles Hixson via Digitalmars-d wrote:
> To me it seems that a lot of the time processing is more efficient with UCS-4
> (what I call utf-32). Storage is clearly more efficient with utf-8, but access
> is more direct with UCS-4. I agree that utf-8 is generally to be preferred
> where it can be efficiently used, but that's not everywhere. The problem is
> efficient bi-directional conversion...which D appears to handle fairly well
> already with text() and dtext(). (I don't see any utility for utf-16. To me
> that seems like a first attempt that should have been deprecated.)
That seemed to me to be true, too, until I wrote a text processing program using
UCS-4. It was rather slow. Turns out, 4x memory consumption has a huge
performance cost.
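
A rough illustration of the 4x figure, as a minimal sketch and not the program mentioned above. It assumes mostly-ASCII text, where UTF-8 needs one byte per character; `byteSize` is a hypothetical helper introduced only for this example, while `text()` and `dtext()` are the std.conv conversions mentioned in the quoted message.

```d
import std.conv : dtext, text;
import std.stdio : writefln;

/// Hypothetical helper: bytes occupied by the code units of a string
/// (the array payload only, not the slice header).
size_t byteSize(T)(const(T)[] s)
{
    return s.length * T.sizeof;
}

void main()
{
    string utf8 = "UTF-8 Everywhere";   // ASCII only: 1 byte per character
    dstring utf32 = dtext(utf8);        // same text, 4 bytes per code point

    writefln("UTF-8:  %s bytes", byteSize(utf8));
    writefln("UTF-32: %s bytes", byteSize(utf32));

    // Converting back with text() recovers the original UTF-8 string.
    assert(text(utf32) == utf8);
}
```

For text with many non-ASCII characters the ratio is smaller than 4x, since UTF-8 then uses two to four bytes per code point.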