UTF-8 Everywhere

Sun Jun 19 23:44:02 PDT 2016

On 6/19/2016 11:36 PM, Charles Hixson via Digitalmars-d wrote:
> To me it seems that a lot of the time processing is more efficient with UCS-4
> (what I call utf-32).  Storage is clearly more efficient with utf-8, but access
> is more direct with UCS-4.  I agree that utf-8 is generally to be preferred
> where it can be efficiently used, but that's not everywhere.  The problem is
> efficient bi-directional conversion...which D appears to handle fairly well
> already with text() and dtext().  (I don't see any utility for utf-16.  To me
> that seems like a first attempt that should have been deprecated.)

That seemed to me to be true, too, until I wrote a text processing program using 
UCS-4. It was rather slow. Turns out, 4x memory consumption has a huge 
performance cost.
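
To put rough numbers on that, here is a minimal sketch using the text() and dtext() functions from std.conv mentioned above. The helper names utf8Bytes and utf32Bytes and the sample strings are made up for illustration, and this only measures storage size, not the processing speed of any real program. The full 4x figure applies to ASCII-heavy text; text outside ASCII takes 2 to 4 bytes per code point in UTF-8, so the ratio there is smaller.

```d
import std.conv : dtext, text;
import std.stdio : writefln;

// Bytes needed to store s as UTF-8 (D's string is immutable(char)[]).
// Hypothetical helper, not part of Phobos.
size_t utf8Bytes(string s)
{
    return s.length * char.sizeof;
}

// Bytes needed to store the same text as UTF-32 (dstring, one dchar per code point).
// Hypothetical helper, not part of Phobos.
size_t utf32Bytes(string s)
{
    dstring wide = dtext(s);
    return wide.length * dchar.sizeof;
}

void main()
{
    // Sample inputs chosen only for illustration.
    string ascii = "plain ASCII text";
    string greek = "αβγδ";

    writefln("ascii: %s bytes UTF-8, %s bytes UTF-32", utf8Bytes(ascii), utf32Bytes(ascii));
    writefln("greek: %s bytes UTF-8, %s bytes UTF-32", utf8Bytes(greek), utf32Bytes(greek));

    // Converting back with text() restores the original UTF-8 string.
    assert(text(dtext(greek)) == greek);
}
```

Every extra byte per character is more data to pull through the cache, which is one plausible reason the UCS-4 version ends up slower even though indexing it is simpler.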