Why UTF-8/16 character encodings?

Sat May 25 05:43:21 PDT 2013

On 5/25/13 3:33 AM, Joakim wrote:
> On Saturday, 25 May 2013 at 01:58:41 UTC, Walter Bright wrote:
>> This is more a problem with the algorithms taking the easy way than a
>> problem with UTF-8. You can do all the string algorithms, including
>> regex, by working with the UTF-8 directly rather than converting to
>> UTF-32. Then the algorithms work at full speed.
> I call BS on this. There's no way working on a variable-width encoding
> can be as "full speed" as a constant-width encoding. Perhaps you mean
> that the slowdown is minimal, but I doubt that also.

You mentioned this a couple of times, and I wonder what makes you so
sure. On contemporary architectures small is fast and large is slow,
since memory traffic and cache misses tend to cost more than a few extra
instructions; betting on replacing larger data with more computation is
quite often a win.
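
To make the size side of that concrete, here is a minimal sketch in Python (chosen only for brevity; the helper names count_code_points and find_utf8 and the sample string are made up for illustration). It shows two operations done directly on UTF-8 bytes, without decoding to UTF-32 first, next to the footprint of the same text in both encodings. It measures nothing about speed, which would need a real benchmark on real data.

```python
def count_code_points(utf8: bytes) -> int:
    """Count code points by scanning UTF-8 bytes directly, without decoding."""
    # Continuation bytes have the form 0b10xxxxxx; every other byte
    # begins a new code point, so counting those is enough.
    return sum(1 for b in utf8 if (b & 0xC0) != 0x80)


def find_utf8(haystack: bytes, needle: bytes) -> int:
    """Byte-wise substring search; returns a byte offset or -1.

    For valid UTF-8 on both sides this cannot match in the middle of a
    character, because lead bytes and continuation bytes never overlap.
    """
    return haystack.find(needle)


# Sample text: "naive" and "cafe" with accents, plus three CJK characters.
text = "na\u00efve caf\u00e9, \u65e5\u672c\u8a9e"
utf8 = text.encode("utf-8")
utf32 = text.encode("utf-32-le")  # fixed 4 bytes per code point, no BOM

print(len(text), "code points")
print(len(utf8), "bytes as UTF-8")
print(len(utf32), "bytes as UTF-32")
print(count_code_points(utf8), "code points counted from the UTF-8 bytes")
print(find_utf8(utf8, "caf\u00e9".encode("utf-8")), "byte offset of the needle")
```

For this mostly-Latin sample the UTF-8 form is well under half the size of the UTF-32 form, so more of it fits in a cache line; the price is the extra mask-and-compare per byte in the counting loop.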

Andrei