Challenge: write a really really small front() for UTF8
Dmitry Olshansky
dmitry.olsh at gmail.com
Mon Mar 24 04:47:36 PDT 2014
24-Mar-2014 04:44, Simen Kjærås wrote:
> On 2014-03-24 00:32, Mike wrote:
>> On Sunday, 23 March 2014 at 21:23:18 UTC, Andrei Alexandrescu wrote:
>>> Here's a baseline: http://goo.gl/91vIGc. Destroy!
>>>
>>> Andrei
>>
>> This example only considers encodings of up to 4 bytes, but UTF-8 can
>> encode code points in as many as 6 bytes. Is that not a concern?
>>
>> Mike
>
> RFC 3629 (http://tools.ietf.org/html/rfc3629) restricted UTF-8 to
> conform to constraints in UTF-16, removing all 5- and 6-byte sequences.
More importantly, the Unicode standard explicitly fixed the range of code
points to what is representable in UTF-16, i.e. U+0000 through U+10FFFF.
That has been the case starting with the 5th version of the standard, if
memory serves me right.
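
To make the consequence concrete, here is a minimal sketch in C (not the
baseline from Andrei's link) of how many bytes a code point needs once the
range is capped at U+10FFFF. The name utf8_length is hypothetical, chosen
only for illustration; the thresholds follow the encoding table in RFC 3629.

```c
#include <stdint.h>

/* Hypothetical helper: number of bytes UTF-8 needs to encode the code
   point cp, or 0 if cp is not encodable under RFC 3629 (above U+10FFFF,
   or inside the UTF-16 surrogate range U+D800..U+DFFF). */
static int utf8_length(uint32_t cp)
{
    if (cp > 0x10FFFF)                return 0; /* outside the Unicode range */
    if (cp >= 0xD800 && cp <= 0xDFFF) return 0; /* surrogates are not encodable */
    if (cp < 0x80)                    return 1; /* 0xxxxxxx */
    if (cp < 0x800)                   return 2; /* 110xxxxx 10xxxxxx */
    if (cp < 0x10000)                 return 3; /* 1110xxxx 10xxxxxx 10xxxxxx */
    return 4;                                   /* 11110xxx + 3 continuation bytes */
}
```

The largest valid code point, U+10FFFF, already fits in 4 bytes, so a
decoder never has to handle a longer sequence: utf8_length(0x10FFFF) is 4,
and utf8_length(0x110000) is 0.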
--
Dmitry Olshansky