Challenge: write a really really small front() for UTF8
Andrei Alexandrescu
SeeWebsiteForEmail at erdani.org
Sun Mar 23 19:25:17 PDT 2014
On 3/23/14, 6:53 PM, Michel Fortin wrote:
> On 2014-03-23 21:22:58 +0000, Andrei Alexandrescu
> <SeeWebsiteForEmail at erdani.org> said:
>
>> Here's a baseline: http://goo.gl/91vIGc. Destroy!
>
> Optimizing for smallest assembly size:
>
> dchar front(char[] s)
> {
>     size_t bytesize;
>     dchar result;
>     switch (s[0])
>     {
>     case 0b00000000: .. case 0b01111111:
>         return s[0];
>     case 0b11000000: .. case 0b11011111:
>         // continuation bytes carry six payload bits
>         return ((s[0] & 0b00011111) << 6) | (s[1] & 0b00111111);
>     case 0b11100000: .. case 0b11101111:
>         result = s[0] & 0b00001111;
>         bytesize = 3;
>         break;
>     case 0b11110000: .. case 0b11110111:
>         result = s[0] & 0b00000111;
>         bytesize = 4;
>         break;
>     default:
>         return dchar.init;
>     }
>     foreach (i; 1..bytesize)
>         result = (result << 6) | (s[i] & 0b00111111);
>     return result;
> }
>
Nice, thanks! I'd hope for a short path for the ASCII subset; could you
achieve that?
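
To make the request concrete, here is a rough sketch of what a short
ASCII path might look like: test the high bit first and return before
any length dispatch. The name frontAscii, the const(char)[] parameter
and the uint accumulator are assumptions made for illustration, not
code from this thread. Like the version above, it does not check that
continuation bytes are well formed or that the slice is long enough.

```d
// Sketch only: frontAscii is a hypothetical name, not from the thread.
dchar frontAscii(const(char)[] s)
{
    immutable uint c = s[0];

    // Short path: a single compare and return for the ASCII subset.
    if (c < 0x80)
        return cast(dchar) c;

    // Cold path: the lead byte determines the sequence length.
    uint result;
    size_t bytesize;
    if ((c & 0b1110_0000) == 0b1100_0000)
    {
        result = c & 0b0001_1111;
        bytesize = 2;
    }
    else if ((c & 0b1111_0000) == 0b1110_0000)
    {
        result = c & 0b0000_1111;
        bytesize = 3;
    }
    else if ((c & 0b1111_1000) == 0b1111_0000)
    {
        result = c & 0b0000_0111;
        bytesize = 4;
    }
    else
        return dchar.init; // stray continuation byte or invalid lead byte

    // Fold in six payload bits from each continuation byte.
    foreach (i; 1 .. bytesize)
        result = (result << 6) | (s[i] & 0b0011_1111);
    return cast(dchar) result;
}
```

Whether this really yields a shorter ASCII path than the switch would
have to be checked in the generated assembly; the sketch only shows
the intended control flow.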
Andrei