Challenge: write a really really small front() for UTF8
    Michel Fortin 
    michel.fortin at michelf.ca
       
    Sun Mar 23 21:49:37 PDT 2014
    
    
  
On 2014-03-24 04:37:22 +0000, Michel Fortin <michel.fortin at michelf.ca> said:
> But try this instead, the result is even shorter:
Oops, messed up my patterns. Here's a hopefully fixed front():
dchar front(char[] s)
{
  if (s[0] < 0b10000000)
    return s[0]; // ASCII
  // pattern     indicator  tailLength
  // 0b1100xxxx  0b00 (0)   1
  // 0b1101xxxx  0b01 (1)   1 == indicator
  // 0b1110xxxx  0b10 (2)   2 == indicator
  // 0b1111xxxx  0b11 (3)   3 == indicator
  // note: undefined result for illegal 0b11111xxx case
  auto indicator = (s[0] >> 4) & 0b11;
  auto tailLength = indicator ? indicator : 1;
  dchar result = s[0] & (0b00111111 >> tailLength);
  foreach (i; 0..tailLength)
    result = (result << 6) | (s[1+i] & 0b00111111);
  return result;
}
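
To check the table above against concrete inputs, here is a minimal sketch that uses the same indicator logic, with the ASCII test written as `< 0x80`, and compares the result to a few known code points. The name `frontSketch`, the `uint` accumulator with a final cast, and the sample inputs are assumptions for illustration only. Like the original, it assumes valid UTF-8 and does no validation.

```d
// Sketch only: frontSketch is a hypothetical name, not part of the post.
// Assumes s is non-empty, valid UTF-8; illegal lead bytes give undefined results.
dchar frontSketch(const(char)[] s)
{
    if (s[0] < 0x80)
        return s[0]; // ASCII: high bit clear

    // Bits 5..4 of the lead byte select the number of continuation bytes.
    uint indicator = (s[0] >> 4) & 0b11;
    uint tailLength = indicator ? indicator : 1;

    // Keep only the payload bits of the lead byte (5, 4 or 3 of them).
    uint result = s[0] & (0b0011_1111 >> tailLength);

    // Each continuation byte contributes its low 6 bits.
    foreach (i; 0 .. tailLength)
        result = (result << 6) | (s[1 + i] & 0b0011_1111);

    return cast(dchar) result;
}

unittest
{
    assert(frontSketch("A") == 0x41);              // 1 byte:  41
    assert(frontSketch("\u00E9") == 0x00E9);       // 2 bytes: C3 A9
    assert(frontSketch("\u0416") == 0x0416);       // 2 bytes: D0 96 (0b1101xxxx lead)
    assert(frontSketch("\u20AC") == 0x20AC);       // 3 bytes: E2 82 AC
    assert(frontSketch("\U0001F600") == 0x1F600);  // 4 bytes: F0 9F 98 80
}
```

Building with `dmd -unittest -main` should run the checks. The `A` case is the one that trips over a 7-bit ASCII threshold, since 0x41 is above 0b1000000.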
-- 
Michel Fortin
michel.fortin at michelf.ca
http://michelf.ca
    
    