What is the legal range of chars?
    Ali Çehreli 
    acehreli at yahoo.com
       
    Wed Jun 19 08:13:22 PDT 2013
    
    
  
On 06/19/2013 05:34 AM, monarch_dodra wrote:
 > I know a "binary" char can hold the values 0 to 0xFF. However, I'm
 > wondering about the cases where a codepoint can fit inside a char. For
 > example, 'ç' is represented by 0xe7, which technically fits inside a
 > char.
'ç' is represented by the single byte 0xe7 only in an encoding that is
not UTF-8 (ISO-8859-1, for example). :) In UTF-8, 'ç' takes two bytes:
0xc3 0xa7. Treating 0xe7 as 'ç' would therefore be a special agreement
between the producer and the consumer of that string. I recommend
ubyte[] for those cases.
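As a rough sketch of the ubyte[] approach, assuming the bytes are ISO-8859-1 (Latin-1), where every byte value equals its Unicode code point: the helper name latin1ToUtf8 is made up for illustration and is not a Phobos function. It relies on the fact that appending a dchar to a string encodes it as UTF-8.

```d
import std.stdio : writeln;

// Hypothetical helper: converts Latin-1 bytes to a UTF-8 string.
// Assumes the input really is ISO-8859-1; other single-byte
// encodings would need a lookup table instead.
string latin1ToUtf8(const(ubyte)[] bytes)
{
    string result;
    foreach (b; bytes)
    {
        // In Latin-1 the byte value is the code point, so the cast is
        // enough; ~= encodes the dchar as one or two UTF-8 code units.
        result ~= cast(dchar) b;
    }
    return result;
}

void main()
{
    ubyte[] raw = [ 0x61, 0x62, 0x63, 0xe7 ];   // "abcç" in Latin-1
    string s = latin1ToUtf8(raw);

    assert(s == "abcç");
    assert(s.length == 5);   // 'ç' occupies two chars in UTF-8
    writeln(s);
}
```

Keeping the raw data as ubyte[] until it has been converted means a char[] never holds anything but UTF-8.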
In UTF-8, 0xe7 is the first byte of a 3-byte sequence that encodes a
single code point:
import std.stdio;
void main()
{
    // 0xe7 0x80 0x80 is the UTF-8 encoding of U+7000
    char[] a = [ 'a', 'b', 'c', 0xe7, 0x80, 0x80 ];
    writeln(a);
}
Prints a Chinese character:
abc瀀
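To check this without relying on the terminal's font, something along these lines should work. It is only a sketch: it uses std.utf.validate and std.utf.decode to show that the three bytes form one code point, that 'ç' itself is two code units, and that a lone 0xe7 is rejected as ill-formed UTF-8.

```d
import std.exception : assertThrown;
import std.stdio : writefln;
import std.utf : decode, validate, UTFException;

void main()
{
    // 'ç' is U+00E7, which UTF-8 encodes as two code units.
    string c = "ç";
    assert(c.length == 2);
    assert(cast(ubyte) c[0] == 0xc3 && cast(ubyte) c[1] == 0xa7);

    // 0xe7 announces a 3-byte sequence; with two continuation
    // bytes it is well-formed and decodes to a single code point.
    char[] good = [ 0xe7, 0x80, 0x80 ];
    validate(good);
    size_t i = 0;
    dchar d = decode(good, i);
    assert(d == 0x7000 && i == 3);
    writefln("U+%04X", cast(uint) d);

    // A lone 0xe7 has no continuation bytes, so it is not valid UTF-8.
    char[] bad = [ 'a', 0xe7 ];
    assertThrown!UTFException(validate(bad));
}
```

So a char may hold any value from 0 to 0xFF as a code unit, but values of 0x80 and above are only meaningful as part of a multi-byte sequence.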
Ali
    
    