What is the legal range of chars?
Ali Çehreli
acehreli at yahoo.com
Wed Jun 19 08:13:22 PDT 2013
On 06/19/2013 05:34 AM, monarch_dodra wrote:
> I know a "binary" char can hold the values 0 to 0xFF. However, I'm
> wondering about the cases where a codepoint can fit inside a char. For
> example, 'ç' is represented by 0xe7, which technically fits inside a
> char.
'ç' is represented by 0xe7 in an encoding that is not UTF-8. :)
That would be a special agreement between the producer and the consumer
of that string. Otherwise, 0xe7 is not 'ç'. I recommend ubyte[] for
those cases.
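As a rough sketch of that ubyte[] approach, assume the agreed encoding is ISO-8859-1 (Latin-1), where each byte value equals its Unicode code point. The bytes stay in a ubyte[] until they are explicitly transcoded to UTF-8. The helper name latin1ToUtf8 is made up for illustration and is not a Phobos function:

```d
import std.stdio;

// Hypothetical helper (not part of Phobos): assumes the input is Latin-1,
// so every byte maps one-to-one onto a code point in U+0000..U+00FF.
string latin1ToUtf8(const(ubyte)[] bytes)
{
    string result;
    foreach (b; bytes)
        result ~= cast(dchar) b;   // appending a dchar to a string encodes it as UTF-8
    return result;
}

void main()
{
    // Raw bytes from some non-UTF-8 producer: 'a', 'b', 'c', then 0xe7.
    ubyte[] raw = [ 0x61, 0x62, 0x63, 0xe7 ];
    writeln(latin1ToUtf8(raw));    // prints: abcç
}
```

The result is a proper UTF-8 string in which 'ç' takes two code units, so its length is 5, not 4.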
In UTF-8, 0xe7 is the first byte of a 3-byte sequence that encodes a single code point:
import std.stdio;

void main()
{
    char[] a = [ 'a', 'b', 'c', 0xe7, 0x80, 0x80 ];
    writeln(a);
}
Prints a Chinese character:
abc瀀
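For comparison, here is a small sketch of what happens when 0xe7 appears on its own in a char[], and of how 'ç' looks when it really is UTF-8. It relies on std.utf.validate, which throws UTFException on malformed input. The wrapper isValidUtf8 is a hypothetical name used only for this example:

```d
import std.stdio;
import std.utf : validate, UTFException;

// Hypothetical wrapper: true if the code units form well-formed UTF-8.
bool isValidUtf8(const(char)[] s)
{
    try
    {
        validate(s);
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}

void main()
{
    // 0xe7 is a lead byte that promises two continuation bytes; none follow.
    char[] lone = [ 'a', 'b', 'c', 0xe7 ];
    writeln(isValidUtf8(lone));             // prints: false

    // In UTF-8, 'ç' (U+00E7) is the two code units 0xC3 0xA7.
    string proper = "ç";
    writeln(cast(const(ubyte)[]) proper);   // prints: [195, 167]
}
```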
Ali