To Walter, about char[] initialization by FF

Carlos Santander csantander619 at gmail.com
Sat Jul 29 17:26:37 PDT 2006


Andrew Fedoniouk wrote:
> "Carlos Santander" <csantander619 at gmail.com> wrote in message 
> news:eagiip$1lad$3 at digitaldaemon.com...
>> Andrew Fedoniouk wrote:
>>> 2) For char[], the choice of 0xFF is wrong, and even worse.
>>> For example, the character with code 0xFF in the Latin-1 encoding is
>>> "y diaeresis". In many European and Far East encodings, 0xFF is
>>> a valid code point.
>>> For example, in the KOI-8 encoding 0xFF is an officially assigned value.
>>>
>> But D's chars are UTF-8, not Latin-1 nor any other, so I don't think this 
>> applies.
>>
> 
> UTF-8 is a multibyte transport encoding of the full 21-bit Unicode code point.
> Strictly speaking, a single byte in a UTF-8 sequence cannot be called a
> char[acter].
> 
> char as a type name implies that a value of that type holds one complete
> code point (assuming that information about the code page is stored somewhere
> or is known at the point of use).
> 
> I mean that a "UTF-8 character" (if that makes any sense at all) as a type
> is always char[] and not a single char.
> 
> 0xFF as the char initialization value implies that D's char is not supposed
> to handle single-byte character encodings at all. Is this the original
> intention?
> 
> Andrew Fedoniouk.
> http://terrainformatica.com
> 

My bad, then. I should've said char[] instead of char. Frits and Walter wrote 
better responses, anyway, so I'll leave this as is.
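
For anyone reading this later, here is a minimal sketch of the point being argued over. It is written in Python purely for illustration and is not D code or anything posted in the thread. The helper name is_valid_utf8 is made up for this example. It shows that the byte 0xFF is a complete character in Latin-1, can never occur in well-formed UTF-8, and that the same character takes two bytes once encoded as UTF-8.

```python
# Illustration only: how the single byte 0xFF behaves under different encodings.
raw = bytes([0xFF])

# In Latin-1, 0xFF is a complete character: U+00FF, "y with diaeresis".
latin1_char = raw.decode("latin-1")


def is_valid_utf8(data: bytes) -> bool:
    """Hypothetical helper: True if data decodes as well-formed UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The same character encoded as UTF-8 needs two bytes (two code units),
# so no single UTF-8 byte can stand for it on its own.
utf8_bytes = latin1_char.encode("utf-8")

print(repr(latin1_char), is_valid_utf8(raw), utf8_bytes.hex())
# prints: 'ÿ' False c3bf
```

So a lone 0xFF is a legitimate character only if the bytes are treated as a single-byte encoding; read as UTF-8 it is always an error, which is what makes it usable as an "uninitialized" marker.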

-- 
Carlos Santander Bernal



More information about the Digitalmars-d mailing list