To Walter, about char[] initialization by FF
Carlos Santander
csantander619 at gmail.com
Sat Jul 29 17:26:37 PDT 2006
Andrew Fedoniouk wrote:
> "Carlos Santander" <csantander619 at gmail.com> wrote in message
> news:eagiip$1lad$3 at digitaldaemon.com...
>> Andrew Fedoniouk wrote:
>>> 2) For char[], the selection of 0xFF is wrong, and even worse.
>>> For example, the character with code 0xFF in the Latin-1 encoding is
>>> "y diaeresis". In many European and Far East encodings 0xFF is
>>> a valid code point.
>>> For example, in the KOI-8 encoding 0xFF is an officially assigned value.
>>>
>> But D's chars are UTF-8, not Latin-1 nor any other, so I don't think this
>> applies.
>>
>
> UTF-8 is a multibyte transport encoding of a full 21-bit Unicode code point.
> Strictly speaking, a single byte in a UTF-8 sequence cannot be called a
> char[acter].
>
> char as a type name implies that a value of that type contains a complete
> code point (assuming that information about the code page is stored somewhere
> or is known at the point of use).
>
> I mean that a "UTF-8 character" (if that makes any sense at all) as a type
> is always char[] and not a single char.
>
> 0xFF as a char initialization value implies that a D char is not supposed
> to handle single-byte character encodings at all. Was this the original
> intention?
>
> Andrew Fedoniouk.
> http://terrainformatica.com
>
My bad, then. I should've said char[] instead of char. Frits and Walter wrote
better responses, anyway, so I'll leave this as is.
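
For anyone following along, here is a minimal sketch of what the byte 0xFF means under the encodings mentioned above. It is written in Python rather than D only so that it runs with stock codecs. The helper name is_valid_utf8 is made up for this illustration, and KOI8-R is assumed to be the KOI-8 variant meant in the quoted message.

```python
# The single byte under discussion.
raw = bytes([0xFF])

# In single-byte encodings, 0xFF is an ordinary assigned character.
latin1 = raw.decode("latin-1")   # U+00FF, "y diaeresis"
koi8r = raw.decode("koi8_r")     # U+042A, Cyrillic capital hard sign


def is_valid_utf8(data: bytes) -> bool:
    """Hypothetical helper: True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# In UTF-8, the byte 0xFF never appears in any well-formed sequence,
# so a lone 0xFF is rejected by the decoder.
lone_ff_is_valid = is_valid_utf8(raw)

# "y diaeresis" itself takes two code units in UTF-8, which is why a
# single UTF-8 code unit cannot hold every character.
y_diaeresis_utf8 = "\u00ff".encode("utf-8")
```

Here lone_ff_is_valid comes out False, while y_diaeresis_utf8 is the two-byte sequence C3 BF. So 0xFF is a character in Latin-1 and KOI8-R, but under UTF-8 it can only ever signal invalid data.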
--
Carlos Santander Bernal