To Walter, about char[] initialization by FF
Andrew Fedoniouk
news at terrainformatica.com
Sat Jul 29 15:37:18 PDT 2006
"Walter Bright" <newshound at digitalmars.com> wrote in message
news:eagmrk$1pn9$1 at digitaldaemon.com...
> Andrew Fedoniouk wrote:
>>>> What is the point of the current initialization?
>>> The point is to initialize it with an invalid value, in order to flush
>>> out uninitialized data errors.
>>>
>>>> If you are doing initialization already,
>>>> and this initialization is part of the specification, why not use the
>>>> official "Nul" values in this case?
>>> Because 0 is a valid UTF-8 character.
>>
>> 1) What exactly does "UTF-8 character" mean?
>
> For an exact answer, the spec is: http://www.ietf.org/rfc/rfc3629.txt
> There isn't much to it.
Sorry, but I understand what a UCS character is;
what exactly is the "UTF-8 character" you are referring to?
Is it 1) a single octet in a UTF-8 sequence, or
2) a sequence of octets representing one Unicode character (a 21-bit value)?
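
To make the two readings concrete, here is a minimal sketch. It is written in Python rather than D, purely for illustration, and the sample letter is my own choice, not something taken from the thread. Python's str counts Unicode characters and bytes counts octets, so the same text gives two different lengths.

```python
# Illustration only: one Unicode character versus the octets that encode it.
text = "\u0416"                  # CYRILLIC CAPITAL LETTER ZHE, one character
octets = text.encode("utf-8")    # the UTF-8 octet sequence for that character

code_points = len(text)          # reading 2: whole encoded characters
octet_count = len(octets)        # reading 1: individual octets

print(code_points, octet_count, octets.hex())
```

Under reading 1 this text has two "UTF-8 characters"; under reading 2 it has one.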
>
>> 2) In ASCII char(0) is officially NUL. Why not to initialize strings
>> by null?
>
> Because 0 characters are valid UTF-8 values. By using an invalid UTF-8
> value, we can flush out bugs from uninitialized data.
Oh....
A 0 octet in UTF-8 can represent only one character:
the one with code point 0x00000000.
In plain English: a zero octet never occurs in the middle of a multi-octet UTF-8 sequence.
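
Both sides of this point can be checked by brute force. The sketch below is Python, used only as an illustration (the helper name octet_usage is hypothetical). It encodes every Unicode scalar value as UTF-8 and records which code points produce a 0x00 octet and whether a 0xFF octet ever shows up.

```python
def octet_usage():
    """Scan the UTF-8 encoding of every Unicode scalar value."""
    zero_owners = set()   # code points whose encoding contains octet 0x00
    ff_seen = False       # becomes True if octet 0xFF ever appears
    for cp in range(0x110000):
        if 0xD800 <= cp <= 0xDFFF:
            continue      # surrogates cannot be encoded as UTF-8
        encoded = chr(cp).encode("utf-8")
        if 0x00 in encoded:
            zero_owners.add(cp)
        if 0xFF in encoded:
            ff_seen = True
    return zero_owners, ff_seen

zero_owners, ff_seen = octet_usage()
print(zero_owners, ff_seen)
```

If the scan behaves as expected, only code point 0 ever yields a zero octet, and 0xFF never occurs in valid UTF-8 at all. The first result is the point made here; the second is the property the 0xFF initializer relies on.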
>
>> I don't get it, sorry. In the KOI-8R (Russian) encoding, 0xFF is the letter 'Ъ'.
>> Are you saying that I cannot use char[] to represent Russian text in D?
>
> char[] is for UTF-8 encoded text only. For other encoding systems, use
> ubyte[]. But rest assured that Russian (and every other language) has a
> defined encoding in UTF-8, which is why it was selected for D.
Sorry, but char[acter] in plain English means a character: the index of some
human-readable glyph in some table such as ASCII, KOI-8,
MAC-ASCII, whatever.
An element of a UTF-8 sequence is an octet. I think you should rename the
'char' type to 'octet' if D/Phobos is intended to support only UTF-8.
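
The KOI-8R case quoted above can be illustrated the same way. This is a Python sketch, not D, and is_valid_utf8 is a hypothetical helper written for this example. The same single octet is a letter in one table and an invalid value in UTF-8, where that letter needs two octets.

```python
def is_valid_utf8(data):
    """Return True if the octets form well-formed UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

raw = b"\xff"                            # a single octet with value 0xFF
as_koi8r = raw.decode("koi8_r")          # interpreted through the KOI8-R table
as_utf8 = as_koi8r.encode("utf-8")       # the same letter re-encoded as UTF-8

print(ascii(as_koi8r), is_valid_utf8(raw), as_utf8.hex())
```

So text stored as KOI-8R octets would have to be transcoded before it could be held in a container that accepts only UTF-8.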
Andrew.