To Walter, about char[] initialization by FF
Andrew Fedoniouk
news at terrainformatica.com
Sat Jul 29 13:27:14 PDT 2006
"kris" <foo at bar.com> wrote in message news:eaf9ei$2m7$1 at digitaldaemon.com...
> Andrew Fedoniouk wrote:
>> Could somebody shed light on the subject:
>>
>> According to http://digitalmars.com/d/type.html
>>
>> characters in D are getting initialized by following values
>>
>> char -> 0xFF
>> wchar -> 0xFFFF
>> dchar -> 0x0000FFFF
>>
>> what is the idea to have string initialized by valid character code
>> instead of 0?
>
> Try google?
>
> http://www.digitalmars.com/d/archives/digitalmars/D/3239.html
Thanks, Kris.
To Walter:
The following assumption
(http://www.digitalmars.com/d/archives/digitalmars/D/3239.html):
"codepoint U+FFFF is not a legitimate Unicode character, and, furthermore,
it is guaranteed by the Unicode Consortium that 0xFFFF will NEVER be a
legitimate Unicode character. This codepoint will remain forever
unassigned, precisely so that it may be used for purposes such as this."
is just wrong.
1) 0xFFFF is a valid Unicode character: it is one of the "Specials" from
the R-zone, {U+FFF0..U+FFFF}, a region that is already assigned.
2) For char[], the choice of 0xFF is wrong too, and even worse.
For example, the character with code 0xFF in the Latin-1 encoding is
"y diaeresis". In many European and Far East encodings 0xFF is a
valid code point.
For example, in the KOI-8 encoding 0xFF is an officially assigned value.
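
To make point 2 concrete, here is a minimal Python sketch that decodes
the single byte 0xFF under a few encodings. The helper name decode_ff is
made up for this illustration, and koi8_r is assumed as the KOI-8 variant
in question. UTF-8 is included only for comparison, since 0xFF never
occurs in well-formed UTF-8.

```python
# Illustrative sketch: what the byte 0xFF means depends on the encoding.
# decode_ff is a hypothetical helper, not part of any library.

def decode_ff(encoding):
    """Return the character that byte 0xFF decodes to, or None if invalid."""
    try:
        return b"\xff".decode(encoding)
    except UnicodeDecodeError:
        # The byte is not a valid sequence in this encoding.
        return None

if __name__ == "__main__":
    # "koi8_r" is an assumption about which KOI-8 variant is meant.
    for name in ("latin-1", "koi8_r", "utf-8"):
        print(name, ascii(decode_ff(name)))
```

In Latin-1 the byte comes back as U+00FF, the "y diaeresis" mentioned
above, and in KOI8-R it comes back as an assigned Cyrillic letter.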
What is the point of the current initialization?
If you are doing initialization already, and this initialization is
part of the specification, why not use the official "Nul" values
in this case?
You are doing the same for floats: you use NaNs there
(the null value for floats). Why not do the same for chars?
I think I understand your intention: 0xFF is a sort of debug value,
like these in Visual C++:
0xCDCDCDCD - Allocated in heap, but not initialized.
0xDDDDDDDD - Released heap memory.
0xFDFDFDFD - "NoMansLand" fences automatically placed at the boundary of
             heap memory. Should never be overwritten. If you do
             overwrite one, you're probably walking off the end of an
             array.
0xCCCCCCCC - Allocated on stack, but not initialized.
But this is far from the concept of a null code point in character
encodings.
Andrew Fedoniouk.
http://terrainformatica.com