To Walter, about char[] initialization by FF

Andrew Fedoniouk news at terrainformatica.com
Sat Jul 29 13:27:14 PDT 2006


"kris" <foo at bar.com> wrote in message news:eaf9ei$2m7$1 at digitaldaemon.com...
> Andrew Fedoniouk wrote:
>> Could somebody shed light on the subject:
>>
>> According to http://digitalmars.com/d/type.html
>>
>> characters in D are getting initialized by following values
>>
>> char -> 0xFF
>> wchar -> 0xFFFF
>> dchar -> 0x0000FFFF
>>
>> what is the idea to have string initialized by valid character code 
>> instead of 0?
>
> Try google?
>
> http://www.digitalmars.com/d/archives/digitalmars/D/3239.html

Thanks, Kris.

To Walter:

The following assumption
(http://www.digitalmars.com/d/archives/digitalmars/D/3239.html):

"codepoint U+FFFF is not a legitimate Unicode character, and, furthermore,
it is guaranteed by the Unicode Consortium that 0xFFFF will NEVER be a
legitimate Unicode character. This codepoint will remain forever unassigned,
precisely so that it may be used for purposes such as this."

is just wrong.

1) 0xFFFF is a valid Unicode character: it is one of the "Specials" from
the R-zone {U+FFF0..U+FFFF}, a region that is already assigned.

2) For char[] the choice of 0xFF is even worse.
For example, the character with code 0xFF in the Latin-1 encoding is
"y with diaeresis". In many European and Far East encodings 0xFF is a
valid code point.
For example, in the KOI-8 encoding 0xFF is an officially assigned value.
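
The point about legacy 8-bit encodings can be checked with a small sketch. It is written in Python rather than D, purely for illustration, and it assumes KOI8-R as the KOI-8 variant in question (Python's "koi8_r" codec); the variable names are made up for this example.

```python
# Decode the single byte 0xFF under two legacy 8-bit encodings.
raw = bytes([0xFF])

latin1_char = raw.decode("latin-1")  # ISO 8859-1 (Latin-1)
koi8r_char = raw.decode("koi8_r")    # KOI8-R, assumed here as the KOI-8 variant

# Both decodes succeed, so 0xFF is an assigned value in both encodings.
print(f"Latin-1: 0xFF -> U+{ord(latin1_char):04X}")
print(f"KOI8-R:  0xFF -> U+{ord(koi8r_char):04X}")
```

In Latin-1 the byte maps to U+00FF (y with diaeresis), and in KOI8-R it maps to U+042A (Cyrillic capital hard sign).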

What is the point of the current initialization?

If you are already doing initialization,
and this initialization is part of the specification, why not use
the official "Nul" values in this case?

You are doing the same for floats: you are using NaNs there
(the null value for floats). Why not do the same for chars?
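
To make the NaN comparison concrete, here is a minimal sketch of how a NaN default behaves as a "no value" marker. It uses Python floats as a stand-in for D's floating-point types; the variable names are hypothetical.

```python
import math

# Stand-in for a float that was default-initialized to NaN.
uninitialized = float("nan")

# Any arithmetic on it stays NaN, so the missing assignment
# remains visible in the final result.
total = uninitialized + 1.0
print(math.isnan(total))

# NaN never compares equal to anything, including itself.
print(uninitialized == uninitialized)
```

The marker propagates through calculations instead of silently behaving like an ordinary number.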

I think I understand your intention: 0xFF is something like the
debug fill values in Visual C++:

0xCDCDCDCD
  - Allocated in heap, but not initialized
0xDDDDDDDD
  - Released heap memory.
0xFDFDFDFD
  - "NoMansLand" fences automatically placed at the boundary of heap memory.
    Should never be overwritten. If you do overwrite one, you're probably
    walking off the end of an array.
0xCCCCCCCC
  - Allocated on stack, but not initialized

but this is far from the concept of a null code point in character encodings.

Andrew Fedoniouk.
http://terrainformatica.com
