To Walter, about char[] initialization by FF

Bruno Medeiros brunodomedeirosATgmail at SPAM.com
Thu Aug 3 04:15:53 PDT 2006


Andrew Fedoniouk wrote:
> "Walter Bright" <newshound at digitalmars.com> wrote in message 
> news:eao5st$2r1f$1 at digitaldaemon.com...
>> Andrew Fedoniouk wrote:
>>> Compiler accepts input stream as either BMP codes or full unicode set
>>> encoded using UTF-16.
>>
>> BMP is a subset of UTF-16.
> 
> Walter with deepest respect but it is not. Two different things.
> 
> UTF-16 is a variable-length encoding - byte stream.
> Unicode BMP is a range of numbers strictly speaking.
> 
> If you will treat utf-16 sequence as a sequence of UCS-2 (BMP) codes you
> are in trouble. See:
> 

Uh, the statement "BMP is a subset of UTF-16" means that you can read a 
BMP sequence as a UTF-16 sequence, not the opposite, which is what you 
described: "If you will treat utf-16 sequence as a sequence of UCS-2 (BMP)".
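
A minimal Python sketch of that direction, as an illustration only (the
helper names are hypothetical and not from this thread). It assumes "BMP
sequence" means one 16-bit code unit per BMP code point, i.e. UCS-2, and
uses Python's own "utf-16-le" codec as the UTF-16 reference:

```python
import struct

def ucs2_units(text):
    """Encode BMP-only text as 16-bit units, one unit per code point.

    Hypothetical helper: raises ValueError for anything UCS-2 cannot hold.
    """
    units = []
    for ch in text:
        cp = ord(ch)
        if cp > 0xFFFF or 0xD800 <= cp <= 0xDFFF:
            raise ValueError("not representable in UCS-2: U+%04X" % cp)
        units.append(cp)
    return units

def utf16_units(text):
    """Return the UTF-16 code units of text, via Python's built-in encoder."""
    raw = text.encode("utf-16-le")
    return list(struct.unpack("<%dH" % (len(raw) // 2), raw))

def decode_utf16(units):
    """Decode a list of 16-bit code units as UTF-16."""
    raw = struct.pack("<%dH" % len(units), *units)
    return raw.decode("utf-16-le")

# BMP -> UTF-16: the code units are identical, so a UTF-16 reader
# accepts a BMP-only sequence unchanged.
bmp_text = "D \u00e9 \u4e2d"
assert ucs2_units(bmp_text) == utf16_units(bmp_text)
assert decode_utf16(ucs2_units(bmp_text)) == bmp_text

# UTF-16 -> BMP does not hold: a code point above U+FFFF becomes a
# surrogate pair, i.e. two units that are not BMP characters by themselves.
clef = "\U0001D11E"  # MUSICAL SYMBOL G CLEF
assert utf16_units(clef) == [0xD834, 0xDD1E]
```

Reading those last two units as two separate UCS-2 characters is the
"trouble" described above; reading BMP-only units as UTF-16 is always safe.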


>>> Ordinary people will do their own strings anyway. Just give them opAssign 
>>> and dtor in structs and you will see explosion of perfect strings. That 
>>> char#[] (read-only arrays) will also benefit here. oh.....
>>>
>>> Changing char init value to 0 will not harm anybody but will allow to use 
>>> char for other than utf-8 purposes - it is only one from 40 in active use
>>> encodings anyway.
>>>
>>> For persistence purposes (in compiled EXE) utf is the best choice 
>>> probably. But in runtime - please not on language level.
>> ubyte[] will enable you to use any encoding you wish - and that's what 
>> it's there for.
> 
> Thus the whole set of Windows API headers (and std.c.string for example)
> seen in D has to be rewritten to accept ubyte[], as char in D is not char
> in C. Is this the idea?
> 
> Andrew.
> 
> 

Just a note: they would have to accept ubyte*, not ubyte[].


-- 
Bruno Medeiros - MSc in CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D



More information about the Digitalmars-d mailing list