First Impressions

Chad J "gamerChad\" at spamIsBad gmail.com
Fri Sep 29 07:21:20 PDT 2006


Georg Wrede wrote:
> Lionello Lunesu wrote:
> 
>> I also ALWAYS create aliases for char[], wchar[], dchar[]... I DO wish 
>> they would be included by default in Phobos.
>>
>> alias char[] string;
>> alias wchar[] wstring;
>> alias dchar[] dstring;
>>
>> Perhaps, using string instead of char[], it's more obvious that it's 
>> not zero-terminated. I've seen D examples online that just cast a 
>> char[] to char* for use in MessageBox and the like (which worked since 
>> they were string constants.)
> 
> 
> Using char[] as long as you don't know about UTF seems to work pretty 
> well in D. But the moment you realise that we're having potential 
> multibyte characters in what essentially is a ubyte[], you get scared to 
> death, and start to wonder how on earth you haven't yet blown up your 
> hard disk.
> 
> You start having nightmares about slicing char arrays at the wrong 
> place, extracting single chars that might not be storable in a char, and 
> all of a sudden you decide to stick with your old language "till things 
> calm down".
> 
> The only medicine to this is simply to shut your eyes and keep coding on 
> like you never did realise anything.
> 
> It's a little like when you first realised Daddy isn't holding your 
> bike: you instantly fall hurting yourself, instead of realizing that 
> he's probably let go ages ago, and you still haven't fallen, so simply 
> keep going.
> 
> ---
> 
> This doesn't mean I'm happy with this either, but I don't have the 
> energy to conjure up a significantly better solution _and_ fight for it 
> till it gets accepted. (Some things are just too hard to fix, like 
> "bit=bool" was, and now "auto/auto".)

Haha, too true.

I experienced this too as I read this ng.  It hasn't been THAT traumatic 
for me though, since everything seems to work as long as you stick to 
English.  I don't have the resources to even begin thinking about 
non-English text (e.g. paying people to translate stuff), so I don't lose 
any sleep over it, at least not yet.

Perhaps there should be a string struct/class that has an undefined 
underlying type (it could be UTF-8, UTF-16, or UTF-32, you dunno really), 
and you could index it to get the *complete* character at any position in 
the string.  Basically, it is like char[], but it /just works/ in all 
cases.  I'd almost rather have the size of a char be undefined, and just 
have char[] be the said magic string type.  If you want something with a 
.sizeof of 1, then there is byte/ubyte.  There would probably have to be 
some stuff in the Phobos internals to handle such a string in a correct 
manner.
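
To make the "index it and get the complete character" idea a bit more 
concrete, here is a rough sketch of what I mean.  It assumes UTF-8 
storage and leans on std.utf.decode from Phobos; the helper name 
codePointAt is made up for illustration and is not an existing library 
function.  It also treats one code point as one "character", which 
ignores combining sequences.

```d
import std.utf : decode;

// Hypothetical helper (not part of Phobos): return the n-th complete
// code point of a UTF-8 string, counting from 0.
dchar codePointAt(const(char)[] s, size_t n)
{
    size_t i = 0;           // offset into s, in UTF-8 code units
    foreach (_; 0 .. n)
        decode(s, i);       // skips one whole code point, advancing i
    return decode(s, i);    // decode the code point that starts at i
}

void main()
{
    const(char)[] s = "na\u00EFve";   // U+00EF takes two code units in UTF-8

    assert(s.length == 6);                    // code units, not characters
    assert(codePointAt(s, 2) == '\u00EF');    // the whole character comes back
    assert(codePointAt(s, 3) == 'v');         // s[3] would be half of U+00EF
}
```

Note that this walks from the start of the string every time, so 
indexing costs O(n).  A real magic string type would presumably have to 
hide or avoid that cost somehow.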

Going even further... if you could make char[] be such a magic string 
type, then wchar[] and dchar[] could probably be deprecated - use ushort 
and uint instead.  Then add the following aliases to Phobos:

alias ubyte utf8;
alias ushort utf16;
alias uint utf32;

Just a thought.  I'm no expert on UTF, but maybe this can start a 
discussion that will result in the nightmares ending :)


