First Impressions
Chad J
"gamerChad\" at spamIsBad gmail.com
Fri Sep 29 07:21:20 PDT 2006
Georg Wrede wrote:
> Lionello Lunesu wrote:
>
>> I also ALWAYS create aliases for char[], wchar[], dchar[]... I DO wish
>> they would be included by default in Phobos.
>>
>> alias char[] string;
>> alias wchar[] wstring;
>> alias dchar[] dstring;
>>
>> Perhaps, using string instead of char[], it's more obvious that it's
>> not zero-terminated. I've seen D examples online that just cast a
>> char[] to char* for use in MessageBox and the like (which worked since
>> they were string constants.)
>
>
> Using char[] as long as you don't know about UTF seems to work pretty
> well in D. But the moment you realise that we're having potential
> multibyte characters in what essentially is a ubyte[], you get scared to
> death, and start to wonder how on earth you haven't yet blown up your
> hard disk.
>
> You start having nightmares about slicing char arrays at the wrong
> place, extracting single chars that might not be storable in a char, and
> all of a sudden you decide to stick with your old language "till things
> calm down".
>
> The only medicine to this is simply to shut your eyes and keep coding on
> like you never did realise anything.
>
> It's a little like when you first realised Daddy isn't holding your
> bike: you instantly fall hurting yourself, instead of realizing that
> he's probably let go ages ago, and you still haven't fallen, so simply
> keep going.
>
> ---
>
> This doesn't mean I'm happy with this either, but I don't have the
> energy to conjure up a significantly better solution _and_ fight for it
> till it gets accepted. (Some things are just too hard to fix, like
> "bit=bool" was, and now "auto/auto".)

Haha, too true.

I experienced this too as I read this ng. It hasn't been THAT traumatic
for me though, since everything seems to work as long as you stick to
English. I don't have the resources to even begin thinking about
non-English text (e.g. paying people to translate stuff), so I don't lose
any sleep over it, at least not yet.
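
A minimal sketch of why sticking to English hides the problem, assuming a current D2 compiler and Phobos' std.utf (the helper name sliceIsValidUtf8 is made up for illustration): .length and slicing count UTF-8 code units, so they only line up with characters while the text stays ASCII.

```d
import std.utf : count, validate, UTFException;

/// Hypothetical helper: true if s[lo .. hi] is still well-formed UTF-8.
bool sliceIsValidUtf8(string s, size_t lo, size_t hi)
{
    try
    {
        validate(s[lo .. hi]); // throws UTFException on a broken sequence
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}

unittest
{
    string ascii = "hello";
    string accented = "h\u00E9llo"; // U+00E9 takes two UTF-8 code units

    // For ASCII, code units and characters coincide.
    assert(ascii.length == 5 && count(ascii) == 5);

    // For non-ASCII text, .length counts code units, count() code points.
    assert(accented.length == 6);
    assert(count(accented) == 5);

    assert(sliceIsValidUtf8(ascii, 0, 2));
    assert(!sliceIsValidUtf8(accented, 0, 2)); // cuts U+00E9 in half
    assert(sliceIsValidUtf8(accented, 0, 3));  // keeps both of its bytes
}
```

The unittest block runs with something like `dmd -unittest -main -run file.d`.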

Perhaps there should be a string struct/class whose underlying type is
undefined (it could be UTF-8, 16, or 32; you dunno really), and which you
could index to get the *complete* character at any position in the
string. Basically, it is like char[], but it /just works/ in all cases.
I'd almost rather have the size of a char be undefined, and just have
char[] be said magic string type. If you want something with a
.sizeof of 1, then there is byte/ubyte. There would probably have to be
some stuff in the Phobos internals to handle such a string correctly.
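
To make the idea concrete, here is a rough sketch of such a type, assuming current D2 and Phobos' std.utf. The name Text and its layout are hypothetical and nothing like it is claimed to exist in Phobos. It happens to store UTF-8, but callers only ever see whole code points.

```d
import std.utf : count, decode, stride;

/// Hypothetical wrapper: stores UTF-8 internally but exposes
/// length and indexing in whole code points.
struct Text
{
    string data; // storage encoding is an implementation detail

    /// Number of code points (not code units).
    size_t length()
    {
        return count(data);
    }

    /// The n-th code point as a dchar. Walks from the start, so O(n).
    dchar opIndex(size_t n)
    {
        size_t i = 0;
        for (size_t k = 0; k < n && i < data.length; ++k)
            i += stride(data, i); // skip one complete encoded character
        if (i >= data.length)
            throw new Exception("Text index out of range");
        return decode(data, i);
    }
}

unittest
{
    auto t = Text("h\u00E9llo");
    assert(t.length == 5);     // five characters, six bytes underneath
    assert(t[1] == '\u00E9');  // the complete character, not half of it
    assert(t[2] == 'l');
}
```

The catch is cost: indexing is linear because UTF-8 is variable-width. A real design would have to choose between that, caching offsets, or storing UTF-32. Note also that a code point is still not always a full user-visible character (combining marks), which this sketch ignores.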

Going even further... if you could make char[] be such a magic string
type, then wchar[] and dchar[] could probably be deprecated: use ushort
and uint instead. Then add the following aliases to Phobos:
alias ubyte utf8;
alias ushort utf16;
alias uint utf32;
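
As an illustration of what those three aliases would stand for, here is a small sketch assuming current D2 and Phobos (std.utf.toUTF16, std.utf.toUTF32, std.string.representation). The helper unitCounts is made up for this example, and the aliases are written in the newer `alias name = type;` form. It shows the same text occupying a different number of code units in each encoding.

```d
import std.string : representation;
import std.utf : toUTF16, toUTF32;

// The aliases proposed above, as plain integer code-unit types.
alias utf8 = ubyte;
alias utf16 = ushort;
alias utf32 = uint;

/// Hypothetical helper: code-unit counts of s as UTF-8, UTF-16 and UTF-32.
size_t[3] unitCounts(string s)
{
    immutable(utf8)[] raw = representation(s); // same bytes, viewed as ubyte
    return [raw.length, toUTF16(s).length, toUTF32(s).length];
}

unittest
{
    // 'a' (ASCII), U+00E9 (two UTF-8 bytes), U+1D11E (outside the BMP).
    auto c = unitCounts("a\u00E9\U0001D11E");
    assert(c[0] == 7); // 1 + 2 + 4 UTF-8 code units
    assert(c[1] == 4); // 1 + 1 + 2 UTF-16 code units (surrogate pair)
    assert(c[2] == 3); // one UTF-32 code unit per code point
}
```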

Just a thought. I'm no expert on UTF, but maybe this can start a
discussion that will result in the nightmares ending :)