First Impressions

Georg Wrede georg.wrede at nospam.org
Tue Oct 3 02:44:53 PDT 2006


Kevin Bealer wrote:
> Georg Wrede wrote:
> 
>> Walter Bright wrote:
>>
>>> True, and some have called for renaming char to utf8. While that 
>>> would be technically more correct (as toUTF would be, too), it just 
>>> looks awful.
>>
>>
>> Let's just say it would be a first step in lessening the confusion 
>> _we_ create in newcomers' heads.
> 
> 
> I would kind of agree with this, but I think it's a two-edged knife.
> 
> If we say 'char[]' then users don't know it's a string until they read 
> the 'why D arrays are great' page (which they should read, but...)
> 
> If we say 'string' then we hide the fact that [] can be applied and that 
> other array-like operations can work.
> 
> For instance, from a Java perspective:
> 
> char[] : Users don't know that it's "String"; users see it as low-level.
>          Some will try to write things like 'find()' by hand since they
>          will figure arrays are low level and not expect this to exist.

Yes.

> string : Users will think it's immutable, special; they will ask "how do
>          I get one of the characters out of a string", "how do I convert
>          string to char[]?", and other things that would be obvious
>          without the alias.

Well, with string, folks would at least be inclined to search for the 
library function to do it.

---

Overall, having string instead of char[] should result in folks learning 
and doing more with D _before_ they get tangled with UTF issues. (I 
guess getting tangled with UTF is unavoidable.) But the later folks 
stumble on this, the better they can handle it. If it happens too 
soon, then they will just run away from D.

But substituting string for char[] in D is not enough. More than half 
the issue is the wording in the docs.

---

Another thing intimately connected with this is whether we should have 
char[] or utf8[] (string or no string, this is an important thing anyway).

I understand that "char" is one of the words that a seasoned 
programmer's fingers know by heart. So it would feel simply disgusting 
to have to learn (and bother) to write "utf8" which I admit is a lot 
more work to type. (Seriously.)

Now, "string" is easy for the fingers, and then you get to skip "[]", 
which makes it all a little more palatable.

Having string would let us have the underlying type be utf8[], which 
would really call attention to the fact that it's not byte-by-byte 
stuff we have there.
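
To make the "not byte-by-byte" point concrete, here is a minimal sketch, 
assuming a current D compiler. The alias name Text and the helpers 
codeUnits and codePoints are made up for illustration (they are not 
library functions), and Text merely stands in for the proposed string 
alias. It only shows that .length on a char[] counts UTF-8 code units, 
while a foreach over dchar decodes whole characters.

```d
import std.stdio : writeln;

// Hypothetical stand-in for the alias proposed in this thread.
alias Text = char[];

// Number of UTF-8 code units (bytes); this is what .length reports.
size_t codeUnits(const(char)[] s)
{
    return s.length;
}

// Number of code points, obtained by letting foreach decode the UTF-8.
size_t codePoints(const(char)[] s)
{
    size_t n = 0;
    foreach (dchar c; s)   // asking for dchar makes foreach decode
        ++n;
    return n;
}

void main()
{
    // \u00EF is a single character that takes two bytes in UTF-8.
    Text s = "na\u00EFve".dup;
    writeln(codeUnits(s));   // prints 6
    writeln(codePoints(s));  // prints 5
}
```

A newcomer who indexes s[2] here gets one byte of a two-byte sequence, 
not a character, which is exactly the surprise the naming is supposed 
to warn about.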


