string is rarely useful as a function argument

Jakob Ovrum jakobovrum at gmail.com
Fri Dec 30 12:49:40 PST 2011


On Friday, 30 December 2011 at 19:55:45 UTC, Timon Gehr wrote:
> I think the way we have it now is optimal. The only reason we 
> are discussing this is because of fear that uneducated users 
> will write code that does not take into account Unicode 
> characters at or above code point 0x80. But what is the worst 
> thing that can happen?
>
> 1. They don't notice. Then it is not a problem, because they 
> are obviously only using ASCII characters and it is perfectly 
> reasonable to assume that code units and characters are the 
> same thing.
>
> 2. They get screwed up string output, look for the reason, 
> patch up their code with some functions from std.utf and will 
> never make the same mistakes again.
>
>
> I have *never* seen a user in D.learn complain about it. There 
> might have been some I missed, but it is certainly not a 
> prevalent problem. Also, just because a user can type .rep 
> does not mean he understands Unicode: He is able to make just 
> the same mistakes as before, even more so, as the array he is 
> getting back has the _wrong element type_.

I strongly agree with this. It would be nice to have everything 
be simple and work both correctly *and* efficiently, but I don't 
believe the proposed changes are a definite improvement.

In the end, if you don't want to use the standard library or 
other UTF-aware string libraries, you'll have to know the basics 
of UTF to write correct code. I too wish it were harder to write 
incorrect code, but the current solution is simply the best one 
to appear so far.
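
To make "the basics of UTF" concrete, here is a minimal sketch of the 
code unit versus code point distinction for a plain string. The helper 
names countCodeUnits and countCodePoints are made up for illustration 
and are not part of Phobos; walkLength from std.range is the only 
library call used, and it relies on strings being decoded when 
treated as ranges.

```d
import std.range : walkLength;
import std.stdio : writeln;

/// Hypothetical helper: number of UTF-8 code units (bytes) in s.
size_t countCodeUnits(string s)
{
    return s.length;          // .length on a string counts code units
}

/// Hypothetical helper: number of code points in s.
size_t countCodePoints(string s)
{
    return s.walkLength;      // range iteration over a string decodes it
}

void main()
{
    string ascii = "hello";
    string accented = "h\u00E9llo";   // U+00E9 takes two UTF-8 code units

    // For pure ASCII the two counts agree, which is case 1 in the quote.
    writeln(countCodeUnits(ascii), " ", countCodePoints(ascii));       // 5 5

    // With a non-ASCII character they diverge.
    writeln(countCodeUnits(accented), " ", countCodePoints(accented)); // 6 5

    // Indexing yields a code unit: this is only the first byte of U+00E9.
    assert(accented[1] == 0xC3);
}
```

Code that indexes or slices by .length keeps working for ASCII input 
and only misbehaves once a multi-byte character shows up, which is 
the failure mode described in the quoted post.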

