string is rarely useful as a function argument

Jonathan M Davis jmdavisProg at gmx.com
Wed Dec 28 12:39:37 PST 2011


On Wednesday, December 28, 2011 21:25:39 Timon Gehr wrote:
> Why? char and wchar are unicode code units, ubyte/ushort are unsigned
> integrals. It is clear that char/wchar are a better match.

It's an issue of the correct usage being the easy path. As it stands, it's 
incredibly easy to use narrow strings incorrectly. If any array of char or 
wchar required .rep.length instead of .length, the relatively automatic (and 
generally incorrect) use of .length on a string wouldn't immediately work. 
You'd have to work harder to do the wrong thing. Unfortunately, walkLength 
isn't necessarily any easier to use than .rep.length, but the change would 
force people to look into why they can't use .length, which will generally 
better educate them and will hopefully reduce the misuse of narrow strings.
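
To make the distinction concrete, here is a minimal sketch of how .length and 
walkLength disagree on a narrow string as the language stands today. The 
helper names codeUnits and codePoints are made up for illustration; only 
.length and std.range.walkLength are real.

```d
import std.range : walkLength;

// Hypothetical helper: number of UTF-8 code units (what .length reports).
size_t codeUnits(string s) { return s.length; }

// Hypothetical helper: number of code points (decodes the string as it walks).
size_t codePoints(string s) { return s.walkLength; }

void main()
{
    string s = "h\u00E9llo";     // U+00E9 takes two UTF-8 code units
    assert(codeUnits(s) == 6);   // code units, not characters
    assert(codePoints(s) == 5);  // code points
}
```

Note that walkLength counts code points, not graphemes, so it still isn't a 
"number of characters as the user sees them" in the general case.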

If we make rep ubyte[] and ushort[] for char[] and wchar[] respectively, then 
we reinforce the fact that you shouldn't operate on individual chars or 
wchars. It also makes it simple for the compiler to never allow you to use 
length on char[] or wchar[], since it doesn't have to worry about whether you 
got that char[] or wchar[] from a rep property or not.
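
As a rough sketch of what a rep typed as ubyte[]/ushort[] would look like, the 
following uses std.string.representation as a stand-in for the proposed .rep 
property (an assumption on my part; .rep itself does not exist). The helper 
names repLength and wrepLength are hypothetical.

```d
import std.string : representation;

// Hypothetical helper: length of the raw UTF-8 code units of s.
size_t repLength(string s)
{
    immutable(ubyte)[] raw = s.representation;  // integral type, not char
    return raw.length;
}

// Hypothetical helper: length of the raw UTF-16 code units of w.
size_t wrepLength(wstring w)
{
    immutable(ushort)[] raw = w.representation;
    return raw.length;
}

void main()
{
    assert(repLength("h\u00E9llo") == 6);    // two bytes for U+00E9
    assert(wrepLength("h\u00E9llo"w) == 5);  // one UTF-16 unit for U+00E9
}
```

Because the element type is a plain integer, nothing about the result suggests 
that one element is one character.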

Now, I don't know if this is really a good move at this point. If we were to 
really do this right, we'd need to disallow indexing and slicing of the char[] 
and wchar[] as well, which would break that much more code. It also pretty 
quickly makes it look like string should be its own type rather than an array, 
since it's acting less and less like an array. Not to mention, even the 
correct usage of .rep would become rather irritating (e.g. slicing it when you 
know that the indices that you're dealing with aren't going to cut into any 
code points), because you'd have to cast from ubyte[] to char[] whenever you 
did that.
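
Both halves of that can be sketched under the same assumption as before, that 
std.string.representation stands in for the proposed .rep. The helper names 
are hypothetical. The first shows a slice that cuts a code point in half, the 
second the cast you'd need after a slice that you know is safe.

```d
import std.string : representation;
import std.utf : validate, UTFException;

// Hypothetical helper: true if s is well-formed UTF-8.
bool isValidUtf8(string s)
{
    try { validate(s); return true; }
    catch (UTFException) { return false; }
}

// Hypothetical helper: slice the raw code units, then cast back to string.
// The caller must know that `from` falls on a code point boundary.
string sliceFrom(string s, size_t from)
{
    immutable(ubyte)[] raw = s.representation;
    return cast(string) raw[from .. $];
}

void main()
{
    string s = "h\u00E9llo";           // bytes: 'h', 0xC3, 0xA9, 'l', 'l', 'o'
    assert(!isValidUtf8(s[0 .. 2]));   // index 2 is in the middle of U+00E9
    assert(sliceFrom(s, 3) == "llo");  // index 3 is a boundary, so this is fine
}
```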

So, I think that the general sentiment behind this is a good one, but I don't 
know if the exact idea is ultimately a good one - particularly at this stage 
in the game. If we're going to make a change like this which would break as 
much code as this would, we'd need to be _very_ certain that it's what we want 
to do.

- Jonathan M Davis

