Narrow string is not a random access range

Jonathan M Davis jmdavisProg at gmx.com
Tue Oct 23 17:43:48 PDT 2012


On Wednesday, October 24, 2012 01:33:28 Timon Gehr wrote:
> On 10/24/2012 01:07 AM, Jonathan M Davis wrote:
> > On Wednesday, October 24, 2012 00:28:28 Timon Gehr wrote:
> >> The other valid opinion is that the 'mistake' is in Phobos because it
> >> treats narrow character arrays specially.
> > 
> > If it didn't, then range-based functions would be useless for strings in
> > most cases, because it rarely makes sense to operate on code units.
> > 
> >> Note that string is just a name for immutable(char)[]. It would have to
> >> become a struct if random access was to be deprecated.
> > 
> > I think that Andrei was arguing for changing how the compiler itself
> > handles arrays of char and wchar so that they wouldn't have direct random
> > access or length anymore, forcing you to do something like str.rep[6] for
> > random access regardless of what happens with range-based functions.
> > 
> > - Jonathan M Davis
> 
> That idea does not even deserve discussion.

Actually, it solves the problem quite well, because you then have to work at 
misusing strings (of any constness or char type), but it's still extremely 
easy to operate on code units if you want to. However, Walter seems to think 
that everyone should understand Unicode and code for it, in which case it 
would be normal for the programmer to understand all of the quirks of code 
units vs code points and code accordingly. I think that it's pretty clear 
that the average programmer doesn't have a clue about Unicode, though, so if 
the normal string operations do anything which isn't Unicode-aware (e.g. 
length), then lots of programmers are going to screw it up. But since such a 
change would break tons of code, I think that there's pretty much no way that 
it's going to happen at this point, even if it were generally agreed that it 
was the way to go.
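
To make the code unit vs code point distinction concrete, here is a minimal 
sketch of how a narrow string behaves today. It assumes a reasonably recent 
Phobos; std.string.representation is used as the closest existing stand-in 
for the hypothetical str.rep mentioned above, and the sample string is just 
an illustration.

```d
import std.range : hasLength, isRandomAccessRange, walkLength;
import std.string : representation;

// Phobos treats narrow strings as ranges of code points, so they are
// neither random-access ranges nor ranges with a length.
static assert(!isRandomAccessRange!string);
static assert(!hasLength!string);
static assert(isRandomAccessRange!dstring); // UTF-32 is unaffected

void main()
{
    string s = "h\u00E9llo";     // U+00E9 takes two UTF-8 code units

    // The built-in array property counts code units, not characters.
    assert(s.length == 6);

    // Range-based iteration decodes, so it counts code points.
    assert(s.walkLength == 5);

    // Built-in indexing returns a single code unit.
    assert(s[1] == 0xC3);        // first byte of the encoded U+00E9

    // Explicit code unit access, roughly what str.rep would give you.
    immutable(ubyte)[] units = s.representation;
    assert(units.length == 6);
    assert(units[1] == 0xC3 && units[2] == 0xA9);
}
```

The mismatch between s.length and s.walkLength is exactly the kind of thing 
that gets screwed up when the default operations work on code units.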

The alternative, of course, is to create a string type which wraps arrays of 
the various character types, but no one has been able to come up with a design 
for it which was generally accepted. It also risks not working very well with 
string literals and the like, since a string literal would no longer be a 
string (similar to the nonsense that you have to put up with in C++ with 
regard to std::string vs string literals). But even if someone can come up 
with a solid solution, the amount of code which it would break could easily 
disqualify it anyway.
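
As a rough illustration of what such a wrapper might look like, and of the 
literal problem, here is a sketch. WrappedString, rep, and takesWrapped are 
hypothetical names made up for this example, not an existing or proposed 
Phobos design, and only the bare minimum of an input range is implemented.

```d
import std.range : hasLength, isInputRange, walkLength;
import std.utf : decode, stride;

// Hypothetical wrapper type; the name and API are illustrative only.
struct WrappedString
{
    private string data;

    this(string s) { data = s; }

    // Code units are still available, but only on request.
    @property string rep() { return data; }

    // Input-range primitives that iterate by code point.
    @property bool empty() { return data.length == 0; }
    @property dchar front() { size_t i = 0; return decode(data, i); }
    void popFront() { data = data[stride(data, 0) .. $]; }
}

// The wrapper itself exposes no length and no indexing.
static assert(isInputRange!WrappedString);
static assert(!hasLength!WrappedString);

void takesWrapped(WrappedString s) {}

// D has no implicit user-defined conversions, so a bare string literal
// is not accepted where the wrapper type is expected.
static assert(!__traits(compiles, takesWrapped("hello")));

void main()
{
    auto s = WrappedString("h\u00E9llo");
    assert(s.walkLength == 5);   // iterates by code point
    assert(s.rep.length == 6);   // code units via the explicit escape hatch
    takesWrapped(WrappedString("hello")); // the literal must be wrapped by hand
}
```

The last static assert is the part that hurts in practice: every call site 
that passes a literal would need an explicit conversion.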

- Jonathan M Davis


More information about the Digitalmars-d-learn mailing list