Notice/Warning on narrowStrings .length

Jonathan M Davis jmdavisProg at gmx.com
Mon Apr 23 19:14:02 PDT 2012


On Tuesday, April 24, 2012 01:01:57 James Miller wrote:
> I'm writing an introduction/tutorial to using strings in D,
> paying particular attention to the complexities of UTF-8 and 16.
> I realised that when you want the number of characters, you
> normally actually want to use walkLength, not length. Is it
> reasonable for the compiler to pick this up during semantic
> analysis and point out this situation?
> 
> It's just a thought because a lot of the time, using length will
> get the right answer, but for the wrong reasons, resulting in
> lurking bugs. You can always cast to immutable(ubyte)[] or
> immutable(short)[] if you want to work with the actual bytes
> anyway.

At this point, I don't think that it makes any sense to give a warning for 
this. The compiler can't possibly know whether using length is correct, or even 
a good idea, in any particular piece of code. If we really want to do something 
to tackle the problem, then we should create a new string type which better 
solves the issues. Because strings are variable-length encoded, there's a _lot_ 
more to worry about than just their length.
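
As a minimal sketch of the difference under discussion (not part of the 
original message; the sample strings are arbitrary), the following D snippet 
shows that length counts code units while walkLength counts code points, and 
that indexing and slicing also operate on code units. Note that walkLength 
still counts code points, which is not always the same as user-perceived 
characters when combining marks are involved.

```d
import std.range : walkLength;
import std.utf : stride;

void main()
{
    // Pure ASCII: length gives the "right" answer, but only by coincidence.
    string ascii = "hello";
    assert(ascii.length == 5);
    assert(ascii.walkLength == 5);

    // U+00E9 (e with acute accent) takes two UTF-8 code units.
    string s = "h\u00E9llo";
    assert(s.length == 6);        // code units (bytes)
    assert(s.walkLength == 5);    // code points, found by decoding: O(n)

    // Indexing and slicing work on code units as well.
    assert(s[1] == 0xC3);         // only the first byte of the two-byte sequence
    assert(stride(s, 1) == 2);    // the code point at index 1 spans two code units
    assert(s[0 .. 1] == "h");     // slice boundaries are code unit offsets

    // The same text as UTF-16 happens to need one code unit per code point.
    wstring w = "h\u00E9llo"w;
    assert(w.length == 5);
}
```

Slicing at an offset that falls inside a multi-byte sequence (for example 
s[0 .. 2] above) produces invalid UTF-8, which is one of the issues beyond 
length that a warning on .length would not catch.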

There has been talk of creating a new string type, and there has been talk of 
creating the concept of a variable-length encoded range which better handles 
all of this, but no proposal thus far has gotten anywhere.

As for walkLength being O(n) in many cases (as discussed elsewhere in this 
thread), I don't think that it's that big a deal. If you know what it's doing, 
you know that it's O(n), and it's easy enough to save the result if you need 
it more than once.
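
A small sketch of saving the result, assuming a hypothetical helper named 
padRight that pads a string to a width measured in code points (the name and 
behaviour are illustrative, not from Phobos or the original message):

```d
import std.array : replicate;
import std.range : walkLength;

// Hypothetical helper: pad s with spaces up to `width` code points.
string padRight(string s, size_t width)
{
    // One O(n) decoding pass; the stored count is reused twice below
    // instead of calling walkLength again.
    immutable n = s.walkLength;
    return n >= width ? s : s ~ replicate(" ", width - n);
}

void main()
{
    auto padded = padRight("caf\u00E9", 6);   // 4 code points, 5 code units
    assert(padded.walkLength == 6);           // padded to 6 code points
    assert(padded.length == 7);               // 5 bytes of text + 2 spaces
    assert(padRight("hello world", 6) == "hello world");  // already wide enough
}
```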

- Jonathan M Davis

