char[] annoyance...

Kevin Bealer Kevin_member at pathlink.com
Mon Apr 10 09:28:00 PDT 2006


In article <ops7rncsol23k2f5 at nrage.netwin.co.nz>, Regan Heath says...
>
>Take this code:
>
>void main()
>{
>	//..open a file, read line for line, on each line:
>
>	for(int i = 0; i < line.length-2; i++) {
>		if (line[i..i+2] != "||") continue;
>		//..etc..
>	}
>}
>
>
>There is a subtle bug. On all lines with a length of 0 or 1 it will give  
>the following error:
>
>Error: ArrayBoundsError line_length.d(6)
>
>
>The problem is of course the statement "i < line.length-2". line.length is  
>unsigned, and when you - 2 from an unsigned value.. well lets just say  
>that it's bigger than the actual length of the line - 2.
>
>
>Of course there are plenty of other ways to code this, perhaps using  
>foreach, but that's not the point. The point is that this code  _can_ be  
>written and on the surface looks fine. Not even -w (warnings) spots the  
>signed/unsigned problem. At the very least can we get a warning for this?
>
>Regan

I see the "gotcha" here, and whether conversions should be done is an
interesting question.  But I wanted to propose a slightly different solution.

> for(int i = 0; i+2 < line.length; i++) {

Since you reference "i+2" in the expression, that is really the value that
needs to be in the [0..length) range.

More generally, in cases like this, always do the arithmetic on the signed
variables whenever there is a possibility of wraparound.
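
As a rough illustration of the rule above, here is a minimal D sketch. The
helper names (wrappedLimit, countDoubleBars) are made up for this example and
are not from the original code; only the two loop conditions come from the
thread.

```d
// Hypothetical helpers for illustration; names are not from the thread.

// The limit used by the original condition "i < line.length-2".
// line.length is unsigned (size_t), so for lengths 0 and 1 the
// subtraction wraps around to a huge value instead of going negative.
size_t wrappedLimit(const(char)[] line)
{
    return line.length - 2;
}

// Count "||" pairs using the rewritten condition "i+2 < line.length".
// The addition happens on the signed int before the comparison,
// so nothing wraps for short or empty lines.
int countDoubleBars(const(char)[] line)
{
    int count = 0;
    for (int i = 0; i + 2 < line.length; i++)
    {
        if (line[i .. i + 2] == "||")
            count++;
    }
    return count;
}
```

With an empty line, wrappedLimit returns size_t.max - 1, which is why the
original loop runs and the slice goes out of bounds, while countDoubleBars
simply returns 0. Note that, like the original condition, "i+2 < line.length"
stops one position short of the last possible two-character slice; whether
that is intended depends on the elided loop body.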

But if the automatic conversions were specified to do uint->int that would also
be fine with me.

One can also imagine a language with the following inequalities:

+>  unsigned greater-than
+<  unsigned less-than
->  signed greater-than
-<  signed less-than
+>= unsigned greater-or-equal
etc

This would not replace the current >, <, but would essentially just be shorthand
for existing expressions:

(A +> B) is the same as (cast(uint)A > cast(uint)B)

.. and so on, except that uint, ulong, ushort, etc would be chosen as needed.
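
Since D has no such operators, the idea can only be approximated with
ordinary functions. The sketch below uses two hypothetical helpers,
unsignedLess and signedLess, standing in for "+<" and "-<", and is fixed to
int/uint rather than choosing the width as needed.

```d
// Hypothetical stand-ins for the imagined "+<" and "-<" operators.
// Fixed to int and uint for simplicity; a real version would pick
// uint, ulong, ushort, etc. based on the operand types.

// (A +< B): compare both operands as unsigned.
bool unsignedLess(int a, uint b)
{
    return cast(uint) a < cast(uint) b;
}

// (A -< B): compare both operands as signed.
bool signedLess(int a, uint b)
{
    return cast(int) a < cast(int) b;
}
```

For example, unsignedLess(-1, 2) is false because -1 becomes 4294967295 when
cast to uint, while signedLess(-1, 2) is true.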

Is this worth doing?  Maybe - it's not a big deal to me, but the signed/unsigned
question does represent a common gotcha in certain expressions.

Kevin




