char[] annoyance...

Mon Apr 10 15:33:54 PDT 2006

On Mon, 10 Apr 2006 16:28:00 +0000 (UTC), Kevin Bealer  
<Kevin_member at pathlink.com> wrote:
> In article <ops7rncsol23k2f5 at nrage.netwin.co.nz>, Regan Heath says...
>>
>> Take this code:
>>
>> void main()
>> {
>> 	//..open a file, read line for line, on each line:
>>
>> 	for(int i = 0; i < line.length-2; i++) {
>> 		if (line[i..i+2] != "||") continue;
>> 		//..etc..
>> 	}
>> }
>>
>>
>> There is a subtle bug. On all lines with a length of 0 or 1 it will give
>> the following error:
>>
>> Error: ArrayBoundsError line_length.d(6)
>>
>>
>> The problem is of course the statement "i < line.length-2". line.length is
>> unsigned, and when you subtract 2 from an unsigned value of 0 or 1 it wraps
>> around to a huge number instead of going negative, so the loop bound ends
>> up far bigger than the actual length of the line.
>>
>>
>> Of course there are plenty of other ways to code this, perhaps using
>> foreach, but that's not the point. The point is that this code _can_ be
>> written and on the surface looks fine. Not even -w (warnings) spots the
>> signed/unsigned problem. At the very least can we get a warning for this?
>>
>> Regan
>
> I see the "gotcha" here, and whether conversions should be done is an
> interesting question.  But I wanted to propose a slightly different  
> solution.
>
>> for(int i = 0; i+2 < line.length; i++) {
>
> Since you are referencing "i+2" in the expression, that is really the
> value that needs to be in the [0..length) range.

You're probably right.. now, if only I could train my brain to think of it  
that way round :)
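
For reference, a minimal sketch of the rearranged condition next to what the original bound computes. The helper names countDoublePipes and originalBound are made up for illustration and are not from the code above; the loop body is an assumption about what the "..etc.." part might do.

```d
// Hypothetical helper, not from the thread: counts the "||" pairs that the
// loop visits, using the rearranged condition "i + 2 < line.length".
size_t countDoublePipes(const(char)[] line)
{
    size_t count = 0;
    // Only additions are performed on i, so nothing can wrap for short
    // lines. The original "i < line.length - 2" computes 0 - 2 or 1 - 2
    // in unsigned arithmetic, which wraps to a huge bound.
    for (size_t i = 0; i + 2 < line.length; i++)
    {
        if (line[i .. i + 2] == "||")
            count++;
    }
    return count;
}

// What the original loop bound evaluates to for a given length.
size_t originalBound(size_t length)
{
    return length - 2; // wraps around when length is 0 or 1
}
```

With this form, empty and one-character lines simply skip the loop. Like the original condition, it stops one position short of the end of the line.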

> More generally, always do the arithmetic with the signed variables in  
> cases like this if there is a possibility of wraparound.

That's good advice, however you have to realise there is a possibility of  
wraparound, something I didn't do (or I wouldn't have had the problem in  
the first place) :(

> But if the automatic conversions were specified to do uint->int that  
> would also be fine with me.

But wouldn't that introduce a bug here:
   int a = int.max-1;
   uint b = int.max+1;
   assert(a < b);

Wouldn't b be converted to a signed value of <some negative number>?
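
To make the worry concrete, here is a rough sketch of both conversion directions written out with explicit casts. The function names are hypothetical, and the uint->int rule is only my reading of the proposal, not anything the compiler does today.

```d
// The two values from the example above.
enum int  a = int.max - 1;   // 2147483646
enum uint b = int.max + 1u;  // 2147483648

// Assumed reading of the proposed rule: convert the unsigned operand to
// int before comparing.
bool lessWithUintToInt(int x, uint y)
{
    return x < cast(int) y;
}

// The opposite direction: convert the signed operand to uint.
bool lessWithIntToUint(int x, uint y)
{
    return cast(uint) x < y;
}
```

Under the uint->int reading, b becomes int.min and "a < b" fails. Under int->uint the same comparison holds, but a negative left operand such as -1 turns into a huge unsigned value instead.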

> One can also imagine a language with the following inequalities:
>
> +>  unsigned greater-than
> +<  unsigned less than
> ->  signed greater than
> -<  signed less than
> +>= unsigned greater-or-equal
> etc
>
> This would not replace the current >, <, but would essentially just be  
> shorthand for existing expressions:
>
> (A +> B) is the same as (cast(uint)A > cast(uint)B)
>
> .. and so on, except that uint, ulong, ushort, etc would be chosen as  
> needed.
>
> Is this worth doing?  Maybe - it's not a big deal to me, but the  
> signed/unsigned question does represent a common gotcha in certain  
> expressions.

It's a good idea, but same problem as before: you have to realise length  
is unsigned and you have to realise there is a chance of wraparound. A  
warning about the signed/unsigned comparison is the very least we should  
do. I would even be tempted to make it an outright error requiring a cast  
(or one of the new operators above) to handle.
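
As a rough idea of what such operators would expand to, here is a sketch using ordinary template functions. The names unsignedLess and signedLess are hypothetical stand-ins for "+<" and "-<", and picking the common type via typeof(a + b) is my assumption about what "chosen as needed" would mean.

```d
import std.traits : Signed, Unsigned;

// Hypothetical stand-in for "A +< B": compare both operands as unsigned.
bool unsignedLess(A, B)(A a, B b)
{
    alias C = typeof(a + b); // common type after the usual conversions
    return cast(Unsigned!C) a < cast(Unsigned!C) b;
}

// Hypothetical stand-in for "A -< B": compare both operands as signed.
bool signedLess(A, B)(A a, B b)
{
    alias C = typeof(a + b);
    return cast(Signed!C) a < cast(Signed!C) b;
}
```

With something like this, the loop condition could be spelled signedLess(i, line.length - 2), which is false for a line of length 0 or 1 because the wrapped bound is read back as a negative number. You would still have to know to reach for it, though.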

Regan