Portability bug in integral conversion

Graham St Jack Graham.StJack at internode.on.net
Sun Jan 16 19:32:51 PST 2011


On 17/01/11 13:30, Andrei Alexandrescu wrote:
> On 1/16/11 7:51 PM, Graham St Jack wrote:
>> On 17/01/11 10:39, Andrei Alexandrescu wrote:
>>> On 1/16/11 5:24 PM, Graham St Jack wrote:
>>>> On 16/01/11 08:52, Andrei Alexandrescu wrote:
>>>>> We've spent a lot of time trying to improve the behavior of integral
>>>>> types in D. For the most part, we succeeded, but the success was
>>>>> partial. There was some hope with the polysemy notion, but it
>>>>> ultimately was abandoned because it was deemed too difficult to
>>>>> implement for its benefits, which were considered solving a minor
>>>>> annoyance. I was sorry to see it go, and I'm glad that now its day of
>>>>> reckoning has come.
>>>>>
>>>>> Some of the 32-64 portability bugs have come in the following form:
>>>>>
>>>>> char * p;
>>>>> uint a, b;
>>>>> ...
>>>>> p += a - b;
>>>>>
>>>>> On 32 bits, the code works even if a < b: the difference will become a
>>>>> large unsigned number, which is then converted to a size_t (which is a
>>>>> no-op since size_t is uint) and added to p. The pointer itself is a
>>>>> 32-bit quantity. Due to two's complement properties, the addition has
>>>>> the same result regardless of the signedness of its operands.
>>>>>
>>>>> On 64-bits, the same code has different behavior. The difference a - b
>>>>> becomes a large unsigned number (say e.g. 4 billion), which is then
>>>>> converted to a 64-bit size_t. After conversion the sign is not
>>>>> extended - so we end up with the number 4 billion on 64-bit. That is
>>>>> added to a 64-bit pointer yielding an incorrect value. For the
>>>>> wraparound to work, the 32-bit uint should have been sign-extended to
>>>>> 64 bit.
>>>>>
>>>>> To fix this problem, one possibility is to mark statically every
>>>>> result of one of uint-uint, uint+int, uint-int as "non-extensible",
>>>>> i.e. as impossible to implicitly extend to a 64-bit value. That would
>>>>> force the user to insert a cast appropriately.
>>>>>
>>>>> Thoughts? Ideas?
>>>>>
>>>>>
>>>>> Andrei
>>>> It seems to me that the real problem here is that it isn't meaningful to
>>>> perform (a-b) on unsigned integers when (a<b). Attempting to clean up
>>>> the resultant mess is really papering over the problem. How about a
>>>> runtime error instead, much like dividing by 0?
>>>
>>> That's too inefficient.
>>>
>>> Andrei
>>
>> If that is the case, then a static check like you are suggesting seems
>> like a good way to go. Sure it will be annoying, but it will pick up a
>> lot of bugs.
>>
>> This particular problem is one that bites me from time to time because
>> I tend to use uints wherever it isn't meaningful to have negative
>> values. It is great until I need to do a subtraction, when I sometimes
>> forget to check which is greater. Would the check you have in mind
>> statically check the following as ok?
>>
>> where a and b are uints and ptr is a pointer:
>>
>> if (a > b) {
>> ptr += (a-b);
>> }
>
> That would require flow analysis. I'm not sure we want to embark on 
> that ship. In certain situations value range propagation could take 
> care of it.
>
> Andrei
>

My fear is that if a cast is always required, people will just put one 
in out of habit and we are no better off (just like exception-swallowing).

Is the cost of run-time checking really prohibitive? Correct code should 
have some checking anyway. Maybe providing phobos functions to perform 
various correct-usage operations with run-time checks like in my code 
fragment above would be useful. They could do the cast, and most of the 
annoyance factor would be dealt with. A trivial example:

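int difference(uint a, uint b) {
   // Parenthesize the subtraction: cast binds tighter than binary minus,
   // so "cast(int) a-b" would cast only a.
   // Note: the result still overflows int if the magnitude exceeds int.max.
   if (a >= b) {
     return cast(int) (a-b);
   }
   else {
     return -(cast(int) (b-a));
   }
}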
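
To make the 32/64-bit difference concrete, here is a rough sketch of how such a helper might be used. The name signedDifference and the choice to widen to long are my own assumptions for illustration, not an existing Phobos function. Widening avoids the overflow that an int result can still hit when the two values are more than int.max apart.

```d
// Hypothetical helper (not part of Phobos): signed difference of two uints.
// Widening to long means every possible result, from -uint.max to
// +uint.max, is representable.
long signedDifference(uint a, uint b)
{
    return cast(long) a - cast(long) b;
}

void main()
{
    uint a = 3, b = 5;

    // Unsigned subtraction wraps around modulo 2^32.
    uint wrapped = a - b;
    assert(wrapped == 4_294_967_294u);

    // Implicit widening zero-extends, so the value stays huge and positive.
    // This is the quantity a 64-bit build would add to the pointer.
    ulong zeroExtended = wrapped;
    assert(zeroExtended == 4_294_967_294UL);

    // The helper yields the intended small negative offset instead.
    assert(signedDifference(a, b) == -2);

    // Applying the offset to a pointer; the cast to ptrdiff_t is explicit
    // so the same line also compiles where ptrdiff_t is 32 bits wide.
    char[16] buf;
    char* p = buf.ptr + 8;
    p += cast(ptrdiff_t) signedDifference(a, b);
    assert(p == buf.ptr + 6);
}
```

The cast then lives in one audited place rather than being sprinkled through user code out of habit.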

-- 
Graham St Jack



More information about the Digitalmars-d mailing list