Portability bug in integral conversion

Sun Jan 16 19:00:37 PST 2011

On 1/16/11 7:51 PM, Graham St Jack wrote:
> On 17/01/11 10:39, Andrei Alexandrescu wrote:
>> On 1/16/11 5:24 PM, Graham St Jack wrote:
>>> On 16/01/11 08:52, Andrei Alexandrescu wrote:
>>>> We've spent a lot of time trying to improve the behavior of integral
>>>> types in D. For the most part, we succeeded, but the success was
>>>> partial. There was some hope with the polysemy notion, but it
>>>> ultimately was abandoned because it was deemed too difficult to
>>>> implement for its benefits, which were considered to solve only a
>>>> minor annoyance. I was sorry to see it go, and I'm glad that now its
>>>> day of reckoning has come.
>>>>
>>>> Some of the 32-64 portability bugs have come in the following form:
>>>>
>>>> char * p;
>>>> uint a, b;
>>>> ...
>>>> p += a - b;
>>>>
>>>> On 32 bits, the code works even if a < b: the difference will become a
>>>> large unsigned number, which is then converted to a size_t (which is a
>>>> no-op since size_t is uint) and added to p. The pointer itself is a
>>>> 32-bit quantity. Due to two's complement properties, the addition has
>>>> the same result regardless of the signedness of its operands.
>>>>
>>>> On 64 bits, the same code has different behavior. The difference a - b
>>>> becomes a large unsigned number (say, 4 billion), which is then
>>>> converted to a 64-bit size_t. After conversion the sign is not
>>>> extended - so we end up with the number 4 billion on 64-bit. That is
>>>> added to a 64-bit pointer yielding an incorrect value. For the
>>>> wraparound to work, the 32-bit uint should have been sign-extended to
>>>> 64 bit.
>>>>
>>>> To fix this problem, one possibility is to mark statically every
>>>> result of one of uint-uint, uint+int, uint-int as "non-extensible",
>>>> i.e. as impossible to implicitly extend to a 64-bit value. That would
>>>> force the user to insert a cast appropriately.
>>>>
>>>> Thoughts? Ideas?
>>>>
>>>>
>>>> Andrei
>>> It seems to me that the real problem here is that it isn't meaningful to
>>> perform (a-b) on unsigned integers when (a<b). Attempting to clean up
>>> the resultant mess is really papering over the problem. How about a
>>> runtime error instead, much like dividing by 0?
>>
>> That's too inefficient.
>>
>> Andrei
>
> If that is the case, then a static check like you are suggesting seems
> like a good way to go. Sure it will be annoying, but it will pick up a
> lot of bugs.
>
> This particular problem is one that bites me from time to time because
> I tend to use uints wherever it isn't meaningful to have negative
> values. It is great until I need to do a subtraction, when I sometimes
> forget to check which is greater. Would the check you have in mind
> statically check the following as ok?
>
> where a and b are uints and ptr is a pointer:
>
> if (a > b) {
>     ptr += (a-b);
> }

That would require flow analysis. I'm not sure we want to embark on that 
ship. In certain situations value range propagation could take care of it.
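
For concreteness, here is a minimal sketch of the behavior under discussion and of the cast the proposed rule would ask for. The helper names (widenZeroExtended, widenSignExtended) are made up for illustration, and ulong/long stand in for a 64-bit size_t so the result is the same on any target.

```d
// Illustrative helpers only; these names are not part of any library.

// What the implicit conversion does: uint -> ulong zero-extends.
ulong widenZeroExtended(uint a, uint b)
{
    return a - b;            // wraps modulo 2^32, then is zero-extended
}

// What the pointer arithmetic needs: go through int first so that the
// widening sign-extends.
long widenSignExtended(uint a, uint b)
{
    return cast(int)(a - b);
}

void main()
{
    uint a = 3, b = 5;

    assert(widenZeroExtended(a, b) == 4_294_967_294UL); // "4 billion"
    assert(widenSignExtended(a, b) == -2);

    char[16] buf;
    char* p = buf.ptr + 8;

    // With the explicit cast the offset is -2 on both 32 and 64 bits.
    char* q = p + cast(int)(a - b);
    assert(q == buf.ptr + 6);

    // The guarded form from the question: fine at run time, but accepting
    // it without a cast would need the compiler to track the condition.
    if (a > b)
        p += a - b;
    assert(p == buf.ptr + 8); // guard was false here, so p is unchanged
}
```

The cast to int is one way to spell the intent; whether the rule should ask for that or for something else is exactly what is open.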

Andrei