Treating the abusive unsigned syndrome
Andrei Alexandrescu
SeeWebsiteForEmail at erdani.org
Thu Nov 27 14:23:12 PST 2008
Derek Parnell wrote:
> On Tue, 25 Nov 2008 09:59:01 -0600, Andrei Alexandrescu wrote:
>
>> D pursues compatibility with C and C++ in the following manner: if a
>> code snippet compiles in both C and D or C++ and D, then it should have
>> the same semantics.
>
> Interesting ... but I don't think that this should be the principle
> employed. If code is 'naughty' in C/C++ then D should not also produce the
> same results.
>
> I would propose that a better principle to be used would be that the
> compiler will not allow loss or distortion of information without the
> coder/reader being made aware of it.
These two principles are not necessarily at odds with each other. The
idea of being compatible with C and C++ is simple: if I paste a C
function from somewhere into a D module, the function should either not
compile, or compile and run with the same result. I think that's quite
reasonable. So if the C code is behaving naughtily, D doesn't need to
behave naughtily as well. It should just not compile.
>> (1) u + i, i + u
>> (2) u - i, i - u
>> (3) u - u
>> (4) u * i, i * u, u / i, i / u, u % i, i % u (compatibility with C
>> requires that these all return unsigned, ouch)
>> (5) u < i, i < u, u <= i etc. (all ordering comparisons)
>> (6) -u
>
> Note that "(3) u - u" and "(6) -u" seem to be really a use of (4), namely
> "(-1 * u)".
Correct.
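For concreteness, here is a minimal sketch (the names and values are purely
illustrative) of what the current C-inherited rules do for cases (1) and (4):

import std.stdio;

void main()
{
    uint u = 2;
    int  i = -3;

    // Case (1): the int operand is converted to uint, so the result wraps.
    auto sum = u + i;
    writeln(typeof(sum).stringof); // prints "uint"
    writeln(sum);                  // prints 4294967295, not -1

    // Case (4): same story for multiplication.
    auto prod = u * i;
    writeln(prod);                 // prints 4294967290, not -6
}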
> I am assuming that there is no difference between 'unsigned' and 'positive',
> in so much as I am not treating 'unsigned' as 'sign unknown/irrelevant'.
>
> It seems to me that the issue then is not so much one of sign but of size.
> It needs an extra bit to hold the sign information thus a 32-bit unsigned
> value needs a minimum of 33 bits to convert it to a signed equivalent.
>
> In the types (1) - (4) above, I would have the compiler compute a signed
> type for these. Then if the target of the result is a signed type AND
> larger than the 'unsigned' portion used, then the compiler would not have
> to complain. In every other case the compiler should complain because of
> the potential for information loss. To avoid the complaint, the coder would
> need to either change the result type, the input types or add a 'message'
> to the compiler that in effect says "I know what I'm doing, ok?" - I
> suggest a cast would suffice.
>
> In those cases where the target type is not explicitly coded, such as using
> 'auto' or as a temporary value in an expression, the compiler should assume
> a signed type that is 'one step' larger than the 'unsigned' element in the
> expression.
>
> e.g. (given an int i and a uint u)
> auto x = i * u; ==> 'x' is long.
I don't think this will fly with Walter.
> If this causes code to be incompatible to C/C++, then it implies that the
> C/C++ code was poor (i.e. potential information loss) in the first place
> and deserves to be fixed up.
I don't quite think so. As long as the values are within range, the
multiplication is legit and efficient.
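To illustrate with made-up values: the 32-bit mixed multiplication already
gives the right answer when the operands are in range, whereas the proposal
above, as I read it, would widen every such expression to 64 bits:

import std.stdio;

void main()
{
    uint u = 1000;
    int  i = 42;

    auto x = u * i;               // today: typeof(x) is uint, value 42000
    writeln(typeof(x).stringof, " ", x);

    long wide = cast(long) u * i; // the widened result the proposal would infer
    writeln(wide);                // same value, computed at 64 bits
}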
> The scenario (5) above should also include equality comparisons, and
> should cause the compiler to issue a message AND generate code like ...
>
> if (u < i) ====> if ( i < 0 ? false : u < cast(typeof(u))i)
> if (u <= i) ====> if ( i < 0 ? false : u <= cast(typeof(u))i)
> if (u == i) ====> if ( i < 0 ? false : u == cast(typeof(u))i)
> if (u >= i) ====> if ( i < 0 ? true : u >= cast(typeof(u))i)
> if (u > i) ====> if ( i < 0 ? true : u > cast(typeof(u))i)
>
> The coder should be able to avoid the message and the suboptimal generated
> code by adding a cast ...
>
> if (u < cast(typeof(u))i)
Yah, comparisons need to be looked at too.
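As a sketch only (the helper names are mine, nothing standard), the rewrite
above amounts to something like the following when written out as functions:

import std.stdio;

// A negative signed operand is ordered below, and never equal to, any
// unsigned value; otherwise the comparison happens in the unsigned domain.
bool lessThan(uint u, int i)
{
    return i < 0 ? false : u < cast(uint) i;
}

bool equals(uint u, int i)
{
    return i < 0 ? false : u == cast(uint) i;
}

void main()
{
    uint u = 5;
    writeln(lessThan(u, -1)); // false; the raw expression u < -1 is true today
    writeln(equals(u, 5));    // true
}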
Andrei