Treating the abusive unsigned syndrome
KennyTM~
kennytm at gmail.com
Thu Nov 27 13:31:11 PST 2008
KennyTM~ wrote:
> Andrei Alexandrescu wrote:
>> Don wrote:
>>> Andrei Alexandrescu wrote:
>>>> Don wrote:
>>>>> Andrei Alexandrescu wrote:
>>>>>> One fear of mine is the reaction of throwing one's hands in the air:
>>>>>> "how many integral types are enough???". However, if we're to
>>>>>> judge by the addition of long long and a slew of typedefs to C99
>>>>>> and C++0x, the answer is "plenty". I'd be interested in gauging how
>>>>>> people feel about adding two (bits64, bits32) or even four
>>>>>> (bits64, bits32, bits16, and bits8) types as basic types. They'd
>>>>>> be bitbags with undecided sign ready to be converted to their
>>>>>> counterparts of decided sign.
>>>>>
>>>>> Here I think we have a fundamental disagreement: what is an
>>>>> 'unsigned int'? There are two disparate ideas:
>>>>>
>>>>> (A) You think that it is an approximation to a natural number, ie,
>>>>> a 'positive int'.
>>>>> (B) I think that it is a 'number with NO sign'; that is, the sign
>>>>> depends on context. It may, for example, be part of a larger
>>>>> number. Thus, I largely agree with the C behaviour -- once you have
>>>>> an unsigned in a calculation, it's up to the programmer to provide
>>>>> an interpretation.
>>>>>
>>>>> Unfortunately, the two concepts are mashed together in C-family
>>>>> languages. (B) is the concept supported by the language typing
>>>>> rules, but usage of (A) is widespread in practice.
>>>>
>>>> In fact we are in agreement. C tries to make it usable as both, and
>>>> partially succeeds by having very lax conversions in all directions.
>>>> This leads to occasional puzzling behavior. I do *want* uint to
>>>> be an approximation of a natural number, while acknowledging that
>>>> today it isn't much of that.
>>>>
>>>>> If we were going to introduce a slew of new types, I'd want them to
>>>>> be for 'positive int'/'natural int', 'positive byte', etc.
>>>>>
>>>>> Natural int can always be implicitly converted to either int or
>>>>> uint, with perfect safety. No other conversions are possible
>>>>> without a cast.
>>>>> Non-negative literals and manifest constants are naturals.
>>>>>
>>>>> The rules are:
>>>>> 1. Anything involving unsigned is unsigned (same as C).
>>>>> 2. Else if it contains an integer, it is an integer.
>>>>> 3. (Now we know all quantities are natural):
>>>>> If it contains a subtraction, it is an integer [Probably allow
>>>>> subtraction of compile-time quantities to remain natural, if the
>>>>> values stay in range; flag an error if an overflow occurs].
>>>>> 4. Else it is a natural.
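>>>>>
>>>>> For example (a rough sketch only; 'natural' is the proposed type, not
>>>>> anything D has today):
>>>>>
>>>>>     natural n = 3;     // non-negative literal: natural
>>>>>     uint u;  int i;
>>>>>     auto a = n + u;    // rule 1: involves unsigned -> uint
>>>>>     auto b = n + i;    // rule 2: involves int -> int
>>>>>     auto c = n - 1;    // rule 3: contains a subtraction -> int
>>>>>     enum e = 3 - 1;    // ...though a compile-time subtraction that
>>>>>                        //    stays in range could remain natural
>>>>>     auto d = n + 1;    // rule 4: all operands natural -> natural
>>>>>     i = d;  u = d;     // natural implicitly converts to int or uint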
>>>>>
>>>>>
>>>>> The reason I think literals and manifest constants are so important
>>>>> is that they are a significant fraction of the natural numbers in a
>>>>> program.
>>>>>
>>>>> [Just before posting I've discovered that other people have posted
>>>>> some similar ideas].
>>>>
>>>> That sounds encouraging. One problem is that your approach leaves
>>>> the unsigned mess as it is, so although natural types are a nice
>>>> addition, they don't bring a complete solution to the table.
>>>>
>>>>
>>>> Andrei
>>>
>>> Well, it does make unsigned numbers (case (B)) quite obscure and
>>> low-level. They could be renamed with uglier names to make this clearer.
>>> But since in this proposal there are no implicit conversions from
>>> uint to anything, it's hard to do any damage with the unsigned type
>>> which results.
>>> Basically, with any use of unsigned, the compiler says "I don't know
>>> if this thing even has a meaningful sign!".
>>>
>>> Alternatively, we could add rule 0: mixing int and unsigned is
>>> illegal. But it's OK to mix natural with int, or natural with unsigned.
>>> I don't like this as much, since it would make most usage of unsigned
>>> ugly; but maybe that's justified.
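>>>
>>> Sketch again (hypothetical types):
>>>
>>>     int i;  uint u;  natural n;
>>>     auto a = i + u;   // rule 0: error, mixing int and unsigned
>>>     auto b = i + n;   // OK: int
>>>     auto c = u + n;   // OK: uint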
>>
>> I think we're heading towards an impasse. We wouldn't want to make
>> things much harder for systems-level programs that mix arithmetic and
>> bit-level operations.
>>
>> I'm glad there is interest and that quite a few ideas were brought up.
>> Unfortunately, it looks like all have significant disadvantages.
>>
>> One compromise solution Walter and I discussed in the past is to only
>> sever one of the dangerous implicit conversions: int -> uint. Other
>> than that, it's much like C (everything involving one unsigned is
>> unsigned and unsigned -> signed is implicit). Let's see where that
>> takes us.
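>>
>> Concretely (a sketch of the compromise, not current behavior):
>>
>>     uint u;  int i;
>>     i = u;             // still implicit: unsigned -> signed, as in C
>>     u = i;             // error: the int -> uint conversion is severed
>>     u = cast(uint) i;  // an explicit cast is still fine
>>     auto x = u + i;    // still uint, as in C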
>>
>> (a) There are fewer situations when a small, reasonable number
>> implicitly becomes a large, weird number.
>>
>> (b) An exception to (a) is that u1 - u2 is also uint, and that's for
>> the sake of C compatibility. I'd gladly drop it if I could and let
>> operations such as u1 - u2 return a signed number. That assumes the
>> least and works with small, usual values.
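>>
>> For example, with today's rules:
>>
>>     uint u1 = 3, u2 = 5;
>>     auto d = u1 - u2;          // uint, as in C
>>     assert(d == 4294967294u);  // wraps around instead of yielding -2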
>>
>> (c) Unlike C, arithmetic and logical operations always return the
>> tightest type possible, not a 32/64 bit value. For example, byte / int
>> yields byte and so on.
>>
>
> So you mean long * int (e.g. 1234567890123L * 2) will return an int
> instead of a long?!
>
> The opposite sounds more natural to me.
>
Em, or do you mean the tightest type that can represent all possible
results? (so long*int == cent?)
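
E.g. (just to illustrate the two readings, with cent as the still-hypothetical
128-bit type):

    long l = 1234567890123L;  int i = 2;
    auto a = l * i;   // "tightest operand type" would give int, which
                      // overflows; the tightest type that fits every
                      // possible result of long * int would be cent.
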
>> What do you think?
>>
>>
>> Andrei