Value Preservation and Polysemy -> context dependent integer literals

Fri Dec 5 18:16:03 PST 2008

On 2008-12-05 16:27:01 +0100, Andrei Alexandrescu 
<SeeWebsiteForEmail at erdani.org> said:

> Fawzi Mohamed wrote:
>> On 2008-12-05 07:02:37 +0100, Andrei Alexandrescu 
>> <SeeWebsiteForEmail at erdani.org> said:
>> 
>>> [...]
>>> Well any integral value carries:
>>> 
>>> a) type as per the C rule
>>> 
>>> b) minimum value possible
>>> 
>>> c) maximum value possible
>>> 
>>> The type stays the type as per the C rule, so there's no change there. 
>>> If (and only if) a *narrower* type is asked as a conversion target for 
>>> the value, the range is consulted. If the range is too large, the 
>>> conversion fails.
>>> 
>>> Andrei
>> 
>> basically the implicit conversion rules of C disallowing automatic 
>> unsigned/signed conversions to unsigned?
>> Fawzi
>> 
> 
> Where's the predicate? I don't understand the question.
> 
> Andrei

The implicit conversion rules in C for arithmetic operations allow 
up-conversion of types: basically the largest type present is used, 
which already almost respects a, b, c (using C names):

1) if long double is present everything is converted to it
2) otherwise if double is present everything is converted to it
3) otherwise if float is present everything is converted to it

if only signed or only unsigned integers are present they are ranked in 
the following sequence
	char, short, int, long, long long
and everything is converted to the largest type (largest rank)

If the range of the signed integer includes the range of the unsigned 
integer everything is converted to the signed type.
These rules respect a, b, c.

for example
    ushort us=1;
    printf("%g\n",1.0*(-34+us));
prints -33, as one would expect.
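
As a rough sketch of the value-preserving rules above, C11's _Generic 
can report the type the compiler picks for a mixed expression. 
TYPE_NAME and the function names are hypothetical helpers, not part of 
any standard header, and the unsigned short case assumes that int is 
wider than unsigned short (true on common platforms).

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: name of the static type of an expression (C11). */
#define TYPE_NAME(expr) _Generic((expr), \
    int: "int", \
    unsigned int: "unsigned int", \
    long: "long", \
    unsigned long: "unsigned long", \
    long long: "long long", \
    unsigned long long: "unsigned long long", \
    float: "float", \
    double: "double", \
    long double: "long double", \
    default: "other")

/* Rules 1-3: the widest floating-point type present wins. */
const char *float_and_int(void)      { return TYPE_NAME(1.0f + 1); }
const char *double_and_float(void)   { return TYPE_NAME(1.0 + 1.0f); }
const char *ldouble_and_double(void) { return TYPE_NAME(1.0L + 1.0); }

/* Same signedness: the operand with the larger rank wins. */
const char *short_and_long(void) { short s = 1; return TYPE_NAME(s + 1L); }

/* Assumption: int can represent every unsigned short value,
   so the unsigned short is promoted to int and the value is preserved. */
const char *ushort_and_int(void) { unsigned short us = 1; return TYPE_NAME(-34 + us); }
int ushort_and_int_value(void)   { unsigned short us = 1; return -34 + us; }

/* Collects the expectations in one place; call it from main(). */
void check_value_preserving_rules(void) {
    assert(strcmp(float_and_int(), "float") == 0);
    assert(strcmp(double_and_float(), "double") == 0);
    assert(strcmp(ldouble_and_double(), "long double") == 0);
    assert(strcmp(short_and_long(), "long") == 0);
    assert(strcmp(ushort_and_int(), "int") == 0);
    assert(ushort_and_int_value() == -33);
}
```

Calling check_value_preserving_rules() from main should pass silently 
on platforms where the stated assumption holds.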

Now the two rules that break this, and that you want to abolish (or at 
least have problems with), if I have understood correctly, are
* if the signed number has rank <= the unsigned one, convert to the 
unsigned type
* otherwise the unsigned version of the signed type is used.
Is this correct? Did I understand correctly what you mean? Is this what 
polysemy does?

I agree that in general these two last rules can bring a little bit of 
confusion
    printf("%g\n",1.0*(-34+1u));
    printf("%g\n",1.0*(-34+1UL));
    printf("%g\n",1.0*(-34-1u));
prints
    4.29497e+09
    1.84467e+19
    4.29497e+09
but normally it does not matter, because the bit pattern is what one 
expects, and casting to the correct type gives the correct result
    printf("%g\n",1.0*cast(int)(-34+1u));
    printf("%g\n",1.0*cast(long)(-34+1UL));
    printf("%g\n",1.0*cast(int)(-34-1u));
prints
 -33
 -33
 -35
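
The same wrap-around and cast-back behaviour can be sketched in plain 
C. The function names are made up for illustration. Note one 
assumption: converting an out-of-range unsigned value back to a signed 
type is implementation-defined in C, so the last three checks rely on 
the usual two's complement behaviour (as documented by GCC and Clang).

```c
#include <assert.h>
#include <limits.h>

/* The signed operand is converted to the unsigned type, so the
   mathematically negative result wraps modulo 2^N. */
unsigned int  int_plus_uint(void)  { return -34 + 1u; }
unsigned long int_plus_ulong(void) { return -34 + 1UL; }
unsigned int  int_minus_uint(void) { return -34 - 1u; }

void check_wraparound(void) {
    /* Well-defined: values just below the unsigned maximum. */
    assert(int_plus_uint()  == UINT_MAX  - 32u);
    assert(int_plus_ulong() == ULONG_MAX - 32ul);
    assert(int_minus_uint() == UINT_MAX  - 34u);

    /* Assumption: two's complement conversion back to signed
       recovers the expected value from the same bit pattern. */
    assert((int)int_plus_uint()   == -33);
    assert((long)int_plus_ulong() == -33L);
    assert((int)int_minus_uint()  == -35);
}
```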
and the advantage of freely combining signed and unsigned without casts 
(which happens often) I think outweighs its problems.

The only problem that I have seen connected to this is people thinking 
that a comparison can be written as
	opCmp =	cast(signed)(unsigned-unsigned);
which is wrong, because the unsigned difference wraps around before the 
cast.
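
A minimal C sketch of that pitfall, with made-up function names: 
comparing by subtraction gives the wrong sign as soon as the true 
difference does not fit in the signed type, while comparing the 
operands directly does not.

```c
#include <assert.h>
#include <limits.h>

/* Wrong: a - b is computed modulo 2^N before the cast. */
int cmp_by_subtraction(unsigned int a, unsigned int b) {
    return (int)(a - b);
}

/* Correct: compare directly; yields -1, 0 or 1. */
int cmp_by_comparison(unsigned int a, unsigned int b) {
    return (a > b) - (a < b);
}

void check_unsigned_compare(void) {
    /* 0 - UINT_MAX wraps to 1, so the subtraction claims 0 > UINT_MAX. */
    assert(cmp_by_subtraction(0u, UINT_MAX) > 0);
    assert(cmp_by_comparison(0u, UINT_MAX) < 0);

    /* Both agree when the difference is small. */
    assert(cmp_by_subtraction(5u, 7u) < 0);
    assert(cmp_by_comparison(5u, 7u) < 0);
}
```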

What I would like to have is
1) adaptive integer literals

For one kind of integer literal, ideally the decimal literals without 
prefix (introducing a prefix for int integer literals; yes, in my first 
message I proposed the opposite, a prefix for the new kind of literals, 
but I had already changed my mind in the second one), I would like a 
very lax regime, based on value preservation:
- all calculations of these integer literals between themselves are 
done with arbitrary precision
- this literal can implicitly cast to any type as long as the type can 
represent the value (that is obviously known at compile time)
- for matching overloaded functions one has to find a rule; this is 
something I am not too sure about. int if the value fits in it, long if 
it doesn't, and ulong if it does not fit in either could be a 
possibility.
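
To make the last two points concrete, here is a rough sketch in C of 
how such a rule could be evaluated. Everything in it is hypothetical: 
C has no arbitrary-precision literals, so a literal is modelled as a 
sign plus a 64-bit magnitude, and int, long and ulong are taken to be 
the 32-bit, 64-bit and unsigned 64-bit types of D.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a compile-time literal value. */
typedef struct { bool negative; uint64_t magnitude; } literal;

typedef enum { LIT_INT, LIT_LONG, LIT_ULONG, LIT_NONE } lit_type;

literal lit(bool negative, uint64_t magnitude) {
    literal v = { negative, magnitude };
    return v;
}

/* Implicit conversion check: can [min, max] represent the value? */
bool fits(literal v, int64_t min, uint64_t max) {
    if (v.negative && v.magnitude != 0) {
        /* -magnitude >= min, rewritten so nothing overflows */
        return min < 0 && v.magnitude <= (uint64_t)(-(min + 1)) + 1;
    }
    return v.magnitude <= max;
}

/* Overload rule from the text: int, else long, else ulong. */
lit_type overload_type(literal v) {
    if (fits(v, INT32_MIN, INT32_MAX)) return LIT_INT;
    if (fits(v, INT64_MIN, INT64_MAX)) return LIT_LONG;
    if (fits(v, 0, UINT64_MAX))        return LIT_ULONG;
    return LIT_NONE; /* negative and below the range of long */
}

void check_literal_rules(void) {
    assert(fits(lit(false, 255), 0, 255));      /* fits ubyte */
    assert(!fits(lit(false, 256), 0, 255));
    assert(fits(lit(true, 128), -128, 127));    /* fits byte */
    assert(!fits(lit(true, 129), -128, 127));
    assert(overload_type(lit(false, 42)) == LIT_INT);
    assert(overload_type(lit(false, 2147483648u)) == LIT_LONG);
    assert(overload_type(lit(true, 2147483648u)) == LIT_INT);
    assert(overload_type(lit(false, 9223372036854775808ull)) == LIT_ULONG);
}
```

A real compiler would do this on the literal's exact value at compile 
time; the sketch only shows the order in which the candidate types 
would be tried.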

2) a distinct integer type for size_t and ptrdiff_t (and similar 
integer types that have the size of a pointer)

no cast needed between size_t and ptrdiff_t
a cast needed between them and both long and int

Fawzi