Value Preservation and Polysemy -> context dependent integer literals

Fawzi Mohamed fmohamed at mac.com
Fri Dec 5 03:20:24 PST 2008


On 2008-12-05 09:40:03 +0100, Don <nospam at nospam.com> said:

> Andrei Alexandrescu wrote:
>> Sergey Gromov wrote:
>>> Thu, 04 Dec 2008 09:54:32 -0800, Andrei Alexandrescu wrote:
>>> 
>>>> Fawzi Mohamed wrote:
>>>>> On 2008-12-01 22:30:54 +0100, Walter Bright <newshound1 at digitalmars.com> said:
>>>>> 
>>>>>> Fawzi Mohamed wrote:
>>>>>>> On 2008-12-01 21:16:58 +0100, Walter Bright <newshound1 at digitalmars.com> said:
>>>>>>> 
>>>>>>>> Andrei Alexandrescu wrote:
>>>>>>>>> I'm very excited about polysemy. It's entirely original to D,
>>>>>>>> I accused Andrei of making up the word 'polysemy', but it turns out it 
>>>>>>>> is a real word! <g>
>>>>>>> Is this the beginning of discriminating overloads also based on the 
>>>>>>> return values?
>>>>>> No. I think return type overloading looks good in trivial cases, but as 
>>>>>> things get more complex it gets inscrutable.
>>>>> I agree that return type overloading can go very bad, but a little bit 
>>>>> can be very nice.
>>>>> 
>>>>> Polysemy makes more expressions typecheck, but I am not sure that I want that.
>>>>> For example, with size_t & co I would almost always want stronger 
>>>>> typechecking, as if size_t were a typedef, but with the usual rules 
>>>>> wrt ptrdiff_t, size_t, ... (i.e. no implicit conversion between them).
>>>>> This is because mixing size_t with int or long is almost always 
>>>>> suspicious, but you might only see it on the other platform (32/64 
>>>>> bit), and not on your own.
>>>>> 
>>>>> Something that I would find nice, on the other hand, is a kind of 
>>>>> integer literal that automatically converts to the type that makes 
>>>>> the most sense.
>>>> Wouldn't value range propagation take care of that (and actually more)? 
>>>> A literal such as 5 will have a support range [5, 5] which provides 
>>>> enough information to compute the best type down the road.
>>> 
>>> It sounds very nice and right, except it's incompatible with Cee.
>>> 
>>> Well, you can safely reduce the bit count so that assigning "1025 & 15" to
>>> "byte" would go without both a cast and a warning/error.  But you cannot
>>> grow the bit count beyond the C limits, that is, you cannot return long for
>>> "1024 << 30".  You should probably report an error, and you should
>>> provide some way to tell the compiler, "I mean it."
>>> 
>>> In the worst case, any shift, multiplication or addition will result in
>>> a compiler error.  Am I missing something?
>> 
>> Well any integral value carries:
>> 
>> a) type as per the C rule
>> 
>> b) minimum value possible
>> 
>> c) maximum value possible
>> 
>> The type stays the type as per the C rule, so there's no change there. 
>> If (and only if) a *narrower* type is asked as a conversion target for 
>> the value, the range is consulted. If the range is too large, the 
>> conversion fails.
>> 
>> Andrei
> 
> Any idea how hard this would be to implement?
> 
> Also we've got an interesting case in D that other languages don't 
> have: CTFE functions.
> I presume that range propagation would not apply during evaluation of 
> the CTFE function, but when evaluation is complete, it would then 
> become a known literal, which can have precise range propagation. But 
> there are still some funny issues:
> 
> uint foo(int x) { return 5; }
> 
> int bar(int y)
> {
>      ubyte w = foo(7); // this is a narrowing conversion and generates a
>                        // compiler warning (foo is not called as CTFE).
>      return 6;
> }
> 
> enum ubyte z = foo(7); // this is range propagated, so narrowing is OK.
> enum int q = bar(3); // still gets a warning, because bar() didn't compile.
> 
> int gar(T)(int y)
> {
>      ubyte w = foo(7);
>      return 6;
> }
> 
> enum int v = gar!(int)(3); // is this OK???
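
If I understand Andrei's scheme, the type always stays whatever the C 
rules give, and the [min, max] range is consulted only when a narrower 
target is requested. A sketch of what that would accept or reject (the 
comments describe the intended behavior, not what any compiler does 
today):

byte a = 1025 & 15;   // ok: the result's range is [0, 15], which fits a byte
byte b = 1024;        // error: the range [1024, 1024] does not fit a byte
int  c = 1024 << 30;  // stays int as per the C rule; the type never widens,
                      // only narrowing consults the range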

What I would like is for one kind of integer literal (ideally the one 
without any suffix annotation) to have *no* fixed C type, but to be 
effectively treated as an arbitrary-precision integer.
Conversions from this arbitrary-precision integer to any other type are 
implicit as long as the *value* can be represented in the target type; 
otherwise they fail.

ubyte ub1 = 4;                        // ok
byte  ib1 = 4;                        // ok
ubyte ub2 = -4;                       // failure
ubyte ub3 = cast(ubyte)cast(byte)-4;  // ok (one could discuss whether dropping
                                      // the cast(byte) should also be accepted)
byte  ib2 = -4;                       // ok
byte  ib3 = 130;                      // failure
float f = 1234567890;                 // ok, even if there can be precision loss
int   i = 123455;                     // ok
long  l = 2147483647 * 2;             // ok

Note that, as the value is known at compile time, this can always be 
checked, and one would get rid of the suffix annotations (L, UL, u, ...) 
most of the time.
The annotations should stay for compatibility with C and as a shorthand 
for, e.g., cast(uint)1234.
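
For example, under this rule the context would supply the type, and a 
suffix would only matter where there is no context (a sketch of the 
proposed behavior, not of what the compiler does today):

ushort port = 8080;          // ok: 8080 fits ushort, no cast(ushort) needed
uint   mask = 0x8000_0000;   // ok: the value fits uint, no 'u' suffix needed
long   big  = 4294967296;    // ok: the value fits long, no 'L' suffix needed
auto   x    = 1234;          // no context: here a suffix (1234u, 1234L, ...)
                             // would still be the way to pick the C type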

This scheme has one problem, and that is overloaded function calls... in 
that case a rule has to be chosen. A possible rule: find the smallest 
signed and the smallest unsigned type that can represent the number; if 
both exist, fail, otherwise choose the one that works. In any case this 
should be discussed to keep the compiler's work reasonable.
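
To make the problem concrete (the put overloads are made up just for 
illustration, and the comments describe the proposed rule, not current 
behavior):

void put(ubyte v) { }
void put(short v) { }

void test()
{
    put(-5);   // only the signed overload can represent -5, so it is chosen
    put(200);  // 200 fits both ubyte and short: without a tie-breaking rule
               // the call is ambiguous and would fail, or need a cast
}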

So this is what I would like. I do not know how well it matches the 
polysemy proposal, because from Andrei's comments I am not sure I have 
understood it correctly.

So, to answer Don: within my proposal your code would not be correct, because

>      ubyte w = foo(7);

needs a cast, even when performed at compile time. There are no new 
types; the special rules apply only to integer literals, and as soon as 
they assume a fixed C type the normal rules apply.
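
In other words (reusing Don's foo from above; the wrapper function is 
just for illustration, and the comments describe the proposed behavior):

uint foo(int x) { return 5; }

void answerToDon()
{
    ubyte a = 7;                  // ok: 7 is a literal and its value fits ubyte
    ubyte b = foo(7);             // error: foo's result has the fixed type uint,
                                  // so the normal narrowing rules apply
    ubyte c = cast(ubyte)foo(7);  // ok: the narrowing is made explicit
}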





