value range propagation for _bitwise_ OR

Tue Apr 13 02:39:14 PDT 2010

On 12-apr-10, at 21:40, Steven Schveighoffer wrote:

> On Mon, 12 Apr 2010 13:45:14 -0400, Jérôme M. Berger  
> <jeberger at free.fr> wrote:
>
>> Steven Schveighoffer wrote:
>>> J�r�me M. Berger Wrote:
>>>
>>>> J�r�me M. Berger wrote:
>>>>> 	OK, I found a solution that is optimal in over 99% of the cases
>>>>> (99.195% to be precise), while still being fast (because it avoids
>>>>> looping over the bits):
>>>>>
>>>> 	I've run my function and Andrei's on all possible min, max pairs  
>>>> in
>>>> 0..299 without checking, just for the timing. On my computer  
>>>> (Athlon
>>>> X2 64 @2GHz), my function took 50s and Andrei's took 266s. The
>>>> difference should be even bigger for larger numbers.
>>>
>>> Can I ask, what is the point of having a non-exact solution  
>>> (unless you are just doing this for fun)?
>>>
>>> We are talking about range propagation, a function of the  
>>> compiler, not a function of the compiled program.  Therefore,  
>>> slower but more exact functions should be preferred, since they  
>>> only affect the compile time.  Note that you will only need to do  
>>> this range propagation on lines that "or" two values together, and  
>>> something that reduces the runtime of the compiler by 216 seconds,  
>>> but only when compiling enough code to have over 8 billion 'or'  
>>> operations in it (300^4), I don't think is really that important.   
>>> Give me the exact solution that prevents me from having to cast  
>>> when the compiler can prove it, even if the runtime of the  
>>> compiler is slightly slower.
>>>
>> 	Funny you should say that given the current thread comparing the
>> speed of the D compiler to that of the Go compiler...
>
> My point was simply that the amount of time saved is relative to the  
> size of the program being compiled.  If we assume conservatively  
> that every other line in a program has bitwise or in it, then you  
> are talking about a 16 billion line program, meaning the 216 seconds  
> you save is insignificant compared to the total compile time.  My  
> point was that your fast solution that is inaccurate is not  
> preferable because nobody notices the speed, they just notice the  
> inaccuracy.
>
>> 	We are talking about range propagation, a function of the compiler,
>> not a function of the compiled program. Since we can't get a 100%
>> accurate representation of the possible values anyway (even yours
>> might leave holes in the middle after all), then it might be better
>> to choose a faster, slightly less precise algorithm if the
>> difference is not too great. That's the difference between a
>> compiler and a full-fledged static code analysis an program prover.
>
> When we're talking about the difference between O(1) and O(lgn),  
> I'll take accuracy over speed in my compiler any day.  The solution  
> should be 100% accurate for the problem statement.  If the problem  
> statement is not what we need, then we need a new problem  
> statement :)  Solving the problem statement for 99% of values is not  
> good enough.
>
>>
>> 	Anyway, the point is moot, I have a new version of my algorithm
>> with 100% precision and high speed.
>
> Yes, I'm still trying to understand how it works :)

Sorry for the probably stupid question, but I don't understand much  
the need of all this range propagation, in the compiler either you  
have a a constant (and then you have the value at compile time, and  
you don't need any range propagation, you can compare with the value),  
or you have a runtime value.
Do you want to explicitly add in the compiler the support for more  
limited runtime values?
Otherwise the range of a runtime value is a priori the whole possible  
range, and thus any rule based on range propagation might be expressed  
as static type based rule (as done in C).
You can gain something for example you can know that summing 4 shorts  
you will never overflow an int, is this where you want to go?
What kind of bugs you are trying to avoid? Or is it simply having the  
max and min properties defined?

Nobody commented on my proposal, but there I see the potential for  
bugs (1u-2)+1UL is 4294967296 and this happens also at runtime if you  
have, for example, a function returning size_t and another returning  
uint, and you combine them without thinking much and then you switch  
to 64 bits...

There I see a problem, but this you see without any special range  
propagation, just thinking that subtraction or negation of unsigned  
types is modulo 2^bit size, and thus cannot be then changed to another  
size without explicit cast.
Maybe in some cases using enums range propagation might spare a cast,  
but is it really an improvement?
I guess that I am missing something obvious, so I don't see the reason  
for range propagation, but maybe I am not the only one, so thanks for  
an explanation...

Fawzi