value range propagation for _bitwise_ OR
Fawzi Mohamed
fawzi at gmx.ch
Tue Apr 13 03:29:39 PDT 2010
On 13-apr-10, at 12:02, Don wrote:
> Fawzi Mohamed wrote:
>> On 12-apr-10, at 21:40, Steven Schveighoffer wrote:
>>>> On Mon, 12 Apr 2010 13:45:14 -0400, Jérôme M. Berger <jeberger at free.fr> wrote:
>>>
>>>> Steven Schveighoffer wrote:
>>>>>> Jérôme M. Berger Wrote:
>>>>>
>>>>>>> Jérôme M. Berger wrote:
>>>>>>> OK, I found a solution that is optimal in over 99% of the
>>>>>>> cases (99.195% to be precise), while still being fast
>>>>>>> (because it avoids looping over the bits):
>>>>>>>
>>>>>> I've run my function and Andrei's on all possible (min, max)
>>>>>> pairs in 0..299 without checking the results, just for the
>>>>>> timing. On my computer (Athlon X2 64 @ 2GHz), my function took
>>>>>> 50s and Andrei's took 266s. The difference should be even
>>>>>> bigger for larger numbers.
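
For reference, such an exhaustive test has a trivially exact baseline:
brute force over both ranges. The sketch below is mine (the name
maxOrBruteForce is not from this thread) and is only practical for
tiny ranges, but it is what a fast maxOr implementation can be
validated against:

    // Brute force: tight upper bound of x | y over two uint ranges.
    // O((maxA-minA) * (maxB-minB)); use only for small test ranges.
    uint maxOrBruteForce(uint minA, uint maxA, uint minB, uint maxB)
    {
        uint best = 0;
        foreach (x; minA .. maxA + 1)
            foreach (y; minB .. maxB + 1)
                if ((x | y) > best)
                    best = x | y;
        return best;
    }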
>>>>>
>>>>> Can I ask, what is the point of having a non-exact solution
>>>>> (unless you are just doing this for fun)?
>>>>>
>>>>> We are talking about range propagation, a function of the
>>>>> compiler, not a function of the compiled program. Therefore,
>>>>> slower but more exact functions should be preferred, since they
>>>>> only affect the compile time. Note that you will only need to
>>>>> do this range propagation on lines that "or" two values
>>>>> together, and I don't think something that reduces the
>>>>> compiler's runtime by 216 seconds, but only when compiling
>>>>> enough code to contain over 8 billion 'or' operations (300^4),
>>>>> is really that important. Give me the exact solution that saves
>>>>> me from a cast whenever the compiler can prove the cast is
>>>>> unnecessary, even if the compiler runs slightly slower.
>>>>>
>>>> Funny you should say that given the current thread comparing the
>>>> speed of the D compiler to that of the Go compiler...
>>>
>>> My point was simply that the amount of time saved is relative to
>>> the size of the program being compiled. If we assume
>>> conservatively that every other line in a program has bitwise or
>>> in it, then you are talking about a 16 billion line program,
>>> meaning the 216 seconds you save is insignificant compared to the
>>> total compile time. In other words, a fast but inaccurate
>>> solution is not preferable: nobody notices the speed, they just
>>> notice the inaccuracy.
>>>
>>>> We are talking about range propagation, a function of the
>>>> compiler,
>>>> not a function of the compiled program. Since we can't get a 100%
>>>> accurate representation of the possible values anyway (even yours
>>>> might leave holes in the middle, after all), it might be better to
>>>> choose a faster, slightly less precise algorithm if the difference
>>>> is not too great. That's the difference between a compiler and a
>>>> full-fledged static analyser and program prover.
>>>
>>> When we're talking about the difference between O(1) and O(lg n),
>>> I'll take accuracy over speed in my compiler any day. The
>>> solution should be 100% accurate for the problem statement. If
>>> the problem statement is not what we need, then we need a new
>>> problem statement :) Solving the problem statement for 99% of
>>> values is not good enough.
>>>
>>>>
>>>> Anyway, the point is moot, I have a new version of my algorithm
>>>> with 100% precision and high speed.
>>>
>>> Yes, I'm still trying to understand how it works :)
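
For anyone trying to follow along: an exact algorithm for this bound
is known. Hacker's Delight (Warren), section 4-3, gives an
O(word-width) routine for the maximum of x | y over two unsigned
ranges. The D transcription below is a sketch of that textbook
routine, not necessarily the algorithm Jérôme describes:

    // Exact maximum of x | y for a <= x <= b and c <= y <= d (uint).
    // Scan from the most significant bit: where b and d are both set,
    // try clearing that bit in one upper bound and setting all lower
    // bits; if the result still lies within its range, stop, since
    // every lower bit can now appear in the OR.
    uint maxOR(uint a, uint b, uint c, uint d)
    {
        for (uint m = 0x8000_0000; m != 0; m >>= 1)
        {
            if (b & d & m)
            {
                uint t = (b - m) | (m - 1);
                if (t >= a) { b = t; break; }
                t = (d - m) | (m - 1);
                if (t >= c) { d = t; break; }
            }
        }
        return b | d;
    }

The companion minOR routine is analogous; it raises a lower bound
instead of lowering an upper one.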
>> Sorry for the probably stupid question, but I don't really
>> understand the need for all this range propagation. In the
>> compiler, either you have a constant (and then you have the value
>> at compile time, so you don't need range propagation: you can
>> compare with the value itself), or you have a runtime value.
>
>
> It's been part of DMD2 for a while now. It allows you to do things
> like:
>
> ubyte lowbits(int x)
> {
>     return x & 7;
> }
>
> without an explicit cast. The compiler knows that x & 7 can safely
> fit inside a single byte, whereas ((x & 7) << 12) | (x & 3)
> does not fit and requires an explicit cast.
Ah, OK, I understand. I thought that was treated like x & cast(ubyte)7,
i.e. as a comparison of the compile-time value against the range of
ubyte (no range propagation needed). But I can see why it is instead
treated as cast(ubyte)((cast(int)x) & 7): that is probably easier for
the compiler, since it upcasts by default.
Thanks for the explanation.
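
To make the rule concrete, here is a minimal sketch of the two cases
Don describes (the function names are mine): x is first promoted to
int, the compiler tracks the value range of each subexpression, and an
implicit narrowing conversion is accepted only when the whole range
fits in the target type.

    ubyte ok(int x)
    {
        return x & 7;   // range 0..7 fits in ubyte: accepted implicitly
    }

    ubyte needsCast(int x)
    {
        // range 0..0x7003 does not fit in ubyte: explicit cast required
        return cast(ubyte)(((x & 7) << 12) | (x & 3));
    }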