Positive

Bruce Adams tortoise_74 at yeah.who.co.uk
Mon Oct 6 00:53:25 PDT 2008


On Mon, 06 Oct 2008 06:17:44 +0100, Brad Roberts <braddr at puremagic.com>  
wrote:

> Andrei Alexandrescu wrote:
>> Bruce Adams wrote:
>>> On Mon, 06 Oct 2008 00:55:33 +0100, Andrei Alexandrescu
>>> <SeeWebsiteForEmail at erdani.org> wrote:
>>>>>>
>>>>>> Andrei
>>>>>  I disagree. I'm not saying it's easy but it could be done. We would
>>>>> have to start
>>>>> with something relatively simple and work our way up but it could be
>>>>> done.
>>>>
>>>> I, too, think it can be done in the same way supersonic mass
>>>> transportation can be done, but having done research in the area I can
>>>> tell you you are grossly underestimating the difficulties. It is a
>>>> project of gargantuan size. Today the most advanced systems only  
>>>> managed
>>>> to automatically prove facts that look rather trivial to the casual
>>>> reader. There is absolutely no hope for D to embark on this.
>>>>
>>> I'm not asking for a generalised theorem prover. Something to handle
>>> even the
>>> simple cases is a start.
>>> I agree that D won't have this (any time soon) but mainly because there
>>> are several hundred things higher up the priority list.
>>>
>>>>> Compilers already do all kinds of clever analyses behind the scenes
>>>>> but each one
>>>>> is often hard coded. I suspect the main difficulty is giving users
>>>>> too much rope
>>>>> by which to hang themselves, or rather hang the compiler trying to
>>>>> prove something
>>>>> it doesn't realise it can't. Marrying declarative / constraint based
>>>>> programming
>>>>> at compile time is creeping in via templates. I wish it was less
>>>>> well hidden.
>>>>
>>>> I discussed the problem this morning with Walter and he also started
>>>> rather cocky: if you assert something early on, you can from then on
>>>> assume the assertion is true (assuming no assignment took place, which
>>>> is not hard if you have CFA in place). He got an arrow in a molar with
>>>> the following example:
>>>>
>>>> double[] vec;
>>>> foreach (e; vec) assert(e >= 0);
>>>> // now we know vec is all nonnegatives
>>>> normalize(vec);
>>>>
>>>> The definition of normalize is:
>>>>
>>>> void normalize(double[] vec)
>>>> {
>>>>      foreach (e; vec) assert(e >= 0);
>>>>      auto sum = reduce!"a + b"(vec, 0);
>>>>      assert(sum > 0);
>>>>      foreach (ref e; vec) e /= sum;
>>>> }
>>>>
>>>> If normalize takes udouble[] and you have one of those, there's no
>>>> need to recheck. Automated elimination of the checking loop above is
>>>> really hard.
>>>>
>>>>
>>>> Andrei
>>>
>>> Is it? I think your example needs to be expanded or we may be talking
>>> at cross purposes.
>>> Firstly I would rearrange things a little, though in principle it makes
>>> no difference.
>>>
>>>  pre
>>>  {
>>>     static assert(foreach (e; vec) assert(e >= 0));
>>>  }
>>>  void normalize(double[] vec)
>>>  {
>>>     auto sum = reduce!"a + b"(vec, 0);
>>>     assert(sum > 0);
>>>     foreach (ref e; vec) e /= sum;
>>>  }
>>>
>>> double[] vec;
>>> static assert(foreach (e; vec) assert(e >= 0));  // line X
>>>
>>> // now we know vec is all nonnegatives
>>> normalize(vec);   // line Y
>>>
>>> Imagine I have a prolog style symbolic unification engine to hand
>>> inside my compiler.
>>>
>>> At line X the static assertion is evaluated.
>>> The logical property e>=0 is asserted on the vec symbol.
>>> The compiler reaches line Y.
>>> Vec has not been modified so still has this property associated with  
>>> it.
>>> We now unify the symbol representing vec with the contract on  
>>> normalize.
>>> The unification succeeds and everything is fine.
>>>
>>> Now imagine we have a stupid analyser.
>>>
>>> double[] vec;
>>> static assert(foreach (e; vec) assert(e >= 0));  // line X
>>>
>>> vec[0] -= 1; // line Z
>>>
>>> // now we know vec is all nonnegatives
>>> normalize(vec);   // line Y
>>>
>>> When the compile-time analyser reaches line Z it can't work out
>>> whether or not the
>>> contract still applies, so it removes the assertion that vec is e>=0
>>> for all elements, or
>>> rather asserts that it is not provably true.
>>> Now when we reach Y the unification fails.
>>> We don't throw a compile-time constraint violation error. We haven't
>>> proved it to have failed.
>>> We throw a compile-time constraint-unprovable error or warning.
>>>
>>> It's like a lint warning. Your code may not be wrong, but you might want
>>> to consider altering it
>>> in a way that makes it provably correct.
>>>
>>> We would really like to be able to assert that certain properties
>>> always hold for a variable
>>> through its life-time. That is a harder problem.
>>>
>>> I see something like this as a basis on which more advanced analysis
>>> can be built gradually.
>>>
>>> Actually re-using static assert above was probably misleading. It
>>> would be better to have something
>>> that says talk to the compile time theorem prover (a prolog
>>> interpreter would do).
>>>
>>> I have been meaning to write something along these lines for years but
>>> so far I haven't got around to it
>>> so I might as well stop keeping it under my hat.
>>>
>>> Regards,
>>>
>>> Bruce.
>>
>> I think it's terrific to try your hand at it. I for one know I am not
>> equipped for such a pursuit. I have no idea how to perform unification
>> over loops, particularly when they may have myriads of little
>> differences while still having the same semantics; how to do that
>> cross-procedurally in reasonable time; how to take control flow
>> sensitivity into account; and why exactly people who've worked hard at
>> it haven't gotten very far. But I am equipped to define a type that
>> enforces all that simply and clearly, today.
>>
>>
>> Andrei
>
> I'm not an expert at optimizers, but I do read a whole lot.  This type
> of analysis is often called 'value range propagation'.  Tracking the
> possible values for data as it flows around and eliminating checks based
> on prior events.  Most modern compilers do some amount of it.  The
> question is how much and under what circumstances.  I'm not sure how
> many combine that with static value analysis to create multiple function
> versions.  IE, 'enough' callers to justify emitting two different forms
> of the function, each with different base assumptions about the incoming
> data and adjusting call sites.  I know that some compilers do that for
> known constant values.  I guess it's something halfway to inlining.
>
> I can guess at what dmd's backend does.. or rather doesn't do. :)
>
> Later,
> Brad

Perhaps more can be done with LLVM.
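
To make Brad's point concrete, the kind of fact value range propagation tracks
looks roughly like this (a made-up illustration of mine, not Brad's code):

int clamp(int x)
{
    if (x < 0) return 0;
    if (x > 100) return 100;
    // on this path the compiler knows 0 <= x <= 100, so a later
    // assert(x >= 0 && x <= 100) is provably true and can be elided
    return x;
}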

I was actually suggesting something more radical than value range propagation:
having a Turing-complete interpreter available at compile time, with access to
the data structures used to represent the data-flow analysis. This is what I
mean by giving users enough rope to hang themselves. It isn't done currently
for several reasons:
1. It's a sledgehammer to crack a nut (though it is more general purpose);
as Andrei points out, he can do myriad useful things today without it.
2. Compiler authors are rightly uneasy about exposing internal data structures
at compile time.
3. Compiler authors are rightly uneasy about permitting activities that can
potentially result in longer compile times, and especially in infinitely long
compile times (though templates already make this possible).
4. Many programmers are uneasy about marrying different programming styles.
One of D's successes is actually in making template metaprogramming more like
regular procedural/functional/OO programming and less like pattern-matching,
logical, declarative programming. That makes it more comprehensible to the
masses, which is a good thing, but it also hides the possibility of slipping a
constraint solver in there (the sketch below shows the contrast I mean).
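
As a rough illustration of that contrast (a toy example of mine, not anything
from the thread), here is the same compile-time computation written both ways:

// procedural style: an ordinary function evaluated at compile time via CTFE
int sumSquares(int n)
{
    int total = 0;
    foreach (i; 1 .. n + 1)
        total += i * i;
    return total;
}
enum s = sumSquares(4);   // forced to run at compile time; s == 30

// declarative style: a recursive template doing the same job by case
// analysis, much closer to logic programming in flavour
template SumSquares(int n)
{
    static if (n == 0)
        enum SumSquares = 0;
    else
        enum SumSquares = n * n + SumSquares!(n - 1);
}
static assert(SumSquares!4 == 30);
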
That said, I don't think the CTFE system and templates are powerful enough to
write such an interpreter in. You don't have access to state information except
through arguments that are constant expressions. It would be unnecessarily
challenging to make that state mutable from different places in the program, if
it's even possible. This is where a multi-level language might win.
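
For example (just a sketch of the restriction as I understand it), CTFE cannot
read or modify mutable module-level state, so there is nowhere to accumulate
facts between evaluations:

int counter;                        // mutable module-level state

int bump() { return ++counter; }    // fine at run time

// enum x = bump();                 // rejected: CTFE may not touch the
//                                  // mutable global 'counter'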

I like the idea of possibly generating two different function implementations
internally for the optimizer. However, I was suggesting the analysis be focused
on contracts at interface boundaries, along the lines of the sketch below.
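
Something like the following is the shape I have in mind, with the contract
carried by a type at the interface (the Nonneg wrapper is only an illustration
of mine, not Andrei's proposed udouble):

import std.algorithm : reduce;

// wrapper asserting, once, that every element is nonnegative
struct Nonneg
{
    private double[] data;

    this(double[] vec)
    {
        foreach (e; vec) assert(e >= 0);   // checked at the boundary only
        data = vec;
    }

    double[] get() { return data; }
}

// checked entry point: verifies the contract, then forwards
void normalize(double[] vec)
{
    normalize(Nonneg(vec));
}

// unchecked implementation: the type already guarantees e >= 0, so the
// checking loop from the earlier example disappears
void normalize(Nonneg vec)
{
    auto v = vec.get;
    auto sum = reduce!"a + b"(0.0, v);
    assert(sum > 0);
    foreach (ref e; v) e /= sum;
}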

Regards,

Bruce.


