[GSOC] regular expressions beta is here

Fri Aug 12 04:31:30 PDT 2011

bearophile wrote:
> Don:
> 
>>> 2) I have found many situations where I am able to solve a problem with both a simple and slow brute force solver, and a complex and fast algorithm to solve a problem. The little program maybe is too much slow for normal usage, but it's just few lines long (especially if I use lot of std.algorithm stuff) but it's much less likely to contain bugs.
> 
>> Sorry, but personally I don't believe that this is useful outside of toy examples.
>> The question is, what bugs does it find that aren't found by a trivial unit test?
> 
>> There are two cases:
>> (1) it's a very tight test. In which case, it's essentially a unit test.
>> or (2) it's a very loose test. In which case, it doesn't find bugs.
> 
> Putting a simpler algorithm in the post-condition implements a third possibility you are missing.
> 
> Usually unit tests verify some specific cases (you are also able to add generic testing code in the unit test, but this is just like moving the postcondition elsewhere).
> 
> If you put an alternative algorithm in the postcondition (under debug{} if you want), you have some advantages:
> - It's tight, because the second algorithm is supposed to always give the same results as the function.

> - It works with the real examples the program is run too, not just the cases you have put in the unit test.

Conditions required for this to be true:
(1) the function must not be time critical;
(2) an alternative algorithm must exist;
(3) the alternative algorithm must be bug-free;
(4) the function must not have been tested properly;
(5) the faulty test cases must occur during debugging (they won't be 
caught during production);
(6) the programmer must remember to put the asserts in the 'out' 
contract, but not put them into the body of the function.

This doesn't leave much.

Sometimes you forget to add certain cases in the unittests. Putting the 
test in the postcondition makes sure it always run, for all the inputs 
your function is run on (unless you disable it), so you will catch the 
cases you didn't think of in the unittests.

> 
> 
>> The crucial feature is, they do NOTHING except find bugs in the function
>> they are attached to.
> 
> In Eiffel you have the prestate too (the old), so the postcondition is the only place where such information is usable. I hope prestate will be added to D DbC, because it's a majob sub-feature of DbC. But I don't agree that postconditions are useless in D.

??? Does that relate to my sentence in any way?

>> For starters, it really needs to be a function with multiple return
>> values. Otherwise, you can just stick asserts just before your return
>> statement, and you don't need __old or any such thing.
> 
> If a function has multiple return values the out(result) helps make sure all the return paths are verified.

That's what I said.

> If the function has only one return value it helps anyway, because it helps you not forget to verify the result.

???? Why would you remember to put an assert in the postcondition, when 
you didn't put it into the function?

> 
> 
>> Under what circumstances are they are more valuable than any other assert inside a function?
> 
> I have already given some answers.

No you haven't.

> Another answer is this:
> 
> 
> int foo(int x)
> in {
>     // ...
> }
> out(result) {
>     auto y = computeSomething(result);
>     assert(y ...);
>     assert(y ...);
> }
> body {
>     // ...
> }
> 
> The out{} helps you organize your code, separating the tests of the body from the postcondition tests. Also in the postcondition you are allowed to define new variables and call things. All this out(){} code vanishes in release mode. Ho do you do that with just asserts inside the body?
> 
> 
> If you do this the asserts will vanish in release mode, but the y will be computed still, wasting computations (a smart compiler is able to see y is not used and etc, but it's not sure this optimization happens if the computation of y is complex and it's done in-place):
> 
> int foo(int x)
> in {
>     // ...
> }
> body {
>     result = ...;
>     auto y = computeSomething(result);
>     assert(y ...);
>     assert(y ...);
>     return result;
> }
> 
> 
> I presume there are ways to disable the computation of y in release mode, but I don't want to think about them. I just stick the y computation in the postcondition and the compiler will take care of it.

Trivial!
Make the postcondition a nested function. (You can even make it a 
delegate literal, if it's only used in one place).

I'll explain my original statement further: If you have a theorem 
prover, then the theorem prover can use the 'out' contract in any 
function which calls that function.

Eg,
int square(int x) out { assert(result>=0); } body { return x*x; }

void foo()
{
    int q = square(-5);
    if (q < 0) { .... }

}
Theorem prover knows that q>=0, even if it doesn't have access to the 
body of 'square'. So it detects unreachable code in foo().

So in this case, the 'out' contract can be used to find bugs in code 
that the author of the contract didn't write.
Otherwise, out contracts only find bugs in the local function, which 
doesn't have much value, since unit testing already performs that role 
(and does it better).
By contrast, 'in' functions ALWAYS find external bugs rather than local 
ones, so they're an order of magnitude more valuable in the current 
implementation.