@safe leak fix?

Fri Nov 13 04:46:02 PST 2009

On Fri, 13 Nov 2009 15:29:20 +0300, Steven Schveighoffer  
<schveiguy at yahoo.com> wrote:

> On Fri, 13 Nov 2009 07:01:25 -0500, Denis Koroskin <2korden at gmail.com>  
> wrote:
>
>> On Fri, 13 Nov 2009 14:50:58 +0300, Steven Schveighoffer  
>> <schveiguy at yahoo.com> wrote:
>>
>>> On Thu, 12 Nov 2009 18:34:48 -0500, Jason House  
>>> <jason.james.house at gmail.com> wrote:
>>>
>>>> Steven Schveighoffer Wrote:
>>>>
>>>>> On Thu, 12 Nov 2009 08:45:36 -0500, Jason House
>>>>> <jason.james.house at gmail.com> wrote:
>>>>>
>>>>> > Walter Bright Wrote:
>>>>> >
>>>>> >> Jason House wrote:
>>>>> >> > At a fundamental level, safety isn't about pointers or references to
>>>>> >> > stack variables, but rather preventing their escape beyond function
>>>>> >> > scope. Scope parameters could be very useful. Scope delegates were
>>>>> >> > introduced for a similar reason.
>>>>> >>
>>>>> >> The problem is, they aren't so easy to prove correct.
>>>>> >
>>>>> > I understand the general problem with escape analysis, but I've always
>>>>> > thought of scope input as meaning @noescape. That should lead to easy
>>>>> > proofs. If my @noescape input (or slice of an array on the stack) is
>>>>> > passed to a function without @noescape, it's a compile error. That
>>>>> > reduces escape analysis to local verification.
>>>>>
>>>>> The problem is cases like this:
>>>>>
>>>>> char[] foo()
>>>>> {
>>>>>    char buf[100];
>>>>>    // fill buf
>>>>>    return strstr(buf, "hi").dup;
>>>>> }
>>>>>
>>>>> This function is completely safe, but without full escape analysis the
>>>>> compiler can't tell.  The problem is, you don't know how the outputs of a
>>>>> function are connected to its inputs.  strstr cannot have its parameters
>>>>> marked as scope because it returns them.
>>>>>
>>>>> Scope parameters draw a rather conservative line in the sand, and while I
>>>>> think it's a good optimization we can get right now, it's not going to
>>>>> help in every case.  I'm perfectly fine with @safe being conservative and
>>>>> @trusted not, at least the power is still there if you need it.
>>>>>
>>>>> -Steve
>>>>
>>>> what's the signature of strstr? Your example really boils down to  
>>>> proving strstr is safe.
>>>
>>> The problem is, strstr isn't safe by itself, it's only safe in certain  
>>> contexts.  You can't mark it as @trusted either because it has the  
>>> potential to be unsafe.  I think if safe D heap-allocates when it  
>>> passes a local address into an unprovable function such as strstr,  
>>> that's fine with me.
>>>
>>> So the signature of strstr has to be unmarked (no @safe or @trusted).
>>>
>>
>> Any example of how unsafe strstr may be?
>
> Sure (with the current compiler):
>
> char[] foo()
> {
>    char buf[100];
>    // fill buf
>    return strstr(buf, "hi"); // no .dup, buf escapes
> }
>

No, no, no! It's foo that is unsafe in your example, not strstr!

> The whole meaning of safe is fuzzy, because we don't know the safe rules  
> with regards to passing references to local data.  But I think the goal  
> is to make it so strstr can be marked as safe.  In order to do that, foo  
> must be required to be unmarked or @trusted, or foo allocates buf on the  
> heap.
>
> The point I was trying to make to Jason is that escape analysis is more  
> complicated than just marking parameters as @noescape -- you leave out  
> some provably safe functions.
>
>> BTW, strstr is no different from std.algorithm.find:
>>
>> import std.algorithm;
>>
>> char[] foo()
>> {
>>      char[5] buf = ['h', 'e', 'l', 'l', 'o'];
>>      char[] result = find(buf[], 'e');
>>
>>      return result.dup;
>> }
>>
>> I don't see why a general-purpose searching algorithm is unsafe.
>
> It isn't inherently unsafe.  It's just difficult for the compiler to see  
> just from a function signature where the data flows, and escape analysis  
> requires full data-flow disclosure.  I think Walter's proposal of  
> allocating when a @safe function passes the address of a local to another  
> @safe function is perfectly acceptable.  I'd also like to see  
> cases where you can mark the input parameter as scope, potentially  
> optimizing out the allocation (but then you cannot return the scope  
> parameter or a reference to any part of it).
>
> -Steve

I don't like his proposal at all. It introduces one more hidden  
allocation. Why not just write

char[] buf = new char[100];

and disallow taking a slice of a static array? (Andrei already hinted that  
this will be disallowed in @safe, if I understood him right).
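
A minimal sketch of the explicit heap-allocated variant. The names
findChar and fooOnHeap are hypothetical (findChar is not a Phobos function,
just a stand-in for strstr/find), and the @safe annotations assume the
compiler accepts returning a slice of a GC-allocated array:

```d
// Hypothetical stand-in for strstr/find: returns the tail of `haystack`
// starting at the first occurrence of `needle`, or an empty slice.
char[] findChar(char[] haystack, char needle) @safe
{
    foreach (i, c; haystack)
        if (c == needle)
            return haystack[i .. $];
    return haystack[$ .. $];
}

char[] fooOnHeap() @safe
{
    // The allocation is visible in the source instead of being inserted
    // by the compiler.
    char[] buf = new char[100];
    buf[0 .. 5] = "hello";

    // The result points into GC memory, so it stays valid after return
    // and no .dup is needed.
    return findChar(buf[0 .. 5], 'e');
}
```

Here the search function needs no special marking: it only ever sees heap
memory, so whatever it returns cannot dangle.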

Speaking about safety, I don't know how we can allow pointers in safe D:

void foo()
{
    int* p = new int;
    p[1000] = 0; // Will it crash or not? Is this defined behavior or not?
    // If not, this must be disallowed in safe D
}
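
One possible answer, sketched under the assumption that safe D permits
dereferencing a pointer but rejects indexing it and pointer arithmetic
(the function names are hypothetical):

```d
// Dereferencing touches only the int that was actually allocated.
int viaPointer() @safe
{
    int* p = new int;
    *p = 42;
    // p[1000] = 0; // rejected under the assumed rule: indexing a pointer
    return *p;
}

// A slice carries its length, so every index can be bounds-checked.
int viaSlice() @safe
{
    int[] a = new int[1];
    a[0] = 42;
    // a[1000] = 0; // an out-of-range index fails the bounds check
    return a[0];
}
```

Under that assumption the pointer behaves like a reference to exactly one
object, and anything that needs indexing goes through a slice.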

And, most importantly, *why* would users want to work with pointers in  
safe D at all?