Escape analysis

Tue Oct 28 20:33:53 PDT 2008

"Sergey Gromov" wrote
> Steven Schveighoffer wrote:
>> "Sergey Gromov" wrote
>>> Walter Bright wrote:
>>>> The first step is, are function parameters considered to be escaping by
>>>> default or not by default? I.e.:
>>>>
>>>> void bar(noscope int* p);    // p escapes
>>>> void bar(scope int* p);      // p does not escape
>>>> void bar(int* p);            // what should be the default?
>>>>
>>>> What should be the default? The functional programmer would probably
>>>> choose scope as the default, and the OOP programmer noscope.
>>> I'm for safe defaults.  Programs shouldn't crash for no reason.
>>
>> If safe defaults mean a 75% performance decrease, I'm for using unsafe
>> defaults that are safe 99% of the time, with the ability to make them
>> 100% safe if needed.
>>
>>> Here are my thoughts on escape analysis.  Sorry if they're obvious.
>>>
>>> I think it is possible to detect whether a reference escapes or not in
>>> the absence of function calls by analyzing an expression graph.
>>
>> Yes, but not in D, since import uses uncompiled files as input.
>
> Please note the "in the absence of function calls" part.  I'm talking
> about code which is doing pure calculus, without calling anything
> external.  It's pretty useless by itself, but it's the basics.

Ah, sorry.  I read that as 'absence of function source'.  My bad; in that
case we agree on this one.

>
> Unfortunately I don't know how import is implemented.  It should do some
> parsing though, to be able to inline functions from other modules, and
> to expand templates.
>

Those are all problems to be solved.  But if the file used by the linker and 
the file that contains the expression graphs aren't the same, or at least 
forced to be related, then you end up with very weird issues.

>>> Assigning to a global state variable is an ultimate escape.
>>
>> Agree there.
>>
>>> In the worst case, when only the current function can be analyzed and no
>>> meta-info is available about other functions, the compiler must assume a
>>> reference escapes if it is passed as an argument to another function.
>>> This is the current D2 behavior.
>>
>> This leads to the current situation, where you have a huge performance
>> decrease for little or no gain in reliability.
>>
>>> Pure functions provide some meta-info because any reference passed as an
>>> argument can only escape via a reference return value or other mutable
>>> reference arguments.  This makes escape analysis possible even after an
>>> unknown pure function is called.
>>
>> Good point.  Easy analysis on pure functions.
>>
>>> For any function in a tree of imported modules the compiler could keep
>>> some meta-data about which argument escapes where, if at all.  This way
>>> even regular functions can participate in escape analysis without
>>> blowing it up.
>>
>> Where is the data kept?  It must be in the object file, and D imports
>> must then read the object file for the API instead of the source file.
>> I don't think it's worth breaking the single-file model for imports and
>> code.  Requiring a .di file is a little iffy as it is today.
>
> Here I'm talking about disposable compile-time data, module-local if you
> wish.  This means that local optimization is better than inter-module
> optimization.  Nothing new here I suppose.

Except the linker has to enforce it, which means it needs to be mangled
into the signature somehow.  If the signature is defined only in a .di file,
it might not match.  I just think the object file and the .di file are too
loosely related to guarantee they stay consistent.  Weird issues can happen
when these things are edited separately.

If .di files were not editable and always generated with object files, I'd 
say they were a good place to put this info.  But they aren't.

>
> Of course it would be nice if this data is exported somehow and used
> when compiling other modules.  But it'd make the compilation process
> asymmetric, when meta-data is available for already compiled modules and
> not available for others.

It would have to be available for all of them.  That would be the point of 
including it in the object file.

>
>>> An argument to a virtual function call always escapes by default.  It
>>> may be possible to declare an argument as non-escaping (scope?), and the
>>> compiler should then enforce the non-escaping contract on any overriding
>>> functions.
>>
>> This is tricky, because most class member functions are virtual, so you
>> are forced to litter all your functions with escaping/non-escaping
>> syntax.  To be accurate you need to define the escape graph in the
>> signature, which will be a PITA.  What would be worse is to not have a
>> way to express the complete graph.
>
> Not every call to a virtual function is itself virtual, and not every
> virtual function cares whether its argument escapes.
>
> I'd say more: noscope should be the default for all reference types
> except delegates, because you usually don't care.  I agree that making
> scope the default for delegates is probably the right thing to do, but
> only if the compiler can detect violations of this contract.

A very common technique in Tango for avoiding heap allocation is to declare
a static array as a buffer, and then pass that buffer to a function (possibly
a virtual one) to use as scratch space.

This is my golden use case: it has to work, without allocating anything, for
any solution to be viable.

Saying all reference types are noscope would prevent this, no?
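
To make that use case concrete, here is a minimal sketch written against
present-day D, where a parameter can be marked `scope`.  The `Encoder`
interface and `DecimalEncoder` class are hypothetical names made up for
illustration; they are not Tango APIs.  How strictly the compiler checks
`scope` depends on the compiler version and flags, so treat the annotation
here as a statement of intent rather than a guarantee.

```d
import std.stdio : writeln;

// Hypothetical interface (not from Tango).  `scope` on `scratch` says the
// callee must not keep a reference to the buffer after it returns.
interface Encoder
{
    // Writes the decimal digits of `value` into `scratch` and returns how
    // many characters were written.
    size_t encode(uint value, scope char[] scratch);
}

class DecimalEncoder : Encoder
{
    size_t encode(uint value, scope char[] scratch)
    {
        size_t n = 0;
        // Produce digits least-significant first.
        do
        {
            scratch[n++] = cast(char)('0' + value % 10);
            value /= 10;
        } while (value != 0 && n < scratch.length);

        // Reverse in place so the most significant digit comes first.
        foreach (i; 0 .. n / 2)
        {
            immutable tmp = scratch[i];
            scratch[i] = scratch[n - 1 - i];
            scratch[n - 1 - i] = tmp;
        }
        return n;
    }
}

void main()
{
    char[16] buf;                          // scratch space lives on the stack
    Encoder e = new DecimalEncoder;        // call goes through a virtual method
    immutable n = e.encode(1234, buf[]);   // slice of the stack buffer is passed
    writeln(buf[0 .. n]);
}
```

The call is only safe because `encode` never stores `scratch` anywhere.  If
every reference parameter were treated as escaping, passing `buf[]` here
would have to be rejected or the buffer moved to the heap.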

>
>> Another solution is that a derived function must have the same expression
>> graph or a tighter one than the base class's.  But without being able to
>> store the graph with the compiled code (and having the compiler import
>> the metadata instead of the source file), this is a moot point.
>>
>>> An argument to a function declared as a prototype always escapes by
>>> default.  It may be possible for the compiler to export some meta-info
>>> along with the prototype when a .di file is generated, whether an
>>> argument is guaranteed to not escape, or maybe even detailed info about
>>> which argument escapes where, to mimic the compile-time meta-info.
>>
>> No, the .di file might not be auto-generated.  You are also now back to
>> a separate import and source file, like C has.  I think in order for
>> this to work, the graph and object code must be stored in the same file
>> that is imported.
>
> There are separate import files.  Actually, the compiler can simply put
> scope/noscope on the arguments based on the meta-data collected
> during compilation.  If your .di is manually created, you either add
> them manually as well, or you don't care.

I think the graph has to be complete for this to be usable.  Otherwise, it 
becomes an unused feature.  Using .di files is optional.  I generally don't 
use them.

>>> The expression graph analysis should be the first step towards safe
>>> stack closures.
>>
>> I would agree with this.  But I don't think it's happening in the near
>> future.  And I hope it's not done through .di files.
>
> You can limit analysis to a single module for now.  This will cover
> local function calls, including some local method calls, and I hope
> it'll also cover template function calls which means std.algorithm will
> work without memory allocation again.

Yes, but not class virtual methods or interface methods.  These are used
quite a bit in Tango.  End result: not a lot of benefit.

-Steve