While we're on the subject of escape analysis...

Thu Oct 30 20:15:48 PDT 2008

Benji Smith Wrote:
> Of course, this kind of optimization would require an interprocedural 
> graph-coloring algorithm to determine whether to use the heap allocator 
> or the stack allocator for every allocation. And I don't know whether 
> the current optimizer does any interprocedural analysis at all. (Though 
> it does inline short methods, which seems roughly equivalent, so I dunno.)
> 
> Thoughts?
> 
> --benji

This scenario could be optimized by the compiler by the use of a secondary
stack, together with Michel Fortin's ownership type proposal.

By secondary stack I mean a per-thread chunk of memory that works
exactly like the normal stack, but isn't cluttered by function calls.

A function whose return type is a scope reference type could take a hidden
argument that implements an allocator interface. If the call site
determines that the result never leaves its scope, it can pass the
secondary stack allocator, otherwise the regular GC allocator.
The new call which produces the returned object would then call the
hidden argument to allocate memory.

Note that this only works for a single level of nesting.
(Can't return the object up the stack through more steps, that'd require
every scope to allocate its own secondary stack, which requires a heap
allocation and defeats the purpose.)

Example:

class Foo
{
    scope Foo next;

    this (Foo n = null)
    {
        next = n;
    }

    scope Foo prolifoorate ()
    {
        return new Foo (this)
    }

    static Foo aGlobal;
}

void main ()
{
    scope Foo foo = new Foo;

    foo.prolifoorate().prolifoorate().prolifoorate();

    aGlobal = (new Foo).prolifoorate();
}

Would work more less like:

class Foo
{
    scope Foo next;

    this (Foo n)
    {
        next = n;
    }

    scope Foo prolifoorate (void* function (size_t) allocator)
    {
        Foo f = cast(Foo) allocator (sizeof (Foo));
        f.constructor (this);
        return f;
    }

    static Foo aGlobal;
}

void* SS;

void* secondary_stack_allocator (size_t size)
{
    void* x = SS;
    SS = &SS[size];
    return x;
}

void main ()
{
    void* SSmark = SS;
    scope(exit) SS = SSmark;

    // stack allocation
    Foo foo = cast(Foo) alloca (sizeof (Foo));
    foo.constructor ();

    // secondary stack allocations
    foo.prolifoorate(&secondary_stack_allocator)
        .prolifoorate(&secondary_stack_allocator)
        .prolifoorate(&secondary_stack_allocator);

    // heap allocation
    aGlobal = (new Foo).prolifoorate(&GC_allocator);
}

I didn't take multithreading into account because that's not the point of the example.
It'd be nice if a register could be found to store the pointer, since pthread_getspecific
might be a tad slow.

I hope D doesn't already have a secondary stack and I'm not looking silly writing all
this stuff...