Stack-allocated arrays

Tue Nov 11 21:00:03 PST 2008

"Frits van Bommel" <fvbommel at REMwOVExCAPSs.nl> wrote in message 
news:gfcjb4$25un$1 at digitalmars.com...
> dsimcha wrote:
>> I've noticed that, when writing multithreaded number crunching code,
>> allocating memory for temporary storage can be a huge threading 
>> bottleneck.
>> Of course, the obvious, but ugly solution is to pre-allocate and recycle 
>> temp
>> variables.  However, this breaks encapsulation at every place imaginable.
>>
>> An alternative that I've tried recently is to use alloca to stack 
>> allocate the
>> temporary arrays if they are small enough.  If the arrays being allocated 
>> are
>> larger, I heap allocate, and this isn't a problem because in the large 
>> array
>> case, the number crunching takes up enough time that heap allocations 
>> aren't
>> much of a bottleneck anymore.  However, this is also butt ugly because I 
>> have
>> to use a bunch of pointers and casts and explicitly test the size for it 
>> to
>> work.  Given that this is highly useful, I think a nice feature for D 
>> would be
>> to have some kind of a scheme to make this clean.  For example:
>>
>> auto foo = newTemp[length];
>>
>> If length is larger than some arbitrary but reasonable value, this is 
>> heap
>> allocated.  If length is small, it is stack allocated.  Either way, 
>> allowing
>> the array to escape the current scope is undefined behavior and the 
>> compiler
>> would detect at least the simple cases of this.  Ideally, since this 
>> would
>
> I'd love for "scope foo = new T[len];" to do for arrays what "scope bar = 
> new Class;" does for classes. And indeed, if it's too big the compiler

I'm surprised it doesn't and see that as a bit inconsistent, with the only 
serious argument against it being that 'scope' couldn't be used for large 
dynamic arrays.

But then again:

class C
{
   int[264_000] a;
}

void foo()
{
    scope C c = new C;
    ...
}

could also overflow the stack. In either case the work-around would be the 
same (increase the stack size or not use 'scope').

> can always fall back on heap-allocation. That could be slightly tricky if 
> you want to make sure the array is deleted in the heap-alloc case, but an 
> initial implementation could always just let the GC handle it.
> ("T[len] foo;" for non-constant len would also be a nice syntax)
>
> A workaround you could use:
> -----
> // N == your "arbitrary but reasonable value", use whatever you want.
> // (Add "= void" to next line to avoid initialization if you prefer)
> T[N] stackbuf;         // don't use this except in next line
> T[] buf = stackbuf;     // Set reference to stack buffer
> // Shrink to smaller size or heap-allocate bigger size:
> buf.length = length;
> -----
> The disadvantage being, of course, that stackbuf will always be N items. 
> If the majority of the cases is for the needed space to be much smaller an 
> alloca-using solution may use much less stack space, especially in code 
> where functions using this call each other.
>
>> only be used for temp variables in performance-critical code, the 
>> contents of
>> the array would also not be initialized.
>
> I disagree with not initializing it. By default it should be initialized. 
> An explicit "don't initialize this, please" syntax would be nice, but the 
> default should be safe.
>
>> It would be nice to do this in a library, but there's no obvious way to 
>> do it
>> without compiler support, since calling a function sets up a new stack 
>> frame.
>>  I'd like to just do it with an interface like:
>>
>> auto foo = newTemp!(T)(length).
>>
>> Anyone know any ASM frame pointer hacks to make something like this 
>> workable?
>>  Otherwise, is there enough interest in this that it might belong in the 
>> core
>> language?
>
> Implementing alloca isn't actually all that hard in ASM; it just won't be 
> in any way portable. The needed code depends on both the architecture 
> (obviously) and the calling convention. Oh, and on the assumption that the 
> compiler generates no code that assumes it knows the stack frame size 
> (i.e. don't use -fomit-frame-pointer or equivalent, make sure your 
> compiler uses a frame pointer[1])
>
>
> [1]: And doesn't extend the frame and address the extended part using the 
> frame pointer -- but most compilers don't do that anyway because it's less 
> efficient, better to just pre-allocate the entire stackframe at the start.