Stack-allocated arrays
Dave
Dave_member at pathlink.com
Tue Nov 11 21:00:03 PST 2008
"Frits van Bommel" <fvbommel at REMwOVExCAPSs.nl> wrote in message
news:gfcjb4$25un$1 at digitalmars.com...
> dsimcha wrote:
>> I've noticed that, when writing multithreaded number crunching code,
>> allocating memory for temporary storage can be a huge threading
>> bottleneck.
>> Of course, the obvious, but ugly solution is to pre-allocate and recycle
>> temp
>> variables. However, this breaks encapsulation at every place imaginable.
>>
>> An alternative that I've tried recently is to use alloca to stack
>> allocate the
>> temporary arrays if they are small enough. If the arrays being allocated
>> are
>> larger, I heap allocate, and this isn't a problem because in the large
>> array
>> case, the number crunching takes up enough time that heap allocations
>> aren't
>> much of a bottleneck anymore. However, this is also butt ugly because I
>> have
>> to use a bunch of pointers and casts and explicitly test the size for it
>> to
>> work. Given that this is highly useful, I think a nice feature for D
>> would be
>> to have some kind of a scheme to make this clean. For example:
>>
>> auto foo = newTemp[length];
>>
>> If length is larger than some arbitrary but reasonable value, this is
>> heap
>> allocated. If length is small, it is stack allocated. Either way,
>> allowing
>> the array to escape the current scope is undefined behavior and the
>> compiler
>> would detect at least the simple cases of this. Ideally, since this
>> would
>
> I'd love for "scope foo = new T[len];" to do for arrays what "scope bar =
> new Class;" does for classes. And indeed, if it's too big the compiler
I'm surprised it doesn't and see that as a bit inconsistent, with the only
serious argument against it being that 'scope' couldn't be used for large
dynamic arrays.
But then again:
class C
{
int[264_000] a;
}
void foo()
{
scope C c = new C;
...
}
could also overflow the stack. In either case the work-around would be the
same (increase the stack size or not use 'scope').
> can always fall back on heap-allocation. That could be slightly tricky if
> you want to make sure the array is deleted in the heap-alloc case, but an
> initial implementation could always just let the GC handle it.
> ("T[len] foo;" for non-constant len would also be a nice syntax)
>
> A workaround you could use:
> -----
> // N == your "arbitrary but reasonable value", use whatever you want.
> // (Add "= void" to next line to avoid initialization if you prefer)
> T[N] stackbuf; // don't use this except in next line
> T[] buf = stackbuf; // Set reference to stack buffer
> // Shrink to smaller size or heap-allocate bigger size:
> buf.length = length;
> -----
> The disadvantage being, of course, that stackbuf will always be N items.
> If the majority of the cases is for the needed space to be much smaller an
> alloca-using solution may use much less stack space, especially in code
> where functions using this call each other.
>
>> only be used for temp variables in performance-critical code, the
>> contents of
>> the array would also not be initialized.
>
> I disagree with not initializing it. By default it should be initialized.
> An explicit "don't initialize this, please" syntax would be nice, but the
> default should be safe.
>
>> It would be nice to do this in a library, but there's no obvious way to
>> do it
>> without compiler support, since calling a function sets up a new stack
>> frame.
>> I'd like to just do it with an interface like:
>>
>> auto foo = newTemp!(T)(length).
>>
>> Anyone know any ASM frame pointer hacks to make something like this
>> workable?
>> Otherwise, is there enough interest in this that it might belong in the
>> core
>> language?
>
> Implementing alloca isn't actually all that hard in ASM; it just won't be
> in any way portable. The needed code depends on both the architecture
> (obviously) and the calling convention. Oh, and on the assumption that the
> compiler generates no code that assumes it knows the stack frame size
> (i.e. don't use -fomit-frame-pointer or equivalent, make sure your
> compiler uses a frame pointer[1])
>
>
> [1]: And doesn't extend the frame and address the extended part using the
> frame pointer -- but most compilers don't do that anyway because it's less
> efficient, better to just pre-allocate the entire stackframe at the start.
More information about the Digitalmars-d
mailing list