alloca() notes [Was: Re: getNext]

Tue Jul 13 01:52:29 PDT 2010

Andrei Alexandrescu:
> T * getNext(R, E)(ref R range,
>                    ref E store = *(cast(E*) alloca(E.sizeof))
> {

Some time ago I filed bug http://d.puremagic.com/issues/show_bug.cgi?id=3822 on alloca(), but I think this code is not hit by it.

I have recently written some C99 code, and I have appreciated its Variable Length Arrays (VLAs) for their syntax, safety and performance.

Compared to alloca(), VLAs have some advantages:
- Their syntax is shorter and nicer, and it's natural for a C programmer;
- No imports are needed;
- Their semantics is clearer: they define a new variable, so their scope is the same as that of all other variables, while alloca() seems to have two possible different implementations;
- VLAs are more typesafe, because there is no need for a cast, while the cast needed by alloca() may forbid its use in SafeD code;
- VLAs don't need sizeof(T), you just specify a type.

The result is that using alloca() feels dirty in both C and D, while the Variable Length Arrays of C99 don't feel dirty at all.
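
To make the "dirty" part concrete, here is a minimal sketch of what using alloca() for a temporary buffer looks like in D. The function name sumSquares is just an illustration and not from any library; the sketch assumes alloca is imported from core.stdc.stdlib. Note the cast, the explicit T.sizeof and the import, which are exactly the things a VLA doesn't need:

```d
import core.stdc.stdlib : alloca;   // the import a VLA would not need
import std.stdio : writeln;

// Hypothetical example function: sums the squares 0^2 .. (n-1)^2
// using a temporary buffer allocated on the stack.
int sumSquares(size_t n) {
    // alloca() returns void*, so a cast is needed, and the size
    // must be computed by hand from int.sizeof.
    int* p = cast(int*) alloca(n * int.sizeof);
    int[] buf = p[0 .. n];          // slice over the stack memory

    foreach (i, ref x; buf)
        x = cast(int)(i * i);

    int total = 0;
    foreach (x; buf)
        total += x;
    return total;                   // buf itself must not escape
}

void main() {
    writeln(sumSquares(4));         // 0 + 1 + 4 + 9
}
```

The buffer is freed when sumSquares returns, so only the computed int leaves the function.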

On the other hand VLAs (like alloca) can produce a stack overflow, they are not so commonly useful (like alloca), and they can become essentially a third kind of array for D (this is not good).

In D I'd like something like alloca() that needs no casts and is able to find the size by itself, avoiding the bug-prone usage of T.sizeof.

A way to do it is to use the same syntax used by C99 and allow a variable in the definition of a stack array:

auto foo(int n) {
    int[n] arr; 
    return arr;
}

The main difference is that in D when you return arr it gets copied by value.

Currently this code works:

int foo(int n) {
    return 3 * n + 1;
}
auto bar() {
    immutable int n = 5;
    int[foo(n)] arr;
    return arr;
}
void main() {}

because dmd runs foo() at compile time, so arr is allocated on the stack with a statically known size. If VLAs get introduced in D, then in this case the compiler has to do something it currently doesn't do: run a function at compile time if possible (and create a fixed-length array), and run it at run time if that's impossible (and create a VLA).

To avoid that, in some cases you can use something like:

int[StaticValue!(n)] arr;

Where StaticValue is a template that makes sure n is a value always known at compile-time.

But this is a little messy, and I don't like it too much.
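
A minimal sketch of how such a StaticValue could be written, under the assumption that it is just an eponymous template taking an int value parameter (this is my guess at a possible implementation, nothing like it is in Phobos as far as I know):

```d
// Hypothetical helper: accepts only values known at compile time,
// because template value parameters must be compile-time constants.
template StaticValue(int value) {
    enum int StaticValue = value;
}

int foo(int n) {
    return 3 * n + 1;
}

auto bar() {
    enum n = 5;
    // foo(n) is forced to run at compile time; passing a run-time
    // variable to StaticValue would be a compile error instead of
    // silently producing a VLA.
    int[StaticValue!(foo(n))] arr;
    return arr;
}

void main() {
    // 3 * 5 + 1 == 16, so bar() returns a fixed-size int[16]
    static assert(is(typeof(bar()) == int[16]));
}
```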

Keep in mind that currently this gives an error:

int[] bar() {
    int[2] arr;
    return arr;
}
void main() {}

You have to use the auto return type or:

int[2] bar() {
    int[2] arr;
    return arr;
}
void main() {}

With VLAs you are forced to use auto, but that can't work at the return point anyway, because the length of the array is not known at compile time.
So I think the normal C99 syntax for VLAs is not good for D2. On the other hand alloca() semantics and syntax are bad. So in D alloca() can be replaced by something better, like:

T* ptr = StackAlloc!T(n);
Or:
T* ptr = Alloca!T(n);
Or:
T[] arr = VLA!T(n);

But this creates an array that looks like a dynamic array while its memory is on the stack, so it must not escape the function. So it can be bug-prone, similar to this code, which currently compiles:

int[] foo() {
    int[1] stackArray;
    int[] dynArray = stackArray[0 .. 1];
    return dynArray;
}
void main() {}

See enhancement request http://d.puremagic.com/issues/show_bug.cgi?id=4451
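
For contrast, a minimal sketch of a version that is safe to return, assuming you can afford a heap allocation: .dup copies the stack data into a new GC-allocated array, so nothing points into the dead stack frame after the return. The name fooSafe is mine, used only for this example:

```d
// Hypothetical safe variant of the function above.
int[] fooSafe() {
    int[1] stackArray = [42];
    // stackArray[] is a slice of stack memory; .dup copies it
    // to the GC heap, so the returned array outlives the call.
    return stackArray[].dup;
}

void main() {
    int[] r = fooSafe();
    assert(r.length == 1 && r[0] == 42);
}
```

Of course this gives up the performance gain of the stack allocation, which is the reason to use alloca() or a VLA in the first place.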

So with the syntax T[] arr = VLA!T(n); the compiler has to disallow returning arr and its slices.

With those three syntaxes your signature becomes:

T* getNext(R, E)(ref R range, ref E store = *StackAlloc!E(1)) {
T* getNext(R, E)(ref R range, ref E store = *Alloca!E(1)) {
T* getNext(R, E)(ref R range, ref E store = *(VLA!E(1).ptr)) {

Bye,
bearophile