Array concatenation & optimisation

Quirin Schroll qs.il.paperinik at gmail.com
Mon Jul 22 12:03:33 UTC 2024


On Sunday, 21 July 2024 at 05:43:32 UTC, IchorDev wrote:
> Obviously when writing optimised code it is desirable to reduce 
> heap allocation frequency. With that in mind, I'm used to being 
> told by the compiler that I can't do this in `@nogc` code:
> ```d
> void assign(ref int[4] a) @nogc{
> 	a[] = [1,3,6,9]; //Error: array literal in `@nogc` function `assign` may cause a GC allocation
> }
> ```
>> 'may cause' a GC allocation
>
> Does this mean that array literals are *always* separately 
> allocated first, or is this usually optimised out?
> For instance, will this example *always* allocate a new dynamic 
> array for the array literal, and then append it to the existing 
> one, even in optimised builds?

Optimization is unrelated to language semantics, except that the 
semantics determine which optimizations are allowed. Even if an 
allocation is optimized away, the compiler must act as though it 
still happens as far as diagnostics etc. are concerned, unless 
the language semantics guarantee that it is elided (in which case 
it isn’t really an optimization).

My mental model of array literals is that array literals have 
their own, internal type. Think of `__arrayLiteral!(T, n)`. If 
you ask an array literal its type (e.g. using `typeof`), it’ll 
say `T[]`. But when you use it to initialize a static array, it 
simply behaves differently than a `T[]` because it just isn’t 
one. The madness about this is that even casts don’t affect the 
typing.
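
A minimal sketch of that mental model, as I understand it (the 
function name `literalDemo` is made up for illustration): 
`typeof` reports `int[]` for the literal, yet initializing a 
static array from it is accepted in a `@nogc` function.

```d
// `literalDemo` is a hypothetical name used only for this sketch.
int[4] literalDemo() @nogc
{
    // Asked for its type, the literal reports a dynamic array...
    static assert(is(typeof([1, 3, 6, 9]) == int[]));

    // ...yet using it to initialize a static array is accepted here,
    // with no GC allocation diagnosed.
    int[4] a = [1, 3, 6, 9];
    return a;
}

void main()
{
    assert(literalDemo() == [1, 3, 6, 9]);
}
```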

```d
void main() @nogc
{
     int x;
     bool b; // run-time boolean
     enum int[] xs = [1,2,3];
     int[4] ys = cast(int[])(xs ~ x); // good
     int[4] zs = (b ? xs : xs) ~ x; // error
}
```
Here, neither the explicit type `int[]` for `xs` nor the cast 
(you can remove any subset of them) makes the initialization of 
`ys` violate `@nogc`. The whole `__arrayLiteral!(T, n)` business 
comes after a kind of semi-constant folding that only applies to 
the length. In no way is `xs ~ x` a compile-time constant, as `x` 
is a run-time value.

However, if you use `(b ? xs : xs)` instead of plain `xs` with a 
run-time boolean `b`, the language doesn’t see it as an array 
with compile-time-known length, and thus requires allocation.

In your example, you’re not assigning an array literal to a 
static array as far as the type system is concerned. The 
left-hand side is `a[]`, which has type `int[]`. So you assign an 
array literal to an `int[]`, and that requires allocating the 
literal on the GC heap, rendering the function non-`@nogc`. If 
the optimizer figures out that all of this ends up just putting 
some values in some static array and it removes the allocation, 
this has no effect on whether the function is `@nogc`.
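
One way to stay within `@nogc`, sketched under the reasoning 
above (the local name `values` is my own choice): let the literal 
initialize a static array first, then copy that into `a`.

```d
// Sketch of a @nogc-friendly variant of the quoted `assign`.
void assign(ref int[4] a) @nogc
{
    // The literal initializes a static array here, so the type system
    // never sees it being assigned to an int[].
    int[4] values = [1, 3, 6, 9];

    // Copying one static array into another does not involve the GC.
    a = values;
}

void main()
{
    int[4] a;
    assign(a);
    assert(a == [1, 3, 6, 9]);
}
```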

