Better alignment management

Tue Jul 27 13:17:26 PDT 2010

C# added LINQ mostly to reduce the programming/cognitive impedance in designing commercial applications, which are often made essentially of code that converts object structures to data for databases or the other way around. The D language, on the other hand, can become a bit more serious for numerical coding purposes, finding a (smaller) niche different from the C# one.

The D GC allocates memory with a 16-byte alignment (see also the recently partially fixed http://d.puremagic.com/issues/show_bug.cgi?id=4400 ), so instructions like MOVAPS, which require a 16-byte alignment, can be used in array operations.

But I don't know what the alignment of fixed-size arrays is. They too can be used with array ops, so they can cause bugs. See this:
http://d.puremagic.com/issues/show_bug.cgi?id=2278
And some implementation ideas, linked by Witold Baryluk:
http://gcc.gnu.org/ml/gcc/2008-01/msg00282.html

It's possible to test the alignment of the array contents before performing each array operation, but if you have fixed-size arrays of 4 floats you want their array ops to be implemented with just 1 inlined CPU instruction. Testing their alignment each time kills any performance gain.
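To make the run-time test concrete, here is a minimal sketch of what such a check could look like. The names isAligned and addInPlace are hypothetical (they are not part of druntime or Phobos), and the function only reports which path would be taken instead of emitting different instructions:

```d
// Hypothetical helper names; not part of druntime or Phobos.

/// True if `p` is a multiple of `alignment` (assumed to be a power of two).
bool isAligned(const(void)* p, size_t alignment)
{
    return (cast(size_t) p & (alignment - 1)) == 0;
}

/// Performs a[] += b[] and returns true when both operands are 16-aligned,
/// that is, when a MOVAPS-style aligned path could have been used.
bool addInPlace(float[] a, const(float)[] b)
{
    assert(a.length == b.length);
    immutable fast = isAligned(a.ptr, 16) && isAligned(b.ptr, 16);
    // A real implementation would choose aligned or unaligned loads here;
    // this sketch lets the built-in array op do the work in both cases.
    a[] += b[];
    return fast;
}
```

For a float[4] the two pointer tests and the branch cost about as much as the addition itself, which is the overhead a statically known alignment would remove.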

And then the future AVX instructions like VMOVAPS need memory aligned to 32 bytes:
http://fzj.helmholtz.de/jsc/docs/vendordocs/cce/doc/main_cls/intref_cls/common/intref_avx_load_ps.htm

Making the D GC spit out all memory aligned to 32 bytes is not a good idea.

Even returning all memory aligned to 16 bytes is a waste of memory, because in many situations you don't need to perform array ops. In general an 8-byte alignment can be enough (unless you have real numbers on non-Windows systems).

So I think the current management of array alignments in D is not good enough. The align() syntax can be extended to arrays too:

align(16) int[] arr = align(16) new int[16]; // dynamic, for SSE
align(32) float[8] arr2; // static, for AVX
align(1) ubyte[8] arr3; // can save some space on the stack

Alignment annotations are supported by GNU C too, but the compiler sees them only in a limited spot of the program, while in D the *type* of an array can contain its alignment too.

An array has a default alignment, for example 4 or 8 bytes. An array with an alignment of 16 or 32 is a subtype of the array with default alignment. So you can assign a 16-aligned dynamic array to an 8-aligned one, but you can't assign an 8-aligned array to a 16-aligned one without a cast. The same holds for assigning a 1-aligned array to an 8-aligned one.

If you have the alignment statically encoded into the array type, then you need to manage slicing in a somewhat more restricted way: if you have a 16-aligned array of floats and you want to slice it, you can slice it in an arbitrary way and produce slices that have a 4-aligned type. Or you can impose run-time or compile-time tests on the modulus of the slicing bounds and then produce slices that have an 8-aligned or 16-aligned type.
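The modulus test on the lower slicing bound can be sketched as a small function. sliceAlignment is a hypothetical name, and the sketch assumes the base alignment is a power of two. It returns the alignment that can be guaranteed for the start of arr[lo .. $]:

```d
// Hypothetical helper; the name and signature are assumptions.

/// Alignment in bytes that can be guaranteed for the start of arr[lo .. $],
/// given that arr.ptr is `baseAlign`-aligned (a power of two) and each
/// element takes `elemSize` bytes.
size_t sliceAlignment(size_t baseAlign, size_t elemSize, size_t lo)
{
    immutable offset = lo * elemSize;          // byte offset of the slice start
    if (offset == 0)
        return baseAlign;                      // the slice starts at the base
    immutable lowBit = offset & (~offset + 1); // largest power of two dividing offset
    return lowBit < baseAlign ? lowBit : baseAlign;
}
```

Since it is plain arithmetic it also works at compile time when the bound is a constant, for example static assert(sliceAlignment(16, float.sizeof, 2) == 8);. With a bound known only at run time the same call gives the value to check before casting the slice to a more aligned type.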

If this align() extension is introduced, then I think the GC can be free to allocate arrays 8-aligned by default, saving some memory.

Dynamic arrays with a specified alignment can be created as library types too (they can even support slicing as I have explained). The built-in array ops can then just recognize such library-defined types and avoid the run-time alignment tests on them (and the GC can create dynamic arrays 8-aligned by default). But I think 16-aligned or 32-aligned stack-allocated fixed-size arrays are harder to implement as library types.
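A minimal sketch of such a library type follows. AlignedArray is a hypothetical name (not a Phobos type). It assumes that T contains no pointers, that A is a power of two not smaller than T.alignof, and it gets the alignment by over-allocating and skipping a few bytes, which is only one of the possible strategies:

```d
// A minimal sketch; AlignedArray is a hypothetical name, not a Phobos type.
// Assumes T contains no pointers (the backing store is a ubyte[], which the
// GC does not scan) and that A is a power of two >= T.alignof.
struct AlignedArray(T, size_t A)
{
    private T[] data;

    this(size_t n)
    {
        // Over-allocate so that an A-aligned start always fits in the block.
        auto raw = new ubyte[n * T.sizeof + A - 1];
        immutable addr = cast(size_t) raw.ptr;
        // Bytes to skip to reach the next multiple of A (0 if already aligned).
        immutable skip = (A - (addr & (A - 1))) & (A - 1);
        // The slice is an interior pointer, which keeps the GC block alive.
        // Elements start zero-filled, not T.init.
        data = cast(T[]) raw[skip .. skip + n * T.sizeof];
    }

    /// The whole contents as a normal D slice, so array ops work on it.
    T[] opSlice() { return data; }

    size_t length() const { return data.length; }
}
```

Usage would be something like auto v = AlignedArray!(float, 32)(8); v[][] = 1.0f;. The alignment is part of the type here, so array ops that knew about it could skip the run-time test. The price is up to A - 1 wasted bytes per array.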

Bye,
bearophile