How to do aligned allocation?
tsbockman
thomas.bockman at gmail.com
Sat Oct 1 00:32:28 UTC 2022
On Friday, 30 September 2022 at 15:57:22 UTC, Quirin Schroll
wrote:
> When I do `new void[](n)`, is that buffer allocated with an
> alignment of 1 or what are the guarantees?
It is guaranteed an alignment of at least 1 because
`void.alignof == 1` (and because that is the lowest possible
integer alignment). When I last checked, `new T` guaranteed a
minimum alignment of `min(T.alignof, 16)`, meaning that all basic
scalar types (`int`, `double`, pointers, etc.) and SIMD
`__vector`s up to 128 bits will be correctly aligned, while
256-bit types (for example, AVX's `__vector(double[4])`) and
512-bit (AVX512) types might not be.
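One way to see what your compiler and target actually do is to compare `.alignof` against the addresses that `new` hands back. This is only a minimal sketch: the helper name `isAligned` is made up for illustration, and the runtime checks observe the behavior of whatever allocator is in use rather than prove a guarantee.

```D
// Hypothetical helper: true if `p` is a multiple of `alignment`.
bool isAligned(const(void)* p, size_t alignment) pure nothrow @nogc @trusted
{
    return (cast(size_t) p) % alignment == 0;
}

static assert(void.alignof == 1);

unittest
{
    // Scalars are expected to come back aligned to their own .alignof.
    assert(isAligned(new int, int.alignof));
    assert(isAligned(new double, double.alignof));

    // A void[] buffer only promises alignment 1, which any address satisfies.
    void[] raw = new void[](100);
    assert(isAligned(raw.ptr, void.alignof));
}
```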
Arrays and aggregate types (`struct`s and `class`es) by default
use the maximum alignment required by any of their elements or
fields (including hidden fields, like `__vptr` for `class`es).
This can be overridden manually using the `align` attribute,
which must be applied to the aggregate type as a whole. (Applying
`align` to an individual field does something else.)
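As a small illustration of the default and the override, the following sketch compares two otherwise identical aggregates. The type names `Plain` and `Wide` are hypothetical and exist only for this example.

```D
struct Plain
{
    ubyte tag;
    uint value;
}

align(16) struct Wide
{
    ubyte tag;
    uint value;
}

// Default: the largest alignment among the fields.
static assert(Plain.alignof == uint.alignof);
// Overridden for the aggregate as a whole.
static assert(Wide.alignof == 16);
```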
> How can I set an alignment?
If the desired alignment is `<= 16`, you can specify a type with
that `.alignof`.
However, if you may need higher alignment than the maximum
guaranteed to be available from the allocator, or you are not
writing strongly typed code to begin with, as implied by your use
of `void[]`, you can just align the allocation yourself:
```D
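import std.math : isPowerOf2;

void[] newAligned(const(size_t) alignment)(const(size_t) size)
    pure @trusted nothrow
    if(1 <= alignment && isPowerOf2(alignment))
{
    enum alignMask = alignment - 1;
    // Over-allocate so that an aligned run of `size` bytes always fits.
    void[] ret = new void[size + alignMask];
    // How far past an aligned address the buffer starts.
    const misalign = (cast(size_t) ret.ptr) & alignMask;
    // Bytes to skip to reach the next aligned address (0 if already aligned).
    const offset = (alignment - misalign) & alignMask;
    ret = ret[offset .. offset + size];
    return ret;
}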
```
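To make the masking arithmetic concrete, here is the same computation worked through at compile time for one made-up address. The address `0x1003` is an assumption chosen purely for illustration; a real allocator returns whatever it likes.

```D
// Hypothetical address, chosen only to illustrate the arithmetic.
enum size_t address   = 0x1003;
enum size_t alignment = 16;
enum size_t alignMask = alignment - 1;                     // 0x0F

enum size_t misalign = address & alignMask;                // 3 bytes past a boundary
enum size_t offset   = (alignment - misalign) & alignMask; // 13 bytes to the next one

static assert(misalign == 3);
static assert(offset == 13);
static assert((address + offset) % alignment == 0);        // lands on 0x1010

// An already aligned address needs no adjustment: the final `& alignMask`
// turns 16 into 0.
static assert(((alignment - (0x1000 & alignMask)) & alignMask) == 0);
```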
However, aligning memory outside of the allocator itself like
this does waste up to `alignment - 1` bytes per allocation, so
it's best to use as much of the allocator's internal alignment
capability as possible:
```D
import core.bitop : bsr;
import std.math : isPowerOf2;
import std.meta : AliasSeq;

void[] newAligned(const(size_t) alignment)(const(size_t) size)
    pure @trusted nothrow
    if(1 <= alignment && isPowerOf2(alignment))
{
    alias Aligned = .Aligned!alignment;

    void[] ret = new Aligned.Chunk[(size + Aligned.mask) >> Aligned.chunkShift];
    static if(Aligned.Chunk.alignof == alignment)
        enum size_t offset = 0;
    else {
        const misalign = (cast(size_t) ret.ptr) & Aligned.mask;
        const offset = (alignment - misalign) & Aligned.mask;
    }
    ret = ret[offset .. offset + size];
    return ret;
}

private {
    align(16) struct Chunk16 {
        void[16] data;
    }

    template Aligned(size_t alignment)
        if(1 <= alignment && isPowerOf2(alignment))
    {
        enum int shift = bsr(alignment);
        enum size_t mask = alignment - 1;

        static if(alignment <= 16) {
            enum chunkShift = shift, chunkMask = mask;
            alias Chunk = AliasSeq!(ubyte, ushort, uint, ulong, Chunk16)[shift];
        } else {
            enum chunkShift = Aligned!(16).shift, chunkMask = Aligned!(16).mask;
            alias Chunk = Aligned!(16).Chunk;
        }
    }
}

@safe unittest {
    static immutable(size_t[]) alignments =
        [ 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024 ];
    static immutable(size_t[]) sizes =
        [ 9, 31, 4, 57, 369, 3358 ];

    foreach(size; sizes) {
        static foreach(alignment; alignments) { {
            void[] memory = newAligned!alignment(size);
            assert(memory.length == size);
            assert((cast(size_t) &(memory[0])) % alignment == 0);
        } }
    }
}
```
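For alignments above 16, the chunk count has to leave room for the worst-case offset. The sketch below checks that arithmetic for one sample request (64-byte alignment, 57 bytes). It assumes the allocator really does return a 16-aligned pointer for `Chunk16`, and the constant names are mine rather than part of the code above.

```D
enum size_t alignment  = 64;   // requested alignment, larger than 16
enum size_t size       = 57;   // requested bytes
enum size_t chunkBytes = 16;   // size of one 16-byte chunk
enum size_t chunkShift = 4;    // log2(chunkBytes)

enum size_t chunks    = (size + (alignment - 1)) >> chunkShift;
enum size_t allocated = chunks * chunkBytes;

// With a 16-aligned start, the worst-case offset to the next 64-byte
// boundary is 64 - 16 = 48 bytes.
enum size_t worstOffset = alignment - chunkBytes;

static assert(chunks == 7);
static assert(allocated == 112);
static assert(worstOffset + size <= allocated);   // 105 <= 112
```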
> Also, is the alignment of any type guaranteed to be a power of
> 2?
In practice, yes.
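If you would rather have the compiler confirm this for the types you care about, a compile-time check is cheap. A minimal sketch; the type list and the `Mixed` struct are arbitrary examples, so substitute your own:

```D
import std.math : isPowerOf2;
import std.meta : AliasSeq;

struct Mixed { ubyte a; double b; }   // hypothetical aggregate

static foreach (T; AliasSeq!(ubyte, short, int, long, float, double, real, void*, Mixed))
{
    // Fails to compile, naming the type, if an alignment is not a power of 2.
    static assert(isPowerOf2(T.alignof), T.stringof);
}
```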
On Friday, 30 September 2022 at 16:23:00 UTC, mw wrote:
> https://dlang.org/library/core/stdc/stdlib/aligned_alloc.html
>
> It's the C func, so check C lib doc.
https://en.cppreference.com/w/c/memory/aligned_alloc
Note that common implementations place arbitrary restrictions on
the alignments and sizes accepted by `aligned_alloc` (the C11
text, for instance, requires `size` to be an integral multiple of
`alignment`), so to support the general case you would still need
a wrapper function like the one I provided above.
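For comparison, a thin wrapper over the C function might look roughly like the sketch below. The name `cAlignedAlloc` is hypothetical. It assumes that your druntime declares `aligned_alloc` for your platform and that your C library accepts the requested alignment; some reject small ones, which is why the test tolerates a `null` result. The memory is not managed by the GC and has to be released with `free`.

```D
import core.stdc.stdlib : aligned_alloc, free;

// Hypothetical wrapper; returns null on failure.
// Release the result with free(result.ptr).
void[] cAlignedAlloc(size_t alignment, size_t size) @system
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0,
        "alignment must be a power of 2");
    const mask = alignment - 1;
    // Round the request up to a multiple of the alignment.
    const rounded = (size + mask) & ~mask;
    void* p = aligned_alloc(alignment, rounded);
    return p is null ? null : p[0 .. size];
}

unittest
{
    void[] memory = cAlignedAlloc(64, 57);
    if (memory.ptr !is null)
    {
        scope(exit) free(memory.ptr);
        assert(memory.length == 57);
        assert((cast(size_t) memory.ptr) % 64 == 0);
    }
}
```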
(If this all seems overly complicated, that's because it is. I
have no idea why allocators don't just build in the logic above;
it's extremely simple compared to the rest of what a good
general-purpose heap allocator does.)