__vector(ubyte[32]) misalignment

Steven Schveighoffer schveiguy at gmail.com
Sun Aug 9 12:37:06 UTC 2020


On 8/9/20 8:09 AM, Bruce Carneal wrote:
> On Sunday, 9 August 2020 at 09:58:18 UTC, Johan wrote:
>> On Sunday, 9 August 2020 at 01:03:51 UTC, Bruce Carneal wrote:
>>> The .alignof attribute of __vector(ubyte[32]) is 32 but initializing 
>>> an array of such vectors via an assignment to .length has given me 16 
>>> byte alignment (and subsequent seg faults which I suspect are related).
>>>
>>> Is sub .alignof alignment expected here?  IOW, do I have to manually 
>>> manage memory if I want alignments above 16?
>>
>> Do you have a code example?
>> And what compiler are you using?
>>
>> -Johan
> 
> At run.dlang.io, recent runs of the code below, compiled with both dmd 
> and ldc, revealed misalignment.
> 
> import std;
> 
> void main() @safe
> {
>      // requires -mcpu=native or other on cmd line
>      alias V = __vector(ubyte[32]);
>      V[] va;
>      size_t misalignments;
>      foreach(N; 1..101) {
>          va.length = N;
>          const uptr = cast(ulong)va.ptr;
>          misalignments += (uptr % V.alignof) != 0;
>      }
>      writefln("misaligned %s per cent of the time", misalignments);
> }

All blocks in the GC that are larger than 16 bytes are aligned to 32 
bytes. You shouldn't have any 16-byte blocks here, because each element 
is 32 bytes long.

However, if your block grows to a page size, the alignment will be 16 
bytes off (due to the metadata stored at the front of the block).

A page is 4096 bytes, so anything larger than 2048 bytes will require a 
page-sized block or larger.

I would guess that once your array gets longer than 63 elements 
(63 * 32 = 2016 bytes), it's always misaligned?
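
A rough sketch for checking that guess: instead of only counting 
misalignments, record which lengths produce a misaligned pointer. The 
helper name misalignedLengths is made up for illustration, and the code 
assumes a target with 256-bit vector support (e.g. -mcpu=native). The 
exact lengths reported may vary with the druntime version.

```d
import std.stdio : writeln;

// Assumes 256-bit vector support is enabled on the command line.
alias V = __vector(ubyte[32]);

// Hypothetical helper: grows one array from length 1 to maxLen and
// returns every length at which va.ptr was not aligned to V.alignof.
size_t[] misalignedLengths(size_t maxLen)
{
    V[] va;
    size_t[] bad;
    foreach (n; 1 .. maxLen + 1)
    {
        va.length = n; // may reallocate into a larger GC block
        if (cast(size_t) va.ptr % V.alignof != 0)
            bad ~= n;
    }
    return bad;
}

void main()
{
    // If the guess holds, the list should start somewhere around 64.
    writeln("misaligned lengths: ", misalignedLengths(100));
}
```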

The current code ensures a 16 byte alignment. That really should go to 
32 (for this reason). I think this has come up before; there may even be 
a bug report on it.

See: 
https://github.com/dlang/druntime/blob/660d911bbd3342c1f1c1478d12e3e943c6038da0/src/rt/lifetime.d#L35

The other thing you can do is avoid allocating using the array runtime, 
and just allocate using the GC calls directly. This means appending 
won't work, and neither will destructors (though that shouldn't be 
important here).
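
Something like the following is what I mean by using the GC calls 
directly. It's only a sketch: allocVectors is a made-up name, the memory 
comes back uninitialized, and the code checks the alignment rather than 
assuming it. It again assumes 256-bit vector support (e.g. 
-mcpu=native).

```d
import core.memory : GC;
import std.stdio : writeln;

// Assumes 256-bit vector support is enabled on the command line.
alias V = __vector(ubyte[32]);

// Hypothetical helper, not part of druntime: allocates n vectors
// straight from the GC, bypassing the array runtime and its metadata.
V[] allocVectors(size_t n) @trusted
{
    // NO_SCAN: the elements hold no pointers, so the GC can skip them.
    // GC.malloc does not initialize the memory.
    auto p = cast(V*) GC.malloc(n * V.sizeof, GC.BlkAttr.NO_SCAN);
    return p[0 .. n];
}

void main()
{
    size_t misaligned;
    foreach (n; 1 .. 101)
    {
        auto va = allocVectors(n);
        misaligned += (cast(size_t) va.ptr % V.alignof) != 0;
    }
    writeln("misaligned allocations: ", misaligned);
}
```

Appending to a slice obtained this way hands it back to the array 
runtime, which allocates a new block and copies, so the alignment 
problem can come back.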

Question for those in the know: are there any other alignments that we 
should ensure are possible?

-Steve

