Optimal struct layout template?
Don
nospam at nospam.com
Thu Jan 8 02:59:13 PST 2009
Andrei Alexandrescu wrote:
> Don wrote:
>> Denis Koroskin wrote:
>>> On Thu, 18 Dec 2008 01:38:32 +0300, Sergey Gromov
>>>> Thu, 18 Dec 2008 00:21:58 +0300, Denis Koroskin wrote:
>>>>> On Thu, 18 Dec 2008 00:12:18 +0300, Sergey Gromov
>>>>>> Tue, 16 Dec 2008 10:09:41 +0100, Don wrote:
>>>>>>> struct Foo { double, int } // 12 byte size, wants 8-byte alignment.
>>>>>> Foo.sizeof is 16.
>>>>> It is 12 with align(1) (I see no problem with it).
>>>> With align(1) the Foo.alignof is 1 so there is no problem to talk
>>>> about.
>>> No, there is. You should have structure as follows:
>>>
>>> align(1):
>>> struct Foo
>>> {
>>> char c;
>>> double d;
>>> short s;
>>> bool b;
>>> }
>>
>> [snip]
>>
>>> so that Foo.d.alignof == 8. Align(1) is still necessary so that
>>> Foo.sizeof == 12.
>>
>> Sergey is correct to the extent that in DMD, x.alignof <= x.sizeof for
>> any x, and also x.sizeof is always an integer multiple of x.alignof.
>> When you assume that DMD is correct, then the simple ordering by
>> .alignof indeed gives an optimal solution.
>> But these values *don't* reflect the machine behaviour.
>>
>> real.alignof is 2. But it's significantly faster (factor of 2) on most
>> x86 machines to have real aligned to 16.
>>
>> These effects are not captured by the .alignof property for structs.
>> To capture it in the Foo example, you'd need (for example) a
>> Foo.memberalignof property, and a Foo.memberalignoffsetof property.
>> But you could reasonably argue that if you're specifying align(1) then
>> alignment is not important.
>> The simple algorithm may well be good enough.
>> (And note that you can use radix sort -- alignof can only be 1, 2, 4,
>> 8, or 16 -- so it's in O(n) not O(n lg n)).
>>
>> Interestingly, this 'internal alignment' applies to function
>> alignment, as well. D allows you to specify that a function be aligned
>> to (for example) a 16-byte boundary. This is hardly ever useful;
>> usually what you want is for the innermost loop to be aligned. And
>> instead of a pile of nops before the loop, you'd like the function
>> start address to be adjusted.
>
> That's cool! I think the way to the templates AlignForSpeed!(Foo) and
> AlignForSize!(Foo) is paved. Don, now all you have to do is a simple
> matter of coding :o).
>
> Andrei
Here's a very simple implementation that should solve the problem
described in the original post. It doesn't deal with the obscure and
difficult cases of align(1) structs, or 80-bit reals.
Implementation includes workarounds for FOUR compiler bugs.
Works without modification on both D1 and D2.
----
/// Order the provided members to minimize size while preserving alignment
/// BUG: bugzilla 2029 prevents the signature from being (string[]
names...),
/// so we need to use an ugly array literal instead.
char [] alignForSize(E...)(string[E.length] names)
{
// Sort all of the members by .alignof.
// BUG: Alignment is not always optimal for align(1) structs
// or 80-bit reals.
// TRICK: Use the fact that .alignof is always a power of 2,
// and maximum 16 on extant systems. Thus, we can perform
// a very limited radix sort.
int [][] alignlist; // workaround for bugzilla 2569
// alignof = 1, 2, 4, 8, 16, 32, 64
alignlist = [ [],[],[],[],[],[],[]]; // workaround for bugzilla 2562
char[][] declaration;
foreach(int i_bug,T; E) {
int i = i_bug; // workaround for bugzilla 2564 (D2 only)
declaration ~= T.stringof ~ " " ~ names[i].dup ~ ";\n";
int a = T.alignof;
int k = a==2? 1 : a==4? 2 : a==8? 3 : a==16? 4 : a==32? 5 :
a>=64? 6 : 0;
alignlist[k]~=i;
}
char [] s;
foreach_reverse(q; alignlist) {
foreach(int i; q) {
s ~= declaration[i];
}
}
return s;
}
// Example:
struct Foo {
mixin(alignForSize!(int[], char[13], short, double[5])(["x",
"y","z", "w"]));
}
is equivalent to:
struct Foo{
double[5u] w; // aligned to 8 bytes
int[] x; // aligned to 4
short z; // aligned to 2
char[13u] y; // aligned to 1
}
which will have a size of 63, followed by 1 byte of padding.
More information about the Digitalmars-d
mailing list