alignment on stack-allocated arrays/structs

Wed Nov 18 09:18:19 PST 2009

Don wrote:
> Well, sort of.
> It's impossible to align stack-allocated structs with any alignment 
> greater than the alignment of the stack itself (which is 4 bytes). 
> Anything larger than that and you HAVE to use the heap or alloca().
> 

So how do other compilers supporting that alignment syntax do it?
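
As far as I understand it, a compiler can either realign the stack
pointer in the function prologue or over-allocate the local and round
its address up. Here is a minimal sketch of the second idea done by
hand in C. The names align_up and demo_aligned_local are made up for
illustration; this is not taken from any particular compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round p up to the next multiple of align.
   align must be a non-zero power of two. */
static void *align_up(void *p, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    uintptr_t addr = (uintptr_t)p;
    return (void *)((addr + (align - 1)) & ~(uintptr_t)(align - 1));
}

/* Carve a 16-byte aligned, 16-byte slot out of an oversized local
   buffer. Returns 1 if the slot is aligned and lies entirely inside
   the buffer. */
static int demo_aligned_local(void)
{
    unsigned char raw[16 + 15];             /* payload + worst-case padding */
    unsigned char *slot = align_up(raw, 16);
    return ((uintptr_t)slot % 16 == 0)
        && slot >= raw
        && slot + 16 <= raw + sizeof raw;
}
```

The cost is up to align - 1 wasted bytes per object, which is why the
buffer above is 16 + 15 bytes long.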

> Nothing on x86 benefits from more than 16 byte alignment, AFAIK, and 
> it's never mandatory to use more than 8 byte alignment. I don't know so 
> much about the recent GPUs, though -- do they really require 16 byte 
> alignment or more?
> 

I'm not sure exactly how this works or why they require alignment. 
I couldn't find anything about it in the clEnqueueWriteBuffer 
description, which covers writing data into GPU memory.

The specification for the OpenCL C language itself only states:

A data item declared to be a data type in memory is always aligned to 
the size of the data type in bytes.  For example, a float4 variable will 
be aligned to a 16-byte boundary, a char2 variable will be aligned to a 
2-byte boundary.

A built-in data type that is not a power of two bytes in size must be 
aligned to the next larger power of two.  This rule applies to built-in 
types only, not structs or unions.
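
Read literally, the two quoted rules boil down to "alignment = size
rounded up to a power of two". A small C sketch of that reading
follows. The function names are my own, and the 12-byte case is a
hypothetical size chosen only to exercise the second rule.

```c
#include <assert.h>
#include <stddef.h>

/* Smallest power of two that is >= n (n must be non-zero). */
static size_t next_pow2(size_t n)
{
    assert(n != 0);
    size_t p = 1;
    while (p < n)
        p <<= 1;
    return p;
}

/* Alignment of a built-in type under the quoted rules:
   a power-of-two size is its own alignment, anything else
   is rounded up to the next larger power of two. */
static size_t builtin_alignment(size_t size_in_bytes)
{
    return next_pow2(size_in_bytes);
}
```

With this, builtin_alignment(16) gives 16 for float4 and
builtin_alignment(2) gives 2 for char2, matching the examples in the
spec text, while a 12-byte type would be aligned to 16.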

They also strangely state:

The components of vector data types with 1 ... 4 components can be 
addressed as <vector_data_type>.xyzw.

float4 c, a, b;

c.xyzw = (float4)(1.0f, 2.0f, 3.0f, 4.0f);
c.z = 1.0f;         // is a float
c.xy = (float2)(3.0f, 4.0f); // is a float2

So I wonder why the headers define these types as arrays rather than 
structs, which would be consistent with this component syntax.
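
To illustrate what I mean, here is a sketch of a host-side type in C
that allows both indexed and named component access. my_float4 and
make_float4 are hypothetical names, not what the OpenCL headers
declare, and it assumes a C11 compiler (anonymous struct, _Alignas).

```c
#include <assert.h>   /* static_assert (C11) */

/* Hypothetical host-side vector type: the array and the anonymous
   struct share the same 16 bytes of storage. */
typedef union {
    _Alignas(16) float s[4];          /* indexed access: v.s[i]      */
    struct { float x, y, z, w; };     /* named access: v.x ... v.w   */
} my_float4;

static_assert(sizeof(my_float4) == 16, "four packed floats expected");
static_assert(_Alignof(my_float4) == 16, "16-byte alignment expected");

/* Build a vector from its four components. */
static my_float4 make_float4(float x, float y, float z, float w)
{
    my_float4 v;
    v.x = x;
    v.y = y;
    v.z = z;
    v.w = w;
    return v;
}
```

After my_float4 v = make_float4(1, 2, 3, 4); both v.z and v.s[2] refer
to the same component. Swizzles such as .xy still can't be expressed
this way in plain C.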