alignment on stack-allocated arrays/structs

Wed Nov 18 11:26:45 PST 2009

Trass3r wrote:
> Don wrote:
>> Well, sort of.
>> It's impossible to align stack-allocated structs with any alignment 
>> greater than the alignment of the stack itself (which is 4 bytes). 
>> Anything larger than that and you HAVE to use the heap or alloca().
>>
> 
> So how do other compilers supporting that alignment syntax do it?

It might only be required on particular CPUs/OSes. E.g., the requirements 
for SPARC are quite different.
Some of them might be doing alloca() under the covers.
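
To make that concrete, here is a rough C sketch of the general trick: 
over-allocate on the stack and round the pointer up to the boundary you 
want. It only illustrates the technique and is not what any particular 
compiler actually emits. The names align_up and sum4_on_aligned_stack are 
made up for this example, and it assumes float is 4 bytes with 4-byte 
alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round p up to the next multiple of align (align must be a power of two). */
void *align_up(void *p, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    uintptr_t addr = (uintptr_t)p;
    return (void *)((addr + (align - 1)) & ~(uintptr_t)(align - 1));
}

/* Hypothetical helper: copy four floats into a 16-byte-aligned window of a
   stack array, report whether the window is aligned, and return their sum. */
float sum4_on_aligned_stack(const float *src, int *is_aligned)
{
    /* Assumption: a float array is at least 4-byte aligned, so at most
       12 bytes (3 floats) of padding are needed to reach a 16-byte boundary. */
    float raw[4 + 3];
    float *v = align_up(raw, 16);

    *is_aligned = ((uintptr_t)v % 16) == 0;

    float sum = 0.0f;
    for (int i = 0; i < 4; ++i) {
        v[i] = src[i];   /* the aligned window lies entirely inside raw */
        sum += v[i];
    }
    return sum;
}
```

The cost is the wasted padding (up to 12 bytes here) plus the pointer 
arithmetic on every entry to the function.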

>> Nothing on x86 benefits from more than 16 byte alignment, AFAIK, and 
>> it's never mandatory to use more than 8 byte alignment. I don't know 
>> so much about the recent GPUs, though -- do they really require 16 
>> byte alignment or more?
>>
> 
> I'm not sure how exactly this works and why they require alignment. 
> Couldn't find anything about that in the clEnqueueWriteBuffer 
> description where data gets written into GPU memory.
> 
> 
> The specification for the OpenCL C language itself only states:
> 
> A data item declared to be a data type in memory is always aligned to 
> the size of the data type in bytes.  For example, a float4 variable will 
> be aligned to a 16-byte boundary, a char2 variable will be aligned to a 
> 2-byte boundary.
> 
> A built-in data type that is not a power of two bytes in size must be 
> aligned to the next larger power of two.  This rule applies to built-in 
> types only, not structs or unions.
> 
> 
> 
> They also strangely state:
> 
> The components of vector data types with 1 ... 4 components can be 
> addressed as <vector_data_type>.xyzw.
> 
> float4 c, a, b;
> 
> c.xyzw = (float4)(1.0f, 2.0f, 3.0f, 4.0f);
> c.z = 1.0f;         // is a float
> c.xy = (float2)(3.0f, 4.0f); // is a float2
> 
> 
> 
> So I wonder why they used arrays in the headers and not structs to be 
> consistent with this.
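
For what it's worth, a host-side C header could in principle offer both 
views at once with a union. The following is only a sketch under that 
assumption, written in C11. float4_sketch and make_float4 are hypothetical 
names, and this is not how the actual OpenCL headers are defined.

```c
#include <assert.h>
#include <stdalign.h>

/* Hypothetical host-side vector type: array view plus named components. */
typedef union {
    alignas(16) float s[4];           /* array view, 16-byte aligned */
    struct { float x, y, z, w; };     /* component view (C11 anonymous struct) */
} float4_sketch;

/* Compile-time checks that the layout matches a packed 16-byte vector. */
static_assert(sizeof(float4_sketch) == 16, "expected four packed floats");
static_assert(alignof(float4_sketch) == 16, "expected 16-byte alignment");

/* Build a value through the array view. */
float4_sketch make_float4(float x, float y, float z, float w)
{
    float4_sketch v = { .s = { x, y, z, w } };
    return v;
}
```

With this, v.z and v.s[2] name the same float. Swizzles such as .xy still 
cannot be expressed this way in plain C.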