misaligned read handling on various processors

Michel Fortin michel.fortin at michelf.com
Tue Oct 6 10:31:12 PDT 2009


On 2009-10-06 09:58:42 -0400, Andrei Alexandrescu 
<SeeWebsiteForEmail at erdani.org> said:

> Consider:
> 
> struct A {
>      char a;
>      align(1) int b;
> }
> 
> Accesses to b will be rather slow because it's a misaligned read. My 
> question is, how exactly is that handled on various processors? I seem 
> to recall various anecdotes (including that misaligned reads on Intel 
> cause a trap that does the needed double reading, shifting, and 
> masking), but Google search has surprisingly little on the matter.

Wikipedia: 
<http://en.wikipedia.org/wiki/Data_structure_alignment#Architectures>

RISC

Most RISC processors will generate an alignment fault when a load or 
store instruction accesses a misaligned address. This allows the 
operating system to emulate the misaligned access using other 
instructions. For example, the alignment fault handler might use byte 
loads or stores (which are always aligned) to emulate a larger load or 
store instruction.
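
To make the byte-load emulation concrete, here is a minimal sketch in C of what such a handler effectively computes. The function names are hypothetical, and little-endian byte order is an assumption. A real fault handler would also have to decode the faulting instruction and write the result to the right register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read a 32-bit little-endian value from any address
   using only byte loads, which can never be misaligned. */
static uint32_t load_u32_le_bytewise(const unsigned char *p)
{
    assert(p != NULL);
    return  (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: the matching store, again one byte at a time. */
static void store_u32_le_bytewise(unsigned char *p, uint32_t v)
{
    assert(p != NULL);
    p[0] = (unsigned char)(v & 0xFFu);
    p[1] = (unsigned char)((v >> 8) & 0xFFu);
    p[2] = (unsigned char)((v >> 16) & 0xFFu);
    p[3] = (unsigned char)((v >> 24) & 0xFFu);
}
```

Calling store_u32_le_bytewise(buf + 1, x) followed by load_u32_le_bytewise(buf + 1) round-trips x at an odd address without ever issuing a wide misaligned access.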

Some architectures like MIPS have special unaligned load and store 
instructions. One unaligned load instruction gets the bytes from the 
memory word with the lowest byte address and another gets the bytes 
from the memory word with the highest byte address. Similarly, 
store-high and store-low instructions store the appropriate bytes in 
the higher and lower memory words respectively.

The Alpha architecture has a two-step approach to unaligned loads and 
stores. The first step is to load the upper and lower memory words into 
separate registers. The second step is to extract or modify the memory 
words using special low/high instructions similar to the MIPS 
instructions. An unaligned store is completed by storing the modified 
memory words back to memory. The reason for this complexity is that the 
original Alpha architecture could only read or write 32-bit or 64-bit 
values. This proved to be a severe limitation that often led to code 
bloat and poor performance. To address this limitation, an extension 
called the Byte Word Extensions (BWX) was added to the original 
architecture. It consisted of instructions for byte and word loads and 
stores.
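
The two-step load can be modelled in portable C roughly as follows. This is only a sketch of the idea, not Alpha code: the function name is hypothetical, memory is represented as an array of aligned 64-bit words, and each word is interpreted as little-endian (byte 0 in the low bits).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: fetch a 32-bit value starting at byte offset byte_off
   inside an array of aligned 64-bit words. The caller must make sure that
   words[(byte_off + 3) / 8] is a valid element. */
static uint32_t load_u32_two_step(const uint64_t *words, size_t byte_off)
{
    assert(words != NULL);

    /* Step 1: two aligned loads, covering the first and the last byte. */
    uint64_t lo = words[byte_off / 8];
    uint64_t hi = words[(byte_off + 3) / 8];

    /* Step 2: extract the low and high pieces and merge them. */
    unsigned shift = (unsigned)(byte_off % 8) * 8;
    uint64_t low_part  = lo >> shift;
    /* A shift by 64 is undefined in C, so the aligned case is special-cased. */
    uint64_t high_part = shift ? (hi << (64 - shift)) : 0;

    /* When the value does not straddle two words, hi == lo and the bits
       contributed by high_part all land above bit 31, so the truncation
       below discards them. */
    return (uint32_t)(low_part | high_part);
}
```

The cost is visible even in this model: two loads, two shifts, and an OR replace what would otherwise be a single load instruction.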

Because these unaligned-access instructions are larger and slower than 
the normal memory load and store instructions, they should only be used 
when necessary. Most C and C++ compilers have an “unaligned” attribute 
that can be applied to pointers that need the unaligned instructions.
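
The spelling of such an attribute varies by compiler. As one illustration, assuming GCC or Clang, the packed attribute gives a C layout like the struct A in the question. The memcpy helper is a portable alternative that lets the compiler pick whatever access sequence the target needs. Both function names are made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumes GCC or Clang: packed removes the padding after 'a', so 'b'
   lives at offset 1 and every access to it is potentially misaligned. */
struct A {
    char    a;
    int32_t b;
} __attribute__((packed));

/* The compiler knows s->b may be misaligned and emits a safe access. */
static int32_t read_b(const struct A *s)
{
    assert(s != NULL);
    return s->b;
}

/* Portable alternative: copy the bytes into an aligned local variable. */
static int32_t read_i32_unaligned(const void *p)
{
    int32_t v;
    assert(p != NULL);
    memcpy(&v, p, sizeof v);
    return v;
}
```

Taking the address of s->b and passing it around as a plain int32_t pointer loses the "may be misaligned" information, which is why the pointer itself needs the annotation (or a helper like read_i32_unaligned).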

x86 and x86-64

While the x86 architecture originally did not require aligned memory 
access and still works without it, some SSE and SSE2 instructions on x86 
CPUs (for example the aligned moves MOVAPS and MOVDQA) do require their 
memory operands to be 128-bit (16-byte) aligned, and there can be 
substantial performance advantages from using aligned data on these 
architectures.
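
A small sketch of the aligned versus unaligned distinction for 128-bit data, using the SSE2 intrinsics from emmintrin.h. The helper names are hypothetical, and the code falls back to memcpy when the compiler does not define __SSE2__ (for instance on a non-x86 target).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: true if p is on a 16-byte boundary. */
static int is_aligned_16(const void *p)
{
    return ((uintptr_t)p % 16) == 0;
}

/* Hypothetical helper: copy 16 bytes, using the aligned load only when
   the source address is known to be 16-byte aligned. */
static void copy16(void *dst, const void *src)
{
    assert(dst != NULL && src != NULL);
#if defined(__SSE2__)
    __m128i v = is_aligned_16(src)
        ? _mm_load_si128((const __m128i *)src)    /* requires 16-byte alignment */
        : _mm_loadu_si128((const __m128i *)src);  /* accepts any address */
    _mm_storeu_si128((__m128i *)dst, v);
#else
    memcpy(dst, src, 16);
#endif
}
```

Using _mm_load_si128 on an address that is not 16-byte aligned typically faults, so the run-time check here is what keeps the aligned path safe.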


-- 
Michel Fortin
michel.fortin at michelf.com
http://michelf.com/



