GC, the simple solution
Sean Kelly
sean at f4.ca
Tue Jun 13 19:40:07 PDT 2006
Bruno Medeiros wrote:
>
> Ah, I see. I had recently read about relative out-of-order execution
> problems in the Memory Barrier wikipedia entry, and I got the impression
> (out of nowhere) that the hardware took care of that transparently
> (i.e., without program intervention), but then, that is not the case,
> not allways at least?
A single CPU is allowed to do whatever it wants so long as it can fool
the user into thinking it's executing instructions in a purely
sequential manner. However, the only information other CPUs in the
system have is what they observe on the system bus. Obviously, so long
as CPUs aren't sharing data there aren't any problems. But things get
sticky when this isn't the case. The memory model defines observable
behavior allowed for a given architecture, as well as any methods
offered for affecting that behavior.
Say a specific architecture can operate much more quickly if it is
allowed to perform writes to memory (from cache) in order of ascending
address rather than in the order they were issued in code. There's no
way to lie to other CPUs about the order in which writes occur and still
have the optimization have any effect, so the designers state in the
memory model spec that this architecture is allowed to reorder writes
and then typically provide some means for overriding this behavior
should it prove necessary.
Now let's suppose you have two threads doing something like this:
thread/CPU 1:
A = 1;
B = 2;
thread/CPU 2:
if( A == 1 )
{
assert( B == 2 );
}
Given the order in which a and b were declared, and therefore the order
in which the writes occur, this assert may or may not fail.
Enter memory barriers. Memory barriers are simply a way for the
programmer to tell the CPU "I don't care if your way is faster, A simply
must be written to memory before B or thread 2 will explode." So the
CPU behind thread 1 does as it's told at great expense to performance
and thread 2 doesn't melt down.
The sticky part is that hardware designers don't agree with one another
on how things should work and they never take the advice of the software
people, so all architectures have different sets of observable behavior
and different methods for working around it when necessary. However,
the common concept is that memory barriers all constrain the order in
which memory accesses may occur with respect to each other. Think of it
as an assertion that "X may not occur before Y" or "X may not occur
after Y" at the instruction level.
The x86 is actually a bit weird in this regard as it has no formal
memory barriers for normal operations (though it has the FENCE
instructions for SSE use). I think this is largely for historical
reasons--x86 PCs couldn't do SMP at all until fairly recently so none of
this mattered, and the memory model has always been fairly strict (it
was actually sequential until not terribly long ago). Also, the LOCK
instruction acts as a heavy-handed sort of memory barrier as well, so
there has been little motivation to add new instructions for
finer-grained control.
Sean
More information about the Digitalmars-d
mailing list