GC, the simple solution
Daniel Keep
daniel.keep.lists at gmail.com
Wed Jun 14 03:45:21 PDT 2006
Sean Kelly wrote:
> Bruno Medeiros wrote:
>>
>> Ah, I see. I had recently read about problems related to out-of-order
>> execution in the Memory Barrier Wikipedia entry, and I got the
>> impression (out of nowhere) that the hardware took care of that
>> transparently (i.e., without program intervention), but then, that is
>> not the case, not always at least?
>
> A single CPU is allowed to do whatever it wants so long as it can fool
> the user into thinking it's executing instructions in a purely
> sequential manner. However, the only information other CPUs in the
> system have is what they observe on the system bus. Obviously, so long
> as CPUs aren't sharing data there aren't any problems. But things get
> sticky when this isn't the case. The memory model defines observable
> behavior allowed for a given architecture, as well as any methods
> offered for affecting that behavior.
>
> Say a specific architecture can operate much more quickly if it is
> allowed to perform writes to memory (from cache) in order of ascending
> address rather than in the order they were issued in code. There's no
> way to lie to other CPUs about the order in which writes occur and still
> get any benefit from the optimization, so the designers state in the
> memory model spec that this architecture is allowed to reorder writes
> and then typically provide some means for overriding this behavior
> should it prove necessary.
>
> Now let's suppose you have two threads doing something like this:
>
> thread/CPU 1:
>
> A = 1;
> B = 2;
>
> thread/CPU 2:
>
> if( B == 2 )
> {
>     assert( A == 1 );
> }
>
> Given the order in which A and B were declared, and therefore the order
> in which the writes actually reach memory, this assert may or may not fail.
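>
> To make that concrete, here is a minimal sketch in D. It assumes the
> core.atomic and core.thread modules and their MemoryOrder names (an
> assumption of mine, not part of the example above), so treat it as an
> illustration rather than a prescription. With only "raw" (unordered)
> atomic accesses, nothing stops the reader from seeing B's new value
> without also seeing A's:
>
>     import core.atomic;
>     import core.thread;
>
>     shared int A = 0;
>     shared int B = 0;
>
>     void writer()
>     {
>         // Program order: A first, then B. Without a barrier the CPU
>         // may make these writes visible to other CPUs in either order.
>         atomicStore!(MemoryOrder.raw)(A, 1);
>         atomicStore!(MemoryOrder.raw)(B, 2);
>     }
>
>     void reader()
>     {
>         // On a sufficiently weakly ordered machine, this assert can fire:
>         // B's write may become visible before A's.
>         if (atomicLoad!(MemoryOrder.raw)(B) == 2)
>             assert(atomicLoad!(MemoryOrder.raw)(A) == 1);
>     }
>
>     void main()
>     {
>         auto t1 = new Thread(&writer);
>         auto t2 = new Thread(&reader);
>         t1.start(); t2.start();
>         t1.join();  t2.join();
>     }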
>
> Enter memory barriers. Memory barriers are simply a way for the
> programmer to tell the CPU "I don't care if your way is faster, A simply
> must be written to memory before B or thread 2 will explode." So the
> CPU behind thread 1 does as it's told at great expense to performance
> and thread 2 doesn't melt down.
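>
> In the sketch above, that barrier can be expressed as a release store
> paired with an acquire load (again borrowing core.atomic's MemoryOrder
> names; take the exact spelling as an assumption). The release store
> says that no earlier write may become visible after it, and the acquire
> load makes the reader honour that ordering:
>
>     // writer: publish A, then release-store B
>     void writer()
>     {
>         atomicStore!(MemoryOrder.raw)(A, 1);
>         atomicStore!(MemoryOrder.rel)(B, 2);  // no earlier write may pass this store
>     }
>
>     // reader: the acquire load pairs with the release store
>     void reader()
>     {
>         if (atomicLoad!(MemoryOrder.acq)(B) == 2)
>             assert(atomicLoad!(MemoryOrder.raw)(A) == 1);  // now guaranteed to hold
>     }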
>
> The sticky part is that hardware designers don't agree with one another
> on how things should work and they never take the advice of the software
> people, so all architectures have different sets of observable behavior
> and different methods for working around it when necessary. However,
> the common concept is that memory barriers all constrain the order in
> which memory accesses may occur with respect to each other. Think of it
> as an assertion that "X may not occur before Y" or "X may not occur
> after Y" at the instruction level.
>
> The x86 is actually a bit weird in this regard as it has no formal
> memory barriers for normal operations (though it has the SFENCE, LFENCE,
> and MFENCE instructions for SSE use). I think this is largely for
> historical reasons--multiprocessor x86 PCs were rare until fairly
> recently, so none of this mattered, and the memory model has always
> been fairly strict (it
> was actually sequential until not terribly long ago). Also, the LOCK
> prefix acts as a heavy-handed sort of memory barrier as well, so
> there has been little motivation to add new instructions for
> finer-grained control.
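>
> (To put the LOCK/FENCE point in code: the fragment below assumes D's
> core.atomic module and its atomicFence(); that choice is mine, not part
> of the x86 discussion above. A portable full barrier like this is
> typically lowered on x86 to MFENCE or to a LOCK-prefixed
> read-modify-write, which is exactly the heavy-handed behavior just
> described.)
>
>     import core.atomic;
>
>     shared int data = 0;
>     shared int flag = 0;
>
>     void publish()
>     {
>         atomicStore!(MemoryOrder.raw)(data, 42);
>         atomicFence();   // full barrier: on x86 typically MFENCE or a
>                          // LOCK-prefixed instruction
>         atomicStore!(MemoryOrder.raw)(flag, 1);
>     }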
>
>
> Sean
Cool; learn something new every day. Thanks for the informative post.
-- Daniel
--
Unlike Knuth, I have neither proven nor tried the above; it may not even
make sense.
v2sw5+8Yhw5ln4+5pr6OFPma8u6+7Lw4Tm6+7l6+7D
i28a2Xs3MSr2e4/6+7t4TNSMb6HTOp5en5g6RAHCP http://hackerkey.com/