GC, the simple solution
Bruno Medeiros
brunodomedeirosATgmail at SPAM.com
Wed Jun 14 10:14:44 PDT 2006
Sean Kelly wrote:
> Bruno Medeiros wrote:
>>
>> Ah, I see. I had recently read about out-of-order execution
>> problems in the Memory Barrier Wikipedia entry, and I had gotten the
>> impression (out of nowhere) that the hardware took care of that
>> transparently (i.e., without program intervention), but that is
>> not the case, or at least not always?
>
> A single CPU is allowed to do whatever it wants so long as it can fool
> the user into thinking it's executing instructions in a purely
> sequential manner. However, the only information other CPUs in the
> system have is what they observe on the system bus. Obviously, so long
> as CPUs aren't sharing data there aren't any problems. But things get
> sticky when this isn't the case. The memory model defines observable
> behavior allowed for a given architecture, as well as any methods
> offered for affecting that behavior.
>
> Say a specific architecture can operate much more quickly if it is
> allowed to perform writes to memory (from cache) in order of ascending
> address rather than in the order they were issued in code. There's no
> way to lie to other CPUs about the order in which writes occur and still
> have the optimization have any effect, so the designers state in the
> memory model spec that this architecture is allowed to reorder writes
> and then typically provide some means for overriding this behavior
> should it prove necessary.
>
> Now let's suppose you have two threads doing something like this:
>
> thread/CPU 1:
>
> A = 1;
> B = 2;
>
> thread/CPU 2:
>
> if( A == 1 )
> {
> assert( B == 2 );
> }
>
> Given the order in which A and B were declared, and therefore the order
> in which the writes occur, this assert may or may not fail.
>
> Enter memory barriers. Memory barriers are simply a way for the
> programmer to tell the CPU "I don't care if your way is faster, A simply
> must be written to memory before B or thread 2 will explode." So the
> CPU behind thread 1 does as it's told at great expense to performance
> and thread 2 doesn't melt down.
>
> The sticky part is that hardware designers don't agree with one another
> on how things should work and they never take the advice of the software
> people, so all architectures have different sets of observable behavior
> and different methods for working around it when necessary. However,
> the common concept is that memory barriers all constrain the order in
> which memory accesses may occur with respect to each other. Think of it
> as an assertion that "X may not occur before Y" or "X may not occur
> after Y" at the instruction level.
>
> The x86 is actually a bit weird in this regard as it has no formal
> memory barriers for normal operations (though it has the LFENCE, SFENCE,
> and MFENCE instructions for SSE use). I think this is largely for historical
> reasons--x86 PCs couldn't do SMP at all until fairly recently so none of
> this mattered, and the memory model has always been fairly strict (it
> was actually sequential until not terribly long ago). Also, the LOCK
> instruction acts as a heavy-handed sort of memory barrier as well, so
> there has been little motivation to add new instructions for
> finer-grained control.
>
>
> Sean
Nice post!
Makes me think: how does one keep up with this? I mean, someone who isn't
(and doesn't wish to be) a hardware expert, but wants to follow the
general developments in this area and maintain an overview of it.
--
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
More information about the Digitalmars-d mailing list