Multicores and Publication Safety

Sean Kelly sean at invisibleduck.org
Mon Aug 4 22:20:09 PDT 2008


Brad Roberts wrote:
> Jb wrote:
>> "Walter Bright" <newshound1 at digitalmars.com> wrote in message 
>> news:g7855a$2sd3$1 at digitalmars.com...
>>> "What memory fences are useful for on multiprocessors; and why you should 
>>> care, even if you're not an assembly programmer."
>>>
>>> http://bartoszmilewski.wordpress.com/2008/08/04/multicores-and-publication-safety/
>>>
>>> http://www.reddit.com/comments/6uuqc/multicores_and_publication_safety/
>> None of that is relevant on x86 as far as I understand. I could only find 
>> the one regarding x86-64, but as far as I know it's the same on x86-32.
>>
>> http://www.intel.com/products/processor/manuals/318147.pdf
>>
>> The key point being loads are not reordered with other loads, and stores are 
>> not reordered with other stores.
>>
> 
> Pay very close attention to sections 2.3 and 2.4 of that document.

2.4 is the most interesting aspect of PC.  It means that you can run 
into situations like this:

     // Thread A
     x = 1;

     // Thread B
     if( x == 1 )
         y = 1;

     // Thread C
     if( y == 1 )
         assert( x == 1 ); // may fail

Alex Terekhov came up with a sneaky solution for this based on how the 
IA-32 spec says CAS is currently implemented:

     // Thread A
     x = 1;

     // Thread B
     t = CAS( x, 0, 0 );
     if( t == 1 )
         y = 1;

     // Thread C
     if( y == 1 )
         assert( x == 1 ); // true

In essence, Intel currently implements CAS by either storing the new 
value /or/ re-storing the old value based on the result of the 
comparison, and because all stores from a single processor are ordered, 
Thread C is therefore guaranteed to see the store to x before the store 
to y.

As cool as I find the above solution, however, I do hope that this helps 
to demonstrate the complexity of lock-free programming.  It also shows 
just how complex analysis of this stuff is.  Even with the full source 
code available it would take some doing for a compiler to recognize a 
problem similar to the above.


Sean



More information about the Digitalmars-d mailing list