[dmd-concurrency] word tearing status in today's processors

Wed Jan 27 20:05:45 PST 2010

Yeah, I know older (non-x86?) processors did a read-modify-write (RMW) operation at the word level to accomplish a byte store.
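
To make the word-level RMW hazard concrete, here is a minimal single-threaded sketch in C++ that models it. The names (PendingByteStore, begin_byte_store, finish_byte_store, torn_interleaving) are made up for illustration, and the "CPU" is only a model: it splits a byte store into a word load and a later word store, then replays one unlucky interleaving of two writers by hand.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a CPU without byte stores: a byte write becomes
// a word load followed, some time later, by a word store.
struct PendingByteStore {
    std::uint32_t merged;  // the whole word with the new byte merged in
};

// Step 1: load the containing word and merge the new byte into a private copy.
// `index` counts bytes from the least significant end (0..3).
PendingByteStore begin_byte_store(const std::uint32_t& word, unsigned index,
                                  std::uint8_t value) {
    assert(index < 4);
    const unsigned shift = index * 8;
    const std::uint32_t mask = std::uint32_t{0xFF} << shift;
    return {(word & ~mask) | (std::uint32_t{value} << shift)};
}

// Step 2: write the whole word back, including three bytes we never meant to touch.
void finish_byte_store(std::uint32_t& word, PendingByteStore p) {
    word = p.merged;
}

// Two writers target different bytes of one word: A loads, B loads, A stores, B stores.
std::uint32_t torn_interleaving() {
    std::uint32_t word = 0;
    PendingByteStore a = begin_byte_store(word, 0, 0xAA);
    PendingByteStore b = begin_byte_store(word, 1, 0xBB);
    finish_byte_store(word, a);  // word is now 0x000000AA
    finish_byte_store(word, b);  // B writes back its stale copy: A's byte is lost
    return word;
}
```

With real byte stores the word would end up as 0x0000BBAA. In this model B's write-back overwrites A's byte, which is exactly the tearing being asked about.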

On Jan 27, 2010, at 1:51 PM, Brad Roberts wrote:

> I can't provide contradictory data, but be careful with the assumption that
> a byte assignment is atomic just because the CPU provides an instruction for it.
> 
> On Wed, 27 Jan 2010, Andrei Alexandrescu wrote:
> 
>> Thanks, Robert. This is very useful!
>> 
>> Andrei
>> 
>> Robert Jacques wrote:
>>> On Wed, 27 Jan 2010 10:10:49 -0500, Andrei Alexandrescu <andrei at erdani.com>
>>> wrote:
>>> 
>>>> Hello,
>>>> 
>>>> 
>>>> I'm looking for _hard data_ on how today's processors address word tearing. As
>>>> usual, googling for word tearing yields the usual mix of vague
>>>> information, folklore, and opinionated newsgroup discussions.
>>>> 
>>>> In particular:
>>>> 
>>>> a) Can we assume that all or most of today's processors are able to write
>>>> memory at byte level?
>>> 
>>> Not sure. Both x86 and ARM seem to have byte-store instructions.
>>> 
>>>> b) If not, is it reasonable to have the compiler insert for sub-word
>>>> shared assignments a call to a function that avoids word tearing by means
>>>> of a CAS loop?
>>> 
>>> Yes, in general, though on x86 xchg (not CAS) should be used instead.
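
For reference, the CAS-loop fallback from (b) could look roughly like the sketch below. It is written in C++ with std::atomic rather than as compiler-generated code, the helper name store_byte_cas is hypothetical, and it assumes the byte lives inside an aligned 32-bit word that every thread accesses atomically.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical helper: replace one byte of a shared 32-bit word without
// disturbing the other three. `index` counts bytes from the least
// significant end (0..3).
void store_byte_cas(std::atomic<std::uint32_t>& word, unsigned index,
                    std::uint8_t value) {
    assert(index < 4);
    const unsigned shift = index * 8;
    const std::uint32_t mask = std::uint32_t{0xFF} << shift;

    std::uint32_t expected = word.load(std::memory_order_relaxed);
    std::uint32_t desired;
    do {
        // Merge the new byte into the most recently observed word.
        desired = (expected & ~mask) | (std::uint32_t{value} << shift);
        // On failure compare_exchange_weak refreshes `expected` with the
        // current contents, so the next iteration merges into fresh data.
    } while (!word.compare_exchange_weak(expected, desired));
}
```

The weak form is used because it may fail spuriously on LL/SC machines and the loop retries anyway. A neighbouring byte changed by another thread between the load and the exchange simply causes one more iteration instead of being overwritten.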
>>> 
>>>> c) For 64-bit data (long and double), am I right in assuming that all
>>>> non-ancient Intel32 processors do offer a means to atomically assign
>>>> 64-bit data? (What are those asm instructions?) For processors that don't
>>>> (Intel or not), can we/should we guarantee at the language level that
>>>> 64-bit writes are atomic? We could effect that by using e.g. a federation
>>>> of hashed locks, or even (gasp!) two global locks, one for long and one
>>>> for double, and do something cleverer when public outrage puts our lives
>>>> in danger. Java guarantees atomic assignment for volatile data, but I'm
>>>> not sure what mechanisms implementations use.
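
The "federation of hashed locks" fallback from (c) might be sketched as follows in C++. Everything here is an assumption made for illustration: the table size, the hash (address shifted right by 3), and the names g_locks, lock_index, locked_store64 and locked_load64. It only gives atomicity if every access to the 64-bit location goes through these two functions.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <mutex>

// Hypothetical lock table; 64 entries is an arbitrary choice.
constexpr std::size_t kLockCount = 64;
std::array<std::mutex, kLockCount> g_locks;

// Map an address to one lock. Dropping the low 3 bits means every byte of
// an 8-byte-aligned object selects the same lock.
std::size_t lock_index(const void* p) {
    return (reinterpret_cast<std::uintptr_t>(p) >> 3) % kLockCount;
}

// Write all 64 bits while holding the lock that guards this address.
void locked_store64(std::uint64_t* target, std::uint64_t value) {
    std::lock_guard<std::mutex> guard(g_locks[lock_index(target)]);
    *target = value;
}

// Read all 64 bits under the same lock, so a reader never sees half a write.
std::uint64_t locked_load64(const std::uint64_t* source) {
    std::lock_guard<std::mutex> guard(g_locks[lock_index(source)]);
    return *source;
}
```

Unrelated addresses can hash to the same lock, which costs some contention but not correctness. Shrinking the table to one or two entries gives the "global locks" variant.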
>>> 
>>> The instruction you're looking for is CMPXCHG8B for 32-bit x86 CPUs. It's
>>> been around since the Pentium (the 486 added only the 32-bit CMPXCHG). Other
>>> CPUs generally use load-linked/store-conditional (LL/SC).
>>> From Wikipedia:
>>> All of Alpha, PowerPC, MIPS, and ARM have LL/SC instructions: ldl_l/stl_c
>>> and ldq_l/stq_c (Alpha), lwarx/stwcx (PowerPC), ll/sc (MIPS), and
>>> ldrex/strex (ARM version 6 and above).
>>> 
>>> Most platforms provide multiple sets of instructions for different data
>>> sizes, e.g. ldarx/stdcx for doubleword on the PowerPC.
>>> Some CPUs require the address being accessed exclusively to be configured in
>>> write-through mode.
>>> Some CPUs track the load-linked address at a cache-line or other
>>> granularity, such that any modification to any portion of the cache line
>>> (whether via another core's store-conditional or merely by an ordinary
>>> store) is sufficient to cause the store-conditional to fail.
>>> All of these platforms provide weak LL/SC. The PowerPC implementation is the
>>> strongest, allowing an LL/SC pair to wrap loads and even stores to other
>>> cache lines. This allows it to implement, for example, lock-free reference
>>> counting in the face of changing object graphs with arbitrary counter reuse
>>> (which otherwise requires DCAS).
>>> 
>>> And from an ARM website (STREXD is 64-bit):
>>> ARM LDREX and STREX are available in ARMv6 and above.
>>> ARM LDREXB, LDREXH, LDREXD, STREXB, STREXD, and STREXH are available in
>>> ARMv6K and above.
>>> All these 32-bit Thumb instructions are available in ARMv6T2 and above,
>>> except that LDREXD and STREXD are not available in the ARMv7-M profile.
>>> 
>>> ARM has also had a swap-byte instruction since v4, which may or may not be
>>> equivalent to LDREXB/STREXB.
>>> 
>>> So I think it's safe to say that 64-bit writes will be efficient on most
>>> CPUs out there and making a language level guarantee is okay.
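
From the programmer's side, such a language-level guarantee would look roughly like what C++ exposes through std::atomic. The sketch below is only an analogy, not the proposed D design. The names g_shared, publish, observe and stores_are_lock_free are made up, and which instruction the compiler emits (CMPXCHG8B, an LL/SC pair, a plain 64-bit store, or a library lock) depends on the target.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical shared 64-bit variable.
std::atomic<std::uint64_t> g_shared{0};

// A reader sees either the old or the new value, never a mix of halves.
void publish(std::uint64_t v) {
    g_shared.store(v, std::memory_order_seq_cst);
}

std::uint64_t observe() {
    return g_shared.load(std::memory_order_seq_cst);
}

// Reports whether this target implements the operations with atomic
// instructions instead of falling back to a lock.
bool stores_are_lock_free() {
    return g_shared.is_lock_free();
}
```

Checking stores_are_lock_free() on a given target is a quick way to see whether the guarantee comes cheaply there or needs a lock-based fallback.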
>>> 
>>> Warning: most of this came from some quick Google searches, so I don't know
>>> if there are other gotchas out there.
>>> 
>>>> 
>>>> Thanks,
>>>> 
>>>> Andrei
>>>> _______________________________________________
>>>> dmd-concurrency mailing list
>>>> dmd-concurrency at puremagic.com
>>>> http://lists.puremagic.com/mailman/listinfo/dmd-concurrency