Is core.internal.atomic.atomicFetchAdd implementation really lock free?

claptrap clap at trap.com
Sun Dec 4 10:15:48 UTC 2022


On Sunday, 4 December 2022 at 01:51:56 UTC, max haughton wrote:
> On Saturday, 3 December 2022 at 21:05:51 UTC, claptrap wrote:
>> On Saturday, 3 December 2022 at 20:18:07 UTC, max haughton 
>> wrote:
>>> On Saturday, 3 December 2022 at 13:05:44 UTC, claptrap wrote:
>>>> On Saturday, 3 December 2022 at 03:42:01 UTC, max haughton 
>>>> wrote:
>>>>> On Wednesday, 30 November 2022 at 00:35:55 UTC, claptrap
>>>>
>>>> "Atomically adds mod to the value referenced by val and 
>>>> returns the value val held previously. This operation is 
>>>> both lock-free and atomic."
>>>>
>>>> https://dlang.org/library/core/atomic/atomic_fetch_add.html
>>>
>>> If it provides the same memory ordering guarantees does it 
>>> matter (if we ignore performance for a second)? There are 
>>> situations where you do (for reasons beyond performance) 
>>> actually need an efficient (no overhead) atomic operation in 
>>> lock-free coding, but these are really on the edge of what 
>>> can be considered guaranteed by any specification.
>>
>> It matters because the whole point of atomic operations is 
>> lock free coding. There is no oh you might need atomic for 
>> lock free coding, you literally have to have them. If they 
>> fall back on a mutex it's not lock free anymore.
>>
>> memory ordering is a somewhat orthogonal issue from atomic ops.
>
> Memory ordering is literally why modern atomic operations 
> exist. That's why there's a lock prefix on the instruction in 
> X86 - it doesn't just say "do this in one go" it says "do this 
> in one go *and* maintain this memory ordering for the other 
> threads".

No it's not; literally go read the Intel SDM. The lock prefix is 
for atomic operations, that's its intent. That it also carries 
extra memory ordering guarantees between cores is simply because 
it would be useless if it didn't.

The mfence/sfence/lfence instructions are for memory ordering.
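Not from the original post, but the separation is easy to see in code. A minimal C++ sketch (the C++ and D atomics models are close): a fetch_add with memory_order_relaxed asks for atomicity and nothing else, yet no increments are ever lost. Ordering, when you want it, is requested separately (memory_order_acquire/release, or a standalone fence). The function name run_counter is made up for illustration.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Increment a shared counter from several threads using only a
// relaxed atomic fetch_add: the read-modify-write is atomic, but
// it makes no ordering promises about surrounding loads/stores.
long run_counter(int n_threads, int iters) {
    std::atomic<long> counter{0};
    std::vector<std::thread> threads;
    for (int t = 0; t < n_threads; ++t)
        threads.emplace_back([&] {
            for (int i = 0; i < iters; ++i)
                counter.fetch_add(1, std::memory_order_relaxed);
        });
    for (auto& th : threads)
        th.join();
    // Despite "relaxed", atomicity alone guarantees the exact total:
    // every increment is observed exactly once.
    return counter.load();
}
```

On x86 this still compiles to a lock xadd, because the lock prefix is how you get the atomic RMW at all; the relaxed ordering just tells the compiler it may not need extra fences elsewhere.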

If you still don't believe me, consider this: x86 has always had 
strong memory ordering, so on a single core there's no need to 
worry about reads from a given location being moved ahead of 
writes to the same location. That only goes out of the window 
when you have a multicore x86. The lock prefix existed before x86 
CPUs had multiple cores, i.e. locked instructions are for 
atomics; the memory ordering guarantees already existed for all 
reads/writes on single cores. The extra cross-core ordering 
guarantees were added when x86 went multicore, because lock-free 
algorithms would not work if the atomic reads/writes were not 
also ordered between cores.

To say atomics are about memory ordering is like saying cars are 
for parking. Yes, you have to park them somewhere, but that's not 
the problem they're meant to solve.
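For what it's worth, the point made earlier in the thread — that a mutex fallback would still be atomic but no longer lock-free — can be sketched too. This is a hypothetical implementation, not what core.atomic actually does; locked_fetch_add and g_lock are invented names:

```cpp
#include <mutex>

// Hypothetical non-lock-free fallback for fetchAdd. It is still
// atomic (callers never see a torn update), but it is NOT lock-free:
// if the thread holding g_lock is preempted, every other thread's
// "atomic" add blocks behind it.
static std::mutex g_lock;

int locked_fetch_add(int& val, int mod) {
    std::lock_guard<std::mutex> guard(g_lock);
    int old = val;   // value held previously, as the docs describe
    val += mod;
    return old;
}
```

A real lock-free fetchAdd compiles to a single lock xadd instruction, with no path on which one thread can block the progress of all the others.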


More information about the Digitalmars-d mailing list