Testing some singleton implementations

Marco Leise Marco.Leise at gmx.de
Fri Feb 7 17:15:45 PST 2014


On Fri, 7 Feb 2014 18:42:29 +0000,
Iain Buclaw <ibuclaw at gdcproject.org> wrote:

> On 7 Feb 2014 15:45, "Sean Kelly" <sean at invisibleduck.org> wrote:
> >
> > On Friday, 7 February 2014 at 11:17:49 UTC, Stanislav Blinov wrote:
> >>
> >> On Friday, 7 February 2014 at 08:10:58 UTC, Sean Kelly wrote:
> >>>
> >>> Weird.  atomicLoad(raw) should be the same as atomicLoad(acq), and
> atomicStore(raw) should be the same as atomicStore(rel).  At least on x86.
>  I don't know why that change made a difference in performance.
> >>
> >>
> >> huh?
> >>
> >> --8<-- core/atomic.d
> >>
> >>         template needsLoadBarrier( MemoryOrder ms )
> >>         {
> >>             enum bool needsLoadBarrier = ms != MemoryOrder.raw;
> >>         }
> >>
> >> -->8--
> >>
> >> Didn't you write this? :)
> >
> >
> > Oops.  I thought that since Intel has officially defined loads as having
> acquire semantics, I had eliminated the barrier requirement there.  But I
> guess not.  I suppose it's an issue worth discussing.  Does anyone know
> offhand what C++0x implementations do for load acquires on x86?
> 
> Speaking of which, I need to add 'Update gcc.atomics to use new C++0x
> intrinsics' to the GDCProjects page - they map closely to what core.atomic
> is doing, and should see better performance compared to the __sync
> intrinsics.  :)
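
Regarding the raw-vs-acq question above, here is a minimal
sketch of my own (not from druntime) of what the two orderings
look like from user code. On x86 both the rel store and the
acq load should compile to plain movs in principle, which is
why the extra barrier behind needsLoadBarrier is surprising:

--8<-- example.d

        import core.atomic;

        shared int value;

        void example()
        {
            // Release store: on x86 an ordinary store already
            // has release semantics, so no fence is required.
            atomicStore!(MemoryOrder.rel)(value, 42);

            // Acquire load: on x86 an ordinary load already
            // has acquire semantics, so in principle no
            // barrier is needed here either...
            int a = atomicLoad!(MemoryOrder.acq)(value);

            // ...but needsLoadBarrier is true for any order
            // other than raw, so only the raw load skips it.
            int b = atomicLoad!(MemoryOrder.raw)(value);
        }

-->8--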

You pass shared variables to the backend as "volatile", and
that is correct. Since that should enforce a strong ordering
of memory operations (correct?), I wonder whether DMD does
something similar, or whether D's "shared" isn't really shared
at all and relies entirely on the correct use of
atomicLoad/atomicStore and atomicFence. In the latter case,
would the GCC backend be able to optimize more around shared
variables (by not treating them as volatile) and still be no
worse off than DMD?

-- 
Marco


