shared methods

Fri Jan 24 01:51:27 PST 2014

On Thu, 23 Jan 2014 19:20:34 +0000,
"Kagamin" <spam at here.lot> wrote:

> A minor caveat is GDC treats shared as volatile, which can be not 
> optimal. Even worse, on some platforms like itanium volatile 
> generates fences.

Do you have more information on that issue?
There are claims that Itanium "automatically generates fences", but
that statement is imprecise and I can't find a detailed description.

Itanium has special instructions which refresh the caches before
reading/writing. But the interesting question is this: Does GCC use
these instructions if you access a variable marked as volatile?

If it does, that's really a performance problem. AFAICS we need a
'volatile' in D for embedded programming which just does the following:

* Never remove any store to or load from that variable (this includes
  hoisting accesses out of loops)
* Avoid instruction reordering between accesses to volatile variables:
  volatile int a, c;
  int b;
  a = 0;
  b = 0;
  c = 0;
  The store to b can be moved anywhere, but the store to c must always
  come after the store to a
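
A minimal C sketch of these two rules, since C's volatile is what GCC
implements. The function names init and count_up are made up for
illustration. As far as the language rules go, this only constrains
what the compiler emits; nothing here asks for a CPU-level fence.

```c
/* Two volatile objects and one ordinary one, as in the list above. */
volatile int a, c;
int b;

void init(void)
{
    a = 0;  /* volatile store: must be emitted */
    b = 0;  /* ordinary store: the compiler may move or merge it */
    c = 0;  /* volatile store: must be emitted after the store to a */
}

/* Each iteration performs a volatile store, so the compiler may not
   collapse the loop into a single final store to a. */
void count_up(int n)
{
    for (int i = 0; i < n; i++)
        a = i;
}
```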

Here volatile should only affect the assembly generated by the
compiler. It shouldn't care at all what the CPU finally does; that's
the programmer's job.

In some cases the compiler can't even know. For example, some
memory regions on ARM are special and the processor won't reorder
accesses to them anyway, so a 'smart compiler' adding memory
barriers would actually only harm performance.
http://infocenter.arm.com/help/topic/com.arm.doc.dui0552a/CIHGEIID.html

Such a 'volatile' could be integrated with shared, as it should not
cause any major performance problems. However, consider this
code:

shared int X;
synchronized(x)
{
    for(int i=0; i< 100; i++)
        X = i;
}

Here it's actually valid to optimize the loop, so 'volatile' behavior
is not wanted. But what if we used atomic ops?

shared int X;

for(int i=0; i< 100; i++)
    atomicStore(X, i);

Now the loop can't be optimized away. Is the compiler smart enough to
figure out that an atomic op is used and it can't optimize this? Or
would we have to mark this volatile?
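
For comparison, the same loop in C11 terms, as a sketch only. The names
store_atomic and store_volatile_atomic are hypothetical. C11 allows an
object to be both atomic and volatile, which is one way to state
explicitly that every store must be emitted. Whether a compiler would
ever merge the plain atomic stores is exactly the open question above.

```c
#include <stdatomic.h>

atomic_int X;            /* atomic, but not volatile */
volatile atomic_int Xv;  /* atomic and volatile */

/* 100 sequentially consistent atomic stores to a non-volatile object. */
void store_atomic(void)
{
    for (int i = 0; i < 100; i++)
        atomic_store(&X, i);
}

/* The same loop on a volatile atomic: each store is also a volatile
   access, so the compiler has to emit all of them. */
void store_volatile_atomic(void)
{
    for (int i = 0; i < 100; i++)
        atomic_store(&Xv, i);
}
```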