[dmd-internals] Regarding deprecation of volatile statements
sean at invisibleduck.org
Wed Aug 8 11:46:43 PDT 2012
On Aug 7, 2012, at 12:19 PM, deadal nix <deadalnix at gmail.com> wrote:
> 2012/8/4 Sean Kelly <sean at invisibleduck.org>
> On Aug 1, 2012, at 10:25 AM, Walter Bright <walter at digitalmars.com> wrote:
> > To reiterate, this is why I need to know what problem you are trying to address, rather than going at it from the solution point of view.
> I think the original request was for there to be some way to prevent compiler optimization of certain plain old loads/stores:
> On Jul 23, 2012, at 2:28 PM, Alex Rønne Petersen <xtzgzorex at gmail.com> wrote:
> > And further: How are people *really* supposed to prevent compiler
> > reordering in modern D2 programs (without using atomics; they are
> > expensive and wasteful for this)?
> This can be useful for tuning concurrent algorithms to avoid unnecessary synchronized operations and also for the occasional store where the ordering isn't important so much as that it simply be issued at all. Using DMD, my suggestion would be to use atomicStore!msync.raw, which performs a plain old store in asm and uses the fact that DMD doesn't optimize across asm blocks to make the operation behave in the desired manner. But I believe GDC and LDC may both optimize more aggressively with respect to asm code and so this assumption doesn't hold universally. Personally, if I could be guaranteed that at least specific asm blocks would be treated as volatile by the compiler in that there's no code movement across them, etc., then that would probably be enough.
> shared is supposed to guarantee that. volatile is useless in regard to concurrency.
Are you saying that if a compiler optimizes into or across asm blocks then it should be smart enough to not do so for asm blocks that operate on shared variables? That certainly seems reasonable to me. I don't suppose someone with LLVM or GCC back-end experience can comment on what either compiler does regarding inline assembler? Does either compiler even optimize in the way I've described? It's been a while since I researched this, and I can't remember which compilers do this sort of thing.
Regarding volatile… could you describe how the requirements for device-mapped memory accesses might be different from shared-memory accesses?
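
For reference, the raw store suggested above might look roughly like the sketch below. The flag `ready` and the functions `publish` and `poll` are hypothetical names used only for illustration. The memory order is written as `MemoryOrder.raw`, the spelling in current druntime's core.atomic, where the quoted text writes `msync.raw`. Whether a given compiler keeps surrounding code from moving across the operation is exactly the open question in this thread, so the sketch shows only that the store is issued and can be read back, not any ordering guarantee.

```d
import core.atomic;

// Hypothetical flag, for illustration only.
shared int ready;

void publish()
{
    // Raw (unordered) store: no fence is requested. The intent is only
    // that the store is actually emitted rather than optimized away.
    atomicStore!(MemoryOrder.raw)(ready, 1);
}

int poll()
{
    // Matching raw load: reads the current value with no ordering constraint.
    return atomicLoad!(MemoryOrder.raw)(ready);
}

void main()
{
    publish();
    assert(poll() == 1);
}
```

Because `ready` is declared shared, plain reads and writes to it are also subject to whatever guarantees the compiler attaches to shared, which is the point raised in the quoted reply.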