tooling quality and some random rant

Tue Feb 15 03:21:44 PST 2011

Walter:

>Huh, I simply could never find a document about how to use those which gave me any comfortable sense that the author knew what he was talking about.<

http://www.agner.org/optimize/

------------------

Don:

>A problem with that, is that the prefetching instructions are vendor-specific.<

Right. Then I suggest some higher-level annotations (pragmas?) that the programmer can use to state more precisely the temporal semantics of memory accesses in a performance-critical part of D code.
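To give an idea of what a vendor-neutral hint can look like, here is a minimal sketch in C (not D), assuming a GCC or Clang compiler. __builtin_prefetch is an extension of those compilers, not standard C, and the compiler maps it to whatever prefetch instruction the target CPU has. The function name and the PREFETCH_DISTANCE value are made up for this illustration, and I have not measured whether the hint helps on any given CPU.

```c
#include <stddef.h>

/* Hypothetical tuning knob: how many elements ahead to request.
   The right value depends on the CPU and on the loop body; 16 is a guess. */
#define PREFETCH_DISTANCE 16

/* Sums an array while hinting that data a few iterations ahead will be
   read soon. __builtin_prefetch is a GCC/Clang extension:
   second argument 0 = prefetch for reading,
   third argument 0 = low temporal locality (used once, no need to keep
   the data in cache afterwards). */
long sum_streaming(const int *data, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_DISTANCE < n)
            __builtin_prefetch(&data[i + PREFETCH_DISTANCE], 0, 0);
        total += data[i];
    }
    return total;
}
```

The hint never changes the result, only (maybe) the timing, so the function returns the same sum with or without the prefetch line.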

>Also, it's quite difficult to use them correctly. If you put them in the wrong place, or use them too much, they slow your code down.<

CPU caches have a simple purpose. The speed of light is finite (how far does light travel in vacuum or in doped silicon during one clock cycle of a 5 GHz POWER6 CPU? http://en.wikipedia.org/wiki/POWER6 ), and finding one thing among many is slower than finding it among a few. So you speed up your memory accesses if you read information from a smaller group of data located closer to you. Most CPUs don't have a small, fast memory that you manage yourself (http://en.wikipedia.org/wiki/Scratchpad_RAM ); the CPU copies data between the cache levels and RAM by itself, so on such CPUs the illusion of a flat memory exists at the hardware level, not just at the C language level. Caches manage their memory in a few different ways, and bigger CPUs often offer special instructions to alter that behaviour a little. The main differences are in how they keep coherence across the caches of different cores, and in which situations they write data back from the cache to RAM.
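The question about light and the clock cycle can be answered with a one-line computation. This is a small sketch in C; the function name and the velocity_factor parameter are mine. A factor of 1.0 means vacuum, and any smaller factor you pass for on-chip signals is your own assumption, since I give no measured value for silicon here.

```c
/* Distance in metres that a signal covers during one clock cycle.
   clock_hz:        clock frequency in Hz (5e9 for a 5 GHz CPU)
   velocity_factor: signal speed as a fraction of c; 1.0 = light in
                    vacuum, smaller values are an assumption for wires */
double metres_per_cycle(double clock_hz, double velocity_factor)
{
    const double c = 299792458.0; /* speed of light in vacuum, m/s */
    return c * velocity_factor / clock_hz;
}
```

metres_per_cycle(5e9, 1.0) gives about 0.06, so even in vacuum light covers only about 6 cm per cycle at 5 GHz, and real signals on a chip are slower than that.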

In some parts of your program you want to read from an array and store data back into it and into a second array too, but you never want to store data far from the current position in the first one. There are a few other common patterns of memory usage. In theory a normal language like Fortran is enough to specify which memory you want to read or write and when you want to do it. In practice today's compilers are not very good at inferring such semantics, so some high-level annotations probably help. In the future compilers may get better and ignore those annotations, just like they often ignore "register" annotations. System-level programming languages are practical things, so adding annotations is not bad, even if 5-10 years later those annotations become less useful.
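As an analogy only, C99 already has one annotation of this kind: "restrict". It says nothing about caches or temporal behaviour, it only promises the compiler that two pointers don't overlap, but it shows how a small promise from the programmer replaces an inference the compiler can't make alone. The sketch below uses the access pattern described above (read the first array, write back in place, write a copy to a second array); the function name is made up.

```c
#include <stddef.h>

/* Reads each element of a, writes the scaled value back in place and
   also stores it in b. The accesses only move forward, one element at
   a time. "restrict" (C99) is the programmer's promise that a and b do
   not overlap; calling this with overlapping arrays is undefined. */
void scale_and_copy(float *restrict a, float *restrict b, size_t n, float k)
{
    for (size_t i = 0; i < n; i++) {
        float v = a[i] * k;
        a[i] = v; /* store back into the first array */
        b[i] = v; /* and into the second one too */
    }
}
```

A compiler that can prove the same fact by itself is free to ignore the keyword, which is the fate I expect for this kind of annotation in general.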

Bye,
bearophile