Wish: Variable Not Used Warning
Markus Koskimies
markus at reaaliaika.net
Fri Jul 11 13:36:20 PDT 2008
On Fri, 11 Jul 2008 17:16:54 +0000, BCS wrote:
> Reply to Markus,
>
>> For decades, PC processor manufacturers have optimized their processors
>> for software, not the other way around. That is why processors execute
>> function calls so quickly, and it is the sole reason for having caches
>> (the regular locality of software, e.g. the IBM study from the '60s).
>>
>>
> I hope I'm reading you wrong, but if I'm not: the whole point of the talk
> is that CPUs can't get better performance by optimizing them more. If the
> code isn't written well (the code isn't optimized for the CPU),
> performance will not improve... ever.
They will get better, and that is going to affect your software. IMO you
should not write your software for the CPU; instead, you should follow
certain paradigms.
Let me explain this at length. Current processors are fundamentally based
on the RASP model (a random-access stored-program machine), which is an
example of the so-called von Neumann architecture. When physically
realized, this architecture offers a very flexible and dense computing
platform, since it is constructed from two specialized parts - memory and
CPU. Its drawback is the so-called von Neumann bottleneck, which has
irritated both processor and software designers for decades.
---
Processor fabrication technology sets limits on how fast a processor can
execute instructions. Early processors always fetched instructions from
main memory (causing, of course, lots of external bus activity), and they
processed one instruction at a time.
Since fabrication technology improves quite slowly, there has always been
interest in "alternative" solutions that could give performance benefits
on the current technology. These improvements include, for example:
- Pipelining
- Superscalar architectures
- Out-of-order (OoO) execution
- Threaded processors
- Multi-core processors
- etc.
The more transistors you can put on silicon, the more you can try to find
performance benefits in concurrency. Pipelining and OoO execution have had
a major impact on compiler technology; in the early days code generation
was relatively easy, but nowadays, to get the best possible performance,
you really need to know the internals of the processor. When writing code
in C or D, you have very few ways to make your software utilize pipelines
and OoO execution yourself - if the compiler does not do it, your program
will not do it.
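To make that concrete, here is a minimal C sketch (the function names and
the unrolling factor are my own, nothing from the talk): a single
accumulator forms one long dependency chain, while independent
accumulators give a superscalar/OoO core something to overlap. Note that
with floating point a compiler may not do this reassociation for you,
since it changes rounding.

#include <stddef.h>

/* One accumulator: every add depends on the previous one, so the
   dependency chain limits how much the pipeline can overlap. */
double sum_serial(const double *a, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Four independent accumulators: the adds do not depend on each
   other, so an OoO/superscalar core can execute them in parallel.
   Assumes n is a multiple of 4, for brevity. */
double sum_unrolled(const double *a, size_t n)
{
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    for (size_t i = 0; i < n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }
    return s0 + s1 + s2 + s3;
}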
But at the same time, there have been attempts to make processors
compiler-friendly; since high-level languages use certain instructions and
patterns heavily, processors try to be good at them. If you look at the
evolution of processors and compare it to the evolution of software
design, you will see the impact of the change from BASIC/assembler
programming to compiled HLLs, from procedural languages to OO languages,
and from there to threaded architectures.
In the BASIC/assembler era, the processor's machine language was intended
for humans; that was the era of CISC-style processors. Compilers do not
need human-readable machine code, and when compiled languages came into
use, RISC processors rose. Procedural languages used lots of calls, so
processors were optimized to call functions quickly. OO introduced
intensive referencing of data via pointers (compared to the data segments
of procedural languages), so processors were optimized to access memory
efficiently through pointers.
How does caching relate to this? A complex memory hierarchy (and in fact
the pipelines and OoO execution, too) is not a desirable or intentional
thing; it is a symptom of the RASP model. It has been introduced only
because it can give performance benefits to software, and the key word
here is locality.
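The standard textbook illustration of locality in C (the names and the
size are mine): C stores 2D arrays row-major, so the traversal order
alone decides whether each fetched cache line is fully used or wasted.

#include <stddef.h>

#define N 1024

/* Row-major traversal: consecutive accesses touch consecutive
   addresses, so every byte of a fetched cache line gets used. */
double sum_rows(double m[N][N])
{
    double s = 0.0;
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < N; j++)
            s += m[i][j];
    return s;
}

/* Column-major traversal of the same data: successive accesses are
   N * sizeof(double) bytes apart, so once the matrix is larger than
   the cache, nearly every access misses. */
double sum_cols(double m[N][N])
{
    double s = 0.0;
    for (size_t j = 0; j < N; j++)
        for (size_t i = 0; i < N; i++)
            s += m[i][j];
    return s;
}

Both functions do exactly the same arithmetic; only the access pattern
differs, and on a typical machine that alone can change the running time
several times over.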
Locality - and its natural consequence, distribution - is in fact one of
the keywords of the forthcoming processor models. The next major step in
processor architectures is very likely reconfigurable platforms, and
fully utilizing them will introduce a whole new set of challenges to
compilers and software. Refer to the PlayStation Cell compiler to get the
idea.
At the code level, you really can't design your software to be
"reconfigurable-friendly". The best thing is to just keep the code clear,
and hope that the compilers get the idea and produce good results.
At the software architecture level, if you are using threads, try to keep
everything local. The importance of that is only getting higher.
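One concrete instance of "keep everything local" is avoiding false
sharing. A sketch using POSIX threads (the 64-byte line size and the
counts are assumptions of mine; line size varies by processor): if
per-thread counters sit on the same cache line, every increment bounces
that line between the cores, even though no data is logically shared.

#include <pthread.h>
#include <stdio.h>

#define NTHREADS 4

/* Pad each counter so that two counters are never closer than one
   cache line (64 bytes assumed); without the padding the threads
   would fight over the same line ("false sharing"). */
struct padded_counter {
    long value;
    char pad[64 - sizeof(long)];
};

static struct padded_counter counters[NTHREADS];

static void *worker(void *arg)
{
    long id = (long)arg;
    for (long i = 0; i < 10000000; i++)
        counters[id].value++;   /* purely thread-local writes */
    return NULL;
}

int main(void)
{
    pthread_t t[NTHREADS];
    for (long i = 0; i < NTHREADS; i++)
        pthread_create(&t[i], NULL, worker, (void *)i);
    for (long i = 0; i < NTHREADS; i++)
        pthread_join(t[i], NULL);

    long total = 0;
    for (int i = 0; i < NTHREADS; i++)
        total += counters[i].value;
    printf("total = %ld\n", total);
    return 0;
}

(Compile with e.g. gcc -pthread.)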
>>> Are you suggesting that it's not
>>> something programmers should be aware of?
>> Yes, I am.
>>
>>
> How can you say that? Expecting the tool chain to deal with cache
> effects would be like expecting it to convert a bubble sort into qsort.
Does the description above answer this question? In case it does not,
I'll explain: in general software, don't mess with the cache. Instead,
strive for locality and distribution. Use the threading libraries, and
when possible, do the interactions between threads in some standard way.
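By "standard way" I mean something like the following POSIX threads
sketch (the numbers and names are mine): each thread accumulates into a
private variable (locality) and touches the shared state exactly once,
through a mutex, instead of relying on any assumptions about how the
caches behave.

#include <pthread.h>
#include <stdio.h>

static long long total = 0;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg)
{
    (void)arg;

    /* Work on a private accumulator; no sharing, no contention. */
    long long local = 0;
    for (long long i = 0; i < 1000000; i++)
        local += i;

    /* Interact with shared state once, via a standard primitive. */
    pthread_mutex_lock(&lock);
    total += local;
    pthread_mutex_unlock(&lock);
    return NULL;
}

int main(void)
{
    pthread_t t[4];
    for (int i = 0; i < 4; i++)
        pthread_create(&t[i], NULL, worker, NULL);
    for (int i = 0; i < 4; i++)
        pthread_join(t[i], NULL);
    printf("total = %lld\n", total);
    return 0;
}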
If you're writing lower-level code, like a threading library or a
hardware driver, you will probably need to know about caching. That is a
totally different story, since writing a hardware driver in particular
introduces many more things to take into account besides the caches.