Andrei Alexandrescu needs to read this
Ola Fosheim Grøstad
ola.fosheim.grostad at gmail.com
Thu Oct 31 11:54:27 UTC 2019
On Sunday, 27 October 2019 at 20:11:38 UTC, Mark wrote:
> Would it be reasonable to say that modern CPUs basically do JIT
> compilation of assembly instructions?
Old CISC CPUs did just that, so they could have high level
instructions, and were reprogrammable... (I guess x86 also has
that feature, at least to some extent).
Then RISC CPUs came in the 90s and didn't do that, thus they were
faster and more compact as they could throw out the decoder (the
bits in the instructions were carefully designed so that the
decoding was instantaneous). But then memory bandwidth became an
issue and developers started to write more and more bloated
software...
x86 is an old CISC architecture and simply survives because of
market dominance and R&D investments. Also, with increased real
estate (more transistors) they can sacrifice lots of space for
the instruction decoding...
The major change over the past 40 years that is causing
sensitivity to instruction ordering is that modern CPUs can have
deep pipelines (executing many instructions at the same time in a
long staging queue), that they are superscalar (execute
instructions in parallel), execute instructions speculatively
(execute instructions even though the result might be discarded
later), do tight-loop instruction unrolling before pipelining,
and have various schemes for branch prediction (so that they
execute the right sequence after a branch before they know what
the branch-condition looks like).
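
As a rough illustration of that sensitivity to ordering, here is a minimal C sketch. The function names are hypothetical and not from any library. Both functions compute the same sum, but the first forms one long dependency chain, while the second keeps four independent chains that a superscalar core can in principle overlap. Whether it is actually faster depends on the compiler (which may unroll or vectorize either version on its own) and on the CPU.

```c
#include <stddef.h>
#include <stdint.h>

/* One accumulator: every addition depends on the previous one,
   so the adds form a single serial dependency chain. */
uint64_t sum_chained(const uint32_t *a, size_t n) {
    uint64_t s = 0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Four accumulators: four independent dependency chains that
   can be in flight at the same time. */
uint64_t sum_split(const uint32_t *a, size_t n) {
    uint64_t s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }
    /* Handle the last n % 4 elements. */
    for (; i < n; i++)
        s0 += a[i];
    return s0 + s1 + s2 + s3;
}
```

The result is identical either way; only the order in which the work is presented to the CPU changes.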
Is this a good approach? Probably not... You would get much
better performance from the same number of transistors by using
many simple cores and a clever memory architecture, but that
would not work with current software and development practice...
> branch predictor and so on. If so, you could argue that the
> Itanium was an attempt to avoid this "runtime" and transfer all
> these responsibilities to the compiler and/or programmer. Not a
> very successful one, apparently.
VLIW is not a bad concept, re RISC, but perhaps not profitable in
terms of R&D.
You probably could get better scheduling of instructions if the
schedule was determined statically, as the compiler would
then have a "perfect model" of how much of the CPU is being
utilized, and could give programmers feedback on it too. But
then you would need to recompile software to the actual CPU and
have more advanced compilers, and perhaps write software in a
different manner to avoid bad branching patterns.
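
A minimal sketch of what "avoiding bad branching patterns" can look like in C, assuming a simple counting loop (the function names are hypothetical). The first version has a data-dependent branch that is hard to predict on random input; the second adds the comparison result directly, so there is no branch on the data. A good optimizer may well turn both into the same machine code, so treat this as an illustration of the idea rather than a guaranteed speedup.

```c
#include <stddef.h>

/* Branchy: the CPU has to guess the outcome of each comparison. */
size_t count_above_branchy(const int *a, size_t n, int limit) {
    size_t c = 0;
    for (size_t i = 0; i < n; i++) {
        if (a[i] > limit)
            c++;
    }
    return c;
}

/* Branch-free form: in C a comparison yields 0 or 1,
   which is added to the counter unconditionally. */
size_t count_above_branchless(const int *a, size_t n, int limit) {
    size_t c = 0;
    for (size_t i = 0; i < n; i++)
        c += (size_t)(a[i] > limit);
    return c;
}
```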
Existing software code bases and a developer culture that is
resilient to change do limit progress...
People pay to have their existing stuff run well; they won't
pay if they have to write new stuff in new ways, unless the
benefits are extreme (e.g. GPUs)...