Ready for review: new std.uni
David Nadlinger
see at klickverbot.at
Fri Jan 11 21:17:13 PST 2013
On Friday, 11 January 2013 at 20:57:57 UTC, Dmitry Olshansky
wrote:
> You can print total counts after each bench, there is a TLS
> variable written at the end of it. But anyway I like your
> numbers! :)
Okay, I couldn't resist having a short look at the results,
specifically the benchmark of the new isSymbol implementation,
where LDC beats DMD by roughly 10x. The reason for the nice
performance results is mainly that LDC optimizes the classifyCall
loop containing the trie lookup down to the following fairly
optimal piece of code (eax is the overall counter that gets
stored to lastCount):
---
40bc90: 8b 55 00          mov    edx,DWORD PTR [rbp+0x0]
40bc93: 89 d6             mov    esi,edx
40bc95: c1 ee 0d          shr    esi,0xd
40bc98: 40 0f b6 f6       movzx  esi,sil
40bc9c: 0f b6 34 31       movzx  esi,BYTE PTR [rcx+rsi*1]
40bca0: 48 83 c5 04       add    rbp,0x4
40bca4: 0f b6 da          movzx  ebx,dl
40bca7: c1 e6 05          shl    esi,0x5
40bcaa: c1 ea 08          shr    edx,0x8
40bcad: 83 e2 1f          and    edx,0x1f
40bcb0: 09 f2             or     edx,esi
40bcb2: 41 0f b7 14 50    movzx  edx,WORD PTR [r8+rdx*2]
40bcb7: c1 e2 08          shl    edx,0x8
40bcba: 09 da             or     edx,ebx
40bcbc: 48 c1 ea 06       shr    rdx,0x6
40bcc0: 4c 01 ca          add    rdx,r9
40bcc3: 48 8b 14 d1       mov    rdx,QWORD PTR [rcx+rdx*8]
40bcc7: 48 0f a3 da       bt     rdx,rbx
40bccb: 83 d0 00          adc    eax,0x0
40bcce: 48 ff cf          dec    rdi
40bcd1: 75 bd             jne    40bc90
---
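For readers who do not want to decode the listing by hand, here is a rough C sketch of what the loop appears to compute, read off from the shifts and masks above: the code point is split into 8 + 5 + 8 bits, the first two levels select pages, and the last level is a bit set tested with bt. All names, the array sizes and the on-demand page allocation are assumptions made for illustration. This is not the actual std.uni code, which, judging by the shared rcx base and the r9 offset, seems to keep several levels in one storage array rather than in separate ones.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout: a 21-bit code point is split 8 + 5 + 8 bits. */
enum { L1_SIZE = 256, L2_PAGE = 32, L3_WORDS = 4, MAX_PAGES = 64 };

typedef struct {
    uint8_t  level1[L1_SIZE];               /* bits 13..20 -> level-2 page */
    uint16_t level2[MAX_PAGES * L2_PAGE];   /* bits 8..12  -> level-3 page */
    uint64_t level3[MAX_PAGES * L3_WORDS];  /* bits 0..7   -> one bit      */
    unsigned pages2, pages3;                /* pages in use per level      */
} Trie;

/* Page 0 of each level stays all-zero and is shared by every empty range. */
void trie_init(Trie *t) {
    memset(t, 0, sizeof *t);
    t->pages2 = 1;
    t->pages3 = 1;
}

/* Mark code point c as a member; returns 0 if the page budget is exhausted. */
int trie_add(Trie *t, uint32_t c) {
    uint32_t i1 = (c >> 13) & 0xFF;
    if (t->level1[i1] == 0) {
        if (t->pages2 == MAX_PAGES) return 0;
        t->level1[i1] = (uint8_t)t->pages2++;
    }
    uint32_t i2 = ((uint32_t)t->level1[i1] << 5) | ((c >> 8) & 0x1F);
    if (t->level2[i2] == 0) {
        if (t->pages3 == MAX_PAGES) return 0;
        t->level2[i2] = (uint16_t)t->pages3++;
    }
    uint32_t bit = ((uint32_t)t->level2[i2] << 8) | (c & 0xFF);
    t->level3[bit >> 6] |= (uint64_t)1 << (bit & 63);
    return 1;
}

/* The lookup itself: two table loads, one 64-bit load, one bit test. */
int trie_contains(const Trie *t, uint32_t c) {
    uint32_t p2  = t->level1[(c >> 13) & 0xFF];
    uint32_t p3  = t->level2[(p2 << 5) | ((c >> 8) & 0x1F)];
    uint32_t bit = (p3 << 8) | (c & 0xFF);
    return (int)((t->level3[bit >> 6] >> (bit & 63)) & 1);
}

/* Counterpart of the benchmark loop: count how many code points match. */
size_t count_members(const Trie *t, const uint32_t *s, size_t n) {
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        count += (size_t)trie_contains(t, s[i]);
    return count;
}
```

The lookup has no branches, which is what lets the compiler turn the whole loop body into straight-line loads plus a bt/adc pair once the call is inlined.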
The code DMD generates for the lookup, on the other hand, is
pretty ugly, including several values being spilled to the stack,
and also doesn't get inlined.
This is, of course, just a microbenchmark, but it is cases like
this which make me wish that we would just use LLVM (or GCC, for
that matter) for the reference compiler – and I'm not talking
about the slightly Frankensteinian endeavor that LDC is here.
Walter, my intention is not at all to doubt your ability as a
compiler writer; we all know the stories of how you used to annoy
the team leads at the big companies by beating their performance
numbers single-handedly, and I'm sure you could e.g. fix your
backend to match the performance of the LDC-generated code for
Dmitry's benchmark in no time. The question is just: Are we as a
community big and resourceful enough to justify spending time on
that?
Sure, there would still be things we will have to fix ourselves
when using another backend, such as SEH support in LLVM. But
performance will always be a central selling point of a language
like D, and do we really want to take on the burden of keeping up
with the competition ourselves, when we can just draw on the work
of full-time backend developers at Intel, AMD, Apple and others
for free? Given the current developments in microprocessors and
given that applications such as graphics and scientific computing
are naturally a good fit for D, what's next? You taking a year
off from active language development to implement an
auto-vectorizer for your backend?
I know this question has been brought up before (if never really
answered), and I don't want to start another futile discussion,
but given the developments in the compiler/languages landscape
over the last few years, it strikes me as an increasingly bad
decision to stick with an obscure, poorly documented backend
which nobody knows how to use – and nobody wants to learn how to
use either, because, oops, they couldn't even redistribute their
own work.
Let's put aside all the other arguments (most of which I didn't
even mention) for a moment, even the performance aspect; I think
that the productivity aspect alone, both regarding duplicated
work and accessibility of the project to new developers, makes it
hard to justify forgoing the momentum of an established
backend project like LLVM. [1]
Maybe it is naïve to think that the situation could ever change
for DMD. But I sincerely hope that the instant a promising
self-hosted (as far as the frontend goes) compiler project shows
up on the horizon, it will gain the necessary amount of official
endorsement – and manpower, especially in the form of your
(Walter's) expertise – to make that final, laborious stretch to
release quality. If we just sit there and wait for somebody to
come along with a new production-ready compiler which is better,
faster and shinier than DMD, we will wait for a long, long time –
this might happen for a Lisp dialect, but not for D.
Sorry for the rant, [2]
David
[1] The reasons for which I'm focusing on LLVM here are not so
much its technical qualities as its liberal BSD-like license – if
it is good enough for Apple, Intel (also a compiler vendor) and
their legal teams, it is probably good enough for us, too. The code could
even be integrated into commercial products such as DMC without
problems.
[2] And for any typos which might undermine my credibility – it
is way too early in the morning here.