Potential of a compiler that creates the executable at once
rempas
rempas at tutanota.com
Thu Feb 10 09:41:12 UTC 2022
A couple of months ago, I found out about a language called
[Vox](https://github.com/MrSmith33/vox) which uses a design I
haven't seen in any other compiler: instead of creating object
files and then linking them together, it always creates the
executable in one step. This means that every time we change
something in our code, we have to recompile the whole thing.
Naturally, you will say that this is a huge problem because we
will have to wait every time we make even a small change to our
project, but here is the thing... With this design, compilation
can become really, really fast (of course, the design of the rest
of the compiler matters too)!
About 3 months ago, the creator of the language said that at that
point, Vox could compile 1.2M LoC/s, which is really, really fast
and a size that 99% of projects will never reach, so your project
will always compile in less than a second no matter what! What is
even more impressive is that Vox is single-threaded, so we could
get a much bigger performance boost if parsing the files for
symbols and errors had multithreading support!
Of course, not creating object files means that instead of
producing a lot of object files and then linking them all into a
big executable, we start building that executable right away and
add everything to it as we go. You can see how this can save a
lot of time! And CPUs are so fast these days that we can compile
millions of lines of code in less than a second with
multithreading support, so even the very rare huge projects will
compile very fast.
What's even more impressive is that Vox is not even the fastest
compiler out there. TCC is even faster (about 4-5 times)! I have
personally tried to see how fast TCC can compile on my CPU, a
Ryzen 5 2400G. I was able to compile 4M LoC in 700ms! Yeah, the
speeds are crazy! And my CPU is an average one; if you were to
build a PC now, you would get something at least 20% faster with
at least 2 more threads!
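To make that kind of benchmark concrete, here is a sketch of how it could be reproduced: generate 8 C files full of identical one-line functions and, if `tcc` happens to be installed, time a compile of them. The file count, function count, and exact `tcc` invocation here are my assumptions, not the exact setup from my test above.

```python
import shutil
import subprocess
import tempfile
import time
from pathlib import Path

def generate_file(path: Path, n_funcs: int, tag: str) -> None:
    """Write a C file containing n_funcs identical one-line functions."""
    lines = [f"int f_{tag}_{i}(int x) {{ return x + 1; }}"
             for i in range(n_funcs)]
    path.write_text("\n".join(lines) + "\n")

def run_benchmark(n_files: int = 8, n_funcs: int = 10_000) -> None:
    with tempfile.TemporaryDirectory() as d:
        sources = []
        for i in range(n_files):
            src = Path(d) / f"part{i}.c"
            generate_file(src, n_funcs, str(i))
            sources.append(src)
        if shutil.which("tcc") is None:
            # tcc is not installed; just report what was generated.
            print("tcc not found; generated",
                  n_files * n_funcs, "functions only")
            return
        start = time.perf_counter()
        for src in sources:
            subprocess.run(
                ["tcc", "-c", str(src), "-o", str(src.with_suffix(".o"))],
                check=True)
        elapsed = time.perf_counter() - start
        total_loc = n_files * n_funcs
        print(f"compiled {total_loc} LoC in {elapsed:.3f}s "
              f"({total_loc / elapsed / 1e6:.2f}M LoC/s)")

if __name__ == "__main__":
    run_benchmark()
```

Note that this sketch compiles the files sequentially; my original run used 8 threads, so a parallel version (e.g. with `concurrent.futures`) would be closer to the real test.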
However, this is not the best test. The input was just one-line
functions that all produced the same assembly code, with no
preprocessing and no libraries linked, so I don't know if that
played any role, but it was 8 files compiled on 8 threads and the
speed is just unreal! And TCC DOES create object files and then
link them. How much faster could it be if it used the same design
Vox uses (and how much slower would Vox be if it used the design
regular compilers use)?
Of course, TCC doesn't produce optimized code, but still, even
compared with GCC at "-O0", it compiles 4-7 times faster than
GCC, so if TCC could optimize code as well as GCC and used the
design Vox uses, I can see it being able to compile around
1-1.5M LoC/s!
I am personally really interested in this design and inspired to
make my own compiler around it. It also solves a lot of problems
that we would have to take into account with the classic method.
One thing I thought of is the ability to also export your project
as a library (mostly shared/dynamic), so in case you have
something really huge like 10+M LoC (Linux kernel, I'm talking to
you!), you could split it into "sub-projects" that are built as
libraries and then link them all together.
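As a rough illustration of that splitting idea, here is a sketch of a tiny build planner: it partitions the modules into sub-projects, emits one shared-library build per sub-project (each compiled "at once" in the Vox style), and a final link step. The `my_lang` command and its `-shared`/`-o=` flags are hypothetical, just mirroring the flags common compilers use.

```python
from typing import Dict, List

def plan_build(subprojects: Dict[str, List[str]],
               output: str) -> List[List[str]]:
    """Return compiler invocations: one shared library per
    sub-project, then a final link of all libraries into the
    executable. 'my_lang' and its flags are made up."""
    commands = []
    libs = []
    for name, sources in subprojects.items():
        lib = f"lib{name}.so"
        # Each sub-project is compiled in one step into a shared library.
        commands.append(["my_lang", "-shared", *sources, f"-o={lib}"])
        libs.append(lib)
    # Final step: link the pre-built libraries into the executable.
    commands.append(["my_lang", *libs, f"-o={output}"])
    return commands

cmds = plan_build(
    {"core": ["core1.lang", "core2.lang"], "net": ["net.lang"]},
    "TEST",
)
for c in cmds:
    print(" ".join(c))
```

This way, a change inside one sub-project only forces a full recompile of that sub-project, not of the other 10M lines.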
Another idea would be to check the type of each file passed to
the compiler and, if it is a source file, not create an object
file for it, as it would not be kept anyway. So the following
would apply:
```
my_lang -c test3.lang // Compile mode! Outputs the object file
"test3.o".
my_lang test1.lang test2.lang test3.o -o=TEST // Create an
executable. "test1.lang" and "test2.lang" are source files, so we
won't create object files for them but will go straight to
generating a binary from them. "test3.o" is an object file, so we
will "copy-paste" its symbols into the final binary.
```
This is probably the best of both worlds!
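The dispatch above could look something like this inside the compiler driver: classify each argument by extension, compile the sources straight into the output, and merge the pre-built object files. The `.lang` extension is of course made up for the illustration.

```python
from typing import List, Tuple

def classify_inputs(files: List[str]) -> Tuple[List[str], List[str]]:
    """Split inputs: '.lang' sources get compiled directly into the
    final binary (no intermediate object file), while '.o' files
    have their symbols merged into it."""
    sources, objects = [], []
    for f in files:
        if f.endswith(".lang"):
            sources.append(f)
        elif f.endswith(".o"):
            objects.append(f)
        else:
            raise ValueError(f"unknown input kind: {f}")
    return sources, objects

srcs, objs = classify_inputs(["test1.lang", "test2.lang", "test3.o"])
print("compile directly:", srcs)
print("merge symbols from:", objs)
```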
So I thought I would share this and see what your thoughts are!
How fast could DMD be using this design? Or even better, what if
we created a new backend for DMD that would be faster than the
current one? D could be very competitive!
More information about the Digitalmars-d
mailing list