Potential of a compiler that creates the executable at once

rempas rempas at tutanota.com
Thu Feb 10 09:41:12 UTC 2022


A couple of months ago, I found out about a language called 
[Vox](https://github.com/MrSmith33/vox) which uses a design I 
haven't seen in any other compiler: instead of creating object 
files and then linking them together, it always creates the 
executable in one step. This means that every time we change 
something in our code, we have to recompile the whole thing. 
Naturally, you will say that this is a huge problem because we 
would have to wait a long time every time we make a small change 
to our project, but here is the thing... With this design, the 
compilation times can become really, really fast (of course, the 
design of the compiler matters too)!

About 3 months ago, the creator of the language said that at 
that point, Vox could compile 1.2M LoC/s, which is really, really 
fast. 99% of projects will never reach that size, so your project 
will always compile in less than a second no matter what! What is 
even more impressive is that Vox is single-threaded, so the 
parsing of files for symbols and errors could get a much bigger 
performance boost if it had multithreading support!

Of course, not creating object files and then linking them means 
that we don't have to produce a lot of object files and then 
combine them all into a big executable, but rather start building 
the executable right away and add everything to it as we go. You 
can see how this saves a lot of time! And CPUs are so fast these 
days that we can compile millions of lines of code in less than a 
second using multithreading support, so even the very rare, huge 
projects will compile very fast.

What's even more impressive is that Vox is not even the fastest 
compiler out there. TCC is even faster (about 4-5 times)! I have 
personally tried to see how fast TCC can compile on my CPU, a 
Ryzen 5 2400G: I was able to compile 4M LoC in 700ms! Yeah, the 
speeds are crazy! And my CPU is an average one; if you were to 
build a PC now, you would get something at least 20% faster with 
at least 2 more threads!

However, this was not the best test. It was only one-line 
functions that all had the same assembly code in them, without 
any preprocessing or linked libraries, so I don't know if that 
played a role, but it was 8 files compiled on 8 threads and the 
speed is just unreal! And TCC DOES create object files and then 
link them. How much faster could it be if it used the same design 
Vox uses (and how much slower would Vox be if it used the design 
regular compilers use)?

Of course, TCC doesn't produce optimized code, but still: even 
compared with GCC at "-O0", TCC compiles code 4-7 times faster 
than GCC, so if TCC could optimize code as much as GCC and used 
the design Vox uses, I could see it compiling around 1-1.5M 
LoC/s!

I am personally really interested in, and inspired by, this 
design for making my own compiler. It also solves a lot of 
problems that we would have to take into account with the classic 
method. One thing I thought of was the ability to also export 
your project as a library (mostly shared/dynamic), so in case you 
have something really huge like 10+M LoC (Linux kernel, I'm 
talking to you!), you could split it into "sub projects" that 
would be libraries and then link them all together.

Another idea would be to check the type of each file passed to 
the compiler and, for source files, skip creating object files 
since they would not be kept anyway. So the following would 
apply:

```
my_lang -c test3.lang
// Compile mode: outputs the object file "test3.o".

my_lang test1.lang test2.lang test3.o -o=TEST
// Create an executable. "test1.lang" and "test2.lang" are source
// files, so we don't create object files for them but go straight
// to generating the binary out of them. "test3.o" is an object
// file, so we "copy-paste" its symbols into the final binary.
```

This is probably the best of both worlds!

So I thought about sharing this and seeing what your thoughts 
are! How fast could DMD be using this design? Or even better, 
what if we created a new backend for DMD that would be faster 
than the current one? D could be very competitive!


More information about the Digitalmars-d mailing list