D compilation is too slow and I am forking the compiler

Walter Bright newshound2 at digitalmars.com
Wed Nov 21 10:56:02 UTC 2018


On 11/21/2018 2:16 AM, Vladimir Panteleev wrote:
> On Wednesday, 21 November 2018 at 09:46:44 UTC, Walter Bright wrote:
>> It works by allocating memory from a memory-mapped file, which serves as the 
>> precompiled header.
> 
> Hey, that's a great idea! Can we do this for DMD? :D
> 
> On a more serious note: do you think that with D's features (type system / 
> metaprogramming), you could have avoided some of those bugs?
> 
> For example, one thing we can do in D which is still impossible in C++ is to 
> automatically serialize/deserialize all fields of a struct/class (using tupleof 
> / allMembers).
> 

Memory-mapped files really were the key to success, because if the memory-mapped 
file (mmf) could be reloaded at the same address, the pointers inside it did not 
have to be patched. In the DMC++ source code, "dehydrating" a pointer meant 
subtracting a value from it so it was correct for the base address of the mmf, 
and "hydrating" a pointer was the inverse.

The two bug-prone problems were:

1. Separating the tangled data structures into what goes into the pch and what 
does not. Obviously, nothing in the pch could point outside of it.

2. .h files are simply not compatible with this, so you've got to detect when it 
won't work. For example, any command-line switch or macro that might cause 
different code to be generated in the pch had to invalidate it (see the sketch 
below).
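
One way to detect that (a sketch of the general idea, not how DMC++ actually 
did it; pchKey is a made-up name): derive a key from every switch and macro 
definition that can affect what goes into the pch, store the key in the pch 
header, and rebuild whenever the current compile's key doesn't match.

    import std.digest.sha : sha256Of;

    // Hypothetical: fold everything that can change the contents of the pch
    // into one key. If the key stored in the pch header differs from the key
    // for the current compilation, the pch is thrown away and rebuilt.
    ubyte[32] pchKey(string[] switches, string[] macroDefs)
    {
        import std.array : join;
        return sha256Of(switches.join("\x01") ~ "\x02" ~ macroDefs.join("\x01"));
    }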

Maybe I should have done your fork idea? :-)

My experience with this drove many design decisions for D modules. For example, 
D modules are unaffected by where they are imported, the order in which they are 
imported, or the number of times they are imported. (Yes, I know about 
https://digitalmars.com/d/archives/digitalmars/D/D_needs_to_be_honest_320976.html)
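
A toy example of what that buys you: the program below means exactly the same 
thing however the imports are reordered or repeated.

    module app;

    import std.algorithm;
    import std.stdio;
    import std.stdio;    // duplicate import: harmless, the module is
                         // processed once

    void main()
    {
        [3, 1, 2].sort.writeln;   // prints [1, 2, 3] regardless of import order
    }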

Anyhow, what I've thought about doing since the beginning was to make DMD 
multithreaded. The language is designed to support multithreaded compilation. 
For example, lexing, parsing, semantic analysis, optimization, and code 
generation can all be done concurrently.

DMD 1.0 would read imports in a separate thread. This would speed things up if 
you were using a slow filesystem, like NAS or a USB stick, but it was eventually 
disabled because there wasn't a perceptible speedup with current filesystems.

Wouldn't it be awesome to have the lexing/parsing of the imports all done in 
parallel? The main difficulty in getting that to work is dealing with the shared 
string table.
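
Roughly, the shape of it (just a sketch, nothing like DMD's internals; the 
regex "lexer", StringTable, and Module types are toys): each import gets lexed 
and parsed by a parallel task, and every identifier goes through one 
lock-protected interning table, which is exactly where the contention lives.

    import std.parallelism : parallel;

    final class StringTable
    {
        private string[string] interned;

        // Return the one canonical copy of an identifier; every worker
        // thread funnels through this lock, the bottleneck mentioned above.
        string intern(string s)
        {
            synchronized (this)
            {
                if (auto p = s in interned)
                    return *p;
                interned[s] = s;
                return s;
            }
        }
    }

    struct Module { string name; string[] identifiers; }

    // Toy stand-in for lexing/parsing one import.
    Module parseOne(string file, StringTable tab)
    {
        import std.algorithm : map;
        import std.array : array;
        import std.file : readText;
        import std.regex : matchAll, regex;

        auto ids = readText(file)
            .matchAll(regex(`[A-Za-z_]\w*`))   // "lexer": identifier-shaped tokens
            .map!(m => tab.intern(m.hit))
            .array;
        return Module(file, ids);
    }

    void main(string[] args)
    {
        auto tab = new StringTable;
        auto files = args[1 .. $];
        auto mods = new Module[files.length];

        // Lex/parse every file named on the command line concurrently.
        foreach (i, file; parallel(files))
            mods[i] = parseOne(file, tab);
        // ...semantic analysis would then run over mods.
    }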

