Compilation times and idiomatic D code

H. S. Teoh via Digitalmars-d digitalmars-d at puremagic.com
Mon Jul 17 11:42:36 PDT 2017


On Sat, Jul 15, 2017 at 07:11:54PM +0000, Enamex via Digitalmars-d wrote:
[...]
> All that time I'd assumed that 'symbols' as linkers used them were
> constant length :T

They may have been once upon a time, many decades ago. :-D  But modern
linkers support symbols of arbitrary length (well, up to a limit :-D)
because of the demands of modern languages that generate long symbols as
part of implementing function overloading, etc..


> Just to be clear: 100% of that bloat resides in the symbol table?

Pretty much, yes.  Unless your program happens to want to print a string
constructed from the type name (very unlikely unless it's purely for
debugging purposes).


> So the current proposed remedy is to hash symbols above a length
> threshold?

Rainers' PR (which is now available as a dmd/druntime branch, btw)
changes the mangling scheme so that when a symbol contains substrings
that are repeated, the repeated instances are replaced with an encoded
back-reference. When substrings are long and/or repeated frequently,
this leads to a significant reduction in symbol size.

In fact, I just tested the `mangle` branch this morning on my code, and
it shows major improvements: my executable sizes are now down to 4MB,
about half the size of what I could achieve by type erasure via OO
polymorphism.  The longest symbol lengths are now about 1000 characters
or so, as opposed to 20KB (or, before I applied the horcrux trick,
800KB).  Compared to the original 600MB executable sizes before I
started tackling this issue, this represents a 99% improvement(!).

Now Rainers' changes are just waiting for various dmd testsuite issues
to be ironed out, and it should be available in git master soon. Hooray!


> Besides the symbol problem though, does the template instantiation
> explosion problem imply as many duplicated function bodies
> corresponding to the new type symbols?

That's a different, though somewhat related, issue.  I have some ideas
on that front, but for now, Rainers' PR addresses only the problem of
symbol length.  So the next task after his fix is merged is to tackle
the larger problem of how to improve the implementation of templates.

One of the major issues with heavily-templated code is that you could
potentially end up with many duplicated function bodies that may in fact
be binary-identical, but are stored separately in the executable because
from the language's POV, they are distinct functions.  (Well, there's
also the problem of separate compilation, where you could end up with
the *same* function instantiated multiple times in different object
files, but for the most part, LTO (link-time optimization) takes care of
that: the compiler writes out the duplicated instantiations as "weak
symbols", so the linker discards all but the first copy.)

Often, this results from templates that contain helper code that's
independent of the template parameters, or dependent only on a subset of
template parameters. For example:

	struct S(T, size_t size) {
		T method1() {
			// N.B.: independent of `size`
			return T.init;
		}
		int[size] method2() {
			// N.B.: independent of `T`
			return int[size].init;
		}
		T[size] method3() {
			// N.B.: depends on both `T` and `size`
			return T[size].init;
		}
	}

	S!(int, 5) si5;
	S!(int, 6) si6;
	S!(string, 2) ss2;
	S!(string, 4) ss4;

Here, si5.method1 and si6.method1 are identical, because method1 is
independent of `size`.  However, because they are part of the template
S, they have distinct names: `S!(int, 5).method1` and `S!(int,
6).method1`, so they will be duplicated in the generated code.  Ideally,
the compiler should detect that they are actually identical, and merge
them accordingly.

A similar situation occurs with method2. For method3, since it depends
on both template parameters, two versions of the function must be
generated, obviously.

Of course, this example is contrived, and rather trivial.  A less
trivial example shows that there's more to this problem than just
detecting which template parameters a method depends on:

	struct List(T) {
		struct Node {
			Node* next;
			T t;
		}
		Node* removeNext(Node* prev) {
			Node* tmp = prev.next;
			prev.next = tmp.next;
			tmp.next = null;
			return tmp;
		}
		...
	}

Here, removeNext takes Node* as parameter, and since the definition of
Node depends on T, it would appear that removeNext also depends on T.
However, closer inspection shows that actually it never touches the
portion of Node that is dependent on T.  Therefore, removeNext will
actually be identical across *all* instantiations of List.  This case is
not as easy for the compiler to detect, because the definition of Node
depends on T, and it could be the case that Node is defined in a way
that *will* require removeNext to also be dependent on T. For example,
if we defined Node as:

	struct Node {
		T t;
		Node* next;
	}

then the offset of Node.next will change depending on the size of T, so
it would not be possible to merge removeNext across different
instantiations of List.

The usual workaround is to use OO to manually perform the merging of
removeNext yourself:

	class NodeBase {
		NodeBase next;
	}

	// N.B.: module-global non-template function.
	NodeBase removeNext(NodeBase prev) {
		... // implementation here
	}

	struct List(T) {
		// N.B.: use inheritance to separate the T-dependent
		// part of Node from the T-independent part.
		class Node : NodeBase {
			T t;
		}

		// Ugly: need a wrapper function to forward calls to
		// List.removeNext to the module-global version
		Node removeNext(Node prev) {
			// Ugly: need to downcast to get Node back
			// rather than NodeBase
			return cast(Node) .removeNext(prev);
		}
	}

But this adds a lot of boilerplate and relies on the user to manually do
what, arguably, the compiler ought to be able to automate. Not to
mention the ugly points noted in the comments above. Of course, some of
the ugliness can be alleviated by also refactoring List(T) to have a
non-template base class that implements a removeNext method, or using a
`this` template parameter, etc., but the point is that this is tedious
boilerplate that would be nice for the compiler to automatically do on
our behalf.


T

-- 
If lightning were to ever strike an orchestra, it'd always hit the conductor first.


More information about the Digitalmars-d mailing list