String Switch Lowering

H. S. Teoh hsteoh at quickfur.ath.cx
Sat Jan 27 23:51:32 UTC 2018


On Sat, Jan 27, 2018 at 03:48:19PM -0800, Timothee Cour wrote:
[...]
> eg `ldc -hash-threshold` would be 1 option.
[...]
> with a small threshold:
> 
> mangled:
> _D8analysis3run__T9shouldRunℂ0abf2284dd3
> 
> demangled:
> pure nothrow @nogc @safe immutable(char)[] analysis.run.shouldRun.ℂ0abf2284dd3
> 
> The `ℂ` symbol indicating hashing was applied because symbol size
> exceed threshold.
> The demangled version also would have that. A separate file (dmd
> -mangle_map=file) could be produced in the rare case a user wants to
> see the full 17KB mangled and demangled symbols mapped by ℂ0abf2284dd3
[...]

This gives me an idea.  A lot of the recent complaints about symbol size
appear to be coming from templates with string template arguments, and
encoding strings inside a symbol tends to quickly make its length
explode.  What if we changed the mangling scheme such that if a string
template argument exceeds a certain length, or if the number of string
arguments (or their accumulated length) exceeds a certain size, we hash
the string arguments instead, and use the hash in the symbol instead of
encoding the raw strings themselves?
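
To make the idea concrete, here is a rough sketch in Python (not dmd code) of
what such a rule could look like.  Everything in it is an assumption for
illustration only: the function name, both limits, the `H` marker, SHA-256 as
the hash, and the 16-digit truncation.  The short-string branch is only loosely
modeled on the length-plus-hex form D uses for string values, and the optional
map mirrors the mangle-map file suggested in the quoted message.

```python
import hashlib

# Hypothetical limits in bytes; no concrete values are proposed in this thread.
PER_ARG_LIMIT = 32
TOTAL_LIMIT = 128


def encode_string_args(args, per_arg_limit=PER_ARG_LIMIT,
                       total_limit=TOTAL_LIMIT, mangle_map=None):
    """Encode string template arguments for a symbol name.

    An argument is hashed if it is longer than per_arg_limit, or if the
    accumulated length of all arguments exceeds total_limit.  Otherwise it
    is kept as a length prefix plus the hex of its UTF-8 bytes.
    """
    raws = [a.encode("utf-8") for a in args]
    over_total = sum(len(r) for r in raws) > total_limit
    parts = []
    for raw in raws:
        if over_total or len(raw) > per_arg_limit:
            # "H" is a made-up marker meaning "hashed string argument".
            tag = "H" + hashlib.sha256(raw).hexdigest()[:16]
            if mangle_map is not None:
                # Remember the original so a tool could recover it later.
                mangle_map[tag] = raw.decode("utf-8")
            parts.append(tag)
        else:
            parts.append(f"a{len(raw)}_{raw.hex()}")
    return "".join(parts)


if __name__ == "__main__":
    recovered = {}
    print(encode_string_args(["abc"]))  # short argument, kept as a3_616263
    print(encode_string_args(["x" * 17000], mangle_map=recovered))
    print(len(recovered))
```

With a scheme like this, a 17KB string argument contributes a fixed 17
characters to the symbol instead of tens of kilobytes of hex digits.  The cost
is that the symbol is no longer self-describing, which is why some side
channel such as the map above would be needed for full demangling.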

Regardless, *some* form of symbol compression is necessary in D,
especially now that druntime is also moving in the direction of more
templated code. Rainer's backref patch helped in one common use case
(chained range functions), but the string argument case remains a source
of major symbol bloat.

I was also discussing with Stefan Koch on GitHub that we need to take up a
project to improve the way dmd handles templates. There's much room for
improvement, and with D's focus on heavy-duty compile-time features,
this is a major area where we can reap large benefits for the effort
invested.


T

-- 
Freedom of speech: the whole world has no right *not* to hear my spouting off!

