Lib change leads to larger executables

kris foo at bar.com
Wed Feb 21 23:49:23 PST 2007


Justin C Calvarese wrote:
> I'm not trying to pick a fight with any of the people who have been 
> discussing this serious issue, but I have some thoughts I'd like to add. 
> Feel free to take my words as the ramblings of an idiot...
> 
> My theory is that Walter and you (Kris and everyone else who is trying 
> to talk some sense into Walter) are operating on different wavelengths. 
> (I may be on yet another wavelength.) When I read Kris's complaint, I 
> think "Wow, that sounds like a problem that needs fixing". When I read 
> Walter's response, I think "Hmmm, that makes sense, too. What was Kris's 
> problem again?". And that cycle repeats for me. Walter seems to still 
> think he understands the problem, but perhaps we could benefit from a 
> simple illustration of the problem. Or just a restatement of the problem 
> situation. I'm sure that I don't understand what's going on.
> 
> It's something about the compiler is generating the TypeInfo for 
> char[][], and it's bringing in all of "Core" (but we don't need all of 
> "Core"). And we especially don't need the "locale" package since it's 
> bloated (and unneeded), but the whole package (including all of "Core" 
> and "locale") is brought in because the compiler is generating TypeInfo 
> for the char[][]. (But if the "locale" package is so bloated and 
> unneeded, then why is it being compiled at all? Is "locale" part of 
> "Core"?) Is any of that right? I'm so confused.
> 
> (Perhaps part of the problem is that Walter isn't that familiar with the 
> Tango library and what it's all about. I suspect that I know more about 
> Tango than Walter does -- and I'm afraid that I know barely anything 
> about it -- so that could be part of the problem, too.)
> 

Well said, Justin. I'm personally feeling like there's either some vast 
misunderstanding or there's a lot of smoke billowing about. I'll try to 
recapture the issue and see where it goes. Let me know if I fail to 
explain something?

The problem space
-----------------

1) This is not about templates anymore. We're currently past that bridge 
and into different territory. A common territory that every developer 
using D will have to face in one way or another.

2) This is not specific to Tango at all. It is a generic problem and 
Tango just happens to trip it in an obvious manner.

3) In a nutshell, the linker is binding code from the library that has 
no business being attached to the executable. Let's call this the 
"redundant code"?

4) Given the last set of comments from Walter, he appears to think the 
redundant code is somehow imported; either by the example program or 
indirectly via some chain of imports within the library itself. This is 
where the disconnect lies, I suspect.

5) There is /no/ import chain explicitly manifested anywhere in the code 
in question. This should be obvious from the fact that the example links 
perfectly cleanly when said redundant code is deliberately removed from 
the library.

6) The dependency that /does/ cause the problem is one generated by the 
D compiler itself. It generates and injects false 'dependencies' across 
object modules. Specifically, that's effectively how the linker treats them.

7) These fake dependencies cause, in this case, the entire "locale" 
package to be bound to the example app, resulting in a 350% increase 
in size.

8) Fake dependencies are injected in the form of typeinfo. In this case, 
the typeinfo is for a char[][]. This is not part of the "prepackaged" 
set of typeinfo, so the compiler makes it up on the fly. Trouble is, 
this is "global" information -- it should be in one location only.

9) The fake dependencies cause the linker to pick up and bind whatever 
module happens to satisfy its need for the typeinfo resolution. In this 
case, the linker sees Core.obj with a char[][] decl exposed, so it says 
"hey, this is the place!" and binds it. Along with everything else that 
Core.obj actually requires.

10) The linker is entirely wrong, but you can't really blame it since 
the char[][] decl is scattered throughout the library modules. It thinks 
it gets the /right/ one, but in fact it could have chosen *any* of 
them. This is now getting to the heart of the problem.

11) If there were only one exposed decl for char[][], e.g. like int[], 
there would be no problem. In fact you can see all the prepackaged 
typeinfo bound to any D executable. There's lots of them. However, 
because the compiler injects this typeinfo into a variety of objects 
(apparently wherever char[][] is used), the linker is bamboozled.

12) If the linker were smart, and could link segments instead of entire 
object modules, this would still be OK (a segment is an isolated part of 
the object module). But the linker is not smart. It was written to be 
fast, in pure assembler, decades ago.
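
Points 6 through 12 can be mimicked with a toy model. This is only a sketch in Python, not the real OPTLINK algorithm: the module names, symbol names and sizes are invented, and the rule that the first module defining a duplicated symbol wins is an assumption made purely to keep the model small. What it shows is that an app which never imports anything from the locale side still ends up with it, because the typeinfo lookup lands on Core.obj and whole modules get bound.

```python
# Toy model of a linker that binds whole object modules.
# Module names, symbol names and sizes (KB) are invented for illustration.
TI = "typeinfo for char[][]"

# Two modules (Core.obj and Text.obj) both carry the compiler-generated typeinfo.
LIBRARY = {
    "Core.obj":       {"defines": ["Culture", TI], "needs": ["LocaleData"], "kb": 60},
    "LocaleData.obj": {"defines": ["LocaleData"],  "needs": [],             "kb": 380},
    "Text.obj":       {"defines": ["Text", TI],    "needs": [],             "kb": 10},
    "Stdout.obj":     {"defines": ["Stdout"],      "needs": ["Text"],       "kb": 50},
}

def build_index(library):
    """Map each symbol to one module; with duplicates the first definer wins (assumption)."""
    index = {}
    for module, info in library.items():
        for sym in info["defines"]:
            index.setdefault(sym, module)
    return index

def link(app_defines, app_needs, library):
    """Return the library modules bound to the executable, in the order they were pulled in."""
    index = build_index(library)
    defined = set(app_defines)
    pending = list(app_needs)
    pulled = []
    while pending:
        sym = pending.pop(0)
        if sym in defined:
            continue  # already satisfied, no library lookup happens
        module = index[sym]
        pulled.append(module)  # the whole module is bound, not just one symbol
        defined.update(library[module]["defines"])
        pending.extend(library[module]["needs"])
    return pulled

# The app references only Stdout and the compiler-generated typeinfo.
pulled = link(app_defines=["main"], app_needs=["Stdout", TI], library=LIBRARY)
total_kb = sum(LIBRARY[m]["kb"] for m in pulled)
print(pulled, total_kb)
```

In this model Text.obj would have satisfied the typeinfo reference just as well, at a fraction of the cost, but the lookup happened to land on Core.obj first.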


Why is this a problem now?
--------------------------

Well, it's always been a problem to an extent, over the years. The key 
here is that in the past, the problem was generated principally by the 
developer introducing duplicate symbols and so on. Because it 
was in the hands of the developer, it could be resolved reasonably well.

With D, that is still potentially the case. However, the /real/ problem 
is this: the compiler generates the duplicate symbols all by itself. So, 
the developer has no means to rectify the situation ... it is entirely 
out of their hands. What's worse is this: there are no useful messages 
involved ... all you get is some bizarre and arcane message from the 
linker that generally misguides you instead.

Case in point: you have to strip the library down by hand, and very very 
carefully sift through the symbols and literally hundreds of library 
builds until you finally get lucky enough to stumble over the problem.

Walter asserts that the linker can be tricked into doing the right 
thing. This seems to show a lack of understanding on his part about the 
problem and the manner in which the lib and linker operate.

The linker cannot be fooled or tricked in a dependable manner, since 
the combinations of redundant symbols for it to choose from are nigh 
impossible for a human to track on a regular basis, and the particular 
module resolved depends very much on where the linker currently is in 
its process. As you can imagine, in a large library with a large number 
of compiler-generated duplicates, that's potentially a very large 
explosion of combinations? The notion that a developer be responsible 
for tricking the linker, to cover up for these injected duplicates is 
simply absurd :)
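
To make the "where the linker currently is in its process" point concrete, here is a small Python sketch under invented assumptions: four hypothetical modules, a made-up index that maps the duplicated typeinfo symbol to Core.obj, and a linker that only consults the index for symbols that are still undefined. It is not a description of the real tool. It just enumerates every order in which the same three references could be processed and collects the distinct results.

```python
from itertools import permutations

TI = "typeinfo for char[][]"

# Invented modules. A.obj and Core.obj both carry the duplicated typeinfo;
# Core.obj additionally needs "Big", which stands in for a large package.
MODULES = {
    "A.obj":    {"defines": ["A", TI], "needs": []},
    "B.obj":    {"defines": ["B"],     "needs": []},
    "Core.obj": {"defines": [TI],      "needs": ["Big"]},
    "Big.obj":  {"defines": ["Big"],   "needs": []},
}
# Assumed library index: the duplicated symbol happens to point at Core.obj.
INDEX = {"A": "A.obj", "B": "B.obj", "Big": "Big.obj", TI: "Core.obj"}

def link(app_needs):
    """Return the set of modules bound when references are processed in the given order."""
    defined, pulled, pending = set(), [], list(app_needs)
    while pending:
        sym = pending.pop(0)
        if sym in defined:
            continue  # satisfied as a side effect of an earlier module
        module = INDEX[sym]
        pulled.append(module)
        defined.update(MODULES[module]["defines"])
        pending.extend(MODULES[module]["needs"])
    return frozenset(pulled)

# Same three references, every possible processing order.
outcomes = {}
for order in permutations(["A", "B", TI]):
    outcomes.setdefault(link(order), []).append(order)

for modules, orders in outcomes.items():
    print(sorted(modules), "from", len(orders), "of 6 orderings")
```

Even this tiny setup yields two different executables from identical inputs, depending only on whether A happens to be resolved before the typeinfo. With many duplicated symbols spread over many modules, the number of cases to reason about grows accordingly.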

As was pointed out to me, OMF librarians actually use a two-level 
hashmap to index the library entries. This is used by the linker to 
locate missing symbols. I think it's clear that this is not a linear 
lookup mechanism as had been claimed, and is borne out by experiments 
that show the process cannot be controlled, and the linker cannot be 
faked in a usable or dependable manner.

To hammer the overall issue home, consider this: in the example 
application I added a dummy /magical/ declaration of

char[][] huh = [];

When linked against the lib, my executable shrank from ~620kb to ~180kb. 
Where did I get this magic from? Well, it took an exceedingly long and 
tedious process to discover it. Rinse and repeat for the next related error.
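
A plausible reading of why the dummy declaration helps, sketched in Python with invented module names and sizes (these are not the measured figures above): once the application's own object module defines the typeinfo, the linker has nothing left to look up in the library, so the module that would have satisfied the reference is never bound. The assumption that a symbol already defined by the app is never searched for in the library is part of the model, not something confirmed about the real linker.

```python
TI = "typeinfo for char[][]"

# Invented modules: name -> (symbols defined, symbols needed, size in KB).
LIBRARY = {
    "Stdout.obj":     (["Stdout"],     [],             50),
    "Core.obj":       ([TI],           ["LocaleData"], 60),
    "LocaleData.obj": (["LocaleData"], [],             380),
}
# One index entry per symbol (no duplicates in this tiny library).
INDEX = {sym: name for name, (defines, _, _) in LIBRARY.items() for sym in defines}

def linked_kb(app_defines, app_needs, app_kb=20):
    """Total size of the app plus every library module the toy linker binds."""
    defined, pending, total = set(app_defines), list(app_needs), app_kb
    while pending:
        sym = pending.pop(0)
        if sym in defined:
            continue  # defined by the app or an earlier module: no lookup
        defines, needs, kb = LIBRARY[INDEX[sym]]
        defined.update(defines)
        pending.extend(needs)
        total += kb
    return total

# Without the dummy declaration the app only references the typeinfo.
before = linked_kb(app_defines=["main"], app_needs=["Stdout", TI])
# With it, the app's own object module defines the typeinfo as well.
after = linked_kb(app_defines=["main", TI], app_needs=["Stdout", TI])
print(before, after)
```

The fix works in the model for the same reason it is fragile in practice: it depends on knowing exactly which compiler-generated symbol triggered the unwanted lookup.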



A word about Tango
------------------

Contrary to various implications made recently, Tango is rather well 
organized with very limited inter-module dependencies. For example, 
interfaces and pure-abstract classes are deliberately used to decouple 
the implementation of one module from those of others. You won't see 
that kind of thing in many libs, and certainly not in Phobos. Tango is 
designed and built by people who actually care about such things, crazy 
as that may sound :)

The "bloat" injected into the example executable comes entirely from an 
isolated package. It is the "locale" package, which supports a truly 
extensive array of I18N tools and calendars. I think it has 7 or 8 
different calendar systems alone? It captures all the monetary, time and 
date preferences and idioms for every recognized locale in the world. 
In short, it is an exceptional piece of work (from David Chapman). The 
equivalent out there is perhaps a good chunk of the IBM ICU project. The 
minimum size for that is a 7MB DLL. Typically 10MB instead.

I feel it important to point out that this powerful I18N package is, in 
no way, at fault here. The D compiler simply injected the wrong symbol 
into the wrong module at the wrong time, in the wrong order, and the 
result is that this package gets linked when it's not used or imported 
in any fashion by the code design. Instead, the dependency is created 
entirely by the compiler. That's a problem. It is a big problem. And it 
is a problem every D developer will face, at some point, when using DM 
tools. But it is not a problem rooted in the Tango code.


