Lib change leads to larger executables
Justin C Calvarese
technocrat7 at gmail.com
Wed Feb 21 21:55:10 PST 2007
kris wrote:
> Walter Bright wrote:
>> kris wrote:
>>
>>> Walter Bright wrote:
>>>
>>>> Some strategies:
>>>>
>>>> 1) minimize importing of modules that are never used
>>>>
>>>> 2) for modules with a lot of code in them, import them as a .di file
>>>> rather than a .d
>>>>
>>>> 3) create a separate module that defines the relevant typeinfo's,
>>>> and put that first in the library
>>>
>>>
>>>
>>> 1) Tango takes this very seriously ... more so than Phobos, for example.
>>
>>
>> Sure, but in this particular case, it seems that "core" is being
>> imported without referencing code in it. The only reason the compiler
>> doesn't generate the char[][] TypeInfo is because an import defines
>> it. The compiler does work on the assumption that if a module is
>> imported, then it will also be linked in.
>
> This core module, and the entire locale package it resides in, is /not/
> imported by anything. I spelled that out clearly before. You're making
> an assumption it is, somehow ... well, it is not. You can deduce that
> from the fact that the link succeeds perfectly well without that package
> existing in the library.
>
>>
>>> 2) That is something that could be used in certain scenarios, but is
>>> not a general or practical solution for widespread use of D.
>>
>>
>> The compiler can automatically generate .di files. You're probably
>> going to want to do that anyway as part of polishing the library - it
>> speeds compilation times, aids proper encapsulation, etc. That's why
>> the gc does it, and I've been meaning to do it for other bulky
>> libraries like std.regexp.
>
> You may remember that many of us find .di files to be something "less"
> than an effective approach to library interfacing? As to it making
> smaller, faster compilations -- try it on the Win32 header files ... it
> makes them bigger and noticeably slower to parse.
>
> This is neither a valid nor a practical solution.
>
>
>>
>> I wish to point out that the current scheme does *work*, it generates
>> working executables. In the age of demand paged executable loading
>> (which both Linux and Windows do), unused code in the executable never
>> even gets loaded into memory. The downside to size is really in
>> shipping code over a network (also in embedded systems).
>>
>> So I disagree with your characterization of it as impractical.
>
> Oh, ok. It all depends on what one expects from a toolset. Good point
>
>>
>> For professional libraries, it is not unreasonable to expect some
>> extra effort in tuning the libraries to minimize dependency. This is a
>> normal process, it's been going on at least for the 25 years I've been
>> doing it. Standard C runtime libraries, for example, have been
>> *extensively* tweaked and tuned in this manner, and that's just boring
>> old C. They are not just big lumps of code.
>>
>>> 3) Hack around an undocumented and poorly understood problem in
>>> developer-land. Great.
>>
>>
>> I think you understand the problem now, and the solution. Every
>> developer of professional libraries should understand this problem, it
>> crops up with most every language. If a developer doesn't understand
>> it, one winds up with something like Java where even the simplest
>> hello world winds up pulling in the entire Java runtime library,
>> because dependencies were not engineered properly.
>
> This is a problem with the toolchain, Walter. Plain and simple. The
> linker picks up an arbitrary, yes arbitrary, module from the library
> because the D language-design is such that it highlights a deficiency in
> the existing toolchain. See below:
>
> You can claim all you like that devs should learn to deal with it, but
> the fact remains that it took us more than a day to track down this
> obscure problem to being a char[][] decl. It will take just as long for
> the next one, and perhaps longer. Where does the cycle end?
>
> The toolchain currently operates in a haphazard fashion, linking in
> /whatever/ module-chain happens to declare a typeinfo for char[][]. And
> it does this because of the way D generates the typeinfo. The process is
> broken, pure and simple. We should accept this and try to figure out how
> to resolve it instead.
>
>
>>
>>> you might as well add:
>>>
>>> 4) have the user instantiate a pointless and magic char[][] in their
>>> own program, so that they can link with the Tango library?
>>
>>
>> I wouldn't add it, as I would expect the library developer to take
>> care of such things by adding them to the Tango library as part of the
>> routine process of optimizing executable size by minimizing dependencies.
>>
>
>
> Minimizing dependencies? What are you talking about? Those deps are
> produced purely by the D compiler, and not the code design.
>
>
>
>>> None of this is gonna fly in practice, and you surely know that?
>>
>>
>> For features like runtime type identification, etc., that are
>> generated by the compiler (instead of explicitly by the programmer),
>> then the dependencies they generate are a fact of life.
>>
>> Optimizing the size of a generated program is a routine programming
>> task. It isn't something new with D. I've been doing this for 25 years.
>
> Entirely disingenuous. This is not about "optimization" at all ... it's
> about a broken toolchain. Nothing more.
>
> I hope you'll find a way to progress this forward toward a resolution
> instead of labeling it something else.

I'm not trying to pick a fight with any of the people who have been
discussing this serious issue, but I have some thoughts I'd like to add.
Feel free to take my words as the ramblings of an idiot...

My theory is that Walter and you (Kris and everyone else who is trying
to talk some sense into Walter) are operating on different wavelengths.
(I may be on yet another wavelength.) When I read Kris's complaint, I
think "Wow, that sounds like a problem that needs fixing". When I read
Walter's response, I think "Hmmm, that makes sense, too. What was Kris's
problem again?". And that cycle repeats for me. Walter seems to still
think he understands the problem, but perhaps we could benefit from a
simple illustration of the problem. Or just a restatement of the problem
situation. I'm sure that I don't understand what's going on.

It's something about the compiler generating the TypeInfo for
char[][], and it's bringing in all of "Core" (but we don't need all of
"Core"). And we especially don't need the "locale" package since it's
bloated (and unneeded), but the whole package (including all of "Core"
and "locale") is brought in because the compiler is generating TypeInfo
for the char[][]. (But if the "locale" package is so bloated and
unneeded, then why is it being compiled at all? Is "locale" part of
"Core"?) Is any of that right? I'm so confused.
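
If I try to turn that into the simple illustration I was asking for, it
might look like the toy model below. It is only a sketch in Python of how
I assume a linker resolves symbols against a library: scan the members in
order and pull in the first one that defines a still-missing symbol, along
with whatever that member needs. The module names, the symbol names, and
the link() helper are all made up for illustration; this is not how DMD or
any real linker is implemented.

```python
# Toy model of symbol resolution against a static library.
# ASSUMPTION: the linker takes the first library member that defines a
# missing symbol, then resolves that member's own needs the same way.
# All module and symbol names here are hypothetical.

TYPEINFO = "TypeInfo_char[][]"

def link(program_needs, library):
    """library is an ordered list of (module, defined_symbols, needed_symbols)."""
    undefined = list(program_needs)
    defined = set()
    linked = []
    while undefined:
        sym = undefined.pop()
        if sym in defined:
            continue
        for module, defs, needs in library:
            if sym in defs:
                # First definition found wins, wherever it happens to live.
                linked.append(module)
                defined.update(defs)
                undefined.extend(needs)
                break
        else:
            raise LookupError("unresolved symbol: " + sym)
    return linked

library = [
    # A bulky module that happens to define the char[][] TypeInfo.
    ("locale.core", ["locale_init", TYPEINFO], ["locale_data"]),
    ("locale.data", ["locale_data"], []),
    ("io.console", ["print_line"], []),
]

# The program only calls print_line, but also needs the TypeInfo.
bloated = link(["print_line", TYPEINFO], library)
print(bloated)   # ['locale.core', 'locale.data', 'io.console']

# Strategy 3 from the quoted list: a tiny TypeInfo module placed first.
trimmed = link(["print_line", TYPEINFO],
               [("typeinfo.stub", [TYPEINFO], [])] + library)
print(trimmed)   # ['typeinfo.stub', 'io.console']
```

In this model the program never asked for anything in "locale", yet the
char[][] TypeInfo reference drags in locale.core and locale.data simply
because they come first in the library. Putting a small module that
defines the TypeInfo at the front keeps them out.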

(Perhaps part of the problem is that Walter isn't that familiar with the
Tango library and what it's all about. I suspect that I know more about
Tango than Walter does -- and I'm afraid that I know barely anything
about it -- so that could be part of the problem, too.)

--
jcc7