Lib change leads to larger executables

Walter Bright newshound at digitalmars.com
Fri Feb 23 13:28:25 PST 2007


kris wrote:
> As was pointed out to me, OMF librarians actually use a two-level 
> hashmap to index the library entries. This is used by the linker to 
> locate missing symbols. I think it's clear that this is not a linear 
> lookup mechanism as had been claimed, and is borne out by experiments 
> that show the process cannot be controlled, and the linker cannot be 
> faked in a usable or dependable manner.

The librarian takes a list of .obj files, and concatenates them 
together. It appends a dictionary at the end. The dictionary is a 
(ridiculously complex, but that's irrelevant here) associative array 
that can be thought of as being equivalent to:

	ObjectModule[SymbolName] dictionary;

The librarian reads the .obj files in the order in which they are 
presented to the librarian. Each .obj file is parsed, and the public 
names in it are inserted into the dictionary like:

	dictionary[publicname] = objectmodule;

Note that a publicname can correspond to only one objectmodule (a 1:1 
mapping from name to module). If a publicname is already in the 
dictionary, lib issues an error and quits.
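
As a rough illustration of the public-name rule, here is a small model 
in Python (chosen only for convenience; this is not the librarian's 
actual code). The names build_dictionary, DuplicatePublicName, a.obj, 
b.obj and the symbol names are all made up for the sketch, and the real 
dictionary format is far more involved.

```python
class DuplicatePublicName(Exception):
    """Raised when two object modules define the same public name."""


def build_dictionary(modules):
    """Index public names. `modules` is a list of (module_name, public_names)
    pairs, in the order the .obj files were presented to the librarian."""
    dictionary = {}
    for module_name, public_names in modules:
        for name in public_names:
            if name in dictionary:
                # lib issues an error and quits; modelled here as an exception.
                raise DuplicatePublicName(
                    f"{name} defined in {dictionary[name]} and {module_name}")
            dictionary[name] = module_name
    return dictionary


# Hypothetical input: two object modules with distinct public names.
index = build_dictionary([("a.obj", ["_foo", "_bar"]), ("b.obj", ["_baz"])])
```

Every name ends up pointing at exactly one module, so a lookup by name 
has only one possible answer.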

COMDAT names are also inserted into the dictionary *unless they are 
already in there*, in which case they are ignored.

Hence, only the first COMDAT name is inserted. The rest are ignored. The 
hashmap lookup algorithm has ZERO effect on which object module is 
pulled in, because there is (and can only be) a 1:1 mapping. There is no 
way it can arbitrarily pick a different object module.

The process can be controlled by setting the order in which object 
modules are presented to the library.
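
A minimal sketch of that first-one-wins behaviour for COMDATs, again 
modelled in Python rather than taken from lib itself. The function 
index_comdats and the module and symbol names are hypothetical.

```python
def index_comdats(modules):
    """Map each COMDAT name to the first module that defines it.
    `modules` is a list of (module_name, comdat_names) pairs in the
    order they were presented to the librarian."""
    dictionary = {}
    for module_name, comdat_names in modules:
        for name in comdat_names:
            # setdefault keeps an existing entry, so later duplicates
            # of the same COMDAT name are ignored.
            dictionary.setdefault(name, module_name)
    return dictionary


# Hypothetical modules that both carry the COMDAT "tmpl_int".
x = ("x.obj", ["tmpl_int", "helper"])
y = ("y.obj", ["tmpl_int", "other"])

x_first = index_comdats([x, y])
y_first = index_comdats([y, x])
```

Swapping the order of the two modules is the only thing that changes 
which one owns the shared name, in line with the point that the order 
of presentation is what controls the outcome.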

What cannot be controlled is the order in which the linker visits 
unresolved names when trying to find them. That is, if A and B are both 
unresolved, it cannot be controlled whether A or B is looked up first. 
So suppose you have two modules M1 and M2, presented to the librarian in 
that order, containing COMDATs A, B and C:

---------M1------------
A
C
---------M2------------
A
B
-----------------------

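then if B is looked up first, the resulting exe will have only M2 linked 
in, because pulling in M2 also resolves A. If A is looked up first, the 
dictionary maps it to M1 (the first module to define it), B still has to 
come from M2, and both M1 and M2 will be in the executable.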
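
The two outcomes can be reproduced with a toy model of the lookup, 
written in Python for illustration only. It assumes M1 was presented to 
the librarian before M2, so the dictionary maps A to M1. The link 
function is hypothetical and greatly simplified: it ignores any new 
unresolved references that a pulled-in module might introduce.

```python
def link(lookup_order, dictionary, module_defs):
    """Return the modules pulled in when unresolved names are visited
    in `lookup_order`."""
    defined = set()
    linked = []
    for name in lookup_order:
        if name in defined:
            continue  # already resolved by a module pulled in earlier
        module = dictionary[name]
        linked.append(module)
        # Everything the module defines is now available.
        defined.update(module_defs[module])
    return linked


# The example above: M1 defines A and C, M2 defines A and B.
module_defs = {"M1": ["A", "C"], "M2": ["A", "B"]}
# Assumed dictionary, with M1 presented first so it owns A.
dictionary = {"A": "M1", "C": "M1", "B": "M2"}

b_first = link(["B", "A"], dictionary, module_defs)
a_first = link(["A", "B"], dictionary, module_defs)
```

With B visited first only M2 is linked, while with A visited first both 
M1 and M2 are. The dictionary is identical in both runs, so the 
difference comes entirely from the visiting order.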


