std library hooks

Manu turkeyman at gmail.com
Sun Apr 15 05:05:36 PDT 2012


On 15 April 2012 14:55, Manu <turkeyman at gmail.com> wrote:

> 2012/4/15 "Jérôme M. Berger" <jeberger at free.fr>
>
>> Manu wrote:
>> > I have multiple definitions because I defined a function from the lib in
>> > one of my own objects, and then they get linked together.
>> > If I were to not specify libcmtd, every CRT call would be unresolved,
>> > except atoi in this case which I implemented in one of my objects. The
>> > point is, if I implement a function from a library, I get the behaviour
>> > I expect, which is that the linker complains about multiply defined symbols.
>> > You're saying though, that I should be able to implement a library
>> > function in my own code, and that will somehow be found prior to
>> > searching the libs, and since the symbol is resolved, the lib search
>> > will not take place.
>> > How is the linker supposed to know which instance (my one in a loose .o
>> > file, or the one in the lib) is actually 'my' one? What if it was
>> > contained in 2 different libs, rather than one in a lib, and the other
>> > in a bundle of loose object files?
>> >
>> > When and how can I expect the behaviour you propose? I don't follow the
>> > requirements... or the logic the linker could follow. At link time, it
>> > no longer knows what's mine from what's in any given lib...
>> > As far as I was aware, it DID just link every single thing
>> > given together in one huge blob, and then strip the unreferenced stuff
>> > as a post process.
>>
>>         Symbols from "loose" object files will all be included whether
>> they are used or not (some linkers have special options to strip unused
>> symbols from the generated executable afterwards).
>>
>>        Symbols from libraries will only be included if at least one of the
>> following conditions applies:
>>
>> - Some object file that is already included uses the symbol and no
>> other already included object file defines the symbol. The included
>> object files may have been specified explicitly or come from a
>> library and included because it defines some other symbol that is
>> already used. Here, "already" means "reading the command line from
>> left to right";
>>
>> - The symbol is inside a library object file and that object file
>> also defines another symbol that is used from one of the already
>> included object files.
>>
>>
>>        The linker maintains a list of object files and two lists of
>> symbols: one for symbols whose definition has already been found
>> (whether these symbols were used or not) and one for symbols that
>> are used but whose definition is missing. Then it works like this:
>> - Initialize the lists to empty;
>> - Take object files and libraries in the order they are specified on
>> the command line and/or linker script (possibly adding some implicit
>> runtime libraries and objects at the end);
>> - For each explicit object file, add the file to the list of objects
>> and look at the symbols that are defined and used in this object
>> file. Update the symbol lists accordingly. Complain if a symbol that
>> is defined in the object file was already in the list of defined symbols
>> (unless at least one of the definitions is marked "weak");
>> - Libraries are a collection of object files. For each library, look
>> through the object files in the library for symbols that are in our
>> "undefined" list. If any are found, add the corresponding object
>> files as if they came from the command line. Ignore all other object
>> files in the library;
>> - Once you have looked at all the object files and libraries, if
>> there are still symbols in the "undefined" list, complain unless
>> those symbols are marked "weak";
>> - Assemble the object files, updating them where needed with
>> references to the symbols from the "defined" list.
>
>
> Right, cheers for taking the time to write all that. I feel considerably
> more educated on the link process :)
>
> - Libraries are a collection of object files. For each library, look
> through the object files in the library for symbols that are in our
> "undefined" list. If any are found, add the corresponding object
> files as if they came from the command line. Ignore all other object
> files in the library;
>
> This is an interesting point. This suggests I can expect unpredictable
> behaviour in a case like this:
> Say lib A has object file A:O, and it contains symbols x and y...
>
> If I now have my own object, and I define x, but also reference y, it will
> search for unresolved y in libraries, eventually finding the symbol in A:O,
> which it will then include right?
> As it includes A:O from the lib to resolve y, will it not also pull x in
> the same object, causing a collision with my existing definition of x?
>
> Basically, if an object in a lib defines multiple symbols, and I use one,
> but attempt to 'override' the other, is there a way to avoid this collision?
> Is this the reason that CRT implementations always seem to strictly have
> one single .c file per CRT function?
>

That said, none of that matches what I observed with the VC linker.
I defined atoi locally, and it complained straight away. If it were only
pulling in objects that contain unresolved symbols from the lib, I
shouldn't have seen the error I pasted above: the symbol was already
defined within the object that referenced it, so the linker had no reason
to pull atox.obj...
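
To make the x/y case concrete, here is a toy model in Python of the
algorithm as Jérôme describes it. It is only a sketch: the object, library
and symbol names are made up, weak symbols and the final relocation step
are left out, and it says nothing about what the VC linker really does
internally.

```python
class LinkError(Exception):
    """Raised for multiply defined or unresolved symbols."""


def link(inputs):
    """Toy left-to-right linker.

    inputs is a list of either
      ("obj", name, defines, uses)            a loose object file
      ("lib", name, [(member, defines, uses)]) a library of object files
    Returns the names of the object files that end up in the link.
    """
    defined = {}        # symbol -> object file that defines it
    undefined = set()   # symbols used so far with no definition yet
    included = []       # object files pulled into the link, in order

    def add_object(name, defines, uses):
        included.append(name)
        for sym in defines:
            if sym in defined:
                raise LinkError(
                    f"{sym} multiply defined in {defined[sym]} and {name}")
            defined[sym] = name
            undefined.discard(sym)
        for sym in uses:
            if sym not in defined:
                undefined.add(sym)

    for item in inputs:
        if item[0] == "obj":
            _, name, defines, uses = item
            add_object(name, defines, uses)
        else:
            _, libname, members = item
            pending = list(members)
            progress = True
            # Rescan the library while pulling a member adds new undefined symbols.
            while progress:
                progress = False
                for member in list(pending):
                    mname, mdefs, muses = member
                    if undefined & set(mdefs):
                        pending.remove(member)
                        add_object(f"{libname}:{mname}", mdefs, muses)
                        progress = True

    if undefined:
        raise LinkError("unresolved: " + ", ".join(sorted(undefined)))
    return included


def try_link(inputs):
    """Return the included objects, or the error text if the link fails."""
    try:
        return link(inputs)
    except LinkError as err:
        return f"error: {err}"


# my.o defines x and uses y; the library keeps x and y in ONE member.
bundled = [("obj", "my.o", ["x"], ["y"]),
           ("lib", "A", [("O", ["x", "y"], [])])]

# Same thing, but the library has one member per symbol.
split = [("obj", "my.o", ["x"], ["y"]),
         ("lib", "A", [("X", ["x"], []), ("Y", ["y"], [])])]

# my.o defines x and uses nothing from the library.
self_contained = [("obj", "my.o", ["x"], ["x"]),
                  ("lib", "A", [("O", ["x", "y"], [])])]

if __name__ == "__main__":
    for case in (bundled, split, self_contained):
        print(try_link(case))
```

Under this model the collision only shows up in the first case, where
resolving y drags in the member that also defines x. With one symbol per
member, or when nothing references the shared member, my x is left alone.
That might be what happened with atox.obj if some other object referenced
another symbol defined there, but that is a guess, not something I have
verified.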

I'm still going to put all of this down as 'very unpredictable' to anyone
who isn't a compiler author. If we want a formal way to override some of
the basic druntime functions (malloc/file/assert, etc), I suggest we either
add an API to register hooks (like the deprecated assert hook), or at least
mark the symbols that are safe to replace as weak, and document them.