On processors for D ~ decoupling
Walter Bright
newshound at digitalmars.com
Thu Apr 6 21:05:02 PDT 2006
kris wrote:
> It would help if you'd note under what circumstances the TypeInfo /is/
> included, then. For example, this program:
>
> void main()
> {
> throw new Exception ("");
> }
>
>
> causes all kinds of TypeInfo to be linked:
In general, an easy way to see why a particular module is being pulled
in is to temporarily remove it from the library (lib phobos -foo;),
link, and see where the undefined reference is coming from. I'd start by
running obj2asm on the module you just compiled, and see what extern
directives it puts out.
I'm not trying to be a jerk by telling you this procedure rather than
just giving the answer, but 1) I don't know the answer offhand and I'd
have to follow the same procedure to figure it out, 2) I hope that by
giving you the tools and methodology for figuring it out, this kind of
question won't repeatedly come up (and yes, it has come up repeatedly),
and 3) I hope that anyone else with these kinds of questions will get
familiar with how to use these tools, too. It's a lot better than
guessing and assuming.
Tools like lib, obj2asm, and grep are incredibly useful.
> Where did all that come from? I suspect you're looking at this concern
> with a microscope only, while I think the bigger picture is perhaps more
> important.
I don't think there is a bigger picture. There's only a case-by-case
analysis of what is needed and what isn't.
> Yes they are surprising ~ partly because there's more than one might
> imagine:
>
> 0003:00000D74 _D3std6string10whitespaceG6a 00411D74
> 0003:00000D7C _D3std6string2LSw 00411D7C
> 0003:00000D80 _D3std6string2PSw 00411D80
> 0002:00002464 _D3std6string3cmpFAaAaZi 00404464
> 0002:000024A8 _D3std6string4findFAawZi 004044A8
> 0002:000025E4 _D3std6string6columnFAaiZi 004045E4
> 0003:00000CF4 _D3std6string6digitsG10a 00411CF4
> 0002:00002424 _D3std6string7iswhiteFwZi 00404424
> 0003:00000D40 _D3std6string7lettersG52a 00411D40
> 0003:00000D84 _D3std6string7newlineG2a 00411D84
> 0002:00002514 _D3std6string8toStringFkZAa 00404514
> 0003:00000CE4 _D3std6string9hexdigitsG16a 00411CE4
> 0002:00002590 _D3std6string9inPatternFwAaZi 00404590
> 0003:00000D08 _D3std6string9lowercaseG26a 00411D08
> 0003:00000D00 _D3std6string9octdigitsG8a 00411D00
> 0003:00000D24 _D3std6string9uppercaseG26a 00411D24
>
> Please see the extensive list at the end for some further surprises
All those other names are the static data. Things like:
const dchar LS = '\u2028'; /// UTF line separator
const dchar PS = '\u2029'; /// UTF paragraph separator
I submit that they aren't significant. The significant thing is that the
entire std.string.obj is not linked in.
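The names in the quoted map listing follow D's name mangling: `_D`, then
each part of the qualified name preceded by its length, then a type code
(`w` is dchar, `G6a` is char[6], `F...Z` encloses a function's parameter
types). As a minimal sketch, assuming a current D compiler and Phobos, a
hypothetical helper `qualifiedName` (not part of any library) recovers the
dotted name from such a symbol, which makes it easier to see which module
each map entry belongs to:

```d
import std.ascii : isDigit;

/// Hypothetical helper: returns the dotted qualified name encoded at the
/// start of a D mangled symbol, or null if the symbol is not of that form.
/// The trailing type code (e.g. "w" or "FAaAaZi") is ignored.
string qualifiedName(string mangled)
{
    if (mangled.length < 2 || mangled[0 .. 2] != "_D")
        return null;

    string result;
    size_t i = 2;
    // Each name part is a decimal length followed by that many characters.
    while (i < mangled.length && isDigit(mangled[i]))
    {
        size_t len = 0;
        while (i < mangled.length && isDigit(mangled[i]))
        {
            len = len * 10 + (mangled[i] - '0');
            ++i;
        }
        if (len == 0 || i + len > mangled.length)
            return null; // malformed length prefix
        if (result.length)
            result ~= ".";
        result ~= mangled[i .. i + len];
        i += len;
    }
    return result;
}
```

Applied to `_D3std6string2LSw`, this gives `std.string.LS`, the line
separator constant shown above. It only splits the name; current D
runtimes ship `core.demangle` for decoding the type part as well.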
> You're focusing purely on the fact that adding an itoa() would increase
> the executable size.
Yes.
> At the same time, completely ignoring the explicit
> mention of using the C runtime function instead (which is usually linked
> also), and the clear fact that importing std.string brings along with it
> the following:
And the only possible problem I see there is worrying about executable size.
>
> 0003:00000D74 _D3std6string10whitespaceG6a 00411D74
> 0003:00000D7C _D3std6string2LSw 00411D7C
> 0003:00000D80 _D3std6string2PSw 00411D80
> 0002:00002464 _D3std6string3cmpFAaAaZi 00404464
> 0002:000024A8 _D3std6string4findFAawZi 004044A8
> 0002:000025E4 _D3std6string6columnFAaiZi 004045E4
> 0003:00000CF4 _D3std6string6digitsG10a 00411CF4
> 0002:00002424 _D3std6string7iswhiteFwZi 00404424
> 0003:00000D40 _D3std6string7lettersG52a 00411D40
> 0003:00000D84 _D3std6string7newlineG2a 00411D84
> 0002:00002514 _D3std6string8toStringFkZAa 00404514
> 0003:00000CE4 _D3std6string9hexdigitsG16a 00411CE4
> 0002:00002590 _D3std6string9inPatternFwAaZi 00404590
> 0003:00000D08 _D3std6string9lowercaseG26a 00411D08
> 0003:00000D00 _D3std6string9octdigitsG8a 00411D00
> 0003:00000D24 _D3std6string9uppercaseG26a 00411D24
>
>
> Along with a number of dependencies.
Take a look at those functions and data - what dependencies?
> And, apparently, you think it's perhaps responsible for bringing in the
> floating point support too.
That is a problem, and I can fix that. No big deal - it wasn't printf
bringing in the floating point - and a reengineering or rewrite of
Phobos is not necessary. I don't even need to change any library source
code.
> The point being made is that of coupling between low and high levels ~
> illustrated quite well by the above.
> I think this kind of thing is worth addressing, for a number of reasons.
I think you're seeing an effect that is an issue, but are mistaken as to
the cause of the problem.
> Who says the standard C IO should /always/ get linked in? D currently
> /enforces/ that, whereas it's not a requirement at all for valid
> operation.
There isn't that much to it, and it doesn't hurt anything.
> What's more, the enforcement is simply because Object.d has a
> print() method, which uses printf() like so:
>
> print ()
> {
> printf ("%.*s", toString());
> }
Again, it isn't necessarily printf doing that. Try the code I posted in
the last message that stubs out printf, which will *prevent* it from
being linked in from the library. Compile/link it, and examine the .map
file.
(The stubbing out method is another technique for figuring out what
pulls in what.)
> Why not just use ConsoleWrite(), or anything but printf()?
Because it's not portable (what should the Linux one look like?), and
does not deliver the billed benefits. But the worst thing about calling
ConsoleWrite() directly is that it does not play well with any other IO
the user may have done or be in the process of doing. What will happen
is that any object.print()'s will not be synchronized with the output
from writef, printf, or any other of the stdout functions.
> There's a
> number of valid (and decoupled) alternatives to this approach. Why can't
> they be used instead? Your answer is "well, it doesn't make any
> difference anyway". That's entirely silly. Yes, the C-library
> console-startup wrapper causes the IO system to be linked also. But that
> can be replaced, since it's not directly part of the D runtime support.
Why does the C library need replacing? I honestly don't get it.
> To make things worse, Object.print() is perhaps the least used method in
> all of D! Thus, it tends to place this whole issue on the verge of
> ridiculous.
> Why not just remove the dependency instead?
Because it doesn't buy anything to remove it. Try it and see (or even
easier, try the source I posted with the stubbed out printf - that will
absolutely, positively prevent printf from being linked in from the
library, without needing to change or recompile object.d at all).
> One of the tenets of good library design is to build in layers, and then
> ensure there's no dependencies between a lower layer and any of the
> higher ones. Here's two cases of just such a dependency ~ they are
> almost trivial to fix, yet nothing happens ... why?
>
> Thus, I really don't wish to argue with you on this one, Walter. If you
> simply refuse to accept that any system might prefer to avoid the
> default IO platform, for whatever valid reason it may have, then there's
> little point in even discussing the nature of tight-coupling.
If you want to use a system that for some reason can't have C's IO
subsystem, then just include the one liner:
extern (C) int printf(char* f, ...) { return 0; }
somewhere in your code, and it's gone.
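A minimal sketch of the stub in context, assuming a current D compiler
(hence `const char*`, since string literals are now immutable) and a
module that does not import core.stdc.stdio. The function name `demo` is
hypothetical. Because the program defines printf itself, the linker has no
unresolved printf reference left to satisfy from the library; what else
still gets pulled in is what the .map file would show.

```d
// Stub with C linkage: it takes the place of the C library's printf
// when the program is linked.
extern (C) int printf(const char* f, ...) { return 0; }

// Hypothetical caller: this call binds to the stub above, so nothing
// is written and the stub's return value comes back.
int demo()
{
    return printf("%d\n", 42);
}
```

Calling `demo()` from `main` and then examining the .map file is one way
to check that no library printf was linked.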
> One can hack the internal dependencies in an attempt to rectify the
> concerns; yet why? Better to leave all of /internal and friends as it
> stands to avoid branching the code. I really thought you'd understand
> the value in making that part platform (library) agnostic. And for such
> a minor cost, too.
You don't need to hack the internals to get rid of any vestige of
printf. Just stub it out.
> Or, at
> least trying to obfuscate a simple case of unnecessary low-high coupling
> in D. But let's move on ...
I'm trying to point out that things aren't so simple.
> I'm quite familiar with __fltused.
Your questions about how printf avoided linking in %f support indicated
otherwise.
> It's clearly used by the little
> example program above, given that this stuff is linked in:
...
> That looks rather like floating point support; Where in the program is
> floating point actually used? I don't get it.
I went over that in my last post, too.
>> Pulling printf won't do anything. Try it if you don't agree.
> That's your claim, not mine :)
You don't have to believe me; that's why I encourage you to try it, and
why I give you the tools and methodology to figure these things out.
> Keep in mind it's not the number of entries, but the number of
> superfluous entries that are of concern (I removed all Win32 imports in
> an attempt to make the list more manageable).
Until you've tracked down each and every one and understand where it is
pulled in from and why it is there, there is no way to decide which ones
are superfluous or not.
There's an awful lot of startup and shutdown going on - stuff that is
required for D (or the C runtime library, for that matter) to function.
An awful lot is required for the exception handling support to work, and
that has to be in all programs. The same goes for the gc starting up and
shutting down gracefully. It goes on.
> Also, please keep in mind that the concern is one of unnecessary coupling
> from the low-level runtime support, into the high-level library
> functions. This will often result in a cascade of dependencies, much
> like what we see below. Not only does it cause code-bloat, but it makes
> the language-support dependent upon a specific high-level library. These
> dependencies are /very/ easy to remedy, with an appropriate reduction in
> code size as a bonus.
As we've discovered, pulling printf out of object.d isn't going to
remedy anything. It just is not that simple.