On processors for D ~ decoupling

Walter Bright newshound at digitalmars.com
Thu Apr 6 21:05:02 PDT 2006


kris wrote:
> It would help if you'd note under what circumstances the TypeInfo /is/ 
> included, then. For example, this program:
> 
> void main()
> {
>         throw new Exception ("");
> }
> 
> 
> causes all kinds of TypeInfo to be linked:

In general, an easy way to see why a particular module is being pulled 
in is to temporarily remove it from the library (lib phobos -foo;), 
link, and see where the undefined reference is coming from. I'd start by 
running obj2asm on the module you just compiled, and see what extern 
directives it puts out.

I'm not trying to be a jerk by telling you this procedure rather than 
just giving the answer, but 1) I don't know the answer offhand and I'd 
have to follow the same procedure to figure it out, 2) I hope that by 
giving you the tools and methodology for figuring it out, this kind of 
question won't repeatedly come up (and yes, it has come up repeatedly), 
and 3) I hope that anyone else with these kinds of questions will get 
familiar with how to use these tools, too. It's a lot better than 
guessing and assuming.

Tools like lib, obj2asm, and grep are incredibly useful.


> Where did all that come from? I suspect you're looking at this concern 
> with a microscope only, while I think the bigger picture is perhaps more 
> important.

I don't think there is a bigger picture. There's only a case by case 
analysis of what is needed and what isn't.

> Yes they are surprising ~ partly because there's more than one might 
> imagine:
> 
>  0003:00000D74       _D3std6string10whitespaceG6a 00411D74
>  0003:00000D7C       _D3std6string2LSw          00411D7C
>  0003:00000D80       _D3std6string2PSw          00411D80
>  0002:00002464       _D3std6string3cmpFAaAaZi   00404464
>  0002:000024A8       _D3std6string4findFAawZi   004044A8
>  0002:000025E4       _D3std6string6columnFAaiZi 004045E4
>  0003:00000CF4       _D3std6string6digitsG10a   00411CF4
>  0002:00002424       _D3std6string7iswhiteFwZi  00404424
>  0003:00000D40       _D3std6string7lettersG52a  00411D40
>  0003:00000D84       _D3std6string7newlineG2a   00411D84
>  0002:00002514       _D3std6string8toStringFkZAa 00404514
>  0003:00000CE4       _D3std6string9hexdigitsG16a 00411CE4
>  0002:00002590       _D3std6string9inPatternFwAaZi 00404590
>  0003:00000D08       _D3std6string9lowercaseG26a 00411D08
>  0003:00000D00       _D3std6string9octdigitsG8a 00411D00
>  0003:00000D24       _D3std6string9uppercaseG26a 00411D24
> 
> Please see the extensive list at the end for some further surprises


All those other names are the static data. Things like:

const dchar LS = '\u2028';      /// UTF line separator
const dchar PS = '\u2029';      /// UTF paragraph separator

I submit that they aren't significant. The significant thing is that the 
entire std.string.obj is not linked in.
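
For anyone reading the map listing above, a rough way to tell the data from the functions is to split each mangled name into its dotted name and the type code that follows it. The sketch below is an illustration only: it is written against a current D compiler rather than the 2006 one, splitMangled is a made-up helper name, and it assumes a symbol is just _D, then length-prefixed identifiers, then a type code. A type code starting with F is taken to be a function with D linkage; anything else (w for dchar, G6a for char[6], and so on) is treated as data.

```d
import std.ascii : isDigit;
import std.stdio : writeln;

/// Hypothetical helper: splits a symbol such as "_D3std6string2LSw" into
/// its dotted name ("std.string.LS") and trailing type code ("w").
/// Returns false if the symbol does not have the assumed shape.
bool splitMangled(string sym, out string name, out string type)
{
    if (sym.length < 3 || sym[0 .. 2] != "_D")
        return false;
    size_t i = 2;
    // Each identifier is stored as a decimal length followed by its characters.
    while (i < sym.length && isDigit(sym[i]))
    {
        size_t len = 0;
        while (i < sym.length && isDigit(sym[i]))
        {
            len = len * 10 + (sym[i] - '0');
            ++i;
        }
        if (len == 0 || i + len > sym.length)
            return false;
        if (name.length)
            name ~= ".";
        name ~= sym[i .. i + len];
        i += len;
    }
    type = sym[i .. $];   // whatever follows the identifiers is the type code
    return name.length != 0;
}

void main()
{
    foreach (sym; ["_D3std6string2LSw", "_D3std6string3cmpFAaAaZi"])
    {
        string name, type;
        if (splitMangled(sym, name, type))
            writeln(name, "  ", type.length && type[0] == 'F' ? "function" : "data");
    }
}
```

Run over the two symbols shown, this reports std.string.LS as data and std.string.cmp as a function.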

> You're focusing purely on the fact that adding an itoa() would increase 
> the executable size.

Yes.

> At the same time, completely ignoring the explicit
> mention of using the C runtime function instead (which is usually linked 
> also), and the clear fact that importing std.string brings along with it 
> the following:

And the only possible problem I see there is worrying about executable size.

> 
>  0003:00000D74       _D3std6string10whitespaceG6a 00411D74
>  0003:00000D7C       _D3std6string2LSw          00411D7C
>  0003:00000D80       _D3std6string2PSw          00411D80
>  0002:00002464       _D3std6string3cmpFAaAaZi   00404464
>  0002:000024A8       _D3std6string4findFAawZi   004044A8
>  0002:000025E4       _D3std6string6columnFAaiZi 004045E4
>  0003:00000CF4       _D3std6string6digitsG10a   00411CF4
>  0002:00002424       _D3std6string7iswhiteFwZi  00404424
>  0003:00000D40       _D3std6string7lettersG52a  00411D40
>  0003:00000D84       _D3std6string7newlineG2a   00411D84
>  0002:00002514       _D3std6string8toStringFkZAa 00404514
>  0003:00000CE4       _D3std6string9hexdigitsG16a 00411CE4
>  0002:00002590       _D3std6string9inPatternFwAaZi 00404590
>  0003:00000D08       _D3std6string9lowercaseG26a 00411D08
>  0003:00000D00       _D3std6string9octdigitsG8a 00411D00
>  0003:00000D24       _D3std6string9uppercaseG26a 00411D24
> 
> 
> Along with a number of dependencies.

Take a look at those functions and data - what dependencies?

> And, apparently, you think it's perhaps responsible for bringing in the 
> floating point support too.

That is a problem, and I can fix that. No big deal - it wasn't printf 
bringing in the floating point - and a reengineering or rewrite of 
Phobos is not necessary. I don't even need to change any library source 
code.

> The point being made is that of coupling between low and high levels ~ 
> illustrated quite well by the above.
> I think this kind of thing is worth addressing, for a number of reasons.

I think you're seeing an effect that is an issue, but are mistaken as to 
the cause of the problem.


> Who says the standard C IO should /always/ get linked in? D currently 
> /enforces/ that, whereas it's not a requirement at all for valid 
> operation.

There isn't that much to it, and it doesn't hurt anything.

> What's more, the enforcement is simply because Object.d has a 
> print() method, which uses printf() like so:
> 
> print ()
> {
>     printf ("%.*s", toString());
> }

Again, it isn't necessarily printf doing that. Try the code I posted in 
the last message that stubs out printf, which will *prevent* it from 
being linked in from the library. Compile/link it, and examine the .map 
file.

(The stubbing out method is another technique for figuring out what 
pulls in what.)

> Why not just use ConsoleWrite(), or anything but printf()?

Because it's not portable (what should the Linux one look like?), and 
does not deliver the billed benefits. But the worst thing about calling 
ConsoleWrite() directly is that it does not play well with any other IO 
the user may have done or be in the process of doing. What will happen 
is that any object.print()'s will not be synchronized with the output 
from writef, printf, or any other of the stdout functions.

> There's a 
> number of valid (and decoupled) alternatives to this approach. Why can't 
> they be used instead? Your answer is "well, it doesn't make any 
> difference anyway". That's entirely silly. Yes, the C-library 
> console-startup wrapper causes the IO system to be linked also. But that 
> can be replaced, since it's not directly part of the D runtime support.

Why does the C library need replacing? I honestly don't get it.


> To make things worse, Object.print() is perhaps the least used method in 
> all of D! Thus, it tends to place this whole issue on the verge of 
> ridiculous.
> Why not just remove the dependency instead?

Because it doesn't buy anything to remove it. Try it and see (or even 
easier, try the source I posted with the stubbed out printf - that will 
absolutely, positively prevent printf from being linked in from the 
library, without needing to change or recompile object.d at all).


> One of the tenets of good library design is to build in layers, and then 
> ensure there's no dependencies between a lower layer and any of the 
> higher ones. Here's two cases of just such a dependency ~ they are 
> almost trivial to fix, yet nothing happens ... why?
> 
> Thus, I really don't wish to argue with you on this one, Walter. If you 
> simply refuse to accept that any system might prefer to avoid the 
> default IO platform, for whatever valid reason it may have, then there's 
> little point in even discussing the nature of tight-coupling.

If you want to use a system that for some reason can't have C's IO 
subsystem, then just include the one liner:

extern (C) int printf(char* f, ...) { return 0; }

somewhere in your code, and it's gone.
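
A minimal sketch of that stub inside a complete program, assuming a current D compiler (hence const char* where the one liner above has char*). The file contains nothing else that touches the C IO functions, so every reference to printf resolves to the local definition instead of the library's.

```d
// Stub: satisfies every reference to printf, so the linker has no reason
// to pull the real one in from the C runtime library.
extern (C) int printf(const char* f, ...) { return 0; }

void main()
{
    // This call resolves to the stub above and produces no output.
    printf("this text is swallowed by the stub\n");
}
```

Link it and look through the .map file to see what is still being pulled in, and by what.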


> One can hack the internal dependencies in an attempt to rectify the 
> concerns; yet why? Better to leave all of /internal and friends as it 
> stands to avoid branching the code. I really thought you'd understand 
> the value in making that part platform (library) agnostic. And for such 
> a minor cost, too.

You don't need to hack the internals to get rid of any vestige of 
printf. Just stub it out.


> Or, at 
> least trying to obfuscate a simple case of unnecessary low-high coupling 
> in D. But let's move on ...

I'm trying to point out that things aren't so simple.


> I'm quite familiar with __fltused.

Your questions about how printf avoided linking in %f support indicated 
otherwise.

> It's clearly used by the little 
> example program above, given that this stuff is linked in:
...
> That looks rather like floating point support; Where in the program is 
> floating point actually used? I don't get it.

I went over that in my last post, too.


>> Pulling printf won't do anything. Try it if you don't agree.
> That's your claim, not mine :)

You don't have to believe me, that's why I encourage you to try it and 
give you the tools and methodology to figure these things out.


> Keep in mind it's not the number of entries, but the number of 
> superfluous entries that are of concern (I removed all Win32 imports in 
> an attempt to make the list more manageable).

Until you've tracked down each and every one and understand where it is 
pulled in from and why it is there, there is no way to decide which ones 
are superfluous or not.

There's an awful lot of startup and shutdown going on - stuff that is 
required for D (or the C runtime library, for that matter) to function. 
An awful lot is required for the exception handling support to work - 
that has to be in all programs. More is required for the gc to start up 
and shut down gracefully. It goes on.


> Also, please keep in mind that the concern is one of unnecessary coupling 
> from the low-level runtime support, into the high-level library 
> functions. This will often result in a cascade of dependencies, much 
> like what we see below. Not only does it cause code-bloat, but it makes 
> the language-support dependent upon a specific high-level library. These 
> dependencies are /very/ easy to remedy, with an appropriate reduction in 
> code size as a bonus.

As we've discovered, pulling printf out of object.d isn't going to 
remedy anything. It just is not that simple.


