On processors for D ~ decoupling
kris
foo at bar.com
Thu Apr 6 18:51:24 PDT 2006
Long post; sorry about that.
Walter Bright wrote:
> kris wrote:
>
>> Yes, that's correct. But typeinfo is a rather rudimetary part of the
>> language support. Wouldn't you agree? If I, for example, declare an
>> array of 10 bytes (static byte[10]) then I'm bound over to import
>> std.string ~ simply because TypeInfo_StaticArray wants to use
>> std.string.toString(int), rather than the C library version of itoa()
>> or a "low-level support" version instead.
>
>
> It has nothing to do with having a static byte[10] declaration. For the
> program:
>
> void main()
> {
> static byte[10] b;
> }
>
> The only things referenced by the object file are _main, __acrtused_con,
> and __Dmain. You can verify this by running obj2asm on the output, which
> gives:
>
> -------------------------------------
> _TEXT segment dword use32 public 'CODE' ;size is 0
> _TEXT ends
> _DATA segment para use32 public 'DATA' ;size is 0
> _DATA ends
> CONST segment para use32 public 'CONST' ;size is 0
> CONST ends
> _BSS segment para use32 public 'BSS' ;size is 10
> _BSS ends
> FLAT group
> includelib phobos.lib
> extrn _main
> extrn __acrtused_con
> extrn __Dmain
> __Dmain COMDAT flags=x0 attr=x0 align=x0
>
> _TEXT segment
> assume CS:_TEXT
> _TEXT ends
> _DATA segment
> _DATA ends
> CONST segment
> CONST ends
> _BSS segment
> _BSS ends
> __Dmain comdat
> assume CS:__Dmain
> xor EAX,EAX
> ret
> __Dmain ends
> end
> ----------------------------------
>
It would help if you'd note under what circumstances the TypeInfo /is/
included, then. For example, this program:
void main()
{
    throw new Exception ("");
}
causes all kinds of TypeInfo to be linked:
_D3std8typeinfo2Aa11TypeInfo_Aa5tsizeFZk 004074E8
_D3std8typeinfo2Aa11TypeInfo_Aa6equalsFPvPvZi 00407470
_D3std8typeinfo2Aa11TypeInfo_Aa7compareFPvPvZi 004074CC
_D3std8typeinfo2Aa11TypeInfo_Aa7getHashFPvZk 00407430
_D3std8typeinfo2Aa11TypeInfo_Aa8toStringFZAa 00407424
_D3std8typeinfo7ti_char10TypeInfo_a4swapFPvPvZv 0040466C
_D3std8typeinfo7ti_char10TypeInfo_a5tsizeFZk 00404664
_D3std8typeinfo7ti_char10TypeInfo_a6equalsFPvPvZi 00404630
_D3std8typeinfo7ti_char10TypeInfo_a7compareFPvPvZi 0040464C
_D3std8typeinfo7ti_char10TypeInfo_a7getHashFPvZk 00404624
_D3std8typeinfo7ti_char10TypeInfo_a8toStringFZAa 00404618
_D3std8typeinfo7ti_uint10TypeInfo_k4swapFPvPvZv 00407400
_D3std8typeinfo7ti_uint10TypeInfo_k5tsizeFZk 004073F8
_D3std8typeinfo7ti_uint10TypeInfo_k6equalsFPvPvZi 004073B0
_D3std8typeinfo7ti_uint10TypeInfo_k7compareFPvPvZi 004073CC
_D3std8typeinfo7ti_uint10TypeInfo_k7getHashFPvZk 004073A4
_D3std8typeinfo7ti_uint10TypeInfo_k8toStringFZAa 00407398
_D6object14TypeInfo_Array4swapFPvPvZv 004028A8
_D6object14TypeInfo_Array5tsizeFZk 004028A0
_D6object14TypeInfo_Array6equalsFPvPvZi 00402778
_D6object14TypeInfo_Array7compareFPvPvZi 00402808
_D6object14TypeInfo_Array7getHashFPvZk 0040271C
_D6object14TypeInfo_Array8toStringFZAa 004026F8
_D6object14TypeInfo_Class5tsizeFZk 00402C48
_D6object14TypeInfo_Class6equalsFPvPvZi 00402BB8
_D6object14TypeInfo_Class7compareFPvPvZi 00402C00
_D6object14TypeInfo_Class7getHashFPvZk 00402BA8
_D6object14TypeInfo_Class8toStringFZAa 00402B9C
_D6object15TypeInfo_Struct5tsizeFZk 00402D3C
_D6object15TypeInfo_Struct6equalsFPvPvZi 00402C94
_D6object15TypeInfo_Struct7compareFPvPvZi 00402CE8
_D6object15TypeInfo_Struct7getHashFPvZk 00402C58
_D6object15TypeInfo_Struct8toStringFZAa 00402C50
_D6object16TypeInfo_Pointer4swapFPvPvZv 004026E0
_D6object16TypeInfo_Pointer5tsizeFZk 004026D8
_D6object16TypeInfo_Pointer6equalsFPvPvZi 004026AC
_D6object16TypeInfo_Pointer7compareFPvPvZi 004026C8
_D6object16TypeInfo_Pointer7getHashFPvZk 004026A0
_D6object16TypeInfo_Pointer8toStringFZAa 0040267C
_D6object16TypeInfo_Typedef4swapFPvPvZv 00402664
_D6object16TypeInfo_Typedef5tsizeFZk 00402658
_D6object16TypeInfo_Typedef6equalsFPvPvZi 00402628
_D6object16TypeInfo_Typedef7compareFPvPvZi 00402640
_D6object16TypeInfo_Typedef7getHashFPvZk 00402618
_D6object16TypeInfo_Typedef8toStringFZAa 00402610
_D6object17TypeInfo_Delegate5tsizeFZk 00402B94
_D6object17TypeInfo_Delegate8toStringFZAa 00402B70
_D6object17TypeInfo_Function5tsizeFZk 00402B6C
_D6object17TypeInfo_Function8toStringFZAa 00402B48
_D6object20TypeInfo_StaticArray4swapFPvPvZv 00402A40
_D6object20TypeInfo_StaticArray5tsizeFZk 00402A2C
_D6object20TypeInfo_StaticArray6equalsFPvPvZi 00402960
_D6object20TypeInfo_StaticArray7compareFPvPvZi 004029BC
_D6object20TypeInfo_StaticArray7getHashFPvZk 00402924
_D6object20TypeInfo_StaticArray8toStringFZAa 004028E4
_D6object25TypeInfo_AssociativeArray5tsizeFZk 00402B40
_D6object25TypeInfo_AssociativeArray8toStringFZAa 00402AFC
Where did all that come from? I suspect you're looking at this concern
with a microscope only, while I think the bigger picture is perhaps more
important.
> Examining the .map file produced shows that only these functions are
> pulled in from std.string:
>
> 0002:00002364 _D3std6string7iswhiteFwZi 00404364
> 0002:000023A4 _D3std6string3cmpFAaAaZi 004043A4
> 0002:000023E8 _D3std6string4findFAawZi 004043E8
> 0002:00002450 _D3std6string8toStringFkZAa 00404450
> 0002:000024CC _D3std6string9inPatternFwAaZi 004044CC
> 0002:00002520 _D3std6string6columnFAaiZk 00404520
>
> I do not know offhand why a couple of those are pulled in, but I suggest
> that obj2asm and the generated .map files are invaluable at determining
> what pulls in what. Sometimes the results are surprising.
Yes they are surprising ~ partly because there's more than one might
imagine:
0003:00000D74 _D3std6string10whitespaceG6a 00411D74
0003:00000D7C _D3std6string2LSw 00411D7C
0003:00000D80 _D3std6string2PSw 00411D80
0002:00002464 _D3std6string3cmpFAaAaZi 00404464
0002:000024A8 _D3std6string4findFAawZi 004044A8
0002:000025E4 _D3std6string6columnFAaiZi 004045E4
0003:00000CF4 _D3std6string6digitsG10a 00411CF4
0002:00002424 _D3std6string7iswhiteFwZi 00404424
0003:00000D40 _D3std6string7lettersG52a 00411D40
0003:00000D84 _D3std6string7newlineG2a 00411D84
0002:00002514 _D3std6string8toStringFkZAa 00404514
0003:00000CE4 _D3std6string9hexdigitsG16a 00411CE4
0002:00002590 _D3std6string9inPatternFwAaZi 00404590
0003:00000D08 _D3std6string9lowercaseG26a 00411D08
0003:00000D00 _D3std6string9octdigitsG8a 00411D00
0003:00000D24 _D3std6string9uppercaseG26a 00411D24
Please see the extensive list at the end for some further surprises.
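For anyone who wants to repeat this kind of check on their own build, here is a minimal sketch of a filter that picks the symbols with a given mangled prefix out of a .map file. It assumes the whitespace-separated line layout shown in the listings in this post; the name symbolsWithPrefix is made up for illustration, and the code targets a current D compiler and Phobos rather than the 2006 toolchain.

```d
import std.algorithm.searching : startsWith;
import std.array : split;
import std.string : lineSplitter;

// Hypothetical helper: returns every whitespace-separated field in the
// map text that begins with the given prefix, e.g. "_D3std6string".
string[] symbolsWithPrefix(string mapText, string prefix)
{
    string[] found;
    foreach (line; mapText.lineSplitter)
        foreach (field; line.split)          // split on whitespace
            if (field.startsWith(prefix))
                found ~= field;
    return found;
}

unittest
{
    // Two lines copied from the listings in this post.
    auto sample = "0002:00002514 _D3std6string8toStringFkZAa 00404514\n"
                ~ "0003:00007154 ___pfloatfmt 00418154\n";
    assert(symbolsWithPrefix(sample, "_D3std6string")
           == ["_D3std6string8toStringFkZAa"]);
}
```

Feeding it the full map text with a prefix such as "_D3std6string" or "_D6object" gives lists of the same kind as the ones pasted here, without picking through the file by hand.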
>
>> That's tight-coupling within very low-level language support. Uncool.
>> Wouldn't you at least agree that specific instance is hardly an
>> absolute necessity?
>
>
> std.string.toString is 124 bytes long, and doesn't pull anything else in
> (except see below). Writing another version of it in typeinfo isn't
> going to reduce the size of the program *at all*, in fact, it will
> likely increase it because now there'll be two versions of it.
You're focusing purely on the fact that adding an itoa() would increase
the executable size, while completely ignoring both the explicit
mention of using the C runtime function instead (which is usually linked
in anyway) and the clear fact that importing std.string brings along with
it the following:
0003:00000D74 _D3std6string10whitespaceG6a 00411D74
0003:00000D7C _D3std6string2LSw 00411D7C
0003:00000D80 _D3std6string2PSw 00411D80
0002:00002464 _D3std6string3cmpFAaAaZi 00404464
0002:000024A8 _D3std6string4findFAawZi 004044A8
0002:000025E4 _D3std6string6columnFAaiZi 004045E4
0003:00000CF4 _D3std6string6digitsG10a 00411CF4
0002:00002424 _D3std6string7iswhiteFwZi 00404424
0003:00000D40 _D3std6string7lettersG52a 00411D40
0003:00000D84 _D3std6string7newlineG2a 00411D84
0002:00002514 _D3std6string8toStringFkZAa 00404514
0003:00000CE4 _D3std6string9hexdigitsG16a 00411CE4
0002:00002590 _D3std6string9inPatternFwAaZi 00404590
0003:00000D08 _D3std6string9lowercaseG26a 00411D08
0003:00000D00 _D3std6string9octdigitsG8a 00411D00
0003:00000D24 _D3std6string9uppercaseG26a 00411D24
Along with a number of dependencies.
And, apparently, you think it's perhaps responsible for bringing in the
floating point support too.
The point being made is that of coupling between low and high levels ~
illustrated quite well by the above.
I think this kind of thing is worth addressing, for a number of reasons.
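To make the "low-level support version" idea concrete, here is a minimal sketch of an integer-to-text helper that imports nothing at all. The name uintToDecimal and its signature are purely illustrative, not anything in the actual runtime, and the code assumes a current D compiler.

```d
// Hypothetical low-level helper: writes the decimal digits of `value`
// into the tail of a caller-supplied buffer and returns that slice.
// No imports, no allocation, no dependency on std.string.
char[] uintToDecimal(char[] buf, uint value) pure nothrow @nogc @safe
{
    size_t i = buf.length;
    do
    {
        assert(i > 0, "buffer too small");
        buf[--i] = cast(char)('0' + value % 10);   // least significant digit first
        value /= 10;
    } while (value != 0);
    return buf[i .. $];
}

unittest
{
    char[10] tmp;   // uint.max has 10 decimal digits
    assert(uintToDecimal(tmp[], 0) == "0");
    assert(uintToDecimal(tmp[], 10) == "10");
    assert(uintToDecimal(tmp[], uint.max) == "4294967295");
}
```

Something of this size could sit beside the typeinfo code, so that formatting a static array's length would not need to reach up into a high-level string module.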
>>> Although there is a lot of code in std.string, unreferenced free
>>> functions in it should be discarded by the linker. A check of the
>>> generated .map file should verify this - it is certainly supposed to
>>> work that way. One problem Java has is that there are no free
>>> functions, so referencing one function wound up pulling in every part
>>> of the class the function resided in.
>>
>> This is exactly the case with printf <g>. It winds up linking the world
>
>
> No, it does not link in the world, floating point, or graphics
> libraries. It links in C's standard I/O (which usually gets linked in
> anyway), and about 4000 bytes of code. That's somewhat less than a
> megabyte <g>.
Who says the standard C IO should /always/ get linked in? D currently
/enforces/ that, whereas it's not a requirement at all for valid
operation. What's more, the enforcement is simply because Object.d has a
print() method, which uses printf() like so:
print ()
{
    printf ("%.*s", toString());
}
Why not just use ConsoleWrite(), or anything but printf()? There are a
number of valid (and decoupled) alternatives to this approach. Why can't
they be used instead? Your answer is "well, it doesn't make any
difference anyway". That's entirely silly. Yes, the C-library
console-startup wrapper causes the IO system to be linked also. But that
can be replaced, since it's not directly part of the D runtime support.
To make things worse, Object.print() is perhaps the least used method in
all of D! Thus, it tends to place this whole issue on the verge of the
ridiculous.
Why not just remove the dependency instead?
One of the tenets of good library design is to build in layers, and then
ensure there are no dependencies between a lower layer and any of the
higher ones. Here are two cases of just such a dependency ~ they are
almost trivial to fix, yet nothing happens ... why?
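One decoupled alternative, sketched under my own assumptions: the low layer hands its text to a sink supplied by the caller, and never names printf or any other IO library. TextSink and Widget are hypothetical names for illustration only, this is not how Object.print() is actually declared, and the code assumes a current D compiler.

```d
// Hypothetical sink type: the low layer only knows "something that takes text".
alias TextSink = void delegate(const(char)[] text);

class Widget
{
    override string toString() { return "Widget"; }

    // Analogue of a print() method that depends on no IO library:
    // the higher layer decides where the text ends up.
    void print(scope TextSink sink)
    {
        sink(toString());
        sink("\n");
    }
}

unittest
{
    // Here the "console" is just an in-memory buffer.
    char[] captured;
    auto w = new Widget;
    w.print((const(char)[] text) { captured ~= text; });
    assert(captured == "Widget\n");
}
```

A program that wants the C console can pass a sink that calls into it; a program that wants to avoid the default IO platform passes something else, and the lower layer links the same either way.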
Thus, I really don't wish to argue with you on this one, Walter. If you
simply refuse to accept that any system might prefer to avoid the
default IO platform, for whatever valid reason it may have, then there's
little point in even discussing the nature of tight-coupling.
One can hack the internal dependencies in an attempt to rectify the
concerns; yet why? Better to leave all of /internal and friends as it
stands to avoid branching the code. I really thought you'd understand
the value in making that part platform (library) agnostic. And for such
a minor cost, too.
>> because it's a general purpose utility function that does all kinds of
>> conversion and all kinds of IO. Printf() is an all or nothing design ~
>> you can't selectively link pieces of it.
>>
>> That's usually not a problem. However, you've chosen to bind it to
>> low-level language support (in the root Object). That choice causes
>> tight coupling between the language low-level support and a high-level
>> library function ~ one which ought to be optional.
>>
>> Wouldn't you at least agree this specific case is not necessary for
>> the D language to function correctly? That there are other perfectly
>> workable alternatives?
>
>
> It's just not a big deal. Try the following:
>
> extern (C) int printf(char* f, ...) { return 0; }
>
> void main()
> {
> static byte[10] b;
> }
>
> and compare the difference in exe file sizes, with and without the
> printf stub.
Funny :-D
It makes little difference because all the other dependency code is
linked in from other places, Walter. It can be fixed one step at a time.
What you're saying here is the following. Take a shotgun, and pepper the
boat you're standing in with holes. Now, see? When you plug up this one
hole, it really doesn't stop the water coming in? See? Hardly any
difference!
Needless to say, I think you're being somewhat disingenuous. Or, at
least, trying to obfuscate a simple case of unnecessary low-high coupling
in D. But let's move on ...
>>> printf doesn't pull in the floating point library (I went to a lot of
>>> effort to make that so!). It does pull in the C IO library, which is
>>> very hard to not pull in (there always seems to be something
>>> referencing it). It shouldn't pull in the C wide character stuff. D's
>>> IO (writefln) will pull in C's IO anyway, so the only thing extra is
>>> the integer version of the specific printf code (about 4K).
>>
>> How can it convert %f, %g and so on if it doesn't use FP support at all?
>
>
> It's magic! Naw, it's just that if you actually use floating point in a
> program, the compiler emits a special extern reference (to __fltused)
> which pulls in the floating point IO formatting code. Otherwise, it
> defaults to just a stub. Try it.
void main()
{
    throw new Exception ("");
}
I'm quite familiar with __fltused. It's clearly used by the little
example program above, given that this stuff is linked in:
0003:00007150 ___wpscanfloat 00418150
0003:00007154 ___wpfloatfmt 00418154
0003:00007158 ___pscanfloat 00418158
0003:0000715C ___pfloatfmt 0041815C
0003:0000453C __8087 0041553C
0003:0000453C __80x87 0041553C
0002:0000E560 __8087_init 00410560
0002:0000E9B0 __FCOMPP@ 004109B0
0002:0000E9CE __FTEST0@ 004109CE
0002:0000E9EE __FTEST@ 004109EE
0002:0000EA06 __DTST87@ 00410A06
0002:0000EA0A __87TOPSW@ 00410A0A
0002:0000EA0F __DBLTO87@ 00410A0F
0002:0000EA1A __DBLINT87@ 00410A1A
0002:0000EA3B __DBLLNG87@ 00410A3B
0002:0000EA57 __FLTTO87@ 00410A57
0002:0000EA5E __status87 00410A5E
0002:0000EA63 __clear87 00410A63
0002:0000EA6C __control87 00410A6C
0002:0000EA93 __fpreset 00410A93
That looks rather like floating point support. Where in the program is
floating point actually used? I don't get it.
>> Either way, it's not currently possible to build a D program without a
>> swathe of FP support code,
>> printf,
>> the entire C IO package,
>> wide-char support,
>> and a whole lot more besides. I'd assumed the linked FP support was
>> for printf, but perhaps it's for std.string instead? I've posted the
>> linker maps (in the past) to illustrate exactly this.
>
>
> My point is that assuming what is pulled in by what is about as reliable
> as guessing where the bottlenecks in one's code is. You can't tell
> bottlenecks without a profiler, and you've got both hands tied behind
> your back trying to figure out who pulls in what if you're not using
> .map files, grep, and obj2asm.
>
>> Are you not at all interested in improving this aspect of the language
>> usage?
>
>
> Sure, but based on accurate information.
*Cough*
> Pulling printf won't do
> anything. Try it if you don't agree.
That's your claim, not mine :)
See the analogy above.
>
> For example, which modules pull in the floating point formatting code?
> It isn't printf. We can find out by doing a grep for __fltused:
>
> boxer.obj: __fltused
> complex.obj: __fltused
> conv.obj: __fltused
> date.obj: __fltused
> demangle.obj: __fltused
> format.obj: __fltused
> gamma.obj: __fltused
> math.obj: __fltused
> math2.obj: __fltused
> outbuffer.obj: __fltused
> stream.obj: __fltused
> string.obj: __fltused
> ti_Acdouble.obj: __fltused
> ti_Acfloat.obj: __fltused
> ti_Acreal.obj: __fltused
> ti_Adouble.obj: __fltused
> ti_Afloat.obj: __fltused
> ti_Areal.obj: __fltused
> ti_cdouble.obj: __fltused
> ti_cfloat.obj: __fltused
> ti_creal.obj: __fltused
> ti_double.obj: __fltused
> ti_float.obj: __fltused
> ti_real.obj: __fltused
>
> Some examination of the .map file shows that the only one of these
> pulled in by default is std.string. So I think a reasonable approach
> would be to look at removing the floating point from std.string
So importing std.string is causing FP support to be imported? No
surprises there; something is certainly bringing it in. Along with the
"world", as one can see from the attached .map of the example program:
void main()
{
    throw new Exception ("");
}
Keep in mind it's not the number of entries, but the number of
superfluous entries, that is of concern (I removed all Win32 imports in
an attempt to make the list more manageable).
Also, please keep in mind that the concern is one of unnecessary coupling
from the low-level runtime support into the high-level library
functions. This will often result in a cascade of dependencies, much
like what we see below. Not only does it cause code bloat, but it makes
the language support dependent upon a specific high-level library. These
dependencies are /very/ easy to remedy, with an appropriate reduction in
code size as a bonus.
The map file is here, since it's too big to attach:
http://www.dsource.org/projects/mango/browser/trunk/doc/map.txt?rev=818&format=raw