On processors for D ~ decoupling

Walter Bright newshound at digitalmars.com
Thu Apr 6 16:26:29 PDT 2006


kris wrote:
> Yes, that's correct. But typeinfo is a rather rudimentary part of the 
> language support. Wouldn't you agree? If I, for example, declare an 
> array of 10 bytes (static byte[10]) then I'm bound over to import 
> std.string ~ simply because TypeInfo_StaticArray wants to use 
> std.string.toString(int), rather than the C library version of itoa() or 
> a "low-level support" version instead.

It has nothing to do with having a static byte[10] declaration. For the 
program:

void main()
{
     static byte[10] b;
}

The only things referenced by the object file are _main, __acrtused_con, 
and __Dmain. You can verify this by running obj2asm on the output, which 
gives:

-------------------------------------
_TEXT   segment dword use32 public 'CODE'       ;size is 0
_TEXT   ends
_DATA   segment para use32 public 'DATA'        ;size is 0
_DATA   ends
CONST   segment para use32 public 'CONST'       ;size is 0
CONST   ends
_BSS    segment para use32 public 'BSS' ;size is 10
_BSS    ends
FLAT    group
includelib phobos.lib
         extrn   _main
         extrn   __acrtused_con
         extrn   __Dmain
__Dmain COMDAT flags=x0 attr=x0 align=x0

_TEXT   segment
         assume  CS:_TEXT
_TEXT   ends
_DATA   segment
_DATA   ends
CONST   segment
CONST   ends
_BSS    segment
_BSS    ends
__Dmain comdat
         assume  CS:__Dmain
                 xor     EAX,EAX
                 ret
__Dmain ends
         end
----------------------------------

Examining the .map file produced shows that only these functions are 
pulled in from std.string:

0002:00002364       _D3std6string7iswhiteFwZi  00404364
0002:000023A4       _D3std6string3cmpFAaAaZi   004043A4
0002:000023E8       _D3std6string4findFAawZi   004043E8
0002:00002450       _D3std6string8toStringFkZAa 00404450
0002:000024CC       _D3std6string9inPatternFwAaZi 004044CC
0002:00002520       _D3std6string6columnFAaiZk 00404520

I do not know offhand why a couple of those are pulled in, but I suggest 
that obj2asm and the generated .map files are invaluable for determining 
what pulls in what. Sometimes the results are surprising.
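
The .map check described above can be sketched as a small D program. This is 
only an illustration, assuming a current D compiler and Phobos; the program 
name mapscan and the helper fromStdString are hypothetical, and the only thing 
taken from the listing above is the mangled-name prefix _D3std6string.

```d
import std.algorithm.searching : canFind;
import std.stdio : File, writeln;

/// Hypothetical helper: true if a .map line mentions a symbol whose
/// mangled name contains the std.string prefix "_D3std6string".
bool fromStdString(const(char)[] line)
{
    return line.canFind("_D3std6string");
}

void main(string[] args)
{
    if (args.length < 2)
    {
        writeln("usage: mapscan <file.map>");
        return;
    }

    // Print every line of the map file that refers to a std.string symbol.
    foreach (line; File(args[1]).byLine())
    {
        if (fromStdString(line))
            writeln(line);
    }
}
```

Changing the prefix to another module's mangled name would list what is 
pulled in from that module instead.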

> That's tight-coupling within very low-level language support. Uncool.
> Wouldn't you at least agree that specific instance is hardly an absolute 
> necessity?

std.string.toString is 124 bytes long, and doesn't pull anything else in 
(except see below). Writing another version of it in typeinfo isn't 
going to reduce the size of the program *at all*. In fact, it will 
likely increase it, because now there'll be two versions of it.

>> Although there is a lot of code in std.string, unreferenced free 
>> functions in it should be discarded by the linker. A check of the 
>> generated .map file should verify this - it is certainly supposed to 
>> work that way. One problem Java has is that there are no free 
>> functions, so referencing one function wound up pulling in every part 
>> of the class the function resided in.
> This is exactly the case with printf <g>. It winds up linking the world

No, it does not link in the world, floating point, or graphics 
libraries. It links in C's standard I/O (which usually gets linked in 
anyway), and about 4000 bytes of code. That's somewhat less than a 
megabyte <g>.


> because it's a general purpose utility function that does all kinds of 
> conversion and all kinds of IO. Printf() is an all or nothing design ~ 
> you can't selectively link pieces of it.
> 
> That's usually not a problem. However, you've chosen to bind it to 
> low-level language support (in the root Object). That choice causes 
> tight coupling between the language low-level support and a high-level 
> library function ~ one which ought to be optional.
> 
> Wouldn't you at least agree this specific case is not necessary for the 
> D language to function correctly? That there are other perfectly 
> workable alternatives?

It's just not a big deal. Try the following:

extern (C) int printf(char* f, ...) { return 0; }

void main()
{
     static byte[10] b;
}

and compare the difference in exe file sizes, with and without the 
printf stub.


>> printf doesn't pull in the floating point library (I went to a lot of 
>> effort to make that so!). It does pull in the C IO library, which is 
>> very hard to not pull in (there always seems to be something 
>> referencing it). It shouldn't pull in the C wide character stuff. D's 
>> IO (writefln) will pull in C's IO anyway, so the only thing extra is 
>> the integer version of the specific printf code (about 4K).
> How can it convert %f, %g and so on if it doesn't use FP support at all? 

It's magic! Naw, it's just that if you actually use floating point in a 
program, the compiler emits a special extern reference (to __fltused) 
which pulls in the floating point IO formatting code. Otherwise, it 
defaults to just a stub. Try it.
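
A minimal program for trying this, as a sketch only: it assumes a current D 
compiler where printf is declared in core.stdc.stdio (in 2006 it would be 
declared by hand, as in the stub earlier in this message). The helper half is 
hypothetical and exists just to put floating point code into the program. 
Whether a given compiler and C runtime still use __fltused this way is an 
assumption to verify with obj2asm or the .map file.

```d
import core.stdc.stdio : printf;

/// Hypothetical helper, present only so the program uses floating point.
double half(double x)
{
    return x / 2.0;
}

void main()
{
    // %g needs the real floating point formatting code, not the stub.
    printf("%g\n", half(3.0));
}
```

Comparing the .map file of this program with that of the empty main above 
should show which formatting code the floating point use adds.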

> Either way, it's not currently possible to build a D program without a 
> swathe of FP support code, printf, the entire C IO package, wide-char 
> support, and a whole lot more besides. I'd assumed the linked FP support 
> was for printf, but perhaps it's for std.string instead? I've posted the 
> linker maps (in the past) to illustrate exactly this.

My point is that assuming what is pulled in by what is about as reliable 
as guessing where the bottlenecks in one's code are. You can't tell 
bottlenecks without a profiler, and you've got both hands tied behind 
your back trying to figure out who pulls in what if you're not using 
.map files, grep, and obj2asm.

> Are you not at all interested in improving this aspect of the language 
> usage?

Sure, but based on accurate information. Pulling printf out won't do 
anything. Try it if you don't agree.

For example, which modules pull in the floating point formatting code? 
It isn't printf. We can find out by doing a grep for __fltused:

boxer.obj:      __fltused
complex.obj:    __fltused
conv.obj:       __fltused
date.obj:       __fltused
demangle.obj:   __fltused
format.obj:     __fltused
gamma.obj:      __fltused
math.obj:       __fltused
math2.obj:      __fltused
outbuffer.obj:  __fltused
stream.obj:     __fltused
string.obj:     __fltused
ti_Acdouble.obj:        __fltused
ti_Acfloat.obj: __fltused
ti_Acreal.obj:  __fltused
ti_Adouble.obj: __fltused
ti_Afloat.obj:  __fltused
ti_Areal.obj:   __fltused
ti_cdouble.obj: __fltused
ti_cfloat.obj:  __fltused
ti_creal.obj:   __fltused
ti_double.obj:  __fltused
ti_float.obj:   __fltused
ti_real.obj:    __fltused

Some examination of the .map file shows that the only one of these 
pulled in by default is std.string. So I think a reasonable approach 
would be to look at removing the floating point from std.string - printf 
isn't the problem, nor is referencing a function in std.string.


