Misc questions:- licensing, VC++ IDE compatible, GPGPU, LTCG,

Walter Bright newshound1 at digitalmars.com
Sun May 16 13:41:27 PDT 2010

bearophile wrote:
> Walter Bright:
>> This is not true of D. In D, the compiler can<
> Thank you for your answers. At the moment D compilers aren't doing this,

Yes, they are. dmd definitely inlines across source modules.

> The second optimization it talks about is custom calling conventions:
>> Normally, all functions are either cdecl, stdcall, or fastcall. With custom
>> calling conventions, the back end has enough knowledge that it can pass
>> more values in registers, and less on the stack. This usually cuts code
>> size and improves performance.<

Right, dmd doesn't do custom calling conventions. But, it is not necessary for D 
to have the linker do them. As I explained, the compiler has as much source 
available to it as the user wishes to supply.

> The third optimization it talks about is 'Small TLS Encoding':
>> When you use __declspec(thread) variables, the code generator stores the
>> variables at a fixed offset in each per-thread data area. Without LTCG, the
>> code generator has no idea of how many __declspec(thread) variables there
>> will be. As such, it must generate code that assumes the worst, and uses a
>> four-byte offset to access the variable. With LTCG, the code generator has
>> the opportunity to examine all __declspec(thread) variables, and note how
>> often they're used. The code generator can put the smaller, more frequently
>> used variables at the beginning of the per-thread data area and use a
>> one-byte offset to access them.<

Yes, but you won't find this to be a speed improvement. The various addressing 
modes all run at the same speed. Furthermore, the use of global variables (and 
that includes TLS) should be minimized. Use of TLS (or any globals) in a tight 
loop should be avoided on general principles in favor of caching the value in a 
local. I don't believe this optimization is worth the effort.
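
To illustrate the "cache it in a local" advice, here is a minimal sketch in C++, using the standard thread_local keyword in place of __declspec(thread). The names tls_total, sum_direct and sum_cached are made up for this example. The first function touches the TLS variable on every iteration; the second reads it once, works on a local, and writes the result back once. An optimizer may well do this hoisting on its own, so treat it as a picture of the coding style rather than a performance claim.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical per-thread accumulator, standing in for any TLS variable.
thread_local std::uint64_t tls_total = 0;

// Accesses the TLS variable on every loop iteration.
void sum_direct(const std::uint32_t* data, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        tls_total += data[i];
}

// Reads the TLS variable once into a local, loops on the local,
// then stores the result back with a single TLS write.
void sum_cached(const std::uint32_t* data, std::size_t n) {
    std::uint64_t total = tls_total;
    for (std::size_t i = 0; i < n; ++i)
        total += data[i];
    tls_total = total;
}
```

Both functions leave the same value in tls_total. With the second form, the loop body no longer depends on how the TLS offset is encoded.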

Many compilers spend a lot of time trying to optimize access to statics and 
globals. This ain't low-hanging fruit for any but badly written programs.
