ldc 0.9.1 released
Frits van Bommel
fvbommel at REMwOVExCAPSs.nl
Wed May 27 15:19:32 PDT 2009
bearophile wrote:
> Frits van Bommel:
>
> Thank you for your answers.
>
>> This one is only done for certain GC allocations by the way, not all of them.
>> The ones currently implemented are:
>> * new Struct/int/float/etc.,
>> * uninitialized arrays (used for arr1 ~ arr2, for instance),
>> * zero-initialized arrays (e.g. new int[N])
>> * new Class, unless
>> a) it has a destructor,
>> b) it has a custom allocator (overloads new), or
>> c) it has a custom deallocator (overloads delete).
>
> I'm trying to find situations where that's true, but in two small programs that use both structs and classes (that don't escape the scope and follow your unless list) I see:
>
> call _d_allocmemoryT
> call _d_allocclass
> Are those calls to variants of alloca()?
No, those are GC allocations.
This small program contains no GC allocations when compiled with ldc -O3:
-----
struct Struct {
    int i, j = 4;
}
class Class {
    int i, j = 6;
}
int frob(T)(T t) {
    t.i = 4;
    return t.j;
}
int withStruct() {
    return frob(new Struct);
}
int withClass() {
    return frob(new Class);
}
-----
It does still contain them when inlining is disabled, as it is by default with
-O2 (aka -O). This seems to be because the LLVM pass that adds parameter
attributes (like nocapture, better known as 'scope' in these newsgroups) is
missing from the default list of optimizations :(. I'll fix this in the
repository soon.
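To illustrate why a parameter attribute like nocapture matters when the callee
isn't inlined, here is a minimal sketch. The names Box, kept, readOnly and
retain are made up for this example and don't come from LDC. Whether the
optimizer actually proves the first case depends on which passes run.

```d
class Box {
    int v = 6;
}

Box kept; // module-level variable (hypothetical, only used to let a pointer escape)

// Does not retain its argument: this is the kind of parameter that could be
// marked nocapture, so a caller's `new Box` would be a stack candidate.
int readOnly(Box b) {
    return b.v;
}

// Stores its argument in a global: the object outlives the call, so an
// allocation passed here has to stay on the GC heap.
int retain(Box b) {
    kept = b;
    return b.v;
}
```

Without inlining, the caller only sees the call itself, so it has to rely on
such an attribute to know that readOnly(new Box) doesn't leak the pointer.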
Another constraint I forgot to mention: it doesn't work for allocations in
loops, because it's tricky to figure out whether the allocation is still
reachable when the loop reaches the same position again.
(For this reason, the pass by default runs before each inliner run and once
more after all inlining is done: the inliner can inline code into loops, but it
also enables simplifications that make escape analysis more accurate.)
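A small sketch of the loop problem, with hypothetical names (Node, sumOnce,
sumInLoop) chosen just for illustration. It only shows the reachability issue;
it is not taken from the LDC test suite.

```d
struct Node {
    int value;
    Node* next;
}

// Allocation outside any loop: the pointer is not kept after the return,
// so under the rules above this would be a candidate for the stack.
int sumOnce() {
    auto n = new Node;
    n.value = 3;
    return n.value;
}

// Allocation inside a loop: each new node links to the previous one, so the
// earlier allocation is still reachable when `new Node` executes again.
// A single stack slot could not stand in for all of them.
int sumInLoop(int count) {
    Node* head = null;
    foreach (i; 0 .. count) {
        auto n = new Node;
        n.value = i;
        n.next = head;
        head = n;
    }
    int total = 0;
    for (auto p = head; p !is null; p = p.next)
        total += p.value;
    return total;
}
```

If sumOnce were inlined into a loop body, its allocation would end up in a loop
too, which is why the ordering relative to the inliner matters.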
> While looking for those alloca I have also tested code that has the following two lines one after the other:
> auto a = new int[1000];
> a[] = 2;
>
> That code is very common, because you currently can't write:
> auto a = new int[1000] = 2;
>
> The latest LDC compiles that as:
>
> pushl %esi
> subl $4016, %esp
> leal 16(%esp), %esi
> movl %esi, (%esp)
> movl $4000, 8(%esp)
> movl $0, 4(%esp)
> call memset
> movl %esi, (%esp)
> movl $2, 8(%esp)
> movl $1000, 4(%esp)
> call _d_array_init_i32
>
> I think the memset may be avoided.
That's trickier to get right, because the optimizer would have to look ahead to
see that the memset call for the new is always followed by the initialization,
with no reads in between.
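A sketch of the two situations, using made-up function names (fillNoRead,
readBeforeFill) and assuming n > 0 in the second one:

```d
// The zero-initialization is never observed: every element is overwritten
// before anything reads the array, so the first memset would be redundant.
int[] fillNoRead(size_t n) {
    auto a = new int[n];
    a[] = 2;
    return a;
}

// Here a read happens between the allocation and the fill, so the zeroing
// is observable and can't simply be dropped. Assumes n > 0.
int readBeforeFill(size_t n) {
    auto a = new int[n];
    int first = a[0]; // sees the zero written by the allocation
    a[] = 2;
    return first + a[0];
}
```

The optimizer would have to prove it is looking at the first case before it
could remove the zeroing.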
The 1-byte element case can probably be handled by LLVM if _d_array_init_i8 is
replaced by another memset, though. (Similarly, _d_array_init_i16 could be
handled for values like 0xFFFF, but not 0x1234, by turning it into a memset.)
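The 0xFFFF versus 0x1234 distinction comes down to whether both bytes of the
value are equal, since memset writes a single repeated byte. A minimal sketch,
where fillableWithMemset and fill16 are hypothetical helpers written for this
example and not part of the LDC runtime:

```d
import core.stdc.string : memset;

// True when both bytes of the 16-bit value are the same, which is the
// condition under which a byte-wise memset reproduces the pattern.
bool fillableWithMemset(ushort value) {
    return (value & 0xFF) == (value >> 8);
}

// Fills with memset when the byte pattern allows it, otherwise falls back
// to an element-by-element fill.
void fill16(ushort[] arr, ushort value) {
    if (arr.length == 0)
        return;
    if (fillableWithMemset(value))
        memset(arr.ptr, value & 0xFF, arr.length * ushort.sizeof);
    else
        arr[] = value;
}
```

With this check, 0xFFFF and 0xABAB qualify while 0x1234 does not.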