Is this function pure?

Frits van Bommel fvbommel at REMwOVExCAPSs.nl
Wed Sep 19 15:33:28 PDT 2007


Nathan Reed wrote:
> Bill Baxter wrote:
>> The 'fast allocation' claim may be true of generational GCs or other 
>> fancy GCs, but D currently has a mark&sweep type.
> 
> I didn't know that; I assumed D was using at least a copying collector. 
>  You're right that allocation wouldn't be any faster with a 
> mark-and-sweep collector than in a language with explicit memory 
> management.  D /could/ theoretically use a copying collector or 
> generational collector though; as far as I know there's no barrier 
> except writing the code for it.

The big blocker to any sort of copying/moving collector is that it needs 
to know which bytes represent pointers and which don't.

This would require the compiler to emit extra type information it 
currently doesn't. In particular:
* ClassInfo.offTi[] would need to actually be filled (IIRC it's always 
an empty array currently)
* Similar data is needed for stack frames. AFAIK there's currently no 
way to communicate this.
* Pointer data for structs, delegates, array types, AAs etc. would also 
need to be provided (but could be "inlined" into the containing 
class/stackframe information)
* Heap objects would need some kind of tag to indicate how to find 
pointers in them. For classes it would be easy, just get 
.classinfo.offTi through the vtable ptr, but for example structs and 
array data can also be heap-allocated and don't have a vtable ptr. In 
fact, for arrays it isn't even constant (though a 'repeat x times to 
visit all allocated data' marker could be used).

The problem that requires all that is the fact that when you move a heap 
cell, you need to update all (reachable) pointers/references to it (but 
not non-pointer data that happens to *look* like a pointer). So the GC 
can only do this when it is sure it can update all pointers to it, and 
*only* pointers to it.
A GC could of course refrain from moving objects if it's not sure 
everything that looks like a pointer to it actually is one, but that 
would negate at least some of the benefit of a moving collector.

(An interesting idea might be to try and use debugging information as 
the source of type information. However, even though that might work for 
stack data it could be harder for heap data since all you've got to go 
on is the declared types of the pointers to it; and void*s could point 
to anything. Besides, requiring debug info for the GC to work well is a 
bit of a hack)

Another issue is that sometimes the compiler simply doesn't *know* 
whether something is a pointer or not. Consider:
* You can allocate void[]s, which explicitly *may* contain pointers but 
don't necessarily consist of only pointers.
* Unions of pointer and non-pointer types
* Code that's linked in could be compiled by another compiler entirely 
(for instance, a library written in C). Currently using these just 
requires that you keep a pointer where the GC can find it, but if you 
need to find all pointers to heap objects you'd need non-D code to 
declare where exactly they keep their pointers... (Assuming it can't be 
inferred from sources such as debug info)



More information about the Digitalmars-d mailing list