On heap segregation, GC optimization and @nogc relaxing

Wed Nov 12 04:49:40 PST 2014

On Wednesday, 12 November 2014 at 02:34:55 UTC, deadalnix wrote:
> Before going into why it is falling short, a digression on GC 
> and the benefits of segregating the heap. In D, the heap is 
> almost segregated in 3 groups: thread local, shared and 
> immutable. These groups are very interesting for the GC:
>  - Thread local heap can be collected while disturbing only one 
> thread. It should be possible to use different strategies in 
> different threads.
>  - Immutable heap can be collected 100% concurrently without 
> any synchronization with the program.
>  - Shared heap is the only one that requires disturbing the 
> whole program, but as a matter of good practice, this heap 
> should be small anyway.

All this is unfortunately only true if there are no references 
between heaps, i.e. if the heaps are indeed "islands". Otherwise, 
at least write barriers are needed.
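
To make the point concrete, here is a toy sketch of what such a write barrier would have to do: every pointer store checks whether it creates a reference from one heap into another and, if so, records the source object so that a collection of a single heap knows about its incoming references. The names Heap, Obj, RememberedSet and writeField are made up for illustration; nothing like this exists in druntime.

```d
// Toy model only: each object is tagged with the heap it lives in.
enum Heap { threadLocal, sharedHeap, immutableHeap }

struct Obj
{
    Heap heap;   // which heap this object belongs to
    Obj* field;  // a single pointer field, enough for the sketch
}

// Objects holding a pointer into a different heap than their own.
struct RememberedSet
{
    Obj*[] crossHeapSources;
}

// The write barrier: all pointer stores go through here instead of
// a plain `target.field = value;`.
void writeField(ref RememberedSet rs, Obj* target, Obj* value)
{
    if (value !is null && value.heap != target.heap)
        rs.crossHeapSources ~= target; // remember the cross-heap edge
    target.field = value;
}

unittest
{
    RememberedSet rs;
    auto a = new Obj; a.heap = Heap.threadLocal;
    auto b = new Obj; b.heap = Heap.sharedHeap;
    auto c = new Obj; c.heap = Heap.threadLocal;

    writeField(rs, a, c); // same heap: nothing recorded
    assert(rs.crossHeapSources.length == 0);

    writeField(rs, b, a); // shared -> thread local: recorded
    assert(rs.crossHeapSources.length == 1);
    assert(rs.crossHeapSources[0] is b);
}
```

The cost is the extra branch on every pointer store, which is exactly what true islands would avoid.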

> I'd argue for the introduction of a basic ownership system. 
> Something much simpler than Rust's, that does not cover all use 
> cases. But the good thing is that we can fall back on GC or 
> unsafe code when the system shows its limits. That means we rely 
> less on the GC, while being able to provide a better GC.
>
> We already pay a cost at interfaces with type qualifiers, let's 
> make the best of it! I'm proposing to introduce a new type 
> qualifier for owned data.
>
> Now it means that a throw statement expects an owned(Throwable), 
> that pure functions that currently return an implicitly unique 
> object will return owned(Object) and that message passing will 
> accept to pass around owned stuff.
>
> The GC heap can be segregated into islands. We currently have 3 
> types of islands: thread local, shared and immutable. These 
> are built-in islands with special characteristics in the 
> language. The new qualifier introduces a new type of island, the 
> owned island.
>
> Owned islands can only refer to other owned islands and to 
> immutable. They can be merged into any other island at any time 
> (that is why they can't refer to TL or shared).
>
> owned(T) can be passed around as function parameter or 
> returned, or stored as fields. When doing so they are consumed. 
> When an owned is not consumed and goes out of scope, the whole 
> island is freed.
>
> That means that owned(T) can implicitly decay into T, 
> immutable(T), shared(T) at any time. When doing so, a call to 
> the runtime is done to merge the owned island to the 
> corresponding island. It is passed around as owned, then the 
> ownership is transferred and all local references to the island 
> are invalidated (using them is an error).
>
> On an implementation level, a call to a pure function that 
> returns an owned could look like this:
>
> {
>   IslandID __saved = gc_switch_new_island();
>   scope(exit) gc_restore_island(__saved);
>
>   call_pure_function();
> }

This is nice. Instead of calling fixed helpers in Druntime, the 
generated code could also make an indirect call to allow for 
pluggable (and runtime switchable) allocators.

> The solution of passing a policy at compile time for allocation is 
> close to what C++'s stdlib is doing, and even if the proposed 
> approach by Andrei is better, I don't think this is a good one. 
> The proposed approach allows for a lot of code to be marked as 
> @nogc and allows for the caller to decide. That is ultimately 
> what we want libraries to look like.

+1

Andrei's approach mixes up memory allocation and memory 
management. Library functions shouldn't know about the latter. 
This proposal is clearly better and cleaner in this respect.