DDMD as showcase?

Adam Wilson flyboynw at gmail.com
Tue Feb 11 11:58:52 PST 2014


On Tue, 11 Feb 2014 08:33:59 -0800, Andrei Alexandrescu  
<SeeWebsiteForEmail at erdani.org> wrote:

> On 2/11/14, 6:32 AM, Jakob Ovrum wrote:
>> On Tuesday, 11 February 2014 at 10:29:58 UTC, thedeemon wrote:
>>> On Tuesday, 11 February 2014 at 04:36:28 UTC, Adam Wilson wrote:
>>>
>>>> The GC itself is an orthogonal issue to the compiler. The way I see
>>>> it, once the compiler can output precise information about the heap,
>>>> stack, and registers, you can build any GC you want without the
>>>> compiler requiring any knowledge of the GC.
>>>
>>> If you want a fast GC it needs to be generational, i.e. most of the
>>> times scan just a portion of heap where young objects live (because
>>> most objects die young), not scan whole heap each time (as in current
>>> D GC). However in a mutable language that young/old generation split
>>> usually requires write barriers: compiler must emit code differently:
>>> each time a pointer field of a heap object is mutated it must check
>>> whether it's a link from old gen to young gen and remember that link
>>> (or just mark the page for scanning). So to have a generational GC in
>>> a mutable language you need to change the codegen as well. At least
>>> this is how most mature GCs work.
>>
>> D code has different allocation patterns from Java and C#. In idiomatic
>> D, young GC-allocated objects are probably much fewer.
>
> I agree this is a good hypothesis (without having measured). My  
> suspicion is a good GC for D is different from a good GC for Java or C#.
>
> Andrei
>

I'm not so sure about that. It might be true for Java, but C#, like D, is a
language with stack-allocated value types. However, I think that, as with
C#, we often forget about the temporaries we implicitly allocate: strings,
arrays, closures (lambdas in C#), the ~= operator, and so on. These highly
ephemeral allocations are also quite common in C#, so I imagine that D's
allocation patterns are much closer to C#'s than to Java's.
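
To make that concrete, here are a few allocations that don't look like
allocations (purely illustrative D, not a benchmark; the names are made up):

import std.conv : to;

string label(int id)
{
    return "item " ~ id.to!string;  // ~ builds a brand-new string on the GC heap
}

int delegate(int) makeAdder(int n)
{
    return x => x + n;              // closure: n's frame escapes, so it is
                                    // copied to the GC heap
}

void fill()
{
    int[] xs;
    foreach (i; 0 .. 1_000)
        xs ~= i;                    // ~= reallocates from the GC as the array grows
}

Every one of these is an ephemeral, young allocation of exactly the kind a
generational collector is designed to recycle cheaply.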

I was thinking about this last night, and as I continue reading the GC
Handbook, I think I understand more about why MS did what they did with
the .NET GC. First of all, they used every algorithm in that book in one
way or another. For example, the Large Object Heap is a simple mark-sweep
collector, because there tend to be relatively few nodes to check and
fragmentation is much lower than in the ephemeral generations. However,
they enabled opt-in compaction in the latest release, because the large
size of each node means that fragmentation becomes a problem more quickly
in long-running processes.
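
Conceptually that collection is just mark and sweep over a list of big
blocks; a toy sketch (in D, purely illustrative, nothing like the layout of
the real .NET or druntime collectors):

struct LargeObject
{
    bool marked;
    size_t size;
    LargeObject*[] refs;  // outgoing pointers to other large objects
}

void mark(LargeObject* obj)
{
    if (obj is null || obj.marked)
        return;
    obj.marked = true;
    foreach (r; obj.refs)
        mark(r);          // trace everything reachable from the roots
}

size_t sweep(ref LargeObject*[] heap)
{
    size_t freed;
    LargeObject*[] live;
    foreach (obj; heap)
    {
        if (obj.marked)
        {
            obj.marked = false;  // reset for the next cycle
            live ~= obj;         // survivors keep their addresses: no compaction
        }
        else
            freed += obj.size;   // the block goes back to the free list
    }
    heap = live;
    return freed;
}

With few, large nodes the mark and sweep phases are cheap, but because
nothing ever moves, the holes left behind are what forced MS to add the
opt-in compaction pass.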

Also, the more I dive into it, the more I think that thread-local GC is a
bad idea. As I understand it, the point is to reduce the overall pause on
any one thread by reducing the scope of the heap to collect. However, I
would argue that the common case is a program with a few threads that
dominate the majority of the running time, while many ephemeral threads are
created for quick work (handling an incoming message over a socket, for
example). In that case the dominant threads are still going to have large
heaps, and the ephemeral threads will most likely have heaps that are never
collected at all. This means that a few threads will still have noticeable
pause times, we've significantly increased compiler complexity to support
thread-local GC on all threads, and we've probably hammered thread start-up
time to do it.

I could go on, but my point is that at the end of the day, if you want a
performant collector, you end up using every trick in the book. The mixture
may be slightly different, and I would suggest it will differ with the type
of app even within the same language, which is why .NET provides two modes
for its collector, Server and Workstation, and Java has four. So saying
that D's collector will be different is stating the obvious, but I don't
think it will be significantly different from C#'s, as implied. We still
have roughly similar allocation patterns and roughly similar use cases, and
we will most likely end up building in every algorithm available and then
tuning the mixture of those algorithms to meet D's needs.
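
Going back to thedeemon's write-barrier point, the codegen change a
generational collector asks for is small but pervasive: every pointer store
into the heap grows an extra card mark or check. Very roughly (a hand-waved
sketch; the names and card layout are invented, not anything DMD or
druntime emits today):

enum cardSize = 512;
__gshared ubyte[] cardTable;  // one dirty byte per 512-byte chunk of old space

// What the compiler would conceptually emit instead of a plain pointer store:
void storeWithBarrier(T)(T** field, T* value)
{
    *field = value;
    // Remember that this region was written, so the next young-generation
    // collection rescans it for old-to-young references.
    immutable card = cast(size_t) field / cardSize;
    if (card < cardTable.length)
        cardTable[card] = 1;
}

That per-store cost is the price of only scanning the young generation, and
it's the main reason a generational design needs the compiler's cooperation
rather than living purely in the runtime.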

-- 
Adam Wilson
GitHub/IRC: LightBender
Aurora Project Coordinator

