DDMD as showcase?
Adam Wilson
flyboynw at gmail.com
Tue Feb 11 11:58:52 PST 2014
On Tue, 11 Feb 2014 08:33:59 -0800, Andrei Alexandrescu
<SeeWebsiteForEmail at erdani.org> wrote:
> On 2/11/14, 6:32 AM, Jakob Ovrum wrote:
>> On Tuesday, 11 February 2014 at 10:29:58 UTC, thedeemon wrote:
>>> On Tuesday, 11 February 2014 at 04:36:28 UTC, Adam Wilson wrote:
>>>
>>>> The GC itself is an orthogonal issue to the compiler. The way I see
>>>> it, once the compiler can output precise information about the heap,
>>>> stack, and registers, you can build any GC you want without the
>>>> compiler requiring any knowledge of the GC.
>>>
>>> If you want a fast GC it needs to be generational, i.e. most of the
>>> times scan just a portion of heap where young objects live (because
>>> most objects die young), not scan whole heap each time (as in current
>>> D GC). However in a mutable language that young/old generation split
>>> usually requires write barriers: compiler must emit code differently:
>>> each time a pointer field of a heap object is mutated it must check
>>> whether it's a link from old gen to young gen and remember that link
>>> (or just mark the page for scanning). So to have a generational GC in
>>> a mutable language you need to change the codegen as well. At least
>>> this is how most mature GCs work.
>>
>> D code has different allocation patterns from Java and C#. In idiomatic
>> D, young GC-allocated objects are probably much fewer.
>
> I agree this is a good hypothesis (without having measured). My
> suspicion is a good GC for D is different from a good GC for Java or C#.
>
> Andrei
>
I'm not so sure about that. That might be true for Java, but C# has
stack-allocated value types, much like D. However, I think that, as with
C#, we often forget about the temporaries we implicitly allocate: strings,
arrays, closures (and lambdas in C#), the ~= operator, etc. These highly
ephemeral allocations are also quite common in C#. I imagine that this
makes D's allocation patterns much closer to C#'s than to Java's.

I was thinking about this last night, and as I continue reading the GC
Handbook, I think I understand more about why MS did what they did with
the .NET GC. First of all, they used every algorithm in that book in one
way or another. For example, the Large Object Heap is a simple mark-sweep
collector because there tend to be relatively few nodes to check and
fragmentation is much lower than in the ephemeral generations. However,
they enabled opt-in compaction in the latest release because the large
size of each node meant that fragmentation became a problem more quickly
in long-running processes.
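
To illustrate the fragmentation point, here is a rough Python sketch of a
toy non-moving heap. It is not how the .NET collector is implemented; the
block sizes and the helper names (largest_free_run, sweep, compact) are
made up for the example. It only shows why sweeping in place can leave
plenty of free space that is still useless for a large request, and why
compaction fixes that.

```python
# Toy model of a heap: a list of (size_kb, is_live) blocks laid out
# contiguously. All sizes and names are hypothetical.

def total_free(heap):
    """Total free space, regardless of where it sits."""
    return sum(size for size, live in heap if not live)

def largest_free_run(heap):
    """Largest contiguous run of free space (what a big allocation needs)."""
    best = run = 0
    for size, live in heap:
        run = 0 if live else run + size
        best = max(best, run)
    return best

def sweep(heap, dead):
    """Mark-sweep style reclamation: dead blocks become free in place."""
    return [(size, live and i not in dead)
            for i, (size, live) in enumerate(heap)]

def compact(heap):
    """Sliding compaction: live blocks move to the front, free space merges."""
    live = [(size, True) for size, is_live in heap if is_live]
    free = total_free(heap)
    return live + ([(free, False)] if free else [])

# Eight 100 KB objects; every other one dies.
heap = [(100, True)] * 8
swept = sweep(heap, dead={0, 2, 4, 6})
compacted = compact(swept)

print(total_free(swept), largest_free_run(swept))          # 400 100
print(total_free(compacted), largest_free_run(compacted))  # 400 400
```

After the sweep there are 400 KB free, but a 200 KB request cannot be
satisfied because no hole is larger than 100 KB. After compaction the
same 400 KB is one contiguous run.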

Also, the more I dive into it, the more I think that thread-local GC is a
bad idea. As I understand it, the point is to reduce the overall pause on
any one thread by reducing the scope of the heap to collect. However, I
would argue that the common case is a program with a few threads that
dominate the majority of running time, plus many ephemeral threads
created for quick work (an incoming message over a socket, for example).
In this case your dominant threads are still going to have large heaps,
and the ephemeral threads will most likely have heaps that are never
collected. This means that a few threads will still have noticeable pause
times, and we've significantly increased compiler complexity to support
thread-local GC on all threads, and probably hammered thread startup time
to do it.

I could go on, but my point is that at the end of the day, if you want
performant collections, you end up using every trick in the book. The
mixture may be slightly different, but I would suggest that the mixture is
going to be slightly different based on the type of app even within the
same language, which is why .NET provides two modes for the collector,
Server and Workstation, and Java has four. So saying that D's collector
will be different is obvious, but I don't think it will be significantly
different from C#'s, as implied. We still have roughly similar allocation
patterns, with roughly similar use cases, and will most likely end up
building in every algorithm available and then tuning the mixture of
those algorithms to meet D's needs.
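
For reference, the write barrier thedeemon describes in the quoted text
can be sketched roughly as follows. This is a Python illustration only,
not D runtime code; Obj, write_field, remembered, and minor_collect are
hypothetical names, and a real collector would have the compiler emit the
barrier on every pointer store instead of calling a function.

```python
# Sketch of a generational write barrier with a remembered set.
# All names are hypothetical; this is an illustration, not a real GC.

class Obj:
    def __init__(self, name, old=False):
        self.name = name
        self.old = old      # True if the object lives in the old generation
        self.fields = {}    # pointer fields: field name -> Obj or None

remembered = set()          # old objects that may point into the young gen

def write_field(owner, field, target):
    """Pointer store with barrier: record any old -> young link."""
    if owner.old and target is not None and not target.old:
        remembered.add(owner)
    owner.fields[field] = target

def young_targets(obj):
    """Young objects directly referenced by obj."""
    return [t for t in obj.fields.values() if t is not None and not t.old]

def minor_collect(roots, young):
    """Mark young objects reachable from the roots and from remembered old
    objects. The old generation itself is never traversed."""
    live = set()
    stack = [r for r in roots if not r.old]
    for owner in remembered:
        stack.extend(young_targets(owner))
    while stack:
        obj = stack.pop()
        if obj not in live:
            live.add(obj)
            stack.extend(young_targets(obj))
    return [o for o in young if o in live]

cache = Obj("cache", old=True)
a, b, c = Obj("a"), Obj("b"), Obj("c")
write_field(cache, "entry", a)   # old -> young: barrier remembers `cache`
write_field(a, "next", b)        # young -> young: nothing to record
survivors = minor_collect(roots=[], young=[a, b, c])
print([o.name for o in survivors])  # ['a', 'b']
```

Without the barrier, a minor collection that skips the old generation
would miss the link from cache to a and free a live object, which is why
the codegen has to cooperate.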
--
Adam Wilson
GitHub/IRC: LightBender
Aurora Project Coordinator