Potential GSoC project - GC improvements

Sat Mar 19 01:39:30 PDT 2016

On 18.03.2016 22:04, Jeremy deHaan wrote:
> On Friday, 18 March 2016 at 16:41:21 UTC, Rainer Schuetze wrote:
>>
>>
>> On 15.03.2016 02:34, Jeremy DeHaan wrote:
>>> [...]
>>
>> Being always way behind reading the forum these days, I'm a little
>> late and have not read all the messages in this thread thoroughly.
>> Here are some thoughts:
>>
>> [...]
>
> Thank you for the feedback. I'm still working on my proposal so nothing
> is set in stone just yet. I'm very interested in working on the GC for
> this GSoC, so what would you suggest be my main focus? It sounds like
> you already have a GC that is more or less what I was planning on
> implementing...

Well, there are a number of unfinished parts left:

- as far as I understand, the precise GC PR 
https://github.com/D-Programming-Language/druntime/pull/1022 is 
currently stalled because it doesn't yield better overall performance 
than the current GC in most situations. The reasoning was that it should 
be able to compensate for the additional time during allocation (saving 
pointer information) by faster scanning that uses that information to 
skip non-pointers. That hasn't worked out yet, though.

- last time I checked, generating RTInfo (which contains the pointer 
info) was a bit unreliable, see 
https://github.com/D-Programming-Language/dmd/pull/2480 and 
https://github.com/D-Programming-Language/dmd/pull/3958.

- I only implemented the DATA and TLS section TypeInfo emission for the 
OMF backend; this needs to be ported to other platforms. Martin Nowak 
recently considered just emitting pointer locations instead. This would 
make the scanning function a bit simpler, but might need a bit more 
memory in the binary.
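
To make the trade-off in the points above concrete, here is a minimal 
sketch in C of the three scanning strategies: conservative (every word is 
a candidate), precise with a per-object pointer bitmap, and precise with 
an explicit list of pointer locations. All names (HeapRange, 
scan_conservative, scan_bitmap, scan_offsets) and the bitmap layout are 
assumptions made for illustration; this is not the RTInfo encoding or the 
code from the PRs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical address range of the GC heap, used to decide whether
   a word value could refer to a GC-managed block. */
typedef struct {
    uintptr_t lo;   /* inclusive */
    uintptr_t hi;   /* exclusive */
} HeapRange;

static int in_heap(const HeapRange *h, uintptr_t v)
{
    return v >= h->lo && v < h->hi;
}

/* Conservative scan: every word is treated as a potential pointer,
   so an integer that happens to fall into the heap range is a
   false pointer. Returns the number of candidate references. */
size_t scan_conservative(const uintptr_t *words, size_t n,
                         const HeapRange *h)
{
    size_t hits = 0;
    for (size_t i = 0; i < n; i++)
        if (in_heap(h, words[i]))
            hits++;
    return hits;
}

/* Precise scan with a bitmap: bit i set means word i holds a pointer.
   Words whose bit is clear are skipped without looking at them. */
size_t scan_bitmap(const uintptr_t *words, size_t n,
                   const uint8_t *bitmap, const HeapRange *h)
{
    size_t hits = 0;
    for (size_t i = 0; i < n; i++) {
        if (!(bitmap[i / 8] & (1u << (i % 8))))
            continue;
        if (in_heap(h, words[i]))
            hits++;
    }
    return hits;
}

/* Precise scan with an explicit list of pointer locations (word
   indices). The loop is simpler, but each location costs a full
   size_t in the table instead of one bit. */
size_t scan_offsets(const uintptr_t *words, size_t n,
                    const size_t *offsets, size_t n_offsets,
                    const HeapRange *h)
{
    size_t hits = 0;
    for (size_t k = 0; k < n_offsets; k++) {
        assert(offsets[k] < n);
        if (in_heap(h, words[offsets[k]]))
            hits++;
    }
    return hits;
}
```

For an object whose second word is an integer that happens to look like a 
heap address, the conservative scan reports one reference more than the 
two precise variants. The precise variants only pay off if the time saved 
by skipping words outweighs the cost of storing and looking up the 
pointer information.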

Regarding the concurrent GC: I consider my implementation a prototype 
with rough edges and a lot of optimization opportunities. I'm not sure 
whether page protection is good enough to implement a generational GC, 
but it might still be possible to take advantage of the information that 
a page hasn't been written to.
Judging from https://blog.golang.org/go15gc, Go 1.5 seems to be using a 
similar GC (concurrent mark-and-sweep), though with proper write 
barriers. I guess there is a lot that can be learned from their 
experience.
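
As an illustration of what "information that a page hasn't been written 
to" could look like when tracked in software rather than via page 
protection, here is a minimal C sketch of a page-granular write barrier 
(card marking). It is an assumption chosen for illustration only: the 
names ToyHeap, write_word and take_dirty_pages are hypothetical, and this 
is neither the mechanism of my prototype nor the barrier Go uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define WORDS_PER_PAGE 512u
#define HEAP_PAGES     8u

/* Toy heap: a fixed array of words plus one dirty flag per page. */
typedef struct {
    uintptr_t words[HEAP_PAGES * WORDS_PER_PAGE];
    uint8_t   dirty[HEAP_PAGES];
} ToyHeap;

void heap_init(ToyHeap *h)
{
    memset(h, 0, sizeof *h);
}

/* Write barrier: every store into the heap goes through this
   function, which also records that the containing page changed. */
void write_word(ToyHeap *h, size_t slot, uintptr_t value)
{
    assert(slot < HEAP_PAGES * WORDS_PER_PAGE);
    h->words[slot] = value;
    h->dirty[slot / WORDS_PER_PAGE] = 1;
}

/* Collector side: copy the indices of dirty pages into out (at most
   cap entries) and clear their flags. Pages that are not reported
   have not been written since the last call and need no rescan.
   Returns the number of indices stored. */
size_t take_dirty_pages(ToyHeap *h, size_t *out, size_t cap)
{
    size_t n = 0;
    for (size_t p = 0; p < HEAP_PAGES; p++) {
        if (h->dirty[p] && n < cap) {
            out[n++] = p;
            h->dirty[p] = 0;
        }
    }
    return n;
}
```

A collector could call take_dirty_pages after a concurrent mark phase and 
rescan only the returned pages during the final pause. The cost is that 
every pointer store has to go through the barrier, which needs compiler 
support in a real implementation.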

 > so what would you suggest be my main focus?

Given that 32-bit applications are becoming legacy, and false pointers 
are not a common problem in 64-bit processes (they do happen eventually, 
though), I suspect that concurrency of the GC would make a much larger 
impact on the D language than precision. A good target would be to 
reduce stop-the-world time to something acceptable for interactive 
programs, i.e. well below 50ms.