Significant GC performance penalty
Paulo Pinto
pjmlp at progtools.org
Fri Dec 14 11:27:46 PST 2012
On Friday, 14 December 2012 at 18:27:29 UTC, Rob T wrote:
> I created a D library wrapper for sqlite3 that uses a
> dynamically constructed result list for returned records from a
> SELECT statement. It works in a similar way to a C++ version
> that I wrote a while back.
>
> The D code is D code, not a cloned up version of my earlier C++
> code, so it makes use of many of the features of D, and one of
> them is the garbage collector.
>
> When running comparison tests between the C++ version and the D
> version, both compiled using performance optimization flags,
> the C++ version runs 3x faster than the D version which was
> very unexpected. If anything I was hoping for a performance
> boost out of D or at least the same performance levels.
>
> I remembered reading about people having performance problems
> with the GC, so I tried a quick fix, which was to disable the
> GC before the SELECT is run and re-enable afterwards. The
> result of doing that was a 3x performance boost, making the DMD
> compiled version run almost as fast as the C++ version. The DMD
> compiled version is now only 2 seconds slower on my stress test
> runs of a SELECT that returns 200,000+ records with 14 fields.
> Not too bad! I may get identical performance if I compile using
> gdc, but that will have to wait until it is updated to 2.061.
>
> Fixing this was a major relief since the code is expected to be
> used in a commercial setting. I'm wondering though, why the GC
> causes such a large penalty, and what negative effects, if any,
> there will be when disabling the GC temporarily. I know that
> memory won't be reclaimed until the GC is re-enabled, but is
> there anything else to worry about?
>
> I feel it's worth commenting on my experience as feedback for
> the D developers and anyone else starting off with D.
>
> Coming from C++ I *really* did not like having the GC, it made
> me very nervous, but now that I'm used to having it, I've come
> to like having it up to a point. It really does change the way
> you think and code. However as I've discovered, you still have
> to always be thinking about memory management issues because
> the GC can eat up a huge performance penalty under certain
> situations. I also NEED to know that I can always go full
> manual where necessary. There's no way I would want to give up
> that kind of control.
>
> The trade off with having a GC seems to be that by default, C++
> apps will perform considerably faster than equivalent D apps
> out-of-the-box, simply because the manual memory management is
> fine tuned by the programmer as the development proceeds. With
> D, when you simply let the GC take care of business, then you
> are not necessarily fine tuning as you go along, and when you
> do not take the resulting performance hit into consideration it
> means that your apps will likely perform poorly compared to a
> C++ equivalent. However, building the equivalent app in D is a
> much more pleasant experience in terms of the programming
> productivity gain. The code is simpler to deal with, and
> there's less to worry about with pointers and other memory
> management issues.
>
> What I have not yet had the opportunity to explore, is using D
> in full manual memory management mode. My understanding is that
> if I take that route, then I cannot use certain parts of the
> std lib, and will also lose a few of the nice features of D
> that make it fun to work with. I'm not fully clear though on
> what to expect, so if there's any detailed information to look
> at, it would be a big help.
>
> I wonder what can be done to allow a programmer to go fully
> manual, while not losing any of the nice features of D?
>
> Also, I think everyone agrees we really need a better GC, and I
> wonder once we do get a better GC, what kind of overall
> improvements we can expect to see?
>
> Thanks for listening.
>
> --rt
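For reference, the temporary disable described in the quoted message might look roughly like the sketch below. The `Row` type and `loadRows` function are hypothetical stand-ins, since the actual sqlite3 wrapper is not shown; only `GC.disable` and `GC.enable` from `core.memory` are real API. Note that calls to `GC.disable` nest, so each one needs a matching `GC.enable`, and the runtime may still collect if it would otherwise run out of memory.

```d
import core.memory : GC;

// Hypothetical row type; stands in for whatever the wrapper returns.
struct Row
{
    string[] fields;
}

// Hypothetical loader that builds a result list with collections paused.
Row[] loadRows(size_t count, size_t fieldsPerRow)
{
    // Pause automatic collections while the result list is built.
    GC.disable();
    // Re-enable on every exit path, including when an exception is thrown.
    scope (exit) GC.enable();

    Row[] rows;
    rows.reserve(count); // one up-front allocation for the list itself

    foreach (i; 0 .. count)
    {
        auto fields = new string[](fieldsPerRow);
        fields[] = "value"; // placeholder for a real column value
        rows ~= Row(fields);
    }
    return rows;
}
```

Calling `loadRows(200_000, 14)` would mirror the size of the stress test mentioned above. Memory allocated while the GC is disabled is simply reclaimed later, once collections resume.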
I have lots of experience with GC-enabled languages, even for
systems programming (Oberon & Active Oberon), and
I think there are a few issues to consider:
- D's GC still has a lot of room to improve, so some of the
issues you have found might eventually get improved;
- Having GC support does not mean you should call new like crazy; one
still needs to think about how to code in a GC-friendly way;
- Make proper use of weak references in case they are available;
- The runtimes of GC-enabled languages usually offer some way to peek
into the runtime, allowing the developer to understand how the
GC is working and what might be improved.
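As a rough illustration of the second and last points, here is a minimal sketch that reuses one buffer instead of allocating per item, and then asks the runtime how the heap looks. The function names `totalLength` and `reportHeap` are made up for this example. It assumes a reasonably recent druntime that provides `GC.stats`, which did not exist when this thread was written.

```d
import core.memory : GC;
import std.array : appender;
import std.stdio : writefln;

// Hypothetical example: format n records through one reusable buffer
// rather than allocating a fresh string for every record.
size_t totalLength(size_t n)
{
    auto buf = appender!(char[])();
    buf.reserve(64); // allocate once, up front

    size_t total = 0;
    foreach (i; 0 .. n)
    {
        buf.clear();        // drop the contents, keep the capacity
        buf.put("record ");
        buf.put("payload");
        total += buf.data.length;
    }
    return total;
}

// Hypothetical helper: print the GC's view of the heap.
void reportHeap(string label)
{
    auto s = GC.stats(); // assumes a druntime new enough to have GC.stats
    writefln("%s: used=%s bytes, free=%s bytes", label, s.usedSize, s.freeSize);
}
```

Calling `reportHeap` before and after a hot loop gives a crude picture of how much it allocates. Recent runtimes also accept the `--DRT-gcopt=profile:1` switch on the program's command line, which prints collection statistics at exit.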
The benefit of having a GC is a safer way to manage
memory across multiple modules, especially when ownership is not
clear.
Even in C++ I seldom do manual memory management nowadays when
working on new codebases. Of course, others will have a different
experience.
Other than that, thanks for sharing your experience.
--
Paulo