More radical ideas about gc and reference counting

Mon May 12 09:03:16 PDT 2014

On 12 May 2014 18:45, Walter Bright via Digitalmars-d
<digitalmars-d at puremagic.com> wrote:
> On 5/12/2014 12:12 AM, Manu via Digitalmars-d wrote:
>>
>> What? You've never offered me a practical solution.
>
>
> I have, you've just rejected them.
>
>
>> What do I do?
>
>
> 1. you can simply do C++ style memory management. shared_ptr<>, etc.

I already have C++. I don't want another one.

> 2. you can have the non-pausable code running in a thread that is not
> registered with the gc, so the gc won't pause it. This requires that this
> thread not allocate gc memory, but it can use gc memory allocated by other
> threads, as long as those other threads retain a root to it.

It still sounds the same as manual memory management in practice,
though. Like you say, the other thread must maintain a root to it,
which means I need to manually retain it somehow, and when the worker
thread finishes with it, it needs to send a signal or something back
to say it's done so it can be released... it sounds more inconvenient
than direct manual memory management in practice.
Sounds slow too. Dec-ing a ref is certainly faster than inter-thread
communication.

This also turns library calls into effective RPCs if I can't call
into them from the active threads.

How long is a collect liable to take in the event the GC threads need
to collect? Am I likely to lose my service threads for 100s of
milliseconds at a time?

I'll think on it, but I don't think there's anything practically
applicable here, and it really sounds like it creates a lot more
trouble and complexity than it addresses.

> 3. D allows you to create and use any memory management scheme you want. You
> are simply not locked into GC. For example, I rewrote my Empire game into D
> and it did not do any allocation at all - no GC, not even malloc. I know
> that you'll need to do allocation, I'm just pointing out that GC allocations
> and pauses are hardly inevitable.

C++ lets me create any memory management scheme I like by the same argument.
I lose all the parts of the language that implicitly depend on the GC,
and 3rd party libs (that don't care about me and my project).
Why isn't it a reasonable argument to say that not having access to
libraries is completely unrealistic? You can't write modern software
without extensive access to libraries. Period.

I've said before, I don't want to be a second class citizen with
access to only a subset of the language.

> 4. for my part, I have implemented @nogc so you can track down gc usage in
> code. I have also been working towards refactoring Phobos to eliminate
> unnecessary GC allocations and provide alternatives that do not allocate GC
> memory. Unfortunately, these PR's just sit there.

The effort is appreciated, but it was never a solution. I said @nogc
was the exact wrong approach to my situation right from the start, and
I predicted that would be used as an argument the moment it appeared.
Tracking down GC usage isn't helpful when it leads you to a lib call
that you can't change. And again, eliminating useful and productive
parts of the language is not a goal we should be shooting for.

I'll find it useful in the high-performance realtime bits; i.e., the
bits that I typically disassemble and scrutinise after every compile.
But that's not what we're discussing here.
I'm happy with D for my realtime code; I have the low-level tools I
need to make the realtime code run fast. @nogc is a little bonus that
will allow me to guarantee no sneaky allocations are finding their way
into the fast code, and that might save a little time, but I never
really saw that as a significant problem in the first place.

What we're talking about is productivity, convenience and safety in
the non-realtime code: the vast majority of code, which programmers
spend most of their days working on.

Consider it this way... why do you have all these features in D that
cause implicit allocation if you don't feel they're useful and
important parts of the language?
Assuming you do feel they're important parts of the language, why do
you feel it's okay to tell me I don't deserve access to them?
Surely I'm *exactly* the target market for D...? High-pressure,
intensive production environments, still depending exclusively on
native code, with code teams often in the realm of 50-100, containing
many juniors, aggressive schedules which can't afford to waste
engineering hours... this is a code environment that's prone to MANY
bugs, and countless wasted hours as a consequence.
Convenience and safety are important to me... I don't know what you
think I'm interested in D for if you think I should be happy to
abandon a whole chunk of the language, just because I have a couple of
realtime threads :/

> 5. you can divide your app into multiple processes that communicate via
> interprocess communication. One of them pausing will not pause the others.
> You can even do things like turn off the GC collections in those processes,
> and when they run out of memory just kill them and restart them. (This is
> not an absurd idea, I've heard of people doing that effectively.)

Most of the platforms I work on barely have operating systems.

> 6. If you call C++ libs, they won't be allocating memory with the D GC. D
> code can call C++ code. If you run those C++ libs in separate threads, they
> won't get paused, either (see (2)).

Whether this is practical or not depends entirely on the lib.
Maybe this concept can be applicable in some small places, but it's
not a salvation. I don't think this sufficiently addresses the problems.
None of the problems are actually going away, they're just moved
somewhere else.

> 7. The Warp program I wrote avoids GC pauses by allocating ephemeral memory
> with malloc/free, and (ironically) only using GC for persistent data
> structures that should never be free'd. Then, I just turned off GC
> collections, because they'd never free anything anyway.

That idea is obviously not applicable in my environment. Resource
usage is dynamic and fluid.

> 8. you can disable and enable collections, and you can cause collections to
> be run at times when nothing is happening (like when the user has not input
> anything for a while).

If I disable collections, then I just crash when I receive that network packet?
I'm back at manual memory management in practice.

I also don't think it's reasonable to assume there will just be 'times
when nothing is happening'. That's not how games work.
Games are often really fast paced, and even if they're not,
significant stuttering in the animation is usually considered a
non-shippable bug.

https://www.youtube.com/watch?v=rqjOXR9QnMo
https://www.youtube.com/watch?v=giiZMktZrNI
https://www.youtube.com/watch?v=LoPC_ibBJiQ
Where would you manually issue collects?
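
For concreteness, a minimal sketch of what suggestion (8) amounts to in
code. GC.disable, GC.enable and GC.collect are the real core.memory calls;
runFrames, update and isQuiet are hypothetical names made up for this
illustration, and the isQuiet policy is exactly the part the question above
asks about.

```d
import core.memory : GC;

// Runs `frames` iterations with automatic collections suppressed, and
// collects only when the (hypothetical) isQuiet policy says a frame can
// absorb a pause. Returns how many collections were explicitly requested.
int runFrames(int frames, void delegate(int) update, bool delegate(int) isQuiet)
{
    GC.disable();                 // suppress automatic collections
    scope (exit) GC.enable();     // restore them however we leave the loop

    int collections = 0;
    foreach (i; 0 .. frames)
    {
        update(i);                // game logic; may still allocate GC memory
        if (isQuiet(i))
        {
            GC.collect();         // pause happens here, at a point we chose
            ++collections;
        }
    }
    return collections;
}
```

The mechanism itself is tiny; everything difficult is pushed into deciding
when isQuiet may return true, and memory keeps growing between those points.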

> The point is, the fact that D has 'new' that allocates GC memory simply does
> not mean you are obliged to use it.

D also has ~, closures, dynamic arrays, even array literals. There are
various things that create implicit GC allocations. And library
calls...
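
A short sketch of the kinds of implicit allocation meant here, assuming a
current D compiler. The helper names (appendOne, makeCounter, literal, sum)
are hypothetical and exist only for illustration; none of them mentions
'new', yet the first three allocate from the GC heap.

```d
// Concatenation with ~ builds a new array on the GC heap.
int[] appendOne(int[] a, int x)
{
    return a ~ x;
}

// The returned delegate outlives this call, so `n` is moved into a
// GC-allocated closure instead of living on the stack.
int delegate() makeCounter()
{
    int n;
    return () => ++n;
}

// An array literal returned as a dynamic array is GC-allocated.
int[] literal()
{
    return [1, 2, 3];
}

// Marking a function @nogc makes the compiler reject any of the
// constructs above inside it; plain iteration is fine.
@nogc nothrow int sum(scope const(int)[] a)
{
    int s = 0;
    foreach (v; a)
        s += v;
    return s;
}
```

Moving any of the first three bodies into a @nogc function turns the hidden
allocation into a compile error, which locates it but does not replace it.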

> The GC is not going to pause your
> program if you don't allocate with it. Nor will it ever run a collection at
> uncontrollable, random, asynchronous times.

Those claims come with massive dependency on very specific
restrictions, like abandoning part of the language and moving library
calls to separate threads and accessing them via RPC or something like
that.

None of your suggestions sound practical, or like they'd result in any
less effort or complexity than manual management in the first place,
which everyone is already accustomed to. I'm almost certainly
sacrificing safety in every case.
You can't then go on to say you gave me plenty of options, but I
rejected them, when none of them were really options.

I wonder if you have a good conception of the scope/scale of the
software we write. It's not comparable to Empire, or a linker, or a
compiler, or a web server, or many things at all really. Games are
some of the biggest, broadest software projects there are, very
tightly integrated, with some of the most stringent operating
requirements. They're also growing steadily... it's harder and harder
to manage the scope without helpful language tools; this is why you
see so many gamedevs in the independent space flirting with 'modern'
languages like C#. For games with extremely small scope that don't
push the platform (indie/casual games), this is sometimes okay, but
there are plenty of cases where it has been a complete disaster as the
scope has grown towards a more traditional 'big game'. The game I
helped my mates with last weekend is 'mid-scoped', but it's grown to
saturate the PS4, and the GC is causing them a nightmare... right at
the end of the project when trying to finalise the build for shipping,
precisely as I've always predicted.
C# is a better productivity experience than C++, and it allowed them
to do a lot more work in a lot less time with a lot fewer people. But
it's clearly not really compatible with the workload, and I think the
future of the industry needs to do a lot better.

I've said before, we are an industry in desperate need of salvation,
it's LONG overdue, and I want something that actually works well for
us, not a crappy set of compromises because the language has a
fundamental incompatibility with my industry :/ ... It doesn't have to
be that way.