More radical ideas about gc and reference counting

Manu via Digitalmars-d digitalmars-d at puremagic.com
Sat May 10 23:27:45 PDT 2014


On 11 May 2014 14:57, Walter Bright via Digitalmars-d
<digitalmars-d at puremagic.com> wrote:
> On 5/10/2014 8:58 PM, Manu via Digitalmars-d wrote:
>>
>> This is truly a niche usage case though,
>
>
> Come on! Like about 80% of the programs on any linux box? Like the OCR
> program I run? A payroll processing program? Any scientific numerical
> analysis program? Engineering programs?

Linux programs aren't at risk of being ported from C any time soon.
And if the GC is a library, shell apps probably represent the set
where it is the most convenient to make use of a GC lib.
Accounting software is UI based software. It's just of a sort that
people aren't likely to notice or complain if it stutters
occasionally.
Engineering programs are typically realtime in some way. Most
productivity software of that sort (CAD, ART/design, etc) is highly
interactive.

For tools like OCR run as a shell app, refer to the short-lived shell
app point. OCR as a feature of an art package (like Photoshop) is,
again, highly interactive productivity software.


>> How many libs does DMD link?
>
>
> We've gone over this before. You were concerned that the libraries you
> linked with were incompetently written, and implied that if ARC was
> pervasive, they would be competently written. I can guarantee you, however,
> that ARC leaves plenty of opportunity for incompetence :-)

It's not about incompetence, it's about incompatibility.
If I can't tolerate a collect, not only do I sacrifice extensive and
productive parts of the language, this means a library which I have no
control over is banned from the GC outright. That is entirely
unrealistic.

I can approach a performance hazard which manifests locally, as with
ARC. I have no way to address the sort that manifests in random
places, at random times, for reasons that are outside of my control,
interferes with the entire application, and increases in frequency as
the free memory decreases (read: as I get closer to shipping day).


>> I'm hard pressed to think of any software I see people using every day
>> which isn't realtime in some sense.
>
>
> Programs you "see" by definition have a user interface. But an awful lot of
> programs are not seen, but that doesn't mean they aren't there and aren't
> running. See my list above.

Sure, and of the subset of those which run in a time-critical
environment, they are the best candidates to make use of GC as a lib.
As I see it, most of those apps these days are more likely to be written
in a language like python, so it's hard to make any raw performance
argument about that category of software in general. It would be an
extremely small subset where it is a consideration.
Conversely, all realtime/UI based/user-facing software cares about
performance and stuttering.


>> then that why is there an argument?
>
>
> Because if this was an easy problem, it would have been solved. In
> particular, if the ARC overhead was easily removed by simple compiler
> enhancements, why hasn't ARC taken the world by storm? It's not like ARC was
> invented yesterday.

And as far as I can tell, it has, at least in this
(native/compiled/systems) space. O-C, C++/CX, Rust... where are the
counter-examples?
I don't think Java/C# make a very good comparison for D, compared to
the other native languages which truly do exist on the same playing
field.


>> That work just needs to be done,
>
>
> That's a massive understatement. This is PhD research topic material, not
> something I can churn out in a week or two if only I had a more positive
> attitude :-)

That's okay, I am just looking for direction. I want to see a
commitment to a path I can get behind. Not a path that, however many
years from now, I still have good reason to believe it won't satisfy
my requirements.
All my energy in the meantime would be a waste in that case.


>> but by all prior
>> reports I've heard, awesome GC is practically incompatible with D for
>> various reasons.
>
>
> There is no such thing as a GC which would satisfy your requirements.

Then... the argument is finished.
You have a choice. You may choose to explore the more inclusive
technology (which also solves some other outstanding language
problems, like destructors), or you confirm D is married to GC, and I
consider my options and/or future involvement in that context.


>> No matter how awesome it is, it seems conceptually
>> incompatible with my environment.
>
>
> My point!

My point too!


>> and I just don't think it's that unreasonable. ARC is an
>> extremely successful technology, particularly in the
>> compiled/native/systems language space (OC, C++/CX,
>
>
> What was dismissed is the reality pointed out many times that those systems
> resolve the perf problems of ARC by providing numerous means of manually
> escaping it, with the resulting desecration of soundness guarantees.

You've said that, but I don't think there's any hard evidence of that.
It is an option, of course, which is extremely valuable to have, but I
don't see any evidence that it is a hard requirement.
Andrei's paper which asserted that modern ARC fell within 10% of "the
fastest GC" didn't make that claim with the caveat that "extensive
unsafe escaping was required to produce these results".


>> Rust).
>
>
> Rust is not an extremely successful technology. It's barely even been
> implemented.

Maybe... but they probably had extensive arguments on the same issues,
and their findings should surely be included among the others. It's
definitely modern, and I'm sure it was considered from a modern point
of view, which I think is meaningful.


>> Is there
>> actually any evidence of significant GC success in this space?
>> Successes all seem to be controlled VM based languages like Java and
>> C#; isolated languages with no intention to interact with existing
>> native worlds. There must be good reason for that apparent separation
>> in trends?
>
>
> There are many techniques for mitigating GC problems in D, techniques that
> are not available in Java or C#. You can even do shared_ptr<> in D. You can
> use @nogc to guarantee the GC pause troll isn't going to pop up
> unexpectedly. There are a bunch of other techniques, too.

RC is no good without compiler support. One second you're arguing
precisely this case that useful RC requires extensive compiler support
to be competitive (I completely agree), and then you flip about and I
hear the "use RefCounted!" argument again (which also has no influence
on libraries I depend on).
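
To illustrate the point about library-level RC, here is a minimal C++
sketch (C++ only because shared_ptr<> came up above; the Texture type
and both function names are hypothetical, made up for illustration).
It shows the count traffic a library smart pointer generates on a plain
function call, and the manual borrow that avoids it:

```cpp
#include <memory>

// Hypothetical payload type, not from any real library.
struct Texture { int id = 0; };

// Taking the handle by value copies it: one increment on entry and one
// decrement on exit, even though ownership never really changes hands.
long count_seen_by_value(std::shared_ptr<Texture> t) {
    return t.use_count();
}

// Borrowing by const reference touches no count at all. The programmer
// has to write this by hand; an RC-aware compiler could in principle
// prove the copy redundant and elide the inc/dec pair itself.
long count_seen_by_ref(const std::shared_ptr<Texture>& t) {
    return t.use_count();
}
```

With a single owner, the by-value call observes a count of 2 and the
by-reference call observes 1. The library can't remove that extra pair
on its own, which is why I'd expect compiler support to matter.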

I've argued before that @nogc has no practical effect, unless you tag
it on main(), and then D is as good as if it didn't have memory
management at all. Non-C libraries are practically eliminated. Sounds
realistic in a shell app perhaps, but not in a major software package.
D's appeal depends largely on its implicit memory management, and
the convenience/correctness oriented constructs that it enables.

Both these suggestions only have any effect over my local code, which
ignores the library problem again.


> ARC simply is not a magic, no problem solution one can use without careful
> thought in large, complex systems. (Of course, neither is GC nor any other
> memory management scheme.)

I have never claimed it's magic. It's **workable**. GC is apparently
not, as you admitted a few paragraphs above.
The key difference is that ARC cost is localised, which presents many
options. GC cost is unpredictable, and gets progressively worse as
environments become more and more like mine. Short of banning memory
management program-wide (absurd, it's 2014), or having such an excess
(waste) of available resources that I'm sabotaging competitive
distinction, there aren't really any workable options.
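
As a rough sketch of what "localised" means here, assuming C++
shared_ptr as a stand-in for ARC (the Buffer type and run_frame are
hypothetical names for illustration): the release happens at the exact
point where the last owner goes away, so the cost shows up at a place
in the code that I can find and restructure.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical resource that records its own release in an event log.
struct Buffer {
    std::vector<std::string>& log;
    explicit Buffer(std::vector<std::string>& l) : log(l) {}
    ~Buffer() { log.push_back("free"); }
};

std::vector<std::string> run_frame() {
    std::vector<std::string> log;
    {
        auto a = std::make_shared<Buffer>(log);
        auto b = a;                    // second owner, count is 2
        log.push_back("work");
        a.reset();                     // count drops to 1, nothing freed
        log.push_back("more work");
    }                                  // b leaves scope: count hits 0, freed here
    log.push_back("end of frame");
    return log;
}
```

The log always comes out as "work", "more work", "free", "end of
frame". With a tracing collector the position of "free" is decided by
the collector instead, and may fall anywhere or not within the frame
at all.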

If it's true that ARC falls within 10% of the best GCs, surely it
must be considered a serious option, especially considering we've
started talking about ideas like "maybe we should make things with
destructors lower to use ARC"?

Performance, it turns out, is apparently much more similar than I had
imagined, which would lead me to factor that out as a significant
consideration. Which is the more _inclusive_ option?
And unless D is capable of the 'world's fastest GC', then ARC would
apparently be a speed improvement over the current offering too.

