Is garbage detection a thing?

Sun Nov 29 22:20:29 UTC 2020

On Sunday, 29 November 2020 at 16:05:04 UTC, Mark wrote:
> Hi,
>
> can I ask you something in general? I don't know anyone whom I 
> could ask. I'm a hobbyist with no science degree or job in 
> computing, and also know no other programmers.
>
> I have no good understanding why "garbage collection" is a big 
> thing and why "garbage detection" is no thing (I think so).

Detecting garbage requires extensive run-time instrumentation, 
the difficulties of which you have pointed out yourself. 
Moreover, what gets detected depends on the circumstances of a 
particular run, which is an argument against the debug/release 
strategy you proposed: there is no guarantee that you’ll find all 
problems in the debug build. Garbage collection also comes at a 
run-time cost, but strategies exist to minimise it, and a GC 
enables valuable language features besides. One such strategy is 
to minimise allocations, which improves performance under any 
memory management scheme.
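
To illustrate "minimise allocations", here is a minimal sketch 
that allocates one buffer up front and reuses it, instead of 
allocating inside the loop. The function name sumReusingBuffer 
and the sizes are made up for the example.

```d
// Hypothetical example: reuse one buffer rather than allocating per round.
long sumReusingBuffer(int rounds)
{
    // The only GC allocation, made once before the loop.
    auto scratch = new int[](4);
    long total = 0;

    foreach (round; 0 .. rounds)
    {
        // Overwrite the existing elements instead of calling `new` again.
        foreach (i, ref x; scratch)
            x = cast(int) i + round;
        foreach (x; scratch)
            total += x;
    }
    return total;
}

void main()
{
    // Rounds 0, 1, 2 contribute 6, 10 and 14.
    assert(sumReusingBuffer(3) == 30);
}
```

Fewer allocations mean fewer opportunities for a collection to 
be triggered, and the same trick helps with malloc/free as well.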

[...]
> What I don't understand is, when today there exist tools for 
> C++ (allocator APIs for debugging purposes, or 
> Address-Sanitizer or maybe also MPX) to just detect that your 
> program tried to use a memory address that was actually freed 
> and invalidated,
>
> why did Java and other languages not stop there but also made a 
> system that keeps every address alive as long as it is used?

Elimination of memory problems is much more valuable than 
detection. Recovering from memory errors at run time is 
unreliable.

> One very minor criticism that I have is: With GC there can be 
> "semantically old data" (a problematic term, sorry) which is 
> still alive and valid, and the language gives me the feeling 
> that it is a nice system that way. But the overall behavior 
> isn't necessarily very correct, it's just that it is much 
> better than a corrupted heap which could lead to everything 
> possibly crashing soon.

At least in D, you can prevent old data from hanging around for 
too long. See core.memory.
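
A rough sketch of what that can look like, using GC.collect and 
GC.minimize from core.memory. The helper releaseOldData is a name 
I made up. Note that the default collector is conservative, so 
this requests a collection but does not promise that a particular 
block is reclaimed.

```d
import core.memory : GC;

// Hypothetical helper: drop a reference and ask the GC to clean up now.
void releaseOldData(ref ubyte[] data)
{
    // Clearing the reference makes the block unreachable from here.
    data = null;

    // Request a full collection, then ask the GC to hand unused
    // memory back to the operating system where it can.
    GC.collect();
    GC.minimize();
}

void main()
{
    auto cache = new ubyte[](8 * 1024 * 1024);
    cache[] = 0xFF;
    assert(cache[$ - 1] == 0xFF);

    releaseOldData(cache);
    assert(cache is null);
}
```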

> Or maybe I could use the safe-c subset in D? But I believe it 
> uses garbage collection. I know nothing about it, sorry.

@safe D is not a subset, and indeed it uses garbage collection. 
The fact is that there are very few domains where this is a 
problem. Not all garbage collectors are equal either, so if you 
think garbage collection is bad in one language, this may not 
apply directly to another. In D the garbage collector is even 
pluggable, and various implementations exist. Have you seen the 
GC category on the blog? 
https://dlang.org/blog/2017/03/20/dont-fear-the-reaper/
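
As a sketch of how a different collector can be selected: 
druntime reads GC options from an rt_options array embedded in 
the program, or from a --DRT-gcopt=... command-line switch. I am 
assuming here that your druntime ships the "precise" collector; 
which names are available depends on the version. lastAfterFill 
is just a made-up function to have something that allocates.

```d
// Default runtime options baked into the executable.
// Assumption: this druntime version provides the "precise" GC.
extern(C) __gshared string[] rt_options = ["gcopt=gc:precise"];

// Hypothetical helper that allocates from whichever GC was selected.
int lastAfterFill(size_t n)
{
    auto xs = new int[](n);
    xs[] = 7;
    return xs[$ - 1];
}

void main()
{
    assert(lastAfterFill(1000) == 7);
}
```

The same choice can be made per run, without recompiling, by 
starting the program with --DRT-gcopt=gc:precise.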

BetterC is a subset of D that does not use garbage collection.

You may be interested in current work being done in static 
analysis of manual memory management in D: 
https://youtu.be/XQHAIglE9CU

The advantage of D is that all options are open. This allows the 
following approach:
1) Start development without worrying about memory. Should 
collection cycles be noticeable:
2) Profile your program and make strategic optimisations 
(see https://youtu.be/dRORNQIB2wA). If this is not enough:
3) Force explicit collection in idle moments. If you need to go 
further:
4) Completely eliminate collection in hot loops using @nogc 
and/or GC.disable. When even this is not enough:
5) Try another GC implementation. And if you really need to:
6) Switch to manual memory management where it matters.
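
Steps 3 and 4 might look roughly like the sketch below. The 
function sum and the loop counts are invented for illustration; 
GC.disable, GC.enable and GC.collect are from core.memory. Keep in 
mind that GC.disable only suppresses automatic collections; the 
runtime may still collect if it runs out of memory.

```d
import core.memory : GC;

// Step 4: @nogc makes the compiler reject any GC allocation in here.
int sum(const(int)[] values) @nogc nothrow pure @safe
{
    int total = 0;
    foreach (v; values)
        total += v;
    return total;
}

void main()
{
    // Allocate everything the hot section needs before entering it.
    auto values = new int[](1000);
    foreach (i, ref v; values)
        v = cast(int) i;

    // Step 4: no automatic collections while the hot loop runs.
    GC.disable();
    int result = 0;
    foreach (_; 0 .. 100)
        result = sum(values);
    GC.enable();

    // 0 + 1 + ... + 999
    assert(result == 499_500);

    // Step 3: an idle moment, so collect explicitly now.
    GC.collect();
}
```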

This makes starting a project in D a safe choice, in more than 
one sense of the word.

— Bastiaan.