Is garbage detection a thing?

Mark was330 at via.tokyo.jp
Sun Nov 29 16:05:04 UTC 2020


Hi,

can I ask you something general? I don't know anyone else I could 
ask. I'm a hobbyist with no science degree or job in computing, and 
I don't know any other programmers.

I don't have a good understanding of why "garbage collection" is a 
big thing while "garbage detection" is, as far as I can tell, not a 
thing at all.

I want to get rid of undefined behavior. So I ask myself: what is 
it, actually? Most of the time it's corrupted heap memory, and my 
compiled C++ program failing in ways I thought were kind of 
impossible.

Now I could follow all the C++ guidelines and almost everything 
would be okay. But many people went in a different direction, e.g. 
in 1995 Java was released and you would use garbage collection 
instead.

What I don't understand is: today there exist tools for C++ 
(allocator APIs for debugging purposes, AddressSanitizer, or maybe 
also MPX) that can detect that your program tried to use a memory 
address that was actually freed and invalidated. So why did Java 
and other languages not stop there, but instead build a system that 
keeps every address alive for as long as it is used?
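
To make clear what I mean by "detection": if I compile something 
like this with AddressSanitizer (g++ or clang++ with 
-fsanitize=address), the run is aborted with a heap-use-after-free 
report instead of the program silently continuing with a corrupted 
heap:

    // use_after_free.cpp
    // build: g++ -fsanitize=address -g use_after_free.cpp
    #include <iostream>

    int main() {
        int* p = new int(42);
        delete p;                  // the address is now freed and invalid
        std::cout << *p << "\n";   // ASan stops here: heap-use-after-free
    }

That is already the kind of error I'm asking about, only as a 
separate debugging tool instead of a core part of the language.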

One very minor criticism I have is: with GC there can be 
"semantically old data" (a problematic term, sorry) which is still 
alive and valid, and the language gives me the feeling that this is 
a nice system. But the overall behavior isn't necessarily correct; 
it's just much better than a corrupted heap, which could lead to 
everything crashing soon.

My bigger criticism is that compilers with garbage collection are 
big pieces of software (with big libraries) and tend to have defects 
in other parts. E.g. two different such compilers recently gave me 
wrong line numbers in error messages.

And other people's criticism (not really mine) is that garbage 
collection increases memory use and can block threads when they 
access shared memory, or something like that.




So... I wonder where the languages are that only try to give this 
type of error: your faulty program has (at runtime) used memory 
which has already been freed. Not garbage collection. The compiled 
program just stops all execution and tells me this, so that I can 
go on with my manual memory management.
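
A rough sketch of what I imagine (I know debug tools like Electric 
Fence and page-heap allocators already work roughly this way): every 
allocation gets its own page, and "free" only revokes access to the 
page instead of recycling the address, so any later use of a stale 
pointer traps immediately. Assuming POSIX mmap/mprotect, something 
like:

    // guard_alloc.cpp -- toy "garbage detection" allocator (POSIX)
    #include <sys/mman.h>
    #include <unistd.h>
    #include <cstddef>
    #include <cstdio>

    static std::size_t round_to_pages(std::size_t n) {
        std::size_t page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
        return ((n + page - 1) / page) * page;
    }

    void* detect_alloc(std::size_t n) {
        void* p = mmap(nullptr, round_to_pages(n), PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        return p == MAP_FAILED ? nullptr : p;
    }

    void detect_free(void* p, std::size_t n) {
        // Never unmap or reuse the address; just make it inaccessible,
        // so any later read or write through a stale pointer faults.
        mprotect(p, round_to_pages(n), PROT_NONE);
    }

    int main() {
        int* x = static_cast<int*>(detect_alloc(sizeof(int)));
        *x = 1;
        detect_free(x, sizeof(int));
        *x = 2;   // crashes right here -- exactly the error I want reported
        std::printf("never reached\n");
    }

Of course this wastes a whole page per allocation and never gives 
memory back, which is why I only imagine something like it for a 
debug build.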

Now, from today's perspective I could use Rust to create a very 
formal representation of my requirements and write a program that is 
very deterministic and at the same time uses very few resources.

But I'd like to pretend there is no Rust (because the lifetimes and 
some other things make it a domain-specific language to some 
extent), and ask about the runtime solution instead. Why shouldn't 
it be a good thing? Has it been tried?

All I would *need* to do additionally is divide the project into two 
sub-projects, as is done with C++: a debug build and a release 
build.

Then the debug build would run on a virtual machine that uses type 
information from compilation for garbage detection, but not garbage 
collection.
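
I have no idea how the real thing would look, but as a toy model of 
what that debug "VM" could do: every pointer becomes a handle that 
carries a generation number, and every dereference checks whether 
the slot was freed in the meantime. (All names and the layout here 
are just my own invention, not any existing tool.)

    // checked_ptr.cpp -- toy model of detection via generation counters
    #include <cassert>
    #include <cstddef>
    #include <cstdio>
    #include <vector>

    struct Slot { void* mem; unsigned gen; };
    static std::vector<Slot> slots;   // the debug build's bookkeeping table

    template <typename T>
    struct checked_ptr {
        std::size_t idx;
        unsigned gen;

        T& operator*() const {
            // Stop all execution if the slot was freed in the meantime.
            assert(slots[idx].gen == gen && "use of freed memory detected");
            return *static_cast<T*>(slots[idx].mem);
        }
    };

    template <typename T>
    checked_ptr<T> checked_new(T value) {
        slots.push_back(Slot{new T(value), 0});
        return {slots.size() - 1, 0};
    }

    template <typename T>
    void checked_delete(checked_ptr<T> p) {
        assert(slots[p.idx].gen == p.gen && "double free detected");
        delete static_cast<T*>(slots[p.idx].mem);
        slots[p.idx].mem = nullptr;
        slots[p.idx].gen++;           // invalidates every outstanding handle
    }

    int main() {
        checked_ptr<int> p = checked_new<int>(42);
        std::printf("%d\n", *p);
        checked_delete(p);
        std::printf("%d\n", *p);      // assertion fires instead of UB
    }

The release build would then compile the same handles down to plain 
pointers with no checks.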

And once I have tested all runtime cases of my compiled software, 
which runs slowly but quite deterministically, I would go on and 
make the release build.

And if the release build (which is faster) does not behave 
deterministically, I would fix the "VM/Non-VM compiler" I'm 
talking about until the release build shows the same behavior.

I guess there is a way this approach could fail: timing may have an 
influence and make the VM behave differently from the non-VM build 
(e.g. native x64). And it's surely not easy to write a compiler that 
generates code which traces pointers and still leaves you a lot of 
freedom to cast and alter pointers. In some ways it is doomed to 
fail, but there are language constructs that do work.
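
E.g. I don't see how a scheme with traced handles could ever follow 
a pointer that gets hidden in a plain integer for a while; this 
compiles fine but leaves no metadata to check (again just my own 
illustration):

    #include <cstdint>
    #include <cstdio>

    int main() {
        int* p = new int(7);
        std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(p);  // pointer hidden in an integer
        delete p;
        int* q = reinterpret_cast<int*>(bits);  // resurrected pointer, no bookkeeping attached
        std::printf("%d\n", *q);                // use-after-free that is hard to detect
    }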

There have been C interpreters, iterators as pointer replacements, 
or other kinds of replacements. BTW I know of CINT and safe-c, but 
I'm not happy with how these projects look from the outside.

If I had the education and persistence I would like to try to build 
my own "safe-c", yet another one. But I think it's better to ask you 
why garbage detection isn't a popular thing. Does it exist at all as 
a core idea in a language (probably a C improvement)?

Where are the flaws in my thinking?

I currently think that if I were serious about it (I'm not 100% 
sure), I should just find a C interpreter. CINT? Or that one 
academic compiler from five years ago? (I believe that compiler 
needs a special CPU.) To be honest, I have no clue. Just some 
"interpreter" that tries to mimic pointers as closely as it can, and 
later I would be free to port the code to Microsoft's C.

Or maybe I could use the safe subset of D? But I believe D uses 
garbage collection. I know nothing about it, sorry.

What I tried in the past few days was porting working Go code to C. 
I wanted the C code to stay Go-idiomatic, so I was looking for the 
common subset of Go and C. Well, I used macros, had a few ideas, but 
this C style quickly failed. Really frustrating. But.. I'm not 
planning to give up. ;)

Thanks a lot for reading, and sorry for all the text that is 
off-topic and not related to D.

