General Problems for GC'ed Applications?
Unknown W. Brackets
unknown at simplemachines.org
Sun Jul 23 01:58:38 PDT 2006
Personally, I see it as good coding practice in D to use delete on
those things you know you need or want to delete immediately. Just
don't go nuts over the things you don't.
Typically, in a well-programmed piece of software, the number of
allocations you can't be entirely sure about deleting is much smaller
than the number you are sure about. Without a GC, that small
percentage causes a huge amount of headache. With one, you can ignore
those cases and worry only about the easily determined ones.
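For example, something like this rough sketch (the Buffer class, the
cache, and the file-processing routine are all made up purely for
illustration):

class Buffer
{
    ubyte[] data;
    this(size_t size) { data = new ubyte[size]; }
}

// A hypothetical cache whose entries have no obvious point of death.
char[][] cache;

void processFile(char[] name)
{
    // Lifetime is obvious and local: delete it explicitly when done.
    Buffer scratch = new Buffer(64 * 1024);
    scope (exit) delete scratch;

    // ... do the actual work with scratch here ...

    // Lifetime is unclear (the entry may be looked up later, shared,
    // and so on): don't agonize over it; the GC will reclaim it
    // eventually.
    cache ~= name ~ ": processed";
}

void main()
{
    processFile("example.txt");
}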
Throwing caution to the wind and using the GC for everything isn't, in
my opinion, a good idea. If anyone disagrees with me, I'd like to hear
it, but I know of no reason why you'd want to do that. I realize many
people do throw caution to the wind, but many people also drive drunk.
That doesn't mean anyone recommends it, necessarily.
#1 does not seem to be a severe problem to me. Memory usage will be
higher, but that's why there's flex for cache and buffers.
#2 would be a problem for any application. I do not estimate that my
application, if I did not track down every allocation (but deleted only
when I knew for sure), would use even 50% more memory.
As such, I think it would be difficult to show a piece of well-written
software that runs into this problem with the GC but not without it.
IMHO, if a program requires so much memory that swapping starts
happening frequently, it's a bug, a design flaw, or a fact of life.
It's not the garbage collector.
Yes, a collect will cause swapping - if you have that much memory in
use. Ideally, collects won't happen often (they can't just happen
whenever anyway; they happen when you use up milestones of memory), and
you can disable/enable the GC and run collects manually when it makes
the most sense for your software, as in the sketch below.
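A minimal sketch, assuming the std.gc module's disable, enable, and
fullCollect calls (loadLevel is just a made-up name):

import std.gc;

void loadLevel()
{
    // Keep the collector out of the way during a latency-sensitive
    // burst of allocation, then collect at a point we choose.
    std.gc.disable();
    scope (exit)
    {
        std.gc.enable();
        std.gc.fullCollect();   // collect when it suits the program
    }

    // ... allocate and build up the level data here ...
}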
Failing that, software which is known to use a large amount of memory
may need to use manual memory management. Such software will likely
perform poorly anyway.
#3 depends on the above cases being true. Some competition may indeed
happen, but I think it would take more analysis to see how strongly it
would affect things.
Neither is this a new problem; programs will always compete for main
memory. #3 is only a problem when taking into account the higher memory
usage, and is negligible if the memory usage is not much higher.
I would say these problems are heavily dependent on application design,
code quality, and purpose. In other words, these are more problems for
the programmer than for the garbage collector. But this is only my opinion.
As for the risks: (a) I see as the OS's problem where #1 is an issue;
(b) I see as a problem regardless of garbage collection; (c) I agree
with - as mentioned above, these should be balanced, and only when it
is not of benefit should the "leak" be left for the GC.
I'm afraid I'm not terribly familiar with the dining philosophers
problem, but again I think this is a problem only somewhat aggravated
by garbage collection.
Most of your post seems to be concerned wholly with applications that
use at least the exact figure of Too Much Memory (tm). While I realize
there are several special cases where such usage is necessary or
acceptable, I am at a loss to think of any general or practical
reasons, aside from poor code quality... or database systems.
The software on my home computer which uses the most memory uses
nothing even close to the amount of system memory I have. Indeed, the
sum of memory use on my machine with several programs running is still
less than what typically shipped with machines even a few years ago
(I'm putting that number at 512 megabytes).
The servers I manage have fairly standard amounts of RAM (on average,
let's say, 2 gigabytes). Typically, even in periods of high traffic,
they do not take much swap. In fact, Apache is garbage collected
(pools/subpools, that is) and doesn't seem to be a problem at all. PHP,
which is used on many of these servers, is not garbage collected
(traditional memory management) and tends to hog memory just a bit.
MySQL and other database systems obviously take the largest chunk. For
such systems, you don't want any of your data paged, ever. You
typically have large, static cache areas which you don't even want
garbage collected, and you never realloc/free until the end of the
process. These areas would not be garbage collected and the data in
them would not be scanned by the garbage collector.
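To illustrate, here is a rough sketch only - the cache structure and
names are invented - assuming C-style allocation through std.c.stdlib
so the memory lives entirely outside the GC heap:

import std.c.stdlib;

// A large, long-lived cache area: the collector never manages it and,
// since it holds no pointers into the GC heap, never needs to scan it.
struct PageCache
{
    ubyte* pages;
    size_t size;
}

PageCache* createCache(size_t bytes)
{
    PageCache* c = cast(PageCache*) std.c.stdlib.malloc(PageCache.sizeof);
    c.pages = cast(ubyte*) std.c.stdlib.malloc(bytes);
    c.size = bytes;
    return c;
}

void destroyCache(PageCache* c)
{
    // Freed once, explicitly, at the end of the process.
    std.c.stdlib.free(c.pages);
    std.c.stdlib.free(c);
}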
In fact, you'd normally want your database on a dedicated machine with
extra helpings of memory for these reasons. Whether or not it was
garbage collected wouldn't affect whether it was suitable only as a
single instance on a single machine. As above, this is a matter of the
software, not of the GC.
So my suggestion is that you look at the limitations of your software,
design, and development team and make your decisions from there. A
sweeping statement that garbage collection causes a dining philosophers
problem just doesn't seem correct to me.
Thanks,
-[Unknown]
> I see three problems:
>
> 1) The typical behaviour of a GC'ed application is to require more
> and more main memory but not to need it. Hence every GC'ed
> application forces the OS to diminish the size of the system cache
> held in main memory until the GC of the application kicks in.
>
> 2) If the available main memory is insufficient for the true memory
> requirements of the application and the OS provides virtual memory
> by swapping out to secondary storage, every run of the GC forces
> the OS to slowly swap back all data for this application from
> secondary storage and runs of the GC occur frequently, because main
> memory is tight.
>
> 3) If there is more than one GC'ed application running, those
> applications compete for the available main memory.
>
>
> I see four risks:
>
> a) from 1: The overall reaction of the system gets slower in favor
> of the GC'ed application.
>
> b) from 2: Projects decomposed into several subtasks may face
> severe runtime problems when integrating the independently and
> successfully tested modules.
>
> c) from 2 and b: The reduction of man time in the development and
> maintenance phases from not being forced to avoid memory leaks may
> be more than offset by an increase of machine time by a factor of
> 50 or more.
>
> d) from 1 and 3: A more complicated version of the dining
> philosophers problem is introduced. In this version every
> philosopher is allowed to rush around the table and grab all unused
> forks and declare them used, before he starts to eat---and nobody
> can force him to put them back on the table.
>
>
> Conclusion:
>
> I know that solving the original dining philosophers problem took
> several years, and I do not see any awareness of this more
> complicated version, which arises from using a GC.
> Risks c) and d) are true killers.
> Therefore GC'ed applications currently seem to be suitable only if
> they are running single instance on a machine well equipped with
> main memory and no other GC'ed applications are used.
> To assure that these conditions hold, the GC should maintain
> statistics on the duration of its runs and the frequency of calls.
> This would allow the GC to throw an "Almost out of memory" warning.