General Problems for GC'ed Applications?

Unknown W. Brackets unknown at simplemachines.org
Mon Jul 24 21:54:30 PDT 2006


Karen,

Your response seems to indicate a lack of knowledge about garbage 
collection, but perhaps I'm simply misreading what you said.

First of all, let's get this clear:

1. Not every allocation will cause a collection.
2. It is, in fact, unlikely that two consecutive allocations will ever 
trigger two consecutive collections - because of pooling.
3. Pooling does increase memory use, but it also means fewer collections.

Any program that triggers collections frequently is badly written.  If 
you must ACTIVELY and continuously allocate chunks of RAM larger than 
64k, you either:

   - need to avoid using the GC for those allocations.
   - need to disable the GC while allocating that data (see the sketch 
     after this list).
   - have a serious design flaw.
   - are not a reputable or skilled programmer.
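
For concreteness, here is a minimal sketch of the second option, 
assuming the D1-era std.gc module's disable/enable calls; the function 
name and sizes are made up for illustration:

    import std.gc;

    void buildHugeTable()
    {
        // Hypothetical example: suspend collections for the duration
        // of a burst of large allocations, then let the GC resume.
        std.gc.disable();
        scope(exit) std.gc.enable();

        byte[][] chunks;
        for (int i = 0; i < 100; i++)
            chunks ~= new byte[256 * 1024];  // each chunk larger than 64k
        // ... fill and use chunks ...
    }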

Assuming you won't agree with the above, though: clearly garbage 
collection simply does not work for the *uncommon* and *impractical* 
case of constant, large allocations.  If you do not agree that this is 
uncommon in computer programming, please say so, and I will not bother 
responding to you any further.

Furthermore, it is entirely practical to write generational garbage 
collectors, or collectors using other methods or processes.  The 
current implementation in D does not do this; if it did, this problem 
could be avoided.

Regardless, I maintain that such a program would perform poorly.  I 
don't care if you have 20 gigs of main system memory.  Any program that 
is constantly allocating and filling large amounts of memory WILL BE 
SLOW, at least in my experience.

Please understand that the garbage collector, at least in D, works 
something like this (as far as I understand):

1. A batch of memory is allocated.  I believe this happens in fixed 
chunks of 64k, but the chunk size may scale.

2. From this memory, parts are doled out.

3. If a "large" allocation happens, there is special code to handle this.

For more information, please see the source code to Phobos' garbage 
collector, available in src/phobos/internal/gc/gcx.d.
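
To make the pooling idea concrete, here is a deliberately simplified 
sketch of the scheme described above.  It is not the actual gcx.d 
code, and the names (Pool, gcAlloc, POOLSIZE) are mine:

    import std.c.stdlib;

    const uint POOLSIZE = 64 * 1024;  // the 64k batch size described above

    struct Pool
    {
        byte[POOLSIZE] data;  // one batch of memory grabbed from the OS
        uint used;            // bytes already doled out

        // Returns null when this pool cannot satisfy the request.
        byte* allocate(uint size)
        {
            if (used + size > POOLSIZE)
                return null;
            byte* p = &data[0] + used;
            used += size;
            return p;
        }
    }

    byte* gcAlloc(Pool*[] pools, uint size)
    {
        if (size >= POOLSIZE)
        {
            // "Large" allocations bypass the pools entirely; special
            // code hands them whole chunks of their own.
            return cast(byte*) std.c.stdlib.malloc(size);
        }
        foreach (pool; pools)
        {
            byte* p = pool.allocate(size);
            if (p !is null)
                return p;  // served from an existing pool: no collection
        }
        // All pools are full.  Only here would a real collector run a
        // collection, then grab a new 64k pool from the OS if that did
        // not free enough.
        return null;
    }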

You could, theoretically, tell your garbage collector not to scan the 
memory range you allocated, so it would never get swapped back in 
(unless the range also contains pointers).  In that case, I again 
point to the programmer as the one at fault.
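
As a rough illustration of that point, assuming the D1-era std.gc and 
std.c.stdlib modules (the buffer name and size are made up):

    import std.gc;
    import std.c.stdlib;

    void main()
    {
        const uint N = 512 * 1024 * 1024;  // a large, pointer-free buffer

        // malloc'd memory lives outside the GC heap; collections never
        // scan it, so the OS is free to leave it swapped out.
        double* samples = cast(double*) std.c.stdlib.malloc(N);
        assert(samples !is null);

        // Only if the buffer held pointers into the GC heap would we
        // have to register it - which would make every collection scan
        // (and swap in) the whole range:
        //
        //     std.gc.addRange(samples, cast(byte*) samples + N);
        //     ... use the buffer ...
        //     std.gc.removeRange(samples);

        std.c.stdlib.free(samples);
    }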

Thus you could avoid swapping in those special cases where you need 
large amounts of memory.  Again, I do not believe such cases are 
common.  If you are unable to program memory-efficiently, I suggest 
you find a new occupation.

I hope you do not take offense at that, but I truly believe too many 
people these days try to force themselves into things they just aren't 
any good at.  Some people would make wonderful lawyers, but they think 
being a doctor is cooler, so they make their lives horrible.

Honestly, I feel like I'm debating how dangerous it would be to be hit 
by a sedan or an SUV.  I really don't care, it's going to hurt either 
way.  A lot.  The answer is not to get hit, not to say that we should 
all break our bones with sedans because it's not as bad.

I mean, really.  It's one thing to argue about theoretical problems but 
it's quite another to argue about impractical ones and accuse a 
methodology of being flawed because it could fail in these impractical 
cases.  That's just not the logic I was taught.  Doesn't jibe.

I really don't care to prove you wrong.  I've said what I'm going to 
say.  I may respond again if you seem reasonable and bring up something 
new; but if you bring nothing else new in (as with this post)... you've 
lost my interest.

Of course, this is only my opinion and understanding.

-[Unknown]


> Unknown W. Brackets wrote:
> 
> I disagree. Assume a non-GC'ed program that allocates 1.5 GB to 1.7 
> GB of memory, of which 0.7 GB to 0.9 GB are vital data. If you run 
> this program on a machine equipped with 1 GB, the OS will swap out 
> the 0.8 GB of data that is accessed infrequently. This program 
> therefore causes swapping only when it accesses data from the 
> swapped-out part, and the amount of swapped data will be 
> approximately bounded by double the size of the data that needs to 
> be swapped back.
> 
> This changes dramatically if you GC it, because whenever an 
> allocation exhausts the available main memory, the GC requires the 
> OS to swap all 0.8 GB back in, doesn't it? 
> 
> 
>> I'm afraid I'm not terribly familiar with the dining
>> philosopher's problem, but again I think this is a problem only
>> somewhat aggravated by garbage collection.
>>
>> Most of your post seems to be wholly concerned with applications
>> that use at least the exact figure of Too Much Memory (tm). 
> 
> It is not only somewhat aggravated. Assume the example given above 
> is doubled: two instances of that program, with the main memory not 
> merely doubled to 2 GB but increased to 4 GB or even more.
> 
> Again, both non-GC'ed versions of the program run without any 
> performance problems, but the GC'ed versions do not---although the 
> memory size is increased by a factor that allows the OS to avoid 
> swapping out any allocated data in the non-GC'ed case.
> 
> This is because both programs at least slowly increase their 
> allocations of main memory.
> 
> This goes without performance problems until the available main 
> memory is exhausted. The first program that hits the limit starts 
> GC'ing its allocated memory---and forces the OS to swap it all in. 
> Hence this first program is in danger of having all the memory 
> freed by its GC immediately eaten up by the other instance, which 
> continues running unaffected because its thirst for main memory is 
> satisfied by the GC of the other instance, if that GC frees memory 
> as soon as it recognizes it as garbage.
> 
> When this GC run ends, at least two cases can be distinguished:
> a) the main memory at the end of the run is still insufficient, 
> because the other application ate it all up. Then this instance 
> stops with "out of memory".
> b) the main memory at the end of the run by chance is sufficient, 
> because the other application was not that hungry. Then this 
> instance will start being performant again. But only for the short 
> time until the limit is reached again.
> 
> This is a simple example with only one processor and two competing 
> applications---and I believe that case a) can happen.
> 
> So I feel unable to prove that, on multi-core machines running 
> several GC'ed applications, case a) will never happen.
> 
> And even if case a) never happens, there might always be at least 
> one application running its GC. Hence swapping is always under 
> way. 
> 
>  
>> A sweeping statement that garbage collection causes
>> a dining philosopher's problem just doesn't seem correct to me.
> 
> Then prove me wrong.


