deprecated delete and manual memory management
Steven Schveighoffer
schveiguy at yahoo.com
Fri Apr 29 06:03:12 PDT 2011
On Thu, 28 Apr 2011 16:14:48 -0400, Timon Gehr <timon.gehr at gmx.ch> wrote:
> Steven Schveighoffer wrote:
>> I would need to see what your benchmark is before I'd say it was
>> conclusive.
>
> Ok. The version with delete is GC_hints (delete does not give many semantic
> guarantees, ergo it is a hint); the version without delete is GC. It is also
> attached for convenience.
>
> This benchmark tests quite high loads on the GC heap, so for usual
> applications the GC won't perform all that badly. But then again, usual
> applications don't come with a memory allocation bottleneck.
>
> @Andrei: The benchmark also includes a comparison to malloc/free (version c_heap).
>
> Timon
>
>
> /* Benchmark GC vs GC & hints vs freelist vs C Heap
>  * My measurements on Intel Core Duo 2x2.4 GHz on Ubuntu 10.04 (lucid) Amd64
>  * version=GC:       1m45s
>  * version=GC_hints: ~6.8s
>  * //using custom allocators:
>  * version=freelist: ~3.7s
>  * version=c_heap:   ~4.5s
>  */
>
> import std.stdio;
> import std.c.stdlib;
> import std.c.string;
> import std.algorithm:swap;
>
> version=GC;//default
> //version=GC_hints;
> //version=freelist;
> //version=c_heap;
>
> // every variant that calls delete explicitly
> version(freelist) version=calldel;
> version(GC_hints) version=calldel;
> version(c_heap) version=calldel;
>
> class foo{
>     char[1020] data=void;
>     static foo freelist;
>     foo next;
>     version(freelist){
>         // class allocator: recycle dead instances through a freelist
>         new(size_t s){
>             if(freelist !is null){
>                 void *r=cast(void*)freelist;
>                 freelist=freelist.next;
>                 return r;
>             }
>             return cast(void*)new char[s];//could use malloc(s); effects are similar
>         }
>         delete(void *p){
>             if(p){
>                 foo oldfl=freelist;
>                 freelist=cast(foo)p;
>                 freelist.next=oldfl;
>             }
>         }
>     }
>     version(c_heap){
>         // class allocator backed directly by the C heap
>         new(size_t s){return malloc(s);}
>         delete(void *p){if(p) free(p);}
>     }
> }
>
> foo[100000] a;
>
> void main(){
>     srand(100);
>     int top;
>     int dir;//0: expected growing, 1: expected shrinking
>     foreach(t;0..10000000){
>         if(!top){
>             a[top++]=new foo;
>             dir=0;
>         }else if(top==100000){
>             top--;
>             version(calldel) delete a[top];
>             dir=1;
>         }else if(dir^!(rand()%3)){
>             top--;
>             version(calldel) delete a[top];
>         }else a[top++]=new foo;
>         if(!rand()%100) for(int i=0;i<100;i++) swap(a[0],a[rand()%top]);//mess around (note: ! binds before %, so this fires only when rand()==0)
>     }
> }
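(For anyone rerunning this: pick a variant by editing the version lines at the
top, e.g. comment out version=GC; and uncomment version=GC_hints;, then build
with optimizations, something like

    dmd -O -release -inline bench.d

where bench.d is whatever name the file was saved under.)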
I'll point out a couple of issues here:

1. You are never removing elements from the array when version(calldel) is
not enabled. That is, the GC might run, but it cannot collect any objects
when you just do top--, because the array still holds references to them.
Although this might not make a huge difference, it is not an equivalent
comparison. I'd suggest adding else clauses to the version(calldel)
statements that set a[top] to null (see the sketch after this list).
2. The GC is conservative, and you are allocating swaths of 1K objects.
It's quite possible that you are running into false-pointer issues, where
word-sized values elsewhere happen to look like pointers into these blocks
and keep them alive. This might be helped by a more precise GC (there is a
patch for one in bugzilla). David Simcha has also checked improvements for
allocation of large objects into github. These may also help with your
benchmark.
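For the first issue, here is roughly what I mean; a version condition takes
an else clause, so the change to the shrinking branches is small (the
top==100000 branch needs the same treatment):

    }else if(dir^!(rand()%3)){
        top--;
        version(calldel) delete a[top];
        else a[top]=null; // drop the reference so a collection can reclaim the object
    }else a[top++]=new foo;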
I too have seen the GC perform horribly in situations like this, where you
are allocating lots and lots of data. All you need to do to see it is append
to an array enough times; you will see huge pauses while the GC runs.
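A trivial way to reproduce that effect (just a sketch; the element count is
arbitrary):

    import std.stdio;

    void main(){
        int[] arr;
        foreach(i;0..10_000_000)
            arr~=i; // appends reallocate through the GC; watch for long pauses as it grows
        writeln(arr.length);
    }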
These are situations where delete helps because of the conservative
implementation of the GC. Yes, it is a valid point, but it doesn't prove
that delete is better against *any* GC. What we need to do is improve the
GC.
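One mitigation that works with the conservative GC we have today is to mark
pointer-free allocations so they are never scanned. A minimal sketch using
core.memory (this assumes the 1020-byte payload is allocated separately from
foo, since foo itself holds a next pointer and must still be scanned):

    import core.memory;

    // allocate a pointer-free payload; NO_SCAN tells the collector not to
    // scan these bytes for values that merely look like pointers
    void* payload = GC.malloc(1020, GC.BlkAttr.NO_SCAN);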
I tried setting a[top] to null in the GC version. For comparison, on my
system your original GC test took 2m42s. With a[top] set to null, the run
takes 2m36s, so the time does not decrease significantly.
Some more data:
I added the following line to the foreach loop:
if(!(t % 10000)) writeln(t, "\t", top);
There is some interesting behavior, particularly in the GC and GC_hints
versions. Your code obviously goes through cycles in the array, first growing
up to 100,000, then shrinking back down to 0, then repeating the cycle. There
is some noise from the rand % 3, but it is not enough to make the growth and
shrinking truly random.

What is interesting in this output is that in both the GC_hints and GC
versions, the initial growth to 100,000 is somewhat slow, with the GC_hints
version being a bit faster. After that, however, the difference is drastic:
the GC_hints version shrinks very quickly, while the GC version continues at
its slow pace. This is mildly surprising, because setting the top element to
null should be much faster than delete, and both versions are using the GC to
allocate new nodes.

What I find really interesting is that the subsequent growth back up to
100,000 is still slow in the GC version, but still very speedy in the
GC_hints version. Given that both are using the GC for allocation, I would
expect even footing. Something is definitely different between calling delete
and running a collection cycle. The GC version may leave more objects
allocated than the GC_hints version, but it should speed up as it collects
more object blocks to reuse. It's almost as if it's not reusing memory, but
my system isn't running out of memory, so I'm not sure exactly what's going
on.
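One experiment that might narrow it down (a sketch of the idea, not something
I've run): force a full collection each time top returns to 0, and see
whether the next growth phase speeds up. That would distinguish "the
collector isn't finding the garbage" from "the allocator isn't reusing it":

    import core.memory;

    // at the point where top wraps back to 0:
    GC.collect();  // force a full mark/sweep pass
    GC.minimize(); // return any freed pools to the OS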
This clearly needs some investigation; thank you for sharing your benchmark.
-Steve