Memory leak with dynamic array
Joseph Wakeling
joseph.wakeling at gmail.com
Sun Apr 11 09:50:11 PDT 2010
Thanks very much for the extended and detailed explanations. The assumeSafeAppend
function
indeed fixes the memory leakage. :-)
In reply to some comments ...
> > That is to be expected. The append function is not the most efficient
> > thing. It still must do a lot of work just to look up the valid length
> > and verify appending can be done, whereas the v2 "append" is a single
> > instruction!
>
> The v2 "append" seems two instructions:
> arr[arr_len] = j;
> arr_len++;
>
> And the C++ v1 too has to verify if the appending can be done, comparing
> the size with the capacity and if so assigning the item and increasing the
> size.
>
> I understand that the D code has to work more here, and I presume your code
> can't be improved. But those timings are a little sad anyway :-|
>
> > A better method is to set the length, and then write the data.
>
> What I have done in the D v2 example.
I was very happy to see that D _does_ have a 'reserve' function for arrays,
which I had been missing compared to C++ (it's not mentioned in the array
docs).
Still, I don't think that pre-reserving the memory per se is the influencing
factor on the differences in performance. To see why, compare these two pieces
of code -- a non-leaking D code, and the equivalent in C++:
////////////// D example
/////////////////////////
import std.stdio;
void main()
{
double[] x;
foreach(uint i;0..100) {
x.length = 0;
writefln("Array capacity (1) = %u",x.capacity);
assumeSafeAppend(x);
writefln("Array capacity (2) = %u",x.capacity);
foreach(uint j;0..5000000)
x ~= j;
writefln("At iteration %u, x has %u elements.",i,x.length);
}
}
/////////////////////////
//////////// C++ example
/////////////////////////
#include <vector>
#include <iostream>
using namespace std;
int main()
{
vector<double> x;
for(unsigned int i=0;i!=100;++i) {
x.resize(0);
cout << "Vector capacity: " << x.capacity() << endl;
for(unsigned int j=0;j!=5000000;++j)
x.push_back(j);
cout << "At iteration " << i << ", x has " << x.size() << " elements." <<
endl;
}
return 0;
}
/////////////////////////
Note that in C++ the memory is not preassigned either. The difference
between the performance of these pieces of code is striking -- on my
machine the D example takes about 70--75 seconds to run, whereas the
C++ example speeds through it in 10s or less.
D also uses about 20% more memory than the C++ even though the C++ code
declares a higher capacity for the vector (above 8 million) than D does
for the array (a little over 5 million).
I don't know if it was naive to assume that D's dynamic arrays would be
equivalent to vectors in C++. But it's clear that the D array appender
~= is much slower than the C++ vector push_back().
Using an Appender class and put() (from std.array) is even slower,
despite the std.array docs recommending this over ~. :-(
It's disappointing because I'm loving so much of D's syntax, but I can
write far more efficient code (with less explicit memory management)
in C++ ...
More information about the Digitalmars-d-learn
mailing list