GC and memory leaks

Ald Sannes aldarri_s at yahoo.com
Sun Nov 11 08:34:03 PST 2007


Hello.

I have good reason to believe there are bugs in the GC, in Phobos, or in the zlib that comes bundled with DMD 1.023.
Here is my main function:

void main(char[][] argumentList)
{
	std.gc.minimize(); 

	buildTemporaryIndex();

	std.gc.fullCollect(); 

	buildPermanentIndex();
	
	findWords();
}

The three functions are completely isolated from each other; they communicate only through disk IO (by the way, great library for file IO).  Before exiting from the first function, I explicitly go through each class's static members and delete them; for arrays and hash tables, I delete each element as well.

Yet the 800 Mbytes of memory are not being freed until the program terminates.

Next issue.
I have commented out everything except for the code that decompresses data in files for processing.  

void main()
{
	buildTemporaryIndex();
}

void buildTemporaryIndex()
{
	char[][] datasetFileList = listdir(Config.getInputDirectory());
	
	for(int i = 0; i < datasetFileList.length; i+=2)
	{
		indexFileStream = IndexDecompressor.gunzipFile(datasetFileList[i+1]);
		pageFileStream = IndexDecompressor.gunzipFile(datasetFileList[i]);

		delete indexFileStream;
		delete pageFileStream;

		std.gc.fullCollect(); 
		//break;
	}
}

	public static char[] gunzipFile(char[] fileName)
	{
		int zipFileSize = getSize(fileName);
		void [] zipFileContentRaw = read(fileName);
		void[] zipFileContent = uncompress(zipFileContentRaw, zipFileSize*2, 24);
		
		delete zipFileContentRaw;
		//delete zipFileContent;
					
		return cast (char []) zipFileContent;
		//return "";
	}

The problem is that, despite the delete statements and the calls asking the garbage collector to free all it can, the program holds on to some 100 Mbytes of main memory, which roughly corresponds to the size of the extracted data, until termination.
I speculate that, despite the gc.noRoots calls in the zlib wrapper, the memory leak happens there: the raw data in the array is being mistaken for pointers that point literally everywhere, so no memory is ever deallocated.
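
One way to test this speculation would be to mark both buffers as pointer-free by hand and see whether the memory is then released.  A minimal sketch follows; gunzipFileNoScan is a made-up name, and I am assuming that std.gc.hasNoPointers() is available in this Phobos release and that each array starts at the beginning of its GC block (Phobos may already make these calls internally).  The uncompress() arguments are the same as above.

	import std.file;
	import std.gc;
	import std.zlib;

	// Hypothetical variant of gunzipFile: tells the collector that neither
	// buffer contains pointers, so their contents should not be scanned.
	char[] gunzipFileNoScan(char[] fileName)
	{
		void[] zipFileContentRaw = read(fileName);
		// Assumption: hasNoPointers() takes the base address of a GC block.
		std.gc.hasNoPointers(zipFileContentRaw.ptr);

		void[] zipFileContent = uncompress(zipFileContentRaw, zipFileContentRaw.length * 2, 24);
		std.gc.hasNoPointers(zipFileContent.ptr);

		delete zipFileContentRaw;
		return cast(char[]) zipFileContent;
	}

If the 100 Mbytes were still held with this variant, the raw data being scanned for pointers would be a less likely cause.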

Third.
To parse HTML, I used std.regexp.replace().  On some files it looped, ate all memory in less than a minute, and crashed.

What can I do to help find the issues?  If it helps, I can post the entire source code, and even the data set (50 Mbytes).


And one more thing.  Please fix
http://www.digitalmars.com/d/1.0/dcompiler.html: the link labeled 'latest compiler' points to DMD 1.015.

Thanks


