BigInt memory usage

Steven Schveighoffer schveiguy at yahoo.com
Thu Oct 13 08:34:37 PDT 2011


On Thu, 13 Oct 2011 11:00:54 -0400, Marco Leise <Marco.Leise at gmx.de> wrote:

> Am 13.10.2011, 02:26 Uhr, schrieb bearophile <bearophileHUGS at lycos.com>:
>
>> On a 32 bit system this program has allocates about 100 MB of RAM:
>>
>>
>> import std.bigint;
>> int main() {
>>     static assert(BigInt.sizeof == 12);
>>     auto data = new BigInt[2_000_000];
>>     foreach (i; 0 .. data.length)
>>         data[i] = BigInt("9223372036854775808123");
>>
>>     int count;
>>     foreach (i; 0 .. 2_000_000_000) { count++; }
>>     return count;
>> }
>>
>>
>>
>> It means about 50 bytes/biginteger.
>>
>> Every zero BigInt struct needs 12 bytes inside the 'data' array.
>>
>> The represented numeral requires about 73 bits, this means 3 words  
>> (with lot of wasted space):
>>
>>>>> l = 9223372036854775808123
>>>>> from math import log
>>>>> log(l, 2)
>> 72.965784284662092
>>
>>
>> Using so much memory allows to store less numbers, and makes them  
>> slower too (I have hit both problems). Do you know why D BigInts use so  
>> much memory?
>>
>> Bye,
>> bearophile
>
> So the BigInt struct has a size of 3 machine words:
> "data": a dynamic array (2 words) pointing to a block of uints
> "signed": a flag (1 word including alignment) that can be avoided
>            if you only need positive BigUInts.
>
> During parsing the "data" array is set to a predicted size of 4 uints  
> (16 bytes) for your number, which makes a total of 28 bytes. That is  
> what you must expect as a minimum from the implementation. Now when I  
> run your code I get between 45 to 46 bytes per BigInt. So that is a  
> difference of 17 to 18 bytes.
> That is the overhead of D's memory management. You get the same 17 to 18  
> additional bytes if you replace your BigInt with a dynamic array of  
> uints and initialize each of them like this:
>
>       auto data = new uint[][2_000_000];
>       foreach (i; 0 .. data.length)
>           data[i] = new uint[4];

Allocating an array of 4 uints allocates a block large enough to hold 7  
(block of 32 bytes)  This is likely where the overhead is coming from.

I'd have to look at the BigInt code, but if it's not using append  
operations, it should probably be creating the array using GC.malloc  
instead of new-ing an array.

-Steve


More information about the Digitalmars-d mailing list