BigInts in Tango/Phobos

bearophile bearophileHUGS at lycos.com
Wed Sep 9 06:59:48 PDT 2009


I have tried to find Don's email address, but so far I have failed (and I have not seen him on the d.tango IRC channel), so I am posting here, even though this isn't the best place.

I think this is the data representation of a Tango BigInt:

alias uint BigDigit;
const BigDigit[] ZERO = [0];

struct BigUint {
  BigDigit[] data = ZERO;
}

struct BigInt {
  BigUint data;     // BigInt adds signed arithmetic to BigUint.
  bool sign = false;
}



I think the following representation may be better:

struct BigInt {
  size_t* data;
  int sizeVal;  // sign(sizeVal) is always the sign of the BigInt.
                // If data != null, then abs(sizeVal) is the size of the
                // memory block actually in use; otherwise sizeVal is the
                // value of this BigInt (useful for small BigInts), with
                // overflow tests too.
  int capacity; // The available size of the block.
}

So the operations on such BigInts are cheap when the number fits in a 32-bit int: the code just has to test whether data is null and then perform an overflow-checked operation on 32-bit values. This first part can be done in a small struct method that I hope will always be inlined. If data isn't null, there's a call to the bignum machinery.
When a 32-bit sizeVal produces an overflow, the value is promoted to a normal heap-allocated bigint.
Small integers are very common, so speeding up this case is very useful.

Overflow tests in C:
http://www.fefe.de/intof.html

The capacity/sizeVal pair is modeled on the size/capacity pair of the C++ vector, so the number can mutate in place and grow while avoiding some reallocations (GNU GMP also allows in-place mutation).

In theory, inside such a BigInt struct there's space for a 64-bit value (as a union), but I don't think 64-bit values are common enough to justify the small slowdown on smaller values.

Bye,
bearophile
