BigInts in Tango/Phobos
bearophile
bearophileHUGS at lycos.com
Wed Sep 9 06:59:48 PDT 2009
I have tried to find Don's email address, but so far I have failed (and I have not seen him on IRC in d.tango), so I am posting here, even though this isn't the best place.
I think this is the data representation of a Tango BigInt:
alias uint BigDigit;
const BigDigit[] ZERO = [0];

struct BigUint {
    BigDigit[] data = ZERO;
}

struct BigInt {
    BigUint data; // BigInt adds signed arithmetic to BigUint.
    bool sign = false;
}
I think the following representation may be better:
struct BigInt {
    size_t* data;
    int sizeVal;  // sign(sizeVal) is always the sign of the BigInt.
                  // If data != null then abs(sizeVal) is the actual used
                  // size of the memory block; otherwise sizeVal is the
                  // value of this BigInt (useful for small BigInts),
                  // with overflow tests too.
    int capacity; // The available size of the block.
}
So operations on such BigInts are cheap if the number fits in a 32 bit int (the code just has to test whether data is null and then perform a checked operation on 32 bit numbers). This first part can be done in a small struct method that I hope will always be inlined. If data isn't null, there's a call to the bigint machinery.
When an operation on a 32 bit sizeVal overflows, the value is converted into a normal heap-allocated bigint.
Small integers are very common, so speeding up this case is very useful.
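To make the idea concrete, here is a hedged sketch in C of that fast path, under my reading of the proposal; the names (SmallBigInt, smallbigint_add, bigint_add_large) are illustrative, not from any real implementation:

```c
#include <stddef.h>
#include <limits.h>

/* Hypothetical layout of the proposed BigInt. */
typedef struct {
    size_t *data;   /* NULL => the value lives directly in sizeVal */
    int sizeVal;    /* sign carries the sign of the BigInt */
    int capacity;   /* available size of the allocated block */
} SmallBigInt;

/* Fast path: if both operands are small, do a 32 bit add with an
   overflow test done BEFORE the addition (signed overflow is UB in C).
   Returns 0 if the result stayed small, 1 if the caller must fall
   back to the full multiprecision machinery. */
int smallbigint_add(SmallBigInt *r, const SmallBigInt *a, const SmallBigInt *b) {
    if (a->data == NULL && b->data == NULL) {
        int x = a->sizeVal, y = b->sizeVal;
        if (!((y > 0 && x > INT_MAX - y) ||
              (y < 0 && x < INT_MIN - y))) {
            r->data = NULL;
            r->sizeVal = x + y;
            return 0;           /* stayed small, no allocation */
        }
    }
    /* bigint_add_large(r, a, b);  -- promote and use the big path */
    return 1;
}
```

The whole fast path is a null test plus a guarded add, which is the part one would hope the compiler inlines.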
Overflow tests in C:
http://www.fefe.de/intof.html
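For illustration, a minimal example of one such portable signed-overflow test in C, in the style described at that page (the helper name is mine):

```c
#include <limits.h>

/* Test whether a + b would overflow a signed int, WITHOUT performing
   the addition: signed overflow is undefined behavior in C, so the
   check must come first. */
static int add_would_overflow(int a, int b) {
    if (b > 0 && a > INT_MAX - b) return 1;  /* would exceed INT_MAX */
    if (b < 0 && a < INT_MIN - b) return 1;  /* would go below INT_MIN */
    return 0;
}
```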
The capacity/sizeVal pair is modeled on the similar pair in a C++ vector, so the number can mutate in place and grow while avoiding some reallocations (GNU GMP also allows in-place mutation).
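A hedged sketch of what that vector-style growth could look like, assuming geometric (doubling) capacity growth; the names (BigIntBuf, bigbuf_reserve) are illustrative:

```c
#include <stdlib.h>

/* Hypothetical buffer part of the proposed BigInt. */
typedef struct {
    size_t *data;
    int sizeVal;
    int capacity;
} BigIntBuf;

/* Grow the block geometrically, as a C++ vector does, so repeated
   in-place growth costs amortized O(1) reallocations.
   Returns 0 on success, -1 on allocation failure. */
int bigbuf_reserve(BigIntBuf *b, int needed) {
    if (needed <= b->capacity) return 0;       /* already big enough */
    int cap = b->capacity ? b->capacity : 4;
    while (cap < needed) cap *= 2;             /* doubling growth */
    size_t *p = realloc(b->data, (size_t)cap * sizeof(size_t));
    if (!p) return -1;
    b->data = p;
    b->capacity = cap;
    return 0;
}
```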
In theory inside such a BigInt struct there's space for a 64 bit value (as a union), but I think 64 bit values aren't common enough to justify the small slowdown on the smaller values.
Bye,
bearophile
More information about the Digitalmars-d mailing list