Anything in the review queue?

Mon Mar 21 22:25:16 PDT 2011

Andrei Alexandrescu wrote:
> On 3/21/11 12:18 PM, dsimcha wrote:
>> == Quote from Don (nospam at nospam.com)'s article
>>>> Ok, I don't know much about how BigInts work under the hood.  I used a
>>>> fairly simple implementation of Euler's algorithm here.  Is there
>>>> something much more efficient for BigInts?
>>> Yes. Euler's algorithm performs very poorly for BigInts. In fact, you
>>> probably want to use the BinaryGCD algorithm even for builtins.
>>> There is a sub-quadratic algorithm for GCD, but it's very complicated.
>>> (It's not the one Kenny posted).
>>>>> I guess the primary question is:
>>>>> If there were a BigRational type, would you still want rational?
>>>>> Ie, are there use cases for rational over built-in types (rather 
>>>>> than it
>>>>> just being a workaround for a lack of a BigRational type)?
>>>>
>>>> My thoughts were that Rational would be **primarily** used with BigInt,
>>>> but occasionally, if you really know what you're doing with regard to
>>>> overflow, you could used fixed width as an optimization.
>>> This isn't sounding so promising.
>>> I hate to be saying that.
>>
>> I've thought about this some more.  The following **may** be a good 
>> way to
>> proceed.  I'd like Don's input, since I don't know in detail what he 
>> was planning.
>>
>> 1.  Review the design of my rational lib with a focus on whether 
>> sufficient
>> infrastructure for creating specializations without introducing 
>> breaking changes
>> is available.
>>
>> 2.  If no issues not related to the lack of specialization for BigInt 
>> are raised,
>> accept Rational into Phobos for now.
>>
>> 3.  At his leisure, Don can write a specialization of Rational 
>> specifically for
>> BigInt (or even just a specialization of gcd, if that's all that's 
>> needed).  Once
>> this is done, we include it as an optimization, but without changes to 
>> the
>> interface.  Every integer type except BigInt still uses the generic 
>> implementation.
>>
>> 4.  Maybe we provide hooks for specializing certain 
>> performance-critical things
>> like GCD, in case someone wants to use their own custom BigInt type 
>> with Rational
>> without having to modify the code for Phobos.  For example, a BigInt 
>> type might
>> optionally define a gcd() method.  Rational would look for this and 
>> use it if it
>> exists, or use the generic fallback algorithm otherwise.
> 
> I think by and large failure to define rational for BigInt in a way that 
> has many commonalities with rational for built-in integrals reflects a 
> failure of BigInt. By its very design, BigInt is supposed to 
> transparently replace integers.If this is largely the case, the design
> has succeeded. If the design of anything must be reworked essentially 
> from scratch to work with BigInt instead of int et al, then the 
> abstraction may be too porous to be useful.
> 
> There are a few approaches we can take from here. One is to define 
> certain traits that differentiate BigInt from other integrals (e.g. 
> preferAdditionToMultiplication or whatnot),  and then design Rational to
> use those traits. Another is of course to specialize the entire Rational 
> on BigInt. Third would be to specialize certain core routines (gcd and 
> friends) for BigInt and keep Rational agnostic.

You're missing something important here. Rational big integers are a 
very well defined area of research, with its own set of algorithms.
Although you can pretend a BigInt behaves as an int, and make a rational 
type, the performance will be pathetic, and everyone will laugh at you.

> Anyhow, BigInt should be cheap to copy (i.e. use refcounting). If there 
> are good reasons for which that's impossible, we should give up on the 
> dream of enacting that assumption throughout Phobos. Don?

Well, it uses copy-on-write so copying itself isn't expensive.
But temporaries are never going to be cheap. Things like bigint *= 
bigint is  NEVER an in-place operation (the size is twice as big, so a 
memory allocation is always required). Even bigint += bigint may trigger 
a memory allocation.
Refcounting might be better than copy-on-write, in the end, but probably 
not a huge win.
One thing which I will do eventually is implement the small-integer 
optimisation (so anything smaller than 31 bits/63 bits is stored inside 
the pointer and doesn't have any memory allocation at all). But that 
doesn't help the general case.