Numeric access to char[]

Brad Roberts braddr at puremagic.com
Fri Aug 25 15:11:29 PDT 2006


On Fri, 25 Aug 2006, Chad J wrote:

> nobody wrote:
> > Chad J wrote:
> > 
> > > nobody wrote:
> > > 
> > > > Chad J wrote:
> > > > 
> > > > > nobody wrote:
> > > > > ....
> > > > > 
> > > > > > 
> > > > > > So you will have to do it manually. I would like to suggest that if
> > > > > > you can pad the char[] to ensure its .length % 8 == 0 then you can
> > > > > > cast it to a ulong and your shifting will be faster.
> > > > > 
> > > > > 
> > > > > 
> > > > > Sure about ulong?  In my spare time I made my own minimal Bignum
> > > > > implementation.  I'm not sure about shifts, but for addition with
> > > > > carrying it was faster to use size_t (uint in this case) rather than
> > > > > ulong.  I wonder if maybe the compiler could optimize it better if it
> > > > > didn't have to emulate 64 bit integers.  Or my benchmark was borked.
> > > > 
> > > > 
> > > > 
> > > > I certainly could be wrong but I would be quite surprised. I would be
> > > > interested to see what tests you ran that suggest addition with carry
> > > > was faster.
> > > 
> > > 
> > > This BigNum I made is very incomplete and probably slow.  Changing the
> > > size of it's digits shouldn't do anything besides just that, though.
> > > 
> > > I have attached 2 .d files, BigNum.d is the implementation, main.d is the
> > > benchmark.  I compiled it with dmd -O -inline -release.  At the top of
> > > BigNum.d there is an alias that I use to switch between size_t and ulong
> > > variables used in the computation.
> > > 
> > > Here are my results.  See main.d for details of the setup.
> > > 
> > > Microseconds to complete 1000000 additions using size_t:
> > > 786779
> > > 776801
> > > 793653
> > > 783896
> > > 787937
> > > 789150
> > > 
> > > Microseconds to complete 1000000 additions using ulong:
> > > 866767
> > > 860499
> > > 866921
> > > 865791
> > > 860874
> > > 858154
> > 
> > 
> > Thanks for taking the time to post this. It will be a little while before I
> > can sit down and play with the code you sent so I thought I would give you
> > my first impression. It looks like you have tested the time it takes to add
> > a 20 "digit" number to an 18 "digit" number:
> > 
> >   BigDigit[20] test1;
> >   BigDigit[18] test2;
> > 
> > You define these numbers as one of the two:
> > 
> >   alias size_t BigDigit; // for optimal performance
> >   alias ulong BigDigit;
> > 
> > I think the easiest way to suggest the problem with your test is in finding
> > how many bytes we are adding in each case:
> > 
> >   20 * 4 = 80 bytes
> >   20 * 8 = 160 bytes
> > 
> > What your results actually seem to be saying is that treating char[160] as a
> > ulong takes slightly longer than treating char[80] as a uint. I am still
> > fairly convinced that if you test again with an equal number of bytes you
> > should find that the ulong case will be faster. If you have a chance to run
> > the test before me I would be interested to hear the results.
> > 
> 
> Ah, good catch.  That does turn the results around, now the ulong leads
> slightly instead of the other way around.
> 
> Changed
> BigDigit[20] test1;
> BigDigit[18] test2;
> 
> to
> 
> ulong[20] test1;
> ulong[18] test2;
> 
> 
> w/ size_t:
> 880821
> 905326
> 871996
> 895222
> 897846
> 
> w/ ulong:
> 849123
> 848298
> 849372
> 852560
> 895887
> 
> Another thing with the test... random numbers.  It probably won't matter much,
> but when using 64 bit digits, only the lo 32 bits get filled by phobo's rand()
> function, which means no carrying, ever.  Also, it might be interesting to do
> extreme cases like where the numbers always have all of the bits set, which
> forces carrying all the time.
> 
> I also noticed that I broke the unittest when I tried to free up any memory
> the carrying might have wasted, so I've fixed that and here's the new version.
> (This really didn't have an effect on speed, just correctness)

Have any of you looked at the GMP library?  It's a highly optimized large 
number library.  It supports both large integers and large floats.  
Providing a D wrapper on top of it would be the direction I'd head in for 
this sort of use case.

Later,
Brad



More information about the Digitalmars-d-learn mailing list