Which is faster? ulong or double
janderson
askme at me.com
Thu Sep 27 20:23:44 PDT 2007
Janice Caron wrote:
> I have this app I've written, and it needs to keep track of an integer
> quantity (time in microseconds, as it happens, but that's an
> unimportant detail). The point is, there are circumstances where the
> numbers involved get bigger than uint.max.
>
> So the question is, given that I'm using a 32-bit platform, should I
> switch to ulong, or to double?
>
> ulong sounds the most logical, since the quantity will always be an
> integer, but (correct me if I'm wrong) ulongs are emulated in
> software, which is fine for add and subtract, but not so fine for
> divide; whereas doubles have direct hardware support, and so might
> actually end up being faster if there are lots of divides.
>
> Am I talking nonsense? Is there a recommendation?
The only way to tell is to benchmark it. Also be aware that different
CPUs will perform differently due to many factors, like branch
prediction and being able to run certain double and integer operations
at the same time. On some processors it may be faster to interleave
doubles and uints. Even then, some processors can run more
floating-point operations per cycle than integer ones (so the
interleaving may be something like 4 doubles and 2 uints per cycle).
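
As a rough starting point for such a benchmark, here is a minimal
sketch. It is written in C++ rather than D, with std::uint64_t standing
in for ulong, and the names sum_div_u64, sum_div_double and seconds are
made up for illustration. It runs the same divide-heavy loop once with
64-bit integers and once with doubles, and times a call with a
steady clock.

```cpp
#include <chrono>
#include <cmath>
#include <cstdint>
#include <vector>

// Sum of quotients using 64-bit unsigned division
// (typically a library call on a 32-bit target).
std::uint64_t sum_div_u64(const std::vector<std::uint64_t>& xs,
                          std::uint64_t d) {
    std::uint64_t acc = 0;
    for (std::uint64_t x : xs) acc += x / d;
    return acc;
}

// Same computation with doubles; each quotient is truncated so the
// result matches integer division while values stay below 2^53.
double sum_div_double(const std::vector<double>& xs, double d) {
    double acc = 0.0;
    for (double x : xs) acc += std::trunc(x / d);
    return acc;
}

// Wall-clock seconds taken by one call to f (hypothetical helper).
template <typename F>
double seconds(F&& f) {
    auto t0 = std::chrono::steady_clock::now();
    f();
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(t1 - t0).count();
}
```

To use it, fill both vectors with the same few million values, wrap each
call in seconds(), and store the results in variables you print
afterwards so the compiler cannot drop the loops. Run it on the actual
target machine with the optimisation flags you ship with.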
If you have a fast GPU you can offload this sort of operation to the
GPU, which, if you have enough of these values, can be something like
300 times faster than the CPU.
Then there's SIMD (SSE, SSE2, SSE3, etc.; specifically SSE2), which can
do a load of operations at once (i.e. 4 float divides at the same time)
and has some 64-bit support (doubles, 64-bit ints). It's similar to the
GPU but does fewer operations at once. I would recommend this over the
GPU if you want your app to work on more systems. See:
http://www.hayestechnologies.com/en/techsimd.htm
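
To give an idea of what the SSE2 route looks like, here is a minimal
sketch in C++ (not D) using the standard SSE2 intrinsics from
<emmintrin.h>. The function name div_pairs is made up for illustration.
SSE2 handles two doubles per instruction (the 4-at-a-time case applies
to floats), and the code falls back to a plain loop when SSE2 is not
available at compile time.

```cpp
#include <cstddef>
#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>  // SSE2 intrinsics
#define HAVE_SSE2_SKETCH 1
#endif

// out[i] = a[i] / b[i] for n doubles.
void div_pairs(const double* a, const double* b, double* out,
               std::size_t n) {
    std::size_t i = 0;
#ifdef HAVE_SSE2_SKETCH
    // Two double divides per instruction; unaligned loads and stores
    // so the caller does not have to align the arrays.
    for (; i + 2 <= n; i += 2) {
        __m128d va = _mm_loadu_pd(a + i);
        __m128d vb = _mm_loadu_pd(b + i);
        _mm_storeu_pd(out + i, _mm_div_pd(va, vb));
    }
#endif
    // Leftover element, or the whole array without SSE2.
    for (; i < n; ++i) out[i] = a[i] / b[i];
}
```

Whether this beats the scalar loop depends on the CPU and on how well
the compiler already vectorises the plain version, so it needs to be
measured as well.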
On a 64-bit machine with a 64-bit OS, of course, it's pretty fast to do
these operations in 64-bit.
You could try an app optimisation where anything larger than the
boundary is stored on a separate list and processed separately (probably
easy to do with templates).
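
A minimal sketch of that split, in C++ rather than D and with made-up
names (SplitValues, split_by_boundary, divide_all): values that fit in
32 bits go on one list and get native 32-bit arithmetic, the rest go on
a 64-bit list, and one template handles both. It assumes the order of
the values does not matter.

```cpp
#include <cstdint>
#include <limits>
#include <vector>

struct SplitValues {
    std::vector<std::uint32_t> small;  // fits in 32 bits: fast path
    std::vector<std::uint64_t> large;  // above the boundary: slow path
};

// Route each value to the list matching its size.
SplitValues split_by_boundary(const std::vector<std::uint64_t>& xs) {
    SplitValues out;
    for (std::uint64_t x : xs) {
        if (x <= std::numeric_limits<std::uint32_t>::max())
            out.small.push_back(static_cast<std::uint32_t>(x));
        else
            out.large.push_back(x);
    }
    return out;
}

// Divide every element in place; the same template serves both lists.
template <typename T>
void divide_all(std::vector<T>& xs, T d) {
    for (T& x : xs) x /= d;
}
```

Calling divide_all<std::uint32_t>(s.small, 1000) then only uses 32-bit
divides, and the emulated 64-bit divide is paid just for the values in
s.large. This helps only if most values stay under the boundary.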
However, the best thing to do is to profile and find out where your
bottleneck is and if it's even worth the trouble applying these
optimizations. Arithmetic operations are (in general) much faster than
branching and other operations which cause memory fetching.