BitArray/BitFields - Review
bearophile
bearophileHUGS at lycos.com
Sun Jul 29 14:36:35 PDT 2012
Era Scarecrow:
> If it has to determine between compact and non-compact, then I
> really don't see how any speed improvement can be made,
Maybe it's better to consider two use cases: dynamic length
allocated on the heap (or allocated with a given allocator, often
a stack-styled allocator), and fixed-size allocated in place, so
there's no need for that run-time test.
> and if it's optimized it may be inlined which would be about as
> fast as you could get.
I think that "a[x] = 1;" is slower than "a.set(x);" even if
the array a is allocated in place and everything is inlined. This
is probably the most important operation an array of bits has to
support. If this is missing, I have to use my own implementation
still.
> Likely similar to the hamming weight table mentioned in TDPL.
> Combined with the canUseBulk I think I could make it fairly
> fast.
There is lot of literature about implementing this operation
efficiently. For the first implementation a moderately fast (and
short) code is probably enough. Later faster versions of this
operation will go in Phobos, coming from papers.
> Yes, currently a debate of it's partial ref/value issue is
> coming up. As it is it's usable, and if you want to be sure
> it's unique you can always dup it, otherwise as a previous
> suggestion I'll either make two array types for either full
> reference vs value, or take his separate slices (reference)
> idea.
The idea of making two different types (for the dynamic and
static versions) sounds nice.
There is a third use case, that is bit arrays that start small
and can grow, and rarely grow bigger than one or two words. This
asks for a dynamic-length bit array type that allocates locally
only when it's small (so it needs to perform run-time test of the
tag. Too bad the D GC doesn't support tagged pointers!). But
probably this use case is not common enough to require a third
type of array. So I suggest to not worry about this case now, and
to focus on the other two more common ones.
> A Matrix is just a multidimensional array right?
Well, yes, but there are many ways to implement arrays and nD
arrays. Sometimes the implementations differ.
> I'll have to read up on it a bit; doing simple division (Likely
> whole rows of 32bit at a time) would do that job,
Take a look at my implementation in Bugzilla :-) It's another
type, the third. The 1D and 2D cases are the most common.
> but expanding width several times (compared to height) would be
> quite a bit slower; Although not too horrible.
A 2D array of bits probably doesn't need to change its size after
creation, this is not a common use case.
> more likely bool array/matrix.. Although if you heavily use
> specific lines it could convert to bool[], and then back again,
> but if you go all over the place it would be worse than just
> using bool[].
It's not so simple :-) A bool[] causes many more cache misses, if
the matrix is not tiny. A bool[] or bool[][] is a good idea only
if the matrix is very small, or if you don't have a bit matrix
library and you don't want to write lot of code. Converting to
bool[] the current line is not a good idea.
Bye,
bearophile
More information about the Digitalmars-d-learn
mailing list