BitArray/BitFields - Review
Era Scarecrow
rtcvb32 at yahoo.com
Sat Jul 28 16:16:37 PDT 2012
On Saturday, 28 July 2012 at 23:02:17 UTC, Dmitry Olshansky wrote:
> On 29-Jul-12 02:31, Era Scarecrow wrote:
>> On Saturday, 28 July 2012 at 22:19:05 UTC, Dmitry Olshansky
>> wrote:
>>> Mhm? Not getting you it at all. What's locking it ?
>>>
>>> What's wrong with:
>>> struct BitArray
>>> {
>>>
>>> union {
>>> struct {
>>> size_t[] data;
>>> ulong length; //in bits
>>>
>>> //length in words would be just
>>> // (length + size_t.sizeof*4) / // (size_t.sizeof*8);
>>> // as it's a power of 2 it should get optimized
>>>
>>> }
>>> size_t[2+ulong.sizeof/size_t.sizeof] compact;
>>>
>>> // and say 1 bit is used for isCompact would be great if it
>>> could use
>>> // the least significant bit of pointer from data[], dunno if
>>> it's easy enough
>>
>> I'd rather not rely on the pointer at all as a reference for
>> if it's compact.
>
> No problem, I redesigned a bit:
>
> struct BitArray
> {
>
> union {
> struct {
> ulong length; //in bits, the top bit used as
> isCompact
> size_t[] data;
> }
> struct{
> ushort smallLength; //here as well
> ubyte[size_t.sizeof*2+ulong.sizeof-2] compact;
> }
> }
>
> Up 2^63 of bits would occupy 2^60 of bytes should be good
> enough for the years to come.
> I'm not seeing it as a problem as at very least processors
> have to actually be able to address more then 42 (or was it
> 48?) bits that they can now. (see AMD or Intel or ARM
> datasheets on 64 bit chips)
2^63, or drop 8 so it takes up no more than 2^55 bytes of memory.
Yes that's quite a large chunk.
Curiously enough the current structure is already smaller than
the proposed unions.
>> If the array is changed later to a template that could accept
>> bytes, then where does that leave you? When the slice doesn't
>> need to know more of the beginning,
> ???
> Slice can start at some random offset
Yes they can, but so can a normal array (as they are pretty much
the same)
>> it can re-slice it to a lesser more appropriate size.
>
> no idea what's you are talking about here.
As for requested compact to hold the starting/ending offsets at
1 byte each, obviously slicing at say 1024 to 2048 of a 10,000
length bit array, that won't work (0-127). If it's larger than
size_t in bits, it will subtract that size and internally change
the pointer of the array it actually points to, until it can fit
that new value.
start/end slice: [32 .. 64]
equals: size_t array[1 .. 2]
>> It may very well soon contain a third entry in the union for
>> main operations and use the larger size_t just for canUseBulk
>> (Actually that sounds better to me).
>
> Again I'm having hard time to get what follows from where. Can
> you expand on this point please? Preferably with some code as
> it was far easier to reason about.
A question earlier was brought up about endienness for the patch,
that is not part of what it can handle, so:
ba[100] = 1; //then save the array
assert(ba[100]); //fails on opposite endienness when loaded
HOWEVER! if individual bits are handled in a byte array
reference, then the above would succeed, even with the larger
size_t ones all working with canUseBulk.
>> Besides, separating the slice suggests 'length' is not part of
>> the union struct.
>
> How? All containers know their length.
Yes they do, they also know where they start, thereby giving
start/ending, which allows slicing. So why separate slicing?
>> Sure it's convertible to [0 .. length], but it's no longer
>> convertible back to BitArray (Unless you make a whole new
>> array and conversion/copying which takes a lot of time).
>
> Yes it's not convertible. And it's no problem and is intended.
> Otherwise like you said it doesn't make a lot of sense.
>>> If you do that, why not do BitSlice from the beginning which
>>> holds BitArray, but then if you do that why make them
>>> separate at all? It doesn't make sense to me. I can see
>>> making a separate range for the front/popFront/empty if you
>>> really wanted to use those, but it wouldn't change how it
>>> would work.
> Begin takes extra ulong I'm not sure it is needed in the
> container itself.
No it doesn't. bitfields are used to handle the beginning/ending
offsets, equaling a total of 14 bits. It's the most confusing
part (aside from [offset + sliceoffset + i[ formulas). canExpand
and isCompact fill the other two bits.
More information about the Digitalmars-d-learn
mailing list