BitArray/BitFields - Review

Sat Jul 28 16:02:01 PDT 2012

On 29-Jul-12 02:31, Era Scarecrow wrote:
> On Saturday, 28 July 2012 at 22:19:05 UTC, Dmitry Olshansky wrote:
>> Mhm? Not getting you it at all. What's locking it ?
>>
>> What's wrong with:
>> struct BitArray
>> {
>>
>> union {
>>     struct {
>>         size_t[] data;
>>         ulong length;    //in bits
>>
>> //length in words would be just
>> // (length + size_t.sizeof*4) / // (size_t.sizeof*8);
>> // as it's a power of 2 it should get optimized
>>
>>     }
>>     size_t[2+ulong.sizeof/size_t.sizeof] compact;
>>
>> // and say 1 bit is used for isCompact would be great if it could use
>> // the least significant bit of pointer from data[], dunno if it's
>> easy enough
>
>   I'd rather not rely on the pointer at all as a reference for if it's
> compact.

No problem, I redesigned a bit:

struct BitArray
{

union {
     struct {
         ulong length;    //in bits, the top bit used as isCompact
         size_t[] data;
     }
     struct{
	ushort smallLength; //here as well
	ubyte[size_t.sizeof*2+ulong.sizeof-2] compact;
     }
}

Up 2^63 of bits would occupy 2^60 of bytes should be good enough for the 
years to come.
  I'm not seeing it as a problem as at very least processors have to 
actually be able to address more then 42 (or was it 48?) bits that they 
can now. (see AMD or Intel or ARM datasheets on 64 bit chips)

If the array is changed later to a template that could accept
> bytes, then where does that leave you?
> When the slice doesn't need to
> know more of the beginning,

???
Slice can start at some random offset

  it can re-slice it to a lesser more
> appropriate size.

no idea what's you are talking about here.

  It may very well soon contain a third entry in the
> union for main operations and use the larger size_t just for canUseBulk
> (Actually that sounds better to me).

Again I'm having hard time to get what follows from where. Can you 
expand on this point please?
Preferably with some code as it was far easier to reason about.

>   Besides, separating the slice suggests 'length' is not part of the
> union struct.

How? All containers know their length.

Sure it's convertible to [0 .. length], but it's no longer
> convertible back to BitArray (Unless you make a whole new array and
> conversion/copying which takes a lot of time).

Yes it's not convertible. And it's no problem and is intended. Otherwise 
like you said it doesn't make a lot of sense.

>If you do that, why not
> do BitSlice from the beginning which holds BitArray, but then if you do
> that why make them separate at all?
> It doesn't make sense to me. I can
> see making a separate range for the front/popFront/empty if you really
> wanted to use those, but it wouldn't change how it would work.
>

>> then Range would be:
>>
>> struct Range{
>>     BitArray* array;
>>     ulong start, end;
>> }
>>
>> to mark [start..end] piece of it?
>>
>>> If you keep that information, why not just include slicing along with
>>> the rest of it?
>>
>> I'm not seeing slicing easily integrated just because length is known.
>
>   Beginning and ending are known, so slicing can be added at no extra cost.
Begin takes extra ulong I'm not sure it is needed in the container itself.

-- 
Dmitry Olshansky