BitArray new design - Hashing out general thoughts and ideas (with slices too)

Sat Jan 19 12:26:35 PST 2013

On Saturday, 19 January 2013 at 09:07:03 UTC, Dmitry Olshansky 
wrote:
> opBinaryAssign --> opOpAssign. opSlice/opSlice assign. In any 
> case it seems to me that (I think you already did this trick 
> before) you can reuse a lot of code by making an operator 
> string a template parameter (for functions that implement  =, 
> |=, &=, ^=, and therefore ^, |, &) and use string mixins.

  Yes I did when I could; The slice versions can only forwards 
them to the bitarray one.

> Other then this - don't forget the '~' _unary_ operator as it 
> makes a lot of sense for bit-vectors. Now since slices don't 
> support '~' concatenation it won't look so bad. Worst case:
>
> auto ba = BitArray(...);
> auopt slice = ba[...];
> ba ~= ~slice;

  Only works if you create a new version and append it; And all 
opBinary operations make duplicates due to their nature. So the 
above couldn't work if slice is a separate type, and instead:

   auto slice = ba[...];   //slice type of BitArraySlice
   auto bc ~= ~slice;      //ba/bc type of BitArray

   ba = (ba ~ (~slice))[]; //type of BitArraySlice

> And it's not much of problem.

  Hmmm.. Well as mentioned I don't think a separate slice/range 
type is the best design or as practical; I have it half working 
but one problem I'm really having is trying to make sense of the 
errors after I converted to use a pointer. It's confusing since 
it was compiling before and is an annoyance more-so now; And 
suddenly can't deduce the type when it should be able to, and if 
I explicitly call it with the type it doesn't help any.

[quote]
Error: template 
bitmanip.BitArray!(Fixed!(1024)).BitArray.opEquals(alias 
SL)(const SL rhs) if (hasMember!(SL, "isBitArray") || 
hasMember!(SL, "isBitArraySlice")) forward reference to template 
opEquals(alias SL)(const SL rhs) if (hasMember!(SL, "isBitArray") 
|| hasMember!(SL, "isBitArraySlice"))
Error: template 
bitmanip.BitArray!(Fixed!(1024)).BitArray.opEquals cannot deduce 
template function from argument types 
!()(BitArraySlice!(BitArray!(Fixed!(1024)), 
true),const(ulong),const(ulong))

Error: cannot implicitly convert expression 
((*this.slice).toString(this.start, this.end)) of type string to 
string
[/quote]

[code]
   //isBitArray and isBitArraySlice are enums in the structs
   //almost all functions templated to handle fixed/dynamic
   //types as compatible. Plus constness for range(s).

   //auto ref would be nice/preferred in a lot in these functions..
   //Otherwise copying is required unless i do more 
duplication/code forwarding

   //if opSlice is @system, then almost all these are @system, or 
@trusted

   //slice forwarding
   bool opEquals(SL)(const SL rhs) const
   if (hasMember!(SL, "isBitArray") || hasMember!(SL, 
"isBitArraySlice"))
   {
     //opSlice always returns BitArraySlice
     return opEquals(rhs[], 0, length);
   }

   bool opEquals(SL)(const SL rhs, ulong sliceStart, ulong 
sliceEnd) const
   if (hasMember!(SL, "isBitArraySlice"))
   {
     if ((sliceEnd-sliceStart) != rhs.length)
       return false;

       return opCmp(rhs, sliceStart, sliceEnd) == 0;
   }

   //slice forwarding
   int opCmp(SL)(const SL rhs) const
   if (hasMember!(SL, "isBitArray"))
   {
     //opSlice always returns BitArraySlice
     return opCmp(rhs[], 0, length);
   }

   int opCmp(SL)(const SL rhs, ulong sliceStart, ulong sliceEnd) 
const
   if (hasMember!(SL, "isBitArraySlice"));
[/code]

  A small consideration was coming to mind in cases where storage 
management if you only needed to prepend a few bits or append 
just after for a few bits but don't need to do a full 
reallocation. In that case perhaps a before/after allocated 
memory would serve nice.

   size_t beforeArray, afterArray;
   int beforeUsed, afterUsed; //course would likely be bitfield or 
ubyte

  Now if we use that it comes to mind that although it wouldn't be 
perfect storage for small compact arrays (as was the original 
complaint(s)), you could however rely on them just as much as the 
main array and the first size_t bits (*2 if you prepend/append to 
the array a lot. It would also be useful to append past the end 
of a sliced area without tampering with data outside it's access 
area until it's forced to resize.

  Fixed BitArraySizes seem far more wanted where you don't want to 
do extra memory allocation, so using reference counting wouldn't 
work, but with dynamic arrays you could. So if you did all the 
slices to a particular bit array could always point to the proper 
memory.

  If you did do that, prepending to an array would be a nightmare 
unless you could offset it with a virtual offset to represent the 
actual beginning. So...

   //start in the middle
   struct STORE {
     //virtualOffset instead?
     ulong startOffset = ulong.max / 2;
     size_t[] store;
   }

   //in slice/Bitarray, hope that's right
   RefCounted!(STORE) store;
   ulong start, end;

  With these three you can safely do dynamic slices while all 
slices point to the same data; in cases where there's only one 
reference store, start, end & startOffset can freely be changed.

  Assume you slice. So..

   BitArray ba = BitArray(100);
   auto sl = ba[10 .. 20];

   //assume startOffset says say 1024, start, 1024 & end = 1124.
   //slice holds start 1034, end 1044.

  Now if we have it prepend a whole lot.

   foreach(i; 0 .. 100)
     ba = true ~ ba;

  It had to be resized, so if it gave it say 512 more bits then...
   //now startOffset = 512.

   slice[0] = true; //start-startOffset = actual bit offset in 
store[]
   assert(ba[110]); //was set by the slice

  Full reference could be preserved, although if you prepend to a 
slice vs the main array then either you overlap data or need some 
temporary storage until duping/reallocating is required, or just 
dup/allocate new memory. Or have
a separate end/before arrays which won't force reallocation but 
could make for interesting overlapping as they wouldn't all 
always point to the same thing.

  The interesting problems that crop up again.

  Another thought, is what if we give an ID to fixed arrays so 
they know if they are pointing to proper data or junk; This would 
likely mostly be for the known fixed array slices. So..

   size_t ID; //in both slice and fixed array.
   size_t[1024] fixedSize;

  Now add a variant to ensure the ID's match on each and every 
access. If the ID gets corrupted the data is corrupted too, 
although not a guarantee it's pretty darn close if the ID is 
random.

  But as you mentioned making it @system then we could have it 
silently convert the BitArray from fixed to dynamic; And make the 
slice @system where normally it's @safe (although immediately 
duping the memory location after it's using the dynamic form 
would be safe).

  Reminds me... Should initialization/blit dup or 
reference/preserve the memory? I'd think it would point to the 
old memory unless explicitly told to (or going from const to 
non-const versions).