BitArray/BitFields - Review

Era Scarecrow rtcvb32 at yahoo.com
Sun Jul 29 00:19:00 PDT 2012


On Sunday, 29 July 2012 at 06:27:32 UTC, Dmitry Olshansky wrote:
> OK. Now back to the biggest question:
> Slices use references (sometimes) to previous bitarray. Since 
> it's a slice (like an array) this could be expected.
>
> The "sometimes" part is not god enough! The problem is that 
> small string optimization (or small array optimization) doesn't 
> work with reference semantics. You may dig up discussion on 
> proposal for small string optimization for char[] and string 
> that was shot down on these grounds.

> Bulit-in like semantics call for simple ptr/ptr+length struct 
> and you can't get that can you?.>
> E.g.
>
> void mess_with(BitArray ba)
> {
> ba ~= true;
> ba ~= false;
>  //does it change anything outside this function? *maybe*

If it was a previous slice, then it never chances outside and 
appending would automatically allocate new memory.

> ba[0] ^= true;
> ba[1] ^= true;
> auto x = ba; //even if used ref BitArray ba, x *maybe* 
> references ba
> //is x now a reference or a copy? *maybe*
>
> Thus BitArray either becomes value type (that may come as 
> unexpected, see your option c/d). Or it becomes full reference 
> type as in option a (though it sucks).

> Sorry but b is no go, as it makes code unpredictable I'd rather 
> take my idea with separate ranges then b.

> Solutions:
>  a) Drop the union, make all functions @safe (compared to 
> @trusted), and they are all ref/slices
>  b) Leave as sometimes ref, sometimes not; Use .dup when you 
> NEED to ensure different unique copies.
>  c) Apply COW (Copy-On-Write) where a known slice type 
> (canExpand is false) when a change would apply, it replaces the 
> present slice into it's own unique array.
>  d) All slices become new dup'ed allocated arrays (Most 
> expensive of them all)

  Agreed, B is a bad choice. I'm going to suggest c myself, as 
changes would only propagate to a new true copy, otherwise it's 
read-only as a slice.

> Because of d, I proposed that slices be separate type that is 
> always a reference to original. Then only coping arrays 
> themselves involves dup.

  Mmmm... i think you're right...

>> No it doesn't. bitfields are used to handle the 
>> beginning/ending offsets, equaling a total of 14 bits. It's 
>> the most confusing part (aside from [offset + sliceoffset + i] 
>> formulas). canExpand and isCompact fill the other two bits.
>
> OK, finally I think I understand how your current version 
> works. It's clever but leaves you in half-way value-reference 
> state.

  I didn't want to originally use the 1-byte per 
startBit/endingBit, but it does actually work when you have the 
functions working.

> To point out few problems with unification of slice type and 
> BitArray:
>
> - Every time a small slice is taken say [1..30] it becomes a 
> small array(?) thus it is a value type and doesn't affect 
> original array

  No, slices of compact arrays remain compact arrays. slices of 
larger arrays are reference until length is set, dupped, or 
appended to (forcing a resize/dup)

> - No idea if a = b make a copy will make programmers nervous, 
> esp if one is writing a library function as in this case there 
> is no assumption that fits all

  If it's compact, it's an auto separate copy/valuetype. If it's 
not compact, it's a reference to the original.

> - Same thing but more, does this work?
> 	auto c = ba; //if ba is compact
> 	ba ~= true;

  ba may or may not be a normal/compact at this point

> 	assert(c.length = ba.length-1);
> 	//c should remain with the same length unaffected (at least 
> like built in arrays)

  Most likely it would be compact afterwards (Even if it allocated 
memory).

  I guess after going this over, two things need to happen. First 
it needs to be non-reference type, I don't want to go option d 
route, so I'll go for c instead. On a copy I'll add a postblit 
that makes the 'copy' now a slice which will create a new array 
when it changes in any way. Only question is when you pass the 
BitArray to a function as you showed, it would still think it's 
the original, unless the postblit/opAssign (or similar) can 
handle that. Hmmm time for some testing.


More information about the Digitalmars-d-learn mailing list