BitArray/BitFields - Resumed and polishing

Dmitry Olshansky dmitry.olsh at gmail.com
Thu Jan 3 07:48:49 PST 2013


1/3/2013 2:20 PM, Era Scarecrow wrote:
> On Thursday, 3 January 2013 at 07:57:46 UTC, Dmitry Olshansky wrote:
>> 1/3/2013 6:05 AM, Era Scarecrow wrote:
>
>> Hm, I'd think that having Slice type be:
>>
>> struct BitArraySlice {
>>     BitArray* bp;
>>     size_t start, end;
>>     // all else forwards to the pointed-to array
>> }
>> should work, avoiding most of the code duplication. With any luck the
>> inliner will expand all of this wrapping. To enable bulk mode operations:
>>
>> void opOpAssign(string op)(BitArraySlice rhs)
>> { // has all the data to do it
>>     bulkOp!op(bp, start, end, rhs.bp, rhs.start, rhs.end);
>> }
>>
>> then it's a question of a bit of refactoring & tweaking of the array
>> bulk ops.
>>
>> Then the only problem left is slice invalidation and original array
>> going out of scope.
>
>   I know we've had this discussion before. So let's assume you do
> slicing this way. So what happens when...
>
>   BitArray ba = [0,1,1,0];
>
>   ba = ba[1 .. $-1]; // ba is not a BitArraySlice
>
>   Suddenly it won't work, and slicing is only a range that can only be
> used in foreach.

No surprise here. Basically a container != a range over it; it all flows 
from there. A range doesn't have append, nor does it own any data.
About foreach, see below. Yes, the BitArray container won't pass the 
hasSlicing trait, but it shouldn't, as that trait is meant for ranges.
(So in a sense my comment about trying to mimic the built-in array applies.)

And the slice can have opIndex and opSlice, being a random-access range 
of bool.

To help people convert one to the other (exactly where they need it) we 
can add .dup(?) on the slice that allocates a BitArray and assigns the 
values from the range.
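
For illustration, here is a minimal sketch of such a slice type. The 
names (BitArraySlice, bp, dup) are just the ones from the sketch above 
and are hypothetical; it assumes only BitArray's opIndex, opIndexAssign 
and length from std.bitmanip:

import std.bitmanip : BitArray;
import std.range.primitives : isRandomAccessRange;

/// A view into [start, end) of an existing BitArray: a random-access
/// range of bool, not a container (illustrative sketch only).
struct BitArraySlice
{
    BitArray* bp;        // the array being viewed
    size_t start, end;   // half-open window, in bits

    @property bool empty() const { return start >= end; }
    @property size_t length() const { return end - start; }
    @property bool front() { return (*bp)[start]; }
    @property bool back() { return (*bp)[end - 1]; }
    void popFront() { ++start; }
    void popBack() { --end; }
    @property BitArraySlice save() { return this; }
    bool opIndex(size_t i) { return (*bp)[start + i]; }
    BitArraySlice opSlice(size_t a, size_t b)
    {
        return BitArraySlice(bp, start + a, start + b);
    }

    /// Where a real container is needed, copy the viewed bits out.
    BitArray dup()
    {
        BitArray result;
        result.length = length;
        foreach (i; 0 .. length)
            result[i] = (*bp)[start + i];
        return result;
    }
}

static assert(isRandomAccessRange!BitArraySlice);

void main()
{
    BitArray ba;
    ba.length = 4;
    ba[1] = true;
    ba[2] = true;                       // ba is now 0,1,1,0

    auto s = BitArraySlice(&ba, 1, 3);  // view of bits 1 and 2
    assert(s.length == 2 && s[0] && s[1]);

    auto copy = s.dup;                  // a real BitArray again
    assert(copy.length == 2 && copy[0] && copy[1]);
}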

> Correct? You could do appending but likely it would
> require two functions in the original code (one for a BitArray and one
> for a slice).

Again, I don't think there is a need to append to a slice. Create an 
array out of the slice (or, in fact, any range of bool), then append.
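
For example (a hedged sketch; toBitArray is a hypothetical helper, and 
it assumes BitArray's length property, opIndexAssign and ~= for bool 
from std.bitmanip):

import std.bitmanip : BitArray;
import std.range.primitives : isInputRange, ElementType;

/// Hypothetical helper: build a BitArray out of any range of bool.
BitArray toBitArray(R)(R r)
if (isInputRange!R && is(ElementType!R : bool))
{
    BitArray result;
    foreach (b; r)
    {
        result.length = result.length + 1;   // grow by one bit...
        result[result.length - 1] = b;       // ...and set it
    }
    return result;
}

void main()
{
    import std.algorithm : map;
    import std.range : iota;

    // any range of bool -> container; appending happens on the container
    auto ba = toBitArray(iota(8).map!(i => (i & 1) != 0));
    ba ~= true;
    assert(ba.length == 9);
}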

> If it supports an opAssign to handle that, now you have to take care 
> of all the opAssigns as well (like C++: once you handle one portion of 
> it and opCast, you handle them all).

Yes, I'm afraid that for optimal speed they both have to be implemented. 
The version for BitArray can just forward to the one for the slice over 
the whole of it. Or there could be a common template function that 
handles all of the bit ops over [a, b) portions of BitArrays; everything 
else then forwards to it.
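
Roughly along these lines (a sketch only; bitOpAssign is a hypothetical 
name, and a real version would run word-at-a-time over the underlying 
storage instead of bit-by-bit):

import std.bitmanip : BitArray;

/// One shared kernel for `op=` over [a, b) windows of two bit arrays.
/// Both the container operators and the slice operators would forward
/// here; the naive bit-at-a-time loop below only shows the shape of it.
void bitOpAssign(string op)(ref BitArray dst, size_t dstStart,
                            ref BitArray src, size_t srcStart, size_t len)
if (op == "&" || op == "|" || op == "^")
{
    foreach (i; 0 .. len)
        dst[dstStart + i] =
            cast(bool) mixin("dst[dstStart + i] " ~ op ~ " src[srcStart + i]");
}

void main()
{
    BitArray a, b;
    a.length = 8;
    b.length = 8;
    a[1] = true; a[2] = true;   // a = 0,1,1,0,0,0,0,0
    b[2] = true; b[3] = true;   // b = 0,0,1,1,0,0,0,0

    // the whole-array case is just the [0, length) window
    bitOpAssign!"^"(a, 0, b, 0, a.length);
    assert(a[1] && !a[2] && a[3]);
}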

>>> Does that mean we might be able to drop opApply? Hmmm... Depends on
>>> if we have enough implemented so the range accepts it. I'll glance
>>> over std.range.
>>
>> Yes, it's got to be a range to work like this and then opApply is
>> useless.
>
>   Maybe. Foreach, if I remember right, worked three different ways and
> selected whichever one worked: opApply, front/popFront/empty, and array
> (opIndex access). If enough is implemented that it recognizes it as an
> array, then opApply can be removed.

There are two ways: range primitives or opApply. How built-in arrays are 
done is up to the compiler; there is no way to tap into it from user level.

There is one extra rule, however: if the object in question is not a 
range and doesn't have opApply but does have opSlice(), then foreach 
slices it first and then checks the same two ways on that slice - as a 
range or via opApply.
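
A small illustration of that rule with a made-up type (Container is 
hypothetical; the point is only that opSlice() returns a range):

import std.range : iota;

/// Not a range itself and has no opApply, but opSlice() returns a range,
/// so `foreach (x; c)` iterates over `c.opSlice()`.
struct Container
{
    size_t count;
    auto opSlice() { return iota(count); }
}

void main()
{
    size_t sum;
    foreach (x; Container(5))   // iterates the slice: 0, 1, 2, 3, 4
        sum += x;
    assert(sum == 10);
}
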
>
>   A quick test shows that this currently isn't the case.
>
>>> Really depends on the use case. I'm guessing there's enough call for
>>> the 64bit small packed bitarray that justifies it, otherwise it would
>>> be better to throw it out. As you mention later, separating them does
>>> seem like a better idea.
>>
>> I meant the fact that it has to preserve the current std lib behavior
>> *and* have the small string optimization, which makes it overly painful
>> and partially defeats the optimization (by requiring pointless
>> compact->big conversions on slice etc.).
>
>   We could always 'dup' on slice instead, but then you can't use
> references on it, at which point it's a value type.

Dup on slice is very bad. A slice in the container world (I'm looking at 
std.container, and this is not likely to change) is a view of the same 
data; thus it's *not* a new container of its own.

> Hmmm. If it's a pure value type, then a separate range makes more 
> sense to use.
>
>>> hmmm... Either would need some code duplication, or a template to
>>> turn off/swap the small string optimization. But separating them does
>>> seem like a good idea.
>>
>> I'd expect a fair chunk of code to be split off into free functions,
>> then both containers would make use of them.
>>
>> Now note that there is also std.container's Array!bool, which has
>> reference semantics and has got to be doing bit packing... it could
>> potentially take the place of the big BitArray mentioned.
>
>   I'm aware of the container version, and overall if it can just replace
> BitArray as it is, then we might consider only worrying about a compact
> version, at which point it would become a template... Might start seeing
> BitArray!1024  in places (or BitString!1024 ?).

I'm not sure how much a templated fixed-size bit array would gain over 
one that just has the small optimization. It could be a lot, but somehow 
I don't expect much; that needs some measurement.

-- 
Dmitry Olshansky

