2.011 invariant question

Sönke Ludwig ludwig at informatik_dot_uni-luebeck.de
Wed Feb 27 01:14:17 PST 2008


Frits van Bommel wrote:
> 
> A hashtable could get you an average O(1) for that lookup (the worst 
> case could still be O(N) or O(log N) though, depending on the 
> implementation). You wouldn't need to enter every allocation in there, 
> by the way. If GC pools are always allocated on some aligned address, 
> the start of one can be found by bitwise-anding out the lower bits of 
> the pointer, and that address could be checked in the hashtable. "is 
> this the start of an allocation" then becomes a hashtable lookup plus a 
> check "cast(size_t)(p - pool_start) % pool_objsize == 0"[1]. (The 0 may 
> need to be adjusted to allocation_header.sizeof.)
> 
> [1]: For pool_objsize == 2^N for some N no actual division needs to be 
> performed; "& ((1 << N) - 1)" has the same effect.
> 
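A minimal C sketch of the lookup described in the quoted text (all names are hypothetical; a small linear-scan table stands in for the hashtable, and 64 KiB pool alignment is assumed):

```c
#include <stddef.h>
#include <stdint.h>
#include <stdbool.h>

#define POOL_ALIGN ((uintptr_t)1 << 16)  /* assumed 64 KiB pool alignment */

/* Toy stand-in for the GC's pool hashtable: maps a pool's start
 * address to its object size (0 means "not a known pool"). */
static struct { uintptr_t start; size_t objsize; } pools[4];
static size_t npools;

static size_t pool_objsize_lookup(uintptr_t pool_start)
{
    for (size_t i = 0; i < npools; i++)
        if (pools[i].start == pool_start)
            return pools[i].objsize;
    return 0;
}

static bool is_allocation_start(const void *p)
{
    /* Mask off the low bits to recover the pool's base address. */
    uintptr_t addr = (uintptr_t)p;
    uintptr_t pool_start = addr & ~(POOL_ALIGN - 1);

    size_t objsize = pool_objsize_lookup(pool_start);  /* avg. O(1) */
    if (objsize == 0)
        return false;  /* pointer does not fall into a known pool */

    size_t offset = (size_t)(addr - pool_start);
    if ((objsize & (objsize - 1)) == 0)        /* objsize == 2^N: */
        return (offset & (objsize - 1)) == 0;  /* no division needed */
    /* The 0 here may need adjusting for an allocation header. */
    return offset % objsize == 0;
}
```

The power-of-two branch is footnote [1] from the quote: when the object size is 2^N, the modulo collapses to a bitwise AND.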
> 
> On another note, how about this for a rule on concatenating to invariant 
> arrays:
> For each heap-allocated array, put both the reserved size and *the 
> maximum used size* in the header of the allocation. Whenever arr.ptr + 
> arr.length is equal to allocation.start + allocation.max_size_ever, it's 
> safe to append to it in-place as long as you increase the maximum size 
> accordingly.
> (This'll need some synchronization to be thread-safe, but when there are 
> multiple threads allocations are currently already synchronized anyway)
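The rule just quoted could be sketched like this in C (hypothetical layout; a plain pointer field stands in for the real allocation header, and the synchronization mentioned above is left out):

```c
#include <stddef.h>
#include <stdbool.h>

typedef struct {
    size_t reserved;       /* bytes usable for array contents */
    size_t max_size_ever;  /* high-water mark of used bytes */
    char  *data;           /* start of the array contents */
} Allocation;

typedef struct {
    char  *ptr;
    size_t length;
} Array;

/* Appending in place is safe iff the slice ends exactly at the
 * allocation's high-water mark and there is room left. */
static bool can_append_in_place(const Array *arr, const Allocation *alloc,
                                size_t extra)
{
    if (arr->ptr + arr->length != alloc->data + alloc->max_size_ever)
        return false;  /* slice does not end at max_size_ever */
    return alloc->max_size_ever + extra <= alloc->reserved;
}

/* On success the caller bumps the high-water mark (in a real GC this
 * update would need to be synchronized to stay thread-safe). */
static void commit_append(Allocation *alloc, size_t extra)
{
    alloc->max_size_ever += extra;
}
```

Note that a trailing slice (one ending exactly at `max_size_ever`) passes the check even if it doesn't start at the beginning of the allocation, which is precisely the case discussed next.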
> 
> Note that the rule above allows appending to trailing slices; this is 
> unsafe for mutable arrays, since those have traditionally been 
> reallocated in this case and programs may rely on them not sharing data 
> with the original. However, it is safe for invariant arrays, since the 
> shared prefix can't change -- the only way to detect sharing would be 
> pointer comparisons, which are not required to provide meaningful 
> information anyway (IIRC the spec allows "moving" garbage collectors, 
> so you can't count on your data staying where it is).
> 
> This rule would allow incremental string building to be fairly efficient 
> even for invariant arrays; at worst, the initial append will trigger a 
> reallocation before an allocated block of memory runs out. The rest will 
> only reallocate when it's actually needed just like with mutable-string 
> building. (At least, as long as you don't slice away elements from the end)

This approach would probably be the way to go - I don't know how bad the
implementation overhead would be, but probably not too high. In any case,
the idea sounds very appealing, and the behaviour it provides is very
unintrusive.

The block look-up currently seems to cache the last accessed block, so
successively appending to a single array is always O(1) - only when two
or more arrays are built in parallel does a real look-up have to be made.
This is probably sufficient for most use cases, and I think the
maintenance cost of a hashtable would outweigh the benefit.
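A toy sketch of that caching behaviour (hypothetical names; a fixed stub stands in for the real block search, and a counter makes the cache hits visible):

```c
#include <stddef.h>
#include <stdint.h>

static uintptr_t cached_start;   /* base of the last block looked up */
static size_t    cached_size;
static unsigned  slow_lookups;   /* counts full (non-cached) lookups */

/* Stand-in for the real block search; here blocks are simply assumed
 * to be 256 bytes and 256-byte aligned. */
static uintptr_t slow_find(uintptr_t p, size_t *size)
{
    slow_lookups++;
    *size = 256;
    return p & ~(uintptr_t)255;
}

static uintptr_t find_block(uintptr_t p)
{
    /* Unsigned subtraction: if p lies below cached_start, the
     * difference wraps to a huge value and the test fails, as wanted. */
    if (cached_size && p - cached_start < cached_size)
        return cached_start;                    /* cache hit: O(1) */
    cached_start = slow_find(p, &cached_size);  /* miss: real look-up */
    return cached_start;
}
```

Successive appends to one array keep hitting the cache; interleaving appends to a second array in a different block evicts it and forces a real look-up each time the blocks alternate.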

So a dedicated container for array building can of course never hurt, but
this scheme at least doesn't make one any more necessary than it is now.



More information about the Digitalmars-d mailing list