Iterators and Ranges: Comparing C++ to D to Rust
Ola Fosheim Grøstad
ola.fosheim.grostad at gmail.com
Mon Jun 14 17:49:46 UTC 2021
On Monday, 14 June 2021 at 17:19:35 UTC, Steven Schveighoffer
wrote:
> Oh for sure. Generalizing the relationships that nodes have
> would be pretty difficult. Trees need 2 pointers at least
> (possibly a parent pointer as well), whereas a linked list only
> needs one (if it's not bidirectional).
Yes, my idea is to keep the chunk-header within two cache lines
(or one), but there should be auxillary space there for the
container as well. So when you insert a chunk-list into a new
container there is meta-info about the nature of the data (are
all chunks filled up, data sorted etc), so the container can then
decide whether to run over the chunks and "fix" the
structure/invariants. So if you take a singled linked list and
insert it into a double linked one with a hash index, then the
double linked list will run over it and update backpointers and
stuff in key-hash-values.
However, in my next attempt at this I will try to make it SIMD
friendly, I think. So using a struct-of-arrays with maybe a
bitmask that says whether a slot is occupied. So each chunk would
have a 32, 64 or 256 element capacity or something like that.
Then maybe a table pointer would be a pointer to the chunk + a
bitmask? But then it makes more sense to use a ranges approach.
So you point to first chunk with bitmask and the last chunk with
bitmask. So pointing to the end would be pointing to the last
chunk with a zero-bitmask? Not sure… :-D
It is funny how implementation choices for containers changes how
one thinks about table pointers/iterators/ranges.
> Unless you allocate nodes for the actual spaces, you have to
> deal with this kind of stuff. I felt like pointing at an
> element, and then storing whether that data is valid or not was
> a pretty reasonable way to have element pointers without safety
> problems.
Yes, but how does it affect efficiency?
It really depends on how they are used, for instance, if you
provide all the algorithms yourself, then inefficent
table-pointers/cursors can be turned into internal raw pointers
and then it does not matter as much (as they only are begin/end
markers that are mutated infrequently).
> My original use case for making dcollections was to have a
> fast-lookup map (using RBTree), and be able to remove items as
> I iterate it the tree. Using ranges for this just doesn't make
> sense in the general case, but using cursors works well.
Yes. I am concerned about how a struct-of-arrays datastructure
approach affects these kind of considerations.
Of course, there is also the question of whether the LDC/GDC
backend are able to compile to bitmasked SIMD instructions
(bitmasks that says whether an element is present or not)?? I
wonder if that is at all possible or if I will have to write
explicit SIMD code?
More information about the Digitalmars-d
mailing list