Iterators and Ranges: Comparing C++ to D to Rust
Steven Schveighoffer
schveiguy at gmail.com
Mon Jun 14 17:19:35 UTC 2021
On 6/14/21 1:00 PM, Ola Fosheim Grøstad wrote:
> On Monday, 14 June 2021 at 16:29:09 UTC, Steven Schveighoffer wrote:
>> However, in particular the red-black-tree based sets/maps just used
>> the Node pointer as-is for slicing, regardless of whether it was empty
>> or not. Which is very much not correct according to the concept. But I
>> haven't touched the code or used it in over a decade, so I'm not sure
>> if I'll ever fix that.
>
> Was that because it was harder to do for red-black-trees?
No, I think it was just a mistake of design.
The end node in a RBT is always the root (everything is based on a left
child), which makes it O(1) always to fetch it. But it's not a true
element, there is no data stored there. So the easiest thing to do was
to return a cursor with `end` as the node, and `false` for whether the
cursor had data. Which probably lead to just ignoring the `hasData` boolean.
But it should have been basically `cursor(_end.prev, false)` for the
result of `end()`, and the slice operator would have to take into
account which "space" a cursor was pointing at.
>
> I have been toying with chunked containers that allows fast transfers of
> ranges of objects from one container type to another.
>
> The basic idea being that you could have a queue and then quickly
> transfer the whole queue to a B-tree or maybe the first 100 objects of
> the queue to the middle of a deque.
>
> But it is challenging to come up with a chunk type that works equally
> well for many container types. So I limit it to sequential container
> types. Then have links between the chunks (so basically a linked list of
> small arrays). That way I can have one iterator type that works for all
> container types.
Oh for sure. Generalizing the relationships that nodes have would be
pretty difficult. Trees need 2 pointers at least (possibly a parent
pointer as well), whereas a linked list only needs one (if it's not
bidirectional).
> I would also like to make it possible to have table pointers that
> doesn't become invalidated when moving a chunk, but how to do this if
> they point to the space between elements and not the element itself?
> Like if it points to the space between two chunks and you move the chunk
> after to a new container then the table pointer would move to the new
> container as well (or the implementation would be inefficient).
>
> Seems like the semantics of pointing to elements instead of the space
> between elements is more robust for this kind of container semantics.
> So, when you mutate then the distinction between pointing to or between
> elements become significant.
Unless you allocate nodes for the actual spaces, you have to deal with
this kind of stuff. I felt like pointing at an element, and then storing
whether that data is valid or not was a pretty reasonable way to have
element pointers without safety problems.
Thinking about it some more, it may have been a conscious decision not
to care about the "valid" boolean at first, since I wasn't focused on
implementing the cursors as ranges at all.
But if I did it again today, this is exactly what I would do.
My original use case for making dcollections was to have a fast-lookup
map (using RBTree), and be able to remove items as I iterate it the
tree. Using ranges for this just doesn't make sense in the general case,
but using cursors works well.
-Steve
More information about the Digitalmars-d
mailing list