Iterators and Ranges: Comparing C++ to D to Rust

Mon Jun 14 17:19:35 UTC 2021

On 6/14/21 1:00 PM, Ola Fosheim Grøstad wrote:
> On Monday, 14 June 2021 at 16:29:09 UTC, Steven Schveighoffer wrote:
>> However, in particular the red-black-tree based sets/maps just used 
>> the Node pointer as-is for slicing, regardless of whether it was empty 
>> or not. Which is very much not correct according to the concept. But I 
>> haven't touched the code or used it in over a decade, so I'm not sure 
>> if I'll ever fix that.
> 
> Was that because it was harder to do for red-black-trees?

No, I think it was just a mistake of design.

The end node in a RBT is always the root (everything is based on a left 
child), which makes it O(1) always to fetch it. But it's not a true 
element, there is no data stored there. So the easiest thing to do was 
to return a cursor with `end` as the node, and `false` for whether the 
cursor had data. Which probably lead to just ignoring the `hasData` boolean.

But it should have been basically `cursor(_end.prev, false)` for the 
result of `end()`, and the slice operator would have to take into 
account which "space" a cursor was pointing at.

> 
> I have been toying with chunked containers that allows fast transfers of 
> ranges of objects from one container type to another.
> 
> The basic idea being that you could have a queue and then quickly 
> transfer the whole queue to a B-tree or maybe the first 100 objects of 
> the queue to the middle of a deque.
> 
> But it is challenging to come up with a chunk type that works equally 
> well for many container types. So I limit it to sequential container 
> types. Then have links between the chunks (so basically a linked list of 
> small arrays). That way I can have one iterator type that works for all 
> container types.

Oh for sure. Generalizing the relationships that nodes have would be 
pretty difficult. Trees need 2 pointers at least (possibly a parent 
pointer as well), whereas a linked list only needs one (if it's not 
bidirectional).

> I would also like to make it possible to have table pointers that 
> doesn't become invalidated when moving a chunk, but how to do this if 
> they point to the space between elements and not the element itself? 
> Like if it points to the space between two chunks and you move the chunk 
> after to a new container then the table pointer would move to the new 
> container as well (or the implementation would be inefficient).
> 
> Seems like the semantics of pointing to elements instead of the space 
> between elements is more robust for this kind of container semantics. 
> So, when you mutate then the distinction between pointing to or between 
> elements become significant.

Unless you allocate nodes for the actual spaces, you have to deal with 
this kind of stuff. I felt like pointing at an element, and then storing 
whether that data is valid or not was a pretty reasonable way to have 
element pointers without safety problems.

Thinking about it some more, it may have been a conscious decision not 
to care about the "valid" boolean at first, since I wasn't focused on 
implementing the cursors as ranges at all.

But if I did it again today, this is exactly what I would do.

My original use case for making dcollections was to have a fast-lookup 
map (using RBTree), and be able to remove items as I iterate it the 
tree. Using ranges for this just doesn't make sense in the general case, 
but using cursors works well.

-Steve