Safe Cursors and Ranges
Steven Schveighoffer
schveiguy at yahoo.com
Thu Aug 26 11:32:54 PDT 2010
On Thu, 26 Aug 2010 13:30:35 -0400, Pillsy <pillsbury at gmail.com> wrote:
> Steven Schveighoffer Wrote:
>
>> On Thu, 26 Aug 2010 11:17:32 -0400, Pillsy <pillsbury at gmail.com>
>> wrote:
> [...]
>> > If you don't mind me asking, what made you give up on it?
>
>> A cursor that refers to one element is much more efficient to pass
>> around/copy.
>
> Ah, yes. If you just need the one element, I figure that you can use
> a range. In most of the cases I'm familiar with, ranges are not
> distinctly less lightweight than array slices, and in D one tends to
> blithely pass slices all over the place. Things could be different for
> iterating over trees---is that where you ran into trouble with the
> idea in dcollections?
I think there is a misunderstanding of how I went about defining
dcollections cursors. I never actually tried the tripod idea, I just
posted it to the newsgroup as a possible solution for dcollections cursors.
Originally in dcollections 1, cursors were iterators -- allowing movement
of the cursor as well as dereferencing. This was way way back, before
ranges were a prevalent concept and I was trying to write a better
collection library for tango.
When negotiating with Andrei on allowing me to have cursors (when I hoped
dcollections could be a part of phobos), I went through several different
ideas to try and make cursors safe. I ended up trying to define a simple
reference type that didn't allow iterating. That is, a reference to a
single item that cannot move, thereby removing the unsafe nature of
iterators. I ran into the problem that the "end" cursor was pointing to
an invalid element, so you still had to rely on the coder not referencing
it, just like in C++. I decided I'd add a bool to the cursor to let it
know it was invalid, and the idea hit me all of a sudden -- that's just a
0 or 1-element range depending on the bool.
So the current incarnation was born. And I have to say, it's quite
usable. Once I defined it that way, the code almost wrote itself.
The issue with using ranges to refer to single elements is, it's difficult
to keep a node-based range intact, for later use. For example, if you
want to refer to a single element in a tree, and you use a range to do
that (assuming it's a range with a start and end node), things can
happen. More elements could be inserted between them, the end marker of
your range could get removed, etc. Then when you pass the range to the
tree's remove function, you're giving it an invalid range, or removing
more elements than you wanted. Cursors are not susceptible to these
problems, since they only ever refer to one element.
So there *is* value in having a special cursor type that only refers to
one element vs. a range, regardless of the copy performance.
>> Because some cursors do not need the hard reference, it is trivial
>> to tell whether such cursors belong to a range. For instance, it's
>> trivial to tell that a pointer is part of an array. So a cursor for an
>> array would not need to refer the range it belongs to.
>
> Ah, I see. That would make things harder to use for the specific use
> case I'm thinking of, where one has a range one would like to split
> at a certain element (which you might have found any number of
> ways). Basically, you can just pass around the cursor, instead of
> needing the cursor and the range.
>
> This strikes me as likely to be very helpful for three-legged races^W
> cases.
Sure, but you can build such a type and its corresponding splitter
function (which actually could be a member of the tripod) in terms of the
low-level function which are more powerful because they accept two
different entities -- the range and the cursor.
>> For instance, if I have an array, and a cursor in that array, it's
>> trivial
>> for an allBefore operation to determine that the cursor is a member
>> of the array. Now, let's say we have two slices that contain the
>> same cursor, in your scheme, in order to use one cursor against
>> the other range, you need to recompose the cursor, which seems
>> unnecessarily awkward.
>
> The way I see it, two different slices can't contain the same cursor;
> a cursor into a slice will always be associated with that slice and no
> other. Maybe there are important use cases I'm missing by defining
> things this way, but I don't know what they might be.
> [...]
Not sure what the use cases would be, but defining the simplest thing
possible and then building more complex concepts on top of them usually
allows more possibilities.
>> For example, if I want to store a link list element cursor so I can
>> change it later, why do I also have to carry around the baggage of
>> the range it came from?
>
> You may not, but why not just hang on to the range it came from
> in the first place instead of using the cursor? What's in the link-list
> range besides a pointer to a node anyway?
The pointer to the possibly invalid end node. See my above description.
> [...]
>> > What I'm thinking is that cursors aren't skinny ranges, they're
>> > ranges that are even fatter. You simply wouldn't use one
>> > where a range would suffice. Allowing questions about whether
>> > a range contains a cursor or not strikes me as just buying
>> > trouble.
>
>> But you don't always care if a cursor is a part of a range. For
>> example, to change or refer to a specific element of a range,
>> why do you need to refer to the range also?
>
> You don't, but what do you lose by referring to the range also? A
> few words on the stack (sticking with the idea that ranges are value
> types)?
It becomes awkward when you are passing a range around but are saying
"well, really, I'm only passing you a reference to the first element,
ignore the rest of it". You end up naming things like
removeFirstElement(range r) instead of remove(cursor c). This is from my
experience with defining dcollections and a std.container redblacktree.
> [...]
>> I don't have a problem with defining such a three-legged type,
>> but can we call it something besides a cursor?
>
> Sure. How about Tripod? 8^)
I like it :)
-Steve
More information about the Digitalmars-d
mailing list