complement to $
Steven Schveighoffer
schveiguy at yahoo.com
Mon May 17 04:45:55 PDT 2010
On Sun, 16 May 2010 02:24:55 -0400, Don <nospam at nospam.com> wrote:
> Steven Schveighoffer wrote:
>> Currently, D supports the special symbol $ to mean the end of a
>> container/range.
>> However, there is no analogous symbol to mean "beginning of a
>> container/range". For arrays, there is none necessary, 0 is always the
>> first element. But not all containers are arrays.
>> I'm running into a dilemma for dcollections, I have found a way to
>> make all containers support fast slicing (basically by imposing some
>> limitations), and I would like to support *both* beginning and end
>> symbols.
>> Currently, you can slice something in dcollections via:
>> coll[coll.begin..coll.end];
>> I could replace that end with $, but what can I replace coll.begin
>> with? 0 doesn't make sense for things like linked lists, maps, sets,
>> basically anything that's not an array.
>> One thing that's nice about opDollar is I can make it return coll.end,
>> so I control the type. With 0, I have no choice, I must take a uint,
>> which means I have to check to make sure it's always zero, and throw an
>> exception otherwise.
>> Would it make sense to have an equivalent symbol for the beginning of
>> a container/range?
>> In regex, ^ matches beginning of the line, $ matches end of the line
>> -- would there be any parsing ambiguity there? I know ^ is a binary
>> op, and $ means nothing anywhere else, so the two are not exactly
>> equivalent. I'm not very experienced on parsing ambiguities, but
>> things like ~ can be unambiguous as binary and unary ops, so maybe it
>> is possible.
>> So how does this look: coll[^..$];
>> Thoughts? other ideas?
>> -Steve
>
> If we were to have something like this (and I'm quite unconvinced that
> it is desirable), I'd suggest something beginning with $, eg $begin.
This would be better than nothing.
> But, it seems to me that the slicing syntax assumes that the slicing
> index can be mapped to the natural numbers. I think in cases where
> that's not true, slicing syntax just shouldn't be used.
slicing implies order, that is for sure. But mapping to natural numbers
may be too strict. I look at slicing in a different way. Hopefully you
can follow my train of thought.
dcollections, as a D2 lib, should support ranges, I think that makes the
most sense. All containers in dcollections are classes, so they can't
also be ranges (my belief is that a reference-type based range is too
awkward to be useful). The basic operation to get a range from a
container is to get all the elements as a range (a struct with the range
interface).
So what if I want a subrange? Well, I can pick off the ends of the range
until I get the right elements as the end points. But if it's possible,
why not allow slicing as a better means of doing this? However, slicing
should be a fast operation. Slicing quickly isn't always feasible, for
example, LinkList must walk through the list until you find the right
element, so that's an O(n) operation. So my thought was to allow slicing,
but with the index being a cursor (i.e. pointer) to the elements you want
to be the end points.
Well, if we are to follow array convention, and want to try not to enforce
memory safety, we should verify those end points make sense, we don't want
to return an invalid slice. In some cases, verifying the end points are
in the correct order is slow, O(n) again. But, you always have reasonably
quick access to the first and last elements of a container, and you *know*
their order relative to any other element in the container.
So in dcollections, I support slicing on all collections based on two
cursors, and in all collections, if you make the first cursor the
beginning cursor, or the second cursor the end cursor, it will work. In
some cases, I support slicing on arbitrary cursors, where I can quickly
determine validity of the cursors. The only two cases which allow this
are the ArrayList, which is array based, and the Tree classes (TreeMap,
TreeSet, TreeMultiset), where determining validity is at most a O(lgN)
operation.
Essentially, I see slicing as a way to create a subrange of a container,
where the order of the two end points can be quickly verified.
auto dict = new TreeMap!(string, string); // TreeMap is sorted
...
auto firstHalf = dict["A".."M"];
(You say that slicing using anything besides natural numbers shouldn't be
used. You don't see any value in the above?)
But "A" may not be the first element, there could be strings that are less
than it (for example, strings that start with _), such is the way with
arbitrary maps. So a better way to get the first half may be:
auto firstHalf = dict[dict.begin.."M"];
What does the second half look like?
auto secondHalf = dict["M"..dict.end];
Well, if we are to follow array convention, the second half can be
shortcutted like this:
auto secondHalf = dict["M"..$];
Which looks and reads rather nicely. But there is no equivalent "begin"
shortcut because $ was invented for arrays, which always have a way to
access the first element -- 0. Arbitrary maps have no such index. So
although it's not necessary, a shortcut for begin would also be nice.
Anyways, that's what led me to propose we have some kind of short cut. If
nothing else, at least I hope you now see where I'm coming from, and
hopefully you can see that slicing is useful in cases other than natural
number indexes.
-Steve
More information about the Digitalmars-d
mailing list