A Discussion of Tuple Syntax
Meta
jared771 at gmail.com
Fri Aug 16 14:07:50 PDT 2013
A while ago Kenji posted this excellent DIP
(http://wiki.dlang.org/DIP32) that aimed to improve tuple syntax,
and described several cases in which tuples could be
destructured. You can see his original thread here:
http://forum.dlang.org/thread/mailman.372.1364547485.4724.digitalmars-d@puremagic.com,
and further discussion in this thread:
http://forum.dlang.org/thread/dofwinzpbcdwkvhzcgrk@forum.dlang.org.
It seemed that there was a lot of interest in having syntax
somewhat like what is described in Kenji's DIP, but it didn't
really go anywhere. There is this pull request on GitHub
(https://github.com/D-Programming-Language/dmd/pull/341), but it
uses the (a, b) syntax, which has too much overlap with other
language constructs. Andrei/Walter didn't want to merge that pull
request without a full consideration of the different design
issues involved, which in retrospect was a good decision.
That said, I'd like to open the discussion on tuple syntax yet
again. Tuples are currently sorely underused in D, due in large
part to being difficult to understand and awkward to use. One
large barrier to entry is the fact that D has not 1, not 2, but
3 different types of tuples (depending on how you look at it),
which are difficult to keep straight.
There is std.typecons.Tuple, which is fundamentally different
from std.typetuple.TypeTuple in that it's implemented as a struct,
while TypeTuple is just a template wrapped around the compiler
tuple type. ExpressionTuples are really just TypeTuples that
contain only values, and aren't mentioned anywhere except in
this article: http://dlang.org/tuple.html, which frankly creates
more confusion than clarity.
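To make the distinction concrete, here is a minimal sketch of the three kinds
side by side. Seq is a local stand-in written for this example (it is not a
Phobos name); it is assumed to mirror how the library TypeTuple template is
defined, which is just an alias for its own parameter list.

```d
import std.typecons : Tuple, tuple;

// Local stand-in for the library TypeTuple template (an assumption for
// this sketch): it only gives the compiler's built-in tuple a name.
template Seq(T...)
{
    alias Seq = T;
}

void main()
{
    // 1. std.typecons.Tuple: an ordinary struct that holds values.
    Tuple!(int, string) t = tuple(1, "a");
    assert(t[0] == 1 && t[1] == "a");

    // 2. A tuple of types: it exists only at compile time.
    alias Types = Seq!(int, string);
    static assert(Types.length == 2);
    static assert(is(Types[0] == int));

    // 3. A tuple of values, i.e. an "expression tuple".
    alias Values = Seq!(1, "a");
    static assert(Values[0] == 1);
    static assert(Values[1] == "a");

    // A tuple of types can declare a group of variables in one go.
    Types vars;
    vars[0] = 42;
    vars[1] = "x";
    assert(vars[0] == 42 && vars[1] == "x");
}
```

Only the first kind is a real value that can be returned from a function or
stored in an array; the other two are compile-time lists.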
A good, comprehensive design has the potential to make tuples
easy to use and understand, and hopefully clear up the unpleasant
situation we have currently. A summary of what has been discussed
so far:
- (a, b) is the prettiest syntax, and it is also completely
infeasible
- {a, b} is not as pretty, but it's not that bad of an
alternative (though it may still have issues as well)
- #(a, b) is unambiguous and would probably be the easiest
option. I don't think it looks too bad, but some people might
find it ugly and noisy
- How should tuples be expanded? There is the precedent of an
expand() method of std.typecons.Tuple, but Kenji liked tup[]
(slicing syntax). So with a tuple of #(1, "a", 0.0), tup[0..2]
would be an expanded tuple containing 1 and "a". On the other
hand, Bearophile and Timon Gehr preferred that slicing a tuple
create another "closed" tuple, and to use expand() for expansion.
So tup[] would create a copy of the tuple, and tup[0..2] would
create a closed tuple equivalent to #(1, "a"). I don't have any
particular preference in that regard.
- Timon Gehr wanted the ability to swap tuple values, so #(x, y)
= #(y, x) would be allowed. Kenji was against it, saying that it
would introduce too many complications.
- There was no consensus on the pattern matching syntax for
unpacking. For example, #(a, _) = #(1, 2) only introduces one
binding, "a", into the surrounding scope. The question is, what
character should go in the place of "_" to signify that a value
should not be bound? Some suggestions were #(a, $), #(a, @), #(a,
?). I personally think #(a, ?) or #(a, *) would be best, but all
that's really necessary is a symbol that cannot also be an
identifier.
Also up for debate were nested patterns, e.g., #(1, 2, #(3, 4,
#(5, 6))). I don't think there was a consensus on unpacking and
pattern matching for this situation. One idea I saw that looked
good:
* Use "..." to pattern match on the tail of an
expression. Take the above tuple: the pattern #(1, ?, ...)
would match the two nested sub-tuples. Or, say, #(1, 2, 3) could
be matched by #(1, 2, 3), #(1, ?, 3), #(1, ...), etc. You
obviously can't refer to "..." as a variable, so it also becomes
a useful way of saying "don't care" for multiple items, e.g.,
#(a, ...) -> only bind the first item in the tuple. We can play
around with this to get a few other useful constructs, such as
#(a, ..., b) -> match first and last, #(..., b) -> match last,
etc.
Assuming the "..." syntax for unpacking, it would be useful to
name the captured tail. For example, you could unpack #(1, 3,
#(4, 6)) into #(a, b, x...), where a = 1, b = 3, x = #(4, 6).
Similarly, #(head, rest...) results in head = 1, rest = #(3, #(4,
6)). I think this would be very useful.
- Concatenating tuples with ~. This is nice to have, but not
particularly important.
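For comparison, here is a rough sketch of how some of the operations listed
above can be approximated with the library Tuple as it stands. This is not the
proposed syntax, only the nearest existing equivalents as I understand them.
The describe function is a hypothetical helper made up for the example, and
the expansion is spelled tup.expand (a property, without parentheses).

```d
import std.algorithm : swap;
import std.conv : to;
import std.typecons : Tuple, tuple;

// Hypothetical helper: takes two separate arguments, not a tuple.
string describe(int n, string s)
{
    return to!string(n) ~ ":" ~ s;
}

void main()
{
    auto tup = tuple(1, "a", 0.0);

    // "Expanded" slice: the first two fields become separate arguments.
    assert(describe(tup.expand[0 .. 2]) == "1:a");

    // "Closed" slice: re-wrap the expanded fields in a new Tuple,
    // roughly what #(1, "a") would denote.
    auto closed = tuple(tup.expand[0 .. 2]);
    static assert(is(typeof(closed) == Tuple!(int, string)));
    assert(closed == tuple(1, "a"));

    // Swapping two values, the library counterpart of #(x, y) = #(y, x).
    int x = 1, y = 2;
    swap(x, y);
    assert(x == 2 && y == 1);

    // Unpacking by hand while ignoring a field, as in #(a, ?) = #(1, 2).
    auto pair = tuple(1, 2);
    auto a = pair[0];
    assert(a == 1);

    // Concatenation, which is what tup1 ~ tup2 might mean.
    auto joined = tuple(closed.expand, pair.expand);
    assert(joined == tuple(1, "a", 1, 2));
}
```

Every step needs either an explicit index or an explicit re-wrap, which is the
awkwardness a dedicated syntax would remove.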
One thing that I think was overlooked, but would be pretty cool,
is that a tuple unpacking/pattern matching syntax would allow us
to unpack/pattern match just about anything that you can make a
tuple of in D. Combine this with the .tupleof property, and
things get interesting... Maybe. There is one possible problem:
.tupleof returns a TypeTuple, and it's not at all clear to me
how, if at all, TypeTuple would work with the proposed syntax. Is
#(int, string, bool) a valid tuple instantiation? This is
something that needs to be worked out.
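As a small illustration of the .tupleof idea, the sketch below shows what
.tupleof gives you today and how it can be re-wrapped as a library Tuple. Point
and toTuple are names invented for this example. How the proposed #(...)
patterns would apply directly to the raw .tupleof result remains the open
question.

```d
import std.typecons : Tuple, tuple;

// Hypothetical struct used only for illustration.
struct Point
{
    int x;
    string label;
}

// Hypothetical helper: copy a struct's fields into a std.typecons.Tuple.
Tuple!(int, string) toTuple(Point p)
{
    return tuple(p.tupleof);
}

void main()
{
    auto p = Point(3, "origin");

    // .tupleof exposes the fields as a compiler tuple.
    static assert(Point.tupleof.length == 2);
    assert(p.tupleof[0] == 3);

    // Its type is a tuple of the field types.
    alias FieldTypes = typeof(p.tupleof);
    static assert(is(FieldTypes[1] == string));

    // Re-wrapped as a Tuple, it is the kind of value the proposed
    // unpacking syntax would operate on.
    auto t = toTuple(p);
    assert(t[0] == 3 && t[1] == "origin");

    // Fields can also be written through .tupleof.
    p.tupleof[0] = 7;
    assert(p.x == 7);
}
```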
This is the third or fourth time that I know of that tuple syntax
has come up, and as of yet, nothing has been done about it. I'd
really like to get the ball rolling on this, as I think a good
syntax for these tuple operations would do D a world of good. I'm
not a compiler hacker, unfortunately, so I can't implement it
myself as proof of concept... However, I hope that discussing it
and working out all the kinks will help pave the way for an
actual implementation.