A Discussion of Tuple Syntax
bearophile
bearophileHUGS at lycos.com
Tue Aug 20 18:43:45 PDT 2013
Andrei Alexandrescu:
> 1. What do we need?
We can live fine without a syntax to manage tuples, so strictly
speaking we need nothing (almost, I'd like to kill one currently
accepted syntax, see below).
But when you write D in a high-level style, and you are used to
other functional or scripting languages, in several cases you
desire a more handy/compact syntax to manage tuples.
Tuples are simple data structure, so the operations you want to
do with them are few and simple:
- To define a tuple of various fields of different type, that is
a tuple literal, to create a tuple;
- A way to read and write items of a tuple (unless it's
read-only), D uses a nice syntax [i] as arrays indexing, but i
can't be a run-time value.
- With the Phobos tuple we are used to giving names to fields.
It's handy, but it's not so necessary if you have a way to assign
tuple items to variables, defined in-place. There are several
places where pulling apart a tuple in that way is useful, here I
use a basic syntax to be more clear:
Assignments:
auto (a, b) = t1;
In function signatures:
auto mult = ((int x, int y)) => x * y;
In foreach loops:
void main() {
import std.range, std.stdio;
auto a = [10, 20];
auto b = [100, 200];
// Currently D supports this syntax:
foreach (x, y; zip(a, b))
writeln(x, " ", y);
// But it's unreliable, now x is the index of the
// array instead of the first tuple fields, so I
// think this feature of D should be killed as
// soon as possible:
foreach (x, y; zip(a, b).array)
writeln(x, " ", y);
}
Its output:
10 100
20 200
0 Tuple!(int, int)(10, 100)
1 Tuple!(int, int)(20, 200)
In (final) switch cases, this also shows one use case and one
usefulness of wildcards (and the usage of an unapply() standard
method, as explained a bit here:
http://d.puremagic.com/issues/show_bug.cgi?id=596 ):
final switch (tup1) {
case (0, 10): break;
case (0, ?): break;
case (?, ?): break;
}
An nice example usage of such basic pattern matching is visible
here, to implement the insertion in a red-black-tree. This is a
well known hairy operation to do in languages as C:
http://rosettacode.org/wiki/Pattern_matching
This is how insertion and balancing is done in Haskell (this
needs long lines and it probably comes out unreadable here, so
see here for a better visualization:
http://rosettacode.org/wiki/Pattern_matching#Haskell ):
data Color = R | B
data Tree a = E | T Color (Tree a) a (Tree a)
balance :: Color -> Tree a -> a -> Tree a -> Tree a
balance B (T R (T R a x b) y c ) z d
=
T R (T B a x b) y (T B c z d)
balance B (T R a x (T R b y c)) z d
=
T R (T B a x b) y (T B c z d)
balance B a x (T R (T R b y c) z d
) =
T R (T B a x b) y (T B c z d)
balance B a x (T R b y (T
R c z d)) =
T R (T B a x b) y (T B c z d)
balance col a x b = T col a x b
insert :: Ord a => a -> Tree a -> Tree a
insert x s = T B a y b where
ins E = T R E x E
ins s@(T col a y b)
| x < y = balance col (ins a) y b
| x > y = balance col a y (ins b)
| otherwise = s
T _ a y b = ins s
There are few more little pieces that are useful, like unpacking
an array:
string s = "red blue";
auto (col1, col2) = s.split;
Hara has suggested pattern matching in if statements, that is a
special case of the pattern matching in switch cases, and it's
sometimes useful (similar code is common in Scala):
if (auto {1, y} = tup) {...}
That is equivalent to:
if (tup[0] == 1) {
auto y = tup[1];
...
}
It looks simple and not so essential, but when tuples become a
little more more complex pattern matching shows it usefulness
better, as in the Haskell example.
Languages as Haskell and Rust also show that even a simple
version of pattern matching is handy when you manage types like
Nullable() (that need to define an unapply() standard method).
(It's not unthinkable to support pattern matching in function
signatures too, as often used in many functional languages as
OCaML and Haskell, but this is an orthogonal feature, it could be
added later, and we can probably live without it for now).
Others have suggested a syntax with ... to gather more than one
item, or to ignore more than one item with ?... Such syntax is
handy with arrays but with tuples, that are often short and of
well known size I think it's not so much useful, so it can be
left for later.
> 2. What does a solution look like in the current language?
The DIP32 shows a complete very small program that I wrote for
RosettaCode site, it computes the Huffman encoding of a given
string:
http://wiki.dlang.org/DIP32
import std.stdio, std.algorithm, std.typecons, std.container,
std.array;
auto encode(T)(Group!("a == b", T[]) sf) {
auto heap = sf.map!(s => tuple(s[1], [tuple(s[0], "")]))
.array.heapify!q{b < a};
while (heap.length > 1) {
auto lo = heap.front; heap.removeFront;
auto hi = heap.front; heap.removeFront;
foreach (ref pair; lo[1]) pair[1] = '0' ~ pair[1];
foreach (ref pair; hi[1]) pair[1] = '1' ~ pair[1];
heap.insert(tuple(lo[0] + hi[0], lo[1] ~ hi[1]));
}
return heap.front[1].schwartzSort!q{tuple(a[1].length, a[0])};
}
void main() {
auto s = "this is an example for huffman encoding"d;
foreach (p; s.dup.sort().release.group.encode)
writefln("'%s' %s", p[]);
}
That little program shows only three use cases of a tuple syntax:
in function signatures, in normal assignments, and in foreach:
import std.stdio, std.algorithm, std.container, std.array;
auto encode(T)(Group!("a == b", T[]) sf) {
auto heap = sf.map!((c, f) => (f, [(c,
"")])).array.heapify!q{b < a};
while (heap.length > 1) {
auto (lof, loa) = heap.front; heap.removeFront;
auto (hif, hia) = heap.front; heap.removeFront;
foreach ((_, ref e); loa) e = '0' ~ e;
foreach ((_, ref e); hia) e = '1' ~ e;
heap.insert((lof + hif, loa ~ hia));
}
return heap.front[1].schwartzSort!((c, e) => (e.length, c));
}
void main() {
auto s = "this is an example for huffman encoding"d;
foreach ((c, e); s.dup.sort().release.group.encode)
writefln("'%s' %s", c, e);
}
For an use case of the assignment from an array, let's say you
have a text file containing two numbers in a line, that you want
to read:
const (x, y) = filein.byLine.front.split.to!(int[]);
An example of implementation in a red-black-tree of the insert
and balance in C is not too much hard to find with Google. But
it's lot of bug-prone code.
Bye,
bearophile
More information about the Digitalmars-d
mailing list