Suggested std.variant.Algebraic redesign
bearophile
bearophileHUGS at lycos.com
Sat Feb 23 13:44:18 PST 2013
Here I propose a different usage syntax for
std.variant.Algebraic. I explain why with a series of steps. The
proposed syntax is at the bottom.
- - - - - - - - - - - - - - - - - - - -
To start with an example of a simple and common algebraic data
type, this is in F# language, and it's used in a Decision Tree
classifier (other functional languages use a similar syntax):
type Tree =
| Conclusion of string
| Choice of string * (string * Tree) []
- - - - - - - - - - - - - - - - - - - -
In Haskell the definition of that type is similar:
data Tree = Conclusion String
| Choice (String, [(String, Tree)])
Or more simply (here Choice doesn't build a tuple):
data Tree = Conclusion String
| Choice String [(String, Tree)]
t = Choice "Sci-Fi"
[("No", Choice "Action"
[("Yes", Conclusion "Stallone"),
("No", Conclusion "Schwarzenegger")]),
("Yes", Conclusion "Schwarzenegger")]
- - - - - - - - - - - - - - - - - - - -
In D if you don't add class constructors and toString that's
similar to:
abstract class Tree {}
final class Conclusion : Tree {
string result;
}
final class Choice : Tree {
string key;
Tuple!(string, Tree)[] branches;
}
- - - - - - - - - - - - - - - - - - - -
A better D version (I have used opCall to have less noisy tree
literals):
import std.typecons: Tuple, tuple;
import std.string: format;
import std.array: replace;
abstract class Tree {
override string toString() const;
}
final class Conclusion : Tree {
string result;
static typeof(this) opCall(typeof(result) result_) {
auto r = new typeof(this)();
r.result = result_;
return r;
}
override string toString() const {
return `Conclusion("` ~ result ~ `")`;
}
}
final class Choice : Tree {
string key;
Tuple!(string, Tree)[] branches;
static typeof(this) opCall(typeof(key) key_,
typeof(branches) branches_) {
auto r = new typeof(this)();
r.key = key_;
r.branches = branches_;
return r;
}
override string toString() const {
return format(`Choice("%s", %s)`, key, branches)
.replace("const(Tuple!(string, Tree))", "B");
}
}
void main() {
import std.stdio;
alias B = typeof(Choice.branches[0]);
auto t = Choice("Sci-Fi",
[B("No", Choice("Action",
[B("Yes",
Conclusion("Stallone")),
B("No",
Conclusion("Schwarzenegger"))])),
B("Yes", Conclusion("Schwarzenegger"))]);
writeln(t);
}
- - - - - - - - - - - - - - - - - - - -
The B type is necessary because the literal:
tuple("foo", Conclusion("bar"))
is of type:
Tuple!(string, Conclusion)
Instead of a type similar to this as in Haskell:
Tuple!(string, Tree)
- - - - - - - - - - - - - - - - - - - -
Currently in D you can't use a std.variant.Algebraic for this
common situation because it doesn't support recursive data types.
Once Algebraic supports recursive types, I think it also should
support field names; maybe like this:
alias Tree = Algebraic!(
string, "Conclusion",
Tuple!(string, Tuple!(string, Tree)[]), "Choice"
);
(Is it possible to create an Algebraic that supports recursive
data types? Inside the definition of Tree there is a reference to
the Tree identifier that will be created by the alias.)
- - - - - - - - - - - - - - - - - - - -
But putting the names on the front as in F#/OCaML/Haskell and
other functional languages seems more standard and more clear:
alias Tree = Algebraic!(
"Conclusion", string,
"Choice", Tuple!(string, Tuple!(string, Tree)[])
);
- - - - - - - - - - - - - - - - - - - -
The names are able to tell apart the fields so there is no need
to wrap the second-level types in a Tuple, as in F# (so after the
string that represents the field name there is one or more types
separated by a comma), this seems better and good enough:
alias Tree = Algebraic!(
"Conclusion", string,
"Choice", string, Tuple!(string, Tree)[]
);
- - - - - - - - - - - - - - - - - - - -
If that idea goes well, then I think the equivalent D version
becomes:
import std.typecons: Algebraic;
alias Tree = Algebraic!(
"Conclusion", string,
"Choice", string, Tuple!(string, Tree)[]
);
void main() {
import std.stdio;
alias Choice = Algebraic.Choice;
alias Conclusion = Algebraic.Conclusion;
auto t = Choice("Sci-Fi",
[tuple("No", Choice("Action",
[tuple("Yes", Conclusion("Stallone")),
tuple("No", Conclusion("Schwarzenegger"))])),
tuple("Yes", Conclusion("Schwarzenegger"))]);
writeln(t);
}
This D code is not as good as equivalent Rust/Scala code, but I
think it's acceptable. All this assumes it's possible to create
an Algebraic that supports recursive data types.
Here I think the alias B is not necessary, because Choice and
Conclusion return a value of type Tree, so tuple("x",
Conclusion("y")) is of type Tuple!(string, Tree) and the arrays
are of the correct type for Choice.
- - - - - - - - - - - - - - - - - - - -
If a more succinct tuple literal syntax will be introduced, that
code will look even more like functional language code (with just
few more parentheses that increase the noise a little. On the
other hand similar nested literals are not common in code, they
are hard to write in Haskell too. Such data structures are
usually built by the code).
- - - - - - - - - - - - - - - - - - - -
Better ideas are welcome :-)
Bye,
bearophile
More information about the Digitalmars-d
mailing list