First Draft: Callback For Matching Type

Mon Jun 24 18:20:36 UTC 2024

On Saturday, 22 June 2024 at 21:02:34 UTC, Richard (Rikki) Andrew 
Cattermole wrote:
> This proposal is a subset of matching capabilities to allow for 
> tagged unions to safely access with language support its values 
> and handle each tag.
>
> Some minor things have been changed from the ideas thread, I 
> have changed the match block to be a declaration block to allow 
> for ``static foreach`` and other conditional compilation 
> language features. So it is now using semicolon instead of 
> colon.
>
> ```d
> alias MTU = MyTaggedUnion!(int, float, string);
>
> MTU mtu = MTU(1.5);
>
> mtu.match {
> 	(float v) => writeln("a float! ", v);
> 	v => writeln("catch all! ", v);
> };
> ```
>
> Ideas thread: 
> https://forum.dlang.org/post/chzxzjiwsxmvnkthbdyy@forum.dlang.org
>
> Latest: 
> https://gist.github.com/rikkimax/79cbe199618b3f99104f7df2fc2a9681
>
> Permanent: 
> https://gist.github.com/rikkimax/79cbe199618b3f99104f7df2fc2a9681/95ae646da1ebb079a522b0c993e3408e5a1c0d78

I guess I have implemented something like that: 
https://d.godbolt.org/z/ePv4ndxeE

If I understand you correctly, we share the vision of a tagged 
union (I call them enum unions) as a type with certain members 
(duck typing), not an instance of a particular template.

But that’s where it seems our views diverge. In my 
implementation, the tag also allows distinct same-type options. 
(Options are discerned by tag, not by type.)

A type with the appropriate members is (usually) generated by 
mixing in a given mixin template (`EnumUnion`) which takes one 
parameter of struct type (usually a small private struct named 
`Impl`) and uses its data members (types and names) for types and 
tags. (I used to have `EnumUnion` take an array of strings for 
names and a type tuple for types, but those get really long 
really fast, and error messages become incomprehensible name–type 
gibberish.)

Example time! Let’s say we want simple expression parsing where 
an expression is a constant, a variable, a unary minus 
expression, or a binary plus or times expression.

```d
class Expr
{
     struct Binary { Expr lhs, rhs; }
     private static struct Impl
     {
         int constant;
         string variable;
         Expr minus;
         Binary plus, times;
     }
     mixin EnumUnion!Impl;
     // Provides: Constructors, a destructor if needed (not this case),
     // eponymous accessors (@safe get and @system set), @system re-assignment,
     // and some other stuff with two underscores in front.
}
```
Accessors:
* `constant`, `variable`, etc. getters return the 
constant/variable/… if the option is active; otherwise they fail 
with `assert(0)` and an error message.
* `constant`, `variable`, etc. setters make the 
constant/variable/… option active and assign a value. (@system)

Among the other stuff:
* `__is_constant`, `__is_variable`, etc. return a boolean 
indicating whether the option is active.
* `__as_constant`, `__as_variable`, etc. return a pointer to the 
mentioned option if it’s active, or `null`. Essentially a safe 
cast. Similar to `key in aa` for associative array lookup.
* `__unsafe_constant`, `__unsafe_variable`, etc. return a 
reference, checked by an `in` contract. (@system)
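
To make the shape of these members concrete, here is a minimal hand-written sketch of what a two-option enum union could look like. It is not the code that `EnumUnion` generates: the struct name `IntOrString`, the private field names, and the `Tag` enum are assumptions made for illustration, and setters and the `__unsafe_…` accessors are left out.

```d
import std.stdio : writeln;

// Hypothetical hand-written stand-in for a mixin-generated enum union
// with the options `constant` (int) and `variable` (string).
struct IntOrString
{
    private enum Tag : ubyte { constant, variable }
    private Tag tag;
    private union
    {
        int constant_;
        string variable_;
    }

    this(int value) @trusted { tag = Tag.constant; constant_ = value; }
    this(string value) @trusted { tag = Tag.variable; variable_ = value; }

    // Report whether an option is active.
    bool __is_constant() const @safe { return tag == Tag.constant; }
    bool __is_variable() const @safe { return tag == Tag.variable; }

    // Eponymous getter: fails hard if the option is not active.
    int constant() const @trusted
    {
        if (tag != Tag.constant)
            assert(0, "option `constant` is not active");
        return constant_;
    }

    // "Safe cast": pointer to the option if active, otherwise null.
    int* __as_constant() @trusted
    {
        return tag == Tag.constant ? &constant_ : null;
    }

    string* __as_variable() @trusted
    {
        return tag == Tag.variable ? &variable_ : null;
    }
}

void main()
{
    auto e = IntOrString(42);
    assert(e.__is_constant);
    assert(!e.__is_variable);

    // Same idiom as `if (auto p = key in aa)`.
    if (auto p = e.__as_constant)
        writeln("constant: ", *p);

    assert(e.__as_variable is null);
}
```

The pointer-returning accessors are what make the `key in aa` comparison apt: a single `if (auto p = …)` both tests the tag and gives access to the payload.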

We’re not done! Because enum unions aren’t simply instances of a 
template, but just duck-typed stuff, enum union types can be 
classes or structs depending on your needs and can have 
additional members!

```d
class Expr
{
     …

     int eval(int[string] context) => this.matchOrdered!(
         (constant) => constant,
         (variable) => context[variable],
         (minus)    => -minus.eval(context),
         (plus)     => plus.lhs.eval(context) + plus.rhs.eval(context),
         (times)    => times.lhs.eval(context) * times.rhs.eval(context),
     );
}
```

What is `matchOrdered`? A template defined in the same module as 
`EnumUnion`. It *requires* that all cases be handled (no 
default/catch-all) and in order of tags; that is, if you swap 
`(constant) => constant,` and `(variable) => context[variable],` 
you get an error. The error arises because the parameter names 
and tag names don’t line up, not because of a coincidental type 
mismatch.

There is also `match`, which also requires that all cases be 
handled, but in any order. Handlers are inspected for the names 
of their parameters, reordered, and passed to `matchOrdered`. 
Generally, prefer `matchOrdered`, as it gives better diagnostics.

There are also `matchOrderedDefault` and `matchDefault` which 
consider their last argument a default/catch-all handler.
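
As a rough illustration of the ordered variant, the following sketch dispatches handlers purely by tag position. It is a simplification under stated assumptions: the `Length` type and this free-standing `matchOrdered` are hypothetical and hard-wired to two options, and the sketch does not reproduce the parameter-name check described above. Both options carry an `int`, so only the tag distinguishes them.

```d
import std.stdio : writeln;

// Hypothetical two-option tagged value; the options share one payload type.
struct Length
{
    enum Tag { meters, feet }
    Tag tag;
    int value;
}

// Positional dispatch: handlers[i] handles the i-th tag.
// Simplified stand-in, not the `matchOrdered` from the post.
auto matchOrdered(handlers...)(Length len)
{
    static assert(handlers.length == 2,
        "exactly one handler per tag, in tag order");
    final switch (len.tag)
    {
        case Length.Tag.meters: return handlers[0](len.value);
        case Length.Tag.feet:   return handlers[1](len.value);
    }
}

void main()
{
    auto len = Length(Length.Tag.feet, 10);
    int inches = len.matchOrdered!(
        (meters) => meters * 39, // rough integer conversion
        (feet)   => feet * 12
    );
    writeln(inches); // 120
}
```

Because this sketch ignores parameter names, swapping the two handlers would compile and silently compute the wrong result, which is exactly the kind of mistake the name check is meant to catch.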

Tags are also used for construction (named parameters). If, by 
types, construction is ambiguous, a tag can be used to clarify:

```d
void main() @safe
{
     // Build (-2) * 1 + (-x)
     immutable Expr expr = new Expr(plus: Expr.Binary(
         new Expr(times: Expr.Binary(
             new Expr(-2),
             new Expr(1)
         )),
         new Expr(minus: new Expr("x"))
     ));
     import std.stdio;
     writeln(expr, " = ", expr.eval(["x": 1]));
}
```

For `plus` and `times`, tags are required as they’re 
indistinguishable otherwise. For `minus`, the tag is optional, 
but helps in understanding what’s built. For variables and 
constants, tags aren’t used in the example.

You could use enum unions to back sum types:
```d
struct SumType(Ts...)
{
     private static struct Impl
     {
         static foreach (i, alias T; Ts)
             mixin("T field", cast(int) i, ";");
     }
     mixin EnumUnion!Impl;
}
```

From what I see, you want to make `match` an intrinsic, and TBH, 
the value of
```d
x.match {
     // handlers
}
```
over
```d
x.match!(
     // handlers
)
```
is negligible.

The value of being a first-class language construct is similar to 
the `foreach` → `opApply` lowering: `return` and other 
control-flow statements in the handlers could get lowered in some 
way. Allowing that for arbitrary lambdas would be powerful and 
would essentially allow programmers to implement custom 
control-flow statements.
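
For reference, this is the existing `foreach` → `opApply` behavior that the comparison relies on, shown as a small sketch. The `Countdown` type and `firstMultipleOf` function are made up for the example; the relevant part is that a `return` written inside the loop body leaves the enclosing function even though the body runs as a delegate.

```d
import std.stdio : writeln;

// Hypothetical range-like type that counts from `start` down to 1.
struct Countdown
{
    int start;

    // `foreach (i; countdown)` is lowered to a call of opApply,
    // with the loop body wrapped in the delegate `dg`.
    int opApply(scope int delegate(int) dg)
    {
        for (int i = start; i > 0; --i)
        {
            // A non-zero result means the body left the loop early
            // (break, return, goto); it must be passed on unchanged.
            if (int result = dg(i))
                return result;
        }
        return 0;
    }
}

// The `return i;` in the loop body returns from firstMultipleOf,
// not merely from the delegate the compiler creates for the body.
int firstMultipleOf(int n, Countdown range)
{
    foreach (i; range)
    {
        if (i % n == 0)
            return i;
    }
    return -1;
}

void main()
{
    writeln(firstMultipleOf(4, Countdown(10)));  // 8
    writeln(firstMultipleOf(11, Countdown(10))); // -1
}
```

A library-level `match!(…)` taking ordinary lambdas has no such lowering: a `return` in a handler only returns from that handler.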