A proposal: Sumtypes
Richard Andrew Cattermole (Rikki)
richard at cattermole.co.nz
Thu Feb 8 15:42:25 UTC 2024
Yesterday I mentioned that I wasn't very happy with Walter's
design of sum types, at least as per his write-up in his DIP
repository.
After two years, I have finally written up an alternative that
should cover everything you would expect from such a language
feature.
There are also a couple of key differences with regards to the
tag and ABI that will make value type exceptions aka zero cost
exceptions work fairly fast.
A summary of features:
- Supports both a shorthand declaration syntax similar to the ML
family and the enum-like syntax proposed by Walter, both with UDAs.
- The member of operator refers to the tag name
- Proposed match parameters for both name and type (although
matching itself is not proposed)
- Copy constructors and destructor support
- Flexible ABI, if you don't use it, you won't pay for it (i.e.
no storage for a value or function pointers for copy
constructor/destructor)
- Default initialization using first entry or preferred ``:none``
- Implicit construction based upon value and using assignment
expression to prefer existing tag
- Does not have the null type state
- Comparison based upon tag, and only then value
- Introspection (traits and properties)
- Set operations (merging, checking if type/name is in the set)
- No non-introspection method to access a sum type value is
currently specified; a follow-up matching proposal would offer it
instead.
It can be done using the trait ``getMember``, although it will
be up to you to validate if that is the correct entry given the
tag for a value.
Latest version:
https://gist.github.com/rikkimax/d25c6b2bed8caba008a6967e9e0a7e7c
Walter's DIP:
https://github.com/WalterBright/DIPs/blob/sumtypes/DIPs/1NNN-(wgb).md
Example nullable:
```d
sumtype Nullable(T) {
:none,
T value
}
sumtype Nullable(T) = :none | T value;
void accept(Nullable!Duration timeout) {}
accept(1.minute);
accept(:value = 1.minute);
accept(:none);
```
The following is a copy of the proposed member of operator and
then the sumtype for posterity's sake.
------------------------
PR: https://github.com/dlang/dmd/pull/16161
# Member Of Operator
The member of operator is an operator that operates on a
contextual type with respect to a given statement or declaration.
It may appear as the first term in an expression, and may then be
followed by binary and dot expressions.
The syntax of the operator is ``':' Identifier``.
## Context
The context is a type that is provided by the statement or
relevant declaration.
## Validation
The type that the member of operator results in is the same as
the one it is in context of.
If it does not match, it is an error.
## Valid Statements and Declarations
- Return expressions
The compiler rewrites ``return :Identifier;`` as ``return
typeof(return).Identifier;``.
- Variable declarations
A type qualifier alone may not appear as the variable type; there
must be a concrete type.
The variable's type can be thought of as having been aliased, with
that alias serving both as the variable type and as the context.
``Type var = :Identifier;`` would internally be rewritten as
``__Alias var = __Alias.Identifier;``.
- Switch statements
The type of the expression used by the switch statement will need
to be aliased, as per variable declarations.
So
```d
switch(expr) {
case :Identifier:
break;
}
```
would be rewritten as
```d
alias __Alias = typeof(expr);
switch(expr) {
case __Alias.Identifier:
break;
}
```
- Function calls
During parameter to argument matching, the compiler checks whether
``typeof(param).Identifier`` is possible for
``func(:Identifier)``.
- Function parameter default initialization
It must support the default initialization of a parameter.
``void func(Enum e = :Start)``.
- Comparison
The left hand side of a comparison is used as the context for
the right hand side, e.g. ``e == :Start``.
This may require an intermediary variable, so that the type of the
left hand side is known prior to the comparison.
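The rewrites listed above can be written out by hand in today's D, which shows what each ``:Identifier`` form is meant to resolve to. This is only a sketch: ``Stage``, ``initial``, ``isStart``, ``describe`` and ``Ctx`` are made-up names (``Ctx`` stands in for the internal ``__Alias``), and the ``:Identifier`` forms in the comments are the proposed syntax, which does not compile today.

```d
enum Stage { Start, Running, Done }

// Proposed: `return :Start;`
Stage initial() {
    return typeof(return).Start;
}

// Proposed: `bool isStart(Stage e = :Start)` and `e == :Start`
bool isStart(Stage e = Stage.Start) {
    // The left hand side provides the context for the right hand side.
    return e == typeof(e).Start;
}

// Proposed: `case :Start:` and so on
string describe(Stage s) {
    alias Ctx = typeof(s); // hand-written equivalent of `__Alias`
    switch (s) {
        case Ctx.Start:   return "start";
        case Ctx.Running: return "running";
        default:          return "done";
    }
}
```

In every case the context type is already known to the compiler, so the operator only saves spelling it out.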
------------------------
Depends upon: [member of
operator](https://gist.github.com/rikkimax/9e02ad538d94615d76d869070f7fd65f)
# SumTypes
Sum types are a union of types, as well as a union of names.
Some names will be applied to a type, others may not be.
It acts as a tagged union, using a tag to determine which type or
name is currently active.
The matching capabilities are not specified here.
It is influenced by Walter Bright's DIP, although it is not a
continuation of it.
## Syntax
Two new declaration syntaxes are proposed.
The first comes from Walter Bright's proposal:
```d
sumtype Identifier (TemplateParameters) {
@UDAs|opt Type Identifier = Expression,
@UDAs|opt Type Identifier,
@UDAs|opt MemberOfOperator,
}
```
TODO: swap for spec grammar version
The second is short hand which comes from the ML family:
```d
sumtype Identifier (TemplateParameters) = @UDAs|opt Type
Identifier|opt | @UDAs|opt MemberOfOperator;
```
TODO: swap for spec grammar version
For a nullable type, the two syntaxes would look like this:
```d
sumtype Nullable(T) {
:none,
T value
}
sumtype Nullable(T) = :none | T value;
```
## Member Of
A sumtype is a kind of tagged union.
It uses a tag to differentiate between each member.
The tag is a hash of both the fully qualified name of the type
and the name.
The tag should be stored in a CPU word size register, so that if
only names and no types are provided, there will be no storage.
When the member of operator is applied to a sumtype, it locates
the entry whose name matches the given identifier in the list of names.
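To make the tag description concrete, here is a minimal sketch of deriving a word-sized tag from the fully qualified type name plus the entry name. The hash function is not specified above; FNV-1a is used here purely as an assumption, and ``fnv1a``, ``entryTag`` and ``nameOnlyTag`` are hypothetical helper names, not part of the proposal.

```d
import std.traits : fullyQualifiedName;

// 64-bit FNV-1a over a string, continuing from an existing hash state.
ulong fnv1a(string s, ulong h = 0xcbf29ce484222325UL) pure nothrow @nogc @safe {
    foreach (char c; s) {
        h ^= c;
        h *= 0x100000001b3UL;
    }
    return h;
}

// Tag for an entry that has both a type and a name, e.g. `int i`.
size_t entryTag(T)(string name) {
    return cast(size_t) fnv1a(name, fnv1a(fullyQualifiedName!T));
}

// Tag for a name-only entry such as `:none`.
size_t nameOnlyTag(string name) {
    return cast(size_t) fnv1a(name);
}

// Usable at compile time, so the tag could be a constant in the ABI.
enum size_t tagOfIntI = entryTag!int("i");
```

Since the tag depends only on the type and the name, the same entry would get the same tag in every sumtype that contains it, which is presumably what makes subset assignment and merging cheap.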
## Proposed Match Parameters
There are two forms that need to be supported.
Both support a following name identifier that will be
used for the variable declaration in the given scope.
1. The first is the type
2. The second is the member of operator, to match the name
If conflicts are possible, it is recommended to always
declare entries with names and to always use the names when
matching.
```d
obj.match {
(:entry varName) => writeln(varName);
}
```
If you did not specify a type, you may not use the renamed
variable declaration for a given entry nor specify the entry by
the type.
It will of course be possible to specify an entry based upon the
member of operator.
```d
sumtype S = :none;
identity(:none);
S identity(S s) => s;
```
As a feature this is otherwise known as implicit construction and
applies to types in general in any location including function
arguments.
## Storage
A sumtype at runtime is represented by a flexible ABI.
1. The tag [``size_t``]
2. Copy constructor [``function``]
3. Destructor [``function``]
4. Storage [``void[X]``]
The tag always exists.
If none of the entries has a copy constructor (including
generated ones), the copy constructor field does not exist.
If none of the entries has a destructor (including generated ones),
the destructor field does not exist.
If none of the entries takes any storage (so all entries do not
have a type), the storage field does not exist.
Entries that need a copy constructor or destructor but do not
provide one will have a function generated, internal to the
object file, that performs the appropriate
action (and, should we get reference counting, performs that too).
For all intents and purposes a sum type is similar to a struct as
far as when to call the copy constructors and destructors.
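As a rough picture of the two extremes of that flexible layout, the following structs spell out the fields by hand. This is a sketch under assumptions: the struct names and the function pointer signatures are invented for illustration, and ``ubyte[X]`` stands in for the ``void[X]`` storage.

```d
// All four fields present: some entry has a type, a copy
// constructor and a destructor.
struct FullLayout(size_t X) {
    size_t tag;                                         // always present
    void function(void* dst, const(void)* src) copyCtor; // assumed signature
    void function(void* obj) dtor;                       // assumed signature
    ubyte[X] storage;                                    // stand-in for void[X]
}

// Name-only sumtype such as `:none | :some`: only the tag remains.
struct TagOnlyLayout {
    size_t tag;
}

// A name-only sumtype costs exactly one machine word.
static assert(TagOnlyLayout.sizeof == size_t.sizeof);
```

The point of the design is that each optional field disappears independently, so a sumtype of plain value types would carry only the tag and the storage.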
## Initialization
The default initialization of a sumtype will always prefer
``:none`` if present, otherwise it is the first entry.
The shorthand syntax does not support expressions for the default
initialization of the first entry, therefore it will be
the default initialized value of that type.
Assigning a value to a sum type will always prefer the currently
selected tag.
If however the value cannot be coerced into the tag's type, it
will then do a match to determine the best candidate based upon
the type of the expression.
An example of preferring the currently selected tag:
```d
sumtype S = int i | long l;
S s = :i = 2;
```
But if we switch to a larger value ``s = long.max;``, this will
assign the long instead.
## Nullability
A sum type cannot have the type state of null.
## Set Operations
A sumtype that is a subset of another is assignable to it.
```d
sumtype S1 = :none | int;
sumtype S2 = :none | int | float;
S1 s1;
S2 s2 = s1;
```
This covers other scenarios like returning from a function or an
argument to a function.
To remove a possible entry from a sumtype you must perform a match
(which is not being proposed here):
```d
sumtype S1 = :none | int;
sumtype S2 = :none | int | float;
S1 s1;
S2 s2 = s1;
s2.match {
(float) => assert(0);
(default val) => s1 = val;
}
```
To determine if a type or name is in the set:
```d
sumtype S1 = :none | int;
pragma(msg, int in S1); // true
pragma(msg, :none in S1); // true
pragma(msg, "none" in S1); // true
```
To merge two sumtypes together use the pipe operator on the type.
```d
sumtype S1 = :none | int i;
sumtype S2 = :none | long l;
alias S3 = S1 | S2; // :none | int i | long l
```
Or you can expand a sumtype directly into another:
```d
sumtype S1 = :none | int i;
sumtype S2 = :none | S1.expand | long l; // :none | int i | long l
```
When merging, duplicate types and names are not an error; they
will be combined.
However, it is an error if the same name has different types.
## Introspection
A sumtype includes all primary properties of types including
``sizeof``.
It has one new property, ``expand``, which is used to expand a
sumtype into the currently declaring one.
The trait ``allMembers`` will return a set of strings that denote
the names of each entry. If an entry has not been given a name by
the user, a generated name will be provided that will access it
instead.
Using the trait ``getMember`` or using ``SumType.Member`` will
return an alias to that entry so that you may acquire the type of
it, or assign to it.
For the trait ``identifier`` on an alias of a given entry, it
will return the name for that entry.
An is expression may be used to determine if a given type is a
sumtype: ``is(T == sumtype)``.
## Comparison
The comparison of two sum types is first done based upon tag. If
the tags are not equal, comparing them gives the less than or more
than result.
Should they be equal, a match will occur, with the comparison
behavior of the given entry type giving the final comparison value.
If a given entry does not have a type, then it will compare as
equal.
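The ordering rule can be sketched with a hand-rolled tagged struct in today's D. Everything here is an assumption made for illustration: ``IntOrNone`` models ``:none | int value``, and the tag values 0 and 1 are arbitrary rather than the hashed tags described earlier.

```d
struct IntOrNone {
    size_t tag; // 0 = :none, 1 = int value (hypothetical tag values)
    int value;  // only meaningful when tag == 1

    int opCmp(const IntOrNone rhs) const {
        // Step 1: different tags decide the ordering on their own.
        if (tag != rhs.tag)
            return tag < rhs.tag ? -1 : 1;
        // Step 2: same tag, but the entry has no type, so it is equal.
        if (tag == 0)
            return 0;
        // Step 3: same tag with a type, defer to the entry's comparison.
        return (value > rhs.value) - (value < rhs.value);
    }
}
```

Note that with hashed tags the order between different entries would be stable but not meaningful, so it is mainly useful for sorting and for use as a key.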