Sum Types - first draft

Tue Sep 10 19:56:28 UTC 2024

On Tuesday, 10 September 2024 at 19:13:33 UTC, Walter Bright 
wrote:
> On 9/10/2024 10:23 AM, Paul Backus wrote:
>>> I was approaching it from the other way around. Isn't a 
>>> non-nullable pointer a sumtype? Why have both non-nullable 
>>> types and sumtypes?
>> 
>> You have it exactly backwards. A _nullable_ pointer type is 
>> the sum of a non-nullable pointer type and typeof(null).
>
> "the sum of ..." doesn't make it a sumtype?
>
>> A non-nullable pointer type is a pointer type with its range 
>> of valid values restricted. You could think of it as a 
>> "difference type"--if you take T*, and _subtract_ typeof(null) 
>> from it (i.e., take the set difference [1] of their values), 
>> you get a non-nullable pointer type.
>> 
>> [1] 
>> https://en.wikipedia.org/wiki/Complement_(set_theory)#Relative_complement
>
> How is that different from a sumtype?

Let's walk through this very slowly, one step at a time.

One way to define a type is as a set of possible values.

For example:

     bool = {false, true}
     ubyte = {0, 1, 2, ..., 255}

Suppose we define a sum type of these two types:

     Example = bool + ubyte
     // Equivalent to:
     // alias Example = SumType!(bool, ubyte);

What is the set of possible values for the `Example` type? It's 
the set that contains all the elements of the `bool` set, and all 
the elements of the `ubyte` set--in other words, the set union. 
[1]

     Example = {false, true} ∪ {0, ..., 255} = {false, true, 0, 
..., 255}

Now, what are the possible values of a normal, nullable pointer 
type, like the ones we have in D today? Let's say we're on a 
32-bit architecture, just to make the numbers easier to write. In 
that case:

     uybte* = {null, 0x00000001, 0x00000002, ..., 0xFFFFFFFF}

Each 32-bit integer corresponds to a distinct pointer value, with 
0 corresponding to `null`.

Now, let's say we add non-nullable pointers to D, so that for 
every pointer type `T*`, there's a new type `nonnull(T*)` which 
is just like `T*`, except that it can't be null. What's the set 
of values for `nonnull(ubyte*)`?

     nonnull(ubyte*) = {0x00000001, 0x00000002, ..., 0xFFFFFFFF}

Naturally, it's the set of values for `ubyte*` with the value 
`null` removed. In set theory terms, we could write it as the 
difference between the `ubyte*` set, and the set that contains 
the single value `null`:

     nonnull(ubyte*) = ubyte* - {null}

As it turns out, there is actually a type in D that corresponds 
to the set `{null}`--it's called `typeof(null)`:

     typeof(null) = {null}

So, by substitution, we can write:

     nonnull(ubyte*) = ubyte* - typeof(null)

Or, in English: the type `nonnull(ubyte*)` is the _difference_ 
between the types `ubyte*` and `typeof(null)`.

Finally, what happens if we create a sum type of 
`nonnull(ubyte*)` and `typeof(null)`?

     Sum = nonnull(ubyte*) + typeof(null)
     Sum = {0x00000001, ..., 0xFFFFFFFF} ∪ {null}
     Sum = {null, 0x00000001, ..., 0xFFFFFFFF}

Wait a minute...we've seen that set before! It's the set for 
`ubyte*`!

So, once again, by substitution, we can write:

     Sum = ubyte*
     ubyte* = nonnull(ubyte*) + typeof(null)

Or, in English: the type `ubyte*` is the _sum_ of the types 
`nonnull(ubyte*)` and `typeof(null)`.

[1] https://en.wikipedia.org/wiki/Union_(set_theory)