Which language features make D overcomplicated?

Timon Gehr timon.gehr at gmx.ch
Sat Feb 10 12:35:39 UTC 2018


On 10.02.2018 03:12, Nick Sabalausky wrote:
> On Saturday, 10 February 2018 at 01:24:55 UTC, Timon Gehr wrote:
>> The fundamental issue is that D's type system has no parametric 
>> polymorphism,
> 
> Pardon my ignorance, but isn't that what D's templated functions do? 
> This sounds interesting but unclear exactly what you mean here and how 
> it relates to inout and its problems.

TL;DR: Parametrically polymorphic functions have /runtime/ type 
parameters. inout can be interpreted as a dependent function of type 
"{type y | y.among(x, const(x), immutable(x)) } delegate(type x)" and an 
inout function can be thought of as a function that takes inout as an 
argument and produces the function as the return value. This formulation 
is more powerful than what the inout syntax can capture, and this is 
what causes problems with type safety. In particular, 'inout' does not 
support proper lexical scoping.

TS;NM: https://gist.github.com/tgehr/769ac267d76b74109a195334ddae01c3 
(Some version of this was originally intended to go to the D blog, but I 
wanted to wait until inout has an obviously type safe definition. It 
also highlights inout issues other than type unsafety and shows how 
all of them might be fixed in principle by adding polymorphism.)

---

I'll first explain parametric polymorphism, and then what the inout 
problem is.

The template

int foo(int x)(int y){
     return x + y;
}

relates to the function

int delegate(int) foo(int x){
     return (int y) => x + y;
}

just like the template

T foo(T)(T x){
     return x;
}

relates to the polymorphic function (fictional syntax):

T delegate(T) foo(type T){ // type is the type of types
     return (T x) => x;
}

Here, 'foo' takes a type T and produces an identity function for type T:

int delegate(int) idInt = foo(int);
int x = foo(int)(2);
writeln(foo(int)(2)); // 2

string delegate(string) idString = foo(string);
string y = foo(string)("b");
writeln(foo(string)("b")); // b
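
The value-level half of this analogy can be checked in today's D. The following is only a minimal sketch; the names addT and addR are made up for this illustration and do not come from any library. It shows that a template value parameter must be known at compile time, while an ordinary function parameter can be a runtime value:

```d
import std.stdio;

// Compile-time parameter x: every distinct x yields a separate instantiation.
int addT(int x)(int y) { return x + y; }

// Runtime parameter x: a single function that returns a closure capturing x.
int delegate(int) addR(int x) { return (int y) => x + y; }

void main()
{
    writeln(addT!1(2)); // 3, x is fixed at compile time

    int n = 1;          // a value that is only available at run time
    writeln(addR(n)(2)); // 3, x is passed at run time

    // A runtime variable cannot be used as a template value argument.
    static assert(!__traits(compiles, addT!n(2)));
}
```

The fictional 'foo(type T)' above would be to 'T foo(T)(T x)' what addR is to addT here.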


Alternatively, one might write

T id[T](T x){
     return x;
}

and then rely on implicit instantiation of the 'T' parameter:

int delegate(int) idInt = id![int]; // or something like this
int x = id(2);
writeln(id(2)); // 2

string delegate(string) idString = id![string];
string y = id("b");
writeln(id("b")); // b

As you will have noticed, all of this works just fine with templates, so 
what is the big difference?

For a polymorphic function, the type is a /runtime/ parameter. (Some 
languages, however, enforce that polymorphic functions don't depend on 
the type T at runtime: just don't give any runtime methods or fields to 
the 'type' type.)

The most obvious benefit of parametric polymorphism is that 
parametrically polymorphic functions exist at runtime (while templates 
only exist at compile time). For example, one can define a 
parametrically polymorphic delegate:

T delegate[T](T) dg = [T](T x) => x;

Or a parametrically polymorphic virtual function:

class Base{
     bool pickFirst;
     abstract T pickOne[T](T a, T b);
}

class Honest: Base{
     override T pickOne[T](T a, T b){
         return pickFirst ? a : b;
     }
}

class Dishonest: Base{
     override T pickOne[T](T a, T b){
         return pickFirst ? b : a;
     }
}

class Confused: Base{
     override T pickOne[T](T a, T b){
         return uniform(0,2) ? a : b;
     }
}

So as expected, the difference is that for parametrically polymorphic 
functions, the type T /does not need to be known at compile time/.
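
For contrast, here is a small sketch of what current D templates allow (real D this time, not the fictional syntax; the variable names are chosen only for illustration). A runtime function pointer can refer to a single instantiation, so T has to be fixed at the point where the pointer is created:

```d
import std.stdio;

// An ordinary D template: T is a compile-time parameter.
T id(T)(T x) { return x; }

void main()
{
    // Each instantiation is a separate function with its own address.
    int function(int) idInt = &id!int;
    string function(string) idString = &id!string;

    writeln(idInt(2));      // 2
    writeln(idString("b")); // b

    // Neither pointer can be called at any other type: the choice of T
    // was made at compile time, when the pointer was taken.
    static assert(!__traits(compiles, idInt("b")));
}
```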


Now, what is 'inout'? If it was a first-class entity, it might have the 
type:

alias Inout = { type y | y.among(x, const(x), immutable(x)) } delegate(type x);

I.e. it takes a type x and produces a type y such that y is either x, 
const(x) or immutable(x).

(This is not necessarily the most restricted possible type, depending on 
the details of the polymorphic type system. inout additionally enforces 
that the qualifier applied is the same for all argument types x.)

Consider the following 'inout' function:

inout(int*) id(inout(int*) x){
     return x;
}

This might be expressed as:

inout(int*) id[Inout inout](inout(int*) x){
     return x;
}

I.e., we can make 'inout' an explicit polymorphic parameter.
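
To see the implicit parameter at work in current D, here is a minimal sketch. It assumes the variant 'inout(int)*' instead of 'inout(int*)' so that the resulting types are easy to compare. The compiler picks the qualifier from the argument type at each call site:

```d
// inout stands for whichever qualifier the argument has at the call site.
inout(int)* id(inout(int)* x) { return x; }

void main()
{
    int a = 1;
    immutable int b = 2;
    const(int)* c = &a;

    // The result type follows the argument type.
    static assert(is(typeof(id(&a)) == int*));
    static assert(is(typeof(id(&b)) == immutable(int)*));
    static assert(is(typeof(id(c)) == const(int)*));

    int* pa = id(&a);
    immutable(int)* pb = id(&b);
    assert(pa is &a && pb is &b);
}
```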


What is the problem with 'inout'? Let's look at the first counterexample 
to type safety:

@safe:
int a;
immutable(int) b=2;

inout(int)* delegate(inout(int)*) dg;
inout(int)* prepare(inout(int)* x){
     dg = y=>x;
     return x;
}
void main(){
     prepare(&b);
     int* y=dg(&a);
     assert(&b is y); // passes. ouch.
     *y=3;
     assert(b is *&b); // fails!
}

We will express this using explicit polymorphism and see where the type 
error occurs. (For readability, I have named the two versions of 'inout' 
differently, but this is not strictly necessary. The compiler knows that 
they are different, because they are associated with different 
declarations in the AST.)

inout1(int)* delegate[Inout inout1](inout1(int)*) dg;

inout2(int)* prepare[Inout inout2](inout2(int)* x){
     dg = [Inout inout1](inout1(int)* y)=>x;
     return x;
}

Here, the error would be:

Error: cannot implicitly convert '[Inout inout1](inout1(int)* y)=>x;' of 
type 'inout2(int)* delegate[Inout inout1](inout1(int)* y)' to 
'inout1(int)* delegate[Inout inout1](inout1(int)*)'.

Note how the type error crucially depends on the fact that the type of 
the delegate contains _two incompatible functions_ of type Inout.

You can't have that if the only function of type Inout is the built-in 
inout!


Now let's look at the second counterexample to type safety:

@safe:
int a;
immutable(int) b=2;

inout(int)* delegate(inout(int)*)@safe delegate()@safe foo(inout(int)* y){
     inout(int)* bar(inout(int)* p){
         return y;
     }
     return ()=>&bar;
}
void main(){
     int* y=foo(&b)()(&a);
     *y=3;
     assert(&b is y); // passes. ouch.
     assert(b is *&b); // fails!
}

The problem is very similar. Can you spot it?

In summary, the issue is that there is only one 'inout' and therefore it 
is not properly lexically scoped. It is a bit like having a language 
where all variables are implicit function parameters and they all have 
the same, global, name. This sort of works fine until you want a 
function with two parameters or until you want to nest functions in a 
non-trivial way.
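
The effect of giving each qualifier its own name can be imitated with ordinary templates. This is only a compile-time sketch, not the runtime polymorphism described above, and the helper 'capture' with its parameters Q and P is hypothetical. Because the captured pointer and the delegate parameter are typed independently, the result keeps the qualifier of the captured pointer:

```d
// Q is the pointee type of the captured pointer, P that of the delegate
// parameter; both are hypothetical names for this sketch.
Q* delegate(P*) capture(Q, P)(Q* y)
{
    return (P* p) => y; // returns the captured pointer and ignores p
}

void main()
{
    int a;
    immutable int b = 2;

    auto dg = capture!(immutable(int), int)(&b);

    // The result is typed after the captured pointer, not the argument.
    static assert(is(typeof(dg(&a)) == immutable(int)*));
    // So the unsound assignment from the counterexamples is rejected.
    static assert(!__traits(compiles, { int* y = dg(&a); }));

    assert(dg(&a) is &b);
}
```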

Also see: https://gist.github.com/tgehr/769ac267d76b74109a195334ddae01c3



