Consistency, Templates, Constructors, and D3

Thu Aug 23 22:14:34 PDT 2012

DISCLAIMER: This isn't a feature request or anything like that. 
It's ONLY intended to stir _constructive_ conversation and 
criticism of D's existing features, and how to improve them _in 
the future_ (note the 'D3' in the title).

I've had a couple of ideas recently about the importance of 
consistency in a language design, and how a few languages I 
highly respect (D, C#, and Nimrod) approach these issues. This 
post is mostly me wanting to reach out to a community that enjoys 
discussing such issues, in an effort to correct any 
misconceptions I might hold, and to spread potentially good 
ideas to the community in hopes that my favorite language will 
benefit from our discussion.

---------------------------

First, let me assert that "Consistency" in a language is 
critically important for a few reasons:

1. One way of doing things means one way to _remember_ things. It 
keeps us sane, focused, and productive. The more we have to fight 
the language, the harder it is to master.

2. Fewer things to remember means it's easier to learn. First 
impressions are key for popularity.

3. Fewer discrepancies means fewer human errors, and thus, fewer 
"stupid" bugs.

######### CAST/TRAITS ##########

To start, let's look at: cast(T) vs to!T(t)

In D, we have one way to call a template function, and then we 
have special keyword syntax which doesn't follow the same syntactic 
rules. Here, cast looks like the 'scope()' or 'debug' statement, 
which should be followed by a body of code, but it works like a 
function which takes in the following argument and returns the 
result. Setting aside the "func!()()" syntax for a moment, what 
cast should look like in D is:

     int i = cast!int(myLong);

It's a similar story with __traits(). What appears to be a 
function taking a run-time parameter actually takes a compile-time 
parameter and works by "magic". It should look like:

     bool b = traits!HasMember(Foo);
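
For comparison, here is roughly what the three forms look like in D as it stands today. This is only a sketch: the helper names (viaCast, viaTo) and the struct Foo with its member bar are made up for illustration.

```d
import std.conv : to, ConvOverflowException;
import std.exception : assertThrown;

struct Foo { int bar; }

// keyword form: unchecked, silently truncates on overflow
int viaCast(long v) { return cast(int) v; }

// library template: checked, throws ConvOverflowException on overflow
int viaTo(long v) { return to!int(v); }

// __traits takes the trait name first, then its compile-time arguments
enum fooHasBar = __traits(hasMember, Foo, "bar");

void main()
{
    assert(viaCast(42L) == 42);
    assert(viaTo(42L) == 42);
    assert(to!int("42") == 42);   // the same template also parses strings
    assertThrown!ConvOverflowException(viaTo(long.max));
    static assert(fooHasBar);
}
```

Note that cast and to!T are not interchangeable today: one is unchecked, the other validates the value, so any unified syntax would have to keep that distinction visible somehow.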

######### FUNCTION PARAMETERS ##########

All that brings me to my next argument: the "func!()()" syntax 
is inconsistent, or at the very least, harder to understand than 
it has to be. We have one way of defining "optional" runtime 
parameters, and an entirely different set of rules for 
compile-time parameters. Granted, these things are very different 
to the compiler. To the programmer, however, they "appear" to 
just be things we're passing to a function.

I think Nimrod has a better (but not perfect) approach to this, 
in that there are different "kinds" of functions. One that takes 
in runtime params, and one that takes in compile-time ones; but 
at the call site, you use them the same:

     # Nimrod code

     template foo(x:int) = # compile time
       when x == 0:
         doSomething()
       else:
         doSomethingElse()

     proc bar(x:int) = # run time
       if x == 0:
         doSomething()
       else:
         doSomethingElse()

     block main:
       foo(0) # both have identical..
       bar(0) # ..call signatures.

In D, that looks like:

     void foo(int x)() {
       static if (x == 0) { doSomething(); }
       else { doSomethingElse(); }
     }

     void bar(int x) {
       if (x == 0) { doSomething(); }
       else { doSomethingElse(); }
     }

     void main() {
       foo!0();
       bar(0); // completely different call syntax
     }

Ultimately foo is just more optimized in the case where an 'int' 
can be passed at compile time, but the way you use it in Nimrod 
is much more consistent than in D. In fact, Nimrod code is very 
clean because there are no special syntax oddities, and that makes 
it easy to follow (at least on that level), especially for people 
learning the language.
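
To make the difference concrete, here is a small compilable sketch in current D. The functions return strings instead of calling doSomething()/doSomethingElse(), purely so the result can be checked; that substitution is my assumption, not part of the example above.

```d
// x is a compile-time value parameter
string foo(int x)()
{
    static if (x == 0) return "zero";
    else return "nonzero";
}

// x is a run-time parameter
string bar(int x)
{
    if (x == 0) return "zero";
    else return "nonzero";
}

void main()
{
    enum known = 0;     // compile-time constant
    int runtime = 0;    // ordinary variable

    assert(foo!known() == "zero");
    assert(foo!1() == "nonzero");
    assert(bar(runtime) == "zero");

    // a run-time variable cannot be used as a template argument
    static assert(!__traits(compiles, foo!runtime()));
}
```

The last line is the catch any unified call syntax has to deal with: the compile-time version only accepts arguments whose value is known during compilation.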

But I think there's a much better way. One of the things people 
like about dynamically typed languages is that you can hack things 
together quickly. Given:

     function load(filename) { ... }

the name of the parameter is all that's required when throwing 
something together. You know what 'filename' is and how to use 
it. The biggest problem (beyond efficiency) comes later, when 
you're tightening things up and have to make sure that 'filename' 
has a valid type. We end up doing that work manually, where in a 
statically typed language we can just declare a type:

     function load(filename)
     {
       if (filename != String) {
         error("Must be string");
         return;
       }
       ...
     }

vs:

     void load(string filename) { ... }

but, of course, sometimes we want to take in a generic parameter, 
as D programmers are fully aware. In D, we have that option:

     void load(T)(T file)
     {
       static if (is(T : string))
         ...
       else static if (is(T : File))
         ...
     }

but it's wonky. Two parameter sets? Type deduction? These 
concepts aren't the easiest to pick up, and I remember having 
some difficulty when first learning what "func!(...)(...)" did 
in D.
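
For reference, a complete version of that function in current D might look like the sketch below. The returned strings and the static assert fallback are my own additions for illustration.

```d
import std.stdio : File;

string load(T)(T file)
{
    static if (is(T : string))
        return "path";      // caller passed a file name
    else static if (is(T : File))
        return "handle";    // caller passed an open file
    else
        static assert(false, "load: unsupported type " ~ T.stringof);
}

void main()
{
    assert(load("model.dat") == "path");   // T deduced as string
    assert(load(File.init) == "handle");   // T deduced as File
    static assert(!__traits(compiles, load(42)));
}
```

Note that each branch needs its own 'static if'; a plain 'else if' would be an ordinary run-time test.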

So why not have one set of parameters and allow "typeless" ones 
which are simply compile-time duck-typed?

     void load(file)
     {
       static if (is(typeof(file) : string))
         ...
       else static if (is(typeof(file) : File))
         ...
     }

this way, we have one set of rules for calling functions, and 
deducing/defaulting parameters, with the same power. Plus, we get 
the convenience of just hacking things together and going back 
later to tighten things up. We can have similar (to existing) 
rules for specialization and defaults:

     void foo(int x) // runtime
     void foo(x:int) // compiletime that must be 'int'
     void foo(x=int) // compiletime, defaults to 'int'
     void foo(x:int|string) // can be either int or string
     void foo(x=int|string) // defaults to int; can be string

as well as for deduction:

     void foo(int x, T y, T)            { ... }
     void bar(int x, T y, T = float)    { ... }
     void baz(int x, T y, T : int|long) { ... }

     void main()
     {
       foo(0, "bar");      // T is string
       foo(0, 1.0);        // T is double
       foo(0, 1.0, float); // T is float
       bar(0, 1.0);        // T is float (?)
       baz(0, 1.0);        // error: needs int, or long
     }
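
As a point of comparison, this is how I'd write those three functions in current D. It's a sketch: the function bodies simply return the argument so the deduced type can be checked.

```d
// T is deduced from the second argument
T foo(T)(int x, T y) { return y; }

// the default is only used when T is neither given nor deducible
T bar(T = float)(int x, T y) { return y; }

// a template constraint restricts T to int or long
T baz(T)(int x, T y) if (is(T == int) || is(T == long)) { return y; }

void main()
{
    static assert(is(typeof(foo(0, "bar")) == string));
    static assert(is(typeof(foo(0, 1.0)) == double));
    static assert(is(typeof(foo!float(0, 1.0)) == float));

    // deduction wins over the default, so this is double, not float
    static assert(is(typeof(bar(0, 1.0)) == double));

    static assert(!__traits(compiles, baz(0, 1.0)));
    assert(baz(0, 1L) == 1L);
}
```

That answers the "(?)" above for today's rules: a default type does not override what the argument deduces. Whether the proposed syntax should behave the same way is an open question.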

Let's revisit the cast()/__traits() issue. Given our new function 
call syntax, they would look like:

     cast(Type, value);
     traits(CommandEnum, values...);

Now, I'm sure everyone is saying "What about Type template 
parameters? How do we separate them from constructor parameters?" 
Please keep reading.

######### CONSTRUCTORS ##########

We're all aware that overriding new/delete in D is a deprecated 
feature. I agree with this, however I think we should go a step 
further and remove the new/delete syntax altogether... :D crazy 
I know, but hear me out.

We replace it with special factory functions. Example:

     class Person {
       string name;
       uint age;

       this new(string n, uint a) {
         name = n;
         age = a;
       }
     }

     void main() {
       auto philip = Person.new("Philip", 24);
     }

Notice that 'new()' is declared with return type 'this'. That 
makes it static, makes it implicitly call the allocation methods 
(which could be overridden), and gives its body a 'this' reference.

This way, creating objects is consistent across struct/class and 
is always done through a named function. So things like 
converting can become arbitrarily consistent through a naming 
convention:

for example, we could use 'from()' in place of to()/cast():

     auto i = int.from(inputString);
     auto i = int.from(myLong);

and we don't have to fight for overloads, or have to remember 
special factories (for things like FreeLists) when we usually 
think to use the 'new T()' syntax:

     // Naming clarifies intention
     auto model = Model.new(...);
     auto model = Model.load("/path/to/file");
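
Named factories can already be approximated in current D with static member functions. Here's a minimal sketch; 'create' is a made-up name, since 'new' is a keyword and can't be used as a member name today.

```d
class Person
{
    string name;
    uint age;

    // private constructor: callers in other modules must use the factory
    private this(string n, uint a)
    {
        name = n;
        age = a;
    }

    // named factory function
    static Person create(string n, uint a)
    {
        return new Person(n, a);
    }
}

void main()
{
    auto philip = Person.create("Philip", 24);
    assert(philip.name == "Philip" && philip.age == 24);
}
```

The difference from the proposal is that this is only a convention: 'new Person(...)' still exists inside the module, and nothing ties allocation to the factory.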

Destructors would be named as well, so we could force-delete 
objects:

     struct Foo {
       this new() { ... }
       ~this delete() { ... }
     }

     void somefunc() {
         auto foo = Foo.new();
         scope(exit) foo.delete(); // forced
     }
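
The closest thing in current D is calling destroy() from a scope guard. A small sketch, using a class and a static counter ('live', my own addition) just to make the effect observable:

```d
class Foo
{
    static int live;        // number of constructed, not yet destroyed objects

    this()  { ++live; }
    ~this() { --live; }
}

void somefunc()
{
    auto foo = new Foo;
    scope(exit) destroy(foo);   // runs the destructor when the scope ends
    assert(Foo.live == 1);
}

void main()
{
    somefunc();
    assert(Foo.live == 0);      // destructor already ran, no GC cycle needed
}
```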

This would also keep the syntax consistent when using 
FreeLists/MemoryPools, because everything is done through 
factories in this case, and the implementation can be arbitrary:

     class Foo {
       private static Foo _head; // free list shared by all instances
       private Foo _next;

       this new() @noalloc {
         if (_head) {
           Foo result = _head;
           _head = _head._next;
           return result;
         }
         else {
           import core.memory;
           return cast(Foo, GC.malloc(this.sizeof));
         }
       }

       ~this delete() {
         ...
       }
     }
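
For reference, a free list along these lines can be written in current D with static functions. This is a sketch under my own naming (Node, acquire, release), not a claim about how the proposed syntax would lower.

```d
class Node
{
    private static Node head;   // free list head (thread-local by default)
    private Node next;

    private this() {}

    // reuse a released object if one is available, else allocate
    static Node acquire()
    {
        if (head !is null)
        {
            auto result = head;
            head = head.next;
            result.next = null;
            return result;
        }
        return new Node;
    }

    // put the object back on the free list instead of freeing it
    static void release(Node n)
    {
        n.next = head;
        head = n;
    }
}

void main()
{
    auto a = Node.acquire();
    Node.release(a);
    auto b = Node.acquire();
    assert(a is b);             // the released object was reused
}
```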

More importantly, it keeps Type template parameters from 
conflicting with constructor params:

     struct Foo(T) {
         this new(T t) { ... }
     }

     alias Foo(float) Foof;

     void main() {
         auto foo = Foo(int).new(123);
         auto foof = Foof.new(1.0f);
     }
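
For comparison, the same thing in current D, where the '!' is what separates the type parameters from the constructor arguments. The 'value' field is added here only so there is something to check.

```d
struct Foo(T)
{
    T value;
    this(T t) { value = t; }
}

alias Foof = Foo!float;

void main()
{
    auto foo = Foo!int(123);    // template argument, then constructor argument
    auto foof = Foof(1.0f);

    static assert(is(typeof(foo.value) == int));
    assert(foo.value == 123);
    assert(foof.value == 1.0f);
}
```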

With these changes, the language is more consistent and there are 
no special syntax characters or hard-to-understand rules (IMO).

Thanks for your time. Please let me know if you have any thoughts 
or opinions on my ideas. That is, after all, why I'm posting them. :)