[Issue 8381] Uniform function call syntax (pseudo member) enhancement suggestions

Thu Jul 12 09:11:32 PDT 2012

http://d.puremagic.com/issues/show_bug.cgi?id=8381

David Piepgrass <qwertie256 at gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |qwertie256 at gmail.com

--- Comment #1 from David Piepgrass <qwertie256 at gmail.com> 2012-07-12 09:11:28 PDT ---
To some extent I like the goal of the proposal, but I don't like the
implementation:

- The rule is non-obvious. How could a new developer possibly guess that this
happens?
- It would get lowered to a call to a template function, but usually the
programmer wants to extend one specific type.

So I offer the following counterproposal:

1) Adding static member functions

Suppose that I would like to extend the set of static members in class Foo.
How about:

module A;
// assuming "static class" does not already have any meaning?
static class Foo {
    static void F();
    static int x;
}

This block defines members in a class namespace "Foo". If "regular class Foo"
is defined in module A then these static members go into the same class.
However, if the original Foo is defined in module B then "static class Foo"
goes into module A and is considered a separate class.

Now consider some code that tries to use F():

module C;
import A;
import B;
void code() { Foo.F(); }

Currently, the compiler complains that B.Foo and A.Foo conflict with each
other. I propose changing this rule when accessing static members. The compiler
does not need to declare a conflict as soon as it sees "Foo", instead it can
look for static members in ALL Foo classes, using the same anti-hijacking rules
that it uses for free-standing module functions.

The purpose of "static class" is not to facilitate this method lookup per se;
if A and B both contain non-static Foo classes, the compiler should still
search both classes for static members rather than giving an error immediately.
Rather, the main purpose of "static class" is to declare that the class has no
constructor (not merely a @disabled default constructor, but no constructor at
all). Therefore, "new Foo()" cannot mean "new A.Foo", leading the compiler to
the interpretation "new B.Foo". Also, multiple static classes can be defined
with the same name in the same module, and their contents are merged. 

IMO, when searching for static members, a class scope should pretty much behave
the same way as module scope. So the compiler should not report an ambiguous
call if the call has only one interpretation. In the above case, the compiler
should report an error if and only if a B.F() function exists.

One more thing, if module C declares a "class Foo" then it takes priority over
A.Foo and B.Foo. But if C declares "static class Foo" then C.Foo should allow
the same overloading behavior just described; thus "new Foo()" should still
mean "new B.Foo()" and "Foo.F()" should still mean "A.Foo.F()". I think an
equivalent way to say this is that, inside C, the lookup rules proceed as if
C.Foo were declared outside module C and imported into C.

2) Adding constructors aka "Constructors Considered Harmful"

Ahh, constructors, constructors. In my opinion the constructor design in most
languages including C++, C#, D and Java is flawed, because it exposes an
implementation detail that should not be exposed, namely, which class gets
allocated and when.

I offer you as "exhibit L" the Lazy<T> class in the .NET framework. This
class's main purpose is to compute a value the first time you access it. For
example:

int x;
Lazy<int> lazy = new Lazy<int>(() => 7 * x);
x = 3;
x = lazy.Value; // 21
x = lazy.Value; // still 21 (Value is initialized only once)

There is an annoying issue, though: Lazy<T> operates in several different modes, and
it also contains a member and extra code to aid debugging. So in addition to
holding the value itself and a reference to the initializer delegate, it's got
a couple of other member variables and the Value property has to check a couple
of things before it returns the value.

So what if, in the future, MS decided to optimize Lazy<T> for its default mode
of operation, and factor out other modes into derived class(es)? Well, they
can't do that. If MS wants to return a LazyThreadSafe<T> object (derived from
Lazy<T>) when the user requests the thread-safe mode, they can't do that
because all the clients are saying "new Lazy", which can only return the exact
class Lazy and nothing else.

MS could add a static function, Lazy.New(...) but it's too late now that the
interface is already defined with all those public constructors.
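
To make the point concrete, here is a minimal sketch in present-day D (not the
.NET API) of what a factory-first design would allow. The names Lazy,
LazyThreadSafe and create() are hypothetical, and the class is fixed to int to
keep the example short. Because clients never write "new Lazy" themselves, the
library stays free to choose which concrete class gets allocated:

```d
// Hypothetical sketch: a lazy int whose concrete class is chosen by a factory.
class Lazy
{
    private int delegate() compute;
    private int cached;
    private bool done;

    // Protected, so code outside this module must go through create().
    protected this(int delegate() compute) { this.compute = compute; }

    // The factory decides which class to allocate; callers only see Lazy.
    static Lazy create(int delegate() compute, bool threadSafe = false)
    {
        if (threadSafe)
            return new LazyThreadSafe(compute);
        return new Lazy(compute);
    }

    // Runs the initializer on first access, then returns the cached value.
    int value()
    {
        if (!done)
        {
            cached = compute();
            done = true;
        }
        return cached;
    }
}

// The thread-safe mode lives in a derived class instead of extra fields
// and checks in the base class.
class LazyThreadSafe : Lazy
{
    this(int delegate() compute) { super(compute); }

    override int value()
    {
        synchronized (this)
        {
            return super.value();
        }
    }
}
```

A client would write something like "auto lz = Lazy.create(() => 7 * x);" and
never learn whether it got a Lazy or a LazyThreadSafe.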

As "exhibit 2", a.k.a. "exhibit Foo", I recently wrote a library where I needed
to provide a constructor that does a bunch of initialization work (that may
fail) before it actually creates the object. By far the most natural
implementation was a static member function:

// Constructor with dependency injection
public Foo(arg1, arg2, arg3)
{
}
// Static helper method provides an easy way to create Foo
public static Foo LoadFrom(string filename, ...)
{
    ... // do some work
    arg1 = ...; arg2 = ...; arg3 = ...;
    return new Foo(arg1, arg2, arg3);
}

However, the client didn't like that, and insisted that LoadFrom() should be a
constructor "for consistency". I was able to rearrange things to make LoadFrom()
into a constructor, but my code was a little clunkier that way.

So what I'm saying is, when clients directly allocate memory themselves with
new(), it constrains the class implementation to work a certain way, and this
constraint is unnecessary.
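
As a rough illustration of the pattern in D as it exists today, the sketch
below does its fallible work in a static function and only allocates once the
arguments are known to be good. Everything here is an assumption made for the
example: the fields name and size, the function loadFrom(), and the
"name:size" input format, which stands in for reading a file:

```d
import std.array : split;
import std.conv : to;

class Foo
{
    string name;
    int size;

    // Plain constructor: just stores dependencies that are already valid.
    this(string name, int size)
    {
        this.name = name;
        this.size = size;
    }

    // Hypothetical factory: parses "name:size" and throws before any Foo
    // is allocated if the input is malformed.
    static Foo loadFrom(string spec)
    {
        auto parts = spec.split(":");
        if (parts.length != 2)
            throw new Exception("expected name:size, got " ~ spec);
        // to!int throws a ConvException if the size is not a number.
        return new Foo(parts[0], parts[1].to!int);
    }
}
```

Note that the call site has to say Foo.loadFrom(...) rather than new Foo(...),
which is exactly the inconsistency the client objected to.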

So I would like to propose a way to solve these three problems. I give you:
"static new" functions:

module P;
class Foo {
    // Normal constructor
    init(A arg1, B arg2, C arg3) { ... }

    static Foo new(string filename, ...)
    {
        ... // do some work
        arg1 = ...; arg2 = ...; arg3 = ...;
        return new Foo(arg1, arg2, arg3);
    }
}

A static method named "new" overloads with constructors and can be called by
the new operator, just like constructors: "new Foo(filename, ...)" calls the
static method. A "static new" function can return a derived class if it wants,
or it can throw an exception without actually allocating any memory for a new
object. 

Now, remember the original topic: people would like to add "constructors" to
existing classes. The two above proposals together enable this possibility:

module Q;
static class Foo {
    static P.Foo new(Bar b, Baz z) {
        ...
        return new P.Foo(...);
    }
}

Arguably a "static new" function should not be allowed to return an
object that is not derived from a class called "Foo", but note that this
function in class Q.Foo must be able to return a P.Foo even though it is not
related except by name. Also, arguably, "static new" functions should not
return null, but it may be impractical and overkill for the compiler to enforce
such a rule.

One final problem is that standard constructors cannot have names. You can
define static functions with names instead, but clients must use a different
syntax to call static methods compared to constructors, and this difference
feels a little odd, especially when a class provides BOTH static methods AND
constructors to construct objects.

To solve this problem, I would advocate a "uniform constructor call syntax" as
well, which is quite simply to allow users to flip around "new Foo" to
"Foo.new" if they so choose. Thus a class might define:

class Bar {
    init() { ... }
    static Bar newFromResourceId(int id) { ... }
    static Bar newFromFilename(string fn) { ... }
}

In this case the user can call Bar.new(), Bar.newFromResourceId(42) or
Bar.newFromFilename("..."). So this third proposal is not to allow named
constructors per se, but to allow a unification of syntax so that 

1. there need no longer be a big syntactic difference between "new
Bar(...)" and a "named" constructor "Bar.newFromFilename"
2. a client cannot distinguish at the call site whether he is calling a
constructor or a static method. In other words, without this third proposal,
the language would seem inconsistent because it is presumably possible to use
the syntax Foo.new(Bar, Baz) but Bar.new() would be illegal.
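
For comparison, this is roughly how the Bar example looks in D without any of
these proposals, as a sketch only. The source field and the private
constructor are assumptions added so the example has something to observe:

```d
import std.conv : to;

class Bar
{
    string source;

    // Ordinary unnamed constructor.
    this() { source = "default"; }

    // Private helper constructor used by the named factories below.
    private this(string source) { this.source = source; }

    // "Named constructors" written as ordinary static methods.
    static Bar newFromResourceId(int id)
    {
        return new Bar("resource:" ~ id.to!string);
    }

    static Bar newFromFilename(string fn)
    {
        return new Bar("file:" ~ fn);
    }
}
```

Here the default case must be spelled "new Bar()" while the others are spelled
"Bar.newFromResourceId(42)" and "Bar.newFromFilename("...")"; the third
proposal would let the first one be written Bar.new() as well.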
