Symbol lookup rules and imports

Tue Dec 2 15:55:34 PST 2014

On Tue, Dec 02, 2014 at 11:02:18PM +0000, Meta via Digitalmars-d wrote:
> This whole thing is a huge hole in D that needs to be fixed (it may
> even be necessary to consider it higher priority than the current C++
> and GC).

Well, that's up to Walter & Andrei to decide. :-P

> As it works currently, I'd go as far as to say that almost every
> addition to Phobos must be considered a breaking change for these
> reasons. Given the recent discussion about trying as much as possible
> to not break code, fixing the issues with import is extremely
> important. When a library writer can break user code by introducing a
> *private* symbol (scoped or otherwise), something has gone wrong.

Yeah, you can imagine I was *not* happy when, some time ago, I upgraded
Phobos and my code broke because std.regex introduced a *private* symbol
that just happened to have the same name as a struct I defined myself.

> Furthermore:
> 
> >	// mod.d
> >	module mod;
> >	struct S {
> >		// Use a selective import.
> >		// We place it in the body of S because S's methods
> >		// repeatedly need it -- after all, DRY is good, right?
> >		import std.format : format;
> >
> >		void method1(string fmt) {
> >			writeln(format(fmt, ... ));
> >		}
> >
> >		void method2() {
> >			auto s = format("abc %s def", ...);
> >			...
> >		}
> >	}
> >
> >	// main.d
> >	module main;
> >	import mod; // we need the definition of S
> >
> >	void format(S s) {
> >		... /* do something with s */
> >	}
> >
> >	void main() {
> >		S s;
> >		s.format(); // UFCS -- should call main.format(s), right?
> >	}
> 
> Am I correct that this bug is due to the fact that a selective import
> from a module implicitly imports all symbols in that module, rather
> than just the selected symbol?

Not in this case, I don't think. The crux of the problem is that this:

	import some.mod : symbol;

is essentially the same as this:

	static import some.mod;
	alias symbol = some.mod.symbol;

Therefore this:

	struct S {
		import std.format : format;
		...
	}

is essentially equivalent to this:

	struct S {
		static import std.format;
		alias format = std.format.format;
		// ^^^ the above line is what makes s.format() break.
	}

> I don't know how anyone could think that D's module system is simple
> at this point when things behave so differently from how they
> intuitively should behave (for my own personal definition of
> intuitive).

Oh, it's certainly "simple". Simple to the compiler writers, that is, in
the sense of the implementation being quite straightforward, but that's
not necessarily "simple" from a user's POV!

Or perhaps a fairer way to state this is that "import" as currently
implemented may not quite coincide with what users think "import" should
do. Currently, "import" means, quite literally, "import module X's
symbols into the current scope" -- all the symbols in X get inserted
into the symbol table of the current scope. However, what most people
understand when they hear of "import" is "add module X to the list of
scopes to search when we need to look up a symbol". Perhaps a better
name for this "more intuitive" import is "use": that is, use the symbols
from module X if they are referenced but don't pull them all into the
current scope's symbol table.

In this sense, the current behaviour is "correct" because the observed
effects are exactly what happens when module X's symbols get added to
the scope's namespace. However, this also means it's a poor tool for
encapsulation, because its semantics cause encapsulation-breaking
effects as described. And it also means that users, who generally tend
to expect "use" semantics rather than "import" semantics, will tend to
get into trouble when they start using it.

Furthermore, there isn't really any practical way right now to get "use"
semantics in D, since "import" is the only tool we currently have.  To
get "use" semantics you'd have to resort to really ugly hackery like
defining a separate private sub-scope to pull symbols into, for example:

	struct S {
		// Private empty struct to provide a water-proof scope
		// into which we can import symbols without fear of
		// leakage:
		private static struct Imports {
			// Now this won't leak to S's scope:
			import std.format : format;
		}

		void method() {
			// But it *will* look really ugly:
			Imports.format(...);
		}
	}

	void format(S s) { ... }

	S s;
	s.format(); // now this won't get hijacked by the import

Hmm... actually, this gives me an idea. What if we implement a little
syntactic sugar for this in the compiler? Say:

	scope import std.conv ... ;
	scope import std.format ... ;

gets lowered to:

	private static struct __imports {
		import std.conv ... ;
		import std.format ... ;
	}

where __imports is an implicit nested struct that gets introduced to
each scope that uses "scope import".

Then we introduce a new lookup rule, that if a symbol X cannot be found
with the current lookup rules, then the compiler should try searching
for __imports.X in the current scope instead. That is to say, if:

	format("%s", ...);

cannot be resolved, then pretend that the user has written:

	__imports.format("%s", ...);

instead. Sort of like the import analogue of UFCS (if a member function
can't be found in a call obj.method(), then look for method(obj) in the
global scope instead).
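
For anyone not familiar with the UFCS fallback I'm comparing this to,
here's a minimal sketch of it. Widget, name and label are made-up names
purely for illustration; the point is only that a real member always
wins, and the free function is consulted only when no member exists:

```d
import std.stdio : writeln;

// Hypothetical type, for illustration only.
struct Widget {
    // A real member function: found first by member lookup.
    string name() { return "member name"; }
}

// Free functions at module scope; these are the UFCS candidates.
string name(Widget w) { return "free name"; }
string label(Widget w) { return "free label"; }

void main() {
    Widget w;

    // The member exists, so the free function name(Widget) is ignored.
    assert(w.name() == "member name");

    // No member called label, so w.label() is rewritten to label(w).
    assert(w.label() == "free label");

    // The free function is still reachable with ordinary call syntax.
    assert(name(w) == "free name");

    writeln("ok");
}
```

The proposed __imports lookup would sit in the same "last resort" slot:
it only kicks in once the ordinary rules have come up empty.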

This way, existing code won't have to change, no breakage will be
introduced, and only a small addition (not change) needs to be made to
the existing lookup rules. Then whenever we need "use" semantics as
opposed to raw "import" semantics, we just write "scope import" instead,
and it should all work. (So we hope.)

Heh, sounds like DIP material...

T

-- 
If blunt statements had a point, they wouldn't be blunt...