extern(C++, ns)

Mon Jan 4 04:42:15 PST 2016

On 4 January 2016 at 02:18, Walter Bright via Digitalmars-d
<digitalmars-d at puremagic.com> wrote:
> On 1/3/2016 3:25 AM, Manu via Digitalmars-d wrote:
>>
>> On 3 January 2016 at 18:03, Walter Bright via Digitalmars-d
>> <digitalmars-d at puremagic.com> wrote:
>>>
>>>
>>> C++ doesn't have a module structure,
>>
>>
>> It's called the filesystem ;) .. People use folders and filenames to
>> describe what would be 'modules' in C, if C was said to have modules.
>> Sometimes they make namespaces that match the filesystem ;)
>
>
> I know how C++ name scoping works, after all, I implemented a C++ compiler.
> C++ doesn't have modules, and doesn't even have a semantic notion of files.
> You can pretend it has them, but when you talk about "edge cases not
> working", they aren't going to work in spades here.

That doesn't really matter though, we're talking about practical
bindings to C++ libraries here. It's a practical notion; it's natural
to want to distribute the symbols among D modules that logically
relate to the structure/layout of the C/C++ .h files, or even to adapt
that layout slightly so it fits among D modules more naturally for the
D user.
It's not a reasonable binding solution to tell the user they import
one uber-module with the entire C++ lib. That's the only case where
extern(C++, "ns") separating symbols by namespace is actually
meaningful. If they're spread among respective D modules, as they
would be, the name collision naturally won't happen, and it's very
convenient to resolve if it does occur.

>> This!!! You're on to it!!
>> That's exactly what I want; I ***REALLLY*** want the compiler to emit
>> that error message in that exact situation!
>> Please, please can we have that? :)
>
>
> No. If D is to support C++ namespaces, it has to support declaring the same
> identifier for different purposes in different namespaces. C++ namespaces
> have different scopes in C++. Doing anything else would make it impossible
> to connect to perfectly legitimate C++ programs. The WHOLE POINT of C++
> namespaces is to support declaring the same identifier in different
> namespaces.

The point of supporting C++ namespaces is for mangling, and nothing
more. We just want the linker to link.
D already has a module system, it's excellent, and we want to use it
exactly how it is. Its sole purpose is to allow you to declare the
same identifier multiple times under different module namespaces.
I don't want to see any change to D's standard naming patterns (as
seen by the D library consumer) because the linker is linking to
something that was compiled from C++. It should have no impact on D's
normal naming/scoping rules.

If the declaration is in the D module x.y, then the identifier is
x.y.Identifier, just like EVERYTHING. If I didn't want it to be there,
I would put it somewhere else. If I wanted it under x.y.ns.Identifier,
I would make a module x.y.ns and put it there.
If I have C++ ns1::X and ns2::X, then I can make respective D modules
x.y.ns1 and x.y.ns2 and put each declaration of X in each one...
normal D name resolution applies.

I really don't understand what the problem is? There must be something
I'm completely missing, because it makes no sense to me to make it
impossible for the user to place the declaration where they want it,
with no opt-out.
There's nothing problematic or inconvenient about separating symbols
with the same names into separate modules, that's what you do in D.
We're not writing C++ code, we're writing a D api for use by normal D
code. We're only *linking* to a C++ binary, and that's irrelevant to
the user of the lib. They don't want to see weird C++ details creeping
into the API, it would make no sense to the consumer. It should look
like any normal D api to them.

>> This would solve a lot of awkward issues.
>
>
> It'd be a river of bug reports, because sure as shootin', people are going
> to try to interface to this C++ code:
>
>   namespace ns1 { int identifier; }
>   namespace ns2 { int identifier; }
>
> And why shouldn't they? It's correct and legitimate C++ code.

Sure, and they would have almost certainly made a module 'ns1' and
another module 'ns2', and put each 'identifier' in their respective
place. That's the natural thing to do as a D programmer, and if
someone complains as you say, suggest they make another module for the
other symbol...
I don't believe the situation you propose will emerge; it would be
unnatural for a D user to expect that 2 symbols with the same name
would coexist in one module, and if it does, the solution is extremely
simple, and works well... just as well as all other normal D code.
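
For reference, a minimal C++ sketch of the case being argued over: the
same identifier declared in two namespaces names two unrelated objects.
The initial values and the helper distinctObjects() are assumptions added
for illustration; they are not from any real library. The D-side mapping
proposed here would put one declaration in a module x.y.ns1 and the other
in x.y.ns2.

```cpp
#include <cassert>

// The legitimate C++ from the example above, with arbitrary initial values
// (assumed, for illustration) so the two objects can be told apart.
namespace ns1 { int identifier = 1; }
namespace ns2 { int identifier = 2; }

// Hypothetical helper: true when the two names refer to different objects.
bool distinctObjects() {
    return &ns1::identifier != &ns2::identifier;
}
```

Each object gets its own mangled name on the C++ side, which is the only
part the linker cares about.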

Surely a D consumer of a C++ library would assume that the different
C++ namespaces have been sensibly mapped to D modules appropriately by
the person who wrote the bindings? The bindings will have been written
and organised to present comfortably to a D user.

>> Oh yeah, I just remembered another great case, I had this in C++:
>>    namespace delegate {}
>> 'delegate' is not a valid identifier in D, I was going to come in here
>> and bang the drum again, and also suggest that the namespace needs to
>> be a string, ie extern(C++, "delegate")
>> D takes a lot more names from the namespace than C++ does.
>>
>> The user has _no control_ over what they name the C++ namespace; by
>> definition, it already exists. It's for binding purposes. It might not
>> even be a name that's a valid D identifier.
>> It's just for mangling.
>
>
> Please file an enhancement request for that. Though you can probably make it
> work by writing a C++ wrapper for it.

That would imply that binding to C++ libs may involve writing D
bindings, and also additional C++ bindings/adapters. A developer will
now need a parallel C++ build system to compile the adapter along with
their project.
We should make every effort to prevent that necessity as best we can,
and I don't think making the namespace a string offers any
disadvantage, but it reduces the chances of this sort of problem.

I'll log an enhancement request.
https://issues.dlang.org/show_bug.cgi?id=15512
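
A minimal sketch of the kind of C++ this is about, assuming a library that
happens to use 'delegate' as a namespace name. The functions invoke() and
callIntoDelegateNamespace() are hypothetical names made up for the
example. 'delegate' is an ordinary identifier in standard C++, so this
compiles as-is, while the same word is a keyword in D.

```cpp
#include <cassert>

// 'delegate' is not reserved in standard C++, so a library is free to
// use it as a namespace name.
namespace delegate {
    // Hypothetical library function, for illustration only.
    int invoke(int x) { return x + 1; }
}

// Hypothetical caller showing the qualified name that a binding would
// have to mangle to.
int callIntoDelegateNamespace(int x) {
    return delegate::invoke(x);
}
```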

>> One involves more complex inheritance; we have typical C++
>> 'interfaces', ie, classes populated only with abstract virtuals, and
>> multiple inheritance of those. Does extern(C++) currently support
>> inheriting multiple interfaces?
>
>
> Only if they have no fields, i.e. they mirror COM inheritance.

Yes, this is the case. Do you suspect this will work already?

Situation is like this:

class Base // normal C++ base class, has members
{
public: // public, to match D's default visibility in the binding
  int members;

  virtual void baseVirtual();
};

class Interface // pure COM style interface class, no members
{
public:
  virtual void interfaceVirtual() = 0;
};

class MultipleDerived : public Base, public Interface
{
public:
  int moreMembers;

  void baseVirtual() override {}  // implement function from base
  void interfaceVirtual() override {}  // also from interface
};

This is the situation we have: some classes like these need
representation in D, and D code will further derive from MultipleDerived.
I.e., in D:

extern(C++) class Base
{
  int members;

  /+virtual+/ void baseVirtual();
}

extern(C++) interface Interface
{
  void interfaceVirtual();
}

extern(C++) class MultipleDerived : Base, Interface
{
  int moreMembers;

  override void baseVirtual() {}  // implement function from base
  override void interfaceVirtual() {}  // also from interface
}

class DClass : MultipleDerived // D class derives from C++ base class
{
  // D stuff
}

This all seems reasonable, the only bit I haven't tried yet is
inheriting from a proper base class and also an interface.
Will the extern(C++) class get this more complicated vtable structure right?
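
A minimal C++ sketch of why this layout is harder than single
inheritance. With two polymorphic bases, the Interface subobject
typically sits at a non-zero offset inside MultipleDerived, so converting
to Interface* adjusts the pointer and the object carries a second vtable
pointer. The int return values and the three helper functions are
assumptions added so the behaviour can be observed. The offset
observation holds on the common Itanium and MSVC ABIs, but the C++
standard does not guarantee it.

```cpp
#include <cassert>

class Base {
public:
    int members = 0;
    virtual int baseVirtual() { return 1; }
};

class Interface {
public:
    virtual int interfaceVirtual() = 0;
};

class MultipleDerived : public Base, public Interface {
public:
    int moreMembers = 0;
    int baseVirtual() override { return 10; }       // from Base
    int interfaceVirtual() override { return 20; }  // from Interface
};

// True when the Interface subobject does not start at the same address
// as the Base subobject, i.e. the conversion adjusted the pointer.
bool interfacePointerIsAdjusted(MultipleDerived& d) {
    Base* b = &d;
    Interface* i = &d;
    return static_cast<void*>(b) != static_cast<void*>(i);
}

// Virtual dispatch through either base must reach the derived overrides.
int callThroughBase(Base& b) { return b.baseVirtual(); }
int callThroughInterface(Interface& i) { return i.interfaceVirtual(); }
```

A D binding has to reproduce both the pointer adjustment and the second
vtable slot for calls through Interface to land on the right function.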

>> I also expect I'll need a way to express a class instance. I noticed that
>> in:
>>
>> extern(C++)
>> {
>>    class C {}
>>    struct S(T) {}
>> }
>> S!C x;
>>
>> I noticed in one case like this that 'x' seemed to mangle as C++
>> S<C*>, where the '*' in the mangled name seems to be implicit in D
>> (presumably because D's classes are always references?).
>> If that is indeed the case, we'll need a way to express a symbol name
>> that is just the instance 'C', not 'C*'. But these will be advanced
>> problems, with possible work-arounds when basic problems are solved.
>
>
> D doesn't support "Shimmer" aggregates ("Is it a floor wax or a dessert
> topping?") You may need to write C++ wrappers for the problematic C++ types,
> and then interface to the wrappers.

You did magic mangling for c_long... it's going to be a big ongoing
problem if we can't express a template symbol with a C without a '*'
after it.
C in the template signature is the common case, people almost always
template on C, and add the * or & at the site of the C member
declaration.
Are you sure we can't invent some little marker template to inform the
C++ mangler to leave off the '*'? Thing!(NoStar!Class) -> Thing<Class>

An example, there is a C++ class like this:

template<typename C>
struct Thing
{
  C *instance; // overwhelmingly common case; * (or &) on the member,
               // rarely in the template
  static void f();
};

In D, this struct is defined to match as we expect:

extern(C++) struct Thing(C)
{
  C instance; // obviously, * is implicit in D, no problem
  static void f();
}

Consider the C++ call: Thing<MyClass>::f();
From D, it would be: Thing!MyClass.f();

The structs are identical, the function is correct and compatible,
this should work.
Problem is, the C++ symbol is: Thing<MyClass>::f
The extern(C++) symbol that D generates is: Thing<MyClass*>::f
And you fail to link.
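
A minimal C++ sketch of why the two names cannot resolve to the same
symbol: Thing<MyClass> and Thing<MyClass*> are unrelated instantiations,
so each has its own f(). The helper sameType() and the empty MyClass are
assumptions added for illustration.

```cpp
#include <cassert>
#include <type_traits>

class MyClass {};

template <typename C>
struct Thing {
    C* instance = nullptr;  // the '*' lives on the member, not the argument
    static void f() {}
};

// Hypothetical helper: true only when A and B are the same type.
template <typename A, typename B>
constexpr bool sameType() {
    return std::is_same<A, B>::value;
}

// The instantiation the C++ library exports and the one D currently asks
// for are different types, so their static members are different symbols.
static_assert(!sameType<Thing<MyClass>, Thing<MyClass*>>(),
              "instantiations on C and C* must be distinct types");
```

Since the types differ, the mangled names of their f() members differ
too, and a reference to one cannot be satisfied by the other.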

What if we could call: Thing!(NoStar!MyClass).f();
Here 'NoStar!' is just some thin skin; it could be a magic helper,
like c_long, which makes this connection work.