Solving the spurious forward/cyclic reference errors in DMD

Elie Morisse via Digitalmars-d digitalmars-d at puremagic.com
Sun Jul 16 18:16:30 PDT 2017


Timon, any update on this? What are the insights you gained with 
your frontend?

I recently reported two cases without a simple fix:

https://issues.dlang.org/show_bug.cgi?id=17656
https://issues.dlang.org/show_bug.cgi?id=17194#c1

and have seen a lot more referencing errors with Calypso, 
especially when this gets enabled: 
https://github.com/Syniurge/Calypso/commit/1e1ae319e32120bd9ef0009716ddabed92f69ac2

Calypso makes its mapped C++ symbols go through the same 
importAll -> semantic1,2,3 route that D symbols take. Ultimately 
this is mostly useless work that should be skipped; it only works 
this way because I wasn't yet familiar with the DMD source code 
when I started. But this hard and thankless work has also been 
exposing a seemingly endless number of bogus 
forward/circular/misc. reference errors in DMD (and many large 
libraries are blocked by them).
Those errors boil down to semantic calls getting triggered at the 
wrong time, on symbols that the caller doesn't really depend upon.

Most of the time, the semantic() call on the LHS of a DotXXXExp, 
inside AggregateDeclaration.determineSize, etc. is there in case 
the LHS or the aggregate has:
  - mixins to expand
  - attributes whose members have to be added to the parent symtab
  - a template to instantiate (if the LHS is a template)

These are (AFAIK) the only cases where the symtab of the LHS or 
the aggregate may get altered, and if I understand correctly 
that's what the semantic call is checking before searching for 
the RHS or determining the aggregate fields and then its size.
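
To make the three cases above concrete, here is a small D sketch. 
All names (Holder, fromMixin, fromAttrib, Tmpl) are made up for 
illustration; only the language features are real. In each case 
the member cannot be found by a lookup until the compiler has 
done some expansion work on the enclosing symbol:

```d
// Hypothetical names throughout; this only illustrates the three cases.
struct Holder
{
    // 1. String mixin: `fromMixin` exists only once the mixin is expanded.
    mixin("enum fromMixin = 1;");

    // 2. Attribute block: `fromAttrib` is written inside `static { }`
    //    but ends up as a member of Holder itself.
    static
    {
        int fromAttrib = 2;
    }
}

// 3. Template: `Tmpl!3` has no members until it is instantiated.
template Tmpl(int n)
{
    enum value = n;
}

// Each lookup on the right of the dot needs the expansion to have happened.
static assert(Holder.fromMixin == 1);
static assert(__traits(hasMember, Holder, "fromAttrib"));
static assert(Tmpl!3.value == 3);

void main() {}
```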

So would it be worth exploring splitting semantic() into a 
determineMembers() step followed by the rest of semantic()? This 
would help in most cases, but I can imagine scenarios where simply 
splitting may not be enough. Example:

enum E { OOO = S.UUU }

import std.conv;
string genMember1() { return "enum G8H9 = " ~ 
(cast(int)E.OOO).to!string ~ ";"; }
string genMember2() { return "enum UUU = 1;"; }

struct S {
     mixin(genMember1());
     mixin(genMember2());
}

We'll have S.determineMembers -> E.OOO.semantic -> 
S.determineMembers, and although in this case the value of OOO 
may be evaluated to 1, at this point the compiler can't easily 
know whether the mixins will generate zero, one or more UUU 
members. To mitigate the problem, determineMembers() could be 
made callable multiple times (recursively), each call starting 
from where the previous (ongoing) call left off, so in this 
particular case the second S.determineMembers call would expand 
the second mixin to enum UUU = 1. But then how does the compiler 
know and react if genMember1 generates a new UUU member? OK, a 
second UUU enum will error, but what if UUU was a function and 
genMember1() generated a more specialized overload of UUU? I.e.:

enum E { OOO = S.UUU(1) }

import std.conv;
string genMember1() { return "static int UUU(int n) { return n; } "
    ~ "enum G8H9 = " ~ (cast(int)E.OOO).to!string ~ ";"; }
string genMember2() { return "static int UUU(int n, int a = 5) { "
    ~ "return n + a; }"; }

struct S {
     mixin(genMember1());
     mixin(genMember2());
}

At this point it's getting a bit contrived, so maybe it's not 
really worth finding a solution to make this compile (but ideally 
the compiler should still warn the user).
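
For what it's worth, the "resumable" determineMembers() idea can 
be modelled outside the compiler. The following is only a toy 
sketch, not DMD code: ToyAggregate, pending, next, members and 
hasMember are all hypothetical names, and each delegate stands in 
for one mixin(...) to expand. The point is that a nested call 
picks up after the mixin currently being expanded instead of 
restarting from the beginning:

```d
// Toy model, not DMD code: every name here is hypothetical.
alias Generator = string[] delegate(); // stands in for one mixin(...)

final class ToyAggregate
{
    Generator[] pending; // unexpanded mixins, in declaration order
    size_t next;         // index of the first mixin not yet started
    string[] members;    // names added to the symbol table so far

    // Re-entrant: a nested call resumes after the mixin that is
    // currently being expanded.
    void determineMembers()
    {
        while (next < pending.length)
        {
            auto gen = pending[next++]; // advance *before* expanding
            auto produced = gen();      // may call determineMembers() again
            members ~= produced;
        }
    }

    bool hasMember(string name)
    {
        determineMembers();
        foreach (m; members)
            if (m == name)
                return true;
        return false;
    }
}

void main()
{
    auto s = new ToyAggregate;
    bool sawUUU;

    // Like genMember1(): looks up UUU while S is still being expanded.
    s.pending ~= delegate string[]() {
        sawUUU = s.hasMember("UUU");
        return ["G8H9"];
    };
    // Like genMember2(): declares UUU.
    s.pending ~= delegate string[]() { return ["UUU"]; };

    s.determineMembers();
    assert(sawUUU);                       // the nested call expanded mixin 2
    assert(s.members == ["UUU", "G8H9"]); // mixin 2 finished before mixin 1
}
```

Note that this model does nothing about the open problem: if the 
first delegate also returned a "UUU" after its own lookup had 
already been answered, nothing here would notice.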

Should I try splitting semantic() and make a PR? It might be a 
lot of work, so I'd like to know if this makes sense first.

