D parser in tango or phobos
Benji Smith
dlanguage at benjismith.net
Wed Sep 10 14:49:33 PDT 2008
Fawzi Mohamed wrote:
> I think that having a compiler of a language written in itself is
> certainly nice from the intellectual point of view, but not immediately
> useful in any sense, and frankly unimportant for most people, even if it
> gives some real benefits to the language development.
Actually, I was mulling over the idea this afternoon, and I think there
could be some major advantages to having the compiler implemented in D.
Depending on the architecture, of course, it might be much easier to
load the compiler as a library. Then you could do all sorts of neat
things like compiling and loading code on the fly.
    char[] sourcecode = getSourcecodeFromSomewhere();
    ASTCodeModule myModule = parser.parse(sourcecode);
At this point, with the compiler exposing a well-defined API for all of
its internal representations, you could add your own hooks to operate on
AST nodes between the parsing and code-generation phases.
    foreach (ClassDeclaration clazz; myModule.classes) {
        FunctionDeclaration[] methods = clazz.publicFunctions;
        // foreach infers the element type; "auto" is not allowed here
        foreach (method; methods) {
            decorateMethodWithTraceLogging(method);
        }
    }
And, if the linker & loader were also written in D, you could take those
runtime-parsed and dynamically-modified pieces of code and immediately
link and load them right into the application.
    SharedLib library = compiler.toLib(myModule);

    // Maybe write the library to a file
    library.emit(`C:\path\to\my-library.lib`);

    // ...Or execute the code directly
    void delegate() entry = library.entryPoint;
    entry(); // a delegate is invoked with ordinary call syntax
The .NET framework has some of this kind of functionality (in
Reflection.Emit), allowing programmers to build executable code,
opcode-by-opcode, at runtime.
The resultant code is subject to the same JIT compilation as any other
.NET code.
A good example of its usage is the Regex implementation in the .NET
standard library. When a regex is constructed with
RegexOptions.Compiled, it builds a custom function, with raw GOTO
opcodes and everything, based on the regex string passed into the
constructor at runtime. Consequently, the .NET regex engine is very
efficient.
The same kind of thing exists in the Tango regex engine -- you can
generate and compile D code from a regex -- but only if the regex string
is known at compile-time.
Furthermore, if the D compiler was written in D, and if it could spawn
its own subordinate instances of the compiler on the fly, immediately
loading compiled code into executable memory, think of how that would
expand the power of CTFE. Any legal function would be callable at
compile-time just as easily as at runtime.
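For comparison, here is a minimal sketch of what CTFE already allows. It
assumes a current D2 compiler with Phobos rather than the 2008-era
toolchain, and the factorial function and variable names are made up
for illustration. The same ordinary function is evaluated once by the
compiler and once at runtime; what a self-hosting compiler might add is
lifting the restrictions on which functions qualify (for example, ones
that do I/O or whose source isn't available).

```d
import std.stdio;

// An ordinary function with no special annotations (illustrative only).
int factorial(int n)
{
    return n <= 1 ? 1 : n * factorial(n - 1);
}

// An enum initializer must be a compile-time constant, so the compiler
// has to evaluate factorial(5) itself while compiling.
enum int atCompileTime = factorial(5);
static assert(atCompileTime == 120);

void main()
{
    int n = 5;                    // a value only known at runtime
    int atRuntime = factorial(n); // same function, compiled and executed normally
    assert(atRuntime == atCompileTime);
    writefln("%s %s", atCompileTime, atRuntime); // prints "120 120"
}
```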
The opposite would be true too. You'd be able to generate and compile
templates at runtime, potentially creating whole new Types (which has
only ever been possible at compile-time). Admittedly, I can't think of
any actual utility for runtime type-generation, but I'm sure someone
more clever than me could think of some use for it.
Anyhow, those are the sorts of things that I think would become feasible
if the D parser, compiler, linker, and loader were all written in D.
Calling the compiler dynamically from user code, or from within the
compiler itself, could be hugely powerful.
(NOTE: I'm not actually *advocating* any of this. Just musing. There are
plenty of reasons *not* to write the compiler in D, such as already
having done ten years of work (more on the backend) to refine the
existing compiler.)
--benj
More information about the Digitalmars-d mailing list