Adding a D backend to GNU Bison
H. S. Teoh
hsteoh at quickfur.ath.cx
Wed Jan 16 23:27:02 UTC 2019
On Tue, Jan 15, 2019 at 03:13:44PM +0000, Eduard Staniloiu via Digitalmars-d wrote:
[...]
> I'm posting this as a followup to the positive feedback that Andrei's
> Bison related post(
> https://forum.dlang.org/thread/1c3d8e77-ce4c-6310-0afd-e6518728299f@erdani.org)
> has received.
>
> Akim Demaille has started "turning the wheels" towards adding a D
> backend to GNU Bison.
Great!
> There currently is a skeleton for D on the Bison master
> (https://savannah.gnu.org/git/?group=bison) that you can use to check
> out the backend. A good starting point to explore this feature is to
> go into the .../share/doc/bison/examples/d directory and to run "make"
> there.
>
> There is no documentation and Akim doesn't have experience with the D
> programming language, that is where we, the D community, can lend a
> helping hand.
> I'm posting this to ask for your help in getting the D backend feature
> into Bison.
I glanced briefly at the various D-related notes, and took a good look
at the generated calc.d in the examples/d directory. Here are some
comments:
- I understand that the current D codegen is mainly based on the
existing Java backend, so unsurprisingly quite a few places show
signs of being very Java-like rather than D-like. Hopefully, with
some work, we can get it to emit more idiomatic D. :-)
- The first question I have is how much the Bison API depends on the
lexer being swappable at runtime, i.e., via the Lexer interface. I'm
having a hard time imagining that there will be many use cases where
you'd like to swap lexers with the same parser at runtime, so I'm
thinking the parser should simply take the lexer type as a template
argument, with signature constraints ensuring that whatever type the
user passes in implements the necessary methods for the parser to
work. This lets us bind the lexer to the parser at compile-time, and
elide the vtable indirection (runtime swapping can still be done if
the user passes in a class).
- In a similar vein, I'm wondering if the generated parser ought to
be a class at all, or is the inheritability of the parser a key Bison
feature? Also, are language-specific directives supported /
encouraged? If so, it might be worthwhile to let the user choose
whether to use a struct/template API vs. an OO class-based API.
- On a more high-level note, I'm wondering how flexible the API of the
parser can be. The main thought behind this is that given enough
flexibility, we may be able to target, e.g., @nogc, @safe, pure, etc.,
with @safe probably being a pretty important target, if it's possible
to do so. While this depends of course on the exact code the user puts
into the .y file, a worthy goal is to make the emitted D code @safe
(pure, etc.) by default unless the user writes non-@safe code in the
.y file.
- How flexible can the lexer API be? For example, currently
lexer.yyerror takes a string argument, which requires using std.format
in various places. If permissible, I'd like to have yyerror take a
generic input range instead, so that we can avoid the inherent memory
allocation of std.format (e.g., if we wish to target @nogc).
- Also, is it possible to use exceptions instead of yyerror()? Or would
that deviate too far from Bison's design?
- On a more general note, I'd like to make the parser/lexer APIs
range-based as much as possible, esp. when it comes to
string-handling. But I'm just not sure how much the APIs are expected
to conform to the analogous C/C++/Java APIs.
- I wonder if YYSemanticType could use std.variant somehow instead of a
raw union, since a raw union would probably force the parser to be
@system.
- Can Bison handle UTF-8 lexer/parser rules? D uses UTF-8 by default,
and it would be nice to leverage this support instead of manually
iterating over bytes, as is done in a few places.
- Some minor points that should be easy to fix:
  - The YYACCEPT, YYABORT, etc., symbols really should be declared as
    enums rather than static ints.
  - D does support the #line directive, so these should be emitted as
    they are in C/C++. (I noticed they currently only appear as
    comments.)
  - YYStack needs to be fixed to avoid the reallocate-on-every-push
    problem on arrays, a common beginner's mistake. Also, if we're
    going to target @nogc (not 100% sure about that right now), we may
    have to forgo built-in arrays altogether.
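
To make the template-lexer idea above a bit more concrete, here's a rough
sketch of what I have in mind. All the names here (isLexer, Parser,
ArrayLexer, countTokens) are made up for illustration and are not taken
from the current skeleton; I'm only assuming the lexer exposes yylex() and
yyerror() as it does now.

```d
import std.stdio : writeln;

// Hypothetical compile-time check: a lexer must provide yylex() returning
// an int token code and yyerror() accepting a message.
enum isLexer(L) = is(typeof((ref L l) {
    int tok = l.yylex();
    l.yyerror("message");
}));

// Hypothetical parser bound to its lexer type at compile time.
struct Parser(Lexer)
if (isLexer!Lexer)
{
    Lexer lexer;

    // Toy stand-in for parse(): counts tokens until yylex() returns 0,
    // which this sketch treats as end of input.
    size_t countTokens()
    {
        size_t n;
        while (lexer.yylex() != 0)
            ++n;
        return n;
    }
}

// A struct lexer: no vtable, so calls are resolved statically.
struct ArrayLexer
{
    const(int)[] tokens;

    int yylex()
    {
        if (tokens.length == 0)
            return 0;
        immutable t = tokens[0];
        tokens = tokens[1 .. $];
        return t;
    }

    void yyerror(const(char)[] msg) { writeln("error: ", msg); }
}

void main()
{
    auto p = Parser!ArrayLexer(ArrayLexer([1, 2, 3]));
    assert(p.countTokens() == 3);

    // A type without the required methods is rejected by the constraint.
    static assert(!isLexer!int);
}
```

A class implementing the same two methods would satisfy the constraint
as well, so the OO style remains available to users who want it.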
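
On the yyerror point above, this is roughly what I mean by taking a
generic input range instead of a string. It's only a sketch: the
CollectingLexer type and its fixed buffer are hypothetical, and the toy
sink assumes ASCII messages. The point is that the caller can glue message
pieces together lazily with std.range.chain rather than building a new
string with std.format.

```d
import std.range : chain;
import std.range.primitives : ElementType, isInputRange;
import std.traits : isSomeChar;

// Hypothetical lexer whose yyerror accepts any input range of characters.
struct CollectingLexer
{
    char[64] buf;   // toy sink; a real lexer might write to stderr instead
    size_t used;

    void yyerror(R)(R msg)
    if (isInputRange!R && isSomeChar!(ElementType!R))
    {
        foreach (c; msg)
        {
            // Assumes ASCII; anything past the buffer is silently dropped.
            if (used < buf.length)
                buf[used++] = cast(char) c;
        }
    }
}

void main()
{
    CollectingLexer lx;
    // The two pieces are never concatenated into a new string.
    lx.yyerror(chain("syntax error, unexpected ", "NUM"));
    assert(lx.buf[0 .. lx.used] == "syntax error, unexpected NUM");
}
```

A plain string is itself an input range of characters, so existing
callers that pass a string would still compile.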
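
For the enum and YYStack points above, something along these lines is what
I'd expect. Again, just a sketch: the numeric values are placeholders and
not copied from the generated calc.d, and Stack is a hypothetical
replacement, not the existing YYStack code. The idea is to keep a separate
element count over a backing array that grows geometrically, so a push
after a pop reuses the freed slot.

```d
// Placeholder values, not copied from the generated calc.d.
enum : int
{
    YYACCEPT = 0,
    YYABORT = 1,
}

// Hypothetical stack with amortized O(1) push.
struct Stack(T)
{
    private T[] data;   // backing storage; only ever grows
    private size_t len; // number of live elements

    void push(T value)
    {
        // Grow geometrically, and only when the storage is full.
        if (len == data.length)
            data.length = data.length ? data.length * 2 : 16;
        data[len++] = value;
    }

    T pop()
    {
        assert(len > 0, "pop from an empty stack");
        return data[--len];
    }

    size_t height() const { return len; }
}

void main()
{
    Stack!int states;
    foreach (i; 0 .. 100)
        states.push(i);
    assert(states.height == 100);
    assert(states.pop() == 99);

    states.push(7); // reuses the slot freed by pop()
    assert(states.pop() == 7);

    static assert(YYACCEPT == 0);
}
```

This version still allocates through the GC when the array grows; an
@nogc variant would need malloc-backed storage instead.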
[...]
> Akim is going to provide assistance with the process, but he is not to
> be expected to carry this task on his own.
[...]
Dumb question: If I wanted to contribute some commits, do I have to sign
up on savannah.gnu.org? What's the procedure for submitting pull
requests? (Sorry, I glanced over the READMEs and the FAQ at
savannah.gnu.org but didn't find a clear answer.)
T
--
May you live all the days of your life. -- Jonathan Swift