A new Tree-Sitter Grammar for D
Garrett D'Amore
garrett at damore.org
Mon Oct 17 05:21:10 UTC 2022
I'm happy to announce that I've created what I believe is a
complete, or at least very nearly so, Tree-Sitter grammar for D.
You can find it at https://github.com/gdamore/tree-sitter-d
Tree-Sitter is a tool that enables dynamic AST generation for a
variety of purposes, and is becoming quite popular with many
editor projects.
I've tested this grammar with as many different sources as I can
find, including the test cases for the DMD compiler itself, as
well as various other community sources and proprietary sources.
It does not include support for preview syntaxes for bit fields
or shortened function bodies, but I believe it should cover just
about every other case. I've been using this with the Helix
editor, along with the Serve-D language server, with some success.
Included in my repository are queries for highlighting, injection
(really just comments), and text objects (so you can navigate
across major structures if your editor supports it). I have not
yet implemented indent queries.
This work includes a test suite that has a lot of test cases, but
of course is probably still far from complete.
For folks that care, out of 1067 test cases in the DMD compiler,
this grammar successfully parses all but 5. The five that do not
parse involve errors in uninstantiated templates, a problem with
#line directives involving multi-line comments (you should never
encounter this!), and the preview syntax support already
mentioned.
This grammar is slightly more strict than the officially posted
grammar, as some constructs which are flagged only at semantic
analysis are caught at parse time in my grammar. (Notably comma
expressions are not legal in constructs where they would be
evaluated as a single value -- DMD generates a compilation error
at semantic analysis time whereas my grammar simply rejects them
as invalid syntax. This was done to reduce the overall size of
the generated parser and to reduce the number of conflicts that
would need resolution.)
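As a rough illustration of the comma expression distinction, here
is a minimal sketch in D. The function name meetInMiddle is made
up for this example and does not come from the grammar or its
test suite. The commas in the for loop discard their value, which
is valid D, so that form should parse. The commented-out line
uses a comma expression as a single value, which is the kind of
construct DMD rejects at semantic analysis and which this grammar
is described as rejecting at parse time.

```d
// Hypothetical example: two counters walk toward each other.
int meetInMiddle()
{
    int i, j;

    // Comma expressions whose results are discarded: valid D.
    for (i = 0, j = 10; i < j; i++, j--)
    {
    }

    // Comma expression used as a single value. DMD accepts this
    // syntactically but reports an error during semantic analysis.
    // It is left commented out so that this sketch compiles.
    // int k = (i, j);

    return i;
}
```

Exactly which contexts the grammar treats as "evaluated as a
single value" is best checked against the grammar source itself.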
I believe this grammar may be the most complete and accurate machine
readable grammar outside of the DMD compiler itself. Certainly
this has fixes to numerous defects found in both libdparse and in
the official grammar, although both those projects were extremely
useful as foundations to build upon. It is my hope that others
will find this useful.
I do welcome contributions of all forms -- whether bug reports,
additional test cases, or grammar fixes or corrections. I am
quite new to both Tree-Sitter and to D, so it's entirely possible
that I've missed something or misunderstood something!
I will probably see if this can be adopted into either the
Tree-Sitter or DLang community projects -- I'm not sure which is the
better location. If you have thoughts please don't hesitate to
let me know. I'm quite sure that the grammar itself could
probably benefit from some further optimization, and I welcome
advice or contributions!