[GSOC Draft Proposal] ANTLR and Java based D parser for IDE usage
Luca Boasso
luke.boasso at gmail.com
Mon Apr 4 13:16:50 PDT 2011
Thank you for your comments.
Here is the updated timeline; I'm always looking for advice:
- April 25 – May 23: Community Bonding Period
Since I am new to the D community, I will spend some time learning how to
contribute while following the guidelines of the community and the DDT project.
I will start by reviewing the language reference, asking for clarification
when needed.
Once I have an overall understanding, I will write the production rules of
a subset of the D grammar (D0) in the ANTLR grammar notation (similar to EBNF).
Since the AST generation functionality is a key factor for a correct
integration with DDT, I will enhance the D0 parser with AST construction
rules from the beginning.
At this point, I need to discuss with the DDT team the type of AST that has to
be built for IDE purposes, and confirm which annotations are most useful
(e.g. source ranges).
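
To make the source-range annotation concrete, here is a minimal Java sketch of
an AST node that records the character offsets it covers. All names
(AstSketch, Node, startOffset, endOffset, findAt) are hypothetical and are not
taken from DDT or ANTLR; the real node type is exactly what still has to be
agreed with the DDT team.

```java
import java.util.ArrayList;
import java.util.List;

public class AstSketch {
    /** Hypothetical AST node; NOT the DDT or ANTLR node type. */
    static final class Node {
        final String kind;
        final int startOffset; // inclusive character offset in the source
        final int endOffset;   // exclusive character offset in the source
        final List<Node> children = new ArrayList<>();

        Node(String kind, int startOffset, int endOffset) {
            if (startOffset < 0 || endOffset < startOffset) {
                throw new IllegalArgumentException("invalid source range");
            }
            this.kind = kind;
            this.startOffset = startOffset;
            this.endOffset = endOffset;
        }

        /** Adds a child whose range must lie inside this node's range. */
        Node add(Node child) {
            if (child.startOffset < startOffset || child.endOffset > endOffset) {
                throw new IllegalArgumentException("child outside parent range");
            }
            children.add(child);
            return this;
        }

        /** Returns the deepest node containing the offset, or null if none. */
        Node findAt(int offset) {
            if (offset < startOffset || offset >= endOffset) {
                return null;
            }
            for (Node child : children) {
                Node hit = child.findAt(offset);
                if (hit != null) {
                    return hit;
                }
            }
            return this;
        }
    }

    /** Hand-built tree for the six-character D source "int x;". */
    static Node sample() {
        Node decl = new Node("VariableDeclaration", 0, 6);
        decl.add(new Node("Type", 0, 3));       // "int"
        decl.add(new Node("Identifier", 4, 5)); // "x"
        return decl;
    }

    public static void main(String[] args) {
        // The caret sits on "x" (offset 4), as an editor lookup would.
        System.out.println(sample().findAt(4).kind); // prints "Identifier"
    }
}
```

An offset lookup like findAt is the sort of query an IDE issues on every caret
move, which is why the ranges are worth fixing early in the AST design.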
- May 23 – July 11: Developing phase I
A fully functional D0 parser will be integrated into DDT.
Once the integration is complete I will augment the parser to handle a
superset of the D1 and D2 grammars.
To check the correctness of the parser, I will test it against large existing
D code bases (such as Phobos, Tango, and the source code from Andrei's TDPL
book).
Subsequently I will modify the tree construction rules to reflect the changes
in the syntax.
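
The corpus check described in this phase could be driven by a small harness
like the following Java sketch. It assumes the parser can be wrapped as a
Predicate<String> that returns true when a source text is accepted; that
wrapper, the class name CorpusCheck, and the method rejectedFiles are all
hypothetical. It also assumes UTF-8 sources, which will not hold for every D
file.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class CorpusCheck {
    /**
     * Walks root recursively and returns every .d file that the given
     * parser wrapper rejects. The Predicate stands in for the real parser.
     */
    static List<Path> rejectedFiles(Path root, Predicate<String> accepts)
            throws IOException {
        List<Path> sources;
        try (Stream<Path> walk = Files.walk(root)) {
            sources = walk.filter(Files::isRegularFile)
                          .filter(p -> p.getFileName().toString().endsWith(".d"))
                          .sorted()
                          .collect(Collectors.toList());
        }
        List<Path> rejected = new ArrayList<>();
        for (Path source : sources) {
            // Assumption: sources are UTF-8 encoded.
            String text = new String(Files.readAllBytes(source),
                                     StandardCharsets.UTF_8);
            if (!accepts.test(text)) {
                rejected.add(source);
            }
        }
        return rejected;
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        // Placeholder parser: accepts any non-blank file. Replace it with
        // a call into the generated parser once that exists.
        Predicate<String> stubParser = text -> !text.trim().isEmpty();
        for (Path rejected : rejectedFiles(root, stubParser)) {
            System.out.println("REJECTED " + rejected);
        }
    }
}
```

Since the corpus is assumed to be valid D, every reported file points at a
parser bug or an unsupported construct rather than at a bad input.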
- July 11 – August 15: Developing phase II
In this phase I will create unit tests to verify the correctness of the
generated trees and I will focus on the remaining aspects of the integration
with the DDT project.
In the remaining time I will add good error recovery to the parser and
improve its overall performance.
- August 15 - August 22: Final phase
I will use this last week to polish the code and improve the documentation.
As a final task, I will think about how support for incremental parsing can be
added in the future.
On 4/4/11, Bruno Medeiros <brunodomedeiros+spam at com.gmail> wrote:
> On 29/03/2011 19:51, Luca Boasso wrote:
>> Timeline
>> --------
>>
>> This is a tentative timeline to be further discussed with the help of the
>> community.
>> I am committed to dedicate substantially to this project knowing that I
>> also
>> have to pass some exams.
>> I estimate that I could spend initially approximately 30h/week.
>> After the exam session I will work full-time on this project.
>>
>> - April 25 – May 23: Community Bonding Period
>> Since I am new in the D community I will spend some time learning how
>> to
>> contribute while following the guideline of the community and the
>> DDT project.
>> I will start reviewing the language reference asking for clarifications
>> when needed.
>> Once I have got an overall understanding I will write the production
>> rules of
>> a superset of the D grammar in the ANTLR grammar notation (similar to
>> EBNF).
>>
>> - May 23 – July 11: Developing phase I
>> The correctness of the parser is of paramount importance.
>> I will create many tests to exercise the parser (at this point just a
>> "recognizer") obtained as output from ANTLR.
>> Once I am confident with the parser conforms to the language reference
>> and
>> recognizes the same language as the parser in DMD, I will enhance it
>> with AST
>> construction rules.
>> At this point, I need to discuss with the DDT team the type of AST that
>> has to
>> be built for IDEs purposes, and confirm which annotations are most
>> useful
>> (eg. source ranges).
>>
>> - July 11 – August 15: Developing phase II
>> In this phase I will create unit tests to verify the correctness of the
>> generated trees and I will focus on the integration of the parser with
>> the DDT
>> project.
>> In the remaining time I will provide good error recovery to the parser
>> and I
>> will improve the overall performance.
>>
>> - August 15 - August 22: Final phase
>> I will use this last week to polish the code and improve the
>> documentation.
>> As a final task, I will think about how support for incremental parsing
>> can be
>> added in the future.
>
> In line with my previous comments on the proposal, I have some comments
> regarding the timeline as well. They are somewhat general comments, it
> may not be that worthwhile to go into much detail in the timeline aspect
> unless the proposal is actually accepted.
>
> There is not much point in writing tests for a language-recognizer only
> parser, in other words, a test that only checks if the parser recognizes
> the source as valid or not. We can just feed a lot of existing valid
> source files(like Phobos, Tango, etc.) and check that the parser
> validates it correctly. (That doesn't test the *invalid* syntax cases,
> but that's a less important case for an IDE parser than making sure it
> is correct for the *valid* syntax cases)
>
> The other thing is that AST generation with all the necessary info is
> probably going to be the most significant aspect of this project, in
> terms of effort required. And to implement the AST actions, I suspect it
> might be necessary (or at least desirable) to change the language
> grammar to better suit the actions that generate the AST.
> So with this in mind, I think it would be better that, instead of doing
> a complete D language recognizer first and then adding the AST
> generation functionality, what should be done first is a AST-generating
> parser for a very limited D-like subset language (for example, a
> language with just variable, class, and function/function-parameter
> declarations), and then when we have this, to start expanding the
> grammar until it supports D1/D2 and has all the extra minutiae.
> The point of this is develop a prototype with the essential and more
> difficult aspects of the parser (AST generation, source ranges, some
> error correction) as soon as possible, and the extra stuff afterwards,
> instead of the other way around.
>
> --
> Bruno Medeiros - Software Engineer
>