D parsing

Andrei Alexandrescu SeeWebsiteForEmail at erdani.org
Tue Nov 5 12:18:22 PST 2013


On 11/4/13 11:27 PM, Brian Schott wrote:
> On Monday, 4 November 2013 at 13:20:01 UTC, Timothee Cour wrote:
>> for lexing there's already dscanner we could use (while we wait,
>> perhaps, for an autogenerated lexer);
>> so I think the priority is the autogenerated parser (dscanner has one,
>> but it's hand-designed), where it's still unknown what will work well.
>
> Yes, that tool has two properties:
> 1) It works now. Not Soon(tm). You can download it, compile it, and use
> it to dump the AST of your D code in just a minute or two.
> 2) It wasn't built THE ONE TRUE WAY.
>
> But we should take a step back first. Before we try to implement a
> parser for D's grammar, we need to figure out what exactly D's grammar is.
>
> Seriously. We don't have a real grammar for D. We have the language spec
> on dlang.org, but it isn't complete, consistent, or updated when the
> language changes. Want examples? I have a tracker for them here:
> http://d.puremagic.com/issues/show_bug.cgi?id=10233
>
> There's also my project here: https://github.com/Hackerpilot/DGrammar,
> but it's not official and I keep finding differences between it and the
> behavior of DMD.
>
> Why am I the only one who thinks this is a problem?

I agree it's a problem, in fact three problems in one. In decreasing 
order of difficulty:

1. Semantic changes to working code (e.g. order of evaluation) are 
subtle enough to be very difficult to track; they require constant 
attention and careful manual verification and maintenance of the 
documentation.
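
For concreteness, here is a small hypothetical snippet (not taken from 
the spec) showing the kind of thing that can change silently: whether 
it prints "1 2" or "2 1" depends on what argument evaluation order the 
language guarantees.

import std.stdio;

int next(ref int counter) { return ++counter; }

void main()
{
    int c = 0;
    writefln("%s %s", next(c), next(c));
}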

2. Semantic analysis changes (i.e. compiles/doesn't compile) are also 
difficult and require attention, but they can at least be verified 
automatically to a good extent (by means of test suites and runnable 
examples). In TDPL I have two categories of examples - visible and 
invisible. The invisible ones do not appear in the printed text but are 
present in the book source and are used to check whether the claims 
made by the book are true. It would be really cool if we had something 
like that for the online documentation. We should be able to freely 
intersperse documentation text with invisible unittests that ensure the 
documentation is correct, as sketched below.
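
A minimal sketch of the idea, using D's documented unittests (which 
already do this for API examples; the point would be to extend the same 
mechanism to the prose on dlang.org):

/// Returns the sum of its two arguments.
int add(int a, int b) { return a + b; }

///
unittest
{
    // Appears as an example in the generated documentation and also
    // runs under -unittest, so the claim above is checked.
    assert(add(2, 3) == 5);
}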

3. Grammar changes are the simplest ones and in a way the most 
embarrassing if they happen. The best solution I see to that is deriving 
the documentation and the actual parser from the same source. This is 
part of why I'm so keen on parser generators.
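
To illustrate the single-source idea, here is a rough sketch using 
Pegged, one of the existing parser generators for D (the grammar below 
is a toy arithmetic example, not D's grammar; the same grammar text 
that generates the parser could also be rendered into the spec pages):

import pegged.grammar;

mixin(grammar(`
Arith:
    Expr    < Term (('+' / '-') Term)*
    Term    < Primary (('*' / '/') Primary)*
    Primary < Number / '(' Expr ')'
    Number  <~ [0-9]+
`));

void main()
{
    // The mixin generated a parser named after the grammar.
    auto tree = Arith("1 + 2 * 3");
    assert(tree.successful);
}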



Andrei

P.S. I haven't forgotten about the lexer - it's still on the back burner 
but I will publish it as soon as I get a chance.

