Let's stop parser Hell

Tue Jul 31 22:26:07 PDT 2012

On Wed, Aug 1, 2012 at 12:58 AM, Jonathan M Davis <jmdavisProg at gmx.com> wrote:
> On Wednesday, August 01, 2012 00:54:56 Timon Gehr wrote:
>> Ddoc is typically not required. By default it should be treated as
>> whitespace. If it is required, one token seems reasonable: The
>> post-processing of the doc comment is best done as a separate step.
>
> That was how I was intending to deal with ddoc. It's just a nested block
> comment token. The whole comment string is there, so the ddoc processor can
> use that to do whatever it does. ddoc isn't part of lexing really. It's a
> separate thing.

OK. Same for standard comments and doc comments?
I was wondering how to get at the code that may sit inside a ---- / ----
block (I have never dealt with documentation generation or syntax
highlighting), but your solution makes it easy:

Token(TokenType.DocComment, "/** ... */"), Token(TokenType.Module,
"module"), ...

A small specialised parser can then extract text, Ddoc macros and
code blocks from inside the comment. Finding and stripping '----' is
easy, and then the lexer can be locally reinvoked on the slice
containing the example code.
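
To make the last step concrete, here is a minimal sketch of the extraction.
It assumes that a delimiter is a line holding nothing but whitespace and
three or more hyphens, and that the doc comment token carries the whole
comment as a string. The names isCodeDelimiter and extractCodeBlocks are
made up for illustration; they are not part of any existing lexer.

```d
import std.algorithm.searching : all;
import std.string : lineSplitter, strip;
import std.typecons : Yes;

/// True if `line` is a code delimiter: only hyphens (at least three),
/// ignoring surrounding whitespace. This is an assumed rule for the sketch.
bool isCodeDelimiter(const(char)[] line)
{
    auto s = line.strip;
    return s.length >= 3 && s.all!(c => c == '-');
}

/// Returns the slices of `comment` that lie between pairs of delimiter
/// lines. The slices alias the original string, so nothing is copied.
string[] extractCodeBlocks(string comment)
{
    string[] blocks;
    bool inCode = false;
    size_t start = 0; // offset where the current code block begins
    size_t pos = 0;   // offset of the current line within `comment`

    // Keep the line terminators so that offsets add up exactly.
    foreach (line; comment.lineSplitter!(Yes.keepTerminator))
    {
        if (isCodeDelimiter(line))
        {
            if (inCode)
                blocks ~= comment[start .. pos];  // closing delimiter
            else
                start = pos + line.length;        // code starts on next line
            inCode = !inCode;
        }
        pos += line.length;
    }
    return blocks; // an unterminated block is silently dropped
}
```

Each returned slice could then be handed back to the lexer as is. Since the
slices point into the original comment string, token positions inside the
example code can still be mapped back to the source file.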