Let's stop parser Hell
Jonathan M Davis
jmdavisProg at gmx.com
Tue Jul 31 22:39:45 PDT 2012
On Wednesday, August 01, 2012 07:26:07 Philippe Sigaud wrote:
> On Wed, Aug 1, 2012 at 12:58 AM, Jonathan M Davis <jmdavisProg at gmx.com> wrote:
> > On Wednesday, August 01, 2012 00:54:56 Timon Gehr wrote:
> >> Ddoc is typically not required. By default it should be treated as
> >> whitespace. If it is required, one token seems reasonable: The
> >> post-processing of the doc comment is best done as a separate step.
> >
> > That was how I was intending to deal with ddoc. It's just a nested block
> > comment token. The whole comment string is there, so the ddoc processor
> > can use that to do whatever it does. ddoc isn't part of lexing really. It's a
> > separate thing.
>
> OK. Same for standard comment and doc comments?
From the TokenType enum declaration:
blockComment, /// $(D /* */)
lineComment, /// $(D // )
nestingBlockComment, /// $(D /+ +/)
There are then functions which operate on Tokens to give you information about
them. Among them is isDdocComment, which will return true if the Token type is
a comment, and that comment is a ddoc comment (i.e. starts with /**, ///, or
/++ rather than /*, //, or /+). So, anything that wants to process ddoc
comments can lex them out and process them, and if it wants to know which
symbols a ddoc comment applies to, it can look at the tokens that
follow (though a full-on parser would be required to do that correctly).
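A minimal sketch of what such a check could look like, assuming a Token that carries its type and its full source text. The Token struct, its str field, and the identifier member are hypothetical stand-ins for illustration; only the three comment members come from the enum quoted above. The guard against /**/ and /++/ reflects the language rule that those two are ordinary empty comments, not ddoc.

```d
import std.algorithm.searching : startsWith;

// Hypothetical stand-ins; only the three comment members are from the post.
enum TokenType { blockComment, lineComment, nestingBlockComment, identifier }

struct Token
{
    TokenType type;
    string str; // full source text of the token (assumed field name)
}

/// Returns true if tok is a comment token whose text opens like a ddoc comment.
bool isDdocComment(Token tok)
{
    switch (tok.type)
    {
        case TokenType.blockComment:
            // "/**/" is an empty ordinary comment, so require more than 4 chars.
            return tok.str.length > 4 && tok.str.startsWith("/**");
        case TokenType.nestingBlockComment:
            // Likewise, "/++/" is not a ddoc comment.
            return tok.str.length > 4 && tok.str.startsWith("/++");
        case TokenType.lineComment:
            return tok.str.startsWith("///");
        default:
            return false; // not a comment at all
    }
}

unittest
{
    assert(isDdocComment(Token(TokenType.blockComment, "/** doc */")));
    assert(!isDdocComment(Token(TokenType.blockComment, "/* plain */")));
    assert(!isDdocComment(Token(TokenType.identifier, "foo")));
}
```

Because the token holds the whole comment string, a check like this needs nothing from the lexer beyond the token itself.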
> I was wondering how to get the code possibly inside a ---- / ----
> block (I never dealt with something like documentation or syntax
> highlighting), but your solution makes it easy:
>
> Token(TokenType.DocComment, "/** ... */"), Token(TokenType.Module,
> "module"), ...
>
> A small specialised parser can then extract text, DDocs macros and
> code blocks from inside the comment. Finding and stripping '----' is
> easy and then the lexer can be locally reinvoked on the slice
> containing the example code.
Yes. The lexer isn't concerned with where the text comes from, and it isn't
concerned with lexing comments beyond putting them in a token. But that should
be powerful enough to lex the examples if you've already extracted them.
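As a rough sketch of the extraction step described in the quoted text, the function below pulls example code out of a comment string as slices of the original text. The names extractCodeBlocks and isDelimiter are made up for illustration. It assumes the delimiter lines consist only of three or more hyphens plus whitespace, so comments that decorate every line with a leading * or + would need extra handling that is not shown here.

```d
import std.algorithm.searching : all;
import std.string : lineSplitter, strip;
import std.typecons : Yes;

/// True if the line holds nothing but three or more hyphens (plus whitespace).
bool isDelimiter(const(char)[] line)
{
    auto s = line.strip();
    return s.length >= 3 && s.all!(c => c == '-');
}

/// Returns the text between each pair of delimiter lines as slices of comment.
string[] extractCodeBlocks(string comment)
{
    string[] blocks;
    size_t pos = 0;   // offset of the current line within comment
    size_t start = 0; // offset where the currently open code block begins
    bool inCode = false;

    foreach (line; comment.lineSplitter!(Yes.keepTerminator))
    {
        if (isDelimiter(line))
        {
            if (inCode)
                blocks ~= comment[start .. pos]; // a slice, no copy
            else
                start = pos + line.length;       // code starts after this line
            inCode = !inCode;
        }
        pos += line.length;
    }
    return blocks; // an unterminated block at the end is dropped
}

unittest
{
    string doc = "/**\nExample:\n----\nint x = 1;\n----\n*/";
    auto blocks = extractCodeBlocks(doc);
    assert(blocks.length == 1);
    assert(blocks[0] == "int x = 1;\n");
}
```

Each returned slice could then be handed to the lexer like any other source text, since the lexer does not care where its input comes from.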
- Jonathan M Davis