Writing a JFlex lexer for D - have an issue with cycles

FatalCatharsis via Digitalmars-d digitalmars-d at puremagic.com
Sun Jan 22 14:11:08 PST 2017


I'm writing a JFlex lexer for D and I've hit a roadblock. It is
almost working, except for one specific production.

StringLiteral is cyclic and I don't know how to approach it. It 
is cyclic because:

      Token -> StringLiteral -> TokenString -> Token

To break the cycle, I was thinking I could define a variant of
Token in which StringLiteral is replaced by a production that does
not contain TokenString, but that fundamentally changes the
language. Should the lexer really handle something like:

     q{blah1q{20q{"meh"q{20.1q{blah}}}}}

I don't see how this makes sense lexically. To be clear, I'm
wondering if this is acceptable:

      Token:
         Identifier
         StringLiteral
         CharacterLiteral
         IntegerLiteral
         FloatLiteral
         Keyword
         Operator

      StringLiteral:
         WysiwygString
         AlternateWysiwygString
         DoubleQuotedString
         HexString
         DelimitedString
         TokenString

      TokenString:
         q{ TokenNonNestedTokenStrings }


      TokenNonNestedTokenStrings:
         TokenNonNestedTokenString
         TokenNonNestedTokenString TokenNonNestedTokenStrings

      TokenNonNestedTokenString:
         Identifier
         StringLiteralNonNestedTokenString
         CharacterLiteral
         IntegerLiteral
         FloatLiteral
         Keyword
         Operator

      StringLiteralNonNestedTokenString:
         WysiwygString
         AlternateWysiwygString
         DoubleQuotedString
         HexString
         DelimitedString

This basically disables nested token strings. Has anyone else
run into this issue?
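
For comparison, the alternative to restricting the grammar would be to
leave the recursion out of the grammar entirely and have the lexer track
brace depth while it is inside q{ ... }. Below is a minimal sketch of
that idea in plain Java rather than as a JFlex specification. The class
and method names are hypothetical, and it assumes that only double-quoted
strings need to be skipped when counting braces; a real D lexer would
also have to skip the other string forms, character literals, and
comments. In JFlex the same counter could presumably live in user code
alongside a dedicated lexical state, but that mapping is an assumption
and is not tested here.

```java
// Hypothetical sketch: these names do not come from JFlex or the D spec.
final class TokenStringScanner {

    /**
     * Returns the index just past the '}' that closes the token string
     * beginning at start, or -1 if start does not begin a token string
     * or the braces never balance.
     *
     * Simplifying assumption: only double-quoted strings are skipped, so
     * braces inside other literal forms or comments would be miscounted.
     */
    static int scanTokenString(String src, int start) {
        if (!src.startsWith("q{", start)) {
            return -1;
        }
        int depth = 1;          // the opening q{ counts as depth 1
        int i = start + 2;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (c == '"') {
                // Jump over the string so a '}' inside it is not counted.
                i = skipDoubleQuoted(src, i);
                if (i < 0) {
                    return -1;  // unterminated string
                }
                continue;
            }
            if (c == '{') {
                depth++;        // covers both a plain { and a nested q{
            } else if (c == '}') {
                depth--;
                if (depth == 0) {
                    return i + 1;
                }
            }
            i++;
        }
        return -1;              // ran out of input before the braces balanced
    }

    /** Returns the index just past the closing quote, or -1 if unterminated. */
    static int skipDoubleQuoted(String src, int start) {
        int i = start + 1;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (c == '\\') {
                i += 2;         // skip the escaped character
                continue;
            }
            if (c == '"') {
                return i + 1;
            }
            i++;
        }
        return -1;
    }

    public static void main(String[] args) {
        String nested = "q{blah1q{20q{\"meh\"q{20.1q{blah}}}}}";
        int end = scanTokenString(nested, 0);
        System.out.println("consumed " + end + " of " + nested.length() + " chars");
    }
}
```

With this approach the nested example above would be consumed as a single
StringLiteral token. The sketch only balances braces; it does not check
that the contents are themselves valid tokens.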


