Let's stop parser Hell
David Piepgrass
qwertie256 at gmail.com
Sat Jul 7 15:24:59 PDT 2012
On Saturday, 7 July 2012 at 22:07:02 UTC, Roman D. Boiko wrote:
> On Saturday, 7 July 2012 at 21:52:09 UTC, David Piepgrass wrote:
>> it seems easier to tell what the programmer "meant" with three
>> phases, in the face of errors. I mean, phase 2 can tell when
>> braces and parentheses are not matched up properly and then it
>> can make reasonable guesses about where those missing
>> braces/parentheses were meant to be, based on things like
>> indentation. That would be especially helpful when the parser
>> is used in an IDE, since if the IDE guesses the intention
>> correctly, it can still understand broken code and provide
>> code completion for it. And since phase 2 is a standard tool,
>> anybody's parser can use it.
>
> There could be multiple errors that compensate each other and
> make your phase 2 succeed and prevent phase 3 from doing proper
> error handling. Even knowing that there is an error, in many
> cases you would not be able to create a meaningful error
> message. And any error would make your phase-2 tree incorrect,
> so it would be difficult to recover from it by inserting an
> additional token or ignoring tokens until the parser is able to
> continue its work properly. All this would suffer for the same
> reason: you lose information.
This is all true, but forgetting a brace commonly results in a
barrage of error messages anyway. Code that guesses what you
meant obviously won't work all the time, and phase 3 would have
to take care not to emit an error message about a "{" token that
doesn't actually exist (that was merely "guessed-in"). But at
least it's nice for a parser to be /able/ to guess what you
meant; for a typical parser it would be out of the question, upon
detecting an error, to back up four source lines, insert a brace
and try again.
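
As a rough illustration of the kind of guessing described above, here is a minimal Python sketch of an indentation-based pass that proposes where a missing "}" probably belongs. Everything in it is an assumption made for illustration: the names `GuessedBrace` and `guess_missing_closers` are hypothetical, indentation is measured as the count of leading whitespace characters, and braces inside strings or comments are not handled. It is not the design of any existing D parser.

```python
from dataclasses import dataclass


@dataclass
class GuessedBrace:
    """A synthetic "}" that does not exist in the source text."""
    before_line: int  # 1-based line the guessed "}" would be inserted before
    opened_at: int    # 1-based line of the "{" it is assumed to close


def guess_missing_closers(source: str) -> list[GuessedBrace]:
    # Each entry is (indent of the line holding "{", its line number).
    open_braces: list[tuple[int, int]] = []
    guesses: list[GuessedBrace] = []
    lines = source.splitlines()

    for line_no, line in enumerate(lines, start=1):
        text = line.strip()
        if not text:
            continue  # blank lines carry no indentation information
        indent = len(line) - len(line.lstrip())
        starts_with_closer = text.startswith("}")

        # If this line is indented no deeper than the line that opened the
        # innermost block, that block should already have been closed.
        # A line that itself starts with "}" may sit at the opener's indent.
        while open_braces:
            top_indent, opened_at = open_braces[-1]
            if starts_with_closer:
                dedented = top_indent > indent
            else:
                dedented = top_indent >= indent
            if not dedented:
                break
            open_braces.pop()
            guesses.append(GuessedBrace(line_no, opened_at))

        # Ordinary brace matching for the braces actually present.
        for ch in text:
            if ch == "{":
                open_braces.append((indent, line_no))
            elif ch == "}" and open_braces:
                open_braces.pop()

    # Anything still open at end of input gets a guessed closer there.
    for _, opened_at in reversed(open_braces):
        guesses.append(GuessedBrace(len(lines) + 1, opened_at))
    return guesses
```

Because the guessed braces are returned as separate records rather than mixed in with real tokens, a later phase can tell them apart and avoid reporting errors against a brace that was never in the source.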