Let's stop parser Hell

Tue Jul 10 13:40:51 PDT 2012

On 11-Jul-12 00:25, Jonathan M Davis wrote:
> On Tuesday, July 10, 2012 21:25:52 Timon Gehr wrote:
>> On 07/10/2012 09:14 PM, Philippe Sigaud wrote:
>>> On Tue, Jul 10, 2012 at 12:41 PM, Roman D. Boiko <rb at d-coding.com> wrote:
>>>> One disadvantage of Packrat parsers I mentioned was problematic error
>>>> recovery (according to the article on the ANTLR website). After some
>>>> additional research, I found that it is not a critical problem. To find
>>>> the exact place of the error (from the parser's perspective, not the
>>>> user's) one only needs to remember the farthest successfully parsed
>>>> position (among several backtracking attempts) and the reason that
>>>> parsing failed there.
>>>
>>> IIRC, that's what I encoded in Pegged's (admittedly limited) error
>>> reporting: remember the farthest error.
>>>
>>>> It is also possible to rerun parsing with some additional heuristics
>>>> after failing, thus enabling advanced error repair scenarios.
>>>
>>> Do people really want error-repairing parsers? I want my parsers to
>>> tell me something is bad and, optionally, to suggest a possible
>>> repair, but definitely *not* to automatically repair an inferred error
>>> and continue happily.
>>
>> FWIW, this is what most HTML parsers are doing.
>
> Which is horrible. You pretty much have to with HTML because of the horrid
> decision that it should be parsed so laxly by browsers, but pretty much
> nothing else should do that. Either it's correct or it's not. Having the
> compiler "fix" your code would cause far more problems than it would ever fix.
>

BTW clang does this, and even more of it at the semantic level. It's known
to have won legions of users because of that (well, not only that, but good
diagnostics in general).
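
For reference, the "remember the farthest failure" idea quoted above can be
sketched in a few lines. This is only an illustration under stated
assumptions: the toy grammar, the Parser class and the parse() helper are
all hypothetical names made up here, it is not how Pegged or any Packrat
implementation actually does it, and memoization is left out entirely.

```python
class Parser:
    """Tiny backtracking recognizer (hypothetical, for illustration only).

    Toy grammar, PEG style:
        expr <- atom ('+' atom)*
        atom <- number / '(' expr ')'
    """

    def __init__(self, text):
        self.text = text
        self.farthest = -1   # farthest position at which any match failed
        self.expected = []   # what would have been accepted at that position

    def fail(self, pos, what):
        # Keep only the failures recorded at the farthest position so far.
        if pos > self.farthest:
            self.farthest, self.expected = pos, [what]
        elif pos == self.farthest and what not in self.expected:
            self.expected.append(what)
        return None

    def lit(self, pos, s):
        # Match a literal; return the new position, or None on failure.
        if self.text.startswith(s, pos):
            return pos + len(s)
        return self.fail(pos, repr(s))

    def number(self, pos):
        end = pos
        while end < len(self.text) and self.text[end].isdigit():
            end += 1
        return end if end > pos else self.fail(pos, "digit")

    def atom(self, pos):
        end = self.number(pos)
        if end is not None:
            return end
        # First alternative failed: backtrack and try '(' expr ')'.
        end = self.lit(pos, "(")
        if end is None:
            return None
        end = self.expr(end)
        if end is None:
            return None
        return self.lit(end, ")")

    def expr(self, pos):
        end = self.atom(pos)
        if end is None:
            return None
        while True:
            nxt = self.lit(end, "+")
            if nxt is not None:
                nxt = self.atom(nxt)
            if nxt is None:
                return end   # repetition stops; backtrack to last good end
            end = nxt


def parse(text):
    """Return None on success, else (farthest_position, expected_list)."""
    p = Parser(text)
    end = p.expr(0)
    if end == len(text):
        return None
    if end is not None:
        p.fail(end, "end of input")
    return (p.farthest, p.expected)
```

With this sketch, parse("1+(2+)") blames position 5 (the closing paren that
follows the dangling '+') even though the parser itself backtracks all the
way to position 1 before giving up. Rerunning with repair heuristics, as
suggested in the quoted text, would start from that recorded position.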

-- 
Dmitry Olshansky