Looking for champion - std.lang.d.lex
Nick Sabalausky
a at a.a
Sat Oct 23 15:17:02 PDT 2010
"Andrei Alexandrescu" <SeeWebsiteForEmail at erdani.org> wrote in message
news:i9vlep$8ao$1 at digitalmars.com...
> On 10/23/10 16:39 CDT, Nick Sabalausky wrote:
>> "Andrei Alexandrescu"<SeeWebsiteForEmail at erdani.org> wrote in message
>> news:i9v8vq$2gvh$1 at digitalmars.com...
>> What's wrong with regexes? That's pretty typical for lexers.
>
> I mentioned that using regexes is possible but would make it much more
> difficult to generate good quality lexers.
I see. Maybe a lexer 2.0 thing.
>
> Besides, regexen are IMHO quite awkward at expressing certain things that
> can be easily parsed by hand, such as comments
//[^\n]*\n
/\*(.|\*[^/])*\*/
Pretty simple as far as regexes go, and I'm far from a regex expert. Plus
there's nothing stopping the use of a vastly improved regex syntax like the
one GOLD uses
( http://www.devincook.com/goldparser/doc/grammars/define-terminals.htm ).
In that, the two regexes above would look like:
{LineCommentChar} = {Printable} - {LF}
LineComment = '//' {LineCommentChar}* {LF}
{BlockCommentChar} = {Printable} - [*]
{BlockCommentCharNoSlash} = {BlockCommentChar} - [/]
BlockComment = '/*' ({BlockCommentChar} | '*' {BlockCommentCharNoSlash})* '*/'
And further syntactical improvement is easy to imagine, such as in-line
character set creation.
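For anyone who wants to poke at the two comment regexes, here is a rough Python sketch (Python only because its re module is handy for testing, nothing D-specific about it). The line-comment pattern is the one from above. For the block comment I've assumed a stricter, commonly used variant rather than the one-liner above: '.' also matches '*', so on a line like "/* a */ x /* b */" the quick version runs through to the last "*/". The names COMMENT and find_comments are made up for the example.

```python
import re

# Line comment: "//" up to and including the newline (pattern from the post).
# Block comment: stricter variant (an assumption, not the pattern from the post);
# it never lets the body swallow a closing "*/" and copes with runs of "*".
COMMENT = re.compile(
    r"//[^\n]*\n"                        # line comment
    r"|/\*[^*]*\*+(?:[^/*][^*]*\*+)*/"   # block comment
)


def find_comments(source: str) -> list[str]:
    """Return every comment in source, in order of appearance.

    Sketch only: string literals are ignored, so "//" inside a string
    would be reported as a comment.
    """
    return [match.group(0) for match in COMMENT.finditer(source)]


if __name__ == "__main__":
    sample = "int a; // first\nint b; /* second **/ int c; /* third */\n"
    for comment in find_comments(sample):
        print(repr(comment))
```

A real lexer would fold these into one big alternation with the other token rules, but the comment rules themselves stay about this small.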
> or recursive comments.
>
Granted, although I think there is precedent for regex engines that can
handle matched nested pairs just fine.
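For comparison, the by-hand version of recursive comments is basically a depth counter. A minimal sketch in Python, assuming D-style /+ ... +/ nesting comments; skip_nested_comment is a hypothetical helper name, not anything from an existing lexer:

```python
def skip_nested_comment(source: str, start: int) -> int:
    """Return the index just past the nesting comment that opens at start.

    Assumes D-style "/+ ... +/" delimiters. Raises ValueError if there is
    no comment at start or if the comment is never closed.
    """
    if not source.startswith("/+", start):
        raise ValueError("no nesting comment at this position")
    depth = 0
    i = start
    while i < len(source):
        if source.startswith("/+", i):
            depth += 1          # one level deeper
            i += 2
        elif source.startswith("+/", i):
            depth -= 1          # one level closed
            i += 2
            if depth == 0:
                return i        # outermost comment is complete
        else:
            i += 1
    raise ValueError("unterminated nesting comment")


if __name__ == "__main__":
    text = "/+ a /+ b +/ c +/ rest"
    end = skip_nested_comment(text, 0)
    print(repr(text[end:]))
```

A regex engine with recursion or balancing support has to track the same depth internally; the question is only whether the lexer generator exposes it.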