Parsing D Maybe Not Such a Good Idea <_<;

Basile B. via Digitalmars-d digitalmars-d at puremagic.com
Tue Jun 14 21:59:59 PDT 2016


On Wednesday, 15 June 2016 at 03:59:43 UTC, cy wrote:
> So I was toying with the idea of writing a D parser, and this 
> happened.
>
> const(/*
> 				D is kind of hard to parse. /*
> 			/**/
> 			int//) foo(T//
> 			) foo(T//
> 						)(T /* I mean,
> 										 seriously
> 										 */ bar)
> 	if ( is (T == // how on earth do they do it?
> 		 int) ) {
> 		return
> 			cast /+  where does the function name /+ even start? +/
> 						+/
> 			( const (int) )
> 			bar//;}
> 			;}
> 			
> void main() {
> 	import std.stdio;
> 	writeln(foo(42));
> }
>
> I don't think I'm going to write a D parser.

After lexing, you can remove all the tokComment tokens and everything 
becomes simple.
I had the same issue in Coedit because it has an editor command 
that turns every

     version(none)

into

     version(all)

But

     version /*bla*/(/*bla*/none/*bla*/)

is valid. A version is easy to parse, but only after removing all 
the comments ;)
Otherwise you have an overly complex stack of tokens to analyze and 
some crazy branches, e.g.

     if (tok[1].kind == tkVersion && tok[2].kind != tkComment && ...)

That's not sane. If you remove the comments, then the sequence of 
lexical tokens to detect is always the same.
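
A minimal sketch of that idea, in D. The Token struct, the TokKind enum and 
both helper functions are hypothetical names made up for illustration; they 
are not Coedit's code or the API of any real D lexer. It assumes the lexer 
has already produced an array of tokens, filters out the comments, and then 
checks for the fixed four-token sequence version ( none ).

```d
import std.algorithm.iteration : filter;
import std.array : array;

// Hypothetical token model for illustration only;
// not the API of Coedit or of any real D lexer.
enum TokKind { tkVersion, tkLeftParen, tkRightParen, tkIdentifier, tkComment }

struct Token
{
    TokKind kind;
    string text;
}

// Drop every comment token so that only significant tokens remain.
Token[] stripComments(Token[] toks)
{
    return toks.filter!(t => t.kind != TokKind.tkComment).array;
}

// True if the four tokens starting at index i spell `version ( none )`.
// Assumes `toks` has already been passed through stripComments.
bool isVersionNone(Token[] toks, size_t i)
{
    return i + 3 < toks.length
        && toks[i].kind == TokKind.tkVersion
        && toks[i + 1].kind == TokKind.tkLeftParen
        && toks[i + 2].kind == TokKind.tkIdentifier
        && toks[i + 2].text == "none"
        && toks[i + 3].kind == TokKind.tkRightParen;
}

unittest
{
    // Token stream for: version /*bla*/(/*bla*/none/*bla*/)
    auto toks = [
        Token(TokKind.tkVersion, "version"),
        Token(TokKind.tkComment, "/*bla*/"),
        Token(TokKind.tkLeftParen, "("),
        Token(TokKind.tkComment, "/*bla*/"),
        Token(TokKind.tkIdentifier, "none"),
        Token(TokKind.tkComment, "/*bla*/"),
        Token(TokKind.tkRightParen, ")"),
    ];

    auto clean = stripComments(toks);
    assert(clean.length == 4);
    assert(isVersionNone(clean, 0));

    // On the raw stream the same check fails: a comment sits at index 1.
    assert(!isVersionNone(toks, 0));
}
```

The check itself never mentions tkComment, so it has no extra branches. A 
real editor command would probably also need each token's source position in 
order to rewrite none in place; that part is left out of this sketch.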

