D port of dmd: Lexer, Parser, AND CodeGenerator fully operational

Wed Mar 7 23:21:17 PST 2012

On Thursday, 8 March 2012 at 04:56:07 UTC, Jonathan M Davis wrote:
> If you took it from ddmd, then it's definitely going to have to 
> be GPL.
>
> Now, there is interest in having a D parser and lexer in 
> Phobos. I don't know
> if your version will fit the bill (e.g. it must have a 
> range-based API), but we
> need one at some point. The original idea was to more or less 
> directly port
> dmd's lexer and parser with some adjustments to the API as 
> necessary
> (primarily to make it range-based). But no one has had the time 
> to complete
> such a project yet (I originally volunteered to do it, but I 
> just haven't had
> the time).
>
> When that project was proposed, Walter agreed to let that port 
> be Boost rather
> than GPL (since he holds the copyright and the port would be 
> going in Phobos,
> which uses Boost).
>
> The problem with what you have (even if the API and 
> implementation were
> perfect) is that it comes from ddmd, which had other 
> contributors working on
> it. So, you would have to get permission from not only Walter 
> but all of the
> relevant ddmd contributors. If you were able to do _that_, and it 
> could get
> past the review process, then what you've done could be put 
> into Phobos. But
> that requires that you take the time and effort to take care of 
> getting the
> appropriate permissions, making sure that the API and 
> implementation are
> acceptable for Phobos, and putting it through the Phobos review 
> process. It
> would be great if you could do that though.
>
> - Jonathan M Davis

This is great news. I was really worried that the license was 
etched in stone. I'll need help finding out who owns the code, 
plus legal advice if the process is more than just getting a 
simple confirmation email from each of the original authors.

I have some comments on the lexer and pointers that I think are 
worth sharing. None of the code besides the lexer uses pointers, 
so I think you'll be pleased with that. Now, I don't know 
everything about ranges, but if you simply mean dynamic arrays, 
then yes, everything except the lexer uses arrays where 
necessary. A lot of the code simply doesn't need them, though, 
because most of the objects are really just lists of members, 
many of which are not arrays.

About the lexer, one thing I realized about the Wild-West pointer 
style as I was porting it is that it must be blazing fast. To my 
understanding, calling p.popFront() on an array takes two 
operations, ++p followed by --p.length, plus possibly array 
bounds checking; I'm not sure about that last part.
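
To make the comparison concrete, here is a rough sketch of the two 
styles side by side. It is not code from the lexer: the function 
names are made up for illustration, and I use a ubyte slice on 
purpose, because popFront on a char array decodes a whole UTF-8 
code point rather than stepping one byte. It makes no claim about 
actual speed; it only shows which bookkeeping each style does.

```d
import std.range.primitives : empty, front, popFront;

// Pointer style: the only per-byte bookkeeping is ++p.
// (Hypothetical helper, not taken from the lexer.)
size_t countSpacesPtr(const(ubyte)[] buf)
{
    const(ubyte)* p = buf.ptr;
    const(ubyte)* endBuf = p + buf.length; // one past the last byte
    size_t n = 0;
    while (p < endBuf)
    {
        if (*p == ' ')
            ++n;
        ++p;
    }
    return n;
}

// Range style: popFront re-slices the array (buf = buf[1 .. $]),
// which moves the pointer forward and shrinks the length.
size_t countSpacesRange(const(ubyte)[] buf)
{
    size_t n = 0;
    while (!buf.empty)
    {
        if (buf.front == ' ')
            ++n;
        buf.popFront();
    }
    return n;
}
```

Both functions should return the same count for the same input, 
for example when called with cast(const(ubyte)[]) "a b  c".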

++p is all that the current lexer needs. It used to only check 
for EOF at each junction, but since I'm parsing little chunks of 
code instead of whole files now, it checks "if ( p >= endBuf )" 
at the beginning of each token scan, which gets pretty close to 
not going out of bounds, since most tokens aren't very long. That 
lexer is a tribute to a very fast, old-school style of programming 
that will be lost if it changes. Still, I can sense a tidal wave 
of RANGES coming, and I fear I'll just have to bid the little 
thing goodbye! :-(
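
For anyone curious what the per-token check looks like, here is a 
minimal sketch of the idea, not the real lexer code. The names 
scanIdentifier and isIdentChar are hypothetical, and the sketch 
assumes the chunk ends with a 0 sentinel byte so that the inner 
loop stops by itself; without such a terminator, a token running 
up to the end of the chunk could walk past endBuf.

```d
// Hypothetical helper: true for characters allowed in an identifier.
bool isIdentChar(char c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
        || (c >= '0' && c <= '9') || c == '_';
}

// Scans one identifier starting at p and advances p past it.
// Assumption: the buffer holds a 0 sentinel before endBuf.
const(char)[] scanIdentifier(ref const(char)* p, const(char)* endBuf)
{
    if (p >= endBuf)        // the only bounds check, once per token
        return null;
    const(char)* start = p;
    while (isIdentChar(*p)) // no bounds check here, just ++p
        ++p;
    return start[0 .. p - start];
}
```

A caller would set p to the start of the chunk and endBuf to one 
past its last byte, then call scanIdentifier repeatedly, skipping 
whitespace between calls.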