D port of dmd: Lexer, Parser, AND CodeGenerator fully operational

Zach the Mystic reachMINUSTHISzachgmail at dot.com
Thu Mar 8 08:22:26 PST 2012


On Thursday, 8 March 2012 at 07:49:57 UTC, Jonathan M Davis wrote:
> The lexer is going to need to take a range of dchar (which may 
> or may not be an array),
> And while the lexer would need to operate on generic ranges of 
> dchar, it would probably have to be special-cased for strings 
> in a number of places

I know what you mean. I actually cut out ddmd's conversion stuff 
because, having glanced over Phobos, I saw plenty of functions 
designed for this! I must have intuited what you are saying. dmd 
converts everything to char* before sending the buffer to the 
lexer. I doubt there's a reason to change this procedure, only to 
put that conversion code directly into module dmd.lexer instead.
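
To make the "generic range of dchar, special-cased for strings" idea concrete, here is a minimal sketch in D. It is only an illustration under stated assumptions: skipWhite is a hypothetical helper, not something taken from dmd, ddmd, or Phobos, and a real lexer would do far more than drop leading whitespace.

```d
import std.range;                       // isInputRange, ElementType, empty, front, popFront
import std.traits : Unqual, isNarrowString;
import std.uni : isWhite;

// Hypothetical helper (not from dmd or ddmd): drops leading whitespace
// from any input range whose elements are dchar.
void skipWhite(R)(ref R input)
    if (isInputRange!R && is(Unqual!(ElementType!R) == dchar))
{
    static if (isNarrowString!R)
    {
        // String special case: ASCII whitespace is always a single code
        // unit, so it can be skipped by index without decoding.
        size_t i = 0;
        while (i < input.length &&
               (input[i] == ' ' || input[i] == '\t' ||
                input[i] == '\n' || input[i] == '\r'))
            ++i;
        input = input[i .. $];
    }

    // Generic path: works for any range of dchar and also catches
    // non-ASCII whitespace left over after the fast path.
    while (!input.empty && isWhite(input.front))
        input.popFront();
}

unittest
{
    string narrow = " \t\n lexer";
    skipWhite(narrow);
    assert(narrow == "lexer");

    dstring wide = "  parser"d;
    skipWhite(wide);
    assert(wide == "parser"d);
}
```

The template constraint is what lets the same function accept a string, a dstring, or some other range of dchar, while the static if keeps the fast path for narrow strings.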

> The parser would then take a range of tokens and then output 
> the AST in some form or other - it probably couldn't be a 
> range, but I'm not sure.

Dmd's AST is pretty idiosyncratic.
Example: class FuncDeclaration (function declaration) has a 
bunch of named members:
{
    Identifier ident;       // the function's name
    Parameter[] parameters; // its parameters
    Statement frequire;     // the in{} contract, if present
    Statement fbody;        // function body
    // etc.
}

Each member has its own name. I actually was working on how to turn 
the AST into a more iterable format, since if you want to edit it 
directly you're going to need to move a cursor down or up to the 
element you want. It's doable, but it's not a natural range-ish 
format. That's where I'm confused about the licensing issues, 
since I'm not sure if the particular object structure which gets 
parsed is also going to be in Phobos or if it must remain GPL, 
which I'm not sure I want to continue using.
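
One way the named members could be exposed in a more iterable form is sketched below. This is a hedged illustration, not dmd's actual design: the Node base class, the children() method, and countNodes are all hypothetical, and the classes are heavily simplified stand-ins for the real ones.

```d
// Hypothetical, simplified stand-ins for dmd's AST classes.
class Node
{
    // Child nodes in source order; leaf nodes have none.
    Node[] children() { return null; }
}

class Identifier : Node
{
    string name;
    this(string name) { this.name = name; }
}

class Parameter : Node {}
class Statement : Node {}

class FuncDeclaration : Node
{
    Identifier ident;        // the function's name
    Parameter[] parameters;  // its parameters
    Statement frequire;      // the in{} contract, if present
    Statement fbody;         // function body

    // Flattens the named members into one list, skipping absent ones.
    override Node[] children()
    {
        Node[] result;
        if (ident !is null) result ~= ident;
        foreach (p; parameters) result ~= p;
        if (frequire !is null) result ~= frequire;
        if (fbody !is null) result ~= fbody;
        return result;
    }
}

// Depth-first walk that counts every node under (and including) root.
size_t countNodes(Node root)
{
    if (root is null) return 0;
    size_t total = 1;
    foreach (child; root.children())
        total += countNodes(child);
    return total;
}

unittest
{
    auto f = new FuncDeclaration;
    f.ident = new Identifier("foo");
    f.parameters = [new Parameter, new Parameter];
    f.fbody = new Statement;

    assert(f.children().length == 4); // ident, two parameters, body
    assert(countNodes(f) == 5);       // the four children plus f itself
}
```

The named members stay as they are, so existing code keeps working; children() only adds a uniform view for generic traversal. It builds a new array on each call, which keeps the sketch short at the cost of allocations.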


> So, if you're not familiar with ranges, you probably have a fair bit
> of learning ahead of you, and you're probably going to have to make a
> number of changes to your lexer and parser (though the majority of it
> will probably be able to stay intact). Unfortunately, a proper article
> and tutorial on them is currently lacking in spite of the fact that
> Phobos uses them heavily. Fortunately however, in a book that Ali
> Çehreli is writing on D, he has a chapter on ranges that should help
> get you started:
>
> http://ddili.org/ders/d.en/ranges.html
>
> But I'd suggest that you play around with ranges a fair bit
> (especially with strings) before trying to change what you have to
> use them. std.algorithm in particular makes heavy use of ranges. And
> it wouldn't surprise me at all if some portions of your lexer and
> parser really should be using some of Phobos' functions but aren't
> currently, because it's originally a port from C++. You should also
> make sure that you understand the basics of Unicode fairly well -
> especially with how they pertain to char, wchar, and dchar - since
> that will affect your ability to correctly translate code to use
> ranges as well as properly optimize them.
>
> It would probably help if other D developers who are more familiar
> with ranges took a look at what you have and maybe even helped you
> start adjusting your code, but I don't know how many will both have
> the time and be interested. If I have time, I'll probably start
> poking at it, but I don't know that I'll have time any time soon,
> much as I'd like to.
>
> Regardless, you need to familiarize yourself with ranges if you want
> to get the lexer and parser ready for inclusion in Phobos. And you
> really should familiarize yourself with them anyway, since they're
> heavily used in D code in general. Not being able to use ranges in D
> would be like not being able to use iterators in C++. You can program
> in it, but you'd be fairly crippled - particularly when dealing with
> the standard library.
>
> - Jonathan M Davis
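
To illustrate the char, wchar, and dchar point in the quoted advice, here is a small sketch of how code units and code points differ when strings are treated as ranges. codePointCount is a hypothetical helper introduced only for this example, and the behaviour shown assumes Phobos's decoding of narrow strings into dchar when they are iterated as ranges.

```d
import std.range : ElementType, walkLength;

// Hypothetical helper: number of code points, as opposed to code units.
size_t codePointCount(S)(S text)
{
    return text.walkLength;
}

unittest
{
    // U+00C7 (capital C with cedilla) followed by six ASCII letters.
    string  utf8  = "\u00C7ehreli";
    wstring utf16 = "\u00C7ehreli"w;
    dstring utf32 = "\u00C7ehreli"d;

    // .length counts code units, which depends on the encoding.
    assert(utf8.length == 8);   // U+00C7 takes two UTF-8 code units
    assert(utf16.length == 7);
    assert(utf32.length == 7);

    // Walking the strings as ranges counts code points instead.
    assert(codePointCount(utf8) == 7);
    assert(codePointCount(utf16) == 7);
    assert(codePointCount(utf32) == 7);

    // As a range, a string yields dchar elements, not char.
    static assert(is(ElementType!string == dchar));
}
```

This is why a lexer written against ranges of dchar can accept any of the three string types, and also why index-based code ported from C++ needs care: an index into a string addresses a code unit, not a character.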


