Re: [GSoC’11] Lexing and parsing

BlazingWhitester max.klyga at gmail.com
Wed Mar 23 02:30:28 PDT 2011


On 2011-03-23 00:27:51 +0200, Ilya Pupatenko said:

> Hi,
> 
> First of all, I want to be polite so I have to introduce myself (you 
> can skip this paragraph if you feel tired of newcomer-students’ posts). 
> My name is Ilya, I’m a Master’s student in the IT department of Novosibirsk 
> State University (Novosibirsk, Russia). In the Soviet period Novosibirsk 
> became one of the most important science centers in the country, and now 
> there are very close relations between the University and the Academy of 
> Sciences. That’s why it’s difficult and very interesting to study here. 
> But I’m not planning to study or work this summer, so I’ll be able to 
> work (nearly) full time on a GSoC project. My primary specialization is 
> seismic tomography inverse problems, but I’m also interested in 
> programming language implementation and compilation theory. I have good 
> knowledge of the C++ and C# languages and “intermediate” knowledge of the D 
> language, knowledge of compilation theory, some experience in 
> implementing lexers, parsers and translators, basic knowledge of 
> lex/yacc/antlr and some knowledge of the Boost.Spirit library. I’m not an 
> expert in D yet, but I am willing to learn and to solve difficult tasks, 
> that’s why I decided to apply to GSoC.
> 
> I’m still working on my proposal (on task “Lexing and Parsing”), but I 
> want to write some general ideas and ask some questions.
> 
> 1. It is said that “it is possible to write a highly-integrated 
> lexer/parser generator in D without resorting to additional tools”. As 
> I understand it, the library should allow the programmer to write a grammar 
> directly in D (ideally, the syntax should be somewhat similar to EBNF), 
> and the resulting parser will be generated by the D compiler while 
> compiling the program. This method allows parsing to be integrated into D 
> code; it can make the code simpler and sometimes even more efficient.
> There is a library for C++ (named Boost.Spirit) that follows the same 
> idea. It provides a (probably not ideal but very nice) “EBNF-like” syntax 
> for writing a grammar; it’s quite powerful, fast and flexible. There are 
> three parts in this library (actually there are 4 parts but we’re not 
> interested in Spirit.Classic now):
> • Spirit.Qi (parser library for building recursive descent parsers);
> • Spirit.Karma (generator library);
> • Spirit.Lex (library for creating tokenizers).
> The Spirit library uses “C++ template black magic” heavily (for 
> example, via Boost.Fusion). But D has greater metaprogramming 
> abilities, so it is possible to implement the same functionality in 
> an easier and “cleaner” way.
> So, the question is: is it a good idea for at least the parser library’s 
> architecture to be somewhat similar to Spirit’s? Of course it is 
> not about “blind” copying; but creating the architecture for such a big 
> system completely from scratch is quite difficult indeed. To be 
> exact, I like the idea of parser attributes, I like the way semantic 
> actions are described, and the “auto-rules” seem really useful.
> 
> 2. Boost.Spirit is a really large and complicated library. And I doubt 
> that it is possible to implement a library of comparable level in three 
> months. That’s why it is extremely important to have a plan (which 
> features should be implemented and how much time each will take). I’m 
> still working on it but I have some preliminary questions.
> Should I have a library that is proposed and accepted into Phobos before 
> the end of GSoC? Or is there no such strict timeframe, so that I can propose 
> a library when all the features I want to see are implemented and tested 
> well?
> And another question. Is it ok to concentrate first on the parser library 
> and then “move” to the other parts? Of course I can choose another part to 
> start work on, but it seems to me that the parser is the most useful and 
> interesting part.
> 
> 3. Finally, what comes next. I’ll try to make a plan (which parts 
> should be implemented and when). Then I guess I need to describe the 
> proposed architecture in more detail, and probably provide some usage 
> examples(?). Is it ok if I publish my ideas here to get reviews?
> Anyway, I’ll need some time to work on it.
> 
> Ilya.
> 
> P.S. The funny thing is that I found a minor bug in Phobos (#5736) while 
> trying (just for fun) to implement some tiny part of Spirit in D. 
> Submitting bugs seems to be an important part of the task too.

Mimicking Spirit might not be a good idea. It looks sort of like a BNF 
grammar, but because of operator abuse, there is just so much noise.
A better idea might be to use D's compile-time function evaluation (CTFE) 
to parse strings that contain grammars.
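
A rough sketch of what I mean, only to illustrate the idea. Nothing here is 
an existing library: the names Rule, trim, splitOn and parseGrammar, and the 
tiny "name = alt | alt;" notation, are all made up for this example. The 
grammar is an ordinary string, and because the result is stored in an enum, 
the compiler has to run parseGrammar at compile time:

```d
import std.stdio : writeln;

// Hypothetical rule representation, invented for this sketch.
struct Rule
{
    string name;
    string[] alternatives;
}

// Strip leading and trailing whitespace. Written by hand so it is
// plainly usable during compile-time function evaluation.
string trim(string s)
{
    size_t lo = 0, hi = s.length;
    while (lo < hi && (s[lo] == ' ' || s[lo] == '\t' || s[lo] == '\n' || s[lo] == '\r'))
        ++lo;
    while (hi > lo && (s[hi - 1] == ' ' || s[hi - 1] == '\t' || s[hi - 1] == '\n' || s[hi - 1] == '\r'))
        --hi;
    return s[lo .. hi];
}

// Split s at every occurrence of sep (empty pieces are kept).
string[] splitOn(string s, char sep)
{
    string[] parts;
    size_t start = 0;
    foreach (i, c; s)
    {
        if (c == sep)
        {
            parts ~= s[start .. i];
            start = i + 1;
        }
    }
    parts ~= s[start .. $];
    return parts;
}

// Parse rules of the (assumed) form "name = alt1 | alt2;".
// No quoting or escaping is handled; this is only a toy.
Rule[] parseGrammar(string text)
{
    Rule[] rules;
    foreach (chunk; splitOn(text, ';'))
    {
        string line = trim(chunk);
        if (line.length == 0)
            continue;
        string[] sides = splitOn(line, '=');
        assert(sides.length == 2, "expected exactly one '=' per rule");
        Rule r;
        r.name = trim(sides[0]);
        foreach (alt; splitOn(sides[1], '|'))
            r.alternatives ~= trim(alt);
        rules ~= r;
    }
    return rules;
}

// An enum initializer must be a compile-time constant, so this call is
// evaluated by the compiler, not at run time.
enum grammar = parseGrammar("
    expr = term '+' expr | term;
    term = number | '(' expr ')';
");

// Checked during compilation: a malformed grammar string becomes a build error.
static assert(grammar.length == 2);
static assert(grammar[0].name == "expr");
static assert(grammar[1].alternatives == ["number", "'(' expr ')'"]);

void main()
{
    foreach (r; grammar)
        writeln(r.name, " has ", r.alternatives.length, " alternative(s)");
}
```

A real generator would presumably go one step further and turn the rule 
table into parser code (for instance with string mixins), but the grammar 
itself stays a plain, readable string with no operator overloading involved.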
