Improving std.regex(p)

Ben Hanson Ben.Hanson at tfbplc.co.uk
Sat Jun 19 12:39:55 PDT 2010


Hi BCS,
== Quote from BCS (none at anon.com)'s article
> Hello Ben,
> >This would lead to huge
> > interest from the boost.spirit guys...
> I don't remember what versions this is (I've done about 3 that I can remember)
> or even if it works, but I wonder if they would have any interest in this:
> http://www.dsource.org/projects/scrapple/browser/trunk/dparser/dparse.d
> Note (if that is the version I think it is) that the only mixins in that
> are tiny (mixin("label_"~tag!(id)~":;"); & mixin("goto label_"~tag!(id)~";");)
> or generating a big list of invocations of other code.

I can't really speak for the Spirit people, but it's certainly interesting to
me! :-)

Here's some background to make things clearer:

I'm not even currently a Spirit user myself as I've only really needed a
tokeniser at work (they think that's mind-bogglingly sophisticated, never mind
using a beast like Spirit! ;-)).

The only reason the Spirit developers were interested in my lexer generator is
that recursive descent lexing is pretty slow, so for more demanding grammars a
DFA-based lexer is better. The lexer library lets you create a lexer at runtime
(which doesn't fit so well with their model of doing everything at compile
time), or you can generate code offline and then just add that to your project.
This is why a compile-time DFA lexer would be really interesting to them.
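
To make the DFA point a bit more concrete, here is a rough sketch of a
table-driven lexer in C++. It is not taken from my library or from Spirit; the
names (TokenKind, classify, kTransitions, tokenise) and the tiny token set
(identifiers and integers) are made up purely for illustration. The point is
only that each input character costs one table lookup, rather than a chain of
function calls as in recursive descent.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token kinds for this sketch; not from any real library.
enum class TokenKind { Identifier, Integer, Error };

struct Token {
    TokenKind kind;
    std::string text;
};

// Character classes keep the transition table small.
enum CharClass { kLetter, kDigit, kOther, kNumClasses };

// DFA states: start, inside an identifier, inside an integer, dead.
enum State { kStart, kInIdent, kInInt, kDead, kNumStates };

inline CharClass classify(unsigned char c) {
    if (std::isalpha(c) || c == '_') return kLetter;
    if (std::isdigit(c)) return kDigit;
    return kOther;
}

// kTransitions[state][class] is the next state; kDead means "no move".
const State kTransitions[kNumStates][kNumClasses] = {
    /* kStart   */ {kInIdent, kInInt,   kDead},
    /* kInIdent */ {kInIdent, kInIdent, kDead},
    /* kInInt   */ {kDead,    kInInt,   kDead},
    /* kDead    */ {kDead,    kDead,    kDead},
};

std::vector<Token> tokenise(const std::string& input) {
    std::vector<Token> tokens;
    std::size_t pos = 0;
    while (pos < input.size()) {
        // Whitespace separates tokens and is skipped.
        if (std::isspace(static_cast<unsigned char>(input[pos]))) {
            ++pos;
            continue;
        }
        State state = kStart;
        std::size_t end = pos;
        // Longest match: follow transitions until the DFA has no move.
        while (end < input.size()) {
            State next = kTransitions[state][classify(input[end])];
            if (next == kDead) break;
            state = next;
            ++end;
        }
        if (state == kStart) {
            // No transition was taken: report the character as an error.
            tokens.push_back({TokenKind::Error, input.substr(pos, 1)});
            ++pos;
        } else {
            // Leaving kStart means at least one character was consumed.
            assert(end > pos);
            TokenKind kind = (state == kInIdent) ? TokenKind::Identifier
                                                 : TokenKind::Integer;
            tokens.push_back({kind, input.substr(pos, end - pos)});
            pos = end;
        }
    }
    return tokens;
}
```

In this sketch the table is written by hand. A lexer generator would build the
equivalent table from regular expressions, either at runtime or as generated
source that you add to your project.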

From memory (Joel did a PDF recently, but that is on my work machine), Joel has
been developing Spirit for over ten years. The latest version is pretty
sophisticated and has all kinds of clever stuff for parsing data directly into
structures, all inline, etc. Needless to say, I find the whole thing pretty
mind-boggling. The biggest problem (as far as I can see as an observer) is the
compile times. This is where D could be really interesting overall (Hartmut has
certainly got his eye on it and in fact I'm sure all the Boost people do).

For my part, I'd like to see an LR parser generator developed. I'd be happy
with one that creates the parser at runtime (assuming decent performance), but
if it can generate and compile code efficiently at compile time, so much the
better! :-) I really like the idea of being able to switch the runtime/compile
time behaviour.
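
As a loose analogy for what I mean by switching between the two, here is a
small C++ sketch (assuming C++14 or later) where the same matching function is
evaluated either by the compiler or at runtime. This is only an illustration of
the idea, not how D or any existing parser generator does it, and the names
(step, is_integer, check_at_runtime) are invented for the example.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical three-state DFA that accepts one or more decimal digits.
// States: 0 = start, 1 = seen only digits (accepting), 2 = dead.
constexpr int step(int state, char c) {
    return (state != 2 && c >= '0' && c <= '9') ? 1 : 2;
}

// The same function can be evaluated by the compiler or at runtime.
constexpr bool is_integer(const char* s) {
    int state = 0;
    for (std::size_t i = 0; s[i] != '\0'; ++i) {
        state = step(state, s[i]);
    }
    return state == 1;
}

// Compile-time use: the compiler runs the DFA over the literal.
static_assert(is_integer("12345"), "all digits should match");
static_assert(!is_integer("12a45"), "a letter should not match");

// Runtime use: identical logic applied to data only known when running.
bool check_at_runtime(const char* user_input) {
    assert(user_input != nullptr);
    return is_integer(user_input);
}
```

The matching logic is written once, and the caller decides whether the input is
a compile-time constant or runtime data.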

When it comes to the code generation part for the DFA regex engine in D, I'd be
happy to talk to you further about the techniques you've employed regarding
emit. I'm still completely new to D, so it'll take me a while! :-)

Cheers,

Ben

