Anyone interested in a Spirit for D?
pragma
ericanderton at yahoo.com
Wed Oct 18 18:13:19 PDT 2006
Bill Baxter wrote:
> Pragma wrote:
>> Bill Baxter wrote:
>>> [snip]
>>>
>>> Though, you know, even thinking about Boost::Spirit, I have to wonder
>>> if it really is necessary. From the intro it says that its primary
>>> use is "extremely small micro-parsers", not a full blown language
>>> processor. But if that's the target then the runtime overhead of
>>> translating the EBNF description to a parser would be pretty
>>> trivial. So I guess the real benefit of a compile-time
>>> parser-generator is that your grammar can be _verified_ at compile-time.
>>
>> From what I gather, that's the major benefit, other than a
>> "self-documenting design". All the "prettiness" of using a near-EBNF
>> syntax in C++ code gets you close enough to actual EBNF that it's
>> apparent what the grammar does and how it works.
>>
>> However, the only problem with composing this as an EBNF compile-time
>> parser, is that you can't attach actions to arbitrary terminals
>> without some sort of binding lookup. I'm not saying it's impossible,
>> but it'll be a little odd to use until we get some stronger reflection
>> support.
>>
>> But what you're suggesting could just as easily be a Compile-Time
>> rendition of Enki. It's quite possible to pull off. Especially if you
>> digest the grammar one production at a time so as to side-step any
>> recursion depth limitations when processing the parser templates. :)
>
> Yes! Sounds like we're thinking along the same lines here. But if
> Walter's right, that the compile-time verification is not a big deal,
> then it would be even simpler.
>
> Actually it sounds very similar to the way writing shader code for
> OpenGL/Direct3D works. You have to compile the code to use it, but
> conveniently compilation is so fast that you can do it at run-time
> easily. Or if you prefer, you can still precompile it. What I like to
> do is set up my IDE to go ahead and precompile my shaders just so I can
> check for errors at compile time, but then I use the runtime compilation
> in the end anyway because that makes some things easier -- like
> modifying the code on the fly.
>
> It actually works pretty well I think. The only difference between
> shader code and grammar code is that shader code doesn't need to make
> any callbacks. But callbacks aren't hard.
>
>> auto grammar = new Parser(
>> Production!("Number ::= NumberPart {NumberPart}",
>> // binding attached to production ('all' is supplied by default?)
>> void function(char[] all){
>> writefln("Parsed Number: %s",all);
>> }
>> ),
>> Production!("NumberPart ::= Sep | Digit "),
>> Production!("Digit ::= 0|1|2|3|4|5|6|7|8|9"),
>> Production!("Sep ::= '_' | ','")
>> );
>>
>> // call specifying start production
>> grammar.parse("Number",myInput);
>
> That's one way to do it, but I think you could also allow bindings to be
> attached after the fact:
>
> auto grammar = new Parser(
> "Number ::= NumberPart {NumberPart}
> NumberPart ::= Sep | Digit
> Digit ::= 0|1|2|3|4|5|6|7|8|9
> Sep ::= '_' | ','"
> );
>
> grammar.attach("Number",
> // binding attached to production ('all' is supplied by default?)
> void function(char[] all){
> writefln("Parsed Number: %s",all);
> });
>
> This is _exactly_ how parameter binding works in shader code. Just here
> the value we're binding is a function pointer instead of a texture
> coordinate or something.
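To make the attach-after-the-fact idea concrete, here's a minimal sketch of how a runtime parser could store those bindings in an associative array keyed by production name. Note that `Parser`, `attach`, and `fireAction` are hypothetical names for illustration, not Enki's actual API:

```d
import std.stdio;

// Hypothetical runtime parser: callbacks are bound by production
// name after the grammar has been constructed, shader-style.
class RuntimeParser
{
    alias Action = void delegate(char[] all);

    private Action[string] bindings; // production name -> callback

    void attach(string production, Action fn)
    {
        bindings[production] = fn;
    }

    // Called internally whenever a production matches; fires the
    // bound callback, if one was attached.
    protected void fireAction(string production, char[] matched)
    {
        if (auto fn = production in bindings)
            (*fn)(matched);
    }
}
```

The lookup is "soft" in exactly the sense mentioned below: a typo in the production name only shows up at runtime, since the string key can't be checked against the grammar at compile time.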
>
>> Depending on how you'd like the call bindings to go, you could
>> probably go about as complex as what Enki lets you get away with. But
>> you'll have to accept a 'soft' binding in there someplace, hence you
>> lose the type/name-checking benefits of being at compile time.
>
> I'll have to take your word for it. You mean in Enki you can say that
> Number has to output something convertible to 'real'?
Yes and no. The parser generator has a good deal of flexibility
built-in, including a pseudo-variant type that tries to perform
conversions wherever possible. For instance, if we re-wrote the
production for Number like so:
Number
= real handleNumber(whole,fraction)
::= (NumberPart {NumberPart}):whole '.'
(NumberPart {NumberPart}):fraction;
... Enki would emit code that binds the chars traversed for 'whole'
and 'fraction', and passes those on to a function called 'handleNumber'
that returns a real. That return value is passed up the parse chain so that
other terminals can bind to it:
Foobar = writeMe(foo) ::= Number:foo;
And so on.
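For illustration, the user-supplied semantic actions for that pair of productions might look roughly like this in D. The bodies here are my own guesswork, not Enki's generated output; only the signatures follow from the production declarations above:

```d
import std.array : replace;
import std.conv : to;
import std.stdio : writefln;

// Bound to Number: 'whole' and 'fraction' receive the characters
// traversed by each labeled sub-match.
real handleNumber(char[] whole, char[] fraction)
{
    // Strip the separators permitted by the Sep production
    // ('_' and ',') before converting to a real.
    auto w = whole.replace("_", "").replace(",", "");
    auto f = fraction.replace("_", "").replace(",", "");
    return to!real(w ~ "." ~ f);
}

// Up the chain, Foobar binds Number's return value:
//   Foobar = writeMe(foo) ::= Number:foo;
void writeMe(real foo)
{
    writefln("Parsed Number: %s", foo);
}
```

The point is that the pseudo-variant return value lets a terminal's action produce a typed result ('real' here) that parent productions can bind to by name.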
>
>>> I wonder if it would be any easier to make a compile-time grammar
>>> verifier than a full blown parser generator? Then just do the
>>> parser-generating at runtime.
>>
>> Maybe I don't fully understand, but I don't think there's a gain
>> there. If you've already gone through the gyrations of parsing the
>> BNF expression, it's hardly any extra trouble to do something at each
>> step of the resulting parse tree*.
>>
>> (* of course template-based parsers use the call-tree as a parse-tree
>> but that's besides the point)
>
> Yeh, I was just talking crap. I thought maybe you might be able to save
> some bookkeeping if all you cared about was that the grammar made a
> valid tree, but didn't care about its output. But probably it's the
> other way around. Checking validity is the hard part, not making a tree.
>
> --bb
More information about the Digitalmars-d
mailing list