Let's stop parser Hell

Dmitry Olshansky dmitry.olsh at gmail.com
Mon Aug 6 22:44:23 PDT 2012


On 07-Aug-12 09:11, Philippe Sigaud wrote:
> On Tue, Aug 7, 2012 at 7:02 AM, David Nadlinger <see at klickverbot.at> wrote:
>
>> As far as I know, ctRegex _always_ produces machine code. Bytecode is what
>> the runtime implementation, i.e. regex, uses.
>
> That's what I had in mind too, but Dmitry seemed to say things are
> more complicated. Maybe I misunderstood.

Yeah, sorry for the confusion. There are only 2 ways to do the job:
regex and ctRegex. The first one produces special bytecode that is
interpreted at run time; the second one generates native code.

> And do you know why it's called bytecode? I never heard of 'bytecode'
> for D outside of std.regex.
It's a special code that encodes a pattern matching machine :) It's not
bytecode for D. In other words, the instructions go as follows: match a
char, match one of a few chars, match one of a character set, match any
char, jump/loop/option, etc.
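
To make the idea concrete, here is a toy sketch of such a matching machine. It is not the actual instruction set or engine of std.regex: the opcode names, the Instr layout, the helper constructors and the backtracking run() function are all made up for illustration, and it only handles single-byte (ASCII) input. The sample program encodes the pattern ab*c.

```d
import std.stdio;

// Hypothetical opcodes for illustration; NOT the real std.regex bytecode.
enum Op { Char, Any, Split, Jump, Match }

struct Instr
{
    Op op;
    char ch = '\0';   // operand of Op.Char
    size_t x, y;      // targets: Jump uses x, Split tries x first, then y
}

// Small constructors so a program reads like an instruction listing.
Instr opChar(char c)               { return Instr(Op.Char, c); }
Instr opAny()                      { return Instr(Op.Any); }
Instr opSplit(size_t x, size_t y)  { return Instr(Op.Split, '\0', x, y); }
Instr opJump(size_t x)             { return Instr(Op.Jump, '\0', x); }
Instr opMatch()                    { return Instr(Op.Match); }

// Backtracking interpreter. Returns true if the program matches a prefix
// of `input` beginning at index `pos`.
bool run(const Instr[] prog, string input, size_t pc = 0, size_t pos = 0)
{
    for (;;)
    {
        const ins = prog[pc];
        final switch (ins.op)
        {
        case Op.Char:   // match one specific char
            if (pos >= input.length || input[pos] != ins.ch) return false;
            ++pc; ++pos;
            break;
        case Op.Any:    // match any single char
            if (pos >= input.length) return false;
            ++pc; ++pos;
            break;
        case Op.Split:  // option/loop: try branch x, fall back to branch y
            if (run(prog, input, ins.x, pos)) return true;
            pc = ins.y;
            break;
        case Op.Jump:   // unconditional jump
            pc = ins.x;
            break;
        case Op.Match:  // reached the end of the pattern
            return true;
        }
    }
}

// The pattern ab*c written out by hand as a program.
Instr[] sampleProgram()
{
    return [
        opChar('a'),    // 0
        opSplit(2, 4),  // 1: either take one more 'b' or move on
        opChar('b'),    // 2
        opJump(1),      // 3: loop back to the split
        opChar('c'),    // 4
        opMatch(),      // 5
    ];
}

void main()
{
    auto prog = sampleProgram();
    foreach (s; ["ac", "abbbc", "abx"])
        writeln(s, " -> ", run(prog, s));
}
```

A real engine would add character-set instructions and avoid naive recursion, but the shape is the same: a flat array of small matching instructions walked by an interpreter.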

> Does that just mean 'intermediate form D
> code', as in 'bunch of small and easily optimized instructions.'?

No :)

So far there are only 3 cases:
RT parsing ---> Bytecode ( { auto x = regex("blah"); } )
CT parsing ---> Bytecode ( static x = regex("blah"); )
CT parsing ---> Machine code ( static/auto x = ctRegex!"blah" )
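
A minimal sketch of the first and third cases side by side, assuming a reasonably recent Phobos where matchAll is available. The countMatches helper and the pattern "bla+h" are made up for this example. The second case (static x = regex(...)) is left as a comment only, since whether it compiles depends on the compiler and library version.

```d
import std.regex;
import std.stdio;

// Hypothetical helper: counts non-overlapping matches of `re` in `text`.
// Templated on R so it accepts whatever regex(...) and ctRegex!... return.
size_t countMatches(R)(string text, R re)
{
    size_t n = 0;
    foreach (m; matchAll(text, re))
        ++n;
    return n;
}

void main()
{
    // Case 1: the pattern is parsed at run time into bytecode.
    auto rt = regex("bla+h");

    // Case 2 (not exercised here): `static x = regex("bla+h");`
    // would parse the pattern at compile time but still yield bytecode.

    // Case 3: the pattern is parsed at compile time and turned into
    // native matching code.
    auto ct = ctRegex!"bla+h";

    writeln(countMatches("blah blaah", rt));
    writeln(countMatches("blah blaah", ct));
}
```

Both objects are used through the same matching functions; only how and when the matcher is built differs.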

It would be nice to have a JIT compiler for D at RT, but no luck yet
(std.jit?). Then the picture would be complete.
-- 
Dmitry Olshansky

