Would there be interest in a SERIOUS compile-time regex parser?
Don Clugston
dac at nospam.com.au
Fri Oct 27 02:36:50 PDT 2006
Bruno Medeiros wrote:
> Don Clugston wrote:
>> Bruno Medeiros wrote:
>>> Don Clugston wrote:
>>>>
>>>> It would behave *exactly* like the existing std.regexp, except that
>>>> compilation into the internal form would happen via template
>>>> metaprogramming, so that
>>>> (1) all errors would be caught at compile time, and
>>>> (2) there'd be a minor speedup because the compilation step would
>>>> not happen at runtime, and
>>>> (3) otherwise it wouldn't be any faster than the existing regexp.
>>>> However, there'd be no template code bloat, either.
>>>>
>>>
>>> Whoa, "internal form" and "bytecoded program"? Out of curiosity, for
>>> those ignorant on the matter, like me, what kind of processing is
>>> done when creating a regexp, in terms of this internal form you speak
>>> of? Is it converted to a simple internal form, or something more
>>> complex? Bytecoded program seems pretty complex stuff, especially for
>>> a regexp (isn't the translation direct?)
>>
>> It's *nowhere near* as complicated as it sounds. If you look into the
>> source of std.regexp, you'll see what I mean -- there's a function
>> called 'compile'.
>
> From my quick look, it is a simpler representation of the regexp
> string, but it is still a linear (opcode) representation. It is not a
> state graph like I'd expect (and like benji said). (Is one used/created
> later? I couldn't tell from std.regexp.test().)
The graph part is done via 'goto' opcodes. But the majority of the
simple graph branches get converted into filters.
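
To make the 'goto' point concrete, here is a minimal sketch of how a flat opcode list can still encode a graph. It is written in Python for brevity, and the opcode names (CHAR, SPLIT, GOTO, MATCH) and the run() function are made up for illustration; they are not the actual std.regexp opcodes or internals. The program below is a hand-compiled form of the pattern a(b|c)*d.

```python
# Hypothetical opcode names, for illustration only (not std.regexp's real set).
CHAR, SPLIT, GOTO, MATCH = "char", "split", "goto", "match"

# Hand-compiled linear program for the pattern  a(b|c)*d
PROGRAM = [
    (CHAR, "a"),    # 0: match 'a'
    (SPLIT, 2, 7),  # 1: loop head: try the loop body (2), else leave the loop (7)
    (SPLIT, 3, 5),  # 2: alternation: try 'b' (3), else 'c' (5)
    (CHAR, "b"),    # 3
    (GOTO, 6),      # 4: skip over the other alternative
    (CHAR, "c"),    # 5
    (GOTO, 1),      # 6: back edge of the '*' loop
    (CHAR, "d"),    # 7
    (MATCH,),       # 8: succeed only if the whole input was consumed
]


def run(program, text, pc=0, pos=0):
    """Backtracking interpreter: True if `text` matches the whole program."""
    while True:
        op = program[pc]
        if op[0] == CHAR:
            if pos < len(text) and text[pos] == op[1]:
                pc += 1
                pos += 1
            else:
                return False
        elif op[0] == GOTO:
            pc = op[1]  # an edge in the graph, stored as a jump target
        elif op[0] == SPLIT:
            # Try the first branch; if it fails, fall back to the second.
            if run(program, text, op[1], pos):
                return True
            pc = op[2]
        elif op[0] == MATCH:
            return pos == len(text)


if __name__ == "__main__":
    for s in ["ad", "abcbd", "abx", "abdd"]:
        print(s, run(PROGRAM, s))
```

The list itself is linear, but the jump targets in SPLIT and GOTO are the edges of the state graph: instruction 6 is the back edge of the loop, and instruction 4 joins the two alternatives.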