Nested RegEx

Dmitry Olshansky dmitry.olsh at gmail.com
Sat Apr 21 10:41:18 PDT 2012


On 21.04.2012 21:24, nrgyzer wrote:
> Hi guys,
>
> I'm trying to use std.regex to parse a string like the following:
>
> string myString = "preOuter {if condition1} content1 {if condition2} content2
> {elseif condition3} content3 {else}any other content{/if}{/if} postOuter";
>

Simply put, pure regex is incapable of matching arbitrarily nested
if-else statements. In a sense, if ... /if is a case of the balanced
parentheses problem, which is widely known to form a non-regular language.
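
To make the limitation concrete, here is a minimal sketch using plain
parentheses instead of if ... /if. The helper names nestedPattern and
balancedUpTo are made up for illustration and are not part of std.regex.
The pattern only accepts nesting up to a fixed depth and has to grow with
every extra level, so no single finite pattern covers arbitrary depth.

```d
import std.regex : regex, matchFirst;
import std.stdio : writeln;

// Hypothetical helper: builds a pattern that accepts parentheses nested
// at most maxDepth levels deep. Each level wraps the previous pattern.
string nestedPattern(int maxDepth)
{
    string p = `[^\(\)]*`;                    // depth 0: no parentheses at all
    foreach (i; 0 .. maxDepth)
        p = `(?:[^\(\)]|\(` ~ p ~ `\))*`;     // allow one more level
    return p;
}

// Hypothetical helper: true if s is balanced and nested no deeper than maxDepth.
bool balancedUpTo(string s, int maxDepth)
{
    auto re = regex(`^` ~ nestedPattern(maxDepth) ~ `$`);
    return !matchFirst(s, re).empty;
}

void main()
{
    writeln(balancedUpTo("a(b(c))", 2));      // true: two levels fit
    writeln(balancedUpTo("a(b(c(d)))", 2));   // false: third level is too deep
}
```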

The way out is to either preconstruct a regex pattern for a given maximum
depth or just use a handwritten parser (it may use regex for parts of
the input internally).
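
As a rough sketch of the handwritten route, the scanner below walks the
input once and keeps a depth counter, which is exactly the piece of state
a pure regex lacks. The Piece struct and the scan function are
hypothetical names, and the tag syntax ({if cond}, {elseif cond}, {else},
{/if}) is assumed from the sample string in the question. It only
annotates pieces with their depth; building a real tree is left out.

```d
import std.algorithm.searching : startsWith;
import std.exception : enforce;
import std.stdio : writefln;
import std.string : indexOf, strip;

// Hypothetical result type: one text run or tag plus its nesting depth.
struct Piece
{
    string kind;  // "text", "if", "elseif", "else" or "/if"
    string text;  // literal text, or the condition of an if/elseif
    int depth;    // 0 means outside every {if}
}

Piece[] scan(string src)
{
    Piece[] pieces;
    int depth = 0;
    size_t pos = 0;

    while (pos < src.length)
    {
        immutable found = src.indexOf('{', pos);
        if (found < 0)
        {
            // No more tags: the rest is plain text.
            pieces ~= Piece("text", src[pos .. $], depth);
            break;
        }
        immutable open = cast(size_t) found;
        immutable closing = src.indexOf('}', open);
        enforce(closing >= 0, "unterminated tag");
        immutable close = cast(size_t) closing;

        if (open > pos)
            pieces ~= Piece("text", src[pos .. open], depth);

        string tag = src[open + 1 .. close];
        if (tag.startsWith("if "))
        {
            ++depth;  // entering a new nesting level
            pieces ~= Piece("if", tag[3 .. $].strip, depth);
        }
        else if (tag.startsWith("elseif "))
        {
            enforce(depth > 0, "{elseif} outside {if}");
            pieces ~= Piece("elseif", tag[7 .. $].strip, depth);
        }
        else if (tag == "else")
        {
            enforce(depth > 0, "{else} outside {if}");
            pieces ~= Piece("else", "", depth);
        }
        else if (tag == "/if")
        {
            enforce(depth > 0, "{/if} without {if}");
            pieces ~= Piece("/if", "", depth);
            --depth;  // leaving the current nesting level
        }
        else
            throw new Exception("unknown tag: {" ~ tag ~ "}");

        pos = close + 1;
    }
    enforce(depth == 0, "missing {/if}");
    return pieces;
}

void main()
{
    auto s = "preOuter {if condition1} content1 {if condition2} content2 "
           ~ "{elseif condition3} content3 {else}any other content"
           ~ "{/if}{/if} postOuter";
    foreach (p; scan(s))
        writefln("%d %-6s [%s]", p.depth, p.kind, p.text);
}
```

Regex can still be used inside such a parser, for example to pick a
condition apart once its tag has been isolated.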

One day I may add push-pop stack extensions (which allow this kind of
thing) to regex, but I'd have to think hard to make them efficient.

(Look at .NET regex: it kind of has syntax for this, but it's too
underpowered for my taste.)

> Is there any chance to use std.regex to parse the string above? I currently
> used the following expression:
>
> auto r = regex(`(.*?)\{if:(?P<condition>(.+?))\}(?P<content>(.*))(\{/if\})
> (.*)`, "g");
>
> but it doesn't fit my nested if- and else-statements correctly.
>
> Thanks in advance for any suggestions!


-- 
Dmitry Olshansky


More information about the Digitalmars-d-learn mailing list