How to write parser?
Suliman via Digitalmars-d-learn
digitalmars-d-learn at puremagic.com
Sun May 14 12:00:09 PDT 2017
I am trying to learn how to write a text parser. I have an example
document in the following format:
#Header
my header text
##SubHeader
my sub header text
###Sub3Header
my sub 3 text
#Header21
my header2 text
##SubHeader21
my header2 text
###SubHeader22
my header3 text
I would like to wrap each heading level (#) in an HTML div, so that
the result looks like this:
<div 1>
#Header
my header text
<div 2>
##SubHeader
my sub header text
<div 3>
###Sub3Header
my sub 3 text
</div 3>
</div 2>
</div 1>
<div 1>#Header21
my header2 text
<div 2>##SubHeader21
my header2 text
<div 3>###SubHeader22
my header3 text
</div 3>
</div 2>
</div 1>
It seems that I misunderstand the parser logic. I am trying to
do it in the following way:
bool isH1Open;
bool isH2Open;
bool isH3Open;
string newcontent;
foreach(line; content.lineSplitter)
{
    if(line.length > 3) // to prevent indexing past the end of short lines
    {
        if(!isH1Open && line[0] == '#' && line[1] != '#')
        {
            isH1Open = true;
            line = `<div 1>` ~ "\n" ~ line ;
            newcontent ~= line;
            continue;
        }
        if(isH2Open && line[1] == '#' && line[2] != '#')
        {
            isH2Open = false;
            line = "\n" ~ `</div 2>` ~ "\n";
            newcontent ~= line;
            continue;
        }
        if(isH1Open && line[0] == '#' && line[1] != '#')
        {
            isH1Open = false;
            line = "\n" ~ `</div 1>` ~ "\n";
            newcontent ~= line;
            continue;
        }
        if(!isH2Open && line[1] == '#' && line[2] != '#')
        {
            isH2Open = true;
            line = "\n" ~ `<div 2>` ~ "\n" ~ line ;
            newcontent ~= line;
            continue;
        }
    }
}
But I am getting the wrong output:
<div 1>
#Header
<div 2>
##SubHeader
</div 1>
</div 2>
<div 1>
#Header31
<div 2>
##SubHeader31
Is there any better way to parse such a format?
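One possible direction, given only as a rough sketch and not as a definitive answer: instead of one bool per level, keep a stack of the heading levels whose divs are currently open. When a new heading arrives, close every open div of the same or a deeper level, then open a div for the new heading. At the end of the input, close whatever is left. The names headingLevel and wrapHeadings are made up for this illustration, and the sketch assumes that any line starting with '#' is a heading and that every `<div N>` goes on its own line.

```d
import std.array : appender;
import std.conv : to;
import std.string : lineSplitter;

/// Number of leading '#' characters; 0 means the line is not a heading.
/// (Hypothetical helper, assumes every line starting with '#' is a heading.)
size_t headingLevel(const(char)[] line)
{
    size_t n = 0;
    while (n < line.length && line[n] == '#')
        ++n;
    return n;
}

/// Wraps each heading and the text below it in `<div N>` ... `</div N>`.
string wrapHeadings(string content)
{
    auto result = appender!string();
    size_t[] open; // levels of the currently open divs, outermost first

    foreach (line; content.lineSplitter)
    {
        immutable level = headingLevel(line);
        if (level > 0)
        {
            // A heading closes every open div of the same or a deeper level.
            while (open.length && open[$ - 1] >= level)
            {
                result.put("</div " ~ open[$ - 1].to!string ~ ">\n");
                open = open[0 .. $ - 1];
            }
            result.put("<div " ~ level.to!string ~ ">\n");
            open ~= level;
        }
        // Headings and plain text lines are both copied through unchanged.
        result.put(line);
        result.put("\n");
    }

    // End of input: close whatever is still open, innermost first.
    foreach_reverse (level; open)
        result.put("</div " ~ level.to!string ~ ">\n");

    return result.data;
}
```

Calling wrapHeadings(content) on the example document should close `</div 3>`, `</div 2>` and `</div 1>` before the second `<div 1>` opens, because the stack remembers the nesting that separate flags lose. It also handles any number of '#' levels without adding a new flag for each one.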