Can I parse this kind of HTML with arsd.dom module?
Dr.No
jckj33 at gmail.com
Sun Jun 24 03:46:09 UTC 2018
This is the module I'm speaking about:
https://arsd-official.dpldocs.info/arsd.dom.html
So I have this HTML that not even parseGarbage() can deal with:
<a href = "https://hostname.com/?file=foo.png&foo=baa">G!</a>
There are spaces between "href" and "=", and between "=" and "https...",
which make the code below fail:
string html = get(page, client).text;
auto document = new Document();
document.parseGarbage(html);
Element attEle = document.querySelector("span[id=link2]");
Element aEle = attEle.querySelector("a");
string link = aEle.href; // <-- if the href contains spaces, it
                         // returns "href" rather than the link
Let's say the page's HTML looks like this:
<body bgcolor="#000000">
<font color="yellow">
<h2>
Hello, dear world!
<span id="link2">
<a href = "https://hostname.com/?file=foo.png&foo=baa">G!</a>
</span>
</h2>
</font>
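One workaround I could try in the meantime is to strip the whitespace
around "=" before handing the HTML to parseGarbage(). Below is a minimal
sketch of that idea using only std.regex. The helper name
normalizeAttributeSpacing is hypothetical (it is not part of arsd.dom), and
the sketch assumes attribute values are quoted and that no tag contains a
literal "<" or ">" inside an attribute value. It is a stopgap, not a fix for
the parser itself.

```d
import std.regex : regex, replaceAll;
import std.stdio : writeln;

/// Hypothetical helper (not part of arsd.dom): removes whitespace around
/// an '=' that is directly followed by a quote, but only inside tags.
/// Assumes quoted attribute values and no '<' or '>' inside them.
string normalizeAttributeSpacing(string html)
{
    auto tagRe = regex(`<[^<>]+>`);          // one tag, e.g. <a href = "...">
    auto eqRe  = regex(`\s*=\s*(?=["'])`);   // '=' followed by a quote
    // Rewrite each tag on its own, so text between tags is left untouched.
    return replaceAll!(m => replaceAll(m.hit, eqRe, "="))(html, tagRe);
}

void main()
{
    auto raw = `<a href = "https://hostname.com/?file=foo.png&foo=baa">G!</a>`;
    writeln(normalizeAttributeSpacing(raw));
}
```

The result would then be passed to document.parseGarbage() in place of the
raw html string. The "=" characters inside the URL's query string are not
followed by a quote, so they are left as they are.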
I know the library author posts on this forum often; I hope he sees
this and can help somehow
to make it work. But if anyone else knows how to fix this, that would be
very welcome too!