Can I parse this kind of HTML with arsd.dom module?
Timoses
timosesu at gmail.com
Sun Jun 24 10:49:51 UTC 2018
On Sunday, 24 June 2018 at 03:46:09 UTC, Dr.No wrote:
> string html = get(page, client).text;
> auto document = new Document();
> document.parseGarbage(html);
> Element attEle = document.querySelector("span[id=link2]");
> Element aEle = attEle.querySelector("a");
> string link = aEle.href; // <-- if the href contains a space, it
> returns "href" rather than the link
>
> [...]
>
> <body bgcolor="#000000">
> <font color="yellow">
> <h2>
> Hello, dear world!
> <span id="link2">
> <a href = "https://hostname.com/?file=foo.png&foo=baa">G!</a>
> </span>
> </h2>
> </font>
missing </body>
Seems to be buggy. The part of the parsed document referring to "a" looks
like this:
<a "https:=""https:" href="href" />G!
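Judging from that output, the parser appears to trip over the spaces around the "=" in `href = "..."`. A possible stopgap, sketched below using only Phobos, is to strip that whitespace before the string is handed to the parser. The helper name `normalizeAttributeSpacing` is hypothetical and not part of arsd.dom, and the regex is deliberately crude: it would also rewrite any ` = "` sequence that occurs in ordinary text content.

```d
import std.regex : regex, replaceAll;
import std.stdio : writeln;

/// Hypothetical helper (not part of arsd.dom): removes whitespace around an
/// '=' that is directly followed by a double-quoted value.
string normalizeAttributeSpacing(string html)
{
    // Optional whitespace, '=', optional whitespace, then an opening quote.
    auto spacedEquals = regex(`\s*=\s*"`);
    return replaceAll(html, spacedEquals, `="`);
}

void main()
{
    // The anchor tag from the quoted page.
    string html = `<a href = "https://hostname.com/?file=foo.png&foo=baa">G!</a>`;
    writeln(normalizeAttributeSpacing(html));
}
```

The normalized string could then be passed to `document.parseGarbage` as in the quoted code. Whether that actually sidesteps the problem has not been verified here, so treat it as a workaround to try rather than a fix.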