Can I parse this kind of HTML with arsd.dom module?

Dr.No jckj33 at gmail.com
Sun Jun 24 03:46:09 UTC 2018


This is the module I'm speaking about: 
https://arsd-official.dpldocs.info/arsd.dom.html

So I have this HTML that not even parseGarbage() can deal with:

<a href = "https://hostname.com/?file=foo.png&foo=baa">G!</a>

There are spaces between "href", "=", and "https...", which 
make the code below fail:


	string html = get(page, client).text;
	auto document = new Document();
	document.parseGarbage(html);
	Element attEle = document.querySelector("span[id=link2]");
	Element aEle = attEle.querySelector("a");
	// If there are spaces around "=", this returns "href" rather than the link.
	string link = aEle.href;



Let's say the page's HTML looks like this:

<body bgcolor="#000000">
<font color="yellow">
<h2>
	Hello, dear world!
	<span id="link2">
<a href = "https://hostname.com/?file=foo.png&foo=baa">G!</a>
	</span>
</h2>
</font>
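
A possible stopgap, assuming the whitespace around "=" is the only thing 
tripping up the parser, would be to normalize the markup before handing it 
to parseGarbage(). This is only a sketch: normalizeAttributeSpacing is a 
hypothetical helper of mine, not something arsd.dom provides, and the regex 
is naive, so it would also rewrite any text outside of tags that happens to 
look like name = "value".

```d
import std.regex : regex, replaceAll;

/// Hypothetical helper (not part of arsd.dom): removes whitespace around
/// the "=" between an attribute name and its quoted value.
string normalizeAttributeSpacing(string html)
{
    // Attribute name, optional spaces, "=", optional spaces, opening quote.
    auto attr = regex(`(\w+)\s*=\s*(["'])`);
    // Re-emit the name and the opening quote with a bare "=" between them.
    return replaceAll(html, attr, "$1=$2");
}

void main()
{
    auto input = `<a href = "https://hostname.com/?file=foo.png&foo=baa">G!</a>`;
    auto fixed = normalizeAttributeSpacing(input);
    // The "=" signs inside the URL are untouched: no quote follows them.
    assert(fixed == `<a href="https://hostname.com/?file=foo.png&foo=baa">G!</a>`);
}
```

With that in place, the parsing call would become 
document.parseGarbage(normalizeAttributeSpacing(html)); the rest of the code 
stays the same.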

I know the library author posts on this forum often, so I hope he sees 
this and can help make it work. But if anyone else knows how to fix this, 
that would be very welcome too!

