For those ready to take the challenge

via Digitalmars-d-learn digitalmars-d-learn at puremagic.com
Sat Jan 10 04:21:36 PST 2015


On Friday, 9 January 2015 at 17:18:43 UTC, Adam D. Ruppe wrote:
> Huh, looking at the answers on the website, they're mostly 
> using regular expressions. Weaksauce. And wrong - they don't 
> find ALL the links, they find the absolute HTTP urls!

Yeah... Surprising, since languages like Python include an HTML 
parser in the standard library.

Besides, if you want all resource links you have to do a lot 
better, since the following attributes can contain resource 
addresses: href, src, data, cite, xlink:href…
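
As a rough sketch of what that looks like with Python's standard html.parser module: the code below walks every start tag and collects the values of the attributes listed above. The names LINK_ATTRS, LinkCollector and collect_links are made up for illustration, and the attribute set is only the one from this post, not an exhaustive list.

```python
from html.parser import HTMLParser

# Attributes named in the post; illustrative, not an exhaustive list.
LINK_ATTRS = {"href", "src", "data", "cite", "xlink:href"}


class LinkCollector(HTMLParser):
    """Hypothetical helper: gathers resource addresses from any tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None when
        # the attribute is written without a value.
        for name, value in attrs:
            if name in LINK_ATTRS and value:
                self.links.append(value)


def collect_links(html_text):
    """Return all collected attribute values in document order."""
    parser = LinkCollector()
    parser.feed(html_text)
    parser.close()
    return parser.links
```

Unlike a regex for absolute HTTP URLs, this also picks up relative addresses such as "/a" or "b.png", because it looks at attribute values rather than at the shape of the URL.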

You also need to do entity expansion since the links can contain 
HTML entities like "&amp;".
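
A minimal sketch of entity expansion, again in Python and using only the standard html module. The function name expand_entities and the sample URL are assumptions for illustration. Regex-based scraping returns the raw attribute text, so a step like this is needed afterwards; Python's html.parser applies the same unescaping to attribute values before handing them to handle_starttag.

```python
from html import unescape


def expand_entities(raw_link):
    """Turn HTML character references in a raw attribute value into
    the characters they stand for (hypothetical helper name)."""
    return unescape(raw_link)


# Raw text as it appears in the HTML source of a made-up link.
print(expand_entities("page?a=1&amp;b=2"))  # page?a=1&b=2
```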

Depressing.

