html fetcher/parser

Faux Amis via Digitalmars-d-learn digitalmars-d-learn at puremagic.com
Sun Aug 13 08:54:45 PDT 2017


On 2017-08-12 22:22, Adam D. Ruppe wrote:
> On Saturday, 12 August 2017 at 19:53:22 UTC, Faux Amis wrote:
>> [...]
> 
> [...]
> ---
> // compile: $ dmd thisfile.d ~/arsd/{dom,http2,characterencodings}
> 
> import std.stdio;
> import arsd.dom;
> 
> void main() {
>     auto document = Document.fromUrl("https://dlang.org/");
>     writeln(document.optionSelector("p").innerText);
> }
> ---
Nice!

> [...]
> Document.fromUrl uses the http lib to fetch it, then automatically parses 
> the contents as a dom document. It will correct for common errors in 
> webpage markup, character sets, etc.

Just curious: is there a spec of sorts that defines which errors 
should be fixed?

> [...] 
> Bonus fact: 
> http://dpldocs.info/experimental-docs/std.algorithm.comparison.levenshteinDistanceAndPath.1.html 
> that function from the standard library makes doing a diff display of 
> before and after pretty simple....
Thanks for the pointer!
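
For anyone following along, here is a minimal sketch of how that function 
might drive a before/after diff display. It assumes a word-level diff is 
good enough; wordDiff and its "-"/"+" line prefixes are made up for 
illustration and are not part of Phobos or arsd. Only 
levenshteinDistanceAndPath and EditOp come from std.algorithm.comparison.

---
import std.algorithm.comparison : EditOp, levenshteinDistanceAndPath;
import std.array : appender;
import std.stdio : write;

// Hypothetical helper: renders a line-per-word diff of two word lists.
// Unchanged words are indented, removed words get "- ", added words "+ ".
string wordDiff(string[] before, string[] after)
{
    // result[0] is the edit distance, result[1] the list of edit operations
    // that turns `before` into `after`.
    auto result = levenshteinDistanceAndPath(before, after);
    auto output = appender!string();
    size_t i = 0; // position in before
    size_t j = 0; // position in after

    foreach (op; result[1])
    {
        final switch (op)
        {
            case EditOp.none:
                output.put("  " ~ before[i] ~ "\n");
                i++; j++;
                break;
            case EditOp.substitute:
                output.put("- " ~ before[i] ~ "\n");
                output.put("+ " ~ after[j] ~ "\n");
                i++; j++;
                break;
            case EditOp.insert:
                output.put("+ " ~ after[j] ~ "\n");
                j++;
                break;
            case EditOp.remove:
                output.put("- " ~ before[i] ~ "\n");
                i++;
                break;
        }
    }
    return output.data;
}

void main()
{
    // Example input only; in practice these could be the words of
    // innerText before and after an edit.
    write(wordDiff(["the", "quick", "fox"], ["the", "slow", "fox"]));
}
---

Working on words rather than characters keeps the output readable, and the 
same loop should work for lines or characters if you pass those ranges 
instead.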

