regarding Latin1 to UTF8 encoding
Adam D. Ruppe
destructionator at gmail.com
Sun Dec 8 19:50:40 PST 2013
On Monday, 9 December 2013 at 03:33:46 UTC, Hugo Florentino wrote:
> Could this work using scope instead of try/catch?
Maybe, but I don't think it would be very pretty. Really, I think
validate should return a bool instead of throwing, but since it
doesn't, the try/catch is as close as it gets.
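For illustration, here is a minimal sketch of the bool-returning wrapper described above. It assumes that validate means std.utf.validate from Phobos, which throws a UTFException on malformed input. The name isValidUtf8 is made up for this example and is not part of Phobos or of my code.

```d
import std.utf : validate, UTFException;

/// Hypothetical helper: returns true if s is well-formed UTF-8,
/// false otherwise, instead of letting the exception escape.
bool isValidUtf8(const(char)[] s)
{
    try
    {
        validate(s); // throws UTFException on malformed input
        return true;
    }
    catch (UTFException e)
    {
        return false;
    }
}
```

With a helper like this the caller can write a plain if(isValidUtf8(data)) and keep the try/catch in one place.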
> P.S. Nice unit, by the way.
BTW if you need to parse random HTML, grab that file and my dom.d
from the same repo.
    import arsd.dom; // dom.d from the same repo

    auto document = new Document();
    document.parseGarbage(whatever_data); // whatever_data: the raw HTML
parseGarbage tries to determine the character encoding
automatically: from the validate check, or from the meta tags in
the HTML if they are there, and by guessing if not. It is pretty
good at parsing broken HTML tag soup into a DOM similar to what a
browser would build.
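To give a rough idea of the "validate, then guess" approach, here is a simplified sketch. It is not the actual logic inside parseGarbage: it ignores meta tags entirely and assumes that anything which is not valid UTF-8 is Latin-1. The function name decodeGuessing is made up for this example. It uses only std.utf.validate and std.encoding.transcode from Phobos.

```d
import std.encoding : Latin1String, transcode;
import std.utf : validate, UTFException;

/// Hypothetical helper: treat raw as UTF-8 if it validates,
/// otherwise assume Latin-1 and convert it to UTF-8.
string decodeGuessing(immutable(ubyte)[] raw)
{
    auto text = cast(string) raw;
    try
    {
        validate(text); // throws UTFException if not valid UTF-8
        return text;
    }
    catch (UTFException e)
    {
        // Fallback guess (an assumption): the bytes are Latin-1.
        string converted;
        transcode(cast(Latin1String) raw, converted);
        return converted;
    }
}
```

A real detector would want more than two candidates, but every byte sequence is valid Latin-1, so it makes a convenient last resort.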
Then you can get data out of it doing things like
    auto firstParagraph = document.querySelector("p:first-child");
    if(firstParagraph is null)
        writeln("no first child paragraph");
    else
        writeln("first child paragraph text: ", firstParagraph.innerText);
and stuff like that. If you have used JavaScript before, dom.d
should look fairly familiar.