Web crawler/scraping

Wed Feb 17 13:25:54 UTC 2021

On Wednesday, 17 February 2021 at 13:13:00 UTC, Adam D. Ruppe 
wrote:
> On Wednesday, 17 February 2021 at 12:12:56 UTC, Carlos Cabral 
> wrote:
>> I'm trying to collect some json data from a website/admin 
>> panel automatically, which is behind a login form.
>
> Does the website need javascript?
>
> If not, my dom.d may be able to help. It can download some 
> HTML, parse it, fill in forms, then my http2.d submits it (I 
> never implemented Form.submit in dom.d but it is pretty easy to 
> make with other functions that are implemented, heck maybe I'll 
> implement it now if it sounds like it might work).
>
> Or if it is all json you might be able to just craft some 
> requests with my lib or even phobos' std.net.curl that submits 
> the login request, saves a cookie, then fetches some json stuff.
>
> I literally just rolled out of bed but in an hour or two I can 
> come back and make some example code for you if this sounds 
> plausible.

No, I don't think it needs JS.
I think I can submit the login form and then just fetch/save the 
JSON using the login cookie, as you suggest. A 
crawler/scraping solution may be overkill...
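
Roughly what I have in mind, as an untested sketch with std.net.curl. The URLs, the form field names ("username", "password") and the cookie file name are all placeholders I made up; the real ones have to come from the site's login form.

```d
import std.json : parseJSON;
import std.net.curl : HTTP, get, post;
import std.stdio : writeln;

void main()
{
    // One HTTP handle reused for both requests, so the cookie set by
    // the login response is sent again with the JSON request.
    auto http = HTTP();

    // Turns on libcurl's cookie handling; "cookies.txt" is a placeholder path.
    http.setCookieJar("cookies.txt");

    // Hypothetical login endpoint and field names. The associative
    // array is sent as application/x-www-form-urlencoded form data.
    post("https://example.com/login",
         ["username": "me", "password": "secret"],
         http);

    // Hypothetical JSON endpoint behind the login.
    auto payload = get("https://example.com/admin/data.json", http);

    // Parse and pretty-print to check that the login actually worked.
    auto json = parseJSON(payload);
    writeln(json.toPrettyString());
}
```

If the login form also carries hidden fields (a CSRF token, for instance), they would need to be fetched from the login page first and included in the POST data.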

I'll try with std.net.curl and get back to you in a couple of 
hours.

Thank you!!