What is the best way to use requests and iopipe on gzipped JSON file

ikod geller.garry at gmail.com
Fri Oct 13 22:18:16 UTC 2017


On Friday, 13 October 2017 at 19:17:54 UTC, Steven Schveighoffer 
wrote:
> On 10/13/17 2:47 PM, Andrew Edwards wrote:
>> A bit of advice, please. I'm trying to parse a gzipped JSON 
>> file retrieved from the internet. The following naive 
>> implementation accomplishes the task:
>> 
>>      auto url = 
>> "http://api.syosetu.com/novelapi/api/?out=json&lim=500&gzip=5";
>>      getContent(url)
>>          .data
>>          .unzip
>>          .runEncoded!((input) {
>>              ubyte[] content;
>>              foreach (line; input.byLineRange!true) {
>>                  content ~= cast(ubyte[])line;
>>              }
>>              auto json = (cast(string)content).parseJSON;
>
> input is an iopipe of char, wchar, or dchar. There is no need 
> to cast it around.
>
> Also, there is no need to split it by line, json doesn't care.
>
> Note also that getContent returns a complete body, but unzip 
> may not be so forgiving. But there definitely isn't a reason to 
> create your own buffer here.
>
> this should work (something like this really should be in 
> iopipe):
>
> while(input.extend(0) != 0) {} // get data until EOF
>
> And then:
> auto json = input.window.parseJSON;
>
>>              foreach (size_t ndx, record; json) {
>>                  if (ndx == 0) continue;
>>                  auto title = json[ndx]["title"].str;
>>                  auto author = json[ndx]["writer"].str;
>>                  writefln("title: %s", title);
>>                  writefln("author: %s\n", author);
>>              }
>>          });
>> 
>> However, I'm sure there is a much better way to accomplish 
>> this. Is there any way to accomplish something akin to:
>> 
>>      auto url = 
>> "http://api.syosetu.com/novelapi/api/?out=json&lim=500&gzip=5";
>>      getContent(url)
>>          .data
>>          .unzip
>>          .runEncoded!((input) {
>>              foreach (record; input.data.parseJSON[1 .. $]) {
>>                  // use or update record as desired
>>              }
>>          });
>
> Eventually, something like this will be possible with 
> jsoniopipe (I need to update and release this too, it's 
> probably broken with some of the changes I just put into 
> iopipe). Hopefully combined with some sort of networking 
> library you could process a JSON stream without reading the 
> whole thing into memory.

This can be done with requests. You can ask it not to load the 
whole content into memory, but instead to produce an input range 
that keeps loading data from the server as you consume it:

     auto rq = Request();
     rq.useStreaming = true;
     auto rs = rq.get("http://httpbin.org/image/jpeg");
     auto stream = rs.receiveAsRange();
     while(!stream.empty) {
         // stream.front contains the next chunk of data
         writefln("Received %d bytes, total received %d from document length %d",
                  stream.front.length, rq.contentReceived, rq.contentLength);
         stream.popFront; // continue to load from server
     }
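
For the gzipped JSON from the original question, the same loop could 
feed each chunk to std.zlib as it arrives. This is only a rough, 
untested sketch: it assumes the server sends the gzip file as the 
plain response body (no Content-Encoding that requests would already 
decode), and it swaps in std.zlib.UnCompress for iopipe's unzip. Only 
the download is streamed; the decompressed text is still collected in 
memory, because std.json.parseJSON builds a complete DOM.

     import std.json : parseJSON;
     import std.stdio : writefln;
     import std.zlib : HeaderFormat, UnCompress;
     import requests : Request;

     void main()
     {
         auto rq = Request();
         rq.useStreaming = true;
         auto rs = rq.get(
             "http://api.syosetu.com/novelapi/api/?out=json&lim=500&gzip=5");
         auto stream = rs.receiveAsRange();

         // assumption: the body is a raw gzip stream
         auto gz = new UnCompress(HeaderFormat.gzip);
         char[] text;
         while (!stream.empty) {
             // decompress each chunk as soon as it arrives
             text ~= cast(const(char)[]) gz.uncompress(stream.front);
             stream.popFront; // continue to load from server
         }
         text ~= cast(const(char)[]) gz.flush(); // remaining output

         auto json = parseJSON(text);
         // skip element 0, as the original code does
         foreach (record; json.array[1 .. $]) {
             writefln("title: %s", record["title"].str);
             writefln("author: %s\n", record["writer"].str);
         }
     }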


>
> Right now, it works just like std.json.parseJSON: it parses an 
> entire JSON message into a DOM form.
>
> -Steve



