Curl support RFC

Jonas Drewsen jdrewsen at nospam.com
Sun Mar 27 13:43:41 PDT 2011


On 25/03/11 10.54, Johannes Pfau wrote:
> Jonas Drewsen wrote:
>> Hi,
>>
>>    So I've been working a bit on the etc.curl module. Currently most
>> of the HTTP functionality is done and some very simple Ftp.
>>
>> I would very much like to know if this has a chance of getting in
>> phobos if I finish it with the current design. If not then it will be
>> for my own project only and doesn't need as much documentation or all
>> the features.
>>
>> https://github.com/jcd/phobos/tree/curl
>>
>> I do know that the error handling is currently not good enough... WIP.
>>
>> /Jonas
>>
>>
>> On 11/03/11 16.20, Jonas Drewsen wrote:
>>> Hi,
>>>
>>> So I've spent some time trying to wrap libcurl for D. There are a lot
>>> of things you can do with libcurl that I did not know about, so I'm
>>> starting out small.
>>>
>>> For now I've created all the declarations for the latest public curl
>>> C api. I have put that in the etc.c.curl module.
>>>
>>> On top of that I've created a more D like api as seen below. This is
>>> located in the 'etc.curl' module. What you can see below currently
>>> works but before proceeding further down this road I would like to
>>> get your comments on it.
>>>
>>> //
>>> // Simple HTTP GET with sane defaults
>>> // provides the .content, .headers and .status
>>> //
>>> writeln( Http.get("http://www.google.com").content );
>>>
>>> //
>>> // GET with custom data receiver delegates
>>> //
>>> Http http = new Http("http://www.google.dk");
>>> http.setReceiveHeaderCallback( (string key, string value) {
>>>     writeln(key ~ ":" ~ value);
>>> } );
>>> http.setReceiveCallback( (string data) { /* drop */ } );
>>> http.perform;
>>>
>>> //
>>> // POST with some timeouts
>>> //
>>> http.setUrl("http://www.testing.com/test.cgi");
>>> http.setReceiveCallback( (string data) { writeln(data); } );
>>> http.setConnectTimeout(1000);
>>> http.setDataTimeout(1000);
>>> http.setDnsTimeout(1000);
>>> http.setPostData("The quick....");
>>> http.perform;
>>>
>>> //
>>> // PUT with data sender delegate
>>> //
>>> string msg = "Hello world";
>>> size_t len = msg.length; /* using chunked transfer if omitted */
>>>
>>> http.setSendCallback( delegate size_t(char[] data) {
>>>         if (msg.empty) return 0;
>>>         auto l = msg.length;
>>>         data[0..l] = msg[0..$];
>>>         msg.length = 0;
>>>         return l;
>>>     },
>>>     HttpMethod.put, len );
>>> http.perform;
>>>
>>> //
>>> // HTTPS
>>> //
>>> writeln(Http.get("https://mail.google.com").content);
>>>
>>> //
>>> // FTP
>>> //
>>> writeln(Ftp.get("ftp://ftp.digitalmars.com/sieve.ds",
>>> "./downloaded-file"));
>>>
>>>
>>> // ... authentication, cookies, interface select, progress callback
>>> // etc. are also implemented this way.
>>>
>>>
>>> /Jonas
>>
>
> I looked at the code again and I got 2 more suggestions:
>
> 1.) Would it be useful to have a headersReceived callback which would be
> called when all headers have been received (when the data callback is
> called the first time)? I think of a situation where you don't know
> what data the server will return: a few KB html which you can easily
> keep in memory or a huge file which you'd have to save to disk. You
> can only know that if the headers have been received. It would also be
> possible to do that by just overwriting the headerCallback and looking
> out for the ContentLength/ContentType header, but I think it should
> also work with the default headerCallback.

I'm a little confused as to what a headersReceived(string[string]
headers) callback would give you compared to the
onReceiveHeader(const(char)[], const(char)[]) callback that exists today
in the example.

The headersReceived callback would probably look up the content-length
header and set a flag indicating whether to save the content to a file
or keep it in memory.

The existing onReceiveHeader could do the same by setting the flag when 
it receives the content-length field.
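
To make that concrete, here is a rough sketch of what I mean. It is
standalone and does not use etc.curl; the Receiver struct, the
memoryLimit threshold and the saveToFile flag are names made up for the
example, and only the callback signature is taken from the discussion
above.

```d
import std.conv : to;
import std.string : strip;
import std.uni : sicmp;

// Hypothetical threshold: bodies larger than this would go to disk.
enum size_t memoryLimit = 64 * 1024;

struct Receiver
{
    bool saveToFile = false; // flag decided while the headers arrive

    // Same shape as the onReceiveHeader callback discussed above.
    void onReceiveHeader(const(char)[] key, const(char)[] value)
    {
        // Field names are case insensitive, so compare accordingly.
        if (sicmp(key, "content-length") == 0)
            saveToFile = to!size_t(strip(value)) > memoryLimit;
    }
}

void main()
{
    Receiver r;
    r.onReceiveHeader("Content-Type", "text/html");
    r.onReceiveHeader("Content-Length", "1048576");
    assert(r.saveToFile); // 1 MiB exceeds the 64 KiB limit
}
```

A data callback could then check the flag on its first invocation. Note
that a response without a content-length header (a chunked one, for
instance) would leave the flag at its default in this sketch.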

Or maybe I'm misunderstanding you?


> 2.)
> As far as I can see you store the http headers in a case sensitive way.
> (res.headers[key] ~= value;). This means "Content-Length" vs
> "content-length" would produce two entries in the array and it makes
> it difficult to get the header from the associative array. It is maybe
> useful to keep the original casing, but probably not in the array key.
>
> BTW: According to RFC2616 the only headers which are allowed
> to be included multiple times in the response must consist of comma
> separated lists. So in theory we could keep a simple string[string]
> list and if we see a header twice we can just merge it with a ','.
>
> http://tools.ietf.org/html/rfc2616#section-4.2
> Relevant part from the RFC:
> ----------------------
>     Multiple message-header fields with the same field-name MAY be
>     present in a message if and only if the entire field-value for that
>     header field is defined as a comma-separated list [i.e., #(values)].
>     It MUST be possible to combine the multiple header fields into one
>     "field-name: field-value" pair, without changing the semantics of the
>     message, by appending each subsequent field-value to the first, each
>     separated by a comma. The order in which header fields with the same
>     field-name are received is therefore significant to the
>     interpretation of the combined field value, and thus a proxy MUST NOT
>     change the order of these field values when a message is forwarded.
> ----------------------

I will definitely implement this combined value functionality. I also
noted that header field names are case insensitive. This means they
could simply be stored internally in lower case, and the documentation
could specify lower case for looking up a header by field name.
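
Roughly what I have in mind, as a standalone sketch. The addHeader
helper is hypothetical and not part of etc.curl; it only illustrates
lower-casing the field name and merging repeated fields with a comma, as
the RFC text quoted above allows.

```d
import std.string : strip;
import std.uni : toLower;

// Hypothetical helper, not part of etc.curl: stores header fields under
// lower-cased names and merges repeated fields with a comma.
void addHeader(ref string[string] headers,
               const(char)[] name, const(char)[] value)
{
    string key = toLower(strip(name)).idup;
    string val = strip(value).idup;

    if (auto existing = key in headers)
        *existing ~= "," ~ val; // append in the order received
    else
        headers[key] = val;
}

void main()
{
    string[string] headers;
    addHeader(headers, "Content-Length", "42");
    addHeader(headers, "Accept-Ranges", "bytes");
    addHeader(headers, "accept-ranges", "none");

    assert(headers["content-length"] == "42");
    assert(headers["accept-ranges"] == "bytes,none");
}
```

One caveat I am aware of: Set-Cookie is commonly sent multiple times and
its values do not merge cleanly with a comma, so it would probably need
special handling.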


> I'm also done with the first pass through the http parsers.
> Documentation is here:
> http://dl.dropbox.com/u/24218791/std.protocol.http/http/http.html
>
> Code here:
> https://gist.github.com/886612
> The http.d file is generated from the http.d.rl file.
>

This is a nice protocol parser. I would very much like it to be usable
with the curl API, but without it being a dependency. That is already
possible using the onReceiveHeader callback, which keeps the two
decoupled. At least until std.protocol.http is in phobos as well - at
that point convenience methods could be added :)

/Jonas