Curl support RFC

Johannes Pfau spam at example.com
Fri Mar 25 02:54:15 PDT 2011


Jonas Drewsen wrote:
>Hi,
>
>So I've been working a bit on the etc.curl module. Currently most of
>the HTTP functionality is done and some very simple Ftp.
>
>I would very much like to know if this has a chance of getting in
>phobos if I finish it with the current design. If not then it will be
>for my own project only and doesn't need as much documentation or all
>the features.
>
>https://github.com/jcd/phobos/tree/curl
>
>I do know that the error handling is currently not good enough... WIP.
>
>/Jonas
>
>
>On 11/03/11 16.20, Jonas Drewsen wrote:
>> Hi,
>>
>> So I've spent some time trying to wrap libcurl for D. There are a lot
>> of things that you can do with libcurl which I did not know so I'm
>> starting out small.
>>
>> For now I've created all the declarations for the latest public curl
>> C api. I have put that in the etc.c.curl module.
>>
>> On top of that I've created a more D like api as seen below. This is
>> located in the 'etc.curl' module. What you can see below currently
>> works but before proceeding further down this road I would like to
>> get your comments on it.
>>
>> //
>> // Simple HTTP GET with sane defaults
>> // provides the .content, .headers and .status
>> //
>> writeln( Http.get("http://www.google.com").content );
>>
>> //
>> // GET with custom data receiver delegates
>> //
>> Http http = new Http("http://www.google.dk");
>> http.setReceiveHeaderCallback( (string key, string value) {
>>     writeln(key ~ ":" ~ value);
>> } );
>> http.setReceiveCallback( (string data) { /* drop */ } );
>> http.perform;
>>
>> //
>> // POST with some timeouts
>> //
>> http.setUrl("http://www.testing.com/test.cgi");
>> http.setReceiveCallback( (string data) { writeln(data); } );
>> http.setConnectTimeout(1000);
>> http.setDataTimeout(1000);
>> http.setDnsTimeout(1000);
>> http.setPostData("The quick....");
>> http.perform;
>>
>> //
>> // PUT with data sender delegate
>> //
>> string msg = "Hello world";
>> size_t len = msg.length; /* using chunked transfer if omitted */
>>
>> http.setSendCallback( delegate size_t(char[] data) {
>>         if (msg.empty) return 0;
>>         auto l = msg.length;
>>         data[0..l] = msg[0..$];
>>         msg.length = 0;
>>         return l;
>>     },
>>     HttpMethod.put, len );
>> http.perform;
>>
>> //
>> // HTTPS
>> //
>> writeln(Http.get("https://mail.google.com").content);
>>
>> //
>> // FTP
>> //
>> writeln(Ftp.get("ftp://ftp.digitalmars.com/sieve.ds",
>> "./downloaded-file"));
>>
>>
>> // ... authentication, cookies, interface select, progress callback
>> // etc. is also implemented this way.
>>
>>
>> /Jonas
>

I looked at the code again and have two more suggestions:

1.) Would it be useful to have a headersReceived callback that is called
once all headers have been received (that is, just before the data
callback is called for the first time)? I'm thinking of a situation
where you don't know what the server will return: a few KB of HTML that
you can easily keep in memory, or a huge file that you'd have to save to
disk. You can only tell once the headers have been received. It would
also be possible to do this by overriding the headerCallback and
watching for the Content-Length/Content-Type headers, but I think it
should also work with the default headerCallback.
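
To make the idea concrete, here is a minimal sketch of how such a
callback could be triggered from the first data callback. It does not
use etc.curl at all: the Receiver class and the names onHeadersReceived,
onData, receiveHeader and receiveData are hypothetical stand-ins, not
the module's actual API.

```d
import std.conv : to;
import std.stdio : writeln;

// Hypothetical stand-in for the transfer object; NOT the etc.curl API.
class Receiver
{
    string[string] headers;
    void delegate(string[string] headers) onHeadersReceived; // assumed name
    void delegate(const(char)[] data) onData;                // assumed name
    private bool headersDone = false;

    // Would be called by the transfer layer once per header line.
    void receiveHeader(string key, string value)
    {
        headers[key] = value;
    }

    // Would be called by the transfer layer once per body chunk.
    void receiveData(const(char)[] data)
    {
        // The first body chunk implies that all headers have arrived.
        if (!headersDone)
        {
            headersDone = true;
            if (onHeadersReceived !is null)
                onHeadersReceived(headers);
        }
        if (onData !is null)
            onData(data);
    }
}

void main()
{
    auto r = new Receiver;
    bool saveToDisk = false;

    // Decide where the body goes before any of it is handled.
    r.onHeadersReceived = (string[string] h) {
        auto len = "Content-Length" in h;
        saveToDisk = len !is null && to!size_t(*len) > 1024 * 1024;
    };
    r.onData = (const(char)[] d) {
        writeln(saveToDisk ? "to disk: " : "in memory: ", d.length, " bytes");
    };

    // Simulated transfer.
    r.receiveHeader("Content-Length", "11");
    r.receiveData("Hello ");
    r.receiveData("world");
}
```

One caveat of this approach: a response without a body never reaches
the data callback, so the hook would not fire for it unless the end of
the header block is detected separately.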

2.) As far as I can see you store the HTTP headers in a case-sensitive
way (res.headers[key] ~= value;). Header field names are
case-insensitive, so "Content-Length" and "content-length" would
produce two entries in the associative array, which makes it difficult
to look a header up. It may be useful to keep the original casing, but
probably not in the array key.
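
A rough sketch of what I mean, assuming a small wrapper type around the
associative array. HeaderStore and its members are names made up for
this example; the lookup key is the lower-cased field name, and the
casing as first received is kept on the side.

```d
import std.stdio : writeln;
import std.uni : toLower;

// Hypothetical helper; not part of etc.curl.
struct HeaderStore
{
    private string[string] values;   // lower-cased name -> value
    private string[string] original; // lower-cased name -> name as first seen

    void set(string name, string value)
    {
        auto key = name.toLower;
        if (key !in original)
            original[key] = name;
        values[key] = value;
    }

    // Returns the value, or `fallback` if the header is absent.
    string get(string name, string fallback = null)
    {
        auto p = name.toLower in values;
        return p ? *p : fallback;
    }

    // Returns the casing in which the header was first received.
    string originalName(string name)
    {
        auto p = name.toLower in original;
        return p ? *p : null;
    }
}

void main()
{
    HeaderStore h;
    h.set("Content-Length", "42");
    writeln(h.get("content-length"));
    writeln(h.originalName("CONTENT-LENGTH"));
}
```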

BTW: According to RFC 2616, a header may only appear multiple times in a
message if its entire value is defined as a comma-separated list. So in
theory we could keep a simple string[string] associative array, and if
we see a header twice we can just merge the values with a ','.

http://tools.ietf.org/html/rfc2616#section-4.2
Relevant part from the RFC:
----------------------
   Multiple message-header fields with the same field-name MAY be
   present in a message if and only if the entire field-value for that
   header field is defined as a comma-separated list [i.e., #(values)].
   It MUST be possible to combine the multiple header fields into one
   "field-name: field-value" pair, without changing the semantics of the
   message, by appending each subsequent field-value to the first, each
   separated by a comma. The order in which header fields with the same
   field-name are received is therefore significant to the
   interpretation of the combined field value, and thus a proxy MUST NOT
   change the order of these field values when a message is forwarded.
----------------------
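
The merging described in that section could look roughly like this.
addHeader is a hypothetical free function written for this example, and
lower-casing the key is my own assumption carried over from point 2:

```d
import std.stdio : writeln;
import std.uni : toLower;

// Hypothetical helper: stores a header, merging repeated fields in the
// order received, as RFC 2616 section 4.2 allows.
void addHeader(ref string[string] headers, string name, string value)
{
    auto key = name.toLower;
    if (auto existing = key in headers)
        *existing ~= ", " ~ value; // append to the value already stored
    else
        headers[key] = value;
}

void main()
{
    string[string] headers;
    addHeader(headers, "Cache-Control", "no-cache");
    addHeader(headers, "cache-control", "no-store");
    writeln(headers["cache-control"]);
}
```

In practice Set-Cookie is a known exception: servers send it multiple
times even though its value is not a comma-separated list, so it would
probably need separate handling.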

I'm also done with the first pass through the HTTP parsers.
Documentation is here:
http://dl.dropbox.com/u/24218791/std.protocol.http/http/http.html

Code here:
https://gist.github.com/886612
The http.d file is generated from the http.d.rl file. 

-- 
Johannes Pfau