Curl support RFC

Mon Mar 14 07:57:09 PDT 2011

On 14/03/11 12.10, Johannes Pfau wrote:
> Jonas Drewsen wrote:
>> Hi,
>>
>> So I've been working a bit on the etc.curl module. Currently most of
>> the HTTP functionality is done and some very simple FTP.
>>
>> I would very much like to know if this has a chance of getting in
>> phobos if I finish it with the current design. If not then it will be
>> for my own project only and doesn't need as much documentation or all
>> the features.
>>
>> https://github.com/jcd/phobos/tree/curl
>>
>> I do know that the error handling is currently not good enough... WIP.
>>
>> /Jonas
>>
>>
>> On 11/03/11 16.20, Jonas Drewsen wrote:
>>> Hi,
>>>
>>> So I've spent some time trying to wrap libcurl for D. There are a lot
>>> of things that you can do with libcurl which I did not know so I'm
>>> starting out small.
>>>
>>> For now I've created all the declarations for the latest public curl
>>> C api. I have put that in the etc.c.curl module.
>>>
>>> On top of that I've created a more D like api as seen below. This is
>>> located in the 'etc.curl' module. What you can see below currently
>>> works but before proceeding further down this road I would like to
>>> get your comments on it.
>>>
>>> //
>>> // Simple HTTP GET with sane defaults
>>> // provides the .content, .headers and .status
>>> //
>>> writeln( Http.get("http://www.google.com").content );
>>>
>>> //
>>> // GET with custom data receiver delegates
>>> //
>>> Http http = new Http("http://www.google.dk");
>>> http.setReceiveHeaderCallback( (string key, string value) {
>>>     writeln(key ~ ":" ~ value);
>>> } );
>>> http.setReceiveCallback( (string data) { /* drop */ } );
>>> http.perform;
>>>
>>> //
>>> // POST with some timeouts
>>> //
>>> http.setUrl("http://www.testing.com/test.cgi");
>>> http.setReceiveCallback( (string data) { writeln(data); } );
>>> http.setConnectTimeout(1000);
>>> http.setDataTimeout(1000);
>>> http.setDnsTimeout(1000);
>>> http.setPostData("The quick....");
>>> http.perform;
>>>
>>> //
>>> // PUT with data sender delegate
>>> //
>>> string msg = "Hello world";
>>> size_t len = msg.length; /* using chunked transfer if omitted */
>>>
>>> http.setSendCallback( delegate size_t(char[] data) {
>>>     if (msg.empty) return 0;
>>>     auto l = msg.length;
>>>     data[0..l] = msg[0..$];
>>>     msg.length = 0;
>>>     return l;
>>> },
>>> HttpMethod.put, len );
>>> http.perform;
>>>
>>> //
>>> // HTTPS
>>> //
>>> writeln(Http.get("https://mail.google.com").content);
>>>
>>> //
>>> // FTP
>>> //
>>> writeln(Ftp.get("ftp://ftp.digitalmars.com/sieve.ds",
>>> "./downloaded-file"));
>>>
>>>
>>> // ... authentication, cookies, interface select, progress callback
>>> // etc. is also implemented this way.
>>>
>>>
>>> /Jonas
>>
> Hi,
> I really like the API. A few comments:
>
> You use the internal curl progress meter. According to the
> documentation (It's a little hidden, look at CURLOPT_NOPROGRESS) the
> progress meter is likely to be removed in future curl versions. The
> download progress should be easy to reimplement, although you'd have to
> parse the Content-Length header. Upload shouldn't be too difficult either
> (One problem: What does curl pass as ultotal/dltotal when chunked
> encoding is used or the total size is not known?). Then we could also
> use different delegates for upload/download.

I did see the notice that the progress meter behind NOPROGRESS may be 
removed in the future, but decided to wrap it anyway. Maybe I should just 
leave it out of an initial version. As you say, it is pretty simple to 
implement ourselves.
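
To make the Content-Length idea concrete, here is a minimal sketch of 
tracking download progress ourselves. DownloadProgress and its methods are 
hypothetical names, not part of etc.curl. It assumes the header callback 
delivers one raw header line at a time, and it treats a missing or zero 
Content-Length (e.g. chunked encoding) as "size unknown":

```d
import std.conv : to, ConvException;
import std.string : indexOf, strip;
import std.uni : icmp;

/// Hypothetical helper, not part of etc.curl: tracks download progress
/// from the Content-Length header and the sizes of the received chunks.
struct DownloadProgress
{
    ulong total;    // 0 means unknown (no Content-Length, e.g. chunked)
    ulong received; // bytes seen so far

    /// Feed one raw header line, e.g. "Content-Length: 200\r\n".
    void onHeaderLine(const(char)[] line)
    {
        auto colon = line.indexOf(':');
        if (colon < 0) return; // status line or blank line
        // Header names are case-insensitive.
        if (icmp(line[0 .. colon].strip, "Content-Length") != 0) return;
        try
            total = line[colon + 1 .. $].strip.to!ulong;
        catch (ConvException)
            total = 0; // malformed value: treat the size as unknown
    }

    /// Call from the data receive callback with the chunk size.
    void onData(size_t chunkLength) { received += chunkLength; }

    /// Fraction in [0, 1], or -1 when the total size is unknown.
    double fraction() const
    {
        if (total == 0) return -1;
        return received >= total ? 1.0 : cast(double) received / total;
    }
}
```

The idea would be to call onHeaderLine from the header callback and onData 
from the data callback, and pass fraction on to a user supplied delegate. 
Upload could use a second instance, which would also give us separate 
delegates for upload and download.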

> The callback interface suits curl best and I actually like it, but how
> will it interact with streams? As an example: If someone wrote a
> stream/filter that decoded gzip for files it should be usable with
> the http streams as well. But files/ filestreams have a pull
> interface (no callbacks, stream.read() in a loop). So how could a gzip
> stream be written without too much code duplication supporting files and
> the http stuff?

If we take Andrei's stream proposal as the base of a new streaming 
design then http would just be another Transport. Files have a pull 
interface that blocks until data is read. The same could be done for 
the http class.

What I would really like is for the stream design to support 
non-blocking operation as mentioned in the stream proposal. I just have 
to figure out how the streaming API should behave in such cases.
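
As a rough illustration of putting a pull interface on top of the curl 
callbacks, here is a minimal sketch. PullBuffer is a hypothetical name and 
is neither part of etc.curl nor of the stream proposal. It only does the 
buffering and leaves out the blocking and threading questions:

```d
/// Hypothetical adapter: the receive callback pushes chunks in, and a
/// consumer (e.g. a gzip filter) pulls bytes out with read().
struct PullBuffer
{
    private ubyte[] pending; // data received but not yet consumed

    /// Push side: call this from the receive callback.
    void put(const(ubyte)[] chunk) { pending ~= chunk; }

    /// Pull side: copies up to dst.length bytes into dst and returns
    /// the number of bytes copied. Returns 0 when nothing is buffered;
    /// a blocking transport would wait here instead, and a non-blocking
    /// one would report "try again later".
    size_t read(ubyte[] dst)
    {
        auto n = dst.length < pending.length ? dst.length : pending.length;
        dst[0 .. n] = pending[0 .. n];
        pending = pending[n .. $];
        return n;
    }
}
```

A gzip filter written against read() would then not need to know whether 
the bytes come from a file or from http.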

> Do you plan to add some kind of support for header parsing? I think
> something like what the .net webclient uses
> ( http://msdn.microsoft.com/en-us/library/system.net.webclient(v=VS.100).aspx )
> would be great. Especially the HeaderCollection supporting headers as
> strings and as data types (for both parsing and formatting), but
> without a class hierarchy for the headers, using templates instead.

It would be nice to be able to get/set headers by string and enums 
(http://msdn.microsoft.com/en-us/library/system.net.httprequestheader.aspx). 
But I cannot see that .net is using datatypes or templates for it. Could 
you give me a pointer please?
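
For the string and enum part, this is roughly what I have in mind. It is 
only a sketch: RequestHeader and HeaderCollection are hypothetical names 
and nothing like them exists in etc.curl yet. Because the enum has string 
as its base type, its members convert implicitly to the header name, so 
one pair of index operators covers both cases:

```d
/// Hypothetical enum of well-known request headers (only a few shown).
enum RequestHeader : string
{
    accept      = "Accept",
    contentType = "Content-Type",
    userAgent   = "User-Agent",
}

/// Hypothetical collection giving get/set access by header name.
/// Lookup is case-sensitive in this sketch, which a real version
/// would have to fix since header names are case-insensitive.
struct HeaderCollection
{
    private string[string] values;

    /// headers[name] = value
    void opIndexAssign(string value, string name) { values[name] = value; }

    /// headers[name], or null if the header has not been set.
    string opIndex(string name)
    {
        auto p = name in values;
        return p ? *p : null;
    }
}
```

Usage would then be headers["Accept"] = "text/html" or 
headers[RequestHeader.userAgent] = "etc.curl", whichever the caller 
prefers.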

> I've written D parsers/formatters for almost all headers in
> rfc2616 (1 or 2 might be missing) and for a few additional commonly
> used headers (Content-Disposition, cookie headers). The parsers are
> written with ragel and are to be used with curl (continuations must be
> removed and the parsers always take 1 line of input, just as you get it
> from curl). Right now only the client side is implemented (no parsers
> for headers which can only be sent from client-->server ). However, I
> need to add some more documentation to the parsers, need to do
> some refactoring and I've got absolutely no time for that in the next 2
> weeks ('abitur' final exams). But if you could wait 2 weeks or if
> you wanted to do the refactoring yourself, I would be happy to
> contribute that code.

That sounds very interesting. I would very much like to see the code and 
see if it fits in.