etc.curl: Formal review begin

Johannes Pfau spam at example.com
Wed Aug 24 05:58:59 PDT 2011


David Nadlinger wrote:
>Now that Lars Kyllingstad's new and improved std.path has passed the 
>vote  – congratulations, Lars! –, and Jose Armando Garcia, the author
>of the proposed logging module, is currently not available, the
>etc.curl module by Jonas Drewsen is at the front of the review queue.
>I have volunteered to run the formal process for inclusion with Phobos.
>
>etc.curl provides a high-level interface to client-side networking 
>functionality for the HTTP, FTP and SMTP protocols by wrapping libcurl 
>[1], accessing the C interface of which is already possible using
>Phobos in the form of etc.c.curl. Prior stages of the work have
>already been discussed in March [2], May [3], and June [4]. Thanks to
>everybody who took the time to evaluate the library back then; as far
>as I am aware, all raised concerns have been addressed since then.
>
>Based on the amount of discussion this module already underwent, I 
>hereby propose a two-week review period. Barring any objections, the 
>review starts today (Aug 18) and ends on Aug 31, followed by a one
>week vote (Sep 1 - Sep 7).
>
>Code:
>   https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d
>
>API documentation:
>   http://freeze.steamwinter.com/D/web/phobos/etc_curl.html
>
>A discussion on the wrapper has already been started on digitalmars.D 
>[5]. To keep everything in one place, please post your comments and 
>reviews there.
>
>At this point, I would like to invite everyone to spend some time 
>testing the module and reading its documentation and code. It is 
>essential for a working open source code review process that many
>people participate in it, and regardless of whether you are new to D
>or a seasoned contributor, your opinion is very valuable.
>
>Jonas also asked us to have a look at the list of known issues (which 
>can also be found at the top of the source file), and, if possible, 
>suggest a solution:
>
>---
>Known issues:
>  * DDoc is not generated where the mixins ByLineAsync, ByLineSync, 
>ByChunkAsync and ByLineSync are used. This seems to be a limitation of 
>ddoc - suggestions on how to circumvent this appreciated.
>
>Possible improvements:
>  * Progress may be deprecated in the future. Maybe implement a
> replacement.
>  * Support typed http headers - (Johannes Pfau)
>---
>
>To make sure it doesn't get lost, let me also repeat Johannes Pfau's 
>request to add proxy support from another thread ([6]) here.
>
>Thanks a lot for your reviewing efforts,
>David
>
>
>
>[1] http://curl.haxx.se/libcurl/
>[2] 
>http://www.digitalmars.com/d/archives/digitalmars/D/Curl_support_RFC_131753.html
>[3] 
>http://www.digitalmars.com/d/archives/digitalmars/D/Curl_wrapper_136372.html
>[4] 
>http://www.digitalmars.com/d/archives/digitalmars/D/Curl_wrapper_round_two_138945.html
>[5] 
>http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=142661
>[6] 
>http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=142175


+++++++++++++++++++++++++++   Code Review    +++++++++++++++++++++++++

Comments and examples:
    /**
       Example:
       ----
curl.onReceive = (ubyte[] data) { writeln("Got data", cast(char[])
       data); return data.length;};
       ----
    */
Maybe rewrite comments like this:
    /**
     * Example:
     * ----
     * curl.onReceive = (ubyte[] data) { writeln("Got data",
     * cast(char[]) data); return data.length;};
     * ----
     */
This way, the examples don't break the indentation. I think this
"misaligned" example blocks make the source harder to read.

https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L152
bu --> by; thread --> threads

https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L148
What does the phobos style guide say for alias names? I'd argue
outdata/indata is a type (or at least, looks like one), so -->
OutData/InData ?

https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L198
Do not call this from... --> Warning: Do not call this from...

https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L289
CurlOption.file --> CurlOption.writeData ?

https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L326
Content of the delegate in the example should be idented.

https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L345
CurlOption.infile --> CurlOption.readData ?

https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L362
indent

https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L414
indent

https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L420
https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L693
dltotal, dlnow, ultotal, ulnow --> dlTotal, dlNow, ulTotal, ulNow ?

https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L432
The C callbacks take char* pointers to the data buffers. As these
could contain binary data, shouldn't ubyte* be used here? I know that
the data is later cast to ubyte[], but I'm a little concerned about
this pointer slicing: str[0..size*nmemb] . This creates a temporary
char[] which could be invalid. Maybe not a big deal, but I think we
should avoid that.

https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L540
It's a little strange that such properties are write-only, but I know
that's caused by a curl limitation, so nothing we can do about that.

https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L576
CurlOption.intrface --> CurlOption.interface, likely also wrong in
etc.c.curl

https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L595
https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L584
int port --> ushort port? I see no need for negative values?

https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L645
indent

https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L662
aften --> after

https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L739
https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L779
https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L817
shouldn't those be documented, so that the user knows which methods can
be used on Http.Result/AsyncResult?

https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L1176
https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L2464
https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L2899
Maybe some mixin documentation should be added manually, something like
"Mixes in Protocol"

https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L1431
I think I've seen REST APIs which use POST without data or with
parameters embedded in the URL. Is there no simple way to tell curl to
_not add_ any data and _not set_ the Content-Type header? But I'm not
even sure if a POST without a BODY is valid in HTTP.

https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L1432
https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L1500
Shouldn't delim be "&"?
Also in the foreach loop, delim should be ignored for the first pair?
      foreach (i, key; params.byKey()) {
            if(i == 0)
                data ~= key ~ "=" ~ params[key];
            else
                data ~= delim ~ key ~ "=" ~ params[key];
        }

Also, is there a reason why the user must escape the parameters
manually? As the user has no Curl handle, that could be annoying.

https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L1537
      const(char)[] k = "Content-Type";
      res.header(k, contentType);
--> res.header("Content-Type", contentType); ?

https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L1658
if ( _true_ ||!netAllowed) return;  ?

https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L1770
Add "; charset=utf-8" ?

https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L1789
IIRC it's possible to add parameter names to delegate types in D
to document their use, so
void delegate(const(char)[],const(char)[]) callback)
--> void delegate(const(char)[] key,const(char)[] value) callback)

https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L1798
if (header[0..5] == "HTTP/") {
-->
if (header.length >= 5 && header[0..5] == "HTTP/") {

https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L1829
if --> of

https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L1959
add a toString() method for debugging purposes?

https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L2221
document which functions call execute() ?

https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L2283
why private?
(*v) ~= value; --> (*v) ~= "," ~ value; ?

https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L2327
https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L2308
Document that this still transfers all data into memory first.

https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L2643
Document that Result mixes in ProtocolRequestParamsSetters and
RequestParamsSetters

https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L2652
Add this bug and all other similar bugs to a ddoc BUG: section in the
module comment, so this doesn't get lost.

https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L2932
https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L2943
Bounds checking: At least a assert((recipient|sender).length > 0);
should be added.

(I've skipped most of the async stuff, maybe someone else should have a
look at that.)

++++++++++++++++++++++   API docs Review   ++++++++++++++++++++++

First, huge example:
Split this into many smaller code blocks, add some high-level overview,
here's some inspiration in pseudo-ddoc, feel free to use this:

etc.curl provides wrappers for $(LINK curls) HTTP, FTP and SMTP
functions.

There are convenience functions which can be used to make simple
requests with a single line of code:
----------------------------------------
// Simple HTTP GET
auto result = Http.get("http://www.google.com").toString();
----------------------------------------
This makes a HTTP GET request to "http://www.google.com". These
convenience functions are always _static_ functions of the HTTP /
FTP / SMTP structs. They return a Result/AsyncResult struct which can
be used to set options of the request and get the result as a string or
ubyte[] array.
----------------------------------------
// Simple HTTP GET with timeout
auto request = Http.get("http://www.google.com");
request.connectTimeout(dur!"seconds"(10));
auto result = request.toString();
----------------------------------------
As connectTimeout and similar methods return the Result/AsyncResult
instance, it's possible to chain those calls into a single line:
----------------------------------------
auto result =
Http.get("http://www.google.com").connectTimeout(dur!"seconds"(10)).toString();
----------------------------------------
Note that the request to the remote webserver is done when you call
toString() or similar methods that internally call execute (see $(LINK
)). If you don't call toString() or a similar method, no request is
made!

etc.curl additionally provides asynchronous versions of these
convenience functions: Http.getAsync() is the asynchronous version that
will spawn a thread in the background and return a Http.AsyncResult
immediately. You can read data from the result at later point in time.
This allows you to start processing data before all data has been
received by using byChunk() or byLine() on the Http.AsyncResult
instance.
----------------------------------------
// GET using an asynchronous range
foreach (line; Http.getAsync("http://www.google.com").byLine())
{
    // Do some expensive processing on the line while more lines are
    // received asynchronously in the background.
    writeln("asyncLine: ", line);
}
----------------------------------------

If you need more control than the convenience functions provide, you'll
have to use the full API:
----------------------------------------
// GET with custom data receivers
Http http = Http("http://www.google.com");
http.onReceiveHeader =
    (const(char)[] key, const(char)[] value) { writeln(key ~ ": " ~
value); };
http.onReceive = (ubyte[] data) { /+ drop +/ return
data.length; };
http.perform();
----------------------------------------
First, we create an instance of the reference-counted Http struct. We
then set custom delegates which are called whenever the Http instance
receives a header or a data buffer. In this simple example, we simply
print the headers and ignore the data. See $(LINK
onReceiveHeader/onReceive) for more information.
We finally start the HTTP request by calling perform(). Note that you
cannot combine the convenience get/post/put/... functions with this
extended API: Although you could call http.get("http://some.url")
instead of perform, this would completely ignore the settings made to
the Http instance! Remember that get/post/put/etc are _static_ methods.
They cannot access this instances data.

If you want to make a different request than GET with the extended API,
you have to set the method property:
----------------------------------------
// PUT with data senders
string msg = "Hello world";
http.onSend = (void[] data)
{
    if (msg.empty)
        return 0;

    auto m = cast(void[])msg;
    size_t length = m.length;

    //We assume msg fits into the data buffer, real code shouldn't do
    //that
    assert(data.length > msg.length);
    data[0..length] = m[0..$];
    return length;
};
http.method = Http.Method.put;
http.contentLength = 11; // defaults to chunked transfer if not
specified 
http.perform();
----------------------------------------
If you data doesn't fit into the buffer provided by onSend, onSend will
be called multiple times. See $(LINK) for more information.

The extended API has also functions to help you track the transfer's
progress:
----------------------------------------
// Track progress
http.method = Http.Method.get;
http.url = "http://upload.wikimedia.org/wikipedia/commons/"
           "5/53/Wikipedia-logo-en-big.png";
http.onReceive = (ubyte[] data) { return data.length; }; //Discard data
http.onProgress = (double dltotal, double dlnow,
                   double ultotal, double ulnow) {
    writeln("Progress ", dltotal, ", ", dlnow, ", ", ultotal, ", ",
ulnow); return 0;
};
http.perform();
----------------------------------------

As a last example, here's how to send an email with etc.curl:
----------------------------------------
// Send an email with SMTPS
Smtp smtp = Smtp("smtps://smtp.gmail.com");
smtp.setAuthentication("from.addr at gmail.com", "password");
smtp.mailTo = ["<to.addr at gmail.com>"];
smtp.mailFrom = "<from.addr at gmail.com>";
smtp.message = "Example Message";
smtp.perform();
----------------------------------------

Some complete examples like these above could also be added to the
Http/Ftp/Smtp struct documentation. You could use the examples I used
in a reply to Walter, if you wanted to.

I'd also say, at least in the examples, use braces on their own lines.
And I'd insert a few empty lines in certain places, for example:
----------------------------------------
http.onSend = (void[] data) {
    if (msg.empty) return 0;
    auto m = cast(void[])msg;
    size_t length = m.length;
    data[0..length] = m[0..$];
    msg.length = 0;
    return length;
};
----------------------------------------

doesn't look very good, I'd write it like this:
----------------------------------------
http.onSend = (void[] data)
{
    if (msg.empty)
        return 0;

    auto m = cast(void[])msg;
    size_t length = m.length;

    //We assume msg fits into the data buffer, real code shouldn't do
    //that
    assert(data.length > msg.length);
    data[0..length] = m[0..$];
    return length;
};
----------------------------------------

++++++++++++++++++++++   Feature request   ++++++++++++++++++++++

Although David already quoted my request in the initial post (thx
David), I'll repeat it here:
 
Could you add proxy support to the
wrapper? All of CURLOPT_PROXY CURLOPT_PROXYPORT and CURLOPT_PROXYTYPE
should be available. (The 'combined' CURLOPT_PROXY since 7.21.7 is too
new, that's not even in the current ubuntu)

-- 
Johannes Pfau
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 198 bytes
Desc: not available
URL: <http://lists.puremagic.com/pipermail/digitalmars-d/attachments/20110824/3cfa6156/attachment-0001.pgp>


More information about the Digitalmars-d mailing list