etc.curl: Formal review begin

Johannes Pfau spam at example.com
Thu Aug 25 05:09:12 PDT 2011


Jonas Drewsen wrote:
>On 24/08/11 14.58, Johannes Pfau wrote:
>> David Nadlinger wrote:
>>> Now that Lars Kyllingstad's new and improved std.path has passed the
>>> vote  – congratulations, Lars! –, and Jose Armando Garcia, the
>>> author of the proposed logging module, is currently not available,
>>> the etc.curl module by Jonas Drewsen is at the front of the review
>>> queue. I have volunteered to run the formal process for inclusion
>>> with Phobos.
>>>
>>> etc.curl provides a high-level interface to client-side networking
>>> functionality for the HTTP, FTP and SMTP protocols by wrapping
>>> libcurl [1], accessing the C interface of which is already possible
>>> using Phobos in the form of etc.c.curl. Prior stages of the work
>>> have already been discussed in March [2], May [3], and June [4].
>>> Thanks to everybody who took the time to evaluate the library back
>>> then; as far as I am aware, all raised concerns have been addressed
>>> since then.
>>>
>>> Based on the amount of discussion this module already underwent, I
>>> hereby propose a two-week review period. Barring any objections, the
>>> review starts today (Aug 18) and ends on Aug 31, followed by a one
>>> week vote (Sep 1 - Sep 7).
>>>
>>> Code:
>>>    https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d
>>>
>>> API documentation:
>>>    http://freeze.steamwinter.com/D/web/phobos/etc_curl.html
>>>
>>> A discussion on the wrapper has already been started on
>>> digitalmars.D [5]. To keep everything in one place, please post
>>> your comments and reviews there.
>>>
>>> At this point, I would like to invite everyone to spend some time
>>> testing the module and reading its documentation and code. It is
>>> essential for a working open source code review process that many
>>> people participate in it, and regardless of whether you are new to D
>>> or a seasoned contributor, your opinion is very valuable.
>>>
>>> Jonas also asked us to have a look at the list of known issues
>>> (which can also be found at the top of the source file), and, if
>>> possible, suggest a solution:
>>>
>>> ---
>>> Known issues:
>>>   * DDoc is not generated where the mixins ByLineAsync, ByLineSync,
>>> ByChunkAsync and ByLineSync are used. This seems to be a limitation
>>> of ddoc - suggestions on how to circumvent this appreciated.
>>>
>>> Possible improvements:
>>>   * Progress may be deprecated in the future. Maybe implement a
>>> replacement.
>>>   * Support typed http headers - (Johannes Pfau)
>>> ---
>>>
>>> To make sure it doesn't get lost, let me also repeat Johannes Pfau's
>>> request to add proxy support from another thread ([6]) here.
>>>
>>> Thanks a lot for your reviewing efforts,
>>> David
>>>
>>>
>>>
>>> [1] http://curl.haxx.se/libcurl/
>>> [2]
>>> http://www.digitalmars.com/d/archives/digitalmars/D/Curl_support_RFC_131753.html
>>> [3]
>>> http://www.digitalmars.com/d/archives/digitalmars/D/Curl_wrapper_136372.html
>>> [4]
>>> http://www.digitalmars.com/d/archives/digitalmars/D/Curl_wrapper_round_two_138945.html
>>> [5]
>>> http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=142661
>>> [6]
>>> http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=142175
>>
>>
>> +++++++++++++++++++++++++++   Code Review
>> +++++++++++++++++++++++++
>>
>> Comments and examples:
>>      /**
>>         Example:
>>         ----
>> curl.onReceive = (ubyte[] data) { writeln("Got data", cast(char[])
>>         data); return data.length;};
>>         ----
>>      */
>> Maybe rewrite comments like this:
>>      /**
>>       * Example:
>>       * ----
>>       * curl.onReceive = (ubyte[] data) { writeln("Got data",
>>       * cast(char[]) data); return data.length;};
>>       * ----
>>       */
>> This way, the examples don't break the indentation. I think this
>> "misaligned" example blocks make the source harder to read.
>>
>> https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L152
>> bu -->  by; thread -->  threads
>>
>> https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L148
>> What does the phobos style guide say for alias names? I'd argue
>> outdata/indata is a type (or at least, looks like one), so -->
>> OutData/InData ?
>>
>> https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L198
>> Do not call this from... -->  Warning: Do not call this from...
>>
>> https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L289
>> CurlOption.file -->  CurlOption.writeData ?
>>
>> https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L326
>> Content of the delegate in the example should be idented.
>>
>> https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L345
>> CurlOption.infile -->  CurlOption.readData ?
>>
>> https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L362
>> indent
>>
>> https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L414
>> indent
>>
>> https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L420
>> https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L693
>> dltotal, dlnow, ultotal, ulnow -->  dlTotal, dlNow, ulTotal, ulNow ?
>>
>> https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L432
>> The C callbacks take char* pointers to the data buffers. As these
>> could contain binary data, shouldn't ubyte* be used here? I know that
>> the data is later cast to ubyte[], but I'm a little concerned about
>> this pointer slicing: str[0..size*nmemb] . This creates a temporary
>> char[] which could be invalid. Maybe not a big deal, but I think we
>> should avoid that.
>>
>> https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L540
>> It's a little strange that such properties are write-only, but I know
>> that's caused by a curl limitation, so nothing we can do about that.
>>
>> https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L576
>> CurlOption.intrface -->  CurlOption.interface, likely also wrong in
>> etc.c.curl
>>
>> https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L595
>> https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L584
>> int port -->  ushort port? I see no need for negative values?
>>
>> https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L645
>> indent
>>
>> https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L662
>> aften -->  after
>>
>> https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L739
>> https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L779
>> https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L817
>> shouldn't those be documented, so that the user knows which methods
>> can be used on Http.Result/AsyncResult?
>>
>> https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L1176
>> https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L2464
>> https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L2899
>> Maybe some mixin documentation should be added manually, something
>> like "Mixes in Protocol"
>>
>> https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L1431
>> I think I've seen REST APIs which use POST without data or with
>> parameters embedded in the URL. Is there no simple way to tell curl
>> to _not add_ any data and _not set_ the Content-Type header? But I'm
>> not even sure if a POST without a BODY is valid in HTTP.
>>
>> https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L1432
>> https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L1500
>> Shouldn't delim be "&"?
>> Also in the foreach loop, delim should be ignored for the first pair?
>>        foreach (i, key; params.byKey()) {
>>              if(i == 0)
>>                  data ~= key ~ "=" ~ params[key];
>>              else
>>                  data ~= delim ~ key ~ "=" ~ params[key];
>>          }
>>
>> Also, is there a reason why the user must escape the parameters
>> manually? As the user has no Curl handle, that could be annoying.
>>
>> https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L1537
>>        const(char)[] k = "Content-Type";
>>        res.header(k, contentType);
>> -->  res.header("Content-Type", contentType); ?
>>
>> https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L1658
>> if ( _true_ ||!netAllowed) return;  ?
>>
>> https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L1770
>> Add "; charset=utf-8" ?
>>
>> https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L1789
>> IIRC it's possible to add parameter names to delegate types in D
>> to document their use, so
>> void delegate(const(char)[],const(char)[]) callback)
>> -->  void delegate(const(char)[] key,const(char)[] value) callback)
>>
>> https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L1798
>> if (header[0..5] == "HTTP/") {
>> -->
>> if (header.length>= 5&&  header[0..5] == "HTTP/") {
>>
>> https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L1829
>> if -->  of
>>
>> https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L1959
>> add a toString() method for debugging purposes?
>>
>> https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L2221
>> document which functions call execute() ?
>>
>> https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L2283
>> why private?
>> (*v) ~= value; -->  (*v) ~= "," ~ value; ?
>>
>> https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L2327
>> https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L2308
>> Document that this still transfers all data into memory first.
>>
>> https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L2643
>> Document that Result mixes in ProtocolRequestParamsSetters and
>> RequestParamsSetters
>>
>> https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L2652
>> Add this bug and all other similar bugs to a ddoc BUG: section in the
>> module comment, so this doesn't get lost.
>>
>> https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L2932
>> https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl.d#L2943
>> Bounds checking: At least a assert((recipient|sender).length>  0);
>> should be added.
>>
>> (I've skipped most of the async stuff, maybe someone else should
>> have a look at that.)
>>
>> ++++++++++++++++++++++   API docs Review   ++++++++++++++++++++++
>>
>> First, huge example:
>> Split this into many smaller code blocks, add some high-level
>> overview, here's some inspiration in pseudo-ddoc, feel free to use
>> this:
>>
>> etc.curl provides wrappers for $(LINK curls) HTTP, FTP and SMTP
>> functions.
>>
>> There are convenience functions which can be used to make simple
>> requests with a single line of code:
>> ----------------------------------------
>> // Simple HTTP GET
>> auto result = Http.get("http://www.google.com").toString();
>> ----------------------------------------
>> This makes a HTTP GET request to "http://www.google.com". These
>> convenience functions are always _static_ functions of the HTTP /
>> FTP / SMTP structs. They return a Result/AsyncResult struct which can
>> be used to set options of the request and get the result as a string
>> or ubyte[] array.
>> ----------------------------------------
>> // Simple HTTP GET with timeout
>> auto request = Http.get("http://www.google.com");
>> request.connectTimeout(dur!"seconds"(10));
>> auto result = request.toString();
>> ----------------------------------------
>> As connectTimeout and similar methods return the Result/AsyncResult
>> instance, it's possible to chain those calls into a single line:
>> ----------------------------------------
>> auto result =
>> Http.get("http://www.google.com").connectTimeout(dur!"seconds"(10)).toString();
>> ----------------------------------------
>> Note that the request to the remote webserver is done when you call
>> toString() or similar methods that internally call execute (see
>> $(LINK )). If you don't call toString() or a similar method, no
>> request is made!
>>
>> etc.curl additionally provides asynchronous versions of these
>> convenience functions: Http.getAsync() is the asynchronous version
>> that will spawn a thread in the background and return a
>> Http.AsyncResult immediately. You can read data from the result at
>> later point in time. This allows you to start processing data before
>> all data has been received by using byChunk() or byLine() on the
>> Http.AsyncResult instance.
>> ----------------------------------------
>> // GET using an asynchronous range
>> foreach (line; Http.getAsync("http://www.google.com").byLine())
>> {
>>      // Do some expensive processing on the line while more lines are
>>      // received asynchronously in the background.
>>      writeln("asyncLine: ", line);
>> }
>> ----------------------------------------
>>
>> If you need more control than the convenience functions provide,
>> you'll have to use the full API:
>> ----------------------------------------
>> // GET with custom data receivers
>> Http http = Http("http://www.google.com");
>> http.onReceiveHeader =
>>      (const(char)[] key, const(char)[] value) { writeln(key ~ ": " ~
>> value); };
>> http.onReceive = (ubyte[] data) { /+ drop +/ return
>> data.length; };
>> http.perform();
>> ----------------------------------------
>> First, we create an instance of the reference-counted Http struct. We
>> then set custom delegates which are called whenever the Http instance
>> receives a header or a data buffer. In this simple example, we simply
>> print the headers and ignore the data. See $(LINK
>> onReceiveHeader/onReceive) for more information.
>> We finally start the HTTP request by calling perform(). Note that you
>> cannot combine the convenience get/post/put/... functions with this
>> extended API: Although you could call http.get("http://some.url")
>> instead of perform, this would completely ignore the settings made to
>> the Http instance! Remember that get/post/put/etc are _static_
>> methods. They cannot access this instances data.
>>
>> If you want to make a different request than GET with the extended
>> API, you have to set the method property:
>> ----------------------------------------
>> // PUT with data senders
>> string msg = "Hello world";
>> http.onSend = (void[] data)
>> {
>>      if (msg.empty)
>>          return 0;
>>
>>      auto m = cast(void[])msg;
>>      size_t length = m.length;
>>
>>      //We assume msg fits into the data buffer, real code shouldn't
>> do //that
>>      assert(data.length>  msg.length);
>>      data[0..length] = m[0..$];
>>      return length;
>> };
>> http.method = Http.Method.put;
>> http.contentLength = 11; // defaults to chunked transfer if not
>> specified
>> http.perform();
>> ----------------------------------------
>> If you data doesn't fit into the buffer provided by onSend, onSend
>> will be called multiple times. See $(LINK) for more information.
>>
>> The extended API has also functions to help you track the transfer's
>> progress:
>> ----------------------------------------
>> // Track progress
>> http.method = Http.Method.get;
>> http.url = "http://upload.wikimedia.org/wikipedia/commons/"
>>             "5/53/Wikipedia-logo-en-big.png";
>> http.onReceive = (ubyte[] data) { return data.length; }; //Discard
>> data http.onProgress = (double dltotal, double dlnow,
>>                     double ultotal, double ulnow) {
>>      writeln("Progress ", dltotal, ", ", dlnow, ", ", ultotal, ", ",
>> ulnow); return 0;
>> };
>> http.perform();
>> ----------------------------------------
>>
>> As a last example, here's how to send an email with etc.curl:
>> ----------------------------------------
>> // Send an email with SMTPS
>> Smtp smtp = Smtp("smtps://smtp.gmail.com");
>> smtp.setAuthentication("from.addr at gmail.com", "password");
>> smtp.mailTo = ["<to.addr at gmail.com>"];
>> smtp.mailFrom = "<from.addr at gmail.com>";
>> smtp.message = "Example Message";
>> smtp.perform();
>> ----------------------------------------
>>
>> Some complete examples like these above could also be added to the
>> Http/Ftp/Smtp struct documentation. You could use the examples I used
>> in a reply to Walter, if you wanted to.
>>
>> I'd also say, at least in the examples, use braces on their own
>> lines. And I'd insert a few empty lines in certain places, for
>> example: ----------------------------------------
>> http.onSend = (void[] data) {
>>      if (msg.empty) return 0;
>>      auto m = cast(void[])msg;
>>      size_t length = m.length;
>>      data[0..length] = m[0..$];
>>      msg.length = 0;
>>      return length;
>> };
>> ----------------------------------------
>>
>> doesn't look very good, I'd write it like this:
>> ----------------------------------------
>> http.onSend = (void[] data)
>> {
>>      if (msg.empty)
>>          return 0;
>>
>>      auto m = cast(void[])msg;
>>      size_t length = m.length;
>>
>>      //We assume msg fits into the data buffer, real code shouldn't
>> do //that
>>      assert(data.length>  msg.length);
>>      data[0..length] = m[0..$];
>>      return length;
>> };
>> ----------------------------------------
>>
>> ++++++++++++++++++++++   Feature request   ++++++++++++++++++++++
>>
>> Although David already quoted my request in the initial post (thx
>> David), I'll repeat it here:
>>
>> Could you add proxy support to the
>> wrapper? All of CURLOPT_PROXY CURLOPT_PROXYPORT and CURLOPT_PROXYTYPE
>> should be available. (The 'combined' CURLOPT_PROXY since 7.21.7 is
>> too new, that's not even in the current ubuntu)
>>
>
>Thank you. Will chew through it slowly.
>
>/Jonas
>
>
>

Thanks, I know it's a lot, but most things are small cosmetic changes
or typos or similar.

-- 
Johannes Pfau



More information about the Digitalmars-d mailing list