etc.curl: Formal review begin

jdrewsen jdrewsen at nospam.com
Mon Aug 29 13:46:08 PDT 2011


Den 29-08-2011 20:55, Andrei Alexandrescu skrev:
> On 8/17/11 6:12 PM, David Nadlinger wrote:
>> Now that Lars Kyllingstad's new and improved std.path has passed the
>> vote – congratulations, Lars! –, and Jose Armando Garcia, the author of
>> the proposed logging module, is currently not available, the etc.curl
>> module by Jonas Drewsen is at the front of the review queue. I have
>> volunteered to run the formal process for inclusion with Phobos.
> [snip]
>
> My review of etc.curl follows.
>
> ================
> Overall
> ================
>
> The library is comprehensive but does not cater to frequent use cases.
> It could be metaphorically compared with a remote control with
> equally-sized and spaced buttons, as opposed to one with enlarged
> frequent use buttons.

Others have mentioned this as well. I'll try to improve.

> Neither of these use cases is covered adequately by the proposed API.
>
> Furthermore, the documentation is also inadequate. Description is
> scarce, cross-references are missing, and examples are poor and scarce.
> A major overhaul is needed for acceptance in Phobos.

I guess I'm assuming that the user knows something about curl. Will have 
a look at it.

> ================
> Documentation
> ================
>
> * Should be std.curl.
ok

> * "Curl client functionality as provided by libcurl." -> "Networking
> client functionality as provided by libcurl."
ok

> * Code inlined within text should use $(D ...).
ok

> * The first example is meaningless:
>
> Http.get("http://www.google.com").connectTimeout(dur!"seconds"(10)).toString();
Someone else mentioned that as well.

> * "Http http = Http("http://www.google.com");" -> "auto http =
> Http("http://www.google.com");" No stuttering please. Also Http's
> constructor should imply "http://" if not present, so "auto http =
> Http("www.google.com");"
ok

> * "foreach (line; Http.getAsync("http://www.google.com").byLine())" is
> rather verbose considering it may be the most frequent use pattern. I
> was hoping for e.g. "foreach (line; Http.byLine("google.com"))"

I could easily create a convenience Http.byLine("google.com") method 
that simply wraps Http.get("google.com").byLine().
I think that keeping the Http.get("google.com") method is still 
important since it allows you to change the request settings before 
calling byLine(). For example:

Http.get("google.com").connectTimeout(dur!"seconds"(10)).byLine();


> * This comment is inaccurate:
>
> // Do some expensive processing on the line while more lines are
> // received asynchronously in the background.
>
> The idea of async in here is seldom to do expensive computation, but
> rather to offer incremental processing and output (and sometimes to stop
> early).
ok

> * In the "put with data senders" it's unclear what the length of data is
> and what happens if that length is shorter than msg.length. Setting
> contentLength to 11 is awkward. Surely there must be an easier API for
> sending some string up the wire.
That example it certainly broken. Anyway - the data is the buffer that 
curl want to be filled with data to send. The length of data is how 
large that buffer is of course.
As specified it is only necessary to set the content length if you do 
not want to used chunked transfer. Most often you wouldn't care.
Maybe I should just remove the contentLength from the example.

> * "Smtp smtp = Smtp("smtps://smtp.gmail.com");" -> "auto ..."
ok

> * "This documentation should really be in the Http..." -> we need to
> find a solution to this prior to approval. Perhaps version(StdDdoc) may
> be of help.
ok

> * Documentation for individual symbol lacks examples. For example, one
> has no idea what requestPause is about, and there's no cross-reference
> sending to the function that uses or returns it. Each symbol or symbol
> group should have examples.
ok

> * Documentation is extremely scarce and uses terms without defining
> them. It's reasonable to assume what escape() and unescape() do in such
> a module, but for example cleanup() says "Stop and invalidate this
> instance." which gives no indication of the meaning of invalidation.
> Also, isStopped() checks whether the object is "stopped and invalidated"
> but effecting stopping and invalidation is not done by stop(), but by
> cleanup().
agreed

> * verbose, dataTimeout etc. don't have corresponding properties for
> reading.
I general all the settings are simply set directly in libcurl itself. 
Libcurl only has a function for setting options and not one for getting 
them. This means that I would have to keep a copy of the setting myself 
just to make it readable which i judged was not the way to go. This is 
why they are read only. This is one of the reasons why I see a wrapper 
for libcurl as just temporary solution to a native networking library i D.

> * netInterface takes a string but not found ubytes (or an IP struct).
> this means someone who has the IP in another format must format it as a
> string first.
This is the libcurl API reflected. I've got no problem with overloading 
the netInterface to accept ubyte or an IP struct as well.

> * localPort() should take a ushort, not an int.
Again just reflecting libcurl. But will make the cast to make it cleaner.

> * Phrases such as "tcp nodelay socket option" should link to relevant
> off-site documentation or a glossary entry.
ok

> * onSend example is unindented, uses "l" as a symbol, and is liable to
> buffer overflow, which is, to be brutally honest, appalling.
Agreed. Embarressing.

> * The example for onReceive should use to!(const(char)[]) instead of the
> cast, such that the UTF validity is checked.
ok

> * Since neither value in onProgress can be fractional, the type should
> be size_t or ulong.
You are not the first to mention it. Again it is what the libcurl API 
provides. But I will fix it.

> * Due to probably a bug in ddoc, the members of struct Http are not
> indented.
Noticed. I'll see if it can be worked around.

> * The informal examples given with addHeader and setHeader should also
> be supported by code examples.
ok

> * Example for head() misses a semicolon.
ok

> * Documentation for post() makes it easy to overlook the important
> distinction between the two overloads. The documentation should explain
> how one is meant for text upload and the other for binary upload.
ok

> * optionsAsync has an extra paren.
ok

> * The various HTTP primitives wrapped (del, connect, ...) should include
> links to their definition in the documentation.
ok

> * The dox for the two postData() overloads should be merged.
ok

> * Until the documentation for perform() comes about towards the end of
> the dox, it's unclear that a lot of work is done by it. So someone who
> wants to understand the library without having first learned curl would
> need to infer this fact from the scarce examples.
I guess I'll have to move it to the top.

> * Again, many properties have setters but not getters (e.g. maxRedirects).
see comment about "verbose, dataTimeout..."

> * The word "simply" is overused.
ok

> ================
> Implementation
> ================
>
> * We usually import modules in alphabetical order.
ok

> * The frequent pattern if (!expr) throw new CurlException(msg); should
> be replaced by enforce(expr, new CurlException(msg));
ok

> * throwOnStopped should accept a defaulted string. There are several
> instances of the pattern if (stopped) throw new CurlException("...");
ok

> * I think we can in the future replace curl_easy_(un)escape with more
> efficient native functions.
probably. will put it in the "Possible improvements" section in the top 
of the source.

> * This kind of stuff is unsafe: cast(char*)(username ~ ":" ~ password)
> because the result of the concatenation is not necessarily
> zero-terminated. Please make a pass through all cast instances making
> sure the code appends a \0.
ok

> * Some lines are long enough to not be visible in git's code viewer.
I've got a commit ready already with a lot of suggested fixes/changes 
from other review comment. One if these is limiting lines to max 80-120 
char.
When this first review ends on wednesday I'll commit that and begin 
handling the rest of the review comments.

> * More buffer overflow opportunities, e.g. in line 1798: "if
> (header[0..5] == "HTTP/")" should be "if (header.startsWith("HTTP/"))"
ok

> Andrei

Thank you for you comments.


More information about the Digitalmars-d mailing list