etc.curl: Formal review begin
Andrei Alexandrescu
SeeWebsiteForEmail at erdani.org
Mon Aug 29 11:55:33 PDT 2011
On 8/17/11 6:12 PM, David Nadlinger wrote:
> Now that Lars Kyllingstad's new and improved std.path has passed the
> vote – congratulations, Lars! –, and Jose Armando Garcia, the author of
> the proposed logging module, is currently not available, the etc.curl
> module by Jonas Drewsen is at the front of the review queue. I have
> volunteered to run the formal process for inclusion with Phobos.
[snip]
My review of etc.curl follows.
================
Overall
================
The library is comprehensive but does not cater to frequent use cases.
It could be metaphorically compared with a remote control with
equally-sized and spaced buttons, as opposed to one with enlarged
frequent use buttons.
There are two prominent use cases:
1. Given a URL string gimme the content (binary or string). The URL
could be even "file://".
2. Given a URL as in (1) gimme a method to go through its content one
chunk or one line at a time. The method should be efficient (i.e.
support asynchrony transparently such that client code and downloading
code don't block each other).
I see these cases together covering a vast majority of use cases; (1)
would be for quick and dirty scripts, (2) would be for applications that
do casual networking. The only apps that need advanced APIs would be
those for which networking is a central feature.
Neither of these use cases is covered adequately by the proposed API.
Furthermore, the documentation is also inadequate. Description is
scarce, cross-references are missing, and examples are poor and scarce.
A major overhaul is needed for acceptance in Phobos.
Finally, I found a couple of egregious bugs related to buffer overflow,
both in the examples and the implementation. There should be significant
scrutiny of such. Ideally we'd frame these issues as an API design
problem that we may be able to design around.
My vote is contingent upon fixing these large issues. Below are more
details along with a variety of smaller comments.
================
Documentation
================
* Should be std.curl.
* "Curl client functionality as provided by libcurl." -> "Networking
client functionality as provided by libcurl."
* Code inlined within text should use $(D ...).
* The first example is meaningless:
Http.get("http://www.google.com").connectTimeout(dur!"seconds"(10)).toString();
It gets a string but throws it away. At best it's a test that google.com
is up (it would throw otherwise).
* "Http http = Http("http://www.google.com");" -> "auto http =
Http("http://www.google.com");" No stuttering please. Also Http's
constructor should imply "http://" if not present, so "auto http =
Http("www.google.com");"
* "foreach (line; Http.getAsync("http://www.google.com").byLine())" is
rather verbose considering it may be the most frequent use pattern. I
was hoping for e.g. "foreach (line; Http.byLine("google.com"))"
* This comment is inaccurate:
// Do some expensive processing on the line while more lines are
// received asynchronously in the background.
The idea of async in here is seldom to do expensive computation, but
rather to offer incremental processing and output (and sometimes to stop
early).
* In the "put with data senders" it's unclear what the length of data is
and what happens if that length is shorter than msg.length. Setting
contentLength to 11 is awkward. Surely there must be an easier API for
sending some string up the wire.
* "Smtp smtp = Smtp("smtps://smtp.gmail.com");" -> "auto ..."
* "This documentation should really be in the Http..." -> we need to
find a solution to this prior to approval. Perhaps version(StdDdoc) may
be of help.
* Documentation for individual symbol lacks examples. For example, one
has no idea what requestPause is about, and there's no cross-reference
sending to the function that uses or returns it. Each symbol or symbol
group should have examples.
* Documentation is extremely scarce and uses terms without defining
them. It's reasonable to assume what escape() and unescape() do in such
a module, but for example cleanup() says "Stop and invalidate this
instance." which gives no indication of the meaning of invalidation.
Also, isStopped() checks whether the object is "stopped and invalidated"
but effecting stopping and invalidation is not done by stop(), but by
cleanup().
* verbose, dataTimeout etc. don't have corresponding properties for reading.
* netInterface takes a string but not found ubytes (or an IP struct).
this means someone who has the IP in another format must format it as a
string first.
* localPort() should take a ushort, not an int.
* Phrases such as "tcp nodelay socket option" should link to relevant
off-site documentation or a glossary entry.
* onSend example is unindented, uses "l" as a symbol, and is liable to
buffer overflow, which is, to be brutally honest, appalling.
* The example for onReceive should use to!(const(char)[]) instead of the
cast, such that the UTF validity is checked.
* Since neither value in onProgress can be fractional, the type should
be size_t or ulong.
* Due to probably a bug in ddoc, the members of struct Http are not
indented.
* The informal examples given with addHeader and setHeader should also
be supported by code examples.
* Example for head() misses a semicolon.
* Documentation for post() makes it easy to overlook the important
distinction between the two overloads. The documentation should explain
how one is meant for text upload and the other for binary upload.
* optionsAsync has an extra paren.
* The various HTTP primitives wrapped (del, connect, ...) should include
links to their definition in the documentation.
* The dox for the two postData() overloads should be merged.
* Until the documentation for perform() comes about towards the end of
the dox, it's unclear that a lot of work is done by it. So someone who
wants to understand the library without having first learned curl would
need to infer this fact from the scarce examples.
* Again, many properties have setters but not getters (e.g. maxRedirects).
* The word "simply" is overused.
================
Implementation
================
* We usually import modules in alphabetical order.
* The frequent pattern if (!expr) throw new CurlException(msg); should
be replaced by enforce(expr, new CurlException(msg));
* throwOnStopped should accept a defaulted string. There are several
instances of the pattern if (stopped) throw new CurlException("...");
* I think we can in the future replace curl_easy_(un)escape with more
efficient native functions.
* This kind of stuff is unsafe: cast(char*)(username ~ ":" ~ password)
because the result of the concatenation is not necessarily
zero-terminated. Please make a pass through all cast instances making
sure the code appends a \0.
* Some lines are long enough to not be visible in git's code viewer.
* More buffer overflow opportunities, e.g. in line 1798: "if
(header[0..5] == "HTTP/")" should be "if (header.startsWith("HTTP/"))"
Andrei
More information about the Digitalmars-d
mailing list