etc.curl: Formal review begin

Jonas Drewsen jdrewsen at nospam.com
Thu Aug 25 03:05:19 PDT 2011


On 24/08/11 19.33, Andrei Alexandrescu wrote:
> On 8/24/11 9:35 AM, Walter Bright wrote:
>> On 8/24/2011 2:18 AM, Johannes Pfau wrote:
>>> Walter Bright wrote:
>>>> On 8/17/2011 4:12 PM, David Nadlinger wrote:
>>>>> the etc.curl module by Jonas Drewsen is at the front of the review
>>>>> queue.
>>>>
>>>> Thanks for doing this. I preface my remarks with saying I have never
>>>> done http: programming and know little about it.
>>>>
>>>> 1. The first, most obvious thing I'd like to do is present a URL and
>>>> get a string back that is the contents of that web page:
>>>>
>>>> string contents = getURL("http://www.digitalmars.com");
>>>>
>>>
>>> That's the first example in the API docs:
>>>
>>> // Simple GET with connect timeout of 10 seconds
>>> Http.get("http://www.google.com").connectTimeout(dur!"seconds"(10)).toString();
>>>
>
> I find this flow counterintuitive. It suggests that first a GET is
> issued and then the timeout is set. It takes a bit of getting used to
> things to realize that get() is actually preparing things and that the
> real work is done in toString.

What I was aiming at was something like:

Http.get("http://google.com", connectTimeout = dur!"seconds"(10));

But D does not support named parameters and that is why it ended up this 
way.

I guess renaming to Http.prepareGet("http://google.com") or something 
would better reflect its usage.


>> Ok, but 1. there's no indication in the doc that that's what it does
>> because 1. it is an incomplete fragment and 2. a simpler interface (like
>> what I proposed) would likely be better.
>>
>>
>>> or, if you don't care about timeouts:
>>> string contents = Http.get("http://www.google.com").toString();
>>
>> This example needs to be first and needs an explanation. Timeouts, error
>> recovery, progress indicators, etc., should all be brought up later.
>
> I'll note there's an annoying redundancy: the protocol appears twice,
> once in the URL and again in the type Http. That means if one needs to
> fetch from an URL transparently they'd have to go through e.g.:
>
> string url = readln();
> string contents;
> if (url.startsWith("http://")) {
> contents = Http.get(url).toString();
> } else if (url.startsWith("ftp://")) {
> contents = Ftp.get(url).toString();
> } else {
> enforce(false, "Unrecognized protocol.");
> }
>
> Probably this simple level of abstraction should be provided by the
> library (with the added advantage that client code will automatically
> work with new protocols as the library includes them).

Someone also got confused about the "get()" static method could be 
called on an instance of Http. This is of course not how it is intended 
to be used but never the less possible. I guess forcing the user to 
prepend the class name when calling a static class method would be nice 
to catch such errors. Maybe this should be change in the D language?

Since this is currently not the case I think moving all the 
Http.get(),Http.post()...Ftp.get()...  static methods to module scope 
would solve the issue.

I also think it would be a good idea to provide a function that parses 
the protocol and just performs the curl request as a convenience. This 
cannot replace the Http.get() style functions because they return a 
Result that know what can be done on a specific protocol. For example 
maxRedirects is supported for the Http protocol but not for Ftp/Smtp.


So to summarize:

* Move static convenience methods (Http.get/post/...) to module level 
functions and rename to reflect that they only prepare a request. One of 
toString(),bytes(),byLine(),byChunk(),byLineAsync(),byChunkAsync() will 
have to be called to actually get something.

* Create a new convenience function that parses protocol from url and 
performs a sync version of the request and then returns the result 
(ubyte[] or string). This is not as flexible but probably the most 
common case. For example:

string content = curl("http://mysite.com/form.cgi",
		      "hello world data",
	              "username:password");


Actually would adding opCast!string() and opCast!(ubyte[])() to 
AsyncResult and Result allow for this?:

string content = Http.get("http://mysite.com/form.cgi");
or
ubyte[] content = Http.get("http://mysite.com/form.cgi");


How does this sound?


>
> I'll be back with a thorough review.
>
>
> Andrei



More information about the Digitalmars-d mailing list