etc.curl: Formal review begin

Andrei Alexandrescu SeeWebsiteForEmail at erdani.org
Tue Aug 30 10:38:59 PDT 2011


On 8/30/11 12:22 PM, jdrewsen wrote:
> Walter suggested that I should write an article about using the wrapper.
> I've now taken the first steps on writing such an article. I will have
> to get the library API rock stable before I can finish it though.

I have a suggestion for you - write and test an asynchronous copy program.

It is a continuous source of surprise to me that even seasoned 
programmers don't realize that this is an inefficient copy routine:

while (read(source, buffer))
   write(target, buffer);

If the methods are synchronous and the speeds of source and target are 
independent, the net transfer rate of the routine is R1*R2/(R1+R2), 
where R1 and R2 are the transfer rates of the source and destination 
respectively. In the worst case R1=R2 and the net transfer rate is half 
of either rate.
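
For reference, a sketch of the derivation. It assumes N bytes are
copied in total, that each read must finish before the matching write
starts, and that per-call overhead and OS-level buffering are ignored
(N and the subscripted names are introduced here only for
illustration):

```latex
% Synchronous copy: time spent reading and time spent writing add up.
T_{\text{sync}} = \frac{N}{R_1} + \frac{N}{R_2}
                = \frac{N\,(R_1 + R_2)}{R_1 R_2}
\qquad\Longrightarrow\qquad
R_{\text{sync}} = \frac{N}{T_{\text{sync}}} = \frac{R_1 R_2}{R_1 + R_2}

% Equal speeds: R_1 = R_2 = R gives half of either rate.
R_{\text{sync}}\Big|_{R_1 = R_2 = R} = \frac{R^2}{2R} = \frac{R}{2}

% Overlapped (async) copy: the slower side is the only bottleneck.
R_{\text{async}} = \min(R_1, R_2)
```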

This equation is very easy to derive from first principles, but many 
people are incredulous about it. Consequently, many classic file 
copying programs (including cp; I don't know about wget or curl) use the 
inefficient method. As the variety of data sources increases (SSD, 
magnetic, networked, etc.), I predict async I/O will become increasingly 
prevalent. In an async approach with a queue, transfer proceeds at the 
optimal speed min(R1, R2). That's why I'm insisting the async range 
should be super easy to use, encapsulated, and robust: if people reach 
for the async range by default for their dealings with networked data, 
they'll write optimal code, sometimes even without knowing it.
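
To make the queue-based approach concrete, here is a minimal sketch in
Python rather than D. It is not the etc.curl API; async_copy,
chunk_size and max_chunks are hypothetical names chosen for this
example. A reader thread fills a bounded queue while the calling thread
drains it, so reads and writes overlap:

```python
import queue
import threading


def async_copy(source, target, chunk_size=64 * 1024, max_chunks=16):
    """Copy source to target with overlapping reads and writes.

    source and target are binary file-like objects. All names here are
    illustrative assumptions, not part of any existing library API.
    Returns the number of bytes written.
    """
    chunks = queue.Queue(maxsize=max_chunks)  # bounded: caps memory use
    stop = threading.Event()                  # set if the writer fails
    errors = []                               # errors raised by the reader

    def reader():
        try:
            while not stop.is_set():
                chunk = source.read(chunk_size)
                if not chunk:                 # end of input
                    break
                chunks.put(chunk)             # blocks while the queue is full
        except Exception as exc:
            errors.append(exc)
        finally:
            chunks.put(None)                  # sentinel: no more data

    worker = threading.Thread(target=reader)
    worker.start()

    total = 0
    try:
        # Writer side: runs in the calling thread, concurrently with reader.
        while (chunk := chunks.get()) is not None:
            target.write(chunk)
            total += len(chunk)
    except BaseException:
        # Ask the reader to stop, then drain up to the sentinel so it
        # can never stay blocked on a full queue.
        stop.set()
        while chunks.get() is not None:
            pass
        raise
    finally:
        worker.join()

    if errors:
        raise errors[0]
    return total
```

The queue is bounded on purpose: when the source is faster than the
target, the reader blocks instead of buffering the whole input in
memory, and the copy settles at the speed of the slower side.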

If your article discusses this and shows e.g. how to copy data optimally 
from one server to another using HTTP, or from one server to a file, etc., 
and if furthermore you show how your API makes all that a trivial 
five-liner, that would be a very instructive piece.


Andrei

