etc.curl: Formal review begin

Wed Aug 31 12:15:07 PDT 2011

On 31.08.2011 at 20:08, Andrei Alexandrescu
<SeeWebsiteForEmail at erdani.org> wrote:

> On 8/31/11 11:39 AM, Marco Leise wrote:
>> On 31.08.2011 at 00:07, Andrei Alexandrescu
>> <SeeWebsiteForEmail at erdani.org> wrote:
>>
>>> On 8/30/11 3:34 PM, Marco Leise wrote:
>>>> On 30.08.2011 at 20:48, Andrei Alexandrescu
>>>> <SeeWebsiteForEmail at erdani.org> wrote:
>>>>
>>>>> On 8/30/11 1:10 PM, Jonathan M Davis wrote:
>>>>>> std.file.copy is synchronous. Would the suggestion then be to change
>>>>>> it to be asynchronous or to create a second function (e.g. copyAsync)
>>>>>> which does an asynchronous copy?
>>>>>
>>>>> I think std.file.copy should do whatever the heck is best to copy a
>>>>> file. As such, it should be transparently asynchronous. It would be
>>>>> great if such a task caught your fancy - and don't forget to test
>>>>> speed under a few circumstances.
>>>>>
>>>>> Andrei
>>>>
>>>> I expect the performance to degrade if you copy asynchronously on a
>>>> single HDD, since more seeking is involved.
>>>
>>> Why would more seeking be involved?
>>>
>>> Andrei
>>
>> Usually files are laid out on the disk in more or less contiguous
>> chunks. The naive algorithm would read a large block first and then
>> write it. The disk head only has to move twice for each of these
>> blocks: once to seek to the source sector and once to seek to the
>> destination sector. If on the other hand you have both operations in parallel then -
>> unless the OS does some heavy heuristic optimization - you have the disk
>> head move constantly as the operating system interleaves the long series
>> of reads and writes in order to serve data to both threads.
>
> If I understand the above correctly, the same amount of seeking goes  
> around in both cases. The only difference is that requests come at a  
> faster pace, as they should.
>
> Andrei

That's not what I meant. Let me put it like this: if you read the
file in one call to read and then write the whole thing from your buffer,
you have only 2 of these 'long' seeks. In practice you won't use a 2 GB
buffer for a 2 GB file, but I assume that it would be the fastest
copy mode from and to the same HDD, whereas the multi-threaded approach
would make the OS switch between writing some data for the writer thread
and reading some data for the reader thread, probably several times per
second, and each switch involves a seek.
(Due to IO scheduler optimizations it won't end up like this, though. The OS
will detect your linear read pattern, read a sane amount of data
ahead, and generally optimize disk access to reduce seek times.)
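
To put rough numbers on the read-then-write case, here is a minimal
sketch in Python (used only for illustration; this is not std.file.copy
and not a proposal for its implementation). The names long_seeks and
copy_in_blocks are hypothetical. The model assumes an unfragmented file
on a single HDD, one head move to the source and one to the destination
per block, and no caching, read-ahead or IO scheduler reordering, so it
is closer to an upper bound than a measurement.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def long_seeks(file_size, buffer_size):
    """Estimate 'long' seeks for a sequential read-then-write copy.

    Assumption: each block costs one seek to the source region and one
    seek to the destination region; everything else is ignored.
    """
    if buffer_size <= 0:
        raise ValueError("buffer_size must be positive")
    # Integer ceiling division: number of blocks needed for the file.
    blocks = (file_size + buffer_size - 1) // buffer_size
    return 2 * blocks


def copy_in_blocks(src_path, dst_path, buffer_size):
    """Naive copy: read one large block, write it, repeat.

    Returns the number of blocks copied, so that 2 * blocks matches
    the long_seeks() estimate for the same file and buffer size.
    """
    blocks = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(buffer_size)
            if not chunk:
                break
            dst.write(chunk)
            blocks += 1
    return blocks


if __name__ == "__main__":
    # Whole file in one buffer: the 2-seek case described above.
    print(long_seeks(2 * GIB, 2 * GIB))
    # A more realistic 64 MiB buffer for the same 2 GiB file.
    print(long_seeks(2 * GIB, 64 * MIB))
```

Under these assumptions a 2 GB file copied through a 64 MiB buffer needs
64 long seeks instead of 2, which is still far fewer than a reader and a
writer thread alternating several times per second over the whole copy.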