stdio performance in tango, stdlib, and perl

Wed Mar 21 17:44:49 PDT 2007

On Wed, 21 Mar 2007 17:21:40 -0700, Andrei Alexandrescu (See Website For
Email) wrote:

> Derek Parnell wrote:
>>> Also, stdio also offers a readln() that creates a new line on every 
>>> call. That is useful if you want fresh lines every read:
>>>
>>> char[] line;
>>> while ((line = readln()).length > 0) {
>>>    ++dictionary[line];
>>> }
>>>
>>> The code _just works_ because an empty line means _precisely_ and 
>>> without the shadow of a doubt that the file has ended. (An I/O error 
>>> throws an exception, and does NOT return an empty line; that is another 
>>> important point.) An API that uses automated chopping should not offer 
>>> such a function because an empty line may mean that an empty line was 
>>> read, or that it's eof time. So the API would force people to write 
>>> convoluted code.
>> 
>> By "convoluted", you mean something like this ...
>> 
>>   char[] line;
>>   while ( io.readln(line) == io.Success )
>>   {
>>      ++dictionary[line];
>>   }
> 
> I said that the API would force people to write convoluted code if it 
> wanted to offer char[] readln(). Consequently, your code is buggy in the 
> likely case io.readln overwrites its buffer, which is mute testimony to 
> the validity of my point :o).

Actually you said "stdio also offers a readln() that creates a new line on
every call", and so does my fictitious "io.readln(line)". It cannot
overwrite its buffer because it creates the buffer.

  io.Status readln(out char[] pBuffer)
  {
     pBuffer.length = io.FirstGuessLength;

     // Note: This routine expands or contracts the buffer as required.
     fill_the_buffer_with_chars_until_EOL_or_EOF(pBuffer);

     // If I get this far then the low-level I/O system didn't fail me.
     return io.Success;
  }
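
A more complete sketch of the same idea may help show why the status
return removes the ambiguity: an empty line and end of input come back
through different channels. Everything named here (Status, Source,
readLine, countLines) is made up for illustration and is not part of
Phobos or Tango; the input is an in-memory string rather than a real
file, and only '\n' is treated as a line terminator.

```d
import std.stdio : writeln;

// Hypothetical names throughout: Status, Source and readLine are
// illustrations only, not part of Phobos or Tango.
enum Status { Success, EndOfInput }

// In-memory stand-in for the input stream, so the sketch runs as-is.
struct Source
{
    string text;   // input not yet consumed
}

// Hands back one line, without its terminator, in a fresh buffer.
// End of input is reported by the return value, never by the line.
Status readLine(ref Source src, out char[] line)
{
    if (src.text.length == 0)
        return Status.EndOfInput;

    size_t end = 0;
    while (end < src.text.length && src.text[end] != '\n')
        ++end;

    line = src.text[0 .. end].dup;   // new allocation on every call

    // Step over the terminator if one was found.
    src.text = src.text[(end < src.text.length ? end + 1 : end) .. $];
    return Status.Success;
}

// Counts how often each line occurs, empty lines included.
int[string] countLines(string text)
{
    auto src = Source(text);
    int[string] dictionary;
    char[] line;

    while (readLine(src, line) == Status.Success)
        ++dictionary[line.idup];

    return dictionary;
}

void main()
{
    auto d = countLines("apple\n\napple\npear");
    writeln(d["apple"], " ", d[""], " ", d["pear"]);
}
```

The empty second line is counted under the key "" like any other line,
and the loop ends only when readLine reports EndOfInput.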

> It should be pointed out that my point generalizes to more than 
> newlines. I plan to add to phobos two routines that efficiently and 
> atomically implement the following:
> 
> read_delim(FILE*, char[] buf, dchar delim);
>
> and
> 
> read_delim(FILE*, char[] buf, char delim[]);
> 
> For such functions, particularly the last one, it is vital that the 
> delimiter is KEPT in the resulting buffer.

And that would be because it stops at the leftmost 'delim' that is
contained in "char[] delim" so the caller needs to know which one stopped
the input stream? I presume that this would support Unicode characters too?
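
To make the question concrete, here is a minimal sketch of one possible
reading of the second routine, assuming that "char[] delim" is a set of
candidate delimiters and that matching is done per Unicode code point.
The names readDelim and chunks are hypothetical, and the sketch works on
an in-memory string instead of a FILE*; it is not the proposed Phobos
implementation.

```d
import std.stdio : writeln;
import std.utf : decode;

// Hypothetical helper, not the proposed Phobos routine: returns the
// prefix of `input` up to and including the first code point that
// appears in `delims`, and advances `input` past it.
string readDelim(ref string input, string delims)
{
    size_t pos = 0;
    while (pos < input.length)
    {
        dchar c = decode(input, pos);   // also moves pos past c
        bool hit = false;
        foreach (dchar d; delims)
        {
            if (d == c) { hit = true; break; }
        }
        if (hit)
            break;
    }

    string chunk = input[0 .. pos];
    input = input[pos .. $];
    return chunk;
}

// Splits `text` into chunks, each still carrying its delimiter.
string[] chunks(string text, string delims)
{
    string[] result;
    while (text.length > 0)
        result ~= readDelim(text, delims);
    return result;
}

void main()
{
    foreach (chunk; chunks("a→b=c", "→="))
        writeln(chunk);
}
```

Because the delimiter stays in the chunk, the caller can inspect the
last code point to see which delimiter ended the read, and a chunk that
does not end in any delimiter can only mean the input ran out.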

-- 
Derek
(skype: derek.j.parnell)
Melbourne, Australia
"Justice for David Hicks!"
22/03/2007 11:26:34 AM