stdio performance in tango, stdlib, and perl
Derek Parnell
derek at nomail.afraid.org
Wed Mar 21 17:44:49 PDT 2007
On Wed, 21 Mar 2007 17:21:40 -0700, Andrei Alexandrescu (See Website For
Email) wrote:
> Derek Parnell wrote:
>>> Also, stdio also offers a readln() that creates a new line on every
>>> call. That is useful if you want fresh lines every read:
>>>
>>> char[] line;
>>> while ((line = readln()).length > 0) {
>>>     ++dictionary[line];
>>> }
>>>
>>> The code _just works_ because an empty line means _precisely_ and
>>> without the shadow of a doubt that the file has ended. (An I/O error
>>> throws an exception, and does NOT return an empty line; that is another
>>> important point.) An API that uses automated chopping should not offer
>>> such a function because an empty line may mean that an empty line was
>>> read, or that it's eof time. So the API would force people to write
>>> convoluted code.
>>
>> By "convoluted", you mean something like this ...
>>
>> char[] line;
>> while ( io.readln(line) == io.Success )
>> {
>>     ++dictionary[line];
>> }
>
> I said that the API would force people to write convoluted code if it
> wanted to offer char[] readln(). Consequently, your code is buggy in the
> likely case io.readln overwrites its buffer, which is mute testimony to
> the validity of my point :o).

Actually, you said "stdio also offers a readln() that creates a new line on
every call", and so does my fictitious "io.readln(line)". It cannot
overwrite its buffer because it creates the buffer.

io.Status readln(out char[] pBuffer)
{
    pBuffer.length = io.FirstGuessLength;
    // Note: This routine expands or contracts the buffer as required.
    fill_the_buffer_with_chars_until_EOL_or_EOF(pBuffer);
    // If I get this far then the low-level I/O system didn't fail me.
    return io.Success;
}

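To make the idea concrete, here is a minimal sketch of a status-returning
reader, written against an in-memory string rather than a real file. The
names Status, LineReader and countLines are hypothetical and are not part of
Phobos or Tango. It assumes '\n' as the only line terminator. The point it
illustrates is that a chopped, empty line and end of input stay
distinguishable because end of input travels in the return value.

```d
// Hypothetical names throughout: Status, LineReader and countLines are
// illustrations only, not an existing library API.
enum Status { success, eof }

struct LineReader
{
    string input;   // in-memory stand-in for the underlying file

    // Hands back a fresh, chopped line on every call. End of input is
    // reported only through the return value, never by an empty buffer.
    Status readln(out char[] buffer)
    {
        if (input.length == 0)
            return Status.eof;

        // Find the end of the current line.
        size_t end = 0;
        while (end < input.length && input[end] != '\n')
            ++end;

        buffer = input[0 .. end].dup;   // new buffer, so nothing is overwritten

        // Skip the terminator, if there was one.
        input = (end < input.length) ? input[end + 1 .. $] : null;
        return Status.success;
    }
}

// Counts how often each line occurs, empty lines included.
size_t[string] countLines(string text)
{
    size_t[string] dictionary;
    auto reader = LineReader(text);
    char[] line;
    while (reader.readln(line) == Status.success)
    {
        auto key = line.idup;
        if (auto count = key in dictionary)
            *count += 1;
        else
            dictionary[key] = 1;
    }
    return dictionary;
}
```

With this shape, the empty lines in the input are counted like any other
line, and the loop still stops at the end of the input.
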
> It should be pointed out that my point generalizes to more than
> newlines. I plan to add to phobos two routines that efficiently and
> atomically implement the following:
>
> read_delim(FILE*, char[] buf, dchar delim);
>
> and
>
> read_delim(FILE*, char[] buf, char delim[]);
>
> For such functions, particularly the last one, it is vital that the
> delimiter is KEPT in the resulting buffer.

And that would be because it stops at the first occurrence of any delimiter
contained in "char[] delim", so the caller needs to know which one stopped
the input stream? I presume that this would support Unicode characters too?
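
A small sketch of that reading of the proposal, under loud assumptions:
readDelim is a hypothetical in-memory stand-in, not the planned Phobos
routine, and it treats the second argument as a set of single-character
delimiters. Because the delimiter stays at the end of the returned chunk, the
caller can inspect the last character to see which one ended the read. It
compares UTF-8 code units, so it only handles single-byte delimiters; a
Unicode-aware version would have to decode first.

```d
// Hypothetical helper: returns the prefix of `input` up to and including
// the first character that appears in `delims`, and advances `input`
// past it. If no delimiter is found, the remainder is returned as is.
string readDelim(ref string input, string delims)
{
    foreach (i, char c; input)
    {
        foreach (char d; delims)
        {
            if (c == d)
            {
                auto chunk = input[0 .. i + 1];   // the delimiter is kept
                input = input[i + 1 .. $];
                return chunk;
            }
        }
    }

    auto rest = input;   // no delimiter left: hand back whatever remains
    input = null;
    return rest;
}
```

A chunk that ends in '=' and a chunk that ends in ';' can then be told apart
without any extra status value, and a chunk with no trailing delimiter means
the input ran out first.
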
--
Derek
(skype: derek.j.parnell)
Melbourne, Australia
"Justice for David Hicks!"
22/03/2007 11:26:34 AM