some regex vs std.ascii vs handcode times
Juan Manuel Cabo
juanmanuel.cabo at gmail.com
Tue Mar 20 22:06:23 PDT 2012
On Monday, 19 March 2012 at 17:23:36 UTC, Andrei Alexandrescu wrote:
[.....]
>
> I wanted for a long time to improve byLine by allowing it to do
> its own buffering. That means once you used byLine it's not
> possible to stop it, get back to the original File, and
> continue reading it. Using byLine is a commitment. This is what
> most uses of it do anyway.
Great!! Perhaps we don't have to choose. We may have both!!
Allow me to suggest:
    byLineBuffered(bufferSize, keepTerminator);
    or byLineOnly(bufferSize, keepTerminator);
    or byLineChunked(bufferSize, keepTerminator);
    or byLineFastAndDangerous :-) hahah :-)
Or the other way around:
    byLine(keepTerminator, underlyingBufferSize);
renaming the current one to:
    byLineUnbuffered(keepTerminator);
Other ideas (I think I read them somewhere in a discussion of
this same byLine topic):
* I think it'd be cool if 'line' could be a slice of the
  underlying buffer when possible, if buffering is added.
* Another good idea would be a new argument, maxLineLength,
  so that one can avoid reading and allocating the whole
  file into one big line string if there are no newlines
  in the file, and one knows the max length desired.
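A minimal sketch of those two ideas, just to make them concrete. The
helper name sliceLines and its parameters are hypothetical (nothing like
it exists in Phobos), and it works on a buffer that is already in memory,
so it leaves out refilling the buffer from a File. Every returned line is
a slice of the buffer, and a line longer than maxLineLength is cut into
pieces instead of growing without limit.

```d
import std.stdio : writeln;
import std.string : indexOf;

// Hypothetical helper, not part of Phobos: splits one already-filled
// buffer into lines. Each line is a slice of `buffer`, so no characters
// are copied. A line longer than maxLineLength is cut into pieces of at
// most maxLineLength characters.
const(char)[][] sliceLines(const(char)[] buffer, size_t maxLineLength)
{
    assert(maxLineLength > 0);
    const(char)[][] lines;
    while (buffer.length > 0)
    {
        auto nl = buffer.indexOf('\n');   // -1 when no newline is left
        size_t end = nl < 0 ? buffer.length : cast(size_t) nl;
        if (end > maxLineLength)
        {
            // Line too long: emit a capped piece and keep scanning.
            lines ~= buffer[0 .. maxLineLength];
            buffer = buffer[maxLineLength .. $];
            continue;
        }
        lines ~= buffer[0 .. end];
        // Skip the terminator if there was one.
        buffer = nl < 0 ? buffer[end .. $] : buffer[end + 1 .. $];
    }
    return lines;
}

void main()
{
    string text = "first\nsecond line is long\nlast";
    foreach (line; sliceLines(text, 8))
        writeln(line);
}
```

The slices are only valid for as long as the buffer is not overwritten,
so a real range built this way would have the same "copy it if you want
to keep it" rule that byLine has today.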
--jm
>
>> Ok, this was the good surprise. Reading by chunks was faster than
>> reading the whole file, by several ms.
>
> What may be at work here is cache effects. Reusing the same 1MB
> may place it in faster cache memory, whereas reading 20MB at
> once may spill into slower memory.
>
>
> Andrei
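For reference, the two reading strategies compared above can be sketched
like this. It is only an illustration under assumptions: the function
names, the temporary file name and the chunk size are made up for the
example, and counting newlines stands in for whatever work is done per
byte. File.byChunk reuses one buffer of the given size across
iterations, while std.file.read allocates the whole file at once.

```d
import std.file : read, remove, tempDir, write;
import std.path : buildPath;
import std.stdio : File;

// Reads the file in fixed-size chunks; byChunk reuses the same buffer
// on every iteration, so only chunkSize bytes are live at a time.
size_t countNewlinesChunked(string path, size_t chunkSize)
{
    size_t n;
    foreach (ubyte[] chunk; File(path, "rb").byChunk(chunkSize))
        foreach (b; chunk)
            if (b == '\n')
                ++n;
    return n;
}

// Reads the whole file into one freshly allocated array, then scans it.
size_t countNewlinesWhole(string path)
{
    size_t n;
    foreach (b; cast(const(ubyte)[]) read(path))
        if (b == '\n')
            ++n;
    return n;
}

void main()
{
    // Hypothetical scratch file, created only for this example.
    auto path = buildPath(tempDir(), "byline_chunk_sketch.txt");
    write(path, "a\nb\nc\n");
    scope (exit) remove(path);

    // Both strategies must agree on the result; only the memory
    // access pattern differs.
    assert(countNewlinesChunked(path, 4) == countNewlinesWhole(path));
}
```

Timing the two functions on a large file with different chunk sizes
would be one way to check whether the cache explanation holds on a
given machine.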