[phobos] Bug 3763 (How to fix readlnImpl())

David Simcha dsimcha at gmail.com
Thu Feb 4 09:31:28 PST 2010


The other thing is, how many people are really going to use readln(char[]
buf) directly instead of using the version that returns a newly allocated
string or using byLine()?  If they're using the version that returns a newly
allocated string, this is a non-issue.  If they're using byLine(), then a
lot of the logic for recycling buffers could be moved there, where it is
safer because it is well-encapsulated.

On Thu, Feb 4, 2010 at 12:02 PM, Andrei Alexandrescu <andrei at erdani.com>wrote:

> So it seems like due to the low-level APIs that readlnImpl calls, it flies
> under the radar of Steve's detection.
>
> I think using an Appender with the useExistingBuffer primitive would work,
> but would be suboptimal - each read can only use as many characters as read
> the last time, which would trigger quite a few allocation (whenever the line
> size increases). Please let me know whether this assessment is correct.
>
> A possible solution would be to cache the last buffer and its maximum
> length in static variables inside readlnImpl. I think it's reasonable to say
> that readln is the owner of the char[] and you shouldn't take portions of it
> and expect they won't be changed. However, it will not trump over existing
> arrays.
>
> Andrei
>
> David Simcha wrote:
>
>>
>>
>> On Thu, Feb 4, 2010 at 12:10 AM, Andrei Alexandrescu <andrei at erdani.com<mailto:
>> andrei at erdani.com>> wrote:
>>
>>    David Simcha wrote:
>>
>>        I recently filed bug 3673
>>        (http://d.puremagic.com/issues/show_bug.cgi?id=3763) and started
>>        looking at possible fixes.  This is a bug in
>>        std.stdio.readlnImpl().  However, the more I think about it the
>>        more I think the function needs to be completely rethought, not
>>        just patched.  I'm not sure if this is a high priority given
>>        that it has nothing to do with the language spec and TDPL but
>>        it's a pretty embarrassing quality of implementation issue.
>>
>>        Anyhow, readlnImpl() takes a ref char[] and tries to recycle the
>>        memory to read in another line.  In doing so, it queries
>>        GC.capacity for the relevant memory block and resizes the array
>>        to the size of the memory block.  This is arguably unsafe in the
>>        general case because, if someone passes in a slice that starts
>>        at the beginning of a GC block to use as the buffer, everything
>>        else in the same GC block can get overwritten.  This massively
>>        violates the principle of least surprise.  On the other hand,
>>        when encapsulated in the higher level API of byLine(), it's both
>>        safe and efficient.
>>
>>
>>    Could you please give an example?
>>
>>  import std.stdio;
>>
>> void main() {
>>    auto writer = File("foo.txt", "wb");
>>    foreach(i; 0..1000) {
>>        writer.write('a');
>>    }
>>    writer.writeln();
>>    writer.close();
>>
>>    auto chars = new char[500];
>>    chars[] = 'b';
>>    auto buf = chars[0..100];
>>
>>    auto reader = File("foo.txt", "rb");
>>    reader.readln(buf);
>>    writeln(chars[$ - 1]);  // a
>>    assert(chars[$ - 1] == 'b');  // FAILS.
>> }
>>
>>
>> ------------------------------------------------------------------------
>>
>>
>> _______________________________________________
>> phobos mailing list
>> phobos at puremagic.com
>> http://lists.puremagic.com/mailman/listinfo/phobos
>>
> _______________________________________________
> phobos mailing list
> phobos at puremagic.com
> http://lists.puremagic.com/mailman/listinfo/phobos
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.puremagic.com/pipermail/phobos/attachments/20100204/3b2ef485/attachment-0001.htm>


More information about the phobos mailing list