[phobos] readlnImpl (Again)

Thu Apr 22 10:05:33 PDT 2010

I was already using std.array.Appender since I fixed this the first time.  I
added the assumeSafeAppend calls in my own source tree, but I didn't commit
it both because I wanted input on whether we still want to allow stomping
and because for some reason its still horribly slow.  Whether Appender still
does what it's supposed to is a good question.

On Thu, Apr 22, 2010 at 12:44 PM, Andrei Alexandrescu <andrei at erdani.com>wrote:

> Guys, I'll let you make sure this is taken care of; I'm busy with the final
> touches for TDPL. Once I let it out of my hand, that will really be it.
>
> Just a brief thought - I hope std.array.Appender still works with the new
> allocation regime. If it does, maybe it could be put to use for readln.
>
>
> Andrei
>
>
> On 04/19/2010 11:32 AM, Steve Schveighoffer wrote:
>
>> This is a tricky situation.  I don't think assumeSafeAppend is the right
>> call. Imagine something like this:
>>
>> char[] data = new char[50];
>> data[40..45] = "hello";
>>
>> f.readln(data[0..40]);
>>
>> now, if the line extends past 40 characters, it will overwrite
>> data[40..45].
>>
>> What I think the safe thing to do is to not use appending until the
>> buffer is exhausted. However, even in that case, I don't think append is
>> the right move. Appending one char at a time, while maybe faster when
>> the buffer can append in place, is still extremely slow compared to just
>> setting a char in an array. A better solution is to simply increment the
>> length of the buffer when needed. A handy function from the runtime
>> would extend an array to the full capacity it could contain.
>>
>> A quick (uncompiled) version of the relevant part:
>>
>> uint used = 0;
>> char[4] encbuf;
>> uint enclen;
>> for (int c; (c = FGETWC(fp)) != -1; )
>> {
>> if ((c & ~0x7F) == 0)
>> {
>> enclen = 1;
>> encbuf[0] = cast(char)c;
>> }
>> else
>> {
>> enclen = std.utf.encode(encbuf, cast(dchar)c);
>> }
>> if(used + enclen > buf.length)
>> {
>> /* this should be a runtime function */
>> buf.length = used + enclen;
>> buf.length = buf.capacity;
>> }
>> buf[used..used+enclen] = encbuf[0..enclen];
>> if (c == terminator)
>> break;
>> }
>> if (ferror(fps))
>> StdioException();
>> return used;
>>
>>
>> This is where a D-specific buffering solution would be extremely useful.
>> Instead of using FGETC, you could simply gain access to the buffer, and
>> search for the terminator using std.algorithm. Tango uses this kind of
>> thing.
>>
>> -Steve
>>
>>
>>    *From:* David Simcha <dsimcha at gmail.com>
>>    *To:* Discuss the phobos library for D <phobos at puremagic.com>
>>    *Sent:* Mon, April 19, 2010 11:30:43 AM
>>    *Subject:* [phobos] readlnImpl (Again)
>>
>>    I've begun to realize that Steve's array stomping patch breaks some
>>    assumptions that std.stdio.readlnImpl made about how array appending
>>    works, leading to it being slower than molasses again. Should I just
>>    put some assumeSafeAppend() calls in there to make it behave as it
>>    used to, even though the old behavior was unsafe, or do we want to
>>    completely redesign this code now that we want to guarantee no
>> stomping?
>>
>>
>>
>>
>> _______________________________________________
>> phobos mailing list
>> phobos at puremagic.com
>> http://lists.puremagic.com/mailman/listinfo/phobos
>>
> _______________________________________________
> phobos mailing list
> phobos at puremagic.com
> http://lists.puremagic.com/mailman/listinfo/phobos
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.puremagic.com/pipermail/phobos/attachments/20100422/3f504a1f/attachment.html>