Garbage collection, and practical strategies to avoid allocation

Fri May 31 21:16:30 PDT 2013

On Saturday, June 01, 2013 04:47:39 Brad Anderson wrote:
> I played around with adding an overload that accepted an output
> range to some of the std.string functions identified in my run of
> -vgc over phobos[1] (after Jonathan pointed out this is probably
> the best approach and is already what formattedWrite does).  It
> worked fine but it did make me realize there aren't a lot of
> output ranges available to plug in at the moment (appender and
> lockingTextWriter are the only two that come to mind though there
> may be others).  Appender isn't useful if your goal is to avoid
> the GC.  Array!char et al aren't output ranges (whether they
> should be or not I have no idea).  static arrays would need some
> sort of wrapper to make them output ranges I believe unless it
> was decided that put() should work by replacing the front and
> calling popFront for them (which I kind of doubt is the desired
> behavior).
> 
> (feel free to correct me on any of this, range experts)
> 
> 1. http://goo.gl/HP78r

Dynamic arrays are output ranges. The one potential hitch is that they get
written to rather than appended to. That is actually exactly what you want in
a situation like Manu's, but it means that you have to worry about an output
range running out of space and how you deal with that.
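
To make the "written to rather than appended to" point concrete, here is a
minimal sketch. writeAll is a made-up helper for illustration, not something
in Phobos, and it assumes the caller has already made sure the sink is large
enough:

```d
import std.range.primitives : put;

/// Hypothetical helper (not part of Phobos): writes every value into `sink`
/// and returns whatever part of `sink` has not been written to yet.
/// Assumes sink.length >= values.length; nothing here checks for overflow.
int[] writeAll(int[] sink, const(int)[] values)
{
    foreach (v; values)
        put(sink, v); // overwrites the first element, then drops it from the slice
    return sink;      // the slice has shrunk by values.length elements
}
```

If sink is a slice of a stack buffer such as int[8] storage, nothing is
allocated, and the written portion is storage[0 .. $ - rest.length], where
rest is the returned slice.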

If it's known how much will need to be written, presumably you can check
length if hasLength!R is true. Otherwise, I guess that the right thing to do
is to check empty (arrays get shrunk as they're written to, so they'll be
empty when you can't call put on them anymore). Unfortunately, put doesn't
seem to worry about the case where the output range is full/empty, so the
result when calling put on an empty range is undefined. The situation is even
worse with narrow strings (assuming that put works with them - I'm not sure
that it does at the moment) given that even if you knew their length (which
you wouldn't if you were going by hasLength), you wouldn't know whether a put
would succeed when the string was nearly empty, as the actual number of
code units that the dchar would take up would depend on its value.
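
The narrow string problem can be sketched like this. fits is a made-up name
for illustration; the only library call is std.utf.codeLength, which reports
how many code units a given dchar needs in the target encoding:

```d
import std.utf : codeLength;

/// Hypothetical helper (not part of Phobos): true if `c`, encoded as UTF-8,
/// would fit in what is left of `sink`.
bool fits(const(char)[] sink, dchar c)
{
    // A code point takes 1 to 4 UTF-8 code units depending on its value,
    // so the remaining length alone does not say how many more puts will work.
    return codeLength!char(c) <= sink.length;
}
```

With two code units left, 'a' (one unit) and '\u00E9' (two units) fit, but
'\u20AC' (three units) does not, even though the range is not empty.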

In general, I don't think that output ranges have really been sorted out on 
quite the level that input ranges have been, and I think that some discussion 
is in order with regard to how to handle things like when the range can't be 
put into anymore. Given that one reason to use output ranges is for 
performance-critical code that doesn't want to allocate, throwing when the 
range is empty is probably a bad idea, and it's unclear that we can reasonably 
determine in the general case whether you can put to an output range before 
you actually try it. One solution to that would be to make put return a bool
indicating whether it succeeded or not, but it's an issue that definitely
needs to be explored a bit.
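
For arrays at least, the bool-returning idea could look roughly like the
following. tryPut is a made-up name, not an existing or proposed Phobos
function, and this sketch only covers the array case, where empty really does
tell you that no more elements can be written:

```d
import std.range.primitives : empty, put;

/// Hypothetical helper (not part of Phobos): writes `value` into `sink` if
/// there is room and reports whether it did, instead of throwing or
/// asserting when the range is full.
bool tryPut(T)(ref T[] sink, T value)
{
    if (sink.empty)
        return false;  // full: nothing written, sink left unchanged
    put(sink, value);  // writes to the front and shrinks the slice by one
    return true;
}
```

The caller then decides what to do when it gets false back (stop, flush, or
switch to a bigger buffer) without any exception being involved.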

So, I very much think that the correct thing is to use output ranges, but how 
to use them needs to be better defined, and we probably need to better sort out 
some of their details (like the finish function which the hash stuff uses).

- Jonathan M Davis