Hidden allocations (Was: Array literals REALLY should be immutable)

Thu Nov 12 09:15:07 PST 2009

On Thu, 12 Nov 2009 19:49:58 +0300, dsimcha <dsimcha at yahoo.com> wrote:

> == Quote from Denis Koroskin (2korden at gmail.com)'s article
>> I strongly believe that "No hidden allocation" policy should be adopted by
>> D/Phobos (it is already adopted by Tango with a great success).
>
> I can see the value in this, but two issues:
>
> 1.  What counts as a "hidden" allocation?  How non-obvious does it have to
> be that something requires an allocation?  If something really has to
> allocate and it's not obvious from the nature of the function, is it enough
> to just document it?
>

I can't give a formal definition of that, but for me a function is allowed
to allocate if that allocation is returned to the user. If a function
allocates and the memory becomes unreferenced after the function returns,
then the allocation is redundant and should be eliminated.

For example, void mkdirRecurse(string pathname) shouldn't allocate, but it
does, because the author didn't care about allocations when implementing it.

(It invokes mkdir() for each directory in a path, and mkdir allocates a
new string to make sure it ends with \0. Alternatively, a copy of the path
could be created only once, in a stack buffer, and reused by putting \0 in
place of each slash to terminate it. Something like this:

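// untested sketch, POSIX only; calls the C mkdir directly, ignores errors
import core.stdc.stdlib : alloca;
import core.stdc.string : memcpy;
import core.sys.posix.sys.stat : mkdir;

void mkdirRecurse(string path) {
     // one stack copy of the path, with room for the trailing \0
     char* buffer = cast(char*) alloca(path.length + 1);
     memcpy(buffer, path.ptr, path.length);
     buffer[path.length] = 0;

     foreach (i; 1 .. path.length) {   // start at 1 to skip a leading '/'
         if (buffer[i] == '/') {
             buffer[i] = 0;            // terminate at this component
             mkdir(buffer, 0x1FF);     // mode 0777; fails if it already exists
             buffer[i] = '/';          // restore the separator
         }
     }
     mkdir(buffer, 0x1FF);             // create the final component
}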

There are a lot of functions that allocate without a clear reason.)

> 2.  How do you really design high-level library functions if they're not
> allowed to allocate memory?  If you require the user to provide all kinds
> of details about where the memory they use comes from then you lose some of
> the high level-ness and make it seem more like an ugly C API that doesn't
> "just work" and requires attention to the irrelevant the 90% of the time
> that you don't care about an extra allocation.  The solution I personally
> use in my dstats lib, which works pretty well in the limited case of arrays
> of primitives, but might not generalize, is:
>
>     a.  For stuff that returns an array, the last argument to the function
> is an optional buffer.  If it is provided and is big enough, the results
> are returned in it.  If it is not provided or is too small, a new one is
> allocated.
>
>     b.  For temporary buffers used within a function, I use a thread-local
> second stack  (TempAlloc).  While this is not **guaranteed** never to
> result in an allocation (if we're out of space in our current chunk of
> memory, a new one will be allocated), it very seldom does and only when the
> only alternative would be to crash, throw an exception, etc.

Yes, this is a good solution.
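
For illustration, option (a) might look roughly like the sketch below. The
function name squares and its signature are made up for this example and are
not taken from dstats; only the "optional trailing buffer" idea comes from
the description above.

```d
// Hypothetical example (not dstats code): returns the square of each element.
// If `buf` is big enough, the result is written into it and no allocation
// happens; if it is missing or too small, a new array is allocated.
double[] squares(const(double)[] input, double[] buf = null) {
    double[] result = buf.length >= input.length
        ? buf[0 .. input.length]        // reuse the caller's memory
        : new double[input.length];     // fall back to a GC allocation
    foreach (i, x; input)
        result[i] = x * x;
    return result;
}

unittest {
    double[3] input = [1.0, 2.0, 3.0];  // static arrays, so no heap use here
    double[8] storage;

    auto reused = squares(input[], storage[]);
    assert(reused.ptr is storage.ptr);  // result lives in the caller's buffer
    assert(reused == [1.0, 4.0, 9.0]);

    auto fresh = squares(input[]);      // no buffer given: allocates
    assert(fresh.ptr !is storage.ptr);
    assert(fresh == [1.0, 4.0, 9.0]);
}
```

The caller who doesn't care just omits the buffer and gets the "just works"
behaviour; the caller who does care passes a slice of memory they already
own. The allocation, when it happens, is returned to the user, so it isn't
hidden in the sense described earlier.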