Decision on container design

Tue Feb 1 09:08:26 PST 2011

On Tue, 01 Feb 2011 11:44:36 -0500, Michel Fortin  
<michel.fortin at michelf.com> wrote:

> On 2011-02-01 11:12:13 -0500, Andrei Alexandrescu  
> <SeeWebsiteForEmail at erdani.org> said:
>
>> On 1/28/11 8:12 PM, Michel Fortin wrote:
>>> On 2011-01-28 20:10:06 -0500, "Denis Koroskin" <2korden at gmail.com>  
>>> said:
>>>
>>>> Unfortunately, this design has big issues:
>>>>   void fill(Appender appender)
>>>> {
>>>> appender.put("hello");
>>>> appender.put("world");
>>>> }
>>>>  void test()
>>>> {
>>>> Appender<string> appender;
>>>> fill(appender); // Appender is supposed to have reference semantics
>>>> assert(appender.length != 0); // fails!
>>>> }
>>>>  Asserting above fails because at the time you pass appender object to
>>>> the fill method it isn't initialized yet (lazy initialization). As
>>>> such, a null is passed, creating an instance at first appending, but
>>>> the result isn't seen to the caller.
>>>  That's indeed a problem. I don't think it's a fatal flaw however,  
>>> given
>>> that the idiom already exists in AAs.
>>>  That said, the nice thing about my proposal is that you can easily  
>>> reuse
>>> the Impl to create a new container to build a new container wrapper  
>>> with
>>> the semantics you like with no loss of efficiency.
>>>  As for the case of Appender... personally in the case above I'd be
>>> tempted to use Appender.Impl directly (value semantics) and make fill
>>> take a 'ref'. There's no point in having an extra heap allocation,
>>> especially if you're calling test() in a loop or if there's a good
>>> chance fill() has nothing to append to it.

I've been thinking of making Appender.Impl public or at least its own  
type.  A stack-based appender makes a lot of sense when you are using it  
temporarily to build an array.

But array-based containers really are in a separate class from node-based  
containers.  It's tempting to conflate the two because they are both  
'containers', but arrays allow many more optimizations/features that  
node-based containers simply can't do.

>>  Yep, yep, I found myself wrestling with the same issues. All good  
>> points. On one hand containers are a target for optimization because  
>> many will use them. On the other hand you'd want to have reasonably  
>> simple and idiomatic code in the container implementation because you  
>> want people to understand them easily and also to write their own. I  
>> thought for a while of a layered approach in which you'd have both the  
>> value and the sealed reference version of a container... it's just too  
>> much aggravation.
>
> But are you not just pushing the aggravation elsewhere? If I need a by  
> value container for some reason (performance or semantics) I'll have to  
> write my own, and likely others will write their own too.

foo(container.dup) ; // value semantics

I'm sure some template guru can make a wrapper type for this for the rare  
occasions that you need it.  We can work on solving the  
"auto-initialization" issue (where a nested container 'just works'), I  
think there are ways to do it.

If that still doesn't help for your issues, then writing your own may be  
the only valid option.

>
> Using classes for containers is just marginally better than making them  
> by-value structs: you can use 'new' with a by-value struct if you want  
> it to behave as a class-like by-reference container:
>
> 	struct Container {
> 		...
> 	}
>
> 	auto c = new Container();
>
> The only noticeable difference from a class container is that now c is  
> now a Container*.

And doesn't get cleaned up by the GC properly.  Plus, each member call  
must check if the container is 'instantiated', since we can have no  
default ctors.

Yes, it's a trade-off, and I think by far class-based containers win for  
the common case.

>>> Personally, I'm really concerned by the case where you have a container
>>> of containers. Class semantics make things really complicated as you
>>> always have to initialize everything in the container explicitly; value
>>> semantics makes things semantically easier but quite inefficient as
>>> moving elements inside of the outermost container implies copying the
>>> containers. Making containers auto-initialize themselves on first use
>>> solves the case where containers are references-types; making  
>>> containers
>>> capable of using move semantics solves the problem for value-type
>>> containers.
>>  Neither values nor references are perfect indeed. For example, someone  
>> mentioned, hey, in STL I write set< vector<double> > and it Just  
>> Works(tm). On the other hand, if you swap the two names it still seems  
>> to work but it's awfully inefficient (something that may trip even  
>> experienced developers).
>
> Isn't that solved by C++0x, using move semantics in swap?
>

swap isn't the problem.

foreach(s; myVectorSet)
{
    // if s is by value, it must be copied for each iteration in the loop
}

-Steve