Decision on container design
Steven Schveighoffer
schveiguy at yahoo.com
Tue Feb 1 09:08:26 PST 2011
On Tue, 01 Feb 2011 11:44:36 -0500, Michel Fortin
<michel.fortin at michelf.com> wrote:
> On 2011-02-01 11:12:13 -0500, Andrei Alexandrescu
> <SeeWebsiteForEmail at erdani.org> said:
>
>> On 1/28/11 8:12 PM, Michel Fortin wrote:
>>> On 2011-01-28 20:10:06 -0500, "Denis Koroskin" <2korden at gmail.com>
>>> said:
>>>
>>>> Unfortunately, this design has big issues:
>>>> void fill(Appender appender)
>>>> {
>>>> appender.put("hello");
>>>> appender.put("world");
>>>> }
>>>> void test()
>>>> {
>>>> Appender<string> appender;
>>>> fill(appender); // Appender is supposed to have reference semantics
>>>> assert(appender.length != 0); // fails!
>>>> }
>>>> Asserting above fails because at the time you pass appender object to
>>>> the fill method it isn't initialized yet (lazy initialization). As
>>>> such, a null is passed, creating an instance at first appending, but
>>>> the result isn't seen to the caller.
>>> That's indeed a problem. I don't think it's a fatal flaw however,
>>> given
>>> that the idiom already exists in AAs.
>>> That said, the nice thing about my proposal is that you can easily
>>> reuse
>>> the Impl to create a new container to build a new container wrapper
>>> with
>>> the semantics you like with no loss of efficiency.
>>> As for the case of Appender... personally in the case above I'd be
>>> tempted to use Appender.Impl directly (value semantics) and make fill
>>> take a 'ref'. There's no point in having an extra heap allocation,
>>> especially if you're calling test() in a loop or if there's a good
>>> chance fill() has nothing to append to it.
I've been thinking of making Appender.Impl public or at least its own
type. A stack-based appender makes a lot of sense when you are using it
temporarily to build an array.
But array-based containers really are in a separate class from node-based
containers. It's tempting to conflate the two because they are both
'containers', but arrays allow many more optimizations/features that
node-based containers simply can't do.
>> Yep, yep, I found myself wrestling with the same issues. All good
>> points. On one hand containers are a target for optimization because
>> many will use them. On the other hand you'd want to have reasonably
>> simple and idiomatic code in the container implementation because you
>> want people to understand them easily and also to write their own. I
>> thought for a while of a layered approach in which you'd have both the
>> value and the sealed reference version of a container... it's just too
>> much aggravation.
>
> But are you not just pushing the aggravation elsewhere? If I need a by
> value container for some reason (performance or semantics) I'll have to
> write my own, and likely others will write their own too.
foo(container.dup) ; // value semantics
I'm sure some template guru can make a wrapper type for this for the rare
occasions that you need it. We can work on solving the
"auto-initialization" issue (where a nested container 'just works'), I
think there are ways to do it.
If that still doesn't help for your issues, then writing your own may be
the only valid option.
>
> Using classes for containers is just marginally better than making them
> by-value structs: you can use 'new' with a by-value struct if you want
> it to behave as a class-like by-reference container:
>
> struct Container {
> ...
> }
>
> auto c = new Container();
>
> The only noticeable difference from a class container is that now c is
> now a Container*.
And doesn't get cleaned up by the GC properly. Plus, each member call
must check if the container is 'instantiated', since we can have no
default ctors.
Yes, it's a trade-off, and I think by far class-based containers win for
the common case.
>>> Personally, I'm really concerned by the case where you have a container
>>> of containers. Class semantics make things really complicated as you
>>> always have to initialize everything in the container explicitly; value
>>> semantics makes things semantically easier but quite inefficient as
>>> moving elements inside of the outermost container implies copying the
>>> containers. Making containers auto-initialize themselves on first use
>>> solves the case where containers are references-types; making
>>> containers
>>> capable of using move semantics solves the problem for value-type
>>> containers.
>> Neither values nor references are perfect indeed. For example, someone
>> mentioned, hey, in STL I write set< vector<double> > and it Just
>> Works(tm). On the other hand, if you swap the two names it still seems
>> to work but it's awfully inefficient (something that may trip even
>> experienced developers).
>
> Isn't that solved by C++0x, using move semantics in swap?
>
swap isn't the problem.
foreach(s; myVectorSet)
{
// if s is by value, it must be copied for each iteration in the loop
}
-Steve
More information about the Digitalmars-d
mailing list