Paralysis of analysis

Tue Dec 14 11:02:34 PST 2010

I keep literally losing sleep over a number of issues involving 
containers, sealing, arbitrary-cost copying vs. reference counting and 
copy-on-write, and related matters. This is stopping me from making 
rapid progress on defining D containers and other artifacts in the 
standard library.

Clearly we need to break this paralysis, and just as clearly whatever 
decision is taken now will influence the prevalent D style going forward. 
So a decision needs to be made soon, just not hastily. Easier said than 
done!

I continue to believe that containers should have reference semantics, 
just like classes. Copying a container wholesale is not something you 
want to be automatic.

I also continue to believe that controlled lifetime (i.e. 
reference-counted implementation) is important for a container. 
Containers tend to be large compared to other objects, so exercising 
strict control over their allocated storage makes a lot of sense. What 
has recently shifted in my beliefs is that we should attempt to 
implement controlled lifetime _outside_ the container definition, by 
using introspection. (Currently some containers use reference counting 
internally, which makes their implementation more complicated than it 
could be.)

Finally, I continue to believe that sealing is worthwhile. In brief, a 
sealing container never gives out addresses of its elements so it has 
great freedom in controlling the data layout (e.g. pack 8 bools in one 
ubyte) and in controlling the lifetime of its own storage. Currently I'm 
not sure whether that decision should be taken by the container, by the 
user of the container, or by an introspection-based wrapper around an 
unsealed container.
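
To make the sealing idea concrete, here is a minimal sketch of a sealed container of bools. The name PackedBools and its layout are made up for illustration and are not part of any library. Elements are only ever read and written by value, so the container is free to pack eight of them into each ubyte.

```d
// Hypothetical sealed container: eight bools packed into each ubyte.
struct PackedBools
{
    private ubyte[] bits;   // private layout, never exposed
    private size_t len;

    this(size_t n)
    {
        len = n;
        bits = new ubyte[(n + 7) / 8];
    }

    @property size_t length() const { return len; }

    // Returns a copy of the element, never a reference or pointer to it.
    bool opIndex(size_t i) const
    {
        assert(i < len);
        return (bits[i / 8] & (1 << (i % 8))) != 0;
    }

    // Writes go through the container, so the layout can change freely.
    void opIndexAssign(bool value, size_t i)
    {
        assert(i < len);
        immutable int mask = 1 << (i % 8);
        if (value)
            bits[i / 8] |= mask;
        else
            bits[i / 8] &= ~mask;
    }
}

void main()
{
    auto flags = PackedBools(10);
    flags[3] = true;
    flags[9] = true;
    assert(flags[3] && flags[9] && !flags[4]);
    // auto p = &flags[3];  // rejected: opIndex yields an rvalue
}
```

Because no address ever escapes, the same interface could sit on top of a different layout or a different allocation strategy without breaking client code.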

* * *

That all being said, I'd like to make a motion that should simplify 
everyone's life - if only for a bit. I'm thinking of making all 
containers classes (either final classes or at a minimum classes with 
only final methods). Currently containers are implemented as structs 
that are engineered to have reference semantics. Some containers use 
reference counting to keep track of the memory used.

Advantages of the change:

- Clear, self-documented reference semantics

- Uses the right tool (classes) for the job (define a type with 
reference semantics)

- Pushes deterministic lifetime issues outside the containers 
(simplifying them) and factors such issues into reusable wrappers a la 
RefCounted.
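
A rough sketch of what that factoring might look like follows. IntStack and Counted are hypothetical names invented for this example; Counted is a hand-rolled stand-in for a RefCounted-style wrapper, not the actual std.typecons API. The container is a plain final class with no lifetime logic, and the wrapper adds deterministic destruction from the outside.

```d
// Hypothetical container: a final class with no lifetime logic inside.
final class IntStack
{
    static int alive;       // live-instance counter, for the demo only
    private int[] data;

    this() { ++alive; }
    ~this() { --alive; }

    @property bool empty() const { return data.length == 0; }
    void push(int x) { data ~= x; }
}

// Hypothetical reusable wrapper that adds reference counting to any class.
struct Counted(T) if (is(T == class))
{
    private T payload;
    private size_t* count;

    this(T obj)
    {
        payload = obj;
        count = new size_t;
        *count = 1;
    }

    // Postblit: a copy shares the payload and bumps the count.
    this(this)
    {
        if (count) ++*count;
    }

    ~this()
    {
        // The last handle destroys the container deterministically.
        if (count && --*count == 0)
            destroy(payload);
    }

    @property size_t refs() const { return count ? *count : 0; }
    @property T get() { return payload; }
    alias get this;         // forward container methods to the class
}

void main()
{
    {
        auto a = Counted!IntStack(new IntStack);
        auto b = a;         // copies the handle, not the elements
        b.push(42);
        assert(!a.empty && a.refs == 2);
        assert(IntStack.alive == 1);
    }
    // Both handles are gone, so the container has been destroyed.
    assert(IntStack.alive == 0);
}
```

The sketch only runs the destructor early; the object's memory is still reclaimed by the garbage collector. A production wrapper would also have to decide how to deal with plain references to the payload that escape through the forwarding.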

Disadvantages:

- Containers must be dynamically allocated to do anything - even calling 
empty requires allocation.

- There's a two-word overhead (vtable pointer and monitor) associated 
with any class object.

- Containers cannot do certain optimizations that depend on the 
container's control over its own storage.
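
The first two disadvantages can be illustrated with a small sketch. ClassList and StructList are hypothetical types written for this comparison, with StructList standing in for the current style of struct that allocates its payload lazily.

```d
// Hypothetical class-based container.
final class ClassList
{
    private int[] items;
    @property bool empty() const { return items.length == 0; }
}

// Hypothetical struct-based container with a lazily allocated payload.
struct StructList
{
    private struct Payload { int[] items; }
    private Payload* payload;   // stays null until the first insertion

    @property bool empty() const
    {
        return payload is null || payload.items.length == 0;
    }

    void insert(int x)
    {
        if (payload is null) payload = new Payload;
        payload.items ~= x;
    }
}

// Every class instance carries two hidden words ahead of its fields.
static assert(__traits(classInstanceSize, ClassList) - (int[]).sizeof
    == 2 * size_t.sizeof);

void main()
{
    ClassList c;            // null reference: no object exists yet
    assert(c is null);
    // Calling c.empty here would dereference a null reference.
    c = new ClassList;      // allocation is needed before any call
    assert(c.empty);

    StructList s;           // usable immediately, nothing allocated
    assert(s.empty);
    s.insert(1);
    assert(!s.empty);
}
```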

What say you?

Andrei