Cloning in D

Jacob Carlborg doob at me.com
Tue Sep 7 01:59:17 PDT 2010


On 2010-09-07 04:16, Michel Fortin wrote:
> On 2010-09-06 20:55:16 -0400, dsimcha <dsimcha at yahoo.com> said:
>
>> == Quote from Michel Fortin (michel.fortin at michelf.com)'s article
>>> I'm under the impression that an overly permissive generic implementation
>>> of cloning is going to break things in various scenarios.
>>
>> In general you raise some very good issues, but IMHO the right way to
>> do cloning is to have permissive generic cloning that works in the 90%
>> of cases and can be easily overridden in the 10% of cases, not to
>> require writing tons of boilerplate in the 90% of cases just to make
>> sure it doesn't do the wrong thing by default in the 10% of cases.
>
> To me automatic cloning of everything (physical cloning in your
> parlance) looks more like 50/50 work/doesn't-work ratio. I can only
> guess, but I'm probably used to different use cases than you are.
>
>
>> A second point is that the thing that brought this whole cloning issue
>> to my mind was making std.concurrency's message passing model less
>> obtuse. Right now it's hard to use for non-trivial things because
>> there's no safe way to pass complex state between threads. If we start
>> allowing all kinds of exceptions to the "clone the **entire** object
>> graph" rule, cloning will rapidly become useless for safely passing
>> complex object graphs between threads.
>
> This I agree with. I'm not arguing against automatic cloning per se,
> I'm just trying to show cases where it doesn't work well.
>
> Personally, I'm rather skeptical that we can make it safe and efficient
> at the same time without better support from the language, something
> akin to the mythical "unique" type modifier representing a reference with
> no aliasing.
>
>
>>> What if your
>>> object or structure is part of a huge hierarchy where things contain
>>> pointers to their parent (and indirectly to the whole hierarchy), will
>>> the whole hierarchy be cloned?
>>
>> Isn't that kind of the point?
>
> Well, that depends. If you send each leaf of a tree as a message to
> various threads presumably to perform something concurrently with the
> data in that leaf, then you may want only the leaf to be copied. You may
> not want every parent down to the root and then up to every other leaf
> to be copied along with each message just because the leaf you send
> has a pointer to the parent.
>
> In fact, it depends on the situation. If what you want to do with the
> leaf in the other thread requires the leaf to know its parent and
> everything else, then sure you need to copy the whole hierarchy. But
> otherwise it's a horrible waste of memory and CPU to clone the whole
> object graph for each message, even though it won't affect the program's
> correctness.
>
> And it's basically the same thing with observers. If your observer is a
> controller in charge of updating a window when something changes, you
> don't want to clone the observer, then clone the window and everything
> in it just because you're sending some piece of data to another thread.
> Perhaps the program architecture is just wrong, or perhaps that observer
> is a synchronized class capable of handling function calls from multiple
> threads so it doesn't really need to be copied.
>
>
>>> What happens if your object or structure
>>> maintains a reference to a singleton, will we get two instances of a
>>> singleton?
>>
>> Very good point. I guess the reasonable use case for holding a
>> reference to a singleton (instead of just using the globally accessible
>> one) would be if it's polymorphic with some other object type? If
>> you're using message passing concurrency, most of your mutable
>> singletons are probably thread-local, and what you probably really want
>> to do is use the thread-local singleton of the thread you're passing to.
>
> What intrigues me is how such a mechanism would work... although in my
> mind it's probably not even worth supporting at all, singletons be damned!
>
>
>>> My understanding is that a data structure containing a pointer cannot
>>> be cloned safely unless it contains some specific code to perform the
>>> cloning. That's because the type system can't tell you which pointers
>>> point to things owned by the struct/class and which ones need to be
>>> discarded when cloning (such as a list of observers, or the parents of
>>> a hierarchy).
>>
>> This discussion is making me think we really need two kinds of
>> cloning: Physical cloning would clone the entire object graph no matter
>> what, such that the cloned object could be safely passed to another
>> thread via std.concurrency and be given a unique type. Logical cloning
>> would be more like what you describe. In general, this discussion has
>> been incredibly useful because I had previously only considered
>> physical cloning.
>
> This is an interesting and valid observation. But I think you need to
> leave a door open to customization of the "physical cloning" case too.
> The ability to avoid cloning unnecessary data is as necessary as the
> ability to easily copy an entire object graph.

I think everything would be a lot easier if we had support for cloning
in the standard library: something like a clone method in Object whose
default implementation clones everything, which you can then override
to customize the cloning. There could also be various hooks (static
fields or mixins, or even better, proper language support) to say that
a certain field, or a whole object, shouldn't be cloned. With cloning in
the standard library, other parts of the library could take advantage of
these hooks and do the appropriate thing. For example, a singleton could
say it shouldn't be cloned, and the same may go for other classes as
well.
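
To make the idea a bit more concrete, here is a rough sketch of what
such hooks might look like. Everything in it is an assumption made for
illustration: NoClone, clone, Logger and Node are hypothetical names, not
anything that exists in Phobos, and the hook is written as a user-defined
attribute checked with std.traits.hasUDA, so it needs a compiler that
supports those. It is deliberately limited: it only handles classes with
a default constructor, ignores fields inherited from base classes, and
does not detect cycles other than those broken by the marker.

```d
import std.traits : hasUDA;

/// Hypothetical marker: "do not clone this field / this class".
struct NoClone {}

/// Hypothetical default clone: copies every field, recursing into
/// class references unless the field or its class is marked @NoClone.
T clone(T)(T src) if (is(T == class))
{
    if (src is null)
        return null;

    static if (hasUDA!(T, NoClone))
    {
        return src;                      // whole class opted out: share it
    }
    else
    {
        auto dst = new T;                // assumes a default constructor
        foreach (i, field; src.tupleof)  // fields declared in T only
        {
            alias F = typeof(field);
            static if (hasUDA!(T.tupleof[i], NoClone))
                dst.tupleof[i] = field;          // share the reference
            else static if (is(F == class))
                dst.tupleof[i] = clone(field);   // clone the sub-object
            else static if (is(typeof(field.dup) : F))
                dst.tupleof[i] = field.dup;      // copy mutable arrays
            else
                dst.tupleof[i] = field;          // plain value copy
        }
        return dst;
    }
}

/// A singleton-like class that says it should never be cloned.
@NoClone class Logger
{
    int lines;
}

class Node
{
    int value;
    int[] data;
    Node child;            // owned: cloned along with the node
    Logger log;            // shared, because Logger is @NoClone
    @NoClone Node parent;  // back-pointer: not followed when cloning
}

unittest
{
    auto log = new Logger;
    auto root = new Node;
    auto leaf = new Node;
    root.child = leaf;
    root.log = log;
    leaf.parent = root;
    leaf.log = log;

    auto copy = clone(leaf);
    assert(copy !is leaf);        // the leaf itself is a new object
    assert(copy.parent is root);  // the hierarchy was not dragged along
    assert(copy.log is log);      // still only one Logger
}
```

With something along these lines, sending a single leaf to another thread
would copy only the leaf, while cloning the root would still copy its
children. Whether sharing the parent reference is actually safe across
threads is a separate question that this sketch does not answer.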

-- 
/Jacob Carlborg

