Paralysis of analysis -- the value/ref dilemma

Wed Dec 15 09:05:36 PST 2010

On Wed, 15 Dec 2010 09:56:36 -0600
Andrei Alexandrescu <SeeWebsiteForEmail at erdani.org> wrote:

> Optimization (or pessimization) is a concern, but not my primary one. My 
> concern is: most of the time, do you want to work on a container or on a 
> copy of the container? Consider this path-of-least-resistance code:
> 
> void fun(Container!int c) {
>      ...
>      c[5] += 42;
>      ...
> }
> 
> Question is, what's the most encountered activity? Should fun operate on 
> whatever container it was passed, or on a copy of it? Based on extensive 
> experience with the STL, I can say that in the overwhelming majority of 
> cases you want the function to mess with the container, or look without 
> touch (by means of const). It is so overwhelming, any code reviewer in 
> an STL-based environment will raise a flag when seeing the C++ 
> equivalent to the code above - ironically, even if fun actually does 
> need a copy of its input! (The common idiom is to pass the container by 
> constant reference and then create a copy of it inside fun, which is 
> suboptimal.)

I do agree.

When a container is passed as a parameter:
* either it is conceptually a value and should be left unchanged (--> so that the compiler is free to pass it as a "constant reference"),
* or it represents an entity with identity, which it makes sense to change, and it should be implemented as a ref.
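
To make the two cases concrete, here is a minimal sketch in D. All names (Samples, total, bump, bumpCopy) are hypothetical and only serve the illustration; a static array stands in for a value-type container, which is an assumption and not how std.container is defined.

```d
// Hypothetical value-type container: a static array is copied with its struct.
struct Samples
{
    int[8] items;
}

// Value in meaning: the function only looks. `ref const` passes no copy
// and still forbids any change through `s`.
int total(ref const Samples s)
{
    int t = 0;
    foreach (x; s.items)
        t += x;
    return t;
}

// Entity in meaning: the function exists to change the caller's container.
void bump(ref Samples s)
{
    s.items[5] += 42;
}

// Path of least resistance with a value type: only the local copy changes.
void bumpCopy(Samples s)
{
    s.items[5] += 42;
}

unittest
{
    Samples s;
    s.items[5] = 1;
    bumpCopy(s);
    assert(s.items[5] == 1);   // caller's data untouched
    bump(s);
    assert(s.items[5] == 43);  // caller's data changed
    assert(total(s) == 43);
}
```

With a value-type container, the third form is the one a reviewer would flag: it compiles, but the change is silently lost.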

What I'm trying to fight is being forced to implement semantic values as concrete ref elements. This is very bad, a kind of conceptual distortion (the author of XL calls this semantic mismatch) that leads to much confusion.
Example of semantic distinction:
Take a palette of predefined colors (red, green,..) used to draw visual widgets. In the simple case, colors are plain information (=values), and so is the palette (a collection). In this case, every widget holds its own subset of colors, one for each of its parts. Meaning copies. Changing a given color assigned to one widget should not, and does not, affect the others.
Now, imagine this palette can be edited "live" by the user, meaning redefining the components of red, green,... This time, the semantics may well be that such changes should affect all widgets, including already defined ones. For this, the palette must be implemented as an "entity", and each color as well. But the reason for this is that the palette no longer means the same thing at all: instead of information about an aspect (color) of every widget, we now have a kind of container of color _sources_. Instead of color values, the widget fields point to kinds of paint pots; these fields should not be called "color".
[It is not always simple to find real-world metaphors that help us correctly understand what we have to model and write into programs. A program's world is not reality at all, not even similar to it, even in the (minority of) cases where it models reality. In this case, "color" is misleading.]

In the first case, the palette must be a value; in the second case, it must be a ref. There is no way to escape the dilemma of having value or ref collections. Conceptually, we absolutely need both. Again, the ref/value semantic duality is independent of data types. If the language provides only one kind, we have to hack, to cheat with it.
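
The two palettes could be sketched in D roughly as follows. Every type name here (Color, PaletteValue, PaintPot, LivePalette and the two widget structs) is made up for the illustration; the only point is that the first palette is a struct (value) and the second a class (ref).

```d
struct Color
{
    ubyte r, g, b;
}

// Case 1: the palette is plain information. Widgets store copies.
struct PaletteValue
{
    Color red = Color(255, 0, 0);
    Color green = Color(0, 255, 0);
}

struct WidgetWithColor
{
    Color fill; // a color value, owned by this widget
}

// Case 2: the palette is an entity holding color *sources*.
final class PaintPot
{
    Color current;
    this(Color c) { current = c; }
}

final class LivePalette
{
    PaintPot red, green;
    this()
    {
        red = new PaintPot(Color(255, 0, 0));
        green = new PaintPot(Color(0, 255, 0));
    }
}

struct WidgetWithPot
{
    PaintPot fillSource; // not a color: a reference to a shared paint pot
}

unittest
{
    // Value semantics: changing one widget's color affects nothing else.
    PaletteValue palette;
    auto a = WidgetWithColor(palette.red);
    auto b = WidgetWithColor(palette.red);
    a.fill.g = 128;
    assert(b.fill == Color(255, 0, 0));
    assert(palette.red == Color(255, 0, 0));

    // Ref semantics: a live edit reaches every widget sharing the pot.
    auto live = new LivePalette;
    auto c = WidgetWithPot(live.red);
    auto d = WidgetWithPot(live.red);
    live.red.current = Color(200, 0, 0);
    assert(c.fillSource.current == Color(200, 0, 0));
    assert(d.fillSource.current == Color(200, 0, 0));
}
```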

There is a special case in non-OO-only circumstances: sometimes an element is passed as a parameter while it is conceptually the "object" (in the common sense) to which an operation applies (~ the OO receiver). In OO, it would be passed by ref precisely to allow it to be changed, even if it is a plain value (this avoids creating a new value at every tiny change, as immutability would require). But this relevant distinction between the object of an operation (what) and its true parameters (how) does not exist in plain function-based style:
	func(object, param1, param2);
So we have to pass the object by ref when the operation is there precisely to modify it. But conceptually it is not a parameter.
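
A small D sketch of this special case, with hypothetical names (Color, lighten, lightened, clampAdd). It assumes a current D compiler, where a free function can also be called with method-like syntax, so the "object" can at least be written in receiver position.

```d
import std.algorithm.comparison : min;

struct Color
{
    ubyte r, g, b;
}

// Helper: saturating addition on one channel.
ubyte clampAdd(ubyte a, ubyte b)
{
    return cast(ubyte) min(255, a + b);
}

// `target` is the object of the operation (what), passed by ref so it can
// be changed in place; `amount` is a true parameter (how).
void lighten(ref Color target, ubyte amount)
{
    target.r = clampAdd(target.r, amount);
    target.g = clampAdd(target.g, amount);
    target.b = clampAdd(target.b, amount);
}

// The immutable-style alternative: a new value at every change.
Color lightened(Color source, ubyte amount)
{
    lighten(source, amount); // works on the local copy only
    return source;
}

unittest
{
    auto c = Color(250, 10, 0);
    lighten(c, 10);                    // plain function-based style
    assert(c == Color(255, 20, 10));
    c.lighten(10);                     // same call, object in receiver position
    assert(c == Color(255, 30, 20));

    auto d = lightened(c, 5);          // c itself is left unchanged
    assert(c == Color(255, 30, 20));
    assert(d == Color(255, 35, 25));
}
```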

> In contrast, most of the time you want to work on a copy of a string, so 
> strings are commonly not containers. (This is nicely effected by string 
> being defined as arrays of immutable characters.) However, you sometimes 
> do need to mutate a string, which is why char[] is useful on occasion.

I agree with this as well. D does the right thing for strings.
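
For the record, a minimal sketch of that string behaviour. The function name shout is hypothetical and it handles ASCII letters only; the rest relies on string being an alias for immutable(char)[] and on .dup giving a mutable copy.

```d
// Mutates its argument in place: only a char[] is accepted, never a string.
void shout(char[] buffer)
{
    foreach (ref ch; buffer)
    {
        if (ch >= 'a' && ch <= 'z')
            ch = cast(char)(ch - 'a' + 'A');
    }
}

unittest
{
    static assert(is(string == immutable(char)[]));

    string s = "hello";
    // Writing through a string does not compile.
    static assert(!__traits(compiles, s[0] = 'H'));

    char[] m = s.dup; // explicit mutable copy
    shout(m);
    assert(m == "HELLO");
    assert(s == "hello"); // the original is untouched
}
```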

Denis
-- -- -- -- -- -- --
vit esse estrany ☣

spir.wikidot.com