Common Issue in Shared Code

Andrew Wiley wiley.andrew.j at gmail.com
Sun Nov 20 14:54:51 PST 2011


Gah, looks like I accidentally sent this as an HTML message, and the
web newsreader stripped out part of the sample code (
http://digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=149624
)
Here's the message again in plain text:

About a month or so ago, I started trying to convert a codebase I've
been working on into a multithreaded system, and I've been hitting
this sort of thing over and over:
--------
// used as a field and as a local variable all over the codebase
struct Data {
    int a,b,c;
    int total() {
        return a + b + c;
    }
}

// has a Data as one of its members but never escapes a pointer to it
class Bob {
    private:
    Data _dat;
    public:
    int currentTotal() {
        return _dat.total();
    }
}
--------
Now, as part of my multithreaded refactor, I need to make Bob
synchronized, but that means the Data field inside it is shared, which
means I can no longer call the total() method in currentTotal().
To fix this, I could make Data synchronized as well, but Data is used
all over the codebase, most of the time as a local variable inside a
function. In my particular case, I see this a lot with a struct that
represents a location, which is just 2 bytes in my codebase, so adding
a monitor would more than double the size, and the locking overhead
would be completely unnecessary.
If I don't want to make it synchronized, I could just cast away shared
everywhere I use it as a field, which looks ugly and is confusing when
I look at the codebase.
If I don't want to cast away shared, I could just make Data shared and
assume that the owner will make sure it's not shared improperly, but
at this point I've disabled all help the type system could provide me.
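
For reference, the cast workaround looks roughly like the sketch below.
It assumes Bob is declared as a synchronized class, and the set() method
is hypothetical, added only so the example has something to total. The
casts are only safe on the assumption that the monitor is held and the
pointer never leaves the method.

```d
// Plain value type, still usable unshared as a local variable elsewhere.
struct Data {
    int a, b, c;
    int total() {
        return a + b + c;
    }
}

// Every method of a synchronized class is shared and locks the monitor,
// so inside the methods _dat is typed as shared(Data).
synchronized class Bob {
    private Data _dat;

    // Hypothetical setter, not part of the original code.
    void set(int a, int b, int c) {
        // Cast away shared: assumed safe because the lock is held here
        // and the pointer does not escape this method.
        *(cast(Data*) &_dat) = Data(a, b, c);
    }

    int currentTotal() {
        // _dat.total() is rejected because total() is not a shared
        // method, so the same cast is needed again.
        return (cast(Data*) &_dat).total();
    }
}
```

Every access to the field needs the same cast, which is the noise I'm
complaining about.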

Firstly, according to TDPL:
--------
For synchronized methods:
"Maybe not very intuitively, the temporary nature of synchronized
entails the rule that no address of a field can escape a synchronized
address. If that happened, some other portion of the code could access
some data beyond the temporary protection conferred by method-level
synchronization."
For synchronized classes:
- All numeric types are not shared (they have no tail) so they can be
manipulated normally.
- Array fields declared with type T[] receive type shared(T)[];
that is, the head (the slice limits) is not shared and the tail (the
contents of the array) remains shared.
- Pointer fields declared with type T* receive type shared(T)*; that
is, the head (the pointer itself) is not shared and the tail (the
pointed-to data) remains shared.
- Class fields declared with type T receive type shared(T). Classes
are automatically by-reference, so they're "all tail."
These rules apply on top of the no-escape rule described in the
previous section.
One direct consequence is that operations affecting direct fields of
the object can be freely reordered and optimized inside the method, as
if sharing has been temporarily suspended for them—which is exactly
what synchronized does.
--------
At first glance, it seems like the first rule should apply to
structs (which would mean it should address "value types"), but it
can't because a struct could contain a reference to another object,
and that reference should be transitively shared. Typing a struct as
shared if it contains a reference and unshared otherwise would just be
confusing, but this use case is one that the language does not
currently address in a satisfying way.

When I flag a type as shared, all instances of it are forced to become
shared, but the compiler assumes that the programmer has properly
synchronized things such that sharing instances of the type is safe.
Why, then, can I not force the compiler to assume I've properly
synchronized things for a field of a class? In this case, the effect
would be the opposite - the field wouldn't be flagged as shared, but
supposing we had such a keyword, it would act as a much more limited
version of the "shared" keyword because I'm only forcing the compiler
to assume I've done things properly within the context of a class.
The keyword would have to be restricted such that it could only be
applied to private fields, and the compiler would continue to enforce
(as much as is reasonable) that the address of the field does not
escape.

I believe that this case of data sharing will appear and frustrate
programmers in almost any multithreaded program, and that finding a
satisfying solution to allow the language to provide as many
guarantees as possible is worthwhile.

Any thoughts?

