toStringz note about keeping references

Charles Hixson charleshixsn at earthlink.net
Tue Oct 16 10:47:11 PDT 2012


On 10/14/2012 04:54 PM, Ali Çehreli wrote:
> On 10/14/2012 04:36 PM, Andrej Mitrovic wrote:
>  > On 10/15/12, Jonathan M Davis <jmdavisProg at gmx.com> wrote:
>  >> I'd have to see exactly what TDPL says to comment on that accurately
>  >
>  > Maybe I've misread it. On Page 288 it says:
>  >
>  > "An immutable value is cast in stone: as soon as it's been
>  > initialized, you may as well consider it has been burned forever
>  > into the memory storing it. It will never change throughout the
>  > execution of the program."
>  >
>  > Perhaps what was missing is: "as long as there is a reference to
>  > that data".
>
> Andrei must have written that with only D in mind, without any C
> interaction. When we consider only D, the statement is correct: if
> there are no more references, how can the application tell whether the
> data is gone or not?
>
>  > I'd really like to know for sure if the GC implementation actually
>  > collects immutable data or not.
>
> It does. Should be easy to test with an infinite loop that generates
> immutable data.
>
>  > I've always used toStringz in direct
>  > calls to C without caring about keeping a reference to the source
>  > string in D code. I'm sure others have used it like this as well.
>
> It depends on whether the C-side keeps a copy of that pointer.
>
>  > Maybe the only reason my apps which use C don't crash is because a GC
>  > cycle doesn't often run, and when it does run it doesn't collect the
>  > source string data (either on purpose or because of buggy behavior, or
>  > because the GC is imprecise).
>  >
>  > Anyway this stuff is important for OOP wrappers of C/C++ libraries. If
>  > the string reference must be kept on the D side then this makes
>  > writing wrappers harder. For example, let's say you've had this type
>  > of wrapper:
>  >
>  > import std.string : toStringz;
>  >
>  > extern(C) void* get_Foo_obj();
>  > extern(C) void* c_Foo_test(void* c_obj, const(char)* input);
>  >
>  > class Foo
>  > {
>  >     // init C object by calling a C function
>  >     this() { c_Foo_obj = get_Foo_obj(); }
>  >
>  >     void test(string input)
>  >     {
>  >         c_Foo_test(c_Foo_obj, toStringz(input));
>  >     }
>  >
>  >     void* c_Foo_obj; // reference to C object
>  > }
>  >
>  > Should we always store a reference to 'input' to avoid GC collection?
>
> If the C function copies the pointer, yes.
>
>  > E.g.:
>  >
>  > class Foo
>  > {
>  >     // init C object by calling a C function
>  >     this() { c_Foo_obj = get_Foo_obj(); }
>  >
>  >     void test(string input)
>  >     {
>  >         input_ref = input;
>  >         c_Foo_test(c_Foo_obj, toStringz(input));
>  >     }
>  >
>  >     string input_ref; // keep it alive, C might use it after test() returns
>
> That's exactly what I do in a C++ library that wraps C types.
>
>  >     void* c_Foo_obj; // reference to C object
>  > }
>  >
>  > And what about multiple calls? What if on each call to c_Foo_test()
>  > the C library stores each 'input' pointer internally? That would mean
>  > we have to keep an array of these pointers on the D side.
>
> Again, that's exactly what I do in C++. :) There is a global container
> that keeps the objects alive.
>
>  > It's not known what the C library does without inspecting the source
>  > of the C library. So it becomes very difficult to write wrappers
>  > which are GC-safe.
>
> Most functions document what they do with the input parameters. If not,
> it is usually obvious.
>
>  > There are wrappers out there that seem to expect the source won't be
>  > collected. For example GtkD also uses toStringz in calls to C without
>  > ever storing a reference to the input string.
>
> Must be verified case-by-case.
>
> Ali
>
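There's a problem with this kind of ad hoc solution: if the library
version changes, it can break things without changing the interface. A
function that used to only read the string might start keeping the
pointer, and the wrapper would have no way of noticing.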

I think the real answer is that there needs to be a C layer between the 
D program and the library that copies any immutable data that may need 
to be kept, and only passes pointers to the copy to the C library.
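
Something along these lines is what I have in mind. This is only a rough
sketch, and it does the copy on the D side with malloc rather than in a
separate C layer, which I think amounts to the same thing as far as the
GC is concerned. The names copyToCHeap and freeCHeapCopy are made up for
illustration; they aren't in Phobos.

```d
import core.stdc.stdlib : malloc, free;
import core.stdc.string : memcpy;

/// Copies `s` into malloc'ed memory and appends the terminating '\0'.
/// The GC neither scans nor frees this block, so the C side may keep
/// the pointer for as long as it likes. Returns null if malloc fails.
/// (Hypothetical helper, not part of any library.)
char* copyToCHeap(const(char)[] s)
{
    auto p = cast(char*) malloc(s.length + 1);
    if (p is null)
        return null;
    if (s.length)
        memcpy(p, s.ptr, s.length);
    p[s.length] = '\0';
    return p;
}

/// Frees a copy made by copyToCHeap once the C library is done with it.
void freeCHeapCopy(char* p)
{
    free(p);
}

unittest
{
    auto p = copyToCHeap("hello");
    assert(p !is null);
    assert(p[0 .. 5] == "hello");
    freeCHeapCopy(p);
}
```

The price is that somebody has to remember to call freeCHeapCopy, which
is ordinary C-style manual memory management.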

OTOH, a nicer solution would be if there were a way of marking in D that
"This item should never be garbage collected". There are ways to do
approximately that, but unfortunately the ones I'm recalling only work
on class instances. That's a clumsy way to do things. What seems
better is a wrapper around an item that holds a reference, so that the
data is marked as held. This could later be released. This is actually
just "keeping a reference", but it's a bit of syntactic sugar to make it
easy. Call the pair of functions "hold" and "release" or some such.
(Actually, hold would need to do a bit more than just keep a reference.
It would also need to ensure that the data wasn't on the stack, which
might mean you were working with a duplicate of the data, but since it's
immutable, that shouldn't matter.)
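
For what it's worth, here is a minimal sketch of such a pair, assuming
that GC.addRoot and GC.removeRoot from core.memory are the right tools
for the job (as far as I can tell they accept any GC pointer, not just
class instances). The names hold and release are my own invention. Note
that toStringz may hand back a pointer into a fresh copy, so it is the
zero-terminated buffer, not the source string, that has to stay
reachable; the sketch therefore makes and roots its own copy.

```d
import core.memory : GC;

/// Hypothetical helper: makes a zero-terminated copy of `s` on the GC
/// heap and registers it as a root, so it is not collected while C
/// code may still be using the pointer.
const(char)* hold(string s)
{
    // Always duplicate, so the pointer can never refer to stack data.
    auto buf = new char[](s.length + 1);
    buf[0 .. s.length] = s[];
    buf[s.length] = '\0';

    GC.addRoot(buf.ptr);                      // keep the block alive
    GC.setAttr(buf.ptr, GC.BlkAttr.NO_MOVE);  // and keep it in place
    return buf.ptr;
}

/// Hypothetical helper: undoes hold(), making the copy collectable again.
void release(const(char)* p)
{
    GC.removeRoot(p);
    GC.clrAttr(p, GC.BlkAttr.NO_MOVE);
}

unittest
{
    auto p = hold("abc");
    GC.collect();                  // the rooted copy survives a collection
    assert(p[0 .. 3] == "abc");
    release(p);
}
```

A wrapper class could call hold in test() and release in whatever
function tells the C library to forget the string.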


More information about the Digitalmars-d-learn mailing list