Wrapping a C library with its own GC + classes vs refcounted structs

aldanor via Digitalmars-d-learn digitalmars-d-learn at puremagic.com
Sat Jan 10 15:47:19 PST 2015


On Saturday, 10 January 2015 at 20:55:05 UTC, Laeeth Isharc wrote:
> Hi Aldanor.
>
> I wrote a slightly longer reply, but mislaid the file somewhere.
>
> I guess your question might relate to wrapping the HDF5 library 
> - something that I have already done in a basic way, although I 
> welcome your project, as no doubt we will get to a higher 
> quality eventual solution that way.
>
> One question about accurately representing the HDF5 object 
> hierarchy.  Are you sure you wish to do this rather than 
> present a flattened approach oriented to what makes sense to 
> make things easy for the user in the way that is done by h5py 
> and pytables?
>
> In terms of the actual garbage generated by this library - 
> there are lots of small objects.  The little ones are things 
> like a file access attribute, or a schema for a dataset.  But 
> really the total size taken up by the small ones is unlikely to 
> amount to much for scientific computing or for quant finance if 
> you have a small number of users and are not building some kind 
> of public web server.  I think it should be satisfactory for 
> the little objects just to wrap the C functions with a D 
> wrapper and rely on the object destructor calling the C 
> function to free memory.  On the rare occasions when not, it 
> will be pretty obvious to the user and he can always call 
> destroy directly.
>
> For the big ones, maybe reference counting brings enough value 
> to be useful - I don't know.  But mostly you are either passing 
> data to HDF5 to write, or you are receiving data from it.  In 
> the former case you pass it a pointer to the data, and I don't 
> think it keeps it around.  In the latter, you know how big the 
> buffer needs to be, and you can just allocate something from 
> the heap of the right size (and if using reflection, type) and 
> use destroy on it when done.
>
> So I don't have enough experience yet with either D or HDF5 to 
> be confident in my view, but my inclination is to think that 
> one doesn't need to worry about reference counting.  Since 
> objects are small and there are not that many of them, relying 
> on the destructor to be run (manually if need be) seems likely 
> to be fine, as I understand it.  I may well be wrong on this, 
> and would like to understand the reasons if so.
>
> Laeeth.

Thanks for the reply. Yes, this concerns my HDF5 wrapper project. 
The main concern is not memory consumption, of course, but rather 
explicit control over object lifetimes (especially for objects 
like files, so you can be sure there are no zombie handles 
floating around). Most of the time, when you're operating on an 
HDF5 file, you want all handles (groups, links, etc.) to be 
closed by the time you're done, i.e. by the time you leave the 
scope, which feels natural. Some operations in HDF5, particularly 
those related to linking/unlinking/closing, may behave 
differently if an object has any child objects with open handles. 
In addition, the C HDF5 library retains the right to reuse both 
the memory and the id once the refcount drops to zero, so it's 
best to be precise about that and keep a registry of weak 
references to all C ids that D knows about (much the same way 
h5py does in Python).
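
To make the lifetime argument concrete, here is a minimal sketch of a 
refcounted struct whose handle is released when it leaves scope. It 
assumes nothing about the real HDF5 API: the mock* functions and the 
refcounts table are hypothetical stand-ins for the C library's 
id-based calls, and Handle is an illustrative name rather than 
anything from an actual wrapper. The weak-reference registry is left 
out to keep it short.

```d
// Hypothetical stand-ins for an id-based C library (NOT the real HDF5 calls).
alias hid = int;

int[hid] refcounts;   // mock of the library-side refcount table
hid nextId = 1;

hid mockOpen()             { hid id = nextId++; refcounts[id] = 1; return id; }
void mockIncRef(hid id)    { ++refcounts[id]; }
void mockDecRef(hid id)    { if (--refcounts[id] == 0) refcounts.remove(id); }
bool mockIsValid(hid id)   { return (id in refcounts) !is null; }
int mockRefCount(hid id)   { auto p = id in refcounts; return p ? *p : 0; }
size_t openIds()           { return refcounts.length; }

/// Refcounted handle: a copy bumps the library-side refcount,
/// destruction drops it, so the id is released at scope exit.
struct Handle
{
    private hid id = -1;              // -1 marks an empty handle

    this(hid id) { this.id = id; }    // takes over one existing reference
    this(this)   { if (id >= 0) mockIncRef(id); }   // postblit, runs on copy
    ~this()      { if (id >= 0) mockDecRef(id); }

    @property hid rawId() const { return id; }
    @property bool valid() const { return id >= 0 && mockIsValid(id); }
}

void example()
{
    auto file = Handle(mockOpen());   // refcount is 1
    {
        auto sameFile = file;         // copy: refcount is 2
    }                                 // sameFile destroyed: refcount is 1
}                                     // file destroyed: id released
```

With a class, the destructor runs whenever the GC gets around to it 
(if at all), so the id could stay open long after the scope ends. 
With the struct, the release point is deterministic, which is what 
matters when open child handles change how the library behaves.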
