Escaping the Tyranny of the GC: std.rcstring, first blood

Andrei Alexandrescu via Digitalmars-d digitalmars-d at puremagic.com
Sun Sep 14 19:26:58 PDT 2014


Walter, Brad, I, and a couple of others have had some quite
exciting ideas regarding code that is configurable to use the GC or 
alternate resource management strategies. One thing that became obvious 
to us is we need to have a reference counted string in the standard 
library. That would be usable with applications that want to benefit 
from comfortable string manipulation whilst using classic reference 
counting for memory management. I'll go into more detail later on the
mechanisms that would allow the stdlib to provide functionality for both
GC strings and RC strings; for now let's say that we hope and aim for 
swapping between these with ease. We hope that at some point people will
be able to change one line of code, rebuild, and get either GC or RC 
automatically (for Phobos and their own code).

The road there is long, but it starts with the proverbial first step. As
it happens, I have a rough draft of an almost-drop-in replacement for string
(aka immutable(char)[]). Destroy with maximum prejudice:

http://dpaste.dzfl.pl/817283c163f5

For now RCString supports only immutable char as element type. That 
means you can't modify individual characters in an RCString object but 
you can take slices, append to it, etc. - just as you can with string. A 
compact reference counting scheme is complemented with a small buffer 
optimization, so performance should be fairly decent.
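
For reference, here is a small sketch of the built-in string behavior that
RCString is meant to mirror. It uses only plain string, not RCString itself,
so it says nothing about the draft's internals; the variable names are
illustrative.

```d
import std.stdio : writeln;

void main()
{
    string s = "hello";

    // Elements are immutable(char), so in-place modification is rejected.
    static assert(!__traits(compiles, s[0] = 'j'));

    string t = s[1 .. 4];   // a slice refers to the same characters, no copy
    s ~= ", world";         // appending rebinds s and leaves t untouched

    assert(t == "ell");
    assert(s == "hello, world");
    writeln(s, " / ", t);
}
```

The intent stated above is that the same operations read the same way when
string is replaced by RCString.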

Somewhat surprisingly, pure constructors and inout took good care of 
qualified semantics (you can convert a mutable to an immutable string 
and back safely). I'm not sure whether semantics there are a bit too 
lax, but at least for RCString they turned out to work beautifully and 
without too much fuss.

The one wrinkle is that you need to wrap string literals "abc" with 
explicit constructor calls, e.g. RCString("abc"). This puts RCString on 
a lower footing than built-in strings and makes swapping configurations 
a tad more difficult.
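
The wrinkle can be reproduced with any user-defined struct, since D does not
implicitly convert a function argument through a constructor. The sketch below
uses FakeRCString, a hypothetical stand-in that merely wraps a string; it is
not the RCString from the paste, and gcLength/rcLength are made-up helpers.

```d
// Hypothetical stand-in for RCString: just enough to show the conversion rule.
struct FakeRCString
{
    string payload;
    this(string s) { payload = s; }
}

size_t gcLength(string s)       { return s.length; }
size_t rcLength(FakeRCString s) { return s.payload.length; }

void main()
{
    // A literal is accepted directly where a built-in string is expected.
    assert(gcLength("abc") == 3);

    // The same literal is not accepted where the struct is expected...
    static assert(!__traits(compiles, rcLength("abc")));

    // ...so the call site has to spell out the constructor.
    assert(rcLength(FakeRCString("abc")) == 3);
}
```

Code that passes literals around therefore needs edits at each such call site
when switching from string to the struct type.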

Currently I've parameterized RCString with an allocation policy, which I
hurriedly reduced to just one function with the semantics of realloc.
That will probably change in a future pass; the point for now is that 
allocation is somewhat modularized away from the string workings.
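
To illustrate what a single-function, realloc-style policy could look like,
here is a minimal sketch. It is an assumption about the general shape only:
Buffer, cReallocPolicy, append, and view are hypothetical names, and the
actual signature used in the draft may differ.

```d
import core.stdc.stdlib : free, realloc;

// Hypothetical policy with realloc semantics:
// null pointer -> allocate, size 0 -> release, otherwise resize.
void* cReallocPolicy(void* p, size_t n) nothrow @nogc
{
    if (n == 0) { free(p); return null; }
    return realloc(p, n);
}

// Hypothetical growable buffer that does all memory management
// through the one policy function it is instantiated with.
struct Buffer(alias reallocate)
{
    private char* data;
    private size_t length;

    @disable this(this);   // keep the sketch simple: no copies

    void append(const(char)[] s)
    {
        if (s.length == 0) return;
        auto p = cast(char*) reallocate(data, length + s.length);
        assert(p !is null, "allocation failed");
        data = p;
        data[length .. length + s.length] = s[];
        length += s.length;
    }

    const(char)[] view() const { return data[0 .. length]; }

    ~this()
    {
        if (data !is null) data = cast(char*) reallocate(data, 0);
    }
}

void main()
{
    Buffer!cReallocPolicy b;
    b.append("abc");
    b.append("def");
    assert(b.view == "abcdef");
}
```

Swapping in a different allocator then means passing a different function
with the same contract, without touching the buffer logic.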

So, please fire away. I'd appreciate it if you used RCString in lieu of
string and noted the differences. The closer we get to parity in
semantics, the better.


Thanks,

Andrei

