The non allocating D subset
Adam D. Ruppe
destructionator at gmail.com
Sat Jun 1 07:40:09 PDT 2013
On Saturday, 1 June 2013 at 05:45:38 UTC, SomeDude wrote:
> Basically it is a non allocating D subset.
Not necessarily nonallocating, but it doesn't use a gc. I just
updated the zip:
http://arsdnet.net/dcode/minimal.zip
If you make the program (only works on linux btw) and run it,
you'll see a bunch of allocations fly by, but they are all done
with malloc/free and their locations are pretty predictable.
The file minimal.d is a test program and you can see a lot of D
works, including features like classes, exceptions, templates,
structs, and delegates. (Heap-allocated delegates should be
banned but aren't, so if you use a built-in one it will leak. The
helper file, memory.d, does contain a HeapDelegate struct that
refcounts and frees it, though, so the concept is still usable.)
The other cool thing is since the library is so minimal, the
generated executable is small too. Only about 30 KB with an empty
main(), and no outside dependencies. A cool fact about that is
you can compile it and run on bare metal (given a bootloader like
grub) too, and it all just works.
You can also make "LIBC=yes" and depend on the C library, which
makes things work better - there's a real malloc function there!
- and adds about 10 KB to the executable. That's probably a more
realistic way to use it on the desktop at least than totally
standalone.
But yeah, I haven't written any real code with this, but so far
it seems to be pretty usable.
I also talked a while on the reddit thread last night about this,
so let me copy/paste that here too:
Yes, certainly. And it wouldn't even necessarily be no array
concats, just you wouldn't want to use the built-in ones.
Some features that use the gc in the real druntime don't
necessarily have to. You'll need to be aware of this most of the
time so you can free the memory in your app, but you can have a
pretty good idea of when it will happen. One example is new on a
class. If that mallocs, and you match every new with a delete (or
a call to free_obj() or whatever), you'll be fine, just like C++.
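Here is a rough sketch of that new/free pairing, written against
the stock druntime rather than the one in minimal.zip. The names
newObj and freeObj are hypothetical helpers made up for this
example (the post only mentions "free_obj() or whatever"), and
Counter is just a stand-in class:

```d
import core.stdc.stdlib : malloc, free;
import core.lifetime : emplace;

// Stand-in class for the example.
class Counter
{
    int value;
    this(int start) { value = start; }
    int next() { return ++value; }
}

// Hypothetical helper: allocate a class instance with malloc
// instead of the GC and run its constructor in place.
T newObj(T, Args...)(Args args) if (is(T == class))
{
    enum size = __traits(classInstanceSize, T);
    void* mem = malloc(size);
    assert(mem !is null, "out of memory");
    return emplace!T(mem[0 .. size], args);
}

// Hypothetical helper: run the destructor, then give the
// memory back to malloc's heap.
void freeObj(T)(T obj) if (is(T == class))
{
    destroy(obj);
    free(cast(void*) obj);
}
```

Every newObj!Counter(5) then needs exactly one freeObj(c), the
same discipline as new/delete in C++.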
I played with one I wasn't sure would work earlier, but now think
it can: heap closures. scope delegates are easy, since they don't
allocate, but heap closures allocate automatically and don't give
much indication that they do.... but, if you are careful with it,
the rules can be followed (if it accesses an outside scope and
has its address taken/reference copied or passed to a function,
it will automatically allocate), and you can manually call
free(dg.ptr); when you're done with it.
I think it is probably safer to just disallow them, either by not
implementing _d_allocmemory in druntime (thus if you accidentally
use it, you'll get a linker error about the missing function),
or, and this is tricky right now but not actually impossible, use
compile time reflection to scan your methods and members for a
non-scope delegate reference and throw an error.
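A minimal sketch of the compile time reflection idea, assuming it
is enough to look at the direct fields of a type. The name
hasDelegateField and the two example structs are made up here,
and this version flags any delegate field at all: it makes no
attempt to tell scope delegates apart, and it does not look at
method parameters or base classes.

```d
// Hypothetical helper: true if T has a direct field whose type
// is a delegate. Usable at compile time.
bool hasDelegateField(T)()
{
    bool found = false;
    foreach (F; typeof(T.tupleof))
    {
        static if (is(F == delegate))
            found = true;
    }
    return found;
}

// Example types, not from minimal.d.
struct PlainData
{
    int id;
    char[] name;
}

struct WithCallback
{
    int id;
    void delegate() onDone;
}

// A type that stores a delegate can be rejected at compile time.
static assert(!hasDelegateField!PlainData(),
    "PlainData must not store delegates");
static assert(hasDelegateField!WithCallback());
```

Putting the static assert next to each type (or in a loop over a
module's members) would turn an accidental stored delegate into a
compile error instead of a leak.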
If we do the latter, a heap delegate can actually be allowed in a
fairly safe way, by wrapping it in a struct. Usage of
HeapDelegate!T will be pretty obvious, so you aren't going to
accidentally use it the way you might the more sugary built in. I
have a proof of concept implemented in my local copy of minimal.d
that automatically refcounts the delegate, freeing it when the
last reference goes out of scope.
Array slices are ok the way I have them implemented now: the
built in concat function is missing, so if you try a ~ b, it will
be a linker error (including the source file and line number btw,
easy enough to handle). No allocation there. The biggest risk is
lifecycle management, and the rule there is you don't own slices
(non-immutable ones at least). I'd like the compiler to implement
a check on this, but right now it doesn't. Not a hard coding
convention though.
Built in new array[] is not implemented, meaning it is a linker
error, because they are indistinguishable from slices type-wise.
(In theory it could be like classes, where you just know to
manually free them, but if you have a char[] member, are you sure
that was new'd or is it holding a slice someone else owns? Let's
just avoid it entirely.)
But, this doesn't mean we can't have some of D's array
convenience! In minimal.d, you can see a StackArray!T struct and
maybe, not sure if I put it in that zip or not, a HeapArray!T
struct. These types own their memory, stack of course going away
with scope, and heap being automatically reference counted to
call free() when all copies are gone, and overload a few
operators for convenience:
alias this is to a slice function, so you can do char[] slice =
myCustomArray; You can't change the original pointer through that
slice, so no risk of it losing the memory.
The ~= operator is implemented too on the *Array containers. They
know their length and their capacities, and you can append up to
the capacity. (The HeapArray could also realloc() as needed, but
right now I don't.) One important difference though with this and
regular D arrays is in regular D:
string a = "hey"; string b = a; b ~= " man";
assert(a == "hey" && b == "hey man");
Appending to the second one doesn't change the first one. It may
allocate as needed (see this for details:
http://dlang.org/d-array-article.html )
Whereas with a HeapArray or StackArray, they share the same
underlying data, so appending to one reference would append to
all. I think that's OK though, because we have helper things like
const to avoid that, and they are a custom type, so they are
allowed to work differently than the built ins.
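To make the sharing concrete, here is a small sketch of what a
reference counted HeapArray could look like. This is not the code
from minimal.d: the Payload layout, the refCount helper and the
choice to assert when the capacity is exceeded are all
assumptions for illustration, and it only aims at simple element
types like char (no element destructors, alignment no stricter
than size_t).

```d
import core.stdc.stdlib : malloc, free;

struct HeapArray(T)
{
    // Header stored in front of the elements in one malloc block.
    private static struct Payload
    {
        size_t refs;
        size_t length;
        size_t capacity;
    }

    private Payload* payload;

    this(size_t capacity)
    {
        payload = cast(Payload*) malloc(Payload.sizeof + capacity * T.sizeof);
        assert(payload !is null, "out of memory");
        *payload = Payload(1, 0, capacity);
    }

    // Copying the struct adds a reference to the same block.
    this(this)
    {
        if (payload) payload.refs++;
    }

    // The last copy to go away frees the block.
    ~this()
    {
        if (payload && --payload.refs == 0) free(payload);
        payload = null;
    }

    private T* data() { return cast(T*)(payload + 1); }

    // alias this to a slice function: the array converts to T[].
    T[] slice() { return payload ? data()[0 .. payload.length] : null; }
    alias slice this;

    // Append up to the fixed capacity; no realloc in this sketch.
    void opOpAssign(string op)(T item) if (op == "~")
    {
        assert(payload && payload.length < payload.capacity, "HeapArray is full");
        data()[payload.length++] = item;
    }

    size_t refCount() const { return payload ? payload.refs : 0; }
}
```

With this, auto b = a; followed by b ~= 'x'; also changes what a
sees, because both copies point at the same length and data.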
Thankfully, btw, static arrays are a different type. They can be
permitted with ease.
I didn't implement a ~ b. I think that one would be too easy to
lose and either pointlessly malloc/free in the middle of an
expression, or just forget to free entirely, but maybe it could
be done too.
Another issue is strings. In phobos and druntime both, there's a
lot of creating strings on the heap and returning them from
functions. e.g. to!string(10) returns a brand new allocated
string "10". We don't want that in our library, so it looks a
little more like C, but slices make it easier to manage. The
analogous function I wrote is
char[] intToString(int a, char[] buffer)
You pass it an area of pre-allocated buffer to write to.
buffer.length tells it where it isn't allowed to continue (unlike
a plain char* in C). It returns the slice of your own buffer that
was actually used. So writeln(10) becomes:
char[16] buffer;
write(intToString(10, buffer), "\n");
a little more verbose, but there's no mystery there about the
memory. intToString knows it only has 16 spaces to work with, you
know exactly where it is going, no allocations, and the return
value conveniently has the length used too, so we can pass it
directly to another function. (as long as that function doesn't
store the reference!)
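For reference, a function with that signature could be written
roughly like this. It is a sketch, not the version from the zip;
in particular, returning an empty slice when the buffer is too
small is an assumption, since the post doesn't say what happens
in that case.

```d
// Writes the decimal form of a into buffer and returns the part
// of buffer that was used. Never allocates.
char[] intToString(int a, char[] buffer)
{
    char[11] scratch;            // enough for "-2147483648"
    size_t pos = scratch.length;

    long value = a;              // long so that int.min can be negated
    const bool negative = value < 0;
    if (negative) value = -value;

    // Produce the digits from the least significant end.
    do
    {
        scratch[--pos] = cast(char)('0' + value % 10);
        value /= 10;
    } while (value != 0);

    if (negative) scratch[--pos] = '-';

    const size_t len = scratch.length - pos;
    if (len > buffer.length)
        return null;             // assumption: signal "does not fit"

    buffer[0 .. len] = scratch[pos .. $];
    return buffer[0 .. len];
}
```

The returned slice points into the caller's buffer, so it is only
valid for as long as that buffer is.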
Built in AAs? Not implemented. But we could do a library AA just
as easily, and thanks to overloaded operators, it would be pretty
too.
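As an illustration of how overloaded operators could make a
library AA look like the built in one, here is a deliberately
simple fixed-capacity map. SmallMap is a made-up name, it does a
linear search instead of hashing, and its in operator returns a
bool rather than a pointer like the built in AAs do; none of this
is in minimal.d.

```d
struct SmallMap(K, V, size_t capacity)
{
    private K[capacity] keys;
    private V[capacity] values;
    private size_t count;

    // Index of key, or -1 if it is not present.
    private ptrdiff_t find(K key)
    {
        foreach (i; 0 .. count)
            if (keys[i] == key) return i;
        return -1;
    }

    // map[key] = value; inserts or overwrites, never allocates.
    void opIndexAssign(V value, K key)
    {
        ptrdiff_t i = find(key);
        if (i < 0)
        {
            assert(count < capacity, "SmallMap is full");
            i = count++;
            keys[i] = key;
        }
        values[i] = value;
    }

    // map[key]; the key must exist.
    V opIndex(K key)
    {
        const i = find(key);
        assert(i >= 0, "key not found");
        return values[i];
    }

    // key in map
    bool opBinaryRight(string op)(K key) if (op == "in")
    {
        return find(key) >= 0;
    }

    size_t length() { return count; }
}
```

Usage then reads much like the built in: declare
SmallMap!(string, int, 4) ages; and write ages["ann"] = 30;.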
Another issue is exceptions. They work, and must be classes. So
where do you free them? I haven't tried this yet, but I'm pretty
sure you can just do it when you catch() it, and no problem.
Well this is turning into a real beast of a comment, so let me
sum up and finish: a lot of D can work without the GC. It will
take some custom types and deliberately missing druntime
functions to make pretty, but it leaves us with a language at
least as usable as C++, with the same idea of no surprise/hidden
allocations in there. There's still a question of having stuff we
don't necessarily want like RTTI, but their impact can be
minimized so I think it will be ok too. (Oh btw, since I have a
custom druntime here, I added some runtime reflection the real D
doesn't have yet. It rox, and came cheap, but you can still
version it out.)