Slicing betterC

Thu Sep 6 17:57:13 UTC 2018

On Thursday, September 6, 2018 11:34:18 AM MDT Adam D. Ruppe via 
Digitalmars-d-learn wrote:
> On Thursday, 6 September 2018 at 17:10:49 UTC, Oleksii wrote:
> > struct Slice(T) {
> >
> >   size_t capacity;
> >   size_t size;
> >   T*     memory;
> >
> > }
>
> There's no capacity in the slice, that is stored as part of the
> GC block, which it looks up with the help of RTTI, thus the
> TypeInfo reference.
>
> Slices *just* know their size and their memory pointer. They
> don't know how they were allocated and don't know what's beyond
> their bounds or how to grow their bounds. This needs to be
> managed elsewhere.
>
> If you malloc a slice in regular D, the capacity will be returned
> as 0 - the GC doesn't know anything about it. Any attempt to
> append to it will allocate a whole new block.
>
> In -betterC, there is no GC to look up at all, and thus it has
> nowhere to look. You'll have to make your own struct that stores
> capacity if you need it.
>
> I like to do something like
>
> struct MyArray(T) {
>        T* rawPointer;
>        int capacity;
>        int currentLength;
>
>        // most user interaction will occur through this
>        T[] opSlice() { return rawPointer[0 .. currentLength]; }
>
>        // fill in other operators as needed
> }
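
For what it's worth, a struct along those lines could be fleshed out
roughly like this. This is only a sketch: put and release are names I made
up for illustration, it uses size_t rather than int for the sizes, it
assumes that T is plain data (no destructors or postblits), and it grows
the block with C's realloc so that it works with -betterC.

```d
import core.stdc.stdlib : realloc, free;

struct MyArray(T)
{
    T* rawPointer;
    size_t capacity;
    size_t currentLength;

    // View of the live elements; nothing is allocated or copied here.
    T[] opSlice() { return rawPointer[0 .. currentLength]; }

    // Hypothetical helper: append one element, doubling the block when it
    // is full. Returns false if realloc fails (the old block stays valid).
    bool put(T value)
    {
        if (currentLength == capacity)
        {
            const newCapacity = capacity ? capacity * 2 : 4;
            auto p = cast(T*) realloc(rawPointer, newCapacity * T.sizeof);
            if (p is null)
                return false;
            rawPointer = p;
            capacity = newCapacity;
        }
        rawPointer[currentLength++] = value;
        return true;
    }

    // Hypothetical helper: free the block and reset to the empty state.
    void release()
    {
        free(rawPointer);
        rawPointer = null;
        capacity = 0;
        currentLength = 0;
    }
}

extern (C) int main()
{
    MyArray!int a;
    foreach (i; 0 .. 10)
        a.put(i);

    int[] view = a[];            // calls opSlice
    assert(view.length == 10 && view[9] == 9);
    assert(a.capacity == 16);    // grew 4 -> 8 -> 16

    a.release();
    return 0;
}
```

Note that any slice obtained from opSlice is invalidated by a later put
that reallocates, so it should not be held across appends.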

To make this even clearer, a dynamic array looks basically like this
under the hood:

struct DynamicArray(T)
{
    size_t length;
    T* ptr;
}

IIRC, it actually uses void* unfortunately, but that struct is basically
what you get. Notice that _all_ of the information that's there is the
pointer and the length. That's it. If you understand the semantics of what
happens when passing that struct around, you'll understand the semantics of
passing around dynamic arrays. All of the operations that have anything to
do with memory management (capacity, ~, ~=, etc.) require the GC. If you're
not using -betterC, the fact that the dynamic
array was allocated with malloc is pretty irrelevant, since all of those
operations will function exactly the same as if the dynamic array were
allocated by the GC. It's just that because the dynamic array is not
GC-allocated, its capacity is guaranteed to be 0, and therefore any
operation that would increase the array's length requires reallocating
the dynamic array with the GC, whereas if it were already GC-allocated,
its capacity might have been greater than its length, in which case
reallocation would not be required.
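
A small program for regular D (not -betterC) should show the difference.
It is only a sketch, and appendReallocates is a helper name I made up for
it; capacity and reserve are the normal druntime array properties.

```d
import core.stdc.stdlib : malloc, free;

// Hypothetical helper: reports whether appending to `arr` moved the data.
// `arr` is passed by value, so the caller's pointer and length are untouched.
bool appendReallocates(int[] arr)
{
    const before = arr.ptr;
    arr ~= 42;
    return arr.ptr !is before;
}

void main()
{
    // A dynamic array over malloc'd memory: still just pointer + length.
    int* p = cast(int*) malloc(4 * int.sizeof);
    assert(p !is null);
    int[] fromMalloc = p[0 .. 4];
    fromMalloc[] = 0;

    // The GC knows nothing about this block, so there is no room to grow.
    assert(fromMalloc.capacity == 0);
    // Appending copies the elements into a new GC-allocated block.
    assert(appendReallocates(fromMalloc));
    assert(fromMalloc.length == 4);

    // A GC-allocated array can have spare room after its last element.
    int[] fromGC = new int[](4);
    fromGC.reserve(16);
    assert(fromGC.capacity >= 16);
    // With spare capacity, the append happens in place.
    assert(!appendReallocates(fromGC));

    free(p);
}
```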

If you haven't read it already, I would suggest reading this article:

https://dlang.org/articles/d-array-article.html

It does not use the official terminology, but in spite of that, it should
really help clarify things for you. The article refers to T[] as being a
slice (which is accurate, since it is a slice of memory), but it incorrectly
refers to the memory buffer itself as being the dynamic array, whereas the
language spec considers the T[] (the struct shown above) to be the dynamic
array. The language does not have a specific name for that memory buffer,
and it considers a T[] to be a dynamic array regardless of what memory it
refers to. So, you should keep that in mind when reading the article, but
the concepts that it teaches are very much correct and should help a great
deal in understanding how dynamic arrays work in D.
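
To illustrate the point about a T[] being a dynamic array regardless of
what memory it refers to, here is a short sketch for regular D. The sum
function is just something made up for the example; it has no way of
telling where the memory behind its argument came from.

```d
import core.stdc.stdlib : malloc, free;

// Hypothetical example function: works on any dynamic array of int.
int sum(const(int)[] values)
{
    int total = 0;
    foreach (v; values)
        total += v;
    return total;
}

void main()
{
    int[3] onStack = [1, 2, 3];          // static array on the stack
    int[] fromGC = [1, 2, 3];            // GC-allocated
    int* p = cast(int*) malloc(3 * int.sizeof);
    assert(p !is null);
    int[] fromMalloc = p[0 .. 3];        // malloc'd memory
    fromMalloc[] = onStack[];            // element-wise copy

    // All three are passed as the same pointer + length pair.
    assert(sum(onStack[]) == 6);
    assert(sum(fromGC) == 6);
    assert(sum(fromMalloc) == 6);
    static assert((int[]).sizeof == size_t.sizeof + (int*).sizeof);

    // Slicing makes a new view of the same memory; nothing is copied.
    int[] tail = fromGC[1 .. $];
    assert(tail.ptr == fromGC.ptr + 1);
    assert(tail.length == 2);

    free(p);
}
```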

- Jonathan M Davis