Separate slices and dynamic arrays
Quirin Schroll
qs.il.paperinik at gmail.com
Fri Aug 2 21:43:33 UTC 2024
On Wednesday, 31 July 2024 at 23:07:14 UTC, monkyyy wrote:
> ```d
> int[3] foo;
> int[] bar=foo[];
> bar[1]=3;
> assert(foo[1]==3);//works, bar is a reference to foo
> bar~=5;//oh no
> bar[2]=4;
> assert(foo[2]==4);//failure, bar stopped being a reference to foo
> ```
>
> slices are sometimes better pointer math or sometimes gc'ed
> managed memory. This is unsafe and lacks clear ownership.
>
> So two parts:
>
> Introduce `int[?]` as a dynamic array, it "owns" the memory and
> duplicates on assignment from slices.
>
> Deprecate append to slices.
>
> `auto bar=foo[];` would still be a slice
I think your confusion is somewhat warranted by the fact that
many people on the forum call slices dynamic arrays. I don't. I'm
careful with technical terms because they inform people's
intuition about things, and in my intuition, dynamic arrays don't
overlap. D does have dynamic arrays, though: array literals and
`new int[](n)` are what I would call dynamic arrays. But they
evaluate to only a slice of that dynamic array. Slices never own
data. If you append to a slice and it has enough capacity, it is
extended in place. If you concatenate two of them or append to
one that doesn't have enough capacity, a new dynamic array
is allocated, stuff is copied to it, and you get a slice of that
data.
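A minimal sketch of the two append outcomes described above. The helper name `appendsInPlace` is made up for illustration, and `reserve` is called only so that the in-place case is deterministic:

```d
// Hypothetical helper: true if appending left the data where it was.
bool appendsInPlace(int[] s, int value)
{
    const before = s.ptr;
    s ~= value;
    return s.ptr is before;
}

void main()
{
    int[3] fixed = [1, 2, 3];
    int[] view = fixed[];             // slice of stack memory
    assert(view.capacity == 0);       // nothing to grow into
    assert(!appendsInPlace(view, 4)); // a new GC array is allocated

    int[] arr = new int[](3);
    arr.reserve(16);                  // ask the runtime for spare capacity
    assert(arr.capacity >= 16);
    assert(appendsInPlace(arr, 4));   // extended in place, same pointer
}
```

The first case is what happens to `bar` in the quoted snippet: after `bar~=5`, it points at a fresh GC allocation and no longer at `foo`.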
If you append, you can only write to the lower indices of the
result if you don't care whether it possibly affects some other slice.
And sometimes, you actually do not care, especially if the
underlying type isn't even mutable, like with strings. With
mutable elements, you might get some “spooky action at a
distance” vibes, but then you're using slices incorrectly. You
can always force a copy with `dup`. You can't force an in-place
extension if there's just not enough capacity.
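To make that concrete, here is a small sketch. The function name `firstAfterAliasedWrite` is hypothetical, and `reserve` is there only to guarantee that the append happens in place:

```d
// Hypothetical helper: appends to a second slice, writes to index 0
// through it, and reports what the original slice sees afterwards.
int firstAfterAliasedWrite(bool copyFirst)
{
    int[] a = [1, 2, 3];
    a.reserve(8);                    // make sure there is spare capacity
    int[] b = copyFirst ? a.dup : a; // `dup` forces a separate array
    b ~= 4;                          // in place when b shares a's memory
    b[0] = 99;                       // write to a low index of the result
    return a[0];
}

void main()
{
    assert(firstAfterAliasedWrite(false) == 99); // write shows up in `a`
    assert(firstAfterAliasedWrite(true) == 1);   // `dup` isolates `a`
}
```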
If you want an owning dynamic array type, you can make your own
type. Probably. Because I don't really know what you mean by
owning the data. You probably don't mean it in the “responsible
for freeing” sense, and I don't know any other sense.
The only thing I can imagine you'd want to have owned is in a
vector-like type that one can append to, and which, contrary to
built-in slices, on exceeding capacity, does not copy the
elements to the newly allocated bigger buffer, but moves them.
It'd be similar to how iterators of a C++ vector are invalidated
if you append to it.
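A rough sketch of what such a type could look like, under my own assumptions: the names `Vec` and `append` are made up, the growth factor is arbitrary, and this is not a proposal for the language or Phobos. Copying is disabled so there is exactly one owner, and the buffer is never handed out as a slice:

```d
import std.algorithm.mutation : move;

// Hypothetical owning vector: not copyable, elements are moved on growth.
struct Vec(T)
{
    private T[] buf;      // backing storage, never exposed as a slice
    private size_t len;   // number of elements in use

    @disable this(this);  // no implicit copies, so a single owner

    void append(T value)
    {
        if (len == buf.length)
        {
            // Grow: move the elements over instead of copying them.
            auto bigger = new T[](buf.length ? buf.length * 2 : 4);
            foreach (i; 0 .. len)
                bigger[i] = move(buf[i]);
            buf = bigger;
        }
        buf[len++] = move(value);
    }

    ref T opIndex(size_t i)
    {
        assert(i < len);
        return buf[i];
    }

    size_t length() const { return len; }
}

void main()
{
    Vec!int v;
    foreach (i; 0 .. 10)
        v.append(i);
    assert(v.length == 10);
    assert(v[9] == 9);
}
```

A reference obtained through `opIndex` before a growth step refers to the old buffer afterwards, which is the D analogue of the iterator invalidation mentioned above.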
Walter once proposed deprecating slice append and slice length
assignment, and he faced almost unanimous backlash. Slice append
is very, very useful in small programs and at CTFE. Even Zig
acknowledges how useful such array manipulation is and allows
it at compile-time (not at run-time though, as it's not a GC
language).
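As an illustration of the CTFE point, a minimal sketch (the function `squares` is just an example name): the same append-based function runs at compile time and at run time.

```d
// Builds 0, 1, 4, 9, ... with plain slice append.
int[] squares(int n)
{
    int[] result;
    foreach (i; 0 .. n)
        result ~= i * i;
    return result;
}

// `enum` and `static assert` force compile-time evaluation.
enum table = squares(5);
static assert(table == [0, 1, 4, 9, 16]);

void main()
{
    assert(squares(3) == [0, 1, 4]); // same function at run time
}
```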
D's slices are what in C++ is a vector or string, a span or
string_view, or a valarray, or even something like an output
iterator. My first instinct was: This is either nuts or sublime
genius, and it's most definitely the latter.
I work with C++. It's my professional job. In our codebase, we
don't use the stdlib's string view because that thing can be
initialized with string temporaries, and if you pass a temporary
string to a function that takes a string view (for which there is
no indication whatsoever) and that function stores it somewhere,
you've got yourself a stored dangling pointer. Our in-house string
view requires explicit construction from an rvalue string, so
there is an indication that we risk a dangling pointer. Why am I
saying this? In D, that's a non-issue. If you pass a slice to a
function and that function decides to store it, fine. The GC
keeps the memory around as needed. In C++, I must decide whether
to take as argument or store a string or a string view. The latter
is dirt cheap and quick, but it doesn't (co-)own the memory.
Please, don't force this kind of decision onto D. It's simply
great not to have to choose.
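A small sketch of the "store it, fine" case. The `Registry` type and its members are hypothetical, and the example assumes the slice refers to GC-allocated memory; a slice of a stack array is a different matter.

```d
// Hypothetical type that keeps whatever slice it is handed.
struct Registry
{
    const(char)[][] names;

    void remember(const(char)[] name)
    {
        names ~= name;   // stores the slice itself, not a copy
    }
}

string makeGreeting(string who)
{
    return "hello, " ~ who;   // freshly GC-allocated string
}

void main()
{
    Registry r;
    r.remember(makeGreeting("D")); // temporary passed straight in
    // The stored slice keeps the memory alive; nothing dangles.
    assert(r.names[0] == "hello, D");
}
```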