DIP 1025--Dynamic Arrays Only Shrink, Never Grow--Community Review Round 1

Mon Nov 11 11:17:26 UTC 2019

On Monday, November 11, 2019 3:27:26 AM MST Mike Parker via Digitalmars-d 
wrote:
> This is the feedback thread for the first round of Community
> Review for DIP 1025, "Dynamic Arrays Only Shrink, Never Grow":
>
> https://github.com/dlang/DIPs/blob/1b525ec4c914c06bc286c1a6dc93bf1533ee56e4/DIPs/DIP1025.md
>
> All review-related feedback on and discussion of the DIP should
> occur in this thread. The review period will end at 11:59 PM ET
> on November 25, or when I make a post declaring it complete.
>
> At the end of Round 1, if further review is deemed necessary, the
> DIP will be scheduled for another round of Community Review.
> Otherwise, it will be queued for the Final Review and Formal
> Assessment.
>
> Anyone intending to post feedback in this thread is expected to
> be familiar with the reviewer guidelines:
>
> https://github.com/dlang/DIPs/blob/master/docs/guidelines-reviewers.md
>
> *Please stay on topic!*
>
> Thanks in advance to all who participate.

Ouch. This would be a _huge_ breaking change. The ability to grow dynamic
arrays is used all over the place. It's also one of the great benefits of
D's dynamic arrays IMHO. The fact that you can append to a dynamic array without
caring where it came from can be incredibly useful, and if you really care
about checking whether it has the capacity to grow or whether it's managed
by the GC, we have functions to check for that.
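
To make that concrete, here is a minimal sketch of appending to a slice
without knowing where its memory came from. The helper name addOne is
hypothetical; capacity and GC.addrOf are the existing checks I'm referring to.

```d
import core.memory : GC;

// Hypothetical helper: appends without knowing where `arr` came from.
int[] addOne(int[] arr, int value)
{
    arr ~= value; // grows in place when it can, otherwise reallocates via the GC
    return arr;
}

void main()
{
    int[3] stack = [1, 2, 3];
    int[] fromStack = stack[];                // slice of non-GC memory

    assert(fromStack.capacity == 0);          // cannot grow in place
    assert(GC.addrOf(fromStack.ptr) is null); // not managed by the GC

    int[] grown = addOne(fromStack, 4);
    assert(grown == [1, 2, 3, 4]);
    assert(GC.addrOf(grown.ptr) !is null);    // the append copied into GC memory
    assert(stack == [1, 2, 3]);               // the original buffer is untouched

    int[] fromGC = [1, 2, 3];                 // array literal, GC-allocated
    assert(fromGC.capacity >= 3);             // the runtime tracks room to grow
    assert(addOne(fromGC, 4) == [1, 2, 3, 4]);
}
```

The caller of addOne never has to care which of the two cases it is in.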

How would this interact with stuff like assumeSafeAppend? There are code
bases that take advantage of the combination of assumeSafeAppend and ~= so that
they don't have to worry about managing allocations (since ~= takes care of
it all), but they can still reuse the dynamic array and thus avoid
reallocations if the GC-managed memory block that the dynamic array is a
slice of is large enough to hold whatever data is appended. It looks to me
like this change would break the code in all such code bases.
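
For anyone unfamiliar with the pattern, this is roughly what it looks like.
It's only a sketch, and appendReusesBlock is a hypothetical helper written
for illustration; assumeSafeAppend and ~= are the real pieces.

```d
// Hypothetical helper: takes a shortened slice of `buf`, appends to it, and
// reports whether the append reused the memory block that `buf` points at.
// Assumes `buf` is non-empty and GC-allocated.
bool appendReusesBlock(int[] buf, bool markSafe)
{
    int[] s = buf[0 .. $ - 1];
    if (markSafe)
        s.assumeSafeAppend(); // promise that nothing needs the elements past s
    s ~= 99;
    return s.ptr is buf.ptr;
}

void main()
{
    int[] buf = [1, 2, 3, 4];

    // Without assumeSafeAppend, the runtime refuses to stomp on buf[3],
    // so the append reallocates.
    assert(!appendReusesBlock(buf, false));
    assert(buf == [1, 2, 3, 4]);

    // With assumeSafeAppend, the same block is reused and buf[3] is overwritten.
    assert(appendReusesBlock(buf, true));
    assert(buf[3] == 99);
}
```

Code written this way gets buffer reuse without doing any manual allocation
bookkeeping, which is exactly what would no longer compile.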

As for Appender, it's designed around the idea that you're going to build an
array and then use it rather than you're going to grow it or shrink it as
needed, so it doesn't work anywhere near as well when dealing with an array
shrinking, let alone growing and shrinking. It also does not work for the
use case where you're passing dynamic arrays around and then appending to
them, since then you don't have access to the Appender, just the dynamic
array. Also, the ability to append to a dynamic array without worrying about
whether it allocates or not such that the array can grow in place when it
can (thus avoiding unnecessary allocations) but still have reallocations
occur when it needs to is something that isn't going to work with Appender.
In my experience, code operating on strings in particular takes advantage of
being able to slice and append to dynamic arrays with impunity - which works
particularly well with immutable, since then you don't have to worry about
any of the characters being mutated. Changing such code to use Appender or ~
would be far more complicated and is likely to cause a lot more unnecessary
allocations to occur.
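
As a rough illustration of the difference (withSuffix is a hypothetical
function, not something from the DIP or from Phobos):

```d
import std.array : appender;

// Hypothetical function that is only handed a string, not an Appender.
string withSuffix(string s, string suffix)
{
    s ~= suffix; // safe: existing characters are immutable and never modified
    return s;
}

void main()
{
    string full = "hello world";
    string head = full[0 .. 5];     // slicing does not copy
    assert(head.ptr is full.ptr);

    string result = withSuffix(head, "!");
    assert(result == "hello!");
    assert(full == "hello world");  // no other slice is affected

    // With Appender, the builder object itself has to be available wherever
    // the appending happens; the resulting array alone is not enough.
    auto app = appender!string();
    app.put(head);
    app.put("!");
    assert(app.data == "hello!");
}
```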

Also, I fail to see how getting rid of the ability to append and reallocate
dynamic arrays really helps with managing memory when the memory that the
dynamic array is a slice of is not allocated by the GC. Dynamic arrays
aren't reference-counted. You can have slices throughout the code which all
refer to the same memory even if they never do an append operation. All of
the tools that you'd need to use in terms of scope or pure or coding
conventions or whatever to know whether a slice of that memory exists
anywhere (and thus whether it's safe to free it) would be the same whether
it can be appended to or not. And if you want to disable the ability to
append to a dynamic array, you can always just mark the code as @nogc. Why
do programmers who _do_ want to use the GC in their code need to be punished
by making dynamic arrays worse, when anyone who doesn't want to use the GC
can already explicitly add restrictions forbidding it?
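
A small sketch of that restriction as it works today. The enum names are
hypothetical; they just record whether each function literal compiles.

```d
// Appending inside @nogc code is rejected at compile time.
enum bool canAppendInNogc =
    __traits(compiles, () @nogc { int[] a; a ~= 1; });

// The same code without @nogc compiles fine.
enum bool canAppendWithGC =
    __traits(compiles, () { int[] a; a ~= 1; });

// Shrinking by slicing never allocates, so @nogc allows it.
enum bool canShrinkInNogc =
    __traits(compiles, () @nogc { int[] a; a = a[0 .. 0]; });

void main()
{
    static assert(!canAppendInNogc);
    static assert(canAppendWithGC);
    static assert(canShrinkInNogc);
}
```

So code that opts in with @nogc already gets shrink-only slices, and nobody
else has to give anything up.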

Honestly, this seems like a prime case for how trying to make D work better
with code that doesn't want to use the GC is making D worse for code that
does want to use the GC. I really, really, really hope that this change does
not make it into the language. From where I sit, it would make the language
hands down worse. In fact, as far as usability goes, it would make D's
strings _worse_ than C++'s strings. Sure, we'd still be able to get
substrings more efficiently than C++ can, but not being able to append to a
string makes it a pretty terrible string. At least C++'s strings can do that
efficiently.

IMHO, where D's dynamic arrays and strings sit right now as far as memory
allocation goes is the sweet spot. They're efficient at both growing and
shrinking. If I wanted a language that was catering to the non-GC case over
the GC case, I'd use C++. Improvements to D that make it work better with
@nogc code without compromising code that uses the GC seem like a good idea
to me, but this very much compromises the language for code that uses the
GC. _Please_ don't do this.

- Jonathan M Davis