A slice can lose capacity simply by calling a function
Jonathan M Davis via Digitalmars-d-learn
digitalmars-d-learn at puremagic.com
Mon May 4 10:36:20 PDT 2015
On Monday, 4 May 2015 at 06:23:42 UTC, Ali Çehreli wrote:
> On 05/03/2015 06:06 PM, Jonathan M Davis via
> Digitalmars-d-learn wrote:
> (I am eagerly waiting for your DConf talk to see how you make
> sense of it all.)
Well, we'll see how much I'm able to cover about arrays. The
focus of the talk is on ranges, not arrays, so I don't know if
talking much about non-range stuff like array capacity is going
to fit in with everything else that needs to be discussed that
_is_ specific to ranges. I'd like to discuss it though.
Regardless, I keep meaning to write an article on ranges, and I'm
increasingly convinced that I should take a crack at writing one
on arrays as well. Steven's article is quite enlightening, but I
think that it approaches things the wrong way (e.g. it focuses on
the memory buffers that the runtime manages rather than on the
dynamic arrays themselves) and uses the wrong terminology (e.g.
it calls the memory buffers that the runtime manages dynamic
arrays, when according to the language spec, T[] is a dynamic
array, and what it refers to doesn't matter with regard to
whether it's a dynamic array or not). So, I'll probably turn some
portion of my talk into an article or two, and there won't be the
same time pressures there.
At this point, I feel like I have how dynamic arrays work pretty
well ironed out in my head, and it's actually pretty
straightforward. But in general, we seem to do a poor job of
explaining it clearly, which results in far more confusion on the
topic than I think there should be.
> > For the most part, D's dynamic arrays just
> > work.
>
> I know you are not trolling but I can't take your brushing off
> this issue with phrases like "for the most part". That's the
> frigging problem! "For the most part" is not sufficient. Unless
> somebody explains the semantics in a consistent way, I will
> continue to try to do it myself. (Remember: Never append to a
> parameter slice. Good function, good!)
Aside from performance considerations, you can pretty much ignore
the capacity issue. The only other concern that it raises is
whether two dynamic arrays still refer to the same memory block,
and once you append to either of them, you can't assume that they
do, and you can't assume that they don't (though it's easy enough
to check via their ptr properties). That can be managed on some
level by checking the capacity ahead of time, but really, once
you start appending, you have to treat each slice as possibly
separate, and if you want to guarantee it, you really need to use
dup or idup.
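
To make that concrete, here is a minimal sketch (sameStart is a
hypothetical helper made up for illustration, not a library
function). It only assumes that an array literal is allocated by
the GC. Whether the append reallocates depends on the runtime, so
the example reports the result rather than assuming it:

```d
import std.stdio : writeln;

// Hypothetical helper: true if both slices start at the same address.
bool sameStart(const(int)[] x, const(int)[] y)
{
    return x.ptr is y.ptr;
}

void main()
{
    int[] a = [1, 2, 3];
    int[] b = a;                 // b refers to the same memory as a
    assert(sameStart(a, b));

    // A capacity of 0 means that appending cannot happen in place,
    // so the append below would have to reallocate.
    writeln("capacity of a before appending: ", a.capacity);

    a ~= 4;                      // may extend in place or reallocate

    // b is unaffected either way; it still has three elements.
    assert(b == [1, 2, 3]);

    writeln(sameStart(a, b) ? "still the same memory block"
                            : "a was reallocated");

    // dup guarantees an independent copy.
    int[] c = a.dup;
    assert(!sameStart(a, c));
    c[0] = 42;
    assert(a[0] == 1);
}
```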
But most code just doesn't need to care about capacity. And if
you really do need to care, odds are that you can either fix the
problem with a reserve call or by using Appender, or you should
just not use dynamic arrays directly. In general, I'd consider
code that worries much about the capacity of dynamic arrays to be
error-prone, or at least not going about things in the best way.
By its very nature, it's likely to end up being inefficient, and
it's too likely to care about whether two dynamic arrays refer to
the same memory or not.
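
As a rough sketch of the two fixes mentioned above (the function
names are made up for illustration), reserve asks the runtime for
room up front so that the appends which follow can happen in
place, and Appender from std.array keeps track of the capacity
itself:

```d
import std.array : appender;

// Hypothetical example: build an array with a single reserve call.
int[] buildWithReserve(size_t n)
{
    int[] arr;
    arr.reserve(n);              // request room for n elements up front
    foreach (i; 0 .. n)
        arr ~= cast(int) i;
    return arr;
}

// Hypothetical example: the same thing using Appender.
int[] buildWithAppender(size_t n)
{
    auto app = appender!(int[])();
    app.reserve(n);
    foreach (i; 0 .. n)
        app.put(cast(int) i);
    return app.data;             // the dynamic array built so far
}

void main()
{
    assert(buildWithReserve(4) == [0, 1, 2, 3]);
    assert(buildWithAppender(4) == [0, 1, 2, 3]);
}
```

Either way, the calling code never has to look at capacity
itself.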
Dynamic arrays are badly designed for situations where you can
have random stuff appending to your array. They just are. There's
no ownership, and they're not full reference types, which makes
it trivial to end up with something appended to one dynamic array
but not to the one that you actually wanted it appended to.
As such, I'd argue that anything that's doing a lot of random
appending to arrays shouldn't be using dynamic arrays (at least,
not without wrapping them so that there's clear ownership of the
memory).
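
A small sketch of what I mean (IntBuffer and the two functions
are hypothetical names made up for illustration, and a class is
just one possible way to wrap the array). Appending to a slice
passed by value only grows the function's own copy of the slice,
whereas a wrapper with reference semantics gives the memory a
single, clear owner:

```d
// Appends to the function's local slice; the caller's slice keeps
// its original length.
void appendByValue(int[] arr)
{
    arr ~= 99;
}

// Appends to the caller's slice, because it is passed by ref.
void appendByRef(ref int[] arr)
{
    arr ~= 99;
}

// Hypothetical wrapper: the class owns the array, and every
// reference to the object sees every append.
final class IntBuffer
{
    private int[] data;

    void put(int value) { data ~= value; }

    const(int)[] view() const { return data; }
}

void main()
{
    int[] a = [1, 2, 3];
    appendByValue(a);
    assert(a == [1, 2, 3]);      // the append is not visible here

    appendByRef(a);
    assert(a == [1, 2, 3, 99]);

    auto buf = new IntBuffer;
    auto other = buf;            // second reference to the same object
    other.put(1);
    buf.put(2);
    assert(buf.view == [1, 2]);
}
```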
So, ultimately, I see array capacity as being pretty much a
non-issue, because most code that would care much about it is
going about things in the wrong way. But maybe what we need is a
clear set of guidelines about how dynamic array slices should be
managed so that they're generally used efficiently and without
risking weird behavior due to expecting two dynamic arrays to
refer to the same memory when they don't.
In general though, I'd argue that code should be constructing
arrays up front and then processing them as ranges and not doing
a lot of appending to them later. In particular, if you do a lot
of appending and removals as you go along, it's going to be a
performance hit, and you seriously risk having trouble due to
having operated on a slice which no longer refers to the same
memory as the dynamic array that you actually wanted to operate
on.
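
As a rough illustration of that style (squaresOfEvens is a
hypothetical function made up for this example), the array is
built once and then only read through range algorithms, so
nothing is appended to it afterwards:

```d
import std.algorithm : filter, map;
import std.array : array;
import std.range : iota;

// Hypothetical example: construct the array up front, then
// process it as a range.
int[] squaresOfEvens(int n)
{
    int[] values = iota(0, n).array;   // built once, never appended to

    return values
        .filter!(v => v % 2 == 0)      // lazy; no allocation here
        .map!(v => v * v)              // lazy as well
        .array;                        // single allocation for the result
}

void main()
{
    assert(squaresOfEvens(6) == [0, 4, 16]);
}
```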
- Jonathan M Davis