Suggestion: dynamic array operations

Wed Sep 6 15:43:42 PDT 2006

Gregor Richards wrote:
> Gregor Richards wrote:
>> Steve Horne wrote:
>>> On Wed, 06 Sep 2006 10:51:32 -0700, Gregor Richards
>>> <Richards at codu.org> wrote:
>>>
>>>> Arrays are arrays, not lists.  You can abuse them into stacks, 
>>>> queues and the ilk, but it's an inefficient use of memory.
>>>
>>> Sorry for pedanting, but it's an inefficient use of CPU cycles, not
>>> memory. Arrays are more memory efficient than linked lists - no
>>> per-item pointers, and only a one-off malloc overhead.
>>>
>>> Even for time, it depends on how big the stack/deque will get. For
>>> small sizes arrays are a good choice. Realloc problems are minimised,
>>> time to shift the items up/down is trivial, and there are cache
>>> benefits to keeping the items in adjacent memory locations.
>>>
>>> For large deques, what you really want is a list of small arrays. A
>>> trade-off between the costs and benefits of both lists and arrays.
>>> IIRC this is pretty much what std::deque does in C++. But even then,
>>> pushing and popping on the small arrays is needed to implement the
>>> full deque logic.
>>>
>>> And of course, that large deque class could overload those operators
>>> to allow the same syntax.
>>>
>>> I just think methods are a better choice than operators for this.
>>>
>>
>> To pop an element off the front of an array requires either losing 
>> that chunk of memory or moving the entire rest of the array back such 
>> that that memory is still used.  Pushing an element onto either side 
>> of an array involves expanding the memory used by that array, which 
>> often involves duplication.  Using an array as a stack or queue is a 
>> waste of MEMORY, not CPU cycles.
>>
>>  - Gregor Richards
> 
> Erm, totally misread everything.  Wowza.
> 
> Yeah, that'd be CPU cycles :P

Not necessarily.  Appending to an array that doubles its size on 
reallocations is an amortized O(1) operation.  And since arrays display 
far better locality than linked-lists, the potentially far superior 
cache performance can result in a noticeable performance increase over 
linked-list implementations of some data structures.  Stacks are 
relatively trivial to implement in arrays (push/pop from the back), and 
even queues can be made fairly efficient (if there is a soft upper bound 
on their size) by using a circular array.  Heaps and binary trees do 
fairly well in arrays also, since insertions and deletions do not 
require moving data.  That isn't to say that arrays are always the best 
choice for implementing such data structures, but I think it's a much 
grayer area than you're suggesting.  Particularly when real-world issues 
such as cache misses come into play.

Sean