Feature Request: Linked Arrays

Fri May 2 05:32:15 PDT 2008

downs wrote:
> janderson wrote:
>> Robert Fraser wrote:
>>> I'd rather ask the GC to expose a canExpand() method and have it
>>> implemented in the library.
>> I agree.
> 
> So do I, actually. That would be best.
>> BTW, to state the obvious:
>>
>> "Linked arrays" have several dis-advantages when compared to arrays.
>> Namely after a while of lots of insertions deletions they become very
>> fragmented and performance slows down.
> 
> I don't see how that could happen, if you're only appending at the end.

If you're only appending at the end, why use a linked array? Simply so 
your array does not need to be contiguous?

>> The main thing you have to
>> watch out for is traversal. Often I've seen code slow down because
>> people write a data structure to speed up insertion, not realizing that
>> most of the time was spent in traversal.  By using links the work of the
>> pre-fetcher, code optimizer and cache is increased.
> 
> It depends on the size of the elements.
> 
> If it's, say, an int[>], about 1k of them could fit into a single page of memory. At that point, and especially if the next array pointer is stored at the _end_ of the memory range, and considering that linear traversal can easily be enhanced with prefetch instructions, I think the traversal speed difference to flat arrays shrinks considerably.
> 
> I don't have the benchs to demonstrate that though. Sorry.

Data locality is a strong advantage, too. It depends, I think, on how 
often you are planning on appending to the array.

I think most people usually use small arrays most of the time, but I 
don't have data for this.

>> One think what would help, is if you could get the next free aligned
>> block in memory from the previous block.  That way the array would be
>> slightly more cache friendly.
>>
> 
> I don't understand.

When you access memory, the OS pulls in at least one page to cache. 
Often it'll pull in a bit more, since you're likely to access memory 
with nearby addresses. So if you plan your data layout around this, you 
get better performance.

>> I'm not saying this sort of data struct doesn't have it's uses,
>> sometimes its useful for minimizing Memory Manager fragmentation.  In
>> fact if you have a huge array, sometimes the only way to get around
>> fragmentation is to spit it up.
> 
> That's the idea :) Currently, appending to an array is almost embarassingly slow. This forces the use of preallocating, which is unintuitive.
> 
> Imnsho :)

The method is, I think, but not the idea.