Array append performance

Robert Jacques sandford at jhu.edu
Mon Aug 25 19:25:58 PDT 2008


On Mon, 25 Aug 2008 18:45:30 -0400, Robert Jacques <sandford at jhu.edu>  
wrote:

> On Mon, 25 Aug 2008 12:51:30 -0400, bearophile  
> <bearophileHUGS at lycos.com> wrote:
>> Rectangular 2D arrays enjoy a stride as 4th field, while (usually 1D)  
>> strings enjoy having a hash value there. The two cases are very  
>> distinct, with rare mixed cases.
>> So an ugly idea: if the 4th array is 1D the field can be the hash  
>> value, if it's 2D it can contain the stride ;-) (But if superdan is  
>> right, all this discussion may need to be re-built on different basis).
>> On the other hand, the capacity field isn't much useful for 2D arrays  
>> (because you usually append to 1D arrays), so you end with just 3  
>> fields with 2D arrays and 4 fields for 1D arrays :-) So, if you want to  
>> keep both structures of len 4 (assuming pointers and size_t having the  
>> same size) you can add a second stride, useful for 3D arrays too ;-)
>
> I was refering to 1D arrays with strides. For 2D/3D/ND arrays you need a  
> length and a stride per dimension in order to support array slices,  
> interfacing to both C and/or Fortran code bases, etc.
> So a ND array looks like
> {
>      T*           ptr;
>      size_t[N]    lengths;
>      ptrdiff_t[N] byte_strides;
> }
>
> However, regarding your idea, as caching the has makes sense for  
> invarient arrays and capacity would mostly be used during string  
> bulding, you could overload the field based on type. However, this means  
> neither optimization works for const arrays, which might not be a good  
> thing as it encorages code bloat.
>

Using these optimizations with const could be done with either a bit-flag  
(in length) or utilizing the block nature of the GC which leads to  
capacities with (at least 4 I think) zero low order bits.

P.S. byte_strides should probably go above lengths as it's a better of  
memory layout.



More information about the Digitalmars-d mailing list