Array append performance
Robert Jacques
sandford at jhu.edu
Mon Aug 25 19:25:58 PDT 2008
On Mon, 25 Aug 2008 18:45:30 -0400, Robert Jacques <sandford at jhu.edu>
wrote:
> On Mon, 25 Aug 2008 12:51:30 -0400, bearophile
> <bearophileHUGS at lycos.com> wrote:
>> Rectangular 2D arrays enjoy a stride as 4th field, while (usually 1D)
>> strings enjoy having a hash value there. The two cases are very
>> distinct, with rare mixed cases.
>> So an ugly idea: if the 4th array is 1D the field can be the hash
>> value, if it's 2D it can contain the stride ;-) (But if superdan is
>> right, all this discussion may need to be re-built on different basis).
>> On the other hand, the capacity field isn't much useful for 2D arrays
>> (because you usually append to 1D arrays), so you end with just 3
>> fields with 2D arrays and 4 fields for 1D arrays :-) So, if you want to
>> keep both structures of len 4 (assuming pointers and size_t having the
>> same size) you can add a second stride, useful for 3D arrays too ;-)
>
> I was refering to 1D arrays with strides. For 2D/3D/ND arrays you need a
> length and a stride per dimension in order to support array slices,
> interfacing to both C and/or Fortran code bases, etc.
> So a ND array looks like
> {
> T* ptr;
> size_t[N] lengths;
> ptrdiff_t[N] byte_strides;
> }
>
> However, regarding your idea, as caching the has makes sense for
> invarient arrays and capacity would mostly be used during string
> bulding, you could overload the field based on type. However, this means
> neither optimization works for const arrays, which might not be a good
> thing as it encorages code bloat.
>
Using these optimizations with const could be done with either a bit-flag
(in length) or utilizing the block nature of the GC which leads to
capacities with (at least 4 I think) zero low order bits.
P.S. byte_strides should probably go above lengths as it's a better of
memory layout.
More information about the Digitalmars-d
mailing list