Change representation of dynamic arrays?

Kris foo at bar.com
Sun Oct 21 18:14:24 PDT 2007


Can anyone explain why the compiler/iterators/foreach would insist on using 
the two operations Walter notes below (decrement the length, increment the 
pointer) instead of first converting to a pointer pair?

Seems like this is a concern with more than one way to approach it?
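
To make the question concrete, here is a minimal C sketch of what "first converting to a pointer pair" could look like. The names (IntSlice, sum_slice) are hypothetical and this is not what the compiler actually emits; it only assumes the (length, ptr) layout Walter describes below.

```c
#include <stddef.h>

/* Hypothetical stand-in for the current (length, ptr) array layout. */
typedef struct {
    size_t length;
    const int *ptr;
} IntSlice;

/* Sum the elements by converting to a pointer pair once, before the loop.
   After the conversion, each iteration increments a single pointer. */
long sum_slice(IntSlice s)
{
    const int *p = s.ptr;
    const int *end = s.ptr + s.length;  /* one-time conversion */
    long total = 0;

    for (; p != end; ++p)
        total += *p;
    return total;
}
```

The conversion costs one addition per loop rather than one per element, which is why it seems like the stored layout and the loop's working form could be kept separate.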

- kris



"Walter Bright" <newshound1 at digitalmars.com> wrote in message 
news:ffbr56$r40$1 at digitalmars.com...
> Currently, arrays are represented under the hood as:
>
> size_t lengthOfArray;
> void* ptrToStartOfArray;
>
> Which works out reasonably well. The problem is if you want to use array 
> types as the basis of iterators, and you want to step through the array. 
> There's no escaping it being two operations:
>
> decrement the length
> increment the pointer
>
> This puts a brick in any fast implementation of iterators. To fix that, we 
> can change the representation to:
>
> void* ptrToStartOfArray;
> void* ptrPastEndOfArray;
>
> Then there's just one increment. Some tests show this can improve loop 
> performance by up to 70%.
>
> So, what does this not break?
>
> 1) Doesn't break array.ptr, this will still work.
> 2) Doesn't break array.length as rvalue, as this is rewritten by the 
> compiler as (array.end - array.start).
> 3) Doesn't break array.length as an lvalue, as that is handled by the 
> runtime library anyway.
> 4) Won't break anything on D 1.0, as it wouldn't get this change.
> 5) Won't break array slices, or any of that stuff we love about D arrays.
>
> What does this break?
>
> 1) Passing dynamic arrays to printf as in:
>
> printf("my string is %.*s\n", str);
>
> which relied on the under-the-hood representation. This doesn't work on 
> some architectures anyway, and is thoroughly obsolete. One could quickly 
> fix such code by writing it as:
>
> printf("my string is %.*s\n", cast(int) str.length, str.ptr);
>
> 2) It breaks the internal library support code, but that's my problem.
>
> 3) It breaks binary compatibility with libraries already compiled. But we 
> expect to break binary compatibility with D 2.0.
>
> 4) It breaks things like cast(ulong)str, if one was crazy enough to do 
> that anyway.
>
> 5) It breaks anything that tries to look at the underlying representation 
> of dynamic arrays - but such code should be rewritten to use .ptr and 
> .length anyway, or slice notation.
>
> So, what do you think? 
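
For reference, the two layouts in the quoted message can be sketched in C roughly as follows. This is an illustration under assumptions, not the runtime's actual code: the struct and function names are hypothetical, and char elements are used so the printf-style case can be shown too.

```c
#include <stddef.h>
#include <stdio.h>

/* Current layout from the quoted message: length plus start pointer. */
typedef struct {
    size_t length;
    const char *ptr;
} LenPtrArray;

/* Proposed layout: start pointer plus one-past-the-end pointer. */
typedef struct {
    const char *start;
    const char *end;
} PtrPairArray;

/* Stepping past the first element takes two updates with the current layout. */
void pop_front_len_ptr(LenPtrArray *a)
{
    a->length -= 1;
    a->ptr += 1;
}

/* The same step takes one update with the proposed layout. */
void pop_front_ptr_pair(PtrPairArray *a)
{
    a->start += 1;
}

/* Reading .length becomes a pointer subtraction. */
size_t length_ptr_pair(PtrPairArray a)
{
    return (size_t)(a.end - a.start);
}

/* Format the array without relying on its in-memory layout:
   pass the length (as int, which %.*s requires) and the pointer explicitly. */
int format_ptr_pair(char *buf, size_t size, PtrPairArray a)
{
    return snprintf(buf, size, "my string is %.*s\n",
                    (int)length_ptr_pair(a), a.start);
}
```

Passing the length and pointer as separate arguments works with either layout, which is the point of the suggested printf fix.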




