Change representation of dynamic arrays?
Walter Bright
newshound1 at digitalmars.com
Fri Oct 19 20:03:02 PDT 2007
Currently, arrays are represented under the hood as:
size_t lengthOfArray;
void* ptrToStartOfArray;
Which works out reasonably well. The problem is if you want to use array
types as the basis of iterators, and you want to step through the array.
There's no escaping it being two operations:
decrement the length
increment the pointer
This puts a brick in any fast implementation of iterators. To fix that,
we can change the representation to:
void* ptrToStartOfArray;
void* ptrPastEndOfArray;
Then there's just one increment. Some tests show this can improve loop
performance by up to 70%.
So, what does this not break?
1) Doesn't break array.ptr, this will still work.
2) Doesn't break array.length as rvalue, as this is rewritten by the
compiler as (array.end - array.start).
3) Doesn't break array.length as an lvalue, as that is handled by the
runtime library anyway.
4) Won't break anything on D 1.0, as it wouldn't get this change.
5) Won't break array slices, or any of that stuff we love about D arrays.
What does this break?
1) Passing dynamic arrays to printf as in:
printf("my string is %*.s\n", str);
which relied on the under-the-hood representation. This doesn't work on
some architectures anyway, and is thoroughly obsolete. One could quickly
fix such code by writing it as:
printf("my string is %*.s\n", str.length, str.ptr);
2) It breaks the internal library support code, but that's my problem.
3) It breaks binary compatibility with libraries already compiled. But
we expect to break binary compatibility with D 2.0.
4) It breaks things like cast(ulong)str, if one was crazy enough to do
that anyway.
5) It breaks anything that tries to look at the underlying
representation of dynamic arrays - but such code should be rewritten to
use .ptr and .length anyway, or slice notation.
So, what do you think?
More information about the Digitalmars-d
mailing list