New linked list
Sean Kelly
sean at f4.ca
Fri May 12 13:35:36 PDT 2006
Walter Bright wrote:
> Sean Kelly wrote:
>> *sigh* I can see the reason for this, but it dramatically reduces the
>> utility of foreach for me, as it's a very common idiom to want to
>> remove elements during an iteration. In fact, I already use this
>> technique quite a bit in Thread and ThreadGroup. That aside, does
>> this restriction also exist for processing AA elements?
>
> Yes - everything used as an aggregate in a foreach.
>
>> I ask because this is done on a dynamically allocated list of
>> references rather than against the AA itself, though I suppose that's
>> implementation defined.
>
> The idea is to enable the compiler to do aggressive loop optimizations,
> regardless of the aggregate type used, and regardless of whether it is
> user-defined or built-in.
To play the Devil's advocate... doesn't this imply that the body of an
opApply function is also not allowed to make any modifications to the
underlying data structure? As a contrived example, suppose I have a
SqlCursor object that performs lazy reads and caches this data once read
so the read must only occur once, and that this data is stored in an
SList which is used for the actual iteration. From a code optimization
standpoint this seems entirely reasonable. However, it does seem to
violate the general rule you've outlined for what is legal in foreach loops.
You've said that the point of foreach is to allow the compiler to treat
the aggregate as a loop invariant, but my understanding of this term
implies that only optimizations on the referent of v may be optimized.
For example:
int[] v;
...
foreach( int i; v )
{
...
}
may be transformed into:
t1 = v.ptr;
t2 = v.len;
t3 = 0;
L1: if( t2 <= t3 ) goto L2;
...
t3 += 1;
goto L1;
L2: ...
This would explain Derek's experience where sequence modifications were
not observed within the loop. However, without a viable 'const'
signifier, I don't think any optimization regarding the sequence itself
may be performed:
t1 = v.ptr;
t2 = v.len;
t3 = 0;
if( t2 <= 0 ) goto L2;
t4 = alloca( t2 )
t4[] = t1[0 .. t2]
L1: if( t2 <= t3 ) goto L2;
...
t3 += 1;
goto L1;
L2: ...
Is this correct? The first case would explain why rehashing an AA
within a foreach loop would likely result in undefined behavior, as
would an append-type operation on dynamic arrays. But more structurally
cohesive data structures (ie. those for which modifications don't result
in iterator invalidation in C++: map, list, etc) should be fairly
compatible with foreach even in the face of such changes regardless of
whether such detailed rules would be advisable within a language spec.
Regarding the 'const' label. The only way I can see such an idea
working in a way that supports compiler optimizations is to make it a
dynamic storage attribute of sorts. That is, the const-ness must be
applied to the data, not to the reference. I simply can't think of any
option where the traditional approach would work:
char[] ptr = "";
const char[] ref = ptr;
...
foreach( char c; ref )
{
...
}
If ptr is ever passed by reference to an opaque function then there is
no way to determine whether the data referenced by ref may change within
the life of the program. In fact, the same could be said if ref were
ever passed by reference to an opaque function implemented in any
language other than D, as one must assume that the const attribute could
be ignored. Frankly, this has me wondering whether it's possible to
enforce any sort of const behavior without data movement in the face of
aliasing. For example, in instances where const data is being passed to
an opaque code segment it should be possible to instead make a copy in a
write-protected memory page and pass a reference to that instead,
similar to how passing structs by value to functions works now. This
would result in a run-time error if a modification occurred to such
data, similar to what happens for modifications to const data in D now.
But as far as I know it isn't possible to mark arbitrary memory ranges
as read-only.
In the long-term, if transactional memory ever becomes a reality then it
may be possible to exploit this as a lightweight way of enforcing
const-ness: cache the data somewhere and if the cache is ever
invalidated then signal an error. But this may not be an option for
const data that has a long lifetime.
Sean
More information about the Digitalmars-d-announce
mailing list