New linked list

Walter Bright newshound at digitalmars.com
Fri May 12 19:13:27 PDT 2006


Sean Kelly wrote:
> Walter Bright wrote:
>> Sean Kelly wrote:
>>> *sigh*  I can see the reason for this, but it dramatically reduces 
>>> the utility of foreach for me, as it's a very common idiom to want to 
>>> remove elements during an iteration.  In fact, I already use this 
>>> technique quite a bit in Thread and ThreadGroup.  That aside, does 
>>> this restriction also exist for processing AA elements?
>>
>> Yes - everything used as an aggregate in a foreach.
>>
>>> I ask because this is done on a dynamically allocated list of 
>>> references rather than against the AA itself, though I suppose that's 
>>> implementation defined.
>>
>> The idea is to enable the compiler to do aggressive loop 
>> optimizations, regardless of the aggregate type used, and regardless 
>> of whether it is user-defined or built-in.
> 
> To play the Devil's advocate... doesn't this imply that the body of an 
> opApply function is also not allowed to make any modifications to the 
> underlying data structure?

Yes, it does imply that.

> As a contrived example, suppose I have a 
> SqlCursor object that performs lazy reads and caches this data once read 
> so the read must only occur once, and that this data is stored in an 
> SList which is used for the actual iteration.  From a code optimization 
> standpoint this seems entirely reasonable.  However, it does seem to 
> violate the general rule you've outlined for what is legal in foreach 
> loops.
> 
> You've said that the point of foreach is to allow the compiler to treat 
> the aggregate as a loop invariant, but my understanding of this term 
> implies that only optimizations on the referent of v may be optimized. 
> For example:
> 
>     int[] v;
>     ...
>     foreach( int i; v )
>     {
>         ...
>     }
> 
> may be transformed into:
> 
>     t1 = v.ptr;
>     t2 = v.len;
>     t3 = 0;
> L1: if( t2 <= t3 ) goto L2;
>     ...
>     t3 += 1;
>     goto L1;
> L2: ...
> 
> This would explain Derek's experience where sequence modifications were 
> not observed within the loop.  However, without a viable 'const' 
> signifier, I don't think any optimization regarding the sequence itself 
> may be performed:
> 
>     t1 = v.ptr;
>     t2 = v.len;
>     t3 = 0;
>     if( t2 <= 0  ) goto L2;
>     t4   = alloca( t2 )
>     t4[] = t1[0 .. t2]
> L1: if( t2 <= t3 ) goto L2;
>     ...
>     t3 += 1;
>     goto L1;
> L2: ...
> 
> Is this correct?  The first case would explain why rehashing an AA 
> within a foreach loop would likely result in undefined behavior, as 
> would an append-type operation on dynamic arrays.  But more structurally 
> cohesive data structures (ie. those for which modifications don't result 
> in iterator invalidation in C++: map, list, etc) should be fairly 
> compatible with foreach even in the face of such changes regardless of 
> whether such detailed rules would be advisable within a language spec.

If you don't mess with the iteration state, it will probably work. But 
the spec won't guarantee it will work, as the intention of foreach is to 
enable aggressive compiler optimizations. I want to leave the door as 
wide open to that as possible.

> If ptr is ever passed by reference to an opaque function then there is 
> no way to determine whether the data referenced by ref may change within 
> the life of the program.  In fact, the same could be said if ref were 
> ever passed by reference to an opaque function implemented in any 
> language other than D, as one must assume that the const attribute could 
> be ignored.  Frankly, this has me wondering whether it's possible to 
> enforce any sort of const behavior without data movement in the face of 
> aliasing.  For example, in instances where const data is being passed to 
> an opaque code segment it should be possible to instead make a copy in a 
> write-protected memory page and pass a reference to that instead, 
> similar to how passing structs by value to functions works now.  This 
> would result in a run-time error if a modification occurred to such 
> data, similar to what happens for modifications to const data in D now. 
>  But as far as I know it isn't possible to mark arbitrary memory ranges 
> as read-only.

It isn't possible to prevent the programmer from subverting the rules 
and modifying memory. The only thing that can be done is to label such 
as "undefined behavior".



More information about the Digitalmars-d-announce mailing list