array traversal

Don Clugston dac at nospam.com.au
Fri Mar 30 09:16:34 PDT 2007


Dan wrote:
> Stewart Gordon Wrote:
> 
>> "BCS" <ao at pathlink.com> wrote in message 
>> news:ce0a3343870c8c939107e7e1a1e at news.digitalmars.com...
>> <snip>
>>> I think the options he is talking about are these:
>>> for(T* ptr = &start; ptr !is &stop; ptr++)
>>> {
>>>  T value = *ptr;
>>> }
>>>
>>> vs.
>>>
>>> for(int i = 0; i< length; i++)
>>> {
>>>   T value = ptr[i];
>>> }
>>>
>>> only the second ever uses a multiplication
>> And even then not necessarily - some compilers may optimise one form to the 
>> other.
>>
>> Stewart. 
>>
> 
> Hmm... let me think of how that looks in ASM.  The first one is wrong, because you only do ptr++; it should be ptr += OFFSET.  Apart from that, you're right: the first code looks mildly better in ASM.
> 
> Looping through by adding to the pointer instead of adding to the array index and performing a 'lea'.

But if T.sizeof == 2, 4, or 8, the multiplication is done for free in 
hardware by x86 scaled-index addressing. E.g., if i is stored in esi:

mov eax, [ptr + esi*8];

I'd be surprised if there are any modern compilers that actually perform 
a multiply.
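
To make the comparison concrete, here is a minimal sketch in D of the two loop forms quoted above. It assumes T is long (so T.sizeof == 8) and wraps each loop in a function; the names sumByPointer and sumByIndex are hypothetical and only serve the illustration. In the indexed form the element address is base + i * T.sizeof, which is the kind of operand a scaled-index instruction such as the mov above can encode directly.

```d
// Assumption for this sketch: T is long, so T.sizeof == 8.
alias T = long;

// Pointer-walking form: p++ advances the address by T.sizeof bytes.
// (sumByPointer is a hypothetical name used only for this example.)
T sumByPointer(const(T)[] arr)
{
    T total = 0;
    const(T)* stop = arr.ptr + arr.length;   // one past the last element
    for (const(T)* p = arr.ptr; p !is stop; p++)
        total += *p;
    return total;
}

// Indexed form: each access reads from base + i * T.sizeof.
// On x86 that scaling can be folded into the addressing mode
// when T.sizeof is 2, 4, or 8, so no separate multiply is needed.
// (sumByIndex is a hypothetical name used only for this example.)
T sumByIndex(const(T)[] arr)
{
    T total = 0;
    const(T)* ptr = arr.ptr;
    for (size_t i = 0; i < arr.length; i++)
        total += ptr[i];
    return total;
}
```

Both functions compute the same sum. Whether a particular compiler emits the same instructions for the two is best checked by building with optimization (for example dmd -O) and reading the disassembly rather than assumed.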

> 
> The good news is that D *should be* optimizing this out, and it ultimately only costs a half cycle which may either align or throw off the u/v pipes - something more important than 1/2 a cycle.
> 
> I wonder if Walter has it aligning the pipes as appropriate?

BTW, U/V pipe pairing is no longer relevant on recent Pentiums. 
Core2 CPUs are more likely to be limited by the decoding stage than by 
the execution units.


