stride in slices

Tue Jun 5 13:05:56 UTC 2018

On 6/4/18 5:52 PM, DigitalDesigns wrote:
> On Monday, 4 June 2018 at 17:40:57 UTC, Dennis wrote:
>> On Monday, 4 June 2018 at 15:43:20 UTC, Steven Schveighoffer wrote:
>>> Note, it's not going to necessarily be as efficient, but it's likely 
>>> to be close.
>>>
>>> -Steve
>>
>> I've compared the range versions with a for-loop. For integers and 
>> longs or high stride amounts the time is roughly equal, but for bytes 
>> with low stride amounts it can be up to twice as slow.
>> https://run.dlang.io/is/BoTflQ
>>
>> 50 MB array, type = byte, stride = 3, compiler = LDC -O4 -release
>> For-loop  18 ms
>> Fill(0)   33 ms
>> each!     33 ms
>>
>> With stride = 13:
>> For-loop  7.3 ms
>> Fill(0)   7.5 ms
>> each!     7.8 ms
> 
> 
> This is why I wanted to make sure! I would be using it for a stride of 2 
> and it seems it might have doubled the cost for no other reason than 
> using ranges. Ranges are great but one can't reason about what is 
> happening in them as easily as in a direct loop, so I wanted to be sure. 
> Thanks for running the test!

See later postings from Ethan and others. It's a matter of the optimizer 
being able to see the "whole thing", which is why for loops are sometimes 
faster. The slowdown is not inherent to ranges: with the right 
optimization flags, the range code runs as fast as if you hand-wrote it.
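
For reference, here is a minimal sketch of the three variants being 
compared. This is not Dennis's actual benchmark code (that is at the 
run.dlang.io link above); the function names zeroLoop, zeroFill and 
zeroEach are made up for illustration, and the sketch only checks that 
the three forms do the same work, not how fast they are.

```d
import std.algorithm : each, fill;
import std.range : stride;

// Hand-written loop: zero every `step`-th element of the slice.
// (Hypothetical helper name, for illustration only.)
void zeroLoop(byte[] arr, size_t step)
{
    for (size_t i = 0; i < arr.length; i += step)
        arr[i] = 0;
}

// Range version: fill a strided view of the slice.
void zeroFill(byte[] arr, size_t step)
{
    arr.stride(step).fill(cast(byte) 0);
}

// Range version: visit each strided element by reference.
void zeroEach(byte[] arr, size_t step)
{
    arr.stride(step).each!((ref x) => x = 0);
}

void main()
{
    auto a = new byte[](10);
    auto b = new byte[](10);
    auto c = new byte[](10);
    a[] = 1;
    b[] = 1;
    c[] = 1;

    zeroLoop(a, 3);
    zeroFill(b, 3);
    zeroEach(c, 3);

    // All three variants touch exactly the same elements.
    byte[] expected = [0, 1, 1, 0, 1, 1, 0, 1, 1, 0];
    assert(a == expected);
    assert(b == expected);
    assert(c == expected);
}
```

Whether the range versions end up as tight as the loop depends on the 
compiler being able to inline stride, fill and each all the way down, 
which is the "whole thing" point above.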

What I've found with D (and especially LDC) is that when you give the 
compiler everything to work with, it can do some seemingly magic things.

-Steve