stride in slices

Mon Jun 4 18:11:47 UTC 2018

On 6/4/18 1:40 PM, Dennis wrote:
> On Monday, 4 June 2018 at 15:43:20 UTC, Steven Schveighoffer wrote:
>> Note, it's not going to necessarily be as efficient, but it's likely 
>> to be close.
> 
> I've compared the range versions with a for-loop. For integers and longs 
> or high stride amounts the time is roughly equal, but for bytes with low 
> stride amounts it can be up to twice as slow.
> https://run.dlang.io/is/BoTflQ
> 
> 50 MB array, type = byte, stride = 3, compiler = LDC -O4 -release
> For-loop  18 ms
> Fill(0)   33 ms
> each!     33 ms
> 
> With stride = 13:
> For-loop  7.3 ms
> Fill(0)   7.5 ms
> each!     7.8 ms

Interesting!

BTW, do you have cross-module inlining on? I wonder whether it makes a 
difference if you didn't have it on before. (I'm speaking somewhat from 
ignorance: I've heard people talk about this limitation, but I'm not 
sure exactly when it's enabled.)
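
For anyone following along, here is a minimal sketch of the three variants being timed. It is an illustrative reconstruction, not the code behind the run.dlang.io link: the helper names (zeroLoop, zeroFill, zeroEach) and the tiny array are assumptions made for the example, and it does no timing.

```d
import std.algorithm : each, fill;
import std.range : stride;
import std.stdio : writeln;

// Hypothetical helper: plain for-loop, zeroing every step-th element.
void zeroLoop(byte[] a, size_t step)
{
    for (size_t i = 0; i < a.length; i += step)
        a[i] = 0;
}

// Hypothetical helper: range version using stride + fill.
// The cast keeps the fill value the same type as the elements.
void zeroFill(byte[] a, size_t step)
{
    a.stride(step).fill(cast(byte) 0);
}

// Hypothetical helper: range version using stride + each.
// The lambda takes its argument by ref so the write reaches the array.
void zeroEach(byte[] a, size_t step)
{
    a.stride(step).each!((ref x) { x = 0; });
}

void main()
{
    auto a = new byte[](10);
    a[] = 1;            // start with all ones
    zeroEach(a, 3);     // zero elements 0, 3, 6, 9
    writeln(a);
}
```

All three should leave the array in the same state; the question is only how well the compiler can flatten the range versions into the loop.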

-Steve