stride in slices

Tue Jun 5 19:05:27 UTC 2018

On Tuesday, 5 June 2018 at 18:46:41 UTC, Timon Gehr wrote:
> On 05.06.2018 18:50, DigitalDesigns wrote:
>> With a for loop, it is pretty much a wrapper on internal cpu 
>> logic so it will be near as fast as possible.
>
> This is not even close to being true for modern CPUs. There are 
> a lot of architectural and micro-architectural details that 
> affect performance but are not visible or accessible in your 
> for loop. If you care about performance, you will need to test 
> anyway, as even rather sophisticated models of CPU performance 
> don't get everything right.

Those optimizations are not part of the instruction set so are 
irrelevant. They will occur with ranges too.

For loops HAVE a direct cpu semantic! Do you doubt this?

CPUs do not have range semantics. Ranges are layers on top of
compiler semantics. You act like they are equivalent; they are
not! All range semantics must go through the library code, then
the compiler, then the CPU. For loops in all major systems
languages go almost directly to CPU instructions.

for(int i = 0; i < N; i++)

translates into either increment-and-loop or jump instructions.
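
To make that concrete, here is a rough sketch in C of what I mean. The
function names sum_for and sum_goto are just made up for illustration,
and the second function is only my hand-written approximation of the
compare/jump shape a compiler might produce, not actual compiler output.
Both assume step is greater than zero.

```c
#include <stddef.h>

/* Sum every `step`-th element using an ordinary for loop.
   Assumes step > 0, otherwise the loop never advances. */
long sum_for(const int *data, size_t n, size_t step)
{
    long total = 0;
    for (size_t i = 0; i < n; i += step)
        total += data[i];
    return total;
}

/* The same loop spelled out by hand as compare, conditional jump,
   body, increment, and jump back. This is an illustration of the
   shape only; a real compiler is free to emit something different. */
long sum_goto(const int *data, size_t n, size_t step)
{
    long total = 0;
    size_t i = 0;
top:
    if (!(i < n))
        goto done;      /* conditional jump out of the loop */
    total += data[i];   /* loop body */
    i += step;          /* increment */
    goto top;           /* unconditional jump back to the test */
done:
    return total;
}
```

Both functions should return the same value for the same input, which
is the point: the for statement is little more than sugar over the
test, increment, and jump.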

There is absolutely no reason why any decent compiler would not
use what the CPU has to offer. For loops are language semantics;
ranges are library semantics. To pretend they are equivalent is
wrong, and no amount of justifying will make them the same. I do
not know of any commercially viable CPU without loop semantics. I
also know of no commercially viable compiler that does not wrap
those instructions in a for loop (or while, or whatever) syntax
that maps almost directly to the CPU instructions.

> Also, it is often not necessary to be "as fast as possible". It 
> is usually more helpful to figure out where the bottleneck is 
> for your code and concentrate optimization effort there, which 
> you can do more effectively if you can save time and effort for 
> the remaining parts of your program by writing simple and 
> obviously correct range-based code, which often will be fast as 
> well.

It's also often not necessary to be "as slow as possible". I'm
not asking about generalities but specifics. It's great to
make generalizations about how things should be but I would like 
to know how they are. Maybe in theory ranges could be more 
optimal than other semantics but theory never equals practice.