stride in slices
DigitalDesigns
DigitalDesigns at gmail.com
Tue Jun 5 19:05:27 UTC 2018
On Tuesday, 5 June 2018 at 18:46:41 UTC, Timon Gehr wrote:
> On 05.06.2018 18:50, DigitalDesigns wrote:
>> With a for loop, it is pretty much a wrapper on internal cpu
>> logic so it will be near as fast as possible.
>
> This is not even close to being true for modern CPUs. There are
> a lot of architectural and micro-architectural details that
> affect performance but are not visible or accessible in your
> for loop. If you care about performance, you will need to test
> anyway, as even rather sophisticated models of CPU performance
> don't get everything right.
Those optimizations are not part of the instruction set, so they
are irrelevant here. They will occur with ranges too.

For loops HAVE a direct CPU semantic! Do you doubt this?

CPUs do not have range semantics. Ranges are layers on top of
compiler semantics... you act like they are equivalent, but they
are not! All range semantics must go through the library code,
then to the compiler, then to the CPU. For loops in all major
systems languages go almost directly to CPU instructions.
for(int i = 0; i < N; i++)
translates into either increment and loop instructions or jump
instructions.

There is absolutely no reason why any decent compiler would not
use what the CPU has to offer. For loops are language semantics;
ranges are library semantics. To pretend they are equivalent is
wrong, and no amount of justifying will make them the same. I
do not know of any commercially viable CPU that lacks loop
semantics. I also know of no commercially viable compiler that
does not wrap those instructions in a for loop (or while, or
whatever) syntax that maps almost directly to the CPU
instructions.
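
To make the two forms being compared concrete, here is a minimal
sketch of the same strided traversal written once as an index-based
for loop and once with std.range.stride. The function names sumLoop
and sumRange are hypothetical and exist only for illustration; the
sketch shows that both forms compute the same result and says
nothing about which one compiles to better code.

```d
import std.range : stride;
import std.stdio : writeln;

// Hypothetical helper: sum every `step`-th element with a for loop.
int sumLoop(const(int)[] a, size_t step)
{
    int total = 0;
    for (size_t i = 0; i < a.length; i += step)
        total += a[i];
    return total;
}

// Hypothetical helper: the same traversal via std.range.stride.
int sumRange(const(int)[] a, size_t step)
{
    int total = 0;
    foreach (x; a.stride(step))
        total += x;
    return total;
}

void main()
{
    auto data = [1, 2, 3, 4, 5, 6, 7];
    // Both visit indices 0, 2, 4, 6, so the sums agree.
    assert(sumLoop(data, 2) == sumRange(data, 2));
    writeln(sumLoop(data, 2)); // prints 16
}
```

Whether the two functions end up as the same machine code depends
on the compiler and its optimization settings, which is the point
under dispute here.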
> Also, it is often not necessary to be "as fast as possible". It
> is usually more helpful to figure out where the bottleneck is
> for your code and concentrate optimization effort there, which
> you can do more effectively if you can save time and effort for
> the remaining parts of your program by writing simple and
> obviously correct range-based code, which often will be fast as
> well.
It's also often not necessary to be "as slow as possible". I'm
not asking about generalities but about specifics. It's great to
make generalizations about how things should be, but I would like
to know how they are. Maybe in theory ranges could be more
optimal than other semantics, but theory never equals practice.
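
One way to get at "how they are" rather than how they should be is
to time both forms on the same data. The following is only a
sketch under assumptions: the helper name timeBoth, the array size,
the stride of 3 and the run count are all arbitrary choices made
for illustration, and the numbers will vary with compiler, flags
and hardware.

```d
import core.time : Duration;
import std.datetime.stopwatch : benchmark;
import std.range : stride;
import std.stdio : writefln;

long sink; // collects the sums so that each result is actually used

// Hypothetical helper: time a strided sum as a for loop and as a range.
Duration[2] timeBoth(const(int)[] data, size_t step, uint runs)
{
    void viaLoop()
    {
        long total = 0;
        for (size_t i = 0; i < data.length; i += step)
            total += data[i];
        sink += total;
    }

    void viaRange()
    {
        long total = 0;
        foreach (x; data.stride(step))
            total += x;
        sink += total;
    }

    // benchmark runs each function `runs` times and returns total durations.
    return benchmark!(viaLoop, viaRange)(runs);
}

void main()
{
    auto data = new int[](1_000_000);
    foreach (i, ref x; data)
        x = cast(int) i;

    auto r = timeBoth(data, 3, 100);
    writefln("for loop: %s", r[0]);
    writefln("stride:   %s", r[1]);
}
```

No expected timings are given because they are not knowable in
advance; building with and without optimizations is likely to
change the picture considerably.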
More information about the Digitalmars-d
mailing list