Why ranges don't return vectors?
Piotr Szturmaj
bncrbme at jadamspam.pl
Thu Feb 28 17:02:16 PST 2013
Seriously, nobody will ever get performance from single-element
iterator/range pattern - this makes CPU stall!
Anytime you read one byte from typical hard disk, system reads full
sector - typically 512 bytes, no more, no less. Anytime you read one
byte from memory, CPU loads entire cache line from RAM into the cache
(64 bytes on all modern Intel CPU's). Why not exploit that with ranges?
Ranges could potentially return arrays in front() (or in
frontVector/whatever), so basically they will become ranges of ranges
where the deepest range is always RandomAccessRange. This has obvious
performance benefits. Everyone knows that traversing memory is faster
than iteraring by popFront(). On the other hand memory lacks the
flexibility of ranges - so lets make a hybrid range!
Advantages:
* performance (!)
* limited lookahead/backtracking
How?
1. instruction level parallelism: CPU's can execute few instructions in
parallel given that they operate on different data (different vector
elements)
2. SIMD: load entire vector to SSE/AVX register then run the operation
on all elements at once
3. use of L1 cache: traversing vector in memory is way faster that
calling popFront() for each vector element - likely if it's inlined
Disadvantages:
* potential inconvenience
I know it can't be easy drop-in addition to the current range
implementation, but who knows...
More information about the Digitalmars-d
mailing list