Why ranges don't return vectors?

Fri Mar 1 14:18:09 PST 2013

On Friday, 1 March 2013 at 15:38:28 UTC, Piotr Szturmaj wrote:
> Yes, I've written some buffered ranges myself, but my original 
> post was not about buffering.

My point wasn't about things being buffered. I was just 
suggesting that many ranges do return arrays, when it's 
appropriate. If you want to conceptualize a "single item" as a 
grouping of items from a range (and such a grouping may be a 
vector of 16 bytes if you so choose), then you are free to do so. 
The concept is flexible enough to allow this.

However, not all ranges should return arrays. As an answer to the 
topic's name, the reason why ranges don't (always) return arrays 
is that the concept of a range is to _abstract_ operations on 
_any_ type of container (and even things that are not containers, 
such as RNGs), as long as it has certain capabilities. Returning 
arrays isn't helpful when you're trying to process arrays, for 
instance, because if you didn't want an abstract look at an 
array, you would have worked on the original array without the 
abstraction in the first place.

> Range abstraction is limiting some useful optimizations, mainly 
> vectorization. Even if range internally uses a buffer and does 
> some SIMD operations on it, it exposes only one element from 
> this buffer, which can't be used for next SIMD operation. In 
> other words it limits pipelining.
>
> I'm arguing that (vector) ranges of primitive types should 
> always expose at least 16 byte vector, even if it's one element 
> range - but then abstraction guarantees that it can be used for 
> SSE ops. They could return slices pointing to fixed size 
> buffers (at least 16 byte ones), then slices may be adjusted 
> for the last iteration to specify actual number of elements in 
> a vector. Anyway SIMD op will be done on the full vector 
> (underlying buffer) - CPU don't care if element in a vector is 
> garbage or not. So, in short: ranges pass these vectors between 
> themselfs and each subsequent range mutates the same vector - 
> possibly using SIMD.

Okay, sure, performance. But if you want to code for performance, 
there's nothing stopping you now. The idea about ranges is to 
unify lots of concepts. If you need lower level work and require 
every bit of performance, but still want to use ranges, then you 
can, but it should be a specialized range type that you do it on 
which meets your precise specification.

I hope I'm explaining it clearly: ranges are more of a universal 
specification, and it sounds like you have a very particular set 
of requirements which is a subset of what ranges are supposed to 
cover. Creating a specialized range type that meets your exact 
specifications would give you better performance than a 
generalized range concept (even optimized) and an optimized 
generalized range concept would not be appropriate for all uses.

Perhaps you could code up a vectorRange that does things as you 
want and submit it in an enhancement request. And you can show 
off how awesome it is so that everyone will want to use it when 
it's appropriate. But certainly not all ranges should be 
vectorRanges because most of the time we're just trying to work 
on singular units at a time.