Why ranges don't return vectors?

Fri Mar 1 07:38:27 PST 2013

H. S. Teoh wrote:
> On Fri, Mar 01, 2013 at 03:35:23AM +0100, Chris Cain wrote:
>> On Friday, 1 March 2013 at 01:02:17 UTC, Piotr Szturmaj wrote:
>>> Seriously, nobody will ever get performance from single-element
>>> iterator/range pattern - this makes CPU stall!
>>
>> There's no reason you can't do that. In fact, some ranges already do
>> this. Take a look at std.stdio.File.byChunk and byLine. This isn't
>> appropriate for all ranges, though. Like arrays, for instance. You
>> want to be able to access single elements and do things with them,
>> otherwise the abstraction is pointless.

Yes, I've written some buffered ranges myself, but my original post was 
not about buffering.

> You can make a buffered range that reads in a large chunk of data at a
> time, and .front and .popFront merely move an internal pointer until it
> reaches the end of the buffer, then the next buffer page is loaded in.
> In this case, .front is very simple (just a pointer lookup) and will
> probably be inlined by an optimizing compiler.

Inlining is not always the case.

> Just because the abstraction is reading per-element, doesn't mean the
> implementation has to literally do that!

Range abstraction is limiting some useful optimizations, mainly 
vectorization. Even if range internally uses a buffer and does some SIMD 
operations on it, it exposes only one element from this buffer, which 
can't be used for next SIMD operation. In other words it limits pipelining.

I'm arguing that (vector) ranges of primitive types should always expose 
at least 16 byte vector, even if it's one element range - but then 
abstraction guarantees that it can be used for SSE ops. They could 
return slices pointing to fixed size buffers (at least 16 byte ones), 
then slices may be adjusted for the last iteration to specify actual 
number of elements in a vector. Anyway SIMD op will be done on the full 
vector (underlying buffer) - CPU don't care if element in a vector is 
garbage or not. So, in short: ranges pass these vectors between 
themselfs and each subsequent range mutates the same vector - possibly 
using SIMD.