opIndex, opSlice, length for joiner?

Wed Jan 20 17:22:43 PST 2016

On 01/20/2016 12:36 PM, Maverick Chardet wrote:
> Hi everyone,
>
> After reading the thread "Distributed Memory implementation":
> http://forum.dlang.org/post/oqcfifzolmolcvyupsef@forum.dlang.org
>
> And more precisely the suggestion of Dragos Carp on page 2:
> http://forum.dlang.org/post/txgabyyoutitketlpvgm@forum.dlang.org
>
> I looked at the joiner source code (std.algorithm.iteration.joiner) and
> I have several ideas to make it have the opIndex, opSlice and length
> methods under certain conditions.
>
> First question: what do you think of making this possible if all the
> considered ranges have opIndex, opSlice and length? I think that all the
> ranges types don't have to all implement the three methods in all cases
> (for instance length would actually only require ElementType!RoR and
> Separator to implement length) but we can discuss all that later...
>
> The most important issue that comes to my mind is that the operations
> would not take constant time... A trivial implementation would be in
> O(k) where k is the number of joined ranges, and with some
> precomputation in O(k) time and space we can make length O(1) and
> opIndex and opSlice O(log k)... Would that be a problem?
>
> Thank you in advance for your remarks!

Thanks for askin. Consider chain(), which currently DOES offer random 
access if its constituents to. In that case, the number of ranges 
chained is a compile-time constant dictated by the source code being 
compiled; no data dependency.

In contrast, the performance of joiner()'s proposed primitives would be 
dependent on the data. For your simplest example, length would be linear 
in the number of ranges. One way to improve on that would be to compute 
length upon the first call and then memoize (and update) it. There is a 
risk it could go out of sync if someone changes the ranges involved 
surreptitiously.

One good thing to do here would be to specialize joiner().walkLength. 
Then people writing r.walkLength would benefit from the quicker 
specialized version.

For indexed access, things get a fair amount more complicated. You can't 
afford linear time, so you need some serious data structure to ensure 
good performance there. That seems to take joiner() in a design space 
that makes it less attractive; currently it's a simple spec with a 
somewhat straightforward implementation.

Andrei