How is chunkBy supposed to behave on copy

Steven Schveighoffer schveiguy at gmail.com
Wed Mar 18 17:06:02 UTC 2020


On 3/18/20 12:37 PM, H. S. Teoh wrote:
> On Wed, Mar 18, 2020 at 12:18:04PM -0400, Steven Schveighoffer via Digitalmars-d wrote:

>> as it seems possible to do without this mechanism. It seems there is
>> some optimization surrounding pushing along the range without
>> iterating it twice. But I don't know if that's a worthwhile
>> optimization, especially if the allocation and reference counting are
>> more expensive than the iteration.
> 
> That's up for debate, but yes, the whole point of the reference counting
> thing is to avoid traversing a forward range twice when iterating over
> subranges.  It really depends on what the original range does, IMO.  If
> it's generating values on-the-fly with an expensive calculation, you
> could be saving quite a bit of work. But for simple ranges, yeah,
> reference-counting inner ranges is kinda overkill, probably with a
> performance hit.

Ugh, I think this is way overkill in most cases, and depends heavily on 
where the performance hit is.

Not only that, but you only see a benefit if you iterate a chunk 
completely (I think).

For instance, an array has nearly zero cost to iterate elements, but the 
predicate for checking the chunking is probably way more expensive. The 
algorithm would be nicer if you simply iterated the array until you 
found the boundary, and then returned a slice as the element. Only one 
pass through everything is necessary (you pre-calculate the "next" range).
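A minimal sketch of that idea (the `SliceChunks` name and its layout are my own invention, not Phobos code): scan forward once to find the chunk boundary, return the chunk as a slice, and pre-compute the next boundary in popFront. No inner range, no reference counting.

```d
import std.stdio;

// Hypothetical slice-based chunking for arrays. Each front is a slice of
// the original array; the predicate runs once per adjacent pair.
struct SliceChunks(T, alias pred)
{
    T[] source;
    size_t end; // one past the end of the current chunk

    this(T[] src)
    {
        source = src;
        findBoundary();
    }

    private void findBoundary()
    {
        end = source.length ? 1 : 0;
        while (end < source.length && pred(source[end - 1], source[end]))
            ++end;
    }

    @property bool empty() const { return source.length == 0; }
    @property T[] front() { return source[0 .. end]; }

    void popFront()
    {
        source = source[end .. $];
        findBoundary();
    }
}

auto sliceChunks(alias pred, T)(T[] arr)
{
    return SliceChunks!(T, pred)(arr);
}

void main()
{
    foreach (c; sliceChunks!((a, b) => a == b)([1, 1, 2, 2, 2, 3]))
        writeln(c); // each chunk is a slice: [1, 1], then [2, 2, 2], then [3]
}
```

Since front is a real slice, .save for the chunks themselves is trivial and copies behave sanely, which is the whole complaint with the reference-counted design.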

In other cases, iterating the elements is going to be expensive, so you 
don't want to do that twice if possible.

I think a good solution might be to provide different versions of the 
function or an enum designating what mechanism to prefer (e.g. 
IterationPolicy.MinimizePopFront or IterationPolicy.MinimizePredicateEval).
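The enum idea could be wired up with a compile-time policy parameter. Everything here (the `IterationPolicy` enum, the `chunkByPolicy` name) is hypothetical, just to show the dispatch shape; the eager branch pre-computes every boundary so the predicate runs exactly once per adjacent pair, while the other branch could hold the Phobos-style shared-cursor implementation.

```d
import std.stdio;

// Hypothetical policy knob; not an actual Phobos API.
enum IterationPolicy
{
    minimizePopFront,       // share one pass over the source between chunks
    minimizePredicateEval,  // evaluate the predicate once per boundary check
}

auto chunkByPolicy(alias pred, IterationPolicy policy, T)(T[] arr)
{
    static if (policy == IterationPolicy.minimizePredicateEval)
    {
        // Eager strategy: find each boundary up front and return slices,
        // so the chunks themselves are free to iterate and re-iterate.
        T[][] chunks;
        size_t start = 0;
        foreach (i; 1 .. arr.length)
            if (!pred(arr[i - 1], arr[i]))
            {
                chunks ~= arr[start .. i];
                start = i;
            }
        if (arr.length)
            chunks ~= arr[start .. $];
        return chunks;
    }
    else
    {
        // Lazy strategy: delegate to the existing chunkBy, which pushes the
        // source along once and shares the position between subranges.
        import std.algorithm.iteration : chunkBy;
        return chunkBy!pred(arr);
    }
}

void main()
{
    auto cs = chunkByPolicy!((a, b) => a == b,
            IterationPolicy.minimizePredicateEval)([1, 1, 2, 3, 3]);
    writeln(cs); // [[1, 1], [2], [3, 3]]
}
```

Because the policy is a template parameter, the choice costs nothing at runtime; only the selected branch is compiled into the instantiation.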

And of course, the .save behavior sucks, as always.

-Steve

