std.algorithm.splitter on a string not always bidirectional
Steven Schveighoffer
schveiguy at gmail.com
Sat Jan 23 15:07:27 UTC 2021
On 1/22/21 2:13 PM, Jon Degenhardt wrote:
> On Friday, 22 January 2021 at 17:29:08 UTC, Steven Schveighoffer wrote:
>> On 1/22/21 11:57 AM, Jon Degenhardt wrote:
>>>
>>> I think the idea is that if a construct like 'xyz.splitter(args)'
>>> produces a range with the sequence of elements {"a", "bc", "def"},
>>> then 'xyz.splitter(args).back' should produce "def". But, if finding
>>> the split points starting from the back results in something like
>>> {"f", "de", "abc"} then that relationship hasn't held, and the
>>> results are unexpected.
>>
>> But that is possible with all 3 splitter variants. Why is one allowed
>> to be bidirectional and the others are not?
>
> I'm not defending it, just explaining what I believe the thinking was
> based on the examination I did. It wasn't just looking at the code,
> there was a discussion somewhere. A forum discussion, PR discussion, bug
> or code comments. Something somewhere, but I don't remember exactly.
>
> However, to answer your question - The relationship described is
> guaranteed if the basis for the split is a single element. If the range
> is a string, that's a single 'char'. If the range is composed of
> integers, then a single integer. Note that if the basis for the split is
> itself a range, then the relationship described is not guaranteed.
>
> Personally, I can see a good argument that bidirectionality should not
> be supported in any of these cases, and instead force the user to choose
> between eager splitting or reversing the range via retro. For the common
> case of strings, the further argument could be made that the distinction
> between char and dchar is another point of inconsistency.

I would not want that. My use case is splitting a string on punctuation
and using the lazy result to test equality against something. But I have
some special suffix items that I want to handle first (and pop off).
dchar/char inconsistency isn't a problem, because they are both dchar
ranges (and both are bidirectional).
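
To make the use case concrete, here is a minimal sketch, assuming a single-character delimiter. The input string and the "v2" suffix are made up for illustration and are not from my actual code:

```d
import std.algorithm.comparison : equal;
import std.algorithm.iteration : splitter;

void main()
{
    // Hypothetical input: dot-separated items with a special trailing item.
    auto parts = "alpha.beta.gamma.v2".splitter('.');

    // With a single-element delimiter the result is bidirectional,
    // so the suffix can be inspected and popped off first.
    assert(parts.back == "v2");
    parts.popBack();

    // The remainder is still lazy and can be compared without allocating.
    assert(parts.equal(["alpha", "beta", "gamma"]));
}
```
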
> Regardless whether the choices made were the best choices, there was
> some thinking that went into it, and it is worth understanding the
> thinking when considering changes.

I believe there was that thinking. That's why I posted: before filing a
bug, I wanted to make sure there wasn't a good reason.

It looks like there is NOT a good reason to prevent bidirectional access
for single-item-based splitting, as you say. But there IS a good reason
(thanks for the example, H.S. Teoh) to prevent it for multi-element
delimiters.
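
A rough sketch of the difference, using an input I picked for illustration (not necessarily the example given in the thread). With a one-element delimiter the split points are the same from either end. With a delimiter that can overlap itself, a forward scan and a backward scan find it at different positions:

```d
import std.algorithm.comparison : equal;
import std.algorithm.iteration : splitter;
import std.range.primitives : isBidirectionalRange;
import std.string : indexOf, lastIndexOf;

void main()
{
    // Single-element delimiter: front and back agree with the forward sequence.
    auto byChar = "a,bc,def".splitter(',');
    static assert(isBidirectionalRange!(typeof(byChar)));
    assert(byChar.front == "a" && byChar.back == "def");

    // Multi-element delimiter that overlaps itself (illustrative input).
    enum text = "aaa";
    enum delim = "aa";

    // Scanning forward, the delimiter is found at index 0.
    assert(text.indexOf(delim) == 0);
    assert(text.splitter(delim).equal(["", "a"]));

    // Scanning backward, it would be found at index 1 instead, so a
    // hypothetical back-to-front split would see "a" before the delimiter
    // and "" after it, i.e. {"a", ""} rather than {"", "a"}.
    assert(text.lastIndexOf(delim) == 1);
}
```
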
-Steve