Proposed Changes to the Range API for Phobos v3

Fri May 17 16:24:05 UTC 2024

On Friday, May 17, 2024 8:30:47 AM MDT H. S. Teoh via Digitalmars-d wrote:
> On Fri, May 17, 2024 at 01:22:48PM +0000, Ogi via Digitalmars-d wrote:
> > On Thursday, 16 May 2024 at 14:56:55 UTC, Jonathan M Davis wrote:
> > > Alternatively, we could add an enum of some kind to the new range
> > > API to make it different. It would be kind of ugly, but it would
> > > allow us to keep all of the current function names
> >
> > Actually, this option is worth exploring.
> >
> > We can see now that implicit interfaces have downsides. When the
> > interface changes, we are forced to jump through hoops to avoid
> > problems. And what will we do if at some point in the future we want to
> > change the API again? Come up with another set of names? Versioning would
> > solve this once and for all.
> >
> > If you think that enum is ugly, we can use a UDA instead. This way,
> > ranges that use the new API will be instantly recognizable.
>
> [...]
>
> +1, I like the UDA idea.  It will be much less effort to migrate to the
> new range API to slap a UDA on, than to rename every range method.

It wouldn't fix the import problem. Realistically, it's not going to work to
retain the same function names and have arrays work as ranges without
needing an explicit import. Adding the functions to object.d breaks pretty
much all range-based code in existence due to symbol conflicts, and due to
an issue with how selective imports work, you can't even use selective
imports with std.range to fix it; you have to use selective imports with
std.range.primitives - and of course, you'd have to use selective imports
for every range API function, which would be tedious. Using new function
names makes it so that we don't have any symbol conflicts - in addition to
allowing us to statically distinguish between the old and new APIs. It would
also make it easy to see at a glance which API a piece of code is using.
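
For anyone who hasn't run into it, this is roughly what the current situation
looks like. Arrays get front, popFront, and empty as free functions from
std.range.primitives, called via UFCS, so the import is what makes them
ranges. A minimal sketch (the variable name is just for illustration):

```d
// The array range primitives are free functions in std.range.primitives,
// so they have to be imported before UFCS can find them.
import std.range.primitives : empty, front, popFront;

void main()
{
    int[] arr = [1, 2, 3];

    assert(arr.front == 1);  // calls front(arr) via UFCS
    arr.popFront();          // slices off the first element
    assert(arr.front == 2);
    assert(!arr.empty);
}
```

Without that import, arr.front and arr.popFront() don't compile, which is the
requirement that Walter wants to get rid of.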

Also, having the main difference that can be statically checked between the
old and new API be just a UDA makes it much more likely that code will
accidentally be set up for the wrong range API version, because someone
forgot the UDA, and it will make it much riskier to try to write code that's
overloaded on the old and new APIs. It would not surprise me if a number of
library writers will want to make it so that their code works with both APIs
by creating overloads that call the new Phobos functions to wrap an old
range so that it works with the new API.

Some ranges will be easily distinguishable due to a lack of save, but if the
code is written to work on basic input ranges (or it overloads on the
category of the range), then if the primary difference is whether the
programmer remembered to put a UDA on the type or not, the odds are pretty
high that the wrong overload will be used some percentage of the time.
Whether that outright makes the code behave incorrectly or just degrades its
performance is going to be highly dependent on what it's doing, but I'm
afraid that if the main difference between the old and new APIs is a UDA,
it's going to be missed in many cases and cause issues, whereas a different
set of function names makes the difference crystal clear and completely
obviates the risk of a range from one version of the API being mistaken for a
range from the other version.
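
To illustrate the failure mode, here is a rough sketch of what UDA-based
detection could look like. The newRangeAPI attribute and the isNewRange trait
are hypothetical names made up for this example, not anything that has
actually been proposed, and the three structs are deliberately identical apart
from the attribute:

```d
import std.traits : hasUDA;

// Hypothetical marker UDA; made up for this example.
struct newRangeAPI {}

// Written against the old API.
struct OldRange
{
    int[] data;
    bool empty() const { return data.length == 0; }
    int front() const { return data[0]; }
    void popFront() { data = data[1 .. $]; }
}

// Written against the new API and tagged accordingly.
@newRangeAPI struct TaggedRange
{
    int[] data;
    bool empty() const { return data.length == 0; }
    int front() const { return data[0]; }
    void popFront() { data = data[1 .. $]; }
}

// Written against the new API, but the author forgot the UDA.
struct ForgotTheUDA
{
    int[] data;
    bool empty() const { return data.length == 0; }
    int front() const { return data[0]; }
    void popFront() { data = data[1 .. $]; }
}

// Hypothetical trait: the UDA is the only thing that it can check.
enum isNewRange(R) = hasUDA!(R, newRangeAPI);

static assert(!isNewRange!OldRange);
static assert( isNewRange!TaggedRange);
static assert(!isNewRange!ForgotTheUDA); // silently treated as an old range

void main() {}
```

Nothing here fails to compile. ForgotTheUDA just quietly ends up on the
old-API side of any overload set that uses the trait.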

And maybe it won't be at all common to try to overload code based on which
range API it's using, but I don't think that there's much question that it
will be less error-prone if the two APIs are distinct.

But regardless of how risky it is to have the primary difference simply be a
UDA, I don't see how the import problem can be solved without changing the
names. And Walter has explicitly requested that we fix it so that we no
longer need to import anything for arrays to be treated as ranges.

We could force them to not be ranges at all and use wrapper types like
Rikki is suggesting, but I can't imagine that Walter would be any happier
about that, and having wrappers like that typically causes issues
(especially with type introspection and any code that actually needs to
operate on dynamic arrays), whereas the primary issues with having new names
are that you have to remember to use the new ones when writing code that uses
the new API, and when updating code, you'll have to do a search and replace.
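
As a rough sketch of the kind of introspection issue that I mean, assume a
wrapper along these lines. ArrayRange is a hypothetical name for this example
and is not necessarily what Rikki has in mind:

```d
import std.traits : isDynamicArray;

// Hypothetical wrapper type that turns a dynamic array into a range.
struct ArrayRange(T)
{
    T[] data;
    bool empty() const { return data.length == 0; }
    T front() { return data[0]; }
    void popFront() { data = data[1 .. $]; }
}

// The wrapper is a distinct type, so array-based introspection rejects it.
static assert( isDynamicArray!(int[]));
static assert(!isDynamicArray!(ArrayRange!int));
static assert(!is(ArrayRange!int : int[])); // no implicit conversion either

void main()
{
    auto r = ArrayRange!int([1, 2, 3]);
    r.popFront();
    assert(r.front == 2);

    // Code that needs the array itself has to unwrap it explicitly.
    int[] arr = r.data;
    assert(arr == [2, 3]);
}
```

Any code that specializes on dynamic arrays would then need to know about the
wrapper and unwrap it by hand.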

All in all, while renaming the functions is annoying, it definitely will
solve some of the issues that we have, and I don't expect that it will
ultimately be a big issue. And the more I thought about it while working out
the proposal, the more I came to the conclusion that having the difference
between the two API versions be visible at a glance would reduce the number
of problems that we'd have. It makes the introspection situation cleaner, it
makes it much more likely that making a mistake will result in a compilation
error, and it makes it easier to understand what's going on when you read
the code instead of having to dig around to figure out which version of the
API is actually being used. And it allows us to get rid of needing to import
anything to use arrays as ranges, which I don't think that we can actually
do with the current function names.

- Jonathan M Davis