Range Redesign: Empty Ranges

Mon Mar 4 23:19:05 UTC 2024

On Monday, March 4, 2024 2:58:08 PM MST Sebastiaan Koppe via Digitalmars-d 
wrote:
> On Monday, 4 March 2024 at 21:29:40 UTC, Jonathan M Davis wrote:
> > 1. Make it so that for finite ranges, the init value of a range
> > is required to not just be valid, but it's also required to be
> > empty.
>
> Makes a lot of sense, but I agree it will be very annoying for
> some ranges.
>
> I suspect it will result in extra state and/or checks for those
> cases.
>
> Would it be too crazy of an idea for those ranges to implement a
> static init function?

You mean define a static function named init? I think that most of us agree
at this point that it was a mistake to allow init to be anything other than
the compiler-defined init, and as I understand it, it doesn't really work to
redefine it in the way that Walter intended. Realistically, code tends to
depend on the init value being the compiler-defined one, and trying to do
anything else is asking for trouble. A redefined init value doesn't actually
get used with default-initialization, so with a case like

struct Foo
{
    ...
    typeof(chain(func1(), func2(), func3())) _range;
    ...
}

any init that you declare will be ignored (even if it's a value and not a
function), meaning that unless that range type has a valid init value, that
variable won't work properly unless it happens to be assigned a value before
it's used.

We could of course add a function / enum to the range API which would be
required to be used instead of the init value when you're looking for some
kind of initialization, but that doesn't solve the problem of
default-initialized values. For that, it really needs the init to be valid.

Of course, we _could_ allow ranges to @disable default initialization, and
then the type could define init with a static function which would be used
when init was used explicitly, but that would potentially cause problems if
init were used explicitly at compile time. It's also on the list of things
that I half expect to be disallowed at some point with a future edition,
because allowing init to be anything other than the default one tends to
cause issues. So, we _might_ be able to make it work, but my gut reaction is
that we're playing with fire, and we'll be better off overall if we can just
require that the compiler-defined init value be valid.

Still, if we allowed it, at least it should result in compile time errors in
generic code rather than weird runtime behavior. But it would mean that
generic code couldn't rely on being able to default-initialize ranges, and
it would have to give all ranges a value at runtime, which would be
annoying, and it realistically wouldn't happen outside of something like
Phobos (and it would likely only happen in Phobos if that specific case was
tested for by Phobos functions in general). Code in general expects default
initialization to work, and it's usually only code specifically written to
work with types that don't allow it where it actually works to have a type
that @disables it.
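
To illustrate the compile-time failure mode described above, here is a minimal sketch. NoDefault and makeEmptyLike are hypothetical names made up for this example; they are not part of Phobos or of any proposed range API. The sketch leaves init alone and only shows that generic code which default-initializes a range stops compiling once the range @disables default initialization.

```d
// Hypothetical range type that forbids default initialization.
struct NoDefault
{
    private int[] _data;

    @disable this();

    this(int[] data) { _data = data; }

    @property bool empty() const { return _data.length == 0; }
    @property int front() const { return _data[0]; }
    void popFront() { _data = _data[1 .. $]; }
}

// Hypothetical generic helper that relies on default initialization,
// as a lot of range-based code does.
R makeEmptyLike(R)()
{
    R r; // compile-time error if R has @disable this();
    return r;
}

unittest
{
    // Works for a type with a usable compiler-defined init value.
    static assert(__traits(compiles, makeEmptyLike!(int[])()));

    // Fails to compile (rather than misbehaving at runtime) for NoDefault.
    static assert(!__traits(compiles, makeEmptyLike!NoDefault()));

    // The range is still usable when it is given a value explicitly.
    auto r = NoDefault([1, 2]);
    assert(!r.empty && r.front == 1);
}
```

The unittest block can be run with something like dmd -unittest -main. The second static assert is the point: the generic helper is rejected at compile time for such a type, so every generic function would have to be written and tested with that case in mind.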

So, I don't know. I'd really rather not play games with trying to redefine
the init value. I also suspect that most of the annoyance that ranges might
have with the init value would be there simply by requiring that the init
value be valid without requiring that it be empty, in which case, requiring
that it be empty wouldn't be onerous. It's requiring that it be valid which
would potentially be onerous. But I'm also not sure how much of an issue
that would be in practice. For most ranges, I expect that it will be
pretty simple, and many of them already have a valid, empty init value. In
the vast majority of cases, I expect that if a range's init value isn't
valid, it's because the person who wrote it didn't think about the init
value being used (since it's probably a Voldemort type) and didn't test that
case.

And if the issue of it being problematic to make the init value valid isn't
common, I'm disinclined to make range-based code in general have to worry
about it - especially if it's still possible to make such ranges work (even
if it's more annoying than would be desirable). I can also think of several
ways to make it easy right off the top of my head, though you wouldn't
necessarily want the additional overhead (e.g. any range could just use the
is operator to compare itself to its init value in empty, or it could have a bool to
indicate whether it had been initialized via a constructor and check that;
it would be annoying but easy).
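
The two workarounds mentioned above could look roughly like this. This is only a sketch under assumptions: CounterRange and FlaggedRange are hypothetical types invented for illustration, and both dereference a pointer that is null in the init value, which is the kind of range that would otherwise misbehave when default-initialized.

```d
// Variant 1: compare against the compiler-defined init value with `is`.
struct CounterRange
{
    private int* _counter; // null in the init value
    private int _limit;

    this(int* counter, int limit)
    {
        _counter = counter;
        _limit = limit;
    }

    @property bool empty() const
    {
        // Bitwise identity check against init, done before any dereference.
        if (this is typeof(this).init)
            return true;
        return *_counter >= _limit;
    }

    @property int front() const { return *_counter; }
    void popFront() { ++*_counter; }
}

// Variant 2: track construction with an extra bool.
struct FlaggedRange
{
    private int* _counter;
    private int _limit;
    private bool _initialized; // false in the init value

    this(int* counter, int limit)
    {
        _counter = counter;
        _limit = limit;
        _initialized = true;
    }

    @property bool empty() const
    {
        return !_initialized || *_counter >= _limit;
    }

    @property int front() const { return *_counter; }
    void popFront() { ++*_counter; }
}

unittest
{
    CounterRange a; // default-initialized
    FlaggedRange b;
    assert(a.empty && b.empty);

    int c = 0;
    int sum = 0;
    foreach (x; CounterRange(&c, 3))
        sum += x; // visits 0, 1, 2
    assert(sum == 3);
}
```

Both variants add a check to every call to empty, and the second also adds a field, which is the additional overhead referred to above.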

But I'm also not entirely sure how onerous it will be in practice to make
the init values of ranges valid (be it empty or otherwise) - though it's
actually _really_ easy in the cases where it would segfault now, since if
such a type has a wrapper struct, the wrapper can simply check whether the
pointer or reference is null and return true for empty in that case.
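
A minimal sketch of that wrapper approach, assuming a class-based range underneath. LineSource and Lines are hypothetical names used only for this example.

```d
// Hypothetical reference-type range; a null reference to it would
// segfault if empty were called on it directly.
class LineSource
{
    private string[] _lines;

    this(string[] lines) { _lines = lines; }

    bool empty() const { return _lines.length == 0; }
    string front() { return _lines[0]; }
    void popFront() { _lines = _lines[1 .. $]; }
}

// Wrapper struct whose init value holds a null reference and reports
// itself as empty instead of dereferencing it.
struct Lines
{
    private LineSource _impl; // null in the init value

    this(LineSource impl) { _impl = impl; }

    @property bool empty() const
    {
        return _impl is null || _impl.empty;
    }

    @property string front() { return _impl.front; }
    void popFront() { _impl.popFront(); }
}

unittest
{
    Lines none; // default-initialized, no segfault
    assert(none.empty);

    auto some = Lines(new LineSource(["a", "b"]));
    size_t count = 0;
    foreach (line; some)
        ++count;
    assert(count == 2);
}
```

Calling front or popFront on the default-initialized wrapper would still dereference null, but that is the same contract as for any other empty range.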

- Jonathan M Davis