Parameter storage classes on foreach variables

Quirin Schroll qs.il.paperinik at gmail.com
Fri May 17 18:59:13 UTC 2024


As of now, `foreach` admits `ref` variables, as in
`foreach (ref x; xs)`. There, `ref` can be used for two
conceptually different things:
* Avoiding copies
* Mutating the values in place

If mutating in place is desired, `ref` is an excellent choice.
However, if merely avoiding copies is desired, `in` would be
another great option.
On parameters, `in` avoids expensive copies but still makes
trivial ones.
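To illustrate what `in` already does on parameters today, here is a minimal sketch. `Big` and `firstElement` are made-up names. Whether the argument is actually passed by reference is up to the compiler (as far as I know, only with `-preview=in`); the sketch only shows the read-only guarantee.

```d
struct Big
{
    int[64] payload; // large enough that copying is not free
}

// `in` promises that the callee only reads `b`.
int firstElement(in Big b)
{
    static assert(is(typeof(b) == const(Big))); // `in` implies `const`
    return b.payload[0];
}

unittest
{
    Big b;
    b.payload[0] = 42;
    assert(firstElement(b) == 42);
}
```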

A type supplying `opApply` can, in principle, easily provide an 
implementation where the callback takes an argument by `in` or 
`out`:
```d
// `X` stands for an arbitrary element type.
struct Range
{
     int opApply(scope int delegate(size_t, in X) callback)
     {
         X x;
         if (auto result = callback(0, x)) return result;
         return 0;
     }
}
```
For `out`, the implementation is not really different.
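For reference, a slightly fuller sketch of such an `opApply`, with an invented element type `X` and an `items` field that are not part of the proposal. Because `foreach` has no `in` variables yet, the sketch calls `opApply` directly with a delegate literal whose parameter is declared `in`.

```d
struct X
{
    int value;
}

struct Range
{
    X[] items; // hypothetical storage, for illustration only

    int opApply(scope int delegate(size_t, in X) callback)
    {
        foreach (i; 0 .. items.length)
        {
            // A nonzero result means the body asked to stop iterating.
            if (auto result = callback(i, items[i]))
                return result;
        }
        return 0;
    }
}

int sumValues(Range r)
{
    int total = 0;
    // The callback can read `x` but not modify it.
    r.opApply((size_t i, in X x) { total += x.value; return 0; });
    return total;
}
```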

However, how do classical ranges (`empty`, `front`, `popFront`)
fare with these?
First, `in`:
```d
foreach (in x; xs) { … }
// lowers to
{
     auto __xs = xs;
     for (; !__xs.empty; __xs.popFront)
     {
         static if (/* should be ref */)
             const scope ref x = __xs.front;
         else
             const scope x = __xs.front;
         …
     }
}
```
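The non-`ref` branch of that lowering can already be written by hand. A small sketch, assuming a slice as the range and a made-up function `sumAll`; it takes the copying branch, since `int` is trivial to copy.

```d
import std.range.primitives : empty, front, popFront;

int sumAll(const(int)[] xs)
{
    int total = 0;
    auto r = xs; // corresponds to `__xs` in the lowering
    for (; !r.empty; r.popFront())
    {
        const x = r.front; // read-only copy of the current element
        total += x;
    }
    return total;
}
```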

The first notable observation is that `out` makes no sense for 
input ranges. Rather, it would make sense for, well, output 
ranges: Every time the loop reaches the end, a `put` is issued, 
whereas `continue` means “this loop iteration did not produce a 
value, but continue” and `break` means “end the loop”:
```d
foreach (out T x; xs) { … }
// lowers to
{
     auto __xs = xs; // or xs[]
     for (; !__xs.empty /* or __xs.length > 0 or nothing */;)
     {
         auto x = T.init;
         …
         __xs.put(x); /* or similar */
     }
}
```
The loop body is expected to assign `x`. If control reaches the
end of the body, the value is `put` into the output range.
Since an output range need not be finite in general, the loop is
endless by default. If the range has an `empty` member, the
condition is `!__xs.empty`; for types with `length` but no
`empty`, the condition is `__xs.length > 0`. For arrays and
slices, the `put` operation is `__xs[0] = x; __xs = __xs[1 .. $];`.
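What such a loop would amount to can be written by hand today. A rough sketch, assuming an `Appender` as the output range and an invented function `evenSquares`; the counter `n` belongs to the example, not to the proposed lowering.

```d
import std.array : appender;
import std.range.primitives : put;

int[] evenSquares(int limit)
{
    auto sink = appender!(int[])();
    for (int n = 0;; ++n) // an Appender has no `empty`, so no condition
    {
        auto x = int.init;
        if (n >= limit)
            break;    // "end the loop"
        if (n % 2 != 0)
            continue; // no value this iteration, nothing is put
        x = n * n;    // the body assigns `x`
        put(sink, x); // reached only at the end of the body
    }
    return sink.data;
}
```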

If `T` is not explicitly given and `xs` is not an array or
slice, the compiler should try to infer it from the single
parameter of a non-overloaded `xs.put`. If that fails, it’s an
error.
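The inference could look roughly like the following library-level sketch. `Sink` and `InferredOutType` are hypothetical names, and a real implementation would live in the compiler rather than in a template.

```d
import std.traits : Parameters;

struct Sink
{
    double[] received;

    void put(double value)
    {
        received ~= value;
    }
}

// Hypothetical helper: only matches when `put` is not overloaded.
template InferredOutType(R)
if (__traits(getOverloads, R, "put").length == 1)
{
    // The type of the single parameter of `put`.
    alias InferredOutType = Parameters!(R.put)[0];
}

static assert(is(InferredOutType!Sink == double));
```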

Dynamic arrays and slices should support `size_t` keys as well:
```d
foreach (i, out x; xs) { … }
// lowers to
{
     auto __xs = xs[];
     for (size_t __i = 0; __xs.length > 0; ++__i)
     {
         size_t i = __i;
         auto x = typeof(xs[0]).init;
         …
         __xs[0] = x;
         __xs = __xs[1 .. $];
     }
}
```
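Written out by hand for a concrete case, this is just the familiar "fill a slice" loop. A sketch with a made-up function `fillSquares`, where the body assigns `i * i` to `x`:

```d
void fillSquares(int[] xs)
{
    auto rest = xs[]; // corresponds to `__xs`
    for (size_t idx = 0; rest.length > 0; ++idx)
    {
        size_t i = idx;
        auto x = int.init;
        x = cast(int)(i * i); // the loop body assigns `x`
        rest[0] = x;          // the implicit `put` for slices
        rest = rest[1 .. $];
    }
}
```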

Associative arrays specifically can be filled using `out` keys
and values:
```d
int[string] aa;
foreach (out key, out value; aa) { … }
// lowers to
{
     auto __aa = aa;
     for (;;)
     {
         KeyType key = KeyType.init;       // `string` for `int[string]`
         ValueType value = ValueType.init; // `int` for `int[string]`
         …
         __aa[key] = value;
     }
}
```
At some point, a `break` is needed; otherwise the loop is
infinite.
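A hand-written equivalent for a concrete case, as a sketch: the function `wordLengths` and its counter `n` are invented for illustration, and the `break` sits before the assignments, so nothing is inserted on the final pass.

```d
int[string] wordLengths(string[] words)
{
    int[string] aa;
    size_t n = 0;
    for (;;)
    {
        string key = string.init;
        int value = int.init;
        if (n == words.length)
            break; // without this, the loop would never end
        key = words[n];
        value = cast(int) words[n].length;
        ++n;
        aa[key] = value; // the implicit insertion at the end of the body
    }
    return aa;
}
```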

