Second Draft: Coroutines
Mai Lapyst
mai at lapyst.by
Sat Jan 25 16:44:00 UTC 2025
On Saturday, 25 January 2025 at 13:41:24 UTC, Richard (Rikki)
Andrew Cattermole wrote:
> The ``await`` keyword has been used for multithreading longer
> than I've been alive. To mean what it does.
> Its also very uncommon and does not see usage in
> druntime/phobos.
So "preventing breakage" is reserved for Phobos only, and any
user-written code is fine to break at any moment? I find that a
very problematic approach to implementing / enhancing a
language. "Don't break userspace" comes to mind; we should first
and foremost be concerned with users interacting with the feature
(which you seem to be concerned with as well), and as such I
wouldn't want to break all existing asynchronous libraries out
there when the new edition rolls around. This makes dlang seem
even more broken and "too niche" for people to use, as any async
library used so far in examples, tutorials etc. will break
horribly.
> As it has no meaning outside of a coroutine, it'll be easy to
> handle I think.
Then the DIP should specify it. Either the token `await` becomes
a hard keyword, disallowing any identifier usage of it, or it
becomes a soft one, where it only acts as a keyword in `@async`
contexts and like a normal identifier outside of them. You even
link C#'s definition of it, which has (roughly) the exact wording
needed for it:
```
Inside an async function, await shall not be used as an
available_identifier although the verbatim identifier @await may
be used. There is therefore no syntactic ambiguity between
await_expressions and various expressions involving identifiers.
Outside of async functions, await acts as a normal identifier.
```
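To make the breakage concrete, here is a minimal sketch of the
kind of user code that is valid D today and uses `await` as an
ordinary identifier. The `Task` type and the free function
`await` are hypothetical names made up for this illustration, not
taken from any real library. Under a hard-keyword rule this would
stop compiling; under a C#-style soft-keyword rule it would keep
working outside of `@async` functions.

```d
// Hypothetical pre-coroutine library code; `Task` and `await` are
// illustrative names only.
struct Task
{
    int value;
}

// `await` is an ordinary identifier in current D.
int await(Task t)
{
    return t.value;
}

void main()
{
    // Compiles today; would break if `await` became a hard keyword.
    assert(await(Task(42)) == 42);
}
```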
> Stuff like this is why I added the multiple returns support,
> even though I do not believe it is needed.
Which multiple return support? The DIP states clearly that it is
**NOT** supported.
> Its also a good example of why the language does not define the
> library, so you have the freedom to do this stuff!
Yes, but honestly you do the same: your dependency system defines
how libraries need to interact with coroutines, the same way a
waker does. I don't want to argue that wakers don't define a
library usage as well, but dependencies do so too.
> It is not part of the DIP. Without the operator overload
> example, it wouldn't be understood.
Then do not put it into the DIP. It should **only** contain your
design and what's possible with it, without having to rely on
possible future DIPs to add some operators to make your DIP
actually work.
> The compiler using just the parse tree can see the function
> ``opConstructCo`` on the library type
> ``InstantiableCoroutine``. Allowing it to flag the type as a
> instantiable coroutine.
Again: this description says that the compiler treats
`opConstructCo` differently from other functions. What would
happen if I want to use another name? What will happen if I have
multiple functions with the same signature but different names?
> See above, it can see that it is a coroutine by the parameter,
> rather than on the argument.
So the argument (lambda) would not be a coroutine and could not
use `await` or `@async return`? This seems counter-intuitive, as
I can clearly see that code like this will exist:
```d
ListenSocket ls = ListenSocket.create((Socket socket) @async {
    auto line = await socket.readLine();
    // ...
});
```
Therefore the function should be annotated as `@async`,
especially because you say time and time again that it should be
usable by users without prior knowledge of the insides of the
system. Making it so that functions can only have `await` if
they're `@async`, but lambdas are whatever they want to be, seems
like a huge booby trap.
> You don't win a whole lot by requiring it. Especially when they
> are templates and they look like they should "just work".
It makes things clearer for the writer (and future readers), and
by extension for the compiler, as it now knows for certain that
it should slice the lambda as well, since this is the intention
of the developer.
> It was heavily discussed
Where exactly? I haven't seen it yet, sorry. And even then: these
should be part of the DIP under a section "non-goals" or
"discarded ideas" so people know that a) they were considered
and b) what the considerations were that led to the decision.
> See the ``Prime Sieve`` example for one way you can do this.
I've seen it, but again: it uses undeclared things that aren't as
clear as day if you're **not** the writer of the DIP.
```d
InstantiableCoroutine!(int) ico = &generate;
Future!int ch = ico.makeInstance();
```
Why does this work? `generate` is a coroutine, but why can it be
"just" assigned to a library shell? Does it "just work"? That's
not how programming works or how standards should be written. I
**could** see that you meant that a constructor that takes a
template parameter with the `__descriptorco` should be used, but
again: it is not stated in the DIP and as such should not be
taken for granted just because you expect people to come to the
conclusion themselves. Look at C++ papers, they are **huge** for
a reason: EVERYTHING gets written down so no confusion can
happen.
> The ``await`` statement does two things.
> 1. It assigns the expression's value into the state variable
> for waiting on.
> 2. It yields.
Then please, for the love of god, put it into the DIP! I'm sorry
that I'm so picky about this, but a **specification** (which is
what your DIP is) should contain **every detail of your idea**,
not only the bits gemini deemed as important. We're humans, and
as such we should be especially careful to give each other as
much information as possible.
> Whereas the other approaches including C++ is still after much
> reading not in my mental model.
I'm somewhat starting to get a grasp of yours. While in your
model you try to just "throw" the awaited-on value back to anyone
interested in it and use a sumtype to do it, other languages
define a stricter interface that needs to be followed: C++ with
awaiters and Rust with its `Future<>`s and `Waker`s. Both ways
prevent splits in the ecosystem, or one library coming out on top
while everything else just dies. That's what I honestly fear with
the current approach: there will be one way to use dependencies
and that's it. The problems it has will extend to all async code,
and an outside viewer will declare async in dlang broken without
anyone realising that it's just the library that's broken. Take
dlang's std.regex for example: it's very slow in comparison with
others and you could easily roll your own, but nobody does, so
everybody just assumes it's a "dlang" problem and moves on. While
this has only minimal impact because it's just regex, with an
entire language feature that will be presented through the lens
of the most used or most "present" library (not popular! big
difference), this will make people say "Hey, dlang's async is so
bad because of this and that". I want to prevent such a thing.
With a stricter protocol for how things are awaited (C++) or how
a coroutine can be "retried" / woken up (Rust), these problems go
away. Any executor can rely on the fact that any I/O / waiting
structure **will** follow the protocol, and as such they're
interchangeable, which is a **big** benefit for user and
application code, as no one needs to reinvent the whole wheel.
Another benefit is that it (somewhat) helps in ensuring that
the coroutine is actually in a good state without the executor
needing to know about that state itself.
To help understand the two models a bit more, let's take a look
at a "typical" flow of a coroutine:
- starts coroutine
- initiate `read_all()` of a file
- `await`s the `read_all()` and pauses the coroutine
- gets re-called since the waited on part is now resolved
- processes data
In your proposal this works by setting a dependency on the
return type of `read_all()`. If the executor now simply ignores
the dependency, it re-calls the coroutine and the coroutine is in
a bad state, as it does not validate whether the dependency is
actually resolved (how would it?). As a result, you would need to
put it inside a loop:
```d
ReadDependency r = ...;
while (!r.isReady) {
    await r;
}
```
Which is boilerplate best avoided.
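The bad state can be simulated in plain D without any coroutine
support. The sketch below is only my own stand-in: the
`ReadCoroutine` struct plays the role of whatever state machine
the compiler would generate for a coroutine that awaits the read
exactly once, and `ReadDependency`, `resume` and the stage
numbers are all made up for illustration, not taken from the DIP.

```d
// Hypothetical stand-in for a library dependency type.
struct ReadDependency
{
    bool isReady;
    string data;
}

// Hand-written stand-in for a coroutine that awaits the read once.
struct ReadCoroutine
{
    ReadDependency* dep;
    int stage;
    string result;

    // Returns the dependency it waits on, or null when finished.
    ReadDependency* resume()
    {
        if (stage == 0)
        {
            stage = 1;
            return dep; // "await r": hand the dependency back, yield
        }
        // Resumed: nothing here checks dep.isReady.
        result = dep.data;
        stage = 2;
        return null;
    }
}

void main()
{
    auto dep = ReadDependency(false, null);
    auto co = ReadCoroutine(&dep);

    auto waitingOn = co.resume();
    assert(waitingOn is &dep);

    // A careless executor ignores the dependency and resumes anyway.
    co.resume();
    assert(!dep.isReady);
    assert(co.result is null); // the coroutine continued without data
}
```

Nothing in the language stops the second `resume()` here, which
is why the coroutine body would have to re-check `isReady` itself.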
Secondly, the `read_all()` itself: it and the executor would need
to agree on an out-of-language protocol for how to actually
handle the dependency. This will most likely mean that a library
would expose an interface like `Awaitable` that any dependency
would need to implement, but with the downside that any dependent
now has an explicit dependency on said library. Sure, maybe over
time a standard set of interfaces would arise that the community
would adopt, but then we have just re-invented the API-dependency
hell of Java.
In C++, `co_await` dictates that the coroutine stays suspended as
long as the `Awaiter` protocol says so, since any user
**expects** that the `await`ed thing is actually resolved after
it's `await`ed. It doesn't matter whether successfully or not;
the key point is that it's **not pending** anymore.
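As a rough illustration of that idea in D terms: C++ awaiters
expose three hooks (`await_ready`, `await_suspend`,
`await_resume`). The sketch below mirrors that shape with an
interface of my own; `Awaiter`, `ManualEvent` and `awaitThen` are
hypothetical names, and `awaitThen` only imitates by hand what a
compiler would generate around an `await`. It is not how C++ or
the DIP actually implement anything.

```d
// Hypothetical D rendering of the three-hook awaiter shape.
interface Awaiter(T)
{
    bool awaitReady();                         // already resolved?
    void awaitSuspend(void delegate() resume); // store continuation
    T awaitResume();                           // fetch the result
}

// An awaitable that gets completed by hand from "outside".
final class ManualEvent : Awaiter!int
{
    private bool done;
    private int value;
    private void delegate() continuation;

    bool awaitReady() { return done; }
    void awaitSuspend(void delegate() resume) { continuation = resume; }
    int awaitResume() { assert(done); return value; }

    void complete(int v)
    {
        value = v;
        done = true;
        if (continuation !is null) continuation();
    }
}

// Imitates an `await`: `rest` only ever runs once `a` is resolved.
void awaitThen(T)(Awaiter!T a, void delegate(T) rest)
{
    if (a.awaitReady())
        rest(a.awaitResume());
    else
        a.awaitSuspend(() { rest(a.awaitResume()); });
}

void main()
{
    auto ev = new ManualEvent;
    int seen = -1;
    awaitThen!int(ev, (int v) { seen = v; });
    assert(seen == -1); // still pending, the rest has not run
    ev.complete(5);
    assert(seen == 5);  // continued only after completion
}
```

The point is that the code after the `await` cannot observe a
pending state, no matter which executor or I/O library is in use.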
In Rust it's even simpler: polling is a concept that even kids
understand. When you want your parents to give you something, you
"poll" until they give it to you or tell you no in a way that
keeps you from continuing what you originally wanted to do. Same
thing in Rust: a coroutine is "polled" by the executor and can
either resolve with the data you expected, or tell you that it's
still waiting and to come back later. The compiler ensures that
only ever a ready state is allowed to continue the coroutine. If
you want to be more performant and not spin-lock in the executor
in the hope that someday the future will resolve, you can give
it a waker and say: "hey, if you say you are still not done, I
will do other things; if you think you're ready for me to try
again, just call this and I will come to you!".
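The same poll-and-wake conversation, sketched in D. This is only
an analogy written by hand: `Poll`, `Waker` and `ReadFuture` are
hypothetical names of mine, loosely modelled on the Rust
concepts, and `finish` stands in for whatever I/O completion
would happen in a real system.

```d
// Hypothetical D rendering of the poll/waker idea.
struct Poll(T)
{
    bool ready;
    T value;
}

alias Waker = void delegate();

// A future whose result arrives later from "outside".
final class ReadFuture
{
    private bool done;
    private string data;
    private Waker waker;

    Poll!string poll(Waker w)
    {
        if (done)
            return Poll!string(true, data);
        waker = w; // remember who to notify
        return Poll!string(false, null);
    }

    void finish(string d)
    {
        data = d;
        done = true;
        if (waker !is null) waker(); // "come back to me now"
    }
}

void main()
{
    auto fut = new ReadFuture;
    bool woken = false;
    Waker w = () { woken = true; };

    assert(!fut.poll(w).ready); // pending: executor does other work
    assert(!woken);
    fut.finish("hello");
    assert(woken);              // executor was told to poll again
    auto p = fut.poll(w);
    assert(p.ready && p.value == "hello");
}
```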