Second Draft: Coroutines

Mai Lapyst mai at lapyst.by
Sat Jan 25 16:44:00 UTC 2025


On Saturday, 25 January 2025 at 13:41:24 UTC, Richard (Rikki) 
Andrew Cattermole wrote:
> The ``await`` keyword has been used for multithreading longer 
> than I've been alive. To mean what it does.
> Its also very uncommon and does not see usage in 
> druntime/phobos.

So "preventing breakage" is reserved for Phobos only, and any
user-written code is fine to break at any moment? I find that a
very problematic approach to implementing or enhancing a
language. "Don't break userspace" comes to mind; we should first
and foremost be concerned with users interacting with the feature
(which you seem to be concerned with as well), and as such I
wouldn't want to break all existing asynchronous libraries out
there when the new edition rolls around. This makes dlang seem
even more broken and "too niche" for people to use, as any async
library used in examples, tutorials etc. up to this point will
break horribly.

> As it has no meaning outside of a coroutine, it'll be easy to 
> handle I think.

Then the DIP should specify it. Either the token `await` becomes
a hard keyword, disallowing any use of it as an identifier, or it
becomes a soft one, where it only acts as a keyword in `@async`
contexts and like a normal identifier outside of them. You even
link C#'s definition of it, which has (more or less) the exact
wording needed for it:
```
Inside an async function, await shall not be used as an 
available_identifier although the verbatim identifier @await may 
be used. There is therefore no syntactic ambiguity between 
await_expressions and various expressions involving identifiers. 
Outside of async functions, await acts as a normal identifier.
```
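
To make the breakage concern concrete: `await` is an ordinary
identifier in D today, so code along the following lines compiles.
This is only a sketch; `Task` and both `await` functions are names
made up for illustration and not taken from any existing library.
With a hard keyword both uses stop compiling, with a soft keyword
they stay valid outside of `@async` functions.

```d
// Hypothetical user code; `Task` and `await` are illustrative names only.
struct Task(T)
{
    T value;

    // `await` is a plain member function name in current D.
    T await()
    {
        return value;
    }
}

// `await` as a free function name is equally legal today.
T await(T)(Task!T task)
{
    return task.await();
}

void main()
{
    auto t = Task!int(42);
    assert(t.await() == 42);   // member call
    assert(await(t) == 42);    // free-function call
}
```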

> Stuff like this is why I added the multiple returns support, 
> even though I do not believe it is needed.

Which multiple return support? The DIP states clearly that it is 
**NOT** supported.

> Its also a good example of why the language does not define the 
> library, so you have the freedom to do this stuff!

Yes, but honestly you do the same: your dependency system defines
how libraries need to interact with coroutines, the same way
wakers do. I don't want to argue that wakers don't define library
usage as well, but dependencies do so too.

> It is not part of the DIP. Without the operator overload 
> example, it wouldn't be understood.

Then do not put it into the DIP. It should **only** contain your
design and what's possible with it, without having to rely on
possible future DIPs to add some operators to make your DIP
actually work.

> The compiler using just the parse tree can see the function 
> ``opConstructCo`` on the library type 
> ``InstantiableCoroutine``. Allowing it to flag the type as a 
> instantiable coroutine.

Again: this description says that the compiler treats
`opConstructCo` differently from other functions. What would
happen if I want to use another name? What will happen if I have
multiple functions with the same signature but different names?

> See above, it can see that it is a coroutine by the parameter, 
> rather than on the argument.

So the argument (lambda) would not be a coroutine and could not
use `await` or `@async return`? This seems counter-intuitive, as
I can clearly see that code like this will exist:
```d
ListenSocket ls = ListenSocket.create((Socket socket) @async {
	auto line = await socket.readLine();
	// ...
});
```
Therefore the function should be annotated as `@async`, esp. bc
you say time and time again that it should be usable by users
without prior knowledge of the insides of the system. Making it
so that functions can only have `await` if they're `@async`, but
lambdas are whatever they want to be, seems like a huge boobytrap.

> You don't win a whole lot by requiring it. Especially when they 
> are templates and they look like they should "just work".

It makes things clearer for the writer (and future readers), and
by extension for the compiler, which now knows for certain that
it should slice the lambda as well, since this is the developer's
intention.

> It was heavily discussed

Where exactly? I haven't seen it yet, sorry. And even then: these
should be part of the DIP under a section "non-goals" or
"discarded ideas" so people know that a) they were considered
and b) what the considerations were that led to the decision.

> See the ``Prime Sieve`` example for one way you can do this.

I've seen it, but again: it uses undeclared things that aren't as
clear as day if you're **not** the writer of the DIP.
```d
InstantiableCoroutine!(int) ico = &generate;
Future!int ch = ico.makeInstance();
```
Why does this work? `generate` is a coroutine, but why can it be
"just" assigned to a library shell? Does it "just work"? That's
not how programming works or how standards should be written. I
**could** see that you meant that a constructor that takes a
template parameter with the `__descriptorco` should be used, but
again: it is not stated in the DIP and as such should not be
taken for granted just bc you expect people to come to the
conclusion themselves. Look at C++ papers, they are **huge** for a
reason: EVERYTHING gets written down so no confusion can happen.

> The ``await`` statement does two things.
> 1. It assigns the expression's value into the state variable 
> for waiting on.
> 2. It yields.

Then please, for the love of god, put it into the DIP! I'm sorry
that I'm so picky about this, but a **specification** (which is
what your DIP is) should contain **every detail of your idea**,
not only the bits gemini deemed important. We're humans, and as
such we should be especially careful to give each other as much
information as possible.

> Whereas the other approaches including C++ is still after much 
> reading not in my mental model.

I'm somewhat starting to get a grasp of yours: while in your
model you try to just "throw" the awaited-on value back to anyone
interested in it and use a sum type to do it, other languages
define a stricter interface that needs to be followed: C++ with
awaiters and Rust with its `Future<>`s and `Waker`s. Both ways
prevent splits in the ecosystem, or one library coming out on top
while everything else just dies. That's what I tbh fear with the
current approach: there will be one way to use dependencies and
that's it. The problems it has will extend to all async code, and
an outside viewer will declare async in dlang broken without
anyone realising that it's just the library that's broken. Take
dlang's std.regex for example: it's very slow in comparison with
others and you could easily roll your own, but nobody does, so
everybody just assumes it's a "dlang" problem and moves on. While
this has only minimal impact bc it's just regex, with an entire
language feature that will be presented through the lens of the
most used or most "present" library (not popular! big
difference), this will make people say "Hey, dlang's async is so
bad bc of this and that". I want to prevent such a thing.

With a stricter protocol on how things are awaited (C++) or how a
coroutine can be "retried" / woken up (Rust) these problems go
away. Any executor can rely on the fact that any I/O / waiting
structure **will** follow the protocol, and as such they're
interchangeable, which is a **big** benefit for user and
application code, as no one needs to re-invent the whole wheel.

Another benefit is that it (somewhat) helps in ensuring that
the coroutine is actually in a good state without the executor
needing to know about that state itself.

To help understand the two models a bit better, let's take a look
at a "typical" flow of a coroutine:
- starts the coroutine
- initiates `read_all()` of a file
- `await`s the `read_all()` and pauses the coroutine
- gets re-called since the waited-on part is now resolved
- processes the data
In your proposal this works by setting a dependency on the
return type of `read_all()`. If the executor now simply ignores
the dependency, it re-calls the coroutine and the coroutine is in
a bad state, as it does not validate whether the dependency is
actually resolved (how would it?). As a result, you would need to
put it inside a loop:
```d
ReadDependency r = ...;
while (!r.isReady) {
   await r;
}
```
Which is boilerplate best avoided.

Secondly, the `read_all()` itself: it and the executor would need
to agree on an out-of-language protocol on how to actually handle
the dependency. Most likely a library would expose an interface
like `Awaitable` that any dependency would need to implement, but
with the downside that any dependent now has an explicit
dependency on said library. Sure, maybe over time a standard set
of interfaces would arise that the community would adopt, but
then we have just re-invented the API-dependency hell from Java.

In C++, `co_await` dictates that the coroutine stays suspended
for as long as the `Awaiter` protocol says so, since any user
**expects** that the `await`ed thing is actually resolved once it
has been `await`ed. It doesn't matter whether it succeeded or
not; the key point is that it's **not pending** anymore.
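
A rough sketch of that awaiter shape, written in plain D without
any coroutine support. All names here (`isAwaiter`, `awaitReady`,
`awaitSuspend`, `awaitResume`, `TimerAwaiter`) are made up for the
example and only mirror the three-function protocol of C++; the
`if`/`else` in `main` is a hand-written stand-in for what a
compiler might emit at an `await`.

```d
// Hypothetical check for the three-function awaiter shape.
enum isAwaiter(A) = is(typeof((ref A a) {
    void delegate() resume;
    bool ready = a.awaitReady();    // already resolved?
    a.awaitSuspend(resume);         // store how to continue
    auto value = a.awaitResume();   // fetch the result
}));

struct TimerAwaiter
{
    int ticksLeft;
    void delegate() resume;   // stored continuation

    bool awaitReady() { return ticksLeft == 0; }
    void awaitSuspend(void delegate() r) { resume = r; }
    int awaitResume() { return 123; }

    // Called by the event source; resumes only once resolved.
    void tick()
    {
        if (ticksLeft > 0 && --ticksLeft == 0 && resume !is null)
            resume();
    }
}

static assert(isAwaiter!TimerAwaiter);

void main()
{
    auto timer = TimerAwaiter(2);
    int result = -1;

    // Hand-written stand-in for `result = await timer;`
    if (timer.awaitReady())
        result = timer.awaitResume();
    else
        timer.awaitSuspend(() { result = timer.awaitResume(); });

    timer.tick();
    assert(result == -1);    // still pending, coroutine not resumed
    timer.tick();            // resolves, continuation runs
    assert(result == 123);
}
```

The continuation only ever runs from inside `tick()` once the
timer is resolved, so the "coroutine" cannot observe a pending
value and needs no `isReady` loop around the `await`.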

In Rust it's even simpler: polling is a concept that even kids
understand. When you want your parents to give you something, you
"poll" until they give it to you or tell you no in a way that
keeps you from continuing what you originally wanted to do. Same
thing in Rust: a coroutine is "polled" by the executor and can
either resolve with the data you expected, or tell you that it's
still waiting and to come back later. The compiler ensures that
only a ready state is ever allowed to continue the coroutine. If
you want to be more performant and not spin-lock in the executor
in the hope that someday the future will resolve, you can give
it a waker and say: "hey, if you say you are still not done, I
will do other things; if you think you're ready for me to try
again, just call this and I will come to you!".
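
The same `read_all()` flow as a poll/waker sketch, again in plain
D and hand-written instead of compiler-generated. `Poll`, `Waker`,
`ReadAll` and `complete` are hypothetical names for this example;
they only imitate the shape of Rust's `Future` and `Waker`, not
their actual API.

```d
// Result of one poll: either still pending or ready with a value.
struct Poll(T)
{
    bool ready;
    T value;

    static Poll pending() { return Poll(false, T.init); }
    static Poll done(T v) { return Poll(true, v); }
}

alias Waker = void delegate();

// A future that resolves once the I/O source calls `complete`.
class ReadAll
{
    private string data;
    private bool finished;
    private Waker waker;

    Poll!string poll(Waker w)
    {
        if (finished)
            return Poll!string.done(data);
        waker = w;                        // remember whom to notify
        return Poll!string.pending();
    }

    // Called by the I/O source when the read has finished.
    void complete(string d)
    {
        data = d;
        finished = true;
        if (waker !is null)
            waker();
    }
}

void main()
{
    auto read = new ReadAll;
    bool woken = false;
    Waker waker = () { woken = true; };

    // First poll: not ready, the executor is free to do other work.
    auto p = read.poll(waker);
    assert(!p.ready);

    // I/O finishes; the waker tells the executor to poll again.
    read.complete("file contents");
    assert(woken);

    // Second poll: resolved, the coroutine may continue with the data.
    p = read.poll(waker);
    assert(p.ready && p.value == "file contents");
}
```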

