DIP 1024--Shared Atomics--Community Review Round 1

Ola Fosheim Grøstad ola.fosheim.grostad at gmail.com
Mon Oct 14 21:49:04 UTC 2019


On Monday, 14 October 2019 at 18:04:28 UTC, Manu wrote:
> And it's not at hominem, I really do think you're just making 
> stuff up
> (although I'm sure informed by ideas in competing ecosystems)...

If you go after the person and not the statement, then that is 
"ad hominem", isn't it?  Claiming that people are making stuff 
up does not compel me to expand on the topic, but I'll try to 
give you an idea of the landscape as I see it anyway.

I'm not sure what you mean by «competing ecosystems», but I am 
talking about theoretical and applied research. It is not a 
field I am deeply invested in, but I do have a passing interest 
in it. It makes a lot of sense to use such approaches if you are 
designing communication systems or critical control systems. It 
might be wasted on games.

Anyway, there are various ways of reasoning formally about 
concurrency, and it has been done for a long time. The 
approaches range from dedicated tools for programming in niches 
where they are beneficial, to formal methods and tools for 
analyzing distributed protocols. It is a big field, and I think 
the latter theme is easier to get into; you'll find plenty on it 
if you search on keywords like: formal methods, formally 
verified, distributed computing, distributed protocols...

> The closest thing you've said to a real implementation proposal 
> so far
> is strong-ownership based message passing like rust, but that's 
> not
> really in contest with this DIP, and I can't imagine how 
> `shared`
> would have any influence on that toolbox.

My argument was that D would not have it, so it would not be 
possible to make significant optimizations based on modelling 
concurrency. Therefore "shared" as a library type ought to be 
sufficient.  That was my line of argument.

I do not think D should model concurrency formally.

I believe such languages should be created from a clean slate. 
Some researchers do appear to work with annotated C, but it does 
not sound very compelling to just slap such machinery onto an 
existing language.


> That's all thread-local concepts. There's nothing reasonable 
> you can
> determine between threads without synchronising the threads.

You might need synchronization, but you can potentially have less 
of it if you have a proof that thread 1 is always in state B when 
thread 2 is in state A, or that thread 1 is never in state B if 
thread 2 is not in state D, etc.
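To make that concrete, here is a toy sketch of my own (all names 
and state labels invented, not taken from any real tool) that 
explores the product state space of two small thread models and 
checks a cross-thread invariant of that kind:

```python
# Toy sketch: exhaustively explore the product state space of two
# thread models and check the invariant "thread 1 in B implies
# thread 2 in D". The machines and labels are made up for illustration.

# Each thread is a tiny state machine: state -> set of successor states.
THREAD1 = {"A": {"B"}, "B": {"C"}, "C": {"A"}}
THREAD2 = {"D": {"E"}, "E": {"D"}}

def reachable(start1, start2):
    """Interleave the two machines freely; collect reachable state pairs."""
    seen, stack = set(), [(start1, start2)]
    while stack:
        s1, s2 = stack.pop()
        if (s1, s2) in seen:
            continue
        seen.add((s1, s2))
        for n1 in THREAD1[s1]:          # thread 1 takes a step
            stack.append((n1, s2))
        for n2 in THREAD2[s2]:          # thread 2 takes a step
            stack.append((s1, n2))
    return seen

states = reachable("A", "D")
# Invariant to check: thread 1 in "B" implies thread 2 in "D".
violations = [(s1, s2) for (s1, s2) in states if s1 == "B" and s2 != "D"]
print(violations)
```

With free interleaving the checker finds a violating pair; the 
point of a real tool is that synchronization constraints prune 
the product space until such pairs are unreachable, and then the 
optimizer may rely on the invariant.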

It is not uncommon to model processes as state machines and 
analyze them formally. When you have that model, you can 
implement it at a lower level if you want, and make a 
correspondence proof that ties each chunk of low-level code to 
the high-level model. Then there are languages where everything 
is handled for you in a "regular" programming language, but they 
will have a particular focus, with a particular set of 
trade-offs.


> The only reasonable CS model of threading I've ever seen 
> requires you
> to start talking about strong ownership and transfering 
> ownership
> between threads.

There are many ways to model and reason about concurrency. One 
approach was actually inspired by the modelling of chemical 
reactions using Petri nets.

One "impractical", but interesting, low-level 
instruction-by-instruction approach uses Hoare logic. While 
possible, you might be limited to very few instructions. Anyway, 
with this approach you might in theory be able to avoid explicit 
synchronization in some cases, as you can then prove whether or 
not all possible interleavings of instructions during execution 
comply with the specification! Still, the combinatorial 
explosion makes this very limiting and hard to reason about. 
Hoare logic is a very low-level approach that was researched in 
the 70s and 80s, so as you can see there is a history of various 
approaches.  Maybe it is applicable to a tight low-level scope 
like a simple high-performance hardware driver.
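A toy illustration of what "checking all interleavings" means (my 
own sketch, not any published method): take two unsynchronized 
threads that each do a read/add/write on a shared counter, 
enumerate every interleaving, and test the specification 
"counter == 2 at the end":

```python
# Enumerate all interleavings of two 3-instruction threads
# (LOAD, ADD, STORE on a shared counter) and collect the possible
# final values. The "instruction set" here is invented for illustration.
from itertools import combinations

def run(order):
    """Execute one interleaving; order[i] says which thread moves at step i."""
    counter = 0
    regs = {0: 0, 1: 0}         # per-thread register
    pc = {0: 0, 1: 0}           # per-thread program counter
    for t in order:
        step = pc[t]
        if step == 0:           # LOAD  reg <- counter
            regs[t] = counter
        elif step == 1:         # ADD   reg <- reg + 1
            regs[t] += 1
        elif step == 2:         # STORE counter <- reg
            counter = regs[t]
        pc[t] += 1
    return counter

# Choose which 3 of the 6 global steps belong to thread 0.
results = set()
for slots in combinations(range(6), 3):
    order = [0 if i in slots else 1 for i in range(6)]
    results.add(run(order))
print(sorted(results))   # some interleavings lose an update
```

Already at 3 instructions per thread there are 20 interleavings, 
and some of them violate the spec (a lost update); this is the 
combinatorial explosion mentioned above, and why the approach only 
fits very small scopes.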

Many problems are prohibitively expensive only if you always 
demand the optimal solution, so researchers sometimes find ways 
to solve problems that appear impossible due to computational 
complexity, with the caveat that you sometimes get the result "I 
don't know".  In optimization, "I don't know" is good enough, 
because then you just fall back on the conservative 
implementation. A solver that is insufficient for unguided 
verification might be good enough for optimization. If 
"impossible" problems can be dealt with "sometimes", then you 
will get optimizations "sometimes".
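The pattern I have in mind looks roughly like this (a sketch with 
invented names, not a real solver): the checker answers PROVEN, 
REFUTED, or UNKNOWN, and only PROVEN licenses the optimization:

```python
# Sketch of the tri-state solver pattern: UNKNOWN is harmless because
# the compiler just keeps the conservative code it would emit anyway.
from enum import Enum

class Verdict(Enum):
    PROVEN = 1
    REFUTED = 2
    UNKNOWN = 3

def bounded_check(property_holds, cases, budget):
    """Try up to `budget` cases; give up with UNKNOWN on larger spaces."""
    if len(cases) > budget:
        return Verdict.UNKNOWN
    ok = all(property_holds(c) for c in cases)
    return Verdict.PROVEN if ok else Verdict.REFUTED

def compile_access(verdict):
    # Only a proof licenses dropping synchronization.
    return "unsynchronised load" if verdict is Verdict.PROVEN else "atomic load"

small = bounded_check(lambda n: n % 2 == 0, [0, 2, 4], budget=10)
large = bounded_check(lambda n: n % 2 == 0, list(range(100)), budget=10)
print(compile_access(small), "/", compile_access(large))
```

The small problem gets proven and optimized; the large one 
exhausts the budget and silently stays on the conservative path. 
That is why a weak solver is still useful for optimization even 
though it would be useless for verification.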

Anyway, applied solutions seem to focus more on high level 
representations and high level mechanisms, but I would not be 
surprised if you find applied solutions working on low level 
representations using low level mechanisms.


> While I agree that strong ownership and message passing is 
> robust and
> worth having in any async ecosystem, it's also tremendously
> inefficient for certain workloads, and if that were all we had 
> baked
> into the language, it excludes other (useful) possibilities.

Ok, but I didn't argue for it being built into the language...

I am arguing that if it is built into the language then there 
should be a semantic reason for doing so. And static analysis of 
locking patterns might be one such reason.

But in general I think D would be better off by providing generic 
mechanisms rather than adding more and more limited features.

So maybe not add borrowing, but perhaps add some sort of limited 
linear/affine typing mechanism that can be used for implementing 
libraries that provide it to some extent. Then people can write 
libraries that statically check that files that are opened are 
also closed, or that you follow specific patterns when setting 
up hardware (register protocols or something like that).
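To show the discipline I mean (as a runtime stand-in; a real 
linear/affine type system would reject these misuses at compile 
time, and the class name here is invented):

```python
# Runtime sketch of a "linear" file handle: it must be consumed
# (closed) exactly once. A linear type system would make both
# misuses below compile errors instead of runtime events.
class LinearFile:
    def __init__(self, name):
        self.name = name
        self.consumed = False

    def close(self):
        if self.consumed:
            raise RuntimeError(f"{self.name}: handle used twice")
        self.consumed = True

    def __del__(self):
        # Statically, "dropped without being consumed" would be an error.
        if not self.consumed:
            print(f"leak: {self.name} was never closed")

errors = []
f = LinearFile("config.txt")
f.close()                      # ok: the handle is consumed exactly once
try:
    f.close()                  # second use of a "linear" value
except RuntimeError as e:
    errors.append(str(e))
print(errors)
```

A library author with access to such typing machinery could 
encode "opened implies closed" or a hardware register protocol as 
states of a linear value, and users would get the check for free.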

If meta-programming and library-authoring is D2's main selling 
point then that ought to be front and center.

>> Oh. People do it. Sure, it is hard in the most general case, 
>> but you do have languages with high level concurrency 
>> constructs that provide type systems that makes it possible to 
>> write provably correct concurrent programs.
>
> That's not so much language at that point, it becomes blessed 
> library,
> or a 'framework' even.

No, you cannot run a solver from a library. Well, I guess you 
could, but that is not how it is usually done.

> The way you initially described the premise was something along 
> the lines of "the language knows the *state* of interacting 
> threads and does magic thread-safe stuff", which I'm skeptical 
> exists anywhere without a very elaborate framework to manage 
> any such orchestration.

Not sure what you mean by elaborate framework. You need a 
language for expressing it. Then you need a solver for solving 
the constraints (analyzing/checking the model).


> I think now that you might be talking about ownership and 
> message passing...

It could be, but I am talking about formal verification of 
concurrency models, or languages that are designed in such a way 
that they translate into models that can be verified.

What you are talking about might be the other direction where you 
have languages that are designed in such a way that some of the 
issues provably cannot arise (e.g. Pony, maybe Go).

But that is not what I am referring to. I am referring to 
inference using a solver.


> Incidentally, my code that I linked to back up this thread is 
> effectively an implementation of a rust-like shared object 
> where it maintains a 'borrow' ref count, and guards access 
> safely... it's easy to write those tools in D.

Alright, but if you had access to more advanced typing machinery 
in D then you could go further down that road yourself. Maybe not 
to the extent that it could keep up with Rust, but closer.


> if a _language_ did that, it almost certainly
> comes at a high cost, and it's almost certainly not what I want 
> it to
> do... but I'm willing to be surprised.

I don't think you want to do it. You'd have to change how you 
approach it, and it is not really needed for more mundane tasks.


>> That's right. So those benefits are not avilable... which was 
>> my point. :-)
>
> I can't predict what you're talking about, and you haven't 
> shown anything substantial. I suspect if you show what you 
> mean, I may find the term 'benefits' to be problematic ;)

What do you mean by substantial?

The benefit is that you prove that the code follows the model, 
which in turn can be fed into the optimizer and assumed to 
always hold.

It is not realistic that D will ever go there.

It is more realistic that D would expand the type system or 
meta-programming mechanisms in ways that are useful to library 
authors and that they could create concurrent frameworks.


