Prime sieve language race

Thu Jul 15 16:51:02 UTC 2021

On Thursday, 15 July 2021 at 13:16:01 UTC, Sebastiaan Koppe wrote:
> On Thursday, 15 July 2021 at 12:35:29 UTC, Petar Kirov 
> [ZombineDev] wrote:
>> On Wednesday, 14 July 2021 at 19:10:55 UTC, Sebastiaan Koppe wrote:
>>> Because member functions are harder to call from multiple 
>>> threads than static functions are. For one, you will have to 
>>> get the object on two threads first. Most functions that do 
>>> that require a shared object, which requires a diligent 
>>> programmer to do the casting.
>>
>> The problem with `std.stdio : std{in,out,err}` is they ought 
>> to be defined (conceptually) as `shared Atomic!File`, where 
>> `File` is essentially a wrapper around `SharedPtr!FileState` 
>> (and `SharedPtr` does atomic ref-counting, if it's `shared`) 
>> and until then, they shouldn't be `@trusted`, unless the 
>> program is single-threaded.
>
> Yes that is the sensible thing to do. But I am not sure that is 
> the right thing. I am afraid that it will lead to the 
> conclusion that everything needs to be shared, because who is 
> going to stop someone from taking your struct/class/function, 
> moving it over to another thread and then complain it corrupts 
> memory while it was advertised as having a @safe interface?

Not quite. If an aggregate has no methods marked as `shared`, it 
means that in essence it's not designed to be shared across 
threads (i.e. it's not thread-safe). Just like `const` methods 
define the API of `const` object instances, `shared` methods 
define the API of `shared` objects. While it can be useful to 
overload methods based on the `this` type qualifier (e.g. I added 
`shared` overloads to the `lock`, `unlock` and `tryLock` methods 
of [`core.sync.mutex : Mutex`][0] (*)), it's not strictly 
necessary. It's perfectly possible to have a class which has one 
set of functions for single-threaded use and a completely separate set 
of thread-safe functions. As an example, a simple non-thread-safe 
queue class can have `front`, `push`, `pop` and `empty` methods, 
while a thread-safe variant will instead have `tryGetFront`, 
`tryPush`, `tryPop` (and no `empty`) methods.
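
A minimal sketch of such a queue, to make the split concrete. Only the method names come from the text above; the `Queue` class itself, its internal `Mutex`, the array storage and the `out` parameters are assumptions made for illustration. The thread-safe half has no `empty`, presumably because the answer could be stale by the time the caller acts on it.

```d
import core.sync.mutex : Mutex;

// Hypothetical queue: method names follow the post, the body is a guess.
final class Queue(T)
{
    private T[] items;
    private Mutex mtx;

    this() { mtx = new Mutex; }
    this() shared { mtx = new shared Mutex; }

    // Single-threaded API: callable only through an unshared reference.
    bool empty() const { return items.length == 0; }
    ref T front() { return items[0]; }
    void push(T value) { items ~= value; }
    void pop() { items = items[1 .. $]; }

    // Thread-safe API: callable only through a `shared` reference.
    // Each method takes the lock, then casts away `shared` internally.
    bool tryPush(T value) shared
    {
        mtx.lock();
        scope (exit) mtx.unlock();
        (cast(Queue) this).items ~= value;
        return true; // this sketch is unbounded, so pushing never fails
    }

    bool tryGetFront(out T value) shared
    {
        mtx.lock();
        scope (exit) mtx.unlock();
        auto self = cast(Queue) this;
        if (self.items.length == 0) return false;
        value = self.items[0];
        return true;
    }

    bool tryPop(out T value) shared
    {
        mtx.lock();
        scope (exit) mtx.unlock();
        auto self = cast(Queue) this;
        if (self.items.length == 0) return false;
        value = self.items[0];
        self.items = self.items[1 .. $];
        return true;
    }
}

unittest
{
    auto local = new Queue!int;          // single-threaded use
    local.push(1);
    assert(!local.empty && local.front == 1);
    local.pop();
    assert(local.empty);

    auto q = new shared(Queue!int);      // may be handed to other threads
    int v;
    assert(q.tryPush(42));
    assert(q.tryPop(v) && v == 42);
    assert(!q.tryPop(v));
}
```

The `unittest` block runs with `dmd -unittest -main`.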

> I am afraid that it will lead to the conclusion that everything 
> needs to be shared

This is the sort of thinking common in languages like C# and Java 
(at least, in my experience), where you don't know whether your 
class may be shared across threads, so you either find out 
eventually the hard way (via bug reports), or (e.g. if requested 
by code reviewers) you go in and preemptively add locks all over 
the code (usually not tested well, since your initial 
use-case didn't involve sharing the object across threads).

This is not the case in D. If your aggregate doesn't have 
`shared` methods it means that it must not be `shared`, plain and 
simple.
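
As a rough illustration of how the compiler backs this up, the sketch below uses a made-up `Counter` struct with no `shared` methods. Calling its methods through a `shared` instance is rejected at compile time, which is checked here with `__traits(compiles, ...)`.

```d
// Hypothetical type with no `shared` methods, i.e. not designed for sharing.
struct Counter
{
    private int value;
    void increment() { ++value; }
    int get() const { return value; }
}

unittest
{
    Counter local;
    local.increment();
    assert(local.get() == 1);

    shared Counter global;
    // Neither the mutable nor the `const` method accepts a `shared` `this`.
    static assert(!__traits(compiles, global.increment()));
    static assert(!__traits(compiles, global.get()));
}
```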

That's why `__gshared` should be avoided - it shares both 
thread-safe and non-thread-safe objects across threads. A 
`__gshared` `Mutex` will work just fine (as the underlying 
Posix/Win32 primitives are obviously designed to support it), but 
other types, like D's associative arrays, would certainly go 
kaboom if access to them is not *synchronized externally* (**).
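
A small sketch of what external synchronization could look like for a `__gshared` associative array. The names `hits`, `hitsLock` and `record` are invented for this example. The point is only that every access to the array goes through the same mutex.

```d
import core.sync.mutex : Mutex;
import core.thread : Thread;

__gshared int[string] hits;  // not thread-safe on its own
__gshared Mutex hitsLock;    // thread-safe, so sharing it directly is fine

shared static this() { hitsLock = new Mutex; }

// Hypothetical helper: the only place that touches `hits`.
void record(string key)
{
    hitsLock.lock();
    scope (exit) hitsLock.unlock();
    ++hits[key]; // a missing key is inserted as int.init before the increment
}

unittest
{
    Thread[] workers;
    foreach (i; 0 .. 4)
        workers ~= new Thread({ foreach (_; 0 .. 1000) record("sieve"); }).start();
    foreach (t; workers)
        t.join();
    assert(hits["sieve"] == 4000);
}
```

Nothing in the type system enforces this discipline: any code that reaches `hits` without taking `hitsLock` compiles just as well.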

In case of Phobos, `std.stdio : std{in,out,err}` should really be 
made thread-safe (you can find issues in bugzilla), as the whole 
idea of making them global mutable properties is to allow any 
thread to redirect them at any point in time. Whether that's a 
good idea is a separate topic, but it was certainly an intended 
use case.

(*) `core.sync.mutex : Mutex.{lock, unlock, tryLock}` really 
should have been `shared @safe nothrow @nogc` from the beginning, 
but hey, better late than never :)
I considered removing the non-`shared` overloads, but I decided 
against it, as that would have been a breaking change. That said, 
once we have enough high-quality APIs in Phobos to allow 
ergonomic use of `shared` (i.e. not requiring people to cast-away 
`shared` all over the place), we should consider deprecating them 
(the non-`shared` overloads of `lock`/`unlock`/`tryLock`).

(**) Another way to discuss `shared` is to think in terms of 
*internal* and *external synchronization*. If a method is 
`shared`, it follows that access to the underlying object is 
*internally synchronized*, i.e. you don't need an external mutex 
to guard it. And vice versa - if the methods are not `shared`, it 
means that you need to use external synchronization, and only 
then (assuming you have implemented it correctly), you can cast 
away `shared` and freely call the non-`shared` methods inside the 
scope of the lock. See Rust's [`Mutex`][1] and more specifically 
the [`MutexGuard`][2] types for a good example of this technique. 
Given a type like Rust's `MutexGuard`, casting-away `shared` 
should really not be done in user-code - the idea is that the 
`MutexGuard` will give you a safe `scope`-ed access to a 
head-un-`shared` type (given `shared(SomeType**)` it will give 
you `scope shared(SomeType*)*`).
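
A very rough D sketch of the idea, under several assumptions: `Locked` and `withLock` are hypothetical names (not a Phobos or druntime API), the guard is modelled as a `scope` delegate rather than a separate guard object, and the payload is assumed to have no indirections, so that "fully un-`shared`" and "head-un-`shared`" coincide. The cast lives in one place inside the wrapper instead of in user code.

```d
import core.sync.mutex : Mutex;

// Hypothetical wrapper, loosely modelled on Rust's Mutex<T>.
// Assumes T has no pointers or references inside.
final class Locked(T)
{
    private T payload;
    private Mutex mtx;

    this(T initial) shared
    {
        payload = initial;
        mtx = new shared Mutex;
    }

    // Holds the lock while `dg` runs and hands it an unshared view.
    // `scope` signals that the delegate must not be stored.
    void withLock(scope void delegate(ref T) dg) shared
    {
        mtx.lock();
        scope (exit) mtx.unlock();
        dg(*cast(T*) &payload); // the only cast, hidden from callers
    }
}

struct Stats { int count; long sum; }

unittest
{
    auto stats = new shared(Locked!Stats)(Stats.init);
    stats.withLock((ref Stats s) { s.count += 1; s.sum += 10; });
    stats.withLock((ref Stats s) { assert(s.count == 1 && s.sum == 10); });
}
```

Nothing here stops the delegate from copying the address of its `ref` argument somewhere else, so this is weaker than what Rust's borrow checker guarantees for `MutexGuard`.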

P.S. I use the term "method" when I mean non-static member 
function, and "aggregate" when I mean `struct`, `class`, or 
`interface` type.

[0]: https://dlang.org/phobos/core_sync_mutex.html
[1]: https://doc.rust-lang.org/std/sync/struct.Mutex.html
[2]: https://doc.rust-lang.org/std/sync/struct.MutexGuard.html