shared - i need it to be useful

Sun Oct 21 05:47:14 UTC 2018

On Sat, Oct 20, 2018 at 10:10 AM Stanislav Blinov via Digitalmars-d
<digitalmars-d at puremagic.com> wrote:
>
> On Saturday, 20 October 2018 at 16:48:05 UTC, Nicholas Wilson
> wrote:
> > On Saturday, 20 October 2018 at 09:04:17 UTC, Walter Bright
> > wrote:
> >> On 10/19/2018 11:18 PM, Manu wrote:
> >>> The reason I ask is because, by my definition, if you have:
> >>> int* a;
> >>> shared(int)* b = a;
> >>>
> >>> While you have 2 numbers that address the same data, it is
> >>> not actually aliased because only `a` can access it.
> >>
> >> They are aliased,
> >
> > Quoting Wikipedia:
> >
> >>two pointers A and B which have the same value, then the name
> >>A[0] aliases the name B[0]. In this case we say the pointers A
> >>and B alias each other. Note that the concept of pointer
> >>aliasing is not very well-defined – two pointers A and B may or
> >>may not alias each other, depending on what operations are
> >>performed in the function using A and B.
> >
> > In this case given the above: `a[0]` does not alias `b[0]`
> > because `b[0]` is ill defined under Manu's proposal, because
> > the memory referenced by `a` is not reachable through `b`
> > because you can't read or write through `b`.
> >
> >> by code that believes it is unshared
> >
> > you cannot `@safe`ly modify the memory through `b`, `a`'s view
> > of the memory is unchanged in @safe code.
>
> And that's already a bug, because the language can't enforce
> threadsafe access through `a`, regardless of presence of `b`.
> Only the programmer can.
>
> >> and, code that believes it is shared.
> >
> > you cannot have non-atomic access through `b`, `b` has no @safe
> > view of the memory, unless it is atomic (which by definition is
> > synchronised).
>
> Synchronized with what? You still have `a`, which isn't `shared`
> and doesn't require any atomic access or synchronization. At this
> point it doesn't matter if it's an int or a struct. As soon as
> you share `a`, you can't just pretend that reading or writing `a`
> is safe.

`b` can't read or write `a`... accessing `a` is absolutely safe.
Someone must do something unsafe to undermine your threadsafety... and
if you write unsafe code and don't know what you're doing, there's
nothing that can help you.
Today, every interaction with shared is unsafe. Creating a safe
interaction with shared will lead to people not doing unsafe things at
every step.
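
For reference, here is roughly what that looks like in today's D. This
is only a sketch: the explicit cast stands in for the implicit
conversion being proposed, `value` is a made-up variable, and it uses
`core.atomic` as the one way to touch the data through `b`.

```d
import core.atomic : atomicLoad, atomicOp;

unittest
{
    int value = 41;                 // hypothetical data, for illustration
    int* a = &value;                // thread-local view

    // Today this needs an explicit cast; the proposal would make it implicit.
    shared(int)* b = cast(shared(int)*) a;

    // `*b += 1;` is rejected by the compiler (read-modify-write on shared),
    // so access through `b` has to go via an atomic operation.
    atomicOp!"+="(*b, 1);

    assert(atomicLoad(*b) == 42);
    assert(*a == 42);               // same memory, seen through `a`
}
```

Build with `-unittest -main` to run it. Whether the atomic calls
themselves should be @safe is part of what is being debated here; the
sketch takes no position on that.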

> Encapsulate it all you want, safety only remains a
> contract of convention, the language can't enforce it.

You're talking about @trusted code again. You're fixated on unsafe
interactions... my proposal is about SAFE interactions. I'm trying to
obliterate unsafe interactions with shared.

> module expertcode;
>
> @safe:
>
> struct FileHandle {
>      @safe:
>
>      void[] read(void[] storage) shared;
>      void[] write(const(void)[] buffer) shared;
> }
>
> FileHandle openFile(string path);
> // only the owner can close
> void closeFile(ref FileHandle);
>
> // i.e. generate a number of jobs in some queue
> void shareWithThreads(shared FileHandle*);
> // waits until all processing is done
> void waitForThreads();
>
> module usercode;
>
> import expertcode;
>
> void processHugeFile(string path) {
>      FileHandle file = openFile(path);
>      shareWithThreads(&file);    // implicit cast
>      waitForThreads();
>      file.closeFile();
> }

This is a very strange program... I'm dubious it is in fact
"expertcode"... but let's look into it.

File handle seems to have just 2 methods... and they are both threadsafe.
Open and Close are free-functions. Close does not promise threadsafety
itself (but of course, it doesn't violate read/write's promise, or the
program is invalid).

I expect the only possible way to achieve this is by an internal mutex
to make sure read/write/close calls are serialised. read and write
will appropriately check their file-open state each time they perform
their actions. What read/write do in the case of being called on a
closed file... anyone's guess? I'm gonna say they do no-op... they
return a null pointer to indicate the error state.
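
A minimal sketch of the implementation I'm assuming, to make the
discussion concrete. Everything here is hypothetical: the fields, the
in-memory stand-in for file contents, and the body of `openFile` are
my guesses, not the quoted author's code. `write` would follow the
same lock-then-check pattern as `read`.

```d
import core.sync.mutex : Mutex;

// Hypothetical stand-in for the quoted FileHandle; the fields and the
// in-memory "file" are assumptions made purely for illustration.
struct FileHandle
{
    private shared Mutex mtx;
    private bool isOpen;
    private ubyte[] contents;
    private size_t pos;

    // Threadsafe read: returns the filled part of `storage`,
    // or null if the file has already been closed.
    void[] read(void[] storage) shared @trusted
    {
        // Casting away shared is sound only because every field access
        // below happens while the mutex is held.
        auto self = cast(FileHandle*) &this;
        self.mtx.lock();
        scope (exit) self.mtx.unlock();

        if (!self.isOpen)
            return null;

        immutable rest = self.contents.length - self.pos;
        immutable n = storage.length < rest ? storage.length : rest;
        (cast(ubyte[]) storage)[0 .. n] = self.contents[self.pos .. self.pos + n];
        self.pos += n;
        return storage[0 .. n];
    }
}

FileHandle openFile(string path)
{
    FileHandle f;
    f.mtx = new shared Mutex();
    f.isOpen = true;
    f.contents = [1, 2, 3, 4];      // pretend this came from `path`
    return f;
}

// Owner-only, but it takes the same mutex so it is serialised with readers.
void closeFile(ref FileHandle f)
{
    f.mtx.lock();
    scope (exit) f.mtx.unlock();
    f.isOpen = false;
}

unittest
{
    auto file = openFile("example.dat");
    auto sharedView = cast(shared(FileHandle)*) &file; // explicit cast today
    ubyte[2] buf;
    assert(sharedView.read(buf[]).length == 2);
    file.closeFile();
    assert(sharedView.read(buf[]) is null);            // closed: error state
}
```

The point of the sketch is only that a read after close comes back as
an error value rather than touching anything invalid.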

Looking at the meat of the program; you open a file, and distribute it
to do accesses (I presume?)....
Naturally, this is a really weird thing to do, because even if the API
is threadsafe such that it doesn't crash and reads/writes are
serialised, the sequencing of reads/writes will be random, so I don't
believe any sane person (let alone an expert) would write this
program... but moving on.
Then you wait for them to finish, and close the file.

Fine. You have a file with randomly interleaved data... for whatever reason.

> Per your proposal, everything in 'expertcode' can be written
> @safe, i.e. not violating any of the points that @safe forbids,
> or doing so only in a @trusted manner. As far as the language is
> concerned, this would mean that processHugeFile can be @safe as
> well.

This program does appear to be safe (assuming that the implementations
aren't invalid), but a very strange program nonetheless.

> Remove the call to `waitForThreads()` (assume user just forgot
> that, i.e. the "accident"). Nothing would change for the
> compiler: all calls remain @safe.

Yup.

> And yet, if we're lucky, we get
> a consistent instacrash. If we're unlucky, we get memory
> corruption, or an unsolicited write to another currently open
> file, either of which can go unnoticed for some time.

Woah! Now this is way off-piste...
Why would you get a crash? Why would you get memory corruption?
Neither of those things makes sense.

So, you call closeFile immediately and read/write start returning null.
I'm going to assume that `shareWithThreads()` was implemented by an
'expert' who checked the function results for errors. It detected that
the reads/writes failed, emitted an error "failed to read file", and
then returned promptly.
The only uncertainty in this program is how `shareWithThreads()`
handles read/write returning an error.

> Of course the program becomes invalid if you do that, there's no
> question about it, this goes for all buggy code.

In this case, I wouldn't say the program becomes 'invalid'; it is
valid for filesystem functions to return error states and you should
handle them.
In this case, read/write must return some "file not open" state, and
it should be handled properly.
This problem has nothing to do with threadsafety. It's a logic issue
related to threading, but that's got nothing to do with this.

> The problem is,
> definition of "valid" lies beyond the type system: it's an
> agreement between different parts of code, i.e. between expert
> programmers who wrote FileHandle et al., and users who write
> processHugeFile(). The main issue is that certain *runtime*
> conditions can still violate @safe-ty.

Perhaps you don't understand what @safe-ty means? It's a compiler
assertion that the code is memory-safe. It's not a magic attribute
that tells you that your program is right.
Runtime conditions being in a valid state is a high-level problem for
the program; it doesn't interact with threadsafety in any
fundamental way, and not in any way that @safe has anything to do
with.
You're just describing normal high-level multi-threading logic
problems. `shared` does not and can not help you with that; you need
to look to libraries that offer threading support frameworks for that.
It can help you not write code that does invalid access to memory and
crash. That's the extent of its charter.

If a `shared` API is designed well, it can also offer strong implicit
advice about how to correctly interact with APIs. The compiler will
coerce you to do the right things with error messages.

> Your proposal makes the language more strict wrt. writing
> @safe 'expertmodule', thanks to disallowing reads and writes
> through `shared`, which is great.
> However the implicit conversion to `shared` doesn't in any way
> improve the situation as far as user code is concerned, unless
> I'm still missing something.

It does: it eliminates unsafe user interactions. It must be that way
to be safe.
There were no casts above, it's great! And your program is safe!
(although it's wrong)

FWIW, I doubt anybody in their right mind would attempt to write a
threadsafe filesystem API this way. Any such API would be structured
COMPLETELY differently; it would likely have one `shared` method that
would accept requests for deferred fulfillment, and handle unique
objects associated with each request.
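
A rough sketch of the shape I mean, under heavy assumptions:
`ReadRequest`, `RequestQueue`, `submit`, `drain` and `makeQueue` are
all names invented here, and a plain ticket number stands in for the
unique per-request object. It is not a real filesystem API, just the
structure.

```d
import core.sync.mutex : Mutex;

// Hypothetical request description; a real API would carry more.
struct ReadRequest
{
    size_t offset;
    size_t length;
}

struct RequestQueue
{
    private shared Mutex mtx;
    private ReadRequest[] pending;
    private size_t nextTicket;

    // The single threadsafe entry point: records the request for later
    // fulfillment and hands back a ticket unique to that request.
    size_t submit(ReadRequest req) shared @trusted
    {
        // Sound only because all field access happens under the mutex.
        auto self = cast(RequestQueue*) &this;
        self.mtx.lock();
        scope (exit) self.mtx.unlock();

        self.pending ~= req;
        return self.nextTicket++;
    }

    // Owner-only (not shared): takes the pending requests in submission
    // order, so the owner decides how the file is actually accessed.
    ReadRequest[] drain()
    {
        mtx.lock();
        scope (exit) mtx.unlock();

        auto batch = pending;
        pending = null;
        return batch;
    }
}

RequestQueue makeQueue()
{
    RequestQueue q;
    q.mtx = new shared Mutex();
    return q;
}

unittest
{
    auto q = makeQueue();
    auto sq = cast(shared(RequestQueue)*) &q;   // explicit cast today
    assert(sq.submit(ReadRequest(0, 16)) == 0);
    assert(sq.submit(ReadRequest(16, 16)) == 1);

    auto batch = q.drain();
    assert(batch.length == 2 && batch[1].offset == 16);
    assert(q.drain().length == 0);
}
```

With this structure the worker threads never sequence reads/writes
themselves, so the random interleaving problem from the program above
doesn't come up in the first place.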