shared - i need it to be useful

Sun Oct 21 12:45:43 UTC 2018

On Sunday, 21 October 2018 at 05:47:14 UTC, Manu wrote:
> On Sat, Oct 20, 2018 at 10:10 AM Stanislav Blinov via 
> Digitalmars-d <digitalmars-d at puremagic.com> wrote:

>> Synchronized with what? You still have `a`, which isn't 
>> `shared` and doesn't require any atomic access or 
>> synchronization. At this point it doesn't matter if it's an 
>> int or a struct. As soon as you share `a`, you can't just 
>> pretend that reading or writing `a` is safe.

> `b` can't read or write `a`... accessing `a` is absolutely safe.

It's not, with or without your proposal. The purpose of sharing 
`a` into `b` is to allow someone to access `*a` in a threadsafe 
way (but un-@safe, as it *will* require casting away `shared` 
from `b`). That is what makes keeping an unshared reference 
`a` un-@safe: whoever accesses `*a` in their @trusted 
implementations via `*b` can't know that `*a` is being 
(@safe-ly!) accessed in a non-threadsafe way at the same time.
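
To make the aliasing concrete, here is a minimal single-threaded sketch. The names `bumpShared` and `bumpOwned` are made up for illustration, and the cast is written out explicitly as it has to be today. Both views point at the same `int`, and nothing in the types stops the owner from doing a plain read-modify-write while someone else is doing atomic ones:

```d
import core.atomic : atomicLoad, atomicOp;

// Hypothetical: what a threadsafe implementation does behind the shared view.
int bumpShared(shared(int)* b)
{
    return atomicOp!"+="(*b, 1);
}

// Hypothetical: what the owner may still do, with no casts, in @safe code.
void bumpOwned(int* a) @safe
{
    *a += 1; // plain, non-atomic read-modify-write
}

void main()
{
    int* a = new int;
    // Explicit cast today; this conversion is what would become implicit.
    shared(int)* b = cast(shared(int)*) a;

    bumpOwned(a);
    bumpShared(b);

    // Both views alias the same memory.
    assert(cast(void*) a is cast(void*) b);
    assert(atomicLoad(*b) == 2);
}
```

Run on one thread this is harmless. Move the `bumpShared` calls to another thread and the `*a += 1` in `bumpOwned` becomes a data race, even though `bumpOwned` itself is @safe.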

> Someone must do something unsafe to undermine your 
> threadsafety... and
> if you write unsafe code and don't know what you're doing, 
> there's
> nothing that can help you.

Ergo, it follows that anyone who makes an implicit cast from 
mutable to shared had better know what they're doing, which mere 
mortal users (not "experts") might not. I.e. it's a way of handing 
a loaded gun to someone who has never held a weapon before.

> Today, every interaction with shared is unsafe.

Nod.

> Creating a safe interaction with shared will lead to people not 
> doing unsafe things at every step.

Triple nod.

>> Encapsulate it all you want, safety only remains a
>> contract of convention, the language can't enforce it.
>
> You're talking about @trusted code again. You're fixated on 
> unsafe interactions... my proposal is about SAFE interactions. 
> I'm trying to obliterate unsafe interactions with shared.

I know... Manu, I *know* what you're trying to do. We (me, Atila, 
Timon, Walter...) are not opposing your goals, we're pointing out 
the weakest spot of your proposal, which, it would seem, would 
require more changes to the language than just disallowing 
reading/writing `shared` members.

>> module expertcode;
>>
>> @safe:
>>
>> struct FileHandle {
>>      @safe:
>>
>>      void[] read(void[] storage) shared;
>>      void[] write(const(void)[] buffer) shared;
>> }
>>
>> FileHandle openFile(string path);
>> // only the owner can close
>> void closeFile(ref FileHandle);
>>
>> // i.e. generate a number of jobs in some queue
>> void shareWithThreads(shared FileHandle*);
>> // waits until all processing is done
>> void waitForThreads();
>>
>> module usercode;
>>
>> import expertcode;
>>
>> void processHugeFile(string path) {
>>      FileHandle file = openFile(path);
>>      shareWithThreads(&file);    // implicit cast
>>      waitForThreads();
>>      file.closeFile();
>> }
>
> This is a very strange program...

Why? That's literally the purpose of being able to `share`: you 
create/acquire a resource, share it, but keep a non-`shared` 
reference for yourself. If that's not required, you'd just create 
the data `shared` to begin with.

> I'm dubious it is in fact "expertcode"... but let's look into 
> it.

You're fixating on it being a file now. I gave an abstract example, 
you dismissed it as contrived; I give a concrete one, you want to 
dismiss it as "strange".

Heh, replace 'FileHandle' with 'BackBuffer', 'openFile' with 
'acquireBackBuffer', 'shareWithThreads' with 
'generateDrawCommands', 'waitForThreads' with 
'gatherCommandsAndDraw', 'closeFile' with 'postProcessAndPresent' 
;)

> File handle seems to have just 2 methods... and they are both 
> threadsafe. Open and Close are free-functions.

It doesn't matter if they're free functions or not. What matters 
is the signature: they take a non-`shared` (i.e. 'owned') 
reference. Methods are free functions in disguise.

> Close does not promise threadsafety itself (but of course, it 
> doesn't violate read/write's promise, or the program is 
> invalid).

Yep, and that's the issue. It SHALL NOT violate threadsafety, but 
it can't promise such in any way :(

> I expect the only possible way to achieve this is by an 
> internal mutex to make sure read/write/close calls are 
> serialised.

With that particular interface, yes.

> read and write will appropriately check their file-open state 
> each time they perform their actions.

Why? The only purpose of giving someone a `shared` reference is 
to give a reference to an open file. `shared` references can't do 
anything with the file but read and write, and whoever holds one 
would expect to be able to do so.
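
For reference, a minimal sketch of the serialised design being described: an internal mutex plus an open-state check on every call. Everything here is a stand-in I made up for illustration (the `mtx`, `isOpen` and `contents` fields, the in-memory "file", the simplified `write` signature); it is not a real filesystem API.

```d
import core.sync.mutex : Mutex;

struct FileHandle
{
    private Mutex mtx;         // hypothetical internal lock
    private bool isOpen;
    private ubyte[] contents;  // in-memory stand-in for the real file

    // Returns the bytes accepted, or null once the handle has been closed.
    const(void)[] write(const(void)[] buffer) shared @trusted
    {
        // Cast away shared; every access below is guarded by the lock.
        auto self = cast(FileHandle*) &this;
        self.mtx.lock();
        scope (exit) self.mtx.unlock();
        if (!self.isOpen)      // the per-call check in question
            return null;
        self.contents ~= cast(const(ubyte)[]) buffer;
        return buffer;
    }
}

FileHandle openFile(string path)
{
    // `path` is ignored by this stand-in.
    return FileHandle(new Mutex, true, null);
}

// Only the owner (non-shared reference) can close.
void closeFile(ref FileHandle f)
{
    f.mtx.lock();
    scope (exit) f.mtx.unlock();
    f.isOpen = false;
}

void main()
{
    FileHandle file = openFile("hypothetical.dat");
    shared(FileHandle)* view = cast(shared(FileHandle)*) &file;

    assert(view.write("abc") !is null);
    file.closeFile();
    assert(view.write("def") is null); // closed: write is refused
}
```

This only holds together because `closeFile` takes the same lock as `write`. Nothing in the signatures forces the author of `closeFile` to do that.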

> What read/write do in the case of being called on a closed 
> file... anyones guess? I'm gonna say they do no-op... they 
> return a null pointer to indicate the error state.
>
> Looking at the meat of the program; you open a file, and 
> distribute it to do accesses (I presume?)....

> Naturally, this is a really weird thing to do, because even if 
> the API is threadsafe such that it doesn't crash and 
> reads/writes are
> serialised, the sequencing of reads/writes will be random, so I 
> don't believe any sane person (let alone an expert) would write 
> this
> program... but moving on.

Um, that's literally what std.stdio does, for writes at least, 
except it doesn't advertise `File` as `shared`. That's how we get 
interleaved, but not corrupted, output even when writing from 
multiple threads. Now, that's not *universally* useful, but 
nonetheless that's a valid use case.

> Then you wait for them to finish, and close the file.
> Fine. You have a file with randomly interleaved data... for 
> whatever reason.

Or I have command lists, or images loaded in background...

> This program does appear to be safe (assuming that the 
> implementations aren't invalid), but a very strange program 
> nonetheless.
>
>> Remove the call to `waitForThreads()` (assume user just forgot
>> that, i.e. the "accident"). Nothing would change for the
>> compiler: all calls remain @safe.
>
> Yup.
>
>> And yet, if we're lucky, we get
>> a consistent instacrash. If we're unlucky, we get memory
>> corruption, or an unsolicited write to another currently open
>> file, either of which can go unnoticed for some time.

> Woah! Now this is way off-piste..
> Why would get a crash? Why would get memory corruption? None of 
> those things make sense.

Because the whole reason to have `shared` is to avoid the 
extraneous checks that you mentioned above, and only write actual 
useful code (i.e. lock-write-unlock, or read-put-to-queue-repeat, 
or whatever), not busy-work (testing if the file is open on every 
call). If you have a `shared` reference, it better be to existing 
data. If it isn't, the program is invalid already: you've shared 
something that doesn't "exist" (good for marketing, not so good 
for multithreading). That's why having `shared` and un-`shared` 
references to the same data simultaneously is not safe: you can't 
guarantee in any way that the owning thread doesn't invalidate 
the data through its non-`shared` reference while you're doing 
your threadsafe `shared` work; you can only "promise" that by 
convention (documentation).

> So, you call closeFile immediately and read/write start 
> returning null.

And I have partially-read or partially-written data. Or maybe I 
call closeFile(), the main thread continues and opens another 
file, which gets the same file descriptor, and the `shared` 
references to FileHandle that the user forgot to wait on continue 
to work, oblivious to the fact that it's a different file now. 
It's a horrible, but still @safe, implementation of FileHandle, 
yes, but the caller (user) doesn't know that, and can't know that 
just from the interface. The only advice against that is "don't do 
that", but that's irrespective of your proposal.
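
The descriptor-reuse scenario can be simulated deterministically on one thread. In this sketch the "OS" is a made-up table (`names`, `data`, `osOpen`, `osClose`, `osWrite`, `contentsOf` are all hypothetical helpers) that hands out the lowest free slot, which is how POSIX descriptors are allocated. The FileHandle here is deliberately the horrible kind: no lock, no open check, and `closeFile` leaves the stale number in place.

```d
// Hypothetical stand-in for the OS descriptor table.
__gshared string[] names;   // names[fd] is the path, null if the slot is free
__gshared ubyte[][] data;   // data[fd] is what has been written so far

int osOpen(string path)
{
    foreach (i, n; names)
        if (n is null)      // reuse the lowest free slot
        {
            names[i] = path;
            data[i] = null;
            return cast(int) i;
        }
    names ~= path;
    data.length = names.length;
    return cast(int)(names.length - 1);
}

void osClose(int fd) { names[fd] = null; data[fd] = null; }

void osWrite(int fd, const(void)[] buf) { data[fd] ~= cast(const(ubyte)[]) buf; }

const(ubyte)[] contentsOf(string path)
{
    foreach (i, n; names)
        if (n == path)
            return data[i];
    return null;
}

struct FileHandle
{
    private int fd = -1;

    // No lock and no open check: @safe from the caller's point of view.
    void write(const(void)[] buffer) shared @trusted
    {
        osWrite((cast(FileHandle*) &this).fd, buffer);
    }
}

FileHandle openFile(string path) { return FileHandle(osOpen(path)); }

// Closes the descriptor but leaves the stale number in the handle.
void closeFile(ref FileHandle f) { osClose(f.fd); }

void main()
{
    FileHandle file = openFile("a.log");
    shared(FileHandle)* stale = cast(shared(FileHandle)*) &file;

    file.closeFile();                      // the wait was forgotten
    FileHandle other = openFile("b.log");  // gets the freed descriptor

    stale.write("oops");                   // meant for a.log
    assert(cast(const(char)[]) contentsOf("b.log") == "oops");
    other.closeFile();
}
```

No cast appears in what the user would write under the proposal (the explicit cast in `main` stands in for the implicit conversion), and the write still lands in the wrong file.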

> I'm going to assume that `shareWithThreads()` was implemented  
> by an
> 'expert' who checked the function results for errors. It was 
> detected that the reads/write failed, and an error "failed to 
> read file" was emit, then the function returned promptly.
> The uncertainty of what happens in this program is however
> `shareWithThreads()` handles read/write emitting an error.

But you can only find out about these errors in `waitForThreads`, 
the very call that the user "forgot" to make!

>> Of course the program becomes invalid if you do that, there's
>> no question about it, this goes for all buggy code.
>
> In this case, I wouldn't say the program becomes 'invalid'; it 
> is
> valid for filesystem functions to return error states and you 
> should handle them.
> In this case, read/write must return some "file not open" 
> state, and it should be handled properly.
> This problem has nothing to do with threadsafety. It's a logic 
> issue related to threading, but that's got nothing to do with 
> this.

There's no question about it, it *is* a logic error. The point 
is, it's a logic error that ultimately can lead to UB despite 
being @safe. Just like this is: 
https://issues.dlang.org/show_bug.cgi?id=19316.

>> The problem is,
>> definition of "valid" lies beyond the type system: it's an
>> agreement between different parts of code, i.e. between expert
>> programmers who wrote FileHandle et al., and users who write
>> processHugeFile(). The main issue is that certain *runtime*
>> conditions can still violate @safe-ty.
>
> Perhaps you don't understand what @safe-ty means? It's a 
> compiler assertion that the code is memory-safe. It's not a 
> magic attribute that tells you that your program is right.

I know.

> Runtime conditions being in a valid state is a high-level 
> problem for the program, and doesn't interacts with 
> threadsafety in any
> fundamental way, and not in any way that @safe has anything to 
> do with.

Yep.

> You're just describing normal high-level multi-threading logic
> problems. `shared` does not and can not help you with that; you 
> need to look to libraries that offer threading support 
> frameworks for that.
> It can help you not write code that does invalid access to 
> memory and crash. That's the extent of its charter.

I understand that. So... it would seem that your proposal focuses 
more on @safe than on threadsafety?

> If a `shared` API is designed well, it can also offer strong 
> implicit advice about how to correctly interact with API's. The 
> compiler will coerce you to do the right things with error 
> messages.

>> Your proposal makes the language more strict wrt. to writing
>> @safe 'expertmodule', thanks to disallowing reads and writes
>> through `shared`, which is great.
>> However the implicit conversion to `shared` doesn't in any way
>> improve the situation as far as user code is concerned, unless
>> I'm still missing something.

> It does, it eliminates unsafe user interactions. It must be 
> that way to be safe. There were no casts above, it's great! And 
> your program is safe!
> (although it's wrong)

It's @safe, but it's wrong because it's not threadsafe. Yay! :D

> FWIW, I doubt anybody in their right mind would attempt to 
> write a threadsafe filesystem API this way.

std.stdio ;) (yes, I know there's no `shared` there, but that's 
what it does).

> Any such API would be structured COMPLETELY differently; it 
> would likely have one `shared` method that would accept 
> requests for deferred fulfillment, and handle unique objects 
> associated with each request.

Perhaps. How would the user know that?