CURL review request

Jonathan M Davis jmdavisProg at gmx.com
Fri Aug 19 17:06:09 PDT 2011


On Friday, August 19, 2011 15:27 Timon Gehr wrote:
> On 08/19/2011 10:36 PM, Jonathan M Davis wrote:
> > On Friday, August 19, 2011 04:58 Timon Gehr wrote:
> >> On 08/19/2011 01:50 AM, Jonathan M Davis wrote:
> >>> On Thursday, August 18, 2011 16:00 Timon Gehr wrote:
> >>>> On 08/19/2011 12:35 AM, Jonathan M Davis wrote:
> >>>>> On Thursday, August 18, 2011 14:33 jdrewsen wrote:
> >>>>>> Den 17-08-2011 18:21, Timon Gehr skrev:
> >>>>>>> On 08/17/2011 05:58 PM, Jonathan M Davis wrote:
> >>>>>>>> On Wednesday, August 17, 2011 11:30:06 Steven Schveighoffer wrote:
> >>>>>>>>> On Wed, 17 Aug 2011 11:05:56 -0400, jdrewsen<jdrewsen at nospam.com>
> >>> 
> >>> wrote:
> >>>>>>>>>> Den 17-08-2011 15:51, Steven Schveighoffer skrev:
> >>>>>>>>>>> On Wed, 17 Aug 2011 05:43:00 -0400, Jonas Drewsen
> >>>>>>>>>>> <jdrewsen at nospam.com>
> >>>>>>>>>>> 
> >>>>>>>>>>> wrote:
> >>>>>>>>>>>> On 17/08/11 00.21, Jonathan M Davis wrote:
> >>>>>>>>>>>>> On Tuesday, August 16, 2011 12:32 Martin Nowak wrote:
> >>>>>>>>>>>>>> On Tue, 16 Aug 2011 20:48:51 +0200,
> >>>>>>>>>>>>>> jdrewsen<jdrewsen at nospam.com>
> >>>>>>>>>>>>>> 
> >>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>> Den 16-08-2011 18:55, Martin Nowak skrev:
> >>>>>>>>>>>>>>>> On Tue, 16 Aug 2011 15:13:40 +0200,
> >>>>>>>>>>>>>>>> dsimcha<dsimcha at yahoo.com>
> >>>>>>>>>>>>>>>> 
> >>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>> On 8/16/2011 7:48 AM, Jonas Drewsen wrote:
> >>>>>>>>>>>>>>>>>> Hi all,
> >>>>>>>>>>>>>>>>>> 
> >>>>>>>>>>>>>>>>>> This is a review request for the curl wrapper. Please
> >>>>>>>>>>>>>>>>>> read the
> >>>>>>>>>>>>>>>>>> "known
> >>>>>>>>>>>>>>>>>> issues" in the top of the source file and if possible
> >>>>>>>>>>>>>>>>>> suggest a
> >>>>>>>>>>>>>>>>>> solution.
> >>>>>>>>>>>>>>>>>> 
> >>>>>>>>>>>>>>>>>> We also need somebody for running the review process.
> >>>>>>>>>>>>>>>>>> Anyone?
> >>>>>>>>>>>>>>>>>> 
> >>>>>>>>>>>>>>>>>> Code:
> >>>>>>>>>>>>>>>>>> https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl
> >>>>>>>>>>>>>>>>>> .d
> >>>>>>>>>>>>>>>>>> Docs:
> >>>>>>>>>>>>>>>>>> http://freeze.steamwinter.com/D/web/phobos/etc_curl.html
> >>>>>>>>>>>>>>>>>> 
> >>>>>>>>>>>>>>>>>> Demolish!
> >>>>>>>>>>>>>>>>>> 
> >>>>>>>>>>>>>>>>>> /Jonas
> >>>>>>>>>>>>>>>>>> 
> >>>>>>>>>>>>>>>>> From a quick look, this looks very well thought out.
> >>>>>>>>>>>>>>>>> I'll
> >>>>>>>>>>>>>>>>> 
> >>>>>>>>>>>>>>>>> review
> >>>>>>>>>>>>>>>>> it
> >>>>>>>>>>>>>>>>> more thoroughly when I have more time. A few
> >>>>>>>>>>>>>>>>> questions/comments
> >>>>>>>>>>>>>>>>> from a
> >>>>>>>>>>>>>>>>> quick look at the docs:
> >>>>>>>>>>>>>>>>> 
> >>>>>>>>>>>>>>>>> Does the async stuff use D threads, or does Curl have its
> >>>>>>>>>>>>>>>>> own
> >>>>>>>>>>>>>>>>> async
> >>>>>>>>>>>>>>>>> API?
> >>>>>>>>>>>>>>>>> 
> >>>>>>>>>>>>>>>>> In your examples for postData, you have onReceive a
> >>>>>>>>>>>>>>>>> ubyte[] and
> >>>>>>>>>>>>>>>>> write
> >>>>>>>>>>>>>>>>> it out to console. Did you mean to cast this to some kind
> >>>>>>>>>>>>>>>>> of
> >>>>>>>>>>>>>>>>> string?
> >>>>>>>>>>>>>>>>> 
> >>>>>>>>>>>>>>>>> For onReceive, what's the purpose of the return value?
> >>>>>>>>>>>>>>>>> 
> >>>>>>>>>>>>>>>>> If/when this module makes it into Phobos, are we going to
> >>>>>>>>>>>>>>>>> start
> >>>>>>>>>>>>>>>>> including a libcurl binary with DMD distributions so that
> >>>>>>>>>>>>>>>>> std.curl
> >>>>>>>>>>>>>>>>> feels truly **standard** and requires zero extra
> >>>>>>>>>>>>>>>>> configuration?
> >>>>>>>>>>>>>>>> 
> >>>>>>>>>>>>>>>> I was also wondering about the async handling. In the
> >>>>>>>>>>>>>>>> long-term
> >>>>>>>>>>>>>>>> I'd like
> >>>>>>>>>>>>>>>> to see a bigger picture for async handling in phobos
> >>>>>>>>>>>>>>>> (offering
> >>>>>>>>>>>>>>>> some kind
> >>>>>>>>>>>>>>>> of futures, maybe event-loops etc.).
> >>>>>>>>>>>>>>>> Though this is not a requirement for the curl wrapper now.
> >>>>>>>>>>>>>>>> std.parallelism also has some kind of this stuff and file
> >>>>>>>>>>>>>>>> reading
> >>>>>>>>>>>>>>>> would
> >>>>>>>>>>>>>>>> benefit from it too.
> >>>>>>>>>>>>>>> 
> >>>>>>>>>>>>>>> This has been discussed before and I also think this is
> >>>>>>>>>>>>>>> very important.
> >>>>>>>>>>>>>>> But before that I think some kind of package management
> >>>>>>>>>>>>>>> should be
> >>>>>>>>>>>>>>> prioritized (A DIP11 implementaion or a more traditional
> >>>>>>>>>>>>>>> solution).
> >>>>>>>>>>>>>>> 
> >>>>>>>>>>>>>>>> One thing I spotted at a quick glance, sending to be
> >>>>>>>>>>>>>>>> filled buffers to
> >>>>>>>>>>>>>>>> another thread should not be done by casting to shared not
> >>>>>>>>>>>>>>>> immutable.
> >>>>>>>>>>>>>>> 
> >>>>>>>>>>>>>>> I'm not entirely sure what you mean. There is no use of
> >>>>>>>>>>>>>>> shared buffers
> >>>>>>>>>>>>>>> in the wrapper. I do cast the buffer between
> >>>>>>>>>>>>>>> mutable/immutable because
> >>>>>>>>>>>>>>> only immutable or by value data can be passed using
> >>>>>>>>>>>>>>> std.concurrency.
> >>>>>>>>>>>>>>> Since the buffers are only used by the thread that
> >>>>>>>>>>>>>>> currently has the
> >>>>>>>>>>>>>>> buffer this is safe. I've previously asked for a non-cast
> >>>>>>>>>>>>>>> solution
> >>>>>>>>>>>>>>> (ie.
> >>>>>>>>>>>>>>> some kind of move between threads semantic for
> >>>>>>>>>>>>>>> std.concurrency) but
> >>>>>>>>>>>>>>> was
> >>>>>>>>>>>>>>> advised that this was the way to do it.
> >>>>>>>>>>>>>>> 
> >>>>>>>>>>>>>>>> martin
> >>>>>>>>>>>>>> 
> >>>>>>>>>>>>>> Pardon the typo. What I meant is that AFAIK casting from
> >>>>>>>>>>>>>> immutable to
> >>>>>>>>>>>>>> mutable has undefined behavior.
> >>>>>>>>>>>>>> The intended method for sending a uint[] buffer to another
> >>>>>>>>>>>>>> thread is
> >>>>>>>>>>>>>> to
> >>>>>>>>>>>>>> cast that
> >>>>>>>>>>>>>> buffer to shared (cast(shared(uint[])) and casting away the
> >>>>>>>>>>>>>> shared
> >>>>>>>>>>>>>> on the
> >>>>>>>>>>>>>> receiving side.
> >>>>>>>>>>>>>> It is allowed to send shared data using std.concurrency.
> >>>>>>>>>>>>> 
> >>>>>>>>>>>>> Casting away immutability and then altering data is
> >>>>>>>>>>>>> undefined. Actually
> >>>>>>>>>>>>> casting it away is defined. So, if you have data in one
> >>>>>>>>>>>>> thread that
> >>>>>>>>>>>>> you know
> >>>>>>>>>>>>> is unique, you can cast it to immutable (or
> >>>>>>>>>>>>> std.exception.assumeUnique to do
> >>>>>>>>>>>>> it) and then send it to another thread. On that thread, you
> >>>>>>>>>>>>> can then
> >>>>>>>>>>>>> cast it
> >>>>>>>>>>>>> to mutable and alter it.
> >>>>>>>>>>>>> 
> >>>>>>>>>>>>> However, you're circumventing the type system when you do
> >>>>>>>>>>>>> this. So,
> >>>>>>>>>>>>> you have
> >>>>>>>>>>>>> to be very careful. You're throwing away the guarantees that
> >>>>>>>>>>>>> the compiler
> >>>>>>>>>>>>> makes with regards to const and immutable. It _is_ guaranteed
> >>>>>>>>>>>>> to work
> >>>>>>>>>>>>> though.
> >>>>>>>>>>>>> And I'm not sure that there's really any difference between
> >>>>>>>>>>>>> casting
> >>>>>>>>>>>>> to shared
> >>>>>>>>>>>>> and back and casting to immutable and back. In both cases,
> >>>>>>>>>>>>> you're circumventing the type system. The main difference
> >>>>>>>>>>>>> would be that if
> >>>>>>>>>>>>> you
> >>>>>>>>>>>>> screwed up with immutable and cast away immutable on
> >>>>>>>>>>>>> something that
> >>>>>>>>>>>>> really was
> >>>>>>>>>>>>> immutable rather than something that you cast to immutable
> >>>>>>>>>>>>> just to send it to
> >>>>>>>>>>>>> another thread, then you could a segfault when you tried to
> >>>>>>>>>>>>> alter it,
> >>>>>>>>>>>>> since it
> >>>>>>>>>>>>> could be in ROM.
> >>>>>>>>>>>>> 
> >>>>>>>>>>>>> - Jonathan M Davis
> >>>>>>>>>>>> 
> >>>>>>>>>>>> Yeah I know you have to be careful when doing these kind of
> >>>>>>>>>>>> things. I
> >>>>>>>>>>>> ran into the problem of sending buffers between threads (using
> >>>>>>>>>>>> std.concurrency) so that they could be reused. There isn't any
> >>>>>>>>>>>> "move ownership" support in place so Andrei suggested I could
> >>>>>>>>>>>> do it by casting immutable.
> >>>>>>>>>>>> 
> >>>>>>>>>>>> If anyone knows of a cleaner way to do this please tell.
> >>>>>>>>>>> 
> >>>>>>>>>>> casting to shared and back. Passing shared data should be
> >>>>>>>>>>> supported by std.concurrency, and casting away shared is
> >>>>>>>>>>> defined as long as you know
> >>>>>>>>>>> only one thread owns the data after casting.
> >>>>>>>>>>> 
> >>>>>>>>>>> -Steve
> >>>>>>>>>> 
> >>>>>>>>>> Why is this cleaner than casting to immutable and back?
> >>>>>>>>> 
> >>>>>>>>> Once it's immutable, it can never be mutable again. Casting to
> >>>>>>>>> immutable is a one-way street. Yes, you can cast to mutable, but
> >>>>>>>>> you still can't change the data unless you want undefined
> >>>>>>>>> behavior.
> >>>>>>>>> 
> >>>>>>>>> Shared is not like that, an item can be thread-local, then
> >>>>>>>>> shared, then thread local again, all the time being mutable. It
> >>>>>>>>> also reflects better what the process is (I'm sharing this data
> >>>>>>>>> with another thread, then that
> >>>>>>>>> thread is taking ownership). There's still the possibility to
> >>>>>>>>> screw up, but at least you are not in undefined territory in the
> >>>>>>>>> correctly-implemented case.
> >>>>>>>> 
> >>>>>>>> Are you sure? As I understand it, there's no real difference
> >>>>>>>> between casting to
> >>>>>>>> immutable and back and casting to shared and back. Both circumvent
> >>>>>>>> the type
> >>>>>>>> system. In the one case, the type system guarantees that the data
> >>>>>>>> can't be
> >>>>>>>> altered, and you're breaking that guarantee, because you know that
> >>>>>>>> it _can_
> >>>>>>>> be, since you created the data and know that it's actually
> >>>>>>>> mutable.
> >>>>>>> 
> >>>>>>> No. As soon as the data is typed as immutable anywhere it cannot be
> >>>>>>> changed anymore. You only break guarantees if you actually try to
> >>>>>>> change the data (otherwise std.typecons.assumeUnique would perform
> >>>>>>> its job outside defined behavior btw)
> >>>>>> 
> >>>>>> I'm thinking down the same lines as Jonathan. Is the behavior for
> >>>>>> immutable casts that you describe specified in the language
> >>>>>> reference somewhere?
> >>>>>> 
> >>>>>> I have no problem with using shared casts instead of immutable - I
> >>>>>> just want make sure it is really needed.
> >>>>> 
> >>>>> The behavior of casting a way const or immutable on a value and then
> >>>>> mutating it is undefined by the language, because you're breaking the
> >>>>> language's guarantees and what happens depends entirely on whether
> >>>>> the actual object was actually immutable. However, in the case of
> >>>>> casting to immutable and then casting back, you _know_ that the
> >>>>> object is mutable, so there's no problem. You're just circumventing
> >>>>> the type system which throws away the guarantees that it gives you
> >>>>> about immutability, which could screw up optimizations if you had
> >>>>> actually did more than just pass the variable around. But that's
> >>>>> just not happening here.
> >>>>> 
> >>>>> As for casting to and from shared and mutating the object, I don't
> >>>>> see how it is any more defined than casting to and from immutable
> >>>>> and then mutating the object is. In both cases, you circumvented the
> >>>>> type system, which breaks the compiler's guarantees and risks bugs
> >>>>> if you actually do more than just pass the variable around before
> >>>>> casting it back to being thread-local and mutable.
> >>>>> 
> >>>>> - Jonathan M Davis
> >>>> 
> >>>> As long as the data is not being shared between multiple threads after
> >>>> it's sharedness has been cast away, you are well in defined area,
> >>>> because you are NOT breaking anything.
> >>> 
> >>> The _only_ reason that you're not breaking anything is because you are
> >>> being careful and making sure that the data is not actually shared
> >>> between threads. You're _still_ circumventing the type system and
> >>> breaking the guarantee that a non-shared variable is not shared between
> >>> threads. _You_ are the one guaranteeing that the variable is only on
> >>> one thread, not the compiler. And when you cast away immutable, _you_
> >>> are the one guaranteeing that immutable data is not being mutated, not
> >>> the compiler. And in this case, you can make that guarantee in exactly
> >>> the same way that you can guarantee that the variable which was cast
> >>> to shared and back to be passed across threads isn't actually shared
> >>> between threads once it has been passed.
> >>> 
> >>>> The crucial difference between immutable and shared is, that something
> >>>> that is immutable will always be immutable, but being shared or not
> >>>> may change dynamically.
> >>>> 
> >>>> Casting to immutable is a one-way-street, while casting to shared is
> >>>> not.
> >>> 
> >>> Casting to immutable does not make the data magically immutable.
> >> 
> >> Yes it does. That is quite exactly the definition of it.
> >> 
> >>> It makes it
> >>> so that the compiler treats it as immutable.
> >> 
> >> The compiler is irrelevant, the fact that one compiler generates
> >> assembly that behaves as you'd like it to behave does not mean the code
> >> is valid.
> >> 
> >>> Casting from immutable makes it
> >>> so that the compiler treats it as mutable. It does not alter whether
> >>> the data is actually immutable.
> >> 
> >> 'Actually immutable' means that the variable is never changed. The
> >> language spec says that if an immutable variable is not 'actually
> >> immutable', your program is in error, even if immutability has been
> >> explicitly cast away.
> >> 
> >>> Casting away immutable and altering data is undefined,
> >>> because it depends entirely on whether the data is actually immutable
> >>> or not.
> >> 
> >> Just to make sure we agree on that: knowing why it is undefined does not
> >> imply there are cases where it is actually defined.
> >> 
> >>> If it isn't, and you don't have any other references to the data (or
> >>> none of the other references are immutable), then you don't break
> >>> _anything_ by mutating the variable after having cast away immutable.
> >> 
> >> If you believe the spec, you do break validity of your code even if the
> >> variable is not visible from another place after the cast. There is
> >> currently afaik no reason why such code would *have* to be broken, but
> >> that is what the spec says, because it categorically bans changing
> >> variables that were cast from immutable.
> >> 
> >>> With both shared and immutable, casting it away is circumnventing the
> >>> type system. In both cases, _you_ must make sure that you code in a way
> >>> which does not violate the guarantees that the compiler normally makes
> >>> about those types. If you do violate those guarantees (e.g. by sharing
> >>> non-shared data across threads or by mutating data which really was
> >>> immutable), then you have a bug, and what happens is dependent on the
> >>> compiler implementation and on your code.
> >> 
> >> Agreed, and this means what happens is undefined and such code is
> >> invalid.
> >> 
> >>> But if you don't violate those guarantees, then your program is fine.
> >>> It's the same for both shared and immutable. It's just that the bugs
> >>> that you're likely to get when you violate those guarantees are likely
> >>> to be quite different.
> >>> 
> >>> - Jonathan M Davis
> >> 
> >> Yes, I agree. And in this specific case you do violate the language
> >> guarantees about immutable variables when you change the variable after
> >> casting it back to mutable. Even if it was typed mutable sometime
> >> before, and even if it is the sole reference in the whole program.
> >> Therefore the behavior of the current implementation is undefined under
> >> the current spec, and changing the transfer protocol to use shared
> >> instead of immutable would fix this. (while generating near-identical
> >> assembly output with the current DMD)
> > 
> > What I completely disagree with you on is that casting something to
> > shared, passing it to another thread, casting from shared, and then
> > altering a variable is any more defined than casting to and from
> > immutable and altering a variable is. In _both_ cases, you are
> > circumventing the compiler, and in _both_ cases, _you_ must guarantee
> > that what you're doing doesn't actually violate what the compiler is
> > trying to guarantee.
> > 
> > - Jonathan M Davis
> 
> As I said, in the case of immutable, changing the data is the problem:
> It does actually violate the rules of the spec. It is explicitly
> disallowed to change any data that has been cast from immutable. (still,
> with DMD it probably works.)
> 
> shared has no such strong after-the-cast limitation (at least I haven't
> found a reason or a passage in the spec), the data should just not be
> visible by multiple threads afterwards. You may well change the variable
> after the cast.

Casting away immutable on a variable and mutating it is _not_ disallowed by 
the spec. It is undefined as to what happens when you mutate the data after 
having cast away immutable. This is because it's completely dependent on what 
the actual state of the data was. If it's actually immutable (e.g. it's in 
ROM), then things are going to blow up on you. If it's not actually immutable, 
but there are other references to the data, then the mutation of the variable 
can cause bugs because the compiler makes assumptions based on the guarantees 
that immutable provides, and you broke those guarantees. So, as long as the 
variable was created as a mutable variable and there are no longer any 
immutable references to that data which could be relying on that data to not 
change, then there are _zero_ problems with casting away immutable and 
mutating a variable. It _is_ not disallowed to do that. It is merely 
_undefined_ as to what happens. If it were disallowed, then you couldn't do.

In the case of shared, the compiler guarantees that mutating a variable which 
_isn't_ shared won't alter any variables on another thread. By casting a 
variable to shared and passing it to another thread, you have violated that. 
It is undefined as to whether altering a variable which you passed to another 
thread like that will alter any other variable. It depends on whether any 
reference to the same data still exists on the original thread. So, the 
behavior - as with immutable - is undefined. It depends on what you've done in 
the code instead of on what the compiler guarantees.

In _both_ cases, you can change the variable. In both cases, the exact 
behavior is undefined. And in both cases, there's pretty much no way that 
there's any danger - on any compiler implementation - of it not working to 
pass the variable across like we've been discussing if no reference is kept on 
the original thread.

In any case, there's probably no point in arguing this any further. We 
obviously don't agree, and I don't think that either of us is really going to 
convince the other. Using immutable and shared both work in this case. You can 
use both and be perfectly fine as long as you're careful. As long as the 
original data is mutable and no reference to it is kept on the original 
thread, then both approaches work just fine.

I do think that it makes more sense to use shared in this case, since 
immutable _is_ shared anyway, and all we're really trying to do is get the 
data passed to another thread, not change its mutability. But we're obviously 
not going to agree on the details of what you are and aren't violating by 
casting back and forth with shared and immutable.

- Jonathan M Davis


More information about the Digitalmars-d mailing list