CURL review request

Martin Nowak dawg at dawgfoto.de
Wed Aug 17 13:31:58 PDT 2011


On Wed, 17 Aug 2011 18:21:00 +0200, Timon Gehr <timon.gehr at gmx.ch> wrote:

> On 08/17/2011 05:58 PM, Jonathan M Davis wrote:
>> On Wednesday, August 17, 2011 11:30:06 Steven Schveighoffer wrote:
>>> On Wed, 17 Aug 2011 11:05:56 -0400, jdrewsen<jdrewsen at nospam.com>   
>>> wrote:
>>>> Den 17-08-2011 15:51, Steven Schveighoffer skrev:
>>>>> On Wed, 17 Aug 2011 05:43:00 -0400, Jonas Drewsen
>>>>> <jdrewsen at nospam.com>
>>>>>
>>>>> wrote:
>>>>>> On 17/08/11 00.21, Jonathan M Davis wrote:
>>>>>>> On Tuesday, August 16, 2011 12:32 Martin Nowak wrote:
>>>>>>>> On Tue, 16 Aug 2011 20:48:51 +0200,
>>>>>>>> jdrewsen<jdrewsen at nospam.com>
>>>>>>>>
>>>>>>>> wrote:
>>>>>>>>> Den 16-08-2011 18:55, Martin Nowak skrev:
>>>>>>>>>> On Tue, 16 Aug 2011 15:13:40 +0200,
>>>>>>>>>> dsimcha<dsimcha at yahoo.com>
>>>>>>>>>>
>>>>>>>>>> wrote:
>>>>>>>>>>> On 8/16/2011 7:48 AM, Jonas Drewsen wrote:
>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>
>>>>>>>>>>>> This is a review request for the curl wrapper. Please
>>>>>>>>>>>> read the
>>>>>>>>>>>> "known
>>>>>>>>>>>> issues" in the top of the source file and if possible
>>>>>>>>>>>> suggest a
>>>>>>>>>>>> solution.
>>>>>>>>>>>>
>>>>>>>>>>>> We also need somebody for running the review process.
>>>>>>>>>>>> Anyone?
>>>>>>>>>>>>
>>>>>>>>>>>> Code:
>>>>>>>>>>>> https://github.com/jcd/phobos/blob/curl-wrapper/etc/curl
>>>>>>>>>>>> .d
>>>>>>>>>>>> Docs:
>>>>>>>>>>>> http://freeze.steamwinter.com/D/web/phobos/etc_curl.html
>>>>>>>>>>>>
>>>>>>>>>>>> Demolish!
>>>>>>>>>>>>
>>>>>>>>>>>> /Jonas
>>>>>>>>>>>
>>>>>>>>>>>  From a quick look, this looks very well thought out. I'll
>>>>>>>>>>> review
>>>>>>>>>>> it
>>>>>>>>>>> more thoroughly when I have more time. A few
>>>>>>>>>>> questions/comments
>>>>>>>>>>> from a
>>>>>>>>>>> quick look at the docs:
>>>>>>>>>>>
>>>>>>>>>>> Does the async stuff use D threads, or does Curl have its
>>>>>>>>>>> own
>>>>>>>>>>> async
>>>>>>>>>>> API?
>>>>>>>>>>>
>>>>>>>>>>> In your examples for postData, you have onReceive a
>>>>>>>>>>> ubyte[] and
>>>>>>>>>>> write
>>>>>>>>>>> it out to console. Did you mean to cast this to some kind
>>>>>>>>>>> of
>>>>>>>>>>> string?
>>>>>>>>>>>
>>>>>>>>>>> For onReceive, what's the purpose of the return value?
>>>>>>>>>>>
>>>>>>>>>>> If/when this module makes it into Phobos, are we going to
>>>>>>>>>>> start
>>>>>>>>>>> including a libcurl binary with DMD distributions so that
>>>>>>>>>>> std.curl
>>>>>>>>>>> feels truly **standard** and requires zero extra
>>>>>>>>>>> configuration?
>>>>>>>>>>
>>>>>>>>>> I was also wondering about the async handling. In the
>>>>>>>>>> long-term
>>>>>>>>>> I'd like
>>>>>>>>>> to see a bigger picture for async handling in phobos
>>>>>>>>>> (offering
>>>>>>>>>> some kind
>>>>>>>>>> of futures, maybe event-loops etc.).
>>>>>>>>>> Though this is not a requirement for the curl wrapper now.
>>>>>>>>>> std.parallelism also has some kind of this stuff and file
>>>>>>>>>> reading
>>>>>>>>>> would
>>>>>>>>>> benefit from it too.
>>>>>>>>>
>>>>>>>>> This has been discussed before and I also think this is very
>>>>>>>>> important.
>>>>>>>>> But before that I think some kind of package management should
>>>>>>>>> be
>>>>>>>>> prioritized (A DIP11 implementaion or a more traditional
>>>>>>>>> solution).
>>>>>>>>>
>>>>>>>>>> One thing I spotted at a quick glance, sending to be filled
>>>>>>>>>> buffers to
>>>>>>>>>> another thread should not be done by casting to shared not
>>>>>>>>>> immutable.
>>>>>>>>>
>>>>>>>>> I'm not entirely sure what you mean. There is no use of shared
>>>>>>>>> buffers
>>>>>>>>> in the wrapper. I do cast the buffer between mutable/immutable
>>>>>>>>> because
>>>>>>>>> only immutable or by value data can be passed using
>>>>>>>>> std.concurrency.
>>>>>>>>> Since the buffers are only used by the thread that currently
>>>>>>>>> has the
>>>>>>>>> buffer this is safe. I've previously asked for a non-cast
>>>>>>>>> solution
>>>>>>>>> (ie.
>>>>>>>>> some kind of move between threads semantic for
>>>>>>>>> std.concurrency) but
>>>>>>>>> was
>>>>>>>>> advised that this was the way to do it.
>>>>>>>>>
>>>>>>>>>> martin
>>>>>>>>
>>>>>>>> Pardon the typo. What I meant is that AFAIK casting from
>>>>>>>> immutable to
>>>>>>>> mutable has undefined behavior.
>>>>>>>> The intended method for sending a uint[] buffer to another
>>>>>>>> thread is
>>>>>>>> to
>>>>>>>> cast that
>>>>>>>> buffer to shared (cast(shared(uint[])) and casting away the
>>>>>>>> shared
>>>>>>>> on the
>>>>>>>> receiving side.
>>>>>>>> It is allowed to send shared data using std.concurrency.
>>>>>>>
>>>>>>> Casting away immutability and then altering data is undefined.
>>>>>>> Actually
>>>>>>> casting it away is defined. So, if you have data in one thread
>>>>>>> that
>>>>>>> you know
>>>>>>> is unique, you can cast it to immutable (or
>>>>>>> std.exception.assumeUnique to do
>>>>>>> it) and then send it to another thread. On that thread, you can
>>>>>>> then
>>>>>>> cast it
>>>>>>> to mutable and alter it.
>>>>>>>
>>>>>>> However, you're circumventing the type system when you do this.
>>>>>>> So,
>>>>>>> you have
>>>>>>> to be very careful. You're throwing away the guarantees that the
>>>>>>> compiler
>>>>>>> makes with regards to const and immutable. It _is_ guaranteed to
>>>>>>> work
>>>>>>> though.
>>>>>>> And I'm not sure that there's really any difference between
>>>>>>> casting
>>>>>>> to shared
>>>>>>> and back and casting to immutable and back. In both cases, you're
>>>>>>> circumventing the type system. The main difference would be that
>>>>>>> if
>>>>>>> you
>>>>>>> screwed up with immutable and cast away immutable on something
>>>>>>> that
>>>>>>> really was
>>>>>>> immutable rather than something that you cast to immutable just to
>>>>>>> send it to
>>>>>>> another thread, then you could a segfault when you tried to alter
>>>>>>> it,
>>>>>>> since it
>>>>>>> could be in ROM.
>>>>>>>
>>>>>>> - Jonathan M Davis
>>>>>>
>>>>>> Yeah I know you have to be careful when doing these kind of things.
>>>>>> I
>>>>>> ran into the problem of sending buffers between threads (using
>>>>>> std.concurrency) so that they could be reused. There isn't any "move
>>>>>> ownership" support in place so Andrei suggested I could do it by
>>>>>> casting immutable.
>>>>>>
>>>>>> If anyone knows of a cleaner way to do this please tell.
>>>>>
>>>>> casting to shared and back. Passing shared data should be supported  
>>>>> by
>>>>> std.concurrency, and casting away shared is defined as long as you
>>>>> know
>>>>> only one thread owns the data after casting.
>>>>>
>>>>> -Steve
>>>>
>>>> Why is this cleaner than casting to immutable and back?
>>>
>>> Once it's immutable, it can never be mutable again.  Casting to  
>>> immutable
>>> is a one-way street.  Yes, you can cast to mutable, but you still can't
>>> change the data unless you want undefined behavior.
>>>
>>> Shared is not like that, an item can be thread-local, then shared, then
>>> thread local again, all the time being mutable.  It also reflects  
>>> better
>>> what the process is (I'm sharing this data with another thread, then  
>>> that
>>> thread is taking ownership).  There's still the possibility to screw  
>>> up,
>>> but at least you are not in undefined territory in the
>>> correctly-implemented case.
>>
>> Are you sure? As I understand it, there's no real difference between  
>> casting to
>> immutable and back and casting to shared and back. Both circumvent the  
>> type
>> system. In the one case, the type system guarantees that the data can't  
>> be
>> altered, and you're breaking that guarantee, because you know that it  
>> _can_
>> be, since you created the data and know that it's actually mutable.
>
> No. As soon as the data is typed as immutable anywhere it cannot be  
> changed anymore. You only break guarantees if you actually try to change  
> the data (otherwise std.typecons.assumeUnique would perform its job  
> outside defined behavior btw)
>
>> In the
>> other case, the type system guarantees that the data is thread-local and
>> therefore thread-safe, and you're breaking that guarantee by casting it  
>> to
>> shared. On the other end, you're then casting it back, since you know  
>> that the
>> data isn't actually shared. In both cases, you're circumventing the  
>> compiler's
>> guarantees. In both cases, you've claimed that it's thread for th second
>> thread to use the data, when if you screwed up and left references to  
>> it in
>> the first thread, then it isn't. I don't really see the difference.
>>
>> - Jonathan M Davis
>
> You don't break any guarantees when casting away shared if you know that  
> the data is actually not shared anymore.
> (Of course, if it is actually still shared between multiple threads,  
> this is about as bad as altering data which is typed as immutable  
> somewhere.)
>
>
> You don't break any guarantees if you don't actually break them. The  
> casts are just there because the compiler is unable to verify that you  
> don't.
>
> Therefore casting to immutable and back and then changing data is bad,  
> but casting data to shared, transferring ownership to another single  
> thread and then casting back to unshared is good.

Not wanting to drift too far off topic I'll add one last point.
given:
immutable(int[]) data = assumeUnique(myints);
send(cast(int[])data);
writeln(data[0]);

A compiler implementation could deduce data won't change between  
initialization and the write.
Thus performing optimizations that would break the code.
I think that and being able to store ctfe data in read only sections are  
the reasons for the undefined behavior.
Removing Immutable With A Cast =>  
http://www.digitalmars.com/d/2.0/const3.html

martin


More information about the Digitalmars-d mailing list