[Dlang-study] [rcstring] Defining rcstring

Andrei Alexandrescu andrei at erdani.com
Sat Feb 6 15:47:56 PST 2016


On 02/06/2016 04:51 PM, Михаил Страшун wrote:
> I am sorry that I keep arguing about things that may look unimportant
> to you but so far it looks like a major effort will go into designing a
> thing that will only be helpful in a few situations and won't change
> overall situation much.

Of course. Sadly it may be the case that I don't have solutions for such 
things, not that I don't find them important.

>> Characters in a string may be modifiable by means of opAssign,
>> opOpAssign etc.
>
> Makes sense. And just to be extra sure - opSliceAssign is planned to be
> allowed too, right? If yes, that should be enough for most scenarios.

That should be easy to support.
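
As a rough illustration of what slice assignment could look like, here is 
a minimal sketch. `Buf` and `toSlice` are hypothetical names, the storage 
is a plain GC array rather than a reference-counted buffer, and the code 
works on UTF-8 code units without validating the result. None of this is 
the actual rcstring design.

```d
// Hypothetical stand-in for a mutable string type; not the real rcstring.
struct Buf
{
    private char[] data;

    this(const(char)[] s) { data = s.dup; }

    // Handles `b[i .. j] = c`: fill the range with one code unit.
    void opSliceAssign(char c, size_t i, size_t j)
    {
        data[i .. j] = c;
    }

    // Handles `b[i .. j] = "xy"`: element-wise copy, lengths must match.
    void opSliceAssign(const(char)[] rhs, size_t i, size_t j)
    {
        data[i .. j] = rhs[];
    }

    // Read-only view of the current contents.
    const(char)[] toSlice() const { return data; }
}

void main()
{
    auto b = Buf("hello");
    b[1 .. 3] = 'x';
    assert(b.toSlice == "hxxlo");
    b[0 .. 2] = "ab";
    assert(b.toSlice == "abxlo");
}
```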

> Interaction with C bindings may become complicated though, do you have
> any vision about it?

A @system method offering access to the underlying buffer.
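
A minimal sketch of that idea, under stated assumptions: `RCStr`, its 
fields, and the method name `ptr` are hypothetical, the buffer is 
malloc-owned and NUL-terminated for the benefit of C callers, and there is 
no thread safety. The point is only that the escape hatch is marked 
@system, so @safe code cannot call it.

```d
import core.stdc.stdlib : malloc, free;
import core.stdc.string : memcpy, strlen;

// Hypothetical reference-counted string; not the actual rcstring design.
struct RCStr
{
    private char* buf;      // malloc-owned, NUL-terminated payload
    private size_t len;     // length in code units, excluding the NUL
    private size_t* count;  // shared reference count

    this(const(char)[] s) @trusted
    {
        buf = cast(char*) malloc(s.length + 1);
        count = cast(size_t*) malloc(size_t.sizeof);
        assert(buf !is null && count !is null);
        if (s.length) memcpy(buf, s.ptr, s.length);
        buf[s.length] = '\0';
        len = s.length;
        *count = 1;
    }

    // Copying bumps the count.
    this(this) @trusted { if (count) ++*count; }

    // The last owner frees the payload and the counter.
    ~this() @trusted
    {
        if (count && --*count == 0) { free(buf); free(count); }
    }

    size_t length() const @safe { return len; }

    // Escape hatch for C bindings. The caller must not keep the pointer
    // past the lifetime of the last RCStr that owns the buffer.
    const(char)* ptr() const @system { return buf; }
}

void main()
{
    auto s = RCStr("hello");
    auto t = s;                      // postblit: both share one buffer
    assert(strlen(s.ptr) == 5);      // hand the raw buffer to a C function
    assert(s.ptr is t.ptr);
    assert(t.length == 5);
}
```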

>>> When it comes to encoding, there is also issue of how lacking is current
>>> support of non-UTF encodings in Phobos.
>>
>> D uses UTF for strings. Vivid anecdotes aside, we really can't be
>> everything to everyone. Your friend could have written a translator to
>> UTF in a few lines. The DNA optimization points at performance bugs in
>> Phobos that as far as I know have been fixed or are fixable by rote. I
>> think this non-UTF requirement would just stretch things too far and
>> smacks of solving the wrong problem.
>
> From a purely technical point of view you are perfectly right. But does
> that make the fact that potential users leave disappointed any better?

What greener pastures do they leave for? We should take a page from the 
languages that support multiple encodings seamlessly.

>> Const is a non-modifiable view on data that may otherwise be mutable.
>
> Again, I'd like to read confirmation from Walter on this because I
> recall different statements from him in the past on this topic.

Yeah, he's on board.

> Also my
> own experience of trying to use const in such manner (== effectively
> logical const, like in C++) is rather bad and if it was the intention,
> it feels like major design PITA that is much too intrusive for declared
> goals. Physical immutability guarantees add at least some justification
> for it being that demanding.

We don't know how to make things work otherwise.

> You may call me paranoid but I was thinking about this :)
>
> void foo ( )
> in
> {
>      // compiler currently qualifies "this" inside a contract with const
>      // which gives guarantees that enabling/disabling contracts has no
>      // (accidental) effect on class/struct semantics. If passing
>      // const "this" to a function may actually change a refcount, it
>      // may add to a contract impact in a subtle way
>      bar(this);
> }
> body
> {

I'm not sure what to say about this. We'll cross that bridge when we 
come to it.

>> Initially no sharing will be allowed. Following the initial
>> implementation we may add implementation for the "shared" qualifier for
>> rcstring.
>
> That is one of main decision topics for any string replacement. If
> sharing support is even not supposed to be discussed, what is the point
> of the case study?

We're free to discuss it. All I said is I don't know how to do it. So 
the logical thing to do is explicitly not support it; when we do know 
how to do it, the addition can be made without breaking any code.

> I have seen plenty of successful thread-local implementations of
> reference counted strings. It is multi-threading that makes things
> complicated - and committing to new standard design which does not plan
> for sharing from the very beginning is a good way to ensure it will not
> be usable in such way.

The design is not "not planned" for sharing. The plan is to explicitly 
not do it for now. "We will drive around Stuttgart" is not "We did not 
plan for Stuttgart".

If any part of the API might impede future expansion into sharing 
semantics, by all means the point needs to be raised. Generally it is 
known where matters lie - reference count updates. As long as we don't 
expose some gnarly details to the user we should be safe for future 
extension.

>> We can't deliver two contradictory guarantees at the same time.
>
> I know what immutability is in D, but that doesn't really answer my
> question :) Right now I am aware of two truly scaling approaches to
> sharing in D:
>
> - `@safe Unique!T` which allows multi-threaded ownership transfer (not
> actually supported by Phobos yet, but all prerequisites seem to be there)
> - immutable (both directly and by making immutable copy from mutable
> data) + atomics
>
> Anything that involves locking a mutex on method calls (like it tends to
> happen with all straightforward shared RC implementations) destroys
> performance so hard it is hardly even considered an option these days.
>
> So considering you are willing to abandon immutability and uniqueness
> support still has a long way to go, what does remain?

There is no abandoning of immutability.

> Will new
> "standard" string type be incapable of lock-free sharing?

There is one standard string type today that is shareable lock-free and 
uses the garbage collector.

> On a related topic:
>
> Why do you completely discard external reference counting approach (i.e.
> storing refcount in GC/allocator internal data structures bound to
> allocated memory blocks)? Is there any paper explaining pitfalls of such
> concept?

One reason for creating this list was to have a smaller confined circle 
for design discussions outside the corrosive atmosphere of the forum. A 
place where intellectual discussion, careful consideration, and pushing 
forward the state of affairs prevail over having the louder voice, 
demonstrating competence, or winning arguments. In this smaller circle, 
I kindly but firmly invite everyone to steer clear of assumptions on the 
state of mind of other participants such as "things that look 
unimportant to you", "you are willing to abandon immutability", or "you 
completely discard". These do little else than put the other on the 
defensive and derail the discussion. Instead of furthering the technical 
argument, the other party now needs to explain that in fact no, they 
don't want to do these things. Thank you.

Also: I am not using the Socratic method here. If I put forward an idea, 
design etc. that has shortcomings it simply means I don't know how to do 
better. Therefore, pointing out the shortcomings will not push things 
forward; too many of those and we're back to stalemate. The best way is 
to propose better ideas.

Storing refcounts separately in the allocator is definitely possible but 
my understanding is it just moves the problem elsewhere. The "poison 
cast" that takes immutability away from the reference counter in order 
to manipulate it moves from rcstring's code to the allocator. I see an 
upside to that - we could move the allocator to "the language" and 
guarantee things about it (such as it can cast immutability away). What 
other advantages of the scheme do you see?
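
To make the trade-off concrete, here is a minimal sketch of one way the 
count could live on the allocator's side. Everything in it is an 
assumption made for illustration: the name `HeaderAllocator`, its 
functions, and the block layout `[size_t count][payload]`. The payload is 
handed out as an immutable string, and the only cast back to mutable 
memory sits inside the allocator, which is the spot the language would 
have to bless.

```d
import core.stdc.stdlib : malloc, free;
import core.stdc.string : memcpy;

// Hypothetical allocator; each block is laid out as [size_t count][payload].
struct HeaderAllocator
{
    // Allocate a block, copy the text in, and publish it as immutable.
    static string make(const(char)[] s) @trusted
    {
        auto block = cast(ubyte*) malloc(size_t.sizeof + s.length);
        assert(block !is null);
        *cast(size_t*) block = 1;
        auto payload = cast(char*) (block + size_t.sizeof);
        if (s.length) memcpy(payload, s.ptr, s.length);
        return cast(string) payload[0 .. s.length];
    }

    // The cast in question: step back from the immutable payload to the
    // mutable counter word stored just in front of it.
    private static size_t* counter(string s) @trusted
    {
        return cast(size_t*) (cast(ubyte*) s.ptr - size_t.sizeof);
    }

    static void incRef(string s) { ++*counter(s); }

    static size_t refs(string s) { return *counter(s); }

    // Drop one reference; free the whole block when the count hits zero.
    static void decRef(string s)
    {
        auto c = counter(s);
        if (--*c == 0) free(c);
    }
}

void main()
{
    string s = HeaderAllocator.make("abc");
    assert(s == "abc");
    HeaderAllocator.incRef(s);
    assert(HeaderAllocator.refs(s) == 2);
    HeaderAllocator.decRef(s);
    assert(HeaderAllocator.refs(s) == 1);
    HeaderAllocator.decRef(s);       // block is freed here
}
```

Client code such as a string type never touches the counter directly in 
this sketch; it only calls `incRef` and `decRef` on an immutable slice.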


Andrei

