[Dlang-study] [rcstring] Defining rcstring
Andrei Alexandrescu
andrei at erdani.com
Sat Feb 6 15:47:56 PST 2016
On 02/06/2016 04:51 PM, Михаил Страшун wrote:
> I am sorry that I keep arguing about things that may look unimportant
> to you but so far it looks like a major effort will go into designing a
> thing that will only be helpful in a few situations and won't change
> overall situation much.
Of course. Sadly it may be the case that I don't have solutions for such
things, not that I don't find them important.
>> Characters in a string may be modifiable by means of opAssign,
>> opOpAssign etc.
>
> Makes sense. And just to be extra sure - opSliceAssign is planned to be
> allowed too, right? If yes, that should be enough for most scenarios.
That should be easy to support.
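A minimal sketch of what element and slice assignment could look like. The type name RCStringSketch, its GC-backed payload, and the view() helper are assumptions made for illustration; none of this is the proposed rcstring API, and reference counting is left out entirely.

```d
// Hypothetical stand-in for rcstring; only the assignment operators are sketched.
struct RCStringSketch
{
    private char[] payload; // GC-allocated here purely to keep the sketch short

    this(const(char)[] s) { payload = s.dup; }

    // Supports s[i] = c
    void opIndexAssign(char c, size_t i) { payload[i] = c; }

    // Supports s[a .. b] = c
    void opSliceAssign(char c, size_t a, size_t b) { payload[a .. b] = c; }

    // Read-only view of the contents (hypothetical helper)
    const(char)[] view() const { return payload; }
}

void main()
{
    auto s = RCStringSketch("hello");
    s[0] = 'j';        // element assignment
    s[1 .. 3] = 'x';   // slice assignment fills positions 1 and 2
    assert(s.view == "jxxlo");
}
```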
> Interaction with C bindings may become complicated though, do you have
> any vision about it?
A @system method offering access to the underlying buffer.
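As a rough sketch of that idea, under the assumption that the payload is kept zero-terminated: the names RCStringSketch and unsafeCStr are hypothetical, and the buffer is GC-allocated here only to keep the example short.

```d
import core.stdc.string : strlen;

// Hypothetical stand-in for rcstring; only the buffer access is sketched.
struct RCStringSketch
{
    private char[] payload; // kept zero-terminated in this sketch

    this(const(char)[] s)
    {
        payload = new char[s.length + 1];
        payload[0 .. s.length] = s[];
        payload[s.length] = '\0';
    }

    // @system: the caller is responsible for not keeping the pointer
    // beyond the lifetime of the string it came from.
    const(char)* unsafeCStr() const @system
    {
        return payload.ptr;
    }
}

void main()
{
    auto s = RCStringSketch("hello");
    // Hand the raw pointer to a C function.
    assert(strlen(s.unsafeCStr()) == 5);
}
```

Marking the method @system keeps it out of @safe code, so the C interop cost is paid only where the caller explicitly opts in.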
>>> When it comes to encoding, there is also the issue of how lacking the
>>> current support of non-UTF encodings in Phobos is.
>>
>> D uses UTF for strings. Vivid anecdotes aside, we really can't be
>> everything to everyone. Your friend could have written a translator to
>> UTF in a few lines. The DNA optimization points at performance bugs in
>> Phobos that as far as I know have been fixed or are fixable by rote. I
>> think this non-UTF requirement would just stretch things too far and
>> smacks of solving the wrong problem.
>
> From a purely technical point of view you are perfectly right. But does
> that make the fact that potential users leave disappointed any better?
What greener pastures do they leave for? We should take a page from the
languages that support multiple encodings seamlessly.
>> Const is a non-modifiable view on data that may otherwise be mutable.
>
> Again, I'd like to read confirmation from Walter on this because I
> recall different statements from him in the past on this topic.
Yeah, he's on board.
> Also my
> own experience of trying to use const in such manner (== effectively
> logical const, like in C++) is rather bad and if it was the intention,
> it feels like major design PITA that is much too intrusive for declared
> goals. Physical immutability guarantees add at least some justification
> for it being that demanding.
We don't know how to make things work otherwise.
> You may call me paranoid but I was thinking about this :)
>
> void foo ( )
> in
> {
> // compiler currently qualifies "this" inside a contract with const
> // which gives guarantees that enabling/disabling contracts has no
> // (accidental) effect on class/struct semantics. If passing
> // const "this" to a function may actually change a refcount, it
> // may add to a contract impact in a subtle way
> bar(this);
> }
> body
> {
I'm not sure what to say about this. We'll cross that bridge when we
come to it.
>> Initially no sharing will be allowed. Following the initial
>> implementation we may add implementation for the "shared" qualifier for
>> rcstring.
>
> That is one of the main decision topics for any string replacement. If
> sharing support is not even supposed to be discussed, what is the point
> of the case study?
We're free to discuss it. All I said is I don't know how to do it. So
the logical thing to do is explicitly not support it; when we do know
how to do it, the addition can be made without breaking any code.
> I have seen plenty of successful thread-local implementations of
> reference counted strings. It is multi-threading that makes things
> complicated - and committing to a new standard design which does not plan
> for sharing from the very beginning is a good way to ensure it will not
> be usable in such a way.
The design is not "not planned" for sharing. The plan is to explicitly
not do it for now. "We will drive around Stuttgart" is not "We did not
plan for Stuttgart".
If any part of the API might impede future expansion into sharing
semantics, by all means the point needs to be raised. Generally it is
known where matters lie - reference count updates. As long as we don't
expose some gnarly details to the user we should be safe for future
extension.
>> We can't deliver two contradictory guarantees at the same time.
>
> I know what immutability is in D, but that doesn't really answer my
> question :) Right now I am aware of two truly scaling approaches to
> sharing in D:
>
> - `@safe Unique!T` which allows multi-threaded ownership transfer (not
> actually supported by Phobos yet, but all prerequisites seem to be there)
> - immutable (both directly and by making immutable copy from mutable
> data) + atomics
>
> Anything that involves locking a mutex on method calls (like it tends to
> happen with all straightforward shared RC implementations) destroys
> performance so hard it is hardly even considered an option these days.
>
> So considering you are willing to abandon immutability and uniqueness
> support still has a long way to go, what does remain?
There is no abandoning of immutability.
> Will the new
> "standard" string type be incapable of lock-free sharing?
There is one standard string type today that is shareable lock-free and
uses the garbage collector.
> On a related topic:
>
> Why do you completely discard external reference counting approach (i.e.
> storing refcount in GC/allocator internal data structures bound to
> allocated memory blocks)? Is there any paper explaining pitfalls of such
> concept?
One reason for creating this forum was to have a smaller confined circle
for design discussions outside the corrosive atmosphere of the forum. A
place where intellectual discussion, careful consideration, and pushing
forward the state of affairs prevail over having the louder voice,
demonstrating competence, or winning arguments. In this smaller circle,
I kindly but firmly invite everyone to steer clear of assumptions on the
state of mind of other participants such as "things that look
unimportant to you", "you are willing to abandon immutability", or "you
completely discard". These do little else than put the other on the
defensive and derail the discussion. Now, instead of furthering the
technical argument, they need to explain that in fact no, they don't
want to do these things. Thank you.
Also: I am not using the Socratic method here. If I put forward an idea,
design etc. that has shortcomings it simply means I don't know how to do
better. Therefore, pointing out the shortcomings will not push things
forward; too many of those and we're back to stalemate. The best way is
to propose better ideas.
Storing refcounts separately in the allocator is definitely possible but
my understanding is it just moves the problem elsewhere. The "poison
cast" that takes immutability away from the reference counter in order
to manipulate it moves from rcstring's code to the allocator. I see an
upside to that - we could move the allocator to "the language" and
guarantee things about it (such as it can cast immutability away). What
other advantages of the scheme do you see?
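
For concreteness, a sketch of the kind of cast being discussed, placed on the allocator side. The Block header layout and the retain function are hypothetical. Under the current language rules, mutating data through a reference whose immutability has been cast away is undefined behavior, which is why this would need a guarantee from the language rather than being ordinary library code.

```d
// Hypothetical allocator-side header holding the reference count.
struct Block
{
    size_t refs;
}

// Sketch only: the header is reached through an immutable reference and
// the qualifier is stripped in order to bump the count.
void retain(immutable(Block)* b) @system
{
    auto m = cast(Block*) b; // the "poison cast"; undefined behavior today
    ++m.refs;
}

void main()
{
    auto blk = new Block(1);
    // Pretend the block was handed out as immutable, then add a reference.
    retain(cast(immutable(Block)*) blk);
}
```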
Andrei