Our Sister
ZombineDev via Digitalmars-d
digitalmars-d at puremagic.com
Sat May 28 03:35:49 PDT 2016
On Saturday, 28 May 2016 at 09:43:41 UTC, ZombineDev wrote:
> On Thursday, 26 May 2016 at 16:11:22 UTC, Andrei Alexandrescu
> wrote:
>> I've been working on RCStr (endearingly pronounced "Our
>> Sister"), D's up-and-coming reference counted string type. The
>> goals are:
>
> <Slightly off-topic>
>
> RCStr may be an easier first step, but I think generic dynamic
> arrays are more interesting, because they are more generally
> applicable and user types like move-only resources make them a
> more challenging problem to solve.
>
> BTW, what happened to scope? Generally speaking, I'm not a fan
> of Rust, and I know that you think that D needs to
> differentiate, but I like their borrowing model for several
> reasons:
> a) while not 100% safe and quite verbose, it offers enough
> improvements over @safe D to make it a worthwhile upgrade, if
> you don't care about any other language features
> b) it's not that hard to grasp / almost natural for people
> familiar with C++11's copy (shared_ptr) and move (unique_ptr)
> semantics.
> c) it's general enough that it can be applied to areas like
> iterator invalidation, thread synchronization and other logic
> bugs, as some third-party Rust packages demonstrate.
>
> I think that improving escape analysis with the scope attribute
> can go a long way to closing the gap between Rust and D in
> that area.
>
> The other elephant(s) in the room are nested contexts like
> delegates, nested structs and some alias template parameter
> arguments. These are especially bad because the user has zero
> control over those GC allocations, which makes some of D's key
> features unusable in @nogc contexts.
> <End off-topic>
>
>>
>> * Reference counted, shouldn't leak if all instances are
>> destroyed; even if not, use the GC as a last-resort
>> reclamation mechanism.
>>
>> * Entirely @safe.
>>
>> * Support UTF 100% by means of RCStr!char, RCStr!wchar etc.
>> but also raw manipulation and custom encodings via
>> RCStr!ubyte, RCStr!ushort etc.
>>
>> * Support several views of the same string, e.g. given s of
>> type RCStr!char, it can be iterated byte-wise, code
>> point-wise, code unit-wise etc. by using s.by!ubyte,
>> s.by!char, s.by!dchar etc.
>>
>> * Support const and immutable qualifiers for the character
>> type.
>>
>> * Work well with const and immutable when they qualify the
>> entire RCStr type.
>>
>> * Fast: use the small string optimization and various other
>> layout and algorithm optimizations to make it a good choice
>> for high-performance strings.
>>
>> RFC: what primitives should RCStr have?
>>
>>
>> Thanks,
>>
>> Andrei
>
> 0) (Prerequisite) Composition/interaction with language
> features/user types - RCStr in nested contexts (alias template
> parameters, delegates, nested structs/classes), array of
> RCStr-s, RCStr as a struct/class member, RCStr passed as
> (const) ref parameter, etc. should correctly increase/decrease
> ref count. This is also a prerequisite for safe RefCounted!T.
> Action item: related compiler bugs should be prioritized, e.g.
> the RAII bug from Shachar Shemesh's lightning talk:
> http://forum.dlang.org/post/n8algm$qra$1@digitalmars.com.
> See also:
> https://issues.dlang.org/buglist.cgi?quicksearch=raii&list_id=208631
> https://issues.dlang.org/buglist.cgi?quicksearch=destructor&list_id=208632
> (not everything in those lists is related but there are some
> nasty ones, like bad RVO codegen).
>
> 1) Safe slicing
>
> 2) shared overloads of member functions (e.g. for stuff like
> atomic incRef/decRef)
>
> 3) Concatenation (RCStr ~= RCStr ~ RCStr ~ char)
>
> 4) (Optional) Reserving (pre-allocating capacity) / shrinking.
> I labeled this feature request as optional, as it's not clear
> if RCStr is more like a container, or more like a slice/range.
>
> 5) Some sort of optimization for zero-terminated strings. Quite
> often one needs to interact with C APIs, which requires calling
> toStringz / toUTFz, which causes unnecessary allocations. It
> would be great if RCStr could efficiently handle this scenario.
>
> 6) !!! Not really a primitive, but we need to make sure that
> applying a chain of range transformations won't break ownership
> (e.g. leak or free prematurely).
>
> 7) Should be able to replace GC usage in transient ranges such
> as File.byLine.
>
> 8) Cheap initialization/assignment from string literals -
> should be roughly the same as either initializing a static
> character array (if the small string optimization is used) or
> just making it point to read-only memory in the data segment of
> the executable. It shouldn't try to write or free such memory.
> When initialized from a string literal, RCStr should also offer
> a null-terminating byte, provided that it points to the whole
> literal.
> If one wants to assign a string literal by overwriting parts of
> the already allocated storage, std.algorithm.mutation.copy
> should be used instead.
>
> There may be other important primitives which I haven't thought
> of, but generally we should try to leverage std.algorithm,
> std.range, std.string and std.uni for them, via UFCS.
>
> ----------
>
> On a related note, I know that you want to use AffixAllocator
> for reference counting, and I think it's a great idea. I have
> one question, which wasn't answered during that discussion:
>
> // Use a nightly build to compile
> import core.thread : Thread, thread_joinAll;
> import std.range : iota;
> import std.experimental.allocator : makeArray;
> import std.experimental.allocator.building_blocks.region :
>     InSituRegion;
> import std.experimental.allocator.building_blocks.affix_allocator :
>     AffixAllocator;
>
> AffixAllocator!(InSituRegion!(4096), uint) tlsAllocator;
>
> static assert (tlsAllocator.sizeof >= 4096);
>
> import std.stdio;
> void main()
> {
>     shared(int)[] myArray;
>
>     foreach (i; 0 .. 100)
>     {
>         new Thread(
>         {
>             if (i != 0) return;
>
>             myArray = tlsAllocator.makeArray!(shared int)(100.iota);
>             static assert(is(typeof(&tlsAllocator.prefix(myArray)) ==
>                 shared(uint)*));
>             writefln("At %x: %s", myArray.ptr, myArray);
>
>         }).start();
>
>         thread_joinAll();
>     }
>
>     writeln(myArray); // prints garbage!!!
> }
>
> So my question is: should it be possible to share thread-local
> data like this?
> IMO, the current allocator design opens a serious hole in the
> type system, because it allows using data allocated from
> another thread's thread-local storage. After the other thread
> exits, accessing memory allocated from its TLS should not be
> possible, but https://github.com/dlang/phobos/pull/3991 clearly
> allows that.
>
> One should be able to allocate shared memory only from shared
> allocators. And shared allocators must be backed by shared parent
> allocators or shared underlying storage. In this case the
> Region allocator should be shared, and must be backed by shared
> memory, Mallocator, or something in that vein.
Here's another case where the last change to AffixAllocator is
really dangerous:
// Same imports and tlsAllocator declaration as in the example above
void main()
{
    immutable(int)[] myArray;

    foreach (i; 0 .. 100)
    {
        new Thread(
        {
            if (i != 0) return;

            myArray = tlsAllocator.makeArray!(immutable int)(100.iota);
            writeln(myArray); // prints [0, ..., 99]
        }).start();

        thread_joinAll();
    }

    writeln(myArray); // prints garbage
}
In this case it severely violates the promise of immutable.
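
For readers less familiar with D's threading model, here is a minimal
sketch of why tlsAllocator in the examples above is a different object
in every thread. It is an illustration only, not code from the PR;
tlsCounter and valueSeenByNewThread are hypothetical names. In D,
module-level variables are thread-local by default, so a new thread
starts with its own freshly initialized copy:

```d
import core.thread : Thread;

// Hypothetical example variable. Module-level variables in D are
// thread-local unless marked shared or __gshared.
int tlsCounter;

// Hypothetical helper: returns the value of tlsCounter as observed
// from inside a newly started thread.
int valueSeenByNewThread()
{
    int seen = -1;
    auto t = new Thread({ seen = tlsCounter; });
    t.start();
    t.join();
    return seen;
}

void main()
{
    tlsCounter = 1;
    // The new thread reads its own copy, initialized to int.init.
    assert(valueSeenByNewThread() == 0);
    // The main thread's copy is unaffected.
    assert(tlsCounter == 1);
}
```

An InSituRegion keeps its buffer inside the allocator object itself,
so with a thread-local allocator the memory it hands out belongs to
the allocating thread's TLS, which is the lifetime problem described
above.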