Is mimicking a reference type with a struct reliable?
Denis Koroskin
2korden at gmail.com
Sat Oct 16 09:59:46 PDT 2010
Sorry, I misclicked a button and sent the message prematurely.
On Sat, 16 Oct 2010 20:16:40 +0400, Steven Schveighoffer
<schveiguy at yahoo.com> wrote:
>
> A final option is to disable the copy constructor of such an unsafe
> appender, but then you couldn't pass it around.
>
> What do you think? If you think it's worth having, suggest it on the
> phobos mailing list, and we'll discuss.
>
It's still possible to pass it by reference, or even by pointer. You know,
that's what you actually do right now: you are passing a Data* (a pointer
to the internal state, wrapped in an Appender struct).
Passing by pointer might actually be a good idea (because you can default
it to null). One of the reasons I use "T[] buffer = null" as a buffer is
that you aren't forced to provide one; null is also a valid buffer. Many
functions would benefit from accepting an optional Appender (e.g. converting
from utf8 to utf16), but we shouldn't force callers to provide one.
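A minimal sketch of the optional-Appender idea, assuming a hypothetical helper named `twice` (it is not part of Phobos). The caller may pass a pointer to its own Appender; when it passes nothing, the function falls back to a local one:

```d
import std.array : Appender, appender;

// Hypothetical helper: appends `s` twice, either into the caller's
// Appender (if one is supplied) or into a local one.
string twice(string s, Appender!string* sink = null)
{
    Appender!string local;
    if (sink is null)
        sink = &local;      // nothing supplied: use the local Appender
    sink.put(s);
    sink.put(s);
    return sink.data;
}

void main()
{
    // No Appender supplied.
    assert(twice("ab") == "abab");

    // Caller-supplied Appender that already holds some data.
    auto app = appender!string();
    app.put("x");
    assert(twice("ab", &app) == "xabab");
}
```

Because the parameter is a pointer, the callee always writes into the caller's object, so the copy problem discussed in this thread does not arise here.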
> Note that Appender is supposed to be fast at *appending* not
> initializing itself. In that respect, it's very fast.
>
This makes it useless for appending small amounts of data.
>> I'm not sure it's worth the trade-off, and as such I defined and use
>> my own set of primitives that don't allocate when a buffer is provided:
>>
>> void put(T)(ref T[] array, ref size_t offset, const(T) value)
>> {
>>     ensureCapacity(array, offset + 1);
>>     array[offset++] = value;
>> }
>>
>> void put(T)(ref T[] array, ref size_t offset, const(T)[] value)
>> {
>>     // Same but for an array
>> }
>>
>> void ensureCapacity(T)(ref T[] array, size_t minCapacity)
>> {
>>     // ...
>> }
>
> I'm not sure what ensureCapacity does, but if it does what I think it
> does (use the capacity property of arrays), it's probably slower than
> Appender, which has a dedicated variable for capacity.
>
>> Back to my original question, can we mimic reference behavior with a
>> struct? I thought "why not" until I hit this bug:
>>
>> import std.array;
>> import std.stdio;
>>
>> void append(Appender!(string) a, string s)
>> {
>>     a.put(s);
>> }
>>
>> void main()
>> {
>>     Appender!(string) a;
>>     string s = "test";
>>
>>     append(a, s); // <
>>
>>     writeln(a.data);
>> }
>>
>> I'm passing an appender by value since it's supposed to have a
>> reference type behavior and passing 4 bytes by reference is an overkill.
>>
>> However, the code above doesn't work for a simple reason: structs lack
>> default ctors. As such, an appender is initialized to null internally.
>> When I call append, a copy of it gets initialized (lazily), but the
>> original one remains unchanged. Note that if you append to the appender
>> at least once before passing it by value, it will work. But that's sad.
>> Not only does it allocate when it shouldn't, I also have to initialize
>> it explicitly!
>>
>> I think a far better solution would be to make it non-copyable.
>>
>> TL;DR Mimicking reference semantics with a struct without default ctors
>> is unreliable since you must initialize your object lazily. Moreover,
>> you have to check whether your struct is initialized on every single
>> function call, and that's error prone and bad for code clarity and
>> performance. I'm opposed to that practice.
>
> This is a point I've brought up before. As of yet there is no
> solution. There have been a couple of ideas passed around, but there
> hasn't been anything decided. The one idea I remember (but didn't
> really like) is to have the copy constructor be able to modify the
> original. This makes it possible to allocate the underlying
> implementation in Appender for example, even on the data being passed.
> There are lots of problems with this solution, and I don't think it got
> much traction.
>
> I think the default constructor solution is probably never going to
> happen. It's very nice to always have a default fast way to initialize
> structs, and there is precedent (C# has the same rule).
>
I think there is a solution, but it goes far beyond the default ctor
problem (it solves many other issues, too).
Currently, a struct is initialized with T.init/T.classinfo.init.
Pros:
    simple initialization: malloc, followed by memcpy
    there is always an immutable instance of an object in memory, and you
        can use it as the default/uninitialized state
Cons:
    you can't initialize class/struct variables with runtime values
    increased file size (every single class/struct now has a copy of its own)
In Java, they use another approach. Instead of memcpy'ing T.init on top of
the allocated memory, they invoke a so-called cctor (as opposed to a ctor).
This is a method that initializes the memory so that a ctor can be called.
memcpy'ing T.init follows the same idea, but it is not moved into a
separate method. In general, a cctor can be implemented the way D does it
today without sacrificing anything. However, a type-specific method is a
lot better than that:
1) Most structs initialize all of their members with 0. For these, the
compiler can use memset instead.
2) The killer feature, in my opinion: it allows initializing members with
non-constant expressions:
class Foo
{
    ubyte[] buffer = new ubyte[BUFFER_SIZE];
}
This also solves an Appender issue:
struct Appender
{
    Data* data = new Data();
}
3) it allows getting rid of T.init, significantly reducing resulting file
size
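For contrast with point 2), here is a minimal sketch of what a reference-like struct has to do without such initializers. `RefLike` and `Data` are hypothetical stand-ins for illustration, not the real Appender internals. The state pointer starts out null, every method has to check it, and a copy made before the first `put` does not share state with the original:

```d
// Hypothetical stand-ins; not the actual Phobos types.
struct Data { int[] items; }

struct RefLike
{
    private Data* data;     // null in RefLike.init: no default ctor to set it

    void put(int x)
    {
        if (data is null)   // lazy-initialization check on every call
            data = new Data;
        data.items ~= x;
    }

    size_t length() const
    {
        return data is null ? 0 : data.items.length;
    }
}

// Takes its argument by value, i.e. works on a copy of the struct.
void append(RefLike r, int x) { r.put(x); }

void main()
{
    RefLike a;
    append(a, 1);           // only the copy allocates; a.data stays null
    assert(a.length == 0);

    RefLike b;
    b.put(1);               // forces the allocation before copying
    append(b, 2);           // the copy now shares the same Data
    assert(b.length == 2);
}
```

The first case is the silent failure described earlier in the thread; the second only works because the struct was explicitly used before being copied.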
I'm not sure Walter will agree to such a radical change, but it can be
achieved in small steps. D doesn't even have to get rid of T.init, it can
still be there (but I'd like to get rid of it eventually)
a) Keep T.init/T.classinfo.init, and introduce a compiler-generated cctor
that memcpy's T.init over the object
(Optionally) Make the cctor smarter, and generate proper class/struct
initialization code that doesn't rely on T.init
b) Allow non-constant expressions as initializers and initialize such
members in the cctor
(Optionally) Get rid of T.init altogether
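Step a) can be approximated by hand to show what is meant. This is only a sketch: the function name `cctor` and the `Point` struct are assumptions for illustration, and the compiler generates nothing like this today. The function brings raw memory into the default state by copying T.init over it, after which a regular ctor could run:

```d
import core.stdc.stdlib : malloc, free;
import core.stdc.string : memcpy;

struct Point { int x = 3; int y; }

// Hypothetical hand-written "cctor": copies T.init over raw memory.
void cctor(T)(T* mem)
{
    T initial = T.init;
    memcpy(mem, &initial, T.sizeof);
}

void main()
{
    // Raw, uninitialized memory for one Point.
    auto p = cast(Point*) malloc(Point.sizeof);
    scope(exit) free(p);

    cctor(p);               // now in the same state as `Point p;`
    assert(p.x == 3 && p.y == 0);
}
```

A smarter, type-specific version would write the member values directly (or use memset for all-zero types) instead of copying from T.init.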
> My suggestion would be to have it be an actual reference type -- i.e. a
> class. I don't see any issues with that. In that respect, you could
> even have it be stack-allocated, since you have emplace. But I don't
> have a say in that. I was the last one to update Appender, since it had
> a bug-ridden design and needed to be fixed, but I tried to change as
> little as possible.
>
> -Steve