What is the case against a struct post-blit default constructor?

Wed Oct 10 12:11:29 PDT 2012

Thank you for explaining.
See comments inline.

On Wednesday, 10 October 2012 at 18:12:13 UTC, Jonathan M Davis 
wrote:
> On Wednesday, October 10, 2012 13:40:06 foobar wrote:
>> Can you please elaborate on where the .init property is being
>> relied on? This is an aspect of D I don't really understand.
>> What's the difference between a no-arg ctor and one with args 
>> in
>> relation to this requirement?
>
> init is used anywhere and everywhere that an instance of a type 
> needs to be
> default-initialized. _One_ of those places is a local variable. 
> Not all places
> where an instance of an object needs to be created can be 
> initialized by the
> programmer. The prime example of this would be arrays. If you 
> declare
>
> auto i = new int[](5);
>
> or
>
> int[12] s;
>
> all of the elements in the array need to be initialized or 
> they'll be garbage,
> and there's no way for the programmer to indicate what values 
> they should be.
> The whole point of init is to avoid having variables ever be 
> garbage without
> the programmer explicitly asking for it. And having a default 
> constructor
> wouldn't help one whit with arrays, because the values of their 
> elements must
> be known at compile time (otherwise you couldn't directly 
> initialize member
> variables or static variables or anything else which requires a 
> value at
> compile time with an array). With init, the compiler can take 
> advantage of the
> fact that it knows the init value at compile time to 
> efficiently initialize the
> array.
>
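
For reference, a minimal sketch of the array behaviour described in the quote above. The Point struct and the makePoints helper are hypothetical names introduced only for illustration.

```d
import std.math : isNaN;

struct Point
{
    int x = 1; // member initializer, becomes part of Point.init
    int y;     // no initializer, so it takes int.init (0)
}

// Hypothetical helper: allocates n Points without naming any value.
Point[] makePoints(size_t n)
{
    return new Point[](n); // each element is a copy of Point.init
}

void main()
{
    auto i = new int[](5);
    int[12] s;
    foreach (e; i) assert(e == int.init); // int.init is 0
    foreach (e; s) assert(e == 0);

    double[3] d;
    foreach (e; d) assert(isNaN(e)); // double.init is NaN

    foreach (p; makePoints(4)) assert(p.x == 1 && p.y == 0);

    // The init value is available at compile time.
    static assert(Point.init.x == 1);
}
```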

I understand the idea of default initialization. I was more 
interested in the machinery and implementation details :) So 
let's dive into those details:
Arrays - without changing existing syntax we can use these 
semantics:

auto a = new int[](5); // compiler calls T() for each instance
int[12] b; // ditto

This would be the same as in C++. We could also expand the syntax and 
allow:
auto c = new int[](5, 9); // init all instances to 9
// initialize each member via a function call:
auto d = new int[](5, int (int index) { return index; });
This can be generalized to multiple dimensions.
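
The extended `new` syntax above is a proposal, not existing D. As a rough sketch, the same two effects can be had today with slice assignment and std.range.iota; the helper names filled and byIndex are made up for this example.

```d
import std.array : array;
import std.range : iota;

// Hypothetical helper: n elements, all set to value.
int[] filled(size_t n, int value)
{
    auto a = new int[](n); // elements start as int.init
    a[] = value;           // slice assignment overwrites every element
    return a;
}

// Hypothetical helper: element i holds the value i.
int[] byIndex(int n)
{
    return iota(0, n).array;
}

void main()
{
    assert(filled(5, 9) == [9, 9, 9, 9, 9]);
    assert(byIndex(5) == [0, 1, 2, 3, 4]);
}
```

Note that filled still pays for the default initialization before overwriting, which is the cost the proposed syntax would avoid.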

> But even constructing objects sanely relies on init. All 
> user-defined objects
> are fully initialized to what their member variables are 
> directly initialized
> to before their constructors are even called. In the case of a 
> struct, that's
> the struct's init value. It's not for a class, because you 
> can't have a class
> separate from its reference (so it's the reference which gets 
> the init value),
> but the class still has a state equivalent to a struct's init 
> value, and
> that's the state that it has before any of its constructors are 
> called.

So for classes .init is null, which complicates non-nullable 
classes. It seems the "solution" (more like a hack IMO) of 
@disable _breaks_ the .init guarantee in the language.
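
A small sketch of what this looks like in practice, assuming a struct that disables default construction. NoDefault and Widget are hypothetical types used only here.

```d
struct NoDefault
{
    int value;
    @disable this();           // forbids `NoDefault n;`
    this(int v) { value = v; }
}

class Widget
{
    int id = 3;
}

void main()
{
    // Declaring a variable without an initializer is rejected.
    static assert(!__traits(compiles, { NoDefault n; }));

    auto ok = NoDefault(5);
    assert(ok.value == 5);

    // The init value still exists and can be named explicitly.
    NoDefault viaInit = NoDefault.init;
    assert(viaInit.value == 0);

    // For a class, the variable is a reference and its init is null.
    Widget w;
    assert(w is null);
    assert(Widget.init is null);
}
```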

>
> If it weren't for that, you'd get the insanity that C++ or Java 
> have with
> regards to the state of objects prior to construction. C++ is 
> particularly bad
> in that each derived class is created in turn, meaning that 
> when a constructor
> is called, the object _is_ that class rather than the derived 
> class that
> you're ultimately constructing (which means that things can go 
> horribly wrong
> if you're stupid enough to call a virtual function from a 
> constructor in C++).
> I believe that Java handles that somewhat better, but it gets 
> bizarre ordering
> issues with regards to initializing member variables that cause 
> problems if
> you try and alter member variables from base classes inside of 
> a derived
> constructor. With D, the object is guaranteed to be in a sane 
> state prior to
> construction.
>
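
To make the quoted guarantee concrete, here is a minimal sketch, assuming a base class that calls a virtual method from its constructor. Base, Derived, hook and limitSeenDuringBaseCtor are hypothetical names.

```d
class Base
{
    this() { hook(); } // virtual call made during construction
    void hook() {}
}

class Derived : Base
{
    int limit = 8;      // direct initializer
    int seenLimit = -1;

    this()
    {
        super();
        limit = 20;     // runs after Base's constructor
    }

    override void hook()
    {
        // Called from Base's constructor, before the body of
        // Derived's constructor has assigned anything.
        seenLimit = limit;
    }
}

// Hypothetical helper: reports what hook() observed.
int limitSeenDuringBaseCtor()
{
    return (new Derived).seenLimit;
}

void main()
{
    auto d = new Derived;
    assert(d.seenLimit == 8); // the initializer value, not garbage
    assert(d.limit == 20);
}
```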

C++ is insanely bad here mainly due to [virtual?] MI, which 
doesn't affect D. Java _allows_ virtual method calls in 
constructors, which I think is also "fixed" in the latest C++ 
standard. I don't know about the ordering problems you mention, 
but AFAIK the complication arises with MI, not default 
initialization. It's just a matter of properly defining the 
inheritance semantics.

> And without init, even if every place that an object is 
> instantiated could be
> directly initialized by the programmer (which it can't), then 
> you would either
> end up with garbage every time that a variable isn't directly 
> initialized, or
> you'd have to directly initialize them all. In order for D's 
> construction
> model to work, this would include directly initializing _all_ 
> member variables
> even if the constructor then set them to something else (which 
> would actually
> cause problems with const and immutable). And that would get 
> _very_ annoying,
> even if it would be preferable for the local variable to 
> require explicit
> initialization.

You talk about:
class C {
immutable T val; // what to do here?
this() { ... }
}

This can be solved by either requiring a ctor call at # (calling 
T() if none is specified), or by requiring the init to happen in 
the ctor, a la C++ semantics.

>
> Another case where init is required is out parameters. All out 
> parameters are
> set to their init value when the function is called in order to 
> avoid bugs
> caused by reading the value of an out parameter before it's set 
> within the
> function. That wouldn't work at all without init.
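
A short sketch of the out-parameter behaviour described in the quote above. The function readBeforeWrite is a hypothetical example, not library code.

```d
// Hypothetical function: reads the out parameter before assigning it.
int readBeforeWrite(out int val)
{
    int seen = val; // val was reset to int.init on entry
    val = 10;
    return seen;
}

void main()
{
    int x = 99;
    int seen = readBeforeWrite(x);
    assert(seen == 0); // the caller's 99 was replaced by int.init
    assert(x == 10);
}
```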

Personally, I'd just remove this feature from the language; 
tuples are a far better design for returning multiple values, and 
even with this feature intact, we could always use the default 
no-arg constructor.
E.g.
void foo(out T val);
becomes:
void foo(out T val = T());

>
> One of the more annoying AA bugs makes it so that if the foo 
> function in this
> code
>
> aa[5] = foo();
>
> throws, then aa[5] gets set with an init value of the element 
> type. While this
> clearly shouldn't happen, imagine how much worse it would be if 
> we didn't have
> init, and that element got set to garbage?
>

I don't get this example. If foo throws, then the calling code 
will get control. How would you ever get to read that garbage in 
aa[5]? The surrounding try/catch block should take care of this 
explicitly anyway.

E.g.
try {
  aa[5] = foo(); // foo throws
  // ## do something with aa[5], this won't happen
} catch (Exception e) {
  // Please handle aa[5] here explicitly.
}
// @@ do something with aa[5], works due to the explicit fix in
// the catch.
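
A runnable sketch of that pattern, assuming "handling aa[5] explicitly" means removing the key in the catch block. foo and assignOrClean are hypothetical names; whether the key was inserted before the throw depends on the compiler version, so the cleanup is written to work either way.

```d
// Hypothetical function that always fails.
int foo()
{
    throw new Exception("foo failed");
}

// Hypothetical helper: tries the assignment and cleans up on failure.
bool assignOrClean(ref int[int] aa)
{
    try
    {
        aa[5] = foo(); // foo throws
        return true;
    }
    catch (Exception e)
    {
        aa.remove(5);  // harmless if the key was never inserted
        return false;
    }
}

void main()
{
    int[int] aa = [1: 1];
    assert(!assignOrClean(aa));
    assert(5 !in aa);  // no stray entry is left behind
    assert(aa[1] == 1);
}
```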

> There are probably other cases that I can't think of right now 
> where init gets
> used - probably in the runtime if nowhere else. Every place 
> that could
> possibly result in a variable being garbage _doesn't_ result in 
> garbage,
> because we have init.
>
> And regardless of what the language does, there are definitely 
> places where the
> standard library takes advantage of init. It uses it a lot for 
> type
> inference, but it also uses it directly in places such as 
> std.algorithm.move.
> Without init, it would end up dealing with garbage values. It's 
> also a
> lifesaver in generic code, because without it, generic code 
> _can't_ initialize
> variables in many cases. Take something like
>
> T t;
>
> if(cond)
> {
>  ...
>  t = getValue();
>  ...
> }
> else
> {
>  ...
>  t = getOtherValue();
>  ...
> }
>
> How on earth could a generic function initialize t without 
> T.init? void?
> That's just begging for bugs when one of the paths doesn't 
> actually set t like
> it's supposed to. It doesn't know anything about the type and 
> therefore
> doesn't know what a reasonable default value would be, so it 
> can't possibly
> initialize t properly.
>
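
The generic case in the quote can be sketched as follows. The templates choose and forgetful are hypothetical; forgetful deliberately omits one assignment to show what a missed path returns.

```d
// Hypothetical generic helper: both paths assign t.
T choose(T)(bool cond, T a, T b)
{
    T t; // starts as T.init, whatever T is
    if (cond)
        t = a;
    else
        t = b;
    return t;
}

// Hypothetical generic helper with a path that never assigns t.
T forgetful(T)(bool cond, T a)
{
    T t; // with `T t = void;` this would hold garbage instead
    if (cond)
        t = a;
    return t; // well-defined even when cond is false
}

void main()
{
    assert(choose(true, 1, 2) == 1);
    assert(choose(false, "a", "b") == "b");

    assert(forgetful(false, 5) == int.init);   // 0
    assert(forgetful(false, "x").length == 0); // string.init is empty
}
```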

Doesn't @disable break those algorithms in Phobos anyway? How 
would that work for non-nullable classes?
To answer the above question, I'd say there's nothing wrong with 
initializing to void. This is what happens anyway, since the 
.init isn't used and the optimizer will optimize it away.

> I can understand preferring that local variables have to be 
> directly
> initialized by the programmer, but it just doesn't scale. 
> Having init is
> _far_ more flexible and far more powerful. Any and every 
> situation that might
> need to initialize a variable can do it. Without init, that 
> just isn't
> possible.
>
> - Jonathan M Davis

Again, thanks for the explanation. I have to say that on a 
general level I have to agree with Don's post and I don't see how 
the .init idiom generally "works" or is useful. I can't see 
anything in the above examples that shows that .init is 
absolutely required and we can't live without it. The only thing 
that worries me here is the reliance of the runtime/phobos on 
.init.