Why are structs and classes so different?

Sun May 15 21:33:24 UTC 2022

On 5/15/22 13:05, Kevin Bailey wrote:

 > I've been programming in C++ full time for 32 years

Hi from an ex-C++'er. :) I managed to become at least a junior expert in 
C++ between 1996 and 2015. I haven't used C++ since then.

I still think my answer is the real one. My implied question remains: 
Why does C++ have the struct and class distinction? I know they have 
different default access specifications but does that warrant two kinds?

I claim there are two types in C++ as well: value types and reference 
types. And types of an inheritance hierarchy are by convention reference 
types. As others reminded on this thread, C++ programmers follow 
guidelines to treat types of hierarchies as reference types.

 > so I'm familiar
 > with slicing. It doesn't look to me like there's a concern here.

Slicing renders types of class hierarchies reference types. They can't 
be value types because nobody wants to pass a Cat sliced as an Animal. 
It's always a programmer error.

All D does is (just like C# did) appreciate the differences between 
these two kinds and utilize existing keywords.
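
A minimal sketch of the two kinds in D. The types Point, Animal, and Cat 
and the function describe are hypothetical names chosen for illustration; 
they are not from this thread:

```d
import std.stdio;

// A value type: assignment copies the whole object.
struct Point {
    int x;
    int y;
}

// Reference types: a variable holds a reference to the object.
class Animal {
    string sound() { return "..."; }
}

class Cat : Animal {
    override string sound() { return "meow"; }
}

// Takes an Animal reference; the object itself is never copied or sliced.
string describe(Animal a) {
    return a.sound();
}

void main() {
    auto p = Point(1, 2);
    auto q = p;          // independent copy
    q.x = 42;
    assert(p.x == 1);    // the original is unaffected

    Animal a = new Cat;  // only the reference is converted to the base type
    assert(describe(a) == "meow");

    auto b = a;          // copies the reference, not the object
    assert(a is b);

    writeln("ok");
}
```

Because a Cat is only ever handled through a reference, there is no 
by-value conversion to Animal that could slice it.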

 > One question is, how should we pass objects - by value or by reference?
 > In C++, you can do either, of course, but you take your chances if you
 > pass by value - both in safety AND PERFORMANCE.

D is very different from C++ when it comes to that topic:

- Since classes are reference types, there is no issue with performance 
whatsoever: It is just a pointer copy behind the scenes.

- Since structs are value types, they can be shallow-copied without any 
concern. (D does not allow a struct to contain a pointer to itself or to 
its own members.) Only when it matters does one write a copy constructor 
or a postblit. (And this happens very rarely.)

- rvalues are moved by default. They don't get copied at all. (Only for 
structs because classes don't have rvalues.)
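
The last two points can be observed with a small sketch. It assumes a 
hypothetical struct Tracked whose postblit counts copies; the helper 
names (make, consume, and the two copiesWhen... functions) are likewise 
made up for this example:

```d
import std.stdio;

struct Tracked {
    static int copies;   // bookkeeping for this example only
    int payload;

    this(this) {         // postblit: runs after an lvalue is copied
        ++copies;
    }
}

Tracked make(int v) {
    return Tracked(v);   // the result is an rvalue at the call site
}

int consume(Tracked t) { // by-value parameter
    return t.payload;
}

// Number of postblit calls when an lvalue is passed by value.
int copiesWhenPassingLvalue() {
    auto t = Tracked(1);
    Tracked.copies = 0;
    consume(t);
    return Tracked.copies;
}

// Number of postblit calls when an rvalue is passed by value.
int copiesWhenPassingRvalue() {
    Tracked.copies = 0;
    consume(make(2));
    return Tracked.copies;
}

void main() {
    assert(copiesWhenPassingLvalue() == 1);  // the lvalue is copied once
    assert(copiesWhenPassingRvalue() == 0);  // the rvalue is moved, not copied
    writeln("ok");
}
```

The postblit here does nothing useful; it exists only to make the copies 
visible. Most structs need neither a postblit nor a copy constructor.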

 > The bottom line is that
 > no one passes by value, even for PODs (although we may return even large
 > objects.)

I know it very well. In reality, nobody should care unless it matters 
semantically: an object should be passed by reference only if the 
programmer wants reference semantics, for example to mutate the object 
or to store a reference to it.

You must be familiar with the following presentation by Herb Sutter on 
how parameter passing is a big problem. (Yet nobody realizes it until a 
speaker like Herb Sutter makes a presentation about it.)

   https://www.youtube.com/watch?v=qx22oxlQmKc&t=923s

Such concerns don't exist in D especially after fixing the "in 
parameters" feature. Semantically, the programmer should say "this is an 
input to this function". The programmer should not be concerned whether 
the number of bytes is over a threshold for that specific CPU or 
whether the copy constructor may be expensive.

D does not have such issues. The programmer can do this:

- Compile with -preview=in

- Mark function parameters as in (the ones that are input):

auto foo(in A a, in B b) {
   // ...
}

The compiler should deal with how to pass parameters. The programmer 
provides the semantics and D follows these rules:

   https://dlang.org/spec/function.html#in-params
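
As a slightly fuller sketch of the same idea, assuming a hypothetical 
Matrix struct and trace function (neither is from this thread). It 
should compile with or without -preview=in; the flag only changes how 
the compiler is allowed to pass the argument:

```d
import std.stdio;

struct Matrix {
    int[16] cells;   // 64 bytes: a struct someone might hesitate to copy
}

// Builds a 4x4 identity matrix and returns it by value.
Matrix identity() {
    Matrix m;
    foreach (i; 0 .. 4)
        m.cells[i * 5] = 1;   // diagonal elements: 0, 5, 10, 15
    return m;
}

// `in` states the intent: `m` is an input and cannot be modified here.
int trace(in Matrix m) {
    // m.cells[0] = 7;   // would not compile: `m` is const
    return m.cells[0] + m.cells[5] + m.cells[10] + m.cells[15];
}

void main() {
    auto m = identity();
    assert(trace(m) == 4);           // lvalue argument
    assert(trace(identity()) == 4);  // rvalue argument
    writeln("ok");
}
```

The call sites look the same for lvalues and rvalues, and the function 
signature says nothing about pointers, references, or sizes.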

Although one of my colleagues advises me not to be negative towards C++, 
having about 20 years of experience with C++, I am confident C++ got 
this wrong and D got this right. D programmers don't write move 
constructors or move assignment. Such concepts don't even exist.

In summary, if a programmer has to think about pass-by-reference, that 
programmer has been conditioned to think that way. It has always been 
wrong. Passing by reference should have been about semantics. (Herb 
Sutter uses the word "intent" in that presentation.)

 > But I asked a different question: Why can't I put a class object on the
 > stack? What's the danger?

There is no danger. One way I like is std.typecons.scoped:

import std.stdio;
import std.typecons;

class C {
   ~this() {
     writeln(__FUNCTION__);
   }
}

void main() {
   {
     auto c = scoped!C();
   }
   writeln("after scope");
}
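
Another option, as far as I know, is the scope storage class on a local 
class variable, which the language spec describes as allocating the 
instance on the stack. A minimal sketch, assuming a hypothetical class 
Resource and a module-level flag that records when the destructor runs:

```d
import std.stdio;

bool destroyed;   // set by the destructor so the timing can be observed

class Resource {
    ~this() {
        destroyed = true;
    }
}

void useOnStack() {
    scope r = new Resource();   // destroyed when this scope ends
    assert(!destroyed);         // still alive inside the scope
}

void main() {
    useOnStack();
    assert(destroyed);   // the destructor ran when useOnStack returned
    writeln("ok");
}
```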

 > Note that operating on that object hasn't changed. If I pass by
 > reference, it's no different than if I had created a reference.

(Off-topic: I always wonder whether pass-by-reference comes with a 
performance cost. After all, the members of a by-reference struct have 
to be accessed through a pointer, right? Shouldn't pass-by-value be 
faster for certain types? I think so, but I never bothered to measure 
the size threshold below which one can confidently pass by value.)

 > One might say, Well, if D creates by value, then it has to pass by
 > value. But it doesn't; it has the 'ref' keyword.

That's only when one wants to pass a reference to an object. I blindly 
pass structs by value. The reason is, I don't think any struct is really 
large enough for the byte copying to be costly. It's just a shallow copy 
and it works. (Note that there are not many copy constructors in D.)
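
To illustrate ref as a statement of intent rather than an optimization, 
here is a minimal sketch with a hypothetical Counter struct and two 
made-up functions:

```d
import std.stdio;

struct Counter {
    int n;
}

// By value: the function works on its own copy.
void bumpCopy(Counter c) {
    ++c.n;
}

// By reference: the caller's object is meant to be mutated.
void bump(ref Counter c) {
    ++c.n;
}

void main() {
    Counter c;
    bumpCopy(c);
    assert(c.n == 0);   // the caller's object is unchanged

    bump(c);
    assert(c.n == 1);   // the caller's object was mutated

    writeln("ok");
}
```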

 > I hope Ali's answer isn't the real reason. I would be sad if D risked
 > seg faults just to make class behavior "consistent".

I don't understand the seg fault either but my answer was to underline 
the fact that D sees two distinct kinds of types: value types and 
reference types. C++ does have reference types as well but they are 
implied by convention. Otherwise the programmer hits the slicing issue.

Ali