Fixing C's Biggest Mistake
Quirin Schroll
qs.il.paperinik at gmail.com
Tue Feb 7 10:57:42 UTC 2023
On Monday, 2 January 2023 at 22:53:30 UTC, Walter Bright wrote:
> On 12/31/2022 2:28 AM, Max Samukha wrote:
>> For types that require runtime construction, initializing to
>> T.init does not result in a constructed object.
> The idea is to:
>
> 1. have construction that cannot fail. This helps avoid things
> like double-fault exceptions
Here, it would help if you clarified what you mean by
_double-fault exceptions_ because I just tried to look it up and
was led to tennis and CPU interrupts.
I have a rough idea of what you mean and could guess, but I would
rather just ask.
Construction that cannot fail sounds nice, and if an aggregate
type constructor can pull it off, it should go for it. But this
sounds a lot like `nothrow` and not something the language should
impose. I know people dislike complicated rules because of their
exceptions (to rules, not `Exception`s), but a possibility could
be that nullary `struct` constructors must either be `nothrow` or
be explicitly annotated `throw`.
> 2. have initializers that can be placed in read only memory
I’m not saying that `init` isn’t a great idea. It’s just that
`init` shouldn’t be used explicitly, but only as “the thing a
constructor must act on to produce a valid object”. A “naked”
`init` may be an object that violates its invariants.
An example would be a string optimized for short values (SSO). It
has (at least) a `pointer` to data, a fixed-size internal
`buffer`, and a `length` with the invariant:
`pointer == &buffer[0]` if and only if `length <= buffer.length`.
An SSO’s `init` has a `null` pointer and so cannot possibly
represent the empty string unless we allow `pointer` to be `null`
to represent it. That means an SSO has two representations for
the empty string. Or we interpret the `null` data pointer as a
`null` string. In any case, we get something we don’t want.
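
To make that concrete, here is a minimal sketch of such a type. The
name `SmallString`, the buffer size and the helper `invariantHolds`
are made up for illustration; a real SSO implementation would look
different. It only shows that the naked `init` breaks the invariant
stated above.

```d
// Hypothetical SSO layout, for illustration only.
struct SmallString
{
    char* pointer;   // points into `buffer` or at heap data
    char[15] buffer; // inline storage for short strings
    size_t length;

    // The invariant from the text, written as a plain function so
    // that it can be evaluated on a naked `init` value.
    bool invariantHolds() const
    {
        return (pointer == buffer.ptr) == (length <= buffer.length);
    }
}

void main()
{
    SmallString s; // default-initialized, i.e. SmallString.init

    assert(s.pointer is null);   // init cannot point into its own buffer
    assert(s.length == 0);       // short, so pointer should be buffer.ptr
    assert(!s.invariantHolds()); // the naked init violates the invariant
}
```

The pointer would have to refer to the object's own `buffer`, and no
compile-time constant in read-only memory can express that.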
> 3. have something to set a destroyed object to, in case of
> dangling references and other bugs
If a NaN state is available, use that. (I don’t think NaN states
are bad; I actually think that every built-in type except `bool`
should have one: signed and unsigned integer types could use
`T.min` and `T.max`, respectively. Setting those to 0 is bad
because in a lot of contexts, 0 is a perfectly reasonable value,
whereas `int.min` and `size_t.max` rarely are.)
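
For reference, this small snippet shows what the built-in types
default to today: floating-point and character types already get an
"obviously wrong" value, integers get 0.

```d
import std.math : isNaN;

void main()
{
    // Floating-point and character types default to an invalid value.
    assert(isNaN(double.init));
    assert(char.init == 0xFF);

    // Integer types default to 0, which is often a legitimate value.
    assert(int.init == 0);
    assert(size_t.init == 0);
}
```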
> 4. present to a constructor an already initialized object. This
> prevents the common C++ problem of adding a field and
> forgetting to construct it in one of the overloaded
> constructors, a problem that has plagued me with erratic
> behavior
The problem is, C++ does not complain about you forgetting that
field. (For other people:) In C++, if a struct field is of
built-in type (e.g. `int`) and you forget to initialize it, it
has an indeterminate value. Fields of class type are
default-constructed via their nullary constructor, and
compilation fails if no nullary constructor exists.
Now, even if there is a nullary constructor, it might not be what
you want.
Requiring initialization of every field in every constructor is
what C++ lacks.
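
For comparison, this is what D does today in that situation. The
type `Point` and its fields are made up for illustration: the
constructor forgets the newly added field `z`, yet `z` has a
well-defined value because the constructor starts from `init`. Note
that the compiler does not complain about the omission either.

```d
struct Point
{
    int x;
    int y;
    int z = 7; // newly added field with an explicit default

    // Written before `z` existed; it never mentions `z`.
    this(int x, int y)
    {
        this.x = x;
        this.y = y;
    }
}

void main()
{
    auto p = Point(1, 2);
    assert(p.x == 1 && p.y == 2);
    assert(p.z == 7); // taken from Point.init, not garbage
}
```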
> 5. provide a NaN state. I know many people don't like NaN
> states, but if one does, the default construction is perfect
> for implementing one.
One question is the penalty for the NaN state. Floating-point NaN
values are supported by hardware. If we declared `int.min` and
`size_t.max` as their respective types’ NaN, we’d probably just
be specifying existing practice. The issue is, making them sticky
incurs costs, because every operation would have to check for
them.
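
A rough sketch of where that cost comes from, using a made-up
wrapper type `NanInt` that treats `int.min` as its NaN. This is not
a proposal for how the language would do it, only an illustration:
every addition needs an extra branch. Overflow is ignored here.

```d
// Hypothetical wrapper, for illustration only.
struct NanInt
{
    int value = int.min; // int.min plays the role of NaN

    bool isNaN() const { return value == int.min; }

    // Sticky addition: a NaN operand yields a NaN result.
    NanInt opBinary(string op : "+")(NanInt rhs) const
    {
        if (isNaN || rhs.isNaN)
            return NanInt.init; // the extra check on every operation
        return NanInt(value + rhs.value);
    }
}

void main()
{
    NanInt a; // default-initialized to the NaN state
    assert(a.isNaN);
    assert((a + NanInt(1)).isNaN);           // NaN propagates
    assert((NanInt(2) + NanInt(3)).value == 5);
}
```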
Floating-point NaN serves two purposes: indicating an invalid
result and propagating the error through the program execution.
Integer min/max values are used for the former already. People
probably wouldn’t like them to do the latter.
Another issue of floating-point NaN values is their weird
comparison behavior. I understand the argument that `x == y`
should be false if `x` and `y` happen to be `NaN`, but
`if (x == double.nan)` being silently always false feels broken.
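
A short demonstration of that pitfall, with `isNaN` from `std.math`
as the check that actually works:

```d
import std.math : isNaN;

void main()
{
    double x = double.nan;

    assert(!(x == double.nan)); // always false, even for a NaN
    assert(x != x);             // NaN is not equal to itself
    assert(isNaN(x));           // the reliable way to test for NaN
}
```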
> 6. it fits in well with (future) sumtypes, where the default
> initializer can be the error state.
I’m curious what comes out of this.
> An alternative to factory functions is to have a constructor
> with a dummy argument. Nothing says one has to actually use the
> parameters to a constructor.
Or we could just allow a nullary struct constructor. It should be
backwards compatible (at least to a large degree) if D defines
`this() @… {}` when `this()` is not defined explicitly (as
`@disable`d or otherwise). An explicit `this()` should be
`nothrow` if failure is a problem. With `throw` as an attribute,
`this() throw { … }` is a kind of: “Sorry, Walter, I know you
wanted the best for me, but for this type, it’s too wrong to be
right.”
That way, a declaration like `T x;` will call a constructor that,
in almost all cases, does nothing and, in almost all of the
remaining cases, does something that cannot fail.
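
For comparison, here is a sketch of what one has to write in current
D, where a struct may not define `this()`: the dummy-argument
constructor from the quote, plus a factory function. The names
`Default`, `Connection` and `make` are made up for illustration.

```d
struct Default {} // hypothetical tag type used as the dummy argument

struct Connection
{
    int handle = -1; // init state: not connected

    // Stand-in for a nullary constructor; the parameter is unused.
    this(Default) nothrow
    {
        handle = 0;
    }

    // The factory-function alternative.
    static Connection make() nothrow
    {
        return Connection(Default());
    }
}

void main()
{
    Connection a; // no constructor runs, just init
    assert(a.handle == -1);

    auto b = Connection(Default()); // dummy-argument constructor
    assert(b.handle == 0);

    auto c = Connection.make();
    assert(c.handle == 0);
}
```

With a nullary constructor allowed, `Connection a;` could do the
work of the last two forms without the tag type.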
Sorry for the late answer.