NonNull template
Johan
j at j.nl
Mon Apr 21 17:29:58 UTC 2025
On Saturday, 19 April 2025 at 22:49:19 UTC, Jonathan M Davis
wrote:
> On Thursday, April 17, 2025 8:39:27 PM MDT Walter Bright via
> Digitalmars-d wrote:
>> I'd like to know what those gdc and ldc transformations are,
>> and whether they are controllable with a switch to their
>> optimizers.
>>
>> I know there's a problem with WASM not faulting on a null
>> dereference, but in another post I suggested a way to deal
>> with it.
>
> Unfortunately, my understanding isn't good enough to explain
> those details. I discussed it with Johan in the past, but I've
> never worked on ldc or with llvm (or on gdc/gcc), so I really
> don't know what is or isn't possible. However, from what I
> recall of what Johan said, we were kind of stuck, and llvm
> considered dereferencing null to be undefined behavior.
There is a way now to tell LLVM that dereferencing null is
_defined_ (nota bene) behavior.
> It may be the case that there's some sort of way to control
> that (and llvm may have more capabilities in that regard since
> I last discussed it with Johan), but someone who actually knows
> llvm is going to have to answer those questions. And I don't
> know how gdc's situation differs either.
I have not responded in this thread so far because I feel it is
an old discussion, with old misunderstandings.
There is confusion between dereferencing in the language and
dereferencing by the CPU. What I think C and C++ do very well is
separate language behavior from implementation/CPU behavior, and
only prescribe language behavior, with no (or very little)
implementation behavior. I feel D should do the same.
Non-virtual method example, where (in my opinion) the dereference
happens at the call site, not inside the function:
```
class A {
    int a;
    final void foo() { // non-virtual
        a = 1; // no dereference here
    }
}

void main() {
    A a;     // class references default to null
    a.foo(); // <-- DEREFERENCE
}
```
During program execution, _with the current D implementation of
classes and non-virtual methods_, the CPU will only "dereference"
the `this` pointer inside `foo`, to do the assignment to `a`. But
that is only the case for our _current implementation_. For the D
language behavior, it does not matter what the implementation
does: the same behavior should happen on any
architecture/platform/execution model.
If you want to fault on null-dereference, I believe you _have_ to
add a null-check at every dereference at _language_ level
(regardless of implementation details). Perhaps it does not
impact performance very much (with optimizer enabled); I vaguely
remember a paper from Microsoft where they tried this and did not
see a big perf impact (if any).
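A minimal sketch of what such a language-level check could look
like, written out by hand. `checkedDeref` is a hypothetical helper
(it is not part of druntime or Phobos, and not something any D
compiler emits today); it only stands in for the check a compiler
would insert at each dereference:

```d
import std.exception : assertThrown, enforce;

class A {
    int a;
    final void foo() { // non-virtual
        a = 1;
    }
}

// Hypothetical helper, not part of druntime or Phobos: it stands in for
// the null check a compiler would emit at each language-level dereference.
T checkedDeref(T)(T obj, string file = __FILE__, size_t line = __LINE__)
if (is(T == class))
{
    enforce(obj !is null, "null dereference", file, line);
    return obj;
}

void main() {
    A a; // class references default to null

    // The check fires at the call site, before `foo` is entered,
    // regardless of whether `foo` ever touches `this`.
    assertThrown(checkedDeref(a).foo());

    a = new A;
    checkedDeref(a).foo();
    assert(a.a == 1);
}
```

The point of the sketch is that the failure does not depend on what
the CPU does with address zero, so it would behave the same on a
target that does not fault on a null access.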
Some notes to trigger you to think about distinguishing language
behavior from CPU/implementation details:
- You don't _have_ to implement classes and virtual functions
using a vptr/vtable, there are other options!
- There does not need to be a "stack" (implementation detail
vocabulary). Some "CPUs" don't have a "stack", and instead do
"local storage" (language vocabulary) in an alternative way. In
fact, even on CPUs _with_ stack, it can help to not use it! (read
about Address Sanitizer detection of stack-use-after-scope and
ASan's "fake stack")
- Pointers don't have to be memory addresses (you probably
already know that they are not physical addresses on common
CPUs); they could probably be implemented as hashes/keys into a
database as well. C does not define ordered comparison (e.g. >
and <) for pointers (the behavior is undefined, not merely
implementation-defined), except when they point into the same
object (e.g. an array or struct). Why? Because what would it mean
on segmented memory architectures (e.g. 16-bit x86)?
- Distinguishing language from implementation behavior means that
correct programs work the same on all kinds of different
implementations (e.g. you can run your C++ program in a REPL, or
run it in your browser through WASM).
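To make the pointer-comparison note above concrete, here is a small
sketch in D that stays within the case the C rule allows: ordering
two pointers into the same array. The variable names are only
illustrative, and the sketch makes no claim about what D specifies
for pointers into unrelated objects:

```d
void main() {
    int[4] arr;
    int* p = &arr[1];
    int* q = &arr[3];

    // Both pointers point into the same array, so ordering is
    // meaningful and follows element order.
    assert(p < q);
    assert(q - p == 2); // distance in elements, not bytes

    // Equality needs no ordering, so it is fine for any two pointers.
    int other;
    assert(p != &other);
}
```

Code that only relies on comparisons like these does not care whether
a pointer is a flat address, a segment:offset pair, or something else.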
cheers,
Johan