[Issue 18016] using uninitialized value is considered @safe but has undefined behavior

Thu Jun 6 13:53:06 UTC 2019

https://issues.dlang.org/show_bug.cgi?id=18016

--- Comment #9 from Steven Schveighoffer <schveiguy at yahoo.com> ---
I'll start by saying, I think the operation is @safe, but I'm not sure it's
necessary for @safe code. You can indeed escape @safe with @trusted, so there
are likely ways around this. The easiest-to-prove route here is to just
disable void initialization, and tough, you just have to deal with that.

(In reply to Manu from comment #8)
> (In reply to Steven Schveighoffer from comment #7)
> > It's garbage data, but it's not garbage pointers. As long as the memory is
> > not used to reference anything, it's not going to cause a memory corruption
> > to use it.
> 
> You can't know what the memory is going to be used for. You would need
> astonishingly competent flow-analysis to make judgements of that kind.
> It could be given as an argument to any operation that references something,
> perhaps as an offset, or any conceivable thing could be done with that data,
> and it's 100% guaranteed to be a rubbish operation.

The garbage offset argument is already handled: @safe code will throw an error
if you try to escape the bounds of an array.
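
As a minimal sketch of that point (readAt is a hypothetical helper, not
something from the bug report, and this assumes bounds checks have not been
disabled with -boundscheck=off), a garbage offset passed to an @safe function
produces a RangeError instead of touching memory outside the array:

```d
import core.exception : RangeError;

// Hypothetical helper: index an array with an arbitrary, possibly garbage, offset.
int readAt(const int[] arr, size_t offset) @safe
{
    return arr[offset]; // bounds-checked because the function is @safe
}

void main()
{
    int[] data = [10, 20, 30, 40];
    assert(readAt(data, 2) == 30);

    // Stand-in for an uninitialized offset: far outside the array.
    bool caught = false;
    try
    {
        readAt(data, 0xDEADBEEF);
    }
    catch (RangeError e) // caught here only to demonstrate the check fires
    {
        caught = true;
    }
    assert(caught);
}
```

The result is a rubbish operation that aborts, but not a memory corruption.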

> It's a verifiably rubbish value, how could that inject valid program flow
> into any usage context?

Only if written incorrectly. The above certainly is useless as is. It's clearly
not something you would want to have in your code. But the problem @safe is
trying to prevent is corrupting memory. Tailoring @safe to be as narrow as
possible allows more leeway in programs that do not corrupt memory. You should
be able to say, if you see the @safe tag, this will NOT corrupt memory.

> > Why would you want to use this? Because it's more efficient to not
> > initialize stack data before overwriting it with the real value.
> 
> Right, but it requires very special-case handling, and it's error-prone; for
> instance, you might think you can simply:
>   T x = void;
>   x = T();
> 
> For some subset of possible T's that might be fine, but then some T arrives
> with elaborate assignment semantics and it's a spectacular crash.

Spectacular crashes can happen in @safe code. This is @safe:

int* foo;
*foo = 1; // crash

However, this raises a good point that =void overrides the expectations of the
type itself. If the type expects to be at least default initialized, starting
it out as garbage can cause safety problems if there is any @trusted code
inside the type itself.

We could potentially limit =void to POD types that contain no references.

> > Can you explain a way that f() is unsafe in the example above?
> 
> f() is potentially @safe, assuming that `x` is a type without elaborate
> assignment (it is `int` above), but it depends on the compiler having
> powerful flow analysis to determine those facts.
> So it *could* be @safe, but I don't think DMD has the technology required to
> prove that at this time?

The compiler knows the type of an item, so it can determine whether it has
elaborate assignment without powerful flow analysis. We just need to ask "is
it OK to =void this type?". We already have that check for pointers; maybe we
also need it for types that have member functions, or elaborate assignment, or
anything else that makes it possible to exploit this for memory corruption.
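
To illustrate that this is a question about the type and not about flow, here
is a rough library-level sketch. canVoidInit is a hypothetical name, and the
exact rule (no indirections, no elaborate assignment) is only an assumption
about what such a check might look like, not what the compiler does today:

```d
import std.traits : hasElaborateAssign, hasIndirections;

// Hypothetical trait: true if T holds no references and has no custom assignment.
enum bool canVoidInit(T) = !hasIndirections!T && !hasElaborateAssign!T;

struct Plain { int a; double b; }
struct WithPointer { int* p; }
struct WithOpAssign
{
    int a;
    void opAssign(WithOpAssign rhs) { a = rhs.a; }
}

// Everything below is decided at compile time from the type alone.
static assert( canVoidInit!int);
static assert( canVoidInit!Plain);
static assert(!canVoidInit!(int*));
static assert(!canVoidInit!(int[]));
static assert(!canVoidInit!WithPointer);
static assert(!canVoidInit!WithOpAssign);

void main()
{
    static if (canVoidInit!Plain)
    {
        Plain p = void; // garbage, but only plain data
        p = Plain(1, 2.0);
        assert(p.a == 1);
    }
}
```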

> Exposing uninitialised memory is a data leak at best. Many forms of exploit
> take advantage of leaking private or inaccessible data, but typically it can
> be used to source or craft values that lead to unexpected or otherwise
> invalid program flow or improper array offsets.

So it comes down to the questions: is it @safe's charter to prevent such
things? and can @safe actually guarantee such things don't happen?

> > Would you consider this function @safe?
> > 
> > int[] allocate(int size)
> > {
> >    auto result = cast(int *)malloc(size * int.sizeof);
> >    return result[0 .. size];
> > }
> > 
> > It doesn't corrupt any memory, the data is not left dangling, as it's not
> > freed, but it's also not initialized. Is that a big problem?
> 
> malloc's not @safe (it's not even D), neither is dynamically slicing a
> pointer, and the memory is uninitialised. This function is certainly not
> @safe.

I didn't express what I wanted clearly enough. What I meant was, would you
consider calling this function to be a safe call? Personally, I would have no
problem marking this @trusted, even though none of the integers are
initialized.
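
Roughly what I have in mind is the following sketch. The null and overflow
checks are my additions to the function quoted above, added so that the
@trusted promise actually holds for any argument:

```d
import core.stdc.stdlib : malloc;

// @trusted: callers cannot corrupt memory through this, whatever size they pass.
// The returned integers are uninitialized, and the block is never freed.
int[] allocate(size_t size) @trusted
{
    if (size > size_t.max / int.sizeof)
        return null; // size * int.sizeof would overflow
    auto p = cast(int*) malloc(size * int.sizeof);
    if (p is null)
        return null; // allocation failed: hand back an empty slice
    return p[0 .. size];
}

void main() @safe
{
    int[] buf = allocate(8);
    assert(buf.length == 8);
    foreach (i, ref e; buf)
        e = cast(int) i; // overwrite the garbage before relying on it
    assert(buf[7] == 7);
}
```

Reading buf before the loop would give garbage integers, but any indexing is
still bounds-checked against the length of the slice.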

To give you an idea, this is currently allowed in D:
https://github.com/dlang/phobos/blob/c5664d4436235cba2606103f8729341ac79a4487/std/array.d#L811-L825

--