UB in D

Andrei Alexandrescu via Digitalmars-d digitalmars-d at puremagic.com
Sat Jul 9 19:29:15 PDT 2016


On 7/9/16 7:44 PM, H. S. Teoh via Digitalmars-d wrote:
> On Sat, Jul 09, 2016 at 07:17:59PM -0400, Andrei Alexandrescu via Digitalmars-d wrote:
>> On 07/09/2016 06:36 PM, Timon Gehr wrote:
>>> Undefined behaviour means the language semantics don't define a
>>> successor state for a computation that has not terminated. Do you
>>> agree with that definition? If not, what /is/ UB in D, and why is it
>>> called UB?
>>
>> Yah, I was joking with Walter that effectively the moment you define
>> undefined behavior it's not undefined any longer :o). It happens to
>> the best of us. I think we're all aligned here.
>>
>> There's some interesting interaction here. Consider:
>>
>> int fun(int x)
>> {
>>     int[10] y;
>>     ...
>>     return ++y[9 >> x];
>> }
>>
>> Now, under the "shift by negative numbers is undefined" rule, the
>> compiler is free to eliminate the bounds check from the indexing
>> because it's always within bounds for all defined programs. If it
>> isn't, memory corruption may ensue. However, if the compiler says
>> "shift by negative numbers is implementation-specified", then the
>> compiler cannot portably eliminate the bounds check.
>
> I find this rather disturbing, actually.  There is a fine line between
> taking advantage of assert's to elide stuff that the programmer promises
> will not happen, and eliding something that's defined to be UB and
> thereby resulting in memory corruption.

Nah, this is cut and dried. You should just continue being nicely 
turbed. "Shifting by a negative integer has undefined behavior" is what 
it is. Now I'm not saying it's good to define it that way, just that if 
it's defined that way then these are the consequences.
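
To make the consequence concrete, here is a minimal sketch (the helper name 
indexAlwaysInBounds is made up for illustration, and it assumes a 32-bit int 
whose defined shift counts are 0 through 31). It checks that every defined 
shift count keeps 9 >> x inside an int[10], which is the fact an optimizer 
would be leaning on when it drops the bounds check:

```d
// Hypothetical helper, not part of any library: checks the premise the
// optimizer relies on. Assumes a 32-bit int, so the defined shift counts
// are 0 .. 31 inclusive.
bool indexAlwaysInBounds()
{
    foreach (x; 0 .. 32)
    {
        immutable idx = 9 >> x;      // 9, 4, 2, 1, 0, 0, ...
        if (idx < 0 || idx >= 10)    // would fall outside int[10]
            return false;
    }
    return true;
}
```

Nothing here says what happens for a negative x; that is exactly the part the 
"undefined" wording leaves open.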

> In the above example, I'd be OK with the compiler eliding the bounds
> check if there is an assert(x >= 0) either in the function body or in the
> in-contract.  Having the compiler elide the bounds check without any
> assert or any other indication that the programmer has made assurances
> that UB won't occur is very scary to me, as plain ole carelessness can
> easily lead to exploitable security holes.  I hope D doesn't become an
> example of this kind of security hole.

Yeah, we'd ideally like very little UB and no UB in safe code. I think 
we should define shift with out-of-bounds values as "implementation 
specified".
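
For reference, the assert-based variant suggested in the quote could look 
roughly like the sketch below. The name funChecked and the upper bound of 32 
are my assumptions (32-bit int), the elided part of the original body is left 
out, and older compilers spell the `do` keyword as `body`:

```d
// Hypothetical variant of fun with the programmer's promise made explicit.
// Assumes a 32-bit int, hence the x < 32 upper bound.
int funChecked(int x)
in
{
    // The caller must pass a shift count the language defines.
    assert(x >= 0 && x < 32, "shift count out of range");
}
do
{
    int[10] y;              // zero-initialized, as all int arrays are in D
    return ++y[9 >> x];     // index is in 0 .. 9 whenever the contract holds
}
```

With the contract in place, the compiler has a stated assumption to work from 
rather than an inference drawn from UB.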

> At the very least, I'd expect the compiler to warn that the function
> argument may cause UB, and suggest that an in-contract or assert be
> added.

You should expect the compiler to do what the language definition 
prescribes.

> On a more technical note, I think eliding the bounds check on the
> grounds that shifting by negative x is UB is based on a fallacy.

No.

> Eliding
> a bounds check should only be done when the compiler has the assurance
> that the bounds check is not needed. Just because a particular construct
> is UB does not meet this condition, because, being UB, there is no way
> to tell if the bounds check is needed or not, therefore the correct
> behaviour IMO is to leave the bounds check in. The elision should only
> happen if the compiler is assured that it's actually not needed.
>
> To elide simply because negative x is UB basically amounts to saying
> "the programmer ought to know better than writing UB code, so therefore
> let's just assume that the programmer never makes a mistake and barge
> ahead fearlessly FTW!". We all know where blind trust in programmer
> reliability leads: security holes galore because humans make mistakes.
> Assuming humans don't make mistakes, which is what this kind of
> exploitation of UB essentially boils down to, leads to madness.

You're overthinking this. Undefined is undefined. We're done here.


Andrei



