Bad array indexing is considered deadly

Nick Sabalausky (Abscissa) via Digitalmars-d digitalmars-d at puremagic.com
Wed May 31 19:20:48 PDT 2017


On 05/31/2017 05:03 PM, Moritz Maxeiner wrote:
> On Wednesday, 31 May 2017 at 20:23:21 UTC, Nick Sabalausky (Abscissa) 
> wrote:
>> On 05/31/2017 03:17 PM, Moritz Maxeiner wrote:
>>> in general you have to assume that the index *being* out of bounds is 
>>> itself the *result* of *already occurred* data corruption;
>> Of course not, that's absurd. Where do people get the idea that 
>> out-of-bounds *implies* pre-existing data corruption?
> 
> You assume something I did not write. What I wrote is that the runtime 
> cannot *in general* (i.e. without further information about the 
> semantics of your specific program) assume that it was *not* preexisting 
> data corruption.
> 

Ok, fine. However...

>> Most of  the time, out-of-bounds comes from a bug (especially in D, 
>> what with all of its safeguards).
> 
> Unfortunately the runtime has no way to know *if* the out of bounds 
> comes from a bug or a data corruption, which was my point; only a human 
> can know that. What is the most likely culprit is irrelevant for the 
> default behaviour, because as long as it *could* be data corruption, the 
> runtime cannot by default assume that it is not; that would be unsafe.
> 

Like I said, *anything* could be the result of data corruption. (And 
with out-of-bounds in particular, it's very rare for the cause to be 
data corruption, especially in D).
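
To make that concrete: in D, with bounds checking on (the default), the 
bad access is trapped at the exact point of use, before any stray 
memory is read or written. A minimal sketch:

    void main()
    {
        int[] a = [1, 2, 3];
        size_t i = 5;    // a garden-variety off-by-N logic bug
        int x = a[i];    // the bounds check fires right here and
    }                    // throws core.exception.RangeError

The check trips *before* anything gets corrupted, which is exactly why 
the trigger is nearly always an ordinary bug.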

If the determining factor for whether or not condition XYZ should abort 
is "*could* it be data corruption?", then ALL conditions must abort, 
because data corruption and undefined state can, by their very nature, 
cause *any* state - heck, even ones that "look" perfectly valid.

So, since that approach is a complete non-starter even in theory, the 
closest thing we *can* reasonably do is instead use the criterion "is 
this *likely enough* to be data corruption?" (for however we choose to 
define "likely enough").

BUT, in that case, out-of-bounds *still* fails to meet the criterion by 
a long shot. When an out-of-bounds does occur, it's vastly more likely 
to be a bug than data corruption. Fuck, in all my decades of 
programming, including using D since pre-v1.0, NOT ONCE have ANY of the 
hundreds, maybe thousands, of out-of-bounds I've encountered ever been 
the result of data corruption. NOT ONCE. Not exaggerating. Even as an 
anecdote, that's a FAR cry from being able to reasonably suspect data 
corruption as a likely cause, regardless of where we set the bar for 
"likely".

>> Sure, data corruption is one possible cause of out-of-bounds, but data 
>> corruption is one possible cause of *ANYTHING*. So just to be safe, 
>> let's just abort on all exceptions, and upon everything else for that 
>> matter.
> 
> No, abort on Errors where the runtime cannot know if data corruption has 
> already occurred, i.e. the program is in an undefined state.
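
For concreteness, this is what "abort on Errors" looks like in D: the 
out-of-bounds check throws core.exception.RangeError, which derives 
from Error, not Exception, so an ordinary exception handler never sees 
it. A rough sketch:

    void main()
    {
        int[] a = [1, 2, 3];
        try {
            int x = a[10];     // out of bounds
        } catch (Exception e) {
            // Never reached: RangeError derives from Error,
            // not Exception, so this handler does not match.
        }
        // Uncaught, the RangeError unwinds out of main and the
        // runtime aborts the process with a message and stack trace.
    }

(You *can* write catch (RangeError), but once an Error is in flight the 
runtime makes no guarantees about cleanup having run, so continuing is 
at-your-own-risk. That undefined-state caveat is exactly what's being 
argued about here.)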

The runtime can NEVER know that no data corruption has occurred. Let me 
emphasise that: *NEVER*.

By the very nature of data corruption and undefined states, it is NOT 
even theoretically possible for a runtime to EVER rule out data 
corruption, *not even when things look A-OK*, and hell, not even when 
the algorithm is mathematically proven correct, because, shoot, let's 
just pretend we live in a fantasy world where hardware failures are 
impossible, why don't we?

Therefore, if we follow your reasoning (that we must abort whenever 
data corruption is possible), we must abort all processes 
unconditionally upon creation.

Your approach sounds nice, but it's completely unrealistic.

