Bad array indexing is considered deadly

Jonathan M Davis via Digitalmars-d digitalmars-d at puremagic.com
Thu Jun 1 03:56:20 PDT 2017


On Thursday, June 01, 2017 06:26:24 Steven Schveighoffer via Digitalmars-d 
wrote:
> On 5/31/17 9:05 PM, Walter Bright wrote:
> > On 5/31/2017 6:04 AM, Steven Schveighoffer wrote:
> >> Technically this is a programming error, and a bug. But memory hasn't
> >> actually been corrupted.
> >
> > Since you don't know where the bad index came from, such a conclusion
> > cannot be drawn.
>
> You could say that about any error. You could say that about malformed
> unicode strings, malformed JSON data, file not found. In this mindset,
> everything should be an Error, and nothing should be recoverable.

I think that it really comes down to what the contract is and how it makes
sense to treat bad values. At the one extreme, you can treat all bad input
as programmer error, requiring that callers validate all arguments to all
functions (in which case, assertions or some other type of Error would be
used on failure), and at the other extreme, you can be completely defensive
about it and can have every function validate its input and throw an
Exception on failure so that the checks never get compiled out, and the
caller can choose whether they want to recover or not. Both approaches are
of course rather extreme, and what we should do is somewhere in the middle.

So, for any given function, we need to decide whether we want to take the
DbC approach and require that the caller validates the input or take the
defensive programming approach and have the function itself validate the
input. Which makes more sense depends on what the function does and how it's
used and is a bit of an art. But ultimately, whether something is a
programming error depends on what the API and its contracts are, and that
definitely is not one-size-fits-all.
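
To make the two approaches concrete, here is a rough sketch of the same
function written both ways. The names scaleDbC and scaleDefensive are made up
for illustration and are not from any library; the only library call is
std.exception.enforce.

```d
import std.exception : enforce;

// DbC style: the caller is required to pass a valid percentage.
// A violation is a bug in the caller and fails with an AssertError,
// and the check is compiled out with -release.
// (Compilers from around the time of this thread spell `do` as `body`.)
int scaleDbC(int value, int percent)
in
{
    assert(percent >= 0 && percent <= 100, "percent out of range");
}
do
{
    return value * percent / 100;
}

// Defensive style: the function validates its own input and throws an
// Exception, so the check is always present and the caller can recover.
int scaleDefensive(int value, int percent)
{
    enforce(percent >= 0 && percent <= 100, "percent out of range");
    return value * percent / 100;
}
```

Which of the two is appropriate depends on who is expected to own the
validation, which is exactly the contract question above.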

As a default, I think that treating invalid indices as an Error makes a lot
of sense, but it is true that if the index comes from user input or is
otherwise inferred from user input, having the checks result in Errors is
annoying. But you can certainly do additional checks yourself, and if you
wrap the actual call to index the array in an @trusted function that skips
the bounds check (e.g. by indexing through the array's ptr), it should be
possible to avoid there being two checks in the case that the index is
valid.
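
As a sketch of that idea (the function name lookup is hypothetical, and this
is only one way to do it): validate the index once with enforce, then do the
actual access through the array's ptr inside an @trusted lambda, since
indexing a raw pointer is not bounds checked.

```d
import std.exception : enforce;

// Hypothetical example: an index derived from user input is validated once,
// and a bad value results in an Exception rather than a RangeError.
int lookup(const(int)[] table, size_t i) @safe
{
    // The one and only check.
    enforce(i < table.length, "index out of range");

    // Pointer indexing skips the built-in bounds check. That is only
    // acceptable because the enforce above has already validated i, which
    // is what justifies marking the lambda @trusted.
    return (() @trusted => table.ptr[i])();
}
```

The check and the unchecked access are kept in the same function so that the
@trusted code cannot be reached with an unvalidated index.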

I get the impression that Walter tends to prefer treating stuff as
programmer error due to the types of programs that he usually writes. You
get a lot fewer things that come from user input when you're simply
processing a file (like you do with a compiler) than you get with stuff like
a server application or a GUI. So, I think that he's more inclined to come
to the conclusion that something should be treated as programmer error
than some other folks are. That being said, I also think that many folks are
too willing to try to make their program continue like nothing was wrong
after something fairly catastrophic happened.

> > Otherwise, while it's hard to write invulnerable programs, it is another
> > thing entirely to endorse vulnerabilities. I cannot endorse such
> > practices, nor can I endorse vibe.d if it is coded to continue running
> > after entering an undefined state.
>
> It's not. And it can't be. What I have to do is re-engineer the contract
> between myself and arrays. The only way to do that is to not use builtin
> arrays. That's the part that sucks. My code will be perfectly safe, and
> not ever experience corruption. It's just a bit ugly.

Well, you _can_ use the built-in arrays and just use a helper function for
indexing arrays so that the arrays are passed around normally, but you get
an Exception for an invalid index rather than an Error. You would have to be
careful to remember to index the array through the helper function, but it
wouldn't force you to use different data structures. e.g.

auto result = arr[i];

becomes something like

auto result = arr.at(i);
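
A minimal sketch of what such a helper might look like follows. Neither at
nor IndexException exists in druntime or Phobos as far as this example is
concerned; both are names assumed here for illustration.

```d
// Hypothetical exception type for recoverable bad indices.
class IndexException : Exception
{
    this(string msg, string file = __FILE__, size_t line = __LINE__)
        @safe pure nothrow
    {
        super(msg, file, line);
    }
}

// Hypothetical helper: callable as arr.at(i) via UFCS on any built-in
// array. An invalid index throws an Exception instead of a RangeError.
ref T at(T)(T[] arr, size_t i,
            string file = __FILE__, size_t line = __LINE__) @safe
{
    if (i >= arr.length)
        throw new IndexException("index out of range", file, line);

    // arr[i] is still bounds checked by the compiler here, so a valid
    // index gets checked twice unless the access is made unchecked.
    return arr[i];
}
```

With this, the caller can catch IndexException and, for instance, fail a
single request rather than the whole program.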

As an aside, I think that there has been way too much talk of memory
corruption in this thread, and much of it has derailed the discussion from
the actual issue. The array bounds checking prevented the memory corruption
problem. The question here is how to deal with invalid indices and whether
they should be treated as programmer error or bad input, and that's really a
question of whether arrays should use DbC or defensive programming and
whether there should be a way to choose based on your application's needs.

- Jonathan M Davis


