Bad array indexing is considered deadly

Paolo Invernizzi via Digitalmars-d digitalmars-d at puremagic.com
Thu Jun 1 05:25:16 PDT 2017


On Thursday, 1 June 2017 at 10:26:24 UTC, Steven Schveighoffer 
wrote:
> On 5/31/17 9:05 PM, Walter Bright wrote:
>> On 5/31/2017 6:04 AM, Steven Schveighoffer wrote:
>>> Technically this is a programming error, and a bug. But 
>>> memory hasn't
>>> actually been corrupted.
>>
>> Since you don't know where the bad index came from, such a 
>> conclusion
>> cannot be drawn.
>
> You could say that about any error. You could say that about 
> malformed unicode strings, malformed JSON data, file not found. 
> In this mindset, everything should be an Error, and nothing 
> should be recoverable.

Everything coming in as input to the _process_ should be 
validated... once it has been validated, if malformed JSON data, 
malformed unicode strings, etc. are still found during 
execution, there's a bug, and the process should terminate.

>>> This seems like a large penalty for "almost" corrupting 
>>> memory. No
>>> other web framework I've used crashes the entire web server 
>>> for such a
>>> simple programming error.
>>
>> Hence the endless vectors for malware insertion in those other 
>> frameworks.
>
> No, those are due to the implementation of the interpreter. If 
> the interpreter is implemented in @safe D, then you don't have 
> those problems.

It seems to me that reducing the danger to corrupted memory 
alone underestimates the damage that can be done, for example by 
a simple SQL injection, which involves no memory corruption at 
all.


>>> Compare this to, let's say, a malformed unicode string 
>>> (exception),
>>> malformed JSON data (exception), file not found (exception), 
>>> etc.
>>
>> That's because those are input and environmental errors, not 
>> programming
>> bugs.
>
> Not necessarily. A file name could be sourced from the program, 
> but have a typo. An index could come from the environment. The 
> library can't know, but makes assumptions one way or the other. 
> Just like we assume you want to use the GC, these assumptions 
> are harmful for those who need it to be the other way.

The library should not assume anything about what comes from 
the environment, the filesystem, etc.: there must be validation 
at the boundaries.

> I can detail exactly what happened in my code -- I am accepting 
> dates from a given week from a web request. One of the dates 
> fell outside the week, and so tried to access a 7 element array 
> with index 9. Nothing corrupted memory, but the runtime 
> corrupted my entire process, forcing a shutdown.

And that's a good thing! The input should be validated, 
especially because we are talking about a web request.

See it as being kind to the other side of the connection, 
informing it with a clear "rejected as the date is invalid".
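
A minimal sketch of what validating at the boundary could look 
like for the week example above. The names InvalidDateException, 
validatedDayIndex and hitsPerDay are hypothetical, and the raw 
day offset is assumed to arrive from the request as a string; 
this is not the code from the thread. Bad input becomes a 
recoverable Exception before it is ever used as an index, so a 
RangeError later on can only mean a bug.

```d
import std.conv : to, ConvException;

/// Hypothetical exception for input rejected at the request boundary.
class InvalidDateException : Exception
{
    this(string msg, string file = __FILE__, size_t line = __LINE__)
        @safe pure nothrow
    {
        super(msg, file, line);
    }
}

/// Validate an untrusted day offset before it is used as an index.
/// The bound of 7 mirrors the 7 element array from the example.
size_t validatedDayIndex(string rawOffset) @safe
{
    int offset;
    try
        offset = rawOffset.to!int;
    catch (ConvException e)
        throw new InvalidDateException("rejected as the date is invalid");

    if (offset < 0 || offset >= 7)
        throw new InvalidDateException("rejected as the date is invalid");
    return cast(size_t) offset;
}

void main()
{
    int[7] hitsPerDay;

    // In range: the index has been checked, so indexing is fine.
    hitsPerDay[validatedDayIndex("3")]++;

    try
        hitsPerDay[validatedDayIndex("9")]++; // out of range input
    catch (InvalidDateException e)
    {
        // Report e.msg to the client; the process keeps running.
    }
}
```

With this in place the index 9 never reaches the array, and the 
client gets the rejection message instead of a dead server.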

:-)

/Paolo


