Null references (oh no, not again!)
Denis Koroskin
2korden at gmail.com
Wed Mar 4 04:13:13 PST 2009
On Wed, 04 Mar 2009 14:40:58 +0300, Walter Bright <newshound1 at digitalmars.com> wrote:
> Denis Koroskin wrote:
>> On Wed, 04 Mar 2009 13:55:57 +0300, Walter Bright
>>> If software is in your flight critical systems, the way one proceeds
>>> is to *assume skynet takes it over* and will attempt to do everything
>>> possible to crash the airplane.
>> Assume you got a null-dereference under Linux. How are you going to
>> recover from it? You can't catch the NullPointerException, so your
>> program will fail and bring down the whole system *anyway*.
>
> You design your critical system so it is not vulnerable to the failure
> of a subsystem of it, even if that subsystem is powered by linux.
>
> For example, you might have two computer systems controlling the
> process. They vote, and if they disagree, they both are removed and the
> backup is engaged. The two systems use different operating systems - say
> one linux the other windows, they use different software written with
> different algorithms in different languages.
>
> The space shuttle, for example, had 4 independent flight control
> computers voting, and a 5th (with reduced capability) that could be
> manually brought online in case the 4 primaries all failed.
>
> Google did an interesting design for their Chrome browser. Each tab in
> it was powered by a separate process, meaning the hardware isolated it
> from the operation of the other tabs. So if the browser crashed in one
> tab, it wouldn't affect the other ones.
>
> I've read elsewhere that if you want to create a robust system, you
> break it up into different modules and run those modules as separate
> processes (not just separate threads) that communicate via interprocess
> communication. Any particular module dying could then be restarted
> without affecting the rest of the modules.
>
> The wrong way to do it is to lump everything into one gigantic process.
> Then, any failure brings everything down.
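For readers following along, the two-system voting arrangement described in the quoted text might be sketched roughly as below. This is only an illustration in Rust, not anything from the thread: the names `Decision` and `vote` are hypothetical, and the outputs are reduced to plain integers.

```rust
/// Outcome of comparing the outputs of two redundant controllers.
#[derive(Debug, PartialEq)]
enum Decision {
    /// Both primaries agree, so their shared output is used.
    Primary(i32),
    /// The primaries disagree, so both are dropped and the backup is used.
    Backup(i32),
}

/// Hypothetical voter: `a` and `b` are the outputs of two independently
/// implemented primaries; `backup` is only invoked when they disagree.
fn vote(a: i32, b: i32, backup: impl FnOnce() -> i32) -> Decision {
    if a == b {
        Decision::Primary(a)
    } else {
        Decision::Backup(backup())
    }
}
```

Calling `vote(3, 4, || 7)` yields `Decision::Backup(7)`: neither primary is trusted once they disagree. Note that the voter only detects a fault; it does nothing to remove the bug that caused it.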
Most people can't afford to run their applications on several computers just in case one of them fails. Besides, as you yourself pointed out, NPEs are often repeatable, so if you re-run the task on another PC, chances are it will fail there, too.
No doubt, Google Chrome is a beautiful piece of software. It doesn't crash the whole browser when something is null-dereferenced. But the message I've been writing for half an hour is *lost* anyway when the host process fails.
The way you suggest writing software is like a doctor who treats or hides the symptoms rather than curing the cause of an illness. You shouldn't rely on exception recovery when you can avoid the whole class of bugs altogether.
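To make "avoid the whole class of bugs" concrete, here is a minimal sketch of non-nullable references enforced at compile time. It is written in Rust rather than D, purely as an illustration of the idea; `Draft`, `find_draft`, `draft_len` and `draft_len_or_zero` are made-up names, not part of any library.

```rust
/// A hypothetical draft message.
struct Draft {
    text: String,
}

/// Returns `None` when no draft exists for `id`. The possible absence is
/// part of the return type, so callers cannot ignore it.
fn find_draft(id: u32) -> Option<Draft> {
    if id == 1 {
        Some(Draft { text: String::from("hello") })
    } else {
        None
    }
}

/// Takes a reference that can never be null, so no check is needed here.
fn draft_len(draft: &Draft) -> usize {
    draft.text.len()
}

/// Passing the `Option<Draft>` straight to `draft_len` does not compile;
/// the missing case has to be handled before a `&Draft` exists.
fn draft_len_or_zero(id: u32) -> usize {
    match find_draft(id) {
        Some(draft) => draft_len(&draft),
        None => 0,
    }
}
```

The point of the design is that the "forgot to check for null" mistake becomes a compile error instead of a crash at run time, so there is nothing left to recover from.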
More information about the Digitalmars-d mailing list