Developing Mars lander software
Tolga Cakiroglu
tcak at pcak.com
Tue Feb 18 16:16:02 PST 2014
On Tuesday, 18 February 2014 at 23:05:21 UTC, Walter Bright wrote:
> http://cacm.acm.org/magazines/2014/2/171689-mars-code/fulltext
>
> Some interesting tidbits:
>
> "We later revised it to require that the flight software as a
> whole, and each module within it, had to reach a minimal
> assertion density of 2%. There is compelling evidence that
> higher assertion densities correlate with lower residual defect
> densities."
>
> This has been my experience with asserts, too.
>
> "A failing assertion is now tied in with the fault-protection
> system and by default places the spacecraft into a predefined
> safe state where the cause of the failure can be diagnosed
> carefully before normal operation is resumed."
>
> Nice to see confirmation of that.
>
> "Running the same landing software on two CPUs in parallel
> offers little protection against software defects. Two
> different versions of the entry-descent-and-landing code were
> therefore developed, with the version running on the backup CPU
> a simplified version of the primary version running on the main
> CPU. In the case where the main CPU would have unexpectedly
> failed during the landing sequence, the backup CPU was
> programmed to take control and continue the sequence following
> the simplified procedure."
>
> An example of using dual systems for reliability.
I didn't read the link (TL;DR), but how are they detecting that a CPU
has failed? Some information must be passed outside of the CPU to do
this. The only solution that comes to my mind is that the main CPU
changes a variable in an external memory at every step, and the backup
CPU checks it continuously to catch a failure immediately. But this
would already require about 50% of the CPU's power.
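To make the idea concrete, here is a minimal sketch of that heartbeat scheme as a single-threaded simulation. It is only an illustration of the approach described above, not how the lander software actually does it. All names (SharedCounter, PrimaryCpu, BackupMonitor, simulate, max_missed_polls) are hypothetical, and polling once per step with a miss threshold is an assumption made here in place of checking continuously.

```python
class SharedCounter:
    """Stands in for the word of external memory both CPUs can reach (hypothetical)."""

    def __init__(self):
        self.value = 0


class PrimaryCpu:
    """Main CPU: bumps the shared counter once per step while it is alive."""

    def __init__(self, counter):
        self.counter = counter
        self.alive = True

    def step(self):
        if self.alive:
            self.counter.value += 1


class BackupMonitor:
    """Backup CPU: takes control after too many polls without a counter change."""

    def __init__(self, counter, max_missed_polls):
        self.counter = counter
        self.max_missed_polls = max_missed_polls  # assumed tolerance, not from the article
        self.last_seen = counter.value
        self.missed = 0
        self.in_control = False

    def poll(self):
        current = self.counter.value
        if current != self.last_seen:
            # Heartbeat advanced: the main CPU is still making progress.
            self.last_seen = current
            self.missed = 0
        else:
            self.missed += 1
            if self.missed >= self.max_missed_polls:
                self.in_control = True
        return self.in_control


def simulate(fail_at_step, total_steps, max_missed_polls=3):
    """Return the step at which the backup takes over, or None if it never does."""
    counter = SharedCounter()
    primary = PrimaryCpu(counter)
    backup = BackupMonitor(counter, max_missed_polls)
    for step in range(total_steps):
        if step == fail_at_step:
            primary.alive = False  # simulated failure of the main CPU
        primary.step()
        if backup.poll():
            return step
    return None


if __name__ == "__main__":
    print(simulate(fail_at_step=5, total_steps=20))
    print(simulate(fail_at_step=None, total_steps=20))
```

In this sketch the backup only polls once per step, so the check costs far less than a continuous busy-wait. The price is that a failure is noticed max_missed_polls steps late rather than immediately.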
While thinking about this kind of backup system, knowing and
reading that some people are really doing it is really great.
More information about the Digitalmars-d
mailing list