
Subject: Re: Ariane
Author: Jan Vorbrüggen
Date: 2005-09-22, 7:57 am

> When people tell me that it was a programming/specification
> error, I ask them: "If you had been in the project software
> specification panel, what course of action would you have suggested
> to handle the exception 'acceleration value does not fit
> into the 16-bit integer range' when you know that the sensor
> works correctly and that the physical upper bound must fit ?"


The specification error lay in the fact that the assumption "the physical
upper bound must fit" was never reconsidered for the new environment (i.e.,
Ariane 5) the software would be working in. Of course, that assumption
no longer held, as Arianespace found out. It turns out there simply was no
requirements document for the INS of the Ariane 5 at all, so nobody ever
considered the question you asked above.
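
To make the quoted question a bit more concrete, here is a minimal Python
sketch (not the flight code, which was written in Ada) of the two obvious
ways to handle a value that does not fit into 16 bits. The function names
and the sample values are made up for illustration, and both functions
assume a finite input.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def to_int16_checked(value: float) -> int:
    """Convert to a 16-bit signed integer; raise if the value does not fit.

    This mirrors the "exception" choice: correct as long as the assumed
    physical upper bound really holds in the environment at hand.
    """
    result = int(value)  # truncates toward zero; assumes a finite input
    if not INT16_MIN <= result <= INT16_MAX:
        raise OverflowError(f"{value!r} does not fit into 16 bits")
    return result


def to_int16_saturating(value: float) -> int:
    """Convert to a 16-bit signed integer, clamping out-of-range values.

    This is one possible "best effort" choice: the result is wrong but
    bounded, and the caller keeps running.
    """
    result = int(value)  # truncates toward zero; assumes a finite input
    return max(INT16_MIN, min(INT16_MAX, result))
```

With these definitions, to_int16_saturating(40000.0) returns 32767, while
to_int16_checked(40000.0) raises OverflowError. Neither choice is right by
itself; which one is acceptable depends on a range analysis for the actual
vehicle, which is exactly what was never redone.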

In addition, that routine was executing in error already on the Ariane 4:
for certain reasons, it was left running after T-0 (main engine start)
but should have been stopped after liftoff. In fact, it was (very slightly)
corrupting navigation data while it was running after liftoff!

Just revisiting this decision - the basis for it also no longer being
applicable for Ariane 5 - would have saved flight 501 and the first
Cluster incarnation. Just performing an integrated test with the real
INS in place, simulating sensor input instead of INS output, would have
discovered the problem. All possible safety nets had been removed, and
people were surprised that the first attempt at the salto mortale led
to a dead artist, err, rocket.

Finally, while the flight as such was more-or-less doomed with a failed
navigation system, there were two flaws on the system engineering level:
First, only hardware failures were considered in the design; thus, when
the exception occurred, the INS just threw in the towel instead of trying
to continue on a best-effort basis. This led to an unrecoverable
common-cause failure of the INS as a system. Second, there was no
distinction between debug mode and mission mode. Thus, the steering computer
interpreted the error code put out by the INS as data, commanded a sharp
turn, and aerodynamically disassembled the rocket within fractions of a second,
generating those impressive fireworks everybody remembers. That interface
"misunderstanding" just blows my mind.
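
The interface point can be sketched in a few lines of Python. This is only
an illustration under my own assumptions: the message layout, the names
Kind, InsMessage and steering_command, and the "hold the last good value"
policy are all hypothetical and say nothing about the real INS interface.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    FLIGHT_DATA = 1
    DIAGNOSTIC = 2


@dataclass(frozen=True)
class InsMessage:
    kind: Kind     # explicit tag: what the payload means
    payload: int   # attitude value or error code, depending on the tag


def steering_command(msg: InsMessage, last_good: int) -> int:
    """Use the payload only if it is tagged as flight data.

    A diagnostic message is never interpreted as data; here the consumer
    simply holds the last good value (a hypothetical fallback policy).
    """
    if msg.kind is Kind.FLIGHT_DATA:
        return msg.payload
    return last_good
```

For example, steering_command(InsMessage(Kind.DIAGNOSTIC, 9999), 3) returns
3 rather than acting on 9999. The point is only that the consumer can tell
an error code from a measurement, whatever it then decides to do.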

If such a thing had happened in pre-1945 Japan, a lot of engineering and
management jobs would suddenly have become vacant all over Europe, with
an attendant spike in business for the undertakers.

All in all an impressive example of how not to do systems engineering.
But for once, the programmers and their tools were not at fault.

Jan