100 – System Health Management

In this 100th episode of omega tau we talk to Dr. Stephen B. Johnson about system health management, a set of techniques and processes used to improve system dependability. The episode is based on a book Stephen co-edited, and because of his background, we use aerospace examples throughout. We cover fundamental concepts such as functions, states and the state vector, failures, and faults. We then discuss how complexity and human involvement influence failures. We look at means of preventing failures, such as fault isolation, redundancy and model adjustment. We conclude the three-hour conversation by looking at the future of systems engineering and system health management, with a particular focus on formal methods.

If you are interested in fault tolerance in software, I suggest you listen to part 1 and part 2 of the interview with Bob Hanmer on SE Radio.