Making manufacturing systems more robust, that is, able to carry out their function in the presence of faults, is an issue of paramount importance. It involves on-line, real-time detection of faults, diagnosis of faults, and recovery from them. In this paper we present a methodology that generates practical and efficient solutions to this problem. The methodology is available as part of IPCS, a software package for generating on-line, real-time applications for large-scale plants. The basic fault modeling methodology enables a user to specify probability- and time-weighted causal relationships between equipment and process failure modes. Fault detection is specified in terms of violation conditions for alarms and associations between alarms and failure modes. Polynomial-complexity graph algorithms enforce structural, probabilistic, and temporal constraints on the fault model and the set of ringing alarms to generate a diagnosis. The methodology also enables a user to model multiple fault recovery strategies, ranging from explicit operator messages, through switching to backup equipment, to complex restructuring strategies. The last kind may include operating parts of a plant at a lower production rate and/or changing control and monitoring strategies. A key benefit of this methodology is the facility to reuse generic fault models and recovery strategies. Another major benefit is hierarchical model specification, which helps in managing model complexity. Benefits of IPCS itself include graphical model building, multiple modeling paradigms, object-oriented technology, and automatic application generation. IPCS has been used to generate real-time diagnostic and recovery applications in chemical and power-generating plants.
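To make the idea of probability- and time-weighted causal relationships concrete, the following is a minimal sketch in Python, not the IPCS implementation. The failure modes, alarm names, probabilities, and delay windows are all hypothetical, as are the function names `most_likely_paths` and `diagnose`. The sketch assumes that each causal edge carries a propagation probability and a minimum and maximum delay, that each alarm is associated with exactly one failure mode, and that a single root cause explains all ringing alarms. It applies a structural check (every alarmed failure mode is reachable from the candidate), a temporal check (some fault onset time is consistent with every alarm time), and a simple probabilistic ranking, using a shortest-path search that runs in polynomial time.

```python
from dataclasses import dataclass
import heapq
import math


@dataclass(frozen=True)
class Edge:
    effect: str    # failure mode caused by the source failure mode
    prob: float    # probability that the fault propagates along this edge (0 < prob <= 1)
    t_min: float   # earliest propagation delay, in seconds
    t_max: float   # latest propagation delay, in seconds


# Hypothetical fault model: failure mode -> outgoing causal edges.
FAULT_MODEL = {
    "pump_seal_leak": [Edge("low_flow", 0.9, 5, 20)],
    "valve_stuck": [Edge("low_flow", 0.6, 1, 10)],
    "low_flow": [Edge("high_temp", 0.8, 30, 120)],
    "high_temp": [],
}

# Hypothetical alarm -> failure-mode associations.
ALARM_TO_MODE = {"FLOW_LO": "low_flow", "TEMP_HI": "high_temp"}


def most_likely_paths(model, root):
    """Dijkstra on -log(prob): most probable path from root to each reachable
    failure mode, with the delay window accumulated along that path."""
    best = {root: (1.0, 0.0, 0.0)}  # mode -> (probability, earliest, latest)
    heap = [(0.0, root)]
    done = set()
    while heap:
        _, mode = heapq.heappop(heap)
        if mode in done:
            continue
        done.add(mode)
        prob, lo, hi = best[mode]
        for e in model.get(mode, []):
            new_prob = prob * e.prob
            if e.effect not in best or new_prob > best[e.effect][0]:
                best[e.effect] = (new_prob, lo + e.t_min, hi + e.t_max)
                heapq.heappush(heap, (-math.log(new_prob), e.effect))
    return best


def diagnose(model, alarm_to_mode, ringing):
    """ringing maps alarm name -> time the alarm started ringing.
    Returns (root cause, score) pairs, most likely first."""
    effects = {e.effect for edges in model.values() for e in edges}
    roots = [m for m in model if m not in effects]  # modes with no modeled cause
    observed = [(alarm_to_mode[a], t) for a, t in ringing.items()]
    candidates = []
    for root in roots:
        best = most_likely_paths(model, root)
        # Structural constraint: every alarmed mode must be reachable.
        if any(mode not in best for mode, _ in observed):
            continue
        # Temporal constraint: some onset time t0 must satisfy
        # t0 + earliest <= alarm time <= t0 + latest for every alarm.
        latest_onset = min(t - best[mode][1] for mode, t in observed)
        earliest_onset = max(t - best[mode][2] for mode, t in observed)
        if earliest_onset > latest_onset:
            continue
        # Heuristic score (an assumption of this sketch): product of the
        # best path probabilities to each alarmed mode.
        score = math.prod(best[mode][0] for mode, _ in observed)
        candidates.append((root, score))
    return sorted(candidates, key=lambda c: c[1], reverse=True)


if __name__ == "__main__":
    print(diagnose(FAULT_MODEL, ALARM_TO_MODE, {"FLOW_LO": 100.0, "TEMP_HI": 160.0}))
```

With the hypothetical data above, a flow alarm followed 60 seconds later by a temperature alarm ranks the pump seal leak ahead of the stuck valve, while a temperature alarm arriving only 5 seconds after the flow alarm is rejected by the temporal check, because the modeled delay from low flow to high temperature is at least 30 seconds. The delay window is taken from the most probable path only, which is a simplification; a fuller treatment would consider every path between a candidate and an alarmed failure mode.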