Fault Isolation and Alarm Design in Non-linear Stochastic Systems
by
Feras A. Alrowie
B.Sc., King Fahd University of Petroleum and Minerals, 2001
M.Sc., The University of Leeds, 2007
A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF
Doctor of Philosophy
in
THE FACULTY OF GRADUATE AND POSTDOCTORAL STUDIES
(Chemical and Biological Engineering)
The University of British Columbia
(Vancouver)
January 2015
© Feras A. Alrowie, 2015

Abstract
In this project, we first propose a novel model-based algorithm for fault detection and isolation (FDI) in stochastic non-linear systems. The algorithm is based on parameter estimation: it monitors changes in the behaviour of the process and identifies the faulty model using a bank of particle filters running in parallel with the process model. The particle filters generate a sequence of hidden states, which are then used in a log-likelihood ratio to detect and isolate the faults. The newly developed scheme is demonstrated on two highly non-linear case studies. Finally, the effectiveness and robustness of the proposed diagnostic algorithm are illustrated by comparing its results on the multi-unit chemical reactor system with those obtained using other FDI techniques based on EKF and UKF state estimators.

Second, we propose an approach based on the particle filter algorithm to isolate actuator and sensor faults in stochastic non-linear and non-Gaussian systems. The proposed FDI approach is based on state estimation using a general observer scheme (GOS), whereby a bank of particle filters is used to generate a set of residuals, each sensitive to all but one fault. The faults are then isolated by monitoring the behaviour of the residuals, since the residuals of the faulty sensors or actuators behave differently from the fault-free residuals.
The approach is demonstrated on two highly non-linear case studies.

Non-linear stochastic systems pose two important challenges for designing alarms: (1) measurements are not necessarily Gaussian distributed and (2) measurements are correlated, particularly in closed-loop systems. We therefore present an algorithm for designing alarms based on delay timers and deadband techniques for such systems, with unknown and known models. In the case of unknown models, our approach is based on Monte Carlo simulations. In the case of known models, it makes use of a probability density function approximation algorithm called particle filtering. The alarm design algorithm is illustrated through two simulation examples. We show that the proposed alarm design is effective in detecting the fault, even though the measurements are non-Gaussian.

Preface
This thesis, entitled “Fault Isolation and Alarm Design in Non-linear Stochastic Systems,” is submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the Department of Chemical and Biological Engineering at the University of British Columbia. The thesis contains the results of research carried out by the author from September 2009 to October 2014 under the supervision of Professor R. B. Gopaluni and Professor K. E. Kwok. Contributions and collaborations with respect to published papers or papers submitted for publication are as follows.
• A version of Chapter 3 has been published in two conference papers, [14] and [15]. An extended version of these two papers has been published in the journal Control Engineering Practice [16]. The simulation runs in these three papers were done in close collaboration with Professors Gopaluni and Kwok.
They also helped in the preparation and revision of these papers before final submission.
• Chapter 4 is based on work conducted using simulation experiments and has been published in [17]; an extended version of the work has also been prepared for publication.
• Finally, the work in Chapter 5 has been published in F. Alrowaie, R.B. Gopaluni, and K.E. Kwok, 2014, Alarm Design for Non-linear Stochastic Systems, in Proceedings of the 11th World Congress on Intelligent Control and Automation. An extension to this work has also been prepared for submission.

Table of Contents
Abstract
Preface
Table of Contents
List of Tables
List of Figures
Nomenclature
Acknowledgments
1 Introduction
  1.1 Background and motivation
  1.2 Fundamental definitions and concepts in fault diagnosis
  1.3 Characteristics of a fault diagnosis system
  1.4 Alarm management system
  1.5 Outline and contributions of this thesis
  1.6 Objectives
2 Literature Review
  2.1 Introduction
  2.2 Model-based fault diagnosis methods
    2.2.1 Quantitative model-based methods
    2.2.2 Qualitative model-based methods
  2.3 Data-based fault diagnosis methods
  2.4 Model-based fault isolation strategies
  2.5 Robustness of model-based FDI
  2.6 State-of-the-art model-based non-linear FDI techniques
  2.7 Alarm design
3 FDI Based on Parameter Estimation Using DOS
  3.1 Introduction
  3.2 Problem statement
  3.3 Fundamentals of sequential Monte Carlo method
    3.3.1 Recursive Bayesian estimation
    3.3.2 Sequential Monte Carlo
  3.4 Proposed algorithm
    3.4.1 Fault detection
    3.4.2 Fault isolation
    3.4.3 Limitations
  3.5 Case studies
    3.5.1 Case study 1: application to a two-unit chemical process
    3.5.2 Case study 2: application to a polyethylene reactor system
  3.6 Conclusions
4 FDI Based on State Estimation Using GOS
  4.1 Introduction
  4.2 Problem statement
  4.3 Proposed algorithm
    4.3.1 Sensor FDI based on GOS
    4.3.2 Actuator FDI based on GOS
  4.4 Case studies
    4.4.1 Case study 1: application to a simple heat exchanger model
    4.4.2 Case study 2: application to a two CSTRs and a flash tank with a recycle stream process
  4.5 Conclusions
5 Alarm Design
  5.1 Introduction
  5.2 Problem statement
  5.3 Delay timers
  5.4 Deadbands timers
  5.5 Case studies
    5.5.1 Case study 1: application to a single chemical reactor
    5.5.2 Case study 2: application to a two CSTRs and a flash tank with a recycle stream process
  5.6 Conclusions
6 Conclusions and Future Work
  6.1 Conclusions
  6.2 Future work
    6.2.1 Multi-model FTC system
    6.2.2 Model invalidation
    6.2.3 Alarm design
References

List of Tables
Table 3.1 Process parameters and steady-state values for the two CSTR reactors
Table 3.2 Fault scenarios 1–4 for the two CSTR reactors
Table 3.3 Fault scenario 1 for the two CSTR reactors (FAR: false alarm rate; ADTD: alarm detection time delay)
Table 3.4 Process parameters and steady-state values for the polyethylene reactor
Table 3.5 Process variables of the polyethylene reactor system
Table 3.6 Fault scenarios for the polyethylene reactor system
Table 4.1 Sensor FDI decision-making logic
Table 4.2 Actuator FDI decision-making logic
Table 4.3 Process parameters and steady-state values for the heat exchanger process
Table 4.4 Sensor fault scenarios 1–4 for the simple heat exchanger process
Table 4.5 Process variables of a two CSTRs and flash tank system
Table 4.6 Process parameters and steady-state values for the two CSTRs and flash tank
Table 4.7 Steady-state inputs for the two CSTRs and flash tank system
Table 4.8 Actuator and sensor fault signature tables for the two CSTRs and a flash tank with a recycle stream process
Table 4.9 Sensor fault scenarios 1–4 for the two CSTRs and a flash tank with a recycle stream process
Table 4.10 Actuator fault scenarios 5–6 for the two CSTRs and a flash tank with a recycle stream process
Table 5.1 On/off delay-timer recommendations based on signal type
Table 5.2 Recommended alarm deadbands based on signal type
Table 5.3 Process parameters and steady-state values for the CSTR reactor

List of Figures
Figure 1.1 Source of abnormal situation.
Figure 1.2 Classification of faults according to their location as sensor, actuator and component faults.
Figure 1.3 Classification of faults according to their time characteristics.
Figure 1.4 Alarm management lifecycle.
Figure 1.5 Types of errors in decision making.
Figure 2.1 Comparison between physical and analytical redundancy-based schemes for FDI [110].
Figure 2.2 Model-based fault detection methods [110].
Figure 2.3 Model-based fault detection scheme.
Figure 2.4 Basic models of faults: a) additive fault; b) multiplicative fault.
Figure 2.5 Schematic diagram of the parity space approach.
Figure 2.6 Schematic diagram of the state estimation approach.
Figure 2.7 Schematic diagram of the parameter estimation approach.
Figure 2.8 Classification of qualitative-based methods [117].
Figure 2.9 Model-based FDI based on a dedicated observer scheme.
Figure 2.10 Model-based FDI based on a general observer scheme.
Figure 3.1 Overview of the proposed model-based FDI approach.
Figure 3.2 Schematic of two chemical reactors connected in series.
Figure 3.3 The top plot shows the log-likelihood ratio test statistic, and the bottom plot shows the alarm signal for the 5% biased sensor in scenario 1 for the two CSTR reactors.
Figure 3.4 The top plot shows the log-likelihood ratio test statistic, and the bottom plot shows the alarm signal for the 5% biased sensor in scenario 2 for the two CSTR reactors.
Figure 3.5 The top plot shows the log-likelihood ratio test statistic, and the bottom plot shows the alarm signal for the 5% biased sensor in scenario 3 for the two CSTR reactors.
Figure 3.6 The top plot shows the log-likelihood ratio test statistic, and the bottom plot shows the alarm signal for the 5% biased sensor in scenario 4 for the two CSTR reactors.
Figure 3.7 The difference between the log-likelihood ratio test statistic of the nominal process L^0_new(m,k) and the log-likelihood ratio test statistic of each faulty model L^i_new(m,k). Given that the largest test statistic is with L^1_new(m,k), it is concluded that F1 is the fault present in the two CSTR reactors system.
Figure 3.8 The difference between the log-likelihood ratio test statistic of the nominal process L^0_new(m,k) and the log-likelihood ratio test statistic of each faulty model L^i_new(m,k). Given that the largest test statistic is with L^2_new(m,k), it is concluded that F2 is the fault present in the two CSTR reactors system.
Figure 3.9 The difference between the log-likelihood ratio test statistic of the nominal process L^0_new(m,k) and the log-likelihood ratio test statistic of each faulty model L^i_new(m,k). Given that the largest test statistic is with L^3_new(m,k), it is concluded that F3 is the fault present in the two CSTR reactors system.
Figure 3.10 The difference between the log-likelihood ratio test statistic of the nominal process L^0_new(m,k) and the log-likelihood ratio test statistic of each faulty model L^i_new(m,k). Given that the largest test statistic is with L^4_new(m,k), it is concluded that F4 is the fault present in the two CSTR reactors system.
Figure 3.11 The top plot shows the log-likelihood ratio test statistic, and the bottom plot shows the alarm signal for scenario 5 for the two CSTR reactors.
Figure 3.12 The top plot shows the log-likelihood ratio test statistic, and the bottom plot shows the alarm signal for scenario 6 for the two CSTR reactors.
Figure 3.13 The top plot shows the log-likelihood ratio test statistic based on the EKF algorithm, and the bottom plot shows the alarm signal for scenario 1 for the two CSTR reactors.
Figure 3.14 The top plot shows the log-likelihood ratio test statistic based on the UKF algorithm, and the bottom plot shows the alarm signal for scenario 1 for the two CSTR reactors.
Figure 3.15 Comparison of false alarm rates in fault detection using particle filter, EKF and UKF with 5% biased sensor.
Figure 3.16 Schematic of the industrial gas-phase polyethylene reactor system.
Figure 3.17 The top plot shows the log-likelihood ratio test statistic, and the bottom plot shows the alarm signal for scenario 1 for the polyethylene reactor system.
Figure 3.18 The differences between the log-likelihood ratio test statistic of the nominal process L^0_new(m,k) and the log-likelihood ratio test statistic of each faulty model L^i_new(m,k). Given that the largest test statistic is with L^1_new(m,k), it is concluded that F1 is the fault present in the two CSTR reactor system.
Figure 3.19 The top plot shows the log-likelihood ratio test statistic, and the bottom plot shows the alarm signal for scenario 2 for the polyethylene reactor system.
Figure 4.1 Model-based sensor FDI based on GOS.
Figure 4.2 Model-based actuator FDI based on GOS.
Figure 4.3 Simple heat exchanger.
Figure 4.4 Residuals generated using a particle filter approach for fault scenario f1 for the heat exchanger process based on DOS and GOS using Qν1 and Qω1.
Figure 4.5 Residuals generated using a particle filter approach for fault scenario f3 for the heat exchanger process based on DOS and GOS using Qν1 and Qω1.
Figure 4.6 Residuals generated using a particle filter approach for fault scenario f2 for the heat exchanger process based on DOS and GOS using Qν1 and Qω1.
Figure 4.7 Residuals generated using a particle filter approach for fault scenario f4 for the heat exchanger process based on DOS and GOS using Qν1 and Qω1.
Figure 4.8 Two CSTRs and a flash tank with a recycle stream.
Figure 4.9 Residuals generated using a particle filter approach for the biased temperature sensor T10 for two CSTRs and a flash tank with a recycle stream process based on GOS.
Figure 4.10 Residuals generated using a particle filter approach for the biased temperature sensor T20 for two CSTRs and a flash tank with a recycle stream process based on GOS.
Figure 4.11 Residuals generated using a particle filter approach for the biased flow rate sensor F10 for two CSTRs and a flash tank with a recycle stream process based on GOS.
Figure 4.12 Residuals generated using a particle filter approach for the biased flow rate sensor F20 for two CSTRs and a flash tank with a recycle stream process based on GOS.
Figure 4.13 Residuals generated using a particle filter approach for the biased temperature sensor T10 for two CSTRs and a flash tank with a recycle stream process based on GOS.
Figure 4.14 Residuals generated using a particle filter approach for the biased temperature sensor T10 for two CSTRs and a flash tank with a recycle stream process based on GOS.
Figure 4.15 Residuals generated using a particle filter approach for the biased temperature sensor T10 for two CSTRs and a flash tank with a recycle stream process based on GOS.
Figure 4.16 Residuals generated using a particle filter approach for the biased temperature sensor T10 for two CSTRs and a flash tank with a recycle stream process based on GOS.
Figure 4.17 Residuals generated using a particle filter approach for the biased temperature sensor T10 and the biased temperature sensor T20 for two CSTRs and a flash tank with a recycle stream process based on GOS.
Figure 5.1 FAR/MAR PDFs using a delay timer technique.
Figure 5.2 FAR/MAR PDFs using deadband technique.
Figure 5.3 High and low alarms with deadband technique.
Figure 5.4 Schematic of a single chemical reactor process.
Figure 5.5 A plot of MAR and FAR against alarm threshold for the single chemical reactor.
Figure 5.6 A plot of expected detection delay as a function of threshold for the single chemical reactor.
Figure 5.7 A plot showing the histograms of x_k before and after the fault was introduced for the single chemical reactor.
Figure 5.8 A plot of MAR and FAR against alarm threshold for the two CSTRs and flash tank model.
Figure 5.9 A plot of expected detection delay as a function of threshold for the two CSTRs and flash tank model.
Figure 5.10 A plot showing the histograms of x_k before and after the fault was introduced for the two CSTRs and flash tank model.
Figure 6.1 Structure of the proposed integrated design of the multiple model-based FDI approach and multiple MPC controllers.
Nomenclature
Only frequently used symbols and abbreviations are given.

List of Symbols
x_k        State vector
y_k        Measurement vector
θ^i        A vector of constant values in the model
y_{1:k}    Sequence of measurements up to time k
r          Residual
u          Input signal
G_P        Monitored process variable
G_M        Actual process variable
y_e        Estimated output
N          Number of faulty models
m          Window size in statistical test
P          Nominal parameter
P_e        Estimated parameter
θ^0        Null hypothesis
θ^1        Alternate hypothesis
ν_k^i      The state noise sequences
ω_k^i      The measurement noise sequences
w^(i)      Sampling weight

List of Abbreviations
CSTR     Continuous stirred tank reactor
CUSUM    Cumulative sum test statistic
DOS      Dedicated observer scheme
EDD      Expected detection delay
EKF      Extended Kalman filter
FAR      False alarm rate
FD       Fault detection
FDI      Fault detection and isolation
FDII     Fault detection, isolation and identification
FI       Fault isolation
FTC      Fault tolerant control
GLR      General log-likelihood ratio
GOS      General observer scheme
LLR      Log-likelihood ratio
MAR      Missed alarm rate
MPC      Model predictive control
PDF      Probability density function
SMC      Sequential Monte Carlo
SSM      Stochastic state-space model
UKF      Unscented Kalman filter

Acknowledgments
First and foremost, all praises and thanks are due to Allah, the Almighty, for giving me the great blessings, capability and opportunity to accomplish this work. I would also like to acknowledge and thank all of the people who have inspired, encouraged and supported me over the last few years.

I would like to express my heartfelt gratitude to my PhD supervisors, Professors Bhushan Gopaluni and Ezra Kwok, for their support and encouragement during these years. Without their guidance and constant feedback, this PhD would not have been achievable. Professor Gopaluni has been my primary resource for answering my questions and was very helpful in doing all the simulation work in this thesis, as well as in preparing my papers for submission. I would also like to thank Dr.
Aditya Tulsyan, who is currently working as a postdoctoral associate in the Department of Chemical Engineering at the Massachusetts Institute of Technology, USA, for his help and support in completing the last chapter of this thesis.

I am also very grateful to my employer, the Royal Commission for Jubail and Yanbu, for giving me the opportunity to pursue my graduate studies. Thanks also go to the Saudi Arabian Cultural Bureau in Canada for their invaluable support and facilities during my studies.

A very special thanks goes to my beloved wife, Shaikha, for her patience, encouragement, and continual support. She has been a great source of inspiration, love, and motivation, filling the absence of the rest of our family. I also would like to thank my wonderful children, Zainab, Mohammed, and Yousif, who joined our lovely family during my PhD studies, for always making me smile and giving me their love.

Finally, I would also like to express sincere gratitude to my beloved mother and father for always believing in me and for their encouragement and love throughout the duration of my studies.

Chapter 1
Introduction
In this thesis, model-based fault diagnosis and alarm system designs in chemical processes are addressed for non-linear and non-Gaussian stochastic systems. In the literature, model-based fault diagnosis designs for chemical processes have received less attention than those in other disciplines (e.g., aerospace, mechanical and electrical engineering) because of the non-linear nature of these processes. However, in recent years, fault diagnosis problems in non-linear systems have received more attention due to the increasing demand for higher safety and reliability standards in chemical plants, as well as the growth in computational capabilities, which has made statistically intensive methods such as sequential Monte Carlo techniques affordable and practical [34].

This chapter is organized as follows.
Section 1.1 presents introductory background as well as the author’s motivation for studying the field of FDI and alarm system designs. This is followed in Section 1.2 by a review of fundamental definitions and concepts in fault diagnosis and a classification of faults. Section 1.3 lists the characteristics of a fault diagnosis system. Section 1.4 offers a brief introduction to alarm management. Section 1.5 outlines the thesis and summarizes its main contributions. Finally, the objectives of the thesis are presented in Section 1.6.

1.1 Background and motivation
Modern chemical processes have become large, complex and highly interconnected due to the increasing demands for both quantity and quality of chemical products. As a result, the automation used in these processes has also become more complex [115]. However, the increased level of automation, and its complexity, often leave these processes susceptible to unexpected abnormal situations that have significant safety, economic, and environmental impacts. An abnormal situation can be defined as a disturbance or series of disturbances in a process that causes plant operations to deviate from their normal operating state and may lead to minor or catastrophic consequences [42]. A survey by [96] revealed that the US petrochemical industry alone could save up to $10 billion annually if abnormal process behaviour could be detected, diagnosed and appropriately dealt with. Vedam and Venkatasubramanian [116] highlighted that the same industry loses over $20 billion per year due to inappropriate reactions to abnormal process behaviour. Another global survey, covering the USA, the UK, Canada, the European Union and Japan, indicated that about 42% of abnormal operations were caused by human error, as shown in Figure 1.1 [96].
Therefore, to maintain high safety and reliability standards in chemical plants, it is important that abnormal conditions and faults be detected and isolated as quickly as possible, before they degrade the process and lead to catastrophic incidents [84].

Figure 1.1: Source of abnormal situation.

Fault diagnosis systems monitor a large number of instruments during plant operations. The performance and safety of these plants depend on the detection and isolation of faults as well as the processing of the information from these instruments to raise various safety and performance alarms [12]. The standard industrial practice is to design sensitive alarms that bring every minor safety or performance violation to the attention of operators [11]. More often than not, the majority of these alarms do not require immediate action, but some of them, if left unaddressed, can lead to catastrophic events later. This approach to designing alarms results in operators being flooded with a large number of alarms at any given time. In fact, operators sometimes completely ignore certain recurring but non-critical alarms. This can lead to complacency on the part of operators and management. There are numerous historical examples in which large numbers of alarms, in conjunction with the complacent behaviour of operators, have resulted in catastrophic events with significant human and material losses.
These events often also result in environmental pollution [109]. The following incidents are often cited in the literature to motivate researchers in the areas of fault monitoring and alarm system designs [84]:
• Piper Alpha disaster, Scotland, 1988
• Buncefield fire, Hertfordshire Oil Storage Terminal, England, 2005
• British Petroleum (BP) disaster, Gulf of Mexico, 2010

Such incidents cannot be completely prevented, but at the very least the consequences of faults could be avoided by using a suitable fault diagnosis system and an effective alarm management system that can detect any deviation from the normal behaviour of the process and allow enough lead time for the operator to take appropriate corrective action before catastrophic failures can ensue [127]. For these reasons, the development of fault detection and isolation (FDI) and alarm system designs is receiving more attention from both theoretical and practical points of view.

1.2 Fundamental definitions and concepts in fault diagnosis
The IFAC SAFEPROCESS technical committee [67] defines a fault as an unpermitted deviation of at least one characteristic property or parameter of a system from its normal condition. A fault degrades the system’s performance but may not result in complete loss of system functionality. Faults are thus distinguished from failures: a failure is defined as a permanent interruption of the system’s ability to perform required functions under specified operating conditions. Ideally, a fault diagnosis system is used to monitor the behaviour of the process and provide all information about any abnormality in the process through three different tasks [37]:
1. Fault Detection (FD): decides on the occurrence of a fault in the system and determines the time of occurrence.
2. Fault Isolation (FI): locates exactly the component or subsystem that shows discrepancies in system behaviour.
3. Fault Identification: determines the type, magnitude and cause of the fault once it has been detected and isolated.

A fault diagnosis system, depending on the tasks it performs, is designated as FD (for fault detection), FDI (for fault detection and isolation) or FDII (for fault detection, isolation and identification) [37]. FD and FDI systems have been more extensively studied than FDII.

Figure 1.2: Classification of faults according to their location as sensor, actuator and component faults.

Faults can take place in different parts of a controlled system and may cause the system to deviate from its normal behaviour. In the FDI literature, faults are classified according to where they occur in the system, as shown in Figure 1.2. These faults and their causes are briefly described below.
• Actuator faults: represent partial or total loss of control action due to lock-in-place, outage, cut or burned wiring, short circuits or loss of effectiveness. In the case of complete loss of actuator action, the actuator will not be able to produce any action even if an input is applied to it. However, when the actuator partially fails, it produces only part of the normal control action. If only the gain changes, this can be modelled as either an additive or a multiplicative fault signal [101].
• Sensor faults: represent a variation in the measurements coming from the sensors, due to broken wires, lost contact with the surface, aging, lock-in-place or other causes. They can also be represented as a partial or total loss of the correct reading. The static sensor output measurement y(t) can be influenced by a constant offset, a value-dependent offset or a direction-dependent offset, and such faults can be modelled as additive faults. However, faults that change the dynamics of the sensor can be modelled as multiplicative faults [123], [101].
• Component faults: represent the faults that occur in the components of the plant itself and cause a change in the physical parameters of the system due to wear and tear, aging of components, etc.
They often result in a change in the dynamical behaviour of the controlled system. This class covers all faults that are not classified as sensor or actuator faults. Component faults can be modelled either as multiplicative faults for a change in process parameters (e.g., heat exchanger coefficient, fluid resistance coefficient) or as additive faults (e.g., mass flow leaks) [101].

Further, as shown in Figure 1.3, faults are classified according to their time characteristics as abrupt, incipient or intermittent. Abrupt faults occur instantaneously, often as the result of hardware damage. They have a step-like behaviour, in which the fault term changes abruptly from the nominal value to the faulty value. Usually they are very severe, as they affect the performance and/or stability of the controlled system. Incipient faults represent slow parametric changes, often as a result of aging. They have drift-like behaviour, where the fault term changes gradually from the normal value to the faulty value. They are more difficult to detect due to their slowness but are also less severe. Finally, intermittent faults appear and disappear repeatedly, for instance due to partially damaged wiring. Because of the temporary effect of this type of fault, the fault term changes from the nominal value to the faulty value and returns to the normal value within a short period of time. Abrupt faults are characterized by a “step-like” time profile, while incipient faults are characterized by a “ramp-like” profile.

Figure 1.3: Classification of faults according to their time characteristics.

1.3 Characteristics of a fault diagnosis system

An advanced fault diagnosis system should satisfy most of the following criteria.
However, achieving all of these attributes in a single technique is difficult [65], [64], [41], [111]:
• Early detection and diagnosis: the fault diagnosis system should be capable of accurate, quick and early detection of small faults, with minimal delay and for both abrupt and incipient time behaviours, in order to avoid severe damage during abnormal situations.
• Generality of the diagnosis system: refers to the system's ability to handle different types of faults (e.g., process component, actuator and sensor faults) as well as the ability to handle linear and non-linear systems.
• Isolability: is the ability of the diagnosis system to distinguish between different faults by locating the root cause among different components of the system. Isolability is the most difficult feature to achieve in any FDI system because it depends not only on the diagnostic system but also on the way that the fault affects the system. Moreover, to achieve perfect isolation, the FDI system should be insensitive to uncertainties, i.e., noises, disturbances or modelling errors.
• Robustness: measurement noise, system disturbances and modelling errors are unavoidable in the practical implementation of any fault diagnosis system. So, to have a reliable and accurate FDI system, these uncertainties should be taken into account in its design. More details about robustness are given in Section 2.5.
• Multiple and simultaneous fault identifiability: the ability to identify single, multiple and simultaneous faults is an important and desirable attribute of any diagnosis system. However, achieving it is not easy due to non-linearities and interaction between the states and potential faults.
• Novelty identifiability: is the ability of the diagnosis system to detect and isolate unknown “novel” faults.
The detection of unknown faults is relatively achievable, but isolating and identifying them is extremely difficult because they have not been encountered before.
• Reasonable storage and computational requirements: modest memory and computational requirements are among the most desirable characteristics of any diagnosis system. There is a trade-off between computational complexity and system performance. It is desirable to balance these two requirements to enable fast and accurate online decisions with minimal storage and computational complexity.
• Adaptability: the fault diagnosis system should adapt itself intelligently to any changes in the operating conditions that occur due to disturbances and environmental changes. The system should also be able to adapt to changes that degrade process performance over time.
• Explanatory facility: a diagnostic system should be able to explain where a fault originated and how it propagated in the system to cause the current situation.

1.4 Alarm management system

Once a fault is detected, an alarm signal should be triggered to notify the operator about the fault. “The processes and practices for determining, documenting, designing, operating, monitoring and maintaining alarm systems” [63] are known as alarm management. Alarm management systems are discussed in detail in [63] and [40]. An overview of the alarm management lifecycle is provided in Figure 1.4. The overall objective of the alarm management system is documented in the philosophy component, which provides guidance for the other lifecycle stages [10]. Potential alarms are identified using different methods, such as process hazard analysis, safety requirement specifications, recommendations from an incident investigation and other possible approaches. The rationalization step determines the properties of every alarm (e.g., alarm setpoint, alarm priority), its consequences and the actions that the operator should take.
In this stage, each potential alarm is tested against the criteria already documented in the philosophy stage. The alarm parameters and alarm attributes are designed in a detailed design stage based on the requirements documented in the philosophy stage. As stated in [10], the rest of the blocks in the alarm lifecycle are quite self-explanatory. In the audit stage, the overall performance of the alarm system is reviewed periodically to identify possible improvements in the alarm philosophy.

Figure 1.4: Alarm management lifecycle.

The performance of fault diagnosis and alarm management systems is an important concern in process industries and can be measured using three major criteria: false alarm rate, missed alarm rate and expected detection delay. The false alarm rate (FAR), also known as a type I error, is the probability that an alarm is raised while the process is operating in the normal region. The missed alarm rate (MAR), also known as a type II error, is the probability that no alarm is raised while the process is operating in the abnormal region. The relationship between the false alarm and missed alarm rates is shown in the confusion matrix in Figure 1.5. The average time that the system requires to raise an alarm after an incident occurs is termed the expected detection delay (EDD), which is the difference between the time of occurrence and the detection time. So, to have a robust diagnosis system, these three criteria must be controlled to avoid performance degradation of the overall process. A number of alarm design methods have recently come into wide use in theory and practice, and proper implementation of those methods will improve an alarm system's performance.

1.5 Outline and contributions of this thesis

The dissertation is organized as follows:

Chapter 2 gives the general background to the development of FDI techniques for linear as well as non-linear systems. Then, an overview and general classification of model-based fault diagnosis methods are presented.
The importance of model uncertainties that can affect the robustness of the diagnosis system is highlighted. Finally, state-of-the-art model-based FDI techniques for non-linear systems are described, as these are the main focus of this thesis.

Figure 1.5: Types of errors in decision making.

In Chapter 3, a novel model-based algorithm for FDI in stochastic non-linear systems is proposed. The algorithm monitors changes in the process behaviour and identifies a corresponding fault using a bank of particle filters running in parallel. The particle filters are used to generate a sequence of hidden states that are then used in a log-likelihood ratio to detect and isolate the fault. The approach is demonstrated through the implementation of two highly non-linear case studies: a multi-unit chemical reactor system and a polyethylene reactor system. The effectiveness and robustness of the proposed algorithm are illustrated by comparing the results with FDI techniques that use EKF and UKF state estimators instead of particle filters.

In Chapter 4, another approach is proposed to isolate actuator and sensor faults in stochastic non-linear and non-Gaussian systems. The approach is based on a particle filter algorithm using a general observer scheme (GOS) in which a bank of particle filters is used to generate a set of residuals, each sensitive to all but one fault. Faults are detected using a single particle filter, while they are isolated by monitoring the behaviour of the residuals generated by the set of particle filters. The application of the approach is demonstrated through implementation in two highly non-linear case studies: a simple heat exchanger system, and a reactor-separator process consisting of two continuously stirred tank reactors and a flash tank separator with a recycle stream.

In Chapter 5, an algorithm for designing alarms for systems with unknown and known models is presented. In the case of unknown models, our approach is based on Monte Carlo simulations.
In the case of known models, it makes use of a probability density function approximation algorithm called particle filtering. The alarm design algorithm is illustrated through two simulation examples. The proposed alarm design is shown to be effective in detecting the fault even though the measurements are non-Gaussian.

Finally, in Chapter 6, conclusions are drawn and directions for future work are indicated.

1.6 Objectives

The specific objectives of this thesis are as follows:
1. Develop a particle filter algorithm that estimates the state of stochastic non-linear/non-Gaussian models with good performance for FDI purposes.
2. Develop a hypothesis testing algorithm, i.e., a log-likelihood ratio (LLR) test, to be used as a decision maker regarding the presence or absence of faults, and couple it with the particle filter algorithm.
3. Apply the approach to real processes to demonstrate its effectiveness and robustness in comparison with the available approaches used for stochastic non-linear processes, i.e., EKF and UKF.
4. Investigate the capability of the proposed algorithm in dealing with different classes of faults, both according to their location of occurrence in the system (actuator and sensor faults) and with respect to their time characteristics (abrupt and incipient).
5. Develop an FDI algorithm to detect and isolate actuator and sensor faults, based on a general observer scheme (GOS) using a bank of particle filters, and implement it in industrial case studies.
6. Develop an algorithm to design alarms for unknown and known models using particle filters.
7. Different alarm design techniques are described in the literature, such as on/off delay timers, deadbands and filtering.
We try to develop each of them based on particle filter approximation by balancing three different performance indices: false alarm rate, missed alarm rate and expected detection delay.

Chapter 2
Literature Review

In recent years, increasing attention has been paid to the theory and application of fault diagnosis and alarm design techniques, due to safety, economic and environmental concerns. Many different FDI methods and alarm design techniques for linear and non-linear systems have been developed and applied to several industrial processes. The aim of this chapter is to review the basic concepts and general classifications of existing diagnosis techniques, with particular attention to state-of-the-art model-based FDI approaches in non-linear systems. Section 2.1 provides an introduction to and general classification of fault diagnosis approaches. The differences between qualitative and quantitative model-based fault diagnosis methods are discussed in detail in Section 2.2. Section 2.3 discusses data-based fault detection and isolation approaches. Section 2.4 presents two different model-based fault isolation strategies. Section 2.5 discusses the importance of model uncertainties in FDI designs. State-of-the-art model-based non-linear FDI techniques are then described in Section 2.6. Finally, Section 2.7 provides a brief introduction to alarm design methods.

2.1 Introduction

A number of techniques are used for fault diagnosis in technical processes. In general, there are two main categories of diagnosis techniques: physical redundancy and analytical redundancy. Physical redundancy is a classical approach in fault diagnosis, in which redundant hardware (e.g., two or more sensors or actuators) is used to measure or control a particular variable. During normal operation, one piece of hardware is sufficient to do the work; however, two or three are usually required to ensure reliable measurements and control in faulty cases.
The most important advantages of the physical redundancy approach are the ease of implementation and the direct isolation of the faulty components. However, using the physical redundancy approach doubles the design, maintenance and operational costs, making it impractical to apply in complex processes. Also, in some cases it cannot detect faults because the affected parameters cannot be measured directly. In addition, this approach dramatically increases the system's size, complexity, weight and power consumption. Hence, the physical redundancy approach is very expensive and can only be justified for critical systems. On the other hand, the analytical redundancy method, which was introduced in the early 1970s by Beard [2], does not need extra hardware because the redundancy is generated from the process model instead, thereby reducing weight and cost. Analytical redundancy is a more affordable scheme that compares the actual system behaviour against a mathematical model to check for consistency; it is often referred to as the model-based approach to fault diagnosis. Any inconsistency is used for detection and isolation purposes [90], [18], [121]. Figure 2.1 illustrates the main concepts of physical redundancy and analytical redundancy. In this thesis, we utilize the model-based approach to solve fault detection and isolation problems as well as alarm design problems.

Figure 2.1: Comparison between physical and analytical redundancy-based schemes for FDI [110].

The mathematical models used in model-based FDI approaches can be derived primarily by one of the following two methods: theoretical (physical) modelling, or experimental modelling by a model identification technique [65]. Theoretical modelling, whose results are sometimes called “white-box models”, uses a priori knowledge of the system to develop mathematical models using the physical laws that describe the dynamic behaviour of the monitored system, which can be described by a set of difference or differential equations.
This is also known as a first-principle approach. In contrast, experimental models, or “black-box models”, which are also known as the data-based approach, are derived by evaluating the input and output measurements using a large amount of historical data by means of process identification. The general classification of the fault detection and isolation methods is shown in Figure 2.2, and a more detailed description of the two approaches is given below.

2.2 Model-based fault diagnosis methods

The model-based approach to fault detection in dynamic systems has received increasing attention over the last four decades, both in a research context and in the domain of application studies of real systems. The literature presents a wide variety of techniques, based on the use of mathematical models of the process under investigation and exploiting modern control theory [23].

Figure 2.2: Model-based fault detection methods [110].

Model-based fault detection can be broadly classified as qualitative or quantitative. Quantitative models use a priori knowledge of the system to develop mathematical models according to the physical laws that describe the dynamic behaviour of the monitored system, using a set of difference or differential equations. Qualitative models, on the other hand, are expressed in terms of qualitative knowledge about the process and are also known as knowledge-based approaches. Typical qualitative models are causal models and abstraction hierarchies [117].

2.2.1 Quantitative model-based methods

Quantitative model-based FDI methods rely on an explicit model and typically consist of two main steps: residual generation (the procedure of extracting fault symptoms from the process) and residual evaluation (the procedure of decision making) [29]. The two steps are depicted in Figure 2.3 and are described as follows:
1. Residual generation: The mathematical model utilizes the input u(t) and output measurements y(t) from the real process to generate residual signals. Information about the fault can be extracted by checking the consistency between the actual measured variables of the monitored system and the estimated variables obtained from the mathematical model. The residual should normally be zero or close to zero when no fault is present, but distinguishably non-zero when a fault occurs [6]. In practice, the residual signal is usually compared to suitable thresholds to avoid false alarms arising from measurement noises, disturbances or model uncertainties [23]. The residual is therefore the carrier of the fault information, and the procedure used to generate it is called the residual generation step. So, to have a robust FDI method, we need to have a good residual generator.
2. Residual evaluation: The residuals r(t) generated in the previous step are then evaluated for the likelihood of faults so that a decision can be made as to the occurrence of faults. There are a number of statistical techniques to evaluate the residuals for such deviations. In the literature on FDI, residuals are often evaluated using a log-likelihood ratio (LLR), generalized log-likelihood ratio (GLR), sequential probability ratio test (SPRT) or cumulative sum test statistic (CUSUM).

Figure 2.3: Model-based fault detection scheme.

Since 1970, many fault diagnosis approaches based on analytical redundancy have been developed and applied for linear and non-linear systems. However, the application of these approaches depends on the kind of faults to be detected (e.g., additive and multiplicative faults). As shown in Figure 2.4, additive faults influence a variable Y by the addition of the fault f (e.g., a sensor bias) and multiplicative faults by the product of another variable U with f (e.g., a parameter change within the process) [66].
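
The two steps above can be sketched for the simplest possible case. The example below is only an illustration under assumed conditions, not the method developed in this thesis: it assumes a static, perfectly known model y = GAIN * u, an additive sensor bias appearing at a hypothetical sample FAULT_TIME, and a one-sided CUSUM test as the residual evaluator. All names and numerical values (GAIN, SIGMA, BIAS, the drift and threshold settings) are chosen for the illustration.

```python
import random


def cusum_alarm(residuals, drift, threshold):
    """Return the index of the first one-sided CUSUM alarm, or None."""
    g = 0.0
    for k, r in enumerate(residuals):
        # Accumulate evidence of a positive mean shift; reset at zero.
        g = max(0.0, g + r - drift)
        if g > threshold:
            return k
    return None


random.seed(0)
GAIN = 2.0        # assumed (known) model gain
SIGMA = 0.1       # measurement noise standard deviation
FAULT_TIME = 100  # sample at which the additive fault appears
BIAS = 0.5        # magnitude f of the additive sensor fault

residuals = []
for k in range(200):
    u = 1.0 + 0.5 * random.random()
    f = BIAS if k >= FAULT_TIME else 0.0
    # Additive fault: the fault term is added to the measured output.
    y = GAIN * u + f + random.gauss(0.0, SIGMA)
    # Residual generation: measured output minus model output.
    residuals.append(y - GAIN * u)

# Residual evaluation: drift set to half the fault size to be detected.
alarm_index = cusum_alarm(residuals, drift=BIAS / 2, threshold=2.0)
print("first alarm at sample", alarm_index)
```

A multiplicative fault would instead enter as a change in GAIN itself, so the residual would scale with the input u rather than shift by a constant.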
In general, the most popular approaches used to generate residuals that can handle different types of faults can be split into three categories [37], [102], [4], [105], which will now be described.

Figure 2.4: Basic models of faults: a) additive fault; b) multiplicative fault.

Parity space relation

The parity space relation is one of the three most popular FDI schemes available in the literature. Early contributions to the parity space approach were made by [106], [8], [9], [36], [29], [27] and [54], in which the residual is generated using the so-called parity functions defined over a time window of system data. The parity relation approach provides a direct consistency check between the model and measured process outputs [54]. For more details about the approach, see [28]. The basic idea of this method is shown in Figure 2.5, where GP represents the monitored process, which runs in parallel with the mathematical model GM of the process. The residual r based on parity relations can then be expressed mathematically as in Equation 2.1, with u denoting the input signal:

r = (G_P - G_M) u    (2.1)

Figure 2.5: Schematic diagram of the parity space approach.

The parity space approach assumes that the model parameters are known and constant. Thus, residuals based on this method are susceptible to model inaccuracies due to system uncertainties. To overcome the problem of imprecise models, robust parity relations were introduced in the mid 1980s by Chow and Willsky [29] and by Lou, Willsky and Verghese [27]. Although parity space methods have been developed and applied primarily for linear systems, they have also been extended to cover some classes of non-linear systems in different disciplines [79], [95]. Furthermore, a number of FDI methods were first introduced in a parity space framework and then extended to a state estimation framework, because the parity space methodology is much easier to design and implement and is equivalent to some extent to the observer-based approach.
It has almost the same properties as the observer-based approach, in that it can only deal with residual generation problems [48], [104], [65]. From this viewpoint, it is reasonable to include the parity space methodology in the framework of observer-based FDI techniques [37].

State estimation approach

The state estimation (observer/filter) based approach for fault diagnosis was first introduced in the early 1970s by [2] and then modified by [3] to the so-called Beard-Jones fault detection filter. The basic idea behind this approach is to estimate system outputs from measurements by using either observers in deterministic processes or filters in stochastic processes (e.g., a Luenberger observer for deterministic processes, and a Kalman filter or a particle filter for stochastic processes). In the state estimation approach, residuals are generated by comparing actual system measurements with measurements estimated using observers or filters. This approach has been widely used in the literature and in applications (more than 50% of sensor faults are detected using observer-based methods [67]). A schematic diagram of this approach is shown in Figure 2.6. In this figure, ye denotes the estimated system output, y denotes the measured system output and r denotes the residual, which can be calculated according to Equation 2.2:

r = y - y_e    (2.2)

Figure 2.6: Schematic diagram of the state estimation approach.

The state estimation approach has been the most extensively studied and applied for linear and non-linear systems. Different researchers [103] have shown it to be the most suitable approach for fault detection, since it is inherently fast and causes only a very short time delay in a real-time decision-making process in comparison with the parameter estimation approach. This method is especially useful when dealing with cases in which states or outputs are not measurable. Like the parity relation method, diagnostic observers also require accurate knowledge of the system model parameters.
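
As a rough illustration of Equation 2.2, the sketch below generates residuals r = y - y_e with a scalar Kalman filter. It is a linear-Gaussian stand-in used only for brevity, not the particle filter developed in this thesis, and the model x(k+1) = A x(k) + B u(k) + w(k), y(k) = x(k) + v(k), together with all numerical values and names, is an assumption made for the example. A sensor bias is injected at a hypothetical sample FAULT_TIME.

```python
import random


def kalman_residuals(inputs, outputs, a, b, q, r_var, x0=0.0, p0=1.0):
    """Return the residuals r = y - y_e of a scalar Kalman filter."""
    x_hat, p = x0, p0          # predicted state and its variance
    residuals = []
    for u, y in zip(inputs, outputs):
        y_e = x_hat            # estimated output (measurement matrix is 1)
        res = y - y_e          # Equation 2.2
        residuals.append(res)
        gain = p / (p + r_var)           # measurement update
        x_hat = x_hat + gain * res
        p = (1.0 - gain) * p
        x_hat = a * x_hat + b * u        # time update (prediction)
        p = a * a * p + q
    return residuals


random.seed(1)
A, B = 0.5, 1.0                 # assumed model parameters
Q, R_VAR = 0.01, 0.01           # process and measurement noise variances
FAULT_TIME, BIAS = 100, 1.0     # hypothetical sensor bias fault

x = 0.0
inputs, outputs = [], []
for k in range(200):
    u = 1.0
    bias = BIAS if k >= FAULT_TIME else 0.0
    outputs.append(x + bias + random.gauss(0.0, R_VAR ** 0.5))
    inputs.append(u)
    x = A * x + B * u + random.gauss(0.0, Q ** 0.5)

res = kalman_residuals(inputs, outputs, A, B, Q, R_VAR)
mean_before = sum(res[50:100]) / 50
mean_after = sum(res[100:150]) / 50
print(round(mean_before, 3), round(mean_after, 3))
```

Because the filter partly tracks the biased measurement, the residual mean after the fault settles below the bias itself; this is one reason the residual is normally passed to a statistical test rather than read directly.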
The relationship between diagnostic observers and parity relation-based designs has been discussed by Magni and Mouyon for linear systems [87].

Parameter estimation approach

The third approach to residual generation is the parameter estimation approach, which was first introduced by [20], [5] and is based on the assumption that faults are reflected in the physical parameters of the system rather than in a change in the system output. Thus, the fault can be modelled as an additive term acting on the parameter vector of the system, i.e.,

P_M = P_{nom} + f    (2.3)

with P_M and P_{nom} standing for the modelled parameter and nominal parameter, respectively. The parameter estimation approach requires only the structure of the system model rather than an accurate model of the system. In some cases, the parameters of the system model may, as part of their normal behaviour, vary continuously within certain ranges. In these cases, the parameter estimation approach is more appropriate than the other approaches, which assume that the model parameters are known and constant. The fault decision is made by comparing the estimated parameter obtained from the measurement data with the nominal process parameter. The parameter estimation-based fault detection method is illustrated in Figure 2.7, where P_e represents the estimated parameter, P_{nom} represents the nominal parameter and r represents the residual as calculated in Equation 2.4:

r = P_{nom} - P_e    (2.4)

One of the most important advantages of the parameter estimation approach is that it can estimate the magnitude of the fault that occurs in the process [46]. However, parameter estimation methods usually need persistent excitation of the process input to allow online diagnosis. The approach is also recommended for component faults or faults that change the dynamics of sensors and actuators [64].
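
Equations 2.3 and 2.4 can be illustrated with a deliberately small example. The sketch below assumes a static single-parameter model y = theta * u, estimates theta by ordinary least squares over a window of data, and forms the residual r = P_nom - P_e. The model structure, the window length and all numerical values are assumptions made for the illustration; the input is drawn at random so that it is persistently exciting.

```python
import random


def least_squares_gain(inputs, outputs):
    """Least-squares estimate of theta in y = theta * u (no intercept)."""
    num = sum(u * y for u, y in zip(inputs, outputs))
    den = sum(u * u for u in inputs)
    return num / den


def simulate_window(theta, n):
    """Generate n input/output samples of y = theta * u + noise."""
    inputs = [random.uniform(0.5, 1.5) for _ in range(n)]
    outputs = [theta * u + random.gauss(0.0, 0.05) for u in inputs]
    return inputs, outputs


random.seed(2)
P_NOM = 2.0    # nominal parameter
FAULT = -0.5   # additive fault f on the parameter (Equation 2.3)
WINDOW = 50    # samples per estimation window

u_ok, y_ok = simulate_window(P_NOM, WINDOW)
u_bad, y_bad = simulate_window(P_NOM + FAULT, WINDOW)

# Equation 2.4: residual = nominal parameter minus estimated parameter.
r_normal = P_NOM - least_squares_gain(u_ok, y_ok)
r_faulty = P_NOM - least_squares_gain(u_bad, y_bad)
print(round(r_normal, 2), round(r_faulty, 2))
```

With this sign convention the residual approximates -f, so its value gives a direct estimate of the fault magnitude, which is the advantage noted above.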
Parameter estimation techniques have also been applied to fault detection in non-linear systems; a study of parameter estimation-based fault detection in non-linear systems can be found in [65] and its application to a non-linear satellite model in [72]. There is a close relationship between parameter identification-based fault detection and the observer-based fault detection approach, as demonstrated in [86].

Figure 2.7: Schematic diagram of the parameter estimation approach.

As discussed above, no general method exists for solving all fault diagnosis problems; all available approaches have advantages and disadvantages with regard to the detection of different types of faults. So, to utilize these advantages, a combination of different model-based fault detection methods should be used. Isermann [65] summarizes the different combinations as follows:
• Sequential parameter estimation and parity equations: in this combination, parameter estimation is used to obtain the model, while parity equations are used for fault detection with less computation.
• Sequential parameter estimation and state estimation: parameter estimation is used to obtain the model and state estimation is used for fast change detection.
• Parity equations and state estimation: this combination is used for multiplicative and additive faults and depends on input excitation.

2.2.2 Qualitative model-based methods

Qualitative models can be developed as either qualitative causal models or abstraction hierarchies, as shown in Figure 2.8. In causal models, the cause-effect relations can be represented as a set of variables in the form of nodes in a directed graph. Causal models are a very good alternative when quantitative models are not available but the functional dependencies are understood. Another form of model knowledge comes from developing abstraction hierarchies based on decomposition.
The idea of decomposition is to be able to draw inferences about the behaviour of the overall system solely from the laws governing the behaviours of its subsystems.

Abstraction hierarchies help to quickly focus the diagnostic system's attention on problem areas. One of the advantages of qualitative methods based on deep knowledge is that they can explain a fault propagation path; this is indispensable when it comes to decision support for operators. They can also guarantee completeness, in that the actual fault will not be missed in the final set of faults identified. However, they suffer from resolution problems resulting from ambiguity in qualitative reasoning. When quantitative information is partially available, one can use order-of-magnitude analysis or interval calculus to improve the resolution of purely qualitative methods [54].

Figure 2.8: Classification of qualitative-based methods [117].

2.3 Data-based fault diagnosis methods

Unlike model-based methods, which require a priori knowledge of the system, data-based fault diagnosis methods require only large amounts of historical data. If the system is too complex for a model to be derived using a first-principle method, then a data-based method is more appropriate. However, one of the major disadvantages of these techniques is that they require large amounts of historical data for both nominal and faulty conditions, which may be costly to obtain. In addition, because of the non-linear behaviour of chemical processes, it is often difficult to distinguish regions of faulty operation due to overlap [97]. There are different ways in which data can be transformed and presented as a priori knowledge to a detection system; these are known as feature extraction. In terms of feature extraction, model-free methods can be either qualitative or quantitative in nature.
Two of the major methods that extract qualitative historical information are expert systems and trend modelling methods. Methods that extract quantitative information can be non-statistical or statistical. Neural networks are an important class of non-statistical classifiers. Nowadays, data mining is one of the most active research fields. The key advantage of data mining-based fault detection is that it can automatically generate concise and accurate detection models from large amounts of data.

2.4 Model-based fault isolation strategies

Once a fault is successfully detected, it must be isolated in order to distinguish it from others within a monitored system. Basically, a fault is detected using a single residual, while model-based fault isolation can be achieved using a set of residuals, based on one of two frameworks: structured residuals and directional residuals.

Structured residual

The main idea behind this approach is to design a bank of structured residuals, in which some of the residuals are designed to be sensitive to a certain group of faults, but insensitive to others. The design procedure consists of two steps: (1) specify the sensitivity and insensitivity relationships between residuals and faults, according to the assigned isolation task; (2) design a set of residual generators according to the desired sensitivity and insensitivity relationships [4]. The structured residual approach can be designed in two different ways: the dedicated residual scheme and the general residual scheme.

A. Dedicated residual scheme
In a dedicated residual scheme (which was introduced by [30]), one measurement is fed into each residual generator; these generators are designed to be sensitive only to single faults, as shown in Figure 2.9. In the literature, this is also widely known as a dedicated observer scheme (DOS). Two restrictions arise in this type of multiple observer/filter state estimation-based FDI scheme.
First, since each observer/filter in the scheme is driven by only one output measurement, the states of the plant should be completely observable through each sensor or actuator, which is not always the case in practical applications. Second, multiple and simultaneous faults are difficult to identify, especially in large processes [59]. If all possible faults are to be isolated, a dedicated residual set can be designed according to the following fault-sensitive condition:

r_i(t) = G(f_i(t)),  i \in \{1, 2, \dots, N\}    (2.5)

where G(·) stands for a function relation and N is the number of faults to be isolated within the process. A simple threshold logic, as in [4], can be used to decide on the presence or absence of faults:

r_i(t) > \xi_i  \Rightarrow  f_i(t) \neq 0    (2.6)

where \xi_i (i = 1, 2, \dots, N) are predetermined thresholds corresponding to residuals r_i (i = 1, 2, \dots, N). The threshold values are selected in such a way that the false alarm and missed alarm rates are minimized.

Chen and Saif [24] recently extended Clark's DOS to actuator fault isolation. Their scheme is able to detect and isolate multiple actuator faults using a bank of N observers, where N is the total number of actuators in the system under consideration.

Figure 2.9: Model-based FDI based on a dedicated observer scheme.

B. General residual scheme
An alternative approach is the general residual scheme, also known as the general observer scheme (GOS). In this approach, each residual is designed to be sensitive to all but one fault [4], i.e.:

r_1(t) = G(f_2(t), \dots, f_N(t))
\vdots
r_i(t) = G(f_1(t), \dots, f_{i-1}(t), f_{i+1}(t), \dots, f_N(t))
\vdots
r_N(t) = G(f_1(t), \dots, f_{N-1}(t))    (2.7)

The isolation task can be achieved using simple threshold testing, according to the following logic:

r_i(t) \le \xi_i  and  r_j(t) > \xi_j  \forall j \in \{1, \dots, i-1, i+1, \dots, N\}  \Rightarrow  f_i(t) \neq 0    (2.8)

Fixed direction residual

Another method of fault isolation is to design a directional residual vector.
This idea is the basis of a fault detection filter, in which the residual vector receives specific directions, depending on the fault that is acting upon the system.

Figure 2.10: Model-based FDI based on a general observer scheme.

2.5 Robustness of model-based FDI

Model-based approaches make use of mathematical models to generate residuals. However, due to the nature of the processes, it is difficult to develop a precise and accurate mathematical model that can represent the actual dynamic behaviour of the process. Model uncertainties such as modelling errors, disturbances, noise and parameter variations usually make designing a mathematical model challenging, especially when the process is non-linear and the noise is non-Gaussian. Hence, there are always discrepancies between the actual system behaviour and its mathematical model, even if there is no fault. Such deviations can obscure the effects of faults and become a potential source of false alarms and missed alarms. Therefore, the effects of model uncertainties should be considered in FDI designs because these are crucial factors that may affect the performance of the diagnosis system [23]. Usually, model uncertainties are modelled as an external input (“additive noise”) or as a parameter deviation (“multiplicative noise”), which changes the system characteristics. So, to have a robust diagnosis system, model uncertainties should be taken into account because they are unavoidable even during normal operation. An important task of any model-based FDI scheme is to be able to detect different types of faults, i.e., abrupt, incipient and intermittent faults, in the presence of model uncertainties.

2.6 State-of-the-art model-based non-linear FDI techniques

FDI for linear dynamic systems has been extensively studied theoretically and practically, and the literature contains a number of well-established methods.
Detailed studies of these methods are available in recent books [23], [37], [112] and survey papers [47], [46], [6], [49], [67]. In practice, most industrial processes are non-linear in nature. Therefore, to design a fault diagnosis system that can handle non-linearity, systems are usually linearized around certain operating points, and linearization errors are modelled as unstructured uncertainties. Using linear FDI schemes on non-linear systems apparently limits the performance and stability of the diagnosis system, especially when the system is highly non-linear. Usually, using linear model-based approaches will result in high false alarm rates, with most faults going undetected [6]. FDI techniques for non-linear systems have therefore received considerable research attention in recent years; see, for instance, [48], [52], [118] and the references therein.

To generate the residuals, model-based methods typically rely on the model being linear and noise being Gaussian [47]. Extensions to non-linear systems have also been considered in the literature; these, however, are based on suboptimal state estimators such as extended Kalman filters (EKF) and unscented Kalman filters (UKF). These suboptimal filters approximate the non-linear system through linearization and/or assume that the noise is Gaussian. The approximations are often not satisfactory and lead to a high rate of false alarms. The advent of high-performance computers has allowed extensive use of computationally intensive methods such as sequential Monte Carlo (SMC), which will be described in detail in Chapter 3.

2.7 Alarm design

After the fault diagnosis system detects an abnormality in the process, an alarm should be raised to assist the process operators in managing this abnormality. The integrity and effectiveness of alarm systems either guide the operators to the fault or distract them from the real cause. The alarm design problems constitute an important area that has generated great interest.
Recently, a number of researchers have tried to tackle this problem.

Naghoosi et al. [39] developed a method to estimate the chattering index, based on statistical properties of the process variable as well as alarm parameters. The estimate can be used for developing analytical methods to optimally design alarm parameters for minimal chattering. Another alarm design approach has been proposed in [120], wherein the researchers formulated two novel rules to detect chattering alarms caused by random noise and repeating alarms arising from regular patterns such as oscillation; they proposed an online method to effectively remove chattering and repeating alarms via an m-sample delay timer. In another study [119], an online method was proposed to detect the presence of chattering alarms due to oscillating process signals and to reduce the number of these chattering alarms. The authors suggested a revised chattering index to quantify the level of chattering alarms; a discrete cosine transform-based method is used to detect the presence of oscillation; then two mechanisms, which involve adjusting the alarm trippoint and using a delay timer, are exploited to reduce the number of chattering alarms.

The concept of run length is introduced in the alarm management context to study alarm chatter, and an index is proposed to quantify the degree of alarm chatter based on run length distributions obtained exclusively from readily available historical alarm data. The chatter index hence plays a crucial role in the routine assessment of industrial alarm systems. Prominent features of the proposed chatter index and its variants are demonstrated using industrial data [77]. In the paper [128], the Gaussian kernel method is applied to generate pseudo-continuous time series from the original binary alarm data. This can reduce the influence of missed, false and chattering alarms.
By taking into account time lags between alarm variables, a correlation colour map of the transformed or pseudo-data is used to show clusters of correlated variables, with the alarm tags reordered to better group the correlated alarms. However, one main disadvantage of this method is that when computing correlations between two time series, the simultaneous zeros in both series bear little information because they are in the normal region, resulting in the reduction of correlation coefficients.

Adnan and co-workers in [11] used Markov processes to calculate detection delays for deadbands and delay timers. They proposed a design procedure that is a compromise between the alarm indices (i.e., detection delay, false alarm rate and missed alarm rate) for an optimal configuration. Adnan et al. [11] propose a generalized delay timer framework wherein, instead of $n$ consecutive samples as in the conventional case, $n_1$ out of $n$ consecutive samples ($n_1 \le n$) are considered to raise an alarm. For the generalized delay timer, three important performance indices, namely, the false alarm rate (FAR), the missed alarm rate (MAR) and the expected detection delay (EDD), are calculated using Markov processes. Moreover, the performance and sensitivity of generalized delay timers are compared with those of conventional delay timers.

Naghoosi and co-workers in [94] used the alarm deadband method to reduce false alarm rates, missed alarm rates and the amount of chattering. In this paper, they studied the relation between optimal alarm thresholds and deadbands. Two equations have been proposed to estimate the optimal threshold with respect to the deadband and history of a process variable. A moving average filter technique to improve alarm accuracy has been employed in [25] and [26]. The authors proposed a numerical optimization-based procedure to reduce the weighted sum of false and missed alarm rates.
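The two alarm mechanisms discussed above, the ($n_1$ out of $n$) delay timer and the deadband, can be illustrated with a minimal sketch. It is not the design procedure of [11] or [94]: the function names, the handling of the first few samples and the omission of an off-delay are assumptions made only for illustration.

```python
from collections import deque


def delay_timer_alarm(signal, threshold, n, n1):
    """Per-sample alarm indicator for an (n1 out of n) on-delay timer.

    The indicator is True when at least n1 of the most recent n samples
    exceed the threshold; n1 == n gives the conventional delay timer.
    Alarm clearing (off-delay) is not modelled in this sketch.
    """
    if not 1 <= n1 <= n:
        raise ValueError("require 1 <= n1 <= n")
    window = deque(maxlen=n)          # sliding window of exceedance flags
    alarms = []
    for value in signal:
        window.append(value > threshold)
        alarms.append(sum(window) >= n1)
    return alarms


def deadband_alarm(signal, threshold, deadband):
    """Alarm raised above `threshold`, cleared only below threshold - deadband."""
    active = False
    alarms = []
    for value in signal:
        if not active and value > threshold:
            active = True             # raise the alarm
        elif active and value < threshold - deadband:
            active = False            # clear only once below the lower limit
        alarms.append(active)
    return alarms


if __name__ == "__main__":
    noisy = [0, 2, 0, 2, 2, 0, 2, 2, 2]
    print(delay_timer_alarm(noisy, threshold=1, n=3, n1=3))
    print(delay_timer_alarm(noisy, threshold=1, n=3, n1=2))
    print(deadband_alarm([0.0, 1.2, 0.9, 1.1, 0.4, 1.05], 1.0, 0.5))
```

On a signal that hovers around the threshold, the deadband keeps the alarm latched instead of toggling at every crossing, while the delay timer suppresses isolated exceedances; both trade a lower false alarm rate for a longer detection delay.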
The article [124] used an on/off delay timer technique to calculate the FAR, MAR and EDD using the probability density functions (PDFs) of the univariate process variable under normal and abnormal conditions. The authors of [125] used on-delays and moving average filters.

Flooding is a critical problem in process industries; operators can be flooded by a number of alarms and may not be able to respond to all of them in a timely way. Ahmed et al. [13] proposed a new analysis method to investigate similar alarm floods using historical alarm data and grouping them on the basis of patterns in alarm occurrences. Alarm flooding may occur due to either badly designed alarm management systems (AMS) or causal-dependent disturbances, either of which can raise alarms based on a single causal disturbance. The article [45] presents an overview of an algorithm for an automatic alarm data analyzer (AADA). The algorithm is able to find possible and significant reasons for alarm floods by identifying the most frequent alarms and those causal alarms consolidating alarm-sequences. These results can then be used to improve and redesign an AMS so that the alarm flood problem is ultimately reduced.

Chowdgury et al. [43] proposed a technique for enhancing the residual signal to reduce the false alarm rate in fault detection problems. Their approach is based on eliminating noise from the residual signal using autoregressive (AR) modelling. The proper handling of alarms is crucial to automated process control. In practice, many alarms are distractions and do not indicate a fault situation. Chowdgury and colleagues proposed an approach that aims to reduce the occurrence of so-called nuisance alarms. It is a general computerized tool that takes advantage of the control system's built-in functions, and it is a first step in achieving an enhanced overall alarm situation.
With the first prototype, a significant reduction in alarms in a biofuel district heating plant was achieved.

Alarm nuisance can be eliminated using different methods. The most common methods available in the literature are: (1) build a model of all or part of the process; (2) group alarms and use alarm priorities; (3) shelve alarms; (4) tune the alarm limits or apply time delays to the signals [22]. Bergquist et al. [22] proposed an offline computerized approach to reduce nuisance alarms in any kind of plant. Their toolbox is known as the Alarm Cleanup Toolbox (ACT) and provides filtering, time delays, deadbands and alarm windows. The paper [80] proposes an alarm processing algorithm that goes beyond the prioritization of alarms. This conceptual algorithm has additional features that offer the operator recommendations and decisions for the event that has caused the alarms, as well as a feature that may execute controls if the event is non-critical. The algorithm's features are as follows:
• Prioritization on the basis of area responsibility
• Root cause identification of alarms
• Elimination of multiple alarms from the same cause
• Prioritization on the basis of a predetermined list of alarm weights
• Prioritization on the basis of recency

The next generation of alarm software may be able to provide power system operators with tools that have enhanced features bordering on identification and recommendation of control actions, decision-making assistance and actuation of controls.

Chapter 3
FDI Based on Parameter Estimation Using DOS

3.1 Introduction

The performance and efficiency of any model-based fault diagnosis method usually rely on the ability of the dynamic model to mimic the actual behaviour of the process. As discussed in Chapter 2, most real systems are non-linear and non-Gaussian in nature.
So, it is not straightforward to get a perfect approximation of these models using suboptimal state estimators such as extended Kalman filters (EKF) and unscented Kalman filters (UKF). These suboptimal filters either approximate the non-linear system through linearization or assume that the noise is Gaussian. The approximations are often not satisfactory and lead to a high rate of false alarms. In this thesis, an algorithm for fault diagnosis based on a sequential Monte Carlo (SMC) method called a particle filter is proposed. This approach does not require linearization of the process or Gaussianity of noise.

The advent of high-performance computers has allowed extensive use of computationally intensive methods such as SMC. Some literature exists on the use of SMC for fault detection and isolation; it can broadly be classified into algorithms that are based on log-likelihood tests [114], [115], [75], those based on explicit fault models and associated fault probabilities [99], [100], and those based on an assessment of the estimated state probability density functions [57], [73], [91], [33], [1], [107]. There have also been a few applications of these methods [33], [88], [99], [98], [113]. The algorithm proposed in this chapter is based on a modification of the log-likelihood test and has been chosen due to the Neyman-Pearson optimality of the log-likelihood test [84]. The known algorithms based on the log-likelihood ratio test [75] derive the likelihood function through an approximation that is not applicable to all types of non-linearities. Moreover, very little work on fault detection and isolation in non-linear stochastic processes has been reported in the field of chemical engineering. In this chapter, an algorithm that overcomes some of the problems with the algorithms in [114], [115] and [75] is proposed. It is applicable to general non-linear/non-Gaussian stochastic systems.
The work presented here extends the fault detection algorithm in [15] to handle fault isolation, and it compares and contrasts the proposed particle filter-based algorithm with those based on other types of filters.

This chapter is organized as follows. In Section 3.2, the model-based FDI problem is formulated, followed by a description of the fundamentals of SMC in Section 3.3. A brief description of the statistical test used to evaluate the residuals is given in Section 3.4. Section 3.5 presents the application of the proposed method on two highly non-linear systems. Finally, some conclusions and future work are outlined in Section 3.6.

3.2 Problem statement

The model-based FDI algorithm developed in this work is based on a multi-model approach where it is assumed that there are $N$ possible known faults that may occur in the process and there are $N+1$ models $\{M_i\}_{i=0}^{N}$, where $M_0$ corresponds to the nominal process model and $M_i$, for $i = 1,2,\ldots,N$, represents the $i$th faulty model. The FDI approach considered in this work uses a bank of particle filters, in which each particle filter corresponds to a known possible fault and runs in parallel with a particle filter that corresponds to the nominal process.
The standard FDI solution consists of two main steps:
• Fault detection (FD), when one decides whether there has been a change from the nominal model, $M_0$, to one of the faulty models $\{M_i\}_{i=1}^{N}$, and the time at which the change has occurred.
• Fault isolation (FI), when one determines which model among $\{M_i\}_{i=1}^{N}$ the process has assumed.

It is assumed that the nominal process behaviour and all the possible faults can be expressed by the following discrete stochastic non-linear state space models:
\[ x_k^i = f^i(x_{k-1}^i, u_{k-1}^i, \nu_k^i, \theta^i) \tag{3.1} \]
\[ y_k = g^i(x_k^i, u_k^i, \omega_k^i, \theta^i) \quad \text{for } i = 0 \text{ to } N \tag{3.2} \]
where $f^i$ and $g^i$ are the state and measurement dynamic functions for the respective models, $k$ denotes a time instant, $x_k^i$ is the state vector with known initial probability density function $p(x_0^i)$ and $y_k$ is the vector of measurements. The random variables $\nu_k^i$ and $\omega_k^i$ represent the state and measurement noise sequences with known probability density functions with zero mean. $\theta^i$ denotes a vector of constant values in the model and includes not only the parameters but also other process measurements that are assumed to be constant. It is worth pointing out that the noise in the above dynamic equations enters the process in a non-linear fashion. In most traditional approaches to FDI, in linear and to some extent in non-linear processes, it is often assumed that the measurement and state noise enter the process in a linear fashion. This assumption allows one to write the measurement Equation 3.2 as
\[ y_k = g^i(x_k^i, u_k^i, \theta^i) + \omega_k^i. \tag{3.3} \]

This fundamental assumption in the measurement equation allows one to detect faults simply by generating and monitoring the prediction errors (or residuals) between the process measurements and model predictions. The one-step-ahead predictions from Equation 3.3 can be written as
\[ \hat{y}_k^i = g^i(x_{k|k-1}^i, u_k^i, \theta^i) \tag{3.4} \]
where $x_{k|k-1}^i$ is the one-step-ahead prediction of the state and $\hat{y}_k^i$ is the one-step-ahead prediction of the output.
Then the prediction error or the residual can be written as
\[ \hat{r}_k^i = y_k - \hat{y}_k^i. \tag{3.5} \]
If the model used to evaluate the residuals is the same as the true process, then the density function of the residuals, $\hat{r}_k^i$, must closely follow that of the corresponding measurement noise, $\omega_k^0$. Any deviation of the residuals from this density function implies a fault in the process. There are a number of statistical techniques to evaluate the residuals for such deviations. In the literature on FDI, residuals are often evaluated using a log-likelihood ratio (LLR), generalized log-likelihood ratio (GLR), sequential probability ratio test (SPRT) or cumulative sum test statistic (CUSUM).

The noise sequences in the model in Equation 3.1 and Equation 3.2 enter the process in a non-linear fashion, so straightforward residual analysis, as presented in Equations 3.4-3.5, is difficult. To overcome this problem, the authors in [75], [114] propose a log-likelihood ratio test that is based on evaluating the likelihood functions of the measurements. However, this approach presents other difficulties in terms of approximating the likelihood functions. We propose a novel algorithm that resolves the problem of estimating the likelihood function of measurements by instead using a joint likelihood function of the states and the measurements. To estimate the joint likelihood function, an estimate of the hidden states is needed. While there are numerous approaches, we use sequential Monte Carlo-based particle filters.
The advantage of particle filters is that their accuracy is independent of the process's non-linearity and can be improved simply by increasing the number of particles. A brief description of the standard particle filter algorithm is presented below to provide the necessary background for the algorithm developed in the subsequent section.

3.3 Fundamentals of sequential Monte Carlo method

The sequential Monte Carlo (SMC) approach is a recursive Bayesian method for non-linear and non-Gaussian filtering problems. The basic idea behind this approach is to approximate the density functions involved in filtering by propagating a set of “particles” through the process dynamic equations. The basic framework of the SMC approach is presented below.

3.3.1 Recursive Bayesian estimation

The SMC approach is based on a recursive Bayesian algorithm for dynamic state estimation and involves constructing the probability density function of the current state $x_k$, given a sequence of measurements. Please note that in the rest of this section, the superscript $i$ is dropped to keep the notation simple. Let $y_{1:k}$ denote a sequence of measurements up to time $k$, i.e., $y_{1:k} = \{y_1, y_2, \ldots, y_k\}$; then the Bayesian solution to the filtering problem would be to calculate the density of the state, i.e., $p(x_k|y_{1:k})$, for every iteration.

In principle, this density can be obtained recursively in two steps: prediction and update. To estimate the posterior density it is assumed that the initial conditions expressed in terms of the density function, $p(x_0|y_0)$, are known. Then, the posterior density function, $p(x_k|y_{1:k})$, can be obtained recursively using the following two steps:

1. Prediction:
\[ p(x_k|y_{1:k-1}) = \int p(x_k|x_{k-1})\, p(x_{k-1}|y_{1:k-1})\, dx_{k-1} \tag{3.6} \]
2. Update:
\[ p(x_k|y_{1:k}) = \frac{p(y_k|x_k)\, p(x_k|y_{1:k-1})}{p(y_k|y_{1:k-1})} \tag{3.7} \]
where $p(y_k|y_{1:k-1})$ is the normalizing constant, defined as:
\[ p(y_k|y_{1:k-1}) = \int p(y_k|x_k)\, p(x_k|y_{1:k-1})\, dx_k. \tag{3.8} \]

This posterior density will encapsulate all the information about the state $x_k$, which is contained in the measurements $y_{1:k}$ and the prior density of $x_0$.

However, Equation 3.6 and Equation 3.7 do not have analytical solutions except for linear processes with Gaussian noise. In most cases the integrals in Equation 3.6 and Equation 3.7 are complex and intractable. For linear Gaussian systems where the density is uniquely characterized by the mean and covariance, the Kalman filter is used to propagate and update the mean and covariance of the density. However, for general non-linear, non-Gaussian systems described by Equation 3.1 and Equation 3.2, there is no simple way to proceed. The SMC algorithms make the complex integrals tractable through the use of efficient sampling strategies [83], [38].

3.3.2 Sequential Monte Carlo

Sequential Monte Carlo methods were first introduced by Gordon in 1993 [93], and since then the algorithm has been further developed and adapted to many different applications and disciplines [83]. It appears in the literature under different names such as bootstrap filtering [93], the condensation algorithm [85], particle filtering [70], [114], and interacting particle approximations [32], [82], [99] and [74]. The basic idea of SMC is the recursive computation of relevant probability distributions using the concepts of importance sampling and the approximation of probability distributions by a set of random samples with associated weights. SMC has become an important alternative to the extended Kalman filter (EKF) and unscented Kalman filter (UKF), due to its generality and robustness. The main advantage of particle filtering over other methods is that the exploited approximation does not involve linearization around current estimates but rather approximations in the representation of the desired distributions.

Consider the state space model given by Equation 3.1 and Equation 3.2.
The basic idea of SMC follows the framework of Bayesian recursive estimation discussed in the previous section. Bayesian recursive estimation is implemented via Monte Carlo sampling, rather than through solving the integrals in Equation 3.6 and Equation 3.7 directly.

At each time step $k$, the two pieces of information required for estimating the probability density function are the samples $x_k^j$ and their associated weights $w_k^j$. Samples $x_k^j$ are assumed to be generated from a known density called the importance density function, $q(x_k|y_{1:k})$, which is easy to sample from. The corresponding weights of the samples are defined as:
\[ w_k^j = \frac{p(x_k^j|y_{1:k})}{q(x_k^j|y_{1:k})} \tag{3.9} \]
and the weights upon normalization become:
\[ w_k^j = \frac{w_k^j}{\sum_{l=1}^{N_s} w_k^l} \tag{3.10} \]
where $N_s$ is the number of samples used. If the importance density function is chosen to be factorized such that
\[ q(x_k|y_{1:k}) = q(x_k|x_{k-1},y_{1:k})\, q(x_{k-1}|y_{1:k-1}) \tag{3.11} \]
then one can obtain samples $x_k^j \sim q(x_k|y_{1:k})$ by augmenting each of the existing samples $x_{k-1}^j \sim q(x_{k-1}|y_{1:k-1})$ with the new state $x_k^j \sim q(x_k|x_{k-1},y_{1:k})$, and the updated weight $w_k^j$ associated with it can be obtained according to
\[ w_k^j \propto w_{k-1}^j \frac{p(y_k|x_k^j)\, p(x_k^j|x_{k-1}^j)}{q(x_k^j|x_{k-1}^j,y_{1:k})} \tag{3.12} \]
More details of the derivation of Equation 3.12 can be seen in [83]. Furthermore, if $q(x_k|x_{k-1},y_{1:k}) = q(x_k|x_{k-1},y_k)$, then the importance density becomes dependent only upon $x_{k-1}$ and $y_k$. This is particularly useful in common cases when only a filtered estimate of $p(x_k|y_{1:k})$ is required at each time step. From this point on, such a case is assumed, except when explicitly stated otherwise. In such a scenario, only $x_k^j$ need be stored; therefore, one can discard the path $x_{1:k-1}^j$ and history of observations $y_{1:k-1}$.
Using the state space assumptions (first-order Markov process or observational independence given state), the importance weights can be estimated recursively by
\[ w_k^j \propto w_{k-1}^j \frac{p(y_k|x_k^j)\, p(x_k^j|x_{k-1}^j)}{q(x_k^j|x_{k-1}^j,y_k)} \tag{3.13} \]
and the posterior filtered density $p(x_k|y_{1:k})$ can be approximated as:
\[ p(x_k|y_{1:k}) \approx \sum_{j=1}^{N_s} w_k^j\, \delta(x_k - x_k^j) \tag{3.14} \]
where $\delta$ is the Dirac delta function, $x_k^j$ is the $j$th sample that approximates the distribution, and the coefficient $w_k^j$ is the weight associated with the corresponding sample. It can be shown that as $N_s \to \infty$, the approximation in Equation 3.14 approaches the true posterior density $p(x_k|y_{1:k})$. This idea can be easily extended to find the density function of the hidden states given the measurements up to the current time instant. The density function $p(x_t|y_{1:t},\theta)$ is called a filter, where $y_{1:t}$ is a vector used to denote the measurements from 1 to $t$.

After a few iterations, the weights of most particles become insignificant due to modelling errors, and the noise makes the particles drift away from the real state. Consequently, most particles have no influence on the posterior density function, and the approximation of the target distribution is then determined from only a few particles. This problem is called degeneracy and cannot be avoided because the variance of the importance weights can only increase over time [38]. Degeneracy weakens the Monte Carlo approximation, and computational resources may be wasted on updating particles with little or no relevance to the approximation of the posterior density function $p(x_k|y_{1:k})$. This degeneracy problem can be solved using a technique called resampling. The basic idea of resampling is to eliminate trajectories that have small normalized importance weights and to concentrate upon trajectories with large weights.

The following is a summary of the particle filter algorithm.

Particle Filter Algorithm

1. Initialization: Generate $N$ samples of the initial state $x_1$ from an initial distribution, $p(x_1)$. Set $w_{1|1}^{(i)} = \frac{1}{N}$ for $i \in \{1,\cdots,N\}$. Set $t = 2$.
2. Prediction: Sample $N$ values of $x_t$ from the distributions $p(x_t|x_{t-1}^{(i)},\theta)$ for each $i$.
3. Update: Using (3.10), find the weights of the filter density, $w_{t|t}^{(i)}$.
4. Resampling: Resample $N$ particles from the set $\{x_t^{(1)},\cdots,x_t^{(N)}\}$ with the probability of picking $x_t^{(i)}$ being $w_{t|t}^{(i)}$. Assign $w_{t|t}^{(i)} = \frac{1}{N}$ for all $i$.
5. Set $t = t+1$. Repeat (2), (3) and (4) for $t \le T$.

3.4 Proposed algorithm

The FDI algorithm developed in this section is based on a bank of particle filters $\{M_i\}_{i=1}^{N}$ running in parallel with a particle filter that represents the nominal process model $M_0$ and the actual process. A schematic of this algorithm is shown in Figure 3.1. The algorithm is expected to identify any changes in the process and specify which of the $N$ possible known faults has occurred. The algorithm detects any changes in the actual process by monitoring the vector $\theta$. One popular way to monitor changes in $\theta$ is to monitor the likelihood function of the measurements, $p(y_{1:k}|\theta)$, under a null hypothesis and an alternate hypothesis, as defined below,

$H_0: \theta = \theta^0$ - Null hypothesis
$H_1: \theta \neq \theta^0$ - Alternate hypothesis

where $\theta^0$ represents the true value of $\theta$ and if in fact $\theta = \theta^0$, then it is assumed that the null hypothesis is true. A fault condition is one in which the alternate hypothesis is deemed to be true.

Figure 3.1: Overview of the proposed model-based FDI approach.

A popular approach to test the above hypotheses is to perform the LLR test. The basic idea behind this test is to monitor the logarithm of the ratio between the likelihood of the measurements under the null hypothesis and under the alternate hypothesis. While there are different variants of the test statistic for the likelihood ratio, the following is used in this dissertation,
\[ L = \log \frac{p(y_{1:k}|\theta^0)}{\sup_{\theta} p(y_{1:k}|\theta)}. \tag{3.15} \]
To evaluate this test statistic one needs to estimate the likelihood function, $p(y_{1:k}|\theta)$. This evaluation is not straightforward due to the hidden states in the process model. In [114], the likelihood function was expanded as
\[ p(y_{1:k}|\theta) = p(y_1|\theta) \prod_{l=2}^{k} p(y_l|y_{1:l-1},\theta) \tag{3.16} \]
and $p(y_l|y_{1:l-1},\theta)$ was approximated using the following standard formula:
\[ p(y_l|y_{1:l-1},\theta) = p\big(h(y_l|y_{1:l-1},\theta)\,\big|\,\theta\big) \left|\frac{\partial h}{\partial y_l}\right| \tag{3.17} \]
where $h$ is the inverse function of $g$ that provides an estimate of the measurement noise. There are two problems with using Equation 3.17: the formula is only applicable for monotone functions, $h$, and it is often not possible to obtain an analytical expression for $h$. To overcome these problems, a new test statistic that replaces the standard log-likelihood ratio by using the joint likelihood function of both states and measurements is proposed:
\[ L_{new}^i(m,k) = \log \frac{E\left[p(y_{k-m:k},x_{k-m:k}|\theta^i)\right]}{\sup_{\theta} E\left[p(y_{k-m:k},x_{k-m:k}|\theta)\right]}, \tag{3.18} \]
where $E$ denotes the expectation with respect to the hidden states $x_{k-m:k}$. The integer $m$ denotes the size of the time window over which the new statistic is calculated. The proposed log-likelihood ratio is the logarithm of the ratio between the joint likelihood function evaluated at the true value of the vector, $\theta$, and that evaluated at the maximizing value of $\theta$. Therefore, $L_{new}^i(m,k)$ will always be non-positive. The expectation in $L_{new}^i(m,k)$ can be easily calculated using a particle approximation of the corresponding integral:
\[ E\left[\log p(y_{k-m:k},x_{k-m:k}|\theta)\right] = \int p(x_{k-m:k}|y_{k-m:k},\theta)\, \log\left[p(y_{k-m:k},x_{k-m:k}|y_{k-m:k},\theta)\right] dx_{k-m:k}. \tag{3.19} \]
A number of approximations of the integral in Equation 3.19 have been explored in [55] and [56]. For simplicity, the following particle-based approximation that accounts for the probability of occurrence of states is used:
\[ E\left[\log p(y_{k-m:k},x_{k-m:k}|\theta)\right] \approx \sum_{j=1}^{N} w_{k-m:k}^j \log\left[p(x_{k-m:k}^j,y_{k-m:k}|y_{k-m:k},\theta)\right] \tag{3.20} \]
where $w_{k-m:k}^j$ are weights proportional to the likelihood of occurrence of estimates of the state sequence, $x_{k-m:k}^j$, obtained through particle filters.
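The particle filter summary and the particle-based approximation in Equation 3.20 can be sketched on a toy problem. Everything specific in the sketch is an assumption made for illustration and is not taken from the thesis: the scalar state map `f_mean`, additive Gaussian state and measurement noise, a finite set of candidate values of $\theta$ in place of the supremum, and a statistic formed as a difference of expected log joint likelihoods. The joint log-density is accumulated along each trajectory as the sum of transition and measurement log-densities, and the initial-state term is omitted.

```python
import math
import random


def gauss_logpdf(value, mean, std):
    """Log-density of a scalar Gaussian N(mean, std^2) evaluated at value."""
    z = (value - mean) / std
    return -0.5 * z * z - math.log(std * math.sqrt(2.0 * math.pi))


def f_mean(x_prev, theta):
    """Hypothetical non-linear state map (illustration only)."""
    return 0.5 * x_prev + theta / (1.0 + x_prev * x_prev)


def simulate(theta, steps, q_std, r_std, rng):
    """Draw measurements y_1..y_steps from the toy model."""
    x, ys = 0.0, []
    for _ in range(steps):
        x = f_mean(x, theta) + rng.gauss(0.0, q_std)    # state noise
        ys.append(x + rng.gauss(0.0, r_std))            # measurement noise
    return ys


def expected_log_joint(ys, theta, n, q_std, r_std, rng):
    """Bootstrap particle filter over the window ys.

    Returns the particle average of the accumulated
    log p(x_t | x_{t-1}, theta) + log p(y_t | x_t, theta).
    """
    particles = [rng.gauss(0.0, 1.0) for _ in range(n)]  # initial samples
    log_joint = [0.0] * n
    for y in ys:
        # Prediction: propagate every particle through the state equation.
        means = [f_mean(x, theta) for x in particles]
        particles = [rng.gauss(m, q_std) for m in means]
        # Update: measurement log-likelihood of every particle.
        log_lik = [gauss_logpdf(y, x, r_std) for x in particles]
        log_joint = [lj + gauss_logpdf(x, m, q_std) + ll
                     for lj, x, m, ll in zip(log_joint, particles, means, log_lik)]
        # Normalized weights, shifted by the maximum for numerical stability.
        top = max(log_lik)
        raw = [math.exp(ll - top) for ll in log_lik]
        total = sum(raw)
        weights = [r / total for r in raw]
        # Resampling: pick trajectories with probability equal to their weight.
        picks = rng.choices(range(n), weights=weights, k=n)
        particles = [particles[i] for i in picks]
        log_joint = [log_joint[i] for i in picks]
    # After resampling every trajectory carries weight 1/n.
    return sum(log_joint) / n


def isolate(ys, thetas, n=300, q_std=0.2, r_std=0.3, seed=1):
    """Index of the most likely candidate and the non-positive statistics."""
    rng = random.Random(seed)
    scores = [expected_log_joint(ys, th, n, q_std, r_std, rng) for th in thetas]
    best = max(scores)
    return scores.index(best), [s - best for s in scores]


if __name__ == "__main__":
    data_rng = random.Random(0)
    faulty_data = simulate(8.0, 50, 0.2, 0.3, data_rng)  # theta = 8 plays the fault
    print(isolate(faulty_data, [4.0, 8.0]))
```

Because resampling is applied at every step, the surviving trajectories are equally weighted, so the weighted sum in Equation 3.20 reduces to a plain average in this sketch. A candidate whose statistic is far below zero explains the window of data much worse than the best candidate.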
The Markov property of the non-linear state space model allows one to expand the joint state and measurement likelihood as follows:
\[ p(x_{k-m:k}^j,y_{k-m:k}|y_{k-m:k},\theta) = p(x_1^j|y_{k-m:k},\theta) \prod_{t=k-m}^{k} p(x_t^j|x_{t-1}^j,\theta) \prod_{t=k-m}^{k} p(y_t|x_t^j,\theta) \tag{3.21} \]
and use it in the particle approximation above. With this approximation, it is possible to evaluate the proposed likelihood ratio statistic in Equation 3.18.

3.4.1 Fault detection

Now that a test statistic is defined and an approximation of it can be numerically evaluated, one can perform hypothesis testing to decide whether the null or the alternative hypothesis is correct. Ideally, one would like to choose the sensitivity level of the test such that the probability of type I error is minimized. The sensitivity level is often chosen using a power function that relates the type I error to the sensitivity level used. It is difficult to derive a power function for the proposed statistic; however, one can use an empirical sensitivity level to detect faults. One can accept the alternate hypothesis $H_1$ and consider the process to have exhibited a fault if the new test statistic, $L_{new}^0(m,k)$, is less than a predetermined sensitivity level. The sensitivity level $\alpha$ can be chosen by the user such that when $L_{new}^0(m,k) < \alpha$, the process is deemed to have encountered a fault. Using the definition of $L_{new}^0(m,k)$, it is easy to see that a large negative value for $L_{new}^0$ implies that the log-likelihood function under the null hypothesis is much smaller than that under the alternate hypothesis. In other words, the alternate hypothesis is more likely, and the assumed process model $M_0$ is incorrect. However, this information allows one to detect the fault but not to isolate it.

3.4.2 Fault isolation

The fault isolation step is in general more difficult than the detection step. This is primarily due to the fact that even distinct faults often do not provide good enough data to isolate and identify specific faults.
Moreover, at least theoretically speaking, often a large number of possible faults exist and may occur simultaneously. In this dissertation, it is assumed that the faults defined by the models $\{M_i\}_{i=0}^{N}$ are independent of each other and do not occur simultaneously. This assumption does not preclude one from identifying simultaneous faults, as they can easily be defined using a new model. For instance, suppose that the faults $M_1$ and $M_2$ can occur simultaneously; then one can define a corresponding new fault $M_{1,2}$ and add it to the set of fault models.

To identify the fault model that best describes the fault in the process, a bank of parallel particle filters, one for each model $M_i$, is implemented and the corresponding test statistic $L_{new}^i(m,k)$ is evaluated. The problem of fault isolation is to identify the most likely hypothesis among the following $N$ hypotheses:

$H_1: \theta = \theta^1$ - Alternate hypothesis 1
$\vdots$
$H_N: \theta = \theta^N$ - Alternate hypothesis N

Consequently, $N$ log-likelihood test statistics are obtained and the hypothesis corresponding to the largest $L_{new}^i(m,k)$ at any given $k$ is accepted. From the definition of the test statistic, it is easy to see that the larger the test statistic, the higher the probability that the corresponding fault model provides a good representation of the process fault. In the following section, the effectiveness of this approach is illustrated through two simulation examples of highly non-linear processes.

3.4.3 Limitations

The proposed approach relies on computationally intensive particle filters. The computational complexity of these filters is proportional to $N_s m$. Given that $N+1$ filters are simultaneously used in the proposed algorithm, the overall computational complexity is of the order of $(N+1) N_s m$. Therefore, the computational burden increases linearly with the number of faults. However, given that faults often do not occur frequently, the proposed approach is still practical.
Another limitation is that the sensitivity level has to be chosen through experience or trial and error. In an ideal situation, one would like to choose it from a power function of the test statistic such that type I errors are minimized. However, deriving an analytical power function may not be possible; instead, one might have to estimate the power function through numerical simulations.

3.5 Case studies

3.5.1 Case study 1: application to a two-unit chemical process

Process description

The example considered in this section is taken from [50, 92]. This process consists of two well-mixed, non-isothermal continuous stirred tank reactors (CSTRs) connected in series, as shown in Figure 3.2. Three parallel irreversible elementary exothermic reactions of the form $A \xrightarrow{K_1} B$, $A \xrightarrow{K_2} U$ and $A \xrightarrow{K_3} R$ take place in the reactors. $A$ is the reactant species, $B$ is the desired product and $U$ and $R$ are undesired byproducts. $K_1$, $K_2$ and $K_3$ are the reaction rates. The feed to the first reactor consists of pure $A$ at flow rate $F_0$, molar concentration $C_{A0}$ and temperature $T_0$, and the feed to the second reactor consists of the output of the first reactor and an additional fresh stream feeding pure $A$ at flow rate $F_3$, molar concentration $C_{A03}$ and temperature $T_{03}$. Due to the non-isothermal nature of the reactions, a jacket is used to remove heat from or provide heat to both reactors. Under standard modelling assumptions, a mathematical model of the plant can be derived from material and energy balances and takes the following form:

$$\frac{dT_1}{dt} = \frac{F_0}{V_1}(T_0 - T_1) + \sum_{i=1}^{3} \frac{(-\Delta H_i)}{\rho c_p} R_i(C_{A1}, T_1) + \frac{Q_1}{\rho c_p V_1} \tag{3.22a}$$

$$\frac{dC_{A1}}{dt} = \frac{F_0}{V_1}(C_{A0} - C_{A1}) - \sum_{i=1}^{3} R_i(C_{A1}, T_1) \tag{3.22b}$$

$$\frac{dT_2}{dt} = \frac{F_1}{V_2}(T_1 - T_2) + \frac{F_3}{V_2}(T_{03} - T_2) + \sum_{i=1}^{3} \frac{(-\Delta H_i)}{\rho c_p} R_i(C_{A2}, T_2) + \frac{Q_2}{\rho c_p V_2} \tag{3.22c}$$

$$\frac{dC_{A2}}{dt} = \frac{F_1}{V_2}(C_{A1} - C_{A2}) + \frac{F_3}{V_2}(C_{A03} - C_{A2}) - \sum_{i=1}^{3} R_i(C_{A2}, T_2) \tag{3.22d}$$

where $R_i(C_{Aj}, T_j) = k_{i0} \exp(-E_i / (R T_j))\, C_{Aj}$, for $j = 1, 2$.
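As a quick transcription check on Equations 3.22a and 3.22b, the first-reactor balances can be evaluated at the tabulated steady state. The sketch below is illustrative only: the function and variable names are hypothetical, the numerical values are transcribed from Table 3.1, and only CSTR-1 is covered (CSTR-2 follows the same pattern with the extra feed term).

```python
import math

# Values transcribed from Table 3.1 (units: m^3, h, kmol, kJ, K, kg).
F0, V1 = 4.998, 1.0
T0, CA0 = 280.0, 2.4
Q1 = 0.7e6
R_GAS = 8.314
RHO, CP = 2000.0, 0.731
DH = (-1.00e5, -1.04e5, -1.08e5)   # reaction enthalpies
K0 = (3.0e6, 3.0e5, 3.0e5)         # pre-exponential constants
E = (5.0e4, 7.53e4, 7.53e4)        # activation energies

def reaction_rates(ca, temp):
    """R_i(C_A, T) = k_i0 * exp(-E_i / (R T)) * C_A for i = 1, 2, 3."""
    return [k * math.exp(-e / (R_GAS * temp)) * ca for k, e in zip(K0, E)]

def cstr1_rhs(temp, ca):
    """Right-hand sides of Equations 3.22a and 3.22b for CSTR-1."""
    rates = reaction_rates(ca, temp)
    heat = sum(-dh * r for dh, r in zip(DH, rates)) / (RHO * CP)
    d_temp = F0 / V1 * (T0 - temp) + heat + Q1 / (RHO * CP * V1)
    d_ca = F0 / V1 * (CA0 - ca) - sum(rates)
    return d_temp, d_ca

# Evaluate at the tabulated steady state (T1s, CA1s) of Table 3.1.
dT1, dCA1 = cstr1_rhs(424.4, 1.69)
print(dT1, dCA1)
```

Both derivatives should come out small compared with the individual terms of the balances (which are of the order of hundreds of K/h and a few kmol/(m^3 h)), since the tabulated steady-state values are rounded.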
In Equation 3.22, $T$, $C_A$, $Q$ and $V$ denote the temperature of the reactor, the concentration of species $A$, the rate of heat input/removal from the reactor and the volume of the reactor, respectively, with subscript 1 denoting CSTR-1 and subscript 2 denoting CSTR-2. $\Delta H_i$, $k_{i0}$ and $E_i$, $i = 1, 2, 3$, denote the enthalpies, pre-exponential constants and activation energies of the three reactions, respectively. $c_p$ and $\rho$ denote the heat capacity and density of the fluid in the reactor. The manipulated inputs for these studies are the inlet concentration, $C_{A0}$, and the net heat removed/added, $Q_1$, for CSTR-1, and the inlet concentration of stream 3, $C_{A03}$, and the net heat removed/added, $Q_2$, for CSTR-2. The steady-state operating data of the two-CSTR system are given in Table 3.1.

Table 3.1: Process parameters and steady-state values for the two CSTR reactors

| Parameter | Value | Unit |
|---|---|---|
| $F_0, F_1, F_3$ | 4.998, 4.998, 8 | m³/h |
| $V_1, V_2$ | 1, 3 | m³ |
| $R$ | 8.314 | kJ/(kmol·K) |
| $T_0, T_{03}$ | 280, 280 | K |
| $C_{A0s}, C_{A03s}$ | 2.4, 2.6 | kmol/m³ |
| $Q_1, Q_2$ | $0.7 \times 10^6$, $0.3 \times 10^6$ | kJ/h |
| $\Delta H_1$ | $-1.00 \times 10^5$ | kJ/kmol |
| $\Delta H_2$ | $-1.04 \times 10^5$ | kJ/kmol |
| $\Delta H_3$ | $-1.08 \times 10^5$ | kJ/kmol |
| $k_{10}$ | $3.0 \times 10^6$ | h⁻¹ |
| $k_{20}$ | $3.0 \times 10^5$ | h⁻¹ |
| $k_{30}$ | $3.0 \times 10^5$ | h⁻¹ |
| $E_1$ | $5.0 \times 10^4$ | kJ/kmol |
| $E_2$ | $7.53 \times 10^4$ | kJ/kmol |
| $E_3$ | $7.53 \times 10^4$ | kJ/kmol |
| $\rho$ | 2000 | kg/m³ |
| $c_p$ | 0.731 | kJ/(kg·K) |
| $T_{1s}, T_{2s}$ | 424.4, 444.5 | K |
| $C_{A1s}, C_{A2s}$ | 1.69, 0.89 | kmol/m³ |

Figure 3.2: Schematic of two chemical reactors connected in series.

Simulation results and discussion

In this section, eight different fault scenarios are simulated using noise-corrupted data; the proposed algorithm is used to detect and isolate those faults, and the results are then compared with other FDI algorithms. In all simulation runs, the measurement and state noise variances are assumed to be

$$Q_\nu = \mathrm{diag}(0.001,\ 0.001,\ 0.001,\ 0.001)$$

and

$$Q_\omega = \mathrm{diag}(0.10,\ 0.01,\ 0.10,\ 0.001),$$

respectively. In this example, the reactor is simulated for 1000 samples considering four different possible faults, F1, F2, F3 and F4, as shown in Table 3.2 (we also refer to these faults as scenarios 1–4).
It is assumed that in each case, one of these faults is introduced at $k = 400$ and fixed at $k = 600$. Experience has shown that a reasonable method for detecting a fault is to use a sensitivity level, $\alpha$, that allows for the ratio of joint likelihood functions to be about 0.3. This leads to $\alpha = \log(0.3) = -1.2$. The time window, $m$, over which the likelihood function is evaluated is chosen to be 100. The length of this window determines the sensitivity of the algorithm in terms of rate of fault detection and time to detection.

Table 3.2: Fault scenarios 1–4 for the two CSTR reactors

| Fault | Steady-State | Faulty-State | Time Interval |
|---|---|---|---|
| F1: Faulty biased temperature sensor, $T_0$ | 280 | 295–311 °C | 400–600 |
| F2: Faulty biased flow rate sensor, $F_0$ | 4.998 | 5.25–5.5 m³/h | 400–600 |
| F3: Faulty biased temperature sensor, $T_{03}$ | 280 | 295–311 °C | 400–600 |
| F4: Faulty biased flow rate sensor, $F_1$ | 4.998 | 5.25–5.5 m³/h | 400–600 |

Figure 3.3: The top plot shows the log-likelihood ratio test statistic, and the bottom plot shows the alarm signal for the 5% biased sensor in scenario 1 for the two CSTR reactors.

Figure 3.3, Figure 3.4, Figure 3.5 and Figure 3.6 show the profiles of $L^0_{new}(m,k)$ and the alarm signals (which stay at 0 when there is no fault and at 1 when a fault is detected). The faults are clearly identified, albeit with a brief delay. A few false alarms appear before or after the detection. These false alarms can be avoided by choosing a lower sensitivity level. It is believed that these false alarms are an artifact of the sample size, $m$, used in evaluating the likelihood ratio.
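The threshold rule described above can be sketched in a few lines. This is a minimal illustration, not the thesis implementation: the function names and the sample statistic values are hypothetical, and the natural logarithm is assumed because it reproduces the quoted value of about -1.2.

```python
import math
import numpy as np

def sensitivity_level(likelihood_ratio):
    """Natural log of the empirical likelihood ratio, used as threshold alpha."""
    return math.log(likelihood_ratio)

def alarm_signal(stat, alpha):
    """Return 1 where the test statistic is below alpha (fault), else 0."""
    stat = np.asarray(stat, dtype=float)
    return (stat < alpha).astype(int)

alpha = sensitivity_level(0.3)  # approximately -1.2

# Hypothetical values of L^0_new(m, k); not read from the thesis figures.
stat = np.array([0.1, -0.4, -1.5, -3.0, -0.9, 0.2])
alarm = alarm_signal(stat, alpha)
print(round(alpha, 3), alarm)
```

Lowering alpha (making it more negative) raises fewer alarms for the same statistic, which is the trade-off between false alarms and detection delay discussed in the text.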
The delay in detecting the disappearance of the fault is also due to the fact that the window length, $m$, was chosen to be 100.

Figure 3.4: The top plot shows the log-likelihood ratio test statistic, and the bottom plot shows the alarm signal for the 5% biased sensor in scenario 2 for the two CSTR reactors.

Figure 3.5: The top plot shows the log-likelihood ratio test statistic, and the bottom plot shows the alarm signal for the 5% biased sensor in scenario 3 for the two CSTR reactors.

Figure 3.6: The top plot shows the log-likelihood ratio test statistic, and the bottom plot shows the alarm signal for the 5% biased sensor in scenario 4 for the two CSTR reactors.

Once a fault is detected, one can use the remaining particle filters to isolate it. For the same fault scenarios given above, the faults were perfectly isolated, as shown in Figure 3.7, Figure 3.8, Figure 3.9 and Figure 3.10. For example, Figure 3.7 indicates that fault F1 is the most likely fault in the process, as $L^1_{new}(m,k)$ is the largest test statistic after the fault is introduced (note that the figures show $L^1_{new}(m,k) - L^0_{new}(m,k)$ to make it visually evident that $L^1_{new}(m,k)$ is in fact the largest test statistic).

Figure 3.7: The difference between the log-likelihood ratio test statistic of the nominal process $L^0_{new}(m,k)$ and the log-likelihood ratio test statistic of each faulty model $L^i_{new}(m,k)$. Given that the largest test statistic is with $L^1_{new}(m,k)$, it is concluded that F1 is the fault present in the two CSTR reactors system.

Figure 3.8: The difference between the log-likelihood ratio test statistic of the nominal process $L^0_{new}(m,k)$ and the log-likelihood ratio test statistic of each faulty model $L^i_{new}(m,k)$. Given that the largest test statistic is with $L^2_{new}(m,k)$, it is concluded that F2 is the fault present in the two CSTR reactors system.

Figure 3.9: The difference between the log-likelihood ratio test statistic of the nominal process $L^0_{new}(m,k)$ and the log-likelihood ratio test statistic of each faulty model $L^i_{new}(m,k)$. Given that the largest test statistic is with $L^3_{new}(m,k)$, it is concluded that F3 is the fault present in the two CSTR reactors system.

Scenarios 5 and 6 demonstrate the ability of the proposed algorithm to detect multiple faults that may occur simultaneously. The simulation runs were carried out using 2000 samples. In scenario 5, F1 was introduced at $k = 400$, F2 was introduced at $k = 800$ and both faults were removed at $k = 1300$. The corresponding plot showing the likelihood ratio and the alarm signal is shown in Figure 3.11. The algorithm detected the first fault at $k = 405$, and the alarm signal, as expected, stayed faulty even after the second fault was introduced. The fault-free condition was detected at $k = 1497$.

In scenario 6, F1 was introduced at $k = 400$, F2 was introduced at $k = 800$ and removed at $k = 1000$, F3 was introduced at $k = 900$ and both F1 and F3 were removed at $k = 1400$. Figure 3.12 shows the likelihood ratio and the alarm signal of scenario 6. The algorithm detected the first fault at $k = 407$, and the alarm signal, as expected, stayed faulty even after the second and third faults were introduced. The fault-free condition was detected at $k = 1497$. These simulations clearly show the effectiveness of the above algorithm in such a highly non-linear process.

Table 3.3: Fault scenario 1 for the two CSTR reactors (faulty temperature sensor, $T_0$): FAR - false alarm rate, ADTD - alarm detection time delay

| Fault size | Fault type | $\alpha$ | FAR (PF) | FAR (UKF) | FAR (EKF) | ADTD (PF) | ADTD (UKF) | ADTD (EKF) |
|---|---|---|---|---|---|---|---|---|
| 5% | Biased (Abrupt) | -0.5 | 0.16 | 4.24 | 4.33 | 0.50 | -1.2 | 0 |
| 5% | Biased (Abrupt) | -0.8 | 0.08 | 4.16 | 4.24 | 1.00 | -1.2 | 0.2 |
| 5% | Biased (Abrupt) | -1.1 | 0.07 | 4.06 | 4.15 | 2.30 | -1.2 | 0.2 |
| 5% | Biased (Abrupt) | -1.4 | 0.03 | 3.89 | 4.02 | 3.10 | -1.2 | 0.5 |
| 5% | Drift (Incipient) | -0.5 | 0.21 | 4.40 | 4.51 | 27.8 | 9.6 | 7.8 |
| 5% | Drift (Incipient) | -0.8 | 0.10 | 4.30 | 4.46 | 31.4 | 9.8 | 10 |
| 5% | Drift (Incipient) | -1.1 | 0.07 | 4.21 | 4.40 | 35.4 | 10.6 | 11.4 |
| 5% | Drift (Incipient) | -1.4 | 0.04 | 4.21 | 4.30 | 39.2 | 10.6 | 11.6 |
| 10% | Biased (Abrupt) | -0.5 | 0.27 | 4.31 | 4.02 | 1.80 | 0 | 0.6 |
| 10% | Biased (Abrupt) | -0.8 | 0.13 | 4.15 | 3.92 | 2.90 | 0.2 | 0.6 |
| 10% | Biased (Abrupt) | -1.1 | 0.09 | 4.02 | 3.89 | 2.70 | 0.6 | 0.8 |
| 10% | Biased (Abrupt) | -1.4 | 0.06 | 3.98 | 3.89 | 2.70 | 0.8 | 1 |
| 10% | Drift (Incipient) | -0.5 | 0.26 | 4.22 | 3.97 | 22.1 | 2.4 | 8.2 |
| 10% | Drift (Incipient) | -0.8 | 0.21 | 4.18 | 3.83 | 24.1 | 2.6 | 8.8 |
| 10% | Drift (Incipient) | -1.1 | 0.17 | 4.10 | 3.70 | 26.2 | 4.0 | 8.8 |
| 10% | Drift (Incipient) | -1.4 | 0.13 | 3.98 | 3.66 | 28.3 | 5.2 | 8.8 |

In this chapter, particle filters are used to estimate parameters that are required to approximate the test statistics. The accuracy of parameter estimation obtained using particle filters depends on the number of particles used and is independent of the non-linearity of the model or the Gaussianity of the noise. It is nevertheless conceivable to use other non-linear filters instead, such as an extended Kalman filter (EKF) or an unscented Kalman filter (UKF). With EKF, the non-linearity of the process model is approximated, and with UKF, the density functions of the states are approximated. To understand the effectiveness of using particle filters, the algorithm is modified to use EKF or UKF instead of particle filters. In scenarios 7 and 8, the FDI algorithm based on EKF and UKF has been applied to the faults in scenario 1.
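The two performance measures reported in Table 3.3 can be computed from an alarm record as sketched below. The thesis does not spell out its exact definitions in this section, so the ones used here are assumptions: the false alarm rate is taken as the percentage of fault-free samples on which the alarm is raised, and the detection delay as the number of samples between fault onset and the first alarm at or after it. All names and the example alarm record are hypothetical.

```python
import numpy as np

def false_alarm_rate(alarm, fault_active):
    """Percentage of fault-free samples on which the alarm is raised
    (assumed definition)."""
    alarm = np.asarray(alarm, dtype=bool)
    normal = ~np.asarray(fault_active, dtype=bool)
    return 100.0 * np.count_nonzero(alarm & normal) / np.count_nonzero(normal)

def detection_delay(alarm, fault_start):
    """Samples between fault onset and the first alarm at or after it
    (assumed definition); None if the fault is never detected."""
    alarm = np.asarray(alarm, dtype=bool)
    hits = np.flatnonzero(alarm[fault_start:])
    return int(hits[0]) if hits.size else None

k = np.arange(1000)
fault_active = (k >= 400) & (k < 600)   # fault window used in the case study

# Hypothetical alarm record: detection 5 samples late, two false alarms.
alarm = (k >= 405) & (k < 600)
alarm[100] = True
alarm[101] = True

far = false_alarm_rate(alarm, fault_active)
delay = detection_delay(alarm, fault_start=400)
print(far, delay)
```

Averaging `detection_delay` over repeated noisy runs would give an average delay comparable in spirit to the ADTD column, under the stated assumptions.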
For the same sensitivity level, as shown in Figure 3.13 and Figure 3.14, the performance of the algorithm using EKF and UKF is worse than when using particle filters. Table 3.3 shows the performance of the particle filter-based approach compared with EKF and UKF in terms of false alarm rates and the average detection time delays for different sensitivity levels under fault scenario 1. The proposed algorithm detects biased and abrupt changes with a lower false alarm rate than the other methods, although Table 3.3 also shows longer detection delays for the particle filter. Figure 3.15 compares the false alarm rates of the three filters for the 5% biased temperature sensor in scenario 1.

Figure 3.10: The difference between the log-likelihood ratio test statistic of the nominal process $L^0_{new}(m,k)$ and the log-likelihood ratio test statistic of each faulty model $L^i_{new}(m,k)$. Given that the largest test statistic is with $L^4_{new}(m,k)$, it is concluded that F4 is the fault present in the two CSTR reactors system.

Figure 3.11: The top plot shows the log-likelihood ratio test statistic and the bottom plot shows the alarm signal for scenario 5 for the two CSTR reactors.

Figure 3.12: The top plot shows the log-likelihood ratio test statistic and the bottom plot shows the alarm signal for scenario 6 for the two CSTR reactors.

Figure 3.13: The top plot shows the log-likelihood ratio test statistic based on the EKF algorithm and the bottom plot shows the alarm signal for scenario 1 for the two CSTR reactors.

Figure 3.14: The top plot shows the log-likelihood ratio test statistic based on the UKF algorithm and the bottom plot shows the alarm signal for scenario 1 for the two CSTR reactors.

Figure 3.15: Comparison of false alarm rates in fault detection using particle filter, EKF and UKF with 5% biased sensor.

3.5.2 Case study 2: application to a polyethylene reactor system

Process description

The industrial gas-phase polyethylene reactor system considered in this section is taken from [35] and [51].
As shown in Figure 3.16, the feed to the reactor consists of ethylene, comonomer, hydrogen, inert gas and catalyst. A recycle stream of unreacted gases flows from the top of the reactor and is cooled by passing through a water-cooled heat exchanger. Cooling rates in the heat exchanger are adjusted by mixing cold and warm water streams while maintaining a constant total cooling water flow rate through the heat exchanger. The mass balances of the hydrogen and comonomer have not been considered in this study because the hydrogen and the comonomer have only mild effects on the reactor dynamics [89]. Table 3.5 includes definitions of all the parameters and process variables used in Equation 3.23 and Equation 3.24. The manipulated inputs considered for these studies are the feed temperature, $T_{feed}$, and the inlet flow rate of ethylene, $F_{M1}$. The steady-state operating data of the system are given in Table 3.4.

Figure 3.16: Schematic of the industrial gas-phase polyethylene reactor system.

A mathematical model for this reactor has the following form:

$$\frac{d[In]}{dt} = \frac{F_{In} - \dfrac{[In]}{[M_1] + [In]}\, b_t}{V_g} \tag{3.23a}$$

$$\frac{d[M_1]}{dt} = \frac{F_{M1} - \dfrac{[M_1]}{[M_1] + [In]}\, b_t - R_{M1}}{V_g} \tag{3.23b}$$

$$\frac{dY_1}{dt} = F_c a_c - k_{d1} Y_1 - \frac{R_{M1} M_{W1} Y_1}{B_w} \tag{3.23c}$$

$$\frac{dY_2}{dt} = F_c a_c - k_{d2} Y_2 - \frac{R_{M1} M_{W1} Y_2}{B_w} \tag{3.23d}$$

$$\frac{dT}{dt} = \frac{H_f + H_{g1} - H_{g0} - H_r - H_{pol}}{M_r C_{pr} + B_w C_{ppol}} \tag{3.23e}$$

$$\frac{dT_{w1}}{dt} = \frac{F_w}{M_w}(T_{wi} - T_{w1}) - \frac{UA}{M_w C_{pw}}(T_{w1} - T_{g1}) \tag{3.23f}$$

$$\frac{dT_{g1}}{dt} = \frac{F_g}{M_g}(T - T_{g1}) - \frac{UA}{M_g C_{pg}}(T_{w1} - T_{g1}) \tag{3.23g}$$

where

$$b_t = V_p C_v \sqrt{([M_1] + [In]) \cdot RR \cdot T - P_v} \tag{3.24a}$$

$$R_{M1} = [M_1] \cdot k_{p0} \cdot \exp\left[-\frac{E_a}{R}\left(\frac{1}{T} - \frac{1}{T_f}\right)\right] \cdot (Y_1 + Y_2) \tag{3.24b}$$

$$C_{pg} = \frac{[M_1]}{[M_1] + [In]} C_{pm1} + \frac{[In]}{[M_1] + [In]} C_{pIn}$$

$$H_f = F_{M1} C_{pm1}(T_{feed} - T_f) + F_{In} C_{pIn}(T_{feed} - T_f) \tag{3.24c}$$

$$H_{g1} = F_g (T_{g1} - T_f) C_{pg} \tag{3.24d}$$

$$H_{g0} = (F_g + b_t)(T - T_f) C_{pg} \tag{3.24e}$$

$$H_r = H_{reac} M_{W1} R_{M1} \tag{3.24f}$$

$$H_{pol} = C_{ppol}(T - T_f) R_{M1} M_{W1} \tag{3.24g}$$

Simulation results and discussion

Because the processes are stochastic, several simulation runs were conducted to demonstrate the effectiveness of the proposed FDI scheme for the two different fault scenarios given in Table 3.6.
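To show how Equations 3.23a and 3.24a fit together, the sketch below evaluates the bleed flow and the inert-gas balance at the steady state listed in Table 3.4. It is an illustrative check only: the function and constant names are hypothetical, and the numerical values are transcribed from Table 3.4.

```python
import math

# Steady-state values transcribed from Table 3.4.
V_G = 500.0            # gas-phase volume, m^3
V_P, C_V = 0.5, 7.5    # bleed valve position, vent flow coefficient
P_V = 17.0             # pressure downstream of bleed vent, atm
RR = 8.20575e-5        # gas constant, m^3 atm / (mol K)
F_IN = 5.0             # inert feed, mol/s
IN_S, M1_S, T_S = 439.68, 326.72, 356.21   # [In]s, [M1]s (mol/m^3), Ts (K)

def bleed_flow(m1, inert, temp):
    """Overhead gas bleed b_t from Equation 3.24a."""
    return V_P * C_V * math.sqrt((m1 + inert) * RR * temp - P_V)

def inert_rate(m1, inert, temp):
    """d[In]/dt from Equation 3.23a."""
    b_t = bleed_flow(m1, inert, temp)
    return (F_IN - inert / (m1 + inert) * b_t) / V_G

b_t = bleed_flow(M1_S, IN_S, T_S)
d_in = inert_rate(M1_S, IN_S, T_S)
print(b_t, d_in)
```

At these values the inert fraction of the bleed closely matches the inert feed of 5 mol/s, so the computed derivative is close to zero, as expected at a steady state.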
All simulation runs carried out in this section were conducted using noise-corrupted data, with measurement and state covariances assumed to be

$$Q_\nu = 0.0001\, I_7$$

and

$$Q_\omega = 0.01\, I_7,$$

respectively, where $I_7$ is the $7 \times 7$ identity matrix.

Table 3.4: Process parameters and steady-state values for the polyethylene reactor

| Parameter | Value | Unit |
|---|---|---|
| $V_g$ | 500 | m³ |
| $V_p$ | 0.5 | |
| $P_v$ | 17 | atm |
| $B_w$ | $7 \times 10^4$ | kg |
| $k_{p0}$ | $85 \times 10^{-3}$ | m³/(mol·s) |
| $E_a$ | (9000)(4.1868) | J/mol |
| $C_{pm1}$ | (11)(4.1868) | J/(mol·K) |
| $C_v$ | 7.5 | atm⁻⁰·⁵ mol/s |
| $C_{pw}, C_{pIn}$ | $(10^3)(4.1868)$, (6.9)(4.1868) | J/(kg·K) |
| $C_{ppol}$ | $(0.85 \times 10^3)(4.1868)$ | J/(kg·K) |
| $k_{d1}$ | 0.0001 | s⁻¹ |
| $k_{d2}$ | 0.0001 | s⁻¹ |
| $M_{W1}$ | $28.05 \times 10^{-3}$ | kg/mol |
| $M_w$ | $3.314 \times 10^4$ | kg |
| $M_g$ | 6060.5 | mol |
| $M_r C_{pr}$ | $(1.4 \times 10^7)(4.1868)$ | J/K |
| $H_{reac}$ | $(-894 \times 10^3)(4.1868)$ | J/kg |
| $UA$ | $(1.14 \times 10^6)(4.1868)$ | J/(K·s) |
| $F_{In}, F_{M1}, F_g$ | 5, 190, 8500 | mol/s |
| $F_w$ | $(3.11 \times 10^5)(18 \times 10^{-3})$ | kg/s |
| $F^s_c$ | 5.8/3600 | kg/s |
| $T_f, T^s_{feed}, T_{wi}$ | 360, 293, 283.56 | K |
| $RR$ | $8.20575 \times 10^{-5}$ | m³·atm/(mol·K) |
| $R$ | 8.314 | J/(mol·K) |
| $a_c$ | 0.548 | mol/kg |
| $[In]_s$ | 439.68 | mol/m³ |
| $[M_1]_s$ | 326.72 | mol/m³ |
| $Y_1, Y_2$ | 3.835, 3.835 | mol |
| $T_s, T_{w1s}, T_{g1s}$ | 356.21, 290.37, 294.36 | K |

Table 3.5: Process variables of the polyethylene reactor system

| Variable | Description |
|---|---|
| $a_c$ | Active site concentration of catalyst |
| $b_t$ | Overhead gas bleed |
| $B_w$ | Mass of polymer in the fluidized bed |
| $C_{pm1}$ | Specific heat capacity of ethylene |
| $C_v$ | Vent flow coefficient |
| $C_{pw}, C_{pIn}, C_{ppol}$ | Specific heat capacity of water, inert gas, polymer |
| $E_a$ | Activation energy |
| $F_c, F_g$ | Flow rate of catalyst and recycled gas |
| $F_{In}, F_{M1}, F_w$ | Flow rate of inert gas, ethylene and cooling water |
| $H_f, H_{g0}$ | Enthalpy of fresh feed stream, total gas outflow stream from reactor |
| $H_{g1}$ | Enthalpy of cooled recycled gas stream to reactor |
| $H_{pol}$ | Enthalpy of polymer |
| $H_r$ | Heat liberated by polymerization reaction |
| $H_{reac}$ | Heat of reaction |
| $[In]$ | Molar concentration of inerts in the gas phase |
| $k_{d1}, k_{d2}$ | Deactivation rate constant for catalyst sites 1, 2 |
| $k_{p0}$ | Pre-exponential factor for polymer propagation rate |
| $[M_1]$ | Molar concentration of ethylene in the gas phase |
| $M_g$ | Mass holdup of gas stream in heat exchanger |
| $M_r C_{pr}$ | Product of mass and heat capacity of reactor walls |
| $M_w$ | Mass holdup of cooling water in heat exchanger |
| $M_{W1}$ | Molecular weight of monomer |
| $P_v$ | Pressure downstream of bleed vent |
| $R, RR$ | Ideal gas constant, in units of J/(mol·K) and m³·atm/(mol·K) |
| $T, T_f, T_{feed}$ | Reactor, reference, feed temperature |
| $T_{g1}, T_{w1}$ | Temperature of recycled gas, cooling water stream from exchanger |
| $T_{wi}$ | Inlet cooling water temperature to heat exchanger |
| $UA$ | Product of heat exchanger coefficient with area |
| $V_g$ | Volume of gas phase in the reactor |
| $V_p$ | Bleed stream valve position |
| $Y_1, Y_2$ | Moles of active site types 1, 2 |

In both fault scenarios, the polyethylene reactor process was simulated for 1000 sample times, with the fault being introduced at $k = 400$ and assumed to have been fixed at $k = 600$. In scenario 1, it is assumed that a faulty biased sensor on the flow rate, $F_{In}$, is reading a value of 30 mol/s instead of the true value shown in Table 3.4. Experience has shown that a reasonable choice for detecting a fault is to use a sensitivity level, $\alpha$, that allows for the ratio of the joint likelihood functions to be about 0.6. This leads to $\alpha = \log(0.6) = -0.5$. The time window, $m$, is chosen to be the same as in the previous case study. Figures 3.17 and 3.18 show that the proposed algorithm not only identifies the existence of a fault but also isolates it accurately.

In scenario 2, it is assumed that a faulty biased sensor on the flow rate, $F_g$, is reading a value of 9500 mol/s instead of the true value of 8500 mol/s shown in Table 3.4. Using the same sensitivity level as in scenario 1, the fault was detected accurately: an alarm signal was triggered at $k = 422$ and it disappeared at $k = 678$, as shown in Figure 3.19.
Following the fault isolation approach, the algorithm clearly identified that F1 was the fault present in the polyethylene reactor process.

Table 3.6: Fault scenarios for the polyethylene reactor system

| Fault | Steady-State | Faulty-State | Fault Time Duration |
|---|---|---|---|
| F1: Faulty biased flow rate sensor, $F_{In}$ | 5 | 30 mol/s | 400–600 |
| F2: Faulty biased flow rate sensor, $F_g$ | 8500 | 9500 mol/s | 400–600 |

Figure 3.17: The top plot shows the log-likelihood ratio test statistic, and the bottom plot shows the alarm signal for scenario 1 for the polyethylene reactor system.

Figure 3.18: The differences between the log-likelihood ratio test statistic of the nominal process $L^0_{new}(m,k)$ and the log-likelihood ratio test statistic of each faulty model $L^i_{new}(m,k)$. Given that the largest test statistic is with $L^1_{new}(m,k)$, it is concluded that F1 is the fault present in the polyethylene reactor system.

Figure 3.19: The top plot shows the log-likelihood ratio test statistic, and the bottom plot shows the alarm signal for scenario 2 for the polyethylene reactor system.

3.6 Conclusions

In this chapter, a fault detection and isolation algorithm has been developed for stochastic non-linear non-Gaussian systems. The proposed algorithm was designed using particle filters and is based on a parameter estimation approach. The performance and robustness of the proposed algorithm have been assessed by comparison with variants that use other estimators, namely EKF and UKF.

Chapter 4

FDI Based on State Estimation Using GOS

4.1 Introduction

This chapter addresses the problem of fault isolation in non-linear dynamic systems using a structured residual approach for both sensors and actuators. Two main observer schemes are commonly used for fault isolation with structured residuals: the dedicated observer scheme (DOS) and the generalized observer scheme (GOS). In this chapter, GOS is used to isolate sensor and actuator faults based on a state estimation approach.
The performance and robustness of the proposed approach are compared to the DOS scheme for the same fault scenarios. The main idea is to design a bank of structured residuals using particle filters, in which some of the residuals are designed to be sensitive to a certain group of faults but insensitive to others. The design procedure consists of two steps: (1) to specify the sensitivity and insensitivity relationships between residuals and faults according to the assigned isolation task and (2) to design a set of residual generators according to the desired sensitivity and insensitivity relationships [4]. This chapter is organized as follows. In Section 4.2, the model-based FDI problem is formulated, followed by a brief description of the general observer scheme used for fault isolation in Section 4.3. In Section 4.4, the application of the proposed method to a simple heat exchanger system is presented. Finally, some conclusions and future work are outlined in Section 4.5.

4.2 Problem statement

Assume that there are $N$ possible known faults that may occur in the process and there are $N + 1$ models $\{M_i\}_{i=0}^{N}$, where $M_0$ corresponds to the nominal process model and $M_i$, for $i = 1, 2, \ldots, N$, represents the $i$th faulty model. The FDI approach considered in [16] was established based on a parameter estimation approach using a bank of particle filters running in parallel, where each model in $\{M_i\}_{i=1}^{N}$ was designed to be sensitive to a known single sensor fault and excited by all inputs $u$ and all output measurements $y_i$; this is known as a dedicated observer scheme (DOS) [122]. In this work, we consider a different analytical redundancy approach, known as a general observer scheme (GOS), in which every model is excited by all inputs and all outputs except one, which is the sensor to be monitored, or by all outputs and all inputs except one, which is the actuator to be monitored.
The standard design procedure for any FDI approach, as discussed, consists of the following two steps:

• Fault detection (FD): decides whether the nominal model, $M_0$, has changed to one of the known faulty models $\{M_i\}_{i=1}^{N}$ and determines the time of occurrence.

• Fault isolation (FI): determines which of the possible known faulty models $\{M_i\}_{i=1}^{N}$ has occurred.

In this chapter, the sensor faults are modelled as additive faults to the output signal and the actuator faults are modelled as additive faults to the input signal, as shown in the following stochastic non-linear dynamic state space model:

$$x^i_k = f_i(x^i_{k-1}, u^i_{k-1}, \nu^i_k) + f^i_a \tag{4.1}$$

$$y^i_k = g_i(x^i_k, u^i_k, \omega^i_k) + f^i_s \quad \text{for } i = 0 \text{ to } N \tag{4.2}$$

where $f_i$ and $g_i$ represent the state and measurement dynamic functions, respectively, and $k$ denotes a time instant. $x_k$ is the state vector with known initial probability density function $p(x^i_0)$ and $y_k$ is the vector of measurements. The random variables $\nu^i_k$ and $\omega^i_k$ represent the state and measurement noise sequences, with known zero-mean probability density functions. $f^i_s$ and $f^i_a$ denote the sensor and actuator faults, respectively. As mentioned in the previous chapter, the noise in the above dynamic equations also enters the process in a non-linear fashion. In most traditional approaches to FDI, in linear and to some extent in non-linear processes, it is often assumed that measurement noise and state noise enter the process linearly. This assumption allows one to write the measurement Equation 4.2 as

$$y_k = g_i(x^i_k, u^i_k) + \omega^i_k + f^i_s. \tag{4.3}$$

This fundamental assumption about the measurement equation allows us to detect faults simply by generating and monitoring the prediction errors (or residuals) between the process measurements and the model predictions. The one-step-ahead predictions from Equation 4.3 can be written as

$$\hat{y}^i_k = g_i(x^i_{k|k-1}, u^i_k) \tag{4.4}$$

where $x^i_{k|k-1}$ is the one-step-ahead prediction of the state and $\hat{y}^i_k$ is the one-step-ahead prediction of the output.
Then the prediction error, or residual, can be written as

$$\hat{r}^i_k = y_k - \hat{y}^i_k. \tag{4.5}$$

In fault-free cases, the residuals, $\hat{r}^i_k$, are expected to be close to zero or to follow the corresponding measurement noise, $\omega^i_k$, for sensor faults or state noise, $\nu^i_k$, for actuator faults. Any distinguishable deviation of the residuals implies a fault in the process. As previously mentioned, the literature contains a number of statistical techniques to evaluate the residuals for such deviations.

4.3 Proposed algorithm

Faults can take place in different parts of a controlled system and may cause the system to deviate from its normal behaviour. We propose an algorithm to detect sensor and actuator faults.

4.3.1 Sensor FDI based on GOS

Sensor faults represent variations in the measurements coming from sensors, due to broken wires, lost contact with the surface, aging, lock-in-place and other factors. A sensor fault can also appear as a partial or total loss of a correct reading. The static sensor output measurement $y(t)$ can be influenced by a constant offset, a value-dependent offset or a direction-dependent offset, and such faults can be modelled as additive faults. However, the faults that change the dynamics of the sensor can be modelled as multiplicative faults [123], [101].

The FDI algorithm developed in this section is based on a bank of particle filters $\{M_i\}_{i=1}^{N}$ running in parallel with a particle filter that represents the nominal process model $M_0$. The algorithm is expected to specify which of the $N$ possible known faults has occurred in the process, based on the set of residuals. In this work, sensor faults are modelled as additive faults to the output signal $y_k$. To isolate sensor faults, a set of particle filters is used such that each particle filter is dedicated to a single sensor fault.
Each particle filter is driven by all inputs $u$ and all measurement outputs $y$ except one, as shown in Figure 4.1.

The problem of fault isolation is to identify the most likely residual, which is affected by the known faults. Residuals are usually generated by comparing the output measurement $y$ with the predicted measurement of every model, as in Equation 4.5. Assume that we have four different faults $F = [f_1, f_2, f_3, f_4]$ that occur in the process:

$$\hat{r}_{GOS,1} = G_1(f_2, f_3, f_4) \tag{4.6}$$

$$\hat{r}_{GOS,2} = G_2(f_1, f_3, f_4) \tag{4.7}$$

$$\hat{r}_{GOS,3} = G_3(f_1, f_2, f_4) \tag{4.8}$$

$$\hat{r}_{GOS,4} = G_4(f_1, f_2, f_3) \tag{4.9}$$

Each particle filter generates one residual that is sensitive to all sensor faults except one. This leads to a diagonal signature table for isolating the faults, as presented in Table 4.8a. The general observer scheme FDI decision-making logic can be achieved using Equation 2.7 and is illustrated in detail for four different faults in Table 4.1. For example, when sensor 1 is faulty, all the residuals except the first will be corrupted by the faulty sensor signal. From there, we can conclude that sensor 1 is the faulty sensor out of the four monitored sensors. The generalized observer scheme can be used only to detect and isolate a single sensor fault, with robustness to unknown inputs.

4.3.2 Actuator FDI based on GOS

Actuator faults represent partial or total loss of control action due to lock-in-place, outage, cut or burned wiring, short circuits or loss of effectiveness.
In the case of complete loss of actuator action, the actuator will not be able to produce any action, even if an input is applied to it. However, when the actuator partially fails, it produces only part of the normal control action. If only the gain changes, this can be modelled as either an additive or a multiplicative fault signal [101].

Table 4.1: Sensor FDI decision-making logic

| Residual | Fault in sensor 1 | Fault in sensor 2 | Fault in sensor 3 | Fault in sensor 4 |
|---|---|---|---|---|
| $\hat{r}_1 = y_1 - \hat{y}_1$ | $= 0$ | $\neq 0$ | $\neq 0$ | $\neq 0$ |
| $\hat{r}_2 = y_2 - \hat{y}_2$ | $\neq 0$ | $= 0$ | $\neq 0$ | $\neq 0$ |
| $\hat{r}_3 = y_3 - \hat{y}_3$ | $\neq 0$ | $\neq 0$ | $= 0$ | $\neq 0$ |
| $\hat{r}_4 = y_4 - \hat{y}_4$ | $\neq 0$ | $\neq 0$ | $\neq 0$ | $= 0$ |

Figure 4.1: Model-based sensor FDI based on GOS.

Table 4.2: Actuator FDI decision-making logic

| Residual | Fault in actuator 1 | Fault in actuator 2 | Fault in actuator 3 | Fault in actuator 4 |
|---|---|---|---|---|
| $\hat{r}_1 = y_1 - \hat{y}_1$ | $= 0$ | $\neq 0$ | $\neq 0$ | $\neq 0$ |
| $\hat{r}_2 = y_2 - \hat{y}_2$ | $\neq 0$ | $= 0$ | $\neq 0$ | $\neq 0$ |
| $\hat{r}_3 = y_3 - \hat{y}_3$ | $\neq 0$ | $\neq 0$ | $= 0$ | $\neq 0$ |
| $\hat{r}_4 = y_4 - \hat{y}_4$ | $\neq 0$ | $\neq 0$ | $\neq 0$ | $= 0$ |

To detect and isolate actuator faults, a bank of particle filters that represents the known faulty models $\{M_i\}_{i=1}^{N}$ is run in parallel with a particle filter that represents the nominal process model $M_0$. Each particle filter is dedicated to one actuator fault, so that each model in $\{M_i\}_{i=1}^{N}$ is driven by all output measurements and all inputs except one, as shown in Figure 4.2. In this work, actuator faults are modelled as an additive change in the state Equation 4.1.

The actuator FDI decision-making logic is given in Table 4.2 for four different actuator faults. For example, when actuator 1 is faulty, all the residuals except the first one will be corrupted by the faulty actuator signal. From there, we can conclude that actuator 1 is the faulty actuator out of the four monitored actuators.
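The decision logic of Tables 4.1 and 4.2 amounts to finding the single residual that stays near zero while all the others deviate. The sketch below illustrates this under stated assumptions: "non-zero" is interpreted as exceeding a user-chosen threshold in absolute value, and the function name, threshold and residual values are hypothetical.

```python
import numpy as np

def isolate_by_signature(residuals, threshold):
    """Return the 1-based index of the faulty sensor or actuator, or None.

    Under the GOS decision logic, a fault in channel i leaves residual i
    near zero and disturbs all the other residuals. Any other pattern
    (no residual fired, or several quiet residuals) is left undecided.
    """
    fired = np.abs(np.asarray(residuals, dtype=float)) > threshold
    quiet = np.flatnonzero(~fired)
    if fired.size > 1 and quiet.size == 1:
        return int(quiet[0]) + 1
    return None

# Hypothetical residual magnitudes for four monitored channels.
faulty = isolate_by_signature([0.9, 0.02, 1.1, 0.8], threshold=0.1)
healthy = isolate_by_signature([0.01, 0.02, 0.0, 0.03], threshold=0.1)
print(faulty, healthy)
```

Because the rule requires exactly one quiet residual, two simultaneous faults produce no decision, which mirrors the single-fault restriction of the scheme.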
The generalized observer scheme can be used only to detect and isolate a single actuator fault.

Figure 4.2: Model-based actuator FDI based on GOS.

The FDI approach based on GOS is robust to parameter and measurement noise but can detect only one fault at a time, whereas the FDI method based on DOS can detect more than one fault at the same time but has no robustness to unknown inputs. In the next section, we utilize particle filter approximation to examine the robustness of the above two schemes in terms of state and measurement noise variances, using a simple heat exchanger process, and two continuously stirred tank reactors and a flash tank separator with a recycle stream.

4.4 Case studies

4.4.1 Case study 1: application to a simple heat exchanger model

Process description

The process we consider here is a simple heat exchanger model, taken from [71], in which heat is transferred between a cold and a hot fluid, as shown in Figure 4.3. Each side of the heat exchanger is approximated as a single, perfectly mixed tank.
Neglecting the variations in liquid volume and the heat accumulated in the walls yields a mathematical model with two states, given by Equation 4.10 and Equation 4.11:

$$\tau_C \frac{dT_C}{dt} = \frac{q_C}{q^*_C}(T_{ci} - T_C) + \alpha_C (T_H - T_C) \tag{4.10}$$

$$\tau_H \frac{dT_H}{dt} = \frac{q_H}{q^*_H}(T_{ci} - T_C) + \alpha_H (T_H - T_C) \tag{4.11}$$

where $q^*$ denotes the steady-state flow, and

$$\tau_C = \frac{V_C}{q^*_C}; \quad \alpha_C = \frac{UA}{\rho_C q^*_C c_p}$$

$$\tau_H = \frac{V_H}{q^*_H}; \quad \alpha_H = \frac{UA}{\rho_H q^*_H c_p}$$

There are two manipulated inputs to the system, defined as $q_C$ and $q_H$; their values and the steady-state operating data of the system are given in Table 4.3.

Figure 4.3: Simple heat exchanger

Table 4.3: Process parameters and steady-state values for the heat exchanger process

| Parameter | Value | Unit |
|---|---|---|
| $V_H = V_C$ | 1 | m³ |
| $q_H = q_C$ | 0.01 | m³/min |
| $T_{ci}$ | 25 | °C |
| $T_{hi}$ | 100 | °C |
| $T_C$ | 61.59 | °C |
| $T_H$ | 63.41 | °C |
| $UA$ | 300 | kJ/(°C·min) |
| $\rho$ | 500 | kg/m³ |
| $c_p$ | 3 | kJ/(°C·kg) |

Simulation results and discussion

In this section, we assume that there are four different possible faults $\{f_1, f_2, f_3, f_4\}$ that may occur in the process, as shown in Table 4.4.

Table 4.4: Sensor fault scenarios 1–4 for the simple heat exchanger process

| Fault | Description | Steady-State | Faulty-State | Time Interval |
|---|---|---|---|---|
| $f_1$ | Faulty biased temperature sensor, $T_{ci}$ | 25 | 30 | 400–600 |
| $f_2$ | Faulty biased temperature sensor, $T_{hi}$ | 100 | 130 | 400–600 |
| $f_3$ | Faulty biased flow rate actuator, $q_C$ | 0.01 | 0.05 | 400–600 |
| $f_4$ | Faulty biased flow rate actuator, $q_H$ | 0.01 | 0.05 | 400–600 |

The four fault scenarios are simulated for 1000 samples using data corrupted with two different sets of measurement and state noise variances, to examine the robustness of GOS against DOS in isolating sensor and actuator faults. In case 1, the state and measurement noise variances used for the four fault scenarios are assumed to be

$$Q_{\nu 1} = \mathrm{diag}(0.0001,\ 0.0001)$$

and

$$Q_{\omega 1} = \mathrm{diag}(0.0001,\ 0.0001),$$

respectively, and the faulty sensors and actuators read 30 °C, 130 °C, 0.05 m³/min and 0.05 m³/min instead of the true values given in Table 4.3.

In fault scenario 1, the temperature sensor, $T_{ci}$, reads 30 °C instead of the true value given in Table 4.3.
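As a side check on the model behind these scenarios, the cold-side constants and the cold-side balance of Equation 4.10 can be evaluated with the values of Table 4.3. The sketch is illustrative only: the names are hypothetical, the values are transcribed from Table 4.3, and only the cold-side equation is evaluated.

```python
# Values transcribed from Table 4.3.
V_C = 1.0            # m^3
Q_C_STAR = 0.01      # steady-state cold flow, m^3/min
UA = 300.0           # kJ/(degC min)
RHO = 500.0          # kg/m^3
CP = 3.0             # kJ/(degC kg)
T_CI, T_C, T_H = 25.0, 61.59, 63.41   # degC

tau_c = V_C / Q_C_STAR                 # time constant, min
alpha_c = UA / (RHO * Q_C_STAR * CP)   # dimensionless heat transfer group

def cold_side_rate(t_c, t_h, q_c=Q_C_STAR, t_ci=T_CI):
    """dT_C/dt from Equation 4.10, in degC/min."""
    return (q_c / Q_C_STAR * (t_ci - t_c) + alpha_c * (t_h - t_c)) / tau_c

rate_nominal = cold_side_rate(T_C, T_H)
# Effect of a hypothetical true change of the inlet temperature to 30 degC.
rate_shifted = cold_side_rate(T_C, T_H, t_ci=30.0)
print(tau_c, alpha_c, rate_nominal, rate_shifted)
```

With these numbers the time constant is about 100 min and the heat transfer group about 20, and the nominal rate is close to zero, consistent with the tabulated steady state.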
The fault $f_1$ was introduced at $k = 400$, removed at $k = 600$ and simulated as a step change in the temperature sensor with a magnitude of 5 °C. Figure 4.4a shows how DOS failed to isolate $f_1$, while Figure 4.4b shows the ability of GOS to isolate the same fault. Figure 4.5 shows the residuals generated using a particle filter approach for fault scenario $f_2$. The faulty biased temperature sensor $T_{hi}$ was simulated using a step change with a magnitude of 30 °C. Again, DOS failed to isolate $f_2$, while Figure 4.5b shows the ability of GOS to isolate the fault correctly.

Fault scenarios 3 and 4 were also simulated using the same measurement and state noise variances based on DOS and GOS, as shown in Figure 4.6 and Figure 4.7, respectively. In both fault scenarios, the algorithm was able to isolate the faults.

Figure 4.4: Residuals generated using a particle filter approach for fault scenario $f_1$ for the heat exchanger process based on DOS and GOS using $Q_{\nu 1}$ and $Q_{\omega 1}$. (a) Dedicated observer scheme DOS. (b) General observer scheme GOS.

Figure 4.5: Residuals generated using a particle filter approach for fault scenario $f_3$ for the heat exchanger process based on DOS and GOS using $Q_{\nu 1}$ and $Q_{\omega 1}$. (a) Dedicated observer scheme DOS. (b) General observer scheme GOS.

In case 2, the same fault scenarios were simulated using smaller state and measurement noise variances:

$$Q_{\nu 2} = \mathrm{diag}(0.000001,\ 0.000001)$$

and

$$Q_{\omega 2} = \mathrm{diag}(0.000001,\ 0.000001),$$

respectively.
In this case, all sensor and actuator faults are isolated correctly by both isolation schemes.

Figure 4.6: The residuals generated using particle filter approach for fault scenario f2 for the heat exchanger process based on DOS and GOS using $Q_{\nu 1}$ and $Q_{\omega 1}$. Panels: (a) Dedicated observer scheme (DOS); (b) General observer scheme (GOS).

Figure 4.7: Residuals generated using a particle filter approach for fault scenario f4 for the heat exchanger process based on DOS and GOS using $Q_{\nu 1}$ and $Q_{\omega 1}$. Panels: (a) Dedicated observer scheme (DOS); (b) General observer scheme (GOS).

4.4.2 Case study 2: application to a two CSTRs and a flash tank with a recycle stream process

Process description

The example considered in this section is taken from [44]. The process consists of two CSTRs and a flash tank separator, all connected in series, as shown in Figure 4.8.

Figure 4.8: Two CSTRs and a flash tank with a recycle stream.

The feed to the first reactor consists of reactant A at flow rate $F_{10}$, which is converted into the desired product B. The desired product B can then further react into an undesired side-product C. The feed to the second CSTR is composed of the effluent of the first CSTR and an additional fresh feed, $F_{20}$. The reactions A → B and B → C (referred to as 1 and 2, respectively) take place in the two CSTRs in series before the effluent from the second reactor is fed to the flash tank. The overhead from the flash tank is condensed and recycled to the first CSTR, and the bottom product stream is removed.
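A small portion of the overhead is purged before being recycled to the first reactor.

Assuming that the three vessels have static holdup and that standard modelling assumptions apply, the dynamic behaviour of the system can be described by the following differential equations, obtained through material and energy balances:

$$
\begin{aligned}
\frac{dT_1}{dt} &= \frac{F_{10}}{V_1}(T_{10}-T_1) + \frac{F_r}{V_1}(T_3-T_1) + \frac{-\Delta H_1}{\rho C_p} k_1 e^{-E_1/(RT_1)} C_{A1} + \frac{-\Delta H_2}{\rho C_p} k_2 e^{-E_2/(RT_1)} C_{A1} + \frac{Q_1}{\rho C_p V_1} && \text{(4.12a)}\\
\frac{dC_{A1}}{dt} &= \frac{F_{10}}{V_1}(C_{A10}-C_{A1}) + \frac{F_r}{V_1}(C_{Ar}-C_{A1}) - k_1 e^{-E_1/(RT_1)} C_{A1} - k_2 e^{-E_2/(RT_1)} C_{A1} && \text{(4.12b)}\\
\frac{dC_{B1}}{dt} &= -\frac{F_{10}}{V_1} C_{B1} + \frac{F_r}{V_1}(C_{Br}-C_{B1}) + k_1 e^{-E_1/(RT_1)} C_{A1} && \text{(4.12c)}\\
\frac{dC_{C1}}{dt} &= -\frac{F_{10}}{V_1} C_{C1} + \frac{F_r}{V_1}(C_{Cr}-C_{C1}) + k_2 e^{-E_2/(RT_1)} C_{A1} && \text{(4.12d)}\\
\frac{dT_2}{dt} &= \frac{F_1}{V_2}(T_1-T_2) + \frac{F_{20}+\Delta F_{20}}{V_2}(T_{20}-T_2) + \frac{-\Delta H_1}{\rho C_p} k_1 e^{-E_1/(RT_2)} C_{A2} + \frac{-\Delta H_2}{\rho C_p} k_2 e^{-E_2/(RT_2)} C_{A2} + \frac{Q_2}{\rho C_p V_2} && \text{(4.12e)}\\
\frac{dC_{A2}}{dt} &= \frac{F_1}{V_2}(C_{A1}-C_{A2}) + \frac{F_{20}+\Delta F_{20}}{V_2}(C_{A20}-C_{A2}) - k_1 e^{-E_1/(RT_2)} C_{A2} - k_2 e^{-E_2/(RT_2)} C_{A2} && \text{(4.12f)}\\
\frac{dC_{B2}}{dt} &= \frac{F_1}{V_2}(C_{B1}-C_{B2}) - \frac{F_{20}+\Delta F_{20}}{V_2} C_{B2} + k_1 e^{-E_1/(RT_2)} C_{A2} && \text{(4.12g)}\\
\frac{dC_{C2}}{dt} &= \frac{F_1}{V_2}(C_{C1}-C_{C2}) - \frac{F_{20}+\Delta F_{20}}{V_2} C_{C2} + k_2 e^{-E_2/(RT_2)} C_{A2} && \text{(4.12h)}\\
\frac{dT_3}{dt} &= \frac{F_2}{V_3}(T_2-T_3) - \frac{H_{vap} F_r}{\rho C_p V_3} + \frac{Q_3}{\rho C_p V_3} && \text{(4.12i)}\\
\frac{dC_{A3}}{dt} &= \frac{F_2}{V_3}(C_{A2}-C_{A3}) - \frac{F_r}{V_3}(C_{Ar}-C_{A3}) && \text{(4.12j)}\\
\frac{dC_{B3}}{dt} &= \frac{F_2}{V_3}(C_{B2}-C_{B3}) - \frac{F_r}{V_3}(C_{Br}-C_{B3}) && \text{(4.12k)}\\
\frac{dC_{C3}}{dt} &= \frac{F_2}{V_3}(C_{C2}-C_{C3}) - \frac{F_r}{V_3}(C_{Cr}-C_{C3}) && \text{(4.12l)}
\end{aligned}
$$

The definitions of the variables used in the system can be found in Table 4.5, with the parameter values given in Table 4.6.

The model of the flash tank separator operates under the assumption that the relative volatility of each species remains constant within the operating temperature range of the flash tank.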
This assumption allows calculation of the mass fractions in the overhead based upon the mass fractions in the liquid portion of the vessel. It has also been assumed that there is a negligible amount of reaction taking place in the separator. The following algebraic equations model the composition of the overhead stream relative to the composition of the liquid holdup in the flash tank:

$$
\begin{aligned}
C_{Ar} &= \frac{\alpha_A C_{A3}}{K}, && \text{(4.13a)}\\
C_{Br} &= \frac{\alpha_B C_{B3}}{K}, && \text{(4.13b)}\\
C_{Cr} &= \frac{\alpha_C C_{C3}}{K}, && \text{(4.13c)}\\
K &= \alpha_A C_{A3}\frac{MW_A}{\rho} + \alpha_B C_{B3}\frac{MW_B}{\rho} + \alpha_C C_{C3}\frac{MW_C}{\rho} && \text{(4.13d)}
\end{aligned}
$$

where $\alpha_A$, $\alpha_B$, $\alpha_C$ and $\alpha_D$ are the relative volatility constants of the three reacting species along with the inert species D. $MW_A$, $MW_B$ and $MW_C$ are the molecular weights of the three reacting species. Finally, $x_D$ is the mass fraction of the inert species D in the liquid phase of vessel 3 and is found from the mass balance.

Table 4.5: Process variables of a two CSTRs and flash tank system

| Variable | Description |
|---|---|
| $C_{A1}, C_{A2}, C_{A3}$ | Concentration of A in vessels 1, 2, 3 |
| $C_{B1}, C_{B2}, C_{B3}$ | Concentration of B in vessels 1, 2, 3 |
| $C_{C1}, C_{C2}, C_{C3}$ | Concentration of C in vessels 1, 2, 3 |
| $C_{Ar}, C_{Br}, C_{Cr}$ | Concentration of A, B, C in the recycle |
| $T_1, T_2, T_3$ | Temperatures in vessels 1, 2, 3 |
| $T_{10}, T_{20}$ | Feed stream temperature to vessels 1, 2 |
| $F_1, F_2, F_3$ | Effluent flow rate from vessels 1, 2, 3 |
| $F_{10}, F_{20}$ | Feed stream flow rate to vessels 1, 2 |
| $C_{A10}, C_{A20}$ | Concentration of A in the feed stream to vessels 1, 2 |
| $F_r$ | Recycle flow rate |
| $V_1, V_2, V_3$ | Volume of vessels 1, 2, 3 |
| $u_1, u_2, u_3, u_4$ | Manipulated inputs |
| $E_1, E_2$ | Activation energy for reactions 1, 2 |
| $k_1, k_2$ | Pre-exponential values for reactions 1, 2 |
| $k_{d1}, k_{d2}$ | Heats of reaction for reactions 1, 2 |
| $\Delta H_1, \Delta H_2$ | Heats of reaction for reactions 1, 2 |
| $H_{vap}$ | Heat of vaporization |
| $\alpha_A, \alpha_B, \alpha_C, \alpha_D$ | Relative volatilities of A, B, C, D |
| $MW_A, MW_B, MW_C$ | Molecular weights of A, B, and C |
| $C_p$ | Heat capacity |
| $R$ | Gas constant |

The values of the process parameters are given in Table 4.6. There are three manipulated external heat inputs to the system, $Q_1$, $Q_2$ and $Q_3$, and one manipulated flow rate input to the second vessel, $F_{20}$.
The values of these inputs are given in Table 4.7.

Table 4.6: Process parameters and steady-state values for the two CSTRs and flash tank

| Parameter | Value | Unit |
|---|---|---|
| $T_{10}, T_{20}$ | 300, 300 | K |
| $F_{10}, F_{20}, F_r$ | 5, 5, 1.9 | m³/h |
| $C_{A10}, C_{A20}$ | 4, 3 | kmol/m³ |
| $V_1, V_2, V_3$ | 1.0, 0.5, 1.0 | m³ |
| $E_1, E_2$ | 5E4, 5.5E4 | kJ/kmol |
| $k_1, k_2$ | 3E6, 3E6 | 1/h |
| $\Delta H_1, \Delta H_2$ | $-5\times10^4$, $-5.3\times10^4$ | kJ/kmol |
| $H_{vap}$ | 5 | kJ/kmol |
| $C_p$ | 0.231 | kJ/(kg·K) |
| $R$ | 8.314 | kJ/(kmol·K) |
| $\rho$ | 1000 | kg/m³ |
| $\alpha_A, \alpha_B, \alpha_C, \alpha_D$ | 2, 1, 1.5, 3 | Unitless |
| $MW_A, MW_B, MW_C$ | 50, 50, 50 | kg/kmol |

Table 4.7: Steady-state inputs for the two CSTRs and flash tank system

| Parameter | Value | Unit |
|---|---|---|
| $Q_1$ | $5\times10^4$ | kJ/h |
| $Q_2$ | $1.5\times10^5$ | kJ/h |
| $Q_3$ | $2\times10^5$ | kJ/h |
| $F_{20}$ | 4.998 | m³/h |

Simulation results and discussion

In this example, eight different sensor and actuator fault scenarios, as shown in Table 4.9 and Table 4.10, are simulated for 1000 samples using data corrupted by noise. In all simulation runs, the measurement and state noise variances are assumed to be $Q_\nu = 10^{-4} \times I_n$ and $Q_\omega = 10^{-4} \times I_n$, respectively, with $n = 12$.

Table 4.8: Actuator and sensor fault signature tables for the two CSTRs and a flash tank with a recycle stream process

(a) Sensor fault signature table

| | $r_1$ | $r_2$ | $r_3$ | $r_4$ |
|---|---|---|---|---|
| Sensor 1, $T_{10}$ | 1 | 0 | 0 | 0 |
| Sensor 2, $T_{20}$ | 0 | 1 | 0 | 0 |
| Sensor 3, $F_{10}$ | 0 | 0 | 1 | 0 |
| Sensor 4, $F_{20}$ | 0 | 0 | 0 | 1 |

(b) Actuator fault signature table

| | $r_1$ | $r_2$ | $r_3$ | $r_4$ |
|---|---|---|---|---|
| Actuator 1, $Q_1$ | 1 | 0 | 0 | 0 |
| Actuator 2, $Q_2$ | 0 | 1 | 0 | 0 |
| Actuator 3, $Q_3$ | 0 | 0 | 1 | 0 |
| Actuator 4, $\Delta F_{20}$ | 0 | 0 | 0 | 1 |

Table 4.9: Sensor fault scenarios 1–4 for the two CSTRs and a flash tank with a recycle stream process

| Fault | Description | Steady-State | Faulty-State | Time Interval |
|---|---|---|---|---|
| F1 | Faulty biased temperature sensor, $T_{10}$ | 300 | 320 | 400–600 |
| F2 | Faulty drifted temperature sensor, $T_{20}$ | 300 | 320 | 400–600 |
| F3 | Faulty biased flow rate sensor, $F_{10}$ | 5 | 6.5 | 400–600 |
| F4 | Faulty drifted flow rate sensor, $F_{20}$ | 5 | 6.5 | 400–600 |

The simulated fault scenarios are:

Sensor faults: Four sets of residuals were designed using a bank of particle filters driven by all inputs and all outputs except the one sensitive to the fault to be isolated.

1. F1: Faulty biased sensor on feed stream temperature to vessel 1, $T_{10}$.
The fault is introduced at $k = 400$ as a step change of magnitude 5% of the true value given in Table 4.6 and then removed at $k = 600$.

2. F2: Faulty drifted sensor on feed stream temperature to vessel 2, $T_{20}$. At $k = 400$, a gradual faulty signal is introduced as a ramp change of magnitude 5% of the actual sensor value, 300 K, and then removed at $k = 600$.

3. F3: Faulty biased sensor on feed stream flow rate to vessel 1, $F_{10}$. The fault is introduced at $k = 400$ as a step change of magnitude 10% of the true value given in Table 4.6 and then removed at $k = 600$.

4. F4: Faulty drifted sensor on feed stream flow rate to vessel 2, $F_{20}$. At $k = 400$, a gradual faulty signal is introduced as a ramp change of magnitude 10% of the actual sensor value, 5 m³/h, and then removed at $k = 600$.

Table 4.10: Actuator fault scenarios 5–8 for the two CSTRs and a flash tank with a recycle stream process

| Fault | Description | Steady-State | Faulty-State | Time Interval |
|---|---|---|---|---|
| F5 | Faulty biased actuator, $Q_1$ | $5\times10^4$ | $7\times10^4$ | 400–600 |
| F6 | Faulty drifted actuator, $Q_2$ | $1.5\times10^5$ | 1.65×25 | 400–600 |
| F7 | Faulty biased actuator, $Q_3$ | $2\times10^5$ | 3.5×25 | 400–600 |
| F8 | Faulty drifted actuator, $\Delta F_{20}$ | 4.998 | 6.5 | 400–600 |

Actuator faults: Four sets of residuals were designed using a bank of particle filters driven by all outputs and all inputs except the one sensitive to the fault to be isolated.

5. F5: Faulty biased actuator, $Q_1$. The fault is introduced at $k = 400$ as a step change in the actuator signal, as given in Table 4.6, and is then removed at $k = 600$.

6. F6: Faulty drifted actuator, $Q_2$. At $k = 400$, a gradual faulty signal is introduced as a ramp change of magnitude 5% of the actual actuator value, $1.5\times10^5$, and is then removed at $k = 600$.

7. F7: Faulty biased actuator, $Q_3$. The fault is introduced at $k = 400$ as a step change in the actuator signal given in Table 4.6, and is then removed at $k = 600$.

8. F8: Faulty drifted actuator signal, $\Delta F_{20}$.
At $k = 400$, a gradual faulty signal is introduced as a ramp change of magnitude 10% of the actual sensor value, 5 m³/h, and then removed at $k = 600$.

Figure 4.9: Residuals generated using a particle filter approach for the biased temperature sensor T10 for two CSTRs and a flash tank with a recycle stream process based on GOS.

Figure 4.10: Residuals generated using a particle filter approach for the biased temperature sensor T20 for two CSTRs and a flash tank with a recycle stream process based on GOS.

Figure 4.11: Residuals generated using a particle filter approach for the biased flow rate sensor F10 for two CSTRs and a flash tank with a recycle stream process based on GOS.

Figure 4.12: Residuals generated using a particle filter approach for the biased flow rate sensor F20 for two CSTRs and a flash tank with a recycle stream process based on GOS.

Figure 4.13: Residuals generated using a particle filter approach for the biased temperature sensor T10 for two CSTRs and a flash tank with a recycle stream process based on GOS.

Figure 4.14: Residuals generated using a particle filter approach for the biased temperature sensor T10 for two CSTRs and a flash tank with a recycle stream process based on GOS.

Figure 4.15: Residuals generated using a particle filter approach for the biased temperature sensor T10 for two CSTRs and a flash tank with a recycle stream process based on GOS.

Figure 4.16: Residuals generated using a particle filter approach for the biased temperature sensor T10 for two CSTRs and a flash tank with a recycle stream process based on GOS.

Figure 4.17: Residuals generated using a particle filter approach for the biased temperature sensor T10 and the biased temperature sensor T20 for two CSTRs and a flash tank with a recycle stream process based on GOS.

The results shown in Figure 4.9–Figure 4.16 demonstrate the ability of the proposed fault isolation scheme, GOS, to isolate the eight different fault scenarios. Also, multiple fault scenarios were examined
by introducing several different faults simultaneously. Figure 4.17 shows the failure of the algorithm to isolate the multiple fault scenario. This result supports [6], which claimed that the design of the GOS allows the detection and isolation of only one fault at a time.

4.5 Conclusions

A general model-based fault isolation approach for stochastic non-linear non-Gaussian systems has been developed using GOS. The simulation results show that the proposed approach outperforms the dedicated observer scheme (DOS) in non-linear systems for both sensor and actuator faults. The results also show the robustness of the algorithm in detecting abrupt and incipient faults in highly non-linear examples. Finally, the proposed algorithm failed to isolate simultaneous multiple faults.

Chapter 5

Alarm Design

5.1 Introduction

Modern process industries are equipped with highly automated diagnosis systems to maintain high safety and reliability standards. The main objective of all fault diagnosis systems is to raise an alarm that notifies the operators about any deviation from the normal behaviour of the process [16]. According to the Abnormal Situation Management Consortium (ASM), US petrochemical plants alone lose more than $10 billion per year due to abnormal situations and a lack of alarm management strategies [31]. However, a large number of process variables are monitored during plant operation, which increases the amount of information transmitted from the sensing elements in the plant to the control room, as well as the number of alarms being processed, presented or handled by the operators [12].

Efficient monitoring of industrial processes is important to maintain product quality and quantity; it also allows enough lead time to take appropriate corrective action before catastrophic failures [68]. During abnormal situations in plant operations, the operators are frequently flooded with alarms in a short period of time.
These alarms are difficult to deal with in a timely fashion because there are often far more alarms than various international standards deem to be safe [39]. Generally, alarm floods include nuisance alarms, which are not critical and do not directly lead to catastrophic failures. However, alarm flooding leaves the operators struggling to prioritize the alarms that urgently need to be resolved. Moreover, frequent displays of uncritical alarms can lead to complacency on the part of operators [61], [62]. A number of catastrophic incidents have occurred due to the poor performance of alarm systems. Hence, alarm system design is vital for fault monitoring and diagnosis systems and for reducing the time operators need to take appropriate corrective actions. While alarm management involves a number of steps, including the design, processing and presentation of alarms, here we focus on design.

The standard industrial practice is to design sensitive alarms that notify operators about any safety or performance violation in the plant. The majority of these alarms are not critical and do not require immediate action. However, some, if left unaddressed, may cause catastrophic failures. Consequently, operators are often flooded with an unmanageable number of alarms. Sometimes, operators ignore or acknowledge frequent alarms without taking further action. This can lead to complacency on the part of operators and management. EEMUA 191 [40] defines alarm rate recommendations as follows:

• An average alarm rate of < 1 alarm per 10 minute period for steady-state operations.
• An average of < 10 alarms per 10 minute period after a plant upset.

There are numerous historical examples in which a large number of alarms, in conjunction with the complacent behaviour of operators, have resulted in catastrophic events with significant human and material losses.
These events often resulted in environmental pollution as well.

Nuisance alarms in industrial processes occur for the following reasons:

• The alarms are poorly designed. Some alarms are unnecessary for operating the plant, while other important alarms are missed.
• The alarm system is unable to handle different operating states. For example, an alarm system could work well during normal operation but fail in the event of a major accident.
• The alarm system presents too much (inadequate) information. The operator is flooded and thus unable to separate important alarms from minor events.
• The alarm system is poorly tuned. Alarms are triggered by tight alarm limits, controller-caused oscillations, unfiltered noise disturbances, outliers, transients, etc.

Alarm design has received considerable attention in recent years from industry as well as academia. It has been extensively studied and applied primarily to linear systems, although real systems are non-linear in their dynamics. Hence, the corresponding data are often non-Gaussian. Moreover, closed-loop systems generate data that are strongly correlated. In this chapter, we propose to alleviate these problems through Bayesian state estimation methods.

The performance of the alarm system can be measured using three different theoretical and simulation-based indices: false alarm rate (FAR), missed alarm rate (MAR) and expected detection delay (EDD). The FAR and MAR are used to assess the accuracy of the alarm system in detecting variations in the process variables from normal to abnormal conditions [124]. False alarms are those that occur during fault-free operation, and missed alarms are those that are missed during faulty operation [94]. A good alarm design algorithm must reduce the false and missed alarm rates to avoid distracting the operators and degrading the process. The user usually defines the alarm design requirements in terms of false and missed alarm rates and the maximum acceptable detection delay.
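To make the two accuracy indices concrete, the following minimal sketch estimates FAR and MAR for a single high-alarm threshold by counting samples. It is an illustration only and not the procedure used in this thesis: the function names and the synthetic, skewed data in the demonstration are assumptions introduced here, and the samples are treated as already labelled fault-free or faulty.

```python
import random


def false_alarm_rate(normal_samples, threshold):
    """Fraction of fault-free samples that exceed the high-alarm threshold."""
    exceed = sum(1 for y in normal_samples if y > threshold)
    return exceed / len(normal_samples)


def missed_alarm_rate(faulty_samples, threshold):
    """Fraction of faulty samples that stay at or below the threshold."""
    missed = sum(1 for y in faulty_samples if y <= threshold)
    return missed / len(faulty_samples)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical data: skewed (non-Gaussian) noise, shifted upwards by the fault.
    normal = [rng.expovariate(1.0) for _ in range(5000)]
    faulty = [2.0 + rng.expovariate(1.0) for _ in range(5000)]
    # Raising the threshold lowers the FAR but raises the MAR.
    for threshold in (1.0, 2.0, 3.0):
        far = false_alarm_rate(normal, threshold)
        mar = missed_alarm_rate(faulty, threshold)
        print(f"threshold={threshold:.1f}  FAR={far:.3f}  MAR={mar:.3f}")
```

Because the estimates are simple counts, no distributional assumption is needed, which is the property that matters for non-Gaussian data.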
The FAR, MAR and detection delay requirements are then used in conjunction with an iterative process to identify an appropriate alarm threshold. The literature on alarm design for measurements generated by linear systems is based on deriving false and missed alarm rates using Markov chain analysis. This approach to alarm design has been presented in [10]. The authors have extended their work to on-delay, off-delay and generalized delay alarms. They have also shown that their algorithms are readily applicable to filtered data.

There are three main techniques widely used in industry to improve alarm performance. Alarms that depend on hard upper and lower bounds are called deadband alarms. Alarms that turn on only after a few measurement samples have crossed the threshold are called delay timers. On-delay timers are triggered at the time of alarm activation, while off-delay timers are triggered at the time of alarm deactivation [12]. Alarm filtering is another commonly used alarm design technique. The moving average filter and the exponentially weighted moving average filter are the most common filters used in industry. In this dissertation, we focus on delay timers and deadbands.

Alarm design algorithms can be classified into two main categories: data-based algorithms and model-based algorithms [126]. Data-based approaches rely on large historical data sets and a priori or historical process knowledge, while model-based approaches rely on physical laws that describe the dynamic behaviour of the system. Model-based methods tend to be more accurate if high-quality models are available [69]. It is difficult to obtain such models in large, complex industrial plants, although it is possible to develop reliable models for smaller units such as reactors and heat exchangers. In this dissertation, we develop a data-based alarm algorithm for correlated measurements from a non-linear stochastic system and extend it to a model-based algorithm.
We handle the correlations between data and the non-linearity using a Bayesian state estimation algorithm called a particle filter.

This chapter is organized as follows. In Section 5.2, the alarm system design problem in non-linear stochastic systems is formulated. In Section 5.3 and Section 5.4, the delay timer and deadband techniques are discussed and the alarm indices (FAR, MAR and EDD) are derived for each technique. In Section 5.5, two examples are used to illustrate the proposed approach. Finally, some conclusions are drawn in Section 5.6.

5.2 Problem statement

The main problem of industrial alarm design addressed in this thesis is formally defined in this section, but first we describe the model of the process for which an alarm design is sought.

Let $x_k \in \mathcal{X} \subseteq \mathbb{R}^n$ be a discrete-time, unobserved Markov state process characterized by its initial density $x_0 \sim p(x_0)$ and transition density $x_{k+1} \mid (x_k, u_k, \theta) \sim p(x_{k+1} \mid x_k, u_k, \theta)$. Here, $u_k \in \mathcal{U} \subseteq \mathbb{R}^p$ and $\theta \in \Theta \subseteq \mathbb{R}^r$ are the control variables and model parameters, respectively. The process $\{x_k\}_{k \in \mathbb{N}}$ is hidden, but observed through a measurement process $y_k \in \mathcal{Y} \subseteq \mathbb{R}^m$. The measurement process $\{y_k\}_{k \in \mathbb{N}}$ is conditionally independent given $\{x_k, u_k\}_{k \in \mathbb{N}}$, and is characterized by the conditional marginal density $y_k \mid (x_k, u_k, \theta) \sim p(y_k \mid x_k, u_k, \theta)$. In this dissertation, the parameter set $\theta \in \Theta$ and control signal $\{u_k\}_{k \in \mathbb{N}} \in \mathcal{U}$ are assumed to be known a priori. Thus, for notational simplicity, the explicit dependency of the densities on $\theta$ and $\{u_k\}_{k \in \mathbb{N}}$ will not be written hereafter, unless otherwise warranted. To summarize, we have the following model representation:

Model 5.2.1. Stochastic non-linear time-series model

$$
\begin{aligned}
x_0 &\sim p(x_0); && \text{(5.1a)}\\
x_k \mid x_{k-1} &\sim p(x_k \mid x_{k-1}); && \text{(5.1b)}\\
y_k \mid x_k &\sim p(y_k \mid x_k). && \text{(5.1c)}
\end{aligned}
$$

For the purposes of this dissertation, certain assumptions on Model 5.2.1 are made:

Assumption 5.2.1.
$\{\nu_k\}_{k \in \mathbb{N}} \in \mathbb{R}^n$ and $\{\omega_k\}_{k \in \mathbb{N}} \in \mathbb{R}^m$ are mutually independent sequences of independent random variables described by the probability density functions (PDFs) $\nu_k \sim p(\nu_k)$ and $\omega_k \sim p(\omega_k)$, and are defined independently of $x_0 \sim p(x_0)$.

Assumption 5.2.2. The PDFs $p(\nu_k)$, $p(\omega_k)$ and $p(x_0)$ are known in their classes (e.g., Gaussian) and are parameterized by a known and finite number of moments (e.g., mean and variance).

Most complex chemical processes can be represented by a sub-class of Model 5.2.1, popularly referred to as a discrete-time stochastic non-linear state-space model (SSM) and represented as:

Model 5.2.2. Stochastic non-linear state-space model

$$
\begin{aligned}
x_k &= f(x_{k-1}, u_{k-1}, \nu_k, \theta) && \text{(5.2a)}\\
y_k &= g(x_k, u_k, \omega_k, \theta) && \text{(5.2b)}
\end{aligned}
$$

where $f \in \mathbb{R}^n$ is an $n$-dimensional non-linear state transition function and $g \in \mathbb{R}^m$ is an $m$-dimensional non-linear measurement mapping function. A time instant is denoted by $k$.

Problem 5.2.1. Given a process described by Model 5.2.2 under Assumptions 5.2.1 and 5.2.2, design delay timer and deadband alarm systems for the hidden states.

A good alarm design requires a good approximation of the density functions in the fault-free region, $P_{na}$, and in the faulty region, $P_a$. If the model is unknown, our approach is based on Monte Carlo simulations. When the model is known, we make use of a probability density function approximation algorithm called particle filtering, which is discussed in detail in Chapter 3.

• Unknown model: Assuming that the density function of the process variables is stationary, an approximate density function of a process variable of interest is estimated using a large set of $N$ samples. The approximation of the density functions $p_{na}(y_k)$ and $p_a(y_k)$ of the data in the normal and faulty regions is given by:

$$p(y_k) \approx \frac{1}{N} \sum_{i=1}^{N} \delta\big(y_k - y_k^{(i)}\big) \tag{5.3}$$

where $y_k^{(i)}$ are the process variable samples collected using a window from time $k-N+1$ to $k$.
Clearly, this density function automatically changes when a fault appears.

• Known model: If a reliable non-linear model is available, accurate alarms can be designed not only on measured variables ($y_k$) but also on hidden states ($x_k$). The approximation of the density functions $p_{na}(x_{k-n+1:k} \mid y_{1:k})$ and $p_a(x_{k-n+1:k} \mid y_{1:k})$ in the normal and faulty regions is given by:

$$p(x_{k-n+1:k} \mid y_{1:k}) \approx \frac{1}{N} \sum_{i=1}^{N} \delta\big(x_{k-n+1:k} - x_{k-n+1:k}^{(i)}\big) \tag{5.4}$$

Figure 5.1: FAR/MAR PDFs using a delay timer technique.

5.3 Delay timers

Delay timers, also known as debounce timers, are widely used in industry to reduce false alarms and eliminate repeated nuisance alarms. An on-delay timer is used to eliminate nuisance alarms that repeat over a short period of time. It is designed so that the alarm is triggered only once the signal has remained in the alarm state continuously for a specified period of time. An off-delay timer is used to reduce chattering alarms by delaying the clearing of the alarm for a period, which prevents repeated alarm resetting [78]. As in the deadband technique, [40] and [63] recommend certain design values for delay timers based on the type of process variable, as shown in Table 5.1.

Table 5.1: (On/Off)-delay timer recommendations based on signal type

| Signal type | (On/Off) delay timer |
|---|---|
| Flow rate | 15 seconds |
| Level | 60 seconds |
| Pressure | 15 seconds |
| Temperature | 60 seconds |

In this section, we briefly describe a simulation-based approach to designing an on-delay timer for a single-output process. However, that does not preclude us from using it on a multivariate process. An on-delay timer is activated if a process variable exceeds a predetermined threshold for a certain number of consecutive samples. Suppose that $S_x$ is the threshold on a single process measurement $y_k \in \mathbb{R}$, and assume that we are interested in an on-delay timer with a delay of $n$ samples. Then the on-delay timer is activated when the process variable exceeds $S_x$ in $n$ consecutive samples. We do not assume that the process variables follow a known distribution.
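The on-delay rule described above can be sketched in a few lines. This is a minimal illustration under stated assumptions rather than the implementation used for the case studies: the function names are hypothetical, the alarm is assumed to clear as soon as one sample falls back to or below the threshold (no off-delay), and `window_exceedance_rate` is simply the empirical fraction of length-$n$ windows of consecutive samples that lie entirely above $S_x$.

```python
def on_delay_alarm(samples, threshold, n):
    """Alarm state per sample: True once n consecutive samples exceed threshold.

    Assumption: no off-delay, so the alarm clears on the first sample that is
    not above the threshold.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    states = []
    run = 0  # length of the current run of samples above the threshold
    for y in samples:
        run = run + 1 if y > threshold else 0
        states.append(run >= n)
    return states


def window_exceedance_rate(samples, threshold, n):
    """Fraction of length-n windows whose samples are all above threshold."""
    windows = len(samples) - n + 1
    if windows < 1:
        raise ValueError("need at least n samples")
    hits = sum(
        1
        for start in range(windows)
        if all(y > threshold for y in samples[start:start + n])
    )
    return hits / windows


if __name__ == "__main__":
    data = [0, 2, 2, 0, 2, 2, 2, 2, 0]
    print(on_delay_alarm(data, threshold=1, n=3))
    print(window_exceedance_rate(data, threshold=1, n=3))
```

Applied to fault-free data, the window fraction is a sample-based estimate of how often an on-delay alarm with delay $n$ would be raised falsely; applied to faulty data, its complement indicates how often the alarm condition is not met.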
The proposed algorithm involves the following steps:

Step 1: We compute the probability of the fault-free region $P_{na}$ and the faulty region $P_a$ using the output signal for unknown models and the process model for known models, using Equation 5.3 and Equation 5.4.

Step 2: We consider an on-delay timer alarm design with a delay of $n$ samples. The design parameters for the alarm can be computed as follows:

• FAR: For a given threshold $S_x$ and a given delay $n$, the false alarm rate (FAR) for unknown and known models is calculated as:

1. Unknown model:

$$\mathrm{FAR} = \int_{S_x}^{\infty} \cdots \int_{S_x}^{\infty} p_{na}(y_{k-n+1:k})\, dy_{k-n+1:k} \tag{5.5}$$

With the sample-based approximation of Equation 5.3, this integral reduces to the fraction of samples satisfying $y^{(i)}_{k-n+1:k} > S_x$, where $y^{(i)}_{k-n+1:k} > S_x$ is used as a short form to imply that every element of $y^{(i)}_{k-n+1:k}$ is greater than the threshold $S_x$. The samples $y^{(i)}_{k-n+1:k}$ are obtained by choosing the many sets of $n$ consecutive samples available in the data set. This choice is justified by the assumption of a stationary distribution.

2. Known model: If a reliable non-linear model is available, accurate alarms can be designed not only on measured variables ($y_k$) but also on hidden states ($x_k$). The alarm design procedure for the hidden states is similar to the one presented above, with minor modifications to account for the known model:

$$\mathrm{FAR} = \int_{S_x}^{\infty} \cdots \int_{S_x}^{\infty} p_{na}(x_{k-n+1:k} \mid y_{1:k})\, dx_{k-n+1:k} \tag{5.6}$$

where $S_x$ is the alarm threshold. Note that no closed-form solution to (5.5) and (5.6) exists for the model form considered in this dissertation. To approximate (5.5) and (5.6), we use the approximation given in the previous step.

• MAR: Similar to the FAR calculations, MAR can be calculated as follows:

1. Unknown model:

$$\mathrm{MAR} = \int_{-\infty}^{S_x} \cdots \int_{-\infty}^{S_x} p_a(y_{k-n+1:k})\, dy_{k-n+1:k} \tag{5.7}$$

2. Known model:

$$\mathrm{MAR} = \int_{-\infty}^{S_x} \cdots \int_{-\infty}^{S_x} p_a(x_{k-n+1:k} \mid y_{1:k})\, dx_{k-n+1:k} \tag{5.8}$$

MAR can likewise be approximated using the approximation outlined in the first step.

• Expected detection delay: This is computed in a similar way for unknown and known models. Let $D$ be a discrete random variable representing the delay in detecting the fault after the fault has occurred.
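The probability of $D = 0$, given that the fault occurs at $k_\beta \in [1, N]$, is given by

$$P(D = 0 \mid \text{fault at } k_\beta) = 0.$$

The above probability is 0 since the in-built delay in the alarm design is $n > 0$ samples. In fact, we have the following probability relation:

$$P(D = k \mid \text{fault at } k_\beta) = 0, \qquad k = 1, 2, \ldots, n.$$

For detection delay $D > n$, we get the following relations:

– for $D = n+1$

$$
\begin{aligned}
P(D = n+1 \mid \text{fault at } k_\beta) &= P_a\big(X_{k_\beta} < S_x,\; X_{k_\beta+1:k_\beta+n+1} \ge [S_x]_n\big)\\
&= \underbrace{\int_{S_x}^{+\infty} \cdots \int_{S_x}^{+\infty}}_{n \text{ terms}} \int_{-\infty}^{S_x} p_a\big(dx_{k_\beta:k_\beta+n+1}\big)
\end{aligned}
$$

– for $D = n+2$

$$
\begin{aligned}
P(D = n+2 \mid \text{fault at } k_\beta) &= P_a\big(X_{k_\beta:k_\beta+1} < [S_x]_2,\; X_{k_\beta+2:k_\beta+n+2} \ge [S_x]_n\big)\\
&\quad + P_a\big(X_{k_\beta} \ge S_x,\; X_{k_\beta+1} < S_x,\; X_{k_\beta+2:k_\beta+n+2} \ge [S_x]_n\big)\\
&= \underbrace{\int_{S_x}^{+\infty} \cdots \int_{S_x}^{+\infty}}_{n \text{ terms}} \int_{-\infty}^{S_x} \int_{-\infty}^{S_x} p_a\big(dx_{k_\beta:k_\beta+n+2}\big)\\
&\quad + \underbrace{\int_{S_x}^{+\infty} \cdots \int_{S_x}^{+\infty}}_{n \text{ terms}} \int_{-\infty}^{S_x} \int_{S_x}^{+\infty} p_a\big(dx_{k_\beta:k_\beta+n+2}\big)
\end{aligned}
$$

– for $D = n+3$

$$
\begin{aligned}
P(D = n+3 \mid \text{fault at } k_\beta) &= P_a\big(X_{k_\beta:k_\beta+2} < [S_x]_3,\; X_{k_\beta+3:k_\beta+n+3} \ge [S_x]_n\big)\\
&\quad + P_a\big(X_{k_\beta} \ge S_x,\; X_{k_\beta+1:k_\beta+2} < [S_x]_2,\; X_{k_\beta+3:k_\beta+n+3} \ge [S_x]_n\big)\\
&\quad + P_a\big(X_{k_\beta:k_\beta+1} \ge [S_x]_2,\; X_{k_\beta+2} < S_x,\; X_{k_\beta+3:k_\beta+n+3} \ge [S_x]_n\big)\\
&\quad + P_a\big(X_{k_\beta} < S_x,\; X_{k_\beta+1} \ge S_x,\; X_{k_\beta+2} < S_x,\; X_{k_\beta+3:k_\beta+n+3} \ge [S_x]_n\big)\\
&= \underbrace{\int_{S_x}^{+\infty} \cdots \int_{S_x}^{+\infty}}_{n \text{ terms}} \int_{-\infty}^{S_x} \int_{-\infty}^{S_x} \int_{-\infty}^{S_x} p_a\big(dx_{k_\beta:k_\beta+n+3}\big)\\
&\quad + \underbrace{\int_{S_x}^{+\infty} \cdots \int_{S_x}^{+\infty}}_{n \text{ terms}} \int_{-\infty}^{S_x} \int_{-\infty}^{S_x} \int_{S_x}^{+\infty} p_a\big(dx_{k_\beta:k_\beta+n+3}\big)\\
&\quad + \underbrace{\int_{S_x}^{+\infty} \cdots \int_{S_x}^{+\infty}}_{n \text{ terms}} \int_{-\infty}^{S_x} \int_{S_x}^{+\infty} \int_{S_x}^{+\infty} p_a\big(dx_{k_\beta:k_\beta+n+3}\big)\\
&\quad + \underbrace{\int_{S_x}^{+\infty} \cdots \int_{S_x}^{+\infty}}_{n \text{ terms}} \int_{-\infty}^{S_x} \int_{S_x}^{+\infty} \int_{-\infty}^{S_x} p_a\big(dx_{k_\beta:k_\beta+n+3}\big)
\end{aligned}
$$

The probabilities can be calculated similarly for higher $D = d$ values.

Observations:

• There is a pattern in which these probabilities evolve for a given $D = d$ value. It remains to be seen whether this pattern can be leveraged to find a general expression for $P(D = d \mid \text{fault at } k_\beta)$ for any given $d \in [0, \infty)$. Without a general solution, computing the expected detection delay, given by

$$E[D \mid \text{fault at } k_\beta] = \sum_{d=0}^{+\infty} d\, P(D = d \mid \text{fault at } k_\beta),$$

will be difficult.

• The dimension of the integral in the computation of $P(D = d \mid \text{fault at } k_\beta)$ increases with $d$.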
This may translate into large approximation errors for large $d$ values.

Step 3: In the final step, an appropriate delay in the range $n_{\min} \le n \le n_{\max}$ is chosen at the optimal threshold.

A single chemical reactor example is used in Section 5.5.1 to illustrate the application of delay timers in real chemical processes.

5.4 Deadbands

An alarm deadband, also known as "lock up", is defined as the range through which an input must be varied from the alarm limit in order to clear the alarm [40]. Deadband is another technique widely used in industry to reduce the amount of oscillation or alarm chattering. It is frequently used in alarm systems because of its simple configuration and because it does not need any memory or special software for implementation [94]. [40] and [63] recommend some design values for deadbands based on the type of process variable, as shown in Table 5.2. As can be seen there, rapidly changing process signal types (i.e., composition, flow rate and pressure) have wider deadbands, while slow signal types (i.e., level and temperature) have narrower deadbands.

The alarm deadband technique is configured using two limits instead of one for triggering and clearing the alarm, as shown in Figure 5.2. For example, for high alarms, a limit is set, as usual, for raising the alarm. However, once an alarm is activated, it is not cleared as soon as the variable falls below that limit. To clear the alarm, the variable must go below a second, lower threshold, as shown in Figure 5.3.
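The two-limit logic for a high alarm can be sketched as a small state machine. This is a minimal illustration, not the implementation used in the case studies: the function name is hypothetical, the clearing limit is assumed to be `threshold * (1 - deadband)` with `deadband` given as a fraction, and a positive threshold is assumed so that the clearing limit lies below the raising limit.

```python
def deadband_high_alarm(samples, threshold, deadband):
    """Alarm state per sample for a high alarm with a deadband.

    The alarm is raised when a sample exceeds `threshold` and is cleared only
    when a sample drops below `threshold * (1 - deadband)`.
    Assumption: threshold > 0 and 0 <= deadband < 1.
    """
    clear_limit = threshold * (1.0 - deadband)
    active = False
    states = []
    for y in samples:
        if not active and y > threshold:
            active = True       # raise the alarm at the usual limit
        elif active and y < clear_limit:
            active = False      # clear only below the lower limit
        states.append(active)
    return states


if __name__ == "__main__":
    signal = [0.5, 1.1, 0.95, 0.92, 0.85, 1.0, 1.2]
    print(deadband_high_alarm(signal, threshold=1.0, deadband=0.1))
```

With `deadband=0` the function reduces to a plain threshold alarm, so the effect of the deadband on chattering can be seen by running the same signal with both settings.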
For the high alarm, the deadband limit is given by threshold × (1 − deadband), and for the low alarm, the deadband limit is given by threshold × (1 + deadband).

Figure 5.2: FAR/MAR PDFs using deadband technique.

Figure 5.3: High and low alarms with deadband technique.

Table 5.2: Recommended alarm deadbands based on signal type

| Signal type | Deadband |
|---|---|
| Composition | 10% |
| Flow rate | 5% |
| Pressure | 5% |
| Level | 2% |
| Temperature | 1% |

In the case of deadband design, the probabilities of false and missed alarms are calculated as follows:

Step 1: This is exactly the same as the first step in the delay timers section, where we compute the probability of the fault-free region $P_{na}$ and the faulty region $P_a$ using the output signal for unknown models and the process model for known models.

Step 2: The alarm design parameters for unknown and known models with an $(x_1, x_2)$ deadband are computed in a similar way to the delay timer technique, as follows:

• FAR: First, we compute the FAR for normal operation:

1. Unknown model:

$$\mathrm{FAR} = \int_{-\infty}^{x_1} \cdots \int_{-\infty}^{x_1} p_{na}(y_{k-n+1:k})\, dy_{k-n+1:k} + \int_{x_2}^{+\infty} \cdots \int_{x_2}^{+\infty} p_{na}(y_{k-n+1:k})\, dy_{k-n+1:k} \tag{5.9}$$

2. Known model:

$$\mathrm{FAR} = \int_{-\infty}^{x_1} \cdots \int_{-\infty}^{x_1} p_{na}(x_{k-n+1:k} \mid y_{1:k})\, dx_{k-n+1:k} + \int_{x_2}^{+\infty} \cdots \int_{x_2}^{+\infty} p_{na}(x_{k-n+1:k} \mid y_{1:k})\, dx_{k-n+1:k} \tag{5.10}$$

where $x_1$ and $x_2$ are the lower and upper alarm thresholds. Note that no closed-form solution to (5.9) and (5.10) exists for the model form considered in this dissertation. To approximate (5.9) and (5.10), we use the approximation given in the previous section.

• MAR: Similar to the FAR calculations, MAR can be calculated as follows:

1. Unknown model:

$$\mathrm{MAR} = \int_{x_1}^{x_2} \cdots \int_{x_1}^{x_2} p_a(y_{k-n+1:k})\, dy_{k-n+1:k} \tag{5.11}$$

2. Known model:

$$\mathrm{MAR} = \int_{x_1}^{x_2} \cdots \int_{x_1}^{x_2} p_a(x_{k-n+1:k} \mid y_{1:k})\, dx_{k-n+1:k} \tag{5.12}$$

MAR can then be approximated using the approximation outlined previously.

• Expected detection delay: This is computed in the same way as for delay timers.

Step 3: In the final step, an appropriate delay in the range $n_{\min} \le n \le n_{\max}$ is chosen at the optimal threshold.

A simulation example involving two CSTRs and a flash tank is used in Section 5.5.2 to illustrate the application of deadbands in real chemical processes.

5.5 Case studies

5.5.1 Case study 1: application to a single chemical reactor

Process description

A single, well-mixed, non-isothermal continuous stirred tank reactor, shown in Figure 5.4, is considered in this section. The example is taken from [92], where three parallel, irreversible, elementary exothermic reactions of the form $A \xrightarrow{k_1} B$, $A \xrightarrow{k_2} U$ and $A \xrightarrow{k_3} R$ take place in the reactor; A is the reactant species, B is the desired product, and U and R are undesired byproducts. Pure A is fed to the reactor at flow rate $F$, molar concentration $C_{A0}$ and temperature $T_{A0}$. Because of the non-isothermal nature of the reactions, the reactor is fitted with a jacket to remove heat from, or add heat to, the reactor. Under standard modelling assumptions, a mathematical model that represents the dynamics of the process can be derived from the mass and energy balances and takes the following form:

Figure 5.4: Schematic of a single chemical reactor process.

$$
\begin{aligned}
\frac{dT}{dt} &= \frac{F}{V}(T_{A0} - T) + \sum_{i=1}^{3} \frac{(-\Delta H_i)}{\rho c_p} R_i(C_A, T) + \frac{Q}{\rho c_p V} && \text{(5.13a)}\\
\frac{dC_A}{dt} &= \frac{F}{V}(C_{A0} - C_A) - \sum_{i=1}^{3} R_i(C_A, T) && \text{(5.13b)}\\
\frac{dC_B}{dt} &= -\frac{F}{V} C_B + R_1(C_A, T) && \text{(5.13c)}
\end{aligned}
$$

where $R_i(C_A, T) = k_{i0} \exp(-E_i / RT)\, C_A$. $T$, $C_A$, $C_B$, $Q$ and $V$ denote the temperature of the reactor, the concentration of species A, the concentration of species B, the rate of heat input to or removal from the reactor and the volume of the reactor, respectively. $\Delta H_i$, $k_{i0}$ and $E_i$, $i = 1, 2, 3$, denote the enthalpies, pre-exponential constants and activation energies of the three reactions, respectively. $c_p$ and $\rho$ denote the heat capacity and density of the fluid in the reactor.
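As a sanity check on the reactor model, the right-hand side of the balances can be coded directly. The sketch below is illustrative only and is not the simulation code used in this study: the names `PARAMS`, `reaction_rates` and `cstr_rhs` are hypothetical, the parameter values are those listed in Table 5.3, the production term for B is taken to be $R_1(C_A, T)$ in line with the definition $R_i(C_A, T)$, and the heat input $Q$ is left as an argument because its steady-state value is not stated.

```python
import math

R_GAS = 8.314  # gas constant, kJ/(kmol K)

# Parameter values as listed in Table 5.3 (units as given there).
PARAMS = {
    "F": 4.998,        # feed flow rate, m^3/h
    "V": 1.0,          # reactor volume, m^3
    "T_A0": 300.0,     # feed temperature, K
    "C_A0": 4.0,       # feed concentration of A, kmol/m^3
    "dH": (-5.0e4, -5.2e4, -5.4e4),   # reaction enthalpies, kJ/kmol
    "k0": (3.0e6, 3.0e5, 3.0e5),      # pre-exponential constants, 1/h
    "E": (5.0e4, 7.53e4, 7.53e4),     # activation energies, kJ/kmol
    "rho": 1000.0,     # density, kg/m^3
    "cp": 0.231,       # heat capacity, kJ/(kg K)
}


def reaction_rates(c_a, temp, p=PARAMS):
    """Rates R_i(C_A, T) = k_i0 * exp(-E_i / (R T)) * C_A for i = 1, 2, 3."""
    return [
        k0 * math.exp(-e / (R_GAS * temp)) * c_a
        for k0, e in zip(p["k0"], p["E"])
    ]


def cstr_rhs(state, q_heat, p=PARAMS):
    """Time derivatives (dT/dt, dC_A/dt, dC_B/dt) of the reactor balances.

    `q_heat` is the heat input Q in kJ/h (an assumed, user-supplied value).
    """
    temp, c_a, c_b = state
    rates = reaction_rates(c_a, temp, p)
    rho_cp = p["rho"] * p["cp"]
    dilution = p["F"] / p["V"]
    reaction_heat = sum(-dh * r for dh, r in zip(p["dH"], rates)) / rho_cp
    d_temp = dilution * (p["T_A0"] - temp) + reaction_heat + q_heat / (rho_cp * p["V"])
    d_ca = dilution * (p["C_A0"] - c_a) - sum(rates)
    d_cb = -dilution * c_b + rates[0]
    return d_temp, d_ca, d_cb


if __name__ == "__main__":
    # Evaluate at the tabulated steady state with Q = 0 (an assumed value).
    print(cstr_rhs((388.57, 3.59, 0.41), q_heat=0.0))
```

Evaluating the function at the tabulated steady state is a quick consistency check: with $Q$ taken as zero, the derivatives should be close to zero, up to the rounding of the tabulated values.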
The manipulated inputs used in this study are the rate of heat input $u_1$, the inlet stream temperature $u_2$ and the inlet reactant concentration $u_3$. The steady-state operating data of the benchmark example are given in Table 5.3.

Table 5.3: Process parameters and steady-state values for the CSTR

Parameter       Value           Unit
F               4.998           m^3/h
V               1               m^3
R               8.314           kJ/(kmol·K)
T_A0            300             K
C_A0, C_B0      4, 0            kmol/m^3
ΔH_1            −5.0 × 10^4     kJ/kmol
ΔH_2            −5.2 × 10^4     kJ/kmol
ΔH_3            −5.4 × 10^4     kJ/kmol
k_10            3.0 × 10^6      h^−1
k_20            3.0 × 10^5      h^−1
k_30            3.0 × 10^5      h^−1
E_1             5.0 × 10^4      kJ/kmol
E_2             7.53 × 10^4     kJ/kmol
E_3             7.53 × 10^4     kJ/kmol
ρ               1000            kg/m^3
c_p             0.231           kJ/(kg·K)
T_A^s           388.57          K
C_A^s, C_B^s    3.59, 0.41      kmol/m^3

Simulation results and discussion

In this example, the single CSTR model is simulated for 1000 samples, and the measurement and state noise covariance matrices are assumed to be

\[
Q_\nu = \begin{bmatrix} 0.01 & 0 & 0 \\ 0 & 0.01 & 0 \\ 0 & 0 & 0.01 \end{bmatrix}
\quad \text{and} \quad
Q_\omega = \begin{bmatrix} 0.01 & 0 & 0 \\ 0 & 0.01 & 0 \\ 0 & 0 & 0.01 \end{bmatrix},
\]

respectively.

Figure 5.5: A plot of MAR and FAR against alarm threshold for the single chemical reactor.

A step change of magnitude 0.5 in the second state, $C_A$, was introduced at $k = 500$; the model was simulated 200 times and the corresponding data were collected. Figure 5.5 shows a graph of FAR/MAR as a function of various threshold values and the timer delay. Different delays were tried until the desired FAR = MAR = 5% was achieved. The corresponding lower bound on the delay was $n_{\min} = 100$. The corresponding expected detection delay, shown in Figure 5.6, was approximately 150 samples. A histogram of the states before and after the fault is shown in Figure 5.7.
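The tuning procedure described above (simulate many runs, apply an n-sample delay timer, and read off FAR and MAR for each threshold and delay) can be sketched as follows. This is a minimal illustration under loud assumptions: the CSTR is replaced by a hypothetical scalar AR(1) surrogate with a mean shift of 0.5 at k = 500, FAR and MAR are taken as the fraction of normal samples with the alarm on and the fraction of faulty samples with the alarm off, and all function names and numerical settings other than the run length, fault time and number of runs are invented for the example.

```python
import random


def delay_timer_alarm(x, threshold, n):
    """Alarm is on at sample k only if the last n samples all exceed the threshold."""
    alarm, run = [], 0
    for value in x:
        run = run + 1 if value > threshold else 0
        alarm.append(run >= n)
    return alarm


def simulate_run(rng, n_samples=1000, fault_at=500, step=0.5, a=0.8, noise_sd=0.1):
    """Hypothetical AR(1) surrogate (NOT the CSTR model); mean shifts by `step` at `fault_at`."""
    x, dev = [], 0.0
    for k in range(n_samples):
        dev = a * dev + rng.gauss(0.0, noise_sd)  # correlated deviation from the mean
        x.append(dev + (step if k >= fault_at else 0.0))
    return x


def estimate_far_mar(threshold, n, runs=200, fault_at=500, seed=0):
    """Monte Carlo estimates of FAR and MAR for one (threshold, delay) pair."""
    rng = random.Random(seed)  # same seed -> same data for every (threshold, n)
    false_alarms = missed = normal = faulty = 0
    for _ in range(runs):
        x = simulate_run(rng, fault_at=fault_at)
        alarm = delay_timer_alarm(x, threshold, n)
        false_alarms += sum(alarm[:fault_at])
        missed += sum(not flag for flag in alarm[fault_at:])
        normal += fault_at
        faulty += len(x) - fault_at
    return false_alarms / normal, missed / faulty


# Scan a few delays at a fixed threshold halfway between the two means.
for n in (1, 5, 20):
    far, mar = estimate_far_mar(threshold=0.25, n=n, runs=50)
    print(n, round(far, 4), round(mar, 4))
```

Because the same seeded data are reused for every pair, a longer delay can only lower the estimated FAR and raise the estimated MAR in this sketch, which is the trade-off that the search over delays exploits.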
It is clear from the histograms in Figure 5.7 that the states do not exhibit a Gaussian distribution.

Figure 5.6: A plot of expected detection delay as a function of threshold for the single chemical reactor.

Figure 5.7: A plot showing the histograms of $x_k$ before and after the fault was introduced for the single chemical reactor.

5.5.2 Case study 2: application to two CSTRs and a flash tank with a recycle stream

The process description and the mathematical model that represents the behaviour of the process have already been discussed in Chapter 4.

Simulation results and discussion

The example considered here is a highly non-linear system and is more complex than the previous one. The process is simulated for 1000 samples, and the measurement and state noise covariance matrices are assumed to be $Q_\nu = 0.06 \times I_n$ and $Q_\omega = 0.09 \times I_n$, respectively, with $n = 12$. A step change of magnitude 2 in the first state, $T_1$, was introduced at $k = 500$; the model was simulated 200 times and the corresponding data were collected. Figure 5.8 shows a graph of FAR/MAR as a function of various threshold values with different deadbands. Different delays were tried until the desired FAR = MAR = 5% was achieved. The corresponding lower bound on the delay was $n_{\min} = 200$. The corresponding expected detection delay, shown in Figure 5.9, was approximately 310 samples for a deadband value of 0.1. A histogram of the states before and after the fault is shown in Figure 5.10. It is clear that the states do not exhibit a Gaussian distribution.

Figure 5.8: A plot of MAR and FAR against alarm threshold for the two CSTRs and flash tank model.

Figure 5.9: A plot of expected detection delay as a function of threshold for the two CSTRs and flash tank model.

Figure 5.10: A plot showing the histograms of $x_k$ before and after the fault was introduced for the two CSTRs and flash tank model.

5.6 Conclusions

An algorithm for designing alarms for data generated by non-linear stochastic systems has been proposed.
The algorithm is flexible and can be applied to a variety of non-linear processes. The central idea behind the algorithm is to use a particle filter algorithm to estimate FAR/MAR and expected detection delays, followed by a standard procedure for alarm design. The algorithm has been successfully demonstrated in two simulation case studies.

Chapter 6

Conclusions and Future Work

This thesis is divided into two main parts. In the first part, novel approaches are presented in the domain of fault diagnosis, with the main focus being to achieve a robust fault estimate under the effect of disturbances and uncertainty. In the second part, different alarm design techniques are investigated using a particle filter algorithm. This chapter discusses the results presented in this thesis, summarizes its contributions and provides suggestions for future research.

6.1 Conclusions

A general model-based fault detection approach for stochastic non-linear non-Gaussian systems has been developed by proposing a new test statistic. The test statistic is approximated using particle filters. Using a bank of particle filters, the proposed approach is modified to isolate faults. The algorithm is illustrated through simulation using two highly non-linear chemical processes.

Another general model-based fault isolation approach for stochastic non-linear non-Gaussian systems has also been developed, using a GOS approach. The simulation results show the excellent performance of the proposed approach in comparison with a dedicated observer scheme (DOS) approach in a non-linear system for both sensor and actuator faults. The results show the robustness of the algorithm in detecting abrupt and incipient faults in a highly non-linear example. However, the proposed algorithm failed to isolate simultaneous multiple faults.

Finally, an algorithm for designing alarms for data generated by non-linear stochastic systems was proposed. The algorithm is flexible and can be applied to a variety of non-linear processes.
The central idea behind the algorithm is to use a particle filter algorithm to estimate FAR/MAR and expected detection delays, followed by a standard procedure for alarm design. The algorithm was successfully demonstrated in two simulation case studies.

6.2 Future work

This work can be extended in different directions. In future studies, we intend to utilize the power of the particle filter algorithm to deal with stochastic non-linear systems and thereby tackle different chemical engineering problems. The following sections describe some potential areas for future work.

6.2.1 Multi-model FTC system

The proposed FDI algorithm can be extended to a fault-tolerant control (FTC) system, which keeps a control system operating acceptably even in the presence of a fault. To this end, the multi-model-based FDI algorithm can be combined with multiple model predictive controllers, forming a fault-tolerant control system. Figure 6.1 shows the structure of the proposed FTC system.

Figure 6.1: Structure of the proposed integrated design of the multiple model-based FDI approach and multiple MPC controllers.

Once a fault is detected and localized by the FDI unit, the switching mechanism switches to (or stays at) the MPC controller that corresponds to that fault. Controller switching represents a class of projection-based methods for active FTC systems. Basically, it uses a bank of MPC controllers, each of which corresponds to a possible known fault that may occur in the process. Based on the information coming from the FDI unit, the switch tries to accommodate any changes in the dynamic behaviour of the process by selecting the MPC controller that stabilizes the overall process at an acceptable performance level. A drawback of the proposed approach is that it can only deal with a limited number of anticipated faults.
However, the advantage is that model uncertainty can easily be addressed by making the local controllers robust with respect to it.

6.2.2 Model invalidation

Model-based controllers, such as model-predictive controllers, are widely used in many industries. These controllers are designed to be optimal with respect to the model; in other words, the performance of the controller is optimized for the model. Any model–plant mismatch leads to deterioration in the controller performance. The process models are typically identified from a batch data set and used in the controller design. However, most processes change with time and render the models inaccurate. Therefore, online maintenance of these models is important.

Model maintenance is known to be difficult. Identification of a model accounts for a significant part of the time and money spent in designing a controller. For instance, during the installation of model predictive controllers, engineers often take a few weeks (sometimes even a few months) to identify acceptable process models. Most industrial control systems, therefore, have a system to monitor the performance of the controller [58], [60]. However, it is often difficult to identify the source of controller performance degradation. Given the time and costs involved in modelling a process, this exercise must be done sparingly and only if the degradation can be conclusively attributed to the model [21], [7].

Model invalidation (often also called model validation) is the task of determining whether a model is still satisfactory given a new set of data from the process. This exercise is normally carried out on a batch data set when a new model is identified [81]. Online model invalidation is required to maintain the quality of models and trigger a new identification exercise.
Model invalidation is, however, not straightforward, as the processes are non-linear and the operating data do not contain enough excitation.

The conventional approaches to model invalidation are based on minimization of some function of the difference between the measurements and the model predictions. The simplest approach to model validation is one in which one-step-ahead prediction errors are continuously monitored online. Any large deviation in some statistic of these errors is used to flag a model change. However, this approach is clearly prone to errors, as a change in the model need not be reflected in the statistic used (it actually depends on the distribution of the errors). This problem was overcome in [76] using a semi-intrusive model invalidation approach. The central idea was to estimate the spectral disturbance characteristics in open loop (or manual control) and compare them to those obtained under closed-loop conditions. Their approach does not require any external excitation, as it makes use of an open-loop “template” of the disturbance characteristics.

The model invalidation problem is much more complex for non-linear systems. The probability distributions of the prediction errors are generally non-Gaussian, and the controllers are often not explicit. The well-known solutions to this problem are meant for single-variable processes under open-loop control. The authors of [19] used an approach in which a statistical transformation called the Rosenblatt transformation was applied to process variables to generate a series of uniform variables, which in turn were used to detect any change in the probability distribution of the measurements. The Rosenblatt transformation [108] is a simple technique that maps any absolutely continuous multivariate random variable to a set of uniformly distributed, independent random variables.
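For intuition, the Rosenblatt transformation has a closed form in the simplest correlated case. The sketch below assumes a standard bivariate Gaussian with known correlation `rho` (an illustrative assumption, not the non-linear setting of [19]): $u_1 = F_1(x_1)$ and $u_2 = F_{2|1}(x_2 \mid x_1)$, with both distribution functions written in terms of the standard normal CDF. All function names are hypothetical.

```python
import math
import random


def std_normal_cdf(z):
    """Standard normal CDF written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def rosenblatt_bivariate_gaussian(x1, x2, rho):
    """u1 = F1(x1), u2 = F_{2|1}(x2 | x1) for a standard bivariate Gaussian.

    Conditional on x1, x2 is Gaussian with mean rho*x1 and variance 1 - rho^2.
    """
    u1 = std_normal_cdf(x1)
    u2 = std_normal_cdf((x2 - rho * x1) / math.sqrt(1.0 - rho * rho))
    return u1, u2


def sample_pairs(n, rho, seed=0):
    """Draw n correlated standard Gaussian pairs (illustrative data only)."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        pairs.append((z1, rho * z1 + math.sqrt(1.0 - rho * rho) * z2))
    return pairs


def correlation(a, b):
    """Sample correlation coefficient of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


pairs = sample_pairs(5000, rho=0.8)
uniforms = [rosenblatt_bivariate_gaussian(x1, x2, 0.8) for x1, x2 in pairs]
print("corr(x1, x2):", correlation([p[0] for p in pairs], [p[1] for p in pairs]))
print("corr(u1, u2):", correlation([u[0] for u in uniforms], [u[1] for u in uniforms]))
```

If the assumed distribution is correct, the transformed pairs should look uniform and uncorrelated; a departure from uniformity in the transformed variables is what a change detector built on this transformation would look for.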
However, the transformations involved are too complex to derive analytical solutions, and hence the authors proposed a particle filter-based approximation. A similar approach was also suggested in [53]. These approaches were developed for single-variable processes and are not applicable to closed-loop systems.

We suggest using a new set of approximations for the Rosenblatt transformation under closed-loop conditions for multivariate processes. In a closed-loop system, correlations exist not only in the temporal direction but also between the multiple outputs (at a given time). These correlations change the Rosenblatt transformation used in [19] and hence the related particle filter approximations as well.

6.2.3 Alarm design

As a future extension to the work done in Chapter 5, we will try to use the particle filter algorithm to investigate different alarm design approaches (e.g., filtering techniques).

References

[1] Distributed diagnosis of continuous systems: Global diagnosis through local analysis. PhD thesis, Computer Science, Vanderbilt University, Nashville, Tennessee.
[2] Failure accommodation in linear systems through self reorganization. PhD thesis, Massachusetts Institute of Technology, Mass., USA.
[3] Failure Detection in Linear Systems. PhD thesis, Massachusetts Institute of Technology, Mass., USA.
[4] Robust residual generation for model-based fault diagnosis of dynamic systems. PhD thesis, University of York, York, UK.
[5] Process fault detection based on modeling and estimation methods - a survey. Automatica, 20(4):387–404, 1984.
[6] Fault diagnosis in dynamic systems using analytical and knowledge-based redundancy: A survey and some new results. Automatica, 26(3):459–474, 1990.
[7] In Yucai Zhu, editor, Multivariable System Identification For Process Control. Oxford: Elsevier Science, 2001.
[8] Mironovsky L. A. Functional diagnosis of linear dynamic systems. Automation and Remote Control, 40:1198–1205, 1979.
[9] Mironovsky L. A.
Functional diagnosis of linear dynamic systems - a survey. Automation and Remote Control, 41:1122–1142, 1980.
[10] N.A. Adnan. Performance Assessment and Systematic Design of Industrial Alarm Systems. PhD thesis, Department of Electrical and Computer Engineering, University of Alberta, Alberta, Canada, 2013.
[11] N.A. Adnan, Y. Cheng, I. Izadi, and T. Chen. Study of generalized delay-timers in alarm configuration. Journal of Process Control, 23(3):382–395, 2013.
[12] N.A. Adnan, I. Izadi, and T. Chen. On expected detection delays for alarm systems with deadbands and delay-timers. Journal of Process Control, 21:1318–1331, 2011.
[13] K. Ahmed, I. Izadi, Tongwen Chen, D. Joe, and T. Burton. Similarity analysis of industrial alarm flood data. Automation Science and Engineering, IEEE Transactions on, 10(2):452–457, April 2013.
[14] F. Alrowaie, R.B. Gopaluni, and K.E. Kwok. An algorithm for fault detection in stochastic non-linear state-space models using particle filters. In Advanced Control of Industrial Processes (ADCONIP), 2011 International Symposium on, pages 60–65, May 2011.
[15] F. Alrowaie, R.B. Gopaluni, and K.E. Kwok. An algorithm for fault isolation in stochastic non-linear state-space models using particle filters. In Proceedings of Canadian Congress of Applied Mechanics, pages 37–40, 2011.
[16] F. Alrowaie, R.B. Gopaluni, and K.E. Kwok. Fault detection and isolation in stochastic non-linear state-space models using particle filters. Control Engineering Practice, 20(10):1016–1032, 2012.
[17] F. Alrowaie, K.E. Kwok, and R.B. Gopaluni. Fault isolation based on general observer scheme in stochastic non-linear state-space models using particle filters. In Advanced Control of Industrial Processes (ADCONIP), 5th International Symposium, pages 537–542, May 2014.
[18] H. Alwi, C. Edwards, and Chee Pin Tan. Fault tolerant control and fault detection and isolation. Fault Detection and Fault-Tolerant Control Using Sliding Modes, pages 7–27, 2011.
[19] C. Andrieu, Arnaud Doucet, S.S.
Singh, and V.B. Tadic. Particle methods for change detection, system identification, and control. Proceedings of the IEEE, 92(3):423–438, Mar 2004.
[20] C. Baskiotis, J. Raymond, and A. Rault. Parameter identification and discriminant analysis for jet engine mechanical state diagnosis. In Decision and Control including the Symposium on Adaptive Processes, 1979 18th IEEE Conference on, volume 18, pages 648–650, 1979.
[21] M. Bauer and I.K. Craig. Economic assessment of advanced process control - a survey and framework. Journal of Process Control, 18(1):2–18, 2008.
[22] T. Bergquist, J. Ahnlund, and J.E. Larsson. Alarm reduction in industrial process control. In Emerging Technologies and Factory Automation, 2003. Proceedings. ETFA ’03. IEEE Conference, volume 2, pages 58–65 vol.2, Sept 2003.
[23] J. Chen and R. Patton. Robust model-based fault diagnosis for dynamic systems, 1999.
[24] W. Chen and M. Saif. Observer-based strategies for actuator fault detection, isolation and estimation for certain class of uncertain nonlinear systems. Control Theory Applications, IET, 1(6):1672–1680, 2007.
[25] Yue Cheng, I. Izadi, and Tongwen Chen. On optimal alarm filter design. In Advanced Control of Industrial Processes (ADCONIP), 2011 International Symposium on, pages 139–145, May 2011.
[26] Yue Cheng, I. Izadi, and Tongwen Chen. Optimal alarm signal processing: Filter design and performance analysis. Automation Science and Engineering, IEEE Transactions on, 10(2):446–451, April 2013.
[27] Xi-Cheng Lou, Alan S. Willsky, and George C. Verghese. Optimally robust redundancy relations for failure detection in uncertain systems. Automatica, pages 333–344, 1986.
[28] E. Chow and A. Willsky. Analytical redundancy and the design of robust failure detection systems. IEEE Transactions on Automatic Control, AC-29:603–614, 1984.
[29] E. Chow and A.S. Willsky. Analytical redundancy and the design of robust failure detection systems. Automatic Control, IEEE Transactions on, 29(7):603–614, 1984.
[30] R. N. Clark, D.C.
Fosth, and V. M. Walton. Detecting instrument malfunctions in control systems. Aerospace and Electronic Systems, IEEE Transactions on, AES-11(4):465–473, 1975.
[31] E.L. Cochran, C. Miller, and P. Bullemer. Abnormal situation management in petrochemical plants: can a pilot’s associate crack crude? In Aerospace and Electronics Conference, 1996. NAECON 1996., Proceedings of the IEEE 1996 National, volume 2, pages 806–813 vol.2, May 1996.
[32] D. Crisan, P. Del Moral, and T. Lyons. Interacting particle systems approximations of the Kushner-Stratonovitch equation. In Advances in Applied Probability, pages 819–838, 1999.
[33] Richard D. and Dan C. Particle filters for real-time fault detection in planetary rovers. In Proceedings of the Thirteenth International Workshop on Principles of Diagnosis, pages 1–6, 2002.
[34] T. F. Edgar D. A. Dixon and G. V. Reklaitis. Vision 2020: Computational needs of the chemical industry. In National Academy of Sciences, 1999.
[35] S. A. Dadebo, M. L. Bell, P. J. McLellan, and K. B. McAuley. Temperature control of industrial gas phase polyethylene reactors. Journal of Process Control, 7(2):83–95, 1997.
[36] M. Desai and A. Ray. A fault detection and isolation methodology. In Decision and Control including the Symposium on Adaptive Processes, 1981 20th IEEE Conference on, volume 20, pages 1363–1369, 1981.
[37] Steven X. Ding. Model-based Fault Diagnosis Techniques: Design Schemes, Algorithms, and Tools. Springer Publishing Company, Incorporated, 1st edition, 2008.
[38] A. Doucet, S. Godsill, and C. Andrieu. On sequential Monte Carlo sampling methods for Bayesian filtering. Statistics and Computing, 10(3):197–208, 2000.
[39] Naghoosi E., Iman I., and Tongwen C. Estimation of alarm chattering. Journal of Process Control, 21(9):1243–1249, 2011.
[40] EEMUA. Alarm systems: A guide to design, management and procurement. EEMUA Publication No.
191, Engineering Equipment and Materials Users Association, London, 2nd edition.
[41] Khashayar Khorasani Ehsan Sobhani-Tehrani. Fault Diagnosis of Nonlinear Systems Using a Hybrid Approach. Springer US, 2009.
[42] E. P. Sharpe Eryurek and D. White. Abnormal situation prevention through smart field devices. Presented at the NPRA 2005 Annual Meeting (AM-05-49), 2005.
[43] C. M. Belcastro F. N. Chowdhury, B. Jiang. Reduction of false alarms in fault detection problems. International Journal of Innovative Computing, Information & Control, 12(3):481–490, June 2006.
[44] H.S. Fogler. Elements of chemical reaction engineering. Prentice-Hall, 1999.
[45] J. Folmer and B. Vogel-Heuser. Computing dependent industrial alarms for alarm flood reduction. In Systems, Signals and Devices (SSD), 2012 9th International Multi-Conference on, pages 1–6, March 2012.
[46] P. M. Frank. Analytical and Qualitative Model-based Fault Diagnosis - A Survey and Some New Results. European Journal of Control, 2(1):6–28, 1996.
[47] P. M. Frank, S. X. Ding, and T. Marcu. Model-based fault diagnosis in technical processes. Transactions of the Institute of Measurement and Control, 22(1):57–101, 2000.
[48] Paul M. Frank. On-line fault detection in uncertain nonlinear systems using diagnostic observers: a survey. International Journal of Systems Science, 25(12):2129–2154, 1994.
[49] Paul M. Frank and Birgit Köppen-Seliger. Fuzzy logic and neural network applications to fault diagnosis. International Journal of Approximate Reasoning, 16(1):67–88, 1997.
[50] R. Gandhi and P. Mhaskar. A safe-parking framework for plant-wide fault-tolerant control. Chemical Engineering Science, 64(13):3060–3071, 2009.
[51] Adiwinata Gani, Prashant Mhaskar, and Panagiotis D. Christofides. Fault-tolerant control of a polyethylene reactor. Journal of Process Control, 17(5):439–451, 2007. Selected papers presented at the 2005 IFAC World Congress.
[52] E. Alcorta García and P.M. Frank.
Deterministic nonlinear observer-based approaches to fault diagnosis: A survey. Control Engineering Practice, 5(5):663–670, 1997.
[53] R. Gerlach, C. Carter, and R. Kohn. Diagnostics for time series analysis. Journal of Time Series Analysis, 20(3):309–330, 1999.
[54] J. J. Gertler. Fault detection and diagnosis in engineering systems. Marcel Dekker, New York, 1998.
[55] R. B. Gopaluni. A particle filter approach to identification of nonlinear processes under missing observations. The Canadian Journal of Chemical Engineering, 86(6):1081–1092, 2008.
[56] R. B. Gopaluni. Nonlinear system identification under missing observations: The case of unknown model structure. Journal of Process Control, 20(3):314–324, 2010.
[57] Frank H. and Richard D. Efficient on-line fault diagnosis for non-linear systems, 2003.
[58] T. J. Harris. Assessment of control loop performance. The Canadian Journal of Chemical Engineering, 67(5):856–861, 1989.
[59] Pau-Lo Hsu, Ken-Li Lin, and Li-Cheng Shen. Diagnosis of multiple sensor and actuator failures in automotive engines. Vehicular Technology, IEEE Transactions on, 44(4):779–789, 1995.
[60] B. Huang, S.L. Shah, and E.K. Kwok. Good, bad or optimal? Performance assessment of multivariable processes. Automatica, 33(6):1175–1183, 1997.
[61] S. David I. Izadi, S.L. Shah and T. Chen. An introduction to alarm analysis and design. In Fault Detection, Supervision and Safety of Technical Processes, pages 645–650, 2009.
[62] S. Kondaveeti I. Izadi, S.L. Shah and T. Chen. A framework for optimal design of alarm systems. In Proceedings of 7th IFAC Symposium on Fault Detection, Supervision and Safety of Technical Processes, pages 651–656, 2009.
[63] ISA. Management of alarm systems for the process industries. Technical report, International Society of Automation, 2009.
[64] R. Isermann. Model-based fault-detection and diagnosis - status and applications. Annual Reviews in Control, 29(1):71–85, 2005.
[65] R. Isermann.
Fault-Diagnosis Systems: An Introduction from Fault Detection to Fault Tolerance. Springer, Berlin Heidelberg, 2006.
[66] R. Isermann. Fault-Diagnosis Applications: Model-Based Condition Monitoring: Actuators, Drives, Machinery, Plants, Sensors, and Fault-tolerant Systems. Springer, Berlin Heidelberg, 2011.
[67] R. Isermann and P. Ballé. Trends in the application of model-based fault detection and diagnosis of technical processes. Control Engineering Practice, 5(5):709–719, 1997.
[68] I. Izadi, S.L. Shah, and T. Chen. Effective resource utilization for alarm management. In Decision and Control (CDC), 2010 49th IEEE Conference on, pages 6803–6808, Dec 2010.
[69] T. Bergquist J. Ahnlund and L. Spaanenburg. Rule-based reduction of alarm signals in industrial control. J. Intell. Fuzzy Syst., 14(2):73–84, April 2003.
[70] P. Clifford J. Carpenter and P. Fearnhead. An improved particle filter for non-linear problems. IEE Proceedings Radar, Sonar and Navigation, 146(1):2–7, 1999.
[71] E.W. Jacobsen and S. Skogestad. Identification of ill-conditioned plants - a benchmark problem. In The Modeling of Uncertainty in Control Systems (R.S. Smith and M. Dahleh (Eds.), Springer-Verlag, London), pages 367–376, 1994.
[72] T. Jiang, K. Khorasani, and S. Tafazoli. Parameter estimation-based fault detection, isolation and recovery for nonlinear satellite models. Control Systems Technology, IEEE Transactions on, 16(4):799–808, 2008.
[73] Xenofon K., James K., and Feng Z. Monitoring and diagnosis of hybrid systems using particle filtering methods. In Proceedings of the Fifteenth International Symposium on the Mathematical Theory of Networks and Systems (MTNS 02), University of Notre Dame, South, 2002.
[74] D. Koller K. Kanazawa and S. Russell. Stochastic simulation algorithms for dynamic probabilistic networks, 1995.
[75] V. Kadirkamanathan and P. Li.
Fault detection and isolation in non-linear stochastic systems - a combined adaptive Monte Carlo filtering and likelihood ratio approach. International Journal of Systems Science, 77(12):1101–1114, 2004.
[76] Leonardo C. Kammer, Dimitry Gorinevsky, and Guy A. Dumont. Semi-intrusive multivariable model invalidation. Automatica, 39(8):1461–1467, 2003.
[77] Sandeep R. Kondaveeti, Iman I., Sirish L. Shah, David S. Shook, Ramesh K., and Tongwen C. Quantification of alarm chatter based on run length distributions. Chemical Engineering Research and Design, 91(12):2550–2558, 2013.
[78] S.R. Kondaveeti, I. Izadi, S.L. Shah, and Tongwen Chen. On the use of delay timers and latches for efficient alarm design. In Control Automation (MED), 2011 19th Mediterranean Conference on, pages 970–975, June 2011.
[79] V. Krishnaswami, G.-Luh, and G. Rizzoni. Nonlinear parity equation based residual generation for diagnosis of automotive engine faults. Control Engineering Practice, 3(10):1385–1392, 1995.
[80] E. Kyriakides, J.W. Stahlhut, and G.T. Heydt. A next generation alarm processing algorithm incorporating recommendations and decisions on wide area control. In Power Engineering Society General Meeting, 2007. IEEE, pages 1–5, June 2007.
[81] Lennart Ljung, editor. System Identification (2nd Ed.): Theory for the User. Prentice Hall PTR, Upper Saddle River, NJ, USA, 1999.
[82] Pierre D. M. Nonlinear filtering: Interacting particle resolution. Comptes Rendus de l’Académie des Sciences - Series I - Mathematics, 325(6):653–658, 1997.
[83] S. Maskell M. Arulampalam and N. Gordon. A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking. IEEE Transactions on Signal Processing, 50(2):174–188, 2002.
[84] M. Basseville and I. Nikiforov. Detection of Abrupt Changes: Theory and Application. Prentice-Hall, Englewood Cliffs, NJ, 1993.
[85] J. MacCormick and A. Blake. A probabilistic exclusion principle for tracking multiple objects.
In International Journal of Computer Vision, pages 572–578, 1999.
[86] J.-F. Magni. On continuous-time parameter identification by using observers. Automatic Control, IEEE Transactions on, 40(10):1789–1792, 1995.
[87] J.-F. Magni and P. Mouyon. On residual generation by observer and parity space approaches. Automatic Control, IEEE Transactions on, 39(2):441–447, 1994.
[88] M. Marseguerra and E. Zio. Monte Carlo simulation for model-based fault diagnosis in dynamic systems. Reliability Engineering and System Safety, 94(2):180–186, 2009.
[89] K. B. McAuley, D. A. Macdonald, and P. J. McLellan. Effects of operating conditions on stability of gas-phase polyethylene reactors. AIChE Journal, 41(4):868–879, 1995.
[90] Ulrich Münz and P. Zufiria. Parametric fault diagnosis in stochastic dynamical systems. Proceedings of the 19th CEDYA, Madrid, Spain, 2005.
[91] Sriram N., Richard D., and Emmanuel B. Combining particle filters and consistency-based approaches for monitoring and diagnosis of stochastic hybrid systems. In 15th International Workshop on Principles of Diagnosis, 2004.
[92] A. Gani N. El Farra and P. Christofides. Fault-tolerant control of process systems using communication networks. International Journal of Systems Science, 51(6):1665–1682, 2005.
[93] D. Salmond N. Gordon and A. Smith. Novel Approach to Nonlinear/Non-Gaussian Bayesian State Estimation. Radar and Signal Processing, IEEE Proceedings F, 140(2):107–113, 1993.
[94] E. Naghoosi, I. Izadi, and Tongwen Chen. A study on the relation between alarm deadbands and optimal alarm limits. In American Control Conference (ACC), 2011, pages 3627–3632, June 2011.
[95] Sing Kiong Nguang, Ping Zhang, and S. Ding. Parity based fault estimation for nonlinear systems: an LMI approach. In American Control Conference, 2006, 6 pp., 2006.
[96] I. Nimmo. Adequately address abnormal situation operations. Chemical Engineering Progress, 91(1):1361–1375, 1995.
[97] B.J. Ohran, D. Munoz de la Pena, P.D. Christofides, and J.F. Davis.
Enhancing data-based fault isolation through nonlinear control: Application to a polyethylene reactor. In American Control Conference, 2008, pages 3299–3306, 2008.
[98] M. Orchard and G. Vachtsevanos. A particle filtering approach for on-line failure prognosis in a planetary carrier plate. International Journal of Fuzzy Logic and Intelligent Systems, 7(4):221–227, 2007.
[99] M. Orchard and G. Vachtsevanos. A particle filtering-based framework for real-time fault diagnosis and failure prognosis in a turbine engine. In Control Automation, 2007. MED ’07. Mediterranean Conference on, 27-29 2007.
[100] Marcos E. Orchard and George J. Vachtsevanos. A particle-filtering approach for on-line fault diagnosis and failure prognosis. Transactions of the Institute of Measurement and Control, 31(3-4):221–246, June/August 2009.
[101] Paul M.; Clark Robert N. Patton, Ron J.; Frank. Issues of Fault Diagnosis for Dynamic Systems. Springer, 2000.
[102] R. Patton, P. Frank, and R. Clark. Fault diagnosis in dynamic systems - Theory and application. Prentice Hall, 1989.
[103] R. J. Patton. Fault-tolerant control: The 1997 situation. In Proceedings of the 3rd IFAC symposium on fault detection, supervision and safety for technical processes, pages 1033–1055, August 1990.
[104] R. J. Patton and J. Chen. Review of parity space approaches to fault diagnosis for aerospace systems. Journal of Guidance, Control, and Dynamics, 17(2):278–285, March 1994.
[105] Ron J. Patton, Paul M. Frank, and Robert N. Clark. Issues of Fault Diagnosis for Dynamic Systems. Springer Publishing Company, Incorporated, 1st edition, 2010.
[106] I. E. Potter and M. C. Sunman. Threshold-less redundancy management with arrays of skewed instruments. Technical Report AGARDOGRAPH-224 (pp 15-11 to 15-25), AGARD, 1977.
[107] Indranil R., Gautam B., and X. Comprehensive diagnosis of continuous systems using dynamic Bayes nets. In Proc. of the 19th International Workshop on Principles of Diagnosis, pages 151–158, 2008.
[108] M. Rosenblatt.
Remarks on a multivariate transformation. The Annals of Mathematical Statistics, 23(3):470–472, September 1952.
[109] D.H. Rothenberg. Alarm Management for Process Control: A Best-Practice Guide for Design, Implementation, and Use of Industrial Alarm Systems. Momentum Press, 2009.
[110] Silvio S., C. Fantuzzi, and R.J. Patton. Model-based Fault Diagnosis in Dynamic Systems Using Identification Techniques. Advances in Industrial Control. Springer, 2010.
[111] V. Venkatasubramanian S. Dash. Challenges in the industrial applications of fault diagnostic systems. Computers and Chemical Engineering, 24(27):785–791, 2000.
[112] C. Fantuzzi S. Simani and R. J. Patton. Model-based Fault Diagnosis in Dynamic Systems Using Identification Techniques. Springer, 2003.
[113] Vandi V. Tractable particle filters for robot fault diagnosis, 2004.
[114] M. H. V. Kadirkamanathan, P. Li and Jaward and S. Fabri. A sequential Monte Carlo filtering approach to fault detection and isolation in nonlinear systems. Decision and Control, 2000. Proceedings of the 39th IEEE Conference on, 5:4341–4346 vol.5, 2000.
[115] M. Jaward and V. Kadirkamanathan, P. Li and S. Fabri. Particle filtering-based fault detection in nonlinear stochastic systems. International Journal of Systems Science, 33(4):259–265, 2002.
[116] H. Vedam and V. Venkatasubramanian. PCA-SDG based process monitoring and fault diagnosis. Control Engineering Practice, 7(7):903–917, 1999.
[117] V. Venkatasubramanian, R. Rengaswamy, S. N. Kavuri, and K. Yin. A review of process fault detection and diagnosis: Part III: Process history based methods. Computers & Chemical Engineering, 27(3):327–346, 2003.
[118] Marcin W. Identification and Fault Detection of Non-Linear Dynamic Systems. University of Zielona Góra Press, 2003.
[119] J. Wang and T. Chen. An online method for detection and reduction of chattering alarms due to oscillation. Computers & Chemical Engineering, 54(0):140–150, 2013.
[120] J. Wang and T. Chen.
An online method to remove chattering and repeating alarmsbased on alarm durations and intervals. Computers & Chemical Engineering,67(0):43 – 52, 2014.[121] Timothy J. Wheeler, P. Seiler, Andrew K. Packard, and G.J. Balas. Performanceanalysis of fault detection systems based on analytically redundant lineartime-invariant dynamics. In American Control Conference (ACC), 2011, pages214–219, 29 2011-July 1.[122] A. Willsky. A survey of design methods for failure detection in dynamic systems.Automatica, 12(6):601 – 611, 1976.[123] Marcin Witczak. Modelling and Estimation Strategies for Fault Diagnosis ofNon-Linear Systems: From Analytical to Soft Computing Approaches. Springer,Berlin Heidelberg, 2006.[124] Jianwei X., Jiandong W., I Izadi, and Tongwen C. Performance assessment anddesign for univariate alarm systems based on far, mar, and aad. Automation Scienceand Engineering, IEEE Transactions on, 9(2):296–307, April 2012.134[125] Jianwei Xu and Jiandong Wang. Averaged alarm delay and systematic design foralarm systems. In Decision and Control (CDC), 2010 49th IEEE Conference on,pages 6821–6826, Dec 2010.[126] Bingyong Y., Zuohua T., and Songjiao S. A novel distributed approach to robustfault detection and identification. International Journal of Electrical Power andEnergy Systems, 30(5):343 – 360, 2008.[127] B. Yan, Z. Tian, and S. Shi. A novel distributed approach to robust fault detectionand identification. International Journal of Electrical Power & Energy Systems,30(5):343 – 360, 2008.[128] F. Yang, S.L. Shah, D. Xiao, and T. Chen. Improved correlation analysis andvisualization of industrial alarm data. {ISA} Transactions, 51(4):499 – 506, 2012.135
Fault isolation and alarm design in non-linear stochastic systems Alrowaie, Feras A. 2015
Item Metadata
Title | Fault isolation and alarm design in non-linear stochastic systems |
Creator | Alrowaie, Feras A. |
Publisher | University of British Columbia |
Date Issued | 2015 |
Genre | Thesis/Dissertation |
Type | Text |
Language | eng |
Date Available | 2015-02-02 |
Provider | Vancouver : University of British Columbia Library |
Rights | Attribution-NonCommercial-NoDerivs 2.5 Canada |
DOI | 10.14288/1.0167121 |
URI | http://hdl.handle.net/2429/52010 |
Degree | Doctor of Philosophy - PhD |
Program | Chemical and Biological Engineering |
Affiliation | Applied Science, Faculty of; Chemical and Biological Engineering, Department of |
Degree Grantor | University of British Columbia |
GraduationDate | 2015-05 |
Campus | UBCV |
Scholarly Level | Graduate |
Rights URI | http://creativecommons.org/licenses/by-nc-nd/2.5/ca/ |
AggregatedSourceRepository | DSpace |