FAULT DETECTION AND ISOLATION USING THE LOCAL APPROACH

By Lechang Cheng
B. Sc. (Chemical Engineering) Zhejiang University, P. R. China, 1997.

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF APPLIED SCIENCE in THE FACULTY OF GRADUATE STUDIES, DEPARTMENT OF CHEMICAL AND BIOLOGICAL ENGINEERING

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
August 2000
© Lechang Cheng, 2000

In presenting this thesis in partial fulfillment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of Chemical and Biological Engineering
The University of British Columbia
2216 Main Mall, Vancouver, BC, Canada, V6T 1Z4
Date: August 2000

Abstract

Fault detection and isolation (FDI) has become a crucial issue for industrial process monitoring in order to increase availability, reliability, and production safety. Model-based FDI methods rely on the mathematical model and input-output data of a process to perform detection. The local approach is a new model-based FDI method which aims to detect slight changes in the parametric properties of a system. This thesis mainly addresses the application of FDI using the local approach. Robustness with respect to model uncertainties is an important issue for the local approach. A new algorithm was proposed to recalculate the threshold, based on the original threshold and the covariance matrix of the estimated parameters, in order to reduce false alarms due to the estimation error of the process parameters.
A similar algorithm was also provided to recalculate the threshold in order to reduce false alarms due to regular parameter fluctuations. As fault detection algorithms are often applied to closed-loop data, closed-loop fault detection was also investigated. Two methods were proposed to deal with the correlation between system input and output data in closed-loop detection: the dimension reduction method and the indirect detection method. The dimension reduction method uses a linear transformation to reduce the dimension of the normalized residual so that the covariance matrix of the revised normalized residual has full rank. The indirect detection method uses the closed-loop model to calculate the primary residual and the normalized residual. By detecting changes in the closed-loop parameters, the method also detects changes in the open-loop parameters. Simulation results show that both methods can detect a change in every single parameter of a system. Industrial data from a cross-direction (CD) control system in a paper machine was also used to assess the applicability of the local approach. By dividing the CD databox into small sections, the sensitivity of the detection algorithm was improved and the algorithm successfully detected abrupt faults of a single actuator. However, incipient faults of a single actuator cannot be detected, due to noise and inaccuracy of the process model.
Table of Contents

Abstract
List of Tables
List of Figures
Acknowledgements
1 Introduction
1.1 Background
1.1.1 Basic concepts of fault detection
1.1.2 General structures of fault detection and isolation algorithms
1.1.3 Existing methods for fault detection and isolation
1.1.4 The local approach
1.1.5 Robustness of fault detection and isolation
1.2 Outline and contributions of the thesis
2 Principle of the local approach
2.1 Residual generation of the local approach
2.2 Fault detection
2.3 Fault isolation
2.3.1 Sensitivity test
2.3.2 Min-max test
2.4 General procedure of FDI using the local approach
3 Robustness of FDI using local approach
3.1 Introduction
3.2 Problem formulation
3.3 Recalculation of the threshold
3.3.1 Distribution of the normalized residual using the estimated parameter
3.3.2 Evaluation of the normalized residual
3.3.3 Deduction of the threshold
3.3.4 Optimization of the threshold
3.3.5 Discussion of the results
3.3.6 Extension of the result to the case of parameter fluctuations
3.4 Simulation results
3.4.1 System specification
3.4.2 Simulation of robustness with respect to model uncertainties
3.4.3 Simulation of robustness with respect to parameter fluctuations
3.5 Experiment results
3.5.1 Identification of the system parameters
3.5.2 Training of the FDI system
3.5.3 Test of the false alarm rate
3.5.4 Test of the detection power
3.6 Conclusions
4 Closed-loop fault detection using local approach
4.1 Introduction
4.2 Discussion of a simple example
4.3 Closed-loop fault detection using the dimension reduction method
4.4 Closed-loop fault detection using the indirect detection method
4.5 Simulation results
4.5.1 Closed-loop detection using the dimension reduction method
4.5.2 Closed-loop detection using the indirect detection method
4.6 Experiment results
4.7 Conclusions
5 Application of fault detection in paper-making processes
5.1 Background
5.2 Problem formulation
5.2.1 System model
5.2.2 Modeling of actuator faults
5.3 Actuator fault detection and isolation
5.3.1 Machine-direction and cross-direction data separation
5.3.2 Mapping of databox to actuators
5.3.3 General algorithms for actuator fault detection and isolation
5.3.4 Actuator fault detection by dividing the system output into sections
5.3.5 Actuator fault detection by decoupling the system
5.4 Simulation results
5.4.1 Process specification
5.4.2 Simulation with real process model and generated data
5.4.3 Simulation with real process data
5.4.4 Discussion of simulation results
5.5 Conclusions
6 Conclusions and future work
6.1 Conclusions
6.2 Future work
References

List of Tables

3.1 False alarm rate in a fault-free case
3.2 False alarm rates for the robustness scheme and the upgrading scheme
3.3 False alarm rate in a fault-free case with parameter fluctuations
3.4 False alarm rates for the upgrading scheme and the robustness scheme in a fault-free case with parameter fluctuations

List of Figures

1.1 System representation
1.2 Simplified block representation of the system
1.3 Conceptual structure of FDI using analytical redundancy
3.1 Chi-square value of the normalized residual in a fault-free case
3.2 Chi-square value of the normalized residual in a faulty case (a change occurred at the parameter a)
3.3 Chi-square value of the normalized residual in a faulty case (a change occurred at the parameter b)
3.4 Chi-square value for the robustness scheme and for the upgrading scheme in a fault-free case
3.5 Chi-square value of the normalized residual in a fault-free case with parameter fluctuations in parameter b
3.6 Chi-square value of the normalized residual in a faulty case with parameter fluctuations (a change occurred in the parameter a)
3.7 Chi-square value of the normalized residual in a faulty case with parameter fluctuations (a step change occurred in the parameter b)
3.8 Chi-square value for the upgrading scheme and the robustness scheme in a fault-free case with parameter fluctuation
3.9 Process diagram of the tank level and temperature control system
3.10 Test of the false alarm rate in a fault-free case using the revised threshold
3.11 Test of the detection power using the revised threshold
4.1 Detection of the change in a1 using the dimension reduction method
4.2 Detection of the change in a2 using the dimension reduction method
4.3 Detection of the change in b1 using the dimension reduction method
4.4 Detection of the change in b2 using the dimension reduction method
4.5 Detection failure with the dimension reduction method
4.6 Detection of the change in a2 using the indirect detection method
4.7 Detection failure using the indirect detection method
4.8 Detection of the change of tank level using the dimension reduction method
5.1 Schematic of a typical Fourdrinier paper machine
5.2 Process interaction matrix
5.3 Detection of a jammed actuator at the 25th actuator using the decoupling method
5.4 Detection of a jammed actuator at the 25th actuator using the general method
5.5 Detection of a jammed actuator at the 25th actuator using sectioning method
5.6 Detection of an abrupt fault at the 25th actuator using the sectioning method
5.7 Detection of an incipient fault at the 25th actuator using the sectioning method
5.8 Detection of a jammed actuator at the 25th actuator using the sectioning method
5.9 Detection of jammed actuators in section 3 using the sectioning method
5.10 Detection of an incipient fault at the 25th actuator using the sectioning method
5.11 Detection of incipient faults at the 23rd-27th actuators using the sectioning method
5.12 Detection of an abrupt fault at the 25th actuator using the sectioning method
5.13 Prediction of the system output using the system inputs and process model

Acknowledgements

I would like to thank my research supervisor Dr. K. Ezra Kwok for giving me the opportunity to work under his guidance. I benefited a lot from his valuable advice and I am also grateful for his support in the past two years. I also thank Dr. Biao Huang for being my co-supervisor in this research, especially for his support and guidance during my research in the Department of Chemical and Materials Engineering at the University of Alberta. Thanks to Professor Ping Li of Zhejiang University for being my advisor during my studies in China. Without his recommendation, I would not have had the opportunity to study at UBC under Professor K. Ezra Kwok.

Special thanks go to the following people: Dr. Greg Steward for providing industrial data for this project, Mr. Norman Woo for his assistance in my thesis writing, Mr. Michael Chong Ping, Mr. Stevo Mijanovi, Xin Huang, Sheng Wan and my friends at the Pulp and Paper Center, Department of Chemical and Biological Engineering at UBC and the control group of the Department of Chemical and Materials Engineering at the University of Alberta for their friendship during my studies. They made my study and research an enjoyable experience.

Last, I would like to pay special tribute to my parents, sisters and brother for all they have done for me. I could have done nothing without their love and support, and I dedicate this thesis to them.

Chapter 1 Introduction

1.1 Background

With increasing productivity requirements and performance specifications, chemical processes and automatic control systems are becoming more and more complicated. Such conditions increase the possibility of system failures, which can be characterized by unpredictable changes in the system dynamics. Consequently, there is a growing demand for fault tolerance for the purpose of availability, reliability and production safety while maintaining high performance.
Fault tolerance can be achieved not only by providing hardware redundancy and improving the reliability of individual functional units in the system, but also by implementing plant health monitoring systems and online fault detection. Fault detection and isolation (FDI) has drawn growing attention and has become a crucial issue for industrial system monitoring over the past several decades. It has been widely investigated in both theory and application.

1.1.1 Basic concepts of fault detection and isolation

A fault can be understood as an unpermitted deviation of at least one characteristic property of a system (Isermann, 1994). Usually, three kinds of faults are considered: sensor faults (instrument faults), actuator faults, and component faults. Fault detection (FD) is a decision on whether faults exist in a system. Fault isolation (FI) is a decision on the presence of a faulty mode among a number of possible modes. With respect to the different units where faults can occur in a system, three categories of fault detection exist: instrument fault detection (IFD), actuator fault detection (AFD) and component fault detection (CFD) (Frank 1990).

Possible failure situations may be classified as abrupt (sudden) faults, which are typically modeled as step-like changes, and incipient (slowly developing) faults, which are represented by drift-type changes. With respect to the nature of the faults, applications of fault detection can typically be divided into two broad categories:

(1) Quick detection of abrupt faults of sensors, actuators or other components of a control system;
(2) Early detection of incipient changes in the dynamics of a system and its interpretation and diagnosis (Zhang, 1998).

In abrupt-type fault detection, it is crucial that the algorithm is able to detect the changes quickly to avoid disastrous consequences. In such cases, early detection and isolation are the key objectives.
Early detection of incipient changes is also referred to as condition-based maintenance. It is extremely important in industry, but has not been investigated extensively. The key idea is to replace regular systematic inspection of a system by condition-based inspections (i.e. inspections based on continuous monitoring of the system) in order to prevent possible malfunction or damage. One solution to condition-based maintenance may be the early detection of incipient changes in the parameters of a system without any artificial excitation. One of the main difficulties in the detection of incipient faults is the compensating effect of feedback control, which tends to diminish the effect of small incipient faults on the control performance.

A novel philosophy has emerged and is increasingly discussed. It is based on the use of analytical redundancy instead of physical redundancy. This implies that the information redundancy contained in the dynamic relations between the system input and the measured output is exploited for fault detection and isolation. This is called model-based fault detection and isolation. Another FD method in use is the knowledge-based method. It differs from the model-based method in that it uses qualitative models instead of quantitative models to perform detection. It may be an alternative to model-based fault detection or may complement it (Frank, 1990). These two types of methods can even be combined into one scheme. This thesis mainly discusses model-based fault detection and isolation.

1.1.2 General structures of fault detection and isolation algorithms

The process and fault models are very important for the performance of FDI algorithms. Figure 1.1 shows a typical dynamic system with input u and output y. It consists of an actuator, a sensor and plant components.
Because system noise, measurement noise and modeling errors are inevitable in any system and their effect can lead to false alarms, they must be taken into account in the model. Figure 1.2 is a simplified block diagram of the dynamic system.

Figure 1.1 System representation (Frank 1990).

Figure 1.2 Simplified block representation of the system (Frank 1990).

It has been widely acknowledged that the FDI problem can be split into two steps:

1. Residual generation. Residuals are functions of the system input u, the output y, and the nominal system model. They are accentuated by the faults and can be regarded as the index of faults in the system.
2. Residual evaluation. Residuals are evaluated to make decisions on the presence of faults and to decide the time, location, type, size, and source of the faults.

Figure 1.3 Conceptual structure of FDI using analytical redundancy (Frank 1990).

Figure 1.3 illustrates the conceptual structure of model-based FDI algorithms. In the residual generation stage, the nominal relationship is validated with the system input u, the output y and the nominal model. If a fault happens, the nominal relationship will not be satisfied and a residual occurs. The residual is then used to form appropriate decision functions. They are evaluated in the fault decision logic in order to monitor both the time of occurrence and the location of the fault.
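The two steps can be illustrated with a minimal Python sketch. It is not taken from the thesis: the first-order model y_k = a u_{k-1} + e_k, the moving-average decision function, the window length, the threshold and all function names are assumptions chosen only to make the residual generation and residual evaluation stages concrete.

```python
import random

def generate_residuals(u, y, a_nominal):
    """Residual generation: output minus the nominal model prediction.
    r_k = y_k - a_nominal * u_{k-1}, defined for k >= 1."""
    return [y[k] - a_nominal * u[k - 1] for k in range(1, len(y))]

def first_alarm(residuals, window, threshold):
    """Residual evaluation: moving average of the residual as decision
    function, compared with a fixed threshold (both are assumptions).
    Returns the residual index of the first alarm, or None."""
    for end in range(window, len(residuals) + 1):
        decision = abs(sum(residuals[end - window:end])) / window
        if decision > threshold:
            return end - 1
    return None

def simulate_output(u, gains, noise_std, rng):
    """Hypothetical plant y_k = a_k * u_{k-1} + e_k with a time-varying gain."""
    y = [0.0]
    for k in range(1, len(u)):
        y.append(gains[k] * u[k - 1] + rng.gauss(0.0, noise_std))
    return y

n, fault_time = 200, 100
u = [1.0] * n                                   # constant input
healthy_gains = [2.0] * n
faulty_gains = [2.0] * fault_time + [2.5] * (n - fault_time)  # abrupt fault

y_faulty = simulate_output(u, faulty_gains, 0.1, random.Random(0))
y_healthy = simulate_output(u, healthy_gains, 0.1, random.Random(1))

alarm_faulty = first_alarm(generate_residuals(u, y_faulty, 2.0), 10, 0.2)
alarm_healthy = first_alarm(generate_residuals(u, y_healthy, 2.0), 10, 0.2)
print("alarm index (faulty run):", alarm_faulty)
print("alarm index (healthy run):", alarm_healthy)
```

In this sketch the alarm index approximates the fault time, which is the kind of output the fault decision logic of Figure 1.3 is meant to provide; fault location would require one such residual per monitored component.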
1.1.3 Existing methods of fault detection and isolation

Various methods of fault detection and isolation have been developed in the past several decades, as can be seen from the survey papers by Willsky (1976), Gertler (1988), Basseville (1988), Frank (1990) and Isermann (1993). Among them are the parity space approach, the detection filter, the fault observer, and parameter estimation techniques. The main idea of each of these approaches is introduced below. Comparisons of these techniques are discussed in Basseville and Nikiforov (1993), Basseville (1997) and Nikiforov et al. (1996).

1. Parity space approach. The parity equation based method was first developed by Potter and Suman (1977), Desai and Ray (1981), Chow and Willsky (1984) and Lou et al. (1986). It was further developed by Patton and Chen (1991) and Gertler, Hofling and Pfeufer (1994). The parity space approach relies on a check of the parity of the mathematical equations of the system (analytical redundancy relations) by using the actual system input and output. A fault alarm is generated once preset thresholds are surpassed. There are two types of analytical redundancy relationships: direct redundancy, which consists of the relationships among instantaneous redundant sensor outputs (algebraic relations), and temporal redundancy, which consists of the dynamic relationships between system inputs and system outputs.

2. Dedicated observer approach. The dedicated observer method was developed by Clark et al. (1975), Willsky (1976), Clark (1978a, b), Frank and Keller (1980), Frank (1978b, 1988) and Mehra and Peshon (1971). The basic idea of the observer approach is to reconstruct the outputs of the system from the measurements, or subsets of the measurements, with the aid of observers or Kalman filters, using the estimation error or the innovation, respectively, as a residual for the detection and isolation of the faults.

3. Fault filter approach.
The fault detection filter (or fault sensitive filter) is a full-order state estimator with a special choice of feedback gain matrix. It was first proposed by Beard (1971) and Jones (1973) and then developed by Wilbers and Speyer (1989). By proper choice of the feedback gain matrix, the residual due to a particular fault is constrained to a single direction or plane in the residual space, independent of the model of the fault. A fault is detected when one or more of the residual projections along the known fault direction or in the known fault plane are sufficiently large.

4. Parameter identification approach. The use of parameter estimation methods for fault detection in dynamical systems was demonstrated by Hohmann (1977), Baskiotis et al. (1976), Geiger (1982), and Filbert and Metzger (1982). The development of parameter estimation methods for fault detection was then summarized by Isermann (1984). This is an alternative to the state-estimation-based methods described above. It makes use of the fact that faults of a dynamical system are reflected in its physical parameters, such as friction, mass, viscosity, resistance, capacitance, inductance, etc. The idea of the parameter identification approach is to detect the faults by estimating the parameters of the mathematical model (Isermann, 1984).

1.1.4 The local approach

For additive faults, parity check and observer-based detection methods are quite effective. However, monitoring innovations or observer errors is not sufficient for the detection of non-additive faults, or of additive faults in nonlinear systems. In such cases, a local test relying on efficient scores has proven to be useful. A local approach based on system identification and the local test has been developed which can be used to detect both incipient faults and abrupt faults.
The main idea of this approach is to transform fault detection problems concerning a parameterized process into the universal problem of monitoring the mean of a Gaussian vector. The same framework can also be used for model validation.

Benveniste, Basseville and Moustakides (1987) first proposed the basic idea of the local approach for change detection and model validation. Zhang (1993), Basseville (1993) and Benveniste (1997) have done notable work in this area. They built up the framework of this fairly general methodology and generalized the approach to a wide range of system models. Delyon and Benveniste (1997) discussed the relationship between identification and the local test. Zhang (1998) applied the local approach to a fairly wide class of nonlinear systems and proposed a combined input-output and local approach. He also combined the observer-based fault detection method with the local approach and applied them to the monitoring of nonlinear dynamical systems (1998).

There have been many applications of the local approach in the literature. B. Huang (1999) applied it to the problem of process and control loop performance monitoring. Wahnon and Berman (1990) applied it to the problem of tracking time-varying parameters in linear systems in adaptive signal processing and adaptive control. O'Reilly (1998) applied the local approach to the problem of sensor decalibration.

This thesis is mainly dedicated to application issues of the local approach: the robustness of the local approach, closed-loop detection, and the implementation of the local approach for actuator fault detection in paper-making CD control systems.

1.1.5 Robustness of fault detection and isolation

All model-based fault detection and isolation methods rely on the mathematical model of the monitored system to perform detection and diagnosis. If the system model is accurate, fault detection may be very straightforward.
However, noise and model uncertainties are inevitable in a practical system. The consequence is that even in a fault-free case, the nominal relationship between the system input and output is not satisfied. This may lead to erroneous decisions on the presence of faults in the system. In order to improve the reliability and performance of fault detection algorithms, it is of great importance to consider the effects of noise and model uncertainties on the detection algorithms, i.e. the robustness of FDI algorithms. Robustness means the ability of the algorithms to detect changes correctly in the presence of model uncertainties, disturbances and noise (Basseville, 1998; Gertler, 1988).

Many efforts have gone into increasing the robustness of fault detection algorithms. Investigations on the robustness of FDI fall into two types. The first type concerns the residual generation stage. The purpose is to find residuals that are sensitive to faults and insensitive to noise and model uncertainties. The second type concerns the residual evaluation stage. It involves either the selection or the calculation of a threshold to accommodate the effect of noise and model uncertainties. Adaptive threshold selection and fuzzy decision logic have been used in the evaluation of the residuals (Basseville, 1998).

Though the robustness problem of FDI has been investigated widely for the parity check and observer-based methods, little attention has been paid to the robustness of FDI using the local approach. Part of this thesis addresses the robustness of FDI using the local approach. While a number of methods have been proposed at the residual generation stage to improve the insensitivity of the residual to model uncertainties, the robustness of the detection algorithms can also be improved by an appropriate selection of thresholds to reduce the false alarm rate. From the application point of view, the selection of thresholds is as important as the generation of the residual.
1.2 Outline and Contributions of the Thesis

In chapter 2, the principle of fault detection and isolation using the local approach is introduced. It includes the basic concepts used in the local approach, the regular assumptions for the local approach to work, and a general procedure of fault detection and isolation using the local approach.

Chapter 3 considers the robustness of fault detection using the local approach. Since noise and model uncertainties are unavoidable in control systems, it is meaningful to consider their effects on fault detection algorithms. In this chapter, a new algorithm is proposed to revise the threshold in order to reduce the false alarm rate to the required level while keeping the sensitivity of the fault detection algorithms toward faults. The result is also extended to the situation of regular system parameter fluctuations. A similar algorithm is provided to calculate the threshold to be used in the residual evaluation. With the revised threshold, the detection algorithm will not generate any alarm for acceptable parameter changes.

Chapter 4 addresses the problem of closed-loop fault detection. As most industrial systems are operated in closed loop, it is important to apply detection algorithms to closed-loop data. Problems that arise in closed-loop detection are analyzed and two methods are proposed: the dimension reduction method and the indirect detection method. Simulation studies in this chapter show the effectiveness of the proposed methods.

In chapter 5, the local approach is applied to the problem of actuator fault detection in paper-making cross-direction (CD) control systems. CD control systems generally have many actuators. It is helpful to detect and isolate faults in these actuators in order to maintain CD control performance and reduce maintenance cost.

Conclusions and future directions are provided in chapter 6.
Chapter 2 Principle of the Local Approach

The local approach is a statistical FDI method that aims at detecting slight changes in the parameters of a parametric system. The key idea is to transform FDI problems into the general problem of monitoring the mean value of a Gaussian random vector. The terms fault and change will be used interchangeably in the following.

It has been widely acknowledged that FDI algorithms can be split into two steps: residual generation and residual evaluation. For the local approach, the main purpose of residual generation is to calculate a primary residual and a normalized residual. A primary residual is a vector-valued function of the measurements and the parameters of the monitored system. It has zero mean in the case of no fault and non-zero mean otherwise. Such a function can be regarded as the index of faults in the system. A normalized residual is a vector-valued function built on the primary residual. Under the assumption of small changes, it is Gaussian distributed with a known covariance matrix. In the residual evaluation stage of the local approach, a statistical test is applied to the normalized residual to make a decision on the presence or absence of faults and to further isolate the faults.

To illustrate the procedure of the local approach, the following state-space model is used to describe the monitored system

$$x_{k+1} = f(\theta, x_k, u_k)$$
$$y_k = h(\theta, x_k, u_k) + e_k$$

where $x_k$, $u_k$, $y_k$ are the vectors of state variables, system inputs and system outputs at time $k$, respectively; $\theta$ is the vector of the system parameters that are to be monitored; $e_k$ is noise, which could be white or colored.

2.1 Residual generation of the local approach

A primary residual for monitoring a system parameterized by $\theta$ is a function of the system parameter vector $\theta$, the system input vector $u_k$ and the system output vector $y_k$.
More precisely, it is a function of the nominal value $\theta_0$ of the parameter vector and an auxiliary process $z_k$ driven by the input $u_k$ and the output $y_k$. In general, $z_k$ is generated by a dynamic system of the form

$$\xi_{k+1} = \varphi(\theta_0, \xi_k, u_k, y_k), \qquad z_k = \psi(\theta_0, \xi_k, u_k, y_k) \tag{2.2}$$

where $\varphi$, $\psi$ are two appropriate functions. Through this auxiliary process, $z_k$ summarizes in some sense all the observations up to the time instant $k$. The auxiliary vector may be simply composed of a finite sample of the observed output and the innovation. A simple example is given for the following first order discrete system

$$y_k = a u_{k-1} + e_k$$

The auxiliary vector can be generated through the dynamic system

$$\xi_{k+1} = u_k, \qquad z_k = \xi_k$$

Primary residuals associated with parametric models for the purpose of FDI should satisfy the following requirements.

Definition 2.1: (Primary residual) A vector-valued function $H(\theta, z_k)$ is called a valid primary residual if it is differentiable in $\theta$ and

$$E_\theta H(\theta_0, z_k) = 0, \quad \text{if } \theta = \theta_0 \tag{2.3}$$
$$E_\theta H(\theta_0, z_k) \neq 0, \quad \text{if } \theta \neq \theta_0 \tag{2.4}$$

where $\theta_0$ is the nominal value of the system parameter $\theta$, and $E_\theta$ denotes the expectation when the actual system parameter value is $\theta$. In eqns. 2.3 and 2.4, $z_k$ is a function of $\theta$.

A typical example of such a primary residual is the gradient of the least square (LS) prediction error with respect to $\theta$. It is assumed that the sensitivity matrix

$$M(\theta_0) = E_{\theta_0} \left. \frac{\partial}{\partial \theta} H(\theta, z_k) \right|_{\theta = \theta_0} \tag{2.5}$$

exists and has full column rank (Zhang, 1998a).

General statistical approaches to residual evaluation require knowledge of the statistical properties of the FDI residuals. However, even for a simple linear model, the distribution of the primary residual $H(\theta, z_k)$ is unknown. The local approach circumvents this difficulty by calculating a normalized residual from the primary residual $H(\theta, z_k)$ and by making the assumption that the change in the parameters is small.
It assumes that the parameters before any change are given by $\theta_0$ and after any change are given by $\theta_0 + \lambda/\sqrt{N}$, where $\lambda$ is an unknown, but constant, vector and $N$ is the data sample size. With the small change assumption, the change detection problem can be formulated as the following statistical test

$$\text{Nominal system: } H_0: \theta = \theta_0 \tag{2.6}$$
$$\text{Faulty system: } H_1: \theta = \theta_0 + \frac{\lambda}{\sqrt{N}} \tag{2.7}$$

For large $N$, hypothesis $H_1$ corresponds to small deviations in $\theta$. The small change assumption is reasonable in the sense that detectability increases as more data become available (i.e. smaller changes can be detected with a larger sample size).

With a valid primary residual, a normalized residual can be calculated as the normalized sum of the primary residual over $N$ sample times:

Definition 2.2: (Normalized residual) Given a primary residual $H(\theta, z_k)$ and an $N$-size sample of $z_k$, for $k = 1, 2, \ldots, N$, the normalized residual is defined as

$$\zeta_N(\theta_0) = \frac{1}{\sqrt{N}} \sum_{k=1}^{N} H(\theta_0, z_k) \tag{2.8}$$

For the design of the FDI algorithm, it is also assumed that the covariance matrix

$$\Sigma(\theta_0) = \lim_{N \to \infty} E_{\theta_0}\, \zeta_N(\theta_0)\, \zeta_N(\theta_0)^T \tag{2.9}$$

exists and is positive definite (Zhang, 1998a).

Under the assumption that the changes in the parameters are small, the normalized residual turns out to be asymptotically Gaussian distributed. This property makes it possible to evaluate the normalized residual in a statistical way. The following central limit theorem holds for a large class of dynamic systems, as shown in Basseville and Nikiforov (1993) and Basseville (1998).

Theorem 2.1 (Zhang, 1998) If the normalized residual $\zeta_N(\theta_0)$ defined in eqn. 2.8 is built from a primary residual satisfying conditions 2.3 and 2.4, then it is an asymptotically Gaussian distributed vector under both hypotheses of eqns. 2.6 and 2.7, that is, when $N \to \infty$,

$$\zeta_N(\theta_0) \sim N(0, \Sigma(\theta_0)) \quad \text{under } H_0 \tag{2.10}$$
$$\zeta_N(\theta_0) \sim N(M(\theta_0)\lambda, \Sigma(\theta_0)) \quad \text{under } H_1 \tag{2.11}$$

Here "~" means "is distributed as", and the matrices $M(\theta_0)$ and $\Sigma(\theta_0)$ are defined in eqns. 2.5 and 2.9. This notation will also be used in the following chapters.
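A small Monte Carlo experiment can make the behaviour of the normalized residual tangible. The sketch below is an illustration only, not the thesis's implementation: it assumes the first-order system $y_k = a u_{k-1} + e_k$ with white Gaussian input and noise, takes the primary residual as $u_{k-1}(y_k - \theta_0 u_{k-1})$ (proportional to the gradient of the squared prediction error), and uses hypothetical names and numerical values throughout.

```python
import math
import random
import statistics

def primary_residual(theta0, u_prev, y_now):
    """Assumed primary residual H(theta0, z_k) = u_{k-1} * (y_k - theta0 * u_{k-1})."""
    return u_prev * (y_now - theta0 * u_prev)

def normalized_residual(theta0, u, y):
    """zeta_N(theta0) = (1 / sqrt(N)) * sum over k of H(theta0, z_k), as in eqn. 2.8."""
    terms = [primary_residual(theta0, u[k - 1], y[k]) for k in range(1, len(y))]
    return sum(terms) / math.sqrt(len(terms))

def simulate(a_true, n, noise_std, rng):
    """Hypothetical data generator for y_k = a_true * u_{k-1} + e_k."""
    u = [rng.gauss(0.0, 1.0) for _ in range(n + 1)]
    y = [0.0] + [a_true * u[k - 1] + rng.gauss(0.0, noise_std)
                 for k in range(1, n + 1)]
    return u, y

rng = random.Random(7)
theta0, n, runs, noise_std = 1.0, 400, 300, 0.5

# Fault-free runs: actual parameter equals the nominal value theta0.
nominal = [normalized_residual(theta0, *simulate(theta0, n, noise_std, rng))
           for _ in range(runs)]
# Faulty runs: actual parameter is theta0 + 0.2, i.e. lambda = 0.2 * sqrt(N) = 4.
faulty = [normalized_residual(theta0, *simulate(theta0 + 0.2, n, noise_std, rng))
          for _ in range(runs)]

print("fault-free mean, variance:",
      statistics.mean(nominal), statistics.pvariance(nominal))
print("faulty mean:", statistics.mean(faulty))
```

Under these assumptions the fault-free values scatter around zero with variance close to $\sigma_e^2 E[u^2] = 0.25$, while the faulty values scatter around a clearly nonzero mean, which is the property that residual evaluation exploits.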
Theorem 2.1 shows that under both $H_0$ and $H_1$, the normalized residual $\zeta_N(\theta_0)$ is asymptotically Gaussian distributed and the covariance matrices are the same. Therefore, one can detect changes in the system parameters $\theta_0$ by monitoring the mean value of $\zeta_N(\theta_0)$. With theorem 2.1, the problem of detecting a change is transformed into a statistical problem of detecting whether the mean of a Gaussian distributed vector is zero or not.

2.2 Fault detection

With the results in section 2.1, detection of small changes in the parameter $\theta$ is asymptotically equivalent to the detection of changes in the mean of a Gaussian vector $\zeta_N(\theta_0)$. It can be shown that for deciding between $H_0$ and $H_1$, the optimum test statistic is the generalized likelihood ratio (GLR) (Zhang, 1998a):
$$\chi^2_{global} = 2 \ln \frac{\max_{\lambda} p_{\lambda}(\zeta_N)}{p_0(\zeta_N)} \tag{2.12}$$
where $p_{\lambda}(\zeta_N)$ is the density function of the Gaussian vector $\zeta_N$ under $H_1$ and $p_0(\zeta_N)$ is the density function of $\zeta_N$ under $H_0$. Under both hypotheses, $\chi^2_{global}$ is asymptotically a chi-square distributed variable, with a number of degrees of freedom equal to the dimension of $\theta$. It is central under $H_0$ and has a non-centrality parameter under $H_1$ equal to $\gamma = \lambda^T M^T \Sigma^{-1} M \lambda$. When the sensitivity matrix $M$ is a square matrix, the test can be simplified to
$$\chi^2_{global} = \zeta_N^T \Sigma^{-1} \zeta_N$$
A threshold $\chi^2_{\alpha}$ can be found from a standard chi-square table according to the false alarm rate $\alpha$. If $\chi^2_{global}$ is found to be larger than the threshold $\chi^2_{\alpha}$, a change in the parameters is detected.

2.3 Fault isolation

In the implementation of the FDI algorithm, fault detection (FD) is performed as frequently as possible. Fault isolation (FI) is performed only after the detection of a fault. The task of FI is to decide which components of $\lambda$ are nonzero. Without loss of generality, assume that $\lambda_a$ consists of the first $n_a$ components of $\lambda$. Let $\lambda_b$ be the complementary vector of $\lambda_a$ in $\lambda$, i.e.
$$\lambda = \begin{bmatrix} \lambda_a \\ \lambda_b \end{bmatrix}$$
with $n_a = \dim(\lambda_a)$, $n_b = \dim(\lambda_b)$. FI is to decide between two hypotheses: $H_0: \lambda_a = 0$ and $H_1: \lambda_a \neq 0$.
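The value of $\lambda_b$ is unknown and there are two ways to deal with this problem: assume $\lambda_b$ to be zero, or set it equal to the least favorable value for making the decision between $H_0$ and $H_1$. These two solutions result in the sensitivity test and the min-max test, respectively.

2.3.1 Sensitivity test:

In the sensitivity test, $\lambda_b$ is assumed to be zero and FI is a decision between the two hypotheses $H_0: \lambda = [0 \;\; 0]^T$ and $H_1: \lambda = [\lambda_a \;\; 0]^T$. Partition $M = [M_a \;\; M_b]$ so that $M\lambda = M_a\lambda_a + M_b\lambda_b$. It is shown that the sensitivity test for monitoring $\lambda_a$ is the GLR test between $H_0$ and $H_1$, where $\lambda_a \neq 0$ (Basseville, 1998):
$$\chi^2_a = 2 \ln \frac{\max_{\lambda_a} p_{\lambda_a, 0}(\zeta_N)}{p_{0,0}(\zeta_N)} = \zeta_a^T F_{aa}^{-1} \zeta_a \tag{2.13}$$
where $\zeta_a = M_a^T \Sigma^{-1} \zeta_N$; $p_{\lambda_a, \lambda_b}(\zeta_N)$ stands for the density function of $\zeta_N \sim \mathcal{N}(M_a\lambda_a + M_b\lambda_b, \Sigma)$; and $F_{aa} = M_a^T \Sigma^{-1} M_a$ is the covariance matrix of $\zeta_a$. Under both hypotheses, $\chi^2_a$ is distributed as a chi-square random variable with $n_a$ degrees of freedom. This chi-square distribution is central under $H_0$ and non-central under $H_1$. A threshold $\chi^2_{\alpha}$ can be found from a standard chi-square table according to the false alarm rate. By comparing $\chi^2_a$ with the threshold, one can make a decision on whether $\lambda_a$ is zero or not.

2.3.2 Min-max test:

For the min-max test, the parameters in $\lambda_b$ are considered as nuisance parameters and are replaced by their least favorable value, namely the value which minimizes the non-centrality parameter $\gamma = \lambda^T M^T \Sigma^{-1} M \lambda$. This is equivalent to the GLR method (Basseville, 1998):
$$\chi^{2*}_a = 2 \ln \frac{\max_{\lambda_a, \lambda_b} p_{\lambda_a, \lambda_b}(\zeta_N)}{\max_{\lambda_b} p_{0, \lambda_b}(\zeta_N)} \tag{2.14}$$
Let $F = M^T \Sigma^{-1} M$ and partition it as
$$F = \begin{bmatrix} F_{aa} & F_{ab} \\ F_{ba} & F_{bb} \end{bmatrix} = \begin{bmatrix} M_a^T \Sigma^{-1} M_a & M_a^T \Sigma^{-1} M_b \\ M_b^T \Sigma^{-1} M_a & M_b^T \Sigma^{-1} M_b \end{bmatrix}$$
It has been shown that
$$\chi^{2*}_a = \zeta_a^{*T} F_a^{*-1} \zeta_a^{*} \tag{2.15}$$
where $\zeta_a^{*} = \zeta_a - F_{ab} F_{bb}^{-1} \zeta_b$, $\zeta_a = M_a^T \Sigma^{-1} \zeta_N$, $\zeta_b = M_b^T \Sigma^{-1} \zeta_N$, and $F_a^{*} = F_{aa} - F_{ab} F_{bb}^{-1} F_{ba}$ is the covariance matrix of $\zeta_a^{*}$. Under both hypotheses $\chi^{2*}_a$ is a chi-square distributed variable with $n_a$ degrees of freedom. This chi-square distribution is central under $H_0$ and non-central under $H_1$.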
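To make the difference between the two isolation statistics concrete, the following sketch evaluates eqn. 2.13 and eqn. 2.15 for a system with two parameters, where $\lambda_a$ is the first parameter and $\lambda_b$ the second ($n_a = n_b = 1$). The function names and the numerical values are hypothetical; the restriction to two parameters is an assumption made only so that the 2x2 inverse can be written out by hand.

```python
def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def isolation_tests(zeta, M, Sigma):
    """Return (sensitivity, min-max) chi-square values for the first parameter.

    zeta:  normalized residual (length 2)
    M:     2x2 sensitivity matrix, columns M_a and M_b
    Sigma: 2x2 covariance matrix of the normalized residual
    """
    s_inv = inv2(Sigma)

    def quad(u, v):
        # u^T Sigma^{-1} v
        return sum(u[i] * s_inv[i][j] * v[j] for i in range(2) for j in range(2))

    m_a = [M[0][0], M[1][0]]
    m_b = [M[0][1], M[1][1]]
    zeta_a = quad(m_a, zeta)            # M_a^T Sigma^{-1} zeta_N
    zeta_b = quad(m_b, zeta)            # M_b^T Sigma^{-1} zeta_N
    f_aa, f_ab, f_bb = quad(m_a, m_a), quad(m_a, m_b), quad(m_b, m_b)

    sensitivity = zeta_a ** 2 / f_aa                  # eqn. 2.13
    zeta_star = zeta_a - f_ab / f_bb * zeta_b         # residual with lambda_b projected out
    f_star = f_aa - f_ab * f_ab / f_bb
    minmax = zeta_star ** 2 / f_star                  # eqn. 2.15
    return sensitivity, minmax


# Hypothetical values: the two columns of M are correlated, so the tests differ.
sens, minmax = isolation_tests([2.0, 1.0], [[1.0, 0.5], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
print(sens, minmax)
```

When $F_{ab} = 0$ the two statistics coincide; otherwise the min-max test discounts the part of the residual that a change in $\lambda_b$ alone could explain.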
For the min-max test, a threshold value can be found from a standard chi-square table according to the false alarm rate $\alpha$, and one can make a decision on whether $\lambda_a$ is zero or not by comparing $\chi^{2*}_a$ with the threshold value.

2.4 General procedure of FDI using the local approach

As a model-based FDI method, the local approach relies on the nominal model of the monitored system to perform detection. Therefore, the first step in the application of the local approach is to determine a parametric model that can best describe the dynamics of the system and to estimate the value of the system parameters under nominal situations. The primary residual can simply be calculated as the gradient of the estimation error with respect to the system parameters. Since a sensitivity matrix and a covariance matrix are used in the residual evaluation, they must be estimated before the local approach can be applied to detect faults.

Generally, the implementation of the local approach can be split into two steps: training and detecting. The purpose of training is to estimate the sensitivity matrix and the covariance matrix. To make the local approach work properly, the sensitivity matrix and the covariance matrix must be estimated under a nominal situation. Given a primary residual $H(\theta_0, z_k)$ and $N$-size training data $z_k$ for $k = 1, 2, \dots, N$, the sensitivity matrix can be simply estimated using the following average:
$$\hat{M}(\theta_0) = \frac{1}{N} \sum_{k=1}^{N} \left. \frac{\partial}{\partial \theta} H(z_k, \theta) \right|_{\theta = \theta_0} \tag{2.16}$$
The covariance matrix can be estimated using the following formula:
$$\hat{\Sigma}(\theta_0) = \frac{1}{N} \sum_{k=1}^{N} H(z_k, \theta_0) H^T(z_k, \theta_0) + \sum_{i=1}^{J} \frac{1}{N-i} \sum_{k=1}^{N-i} \left( H(z_k, \theta_0) H^T(z_{k+i}, \theta_0) + H(z_{k+i}, \theta_0) H^T(z_k, \theta_0) \right) \tag{2.17}$$
with a properly selected number $J$. In practice, one can gradually increase the value of $J$ until the $H_2$-norm of the change of $\hat{\Sigma}(\theta_0)$ is smaller than some preset threshold (Huang, 1998). One alternative way to estimate $\Sigma(\theta_0)$ is proposed by Devauchelle-Gach (1991).
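The lagged covariance estimate of eqn. 2.17 can be sketched as follows. The function name `estimate_covariance` and the sample data are hypothetical; the primary residuals are assumed to be stored as a list of equally sized vectors, and $J$ is passed in directly rather than selected automatically.

```python
def outer(u, v):
    """Outer product u v^T as nested lists."""
    return [[ui * vj for vj in v] for ui in u]


def estimate_covariance(H, J):
    """Estimate Sigma from primary residuals H[0..N-1] using lags 1..J (eqn. 2.17)."""
    n = len(H)
    if not 0 <= J < n:
        raise ValueError("J must satisfy 0 <= J < N")
    d = len(H[0])
    sigma = [[0.0] * d for _ in range(d)]

    def accumulate(mat, scale):
        for r in range(d):
            for c in range(d):
                sigma[r][c] += scale * mat[r][c]

    # Lag-zero term: (1/N) * sum_k H_k H_k^T
    for h in H:
        accumulate(outer(h, h), 1.0 / n)

    # Lagged terms: (1/(N-i)) * sum_k (H_k H_{k+i}^T + H_{k+i} H_k^T)
    for i in range(1, J + 1):
        for k in range(n - i):
            accumulate(outer(H[k], H[k + i]), 1.0 / (n - i))
            accumulate(outer(H[k + i], H[k]), 1.0 / (n - i))
    return sigma


# Hypothetical primary residuals (N = 4, dimension 2).
H = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
sigma0 = estimate_covariance(H, 0)   # sample second moment only
sigma1 = estimate_covariance(H, 1)   # adds the lag-1 cross terms
print(sigma0, sigma1)
```

The stopping rule described above could be layered on top of this function by calling it for increasing $J$ and comparing successive estimates; that loop is omitted here because the norm and tolerance are application choices.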
According to the suggestion of Devauchelle-Gach (1991), the data are divided into $L$ segments of length $M$ (here $M$ denotes the segment length, not the sensitivity matrix) and the covariance matrix can be estimated as
$$\hat{\Sigma}(\theta_0) = \frac{1}{L} \sum_{l=0}^{L-1} D_{M,l} D_{M,l}^T \tag{2.18}$$
where
$$D_{M,l} = \frac{1}{\sqrt{M}} \sum_{k=1}^{M} H(z_{k+lM}, \theta_0) \tag{2.19}$$
and $LM = N$ is the total length of the sample. $\hat{\Sigma}(\theta_0)$ is generically positive definite for $L$ large enough, and $M$ must be taken large enough so that $D_{M,l}$ behaves approximately as a Gaussian random variable (Zhang, 1994).

With the sensitivity matrix and the covariance matrix, FDI can be performed according to the following procedure:
1. Collect the input and output data in order to calculate the auxiliary variable and the primary residual;
2. Calculate the normalized residual with the primary residual;
3. Perform a chi-square test on the normalized residual and generate an alarm when the chi-square value exceeds a pre-designed threshold;
4. Perform fault isolation if a fault has been detected in step 3.

Chapter 3 Robustness of fault detection using the local approach

3.1 Introduction

The task of FDI is to provide prompt information to supervision processes about the occurrence of unexpected changes in a system. The terms change and fault will be used interchangeably in the following. With a nominal model, the model-based fault detection and isolation algorithm calculates residuals, which are signals with zero mean in fault-free cases and non-zero mean in other cases. However, noise and uncertainties are unavoidable in any system. It is of great importance to consider their effects on the detection algorithm (i.e. the robustness of the FDI algorithm). To improve the robustness of the FDI algorithm, the approach is to reduce the effect of noise and uncertainties on the detection algorithm while keeping the sensitivity of the algorithm toward faults. Different kinds of fault detection approaches have different ways to generate residuals and thus have different ways to deal with noise and model uncertainties.
The local approach is a powerful tool for incipient change detection and condition-based maintenance. It is designed to detect slight changes in the parameters of a system so as to prevent a possible malfunction or damage. As a model-based fault detection and isolation method, the local approach has the same robustness problem with respect to noise and model uncertainties. Especially since it aims at detecting slight changes in the parameters of a system, the issue of robustness should be examined very carefully. As a supervisory tool, the FDI algorithm should not provide any alarm when there is no fault in the system or if changes in parameters are within some acceptable range. Too many false alarms will greatly reduce the applicability of the algorithm.

It is important to consider the impact of noise on the local approach residuals. Even in a fault-free case, the normalized residual is an asymptotically Gaussian vector with zero mean. Due to noise in the system, the normalized residual is not exactly zero. An alarm is triggered when the chi-square test of the normalized residual exceeds some threshold. However, the local approach does not consider the effect of model uncertainties on the method. Since the local approach is mostly used to detect changes in the parameters of a system, it is natural to think of parameter uncertainties. The meaning of robustness with respect to parameter uncertainties may be illustrated by the following two cases:
1) The actual value of the system parameters can never be known. Instead, an estimated value obtained through identification is used. Due to noise, estimation error is unavoidable. When estimated parameters are used to calculate the primary and normalized residuals, they may cause more false alarms.
2) There are regular fluctuations in the system parameters due to changes in the operating condition. If the parameters fluctuate within an acceptable range, no alarm should be triggered.
There are two approaches to improve the robustness of FDI with respect to noise and model uncertainties: (1) find a residual which is sensitive to faults and insensitive to noise and model uncertainties, and (2) revise the threshold to accommodate the effect of noise and model uncertainties. In this chapter, the latter approach is considered.

3.2 Problem formulation

The effect of model uncertainties on the robustness of the local approach will be analyzed, followed by the effect of parameter fluctuations. In fault detection, a parametric system characterized by the following state-space model is considered:
$$y_k = h(\theta, x_k, u_k)$$
where $u_k$, $y_k$, $x_k$ are the vectors of the system input, output and state variable at time $k$, and $\theta$ is the vector of the system parameters. In order to obtain the primary residual, a vector of auxiliary variables $z_k$ is calculated using the following system:
$$Z_{k+1} = \varphi(\theta, Z_k, u_k, y_k), \qquad z_k = \psi(\theta, Z_k, u_k, y_k) \tag{3.2}$$
The primary residual $H(\theta_0, z_k)$ and normalized residual $\zeta_N(\theta_0)$ can be calculated with the auxiliary variable $z_k$ and the nominal value of the system parameter $\theta_0$. The normalized residual is a Gaussian vector with zero mean in fault-free cases and non-zero mean in other cases. With the pre-estimated sensitivity matrix $M$ and covariance matrix $\Sigma$ (definitions of the sensitivity matrix and the covariance matrix can be found in chapter 2), a chi-square test is performed on $\zeta_N(\theta_0)$ to detect whether the mean of the normalized residual is zero or not. That is, the chi-square value of the normalized residual
$$t = \zeta_N^T(\theta_0)\, \Sigma^{-1} M (M^T \Sigma^{-1} M)^{-1} M^T \Sigma^{-1} \zeta_N(\theta_0) \tag{3.3}$$
is compared with a threshold $\chi^2_{\alpha}$ obtained from a standard chi-square table according to the pre-assigned false alarm rate $\alpha$ and the dimension of the normalized residual. A fault is detected when $t$ is larger than $\chi^2_{\alpha}$. In practice, the nominal parameters $\theta_0$ of a system can never be known. Instead the estimated parameters $\hat{\theta}_0$ obtained through system identification are used.
Usually $\hat{\theta}_0$ is not equal to $\theta_0$ because of noise and disturbances acting on the system. The primary residual $H(\hat{\theta}_0, z_k)$ and normalized residual $\zeta_N(\hat{\theta}_0)$ calculated with the estimated parameters $\hat{\theta}_0$ have non-zero mean even in a fault-free case. Regarding the chi-square test of $\zeta_N(\hat{\theta}_0)$,
$$\tilde{t} = \zeta_N^T(\hat{\theta}_0)\, \Sigma^{-1} M (M^T \Sigma^{-1} M)^{-1} M^T \Sigma^{-1} \zeta_N(\hat{\theta}_0) \tag{3.4}$$
$\tilde{t}$ is a non-central chi-square distributed random variable and the probability of $\tilde{t}$ exceeding $\chi^2_{\alpha}$ will be larger than $\alpha$. A new threshold value must be found to reduce the false alarm rate and thus improve the robustness of the detection algorithm. However, the larger the threshold is, the less sensitive the detection algorithm is to faults. In order to minimize the loss of detection performance, the threshold should be set as small as possible.

The robustness problem can be summarized as follows:
(1) Use the original scheme of the local approach and the estimated parameter $\hat{\theta}_0$ to calculate the primary residual and the normalized residual and perform a chi-square test. In order to keep the false alarm rate small, a new threshold $\tilde{\chi}^2_{\alpha}$ must be found such that
$$P(\tilde{t} > \tilde{\chi}^2_{\alpha} \mid H_0) \le \alpha \tag{3.5}$$
where $\tilde{t}$ is calculated using eqn. 3.4 and $H_0$ means the fault-free case.
(2) In order to keep the sensitivity of the detection algorithm, the threshold $\tilde{\chi}^2_{\alpha}$ should be set as small as possible.

3.3 Recalculation of the threshold

The algorithm used to calculate the threshold is given in the following.

3.3.1 Distribution of the normalized residual using the estimated parameter

As mentioned in section 3.2, the primary residual and the normalized residual, calculated with estimated parameters, have non-zero mean even in the fault-free case. This is because of the inherent differences between the estimated parameters and the nominal parameters. The first step is to find the distribution of the normalized residual with the estimated parameters.
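With the primary residual $H(\theta_0, z_k)$, the normalized residual is calculated as follows:
$$\zeta_N(\theta_0) = \frac{1}{\sqrt{N}} \sum_{k=1}^{N} H(\theta_0, z_k) \tag{3.6}$$
With the estimated parameter $\hat{\theta}_0$, the normalized residual becomes $\zeta_N(\hat{\theta}_0)$. Using a first-order Taylor series expansion, $\zeta_N(\hat{\theta}_0)$ can be approximated as
$$\zeta_N(\hat{\theta}_0) \approx \zeta_N(\theta_0) + \frac{1}{N} \left( \sum_{k=1}^{N} \left. \frac{\partial}{\partial \theta} H(\theta, z_k) \right|_{\theta = \theta_0} \right) \left[ (\hat{\theta}_0 - \theta_0) \sqrt{N} \right] \tag{3.7}$$
By the law of large numbers, when $N$ approaches infinity,
$$\frac{1}{N} \sum_{k=1}^{N} \left. \frac{\partial}{\partial \theta} H(\theta, z_k) \right|_{\theta = \theta_0} \to M \tag{3.8}$$
where $M$ is the sensitivity matrix defined in (2.5). Therefore, the approximation of the normalized residual calculated with the estimated parameters $\hat{\theta}_0$ can be written as
$$\zeta_N(\hat{\theta}_0) \approx \zeta_N(\theta_0) + M (\hat{\theta}_0 - \theta_0) \sqrt{N} \tag{3.9}$$
As mentioned in chapter 2, under the assumption of small changes, that is, when distinguishing between the two hypotheses $H_0: \theta = \theta_0$ and $H_1: \theta = \theta_0 + \lambda/\sqrt{N}$, $\zeta_N(\theta_0)$ follows:
$$\zeta_N(\theta_0) \sim \mathcal{N}(0, \Sigma) \quad \text{under } H_0 \tag{3.10}$$
$$\zeta_N(\theta_0) \sim \mathcal{N}(-M\lambda, \Sigma) \quad \text{under } H_1 \tag{3.11}$$
where $\Sigma$ is the covariance matrix. The definition of $\Sigma$ can be found in chapter 2. Combining eqn. 3.9 and 3.10, the approximate distribution of the normalized residual under the nominal situation is obtained:
$$\zeta_N(\hat{\theta}_0) \sim \mathcal{N}(M (\hat{\theta}_0 - \theta_0) \sqrt{N}, \Sigma) \quad \text{under } H_0 \tag{3.12}$$
Combining eqn. 3.9 and 3.11, the approximate distribution of the normalized residual under the faulty situation is obtained:
$$\zeta_N(\hat{\theta}_0) \sim \mathcal{N}(M (\hat{\theta}_0 - \theta_0) \sqrt{N} - M\lambda, \Sigma) \quad \text{under } H_1 \tag{3.13}$$
For simplicity, the following notation is used:
$$\mathbf{a} = (\hat{\theta}_0 - \theta_0) \sqrt{N} \tag{3.14}$$
The distribution of the normalized residual $\zeta_N(\hat{\theta}_0)$ calculated with the estimated parameter $\hat{\theta}_0$ can then be summarized as
$$\zeta_N(\hat{\theta}_0) \sim \mathcal{N}(M\mathbf{a}, \Sigma) \quad \text{under } H_0 \tag{3.15}$$
$$\zeta_N(\hat{\theta}_0) \sim \mathcal{N}(M\mathbf{a} - M\lambda, \Sigma) \quad \text{under } H_1 \tag{3.16}$$
Comparing eqn. 3.12 and 3.13 with 3.10 and 3.11, one can conclude that the estimation error $\hat{\theta}_0 - \theta_0$ does not affect the covariance matrix of the normalized residual. It only contributes a constant bias $M\mathbf{a}$ to the mean of the normalized residual under both $H_0$ and $H_1$.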
3.3.2 Evaluation of the normalized residual

In the residual evaluation stage, a chi-square test is performed on the normalized residual $\zeta_N(\theta_0)$ to detect whether the mean value of $\zeta_N(\theta_0)$ is zero or not. The generalized likelihood ratio (GLR) between $H_1$ and $H_0$ is computed as
$$t = 2 \ln \frac{\max_{\lambda} p_1(\zeta_N(\theta_0))}{p_0(\zeta_N(\theta_0))} = \zeta_N(\theta_0)^T \Sigma^{-1} M I^{-1} M^T \Sigma^{-1} \zeta_N(\theta_0) \tag{3.17}$$
where $I = M^T \Sigma^{-1} M$ (this matrix, not the identity), $p_1(x)$ is the density function of the Gaussian distribution under $H_1$ and $p_0(x)$ is the density function of the Gaussian distribution under $H_0$. When $M$ is a square matrix,
$$t = \zeta_N(\theta_0)^T \Sigma^{-1} \zeta_N(\theta_0) \tag{3.18}$$
The normalized residual calculated with the estimated parameters follows:
$$\zeta_N(\hat{\theta}_0) \sim \mathcal{N}(M\mathbf{a}, \Sigma) \quad \text{under } H_0$$
$$\zeta_N(\hat{\theta}_0) \sim \mathcal{N}(M\mathbf{a} - M\lambda, \Sigma) \quad \text{under } H_1$$
Then the generalized likelihood ratio (GLR) between $H_1$ and $H_0$ is
$$t = 2 \ln \frac{\max_{\lambda} p_1(\zeta_N(\hat{\theta}_0))}{p_0(\zeta_N(\hat{\theta}_0))} = (\zeta_N(\hat{\theta}_0) - M\mathbf{a})^T \Sigma^{-1} M (M^T \Sigma^{-1} M)^{-1} M^T \Sigma^{-1} (\zeta_N(\hat{\theta}_0) - M\mathbf{a}) \tag{3.19}$$
It is assumed that $M$ is a square matrix, thus
$$t = (\zeta_N(\hat{\theta}_0) - M\mathbf{a})^T \Sigma^{-1} (\zeta_N(\hat{\theta}_0) - M\mathbf{a}) \tag{3.20}$$
If the estimation error $\hat{\theta}_0 - \theta_0$ is known, $\mathbf{a}$ can be calculated using eqn. 3.14. The normalized residual $\zeta_N(\theta_0)$ can be estimated as $\zeta_N(\hat{\theta}_0) - M\mathbf{a}$. This is called the upgrading scheme (it will be discussed in the simulation studies). The key idea is to estimate the constant bias $M\mathbf{a}$ with training data. In detecting faults, one can calculate the normalized residual $\zeta_N(\hat{\theta}_0)$ with the estimated parameters $\hat{\theta}_0$ and use the estimated bias to upgrade the normalized residual as $\zeta_N(\hat{\theta}_0) - M\mathbf{a}$. A chi-square test can be performed on $\zeta_N(\hat{\theta}_0) - M\mathbf{a}$ and the original threshold can then be used without increasing the false alarm rate. However, the real value of the estimation error $\hat{\theta}_0 - \theta_0$ is unknown. Although it can be estimated through identification, the estimated parameter error is not accurate.
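Therefore, it is better to use the original scheme to perform a chi-square test on $\zeta_N(\hat{\theta}_0)$ as follows:
$$\tilde{t} = \zeta_N^T(\hat{\theta}_0)\, \Sigma^{-1} \zeta_N(\hat{\theta}_0) \tag{3.21}$$
$\tilde{t}$ is a non-central chi-square distributed random variable in a fault-free case. The probability of $\tilde{t}$ exceeding $\chi^2_{\alpha}$ in a fault-free case (the false alarm rate) can be larger than the pre-assigned value $\alpha$. As mentioned in section 3.2, the robustness of FDI using the local approach depends on keeping the false alarm rate no larger than $\alpha$. Therefore, the threshold $\chi^2_{\alpha}$ needs to be revised to $\tilde{\chi}^2_{\alpha}$ so that the criterion $P(\tilde{t} > \tilde{\chi}^2_{\alpha} \mid H_0) \le \alpha$ is satisfied. If the probability of $\tilde{t}$ being less than $\tilde{\chi}^2_{\alpha}$ is no smaller than the probability of $t$ being less than $\chi^2_{\alpha}$,
$$P(\tilde{t} < \tilde{\chi}^2_{\alpha}) \ge P(t < \chi^2_{\alpha}) \quad \text{under } H_0$$
then the false alarm rate with the revised threshold $\tilde{\chi}^2_{\alpha}$ satisfies
$$P(\tilde{t} > \tilde{\chi}^2_{\alpha} \mid H_0) = 1 - P(\tilde{t} \le \tilde{\chi}^2_{\alpha} \mid H_0) \le 1 - P(t \le \chi^2_{\alpha} \mid H_0) = \alpha$$
Moreover, if $\tilde{t} < \tilde{\chi}^2_{\alpha}$ is always true when $t < \chi^2_{\alpha}$, then $P(\tilde{t} < \tilde{\chi}^2_{\alpha}) \ge P(t < \chi^2_{\alpha})$. Therefore, if a threshold $\tilde{\chi}^2_{\alpha}$ can be found such that $\tilde{t} < \tilde{\chi}^2_{\alpha}$ holds whenever $t < \chi^2_{\alpha}$, the false alarm rate of the chi-square test on $\zeta_N(\hat{\theta}_0)$ with the revised threshold will be no larger than $\alpha$. This is equivalent to finding a threshold $\tilde{\chi}^2_{\alpha}$ such that whenever
$$(\zeta_N(\hat{\theta}_0) - M\mathbf{a})^T \Sigma^{-1} (\zeta_N(\hat{\theta}_0) - M\mathbf{a}) < \chi^2_{\alpha}$$
the following inequality is true:
$$\zeta_N^T(\hat{\theta}_0)\, \Sigma^{-1} \zeta_N(\hat{\theta}_0) < \tilde{\chi}^2_{\alpha}$$
Therefore, the robustness problem is transformed into the problem of finding a threshold $\tilde{\chi}^2_{\alpha}$ which guarantees that $\tilde{t} < \tilde{\chi}^2_{\alpha}$ holds when $t < \chi^2_{\alpha}$.

3.3.3 Deduction of the new threshold

Writing $\zeta_N$ for $\zeta_N(\hat{\theta}_0)$,
$$(\zeta_N - M\mathbf{a})^T \Sigma^{-1} (\zeta_N - M\mathbf{a}) < \chi^2_{\alpha}$$
$$\Leftrightarrow \zeta_N^T \Sigma^{-1} \zeta_N - 2 \zeta_N^T \Sigma^{-1} M\mathbf{a} + \mathbf{a}^T M^T \Sigma^{-1} M\mathbf{a} < \chi^2_{\alpha}$$
$$\Leftrightarrow \zeta_N^T \Sigma^{-1} \zeta_N < \chi^2_{\alpha} + 2 \zeta_N^T \Sigma^{-1} M\mathbf{a} - \mathbf{a}^T M^T \Sigma^{-1} M\mathbf{a} \tag{3.22}$$
where $\Leftrightarrow$ means "equivalent to". It is easy to prove that for any $n$-dimensional vectors $X$ and $Y$
$$2 X^T Y \le \frac{1}{k} X^T X + k Y^T Y$$
where $k$ is any positive real number.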
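The inequality stated above follows from expanding a non-negative quadratic form. A short derivation, together with the weighted version that is needed when the inner product is taken with respect to $\Sigma^{-1}$ (the substitution $X = \Sigma^{-1/2}\zeta_N$, $Y = \Sigma^{-1/2}M\mathbf{a}$ is an assumption about how the result is applied, valid because $\Sigma$ is positive definite):

```latex
% For any k > 0 and any vectors X, Y of equal dimension:
\begin{align*}
0 &\le \left( \tfrac{1}{\sqrt{k}} X - \sqrt{k}\, Y \right)^{T}
       \left( \tfrac{1}{\sqrt{k}} X - \sqrt{k}\, Y \right)
   = \tfrac{1}{k} X^{T} X - 2 X^{T} Y + k\, Y^{T} Y \\
\Rightarrow\quad 2 X^{T} Y &\le \tfrac{1}{k} X^{T} X + k\, Y^{T} Y .
\end{align*}
% Weighted form, with X = \Sigma^{-1/2} \zeta_N and Y = \Sigma^{-1/2} M \mathbf{a}:
\begin{equation*}
2\, \zeta_N^{T} \Sigma^{-1} M \mathbf{a}
  \le \tfrac{1}{k}\, \zeta_N^{T} \Sigma^{-1} \zeta_N
    + k\, \mathbf{a}^{T} M^{T} \Sigma^{-1} M \mathbf{a} .
\end{equation*}
```

Equality holds only when $X = kY$, which is why the bound can be tightened by choosing $k$ appropriately.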
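Therefore, for any positive number $k$, and writing $\zeta_N$ for $\zeta_N(\hat{\theta}_0)$,
$$2 \zeta_N^T \Sigma^{-1} M\mathbf{a} \le \frac{1}{k} \zeta_N^T \Sigma^{-1} \zeta_N + k\, \mathbf{a}^T M^T \Sigma^{-1} M\mathbf{a} \tag{3.23}$$
Substituting eqn. 3.23 into 3.22, then
$$(\zeta_N - M\mathbf{a})^T \Sigma^{-1} (\zeta_N - M\mathbf{a}) < \chi^2_{\alpha}$$
$$\Leftrightarrow \zeta_N^T \Sigma^{-1} \zeta_N < \chi^2_{\alpha} + 2 \zeta_N^T \Sigma^{-1} M\mathbf{a} - \mathbf{a}^T M^T \Sigma^{-1} M\mathbf{a}$$
$$\Rightarrow \zeta_N^T \Sigma^{-1} \zeta_N < \chi^2_{\alpha} + \frac{1}{k} \zeta_N^T \Sigma^{-1} \zeta_N + k\, \mathbf{a}^T M^T \Sigma^{-1} M\mathbf{a} - \mathbf{a}^T M^T \Sigma^{-1} M\mathbf{a}$$
$$\Leftrightarrow \left(1 - \frac{1}{k}\right) \zeta_N^T \Sigma^{-1} \zeta_N < \chi^2_{\alpha} + (k - 1)\, \mathbf{a}^T M^T \Sigma^{-1} M\mathbf{a}$$
Therefore, when $k > 1$,
$$(\zeta_N - M\mathbf{a})^T \Sigma^{-1} (\zeta_N - M\mathbf{a}) < \chi^2_{\alpha} \;\Rightarrow\; \zeta_N^T \Sigma^{-1} \zeta_N < \frac{k}{k-1} \chi^2_{\alpha} + k\, \mathbf{a}^T M^T \Sigma^{-1} M\mathbf{a}$$
If the threshold $\tilde{\chi}^2_{\alpha}$ is set to be
$$\tilde{\chi}^2_{\alpha} = \frac{k}{k-1} \chi^2_{\alpha} + k\, \mathbf{a}^T M^T \Sigma^{-1} M\mathbf{a} \tag{3.24}$$
then $\zeta_N^T(\hat{\theta}_0) \Sigma^{-1} \zeta_N(\hat{\theta}_0) < \tilde{\chi}^2_{\alpha}$ is always true when $(\zeta_N(\hat{\theta}_0) - M\mathbf{a})^T \Sigma^{-1} (\zeta_N(\hat{\theta}_0) - M\mathbf{a}) < \chi^2_{\alpha}$ is true. Therefore, if $\tilde{\chi}^2_{\alpha}$ determined by eqn. 3.24 is set to be the new threshold, the false alarm rate of the chi-square test on $\zeta_N(\hat{\theta}_0)$ will be less than $\alpha$.

3.3.4 Optimization of the threshold

In order to make the threshold value as small as possible and to maintain the detection performance, $\tilde{\chi}^2_{\alpha}$ is minimized with respect to $k$. Let
$$\chi^2_{\beta} = \mathbf{a}^T M^T \Sigma^{-1} M\mathbf{a} \tag{3.25}$$
Then
$$\tilde{\chi}^{2*}_{\alpha} = \min_{k} \left( \frac{k}{k-1} \chi^2_{\alpha} + k\, \mathbf{a}^T M^T \Sigma^{-1} M\mathbf{a} \right) = \min_{k} \left( \frac{k}{k-1} \chi^2_{\alpha} + k \chi^2_{\beta} \right) = \left( \sqrt{\chi^2_{\alpha}} + \sqrt{\chi^2_{\beta}} \right)^2 \tag{3.26}$$
where $\chi^2_{\beta}$ is defined in eqn. 3.25. Setting the derivative with respect to $k$ to zero gives $-\chi^2_{\alpha}/(k-1)^2 + \chi^2_{\beta} = 0$, so $\tilde{\chi}^2_{\alpha}$ reaches its minimum value $\tilde{\chi}^{2*}_{\alpha}$ when $k = 1 + \sqrt{\chi^2_{\alpha}/\chi^2_{\beta}}$. In eqn. 3.26, $\chi^2_{\beta}$ remains to be determined. Usually the value of the nominal parameter $\theta_0$ is obtained through system identification and it is assumed that
$$\hat{\theta}_0 \sim \mathcal{N}(\theta_0, \Sigma_{\theta})$$
then
$$\mathbf{a} = (\hat{\theta}_0 - \theta_0) \sqrt{N} \sim \mathcal{N}(0, N \Sigma_{\theta})$$
In practice, $\chi^2_{\beta}$ can be calculated as
$$\chi^2_{\beta} = E(\mathbf{a}^T M^T \Sigma^{-1} M\mathbf{a}) \tag{3.27}$$
To be more precise, this $\chi^2_{\beta}$ is not the same as that defined in eqn. 3.25.

3.3.5 Discussion of the results

Noise and model uncertainties are two main factors leading to false alarms. The revised threshold $\tilde{\chi}^{2*}_{\alpha}$ can be rewritten as follows:
$$\tilde{\chi}^{2*}_{\alpha} = \left( \sqrt{\chi^2_{\alpha}} + \sqrt{\chi^2_{\beta}} \right)^2 = \chi^2_{\alpha} + \chi^2_{\beta} + 2 \sqrt{\chi^2_{\alpha} \chi^2_{\beta}} \tag{3.28}$$
In eqn. 3.28, there are three terms. The first term $\chi^2_{\alpha}$ is related to system noise. If there were no noise in the system, any non-zero normalized residual would indicate a fault, and the threshold would be set to zero.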
Because of the noise, the normalized residual is a Gaussian random vector with zero mean. A threshold $\chi^2_{\alpha}$ other than zero is set so that an alarm is triggered only when the chi-square value of the normalized residual goes beyond the threshold. When there are model uncertainties, the normalized residual has a non-zero mean even in fault-free cases. A term $\chi^2_{\beta}$ is introduced to increase the threshold and so reduce false alarms due to model uncertainties. Because noise and model uncertainties can interact with each other, the third term $2\sqrt{\chi^2_{\alpha} \chi^2_{\beta}}$ appears in eqn. 3.28; it takes into consideration the interaction between noise and model uncertainties.

A natural requirement for the revised threshold is that when there is no uncertainty in the nominal parameter (i.e. $\hat{\theta}_0 = \theta_0$), the threshold does not need to be changed and the revised threshold should be the same as the original one. Through eqn. 3.14 and 3.25 it can be shown that when $\hat{\theta}_0 = \theta_0$, $\chi^2_{\beta} = 0$ and $\tilde{\chi}^{2*}_{\alpha} = \chi^2_{\alpha}$.

3.3.6 Extension of the results to the case of parameter fluctuations

Next, consider the robustness of the local approach with respect to regular fluctuations of system parameters. Suppose that $\Lambda$ is the regular range of the parameter fluctuations. The algorithm to revise the threshold is as follows. At any one moment, suppose that the real parameter value is $\theta$ and that it is in the range of regular fluctuation. The normalized residual follows:
$$\zeta_N(\theta_0) \sim \mathcal{N}(M\mathbf{a}, \Sigma) \tag{3.29}$$
where
$$\mathbf{a} = \sqrt{N} (\theta - \theta_0)$$
The chi-square value of $\zeta_N(\theta_0)$ is
$$\tilde{t} = \zeta_N^T(\theta_0)\, \Sigma^{-1} M (M^T \Sigma^{-1} M)^{-1} M^T \Sigma^{-1} \zeta_N(\theta_0)$$
Since the mean of $\zeta_N(\theta_0)$ is not zero, $\tilde{t}$ is a non-central chi-square distributed random variable. If the original threshold is used, this will lead to increased false alarms. In order to reduce the false alarm rate to the pre-assigned value $\alpha$, a new threshold $\tilde{\chi}^2_{\alpha}$ should be found such that the false alarm rate $P(\tilde{t} > \tilde{\chi}^2_{\alpha} \mid H_0) \le \alpha$.
Let
$$t = (\zeta_N(\theta_0) - M\mathbf{a})^T \Sigma^{-1} M (M^T \Sigma^{-1} M)^{-1} M^T \Sigma^{-1} (\zeta_N(\theta_0) - M\mathbf{a})$$
Since $\zeta_N(\theta_0) - M\mathbf{a} \sim \mathcal{N}(0, \Sigma)$, $t$ is a central chi-square random variable with the number of degrees of freedom equal to the dimension of the normalized residual. Therefore $P(t > \chi^2_{\alpha} \mid H_0) = \alpha$. If a new threshold $\tilde{\chi}^2_{\alpha}$ can be found such that $\tilde{t} < \tilde{\chi}^2_{\alpha}$ is always true when $t < \chi^2_{\alpha}$, then
$$P(\tilde{t} < \tilde{\chi}^2_{\alpha} \mid H_0) \ge P(t < \chi^2_{\alpha} \mid H_0)$$
and the false alarm rate with the revised threshold $\tilde{\chi}^2_{\alpha}$ satisfies
$$P(\tilde{t} > \tilde{\chi}^2_{\alpha} \mid H_0) = 1 - P(\tilde{t} \le \tilde{\chi}^2_{\alpha} \mid H_0) \le 1 - P(t \le \chi^2_{\alpha} \mid H_0) = \alpha$$
Using the results in sections 3.3.3 and 3.3.4, the revised threshold can be calculated as
$$\tilde{\chi}^2_{\alpha} = \left( \sqrt{\chi^2_{\alpha}} + \sqrt{\chi^2_{\beta}} \right)^2 \tag{3.30}$$
where $\chi^2_{\alpha}$ is the original threshold and $\chi^2_{\beta}$ can be calculated as
$$\chi^2_{\beta} = \mathbf{a}^T M^T \Sigma^{-1} M\mathbf{a}$$
In order to make the false alarm rate less than $\alpha$ for all $\theta \in \Lambda$, $\chi^2_{\beta}$ should be computed as
$$\chi^2_{\beta} = \max_{\theta \in \Lambda} \left( N (\theta - \theta_0)^T M^T \Sigma^{-1} M (\theta - \theta_0) \right) \tag{3.31}$$

3.4 Simulation results

The effectiveness of the algorithms proposed in section 3.3 for calculating a new threshold is shown by the simulation of a single-input-single-output (SISO) system.

3.4.1 System specification

In the simulation, a first-order SISO system is used:
$$A(z^{-1}) y(t) = B(z^{-1}) u(t-1) + e(t) \tag{3.32}$$
where $u(t)$ and $y(t)$ are the input and output of the system and $e(t)$ is a white noise, with
$$A(z^{-1}) = 1 + a z^{-1}, \qquad B(z^{-1}) = b$$
Nominal values of the system parameters $a$, $b$ are:
$$a_0 = -0.6, \qquad b_0 = 0.4$$
The primary residual is calculated as
$$H(u(t), y(t), a, b) = \begin{bmatrix} -y(t-1) \\ u(t-1) \end{bmatrix} \left( y(t) + a\, y(t-1) - b\, u(t-1) \right) \tag{3.33}$$

3.4.2 Simulation of robustness with respect to model uncertainties

First the parameters $a$, $b$ were estimated using the least-squares (LS) method and the estimated parameters $\hat{a}_0$, $\hat{b}_0$ were found to be
$$\hat{a}_0 = -0.5899, \qquad \hat{b}_0 = 0.4691$$
The system was excited with a white noise signal with a variance of 1 for 2000 time units. With the input and output data, the primary residual was calculated using eqn. 3.33. With the primary residual, the sensitivity matrix $M$ and covariance matrix $\Sigma$ were estimated using eqn.
2.16 and 2.17:
$$M = \begin{bmatrix} 1.6360 & -0.0119 \\ -0.0119 & 0.3317 \end{bmatrix}, \qquad \Sigma = \begin{bmatrix} 1.6173 & 0.0776 \\ 0.0776 & 0.2867 \end{bmatrix}$$
In the estimation of $M$ and $\Sigma$, the estimated parameters $\hat{a}_0$, $\hat{b}_0$ were used instead of the nominal parameters $a_0$, $b_0$. The false alarm rate $\alpha$ was set to be 0.01 and the original threshold was obtained from the standard chi-square table as $\chi^2_{\alpha} = 9.21$. With the covariance matrix of the estimated parameters $\hat{a}_0$ and $\hat{b}_0$, the sensitivity matrix $M$ and the covariance matrix $\Sigma$, the new threshold could be calculated using eqn. 3.26. The threshold value was found to be $\tilde{\chi}^2_{\alpha} = 13.57$. In order to show the improvement of the new threshold, it was tested under both fault-free cases and faulty cases.

(a) Test of the false alarm rate using the revised threshold in a fault-free case

The process model 3.32 was excited with a white noise signal with a variance of 1 for 10000 time units. A chi-square value was calculated every 10 time units in order to reduce the computational load. Every chi-square value was based on the past 100 time units (i.e. a moving window of 100 time units was used). A plot of chi-square value versus time units is shown in Fig. 3.1.

[Plot of chi-square value versus time (in units of 10 time units), with the revised and original thresholds marked.]
Figure 3.1 Chi-square value of the normalized residual in a fault-free case

Table 3.1 False alarm rate in a fault-free case

                      Number of chi-square values exceeding threshold    False alarm rate
Original threshold    23                                                  2.3%
Revised threshold     7                                                   0.7%

Fig. 3.1 and Table 3.1 indicate that when the original threshold was used there were many more false alarms than necessary and the false alarm rate exceeded the expected rate of 1%. When the revised threshold value was used, there were only 7 false alarms. The false alarm rate was 0.7%, close to the expected rate of 1%.
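The moving-window test described above can be sketched in a few lines of standard-library Python. This is an illustrative re-creation under stated assumptions, not the code used for the thesis: both $u(t)$ and $e(t)$ are assumed to be unit-variance Gaussian white noise, the nominal parameters are used directly (so no estimation bias is present and only the original threshold is relevant), $\Sigma$ is estimated with the lag-zero term only, and the prediction error is written as $y(t) + a\,y(t-1) - b\,u(t-1)$, the form implied by $A(z^{-1}) = 1 + a z^{-1}$. All function names are hypothetical.

```python
import math
import random

A0, B0 = -0.6, 0.4      # nominal parameters from section 3.4.1
CHI2_ALPHA = 9.21       # chi-square threshold, 2 degrees of freedom, alpha = 0.01


def simulate(n, a, b, rng):
    """Simulate y(t) + a*y(t-1) = b*u(t-1) + e(t); u and e assumed unit-variance white noise."""
    u = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [0.0] * n
    for t in range(1, n):
        y[t] = -a * y[t - 1] + b * u[t - 1] + rng.gauss(0.0, 1.0)
    return u, y


def primary_residuals(u, y, a, b):
    """Regressor [-y(t-1), u(t-1)] times the one-step prediction error."""
    out = []
    for t in range(1, len(y)):
        err = y[t] + a * y[t - 1] - b * u[t - 1]
        out.append((-y[t - 1] * err, u[t - 1] * err))
    return out


def second_moment(res):
    """Sample estimate of Sigma using only the lag-zero term (a simplifying assumption)."""
    n = len(res)
    s00 = sum(h[0] * h[0] for h in res) / n
    s01 = sum(h[0] * h[1] for h in res) / n
    s11 = sum(h[1] * h[1] for h in res) / n
    return s00, s01, s11


def chi_square(window, sigma):
    """zeta^T Sigma^{-1} zeta for the normalized residual of one window (M square)."""
    n = len(window)
    z0 = sum(h[0] for h in window) / math.sqrt(n)
    z1 = sum(h[1] for h in window) / math.sqrt(n)
    s00, s01, s11 = sigma
    det = s00 * s11 - s01 * s01
    return (s11 * z0 * z0 - 2.0 * s01 * z0 * z1 + s00 * z1 * z1) / det


def monitor(res, sigma, window=100, step=10):
    """Chi-square value every `step` samples, each based on the last `window` samples."""
    return [chi_square(res[end - window:end], sigma)
            for end in range(window, len(res) + 1, step)]


rng = random.Random(0)
u_train, y_train = simulate(2000, A0, B0, rng)
sigma = second_moment(primary_residuals(u_train, y_train, A0, B0))

u_test, y_test = simulate(10000, A0, B0, rng)
values = monitor(primary_residuals(u_test, y_test, A0, B0), sigma)
alarms = sum(v > CHI2_ALPHA for v in values)
rate = alarms / len(values)
print(f"{alarms} of {len(values)} windows exceed the threshold ({rate:.1%})")
```

Replacing `A0, B0` in the residual computation with estimated values would introduce the constant bias discussed in section 3.3.1, which is the situation the revised threshold is meant to handle.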
The simulation results show that the revised threshold is suitable since it reduced the false alarm rate to around the preset value of 1%. If the false alarm rate were well below the expected value of 1%, it might indicate that the revised threshold was too high.

(b) Test of the detection power using the revised threshold in faulty cases

The missed detection rate measures the probability of the chi-square value of the normalized residual staying under the threshold when a fault happens. It is directly related to the detection power, which is one minus the missed detection rate. One cannot obtain a low missed detection rate and a low false alarm rate simultaneously. By increasing the threshold, a low false alarm rate can be obtained, but the missed detection rate will also increase. Therefore, the detection power can be used as another criterion to evaluate an FD algorithm. The following simulations were done to test the detection power of the revised threshold. For each of the simulations, the process model was excited with a white noise signal with a variance of 1 for 5000 time units. A chi-square value was calculated every 10 time units. Every chi-square value was based on the past 100 time units. In the first simulation, the parameter $a$ was step changed from -0.6 to -0.8 at the 2500th time unit and the plot of chi-square value versus time units is shown in Fig. 3.2. In the second simulation, the parameter $b$ was step changed from 0.4 to 1.3 at the 2500th time unit and the plot of chi-square value versus time units is shown in Fig. 3.3.

[Plot of chi-square value versus time (in units of 10 time units), with the revised and original thresholds marked.]
Figure 3.2 Chi-square value of the normalized residual in a faulty case (a change occurred at the parameter a)
[Plot of chi-square value versus time (in units of 10 time units), with the revised and original thresholds marked.]
Figure 3.3 Chi-square value of the normalized residual in a faulty case (a change occurred at the parameter b)

Fig. 3.2 and 3.3 show that when faults occurred, the effect of the revised threshold on the detection power was quite small. The detection algorithm could detect a change in the system parameters $a$ and $b$ effectively. The tests of false alarm rate and detection power indicate that the revised threshold is quite effective. It reduces the false alarm rate below the expected value, while maintaining the detection power.

(c) Comparison of the upgrading scheme and the robustness scheme

As discussed in section 3.3.2, upgrading the normalized residual is a method to deal with the constant bias due to parameter uncertainties. The idea is to estimate the constant bias of the normalized residual with training data, while simultaneously estimating the sensitivity matrix and covariance matrix. The result is used to upgrade the normalized residual in the detection of faults. Upgrading the normalized residual is called the upgrading scheme, while the method of revising the threshold to reduce the false alarm rate is called the robustness scheme. The following simulation was done as a comparison of these two schemes. The process model was excited with a white noise signal with a variance of 1 for 5000 time units. A chi-square value was calculated every 10 time units. Every chi-square value was based on the past 100 time units.
[Two panels of chi-square value versus time (in units of 10 time units): "Test of false alarm rate for the robustness scheme" and "Test of false alarm rate for the upgrading scheme".]
Figure 3.4 Chi-square value for the robustness scheme and for the upgrading scheme in a fault-free case

Table 3.2 False alarm rates for the robustness scheme and the upgrading scheme

                                          Number of chi-square values exceeding threshold    False alarm rate
Robustness scheme    Original threshold   34                                                  3.4%
                     Revised threshold    8                                                   0.8%
Upgrading scheme     Original threshold   28                                                  2.8%
                     Revised threshold    8                                                   0.8%

With the upgraded normalized residual and the original threshold, the false alarm rate decreased from 3.4% to 2.8%. However, it was still well above the preset value of 1%. The revised threshold reduced the false alarm rate to 0.8%, close to the expected value of 1%. One can conclude that the upgrading scheme is not as effective as the robustness scheme in dealing with parameter uncertainties.

3.4.3 Simulation with respect to parameter fluctuations

In this simulation, the model parameter $b$ is assumed to have a regular fluctuation:
$$b = 0.4 + 0.1 \sin(0.01 t)$$
The original threshold was obtained from the standard chi-square table as $\chi^2_{\alpha} = 9.21$. Using eqn. 3.30 and 3.31, the revised threshold was calculated as $\tilde{\chi}^2_{\alpha} = 12.90$. In order to show the improvement of the new threshold, a test was conducted under both a fault-free case and a faulty case.

(a) Test of the false alarm rate using the revised threshold in a fault-free case

The process model was excited with a white noise signal with a variance of 1 for 10000 time units. A chi-square value was calculated every 10 time units. Every chi-square value was based on the past 100 time units. A plot of chi-square value versus time units is shown in Fig. 3.5.
Figure 3.5 Chi-square value of the normalized residual in a fault-free case with parameter fluctuations at b. [Plot: chi-square value versus time, in units of 10 time units, shown against the revised and original thresholds.]

Table 3.3 False alarm rate in a fault-free case with parameter fluctuations

Threshold            Number of chi-square values exceeding threshold   False alarm rate
Original threshold   23                                                2.3%
Revised threshold    7                                                 0.7%

Fig. 3.5 and Table 3.3 show that when the original threshold was used, the false alarm rate of 2.3% was much higher than the expected value of 1%. The revised threshold decreased the false alarm rate to 0.7%, which is close to the expected value of 1%.

(b) Test of the detection power using the revised threshold in faulty cases

The following two simulations were done to test the detection power of the revised threshold. For each simulation, the process model was excited with a white noise signal with a variance of 1 for 5000 time units. A chi-square value was calculated every 10 time units, each based on the past 100 time units.

In the first simulation, the parameter a was step changed from -0.6 to -0.8 at the 2500th time unit; the plot of the chi-square value versus time is shown in Fig. 3.6. In the second simulation, the parameter b was step changed from around 0.4 to 1.3 at the 2500th time unit; the plot of the chi-square value versus time is shown in Fig. 3.7.

Fig. 3.6 and 3.7 show that when a fault occurred, the revised threshold had only a small effect on the detection power. The local approach was still able to detect changes in the system parameters a and b effectively.
Figure 3.6 Chi-square value of the normalized residual in a faulty case with parameter fluctuations (a change occurred at the parameter a). [Plot: chi-square value versus time, in units of 10 time units, shown against the revised and original thresholds.]

Figure 3.7 Chi-square value of the normalized residual in a faulty case with parameter fluctuations (a step change occurred at the parameter b). [Plot: chi-square value versus time, in units of 10 time units.]

(c) Comparison of the upgrading scheme and the robustness scheme

The following simulation was done to compare the robustness scheme and the upgrading scheme in the case of parameter fluctuations. The process model was excited with a white noise signal with a variance of 1 for 5000 time units. For each scheme, a chi-square value was calculated every 10 time units, each based on the past 100 time units. The plot of the chi-square value versus time is shown in Fig. 3.8.

Figure 3.8 Chi-square value for the upgrading scheme and the robustness scheme in a fault-free case with parameter fluctuations. [Two panels: "Test of false alarm rate for the robustness scheme" and "Test of false alarm rate for the upgrading scheme"; horizontal axis: time, in units of 10 time units.]

Table 3.4 False alarm rates for the upgrading scheme and the robustness scheme in a fault-free case with parameter fluctuation

Scheme              Threshold   Number of chi-square values exceeding threshold   False alarm rate
Robustness scheme   Original    20                                                2.0%
Robustness scheme   Revised     7                                                 0.7%
Upgrading scheme    Original    20                                                2.0%
Upgrading scheme    Revised     6                                                 0.6%

With the upgraded normalized residual and the original threshold, the false alarm rate was 2.0%, which was well above the preset value of 1%. The revised threshold reduced the false alarm rate to 0.7%, which was close to the expected value of 1%. One can conclude that the upgrading scheme was not as effective as the robustness scheme in dealing with parameter uncertainties.

3.5 Experiment results

In order to assess the applicability of the algorithm in a real-world unit, a pilot-scale tank level and temperature control system was used to generate data for the algorithm. Fig. 3.9 shows the schematic of the process.

Figure 3.9 Process diagram of the tank level and temperature control system. [Diagram labels: cold water, steam, condensate, effluent, RS-232C, Opto22.]

In this experiment, only the temperature process was considered. The energy balance of the process is as follows:

$$C\rho A h\,\frac{dT}{dt} = C\left(\rho F_{In} T_{In} - \rho F_{out} T\right) + F_{steam} H \qquad (3.34)$$

where
C: heat capacity of water;
A: area of the tank;
h: water level of the tank;
$\rho$: water density;
$F_{In}$: inflow of cold water;
$F_{out}$: outflow of warm water;
$F_{steam}$: flow of steam, determined by the opening of the steam valve;
H: steam enthalpy.

At steady state, $F_{In}$ and $F_{out}$ are constant and are determined by the water level in the tank h. C, A and $\rho$ were assumed to be constant. Thus the dynamics of the temperature process were only determined by h.
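To make the dependence of the temperature dynamics on h explicit, the balance (3.34) can be divided by $C\rho A h$. The rearrangement below is only a sketch: it additionally assumes equal inflow and outflow, $F_{In} = F_{out} = F$, and introduces the symbol $\tau$, neither of which appears in the original text.

```latex
\frac{dT}{dt} = \frac{F_{In} T_{In} - F_{out} T}{A h} + \frac{F_{steam} H}{C \rho A h}
% Assumption (not in the source): F_{In} = F_{out} = F
\tau \frac{dT}{dt} + T = T_{In} + \frac{F_{steam} H}{C \rho F},
\qquad \tau = \frac{A h}{F}
```

Under this assumption the time constant $\tau$ is proportional to the level h, which is consistent with using a change in h to emulate a change in the parameters of the temperature process.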
In the experiment, the tank level h was regulated by a PI controller. Changes in the tank level h were used to simulate faults in the system.

3.5.1 Identification of model parameters

Model-based FDI methods rely on the nominal model of the process to perform fault detection. The system model used was

$$A(z^{-1})\,y(t) = B(z^{-1})\,u(t) + e(t) \qquad (3.35)$$

The process was excited with a PRBS signal to identify the parameters. The sampling interval was 4 seconds. The estimated parameters were:

$$A(z^{-1}) = 1 + a_1 z^{-1} + a_2 z^{-2} + a_3 z^{-3} = 1 - 0.5192 z^{-1} - 0.1671 z^{-2} - 0.1392 z^{-3}$$

$$B(z^{-1}) = b_1 z^{-1} + b_2 z^{-2} + b_3 z^{-3} = 0.0185 z^{-2} - 0.0689 z^{-3} + 0.0110 z^{-4}$$

The process model (3.35) can be rewritten as

$$y(k) = \varphi^T(k)\,\theta + e(k)$$

where

$$\varphi(k) = [-y(k-1)\ \ -y(k-2)\ \ -y(k-3)\ \ u(k-1)\ \ u(k-2)\ \ u(k-3)]^T, \qquad \theta = [a_1\ \ a_2\ \ a_3\ \ b_1\ \ b_2\ \ b_3]^T$$

The primary residual was calculated as follows:

$$H(\theta_0, y(k), \varphi(k)) = \varphi(k)\left(y(k) - \varphi^T(k)\,\theta_0\right)$$

where $\theta_0$ are the nominal values of the system parameters.

3.5.2 Training of the FDI system

The process was excited with a pseudo random binary signal (PRBS) at the nominal state for 5000 sampling times. With the input-output data, the sensitivity matrix M and the covariance matrix $\Sigma$ were estimated using eqn. 2.16 and 2.17. The false alarm rate was set to $\alpha = 0.01$. The original threshold value was obtained from the standard chi-square table as $\chi^2_\alpha = 16.81$. With M and $\Sigma$, the revised threshold was calculated as $\bar\chi^2_\alpha = 22.46$ using eqn. 3.27 and 3.28.

3.5.3 Test of the false alarm rate

Under the nominal state (a nominal tank level of 14 cm), the process was excited with a PRBS signal for 2000 sampling times. The length of the moving window was set to 100. A chi-square value was calculated every 5 sampling times. The plot of the chi-square value versus sampling time is shown in Fig. 3.10.
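The original thresholds quoted in this chapter (9.21 for two parameters and 16.81 for six parameters at $\alpha = 0.01$) can be reproduced without a chi-square table. The following is a minimal sketch, not the thesis's procedure: it uses the closed-form tail probability that holds for an even number of degrees of freedom and a simple bisection. The function names are illustrative, and the revised thresholds (eqn. 3.27 to 3.31) are not reproduced here.

```python
import math

def chi2_sf_even(x, dof):
    """Upper-tail probability P(X > x) of a chi-square variable.

    Closed form valid only for an even number of degrees of freedom:
    exp(-x/2) * sum_{i=0}^{dof/2-1} (x/2)^i / i!
    """
    if dof <= 0 or dof % 2:
        raise ValueError("this closed form needs a positive, even dof")
    half = x / 2.0
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(dof // 2))

def chi2_threshold(alpha, dof, hi=200.0):
    """Bisect for the threshold whose upper-tail probability equals alpha."""
    lo = 0.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if chi2_sf_even(mid, dof) > alpha:
            lo = mid   # tail still too heavy: threshold lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(round(chi2_threshold(0.01, 2), 2))   # 9.21  (two parameters, section 3.4)
print(round(chi2_threshold(0.01, 6), 2))   # 16.81 (six parameters, section 3.5)
```

For an odd number of degrees of freedom a library routine for the chi-square quantile would be needed instead of this closed form.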
Figure 3.10 Test of the false alarm rate in a fault-free case using the revised threshold. [Plot: chi-square value versus time, in units of 5 sampling times, shown against the revised threshold.]

When the original threshold was implemented, there were 21 false alarms and the false alarm rate was 5.25%. When the revised threshold was implemented, there were 9 false alarms and the false alarm rate was 2.25%. Although the revised threshold greatly decreased the false alarm rate, it was still not large enough to reduce the false alarm rate to the preset value of 1%. The reason is that only the effect of high frequency noise was considered in both training and detection. In the parameter identification and the training stage, low frequency disturbances were not taken into account. Therefore, the detection algorithm cannot distinguish the effects of low frequency disturbances from those of faults. Reducing the effect of low frequency disturbances on the detection algorithm may be a future research topic.

3.5.4 Test of the detection power

The process was excited with a PRBS signal for 2000 sampling times. The length of the moving window was set to 100 sampling times. A chi-square value was calculated every 5 sampling times. The tank level of the system was changed from 14 cm to 11 cm at the 800th sampling time. The plot of the chi-square value versus sampling time is shown in Fig. 3.11.

Figure 3.11 Test of the detection power using the revised threshold. [Two panels: one plotted against sampling time; the other showing the chi-square value against time in units of 5 sampling times, with the revised and original thresholds.]

Fig. 3.11 shows that when a fault occurred, the revised threshold had only a small effect on the detection power. The revised threshold decreased the false alarm rate while maintaining the ability to detect a real fault.
3.6 Conclusions

In this chapter, the robustness of the local approach with respect to model uncertainties was investigated. An algorithm was proposed to recalculate the threshold value in order to reduce the false alarm rate to the preset value. A similar algorithm was also proposed to calculate the threshold to accommodate regular parameter fluctuations.

Simulation results show that the revised threshold can reduce the false alarm rate to a value close to the pre-assigned one and maintain the ability to detect faults in the case of both parameter uncertainties and parameter fluctuations. The robustness scheme was also compared with the upgrading scheme through simulations, and it was concluded that the robustness scheme is more effective than the upgrading scheme in reducing the false alarm rate.

The proposed algorithms were also applied to a pilot-scale tank level and temperature control system. Although the revised threshold was found to reduce the false alarm rate, it was not large enough to decrease the false alarm rate to the preset value. This was probably due to the low frequency disturbances in the system. A future research topic may be to improve the robustness of the local approach with respect to low frequency disturbances. Possible solutions could be either to further increase the threshold value or to take the low frequency disturbances into account in the process model.

Chapter 4 Closed-loop fault detection using the local approach

4.1 Introduction

Fault detection and isolation (FDI) has become an important subject in the process control community with the increasing complexity of industrial systems. One of the applications of FDI is condition-based maintenance.
The decision for such maintenance can be reached by early detection of incipient changes in the parameters of a system with respect to their nominal values, without any artificial excitation. By doing so, one can prevent possible malfunction or damage before it occurs. A statistical approach to this early detection task, the local approach, has been developed by Basseville (1998) and Zhang et al. (1998). It consists of the design of chi-square tests built on a function of the nominal model parameters and the system inputs and outputs.

The local approach is closely related to parameter estimation based detection methods. In parameter estimation based fault detection methods, system parameters are estimated with on-line input-output data and compared with the nominal values. If significant changes occur, a decision on the presence of faults in the system is made. However, there is inevitable estimation error in the estimated parameters due to noise and disturbances around the system. Evaluating the significance of the changes with respect to the estimation error remains a problem. In fact, fault detection and isolation algorithms are only concerned with changes in the system parameters, not the exact values of the parameters. Thus an alternative way to detect changes in system parameters is to check whether the current system inputs and outputs are still in agreement with the nominal model, i.e. a signal-to-model distance is considered instead (Zhang et al., 1998). The local approach is a method based on this idea, and it can be developed in line with system identification.

A lot of research has been done on the local approach, both theoretically and experimentally. The effectiveness and reliability of this approach have been demonstrated by its applications in the monitoring of critical processes such as nuclear power plants, gas turbines and catalytic converters (Huang, 1999). However, few studies have been conducted on closed-loop detection using the local approach.
Since most control systems work under closed-loop conditions, it is meaningful to investigate how the local approach can be applied to closed-loop monitoring using closed-loop data.

The fundamental problem in closed-loop identification is the correlation between system inputs and outputs through a feedback controller. The local approach is a detection method naturally related to parameter identification. Therefore, when closed-loop data are used for fault detection, one can anticipate problems similar to those associated with closed-loop identification; that is, the system inputs and outputs are insufficiently excited. Terms of the primary residual and the normalized residual may be linearly related. From the information point of view, some of the information in the primary residual calculated using closed-loop data is redundant, and the information contained in the primary residual is not rich enough to reflect all types of changes in the system parameters. With respect to the computation of the detection algorithm, it is usually assumed that the sensitivity matrix has full rank and the covariance matrix is positive definite. However, this may not be true in closed-loop detection. A chi-square test cannot be performed on the normalized residual if the covariance matrix is not positive definite.

In this chapter the problem of closed-loop detection will be discussed. The original scheme of the local approach will be revised to make it useful for closed-loop detection. Closed-loop identification has received considerable attention in the literature. Several methods have been proposed and analyzed (Forssell, 1999). Some ideas from existing results in closed-loop identification will be applied to the problem of closed-loop detection.

4.2 Discussion of a simple example

A simple example is used in the following to outline the approaches for closed-loop detection.
A second-order process given by the following ARX model is considered:

$$A(z^{-1})\,y(k) = B(z^{-1})\,u(k) + e(k) \qquad (4.1)$$

where u(k) and y(k) are the system input and output respectively at time instant k, e(k) is a white noise sequence, and the polynomials $A(z^{-1})$ and $B(z^{-1})$ are defined as

$$A(z^{-1}) = 1 + a_1 z^{-1} + a_2 z^{-2}, \qquad B(z^{-1}) = b_1 z^{-1} + b_2 z^{-2}$$

System 4.1 can be rewritten as

$$y(k) = \varphi^T(k)\,\theta + e(k) \qquad (4.2)$$

where

$$\varphi(k) = [-y(k-1)\ \ -y(k-2)\ \ u(k-1)\ \ u(k-2)]^T, \qquad \theta = [a_1\ \ a_2\ \ b_1\ \ b_2]^T$$

For simplicity, the following notations are used: $y_k = y(k)$, $\varphi_k = \varphi(k)$, $u_k = u(k)$, $e_k = e(k)$.

The primary residual is calculated as

$$H(\theta_0, y_k, \varphi_k) = \varphi_k\left(y_k - \varphi_k^T\theta_0\right) \qquad (4.3)$$

Substituting eqn. 4.2 into eqn. 4.3, the primary residual becomes

$$H(\theta_0, y_k, \varphi_k) = \varphi_k\varphi_k^T(\theta - \theta_0) + \varphi_k e_k \qquad (4.4)$$

Then the normalized residual can be calculated as

$$\zeta_N(\theta_0) = \frac{1}{\sqrt{N}}\sum_{k=1}^{N} H(\theta_0, y_k, \varphi_k) = \frac{1}{\sqrt{N}}\sum_{k=1}^{N}\varphi_k\varphi_k^T(\theta - \theta_0) + \frac{1}{\sqrt{N}}\sum_{k=1}^{N}\varphi_k e_k \qquad (4.5)$$

In the local approach, the assumption of small change is made that

$$\theta - \theta_0 = \frac{\lambda}{\sqrt{N}} \qquad (4.6)$$

where $\theta$ is the system parameter after a fault and $\lambda$ is considered as an unknown but fixed vector. Under the assumption of small change, the normalized residual becomes

$$\zeta_N(\theta_0) = \left(\frac{1}{N}\sum_{k=1}^{N}\varphi_k\varphi_k^T\right)\lambda + \frac{1}{\sqrt{N}}\sum_{k=1}^{N}\varphi_k e_k \qquad (4.7)$$

Since $\varphi_k$ and $e_k$ are uncorrelated and $e_k$ is zero mean, $\varphi_k e_k$ is also zero mean. By the Central Limit Theorem, the second term in eqn. 4.7 converges to a Gaussian distribution with zero mean when N approaches infinity. Also, by the law of large numbers, when N approaches infinity

$$\frac{1}{N}\sum_{k=1}^{N}\varphi_k\varphi_k^T \rightarrow \mathrm{cov}(\varphi_k) = M$$

where M denotes the covariance matrix of $\varphi_k$. To summarize, if the assumption of small changes is taken, the normalized residual is asymptotically Gaussian distributed:

$$\zeta_N(\theta_0) \sim N(M\lambda, \Sigma)$$

when N approaches infinity, where $\Sigma$ is the covariance matrix of the second term in eqn. 4.7. A chi-square test can be performed on $\zeta_N(\theta_0)$ to detect whether $\lambda$ is zero or not. A non-zero $\lambda$ indicates a fault. When the system 4.1 is well excited, both M and $\Sigma$ will be of full rank.
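The open-loop procedure described above (eqn. 4.3 to 4.7 followed by a chi-square test) can be sketched in a few lines. This is an illustration under stated assumptions, not the thesis's implementation: the parameter values are borrowed from section 4.5, the fault size (b1 changed from 0.7 to 1.0) and all function names are hypothetical, the covariance is estimated as the sample second moment of the primary residual (the thesis refers to eqn. 2.17, which is not reproduced here), and non-overlapping windows of 100 samples are used.

```python
import numpy as np

def simulate_arx(theta, n, rng):
    """Simulate y(k) = phi(k)^T theta + e(k) with white-noise input u and noise e."""
    a1, a2, b1, b2 = theta
    u = rng.standard_normal(n)
    e = rng.standard_normal(n)
    y = np.zeros(n)
    for k in range(2, n):
        y[k] = -a1 * y[k - 1] - a2 * y[k - 2] + b1 * u[k - 1] + b2 * u[k - 2] + e[k]
    return u, y

def primary_residuals(theta0, u, y):
    """Rows are H(theta0, y_k, phi_k) = phi_k (y_k - phi_k^T theta0), eqn. 4.3."""
    phi = np.column_stack([-y[1:-1], -y[:-2], u[1:-1], u[:-2]])  # phi_k for k = 2..n-1
    err = y[2:] - phi @ theta0
    return phi, phi * err[:, None]

def chi_square_values(H, M, Sigma, window=100):
    """zeta^T Sigma^-1 M F^-1 M^T Sigma^-1 zeta on non-overlapping windows."""
    Si_M = np.linalg.solve(Sigma, M)          # Sigma^{-1} M
    F = M.T @ Si_M                            # F = M^T Sigma^{-1} M
    values = []
    for start in range(0, len(H) - window + 1, window):
        zeta = H[start:start + window].sum(axis=0) / np.sqrt(window)
        g = Si_M.T @ zeta                     # M^T Sigma^{-1} zeta (Sigma symmetric)
        values.append(g @ np.linalg.solve(F, g))
    return np.array(values)

rng = np.random.default_rng(0)
theta0 = np.array([-1.1, 0.3, 0.7, -0.5])     # nominal [a1, a2, b1, b2]

# Training on fault-free, open-loop data: estimate M and Sigma.
u, y = simulate_arx(theta0, 20000, rng)
phi, H = primary_residuals(theta0, u, y)
M = phi.T @ phi / len(phi)
Sigma = H.T @ H / len(H)

# Detection: nominal data versus data with a hypothetical change in b1.
theta_fault = np.array([-1.1, 0.3, 1.0, -0.5])
_, H_nom = primary_residuals(theta0, *simulate_arx(theta0, 10000, rng))
_, H_flt = primary_residuals(theta0, *simulate_arx(theta_fault, 10000, rng))
t_nom = chi_square_values(H_nom, M, Sigma)
t_flt = chi_square_values(H_flt, M, Sigma)
print(t_nom.mean(), t_flt.mean())
```

In the fault-free run the statistic should average close to the number of parameters (four), while the faulty run should average noticeably higher; individual values are then compared with the chi-square threshold.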
It also means that all types of changes in the system parameter vector $\theta$ are reflected in the system input and output, and thus in the mean value of the normalized residual $\zeta_N(\theta_0)$.

In a closed-loop case, M and $\Sigma$ may not be of full rank because of the correlation between the system inputs and outputs. Suppose that a proportional controller with gain $K_c$ is used to control the system 4.1 and the setpoint $y_r$ is zero; then

$$u_k = K_c(y_r - y_k) = -K_c y_k \qquad (4.8)$$

Therefore,

$$\varphi_k = [-y_{k-1}\ \ -y_{k-2}\ \ -K_c y_{k-1}\ \ -K_c y_{k-2}]^T = P\bar\varphi_k \qquad (4.9)$$

where

$$\bar\varphi_k = \begin{bmatrix} -y_{k-1} \\ -y_{k-2} \end{bmatrix}, \qquad P = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ K_c & 0 \\ 0 & K_c \end{bmatrix} \qquad (4.10)$$

Then the primary residual becomes

$$H(\theta_0, y_k, \varphi_k) = \varphi_k\left(y_k - \varphi_k^T\theta_0\right) = P\bar\varphi_k\left(y_k - \bar\varphi_k^T P^T\theta_0\right) \qquad (4.11)$$

Letting $h = [0\ \ -K_c\ \ 0\ \ 1]$, then $hP = [0\ \ 0]$ and

$$hH(\theta_0, y_k, \varphi_k) = hP\bar\varphi_k\left(y_k - \bar\varphi_k^T P^T\theta_0\right) = 0 \qquad (4.12)$$

Therefore, the terms of $H(\theta_0, y_k, \varphi_k)$ are linearly related. In the case of closed-loop detection, the sensitivity matrix and the covariance matrix can be written as

$$M = \mathrm{cov}(\varphi_k) = E(\varphi_k\varphi_k^T) = E(P\bar\varphi_k\bar\varphi_k^T P^T) = P\,E(\bar\varphi_k\bar\varphi_k^T)\,P^T = P\,\mathrm{cov}(\bar\varphi_k)\,P^T \qquad (4.13)$$

$$\Sigma = \mathrm{cov}(\varphi_k e_k) = E(\varphi_k e_k e_k\varphi_k^T) = E(P\bar\varphi_k e_k e_k\bar\varphi_k^T P^T) = P\,E(\bar\varphi_k e_k e_k\bar\varphi_k^T)\,P^T = P\,\mathrm{cov}(\bar\varphi_k e_k)\,P^T \qquad (4.14)$$

$\bar\varphi_k$ and $\bar\varphi_k e_k$ are random vectors of length 2. Therefore, $\mathrm{rank}(\mathrm{cov}(\bar\varphi_k)) \le 2$ and $\mathrm{rank}(\mathrm{cov}(\bar\varphi_k e_k)) \le 2$. Since $\mathrm{rank}(P) = 2$,

$$\mathrm{rank}(M) = \min[\mathrm{rank}(P), \mathrm{rank}(\mathrm{cov}(\bar\varphi_k))] = \mathrm{rank}(\mathrm{cov}(\bar\varphi_k)) \le 2$$

$$\mathrm{rank}(\Sigma) = \min[\mathrm{rank}(P), \mathrm{rank}(\mathrm{cov}(\bar\varphi_k e_k))] = \mathrm{rank}(\mathrm{cov}(\bar\varphi_k e_k)) \le 2$$

From equations 4.10 to 4.14, it can be concluded that in the case of closed-loop detection:

• Terms of the primary residual calculated with the open-loop model are linearly related. This means that some terms in the primary residual can be calculated as a linear combination of the rest of the terms. They can be eliminated from the primary residual without loss of information about faults.

• The sensitivity matrix M is not of full rank. Therefore, for some non-zero $\lambda$, $M\lambda$ can be zero. In such a case, the change $\lambda$ in the system parameters does not cause a change in the mean value of the normalized residual, and a missed detection will occur.

• The covariance matrix $\Sigma$ may not be of full rank, which means that $\Sigma$ cannot be inverted to perform a chi-square test on the normalized residual, and the detection algorithm cannot be continued.

In the case of closed-loop detection, insufficient excitation or correlation between system inputs and outputs makes it difficult to detect all types of changes in the system parameters. However, the primary residual and the normalized residual still contain useful information. The local approach may be revised to use such information to detect those changes in the system parameters that are reflected in the closed-loop data. In the following, the ideas of closed-loop detection will be illustrated using the above example.

With closed-loop data, the normalized residual can be calculated as

$$\zeta_N(\theta_0) = \frac{1}{\sqrt{N}}\sum_{k=1}^{N} H(\theta_0, y_k, \varphi_k) = P\left(\frac{1}{N}\sum_{k=1}^{N}\bar\varphi_k\bar\varphi_k^T\right)P^T\lambda + P\left(\frac{1}{\sqrt{N}}\sum_{k=1}^{N}\bar\varphi_k e_k\right) \qquad (4.15)$$

where P is defined in eqn. 4.10. Letting

$$D = \frac{1}{2}\begin{bmatrix} 1 & 0 & K_c^{-1} & 0 \\ 0 & 1 & 0 & K_c^{-1} \end{bmatrix} \qquad (4.16)$$

then

$$DP = I_2 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

where $I_2$ denotes the unit matrix with two columns. With the following notation:

$$\bar\zeta_N(\bar\theta_0) = D\,\zeta_N(\theta_0), \qquad \bar\theta_0 = P^T\theta_0, \qquad \bar\theta = P^T\theta, \qquad \bar\lambda = P^T\lambda \qquad (4.17)$$

one obtains

$$\bar\zeta_N(\bar\theta_0) = DP\left(\frac{1}{N}\sum_{k=1}^{N}\bar\varphi_k\bar\varphi_k^T\right)P^T\lambda + DP\left(\frac{1}{\sqrt{N}}\sum_{k=1}^{N}\bar\varphi_k e_k\right) = \left(\frac{1}{N}\sum_{k=1}^{N}\bar\varphi_k\bar\varphi_k^T\right)\bar\lambda + \frac{1}{\sqrt{N}}\sum_{k=1}^{N}\bar\varphi_k e_k \qquad (4.18)$$

By the Central Limit Theorem, the second term in eqn. 4.18 converges to a Gaussian distribution with zero mean when N approaches infinity. Also, when N approaches infinity,

$$\frac{1}{N}\sum_{k=1}^{N}\bar\varphi_k\bar\varphi_k^T \rightarrow \mathrm{cov}(\bar\varphi_k) = \bar M \qquad (4.19)$$

Suppose $\bar\Sigma$ is the covariance matrix of the second term in eqn. 4.18. Then, when N approaches infinity,

$$\bar\zeta_N(\bar\theta_0) \rightarrow N(\bar M\bar\lambda, \bar\Sigma) \qquad (4.20)$$

It can be proven that in a closed-loop case, $\bar M$ and $\bar\Sigma$ are of full rank. This means that the revised normalized residual $\bar\zeta_N(\bar\theta_0)$ can be used to detect whether $\bar\lambda$ is zero or not (i.e. the actual change in $\bar\theta$).

In fact the closed-loop model is

$$A(z^{-1})\,y(k) = -B(z^{-1})K_c\,y(k) + e(k) \qquad (4.21)$$

$$[A(z^{-1}) + B(z^{-1})K_c]\,y(k) = e(k) \qquad (4.22)$$

Since

$$A(z^{-1}) + B(z^{-1})K_c = 1 + (a_1 + K_c b_1)z^{-1} + (a_2 + K_c b_2)z^{-2} \qquad (4.23)$$

the closed-loop parameters are

$$[a_1 + K_c b_1\ \ \ a_2 + K_c b_2]^T = P^T\theta = \bar\theta \qquad (4.24)$$

Therefore, $\bar\theta$ is actually the closed-loop parameter vector of the system.

From the above discussion, one can conclude that the local approach can be applied to closed-loop detection in either of the following two ways:

• Use the open-loop system model to calculate the primary residual and the normalized residual, and reduce the dimension of the normalized residual to eliminate information redundancy. The regular local approach can then be used to detect changes in the open-loop parameters.

• Use the closed-loop model instead of the open-loop model to calculate the primary residual and the normalized residual, and then perform detection on the closed-loop parameters. Since changes in the closed-loop parameters indicate changes in the open-loop parameters, changes in the open-loop parameters can be detected indirectly.

Based on the above ideas, two closed-loop detection methods are proposed: a dimension reduction method and an indirect detection method. They will be discussed separately in the following.

4.3 Closed-loop fault detection using the dimension reduction method

First, closed-loop detection using the dimension reduction method is discussed. The basic idea of the dimension reduction method is to reduce the dimension of the normalized residual to eliminate information redundancy, and thereby reduce the dimension of the covariance matrix so that it has full rank.
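The algebra of the proportional-control example in section 4.2 can be checked numerically. The sketch below is illustrative only: the gain Kc = 0.5 is hypothetical, the open-loop parameter values are borrowed from section 4.5, and the rank is evaluated with an explicit tolerance because the sample matrix is singular only up to rounding error.

```python
import numpy as np

Kc = 0.5                                    # hypothetical proportional gain
theta = np.array([-1.1, 0.3, 0.7, -0.5])    # [a1, a2, b1, b2], borrowed from section 4.5

P = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [Kc, 0.0],
              [0.0, Kc]])                   # eqn. 4.10
h = np.array([0.0, -Kc, 0.0, 1.0])          # row vector with h P = 0
D = 0.5 * np.array([[1.0, 0.0, 1.0 / Kc, 0.0],
                    [0.0, 1.0, 0.0, 1.0 / Kc]])   # eqn. 4.16, chosen so that D P = I

theta_cl = P.T @ theta                      # closed-loop parameters, eqn. 4.24

# Closed-loop simulation with u(k) = -Kc y(k) and zero setpoint.
rng = np.random.default_rng(1)
n = 5000
e = rng.standard_normal(n)
y = np.zeros(n)
u = np.zeros(n)
for k in range(2, n):
    y[k] = (-theta[0] * y[k - 1] - theta[1] * y[k - 2]
            + theta[2] * u[k - 1] + theta[3] * u[k - 2] + e[k])
    u[k] = -Kc * y[k]

phi = np.column_stack([-y[1:-1], -y[:-2], u[1:-1], u[:-2]])
M = phi.T @ phi / len(phi)                  # sample estimate of cov(phi_k)
M_bar = D @ M @ D.T                         # reduced matrix, estimate of cov(phi_bar_k)

print(h @ P, theta_cl)
print(np.linalg.matrix_rank(M, tol=1e-8), np.linalg.matrix_rank(M_bar, tol=1e-8))
```

With these values the 4 x 4 sample matrix has numerical rank 2, whereas the reduced 2 x 2 matrix has full rank, in line with the conclusions drawn from eqn. 4.13 and 4.19.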
A parametric system given by the following state space model is considered:

$$x_{k+1} = f(\theta, x_k, u_k), \qquad y_k = h(\theta, x_k, u_k) + e_k \qquad (4.25)$$

where $x_k$, $u_k$, $y_k$ are the vectors of state variables, system inputs and outputs at time k respectively; $\theta$ is the vector of the system parameters of length n; $e_k$ is a white or colored noise. Using the method given in chapter 2, the primary residual and the normalized residual can be calculated using the open-loop model 4.25. Under the assumption of small changes, that is, to distinguish between the two hypotheses $H_0: \theta = \theta_0$ and $H_1: \theta = \theta_0 + \lambda/\sqrt{N}$, the normalized residual $\zeta_N(\theta_0)$ follows:

$$\zeta_N(\theta_0) \sim N(0, \Sigma) \quad \text{under } H_0 \qquad (4.26)$$

$$\zeta_N(\theta_0) \sim N(-M\lambda, \Sigma) \quad \text{under } H_1 \qquad (4.27)$$

The generalized likelihood ratio (GLR) between $H_1$ and $H_0$ is

$$t = 2\ln\frac{\max_\lambda p_\lambda(\zeta)}{p_0(\zeta)} = \zeta^T\Sigma^{-1}M F^{-1}M^T\Sigma^{-1}\zeta \qquad (4.28)$$

where $F = M^T\Sigma^{-1}M$. t is a chi-square distributed variable with a number of degrees of freedom equal to the dimension of $\theta$. This chi-square distribution is central under $H_0$ and non-central under $H_1$.

Usually the system is operated in closed-loop conditions. When insufficient excitation exists in the closed-loop system, M and $\Sigma$ will not be of full rank. Therefore, neither $\Sigma$ nor F will be invertible, and the t value in eqn. 4.28 cannot be calculated.

Generally, the primary residual and the normalized residual have the same dimension as the system parameter vector $\theta$. Therefore, M and $\Sigma$ are $n \times n$ matrices. Assuming that M and $\Sigma$ have the same rank r and using Singular Value Decomposition (SVD) on M and $\Sigma$:

$$M = U_1 S_1 V_1^T, \qquad \Sigma = U_2 S_2 V_2^T$$

where $U_1$, $V_1$, $U_2$ and $V_2$ are all unitary matrices and

$$S_1 = \mathrm{diag}(\alpha_1, \ldots, \alpha_r, 0, \ldots, 0), \qquad S_2 = \mathrm{diag}(\beta_1, \ldots, \beta_r, 0, \ldots, 0)$$

Let $\bar S_1$ and $\bar\Sigma$ be the sub-matrices of rank r:

$$\bar S_1 = \mathrm{diag}(\alpha_1, \ldots, \alpha_r)_{r\times r}, \qquad \bar\Sigma = \mathrm{diag}(\beta_1, \ldots, \beta_r)_{r\times r}$$

Since $\Sigma$ is a symmetric and positive semi-definite matrix, $U_2 = V_2$ and $\beta_1, \ldots, \beta_r$ will all be positive numbers. Therefore, $\bar\Sigma$ is a positive definite matrix. Defining

$$P = [\,I_r \ \ 0\,]_{r\times n} \qquad (4.29)$$

where $I_r$ denotes the unit matrix of rank r, one has

$$S_1 = P^T\bar S_1 P, \qquad S_2 = P^T\bar\Sigma P \qquad (4.30)$$

Then

$$\Sigma = U_2 P^T\bar\Sigma P U_2^T, \qquad M = U_1 P^T\bar S_1 P V_1^T \qquad (4.31)$$

The distribution of the normalized residual can be rewritten as

$$\zeta_N(\theta_0) \sim N(0,\ U_2 P^T\bar\Sigma P U_2^T) \quad \text{under } H_0$$

$$\zeta_N(\theta_0) \sim N(-U_1 P^T\bar S_1 P V_1^T\lambda,\ U_2 P^T\bar\Sigma P U_2^T) \quad \text{under } H_1$$

Using a linear transformation of the normalized residual

$$\bar\zeta_N(\theta_0) = T\,\zeta_N(\theta_0), \qquad T = P U_2^T \qquad (4.32)$$

$\bar\zeta_N(\theta_0)$ is an $r \times 1$ Gaussian distributed random vector and

$$\mathrm{cov}(\bar\zeta_N(\theta_0)) = \mathrm{cov}(P U_2^T\zeta_N(\theta_0)) = P U_2^T\,\mathrm{cov}(\zeta_N(\theta_0))\,U_2 P^T = P U_2^T\Sigma U_2 P^T = P U_2^T U_2 P^T\bar\Sigma P U_2^T U_2 P^T \qquad (4.33)$$

Since $U_2$ is a unitary matrix, $U_2 U_2^T = I_n$, $U_2^T U_2 = I_n$ and $P U_2^T U_2 P^T = P I_n P^T = I_r$. Therefore,

$$\mathrm{cov}(\bar\zeta_N(\theta_0)) = I_r\bar\Sigma I_r = \bar\Sigma \qquad (4.34)$$

Letting

$$\bar M = P U_2^T U_1 P^T\bar S_1, \qquad \bar\lambda = P V_1^T\lambda \qquad (4.35)$$

then

$$E[\bar\zeta_N(\theta_0)] = E[P U_2^T\zeta_N(\theta_0)] = P U_2^T E[\zeta_N(\theta_0)] = 0 \quad \text{under } H_0$$

$$E[\bar\zeta_N(\theta_0)] = P U_2^T(-M\lambda) = -P U_2^T U_1 P^T\bar S_1 P V_1^T\lambda = -\bar M\bar\lambda \quad \text{under } H_1$$

The distribution of the revised normalized residual is

$$\bar\zeta_N(\theta_0) \sim N(0, \bar\Sigma) \quad \text{under } H_0 \qquad (4.36)$$

$$\bar\zeta_N(\theta_0) \sim N(-\bar M\bar\lambda, \bar\Sigma) \quad \text{under } H_1 \qquad (4.37)$$

$\bar M$ and $\bar\Sigma$ are full rank matrices. Using a linear transformation, the dimension of the normalized residual is reduced so that its covariance is positive definite. A chi-square test can then be applied to the revised normalized residual to detect whether $\bar\lambda$ is zero or not (i.e. to detect the change in $\bar\theta$):

$$\bar t = 2\ln\frac{\max_{\bar\lambda} p_{\bar\lambda}(\bar\zeta)}{p_0(\bar\zeta)} = \bar\zeta^T\bar\Sigma^{-1}\bar M\bar F^{-1}\bar M^T\bar\Sigma^{-1}\bar\zeta \qquad (4.38)$$

where $\bar F = \bar M^T\bar\Sigma^{-1}\bar M$. The $\bar t$ value is asymptotically chi-square distributed, with a number of degrees of freedom equal to the dimension of $\bar\theta$. It is central under $H_0$ and non-central under $H_1$. One can detect changes in $\bar\theta$ by comparing $\bar t$ with a threshold $\chi^2_\alpha$ that can be obtained from the standard chi-square table according to the false alarm rate $\alpha$ and the dimension of the reduced parameter $\bar\theta$.
Since the reduced parameters $\bar\theta = P V_1^T\theta$ are linear combinations of the original system parameter vector $\theta$, changes in $\bar\theta$ also indicate changes in the original system parameter vector $\theta$. Therefore, changes in $\theta$ can be detected indirectly.

In summary, the underlying philosophy of the dimension reduction method for closed-loop detection is that the local approach can be applied to a closed-loop system by using the open-loop model and the system inputs and outputs to calculate the primary and the normalized residual. During the training of the FDI system, the ranks of the sensitivity matrix and the covariance matrix should be checked. If they are not of full rank, the dimensions of the normalized residual and the system parameters should be reduced accordingly. Otherwise, nothing needs to be done.

4.4 Closed-loop fault detection using the indirect detection method

It is mentioned in section 4.2 that when the open-loop model and the closed-loop data are used to calculate the primary residual and to perform detection, only those changes that affect the closed-loop dynamics can be detected. Since the structure of the controller is fixed and the closed-loop model is determined only by the open-loop model, changes in the closed-loop parameters indicate changes in the open-loop parameters. Thus changes in the open-loop parameters can be detected indirectly.

If the structure of the controller is known, the closed-loop model can be used instead of the open-loop model to calculate the primary residual. The nominal values of the closed-loop parameters can also be calculated from the known controller and the open-loop system parameters.
Consider a single-input, single-output system that is defined by the following ARX model:

$$A(z^{-1})\,y(k) = B(z^{-1})\,u(k) + e(k) \qquad (4.39)$$

where u(k), y(k) are the system input and output, e(k) is a white noise sequence, and

$$A(z^{-1}) = 1 + a_1 z^{-1} + \cdots + a_{n_a}z^{-n_a}, \qquad B(z^{-1}) = b_1 z^{-1} + \cdots + b_{n_b}z^{-n_b}$$

The input signal is determined by

$$u(k) = G_c(z^{-1})\,(r(k) - y(k)) \qquad (4.40)$$

where $G_c(z^{-1})$ is a linear controller and r(k) is the setpoint. It is assumed that the setpoint of the system is zero. Then

$$u(k) = -G_c(z^{-1})\,y(k) \qquad (4.41)$$

Suppose that

$$G_c(z^{-1}) = \frac{C(z^{-1})}{D(z^{-1})} \qquad (4.42)$$

then the closed-loop model of the system can be rewritten as

$$y(k) = \frac{D(z^{-1})}{A(z^{-1})D(z^{-1}) + B(z^{-1})C(z^{-1})}\,e(k) \qquad (4.43)$$

Letting

$$E(z^{-1}) = A(z^{-1})D(z^{-1}) + B(z^{-1})C(z^{-1}) \qquad (4.44)$$

$$v(k) = D(z^{-1})\,e(k)$$

the closed-loop system becomes

$$E(z^{-1})\,y(k) = v(k) \qquad (4.45)$$

Suppose

$$E(z^{-1}) = 1 + e_1 z^{-1} + \cdots + e_{n_e}z^{-n_e}, \qquad \bar\theta = [e_1, e_2, \ldots, e_{n_e}]^T$$

With the closed-loop model 4.45, the system output y(k) can be used to detect changes in the closed-loop parameter $\bar\theta$ and thus changes in the open-loop parameters.

Before calculating the primary residual and performing fault detection with the closed-loop model 4.45, one must deal with the problem that v(k) is not a white noise. Just as a colored noise leads to biased estimation in system identification, it makes the normalized residual non-central even in fault-free cases. The following two cases discuss how the primary residual can be handled.

Case 1: $D(z^{-1})$ does not have unstable poles.

In this case, $D(z^{-1})$ can be used to filter y(k). The closed-loop system is

$$E(z^{-1})\,\frac{y(k)}{D(z^{-1})} = e(k) \qquad (4.46)$$

Letting

$$y_f(k) = \frac{y(k)}{D(z^{-1})}$$

system 4.46 becomes

$$E(z^{-1})\,y_f(k) = e(k) \qquad (4.47)$$

Rewriting eqn. 4.47 as

$$y_f(k) = \varphi_f^T(k)\,\bar\theta + e(k) \qquad (4.48)$$

where

$$\varphi_f(k) = [-y_f(k-1)\ \ -y_f(k-2)\ \ \cdots\ \ -y_f(k-n_e)]^T$$

the primary residual can be calculated as

$$H(y_f(k), \bar\theta_0, \varphi_f(k)) = \left(y_f(k) - \varphi_f^T(k)\,\bar\theta_0\right)\varphi_f(k) \qquad (4.49)$$

where $\bar\theta_0$ is the value of $\bar\theta$ in the nominal situation.
Substituting eqn. 4.48 into eqn. 4.49,

$$H(y_f(k), \bar\theta_0, \varphi_f(k)) = \varphi_f(k)\varphi_f^T(k)(\bar\theta - \bar\theta_0) + \varphi_f(k)\,e(k) \qquad (4.50)$$

When $\bar\theta = \bar\theta_0$,

$$E[H(y_f(k), \bar\theta_0, \varphi_f(k))] = E[e(k)\varphi_f(k)] = 0$$

and when $\bar\theta \ne \bar\theta_0$,

$$E[H(y_f(k), \bar\theta_0, \varphi_f(k))] = E[\varphi_f(k)\varphi_f^T(k)](\bar\theta - \bar\theta_0) + E[e(k)\varphi_f(k)] = \mathrm{cov}(\varphi_f(k))(\bar\theta - \bar\theta_0) \ne 0$$

Therefore, $H(y_f(k), \bar\theta_0, \varphi_f(k))$ defined in eqn. 4.49 meets the requirement of being a valid primary residual.

Case 2: $D(z^{-1})$ has unstable poles.

$D(z^{-1})$ is the denominator of the transfer function of the controller. It has an unstable pole in most cases because of the integral action in the controller. Therefore, it cannot be used to filter y(k); otherwise $y_f(k)$ will become a non-stationary signal. If the system 4.45 is rewritten as

$$y(k) = \varphi^T(k)\,\bar\theta + v(k) \qquad (4.51)$$

where

$$\varphi(k) = [-y(k-1)\ \ -y(k-2)\ \ \cdots\ \ -y(k-n_e)]^T$$

the primary residual can be calculated as

$$H(y(k), \bar\theta_0, \varphi(k)) = \left(y(k) - \varphi^T(k)\,\bar\theta_0\right)\varphi(k) \qquad (4.52)$$

$H(y(k), \bar\theta_0, \varphi(k))$ may have a non-zero mean even in fault-free cases. To illustrate this, suppose that a PI controller is used in the system. The PI controller can be written as

$$G_c(z^{-1}) = \frac{C(z^{-1})}{1 - z^{-1}} \qquad (4.53)$$

Then $D(z^{-1}) = 1 - z^{-1}$. When $\bar\theta = \bar\theta_0$,

$$H(y(k), \bar\theta_0, \varphi(k)) = \left(y(k) - \varphi^T(k)\,\bar\theta_0\right)\varphi(k) = v(k)\varphi(k) = [e(k) - e(k-1)]\varphi(k) = e(k)\varphi(k) - e(k-1)\varphi(k)$$

Since e(k) is a white noise and y(k) is determined by the ARX model 4.39, it is easy to prove that

$$E[e(m)\,y(n)] = \begin{cases} 0 & \text{when } m > n \\ \sigma_e^2 & \text{when } m = n \end{cases}$$

Therefore,

$$E[e(k)\varphi(k)] = E\left(e(k)[-y(k-1)\ \ -y(k-2)\ \ \cdots\ \ -y(k-n_e)]^T\right) = [0\ \ 0\ \ \cdots\ \ 0]^T$$

and

$$E[H(y(k), \bar\theta_0, \varphi(k))] = E[e(k)\varphi(k)] - E[e(k-1)\varphi(k)] = [\sigma_e^2\ \ 0\ \ \cdots\ \ 0]^T$$

There is a bias in the primary residual defined in eqn. 4.52. Bearing in mind that the only requirement for a primary residual is a zero mean in the nominal case, one can simply subtract the bias from $H(y(k), \bar\theta_0, \varphi(k))$ to fulfill the requirement of being a valid primary residual.
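The bias derived for Case 2 can be checked by simulation. The sketch below is an illustration under stated assumptions rather than the thesis's code: the closed-loop polynomial coefficients are hypothetical, the noise variance is taken as 1, and the bias is estimated as the sample mean of the primary residual over a fault-free training half of the data before being subtracted in the second half.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000
theta_cl = np.array([-0.75, 0.05])      # hypothetical nominal closed-loop parameters [e1, e2]

e = rng.standard_normal(n)              # white noise, sigma_e^2 = 1
v = e.copy()
v[1:] -= e[:-1]                         # v(k) = D(z^-1) e(k) = e(k) - e(k-1)

# Closed-loop model E(z^-1) y(k) = v(k), eqn. 4.45, in the nominal case.
y = np.zeros(n)
for k in range(2, n):
    y[k] = -theta_cl[0] * y[k - 1] - theta_cl[1] * y[k - 2] + v[k]

phi = np.column_stack([-y[1:-1], -y[:-2]])          # phi(k) = [-y(k-1), -y(k-2)]^T
H = phi * (y[2:] - phi @ theta_cl)[:, None]         # unfiltered primary residual, eqn. 4.52

half = len(H) // 2
bias = H[:half].mean(axis=0)            # training estimate of the bias
H_corrected = H[half:] - bias           # bias-corrected primary residual

print(bias)                             # expected to be close to [sigma_e^2, 0]
print(H_corrected.mean(axis=0))         # expected to be close to [0, 0]
```

The first component of the estimated bias should approach the noise variance and the second should approach zero, matching the expectation derived above; after subtraction the residual is approximately zero mean in the nominal case.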
Since the bias is determined only by the covariance of the noise, one can estimate the bias during the training of the FDI system as

$$h = \frac{1}{N}\sum_{k=1}^{N} H(y(k), \bar\theta_0, \varphi(k)) \qquad (4.55)$$

where N is the number of data points used in training. In the detection stage, one can calculate the primary residual and subtract the bias h from it. The new primary residual

$$\bar H(y(k), \bar\theta_0, \varphi(k)) = H(y(k), \bar\theta_0, \varphi(k)) - h \qquad (4.56)$$

can be used to detect the change in the closed-loop system parameter $\bar\theta$.

After a suitable primary residual is found, the rest is straightforward. The normalized residual is calculated with the primary residual, and a chi-square test can be applied to the normalized residual to detect changes in the closed-loop parameters.

The limitation of the indirect detection method is that the controller structure must be known. Otherwise the closed-loop model cannot be obtained.

4.5 Simulation results

To illustrate the results presented in this chapter, a system given by the following second-order ARX model is considered:

$$A(z^{-1})\,y(t) = B(z^{-1})\,u(t) + e(t) \qquad (4.57)$$

where u(t), y(t) are the system input and output, e(t) is a white noise with a variance of 1, and

$$A(z^{-1}) = 1 + a_1 z^{-1} + a_2 z^{-2} = 1 - 1.1z^{-1} + 0.3z^{-2}, \qquad B(z^{-1}) = b_1 z^{-1} + b_2 z^{-2} = 0.7z^{-1} - 0.5z^{-2}$$

The system is operated in closed loop with the PI controller

$$G_c(z^{-1}) = \frac{C(z^{-1})}{D(z^{-1})} = \ldots \qquad (4.58)$$

and a setpoint of zero.

4.5.1 Closed-loop fault detection using the dimension reduction method

The system model (4.57) can be rewritten as

$$y(k) = \varphi^T(k)\,\theta + e(k) \qquad (4.59)$$

where

$$\varphi(k) = [-y(k-1)\ \ -y(k-2)\ \ u(k-1)\ \ u(k-2)]^T, \qquad \theta = [a_1\ \ a_2\ \ b_1\ \ b_2]^T$$

The nominal values of the open-loop parameters are

$$\theta_0 = [-1.1\ \ 0.3\ \ 0.7\ \ -0.5]^T$$

For the dimension reduction method, the primary residual was calculated using the open-loop model as follows:

$$H(\theta_0, y_k, \varphi_k) = \varphi_k\left(y_k - \varphi_k^T\theta_0\right) \qquad (4.60)$$

In the training of the FDI system, the process was operated in closed-loop condition and the setpoint was set to zero.
With the closed-loop input and output data of 10000 time units, the primary residual was calculated using eqn. 4.60. In this case, the sensitivity matrix can be estimated as

$$M = \frac{1}{N}\sum_{k=1}^{N} \varphi_k \varphi_k^T \tag{4.61}$$

where $N$ is the number of training data points. Using eqn. 4.61 and 2.17, the sensitivity matrix and the covariance matrix were estimated as

$$M = \begin{bmatrix} 2.9360 & 2.2981 & 0.4405 & 0.0831 \\ 2.2981 & 2.9359 & 0.6066 & 0.4406 \\ 0.4405 & 0.6066 & 1.0252 & 0.9978 \\ 0.0831 & 0.4406 & 0.9978 & 1.0252 \end{bmatrix}, \quad \Sigma = \begin{bmatrix} 3.1036 & 2.5033 & 0.6325 & 0.2621 \\ 2.5033 & 2.9727 & 0.6904 & 0.4870 \\ 0.6325 & 0.6904 & 1.0060 & 0.9486 \\ 0.2621 & 0.4870 & 0.9486 & 0.9448 \end{bmatrix}$$

The rank of both $M$ and $\Sigma$ was 3. Since $\Sigma$ is not of full rank, the normalized residual needs to be revised to reduce the dimension of $\Sigma$. Applying Singular Value Decomposition to $\Sigma$ gives

$$\Sigma = U S U^T$$

where, with the singular vectors listed as rows,

$$U^T = \begin{bmatrix} 0.6868 & 0.6785 & 0.2159 & 0.1464 \\ -0.2741 & -0.0817 & 0.6538 & 0.7005 \\ 0.6585 & -0.7267 & 0.1952 & -0.0092 \\ 0.1397 & -0.0698 & -0.6984 & 0.6984 \end{bmatrix}, \quad S = \begin{bmatrix} 5.8313 & 0 & 0 & 0 \\ 0 & 1.6709 & 0 & 0 \\ 0 & 0 & 0.5250 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$$

The transform matrix, formed from the three singular vectors with non-zero singular values, was calculated as

$$T = \begin{bmatrix} 0.6868 & 0.6785 & 0.2159 & 0.1464 \\ -0.2741 & -0.0817 & 0.6538 & 0.7005 \\ 0.6585 & -0.7267 & 0.1952 & -0.0092 \end{bmatrix}$$

In detection, the primary residual $H(\theta_0, y_k, \varphi_k)$ was calculated using eqn. 4.60. The normalized residual $\zeta_N(\theta_0)$ was calculated over the past 100 time units (a moving window of 100 time units) using eqn. 2.18. It was calculated at every 10th time unit to reduce the computation load. Since the covariance matrix of $\zeta_N(\theta_0)$ was not of full rank, the normalized residual was revised with the transform matrix $T$ using eqn. 3.32.

In the following, the dimension reduction method was applied to detect a change in each single parameter of the open-loop model 4.59. For each situation, the system was operated in closed loop with a setpoint of zero for 10000 time units. One of the four open-loop parameters $a_1$, $a_2$, $b_1$ or $b_2$ was changed at the 5001st unit. The chi-square value was calculated at every 10th unit.
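The rank deficiency reported above can be reproduced with a small simulation. The following is a minimal sketch, not the code used in the thesis: it assumes the PI controller acts with negative feedback on the output (setpoint zero), and the function names and the random seed are illustrative choices. Under that assumption the controller imposes the exact relation $u(k-1) - u(k-2) = -0.2y(k-1) + 0.1y(k-2)$ on the regressor, so the direction $[-0.2\;\; 0.1\;\; 1\;\; -1]^T$ should lie in the null space of the estimated sensitivity matrix.

```python
import random


def simulate_closed_loop(n_samples, seed=0):
    """Simulate the ARX process (4.57) under the PI controller (4.58).

    Assumption: negative feedback with a setpoint of zero, i.e.
    (1 - z^-1) u(k) = -(0.2 - 0.1 z^-1) y(k).
    """
    rng = random.Random(seed)
    a1, a2, b1, b2 = -1.1, 0.3, 0.7, -0.5
    y = [0.0, 0.0]
    u = [0.0, 0.0]
    for k in range(2, n_samples + 2):
        e = rng.gauss(0.0, 1.0)  # white noise with variance 1
        yk = -a1 * y[k - 1] - a2 * y[k - 2] + b1 * u[k - 1] + b2 * u[k - 2] + e
        uk = u[k - 1] - 0.2 * yk + 0.1 * y[k - 1]
        y.append(yk)
        u.append(uk)
    return y, u


def sensitivity_matrix(y, u):
    """Estimate M = (1/N) * sum(phi_k phi_k^T) as in eqn. 4.61."""
    m = [[0.0] * 4 for _ in range(4)]
    count = 0
    for k in range(2, len(y)):
        phi = [-y[k - 1], -y[k - 2], u[k - 1], u[k - 2]]
        for i in range(4):
            for j in range(4):
                m[i][j] += phi[i] * phi[j]
        count += 1
    return [[value / count for value in row] for row in m]


y, u = simulate_closed_loop(10000)
M = sensitivity_matrix(y, u)

# Direction implied by the controller equation; M maps it to (numerically) zero.
direction = [-0.2, 0.1, 1.0, -1.0]
image = [sum(M[i][j] * direction[j] for j in range(4)) for i in range(4)]
print("M[0][0] =", round(M[0][0], 3))
print("largest |M * direction| entry =", max(abs(v) for v in image))
```

Normalizing `direction` gives, up to sign, the singular vector associated with the zero singular value in the decomposition above, which is why one dimension has to be removed before the chi-square test is applied.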
Figures 4.1 to 4.4 show the chi-square values versus time unit.

[Figure 4.1: Detection of the change in $a_1$ using the dimension reduction method. Horizontal axis: time unit.]

[Figure 4.2: Detection of the change in $a_2$ using the dimension reduction method. Horizontal axis: time unit.]

[Figure 4.3: Detection of the change in $b_1$ using the dimension reduction method. Horizontal axis: time unit.]

[Figure 4.4: Detection of the change in $b_2$ using the dimension reduction method. Horizontal axis: time unit.]

In Figs. 4.1 to 4.4, a sudden jump of the chi-square value marks the change in the system parameters. The proposed algorithm successfully detects a change in each single parameter of the open-loop model.

It is mentioned in section 4.2 that if changes in the open-loop parameters do not affect the closed-loop dynamics, they cannot be detected. That is to say, if the parameter change $\Delta$ happens to satisfy $M\Delta = 0$, it will not be reflected in the mean value of the normalized residual. One example is $\Delta$ = [0.6306 3.5164 0.2048 -3.4922]. In this case, $M\Delta = 0$. The parameters $\theta$ change from [-1.1 0.3 0.7 -0.5] to [0.4694 3.8164 0.9048 -3.7922]. The process was operated in closed loop for 10000 time units. A chi-square value was calculated for every 10 time units. Fig. 4.5 shows the chi-square value versus time unit.
[Figure 4.5: Detection failure with the dimension reduction method. Horizontal axis: time unit.]

Fig. 4.5 shows no significant change in the chi-square value, which means that the change in the open-loop parameters was not detected even though the change in $\theta$ at the 5001st unit was significant. Fortunately, the probability of such a change occurring in a real system is very low.

4.5.2 Closed-loop fault detection using the indirect detection method

The closed-loop model of the process defined in eqn. 4.57 with the PI controller in eqn. 4.58 can be calculated as

$$E(z^{-1})y(k) = D(z^{-1})e(k) \tag{4.62}$$

where

$$E(z^{-1}) = A(z^{-1})D(z^{-1}) + B(z^{-1})C(z^{-1}) = 1 + e_1 z^{-1} + e_2 z^{-2} + e_3 z^{-3} \tag{4.63}$$

Under the nominal situation,

$$A(z^{-1}) = 1 + a_1 z^{-1} + a_2 z^{-2} = 1 - 1.1z^{-1} + 0.3z^{-2}, \quad B(z^{-1}) = b_1 z^{-1} + b_2 z^{-2} = 0.7z^{-1} - 0.5z^{-2},$$

$$C(z^{-1}) = 0.2 - 0.1z^{-1}, \quad D(z^{-1}) = 1 - z^{-1}$$

then

$$E(z^{-1}) = A(z^{-1})D(z^{-1}) + B(z^{-1})C(z^{-1}) = 1 - 1.96z^{-1} + 1.23z^{-2} - 0.25z^{-3}$$

The model in eqn. 4.62 can be rewritten as

$$y(k) = \varphi^T(k)\theta + v(k) \tag{4.64}$$

where

$$\varphi(k) = [-y(k-1)\;\; -y(k-2)\;\; -y(k-3)]^T, \quad \theta = [e_1\;\; e_2\;\; e_3]^T, \quad v(k) = D(z^{-1})e(k)$$

The nominal value of the closed-loop parameter $\theta$ is

$$\theta_0 = [-1.96\;\; 1.23\;\; -0.25]^T$$

The primary residual is calculated based on the closed-loop model 4.64 as

$$H(y(k),\varphi(k),\theta_0) = (y(k) - \varphi^T(k)\theta_0)\,\varphi(k) \tag{4.65}$$

Since $v(k) = D(z^{-1})e(k)$ with $D(z^{-1}) = 1 - z^{-1}$ is not white, there is a constant bias in the mean value of the primary residual and the normalized residual.
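The nominal closed-loop parameters quoted above can be checked by multiplying out the polynomials. The sketch below is only an illustration: polynomials in $z^{-1}$ are stored as coefficient lists in ascending powers, and the helper names `poly_mul` and `poly_add` are hypothetical. It assumes the closed-loop polynomial is formed as $A D + B C$, the usual characteristic polynomial for negative feedback, which is the combination that reproduces the nominal values.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists in powers of z^-1."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            out[i + j] += p_i * q_j
    return out


def poly_add(p, q):
    """Add two polynomials, padding the shorter one with zeros."""
    n = max(len(p), len(q))
    p = p + [0.0] * (n - len(p))
    q = q + [0.0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]


# Nominal open-loop model (4.57) and PI controller (4.58).
A = [1.0, -1.1, 0.3]   # 1 - 1.1 z^-1 + 0.3 z^-2
B = [0.0, 0.7, -0.5]   # 0.7 z^-1 - 0.5 z^-2
C = [0.2, -0.1]        # 0.2 - 0.1 z^-1
D = [1.0, -1.0]        # 1 - z^-1

E = poly_add(poly_mul(A, D), poly_mul(B, C))
print([round(c, 4) for c in E])
```

The last three coefficients of `E` are the closed-loop parameters $e_1$, $e_2$, $e_3$ monitored by the indirect method.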
Before the indirect method can be applied to detect changes in the closed-loop parameters, the bias of the primary residual has to be estimated, together with the sensitivity matrix and the covariance matrix, in the training of the FDI system. The process was operated in closed loop for 10000 time units. With the input and output data, the bias of the primary residual was estimated using eqn. 4.55 as

$$h = [1.0034\;\; 0.0011\;\; 0.0019]$$

Then the primary residual can be revised as

$$\tilde{H}(y(k),\varphi(k),\theta_0) = H(y(k),\varphi(k),\theta_0) - h \tag{4.66}$$

The sensitivity matrix can be estimated, as in eqn. 4.61, as

$$M = \frac{1}{N}\sum_{k=1}^{N} \varphi(k)\varphi^T(k) \tag{4.67}$$

With the revised primary residual, the covariance matrix can be estimated using eqn. 2.18 and 2.19.

In the following simulations, the indirect detection method was applied to detect a change in each single parameter of the process. For each situation, the system was operated in closed loop with a setpoint of zero for 10000 time units. One of the four open-loop parameters $a_1$, $a_2$, $b_1$ or $b_2$ was changed at the 5001st time unit. The chi-square value was calculated at every 10th time unit to reduce the computation load. Figure 4.6 shows the chi-square value versus time unit for the change in $a_2$.

[Figure 4.6: Detection of the change in $a_2$ using the indirect detection method. Horizontal axis: time unit.]

Fig. 4.6 shows that the change in the open-loop parameter $a_2$ was successfully detected. The indirect detection method was also used to detect changes in $a_1$, $b_1$ and $b_2$. Simulation results show that the indirect detection method is effective in detecting a change in each of the open-loop parameters.
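The bias estimate of eqn. 4.55 used in this section can be illustrated with a short simulation of the nominal closed-loop model. This is a sketch under stated assumptions rather than the thesis code: the closed-loop output is generated directly from $E(z^{-1})y(k) = D(z^{-1})e(k)$ with unit-variance white noise, and the function name, sample count and seed are illustrative.

```python
import random


def estimate_bias(n_samples, theta0=(-1.96, 1.23, -0.25), seed=1):
    """Average the primary residual (y - phi^T theta0) * phi over nominal data.

    The data follow y(k) = phi(k)^T theta0 + v(k), with v(k) = e(k) - e(k-1),
    i.e. the nominal closed-loop model with D(z^-1) = 1 - z^-1.
    """
    rng = random.Random(seed)
    y = [0.0, 0.0, 0.0]
    e_prev = 0.0
    sums = [0.0, 0.0, 0.0]
    for k in range(3, n_samples + 3):
        e = rng.gauss(0.0, 1.0)
        v = e - e_prev
        phi = [-y[k - 1], -y[k - 2], -y[k - 3]]
        yk = sum(p * t for p, t in zip(phi, theta0)) + v
        error = yk - sum(p * t for p, t in zip(phi, theta0))  # equals v(k)
        for i in range(3):
            sums[i] += error * phi[i]
        y.append(yk)
        e_prev = e
    return [s / n_samples for s in sums]


h = estimate_bias(20000)
print([round(value, 3) for value in h])
```

With unit noise variance the estimate should be close to $[\sigma_e^2\;\; 0\;\; 0]$, the same pattern as the value of $h$ reported above; the exact figures depend on the noise realization.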
For each of these simulations, the plots of chi-square value, system input and system output versus time unit are similar to those obtained with the dimension reduction method. Therefore, they are omitted to avoid repetition.

In certain situations, the open-loop parameters change while the closed-loop model remains unchanged. An example is given as follows. The open-loop parameters $\theta$ changed from [-1.1 0.3 0.7 -0.5] to [-2.1 0.8 5.7 -5.5]. The closed-loop model after the change was calculated as

$$E(z^{-1}) = A(z^{-1})D(z^{-1}) + B(z^{-1})C(z^{-1}) = 1 - 1.96z^{-1} + 1.23z^{-2} - 0.25z^{-3}$$

where

$$A(z^{-1}) = 1 - 2.1z^{-1} + 0.8z^{-2}, \quad B(z^{-1}) = 5.7z^{-1} - 5.5z^{-2}$$

The closed-loop parameters did not change. The process was operated in closed loop with the setpoint set to zero for 10000 time units. The chi-square value was calculated for every 10 time units. Fig. 4.7 shows the chi-square value versus time unit.

[Figure 4.7: Detection failure using the indirect detection method. Horizontal axis: time, 10 units.]

Changes in the open-loop parameters were not detected even though they were significant. A detection failure occurs when the change in the open-loop parameters does not affect the closed-loop dynamics.

4.6 Experiment results

The proposed algorithms have been applied to the pilot-scale tank level and temperature control system described in chapter 3. In the experiment, PI controllers were used to control both the water level and the temperature in the tank. A change in the water level was used to simulate a fault in the system. The input (steam valve) and output (temperature) data of the temperature process were used to detect changes in the tank level.
The open-loop process model was obtained through system identification as described in chapter 3. The process was operated in closed loop for 1000 sampling times in order to train the FDI system. With the training data, the sensitivity matrix $M$ and the covariance matrix $\Sigma$ were estimated using eqn. 2.16 and 2.17. Though $\Sigma$ was of full rank, its condition number was more than $10^6$. Using this $\Sigma$ directly may introduce numerical error into the chi-square value. Using Singular Value Decomposition, the dimension of the covariance matrix was reduced to 5. The false alarm ratio was set to $\alpha = 0.01$ and the threshold value was obtained from the standard chi-square table as $\chi^2_\alpha = 16.81$. Using the algorithm provided in chapter 3, the threshold was then revised; both the original and the revised thresholds are marked in Fig. 4.8.

In the detection stage, the process was operated in closed loop for 2500 sampling times and the set-point of the tank level was changed from 14 cm to 11 cm at sampling time 800. Using a moving window with a length of 100 sampling times, the normalized residual and chi-square value were calculated for every 10 sampling times. The chi-square value versus sampling time is shown in Fig. 4.8.

[Figure 4.8: Detection of the change of the tank level using the dimension reduction method. Horizontal axes: sampling time; 10 sampling times. The chi-square panel marks the original threshold and the revised threshold.]

Figure 4.8 shows a sudden jump of the chi-square value after the change in the tank level, which indicates that the change was detected by the algorithm. A remaining problem is that the chi-square value returned below the threshold after a period of time. This may be caused by the processing of the data.
The trend of the input-output data was subtracted before the data were used to calculate the normalized residual, in order to reduce the effect of low-frequency disturbances on the detection algorithm. This also removed part of the information about changes in the system parameters. Future research may be needed to find methods that distinguish the effect of low-frequency disturbances from that of parameter changes.

4.7 Conclusions

In this chapter, closed-loop detection was investigated. When closed-loop data are used to calculate the primary and the normalized residual, the dependence between the input and output leads to a covariance matrix that is not of full rank. The dimension reduction method used a linear transformation to reduce the dimension of the normalized residual so that the covariance matrix of the revised normalized residual is of full rank. With closed-loop data, fault detection algorithms can only detect those parameter changes that affect the closed-loop dynamics.

The indirect detection method used the closed-loop model instead of the open-loop model to calculate the primary residual and the normalized residual. By detecting changes in the closed-loop parameters, the method also detected changes in the open-loop parameters.

Simulation studies showed that both the dimension reduction method and the indirect detection method detected a change in each single parameter of a second-order linear process operated in closed loop. The implementation of the proposed algorithm on the tank level and temperature control system proved to be effective. In the simulations, the dimension reduction method and the indirect method were found to be equivalent in performance.

Chapter 5 Application of fault detection in paper-making systems

5.1 Background

Figure 5.1 shows a typical Fourdrinier paper machine.
In paper manufacturing, a mixture of fibres and water is pumped into the headbox. The headbox is designed to keep the fibres and water evenly distributed. The pulp is then delivered onto a moving belt known as a web or wire. Water is removed from the mixture by means of gravity, pressing and drying. The remaining fibres form a paper sheet. After passing through a series of rolls called the calender, the paper sheet is wound onto a reel.

In paper-making systems, there are three important properties: (1) basis weight, which is defined as the total sheet weight per unit area, (2) moisture, which is the percentage by weight of water in the total mass of the paper sheet, and (3) caliper, which is the thickness of the paper sheet. At the end of the paper machine, these properties are measured by a scanner mounted on an O-frame surrounding the moving sheet. The scanner continuously traverses back and forth in the cross-machine direction. Variations in paper properties are considered to be composed of two parts: the machine direction (MD) part and the cross direction (CD) part. Machine direction is the direction of movement of the paper sheet and cross direction is the direction perpendicular to it.

In order to obtain a high-quality paper product and a high production rate, automatic control systems have been designed to control each of the above three properties. Efforts to improve the control of paper machines began in the early 1960s, when computers became available for industrial system control. They were primarily dedicated to the control of MD properties. With the development of measurement devices and actuators, CD control has rapidly gained prominence since the 1980s (Dumont, 1995; Kwok, 1997).

The adjustment of CD properties is done through a series of actuators mounted across the machine. The weight actuators are mainly the slice screws at the head box.
For consistency profiling head boxes, the weight actuators are a series of valves that control the local consistency across the head box. The control of the moisture profile across the sheet is carried out by steam boxes and steam showers installed at the dry section of the paper machine (Jahangir, 1997).

[Figure 5.1: Schematic of a typical Fourdrinier paper machine]

With the growing demand for fault tolerance, FDI has become an important issue for the system control community. In the CD control systems of paper-making systems, there are many actuators. The performance of a CD control system depends not only on the control algorithm, but also on the reliability of the actuators. Therefore, it is helpful to monitor the system on-line using system input and output data and to detect actuator faults. By detecting actuator faults and reporting them to the supervision system, one can repair the faulty actuators, improve the availability of the system and maintain the overall control performance.

The local approach is a useful tool for both incipient change detection and abrupt fault detection. In this chapter, it is applied to the problem of actuator fault detection for CD control systems of paper-making systems.

5.2 Problem formulation

This chapter addresses the application of the local approach to actuator fault detection in CD control systems. Since the system and fault models are crucial to the performance of model-based FD algorithms, they are specified in the following.

5.2.1 System model

An open-loop CD system can be described by the following ARX model:

$$A y(t) = B u(t-d) + e(t) \tag{5.1}$$

where
• $B$ is called the interaction matrix, which contains the coupling information, and $d$ is the time delay;
• $A$ is the dynamic matrix of the system;
• $u$ and $y$ are vectors of the system inputs and outputs, and $e(t)$ is a vector of white noise sequences.
In order to simplify the problem, it is assumed that the dynamics of all outputs are the same. Under this assumption, the dynamic matrix can be written as follows:

$$A = \begin{bmatrix} 1 - a z^{-1} & 0 & \cdots & 0 \\ 0 & 1 - a z^{-1} & & \vdots \\ \vdots & & \ddots & 0 \\ 0 & \cdots & 0 & 1 - a z^{-1} \end{bmatrix} \tag{5.2}$$

Therefore, system 5.1 can be rewritten as

$$y(t) - a y(t-1) = B u(t-d) + e(t) \tag{5.3}$$

The purpose is to detect actuator faults from the system input $u$ and output $y$ and to isolate the faults, i.e. to find the faulty actuators.

5.2.2 Modeling of actuator faults

For simplicity, it is assumed that actuator faults do not affect the dynamic matrix $A$. The faults only cause changes in the interaction matrix $B$. Therefore, actuator faults can be detected by detecting changes in the elements of the interaction matrix $B$.

Because of its high dimension, it is very difficult to detect a change in one single element of the interaction matrix $B$. Since a fault in an actuator changes all of the elements in the column corresponding to this actuator, it is assumed that the fault scales all of these elements by the same factor $r$. This means that when a fault happens in the $p$-th actuator, the interaction matrix becomes $BF$:

$$BF = \begin{bmatrix} b_{11} & \cdots & r b_{1p} & \cdots & b_{1n} \\ b_{21} & \cdots & r b_{2p} & \cdots & b_{2n} \\ \vdots & & \vdots & & \vdots \\ b_{m1} & \cdots & r b_{mp} & \cdots & b_{mn} \end{bmatrix}, \quad \text{where} \quad F = \begin{bmatrix} I_{(p-1)\times(p-1)} & 0 & 0 \\ 0 & r & 0 \\ 0 & 0 & I_{(n-p)\times(n-p)} \end{bmatrix}$$

Therefore, the system model 5.3 can be rewritten as

$$y(t) - a y(t-1) = B F u(t-d) + e(t) \tag{5.4}$$

Since

$$F u(t-d) = \begin{bmatrix} I_{(p-1)\times(p-1)} & 0 & 0 \\ 0 & r & 0 \\ 0 & 0 & I_{(n-p)\times(n-p)} \end{bmatrix} \begin{bmatrix} u_1(t-d) \\ \vdots \\ u_n(t-d) \end{bmatrix} = \begin{bmatrix} u_1(t-d) & & 0 \\ & \ddots & \\ 0 & & u_n(t-d) \end{bmatrix} \begin{bmatrix} 1 \\ \vdots \\ r \\ \vdots \\ 1 \end{bmatrix}$$

let

$$U(t-d) = \begin{bmatrix} u_1(t-d) & & 0 \\ & \ddots & \\ 0 & & u_n(t-d) \end{bmatrix}, \quad f = [1\;\; \cdots\;\; r\;\; \cdots\;\; 1]^T$$

System (5.4) becomes:

$$y(t) - a y(t-1) = B U(t-d) f + e(t) \tag{5.5}$$

where $f$ is an $n \times 1$ vector. Under nominal conditions, all elements of $f$ are 1, i.e. $f^0 = [1\;\; \cdots\;\; 1]^T$. $f$ can be called the fault index, since changes in its elements directly indicate faults of the corresponding actuators. The problem of detecting actuator faults is thus transformed into the problem of detecting changes in $f$ from the system input $u$ and output $y$.
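The reparameterization above rests on the identity $B F u = B U f$. A minimal numerical sketch of that identity follows; the 3 x 3 interaction matrix, the input vector, the fault factor and the function names are all hypothetical values chosen only for illustration, and the actuator index `p` is zero-based in the code.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def faulty_term_two_ways(B, u, p, r):
    """Compute B F u (column p of B scaled by r) and B U f (fault index form)."""
    n = len(u)
    # Route 1: scale column p of the interaction matrix, then apply the input.
    BF = [[(r * b if j == p else b) for j, b in enumerate(row)] for row in B]
    via_matrix = mat_vec(BF, u)
    # Route 2: keep B fixed and move the fault into the index vector f.
    f = [r if j == p else 1.0 for j in range(n)]
    Uf = [u[j] * f[j] for j in range(n)]  # diag(u) * f
    via_index = mat_vec(B, Uf)
    return via_matrix, via_index


# Hypothetical banded interaction matrix and input vector.
B = [[1.0, 0.4, 0.0],
     [0.4, 1.0, 0.4],
     [0.0, 0.4, 1.0]]
u = [0.5, -1.0, 2.0]

via_matrix, via_index = faulty_term_two_ways(B, u, p=1, r=0.3)
print(via_matrix)
print(via_index)
```

Both routes give the same vector, which is why a fault in actuator $p$ can be tracked through the single element $f_p$ instead of a whole column of $B$.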
The advantage of the above method of modeling actuator faults is that it is much simpler to detect changes in a vector $f$ than in a matrix $B$.

5.3 Actuator fault detection and isolation

In this part, algorithms for actuator fault detection in CD control systems are proposed. Generally, the input data of the CD system are the setpoints for the CD actuators and the output data are the system measurements. Data collected by the scanner are referred to as raw data and contain both MD and CD variations. MD and CD variations are regulated by different controllers. For the detection of faults in the CD actuators, the CD variation should be separated from the MD variation.

5.3.1 Machine-direction and Cross-direction data separation

There are two methods for separating MD and CD information: Exponential Multi-Scan Trending (EXPO) and the Estimation and Identification of Moisture Content (EIMC) algorithm. Exponential multi-scan trending is very simple and widely used in the paper industry (Li, 1998). The filter is described by

$$\hat{y}_n(t) = (1-\beta)\,\hat{y}_n(t-1) + \beta\, y_n(t) \tag{5.6}$$

where
• $\hat{y}_n(t)$ is the estimated CD value at CD position $n$
• $y_n(t)$ is the present measured deviation from the scan average at CD position $n$
• $\beta$ is the weight on the measurement, $\beta \in [0,1]$

The trended data are taken as the CD profile of the sheet, and the average of each scan is taken as the MD value.

5.3.2 Mapping of databox to actuators

Paper-making systems are high-dimensional systems. Therefore, the computation of primary residuals and the evaluation of normalized residuals involve calculations with high-dimensional matrices. As on-line monitoring tools, fault detection algorithms must be simple enough to meet the computation and capability limitations of existing control systems.
Therefore, it is worthwhile to reduce the dimensions of the system output and the model (especially the interaction matrix) to avoid unnecessary computation.

Let $m$ and $n$ represent the dimensions of the system output $y$ and the system input $u$ respectively. Usually, $m$ is much larger than $n$. The system output $y$ is related to the input $u$ through the interaction matrix $B$. So when $m$ is larger than $n$, $B$ does not have full row rank and the elements of $y$ are linearly related. Suppose that $m$ is a multiple of $n$ and $p = m/n$. Averaging $y$ over each group of $p$ data points gives

$$\bar{y}_i = \frac{1}{p}\sum_{k=1}^{p} y_{(i-1)p+k}$$

which is equivalent to taking a linear transformation of the system output:

$$\bar{y} = G y, \quad \text{where} \quad G = \frac{1}{p}\begin{bmatrix} 1 \cdots 1 & 0 \cdots 0 & \cdots & 0 \cdots 0 \\ 0 \cdots 0 & 1 \cdots 1 & \cdots & 0 \cdots 0 \\ \vdots & & \ddots & \vdots \\ 0 \cdots 0 & 0 \cdots 0 & \cdots & 1 \cdots 1 \end{bmatrix}$$

Since

$$G y(t) - a G y(t-1) = G B U(t-d) f + G e(t)$$

then

$$\bar{y}(t) - a \bar{y}(t-1) = \bar{B} U(t-d) f + \bar{e}(t) \tag{5.7}$$

where $\bar{B} = GB$ and $\bar{e}(t) = G e(t)$. With the linear transform $G$, the dimension of the system output $y$ is decreased from $m$ to $n$ and the dimensions of the interaction matrix $B$ from $m \times n$ to $n \times n$. This greatly reduces the computation required for the primary residuals and the normalized residuals. Since $\bar{y}$ is a linear transformation of $y$, it still carries the information about actuator faults. For simplicity, $y$ and $B$ will be used below to denote the system output and the interaction matrix after averaging.

5.3.3 General algorithms for actuator fault detection and isolation

For fault detection and isolation algorithms, the very first step is to find a suitable primary residual which is sensitive to the faults. Since $A$ is assumed to be unaffected by actuator faults, let

$$y_s(t) = y(t) - a y(t-1) \tag{5.8}$$

then the system (5.3) can be rewritten as a static system:

$$y_s(t) = B U(t-d) f + e(t) \tag{5.9}$$

The primary residual can be calculated as

$$H(y_s(t), U(t-d)) = U(t-d) B^T [\,y_s(t) - B U(t-d) f^0\,] \tag{5.10}$$

Substituting eqn. 5.9 into eqn.
5.10 gives

$$H(y_s(t), U(t-d)) = U(t-d) B^T B U(t-d)(f - f^0) + U(t-d) B^T e(t)$$

Under the nominal situation, $f = f^0$ and

$$H(y_s(t), U(t-d)) = U(t-d) B^T e(t)$$

Since $e(t)$ is a vector of white noise sequences, it is easy to prove that

$$E[H(y_s(t), U(t-d))] = E[U(t-d) B^T e(t)] = 0$$

When $f \neq f^0$,

$$E[H(y_s(t), U(t-d))] = E[U(t-d) B^T B U(t-d)](f - f^0) + E[U(t-d) B^T e(t)] = E[U(t-d) B^T B U(t-d)](f - f^0)$$

It can be shown that $E[U(t-d) B^T B U(t-d)] \neq 0$. Therefore, $E[H(y_s(t), U(t-d))] \neq 0$ when $f \neq f^0$.

The primary residual defined in eqn. 5.10 therefore fits the requirement that it should have zero mean in fault-free cases and non-zero mean otherwise. The normalized residual can be calculated as follows:

$$\zeta_N = \frac{1}{\sqrt{N}}\sum_{k=1}^{N} H(y_s(k), U(k-d)) \tag{5.11}$$

where $N$ is the length of the moving window and $\zeta_N$ is an $n \times 1$ vector. The fault of any actuator changes the mean value of $\zeta_N$, and a chi-square test can be applied to $\zeta_N$ to test whether its mean value is zero or not.

Fault isolation always follows fault detection. In this case, it is designed to find the faulty actuator. There are two fault isolation methods: the sensitivity test and the min-max test. Since the sensitivity test is easier to apply, it is used in this chapter to isolate the faulty actuator. Detailed information about the sensitivity test and the min-max test can be found in chapter 2.

5.3.4 Actuator fault detection by dividing the system output into sections

In the algorithms proposed above, the primary residual and the normalized residual have the same dimension as the number of actuators. With model uncertainty and noise present, the change in one actuator may not have a large enough effect on the system output. This may lead to a fault not being detected by the algorithm. In fact, the interaction matrix is sparse: one output is determined by the moves of only several actuators, and the move of one actuator affects only several system outputs.
Therefore, one can divide the system output into several regions and, for each region, detect faults of those actuators that affect its outputs. Since the dimension of the output within a section is much smaller than that of the total output, one can expect that an appropriate division of the system output into sections makes the detection algorithm more sensitive to the fault of a single actuator.

After averaging, the system inputs and outputs have the same dimension $n$. Suppose the system outputs are divided into $l$ sections and let $p = n/l$. With $y^i(t) = [\,y_{(i-1)p+1}(t)\;\; \cdots\;\; y_{ip}(t)\,]^T$ denoting the outputs of section $i$, $e^i(t)$ the corresponding noise terms and $B^i$ the corresponding rows of the interaction matrix, system 5.3 can be rewritten as

$$y^i(t) - a y^i(t-1) = B^i U(t-d) f + e^i(t), \quad \text{for } i = 1, \cdots, l \tag{5.12}$$

Therefore, the whole system can be divided into $l$ smaller systems. For each of those systems, one can detect the faults of those actuators that affect the outputs in the particular section. Primary residuals can be calculated as

$$H^i(t) = (B^i U(t-d))^T [\,y^i(t) - a y^i(t-1) - B^i U(t-d) f^0\,], \quad \text{for } i = 1, \cdots, l \tag{5.13}$$

and the normalized residual as

$$\zeta_N^i = \frac{1}{\sqrt{N}}\sum_{k=1}^{N} H^i(k)$$

5.3.5 Actuator fault detection by decoupling the system

For most CD control systems, the interaction matrix $B$ is usually not of full rank or has a high condition number. Calculating the inverse of $B$ is then not possible or introduces large numerical error. In some cases, $B$ may be of full rank. In such cases, one can decouple the MIMO system into $n$ SISO systems. Taking the pseudo-inverse of $B$,

$$D = (B^T B)^{-1} B^T \tag{5.15}$$

and multiplying both sides of eqn. 5.9 by $D$, the system model becomes:

$$D y_s(t) = D B U(t-d) f + D e(t) \tag{5.16}$$

Letting $\tilde{y}(t) = D y_s(t)$ and $v(t) = D e(t)$, then

$$\tilde{y}(t) = U(t-d) f + v(t)$$

Since $U(t-d)$ is a diagonal matrix,

$$\begin{bmatrix} \tilde{y}_1(t) \\ \vdots \\ \tilde{y}_n(t) \end{bmatrix} = \begin{bmatrix} u_1(t-d) & & 0 \\ & \ddots & \\ 0 & & u_n(t-d) \end{bmatrix} \begin{bmatrix} f_1 \\ \vdots \\ f_n \end{bmatrix} + \begin{bmatrix} v_1(t) \\ \vdots \\ v_n(t) \end{bmatrix} \tag{5.17}$$

The system is decoupled into $n$ SISO systems:

$$\tilde{y}_i(t) = u_i(t-d) f_i + v_i(t), \quad \text{for } i = 1, \cdots, n \tag{5.18}$$

Therefore, one can detect the fault of each actuator by monitoring the SISO systems separately. The primary residual for each of the SISO systems can be calculated
The primary residual for each of the SISO system can be calculated 76 Chapter 5: Application of fault detection in paper-making systems as H,(Pi(t),u,it - d), f,) = [y.(t) - «.(t - d)f.t ]u,(t-d) (5.19) 5.4 Simulation results The best way to test the applicability of proposed fault detection algorithms is to intentionally disable one or several actuators in a CD control system and see whether the algorithms can detect the faults or not. However, it is not feasible to do this, as it may deteriorate control performance. The alternative way is to collect data from a CD control system and simulate actuator faults by modifying the input and output data. If the proposed algorithms can successfully detect the simulated faults, one can anticipate that it will also detect real actuator faults in an actual CD control system. The simulation studies were carried out in two stages. First, a real system model was used to generate data that contained the information of a single actuator fault. The proposed types of algorithms were applied to detect the fault from the generated data and the performance of these algorithms was compared. Second, real system data were used to simulate actuator faults and the algorithm with the best performance was applied to detect this fault. 5.4.1 System specification The data used in the simulation studies was collected from the moisture profile C D control system with a steam box application in a Canadian paper mill producing newsprint grade paper. Steam boxes are quite slow systems as the dynamics of the systems depend on the thermal momentum of the paper machine. In the moisture CD control system, there are 50 actuators and 1500 measurements for each scan. The system model can be described as follows: Ay(t) = BAu(t -d) + aAy(t -1) (5.20) where y(f) is the CD profile, u(t) is the vector of the setpoints for the actuators. A = 1 - q~x. A system input and output incremency was taken to remove the initial conditions. 
The process model was obtained through a bump test and the parameters were determined with the Levenberg-Marquardt algorithm (Gorinevsky et al., 1998). In this system, $a = 0.9432$ and $d = 2$. Figure 5.2 shows a three-dimensional view of the interaction matrix $B$ and the gain of the 25th actuator to the system outputs.

[Figure 5.2: System interaction matrix. Horizontal axis: databox.]

The move of one actuator causes a change in about 10 CD measurements around the actuator. Although $B$ has full column rank, its condition number is very large: $\operatorname{cond}(B) = 4.0705 \times 10^4$.

5.4.2 Simulation with real model and generated data

In this chapter, three actuator fault detection algorithms were proposed based on a theoretical model of the system. The three algorithms are referred to as the decoupling method, the general method and the sectioning method. Before these algorithms are applied to real industrial data, they are compared by detecting faults in generated data.

In the following simulation study, data were generated with eqn. 5.20 using the real system parameters. After a fault occurs in an actuator, the position of the actuator will either stay at the position of the last scan, slowly drift to zero, or suddenly drop to zero. These cases are called the jammed fault, the incipient fault and the abrupt fault respectively. The proposed algorithms are applied to detect these three types of faults. The system was excited for 2000 time units by a vector of white noise with a covariance of 0.01. A chi-square value was calculated at every 25th unit. The chi-square value was based on the past 100 units (i.e. a moving window of 100 units was used). At the 1001st unit, an artificial fault was introduced at the 25th actuator by fixing the position of the actuator. This is similar to a jammed actuator.
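The three fault types described above can be sketched as a small helper that turns a sequence of setpoints for one actuator into the positions it would actually take. This is an illustrative sketch only: the function name is hypothetical, and the exponential drift with factor `decay` is an assumed form of "slowly drift to zero", since the text does not specify the drift law.

```python
def apply_fault(setpoints, fault_start, kind, decay=0.95):
    """Return actual actuator positions for one actuator.

    Before `fault_start` the actuator follows its setpoints. From
    `fault_start` onwards it is either held at the last healthy position
    ("jammed"), forced to zero ("abrupt"), or multiplied by `decay` at
    every step ("incipient"; the exponential drift is an assumption).
    """
    if kind not in ("jammed", "abrupt", "incipient"):
        raise ValueError("unknown fault type: " + kind)
    actual = list(setpoints[:fault_start])
    last_healthy = setpoints[fault_start - 1] if fault_start > 0 else 0.0
    position = last_healthy
    for _ in range(fault_start, len(setpoints)):
        if kind == "jammed":
            position = last_healthy
        elif kind == "abrupt":
            position = 0.0
        else:
            position = position * decay
        actual.append(position)
    return actual


setpoints = [1.0, 2.0, 3.0, 4.0, 5.0]
jammed = apply_fault(setpoints, 3, "jammed")
abrupt = apply_fault(setpoints, 3, "abrupt")
incipient = apply_fault(setpoints, 3, "incipient", decay=0.5)
print(jammed, abrupt, incipient)
```

In a simulation of the kind described here, the faulty positions would replace the setpoints of the affected actuator when the model output is generated, while the detection algorithm continues to see the original setpoints.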
Figures 5.3 to 5.7 show the results of fault detection using the different algorithms.

Case 1: Actuator fault detection using the decoupling method

[Figure 5.3: Detection of a jammed actuator at the 25th actuator using the decoupling method. Horizontal axes: time, unit; time, 25 units. The chi-square panel marks the original threshold.]

Though the position of the 25th actuator was fixed to a constant value, figure 5.3 shows no obvious change in the chi-square value of the normalized residual of the SISO system containing the 25th actuator. This indicates that the fault was not detected. The explanation is the high condition number of the interaction matrix $B$. Numerical calculation of the inverse of $B$ introduces large numerical error. When the inverse is used to decouple the system, this error leads to a large error in the system outputs defined in eqn. 5.17. The error overwhelms the change caused by actuator faults and makes the faults undetectable.

Case 2: Actuator fault detection using the general method

[Figure 5.4: Detection of a jammed actuator at the 25th actuator using the general method. Horizontal axes: time, unit; time, 25 units.]

After the 2500th time unit, more chi-square values exceed the threshold than before, which may be explained by the presence of a fault in the system. But the increase in the chi-square values is not significant. The explanation is as follows: the primary residuals are calculated as the product of the estimation error and the derivative of the estimation error with respect to the system parameters. In the general method, the fault index $f$ was treated as a vector of $n$ parameters of the system.
The primary residual and the normalized residual have the same dimension as the system outputs. The normalized residual is therefore not sensitive to the fault of a single actuator. To address this, one can base the decision on the rate at which chi-square values exceed the threshold rather than on a single chi-square value, which is effectively equivalent to increasing the length of the moving window.

Case 3: Actuator fault detection using the sectioning method

[Figure 5.5: Detection of a jammed actuator at the 25th actuator using the sectioning method]

The significant change of the chi-square value in Figure 5.5 indicates an actuator fault in region 5. The algorithm successfully detected the fault of a single actuator. The other two types of actuator faults cause a more drastic change in the actuator position and are therefore easier for the algorithm to detect. Figures 5.6 and 5.7 show the detection results.

[Figure 5.6: Detection of an abrupt fault at the 25th actuator using the sectioning method (axes: time, unit; time, 25 units)]

[Figure 5.7: Detection of an incipient fault at the 25th actuator using the sectioning method (axis: time, unit)]

5.4.3 Simulation with real system data

In the following simulation studies, closed-loop data obtained from a CD control system were used to assess the performance of the proposed algorithms. As mentioned before, manually disabling the actuators would deteriorate the control performance. The alternative is to simulate the actuator faults by modifying the output data.
Suppose that $u(t)$ is the vector of setpoints for the actuators and $\bar{u}(t)$ is the vector of actual actuator positions. In the fault-free case, $\bar{u}(t)$ equals $u(t)$. Actuator faults cause a difference between $\bar{u}(t)$ and $u(t)$, which in turn changes the system output. Let

$$u_e(t) = \bar{u}(t) - u(t)$$

With the system model 5.3, the change of the system output due to actuator faults can be calculated as

$$y_e(t) = a\,y_e(t-1) + B\,u_e(t-d) \qquad (5.21)$$

The actual measurements are then

$$\bar{y}(t) = y_e(t) + y(t) \qquad (5.22)$$

In practice, only the faulty output $\bar{y}(t)$ and the control signal $u(t)$ can be obtained. They are used to detect the simulated actuator faults. Since the sectioning method was shown to increase the sensitivity of the detection algorithm, it is used in the following simulations.

A set of raw data of 400 scans was obtained from a CD control system. There were 50 actuators and 1500 outputs. The outputs were averaged over every 30 points, giving 50 averaged outputs. The CD component was extracted from the raw data using the EXPO algorithm mentioned in Section 3.1. Figure 5.2 shows that a move of a single actuator affects about 10 outputs around it. Therefore, it is reasonable to divide the actuators into 5 sections of 10 actuators each. For each section, the sensitivity matrix and the covariance matrix were estimated from the extracted CD component and the input data. A threshold was also calculated using the algorithms proposed in Chapter 3. Because of the limited amount of data, a moving window of 8 scans was used.

As mentioned in Section 4.1, there are three types of actuator faults, distinguished by the position of the actuator after the fault. The most common one is the jammed fault, where the position of the actuator is fixed at its value in the last scan. The sectioning method was first used to detect this type of fault. The position of the 25th actuator was fixed from the 200th scan onwards.
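Equations 5.21 and 5.22 amount to filtering the actuator error through the process model and adding the result to the recorded output. The following is a minimal sketch under stated assumptions: the recursion in eqn. 5.21 is taken to use the previous scan $y_e(t-1)$, $y_e$ starts at zero, and the actuator error is zero before the first scan. The function name `inject_actuator_fault` and the small matrices in the usage example are hypothetical and are not the real process data.

```python
def inject_actuator_fault(y, u, u_actual, B, a=0.9432, d=2):
    """Superimpose the effect of actuator faults on recorded outputs.

    y        : recorded outputs, one list per scan
    u        : actuator setpoints, one list per scan
    u_actual : actual actuator positions, one list per scan
    B        : interaction matrix as a list of rows (outputs x actuators)
    Returns the faulty outputs y(t) + y_e(t), following eqns. 5.21 and 5.22.
    """
    n_out, n_act = len(B), len(B[0])
    y_e_prev = [0.0] * n_out                 # assumed initial condition y_e = 0
    y_faulty = []
    for t in range(len(y)):
        if t - d >= 0:
            # u_e(t-d) = actual position minus setpoint
            u_e = [ua - us for ua, us in zip(u_actual[t - d], u[t - d])]
        else:
            u_e = [0.0] * n_act              # no actuator error before the first scan
        # y_e(t) = a * y_e(t-1) + B * u_e(t-d)
        y_e = [a * y_e_prev[i] + sum(B[i][j] * u_e[j] for j in range(n_act))
               for i in range(n_out)]
        y_faulty.append([y[t][i] + y_e[i] for i in range(n_out)])
        y_e_prev = y_e
    return y_faulty


# Usage: two actuators, two outputs, actuator 0 drops to zero from scan 1.
B = [[1.0, 0.0], [0.0, 1.0]]
y = [[0.0, 0.0] for _ in range(4)]
u = [[1.0, 1.0] for _ in range(4)]
u_actual = [[1.0, 1.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0]]
y_faulty = inject_actuator_fault(y, u, u_actual, B, a=0.5, d=1)
```

In the usage example the fault appears in the first output one scan after it occurs (the delay d = 1) and then builds up through the first-order dynamics, while the second output is unaffected.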
A chi-square value was calculated for every scan. Figure 5.8 shows the result of the chi-square test in section 3.

[Figure 5.8: Detection of a jammed actuator at the 25th actuator using the sectioning method (panel: position of the 25th actuator; axis: time, scan)]

[Figure 5.9: Detection of jammed actuators in section 3 using the sectioning method (axis: time, scan)]

There is no significant change in the chi-square value of the normalized residual of the 3rd section, so the actuator fault was not detected. In another test, all the actuators in section 3 (the 21st to 30th actuators) were fixed at the 200th scan. Figure 5.9 shows no significant change in the chi-square value, and the faults were not detected.

In some cases, the position of an actuator gradually drops to zero after a fault occurs. This simulates a minor fault that persists in the process for a period of time. Since the change in the actuator position is larger than for the first type of fault, one can anticipate that such a fault will be easier to detect. Figures 5.10 and 5.11 show the chi-square value of the normalized residual of section 3.

[Figure 5.10: Detection of an incipient fault at the 25th actuator using the sectioning method (panel: position of the 25th actuator; legend: revised threshold, original threshold; axis: time, scan)]

[Figure 5.11: Detection of incipient faults at the 23rd to 27th actuators using the sectioning method (panel: 25th CD measurement; axis: time, scan)]

Figure 5.10 shows that the algorithm cannot detect the fault of a single actuator (the 25th actuator).
When the number of faulty actuators was increased to 5, some chi-square values exceeded the threshold, which indicates faults in section 3. However, the peak of the chi-square values around the 210th scan was not significant enough to distinguish it from possible peaks caused by disturbances.

For the third type of fault, the position of the actuator suddenly drops to zero after the failure. Figure 5.12 shows the chi-square value of the normalized residual of section 3.

[Figure 5.12: Detection of an abrupt fault at the 25th actuator using the sectioning method (panels: position of the 25th actuator; 25th CD measurement; legend: revised threshold, original threshold; axis: time, scan)]

The algorithm successfully detected the fault of a single actuator in the presence of model uncertainty and disturbance.

5.4.4 Discussion of the simulation results

With generated data, the proposed algorithm can detect all types of faults at a single actuator. With real process data, however, the algorithm can only detect an abrupt fault at an actuator; it failed to detect slowly developing faults. The explanation lies in the noise and the model uncertainties. Figure 5.13 shows the prediction of one of the system outputs using the given system model and system inputs.

[Figure 5.13: Prediction of the system output using the system inputs and system model (panels: actual value of the 25th CD output; prediction of the 25th CD output; axis: time, scan)]

The actual value of the 25th CD output is very noisy, and the difference between the actual value and the prediction of the 25th CD output shows that the system model is very poor.
The inaccuracy of the system model and the noise in the system output make it difficult to detect slight faults of the actuators.

5.5 Conclusions

In this chapter, algorithms based on the local approach were proposed to detect actuator faults in CD control systems. A CD control system is usually of high dimension, and its interaction matrix is ill-conditioned. Simulation studies show that dividing the system output into sections improves the sensitivity of the detection algorithm. When the algorithm was applied to simulated data, different kinds of faults of a single actuator could be detected. Real process data were obtained from a CD control system to assess the proposed algorithms. In this case, only the abrupt failure of an actuator could be detected. An explanation could be that the interaction matrix B is usually inaccurate and the output data are very noisy, so that the effect of an incipient fault of an actuator is concealed by the noise and the model uncertainty. If the system noise overwhelms the information about the fault, it is impossible to detect the fault. Therefore, improving the detection of such faults depends on denoising the output data.

Chapter 6 Conclusions and Future work

6.1 Conclusions

This thesis addresses the application of fault detection and isolation using the local approach. As model uncertainties are unavoidable in practical systems and lead to false alarms in fault detection, the robustness of the local approach with respect to model uncertainties was investigated. An algorithm was proposed to recalculate the threshold value so as to reduce the false alarm rate to the preset value. A similar algorithm was also proposed to calculate the threshold to accommodate regular parameter fluctuations. Simulation results show that the revised threshold can reduce the false alarm rate to a value close to the pre-assigned value while maintaining the ability to detect faults in the presence of both parameter uncertainties and parameter fluctuations.
The robustness scheme was also compared with the upgrading scheme through simulations, and it was concluded that the robustness scheme is more effective than the upgrading scheme in reducing the false alarm rate. The proposed algorithms were also applied to a pilot-scale tank level and temperature control system. Although the revised threshold was found to reduce the false alarm rate, it was not large enough to decrease the false alarm rate to the preset value. This was probably due to the low-frequency disturbances in the system.

Since most fault detection algorithms are applied to closed-loop systems, closed-loop detection using the local approach was investigated. When closed-loop data are used to calculate the primary and the normalized residuals, the correlation between the input and the output causes the covariance matrix to lose full rank. The dimension reduction method used a linear transformation to reduce the dimension of the normalized residual so that the covariance matrix of the revised normalized residual is of full rank. With closed-loop data, fault detection algorithms can only detect those parameter changes that affect the closed-loop dynamics. The indirect detection method used the closed-loop model instead of the open-loop model to calculate the primary residual and the normalized residual. By detecting the changes of the closed-loop parameters, the method also detected the changes of the open-loop parameters. Simulation studies showed that both the dimension reduction method and the indirect detection method detected the change of every single parameter of a second-order linear process operated in closed loop. The implementation of the proposed algorithms on the tank level and temperature control system proved to be effective. In the simulations, the dimension reduction method and the indirect detection method were found to be equivalent in performance.
Finally, real process data obtained from a CD control system were used to assess the applicability of the local approach. A CD control process is usually of high dimension, and its interaction matrix is ill-conditioned. Simulation studies show that dividing the system output into sections improves the sensitivity of the detection algorithm. When the sectioning method was applied to simulated data, different types of faults of a single actuator could be detected. The proposed algorithm was also applied to real process data. In this case, an abrupt fault of a single actuator was detected. However, freezing (jammed) faults and incipient faults could not be detected because of the noise and the inaccuracy of the process model.

6.2 Future work

A future research topic may be to improve the robustness of the local approach with respect to low-frequency disturbances. Possible solutions are either to further increase the threshold value or to take the low-frequency disturbances into account in the process model.

In CD control systems in the paper-making process, the interaction matrix B is usually inaccurate and the output data are very noisy. The effect of an incipient fault of an actuator is concealed if the system noise overwhelms the information about the fault. Therefore, the improvement lies in denoising the process data.

References

Basseville, M., "Detecting changes in signals and systems - a survey," Automatica, Vol. 24, No. 3, pp. 309-326, 1988.
Basseville, M. and I. V. Nikiforov, "Detection of abrupt change: theory and application," Prentice Hall, Englewood Cliffs, New Jersey, 1993.
Basseville, M., "On-board component fault detection and isolation using the statistical local approach," Automatica, Vol. 34, No. 11, pp. 1391-1415, 1998.
Benveniste, A., Basseville, M. and G. V. Moustakides, "The asymptotic local approach to change detection and model validation," IEEE Transactions on Automatic Control, Vol. AC-32, No. 7, July 1987.
Bloch, G., Ouladsine, M. and Thomas, P., "On-line fault diagnosis of dynamic systems via robust parameter estimation," Control Eng. Practice, Vol. 3, No. 12, pp. 1709-1717, 1995.
Delyon, B. and Benveniste, A., "On the relationship between identification and local tests," Proceedings of the 36th Conference on Decision & Control, San Diego, California, USA, Dec. 1997.
Demetriou, M. A. and Polycarpou, M. M., "Incipient fault diagnosis of dynamical systems using online approximators," IEEE Transactions on Automatic Control, Vol. 43, No. 11, Nov. 1998.
Dumont, G. A., "Control Techniques in the Pulp and Paper Industry," in "Ad. Ind. System", C. T. Leondes, Ed., Control and Dynamic Systems, Vol. 37, Academic Press, New York (1990), pp. 65-114.
Emami-Naeini, A., M. M. Akhter and S. M. Rock, "Effect of model uncertainty on failure detection: the threshold selector," IEEE Transactions on Automatic Control, Vol. 33, No. 12, Dec. 1988.
Fang, X., Gertler, J., Kunwer, M., Heron, J. and Barkana, T., "A double-threshold robust method for fault detection & isolation in dynamic systems," Proceedings of the American Control Conference, June 1994.
Frank, P. M., "Fault Diagnosis in Dynamic Systems Using Analytical and Knowledge-based Redundancy - A Survey and Some New Results," Automatica, Vol. 26, No. 3, pp. 459-474, 1990.
Frisk, E. and Nielsen, L., "Robust residual generation for diagnosis including a reference model for residual behavior," 14th World Congress of IFAC, 1999.
Gertler, J. J., "Survey of model-based failure detection and isolation in complex plants," IEEE Control Systems Magazine, 1988.
Ghofraniha, J., "Cross directional response modeling, identification and control of dry weight profile on paper machines," Ph.D. Thesis, Department of Electrical Engineering, University of British Columbia.
Gorinevsky, D. M., Heaven, E. M., Sung, C. and Kean, M., "Integrated tool for intelligent identification of CD process alignment shrinkage and dynamics," Pulp and Paper Canada, Vol. 99, No. 2, pp. 40-60, 1998.
Gustafsson, F. and Graebe, S. F., "Closed-loop performance monitoring in the presence of system changes and disturbances," Automatica, Vol. 34, No. 11, pp. 1311-1326, 1998.
Huang, B., "Process and control loop performance monitoring through detection of abrupt parameter changes," Proceedings of the 1999 IEEE Canadian Conference on Electrical and Computer Engineering, Edmonton, Alberta, Canada, May 1999.
Huang, Biao and E. C. Tamayo, "Model Validation For Industrial Model Predictive Control Systems," Chemical Engineering Science, 55(12), pp. 2315-2327, 2000.
Huang, Biao, "On-Line Closed-loop Model Validation and Detection of Abrupt Parameter Changes," to appear in Journal of Process Control, accepted May 24, 2000.
Huang, Biao, "The role of data prefiltering for integrated identification and model predictive control," Proceedings of the 1999 IFAC World Congress, Beijing, July 5-9, 1999.
Isermann, R., "Fault diagnosis of machines via parameter estimation and knowledge processing - tutorial paper," Automatica, Vol. 29, No. 4, pp. 815-835, 1993.
Isermann, R. and P. Balle, "Trends in the application of model-based fault detection and diagnosis of technical process," 1996 IFAC 13th Triennial World Congress, San Francisco, USA.
Kristinsson, K., "Cross directional control of basis weight on paper machines using Gram polynomials," Ph.D. Thesis, Department of Electrical Engineering, University of British Columbia.
Kwok, E. and Li, P., "MD-CD Interaction Modeling and Control of Paper Machines," Canadian Journal of Chemical Engineering, Vol. 75, No. 1, pp. 143-151, Feb. 1997.
Lai, T. L. and Shan, J. Z., "Efficient recursive algorithms for detection of abrupt changes in signals and control systems," IEEE Transactions on Automatic Control, Vol. 44, No. 5, May 1999.
Li, Jing, "Adaptive control of sheet caliper on paper machines," M.A.Sc. Thesis, Department of Electrical and Computer Engineering, University of British Columbia.
Ljung, L. and Forssell, U., "An alternative motivation for the indirect approach to closed-loop identification," IEEE Transactions on Automatic Control, Vol. 44, No. 11, Nov. 1999.
Lou, X. C., Willsky, A. S. and Verghese, G. C., "Optimally robust redundancy relations for failure detection in uncertain systems," Automatica, Vol. 22, No. 3, pp. 333-344.
Malladi, D. P. and Speyer, J. L., "A generalized Shiryayev sequential probability ratio test for change detection and isolation," IEEE Transactions on Automatic Control, Vol. 44, No. 8, August 1999.
Nikoukhah, R., "Innovations generation in the presence of unknown inputs: application to robust failure detection," Automatica, Vol. 30, No. 12, pp. 1851-1867, 1994.
O'Reilly, P. G., "Detection of sensor decalibration using the asymptotic local approach," Electronics Letters, Vol. 34, No. 21, pp. 2022-2023, Oct. 1998.
Patton, R. J. and M. Hou, "On Sensitivity of Robust Fault Detection Observers," 14th World Congress of IFAC, Beijing, P.R. China, 1999.
Poolla, K., Khargonekar, P., Tikku, A., Krause, J. and Nagpal, K., "A time-domain approach to model validation," IEEE Transactions on Automatic Control, Vol. 39, No. 5, May 1994.
Rank, M. L. and Niemann, H., "Norm based threshold selection for fault detectors," Proceedings of the American Control Conference, Philadelphia, Pennsylvania, June 1998.
Sauter, D. and Hamelin, R., "Frequency-domain optimization for robust fault detection and isolation in dynamic system," IEEE Transactions on Automatic Control, Vol. 44, No. 4, April 1999.
Shen, L. C. and Hsu, P. L., "Robust design of fault isolation observers," Automatica, Vol. 34, No. 11, pp. 1421-1429, 1998.
Van Den Hof, P. M. J. and Schrama, R. J. P., "An indirect method for transfer function estimation from closed loop data," Automatica, Vol. 29, No. 6, pp. 1523-1527, 1993.
Wahnon, E. and Berman, N., "Tracking algorithm designed by the local asymptotic approach," IEEE Transactions on Automatic Control, Vol. 35, No. 4, April 1990.
White, H. E. and J. L. Speyer, "Detection filter design: spectral theory and algorithms," IEEE Transactions on Automatic Control, Vol. AC-32, No. 12, pp. 1106-1115, Dec. 1988.
Willsky, A. S., "A Survey of Design Methods for Failure Detection in Dynamic System," Automatica, Vol. 12, pp. 601-611, 1976.
Zhang, Q. and Basseville, M., "Monitoring nonlinear dynamical systems: a combined observer-based and local approach," Proceedings of the 37th IEEE Conference on Decision & Control, Tampa, Florida, USA, Dec. 1998.
Zhang, Q., M. Basseville and A. Benveniste, "Fault detection and isolation in nonlinear dynamic system: a combined input-output and local approach," Automatica, Vol. 34, No. 11, pp. 1359-1373, 1998.
Zhang, Q., "Using nonlinear black-box models in fault detection," Proceedings of the 35th Conference on Decision and Control, Kobe, Japan, Dec. 1996.
Zhang, Q., Basseville, M. and Benveniste, A., "Early warning of slight changes in systems," Automatica, Vol. 30, No. 1, pp. 95-113, 1994.
Item Metadata
Title | Fault detection and isolation using the local approach |
Creator | Cheng, Lechang |
Date Issued | 2000 |
Extent | 5542551 bytes |
Genre | Thesis/Dissertation |
Type | Text |
FileFormat | application/pdf |
Language | eng |
Date Available | 2009-07-09 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
DOI | 10.14288/1.0058817 |
URI | http://hdl.handle.net/2429/10560 |
Degree | Master of Applied Science - MASc |
Program | Chemical and Biological Engineering |
Affiliation | Applied Science, Faculty of; Chemical and Biological Engineering, Department of |
Degree Grantor | University of British Columbia |
GraduationDate | 2000-11 |
Campus | UBCV |
Scholarly Level | Graduate |
AggregatedSourceRepository | DSpace |