System Identification, Control Algorithms and Control Interval for the Box-Jenkins Dynamic Model Structure

by Ky Minh Vu
M. Eng., McMaster University, 1980

A Thesis Submitted in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy in The Faculty of Graduate Studies, Department of Chemical and Bio-Resource Engineering

We accept this thesis as conforming to the required standard

The University of British Columbia
June, 1997
© Ky Minh Vu, 1997

In presenting this thesis in partial fulfillment of the requirements for an advanced degree at the University of British Columbia, I agree that the library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the Head of the Department of Chemical and Bio-Resource Engineering or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written consent or permission.

Department of Chemical and Bio-Resource Engineering
The University of British Columbia
Vancouver, B.C., Canada

Abstract

The Box-Jenkins model of a discrete control system has been studied. First the transfer function model was identified numerically via the derivatives of the variance of the disturbance, then the disturbance model was identified via the derivative of the variance of the generating white noise. Once the model had been identified, an approach to obtain the optimal gains of a discrete PID controller was suggested. In an adaptive environment, the recursive least determinant self-tuning controller was designed to calculate the best possible control action based only on the presumed orders of the controller and without knowledge of the delay. The problem of determining a possible slower control interval for a control loop was solved by modelling a skipped ARIMA through matrix algebra and a robust numerical procedure.
Contents

Abstract
Contents
List of Figures
List of Tables
Acknowledgements

1 Introduction
  1.1 Introduction
  1.2 The Research Areas of the Thesis
  1.3 The Objectives of the Thesis
  1.4 The Contribution of the Thesis
  1.5 Outline of the Thesis

2 The Discrete Control System
  2.1 Introduction
  2.2 The Discrete Control System
  2.3 The System Models
    2.3.1 The Box-Jenkins Model
    2.3.2 The Åström Model
    2.3.3 The State-Space Model
    2.3.4 Model Advantages and Disadvantages
  2.4 Conclusion

3 Identification
  3.1 Introduction
  3.2 Identification of the Box-Jenkins Model
    3.2.1 Nonparametric Methods
    3.2.2 Parametric Methods
  3.3 Identification of the Transfer Function
    3.3.1 The Linear Least Squares Theory
    3.3.2 Identification Methods
    3.3.3 The Semi-Analytical Approach
  3.4 Identification of the ARIMA
    3.4.1 Methods of Identification for an ARMA
    3.4.2 The Semi-Analytical Approach
  3.5 Identification of the Transfer Function-ARIMA
  3.6 Identification of the Predictor Form
    3.6.1 The Predictor Form
    3.6.2 Closed-Loop Data
  3.7 Examples
  3.8 Conclusion

4 Controllers
  4.1 Introduction
  4.2 The Minimum Variance Controller
  4.3 The Linear Quadratic Gaussian Controller
  4.4 The PID Controller
    4.4.1 The Time Series Variance Formulae
    4.4.2 The Minimum Variance PID Controller
    4.4.3 The Linear Quadratic Gaussian PID Controller
    4.4.4 The Pole Placement PID Controller
    4.4.5 Examples
  4.5 The Self Tuning Controller
    4.5.1 The Recursive Least Squares Self Tuning Controller
    4.5.2 Convergence of the RLS Self-Tuning Algorithm
    4.5.3 The Recursive Least Determinant Self Tuning Controller
    4.5.4 Convergence of the RLD Self-Tuning Algorithm
    4.5.5 Effect of Model Mismatch
    4.5.6 Simulation Examples
  4.6 Conclusion

5 Control Interval
  5.1 Introduction
  5.2 The Sampling and Controlling Rates
    5.2.1 Sampling Too Slow
    5.2.2 Sampling Too Fast
  5.3 The Control Interval
    5.3.1 Literature Survey
    5.3.2 The Optimal Control Interval
    5.3.3 Examples
  5.4 Conclusion

6 Conclusion and Recommendations
  6.1 Conclusion
  6.2 Summary of the Thesis
  6.3 Recommendations
    6.3.1 The Nonlinear Stochastic Control System
    6.3.2 The Self Correcting Controller

Nomenclature
Appendices
  A. Mathematical Results
  B. Computer Programs
Bibliography

List of Figures

Figure 2.1. Conventional Block Diagram of a Feedback Control System
Figure 2.2. Modified Block Diagram of a Feedback Control System
Figure 3.1. Correlation Paths of a Feedback Control Loop
Figure 3.2. Input-Output Series of Identified System
Figure 3.3. Parameters of Compared Systems
Figure 3.4. PE Estimated Parameters of Compared Systems
Figure 3.5. SA Estimated Parameters of Compared Systems
Figure 3.6. Comparison of Estimated Disturbance Variances
Figure 4.1. Gain Estimation of a Delayed System
Figure 4.2. Gain Estimation of a Nonstationarily Disturbed System
Figure 4.3. Gain Estimation of a Nonminimum Phase System
Figure 4.4. Responses from PID Feedback
Figure 4.5. Block Diagram of a Self-Tuning Controller
Figure 4.6. Exponential Convergence of the RLD Self-Tuning Algorithm
Figure 4.7. Convergence of the RLD Self-Tuning Algorithm
Figure 4.8. Self-Tuning of a Correctly Estimated System
Figure 4.9. Self-Tuning of an Underestimated Order (n) System
Figure 4.10. Self-Tuning of an Underestimated Order (n) Underdamped System
Figure 4.11. Self-Tuning of an Underestimated Order (m) System
Figure 4.12. Self-Tuning of an Overestimated Order (n) System
Figure 4.13. Self-Tuning of an Overestimated Delay System
Figure 4.14. Self-Tuning of an Underestimated Delay System

List of Tables

Table 3.1. Statistics of the Generated White Noise
Table 3.2. Statistics of the Obtained White Noise
Table 3.3. Model Comparison
Table 4.1. LQG PID Controller Design

Acknowledgements

The author would like to express his deepest and sincere appreciation to Dr. Patrick Tessier and Dr. Guy A. Dumont for their guidance, support and encouragement throughout the course of this research. He also wishes to thank Dr. Paul A. Watkinson and NSERC for financial support. He would like to express his sincere thanks to Ms. Rita Penco for her assistance in the PAPRICAN library and literature survey. He would also like to express his thanks to the system managers, Messrs. Rick Morrison, Reid Turner and Brian D. McMillan, for their help with the computer network. Last but not least, he would like to thank countless friends and colleagues and the staff of PAPRICAN for making the period he spent at the Pulp and Paper Centre a pleasant time.

Chapter 1

Introduction

1.1 Introduction

Perhaps the two largest industries that employ a majority of chemical engineers are the oil and the pulp and paper industries. Recently, there have been some setbacks in both industries in Canada. The oil industry has slumped because cars are more energy efficient and houses are better insulated. It also faces the prospect of being replaced by other sources of energy, such as electric cars or a revival of the coal industry, in the next century. The pulp and paper industry in Canada has faced some difficulties because of competition from the developing countries and tougher environmental laws. Like the oil industry, it might also face less demand for its products in the developed countries. To make a profit or even to survive, these industries need to refine their technologies. Costs must be cut and products must be of the highest quality. Chemical plants and pulp and paper mills are required to run at optimal conditions and with the highest efficiency.
One of the areas that can improve quality, boost productivity and cut cost is the area of process identification and control. The technology of this area can be improved. So in the following, we will discuss ways to improve the technology of identification and control.

1.2 The Research Areas of the Thesis

To build a controller for a process, we need to know the system dynamics. A good controller requires a good knowledge of the system in the control bandwidth. To have a good knowledge of the system, we need good techniques to identify it. From the time when Gauss, K. F. (1809) introduced the method of linear least squares, this method has been the cornerstone of identification. However, this method does not work well for stochastic control systems.

A stochastic control system is a control system that is disturbed by random correlated sequences, and it has two parts. The dynamic part is governed by a linear relationship between the input and output variables. This relationship can be described by a rational transfer function. Similarly to the dynamic part, the stochastic part has a rational transfer function with an uncorrelated sequence (white noise) as the input. This stochastic part disturbs the system from the steady state. It is known as an AutoRegressive Integrated Moving Average (ARIMA) time series. In this form, the stochastic control system is known as a Box-Jenkins model control system. Identification of a Box-Jenkins model control system means we have to obtain both the rational transfer function and the ARIMA for the system.

The two well-known methods used to identify the Box-Jenkins model so far are the maximum likelihood and the prediction error methods (Åström, K. J. (1980)). The maximum likelihood method, because of the requirement of a normal distribution of the residuals, has fallen out of favour, leaving the prediction error method as the method of choice for identification of this Box-Jenkins model.
If only the transfer function of the Box-Jenkins model is desired, the output error method is the method to use. By themselves, the prediction error and the output error methods are not wrong or poor theoretically. But because the solution is usually obtained from a numerical procedure (Gauss-Newton), better algorithms should be used for a more accurate solution. Also, the prediction error method gives no information on the order of the system and the number of parameters. A 2-stage least squares approach from the semi-analytical method, introduced in this thesis, will be able to determine the number of parameters. By setting the derivatives to zero, the semi-analytical method removes the linear parameters from the iteration equation and gains additional accuracy in the process.

The PID is the most commonly used controller in the process industry. From the day of its introduction as an analogue controller to the recent self-contained self-tuning PID controller, there has been a substantial literature. From the earliest paper of Ziegler, J. G. and Nichols, N. B. (1942) to the very recent paper by Cluett, W. R. and Wang, L. (1996), research on the PID controller is still alive. This is not because the research of the past was poor, but because modern technology enables sophistication. New sets of tuning rules have been suggested for better closed-loop performance. However, a large part of the research on the PID controller has been on the analogue or continuous PID controller, in many cases assuming simple models of the control system. Fast digital control loops are now replacing the analogue control loops, and new technology for tuning digital PID controllers must be provided. The Box-Jenkins model is a very general model to describe a discrete (digital) control system, and techniques to design PID controllers for this model are desirable.
The self-tuning controller can correct its gains to meet the control criterion of minimum variance, or of minimum variance with a constraint on the variance of the input variable. Since the pioneering paper by Åström, K. J. and Wittenmark, B. (1973), various extensions have been proposed. For example, the work of Clarke, D. W. and Gawthrop, P. J. (1975) is one such advance. It is a minimum variance self-tuning controller with a constraint on the input variance, i.e. a linear quadratic Gaussian self-tuning controller. The pole-zero placement self-tuning controller is another (Wellstead, P. E. et al. (1977)). However, the self-tuning controller can be further improved. A self-tuning controller requires three parameters: the delay of the system ($f$), the number of past control input variables it remembers ($m$) and the number of past controlled output variables it remembers ($n$). Of these parameters, $m$ must always be greater than or equal to $f$, so $m$ partly depends on $f$. So if $m$ is chosen, $f$ can be eliminated. Techniques to eliminate the need to know the delay of the system must be provided to enhance the self-tuning algorithm.

The choice of the control interval has occasionally been made by rule of thumb and experience. However, to determine whether a control loop can be controlled more slowly, the performance of the loop at the two control intervals must be compared, and the decision based on these performances. Since the closed-loop performance can be determined just by the delay of the system and the model of the ARIMA disturbance, the problem becomes the problem of modelling a skipped ARIMA. MacGregor, J. F. (1976) obtained the autoregressive part of the skipped ARIMA by first obtaining the roots of the autoregressive part of the original series and raising these roots to the correct power, then transforming these roots back to the right form for the autoregressive part of the skipped ARIMA.
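The root-based construction described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the thesis's or MacGregor's code: the function name `skipped_ar_polynomial` and the skip factor `k` are hypothetical, the autoregressive polynomial is taken as $1 + c_1 z^{-1} + \dots + c_p z^{-p}$, and only the autoregressive part is handled (the moving average part is not).

```python
import numpy as np

def skipped_ar_polynomial(phi, k):
    """Autoregressive polynomial of a series observed every k-th sample.

    phi : coefficients [1, c1, ..., cp] of 1 + c1*z^-1 + ... + cp*z^-p
          (hypothetical input convention for this sketch).
    k   : skip factor (keep one observation out of k).
    """
    phi = np.asarray(phi, dtype=float)
    # Roots in z of z^p + c1*z^(p-1) + ... + cp.
    poles = np.roots(phi)
    # Raise each root to the k-th power, then rebuild a monic polynomial.
    skipped = np.poly(poles ** k)
    # Complex roots come in conjugate pairs, so the coefficients are real
    # up to rounding error.
    return np.real(skipped)

# AR(1) with coefficient 0.5, observed every 2nd sample.
print(skipped_ar_polynomial([1.0, -0.5], 2))
# AR(2) with roots 0.8 and 0.5, observed every 2nd sample.
print(skipped_ar_polynomial([1.0, -1.3, 0.4], 2))
```

The sketch makes the two conversions explicit: coefficients to roots, then powered roots back to coefficients.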
This approach requires more labor and some accuracy will be lost. The moving average part of the skipped ARIMA is obtained by solving a number of equations. This again will require more work. Matrix algebra to obtain the autoregressive parameters and a robust numerical algorithm to obtain the moving average parameters will reduce labor and increase accuracy.

1.3 The Objectives of the Thesis

The purpose of this thesis is to provide process control engineers with improved algorithms to design feedback controllers for a discrete stochastic control system. To be more specific, it provides an easy and efficient algorithm to identify a stochastic control system model. From this model, the thesis provides an easy way to design a PID controller. For adaptive systems, the thesis provides an improved self-tuning algorithm. It also improves an existing method to determine the control interval of a discrete control system.

1.4 The Contribution of the Thesis

In summary, the contributions of this thesis can be separated into three areas: identification, control algorithms and control interval. In identification, this thesis introduces formulae relating the parameters at optimal conditions. These formulae not only reduce the dimension, or number of parameters, in the search for optimality but can also be used to test the system parameters obtained from other methods. The semi-analytical method in this thesis is more accurate than the prediction error method in the identification of the Box-Jenkins model of a control system, particularly when the number of parameters in the pole and moving average polynomials is smaller than the number in the transmission zero and autoregressive polynomials. In control algorithms, this thesis suggests a numerical approach to obtaining the minimum variance and linear quadratic Gaussian gains for a PID controller. It introduces a different concept to self-tuning control. This concept is called recursive least determinant.
This concept eliminates the need to know the delay in the self-tuning algorithm. In the work on the control interval, this thesis improves an existing way to choose the control interval. With this improved approach, we can solve similar problems in statistics such as modelling a temporal aggregate ARIMA.

1.5 Outline of the Thesis

The thesis is organized as follows. In chapter 2, we briefly discuss a discrete stochastic control system and its model representations. In chapter 3, the identification method for a rational transfer function and an ARIMA is discussed. In chapter 4, we discuss stochastic controllers. In chapter 5, an approach to determine the control interval of a discrete control system is discussed. Chapter 6 concludes the contribution of the research project with some recommendations for future work. The necessary mathematical derivations and computer programs for the algorithms are included in the appendices of the thesis.

Chapter 2

The Discrete Control System

2.1 Introduction

In the early days of process control, most processes were under feedback with an analogue PID controller. Many control loops are still under this kind of control today. These are the low level, independent and simple control loops. With the advent of the digital computer, more important control loops are under digital feedback. The signals from these loops are digitized by a sampler and fed to the digital computer, normally called the process computer. This has many advantages. First, it might be economically sound, because one small reentrant subroutine in the software can be used by a number of control loops. More importantly, the loops may be interactive. Information stored in the computer can be shared by all the loops. This provides an opportunity for a process engineer to carry out more sophisticated control designs such as feedforward and feedback, ratio and decoupling control.
Parallel to this and of equal importance is the optimization strategy in plants and mills. In most plants and mills in North America, the process computer has disappeared and been replaced by a DCS (Distributed Control System). A DCS usually includes many control loops and each loop represents a discrete controlled process.

2.2 The Discrete Control System

A linear feedback discrete control system can be described by the block diagram in Figure 2.1. This is the conventional block diagram of a feedback discrete control system. In this block diagram, if $\mathit{ysp}_t$ is constant and the disturbance $n_t$ is not, we say the system is subject to a stochastic disturbance and the controller is a regulator. This is the regulating case, because the controller regulates the system at a specific point, the set point. In the opposite case, when $n_t = 0$ and $\mathit{ysp}_t$ is not constant, the controller is a deterministic controller. This is the servomechanism or tracking case, because the controller is designed to track the set point. However, the difference between these two cases is minor and there is a duality between them (MacGregor, J. F. et al. (1984)). In the following, we will try to unify these two cases. We redraw the block diagram of Figure 2.1 and present it in the block diagram of Figure 2.2.

[Figure 2.1. Conventional Block Diagram of a Feedback Control System.]

[Figure 2.2. Modified Block Diagram of a Feedback Control System.]

From Figure 2.2, we can write

$$y_{t+f+1} - \mathit{ysp}_{t+f+1} = \frac{\omega(z^{-1})}{\delta(z^{-1})}\,u_t + n_{t+f+1} - \mathit{ysp}_{t+f+1} \tag{2.1}$$

$$\phantom{y_{t+f+1} - \mathit{ysp}_{t+f+1}} = \frac{\omega(z^{-1})}{\delta(z^{-1})}\,u_t + \frac{\theta(z^{-1})}{\phi(z^{-1})}\,a_{t+f+1} - \mathit{ysp}_{t+f+1} \tag{2.2}$$

[Equation (2.3) is not legible in the source.]

Now we consider the case in which either $\mathit{ysp}_{t+f+1}$ or $n_{t+f+1}$ must be zero. Then, as far as the controller is concerned, the two problems of tracking and regulating are the same if

$$\mathit{ysp}_{t+f+1} = -n_{t+f+1} \tag{2.4}$$

The minus sign is due to the fact that the disturbances are opposite in nature. In the deterministic case, we assume the system is at steady state and then we bring it to a new level. In the stochastic case, we assume the system has been disturbed and is at a level different from the desired level, and the purpose of the controller is to bring the system back to the desired level. As mentioned earlier, the tracking and regulating problems are equivalent if their disturbance models are equivalent. But with the regulating case, or stochastic control, we are equipped with more tools for analysis and design. Therefore, we will discuss only the stochastic control system in this thesis. Because of their duality, the word controller will occasionally be seen in place of the word regulator in this thesis.

2.3 The System Models

2.3.1 The Box-Jenkins Model

With $\mathit{ysp}_{t+f+1} = 0$, Equation (2.2) gives us

$$y_{t+f+1} = \frac{\omega(z^{-1})}{\delta(z^{-1})}\,u_t + \frac{\theta(z^{-1})}{\phi(z^{-1})}\,a_{t+f+1}$$

In the above equation, the roots (in $z^{-1}$) of the polynomial $\omega(z^{-1})$ are called the zeros of the system, while the roots of the polynomial $\delta(z^{-1})$ are called the poles of the system. The polynomials $\omega(z^{-1})$ and $\delta(z^{-1})$ are assumed coprime, i.e. they do not have a common root. If all the roots of $\omega(z^{-1})$ are outside the unit circle, then the system is said to be minimum phase. In the opposite case, if one of the roots is inside the unit circle, then the system is said to be non-minimum phase. Similarly, if one of the roots of $\delta(z^{-1})$ is inside the unit circle, the system is said to be open-loop unstable. If one of the roots of $\phi(z^{-1})$ is on the unit circle, the system is said to be disturbed by a nonstationary disturbance. All the roots of $\theta(z^{-1})$ are forced to be outside the unit circle from modelling.
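The root conditions above lend themselves to a quick numerical check. The following is a minimal sketch, assuming each polynomial is supplied as a list of coefficients in ascending powers of $z^{-1}$; the helper names `backshift_roots` and `classify` and the tolerance `tol` are hypothetical and do not come from the thesis.

```python
import numpy as np

def backshift_roots(coeffs):
    """Roots in x = z^-1 of c0 + c1*x + ... + cn*x^n (ascending coefficients)."""
    c = np.asarray(coeffs, dtype=float)
    # np.roots expects the highest power first, so reverse the order.
    return np.roots(c[::-1])

def classify(omega, delta, phi, tol=1e-6):
    """Apply the root conditions of the text to omega, delta and phi."""
    r_omega = np.abs(backshift_roots(omega))
    r_delta = np.abs(backshift_roots(delta))
    r_phi = np.abs(backshift_roots(phi))
    return {
        # all zeros outside the unit circle
        "minimum_phase": bool(np.all(r_omega > 1 + tol)),
        # all poles outside the unit circle
        "open_loop_stable": bool(np.all(r_delta > 1 + tol)),
        # at least one root of phi on the unit circle
        "nonstationary_disturbance": bool(np.any(np.abs(r_phi - 1) <= tol)),
    }

# omega = 1 - 0.5 z^-1, delta = 1 - 0.8 z^-1, phi = 1 - z^-1 (an integrator)
first = classify([1.0, -0.5], [1.0, -0.8], [1.0, -1.0])
# omega = 1 - 2 z^-1, delta = 1 - 1.25 z^-1, phi = 1 - 0.5 z^-1
second = classify([1.0, -2.0], [1.0, -1.25], [1.0, -0.5])
print(first)
print(second)
```

Roots exactly on the unit circle are treated within a small tolerance, since computed roots are rarely exact.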
Since the first coefficients in the polynomials $\delta(z^{-1})$, $\theta(z^{-1})$ and $\phi(z^{-1})$ are unity, they are said to be monic.

2.3.2 The Åström Model

Another equally well-known model is the Åström model, which is described mathematically below:

$$\alpha(z^{-1})\,y_t = \beta(z^{-1})\,u_{t-k} + \gamma(z^{-1})\,a_t \tag{2.7}$$

This model is sometimes called the ARMAX model because it is an ARMA time series with an eXogenous input $u_t$. The polynomials $\alpha(z^{-1})$, $\beta(z^{-1})$ and $\gamma(z^{-1})$ are assumed to be coprime. $\alpha(z^{-1})$ is monic, while $\beta(z^{-1})$ is normally not. If $\gamma(z^{-1})$ is monic, then the variance of $a_t$ is normally not unity. If $\beta(z^{-1})$ has a root inside the unit circle, then the system is said to be non-minimum phase. If $\alpha(z^{-1})$ does not have all its roots outside the unit circle, the system is said to be open-loop unstable. As in the case of the Box-Jenkins model, $\gamma(z^{-1})$ will have all its roots outside the unit circle from modelling.

2.3.3 The State-Space Model

A stochastic control system also has a state-space representation. It can be written as below.

$$\mathbf{x}_{t+1} = A\,\mathbf{x}_t + \mathbf{b}\,u_{t-f} + \mathbf{w}_t \tag{2.8}$$

$$y_t = \mathbf{c}^T\mathbf{x}_t + v_t \tag{2.9}$$

This model can be put into the following predictive form:

$$\hat{\mathbf{x}}_{t+1|t} = A\,\hat{\mathbf{x}}_{t|t-1} + \mathbf{b}\,u_{t-f} + \mathbf{k}\,a_t \tag{2.10}$$

$$y_t = \mathbf{c}^T\hat{\mathbf{x}}_{t|t-1} + a_t \tag{2.11}$$

In the above equations, the matrix $A$ is called the state transition matrix. The vector $\mathbf{b}$ is called the input or control vector. The vector $\mathbf{c}$ is called the output vector. The vector $\mathbf{x}_t$ is a vector of the state variables. If the vector $\mathbf{k}$ is the steady state Kalman gain vector, then the vector $\hat{\mathbf{x}}_{t+1|t}$ is the optimal one step ahead state estimator. If $A$ does not have all its eigenvalues inside the unit circle, the system is open-loop unstable. The white noise $\mathbf{w}_t$ is the process noise and the white noise $v_t$ is the measurement noise. The process noise and the measurement noise are not correlated.
The white noise $a_t$ is sometimes called the innovation sequence, because each $a_t$ is an innovation. For this reason, the second state space representation is sometimes called an innovation (state space) representation.

2.3.4 Model Advantages and Disadvantages

Now we will discuss the models' advantages as well as their disadvantages. The Box-Jenkins model is more rational. It separates the process dynamics and the disturbance. The disturbance can be stochastic, for example an ARIMA time series. In the deterministic case, the disturbance can be a set point change. This model also uses the least number of parameters. Since the Box-Jenkins model uses rational transfer functions, it can be clumsy for identification and control in the case of multivariable systems. The state space model is very convenient for this case. The Åström model can be loosely considered as the intermediate model when we convert the Box-Jenkins model to the state space model. As an intermediate model, it does not present any difficulty in the multivariable case. In the single input single output case, it is easier to identify, but it does not give as clear an insight into the physical system as the Box-Jenkins model does. Since we can easily convert the Box-Jenkins model to other models, we will choose this model in this thesis as the more general case.

2.4 Conclusion

The Box-Jenkins model is a well-known model for discrete control systems; however, due to its complexity, only a relatively small amount of research has used it. Progress in control theory, for example the self-tuning controller, has often used the Åström model. In multivariable control theory, the state space model has reigned supreme. This thesis is an effort to make the Box-Jenkins model representation more popular. All the research work will use this model.
This is not due to some special preference for this model by the author, but because the Box-Jenkins model can be considered as the basic model from which other models can be derived. Once a model is chosen, we can proceed with control problems such as identification, controller design and choice of control interval.

Chapter 3

Identification

3.1 Introduction

Identification is the first step in the design of a controller, since we must know at least some information about a process before we design a controller for it. Assuming a linear input-output model, identification can be considered synonymous with parameter estimation: a procedure to obtain estimates of the parameters of the system from a record of the process. Identification dates back to the time of Gauss, K. F., when he introduced the method of least squares. In this chapter, we will examine some well-known identification methods and introduce our semi-analytical approach.

3.2 Identification of the Box-Jenkins Model

Identification can be classified into two categories: parametric and nonparametric. In a parametric approach, we obtain the values of the parameters according to some criterion. The number of parameters is assumed known a priori. In a nonparametric approach, not only the values of the parameters but also the number of parameters must be obtained. The model has not been parameterized. In the following, we will discuss a few well-known methods of both types.

3.2.1 Nonparametric Methods

There are a few nonparametric methods and they will be discussed in the following. The simplest method is the analysis of the transient of either the step or the impulse response. But this method only works well for deterministic and low order systems. The PRBS (Pseudo Random Binary Signal) gave an enormous boost to nonparametric impulse response estimation in the mid '60s (Wellstead, P.
(1981)), but later it came under criticism because of possible damage inflicted on the hardware of the control system by the PRBS. The frequency analysis method requires a long test time for the transient effect to die out, and the input signal must be a sinusoid. Theoretically, the discrete input variable $u_t$ cannot exactly form a sinusoid and hence the method is normally used in continuous systems. The frequency analysis can give information on a single sinusoid. For multiple frequencies, the tool is Fourier analysis. Perhaps the most complicated nonparametric method to use is the spectral analysis method. In this method, one gets a spectrum to examine with a window. The input signal does not have to be a sinusoid. Historically, frequency, Fourier and spectral analysis were statistical tools to look for harmonics in a time series. Their generalization to identification has been neither very successful nor popular.

The nonparametric methods that have been used quite often are the transient analysis of step response data and the cross-correlation analysis of impulse response data. From the step response data, the dead time, time constant and steady state gain of a first order system may be estimated. For more general models, cross-correlation analysis has been used with impulse response data for identification. Box, G. E. P. and Jenkins, G. M. (1970) used the cross-correlation function between the input and output variables to identify the transfer function from the impulse response weights. The authors also suggested prewhitening the input variable for a more accurate cross-correlation. This is preferred to the PRBS, because as mentioned above the PRBS was criticized for possible damage to the system. Sinha, N. K. et al.
(1978) suggested a similar cross-correlation analysis to calculate the Markov parameters of a multivariable system, which are the impulse response weights of a single input single output system, and manipulated the Hankel matrix to obtain the state space model of the transfer function. Usually, nonparametric methods are used to obtain the number of parameters and the initial estimates of the system parameters. These initial estimates are then used in a parametric method to produce finer estimates.

3.2.2 Parametric Methods

In a parametric method, we assume that the model has been parameterized and the total number of parameters is known. There are a number of these methods known in the control literature. We will list them here for a brief reference, and will discuss them more later. The oldest one, the (linear) least squares method, was introduced by the German mathematician Gauss, K. F. (1809). The method can give the solution in a closed form, both recursively and nonrecursively. Like the method of linear least squares, the method of instrumental variables (Young, P. (1970)) can give the solution in a closed form recursively. However, it has the disadvantage of having to choose the instrumental variables. The more commonly used method is the method of maximum likelihood. This method has the disadvantage of not giving the solution in a closed form like the method of least squares or instrumental variables. The solution is usually obtained from a numerical optimization procedure. One method which is very close to the maximum likelihood method is the prediction error method (Åström, K. J. (1980)). As of today, this is the most used method and it is found in many identification software packages. Another method, which is less well-known than the prediction error method but usually implemented similarly, is the output error method.
This method is useful in the identification of only the rational transfer function. Needless to say, the problem with the parametric methods is that the number of parameters must be known. We will discuss these well-known methods in more detail in the next section, in order to see their weaknesses when applied to the identification of the Box-Jenkins model and the reason for our research.

3.3 Identification of the Transfer Function

3.3.1 The Linear Least Squares Theory

The method of least squares is the oldest statistical method for parameter estimation. The method was introduced by Carl Friedrich Gauss in 1809 in his masterpiece "Theoria Motus Corporum Coelestium in Sectionibus Conicis Solem Ambientium" (Theory of the Motion of the Heavenly Bodies Revolving around the Sun in Conic Sections), where he discussed the determination of the elliptical orbit of a planetary body. To avoid confusion, the method is sometimes called linear least squares, because the model is a linear combination of the estimated parameters. The method is also known as linear regression. We consider the following system model:

$$ y_t = \omega_0 u_{t-f-1} + \omega_1 u_{t-f-2} + n_t \tag{3.1} $$

with its estimation problem:

$$ \min_{\omega_0,\omega_1} S_N = \sum_{t=f+2}^{N} \left(y_t - \omega_0 u_{t-f-1} - \omega_1 u_{t-f-2}\right)^2 \tag{3.2} $$

The variable $y_t$ is the regressee or the regressand and the variables $u_{t-f-1}$, $u_{t-f-2}$ are the regressors. If we take the derivatives of $S_N$ with respect to $\omega_0$ and $\omega_1$ and set them to zero, we obtain

$$ \sum_{t=f+2}^{N} \left(y_t - \hat\omega_0 u_{t-f-1} - \hat\omega_1 u_{t-f-2}\right) u_{t-f-1} = 0 \tag{3.3} $$

$$ \sum_{t=f+2}^{N} \left(y_t - \hat\omega_0 u_{t-f-1} - \hat\omega_1 u_{t-f-2}\right) u_{t-f-2} = 0 \tag{3.4} $$

or

$$ \sum_{t=f+2}^{N} u_{t-f-1}\, n_t = 0 \tag{3.5} $$

$$ \sum_{t=f+2}^{N} u_{t-f-2}\, n_t = 0 \tag{3.6} $$

From the last two equations, we can write

$$ \frac{1}{N-f-1} \sum_{t=f+2}^{N} u_{t-f-1}\, n_t = 0 \tag{3.7} $$

$$ \frac{1}{N-f-1} \sum_{t=f+2}^{N} u_{t-f-2}\, n_t = 0 \tag{3.8} $$

Now if we can assume that $n_t$ has a constant zero mean and we can replace the average sums in the above equations by the expectation operator, then the statistical meaning of the above two equations is that the series $u_t$ does not crosscorrelate with $n_t$ at lags $f+1$ and $f+2$:

$$ \gamma_{un}(f+1) = E\{u_{t-f-1}\, n_t\} = 0 \tag{3.9} $$

$$ \gamma_{un}(f+2) = E\{u_{t-f-2}\, n_t\} = 0 \tag{3.10} $$

The method of instrumental variables that we will discuss later makes use of this property of the least squares method. Note that we have not made any assumption about the statistical properties of $n_t$ other than that it has a constant zero mean. It can be autocorrelated to any finite lag and can be normally or uniformly distributed. The residual of the output error method has this statistical property.

Now if, in Equation (3.1), we replace $u_{t-f-2}$ by $y_{t-f-1}$ as below

$$ y_t = \omega_0 u_{t-f-1} + \omega_1 y_{t-f-1} + n_t \tag{3.11} $$

then we get the following equations:

$$ \gamma_{un}(f+1) = 0 \tag{3.12} $$

$$ \gamma_{yn}(f+1) = 0 \tag{3.13} $$

With the second equation above, we can no longer say that $n_t$ can be autocorrelated to any lag, because if it is, it might crosscorrelate with $y_t$ at lag $f+1$ and this equation will not be true. If $n_t$ is autocorrelated only up to lag $f$, this equation will be true. The prediction error method makes use of this fact, and we can have predictors that predict $y_t$ up to $f+1$ steps ahead.

Now if $n_t$ is white, then the above equations will be true. The maximum likelihood method requires this characteristic from the residual of the model. From the above discussion, we can see that all these methods have one statistical property in common: the residual does not crosscorrelate with the regressor variables. This property comes from the minimization of the sum of squares of the residuals.
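The orthogonality property in Equations (3.5) and (3.6) can be checked numerically. The following is a minimal sketch, not part of the original derivation: the function names, the simulated coefficients (1.5 and 0.8), the delay and the moving-average disturbance are all assumptions chosen for illustration. It fits the two-regressor model of Equation (3.1) through the 2x2 normal equations and returns the two residual cross-products, which should vanish to rounding error even though the disturbance is coloured.

```python
import random


def fit_two_lag_model(y, u, f):
    """Least squares fit of y_t = w0*u[t-f-1] + w1*u[t-f-2] + n_t.

    Returns the two estimates and the two residual cross-products
    that the normal equations force to zero.
    """
    # Use every t for which both lagged inputs exist (0-based indexing).
    rows = [(u[t - f - 1], u[t - f - 2], y[t]) for t in range(f + 2, len(y))]
    s11 = sum(a * a for a, _, _ in rows)
    s12 = sum(a * b for a, b, _ in rows)
    s22 = sum(b * b for _, b, _ in rows)
    r1 = sum(a * c for a, _, c in rows)
    r2 = sum(b * c for _, b, c in rows)
    det = s11 * s22 - s12 * s12
    w0 = (r1 * s22 - r2 * s12) / det  # Cramer's rule on the normal equations
    w1 = (s11 * r2 - s12 * r1) / det
    resid = [c - w0 * a - w1 * b for a, b, c in rows]
    cross1 = sum(a * e for (a, _, _), e in zip(rows, resid))
    cross2 = sum(b * e for (_, b, _), e in zip(rows, resid))
    return w0, w1, cross1, cross2


def simulate_two_lag_data(n_obs=500, f=1, seed=0):
    """Hypothetical data: white input, coloured disturbance independent of it."""
    rng = random.Random(seed)
    u = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
    white = [rng.gauss(0.0, 0.5) for _ in range(n_obs + 1)]
    n = [white[t + 1] + 0.6 * white[t] for t in range(n_obs)]  # MA(1) noise
    y = [0.0] * n_obs
    for t in range(f + 2, n_obs):
        y[t] = 1.5 * u[t - f - 1] + 0.8 * u[t - f - 2] + n[t]
    return y, u, f


y, u, f = simulate_two_lag_data()
w0, w1, cross1, cross2 = fit_two_lag_model(y, u, f)
print("estimates:", w0, w1)
print("residual cross-products:", cross1, cross2)
```

The cross-products are zero by construction of the estimate, whatever the colour of the disturbance; whether the estimates themselves are unbiased is a separate question that depends on the regressors being uncorrelated with the disturbance.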
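3.3.2 Identification Methods

Now we will see if we can apply some of what we discussed above to identify our transfer function from the model

\begin{align}
y_t &= \frac{\omega(z^{-1})}{\delta(z^{-1})}\, u_{t-f-1} + n_t \tag{3.14}\\
&= \frac{\omega_0 - \sum_{i=1}^{r} \omega_i z^{-i}}{1 - \sum_{i=1}^{s} \delta_i z^{-i}}\, u_{t-f-1} + n_t \tag{3.15}
\end{align}

which is known as the output error model.

The Least Squares Method

One way to apply the (linear) least squares method to our problem is to perform a long division of the transfer function polynomials to get the following equations:

\begin{align}
y_t &= \frac{\omega(z^{-1})}{\delta(z^{-1})}\, u_{t-f-1} + n_t \tag{3.16}\\
&\approx \sum_{i=0}^{m} \beta_i u_{t-f-1-i} + n_t \tag{3.17}\\
&= \begin{bmatrix} u_{t-f-1} & u_{t-f-2} & \cdots & u_{t-f-1-m} \end{bmatrix}
\begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_m \end{bmatrix} + n_t \tag{3.18}\\
&= x_t^T \beta + n_t \tag{3.19}
\end{align}

then obtain the optimal parameters $\beta_i$ as below

$$
\begin{bmatrix} \hat\beta_0 \\ \hat\beta_1 \\ \vdots \\ \hat\beta_m \end{bmatrix}
=
\begin{bmatrix}
\sum u_{t-f-1}^2 & \sum u_{t-f-1} u_{t-f-2} & \cdots & \sum u_{t-f-1} u_{t-f-1-m} \\
 & \sum u_{t-f-2}^2 & \cdots & \sum u_{t-f-2} u_{t-f-1-m} \\
 & & \ddots & \vdots \\
\text{sym.} & & & \sum u_{t-f-1-m}^2
\end{bmatrix}^{-1}
\begin{bmatrix} \sum y_t u_{t-f-1} \\ \sum y_t u_{t-f-2} \\ \vdots \\ \sum y_t u_{t-f-1-m} \end{bmatrix}
\tag{3.20}
$$

where every sum runs over $t = f+2, \ldots, N$. In matrix form, the least squares solution to the above equation can be written as

$$ \hat\beta = [X^T X]^{-1} X^T y \tag{3.21} $$

where the matrices $X$ and $y$ are defined as

$$
X = \begin{bmatrix} x_{f+2}^T \\ x_{f+3}^T \\ \vdots \\ x_N^T \end{bmatrix}, \qquad
y = \begin{bmatrix} y_{f+2} \\ y_{f+3} \\ \vdots \\ y_N \end{bmatrix}
\tag{3.22}
$$

The parameters $\beta_i$ are the impulse response weights or the Markov parameters. The problem that we might have is that the division is usually long and we have to invert a matrix of large dimension, which is normally ill-conditioned and hence gives poor results for the estimated parameters $\beta_i$. Even though retrieving the $\omega_i$ and $\delta_i$ from the $\beta_i$ is possible, this step usually introduces further error.

Another way to apply least squares theory to our problem is to multiply both sides of Equation (3.15) by the polynomial $\delta(z^{-1})$ to obtain the following equation:

$$ \delta(z^{-1})\, y_t = \omega(z^{-1})\, u_{t-f-1} + \delta(z^{-1})\, n_t \tag{3.23} $$

or

$$ y_t = \sum_{i=1}^{s} \delta_i y_{t-i} + \omega_0 u_{t-f-1} - \sum_{i=1}^{r} \omega_i u_{t-f-1-i} + n_t - \sum_{i=1}^{s} \delta_i n_{t-i} \tag{3.24} $$

Now, we do not know $n_{t-i}$, $i > 0$, at the time of regressing. However, if we calculate the parameters recursively, then we can approximate $n_{t-i}$ with the estimated parameters from the last estimation.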
Even though this approach does not work all the time, it has been seen to work for some systems, and the approach has a number of names, such as pseudolinear regression, approximate maximum likelihood and the extended least squares method.

In the above least squares estimation, if the disturbance $n_t$ is white, then the estimation is always unbiased. If it is coloured, then the result depends on whether the loop is closed or not. If the data is open-loop, then $u_t$ and $n_t$ are not correlated, because they are disjoint, and the estimation will be unbiased. On the other hand, if the data is closed-loop, the estimation is usually biased. We will discuss closed-loop data further in a later section.

The Instrumental Variables Method

The method of instrumental variables is relatively new. It was introduced by Peter Young in 1970. To apply this method to our estimation problem, we rewrite the model as

\begin{align}
y_t &= \sum_{i=1}^{s} \delta_i y_{t-i} + \omega_0 u_{t-f-1} - \sum_{i=1}^{r} \omega_i u_{t-f-1-i} + n_t - \sum_{i=1}^{s} \delta_i n_{t-i} \tag{3.25}\\
&= \sum_{i=1}^{s} \delta_i y_{t-i} + \omega_0 u_{t-f-1} - \sum_{i=1}^{r} \omega_i u_{t-f-1-i} + e_t \tag{3.26}\\
&= x_t^T \theta + e_t \tag{3.27}
\end{align}

Now if we regress $y_t$ on $x_t$ and obtain the optimal parameter vector $\hat\theta$ as

$$ \hat\theta = [X^T X]^{-1} X^T y \tag{3.28} $$

then the parameters will be biased, because $x_t$ is correlated with $e_t$. The problem can be seen as follows:

\begin{align}
\hat y_t &= y_t - e_t \tag{3.29}\\
&= x_t^T \theta \tag{3.30}
\end{align}

The matrix solution for the parameters $\theta$ of the above equation is given by

\begin{align}
\theta &= [X^T X]^{-1} X^T \hat y \tag{3.31}\\
&= [X^T X]^{-1} X^T y - [X^T X]^{-1} X^T e \tag{3.32}
\end{align}

where

$$
X = \begin{bmatrix} x_{f+2}^T \\ x_{f+3}^T \\ \vdots \\ x_N^T \end{bmatrix}, \quad
\hat y = \begin{bmatrix} \hat y_{f+2} \\ \hat y_{f+3} \\ \vdots \\ \hat y_N \end{bmatrix}, \quad
y = \begin{bmatrix} y_{f+2} \\ y_{f+3} \\ \vdots \\ y_N \end{bmatrix}, \quad
e = \begin{bmatrix} e_{f+2} \\ e_{f+3} \\ \vdots \\ e_N \end{bmatrix}
\tag{3.33}
$$

The concept of the instrumental variables method is to choose a variable vector $z_t$ in place of the variable vector $x_t$ such that it does not correlate with the residual $e_t$, i.e. such that the second term in Equation (3.32) becomes zero. The parameters are then given by

$$ \hat\theta = [Z^T X]^{-1} Z^T y \tag{3.34} $$

with

$$ Z = \begin{bmatrix} z_{f+2}^T \\ z_{f+3}^T \\ \vdots \\ z_N^T \end{bmatrix} \tag{3.35} $$

The variable vector $z_t$ is the vector of the instrumental variables. The problem with this method is the choice of the instrumental variables.
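The contrast between Equations (3.28) and (3.34) can be illustrated with a small simulation. The sketch below is an illustration under stated assumptions, not the thesis's procedure: it uses a hypothetical first-order output error system (delta = 0.7, omega = 1.0, zero delay), regressors $x_t = (y_{t-1}, u_{t-1})$, and delayed inputs $z_t = (u_{t-2}, u_{t-1})$ as one possible choice of instruments. All function names are made up for the example.

```python
import random


def solve_2x2(a11, a12, a21, a22, b1, b2):
    """Solve a 2x2 linear system by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det


def estimate(y, u, use_instruments):
    """Estimate (delta, omega) in y_t = delta*y[t-1] + omega*u[t-1] + e_t.

    With use_instruments=False this is ordinary least squares (Z = X);
    with True, the instruments are z_t = (u[t-2], u[t-1]).
    """
    a11 = a12 = a21 = a22 = b1 = b2 = 0.0
    for t in range(2, len(y)):
        x = (y[t - 1], u[t - 1])
        z = (u[t - 2], u[t - 1]) if use_instruments else x
        a11 += z[0] * x[0]
        a12 += z[0] * x[1]
        a21 += z[1] * x[0]
        a22 += z[1] * x[1]
        b1 += z[0] * y[t]
        b2 += z[1] * y[t]
    return solve_2x2(a11, a12, a21, a22, b1, b2)


def instrument_products(y, u, delta, omega):
    """Cross-products between the instruments and the equation residual."""
    p1 = p2 = 0.0
    for t in range(2, len(y)):
        e = y[t] - delta * y[t - 1] - omega * u[t - 1]
        p1 += u[t - 2] * e
        p2 += u[t - 1] * e
    return p1, p2


def simulate_output_error(n_obs=5000, delta=0.7, omega=1.0, seed=1):
    """Hypothetical open-loop data: x_t = delta*x[t-1] + omega*u[t-1], y = x + n."""
    rng = random.Random(seed)
    u = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
    x = [0.0] * n_obs
    for t in range(1, n_obs):
        x[t] = delta * x[t - 1] + omega * u[t - 1]
    y = [x[t] + rng.gauss(0.0, 0.5) for t in range(n_obs)]
    return y, u


y, u = simulate_output_error()
print("least squares        :", estimate(y, u, use_instruments=False))
print("instrumental variable:", estimate(y, u, use_instruments=True))
```

In this setup the equation residual $e_t = n_t - \delta n_{t-1}$ is correlated with the regressor $y_{t-1}$ but not with past inputs, so one would expect the least squares estimate of delta to be pulled away from 0.7 while the instrumental variables estimate stays close to it.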
The Maximum Likelihood Method

The method of maximum likelihood is attributed to Fisher, R. A. (1956), but the idea was also known to Gauss, K. F. To use the method, we have to be able to express the probability density function involving the estimated parameters. From this probability density function, we can derive an expression for the likelihood function, then maximize it (hence the name maximum likelihood) to obtain the desired parameters. We have

$$ n_t = y_t - x_t^T \theta \tag{3.36} $$

Assuming a normal distribution for $n_t$ with zero mean and variance $\sigma^2$, we have the probability density function for $n_t$ as below:

$$ p(n_t) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left[-\frac{n_t^2}{2\sigma^2}\right] \tag{3.37} $$

If all the $n_t$ are independent, the joint probability density function for all the $n_t$ is given by

\begin{align}
p(n_t, n_{t+1}, \cdots) &= p(n_t) \times p(n_{t+1}) \times \cdots \tag{3.38}\\
&= \frac{1}{(\sqrt{2\pi}\,\sigma)^N} \exp\!\left[-\frac{\sum n_t^2}{2\sigma^2}\right] \tag{3.39}\\
&= L(\omega_i, \delta_i) \tag{3.40}
\end{align}

The above function is called the likelihood function. Since it is a product of probability densities, its value is always positive. If the distribution has a constant variance $\sigma^2$, which must be the case for a stationary series $n_t$, then the likelihood function has its greatest value when the exponent term is maximum, i.e. when

$$ S = \sum_{t=f+2}^{N} n_t^2 \tag{3.41} $$

is minimum. The first difficulty in applying the maximum likelihood method to our problem is that the values of the sequence $n_t$ are not independent; they are usually correlated. Secondly, nothing can be known about the probability distribution of $n_t$ before identification. Now if the residuals $a_t$ can be assumed to be normally distributed, and since $a_t$ is white, we can write

\begin{align}
p(a_t, a_{t+1}, \cdots) &= p(a_t) \times p(a_{t+1}) \times \cdots \tag{3.42}\\
&= \frac{1}{(\sqrt{2\pi}\,\sigma_a)^N} \exp\!\left[-\frac{\sum a_t^2}{2\sigma_a^2}\right] \tag{3.43}
\end{align}

Reasoning as before, we can say that the right hand side of the above equation is maximum when

$$ S_N = \sum_t \left[\frac{\phi(z^{-1})}{\theta(z^{-1})} \left(y_t - \frac{\omega(z^{-1})}{\delta(z^{-1})}\, u_{t-f-1}\right)\right]^2 \tag{3.44} $$

is minimum.
So the identification criterion for the Box-Jenkins model is

$$ \min_{\omega,\,\delta,\,\theta,\,\phi} \; \sum_t \left[\frac{\phi(z^{-1})}{\theta(z^{-1})} \left(y_t - \frac{\omega(z^{-1})}{\delta(z^{-1})}\, u_{t-f-1}\right)\right]^2 \tag{3.45} $$

But this means both the transfer function and the disturbance models are identified at the same time. It will be difficult to determine the orders of the polynomials $\omega(z^{-1})$, $\delta(z^{-1})$, $\phi(z^{-1})$ and $\theta(z^{-1})$.

The Prediction Error Method

In the paper by Åström, K. J. (1980), the residual $a_t$ of the maximum likelihood method discussed above is interpreted as the prediction error and we can write

\begin{align}
y_t &= \hat y(t|t-1) + a_t \tag{3.46}\\
&= E\{y_t \,|\, t-1\} + a_t \tag{3.47}
\end{align}

Now if $n_t$ in our model is white, then

$$ \hat y(t|t-1) = \frac{\omega(z^{-1})}{\delta(z^{-1})}\, u_{t-f-1} \tag{3.48} $$

If $n_t$ is not white, which is the usual case, then by defining

$$ \frac{\theta(z^{-1})}{\phi(z^{-1})} = 1 + \frac{\gamma(z^{-1})}{\phi(z^{-1})}\, z^{-1} \tag{3.49} $$

we can write

\begin{align}
y_t &= \frac{\omega(z^{-1})}{\delta(z^{-1})}\, u_{t-f-1} + \frac{\theta(z^{-1})}{\phi(z^{-1})}\, a_t \notag\\
&= \frac{\omega(z^{-1})}{\delta(z^{-1})}\, u_{t-f-1} + \frac{\gamma(z^{-1})}{\phi(z^{-1})}\, a_{t-1} + a_t \notag\\
&= \frac{\omega(z^{-1})}{\delta(z^{-1})}\, u_{t-f-1} + \frac{\gamma(z^{-1})}{\theta(z^{-1})} \left[y_{t-1} - \frac{\omega(z^{-1})}{\delta(z^{-1})}\, u_{t-f-2}\right] + a_t \notag\\
&= \frac{\omega(z^{-1})\,\phi(z^{-1})}{\delta(z^{-1})\,\theta(z^{-1})}\, u_{t-f-1} + \frac{\gamma(z^{-1})}{\theta(z^{-1})}\, y_{t-1} + a_t \notag\\
&= \hat y(t|t-1) + a_t \tag{3.50-3.55}
\end{align}

Like the method of maximum likelihood, it will be very difficult for us to determine the orders of the polynomials involved in the predictor if we want to use the prediction error method. It cannot identify only the transfer function unless the disturbance $n_t$ is white.

The Output Error Method

The output error method does not give us complicated polynomials to identify, because the equation it uses is

$$ y_t = \frac{\omega(z^{-1})}{\delta(z^{-1})}\, u_{t-f-1} + n_t \tag{3.56} $$

The method does not require any special statistical property of $n_t$ except that it has a constant zero mean and does not crosscorrelate with $u_t$. It should be the favorite method for us to identify the transfer function of the Box-Jenkins model. However, the output error method does have a weakness: it does not exploit the fact that the parameters in the polynomial $\omega(z^{-1})$ are linear parameters and can be removed from the numerical identification algorithm for more accuracy.
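In practice, it has been implemented recursively (Dugard, L. and Landau, I. D. (1980)) or similarly to the prediction error method, as in the function oe of the MATLAB system identification toolbox.

We can summarize the methods discussed above as follows. The (linear) least squares method has an accuracy problem. The instrumental variables method has the problem of choosing the instrumental variables. The methods of maximum likelihood and prediction error cannot identify only the rational transfer function. The output error method should be the method of choice, but it should be enhanced by removing the linear parameters from the numerical identification algorithm for more accuracy.

3.3.3 The Semi-Analytical Approach

As discussed above, all the methods considered have their roots in, or are related to, the method of least squares. To identify the Box-Jenkins model, we can use a two-stage (rational) least squares approach. We identify the rational transfer function by the least (sum of) squares of the disturbance series $n_t$ and identify the disturbance ARIMA model by the least squares of the white noise $a_t$. For lack of a better name, we will call our method the semi-analytical method, because only half of the parameters can be obtained analytically. To identify the rational transfer function, we write the model as follows:

$$ y_t = \frac{\omega_0 - \omega_1 z^{-1} - \cdots - \omega_r z^{-r}}{1 - \delta_1 z^{-1} - \cdots - \delta_s z^{-s}}\, u_{t-f-1} + n_t \tag{3.57} $$

and the identification criterion can be written as:

\begin{align}
\hat S_N &= \min_{\delta,\,\omega} S_N \tag{3.58}\\
&= \min_{\delta,\,\omega} \sum_{t=r+f+2}^{N} n_t^2 \tag{3.59}\\
&= \min_{\delta,\,\omega} \sum_{t=r+f+2}^{N} \left(y_t - \frac{\omega_0 - \omega_1 z^{-1} - \cdots - \omega_r z^{-r}}{1 - \delta_1 z^{-1} - \cdots - \delta_s z^{-s}}\, u_{t-f-1}\right)^2 \tag{3.60}
\end{align}

with

$$
\delta = \begin{bmatrix} \delta_1 \\ \delta_2 \\ \vdots \\ \delta_s \end{bmatrix}, \qquad
\omega = \begin{bmatrix} \omega_0 \\ -\omega_1 \\ \vdots \\ -\omega_r \end{bmatrix}
\tag{3.61}
$$

Now if we define

\begin{align}
x_t &= y_t - n_t \tag{3.62}\\
&= \frac{\omega_0 - \omega_1 z^{-1} - \cdots - \omega_r z^{-r}}{1 - \delta_1 z^{-1} - \cdots - \delta_s z^{-s}}\, u_{t-f-1} \tag{3.63}
\end{align}

then we can write

\begin{align}
x_N - \delta_1 x_{N-1} - \cdots - \delta_s x_{N-s} &= \omega_0 u_{N-f-1} - \omega_1 u_{N-f-2} - \cdots - \omega_r u_{N-f-r-1} \tag{3.64}\\
x_{N-1} - \delta_1 x_{N-2} - \cdots - \delta_s x_{N-s-1} &= \omega_0 u_{N-f-2} - \omega_1 u_{N-f-3} - \cdots - \omega_r u_{N-f-r-2} \tag{3.65}\\
&\;\;\vdots \notag\\
x_{r+f+2} - \delta_1 x_{r+f+1} - \cdots - \delta_s x_{r+f+2-s} &= \omega_0 u_{r+1} - \omega_1 u_r - \cdots - \omega_r u_1 \tag{3.66}
\end{align}

Now if the system is completely at steady state before we collect data, all $u_t$ for $t \in (-\infty, 0]$ will be zero. This means we have

\begin{align}
x_{r+f+1} &= \frac{\omega_0 - \omega_1 z^{-1} - \cdots - \omega_r z^{-r}}{1 - \delta_1 z^{-1} - \cdots - \delta_s z^{-s}}\, u_r \notag\\
&= \sum_{i=0}^{r-1} \beta_i u_{r-i} \tag{3.67-3.69}
\end{align}

and we can safely assume

\begin{align}
x_{r+f+1} &= 0 \tag{3.70}\\
x_{r+f} &= 0 \tag{3.71}\\
&\;\;\vdots \notag\\
x_{r+f+2-s} &= 0 \tag{3.72}
\end{align}

then we can write

$$
\begin{bmatrix}
1 & -\delta_1 & \cdots & -\delta_s & & \\
 & 1 & -\delta_1 & \cdots & -\delta_s & \\
 & & \ddots & \ddots & & \ddots \\
 & & & & 1 & -\delta_1 \\
0 & & & & & 1
\end{bmatrix}
\begin{bmatrix} x_N \\ x_{N-1} \\ \vdots \\ x_{r+f+3} \\ x_{r+f+2} \end{bmatrix}
=
\begin{bmatrix}
u_{N-f-1} & u_{N-f-2} & \cdots & u_{N-f-r-1} \\
u_{N-f-2} & u_{N-f-3} & \cdots & u_{N-f-r-2} \\
\vdots & \vdots & & \vdots \\
u_{r+2} & u_{r+1} & \cdots & u_2 \\
u_{r+1} & u_r & \cdots & u_1
\end{bmatrix}
\begin{bmatrix} \omega_0 \\ -\omega_1 \\ \vdots \\ -\omega_r \end{bmatrix}
\tag{3.73}
$$

and therefore

\begin{align}
\begin{bmatrix} x_N \\ x_{N-1} \\ \vdots \\ x_{r+f+3} \\ x_{r+f+2} \end{bmatrix}
&=
\begin{bmatrix}
1 & -\delta_1 & \cdots & -\delta_s & & \\
 & 1 & -\delta_1 & \cdots & -\delta_s & \\
 & & \ddots & \ddots & & \ddots \\
 & & & & 1 & -\delta_1 \\
0 & & & & & 1
\end{bmatrix}^{-1} U \omega \tag{3.74}\\
&=
\begin{bmatrix}
1 & \psi_1 & \psi_2 & \cdots & \psi_{N-f-r-2} \\
 & 1 & \psi_1 & \cdots & \psi_{N-f-r-3} \\
 & & \ddots & \ddots & \vdots \\
 & & & 1 & \psi_1 \\
0 & & & & 1
\end{bmatrix} U \omega \tag{3.75}
\end{align}

or

$$ x = \Psi U \omega \tag{3.76} $$

And with $y$ defined as

$$ y = \begin{bmatrix} y_N \\ y_{N-1} \\ \vdots \\ y_{r+f+3} \\ y_{r+f+2} \end{bmatrix} \tag{3.77} $$

we can write the sum of squares $S_N$ as follows:

\begin{align}
S_N &= [y - x]^T [y - x] \tag{3.78}\\
&= [y - \Psi U \omega]^T [y - \Psi U \omega] \tag{3.79}\\
&= y^T y - 2\,\omega^T U^T \Psi^T y + \omega^T U^T \Psi^T \Psi U \omega \tag{3.80}
\end{align}

The derivative of $S_N$ with respect to $\omega$ is

$$ \frac{\partial S_N}{\partial \omega} = -2\, U^T \Psi^T y + 2\, U^T \Psi^T \Psi U \omega \tag{3.81} $$

By setting this derivative to zero, we obtain the optimal parameter vector $\hat\omega$ as below:

$$ \hat\omega = [U^T \Psi^T \Psi U]^{-1} U^T \Psi^T y \tag{3.82} $$

By putting the optimal value $\hat\omega$ back into the expression for $S_N$ and differentiating it with respect to the individual $\delta_i$, we obtain $s$ equations to solve for the components of the parameter vector $\delta$. We proceed this way because the derivatives with respect to the parameters $\psi_i$ do not give convenient relationships with the vector $\delta$. Taking the derivatives, then dividing by $-2$ and setting the results to zero, we obtain the following equations:

\begin{align}
g_1 &= \left[\frac{\partial \hat\Psi}{\partial \delta_1} U \hat\omega + \hat\Psi U \frac{\partial \hat\omega}{\partial \delta_1}\right]^T [y - \hat\Psi U \hat\omega] = 0 \tag{3.83}\\
g_2 &= \left[\frac{\partial \hat\Psi}{\partial \delta_2} U \hat\omega + \hat\Psi U \frac{\partial \hat\omega}{\partial \delta_2}\right]^T [y - \hat\Psi U \hat\omega] = 0 \tag{3.84}\\
&\;\;\vdots \notag\\
g_s &= \left[\frac{\partial \hat\Psi}{\partial \delta_s} U \hat\omega + \hat\Psi U \frac{\partial \hat\omega}{\partial \delta_s}\right]^T [y - \hat\Psi U \hat\omega] = 0 \tag{3.85}
\end{align}

where

$$
\frac{\partial \hat\Psi}{\partial \delta_i} =
\begin{bmatrix}
0 & \psi'_1 & \psi'_2 & \cdots & \psi'_{N-f-r-2} \\
 & 0 & \psi'_1 & \cdots & \psi'_{N-f-r-3} \\
 & & \ddots & \ddots & \vdots \\
 & & & 0 & \psi'_1 \\
0 & & & & 0
\end{bmatrix}
\tag{3.86}
$$

and the apostrophe denotes the derivative with respect to the parameter $\delta_i$.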
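We can derive the relationships between the $\psi_i$ and the $\delta_i$. However, this will require some work, not only because the relationships are expressed in determinants, but also because these determinants are difficult to differentiate when the derivatives are required. Since what we are interested in are the numerical values of the parameters $\psi_i$, we will introduce a fast recursive calculation of these parameters and their derivatives. By definition of the matrix $\Psi$, we have

$$
\begin{bmatrix}
1 & -\delta_1 & \cdots & -\delta_s & & \\
 & 1 & -\delta_1 & \cdots & -\delta_s & \\
 & & \ddots & \ddots & & \ddots \\
 & & & & 1 & -\delta_1 \\
0 & & & & & 1
\end{bmatrix}
\begin{bmatrix}
1 & \psi_1 & \psi_2 & \psi_3 & \cdots \\
 & 1 & \psi_1 & \psi_2 & \cdots \\
 & & \ddots & \ddots & \vdots \\
 & & & 1 & \psi_1 \\
0 & & & & 1
\end{bmatrix}
= I
\tag{3.87}
$$

By multiplying the first row of the first matrix with the column of the second matrix that begins with $\psi_k$, we get, with $\psi_0 = 1$,

\begin{align}
\psi_k &= \sum_{l=0}^{k-1} \psi_l\, \delta_{k-l} \tag{3.88}\\
&= \delta_k + \sum_{l=1}^{k-1} \psi_l\, \delta_{k-l} \tag{3.89}
\end{align}

with $\delta_k = 0$ for $k > s$. Now differentiating both sides of the above equation with respect to $\delta_i$, we obtain

$$ \frac{\partial \psi_k}{\partial \delta_i} = \frac{\partial \delta_k}{\partial \delta_i} + \sum_{l=1}^{k-1} \left[\frac{\partial \psi_l}{\partial \delta_i}\, \delta_{k-l} + \psi_l\, \frac{\partial \delta_{k-l}}{\partial \delta_i}\right] \tag{3.90} $$

with

\begin{align}
\frac{\partial \delta_k}{\partial \delta_i} &= 1 \quad \text{if } k = i \tag{3.91}\\
&= 0 \quad \text{otherwise} \tag{3.92}
\end{align}

To find the partial derivative of the vector $\hat\omega$ with respect to the individual parameters $\delta_i$, we rewrite Equation (3.82) as follows:

$$ [U^T \hat\Psi^T \hat\Psi U]\, \hat\omega = U^T \hat\Psi^T y \tag{3.93} $$

and take the derivatives of both sides with respect to $\delta_i$. Doing this, we obtain

$$ \left[U^T \frac{\partial \hat\Psi^T}{\partial \delta_i} \hat\Psi U + U^T \hat\Psi^T \frac{\partial \hat\Psi}{\partial \delta_i} U\right] \hat\omega + U^T \hat\Psi^T \hat\Psi U\, \frac{\partial \hat\omega}{\partial \delta_i} = U^T \frac{\partial \hat\Psi^T}{\partial \delta_i}\, y \tag{3.94} $$

This equation gives us

$$ U^T \hat\Psi^T \hat\Psi U\, \frac{\partial \hat\omega}{\partial \delta_i} = U^T \frac{\partial \hat\Psi^T}{\partial \delta_i}\, y - \left[U^T \frac{\partial \hat\Psi^T}{\partial \delta_i} \hat\Psi U + U^T \hat\Psi^T \frac{\partial \hat\Psi}{\partial \delta_i} U\right] \hat\omega \tag{3.95} $$

and therefore

\begin{align}
\frac{\partial \hat\omega}{\partial \delta_i} &= [U^T \hat\Psi^T \hat\Psi U]^{-1} \left\{U^T \frac{\partial \hat\Psi^T}{\partial \delta_i}\, y - \left[U^T \frac{\partial \hat\Psi^T}{\partial \delta_i} \hat\Psi U + U^T \hat\Psi^T \frac{\partial \hat\Psi}{\partial \delta_i} U\right] \hat\omega\right\} \tag{3.96}\\
&= [U^T \hat\Psi^T \hat\Psi U]^{-1} \left\{U^T \frac{\partial \hat\Psi^T}{\partial \delta_i} [y - \hat\Psi U \hat\omega] - U^T \hat\Psi^T \frac{\partial \hat\Psi}{\partial \delta_i} U \hat\omega\right\} \tag{3.97}
\end{align}

At this point we want to clarify a possible confusion in the notation. In Equation (3.82), there are no circumflexes on top of the matrices $\Psi$, in contrast to Equation (3.93). The circumflex sign is used to denote an optimal value, i.e. the derivative of the minimized quantity with respect to some variable has been set to zero. In Equation (3.82), the derivative of $S_N$ with respect to $\omega$ has been set to a zero vector, and hence we have the notation $\hat\omega$.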
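This is true whether the derivatives of $S_N$ with respect to the $\delta_i$ have been set to zero or not. Since in Equation (3.82) these derivatives have not been set to zero, we have the notation $\Psi$. However, in Equation (3.93), we were looking for the derivatives of $\hat\omega$ with respect to $\delta_i$, which are needed in Equations (3.83) to (3.85). In these equations, the derivatives of $S_N$ with respect to the $\delta_i$ have been set to zero, and hence the notation $\hat\Psi$. We will keep this notation strategy throughout the rest of this chapter.

We are now equipped with all the necessary tools to solve the equation

$$ g = \begin{bmatrix} g_1 \\ g_2 \\ \vdots \\ g_s \end{bmatrix} = 0 \tag{3.98} $$

numerically. However, if we solve this equation by the method of Newton-Raphson, we need further differentiation. The equation for the next estimate of this method is

$$
\begin{bmatrix} \delta_1 \\ \delta_2 \\ \vdots \\ \delta_s \end{bmatrix}_{\text{new}}
=
\begin{bmatrix} \delta_1 \\ \delta_2 \\ \vdots \\ \delta_s \end{bmatrix}
-
\begin{bmatrix}
\frac{\partial g_1}{\partial \delta_1} & \frac{\partial g_1}{\partial \delta_2} & \cdots & \frac{\partial g_1}{\partial \delta_s} \\
\frac{\partial g_2}{\partial \delta_1} & \frac{\partial g_2}{\partial \delta_2} & \cdots & \frac{\partial g_2}{\partial \delta_s} \\
\vdots & \vdots & & \vdots \\
\frac{\partial g_s}{\partial \delta_1} & \frac{\partial g_s}{\partial \delta_2} & \cdots & \frac{\partial g_s}{\partial \delta_s}
\end{bmatrix}^{-1}
\begin{bmatrix} g_1 \\ g_2 \\ \vdots \\ g_s \end{bmatrix}
\tag{3.99}
$$

which means we need second order differentiation of the matrix $\hat\Psi$ and the vector $\hat\omega$ with respect to the parameters $\delta_i$. The second derivative of $\psi_k$ can also be calculated recursively like the first one. We can do this by taking the derivative of the first derivative as follows:

\begin{align}
\frac{\partial^2 \psi_k}{\partial \delta_i \partial \delta_j} &= \frac{\partial^2 \delta_k}{\partial \delta_i \partial \delta_j} + \sum_{l=1}^{k-1} \left[\frac{\partial^2 \psi_l}{\partial \delta_i \partial \delta_j}\, \delta_{k-l} + \frac{\partial \psi_l}{\partial \delta_i} \frac{\partial \delta_{k-l}}{\partial \delta_j} + \frac{\partial \psi_l}{\partial \delta_j} \frac{\partial \delta_{k-l}}{\partial \delta_i} + \psi_l\, \frac{\partial^2 \delta_{k-l}}{\partial \delta_i \partial \delta_j}\right] \tag{3.100}\\
&= \sum_{l=1}^{k-1} \left[\frac{\partial^2 \psi_l}{\partial \delta_i \partial \delta_j}\, \delta_{k-l} + \frac{\partial \psi_l}{\partial \delta_i} \frac{\partial \delta_{k-l}}{\partial \delta_j} + \frac{\partial \psi_l}{\partial \delta_j} \frac{\partial \delta_{k-l}}{\partial \delta_i}\right] \tag{3.101}
\end{align}

To find the second order partial derivative of the parameter vector $\hat\omega$ with respect to the individual parameters $\delta_i$ and $\delta_j$, we start from the following equation:

$$ \left[U^T \frac{\partial \hat\Psi^T}{\partial \delta_i} \hat\Psi U + U^T \hat\Psi^T \frac{\partial \hat\Psi}{\partial \delta_i} U\right] \hat\omega + U^T \hat\Psi^T \hat\Psi U\, \frac{\partial \hat\omega}{\partial \delta_i} = U^T \frac{\partial \hat\Psi^T}{\partial \delta_i}\, y \tag{3.102} $$

and differentiate both sides with respect to the parameter $\delta_j$ to obtain

\begin{align}
& U^T \left[\frac{\partial^2 \hat\Psi^T}{\partial \delta_i \partial \delta_j} \hat\Psi + \frac{\partial \hat\Psi^T}{\partial \delta_i} \frac{\partial \hat\Psi}{\partial \delta_j} + \frac{\partial \hat\Psi^T}{\partial \delta_j} \frac{\partial \hat\Psi}{\partial \delta_i} + \hat\Psi^T \frac{\partial^2 \hat\Psi}{\partial \delta_i \partial \delta_j}\right] U \hat\omega \notag\\
&\quad + U^T \left[\frac{\partial \hat\Psi^T}{\partial \delta_i} \hat\Psi + \hat\Psi^T \frac{\partial \hat\Psi}{\partial \delta_i}\right] U\, \frac{\partial \hat\omega}{\partial \delta_j}
+ U^T \left[\frac{\partial \hat\Psi^T}{\partial \delta_j} \hat\Psi + \hat\Psi^T \frac{\partial \hat\Psi}{\partial \delta_j}\right] U\, \frac{\partial \hat\omega}{\partial \delta_i} \notag\\
&\quad + U^T \hat\Psi^T \hat\Psi U\, \frac{\partial^2 \hat\omega}{\partial \delta_i \partial \delta_j}
= U^T \frac{\partial^2 \hat\Psi^T}{\partial \delta_i \partial \delta_j}\, y \tag{3.103}
\end{align}

which will give us

\begin{align}
\frac{\partial^2 \hat\omega}{\partial \delta_i \partial \delta_j} = [U^T \hat\Psi^T \hat\Psi U]^{-1} \Bigg\{ & U^T \frac{\partial^2 \hat\Psi^T}{\partial \delta_i \partial \delta_j}\, y
- U^T \left[\frac{\partial^2 \hat\Psi^T}{\partial \delta_i \partial \delta_j} \hat\Psi + \frac{\partial \hat\Psi^T}{\partial \delta_i} \frac{\partial \hat\Psi}{\partial \delta_j} + \frac{\partial \hat\Psi^T}{\partial \delta_j} \frac{\partial \hat\Psi}{\partial \delta_i} + \hat\Psi^T \frac{\partial^2 \hat\Psi}{\partial \delta_i \partial \delta_j}\right] U \hat\omega \notag\\
& - U^T \left[\frac{\partial \hat\Psi^T}{\partial \delta_i} \hat\Psi + \hat\Psi^T \frac{\partial \hat\Psi}{\partial \delta_i}\right] U\, \frac{\partial \hat\omega}{\partial \delta_j}
- U^T \left[\frac{\partial \hat\Psi^T}{\partial \delta_j} \hat\Psi + \hat\Psi^T \frac{\partial \hat\Psi}{\partial \delta_j}\right] U\, \frac{\partial \hat\omega}{\partial \delta_i} \Bigg\} \tag{3.104}
\end{align}

With the above equations, we can calculate the individual elements of the Hessian of the Newton-Raphson equation. The element at row $i$ and column $j$ of this matrix is given below:

\begin{align}
\frac{\partial g_i}{\partial \delta_j} = {} & \left[\frac{\partial^2 \hat\Psi}{\partial \delta_i \partial \delta_j} U \hat\omega + \frac{\partial \hat\Psi}{\partial \delta_i} U \frac{\partial \hat\omega}{\partial \delta_j} + \frac{\partial \hat\Psi}{\partial \delta_j} U \frac{\partial \hat\omega}{\partial \delta_i} + \hat\Psi U \frac{\partial^2 \hat\omega}{\partial \delta_i \partial \delta_j}\right]^T [y - \hat\Psi U \hat\omega] \notag\\
& - \left[\frac{\partial \hat\Psi}{\partial \delta_i} U \hat\omega + \hat\Psi U \frac{\partial \hat\omega}{\partial \delta_i}\right]^T \left[\frac{\partial \hat\Psi}{\partial \delta_j} U \hat\omega + \hat\Psi U \frac{\partial \hat\omega}{\partial \delta_j}\right] \tag{3.105}
\end{align}

In the above discussion, we have obtained equations for the Newton-Raphson method. This does not mean that it is the only way to solve Equations (3.83) to (3.85); any other method can be tried. However, the Newton-Raphson method has its own advantages. It is quadratically convergent and, most of all, it is a very simple method known by many engineers. The only problem with this method is that it requires a starting point close to the solution. If we adopt a pseudo-recursive approach, we always have a starting point close to the solution. If this is not an option, then we have to do a crude search for the minimum. This search must be systematic and is crude in the sense that we vary the values of the parameters $\delta_i$ with a coarse resolution and calculate the corresponding parameters $\omega_i$ and the sum of squares $S_N$. The values of $\delta_i$ that give the smallest sum of squares $S_N$ are used as the initial estimates in the Newton-Raphson iteration. This is necessary even if one chooses not to use the Newton-Raphson method, because Equations (3.83) to (3.85) are not necessarily linear in the parameters $\delta_i$. This means that the solution obtained might not be the global minimum. These equations give only an extremum. To be certain about the solution, we must have a crude estimate of it and of the final value of $S_N$ that the algorithm gives.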
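The crude search described above, combined with the analytical solution for the linear parameters in Equation (3.82), can be sketched for the simplest case. The example below is only an illustration under assumptions that are not in the text: a first-order model with $r = 0$, $s = 1$ and zero delay, noise-free simulated data, a grid resolution of 0.01, and hypothetical function names. For each trial value of delta the input is filtered recursively (which is what multiplying by $\Psi$ does), the single linear parameter is obtained in closed form, and the sum of squares is recorded.

```python
import random


def concentrated_fit(y, u, f, delta):
    """For a trial delta, return (omega0_hat, S_N) for the first-order model
    x_t = delta*x[t-1] + omega0*u[t-f-1],  y_t = x_t + n_t,
    assuming the system is at rest before the data start."""
    v = []
    v_prev = 0.0
    for t in range(f + 1, len(y)):
        v_prev = delta * v_prev + u[t - f - 1]  # one row of Psi*U, recursively
        v.append(v_prev)
    targets = y[f + 1:]
    svv = sum(a * a for a in v)
    svy = sum(a * b for a, b in zip(v, targets))
    omega0 = svy / svv  # scalar form of the closed-form linear solution
    s_n = sum((b - omega0 * a) ** 2 for a, b in zip(v, targets))
    return omega0, s_n


def crude_search(y, u, f, step=0.01, limit=95):
    """Coarse grid over delta; returns (S_N, delta, omega0) at the best point."""
    best = None
    for i in range(-limit, limit + 1):
        delta = i * step if step != 0.01 else i / 100
        omega0, s_n = concentrated_fit(y, u, f, delta)
        if best is None or s_n < best[0]:
            best = (s_n, delta, omega0)
    return best


def simulate_first_order(n_obs=300, delta=0.6, omega0=2.0, f=0, seed=2):
    """Hypothetical noise-free data from the same first-order model."""
    rng = random.Random(seed)
    u = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
    y = [0.0] * n_obs
    for t in range(f + 1, n_obs):
        y[t] = delta * y[t - 1] + omega0 * u[t - f - 1]
    return y, u, f


y, u, f = simulate_first_order()
s_n, delta_hat, omega_hat = crude_search(y, u, f)
print("delta:", delta_hat, "omega0:", omega_hat, "S_N:", s_n)
```

The grid point returned this way would then serve as the starting value for a finer iteration; only the nonlinear parameter is searched, while the linear one is recomputed at every grid point.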
So by choosing the Newton-Raphson method and solving its initial estimates problem, we partly answer the question of global and local minima, which is the topic we discuss next.

Global and Local Minima

In 1974, Åström, K. J. discussed the problem of global and local extrema in the identification of an ARIMA time series. In this paper, the author concluded that if the estimated model has the same number of parameters as the true model, there is only one unique global minimum. If the estimated model has fewer parameters, there can be several local minima. If the estimated model has more parameters, there will be many minima. However, these minima lie on a manifold with the property that the polynomials have common roots. In 1975, Söderström, T. discussed the global and local minima in the identification of a rational transfer function. The discussion covered both the case when the disturbance $n_t$ is white and the case when it is coloured. The conclusion was similar to that of Åström, K. J. for an ARIMA time series. There will be a unique global minimum if and only if the estimated model has the same number of parameters as the true model, or more.

Parameterization

Parameterization involves the determination of the number of parameters of the control system and the value of the delay, i.e. we parameterize the model. Logically, we can say that if the estimated model, with $r$, $s$ and $f$ as its numbers of estimated parameters and delay, does not have enough parameters, then the variance of the disturbance $n_t$ will be inflated by some amount caused by the missing parameters. So if we gradually increase $r$ and $s$ or change $f$ and see no further drop in the variance of the disturbance $n_t$, we can claim we have reached the right number of parameters. If we have crossed this boundary, i.e. if we overestimate, then the variance of $n_t$ will tell us: it will no longer drop. The overestimated parameters will be almost zero, or there will be common roots in the estimated polynomials.
Theoretically, the problem is easy. However, the estimation is usually imperfect and gives inconclusive results from which we cannot draw an easy conclusion. In this case, we have to resort to a statistical test. The most well-known tool for this is Akaike's Information Criterion (AIC). It is given below:

$$ AIC = N \log V_n + 2p \tag{3.106} $$

where $N$ is the number of observations, $V_n$ is the variance of the disturbance $n_t$ and $p$ is the number of estimated parameters. The log function in the above equation tells us that the criterion has its root in the maximum likelihood method. The prediction error method has its own criterion, called the FPE (Final Prediction Error) criterion:

$$ FPE = \frac{1 + \frac{p}{N}}{1 - \frac{p}{N}}\, V_n \tag{3.107} $$

These criteria are given as benchmarks for discriminating between models. We should choose the model with the lower criterion value. Our approach, which is similar to least squares estimation, will have a test of its own, an F-test. For this test, one can refer to Söderström, T. and Stoica, P. (1989).

Nonstationary Disturbance

The question of nonstationary disturbance must be answered when we suggest an identification scheme that identifies the transfer function and the disturbance separately. By nonstationary disturbance, it is meant that the autoregressive polynomial of the disturbance time series has one or more roots on the unit circle. Let us say it has $d$ roots on the unit circle; then we can write

\begin{align}
y_t &= \frac{\omega(z^{-1})}{\delta(z^{-1})}\, u_{t-f-1} + n_t \tag{3.108}\\
&\approx \sum_{i=0}^{m} \beta_i u_{t-f-1-i} + n_t \tag{3.109}\\
&= \sum_{i=0}^{m} \beta_i u_{t-f-1-i} + \frac{\theta(z^{-1})}{(1 - z^{-1})^d\, \phi^*(z^{-1})}\, a_t \tag{3.110}
\end{align}

where the polynomial $\phi^*(z^{-1})$ has no roots on the unit circle. One requirement for the least squares estimation is that the residual ($n_t$ in this case, not $a_t$) must have a constant zero mean and an unknown but constant variance. Because $n_t$ is nonstationary, it does not have a constant variance.
However, by multiplying both sides of the above equation by a factor of $(1 - z^{-1})^d$, we have

\begin{align}
(1 - z^{-1})^d\, y_t &\approx \sum_{i=0}^{m} \beta_i\, (1 - z^{-1})^d\, u_{t-f-1-i} + \frac{\theta(z^{-1})}{\phi^*(z^{-1})}\, a_t \tag{3.111}\\
y^*_t &\approx \sum_{i=0}^{m} \beta_i\, u^*_{t-f-1-i} + \frac{\theta(z^{-1})}{\phi^*(z^{-1})}\, a_t \tag{3.112}\\
&= \sum_{i=0}^{m} \beta_i\, u^*_{t-f-1-i} + n^*_t \tag{3.113}\\
y^*_t &= \frac{\omega(z^{-1})}{\delta(z^{-1})}\, u^*_{t-f-1} + n^*_t \tag{3.114}
\end{align}

Now because the autoregressive polynomial of the time series $n^*_t$ has no roots on the unit circle, the series is stationary and has a constant variance. The problem of nonstationary disturbance will disappear if we use $y^*_t$ instead of $y_t$ and $u^*_{t-f-1}$ instead of $u_{t-f-1}$ in our identification method. We get the same transfer function as for the original system. However, overdifferencing can cause difficulty in the estimation, because it increases the sample variance of the disturbance. The process of differencing always moves the long wavelength band to the short wavelength band, i.e. to higher frequency. This makes the data more sensitive to noise.

3.4 Identification of the ARIMA

The ARIMA time series is an important statistical tool for analyzing equally sampled series. It is called an ARIMA (AutoRegressive Integrated Moving Average) to distinguish it from another kind of time series, the harmonic time series, which has a discrete power spectrum and is composed of a number of sinusoids. The ARIMA time series has gained popularity over its harmonic counterpart due to its wide application in prediction and stochastic control. ARIMA time series modelling or identification is the art of obtaining a number of parameters that best portray the behaviour of the process from which the time series was obtained. The ARIMA model can then be used for prediction and control. In practice, we model only an ARMA time series. If we encounter an ARIMA, we change it to an ARMA by a process called differencing, where each reading is replaced by its difference from the previous reading.
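Differencing, i.e. applying the factor $(1 - z^{-1})^d$ to a record, can be sketched in a few lines. The helper name `difference` and the sample numbers are assumptions made for this illustration only.

```python
def difference(series, d=1):
    """Apply (1 - z^-1)^d to a sequence.

    Each pass replaces every reading by its change from the previous
    reading, so the record becomes one sample shorter per pass.
    """
    out = list(series)
    for _ in range(d):
        out = [out[k] - out[k - 1] for k in range(1, len(out))]
    return out


# A record that drifts like an integrated series.
readings = [1, 3, 6, 10]
print(difference(readings))       # first differences
print(difference(readings, d=2))  # second differences
```

In the setting of this section the same operation would be applied to both the output and the input records before identification, so that the transfer function between them is unchanged.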
By identification of an ARMA time series $n_t$, where $n_t$ is given as below

\begin{align}
n_t &= \frac{\theta(z^{-1})}{\phi(z^{-1})}\, a_t \tag{3.115}\\
&= \frac{1 - \sum_{i=1}^{q} \theta_i z^{-i}}{1 - \sum_{i=1}^{p} \phi_i z^{-i}}\, a_t \tag{3.116}\\
&= \phi_1 n_{t-1} + \cdots + \phi_p n_{t-p} + a_t - \theta_1 a_{t-1} - \cdots - \theta_q a_{t-q} \tag{3.117}
\end{align}

we mean obtaining the values of the parameters $\theta_i$, $\phi_i$ and $\sigma_a^2$ (the variance of the white noise $a_t$) from a record of $n_t$ or its equivalent statistics.

3.4.1 Methods of Identification for an ARMA

There are a number of identification methods listed in the literature. Mayne, D. Q. and Firoozan, F. (1982) used a three-stage linear least squares estimation method. Even though the method is not complicated in practice, its theoretical formulation involves such complications as p-consistency, which means the asymptotic bias tends to zero as the degree $p$ of the autoregressive polynomial tends to infinity, and p-efficiency, which means the asymptotic efficiency approaches the theoretical maximum as $p$ tends to infinity. Other contributions are the Corner Method of Beguin, J. M. et al. (1980) and the R and S Array Method of Gray, H. L. et al. (1978). But these methods are only suited to identifying the orders of the time series, and their estimates are used as initial estimates for other methods. The identification of an ARMA is parallel to the identification of a stochastic control system, because an ARMA is a linear stochastic control system with the input variable $u_t = 0$. This means methods like maximum likelihood and prediction error are also widely used. The prediction error method is the method of choice used by the MATLAB software in its identification package.

3.4.2 The Semi-Analytical Approach

Since an ARMA time series also has a rational form, our first reaction would be to formulate its identification problem similarly to the identification of a rational transfer function.
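Since the generating white noise $a_t$ of the time series has the minimum variance property, the identification problem can be established as the following optimization problem without any further statistical property of the white noise, such as its distribution:

$$ \min_{\theta,\,\phi} E\{a_t^2\} \tag{3.118} $$

In practice, with the assumption of ergodicity, which means we can replace the expectation operator by the average sum, the identification problem becomes equivalent to the following optimization problems:

\begin{align}
\min_{\theta,\,\phi} E\{a_t^2\} &\approx \min_{\theta,\,\phi} \frac{1}{N-m} \sum_{t=m+1}^{N} a_t^2 \tag{3.119}\\
&\equiv \frac{1}{N-m} \min_{\theta,\,\phi} \sum_{t=m+1}^{N} a_t^2 \tag{3.120}\\
&\equiv \min_{\theta,\,\phi} S_N \tag{3.121}
\end{align}

with $m$ as the larger of the two integers $p$ and $q$. To solve our identification problem as before, we rewrite the optimization problem as below:

\begin{align}
\hat S_N &= \min_{\theta,\,\phi} S_N \tag{3.122}\\
&= \min_{\theta,\,\phi} \sum_{t=m+1}^{N} \left(\frac{1 - \phi_1 z^{-1} - \cdots - \phi_p z^{-p}}{1 - \theta_1 z^{-1} - \cdots - \theta_q z^{-q}}\, n_t\right)^2 \tag{3.123}\\
&= \min_{\theta,\,\phi} \sum_{t=m+1}^{N} \left(n_t - \frac{(\phi_1 - \theta_1) + \cdots + (\phi_m - \theta_m) z^{-m+1}}{1 - \theta_1 z^{-1} - \cdots - \theta_q z^{-q}}\, n_{t-1}\right)^2 \tag{3.124}
\end{align}

From the above equation, we can see that the time series can actually be written as follows:

\begin{align}
n_t &= \frac{(\phi_1 - \theta_1) + \cdots + (\phi_m - \theta_m) z^{-m+1}}{1 - \theta_1 z^{-1} - \cdots - \theta_q z^{-q}}\, n_{t-1} + a_t \tag{3.125}\\
&= \hat n(t|t-1) + a_t \tag{3.126}
\end{align}

The first term on the right hand side of the above equation is the one-step ahead optimal predictor for $n_t$. If we treat $n_t$ as $y_t$ and $n_{t-1}$ as $u_{t-f-1}$, then we have an identification problem very similar to the identification of a rational transfer function. As in the case of modelling the transfer function, we can write

$$
\begin{bmatrix}
1 & -\theta_1 & \cdots & -\theta_q & & \\
 & 1 & -\theta_1 & \cdots & -\theta_q & \\
 & & \ddots & \ddots & & \ddots \\
 & & & & 1 & -\theta_1 \\
0 & & & & & 1
\end{bmatrix}
\begin{bmatrix} x_N \\ x_{N-1} \\ \vdots \\ x_{m+2} \\ x_{m+1} \end{bmatrix}
=
\begin{bmatrix}
n_{N-1} & n_{N-2} & \cdots & n_{N-m} \\
n_{N-2} & n_{N-3} & \cdots & n_{N-m-1} \\
\vdots & \vdots & & \vdots \\
n_{m+1} & n_m & \cdots & n_2 \\
n_m & n_{m-1} & \cdots & n_1
\end{bmatrix}
\begin{bmatrix} \phi_1 - \theta_1 \\ \phi_2 - \theta_2 \\ \vdots \\ \phi_m - \theta_m \end{bmatrix}
\tag{3.127}
$$

and therefore

\begin{align}
\begin{bmatrix} x_N \\ x_{N-1} \\ \vdots \\ x_{m+2} \\ x_{m+1} \end{bmatrix}
&=
\begin{bmatrix}
1 & -\theta_1 & \cdots & -\theta_q & & \\
 & 1 & -\theta_1 & \cdots & -\theta_q & \\
 & & \ddots & \ddots & & \ddots \\
 & & & & 1 & -\theta_1 \\
0 & & & & & 1
\end{bmatrix}^{-1} [N_1 \phi - N_2 \theta] \tag{3.128}\\
&=
\begin{bmatrix}
1 & \gamma_1 & \gamma_2 & \cdots & \gamma_{N-m-1} \\
 & 1 & \gamma_1 & \cdots & \gamma_{N-m-2} \\
 & & \ddots & \ddots & \vdots \\
 & & & 1 & \gamma_1 \\
0 & & & & 1
\end{bmatrix} [N_1 \phi - N_2 \theta] \tag{3.129}
\end{align}

or in terms of matrix and vector notations

$$ x = \Gamma\, [N_1 \phi - N_2 \theta] \tag{3.130} $$

with

$$
N_1 = \begin{bmatrix}
n_{N-1} & n_{N-2} & \cdots & n_{N-p} \\
n_{N-2} & n_{N-3} & \cdots & n_{N-p-1} \\
\vdots & \vdots & & \vdots \\
n_{m+1} & n_m & \cdots & n_{m-p+2} \\
n_m & n_{m-1} & \cdots & n_{m-p+1}
\end{bmatrix}, \qquad
N_2 = \begin{bmatrix}
n_{N-1} & n_{N-2} & \cdots & n_{N-q} \\
n_{N-2} & n_{N-3} & \cdots & n_{N-q-1} \\
\vdots & \vdots & & \vdots \\
n_{m+1} & n_m & \cdots & n_{m-q+2} \\
n_m & n_{m-1} & \cdots & n_{m-q+1}
\end{bmatrix}
\tag{3.131}
$$

This case is more complicated than the case of the rational transfer function, because of the existence of the matrix $N_2$. However, the approach to solving the problem is the same and the case does not cause any additional technical difficulty. As before, we define $n$ as

$$ n = \begin{bmatrix} n_N \\ n_{N-1} \\ \vdots \\ n_{m+2} \\ n_{m+1} \end{bmatrix} \tag{3.132} $$

then write the sum of squares $S_N$ as follows:

\begin{align}
S_N &= [n - x]^T [n - x] \tag{3.133}\\
&= [n + \Gamma N_2 \theta - \Gamma N_1 \phi]^T [n + \Gamma N_2 \theta - \Gamma N_1 \phi] \tag{3.134}
\end{align}

The optimal parameter vector $\hat\phi$ can be obtained by differentiating the above equation with respect to $\phi$ and setting the derivative to zero. Doing this, we obtain

$$ \hat\phi = [N_1^T \Gamma^T \Gamma N_1]^{-1} N_1^T \Gamma^T [n + \Gamma N_2 \theta] \tag{3.135} $$

Now if we take the derivative of $S_N$ with respect to $\theta_i$, we obtain

\begin{align}
\frac{\partial \hat S_N}{\partial \theta_i} &= \frac{\partial}{\partial \theta_i} [n + \hat\Gamma N_2 \hat\theta - \hat\Gamma N_1 \hat\phi]^T [n + \hat\Gamma N_2 \hat\theta - \hat\Gamma N_1 \hat\phi] \tag{3.136}\\
&= 2 \left[\frac{\partial \hat\Gamma}{\partial \theta_i} N_2 \hat\theta + \hat\Gamma N_2 e_i - \frac{\partial \hat\Gamma}{\partial \theta_i} N_1 \hat\phi - \hat\Gamma N_1 \frac{\partial \hat\phi}{\partial \theta_i}\right]^T [n + \hat\Gamma N_2 \hat\theta - \hat\Gamma N_1 \hat\phi] \tag{3.137}
\end{align}

where $e_i$ is a column vector which has a unity value at row $i$ and zeros elsewhere. The problem of identification of an ARMA becomes the problem of solving the following set of equations numerically:

\begin{align}
h_1 &= \left[\frac{\partial \hat\Gamma}{\partial \theta_1} N_2 \hat\theta + \hat\Gamma N_2 e_1 - \frac{\partial \hat\Gamma}{\partial \theta_1} N_1 \hat\phi - \hat\Gamma N_1 \frac{\partial \hat\phi}{\partial \theta_1}\right]^T [n + \hat\Gamma N_2 \hat\theta - \hat\Gamma N_1 \hat\phi] = 0 \tag{3.138}\\
h_2 &= \left[\frac{\partial \hat\Gamma}{\partial \theta_2} N_2 \hat\theta + \hat\Gamma N_2 e_2 - \frac{\partial \hat\Gamma}{\partial \theta_2} N_1 \hat\phi - \hat\Gamma N_1 \frac{\partial \hat\phi}{\partial \theta_2}\right]^T [n + \hat\Gamma N_2 \hat\theta - \hat\Gamma N_1 \hat\phi] = 0 \tag{3.139}\\
&\;\;\vdots \notag\\
h_q &= \left[\frac{\partial \hat\Gamma}{\partial \theta_q} N_2 \hat\theta + \hat\Gamma N_2 e_q - \frac{\partial \hat\Gamma}{\partial \theta_q} N_1 \hat\phi - \hat\Gamma N_1 \frac{\partial \hat\phi}{\partial \theta_q}\right]^T [n + \hat\Gamma N_2 \hat\theta - \hat\Gamma N_1 \hat\phi] = 0 \tag{3.140}
\end{align}

The above equations call for the derivative of $\hat\phi$ with respect to $\theta_i$. To obtain the expression for this derivative, we start with the equation

$$ [N_1^T \hat\Gamma^T \hat\Gamma N_1]\, \hat\phi = N_1^T \hat\Gamma^T [n + \hat\Gamma N_2 \hat\theta] \tag{3.141} $$

and differentiate both sides with respect to $\theta_i$ to obtain

$$ \left[N_1^T \frac{\partial \hat\Gamma^T}{\partial \theta_i} \hat\Gamma N_1 + N_1^T \hat\Gamma^T \frac{\partial \hat\Gamma}{\partial \theta_i} N_1\right] \hat\phi + N_1^T \hat\Gamma^T \hat\Gamma N_1 \frac{\partial \hat\phi}{\partial \theta_i} = N_1^T \frac{\partial \hat\Gamma^T}{\partial \theta_i} [n + \hat\Gamma N_2 \hat\theta] + N_1^T \hat\Gamma^T \left[\frac{\partial \hat\Gamma}{\partial \theta_i} N_2 \hat\theta + \hat\Gamma N_2 e_i\right] \tag{3.142} $$

By moving the term with the parameter vector $\hat\phi$ to the right hand side of the above equation, we can obtain the expression for the derivative vector $\partial \hat\phi / \partial \theta_i$ with the parameter vector $\hat\phi$ removed as follows:

\begin{align}
\frac{\partial \hat\phi}{\partial \theta_i} = [N_1^T \hat\Gamma^T \hat\Gamma N_1]^{-1} \Bigg\{ & N_1^T \frac{\partial \hat\Gamma^T}{\partial \theta_i} [n + \hat\Gamma N_2 \hat\theta] + N_1^T \hat\Gamma^T \left[\frac{\partial \hat\Gamma}{\partial \theta_i} N_2 \hat\theta + \hat\Gamma N_2 e_i\right] \notag\\
& - \left[N_1^T \frac{\partial \hat\Gamma^T}{\partial \theta_i} \hat\Gamma N_1 + N_1^T \hat\Gamma^T \frac{\partial \hat\Gamma}{\partial \theta_i} N_1\right] [N_1^T \hat\Gamma^T \hat\Gamma N_1]^{-1} N_1^T \hat\Gamma^T [n + \hat\Gamma N_2 \hat\theta] \Bigg\} \tag{3.143}
\end{align}

And we also need to know the following quantity:

$$ \frac{\partial \gamma_k}{\partial \theta_i} = \frac{\partial \theta_k}{\partial \theta_i} + \sum_{l=1}^{k-1} \left[\frac{\partial \gamma_l}{\partial \theta_i}\, \theta_{k-l} + \gamma_l\, \frac{\partial \theta_{k-l}}{\partial \theta_i}\right] \tag{3.144} $$

Now if we want to solve the above set of equations by the method of Newton-Raphson, we must take the derivatives of the individual functions $h_i$ with respect to the $\theta_j$. This calls for the second order derivatives of $\hat\phi$ and $\hat\Gamma$. Proceeding as in the section on identification of the transfer function, we can derive

$$ \frac{\partial^2 \gamma_k}{\partial \theta_i \partial \theta_j} = \sum_{l=1}^{k-1} \left[\frac{\partial^2 \gamma_l}{\partial \theta_i \partial \theta_j}\, \theta_{k-l} + \frac{\partial \gamma_l}{\partial \theta_i} \frac{\partial \theta_{k-l}}{\partial \theta_j} + \frac{\partial \gamma_l}{\partial \theta_j} \frac{\partial \theta_{k-l}}{\partial \theta_i}\right] \tag{3.145} $$

and from the equation

$$ \left[N_1^T \frac{\partial \hat\Gamma^T}{\partial \theta_i} \hat\Gamma N_1 + N_1^T \hat\Gamma^T \frac{\partial \hat\Gamma}{\partial \theta_i} N_1\right] \hat\phi + N_1^T \hat\Gamma^T \hat\Gamma N_1 \frac{\partial \hat\phi}{\partial \theta_i} = N_1^T \frac{\partial \hat\Gamma^T}{\partial \theta_i} [n + \hat\Gamma N_2 \hat\theta] + N_1^T \hat\Gamma^T \left[\frac{\partial \hat\Gamma}{\partial \theta_i} N_2 \hat\theta + \hat\Gamma N_2 e_i\right] \tag{3.146} $$

we can differentiate both sides with respect to $\theta_j$ to obtain

\begin{align}
& N_1^T \left[\frac{\partial^2 \hat\Gamma^T}{\partial \theta_i \partial \theta_j} \hat\Gamma + \frac{\partial \hat\Gamma^T}{\partial \theta_i} \frac{\partial \hat\Gamma}{\partial \theta_j} + \frac{\partial \hat\Gamma^T}{\partial \theta_j} \frac{\partial \hat\Gamma}{\partial \theta_i} + \hat\Gamma^T \frac{\partial^2 \hat\Gamma}{\partial \theta_i \partial \theta_j}\right] N_1 \hat\phi \notag\\
&\quad + N_1^T \left[\frac{\partial \hat\Gamma^T}{\partial \theta_i} \hat\Gamma + \hat\Gamma^T \frac{\partial \hat\Gamma}{\partial \theta_i}\right] N_1 \frac{\partial \hat\phi}{\partial \theta_j}
+ N_1^T \left[\frac{\partial \hat\Gamma^T}{\partial \theta_j} \hat\Gamma + \hat\Gamma^T \frac{\partial \hat\Gamma}{\partial \theta_j}\right] N_1 \frac{\partial \hat\phi}{\partial \theta_i}
+ N_1^T \hat\Gamma^T \hat\Gamma N_1 \frac{\partial^2 \hat\phi}{\partial \theta_i \partial \theta_j} \notag\\
&= N_1^T \frac{\partial^2 \hat\Gamma^T}{\partial \theta_i \partial \theta_j} [n + \hat\Gamma N_2 \hat\theta]
+ N_1^T \frac{\partial \hat\Gamma^T}{\partial \theta_i} \left[\frac{\partial \hat\Gamma}{\partial \theta_j} N_2 \hat\theta + \hat\Gamma N_2 e_j\right]
+ N_1^T \frac{\partial \hat\Gamma^T}{\partial \theta_j} \left[\frac{\partial \hat\Gamma}{\partial \theta_i} N_2 \hat\theta + \hat\Gamma N_2 e_i\right] \notag\\
&\quad + N_1^T \hat\Gamma^T \left[\frac{\partial^2 \hat\Gamma}{\partial \theta_i \partial \theta_j} N_2 \hat\theta + \frac{\partial \hat\Gamma}{\partial \theta_i} N_2 e_j + \frac{\partial \hat\Gamma}{\partial \theta_j} N_2 e_i\right] \tag{3.147}
\end{align}

From the above equation, the second derivative of $\hat\phi$ is given by the following equation:

\begin{align}
\frac{\partial^2 \hat\phi}{\partial \theta_i \partial \theta_j} = [N_1^T \hat\Gamma^T \hat\Gamma N_1]^{-1} \Bigg\{ & N_1^T \frac{\partial^2 \hat\Gamma^T}{\partial \theta_i \partial \theta_j} [n + \hat\Gamma N_2 \hat\theta]
+ N_1^T \frac{\partial \hat\Gamma^T}{\partial \theta_i} \left[\frac{\partial \hat\Gamma}{\partial \theta_j} N_2 \hat\theta + \hat\Gamma N_2 e_j\right]
+ N_1^T \frac{\partial \hat\Gamma^T}{\partial \theta_j} \left[\frac{\partial \hat\Gamma}{\partial \theta_i} N_2 \hat\theta + \hat\Gamma N_2 e_i\right] \notag\\
& + N_1^T \hat\Gamma^T \left[\frac{\partial^2 \hat\Gamma}{\partial \theta_i \partial \theta_j} N_2 \hat\theta + \frac{\partial \hat\Gamma}{\partial \theta_i} N_2 e_j + \frac{\partial \hat\Gamma}{\partial \theta_j} N_2 e_i\right] \notag\\
& - N_1^T \left[\frac{\partial^2 \hat\Gamma^T}{\partial \theta_i \partial \theta_j} \hat\Gamma + \frac{\partial \hat\Gamma^T}{\partial \theta_i} \frac{\partial \hat\Gamma}{\partial \theta_j} + \frac{\partial \hat\Gamma^T}{\partial \theta_j} \frac{\partial \hat\Gamma}{\partial \theta_i} + \hat\Gamma^T \frac{\partial^2 \hat\Gamma}{\partial \theta_i \partial \theta_j}\right] N_1 \hat\phi \notag\\
& - N_1^T \left[\frac{\partial \hat\Gamma^T}{\partial \theta_i} \hat\Gamma + \hat\Gamma^T \frac{\partial \hat\Gamma}{\partial \theta_i}\right] N_1 \frac{\partial \hat\phi}{\partial \theta_j}
- N_1^T \left[\frac{\partial \hat\Gamma^T}{\partial \theta_j} \hat\Gamma + \hat\Gamma^T \frac{\partial \hat\Gamma}{\partial \theta_j}\right] N_1 \frac{\partial \hat\phi}{\partial \theta_i} \Bigg\} \tag{3.148}
\end{align}

The element at row $i$ and column $j$ of the Hessian in this case is given below:

\begin{align}
\frac{\partial h_i}{\partial \theta_j} = {} & \left[\frac{\partial^2 \hat\Gamma}{\partial \theta_i \partial \theta_j} N_2 \hat\theta + \frac{\partial \hat\Gamma}{\partial \theta_i} N_2 e_j + \frac{\partial \hat\Gamma}{\partial \theta_j} N_2 e_i - \frac{\partial^2 \hat\Gamma}{\partial \theta_i \partial \theta_j} N_1 \hat\phi - \frac{\partial \hat\Gamma}{\partial \theta_i} N_1 \frac{\partial \hat\phi}{\partial \theta_j} - \frac{\partial \hat\Gamma}{\partial \theta_j} N_1 \frac{\partial \hat\phi}{\partial \theta_i} - \hat\Gamma N_1 \frac{\partial^2 \hat\phi}{\partial \theta_i \partial \theta_j}\right]^T [n + \hat\Gamma N_2 \hat\theta - \hat\Gamma N_1 \hat\phi] \notag\\
& + \left[\frac{\partial \hat\Gamma}{\partial \theta_i} N_2 \hat\theta + \hat\Gamma N_2 e_i - \frac{\partial \hat\Gamma}{\partial \theta_i} N_1 \hat\phi - \hat\Gamma N_1 \frac{\partial \hat\phi}{\partial \theta_i}\right]^T \left[\frac{\partial \hat\Gamma}{\partial \theta_j} N_2 \hat\theta + \hat\Gamma N_2 e_j - \frac{\partial \hat\Gamma}{\partial \theta_j} N_1 \hat\phi - \hat\Gamma N_1 \frac{\partial \hat\phi}{\partial \theta_j}\right] \tag{3.149}
\end{align}

With the above equations, we can solve the identification problem of an ARMA time series by the Newton-Raphson method.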
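For the simplest case $p = q = 1$, the analytical step of Equation (3.135) reduces to a scalar formula and can be sketched directly. The code below is an illustration under assumptions not made in the text: an ARMA(1,1) series simulated with hypothetical values phi = 0.8 and theta = 0.3, a zero initial condition for the filtered series, and made-up function names. For a trial theta, the past values of $n_t$ are filtered recursively (the role of $\Gamma$), and the autoregressive parameter is then obtained in closed form.

```python
import random


def filtered_past(n, theta):
    """v_t = theta*v[t-1] + n[t-1]: the single column of Gamma*N1 when p = q = 1."""
    v = [0.0] * len(n)
    for t in range(1, len(n)):
        v[t] = theta * v[t - 1] + n[t - 1]
    return v


def sum_of_squares(n, theta, phi):
    """S_N for n_t = phi*n[t-1] + a_t - theta*a[t-1], using a_t = n_t - (phi - theta)*v_t."""
    v = filtered_past(n, theta)
    return sum((n[t] - (phi - theta) * v[t]) ** 2 for t in range(1, len(n)))


def concentrated_phi(n, theta):
    """Closed-form phi for a trial theta (scalar form of the linear solution)."""
    v = filtered_past(n, theta)
    num = sum(v[t] * (n[t] + theta * v[t]) for t in range(1, len(n)))
    den = sum(v[t] * v[t] for t in range(1, len(n)))
    return num / den


def simulate_arma11(n_obs=2000, phi=0.8, theta=0.3, seed=3):
    """Hypothetical ARMA(1,1) record driven by unit-variance white noise."""
    rng = random.Random(seed)
    a = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
    n = [0.0] * n_obs
    for t in range(1, n_obs):
        n[t] = phi * n[t - 1] + a[t] - theta * a[t - 1]
    return n


series = simulate_arma11()
for trial_theta in (0.0, 0.3, 0.6):
    phi_hat = concentrated_phi(series, trial_theta)
    print(trial_theta, phi_hat, sum_of_squares(series, trial_theta, phi_hat))
```

Only theta remains to be found numerically, either by a coarse scan as in the loop above or by an iteration of the Newton-Raphson type; phi is recomputed analytically at every trial value.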
3.5 Identification of the Transfer Function-ARIMA

When the transfer function and the ARIMA disturbance models are identified separately, this is called a split identification strategy. When both models are identified at the same time, it is called a combined identification strategy. The advantage of a split identification strategy is the ease of determining the number of system parameters. But of course the combined identification strategy also has its strength: it is a one-shot attempt, and we do not have to regenerate the disturbance time series. Therefore, in this section, we will also present equations to identify both the transfer function and the ARIMA disturbance simultaneously. However, our approach to this problem is different from many existing approaches. In our approach, we determine the parameters from two minimized quantities: one is the sum of squares of the disturbance time series and the other is the sum of squares of the white noise generating this disturbance. This is called a 2-stage least squares method, and the approach is quite common in system identification. The prediction error method minimizes just the sum of squares of the generating white noise and interprets the white noise values as the prediction errors. Our approach guarantees no crosscorrelation between the input variable and the disturbance time series. This comes from the minimization of the sum of squares of the disturbance time series.

First we will correct a problem created by the simultaneous identification approach. Since we still use the information from $u_1$ to $u_{N-f-1}$, the dimensions of the vector $y$ and the matrix $U$ are still the same. In fact, nothing changes in the identification of the rational transfer function. This is because the interaction passes down from the identification of the rational transfer function to the identification of the ARIMA.
This means we still have these results: [§Uw ad, +* U ^ ] [ y - W « ] r = 0 i = l,---s (3.150) V (3.151) and U * \T/U T r u V - y [ u u + u * (3.152) 0 with [U * *U] T r _ 1 (3.153) U * y T T However, the dimensions of the vector n and the matrices N i and N should be changed. Since we use y from y +f+2 to ypj, the matrices N i and N should have elements go from n +f 2 to n/v-i- This means 2 t r r 2 + riN-i nN-2 r+f+m+2 r+f+m+l n nr+f+m+3—p r+f+m+2-p n N N N n n -q UN-q-1 n -i nN-2 n-N-p 2 (3.154) = n-r+f+m+2 nr+f+m+l n +f+ +3r m g n +f+m+2-q r 3.5. IDENTIFICATION OF THE TRANSFER 37 FUNCTION-ARIMA and therefore n N nN-i (3.155) n r+f+m+3 n-r+f+m+2 n [ I j V - r - / - m - l | | O i V - r - / - m - l X m ] |y _ (3.156) ^Uu> (3.157) Wn y - $ U w which means the premultiplication by the matrix W will delete the last m rows. Now we can obtain similar expressions for the columns of the matrices N and N . The first column of these matrices can be obtained by premultiplying the matrix y — # U w with the matrix W i , where 0 x [0 Wj N-r-f-m-lxl |]J-7V- 2 -f-m- -1 ||0/V-r-/-m-lxm-l] (3.158) which means the last column of zeros of the matrix Wo is wrapped around and becomes the first column of zeros of the matrix W ] . The premultiplication by the matrix W j will shift the columns up one row and delete the last m rows. In general, the matrix W ; will shift up i rows and delete m last rows. This matrix will have i first columns of zeros followed by an identity matrix and possibly some columns of zeros after that. Now we are ready to write the expressions for the matrices N i and N . 2 y - WOJ y - *Uu> y - *Ud> W 0 y - *U« W y-*Uw W y - *Uo> W9 J T T 2 (3.159) As the dimension of the matrices N and N changes, the dimension of the matrix r also changes. The change is not the change in theory, but actually in separate identification, we redefine the number of observations in the identification of the A R M A time series. 
So in this combined identification, we have to correct for it. The matrix T is then defined as follows: a 2 1 7 i 72 73 1 7 i 72 r = lN-r-J-m-2 (3.160) 71 1 7i 1 CHAPTER 38 The identification of the A R M IDENTIFICATION A is the following optimization problem Wo [y - *UCJ] + TN 0 - TN p] [ w Min 3. [y - *Uw] + T 2 l( 0 TN 6 - TNrf 2 0,4 (3.161) and as before this optimization with respect to cp gives us cp = ^Nfr Nfr rNx T w T 0 y +rN 6» -*Uw 2 (3.162) From this equation, we can derive the result we obtained before with just the replacement of the vector n by the vector Wo[y — \PUd>] -l N ^ + TN 0 [W [y - *U«] T 2 0 (3.163) The interaction of the two identification problems comes from the equation below: -i Nfr rNx T <t>s. W y-fUw 0 +TN 6> 2 <9N T-^e] 2 + Nf T [-W —Ud; T r - W *Uo> + ds. 0 3Nf 0 5i (3.164) dSi To obtain the parameter vector 6, we can differentiate the identification optimization criterion and obtain a result we had before [ | N ^ + f N ^ - | N ' * - f N > ^ l T [Wo [y - *Uu>j + rN 6> - TN p\ = 0 i = l,---q 2 l( (3.165) With all the parameters defined, we can write our identification equations as: 9l g 2 = [ ^ U u , + *Ua>;j [y-*Uo>] =0 (3.166) = [^Uw + *U^J% = 0 (3.167) T ab 2 - ¥Uw] 3.5. IDENTIFICATION OF THE TRANSFER FUNCTION-ARIMA 39 (3.168) 9s dt [^N 0 + TN 9s+i = 2 dt - — 2 e i - *' N^-TN^J T (3.169) W [y - *Uw] + f N 0 - f N 0 2 0 </ r7F dt [—N 0 + f N e - — = s+2 2 2 - x -' Ntf-TNr^f 2 W [y - $Uw] + f N 0 - f Ni<p] = 0 (3.170) W [y - *Uw] + f N 0 - f Ni^>] = 0 (3.171) 2 0 55 + 7 2 0 The identification equations are similar to those of the split identification case. This means we should get the same results for both cases. The only difference is - of course the path to the final results. ion for our problem now is dgi d6 dgi 38 dgi d0 dgi oe dg dg +i 88 dg d8 dg +i d8 dg ddi dg +i d6i dg d9 dg i dO dg dSi dg d8 dg dg d0 r s -i q x " $1" s 8s k s s 1 . J_ V *+i q s s+q s s s+q gi g2 s s q s+ s . 
gs+q q s+g s+q s q .i (3.172) The Hessian of the above equation requires the derivatives of </,-s with respect to the two sets of independent variable 6;s and 0;s. The derivative of gi, for i < s with respect to 8j is still the same as the case of split identification. dgi 38., d9 dw dV 8Cj i 8 LJ, , . . . Uu> H U 1 U h* U y - $Uw 38; 38 i 38j 38, 36,38/ 38,38, 2 rr 1 T T T CHAPTER 40 _ «* [ U ( i + •dSi~~ ' 3. * §H^Uu>+« U ^ ] U d6i' c%~~ ' 88, L IDENTIFICATION for i<s (3.173) The derivative of ^ , for i > s is lengthier, but actually not more complicated. We will take the derivative of g i with respect to 8 . The derivative is new for this case of combined identification. s+ dg as, ^[|r ^ N + 3 f N 2 e i _|r N i ^ f N i ^ ] T (3.174) [W [y - Ww] + f N 0 - f N p] 0 2 1 Wo [ 4>e 86, d$i 86• l [ 1( y _ $Uw] + f N 0 - f N ^ ] 2 [_Wof - Wo*!*',, + t ^ O- f ^ - f N^',] (3.175) W lith Wf y - j9_ 0Nf T y - W T (3.176) 2 T y - / rp rp •[^-U« + * U w , . ] X (3.177) .[-u + w / w w w T p and similarly for the matrix N . Now we will derive the equations for the derivatives of g,s with respect to 9j. For i < s, the derivative with respect to 9j is zero, because the function contains no 9jS. 2 dgi , . — - = 0 tor i < s 00; n (3.178) 3.5. IDENTIFICATION OF THE TRANSFER 41 FUNCTION-ARIMA For % > s, the derivative of gi with respect to Oj is as below [W [y - *Uw] + f N 0 - f Ni^»] dOidOj odi 8Y .' ddj . 8 ch 00,00j . 2 - ^ - N i ^ - r N x - ^ - f + (3.179) 2 0 2 4 . 1 . . [ W [ y - * U £ ] + TN d - TN 0 2 df <9r -' [—N 0 + f N e - — N p - r N , J 2 o9i 1 < P l( ] 3 [|^N 0 + f N e, - | ^ N ^ - f N ^ . ] 2 . 1<P (3.180) 2 The elements of the Hessian matrix requires the first order partial derivatives and also the second order partial derivatives. We have given the first order partial derivatives of CJ and cp. Now we will give the second order partial derivatives. We have as before 0 CJ d 8 id 8 A 2 L ^ A r A - i - r d*, 86,38 3 2 1 u T (3.181) But the derivatives of cp will change. 
We have d <P = 2 r*N f r*TiW rTrN l T t 1 N f ^ : [ W „ [ y - * U c i ] + fN 2 # ] CHAPTER 42 d r - [Ni IDENTIFICATION d r 2 2 T 3. TNi + Ni de, de,- > T 1 x t J N i + N• ri frr de, de, (3.182) The above equation is obtained by differentiating the expression for <p . with respect to 9j. As we can see the derivative is symmetric with respect to 0,- and 9j. If we differentiate g the expression for (f> . with respect to Sj, we get the last partial derivative required. We start with the equation 6 N f fN r T 1 1 8T <p T N [ F [ | ^ N 0 + fN e ] - [Nf 2 2 8 N l + Nff (3.183) then differentiate both sides with respect to Sj to obtain JT ^ r r N dSj + Nf r r — ± ^ + N * r r N i i — ^ ds d9, db dW dt •[W [y-#Uw] + fN 0] + dS de, T T 1 T 3 3 1 x o 2 3 <9<5j #0; d9i dSj dSj [ N d9i ,|IrN d9i dSj 1 + N[Fg And so from this equation, we have d 4> dS d6, 2 Nfr rN! r 3 69 i dt, 66; •[W [y - *Uw] + f N 0] 0 2 N l ,^ (3.184) 43 3.6. IDENTIFICATION OF THE PREDICTOR FORM .dT <9N 8N2 2 [-' [ d0, f fNx r 86, + dt 1 - [Ni ^ - Nff f r 50, 86, fNx+ N f f <9r T ^ N x ] ^ 86, 86, 86, (3.185) Our Newton-Raphsori equation has become ' 6S 6S 6i Jo k' . dgi dgi 861 86s 8gs dgs 86x 8gs+i 86s dgs+i 8gs+i dgs+i 86x 86s 861 d0q dgs+q dgs+q 86i 86s 0 0 0 0 dgs+q d0i -1 9i gi gs+q dgs+q d6q (3.186) This should make the inversion of this matrix a little bit more accurate. The existence of the lower block makes the movement of the parameter vectors S and 0 dependent. 3.6 3.6.1 Identification of the Predictor Form T h e Predictor F o r m The Box-Jenkins model of a control system was first called the dynamic-stochastic model by a number of researchers. The dynamic part represents the dynamics of the process and the stochastic part is the part of which values are not certain - hence the word stochastic. This part is represented by an A R I M A . Then it was also called the transfer function-noise model or the transfer function-ARIMA model. 
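Even though the names are different, the equations used are the same. The deviation output variable is the sum of two rational functions: one is driven by the deviation input variable, the other by a white noise. This is one form of the Box-Jenkins model. The Box-Jenkins model has another form called the predictor form. If the purpose of the identification is only to design a minimum variance or an LQG controller, then it is better to use this form than the more popular one. We will derive this form in the following. From the usually known Box-Jenkins model

\[ y_t = \frac{\omega(z^{-1})}{\delta(z^{-1})}\,u_{t-f-1} + \frac{\theta(z^{-1})}{\phi(z^{-1})}\,a_t \tag{3.187} \]

we can derive an expression for $a_t$ as below

\[ a_t = \frac{\phi(z^{-1})}{\theta(z^{-1})}\left[ y_t - \frac{\omega(z^{-1})}{\delta(z^{-1})}\,u_{t-f-1} \right] \tag{3.188} \]

If we write the system model at time $t+f+1$, we have

\[ y_{t+f+1} = \frac{\omega(z^{-1})}{\delta(z^{-1})}\,u_t + \frac{\theta(z^{-1})}{\phi(z^{-1})}\,a_{t+f+1} \tag{3.189} \]

and with the Diophantine equation

\[ \frac{\theta(z^{-1})}{\phi(z^{-1})} = \psi(z^{-1}) + \frac{\gamma(z^{-1})}{\phi(z^{-1})}\,z^{-f-1} \tag{3.190} \]

we can write

\[ y_{t+f+1} = \frac{\omega(z^{-1})}{\delta(z^{-1})}\,u_t + \frac{\gamma(z^{-1})}{\phi(z^{-1})}\,a_t + \psi(z^{-1})\,a_{t+f+1} \tag{3.191} \]

\[ = \frac{\omega(z^{-1})}{\delta(z^{-1})}\,u_t + \frac{\gamma(z^{-1})}{\theta(z^{-1})}\left[ y_t - \frac{\omega(z^{-1})}{\delta(z^{-1})}\,u_{t-f-1} \right] + \psi(z^{-1})\,a_{t+f+1} \tag{3.192} \]

\[ = \frac{\omega(z^{-1})\left[\theta(z^{-1}) - \gamma(z^{-1})z^{-f-1}\right]}{\delta(z^{-1})\theta(z^{-1})}\,u_t + \frac{\gamma(z^{-1})}{\theta(z^{-1})}\,y_t + \psi(z^{-1})\,a_{t+f+1} \tag{3.193} \]

\[ = \frac{\omega(z^{-1})\psi(z^{-1})\phi(z^{-1})}{\delta(z^{-1})\theta(z^{-1})}\,u_t + \frac{\gamma(z^{-1})}{\theta(z^{-1})}\,y_t + \psi(z^{-1})\,a_{t+f+1} \tag{3.194} \]

where the last step uses $\theta(z^{-1}) - \gamma(z^{-1})z^{-f-1} = \psi(z^{-1})\phi(z^{-1})$ from the Diophantine equation. This equation can be put into the following form:

\[ y_{t+f+1} = \frac{\omega(z^{-1})\psi(z^{-1})\phi(z^{-1})\,u_t + \delta(z^{-1})\gamma(z^{-1})\,y_t}{\delta(z^{-1})\theta(z^{-1})} + \psi(z^{-1})\,a_{t+f+1} \tag{3.195} \]

This form is called the predictor form of the Box-Jenkins model. It can also be called the controller form, because the numerator of the first term on the right hand side of the above equation, when set to zero, is the equation of the minimum variance controller. With the above form, we can apply our technique of identification as in the case of a rational transfer function or an ARIMA with no more difficulty. The identification criterion will be the minimum sum of squares

\[ S = \sum_{t=1}^{N} \left[ y_{t+f+1} - \frac{\omega(z^{-1})\psi(z^{-1})\phi(z^{-1})\,u_t + \delta(z^{-1})\gamma(z^{-1})\,y_t}{\delta(z^{-1})\theta(z^{-1})} \right]^2 \tag{3.196} \]

This form provides a short cut to the design of a minimum variance or an LQG controller. This form is easier to identify, and statistical tests for model adequacy are easier and in fact inherent in the model.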
The value $f$ and the residual $\psi(z^{-1})a_t$ must agree. The time series $\psi(z^{-1})a_t$ is a moving average of order $f$, and so its autocovariances are truncated after lag $f$. Now if we write the model as

\[ y_{t+f+1} = \frac{a(z^{-1})\,u_t + b(z^{-1})\,y_t}{c(z^{-1})} + d(z^{-1})\,a_{t+f+1} \tag{3.197} \]

then we can see that the polynomials of the original form can be retrieved from those of the predictor form quite easily. From the following relationships:

\[ a(z^{-1}) = \omega(z^{-1})\psi(z^{-1})\phi(z^{-1}) \tag{3.198} \]
\[ b(z^{-1}) = \delta(z^{-1})\gamma(z^{-1}) \tag{3.199} \]
\[ c(z^{-1}) = \delta(z^{-1})\theta(z^{-1}) \tag{3.200} \]
\[ d(z^{-1}) = \psi(z^{-1}) \tag{3.201} \]

and with a little bit of algebra, we can obtain the following equations:

\[ \frac{\omega(z^{-1})}{\delta(z^{-1})} = \frac{a(z^{-1})}{c(z^{-1}) - b(z^{-1})z^{-f-1}} \tag{3.202} \]

\[ \frac{\theta(z^{-1})}{\phi(z^{-1})} = \frac{c(z^{-1})\,d(z^{-1})}{c(z^{-1}) - b(z^{-1})z^{-f-1}} \tag{3.203} \]

The numerators and denominators of the quantities on the right hand side of the above equations have common polynomials. But only when the identification is perfect can we have cancellations of these common polynomials. However, as discussed above, this is not necessary when we are interested in a minimum variance or an LQG controller. The polynomials $a(z^{-1})$, $b(z^{-1})$ and $d(z^{-1})$ are needed for the minimum variance controller. For an LQG controller, we need the additional polynomial $c(z^{-1})$. Now if we write the predictor form model at time $t$ and consider the case of a control system with no delay, i.e. $f = 0$, then we have

\[ y_t = \frac{a(z^{-1})\,u_{t-1} + b(z^{-1})\,y_{t-1}}{c(z^{-1})} + a_t \tag{3.204} \]
\[ = \hat{y}(t|t-1) + a_t \tag{3.205} \]

The quantity $\hat{y}(t|t-1)$ is the optimal (in the sense of minimum variance error) one step ahead predictor of $y_t$. This means the prediction error method identifies the Box-Jenkins model via the predictor form. The case $f \neq 0$ can be established similarly:

\[ y_t = \frac{a(z^{-1})\,u_{t-f-1} + b(z^{-1})\,y_{t-f-1}}{c(z^{-1})} + \psi(z^{-1})\,a_t \tag{3.206} \]
\[ = \hat{y}(t|t-f-1) + \psi(z^{-1})\,a_t \tag{3.207} \]

But in this case the optimal predictor $\hat{y}(t|t-f-1)$ is $f+1$ steps ahead.
For less work in the derivation of the minimum variance or an LQG controller of a delayed system, the $f+1$ step ahead optimal predictor $\hat{y}(t|t-f-1)$ should be used rather than the one step ahead optimal predictor $\hat{y}(t|t-1)$. This is because we will get the controller directly from identification.

3.6.2 Closed-Loop Data

The question of closed-loop data began with a paper by Akaike, H. (1967). In this paper, by using a cross-spectral method, Akaike, H. showed that if closed-loop data are used, we might not get the transfer function of the process dynamics but the inverse transfer function of the controller. From Figure 3.1, we can see that the feedback input variable $u_t$ correlates to the controlled output variable $y_t$ via two paths. The forward path will give us the transfer function of the process dynamics. The backward path will give us the inverse transfer function of the controller.

[Figure 3.1. Correlation Paths of a Feedback Control Loop. Block diagram with set point, dither, controller, plant and disturbance; the forward path runs from $u_t$ to $y_t$ through the plant and the backward path from $y_t$ to $u_t$ through the controller.]

The question now is whether our method is vulnerable to closed-loop data. If it is, and we carry out the same procedure as described in the section on identification of a rational transfer function, what will we get? And what about the prediction error method? Is this method vulnerable to closed-loop data? Since our approach to identification is semi-analytical, it will help us answer these questions quite easily. To answer these questions, we write our predictor form Box-Jenkins model as below.
\[ y_t = \frac{\omega(z^{-1})\psi(z^{-1})\phi(z^{-1})\,u_{t-f-1} + \delta(z^{-1})\gamma(z^{-1})\,y_{t-f-1}}{\delta(z^{-1})\theta(z^{-1})} + \psi(z^{-1})\,a_t \tag{3.208} \]
\[ = \hat{y}_1(t|t-f-1) + \psi(z^{-1})\,a_t \tag{3.209} \]
\[ = \hat{y}_1(t|t-f-1) + e_t \tag{3.210} \]

Now if the feedback controller is

\[ u_t = \frac{l_1(z^{-1})}{l_2(z^{-1})}\,y_t \tag{3.211} \]

then we can write

\[ y_t = \frac{\omega(z^{-1})\psi(z^{-1})\phi(z^{-1})\,l_1(z^{-1})\,u_{t-f-1} + \delta(z^{-1})\gamma(z^{-1})\,l_2(z^{-1})\,u_{t-f-1}}{\delta(z^{-1})\theta(z^{-1})\,l_1(z^{-1})} + e_t \tag{3.212} \]
\[ = \frac{\omega(z^{-1})\psi(z^{-1})\phi(z^{-1})\,l_1(z^{-1}) + \delta(z^{-1})\gamma(z^{-1})\,l_2(z^{-1})}{\delta(z^{-1})\theta(z^{-1})\,l_1(z^{-1})}\,u_{t-f-1} + e_t \tag{3.213} \]
\[ = \hat{y}_2(t|t-f-1) + e_t \tag{3.214} \]

and

\[ y_t = \frac{\omega(z^{-1})\psi(z^{-1})\phi(z^{-1})\,l_1(z^{-1}) + \delta(z^{-1})\gamma(z^{-1})\,l_2(z^{-1})}{\delta(z^{-1})\theta(z^{-1})\,l_2(z^{-1})}\,y_{t-f-1} + e_t \tag{3.215} \]
\[ = \hat{y}_3(t|t-f-1) + e_t \tag{3.216} \]

In the above equations, we list 3 optimal predictors. They are the same in the sense that they give the same value. The predictor $\hat{y}_1(t|t-f-1)$ uses both the input and output variables $u_t$ and $y_t$. The predictor $\hat{y}_2(t|t-f-1)$ uses only the input variable $u_t$, and the predictor $\hat{y}_3(t|t-f-1)$ uses only the output variable $y_t$. Both the predictors $\hat{y}_2(t|t-f-1)$ and $\hat{y}_3(t|t-f-1)$ use the controller in their prediction equations.

With the above equations, we can now answer the questions we raised before. A least squares estimation forces no crosscorrelation between the regressor variables and the residual. This is the statistical meaning of orthogonality in geometry. The answer to the question of whether our method of identification of a rational transfer function is vulnerable to closed-loop data is affirmative. And if we carry out the same procedure as in the section on identification of a rational transfer function with closed-loop data, what we will get is neither the transfer function nor the inverse of the controller but the quantity

\[ \frac{\omega(z^{-1})\psi(z^{-1})\phi(z^{-1})\,l_1(z^{-1}) + \delta(z^{-1})\gamma(z^{-1})\,l_2(z^{-1})}{\delta(z^{-1})\theta(z^{-1})\,l_1(z^{-1})} \tag{3.217} \]

This comes from the optimal predictor $\hat{y}_2(t|t-f-1)$.

The question involving the prediction error method can be answered as follows. Ljung, L. et al. (1974) studied identifiability of closed-loop data and suggested that by shifting between different control laws, identifiability can be achieved as in open-loop data.
The number of control laws or controllers must be greater than the ratio of the number of input to the number of output variables. This is for multivariable systems. For single input single output systems, we just need to shift between two control laws. The conclusion involving the prediction error method, however, is easily misread. In their own words, they concluded "Direct identification with a prediction error method can be used exactly as in the open-loop case; the fact that the system operates in closed-loop causes no extra difficulty". Unfortunately, this conclusion is confusing and normally causes errors for a user. If the optimal predictors $\hat{y}_2(t|t-f-1)$ and $\hat{y}_3(t|t-f-1)$ are used, then the conclusion is correct. The authors actually used the predictor $\hat{y}_3(t|t-f-1)$ in their paper. However, in practice the predictor $\hat{y}_1(t|t-f-1)$ is normally used; see an example in Söderström, T. and Stoica, P. (1989). Using the predictors $\hat{y}_2(t|t-f-1)$ and $\hat{y}_3(t|t-f-1)$ makes the prediction error method no longer direct, because the knowledge of the controller is used. Using the predictor $\hat{y}_1(t|t-f-1)$ makes the prediction error method vulnerable to closed-loop data, because there might be a linear relationship between the input and output variables in the predictor. And so, the prediction error method is vulnerable to closed-loop data. The method of maximum likelihood is also vulnerable to closed-loop data. The physical meaning of the problem is that the data have one additional linear relationship, and so alone they cannot give all the parameters but only all except one. The remaining one must be obtained from the linear relationship given by the controller. Using the optimal predictors $\hat{y}_2(t|t-f-1)$ and $\hat{y}_3(t|t-f-1)$ means we have used this linear relationship, and in doing so all the parameters become identifiable. We will explain in mathematical detail why this is so.
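Before the algebra, the linear relationship described above can be seen numerically. The following is only an illustrative sketch under assumed values, not one of the thesis programs: a hypothetical first-order plant $y_t = 0.7y_{t-1} + u_{t-1} + a_t$ is run under the proportional feedback $u_t = -ky_t$, first with a single gain and then with a shift between two gains. The plant, the gains and the function names are all assumptions made for the illustration.

```python
import random

def simulate_closed_loop(gains, n_per_gain, seed=0):
    """Simulate y_t = 0.7*y_{t-1} + u_{t-1} + a_t under u_t = -k*y_t.

    `gains` lists the proportional gains applied one after another
    (hypothetical values); returns the input and output series.
    """
    rng = random.Random(seed)
    u, y = [], []
    u_prev = y_prev = 0.0
    for k in gains:
        for _ in range(n_per_gain):
            y_now = 0.7 * y_prev + u_prev + rng.gauss(0.0, 1.0)
            u_now = -k * y_now          # feedback law: u is an exact multiple of y
            u.append(u_now)
            y.append(y_now)
            u_prev, y_prev = u_now, y_now
    return u, y

def collinearity_gap(u, y):
    """Return 1 - r^2 for the regressor columns [u_t, y_t] (uncentred).

    The value is det(X'X) divided by the product of its diagonal entries,
    so it is 0 when the two columns are linearly dependent.
    """
    suu = sum(a * a for a in u)
    syy = sum(b * b for b in y)
    suy = sum(a * b for a, b in zip(u, y))
    return 1.0 - suy * suy / (suu * syy)

u1, y1 = simulate_closed_loop([0.5], 400)
gap_one_law = collinearity_gap(u1, y1)

u2, y2 = simulate_closed_loop([0.2, 0.8], 200)
gap_two_laws = collinearity_gap(u2, y2)

print(f"one control law : 1 - r^2 = {gap_one_law:.2e}")
print(f"two control laws: 1 - r^2 = {gap_two_laws:.2e}")
```

With a single gain the two regressor columns are proportional, so the normalized determinant vanishes and the normal equations cannot be solved for both parameters. With two different gains the dependence differs between the two halves of the data and the determinant is clearly nonzero.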
We consider the identification of the polynomials $a(z^{-1})$, $b(z^{-1})$ and $c(z^{-1})$ in the following equation:

\[ y_{t+f+1} = \frac{a(z^{-1})\,u_t + b(z^{-1})\,y_t}{c(z^{-1})} + d(z^{-1})\,a_{t+f+1} \tag{3.218} \]

Now at the optimal condition, we have from Equation (3.82)

\[ \boldsymbol{\beta} = \left[\mathbf{X}^T\boldsymbol{\Psi}^T\boldsymbol{\Psi}\mathbf{X}\right]^{-1}\mathbf{X}^T\boldsymbol{\Psi}^T\mathbf{y} \tag{3.219} \]
\[ = \mathbf{L}_I^{-1}\,\mathbf{X}^T\boldsymbol{\Psi}^T\mathbf{y} \tag{3.220} \]

where $\boldsymbol{\Psi}$ is the triangular matrix with unit diagonal determined by the coefficients $c_1, \ldots, c_{s+q}$ of the polynomial $c(z^{-1})$ (3.221), and

\[ \mathbf{X} = \begin{bmatrix} u_{N-f-1} & \cdots & u_{N-f-1-m} & y_{N-f-1} & \cdots & y_{N-f-1-n} \\ u_{N-f-2} & \cdots & u_{N-f-2-m} & y_{N-f-2} & \cdots & y_{N-f-2-n} \\ \vdots & & \vdots & \vdots & & \vdots \end{bmatrix} ; \qquad \mathbf{y} = \begin{bmatrix} y_N \\ y_{N-1} \\ \vdots \end{bmatrix} \tag{3.222} \]

If the matrix $\mathbf{L}_I$ is nonsingular, then the system is identifiable. This is because we can calculate all the parameters of the system. The matrix $\mathbf{L}_I$ can be rightfully called the identifiability matrix. If $\mathbf{L}_I$ is nonsingular, then $\mathbf{X}$ must be of full (column) rank, i.e. there are no linear dependences among the columns of $\mathbf{X}$. The question of identifiability becomes the question of linear independence of the columns of the matrix $\mathbf{X}$. Now a feedback control law will usually cause this dependence. The nonidentifiability of the system parameters, in case the data have been collected under a minimum variance feedback, has been established by Box, G. E. P. and MacGregor, J. F. (1976). Now if we shift between two control laws, then the top part of the matrix $\mathbf{X}$, corresponding to the first control law, has a linear dependence of the columns which is different from that of the bottom part of the matrix $\mathbf{X}$, corresponding to the second control law. Overall, the columns of the matrix $\mathbf{X}$ become linearly independent, and so the matrix $\mathbf{L}_I$ becomes nonsingular. This is an ingenious idea of Ljung, L. et al. (1974). The addition of a dither signal suggested by Box, G. E. P. and MacGregor, J. F. (1974) also works, because the dither signal breaks the linear dependence of the columns caused by the controller. Now suppose that we use a feedback control law that remembers more past input or past output variables than a minimum variance control law does; then the system is identifiable.
This is because the additional input or output variable acts just like a dither signal, breaking the linear dependence of the columns of the matrix $\mathbf{X}$. On the contrary, if we use a feedback control law that remembers fewer past input and past output variables, then the system is not identifiable.

In 1981 Gevers, M. R. and Anderson, B. D. O. presented a new approach to the problem. They argued that, because under feedback the white noise drives both the input and output variable time series, these variables can be put in a vector form and treated as a vector time series. This vector form gives us a joint input-output model which contains both the controller and the transfer function of the process dynamics. This joint input-output model can be identified by a factorization of the joint spectral density matrix (Anderson, B. D. O. and Gevers, M. R. (1982)). However, with no dither signal this method becomes an indirect method.

We must point out here that there is one problem with closed-loop data: they often contain little variation. This makes identification difficult. The addition of a dither signal will help, but this too can be criticized. If the magnitude of the dither signal is small, then the data are like closed-loop data. On the other hand, if the magnitude of the dither signal is moderate or large, then the situation is like that of open-loop data. Even open-loop data are not always good. This is the reason why at times experiment designs are needed to collect data. For optimal conditions to minimize the error in the identification of the transfer function, one can refer to the paper by Gevers, M. and Ljung, L. (1986).

We now close this section by concluding that our method and the prediction error method are vulnerable to closed-loop data. If data must be collected under closed-loop conditions, then we can shift between two feedback control laws.
These control laws should contrast sharply with one another. If perturbation is allowed, then the addition of a dither signal is better than shifting between control laws. In the case where the data are already available and no control law is known, we will use an indirect method via the calculation of the two optimal predictors $\hat{y}_2(t|t-f-1)$ and $\hat{y}_3(t|t-f-1)$. Note that this indirect approach just suggested is different from the indirect approach discussed in Söderström, T. et al. (1975).

3.7 Examples

In this section, we will consider a few examples to test the correctness of the equations. There are 3 programs written in MATLAB. These programs are included in Appendix B. The program tf_id identifies the transfer function. The program arma_id identifies the ARMA time series. The program bj_id identifies the combined model, i.e. the transfer function and the ARMA time series at the same time. All these programs have been tested thoroughly for bugs. In this section, we will again test these programs and compare the results obtained from our method with the results obtained by the MATLAB software package. The model we generated is given below: (3.223) U3{ Z - 2.0 + 0.821 - 0.9-1 + 1 1 - O J 0 - +0.12z+ 1 - I.4.-1 -2 * 1 2 (- ) a + 3 QA8z 225 with $a_t$ as the normally distributed white noise with unit variance. The input signal $u_t$ is also white with a standard deviation of 5. The four time series $y_t$, $u_t$, $n_t$ and $a_t$ are plotted in Figure 3.2. With the combined identification strategy, we get the following results:

\[ \omega_{CB}(z^{-1}) = 1.9632 + 0.8078z^{-1} \tag{3.226} \]
\[ \delta_{CB}(z^{-1}) = 1 - 0.8916z^{-1} + 0.2081z^{-2} \tag{3.227} \]
\[ \phi_{CB}(z^{-1}) = 1 - 1.1928z^{-1} + 0.2943z^{-2} \tag{3.228} \]
\[ \theta_{CB}(z^{-1}) = 1 - 0.4735z^{-1} + 0.1053z^{-2} \tag{3.229} \]

The sample variance of the white noise is 1.1278 and its estimated value is 1.0975.
With the split identification strategy, the program tf_id.m gives us the following model:

\[ \omega_{SP}(z^{-1}) = 1.9632 + 0.8078z^{-1} \tag{3.230} \]
\[ \delta_{SP}(z^{-1}) = 1 - 0.8916z^{-1} + 0.2081z^{-2} \tag{3.231} \]

The above polynomials are identical to those given by the combined identification strategy. The excellent agreement tells us that both the theory which results in the equations and the software are correct. The split identification of the ARMA time series gives us the following results:

\[ \phi_{SP}(z^{-1}) = 1 - 1.0421z^{-1} + 0.1482z^{-2} \tag{3.232} \]
\[ \theta_{SP}(z^{-1}) = 1 - 0.3368z^{-1} + 0.1198z^{-2} \tag{3.233} \]

and an estimated variance of the white noise of 1.1145. At first sight, this appears to be unexpected, because we would think the split identification strategy would give us a better result, i.e. a smaller estimated white noise variance and values closer to the true parameters. To check this result, we identified the same ARMA time series, but this time via the program tf_id.m, not the program arma_id.m. We got identical results. This assures us that all the software is correct. The reason for the difference in the obtained results of the ARMA time series is the difference in the time series. When we identified the ARMA time series in the split identification strategy, we identified the raw time series, i.e. the one we generated. In the combined identification strategy, we identified another time series: the one that had been subtracted out of the output variable time series.

[Figure 3.2. Input-Output Series of Identified System. Two panels, "TF - Input/Output Time Series" and "ARMA - Input/Output Time Series", each plotting the series against time from 0 to 100.]

The above example shows us that the theory is correct and the programs are bug-free. We actually need only one program, the program tf_id.m. This program is simpler and shorter than the other programs.
The program bj_id.m, due to its length and number of equations, is less numerically stable than the other two programs. In fact, to use this program we have to set the initial estimates close to the optimal values for the program to converge. The purpose of the above experiment is to test the equations and to show that we can obtain unbiased estimation of the rational transfer function when the disturbance is colored. This is important for both the output error method and our method. Now we will compare our results with the results given by the prediction error method. For this system, the prediction error method in the MATLAB software gives us the following results:

\[ \omega_{PE}(z^{-1}) = 2.0059 + 0.8227z^{-1} \tag{3.234} \]
\[ \delta_{PE}(z^{-1}) = 1 - 0.8956z^{-1} + 0.2039z^{-2} \tag{3.235} \]
\[ \phi_{PE}(z^{-1}) = 1 - 0.1759z^{-1} + 0.1295z^{-2} \tag{3.236} \]
\[ \theta_{PE}(z^{-1}) = 1 - 0.8600z^{-1} + 0.0339z^{-2} \tag{3.237} \]

Comparing the results, we see that the prediction error method in the MATLAB software gives almost the same accuracy as our method for the parameters of the transfer function model, but the parameters of the ARMA model obtained by the prediction error method seem to be less accurate than those given by our method. The difference in the accuracies is not that large, since in this example our method iterates half the total number of parameters, i.e. it uses only the $\delta$s and $\theta$s in the Newton-Raphson equations, and there are exactly four of them ($\delta_1$, $\delta_2$, $\theta_1$ and $\theta_2$) out of a total of eight parameters. In cases where the number of the iterating parameters, i.e. the $\delta$s and $\theta$s, is much less than the total number of parameters, significant improvement in accuracy can be obtained by our method. This can be illustrated by the following example in the identification of an ARMA.

One hundred observations of the following time series were generated

\[ n_t = \frac{1 - 0.4z^{-1}}{1 - 1.9z^{-1} + 1.18z^{-2} - 0.24z^{-3}}\,a_t \tag{3.238} \]

with the statistics of the white noise given by the following table:
Table 3.1 Statistics of $a_t$

Lag    Autocovariances
0       9.0197
1      -1.1262
2      -0.6632
3      -0.2408
4      -0.8945
5       0.1830
6      -0.5839
7       0.9126
8      -0.8533
9      -0.3383

Using these one hundred observations, our semi-analytical method gives us the following model:

\[ n_t = \frac{1 - 0.4284z^{-1}}{1 - 1.7647z^{-1} + 1.0323z^{-2} - 0.1891z^{-3}}\,e_t \tag{3.239} \]

With this obtained model and the obtained white noise $e_t$, we have the following statistic table:

Table 3.2 Statistics of $e_t$
Autocovariances Crosscovariances 8.6053 8.6053 -0.0000 0.0000 -0.0212 0.0000 0.0001 0.1733 -0.4569 -0.0478 0.2432 0.3906 -0.1776 -0.2535 0.9214 -0.1911 -0.6256 -0.7396 -0.2696 0.2348
Lag 0 1 2 3 4 5 6 7 8 9

The fact that the autocovariance at lag 1 and the crosscovariances at lags 1, 2 and 3 in the above table are zero indicates that the identification is perfect. Beyond these lags the statistics are theoretically supposed to be zero. In the table, they are not zero, because they are sample statistics. Since the identification results are good, we want to compare these results with what we obtain from the prediction error method. The comparison is given in the following table:

Table 3.3 Model Comparison

Parameter       True       PE (MATLAB)    Derivative
$\phi_1$        1.9000      2.1590         1.7647
$\phi_2$       -1.1800     -1.5742        -1.0323
$\phi_3$        0.2400      0.3949         0.1891
$\theta_1$      0.4000      0.8344         0.4284
$\sigma_a^2$    9.0197      8.9191         8.6053

From the above table, we can see that the prediction error method estimation is poorer than the estimation given by our method. With the estimated variance (8.9191) of the white noise being smaller than its sample counterpart (9.0197), the method of prediction error is adequate, because it means the method indeed tries to look for the minimum variance of the white noise. However, because it has to iterate with 4 parameters, the result is poor. This can be seen as follows. The prediction error method uses the Gauss-Newton approach to calculate the values of the parameters for the next iterative stage, which means we have

\[ \begin{bmatrix} \phi_1 \\ \phi_2 \\ \phi_3 \\ \theta_1 \end{bmatrix}_{t+1} = \begin{bmatrix} \phi_1 \\ \phi_2 \\ \phi_3 \\ \theta_1 \end{bmatrix}_{t} + \alpha_t \begin{bmatrix} x & x & x & x \\ x & x & x & x \\ x & x & x & x \\ x & x & x & x \end{bmatrix}_{t}^{-1} \begin{bmatrix} x \\ x \\ x \\ x \end{bmatrix}_{t} \tag{3.240} \]

where $x$ is some number. Now the problem with the Gauss-Newton method and all the other Quasi-Newton methods is the approximation of the inverse matrix (Hessian) on the right hand side of the above equation. This creates some loss of accuracy. The loss of accuracy is greater when the dimension of the matrix is larger. On the contrary, our method iterates with only one parameter, $\theta_1$:

\[ \left[\theta_1\right]_{t+1} = \left[\theta_1\right]_{t} + \alpha_t \left[x\right]_{t}^{-1}\left[x\right]_{t} \tag{3.241} \]

As for the autoregressive parameters $\phi_i$, they iterate in harmony with the parameter $\theta_1$, because they are calculated from the equation

\[ \boldsymbol{\phi} = \left[\mathbf{N}_1^T\boldsymbol{\Gamma}^T\boldsymbol{\Gamma}\mathbf{N}_1\right]^{-1}\mathbf{N}_1^T\boldsymbol{\Gamma}^T\left[\mathbf{n} + \boldsymbol{\Gamma}\mathbf{N}_2\boldsymbol{\theta}\right] \tag{3.242} \]

This is the reason why, in this particular case, our method could deliver such an excellent result. The above example does not mean the prediction error method is a poor estimator. In fact, it is an acceptable estimator, as it has proved by obtaining a variance of 8.9191, which is smaller than the sample variance 9.0197 of the white noise.

One might argue that since most iterative numerical methods will have some approximation of the function to be minimized, the approximation of the Hessian by Gauss-Newton and Quasi-Newton methods might not matter a great deal, because it might be absorbed by the approximation of the function. Newton's methods approximate the minimized function by a quadratic function. This is a valid argument. However, one cannot deny the fact that additional accuracy is gained by removing a number of parameters from the minimized function, as the above example has shown. In general, the identification of an ARIMA is more difficult than the identification of a rational transfer function, because of a small signal to noise ratio. Therefore, an accurate method is in demand. Since in an ARIMA we usually see more parameters $\phi_i$ than parameters $\theta_i$, as in the above example, there should not be any argument that our method gives higher accuracy. This means our semi-analytical method should be used.
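The idea of iterating on the moving average parameter alone, while the autoregressive parameter follows from a linear least squares solve, can be sketched as follows. This is a minimal illustration under stated assumptions and not the thesis program: it uses a hypothetical ARMA(1,1) series $n_t = \phi n_{t-1} + a_t - \theta a_{t-1}$, zero initial conditions, numerical rather than analytical derivatives, and a step-halving safeguard that is an addition for robustness. All function names are illustrative.

```python
import math
import random

def simulate_arma11(phi, theta, n, seed=1):
    """Generate n_t = phi*n_{t-1} + a_t - theta*a_{t-1} with unit-variance white noise."""
    rng = random.Random(seed)
    series, n_prev, a_prev = [], 0.0, 0.0
    for _ in range(n):
        a_now = rng.gauss(0.0, 1.0)
        n_now = phi * n_prev + a_now - theta * a_prev
        series.append(n_now)
        n_prev, a_prev = n_now, a_now
    return series

def concentrated_sse(theta, series):
    """For a fixed theta, solve phi by linear least squares; return (sse, phi)."""
    # w_t = n_t + theta*w_{t-1} filters the series by 1/(1 - theta z^-1),
    # after which the residual a_t = w_t - phi*w_{t-1} is linear in phi.
    w, w_prev = [], 0.0
    for value in series:
        w_prev = value + theta * w_prev
        w.append(w_prev)
    num = sum(w[t] * w[t - 1] for t in range(1, len(w)))
    den = sum(w[t - 1] ** 2 for t in range(1, len(w)))
    phi = num / den
    sse = sum((w[t] - phi * w[t - 1]) ** 2 for t in range(1, len(w)))
    return sse, phi

def fit_arma11(series, theta0=0.0, h=1e-4, max_iter=50):
    """Damped Newton iteration on theta alone; phi follows from the inner solve."""
    theta = theta0
    sse, _ = concentrated_sse(theta, series)
    for _ in range(max_iter):
        s_plus, _ = concentrated_sse(theta + h, series)
        s_minus, _ = concentrated_sse(theta - h, series)
        grad = (s_plus - s_minus) / (2.0 * h)
        curv = (s_plus - 2.0 * sse + s_minus) / (h * h)
        # Newton step when the curvature is positive, else a small downhill step
        step = -grad / curv if curv > 0.0 else -math.copysign(0.1, grad)
        improved = False
        while abs(step) > 1e-10:
            trial = max(-0.95, min(0.95, theta + step))
            s_trial, _ = concentrated_sse(trial, series)
            if s_trial < sse:
                theta, sse, improved = trial, s_trial, True
                break
            step *= 0.5          # halve the step until the criterion decreases
        if not improved:
            break
    return theta, concentrated_sse(theta, series)[1], sse

data = simulate_arma11(phi=0.8, theta=0.4, n=500)
sse_start, _ = concentrated_sse(0.0, data)
theta_hat, phi_hat, sse_hat = fit_arma11(data)
print(f"theta = {theta_hat:.4f}, phi = {phi_hat:.4f}, sse = {sse_hat:.2f}")
```

Only one number is iterated here, whatever the order of the autoregressive part, which mirrors the contrast drawn above between the four-parameter Gauss-Newton update and the one-parameter update of the semi-analytical method.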
In the identification of a rational transfer function, we usually have a decent signal to noise ratio and more parameters $\delta_i$ than parameters $\omega_i$, a completely different scenario from the case of the ARIMA, so we would like to convince ourselves with many more examples showing a better accuracy of our method. To do this, we now go back to the identification of a rational transfer function. We again used the program tf_id.m and the MATLAB system identification toolbox for a comparison. This toolbox has a function called oe, which stands for output error. It must be mentioned here that even though the function is called output error, the method described in the toolbox is referred to as the prediction error method. In our final comparison, we randomly created models, and from these models we generated 100 observations, and from these observations of $y_t$ and $u_t$ we identified the parameters by both methods. The model has the following form: Vt = , 1 - bz x 1 - bz . "*-i ai + ( 3 - 2 4 3 ) 2 which is a very typical model one can encounter in practice. The observation data were generated by generating $a_t$ and $u_t$. Both of these series are white. The series $a_t$ is normal and has a unit variance, but the series $u_t$ is uniformly distributed and has a standard deviation of 2.5. This was done to give a decent signal to noise ratio. A total of more than 200 models was created. As in the above case of identification of an ARMA, the criterion we will use to judge the model is the variance of the disturbance white noise $a_t$. We call $\sigma_a^2$ the variance of the generated disturbance white noise, $\sigma_{a,PE}^2$ the variance of the disturbance white noise obtained by the prediction error method and $\sigma_{a,SA}^2$ the variance of the disturbance white noise obtained by our method, the semi-analytical method.

[Figure 3.3. Parameters of Compared Systems.]

[Figure 3.4. PE Estimated Parameters of Compared Systems.]

[Figure 3.5. SA Estimated Parameters of Compared Systems.]

[Figure 3.6. Comparison of Estimated Disturbance Variances. Three panels plotted against trial number (0 to 200): the top panel compares $\sigma_{a,PE}^2$ with $\sigma_a^2$, the middle panel compares $\sigma_{a,SA}^2$ with $\sigma_a^2$, and the bottom panel compares $\sigma_{a,PE}^2$ with $\sigma_{a,SA}^2$.]

Figures 3.3, 3.4 and 3.5 show the parameters of the created and obtained models. We cannot draw any conclusion from these figures. However, from Figure 3.6 we can draw a very convincing conclusion about the accuracy of the two methods. From the top graph, we can see that the prediction error method is an adequate estimator, because $\sigma_{a,PE}^2$ is smaller than $\sigma_a^2$ most of the time. Only 21 times out of 199 trials does it fail to obtain a smaller variance than that of the generated disturbance. Our semi-analytical method fails only once, at trial 101. This can be seen from the middle graph of Figure 3.6. The bottom graph of this figure tells us that there are only two cases in which the prediction error method gives better results than our method. These cases happen at trials 7 and 50. The result of trial 140 tells us that once in a while the result from the prediction error method given by the MATLAB software package must be checked. So we can conclude that in general our method gives more accurate results than the prediction error and output error methods.

3.8 Conclusion

In this chapter, we have discussed the identification of the Box-Jenkins model. The Box-Jenkins model has been considered difficult to identify, but is a superior model due to its parsimonious characteristic.
An accurate identification of the transfer function of the Box-Jenkins model is very important, because its inverse is the pole-zero cancellator needed in the design of many controllers. The method proposed in this thesis makes it easier to choose the initial values of the parameters and gives more accurate results than the prediction error and output error methods. The method also unifies the identification of a rational transfer function and of an ARIMA time series. It also opens up a discussion about the vulnerability of the prediction error and maximum likelihood methods to closed-loop data. By introducing a robust semi-analytical method to identify the Box-Jenkins model, this thesis makes a contribution not only to control system theory but also to the time series literature.

Chapter 4 Controllers

4.1 Introduction

Controller design is the next step after system identification. As with identification, there are different design methods. But unlike identification, where the different methods seek the same solution, the different methods of controller design aim at different controllers with different characteristics. In this chapter, we will discuss a few well-known controllers and introduce improvements to the PID controller and the self-tuning controller.

4.2 The Minimum Variance Controller

The minimum variance controller is the easiest to obtain, because it follows directly from the model of the system. Our work does not involve the minimum variance controller itself. However, knowledge of it helps in understanding the minimum variance self-tuning controller. Therefore, we will briefly discuss this controller.
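Derivation of the Controller

We consider the usual Box-Jenkins model

$$ y_{t+f+1} = \frac{\omega(z^{-1})}{\delta(z^{-1})}\, u_t + \frac{\theta(z^{-1})}{\phi(z^{-1})}\, a_{t+f+1} \tag{4.1} $$

At time $t$ we want to choose the control action $u_t$ in the form of a linear combination of the past control actions and the past output variable values such that the output variable $y_t$ has minimum variance. Let the controller be

$$ u_t = \frac{l_1(z^{-1})}{l_2(z^{-1})}\, y_t \tag{4.2} $$

By replacing the controller in the control system equation we have

$$ y_{t+f+1} = \frac{\omega(z^{-1})\, l_1(z^{-1})}{\delta(z^{-1})\, l_2(z^{-1})}\, y_t + \frac{\theta(z^{-1})}{\phi(z^{-1})}\, a_{t+f+1} \tag{4.3} $$

Therefore under feedback the system follows the time series

$$ \left[ 1 - \frac{\omega(z^{-1})\, l_1(z^{-1})}{\delta(z^{-1})\, l_2(z^{-1})}\, z^{-f-1} \right] y_{t+f+1} = \frac{\theta(z^{-1})}{\phi(z^{-1})}\, a_{t+f+1} \tag{4.4} $$

or

$$ y_{t+f+1} = \frac{\delta(z^{-1})\, l_2(z^{-1})\, \theta(z^{-1})}{\phi(z^{-1}) \left[ \delta(z^{-1})\, l_2(z^{-1}) - \omega(z^{-1})\, l_1(z^{-1})\, z^{-f-1} \right]}\, a_{t+f+1} \tag{4.5} $$

Now if we define the following Diophantine equation:

$$ \frac{\theta(z^{-1})}{\phi(z^{-1})} = \psi(z^{-1}) + \frac{\gamma(z^{-1})}{\phi(z^{-1})}\, z^{-f-1} \tag{4.6} $$

then the output variable $y_{t+f+1}$ can be written as

$$ y_{t+f+1} = \psi(z^{-1})\, a_{t+f+1} + \frac{\delta(z^{-1})\, l_2(z^{-1})\, \gamma(z^{-1}) + \omega(z^{-1})\, l_1(z^{-1})\, \phi(z^{-1})\, \psi(z^{-1})}{\phi(z^{-1}) \left[ \delta(z^{-1})\, l_2(z^{-1}) - \omega(z^{-1})\, l_1(z^{-1})\, z^{-f-1} \right]}\, a_t \tag{4.7} $$

From the above equation, we can see that the variance of $y_{t+f+1}$ consists of two terms. The first term is independent of the controller, while the second term depends on it. The minimum variance controller sets the second term to zero, i.e.

$$ \delta(z^{-1})\, l_2(z^{-1})\, \gamma(z^{-1}) + \omega(z^{-1})\, l_1(z^{-1})\, \phi(z^{-1})\, \psi(z^{-1}) = 0 \tag{4.8} $$

or

$$ \frac{l_1(z^{-1})}{l_2(z^{-1})} = -\frac{\delta(z^{-1})\, \gamma(z^{-1})}{\omega(z^{-1})\, \phi(z^{-1})\, \psi(z^{-1})} \tag{4.9} $$

The minimum variance controller is hence given by

$$ u_t = -\frac{\delta(z^{-1})\, \gamma(z^{-1})}{\omega(z^{-1})\, \phi(z^{-1})\, \psi(z^{-1})}\, y_t \tag{4.10} $$

and under this control law the output variable follows a moving average series of order $f$

$$ y_t = \psi(z^{-1})\, a_t \tag{4.11} $$

Nonstationary Disturbance

In case the disturbance has $d$ roots on the unit circle, they can be separated as shown below

$$ \frac{\theta(z^{-1})}{\phi(z^{-1})}\, a_t = \frac{\theta(z^{-1})}{(1 - z^{-1})^d\, \phi^*(z^{-1})}\, a_t \tag{4.12} $$

where $\phi^*(z^{-1})$ has no roots on the unit circle. The minimum variance controller can be written as

$$ u_t = -\frac{\delta(z^{-1})\, \gamma(z^{-1})}{\omega(z^{-1})\, (1 - z^{-1})^d\, \phi^*(z^{-1})\, \psi(z^{-1})}\, y_t \tag{4.13} $$

$$ \phantom{u_t} = -\frac{\delta(z^{-1})\, \gamma(z^{-1})}{\omega(z^{-1})\, \phi^*(z^{-1})\, \psi(z^{-1})} \left[ 1 + z^{-1} + z^{-2} + \cdots \right]^d y_t \tag{4.14} $$

The controller is a linear combination of an infinite number of past output variable readings. Since it integrates all these readings into one quantity, the controller is said to have an integrator.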
This is equivalent to the integral gain of the three-mode PID controller. Usually, the controller is written as

$$ \nabla^d u_t = -\frac{\delta(z^{-1})\, \gamma(z^{-1})}{\omega(z^{-1})\, \phi^*(z^{-1})\, \psi(z^{-1})}\, y_t \tag{4.15} $$

where $\nabla = (1 - z^{-1})$, and the left-hand side of the above equation has the physical meaning of the change in the position of the input variable when $d = 1$. The controller then has two effects: the integral effect, which demands that the input variable wander away from its steady-state position to compensate for the nonstationary behaviour of the disturbance, and the proportional effect, which decouples the interaction between the current and past readings of the input and output variables and the disturbance.

Non-minimum Phase Systems

Recall that under a minimum variance control law, the input variable follows the series

$$ u_t = -\frac{\delta(z^{-1})\, \gamma(z^{-1})}{\omega(z^{-1})\, \phi(z^{-1})}\, a_t \tag{4.16} $$

For a stationary disturbance, the closed-loop input variable of a non-minimum phase system is nonstationary, because the polynomial $\omega(z^{-1})$ has roots inside the unit circle. This is not desirable, and therefore a minimum variance controller is not suitable for non-minimum phase systems. Also, non-minimum phase systems are extremely sensitive to model mismatch. Non-minimum phase systems are known to process engineers as systems with an inverse response. This inverse response might not be seen in a discrete control loop if we sample slowly enough, so this is one way to cope with non-minimum phase systems. The other way is to design a linear quadratic Gaussian controller. This is the topic of our next discussion.

4.3 The Linear Quadratic Gaussian Controller

A minimum variance controller is not always desirable. The minimum variance control law allows a high gain, so that the controller can move maximally to give minimum variance of the output variable. What is often desired instead is a control law that produces as small a variance of the output variable as possible, but with a penalty on the variance of the input variable.
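Both the minimum variance law of Section 4.2 and the penalised law developed in this section rest on the Diophantine split of $\theta(z^{-1})/\phi(z^{-1})$ into $\psi(z^{-1})$ and $\gamma(z^{-1})$. The following is a minimal sketch of that split by polynomial long division, assuming polynomials are stored as coefficient lists in ascending powers of $z^{-1}$ with a monic $\phi$. The function name `diophantine_split` and the numerical values are illustrative assumptions, not taken from the thesis.

```python
import numpy as np

def diophantine_split(theta, phi, f):
    """Return (psi, gamma) such that
    theta(z^-1) = psi(z^-1) * phi(z^-1) + z^-(f+1) * gamma(z^-1),
    with psi of degree f. Coefficients are in ascending powers of z^-1
    and phi[0] is assumed to be 1 (monic)."""
    theta = np.asarray(theta, dtype=float)
    phi = np.asarray(phi, dtype=float)
    n = f + 1
    rem = np.zeros(max(len(theta), len(phi)) + n)
    rem[:len(theta)] = theta
    psi = np.zeros(n)
    for k in range(n):
        psi[k] = rem[k]                          # phi is monic
        rem[k:k + len(phi)] -= psi[k] * phi      # cancel the z^-k term
    gamma = rem[n:]                              # what is left, shifted by f+1
    return psi, gamma

# Hypothetical example: theta = 1 - 0.5 z^-1, phi = 1 - 0.9 z^-1, delay f = 1
psi, gamma = diophantine_split([1.0, -0.5], [1.0, -0.9], f=1)
# Output variance under minimum variance control, for unit noise variance
mv_output_variance = float(np.sum(psi ** 2))
```

With these polynomials, the numerator and denominator of the minimum variance controller follow by polynomial multiplication, for example `np.convolve(delta, gamma)` and `np.convolve(np.convolve(omega, phi), psi)`, where `delta` and `omega` are the (assumed) coefficient lists of the transfer function.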
A minimum variance control law is usually considered optimal, while a control law that limits the variance of the input variable is considered suboptimal. However, the minimum variance controller with a penalty on the variance of the input variable is occasionally called an optimal controller. The term optimal in this context does not mean minimum variance; rather, it means that the controller is derived from an optimality criterion. The criterion is quadratic in the input and output variables. The disturbance is driven by a white Gaussian (normal) noise. The controller and control system are both linear. All these attributes give the controller its name: the Linear Quadratic Gaussian (LQG) controller. Its formulation is given as follows.

The minimum variance controller can be formulated as the following optimization problem

$$ \min_{u_t}\; E\{ y_{t+f+1}^2 \} \tag{4.17} $$

which is a special case of the following optimization problem

$$ \min_{u_t}\; E\{ y_{t+f+1}^2 + \lambda u_t^2 \}, \qquad \lambda \ge 0 \tag{4.18} $$

This is the optimality criterion, or performance index, of an LQG controller. Its physical meaning is the sum of the closed-loop output and input variances, with a weight on the variance of the input variable. If $\lambda$ is zero, the movement of the input variable is unconstrained. This is the case of minimum variance control: the controller moves the control element maximally to achieve minimum variance of the output variable. On the other hand, if $\lambda$ is very large, the criterion puts almost all the weight on the input variable. In this case, minimization of the performance index almost means minimization of the movement of the input variable. The result is a near-zero variance of the input variable under feedback. The controller practically does not move the control element at all, and the variance of the output variable is at the highest value given by the disturbance. In this case, the loop is practically open.
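The two limiting cases of $\lambda$ can be checked numerically on a small example. The sketch below assumes a hypothetical system with no delay ($f = 0$), a pure-gain process $\omega_0$ and an ARMA(1,1) disturbance $(1 - \theta z^{-1})/(1 - \phi z^{-1})\, a_t$, and it assumes that the one-step criterion (4.18) is minimized by $u_t = -\gamma / [(\omega_0 + \lambda/\omega_0)(1 - \phi z^{-1})]\, a_t$ with $\gamma = \phi - \theta$. The function name and parameter values are illustrative, not from the thesis.

```python
def lqg_variances(lam, omega0=1.0, theta=0.3, phi=0.8, sigma_a2=1.0):
    """Closed-loop variances (var_y, var_u) for the assumed system
    y_{t+1} = omega0 * u_t + (1 - theta z^-1)/(1 - phi z^-1) a_{t+1}
    under the one-step penalised control law."""
    gamma = phi - theta                  # Diophantine remainder for f = 0
    kappa = lam / omega0
    # u_t is an AR(1) series driven by a_t
    var_u = gamma ** 2 * sigma_a2 / ((omega0 + kappa) ** 2 * (1.0 - phi ** 2))
    # y_{t+1} = a_{t+1} - kappa * u_t, and the two terms are uncorrelated
    var_y = sigma_a2 + kappa ** 2 * var_u
    return var_y, var_u

# Open-loop variance of the ARMA(1,1) disturbance, for comparison
open_loop_var = (1.0 + 0.3 ** 2 - 2 * 0.8 * 0.3) / (1.0 - 0.8 ** 2)

for lam in (0.0, 0.5, 2.0, 1e9):
    var_y, var_u = lqg_variances(lam)
    print(f"lambda={lam:g}  var_y={var_y:.4f}  var_u={var_u:.4f}")
```

At $\lambda = 0$ the output variance equals the noise variance (minimum variance control), and as $\lambda$ grows the input variance shrinks towards zero while the output variance approaches the open-loop value.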
In the usual case, $\lambda$ will take a value in the interval $[0, \infty)$ and the closed-loop variance of the input variable $\sigma_u^2$ will be lower than or equal to that of the minimum variance control case, $\sigma_{u,mv}^2$.

Actually, Equation (4.18) should be modified for the following reason. Suppose that we want the closed-loop input variable variance to be equal to or smaller than $\sigma_u^2$. The optimization problem will be

$$ \min_{u_t}\; E\{ y_{t+f+1}^2 \} \tag{4.19} $$

with the open-loop control system

$$ y_{t+f+1} = \frac{\omega(z^{-1})}{\delta(z^{-1})}\, u_t + \frac{\theta(z^{-1})}{\phi(z^{-1})}\, a_{t+f+1} \tag{4.20} $$

subject to a constraint on the closed-loop input variable variance

$$ E\{ u_t^2 \} \le \sigma_u^2 \tag{4.21} $$

The Lagrangian of the above optimization problem is (Luenberger, D. (1969))

$$ \min_{u_t}\; E\{ y_{t+f+1}^2 + \lambda (u_t^2 - \sigma_u^2) \} \tag{4.22} $$

not

$$ \min_{u_t}\; E\{ y_{t+f+1}^2 + \lambda u_t^2 \} \tag{4.23} $$

The solution of the above optimization problem must follow the Kuhn-Tucker theorem, i.e. we must have (Luenberger, D. (1969))

$$ \lambda \left( E\{ u_t^2 \} - \sigma_u^2 \right) = 0 \tag{4.24} $$

If it happens that $\sigma_{u,mv}^2$ is smaller than $\sigma_u^2$, $\lambda$ will be zero. Otherwise, $E\{u_t^2\} = \sigma_u^2$. So in both cases, the Kuhn-Tucker theorem is satisfied. This Kuhn-Tucker condition can be obtained from Equation (4.22), not from Equation (4.18). Nevertheless, the two control criteria give the same controller, and the latter has established itself in the control literature; therefore we will refer to this optimality criterion when we mention the LQG controller. In Equation (4.21), the strict inequality ($<$) applies when $\sigma_{u,mv}^2$ is smaller than $\sigma_u^2$, and the equality applies when $\sigma_{u,mv}^2$ is greater than $\sigma_u^2$.

Statistically, Equation (4.22) should also be modified. This is because the optimization is taken at time $t$ and $y_{t+f+1}$ is in the future of time $t$. At this time $y_t$ and $a_t$ are given, so the expectation should be the conditional expectation given $y_t$ or $a_t$, which means we must have either

$$ \min_{u_t}\; E\{ y_{t+f+1}^2 + \lambda (u_t^2 - \sigma_u^2) \mid y_t \} \tag{4.25} $$

or

$$ \min_{u_t}\; E\{ y_{t+f+1}^2 + \lambda (u_t^2 - \sigma_u^2) \mid a_t \} \tag{4.26} $$

From the above equation, we can write

$$ \min_{u_t}\; E\left\{ \left( \frac{\omega(z^{-1})}{\delta(z^{-1})}\, u_t + \frac{\theta(z^{-1})}{\phi(z^{-1})}\, a_{t+f+1} \right)^2 + \lambda (u_t^2 - \sigma_u^2) \;\middle|\; a_t \right\} \tag{4.27} $$

By using the Diophantine equation we had before, we can write the above equation as

$$ \min_{u_t}\; E\left\{ \left( \psi(z^{-1})\, a_{t+f+1} \right)^2 + \left( \frac{\omega(z^{-1})}{\delta(z^{-1})}\, u_t + \frac{\gamma(z^{-1})}{\phi(z^{-1})}\, a_t \right)^2 + \lambda (u_t^2 - \sigma_u^2) \;\middle|\; a_t \right\} \tag{4.28} $$

or

$$ \min_{u_t}\; \sigma_{y,mv}^2 + E\left\{ \left( \frac{\omega(z^{-1})}{\delta(z^{-1})}\, u_t + \frac{\gamma(z^{-1})}{\phi(z^{-1})}\, a_t \right)^2 + \lambda (u_t^2 - \sigma_u^2) \;\middle|\; a_t \right\} \tag{4.29} $$

Now the expected value of $a_{t-k}$ ($k \ge 0$) given $a_t$ is $a_{t-k}$, and similarly for $u_{t-k}$ ($k > 0$), because $u_t$ is a linear combination of $a_{t-k}$ ($k \ge 0$). This means the expectation operator will vanish and we are left with a quadratic equation in the variable $u_t$. In Vu, K. (1991), the author has obtained the solution for this problem as below:

$$ u_t = -\frac{\delta(z^{-1})\, \gamma(z^{-1})}{\omega(z^{-1})\, \phi(z^{-1}) + \dfrac{\lambda}{\omega_0}\, \delta(z^{-1})\, \phi(z^{-1})}\, a_t \tag{4.30} $$

and therefore the LQG controller is given as follows:

$$ u_t = -\frac{\delta(z^{-1})\, \gamma(z^{-1})}{\omega(z^{-1})\, \phi(z^{-1})\, \psi(z^{-1}) + \dfrac{\lambda}{\omega_0}\, \theta(z^{-1})\, \delta(z^{-1})}\, y_t \tag{4.31} $$

So compared to the minimum variance controller, the LQG controller has an additional term $\frac{\lambda}{\omega_0}\, \theta(z^{-1})\, \delta(z^{-1})$. Now if we put the above equation back into the Box-Jenkins model equation, we will get an equation for the time series the output variable follows under feedback:

$$ y_t = \frac{\omega(z^{-1})\, \phi(z^{-1})\, \psi(z^{-1}) + \dfrac{\lambda}{\omega_0}\, \theta(z^{-1})\, \delta(z^{-1})}{\omega(z^{-1})\, \phi(z^{-1}) + \dfrac{\lambda}{\omega_0}\, \delta(z^{-1})\, \phi(z^{-1})}\, a_t \tag{4.32} $$

From the two equations for $u_t$ and $y_t$, we can find the following relationship between the variables $y_t$ and $u_t$ under feedback

$$ y_t = -\frac{\lambda}{\omega_0}\, u_{t-f-1} + \psi(z^{-1})\, a_t \tag{4.33} $$

where $\psi(z^{-1})$ is given in the Diophantine equation mentioned previously. From this equation, we can find the following relationship between the variance $\sigma_y^2$ of $y_t$ and the variance $\sigma_u^2$ of $u_t$ under feedback:

$$ \sigma_y^2 = \left( \frac{\lambda}{\omega_0} \right)^2 \sigma_u^2 + \sigma_{y,mv}^2 \tag{4.34} $$

This relationship can be seen easily from the Clarke-Gawthrop self-tuning controller. We will mention this controller in a later section.
From the above equation, we can write

$$ \sigma_y^2 - \left( \frac{\lambda}{\omega_0} \right)^2 \sigma_u^2 = \sigma_{y,mv}^2 \tag{4.35} $$

and see that if $\lambda$ is much smaller than $\omega_0$, an increase in the output variance $\sigma_y^2$ will be compensated for by a huge reduction in the input variance $\sigma_u^2$. This is the attraction of constraining the input variance and of the LQG controller.

4.4 The PID Controller

In this section, we will discuss the stochastic PID controller, i.e. the control system is disturbed by an ARIMA time series. There are two types of methods to obtain the gains of a PID controller: model-based and empirical. Most of the empirical methods were proposed earlier; most of the model-based methods are more recent. Since the beginning of PID control, control engineers have worked on methods to obtain the optimal gains. The early work was by Ziegler, J. G. and Nichols, N. B. (1942), who obtained the gains from the closed-loop ultimate gain and frequency. When the method was proposed, the ultimate oscillation was obtained by on-line tuning with the loop closed. However, it can also be achieved with a relay feedback parallel to the controller (Åström, K. J. and Wittenmark, B. (1989)). The ultimate gain and frequency can be calculated from the limit cycle produced by an ideal relay. In 1953, Cohen, G. H. and Coon, G. A. introduced the concept of the open-loop process reaction curve to approximate the control loop by a delayed first-order system, and obtained the optimal gains for the minimum offset, one-quarter decay ratio and minimum integral square error response to a disturbance change. A method that uses a concept similar to that of the Dahlin controller is the Lambda ($\lambda$) tuning method (Thomasson, F. Y. (1995)). In this method, the closed-loop transfer functions of simple processes are designed to be first order with unity gain, and the PID controller gains are expressed as functions of the closed-loop time constant $\lambda$.
Tuning means varying this constant $\lambda$ until satisfactory performance is achieved. More recently, Rivera, D. E. et al. (1986) proposed tuning rules based on the Internal Model Control (IMC) design procedure. To use this method, we normally have to approximate the delay by a rational function (Padé) and reduce the dimension of the process. As closed-loop performance is a topic of much current discussion in process control, new tuning rules for desired closed-loop performance were suggested in Cluett, W. R. and Wang, L. (1996). These sets of rules are different for different processes. The rules are based on the closed-loop time constant and the ratio of the open-loop time constant to the dead time.

In their book, Åström, K. J. and Hägglund, T. (1988) discussed a number of methods to tune analogue PID controllers. These methods include the Ziegler-Nichols methods (both step and frequency response), the modified Ziegler-Nichols method, the dominant pole method, the frequency domain design method and the pole placement method. The frequency response method of Ziegler-Nichols is the one we mentioned above, and it is the better known of the two. The step response method gives recommended settings as functions of two parameters: the apparent dead time, and a constant which is the ratio of the product of the gain and the dead time to the time constant. The modified Ziegler-Nichols method generalizes the Ziegler-Nichols frequency response method by interpreting the latter as the controller gains moving one point on the Nyquist curve to a desired position. It moves the point where the Nyquist curve intersects the negative real axis to a position with a phase advance of 25° at the ultimate frequency. The modified Ziegler-Nichols method suggests that other points of the Nyquist curve can be moved to other positions. The Ziegler-Nichols methods are based on knowledge of one point on the Nyquist curve of the open-loop process dynamics.
In the dominant pole design, two points on the Nyquist curve are used. The points correspond to two dominant poles of the closed-loop system. The reason for two poles is that the poles can be a complex pair. The controller then needs at least 2 parameters (PI or PD) to move these two poles to the desired locations. PID means one parameter must be restricted. If several points on the Nyquist curve are known, then the frequency domain design method is the one suggested in the book. The criterion of the method is a design with a closed-loop frequency response of unit gain and a resonance peak $M_p$. There are M curves of the closed-loop transfer function. The $M_p$ value is the largest value of M on its Nyquist curve. The closed-loop performance is specified by the $M_p$ value, which is chosen in the range 1.1-1.5. The pole placement method is used for lower order systems, so that the controller can place all the poles, not just the dominant ones.

More recent work on analogue PID controllers by Zervos, C. et al. (1988) introduced the concept of orthonormal series to represent a system, and tuned the controller gains from them. Perhaps the KLV method of Landau, I. D. and Voda, A. (1992) is the most systematic method to design a PID controller. The acronym KLV probably comes from the names Kessler, Landau and Voda, because the latter two used the former's Symmetrische Optimum (Kessler, C. (1958)) to design their auto-calibration method. The method claims to give better results than the Ziegler-Nichols frequency response method. The closed-loop characteristics such as response and overshoot can be calculated easily. It assures a good robustness margin and boasts a large field of applications. Works on multivariable analogue PID controllers use eigenvalue or pole assignment to place the closed-loop poles for stability (Park, H. and Seborg, D. E. (1974), Seraji, H. and Tarokh, M. (1977)).
A discrete version of the eigenvalue assignment method was studied by Stojic, M. R. and Petrovic, T. B. (1986), who used a grapho-analytical pole placement procedure to obtain the gains. Meo, J. A. C. et al. (1986) used projective control to obtain the gains. The adaptive or self-tuning PID controller was discussed by Radke, F. and Isermann, R. (1987), who used 2-stage least squares to estimate the parameters of a system and a numerical optimization procedure to estimate the controller gains.

In 1975, MacGregor, J. F. et al. suggested a method to obtain the optimal gains by plotting contours of variances under feedback. The optimal gains are obtained by extrapolation from the contours of variances. This is labour intensive, and the method does not work well for 3-parameter PID controllers. In Vu, K. (1992) the author used an algorithm similar to the one used to calculate the steady-state Kalman filter to obtain the optimal gains. Even though the theory of the method is correct, the suggested algorithm unfortunately fails on systems with a delay or with non-minimum phase behaviour. Since in this thesis we choose the Box-Jenkins model, we will improve the design of a PID controller for this model. In the following, we present our work on the PID controller.

4.4.1 The Time Series Variance Formulae

Under feedback, both the input variable and the output variable are time series driven by the same white noise, and our purpose in control design is to have minimum variance of the output variable. It is therefore necessary to mention the variance formulae of a stationary time series. The time series variance formulae have been studied extensively by the author and the results have been published (Vu, K. (1988)). There are 3 ways to calculate the variance of a stationary time series. We will briefly present them here for quick reference.
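As a cross-check on any closed-form variance formula, the variance of a stationary ARMA series can also be approximated by summing squared impulse-response weights. This brute-force approach is not one of the three methods of this section; it is a minimal sketch, assuming the sign convention $y_t = \frac{1 - \theta_1 z^{-1} - \cdots - \theta_n z^{-n}}{1 - \phi_1 z^{-1} - \cdots - \phi_n z^{-n}}\, a_t$ and a truncation length that is long enough for the weights to die out. The function name `arma_variance` is hypothetical.

```python
import numpy as np

def arma_variance(theta, phi, sigma_a2=1.0, n_terms=2000):
    """Approximate the variance of a stationary ARMA series by truncating
    the sum of squared impulse-response weights.
    theta = [theta_1, ...] and phi = [phi_1, ...] follow the convention
    (1 - theta_1 z^-1 - ...)/(1 - phi_1 z^-1 - ...)."""
    ma = np.concatenate(([1.0], -np.asarray(theta, dtype=float)))
    weights = np.zeros(n_terms)
    for k in range(n_terms):
        acc = ma[k] if k < len(ma) else 0.0
        for j, p in enumerate(phi, start=1):
            if k - j >= 0:
                acc += p * weights[k - j]    # autoregressive recursion
        weights[k] = acc
    return sigma_a2 * float(np.sum(weights ** 2))

# Hypothetical example: ARMA(1,1) with theta_1 = 0.3 and phi_1 = 0.8
var_arma11 = arma_variance([0.3], [0.8])
```

For an ARMA(1,1) series the result can be compared with the textbook value $(1 + \theta_1^2 - 2\phi_1\theta_1)/(1 - \phi_1^2)\,\sigma_a^2$.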
CONTROLLERS For a stationary time series y : t i yt -fljz- -6 z~ e z~ 2 1 n 2 n (4.36) at 1 - faz- - <p z 1 2 its variance can be calculated by one of the following methods. The Residue Method (Jury, E. I. (1964)) The variance can be expressed as: (4.37) -dz z^z-^zY c e(z-^e(z) o~l YJ Residues — =poles Z<fi(z = (4.38) )(f)(z) l 9(z^)9(z) (4.39) = a Resid,ue 2 a and it can be given by the following ratio: iril 1 (4.40) 2 0 with 2+ 2£0 -2</» _i 2 -20X + 2EW.-+1 -29j + 2 -IK n - &n-2 9i9i j —4> +i + <f>n "n-1 (4.41) —4>j ~ 4>j+2 ~<f>n-j-l 3 1 -20„_! + 29i 9 —<j) -29 , 0 n - n 0 r and 2 -</>! r 0 = -<t>2 - 2 ^ -2<j> 1 - <b -<f>! - -2<f> _i n 2 2 1 - (j> 4>4 3 $n-2 - <j>n -<j) -3 n 1 0 -2<b n -<t>n-l -<f>n-2 -<f>l 1 (4.42) 4.4. THE PID 73 CONTROLLER The Recursive Method (Astrom, K . J . (1970)) The variance can be calculated recursively as below: h {\-c? )Ik-i = + K fi (4.43) and the polynomial coefficients calculated recursively as: 6 {k) = i = 0,l,.--*-l #-)-a,#i = % (4-44) ro 0t i = 0,l,.--fc-l = B^-M % 1 ] k fi = ^ o k k (4.45) l 9 with the initial condition 27TZ Jc 0 (po(z 7c (^ )0o( )0o(^) _1i o f (e^Vdz 2-KiJc Uj") (4.47) Pl (4-48) i = and 0< ^ n) = -0 - i ^ 0 (4.49) = 1 (4.50) = -fr i ^0 (4.51) = 1 i= 0 (4.52) *l = t := 0 ? The variance is given by *X (4.53) The sequence of calculations is tabulated in the table below: 1 ri ^(n-1) ^(n-1) n ) ••• ... *< 1 n) ^(n-1) ^ - 1 ) (n) >) 'n-l (n-l) h • • n • ^ ^ ••• ^ «n-l ,(n-l) V-2 /)(") #n-l • #n-l ^ ) *(») 1 Pn CHAPTER 74 4. CONTROLLERS The State Space Model Method (Vu, K . (1988)) The variance is given as below: J E{=1 ^ifin+i-j r- n-\ (4.54) 1 <t>l<t>n with £ ? = i ( & -^)(^+i Ei=l(^i — — On+i-j) e^i^n+i-j (<f>l - - #l)(</>n - #n) - hh+i (4.55) — <f>i<f>n+i-j <^l^n and h I- ... o ... o 0 0 <f>4 0 4.4.2 (f>i (4.56) o The Minimum Variance PID Controller Now we will discuss procedures to obtain the minimum variance PID controller gains for the Box-Jenkins stochastic control system. 
When the control interval of a control system coincides with its sampling interval, we can write a discrete PID controller as follows: t ut = k y + ki J2 Vi + h{yt ~ yt-i) (4-57) p t or u - u -i t t = {k + k + k )y + {-k - 2k )y -\ + k y _ (4.58) = (4.59) p d t hyt + hyt-i + hyt-2 p d t d t 2 (4.60) 4.4. THE PID 75 CONTROLLER with k as the proportional gain, k{ as the integral gain and kj, as the derivative gain. By replacing the input variable in the Box-Jenkins model of the control system, we can write the output variable as follows: p 6(z !) </>(z ) l 1-z-i = V t + ^ + ^ ) a t + f + 1 ( 4 - 6 2 ) Moving the first term of the right hand side of the above equation to the left hand side and expressing yt+f+i as a time series, we obtain If the polynomial write 0(, ) has a root on the unit circle, we can factor this root out and -1 faz- 1) = fiz-^il-z- ) (4.64) 1 then Equation (4.63) can be written as: 6{z- )6{z-*) ..... x V t + f + 1 = 6(z-*)<Kz-*) - (z-W(z-*)z-f-U(z-*) at+f+1 ( 4 U ' 6 5 ) On the other hand, if (j>(z~ l) has no roots on the unit circle, we can divide both the numerator and denominator of the right hand side of Equation (4.63) by 1 — z' 1 and obtain Siz- )^- ) 1 V t + f + 1 = 6(z-*)<Kz-i) - W .. 1 (z-i)^)z-/-i/-(z-i) ' a + / + 1 ( ... } with n*~ ) = X 7^37 (4-67) This is necessary for the time series yt to be invertible. In both cases, under feedback the output variable will follow the following time series: Vt+f+i = ^ ^q 1 - a^- 1 t + / i (4.68) + amz~ m . CHAPTER 76 4. CONTROLLERS Now we consider the following system: w yt = Under PID feedback, the l-Bz- i—T—zi—7—z^ t-i + i A —t u 1 1 1 4 70 1 ez- ) 2 1 2 (1 - Stz- - 6 z-i)(l - z - ) - uz-^h 1 n -) a 1 — " — 02^ 1 —variable— time OoZ series is given 1 — by Z output (1 - SiZ- - s z- )(i - Vt ,_. 1 + l z-i + hz-*) 1 " at 2 (4 2 71) ^ (4.72) 1 - (1 + 4 ) 2 - ! 
- (4 - 6 )Z- + S Z~ - wl Z- 2 3 X 1 2 X - ul Z~ - 2 2 ul Z3 1 - (4 + B)z- - (S - 4fl)z~ + 6 0z1 2 3 2 1-{1 + S + toh)z- - {S - 4 + ul )z1 2 1 (4.73) 2 2 2 - (-4 + u;l )z-3 at 3 Since a time series will have the minimum variance when it is white, y will have the minimum variance when t 4 + 0 = 4-40 i+Si+uh = 4~4+^/ _6 0 = 2 (4.74) 2 (4.75) (4.76) which means . 'i 1-0 = , (1-0)4 h = (4.78) (l - '3 or K k i = 0)4 = (4.79) j i - ^ + u,) ( 4 ( = = (4.77) < ^ " 80) 4 8 1 ) (4.82) So for a system with no transmission zeros, no delay and the dynamics is of second order or lower and the disturbance is an integrated moving average of first order, the minimum variance controller is a PID controller. The optimal controller gains are given by the above equations. In other cases, the optimal controller gains cannot be obtained 4.4. THE PID 77 CONTROLLER directly from the parameters of the system, but through a numerical procedure outlined in the following discussion. In Equation (4.69), the controller gains appear in only the ft, ) polynomial and only from coefficient Pf+i to coefficient (5 . In practice, m is usually smaller than n. In the following, for simplicity we will assume that m equals to n. We can always do that by adding trailing zeros to the polynomial with fewer coefficients. If we use the formula given by the method of residue to express the variance of the output variable under feedback, we will get the following result. -1 n (4.83) tail with ' n 2 -2ft -ft -ft 1-ft -2ft -2ft - f t - ft -ft 1-ft -1 -2ft -ft - f t - 1 Pn-2 - ~P n- - 3 " -Pn-2 (4.84) 0 0 -ft " 1 0 0 -ft - f t - 1 -ft -ft -ft 1 1 1 -ft 1 0. - f t ••• - f t - f t ••• -ft-i 1 " -ft -ft -ft + 1 -ft -ft f t - f t -ft -ft -ft • • •- f t •• - f t ••• -ft -Pn-l 1 " ~Pn (4.85) and 2 + 2£a 2 -2ai + 2 £ aiQjj+i -2ft -2ft -ft -ft - -2ft_i -ft-i ft ~Pn-2 -2a 3 + 2 £ a,-a,- +i -ft+i -ft - -2ft — Pn ft "ft-i +2 — ft-j-1 -2a„_i + 2 a i a „ -2a„ 0 0 —ft 0 1. 
0 -ft (4.86) The matrix Vti is identical to the matrix S7 except for the first column which is given in 0 CHAPTER 78 4. CONTROLLERS more details below 2 + 2 £ I U a\ -2a + 2E^i a ^+i 1 1 i (4.87) -2a _i + 2 a a n x n -2a„ To obtain the optimal (minimum variance) controller gains, we can take the derivatives of the closed-loop variance a with respect to the controller gains and set them to zeros. In doing so, we obtain 2 (4.88) <7„ = 0 for (4.89) J = l,-- We can solve the above set of equations with a numerical procedure to obtain the optimal controller gains. Since the above set of equations involves determinants and the expression for a determinant has a lot of terms, we have to be careful when the dimension of the matrix is large. In this case, we can use the formula given by the state space model method. In fact, for more practical purpose, we should always use this formula, because it is a ratio of two differentiable scalars. The formula is: ft Tr-i i + E L i ( " * - f l 0 - & + 2*'r 2 2 n-1 cr., = (4.90) E L i fikhw i - E L i # - 2 = —o~„ Efc=l PkPn+k-l 1-1 ft (4.91) 4.4. THE PID 79 CONTROLLER where u and v are two functions of the controller gains u = i + ]TK-ft) -ft 2 ft ft + 2£ r- 2 T k=l (4.92) L ft-i J Efe=i ftft+i i - £ f t 2 nL=i - 2 ft ft ftft+^-i (4.93) ft-i ftft and Efc=j OfcCV/t+i ELl r ft r ft = i . ft a t «+i5:-; a ft • • ft • • Q'/tft+i (XkPn+k-l ~ Q +k-t/3k - n " ft" 0 — • • o 0 - a/c+ift . 0 0 ft o ... o ... o • • • ft o ft-2 (4.94) (4.95) If we take the derivative of the closed-loop variance a with respect to the controller gain Ij, we have 2 y 2 V u'v — uv' (4.96) where u' and v' are the derivatives of u and v with respect to the controller gain Ij. For the controller gain Ij to be optimal in the sense of minimum variance, we must have (4.97) 0 which means u'v — uv' = 0 (4.98) CHAPTER 80 4. CONTROLLERS and this equation can be used to obtain the controller gain lj. 
So the problem of designing a PID controller finally becomes the problem of solving the following set of equations numericallv: h = UV = 0 (4.99) h = UV = 0 (4.100) h = U[ V — UV = 0 (4.101) 3 If the Newton-Raphson equation is used to obtain the gains, then we will iterate with the following equation: " h' h h i " ll' h h • 1 r dh dh dh dh dh dh dh I dh k 2 3 dh dh dh dh dh -I -1 h h h (4.102) dh where dhj di d_ u v (u dh du du dv vH dljdh dh dlj 3 (4.103) du dv dlj dh 2 dv dljdk 2 (4.104) The detailed expressions for the derivatives are included in Appendix A . With the given derivatives, we can easily obtain the gains / j , j = 1, • • • 3 by the Newton-Raphson iteration. The PID controller gains are related to these gains as in the following equations: kp k{ kd 1 1 1 -1 0 - 2 1 0 0 h h h 0 -1 -2 1 1 1 1 0 0 h h h When the disturbance is stationary, there is no need for integral action, relationship between these, two gain vectors is as follows: kp "i 0 r -1 (4.105) (4.106) = 0, the " 7* " '1 /* 2 (4.107; 4.4. THE PID 4.4.3 81 CONTROLLER The Linear Quadratic Gaussian PID Controller In this section, we will briefly discuss the minimum variance PID controller with a constraint on the input variable variance, ie. a linear quadratic Gaussian PID controller. This extension is actually quite easy. The control criterion can be seen to be given as Min E{y + Xu } 2 (4.108) 2 +f+1 r for the case of a P D controller or MinE{y + X(Vu ) } 2 (4.109) 2 +f+l t for the case of a PID controller. The vectors 1* and 1 are the controller gain vectors as before. In the first case, we have a stationary disturbance and the controller is a P D controller. In the second case, the controller is a PID controller, because the disturbance is nonstationary. In this case, we will constrain the variance of the differenced input variable V u j . From the control criteria, we can derive equations to solve for the optimal (but not necessarily minimum variance) controller gains as follows. 
For the PID controller case, we have Vu = t l{z~ )y (4.110) l t So if the output variable y follows the time series t y = t ^ (4.111) then the differenced input variable V i / will follow the time series: t v "' = < ;(,-) ( 4 a - 1 1 2 > 7(M) = f^lf- (4-113) The polynomial rj(z~ ) is usually not monic. Now by padding the polynomials a ( z ) , (5(z~ ) or ?7(z ), we can assume all the polynomials a ( z ) , /?(z ) and n(z~ ) have the same order n. We write the control criterion x l _ 1 -1 _1 Min E{y + \(Vu ) } 2 2 +f+1 t _1 x (4.114) as Min a + Xa 2 1 2 Vu (4.115) CHAPTER 82 4. CONTROLLERS J (4.116) with i +E L i K - f t ) - f t 2 ft ft Tri-i + 2^r 2 L Efc=J i - E L i f t 2 - 2 ft-i ftft+i E U I ft ft ftft+fc-, ft-i ftft (4.117) Since the moving average polynomial v(z ) of the time series Vu is not monic, we have to use another formula for the variance. This formula is given below. 1 t + 2CrTr - i Vo + U=i r}k(Vk-2 p ) Vo ft ft 1 k J ft-1 i - E L i « - 2 ELi ftft+i Efc=l PkPn+k-l r 2 0~„ (4.118) ft ft 1 ft-i ftft 10 , (4.119) — G„ where u, to and v are the functions of the controller gains as below u = l + f:(a -ft) -ft 2 fc 2 + 2^r- ft ft 1 (4.120) fc=i L to = Vo + Evk(Vk-2 p ) Vo k ft-i J ft ft V,-1 + 2C2 T T (4.121) fc=i L ft-i J 4.4. THE PID 83 CONTROLLER Pkftk+1 E L l i-Eft 2 2 ft ft PkPn+k-l Efc=i *;=i (4.122) Lft-i ftft and E L i k&k+i - ctkPk+i - ctk+iPk a E i = i a a k-i k - a f3 k-i n+ k - n+ O'lOn - Q i f t - a +fc-(ft n (4.123) Q ft n Efc=i »?Jfc*7fc+i - VkVoPk+i - Vk+iVoPk c r = = i- Y?k=l ft ft ft VkVn+k-l - ft o ft Vn+k-lVoPk (4.124) ... o ft o ... o ft o (4.125) VkVofin+k-l - 0 Lft o ft-2 0 ••• Now if we define a 2 = al + Xa ^ 2 , U ^ \ 2 V W V (4.126) (4.127) then we can take the derivative of the quantity a with respect to the controller gain Ij, we have 2 , u v — uv , w v — IVV + A- (4.128) where u', w' and v' are the derivatives of u, w and v with respect to the controller gain Ij. 
If the derivatives of $\sigma^2$ are set to zero, we will obtain equations to solve for the optimal controller gains.

$$h_1 = u'_{l_1} v - u\,v'_{l_1} + \lambda\,(w'_{l_1} v - w\,v'_{l_1}) = 0 \tag{4.129}$$
$$h_2 = u'_{l_2} v - u\,v'_{l_2} + \lambda\,(w'_{l_2} v - w\,v'_{l_2}) = 0 \tag{4.130}$$
$$h_3 = u'_{l_3} v - u\,v'_{l_3} + \lambda\,(w'_{l_3} v - w\,v'_{l_3}) = 0 \tag{4.131}$$

With the above equations and a given value of $\lambda$, we can follow a similar approach as in the case of the minimum variance PID controller to obtain the optimal controller gains. If the method of Newton-Raphson is used, then as discussed before, we need further differentiation. In this case, we need

$$\frac{\partial h_i}{\partial l_j} = \frac{\partial^2 u}{\partial l_j \partial l_i}\,v + \frac{\partial u}{\partial l_i}\frac{\partial v}{\partial l_j} - \frac{\partial u}{\partial l_j}\frac{\partial v}{\partial l_i} - u\,\frac{\partial^2 v}{\partial l_j \partial l_i} \tag{4.132}$$
$$\qquad + \lambda\left(\frac{\partial^2 w}{\partial l_j \partial l_i}\,v + \frac{\partial w}{\partial l_i}\frac{\partial v}{\partial l_j} - \frac{\partial w}{\partial l_j}\frac{\partial v}{\partial l_i} - w\,\frac{\partial^2 v}{\partial l_j \partial l_i}\right) \tag{4.133}$$

With the first and second order derivatives of the quantities $u$ and $v$ the same as in the case of the minimum variance PID controller, and the first and second order derivatives of $w$ included in Appendix A, we can obtain the optimal controller gains via the Newton-Raphson iteration.

4.4.4 The Pole Placement PID Controller

The Newton-Raphson method has the advantage of being familiar to all engineers. Its distinctive feature is that when it converges, it converges very fast. It has quadratic convergence, compared to the linear or superlinear convergence of other methods. The only problem is that the starting point must be close to the solution. Therefore, to use this method, we must either know a neighbourhood of the solution or modify the method so that it can be used from points outside a close neighbourhood. Since modification of the method is not simple for the multidimensional Newton-Raphson equation, it is better to locate a neighbourhood of the solution. This approach will also solve the problem of local and global minima, and therefore we will present an approach to look for a neighbourhood of the optimal solution.
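The Newton-Raphson iteration discussed above can be illustrated with a minimal sketch in Python. It is not the program lqg_pid.m: the function `toy_h` is a hypothetical stand-in for the equations $h_1, h_2, h_3$, whose actual expressions need the quantities $u$, $v$ and $w$ and their derivatives from Appendix A. As a further assumption, the Jacobian is approximated by central differences instead of the analytical derivatives used in the thesis, and the tolerance of 1e-12 is chosen for the toy problem only. The iteration limit of 50 follows the stopping criterion quoted for the program.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-14:
            raise ValueError("Jacobian is numerically singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def jacobian(h, l, eps=1e-6):
    """Central-difference approximation of dh_i/dl_j (an assumption here)."""
    n = len(l)
    jac = [[0.0] * n for _ in range(n)]
    for j in range(n):
        up = l[:]
        dn = l[:]
        up[j] += eps
        dn[j] -= eps
        h_up, h_dn = h(up), h(dn)
        for i in range(n):
            jac[i][j] = (h_up[i] - h_dn[i]) / (2.0 * eps)
    return jac


def newton_raphson(h, l0, tol=1e-12, max_iter=50):
    """Iterate l <- l - J^{-1} h(l) until every |h_i| is below tol."""
    l = list(l0)
    for k in range(max_iter):
        hv = h(l)
        if max(abs(v) for v in hv) < tol:
            return l, k
        step = solve_linear(jacobian(h, l), hv)
        l = [li - si for li, si in zip(l, step)]
    raise RuntimeError("no convergence within max_iter iterations")


def toy_h(l):
    """Hypothetical two-equation system with a root at (sqrt 2, sqrt 2)."""
    return [l[0] ** 2 + l[1] ** 2 - 4.0, l[0] - l[1]]


root, iterations = newton_raphson(toy_h, [1.0, 2.0])
print(root, iterations)
```

Started from a point reasonably close to the root, the toy problem converges in a handful of iterations, which reflects the quadratic convergence noted above. Started far from a root, the same loop can wander or fail, which is the motivation for first locating a neighbourhood of the solution.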
If we just look for a neighbourhood of the optimal gains by varying the three values $k_p$, $k_i$ and $k_d$ or the vector $\mathbf{l}$, then the problem is difficult, because the range of values they can assume is not known in advance. However, because the poles of the closed loop must be stable, we can search for a neighbourhood of the optimal controller gains through the stable poles the controller can assign. The pole assignment PID controller is not a new concept. It has been used to design PID controllers for continuous systems (Park, H. and Seborg, D. E. (1974), Seraji, H. and Tarokh, M. (1977)). The difficulty with the pole placement methodology is the physical meaning of the assigned poles. In this application, the pole assignment will be used to locate a neighbourhood of the optimal solution. Under feedback, the output variable $y_t$ either follows the time series

$$y_t = \frac{\delta(z^{-1})\theta(z^{-1})}{\delta(z^{-1})\phi(z^{-1}) - \omega(z^{-1})\phi^*(z^{-1})z^{-f-1}l(z^{-1})}\,a_t \tag{4.134}$$
$$= \frac{\delta(z^{-1})\theta(z^{-1})}{\phi^*(z^{-1})\left[\delta(z^{-1})(1 - z^{-1}) - \omega(z^{-1})z^{-f-1}l(z^{-1})\right]}\,a_t \tag{4.135}$$
$$= \frac{\delta(z^{-1})\theta(z^{-1})}{\phi^*(z^{-1})\,\varphi(z^{-1})}\,a_t \tag{4.136}$$

or the time series

$$y_t = \frac{\delta(z^{-1})\theta(z^{-1})}{\delta(z^{-1})\phi(z^{-1}) - \omega(z^{-1})\phi(z^{-1})z^{-f-1}l^*(z^{-1})}\,a_t \tag{4.137}$$
$$= \frac{\delta(z^{-1})\theta(z^{-1})}{\phi(z^{-1})\left[\delta(z^{-1}) - \omega(z^{-1})z^{-f-1}l^*(z^{-1})\right]}\,a_t \tag{4.138}$$
$$= \frac{\delta(z^{-1})\theta(z^{-1})}{\phi(z^{-1})\,\varphi^*(z^{-1})}\,a_t \tag{4.139}$$

In the first (nonstationary) case, $\phi(z^{-1}) = \phi^*(z^{-1})(1 - z^{-1})$. The roots of the polynomials $\phi^*(z^{-1})$ or $\phi(z^{-1})$ will correspond to the unassignable poles, and the roots of the polynomials $\varphi(z^{-1})$ or $\varphi^*(z^{-1})$ will correspond to the assignable poles, because they contain the controller gains. It must be mentioned that among the assignable poles corresponding to the roots of the polynomial $\varphi(z^{-1})$ or $\varphi^*(z^{-1})$, only 2 or 3 are independent: two for a PD controller and three for a PID controller.
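The rest depend on the values of the independent poles and the coefficients of the polynomial $\varphi(z^{-1})$ or $\varphi^*(z^{-1})$.

Consider the case of the polynomial $\varphi(z^{-1})$:

$$\varphi(z^{-1}) = 1 - \varphi_1 z^{-1} - \varphi_2 z^{-2} - \cdots \tag{4.140}$$
$$= \delta(z^{-1})(1 - z^{-1}) - \omega(z^{-1})z^{-f-1}l(z^{-1}) \tag{4.141}$$

From the above equations, equating coefficients of like powers of $z^{-1}$ gives a linear relation between the coefficients $\varphi_i$ and the controller gains (Equation (4.142)), which we write compactly as

$$\boldsymbol{\varphi} = \mathbf{d} + \mathbf{W}\mathbf{l} \tag{4.143}$$

Here the vector $\mathbf{d} = [\,1 + \delta_1,\; -\delta_1 + \delta_2,\; -\delta_2 + \delta_3,\; \ldots\,]^T$ comes from the coefficients of $\delta(z^{-1})(1 - z^{-1})$, and the matrix $\mathbf{W}$ is formed from the coefficients of $\omega(z^{-1})$: it has $f$ leading rows of zeros followed by the lower triangular rows $[\,\omega_0 \;\; 0 \;\; 0\,]$, $[\,-\omega_1 \;\; \omega_0 \;\; 0\,]$, $[\,-\omega_2 \;\; -\omega_1 \;\; \omega_0\,]$ and so on.

Now suppose we have 3 assignable independent poles $p_1$, $p_2$ and $p_3$ and a number of dependent parameters $x_i$, and they form a polynomial product as follows:

$$\varphi(z^{-1}) = \left(1 - (p_1 + p_2 + p_3)z^{-1} - (-p_1p_2 - p_1p_3 - p_2p_3)z^{-2} - p_1p_2p_3z^{-3}\right)\left(1 - x_1z^{-1} - x_2z^{-2} - \cdots\right) \tag{4.144}$$
$$= \Big(1 - \sum p_i\,z^{-1} + \sum_{i<j} p_ip_j\,z^{-2} - \prod p_i\,z^{-3}\Big)\left(1 - x_1z^{-1} - x_2z^{-2} - \cdots\right) \tag{4.145}$$

then from this equation, we can write

$$\begin{bmatrix} \varphi_1 \\ \varphi_2 \\ \varphi_3 \\ \varphi_4 \\ \vdots \end{bmatrix} = \begin{bmatrix} \sum p_i \\ -\sum_{i<j} p_ip_j \\ \prod p_i \\ 0 \\ \vdots \end{bmatrix} + \begin{bmatrix} 1 & & & \\ -\sum p_i & 1 & & \\ \sum_{i<j} p_ip_j & -\sum p_i & 1 & \\ -\prod p_i & \sum_{i<j} p_ip_j & -\sum p_i & \ddots \\ \vdots & \vdots & \vdots & \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ \vdots \end{bmatrix} \tag{4.146}$$

$$\boldsymbol{\varphi} = \mathbf{p} + \mathbf{P}\mathbf{x} \tag{4.147}$$

The parameters $x_i$ arise from the number of delay periods $f$, from the number of transmission zeros of the polynomial $\omega(z^{-1})$, and from the case where the degree of the polynomial $\delta(z^{-1})(1 - z^{-1})$ is greater than that of the polynomial $\omega(z^{-1})z^{-f-1}l(z^{-1})$. Given the poles $p_i$, we can calculate these parameters and the controller gain vector $\mathbf{l}$ as below. We can write the following equation:

$$\mathbf{d} + \mathbf{W}\mathbf{l} = \mathbf{p} + \mathbf{P}\mathbf{x} \tag{4.148}$$

or

$$\begin{bmatrix} \mathbf{W} & -\mathbf{P} \end{bmatrix} \begin{bmatrix} \mathbf{l} \\ \mathbf{x} \end{bmatrix} = \mathbf{p} - \mathbf{d} \tag{4.149}$$

If we are interested in only the controller gain vector $\mathbf{l}$, then we have

$$\mathbf{l} = \begin{bmatrix} \mathbf{I}_3 & \mathbf{0} \end{bmatrix} \begin{bmatrix} \mathbf{W}^T\mathbf{W} & -\mathbf{W}^T\mathbf{P} \\ -\mathbf{P}^T\mathbf{W} & \mathbf{P}^T\mathbf{P} \end{bmatrix}^{-1} \begin{bmatrix} \mathbf{W}^T \\ -\mathbf{P}^T \end{bmatrix} \left[\mathbf{p} - \mathbf{d}\right] \tag{4.150}$$

The case of the polynomial $\varphi^*(z^{-1})$ can be solved similarly. In this case, we have

$$\varphi^*(z^{-1}) = 1 - \varphi_1^* z^{-1} - \varphi_2^* z^{-2} - \cdots \tag{4.151}$$
$$= \delta(z^{-1}) - \omega(z^{-1})z^{-f-1}l^*(z^{-1}) \tag{4.152}$$

and, equating coefficients as before (Equation (4.153)),

$$\boldsymbol{\varphi}^* = \mathbf{d}^* + \mathbf{W}\mathbf{l}^* \tag{4.154}$$

where the vector $\mathbf{d}^* = [\,\delta_1,\; \delta_2,\; \delta_3,\; \ldots\,]^T$ contains the coefficients of $\delta(z^{-1})$. In this case, the controller gain vector contains only the proportional and derivative gains.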
This controller gain vector is given by

$$\mathbf{l}^* = \begin{bmatrix} \mathbf{I}_2 & \mathbf{0} \end{bmatrix} \begin{bmatrix} \mathbf{W}^T\mathbf{W} & -\mathbf{W}^T\mathbf{P}^* \\ -\mathbf{P}^{*T}\mathbf{W} & \mathbf{P}^{*T}\mathbf{P}^* \end{bmatrix}^{-1} \begin{bmatrix} \mathbf{W}^T \\ -\mathbf{P}^{*T} \end{bmatrix} \left[\mathbf{p}^* - \mathbf{d}^*\right] \tag{4.155}$$

The vector $\mathbf{p}^*$ and the matrix $\mathbf{P}^*$ will correspond to the case of only two gains. This means they will be given by

$$\mathbf{p}^* = \begin{bmatrix} p_1 + p_2 \\ -p_1p_2 \\ 0 \\ \vdots \end{bmatrix}, \qquad \mathbf{P}^* = \begin{bmatrix} 1 & & \\ -\sum p_i & 1 & \\ \prod p_i & -\sum p_i & \ddots \\ \vdots & \vdots & \end{bmatrix} \tag{4.156}$$

To look for a neighbourhood of the optimal solution, we can vary the assignable poles $p_i$ with a coarse resolution and compare the closed-loop variance of the output variable $y_t$, or the performance index $\sigma_y^2 + \lambda\sigma_{\nabla u}^2$ (in the case of the LQG PID controller), until the best value is found. Then the corresponding controller gain vector given by Equation (4.150) or (4.155) can be used as the initial estimate in the Newton-Raphson iteration to obtain the finer optimal solution for the PID controller gains.

4.4.5 Examples

In this section, we will consider a few examples to clarify the theory. A program written in the MATLAB language (lqg_pid.m), which is included in the appendix, was used to calculate the optimal controller gains for the examples in this section. The program has two stopping criteria. The first criterion is that the number of iterations cannot be more than 50, and the second one is that the $h_i$ in Equations (4.129) to (4.131) cannot have absolute values greater than 1.0e-15. In the first example, we consider the following control system:

$$y_t = \frac{0.75}{1 - 0.25z^{-1}}\,u_{t-2} + \frac{1}{1 - 0.5z^{-1}}\,a_t \tag{4.157}$$

with $a_t$ of unit variance. The minimum variance PID controller gains are required for this system. Under feedback the output variable follows the following time series:

$$y_t = \frac{\delta(z^{-1})\theta(z^{-1})(1 - z^{-1})}{\delta(z^{-1})\phi(z^{-1})(1 - z^{-1}) - \omega(z^{-1})\phi(z^{-1})z^{-f-1}l(z^{-1})}\,a_t \tag{4.158}$$
$$= \frac{\delta(z^{-1})\theta(z^{-1})}{\delta(z^{-1})\phi(z^{-1}) - \omega(z^{-1})\phi(z^{-1})z^{-f-1}l^*(z^{-1})}\,a_t \tag{4.159}$$
$$= \frac{1 - 0.25z^{-1}}{1 - 0.75z^{-1} + (0.125 - 0.75l_1^*)z^{-2} + (0.375l_1^* - 0.75l_2^*)z^{-3} + 0.375l_2^*z^{-4}}\,a_t \tag{4.160}$$

The program lqg_pid.m used the initial estimates for the gains at values

$$l_1^* = -0.5 \tag{4.161}$$
$$l_2^* = 0.0 \tag{4.162}$$

and converged to the values

$$l_1^* = -0.303427 \tag{4.163}$$
$$l_2^* = 0.031008 \tag{4.164}$$

after 16 iterations. The closed-loop variance of the output variable with these controller gains is $\sigma_y^2 = 1.2530$. This is close to the variance of 1.25 that the minimum variance controller can achieve. The optimal PID controller gains for the system are then:

$$k_p = -0.2724 \tag{4.165}$$
$$k_i = 0 \tag{4.166}$$
$$k_d = -0.0310 \tag{4.167}$$

The results can be found graphically in Figure 4.1. In the second example, we consider the following control system:

$$y_t = \frac{0.168}{1 - 0.908z^{-1}}\,u_{t-1} + \frac{1}{1 - 1.3z^{-1} + 0.13z^{-2} + 0.17z^{-3}}\,a_t \tag{4.168}$$

with $a_t$ having the variance $\sigma_a^2 = 2.37$. The system has no delay and the disturbance is nonstationary. The unassignable poles of the system are 0.5887 and -0.2887. With a pole resolution of 0.1, and the assignable poles of values 0.8, -0.1 and -0.1, the closed-loop output variable has the smallest variance of 2.4415. This corresponds to the gain vector

$$l_1 = -7.7857 \tag{4.169}$$
$$l_2 = 6.2976 \tag{4.170}$$
$$l_3 = 0.0476 \tag{4.171}$$

Iteration with the above initial estimates for the gain vector $\mathbf{l}$ converged, and the obtained gains were:

$$k_p = -7.3398 \tag{4.172}$$
$$k_i = -1.2711 \tag{4.173}$$
$$k_d = 0.9050 \tag{4.174}$$

With these gains the closed-loop output variable variance is 2.3871, which is close to the minimum possible variance of $\sigma_a^2 = 2.37$. Convergence is obtained in 14 iterations. The results can be found in Figure 4.2. In our third example, we consider a non-minimum phase system

$$y_t = \frac{0.25 - 0.3z^{-1}}{1 - 0.8z^{-1}}\,u_{t-f-1} + \frac{1 - 0.2z^{-1}}{1 - 1.3z^{-1} + 0.3z^{-2}}\,a_t \tag{4.175}$$

with the white noise variance of $\sigma_a^2 = 2.0$.
With the initial estimate

$$l_1 = 2.0 \tag{4.176}$$
$$l_2 = -1.5 \tag{4.177}$$
$$l_3 = 0 \tag{4.178}$$

the program lqg_pid.m achieved convergence in 14 iterations and obtained the final controller gains

$$k_p = 1.4337 \tag{4.179}$$
$$k_i = 0.3378 \tag{4.180}$$
$$k_d = -0.9064 \tag{4.181}$$

The closed-loop variance given by this PID controller is 30.6067. An orderly search for minimum variance gave the smallest variance of 31.3089. This was obtained with the following PID controller gains.

$$k_p = 1.50 \tag{4.182}$$
$$k_i = 0.30 \tag{4.183}$$
$$k_d = 0.0 \tag{4.184}$$

The results of the convergence can be found in Figure 4.3. In our final example, we consider the following deterministic second order model:

$$y_t = \frac{0.25}{1 - 0.9z^{-1} + 0.2z^{-2}}\,u_{t-1} \tag{4.185}$$

To design a PID controller for this system to track step changes in the setpoint, we proceeded as follows. A step change function is equivalent to a random walk in the statistical literature, therefore we have an equivalent system as below

$$y_t = \frac{0.25}{1 - 0.9z^{-1} + 0.2z^{-2}}\,u_{t-1} + \frac{1}{1 - z^{-1}}\,a_t \tag{4.186}$$

For this system, the minimum variance controller is a PID controller. With $\delta(z^{-1}) = 1 - \delta_1z^{-1} - \delta_2z^{-2}$, $\delta_1 = 0.9$, $\delta_2 = -0.2$ and $\omega_0 = 0.25$, the optimal gains are given below.

$$k_p = -\frac{\delta_1 + 2\delta_2}{\omega_0} = -2.0 \tag{4.187}$$
$$k_i = -\frac{1 - \delta_1 - \delta_2}{\omega_0} = -1.2 \tag{4.188}$$
$$k_d = \frac{\delta_2}{\omega_0} = -0.8 \tag{4.189}$$

The controller gain vector $\mathbf{l}$ is given as $\mathbf{l}^T = [\,-4.0 \;\; 3.6 \;\; -0.8\,]$. To constrain the movement of the input variable, we designed LQG PID controllers. By using the program lqg_pid.m and with the above value of the vector $\mathbf{l}$ for the initial controller gain vector, we obtained the following table:

Table 4.1 LQG PID Controller Design

$\lambda$                   0         0.0001    0.001     0.002
$k_p$                      -2.0000   -1.9751   -1.8576   -1.8189
$k_i$                      -1.2000   -1.1888   -1.1127   -1.0652
$k_d$                      -0.8000   -0.7722   -0.5505   -0.3862
$\sigma_y^2$                1.0000    1.0003    1.0163    1.0381
$\sigma_{\nabla u}^2$      29.6000   27.9238   18.8505   14.7595

In the above table, we see a gradual decrease in the magnitudes of the controller gains and in the variance of the differenced input variable $\nabla u_t$ as $\lambda$ increases. This happens at the expense of an increase in the variance of the output variable $y_t$.
By increasing the constraint constant $\lambda$, we put more penalty on the movement of the control element. This results in a decrease in the variance of $\nabla u_t$ and an increase in the variance of $y_t$. However, the trade-off is always favourable. We normally get a large reduction in the variance of $\nabla u_t$ and only a slight increase in the variance of $y_t$. In the above table, we see that at the value of $\lambda = 0.002$, we get a reduction of 50% in the variance of $\nabla u_t$, but only an increase of 3.8% in the variance of $y_t$.

Since the controller was designed to track the setpoint, we would like to see how the system responds to setpoint changes. Since the above system consists of two first order systems in series, the responses to setpoint changes are fine. There are no overshoots in either the input or the output variable. The PID controller of this system needs no constraint on the movement of the input variable, and $\lambda$ can be set to zero. However, the following system:

$$y_t = \frac{0.25 + 0.07z^{-1}}{1 - 0.9z^{-1} + 0.2z^{-2}}\,u_{t-1} \tag{4.190}$$

with the minimum variance gains

$$k_p = -1.3881 \tag{4.191}$$
$$k_i = -1.0072 \tag{4.192}$$
$$k_d = -1.5256 \tag{4.193}$$

and the variances $\sigma_y^2 = 1.0127$ and $\sigma_{\nabla u}^2 = 36.5452$, requires some constraint on the input variable movement. At the value of $\lambda = 0.003$, the controller gains are:

$$k_p = -1.3855 \tag{4.194}$$
$$k_i = -0.8408 \tag{4.195}$$
$$k_d = -0.7386 \tag{4.196}$$

and the variances are $\sigma_y^2 = 1.0775$ and $\sigma_{\nabla u}^2 = 13.3724$. At this value of $\lambda$, the variance of the output variable increases by 6%, but the variance of the incremental input variable decreases by 63%, a large gain in the trade-off of the variances. Figure 4.4 shows the responses of $y_t$ and $\nabla u_t$ to a step change in the setpoint for two controllers: one with $\lambda = 0.0$ and the other with $\lambda = 0.003$. By constraining the movement of the input variable, the overshoot of the output variable in the minimum variance PID controller is eliminated.
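Several of the numbers quoted in the examples can be cross-checked with a few lines of arithmetic. The Python sketch below is illustrative only and is not part of lqg_pid.m; the helper names are hypothetical. It assumes the controller polynomial is $l(z^{-1}) = l_1 + l_2z^{-1} + l_3z^{-2}$ with the gain relationship of Equation (4.106), and it treats the second example as the simplest pole placement case, in which the three assigned poles determine the three gains directly and no dependent parameters $x_i$ arise.

```python
def poly_mul(a, b):
    """Multiply two polynomials in z^-1 given as coefficient lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def poly_from_poles(poles):
    """Coefficients of the product of (1 - p z^-1) over the given poles."""
    poly = [1.0]
    for p in poles:
        poly = poly_mul(poly, [1.0, -p])
    return poly


def pid_gains_from_l(l1, l2, l3):
    """Map (l1, l2, l3) to (kp, ki, kd) as in Equation (4.106)."""
    return -l2 - 2.0 * l3, l1 + l2 + l3, l3


def percent_change(new, old):
    """Relative change from old to new, in percent."""
    return 100.0 * (new - old) / old


# Second example: delta(z^-1) = 1 - 0.908 z^-1, omega_0 = 0.168, f = 0.
# Closed-loop polynomial, Equation (4.141):
#   delta(z^-1)(1 - z^-1) - omega_0 z^-1 (l1 + l2 z^-1 + l3 z^-2)
omega0 = 0.168
open_part = poly_mul([1.0, -0.908], [1.0, -1.0]) + [0.0]  # padded to z^-3
target = poly_from_poles([0.8, -0.1, -0.1])               # assigned poles
l_init = [(open_part[k] - target[k]) / omega0 for k in (1, 2, 3)]
print([round(v, 4) for v in l_init])  # -7.7857, 6.2976, 0.0476 as in (4.169)-(4.171)

# Final example: l = [-4.0, 3.6, -0.8] should give the gains (4.187)-(4.189).
kp, ki, kd = pid_gains_from_l(-4.0, 3.6, -0.8)
print(round(kp, 4), round(ki, 4), round(kd, 4))

# Variance trade-off, Table 4.1 (lambda = 0.002 against lambda = 0).
dy_table = percent_change(1.0381, 1.0000)
du_table = percent_change(14.7595, 29.6000)
# Variance trade-off for system (4.190) (lambda = 0.003 against lambda = 0).
dy_sys = percent_change(1.0775, 1.0127)
du_sys = percent_change(13.3724, 36.5452)
print(round(dy_table, 1), round(du_table, 1), round(dy_sys, 1), round(du_sys, 1))
```

The pole placement step reproduces the initial gain vector quoted for the second example, and the percentage changes agree with the rounded figures in the text (about 3.8% and 50% for Table 4.1, about 6% and 63% for the system of Equation (4.190)).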
[Figure 4.1. Gain Estimation of a Delayed System. Panels: the proportional gain and the derivative gain, each plotted against iteration.]

[Figure 4.2. Gain Estimation of a Nonstationarily Disturbed System. Panels: the proportional gain, the integral gain and the derivative gain, each plotted against iteration.]

[Figure 4.3. Gain Estimation of a Nonminimum Phase System. Panels recovered: the proportional gain and the integral gain, each plotted against iteration.]

[Figure 4.4. Responses from PID Feedback. Panels: input variable $u_t$ and output variable $y_t$.]

4.5 The Self Tuning Controller

The self tuning controller is a relatively recent development in the control literature. Even though it has not been used as much as the PID controller, it is well known. Strictly speaking, a self tuning controller is considered different from an adaptive controller, because a self tuning controller is supposed to have unknown but constant gains, while an adaptive controller constantly changes its gains in an adaptive environment. However, if the adaptive environment changes very slowly, then we can use a self tuning controller. The self tuning controller will tune itself to its correct gains before the environment adapts to a new operating point. Then it will tune itself again to the new gains corresponding to the new operating point. The self tuning controller can therefore be loosely considered as an adaptive controller. It is a matter of how fast the controller tunes to its correct gains and how slowly the environment changes its characteristics. In practice, we normally find a slowly changing environment and we want our self tuning controller to tune to its correct gains as fast as possible. The self tuning controller is hence applicable.
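4.5.1 The Recursive Least Squares Self Tuning Controller

The Minimum Variance Controller

The pioneering paper on the self tuning controller was given by Astrom, K. J. and Wittenmark, B. (1973). In this paper, a minimum variance controller was successfully simulated. The problem with non-minimum phase systems was also mentioned. In most research, an Astrom model was used for a self tuning controller. In this thesis, we will use the Box-Jenkins model for our self-tuning algorithms. Recall the controller form of the Box-Jenkins model control system

$$y_{t+f+1} = \frac{\omega(z^{-1})\phi(z^{-1})\psi(z^{-1})}{\theta(z^{-1})\delta(z^{-1})}\,u_t + \frac{\gamma(z^{-1})}{\theta(z^{-1})}\,y_t + \psi(z^{-1})a_{t+f+1} \tag{4.197}$$

or

$$\theta(z^{-1})\delta(z^{-1})\left[y_{t+f+1} - \psi(z^{-1})a_{t+f+1}\right] = \omega(z^{-1})\phi(z^{-1})\psi(z^{-1})u_t + \delta(z^{-1})\gamma(z^{-1})y_t \tag{4.198}$$

We can write this equation as below, where $c_i$ denotes the coefficient of $z^{-i}$ in the product $\theta(z^{-1})\delta(z^{-1})$.

$$y_{t+f+1} = \omega(z^{-1})\phi(z^{-1})\psi(z^{-1})u_t + \delta(z^{-1})\gamma(z^{-1})y_t + \psi(z^{-1})a_{t+f+1} - \sum_{i=1}^{q+s} c_i\left(y_{t+f+1-i} - \psi(z^{-1})a_{t+f+1-i}\right) \tag{4.199}$$
$$= \begin{bmatrix} u_t & \cdots & u_{t-m} & y_t & \cdots & y_{t-n} \end{bmatrix} \begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_{m+n+1} \end{bmatrix} + \psi(z^{-1})a_{t+f+1} - \sum_{i=1}^{q+s} c_i\left(y_{t+f+1-i} - \psi(z^{-1})a_{t+f+1-i}\right) \tag{4.200}$$
$$= \mathbf{x}_t^T\boldsymbol{\beta} + \psi(z^{-1})a_{t+f+1} - \sum_{i=1}^{q+s} c_i\left(y_{t+f+1-i} - \psi(z^{-1})a_{t+f+1-i}\right) \tag{4.201}$$

If the last term in the above equation can be considered zero, which should be the case if in previous control times there were feedback actions that set this term to zero or very close to it, we can have a least squares fit with $y_{t+f+1}$ as the regressee and $\mathbf{x}_t$ as the vector of the regressors. The self tuning controller does just that, and therefore a least squares fit will result in unbiased estimation of the controller's parameters. The self tuning algorithm requires 3 parameters $f$, $m$ and $n$. We can write the above equation as

$$y_{t+f+1} = \mathbf{x}_t^T\boldsymbol{\beta} + \varepsilon_{t+f+1} \tag{4.202}$$

and the en bloc least squares problem

$$\min_{\boldsymbol{\beta}_N} \sum_{t=1}^{N-1}\left(y_{t+f+1} - \mathbf{x}_t^T\boldsymbol{\beta}_N\right)^2 \tag{4.203}$$

gives the parameters as follows:

$$\boldsymbol{\beta}_N = \mathbf{A}_{N-1}^{-1}\,\mathbf{b}_{N-1} \tag{4.204}$$

with

$$\mathbf{A}_N = \frac{1}{N}\begin{bmatrix}
\sum u_tu_t & \cdots & \sum u_tu_{t-m} & \sum u_ty_t & \cdots & \sum u_ty_{t-n} \\
\vdots & & \vdots & \vdots & & \vdots \\
\sum u_{t-m}u_t & \cdots & \sum u_{t-m}u_{t-m} & \sum u_{t-m}y_t & \cdots & \sum u_{t-m}y_{t-n} \\
\sum y_tu_t & \cdots & \sum y_tu_{t-m} & \sum y_ty_t & \cdots & \sum y_ty_{t-n} \\
\vdots & & \vdots & \vdots & & \vdots \\
\sum y_{t-n}u_t & \cdots & \sum y_{t-n}u_{t-m} & \sum y_{t-n}y_t & \cdots & \sum y_{t-n}y_{t-n}
\end{bmatrix} \tag{4.205}$$

where all sums run from $t = 1$ to $N$.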
$$\mathbf{b}_N = \frac{1}{N}\begin{bmatrix} \sum_{t=1}^{N} y_{t+f+1}u_t \\ \vdots \\ \sum_{t=1}^{N} y_{t+f+1}u_{t-m} \\ \sum_{t=1}^{N} y_{t+f+1}y_t \\ \vdots \\ \sum_{t=1}^{N} y_{t+f+1}y_{t-n} \end{bmatrix} \tag{4.206}$$

Now since $\varepsilon_{t+f+1}$ is correlated with the regressor vector $\mathbf{x}_t$, the estimation will be biased. This means that if we collect a number of observations, get an estimate of the parameter vector $\boldsymbol{\beta}$ and implement a fixed control law with this parameter vector, we will not have a minimum variance control strategy, because the obtained parameter vector is biased. The trick of the self tuning control algorithm is that we estimate the parameter vector recursively and then introduce feedback action to reduce the factor contributing to the bias, until the obtained parameter vector is correct. We will discuss and prove later on that this happens when the matrix $\mathbf{A}_N$ is singular. The following algorithm can be used for a recursive least squares self tuning controller.

1. At the control time $N$, get the observed variable $y_N$ and the vector

$$\mathbf{x}_{N-f-1}^T = \begin{bmatrix} u_{N-f-1} & \cdots & u_{N-f-m-1} & y_{N-f-1} & \cdots & y_{N-f-n-1} \end{bmatrix} \tag{4.207}$$

and treat these as $y_{N+f+1}$ and $\mathbf{x}_N$.

2. Get the parameter vector $\boldsymbol{\beta}_N$ from the following recursive least squares fit.

$$\boldsymbol{\beta}_N = \boldsymbol{\beta}_{N-1} + \mathbf{K}_N\left[y_{N+f+1} - \mathbf{x}_N^T\boldsymbol{\beta}_{N-1}\right] \tag{4.208}$$

where

$$\mathbf{K}_N = \frac{\mathbf{P}_{N-1}\mathbf{x}_N}{1 + \mathbf{x}_N^T\mathbf{P}_{N-1}\mathbf{x}_N} \tag{4.209}$$

and

$$\mathbf{P}_N = \mathbf{P}_{N-1} - \frac{\mathbf{P}_{N-1}\mathbf{x}_N\mathbf{x}_N^T\mathbf{P}_{N-1}}{1 + \mathbf{x}_N^T\mathbf{P}_{N-1}\mathbf{x}_N} \tag{4.210}$$

3. From the estimated parameter vector $\boldsymbol{\beta}_N$, calculate the control action $u_N$ from the equation below.

$$\begin{bmatrix} u_N & \cdots & u_{N-m} & y_N & y_{N-1} & \cdots & y_{N-n} \end{bmatrix}\boldsymbol{\beta}_N = 0 \tag{4.211}$$

This algorithm is different from the original algorithm suggested in Astrom, K. J. and Wittenmark, B. (1973) in the sense that no parameters are fixed. Nevertheless, it has proved to be a stable algorithm and has been used in all the simulation runs described later. Figure 4.5 shows a block diagram for a self tuning controller.

[Figure 4.5. Block Diagram of a Self Tuning Controller. Blocks: self tuning mechanism, controller, plant and disturbance.]

The LQG Controller

For a minimum variance self tuning controller, the controller parameters do not converge when the system is non-minimum phase.
Research on these non-minimum phase systems suggested a linear quadratic Gaussian (LQG) control law or a pole-zero placement control strategy. A pole-zero placement control strategy has the difficulty of solving a Diophantine equation (Clarke, D. W. (1984)). This difficulty becomes a numerical problem if the polynomial orders are overestimated (Allidina, A. Y. and Hughes, F. M. (1980)). Also, for practical purposes, the determination of the pole locations is difficult. Therefore an LQG control law is preferred. The problem with the existing approaches to obtain the controller's parameters for an LQG self tuning controller is their complexity. An LQG control law usually requires a spectral factorization (Grimble, M. J. (1984)) or the solution of a Riccati equation (Astrom, K. J. and Wittenmark, B. (1989)). Both algorithms require extensive computation. Since a self tuning mechanism requires this calculation at every control time, the burden on the control algorithm is obvious. In 1975, Clarke, D. W. and Gawthrop, P. J. proposed a very practical LQG self tuning algorithm. The algorithm is not very different from the minimum variance one. In the following, we will derive it for the Box-Jenkins model. For a non-minimum phase system, a minimum variance control law will have a nonstationary input variable $u_t$ for a stationary output variable $y_t$, as can be seen from the controller

$$u_t = -\frac{\delta(z^{-1})\gamma(z^{-1})}{\omega(z^{-1})\phi(z^{-1})\psi(z^{-1})}\,y_t \tag{4.212}$$

To have a stationary input, we can use an LQG control law.
The LQG control criterion

$$\min E\{y_{t+f+1}^2 + \lambda u_t^2\} \tag{4.213}$$

gives the following controller

$$u_t = -\frac{\delta(z^{-1})\gamma(z^{-1})}{\omega(z^{-1})\phi(z^{-1})\psi(z^{-1}) + \dfrac{\lambda}{\omega_0}\theta(z^{-1})\delta(z^{-1})}\,y_t \tag{4.214}$$

By adding the quantity $\dfrac{\lambda}{\omega_0}\theta(z^{-1})\delta(z^{-1})u_t$ to both sides of Equation (4.198), we get the following equation:

$$\theta(z^{-1})\delta(z^{-1})\left[y_{t+f+1} + \frac{\lambda}{\omega_0}u_t - \psi(z^{-1})a_{t+f+1}\right] = \left[\omega(z^{-1})\phi(z^{-1})\psi(z^{-1}) + \frac{\lambda}{\omega_0}\theta(z^{-1})\delta(z^{-1})\right]u_t + \delta(z^{-1})\gamma(z^{-1})y_t \tag{4.215}$$

From the above equation, we can see that there is a simple way to design an LQG self tuning controller. We can use the self tuning algorithm discussed previously for the minimum variance case with only one exception: we use $y_N + \dfrac{\lambda}{\omega_0}u_{N-f-1}$ instead of $y_N$ in the recursive estimation of $\boldsymbol{\beta}_N$. This is the Clarke-Gawthrop LQG self tuning controller for the Box-Jenkins model.

4.5.2 Convergence of the RLS Self-Tuning Algorithm

Now we will investigate the parametric convergence of the RLS self tuning algorithm. The stability and convergence of the parameters of a self tuning controller have been studied by Ljung, L. (1977). That approach uses the theory of differential equations. In this section, we will discuss the parametric convergence via matrix theory. This will give us an insight into the problem, so that we can present our innovative approach to the self tuning algorithm. But before doing that, let us introduce two matrix theorems that are useful to our analysis.

Theorem 4.1 When a real symmetric matrix $\mathbf{B}$ is added to a real symmetric matrix $\mathbf{A}$ as follows:

$$\mathbf{C} = \mathbf{A} + \mathbf{B} \tag{4.216}$$

then the eigenvalues of $\mathbf{C}$ are those of $\mathbf{A}$ in the same order, shifted by an amount that lies between the smallest and largest eigenvalues of $\mathbf{B}$.

Proof. The theorem is proved via the minimax characteristic of the eigenvalues in Wilkinson, J. H. (1965). Q.E.D.

The roles of the matrices $\mathbf{A}$ and $\mathbf{B}$ can be interchanged.
We can say that the eigenvalues of $\mathbf{C}$ are those of $\mathbf{B}$ in the same order, shifted by an amount that lies between the smallest and largest eigenvalues of $\mathbf{A}$. The above theorem gives the following special case:

Theorem 4.2 When a real symmetric matrix $\mathbf{B}$ of rank one is added to a real symmetric matrix $\mathbf{A}$ as follows:

$$\mathbf{C} = \mathbf{A} + \mathbf{B} \tag{4.217}$$

then the eigenvalues of $\mathbf{C}$ are those of $\mathbf{A}$ in the same order, shifted by an amount that is a fraction between zero and one of the only nonzero eigenvalue of $\mathbf{B}$. That means we can write

$$\mu(\mathbf{C})_i = \mu(\mathbf{A})_i + m_i\,\mu(\mathbf{B}), \qquad 0 \le m_i \le 1, \qquad \sum_i m_i = 1 \tag{4.218}$$

Proof. The theorem is proved in Wilkinson, J. H. (1965). Q.E.D.

Since the above theorems are important for the development of our Recursive Least Determinant self-tuning controller, we include these proofs in Appendix A. Now we will get back to the problem of convergence of the recursive least squares self-tuning algorithm. First, we note that a recursive least squares estimation and a batch or non-recursive (also called en bloc) least squares estimation will give the same values of the estimated parameters, if the recursive least squares starts with initial values calculated from the first few observations of the data. Then we can analyze the parametric convergence via the sequence of the matrices $\mathbf{A}_i$ and vectors $\mathbf{b}_i$, because the parameters are given by their products as described by Equation (4.204). The sequence of the matrix products:

$$\mathbf{A}_1^{-1}\mathbf{b}_1,\; \mathbf{A}_2^{-1}\mathbf{b}_2,\; \cdots,\; \mathbf{A}_{N-1}^{-1}\mathbf{b}_{N-1},\; \mathbf{A}_N^{-1}\mathbf{b}_N,\; \cdots \tag{4.219}$$

will converge to a fixed vector when the $\mathbf{A}_i$ and $\mathbf{b}_i$ have converged. The convergence of a matrix can be established through its eigenvalues. To study the eigenvalues of the matrix $\mathbf{A}_N$, we can write it as below:

$$\mathbf{A}_N = \frac{1}{N}\left[\mathbf{x}_1\mathbf{x}_1^T + \mathbf{x}_2\mathbf{x}_2^T + \cdots + \mathbf{x}_N\mathbf{x}_N^T\right] \tag{4.220}$$
$$= \frac{1}{N}\left[(N-1)\mathbf{A}_{N-1} + \mathbf{x}_N\mathbf{x}_N^T\right] \tag{4.221}$$
$$= \mathbf{A}_{N-1} - \frac{\mathbf{A}_{N-1} - \mathbf{x}_N\mathbf{x}_N^T}{N} \tag{4.222}$$

The matrix $\mathbf{A}_N$ is always singular if $N$ is smaller than $m + n + 2$. If $N$ is larger than $m + n + 2$, then $\mathbf{A}_N$ can be singular or nonsingular.
It depends on the values of the data. However, its eigenvalues are always real and non-negative: $\mathbf{A}_N$ is a positive semidefinite (nonnegative definite) matrix. If the data is open loop, or nothing can be said about the relationship between the vector $\mathbf{x}_N$ and the matrix $\mathbf{A}_{N-1}$, then nothing can be said about the eigenvalues of $\mathbf{A}_N$ relative to the corresponding eigenvalues of $\mathbf{A}_{N-1}$. It depends on the eigenvalues of the matrix $(\mathbf{A}_{N-1} - \mathbf{x}_N\mathbf{x}_N^T)/N$. This is a result of the above theorems. The eigenvalues of $\mathbf{A}_N$ are those of $\mathbf{A}_{N-1}$ reduced by amounts which lie between the smallest and largest eigenvalues of $(\mathbf{A}_{N-1} - \mathbf{x}_N\mathbf{x}_N^T)/N$, which can be negative as well as positive. In this case, we are not certain that the matrix $\mathbf{A}_N$ will approach singularity. But this is not the case we are interested in. We are interested in the case of self tuning closed-loop control with the control strategy:

$$\mathbf{x}_1^T\boldsymbol{\beta}_1 = 0, \quad \mathbf{x}_2^T\boldsymbol{\beta}_2 = 0, \quad \cdots \quad \mathbf{x}_N^T\boldsymbol{\beta}_N = 0 \tag{4.223}$$

where the parameter vectors $\boldsymbol{\beta}_i$ are given by Equation (4.204), and we want to prove that the matrix

$$\mathbf{A}_N = \frac{1}{N}\left[\mathbf{x}_1\mathbf{x}_1^T + \mathbf{x}_2\mathbf{x}_2^T + \cdots + \mathbf{x}_N\mathbf{x}_N^T\right] \tag{4.224}$$

will approach a singular matrix. We consider the least squares optimization step of the self-tuning algorithm. From the equation

$$\min_{\boldsymbol{\beta}_N} \sum_{t=1}^{N-1}\left(y_{t+f+1} - \mathbf{x}_t^T\boldsymbol{\beta}_N\right)^2 \tag{4.225}$$

we obtain the following

$$\mathbf{A}_{N-1}\boldsymbol{\beta}_N = \mathbf{b}_{N-1} \tag{4.226}$$

With the control action

$$\mathbf{x}_N^T\boldsymbol{\beta}_N = 0 \tag{4.227}$$

we can write

$$\mathbf{x}_N\mathbf{x}_N^T\boldsymbol{\beta}_N = \mathbf{0} \tag{4.228}$$

and obtain

$$\left[(N-1)\mathbf{A}_{N-1} + \mathbf{x}_N\mathbf{x}_N^T\right]\boldsymbol{\beta}_N = (N-1)\mathbf{b}_{N-1} \tag{4.229}$$
$$N\mathbf{A}_N\boldsymbol{\beta}_N = (N-1)\mathbf{b}_{N-1} \tag{4.230}$$

The least squares optimization at the time $N + 1$ will give us

$$N\mathbf{A}_N\boldsymbol{\beta}_{N+1} = N\mathbf{b}_N \tag{4.231}$$

By subtracting the above two equations, we obtain

$$N\mathbf{A}_N\left[\boldsymbol{\beta}_{N+1} - \boldsymbol{\beta}_N\right] = N\mathbf{b}_N - (N-1)\mathbf{b}_{N-1} \tag{4.232}$$

or

$$\mathbf{A}_N\left[\boldsymbol{\beta}_{N+1} - \boldsymbol{\beta}_N\right] = \frac{1}{N}\left[N\mathbf{b}_N - (N-1)\mathbf{b}_{N-1}\right] \tag{4.233}$$
$$= \frac{1}{N}\begin{bmatrix} y_{N+f+1}u_N \\ \vdots \\ y_{N+f+1}u_{N-m} \\ y_{N+f+1}y_N \\ \vdots \\ y_{N+f+1}y_{N-n} \end{bmatrix} \tag{4.234}$$

For stationary time series $u_t$ and $y_t$, the values of the $y_{N-j}$ and $u_{N-i}$ can be said to be bounded. Now at large values of $N$, we have

$$\mathbf{A}_N\left[\boldsymbol{\beta}_{N+1} - \boldsymbol{\beta}_N\right] \approx \mathbf{0} \tag{4.235}$$

For the above equation to be true, we have 3 cases:

1. The matrix $\mathbf{A}_N$ is nonsingular and $\boldsymbol{\beta}_N$ and $\boldsymbol{\beta}_{N+1}$ are the same.

2. The matrix $\mathbf{A}_N$ is singular and $\boldsymbol{\beta}_N$ and $\boldsymbol{\beta}_{N+1}$ are not the same.

3. The matrix $\mathbf{A}_N$ is singular and $\boldsymbol{\beta}_N$ and $\boldsymbol{\beta}_{N+1}$ are the same.

Of the cases listed above, only the third happens in practice. Case 1 cannot happen, because when the vector $\boldsymbol{\beta}_N$ and the vector $\boldsymbol{\beta}_{N+1}$ are the same, the self-tuning control law will approach a fixed control law and the matrix $\mathbf{A}_N$ will approach singularity. In case 2, theoretically, we cannot estimate the parameter vector $\boldsymbol{\beta}_N$, but for the sake of the discussion, let us say that we can estimate this parameter vector. This is highly probable, because we estimate this parameter vector recursively. The parameter vectors $\boldsymbol{\beta}_N$ and $\boldsymbol{\beta}_{N+1}$ can be different. But if we force the self-tuning algorithm to estimate only $m + n + 1$ parameters, then $\boldsymbol{\beta}_N$ and $\boldsymbol{\beta}_{N+1}$ can be the same. This is especially true if $\boldsymbol{\beta}_N$ and $\boldsymbol{\beta}_{N+1}$ are calculated recursively. This is the same as the third case: $\mathbf{A}_N$ is singular and $\boldsymbol{\beta}_N$ and $\boldsymbol{\beta}_{N+1}$ are the same. In other words, both $\mathbf{A}_N$ and $\boldsymbol{\beta}_N$ converge.

4.5.3 The Recursive Least Determinant Self Tuning Controller

In a previous section, we have asserted that a self tuning controller requires 3 parameters $f$, $m$ and $n$. Of these parameters, the delay parameter $f$ is the most important one. If we choose wrong values for $m$ and $n$, the controller performance might not be good, but it is normally acceptable. We have a case of suboptimal control. However, if a wrong value for the delay parameter $f$ is chosen, the controller performance is normally bad, because this means all the parameters of the controller are wrong. Also, the parameter $m$ depends partly on the delay of the system. In many industries, the delay problem is a serious problem. The pulp and paper industry can be an example.
In a paper mill, the speed of the beta gauge scanner used to measure paper properties such as basis weight, moisture and caliper is normally constant, whereas the paper machine speed, which indicates how fast the paper sheet is running through the machine, can occasionally change. This means the delay of the control system can change. A similar problem exists in a pulp mill. In a bleaching tower, a phenomenon called channeling can also change the delay of the control system. This means that even if we choose the right value for the delay parameter at one time, at another time of operation of the plant the delay parameter might be wrong. Therefore, it is desirable to have an algorithm that will eliminate the use of the delay parameter.

The Minimum Variance Controller

We have proved earlier that the matrix $\mathbf{A}_N$ will approach a singular matrix with the recursive least squares self-tuning algorithm. It is quite easy to prove that with a fixed minimum variance control law, this matrix is singular. To do this, we multiply both sides of Equation (4.198) by $u_t$ and take the summation of both sides. We have

$$\theta(z^{-1})\delta(z^{-1})\left[\sum y_{t+f+1}u_t - \psi(z^{-1})\sum a_{t+f+1}u_t\right] = \begin{bmatrix} \sum u_tu_t & \cdots & \sum u_tu_{t-m} & \sum u_ty_t & \cdots & \sum u_ty_{t-n} \end{bmatrix}\boldsymbol{\beta} \tag{4.236}$$

For a minimum variance control law, we have

$$\sum y_{t+f+1}u_t = 0 \tag{4.237}$$
$$\sum a_{t+f+1}u_t = 0 \tag{4.238}$$

which means

$$\begin{bmatrix} \sum u_tu_t & \cdots & \sum u_tu_{t-m} & \sum u_ty_t & \cdots & \sum u_ty_{t-n} \end{bmatrix}\boldsymbol{\beta} = 0 \tag{4.239}$$

A characteristic of a minimum variance control law is

$$\sum y_{t+f+1}y_{t-k} = 0 \quad \text{for } k \ge 0 \tag{4.240}$$
$$\sum y_{t+f+1}u_{t-k} = 0 \quad \text{for } k \ge 0 \tag{4.241}$$

This comes from the fact that $y_t$ is a moving average time series of order $f$. Since $a_t$ is a white noise, we have

$$\sum a_{t+f+1}y_{t-k} = 0 \quad \text{for } k \ge 0 \tag{4.242}$$
$$\sum a_{t+f+1}u_{t-k} = 0 \quad \text{for } k \ge 0 \tag{4.243}$$

With the above equations, and by multiplying both sides of Equation (4.198) by $u_{t-k}$ and $y_{t-k}$ and taking summations, we obtain the following equations:

$$\begin{bmatrix} \sum u_{t-k}u_t & \cdots & \sum u_{t-k}u_{t-m} & \sum u_{t-k}y_t & \cdots & \sum u_{t-k}y_{t-n} \end{bmatrix}\boldsymbol{\beta} = 0 \tag{4.244}$$
$$\begin{bmatrix} \sum y_{t-k}u_t & \cdots & \sum y_{t-k}u_{t-m} & \sum y_{t-k}y_t & \cdots & \sum y_{t-k}y_{t-n} \end{bmatrix}\boldsymbol{\beta} = 0 \tag{4.245}$$
For $u_{t-k}$ we let $k$ go from 0 to $m$, and for $y_{t-k}$ we let $k$ go from 0 to $n$. We obtain a set of $m + n + 2$ equations that can be ordered into a matrix equation as below:

$$\begin{bmatrix}
\sum u_tu_t & \cdots & \sum u_tu_{t-m} & \sum u_ty_t & \cdots & \sum u_ty_{t-n} \\
\vdots & & \vdots & \vdots & & \vdots \\
\sum u_{t-m}u_t & \cdots & \sum u_{t-m}u_{t-m} & \sum u_{t-m}y_t & \cdots & \sum u_{t-m}y_{t-n} \\
\sum y_tu_t & \cdots & \sum y_tu_{t-m} & \sum y_ty_t & \cdots & \sum y_ty_{t-n} \\
\vdots & & \vdots & \vdots & & \vdots \\
\sum y_{t-n}u_t & \cdots & \sum y_{t-n}u_{t-m} & \sum y_{t-n}y_t & \cdots & \sum y_{t-n}y_{t-n}
\end{bmatrix}\boldsymbol{\beta} = \mathbf{0} \tag{4.246}$$

From the above equation, we can conclude the singularity of the matrix $\mathbf{A}_N$ at the minimum variance feedback condition. Another, easier way to prove this fact is from the equation

$$\theta(z^{-1})\delta(z^{-1})\left[y_{t+f+1} - \psi(z^{-1})a_{t+f+1}\right] = \omega(z^{-1})\phi(z^{-1})\psi(z^{-1})u_t + \delta(z^{-1})\gamma(z^{-1})y_t \tag{4.247}$$
$$s_t = \mathbf{x}_t^T\boldsymbol{\beta} \tag{4.248}$$

where $s_t$ denotes the common value of the two sides. If we square both sides, take summations and divide by $N$, we will obtain

$$\frac{1}{N}\sum s_t^2 = \boldsymbol{\beta}^T\mathbf{A}_N\boldsymbol{\beta} \tag{4.249}$$

Now at the minimum variance feedback condition, $s_t = 0$, because $y_t = \psi(z^{-1})a_t$, and therefore

$$\boldsymbol{\beta}^T\mathbf{A}_N\boldsymbol{\beta} = 0 \tag{4.250}$$

From the above equation, we can say the matrix $\mathbf{A}_N$ is singular with $\boldsymbol{\beta}$ an eigenvector associated with the zero eigenvalue. Since the matrix $\mathbf{A}_N$ is singular at the minimum variance feedback condition, we will attempt a new control strategy by calculating $u_N$ at the control time from the equation

$$|\mathbf{A}_N| = 0 \tag{4.251}$$

The control action $u_N$ is calculated from the singularity of the matrix $\mathbf{A}_N$ at each control time. The matrix $\mathbf{A}_N$ contains all the necessary information for us to calculate the control action $u_N$. Doing this, we do not need to know the delay $f$ and we can bypass the calculation of the estimated control parameter vector $\boldsymbol{\beta}_N$. Even though we say we do not need to know the delay $f$, when we choose the parameter $m$ we unconsciously choose it based partly on this knowledge. However, we do not use this information twice, and there is a great reward for not doing so. This is related to the topic of parameter mismatch, which we have discussed briefly before. This strategy leads to the solution of a quadratic equation in the variable $u_N$ at each control time.
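This means there may be two solutions to this problem, and we have to devise a way to discard one of them. This problem is in fact not difficult. Because we always want less movement of the control element, we can always choose the solution that has the smaller absolute value, for smaller variance. A more difficult problem is to prove whether there are always real solutions to this quadratic equation. The equation

$$|\mathbf{A}_N| = 0 \tag{4.252}$$

gives

$$\begin{vmatrix}
\sum u_tu_t & \cdots & \sum u_tu_{t-m} & \sum u_ty_t & \cdots & \sum u_ty_{t-n} \\
\vdots & & \vdots & \vdots & & \vdots \\
\sum u_{t-m}u_t & \cdots & \sum u_{t-m}u_{t-m} & \sum u_{t-m}y_t & \cdots & \sum u_{t-m}y_{t-n} \\
\sum y_tu_t & \cdots & \sum y_tu_{t-m} & \sum y_ty_t & \cdots & \sum y_ty_{t-n} \\
\vdots & & \vdots & \vdots & & \vdots \\
\sum y_{t-n}u_t & \cdots & \sum y_{t-n}u_{t-m} & \sum y_{t-n}y_t & \cdots & \sum y_{t-n}y_{t-n}
\end{vmatrix} = 0 \tag{4.253}$$

where all sums run from $t = 1$ to $N$. At the control time $N$, the variable $y_N$ is available and we are required to calculate $u_N$. This variable is an unknown and appears in only the first row and column of the matrix $\mathbf{A}_N$. By designating the following matrix:

$$\mathbf{B} = \begin{bmatrix}
\sum u_{t-1}u_{t-1} & \cdots & \sum u_{t-1}u_{t-m} & \sum u_{t-1}y_t & \cdots & \sum u_{t-1}y_{t-n} \\
\vdots & & \vdots & \vdots & & \vdots \\
\sum u_{t-m}u_{t-1} & \cdots & \sum u_{t-m}u_{t-m} & \sum u_{t-m}y_t & \cdots & \sum u_{t-m}y_{t-n} \\
\sum y_tu_{t-1} & \cdots & \sum y_tu_{t-m} & \sum y_ty_t & \cdots & \sum y_ty_{t-n} \\
\vdots & & \vdots & \vdots & & \vdots \\
\sum y_{t-n}u_{t-1} & \cdots & \sum y_{t-n}u_{t-m} & \sum y_{t-n}y_t & \cdots & \sum y_{t-n}y_{t-n}
\end{bmatrix} \tag{4.254}$$

we can write the above equation as:

$$\begin{vmatrix}
\sum u_tu_t & \begin{matrix} \sum u_tu_{t-1} & \cdots & \sum u_ty_{t-n} \end{matrix} \\
\begin{matrix} \sum u_{t-1}u_t \\ \vdots \\ \sum y_{t-n}u_t \end{matrix} & \mathbf{B}
\end{vmatrix} = 0 \tag{4.255}$$

or

$$\begin{vmatrix} \sum u_tu_t & \mathbf{d}^T \\ \mathbf{d} & \mathbf{B} \end{vmatrix} = 0 \tag{4.256}$$

with the vector $\mathbf{d}$ defined as:

$$\mathbf{d} = \begin{bmatrix} \sum_{t=1}^{N} u_{t-1}u_t \\ \vdots \\ \sum_{t=1}^{N} u_{t-m}u_t \\ \sum_{t=1}^{N} y_tu_t \\ \vdots \\ \sum_{t=1}^{N} y_{t-n}u_t \end{bmatrix} = \begin{bmatrix} \sum_{t=1}^{N-1} u_{t-1}u_t \\ \vdots \\ \sum_{t=1}^{N-1} u_{t-m}u_t \\ \sum_{t=1}^{N-1} y_tu_t \\ \vdots \\ \sum_{t=1}^{N-1} y_{t-n}u_t \end{bmatrix} + \begin{bmatrix} u_{N-1} \\ \vdots \\ u_{N-m} \\ y_N \\ \vdots \\ y_{N-n} \end{bmatrix}u_N \tag{4.257}$$
$$= \mathbf{d}_1 + \mathbf{d}_2u_N \tag{4.258}$$

Now by applying a property of the bordered matrix determinant, we have the following equation:

$$\left|\sum_{t=1}^{N} u_tu_t - \mathbf{d}^T\mathbf{B}^{-1}\mathbf{d}\right| \times |\mathbf{B}| = 0 \tag{4.259}$$

Since the first term of the above equation is a scalar, we can write:

$$\left(\sum_{t=1}^{N} u_tu_t - \mathbf{d}^T\mathbf{B}^{-1}\mathbf{d}\right)|\mathbf{B}| = 0 \tag{4.260}$$

and our control problem gives us the equation

$$\sum_{t=1}^{N} u_tu_t - \mathbf{d}^T\mathbf{B}^{-1}\mathbf{d} = 0 \tag{4.261}$$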
THE SELF TUNING 109 CONTROLLER From Equation (4.261), we have N [ U l -t- U ' U A T [d +d r UU X>,u 4=1 T IT D T 2 x t 2Ujv 1 B" 1 [di + d ] = 2Ujv 0 (4.262) or N-l (l-d^B- d )« 1 2 2 -2dfB- d « 1 v 2 + ^ u -dfB t=i 2 i V _ 1 di = 0 (4.263) The above equation is a quadratic equation with ujy as the unknown. If we write it in the form: au% + bu + c = 0 N (4.264) then the coefficients are as follows: a = l-d b = r 2 B- d 1 -2dfB- d 2 1 2 (4.265) (4.266) N-l c = ^ u\ - d f B - M x t=i (4.267) and the control action at time ./V is given by -b ± Vb - iac 2 u = N 2a (4.268) Solvability of the Control Action In this section, we will discuss the solvability of the control action UN at time N when the output variable has value I/N- We have voiced some concern about the fact that there might not be a real solution for UN, because it is an unknown of a quadratic equation. We have the controller equation as follows: |AJV| = 0 ' 1 (4.269) [xixf + x x ^ + • • • + XTVX^I | 2 1 HN-^AN-I+XNXH ]\fm+n+2 (4.270) (4.271) CHAPTER 110 4. CONTROLLERS Now if the matrix A / v - i is positive definite, then we definitely have no real solutions for u/v, because the matrix x ^ x ^ will either leave the smallest eigenvalue of the matrix (TV — 1)AJV_I where it was or shift it a positive amount. The matrix AJV will never have a zero eigenvalue. This is similar to saying that the determinant of A AT will never be zero. Tests confirmed this assertion. There were no real solution u^s, whenever the matrix AAT_I was nonsingular. However, if A / v - i is singular, then we will have a real solution UN for A/v to be singular. To prove this is actually quite easy. We can proceed as follows. If Ajv_i is singular, then we have | ( / V | = 0 (4.272) or (N-l)A -if3 N = 0 N (4.273) where (5 is the eigenvector corresponding to the zero eigenvalue. 
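The coefficients (4.265) to (4.267) and the roots (4.268) can be sketched in a few lines of code. This is only an illustrative sketch, not the implementation used in the thesis: the function name `control_action_roots`, the list-of-lists matrix layout and the argument names are assumptions made here. It takes the inverse of $B$ as given and returns the real roots ordered by absolute value, so that the first entry is the solution with the smaller movement of the control element; an empty list signals that no real solution exists.

```python
import math


def mat_vec(M, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in M]


def dot(a, b):
    """Inner product of two vectors."""
    return sum(x * y for x, y in zip(a, b))


def control_action_roots(B_inv, d1, d2, sum_u_sq):
    """Real roots of a*u^2 + b*u + c = 0 with a, b, c as in (4.265)-(4.267).

    B_inv    : inverse of the matrix B of (4.254) (hypothetical input layout)
    d1, d2   : the two parts of d = d1 + d2*u_N in (4.258)
    sum_u_sq : sum of u_t^2 for t = 1 .. N-1
    Returns the real roots sorted by absolute value, or [] if there are none.
    """
    B_d1 = mat_vec(B_inv, d1)
    B_d2 = mat_vec(B_inv, d2)
    a = 1.0 - dot(d2, B_d2)
    b = -2.0 * dot(d1, B_d2)
    c = sum_u_sq - dot(d1, B_d1)
    if a == 0.0:
        # Degenerate case: the equation is linear in u_N.
        return [-c / b] if b != 0.0 else []
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return []  # no real control action makes A_N singular
    r = math.sqrt(disc)
    return sorted([(-b - r) / (2.0 * a), (-b + r) / (2.0 * a)], key=abs)
```

With a scalar $B$ (one regressor), `control_action_roots([[0.5]], [1.0], [1.0], 0.5)` gives two real roots, while raising `sum_u_sq` to `2.0` makes the discriminant negative and the list empty, which mirrors the positive definite case discussed above.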
Now we can always choose $u_N$ such that

$$x_N^T \beta_N = 0 \tag{4.274}$$

which will give us as before

$$A_N \beta_N = 0 \tag{4.275}$$

But this means

$$\beta_{N+1} = \beta_N \tag{4.276}$$

Under many tests, whenever $A_{N-1}$ was singular, the two solutions $u_N$ coincided, and the parameter vector $\beta_{N+1}$ was the same as $\beta_N$. This means that the parameter $\beta$ will never diverge from its converged value.

We have mentioned earlier that we will not have a real solution $u_N$ for the matrix $A_N$ to be singular at time $N$ if the matrix $A_{N-1}$ is nonsingular. In the beginning of the control period, this is normally the case. So our attempt to establish the singularity of the matrix $A_N$ at every control time fails. What we do instead is to use the singularity of $A_\infty$ as the convergence criterion and, at time $N$, calculate $u_N$ from the matrix $A_N$ such that we come closer to this criterion. In this sense, our control strategy is a self-tuning one. We want the matrix $A_N$ to tune itself until it is singular. If the parameters $m$ and $n$ are chosen correctly, the strategy will self-tune into a minimum variance controller. At this criterion both the smallest eigenvalue and the determinant are zero, therefore we can have two approaches to this problem. Either we calculate $u_N$ so that the smallest eigenvalue of $A_N$ is as small as possible, or we calculate $u_N$ so that the determinant of $A_N$ is as small as possible.

To take the smallest eigenvalue approach, we can have an algorithm such that the smallest eigenvalue of the matrix $A_N$ is smaller than that of the matrix $A_{N-1}$. The strategy is for the smallest eigenvalue to approach zero. If we pursue this strategy, we can derive the algorithm from the following mathematical derivations.
From Equation (4.222), we can write:

$$\frac{\theta^T A_N \theta}{\theta^T\theta} = \frac{\theta^T A_{N-1}\theta}{\theta^T\theta} - \frac{\theta^T\left[A_{N-1} - x_N x_N^T\right]\theta}{N\,\theta^T\theta} \tag{4.277}$$

Let $\theta$ be $\theta_1$ such that the left hand side of the above equation is minimum, then we have

$$\min_{\theta}\frac{\theta^T A_N \theta}{\theta^T\theta} = \frac{\theta_1^T A_{N-1}\theta_1}{\theta_1^T\theta_1} - \frac{\theta_1^T\left[A_{N-1} - x_N x_N^T\right]\theta_1}{N\,\theta_1^T\theta_1} \tag{4.278}$$

Now let $\theta$ on the right hand side of Equation (4.277) be $\theta_2$ such that

$$\frac{\theta_2^T A_{N-1}\theta_2}{\theta_2^T\theta_2} = \min_{\theta}\frac{\theta^T A_{N-1}\theta}{\theta^T\theta} \tag{4.279}$$

then, because $\theta_1$ minimizes the left hand side while $\theta_2$ need not, we have

$$\min_{\theta}\frac{\theta^T A_N \theta}{\theta^T\theta} \le \min_{\theta}\frac{\theta^T A_{N-1}\theta}{\theta^T\theta} - \frac{\theta_2^T\left[A_{N-1} - x_N x_N^T\right]\theta_2}{N\,\theta_2^T\theta_2} \tag{4.280}$$

By the Courant-Fischer theorem, the left hand side of the above equation is the smallest eigenvalue of the matrix $A_N$ and similarly the first term on the right hand side is the smallest eigenvalue of the matrix $A_{N-1}$. So the above equation gives us

$$\lambda_{\min}(A_N) \le \lambda_{\min}(A_{N-1}) - \frac{\theta_2^T\left[A_{N-1} - x_N x_N^T\right]\theta_2}{N\,\theta_2^T\theta_2} \tag{4.281}$$

For the smallest eigenvalue of the matrix $A_N$ to be smaller than that of the matrix $A_{N-1}$, we can have the following control algorithm:

• At the control time $N$, we get the matrix $A_{N-1}$.
• Get the smallest eigenvalue of $A_{N-1}$ and its associated eigenvector.
• Make this eigenvector the control parameter vector at time $N$ ($\beta_N$), i.e. we have

$$x_N^T\theta_2 = 0 \tag{4.282}$$

With this control algorithm, we have the following relationship between the smallest eigenvalues of the matrices $A_{N-1}$ and $A_N$:

$$\lambda_{\min}(A_N) \le \frac{N-1}{N}\,\lambda_{\min}(A_{N-1}) \tag{4.283}$$

This algorithm, however, proved to be unstable under tests. This is because even though the smallest eigenvalue of $A_N$ can be smaller than that of $A_{N-1}$, the determinant of $A_N$ can be greater than that of $A_{N-1}$.

Now if we take the determinant approach, then instead of looking for the solution $u_N$ of the equation

$$a u_N^2 + b u_N + c = 0 \tag{4.284}$$

we will look for the solution of the following optimization problem

$$\min_{u_N}\; a u_N^2 + b u_N + c \tag{4.285}$$

The solution of the above optimization problem is given by

$$u_N = -\frac{b}{2a} \tag{4.286}$$

$$= \frac{d_1^T B^{-1} d_2}{1 - d_2^T B^{-1} d_2} \tag{4.287}$$

which is the unique solution for the determinant of the matrix $A_N$ to attain its smallest value.
The matrix $B$ will be nonsingular when $N$ is larger than $m + n + 1$. This solution can be used when the matrix $A_{N-1}$ is nonsingular as well as when it is singular. This is one of a few ways to calculate $u_N$. In a later section, we will mention another.

The Control Algorithm

In this section, we will present our self-tuning control algorithm. We will call our control algorithm the recursive least determinant (RLD) self tuning control algorithm, because we want the smallest determinant of the matrix $A_N$ at each control time and this determinant can be calculated recursively. The control action $u_N$ is calculated from the following optimization problem:

$$\min_{u_N}\; |A_N| \tag{4.288}$$

which results in Equation (4.287). From this equation, we can present our RLD self-tuning control algorithm as follows:

• At the control time $N$, get the vector

$$c_N = \begin{bmatrix} u_{N-1} & \cdots & u_{N-m} & y_N & \cdots & y_{N-n}\end{bmatrix}^T \tag{4.289}$$

update the vector

$$d_N = d_{N-1} + \begin{bmatrix} u_{N-2}u_{N-1} & \cdots & u_{N-1-m}u_{N-1} & y_{N-1}u_{N-1} & \cdots & y_{N-1-n}u_{N-1}\end{bmatrix}^T \tag{4.290}$$

$$= d_{N-1} + c_{N-1}u_{N-1} \tag{4.291}$$

and the inverse matrix

$$B_N^{-1} = \left[B_{N-1} + c_N c_N^T\right]^{-1} \tag{4.292}$$

$$= B_{N-1}^{-1} - \frac{B_{N-1}^{-1} c_N c_N^T B_{N-1}^{-1}}{1 + c_N^T B_{N-1}^{-1} c_N} \tag{4.293}$$

• Calculate the control action

$$u_N = \frac{d_N^T B_N^{-1} c_N}{1 - c_N^T B_N^{-1} c_N} \tag{4.294}$$

• Send $u_N$ to the control element.

The vector $d_N$ and matrix $B_N$ are given en bloc as follows:

$$d_N = \begin{bmatrix}\sum_{t=1}^{N-1} u_{t-1}u_t & \cdots & \sum_{t=1}^{N-1} u_{t-m}u_t & \sum_{t=1}^{N-1} y_t u_t & \cdots & \sum_{t=1}^{N-1} y_{t-n}u_t\end{bmatrix}^T \tag{4.295}$$

$$B_N = \begin{bmatrix}
\sum_{t=1}^{N} u_{t-1}u_{t-1} & & & & & \text{sym.}\\
\vdots & \ddots & & & & \\
\sum_{t=1}^{N} u_{t-m}u_{t-1} & \cdots & \sum_{t=1}^{N} u_{t-m}u_{t-m} & & & \\
\sum_{t=1}^{N} y_t u_{t-1} & \cdots & \sum_{t=1}^{N} y_t u_{t-m} & \sum_{t=1}^{N} y_t y_t & & \\
\vdots & & \vdots & \vdots & \ddots & \\
\sum_{t=1}^{N} y_{t-n}u_{t-1} & \cdots & \sum_{t=1}^{N} y_{t-n}u_{t-m} & \sum_{t=1}^{N} y_t y_{t-n} & \cdots & \sum_{t=1}^{N} y_{t-n}y_{t-n}
\end{bmatrix} \tag{4.296}$$

It is not necessary to update $B_N$ in the control algorithm, only its inverse. The calculation of the inverse is done via the Sherman-Morrison formula. The above vector and matrix are given for ease of initialization. The RLD algorithm has been tested and proved to be an easy and robust self tuning control algorithm.
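The steps of the RLD algorithm, Equations (4.289) to (4.294), can be sketched as below. This is a minimal sketch under stated assumptions, not the code used for the simulations: the names `rld_step` and `update_d`, the list-based data layout and the way the caller supplies the initial inverse of $B$ (for example from the open-loop initialization period) are all hypothetical. The Sherman-Morrison update relies on $B$ being symmetric.

```python
def mat_vec(M, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in M]


def dot(a, b):
    """Inner product of two vectors."""
    return sum(x * y for x, y in zip(a, b))


def rld_step(B_inv_prev, d, c):
    """One RLD control step at time N.

    B_inv_prev : inverse of B_{N-1} (symmetric), supplied by the caller
    d          : d_N of (4.295), i.e. the sum of c_t * u_t for t < N
    c          : c_N of (4.289)
    Returns (u_N, inverse of B_N).
    """
    k = len(c)
    Bc = mat_vec(B_inv_prev, c)              # B_{N-1}^{-1} c_N
    denom = 1.0 + dot(c, Bc)
    # Sherman-Morrison update (4.293); uses the symmetry of B_{N-1}^{-1}.
    B_inv = [[B_inv_prev[i][j] - Bc[i] * Bc[j] / denom for j in range(k)]
             for i in range(k)]
    B_inv_c = mat_vec(B_inv, c)              # B_N^{-1} c_N
    u = dot(d, B_inv_c) / (1.0 - dot(c, B_inv_c))   # control action (4.294)
    return u, B_inv


def update_d(d, c, u):
    """Vector update (4.291), applied after u_N has been sent out."""
    return [di + ci * u for di, ci in zip(d, c)]
```

A typical loop would call `rld_step` once $y_N$ is available, send the returned $u_N$ to the control element, and then call `update_d` so that the vector is ready for time $N+1$.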
4.5.4 Convergence of the RLD Self-Tuning Algorithm

The Classification of Convergences

We have given proof of convergence of the recursive least squares (RLS) self-tuning algorithm. In this section, we will study the convergence of our RLD self-tuning algorithm. To prove convergence of the RLD algorithm at control time $N$, we have to establish that the determinant of $A_N$ is smaller than that of $A_{N-1}$. To do this, we first write their relationship

$$A_N = A_{N-1} - \frac{A_{N-1} - x_N x_N^T}{N} \tag{4.297}$$

$$= \frac{N-1}{N}\,A_{N-1}\left[I + \frac{A_{N-1}^{-1}x_N x_N^T}{N-1}\right] \tag{4.298}$$

then take the determinant of both sides of the above equation to obtain

$$|A_N| = \left(\frac{N-1}{N}\right)^{m+n+2}|A_{N-1}|\left|I + \frac{A_{N-1}^{-1}x_N x_N^T}{N-1}\right| \tag{4.299}$$

In Vu, K. (1991), the author has proved the following equation:

$$|C + \lambda D| = |C|\left[1 + \sum_{i=1}^{k}\rho_i(C^{-1}D)\lambda^i\right] \tag{4.300}$$

for any nonsingular matrix $C$ of dimension $k$. The coefficients $\rho_i(C^{-1}D)$ are the characteristic coefficients in the characteristic equation of the square matrix $C^{-1}D$. The first coefficient $\rho_1(C^{-1}D)$ is the sum of the eigenvalues of this matrix. This coefficient is also called the trace of the matrix. The last coefficient $\rho_k(C^{-1}D)$ is the product of the eigenvalues of the matrix. This is the determinant. The coefficients in between are the sums of all the possible products of the eigenvalues taken $i$ at a time. We do not have to worry about these coefficients and the determinant in our application. We can write

$$\left|I + \frac{A_{N-1}^{-1}x_N x_N^T}{N-1}\right| = 1 + \sum_{i=1}^{m+n+2}\rho_i\!\left(A_{N-1}^{-1}x_N x_N^T\right)\left(\frac{1}{N-1}\right)^i \tag{4.301}$$

Since the matrix $A_{N-1}^{-1}x_N x_N^T$ is of rank one, it has only one nonzero eigenvalue, and its unique nonzero eigenvalue is $x_N^T A_{N-1}^{-1} x_N$. This leads to the fact that all the coefficients in the above equation are zero, except for the first coefficient. This is because any product of the eigenvalues will be zero. Only the sum of the eigenvalues, which is the trace of the matrix, gives a nonzero value. The first coefficient is

$$\rho_1\!\left(A_{N-1}^{-1}x_N x_N^T\right) = x_N^T A_{N-1}^{-1} x_N \tag{4.302}$$

This will give us

$$|A_N| = \left(\frac{N-1}{N}\right)^{m+n+2}|A_{N-1}|\left[1 + \frac{x_N^T A_{N-1}^{-1} x_N}{N-1}\right] \tag{4.303}$$

and we can write

$$\frac{|A_N|}{|A_{N-1}|} = \left(\frac{N-1}{N}\right)^{m+n+2}\left[1 + \frac{x_N^T A_{N-1}^{-1} x_N}{N-1}\right] \tag{4.304}$$

The minimization of the determinant $|A_N|$ leads to the minimization of $x_N^T A_{N-1}^{-1} x_N$, because only the vector $x_N$ contains the control action $u_N$. This gives us another way to calculate $u_N$ for the RLD control algorithm. Now we write

$$\frac{x_N^T A_{N-1}^{-1} x_N}{N-1} = \begin{bmatrix} u_N & c_N^T\end{bmatrix}\begin{bmatrix}\sum_{t=1}^{N-1}u_t^2 & d_{N-1}^T\\ d_{N-1} & B_{N-1}\end{bmatrix}^{-1}\begin{bmatrix} u_N\\ c_N\end{bmatrix} \tag{4.305}$$

or after evaluating the inverse matrix

$$= \begin{bmatrix} u_N & c_N^T\end{bmatrix}\begin{bmatrix}\dfrac{1}{\sum u_t^2 - d_{N-1}^T B_{N-1}^{-1} d_{N-1}} & \dfrac{-d_{N-1}^T B_{N-1}^{-1}}{\sum u_t^2 - d_{N-1}^T B_{N-1}^{-1} d_{N-1}}\\[2ex] \dfrac{-B_{N-1}^{-1} d_{N-1}}{\sum u_t^2 - d_{N-1}^T B_{N-1}^{-1} d_{N-1}} & B_{N-1}^{-1} + \dfrac{B_{N-1}^{-1} d_{N-1} d_{N-1}^T B_{N-1}^{-1}}{\sum u_t^2 - d_{N-1}^T B_{N-1}^{-1} d_{N-1}}\end{bmatrix}\begin{bmatrix} u_N\\ c_N\end{bmatrix} \tag{4.306}$$

$$= \frac{u_N^2 - 2u_N\, d_{N-1}^T B_{N-1}^{-1} c_N}{\sum u_t^2 - d_{N-1}^T B_{N-1}^{-1} d_{N-1}} + c_N^T\left[B_{N-1}^{-1} + \frac{B_{N-1}^{-1} d_{N-1} d_{N-1}^T B_{N-1}^{-1}}{\sum u_t^2 - d_{N-1}^T B_{N-1}^{-1} d_{N-1}}\right]c_N \tag{4.307}$$

and the minimization of the above equation gives

$$u_N = d_{N-1}^T B_{N-1}^{-1} c_N \tag{4.308}$$

$$= c_N^T B_{N-1}^{-1} d_{N-1} \tag{4.309}$$

$$= c_N^T \hat{c}_N \tag{4.310}$$

where $\hat{c}_N = B_{N-1}^{-1} d_{N-1}$ denotes the controller parameter vector. The above equation gives a shorter formula to calculate the control action $u_N$. Since the controller parameter vector $\hat{c}_N$ is given as a product of an inverse matrix and a column vector, it can be given recursively as in the algorithm of the recursive least squares self-tuning regulator. Note that this vector $\hat{c}_N$ does not contain $y_N$. By putting this value of $u_N$ back into the quadratic equation, we obtain

$$\min_{u_N}\frac{x_N^T A_{N-1}^{-1} x_N}{N-1} = c_N^T B_{N-1}^{-1} c_N \tag{4.311}$$

and therefore the minimum value of $|A_N|$ is

$$|A_N| = |A_{N-1}|\left(\frac{N-1}{N}\right)^{m+n+2}\left[1 + c_N^T B_{N-1}^{-1} c_N\right] \tag{4.312}$$

From Equation (4.293), we can write

$$c_N^T B_N^{-1} c_N = c_N^T B_{N-1}^{-1} c_N - \frac{c_N^T B_{N-1}^{-1} c_N\, c_N^T B_{N-1}^{-1} c_N}{1 + c_N^T B_{N-1}^{-1} c_N} \tag{4.313}$$

$$= \frac{c_N^T B_{N-1}^{-1} c_N}{1 + c_N^T B_{N-1}^{-1} c_N} \tag{4.314}$$

and therefore

$$1 + c_N^T B_{N-1}^{-1} c_N = \frac{c_N^T B_{N-1}^{-1} c_N}{c_N^T B_N^{-1} c_N} \tag{4.315}$$

Convergence of the RLD algorithm at the step $N$ implies

$$\left(\frac{N-1}{N}\right)^{m+n+2}\frac{c_N^T B_{N-1}^{-1} c_N}{c_N^T B_N^{-1} c_N} < 1 \tag{4.316}$$

The first term of the left hand side of the above equation is smaller than unity, but the second term is larger than unity. Convergence at step $N$ means the first term has a larger effect.
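As a cross-check on the minimization in Equations (4.305) to (4.311), the quadratic form can be rearranged by completing the square. The shorthand $s$ and $\sigma$ below is introduced only for this sketch and is not notation from the thesis; the result holds when $\sigma > 0$, which is the case when $A_{N-1}$ is positive definite.

```latex
% Shorthand (assumed for this sketch only):
%   s      = \sum_{t=1}^{N-1} u_t^2
%   d, B   = d_{N-1}, B_{N-1}
%   \sigma = s - d^T B^{-1} d   (Schur complement of B)
\begin{align*}
\begin{bmatrix} u_N & c_N^T \end{bmatrix}
\begin{bmatrix} s & d^T \\ d & B \end{bmatrix}^{-1}
\begin{bmatrix} u_N \\ c_N \end{bmatrix}
&= \frac{u_N^2 - 2u_N\, d^T B^{-1} c_N + \left(d^T B^{-1} c_N\right)^2}{\sigma}
   + c_N^T B^{-1} c_N \\
&= \frac{\left(u_N - d^T B^{-1} c_N\right)^2}{\sigma} + c_N^T B^{-1} c_N .
\end{align*}
% For \sigma > 0 the first term is non-negative and vanishes at
%   u_N = d^T B^{-1} c_N,
% leaving the minimum value c_N^T B^{-1} c_N.
```

The squared term makes it visible that the minimizing $u_N$ is the inner product of the regressor $c_N$ with $B^{-1}d$, and that the minimum itself does not depend on $d$.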
In general, convergence cannot be guaranteed for every step, because of the unknown value of $y_N$ in the vector $c_N$ of the above equation. This is true for all stochastic systems. However, we can say that if the value of $y_N$ is not exceptionally larger or smaller than its previous counterparts, then the algorithm will normally converge. Now consider the case when $N$ is large and $u_t$ and $y_t$ are bounded, which is not true for a nonminimum phase system. The matrices $B_N/N$ and $B_{N-1}/(N-1)$ are then essentially the same, because they are a crosscovariance matrix of the two time series $u_t$ and $y_t$:

$$\frac{B_N}{N} \approx \frac{B_{N-1}}{N-1} \tag{4.317}$$

$$\approx \begin{bmatrix}
\gamma_{uu}(0) & \cdots & \gamma_{uu}(m-1) & \gamma_{uy}(1) & \cdots & \gamma_{uy}(n-1)\\
\vdots & & \vdots & \vdots & & \vdots\\
\gamma_{uu}(m-1) & \cdots & \gamma_{uu}(0) & \gamma_{uy}(m) & \cdots & \gamma_{uy}(n-m)\\
\gamma_{uy}(1) & \cdots & \gamma_{uy}(m) & \gamma_{yy}(0) & \cdots & \gamma_{yy}(n)\\
\vdots & & \vdots & \vdots & & \vdots\\
\gamma_{uy}(n-1) & \cdots & \gamma_{uy}(n-m) & \gamma_{yy}(n) & \cdots & \gamma_{yy}(0)
\end{bmatrix} \tag{4.318}$$

where $\gamma_{uy}(k)$ is the crosscovariance of the two time series $u_t$ and $y_t$ at lag $k$. This means

$$\frac{c_N^T B_{N-1}^{-1} c_N}{c_N^T B_N^{-1} c_N} \approx \frac{N}{N-1} \tag{4.319}$$

With this result, Equation (4.316) will become

$$\left(\frac{N-1}{N}\right)^{m+n+1} < 1 \tag{4.320}$$

which is true, and therefore we can say that for a minimum phase system, the RLD self-tuning algorithm will converge at every step, or converge exponentially, when $N$ is large. When $N$ is not large, Equation (4.319) is approximately true, and we can expect the algorithm to converge. In practice, we might find that the RLD algorithm does not converge at every step or does not converge exponentially but converges eventually. This means the determinant of $A_N$ is zero at the end of the self-tuning period, but the algorithm has some temporary divergences during this period. This is the case of eventual convergence (or simply convergence), and the self-tuning period consists of a number of stages. Each stage is a convergence in a number of steps. We will explain what we mean by this. Using the result of the above discussion, we can write

$$|A_N| = |A_{N-1}|\left(\frac{N-1}{N}\right)^{m+n+2}\left[1 + c_N^T B_{N-1}^{-1} c_N\right] \tag{4.321}$$

$$|A_{N+1}| = |A_N|\left(\frac{N}{N+1}\right)^{m+n+2}\left[1 + c_{N+1}^T B_N^{-1} c_{N+1}\right] \tag{4.322}$$

$$\vdots$$

$$|A_{N+k}| = |A_{N+k-1}|\left(\frac{N+k-1}{N+k}\right)^{m+n+2}\left[1 + c_{N+k}^T B_{N+k-1}^{-1} c_{N+k}\right] \tag{4.323}$$

or by combining these equations together, we get

$$|A_{N+k}| = |A_{N-1}|\left(\frac{N-1}{N+k}\right)^{m+n+2}\left[1 + c_N^T B_{N-1}^{-1} c_N\right]\times\cdots\times\left[1 + c_{N+k}^T B_{N+k-1}^{-1} c_{N+k}\right] \tag{4.324}$$

$$= |A_{N-1}|\left(\frac{N-1}{N+k}\right)^{m+n+2}\frac{c_N^T B_{N-1}^{-1} c_N}{c_N^T B_N^{-1} c_N}\times\cdots\times\frac{c_{N+k}^T B_{N+k-1}^{-1} c_{N+k}}{c_{N+k}^T B_{N+k}^{-1} c_{N+k}} \tag{4.325}$$

If the algorithm converges in $k$ steps, we have

$$|A_{N+k}| < |A_{N-1}| \tag{4.326}$$

and

$$\left(\frac{N-1}{N+k}\right)^{m+n+2}\frac{c_N^T B_{N-1}^{-1} c_N}{c_N^T B_N^{-1} c_N}\times\cdots\times\frac{c_{N+k}^T B_{N+k-1}^{-1} c_{N+k}}{c_{N+k}^T B_{N+k}^{-1} c_{N+k}} < 1 \tag{4.327}$$

The Convergence Interval of $y_N$

We have said that the convergence of the RLD algorithm at step $N$ depends on the value of $y_N$. In this section, we will find the values of $y_N$ for which the RLD self-tuning algorithm converges. This is useful because if we do not allow a temporary divergence, we can use the controller parameter vector of the previous control time $N-1$. From the equation

$$\left(1 - \frac{1}{N}\right)^{m+n+2}\left[1 + c_N^T B_{N-1}^{-1} c_N\right] < 1 \tag{4.328}$$

we can write

$$\left(1 - \frac{1}{N}\right)^{m+n+2}\left[1 + c_N^T B_{N-1}^{-1} c_N\right] - 1 < 0 \tag{4.329}$$

Let $y_N^{(1)}$ and $y_N^{(2)}$ be the solutions of the quadratic equation in $y_N$ given by the left hand side of the above inequality. If $y_N$ is inside this interval, the RLD self-tuning algorithm converges for the control time $N$.

The Analysis of a Temporary Divergence

We have proved that the RLD self-tuning algorithm converges when $N$ is relatively large and have also obtained the convergence interval for the output variable $y_N$. In this section, we will analyze this convergence problem a little more, study the mechanism of a temporary divergence and propose a solution for systems with a difficult convergence prospect. From the equation

$$A_N = A_{N-1} - \frac{A_{N-1} - x_N x_N^T}{N} \tag{4.330}$$

$$= A_{N-1} - E_{N-1} \tag{4.331}$$

we can say that if the matrix $E_{N-1}$ is positive definite, then we have convergence at step $N$. This is because of the two theorems we presented earlier.
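The step-by-step and $k$-step convergence conditions, Equations (4.321) to (4.328), can be evaluated numerically once the scalars $c_{N+j}^T B_{N+j-1}^{-1} c_{N+j}$ are known. The sketch below is an illustration under assumptions: the function names and the idea of passing these scalars in as a plain list `qs` are hypothetical, and the values used in the usage note are made up for the example.

```python
def determinant_ratio(N, m, n, q):
    """Ratio |A_N| / |A_{N-1}| from (4.321).

    q is the scalar c_N^T B_{N-1}^{-1} c_N (supplied by the caller).
    """
    return ((N - 1) / N) ** (m + n + 2) * (1.0 + q)


def converges_in_steps(N, m, n, qs):
    """Check the k-step condition |A_{N+k}| < |A_{N-1}| of (4.326).

    qs[j] is c_{N+j}^T B_{N+j-1}^{-1} c_{N+j} for j = 0 .. k.
    The per-step ratios are multiplied, which reproduces the
    telescoping factor ((N-1)/(N+k))^(m+n+2) of (4.324).
    """
    ratio = 1.0
    for j, q in enumerate(qs):
        ratio *= determinant_ratio(N + j, m, n, q)
    return ratio < 1.0
```

For example, with $N = 10$, $m = 2$, $n = 1$ and hypothetical values `qs = [1.0, 0.1]`, the first step alone is a temporary divergence (`determinant_ratio(10, 2, 1, 1.0)` exceeds one), yet `converges_in_steps(10, 2, 1, [1.0, 0.1])` is true, which is the situation of convergence over a stage rather than at every step.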
The eigenvalues of $A_N$ are those of $A_{N-1}$ shifted by negative amounts whose absolute values lie between the smallest and largest eigenvalues of the matrix $E_{N-1}$. If $E_{N-1}$ is positive definite, all its eigenvalues are positive and this means the eigenvalues of $A_N$ are smaller than those of $A_{N-1}$ in the same order. This, of course, means convergence at step $N$, because a direct consequence is that the determinant of $A_N$ will be smaller than that of $A_{N-1}$. If the matrix $E_{N-1}$ has a negative eigenvalue, we are not sure of convergence or divergence. It can be either one of these two cases. By taking the trace of both sides of Equation (4.330), we can write

$$\operatorname{tr}A_N = \operatorname{tr}A_{N-1} - \frac{\operatorname{tr}\left[A_{N-1} - x_N x_N^T\right]}{N} \tag{4.332}$$

$$= \operatorname{tr}A_{N-1} + \frac{x_N^T x_N - \operatorname{tr}A_{N-1}}{N} \tag{4.333}$$

Now, we have

$$x_N^T x_N = u_N^2 + u_{N-1}^2 + \cdots + y_N^2 + \cdots \tag{4.334}$$

If $y_N$ is so large that

$$u_{N-1}^2 + \cdots + y_N^2 + \cdots > \operatorname{tr}A_{N-1} \tag{4.335}$$

then no real value of $u_N$ can change the fact

$$\operatorname{tr}A_N > \operatorname{tr}A_{N-1} \tag{4.336}$$

When this happens, not all the eigenvalues of $A_N$ will be smaller than those of $A_{N-1}$ in the same order. This creates a potential for a temporary divergence. To combat this problem, we want the magnitude of $y_N$ in Equation (4.334) to be small compared to the magnitude of $u_N$, so that a wide swing in the value of $y_N$ will have little effect on the eigenvalues of $A_N$, because it can be absorbed by $u_N$. The control action $u_N$ will reduce itself from its normal value and assume a smaller one, so that the eigenvalue shift by the matrix $x_N x_N^T$ is small and still leaves the matrix $E_{N-1}$ positive definite. This is a condition for convergence at step $N$. In this case, the value of $u_N$ will determine the eigenvalues of $A_N$ and the convergence of the RLD algorithm. In simulation runs, when the value of $\omega_0$ in one simulation model was accidentally lowered to almost half its value, convergence happened in all the runs. Now, the values of $y_N$ and $u_N$ will depend on the system.
However, we can always create an artificial value, say $u_N^*$ for example, which is a multiple of $u_N$. We use the value of $u_N^*$ in the algorithm, but send the value of $u_N$ to the control element. With this practice, the convergence of the RLD self-tuning algorithm can be enhanced greatly.

4.5.5 Effect of Model Mismatch

Order Overestimation

We will assume that we have overestimated $m$ and $n$ by the same number of parameters, say $m + o$ and $n + o$. That means the minimum variance controller has the orders $m$ and $n$, and we design a self tuning controller with the orders of $m + o$ and $n + o$. From the model of the system, we have

$$\theta(z^{-1})\delta(z^{-1})\left[y_{t+f+1} - \psi(z^{-1})a_{t+f+1}\right] = \delta(z^{-1})\gamma(z^{-1})y_t + \omega(z^{-1})\psi(z^{-1})\phi(z^{-1})u_t \tag{4.337}$$

Now if we define $\varphi(z^{-1})$ as any polynomial of order $o$, then we can write

$$\varphi(z^{-1})\theta(z^{-1})\delta(z^{-1})\left[y_{t+f+1} - \psi(z^{-1})a_{t+f+1}\right] = \varphi(z^{-1})\left[\delta(z^{-1})\gamma(z^{-1})y_t + \omega(z^{-1})\psi(z^{-1})\phi(z^{-1})u_t\right] \tag{4.338}$$

The above equation tells us that the controller will tune itself to one of a number of controllers. The final value of the parameter vector will depend on the initial estimate. The important thing is that the controller has minimum variance performance, because we have

$$y_{t+f+1} = \psi(z^{-1})a_{t+f+1} \tag{4.339}$$

Now consider the case where we design a self tuning controller with the orders of $m + o$ and $n + o + l$. The equation that describes the system can be written as follows:

$$\varphi(z^{-1})\theta(z^{-1})\delta(z^{-1})\left[y_{t+f+1} - \psi(z^{-1})a_{t+f+1}\right] = \varphi(z^{-1})\left[\delta(z^{-1})\gamma(z^{-1})y_t + \omega(z^{-1})\psi(z^{-1})\phi(z^{-1})u_t\right] + \vartheta(z^{-1})y_{t-n-o-1} \tag{4.340}$$

where $\vartheta(z^{-1})$ is a null polynomial of order $l$. The above equation tells us that the controller will tune itself to one of a number of controllers but forces the coefficients of the polynomial $\vartheta(z^{-1})$ to zero. From the above results, we can draw the conclusion that if we overestimate only one order, then the extra number of parameters of the other order will be forced to approach zero.

From the above discussion, we can say that if we overestimate the orders of the self-tuning controller, both the RLS and RLD algorithms will perform optimally, as in the case where we have the correct orders.
Order Underestimation

In this case, minimum variance performance is not obtained, because the controller does not use enough past input and output variable values to decouple the interaction. This is similar to the case of a self-tuning PID controller in a system with a delay or with orders higher than second order. The self-tuning algorithms will not obtain minimum variance performance. In this case, test runs have shown that both algorithms are stable.

Wrong Delay Estimation

Since the delay parameter $f$ also affects the parameter $m$ of the controller, a wrong estimation of the delay causes a double penalty. If we overestimate it, the effect is less damaging than if we underestimate it. This is because overestimating the controller orders is acceptable but underestimating them is not. We have discussed this before. However, the damaging effect is only on the RLS self-tuning algorithm. This is because it explicitly uses the delay $f$ in the regressand $y_{t+f+1}$ of the algorithm. The RLD algorithm is immune to the wrong estimation of the delay. In all simulation test runs, whenever there was a wrong estimation of the delay, the RLS self-tuning algorithm diverged. As for the RLD algorithm, it was not affected by this problem. This is the strength of the RLD algorithm. We will show and discuss some of the simulation results in the next section.

4.5.6 Simulation Examples

In this section, we will test our recursive least determinant (RLD) self tuning control algorithm. To be able to draw a realistic conclusion about the algorithm, we will compare the results of the RLD self tuning control algorithm with those of the recursive least squares (RLS) self tuning control algorithm. The results came from the same system. For a fair comparison, the same system was run twice with the same disturbance and the same open-loop initialization period (the first 14 observations).

Figure 4.6. Exponential Convergence of the RLD Self-Tuning Algorithm. (Panels: Closed-loop Variances; Determinants of $A_N$; Covariances; Eigenvalues of $A_N$. Curves: RLS and RLD; horizontal axis $t$.)

Figure 4.7. Convergence of the RLD Self-Tuning Algorithm. (Panels: Closed-loop Variances; Determinants of $A_N$; Covariances; Eigenvalues of $A_N$. Curves: RLS and RLD; horizontal axis $t$.)

Figures 4.6 and 4.7 summarize the results of two simulation runs. The model of the control system for these runs is

$$y_t = \frac{\omega(z^{-1})}{1 - 0.5z^{-1}}\,u_{t-f-1} + \frac{\theta(z^{-1})}{1 - 0.6z^{-1}}\,a_t \tag{4.341}$$

with the white noise variance of unity value. In Figure 4.6, we see a case of exponential convergence of the RLD algorithm. In Figure 4.7, the RLD algorithm also converges but it has two temporary divergences. This means two increases in the determinants of $A_N$ instead of all decreases. Each of these figures contains four small graphs. The top left graph shows the closed-loop variances of the output variable $y_t$. The top right graph shows the determinants of the matrices $A_N$. Both the variances and the determinants are shown for both the RLD and RLS self tuning algorithms. To show that the two algorithms are comparable, we plot in the bottom left graph five statistics:

$$\gamma_{yy}(k+f+1) = \frac{1}{N}\sum_{t} y_{t-k}\,y_{t+f+1} \quad\text{for } k = 0, 1 \tag{4.342}$$

$$\gamma_{uy}(k+f+1) = \frac{1}{N}\sum_{t} u_{t-k}\,y_{t+f+1} \quad\text{for } k = 0, 1, 2 \tag{4.343}$$

These are some of the autocovariances of $y_t$ and the crosscovariances of $u_t$ and $y_t$. Åström, K. J. and Wittenmark, B. (1973) mentioned in their paper that these statistics are supposed to be zero under a minimum variance control law. In Figures 4.6 and 4.7, we see that the statistics approach zero under the RLD algorithm. The bottom right graph shows the eigenvalues of the matrix $A_N$. The RLD algorithm decreases all the eigenvalues of this matrix.
In all the graphs of these figures, the horizontal lines are the lines of theoretical values. In the top left graph of Figure 4.6, the horizontal line has a value of 1.04, which is the theoretical closed-loop variance calculated from the disturbance model and the delay. We see the dotted and dashed curves close to one another. This tells us the performances of both algorithms are good and almost the same. The dotted curve is the result of the RLS self tuning control algorithm, whereas the dashed curve is the result of the RLD self tuning control algorithm. These two curves coincide in the beginning of the simulation run, but differ at the end of the run. This is because from observation $t = 1$ to observation $t = 14$, the loop is open and therefore the variances increase. From observation $t = 14$ to observation $t = 199$, the loop is closed and the variances decrease exponentially. From this graph, we can be misled and conclude that the RLS self tuning algorithm is a better algorithm, because it gives a smaller closed-loop variance and determinant of $A_N$. In fact, it is only the result of one particular run. The performance of the RLD algorithm is quite comparable with that of the RLS algorithm, as we shall see in Figure 4.8 later.

Figure 4.8. Self-Tuning of a Correctly Estimated System. (Panels: Closed-loop Variances; Determinants of $A_N$; Output Variables; Input Variables. Curves: RLS and RLD; horizontal axis $t$.)

In Figure 4.7, we have a similar situation. The determinants of the matrices $A_N$ increase when the loop is open but decrease when the loop is closed. And as the closed-loop variances approach the theoretical value, the determinants of the matrices $A_N$ approach zero.
Figure 4.7 might also mislead us to conclude that the RLS algorithm is more resistant to large disturbances, because the determinant of $A_N$ increases twice under the RLD algorithm but not under the RLS algorithm. Figure 4.8 shows that the RLS algorithm is also vulnerable to large disturbances. This figure shows the results of the self-tuning algorithms for the following control system:

$$y_t = \frac{0.8}{1 - 0.5z^{-1}}\,u_{t-2} + \frac{1 - 0.4z^{-1}}{1 - 0.6z^{-1}}\,a_t \tag{4.344}$$

From this figure, we can conclude that the RLD algorithm can give performance comparable with that of the RLS algorithm, because not only the closed-loop variances and the determinants of $A_N$ but also the output and input variables of the two algorithms are almost the same.

To study the effect of wrong estimation of the system parameters $f$, $m$ and $n$, we used the following model in our simulation:

$$y_t = \frac{\omega_0 - \omega_1 z^{-1}}{1 - \delta_1 z^{-1} - \delta_2 z^{-2}}\,u_{t-f-1} + \frac{1 - 0.4z^{-1}}{1 - 0.6z^{-1}}\,a_t \tag{4.345}$$

and changed the values of $\omega_0$, $\omega_1$, $\delta_1$, $\delta_2$ and $f$ to fit the individual cases. For both self-tuning algorithms, we assumed the system has the model:

$$y_t = \frac{\omega_0}{1 - \delta_1 z^{-1}}\,u_{t-2} + \frac{1 - \theta z^{-1}}{1 - \phi z^{-1}}\,a_t \tag{4.346}$$

The minimum variance controller for this system can be derived to be:

$$u_t = -\frac{\phi(\phi - \theta)(1 - \delta_1 z^{-1})}{\omega_0(1 - \phi z^{-1})\left(1 + (\phi - \theta)z^{-1}\right)}\,y_t \tag{4.347}$$

This means we assumed $f = 1$, $m = 2$ and $n = 1$ for our controllers. In Figure 4.8, we have a case of a correctly estimated system, which means we set the values of the parameters in the simulated model as follows:

$$\omega_0 = 0.8 \tag{4.348}$$
$$\omega_1 = 0.0 \tag{4.349}$$
$$\delta_1 = 0.5 \tag{4.350}$$
$$\delta_2 = 0.0 \tag{4.351}$$
$$f = 1 \tag{4.352}$$

In Figure 4.9, we have a case of underestimation of $n$. To achieve this, we set the above parameters as follows:

$$\omega_0 = 0.8 \tag{4.353}$$
$$\omega_1 = 0.0 \tag{4.354}$$
$$\delta_1 = 0.5 \tag{4.355}$$
$$\delta_2 = 0.24 \tag{4.356}$$
$$f = 1 \tag{4.357}$$

The parameter $n$ of the minimum variance controller for this system must have the value $n = 2$, and we underestimated it with a value of $n = 1$.
Now if we set

$$\omega_0 = 0.8 \tag{4.358}$$
$$\omega_1 = 0.0 \tag{4.359}$$
$$\delta_1 = 0.5 \tag{4.360}$$
$$\delta_2 = -0.24 \tag{4.361}$$
$$f = 1 \tag{4.362}$$

we will have a case of underestimation of $n$ for an underdamped second order system. The simulation results for this system are shown in Figure 4.10. Similarly, in Figure 4.11, we have a case of underestimation of $m$. To achieve this, we set the above parameters as follows:

$$\omega_0 = 0.8 \tag{4.363}$$
$$\omega_1 = 0.4 \tag{4.364}$$
$$\delta_1 = 0.5 \tag{4.365}$$
$$\delta_2 = 0.0 \tag{4.366}$$
$$f = 1 \tag{4.367}$$

The parameter $m$ of the minimum variance controller for this system must have the value $m = 3$, and we underestimated it with a value of $m = 2$. For the case of overestimation, we actually do not have to worry very much about it, because, as discussed above, this case has minimum variance performance. To achieve the case of overestimation of $n$, we set the parameters as follows:

$$\omega_0 = 0.8 \tag{4.368}$$
$$\omega_1 = 0.0 \tag{4.369}$$
$$\delta_1 = 0.0 \tag{4.370}$$
$$\delta_2 = 0.0 \tag{4.371}$$
$$f = 1 \tag{4.372}$$

This means the minimum variance controller for this system has $n = 0$ and we overestimated it with $n = 1$. Figure 4.12 shows the results of a typical run for this case.

In all the cases of wrong estimation above, both algorithms proved to be stable for all the simulation runs. Figures 4.9, 4.10, 4.11 and 4.12 just show typical results. The simulation results support the assertion we made before. If we have wrong orders of the controller, both algorithms will be stable and perform satisfactorily. The same thing cannot be said for the RLS self-tuning algorithm when we have a wrong estimation of the delay. Figures 4.13 and 4.14 show the results of wrong estimation of the delay parameter $f$. To achieve wrong estimation of the delay, we set the parameters as follows:

$$\omega_0 = 0.8 \tag{4.373}$$
$$\omega_1 = 0.0 \tag{4.374}$$
$$\delta_1 = 0.5 \tag{4.375}$$
$$\delta_2 = 0.0 \tag{4.376}$$
$$f = 0 \tag{4.377}$$

This is the case of overestimation of the delay. This means the system has no pure delay but we gave it a value of $f = 1$ in the self-tuning algorithms.
In this case, the RLD algorithm proved to be stable but the RLS algorithm ran away in every simulation run. In Figure 4.13, we see that the dotted curves go out of the plot boundaries in all four small graphs, but the dashed curves are inside the boundaries. Also in this figure, we see that the determinant of $A_N$ under the RLD algorithm approaches zero, which is an indication of the immunity of the RLD algorithm to overestimation of the delay. The case of underestimation of the delay can be similarly established. We set the parameters as follows:

$$\omega_0 = 0.8 \tag{4.378}$$
$$\omega_1 = 0.0 \tag{4.379}$$
$$\delta_1 = 0.5 \tag{4.380}$$
$$\delta_2 = 0.0 \tag{4.381}$$
$$f = 2 \tag{4.382}$$

As in the case of overestimation, the RLS self-tuning algorithm ran away, but the RLD algorithm was stable. The determinant of $A_N$ approaches zero under the RLD algorithm but goes out of the plot boundary under the RLS algorithm. These results can be seen in Figure 4.14. We can therefore conclude that the RLD algorithm is immune to the wrong estimation of the delay, whether it is an underestimation or an overestimation, but the RLS algorithm is not.

Figure 4.9. Self-Tuning of an Underestimated Order (n) System. (Panels: Closed-loop Variances; Determinants of $A_N$; Output Variables; Input Variables.)

Figure 4.10. Self-Tuning of an Underestimated Order (n) Underdamped System. (Panels: Closed-loop Variances; Determinants of $A_N$; Output Variables; Input Variables.)

Figure 4.11. Self-Tuning of an Underestimated Order (m) System. (Panels: Closed-loop Variances; Determinants of $A_N$; Output Variables; Input Variables.)

Figure 4.12. Self-Tuning of an Overestimated Order (n) System. (Panels: Closed-loop Variances; Determinants of $A_N$; Output Variables; Input Variables.)

Figure 4.13. Self-Tuning of an Overestimated Delay System. (Panels: Closed-loop Variances; Determinants of $A_N$; Output Variables; Input Variables.)

Figure 4.14. Self-Tuning of an Underestimated Delay System. (Panels: Closed-loop Variances; Determinants of $A_N$; Output Variables; Input Variables.)

4.6 Conclusion

In this chapter, we have discussed two important and popular controllers: the PID and the self tuning controllers. Both have been used in industry and both have problems. This thesis has improved these controllers in the sense that it introduced a method to calculate the optimal gains for the PID controller and suggested a new approach to the self tuning controller. The novel concept presented in this chapter is an attractive way to enhance the self tuning controller. It not only eliminates the requirement to know the delay of the system, but also helps the self tuning controller to correct its orders or structure.

Chapter 5

Control Interval

5.1 Introduction

One fundamental problem in process control is the determination of the control interval or the sampling interval (rate) of the control loop. Sampling too fast means a burden on the process computer, while sampling too slowly will degrade the controller performance, because we control less often. The problem is occasionally settled by some rules of thumb, for example 1 second for flow loops, 5 seconds for level or pressure loops and 20 seconds for temperature loops. These rules of thumb have been based on the time constant of the dynamics of the system and do not take into account the effect of the stochastic part of the system.
The stochastic part can be of a different nature for each loop. In this chapter, we will improve an existing method to determine the optimal control interval.

5.2 The Sampling and Controlling Rates

Technically we have two rates to consider. One is the sampling rate, which determines how fast we should sample for data. The other is the control rate or control interval, which indicates how often we should control. Not only can the two be different, but the sampling rate can also be irregular. This situation has been mentioned in Lennartson, B. (1986). For simplicity of the discussion and practicality of the application, we assume the sampling rate is regular and the same as the controlling rate, hereafter called the control interval.

The problem we mentioned above is the determination of the control interval. In a digital process control system, a process computer has to process a number of control loops. Initially, these loops were sampled according to the above-mentioned rules of thumb or at the fastest available control interval. With expansion, there are more loops and the process computer might not have enough time to process all the control loops at the shorter control interval. Some of the loops must be processed at a longer control interval. Since the control performance is likely to be poorer if we control less often, an analysis of which loops should be controlled at a slower rate is necessary. It must be mentioned that the control loop in the analysis is known, i.e. its model at the shorter control interval is available.

Before analysing the problem, we will briefly describe the effect of sampling slower and faster on the transfer function and disturbance models of the system. This will help us grasp the situation better and understand the problem thoroughly.

5.2.1 Sampling Too Slow

If we have a model with a faster sampling rate and we want a corresponding model at a slower sampling rate, what will this model be like?
This is the question we want to answer. In the following, we will give a logical answer first and then some mathematical insight later.

Effect on the Transfer Function

Imagine that we have a system with no disturbance and under open-loop conditions. Now introduce a step change to the input variable and observe the response of the output variable. If the system is open-loop stable, the response may oscillate around a final value and eventually settle on it; in this case, the system is of second order or higher. If the response approaches the final value slowly without crossing it, the system is of first order. Higher order systems can have a similar response, but in this case they can be treated as a number of first order systems in series. If the response settles on the final value right away, the system is of zero order. From these facts, we can conclude that if we sample the system more slowly, the system will approach a zero order system.

Effect on the Disturbance

In theory, the observations of an ARIMA will autocorrelate to a very high lag and the observations of an ARMA will autocorrelate to only a moderate lag. Beyond this lag the autocorrelation coefficients will be essentially zero. This means that if we sample much more slowly, an ARMA will become a white noise exhibiting no serial correlation.

5.2.2 Sampling Too Fast

Theoretically, if we have a model at a slower rate, we cannot say much about its corresponding model at a faster rate, because of the problem of aliasing. Fortunately, this is not the problem we wish to discuss. However, in the following, we will speculate on the effect of sampling faster on the models of the transfer function and the disturbance.
Effect on the Transfer Function

We consider the continuous-time system described by the following model:

\[ \frac{dx(t)}{dt} = A^{*} x(t) + b^{*} u(t) \tag{5.1} \]
\[ y(t) = c^{T} x(t) \tag{5.2} \]

If this system is discretized into an equivalent discrete-time system with fixed sampling interval $\Delta t$, then the discrete control model will be as follows (Kwakernaak, H. and Sivan, R. (1972)):

\[ x_{t+1} = A x_t + b u_t \tag{5.3} \]
\[ y_t = c^{T} x_t \tag{5.4} \]

with

\[ A = e^{A^{*} \Delta t} \tag{5.5} \]
\[ b = \left( \int_0^{\Delta t} e^{A^{*} t} \, dt \right) b^{*} \tag{5.6} \]

Now consider the same continuous-time control system sampled with two different control intervals $\Delta t_1$ and $\Delta t_2$ and assume

\[ \Delta t_2 = k \, \Delta t_1 \tag{5.7} \]

with $k$ an integer. Then we have the following relationship between the two state transition matrices:

\[ A_2 = e^{A^{*} \Delta t_2} \tag{5.8} \]
\[ \phantom{A_2} = e^{A^{*} k \Delta t_1} \tag{5.9} \]
\[ \phantom{A_2} = \left( e^{A^{*} \Delta t_1} \right)^{k} \tag{5.10} \]
\[ \phantom{A_2} = (A_1)^{k} \tag{5.11} \]

The poles of the discrete transfer function with the control interval $\Delta t_i$ are the eigenvalues of the state transition matrix $A_i$. The above relationship says that the eigenvalues of the state transition matrix $A_2$ are the $k$th powers of the eigenvalues of the state transition matrix $A_1$. For an open-loop stable system, all the eigenvalues are inside the unit circle, so the powers of these eigenvalues are smaller than the eigenvalues themselves in absolute value. This verifies the fact we mentioned before: if we sample slower, the system will approach a zero order system with zero poles. Conversely, if we sample faster, the eigenvalues of the state transition matrix, i.e. the poles of the system, will increase in absolute value and approach instability.

The effect of sampling faster on the transmission zeros is similar but a little difficult to prove mathematically, because of the integral in the expression for $b$. However, we can say that if we sample slower, a number of transmission zeros will approach zero and conversely, if we sample faster, the polynomial $\omega(z^{-1})$ will become unstable. This case is known as the nonminimum phase case. The process has an inverse response.
It has been a practice in the process industry that engineers sample slower to avoid the nonminimum phase case.

Effect on the Disturbance

Mathematically, a rational transfer function is different from an ARIMA in the sense that the effect of the transfer function is continuous in nature whereas an ARIMA is essentially discrete. However, the effect of sampling on the models is practically the same. If one samples too fast, the ARMA might approach nonstationarity. As for the moving average polynomial $\theta(z^{-1})$, we always get an invertible (stable) polynomial from modelling or identification.

5.3 The Control Interval

5.3.1 Literature Survey

The control interval has been studied by only a few individuals. Lennartson, B. (1986) studied this problem. However, his work is more a comparison of different control strategies than an in-depth study of the sampling or control interval. He mentioned that in Åström, K. J. and Wittenmark, B. (1984) a sampling rate $h$ (Equation (5.12)) was suggested by these authors. In that formula, $\omega_B$ is the closed-loop bandwidth and $N$ is a number ranging from 6 to 10. In MacGregor, J. F. (1976) the control interval is determined by a comparison of the theoretical closed-loop performances at different sampling rates. Since the method of MacGregor, J. F. is closely related to our work and has more industrial appeal, we will choose to improve this work.

Since the closed-loop performance can be determined entirely from the ARIMA and the delay of the system, the determination of the optimal control interval can be based on the modelling of an ARIMA at different sampling rates, from the fastest to the slower ones. MacGregor, J. F.'s approach solves for the roots of the autoregressive polynomial and raises these roots to a power to obtain the autoregressive parameters of the new ARMA. As for the moving average parameters, they have to be solved for from a number of identities.
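The root-raising step described above can be sketched in a few lines. The following is a minimal illustration in Python, not code from the cited work: the function names `poly_from_roots` and `skipped_ar_from_roots` are hypothetical, the "roots" are taken in the sense of the factors $(1 - \lambda_i z^{-1})$ of the autoregressive polynomial, and real roots are assumed. The roots 0.5, 0.4 and 0.3 and the skipping parameter 2 are those of the first example in Section 5.3.3.

```python
def poly_from_roots(roots):
    """Return [1, c1, ..., cp], the coefficients of prod_i (1 - roots[i] * x)."""
    coeffs = [1.0]
    for root in roots:
        # Multiply the current polynomial by (1 - root * x).
        shifted = [0.0] + [-root * c for c in coeffs]
        coeffs = [a + b for a, b in zip(coeffs + [0.0], shifted)]
    return coeffs


def skipped_ar_from_roots(roots, r):
    """Autoregressive polynomial in z^-r obtained by raising each root to the power r."""
    return poly_from_roots([root ** r for root in roots])


# Roots of the example autoregressive polynomial, skipped with r = 2.
alpha = skipped_ar_from_roots([0.5, 0.4, 0.3], 2)
print(alpha)
```

The approach needs the roots explicitly, which is the step the matrix formulation of this chapter avoids.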
In this chapter, we will improve this approach by using matrix algebra to obtain the autoregressive parameters and a robust numerical algorithm to obtain the moving average parameters.

5.3.2 The Optimal Control Interval

The Parameters of a Skipped ARIMA

We have mentioned that the optimal control interval can be determined from the modelling of the ARIMA disturbance at different sampling rates. But the way to do this is not just to collect the observations and run the identification algorithm a number of times. The proper way is to model the ARIMA at the fastest sampling rate. At this rate, we have high accuracy, because of the large number of observations. Then we determine the parametric relationship between the ARIMA at the fastest sampling rate and the ARIMA at one of the slower sampling rates. The ARIMA at the slower sampling rate is normally called a skipped ARIMA, because a number of observations are actually skipped in the process of recording. In the following, we will determine this parametric relationship. As mentioned in an earlier chapter, we normally model only an ARMA; therefore, we will continue to do so in this chapter.

Suppose we are given an ARMA with the following model:

\[ n_t = \frac{1 - \theta_1 z^{-1} - \cdots - \theta_q z^{-q}}{1 - \phi_1 z^{-1} - \cdots - \phi_p z^{-p}} \, a_t \tag{5.13} \]

The question is: if this ARMA is observed at a rate $r$ times slower than the original rate ($\Delta T = r \Delta t$), what will the parameters of the new ARMA be? The original series can be put in the following state space model form (MacGregor, J. F. (1973)):

\[ x_{t+1} = A x_t + b a_{t+1} \tag{5.14} \]
\[ n_t = c^{T} x_t \tag{5.15} \]

with

\[ A = \begin{bmatrix} \phi_1 & 1 & 0 & \cdots & 0 \\ \phi_2 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \phi_{m-1} & 0 & 0 & \cdots & 1 \\ \phi_m & 0 & 0 & \cdots & 0 \end{bmatrix}, \qquad b = \begin{bmatrix} 1 \\ -\theta_1 \\ \vdots \\ -\theta_{m-2} \\ -\theta_{m-1} \end{bmatrix} \tag{5.16} \]
\[ c^{T} = \begin{bmatrix} 1 & 0 & 0 & \cdots & 0 \end{bmatrix} \tag{5.17} \]

where $m = \max[p, q]$, i.e.

\[ \phi_{p+1} = \cdots = \phi_m = 0, \quad \text{if } p < m \tag{5.18} \]

and

\[ \theta_{q+1} = \cdots = \theta_m = 0, \quad \text{if } q < m \tag{5.19} \]

From the above equations, we obtain

\[ x_{t+1} = A x_t + b a_{t+1} \tag{5.20} \]
\[ x_{t+2} = A x_{t+1} + b a_{t+2} \tag{5.21} \]
\[ \phantom{x_{t+2}} = A [A x_t + b a_{t+1}] + b a_{t+2} \tag{5.22} \]
\[ \phantom{x_{t+2}} = A^{2} x_t + \sum_{i=0}^{1} A^{i} b a_{t+2-i} \tag{5.23} \]
\[ x_{t+r} = A^{r} x_t + \sum_{i=0}^{r-1} A^{i} b a_{t+r-i} \tag{5.24} \]

By moving the first term on the right hand side to the left hand side and denoting by $z^{-1}$ the unit backward shift operator, we can write $x_{t+r}$ as

\[ x_{t+r} = \left[ I - (A z^{-1})^{r} \right]^{-1} \sum_{i=0}^{r-1} A^{i} b a_{t+r-i} \tag{5.25} \]

and

\[ n_t = c^{T} \left[ I - (A z^{-1})^{r} \right]^{-1} \sum_{i=0}^{r-1} A^{i} b a_{t-i} \tag{5.26} \]

There are a few ways to avoid the inverse matrix in the above equation. The first way is to replace it by its relationship with the adjoint matrix and the determinant. The second way is to use the Cayley-Hamilton theorem. In Vu, K. (1990), the author introduced a third way to attack this problem. Following this approach, we start from the scalar identity

\[ \frac{1}{1 - x} = 1 + x + x^{2} + \cdots + x^{i-1} + \frac{x^{i}}{1 - x} \tag{5.27} \]

then write

\[ [1 - x]^{-1} = 1 + x + x^{2} + \cdots + x^{i-1} + x^{i} [1 - x]^{-1} \tag{5.29} \]

and obtain the matrix version of the above equation as follows:

\[ \left[ I - (A z^{-1})^{r} \right]^{-1} = I + A^{r} z^{-r} + A^{2r} z^{-2r} + \cdots + A^{ir} z^{-ir} \left[ I - (A z^{-1})^{r} \right]^{-1} \tag{5.30} \]

for any non-negative integer $i$. The above equation can be verified by premultiplying and postmultiplying both sides by the matrix $I - (A z^{-1})^{r}$. If we multiply both sides of Equation (5.26) by $\alpha_0 = 1$ and choose $i$ in Equation (5.30) as $k$, we get

\[ n_t = \left[ c^{T} + c^{T} A^{r} z^{-r} + \cdots + c^{T} A^{(k-1)r} z^{-(k-1)r} + c^{T} A^{kr} z^{-kr} \left[ I - (A z^{-1})^{r} \right]^{-1} \right] \sum_{i=0}^{r-1} A^{i} b a_{t-i} \tag{5.31} \]

Similarly, we can multiply both sides of Equation (5.26) by $\alpha_1 z^{-r}$ and choose $i$ in Equation (5.30) as $k - 1$ to get

\[ \alpha_1 z^{-r} n_t = \left[ \alpha_1 c^{T} z^{-r} + \alpha_1 c^{T} A^{r} z^{-2r} + \cdots + \alpha_1 c^{T} A^{(k-2)r} z^{-(k-1)r} + \alpha_1 c^{T} A^{(k-1)r} z^{-kr} \left[ I - (A z^{-1})^{r} \right]^{-1} \right] \sum_{i=0}^{r-1} A^{i} b a_{t-i} \tag{5.32} \]

We can continue the sequence, each time multiplying Equation (5.26) by $\alpha_j z^{-jr}$ with increasing $j$ and choosing $i = k - j$ in Equation (5.30). In general, we will get the equation

\[ \alpha_j z^{-jr} n_t = \left[ \alpha_j c^{T} z^{-jr} + \cdots + \alpha_j c^{T} A^{(k-j)r} z^{-kr} \left[ I - (A z^{-1})^{r} \right]^{-1} \right] \sum_{i=0}^{r-1} A^{i} b a_{t-i} \tag{5.33} \]

The last one in the sequence must be

\[ \alpha_k z^{-kr} n_t = \left[ \alpha_k c^{T} z^{-kr} \left[ I - (A z^{-1})^{r} \right]^{-1} \right] \sum_{i=0}^{r-1} A^{i} b a_{t-i} \tag{5.34} \]

Now if we define the polynomial

\[ \alpha(z^{-r}) = \alpha_0 + \alpha_1 z^{-r} + \alpha_2 z^{-2r} + \cdots + \alpha_k z^{-kr}, \qquad \alpha_0 = 1 \tag{5.35} \]

then by adding up all the above equations of the sequence, we obtain

\[ \begin{aligned} \alpha(z^{-r}) n_t = \Big[ \; & c^{T} + c^{T} A^{r} z^{-r} + \cdots + c^{T} A^{(k-1)r} z^{-(k-1)r} + c^{T} A^{kr} z^{-kr} \left[ I - (A z^{-1})^{r} \right]^{-1} \\ & + \alpha_1 c^{T} z^{-r} + \alpha_1 c^{T} A^{r} z^{-2r} + \cdots + \alpha_1 c^{T} A^{(k-1)r} z^{-kr} \left[ I - (A z^{-1})^{r} \right]^{-1} \\ & + \alpha_2 c^{T} z^{-2r} + \alpha_2 c^{T} A^{r} z^{-3r} + \cdots + \alpha_2 c^{T} A^{(k-2)r} z^{-kr} \left[ I - (A z^{-1})^{r} \right]^{-1} \\ & + \cdots + \alpha_k c^{T} z^{-kr} \left[ I - (A z^{-1})^{r} \right]^{-1} \Big] \sum_{i=0}^{r-1} A^{i} b a_{t-i} \end{aligned} \tag{5.36} \]

By collecting terms of the same powers in $z$, we can write the above equation as

\[ \begin{aligned} \alpha(z^{-r}) n_t = \Big[ \; & c^{T} + (c^{T} A^{r} + \alpha_1 c^{T}) z^{-r} + (c^{T} A^{2r} + \alpha_1 c^{T} A^{r} + \alpha_2 c^{T}) z^{-2r} + \cdots \\ & + (c^{T} A^{(k-1)r} + \alpha_1 c^{T} A^{(k-2)r} + \cdots + \alpha_{k-1} c^{T}) z^{-(k-1)r} \\ & + (c^{T} A^{kr} + \alpha_1 c^{T} A^{(k-1)r} + \cdots + \alpha_k c^{T}) \left[ I - (A z^{-1})^{r} \right]^{-1} z^{-kr} \Big] \sum_{i=0}^{r-1} A^{i} b a_{t-i} \end{aligned} \tag{5.37} \]

If the coefficients $\alpha_i$ are chosen such that

\[ c^{T} A^{kr} + \alpha_1 c^{T} A^{(k-1)r} + \cdots + \alpha_k c^{T} = 0 \tag{5.38} \]

then the term with the power $-kr$ in the z-transform operator of the previous equation will vanish. The equation will become

\[ \sum_{i=0}^{k} \alpha_i z^{-ir} n_t = \sum_{i=0}^{k-1} \sum_{j=0}^{i} \alpha_j c^{T} A^{(i-j)r} z^{-ir} \sum_{l=0}^{r-1} A^{l} b a_{t-l} \tag{5.39} \]

As for the coefficients $\alpha_i$, we can determine them from Equation (5.38), which gives

\[ \begin{bmatrix} \alpha_1 & \alpha_2 & \cdots & \alpha_k \end{bmatrix} \begin{bmatrix} c^{T} A^{(k-1)r} \\ c^{T} A^{(k-2)r} \\ \vdots \\ c^{T} \end{bmatrix} = - c^{T} A^{kr} \tag{5.40} \]

Now by transposing the above equation, we get

\[ \begin{bmatrix} c^{T} A^{(k-1)r} \\ c^{T} A^{(k-2)r} \\ \vdots \\ c^{T} \end{bmatrix}^{T} \begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_k \end{bmatrix} = - \left[ c^{T} A^{kr} \right]^{T} \tag{5.41} \]

and by premultiplying both sides by the matrix

\[ \left[ \begin{bmatrix} c^{T} A^{(k-1)r} \\ c^{T} A^{(k-2)r} \\ \vdots \\ c^{T} \end{bmatrix} \begin{bmatrix} c^{T} A^{(k-1)r} \\ c^{T} A^{(k-2)r} \\ \vdots \\ c^{T} \end{bmatrix}^{T} \right]^{-1} \begin{bmatrix} c^{T} A^{(k-1)r} \\ c^{T} A^{(k-2)r} \\ \vdots \\ c^{T} \end{bmatrix} \tag{5.42} \]

we obtain

\[ \begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_k \end{bmatrix} = - \left[ \begin{bmatrix} c^{T} A^{(k-1)r} \\ c^{T} A^{(k-2)r} \\ \vdots \\ c^{T} \end{bmatrix} \begin{bmatrix} c^{T} A^{(k-1)r} \\ c^{T} A^{(k-2)r} \\ \vdots \\ c^{T} \end{bmatrix}^{T} \right]^{-1} \begin{bmatrix} c^{T} A^{(k-1)r} \\ c^{T} A^{(k-2)r} \\ \vdots \\ c^{T} \end{bmatrix} \left[ c^{T} A^{kr} \right]^{T} \tag{5.43} \]

The parameter $k$ is chosen such that it is as large as possible, is in the range $p \le k < m + 1$, and makes the inverse matrix in (5.42) exist. Equation (5.26) can also be written as

\[ n_t = \frac{c^{T} \, \mathrm{Adj} \left[ I - (A z^{-1})^{r} \right]}{\det \left[ I - (A z^{-1})^{r} \right]} \sum_{i=0}^{r-1} A^{i} b a_{t-i} \tag{5.44} \]

So if we premultiply both sides of Equation (5.26) by $\alpha(z^{-r})$ to make its right hand side fractionless, then

\[ \alpha(z^{-r}) = \det \left[ I - (A z^{-1})^{r} \right] \tag{5.45} \]

and

\[ c^{T} \, \mathrm{Adj} \left[ I - (A z^{-1})^{r} \right] = \sum_{i=0}^{k-1} \sum_{j=0}^{i} \alpha_j c^{T} A^{(i-j)r} z^{-ir} \tag{5.46} \]

The relationship between the moving average parameters is more complicated and can be derived from the equations given below:

\[ \alpha(z^{-r}) n_t = \alpha(z^{-r}) \, c^{T} \left[ I - (A z^{-1})^{r} \right]^{-1} \sum_{i=0}^{r-1} A^{i} b a_{t-i} \tag{5.47} \]
\[ \phantom{\alpha(z^{-r}) n_t} = \sum_{i=0}^{k-1} \sum_{j=0}^{i} \alpha_j c^{T} A^{(i-j)r} z^{-ir} \sum_{n=0}^{r-1} A^{n} b a_{t-n} \tag{5.48} \]
\[ \phantom{\alpha(z^{-r}) n_t} = \sum_{i=0}^{k-1} \sum_{j=0}^{i} \sum_{n=0}^{r-1} \alpha_j c^{T} A^{(i-j)r+n} b \, a_{t-ir-n} \tag{5.49} \]
\[ \phantom{\alpha(z^{-r}) n_t} = a_t - \psi_1 a_{t-1} - \cdots - \psi_h a_{t-h} \tag{5.50} \]
\[ \phantom{\alpha(z^{-r}) n_t} = \psi(z^{-1}) a_t \tag{5.51} \]
\[ \phantom{\alpha(z^{-r}) n_t} = x_t \tag{5.52} \]

The left hand side of the above equation gives the autoregressive part of the skipped ARMA with the observing rate $\Delta T$. The parameters of the right hand side are given by the following equation:

\[ \psi_i = - \sum_{j=0}^{[i/r]} \alpha_j c^{T} A^{i - jr} b \tag{5.53} \]

where the square brackets stand for "the integer part of". Now if we designate the skipped ARMA as

\[ s_T = \frac{\beta(z^{-r})}{\alpha(z^{-r})} \, e_T \tag{5.54} \]

and

\[ (1 + \alpha_1 z^{-r} + \cdots + \alpha_k z^{-kr}) s_T = (1 - \beta_1 z^{-r} - \cdots - \beta_l z^{-lr}) e_T \tag{5.55} \]
\[ \phantom{(1 + \alpha_1 z^{-r} + \cdots + \alpha_k z^{-kr}) s_T} = w_T \tag{5.56} \]

then the orders of the polynomials $\beta(z^{-r})$ and $\alpha(z^{-r})$ of $s_T$ can be obtained as follows. We have mentioned that $k$ is the order of $\alpha(z^{-r})$. However, from the discussion on the effect of sampling on the transfer function and the results from MacGregor, J. F. (1976), the roots of the polynomial $\alpha(z^{-r})$ are the roots of the polynomial $\phi(z^{-1})$ raised to the $r$th power, so $k$ must be equal to $p$. If $k$ is larger than $p$, then the last $k - p$ coefficients will be zeros. As for the order $l$ of the polynomial $\beta(z^{-r})$, MacGregor, J. F. (1976) and Anderson, O. D. (1975) gave $l$ as below:

\[ l = \left[ p + \frac{q - p}{r} \right] \tag{5.57} \]

We can also give an estimate of $l$ as follows. From Equation (5.50), $x_t$ is a moving average of order $h$ sampled at the sampling interval $\Delta t$, so if we sample $x_t$ at the interval $\Delta T = r \Delta t$ to get the moving average time series $w_T$, then its order $l$ is the integer quotient of $h/r$, and $h$ can be calculated as $h = r - 1 + r(k - 1)$ from Equation (5.49). Therefore, we have

\[ l = \left[ \frac{r - 1 + r(k - 1)}{r} \right] \tag{5.58} \]

Note that in Equation (5.54) for the skipped ARMA $s_T$, the generating white noise is $e_T$, not $a_T$, i.e. it is not the sequence of nonskipped observations of the $a_t$s. The reason is that in the modelling of an ARMA, the white noise is usually considered as a fictitious uncorrelated sequence with the smallest variance. So when a skipped ARMA $s_T$ is formed, it is not bound by the fact that the original series is driven by $a_t$.

Now we will obtain an expression for the parameters of the polynomial $\beta(z^{-r})$.
From Equation (5.50), if we square both sides, then take expectation, we will obtain

\[ E\{\alpha(z^{-r}) n_t \, \alpha(z^{-r}) n_t\} = \Big(1 + \sum_{i=1}^{h} \psi_i^{2}\Big) \sigma_a^{2} \tag{5.59} \]

Now if we carry out the same operation on the skipped ARMA $s_T$, we will obtain

\[ E\{\alpha(z^{-r}) s_T \, \alpha(z^{-r}) s_T\} = \Big(1 + \sum_{i=1}^{l} \beta_i^{2}\Big) \sigma_e^{2} \tag{5.60} \]

And since at a nonskipped position $s_T = n_t$, we have

\[ E\{\alpha(z^{-r}) n_t \, \alpha(z^{-r}) n_t\} = E\{\alpha(z^{-r}) s_T \, \alpha(z^{-r}) s_T\} \tag{5.61} \]

This will give us

\[ \Big(1 + \sum_{i=1}^{h} \psi_i^{2}\Big) \sigma_a^{2} = \Big(1 + \sum_{i=1}^{l} \beta_i^{2}\Big) \sigma_e^{2} \tag{5.62} \]

Now, instead of squaring both sides, we multiply both sides of Equation (5.50) by the quantity $\alpha(z^{-r}) n_{t-r}$, then take expectation, and we will obtain

\[ E\{\alpha(z^{-r}) n_t \, \alpha(z^{-r}) n_{t-r}\} = \Big(-\psi_r + \sum_{i=1}^{h-r} \psi_i \psi_{i+r}\Big) \sigma_a^{2} \tag{5.63} \]

and similarly

\[ E\{\alpha(z^{-r}) s_T \, \alpha(z^{-r}) s_{T-1}\} = \Big(-\beta_1 + \sum_{i=1}^{l-1} \beta_i \beta_{i+1}\Big) \sigma_e^{2} \tag{5.64} \]

and we can write

\[ \Big(-\psi_r + \sum_{i=1}^{h-r} \psi_i \psi_{i+r}\Big) \sigma_a^{2} = \Big(-\beta_1 + \sum_{i=1}^{l-1} \beta_i \beta_{i+1}\Big) \sigma_e^{2} \tag{5.65} \]

In general, we will have

\[ (1 + \beta_1^{2} + \beta_2^{2} + \cdots + \beta_l^{2}) \sigma_e^{2} = \gamma_w(0) = \gamma_x(0) \tag{5.66} \]
\[ (-\beta_1 + \beta_1 \beta_2 + \cdots + \beta_{l-1} \beta_l) \sigma_e^{2} = \gamma_w(1) = \gamma_x(r) \tag{5.67} \]
\[ (-\beta_2 + \beta_1 \beta_3 + \cdots + \beta_{l-2} \beta_l) \sigma_e^{2} = \gamma_w(2) = \gamma_x(2r) \tag{5.68} \]
\[ \vdots \]
\[ (-\beta_{l-1} + \beta_1 \beta_l) \sigma_e^{2} = \gamma_w(l-1) = \gamma_x((l-1)r) \tag{5.69} \]
\[ -\beta_l \sigma_e^{2} = \gamma_w(l) = \gamma_x(lr) \tag{5.70} \]

In the above set of equations, we have $l + 1$ equations and $l + 1$ parameters; therefore, we can solve for the parameters by substitution. However, this approach becomes more cumbersome as $l$ gets larger, so a numerical approach is preferred. Wilson, G. (1969) suggested an algorithm to obtain the parameters of a moving average time series numerically from its statistics. However, since it uses Newton's method, there is more computational burden, because the derivative has to be calculated. Furthermore, there is a matrix inversion in this method. An alternative approach is presented below. We write the moving average parameters $\beta_i$ as

\[ \beta_1 = \beta_1 \beta_2 + \cdots + \beta_{l-1} \beta_l - \frac{\gamma_w(1)}{\gamma_w(0)} (1 + \beta_1^{2} + \beta_2^{2} + \cdots + \beta_l^{2}) \tag{5.71} \]
\[ \beta_2 = \beta_1 \beta_3 + \cdots + \beta_{l-2} \beta_l - \frac{\gamma_w(2)}{\gamma_w(0)} (1 + \beta_1^{2} + \beta_2^{2} + \cdots + \beta_l^{2}) \tag{5.72} \]
\[ \vdots \]
\[ \beta_{l-1} = \beta_1 \beta_l - \frac{\gamma_w(l-1)}{\gamma_w(0)} (1 + \beta_1^{2} + \beta_2^{2} + \cdots + \beta_l^{2}) \tag{5.73} \]
\[ \beta_l = - \frac{\gamma_w(l)}{\gamma_w(0)} (1 + \beta_1^{2} + \beta_2^{2} + \cdots + \beta_l^{2}) \tag{5.74} \]

and put the above equations in the following matrix form:

\[ \begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_{l-1} \\ \beta_l \end{bmatrix} = \begin{bmatrix} \beta_2 & \beta_3 & \cdots & \beta_l & 0 \\ \beta_3 & \cdots & \beta_l & 0 & 0 \\ \vdots & & & & \vdots \\ \beta_l & 0 & \cdots & 0 & 0 \\ 0 & 0 & \cdots & 0 & 0 \end{bmatrix} \begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_{l-1} \\ \beta_l \end{bmatrix} - \frac{1 + \beta^{T} \beta}{\gamma_w(0)} \begin{bmatrix} \gamma_w(1) \\ \gamma_w(2) \\ \vdots \\ \gamma_w(l-1) \\ \gamma_w(l) \end{bmatrix} \tag{5.75} \]

The above equation has the familiar form

\[ x = f(x) \tag{5.76} \]

This is called an iteration equation, because its solution can be obtained numerically by the iteration

\[ x_N = f(x_{N-1}) \tag{5.77} \]

until convergence. It has been proved by Ostrowski, A. M. (1966) that if the absolute value of the derivative of $f(x)$ at the solution is smaller than unity, the iteration converges to this solution from an initial estimate sufficiently close to it. Since the right hand side of Equation (5.75) is quadratic in the parameters $\beta_i$, its derivatives with respect to these parameters must be linear functions of them. In addition, the parameters must form an invertible polynomial. These facts suggest that we try the iteration approach by defining the parameter vector

\[ \beta_N = \begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_l \end{bmatrix}_N \tag{5.78} \]

as the moving average parameters evaluated at iteration $N$, then solve Equation (5.75) by iterating $\beta_N$ as below:

\[ \beta_N = \begin{bmatrix} \beta_2 & \beta_3 & \cdots & \beta_l & 0 \\ \beta_3 & \cdots & \beta_l & 0 & 0 \\ \vdots & & & & \vdots \\ \beta_l & 0 & \cdots & 0 & 0 \\ 0 & 0 & \cdots & 0 & 0 \end{bmatrix}_{N-1} \beta_{N-1} - \frac{1 + \beta_{N-1}^{T} \beta_{N-1}}{\gamma_w(0)} \begin{bmatrix} \gamma_w(1) \\ \gamma_w(2) \\ \vdots \\ \gamma_w(l-1) \\ \gamma_w(l) \end{bmatrix} \tag{5.79} \]

with the initial estimate $\beta_0 = 0$, until convergence. Extensive software tests indicate that this algorithm converges to the correct and unique invertible parameters of the moving average polynomial.

From the above discussion, we notice that it is not necessary to obtain the polynomial $\psi(z^{-1})$ to obtain the parameters $\beta_i$. However, the availability of this information can be used as an additional check for errors in the calculation of the parameters $\beta_i$. It must also be noted that the calculation of the parameters $\alpha_i$ does not require knowledge of the parameters $\theta_i$.
Only the calculation of the parameters $\beta_i$ requires this knowledge.

The Control Interval

Basically, the theory of MacGregor, J. F.'s method is as follows. We have a Box-Jenkins model at a control interval $\Delta t$:

\[ y_t = \frac{\omega(z^{-1})}{\delta(z^{-1})} u_{t-f-1} + \frac{\theta(z^{-1})}{\phi(z^{-1})} a_t \tag{5.80} \]
\[ \phantom{y_t} = \frac{\omega(z^{-1})}{\delta(z^{-1})} u_{t-f-1} + n_t \tag{5.81} \]

The minimum variance closed-loop performance at this control interval can be obtained entirely from the polynomials $\theta(z^{-1})$ and $\phi(z^{-1})$ and the delay $f$. Let this variance be $\sigma_y^{2}(t)$. Now suppose we model the same system at a slower rate $\Delta T = r \Delta t$ and get the model

\[ y_T = \frac{\omega^{*}(z^{-r})}{\delta^{*}(z^{-r})} u_{T-f/r-1} + \frac{\beta(z^{-r})}{\alpha(z^{-r})} e_T \tag{5.82} \]
\[ \phantom{y_T} = \frac{\omega^{*}(z^{-r})}{\delta^{*}(z^{-r})} u_{T-f/r-1} + n_T \tag{5.83} \]

with the minimum variance closed-loop performance $\sigma_y^{2}(T)$. We can decide whether to choose the slower control interval $\Delta T$ by comparing $\sigma_y^{2}(T)$ and $\sigma_y^{2}(t)$. The control loop can be sampled and controlled more slowly if

\[ \sigma_y^{2}(T) \approx \sigma_y^{2}(t) \tag{5.84} \]

The above relation says that there is not much degradation in the controller performance when the loop is controlled more slowly.

Other Applications

In the above discussion, we have introduced techniques to obtain expressions for the parameters of a skipped ARIMA in terms of the parameters of its original time series. This technique was applied to determine whether a control loop can be controlled more slowly. In this endeavour, we have spent a lot of effort to obtain the expression for the autoregressive parameters of the skipped ARIMA, Equation (5.43). The question now is whether the result was worth the effort. In terms of accuracy, this seems to be a legitimate objection. The gain in labor and accuracy is probably small, especially if the number of autoregressive parameters is small (< 5). The answer to this objection is that the technique can be carried over to other applications. The new expression for the time series $n_t$ given by Equation (5.44) can be used as a stepping stone for other developments in the time series literature.
This equation enables us to obtain the coefficients $\psi_i$ of the moving average time series $x_t$, Equation (5.53). These coefficients facilitate the calculation of the parameters $\beta_i$. These coefficients were not obtained by MacGregor, J. F. (1976), whose work relied on a lemma in the work of Telser, L. G. (1967). That work also did not give an expression for the coefficients $\psi_i$ for the general case. To our knowledge, this leaves our work as the only source of such an expression. As for other applications, we will consider two of them: one in statistics and one in engineering.

The aggregate ARIMA time series is one that is formed by adding a number of consecutive observations of another ARIMA, as described by the following equation:

\[ N_T = \sum_{j=0}^{r-1} n_{t-j} \tag{5.85} \]
\[ \phantom{N_T} = \sum_{j=0}^{r-1} z^{-j} n_t \tag{5.86} \]

The new ARIMA time series $N_T$ has a physical meaning such as weekly or monthly data, instead of the daily or hourly data portrayed by the ARIMA time series $n_t$. Note that the new ARIMA $N_T$ has a different sampling interval from that of the old ARIMA $n_t$. By replacing $n_t$ in the above equation, we obtain

\[ N_T = c^{T} \left[ I - (A z^{-1})^{r} \right]^{-1} \sum_{i=0}^{r-1} \sum_{j=0}^{r-1} A^{i} b a_{t-i-j} \tag{5.87} \]

The above equation gives us a direct relationship between the autoregressive parameters of the two time series $N_T$ and $n_t$. As for the moving average parameters, we can develop the right hand side of the above equation or obtain them from a relationship between the autocovariances of the two time series. The relationship between the autocovariances of an aggregate ARIMA and its disaggregate ARIMA is available in the time series literature (Wei, W. W. S. (1990)).

In the pulp and paper industry, there is one application that can use the technique we presented in this chapter. This application is very close to the aggregate ARIMA. The beta-gauge sensor measures the basis weight, moisture and caliper profile data while it moves across the width of a paper machine.
These profile data have a very high resolution, and the profile is called the high resolution profile. This profile can have up to 600 points. In control, we do not need such a high resolution profile, so it is reduced to a low resolution profile for control. The low resolution profile contains the averages of equal numbers of consecutive points of the high resolution profile. There are a number of reasons for doing this. The first reason is to reduce sensor noise. The second reason is to make sure there is enough memory space for storage of these profiles. The third reason is that we do not want to waste execution time on these profiles when they contain a lot of data. Averaging the high resolution profile is the right approach, but we do not want to overdo it and lose good cross-machine variation. With the presented technique, we can model the high resolution profile as the ARIMA $n_t$ and the low resolution profile as the ARIMA $N_T$ and obtain the following relationship:

\[ N_T = \frac{1}{r} \, c^{T} \left[ I - (A z^{-1})^{r} \right]^{-1} \sum_{i=0}^{r-1} \sum_{j=0}^{r-1} A^{i} b a_{t-i-j} \tag{5.88} \]

For a different $r$, we get a different ARIMA $N_T$ and a different variance for this ARIMA. By comparing the variances of these ARIMAs $N_T$, we can decide how many points to take for the averaging of the high resolution profile.

5.3.3 Examples

In this section, we will present some examples to illustrate the presented theory. We consider the following time series:

\[ n_t = \frac{1 - 0.8 z^{-1} + 0.12 z^{-2}}{1 - 1.2 z^{-1} + 0.47 z^{-2} - 0.06 z^{-3}} \, a_t \tag{5.89} \]

with $a_t$ of unit variance. Now we want to find the parameters of the skipped ARMA with the skipping parameter $r = 2$. The original ARMA time series has the state space model form

\[ x_{t+1} = A x_t + b a_{t+1} \tag{5.90} \]
\[ n_t = c^{T} x_t \tag{5.91} \]

with

\[ A = \begin{bmatrix} 1.2 & 1 & 0 \\ -0.47 & 0 & 1 \\ 0.06 & 0 & 0 \end{bmatrix}, \qquad b = \begin{bmatrix} 1 \\ -0.8 \\ 0.12 \end{bmatrix} \tag{5.92} \]
\[ c^{T} = \begin{bmatrix} 1 & 0 & 0 \end{bmatrix} \tag{5.93} \]

In our example, we have $r = 2$, $m = 3$ and therefore $k$ should be equal to 3.
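The calculation of the autoregressive parameters from Equation (5.43) for this example can be sketched numerically as follows. This is a minimal pure-Python illustration, not the software used in the thesis: the helper names `matmul`, `matpow`, `solve` and `skipped_ar` are hypothetical, and the sketch assumes $c^{T} = [1 \; 0 \; \cdots \; 0]$, so that $c^{T} A^{jr}$ is simply the first row of $A^{jr}$.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def matpow(X, n):
    """X raised to the non-negative integer power n."""
    size = len(X)
    result = [[float(i == j) for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = matmul(result, X)
    return result


def solve(M, v):
    """Solve M x = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    aug = [[float(e) for e in row] + [float(rhs)] for row, rhs in zip(M, v)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(aug[i][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for i in range(col + 1, n):
            factor = aug[i][col] / aug[col][col]
            for j in range(col, n + 1):
                aug[i][j] -= factor * aug[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def skipped_ar(A, r, k):
    """alpha_1 .. alpha_k from Equation (5.43), assuming c^T = [1 0 ... 0]."""
    # First rows of A^(kr), A^((k-1)r), ..., A^0, i.e. c^T A^(jr) for j = k .. 0.
    rows = [matpow(A, j * r)[0] for j in range(k, -1, -1)]
    target, M = rows[0], rows[1:]
    # Normal equations (M M^T) alpha = -M (c^T A^(kr))^T.
    MMt = [[sum(a * b for a, b in zip(ri, rj)) for rj in M] for ri in M]
    rhs = [-sum(a * b for a, b in zip(ri, target)) for ri in M]
    return solve(MMt, rhs)


A = [[1.2, 1.0, 0.0], [-0.47, 0.0, 1.0], [0.06, 0.0, 0.0]]
alpha = skipped_ar(A, r=2, k=3)
print(alpha)
```

No root finding is involved; only powers of $A$ and one small linear solve are needed.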
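This means we have to obtain

\[ A^{2} = \begin{bmatrix} 0.9700 & 1.2000 & 1.0000 \\ -0.5040 & -0.4700 & 0 \\ 0.0720 & 0.0600 & 0 \end{bmatrix} \tag{5.94} \]
\[ A^{4} = \begin{bmatrix} 0.4081 & 0.6600 & 0.9700 \\ -0.2520 & -0.3839 & -0.5040 \\ 0.0396 & 0.0582 & 0.0720 \end{bmatrix} \tag{5.95} \]
\[ A^{6} = \begin{bmatrix} 0.1331 & 0.2377 & 0.4081 \\ -0.0872 & -0.1522 & -0.2520 \\ 0.0143 & 0.0245 & 0.0396 \end{bmatrix} \tag{5.96} \]

and we have

\[ \begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \alpha_3 \end{bmatrix} = - \left[ \begin{bmatrix} c^{T} A^{4} \\ c^{T} A^{2} \\ c^{T} \end{bmatrix} \begin{bmatrix} c^{T} A^{4} \\ c^{T} A^{2} \\ c^{T} \end{bmatrix}^{T} \right]^{-1} \begin{bmatrix} c^{T} A^{4} \\ c^{T} A^{2} \\ c^{T} \end{bmatrix} \left[ c^{T} A^{6} \right]^{T} \tag{5.97} \]
\[ \phantom{\begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \alpha_3 \end{bmatrix}} = \begin{bmatrix} -0.5 \\ 0.0769 \\ -0.0036 \end{bmatrix} \tag{5.98} \]

With these values of the $\alpha_i$, the roots of $\alpha(z^{-2})$ are 0.25, 0.16 and 0.09. Comparing these with the roots of $\phi(z^{-1})$, which are 0.5, 0.4 and 0.3, we can claim that the autoregressive polynomial $\alpha(z^{-2})$ is correct. As for the order of the polynomial $\beta(z^{-2})$, we have $h = 5$ and hence $l = 2$. The coefficients $\psi_i$ of the moving average time series $x_t$ in Equation (5.50) are given below:

\[ \psi_1 = -c^{T} A b = -0.4000 \tag{5.99} \]
\[ \psi_2 = -c^{T} A^{2} b - \alpha_1 c^{T} b = 0.3700 \tag{5.100} \]
\[ \psi_3 = -c^{T} A^{3} b - \alpha_1 c^{T} A b = 0.1720 \tag{5.101} \]
\[ \psi_4 = -c^{T} A^{4} b - \alpha_1 c^{T} A^{2} b - \alpha_2 c^{T} b = -0.0084 \tag{5.102} \]
\[ \psi_5 = -c^{T} A^{5} b - \alpha_1 c^{T} A^{3} b - \alpha_2 c^{T} A b = -0.0072 \tag{5.103} \]

This gives us

\[ \gamma_w(0) = (1 + \psi_1^{2} + \psi_2^{2} + \psi_3^{2} + \psi_4^{2} + \psi_5^{2}) \sigma_a^{2} \tag{5.104} \]
\[ \phantom{\gamma_w(0)} = 1.3266064 \tag{5.105} \]
\[ \gamma_w(1) = (-\psi_2 + \psi_1 \psi_3 + \psi_2 \psi_4 + \psi_3 \psi_5) \sigma_a^{2} \tag{5.106} \]
\[ \phantom{\gamma_w(1)} = -0.4431464 \tag{5.107} \]
\[ \gamma_w(2) = (-\psi_4 + \psi_1 \psi_5) \sigma_a^{2} \tag{5.108} \]
\[ \phantom{\gamma_w(2)} = 0.01128 \tag{5.109} \]

and as described in the theory, we have

\[ (1 + \beta_1^{2} + \beta_2^{2}) \sigma_e^{2} = \gamma_w(0) \tag{5.110} \]
\[ (-\beta_1 + \beta_1 \beta_2) \sigma_e^{2} = \gamma_w(1) \tag{5.111} \]
\[ -\beta_2 \sigma_e^{2} = \gamma_w(2) \tag{5.112} \]

Now if we solve these equations by substitution, we will obtain the following relationships:

\[ \beta_2 = - \frac{\gamma_w(2)}{\sigma_e^{2}} \tag{5.113} \]
\[ \beta_1 = - \frac{\gamma_w(1)}{\sigma_e^{2} + \gamma_w(2)} \tag{5.114} \]

and the final equation to solve for the variance $\sigma_e^{2}$ as

\[ (\sigma_e^{2})^{4} + (2 \gamma_w(2) - \gamma_w(0)) (\sigma_e^{2})^{3} + (2 \gamma_w^{2}(2) - 2 \gamma_w(2) \gamma_w(0) + \gamma_w^{2}(1)) (\sigma_e^{2})^{2} + (2 \gamma_w^{3}(2) - \gamma_w(0) \gamma_w^{2}(2)) \sigma_e^{2} + \gamma_w^{4}(2) = 0 \tag{5.115} \]

This is a quartic equation and the solution for $\sigma_e^{2}$ must be positive. To solve this problem by our method, we wrote a program called ma_id.m to calculate the parameters $\beta_1$, $\beta_2$ and the variance $\sigma_e^{2}$. This program is included in Appendix B.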
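For orientation, the iteration of Equation (5.79) can be sketched in a few lines. The sketch below is written in Python and is not the Appendix B program: the function name `skipped_ma`, the tolerance and the iteration limit are assumptions made for illustration. It takes the autocovariances $\gamma_w(0), \ldots, \gamma_w(l)$ and starts from $\beta_0 = 0$.

```python
def skipped_ma(gamma, tol=1e-12, max_iter=1000):
    """Fixed-point iteration of Equation (5.79), started from beta = 0.

    gamma = [gamma_w(0), ..., gamma_w(l)].
    Returns (beta, sigma_e2) for the polynomial
    1 - beta_1 z^-r - ... - beta_l z^-lr driven by noise of variance sigma_e2.
    """
    l = len(gamma) - 1
    beta = [0.0] * l
    for _ in range(max_iter):
        scale = (1.0 + sum(b * b for b in beta)) / gamma[0]
        new = []
        for i in range(1, l + 1):
            # sum_j beta_j * beta_(j+i): row i of the shifted matrix times beta
            cross = sum(beta[j] * beta[j + i] for j in range(l - i))
            new.append(cross - scale * gamma[i])
        converged = max(abs(n - b) for n, b in zip(new, beta)) < tol
        beta = new
        if converged:
            break
    sigma_e2 = gamma[0] / (1.0 + sum(b * b for b in beta))
    return beta, sigma_e2


# Autocovariances of Equations (5.105), (5.107) and (5.109).
beta, sigma_e2 = skipped_ma([1.3266064, -0.4431464, 0.01128])
print(beta, sigma_e2)
```

The variance is recovered from Equation (5.66) once the parameters have converged, so no derivative or matrix inversion is required.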
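From this program, we obtain the following solutions:

\[ \beta_1 = 0.3782 \tag{5.116} \]
\[ \beta_2 = -0.0097 \tag{5.117} \]
\[ \sigma_e^{2} = 1.1605 \tag{5.118} \]

With the above solution for $\sigma_e^{2}$ and the given autocovariances $\gamma_w(i)$, $i = 0, \ldots, 2$, the left hand side of the above quartic equation gives a value of 3.694e-07. This is accurate enough and we can say both methods give the same solution. The skipped ARMA $s_T$ is hence given as follows:

\[ s_T = \frac{1 - 0.3782 z^{-r} + 0.0097 z^{-2r}}{1 - 0.5 z^{-r} + 0.0769 z^{-2r} - 0.0036 z^{-3r}} \, e_T \tag{5.119} \]

As the final test of our theory, we will calculate and compare the autocovariances of both time series $s_T$ and $n_t$. If we have the following relationship:

\[ \begin{bmatrix} \gamma_s(0) \\ \gamma_s(1) \\ \vdots \\ \gamma_s(k + l) \end{bmatrix} = \begin{bmatrix} \gamma_n(0) \\ \gamma_n(r) \\ \vdots \\ \gamma_n(rk + rl) \end{bmatrix} \tag{5.120} \]

then we can say the skipped ARMA $s_T$ consists of the skipped observations from the original ARMA $n_t$. But first of all, we need the equation to calculate the autocovariances of an ARMA. Since the derivation of this equation is short, we will include it here. The white noise $a_t$ has the property that it is not autocorrelated, which means

\[ E\{a_t a_{t-j}\} = 0 \quad \text{for } j \neq 0 \tag{5.121} \]
\[ \phantom{E\{a_t a_{t-j}\}} = \sigma_a^{2} \quad \text{for } j = 0 \tag{5.122} \]

It also has the property that its future value is not cross-correlated with the value of the time series, which means

\[ E\{a_t n_{t-j}\} = 0 \quad \text{for } j > 0 \tag{5.123} \]
\[ \phantom{E\{a_t n_{t-j}\}} = \gamma_{an}(j) \quad \text{for } j \le 0 \tag{5.124} \]

From the equation of the ARMA time series $n_t$

\[ (1 - \phi_1 z^{-1} - \phi_2 z^{-2} - \phi_3 z^{-3}) n_t = (1 - \theta_1 z^{-1} - \theta_2 z^{-2}) a_t \tag{5.125} \]

if we multiply both sides of Equation (5.125) by $a_{t-i}$, let $i$ go from 0 to $o > 3$, then take expectation, writing $\gamma_{an}(i)$ for the cross-covariance between $n_t$ and $a_{t-i}$, we have

\[ \gamma_{an}(0) = \sigma_a^{2} \tag{5.126} \]
\[ \gamma_{an}(1) - \phi_1 \gamma_{an}(0) = -\theta_1 \sigma_a^{2} \tag{5.127} \]
\[ \gamma_{an}(2) - \phi_1 \gamma_{an}(1) - \phi_2 \gamma_{an}(0) = -\theta_2 \sigma_a^{2} \tag{5.128} \]
\[ \gamma_{an}(3) - \phi_1 \gamma_{an}(2) - \phi_2 \gamma_{an}(1) - \phi_3 \gamma_{an}(0) = 0 \tag{5.129} \]
\[ \gamma_{an}(o) - \phi_1 \gamma_{an}(o-1) - \phi_2 \gamma_{an}(o-2) - \phi_3 \gamma_{an}(o-3) = 0 \tag{5.130} \]

The above equations can be written in the following matrix form:

\[ \begin{bmatrix} 1 & & & & & \\ -\phi_1 & 1 & & & & \\ -\phi_2 & -\phi_1 & 1 & & & \\ -\phi_3 & -\phi_2 & -\phi_1 & 1 & & \\ & \ddots & \ddots & \ddots & \ddots & \\ & & -\phi_3 & -\phi_2 & -\phi_1 & 1 \end{bmatrix} \begin{bmatrix} \gamma_{an}(0) \\ \gamma_{an}(1) \\ \gamma_{an}(2) \\ \gamma_{an}(3) \\ \vdots \\ \gamma_{an}(o) \end{bmatrix} = \begin{bmatrix} 1 \\ -\theta_1 \\ -\theta_2 \\ 0 \\ \vdots \\ 0 \end{bmatrix} \sigma_a^{2} \tag{5.131} \]

This means we have

\[ \begin{bmatrix} \gamma_{an}(0) \\ \gamma_{an}(1) \\ \gamma_{an}(2) \\ \gamma_{an}(3) \\ \vdots \\ \gamma_{an}(o) \end{bmatrix} = \begin{bmatrix} 1 & & & & & \\ -\phi_1 & 1 & & & & \\ -\phi_2 & -\phi_1 & 1 & & & \\ -\phi_3 & -\phi_2 & -\phi_1 & 1 & & \\ & \ddots & \ddots & \ddots & \ddots & \\ & & -\phi_3 & -\phi_2 & -\phi_1 & 1 \end{bmatrix}^{-1} \begin{bmatrix} 1 \\ -\theta_1 \\ -\theta_2 \\ 0 \\ \vdots \\ 0 \end{bmatrix} \sigma_a^{2} \tag{5.132} \]

Now if we multiply both sides of Equation (5.125) by $n_{t-i}$, take expectation, then let $i$ go from 0 to $o$, we have

\[ \gamma_n(0) - \phi_1 \gamma_n(1) - \phi_2 \gamma_n(2) - \phi_3 \gamma_n(3) = \gamma_{an}(0) - \theta_1 \gamma_{an}(1) - \theta_2 \gamma_{an}(2) \tag{5.133} \]
\[ -\phi_1 \gamma_n(0) + (1 - \phi_2) \gamma_n(1) - \phi_3 \gamma_n(2) = -\theta_1 \gamma_{an}(0) - \theta_2 \gamma_{an}(1) \tag{5.134} \]
\[ -\phi_2 \gamma_n(0) - (\phi_1 + \phi_3) \gamma_n(1) + \gamma_n(2) = -\theta_2 \gamma_{an}(0) \tag{5.135} \]
\[ -\phi_3 \gamma_n(0) - \phi_2 \gamma_n(1) - \phi_1 \gamma_n(2) + \gamma_n(3) = 0 \tag{5.136} \]
\[ -\phi_3 \gamma_n(o-3) - \phi_2 \gamma_n(o-2) - \phi_1 \gamma_n(o-1) + \gamma_n(o) = 0 \tag{5.137} \]

In matrix form, the above equations can be written as

\[ \begin{bmatrix} 1 & -\phi_1 & -\phi_2 & -\phi_3 & & \\ -\phi_1 & 1 - \phi_2 & -\phi_3 & 0 & & \\ -\phi_2 & -\phi_1 - \phi_3 & 1 & 0 & & \\ -\phi_3 & -\phi_2 & -\phi_1 & 1 & & \\ & \ddots & \ddots & \ddots & \ddots & \\ & & -\phi_3 & -\phi_2 & -\phi_1 & 1 \end{bmatrix} \begin{bmatrix} \gamma_n(0) \\ \gamma_n(1) \\ \gamma_n(2) \\ \gamma_n(3) \\ \vdots \\ \gamma_n(o) \end{bmatrix} = \begin{bmatrix} 1 & -\theta_1 & -\theta_2 & 0 & \cdots & 0 \\ -\theta_1 & -\theta_2 & 0 & 0 & \cdots & 0 \\ -\theta_2 & 0 & 0 & 0 & \cdots & 0 \\ 0 & 0 & 0 & 0 & \cdots & 0 \\ \vdots & & & & & \vdots \\ 0 & 0 & 0 & 0 & \cdots & 0 \end{bmatrix} \begin{bmatrix} \gamma_{an}(0) \\ \gamma_{an}(1) \\ \gamma_{an}(2) \\ \gamma_{an}(3) \\ \vdots \\ \gamma_{an}(o) \end{bmatrix} \tag{5.138} \]

and therefore the equation we are looking for, Equation (5.139), is obtained by premultiplying both sides of Equation (5.138) by the inverse of the matrix on its left hand side and substituting Equation (5.132) for the vector of the $\gamma_{an}(i)$.

A program was written in MATLAB to calculate the autocovariances of the two time series $n_t$ and $s_T$. This program, which is called autocov.m, is included in Appendix B. By using this program and choosing $o$ equal to 10 for the ARMA $n_t$ and $o$ equal to 5 for the skipped ARMA, we obtain the autocovariances of the two ARMAs as follows:

\[ \begin{bmatrix} \gamma_n(0) \\ \gamma_n(1) \\ \gamma_n(2) \\ \gamma_n(3) \\ \gamma_n(4) \\ \gamma_n(5) \\ \gamma_n(6) \\ \gamma_n(7) \\ \gamma_n(8) \\ \gamma_n(9) \\ \gamma_n(10) \end{bmatrix} = \begin{bmatrix} 1.1779 \\ 0.4557 \\ 0.1406 \\ 0.0252 \\ -0.0085 \\ -0.0136 \\ -0.0108 \\ -0.0071 \\ -0.0043 \\ -0.0024 \\ -0.0013 \end{bmatrix}, \qquad \begin{bmatrix} \gamma_s(0) \\ \gamma_s(1) \\ \gamma_s(2) \\ \gamma_s(3) \\ \gamma_s(4) \\ \gamma_s(5) \end{bmatrix} = \begin{bmatrix} 1.1779 \\ 0.1406 \\ -0.0085 \\ -0.0108 \\ -0.0043 \\ -0.0013 \end{bmatrix} \tag{5.140} \]

Comparing the above listed autocovariances of the two time series $n_t$ and $s_T$, we can claim that the identification of the skipped ARMA is correct.

Now suppose that this ARMA $n_t$ disturbs a linear control system and the disturbed system can be written as below:

\[ y_t = \frac{5.0}{1 - 0.45 z^{-1}} u_{t-3} + \frac{1 - 0.8 z^{-1} + 0.12 z^{-2}}{1 - 1.2 z^{-1} + 0.47 z^{-2} - 0.06 z^{-3}} \, a_t \tag{5.141} \]

The control loop currently has a control interval of 10 seconds and the pure transportation lag is 20 seconds. Now a control engineer wonders whether it is feasible to control this loop at a control interval of 20 seconds. To answer this question, we will make use of the above result. We can write

\[ n_t = \frac{1 - 0.8 z^{-1} + 0.12 z^{-2}}{1 - 1.2 z^{-1} + 0.47 z^{-2} - 0.06 z^{-3}} \, a_t \tag{5.142} \]
\[ \phantom{n_t} = (1 + 0.4 z^{-1} + 0.13 z^{-2}) a_t + \frac{0.0280 - 0.0371 z^{-1} + 0.0078 z^{-2}}{1 - 1.2 z^{-1} + 0.47 z^{-2} - 0.06 z^{-3}} \, a_{t-3} \tag{5.143} \]

and the variance of the output variable given by the minimum variance control law is

\[ \sigma_y^{2}(t) = 1 + 0.4^{2} + 0.13^{2} \tag{5.144} \]
\[ \phantom{\sigma_y^{2}(t)} = 1.1769 \tag{5.145} \]

At the 20 second control interval, the disturbance ARMA $n_t$ will be skipped with a skipping parameter $r = 2$. By making use of the above result, we can say the disturbance at this control interval is $s_T$, which can be written as below:

\[ s_T = \frac{1 - 0.3782 z^{-r} + 0.0097 z^{-2r}}{1 - 0.5 z^{-r} + 0.0769 z^{-2r} - 0.0036 z^{-3r}} \, e_T \tag{5.146} \]
\[ \phantom{s_T} = (1 + 0.1218 z^{-r}) e_T + \frac{-0.0063 - 0.0058 z^{-r} + 0.0004 z^{-2r}}{1 - 0.5 z^{-r} + 0.0769 z^{-2r} - 0.0036 z^{-3r}} \, z^{-2r} e_T \tag{5.147} \]

with $\sigma_e^{2} = 1.1605$. At this control interval, the variance of the output variable given by the minimum variance controller is

\[ \sigma_y^{2}(T) = 1.1605 (1 + 0.1218^{2}) \tag{5.148} \]
\[ \phantom{\sigma_y^{2}(T)} = 1.1777 \tag{5.149} \]

We see that there is a slight increase in the output variable variance, but this increase is negligible. So we can conclude that it is feasible to control this control loop at the 20 second control interval.

Now consider the case of the same linear system, but with a disturbance time series that is a moving average of the following form:

\[ n_t = (1 - 1.8 z^{-1} + 1.19 z^{-2} - 0.342 z^{-3} + 0.0360 z^{-4}) a_t \tag{5.150} \]

with $a_t$ of unit variance. In this case, the skipped ARMA is also a moving average. With the skipping parameter $r = 2$, this moving average time series will have second order.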
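The claim about the order can be checked directly from the autocovariances of the moving average in Equation (5.150): only the lags 0, $r$ and $2r$ are nonzero at multiples of $r = 2$. The following short Python sketch illustrates this; the function name `ma_autocov` is hypothetical and the driving noise is assumed to have unit variance, as in the text.

```python
def ma_autocov(coeffs, lag):
    """Autocovariance at `lag` of n_t = sum_i coeffs[i] * a_(t-i), with unit-variance a_t."""
    return sum(c0 * c1 for c0, c1 in zip(coeffs, coeffs[lag:]))


theta = [1.0, -1.8, 1.19, -0.342, 0.036]  # coefficients of Equation (5.150)
r = 2
# Autocovariances at lags 0, r, 2r and 3r of the original series.
gammas = [ma_autocov(theta, j * r) for j in range(4)]
print(gammas)
```

The three nonzero values are the autocovariances that the second order skipped moving average has to reproduce, in the same way as Equations (5.66) to (5.70).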
By using the program ma_id.m in Appendix B, we obtain this skipped ARMA as below:

$$s_\tau = (1 + 0.3588z^{-r} + 0.0070z^{-2r})\,e_\tau \tag{5.151}$$

with $e_\tau$ having a variance of $\sigma_e^2 = 5.1156$. Now if the system is controlled at the interval of 10 seconds, then the output variable variance given by the minimum variance controller is

$$\sigma_y^2(t) = 1 + 1.8^2 + 1.19^2 \tag{5.152}$$

$$\quad = 5.6561 \tag{5.153}$$

And if the system is controlled at the interval of 20 seconds, then the output variable variance given by the minimum variance controller is

$$\sigma_y^2(\tau) = 5.1156\,(1 + 0.3588^2) \tag{5.154}$$

$$\quad = 5.7742 \tag{5.155}$$

Now consider the same linear system but with the following second order autoregressive time series disturbance

$$n_t = \frac{1}{1 - 1.5z^{-1} + 0.56z^{-2}}\,a_t \tag{5.156}$$

where $a_t$ is of unit variance. This time series can be put into the following state space model:

$$\mathbf{x}_{t+1} = \mathbf{A}\mathbf{x}_t + \mathbf{b}\,a_{t+1} \tag{5.157}$$

$$n_t = \mathbf{c}^T\mathbf{x}_t \tag{5.158}$$

with

$$\mathbf{A} = \begin{bmatrix} 1.5 & 1 \\ -0.56 & 0 \end{bmatrix} \tag{5.159}$$

$$\mathbf{c}^T = \begin{bmatrix} 1 & 0 \end{bmatrix} \tag{5.160}$$

The parameters of the skipped ARMA can be obtained as follows. We have

$$\begin{bmatrix} \mathbf{c}^T\mathbf{A}^2 \\ \mathbf{c}^T \end{bmatrix} = \begin{bmatrix} 1.69 & 1.5 \\ 1 & 0 \end{bmatrix}, \qquad \mathbf{c}^T\mathbf{A}^4 = \begin{bmatrix} 1.5961 & 1.6950 \end{bmatrix} \tag{5.161}$$

so that the coefficients of the skipped autoregressive polynomial are

$$-\begin{bmatrix} 1.69 & 1 \\ 1.5 & 0 \end{bmatrix}^{-1}\begin{bmatrix} 1.5961 \\ 1.6950 \end{bmatrix} = \begin{bmatrix} -1.1300 \\ 0.3136 \end{bmatrix} \tag{5.162}$$

and

$$\vartheta_1 = -1.5 \tag{5.163}$$

$$\vartheta_2 = -0.56 \tag{5.164}$$

$$\vartheta_3 = 0 \tag{5.165}$$

and therefore

$$\gamma_w(0) = 3.5636 \tag{5.166}$$

$$\gamma_w(1) = 0.56 \tag{5.167}$$

Using the program ma_id.m, we obtain the following result:

$$w_\tau = (1 + 0.1612z^{-r})\,e_\tau \tag{5.168}$$

with $\sigma_e^2 = 3.4733$. The skipped ARMA hence has the following model:

$$s_\tau = \frac{1 + 0.1612z^{-r}}{1 - 1.13z^{-r} + 0.3136z^{-2r}}\,e_\tau \tag{5.169}$$

This result has been checked by comparing the autocovariances of the two time series $s_\tau$ and $n_t$. As before, we can compare the loop performances by checking the minimum variances given by the minimum variance controllers for both the 10 and 20 second control intervals. We can write:

$$n_t = \frac{1}{1 - 1.5z^{-1} + 0.56z^{-2}}\,a_t \tag{5.170}$$

$$\quad = (1 + 1.5z^{-1} + 1.69z^{-2})\,a_t + \frac{1.6950 - 0.9464z^{-1}}{1 - 1.5z^{-1} + 0.56z^{-2}}\,z^{-3}a_t \tag{5.171}$$
and

$$s_\tau = \frac{1 + 0.1612z^{-r}}{1 - 1.13z^{-r} + 0.3136z^{-2r}}\,e_\tau \tag{5.172}$$

$$\quad = (1 + 1.2912z^{-r})\,e_\tau + \frac{1.1455 - 0.4049z^{-r}}{1 - 1.13z^{-r} + 0.3136z^{-2r}}\,z^{-2r}e_\tau \tag{5.173}$$

At the control interval of 10 seconds, the output variable variance given by the minimum variance controller is

$$\sigma_y^2(t) = 1 + 1.5^2 + 1.69^2 \tag{5.174}$$

$$\quad = 6.1061 \tag{5.175}$$

Now if the system is controlled at the interval of 20 seconds, then the output variable variance given by the minimum variance controller is

$$\sigma_y^2(\tau) = 3.4733\,(1 + 1.2912^2) \tag{5.176}$$

$$\quad = 9.2639 \tag{5.177}$$

Comparing the degradations in feedback performance of the three cases above, we can summarize them as follows. In the first case, the increase in the output variable variance is about 0.07% (from 1.1769 to 1.1777). In the second case, the increase is around 2%. In the third case, the increase in the variance is more than 50%, so it is not feasible to control more slowly. The difference between the third case and the other two is that in the third case the roots of the autoregressive polynomial are large (0.7 and 0.8). This means stronger serial correlation, so skipping even one observation means skipping too much correlation. As a result, the feedback controller corresponding to the skipped ARMA does not remove enough correlation from the disturbance and hence gives a higher variance for the output variable.

5.4 Conclusion

In this chapter, we have discussed the problem of determining the optimal control interval. The word optimal normally means smaller variance in stochastic control, and this criterion has motivated research into the topic. The work in this thesis on this topic is not original in the sense that it only improves on existing work. However, the method discussed in this chapter provides an elegant way to attack the problem. It not only solves the problem of the optimal control interval but also solves similar problems such as modelling an aggregate ARIMA.
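As a check on the three comparisons of Section 5.3, the minimum variance figures can be recomputed from the impulse response weights of each disturbance model: with a pure delay f, the output variance under minimum variance control is the white noise variance times 1 + ψ₁² + … + ψ_f². The following is a minimal Python sketch using the model coefficients and noise variances quoted in that section. The function names are illustrative, and taking f = 2 at the 10 second interval and f = 1 at the 20 second interval is an assumption that follows from the 20 second transportation lag.

```python
def psi_weights(theta, phi, n):
    """Return psi_0..psi_n, the impulse response weights of theta/phi.

    theta and phi hold polynomial coefficients in increasing powers of the
    backward shift operator, signs included, each starting with 1.
    """
    psi = []
    for k in range(n + 1):
        value = theta[k] if k < len(theta) else 0.0
        for j in range(1, min(k, len(phi) - 1) + 1):
            value -= phi[j] * psi[k - j]
        psi.append(value)
    return psi


def mv_variance(theta, phi, noise_var, f):
    """Output variance under minimum variance control with pure delay f."""
    return noise_var * sum(w * w for w in psi_weights(theta, phi, f))


# Each case: (model at the 10 s interval, f = 2), (skipped model at 20 s, f = 1)
cases = {
    "ARMA(3,2)": (([1, -0.8, 0.12], [1, -1.2, 0.47, -0.06], 1.0, 2),
                  ([1, -0.3782, 0.0097], [1, -0.5, 0.0769, -0.0036], 1.1605, 1)),
    "MA(4)": (([1, -1.8, 1.19, -0.342, 0.0360], [1], 1.0, 2),
              ([1, 0.3588, 0.0070], [1], 5.1156, 1)),
    "AR(2)": (([1], [1, -1.5, 0.56], 1.0, 2),
              ([1, 0.1612], [1, -1.13, 0.3136], 3.4733, 1)),
}

results = {}
for name, (fast, slow) in cases.items():
    v_fast, v_slow = mv_variance(*fast), mv_variance(*slow)
    results[name] = (v_fast, v_slow, 100.0 * (v_slow - v_fast) / v_fast)
    print(f"{name}: {v_fast:.4f} -> {v_slow:.4f} ({results[name][2]:.2f}% increase)")
```

The sketch only repeats the arithmetic of the variance formulas; it does not identify the skipped models, whose parameters are taken as given.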
Chapter 6

Conclusion and Recommendations

6.1 Conclusion

In this thesis, we have discussed a few fundamental questions concerning the application of the Box-Jenkins model to control problems. Even though the discussion was about the Box-Jenkins model, the theory can also be applied to the tracking control problem. The work in this thesis and its contributions can be separated into three areas. These areas contain questions a control engineer usually faces in his or her working environment. Is the control interval or frequency right? What is the model of the system? Is a PID controller sufficient? Is a constraint on the variance of the input variable necessary? Is self-tuning necessary? These questions were answered adequately in simple terms.

6.2 Summary of the Thesis

In summary, the thesis has made contributions in three areas: identification, control algorithms and control interval. The contributions are summarized as follows:

• Chapter 3 discusses identification. Identification of the rational transfer function is discussed in Section 3.3. The sum of squares of the disturbance is differentiated and the derivatives are set to zero. This gives r + s + 1 equations: s equations are used in the Newton-Raphson iteration to obtain the parameters of the pole polynomial, and r + 1 equations are used to calculate the parameters of the transmission zero polynomial. The identification of an ARIMA is discussed in Section 3.4. There are two ways to identify it. The first way is to write it in the rational transfer function form and identify its parameters. In the second way, the sum of squares of the white noise is differentiated and the derivatives are set to zero. This gives p + q equations: q equations are used in the Newton-Raphson iteration to obtain the parameters of the moving average polynomial, and p equations are used to calculate the parameters of the autoregressive polynomial.
The combined identification is discussed in Section 3.5. There are a total of s + q equations in the Newton-Raphson iteration, and r + p + 1 equations are used to calculate the parameters of the transmission zero and autoregressive polynomials.

• Chapter 4 discusses the controllers. The PID controller is discussed in Section 4.4. The minimum variance and LQG PID controller gains are obtained by taking the derivatives of the control criteria, which contain the variances of the input and output time series, and setting them to zero. A Newton-Raphson equation is used to iterate from an initial estimate of the controller gains until convergence. The initial estimates are obtained by a crude but systematic method of varying the poles until the control criteria are achieved. The recursive least determinant self-tuning algorithm is discussed in Section 4.5. The algorithm tunes itself to the minimum variance condition by setting the determinant of a positive definite matrix to a minimum. This matrix contains only the current and past input and output variable data.

• Chapter 5 discusses the control interval. The optimal control interval is discussed in Section 5.3. Whether a control loop can be controlled less frequently is determined by obtaining the parameters of a skipped ARIMA and computing the theoretical minimum variance under feedback. The control loop can be controlled less frequently if there is not much degradation in performance, i.e. no great increase in the variance of the output variable under feedback.

6.3 Recommendations

6.3.1 The Nonlinear Stochastic Control System

Linear control theory has been well developed and understood. However, many control systems, especially chemical engineering systems, are nonlinear in nature, so a nonlinear controller must be developed. The problem with nonlinear control is that its theory is not unified.
Unlike linear control theory, where the principle of superposition applies and the system can be sufficiently described by a rational function of polynomials, nonlinear control systems have not been well described even mathematically. For industrial applications, the following simple nonlinear model can be used:

$$y_t = \sum_{i=0}^{k}\sum_{j=0}^{l} \omega_{ij}\,u^{j}_{t-f-i-1} \tag{6.1}$$

With an ARIMA disturbance, the nonlinear stochastic control system would have the following model:

$$y_t = \sum_{i=0}^{k}\sum_{j=0}^{l} \omega_{ij}\,u^{j}_{t-f-i-1} + n_t \tag{6.2}$$

where $n_t$ is the ARIMA disturbance. This means a nonlinear stochastic control system consists of two parts. The transfer function part expresses a nonlinear relationship between the input and output variables as a sum of truncated power series of past input variables. The disturbance can always be described by an ARIMA, which is a linear combination of an uncorrelated sequence. The concept of using power series to describe nonlinear systems is not a wild idea. It was proposed by Wadel, L. B. (1962). Identification of the above model can be done via linear least squares estimation. The minimum variance nonlinear stochastic controller can be obtained in a similar fashion to the minimum variance controller of a Box-Jenkins model. However, the input variable of the nonlinear controller is not unique. This can be solved easily: we can always set a strategy to obtain the control action, namely that it must be real and closest to the previous control action.

6.3.2 The Self Correcting Controller

The problem with the recursive least squares minimum variance self tuning controller is the determination of the delay parameter f and the controller orders m, n. The recursive least determinant self tuning controller gives us an advantage, because we do not need to know the delay of the system. However, we still need to know the orders m, n of the controller. This is the second suggestion of this thesis.
An intelligent scheme is needed to detect the wrong estimation of these two parameters. Once this scheme or algorithm is found, we can have what control engineers have been seeking for a long time: a (linear) controller that can correct itself.

The problem of determining the right orders for the controller on line has been studied by a few authors. Kotzev, A. (1992) designed an algorithm called MOD (Model Order Determination) and applied it to an excavator. Kotzev's work, however, is limited in the sense that it did not use a general model like the Box-Jenkins one, and no useful conclusion can be drawn from the work. The analysis is mainly based on a cost function defined as the sum of squares of the errors, each error being the difference between the measured output variable $y_{N+f+1}$ and its estimate $\mathbf{x}_N^T\boldsymbol{\beta}_N$. In her own words, the conclusion of the study was "for correct modeling, the cost function rises initially (when the error goes to zero), and then settles on a constant value. When under-modeled, the cost function's initial rise is steeper, going to much higher values and leading to instabilities. When over-modeled, the behavior is much more moderate, but the performance deteriorates and the system can go unstable." We are not going to dispute this conclusion except to say that it will not help us in our problem of determining the right orders for the controller.

Kotzev's thesis also mentioned the work of Niu, S. et al. (1992), who introduced an algorithm called AUDI (Augmented Upper Diagonal Identification). This work is actually an expansion of Bierman's UD factorization algorithm. It is more on the side of identification than an ingenious way to determine the correct orders for the controller. However, since it is claimed to identify a number of different parameter vectors at the same time, the algorithm could be tried to determine the correct orders for the controller.
The self-correcting controller concept has not been suggested before because of the weakness of the recursive least squares approach to the self-tuning algorithm. At each control interval, the algorithm has to store the last controller parameter vector $\hat{\boldsymbol{\beta}}_{N-1}$ and the variance matrix $\mathbf{P}_{N-1}$. If the parameter m or n changes, we have to start again, because we do not have these quantities from the last control interval. Therefore, it is very cumbersome to administer changes. In a recursive least determinant self tuning environment, the story is different. What the control algorithm stores is only two vectors of values of $u_t$ and $y_t$, so administering changes in the parameters m or n is much easier. This fact makes the self correcting controller feasible.

Nomenclature

$\omega(z^{-1})$  Transmission zero polynomial
$\delta(z^{-1})$  Pole polynomial
$\theta(z^{-1})$  Moving average polynomial
$\phi(z^{-1})$  Autoregressive polynomial
$r$  Degree of the transmission zero polynomial
$s$  Degree of the pole polynomial
$p$  Degree of the autoregressive polynomial
$q$  Degree of the moving average polynomial
$f$  The pure delay of the process in the Box-Jenkins model
$u_t$  The input variable
$y_t$  The output variable
$k_p$  The proportional gain of a PID controller
$k_i$  The integral gain of a PID controller
$k_d$  The derivative gain of a PID controller
$\lambda$  The Lagrange multiplier or penalty constant in the control criterion
$m$  The number of past input variables the controller remembers
$n$  The number of past output variables the controller remembers

Greek Symbols

$\sigma$  The standard deviation
$\sigma^2$  The variance

Mathematical Operators

$E$  Expectation operator
$\nabla$  Differencing operator
$\partial/\partial$  Partial derivative operator
Min  Minimum value of a positive quantity
$\mu(\,)$  Eigenvalue of a matrix
tr  Trace of a matrix. Sum of the diagonal elements of a square matrix
$|\;|$  Determinant of a matrix. Product of the eigenvalues of a square matrix
Adj  The adjoint of a square matrix
$z^{-1}$  Backward shift operator

Superscripts

$'$  Derivative
$T$  Matrix transpose
$-1$  Matrix inverse

Subscripts

min  Minimum value
mv  Minimum variance
$N$  Sequence or time N
$t$  Time t

Overstrikes

$\hat{\ }$  Optimal or estimate

Acronyms

ARIMA  AutoRegressive Integrated Moving Average
ARMA  AutoRegressive Moving Average
ARMAX  AutoRegressive Moving Average with an eXogenous input
AUDI  Augmented Upper Diagonal Identification
DCS  Distributed Control System
IMC  Internal Model Control
LQG  Linear Quadratic Gaussian
MOD  Model Order Determination
PID  Proportional Integral Derivative
PI  Proportional Integral
PD  Proportional Derivative
PRBS  Pseudo Random Binary Signal
RLD  Recursive Least Determinant
RLS  Recursive Least Squares
UD  Upper Diagonal

Bold Face

Capital character  A matrix
Regular character  A vector

Appendices

A. Mathematical Results

In this appendix, we present proofs of some results that have been used in the thesis but not proved. These results were not proved in the main text because they might be obvious to some readers.

Recursive Least Squares

We consider the least squares problem

$$\min_{\boldsymbol{\beta}} \sum_{t=1}^{N} \left(y_{t+f+1} - \mathbf{x}_t^T\boldsymbol{\beta}\right)^2 \tag{A.1}$$

and its solution

$$\hat{\boldsymbol{\beta}} = [\mathbf{X}^T\mathbf{X}]^{-1}\mathbf{X}^T\mathbf{y} \tag{A.2}$$

If we have N − 1 observations, we can label the parameter vector with the subscript N − 1 to indicate that it has been estimated with N − 1 observations, or at time N − 1, and we write

$$\hat{\boldsymbol{\beta}}_{N-1} = [\mathbf{X}_{N-1}^T\mathbf{X}_{N-1}]^{-1}\mathbf{X}_{N-1}^T\mathbf{y}_{N-1} \tag{A.3}$$

where

$$\mathbf{X}_{N-1} = \begin{bmatrix} \mathbf{x}_1^T \\ \mathbf{x}_2^T \\ \vdots \\ \mathbf{x}_{N-1}^T \end{bmatrix}, \qquad \mathbf{y}_{N-1} = \begin{bmatrix} y_{f+2} \\ y_{f+3} \\ \vdots \\ y_{N+f} \end{bmatrix} \tag{A.4}$$

Now when observation N is available, we want to find the relationship between $\hat{\boldsymbol{\beta}}_N$ and $\hat{\boldsymbol{\beta}}_{N-1}$. We can proceed as follows.
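$$\hat{\boldsymbol{\beta}}_N = [\mathbf{X}_N^T\mathbf{X}_N]^{-1}\mathbf{X}_N^T\mathbf{y}_N \tag{A.5}$$

$$\quad = [\mathbf{X}_{N-1}^T\mathbf{X}_{N-1} + \mathbf{x}_N\mathbf{x}_N^T]^{-1}\begin{bmatrix}\mathbf{X}_{N-1}^T & \mathbf{x}_N\end{bmatrix}\begin{bmatrix}\mathbf{y}_{N-1} \\ y_{N+f+1}\end{bmatrix} \tag{A.6}$$

$$\quad = [\mathbf{X}_{N-1}^T\mathbf{X}_{N-1} + \mathbf{x}_N\mathbf{x}_N^T]^{-1}\left[\mathbf{X}_{N-1}^T\mathbf{y}_{N-1} + \mathbf{x}_N y_{N+f+1}\right] \tag{A.7}$$

Using the Sherman-Morrison formula, we can write

$$\mathbf{P}_N = [\mathbf{X}_N^T\mathbf{X}_N]^{-1} \tag{A.8}$$

$$\quad = [\mathbf{X}_{N-1}^T\mathbf{X}_{N-1} + \mathbf{x}_N\mathbf{x}_N^T]^{-1} \tag{A.9}$$

$$\quad = [\mathbf{X}_{N-1}^T\mathbf{X}_{N-1}]^{-1} - \frac{[\mathbf{X}_{N-1}^T\mathbf{X}_{N-1}]^{-1}\mathbf{x}_N\mathbf{x}_N^T[\mathbf{X}_{N-1}^T\mathbf{X}_{N-1}]^{-1}}{1 + \mathbf{x}_N^T[\mathbf{X}_{N-1}^T\mathbf{X}_{N-1}]^{-1}\mathbf{x}_N} \tag{A.10}$$

$$\quad = \mathbf{P}_{N-1} - \frac{\mathbf{P}_{N-1}\mathbf{x}_N\mathbf{x}_N^T\mathbf{P}_{N-1}}{1 + \mathbf{x}_N^T\mathbf{P}_{N-1}\mathbf{x}_N} \tag{A.11}$$

and therefore

$$\hat{\boldsymbol{\beta}}_N = [\mathbf{X}_{N-1}^T\mathbf{X}_{N-1} + \mathbf{x}_N\mathbf{x}_N^T]^{-1}\left[\mathbf{X}_{N-1}^T\mathbf{y}_{N-1} + \mathbf{x}_N y_{N+f+1}\right] \tag{A.12}$$

$$\quad = \left[\mathbf{P}_{N-1} - \frac{\mathbf{P}_{N-1}\mathbf{x}_N\mathbf{x}_N^T\mathbf{P}_{N-1}}{1 + \mathbf{x}_N^T\mathbf{P}_{N-1}\mathbf{x}_N}\right]\left[\mathbf{X}_{N-1}^T\mathbf{y}_{N-1} + \mathbf{x}_N y_{N+f+1}\right] \tag{A.13}$$

$$\quad = \left[\mathbf{I} - \frac{\mathbf{P}_{N-1}\mathbf{x}_N\mathbf{x}_N^T}{1 + \mathbf{x}_N^T\mathbf{P}_{N-1}\mathbf{x}_N}\right]\left[\mathbf{P}_{N-1}\mathbf{X}_{N-1}^T\mathbf{y}_{N-1} + \mathbf{P}_{N-1}\mathbf{x}_N y_{N+f+1}\right] \tag{A.14}$$

$$\quad = \left[\mathbf{I} - \frac{\mathbf{P}_{N-1}\mathbf{x}_N\mathbf{x}_N^T}{1 + \mathbf{x}_N^T\mathbf{P}_{N-1}\mathbf{x}_N}\right]\hat{\boldsymbol{\beta}}_{N-1} + \left[\mathbf{P}_{N-1}\mathbf{x}_N - \frac{\mathbf{P}_{N-1}\mathbf{x}_N\mathbf{x}_N^T\mathbf{P}_{N-1}\mathbf{x}_N}{1 + \mathbf{x}_N^T\mathbf{P}_{N-1}\mathbf{x}_N}\right]y_{N+f+1} \tag{A.15}$$

Now if we define

$$\mathbf{K}_N = \frac{\mathbf{P}_{N-1}\mathbf{x}_N}{1 + \mathbf{x}_N^T\mathbf{P}_{N-1}\mathbf{x}_N} \tag{A.16}$$

then the parameter vector $\hat{\boldsymbol{\beta}}_N$ is given as

$$\hat{\boldsymbol{\beta}}_N = \hat{\boldsymbol{\beta}}_{N-1} - \mathbf{K}_N\mathbf{x}_N^T\hat{\boldsymbol{\beta}}_{N-1} + \left[\mathbf{P}_{N-1}\mathbf{x}_N - \frac{\mathbf{P}_{N-1}\mathbf{x}_N\mathbf{x}_N^T\mathbf{P}_{N-1}\mathbf{x}_N}{1 + \mathbf{x}_N^T\mathbf{P}_{N-1}\mathbf{x}_N}\right]y_{N+f+1} \tag{A.17}$$

$$\quad = \hat{\boldsymbol{\beta}}_{N-1} - \mathbf{K}_N\mathbf{x}_N^T\hat{\boldsymbol{\beta}}_{N-1} + \mathbf{K}_N y_{N+f+1} \tag{A.18}$$

$$\quad = \hat{\boldsymbol{\beta}}_{N-1} + \mathbf{K}_N\left[y_{N+f+1} - \mathbf{x}_N^T\hat{\boldsymbol{\beta}}_{N-1}\right] \tag{A.19}$$

The above equation gives the estimated parameter vector $\hat{\boldsymbol{\beta}}_N$ in a recursive form. This form has many advantages. The first advantage, which we can see right away, is that it makes the problem of a matrix inversion disappear. The second advantage is that it gives the estimated parameter vector recursively, so that we can see the effect of the new observation on the parameter vector. This is convenient for process control, because the parameter vector is updated in real time and the new parameter vector is calculated with only a small amount of computation. In general, any estimation method which gives the estimated parameter vector as the product of a matrix inverse and a vector can give its solution in a recursive form.

Derivatives of u, v and w

In this section, we give the detailed expressions for the derivatives of the quantities u, v and w used in obtaining the PID controller gains.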
Since the controller gain Ij appears in the coefficients f3 s, but not a s, the derivatives of u and v are given as below. k k ft ft u k=\ Lft-i r ft \ & 1 1 ft +2? [T . ft . ft-i. E L i -2 2 (A.20) ft-i. ftft+i Efc=i ft ft -i-l ftft+fc-, fc=i . ftft £fe=i ftft+i ftft+A;-; E L l [r -n' ft ft ft-i ftft ft- E L i -2 EJLi ftft+i ftfti+/c_; ftft ^T ft ft i-l L ft-i (A.21) APPENDIX 172 A The derivatives of the individual components in the above equations are as follows: - E £ = i <Xk/3'k+i + ak+i/3'k E L l kfi -l (A.22) + &n+k-lfi a n+k k -<XlP'n - Un&' X The derivative of an inverse of a square matrix is no difficult task for us. To find the expression for the derivative of an inverse, we write r _ 1 r (A.23) = i and take the derivative of both sides of the above equation to obtain [r- ]'r + r - r ' = o (A.24) [r- ]' = - r - r ' r - (A.25) 1 1 and therefore 1 1 1 where fi fis r = 0 fix fin 2 03 04 ... o ... o 0 0 (A.26) 0 fix fin-2 uj fin 0 0 Since we are going to take the second order derivatives, it is better for us to write the derivative of u with respect to /; in a clearer form. The derivative is given as below: ™_i E f c = 1 dp x d(3 k+ k " " - f t - + *+i-ai7 a ft du 9 ^ dft dh + dh d(3 n QX dh ' a k-i n+ d/3x dh 02 dh LAn-l APPENDIX 173 A Efc=i kOSk+i - a Y!k=l CtkCtn+k-l Ofcft+i - — Ctkf3 +k-l — n a-i_a n - Ctifin otk+iPk ft ft CYn+k-lflk dh ft-i QL Pi J n 5ft YX=\ +2 k®k+i - a I Efcrri ct a k -i - n+k - « l « n ctkf3k+i - a>k+if3k - ctkfln+k-i dh 5ft 5/ t Ct k-l/3k (A.27) n+ dft-i 5L CtlPn — « n f t Before taking the second order derivative of u, we should remind ourselves that the controller gains exist only in the coefficients fts as linear functions. Therefore, a double differentiation on these coefficients will result in zeros. Secondly, the rules of differentiation of matrix and vector products are just the same as those for scalars. We have done this in the chapter of identification. 
We have ™_i n 5ft +1 5ft dh dh 5ft +k-l du 2 dh dljdk + OL k-l n+ ft ft 5ft 5/, dh L ft-i 5ft , 5ft 5/, ,„_! 5ft , _ 5ft E Z = i « * 5L 5/, 5ft 5/, 5ft + 1 - 2 5ft 5/ df3n+k-l dh 5ft ' + 5/, 1 dlj 4 5ft a, r • dh 5ft-! 5/, APPENDIX 174 _ n _ ! dh+i A d h h + 2 dh dh d h +k-i E f c = l oik h + Ctn+k-l dh dh 0n-1 dh , dh dh dlj ft + 2 E f c = l Ctk®n+k-l — Ctkh+k-l — Ci k-lh n+ r-i^r-i^Erdlj a a r Efc=l n - kO!k+l a>ih - Q - a otkh+i n 1 02 dk h ~ L0n-l J otk+ih 01 + 2 - ctkh+k-i - Y!k=l k&n+k-l a a a x n - a x h - a -ih n+k r-i^Er- —r1 3/,- 5/j L ctnh aft E £ = i OfcO/c+i - a n x - — ct h+k-i E f c = i ka +k-i a akh+i — a k a n - a ^ h - otk+ih n + dh k - i h dh-i &nh dh dh+i . d h n dh dh - 2 E j t = i <*k d h +k-i dh dh olj , + a n + dh dlj k-i dh dh d h dh dh dh-i dh 02 1 0n-l APPENDIX 175 A Efc=l CtkCtk+l - "fcft+l - 2 J2k=l kO! +k-l a n — CtkPn+k-l 5ft Ofc+lft - — dh -i5£r Ctn+k-lflk 5ft (A.28) 1 dh dh dh In a similar fashion, we can write the derivative of v with respect to /, as follows: dv dh - 2 + ft+l dh k+l d(3n+k-l £ f t ^ - 2 k=i 5ft d(3 R n+A; —/ dh dh 5ft 5/,- ft ft ft-i 5 f t , 5/3 5L ^ 5 / , fl 1 ULri ftft+i +2 EL=X 5/ ftft+/t-/ : ft ft 4 L ft-i J ftft 5ft a/,- Efc=i ftft+i EL=1 ftft+i-/ r -i 5ft dk 5ft-r ftft 5L (A.29) APPENDIX 176 and then take the derivative of this quantity with respect to Ij to obtain ! 5ft5ft+i j 2 h f <~w d/3 5 f t _ ; . dljdk dlj dh „ _ l a dPk+1 + 2 E L i Pk—ET, r dh dftn+k-l ft d(3 - 5 f t n+k dlj 5ft R Pk+\ dh L+ t ft.+k-r dh T Pi ft dh ft-i o ^ n ' „„_! 
J l djh ^ h + ( J n d h 5ft+i a 5ft 5ft dftn+k-l dh E U I P> 5ft 5L 5ft dh 5ft_i dh h „ _i n fl 5ft z^fc=i P ^ - + Pn + x ^ 5/, 5L dh 5ft r Pk+i 5/, Pi + 2 E f c = 1 5ft ft-^7— + 0 5/, dh ft ft-i ft^ + 5L - dh 1 5ft + dh k d^ m d%d£ dlj dh dlj dh 1 n 8f3 T ~~r~ +fc k ®i dh Efc=i + 1 dlj dh dlj dh ' k—1 dv 5ft ft-^ 5/, 1 ft i-i ft Lft-i A APPENDIX 111 A Efc=l ftft+1 -i5r -iar _i r PkPn+k-l E L l ft ft r r dlj dh ft-i ftft ELl ftft+i El=l r ftft+fc-/ ft ft - i — - i — - i r r dh dlj L ftft ft-i aft EL! + 2 dh 5ft dh ftft+i Efe=l Pkfin+k-l 5ft-i 50 ftft -1 5ft+i ELi # a 50 + ft+i 5ft 5ft 5/ 5^ 5/, 50 8 - ELI 2 ft + 50 ft+A;- 5ft B ^50 ELl 5ft 5ft+fc_; '50 5ft_i 5/,- 5ft + ^ 5 0 ftft+i 5ft 5/ 5ft 5/, 2 + 2 E l = i ftft+fc-. r -i5T 5/, (A.30) 5ft_i ftft 5/i The first and second order derivatives of the quantity w are given below. dw o = 57 ?o L 0 A5r/ 7 r > „ — — 4- 7 f c N ,=1 ^ 5^ 0 5j?o fl 0 5ft, 5/, APPENDIX 178 A $ c) d T £ L i Q^VkVk+i - QJ-VkVoPk+i - Tjj-rik+iVoPk • ft d ft 5 5 i-l Efc=l -^rVkVn+k-l — -QJ-VkVoPn+k-l ~ -^fVn+k-lVoh dh +2 ft-1 9 R 9 R 9 E L i yMk+i - ijkVoPk+i - Vk+ivoPk ft ft E L i VkVn+k-i - r/ r]of3 k-i - Vn+k-iVoPk k dh n+ ft-i E L i VkVk+i - r} rjo/3 k k+1 aft - r^+ir/oft dh 5ft E L l VkVn+k-l - T] 1]of3n+k-l ~ Vn+k-lVoPk r1 +2 (A.31) dh k 5ft-j dh Because the parameter rj s are linear functions of the controller gains, by taking the derivative of the above equation with respect to 0, we have k dw dljdk 2 57/o drjo %m~ + A dr\ dr\ k 2 drj k 5ft 0 £ 5l7 50 " 50 & " a 5 ( 2 2 2 ? ? 0 50 }+ 2 dr] dn k drj k 5ft 0 5 0 5 / ~ " 5/7^ ~ 5 ( 2 2 l 2770 T E L ! 
dLdh a T a T ^ ' / H l - dljdh aTar^VoPk+l ~dljdh aT^Vk+iVoPk 5 2 + 2 Ei=i Vk^n+k-l ~^VkVoh k-i dljdh + ~ J-^Vn+k-tVofc -.-1 ft ft ft-] 5 2 5 2 5 2 r/i??oft ??n??oft VlVn ~ dLdh dljdh 505/t- APPENDIX A 179 d d d - QJ-rikVoPk+i ~ QJ-Jlk+irioPk E £ = i -QfVkVk+i ft - 2 w 9 d J2k=l QJ-VkVn+k-l - ' 02 5 Q^VkVoPn+k-l ~ dh -Qj-Vn+k-lVoPk 0n-l 9 d YlZl -Qj-VkVk+i + 2 R 9 R 9 d - gj-VkVoPk+i d - dh d 5 J2k=l -Qi'VkVn+k-l 501 -Tjj-Vk+lVoPk d — -Q^VkVoPn+k-l dh i-l ~ -Q^Vn+k-lVoPk 50n-l 9 R 9 5 E L i ^rVkVk+i dh 3 5 - Q^-VkVoPk+i - di R 9 Mryk+ivoPk 5/, 01 5 5 Efc=i -^-VkVn+k-l dl 5 — T^rVkVoPn+k-l ~ dh 02 -Q^Vn+k-lVoPk dh 0n-: 9 dh 9 VlVn ~ R 9 -KTVlVoPn dh Efe=i Wfc+i - VkVoPk+i ~ R dh-^rVnVoPl ~ Vk+iVoPk T 01 + 2 Efe=l VkVn+k-l - VkVoPn+k-l ~ Vn+k-lVoPk dlj dh VlVn ~ VlVoPn ~ rinVoPl 02 0„-l Efc=l ?Mfc+l - VkVoPk+l ~ r/k+lTjofSk 01 + 2 02 E L l VkVn+k-l - T] rioPn+k-l k VlVn - mVoPn - ~ Vn+k-lVofik VnVoPl dh dlj L 0n-l APPENDIX 180 A dh -, T E L i VkVk+i - VkVoPk+i - Vk+iVoPk dlj E L l VkVn+k-l - VkVoPn+k-l ~ r? fc-/?7oft n+ dlj dh dh d d d E L i Qj-Vkm+i - Qj-Vk'ooPk+i - Qi-rik+i'noPk n T i Efc=l d d -WrVkVn+k-l d — -KTVkVoPn+k-l dh ~ dlj 9 R 9 dlj dh 9 dh dh dh dh 1-1 ^-Vn+k-lVoPk dh dh dh-i dh R dh dh djk dh E L i VkVk+i - VkVoh+i - Vk+ivoh E L l VkVn+k-l - VkVoh+k-l - Vn+k-lVoh dij (A.32) dh-i dh ViVn - ViVoh - VnVoh Eigenvalues of Sum of 2 Symmetric Matrices We consider the following matrix relation: C A + B = (A.33) where A and B are symmetric of dimension n. Denote 7,-, a,- and h t h corresponding eigenvalues of the matrices C , A and B arranged in a nonincreasing order for the three sets. 
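By the minimax theorem, we have

$$\gamma_s = \min \max\,(\mathbf{x}^T\mathbf{C}\mathbf{x}), \qquad \mathbf{x}^T\mathbf{x} = 1 \tag{A.34}$$

$$\mathbf{z}_i^T\mathbf{x} = 0 \qquad (i = 1, 2, \ldots, s-1) \tag{A.35}$$

Hence, if we take any particular set of $\mathbf{z}_i$, we have for all the corresponding $\mathbf{x}$

$$\gamma_s \le \max\,(\mathbf{x}^T\mathbf{C}\mathbf{x}) = \max\,(\mathbf{x}^T\mathbf{A}\mathbf{x} + \mathbf{x}^T\mathbf{B}\mathbf{x}) \tag{A.36}$$

If $\mathbf{R}$ is the orthogonal matrix such that

$$\mathbf{R}^T\mathbf{A}\mathbf{R} = \mathrm{diag}(\alpha_i) \tag{A.37}$$

then if we take $\mathbf{z}_i = \mathbf{R}\mathbf{e}_i$ and write $\mathbf{y} = \mathbf{R}^T\mathbf{x}$, the above relations to be satisfied are:

$$\mathbf{z}_i^T\mathbf{x} = \mathbf{e}_i^T\mathbf{y} = 0 \qquad (i = 1, 2, \ldots, s-1) \tag{A.38}$$

With this choice of $\mathbf{z}_i$, the first (s − 1) components of $\mathbf{y}$ are zeros, and from Equation (A.36), we have

$$\gamma_s \le \max\,(\mathbf{x}^T\mathbf{A}\mathbf{x} + \mathbf{x}^T\mathbf{B}\mathbf{x}) = \max\left(\sum_{i=s}^{n}\alpha_i y_i^2 + \mathbf{x}^T\mathbf{B}\mathbf{x}\right) \tag{A.39}$$

However,

$$\sum_{i=s}^{n}\alpha_i y_i^2 \le \alpha_s \tag{A.40}$$

while

$$\mathbf{x}^T\mathbf{B}\mathbf{x} \le \beta_1 \tag{A.41}$$

for any $\mathbf{x}$, where $\beta_1$ is the largest eigenvalue of the matrix $\mathbf{B}$. Therefore, the expression in the brackets of Equation (A.39) is not greater than $\alpha_s + \beta_1$. That means

$$\gamma_s \le \alpha_s + \beta_1 \tag{A.42}$$

Now applying the same result to the matrix relation

$$\mathbf{A} = \mathbf{C} + (-\mathbf{B}) \tag{A.43}$$

with the eigenvalues of $-\mathbf{B}$ in nonincreasing order as $-\beta_n, -\beta_{n-1}, \ldots, -\beta_1$, we obtain

$$\alpha_s \le \gamma_s + (-\beta_n) \tag{A.44}$$

or

$$\gamma_s \ge \alpha_s + \beta_n \tag{A.45}$$

The above eigenvalue relations state that when $\mathbf{B}$ is added to $\mathbf{A}$, all the eigenvalues of the latter are shifted by an amount between the smallest and largest eigenvalues of the former.

Eigenvalues of Sum of a Symmetric Matrix and a Rank Unity Symmetric Matrix

Consider the following matrix relation:

$$\mathbf{C} = \mathbf{A} + \mathbf{B} \tag{A.46}$$

where $\mathbf{A}$ and $\mathbf{B}$ are symmetric of dimension n and $\mathbf{B}$ is of rank unity. In this case, there exists an orthogonal matrix $\mathbf{R}$ such that

$$\mathbf{R}^T\mathbf{B}\mathbf{R} = \begin{bmatrix} \rho & \mathbf{0}^T \\ \mathbf{0} & \mathbf{O} \end{bmatrix} \tag{A.47}$$

where $\rho$ is the unique nonzero eigenvalue of $\mathbf{B}$. If we write

$$\mathbf{R}^T\mathbf{A}\mathbf{R} = \begin{bmatrix} \alpha & \mathbf{a}^T \\ \mathbf{a} & \mathbf{A}_{n-1} \end{bmatrix} \tag{A.48}$$

then there is an orthogonal matrix $\mathbf{S}$ of order n − 1 such that

$$\mathbf{S}^T\mathbf{A}_{n-1}\mathbf{S} = \mathrm{diag}(\lambda_i) \tag{A.49}$$

Now if we define $\mathbf{Q}$ by the relation

$$\mathbf{Q} = \mathbf{R}\begin{bmatrix} 1 & \mathbf{0}^T \\ \mathbf{0} & \mathbf{S} \end{bmatrix} \tag{A.50}$$

then $\mathbf{Q}$ is orthogonal and

$$\mathbf{Q}^T(\mathbf{A} + \mathbf{B})\mathbf{Q} = \begin{bmatrix} \alpha & \mathbf{b}^T \\ \mathbf{b} & \mathrm{diag}(\lambda_i) \end{bmatrix} + \begin{bmatrix} \rho & \mathbf{0}^T \\ \mathbf{0} & \mathbf{O} \end{bmatrix} \tag{A.51}$$

where $\mathbf{b} = \mathbf{S}^T\mathbf{a}$.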
The eigenvalues of $\mathbf{A}$ and of $\mathbf{A} + \mathbf{B}$ are therefore those of

$$\begin{bmatrix} \alpha & \mathbf{b}^T \\ \mathbf{b} & \mathrm{diag}(\lambda_i) \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} \alpha + \rho & \mathbf{b}^T \\ \mathbf{b} & \mathrm{diag}(\lambda_i) \end{bmatrix} \tag{A.52}$$

Now if we denote these eigenvalues $\mu_i(\mathbf{A})$ and $\mu_i(\mathbf{A} + \mathbf{B})$ in decreasing order, then they satisfy the following relation:

$$\mu_i(\mathbf{A} + \mathbf{B}) = \mu_i(\mathbf{A}) + m_i\rho \tag{A.53}$$

where $0 \le m_i \le 1$ and $\sum_i m_i = 1$.

B. Computer Programs

In this section, we include the programs written to verify the theory presented in the thesis.

function [omega,delta,varnt]=tf_id(yt,ut,f,r,s,ndel);
%
% Identification algorithm for B-J transfer function model.
%
% Calling sequence:
%
% function [omega,delta,varnt]=tf_id(yt,ut,f,r,s,ndel);
%
% Input variables:
% yt    : The output variable column vector.
% ut    : The input variable column vector.
% f     : The pure delay of the process. No delay means f=0.
% r     : The number of transmission zeros.
% s     : The number of poles.
% ndel  : The initial estimate of the poles parameter vector.
%
% Output variables:
% omega : The column vector of the process gain.
% delta : The column vector of the time constant.
% varnt : Variance of the disturbance time series.
%
% Written on May 15th 1996 by Ky M. Vu
%
nob=length(yt);
for i=1:nob-f-r-1
  y(i,1)=yt(nob+1-i,1);
  for j=1:r+1
    u(i,j)=ut(nob+1-f-i-j,1);
  end;
end;
esp=1;
lesp=1;
iter=1;
%
% Start iterating
%
while (iter<100&esp>1.0e-7)
  del=ndel;
  psi(1,1)=del(1,1);
  for k=2:nob-f-r-2
    sum=0;
    for l=1:k-1
      if (l>=k-s)
        sum=sum+psi(l,1)*del(k-l,1);
      end;
    end;
    if (k<=s)
      psi(k,1)=sum+del(k,1);
    else
      psi(k,1)=sum;
    end;
  end;
%
% Calculate the optimal omega vector
%
  for i=1:nob-f-r-1
    for j=1:nob-f-r-1
      if (j==i)
        mpsi(i,j)=1;
      elseif (j>i)
        mpsi(i,j)=psi(j-i,1);
      else
        mpsi(i,j)=0;
      end;
    end;
  end;
  omg=inv(u'*mpsi'*mpsi*u)*u'*mpsi'*y;
%
% C a l c u l a t e the i d e n t i f i c a t i o n equations y. for i=l:s for k=l:nob-f-r-2 sum=0; i f (k<i) dpsi(k,i)=0; B APPENDIX B elseif (k==i) dpsi(k,i)=l; else for if l=l:k-l (l>=k-s) sum=sum+dpsi(l,i)*del(k-l,l); end; if (l==k-i) sum=sum+psi(1,1); end; end; dpsi(k,i)=sum; end; end; % for k=l:nob-f-r-l for 1=1:nob-f-r-l i f (K=k) dipsi(k,l)=0; else dipsi(k,1)=dpsi(1-k,i); end; end; end; domg(:,i)=inv(u *mpsi *mpsi*u)*... , , (u'*dipsi'-(u'*dipsi'*mpsi*u+u'*mpsi'*dipsi*u)* inv(u *mpsi'*mpsi*u)*u'*mpsi')*y; ; g(i,l)=(dipsi*u*omg+mpsi*u*domg(:,i))'*(y-mpsi*u*omg); end; I % Calculate the derivative for matrix i=l:s for k=l:nob-f-r-l for 1=1:nob-f-r-l i f (K=k) dipsi(k,l)=0; else dipsi(k,l)=dpsi(l-k,i); end; APPENDIX end; end; C a l c u l a t e second d e r i v a t i v e s for j=l : s for k=l:nob-f-r-2 sum=0; if (k<i+j) d2psi(k,1)=0; elseif (k==i+j) d2psi(k,l)=2; else for l=l:k-l if (l>=k-s) sum=sum+d2psi(1,l)*del(k-l,1); end; if (l==k-i) sum=sum+dpsi(l,j); end; if (l==k-j) sum=sum+dpsi(l,i); end; end; d2psi(k,l)=sum; end; end; for k=l:nob-f-r-l for 1=1:nob-f-r-l i f (K=k) djpsi(k,l)=0; dijpsi(k,l)=0; else djpsi(k,l)=dpsi(l-k,j); dijpsi(k,l)=d2psi(l-k,l); end; end; end; Calculate the Jacobi d e r i v a t i v e matrix B 187 APPENDIX B % d2omg=inv(u'*mpsi'*mpsi*u)*(u'*dijpsi *y-... ; (u'*dijpsi'*mpsi*u+u'*dipsi'*djpsi*u+... u'*djpsi'*dipsi*u+u'*mpsi'*dijpsi*u)*omg-... (u' * d i p s i ' * m p s i * u + u ' * m p s i ' * d i p s i * u ) * d o m g ( : , j ) - . . . (u'*djpsi *mpsi*u+u *mpsi'*djpsi*u)*domg(:,i)); ; ) dg(i,j)=(dijpsi*u*omg+dipsi*u*domg(:,j)+... djpsi*u*domg(:,i)+mpsi*u*d2omg)'*(y-mpsi*u*omg)-... (dipsi*u*omg+mpsi*u*domg(:,i))'*... 
(djpsi*u*omg+mpsi*u*domg(:,j)); end; end; % °/„ M o d i f i e d Newton-Raphson i t e r a t i o n esp=g'*g/s; if (esp<=lespIiter==l) ldel=del; ddel=-dg\g; sdel=ddel; lesp=esp; iter=iter+l; iter esp del' omg' else ddel=ddel/2; sqddel=ddel *ddel/s; ; if (sqddel<=1.0e-10) rand('uniform'); for i=l:s ddel(i l)=rand*sdel(i,l); J end; end; iter [ l e s p esp] end; ndel=ldel+ddel; APPENDIX B 188 end; I % Get f i n a l results I omega=omg; delta=del; x=y-mpsi*u*omg; varnt=x'*x/(nob-f-r-l); function [phi,theta,varat]=arma_id(nt,p,q,nthe); y. % I d e n t i f i c a t i o n a l g o r i t h m f o r an ARMA t i m e s e r i e s model. % % Calling sequence: I % function [phi,theta,varat]=arma_id(nt,p,q,nthe); y. % Input v a r i a b l e s : I nt P y. % y. q nthe The ARMA t i m e series. The number o f a u t o r e g r e s s i v e parameters. The number o f moving average p a r a m e t e r s . The i n i t i a l estimate o f t h e moving average parameter vector. % % Output v a r i a b l e s : X phi The column v e c t o r of t h e a u t o r e g r e s s i v e p a r a m e t e r s . % theta The column v e c t o r o f t h e moving average p a r a m e t e r s . % varat Variance of t h e white n o i s e . y. y. W r i t t e n by Ky M. Vu on May 1 5 t h 1996 y. 
nob=length(nt); m=max(p,q); for i=l:nob-m n(i,l)=nt(nob+l-i,l); for j=l:p n l ( i , j ) = n t ( n o b + l - i - j , 1); end; for j=l:q n2(i,j)=nt(nob+l-i-j,1); end; 189 APPENDIX B end; esp=l; lesp=l; iter=l; % % Start iterating % while (iter<100&esp>1.0e-7) the=nthe; gama(l,l)=the(l,1); for k=2:nob-m-l sum=0; for l=l:k-l if (l>=k-q) sum=sum+gama(l,1)*the(k-l,1) ; end; end; if (k<=q) gama(k,l)=sum+the(k,1); else gama(k,l)=sum; end; end; % % Calculate the optimal p h i vector for i=l:nob-m for if j=l:nob-m (j==i) mgama(i,j)=l; elseif (j>i) mgama(i,j )=gama(j-i, 1); else mgama(i,j)=0; end; end; end; ophi=inv(nl'*mgama'*mgama*nl)*nl^mgama'*(n+mgama*n2*the); I % Calculate the i d e n t i f i c a t i o n equations APPENDIX B 190 I for i=l:q f o r k=l:nob-m-l sum=0; if (k<i) dgama(k,i)=0; elseif (k==i) dgama(k,i)=l; else for l=l:k-l if (l>=k-q) sum=sum+dgama(l,i)*the(k-1, 1 ) ; end; if (l==k-i) sum=sum+gama(l,1); end; end; dgama(k,i)=sum; end; end; % f o r k=l:nob-m for if 1=1:nob-m (K=k) digama(k,1)=0; else digama(k,l)=dgama(l-k,i); end; end; end; ei=zeros(q,1); ei(i,l)=l; dophi(:,i)=inv(nl *mgama *mgama*nl)*(nl'*digama *(n+mgama*n2*the)+... , , , nl'*mgama'*(digama*n2*the+mgama*n2*ei)-... (nl'*digama *mgama*nl+nl'*mgama'*digama*nl)*... ) inv(nl'*mgama'*mgama*nl)*nl'*mgama'*(n+mgama*n2*the)); h(i,I)=(digama*n2*the+mgama*n2*ei-... digama*nl*ophi-mgama*n2*dophi(:,i))'*... (n+mgama*n2*the-mgama*nl*ophi); end; APPENDIX B I % Calculate the d e r i v a t i v e matrix '/. 
  for i=1:q
    for k=1:nob-m
      for l=1:nob-m
        if (l<=k)
          digama(k,l)=0;
        else
          digama(k,l)=dgama(l-k,i);
        end;
      end;
    end;
%
% Calculate second derivatives
%
    for j=1:q
      for k=1:nob-m-1
        sum=0;
        if (k<i+j)
          d2gama(k,1)=0;
        elseif (k==i+j)
          d2gama(k,1)=2;
        else
          for l=1:k-1
            if (l>=k-q)
              sum=sum+d2gama(l,1)*the(k-l,1);
            end;
            if (l==k-i)
              sum=sum+dgama(l,j);
            end;
            if (l==k-j)
              sum=sum+dgama(l,i);
            end;
          end;
          d2gama(k,1)=sum;
        end;
      end;
      for k=1:nob-m
        for l=1:nob-m
          if (l<=k)
            djgama(k,l)=0;
            dijgama(k,l)=0;
          else
            djgama(k,l)=dgama(l-k,j);
            dijgama(k,l)=d2gama(l-k,1);
          end;
        end;
      end;
      ei=zeros(q,1);
      ej=zeros(q,1);
      ei(i,1)=1;
      ej(j,1)=1;
%
% Calculate the jacobi derivative matrix
%
      d2phi=inv(n1'*mgama'*mgama*n1)*(n1'*dijgama'*(n+mgama*n2*the)+...
            n1'*digama'*(djgama*n2*the+mgama*n2*ej)+...
            n1'*djgama'*(digama*n2*the+mgama*n2*ei)+...
            n1'*mgama'*(dijgama'*n2*the+digama'*n2*ej+djgama'*n2*ei)-...
            (n1'*dijgama'*mgama*n1+n1'*digama'*djgama*n1+...
            n1'*djgama'*digama*n1+n1'*mgama'*dijgama*n1)*ophi-...
            (n1'*digama'*mgama*n1+n1'*mgama'*digama*n1)*dophi(:,j));
      dh(i,j)=(dijgama*n2*the+digama*n2*ej+djgama*n2*ei-...
              dijgama*n1*ophi-digama*n1*dophi(:,j)-...
              djgama*n2*dophi(:,i)-mgama*n2*d2phi)'*...
              (n+mgama*n2*the-mgama*n1*ophi)+...
              (digama*n2*the+mgama*n2*ei-...
              digama*n1*ophi-mgama*n2*dophi(:,i))'*...
              (djgama*n2*the+mgama*n2*ej-...
              djgama*n1*ophi-mgama*n2*dophi(:,j));
    end;
  end;
%
% Modified Newton-Raphson iteration
%
  esp=h'*h/q;
  if (esp<=lesp|iter==1)
    lthe=the;
    dthe=-dh\h;
    sthe=dthe;
    lesp=esp;
    iter=iter+1;
    iter
    esp'
    the'
    ophi'
    h'
  else
    dthe=dthe/2;
    sqdthe=dthe'*dthe/q;
    if (sqdthe<=1.0e-10)
      rand('uniform');
      for i=1:q
        dthe(i,1)=rand*sthe(i,1);
      end;
    end;
    iter
    [lesp esp]
  end;
  nthe=lthe+dthe;
end;
%
% Get final results
%
phi=ophi;
theta=the;
x=n+mgama*n2*the-mgama*n1*ophi;
varat=x'*x/(nob-m);

function [omega,delta,phi,theta,varat]=bj_id(yt,ut,f,r,s,p,q,nprm)
%
% Identification algorithm for a Box-Jenkins control model.
%
% Calling sequence:
%
% [omega,delta,phi,theta,varat]=bj_id(yt,ut,f,r,s,p,q,ndel,nthe);
%
% Input variables:
%
% yt    : The output variable time series.
% ut    : The input variable time series.
% f     : The pure delay of the B-J model. No delay f=0.
% r     : The number of transmission zeros. No zero r=0.
% s     : The order of the system.
% p     : The number of autoregressive parameters.
% q     : The number of moving average parameters.
% nprm  : The initial estimate of the time constant parameter
%         vector and the moving average parameter vector.
%
% Output variables:
%
% omega : The column vector of the transmission zeros.
% delta : The column vector of the poles.
% phi   : The column vector of the autoregressive parameters.
% theta : The column vector of the moving average parameters.
% varat : Variance of the white noise.
%
% Written by Ky M. Vu on May 15th 1996
%
nob=length(yt);
for i=1:nob-r-f-1
  y(i,1)=yt(nob+1-i,1);
  for j=1:r+1
    u(i,j)=ut(nob+1-f-i-j,1);
  end;
end;
m=max(p,q);
%
w0=[eye(nob-r-f-m-1) zeros(nob-r-f-m-1,m)];
%
esp=1;
lesp=1;
iter=1;
%
% Start iterating
%
while (iter<100&esp>1.0e-7)
  del=nprm(1:s,:);
  the=nprm(s+1:s+q,:);
  psi(1,1)=del(1,1);
  for k=2:nob-r-f-2
    sum=0;
    for l=1:k-1
      if (l>=k-s)
        sum=sum+psi(l,1)*del(k-l,1);
      end;
    end;
    if (k<=s)
      psi(k,1)=sum+del(k,1);
    else
      psi(k,1)=sum;
    end;
  end;
%
% Calculate the optimal omega vector
%
  for i=1:nob-r-f-1
    for j=1:nob-r-f-1
      if (j==i)
        mpsi(i,j)=1;
      elseif (j>i)
        mpsi(i,j)=psi(j-i,1);
      else
        mpsi(i,j)=0;
      end;
    end;
  end;
  omg=inv(u'*mpsi'*mpsi*u)*u'*mpsi'*y;
%
% and its derivatives w.r.t. the delta
%
  for i=1:s
    for k=1:nob-f-r-2
      sum=0;
      if (k<i)
        dpsi(k,i)=0;
      elseif (k==i)
        dpsi(k,i)=1;
      else
        for l=1:k-1
          if (l>=k-s)
            sum=sum+dpsi(l,i)*del(k-l,1);
          end;
          if (l==k-i)
            sum=sum+psi(l,1);
          end;
        end;
        dpsi(k,i)=sum;
      end;
    end;
%
    for k=1:nob-f-r-1
      for l=1:nob-f-r-1
        if (l<=k)
          dipsi(k,l)=0;
        else
          dipsi(k,l)=dpsi(l-k,i);
        end;
      end;
    end;
%
    domg(:,i)=inv(u'*mpsi'*mpsi*u)*(u'*dipsi'*y-...
              (u'*dipsi'*mpsi*u+u'*mpsi'*dipsi*u)*omg);
    g(i,1)=(dipsi*u*omg+mpsi*u*domg(:,i))'*(y-mpsi*u*omg);
  end;
%
% Now form the disturbance matrices
%
  for k=1:m
    wi=zeros(nob-r-f-m-1,nob-r-f-1);
    for i=1:nob-r-f-m-1
      for j=1:nob-r-f-1
        if (j==i+k)
          wi(i,j)=1;
        end;
      end;
    end;
    if (k<=p)
      n1(:,k)=wi*(y-mpsi*u*omg);
    end;
    if (k<=q)
      n2(:,k)=wi*(y-mpsi*u*omg);
    end;
  end;
%
  gamma(1,1)=the(1,1);
  for k=2:nob-r-f-m-2
    sum=0;
    for l=1:k-1
      if (l>=k-q)
        sum=sum+gamma(l,1)*the(k-l,1);
      end;
    end;
    if (k<=q)
      gamma(k,1)=sum+the(k,1);
    else
      gamma(k,1)=sum;
    end;
  end;
%
% Calculate the optimal phi vector
%
  for i=1:nob-r-f-m-1
    for j=1:nob-r-f-m-1
      if (j==i)
        mgamma(i,j)=1;
      elseif (j>i)
        mgamma(i,j)=gamma(j-i,1);
      else
        mgamma(i,j)=0;
      end;
    end;
  end;
%
  ophi=inv(n1'*mgamma'*mgamma*n1)*n1'*mgamma'*...
       (w0*(y-mpsi*u*omg)+mgamma*n2*the);
%
% and its first order derivatives w.r.t. delta,
%
  for i=1:s
    for k=1:nob-r-f-1
      for l=1:nob-r-f-1
        if (l<=k)
          dipsi(k,l)=0;
        else
          dipsi(k,l)=dpsi(l-k,i);
        end;
      end;
    end;
    for j=1:m
      wi=zeros(nob-r-f-m-1,nob-r-f-1);
      for k=1:nob-r-f-m-1
        for l=1:nob-r-f-1
          if (l==k+j)
            wi(k,l)=1;
          end;
        end;
        if (j<=p)
          din1(:,j)=-wi*(dipsi*u*omg+mpsi*u*domg(:,i));
        end;
        if (j<=q)
          din2(:,j)=-wi*(dipsi*u*omg+mpsi*u*domg(:,i));
        end;
      end;
    end;
%
% The derivatives of phi w.r.t. delta
%
    dphidel(:,i)=inv(n1'*mgamma'*mgamma*n1)*...
      (din1'*mgamma'*(w0*(y-mpsi*u*omg)+mgamma*n2*the)+...
      n1'*mgamma'*(-w0*dipsi*u*omg-w0*mpsi*u*domg(:,i)+...
      mgamma*din2*the)-(din1'*mgamma'*...
      mgamma*n1+n1'*mgamma'*mgamma*din1)*ophi);
  end;
%
  for i=1:q
    for k=1:nob-r-f-m-2
      sum=0;
      if (k<i)
        dgamma(k,i)=0;
      elseif (k==i)
        dgamma(k,i)=1;
      else
        for l=1:k-1
          if (l>=k-q)
            sum=sum+dgamma(l,i)*the(k-l,1);
          end;
          if (l==k-i)
            sum=sum+gamma(l,1);
          end;
        end;
        dgamma(k,i)=sum;
      end;
    end;
%
    for k=1:nob-r-f-m-1
      for l=1:nob-r-f-m-1
        if (l<=k)
          digamma(k,l)=0;
        else
          digamma(k,l)=dgamma(l-k,i);
        end;
      end;
    end;
    ei=zeros(q,1);
    ei(i,1)=1;
%
% and w.r.t. theta
%
    dphithe(:,i)=inv(n1'*mgamma'*mgamma*n1)*...
      (n1'*digamma'*(w0*(y-mpsi*u*omg)+mgamma*n2*the)+...
      n1'*mgamma'*(digamma*n2*the+mgamma*n2*ei)-...
      (n1'*digamma'*mgamma*n1+n1'*mgamma'*digamma*n1)*ophi);
    g(i+s,1)=(digamma*n2*the+mgamma*n2*ei-...
      digamma*n1*ophi-mgamma*n1*dphithe(:,i))'*...
      (w0*(y-mpsi*u*omg)+mgamma*n2*the-mgamma*n1*ophi);
  end;
%
% Calculate the derivative matrix
%
  for i=1:s+q
    for j=1:s+q
%
      if (i<=s)
        if (j<=s)
          for k=1:nob-f-r-2
            sum=0;
            if (k<i+j)
              d2psi(k,1)=0;
            elseif (k==i+j)
              d2psi(k,1)=2;
            else
              for l=1:k-1
                if (l>=k-s)
                  sum=sum+d2psi(l,1)*del(k-l,1);
                end;
                if (l==k-i)
                  sum=sum+dpsi(l,j);
                end;
                if (l==k-j)
                  sum=sum+dpsi(l,i);
                end;
              end;
              d2psi(k,1)=sum;
            end;
          end;
          for k=1:nob-f-r-1
            for l=1:nob-f-r-1
              if (l<=k)
                dipsi(k,l)=0;
                djpsi(k,l)=0;
                dijpsi(k,l)=0;
              else
                dipsi(k,l)=dpsi(l-k,i);
                djpsi(k,l)=dpsi(l-k,j);
                dijpsi(k,l)=d2psi(l-k,1);
              end;
            end;
          end;
          domgdeldel=inv(u'*mpsi'*mpsi*u)*(u'*dijpsi'*y-...
            (u'*dijpsi'*mpsi*u+u'*dipsi'*djpsi*u+...
            u'*djpsi'*dipsi*u+u'*mpsi'*dijpsi*u)*omg-...
            (u'*dipsi'*mpsi*u+u'*mpsi'*dipsi*u)*domg(:,j)-...
            (u'*djpsi'*mpsi*u+u'*mpsi'*djpsi*u)*domg(:,i));
          dg(i,j)=(dijpsi*u*omg+dipsi*u*domg(:,j)+djpsi*u*domg(:,i)+...
            mpsi*u*domgdeldel)'*(y-mpsi*u*omg)-...
            (dipsi*u*omg+mpsi*u*domg(:,i))'*...
            (djpsi*u*omg+mpsi*u*domg(:,j));
        else
          dg(i,j)=0;
        end;
      else
        ii=i-s;
        if (j<=s)
          jj=j;
          for k=1:nob-r-f-1
            for l=1:nob-r-f-1
              if (l<=k)
                djpsi(k,l)=0;
              else
                djpsi(k,l)=dpsi(l-k,jj);
              end;
            end;
          end;
          for kk=1:m
            wi=zeros(nob-r-f-m-1,nob-r-f-1);
            for k=1:nob-r-f-m-1
              for l=1:nob-r-f-1
                if (l==k+kk)
                  wi(k,l)=1;
                end;
              end;
              if (kk<=p)
                djn1(:,kk)=-wi*(dipsi*u*omg+mpsi*u*domg(:,jj));
              end;
              if (kk<=q)
                djn2(:,kk)=-wi*(dipsi*u*omg+mpsi*u*domg(:,jj));
              end;
            end;
          end;
        else
          jj=j-s;
          for k=1:nob-r-f-m-2
            sum=0;
            if (k<ii+jj)
              d2gamma(k,1)=0;
            elseif (k==ii+jj)
              d2gamma(k,1)=2;
            else
              for l=1:k-1
                if (l>=k-q)
                  sum=sum+d2gamma(l,1)*the(k-l,1);
                end;
                if (l==k-ii)
                  sum=sum+dgamma(l,jj);
                end;
                if (l==k-jj)
                  sum=sum+dgamma(l,ii);
                end;
              end;
              d2gamma(k,1)=sum;
            end;
          end;
        end;
        for k=1:nob-r-f-m-1
          for l=1:nob-r-f-m-1
            if (l<=k)
              digamma(k,l)=0;
              if (j>s)
                djgamma(k,l)=0;
                dijgamma(k,l)=0;
              end;
            else
              digamma(k,l)=dgamma(l-k,ii);
              if (j>s)
                djgamma(k,l)=dgamma(l-k,jj);
                dijgamma(k,l)=d2gamma(l-k,1);
              end;
            end;
          end;
        end;
        ei=zeros(q,1);
        ej=zeros(q,1);
        ei(ii,1)=1;
        ej(jj,1)=1;
        if (j<=s)
          dphidelthe=inv(n1'*mgamma'*mgamma*n1)*(djn1'*digamma'*...
            (w0*(y-mpsi*u*omg)+mgamma*n2*the)+n1'*digamma'*...
            (-w0*djpsi*u*omg-w0*mpsi*u*domg(:,jj)+mgamma*djn2*the)+...
            djn1'*mgamma'*(digamma*n2*the+mgamma*n2*ei)+...
            n1'*mgamma'*(digamma*djn2*the+mgamma*djn2*ei)-...
            (djn1'*mgamma'*mgamma*n1+n1'*mgamma'*mgamma*djn1)*...
            dphithe(:,ii)-...
            (n1'*digamma'*mgamma*n1+n1'*mgamma'*digamma*n1)*...
            dphidel(:,jj)-...
            (djn1'*digamma'*mgamma*n1+n1'*digamma'*mgamma*djn1+...
            djn1'*mgamma'*digamma*n1+n1'*mgamma'*digamma*djn1)*ophi);
%
          dg(i,j)=(digamma*djn2*the+mgamma*djn2*ei-...
            digamma*djn1*ophi-digamma*n1*dphidel(:,jj)-...
            mgamma*djn1*dphithe(:,ii)-mgamma*n1*dphidelthe)'*...
            (w0*(y-mpsi*u*omg)+mgamma*n2*the-mgamma*n1*ophi)+...
            (digamma*n2*the+mgamma*n2*ei-...
            digamma*n1*ophi-mgamma*n1*dphithe(:,ii))'*...
            (-w0*djpsi*u*omg-w0*mpsi*u*domg(:,jj)+...
            mgamma*djn2*the-mgamma*djn1*ophi-...
            mgamma*n1*dphidel(:,jj));
        else
          dphithethe=inv(n1'*mgamma'*mgamma*n1)*(n1'*dijgamma'*...
            (w0*(y-mpsi*u*omg)+mgamma*n2*the)+...
            n1'*digamma'*(djgamma*n2*the+mgamma*n2*ej)+...
            n1'*djgamma'*(digamma*n2*the+mgamma*n2*ei)+...
            n1'*mgamma'*(dijgamma*n2*the+digamma*n2*ej+...
            djgamma*n2*ei)-...
            (n1'*djgamma'*mgamma*n1+n1'*mgamma'*djgamma*n1)*...
            dphithe(:,ii)-...
            (n1'*digamma'*mgamma*n1+n1'*mgamma'*digamma*n1)*...
            dphithe(:,jj)-...
            (n1'*dijgamma'*mgamma*n1+n1'*digamma'*djgamma*n1+...
            n1'*djgamma'*digamma*n1+n1'*mgamma'*dijgamma*n1)*ophi);
%
          dg(i,j)=(dijgamma*n2*the+digamma*n2*ej+djgamma*n2*ei-...
            dijgamma*n1*ophi-digamma*n1*dphithe(:,jj)-...
            djgamma*n1*dphithe(:,ii)-mgamma*n1*dphithethe)'*...
            (w0*(y-mpsi*u*omg)+mgamma*n2*the-mgamma*n1*ophi)+...
            (digamma*n2*the+mgamma*n2*ei-...
            digamma*n1*ophi-mgamma*n1*dphithe(:,ii))'*...
            (djgamma*n2*the+mgamma*n2*ej-...
            djgamma*n1*ophi-mgamma*n1*dphithe(:,jj));
        end;
      end;
    end;
  end;
%
% Modified Newton-Raphson iteration
%
  esp=g'*g/(s+q);
  if (esp<=lesp|iter==1)
    lprm=nprm;
    dprm=-dg\g;
    sprm=dprm;
    lesp=esp;
    iter=iter+1;
    iter
    esp'
    nprm'
    g'
  else
    dprm=dprm/2;
    sqdprm=dprm'*dprm/(s+q);
    if (sqdprm<=1.0e-10)
      rand('uniform');
      for i=1:s+q
        dprm(i,1)=rand*sprm(i,1);
      end;
    end;
    iter
    [lesp esp]
  end;
  nprm=lprm+dprm;
end;
%
% Get final results
%
omega=omg;
delta=nprm(1:s,:);
phi=ophi;
theta=nprm(s+1:s+q,:);
at=w0*(y-mpsi*u*omg)+mgamma*n2*theta-mgamma*n1*phi;
varat=at'*at/(nob-r-f-m-1);

function [sigma2n]=armavar(theta,phi,sigma2a);
%
% Routine to calculate the variance for an ARMA time series.
% The routine can take non-monic moving average polynomial.
%
% Calling sequence:
%
% [sigma2n]=armavar(theta,phi,sigma2a);
%
% Input arguments:
%
% theta   : The moving average polynomial.
%           [1 -theta1 -theta2 ... -thetaq];
% phi     : The autoregressive polynomial.
%           [1 -phi1 -phi2 ... -phip];
% sigma2a : The variance of the white noise.
%
% Output arguments:
%
% sigma2n : The variance of the time series.
%
% Author : K. Vu    written on Nov. 13th 1993.
%
p=length(phi);
q=length(theta);
p=p-1;
q=q-1;
n=max(p,q);
r=abs(roots(phi'));
for i=1:p,
  if abs(r(i,1)>=1)
    error('The Time Series is Nonstationary');
  end;
end;
gamma0=zeros(n+1,n+1);
gamma1=zeros(n+1,n+1);
for i=1:n+1,
  for j=1:n+1,
    m=i+j-1;
    if (m<=p+1)
      gamma0(i,j)=phi(1,m);
    else
      gamma0(i,j)=0;
    end;
  end;
end;
for i=1:n+1,
  for j=i:n+1,
    m=j-i+1;
    if (m<=p+1)
      gamma0(i,j)=gamma0(i,j)+phi(1,m);
    end;
  end;
end;
for i=1:n+1,
  for j=2:n+1,
    gamma1(i,j)=gamma0(i,j);
  end;
end;
for i=1:n+1,
  for j=1:n+1,
    m=i-1+j;
    if (m>0&m<=q+1)
      gamma1(i,1)=gamma1(i,1)+theta(1,j)*theta(1,m);
    end;
  end;
  gamma1(i,1)=2*gamma1(i,1);
end;
%
% Get the result
%
sigma2n=det(gamma1)*sigma2a/det(gamma0);

function [c]=polymul(a,b);
% Function to multiply two polynomials
% Calling sequence:
% [c]=polymul(a,b);
% Input argument:
% a : Column of input polynomial coefficients.
% b : Column of input polynomial coefficients.
% Output argument:
% c : Column of resultant polynomial coefficients, c = a*b.
% Written by K. Vu on June 1st 1996.
[m,n]=size(a);
if (m==1)
  degreea=n;
else
  degreea=m;
end;
[m,n]=size(b);
if (m==1)
  degreeb=n;
else
  degreeb=m;
end;
degreec=degreea+degreeb-1;
for i=1:degreec
  c(i,1)=0;
  for j=1:i
    m=i+1-j;
    if (j<=degreea&m<=degreeb)
      c(i,1)=c(i,1)+a(j,1)*b(i+1-j,1);
    end;
  end;
end;

function [kc,vary,varu]=lqg_pid(omega,delta,theta,phi,f,vara,lambda,nl);
%
% Routine to calculate the LQG PID controller gains for a
% Box-Jenkins model control system.
%
% Calling sequence:
%
% [kc,vary,varu,varmv]=lqg_pid(omega,delta,theta,phi,f,vara,lambda,nl);
%
% Input parameters:
%
% omega  : The zero polynomial of the transfer function.
% delta  : The pole polynomial of the transfer function.
% theta  : The moving average polynomial of the disturbance.
% phi    : The autoregressive polynomial of the disturbance.
% f      : The delay of the system. No delay f=0.
% vara   : The variance of the white noise.
% lambda : The Lagrange multiplier. MV PID control, lambda=0.
% nl     : The estimated initial gain vector.
%
% Output parameters:
%
% kc     : The optimal controller gain matrix in the sequence:
%          kp, ki, kd.
% vary   : The variance of the output variable.
% varu   : The variance of the input variable.
%
% Author : K. Vu    written on June 1st 1996.
%
r=length(omega);
r=r-1;
s=length(delta);
s=s-1;
q=length(theta);
q=q-1;
p=length(phi);
p=p-1;
%
% Calculate the resultant polynomials.
%
[alpha]=polymul(delta,theta);
[gama]=polymul(delta,phi);
%
% Check if nonstationary disturbance.
%
if (abs(polyval(phi',1))<=1e-5)
  m=3;
  nonsta=1;
%
% Factor out the nonstationary polynomial
%
  phim(1,1)=1;
  for i=2:p
    phim(i,1)=phim(i-1,1)+phi(i,1);
  end;
  p=p-1;
  if (r==0)
    psi=omega(1,1)*phim;
  else
    [psi]=polymul(omega,phim);
  end;
  g=[0 -1 -2;1 1 1;0 0 1];
else
%
% Case of stationary disturbance only PD controller.
%
  m=2;
  nonsta=0;
  if (r==0)
    psi=omega(1,1)*phi;
  else
    [psi]=polymul(omega,phi);
  end;
  g=[1 1;0 -1];
end;
%
% Start the iteration.
%
esp=1;
lesp=1;
iter=1;
while(iter<50&esp>1e-15)
%
% Get the parameters of the feedback output variable time series.
%
  l=nl;
  [betam]=polymul(l,psi);
  [eta]=polymul(l,alpha);
  na=length(betam);
  ng=length(gama);
  nb=max(ng,na+f+1);
%
  for i=1:nb
    beta(i,1)=0;
    if (i<=ng)
      beta(i,1)=beta(i,1)+gama(i,1);
    end;
    if (i>f+1&i<=na+f+1)
      beta(i,1)=beta(i,1)-betam(i-f-1,1);
    end;
  end;
  na=length(alpha);
  nb=length(beta);
  nc=length(eta);
  [var1]=armavar(alpha',beta',vara);
  [var2]=armavar(eta',beta',vara);
%
  if (nb<nc)
    beta(nb+1:nc,1)=zeros(1:nc-nb,1);
  end;
  alpha(2:na,1)=-alpha(2:na,1);
  beta(2:nb,1)=-beta(2:nb,1);
  eta(2:nc,1)=-eta(2:nc,1);
  n=max(nb,nc)-1;
  u=1;
  w=eta(1,1)^2;
  for k=2:n+1
%
    if (k<=na)
      if (k<=nb)
        u=u+(alpha(k,1)-beta(k,1))^2-beta(k,1)^2;
      else
        u=u+alpha(k,1)^2;
      end;
    end;
%
    if (k<=nc)
      if (k<=nb)
        sum=2*eta(1,1)*beta(k,1);
      else
        sum=0;
      end;
      w=w+eta(k,1)*(eta(k,1)-sum);
    end;
  end;
%
  for i=1:n-1
    for j=1:n-1
      if (i+j<=n)
        mgama(i,j)=-beta(i+j+1,1);
      else
        mgama(i,j)=0;
      end;
      if (j==i)
        mgama(i,j)=mgama(i,j)+1;
      elseif (i>j)
        mgama(i,j)=mgama(i,j)-beta(i-j+1,1);
      end;
    end;
  end;
%
  nd=length(psi);
  for i=1:m
    for j=1:n+1
      if (j>f+i&j<=nd+f+i)
        betap(j,i)=psi(j-f-i,1);
      else
        betap(j,i)=0;
      end;
    end;
    vbetap(:,i)=betap(2:n,i);
  end;
%
  etap=zeros(n,m);
  for i=1:m
    etap(i:na+i-1,i)=alpha(1:na,1);
  end;
%
  for k=1:m
    for i=1:n-1
      for j=1:n-1
        js=(k-1)*(n-1)+j;
        if (i+j<=n)
          gamap(i,js)=-betap(i+j+1,k);
        else
          gamap(i,js)=0;
        end;
        if (i>j)
          gamap(i,js)=gamap(i,js)-betap(i-j+1,k);
        end;
      end;
    end;
  end;
%
  xip=zeros(n-1,m);
  zetap=zeros(n-1,m);
  mbetap=zeros(n-1,m);
  for i=1:n-1
    xi(i,1)=0;
    zeta(i,1)=0;
    mbeta(i,1)=0;
    for j=1:n-1
%
      if (j<=na-1)
        if (i+j<=na-1)
          xi(i,1)=xi(i,1)+alpha(j+1,1)*alpha(j+i+1,1);
        end;
        if (i+j<=nb-1)
          xi(i,1)=xi(i,1)-alpha(j+1,1)*beta(j+i+1,1);
          for k=1:m
            xip(i,k)=xip(i,k)+alpha(j+1,1)*betap(j+i+1,k);
          end;
        end;
      end;
      if (j<=nc-1)
        if (i+j<=nc-1)
          zeta(i,1)=zeta(i,1)+eta(j+1,1)*eta(j+i+1,1);
          zetap(i,k)=zetap(i,k)+etap(j+1,k)*eta(j+i+1,1)+...
            eta(j+1,1)*etap(j+i+1,k);
        end;
        if (i+j<=nb-1)
          zeta(i,1)=zeta(i,1)-eta(j+1,1)*eta(1,1)*beta(j+i+1,1);
          for k=1:m
            zetap(i,k)=zetap(i,k)-etap(j+1,k)*eta(1,1)*beta(j+i+1,1)-...
              eta(j+1,1)*etap(1,k)*beta(j+i+1,1)-...
              eta(j+1,1)*eta(1,1)*betap(j+i+1,k);
          end;
        end;
      end;
      if (i+j<=na-1&j<=nb-1)
        xi(i,1)=xi(i,1)-alpha(j+i+1,1)*beta(j+1,1);
        for k=1:m
          xip(i,k)=xip(i,k)+alpha(j+i+1,1)*betap(j+1,k);
        end;
      end;
      if (i+j<=nc-1&j<=nb-1)
        zeta(i,1)=zeta(i,1)-eta(j+i+1,1)*eta(1,1)*beta(j+1,1);
        for k=1:m
          zetap(i,k)=zetap(i,k)-etap(j+i+1,k)*eta(1,1)*beta(j+1,1)-...
            eta(j+i+1,1)*etap(1,k)*beta(j+1,1)-...
            eta(j+i+1,1)*eta(1,1)*betap(j+1,k);
        end;
      end;
%
      if (i+j<=n)
        mbeta(i,1)=mbeta(i,1)+beta(j+1,1)*beta(j+i+1,1);
        for k=1:m
          mbetap(i,k)=mbetap(i,k)+beta(j+1,1)*betap(j+i+1,k)+...
            betap(j+1,k)*beta(j+i+1,1);
        end;
      end;
    end;
  end;
%
  vbeta=beta(2:n,1);
  invgama=inv(mgama);
  u=u+2*xi'*invgama*vbeta;
  w=w+2*zeta'*invgama*vbeta;
  v=2-beta'*beta-2*mbeta'*invgama*vbeta;
%
% Calculate the variances.
%
  vary=(u/v)*vara;
  varu=(w/v)*vara;
%[var1 vary var2 varu]
%pause(5)
%
  smbebep2=zeros(m,m);
  for i=1:m
    smalbep(i,1)=0;
    smetabep(i,1)=eta(1,1)*etap(1,i);
    smbebep(i,1)=0;
    for j=2:n+1
      if (j<=nb)
        if (j<=na)
          smalbep(i,1)=smalbep(i,1)+alpha(j,1)*betap(j,i);
        end;
        smbebep(i,1)=smbebep(i,1)+beta(j,1)*betap(j,i);
        for k=1:m
          smbebep2(i,k)=smbebep2(i,k)+betap(j,i)*betap(j,k);
        end;
        if (j<=nc)
          smetabep(i,1)=smetabep(i,1)+...
            etap(j,i)*(eta(j,1)-2*eta(1,1)*beta(j,1))+...
            eta(j,1)*(etap(j,i)-2*etap(1,i)*beta(j,1)-...
            2*eta(1,1)*betap(j,i));
        end;
      end;
    end;
  end;
  for i=1:m
    is=(i-1)*(n-1)+1;
    ie=i*(n-1);
    uprime(i,1)=-smalbep(i,1)-xip(:,i)'*invgama*vbeta...
      -xi'*invgama*gamap(:,is:ie)*invgama*vbeta...
      +xi'*invgama*vbetap(:,i);
    wprime(i,1)=smetabep(i,1)+zetap(:,i)'*invgama*vbeta...
      -zeta'*invgama*gamap(:,is:ie)*invgama*vbeta...
      +zeta'*invgama*vbetap(:,i);
    vprime(i,1)=-smbebep(i,1)-mbetap(:,i)'*invgama*vbeta...
      +mbeta'*invgama*gamap(:,is:ie)*invgama*vbeta...
      -mbeta'*invgama*vbetap(:,i);
    h(i,1)=(uprime(i,1)*v-u*vprime(i,1))+lambda*...
      (wprime(i,1)*v-w*vprime(i,1));
  end;
  for i=1:m
    for j=1:m
      is=(i-1)*(n-1)+1;
      ie=i*(n-1);
      js=(j-1)*(n-1)+1;
      je=j*(n-1);
      vbebep2=zeros(n-1,1);
      vzetap2=zeros(n-1,1);
      for k=1:n-1
        for kk=1:n-1
          if (k+kk<=nc-1)
            vzetap2(k,1)=vzetap2(k,1)+etap(kk+1,i)*etap(k+kk+1,j)+...
              etap(kk+1,j)*etap(k+kk+1,i)-...
              etap(k+kk+1,i)*etap(1,j)*beta(k+1,1)-...
              etap(k+kk+1,i)*eta(1,1)*betap(k+1,j)-...
              etap(k+kk+1,j)*etap(1,i)*beta(k+1,1)-...
              etap(k+kk+1,j)*eta(1,1)*betap(k+1,i)-...
              eta(k+kk+1,1)*etap(1,i)*betap(k+1,j)-...
              eta(k+kk+1,1)*etap(1,j)*betap(k+1,i);
          end;
          if (k+kk<=n)
            vbebep2(k,1)=vbebep2(k,1)+betap(kk+1,i)*betap(k+kk+1,j)+...
              betap(k+kk+1,i)*betap(kk+1,j);
          end;
          if (kk<=nc-1&k+kk<=nb-1)
            vzetap2(k,1)=vzetap2(k,1)-...
              etap(kk+1,i)*etap(1,j)*beta(k+kk+1,1)-...
              etap(kk+1,i)*eta(1,1)*betap(k+kk+1,j)-...
              etap(kk+1,j)*etap(1,i)*beta(k+kk+1,1)-...
              etap(kk+1,j)*eta(1,1)*betap(k+kk+1,i)-...
              eta(kk+1,1)*etap(1,i)*betap(k+kk+1,j)-...
              eta(kk+1,1)*etap(1,j)*betap(k+kk+1,i);
          end;
        end;
      end;
      sumetabe2=etap(1,i)*etap(1,j);
      for k=2:n+1
        if (k<=nc-1)
          sumetabe2=sumetabe2+etap(k,i)*(etap(k,j)-2*...
            etap(1,j)*beta(k,1)-2*eta(1,1)*betap(k,j))+...
            etap(k,j)*(etap(k,i)-2*etap(1,i)*beta(k,1)-2*...
            eta(1,1)*betap(k,i))...
        end;
      end;
      u2prime(i,j)=xip(:,i)'*invgama*gamap(:,js:je)*invgama*vbeta...
        -xip(:,i)'*invgama*vbetap(:,j)...
        +xip(:,j)'*invgama*gamap(:,is:ie)*invgama*vbeta...
        +xi'*invgama*gamap(:,js:je)*invgama*gamap(:,is:ie)*...
        invgama*vbeta...
        +xi'*invgama*gamap(:,is:ie)*invgama*gamap(:,js:je)*...
        invgama*vbeta...
        -xi'*invgama*gamap(:,is:ie)*invgama*vbetap(:,j)...
        -xip(:,j)'*invgama*vbetap(:,i)...
        -xi'*invgama*gamap(:,js:je)*invgama*vbetap(:,i);
%
      w2prime(i,j)=sumetabe2+vzetap2'*invgama*vbeta...
        -zetap(:,i)'*invgama*gamap(:,js:je)*invgama*vbeta...
        +zetap(:,i)'*invgama*vbetap(:,j)...
        -zetap(:,j)'*invgama*gamap(:,is:ie)*invgama*vbeta...
        +zeta'*invgama*gamap(:,js:je)*invgama*...
        gamap(:,is:ie)*invgama*vbeta...
        +zeta'*invgama*gamap(:,is:ie)*invgama*...
        gamap(:,js:je)*invgama*vbeta...
        -zeta'*invgama*gamap(:,is:ie)*invgama*vbetap(:,j)...
        +zetap(:,j)'*invgama*vbetap(:,i)...
        -zeta'*invgama*gamap(:,js:je)*invgama*vbetap(:,i);
%
      v2prime(i,j)=-smbebep2(i,j)-vbebep2'*invgama*vbeta...
        +mbetap(:,i)'*invgama*gamap(:,js:je)*invgama*vbeta...
        -mbetap(:,i)'*invgama*vbetap(:,j)...
        +mbetap(:,j)'*invgama*gamap(:,is:ie)*invgama*vbeta...
        -mbeta'*invgama*gamap(:,js:je)*invgama*...
        gamap(:,is:ie)*invgama*vbeta...
        -mbeta'*invgama*gamap(:,is:ie)*invgama*...
        gamap(:,js:je)*invgama*vbeta...
        +mbeta'*invgama*gamap(:,is:ie)*invgama*vbetap(:,j)...
        -mbetap(:,j)'*invgama*vbetap(:,i)...
        +mbeta'*invgama*gamap(:,js:je)*invgama*vbetap(:,i);
      dh(i,j)=(u2prime(i,j)*v+uprime(i,1)*vprime(j,1)...
        -uprime(j,1)*vprime(i,1)-u*v2prime(i,j))+lambda*...
        (w2prime(i,j)*v+wprime(i,1)*vprime(j,1)...
        -wprime(j,1)*vprime(i,1)-w*v2prime(i,j));
    end;
  end;
%
% Modified Newton-Raphson iteration
%
  esp=h'*h/m;
  if (esp<=lesp|iter==1)
    ll=l;
    dl=-dh\h;
    sl=dl;
    lesp=esp;
    kc(:,iter)=g*l;
    iter=iter+1;
    iter
    l'
    h'
    [lesp esp]
  else
    dl=dl/2;
    sqdl=dl'*dl/s;
    if (sqdl<=1.0e-9)
      rand('uniform');
      for i=1:m
        dl(i,1)=rand*sl(i,1);
      end;
    end;
    iter
    [lesp esp]
  end;
  nl=ll+dl;
end;

function [dn,bninv,cn,un]=stn_rld(dn_1,bn_1inv,cn_1,yn,un_1,m,n);
%
% Routine to calculate the control action for the Recursive
% Least Determinant (RLD) Self-Tuning Regulator.
%
% Calling sequence:
%
% [dn,bninv,cn,un]=stn_rld(dn_1,bn_1inv,cn_1,yn,un_1,m,n);
%
% Input variables:
%
% dn_1    : A column vector of dimension m+n+1.
% bn_1inv : An inverse matrix of dimension m+n+1.
% cn_1    : A column vector of past input and output
%           variables. Dimension m+n+1.
% yn      : The newly available output variable.
% un_1    : The last control action or input variable.
% m       : The number of past input variables
%           the controller remembers.
% n       : The number of past output variables
%           the controller remembers.
%
% Output variables:
%
% dn      : The updated dn column.
% bninv   : The updated bninv matrix.
% cn      : The updated cn column.
% un      : The calculated control action.
%
% Written by K. Vu on September 16th 1996.
%
cn(1,1)=un_1;
cn(m+1,1)=yn;
if (m>=2)
  for i=2:m
    cn(i,1)=cn_1(i-1,1);
  end;
end;
if (n>=1)
  for i=2:n+1
    cn(i+m,1)=cn_1(i+m-1,1);
  end;
end;
bninv=bn_1inv-bn_1inv*cn*cn'*bn_1inv/(1+cn'*bn_1inv*cn);
un=dn_1'*bninv*cn/(1-cn'*bninv*cn);
dn=dn_1+cn*un;

function [pt,betat,ut]=stn_rls(pt_1,betat_1,yt,xt_f_1,xt);
%
% Routine to calculate the control action for the Recursive
% Least Squares (RLS) Self-Tuning Regulator.
%
% Calling sequence:
%
% [pt,betat,ut]=stn_rls(pt_1,betat_1,yt,xt_f_1,xt);
%
% Input variables:
%
% pt_1    : A positive definite matrix of dimension m+n+2.
% betat_1 : The parameter vector of dimension m+n+2.
% yt      : The newly available output variable.
% xt_f_1  : A vector of past input and output variables.
%           Dimension m+n+2.
% xt      : The above vector at time t.
%           It contains [0 ut_1 ... yt yt_1 ...]'.
%
% Output variables:
%
% pt      : The updated matrix of dimension m+n+2.
% betat   : The updated parameter vector of dimension m+n+2.
% ut      : The calculated control action.
%
% Written by K. Vu on September 16th 1996.
%
l=length(betat_1);
kt=pt_1*xt_f_1/(1+xt_f_1'*pt_1*xt_f_1);
betat=betat_1+kt*(yt-xt_f_1'*betat_1);
ut=-betat(2:l,1)'*xt(2:l,1)/betat(1,1);
pt=pt_1-pt_1*xt_f_1*xt_f_1'*pt_1/(1+xt_f_1'*pt_1*xt_f_1);

function [theta,sigma2]=ma_id(gamma);
%
% Identification algorithm for a Moving Average
% time series given the autocovariances.
%
% Calling sequence:
%
% [theta,sigma2]=ma_id(gamma);
%
% Input variables:
%
% gamma  : The autocovariances starting from lag 0.
%
% Output variables:
%
% theta  : The moving average parameter vector.
% sigma2 : The variance of the white noise.
%
% Written on Jan 15th 1997 by Ky M. Vu
%
q=length(gamma)-1;
b=zeros(q,q);
theta=zeros(q,1);
ntheta=zeros(q,1);
gm=gamma(2:q+1,1)/gamma(1,1);
err=1;
iter=1;
while (iter<100&err>1e-11)
  for i=1:q-1
    for j=1:q-1
      m=i+j;
      if (m<=q)
        b(i,j)=theta(m,1);
      end;
    end;
  end;
  ntheta=b*theta-(1+theta'*theta)*gm;
  iter=iter+1;
  err=(ntheta-theta)'*(ntheta-theta)/q;
  theta=ntheta;
end;
sigma2=gamma(1,1)/(1+theta'*theta);

function [gamma]=autocov(phi,theta,sigma2,lag);
%
% Routine to calculate the autocovariances
% of an ARMA time series.
%
% Calling sequence:
%
% [gamma]=autocov(phi,theta,sigma2,lag);
%
% Input variables:
%
% phi    : The autoregressive parameter column vector
% theta  : The moving average parameter column vector
% sigma2 : The variance of the generating white noise,
% lag    : The last lag of the desired autocovariances.
%
% Output variables:
%
% gamma  : The autocovariance vector starting from lag 0.
%
% Written on Jan 15th 1997 by Ky M. Vu
%
p=length(phi)-1;
q=length(theta)-1;
aa=zeros(lag+1,lag+1);
bb=zeros(lag+1,lag+1);
cc=zeros(lag+1,lag+1);
d=zeros(lag+1,1);
%
for i=1:q+1
  d(i,1)=theta(i,1)*sigma2;
  for j=1:q+1
    m=i+j-1;
    if (m<=q+1)
      bb(i,j)=theta(m,1);
    end;
  end;
end;
for i=1:lag+1
  for j=1:i
    m=i-j+1;
    if (m<=p+1)
      aa(i,j)=phi(m,1);
      cc(i,j)=phi(m,1);
    end;
  end;
  for j=2:lag+1
    n=i+j-1;
    if (n<=p+1)
      aa(i,j)=aa(i,j)+phi(n,1);
    end;
  end;
end;
gamma=inv(aa)*bb*inv(cc)*d;
System identification, control algorithms and control interval for the Box-Jenkins dynamic model structure Vu, Ky Mihn 1997
Item Metadata
Title | System identification, control algorithms and control interval for the Box-Jenkins dynamic model structure |
Creator | Vu, Ky Mihn |
Date Issued | 1997 |
Description | The Box-Jenkins model of a discrete control system has been studied. First the transfer function model was identified numerically via the derivatives of the variance of the disturbance, then the disturbance model was identified via the derivative of the variance of the generating white noise. Once the model had been identified, an approach to obtain the optimal gains of a discrete PID controller was suggested. In an adaptive environment, the recursive least determinant self tuning controller was designed to calculate the best possible control action based only on the presumed orders of the controller and without knowledge of the delay. The problem of determination of a possible slower control interval of a control loop was solved via modelling a skipped ARIMA through matrix algebra and a robust numerical procedure. |
Extent | 8929818 bytes |
Genre | Thesis/Dissertation |
Type | Text |
File Format | application/pdf |
Language | eng |
Date Available | 2009-04-17 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
DOI | 10.14288/1.0058532 |
URI | http://hdl.handle.net/2429/7331 |
Degree | Doctor of Philosophy - PhD |
Program | Chemical and Biological Engineering |
Affiliation | Applied Science, Faculty of; Chemical and Biological Engineering, Department of |
Degree Grantor | University of British Columbia |
Graduation Date | 1997-11 |
Campus | UBCV |
Scholarly Level | Graduate |
Aggregated Source Repository | DSpace |