UBC Theses and Dissertations

UBC Theses Logo

UBC Theses and Dissertations

Adaptive optimization of discrete stochastic systems Gracovetsky, Serge Alain 1970

Your browser doesn't seem to have a PDF viewer, please download the PDF to view this item.

Item Metadata

Download

Media
831-UBC_1971_A1 G73.pdf [ 4.99MB ]
Metadata
JSON: 831-1.0302211.json
JSON-LD: 831-1.0302211-ld.json
RDF/XML (Pretty): 831-1.0302211-rdf.xml
RDF/JSON: 831-1.0302211-rdf.json
Turtle: 831-1.0302211-turtle.txt
N-Triples: 831-1.0302211-rdf-ntriples.txt
Original Record: 831-1.0302211-source.json
Full Text
831-1.0302211-fulltext.txt
Citation
831-1.0302211.ris

Full Text

ADAPTIVE OPTIMIZATION OF DISCRETE STOCHASTIC SYSTEMS by S. A. GRACOVETSKY Ingenieur physicien, EPTJL, Lausanne, 1968 A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS OF THE DEGREE OF DOCTOR OF PHILOSOPHY i n the Department of E l e c t r i c a l Engineering We accept t h i s thesis as conforming to the required standard Research Supervisor Members of the Committee... Acting Head of the Department Members of the Department of E l e c t r i c a l Engineering THE UNIVERSITY OF BRITISH COLUMBIA November , 1970 In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of Br i t i sh Co 1umbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying o f this thesis for scholarly purposes may be granted by the Head of my Department or by his representatives. It is understood that copying or publication o f this thesis f o r financial gain shall not be allowed without my written permission. Department of The University of Br i t ish Columbia Vancouver 8, Canada A mes parents, A ma femme. ABSTRACT The general theory of s t o c h a s t i c optimal c o n t r o l i s based on determining a c o n t r o l which minimizes an expected cost. However, the use of minimum expected cost as a design o b j e c t i v e i s a r b i t r a r y . A d i r e c t con-sequence of this choice i s the need f o r extensive s t a t i s t i c a l information. I f the required s t a t i s t i c a l data i s not a v a i l a b l e or not accurate, the con-t r o l l e r i s suboptimum. The thesis begins with the i n v e s t i g a t i o n of the conventional method of s o l u t i o n and proposes an i n t e r p r e t a t i o n of the s o l u t i o n which introduces a d i f f e r e n t approach. This approach does not use the expected cost as design objective. The suggested new c r i t e r i o n i s based on a trade-off between det-e r m i n i s t i c optimization and a cost penalty for estimation error. In order to have a basis of comparison with the conventional method, the proposed adaptive s t o c h a s t i c c o n t r o l l e r i s compared with the standard s t o c h a s t i c op-timal c o n t r o l l e r f o r a l i n e a r d i s c r e t e system associated with l i n e a r measure-ments, addi t i v e noise and quadratic cost. The basic feature of the proposed method i s the i n t r o d u c t i o n of an adaptive f i l t e r gain which enters the prop-osed cost index a l g e b r a i c a l l y and couples the c o n t r o l l e r with the estimator. Unlike the conventional Kalman-Bucy f i l t e r gain, the proposed gain i s a s c a l a r independent of the second and higher order moments of noise d i s t r i b -utions. Simulation i s c a r r i e d out on second and f i f t h order l i n e a r systems with gaussian and non gaussian noises d i s t r i b u t i o n s . There i s a moderate cost increase of 1% to 12%. The method i s then extended to nonlinear systems. A general s o l -ution of the nonlinear problem i s formulated and a complete i n v e s t i g a t i o n of the properties of the s o l u t i o n i s given for d i f f e r e n t cases. S t a b i l i t y (i ) of the expected tracking error of the f i l t e r i s guaranteed by introducing bounds on the f i l t e r gain. Problems a r i s i n g from the use of suboptimum structures f o r the control are examined and discussed. I t i s shown that f o r a class of systems the proposed method has a p a r t i c u l a r l y a t t r a c t i v e form. As i n the l i n e a r case, the required s t a t i s t i c a l information i s l i m -i t e d to the expected values of the noises, and the expected value of the i n i t i a l state of the system. Simulation executed on second order systems indicates a cost decrease of 1% to 20% when compared with the method using an extended Kalman-Bucy f i l t e r . ( i i ) TABLE OF CONTENTS Page ABSTRACT i LIST OF FIGURES v ACKNOWLEDGEMENT .' v i 1. INTRODUCTION 1 2. BACKGROUND TO LINEAR STOCHASTIC CONTROL PROBLEM 4 2.1 I n t r o d u c t i o n 4 2.2 Problem f o r m u l a t i o n 4 2.3 Conventional method of s o l u t i o n 6 2.4 I n t e r p r e t a t i o n of the conventional method 9 3. THE PROPOSED NEW COST INDEX AND CONTROL POLICY .12 3.1 F i l t e r i n g problem 12 3.2 C o n t r o l problem 21 3.3 S t r u c t u r e of cost f u n c t i o n a l 21 4. ASYMPTOTIC STABILITY OF THE FILTER 24 4.1 S t a b i l i t y c o n d i t i o n ... 24 4.2 Choice of a norm 25 5. PROPOSED ALGORITHM 28 5.1 D e r i v a t i o n of the proposed c o n t r o l 28 5.2 Example 1-second order system 31 5.3 Example 2 - f i f t h order system 41 5.4 Conclusion 48 6. BACKGROUND TO NONLINEAR STOCHASTIC CONTROL PROBLEM 50 6.1 I n t r o d u c t i o n 50 6.2 Problem f o r m u l a t i o n 51 6.3 Conventional method of s o l u t i o n 52 7. THE PROPOSED COST INDEX AND CONTROL POLICY 53 7.1 F i l t e r i n g problem 53 7.2 C o n t r o l problem 58 7.3 S t r u c t u r e of cost f u n c t i o n a l 59 8. ASYMPTOTIC STABILITY OF THE FILTER'.. 61 8.1 S t a b i l i t y c o n d i t i o n 61 8.2 Choice of a norm 62 ( i i i ) Page 9. SUBOPTIMUM CONTROL 65 10. PROPOSED ALGORITHM 71 10.1 Derivation of the proposed control 71 10.2 Example 1-second order system 76 10.3 Example 2 - f i f t h order system 82 10.4 Comparison with previous work. Example 3 85 10.5 Conclusion 93 11. CONCLUSION 94 APPENDIX I 96 REFERENCES ' 97 (iv) LIST OF FIGURES Figure Page 1 Block diagram of standard optimum c o n t r o l l e r for l i n e a r s t o c h a s t i c systems 8 2 Tracking properties of the f i l t e r 18 3 Block diagram of closed loop s t o c h a s t i c c o n t r o l l e r f o r l i n e a r systems 32 4 Computation of E(J) 37 5 Measured p r o b a b i l i t y density, functions of plant and measurement noises . 40 6 Block diagram of perturbed power system around equilibrium point 42 7 Computation of E(J) 44 8 Measured p r o b a b i l i t y density functions of plant and measurement noises 47 9 Suboptimum d e t e r m i n i s t i c c o n t r o l 70 10 Block diagram of closed loop s t o c h a s t i c c o n t r o l l e r for non-l i n e a r systems 77 11 Comparison of the effectiveness of the proposed and stand-ard controls 81 12 Standard, proposed and p a r t i c u l a r a p p l i c a t i o n of c o n t r o l . . 86 13 Proposed method of s o l u t i o n 87 14 P a r t i c u l a r a p p l i c a t i o n of proposed method 91 15 Comparison of eff e c t i v e n e s s between standard and proposed methods for the estimation and control problem 92 (v) ACKNOWLEDGEMENT I wish to express my appreciation to Professor E. V. Bohn f or hi s support and encouragement throughout the course of th i s study. I should l i k e to express the g r a t e f u l acknowledgment to the National Research Council for a 1969-1970 scholarship, and the Northern E l e c t r i c company for t h e i r 1968-1969 fellowship. Many thanks are due to Miss Veronica Komczynski who did a time optimal job i n typing t h i s manuscript. (vi) 1 1. INTRODUCTION The s t o c h a s t i c c o n t r o l problem i s usually defined by the know-ledge of f i x e d elements (plant or system) associated with a measurement device. Further, a c r i t e r i o n or performance index i s given as a s c a l a r function of the state of the plant, the control and the time. Since both plant and measurement devices are disturbed by the noise, the given c r i t -e r ion J i s not computable since the exact value of the state vector at any given time i s not a v a i l a b l e . As a consequence, the c r i t e r i o n J does not induce a t o t a l ordering over a l l r e a l i z a b l e c o n t r o l p o l i c i e s , since i t i s a function of the unknown disturbances and does not associate a r e a l number with each ..control, action [1]. An optimal c o n t r o l p o l i c y can only be defined with respect to a c r i t e r i o n inducing a t o t a l ordering over a l l c o n t r o l functions. Thus the s t o c h a s t i c c o n t r o l problem begins with the a r b i t r a r y choice of some chosen number r e l a t e d to the system performance index. The approach that has been widely adopted i s to choose the expected cost index E(J) and therefore the meaning of "optimal" i s associated with minimization of E ( J ) . The b a s i c desire of having a c o n t r o l l e r optimal on the average i s reasonable. Unfortunately, the a n a l y t i c a l d e r i v a t i o n of the optimum control necessitates a complete s t a t i s t i c a l d e s c r i p t i o n of the disturbances acting upon the system. For a l i n e a r system with quadratic cost and gaussian s t a t -i s t i c s i t i s known that the optimal c o n t r o l p o l i c y , based on minimizing the expected cost, i s r e a l i z e d by cascading a Kalman-Bucy f i l t e r with a det-e r m i n i s t i c optimum feedback c o n t r o l [2]. When a l l required s t a t i s t i c a l information i s a v a i l a b l e , a c o n t r o l p o l i c y based on a bayesian approach may be used [3], [8]. In a r e a l world s i t u a t i o n , the required s t a t i s t i c a l data may not be a v a i l a b l e or may not be accurate. Depending on the a v a i l a b l e 2 data, a Bayes or minimax approach may be then used. Except i n some part-i c u l a r cases ( l i n e a r dynamics, gaussian noise) the s o l u t i o n i s almost impos-s i b l e to implement. Even when a set of s u f f i c i e n t s t a t i s t i c s e x i s t s (and this assumes very p a r t i c u l a r noise d i s t r i b u t i o n s ) , the co n t r o l equations must be solved numerically. The c o n t r o l l e r i s then suboptimum and the un-known s t a t i s t i c a l parameters take on the ro l e of tuning parameters [9]. A tuning method based on minimizing E(J) i s both time consuming and c o s t l y and does not appear very p r a c t i c a l . Furthermore, the c o n t r o l l e r has no adaptive c a p a b i l i t y to cope with changes i n s t a t i s t i c s , and i t s on-line im-plementation requires e i t h e r a large amount of memory for storage of some gain matrix sequence, or on-line computation of matrix d i f f e r e n c e equations and matrix inverses. The ob j e c t i v e of t h i s thesis i s to derive a con t r o l p o l i c y r e q u i r i n g much less s t a t i s t i c a l data than the conventional method. Further, the implementation of such a c o n t r o l must be simple when compared with e x i s t i n g techniques. Instead of find i n g a suboptimal s o l u t i o n of the standard problem, i t i s suggested to define d i f f e r e n t l y the standard problem so that the a v a i l a b l e information becomes s u f f i c i e n t to deduce an optimum c o n t r o l . Since the use of minimum expected cost as a design objective i s a r b i t r a r y , a d i f f e r e n t approach seems, therefore, equally j u s t i f i a b l e . Thus, the a r b i t r a r y conventional choice of E(J) i s dropped since i t i s responsible for the need f o r a complete knowledge of the s t a t i s t i c s of the process. A d i f f e r e n t cost index i s deduced from a development of the o r i g i n a l cost J . The cost index tyis based on formulating a trade-off between minimum cost for a d e t e r m i n i s t i c system and a minimum cost penalty f o r estimation e r r o r . Under a s u i t a b l e s t a b i l i t y c onstraint, the proposed c r i t e r i o n ty can be rel a t e d to the o r i g i n a l cost J. Thus ty can be minimized subject to the usual constraints created by the given system dynamics and the associated s t a b i l i t y constraint, j u s t i f y i n g the legitimacy of the use of iji instead of 3 the o r i g i n a l but i r r e a l i s a b l e J . A very important feature of the formul-a t i o n i s that the cost penalty depends only on a v a i l a b l e measurements and estimates, and an adjustable adaptive gain. The adaptive gain couples the c o n t r o l l e r with the estimator. T h e con t r o l a c t i o n i s found to depend only upon the expected value of the disturbances and the expected value of the i n i t i a l s t a te vector. This l i m i t s the required s t a t i s t i c a l data. I t i s these features which d i s t i n g u i s h the proposed c o n t r o l l e r from presently known optimum and suboptimum c o n t r o l l e r s . In p r a c t i c a l a p p l i c a t i o n s , these c o n t r o l l e r s take a v a r i e t y of suboptimum forms. Case studies of app l i c a t i o n s i n the paper-making industry and i n a i r c r a f t ' n a v i g a t i o n systems are given i n [17]. Further applications are given i n [18]. Although the control equations have a simple form, comparison with e x i s t i n g techniques by means of simulation are favourable, even though the proposed method requires f a r less s t a t i s t i c a l information. Furthermore, because of the reduced comput-a t i o n a l and storage requirements, the proposed c o n t r o l l e r i s e a s i e r to implement o n - l i n e . 4 • 2. BACKGROUND TO LINEAR STOCHASTIC CONTROL PROBLEM 2.1 Introduction The theory of s t o c h a s t i c c o n t r o l process has reached a high l e v e l of maturity a f t e r the pioneering work of Kalman. Most of the e f f o r t has been concentrated on determining a c o n t r o l which i s optimum i n the sense of mini-mizing an expected cost. One major shortcoming of the t h e o r e t i c a l tools i s that they assume e i t h e r p erfect information f o r the problem to be solved, or at l e a s t a f a i r l y precise idea about the structure of the missing inform-a t i o n [3]. Numerous attempts have been made to i d e n t i f y the missing s t a t -i s t i c a l information [6], [7], with various degrees of success. The main feature of these attempts i s that the design objective i s to minimize the expected value of the cost. This part of the thesis w i l l be concerned with the study of l i n e a r systems with a d d i t i v e noise and quadratic cost. I t i s known that the optimum conventional c o n t r o l i s r e a l i z e d by cascading a Kalman-Bucy f i l t e r with a d e t e r m i n i s t i c c o n t r o l l e r . The s i m p l i c i t y of this formulation w i l l be exploited to deduce the proposed method, and w i l l be used as a basis of comparison. The systems considered w i l l be d i s c r e t e , completely c o n t r o l l a b l e and observable. A l l functions w i l l be continuous and d i f f e r e n t i a b l e i n a l l t h e i r arguments. 2.2 Problem Formulation Consider a m u l t i v a r i a b l e l i n e a r d i s c r e t e system: * k + i • V k + V k + v \ £ U k ( 2 - x ) z k = x k + v k 0 <_ k <_ N-l (2-2) Equation (2-1) refers to the system, where x^ i s an n-dimensional vector, u^ i s an m-dimensional control, vector and w, i s an n-dimensionai zero mean d i s -k turbance vector which i s assumed to be independent of v and x . Equation (2-2) 5 r e f e r s to the measurement device disturbed by the ad d i t i v e zero mean noise v which i s assumed to be independent of x and w V-k. k . tc R Thus E(v k) = E(w k) = 0 E(w kx k) = E ( v k x k ) = 0 (2-3) E(x ) = x o o where E(-) denotes the expectation operator and a prime denotes the transpose of a vector or a matrix. The system (2-1) i s to operate f o r 0 - k - N-l , where k i s a member of the index set n ={k:k = 1....N-1}. I t i s desired to implement a control sequence u eU where the set U i s the m-dimensional k e u c l i d i a n vector space and i s c a l l e d the admissible set of controls. A cont r o l sequence u^, k en i s c a l l e d admissible i f each element i s a function of the past and present observations on the system and on the past control vectors. Define z 1 = {z ,...,z.} (2-4) O X as the c o l l e c t i o n of the (i+1) vectors {z ,...,z.}. . o x S i m i l a r l y u 1 = {u ...u.} o x Hence = \x^(z^, u^ "S = u k(z^) £ U (2-5) Relation (2-5) expresses the condition that the c o n t r o l i s to be r e a l i z a b l e . The existence of such c o n t r o l w i l l not be discussed here. In what follows, i t w i l l always be assumed that at l e a s t one admissible c o n t r o l sequence e x i s t s . The time varying matrices Aj^ and B k are supposed known for a l l time. The s t o c h a s t i c c o n t r o l problem can then be stated as follows: Find the admissible c o n t r o l sequence \x^, k en t r a n s f e r r i n g the system from i t s i n i t i a l state x^ to a f i n a l state x^, and minimizing a sc a l a r cost index. The s c a l a r cost index J -= 1 1 2 J „ ( x £ V k + u k V k } ( 2 - 6 ) k=0 w i l l be referred to as the o r i g i n a l c r i t e r i o n . The s p e c i a l form of the measurement equation (2-2) r e s u l t s i n s i m p l i f i c a t i o n i n the d e r i v a t i o n of the control equation which i s discussed i n t h i s s e c t i o n . A more general form of equation (2-2) i s considered i n Chapter 6: 2.3 Conventional Method of Solution As formulated by (2-1), (2-6) the s t o c h a s t i c c o n t r o l problem can-not be solved since (2-6) i s a random v a r i a b l e which does not associate a r e a l number with each co n t r o l sequence and therefore does not induce a t o t a l ordering over a l l p o s s i b l e admissible c o n t r o l sequences. D e f i n i t i o n : A f u n c t i o n a l $(•) i s s a i d to induce a t o t a l ordering over N-l a l l admissible c o n t r o l sequences u. eU i f , given any N-l N - l u^ e U and £ U, then e i t h e r < K U ^ - 1 ) < ^ ( U ^ " 1 ) ,0' $ ( u ^ _ 1 ) = ^ ( u ^ 1 ) , ° ' . N - l , i N - l , $ ( u i ) > $ ( u 2 ) • The f u n c t i o n a l $(•) induces a t o t a l ordering on U i f a complete c h a r a c t e r i z -at i o n i s provided for a l l other elements of the c o n t r o l system. In the case described by (2-1), (2-6), the s t a t i s t i c a l d e s c r i p t i o n of the noises and the i n i t i a l vector state i s incomplete since only t h e i r expected values are given. Since the o r i g i n a l cost index (2-6) does not induce a t o t a l ordering over a l l admissible control p o l i c i e s [1], the notion of o p t i m a l i f y f o r a co n t r o l sequence cannot be defined. I t i s therefore necessary to derive a c o n t r o l p o l i c y i n such a way that some chosen numbers r e l a t e d to (2-6) are minimized. A reasonable natural choice i s to adopt the expected value of (2-6) as a cost index, and define an optimum admissible c o n t r o l sequence with respect to the minimization of E ( J ) . This i s the standard or con-ventional choice. I t i s motivated by the desire to have a c o n t r o l which 7 i s optimum on an average ba s i s . I t i s nevertheless an a r b i t r a r y choice. I f a l l required s t a t i s t i c a l information about the system (2-1)-(2-6), i s a v a i l a b l e , then the optimum admissible c o n t r o l sequence minimizing E(J) i s given by [3]. In the case where a l l random variables have gaussian d i s t r i b u t i o n s , \ • Vk (2"8) \ + i • + \+i(\+r(\+\W (2"9) x = E(x ) o o Kk+1 = rk+;i C vk+l r k + l = ( A k r k A k + C w k ) _ 1 + C v k i l ( 2- 1 0 ) where E(v.v!) = Cv.<5.., E(w.w!) = Cw.6.. r J i i j i J l i j . . and where r i s the covariance matrix of the i n i t i a l s t a te x , Cv. and Cw. o o k k are the respective covariances of the noises vectors v^ and w^ at time "k", and S i s a time varying matrix given i n appendix I. Figure I i l l u s t r a t e s k the system (2-1), (2-6) with i t s feedback loop (2-8) i n block diagram form. Several remarkable features appear from (2-8)-(2-10). F i r s t of a l l , the c o n t r o l i s expressed as a l i n e a r function of the estimate x^ of x^. I t has been shown [14] that i n t h i s case, a l l s t a t i s t i c a l information contained k * i n the measurements z i s also contained i n x, which i s a s u f f i c i e n t s t a t -i s t i c . This f a c t i s r e l a t e d with the gaussian type d i s t r i b u t i o n used and the l i n e a r i t y of (2-1) and (2-2). Perhaps the most i n t e r e s t i n g feature i s contained i n the s o - c a l l e d separation theorem which states that the problems of estimation and control are separated. Thus the optimum standard s o l u t i o n i s obtained i n two steps: the f i r s t one consists i n computing the c o n d i t i o n a l expectation of the state vector given the measurements, or equivalently [2] r e a l i z i n g a weighted mean square estimate of the state vector. The s o l u t i o n of t h i s problem i s given by (2-9). Then the above f i l t e r i n g s o l u t i o n i s fed into a d e t e r m i n i s t i c function generator (2-8) whose matrix S i s i d e n t i c a l BLOCK DIAGRAM OF STANDARD OPTIMUM CONTROLLER FOR LINEAR STOCHASTIC SYSTEM FIG-1 9 to the d e t e r m i n i s t i c control maxtrix (see appendix I ) . The r e s u l t i n g feed-back loop i s optimum. I t - i s also remarkable that the gain K^+^ of the f i l t e r i s a known function of time only, depending only on the covariance of the i n i t i a l s t a te vector. . The numerical computation of this matrix gain can be done e i t h e r o n - l i n e , but then requires matrix inversions which are always undesirable, or o f f - l i n e and stored, but then the memory requirements may exceed the av a i l a b l e computer memory. To circumvent those d i f f i c u l t i e s , some sub-optimal versions can be considered. The simplest choice would be to have a gain on the form of a diagonal matrix. Note that t h i s can be possible here because of the p a r t i c u l a r simple form of the measurement equation (2-2). Thus c o u- L <^ D e considered as a s c a l a r m u l t i p l y i n g a unit matrix. However such a suboptimal s c a l a r gain would i n general r e s u l t i n poor system per-formance. 2.4 In t e r p r e t a t i o n of the Standard Case The following i s an i n t e r p r e t a t i o n of the standard s o l u t i o n of the previous problem. This i n t e r p r e t a t i o n w i l l suggest a possible structure fo r the desired new c r i t e r i o n to be used i n the proposed method. Consider the l a s t stage of the process defined by (2-1) - (2-6). The co n t r i b u t i o n to the cost (2-6) during the l a s t stage i s given by: • JN = 1 / 2 x t a + 1 / 2 v i ' V i V i ( 2 _ 1 1 ) Suppose an estimate x„ of x„ i s a v a i l a b l e . This estimate being not neces-v v N - l N - l & s a r i l y the standard choice. Define the e r r o r : 6N-1 " XN-1 " V l ( 2 " 1 2 ) Using (2-1) in t o (2-11), dropping a l l "N-l" indices y i e l d s J N = l/2(Ax+Bu+w)'^(Ax+Bu+w) + 1/2 u'Q u (2-13) 10 making use of (2-12) into (2-13) y i e l d s a f t e r rearranging J„ = J * + J * + J * (2-14) N c ce e where J * = 1/2(Ax + Bu)'R N(Ax + Bu) + l/2(x'R^x + u'Qu) (2-15) J * e = e'A'R N(Ax + Bu) + w'P^ w (2-16) J * = 1/2 e'A'P^Ae + 1/2 w'R^ w (2-17) I N-lN-2 The minimization of E{J„ z ,u } can be interpreted as follows: N v (a) Solve the p a r t i a l minimization problem defined by min E{J*|z ,u } = min J * (2-18) {u} c • {u} C Equation (2-10) i s quadratic i n u and the minimum i s given by u = Sx where S i s the standard c o n t r o l matrix for a d e t e r m i n i s t i c system. Note that so f a r , no p a r t i c u l a r d e f i n i t i o n of the estimate x i s required. The minimiz-a t i o n problem w i l l be r e f e r r e d .to l a t e r as the control problem. Consider now the l a s t term of (2-14): (b) Solve the p a r t i a l minimization problem defined by . N-lN-2 min E{J* z ,u } (2-19) ix} This step y i e l d s the minimum variance estimate or equivalently a weighted mean square estimate. The s o l u t i o n of (2-19) i s given by (2-9) with proper i n d i c e s . This minimization problem w i l l be referred to as the estimation or f i l t e r i n g problem. A consequence of (2-19) i s contained i n the p r o j e c t i o n lemma [2], and this leads d i r e c t l y to ^ r T J U 1 N-1N-2-, E{J* z ,u } = 0 ce 1 1 N-lN-2 Consequently, the minimization of E{J N|z ,u } can be seen as: 1 N-lN-2 min E{J N|z ,u } = min i>= min [J + J ] (2-20) (u,x) {u,x} {u,x} 11' where A M . „ „ A „ r T . i N-lN-2 •, J = E{J* z ,u } c c' J = E U H z V / " 2 } (2-21) and min \b = min J + min J (2-22) A c ~ e {u,x} {u} {x} Equation (2-22) i l l u s t r a t e s the consequences of the separation theorem. I f E ( J ) i s not chosen as c r i t e r i o n , then such a nice separation •H property w i l l not l i k e l y hold. I t also suggests that perhaps a con t r o l p o l i c y could be deduced by w r i t i n g down e x p l i c i t cost penalties functions describing an estimation and a con t r o l problem with some parameter coupling them. This introduces the idea of a trade-off between estimation and contr o l . . In other words, i t i s . d e s i r e d to f i n d an.expression analogous to (2-22) with the extra feature of having a coupling parameter. Suppose for the moment that i t i s pos s i b l e to f i n d a s c a l a r parameter g coupling J and J . Then (2-22) becomes c e min ty = min [J + J ] = min[J + J ] (2-23) u(g) u(g) c e {g} C { x ( g ) } M s ) 1 Equation (2-23) represents the general structure of the c r i t e r i o n to be used instead of conventional E ( J ) . The problem i s to introduce a coupling i n some meaningful way. 12 3. THE PROPOSED NEW COST INDEX AND CONTROL POLICY The r e a l i z a t i o n of a trade-off between estimation and control begins with the d e f i n i t i o n of proper cost functions f o r the estimator and the c o n t r o l l e r . Suppose that an estimate ( i n a sense to be defined l a t e r ) of i s a v a i l a b l e . Define the e r r o r at time "k" by E l iminating x^ from (2-6) and using (3-1) y i e l d s with J = J + J (3-2) c e JC =1/21 ( * k V k + ( 3 " 3 ) J e = 1 / 2 f K\\ + 2 G k \ V ' ( 3 " 4 ) 3.1 F i l t e r i n g Problem Assume f o r the moment that the c o n t r o l has the form u k = \ ^ X k ' k ^ (3-5) The estimate x i s to be defined i n such a way that (3-4) i s i n some sense "small". The problem of choosing x^ such that e^ and therefore are small i s a f i l t e r problem. There i s a problem of trade-off between choosing x^ so that J i s small and choosing x^ so that J i s small. In the standard approach, based on the minimization of the expected value of the cost E ( J ) , and gaussian s t a t i s t i c s f o r the disturbances and the i n i t i a l state vector, the trade-off problem i s solved by the. separation theorem which states that the f i l t e r i n g problem can be treated independently of the c o n t r o l problem. Although i t i s not intended to use E (J) as c r i t e r i o n , the b a s i c structure of the Kalman f i l t e r w i l l be adopted since i t has a simple and highly desirable p r e d i c t o r - c o r r e c t o r mode of operation. Thus, the estimate x w i l l be defined to be the output of a f i l t e r described by x k + i " Vk + \ \ + g k + i y k + i ( 3 " 6 ) where \ + i • \ + i " (Vk + Vk> (3"7) i s the d i f f e r e n c e between the actual and predicted measurement vector and g i s a s c a l a r function, &k+l = WVVzk+l> ( 3 _ 8 ) I t i s desired to f i n d a structure f o r (3-8) such that the estimate gen-erated by (3-6) keeps the e r r o r (3-1) "small". The basic d i f f i c u l t y here i s caused by the e r r o r e i n (3-4). Since the measurements are noisy, the s t a t e vector i s not exactly known at each time and consequently e^ i s also unknown. Thus d i r e c t use of (3-4) as a cost penalty d e f i n i n g a f i l t e r performance i s not r e a l i z a b l e . The standard approach avoids t h i s b a s i c d i f f i c u l t y by implementing a f i l t e r minimizing the expected value of the squared error E ( e k e k ) • Choosing E(e^e k) as a c r i t e r i o n f o r the f i l t e r requires a d d i t i o n a l s t a t i s t i c a l information not contained i n (2-1), (2-6). Because of the assumed r e s t r i c t e d a v a i l a b l e information the minimization of E(e'e ) cannot be c a r r i e d out exactly. K. K. However a modification i s possible y i e l d i n g a r e a l i z a b l e f i l t e r using the a v a i l a b l e data. More p r e c i s e l y from (3-6): g k + l y k + l ° x k + l " ( A k x k + B k U k ) ( 3 " 9 ) using (3-7) = x k + 1 + y k + 1 - z k + 1 using (2-2) - + y k + 1 - x k + 1 - v k + 1 (3-10) using (3-1) and rearranging V i = (lJgk+i)yk+rvk+i ( 3 _ 1 1 ) A p o s s i b l e modified cost index could be obtained by s u b s t i t u t i n g (3-11) into (3-4) and d e l e t i n g terms containing the measurement noise. I t would be, however, d i f f i c u l t to give more j u s t i f i c a t i o n f o r doing that. Rather, consider the following approach. Let *k+i = V k + \ \ + Vw ( 3 " 1 2 ) where g^ i s a s c a l a r gain already defined by (3-8). Suppose that the process has s t a r t e d and that the current value of the gain i s g^. A com-parison between (3-6) and (3-12) indicates that X j c + ^ i s t n e estimate of x, based on no adaptive change i n gain. The associated estimation error iv > _L i s by d e f i n i t i o n : 'k+1 = \+l ~ \ + l e, n - x, M - x, ,, (3-13) I t i s d e s i r a b l e to choose the new gain adaptively so that an improved estimate x, ... i s obtained. k+1 This improvement i n estimation i s the c r i t e r i o n on which the implementation of the f i l t e r w i l l be based. Equation (3-1) and (3-13) can be written as: \+l Xk+1 * k + l + \ + l \ + l 6 k + l + X k + l Xk+1 (3-14) Su b s t i t u t i n g (3~6 ) and .('3-12) into (3-14) y i e l d s \ + l " \ + l " V k + 1 ( 3 " 1 5 ) with Ag k i g k + 1 - g k (3-16) u s i n g (3-11) and assuming g k + ^ ^ 1 y i e l d s V i = V ^ W 1 t e k + i + W ( 3 " 1 7 ) t h a t i s ( 1 + W e k + i • \ + i - W W ( 3 ~ 1 8 ) with Y , „ = a g ( l-o ) x (3-19) k+1 hky b k + l ; ^ y ; 15 Note thaty,,-, as defined by (3-19) i s a s c a l a r . Forming the s c a l a r product K+1 e' e using (3-18) and taking the c o n d i t i o n a l expectation y i e l d s K , T 1 K.~r*X E ^ I V J W - ^ W ^ ^ i V i ' W ( 3 - 2 0 ) where and ( 1 + W °k +i " ^ k + i V ^ ^ ^ A + i l V i ? ( 3 " 2 2 ) I t i s now p o s s i b l e to define the estimate x^ i n the sense of y i e l d i n g a smaller expected estimation of the e u c l i d i a n norm of the e r r o r . From equation (3-20), this w i l l be the case i f 1 + a k + l Y k + l F ( W = : v < 1 ( 3 _ 2 3 ) <1 + W 2 Condition (3-23) can be developed as < V i ( 1 " W + 2 > V i ' 0 ( 3 " 2 4 ) For the various possible r e l a t i v e values of ( l - o ), the solutions Y, , K+1 K+1 s a t i s f y i n g (3-24) are ( i ) i f ( l - c k + 1 ) >. 0 [-co <_Yk+1 < ]U[0 <_ Y k + 1 <_ - ] (3-25) u k+1 ( i i ) i f d - V l ) < o t b < From (3-25), i t can be concluded that the condition (3-24) (and therefore (3-23) ) w i l l be s a t i s f i e d i f Y k + 1 £ 9 = { \ + l : 0 ^ k + l <- ^ } (3-26) 16 This insures that there w i l l always e x i s t a - 0> no matter how small, such that (3-23) i s s a t i s f i e d . From a p r a c t i c a l point of view, ° k + 1 1 S never excessively large implying that the range for Y,,-, i s not unreasonably K."~rX small. For example, i f the measurement device i s very noisy (say that the average amplitude of the noise equals the average amplitude of the sig n a l ) then °, ,-, ~ 1 and a p p l i c a t i o n of the condition (3-26) y i e l d s 0 ^ Y k + 1 < 1 (3-27) Thus (3-26) i s not a design l i m i t a t i o n f o r r e a l engineering systems. Using now the f a c t that Y, i s a p o s i t i v e quantity, equation (3-19) can be written successively as ^-W 1 - \ + i 9 0 (3"28)  A% = v i (3-29) Note that s g n [ A g ^ l - g ^ ) ] > 0 (3-30) where sgn(*) i s the sign function. Thus (3-30) i s equivalent to Ag k = . a . S g n ( l - g k + 1 ) (3-31) where a > 0 i s an adjustable parameter. To evaluate how d i f f i c u l t i t i s to compute a , consider again the case where the average amplitude of the s i g n a l equals the average amplitude of the measurement noise. From (3-27) i t i s po s s i b l e to define a range f o r • i t : w i H t>e shown i n the next s e c t i o n that a t y p i c a l range for ( x - 8 k + ^ ) l s given by | ( l - g k + 1 ) | < 1 (3-32) Hence from (3-29) |Ag k| < 1 (3-33) Combining (3-31) and (3-33) in d i c a t e s that 0 < a * 1 (3-34) Thus at this point a workable range for a can be obtained. I t w i l l be shown l a t e r that i t i s p o s s i b l e to r e s t r i c t t h i s range from p h y s i c a l con-sid e r a t i o n s and therefore s i m p l i f y i n g the choice f o r a. I t can be deduced from (3-6) and (3-12) that *k+l = \ + l + A g k y k + l ( 3 _ 3 5 ) and by using (3-31) that A + i = \ + i + - ° s g n ( 1 - S k + i ) y k + i ( 3 - 3 6 ) Equation (3-36) can be used to introduce a penalty cost function character-i z i n g the f i l t e r . I t i s desired here to penalize the values of g^-j, r o r which Y, , t © since then there w i l l be no improvement i n the estimate. k+1 A -Let Cfc = - 5?k - a s g n ( l - g k )y (3-37) where by convention sgn(O) = +. ' To show that (3-37) can be associated with a penalty cost function, i t has to be shown that zero penalty i s associated with zero e r r o r . From (3-35) and (3-37) i t i s seen that 0. = 0 implies (3-31) which i s the correct choice. Consequently, the s e l e c t i o n of a f i l t e r gain which i s associated with an improved estimate can be r e a l i z e d by minimizing the penalty cost function J * = C.'C. (3-38) pk k k A more.general form of (3-38) i s r e a l i z e d by introducing a p o s i t i v e def-i n i t e weighting matrix Wk such that (3-38) becomes: J , = C;w. C. (3-39) pk k k k The freedom associated with the i n t r o d u c t i o n of W w i l l be exploited l a t e r . K. I t i s i n t e r e s t i n g to note [9] that i f a l l s t a t i s t i c s are a v a i l -able and of gaussian type, the optimum gain insuring a maximum rate of decrease of the error measured as the trace of the matrix covariance er r o r ( i . e . equation (3-20)) i s p r e c i s e l y the Kalman gain (2-10). Therefore, i t i s c l e a r that the proposed f i l t e r based on the minimization of (3-39) w i l l y i e l d a gain which y i e l d s a rate of decrease of e r r o r slower than the rate which would be achieved by using a Kalman gain. However, i t i s also c l e a r that no s t a t i s t i c s beyond the expected values are assumed to be a v a i l a b l e . Hence the optimum Kalman gain i s not computable. The penalty cost function (3-39) can be also deduced from a h e u r i s t i c consideration of the average sign of the e r r o r . Consider the s i t u a t i o n i l l u s t r a t e d i n F i g . 2. Tracking properties of the f i l t e r F i g . 2 19 I t i s desired to implement a f i l t e r such that the e r r o r e = x - x K , t kC K . remains "small". This implies some tracking structure f o r 2c . From an engineering point of view, there i s no reason to follow too c l o s e l y at the expense of heavy computations i f the cost improvement i s not s i g n i f i c a n t . F i g . 2 i l l u s t r a t e s the basic c r i t e r i o n which i s to improve the predicted estimate x^ +^ given by (3-12). This means that the sign of the error e k + ^ must be known for each of the components, at l e a s t on an average b a s i s . Considering equation (3-11) and taking the co n d i t i o n a l expectation of both sid e s , and making use of (2-3) y i e l d s E ( e k + 1 | g k + 1 ) - ( l - g k + 1 ) E ( y k + 1 | g k + 1 ) (3-40) Equation (3-40) suggests that "on the average", the use of the a v a i l a b l e sgn(l-g ,,)y, ,, instead of the unknown e w i l l be j u s t i f i e d . Thus the k+1 k+1 k+1 quantity sgn(l-g )y .. could be used as an i n d i c a t i o n of the "average" K. i _L K.~T"_1_ d i r e c t i o n of the move, to correct the predicted estimate x, ,-. Therefore k+1 an improved estimate would have the form *k+l = Xk+1 + 6 x k + l (3-41) 6 x k + 1 £ a . s g n ( l - g k + 1 ) y k + 1 (3-42) where a > 0. Equations (3-41) and (3-42) are i d e n t i c a l to (3-36). At t h i s point the same deductions could be applied to evaluate the structure of the cost penalty function (3-39). An i n t e r p r e t a t i o n of the parameter ct can be given by rewriting (3-41) and (3-42) i n a different, way x k + l = \ + l + S g n ( 1 " W <^"> y k + i (3-43) For s i m p l i c i t y , suppose that a l l q u a n t i t i e s are s c a l a r , and that |x| * |x| - 1 (3-44) The following i n t e r p r e t a t i o n can be r e a d i l y extended to the vector case by using the e u c l i d i a n norm instead of absolute values, and generalizing (3-44). Equation (3-43) expresses the fact that as long as y^+i < K > the e r r o r |e | w i l l also be small. In this case, i t i s considered that iC i X x^ +^ = ' x k + l ^ S s a t l s ^ a c t o r i - ' - y tracking • Thus (3-4) can be neglected and the minimization of (3-2) i s i n f a c t e s s e n t i a l l y the minimization of (3-3). This suggests an open-loop type of c o n t r o l not disturbed by the measurement noise. Consequently, i n this case, the f i l t e r gain would be expected to decrease towards zero, which i s equivalent to opening the loop. However, when |y | >> 1/a, i t can be concluded that the e r r o r i s not acceptable, and tracking i s no longer s a t i s f a c t o r y . Emphasis must then be placed on the estimation problem since the c o n t r i b u t i o n of (3-4) to the minimization of (3-2) can no longer be neglected. A feedback c o n t r o l i s d e s i r a b l e and then the gain of the f i l t e r must increase to close the loop i n order to weigh more heavily the influence of the measurements. I t i s obvious that there must be some l i m i t s placed on the v a r i a t i o n of the gain. This w i l l be considered l a t e r on. Thus 1/a represents a measure of acceptable e r r o r . To c l a r i f y t h i s point, suppose that (3-44) holds. Assume a very noisy measurement device, f o r example that the a d d i t i v e measurement noise i s gaussian with unity variance. Then, as long as l y ^ ^ l < ( ° n e tenth of the variance) i t can be reasonably assumed that the f l u c t u a t i o n of the er r o r |y. I i s e s s e n t i a l l y caused by the noise and can be ignored. rC~T"X Setting a = .1 would y i e l d here for ly^-fi/l = y k + l 6 x k + 1 = s g n ( l - g k + 1 ) [ -TJ-r- ] = .01 s g n ( l - g k + 1 ) (3-45) 21 Thus • 16x k + 1 1 = . -01 and the c o r r e c t i o n term i ^ ^ - t - i I < < l^k+]J ^ s n e g l i § a h l e . In other words a represents here a measure of loss of tracking. 3.2 Control Problem Equation (2-18) suggests that the control problem be defined by the minimization of (3-3) subject to the constraint + ( 3 " 4 6 ) This i s a de t e r m i n i s t i c c o n t r o l problem and the s o l u t i o n i s known to be consistent with (3-5), [10], u k = S k x k (3-47) A simpler expression for (3-3) can be obtained by introducing the s o l u t i o n 1^ of a R i c c a t i matrix equation (see appendix I ) . [ 1 0 ] . Thus (3-3) becomes: J = 1/2 x'L x (3-48) c o o o Suppose the process has s t a r t e d and that the actual time i s "k". The con-t r o l (3-47) can be seen as minimizing the cost J £ 1/2 x'L x (3-49) pk k k k Thus the cost penalty function associated with the c o n t r o l problem at time "k" i s chosen to be (3-49). 3.3 Structure of cost f u n c t i o n a l Following the i n t e r p r e t a t i o n of the standard method defined by (.2-23), the new c r i t e r i o n i s to be r e a l i z e d by considering the cost penalty function associated with the problem of estimation and c o n t r o l . From (3-39) and (3-49), i t i s seen that the c r i t e r i o n to be minimized at time "k" 22 has the form: *k = 1 / 2 K\\ + C k \ C k ( 3 " 5 0 ) I t i s seen that the p o s i t i v e d e f i n i t e matrix determines the r e l a t i v e importance between updating the gain (3-35) to improve the estimate and updating the gain to reduce the de t e r m i n i s t i c cost index (3-49). In (3-50) the estimate x^ i s a function of g^ because of (3-6) and therefore ^ 1 S a function of g, . Let g* be the gain which minimizes (3-50) under the con-°k k s t r a i n t s (3-6) and (2-1). Since the constraints contain x^, z^, u^ e x p l i c -i t e l y i t i s evident that g* w i l l be of the form (3-7). Also the optimum g* w i l l be a random v a r i a b l e . Since the future noise i s unpredictable, i t has not been accounted f o r i n the control problem (3-49). I t i s therefore questionable to use the optimum g*. Rather, i t i s suggested to use (3-50) to deduce the d i r -ection of the move (g -g, n) rather than the move i t s e l f . In other words, a constraint where <5g k-l> 2 = «* g 4 - 1 (3-51) 0 < 81 £ i g 6 g k - l = g k " ' ( 3 " 5 2 ) i s introduced, and the best average d i r e c t i o n for the change of the gain i s taken to be given by the gradient of . A fortunate consequence of (3-51) and (3-52) i s that considerable s i m p l i f i c a t i o n s occur i n the computations. The minimization problem as defined by (3-50) associated with the constraints (2-1), (3-6) and (3-52) i s seen to be a " l o c a l " problem which must be solved at each time. By comparison, the standard method based on minimizing E ( J ) , i s a " g l o b a l " problem, i . e . , the control sequence u^ ., ken has a known e f f e c t on the , 23 o v e r a l l t r a j e c t o r y . The proposed method introduces the necessity f o r i n v e s t i g a t i n g the e f f e c t of the successive " l o c a l " minimization of (3-50). As i n d i c a t e d e a r l i e r , there must be bounds on the possible v a r i a t i o n s of f i l t e r gain to make sure that a de c i s i o n at time, say, "k", w i l l not have catastrophic e f f e c t on the cost at time "k + m". An a d d i t i o n a l constraint must therefore be found and added to the proposed minimization problem (3-50), (2-1), (3-6) and (3-52). 4. ASYMPTOTIC STABILITY OF THE FILTER 4.1 S t a b i l i t y Conditions I t i s desired to make sure that the global e f f e c t of the " l o c a l " minimization problem defined by (3-50), (2-1), (3-6) and (3-52) w i l l not have catastrophic consequences such as loss of tracking and therefore r e s u l i n an undesirable increase i n cost (3-2) from the co n t r i b u t i o n of (3-4). Therefore a study of the asymptotic properties of the f i l t e r equation must be done. From (3-6) and (3-7) i t can be deduced that \ + i = \ \ + \ \ + g k + i (zk+r(\\+ W } (4-1} the unforced equation (4-1) can be written C-l = ( 1-8k+l> \ \ ( 4 ' 2 ) Equation (4-2) w i l l be asymptotically stable i n the large provided the matrix ^"Sk-KL^^k s a t i s f i e s a s t a b i l i t y condition. The same s t a b i l i t y con-d i t i o n i s obtained by i n v e s t i g a t i n g the s t a b i l i t y of the expected value of the er r o r . From (2-1) and (3-6) then e k + i = \+i - \ + i = W + w k " WVk+VW (4~3) Hence e k + l = (1-8k+l)Vk + ( 1 - g k + l ) w k " g k + l \ + l ( 4 _ 4 ) Taking the con d i t i o n a l expectation of both sides of (4-4) and using (2-3) yi e l d s E ( ek+llW = ( 1 _ 8 k + l ) A k E ( ekl gk +1 } ' ( 4 - 5 ) Since the con t r o l i s to be non a n t i c i p a t i v e (see (2-5) ), equation (4-5) can be written as ^k+l'W = ( 1 - g k + l ) \ E ( £ k l g k ) ( 4 ' 6 ) 25 Assuming an i n i t i a l E(e |g ) < <», i t i s desirable that (4-5) be asymptot-i c a l l y stable i n the large so that the co n d i t i o n a l expected value of the erro r decreases. This w i l l be the case i f the absolute value of the l a r g e s t component of the vector E( ej c_j_^ ^ ^ decreases when the time increases. Note the s i m i l a r i t y between (4-6) and (4-2). 4.2 Choice of a norm For a f i n i t e dimensional n vector x, define the norm max x. {i} 1 (4-7) where x_^ , i = l , . . . , n i s a component of x. For a matrix M of dimensions nxn, the natural norm associated with the vector norm (4-7) i s [8] I M | = max Y. K , " {i} j = l 1 J where m.. i s an element of M. From the properties of the norm, (4-2) y i e l d s (4-8) Recursive a p p l i c a t i o n of (4-9) y i e l d s ~ U | k 1 x. ^ U I k+11 k i ! 0 H ( 1 - W A i (4-9) (4-10) Let j be the index i n (4-10) associated with the maximum value of the matrix norm, that i s II (J--g. + 1)A.|L =.sup | | ( l - g i + 1 ) A . | | (4-11) 'j+l' j Su b s t i t u t i n g (4-11) i n t o (4-10) y i e l d s {i=0,...,k} "k+11 < t « ( l - g j + 1 ) A . M j ll-xj (4-12) In order to insure that (4-2) i s asymptotically stable i n the large i t i s s u f f i c i e n t that M ( l - g j + 1 ) A . | | M < l (4-13) The condition (4-13) w i l l be s a t i s f i e d i f I l ( 1 _ 8 k + l ) \ l L < 1 f o r a l l k (4-14) From the d e f i n i t i o n (4-8), (4-14) can be written as ^ - W ^ L = i ^ k + i i K M - • i 1 - 8 ^ H . s i \ . J ( 4 - 1 5 ) . {1} j = l 13 Define a = max [ max v a, ] (4-16) max- s x s . x ._ 1 k. . Ik> i i ) 3=1 13 Then (4-16) and (4-15) are equivalent to g . < g, < g (4-17) mm k max with g . = 1 - [a ] " 1 (4-18) mm max g = 1 + [a ] ~ 1 (4-19) max max Note that i n the p a r t i c u l a r case where the matrix i s time invariant, the s t a b i l i t y of (4-<'iL') can be determined from the eigenvalues of ( I ' S ^ ^ ) ^ - I t i s known that i n this case, a l l .eigenvalues of (±-§jc+i) must be within the unit c i r c l e . I f X^ i s an eigenvalue of A. , then (1-g )X X i s an eigenvalue of ( l - g ^ ^ A ^ [8]. The s t a b i l i t y condition i s therefore l ^ k + i ^ k l < 1 ( 4 " 2 0 ) for a l l i and k. Define | x ™ a X | = max I J u l ' (4-21) • k {i}' k | Then (4-20) i s equivalent to 1 - < g, . i < 1 + (4-22) '.maxi °k+l i maxi umax I k+1 i,nu | Xk 1 l X k The gain g i s therefore bounded i n order to guarantee the s t a b i l i t y of (4-2) or (4-6). To avoid evaluating (4-22) at a l l time, one can write a more stri n g e n t r e l a t i o n v a l i d at any time "k", Define X = max [max|A*| ] (4-23) M {k} {i} k Then (4-22) can be put on the form ' W h e r e gmin < g k < gmax ( 4 " 2 4 ) A 1 g - = 1 - T - (4-25) mm A„ M g k 1 + — (4-26) max x M Hence any value of the f i l t e r gain s a t i s f y i n g (4-24), (4-25) and (4-26) w i l l insure the asymptotic s t a b i l i t y i n the large of (4-2) and (4-6). The con-s t r a i n t (4-22) or (4-24) must be added i n order to solve the minimization problem defined by (2-1), (3-6), (3-50), (3-52). The next s e c t i o n w i l l be devoted to the s o l u t i o n of the minimization problem. 28 5. PROPOSED ALGORITHM 5.1 Derivation of the proposed c o n t r o l This s e c t i o n i s concerned with the d e r i v a t i o n of the con t r o l equations which are then used to derive a con t r o l algorithm. For convenience r e c a l l that the problem i s defined by the knowledge of a system disturbed by noise Xk+1 " Vk + Vk ^ k 0 < k < N - l (5-1) and observed by z k = x k + V k (5-2) The disturbances v. and w, are assumed to have zero mean k k E( v k ) = E(w k) = 0 ^ k (5-3) An estimate of the i n i t i a l state of the system i s supposed to be a v a i l a b l e , E(x ) = x (5-4) o o The o r i g i n a l cost function i s given by J -1/21 K\\ + uk\V (5-5) where and Q_k .are p o s i t i v e 'definite matrices f o r a l l time. The assumed lack of s t a t i s t i c a l information about v, and w makes i t impossible to implement exactly a conventional control which requires a Kalman-Bucy f i l t e r . The proposed f i l t e r i s taken to have the structure x k + i = Vk + Vk + 8k+i(zk+r(Vk + W } (5"6) where &^+-^ = 8^+^ ^ ^k' zk+l' uk^ "'"S 3 s c a-'- a r B a i n « ^ n e c o n t r o l i s given by \ = S k \ ( 5 " 7 ) where S k i s given i n appendix I. 29 In the proposed method ( 5 - 5 ), which i s not a v a i l a b l e i s replaced by ( 3 - 5 0 ) , which i s : \ = 1 / 2 k \ \ + ( 5 " 8 ) where a_ = \~\ ~ «sgn(l-g k)y k (5 -9) 7 k = \ " ( \ - i \ - i + \ - i \ - i ) ( 5 " 1 0 ) and x k = + B k _ l V l + g k _ i y k (5-11) I t has been shown (section 4) that the gain g k must s a t i s f y the constraints g • < g, < g (5-12) &min B k &max where g . and g are given by (4-18) or (4-25) and (4-19) or (4 -26 ). A mm max further constraint from (3-51) i s ( 6 g k ) 2 = ( g k + 1 - g k ) 2 = 6l\ £ (5-13) with 0 <_6* < 1 (5-14) g " The problem consists now i n f i n d i n g the optimum g k at each time "k", which requires minimizing (5-8) subject to the constraints (5-1) - ( 5 - 14 ) . The proposed s o l u t i o n proceeds along the following steps: (a) Assume the process has started, and that the actual time i s "k", and that the current value of the gain i s gk_-^ • A new measurement z k i s made. Expanding (5-8) around gk_^_ a n ^ keeping the l i n e a r term y i e l d s 9iJ) 3 g &gk.x (5-15) s k - l Equation (5-15) suggests the use of a steepest descent method to obtain An improvement i n the cost (5 -8) w i l l be r e a l i z e d i f 3 k - l 8 g 6 gk _ ! < 0 (5-16) 'k-1 A choice f o r ^g, n which s a t i s f i e s (5-16) i s «g. . - - 6 4 ^ k-1 3 g (5-17) g k-1 Equation (5-17) must be consistent with (5-13). Therefore s u b s t i t u t i n g (5-17) in t o (5-13) y i e l d s 8i i l 3g k-1 = 8lh g. k-1 (5-18) where 6X= 6 a g 2 8 k - l — 8g g k - i S u b s t i t u t i n g (5-19) into (5-17) y i e l d s 6g = _6£ k-1 g g k_ L sgn Evaluating 34) 3g 3yV ag 3g 'k-1 from (5-8), (5-6) and (5-10) y i e l d s Sk-1 =k-l 3x/_ Z± 9S ' 9x (5-19) (5-20) y k [ L k V 2 V y k s g n ( 1 - g k - i ) ] ( 5 " 2 1 ) gk-1 Note that (5-21) has been obtained by s e t t i n g (^(-'--gjc_^) = 0, where 6 (•) i s the d e l t a function. This i s legitimate since g^ i s a random v a r i a b l e which i s equal to one with p r o b a b i l i t y zero. The matrix W (see (3-39) ) must K. be p o s i t i v e d e f i n i t e but i s otherwise a r b i t r a r y . For s i m p l i c i t y i n the evaluation of (5-21) l e t 2 W k " Lk (5-22) This i s pos s i b l e since i s s o l u t i o n of a R i c c a t i matrix d i f f e r e n c e equation, and i s p o s i t i v e d e f i n i t e [10]. Thus; (5-21) can be s i m p l i f i e d - . i l 9g = y k L k t x k ~ a s 8 n d - 8 k _ 1 ) y k ] (5-23) p k - l 31 Combining (5-20 and (5-13) y i e l d s A " gk-i { 1" 6 £g s g n [ yk Lk [ xk " 0 B8 n( 1-8 k_ 1)y k] 1 > The proposed algorithm to be executed at each stage i s therefore as follows: (5-25) 1. 2. 3. 4. From the measurement obtain y^ = A ' ^ ^ k - l ' S t - l + ^ k - l ^ - l ^ Evaluate x R = ( A ^ + V A - l K - i + g k - l y k Solve, (5"-25) for g f c and (5-20) for S g ^ ' Check the condition (5-12) and replace g^ by the appropriate upper or lower bound i f i t i s not s a t i s f i e d . 5. The estimate i s given by \ = \ + 6 g k - l y k and the c o n t r o l i s given by (5-7). The amount of computational e f f o r t involved can be favourably compared (no matrix i n v e r s i o n , s c a l a r manipulation) to the Kalman-Bucy computation „of the gain. Figure 3 i l l u s t r a t e s a d e t a i l e d block diagram of the algorithm. 5.2 Example I The f i r s t example concerns a second order l i n e a r system, relevant equations are ^+1 1. .101 .02525 .95 *k + ,01 .001 .1 .01 where Xlk x2k u, = ulk ' Wk = Wlk U2k W2k The (5-26) %+l vk+l B k+1 DELAY (one period-) k+1 'k+1 Z k+1 NOTATION = => VECTOR SCALAR u, k+1 Sk+1 Xk+1 DELAY X k . (one period) xk+1+^k Yk+1 °9k (Ak+BkVxk 'k+1 9min < 3kJgmax YES 9k.i9k* *9k=c{gk > 1+ NO TV x k+1 DELAY (one period) a J g max gk Yk+] ± - ^ n ( l - g k ) Y k + 1 |C <fgk = 0 BLOCK DIAGRAM OF CLOSED LOOP STOCHASTIC CONTROLLER FOR LINEAR SYSTEM FIG. 3 to The measurement device has the form (2-2) Z k = \ + \ W h 6 r e \ = The disturbances are assumed to have zero mean v l k 2k E[w k] = E [ v k ] The o r i g i n a l cost function i s given by J = 1/2 Z o 0. 0. .1 0. .0 .1 / k X k + V k The system i s operating during the time i n t e r v a l 0 < k < 100 The i n i t i a l condition on (5-26) was chosen i n the range 5. 15. < X < ; X 5. o 15. o 10. 10. (5-27) (5-28) (5-29) (5-30) (5-31) 1) Deterministic behaviour of the system In the de t e r m i n i s t i c case (wk = vy. = C s ^ k ) the optimum control can be evaluated, according to standard techniques [5]. The matrix has two eigenvalues X = .9186 X 2 = 1.0313 (5-32) Thus the unforced system has one stable eigenvalue and one unstable eigen-value. An optimal c o n t r o l based on minimizing the quadratic cost (5-29) drives x k to the o r i g i n . 2) Stochastic case For the s t o c h a s t i c problem described by (5-26), (5-31) i t i s proposed to make a comparison between the proposed method and the standard optimum c o n t r o l l e r composed of a cascade combination of a Kalman-Bucy f i l t e r and a d e t e r m i n i s t i c c o n t r o l generator. The comparison has been done by means of a simulation executed on a 360/67 IBM d i g i t a l computer. Because of the w e l l defined operating conditions, i t i s possible to generate reproducible noise sequences. By using i d e n t i c a l noise sequences for both the proposed and standard methods, i t was made c e r t a i n that the observed changes i n the behaviour of the system were caused only by the kind of feedback loop used. Thus, given a noise sequence (w^, v^, ken) the system was run f i r s t with a standard c o n t r o l , then run again with the proposed c o n t r o l . By repeating the process for a range of i n i t i a l conditions, and a large number of d i f f e r e n t noise sequences, i t was pos s i b l e to compute the average performance. As noted before, the information given i n (5-26), (5-31) does not allow the r e a l i z a t i o n of an optimum Kalman-Bucy f i l t e r . S t r i c t l y speaking, the standard method cannot be implemented as i t i s . Of course ., i t could be po s s i b l e to make some "guess" about the noise s t a t -i s t i c s and r e a l i z e a standard c o n t r o l l e r accordingly. But then, depending on the accuracy of the guess, i t i s c l e a r that a great deal of v a r i a t i o n i n performance may r e s u l t . Thi:s i t i s always possible to make a "guess" so that the proposed method appears highly s u c c e s s f u l . But then the comparison would be, i n fa c t , meaningless. For these reasons, i t was f e l t that a r e a l i s t i c comparison should be done by "giving'^ a l l the required exact s t a t -i s t i c a l information to the standard c o n t r o l l e r . Thus, i n the forthcoming case where a l l s t a t i s t i c s are gaussian, i t must be c l e a r that the standard method i s operating with f u l l knowledge of a l l necessary covariances, the proposed method having no access to t h i s information. Therefore, the comparison i s , i n fa c t , u n f a i r f o r the proposed method. As a consequence, the r e s u l t s of the comparison indicates what happens i n the worst p o s s i b l e operating conditions for the proposed c o n t r o l l e r 35 which i s competing against a highly informed c o n t r o l l e r . The f i r s t step i n implementing the proposed method i s to f i n d the allowed range f o r the f i l t e r gain. 3) S t a b i l i t y condition The magnitudes of the eigenvalues of are |x^| = .9186 and |X^ I = 1.0313. Hence from (4-10) |xm3X|= XM = 1.0313 (5-33) D i r e c t a p p l i c a t i o n of. the formulas (4-24), (4-25), (4-26) y i e l d s .0303 < g k < 1.9696 for a l l "k" (5-34) The condition (5-34) can be obtained by a p p l i c a t i o n of (4-17). In t h i s case the range of g^ i s found to be .092 < g k < 1.908 (5-35) which i s comparable with (5-34). The i n i t i a l condition i s taken a r b i t r a r i l y close to one. From a p r a c t i c a l point of view, i t i s simple to take (5-23) as v a l i d f o r g^ +^ = !• Then, from the continuity of (5-6) i n g k +-^ i t i s simple to avoid s e t t i n g a number a r b i t r a r i l y close to one and take d i r e c t l y g Q = 1 as i n i t i a l condition, with the convention sgn(0.) = +. The range of g k being defined, the next p r a c t i c a l problem to solve i s to decide a value for the parameters SSL and a i n equation (5-25). Since 6£ i s a step s i z e constraint with a c l e a r p h y s i c a l s i g n i f i c a n c e , i t can be set to .1. This means that at each time, the change i n the mag-nitude of the gain would be 10% of the magnitude of the present gain. This kind of choice i s common i n numerical techniques, and i s based on a com-promise between a small step to insure l i n e a r i t y and a large step to reduce the number of i t e r a t i o n s . The value of a i s not so easy to deduce. From the i n t e r p r e t a t i o n \ (3-45), s e t t i n g a = .1 would be equivalent to a n t i c i p a t i n g f i n d i n g a heavy noise condition. Another p o s s i b i l i t y would be to make a few computer runs and "tune" a. Since i t i s a scalar,- t h i s presents no computational com-p l i c a t i o n . Thus the choice 64g = a = .1 (5-36) i s made and these values w i l l not be modified during the simulation, although some i n v e s t i g a t i o n s i n d i c a t e d that an improvement i s s t i l l p o s s i b l e . Then, a l l equations of the c o n t r o l l e r are p e r f e c t l y defined. The necessary matrix are computed (see appendix I) and the Kalman gain can be deduced from (2-9) and (2-10). a) Case where the disturbances and i n i t i a l state vector have a gaussian p r o b a b i l i t y function d i s t r i b u t i o n . I t i s already known that the random sequences w^  and v^ have zero mean. The following information has been given to the standard c o n t r o l l e r , thus allowing the computation of the optimum Kalman gain matrix. E[v.v!] = 6 . 1 3 i j E [w .w! ] = <S. . i J i j E[v.w!] = o - ^ - H (5-37) E[v.x'] = E[w.x'] = 0 - ^ i 1 o 1 o E [ X Q X Q ] = I ; I = unit matrix I t i s seen that (5-31) i s not at a l l r e s t r i c t i v e since each component of x can be found within f i v e times i t s variance around the mean value (the o p r o b a b i l i t y that i t i s outside t h i s range i s 10 ^ which can be neglected). The system (5-26) was run 2100 times f o r both methods. The average cost has been evaluated as follows: From (5-26) to (5-37) i t i s seen that the only unspecified q u a n t i t i e s are: 1) the noise sequences w and v, . K. K. 2) the i n i t i a l condition f o r the system (5-26). Assume that a p a r t i c u l a r value ,of the i n i t i a l state i s x . Then, f o r t h i s 1 p a r t i c u l a r value of XQ = X ^ of the system i s run 100 times. Each time a new sequence f o r v^ and w^  i s generated. The average cost over these 100 runs i s then p r a c t i c a l l y independent of the noise sequences, that i s re-peating the process for d i f f e r e n t sequences v^ and w^  y i e l d s the same avera Denote t h i s average by J ( * Q ) to emphasize the f a c t that i t was obtained for the i n i t i a l c ondition x . The process i s then repeated 21 times by 1 modifying the i n i t i a l condition of the system as indicated i n figure 4. 0. I t . \ • / r \ • t / / i i i I / t \ l / i • * | t 0 i t i i i 1 7 1 1 i t k ^ \ f i n i t i a l condition of system 10. 15. (\) Computation of E ( J ) F i g . 4 38 The average cost corresponding to the i n i t i a l condition i s denoted by _ i J ( X Q ). The i n i t i a l condition i s assumed normally d i s t r i b u t e d with mean i and variance given by (5-31) and (5-37). Further each component of x^ i s assumed to be s t a t i s t i c a l l y independent of the other components. Then 21 _ the average cost i s defined to be E(J) = J ( X Q ) P ( X Q ) where the prob-i i a b i l i t y density function p (x^ ) i s normalized by ?1 p ( x ) = 1. A f t e r i 1 i running the system f i r s t with the proposed and secondly with the conventional method the average costs obtained were E ( J ) Kalman-Bucy f i l t e r + de t e r m i n i s t i c control + a l l information ( 5 - 3 7 ) . 1 8 7 5 . 1 3 Proposed method as described, with no access to information ( 5 - 3 7 ) . 1 8 9 8 . 2 Hence the proposed method y i e l d s a cost which i s 1.2% greater. I t can be noted that i n this case, the proposed method i s used as a suboptimal method since the very best i s done by the standard method. Thus, the " p r i c e " paid for replacing a 2 x 2 gain matrix by a s c a l a r and giving up the information (5-37) i s a 1.2% increase i n the average cost. The prop-osed c o n t r o l l e r i s evidently a very e f f e c t i v e comparatively simple sub-optimal c o n t r o l l e r for this p a r t i c u l a r system. b) Case where the disturbances have non gaussian p r o b a b i l i t y density functions. I t i s desired to test the proposed control i n the case where the disturbances are non gaussian. The proposed method i s compared with a con-t r o l l e r r e a l i z e d by cascading a Kalman-Bucy f i l t e r with a de t e r m i n i s t i c control generator. As i n the previous case, the standard c o n t r o l l e r i s given access to the same kind of data as contained i n (5-37). That i s , the covariance of the i n i t i a l state, and the covariance of the disturbances are supposed known. The Kalman gain i s computed using this new data. In the proposed c o n t r o l l e r , however, nothing i s changed. In p a r t i c u l a r , (5-36) remains unchanged. The p r o b a b i l i t y density functions of the noises has been obtained by taking the union of two gaus-sian noise sequences and truncating a l l elements of amplitude greater than 2.5 [11]. Then, i t i s possible to generate (see figure 5) non gaussian p r o b a b i l i t y density functions. An advantage of this method [11] i s the f a c i l i t y of r e a l i z a t i o n since pseudo-random noise generators are r e a d i l y a v a i l a b l e f or simulation. The extra information given to the standard con-t r o l l e r i s E[v.v!] = 1.706 (S-M E[w.w!] = .87786 '&±. E [ x x ' ] = I j 1 = u n i t matrix o o Note that the i n i t i a l state vector i s normally d i s t r i b u t e d . The same procedure described i n the previous example was repeated. The average costs E(J) for 2100 runs were E(J) Kalman-Bucy f i l t e r with complete • access to data (5-38) 1555. Proposed method with no access to data (5-38). No changes from the previous gaussian case were made. 1535. Here the proposed method i s 1.2% better on the average. I t must be noted that the standard configuration using a Kalman-Bucy f i l t e r i s c e r t a i n l y not optimum for this kind of noise, and t h i s explains why i t i s p o s s i b l e to obtain better r e s u l t s . From the study of this example, i t i s concluded that replacing a 2 x 2 Kalman gain matrix by a s c a l a r has l i t t l e influence on the cost, 40 PLANT NOISE I P(Wi, > ME A N=0. y 1 0 0 0 0 VARIANCE = .8778 J SAMPLES REALIZED BY UNION OF TWO NORMALLY DISTRIBUTED RANDOM NUMBER GENERATORS (~.5,1)U(.5,.5) 2. AMPLITUDE OF W: 'k MEASUREMENT NOISE P(Vj ) MEAN =0 | 10,000 k VARIANCE = U7Q6 ) SAMPLES REALIZED BY UNION OF TWO .MEASURED PROBABILITY DENSITY FUNCTIONS OF PLANT AND MEASUREMENT NOISES (SECOND ORDER EXAMPLE) FIG - 5 41 to say nothing about the f a c t that no data beyond the expected values of the disturbances and the expected i n i t i a l state vector i s required. In-ves t i g a t i o n s were also made f o r the case x ^ E(x ). S i m i l a r r e s u l t s o o as described above were noted. ; 5.3 Example II The simulation executed i n the f i r s t example was extended to a f i f t h order system. I t concerns a part of a power system c o n s i s t i n g of a generator connected to an i n f i n i t e bus [12]. Before the simulation s t a r t s , the system i s operating i n the steady-state. The equilibrium point has an average value of x^. The perturbation equations around the eq u i l i b r i u m point can be represented i n a block diagram form as shown i n f i g u r e 6. The system i s disturbed by zero mean noise. When the simulation s t a r t s at time zero, the load i s suddenly increased (closure of a switch) by a step function. The system tends to reach a new equilibrium point which has been scaled to be at the o r i g i n , so that the generated power equals the demand. The co n t r o l i s modifying the speed gear p o s i t i o n of the governor. T y p i c a l values for a d i s c r e t i z e d version of the time increment of • 5(s) during the time i n t e r v a l (0, 20[s] 1. .5 0. 0. 0. 0. -.3375 .875 12.5 0. -12.5 0. Xk+1 = 0. 0. 0. 1. 0. X k + 0. 0. -.03 0. 0. 0. 1. 0. 0. 0. 0. 0.95 0. u k + wk (5-39) where x/ = [x x x x. x ]; w' = [w w w w w ] ^ \ 2k 3 k \ 5 k k X k 2k 3 k \ \ The measurement device has the form (2-2) \ = X k + V k (5-40) AX «= V a r i a t i o n of governor p o s i t i o n (and control of gate opening) & u ™ Control a c t i n g on the gear p o s i t i o n of the governor APg V a r i a t i o n of mechanical power Aw « Angular frequency deviation AS *» Power angle deviation; y i e l d s the e l e c t r i c a l power Pd » Load disturbance (step function) Jr «x zero mean noise (random v a r i a t i o n s of the load) L a Load a f t e r shaping by f i l t e r BLOCK DIAGRAM OF PERTURBED SYSTEM AROUND EQUILIBRIUM POINT FIG- 6 43 where v R = [v v v v v ] k k k k k the disturbances are assumed to have zero mean E [ v k ] - E[w k] = [0) V k (5-41) The o r i g i n a l cost function i s given to be 1. 0. 0. 0. 0. 0. 1. 0. 0. 0. N J = 1/2 E [x,1 o k 0. 0. 1. 0. -1. (5-42) 0. 0. 0. 0. 0. 0. -1. 0. 1. The system i s operating during the time i n t e r v a l 0 £ k < 100 For the purpose of the simulation, i t w i l l be assumed that the equations (5-39).- (5-42) are a s u f f i c i e n t l y accurate model of the power system. The i n i t i a l condition on (5-39) w i l l be i n the hypercube centered i n x' = E(x') = (-.2847,.000014312,.9923,.9923,1.) (5-43) o o The corners of the hypercube are located at Z f i v e times the variance of the measurement noise along the respective axis. Figure 7 i l l u s t r a t e s the l o c a t i o n of two corners (over a t o t a l of 32 corners). The i n i t i a l condition X q i s supposed to be normally d i s t r i b u t e d with a covariance equal to the covariance of the measurement noise. The simulation follows exactly the same pattern as in d i c a t e d i n Example I. This means that the comparison with a standard optimum c o n t r o l l e r w i l l be e s s e n t i a l l y u n f a i r since the Kalman-Bucy gain of the f i l t e r w i l l be computed using exact s t a t i s t i c a l data. The f i r s t step i n implementing the proposed 44 distance i s f i v e times E(v. . v! ) x0 xo 0. 1 mm max Computation of E(J) F i g . 7 method i s the evaluation of the s t a b i l i t y range f o r the proposed f i l t e r . 1) S t a b i l i t y condition ' The eigenvalues of the system (5-39) are A l = .8339 + i.5988 X2 = .8339 - i.5988 X3 = -.5018 \ - .7090 X5 = = .9500 Hence X M = -| X | = 1.0266 (5-44) and d i r e c t a p p l i c a t i o n of (4-24), (4-25), (4-26) y i e l d s .0259 < g k < 1-974 (5-45) The next problem i s to set up the parameters Sir, and a i n equation (5-25) . For the same reasons as before, take 6£ = a = I g - 1 (5-46) The values i n d i c a t e d by (5-46) w i l l not be changed during the simulation. Note that (5-46) i s i d e n t i c a l to (5-36). This i n d i c a t e s that the choice of these parameters are not c r i t i c a l . Of course (5-46) w i l l not be the best p o s s i b l e choice i n the sense of reducing the average e r r o r . However the best choice can only be determined by repeated simulation, and would be v a l i d f o r only one kind of noise s t a t i s t i c s . The comparison w i l l be c a r r i e d out by using two kinds of noise s t a t i s t i c s . a) Case where the disturbances have gaussian p r o b a b i l i t y density function The a d d i t i o n a l s t a t i s t i c a l data required by the Kalman-Bucy f i l t e i s taken to be the following: E t v . v ' J i 3 = .1 fiij-E[w ±wj] = .1 6 ± j E[v.w|] i 3 " 0 E[x x'] L o o J = r = .1 o Equation (5-47) determines e n t i r e l y the operating conditions costs over 2200 runs were Kalman-Bucy f i l t e r + d e t e r m i n i s t i c c o n t r o l l e r + a l l data (5-47). Proposed method with no access to (5-47). E(J) 2197. 2437. (5-47) The average Thus the proposed method y i e l d s an average cost which i s 11% greater. The meaning of th i s 11% increase can be understood by comparing the average Kalman cost (2197.) with the cost obtained i n the noise free case. I f v = w. = 0 Yk then (5-39), (5-43) represent a de t e r m i n i s t i c c o n t r o l k k problem and the optimum c o n t r o l can be found. The corresponding cost i s 4.463. I t i s now i n t e r e s t i n g to compare the optimum de t e r m i n i s t i c cost (4.463) with the average Kalman cost (2197). C l e a r l y the disturbances a f f e c t h eavily the system performance. This indicates that the system i s operating under extremely noisy conditions. From the continuity of (5-42) i n i t s arguments and the continuity of (5-39), (5-40) and (5-6) i n the noise sequences,it i s seen that the cost (5-42) i s a continuous function of the noise sequences v^ and w^ . The structure of the f i l t e r (5-6) being s i m i l a r to the Kalman f i l t e r (2-9), both proposed and standard methods r e s u l t i n the same de t e r m i n i s t i c cost as the noise free case. I t i s therefore evident that i f the influence of the noise on the average cost decreases ( i . e . the covariances of the noise sequences v, and w, tend towards zero). k k then the dif f e r e n c e between the average proposed cost and the standard cost i s expected to decrease towards zero. This point was v e r i f i e d . Limited computational experience i n d i c a t e d that for noise r e s u l t i n g i n an average standard cost of 50, the average cost obtained by the proposed method was only 3% to 5% greater. In other words the increase of 11% characterizes the performance of the proposed method under heavy noise condition. b) Case where the noise s t a t i s t i c s are not gaussian . The noise s t a t i s t i c s were obtained i n a s i m i l a r manner as i n Example I. The relevant information i s i l l u s t r a t e d i n figu r e 8. The maximum noise amplitude was set to .5. The measured covariances matrices are: E[v.v.'] = .09596 6 - H i j J E[w.w!] = .09837 6 - H i 3 J 47 PLANT NOISE 10,000 SAMPLES |MEAN =0.0 I VARIANCE=-09837 UNION OF TWO NORMAL DISTRIBUTIONS (-.25/.25)U(,25/.l) •4 AMPLITUDE OF Wf MEASUREMENT NOISE p(\) 10,000 SAMPLES MEAN = 0.0 VARIANCE = -09596 UNION OF TWO NORMAL DISTRIBUTIONS (-.25,01 )U(.25,.25) ' 4 AMPLITUDE OF \ MEASURED PROBABILITY DENSITY FUNCTIONS OF PLANT AND MEASUREMENT NOISES (FIFTH ORDER EXAMPLE) FIG-8 48 The average costs over 2200 runs were E(J) Kalman-Bucy f i l t e r with data (5-48). 21900. Proposed method hot using (5-48), with no changes from previous case. 24860. Thus the proposed method y i e l d s an average cost 13% greater. The degrad-at i o n i n performance when compared with a second order example i s e s s e n t i a l l y due to the replacement of a 5 x 5 gain matrix ( i . e . 25 independent elements) by a s c a l a r gain. Whether or not such degradation i s acceptable depends on the system, the a v a i l a b l e computer, and the amount of known s t a t i s t i c a l data. In p r a c t i c a l s i t u a t i o n s , the cost index (5-42) i s often an a r b i t r a r y choice. Therefore the t h e o r e t i c a l "optimum" i s only as "good" as the cost index i s . I t seems reasonable to consider then that a degradation of, say 20%, w i l l be acceptable e s p e c i a l l y i f the computational advantages make an implementation less c o s t l y . 5.4 Conclusion The f i r s t part of the thesis was concerned with the c o n t r o l of a p a r t i c u l a r class of s t o c h a s t i c l i n e a r systems. From the study of the conventional method of s o l u t i o n , i t has been po s s i b l e to give an i n t e r p r e t -ation suggesting a completely d i f f e r e n t approach. I t has been shown to be p o s s i b l e to use cost indices other than the conventional expected cost. The proposed cost index i s structured so that the c o n t r o l represents a trade-off between a de t e r m i n i s t i c cost and an estimation er r o r cost penalty. The main advantage of this formulation i s that no d e t a i l e d s t a t i s t i c a l information beyond the expected values i s required concerning the disturbances 49 acting upon the system. The proposed method i s based on a d e c i s i o n p o l i c y which uses the gradient of the proposed cost index to adaptively update a f i l t e r gain. Two examples of l i n e a r systems with gaussian and non gaussian noise s t a t i s t i c s show that t h i s approach allows the Kalman gain matrices to be replaced by a s c a l a r with an i n s i g n i f i c a n t to moderate increase i n cost. I t remains to i n v e s t i g a t e i f the proposed method of s o l u t i o n can be extended to more general systems. This w i l l be the subject of the second part of the t h e s i s . 50 6. BACKGROUND TO NONLINEAR STOCHASTIC CONTROL PROBLEM 6.1 Introduction Determining the c o n t r o l of a nonlinear s t o c h a s t i c system by mini-mizing an expected cost i s , i n p r i n c i p l e , f e a s i b l e i f a l l required s t a t -i s t i c a l data i s a v a i l a b l e - [ 3 ] . However the solutions are exceptionally d i f f i c u l t to obtain and almost impossible to mechanize. Standard suboptimal approaches based on l i n e a r i z i n g the n o n l i n e a r i t i e s have one feature i n common i n that they require the computation of an extended Kalman gain matrix. This i s only p o s s i b l e i f s u f f i c i e n t s t a t i s t i c a l data i s a v a i l a b l e . If such a data i s not a v a i l a b l e , i t must be "guessed" or some i d e n t i f i c a t i o n technique used, which can severely a f f e c t the performance of the system [6]. A s o l u t i o n to t h i s problem w i l l be presented i n t h i s second part of the t h e s i s . The proposed method i s e s s e n t i a l l y an extension of the method introduced i n part I. The c o n t r o l of a nonlinear s t o c h a s t i c system w i l l not be obtained by the a r b i t r a r y choice of minimizing an expected cost. Instead, i t w i l l be shown that a d i f f e r e n t form of a cost index can be formulated. Under s u i t a b l e constraints a c o n t r o l p o l i c y minimizing this proposed cost index w i l l be shown to e x i s t , and by s u i t a b l e l i n e a r i z a t i o n a p r a c t i c a l algorithm w i l l be deduced. I t i s shown that for an important class of nonlinear system, any a d d i t i o n a l unspecified parameter introduced by a suboptimum s o l u t i o n can be updated by adaptive coupling. As before, the c e n t r a l idea remains the implementation of a c o n t r o l based on r e a l i z i n g a trade-off between estimation and c o n t r o l . The trade-o f f i s performed se q u e n t i a l l y by adaptive updating of a s c a l a r f i l t e r gain. The necessary s t a t i s t i c a l information i s not extended beyond the knowledge of the expected values of the noises and the expected value of the i n i t i a l 51 state vector. Two numerical examples i l l u s t r a t e the a p p l i c a t i o n of the method. 6.2 Problem formulation Consider a m u l t i v a r i a b l e nonlinear d i s c r e t e system described by: *k+i • F ( V V k ) + w k \ e u ( 6 - 1 } z k = H(x ,k) + v k 0 < k < N-l (6-2) Equation (6-1) r e f e r s to the system dynamics where i s an n dimensional vector, i s a m-dimensional control vector and w k i s an n-dimensional zero mean disturbance vector, which i s assumed to be independent of and v, at a l l time, k Equation (6-2) i s the measurement equation modelling the measure-ment device disturbed by the a d d i t i v e zero-mean noise r-dimensional vector which i s assumed to be independent of x kand V^at a l l times. Thus ^V = ° E(w k) = 0 V-k (6-3) where E(') i s the expectation operator. The system (6-1) i s to operate during the time i n t e r v a l 0 •< k < N - l . I t i s desired to implement a control sequence u e U-(as a function of the actual and past measurements (6-2) and the past controls) such that the o r i g i n a l cost index N J = Z J.(x.,U.,j) (6-4) o J 3 3 i s minimized. A l l nonlinear function F ( . , . , l ) , H(.,.), J (.,.,.), are assumed to be continuous and d i f f e r e n t i a b l e i n a l l t h e i r arguments. A l l matrix inverses which occur i n the algebraic manipulations are assumed to e x i s t . 52 6.3 Conventional Method of Solution As explained i n part I, the s t o c h a s t i c c o n t r o l problem as defined by (6-1) - (6-4) cannot be solved. The standard method consists i n f i r s t r e p l a c i n g (6-4) by i t s expected value E(J) and then determining a c o n t r o l sequence which minimizes E(J) . [3 ]. Even i f a l l required s t a t i s t i c a l inform-ation were a v a i l a b l e , the a n a l y t i c a l and computational d i f f i c u l t i e s are formidable. These d i f f i c u l t i e s are caused e s s e n t i a l l y by the computation of PC^I 2^) which i s the p r o b a b i l i t y density function of given the measure-ment z, - [4 ] . The three b a s i c d i f f i c u l t i e s are: K. a) Computation of p ( x ^ | z ^ ) : This problem i s complicated by the nonlinear r e l a t i o n s h i p s between z^ and x^. b) p (x^. |z^.) must be i n an a n a l y t i c a l form. This requirement i s necessary i f the s o l u t i o n i s to be implemented i n r e a l time. c) p |z^.) m u s t be "reproducing" d i s t r i b u t i o n . This implies that a set of f i n i t e dimensional s u f f i c i e n t s t a t i s t i c s must e x i s t . These d i f f i c u l t i e s severely l i m i t the p r a c t i c a l a p p l i c a t i o n of the optimum solutions generated by the standard method. Thus, the use of suboptimal controls i s the r u l e . They are e s s e n t i a l l y based on l i n e a r i z a t i o n and a p p l i c a t i o n of the standard method of s o l u t i o n for l i n e a r systems. In p a r t i c u l a r the design objective remains the minimization of E ( J ) . 53 ' 7. THE PROPOSED NEW COST INDEX AND CONTROL POLICY The minimization of E(J) as a design objective w i l l be discarded since i t implies an extensive s t a t i s t i c a l d e s c r i p t i o n of the system, and s o l u t i o n of the d i f f i c u l t i e s a), b ) , c) l i s t e d i n se c t i o n 6.3 • Since (6-4) does not induce an ordering over a l l c o n t r o l p o l i c i e s , a new c r i t e r i o n r e l a t e d to (6-4) i s des i r a b l e . The new c r i t e r i o n i s based on a trade-off between estimation and control and i s deduced from d e f i n i t i o n s of cost functions f o r the estimator and the c o n t r o l l e r . Define e^ = x^ - x^ (7-1) where x^ i s the estimate of x^ i n a sense to be defined l a t e r . Then, u s i n g ( 7 - l ) , the o r i g i n a l cost (6-4) can be developed as follows: N J = £ J. (x. , u. , i ) = J + J (7-2) £ x x x c e N with J = • I J (x.,u.,k) (7-3) c o c i x x A N J = Y, J (e.,x.,u.,i) (7-4) e e x I x ° i Equations (7-2), (7-3), (7-4) are s i m i l a r i n structure to (3-2), (3-3), (3-4) r e s p e c t i v e l y . 7.1 F i l t e r i n g problem Assume for the moment that the c o n t r o l l e r has the form: \ = \ (v k ) a~ 5 ) Following a s i m i l a r argument as explained i n Chapter 3, Section 3.1, the stru c t u r e of the f i l t e r i s assumed to have the form: V i = \+i + G k + i y k + i ( 7 ~ 6 ) where y k + 1 = - HC^.k+l) (7-7) and \ + 1 & F ^ . v ^ . k ) (7-8) and where the matrix gain G^+^ may be dependent on x^, z^ +-^ a n ^ the time "k" . The problem i s to f i n d a method to evaluate the f i l t e r gain G k +^ i n such a way that the error (7-1) i s "small", or at l e a s t that the trace of the covariance er r o r of the estimate i s a decreasing function of time. Define x ^ A 5 ^ + G k y k + 1 (7-9) Equations (7-7) and (6-2) y i e l d y k + l = H ( x k + l ' k + 1 ) + V k + 1 - H ( x k + l ' k + 1 ) ( 7 " 9 ) L i n e a r i z a t i o n about x k + ^ y i e l d s successively \ + i B H ( \ + r k + 1 ) + H X L < W W + \ + r H ( \ + r k + 1 ) ( 7 " n ) V i yk + i = HxL ( x k + i - , 5W + - \ + i ( 7- 1 2 ) \+i S u b s t i t u t i n g (7-6) and (7-1) into (7-12) y i e l d s y k + i = HxL < W ( 7" 1 3 )  xk+i Equation (7-13) can be rewritten ( f o r n o t a t i o n a l convenience the subscript xk+^  i s dropped) * k + l = CI ~ Hx Gk +l ] _ 1 Vk+1 + [I " Hx Gk +l ] _ 1 Vk +1 ( 7 " 1 4 ) Define = y + 1 - ^ (7-15) Equation (7-1) can be expanded as: ek+i = \+i " V i + \+i - \ + i = \+i + V i " V i ( 7 _ 1 6 ) From (7-6), (7-8) and (7-9), \+.lm%+l+ (Gk+rGk)yk+l ( 7 " 1 7 ) Equation (7-17) into (7-16) y i e l d s : ek +l = ik +l- (Gk+l-Gk> \ + l ( 7 " 1 8 ) s u b s t i t u t i n g (7-14) into (7-18) y i e l d s ek+i • vr C Gk +r Gk ] [ I ' H x G k + i r l [Vk +i + (7"19) P r e m u l t i p l i c a t i o n of both sides of (7-19) by y i e l d s H x V i " Vk+rWVk+i + \ + i ] ( 7 " 2 0 ) where Y k + 1 A H j G ^ - G ^ H - H ^ ] " 1 (7-21) D e f i n e . e£+i = Vk+r §k +i = H x \ + i ( 7 ~ 2 2 ) Then equation (7-20) becomes: e * = e* — v (e* + v ) (7-23) k+1 k+1 Yk+1^ k+1 k+r W ; which can be rewritten, f o r Y^+i ^ 1> a s e* = (I + Y ) _ 1 re* — Y v 1 (7-24) k+1 v Yk+r L k+1 Yk+1 k+1 J K ' By d e f i n i t i o n , the gain C^ +i i s considered to be b e t t e r than or as good as the gain G, i f the trace of the covariance of the e r r o r e* ., i s smalle k k+1 than or equal to the trace of the covariance of the err o r Thus, ^ { E [ e ^ + i e & i l \ + i ] } . ^ ^ ^ i ' ^ i ' (7'25) where the matrix • (ETe* e*' IY ] - Efe* e'* IY 11 < 0 (7-26) 1 L k+1 k+11 Yk+1J 1 k+1 k+1 | Yk+l n - W ; i s negative semi-definite. S u b s t i t u t i n g (7-24) into (7-25) y i e l d s , f o r a symmetric Yk+-^> Tr { E [ i k + i ^ + i i w + \+i E C vk +i vk+ i k +i ] ^k+i EtikVi5frilW (I+W} ( I + V i ) _ 1 < 0 (7-27) -1 If (I+Y, -) e x i s t s , a condition equivalent to (7-27) i s given by K."T"_L Tr- {E[^+i\*;ii\+i] + \ + i E [\ +i vk +i ] \ + i - ( i + \ + i ) E [ ^ + i ^ ; i i v i ] -(I+Y k + 1)> < o (7-28) where the matrix {'} i s negative semi-definite. Reducing (7-28) y i e l d s : T r {w E [\ +i vk +i ] - E [^ +i ig;i'\ +i ]) \ + i - v i E [ ik +i^:ii v i ] - E [ i ^ i 5 K + i i \ + i ] \ + i } * 0 (7-29) The condition (7-29) i s s a t i s f i e d i f WVEelYk+l - WcTVk+l = Q E v = E [\ +l Vk +l ] (7-30) E e = E [ i k H - A l l W ] where ft i s a negative semi-definite matrix. Solving (7-30) i s not a t r i v i a l task. However i n the p a r t i c u l a r case that H H' = I (7-31) x x a s o l u t i o n of (7-30) i s obtained by taking Gk+1 " 8k+lHx (7-32) whe re g k + 1 = g k + 1 ( x k » z k + i ' k ^ i s a s c a l a r function. In t h i s case, equation (7-21) becomes: \ + i = V g k+r gk ) Hx ( I - g k + i H x H x ) _ 1 = A g k ( 1 - g k + i r l 1 (7-33) = -Yvf.-i i where Y*,-, i s a sc a l a r (7-34) I t i s i n t e r e s t i n g to compare (7-33) with (3-19). Su b s t i t u t i n g (7-34) in t o (7-30) y i e l d s \ + l C E e " V + . 2 ^ + l E e = - n ( 7 ~ 3 5 ) Let IT be an a r b i t r a r y vector i n an r-dimensional e u c l i d i a n space. Pre- and p o s t - m u l t i p l i c a t i o n of (7-35) by 'TT y i e l d s : y,*2 TT ' [ E - E ]TT + 2y* T T ' E TT = -TT'QTT > 0 (7-36) k+1 e v k+1 e The s c a l a r function (7-36) must be p o s i t i v e or zero. I t i s r e a d i l y seen that there always e x i s t a s c a l a r p o s i t i v e a a r b i t r a r i l y small such that the condition (7-36) i s s a t i s f i e d , since E ^ i s non-negative d e f i n i t e . To i n v e s t i g a t e the p r a c t i c a l d i f f i c u l t y of f i n d i n g a value for Y k +^ the sc a l a r quantity T T ' E TT o u l = , J - 0 , T T ' E rr ^ 0 (7-37) k+1 TT E TT e. e i s introduced. Thus (7-36) can be w r i t t e n as: Y k + l ( 1 - a k + l ) + 2 t + l * ° ( 7 " 3 8 ) or ^ + i ( 1 - < W + 2 H + i >-0 ( 7 " 3 9 ) Equation (7-39 i s i d e n t i c a l with (3-24). The conclusions concerning (3-24) are therefore a p p l i c a b l e . I t follows that (7-33) and (7-34) can be written i n the form A g k ( 1 " g k + I r l = ^ +1 > ° ( ? - 4 0 ) 58 which i s i d e n t i c a l to (3-28). S i m i l a r l y to (3-31) equation (7-40) becomes: A g k = a. S g n ( l - g k + 1 ) (7-41) Su b s t i t u t i n g (7-32) into (7-17) y i e l d s V l = \ + 1 + \ % 1 ( 7 " 4 2 ) Equation (7-42) represents an improved estimate i n the sense that (7-25) holds. However, Ag i s not determined by f i l t e r i n g alone, since a trade-off K. between estimation and control must be formulated. This can be accomplished by introducing V i - w v i - a s g n ( 1 - W H x y k + i ( 7 ~ 4 3 ) and a s s o c i a t i n g a cost penalty f o r any deviations from the n u l l vector, C, , = 0, which represents the optimum choice (7-42). Equation (7-43) i s analogous to (3-37). The f i n a l form of the penalty cost function of the f i l t e r i s s i m i l a r to (3-39) Jpk 4 C k \ C k ( 7" 4 4> where W i s a p o s i t i v e d e f i n i t e matrix which w i l l be s p e c i f i e d l a t e r , k 7.2 Control problem From (7-5) and (7-48) i t i s seen that U k = V V 1 ^ = \ <-\ (' A 8k- ) ' k' ) (7-45) The remaining unsolved problem consists i n determining an e x p l i c i t s tructure for (7-5). D i r e c t extension of (3-46), (3-47) would require a d e t e r m i n i s t i c c o n t r o l based on minimizing (7-3) for the remaining time-to-go subject to the constraint xj + 1 = F(x j = k,...,N-l (7-46) However th i s control would not allow (7-3) to be expressed i n a compact form such as (3-49). Consequently, the structure (7-5) would have to be 59 determined at each stage by a complicated minimization problem. Another p o s s i b i l i t y would be to obtain the structure (7.-5) by minimizing (7-3) subject to the constraint (7-46) taking j = 0,...,N-1, and using the r e s u l t i n g optimum con t r o l structure. Depending on the n o n l i n e a r i t y of (7-46) t h i s may be too complicated, and some suboptimum structure may be des i r a b l e . This w i l l be inv e s t i g a t e d i n a forthcoming s e c t i o n . For the moment, consider the control to be given by (7-5). The c o n t r i b u t i o n of the control to the cost (7-3) i s given at each time by JC = JcA>vk) <7-47> k k which i s s i m i l a r to (3-49). S u b s t i t u t i n g (7-45) and (7-42) i n t o (7-47) makes i t evident that (7-47) i s a function of the incremental gain sequence 7.3 Structure of cost f u n c t i o n a l The f i n a l cost c r i t e r i o n , which i s to be structured so that the choice of the gain of the f i l t e r at each stage "k" i s based on a trade-o f f between estimation and c o n t r o l , i s obtained by considering (7-44) and (7-47), * k = J c v ( V V k ) + W k ( 7 - 4 8 ) k Equation (7-48) i s s i m i l a r to (3-50). In p r i n c i p l e , the f i l t e r gain would have to be determined through the minimization of (7-48) subject to the constraints (6-1), (7-5), (7-6), (7-8). As before, i t i s questionable to use the exact minimum because of the uncertainties i n cost introduced by future noise. The decision r u l e for evaluating the gain sequence i s there-fore based on taking the best average d i r e c t i o n f o r a gain c o r r e c t i o n to be determined by the gradient of i> and introducing 'a step s i z e constraint K. ( 6 g k ) 2 = 6£ 2 g 2 (7-49) where 0 * « 4 g < l ; ^ g k = g k + 1 - g k (7-50) So f a r , the determination of the f i l t e r gain i s based on a minimization problem defined by (7-48) subject to several c o n s t r a i n t s . However, the value of the o r i g i n a l cost c r i t e r i o n (7-2) depends on the g l o b a l e f f e c t of the c o n t r o l sequence obtained by s o l v i n g minimizations problems defined by (7-48) subject to the constraints (6-1), (7-5), (7-6), (7-8) and (7-49). Thus the general behaviour of the system must be investigated, and i t w i l l be shown i n the next s e c t i o n that an a d d i t i o n a l constraint of the f i l t e r gain must be introduced i n order that the proposed decision rule r e s u l t s i n a useful suboptimum c o n t r o l . 61 8. ASYMPTOTIC STABILITY OF THE FILTER 8.1 S t a b i l i t y condition A requirement i s imposed that the co n d i t i o n a l expectation of the f i l t e r i n g e r r o r i s to be asymptotically s t a b l e . From (7-1), (6-1) and (7-6) i t follows that \+i = V i - \+i = F ( W k ) + wk - F ( V V k ) - Gk+iyk+i (8-1} S u b s t i t u t i n g (7-14) i n t o (8-1) y i e l d s -1 \+i • F ( V V k ) - F ( V V k ) + wk - G k + i [ I - H x G k + i ] { H x V i + \+i } To s i m p l i f y (8-2), the matrix i n v e r s i o n lemma [5] ^ + G k + i ( I - v w " 1 ^ = ( i - w v " 1 where H = H x x i s the Jacobian matrix, i s u t i l i z e d . k+1 Subs t i t u t i n g (8-3) into (8-2) y i e l d s (8-2) (8-3) \ + l " ( I - G k + l H x } { F ( V V k ) " F ( x k ' \ ' k ) + V - G k + l \ + l Expanding the n o n l i n e a r i t y up to the second order terms gives (8-4) F(x k,u k,k) = F(x k,u k,k) + F x (8-5) s u b s t i t u t i n g (8-5) into (8-4) and taking the co n d i t i o n a l expectation y i e l d s ^ k + J W = ^ k + l V F x . ECeklGk+l] . Xk (8-6) (8-7) = M k + l E [ e k l G k ] Equation (8-6) i s a ge n e r a l i z a t i o n of (4-6) . In order to;have (8-6) asymptotically stable i n the large, i t i s s u f f i c i e n t that the matrix norm s a t i s f y the condition || +1 < 1. 62 The range of G for which t h i s condition i s s a t i s f i e d defines iC"t"X the desired constraint on the f i l t e r gain. In the p a r t i c u l a r case where a gain on the form (7-32) i s applicable equation (8-6) becomes \ + l = ( I - S k + l H x H x ) F > (8-8) x k where i s a s c a l a r . The problem of f i n d i n g the range of a s c a l a r g^ +^ so that the matrix norm s a t i s f i e s the condition 11^+^1 I < 1 I s already a simpler task. 8.2 Choice of a norm For the same reason as before (Section 4.2) the norm ||"|| i s CO adopted. R e c a l l that (see equation (4-8) ) M\+ilL = ? a x."l mijl < x (8"9) {i}j=i the condition (8-9) i s easy to v e r i f y since a l l m.. are l i n e a r i n g, ,., . i j k+1 Let H' = (h! . ) ; . H = (h..); F = (F..) (8-10) x i j x xj x xj Then from (8-8) i t i s seen that n r m . , = F., - g, ^ E n . L . h | . h . F . (8-11) x l i l Bk+1 p=l j-1 xj jp p i v 1 and therefore lm - i l < l F - i l + 181,4.1 I I * i - E i h ' - h - F 1 ( 8 _ 1 2 ) 1 i l 1 - 1 x l 1 1 k+ 1 1 1 p=l j = l xj jp p i Note that (8-12) may impose a more stringent condition than (8-9). However, s u b s t i t u t i n g (8-12) into (8-9) y i e l d s n n n n n r max E |m._ I < . E IF.. I + I g. J. I E I E E h! .h . F J < 1 { i ] 1=1 ' i l 1 1 = 1 ' 1 1 1 ' k + 1 1=1 P=l 3 = 1 X J JP P i (8-13) I t i s seen that (8-13) i s u s e f u l only i f max Z | F., I < 1 { . } 1=1 i l I (8-14) Assuming that (8-14) holds, and d e f i n i n g i * to be the s o l u t i o n of (8-13), then the range of i s given by 1- h F 1=1 i * , l ' 8k+l ' - m . n r E E E (8-15) h'...h F 1=1'p=l j = l ^ . J P P l I f (8-14) does not hold, then (8-12) i s too stringent and (8-11) must be used i n (8-9). I t i s d e s i r a b l e to obtain a condition of the form or < 2 < 2 6min - 6 k - &max (8-16) This i s po s s i b l e i f an a p r i o r i range for x, i s known. Let min x. max (8-17) In p r a c t i c a l s i t u a t i o n s , the range (8-17) i s often known since the state of the system i s always bounded. Evaluation of (8-15) can be done o f f - l i n e . Let -g =g = min . max min {x . < x < x } min max 1- E F. 1=1 i * , l m , n r E I E E h' .h. F I _ l = l ' p = l j = i l A 3 JP P 1 1 (8-18) In the p a r t i c u l a r case where H = I i s the unit matrix, considerable x s i m p l i f i c a t i o n s can a r i s e and should be incorporated before imposing the condition (8-12). Equation (8-8) becomes \ + 1 = U - J W ^ (8-19) 64 A p p l i c a t i o n of the natural matrix norm condition (8-9) y i e l d s : n n max Z |m | = | l - g | max £ IF I < 1 (8-20) / . 1 . 1 1 K+1 r . i . , 1 111 i l J J - l J {l} J=l J Let i * denote the s o l u t i o n of (8-20). Then H\+iil= i^ k+J i 'W < 1 (8-21) which has the s o l u t i o n 1 - < g k + 1 < 1 + (8-22) To obtain a s t a b i l i t y condition of the form (8-16) i t i s necessary to obtain a range for x^ analogous to (8-17). Let i I*.*. I 4 - , -V <8"23) • i 1 i * J M ^ x • < x < x } 3=1 3=1 M mxn max The range (8-16) f o r the f i l t e r gain i s then defined by e = i . _ [ Z | F I l gmin l j = l l i * j l M J W = 1 + ^ I I ^ J I M 1 " 1 (8-24) Note that (8-24) i s very s i m i l a r to (4-12) and (4-13). The above s i m i l a r i t y w i l l . b e i l l u s t r a t e d i n the numerical examples. I t i s desired to obtain a range as large as pos s i b l e , provided the computational load i s acceptable. The simpler the numerical evaluation of the range, the more r e s t r i c t i v e i t w i l l be. The choice of the norm |IM^^-J | s i m p l i f i e s the determination of a s u i t a b l e range for the gain. However i f the elements of the matrix M, .. are too large, the range computed with (8-23) may be too small to be used i n p r a c t i c a l s i t u a t i o n s . In this case, i t i s suggested to take a smaller time increment At=(k+1)-(k). 65 9. SUBOPTIMUM CONTROL I t has been assumed throughout Chapter 7 that the r e l a t i o n s h i p between the cont r o l and the estimated state has the form ( 7 - 5 ) . I t has been indi c a t e d i n Section 7.2 that ( 7 - 5 ) can be obtained by solv i n g the dete r m i n i s t i c problem associated with ( 6 - 1 ) and ( 6 - 4 ) with w^  = 0 f o r a l l time. The optimum con t r o l f o r a nonlinear system i s generally d i f f i c u l t to implement. For this reason, a suboptimum version i s usually d e s i r a b l e . This Chapter w i l l be concerned with the de r i v a t i o n of a suboptimal c o n t r o l structure p a r t i c u l a r l y adapted to the sto c h a s t i c control problem as form-ulated above. The class of nonlinear systems which w i l l be considered i n th i s chapter i s described by * k + i = Vk + Vk + f ( V X Q = E ( X Q ) given ( 9 - 1 ) 0 < k < N-l The cost index i s j • 1 / 2 Jo KVk + u k \ v ( 9 - 2 ) where K^ n-i i s an n-dimensional state vector, and u^ i s an m-dimensional control vector. The n o n l i n e a r i t y f (x^.) depends only upon the state. The cost function ( 9 - 2 ) i s quadratic and the matrices R^ and are p o s i t i v e d e f i n i t e at a l l time. The system i s to be operated during the time i n t e r v a l 0 < k < N - l . The problem i s to f i n d a c o n t r o l sequence u^, ken c o n t r o l l i n g the system ( 9 - 1 ) i n such a way that ( 9 - 2 ) i s minimized. The di s c r e t e maximum p r i n c i p l e [5] y i e l d s P k = ( A k + f k x ) , p k + l + \ \ <9~3> 0 • K Pk+i + \ \ . ( 9 - 4 ) where P^ +^ i s t n e costate v a r i a b l e associated with x^. The t r a n s v e r s a l i t y condition i s P N = V N ( 9 ~ 5 ) Assume a s o l u t i o n of the form pk = Vk"hk. (9~6) D i r e c t s u b s t i t u t i o n of (9-6) into (9-1), (9-3) and (9-4) y i e l d s the s o l -u t ion: = S(x k ,k)x k + T(x k ,k)h k (9-7) where T ^ . k ) = Q - 1 B^A^ + f f c ) _ 1 (9-8) k and S(xk,k) = -T(xk,k) (I^-R^ (9-9) and where the matrix L i s s o l u t i o n of the R i c a t t i equation Lk = Vk+i [ I + \ \ 1 VW" 1 \ + \ ( 9- i o ) The terminal condition i s L^R^ and the vector h k i s s o l u t i o n of the non-l i n e a r equation "hk " W l ^ + Vk l BkLk+l)_1 (fk + " h k +l } + f x { Lk+i ( I + Bk\ lBkLk+irl <\\ + fk + Vk^kVi) - \ + i } k (9-11) with the terminal condition h = 0. N 67 S o l u t i o n of (9-1) and (9-11) represents a non t r i v i a l two-point boundary value problem (TPBVP). The s o l u t i o n of t h i s TPBVP gives the optimum values f o r h^ and from (9-6) the optimum control u^ i s determined. To avoid the d i f f i c u l t y of s o l v i n g a TPBVP, a suboptimum approach can be used. Con-s i d e r the nominal t r a j e c t o r x£ obtained by d r i v i n g (9-1) with the nominal c o n t r o l u^ = S(x^, k)x^, k en. Then, the nominal h^ i s obtained by s o l v i n g (9-11) by a backward sequence. Consider the suboptimal c o n t r o l u j = S ( x k , k ) x k + uTCxj^k) h£ (9-12) where u i s a s c a l a r . The c o n t r o l structure (9-12) i s obtained from (9-7) by introducing co as a step-size c o r r e c t i o n for the nonlinear part of the c o n t r o l . Equation (9-7) r e s u l t s i f a f u l l c o r r e c t i o n , to = 1, i s used. Equation (9-12), with a general co , allows the p o s s i b i l i t y of choosing an "optimum" co . Equation (9-12) i s rewritten for s i m p l i c i t y as \ = Vk + w Tk h k ( 9 " 1 3 ) When d r i v i n g (9-1) by (9-13), the cost function (9-2) becomes a function of co. The constant s c a l a r to can be considered as the new c o n t r o l f or an a u x i l i a r y problem. The optimum co which minimizes (9-2) can be obtained by standard numerical techniques such as a steepest descent search i n c o n t r o l spacejWhich i s here the s c a l a r co . The s t r u c t u r e defined by (9-13) i s convenient for several reasons. The vector h" i s the only quantity i n (9-13) depending on the i n i t i a l con-d i t i o n X q . In the s t o c h a s t i c c o n t r o l problem (6-1) - (6-4), the best a v a i l a b l e i n i t i a l condition for the system (6-1) i s given by E ( X q ) . Thus evaluating h^ using E(x ) as an i n i t i a l condition for (9-1) i s the best which can be done with the a v a i l a b l e data. However, the true i n i t i a l c ondition of (6-1) i s not known, and i s not E(x ) with p r o b a b i l i t y one. o The nominal h, = h, [E(x )] used i s an approximation to the non r e a l i z a b l e k k o nominal h!* = h*1 (x ) since x i s not known for the system (6-1). k k o o Having a c o n t r o l structure depending l i n e a r l y on a parameter. u> gives a f u r t h e r p o s s i b i l i t y f o r adaptation. Since (7-48) represents the adopted cost c r i t e r i o n , the free design parameter GO w i l l be determined by considering (7-48) as a function of co . Thus co can be treated as an aug-mented f i l t e r gain which i s determined at each time "k". Hence the adaptive w(k) i s i n f a c t a function of time and w i l l be denoted by oo^ . Therefore, i t i s seen that the i n t r o d u c t i o n of the free design parameter co introduced K. by a suboptimum s o l u t i o n of the c o n t r o l structure, can be adaptively up-dated by a decision r u l e based on the proposed structured coupling between estimation and c o n t r o l . The i n i t i a l condition Wq i s taken to be given by the s o l u t i o n of the d e t e r m i n i s t i c problem (9-1), (9-2) and (9-13). The e f f e c t of the i n t r o d u c t i o n of to^ on the s t a b i l i t y condition can be r e a d i l y seen from (8-8). If F(x k,u k,k) = F 1(u k,k) + F 2 ( x k , k ) (9-14) then the gradient matrix F^ i n (8-8) i s not affected by any parameter introduced by a suboptimal version of the c o n t r o l . In p a r t i c u l a r , the class of problem described by (9-1) belongs to the more general class described by (9-14). I t i s i n t e r e s t i n g to note from (8-6) and (9-14), that the c o n d i t i o n a l expected value of the error i s independent of the c o n t r o l s t r u c t u r e . This feature allows a great f l e x i b i l i t y i n the choice of the suboptimal c o n t r o l . F i n a l l y i t should be noted that i f the system does not have the structure given by (9-14) a more elaborate s t a b i l i t y test i s necessary, since then M, ,,, i n (9-8) i s a function of 69 both g, ' and to, ,, . As an i l l u s t r a t i o n of (9-13), a s c a l a r nonlinear example k+1 k+1 has been solved and the r e s u l t s are shown i n F i g . 9. The various curves can be inte r p r e t e d as follows: Curve 1 displays the t r a j e c t o r y obtained when the l i n e a r part of the system i s co n t r o l l e d by the optimum l i n e a r con-t r o l . The associated cost of 2880.8 r e f l e c t s the i n s t a b i l i t y of the l i n e a r p a r t. The complete nonlinear system i s then c o n t r o l l e d by d i f f e r e n t types of state feedback. Curve 2 displays the t r a j e c t o r y r e s u l t i n g from the control of the nonlinear system by the l i n e a r optimum control used to obtain curve 1. Curve 3 r e s u l t s when the nonlinear system i s c o n t r o l l e d by (9-12), i n which the parameter to i s constant and set to equal to unity. Curve 4 re s u l t s . when the nonlinear system has a control (9-12) i n which the constant parameter to has i t s optimum value giving the minimum cost for th i s control s t r u c t u r e . Curve 5 r e s u l t s when the nonlinear system i s co n t r o l l e d by the optimum nonlinear control obtained by standard numerical techniques. The associated cost i s 43.2. From the comparison of the various costs, i t can be seen that the influence of the second term i n (9-13) which arises because of the system n o n l i n e a r i t y , has a s i g n i f i c a n t e f f e c t on the value of the cost index. In conclusion, i t should be noted that any suboptimal control structure may be used. For the p a r t i c u l a r class of systems (9-14), the choice of the structure does not a f f e c t the s t a b i l i t y c r i t e r i o n . Hie systems described by (9-1) have a simple c o n t r o l structure of the form (9-12) where the free design parameter ui enters l i n e a r l y . A n a l y t i c a l r e l a t i o n s allowing the p r a c t i c a l implementation of the adaptive parameter to w i l l be given i n the next chapter. 70 (TIME) SUBOPTIMUM DETERMINISTIC CONTROL 71 .10. PROPOSED ALGORITHM 10.1 Derivation of the proposed co n t r o l This Section w i l l be concerned with the d e r i v a t i o n of the con-t r o l equations which w i l l be used to deduce the proposed algorithm. A l l equations are presented again for convenience. The nonlinear system i s modelled by x^+1 = F(x k,u k,k) + w k 0 < k < N-l (10-1) and observed by z k = H(x k,k) + v k (10-2) wh ere x k i s an n-dimensional state vector, u k i s an m-dimensional control vector, z, i s an r-dimensional measurement vector. The nonlinear functions k F(.,.,.) and H(.,.) are assumed to be continuous and d i f f e r e n t i a b l e i n a l l t h e i r arguments. A l l necessary matrix inverses w i l l be assumed to e x i s t . The vector noise disturbances v k and wk are assumed to be non correlated with each other and with x, and have zero mean. k E ( v k ) = 0 E(w k) = 0 k (10-3) An estimate of the i n i t i a l state of (10-1) i s assumed to be av a i l a b l e x = E(x ) (10-4) o o The o r i g i n a l cost function i s given by N J = E J.(x.,u.,i) (10-5) o 1 1 1 The problem consists i n fi n d i n g a control sequence u^ ^ d r i v i n g the system (10-1) i n such a way that (10-5) i s minimized. The st o c h a s t i c c o n t r o l problem i s defined by (10-1) - (10-5) cannot be solved with conventional methods without making some guess about the missing s t a t i s t i c a l information. The proposed method consists i n defining a f i l t e r structure v r F ( v v k ) + w ; [ v f H ( l r k + 1 ) 1 ( 1 ° - 6 ) where = F U ^ u ^ k ) (10-7) and where g k + 1 = § k + i ( \ > z k + i > k ) i s a s c a l a r gain. The c o n t r o l i s defined by • \ = \ ( V V k ) = S k \ + \ \ h k ( 1 ° - 8 ) where S, , T, and hf are defined by (9-7), (9-11). k k k The penalty cost function to be minimized i s chosen to be (7-48) instead of the expected value of (10-5). Thus \ = J c k ( V V k ) + c k w k c k ( 1 0 " 9 ) where C k = x ^ - osgnU-g^H; y k + 1 (10-10) where _ y k + 1 = z k + 1 - H C x ^ . k + l ) (10-11) and where Wk i s a p o s i t i v e weighting matrix. Furthermore \ u = \ + i + ¥ ; \ + i ( i o - i 2 ) The Jacobian matrix i s evaluated at x^-^ • I t has been shown (Chapter 8) that the f i l t e r gain must s a t i s f y a condition (8-16), which i s of the form W g k < g m a x ( 1 ° - 1 3 ) where e . and g are defined e i t h e r by (8-18) or (8-24) r e s p e c t i v e l y . °min max A further constraint (7-49) i s imposed < 6 g k ) 2 = < gk+l " g k ) 2 = " g g k ( 1 ° - 1 4 ) where 0 < 6Z < 1 g-73 S i m i l a r l y , a step s i z e constraint i s imposed on co^ .: (6u>k)2 = ( o ) k + 1 - u) k) 2 = 6£ 2 a)2 (10-15) where 0 1 6 ^ 1 1 The problem consists i n minimizing (10-9) subject to the constraints (10-1) - (10-15). Expanding (10-9) up to the second order terms y i e l d s V g k - i + 6 g k - i ' V i + 6 u k - i ] V l 8(jJ 'k-1 Jk-1 6to k-1 'k-1 3 k-1 (10-16) A steepest descent method can be used to obtain <5gk_^ and 6^-1' An improvement i n (10-9) r e s u l t s i f 2*1 9g 6 g k - l ±° 'k-1 'k-1 (10-17) 3co 6w S o k-1 >k-l k-1 Relations (10-17) and (10-18) are s a t i s f i e d i f 3ij; fig-'k-1 " 3g 6 wk-l = ~6Jl. 3*1 3co gk=l <Vl (10-18) (10-19) 'k-1 'k-1 74 Equation (10-19) must be consistent with (10-14) and (10-15). Thus sub-s t i t u t i n g (10-19) in t o (10-14) and (10-15) y i e l d s 64, = 64 •1 8 2 8 k-1 i — ; 64. = 64 2 2 co 2 w k-1 k 3g g k - l 3co g k - l " k - l - V i J (10-20) Subs t i t u t i n g (10-20) into (10-19) y i e l d s 'k-1 -64g g k _ x sgn 9^k 'k-1 j k-1 6co k-1 -64 'co sen to k-1 b 3 co g k - l k-1 (10-21) (10-22) The p a r t i a l d e r ivatives i n (10-21) and (10-22) are obtained from (10-9) and (10-6). ^ k 9g = K [ j c k + j c k \ x - 2 V h ; s g ^ i - g k - i ) y k i ( i ° - 2 3 ) x u 'k-1 k-1 and 3^ 3co k ck (10-24) 'k-1 u 75 S t r i c t l y speaking, (10-23) i s v a l i d i f g ^ ^ ^ -^'J otherwise there should be another term with 6(l-g ) where 6(") i s the d e l t a function. However, rC _1_ since gk_-^ i s a random v a r i a b l e , 6(l-g^_^) = 0 with p r o b a b i l i t y one. Thus th i s extra term may be ignored. Using (10-14) and (10-15) i n conjunction with (10-21) and (10-22), the expressions f o r the f i l t e r gain and co n t r o l parameter become 3* k 8 k 8 k - l l-6£ g sgn ^  9g 3 k - l o k-1 (10-25) V Vi 1-62, sgn ( 3* 3u 'k-1 Vi (10-26) The p a r t i a l d e r i v a t i v e s i n (10-25) and (10-26) are given by (10-23) and (10-24) r e s p e c t i v e l y . In the p a r t i c u l a r but important case where the cost i s quadratic or can be approximated by a quadratic form, a simpler expression can be found for (10-23). Using the approximation <Jck + J c k V Vk 01. Tc-1 k-1 (10-27) and s e t t i n g 2 » k = L , gives the following form f or Eq. (10-23) 3* 3g k-1 = ~ " g n U - g ^ H y k ] 3 k - l V i (10-28) 76 Note that (10-27) i s v a l i d i f the system can be modelled by (9-1) where the n o n - l i n e a r i t y f ( x ) i s "small" when compared with the l i n e a r part. k Therefore, the proposed algorithm to be executed at each stage of the process i s as follows: 1) From the measurement z^, obtain y^ = z^ - H(x k,k) (Eq. (7-7) ) 2) Evaluate = F ( x k > u k , k ) + S ^ H / y ^ 3) Solve (10-28) then (10-25) f o r g k and (10-14) for 6g 4) Check the condition (8-16) and replace g by the appropriate upper or lower bound i f i t i s not s a t i s f i e d , 5) The estimate i s given by x = x, + 6g ,H'v .The evaluation of the k k k 1 x' k parameter « k updating the co n t r o l i s performed as follows: a) evaluate u k = u k ( x k > ^ - i ^ (eq. (9-12)) b) evaluate (10-24) then (10-26) which y i e l d s wk c) The co n t r o l i s given by u^ ^ = uj c( x> t uj c) Figure. 10 i l l u s t r a t e s a d e t a i l e d flow chart f o r the algorithm. 10.2. Example I The f i r s t example i s concerned with a second order non-linear system. The system i s described by .01 .001 ^+1 1. .101 .02525 .95 *k + .01 u, - .0002 k x x k k + w k (10-29) where \ = \ = w, k+1 vk+1 U, k+1 F(Xk+lUk+l/k+1) DELAY (ONE PERIOD) X k+1 k 6 z 1 D E M V Uk fOWE PERIOD) 1 /V+7 D E M V ('OWE PERIOD) F(XkU k) x, k+1 *k+l + ^ k " x \ + 1 gmin < g 'k+1 (gmax ^ - ^ K E S A/0 V k+1 DELAY (ONE PERIOD) ( - g max 9, =\ or k+1 j gmin -°(sgn(l-gk) hnk+1Jc(k+1) SI ~SI u DELAY (ONE PERIOD) OJ,, <fco. 17 s<Xk+l!»-')Xk+1+% T ( X k + 1 ^ ) h n k + , 61 sgn[(-)} co,. NOTATION • —> VECTOR . SCALAR BLOCK DIAGRAM OF CLOSED LOOP STOCHASTIC CONTROLLER FOR NONLINEAR SYSTEMS FIG-iO k+1 *6 'k+1 H'x Yk + 1 The system i s observed by z, = x. + v, where v. = k k k k (10-30) and the disturbances v, and w, are assumed to have zero mean k k E(v ) = E(w ) = 0 J-Y. The o r i g i n a l cost function i s given by N J = 1/2 E rx/ o k 0. X k + U k U k ] ( 1 ° _ 3 1 ) 0. • .1 The system operates during the time i n t e r v a l 0 < k < 100. The i n i t i a l condition f o r (10-29) i s - a random v a r i a b l e which i s assumed to be i n the given range c 15. (10-32) 5. < x < o -5. 15. The best estimation of X q i s assumed to be given by x = E(x ) o o 10. 10. (10-33) The system has the form described by (9-1) and (9-2). I t has been shown that the s t a b i l i t y condition i s then unaffected by the suboptimal structure of the system, a) S t a b i l i t y condition Direct, a p p l i c a t i o n of (8-8) y i e l d s u s i n g (10-29) y i e l d s \ + 1 = a-g k + 1) (A + f J l.-.0002x 9 ..101-.0006x? zk+ i V i .02525-.0002x .95 k+1 (10-34) (10-35) 79 The matrix norm defined by (8-9) w i l l be used here, Assume a given range -20. -20. * X k K~ 20. 20. (10-36) Dire c t a p p l i c a t i o n of (8-17), (8-20), (8-21), (8-23), and (8-24) y i e l d s bounds on the s c a l a r gain ,12511 St < 1.874 -^k (10-37) The i n i t i a l condition f o r the gain i s taken to be g = 1 with no modification to Eq. (10-28). Thus an err o r may a r i s e during the f i r s t step. However, due to (10-14), the e f f e c t of t h i s e r r o r on the f i n a l cost i s n e g l i g i b l e . b) Suboptimum con t r o l A suboptimum c o n t r o l of the form described by Eq. (9-13) i s ' used. The relevant equations have been derived i n Chapter 9, Eqs. (9-7) and (9-12). As explained i n Chapter 9, the free design parameter co i s up-dated by means of (10-24) and (10-26). The i n i t i a l condition i s taken to be ID = 1, to avoid s o l v i n g any TPBVP associated with an optimum co In (9-13). c) Choice of step s i z e constraints The proposed method has introduced three s c a l a r constants i n equations (10-25), (10-26), and (10-28). S i m i l a r considerations as explained i n Chapter 5 which resulted i n (10-24) can be applied. Thus 6ip = 6£ = a = i & co (10-38) The choice (10-38) i s c e r t a i n l y not optimum, but i t i s simple and consistent with the previous examples. 80 d) Comparison with conventional methods The proposed method was compared with a method using an extended Kalman-Bucy f i l t e r cascaded by the suboptimum control s tructure as described i n b) above with co, = co = 1 -/k. This f i l t e r i s dependent upon the estimate k o x at each time, which has been computed by l i n e a r i z i n g the n o n l i n e a r i t i e s k, about the a c t u a l l y estimated t r a j e c t o r y [5]. The relevant equations for the f i l t e r are • Vk + Wf(\) + rk +i c ; 1 tzk+r(Vk + BkVf(V } ] (10-39) k+l -1 _ 1 -1 with T = -X +: Cv k+l X = (\ + f x ) T k (A^+f ) + Cw (10-40) 1 +1 k+l 1 ^ . . , = r,.,, cv and where Cv , Cw are the covariances of the noise sequences v^, w^  r e s p e c t i v e l y and T i s the covariance of the i n i t i a l state vector x . o o e) D e s c r i p t i o n of the noise conditions. The noise sequences v^ and w^  are taken to be normally d i s t r i b u t e d with covariances C V = 1 (10-41) Cw = I where I i s a 2 x 2 unit matrix. The i n i t i a l condition of the state of the system (10-29) i s taken to be normally d i s t r i b u t e d with a mean value given by (10-33) and covariance r =Cov( x ) = I (10-42) o o Note that (10-42) i s consistent, with the l i m i t a t i o n introduced by (10-32) and (10-36) since the p r o b a b i l i t y that X q exceeds i t s i n t e r v a l ( i f i t were t r u l y a gaussian random variable) i s 10 ^. xk+l= 1. .101 Xk + '•01 •oof Uk -.0002 X1 X?  ]k 2k .02525 .95 1. .07 xl L >k J E(Vk) = E(Wk) = 0 EfVjVj') = E(Wj Wj) = £ij ) GAUSSIAN NOISE Zk-\ + \ 100 J=iL(Wk+u'kW k=l 2.7S PROPOSED GAIN 10.01 PLANT X, 2000 1333 SSS STANDARD = WOS PROPOSED = 178S TIME — i 1 • 1 1 i i i , i i 0-0 100 20.0 30.0 40.0 S0.0 60.0 700 SO.O 90.0 100. COMPARISON OF THE EFFECTIVENESS OF THE PROPOSED AND STANDARD CONTROL fig-11 82 f) Results The system was run under the above conditions. Two series of 210.0 runs were executed. During the f i r s t one, the system was controlled using the standard Eqs. (10-39), (10-40) with the exact conditions (10-41) and (10-42). The average cost was E(J) = 1636. The same noise sequences used i n the above runs were also used i n the second set of runs i n which the system was controlled by the proposed methodjWhich does not require the st a t -i s t i c a l data given by (10-41) and (10-42). The average cost i s E(J) = 1331. The proposed method yields a 20% smaller cost. Figure 11 i l l u s -trates a t y p i c a l set of traj e c t o r i e s which can be obtained by driving the Tsystern using either control, under the same noise and i n i t i a l conditions. 10.:3 Example I I In order to obtain an a n a l y t i c a l solution of (7-30)^ a class of measurements devices having the property H H' = I (10-43) x x i s considered. I t has been shown that i n this case, a p a r t i c u l a r l y simple solution of (7-30) can be obtained, and the f i l t e r structure (10-6) r e s u l t s . The interest i n measurements devices having the property (10-43) i s due to the fact that a l l components of the state vector x^ may not be ea s i l y a v a i l -able. This may occur for several reasons, such as d i f f i c u l t i e s i n setting up proper measurement instrumentation. The example to be considered w i l l compare the proposed and standard method i n the two cases of complete and incomplete measurements of a l l the components of x^. The system i s modelled by the following l i n e a r f i f t h order equation: 83 1.01 .0229 -.49 .31 .01 0. .0 1. .0 .0 .0 0. = .0 -.0024 .61 .31 .0 \ + 0. \ + W k ( 1 ° .0 -.0024 .0 .92 .0 1. .1 . .1 .0 .0 .0 .0 This system i s disturbed by the zero mean noise w . k The i n i t i a l condition i s assumed to be within the range < x < o 1.5 1.5 1.5 1.5 1.5 and E(x )=x = o o (10-45) The o r i g i n a l cost function i s defined by 100 J = 1/2 E ( x k x k + u ku k) (10-46) o As stated by (10-44) - (10-46) the control problem looks very s i m i l a r to Example 2 given i n Section 5.3. The d i f f e r e n c e , however, i s i n the measure-ment device. Two b a s i c simulations are done: a) Assume that the measurement device has the form + v. (10-47) In this case, the s t o c h a s t i c c o n t r o l problem defined by (10-44) - (10-47) i s exactly of the same type as developed i n Example 2 given i n Section 5.3. Comparison with the optimum conventional method i s done by following exactly the same experimental conditions. The relevant parameters a r e : Cv = .1 Cw = .1 Sir, = a = .1 (10-48) 84 The noise disturbances v k a n d and the i n i t i a l state vector of (10-44) are normally d i s t r i b u t e d with the covariances given by (10-48). Direct a p p l i c a t i o n of (4-11) y i e l d s the s t a b i l i t y condition .0108 < g k < 1.989 (10-49) (10-50) Under those conditions the average costs for the proposed and standard method are : Standard method E(J) = 124.8 Proposed method E(J) = 129.0 In this case the proposed method y i e l d s an average cost which i s 4% greater. The e f f e c t of the disturbances on the performance of the system can be e v a l -uated by computing the optimum cost for the d e t e r m i n i s t i c case where u = 0 Vk. The minimum cost i s found to be J =61.29. This value compared with the m average optimum cost of E(J) = 124.8 i n d i c a t e s that the cost increase due to the noise i s about 100%. In other words the system i s operating under heavy noise conditions. b.) Assume that the measurement device i s modelled by 1. 0. 0. 0. 0. z k = 0. 1. 0. 0. 0. x k + v k (10 0. 0. 1. 0. 0. 0. 0. 0. 1. 0. that the condition H H' = I i s s a t i s f i e d where I i s here a 4*4 X X unit matrix. Direct a p p l i c a t i o n of (8-8) y i e l d s \ f l = ( I " g k + l H x V A k (10-52) The matrix 1 S convergent i f .0105 < g k + 1 < 1.994 Vk (10-53) The c o n t r o l l e r i s e n t i r e l y determined by (10-6), (10-25), (10-28) and (3-47). The simulation i s executed with the same noise sequences as before (see (10-48) ). A p p l i c a t i o n of the standard method requires recomputation of the Kalman gain to take account of (10-51). The r e s u l t s are as follows: Standard method E(J) = 125.0 (10-54) Proposed method E(J) = 129.2 Again the proposed method y i e l d s a cost which i s 4% greater than the standard method. Comparing (10-54) with (10-50) indicates an increase i n the average cost. This i s due to the uncertainty introduced by the reduction i n the measurement data obtained from (10-51) . 10.4 Example III In a previous p u b l i c a t i o n [13], a s o l u t i o n of the s t o c h a s t i c cpntrol problem for nonlinear systems was presented. In this s e c t i o n a com-parison i s made between t h i s previous work and the proposed method. R e c a l l , for convenience, that the problem i s to control a system described by (6-1) and (6-4). The following i s a d e s c r i p t i o n of the con t r o l p o l i c y discussed i n [13]. The d i f f e r e n t steps of the approach are i l l u s t r a t e d i n F i g . 12. Figure 12a i l l u s t r a t e s the standard method of s o l u t i o n i n which an extended Kalman-Bucy f i l t e r i s cascaded with a d e t e r m i n i s t i c c o n t r o l generator. The equations for this f i l t e r can be deduced by s o l v i n g a nonlinear f i l t e r i n g problem [16] as indic a t e d i n F i g . 13, Eqs. 1, 2. The nonlinear f i l t e r i n g problem i s solved by using a l l the ava i l a b l e measurement data from time zero to the actual time "k". I t i s 66 \N0ISE |A/0/S£ u, CONTROL GENERATOR ESTIMATOR (A)-STANDARD METHOD OF SOLUTION NOISE \NOfSE XMaf<XkfUk,k> + Wk H(Xk k)+Vk CONTROLLER C£>-A ESTIMATOR (wifwz) =3-(BhPROPOSED METHOD OF SOLUTION NOISE Xk + 1=AkXk+W V\ NOISE Z, H Hxk + vk u k + Uk LINEAR PART OF CONTROL Xk(wi,wz) SU k_ LINEAR ESTIMATOR « g < i COORDINATOR (C) - PARTICULAR APPLICATION OF METHOD FIG-12 87 1 X NON LINEAR FILTERING CONTROL PROBLEM OPTIMAL TRAJECTORY EQUALLY RELIABLE FILTERING SOLUTIONS TIME -TRAJECTORIES ASSOCIATED WITH DIFFERENT W1 AND W2 k k FILTERING PROBLEM ( from tim e I—~k ) (I) Xj+1 --FfXj.Uj.jl + Wj (2) J -- I A A (3) YIEL DS X =Xfai ,wzk ) CONTROL PROBLEM (from time k—-N) Xj+1=F(XrUrJ) k N JH = I J, (XifU.-) k J J J A X, = X, \ * xk(wh<y^k) (initial condition) (4) (5) (6) J = J fri-k ,wzk ) (7) WHOSE OPTIMIZATION UNDER CONSTRAINTS YIELDS OPTIMUM W1 * AND wf k k AND THEREFORE X, AND U. k k PROPOSED METHOD OF SOLUTION F-ie-13 88 i n t e r e s t i n g to note that t h i s nonlinear f i l t e r i n g y i e l d s an estimate x mini-mizing a weighted mean square error c r i t e r i o n ( F i g . 13, Eq. 2). The time dependent weighting matrices Wl ^ and W2^ .+^  are p o s i t i v e d e f i n i t e but otherwise a r b i t r a r y . These are usually chosen to be the inverse noise co-variances when they are known. A p o s s i b l e l i n e a r i z e d s o l u t i o n of t h i s non-l i n e a r f i l t e r i n g problem i s given by (10-39) and (10-40) where the weights Wl , and W2. , have been replaced by the inverse noise covariances Cv and j + l 3+1 Cw. This f i l t e r has the structure (7-5) i n which the gain G^+^ i s replaced by the extended Kalman gain K j ^ (Cv, Cw) . The estimate x ^ + 1 obtained from (10-39) i s r e a d i l y seen to depend on Cv and Cw. \ + l = \ + l ( C V > C W > ( 1 0 _ 5 5 ) I f the weights Wl. , .. and W2 , ,. are considered as free design para-3+1 3+1 meters, (10-55) becomes = \ + l < W 1 k + l ' W 2 k + 1 } ( 1 ° - 5 6 ) For the following discussion, i t i s convenient to introduce the notion of a set of equally r e l i a b l e estimates. The reason for t h i s terminology i s to emphasize the f a c t that given no further information a l l elements of the set are considered to be equally "good". A functional w i l l be introduced to induce an ordering of the elements of the set. This i n turn orders the weighting matrices. For example, the f u n c t i o n a l could be the trace of the covariance e r r o r . Minimum covariance error r e s u l t s when the weighting matrices are chosen to be equal to the inverse of the noise covariances . I f the f i l t e r i n g problem were not associated with a s t o c h a s t i c control problem, then an estimate having a minimum covariance error would possess desirable properties. 89 Since the f i l t e r i n g problem i s associated with a control problem, i t may be d e s i r a b l e to leave the choice of the best estimate to be deter-mined by considerations other than minimum covariance e r r o r . In other words, the f u n c t i o n a l ordering the set w i l l not be determined by s t a t i s t i c a l considerations alone. Furthermore, an ordering based on a minimum covariance error estimate implies a complete s t a t i s t i c a l d e s c r i p t i o n of the noise disturbances. I t i s usually d i f f i c u l t i f not impossible to determine such s t a t i s t i c a l data on-line', and often an a r b i t r a r y choice i s made. I t can therefore be concluded that from a p r a c t i c a l point of view, i t i s desirable to f i n d a simple way of ordering the set t x ^ + ^ J according to a c r i t e r i o n which does not require extensive s t a t i s t i c a l information. A c r i t e r i o n having these properties i s i l l u s t r a t e d i n F i g . 13. I t i s seen that a f t e r an element of an equally r e l i a b l e set {x^} i s chosen, a deter-m i n i s t i c c o n t r o l problem has to be solved from time "k" to "N". The cost index f o r t h i s problem i s chosen as the f u n c t i o n a l which induces the desired ordering. The i n i t i a l condition for (4) i s given by (10-56). Since the c o n t r o l structure u. = u.(x.,i) i s known, the minimiz-J J 3 a t i o n of (5) i s accomplished by the s e l e c t i o n of the weights Wl and W2 . Consequently, not only i s the optimal c o n t r o l for the 'remaining time-to-go obtained, but also the optimum i n i t i a l state ( i . e . what i s considered to be the "best" estimate) and the optimum weighting parameters Wl and W2 . Thus the f u n c t i o n a l ordering the set {x^} i s given by (5) which i s analogous to (7-48) i n which W, = 0. I t i s .noted that s e t t i n g W, = 0 i n (7-48) i s k k not s a t i s f a c t o r y since no penalty i s associated with the f i l t e r i n g e r r o r . Therefore the above s o l u t i o n of the control problem i l l u s t r a t e d i n F i g . 13 i s l i m i t e d to very stable systems. Furthermore, [13] did not include any bounds on the f i l t e r gain, which can aggravate the d i f f i c u l t i e s created by the choice W = 0. 90 It can be suggested that reintroducing a coupling term by taking W 4 0 and computing proper s t a b i l i t y bounds may r e s u l t i n a control algorithm s u i t a b l e f o r systems which are not inherently highly s t a b l e . A numerical example i s presented i n [13]. I t concerns a p a r t i c u l a r form of a class of nonlinear problems given by (9-1). The control structure was chosen to be given by (9-12). In F i g . 14, a l i n e a r i z e d estimator determines a f i r s t l e v e l estimate x^ which i s used for the f i r s t l e v e l l i n e a r c o n t r o l u^. A second l e v e l coordinator solves a d e t e r m i n i s t i c c o n t r o l problem f o r the parameters to^, Wl^, W2^ which minimize f o r the remaining time-to-go. The s o l u t i o n r e s u l t s i n a cont r o l increment Su^ which i s added to the f i r s t l e v e l con-t r o l and a revised estimate x, which i s used to update the estimator. To k s i m p l i f y the computations, the search f o r to , Wl , W2 i s done by one K. K. K. steepest descent move using previous values as i n i t i a l conditions. This i s equivalent to (10-14} and (10-15)-. To i l l u s t r a t e the method the following example was simulated on IBM 360/67 d i g i t a l computer. x k + 1 = 1 . 0 0 3 x k + .001^ - .05x^ + w k (2-157) z k « x k + v k (2-158) 999 J = 1/2 x21Q0Q + 1/2 £ ( x 2 + u£) ; x± = E(x ) = 1 (2-159) 1 k The l i n e a r part of (10-57) was chosen s l i g h t l y unstable to make i t s e n s i t i v e to noises disturbances. The n o n l i n e a r i t y has a s t a b i l i z i n g i n f l u e n c e . The noises sequences w and v are gaussian. F i g . 15 i l l u s t r a t e s t y p i c a l r e s u l t s f o r si n g l e runs. Curves 1 and 2 give the actual cost and a v a i l a b l e cost r e s p e c t i v e l y i f a standard stoch a s t i c optimal c o n t r o l l e r i s used, curves 3 and 4 are the re s u l t s obtained by the proposed adaptive c o n t r o l l e r while curve 5 gives the i d e a l Xk+1sAkXk+Bk°k+ f(xk)+wk zk=Hkxk+vk j4ffx'kFkxk+uk/Gkuk) E(VK) = E(WK) = 0 (2) (3) (4) \ NOISE \NOISE ZK Xk + 1 = A k X k + Bk uk+ f(xk)+wk xk Hkxk+Vk u, SU, SUBOPTIMUM CONTROL *k CO, LINEARIZED EST IMA TOR w>>£ COORDINATOR Illll8' PARTICULAR APPLICATION OF PROPOSED METHOD Fie-14 92 75^ 1.0^ u. Q: Ui 50. 25-X k + J :1.003Xk +.00Wk -.05 Xk +Wk Z K = X k + V k 999 2 .2. j=lx.:^ + i j i j x k + u k ) 2 1000 2 k=1 xrxri STANDARD METHOD Q)=J(X,U) ©FILTERED X Q),J(X,U) ©PLANT X PROPOSED METHOD Q) = J(X, U) (6) FILTERED X Q)=J(X,U) (5) REFERENCE (noise sequences known) 1000 TIME COMPARISON OF. EFFECTIVENESS BETWEEN STANDARD AND PROPOSED METHODS FOR THE ESTIMATION AND CONTROL PROBLEM FlG-15 93 minimum cost, where the noise sequence i s considered known. I t i s seen that the objective of having curves 3 and 5 "c l o s e " i s achieved and that the proposed method i s s u b s t a n t i a l l y better than the standard method. Figure 15 i l l u s t r a t e s the v a r i a t i o n i n the e f f e c t i v e Wl^ and W2^ and the f i l t e r gain for the proposed method. In the standard method^Wl^ and W2^ are con-stant. Note that the adaptive gain tends to be small except when corrections are required. The gain f o r the standard method approaches a constant. The objective of the adaptive c o n t r o l l e r i s to keep the cost of a s i n g l e run small and not require extensive s t a t i s t i c a l data. However i t i s of i n t e r e s t to compare the state estimates f o r two approaches. In F i g . 15 these are given by curves 6 and 7 while curve 8 gives the actual s t a t e . Note that the non-adaptive f i l t e r tends to follow the actual state which i s perturbed by noise. The adaptive f i l t e r tends to follow the average behaviour of x . The f i l t e r i n g a c tion i s therefore improved. 10.5 Conclusion This chapter was concerned with the c o n t r o l of nonlinear stoch-a s t i c systems. The proposed method i s e s s e n t i a l l y an extension of the r e s u l t s given i n Chapter 5. The main feature remains the e l i m i n a t i o n of the expected value of the cost index as a design c r i t e r i o n and the d e f i n i t i o n of a new cost index based on an e x p l i c i t formulation of a trade-off between estimation and c o n t r o l . A n a l y t i c a l solutions can be given for some important classes of systems and measurement devices. The p r a c t i c a l usefulness of the proposed c o n t r o l l e r comes from the fact that no a p r i o r i assumptions are made about the d e t a i l s of the noise s t a t i s t i c s and that storage or computations of the usual f i l t e r gain matrix sequence i s replaced by the computations of a s c a l a r gain. A p r a c t i c a l form of suboptimum structure f o r the c o n t r o l i s developed. Most of the computations dealing with the determination of step s i z e constraints and s t a b i l i t y bounds can be done o f f - l i n e . 94 11. CONCLUSION The thesis began with the study of a p a r t i c u l a r class of stochastic co n t r o l problems represented by a l i n e a r system with additive gaussian noise and quadratic cost. From the study of the conventional method of s o l u t i o n , an i n t e r p r e t a t i o n suggesting a completely d i f f e r e n t approach has been given. This approach i s e s s e n t i a l l y based on e l i m i n a t i n g the expected value of the cost as a design objective. The new adopted c r i t e r i o n i s based on formulating an e f f e c t i v e trade-off between estimation and control r e q u i r i n g only a l i m i t e d amount of s t a t i s t i c a l data. This objective has been achieved by r e d e f i n i n g the f i l t e r i n g problem i n such a way that the f i l t e r equations depend upon a s c a l a r parameter which i s used to adaptively update a f i l t e r gain. The estimate obtained i s used to generate the c o n t r o l . I t i s shown that subject to s t a b i l i t y c onstraint co n s i s t i n g e s s e n t i a l l y i n p u t t i n g bounds to the v a r i a t i o n of the f i l t e r gain, the proposed feedback c o n t r o l l e r operates s a t i s f a c t o r i l y . A very important feature of the proposed method i s due to the fac t that no a p r i o r i assumptions are made about the d e t a i l s of the noise s t a t i s t i c s . The s t a t i s t i c a l data required does not go beyond the expected values of the disturbances and expected value of the i n i t i a l condition of the system. Furthermore a s c a l a r adaptive gain replaces a matrix sequence. Thus, - i t i s more p r a c t i c a l to apply the proposed method than the methods known at present. The above r e s u l t s have been extended to more general nonlinear s t o c h a s t i c systems. S i m i l a r properties for the control equations can be derived. For a class of nonlinear systems and nonlinear measurement devices, i t i s shown that a p a r t i c u l a r l y simple form for t h e - c o n t r o l l e r can be r e a l i z e d . The i n t r o d u c t i o n of a suboptimal s t r u c t u r e for the c o n t r o l l e r can be done 95 e f f i c i e n t l y , leaving a great deal of freedom i n the design of the complete feedback loop. The computer requirements are much less s tringent than those of presently known techniques. Simulations of second order and f i f t h order systems allowed a comparison of the average cost obtained by the proposed and conventional method for gaussian and non gaussian noises. Results i n d i c a t e d that f o r l i n e a r systems, the proposed method y i e l d e d an i n s i g n i f -i c a n t to moderate increase i n cost. In the nonlinear case, the proposed method y i e l d e d an i n s i g n i f i c a n t to sensibl e decrease i n cost. These extensive simulation studies i n d i c a t e that the obje c t i v e of the thesis have been achieved for a wide v a r i e t y of l i n e a r and nonlinear systems. APPENDIX I Consider a d i s c r e t e l i n e a r system *k+i = Vk + Vk W N-1 ( I - l ) X q = given The problem i s to f i n d a con t r o l sequence u^ d r i v i n g the system from x^ i n such a way that the cost function J = 1/2 E ( x k R ^ + u^Q u^) j <?w - ° (1-2) k=0 i s minimized. D i r e c t a p p l i c a t i o n of the maximum p r i n c i p l e [5] y i e l d s Pk = + \ \ ( I ~ 3 ) 0 = Vk+i + Vk (I~4) where p^ i s the costate v a r i a b l e associated with x^. The t r a n s v e r s a l i t y condition i s P N = Vw ( I _ 5 ) Assuming the s o l u t i o n pk = Vk (I"6) Then d i r e c t s u b s t i t u t i o n of (1-6) into ( I - l ) , (1-3), (1-4) y i e l d s \ = (1-7) with sk = A 1 W " 1 f\ - V (I~8) where the matrix L k i s s o l u t i o n of the R i c c a t i equation % + l with the condition Lk - v + 1 [ I + W B k L k + i ] _ 1 \ + \ ( i - 9 ) LN = V ( I- 1 0 ) REFERENCES 1. D. Sworder, Optimal Adaptive Control, Mathematics i n sciences and engineering, V o l . 25, 1966, Academic Press. 2. J . S. Meditch, Stochastic Optimal Linear Estimation and Control, 1969, 3. M. Aoki, Optimization of Stochastic Systems, Mathematics i n sciences and engineering, Vol. 32, 1967. 4. A. Bryson and Y. Ho, Applied Optimal Control, optimization, estimation and c o n t r o l , 1969, B l a i s d e l l Publ. comp. 5. A. P. Sage, Optimum Systems Control, Series i n networks, 1968, Prentice H a l l . 6. R. K. Mehra, "On the i d e n t i f i c a t i o n of variances and adaptive Kalman f i l t e r i n g " , IEEE Trans. Automatic Control, Vol. AC-15, No. 2, A p r i l 1970. 7. C. S. Sims and J . L. Melsa, " S p e c i f i c optimal estimation", IEEE Trans. Automatic Control, Vol. AC-14, No. 2, A p r i l 1969. 8. E. Isaacson and H. B. K e l l e r , "Analysis of Numerical Methods, 1966, John Wiley and Sons, New York. 9. G. W. Johnson, "A d e t e r m i n i s t i c theory of estimation and c o n t r o l " , IEEE Trans. Automatic Control, Vol. AC-14, August 1969. 10. M. Athans and P. Falb, Optimal Control, McGraw-Hill, 1966. 11. G. S. Sebastien, Decision Making Processes i n 'Pattern Recognition, McMillan Company, New York, 1962. 12. L. K. Kirschmayer, Economic Control of Interconnected Systems, New York, Wiley, 1959. 13. S. Gracovetsky and L. V. Bohn, "An adaptive method for combined s t o c h a s t i c estimation and optimal c o n t r o l " , Proceedings of the Third Hawaii I n t e r -natio n a l Conference on System Sciences, January 1970. 14. C. S t r i e b e l , " S u f f i c i e n t s t a t i s t i c s i n the optimum con t r o l of s t o c h a s t i c systems", Journal of Mathematical Analysis and A p p l i c a t i o n s , Vol. 12, pp. 576-592, 1965. 98 15. R. E. Kalman, "Contribution to the theory of optimal c o n t r o l , Bol. Soc. Math. Mex., Vol. 5, pp. 102-119, 1960. 16. R. E. Mortensen, "Maximum l i k e l i h o o d recursive nonlinear f i l t e r i n g " , Journal of Optimization theory and a p p l i c a t i o n , Vol. 2, No. 6, 1968. 17. IEEE Group on Automatic c o n t r o l , 1968 case studies i n System Control, The University of Michigan, June 1968. ' 18. R. S. Bucy and P. D. Joseph, F i l t e r i n g for s t o c h a s t i c processes with a p p l i c a t i o n to guidance, Interscience tracts i n pure and applied math-ematics, John Wiley, New York, 1968. 

Cite

Citation Scheme:

        

Citations by CSL (citeproc-js)

Usage Statistics

Share

Embed

Customize your widget with the following options, then copy and paste the code below into the HTML of your page to embed this item in your website.
                        
                            <div id="ubcOpenCollectionsWidgetDisplay">
                            <script id="ubcOpenCollectionsWidget"
                            src="{[{embed.src}]}"
                            data-item="{[{embed.item}]}"
                            data-collection="{[{embed.collection}]}"
                            data-metadata="{[{embed.showMetadata}]}"
                            data-width="{[{embed.width}]}"
                            async >
                            </script>
                            </div>
                        
                    
IIIF logo Our image viewer uses the IIIF 2.0 standard. To load this item in other compatible viewers, use this url:
http://iiif.library.ubc.ca/presentation/dsp.831.1-0302211/manifest

Comment

Related Items