APPLICATION OF AUTOREGRESSIVE FILTERING TECHNIQUES TO FLOOD FLOW PREDICTION

by

PETER G. CAHOON
B.Sc., Dalhousie University, 1970

A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF APPLIED SCIENCE in THE FACULTY OF GRADUATE STUDIES, Department of Civil Engineering

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
March, 1978
© Peter G. Cahoon, 1978

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the Head of my Department or by his representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of Civil Engineering
The University of British Columbia
2075 Wesbrook Place
Vancouver, Canada V6T 1W5

Date: April 5, 1978

ABSTRACT

This thesis describes a study of the application of multichannel deconvolution, an autoregressive filtering technique, to the problems of flood flow prediction. This application is divided into four component segments:

1. The description of the behaviour of a multiple-input, multiple-output basin using multichannel autoregressive techniques with multiple lags of finite length. These descriptions fall into two categories:
   a) the Wiener autoregressive technique;
   b) minimum entropy autoregression.

2. The restatement of the input/output problem as a time-varying state-space description with a feedback mechanism for implementation of information having a unit time delay.

3.
The analysis and characterization of the state-space and autoregressive methods using standard spectral analysis techniques and statistical confidence limits.

4. The linking of the state-space/autoregression characterizations for snowpack depletion and rainfall/runoff into a finite state machine algorithm coalescing the two processes so that they may be linked to satellite information.

Several case studies were used in which the multiple precipitation records and multiple flow records were characterized. These steady-state characteristics were then updated using the Kalman state-space description to provide an "online" information update.

The snowpack depletion problem was treated as a multiple-input (rainfall, temperature, humidity), single-output (snowmelt) phenomenon and characterized by a single multichannel autoregression. A second class of characterizations was employed to cope with the "spike" arrivals caused by rapid snowmelt flowing into a single output. This minimum entropy technique is developed for "flash flood" prediction. Finally, a finite state machine algorithm is developed to link the snowpack depletion to the rainfall/runoff problem in such a way that it can be readily linked to satellite-produced data streams.

For those unfamiliar with the complex and obtuse language of deconvolution and control theory, the role of this thesis can be described by analogy: you have arrived in the middle of a party where the participants are all quite intoxicated. They are huddled together in small knots, talking excitedly, but without much clarity, about decon operators, M.E.D. processes, frames and finite state machines. My role is to catch you up on the conversation and help you find your way to the bar.
TABLE OF CONTENTS

Chapter                                                              Page

1  INTRODUCTION                                                         1
     A Brief Literature Review                                          3
2  THE CONVOLUTION PROBLEM                                              6
3  KALMAN FILTERING AND THE RECURSIVE LEAST SQUARES PROBLEM             9
4  STATE SPACE REPRESENTATION OF THE DECONVOLUTION PROCESS             19
5  IMPLEMENTATION OF AN ADAPTIVE KALMAN FILTER TO ESTIMATING RUNOFF    26
     Computation of the Kalman Filter                                  31
     Wiener Kalman Formulation Assumptions                             35
6  CASE STUDIES                                                        36
     Computation of the Confidence Bounds on Predicted Traces          43
     Considerations                                                    46
     Other Applications                                                46
     Other Techniques: Compatibility                                   48
     Numerical Examples                                                49
     Example No. 1 Discussion                                          51
7  FINITE STATE MACHINES                                               78
     Numerical Example No. 2 (Snowpack Depletion at Mission)           84
     Numerical Example No. 3 (Squamish problem using Wiener filters
       as error filters)                                               86
8  OTHER CLASSES OF FILTERS                                            89
     Minimum Entropy Deconvolution                                     90
9  CONCLUSIONS                                                        101

BIBLIOGRAPHY                                                          102

ACKNOWLEDGEMENTS

The author wishes to express his appreciation to his supervisor, S.O. Russell, whose patience and guidance made possible the research and preparation of this thesis. The author also wishes to acknowledge the others involved at various stages of the thesis preparation: Bill Caselton for his constructive criticism; Tad Ulrych for his generous loan of countless algorithms and introduction to the Wiener processes; and Colin Walker for their explanation.

Chapter 1

INTRODUCTION

The problems associated with computing flood flow predictions from meteorologic data have occupied hydrologists for a number of years. Approaches to hydrologic forecasting have proliferated through recent years into a long and ever-expanding list, running from purely deterministic models through to spectral analysis. Present in all time series analysis is a dualism between deterministic observations and theoretical formulation that is fundamental to all systems.
Usually, the data available represents an inadequate base for thoroughly explaining the properties and limitations of the theoretical representation. Through the capabilities of high speed digital processors, this dualism has taken on an increasingly intricate network formulation, making the prediction process and subsequent analysis expensive and complex. Efforts to simplify this dualism between measurement and theory, to achieve a satisfactory compromise of simplicity and reality, have taken on various forms. Typical of these is the assignment of probabilities of occurrence to the deterministic data, then performing a frequency tabulation to separate out the stochastic variability into bands of probability. From this tabulation a "best" estimate is chosen as representative of the process.

In 1960 Rudolf Kalman published a paper that described a way to incorporate stochastic variability into a system model and to update this system as new information was received. This type of model had two simple parts: a model of the way the system itself undergoes change, and a model of the measurement process that was used to monitor the data. The measure of agreement between the variations induced by the system model and those of the measuring process serves as the correcting term at each time of evaluation. This paradigm provides a simple, compact scheme for incorporating all types of stochastic variability that could occur in the model. However, as in all formalisms, the accuracy of the initial guess at the system mechanics ultimately determines the usefulness of the Kalman formulation. The inaccuracies in these initial guesses can be the principal downfall of any simulation procedure.
This presentation is directed towards the rainfall/runoff problem from a multichannel point of view, that is, having a number of gauging points for the precipitation records as well as flows monitored from several rivers. Specifically, the meteorologic stations are at Vancouver Airport, Surrey Municipal Hall and Kwantlen Park. The corresponding flows are recorded on the Salmon and Nicomekl Rivers and at Murray Creek, all in the lower Fraser Valley. From this information various "filters" are proposed to predict flood flows on these rivers as a group. The filters, as originally calculated, are time-invariant matrices of 'suitable' coefficients that can be updated at discrete points in time using the Kalman model. The filters themselves represent the solution to a (matrix) ordinary differential equation with constant coefficients. The Kalman formulation allows the coefficients to become time-varying, subject to changes in the input precipitation. The various flood conditions change from time period to time period; consequently, a set of these filters is used for the same phenomena, depending on particular conditions. Thus, rather than one restricted form of model, this approach is pluralistic, adapting itself to the individual situation rather than "calibrating" some deterministic representation of the whole process.

A BRIEF LITERATURE REVIEW

Various types of models have been employed in attempts to piece together the time variations and the linear overview composing the rainfall/runoff problem. Typical of these is the article by Hino, M., Journal of Hydraulics, Proc. of Am. Soc. of Civil Engineers, Vol. 96, 1970, pp. 681.
Each of these approaches develops a linear filter (a set of autoregressive weights) that, when convolved with the input precipitation at a finite lag over a discrete interval, yields an output value (flow) at some future time. Considerable time and effort is expended to find the correct number of lags to employ so as to minimize the error in estimation. Unfortunately, these coefficients are invariant in time. Consequently, revision of these filter weights in response to new information is not possible. A further step towards this time adaptation is the situation where one takes an average across the sampling period and computes a second set of weights to compensate for the variance not explained by the first set. This is a popular approach to the problem, but requires considerably more computation.

The application of the time-varying approach to a rainfall-runoff problem is discussed by P. Young, Institute of Mathematics and Its Applications, Control Division, Dept. of Engineering, University of Cambridge, 1974. He employs the Kalman formulation in a simple single-lag filter. The realization of this formulation may be simply represented as:

    y_k = a_k u_k + e_k

where y_k is the flow at time k, u_k the rain at time k, e_k the error at time k, and a_k is some function of air temperature and the difference in flows at any two points on a river. He then proceeds to continually update a_k, using information on the actual flow as it is received to correct y_k. The value of a_k is changed until its fluctuations fall within some required tolerance. The advantage of this approach is that a_k is now a time-varying parameter, responsive to changes in the new incoming information. Autoregression and time variation are now neatly combined into a single unit.
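Young's scheme can be sketched in a few lines of code. This is a minimal illustration, not Young's implementation: the function name, the noise variances `q` and `r`, and the synthetic rainfall/flow record are all assumptions made for the sketch.

```python
import numpy as np

def update_gain(a, p, u, y, q=0.0, r=0.01):
    """One recursive correction of a_k in y_k = a_k * u_k + e_k.

    a, p : current estimate of a_k and its error variance
    u, y : newly received rain input and observed flow
    q, r : assumed parameter-drift and observation-noise variances
    """
    p = p + q                        # a_k may drift between samples
    g = p * u / (u * u * p + r)      # gain weighting the new information
    a = a + g * (y - a * u)          # correct a_k with the observed flow
    p = (1.0 - g * u) * p            # confidence in a_k improves
    return a, p

# Synthetic record: a fixed "true" gain of 0.6 observed through noisy flows.
rng = np.random.default_rng(0)
a, p = 0.0, 100.0                    # start with no knowledge of a_k
for _ in range(500):
    u = rng.uniform(1.0, 5.0)
    y = 0.6 * u + rng.normal(0.0, 0.1)
    a, p = update_gain(a, p, u, y)
```

After a few hundred samples the fluctuations of `a` settle near the true gain, which is the tolerance behaviour Young describes.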
An enormous amount of work has gone into adapting this idea to multiflow systems, employing varying numbers of lags in the Kalman formulation. However, the crux of the matter rests in finding the right model (the correct filter weights) for each situation. In most cases the weights do not vary a great deal, allowing one to pick a typical starting set and update it upon receipt of new information. This poses yet another problem, typical of the forecasting of extreme flows. Given that one has historically found the "best" set of these parameters and has a new set of precipitation values, but no new flow information until some time after the process has been started, the question becomes: how then can one assess the credibility of these parameters, or indeed the model structure, when applying the model to a slightly different case?

It is to this end that this model estimation is directed: to enable the construction of a multiriver model with multiple inputs. Some of the well documented geophysical techniques are borrowed and re-examined in the water resource paradigm, in particular the work of Crump, N.D., Geophysics, 1974, v39, pp. 1-14. In order to do this, several concepts from spectral analysis need to be reformulated in terms of the runoff problem. This will be done in a step by step manner with the suitable water resource adaptations.

Chapter 2

THE CONVOLUTION PROBLEM

The word convolution means "folding". When a signal is passed into a filter, the output is the convolution of the signal and the impulse response of the filter. Similarly, when a signal (rainfall) is passed into the earth, the output runoff represents the convolution of the input signal with the response function of the earth. Deconvolution means unfolding. The deconvolution operator is one which produces the inverse operation to the given convolution operator.
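The folding and unfolding can be demonstrated numerically. The basin "response" below is a made-up decaying sequence, not a calibrated one, and the deconvolution is simple polynomial long division, assuming the response is known exactly.

```python
import numpy as np

response = np.array([0.5, 0.3, 0.15, 0.05])   # hypothetical impulse response

rain = np.zeros(10)                            # two isolated rainfall impulses
rain[1], rain[5] = 2.0, 1.0

# Convolution "folds" each impulse with the response; the outputs overlap.
flow = np.convolve(rain, response)[:len(rain)]

# Deconvolution "unfolds" the record back into the impulses by inverting
# the folding one sample at a time (polynomial long division).
recovered = np.zeros(len(rain))
for n in range(len(rain)):
    acc = flow[n]
    for k in range(1, len(response)):
        if n - k >= 0:
            acc -= response[k] * recovered[n - k]
    recovered[n] = acc / response[0]
```

Note that the response has its largest value at its beginning; this anticipates the minimum delay condition discussed later in this chapter.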
Consequently, when one deconvolves runoff records with a digital computer, one is trying to unfold from the original heterogeneous complex earth some simpler components so that its basic structure can be interpreted. This system deconvolution then separates out the effects of the data collection system. This is a problem which is different from simply separating signal from noise. For example, consider a radar receiver that receives a signal masked with noise. This noise is generated by the system and has nothing to do with the signal. Compare this with the runoff records: nearly all of this information contains meaningful information about the total system, and the problem is to unfold the recorded records into simpler components so that one can see the contribution by primary events, ghost events or other causes.

The information being considered consists of primary events or innovations. These innovations are, however, unpredictable. They occur randomly with random amplitudes. Associated with each and every one of these innovations is a response. This response consists of the reverberations and repercussions generated by the innovation, and each response persists for relatively short periods of time before damping out. Consequently, the resulting time series consists of all of these primary events overlapped, subsequently masking the onset of each primary event. One wishes, therefore, to unfold these primary events.

To enable this process one uses a measure of the information available. This is referred to as entropy. The ratio of the entropy per symbol to the maximum value it could have is called the relative entropy. Viewed in terms of rainfall/runoff, the redundancy in recording represents the factor by which the average lengths of messages are increased beyond the minimum necessary to transmit the desired information.
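The relative entropy of a recorded sequence is straightforward to compute. A small sketch (the records are toy symbol strings, not streamflow data):

```python
import math
from collections import Counter

def relative_entropy(record):
    """Entropy per symbol divided by its maximum possible value,
    log2 of the number of distinct symbols. Redundancy = 1 - ratio."""
    n = len(record)
    counts = Counter(record)
    h = -sum(c / n * math.log2(c / n) for c in counts.values())
    return h / math.log2(len(counts))

lopsided = relative_entropy("aaaaaaab")   # one symbol dominates: redundant
balanced = relative_entropy("abababab")   # both symbols equally likely
```

The lopsided record has a relative entropy well below one, i.e. high redundancy; the balanced record carries no redundancy at all.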
In natural processes, nature has built in enough redundancy to overcome noise. However, this redundancy causes so much overlapping and mutual interference that it is not possible to interpret the raw records provided. For this type of situation, the process of deconvolution in effect separates the new information from the redundant information as time progresses. This separation permits interpretation of both kinds of information (new and redundant); thus both have value in comprehending the basic mechanisms. The new information consists of the primary events or innovations, and the redundant information, the attached responses and repercussions.

There is one additional assumption necessary before one applies the deconvolution technique: that the process is minimum delay. Inherent in this concept are the following two assumptions:

1. The deterministic hypothesis that the percolation through the layered earth has the same effect on all of these primary events, i.e., each of the events is of a minimum delay shape.

2. The statistical hypothesis that these primary events are randomly spaced in time and have random amplitudes.

The deterministic hypothesis assumes that the primary response of the basin occurs on or near the surface. This means that the surface responds like a linear reservoir at a fixed lag and the subsurface behaves like a linear reservoir at a different lag. However, the primary response for each has essentially the same formulation and appearance. The statistical position is based on the assumption that the primary events (infiltration) occur in a manner that is not systematic and depends on the characteristics of the individual earth layers, which were laid down with no systematic scheme in mind. The evidence of this behaviour is realized in the runoff records in the form of a time series.
The computed autocorrelation of these series averages out the nonsystematic events of the time series and preserves the systematic ones. These systematic variations are realizable in the amplitude (autocorrelation) spectrum of the system. In order for this condition to be consistent with the two primary assumptions, one must combine autocorrelation with the minimum phase condition. The minimum delay condition can be heuristically argued as follows: as the input signal (precipitation) percolates downward through the layered medium of the earth, it is deflected by the different density layers and correspondingly attenuates in strength. For flood conditions, or for that matter most other conditions, not all of the flow is retained. Hence, the more it is reflected, the more attenuated it becomes. Therefore, the impulse in a progressing wave form appears at its beginning rather than at its end. This, then, is the condition that is defined as minimum delay.

Chapter 3

KALMAN FILTERING AND THE RECURSIVE LEAST SQUARES PROBLEM

Since 1960 the use of recursive least squares filtering has changed the foundations of estimation theory as it is applied to both discrete and continuous problems. The addition of the recursion property to this well known technique is its first innovation since Gauss (1809) and Legendre (1806) originally applied it to analyze observational data. This innovation came about through the presentation of a paper by Rudolf Kalman, a control theorist. His formulation allowed the unknown parameters to be described by inherently time-variable states, which in turn were described by a general set of linear stochastic state equations. This departs from the classical least squares definition of solving for a set of time-invariant parameters described in a linear set of equations.
As a brief introduction to the idea of recursive least squares, one should consider the original least squares problem as posed by Gauss. Consider the linear regression problem in which a variable x_0 is known to be related to n other variables through the linear relation

    x_0 = a_1 x_1 + a_2 x_2 + ... + a_n x_n

where the a_j, j = 1, 2, ..., n are the n unknown constant parameters. The variables x_j are assumed to be known exactly, but x_0 is seen only in the presence of some noise η. If one then calls the observation y, one has

    y_i = x_0 + η_yi = Σ_{j=1}^{n} x_ji a_j + η_yi ,   i = 1, 2, ..., k        (1.0)

where η_yi is the error associated with the i-th observation. Suppose that this sequence of errors has the following statistical properties for i = 1, ..., k:

1. It is a sequence of random variables with mean zero, i.e. E[η_yi] = 0.

2. The η_yi are uncorrelated in time and have a constant variance σ²:
   E[η_yi η_yj] = σ² δ_ij, where δ_ij = 1 for i = j and 0 for i ≠ j.

3. The η_yi are independent of the variables x_ji.

This is the formulation of a simple well known problem. The same problem can be reposed from an estimation point of view so as to set up the recursion property. Now what is required is a "best" estimate of the values of the unknown parameters a_j, given the information y_i, x_1i, ..., x_ni; i = 1, ..., k. The least squares formulation of this situation is to minimize some cost function C, the sum of squares

    C = Σ_{i=1}^{k} [ Σ_{j=1}^{n} x_ji a_j − y_i ]²

Since this has only the one parameter space (the a_j), the minimization is carried out by setting the gradient of C with respect to the a_j to zero. This yields a set of linear equations, the "normal equations" as they are referred to in regression analysis. These can then be solved to obtain the least squares parameter estimate â_j.
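In matrix form the normal equations read (Σ x_i x_i^T) â = Σ x_i y_i, which the short sketch below solves in one batch. The data are synthetic and the names arbitrary:

```python
import numpy as np

rng = np.random.default_rng(1)
a_true = np.array([2.0, -1.0, 0.5])            # unknown constant parameters
X = rng.normal(size=(100, 3))                  # the x_ji, known exactly
y = X @ a_true + rng.normal(0.0, 0.1, 100)     # observations seen with noise

# Setting the gradient of C to zero gives the normal equations
#   (X^T X) a = X^T y
a_hat = np.linalg.solve(X.T @ X, X.T @ y)
```

With 100 samples and small noise, `a_hat` falls very close to the true parameters; the recursive form developed next reproduces this solution one sample at a time.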
For the case where there is more than one parameter, one can simply rewrite (1.0) in vector form:

    x_0 = x^T a ,  where x^T = [x_1, x_2, ..., x_n],  a^T = [a_1, a_2, ..., a_n]        (2.0)

Then C becomes

    C = Σ_{i=1}^{k} [ x_i^T a − y_i ]²        (3.0)

    y_i = x_i^T a + η_i ,   i = 1, 2, ..., k        (4.0)

and the normal equations become

    ∇_a C = Σ_{i=1}^{k} x_i x_i^T â − Σ_{i=1}^{k} x_i y_i = 0        (5.0)

where ∇_a C is the gradient (Jacobian) of C with respect to all elements of a. Provided that Σ x_i x_i^T is nonsingular, the solution to (5.0) is

    â_k = P_k b_k        (6.0)

where

    P_k = [ Σ_{i=1}^{k} x_i x_i^T ]^{-1} ,   b_k = Σ_{i=1}^{k} x_i y_i        (7.0)

To extend (6.0) to a recursive form, where after k samples â_k is a linear sum of the estimate â_{k-1} obtained after k−1 samples plus some correction term based on the new information y_k and x_k received on the k-th sample, one must use (7.0) so that P_k and b_k can be related to P_{k-1} and b_{k-1} through the following equations:

    P_k^{-1} = P_{k-1}^{-1} + x_k x_k^T        (8.0)

    b_k = b_{k-1} + x_k y_k        (9.0)

which can be rearranged as

    P_k = P_{k-1} − P_{k-1} x_k [1 + x_k^T P_{k-1} x_k]^{-1} x_k^T P_{k-1}        (10.0)

(also called the "matrix inversion lemma"). To obtain â_k in terms of â_{k-1}, put (8.0) and (9.0) into (6.0) to obtain

    â_k = P_k [ P_{k-1}^{-1} â_{k-1} + x_k y_k ]        (11.0, 12.0)

Finally, a simpler expression can be found by substituting from (8.0) for P_{k-1}^{-1} = P_k^{-1} − x_k x_k^T, so that (11.0) becomes

    â_k = â_{k-1} − P_k x_k [ x_k^T â_{k-1} − y_k ]        (13.0)

This, then, is the recursion form of the original least squares problem (1.0). The only thing missing from this structure is that it does not make any use of the statistical errors η_yi in the observations. To incorporate these into (13.0), one needs only to retrace the steps used in the simple linear regression development.
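Recursions (10.0) and (13.0) translate directly into code. The sketch below runs them on synthetic data and checks the result against the batch solution (6.0); all data and names are invented for the illustration.

```python
import numpy as np

def rls_step(a, P, x, y):
    """One recursive least squares step: lemma (10.0), then correction (13.0)."""
    Px = P @ x
    P = P - np.outer(Px, Px) / (1.0 + x @ Px)   # matrix inversion lemma
    a = a - P @ x * (x @ a - y)                 # correct toward the new sample
    return a, P

rng = np.random.default_rng(2)
n, k = 2, 60
X = rng.normal(size=(k, n))
y = X @ np.array([1.5, -0.7]) + rng.normal(0.0, 0.05, k)

a, P = np.zeros(n), np.eye(n) * 1e6             # near-"infinite" initial P
for i in range(k):
    a, P = rls_step(a, P, X[i], y[i])

batch = np.linalg.solve(X.T @ X, X.T @ y)       # a_k = P_k b_k from (6.0)
```

With a very large initial P, the recursion agrees with the batch normal-equation solution to numerical precision, which is the point of the derivation above.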
This requires (1) a zero-mean estimation error, E[ã] = 0; and (2) the variance-covariance matrix P*_k = E[ã_k ã_k^T], which is related to P_k through the simple relation P*_k = P_k σ². Substituting P*_k for P_k in (13.0) and (10.0) yields

    â_k = â_{k-1} − P*_{k-1} x_k { σ² + x_k^T P*_{k-1} x_k }^{-1} [ x_k^T â_{k-1} − y_k ]        (14.0)

    P*_k = P*_{k-1} − P*_{k-1} x_k { σ² + x_k^T P*_{k-1} x_k }^{-1} x_k^T P*_{k-1}        (15.0)

This algorithm not only supplies the parameter estimates at each sampling instant, but also provides an indication of the accuracy of these estimates through the error covariance matrix P*_k. This is extremely useful for looking at things "on line". It also provides an estimate of the rate of convergence of the parameter estimates.

If one considers the statistical interpretation of P*_k as an estimation error covariance matrix, it would then be intuitively reasonable to choose P*_0 consistent with the level of confidence one has in the initial estimate â_0. This is a direct link with the Bayesian point of view, since â_0 and P*_0 represent the a priori statistics of the estimate, which will be updated with the input y_1 and x_1 to yield the a posteriori covariance matrix P*_1. This then proceeds in a recursive manner for each step. In reality one knows little about the parameters, in which case setting â_0 = [0] is a reasonable start. Similarly, P*_0 is set to large diagonal elements (e.g. 10³), indicating that there is very little confidence in the initial estimate and no idea of the cross-covariance terms.

This interpretation can be applied to the correction term in (14.0). One can say that [x_k^T â_{k-1} − y_k] is an instantaneous measure of the gradient of C at the k-th instant, which is modulated by P*_{k-1} x_k { σ² + x_k^T P*_{k-1} x_k }^{-1}. Since the Gaussian statistical properties allow that P*_k will be a strictly decreasing function of time, this permits the interpretation that as the estimation proceeds and confidence increases, the weighting decreases; consequently, the correction reduces. P*_k then tends to "filter" out the inaccuracy of the observations.

This filtering idea applied to the least squares algorithm allowed Y.C. Ho to apply this recursive analysis to a multidimensional
Since the Gaussian 2 w i l l be a s t r i c t l y decreasing function of time; t h i s then allows the i n t e r p r e t a t i o n that as the estimate proceeds and confidence increases, the weighting decreases; * consequently, the correction reduces. P ^ then tends to " f i l t e r " out the inaccuracy of the observations. This f i l t e r i n g idea applied to the l e a s t squares algorithm allowed Y.C..H02 to apply t h i s recursive a n a l y s i s . t o a multidimensional -13- stochastic approximation with the s c a l a r gain replaced by * p • k . This a 2 analogy permits an overview that the consistency of the estimates i s not j u s t a s t a t i s t i c a l d e f i n i t i o n , but part of the actual mechanism. It i s t h i s l i n k that unites the l e a s t squares analysis to pattern recognition and machine learning to state v a r i a b l e estimation and f i l t e r i n g . Since the idea of l e a s t squares applied i n a recursive form may not be a f a m i l a r form of analysis a simple example from physics w i l l be presented to aid i n c l a r i f i c a t i o n . Consider a body moving in a s t r a i g h t l i n e with a v e l o c i t y V_. One knows that the distance covered S. at any time t . i s given by S. = S + \lt. Q S Q Suppose there are k noisy observations contaminated with noise n „ yti Then y i = So we have S + Mt- Q O 0 y. of s^ (the p o s i t i o n @ t^) or n „ • i y i=l, 2 + Tji . y a, = S : a 1 i s the p o s i t i o n @ t k = v 2 and xj.. = 1.0 and x = t - the l i n e a r l e a s t squares s o l u t i o n i s i i trivial. 2 1 Consider a vector form of the same problem where x = i [l.t^ 1 ; a = [S ,V] T 0 Consequently, the simultaneous estimation of S Q and V by reference to y.j i s a 2 parameter system which can also be solved in the l e a s t squares manner. The following i s a s o l u t i o n to a set of observations f o r t h i s problem. This was presented by F. G r a y b i l l ( 1 9 6 0 ) . 
as follows.

[Figures 2-4, not reproduced: parameter estimates and the elements p_11, p_12, p_21, p_22 of P_k plotted against the number of samples, for â_0 = [0]; Figure 4 compares the stage-wise solution for P_0 = 100.0.]

Here one can also see the strictly decreasing nature of P_k. Figure 2 shows the results of the parameter estimation. Figure 3 shows how P_k evolves. Figure 4 compares the results of the system when S_0 is known a priori (only one unknown, V), computed with different values of P_0 (which is now a scalar, since S_0 is known). The results are compared to a stage-wise least squares. For P_0 > 10² there is no real difference.

There is one catch to this otherwise elegant algorithm: things are assumed to be relatively constant over the interval of observation. If this is not true, then using this analysis directly is clearly not a good idea. The next sequential step in the evolution towards the Kalman model was to allow the process to have a memory (ostensibly by weighting past values). This in itself is not reasonable, since all parameters are treated alike and no criterion exists for selectivity. The next step in the adaptation of this regression analysis is to lift the restriction of invariance of the parameters. This means that where before a_k = a_{k-1} for all k, one now assumes that the parameters vary in a manner described by the following stochastic equation:

    a_k = Φ_{k,k-1} a_{k-1} + Γ_{k-1} q_{k-1}        (16.0)

where Φ_{k,k-1} is a transition matrix (n × n), also written as Φ; Γ_{k-1} is an (n × m) input matrix, also written as Γ; and q_{k-1} is a vector (noise) of independent random variables with

    E{q_k} = 0 ;   E{q_k q_j^T} = Q_k δ_kj
To demonstrate this consider the now familiar system *k = k x T a k + T1 from which one requires a^. y k It is assumed that a^ w i l l vary in a stochastic manner that can be described by (16.0). At the kth sampling (estimation), this additional p r i o r i n f o r mation allows one to make a k/k-l a n c * P k/k-l t 0 t ' i e a p r i o r i updates (predictions) c a l l e d estimates a^-j and covariance P|_-| obtained < at the k-1th sampling. Simply stated this means that because one knows how the true parameter vector w i l l vary between samplings, one can use this information to vary the parametric estimate in a similar manner p r i o r to the receipt of the new data at the kth instant. To demonstrate this prediction f i r s t r e c a l l that ECa^] = $ a k i (from r e s t r i c t i o n s on q ) k So an estimate of a^ p r i o r to the new data a r r i v i n g can be obtained by a k/k-l • k-l = a ( f l 7 where k/k-1 means k based on-k-1. The error is defined as (for the a p r i o r i estimate) a k/k-l = a k/k-l " k a f then from (16.0) arid "(17.0) a k/k-l = = $ a k-l $ a + k-l " r q k-l r q k-l + * k-l a r o m ( 1 6 - 0 ) - 0 ) in a similar manner (see Gelb pp. 107, 110) P k/k-l = $ P P k/k-l = E k-l $ T + r Q ^ k-1 k - l ^ a a r T a n w h ^ d e r (18.0) e 1 S ^ e c o v a n a n c e matrix of the parameter variation disturbance in (16.0). One now has a correction mechanism f o r the system given a new observation plus the prediction mechanism a k/k-l P k/k-l* * k-l = § *P . *»+rQr = k T 1 The formulation can be simply demonstrated by again using the random walk as an example, here t h e n $ = r = I a k/k-l P k/k-l = I: a = P identity matrix k-l k-1 + Q Q is the covariance of the random walk and needs only be a diagonal matrix. One has at this point a l l of the mechanisms of the Kalman F i l t e r as,.they are to-be.'.used in this paper. 
This formulation of the state space representation needs only to be linked to the deconvolution process to make the mechanism complete.

Chapter 4

STATE SPACE REPRESENTATION OF THE DECONVOLUTION PROCESS

Before one can effectively link the deconvolution process to the Kalman state space representation, an equivalence between these two operations must be established, along with some fundamental ideas concerning the modelling of random processes. To clarify the change in formulation, a simple circuit example has been chosen.

[Figure 5, not reproduced: an R-L circuit with input u(t) and output y(t).]

Figure 5 represents a system with an input u(t) and an output y(t). The manner in which energy is transferred from input to output is the solution to a pair of linear, homogeneous, ordinary differential equations. Formally stated, the pair of ordinary differential equations is

    u(t) = L di(t)/dt + R i(t)   (input)
    y(t) = R i(t)                (output)

The solution H(t) represents the system response to the input u(t), and has the form

    H(t) = (R/L) e^{−(R/L) t}

To make the transition between the Kalman formalism and the transfer function form, one need only consider the following change: define the state variable x(t) = i(t), so that y(t) = R x(t). The differential equations become

    ẋ(t) = −(R/L) x(t) + (1/L) u(t)        (20.0)
    y(t) = R x(t)

There are now two things that become immediately apparent. First, the state space formulation permits R and L to vary as functions of time, whereas the transfer function formulation does not; in the transfer function formulation there is no explicit solution for R(t) and L(t). Secondly, the transfer function formalism has been replaced with a state formulation. In terms of the deconvolution process, the deconvolution operator is equivalent to H(t). This has assumed that the system parameters are fixed (as are R and L).
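The fixed-parameter case can be checked numerically: integrating (20.0) with the input removed reproduces the exponential decay appearing in H(t). The component values and step size here are arbitrary choices for the sketch.

```python
import math

R, L = 2.0, 0.5                     # assumed (fixed) component values
dt, steps = 1e-4, 10_000            # integrate over one second

x = 1.0                             # initial state x(0) = i(0), with u = 0
for _ in range(steps):
    x += dt * (-(R / L) * x)        # Euler step of x' = -(R/L) x

analytic = math.exp(-(R / L) * dt * steps)   # x(t) = e^{-(R/L)t} x(0)
```

The numerically integrated state matches the closed-form exponential to within the Euler discretization error.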
The physical system, such as the layered earth, does not have its system parameters fixed; they do change with time. Consequently, to link this evolution process of the parameters to the best "fixed" approximation provided by the deconvolution process, one needs to restate the problem in the state space configuration. This is effectively done by making the appropriate substitutions. If one now returns to the Kalman system model formulation, the restating of the deconvolution process in state space terms becomes straightforward. This can be done in the following manner.

ẋ(t) = A x(t) + B u(t)        (21.0)
y(t) = H x(t) + C v(t)

where A, B, H, C are matrices and x, y, u, v are vectors. H is not the transfer function. In terms of the previous example

x = x(t);  u(t) = u(t);  A = -R/L;  B = 1/L;  C = 0;  H = R

Equations (21.0) are a general description of a time-varying linear system. The state-space description in terms of state vector functions of time has replaced H(t) in the linear system. The solution to (21.0) can be obtained as an ordinary differential equation if A and B are constant:

x(t) = e^{A(t-t₀)} x(t₀) + ∫_{t₀}^{t} e^{A(t-τ)} B u(τ) dτ

If u(τ) = 0 then

x(t) = e^{A(t-t₀)} x(t₀)

which is similar in form to H(t). The portion of this solution containing e^{A(t-t₀)} is the transition matrix, which describes the evolution of the state vector function as a function of time. When A, B, H, and C are not constant one then has to solve an associated estimation problem. Before embarking on the estimation problems one must also consider the problems associated with modelling a random process. By analogy with the previous examples we consider again u(t) as the input to a linear system with a transfer function denoted by T and an associated output y(t).
To enable a representation of this system as a random white noise process driving T to yield an output y, the following argument from Bayless and Brigham is presented. They consider the power spectral density, that is, the function whose integral over any sampling interval represents the contribution to the variance from that interval, to be represented as φ(w). There are two contributions to the variance in the linear system we are looking at:

φ_y(w)  the contribution from the output
φ_u(w)  the contribution from the input

therefore

φ_y(w) = |T(jw)|² φ_u(w)

and since u(t) can be considered a white noise process, φ_u(w) = 1, so that

φ_y(w) = |T(jw)|²

This now implies, for the case where φ_y(w) is the variance contribution of a nondeterministic random process, that φ_y(w) can be factored into T(jw)·T*(jw). Hence, any given power spectral density can be factored to give T(jw). This means that there exists a T(jw) such that the random process can be represented as a white noise process driving the linear system T to give φ_y(w). As a result of this argument one can look at (19.0) as:

u(t) → T(jw) → y(t)

where u(t) is white noise. When translating this formulation to the state space description, T(jw) is replaced with a system of differential equations and output equations, as can be visualized in Figure 6. As has been previously noted, A, B, J are generally functions of time. At this point the process description of (19.0) has undergone a two stage transformation. We have, first and foremost, allowed that the mean and autocorrelation of a random process u(t) can be modelled as a linear system driven by white noise. Secondly, the transfer function representation of this linear system has been replaced by the state variable description. It now remains only to dovetail these two attributes into the Kalman representation.
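The factorization argument can be exercised with a discrete example. The sketch below is illustrative only (a first-order recursive system stands in for T): unit-variance white noise drives the system, and the block-averaged periodogram of the output is compared against |T(jw)|².

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1 << 14
u = rng.standard_normal(n)              # unit-variance white noise, phi_u = 1

# A simple linear system T: y_t = a y_{t-1} + u_t
a = 0.6
y = np.empty(n)
y[0] = u[0]
for t in range(1, n):
    y[t] = a * y[t - 1] + u[t]

# Output periodogram vs the theoretical |T(jw)|^2 = 1 / |1 - a e^{-jw}|^2
f = np.fft.rfftfreq(n)
w = 2 * np.pi * f
psd_est = np.abs(np.fft.rfft(y)) ** 2 / n
psd_theory = 1.0 / np.abs(1.0 - a * np.exp(-1j * w)) ** 2

# Average the noisy periodogram over blocks of neighbouring frequencies
blk = 128
m = (len(f) // blk) * blk
ratio = (psd_est[:m].reshape(-1, blk).mean(1)
         / psd_theory[:m].reshape(-1, blk).mean(1))
```

The averaged ratio hovers near one across the band, which is the empirical content of φ_y(w) = |T(jw)|² with φ_u(w) = 1.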
A simplified description of the deconvolution process can be visualized in Figure 7: a message process x(t) passes through an impulse response h(t), additive noise v(t) is introduced, and the observed signal is z(t).

Here one wants to find an h acting on the input z(t) so as to minimize the mean square error between x(t) and x̂(t). Stated more formally:

x̂(t) = ∫_{-∞}^{∞} h(τ) z(t-τ) dτ

where h(τ) satisfies the relation

(autocorrelation matrix) × (filter response) = (crosscorrelation of input and desired output)

This is the classical Wiener-Hopf integral equation, defined by

∫_{-∞}^{∞} h(τ) R_zz(t-τ) dτ = R_zx(t)   (22.0)

R_zz = autocorrelation of input
R_zx = crosscorrelation of input and desired output.

In its most general sense (22.0) is written as

∫_{t₀}^{t} h(t,τ) R_zz(τ,k) dτ = R_xz(t,k)   (23.0)

with

x̂(t) = ∫_{t₀}^{t} h(t,τ) z(τ) dτ

This equation is difficult to solve in general, and the contribution from Kalman was that he changed (23.0) into a non-linear differential equation whose solution determines the optimum filter. However, for the discrete digital formulation the Kalman equations take on the following form:

x_k = Φ_{k/k-1} x_{k-1} + B_{k/k-1} u_k   (24.0)
y_k = H_k x_k

This system is the same form as (22.0), with Φ being the transition (propagation) matrix for the discrete state vector equations. In this case one wishes to minimize E[(x-x̂)(x-x̂)^T] based on the previous k measurements of y (output; runoff). The sampling interval for this process was one time interval. Implicit in this formulation (following Crump, 1974) are the following assumptions.

1. Φ_{k/k-1}, B_k, H_k are known for all times k.

2. The driving input u_k has zero mean and known variance, and is uncorrelated with the additive noise v_k for all times k:

E[u_k] = 0;  E[u_k v_k^T] = 0;  E[u_k u_j^T] = Q_k δ_kj   for all k and j

Q_k is a known diagonal matrix.
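In discrete form (22.0) becomes a set of normal equations, Σ_j h_j R_zz(i-j) = R_zx(i). The sketch below is illustrative only (a white message in additive white noise, where the optimal filter is known to be a single tap of value σ_x²/(σ_x²+σ_v²) = 0.8): it estimates the correlations from data and solves for h.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20000
x = rng.standard_normal(n)               # desired message, variance 1
z = x + 0.5 * rng.standard_normal(n)     # observed input, noise variance 0.25

m = 8                                    # filter lags 0..m-1

def corr(a, b, lag):
    """Estimate E[a_t b_{t-lag}] for lag >= 0."""
    return float(np.dot(a[lag:], b[:len(b) - lag])) / (len(a) - lag)

# Normal equations: sum_j h_j R_zz(i - j) = R_zx(i), i = 0..m-1
R = np.array([[corr(z, z, abs(i - j)) for j in range(m)] for i in range(m)])
r = np.array([corr(x, z, i) for i in range(m)])
h = np.linalg.solve(R, r)
```

The solved operator places essentially all of its weight at lag zero, matching the known optimum for this toy case.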
3. The additive noise v_k has zero mean and known variance for all times k:

E[v_k] = 0;  E[v_k v_j^T] = V_k δ_kj

where V_k is a diagonal matrix.

4. The initial state vector has known mean and variance:

E[x₀] = x̄₀;  E[x₀ x₀^T] = M₀

It should be noted at this point that at each time k, second order statistics are required for the noise processes u_k and v_k. This is the feature that allows for non-stationarity in u_k and v_k, whose statistics change from time step to time step.

Chapter 5

IMPLEMENTATION OF AN ADAPTIVE KALMAN FILTER TO ESTIMATING RUNOFF

The classical Kalman formulation is sufficient when one has continuing information on the output signal (runoff). However, as is the case in hydrologic prediction, the information on runoff stops at some point and the only continuum of information is the input signal (rainfall). Estimation of the error covariance initially is not a problem; neither is the estimation of the system noise. Consequently, the implementation of the Kalman filter can be effected using only minor modifications to its structure. One begins with x_k. This is not a vector in this case but rather a matrix describing the change in the diffusing (percolation/retention) properties of the layered earth model. Therefore Φ_{k,k-1} is the matrix describing the change in the solution form as a function of time and V_k is the noise associated with this measurement:

x_k = Φ_{k,k-1} x_{k-1} + V_k

To establish the historic behaviour of this system under peak (flood) conditions one uses the "significant" storm records from several years of records to establish a typical design storm. This input storm can be chosen to reflect a variety of conditions. However, for flood prediction one would typically want the "worst" of the data set. "Worst" can imply two things.
First, the signal that has the maximum intensity over a fixed period; or secondly, the maximum duration and maximum intensity. These signals are multichannel, implying that the same storm is monitored at a number of meteorologic stations. Associated with this input signal (at a suitable time lag) are the resulting monitored flow records. These are also multichannel (i.e., monitored simultaneously on several different rivers). One wishes to associate the output trace with the input signal in such a way as to characterize the system behaviour. This is done using the deconvolution process employing a standard Wiener filter. The Wiener filter is the digital operator which solves, in a linear least squares sense, the Wiener-Hopf equation (23.0). It is formulated in such a way that one produces a linear least squares prediction and enhancement of a stationary time series by means of a realizable, time invariant linear operator. (For a discussion of Wiener filters see Robinson, Chapter 6.) One of these realizations would typically look like the following:

y_t = Σ_{j=1}^{m} A_{t-j} u_{t-j};   m = h + 1

where

{u_{t-j}} are the vectors of input signals of length n at lags t-j, j = 1, ..., m;
A_{t-j} are the matrices of deconvolution operators at lags j;
y_t is the estimated output at time t.

In the Kalman notation:

y_k = A_k u_k + V_k,  where  A_k u_k = Σ_{j=1}^{m} A'_{t-j} u_{t-j}

and A'_k is to be calculated from:

A'_k = Φ_{k,k-1} A_{k-1} + w_k

Consequently, the system and measurement models look like:

A_k = Φ_{k,k-1} A_{k-1} + w_k   (system)
y_k = A_k u_k + V_k             (measurement)

In this problem application one has only estimates of y_k, called ŷ_k, whereas the measurement process is on the input signals u_k. One, therefore, needs to rewrite the updating procedure in a somewhat different manner, i.e.

ŷ_k = Σ_{j=0}^{m} A'_{k-j} u_{k-j} + V_k   (signal estimate using deconvolution)
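The multichannel realization above can be written out directly. The following fragment is illustrative only (the operator matrices and inputs are invented, not fitted to any data): it applies a bank of lag matrices A_j to three input channels to produce three output channels.

```python
import numpy as np

# y_t = sum_{j=1..m} A_j u_{t-j}: 3 input channels (gauges), 3 output
# channels (rivers), m lag matrices. All values are invented.
rng = np.random.default_rng(2)
m, nin, nout, T = 4, 3, 3, 50
A = [rng.standard_normal((nout, nin)) * 0.3 for _ in range(m)]  # A_1..A_m
u = rng.standard_normal((T, nin))

y = np.zeros((T, nout))
for t in range(T):
    for j in range(1, m + 1):          # lags 1..m
        if t - j >= 0:
            y[t] += A[j - 1] @ u[t - j]
```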
The transition then becomes

A'_k = Φ_{k,k-1} A_{k-1}

with

ŷ_k = Σ_{j=1}^{m} A'_{k-j} u_{k-j}

From this one uses the Kalman algorithm to update A_k, i.e.,

A_k = A'_k + K_k [y_k - y'_k]

In terms of the entire process the step by step computation becomes as follows:

Compute P(0): This is the matrix of terms defined by E[(A_k - Â_k)(A_k - Â_k)^T]. Since one does not know the exact solution to the differential system satisfied by A_k, it suffices to use a diagonal matrix with terms of order 10³. (For the motivation behind this see the preliminary discussion.) Set A₀ = A_t, the deconvolution operator computed for time t.

Now compute:

P'_k = Φ_{k,k-1} P_{k-1} Φ^T_{k,k-1} + Q_{k-1}

where Q_k is a diagonal matrix describing the covariance of the signal measurement. Compute:

K_k = P'_k H_k^T [H_k P'_k H_k^T + V_k]^{-1}

where V_k is the measure of the noise in the measurement process. Then compute:

A_k = A'_k + K_k [y_k - y'_k]

where

A'_k = Φ_{k,k-1} A_{k-1}  and  y'_k = Σ_{j=1}^{m} A'_{k-j} u_{k-j}

Then compute:

P_k = P'_k - K_k H_k P'_k

Increment k and recompute P'_k. The Kalman equations as they are stated above are effectively minimizing the trace of P_k. The matrix A'_k is the best projected estimate of A_k, that is, the best estimate for A_k based on the estimates through y_{k-1} and the data through u_{k-1}. The matrix P'_k is the error covariance matrix of the projected estimate A'_k influenced by the most recent data u_k. Finally, y'_k is the best estimate of the measurement based on the samples through u_{k-1} and the estimates through y_k. The nature of this discrete formulation can be visualized using the block diagram formulation from Crump (Figure 8). The Kalman filter diagram contains the same delay and feedback system as the signal model. It is also clear that the estimate A_k is obtained using the weighted sum of two things:

1. The best predicted estimate A'_k
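The predict/gain/correct cycle just listed can be sketched for a single-channel operator. The fragment below is illustrative throughout (identity transition Φ = I, an invented "true" operator, and invented noise levels, none from the thesis data): the state is the m-lag operator a, and each new flow sample tightens the estimate.

```python
import numpy as np

rng = np.random.default_rng(3)
m, T = 5, 200
a_true = np.array([0.5, 0.3, 0.15, 0.05, 0.02])   # invented "true" operator
u = rng.standard_normal(T + m)                     # input (rainfall) record

a = np.zeros(m)              # operator estimate A_k
P = np.eye(m)                # P(0): diagonal initial covariance
Q = np.eye(m) * 1e-4         # covariance of the operator drift
V = 0.01                     # measurement noise variance

for k in range(T):
    H = u[k:k + m][::-1]                           # the m most recent inputs
    y = H @ a_true + 0.1 * rng.standard_normal()   # observed flow sample
    P = P + Q                 # predict: P' = Phi P Phi^T + Q, with Phi = I
    K = P @ H / (H @ P @ H + V)   # gain: K = P' H^T (H P' H^T + V)^-1
    a = a + K * (y - H @ a)       # correct: A_k = A'_k + K (y_k - y'_k)
    P = P - np.outer(K, H) @ P    # P_k = P' - K H P'
```

After a couple of hundred samples the operator estimate sits close to the "true" operator, which is the behaviour the on-line update is meant to deliver.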
2. The latest measurement sample u_k

The relative weighting of these two quantities is determined by the system dynamics model, invoked through the definition of Φ and A_{k-j}, and by Q and V.

Figure 8: Block diagram of signal model and discrete Kalman filter.

Computation of the Kalman Filter

To aid in the visualization of the actual mechanics of the Kalman process, a flow description of the procedure is outlined below.

1. Initialize the filter by defining the initial estimations:

P₀ = M₀, the covariance of the state matrix;
x̂₀ = x₀, the initial estimate of the state of the system. This corresponds to the deconvolution operator at zero lag.

2. For each time step perform the following calculations:

(a) Compute the gain

K_k = P'_k H_k^T [H_k P'_k H_k^T + V_k]^{-1}   (25.0)

(b) Compute the updated state matrix

x̂_k = x'_k + K_k [y_k - H_k x'_k]

(c) Compute a new covariance matrix

P_k = P'_k - K_k H_k P'_k

Increment k and repeat.

It is also useful to re-examine

x̂_k = Φ_k x̂_{k-1} + K_k [y_k - y'_k]

This is where the prediction of the next system state is computed. If one writes this explicitly in the form

x̂_k = Φ_k x̂_{k-1} + K_k [y_k - H_k Φ_k x̂_{k-1}]

one can observe the following properties:

1. If the model were a perfect one and there were no noise in the system, the output would look like y_k = H_k x_k = H_k Φ_k x_{k-1}. This means that x̂_k = Φ_k x̂_{k-1}. Consequently, once the transition matrix Φ_k is obtained from the physical model, one could predict the state vector exactly for all time steps as long as y = ŷ. If this is not so, then K_k [y_k - y'_k] is the correction for the error in the prediction of the state vector at the next time step.

2. The matrix K_k (Kalman gain) has useful physical properties that can be examined as the iteration proceeds. The optimality of this Kalman filter is contained in the structure and specification of this gain matrix.
The mathematical least squares structure has been discussed at length; consequently, an intuitive logic should also be included with its properties. This can be examined from the form

K_k = P'_k H_k^T R_k^{-1},  where  R_k = H_k P'_k H_k^T + V_k

and represents the noise in the system. To better facilitate the physical interpretation of K_k, assume for the moment that H_k is an identity matrix. This assumes P_k and R_k are n × n matrices. If R_k (system noise) is diagonal (this means that the noise between channels is not related), the matrix K_k results from multiplying each column of the error covariance matrix P_k by the inverse of the mean square measurement noise. "This implies that each element of K_k is essentially the ratio between statistical measures of the uncertainty in the state estimate and the uncertainty in a measurement." Simply stated, this means that the gain matrix is proportional to the uncertainty in the estimate, and inversely proportional to the measurement noise. If the measurement noise is large and the state estimate errors are small, the residual (in the original equations 23.0) is due primarily to noise and only small changes to the state estimate should be made. However, a small measurement noise and a large uncertainty in the state estimate says that the residual contains a great deal of information on the errors in the estimate. The difference between the actual and predicted measurement is used as the correction basis. One can see from the formulation of K_k that this does indeed agree with the intuitive motivation. As a final observation one should also consider the interpretation of the covariance matrix P_k. The effect of system disturbances on the error covariance growth is the same as that observed when measurements were not available.
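The ratio interpretation of the gain is easiest to see in the scalar case with H = 1, where K = P/(P + V). A two-line illustration with invented numbers:

```python
# Scalar gain K = P / (P + V): the ratio of estimate uncertainty to
# total uncertainty. Large measurement noise V -> small gain; small
# V -> gain near 1, and the measurement dominates the correction.
def gain(P, V):
    return P / (P + V)

k_noisy = gain(P=0.1, V=10.0)     # noisy measurement: barely trusted
k_clean = gain(P=0.1, V=0.001)    # clean measurement: almost fully trusted
```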
As the statistical parameters of the disturbance become larger, their effects are reflected in the "size" of the Q_k matrix in (25.0). The more pronounced the effect of the disturbances reflected in the "size" of the Q_k matrix, the more rapidly the error covariance will increase. The effect of measurement noise on the behaviour of the error covariance can be seen in

P_k(+)^{-1} = P_k(-)^{-1} + H_k^T R_k^{-1} H_k

where (+) means after receiving information and (-) means before (as used in Gelb). Large measurement noise (R_k^{-1} is small) provides only a small increase in the inverse error covariance P_k^{-1}, which means a small decrease in the error in P_k when the measurement is used. These measurements, therefore, contribute little to the reduction of the estimated errors. However, small errors in measurement (large R_k^{-1}) cause a large decrease in the error covariance when the measurement is used. If, on the other hand, one does not have measurement noise, the appropriate equation for P_k(+) is:

P_k(+) = [I - K_k H_k] P_k(-)   since R_k^{-1} does not appear.

To summarize the general behaviour of the P_k matrix one can say the following things:

1. Larger measurement noise will cause the error covariance to decrease slowly, or conversely to increase, depending on the system, the disturbances and the initial value of P_k.

2. Smaller noise will cause the filter estimates to converge on the true values more rapidly.

As an illustration, Figure 9 (from Gelb) is used.
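The information-form update can be exercised directly. In the sketch below (an illustrative two-state system observing only the first state; all values invented) a large R barely reduces the covariance of the observed state, a small R collapses it, and the unobserved state is untouched either way.

```python
import numpy as np

H = np.array([[1.0, 0.0]])            # observe the first state only (assumed)
P_minus = np.eye(2) * 4.0             # P(-): covariance before the measurement

def information_update(P_minus, R):
    # P(+)^-1 = P(-)^-1 + H^T R^-1 H
    info = np.linalg.inv(P_minus) + H.T @ np.linalg.inv(R) @ H
    return np.linalg.inv(info)

P_big_noise = information_update(P_minus, np.array([[100.0]]))
P_small_noise = information_update(P_minus, np.array([[0.01]]))
```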
Figure 9: Flow diagram of the discrete Kalman filter computation, including the gain, update, and reasonableness checks (after Gelb).

Wiener-Kalman Formulation Assumptions

The algorithm to compute the actual multichannel deconvolution operator has three main assumptions and limitations governing its design.

1. The time series input u_t (rainfall) and the desired output (runoff) are stationary, which means that their statistical properties do not change with time.

2. The approximation criterion is taken to be the mean square error between desired and actual output. This means that one determines the operator (f₀, f₁, f₂, ...) in such a way as to minimize the mean square error matrix between desired and actual output.

3. The operation used for signal enhancement and prediction is assumed to be a linear operation on the available information or, more familiarly, it is said to be time invariant.

This predictive filter (called a Wiener filter) is a linear least squares prediction and enhancement of stationary time series by means of a realizable, time invariant linear operator. The link between the Kalman approach to time varying linear least squares approximation and the Wiener filter now becomes obvious. One can design an optimal deconvolution operator based on similar sets of historical observations and allow the predictive deconvolution operator to be modified by on-line (new) innovations. The change in the structure can be monitored through the formalism of the system model transition in the Kalman filter and updated sequentially using the Kalman measurement model.
Chapter 6

CASE STUDIES

The site chosen to test these predictive techniques has been studied by a number of previous groups, the most recent of these being Sigma Engineering of Vancouver. The Surrey municipal boundaries roughly describe the basin being studied. For a pictorial description of the basin characteristics and topology see Figure 10, which has been reproduced from Taylor, J.W. The Surrey district as described in Figure 10, and as treated in Reference 15, was divided into four separate basins, each with its own unit hydrograph. These were then compared for common characteristics. The treatment of this same problem here has been approached from a very different perspective. In this case the information on the precipitation has been treated as three separate information channels which are not necessarily in phase with one another. The measurements are taken from the Vancouver Airport, Surrey Municipal Hall, and White Rock. The storm fronts generally approach from the west and, in terms of this sensing system, are first detected at the Vancouver Airport. This is followed by the stations at the Municipal Hall and at White Rock a few hours later (depending on the storm). Consequently, there is a difference in both start time and intensities. The output trace sensing stations are on the Nicomekl and Salmon Rivers and on Murray Creek. These three channels respond at different rates and peak at different discharges. Murray Creek is a low flow system fed by a group of gulleys. Its response is "flashy" and dissipates very quickly.

Figure 10: Study area, showing the recording precipitation gauges (A: Surrey Kwantlen Park; B: Surrey Municipal Hall; C: Langley Lochiel), New Westminster, and the British Columbia (Canada) / Washington (U.S.A.) boundary.
The Nicomekl and Salmon rivers are of roughly the same order of magnitude of base flow, with the response rate of the Nicomekl being slightly faster. All three of these channels are considered simultaneously in time but as separate sources of information. The procedure for testing the predictability of peak (flood) flow conditions consisted of scanning the total 72 hour precipitation records for the largest volume, greatest intensity, and the "design storm". The design storm was determined to be the one that typically reflected the flood condition period (i.e., December to February). The design storm signals were then input to the deconvolution algorithm along with the peak flow records found from the three rivers. The resulting deconvolution operator was stored and the same design storm applied to predict the following twenty-four hours of flow. This was done using the Kalman algorithm to update the deconvolution operator. This can be visualized in Figure 11 (the deconvolution step followed by the Kalman feedback loop).

Having done this, there is one further check that the updated deconvolution operator is from the same solution set as the original operator from Step 1. One then takes the deconvolution operator to represent a multichannel time series by itself. Having done this, the following criteria should be satisfied by both the actual and predicted filters:

1. Their autocorrelation spectra should be similar (since they are supposed to represent the same time series).

2. Their respective channels should be coherent up to the sampling frequency resolution.

As a background to this type of test and its implications, an explanation of the role of the autospectra and coherency is in order. Consider the two operators mentioned above to be two time series x_t and y_t, each with n channels, x_t = [x₁(t), ..., x_n(t)] and y_t = [y₁(t), ..., y_n(t)].
The autocorrelation for the series x_t is defined as

φ(s) = E{x_{t+s} x_t}

where s is some later time and x_{t+s} is the measurement at that time. This is the expected product of the measurements at t and t+s. Its role, therefore, is to average out the variations in the series to give an overall picture as time proceeds. The case that is used here has multiple channels; hence there are crosscorrelation terms to be considered. These are the terms for which φ_ij is defined for i ≠ j. The resulting matrix looks like:

Φ_s = E{x_{t+s} x_t^T} = [ φ₁₁(s) ... φ_1n(s) ; ... ; φ_n1(s) ... φ_nn(s) ]

Correspondingly, the crosscorrelation between two series x and y looks like:

Φ_xy(s) = E{x_{t+s} y_t^T},  where  φ_ij(s) = E{x_i(t+s) y_j(t)}

This matrix is the vehicle through which one can look at the degree of interrelation between any two time series. For ease of computation, Φ_xy(s) is looked at in the frequency domain, defined by:

Φ_xy(f) = Σ_s Φ_xy(s) e^{-2πifs}

which is the Fourier transform of the Φ_xy(s) representation. This enables one to calculate the spectral density, or the contribution to the variance in the series by the individual components. This provides an easier statistic with which to characterize the differences between the two series. For further discussion of the implications of this procedure see Jenkins and Watts. The end result of this manipulation is to examine the maximum power (energy) shared by these two series at any sampling rate. This is formally defined as the geometric mean of the two autospectra Φ_jj(f) and Φ_kk(f), represented by

M_jk(f) = sqrt( Φ_jj(f) Φ_kk(f) ),  where  f = 0, Δf, ..., mΔf;  Δf = 1/2m;

m = number of cycles/unit time. In order for this top limit to be attained, the power (energy) at that sampling frequency must have a fixed phase relationship between the two series being treated.
If this top limit is indeed attained then the two series are said to be coherent at that frequency. As is usually the case with real series, the phase relationships vary in a random fashion. If this change were completely random, then the bottom limit (theoretically) is zero. The two series could then be said to be completely incoherent. For the majority of situations reality is, hopefully, somewhere in between these two extremes. The coefficient of coherency is defined by:

K_jk(f) = Φ_jk(f) / sqrt( Φ_jj(f) Φ_kk(f) ) = |K_jk(f)| e^{iθ_jk(f)}

Its magnitude is defined as:

|K_jk(f)| = |Φ_jk(f)| / sqrt( Φ_jj(f) Φ_kk(f) )

The phase lag of the coherency is

θ_jk(f) = tan^{-1}[ Im(Φ_jk(f)) / Re(Φ_jk(f)) ] = -θ_kj(f)

The magnitude of K_jk lies between 0 and 1:

0.0 ≤ |K_jk(f)| ≤ 1.0

The algorithm that computes these parameters returns, for each sampling frequency, the matrix of autospectra Φ₁₁(f), ..., Φ_nn(f) together with the coherency magnitudes |K_jk(f)| and phases θ_jk(f). Returning to the actual problem, one would be interested in the following terms of this matrix for the six-channel series (x_1t, x_2t, x_3t, u_1t, u_2t, u_3t), defined as before:

1. the coherencies K₁₄, K₂₄, K₃₄, K₁₅, K₂₅, K₃₅, K₁₆, K₂₆, K₃₆; and

2. the phase lags θ₁₄, θ₂₄, θ₃₄, θ₁₅, θ₂₅, θ₃₅, θ₁₆, θ₂₆, θ₃₆

at sampling frequencies up to f = 1/(2Δt); if Δt = 1 hr, f = 0.5 cycles/hr. If the predicted deconvolution operator is the solution to the same problem (basic characteristics) as the previous one, the two operators will be coherent at the same frequencies, with perhaps different phase lags (response rates). If this is not the case, either the historical characterization is inadequate, or the prediction is defining conditions quite different than normal. If this is the case, then the coherencies for the predicted flood flows and another similar historic situation should be checked, using the same signal input.
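The coherency magnitude can be estimated by averaging spectra over repeated blocks. The sketch below uses illustrative synthetic series (not the Surrey data): a nearly coherent pair stays close to the upper limit of 1, while an independent pair falls near zero.

```python
import numpy as np

rng = np.random.default_rng(4)
nblk, blk = 64, 128
common = rng.standard_normal((nblk, blk))
x = common + 0.1 * rng.standard_normal((nblk, blk))   # shares the common part
y = common + 0.1 * rng.standard_normal((nblk, blk))   # shares the common part
z = rng.standard_normal((nblk, blk))                  # independent series

def coherency_mag(a, b):
    """|K_jk(f)| from block-averaged auto- and cross-spectra."""
    A, B = np.fft.rfft(a, axis=1), np.fft.rfft(b, axis=1)
    Sab = (A * B.conj()).mean(0)
    Saa = (A * A.conj()).real.mean(0)
    Sbb = (B * B.conj()).real.mean(0)
    return np.abs(Sab) / np.sqrt(Saa * Sbb)   # 0 <= |K_jk(f)| <= 1

k_xy = coherency_mag(x, y)   # near 1 at all frequencies
k_xz = coherency_mag(x, z)   # near 0 at all frequencies
```

The bound |K_jk(f)| ≤ 1 holds exactly here because the averaged cross-spectrum obeys the Cauchy-Schwarz inequality.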
For either of these two cases there is a consistency test to apply to establish credibility of the predictions. A similar test should be run when the signals being used are radically different from the historical records, e.g. the "100 year storm". When this signal is applied to historical data the deconvolution operator should be the same (i.e., have corresponding coherencies) over several sets of data. Consequently, one can place confidence intervals on the predictions one makes using these operators.

Computation of the Confidence Bounds on Predicted Traces

Part of the computation performed by the Wiener algorithm is the estimation of the mean square error along the trace of the computed operator. This can be formalized as follows:

E{e_t e_t^T} = [ E{e₁²(t)} ... E{e₁(t)e_m(t)} ; ... ; E{e_m(t)e₁(t)} ... E{e_m²(t)} ]

where e(t) is the error at time t and m is the number of lags. The trace of this,

I = tr E{e(t) e(t)^T} = Σ_{j=1}^{m} E{e_j²(t)}

is to be minimized. This minimum is returned as a percentage. Unfortunately, this is only an estimate of the agreement with the target data set and does not provide bounds for the filter's performance using new input data. The innovation variance structure has been dealt with at length in the literature (see, for example, Plackett, 1960) and will consequently be treated from a computational point of view only. Consider the autoregressive representation used for the Surrey data, i.e.,

y_t = Σ_{j=0}^{m} A_{t-j} u_{t-j} + Z_t   (Z is the noise)

Since one is now casting into the future without observation, Z = 0. The variance imparted by this estimation is given as:

Var[ŷ] = Var[Au] + Var[Z] = u'Cu + σ²

where u' denotes the transpose and C is the covariance of the estimator. Since

u'Cu = E{u'(X̂-X)(X̂-X)'u} = u'Vu

where V now represents the covariance of the input (rainfall) variables.
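The decomposition Var[ŷ] = u'Cu + σ² can be verified by simulation. The fragment below is illustrative only (u, C, and σ² are invented): it draws estimator errors with covariance C, adds independent noise, and compares the sample variance with the formula.

```python
import numpy as np

rng = np.random.default_rng(5)
u = np.array([1.0, 2.0])                       # invented input vector
C = np.array([[0.20, 0.05],
              [0.05, 0.10]])                   # invented estimator covariance
sigma2 = 0.5                                   # invented noise variance
n = 200_000

# y = u' beta + Z, with beta zero-mean, covariance C, and Z ~ N(0, sigma2)
L = np.linalg.cholesky(C)
beta = L @ rng.standard_normal((2, n))
yhat = u @ beta + np.sqrt(sigma2) * rng.standard_normal(n)

var_mc = yhat.var()                  # Monte Carlo estimate
var_theory = u @ C @ u + sigma2      # u'Cu + sigma^2
```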
Note also V = σ² I. One can now substitute this new equality into the original Var[ŷ] expression to obtain:

Var[ŷ] = ( u'(X'X)^{-1}u + 1 ) σ²

Further to this, one only has an estimate s² of the actual σ², and since the population is assumed normal, the errors are then distributed with a Student's t-distribution (see Jenkins and Watts, Chapter 3, p. 83). This is denoted by t_n(1 - α/2). In this case there are n-m degrees of freedom (since it takes m lags to define the process); consequently the distribution value desired is t_{n-m}(1 - α/2). For the Surrey data, n = 72 and m = 7, giving n-m = 65 degrees of freedom. For the 95% confidence interval t₆₅ ≈ 2. The final form of the confidence interval on ŷ looks like:

ŷ ± t₆₅(1 - α/2) s sqrt( 1 + u'(X'X)^{-1}u )

The average s is approximately 19.8, and the expression sqrt(1 + u'(X'X)^{-1}u), with the covariance evaluated at t=8, is 0.8. This gives a confidence band of ŷ ± 31.6 cfs at t = 8 hours. The covariance matrices of the u's will obviously change as time proceeds and will (hopefully) settle down to some stable level. In this case the band settles down to approximately ±15 cfs after t = 26 hours. For other situations, such as the Squamish River data, the bands will vary radically depending on the arrival of the flash flood conditions. The prediction confidence then depends on the similarity between the historic flood and snowpack conditions and the new target conditions and flows. As has been discussed previously, the arrival of one of these events gives rise to an enormous error in the filter, whose confidence band is also proportional to the new covariance of the signal, which will also be large. Recently a somewhat different type of deconvolution process has been developed that deals precisely with the arrival of these types of events. This minimum entropy technique was put forth by R. Wiggins, 1977.
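The band computation itself reduces to a one-liner. The sketch below is illustrative (the u vector and (X'X)^{-1} are invented; s = 19.8 and t = 2 are the values quoted above): it evaluates the half-width t·s·sqrt(1 + u'(X'X)^{-1}u).

```python
import numpy as np

def half_width(t_val, s, u, XtX_inv):
    """Half-width of the band: t * s * sqrt(1 + u'(X'X)^-1 u)."""
    return t_val * s * np.sqrt(1.0 + u @ XtX_inv @ u)

u = np.array([1.0, 0.5])          # illustrative input vector
XtX_inv = np.eye(2) * 0.01        # illustrative (X'X)^-1
hw = half_width(t_val=2.0, s=19.8, u=u, XtX_inv=XtX_inv)
```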
Considerations

When applying these filters some purely "engineering" criteria should be applied. These break down into two simple categories: those dealing with the external behaviour of the system and those concerned with its internal behaviour. The Wiener formulation of the filter design calls for the solution of the Wiener-Hopf integral equation, where the unknown is the impulse response function of the optimum filter. In the discrete case one simply solves a set of normal equations to obtain the coefficients of this impulse response function. This impulse response describes the external behaviour of the filter, that is, its performance specifications. In most cases this becomes the input to the system design. In discrete filtering, the "RLC-circuit" notion of a filter no longer applies. The physical information is now included in a statistical procedure (the Kalman filter). As a result one can now question the internal behaviour of the system in a systematic manner. The stability and delay characteristics are all a part of the internal representation and cannot be disregarded. The net result of this combination of the two processes is that one specifies the stochastic behaviour of a process that will change over time.

Other Applications

Given the ability to systematically characterize a basin or basins in terms of these filters, one obvious extension of the prediction capability exists. One begins by assuming that the spatial configuration of the problem looks like Figure 12, where 1, 2 and 3 are basins in some arbitrary region. One knows the precipitation and flows for basins 1 and 2's various rivers a, b, c, d, e, f. Consequently, one can define a set of general filters describing 1 and 2. One also knows the precipitation for 3, but the streams have not been gauged (i.e.
g, h, i are not gauged). One can determine from the frequency distributions of rainfall a general frequency distribution for basins 1 and 2. The convolution of Phi_12(f) (the rainfall distribution) with Gamma_12(f), the Fourier transform of the predictive filter, will yield a frequency distribution of flows for 3. Now assume that one has some information on the flows in 3, either as historic data or at some future time. The predictive deconvolution in the Kalman algorithm will correct the behaviour of 3 as initially estimated from 1 and 2.

Another characteristic can be observed from the filters resulting from 1 and 2: the change in the phase lag over time. (The time base here would be in the order of years.) The change in this time lag would monitor the "drift" of the flow with time, or the rate of urbanization. This can be done internally, i.e. for basin 1 or 2 alone, or as a cross comparison between the two regions.

Other Techniques: Compatibility

The compatibility of the various types of estimation techniques has been explored mathematically by numerous groups of individuals. The results by Kalman show that every second order process (that is, a process that takes into account only those stochastic properties determined by its means and variances) has a proper rational spectrum, and this spectrum has a Markov realization. He also shows that the starting assumptions for the Wiener filter and the Kalman filter are the same. The fact that every process defined in this manner has a Markovian realization means that techniques that are autoregressive with a memory of only one time lag (such as Clark's [11] method) produce results that represent limited approximations to the Kalman-Wiener formulation.
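The frequency-domain route to the ungauged basin can be illustrated with the convolution theorem: multiplying the transform of a rainfall record by the transform of a predictive filter is the same as convolving them in time. The rainfall trace and filter below are synthetic stand-ins, not data from the thesis.

```python
import numpy as np

# Hypothetical rainfall trace and unit-response filter for a gauged basin.
rain = np.sin(np.linspace(0, 8 * np.pi, 256)) + 0.1
filt = np.exp(-np.arange(16) / 4.0)

# Phi(f) * Gamma(f): multiplication in frequency = convolution in time.
n = len(rain) + len(filt) - 1
flow_spectrum = np.fft.rfft(rain, n) * np.fft.rfft(filt, n)
flow = np.fft.irfft(flow_spectrum, n)   # predicted flow trace for basin 3

# Direct time-domain convolution, for comparison.
direct = np.convolve(rain, filt)
```

The two routes agree to floating-point precision; the frequency-domain form is the one that lets a flow distribution for basin 3 be read off directly from the rainfall distribution and the filter transform.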
Hence, one could have predictions for basin #3 from some other autoregressive formulation and perform subsequent analysis of the results using the Kalman-Wiener paradigm. This is attractive if one already has estimation techniques established and understood for modelling particular modes of the operation.

Numerical Examples:

Example No. 1

The system of rivers chosen for the first example is from the Surrey basin defined in the map (Fig. 10). It consists of the Salmon and Nicomekl rivers along with Murray Creek. The input signals to this system consist of precipitation data from the stations at Vancouver Airport, White Rock and the Surrey Municipal Hall. For the actual computation of the deconvolution operator, three channels of runoff data from the 1973 flood flows were also included. The lags for the six channels of input data can be visualized in Figure 13.

[Figure 13: precipitation occupies hours 0-72 of the input window; the 1973 antecedent runoff (prior to the 1973 peaks) is shifted +12 hours, occupying hours 12-78]

The assumption here is that the results of the precipitation are lagged by a minimum of 12 hours from the flows, and that the flows are shifted in a corresponding manner. With this set-up one wishes to compute the operator that, when convolved with the signal, will yield the 1973 peak flood flows. One further assumption is made: since both the signal precipitation and the signal antecedent flows are trying to predict the same three channels of information, one can either constrain the output traces to be three in number, or allow them to be redundant, i.e. put in three extra traces. These three extra traces are the same as the original ones. This can be visualized in Figure 14.
[Figure 14: the operator maps six input channels over hours 0-72 (Airport, White Rock and Municipal Hall precipitation; Nicomekl, Salmon and Murray Creek antecedent flows) onto six output channels of 1973 runoff (Nicomekl 1, Salmon 2, Murray Creek 3, repeated)]

Having found this operator for the 1973 precipitation and antecedent flows, the new 1974 precipitation and antecedent flows are substituted and convolved with the operator to predict the 1974 flows, i.e.

(1974 signal) * (1973 operator) -> (1974 peaks)

The results and a sample of the operator are shown in Figures 16, 17 and 18. The computed operator is now read into the Kalman state equations as the matrix representation of x_1, where x_1 is the leading (0-lag) decon operator matrix and Phi, the transition matrix, is formed from it. The Kalman update proceeds in the following manner. After each time step of prediction using the decon operator, the actual flow values for the 1974 flows are read in. The covariance of their agreement is used to update x_1. The new x_1 is then convolved with the signal to produce the new updated flows. This simulates the "on line" situation. The resulting changes are plotted in Figures 18a, b. The Murray Creek trace was consistently overestimated owing to the size of its peaks in comparison to the Salmon and Nicomekl; however, one can see the modification of the fit by the updating mechanism. With sufficient historical data the overestimation can simply be subtracted out as a constant amplitude distortion.

Discussion:

Several of the general characteristics described in Chapters 2, 3 and 4 become immediately visible. First, the realization of the minimum phase characteristic of each of the matrix operators. Plots of the magnitude of the coefficients of the filter and their distribution with lag can be seen in Figures 14a, b, c.
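The "on line" cycle just described (predict with the operator, read in the actual flows, correct the leading term from the agreement) can be sketched as below. This is a simplified scalar-gain illustration, not the thesis's full Kalman update: the signal, operator and gain are all hypothetical.

```python
import numpy as np

def online_update(signal, operator, observed, gain=0.1):
    """At each step, predict by convolving the current operator with the
    recent signal, then correct only the leading (0-lag) coefficient from
    the prediction error. The scalar gain stands in for the Kalman gain."""
    operator = operator.copy()
    m = len(operator)
    preds = []
    for t in range(m - 1, len(signal)):
        window = signal[t - m + 1:t + 1][::-1]   # signal[t], signal[t-1], ...
        pred = operator @ window
        preds.append(pred)
        error = observed[t] - pred
        operator[0] += gain * error * window[0]  # update the leading term only
    return np.array(preds), operator

rng = np.random.default_rng(1)
sig = rng.standard_normal(200)
true_op = np.array([0.9, 0.3, -0.2])      # hypothetical basin operator
obs = np.convolve(sig, true_op)[:200]     # 'actual' flows read in each step
preds, op = online_update(sig, np.array([0.5, 0.3, -0.2]), obs)
```

Starting from a deliberately wrong leading coefficient (0.5 instead of 0.9), the feedback on the 0-lag term pulls it toward the true value as the observed flows arrive, which is the behaviour exploited in Figures 18a, b.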
The leading (zero) lag of the filter has the greatest amplitude and passes most of the information. This is consistent with the minimum phase condition described in Chapter 2. It is also important when one is using the Kalman filter program to update these coefficients. Since the Kalman filter in this application is an error filter, it is applied only to this leading term of the deconvolution operator, on the motivation that this is where most of the information is carried and where most of the error is accounted for. A table of the order of change to this computed operator for the Surrey problem is included in Table 1. In terms of the new information being passed, the error update from the Kalman filter is overlaid on the original Wiener estimate, modulating this smoothed estimate with the pulse of new flow information. This can easily be imagined when comparing the realization of the Wiener fitting (Figures 16, 17, 18) to the change in fit caused by Kalman updating (Figures 18a, b, c).

TABLE 1
CHANGE IN THE LEADING LAG OF THE DECONVOLUTION OPERATOR AS f(t)

t = 8 hours
Channel No.
1    0.3655   0.3926  -0.6841   0.7326  -0.5981  -0.2391
2    0.3223   0.0639  -0.3700   0.1184  -0.4052  -0.0441
3    0.4378  -0.0365  -0.3739   0.0221  -0.5829   0.0964
4    0.4765  -0.1174  -0.3051   0.0005  -0.0213   0.0787
5   -0.1752  -0.4881   0.6094   0.0746   0.3730   0.0218
6   -0.1821  -0.4077   0.6089  -0.0421   0.1774  -0.0189

t = 72 hours
Channel No.
1    0.3644   0.3915  -0.6852   0.7315  -0.5992  -0.2401
2    0.3223   0.0639  -0.3700   0.1184  -0.4052  -0.0445
3    0.4375  -0.0368  -0.3742   0.0218  -0.5832   0.0961
4    0.4764  -0.1175  -0.3052   0.0004  -0.0215   0.0785
5   -0.1740  -0.4869   0.6106   0.0758   0.3742   0.0230
6   -0.1824  -0.4077   0.6080  -0.0420   0.1774  -0.0189

[Figures 14b, 14c: filter coefficient magnitude vs. lag (days), second channel; Figure 18a]
[Figures 18b, 18c]

Secondly, the problems and importance of having a common reference point for all of the channels (rivers) are apparent in the realizations of the seventy-two hour predictions. Murray Creek is very small (in flow volume) compared to either the Nicomekl or Salmon Rivers. As a consequence it rises to peak flood conditions much more quickly than the other two. What one sees, then, is the dissipation of energy from Murray Creek, the peak from the Nicomekl River, and the Salmon River approaching its peak, all lagged by several hours. This set of three different lags can be observed in the plots of the lag in degrees comparing the various channels. This representation shows how far behind the Salmon River is compared to Murray Creek, and how the Nicomekl River starts ahead of the Salmon River in its energy build-up, gradually gets into phase with it at one point, and then out again.

For larger systems of rivers (order >= 10) the phase relations become more complex and the power spectrum (the distribution of the covariance with frequency) becomes more important. The analysis of power spectra is a standard tool in any time series analysis and is of particular interest in this case. The plots in Figures 19, 20 and 21 show the distribution of variance (power) with frequency. These can be readily scaled using logarithms so that confidence intervals can be imposed on the probability of a particular channel exceeding the maximum power for any time period. The logarithmic choice is natural because this type of scaling makes the changes in power proportional. The realization of these confidence bands for the 95% confidence region can be seen in Figures 22, 23 and 24. The spectrum for each channel has been smoothed using a Tukey (p. 91) window. This windows out the spurious (high) frequency oscillations and enhances the persistent frequencies.
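The smoothing step can be sketched as a Blackman-Tukey style estimate: taper the sample autocovariance with a Tukey-Hanning lag window, then transform. The sinusoid-plus-noise trace below is a synthetic stand-in for a flow channel; the window width and grid are arbitrary choices, not those of the thesis.

```python
import numpy as np

def smoothed_spectrum(x, m):
    """Taper the sample autocovariance (lags 0..m) with a Tukey-Hanning lag
    window, then sum the cosine series. This damps spurious high-frequency
    oscillations while keeping the persistent peaks."""
    x = np.asarray(x, float) - np.mean(x)
    n = len(x)
    acov = np.correlate(x, x, "full")[n - 1:n + m] / n          # lags 0..m
    acov = acov * 0.5 * (1 + np.cos(np.pi * np.arange(m + 1) / m))
    freqs = np.linspace(0, 0.5, 64)                             # cycles/step
    spec = np.array([acov[0] + 2 * np.sum(
        acov[1:] * np.cos(2 * np.pi * f * np.arange(1, m + 1)))
        for f in freqs])
    return freqs, spec

t = np.arange(512)
x = np.sin(2 * np.pi * t / 16) + 0.3 * np.random.default_rng(2).standard_normal(512)
freqs, spec = smoothed_spectrum(x, 32)
peak = freqs[np.argmax(spec)]   # persistent period of 16 steps survives smoothing
```

Taking logarithms of `spec` before plotting is what makes the confidence bands of Figures 22-24 constant-width, since the scaling makes changes in power proportional.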
As is apparent from Figures 25, 26 and 27, the confidence intervals are quite wide, so that when one compares the actual and predicted power spectra (using the Wiener filter) the band is wide enough to include the prediction also. This widely used mechanism is a simple test for consistency of prediction and provides an accurate indication of the "predictability" of any given system. It is also a useful measure of the interrelation of the various information channels and points out relationships at different times that otherwise are not apparent. This, together with the phase relations (Fig. 28a, b, c), brings out periodic components residing in the data structure.

The same sort of analysis is valuable when performed on the filter itself. For the case of the "three channel, four-lag filter" used here, the roots of the characteristic polynomial matrix can be analyzed in the following manner. The filter relation is

A_0 x_t + A_1 x_(t-1) + ... + A_m x_(t-m) = y_t,   where here m = 4.

If one performs the Z-transform on the polynomial, one obtains a matrix polynomial in powers of z,

A_0 z^0 + A_1 z^1 + ... + A_m z^m = 0,

which has complex roots describing the eigenvalues of the system. The real parts of the roots describe the rates of decay of the system, and the complex positions the fundamental frequency. A realization of the eigenvalues of this filter is given in Figure 28d. The motivation for their calculation in this problem is to determine a range (in ft/time) describing the rate of energy dissipation by the system, along with the range of return times of the system. The return time, or fundamental frequency, is analogous to the harmonic frequency of a simple pendulum; the difference in this case is that one is dealing with a system of pendula.

[Figure 28a: FILTER CHARACTERISTICS FOR SURREY BASIN 1974]
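The roots of such a matrix polynomial can be found numerically from a block companion matrix of its monic form, as sketched below. This is an illustrative routine, not the thesis's program; coefficients are passed highest power first, in the style of `np.roots`.

```python
import numpy as np

def polynomial_eigenvalues(coeffs):
    """Roots of det(A_m z^m + ... + A_1 z + A_0) = 0, coefficients given
    highest power first. Built from the block companion matrix of the monic
    form: root moduli give decay rates, root angles fundamental frequencies."""
    lead = np.asarray(coeffs[0], float)
    B = [np.linalg.solve(lead, np.asarray(Ak, float)) for Ak in coeffs[1:]]
    p, m = lead.shape[0], len(B)
    C = np.zeros((m * p, m * p))
    C[:-p, p:] = np.eye((m - 1) * p)          # shift (identity) blocks
    for k in range(m):                        # last block row: -B_m ... -B_1
        C[-p:, k * p:(k + 1) * p] = -B[m - 1 - k]
    return np.linalg.eigvals(C)

# Scalar check: z^2 - 3z + 2 = (z - 1)(z - 2), roots 1 and 2.
roots = polynomial_eigenvalues([np.eye(1), -3 * np.eye(1), 2 * np.eye(1)])
```

For the three channel, four lag filter the same call would take five 3x3 coefficient matrices and return twelve eigenvalues, the kind of set realized in Figure 28d.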
[Figure 28d: dissipation rates of 9.348 and 3.394 in./hour; fundamental frequencies of 9.348 and 3.394 hours]

Chapter 7
FINITE STATE MACHINES

This approach of treating the rainfall/runoff problem as an information and feedback control problem is not new; the applications date back several years (see, for example, Young [17]). However, considering the different methodologies employed as a whole, it would appear that little has been done towards implementing the problem as a whole. The Kalman filter model is traditionally used by itself on the problem, without the pre-optimization provided by the Wiener filter; and when the Kalman filter is used, the obvious spectral analysis segments should be exploited too. The vast majority of problems arising from the prediction aspects of both the Wiener and Kalman models are inherently nonlinear in nature. However, the structure of both techniques lends itself to adaptation to a "finite state machine" mode of realization.

To examine briefly the possibility of this form of the structure, a simple analogy is used. Suppose that a machine existed that had local Wiener filters to describe the movement and behaviour of each of a large number of gears. Suppose also that a mechanism existed to switch the focus of attention from section to section, so that one could monitor the change from state to state. If one defines the Kalman formulation as the switching mechanism (not necessarily linear), then the problem can be broken down into a set of these machines that together describe the entire operation. In terms of the RLC circuit used originally, the equivalent operation would be to have n of these simple "irreducible" mechanisms combined to build up an arbitrary finite state machine.
To focus this aspect into the resource management field one need only consider an entire system process for a basin. This system consists of a snowpack (for B.C.), a precipitation system, an infiltration system and finally a runoff system. The focus of this paper has been on a small part of the overall problem (one section of the circuit, one state of the machine). The structure for all of these processes is the same, whether they are nonlinear or linear. Consequently, the formulations and formalisms presented in this thesis are merely case examples for a finite state machine. This extension to finite state machines is discussed in Krohn and in Robinson [7], Ch. 5.

To further define some of the methodology that would be employed, consider the problem where the nonlinear relation x is to be estimated statistically and functionally using two processes A and B. This means that we are looking for A and B so that A*B is the "best" approximation to x. This is by definition nonlinear because it involves multiplying A and B. However, employing this idea of statistical machines, the nonlinearity can be converted into a local linear problem. Assume one has an initial value for A, i.e. A_0, and an estimate for B, which yields B_0. One now has a linear estimation problem: since A_0 is given, the problem is no longer a product AB of unknown functions, but rather A_0 B, in which B is now a linear unknown. This can then be solved using linear methods, which means that at a local level one has constructed a linear problem. Given this information one switches back to the first state with A_1 and a new estimate B_2, and so on. Since this has the features of a contraction mapping, the estimates will converge to a limiting set of values representing the final A and B.
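The alternating scheme above can be sketched concretely. Here the "product to be estimated" is taken, purely for illustration, as a rank-one outer product a b': hold a fixed (the problem is then linear in b), solve by least squares, then swap, exactly the back-and-forth switching described in the text.

```python
import numpy as np

def alternating_fit(X, iters=20):
    """Estimate A*B by alternation: with A fixed the problem is linear in B,
    and vice versa. X is approximated by an outer product a b' (a rank-one
    stand-in for the general A*B of the text)."""
    rng = np.random.default_rng(3)
    a = rng.standard_normal(X.shape[0])    # arbitrary initial value A_0
    for _ in range(iters):
        b = X.T @ a / (a @ a)              # linear least squares for B, A fixed
        a = X @ b / (b @ b)                # then for A, with B fixed
    return a, b

true_a = np.array([1.0, 2.0, 3.0])
true_b = np.array([0.5, -1.0, 2.0, 4.0])
X = np.outer(true_a, true_b)
a, b = alternating_fit(X)
```

On clean data the iteration contracts rapidly, recovering the product a b' = X exactly up to the inherent scale ambiguity between the two factors.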
This is a well known result and readily applies to the estimation of snowmelt given temperature and precipitation records.

The ability to link past and present inputs is also required to produce current outputs. This can be done through an "asynchronous sequential machine". Such a machine does not require a time incrementing procedure to link transitions between different states; a transition can take place following any change in inputs. A synchronous machine, on the other hand, uses a clock to initiate changes. To summarize these procedures the methodology from Robinson [7] is utilized. To put these ideas of using "machines" to link various sections of a problem together, the synthesis mechanism is as follows:

1. Prepare a complete description of the final behaviour by means of a transition table (or graph).
2. Reduce this table to its shortest possible form.
3. Synthesize the resulting network.

Consider the sketch in Figure 31 to represent this network. The network is composed of machines with finite states s_1 and s_2, a finite number of inputs x_1, x_2 and a finite number of outputs y_1, y_2. Let the time be represented by n, where n is an integer. This is a synchronous machine according to our definition: the present output y_n and the present state s_n are functions of the present input x_n and the previous state s_(n-1). In terms of the sketch, one branch is described by s_(n-1) = s_1, x_n = x_2 -> s_n = s_2, y_n = y_2; similarly for the branch from s_(n-1) = s_2 with x_n = x_1, y_n = y_1. This can be summarized in a transition table. Letting s_1 = 0, s_2 = 1, x_1 = 0, x_2 = 1, y_1 = 0 and y_2 = 1, the table reduces to the binary form stated in Figure 32. This is a simplified Kalman formulation.
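A transition table of this kind runs directly as code. The table below is an illustrative binary assignment in the spirit of Figure 32, not the thesis's exact table.

```python
# A minimal synchronous machine: states s1, s2, inputs x1, x2 and outputs
# y1, y2, all coded as 0/1 as in the text. The table maps
# (previous state, input) -> (next state, output); entries are illustrative.
TABLE = {
    (0, 0): (0, 0),
    (0, 1): (1, 1),
    (1, 0): (0, 1),
    (1, 1): (1, 0),
}

def run(inputs, state=0):
    """Step the machine over an input stream, collecting outputs."""
    outputs = []
    for x in inputs:
        state, y = TABLE[(state, x)]
        outputs.append(y)
    return outputs, state

outs, final = run([0, 1, 1, 0])
```

Reducing the table to its shortest form (step 2 of the synthesis mechanism) amounts to merging dictionary keys that always lead to the same (state, output) pair.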
The trick to all of this is determining, from one of these transition graphs, how many feedback loops are necessary and what type of behaviour should be embodied in the function generator. A general formulation from Robinson [7] for blocks of an arbitrary number of channels can be seen in Figure 33, where:

x_n is the input;
y_n is the output, a function of x_n and s_n;
s_n is the present state, = f(x_n, s_(n-D));
s_(n-D) is the previous state;
D is the (variable) delay in changing from one state to another, with s_(n-D) = s_n except when a transition is in progress.

This operator works in the following manner:

1. A change in x_n causes a change in s_n (s_n is the new internal state of the system).
2. s_n changes s_(n-D) after a delay D.
3. The new s_(n-D) changes y_n, with the system entering a new internal state simultaneously with s_(n-D).
4. s_(n-D) can cause a change in s_n just as x_n did.

Thus one has an algorithm to separate nonlinear behaviour into linear filters, and a "machine" to control their global behaviour in a finite manner. In light of this machine structure, the state space (Kalman) formulation takes on a unifying role, linking the choice of filter types to the real time, "on line" situation. One has defined, therefore, a "frame". This "frame" has attributes that define its state (s_1 and s_2 in the example). For this problem, s_1 and s_2 could be the snowpack depletion and the rainfall/runoff problems. Since the transition graph for the example and the two stage physical process are the same, they can be linked in a synchronous manner to a continuous input data stream (x_1, x_2) from satellite observations and recursively processed using selection criteria (arrival type) on the type of filter to be used.
The state space formulation moves the prediction from s_1 to s_2, or from s_(n-D) to s_n. The outputs y_1 and y_2 can be checked for accuracy against the next information reception and changes made accordingly. This composes a very trivial algorithm in ALGOL that allows for recursion.

Example #2:

In conjunction with the evolution towards the finite state machine approach, a second example was performed using the Wiener filters to predict snowpack depletion for the Mission area (see Fig. 29), using an operator computed from McBride snowpack depletion data. These two sets of snowpillow data were "driven" by a signal containing temperature, humidity and a historic record. This type of filter is slightly different, in that there are three input signals and only one output trace. It is called a constrained energy filter and is discussed by Robinson [7], pp. 261-269. The difference in appearance is that the filter weights are now row vectors, and the sum of squares of the output y_t is minimized using Lagrange multipliers. A realization of the agreement, using Mission 1972 temperature and humidity convolved with the McBride 1973 filter, can be seen in Figure 29.

With this result one now has set up the two stages of a "machine". The communication between these state machines can take several forms. The simplest is to consider the rainfall/runoff machine and the snowpack machine separately, i.e. as a pair of time invariant filters, which can be linked by augmenting the input of the second with the output of the first, e.g. finding the flow equivalent of both rain and snow depletion for the area and including the snow-water equivalent as an additional signal.
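A multiple-input, single-output filter of the kind used in Example #2 can be sketched with plain least squares standing in for the Lagrange-multiplier energy constraint; this is an illustration of the structure (three input channels, row-vector weights, one output trace), not the thesis's constrained solver. The driving signals and weights are synthetic.

```python
import numpy as np

def fit_miso_filter(inputs, output, nlags):
    """Fit y_t ~ sum over channels c and lags k of f[c, k] * x_c[t - k]
    by least squares: a plain, unconstrained stand-in for the
    constrained energy filter of the text."""
    n = len(output)
    cols = []
    for x in inputs:
        for k in range(nlags):
            col = np.zeros(n)
            col[k:] = x[:n - k]           # channel x delayed by k steps
            cols.append(col)
    G = np.column_stack(cols)
    f, *_ = np.linalg.lstsq(G, output, rcond=None)
    return f.reshape(len(inputs), nlags)

rng = np.random.default_rng(4)
temp, humid, hist = (rng.standard_normal(300) for _ in range(3))
# Synthetic 'snowmelt' built from known weights, to check recovery.
target = 0.7 * temp + 0.2 * np.r_[0.0, humid[:-1]] - 0.4 * hist
f = fit_miso_filter([temp, humid, hist], target, nlags=2)
```

On noise-free data the row-vector weights are recovered exactly, which is the sense in which the three driving channels jointly determine the single snowmelt trace.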
However, if one were to run this model in an "on line" situation, the update on each of these models would be performed through the Kalman error filter, thus emulating a "continuous" (discrete) feedback system. Such would be the case if one were sensing the snowpack depletion from a satellite track and then merging this with satellite information on precipitation, temperature, etc. The overhead for this would be some type of pattern matching device (program) and a language to monitor all the incoming information before invoking the various filter mechanisms.

[Figure 29]

Example #3:

As a final example of the Wiener filters used as energy constraints and as a state machine, the following case study of floods on the Squamish River was performed. Peculiar to this particular basin is the contribution of rapidly melted snow, caused by a cold front followed by warm turbulent air from a southwest direction (see Figure 30a). The majority of the melting occurs at or about the 5000 ft elevation. This apparently occurs only when the air is practically isothermal from the valley floor to the 5000 ft (850 millibar) range. This condition, when accompanied by heavy rains, gives rise to flood conditions of short duration and high intensity. A realization of the five significant flood years for the month of October can be seen in Fig. 30a. (These particular conditions are in evidence in October and in July, but predominate in October.) For this case one must budget (accumulate) snow at 5000 ft depending on temperature. This results in temperature, precipitation and snowmelt as inputs and the single flood flow as output. Since the record variation is so rapid, the Wiener filter is apt to be out of phase with the actual event.
Consequently, one wishes to exploit one further property of the error behaviour of these constrained energy filters. This characteristic can be argued as follows.

[Figure 30a: flow traces for Squamish floods for October of 1955, 1958, 1963, 1965 and 1966; the time axis is in days, and the marked peaks are flows > 24 x 10^3 cfs]

The Wiener prediction error filter is realized in this case to minimize the output energy due to noise. Consequently, the amplitudes of the filter's output that are due to noise alone are small (least-squares small). If at some time t_0 a pulse (large snowmelt) arrives in the input stream, it cannot be predicted from the past values; therefore, a large error is observed in the predicted output. However, the very fact that this error has occurred is an accurate means of detection. This pulse, it appears, is very close to the fundamental frequency of the system (approximately 3.2 days), and its appearance in the input signal indicates consistently (at least with this data) the appearance of a flood. This phenomenon is an extremely useful tool for situations where spikes are evident in the output traces.

Chapter 8
OTHER CLASSES OF FILTERS

One of the principal problems encountered with Wiener type deconvolution is that although it provides signal compression, it also causes, in a large number of cases, excessive amplification of noise components. As is apparent from the Squamish River floods, the noise and the arrival of large spikes (floods) dominate the traces. This pulse arrival also violates the minimum phase conditions required for a realizable Wiener filter.
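The detection property argued above (prediction errors stay least-squares small for noise alone, so an unpredictable pulse announces itself as one large error) can be sketched as follows. The flow record, pulse size and filter length are all synthetic choices, not Squamish data.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 400
flow = 0.2 * rng.standard_normal(n)   # quiet background flow ('noise')
flow[250] += 8.0                      # an unpredictable snowmelt pulse

# Fit a short one-step prediction filter on the quiet early record.
m = 4
X = np.column_stack([flow[k:200 - m + k] for k in range(m)])
y = flow[m:200]
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

# Prediction errors over the whole record: small for noise, large at the pulse.
pred = np.array([coef @ flow[t - m:t] for t in range(m, n)])
err = flow[m:] - pred
detect_time = m + int(np.argmax(np.abs(err)))
```

The largest error pinpoints the pulse arrival, which is exactly the sense in which the filter's failure, rather than its fit, carries the flood warning.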
The use of the Wiener filter as a signal (seen in the large error) of the arrival of these events does not, however, indicate the amplitudes accurately. Recently (Wiggins, 1977) another class of deconvolution operators has been developed in an attempt to cope with pulse arrivals at varying lags. Since this technique uses kurtosis as its optimizing criterion (as opposed to the Wiener least-squares), only the components contributing large variance are followed. Implicit in this phaseless formulation of the deconvolution process is that the selection of the norm is restricted only by the requirement that it represent a uniform, monotonically increasing distribution. In simple terms this means that one can selectively redistribute the variance contributions of the observed data in any 'suitable' manner. This implies that, as long as the redistribution function increases with increasing variance (up to some defined cut-off point), one can choose any transformation that appears to suit the data. Included in this chapter are two possible reformulations of the deconvolution problem. This type of transformation is thus the direction towards which the entire unfolding process will move as more classes of transformations belonging to this type are uncovered. In addition to this "simplifying" of the structure, the filter weights are vectors and not matrices. This considerably speeds up the calculations and the adaptability to the Kalman formulation.

Minimum Entropy Deconvolution:

As a brief summary of the minimum entropy deconvolution (M.E.D.) idea and its water resource applications, one needs to imagine the situation where the input consists only of a few large spikes separated by different time intervals, while the operator itself remains relatively unchanged.
In terms of the Squamish flood situation, the spikes are pulses of flow caused by the rapid melting of the snowpack at 5000 ft. These arrivals are dependent on the wind direction, temperature and amount of snowpack. The basin (operator) response is considered to be the same under all of these conditions. What the linear operator proposed by Wiggins does is convert these signals to a "simple" appearance, "simple" meaning that the output consists of a few large spikes of "unknown sign and location" which are in turn separated by nearly zero terms. This type of approach maximizes the order of these spikes, or "equivalently minimizes the entropy of the spikes; hence the name minimum entropy deconvolution."

To simplify the behaviour of this process he chooses the kurtosis of the samples. The kurtosis measures the contribution to the tails of the distribution of the samples; in other words, those spikes most deviant from the mean sample. To restate this in terms of the rainfall/runoff context is straightforward. For the situation caused by high intensity rainfall accompanied by rapid melting, a spike arrival is seen to appear in the runoff. The intermediate lesser peaks are of no importance; consequently one can ignore (statistically) their contribution in favour of the extreme (tail) events. In terms of the general autoregressive formulation used throughout, the problem appears as follows:

y_ij = sum over k = 1, ..., NF of f_k x_(i,j-k),   i = 1, ..., NS;  j = 1, ..., NT

where NS is the number of trace elements, NT the number of time periods, and NF the number of filter components. A norm (like the least squares norm) is to be found such that it maximizes the variance. This is denoted a "varimax" norm and has the following form:

V (varimax norm) = sum over i = 1, ..., NS of [ sum_j y_ij^4 / (sum_j y_ij^2)^2 ]

Each term V_i is the sum of the squares of the normalized variances.
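The norm's behaviour is easy to verify numerically; the two toy traces below are illustrative, not data from the thesis.

```python
import numpy as np

def varimax(y):
    """Varimax norm V = sum_i [ sum_j y_ij^4 / (sum_j y_ij^2)^2 ]:
    equal to 1.0 for a trace that is a single spike, and smaller as the
    energy spreads over many similar-sized values."""
    y = np.atleast_2d(y)
    return float(np.sum(np.sum(y ** 4, axis=1) / np.sum(y ** 2, axis=1) ** 2))

single_spike = np.array([0.0, 0.0, 5.0, 0.0])   # one spike: V = 1
two_spikes = np.array([3.0, 0.0, 3.0, 0.0])     # two equal spikes: V = 1/2
```

The norm is insensitive to the spike's amplitude and location, only to how concentrated the trace is, which is exactly the "unknown sign and location" property the text requires.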
The behaviour of this norm is simple: a single output trace which is everywhere zero except for a single spike has V = 1, and the more nonzero similar-sized spikes there are, the smaller V becomes. The derivation by Wiggins of the filter coefficients is straightforward and proceeds as follows. Since V is to be a maximum, dV/df_k = 0 for each k. Setting u_i = sum_j y_ij^2, noting dy_ij/df_k = x_(i,j-k), and differentiating V_i = sum_j y_ij^4 / u_i^2, one obtains on setting the result to zero:

sum_i (1/u_i^2) sum_j y_ij^3 x_(i,j-k) = sum_i (V_i/u_i) sum_j y_ij x_(i,j-k)

Substituting y_ij = sum_l f_l x_(i,j-l) on the right-hand side, this rearranges to the system which Wiggins writes as

R f = g

where R is the autocorrelation matrix, consisting of a weighted sum of the autocorrelations of the input signals, and g is the cross correlation vector, composed of a weighted sum of the cross correlations of the cubed outputs with the inputs. As it stands the system is nonlinear, but it can be solved iteratively with an assumed f. The solution is not unique, but it provides an adequate estimate of the maximum.

It is interesting to note that the effect of cubing a time series is similar to zeroing it, with the exception of a few of the largest values. The filter weights adapt themselves to finding a filter shaped to the spikes of the cubed outputs. From the results it would appear that if the initial guess at f is a spike, then the output (runoff) spikes coincide with the actual flood spikes. The bursts of snowmelt in the input data seem to control the phasing, or delay, of the output events. The way around this is to time shift the outputs after comparing the actual and predicted traces for some similar target data. There are several features of this process that make it attractive for this type of event.
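The iteration can be sketched for a single input trace, dropping the per-trace weights for clarity; this is an illustrative simplification of the R f = g cycle, with a synthetic two-spike "flood" record, not the thesis's multichannel program.

```python
import numpy as np

def med_filter(x, nf, iters=10):
    """Iterate R f = g for one trace: R is the Toeplitz autocorrelation
    matrix of the input, g the cross correlation of the cubed output with
    the input. Following the text, the iteration starts from a spike."""
    f = np.zeros(nf)
    f[0] = 1.0
    ac = np.correlate(x, x, "full")
    mid = len(x) - 1
    R = np.array([[ac[mid + i - j] for j in range(nf)] for i in range(nf)])
    for _ in range(iters):
        y3 = np.convolve(x, f) ** 3               # cubing keeps the big values
        g = np.array([np.sum(y3[k:k + len(x)] * x) for k in range(nf)])
        f = np.linalg.solve(R, g)
        f /= np.linalg.norm(f)                    # overall scale is arbitrary
    return f

# Two 'flood' spikes blurred by a simple two-term wavelet.
spikes = np.zeros(60)
spikes[10], spikes[35] = 1.0, 1.0
x = np.convolve(spikes, np.array([1.0, 0.5]))
f = med_filter(x, nf=7)
sharpened = np.convolve(x, f)
```

The varimax of the sharpened trace exceeds that of the blurred input, showing the iteration climbing toward the norm's maximum, the behaviour seen in the 10-iteration convergence of Figure 34.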
If all of the output traces contain random noise (which is inevitable), then R is proportional to the unit matrix, meaning that the cross correlation vector will have only one non-zero value; the same is true for the final filter. This value is the same as the initial estimate for f. Wiggins demonstrates that the process is quite stable in the presence of large noise, i.e. it leaves white noise unmodified. The most significant departure from the normal deconvolution process is the absence of the minimum phase restriction on the resulting filter; in fact, there are no phasing requirements at all. This behaviour is also quite useful in other respects. Viewed generally, predictive deconvolution filters all tend to amplify noise excessively. The M.E.D. (minimum entropy deconvolution), on the other hand, does just the opposite: while emphasizing the spike behaviour, it quite selectively suppresses bands where the signal to noise ratio is smallest and accentuates the coherent signals. A sample of the type of agreement attainable from these spikey events is included in Figure 31.

As is visible in Figure 33, the largest contribution to the variance is made by the largest spike (at the top right hand corner), while the predominant contribution to the trace is made by the small amplitude flows seen near the origin. The algorithm to compute the weights approaches this maximum in 10 iterations (Figure 34), and the resulting filter configuration is visualized in Figure 33a.

Further classes of transformations on the distribution of y have been investigated by Ooe and Ulrych (1978). The motivation behind the transformation of y can be understood simply in terms of the normalized plots of variance vs. y in Figure 35. The kurtosis distributes the variation in the variance contributions as per Figure 36.
This may, however, be too radical a weighting for anything other than spike inputs. The process for redistributing the rate at which this contribution is accumulated is to introduce a transformation on y such that the resulting shape is determined on the transformed variable z. A suitable normalization is introduced, the whole expression is resubstituted into the varimax formulation, and the result is solved for the set of filter weights f_k.

As an example of this, consider the transformation proposed by Ooe and Ulrych¹⁵:

    z_{ij} = 1 - \exp( -\alpha \, y_{ij}^2 ) , \qquad \alpha = \frac{k}{y_{max}^2}

where k is a constant determining the slope. If one substitutes z into the varimax norm, the norm is evaluated on z rather than on y. The expansion of \partial V / \partial f_k then uses

    \frac{\partial z_{ij}}{\partial f_k} = 2 \alpha \, y_{ij} \, (1 - z_{ij}) \, x_{i,j-k}

from which one obtains a similar form, R f = g, as produced by Wiggins¹³, to solve for f. This is just one of a class of norms possible for emphasizing different aspects of the data structure, depending on conditions. As a comparison between the straight varimax transformation and the "z" transformation, plots of both are given in Figure 36.

Contained in this z transformation is the parameter k, which ultimately determines the shaping of the distribution. Experiments have found a stable value of k to lie between 1.4 and 3. The choice of k amounts to a choice of standard deviation: since z = 1 - \exp(-\alpha y^2) with \alpha = k / y_{max}^2, one selects k so that the implied standard deviation \sigma lies between y_{max}/1.4 and y_{max}/3. This choice rests on the assumption that z is roughly like a normal distribution, so that 3\sigma should approximately include the tails of this distribution (see Figure 38).
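The difference between the two weightings is easy to see numerically. In this small sketch (the sample amplitudes are my own, with k = 2 taken from the stable range 1.4–3 quoted above), the raw varimax weighting cubes the trace, so one huge event swamps the norm, while the z transformation saturates at 1 and lets moderate and large events contribute on a more even footing:

```python
import numpy as np

def z_transform(y, k=2.0):
    """Ooe-Ulrych exponential transformation z = 1 - exp(-a*y^2),
    with a = k / y_max^2; k controls how quickly z saturates."""
    a = k / np.max(np.abs(y))**2
    return 1.0 - np.exp(-a * y**2)

y = np.array([0.1, 0.5, 1.0, 2.0, 10.0])   # event amplitudes, one dominant spike
cubed = y**3                 # varimax weighting: spans six orders of magnitude
z = z_transform(y, k=2.0)    # bounded in [0, 1): the big spike no longer swamps
```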
Chapter 9

CONCLUSIONS

Each of these classes of predictive filters performs well under different sets of conditions; as has been found by trial and error, no one class does it all. What has become most apparent, however, is the applicability of all of these filters to the state-space configuration, and the simplicity of their incorporation into finite state machines. On a modular level the filter (system) characteristics are simply and directly interpretable using standard spectral techniques. This assemblage of tools, in the form of general algorithms, presents a formidable assessment package for a wide variety of multichannel problems. Although they do not represent all possibilities, they do provide a viable means of assessing flood prediction from a multichannel point of view.

BIBLIOGRAPHY

1. Kalman, R.E., "A New Approach to Linear Filtering and Prediction Problems", Trans. ASME, J. of Basic Engineering, March 1960, p. 35-46.
2. Ho, Y.C., J. Math. Analysis and Applications, 1962, 6, p. 152.
3. Graybill, F.A., "An Introduction to Linear Statistical Models", Vol. 1, McGraw-Hill, New York, 1960.
4. Bayless, J.W. and Brigham, E.O., "Applications of Kalman Filters to Continuous Signal Restoration", Geophysics, Vol. 35, 1970, p. 2-23.
5. Wiener, N., "Extrapolation, Interpolation and Smoothing of Stationary Time Series", J. Wiley & Sons, New York, 1949.
6. Hamming, R.W. and Tukey, J.W., "Measuring Noise Colour", Bell Telephone Laboratories, Murray Hill, N.J., 1960.
7. Robinson, E.A., "Multichannel Time Series Analysis with Digital Computer Applications", Holden-Day, San Francisco, 1967.
8. Gelb, A., "Applied Optimal Estimation", M.I.T. Press, 1974, Chapter 4.
9. Gauss, K.F., Werke, Göttingen, Vol. 4, 1821 (collected works, 1873).
10. Crump, N.D., "A Kalman Filter Approach to the Deconvolution of Seismic Signals", Geophysics, 1974, Vol. 39, p. 1-14.
11. Clark, C.O., "Storage and the Unit Hydrograph", Trans. A.S.C.E., Vol. 110, p. 1419-1488.
12. Plackett, R.L., "Principles of Regression Analysis", Clarendon Press, Oxford, 1960.
13. Wiggins, R.A., "Minimum Entropy Deconvolution", Del Mar Technical Associates, P.O. Box 1083, Del Mar, California, 92014.
14. Krohn, K. and Rhodes, J., "Development of New Mathematical Techniques for Biosignal Analysis", Annual Report to N.A.S.A. Contract, Krohn-Rhodes Research Institute, Berkeley, California, 1967.
15. Ooe, M. and Ulrych, T.J., 1978.
16. Jenkins, G.M. and Watts, D.G., "Spectral Analysis and Its Applications", Holden-Day, San Francisco, California, 1968.
17. Young, P., "Recursive Approaches to Time Series Analysis", Institute of Mathematics and Its Applications, Department of Engineering, University of Cambridge, 1974.
Item Metadata

- Title: Application of auto regressive filtering techniques to flood flow prediction
- Creator: Cahoon, Peter G.
- Date Issued: 1978
- Genre: Thesis/Dissertation
- Type: Text
- Language: eng
- Date Available: 2010-02-24
- Provider: Vancouver : University of British Columbia Library
- Rights: For non-commercial purposes only, such as research, private study and education. Additional conditions apply; see Terms of Use https://open.library.ubc.ca/terms_of_use.
- DOI: 10.14288/1.0063016
- URI: http://hdl.handle.net/2429/20880
- Degree: Master of Applied Science - MASc
- Program: Civil Engineering
- Affiliation: Applied Science, Faculty of; Civil Engineering, Department of
- Degree Grantor: University of British Columbia
- Campus: UBCV
- Scholarly Level: Graduate
- Aggregated Source Repository: DSpace