UBC Theses and Dissertations

UBC Theses Logo

UBC Theses and Dissertations

Kalman filter and its application to flow forecasting Ngan, Patricia 1985

You don't seem to have a PDF reader installed, try download the pdf

Item Metadata

Download

Media
[if-you-see-this-DO-NOT-CLICK]
UBC_1985_A1 N52.pdf [ 5.43MB ]
Metadata
JSON: 1.0062573.json
JSON-LD: 1.0062573+ld.json
RDF/XML (Pretty): 1.0062573.xml
RDF/JSON: 1.0062573+rdf.json
Turtle: 1.0062573+rdf-turtle.txt
N-Triples: 1.0062573+rdf-ntriples.txt
Original Record: 1.0062573 +original-record.json
Full Text
1.0062573.txt
Citation
1.0062573.ris

Full Text

KALMAN FILTER AND ITS APPLICATION TO FLOW FORECASTING by PATRICIA NGAN  B. A. S c . , U n i v e r s i t y o f B r i t i s h Columbia,  1981  M. S., C a l i f o r n i a  19&2  Institute  o f Technology,  A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY  in  FACULTY OF GRADUATE STUDIES ( C i v i l Engineering)  We a c c e p t t h i s to  t h e s i s as conforming  the r e q u i r e d s t a n d a r d  THE UNIVERSITY OF BRITISH COLUMBIA April  1985  P a t r i c i a Ngan,  1985  5-6  In p r e s e n t i n g  t h i s t h e s i s i n p a r t i a l f u l f i l m e n t of  requirements f o r an advanced degree a t the  the  University  o f B r i t i s h Columbia, I agree t h a t the L i b r a r y s h a l l make it  f r e e l y a v a i l a b l e f o r reference  and  study.  I further  agree t h a t p e r m i s s i o n f o r e x t e n s i v e copying of t h i s t h e s i s f o r s c h o l a r l y purposes may department or by h i s or her  be granted by the head o f representatives.  my  It i s  understood t h a t copying or p u b l i c a t i o n o f t h i s t h e s i s f o r f i n a n c i a l gain  s h a l l not be allowed without my  permission.  Department o f  6lQvt-  &Nl6t^G:eftlM|  The U n i v e r s i t y o f B r i t i s h Columbia 1956 Main Mall Vancouver, Canada V6T  f^/R-n  1Y3  written  ABSTRACT The Kalman F i l t e r has been a p p l i e d t o many f i e l d s of hydrology,  particularly  i n the area of f l o o d f o r e c a s t i n g .  T h i s r e c u r s i v e e s t i m a t i o n technique  i s based on a  s t a t e - s p a c e approach which combines model d e s c r i p t i o n of a process with data i n f o r m a t i o n , and accounts f o r u n c e r t a i n t i e s i n a h y d r o l o g i c system. T h i s t h e s i s d e a l s with a p p l i c a t i o n s of the Kalman F i l t e r context of streamflow Kalman F i l t e r covariances vector  to ARMAX models i n the  p r e d i c t i o n . Implementation of the  r e q u i r e s s p e c i f i c a t i o n of the n o i s e  (Q, R) and i n i t i a l  c o n d i t i o n s of the s t a t e  (x , P ) . D i f f i c u l t i e s arise 0  0  i n streamflow  a p p l i c a t i o n s because these q u a n t i t i e s a r e o f t e n not known. F o r e c a s t i n g performance of the Kalman F i l t e r i s examined u s i n g s y n t h e t i c flow data, generated values f o r the i n i t i a l  with chosen  s t a t e vector and the n o i s e  c o v a r i a n c e s . An ARMAX model i s c a s t i n t o s t a t e - s p a c e  form  with the c o e f f i c i e n t s as the s t a t e v e c t o r . S e n s i t i v i t y of the flow f o r e c a s t s to s p e c i f i c a t i o n of x , P , Q, R, 0  may be d i f f e r e n t The  filter's  0  (which  from the generation v a l u e s ) i s examined.  f o r e c a s t i n g performance i s mainly  a f f e c t e d by  the combined s p e c i f i c a t i o n of Q and R. When both  noise  c o v a r i a n c e s a r e unknown, they should be s p e c i f i e d  relatively  l a r g e i n order t o achieve a reasonable f o r e c a s t i n g performance. S p e c i f i f y i n g Q too small and R too l a r g e should be avoided as i t r e s u l t s i n poor flow f o r e c a s t s .  i i  The  f i l t e r ' s performance i s a l s o examined u s i n g a c t u a l  flow data  from a l a r g e r i v e r , whose behavior  changes slowly  with time. Three simple ARMAX models a r e used f o r t h i s i n v e s t i g a t i o n . Although  there a r e d i f f e r e n t ways of w r i t i n g  the ARMAX model i n state-space  form, i t i s found  that the  best f o r e c a s t i n g scheme i s t o model the ARMAX c o e f f i c i e n t s as the s t a t e v e c t o r . Under t h i s f o r m u l a t i o n , the Kalman Filter  i s used t o give r e c u r s i v e estimates of the  c o e f f i c i e n t s . Hence flow p r e d i c t i o n s can be r e v i s e d a t each time step with the l a t e s t  s t a t e estimate. T h i s  a l s o has the feature that i n i t i a l  formulation  values of the ARMAX  c o e f f i c i e n t s need not be known a c c u r a t e l y . The  noise v a r i a n c e s of each of the three models a r e  estimated  by the method of maximum l i k e l i h o o d , whereby the  likelihood  f u n c t i o n i s evaluated  i n n o v a t i o n s . Analyses considered  i n terms of the  of flow data  in this thesis,  f o r the s t a t i o n s  i n d i c a t e that the v a r i a n c e of the  measurement e r r o r i s p r o p o r t i o n a l t o the square of the flow. In p r a c t i c e , flow p r e d i c t i o n s s e v e r a l time steps i n advance a r e o f t e n r e q u i r e d . For a u t o r e g r e s s i v e  processes,  t h i s i n v o l v e s unknown elements i n the system matrix  H of the  Kalman model. The Kalman a l g o r i t h m underestimates the v a r i a n c e of the f o r e c a s t e r r o r i f H and x a r e both unknown. For the AR(1)  model, a general e x p r e s s i o n  f o r the mean  square e r r o r of the f o r e c a s t i s developed.  I t i s shown that  the formula  f o r the case  reduces t o the Kalman equation  where the system matrix  i s known. The importance of t h i s  formula  i s realized in forecasting  s i t u a t i o n s where  management d e c i s i o n s depend on the r e l i a b i l i t y predictions,  r e f l e c t e d by t h e i r mean square  iv  of flow  errors.  Table of  Contents  Chapter  Page  ABSTRACT  i i  LIST OF TABLES  x  LIST OF FIGURES  xii  ACKNOWLEDGEMENTS  xii i  1.  2.  INTRODUCTION  1  1.1  Problem d e f i n i t i o n  1  1.2  General O b j e c t i v e s  2  1.3  General T h e s i s O u t l i n e  6  LITERATURE REVIEW  8  2.1  E x p l a n a t i o n of the Kalman F i l t e r  8  2.2  General A p p l i c a t i o n s  12  2.3  Hydrologic A p p l i c a t i o n s  13  2.4  Problems i n h y d r o l o g i c a p p l i c a t i o n s  14  2.4.1  Unknown dynamics  14  2.4.2  E s t i m a t i o n of S t a t e system matrix  15  2.4.3  Unknown Observation  16  2.4.4  E s t i m a t i o n of the noise c o v a r i a n c e s  2.5 3.  system matrix  D e t a i l e d t h e s i s o b j e c t i v e s and o u t l i n e  16 19  SENSITIVITY ANALYSIS  22  3.1  Experimental  Plan  23  3.2  Experimental  Procedure  25  3.2.1  Generation  of Flow Data  3.2.2  Kalman F i l t e r  3.2.3  F a c t o r i a l Experiment  28  3.2.4  A n a l y s i s of V a r i a n c e  32  Specifications  v  25 27  3.3  3.4  3.5 4.  5.  -Results  34  3.3.1  34  ANOVA R e s u l t s  3.3.2 Two-dimensional Graphs  36  3.3.3  42  Three-dimensional  D i s c u s s i o n of R e s u l t s  44  3.4.1  44  ANOVA R e s u l t s  3.4.2 I n t e r p r e t a t i o n of Graphs  45  Summary  45  HYDROLOGIC SYSTEMS  48  4.1  T i m e - i n v a r i a n t Systems  49  4.2  Three ARMAX Models  51  4.3  Scope of A p p l i c a t i o n s  53  ESTIMATION OF MEASUREMENT NOISE VARIANCE  54  5.1  The E s t i m a t i o n Method  54  5.2  E s t i m a t i o n Procedure  56  5.3  Transformation of flow data to achieve Uniform Variance  57  5.3.1  5.4  R e l a t i o n between the Noise V a r i a n c e and the Flow  58  5.3.2  Results  59  5.3.3  Conclusions  60  5.3.4  Box-Cox Transforms  60  Maximum L i k e l i h o o d E s t i m a t i o n of the Noise Var iance  5.5 6.  plot  •  62  5.4.1  Theory  62  5.4.2  Results  64  Summary  65  APPLICATION: AR(1)  67 vi  6.1  schemes  68  6.1.1 P r o p e r t i e s of Scheme 1  68  6.1.2 P r o p e r t i e s of Scheme 2  69  6.1.3 P r o p e r t i e s of Scheme 3  69  6.2  General D i s c u s s i o n  72  6.3  Experimental Procedure  73  6.4  Results  75  6.5  6.6 7.  D e s c r i p t i o n of the F o r e c a s t i n g  6.4.1 Estimate  of AR c o e f f i c i e n t  75  6.4.2 Estimate  of the noise v a r i a n c e  76  6.4.3 Performance I n d i c a t o r s f o r Scheme 1  76  6.4.4 Performance I n d i c a t o r s f o r Scheme 2  78  6.4.5 Performance I n d i c a t o r s f o r Scheme 3  80  Discussion  83  of R e s u l t s .  6.5.1 F o r e c a s t i n g Performance of Scheme 1  83  6.5.2 F o r e c a s t i n g Performance of Scheme 2  84  6.5.3 F o r e c a s t i n g Performance of Scheme 3  84  Summary  87  APPLICATION: TRANSFER FUNCTION MODEL  88  7.1  D e s c r i p t i o n of F o r e c a s t i n g Schemes  90  7.1.1 P r o p e r t i e s of Scheme 1  90  7.1.2 P r o p e r t i e s of Scheme 2  90  7.2  General D i s c u s s i o n  91  7.3  Experimental Procedure  92  7.4  Results  93  7.4.1 Estimate  of the model c o e f f i c i e n t s  93  7.4.2 Estimate  of the noise v a r i a n c e  94  7.4.3 Performance I n d i c a t o r s f o r Scheme 1 vii  95  7.4.4 Performance I n d i c a t o r s f o r Scheme 2  97  7.5  D i s c u s s i o n of R e s u l t s  99  7.6  Summary  100  APPLICATION: ARMAX MODEL  101  8.1  P r o p e r t i e s of the F o r e c a s t i n g Scheme  102  8.2  General D i s c u s s i o n  103  8.3  Experimental Procedure  104  8.4  Results  105  8.4.1 Estimate of the noise v a r i a n c e  105  8.4.2 Performance i n d i c a t o r s f o r the ARMAX model  106  8.4.3 Comparison of 1-step Performance Indicators  108  8.4.4 Comparison of 2-step Performance Indicators  109  D i s c u s s i o n of R e s u l t s  111'  8.5.1 Performance of the ARMAX model  111  8.5.2 Comparison  of 1-step performance  111  8.5.3 Comparison  of 2-step performance  111  8.5  8. 6  Summary  113  VARIANCE OF THE FORECAST ERROR  114  9.1  Background  115  9.2  Problem d e f i n i t i o n  115  9.3  Covariances of Products of normal random variables  117  9.4  Variance of the f o r e c a s t e r r o r f o r the s c a l a r case 118  9.5  Illustration AR(1)  9.6  of the v a r i a n c e formula f o r the 120  Conclusions  122 vi i i  10.  SUMMARY AND CONCLUSIONS  125  10.1 Summary of T h e s i s C o n t r i b u t i o n s  126  10.2 S e n s i t i v i t y  129  Analysis  10.3 Maximum L i k e l i h o o d E s t i m a t i o n of the noise variance  130  10.4 The A u t o r e g r e s s i v e Model  130  10.5 T r a n s f e r F u n c t i o n Model  132  10.6 Combined ARMAX Model  133  10.7 V a r i a n c e  134  of the f o r e c a s t e r r o r  10.8 State-space  Formulation  10.9 Future D i r e c t i o n s  135 135  BIBLIOGRAPHY  137  ix  LIST OF TABLES  Table No.  Title  Page  3.1  Values of x * , Q*, R* used generation  3.2  Schematic  3.3  Number of L e v e l s f o r the F a c t o r s  28  3.4  L e v e l s of the Noise Covariances  29  3.5  Computer Output - ANOVA R e s u l t s f o r CODE 7  35  5.1  Values of the slope f o r years  60  5.2  Maximum L i k e l i h o o d Estimates of the Noise Var iance  64  5.3  Averaged Maximum L i k e l i h o o d Estimate of the Noise Variance  66  6.1  P r o p e r t i e s of the f o r e c a s t i n g schemes  71  6.2  Least Squares Estimates of the autoregressive c o e f f i c i e n t  75  6.3  Maximum L i k e l i h o o d Estimates of the noise variance  76  6.4  Range of values f o r input parameters  77  6.5a  Values of PI, f o r Scheme 1  77  6.5b  Values of P I  2  f o r Scheme 1  77  6.5c  Values of P I  3  f o r Scheme 1  78  6.6  Range of values f o r a  78  6.7a  Values of PI, f o r Scheme 2  79  6.7b  Values of P I  2  f o r Scheme 2  79  6.7c  Values of P I  3  f o r Scheme 3  80  6.8  Range of values f o r input parameters  0  Coding  i n flow  Scheme  x  26 27  1970-1979  81  6.9a  Values  of PI, f o r Scheme 3  81  6.9b  Values  of P I  f o r Scheme 3  82  6.9c  Values  of P I  f o r Scheme 3  82  7.1  Least Squares Estimates coefficient  7.2  Maximum L i k e l i h o o d Estimates variance  7.3  Performance I n d i c a t o r s f o r Scheme 1  95  7.4  Performance I n d i c a t o r s f o r the AR(1) model  96  7.5  S e n s i t i v i t y of the Performance I n d i c a t o r s to the model c o e f f i c i e n t  96  7.6  Performance I n d i c a t o r s f o r Scheme 2  97  7.7  Range of Values  98  7.8  Performance I n d i c a t o r s f o r Scheme 2  8.1  Maximum L i k e l i h o o d Estimates var iance  8.2a  V a r i a t i o n of PI, t o noise s p e c i f i c a t i o n  106  8.2b  V a r i a t i o n of P I  t o noise s p e c i f i c a t i o n  106  8.2c  V a r i a t i o n of P I t o noise s p e c i f i c a t i o n  107  8.3a  V a r i a t i o n of P I , t o noise s p e c i f i c a t i o n  107  8.3b  V a r i a t i o n of P I t o n o i s e s p e c i f i c a t i o n  107  8.3c  V a r i a t i o n of P I  t o noise s p e c i f i c a t i o n  108  8.4  Values  of PI f o r ARMAX and AR(1) models  109  8.5  Values of PI f o r ARMAX and T r a n s f e r F u n c t i o n models  2  3  of the model of the noise  f o r input parameters  2  of the noise  3  2  3  xi  94 95  98 105  110  LIST OF FIGURES  F i g . No.  Title  l o c a t i o n of the study a r e a .  Page  1.1  Geographic  1.2  Outflow  2.1  A Schematic r e p r e s e n t a t i o n of the Kalman F i l t e r procedure a t each time s t e p .  11  2.2  S e n s i t i v i t y of the s t a t e estimates t o m i s - s p e c i f i c a t i o n s of n o i s e standard d e v i a t i o n s . ( A f t e r Mehra, 1968)  14  3.1  Schematic diagram Procedure.  33  3.2a  RRMS with r e s p e c t t o Q (CODES 1-4)  38  3.2b  RRMS with r e s p e c t t o R (CODES 1-4)  38  3.3a  RRMS with r e s p e c t t o Q (CODES 5-8)  39  3.3b  RRMS with r e s p e c t t o R (CODES 5-8)  39  3.4a  RRMS with r e s p e c t t o Q (CODES 9-10)  40  3.4b'  RRMS with r e s p e c t t o R (CODES 9-10)  40  3.4a  RRMS with r e s p e c t t o Q (CODES 11-12)  41  3.4b  RRMS with r e s p e c t t o R (CODES 11-12)  41  3.5  Response Surface - RRMS with respect t o QR speci f i c a t i o n s  43  4.1  A 'data window' approach t o s l i g h t l y t i m e - i n v a r i a n t h y d r o l o g i c systems.  50  hydrograph  at Hope, B.C.  of Experimental  xii  4 5  ACKNOWLEDGEMENTS I wish t o thank Dr. S.O. (Denis) R u s s e l l Civil  (Department of  E n g i n e e r i n g ) , and Dr. P i e t de Jong ( F a c u l t y of  Commerce) f o r t h e i r v a l u a b l e guidance and advice throughout t h i s r e s e a r c h . The knowledge  I have gained i s deeply  a p p r e c i a t e d , and the experience has been most rewarding. S i n c e r e thanks go t o Dr. M.C. Quick (Department of Civil  Engineering) and Dr. W. Caselton  (Department of C i v i l  Enginnering) f o r t h e i r comments on streamflow modelling d u r i n g t h i s work. A p p r e c i a t i o n  i s given t o Mr. R.G. Boals  (Environment Canada, Water Resources Branch, Ottawa), and Mr. O l i v e r Nagy (Water Survey of Canada, Vancouver) f o r t h e i r prompt are  a t t e n t i o n t o the flow data r e q u i s i t i o n .  a l s o extended t o Dr. Glen Cooper  UBC)  and Mr. Gordon F i n l a y  Thanks  (Computing Centre at  (Department of C i v i l  Engineering,  Graphics Lab) f o r t h e i r a s s i s t a n c e i n the computing aspects of  this  research.  I am g r a t e f u l t o my f r i e n d s f o r t h e i r comments throughtout t h i s  thesis.  F i n a l l y I wish t o thank my mother encouragement,  informative  f o r her support and  without which t h i s work would not have been  possible.  xi i i  1. INTRODUCTION Flow f o r e c a s t i n g i s an important  aspect  i n the o p e r a t i o n and  c o n t r o l of water resource systems. S t a t i s t i c a l  models are  used e x t e n s i v e l y i n the s t o c h a s t i c modelling of r i v e r f l o w s . Adaptive at  f o r e c a s t i n g occurs when p r e d i c t i o n s can be updated  each time step i n response  Kalman F i l t e r  t o incoming  o b s e r v a t i o n s . The  i s an example of an adaptive f o r e c a s t i n g  scheme. T h i s e s t i m a t i o n method i s based on a l i n e a r s t a t e - s p a c e model. Because time  s e r i e s or r e g r e s s i o n models  are amenable t o state-space f o r m u l a t i o n , they a r e p r e f e r r e d over c o n c e p t u a l models such as SWMM, HEC I, and HEC II i n adaptive h y d r o l o g i c f o r e c a s t i n g . A p p l i c a t i o n s of the Kalman Filter  have been very s u c c e s s f u l i n communications and  aerospace and  engineering, f i e l d s  the governing  However, such  i n which the system dynamics  p h y s i c a l equations are w e l l known.  i s not the case  i n streamflow  The performance of the Kalman F i l t e r  applications.  may be g r e a t l y a f f e c t e d  when these equations and dynamics a r e unknown.  1.1 Problem  definition  Although it  is briefly  the Kalman F i l t e r d e f i n e d here  problem addressed  i s e x p l a i n e d i n Chapter 2,  i n order t o s t a t e the g e n e r a l  i n t h i s t h e s i s . The l i n e a r  state-space  model, sometimes known as the Gauss-Markov model c o n s i s t s of two  equations:  State eqn:  x  fc  =• *  t  21^-1  1  +  -t  w ^(0,Q) fc  1.1  2  Observation  egn:  y_ = H t  fc  v/n/(0,R)  x_ + Y_ t  t  1.2  The  o b j e c t i v e i s t o r e c u r s i v e l y estimate the s t a t e v e c t o r ,  x ,  based on the c u r r e n t o b s e r v a t i o n s , y_ . The Kalman  fc  t  a l g o r i t h m p r o v i d e s a method whereby s t a t e estimates a r e c o n t i n u a l l y updated as new o b s e r v a t i o n s a r e r e c e i v e d a t each time step. The s t a t e vector i s c o r r u p t e d by white noise, w.A>(0,Q) with c o v a r i a n c e matrix Q. S i m i l a r l y , the vector of t  observations  i s c o r r u p t e d by white noise v^/(0,R)  covariance matrix R. The a l g o r i t h m assumes t h a t R are known a t time t . A f i l t e r e d the Kalman F i l t e r a t every  estimate  x  with H, Q, and  i s given by  time s t e p . In a d d i t i o n , the mean  square e r r o r f o r the s t a t e estimate  i s g i v e n . A problem  which o f t e n a r i s e s i n p r a c t i c e i s that of unknown dynamics; e i t h e r system matrices O'Connell and C l a r k e o f t e n overlooked  H or noise c o v a r i a n c e s Q , R. As  (1981) p o i n t out, t h i s d i f f i c u l t y i s  i n a p p l i c a t i o n s even though the s t a t e  e s t i m a t i o n procedure depends c r i t i c a l l y on the proper s p e c i f i c a t i o n of dynamics.  1.2  General  Objectives  T h i s r e s e a r c h work deals with the problem of unknown dynamics. ARMAX models f o r d e s c r i b i n g streamflow phenomenon are c o n s i d e r e d  i n t h i s t h e s i s . These are time s e r i e s models  with exogeneous i n p u t s .  3 The p r a c t i c a l a p p l i c a t i o n used motivated  by f l o o d f o r e c a s t i n g  streamflow station River,  The chosen problem i s  i s l o c a t e d near the downstream p o r t i o n of the F r a s e r (see F i g . 1.1) 2  3  i n B.C.  p r e d i c t i o n f o r the F r a s e r River at Hope. T h i s  (217,000 km ), m /s.  in t h i s t h e s i s i s  As a r e s u l t of the l a r g e drainage  flow magnitudes are i n the order of thousands  Peak flow u s u a l l y o c c u r s i n June, with a t y p i c a l 3  of 5000 m /s.  area  F i g . 1.2  g i v e s an outflow hydrograph  for  value 1983.  Station Spences  S t a t i o n at Texas Creek  near Bridge  /Thompson \ \ River /  W. Vancouver' .TOLL  CAMtOlA  ISIA*0'  "A*  ""VANCOUVER O Richmond  ^ VAIOU  n  'O  1  Fig.  i n = 4 5 km  1.1  = 28  mi  Geographic  location  of the study  area.  OUTFLOW HYDROGRAPH 8000 I  Time  Fig.  1. 2  Outflow  hydrograph  a t Hope,  B.  6  In t h i s t h e s i s , ARMAX models are c a s t i n t o the state-space format with  the c o e f f i c i e n t s as the s t a t e v e c t o r . In  h y d r o l o g i c a p p l i c a t i o n s , f u t u r e flow  i s the q u a n t i t y of  i n t e r e s t . F o r e c a s t i n g performance of the f i l t e r in terms of the o b s e r v a t i o n The 1.  general  i s measured  forecast error (y-y).  o b j e c t i v e s of the t h e s i s a r e :  t o examine the s e n s i t i v i t y of the Kalman F i l t e r  with  respect  t o m i s - s p e c i f i c a t i o n s of noise c o v a r i a n c e s and  initial  c o n d i t i o n s of the s t a t e v e c t o r . These inputs a r e  r e q u i r e d by the Kalman a l g o r i t h m ,  but a r e o f t e n unknown  in p r a c t i c e . 2.  t o i n v e s t i g a t e the maximum l i k e l i h o o d method f o r estimating practical  3.  the noise c o v a r i a n c e s  i n h y d r o l o g i c models of  interest.  t o compare the f o r e c a s t i n g performance of ARMAX models depending on whether the Kalman F i l t e r  1.3 General T h e s i s The  Outline  next chapter  F i l t e r . Included  i s used or not.  reviews the l i t e r a t u r e on the Kalman  i n t h i s review a r e a p p l i c a t i o n s and the  problems which a r i s e i n p r a c t i c e . Chapter 3 i n v e s t i g a t e s the s e n s i t i v i t y of the f i l t e r  with  respect  t o input q u a n t i t i e s ,  which a r e assumed known by the a l g o r i t h m . considers  Chapter 4  three s p e c i a l cases of the ARMAX models which are  most o f t e n used i n streamflow p r e d i c t i o n s . In subsequent chapters,  the research d e a l s with  Chapter 5 d e s c r i b e s  these three models.  the maximum l i k e l i h o o d method f o r  7 e s t i m a t i n g the noise v a r i a n c e . Chapters 6, 7, and  8 examine  the f o r e c a s t i n g performance of the three models, depending on whether or not the Kalman F i l t e r  i s used. Often i n  f o r e c a s t i n g , the standard d e v i a t i o n of the p r e d i c t i o n i s r e q u i r e d ; but there are many cases where the e x p r e s s i o n f o r the v a r i a n c e given by the Kalman F i l t e r  i s not a p p l i c a b l e .  Chapter 9 d i s c u s s e s t h i s i n d e t a i l and a t h e o r e t i c a l expression  i s presented  e r r o r . The  importance of t h i s i s i l l u s t r a t e d  F i n a l l y , chapter  f o r the v a r i a n c e of the f o r e c a s t i n an example.  10 summarizes the r e s u l t s of t h i s t h e s i s  and g i v e s recommendations f o r f u t u r e r e s e a r c h .  2. LITERATURE REVIEW There are f i v e s e c t i o n s to t h i s c h a p t e r . F i r s t , Filter  i s d e s c r i b e d i n d e t a i l . General  the Kalman  a p p l i c a t i o n s are  given i n s e c t i o n 2 and h y d r o l o g i c examples i n s e c t i o n 3. P r a c t i c a l problems i n the implementation d i s c u s s e d i n s e c t i o n 4. F i n a l l y , d e t a i l the s p e c i f i c  of the f i l t e r  are  section 5 o u t l i n e s in  t o p i c s addressed  i n subsequent c h a p t e r s .  2.1 E x p l a n a t i o n of the Kalman F i l t e r The  f o l l o w i n g n o t a t i o n i s used i n t h i s t h e s i s . V e c t o r s  are denoted by u n d e r l i n e d , small l e t t e r s ; and m a t r i c e s denoted by c a p i t a l The  letters.  l i n e a r state-space model, sometimes known as  Gauss-Markov model c o n s i s t s of two  State eqn:  The  x  elements of x  are  fc  = $ Xfi + w  equations:  w /vR(0,Q)  are the s t a t e v a r i a b l e s to be  fc  the  2.1  estimated,  but they need not be p h y s i c a l l y measurable. These s t a t e s are modelled  as a Markov p r o c e s s . The manner i n which they  evolve through m a t r i x . The  time  i s given by 3>, the s t a t e t r a n s i t i o n  s t a t e s are c o r r u p t e d by white n o i s e , w  d i s t r i b u t e d with mean 0 and c o v a r i a n c e matrix Q. $ and Q can be t i m e - v a r y i n g , but are assumed known at a l l times.  Observation  eqn:  Y_  fc  = H x  fc  + y_  t  8  v^H{0  ,R)  2.2  9 Measurements taken a t time t are r e l a t e d t o the s t a t e s of the system through H. They a r e a l s o c o r r u p t e d by white n o i s e , y_ d i s t r i b u t e d with mean 0 and c o v a r i a n c e matrix R. t  S i m i l a r l y , H and R may be t i m e - v a r y i n g , but a r e assumed known at a l l times. Moreover, w  fc  and v  fc  are assumed t o be  s e r i a l l y and mutually u n c o r r e l a t e d . For the Kalman model c o n s i s t i n g of eqn. 2.1 and 2.2, the o b j e c t i v e i s to estimate the s t a t e v e c t o r at c u r r e n t time, x for  fc  given a l l past measurements. O b t a i n i n g an estimate  the s t a t e a t the c u r r e n t time, i s known as f i l t e r i n g . In  particular,  i f a l l past i n f o r m a t i o n can be summarized i n a  p r i o r estimate f o r the s t a t e v e c t o r , t h i s type of e s t i m a t i o n i s known as r e c u r s i v e f i l t e r i n g . The Kalman F i l t e r r e c u r s i v e procedure  which y i e l d s estimates f o r x  time s t e p . T h i s f i l t e r i n g  fc  isa at every  technique can be given a Bayesian  i n t e r p r e t a t i o n . P o s t e r i o r estimates of the s t a t e v a r i a b l e s are obtained by updating t h e i r p r i o r estimates through the measurements r e c e i v e d at time t . However, Bayesian technique per se, as o f t e n quoted  i t i s not a i n the  l i t e r a t u r e , d e s p i t e t h i s d e c i s i o n t h e o r e t i c approach. In fact, x  t  (the ' p o s t e r i o r ' or updated estimate) i s d e r i v e d  from a l e a s t squares c r i t e r i o n by minimizing the r e s i d u a l of the s t a t e e s t i m a t i o n v e c t o r . The Kalman F i l t e r  i s a solution  which y i e l d s a l i n e a r , minimum v a r i a n c e e s t i m a t o r f o r the s t a t e v e c t o r at every time s t e p . In a d d i t i o n , the e r r o r c o v a r i a n c e matrix P , a s s o c i a t e d with x., i s g i v e n .  10  The Kalman F i l t e r  a l g o r i t h m can be d i v i d e d i n t o two  pa r t s: 1.  2.  P r i o r to observing y_ -t/t-1  =  *t-1 -t-1  P  =  *t-1 t - 1 * ' t - 1  t/t-1  2  P  +  Q  2  t  ,  3  4  '  A f t e r observing y_ , t  it P  =  +  h/t-i  t  K  [  t *t  = [I - K  H ]  = P ^ ^  H'  fc  where K  fc  - £  ]  2  t  -  5  2.6  t  t  i ^  t  /  t  ^ » \  +  1  * ]'  2.7  t  The 1-step f o r e c a s t f o r the o b s e r v a t i o n i s : £t+1  =  H  2  A+l/t  The Kalman F i l t e r  accounts  '  8  f o r u n c e r t a i n t i e s by  p r o v i d i n g an a l g o r i t h m which s e q u e n t i a l l y combines model ( s t a t e equation) to  and data  (measurement equation)  y i e l d updated estimates  estimates  information  of the s t a t e v e c t o r . These  can be p r o j e c t e d forward  to o b t a i n f u t u r e  p r e d i c t i o n s of the o b s e r v a t i o n s . I t i s c o m p u t a t i o n a l l y a p p e a l i n g due t o i t s r e c u r s i v e nature, information  i s contained  A schematic i n F i g . 2.1.  i . e . a l l previous  i n the p r i o r estimate of the s t a t e .  r e p r e s e n t a t i o n of t h i s procedure i s shown  11  State equation at  ~ Q  time t-1  1  Observations J  • at time t  R  i  Fig.  2.1  A Schematic r e p r e s e n t a t i o n of the Kalman F i l t e r p r o c e d u r e at each t i m e  step.  12 2.2  General  Applications  Application literature  of the Kalman F i l t e r  i s widespread, and the  spans over many d i s c i p l i n e s . I t i s used  particularly  in forecasting  s i t u a t i o n s . The c l a s s i c  example  i s from aerospace e n g i n e e r i n g . The problem i s t o c o n t i n u a l l y estimate the p o s i t i o n , v e l o c i t y and a c c e l e r a t i o n in Cartesian  co-ordinates,  upon r e c e i v i n g n o i s e c o r r u p t e d  measurements from a radar i n p o l a r c o - o r d i n a t e s . Kalman F i l t e r i n g has been a p p l i e d coefficients 1979). T h i s  of a t a r g e t  In f i n a n c e ,  to estimate the r e g r e s s i o n  i n a model f o r stock earnings per share technique has a l s o been a p p l i e d  (Mehra,  i n water q u a l i t y  p h y s i c a l models i n v o l v i n g BOD-DO equations. Examples of t h i s are  found i n the work by Young and Whitehead  (1977),  Constable and McBean (1979). However, n o n - l i n e a r i t i e s i n these f o r m u l a t i o n s lead to the use of the extended Kalman Filter  which i s not the subject  The  Kalman F i l t e r  has a l s o been i l l u s t r a t e d  a s p e c t s of the h y d r o l o g i c successfully applied  of the t h e s i s .  p r o c e s s . Bras (1978) has  t h i s technique to sampling network  d e s i g n . An example of e s t i m a t i n g  groundwater  c h a r a c t e r i s t i c s i s given by McLaughlin dimensionality  i n many  basin  (1978). However, the  of groundwater models i s l a r g e , as many  parameters need to be e s t i m a t e d . Computational e f f i c i e n c y and  the problem of j o i n t  two  major d i f f i c u l t i e s This  operation  filtering  s t a t e and parameter e s t i m a t i o n are  (Wilson e t a l . , 1978).  technique has a l s o been used i n the  of water resource systems. An example i s the  13 maintaining et  of water l e v e l s for n a v i g a t i o n purposes. Duong  a l . ( l 9 7 8 ) a p p l i e d the Kalman F i l t e r  estimates  t o o b t a i n r e a l time  of system parameters r e q u i r e d f o r c o n t r o l l i n g a  lock and dam gate.  2.3  Hydrologic Applications The  most abundant a p p l i c a t i o n s of t h i s  technique  estimation  a r e i n streamflow f o r e c a s t i n g . In many s i t u a t i o n s ,  the u n d e r l y i n g process  can be d e s c r i b e d by a s t a t i s t i c a l  model such as a r e g r e s s i o n or time s e r i e s model. In these cases,  the Kalman F i l t e r  i s a u s e f u l supplement t o  hydrologic forecasting. The  r a i n f a l l - r u n o f f process  autoregressive two  i s o f t e n modelled by an  scheme p l u s some input i n f o r m a t i o n . There a r e  ways of formulating t h i s type of h y d r o l o g i c model i n t o  the s t a t e - s p a c e formulated  n o t a t i o n . Rodriguez-Iturbe  (1978) has  the problem with the c o e f f i c i e n t s of an ARMA  model as the v a r i a b l e s of i n t e r e s t . T h i s i s a l s o shown i n an example by Wood et a l . , (1978). The instantaneous hydrograph, IUH represented  by a c o n v o l u t i o n  unit  integral,  is a  c l a s s i c example of such a f o r e c a s t i n g scheme. For examples, see Rodriguez-Iturbe  (1978), S z o l l o s i - N a g y  A l t e r n a t i v e l y , the d i s c h a r g e s  (1976).  can be taken as the s t a t e s ,  but the unknown ARMAX parameters then need t o be estimated a p r i o r i . An example of t h i s i s given  i n Szollosi-Nagy et  a l . , ( l 9 7 7 ) . Conceptual response models i n v o l v i n g the c o n t i n u i t y and storage discharge  equations  have a l s o been  14  used with the Kalman F i l t e r . However, these models l e a d t o non-linear  estimation.  Wood (1978) i n v e s t i g a t e d  r o u t i n g models v i a the Kalman F i l t e r  flood  f o r f o r e c a s t i n g water  levels.  2.4 Problems i n h y d r o l o g i c  2 . 4 . 1 Unknown  applications  dynamics  The problem of unknown noise encountered i n h y d r o l o g i c complexity of the runoff the extreme s e n s i t i v i t y underspecification  covariances i s often  m o d e l l i n g as a r e s u l t  of the  p r o c e s s . Mehra (1979) p o i n t e d out of the Kalman F i l t e r t o  of R, or the m i s s p e c i f i c a t i o n of Q.  100. in =  I  2 oc . — E Ui IU  Forward  .Smoothing  0.1  I  10  0.5  0, asumad/o. actual  E f f e c t of errors i n measurement noise variance on steady-state RMS error i n state e s t i mation. o -= measurement noise standard deviation, (from Mehra 1968). r  J _q attumad/O.q actuat  E f f e c t of errors i n process noise variance on steadystate RMS error i n state estimation. o°n = process noise standard deviation, (from Mehra 1968).  F i g . 2.2 S e n s i t i v i t y of the state estimates to m i s - s p e c i f i c a t i o n s of noise standard d e v i a t i o n s .  15  T h i s was shown i n q u a l i t a t i v e terms and the idea conforms with one's i n t u i t i o n . However, the f i l t e r ' s performance under simultaneous  m i s s p e c i f i c a t i o n of Q and R needs t o be  examined. In p r a c t i c e , these n o i s e c o v a r i a n c e s are s p e c i f i e d too small or too l a r g e with respect t o t h e i r t r u e v a l u e s . The manner i n which these s p e c i f i c a t i o n s a f f e c t performance i s important  the f i l t e r ' s  knowledge for the f o r e c a s t e r .  Moreover, the i n i t i a l c o n d i t i o n s of the s t a t e a r e o f t e n unknown. These questions are addressed Because flow  i n the next  chapter.  i s the v a r i a b l e of i n t e r e s t , the f i l t e r ' s  performance i n d i c a t o r s are based on the o b s e r v a t i o n f o r e c a s t error.  2 . 4 . 2 E s t i m a t i o n of State system matrix V a r i o u s methods have been proposed i n the past f o r e s t i m a t i n g these unknown m a t r i c e s ,  H, Q, and R. $ i s the  s t a t e system m a t r i x . When elements of # are unknown, t h i s leads to n o n - l i n e a r f i l t e r i n g . to  One way of r e s o l v i n g t h i s i s  l i n e a r i z e the problem with respect t o the l a t e s t s t a t e  estimate  by the Extended Kalman F i l t e r .  i n d i c a t e d t h a t the estimates  However, Mehra has  thus obtained a r e s e n s i t i v e t o  i n i t i a l c o n d i t i o n s . Moreover, the EKF has not been proven t o y i e l d c o n s i s t e n t estimates the Ombrone Basin  (Young, 1977). In an example on  i n Italy, Szollosi-Nagy  (1977) used the  i n s t r u m e n t a l v a r i a b l e (IV) approach by Young (1977) t o r e s o l v e the n o n - l i n e a r i t i e s of the ARMA c o e f f i c i e n t s i n An extension t o t h i s approach, developed  by T o d i n i et  16 a l . , ( l 9 7 8 ) known as MISP technique  was a l s o a p p l i e d to the  Ombrone B a s i n . T h i s made use of two Kalman F i l t e r s : one which y i e l d s the s t a t e estimates  (given the parameter  values) and the other which y i e l d s parameter  estimates  (given the s t a t e v a l u e s ) . A c o n s t r a i n e d l i n e a r  estimator  proposed by Natale and T o d i n i (1976) was i l l u s t r a t e d  i n an  a p p l i c a t i o n by Wood (1978).  2.4.3 Unknown Observation  system matrix  H i s the system matrix  i n the o b s e r v a t i o n equation of  the Kalman model. When elements of H are unknown, t h i s n e c e s s a r i l y leads to a d i f f e r e n t e x p r e s s i o n  f o r the variance  of the f o r e c a s t e r r o r . Authors have overlooked  this  (Harrison & Stevens,  using the  same Kalman F i l t e r or not t h i s  1976) when they  formula  suggested  to o b t a i n the v a r i a n c e . Whether  i s a s e r i o u s e r r o r in  practice  depends on the  a p p l i c a t i o n . F e l d s t e i n (1971) has s t u d i e d t h i s problem i n the context  of econometrics.  However, because one of h i s  assumptions i s o f t e n not s a t i s f i e d  in hydrologic  a p p l i c a t i o n s , t h i s problem needs f u r t h e r i n v e s t i g a t i o n .  2.4.4 E s t i m a t i o n of the noise  covariances  Implementation of the Kalman F i l t e r allows the f o r e c a s t e r t o e x e r c i s e h i s judgement regarding the accuracy of the u n d e r l y i n g model v s . that of the o b s e r v a t i o n s . T h i s i s achieved  by s p e c i f y i n g the noise c o v a r i a n c e s , Q and R at  each time s t e p . The Kalman gain matrix K, i n e f f e c t ,  17  c o n t r o l s how  the s t a t e estimates  performance of the f i l t e r  are updated. Hence the  depends c r i t i c a l l y  v a l u e s . These are o f t e n unknown i n water a p p l i c a t i o n s , and many authors  on  these  resource  have addressed  t h i s problem.  In r e a l - t i m e a p p l i c a t i o n s where h i s t o r i c a l data s c a r c e , adaptive a l g o r i t h m s are o f t e n used (Mehra IEEE). Jazwinski & B a i l i e  (1969),  c o v a r i a n c e matching techniques,  1973,  Sage & Husa (1969) propose  where the  c o v a r i a n c e of the i n n o v a t i o n s and matched by a d j u s t i n g Q and  are  theoretical  the sample c o v a r i a n c e  R appropriately. This h e u r i s t i c  approach i s c o m p u t a t i o n a l l y  a t t r a c t i v e though i t i s not  guaranteed to converge. The  s p e c i a l case of known Q but  unknown R i s reported to have been handled (Mehra,  are  more s u c c e s s f u l l y  1972).  C o r r e l a t i o n methods only apply to t i m e - i n v a r i a n t systems under s t a t i o n a r y c o n d i t i o n s which impose a limitation The  f o r some water resource  Bayesian  systems.  method i s used to compute the p o s t e r i o r  p r o b a b i l i t y of the 6 = {Q,  R}  given the observed  data, i . e .  p ( 0 | y ) . C a l c u l a t i o n of t h i s r e q u i r e s the l i k e l i h o o d of fc  £(fl|y ) = p ( y | y _.j r . . .0). I f there are p elements i n t  t  t  each has N c h o i c e s , then  3  there are N* p o s s i b l e  of the set 6 corresponding  to  6,and  combinations  Kalman F i l t e r s . P o s t e r i o r  probability  i s c a l c u l a t e d f o r each one  F i l t e r s . As  i t can be r e a d i l y  computationally  6,  of these Kalman  seen, t h i s i s not  unless N and p are both  attractive  small. This  procedure d i s c r i m i n a t e s the best model out of a given group  18 of  models. However, i f the o r i g i n a l set of 0's do not  encompass the true v a l u e , then the "best model" i s s t i l l o p t i m a l i n the sense that there i s s t i l l c o u l d be used. T h i s method was and Valdes et a l . ,  a better  not  6 that  used by Moore & Jones  (1978),  (1978). The tremendous complexity of the  system renders i t c o m p u t a t i o n a l l y u n a t t r a c t i v e  and  cumbersome. The p r e v i o u s l y d e s c r i b e d methods are somewhat ad hoc i n t h e i r a p p l i c a t i o n s . The Maximum l i k e l i h o o d method f o r e s t i m a t i n g Q and R i s based on a mathematical p r i n c i p l e can be u n i v e r s a l l y a p p l i e d . The main advantage  and  of t h i s  method i s that  i t i s a s y m p t o t i c a l l y unbiased, c o n s i s t e n t  e f f i c i e n t . The  idea i s to maximize the l i k e l i h o o d of  and  o b s e r v i n g a p a r t i c u l a r combination of Q and R given the observation y  over a range of v a l u e s f o r the n o i s e  c o v a r i a n c e s . The major drawback i s that the l i k e l i h o o d expression  i n v o l v e s computation  with a non-diagonal m a t r i x .  Because the p r e v i o u s l y d e s c r i b e d methods are not guaranteed to  work, t h i s method i s i n v e s t i g a t e d i n chapter 5. It  should be noted that a l t h o u g h the problem of  e s t i m a t i n g n o i s e v a r i a n c e s has been addressed in  theoretically  the l i t e r a t u r e , the r e s u l t s have had l i m i t e d use i n  p r a c t i c e . The main reasons a r e : 1.  Many of the proposed s o l u t i o n s were developed f o r some p a r t i c u l a r a p p l i c a t i o n . Hence they tend to be application  2.  dependent.  The methods which are e a s i l y  implemented c o m p u t a t i o n a l l y  19 are u s u a l l y h e u r i s t i c i n nature. They have not proven to give authors, who results  3.  The  r e l i a b l e e s t i m a t e s . Moreover, d i f f e r e n t  have used them, reported  (Mehra, 1972;  Husa, 1969;  been  contradictory  O'Connell & C l a r k e ,  1981;  Sage &  Wood e t ^ a l . , 1978).  more complex methods are  though i n general,  not  e a s i l y implemented,  they y i e l d more c o n s i s t e n t  estimates  (Mehra, 1972,1980). i n a d d i t i o n , the complex mathematics required  do not  lend  themselves to p r a c t i c a l  implementations. In f a c t , they act more l i k e a black-box to the  forecaster.  There i s a d e f i n i t e need to bridge the theoretical required  2.5  research  the  implementation  Detailed  thesis objectives  initial  and  in s e c t i o n 2.4,  s p e c i f i c a t i o n s f o r the  unknown. These correspond to Q, state-space model. Two 1.  between  the  feasibility  in p r a c t i c e .  As d i s c u s s e d the  and  gap  R,  the  noise c o v a r i a n c e s  state vector x , 0  Po  are  and  often  of the Kalman  q u e s t i o n s are addressed in chapter  Which of the above q u a n t i t i e s misspecified,  outline  (Q, R,  x ,  P)  have a p r a c t i c a l e f f e c t on  the  0  0  3.  if filter's  performance. 2.  How  do  these q u a n t i t i e s a f f e c t the  forecasting  performance. The  second h a l f of the  t h e s i s i s based on ARMAX models.  S t a t i s t i c a l models are chosen f o r the a p p l i c a t i o n s  because  20  they a r e amenable to s t a t e - s p a c e F i l t e r . Conceptual considered  f o r m u l a t i o n of the Kalman  models such as SWMM and HEC I a r e not  i n t h i s t h e s i s . These a r e p h y s i c a l l y based models  and a r e more d i f f i c u l t t o implement. These a r e more worthwhile t o consider f o r long-term  forecasting for a r i v e r  b a s i n . On the other hand, ARMAX models r e q u i r e r e l a t i v e l y l e s s data. T h i s t h e s i s i s focused on the Kalman F i l t e r rather than model i d e n t i f i c a t i o n of h y d r o l o g i c Thus, s t a t i s t i c a l  processes.  models a r e used. Moreover, the simpler  r e p r e s e n t a t i o n of streamflow phenomenon a l l o w s a more d i r e c t i n v e s t i g a t i o n of the p r a c t i c a l problems a s s o c i a t e d with the Kalman F i l t e r . Chapter 4 d e s c r i b e s three ARMAX models t o be used for the remainder of the t h e s i s . The method of maximum likelihood  i s used t o estimate  models i n chapter  the noise v a r i a n c e for these  5. In p a r t i c u l a r , a s i m p l i f i e d approach t o  e v a l u a t i n g the l o g - l i k e l i h o o d expression  i s examined. T h i s  has been proposed by Schweppe and i l l u s t r a t e d by L e d o l t e r ett  a l . (1983). The next  three chapters  i n v e s t i g a t e the  f o r e c a s t i n g performance of the three models depending on whether or not the Kalman F i l t e r c o n s i d e r s an AR(1)  i s used. Chapter 6  model f o r p r e d i c t i n g flows  1 day ahead a t  Hope. Chapter 7 examines a t r a n s f e r f u n c t i o n model f o r p r e d i c t i n g the 2-day advance flow. Chapter 8 s t u d i e s a combined model which g i v e s 1 and 2 day f o r e c a s t s . The 1 and 2 step f o r e c a s t s are compared t o those obtained models of chapters  from the  6 and 7 r e s p e c t i v e l y . F i n a l l y , a general  21  e x p r e s s i o n f o r the mean square e r r o r of the f o r e c a s t when H i s unknown, i s developed i n Chapter 9. T h i s  situation  o c c u r s , f o r example, when the AR(1) model i s used t o p r e d i c t the flow than more two time steps ahead. In t h i s case, the v a r i a n c e of the f o r e c a s t e r r o r equation, HPH' + R.  i s not given by the Kalman  3. The  SENSITIVITY ANALYSIS  Kalman F i l t e r has  been used in many  hydrologic  a p p l i c a t i o n s . It i s a popular e s t i m a t i o n its  recursive  nature and  technique due  i t s a b i l i t y to handle  u n c e r t a i n t i e s . However, i t s a p p l i c a t i o n s have not been s u c c e s s f u l . T h i s known input are  the  i s o f t e n because the  quantities,  initial  i s not  conditions  P o i Q»  0  R  always  assumption of  s a t i s f i e d . These q u a n t i t i e s  of the  the noise c o v a r i a n c e m a t r i c e s Q, assumes that x t  to  state vector R.  The  Kalman  x , P, 0  and  0  algorithm  are c o r r e c t l y s p e c i f i e d . But when  they are unknown i n p r a c t i c e , t h e i r m i s - s p e c i f i c a t i o n result  in unreliable  The  objective  estimates.  of t h i s study i s to examine  s e n s i t i v i t y of the  filter  quantities x ,  Q,  0  P,  as a s t a t i s t i c a l quantities  {x , 0  0  R.  to m i s - s p e c i f i c a t i o n of The  sensitivity  0  Q,  R}  i s treated  f o r e c a s t i n g performance of the i n d i c a t o r based on the indicator  i s the  the the  study i s analyzed  f a c t o r i a l experiment. Each of the P,  may  filter  observation  response v a r i a b l e  as a f a c t o r .  input The  i s measured by  forecast i n the  error.  an  This  factorial  exper iment. A s c a l a r ARMAX model i s used f o r streamflow modelling in  t h i s study. In t h i s a p p l i c a t i o n , an a u t o r e g r e s s i v e  of order  model  1, with temperature as an exogeneous v a r i a b l e i s  chosen. The  model i s c a s t  model c o e f f i c i e n t s as the  i n t o s t a t e - s p a c e format with state vector.  Hence Q represents  the noise c o v a r i a n c e matrix for the c o e f f i c i e n t s , and  22  the  R is  23 the v a r i a n c e Results  of the measurement e r r o r . of t h i s s e n s i t i v i t y  study show that  s p e c i f i c a t i o n s of the ARMAX c o e f f i c i e n t s , x  0  initial  and P , do not 0  a f f e c t the f o r e c a s t i n g performance of the f i l t e r  t o any  degree of p r a c t i c a l concern. The performance i s m a t e r i a l l y a f f e c t e d by the combined s p e c i f i c a t i o n of the noise covariances,  Q and R. In a d d i t i o n , c o n c l u s i o n s  to how the unknown noise  a r e made as  c o v a r i a n c e s should be s p e c i f i e d i n  order t o a c h i e v e good f o r e c a s t i n g performance. S p e c i f y i n g both Q and R r e l a t i v e l y l a r g e r e s u l t s i n a f i l t e r performance comparable t o the case of known Q and R. S p e c i f y i n g Q small and R l a r g e y i e l d s the worst performance. I f the measurement noise  variance  forecasting i s known,  then the f o r e c a s t i n g performance i s i n d i f f e r e n t t o under or o v e r - s p e c i f i c a t i o n of Q. However, i f Q i s known, i t i s found that b e t t e r  filter  measurement noise  3.1  Experimental The  performance i s obtained i f the variance  i s specified relatively  large.  Plan  approach t o t h i s i n v e s t i g a t i o n i s d i v i d e d  into  parts. 1.  Flow  generation  Flow sequences a r e generated by a chosen ARMAX model with known input q u a n t i t i e s : x * , Q*, R*. 0  2.  Kalman F i l t e r S p e c i f i c a t i o n The  flow sequences a r e t r e a t e d as observations  Kalman F i l t e r  and a  i s a p p l i e d t o each flow sequence. The  four  24 state-space format used corresponds to a l l o w i n g the ARMAX c o e f f i c i e n t s be the s t a t e v e c t o r i n the Kalman model. Hence, estimates of the c o e f f i c i e n t s are given at each time step and flow f o r e c a s t s are made with the l a t e s t estimate given by the f i l t e r . The noise c o v a r i a n c e m a t r i c e s used i n the g e n e r a t i o n of streamflow data are Q* and R*.  The Kalman a l g o r i t h m uses x ,  and R where these may  0  be d i f f e r e n t  P,  Q,  0  from the g e n e r a t i o n  v a l u e s . Q and R can be s p e c i f i e d to be l e s s than, equal to,  or g r e a t e r than t h e i r t r u e v a l u e s , Q* and R*.  the  s e n s i t i v i t y of the f i l t e r ' s performance  to these  s p e c i f i c a t i o n s i s examined. The performance filter  A Kalman F i l t e r  i s a p p l i e d to each combination of of these f i l t e r s c o n t a i n the  s p e c i f i c a t i o n s of the  Factorial The  error.  input q u a n t i t i e s . One  correct  of the  i s measured by an i n d i c a t o r based on the  observation forecast  the  Hence,  inputs.  Experiment  i n v e s t i g a t i o n i s s t u d i e d as a s t a t i s t i c a l  experiment with four f a c t o r s ; namely {x , 0  P, 0  factorial Q, R}.  The  l e v e l s of these f a c t o r s correspond to the d i f f e r e n t specifications. A n a l y s i s of V a r i a n c e S e n s i t i v i t y of the f i l t e r ' s performance  i s investigated  through a n a l y s i s of v a r i a n c e (ANOVA) on the i n d i c a t o r s . The a n a l y s e s i n d i c a t e which  performance  f a c t o r s or  combinations t h e r e o f , have a s i g n i f i c a n t e f f e c t on the  25 performance i n d i c a t o r .  3.2  E x p e r i m e n t a l Procedure  3.2.1  G e n e r a t i o n of Flow Data A p p l i c a t i o n of the Kalman F i l t e r  motivated by  real-time  in t h i s thesis i s  flow f o r e c a s t i n g of the Fraser  A s t o c h a s t i c model for d e s c r i b i n g  the  River.  streamflow phenomenon  i s chosen:  q  The  t  =  a  +  t *t-1  autoregressive  b  t  storage e f f e c t s . The  and  b , fc  eqn:  V b  The  +  t  v  3.1  t  flow i s dependent on  the the  are chosen to f o l l o w a random walk.  The  3.1  r u n o f f . The  upstream  i s recast  i n t o state-space framework  follows:  State  Obs.  p  c o e f f i c i e n t s of  ARMAX model of eqn. as  m  temperature term r e p r e s e n t s  snowmelt i n f l u e n c e on the fc  e  term i s c h a r a c t e r i s t i c of s t a t i o n s with  l a r g e d r a i n a g e areas, as the  model, a  T  eqn:  noise  t  3.2  "w, "  a  " t-r  t  +  b  _ t-1_  _  [q _  Temp ]  t  1  w 2 t  . 3.3  t  terms are chosen with the  following  properties:  26  1•  w ~ (0,Q*) fc  The  c o v a r i a n c e matrix  f o r the c o e f f i c i e n t s  (states) i s  constant with respect to time. I t i s a l s o a d i a g o n a l m a t r i x . Thus, independent e r r o r s f o r the model c o e f f i c i e n t s are assumed. 2.  v <v (0,R*) t  Similarly,  a constant v a r i a n c e f o r the o b s e r v a t i o n e r r o r  i s used.  Batches of flow sequences are generated  f o r each  testing  c o n d i t i o n denoted as "CODE". Each sequence has T=100 time s t e p s . Each CODE corresponds x_o*,  to a p a r t i c u l a r  combination of  Q* i R* • The f o l l o w i n g t a b l e g i v e s the values used i n  the flow  generation.  Table 3.1  x * = 0  Values of x * , Q*, R* used i n flow 0  .85  A r b i t r a r y s t a r t i n g values are used as t h i s  7.00  Qi* =  does not a f f e c t  000036 0  generation  0  0 * = .0001 II y  .0016  0  0 * = .0004 0 III 0 .0225 y  .01  = 100  The  chosen values of Q* a r e r e p r e s e n t a t i v e of p r a c t i c a l  situations;  n  = 400  0  generation.  R* :  R *  the streamflow  = 625  R * = 900 IV  a standard d e v i a t i o n of .5% and 2% from one time  27 step to the next. S i m i l a r l y , the range of R* r e f l e c t s a measurement e r r o r with a standard d e v i a t i o n from 3% to 12% of the flow. In p r a c t i c e , the standard d e v i a t i o n of the measurement e r r o r  i s not more than  The number of combinations  10% of the flow.  of the above input  q u a n t i t i e s i s twelve. Hence, the number of CODES i s a l s o twelve. Table 3.2 g i v e s a schematic  layout of the coding  scheme.  Table 3.2  Schematic  coding  scheme  0 * II y  V R R R  II  *  III IV  *  *  3.2.2  CODE 5  CODE 9  CODE 2  CODE 6  CODE 10  CODE 3  CODE 7  CODE 11  CODE 4  CODE 8  CODE 12  Specifications  Values of the q u a n t i t i e s used  of these q u a n t i t i e s f o r the Kalman a l g o r i t h m .  T h e r e f o r e , Q* i s used  i n g e n e r a t i o n ; while Q i s a  s p e c i f i c a t i o n value. Different 0  i n data generation are  with a '*'; otherwise the l e t t e r s represent  specifications  {x i  U  CODE 1  Kalman F i l t e r  denoted  0 * III  P o r Qf R l r e s u l t  specifications for  i n d i f f e r e n t Kalman F i l t e r s . For each  28  code, 36  Kalman F i l t e r s a r e evaluated, each with a sample  s i z e of 5 streamflow sequences. The number of streamflow sequences r e q u i r e d  under each CODE i s 180. For each of the  180 s e r i e s , the performance of the Kalman F i l t e r in terms of the flow f o r e c a s t  i s measured  e r r o r . The i n d i c a t o r i s  r e l a t i v e root mean square (RRMS) e r r o r d e f i n e d a s :  3.4  RRMS  This  i s preferred  over the common RMS e r r o r as RRMS does not  depend on the u n i t s of measurement.  3.2.3  F a c t o r i a l Experiment The  investigation  i s s e t up as a f a c t o r i a l  experiment  with a completely randomized d e s i g n . I t i s a f u l l y - c r o s s e d four  f a c t o r experiment. The f a c t o r s are the s p e c i f i c a t i o n s  of the input  quantities:  l e v e l s f o r each f a c t o r  Table 3.3  {x r o> Q> R) • The number of p  0  i s given  below:  Number of L e v e l s f o r the F a c t o r s  l  2  2  Q  R  3  3  29 The  l e v e l s correspond to d i f f e r e n t  s p e c i f i c a t i o n s of the  input q u a n t i t i e s . The number of ways of combining levels i s :  these  k = II / . = 36. Hence, 36 d i f f e r e n t Kalman i  F i l t e r s a r e used under each CODE. The l e v e l s of these f a c t o r s are d e s c r i b e d below. Table 3.4  L e v e l s of the Noise  level  x  1  g<?od es>timate  Covariances  P  0  Q, R  0  conf ideii t  too small  guess  2  ~o. 9 r  ".001  _6.50_  _0  b<ad e s t i.mate  .25Q*, .25R*  . 065_  non-con Eident  correct  guess  spec i f i c a t ion  \75  -  "0.425  0"  0~  Q*, R*  1 0.50 too l a r g e  3  4Q*, 4R*  The good estimate of x  0  corresponds to an i n i t i a l  s p e c i f i c a t i o n e r r o r of 7%, while the bad estimate corresponds  to 50%. These l i m i t s are r e p r e s e n t a t i v e of the  range of a c c u r a c i e s f o r e s t i m a t e s of x  0  in p r a c t i c e . Level 1  of the n o i s e s p e c i f i c a t i o n corresponds to under-estimating the standard d e v i a t i o n by a f a c t o r of 2. S i m i l a r l y , l e v e l 3  30  corresponds  to o v e r - e s t i m a t i n g by a f a c t o r  of 2. L e v e l 2  r e p r e s e n t s a s p e c i f i c a t i o n which i s equal t o the true c o v a r i a n c e matrix. In p r a c t i c e ,  i t i s expected  that  estimates of Q and R, a r e w i t h i n the bounds provided by levels  1 and 3.  The response  variable  of the f a c t o r i a l  experiment i s  the performance i n d i c a t o r , RRMS. I t s values are analyzed by ANOVA to order to determine which of the four f a c t o r s or combinations  t h e r e o f , have an important  performance of the Kalman  e f f e c t on the  Filter.  Each of the 36 Kalman F i l t e r s  i s s u b j e c t e d to n=5  s e r i e s of o b s e r v a t i o n s , the 5 s e r i e s being d i f f e r e n t i n each case. T h i s conforms to a completely  randomized (CR) d e s i g n .  For t h i s non-repeated measures design, the measurement errors  are u n c o r r e l a t e d . Thus, an u n r e p r e s e n t a t i v e  series  does not r e p l i c a t e i t s c h a r a c t e r i s t i c from one Kalman to another.  Alternatively,  Filter  the same 5 s e r i e s of o b s e r v a t i o n s  c o u l d be used f o r each Kalman F i l t e r  i n order to minimize  the v a r i a t i o n between groups. T h i s repeated measures design results  in correlated  Kalman F i l t e r s .  measurement  errors  among the v a r i o u s  Hence, a CR design i s chosen.  The s t a t i s t i c a l  model f o r a f a c t o r i a l  experiment with 4  factors i s : Y.  i j klm  = M + A. + B . + C , i  l  + D  k  + 2-way i n t e r a c t i o n s  ..  + 3-way i n t e r a c t i o n s ... + 4-way i n t e r a c t i o n  + e  m  3.5  31  For  t h i s s e n s i t i v i t y study, the f o l l o w i n g  substitution  variable  i s made:  RRMS = M + X x_oPo  +  0  + P  + 0  Q  +  R  • • •  + x P Q + ... 0  0  + x P QR + e 0  The n o t a t i o n  3.6  0  i s described  as f o l l o w s :  1.  RRMS i s the response v a r i a b l e of the experiment.  2.  ju i s the o v e r a l l mean of the performance  indicator,  RRMS. 3.  x , P , Q, R are the f a c t o r s and they a r e c a l l e d the 0  0  main e f f e c t s i n eqn. 3.6 4.  The 2-way i n t e r a c t i o n s i n c l u d e  a l l possible  combinations  of the f a c t o r s i n groups of two. These a r e x P , x Q, 0  0  0  x R, P Q, P R, QR. 0  5.  0  0  S i m i l a r l y , the 3-way i n t e r a c t i o n s are a l l combinations of the f a c t o r s i n groups of t h r e e . These a r e x P Q , 0  x P R, 0  6.  0  P QR, 0  0  x QR. 0  F i n a l l y , there i s the 4-way i n t e r a c t i o n and the e r r o r ,  32  3.2.4  A n a l y s i s of Variance The experiment  i s analyzed by ANOVA with a l l terms  (except M) on the r i g h t hand s i d e of eqn. 3.6 as the 'sources' i n the ANOVA t a b l e . One a n a l y s i s i s done f o r each CODE; hence, twelve ANOVA's are made i n t o t a l . The computer package 'ANOVAR' i s used and a sample output of the ANOVA t a b l e i s given i n the R e s u l t s ( S e c t i o n 3.3.1). The output gives the p-value f o r each source under  'F PROB'. These  ANOVA r e s u l t s are used as a q u a l i t a t i v e guide f o r p o i n t i n g out those sources which are important  in practical  s i t u a t i o n s . The p-values are compared among the sources i n order t o i n d i c a t e those which have a s i g n i f i c a n t e f f e c t on the response, RRMS. The smaller the p-value, the more l i k e l y it  i s that the corresponding source has a s i g n i f i c a n t  on the f i l t e r ' s  performance.  A flow c h a r t of the procedure  i s given i n F i g . 3.1.  effect  33  Generate  flows  choose  x  R'" o  generate flow 100  N=kn=180  sequences time  of  steps  long  i Kalman  Filter  .  K F  1,2  •• •  36,2  " •  K F  1,1  K F  1,5  Specifications  K F  Factorial  Experiment  36,1  K F  measure of  K F  36,5  the performance  t h e Kalman  Filter,  RRMS  ANOVA  determine sources  Repeat  Fig.  3. 1  Schematic  f o r each  diagram  significant  f o r RRMS.  CODE  of Experimental  Procedure.  34 3.3 Results  3 . 3 . 1 ANOVA R e s u l t s The ANOVA r e s u l t s i n d i c a t e significant  that  under a l l CODES, the  sources a r e the s p e c i f i c a t i o n of the main e f f e c t  Q, and the i n t e r a c t i o n QR. The s i g n i f i c a n c e it  of QR means that  i s the combined s p e c i f i c a t i o n of the noise c o v a r i a n c e s  which a f f e c t s the f i l t e r ' s  performance.  A sample output of the ANOVA t a b l e on the f o l l o w i n g  page. The p-values  f o r CODE 7 i s shown  (represented by F PROB  in the t a b l e ) are compared. Those corresponding t o the sources Q and QR are an order of magnitude lower  than the  r e s t . T h i s phenomenon i s noted under a l l CODES. The manner i n which the QR i n t e r a c t i o n a f f e c t s RRMS i s obtained from the f o l l o w i n g  two and t h r e e - d i m e n s i o n a l  graphs. The mean RRMS values a r e p l o t t e d l e v e l s of the source, QR.  with r e s p e c t t o the  35  Table  3.5  Computer ANOVA  Factors: p-values;  A  =  *  given  Output  Results  B = P 0  f o r CODE 7  0  C = Q  D = R  b y F PROB  ANALYSIS OP VARIANCE/COVARIANCE FOR VARIABLE  SOURCE A B AB C AC BC ABC D AD BO ABD CD ACD BCD ABCD ERROR TOTAL  D.F. 1 1 1 2 2 2 2 2 2 2 2 4 4 4 4 144 179  SUM OF SOUARES 1.439339E-05 3.582272E-05 2.683472E-05 3.062712E-03 5.325221E-04 \ 1.691381E-04 8.595134E-04 9.816524E-04 8.214938E-04 5.839004E-04 5.987364E-04 4.261532E-03 2.407318E-03 1.477050E-03 1.577517E-03 4.805392E-02 6.54S426E-02  MEAN SOUARE 1.439339E-05 3.582271E-05 2.683472E-05 1.531356E-03 2.662609E-04 S'.456905E-05 4.2975S7E-04 4.9092G2E-04 4.1074S8E-04 2.919501E-04 2.993S81E-04 1.O65383E-03 6.018295E-04 3.692624E-04 3.943790E-04 3.337075E-O4  Note t h a t t h e p - v a l u e s f o r s o u r c e s .0117 a n d . 0 1 5 1 r e s p e c t i v e l y .  C=Q  F VALUE 0. 0431 0. 1073 0. 0804 4 .5689 0. 7979 0. 2534 1 .2878 1 .471 1 1 .2309 0. 8749 0. 8971 3. 1926 1 .8035 1 .10S5 1 .1818  RRMS  F PROB 0. 8177 0. 7398 0. 7689 0. 01 17 0. 4559 0. 7791 0. 2785 0. 2316 0. 2949 0. 4219 0. 4126 0. 0151 0. 1300 0. 3560 0. 3211  a n d CD=QR a r e  36  3.3.2 Two-dimensional Graphs The  o r d i n a t e i n the f o l l o w i n g graphs i s the performance  i n d i c a t o r , RRMS. T h i s i n d i c a t o r  i s such that the smaller the  v a l u e , the b e t t e r i s the performance. The a b s c i s s a denotes the three l e v e l s of s p e c i f i c a t i o n For  i n s t a n c e , Q(1) corresponds  Graphs i n s e r i e s  f o r the noise  covariance.  t o s p e c i f y i n g Q too s m a l l .  (a) a r e p l o t t e d with respect t o Q f o r each  l e v e l of R ( i . e . h o l d i n g R c o n s t a n t ) . Those i n s e r i e s (b) are with respect t o R f o r each l e v e l of Q. The f e a t u r e s d i s p l a y e d by the graphs are summarized below: 1.  From s e r i e s  (b) of F i g s . 3.2-3.4, i t can be seen that  the s m a l l e s t RRMS value occurs a t Q(2)R(2).  This  represents c o r r e c t s p e c i f i c a t i o n s of the noise covariances. 2.  The graphs i n s e r i e s  (a) show that i f R i s given  c o r r e c t l y , then s p e c i f y i n g Q too small Q(1), l a r g e Q(3),  or too  r e s u l t s i n almost i d e n t i c a l RRMS v a l u e s . In  other words, the f i l t e r ' s performance i s i n d i f f e r e n t t o m i s - s p e c i f i c a t i o n of Q i f R i s known. 3.  A d i f f e r e n t behavior in s e r i e s noted  i s observed  by examining the graphs  (b) of F i g s . 3.2-3.4 f o r l e v e l Q(2).  that i f Q i s given c o r r e c t l y ,  It i s  then  u n d e r - s p e c i f i c a t i o n of R r e s u l t s i n a l a r g e r RRMS than o v e r - s p e c i f i c a t i o n . T h i s phenomenon i s p a r t i c u l a r l y n o t i c e a b l e under Q *  I I I  with l a r g e d i s t u r b a n c e s 4.  shown i n F i g . 3.4, f o r systems i n the s t a t e s .  For each CODE, nine RRMS values are p l o t t e d . In each  37  case, two of these values a r e r e l a t i v e l y c l o s e t o the minimum RRMS. These values correspond t o the specifications,  Q(2)R(3) and Q(3)R(3). I t i s noted that  both these s p e c i f i c a t i o n s i n v o l v e R(3). T h i s f e a t u r e i s more pronounced  as Q* and R* i n c r e a s e .  In g e n e r a l , s p e c i f y i n g R too small should be avoided. High v a l u e s of the performance  indicator  for a l l  s p e c i f i c a t i o n s of Q under l e v e l R(1) a r e o b t a i n e d . A large  i n c r e a s e i n the RRMS v a l u e i s noted f o r  Q(1)R(3). For example, see F i g . 3.3 ( b ) .  38  RRMS  Fig.  3.2 ( a )  with  respect to  Fig.  Q  RRMS w i t h  3.2 ( b ) respect-to  R  39  CODE 5  CODE 5 0.03*.  0.03-t  Legend  0.08  CODE 8  Legend o  ot  » Q2_ & 03  RRMS  Fig.  3.3 ( a )  with  respect  to Q  RRMS  Fig.  3.3 ( b )  with  respect  to R  40  Fig. RRMS  with  Fig.  3.4 ( a ) respect  to Q  RRMS  with  3.4 ( b ) respect  to R  Fig.  3.4  RRMS w i t h  (a) cont'd respect  to Q  Fig.  3.4  RRMS w i t h  (b) cont'd respect  to R  42  3.3.3 Three-dimensional p l o t A three-dimensional surface  p l o t of the i n t e r p o l a t e d response  f o r the RRMS i s g i v e n . T h i s i s p l o t t e d with  to the nine  respect  s p e c i f i c a t i o n l e v e l s of both Q and R. Because  the behavior  of the f i l t e r  only one response s u r f a c e  t o QR i n t e r a c t i o n i s the same, i s drawn. T h i s i s a q u a l i t a t i v e  p l o t as the axes a r e s c a l e d a p p r o p r i a t e l y t o g i v e a reasonable The  image.  s u r f a c e minimum occurs  i n the region  corresponding  to Q(2)R(2). The response s u r f a c e r i s e s i n a l l d i r e c t i o n s away from the minimum. I t i s noted that the topography i s not  too d i f f e r e n t  from the minimum, i n a zone d e f i n e d by  Q(2)R(3), and Q(3)R(3). The s u r f a c e a l s o shows a sharp increase  i n RRMS f o r the s p e c i f i c a t i o n ,  Q(1)R(3).  A3 RRMS  Q  levels R  levels  Q(1)R(3)  Fig.  3.5  Response  Surface  RRMS w i t h r e s p e c t t o QR s p e c i f i c a t i o n s  44 3.4 D i s c u s s i o n of R e s u l t s  3.4.1  ANOVA R e s u l t s The  f a c t o r i a l experiment c o n s i d e r s four f a c t o r s : x , 0  P,  Q, R. I t i s found t h a t :  1.  Initial  0  do  specifications  x , P 0  0  not have a s i g n i f i c a n t e f f e c t on the f i l t e r ' s  forecasting are  of the ARMAX c o e f f i c i e n t s  performance. In p r a c t i c e ,  these  coefficients  often unknown and t r a d i t i o n a l l y , much work has gone  into their estimation p r i o r shows that even an e r r o r  to forecasting.  of 50% i n t h e i r  T h i s study  initial  s p e c i f i c a t i o n , does not s i g n i f i c a n t l y degrade the performance. 2.  With the c o e f f i c i e n t s  as the s t a t e  variables  Kalman model, i t i s found that the f i l t e r to the combined s p e c i f i c a t i o n  i n the  i s sensitive  of the n o i s e  covariances,  Q and R. The  1-step flow f o r e c a s t  of the s t a t e  i s based on the l a t e s t  estimate  v e c t o r , which i s " updated by u s i n g the  Kalman gain matrix, K . Gelb (1979) shows that the fc  Kalman gain can be i n t e r p r e t e d uncertainties  i n the s t a t e  as a r a t i o of the  estimate t o the measurement  n o i s e . Because the u n c e r t a i n t y i n the s t a t e affected  by the noise c o v a r i a n c e Q, K  fc  estimate i s  can be c o n s i d e r e d  as a r a t i o of Q t o R. Hence, i t i s the combined specification filter's  of the n o i s e c o v a r i a n c e s which a f f e c t s the  performance.  45 3.4.2  I n t e r p r e t a t i o n of  Graphs  Because of the i n t e r a c t i o n between Q and R, i t i s not sufficient  t o examine the f i l t e r ' s  s e n s i t i v i t y to e i t h e r Q  or R i n d i v i d u a l l y . I t should be examined with respect t o both 1.  quantities. The optimum l e v e l of performance (represented by the minimum RRMS value) i s obtained when Q and R are s p e c i f i e d c o r r e c t l y . T h i s i s true under a l l t e s t i n g conditions.  2.  I f the v a r i a n c e of the measurement e r r o r R i s known, then the f i l t e r ' s performance i s i n d i f f e r e n t to under or o v e r s p e c i f i c a t i o n of Q. I t s performance l e v e l can only be  3.  improved i f Q i s s p e c i f i e d  correctly.  I f Q i s known to begin with, then  s p e c i f y i n g R too small  should be avoided. 4.  For systems with l a r g e d i s t u r b a n c e s i n the ARMAX c o e f f i c i e n t s , u n d e r - s p e c i f i c a t i o n of Q should be avoided. T h i s i s p a r t i c u l a r l y t r u e when R i s s p e c i f i e d l a r g e . The worst performance occurs f o r Q(1)R(3).  5.  A g e n e r a l g u i d e l i n e i s to s p e c i f y both  covariance  m a t r i c e s l a r g e because r e l a t i v e l y good f o r e c a s t i n g performance can be o b t a i n e d under  3.5  Q(3)R(3).  Summary The  Kalman F i l t e r  streamflow  i s a p p l i e d t o an ARMAX model f o r  f o r e c a s t i n g . A common problem i s that  initial  c o n d i t i o n s of the model c o e f f i c i e n t s and the noise  46 covariances  a r e unknown. The s e n s i t i v i t y  forecasting  performance to  is  the  ARMAX model  coefficients  sensitivity Filter terms  are of  as  study,  recast  into  the  states  i n the  filter's  of  x ,  P ,  0  observation  of  .5  to  Q, R  0  found t h a t  coefficients  2% change  feature  is  useful  when t h e r e estimates The  is of  the  to  If  the  the  larger  comparable  the  case  covariances  data  prior the  to  of  specified  in  by RRMS.  given too  it  due t o  the of  the  noise.  ARMAX This  especially reliable  performance  of  the  is  best  is  noise to  values.  specify This  performance  is  filter's  Q, i f  correctly, small.  it  and  forecasting.  known Q and R .  specification  Q is  this  Kalman  performance.  expected  c a n be e s t i m a t e d , is  the  filter's  in a forecasting  under o r o v e r  R is  of  for obtaining  than t h e i r  This  if  the  measured  p r a c t i c i n g engineer;  R accurately.  However,  For  measurement  filter's  Q and R a r e unknown,  results  noise  12% i n t h e  shows t h a t  specification to  with  model c o e f f i c i e n t s ,  combined s p e c i f i c a t i o n  both c o v a r i a n c e s  is  defined  specifications  affect  for  also  $ and H of  error,  i n the  coefficients  the  covariances.  3 to  insufficient  study  sensitive  of  initial  do not  Kalman m o d e l .  performance  forecast  format  t e s t e d correspond to a standard  standard deviation is  state-space  system m a t r i c e s  known. The f i l t e r ' s  the  deviation  if  specification  is  the  The - c o n d i t i o n s  It  the  examined. The  a  the  of  R is  If  o n l y one o f  better  to  estimate  insensitivity given  filter  the  to  correctly.  performs  worse  47 In g e n e r a l , i t i s best to s p e c i f y a l a r g e v a r i a n c e f o r the measurement e r r o r . T h i s r e s u l t s i n good f o r e c a s t i n g performance as long as Q i s not s p e c i f i e d much smaller than i t s true value.  4. HYDROLOGIC SYSTEMS The  Kalman F i l t e r has been a p p l i e d i n the p r e v i o u s  sensitivity  study to a g e n e r a l ARMAX model, used t o generate  s y n t h e t i c streamflow corresponds variables  d a t a . The state-space f o r m u l a t i o n  t o l e t t i n g the ARMAX c o e f f i c i e n t s be the s t a t e  i n the Kalman model. The study shows how the  f o r e c a s t i n g performance of the f i l t e r  i s a f f e c t e d by the  combined n o i s e covariance s p e c i f i c a t i o n . I t a l s o i n d i c a t e s that  initial  affect  s p e c i f i c a t i o n of the ARMAX c o e f f i c i e n t s do not  the flow f o r e c a s t i n g performance.  T h i s chapter d e s c r i b e s three p a r t i c u l a r ARMAX models, chosen f o r Kalman F i l t e r a p p l i c a t i o n s i n Chapters Actual d a i l y forecasting  flow data  6 t o 8.  i s used i n these s t u d i e s f o r  flows 1 and 2 days i n advance. These models  r e p r e s e n t h y d r o l o g i c systems with c e r t a i n b a s i n c h a r a c t e r i s t i c s , d e s c r i b e d i n t h i s chapter. The p r a c t i c a l i t y of  the Kalman F i l t e r  hydrologic  cannot be explored f o r a l l types of  systems represented by the g e n e r a l ARMAX model,  as i t i s data dependent t o some extent. Hence, s p e c i a l i z a t i o n to c e r t a i n types of systems i s necessary. The p r a c t i c a l i t y of the Kalman F i l t e r different  and i t s performance under  state-space f o r m u l a t i o n s of a h y d r o l o g i c model are  i n v e s t i g a t e d using r e a l  data.  S t u d i e s i n Chapters  6 t o 8 use d a i l y flow data from the  F r a s e r R i v e r a t Hope i n B.C. The F r a s e r River a t Hope i s one of  the l a r g e s t r i v e r s i n Canada and the flow changes  relatively  slowly through  time. As d i s c u s s e d i n s e c t i o n 4.1,  48  49 t h i s a l l o w s the assumption of t i m e - i n v a r i a n t systems. T h e r e f o r e , ARMAX models with constant c o e f f i c i e n t s w i l l be c o n s i d e r e d f o r these s t u d i e s . In p r a c t i c e then, only the noise v a r i a n c e a s s o c i a t e d with the measurement equation i n the Kalman model need t o be estimated p r i o r t o f o r e c a s t i n g . T h i s subject i s addressed The  i n Chapter  5.  three ARMAX models chosen a r e r e l a t i v e l y  simple  from a h y d r o l o g i c p o i n t of view because they do not i n v o l v e complicated p h y s i c a l equations. P h y s i c a l l y based models have been used f o r some types of r i v e r b a s i n s (e.g. Quick and Pipes,  1976). These a r e u s u a l l y d e t e r m i n i s t i c models  r e q u i r i n g e x t e n s i v e data  i n p u t s . In a d d i t i o n , the system of  equations do not r e a d i l y lend themselves t o s t a t e - s p a c e f o r m u l a t i o n . Because the focus of subsequent s t u d i e s i s on the Kalman F i l t e r  and i t s p o t e n t i a l  i n flow f o r e c a s t i n g  rather than on developing the best f o r e c a s t i n g procedure, t i m e - i n v a r i a n t ARMAX models a r e s e l e c t e d . These represent the simplest types of h y d r o l o g i c systems which a r e of practical  interest  i n water resource management.  4.1 T i m e - i n v a r i a n t Systems Flow f o r e c a s t i n g using ARMAX models with t i m e - i n v a r i a n t c o e f f i c i e n t s a r e c o n s i d e r e d . Constant  c o e f f i c i e n t models a r e  a p p r o p r i a t e f o r h y d r o l o g i c systems whose behavior  remains  constant with time. These types of systems a r e u s u a l l y c h a r a c t e r i z e d by l a r g e drainage areas, where the flows a r e i n f l u e n c e d by storage e f f e c t s . For these systems,  rainfall  i n p u t s have l i t t l e  i n f l u e n c e on the outflow.  R o d r i g u e z - I t u r b e (1978) and S z o l l o s i - N a g y (1976) concluded that the assumption extended  of t i m e - i n v a r i a n t c o e f f i c i e n t s can be  t o systems whose response c h a r a c t e r i s t i c s change  slowly over time. A "window" type of approach  i s suggested  f o r h a n d l i n g the time v a r y i n g nature, as shown i n the diagram.  TIME  (from z(t)  Fig.  4'. 1  A  S z o l l o s i - N a g y , 1976) i s the o b s e r v a t i o n at time  'data  window'  time-variant  approach  hydrologic  to  t.  slightly  systems.  51  A p p l i c a t i o n of the Kalman F i l t e r  i n t h i s t h e s i s i s motivated  by the f o l l o w i n g problem. Streamflow p r e d i c t i o n s f o r the F r a s e r R i v e r a t Hope a r e r e q u i r e d i n the c o n t r o l of t h i s water resource  system. The maximum d a i l y  (peak r u n o f f ) which occurs snowmelt. T h e r e f o r e , April  flow of the year  i n June, i s mainly due t o  a reasonable  1 t o September 30. Within  forecasting horizon i s  t h i s time frame, i t i s  expected t h a t the basin c h a r a c t e r i s t i c s do not change a b r u p t l y . The s t a t i o n a t Hope i s l o c a t e d near the downstream end  of the F r a s e r R i v e r . T h i s means that i t has a l a r g e  drainage  area and thus a slow b a s i n response. Hence,  t i m e - i n v a r i a n t ARMAX models a r e a p p r o p r i a t e  f o r the  i n v e s t i g a t i o n s i n t h i s t h e s i s . As a r e s u l t , only one noise variance  i s t o be estimated,  and t h i s  i s addressed i n  Chapter 5.  4.2  T h r e e ARMAX Models Three s t o c h a s t i c models commonly used i n streamflow  modelling  1.  AR(1) q  2.  t  = aq _ + v t  1  fc  TRANSFER FUNCTION q  3.  are considered.  t  = biq*t-  + 2  b  2<3**t-2  + V  t  Combined, ARMAX q  t  = c q _ 1  t  1  + c q* _ 2  t  2  + c q** _2 + v 3  t  fc  where q, q*, q** denote streamflows at three  different  52 stations. These a r e s p e c i a l cases  of the general ARMAX model. They can  be thought of as a (p,q) ARMA model with i n p u t s . General  justifications  models have been given  (r) exogeneous  f o r using these  i n the p r e v i o u s  types of  section. Particular  reasons f o r using each of these models a r e d i s c u s s e d below. 1.  The a u t o r e g r e s s i v e model of order  1 has been widely  used  in streamflow m o d e l l i n g . Quimpo (1973) has shown that storage  e f f e c t s can be represented  by a u t o r e g r e s s i v e  terms. P r e d i c t i o n s beyond t h i s can be made i f the unknown flows a t l a g one a r e r e p l a c e d by t h e i r  estimate.  However, t h i s n e c e s s a r i l y r e s u l t s i n a l a r g e r v a r i a n c e for  the flow  f o r e c a s t e r r o r . T h i s t o p i c i s addressed i n  Chapter 9. A p p l i c a t i o n of the Kalman F i l t e r  to t h i s  model i s i n v e s t i g a t e d i n Chapter 6. 2.  Upstream inputs can c o n t r i b u t e t o the r i s i n g p o r t i o n of an outflow transfer  hydrograph as p o i n t e d out by Wood (1978). A  f u n c t i o n model can be used where the  c o e f f i c i e n t s b, and b  2  represent  flow c o n t r i b u t i o n s from  upstream s t a t i o n s . The 2-day l a g i n the flows allow the p r e d i c t i o n of the 2-day f o r e c a s t with a l l the exogeneous v a r i a b l e s known a t time t . The a p p l i c a t i o n of the Kalman Filter  t o t h i s model i s examined i n Chapter 7.  In f a c t , f l o o d r o u t i n g procedures d e s c r i b e d by the convolution  i n t e g r a l a r e o f t e n l i n e a r i z e d . The d i s c r e t e  time r e p r e s e n t a t i o n of the i n t e g r a l can be considered as a t r a n s f e r f u n c t i o n model. An example of t h i s  53 f o r m u l a t i o n can be found 3.  i n Wood(l978).  The t h i r d model combines the a u t o r e g r e s s i v e process upstream i n p u t s . The Kalman F i l t e r  with  i s applied to t h i s  model i n Chapter 8.  4.3 Scope o f A p p l i c a t i o n s The  state-space  coefficients  f o r m u l a t i o n whereby the ARMAX  form the elements of the s t a t e v e c t o r i s  c o n s i d e r e d . T h i s format i s used throughout the remaining c h a p t e r s , except  where otherwise  a s s o c i a t e d with the s p e c i f i c a t i o n {x , 0  P , Q, R} have been reduced 0  noted. The problems of the inputs t o the problem of  e s t i m a t i n g R only because: 1.  Q=0 as t i m e - i n v a r i a n t c o e f f i c i e n t s  2.  Initial  specification  do not a f f e c t  are c o n s i d e r e d .  of the ARMAX c o e f f i c i e n t s ,  x , P 0  the f i l t e r ' s f o r e c a s t i n g performance.  0  5. ESTIMATION OF MEASUREMENT NOISE VARIANCE R e s u l t s of Chapter 3 show that the f i l t e r ' s f o r e c a s t i n g performance i s a f f e c t e d by the s p e c i f i c a t i o n of the noise covariance m a t r i c e s . In f a c t , v a r i a n c e e s t i m a t i o n i n state-space models i s a s u b j e c t of much c u r r e n t i n t e r e s t . A review  of t h i s was given i n Chapter 2, s e c t i o n 2.4.4. In  t h i s chapter, a method f o r e s t i m a t i n g the noise v a r i a n c e i s i n v e s t i g a t e d . The three models d i s c u s s e d i n Chapter 4 are considered. It  i s found  that the v a r i a n c e of the measurement e r r o r  cannot be assumed t o be constant  with time. A n a l y s i s  i n d i c a t e s that i t v a r i e s i n p r o p o r t i o n t o the square of the flow. T h e r e f o r e , a Box-Cox transform  i s used f o r s t a b i l i z i n g  the v a r i a n c e . The a p p r o p r i a t e t r a n s f o r m a t i o n  i s to take  logarithms of the o r i g i n a l measurements. The new s e r i e s has a noise v a r i a n c e which i s approximately of  maximum l i k e l i h o o d  constant. The method  i s then used to estimate  v a r i a n c e f o r each model. An e q u i v a l e n t  the n o i s e  log-liklihood  f u n c t i o n f o r the flow i s w r i t t e n i n terms of the o b s e r v a t i o n f o r e c a s t e r r o r ; and i s evaluated through the use of the Kalman  Filter.  5.1 The Estimation Method Because an accurate and r e l i a b l e estimate of the noise v a r i a n c e i s d e s i r e d , h e u r i s t i c methods d i s c u s s e d i n Chapter 2 are not c o n s i d e r e d . More s o p h i s t i c a t e d methods e x i s t , but are t r a d i t i o n a l l y avoided due t o extensive  54  computational  55 requirements. software,  However, the a v a i l a b i l i t y of computer  has made these methods more a t t r a c t i v e . An example  i s the Bayesian in  method of e s t i m a t i o n . T h i s w i l l not be used  t h i s t h e s i s but i s d i s c u s s e d b r i e f l y as i t has a t t r a c t e d  some a t t e n t i o n i n the recent l i t e r a t u r e . T h i s method does not  estimate  the noise v a r i a n c e per se. A c t u a l l y , i t i s a  d i s c r i m i n a t i o n technique. Numerous Kalman models are considered at the s t a r t ; each one c o n t a i n i n g a p o s s i b l e 2  value of the noise v a r i a n c e , a ^.  I t i s assumed t h a t one of  these Kalman F i l t e r s c o n t a i n s the c o r r e c t noise v a r i a n c e 2  a* .  2  P o s t e r i o r p r o b a b i l i t y of a ^ being  observations,  2  o*  given the  i s c a l c u l a t e d f o r each model. A f t e r s u f f i c i e n t 2  number of o b s e r v a t i o n s , the model with  a-  '  1  w i l l be assigned the highest p o s t e r i o r p r o b a b i l i t y . i n i t i a l a p p l i c a t i o n s by Valdes  2  c l o s e s t to a*  Although  et a l . , (1978) show that  method does s e l e c t the model with  this  2  a ^ c l o s e s t to the t r u e  v a r i a n c e , i t i s not c o n s i d e r e d here f o r the f o l l o w i n g reasons: 1.  A p p l i c a t i o n s of t h i s technique  i s only i n i t s i n i t i a l  stage, hence i t s p r o p e r t i e s s t i l l 2.  need to be examined.  In a study by the previous a u t h o r s ,  i t was found  that  300 time steps are r e q u i r e d b e f o r e the best model can be d i s t i n g u i s h e d . Since we are o n l y i n t e r e s t e d i n forecasting  f o r p a r t of the year, there i s not enough  data f o r the d i s c r i m i n a t i o n procedure. 3.  The o r i g i n a l set of models i s assumed to encompass the true value of R.  56 T h i s l a s t p o i n t presents a l i m i t a t i o n the engineer  i s not s u f f i c i e n t l y  in practice.  Often,  f a m i l i a r with the p h y s i c a l  c h a r a c t e r i s t i c s of a b a s i n t o provide the bound v a l u e s . 2  Hence, the set of a ^ ' s chosen may not i n c l u d e the true v a l u e . Instead, the method of maximum l i k e l i h o o d t h i s chapter t o estimate  i s used i n  the noise v a r i a n c e .  The maximum l i k e l i h o o d method gives estimates which are known t o be c o n s i s t e n t and a s y m p t o t i c a l l y e f f i c i e n t . No p r i o r knowledge of the system parameters a r e r e q u i r e d here. The d i f f i c u l t y with t h i s method i s the e v a l u a t i o n of the likelihood  f u n c t i o n . In time s e r i e s a p p l i c a t i o n s , e r r o r s i n  s u c c e s s i v e streamflow measurements are l i k e l y to be c o r r e l a t e d . Harvey (1981) o u t l i n e s i n h i s t e x t , how the e v a l u a t i o n of the l o g l i k e l i h o o d  f u n c t i o n can be s i m p l i f i e d  through the use of the Kalman F i l t e r . An example of t h i s e s t i m a t i o n procedure i s i l l u s t r a t e d by L e d o l t e r (1983) i n a Management Science  5.2  context.  E s t i m a t i o n Procedure An engineer  resource  o f t e n has t o make d e c i s i o n s f o r a water  system i n r e a l time.  I t i s d e s i r a b l e that the  v a r i a n c e of the measurements e r r o r s be c o n s i d e r e d with r e s p e c t to time. T h i s can be approximately transforming  constant  achieved by  the raw measurements. For each of the models  which w i l l be used i n subsequent s t u d i e s , the noise v a r i a n c e i s estimated 1.  by the f o l l o w i n g  procedure.  Determine the a p p r o p r i a t e Box-Cox t r a n s f o r m a t i o n f o r the  57  original  flow data such that the noise v a r i a n c e of the  new s e r i e s i s approximately 2.  For the transformed estimate  constant.  data, o b t a i n the maximum  likelihood  of the noise v a r i a n c e by using the Kalman  Filter.  5.3  Transformation  of flow data to achieve Uniform  Variance  Simple a p p l i c a t i o n s of s t o c h a s t i c flow models assume that the measurement e r r o r s a r e constant  f o r the raw flow  data. For an example, see Wood (1978). However, a more realistic  assumption i s that the v a r i a n c e of the flow  o b s e r v a t i o n e r r o r be a f u n c t i o n of the flow  i t s e l f . Patry  and Marino (1983) examined v a r i o u s noise assumptions commonly used i n h y d r o l o g i c models. In an a p p l i c a t i o n by Natale & T o d i n i (1976), a l i n e a r dependence of the noise term's standard d e v i a t i o n on the flow real-time forecasting s i t u a t i o n , simpler  i s assumed. In a  flow p r e d i c t i o n i s made  i f the measurement e r r o r s are t i m e - i n v a r i a n t .  Transforms f o r s t a b i l i z i n g the v a r i a n c e a r e a v a i l a b l e . A Box-Cox transform  i s used i n t h i s a p p l i c a t i o n . In g e n e r a l ,  t h i s c l a s s of transforms  i s a p p r o p r i a t e when the v a r i a n c e of  the o b s e r v a t i o n s can be expressed Before  as a f u n c t i o n of i t s mean.  the a p p r o p r i a t e Box-Cox t r a n s f o r m a t i o n can be  determined, the r e l a t i o n between the v a r i a n c e of the o b s e r v a t i o n s and i t s mean i s i n v e s t i g a t e d .  58 5.3.1 R e l a t i o n between the Noise V a r i a n c e and the Flow The three ARMAX models of Chapter subsequent chapters  4 a r e used i n  f o r the a p p l i c a t i o n of the Kalman  F i l t e r . The state-space f o r m u l a t i o n where the ARMAX c o e f f i c i e n t s a r e the s t a t e v a r i a b l e s a r e c o n s i d e r e d . The c o n d i t i o n a l v a r i a n c e of the flow, given the model coefficients,  i s equal t o the v a r i a n c e of the measurement  e r r o r . T h i s i s expressed  as the f o l l o w i n g , where z  represents the o r i g i n a l  flow data measured i n m /s.  z  fc  = H x  3  5.1  t  Var (z, x ) = Var(v )  5.2  t  The time s u b s c r i p t i s now dropped f o r n o t a t i o n a l convenience. observation  I t i s p o s t u l a t e d that the v a r i a n c e of the i s a f u n c t i o n of the mean:  V a r ( z ) = a (Mean(z))  or  Equation  k  l o g [ V a r ( z ) ] = l o g [ a ] + k l o g [Mean(z)]  5.3  5.4  5.4 i s a l i n e a r equation with k r e p r e s e n t i n g the  s l o p e . I t can be determined  by r e g r e s s i n g l o g [ V a r ( z ) ] on  l o g [ M e a n ( z ) ] . Hence, the dependence of V a r ( z ) on Mean(z) can be found. The v a r i a n c e and mean of the flow i n eqn. 5.4 a r e r e p l a c e d by l o c a l e s t i m a t e s . These estimates a r e moving  59 averages  of order " s " c e n t e r e d about the c u r r e n t time of  i n t e r e s t . Flow o b s e r v a t i o n s are made from t=1 t o T. Estimates f o r the mean and the v a r i a n c e of the flow a t time t are given by:  s .  Z  1  Mean(z) =  z  t i +  s  =  z  5.5  fc  2s+1  Var(z) =  s Z (z t + i i=-  - V* 2s  For s=2, the averaging hence t h i s  1 2s  s L 3  i = -s  2  't + i  - z .(2s+1)  5.6  i s with respect to 5 data p o i n t s ;  i s termed a 5-pt. average. The s m a l l e r " s " i s ,  the more l o c a l  the estimates a r e .  5.3.2 R e s u l t s In a p r e l i m i n a r y a n a l y s i s , estimates a r e c a l c u l a t e d with 3, 5, and 7-pt. averages. R e s u l t s of the r e g r e s s i o n s show l i t t l e  d i f f e r e n c e s i n the values o b t a i n e d f o r k. I t i s  decided t h a t a 5-pt. average  w i l l be used to o b t a i n l o c a l  estimates of the mean and v a r i a n c e of the flow. The time p e r i o d i s from A p r i l 1970  1 t o September 30. Ten years of data,  to 1979 a r e used to determine  the slope k, of eqn. 5.4.  R e s u l t s of the r e g r e s s i o n s are t a b u l a t e d below:  60 Table 5.1  Values of the slope for years 1970-1979  year  k  year  k  1970  2.54  1975  2.06  1971  1 .48  1976  1 .63  1972  1 .92  1977  1 .70  1 973  2.00  1 978  2.10  1974  1 .58  1979  2.64  The average  5.3.3  of the ten k v a l u e s i s  1.96.  Conclusions The averaged  k value of 1.96  v a r i a n c e of the measurement e r r o r  i n d i c a t e s that  the  i s approximately  p r o p o r t i o n a l to the square of the flow  (eqn. 5.3). In other  words, the standard d e v i a t i o n of the n o i s e term  is directly  p r o p o r t i o n a l to the flow. A constant v a r i a n c e f o r the measurement e r r o r would r e s u l t  5.3.4  B o x - C o x  i n k being c l o s e to z e r o .  Transforms  A c l a s s of transforms used  f o r s t a b i l i z i n g the v a r i a n c e  i s known as the Box-Cox transforms. An a p p r o p r i a t e t r a n s f o r m a t i o n of the raw  flow data can be made. The  r e s u l t i n g s e r i e s w i l l have an e r r o r term whose v a r i a n c e i s approximately  constant with time. These tranforms are  used  61  in s i t u a t i o n s where the v a r i a n c e can be expressed as a f u n c t i o n of the mean M;  V a r ( z ) = 0(M)  where <t> i s the  f u n c t i o n a l dependence. The f i r s t order T a y l o r S e r i e s expansion  of the  transformed data i s :  g(z) = g(M) + g'(M)(z-M)  It  5.7  i s d e s i r e d that Var ( g ( z ) ) be approximately c o n s t a n t .  Var  ( g ( z ) ) « g'(z)  2  0(M)  5.8  The f u n c t i o n a l dependence </>, determined  in section  5 . 3 . 3 , i s z . E v a l u a t i o n of eqn. 5.8 g i v e s the a p p r o p r i a t e 2  transformat i o n :  y = g(z) = l n ( z )  Eqn. 5.9 i n d i c a t e s that the incoming  5.9  flow data should be  tranformed by t a k i n g the n a t u r a l l o g a r i t h m s . The new  series  formed by the y ' s , has approximately uniform v a r i a n c e .  62  5.4 Maximum L i k e l i h o o d E s t i m a t i o n of the Noise  5.4.1  Variance  Theory Consider  a set of T dependent o b s e r v a t i o n s drawn from a 2  m u l t i v a r i a t e normal, y_ MVN(jz,a V). The l o g - l i k e l i h o o d function, log £ ( y  1 r  2  - | l o g 2n - | l o g a  y f««. 2  y ) = l o g £(y_) i s : T  - \ log | V | -  1  (y_-M)'V" (y_-u_)  5.10  D i f f e r e n t i a t i n g the above expression with r e s p e c t t o a equating 2  a .  2  and  i t t o 0 y i e l d s the maximum l i k e l i h o o d estimate of  However, the e v a l u a t i o n of the above e x p r e s s i o n  i s time  1  consuming due to | v j ; and V " . T h i s i s because V i s a non-diagonal  matrix as the i n d i v i d u a l e r r o r s are s e r i a l l y  correlated. Schweppe (1965) showed that an e q u i v a l e n t  expression  for l o g £(y_) can be w r i t t e n i n terms of the i n n o v a t i o n s , v^. It  i s d e f i n e d as the o b s e r v a t i o n  forecast error, y -y .  These i n n o v a t i o n s are independently  t  normally  2  with mean 0, and v a r i a n c e a t^, where f  f c  f c  distributed  i s c a l c u l a t e d by  the Kalman a l g o r i t h m . Hence, £ i s a set of independent 2  e r r o r s drawn from a m u l t i v a r i a t e normal, MVN(0, a D ) . D i s a d i a g o n a l matrix  defined as:  63 f D =  The e q u i v a l e n t l o g - l i k e l i h o o d e x p r e s s i o n , l o g £(y_)  -|log  211 - | logo-  Because  2  T - ^ 2 log f t=1  -  is:  t=1 f  5.11  i s a l i n e a r t r a n s f o r m a t i o n of y^. which i s  normally  d i s t r i b u t e d , equations  5.10 and 5.11 are  equivalent. The f o r e c a s t e r r o r  , and i t s v a r i a n c e f^. are  q u a n t i t i e s which can be computed by the Kalman F i l t e r at each time step, as p a r t of the Kalman a l g o r i t h m . T h i s i s achieved  by r e c a s t i n g the time s e r i e s model i n t o the  state-space  framework. Hence, the Kalman F i l t e r  i s used as a  t o o l f o r e v a l u a t i n g the l i k e l i h o o d of T dependent observations  from a m u l t i v a r i a t e normal.  D i f f e r e n t i n g the above e x p r e s s i o n with r e s p e c t t o y i e l d s the formula  for  2  a  2  a.  5.12  The q u a n t i t i e s , where f  f c  and f  = 1 + HPH'.  f c  are given by the Kalman  Filter;  64 5.4.2  Results The  three ARMAX models given  f o r flow f o r e c a s t i n g  i n Chapters 6, 7, and 8 r e s p e c t i v e l y .  For each model, an estimate the transformed obtained  i n Chapter 4 w i l l be used  flow data  of the noise v a r i a n c e based on  i s r e q u i r e d . F i v e estimates are  f o r each model. Flow records from 1976-1980 are  used, with the o b s e r v a t i o n s taken  from A p r i l  1  to  September 30. R e s u l t s a r e given below.  Table 5.2  year  Maximum L i k e l i h o o d  Estimates  of the Noise  Variance  Model 1  Model 2  Model  AR(1)*  TRANSFER  ARMAX  FUNCTION 1 976  .00201  .01237  .00209  1977  .00264  .01578  .00276  1 978  .00122  .01081  .00175  1979 .  .00233  .01085  .00259  1980  .00287  .01888  .00303  R  .0022  .01 37  .0024  av  * D e s c r i p t i o n s of these models a r e given i n Chapter 4, s e c t i o n 4.2. S p e c i f i c a t i o n of the noise v a r i a n c e f o r the Kalman a l g o r i t h m i s based on R_„, i n subsequent chapters  6 t o 8.  65  5.5  Summary  The  t h r e e ARMAX m o d e l s d e s c r i b e d  i n Chapter 4 w i l l  be  u s e d f o r f l o w f o r e c a s t i n g i n C h a p t e r s 6, 7, 8 r e s p e c t i v e l y . For  e a c h m o d e l , t h e v a r i a n c e o f t h e measurement e r r o r a t  each time  s t e p i s unknown. T h i s c h a p t e r  the e s t i m a t i o n of noise In  i s concerned  with  variance.  real-time f o r e c a s t i n g , i t i s d e s i r e d that the  v a r i a n c e o f t h e e r r o r t e r m be t i m e - i n v a r i a n t ; a l t h o u g h i n practice,  this  i s rarely  true.  E s t i m a t i o n of the noise v a r i a n c e divided 1.  i n t o two p a r t s a n d t h e c o n c l u s i o n s a r e :  Regressions  o f l o g [ V a r ( f l o w ) ] on  i n d i c a t e that the variance  log[Mean(flow)]  i s p r o p o r t i o n a l t o the square  of t h e f l o w . By u s i n g t h e Box-Cox t r a n s f o r m ,  i t is  d e t e r m i n e d t h a t t h e raw f l o w d a t a  transformed  by  should  uniform.  The method o f maximum l i k e l i h o o d  i s used t o estimate t h e  n o i s e v a r i a n c e s of t h e transformed The  be  l n ( f l o w ) . The n o i s e v a r i a n c e o f t h e new s e r i e s i s  approximately 2.  f o r e a c h model i s  log-likelihood expression  i n n o v a t i o n s and e v a l u a t e d  data  f o r each model.  i s w r i t t e n i n terms of t h e  by u s i n g t h e K a l m a n  A v e r a g e o f t h e maximum l i k e l i h o o d e s t i m a t e s model i s g i v e n .  Filter.  f o r each  66 Table 5.3  Averaged Maximum L i k e l i h o o d of the Noise  Estimate  Variance  Model 1  Model 2  Model 3  .0022  .0137  .0024  6.  APPLICATION:  An a u t o r e g r e s s i v e model of order t h i s chapter.  AR(1)  1, AR(1) i s c o n s i d e r e d i n  I t i s used to p r e d i c t flows at Hope one day i n  advance. Three ways of f o r m u l a t i n g the AR(1) model Kalman s t a t e - s p a c e  into  format are i n v e s t i g a t e d . Hence, three  schemes a r e used to f o r e c a s t the flow at Hope. The o b j e c t i v e of  t h i s study  i s to compare f o r e c a s t i n g performances of the  three s t a t e - s p a c e  f o r m u l a t i o n s , based on the f o l l o w i n g three  criteria: 1.  RMS e r r o r of the flow f o r e c a s t ,  2.  maximum r e l a t i v e f o r e c a s t e r r o r , and  3.  number of times of  the t r u e  the f o r e c a s t e r r o r i s g r e a t e r than 25%  flow.  From the study, The best  the f o l l o w i n g c o n c l u s i o n s are drawn.  f o r e c a s t i n g performance i s obtained  which uses the Kalman F i l t e r a. As the estimate  to estimate  i s given r e s u r s i v e l y ,  f o r the scheme  the AR c o e f f i c i e n t , flow p r e d i c t i o n i s  made at each time step with the most recent estimate On the other hand, the Kalman F i l t e r estimate  of a.  can be used t o  the flow d i r e c t l y . The f o r m u l a t i o n which models the  flow as the s t a t e , and the e r r o r term as the o b s e r v a t i o n noise r e s u l t s i n the worst f o r e c a s t i n g performance. C o r r e c t i o n t o the flow estimates a r e not adequate as the Kalman gain matrix Formulating  i s too s m a l l .  the AR process as the s t a t e equation i s  e q u i v a l e n t t o not using the Kalman F i l t e r at a l l . The f o r e c a s t i n g performance can be improved by c a s t i n g the model  67  68  in state-space  form and a p p l y i n g the f i l t e r  to estimate the  AR c o e f f i c i e n t r e c u r s i v e l y . The model used to d e s c r i b e the streamflow phenomenon at Hope i s:  q  where  t  =  a q  t-1  +  e  6  t  '  1  q = l n ( f l o w ) at Hope a = autoregressive  coefficient  e  N(0,a )  = white noise ~  fc  2  6.1 D e s c r i p t i o n of the F o r e c a s t i n g schemes Three schemes are used t o f o r e c a s t flows at Hope and are d e s c r i b e d below. The streamflow phenomenon i s represented  by the AR(1) model of eqn. 6.1.  6.1.1 P r o p e r t i e s of Scheme 1 The AR(1) model i s r e c a s t i n t o state-space  framework as  follows:  State eqn:  a  Obs. eqn:  q  a  6.2  fc  =  fc  = q ._ a  The AR(1) process  t_  4  1  t  +  6.3  i s w r i t t e n as the o b s e r v a t i o n  equation,  with a i s the s t a t e v a r i a b l e . Hence, the Kalman F i l t e r i s used to give r e c u r s i v e estimates  of the AR c o e f f i c i e n t .  69 Because t i m e - i n v a r i a n t s t a t e s are c o n s i d e r e d , there i s no n o i s e term a s s o c i a t e d with a, and thus Q = 0. The 1-step flow p r e d i c t i o n i s given by <*t+l/t  ^  s  §t+1  based on the l a t e s t  =  ^t^t+1/t  s t a t e estimate  '  "here  given by the  f i l t e r . Under t h i s scheme, the noise v a r i a n c e i s r e q u i r e d as input t o the Kalman a l g o r i t h m .  6.1.2  P r o p e r t i e s of Scheme 2 The state-space  r e p r e s e n t a t i o n of the AR(1) model i s :  S t a t e eqn.  q  fc  = aq  Obs. eqn.  q  fc  = q  t - 1  + e  6.4  fc  6.5  fc  In c o n t r a s t t o the p r e v i o u s  scheme, the AR(1) process i s  w r i t t e n as the s t a t e equation,  with q  as the s t a t e  fc  v a r i a b l e . Under t h i s f o r m u l a t i o n , an estimate  of a i s  r e q u i r e d p r i o r to f o r e c a s t i n g as i t i s assumed known by the Kalman a l g o r i t h m . An approximate l e a s t squares estimate, i s used. T h i s estimate  a  L g  of a i s h e l d f i x e d throughout the  f o r e c a s t i n g p e r i o d . F o r e c a s t s are given by the p r i o r estimate  6.1.3  of the s t a t e v e c t o r , Q  t + 1  /  = t  a  LS^t'  P r o p e r t i e s of Scheme 3 This f i n a l  estimates  scheme a l s o uses the Kalman F i l t e r  to give  of the flow d i r e c t l y . However, the AR(1) model i s  " s p l i t up" i n the s t a t e - s p a c e  formulation:  70  S t a t e eqn:  q  Obs.  qm  qm  t  eqn:  = aq  fc  t  6.6  t - 1  q  fc  + e  6.7  fc  r e p r e s e n t s the measured flow. Although  q  fc  i s the s t a t e  v a r i a b l e , the noise term i s modelled as the measurement n o i s e . Again, and  L  a  t  t  of a i s r e q u i r e d f o r the a l g o r i t h m  a g i s used. F o r e c a s t s are given by the f i l t e r  = LgQ « q ,  an estimate  as Q^+1/t  T h i s d i f f e r s from Scheme 2 i n the e x p r e s s i o n f o r  as w i l l be shown i n s e c t i o n 6.5.4. An estimate  f o r the  noise v a r i a n c e i s a l s o r e q u i r e d .  Table 6.1 summarizes the p r o p e r t i e s of the three schemes.  71  Table 6.1  Properties  of the f o r e c a s t i n g  schemes  Scheme 1  Scheme 2  Scheme 3  AR(1) = obs. eqn.  AR(1) = s t a t e eqn.  AR(1) " s p l i t up"  estimates of a are  estimate of a  estimate of a  given  required  required  recursively  by the Kalman  p r i o r to  p r i o r to  forecasting  f o r e c a s t ing  e q u i v a l e n t t o no  estimate of o  Kalman  required  Filter estimate of a required  2  Filtering  2  72 6.2 General D i s c u s s i o n For Scheme 1, the estimate of a can be updated at each time step through  the use of the Kalman F i l t e r . The  f o r e c a s t i n g performance can be a f f e c t e d by the input s p e c i f i c a t i o n of the Kalman a l g o r i t h m . For t h i s f o r m u l a t i o n , these inputs are the i n i t i a l  s p e c i f i c a t i o n s f o r the AR  c o e f f i c i e n t and the noise v a r i a n c e ; i . e . a , 0  Schemes 2 and 3 use the Kalman F i l t e r of  the flow d i r e c t l y , as q  P » R. 0  to give  estimates  i s the s t a t e v a r i a b l e . An  fc  estimate of a i s r e q u i r e d p r i o r t o f o r e c a s t i n g as i t i s assumed known by the Kalman a l g o r i t h m . The main of  disadvantage  these schemes i s that a i s h e l d f i x e d d u r i n g the  f o r e c a s t i n g p e r i o d . Hence, the performance of schemes 2 and 3 are not expected  to be b e t t e r than that of Scheme 1.  Under Scheme 2, the 1-step flow p r e d i c t i o n the p r i o r estimate of the s t a t e  q  t + 1  y  t  i s given by  = a q . The l a t e s t L S  f c  estimate of the flow i s : q The  = $  t  t / t  _.  +  K [q -q ] t  t  6.8  t  e x p r e s s i o n f o r the Kalman gain i s : K  where  =  t  P  T  /  P  T  _  H  t/t-1 '  t  H  P  H  = a ^ P ^  1  '  +  R  l  "  1  6.9  + Q  6.10  S p e c i f i c a t i o n of Q a f f e c t s K , and thus q . In t h i s fc  t  a p p l i c a t i o n , H=1 and R=0 leads to a constant gain matrix, K =1. T h e r e f o r e , r e g a r d l e s s of the value f o r Q, K t  same even though  fc  p  is equivalent to:  t  / _ i t  i s the  changes. Because K=1, equation  6.8  73  6.11  Hence the 1-step f o r e c a s t i s : ^t+1/t  =  a  6.12  q  LS t  T h i s e x p r e s s i o n f o r the 1-step f o r e c a s t c o u l d be o b t a i n e d before the AR(1) model was c a s t Hence, f o r t h i s s i t u a t i o n not  into state-space  form.  (K =l), Scheme 2 i s e q u i v a l e n t to fc  using the Kalman F i l t e r . The s e n s i t i v i t y of t h i s  scheme's performance specification  i s s t u d i e d w i t h respect t o  of a.  Under Scheme 3, the Kalman gain i s not e q u a l t o 1. In t h i s case, the performance of the f i l t e r depend on the s p e c i f i c a t i o n  i s expected to  of q , P , R and a. 0  0  6 . 3 Experimental Procedure The f o r e c a s t i n g performance of each scheme i s i n v e s t i g a t e d . The data used are streamflow measurements at Hope, from A p r i l 1 to September 30, f o r the y e a r s 1981 1983. T h e i r performance  i s measured by three  (PI):  1 .  2  PI  indicators  1982  74  This  r e l a t i v e root mean square e r r o r  average magnitude of the  forecast  (RRMS) expresses  e r r o r as a f r a c t i o n . I t i s 3  measured i n terms of the a c t u a l u n i t s ; i . e . y i s in 2.  PI  2  = maximum of the a b s o l u t e r e l a t i v e e r r o r ,  3.  PI  3  = number of times the a b s o l u t e flow f o r e c a s t  e r r o r > 25%  All  of the a c t u a l  three i n d i c a t o r s are  the b e t t e r  i n d i c a t e s the  2  such that  the  summarize, PI,  3 require  squares i s used to o b t a i n model as a r e g r e s s i o n (LSE)  between q and  the  t  and  model. An  of the  1 to  3  AR least time s e r i e s  sample c o r r e l a t i o n c o e f f i c i e n t  Q _i« Taking the dependent v a r i a b l e as  q ,  t  fc  five regressions  are  sample c o r r e l a t i o n c o e f f i c i e n t i s obtained i n  each case. Data used are April  PI  the  approximate l e a s t squares  t  the  values,  represents  method of  t r e a t i n g the  independent v a r i a b l e as Q _ i r  done and  their  estimates f o r the  , by  f o r a i s the  error  forecasts.  c o e f f i c i e n t p r i o r to f o r e c a s t i n g . The  estimate  smaller  denotes the maximum e r r o r , and  frequency of poor  Schemes 2 and  m /s.  flow.  the performance. To  average e r r o r , P I  the  September 30  streamflow records at Hope from f o r the years 1 9 7 6 - 1 9 8 0 . The  f i v e estimates i s used as a f o r  forecasting.  average  75 Schemes 1 and 3 r e q u i r e the n o i s e v a r i a n c e to be known. S i m i l a r l y , t h i s has to be estimated p r i o r the method of maximum l i k e l i h o o d  i s used  t o f o r e c a s t i n g and i n t h i s case. The  same flow records as above are used. Assuming n o r m a l i t y f o r the o b s e r v a t i o n s , the approximate maximum estimates average  (MLE) of a  2  likelihood  were obtained i n Chapter  5. Again, the  of the f i v e estimates i s used as R.  6.4 R e s u l t s  6.4.1 Estimate of AR c o e f f i c i e n t The  LSE of a obtained f o r the years 1976-1980, i s  presented below.  Table 6.2  Least Squares Estimates of the autoreqressive c o e f f i c i e n t  year  It  a  LS  1976  .9963  1 977  .9928  1 978  .9959  1 979  .9969  1 980  .9945  average  .995  i s noted that the estimates of a are c l o s e t o 1.  76  6.4.2 Estimate  of the noise 2  The MLE of a  obtained  variance i n Chapter 5 f o r the years  1976-1980 are g i v e n .  Table 6.3  The  Maximum L i k e l i h o o d Estimates of the n o i s e v a r i a n c e  2  year  MLE(a )  1976  .00201  1977  .00264  1978  .00122  1979  .00233  1980  .00287  average  .0022  average value  i s rounded to .002 f o r use i n the Kalman  algorithm.  6.4.3 Performance I n d i c a t o r s f o r Scheme 1 The  f i l t e r ' s performance i s t e s t e d with r e s p e c t t o  P , and R. The range of parameter values used are given 0  below, f o l l o w e d by the r e s u l t s .  a, 0  77  Table 6.4  Range of values f o r input parameters  Parameter  lower  a  0  .1  2  P  0  .1  10  R  2X10-  Table 6.5a a =1.0  6  upper  limit  2  Values of PI, f o r Scheme 1  P =3.0  o  limit  o  R=.002 u n l e s s otherwise  stated  Parameter v a l u e  1981  1982  1983  all  4%  5%  4%  4%  5%  5%  conditions  R=2X10"  6  Table 6.5b  Values of P I  2  f o r Scheme 1  Parameter v a l u e  1981  1982  1983  all  17%  20%  14%  18%  20%  12%  conditions  R=2X10-  6  78 Table 6.5c  Parameter  Values of P I  value  3  f o r Scheme 1  1981  1982  1983  a l l conditions  0  0  0  R=2X10"  0  0  0  6  R e s u l t s show that the f i l t e r s p e c i f i c a t i o n s of a  0  and P  0  i s very robust to  f o r a l l three performance  i n d i c a t o r s . I t i s noted t h a t i n t h i s a p p l i c a t i o n , the f i l t e r i s a l s o i n s e n s i t i v e t o the s p e c i f i c a t i o n of R.  6.4.4  Performance I n d i c a t o r s f o r Scheme 2 T h i s scheme i s e q u i v a l e n t to f o r e c a s t i n g without the  Kalman F i l t e r . I t was shown i n s e c t i o n 6.2 t h a t f o r e c a s t s do not depend on the value of the noise v a r i a n c e i n t h i s a p p l i c a t i o n . The f o r e c a s t i n g performance i s examined  with  r e s p e c t t o s p e c i f i c a t i o n of a. The range of v a l u e s used a r e :  T a b l e 6.6  Range of v a l u e s f o r a  Parameter  lower  a  .95  limit  upper 1 .05  limit  79  Table 6.7a  Values of PI, f o r Scheme 2  Parameter value  1981  1982  1983  a=.95  34%  35%  34%  a=.99  9%  10%  9%  a=1.0  4%  5%  4%  a=1.0l  10%  10%  10%  a=1.05  51%  52%  51%  Table 6.7b  Values of P I  2  f o r Scheme 2  Parameter value  1981  1982  1983  a=.95  44%  48%  40%  a=.99  24%  28%  20%  o=1.0  18%  21%  13%  O=1.01  16%  23%  18%  o=1.05  66%  74%  64%  80 Table 6.7c  Values of P I  f o r Scheme 2  3  Parameter value  1981  1982  1983  a=.95  182  179  182  a=.99  0  1  0  a=1.0  0  0  0  o=1.01  0  0  0  a=1.05  179  180  182  All  three performance  f o r e c a s t i n g performance The best performance  i n d i c a t o r s show that the  i s s e n s i t i v e t o s p e c i f i c a t i o n of a.  shows an average RRMS e r r o r  5%, while the worst performance The f o r e c a s t i n g p e r i o d  ( f o r a=1.05) r e s u l t s i n 50%.  spans over 183 days, from A p r i l  September 30 of each year. Table 6.7c shows that or  1.05, the f o r e c a s t  actual  6.4.5  (PI,) of  error  i s g r e a t e r that  1 to  f o r a=.95  25% of the  flow at almost every time s t e p .  Performance  Indicators  f o r Scheme 3  The f o r e c a s t i n g performance  i s t e s t e d with respect to  o-r Qo f o r and R. The range of v a l u e s f o r the parameters p  are:  81  Table 6.8  Range of v a l u e s f o r input parameters  Parameter  lower  a  .99  1.01  q  1  15  .3  30  0  P  0  R  2X10"  limit  6  upper  limit  20  R e s u l t s a r e presented below. a=1.0  qo=  Table 6.9a  7  =  Po 3  R=.002 u n l e s s otherwise s t a t e d .  Values of PI, f o r Scheme 3  Parameter value  1981  1982  1 983  a=.99  91%  92%  92%  a=1 .0  46%  48%  44%  a=1.01  24030%  23605%  23234%  R=20  46%  49%  45%  R=2  46%  48%  44%  46%  48%  44%  46%  48%  44%  R=2x10"  6  a l l v a l u e s of q  0  and P  0  82  Table 6.9b  Values of P I  2  f o r Scheme 3  Parameter value  1981  1982  1983  a=.99  1 00%  100%  1 00%  a=1.0  1 30%  77%  109%  a=1.01  80625%  80848%  72877%  R=20  121%  78%  101%  R=2  1 29%  77%  108%  131%  77%  1 10%  1 30%  77%  1 09%  R=2x10" all  6  v a l u e s of q  Table 6.9c  0  and P  0  Values of P I  3  f o r Scheme 3  Parameter value  1 981  1982  1983  a=.99  174  1 75  1 77  a=1 .0  1 16  1 30  1 33  a=1.01  1 78  174  1 72  R=20  1 22  1 25  1 40  R=2  1 15  129  1 32  1 16  130  1 32  116  130  133  R=2X10 all  -6  v a l u e s of q  All worst  0  and P  0  three Performance I n d i c a t o r s show t h a t t h i s i s the  forecasting  i s approximately  scheme as even the s m a l l e s t value f o r PI, 50%. Although  the performance i s robust to  83  specification  of q ,  specification  of a. From Table  0  P,  and  0  R;  i t i s very 6.9a,  sensitive  a=1.0  results  to the  i n the  s m a l l e s t PI,. However, using a=.99 doubles the average error.  6.5  Moreover, using a=1.0l  D i s c u s s i o n of  6.5.1  F o r e c a s t i n g Performance of Scheme 1  coefficient,  the  forecasting initial  of x  Table  and  P  0  i s also  on  the  noted that f o r t h i s  robust  to s p e c i f i c a t i o n  because a)  of  specification  of model c o e f f i c i e n t s .  formulation  a d j u s t e d by using the 6.2  i s that a can incoming flow  shows that the l e a s t  coefficient  i s most  the  b) the f i l t e r  Indeed, the be data.  In f a c t , they are c o n s t r a i n e d to be  of a  less  coefficients.  does not have t h i s  limitation.  can be used to estimate  for non-stationary  For t h i s a p p l i c a t i o n ,  i s robust  squares estimates  1 as they are the sample c o r r e l a t i o n  1.  found i n Chapter 3  l i t t l e effect  performance i s good; and  Consequently, the Kalman F i l t e r  than  has  AR  f o r m u l a t i o n of the Kalman F i l t e r  However, the Kalman F i l t e r  AR  I t was  in real-time forecasting,  are c l o s e to 1. than  0  filter  main advantage of t h i s continually  surprising.  to the  variance.  Hence, t h i s attractive  f i l t e r ' s robustness  performance. I t i s a l s o  application, the noise  the  a i s not  specification  forecasting  to  divergence.  Results  For Scheme 1,  that  l e a d s to f i l t e r  RMS  the  systems where a i s g r e a t e r  the estimates  of a given  by  84  the Kalman F i l t e r are a l s o c l o s e to 1. Hence, they the  regression  6.5.2  support  estimates.  F o r e c a s t i n g Performance of Scheme 2 The AR(1)  model i s w r i t t e n as the s t a t e equation  Kalman model. For K=1,  t h i s i s equivalent  of  to f o r e c a s t i n g  without the Kalman F i l t e r . An  estimate  of a i s r e q u i r e d  p r i o r to f o r e c a s t i n g . R e s u l t s  show the  filter's  to s p e c i f i c a t i o n of the AR a 5% range with  respect to a  Even s l i g h t  sensitivity  c o e f f i c i e n t . Values of a o u t s i d e r e s u l t s i n l a r g e r PI  L g  In p r a c t i c e , a cannot be estimated percent.  the  seasonal  with an accuracy  values. of a  v a r i a t i o n can cause the  few  true  value of a to wander by t h i s amount.  6.5.3  F o r e c a s t i n g Performance of Scheme 3 F o r e c a s t i n g under Scheme 3 r e s u l t s i n the worst  performance. Large i n c r e a s e s  i n a l l Performance I n d i c a t o r s  are noted when s p e c i f i c a t i o n of a i s changed by as l i t t l e 1% from a  L g  .  Therefore,  t h i s i s not a p r a c t i c a l f o r e c a s t i n g  procedure, as a cannot be estimated accuracy.  Poor estimates  c o r r e c t e d as  =  a  =  the  Vt-1  +  by: 6  of the s t a t e v e c t o r  K  [ ( t  be  forecasting period.  L S $t  i s the Kalman estimate  $t  t h i s degree of  of a once chosen cannot  Flow p r e d i c t i o n s are given  q.  with  i t i s held f i x e d d u r i n g  $t+l/t  as  m  3 t"  q i  \  ]  *  1 3  f o r time t :  6.14  85  or  $t  =  There are two 1.  K  = 1 t  t/t-i  q  ^ - V  +  l i m i t i n g cases of  r e s u l t s i n flow  This i s equivalent  K  t  q m  t  6  -  1 5  interest:  f o r e c a s t s given by q  t + 1  y  = a qm .  t  LS  t  to not using the Kalman F i l t e r  (Scheme 2). 2.  f o r e c a s t s given by Q  K =0 r e s u l t s in flow fc  q  T  t/t-1*  n  i  s  t +  i/  = t  a L  s  corresponds to not using the measurement  i n f o r m a t i o n at time t . T h i s scheme leads to  propagation  of e r r o r s as t i n c r e a s e s . In f a c t , t h i s i s the reason why  t h i s f o r e c a s t i n g scheme  y i e l d s such poor r e s u l t s . Successive  flow p r e d i c t i o n s are  based on the previous p r e d i c t i o n without c o n s i d e r a t i o n of the measurements. T h i s then e x p l a i n s why  a=1.0l r e s u l t s in  much l a r g e r Performance I n d i c a t o r s than a=.99. Flow forecasts in this s i t u a t i o n an exponent a , where q fc  i s approximated by q  can be taken to be the  0  0  r a i s e d to initial  flow. For l a r g e t , 1.01*" r e s u l t s in f o r e c a s t s , which d e v i a t e from the a c t u a l flow by a g r e a t e r amount than .99*". I t i s shown below that f o r t h i s a p p l i c a t i o n , K  fc  is  approximately  0. For H=1,  the expression K  K  t  P  t  i s small i f  " t/t-1  p t  S t a r t i n g with P , 0  / _ i t  /  [ P  f o r the Kalman gain reduces t o :  t/t-1  +  R ]  6  i s much l e s s than  Pyo  =  a 2  P  L  s °  +  °- ~  p  R. ° •  '  1 6  86 Hence, K, ~ p / [ p +R], which g i v e s P, ^ 0  1.  If a l a r g e P P^O.  2.  0  If P  0  i s s p e c i f i e d , then K,~1  Subsequent P  FC  which i s s t i l l steps,  p t /  / _i t  which r e s u l t s i n  w i l l a l l be c l o s e to  i s small compared to R,  0  (I-K^PQ.  then K^O.  0. However, P , ~ P  s m a l l . Hence a f t e r a couple approaches 0 r e g a r d l e s s of  of time  P . 0  This situation  should be d i s t i n g u i s h e d from Scheme 1 where  the AR  i s w r i t t e n as the o b s e r v a t i o n  process  that case,  2  a  = R a l s o . The  0  equation.  In  s t a t e v a r i a b l e however, i s the  AR c o e f f i c i e n t . Hence the Kalman gain i s a p p l i e d to a.  Since  small adjustments to a are adequate for improving  the  observation  necessary.  f o r e c a s t , only a small Kalman gain  If the s t a t e v a r i a b l e i s the flow small K  fc  values are not  is  (as in Scheme 3) then  sufficient  to c o r r e c t the  flow  estimates. Therefore, considered  t h i s f o r e c a s t i n g scheme i s the worst  i n t h i s chapter  because:  1.  No adjustment can be made to a.  2.  Flow p r e d i c t i o n at each time step i s not adequately  as K  i s too  one  small.  corrected  87 6.6  Summary The  AR(1)  model i s used to d e s c r i b e  streamflow  phenomenon at Hope. R e s u l t s f o r the performance of f o r e c a s t i n g schemes are given Scheme 1 i s the best  below.  f o r e c a s t i n g procedure. T h i s  corresponds to f o r m u l a t i n g the AR(1) observation  equation  the  model i n the  with a as the s t a t e v a r i a b l e . Hence,  t h i s allows c o n t i n u a l updating  of the AR c o e f f i c i e n t .  The  scheme i s best both i n terms of flow f o r e c a s t s obtained, robustness  to i n i t i a l  s p e c i f i c a t i o n of  Scheme 2 formulates equation  the AR(1)  a.  process  as the s t a t e  in the Kalman model. T h i s i s e q u i v a l e n t  using the f i l t e r  to not  at a l l . Although the performance can be  good as Scheme 2, f o r e c a s t s are s e n s i t i v e to s p e c i f i c a t i o n of the AR  and  as  the  coefficient.  Scheme 3 i s the worst f o r e c a s t i n g procedure. Not  only  are the flow p r e d i c t i o n s s e n s i t i v e to a, the performance i s always poor. Under t h i s scheme, a small Kalman gain leads to i n s u f f i c i e n t c o r r e c t i o n to q . J  +  1  / .. 4  7. APPLICATION: TRANSFER FUNCTION MODEL The AR(1) for  model i n chapter 6 g i v e s the 1-step ahead f o r e c a s t  the flow a t Hope. A p p l i c a t i o n of the Kalman F i l t e r  not n e c e s s a r i l y y i e l d  improved flow p r e d i c t i o n s . The best  f o r e c a s t i n g scheme i s t o use the Kalman F i l t e r the AR c o e f f i c i e n t to  does  t o estimate  r e c u r s i v e l y . T h i s a l l o w s flow p r e d i c t i o n s  be made with the l a t e s t estimate, a . The f o r m u l a t i o n fc  corresponds  t o w r i t i n g the AR(1)  model i n the o b s e r v a t i o n  equation. Often i n f l o o d management however, f o r e c a s t s are r e q u i r e d f o r s e v e r a l days i n advance. There i s a problem with using the AR(1)  model i n a Kalman F i l t e r  p r e d i c t o b s e r v a t i o n s more than model, H  fc  1 step ahead. In the Kalman  which r e l a t e s the o b s e r v a t i o n to the s t a t e , i s  c o n s i d e r e d known a t time t . For the 1-step H t  +  1  framework t o  = Q « But f o r the 2-step t  forecast, H  forecast,  f c + 2  = q  t + 1  « The  problem i s that tomorrow's flow i s unknown. In p r a c t i c e , the 1-step  forecast  i s used as an e s t i m a t e . However, the  v a r i a n c e of t h i s f o r e c a s t e r r o r  (<3  t+2  ~ Q 2^' t+  should not be  c a l c u l a t e d using the Kalman a l g o r i t h m . One a l t e r n a t i v e use a d i f f e r e n t  statistical  model, which avoids  i s to  this  problem, t o p r e d i c t the flows 2 days i n advance. A transfer  f u n c t i o n model i s used to r e l a t e flows  from  upstream s t a t i o n s t o the flow a t Hope. T h i s c o n t r a s t s with the time s e r i e s model of Chapter independent at  6. In t h i s case, the  v a r i a b l e s do not i n v o l v e past v a l u e s of the flow  Hope, the dependent v a r i a b l e . T h i s input-output model  88  g i v e s the 2-day ahead f o r e c a s t at d a i l y i n t e r v a l s , as measurements are r e c e i v e d everyday. The model used to describe  the flow at Hope i s a t r a n s f e r f u n c t i o n model with  constant c o e f f i c i e n t s :  q  where  H t  = [q  T t  _  2  q  N t  _ ] I  +  e  7.  t  2  q = ln(flow)  station  §_ = vector  of c o e f f i c i e n t s  e = white  noise~N(0,a =R)  2  superscripts: H = Hope T = Texas Creek N = Near Spences Bridge  S t a t i o n s T and N are l o c a t e d upstream of Hope. T h e i r geographic l o c a t i o n s are shown i n Chapter 1 , F i g . The o b j e c t i v e given Filter  i s to determine whether  flow  by t h i s model can be improved by using  1 . 1 .  forecasts  the Kalman  to estimate §_ i n r e a l - t i m e . Performance of the  f o r e c a s t i n g schemes are based on the same i n d i c a t o r s used f o r the A R ( 1 ) study.  The f o l l o w i n g c o n c l u s i o n s 1.  are drawn:  Flow f o r e c a s t s are not improved s i g n i f i c a n t l y by the Kalman F i l t e r t o be of p r a c t i c a l i n t e r e s t .  90 2.  However, under the Kalman F i l t e r performance i s robust  scheme, f o r e c a s t i n g  to s p e c i f i c a t i o n of £.  7.1 D e s c r i p t i o n of F o r e c a s t i n g  Schemes  Only two f o r e c a s t i n g schemes are c o n s i d e r e d study. In Chapter 6, three AR model were c o n s i d e r e d .  state-space  in this  formulations  I t was found that the  of the  formulation  which models the AR c o e f f i c i e n t as the s t a t e y i e l d s improved f o r e c a s t s . Hence, t h i s i s the only considered  in this  state-space  formulation  study.  7.1.1 P r o p e r t i e s of Scheme 1 No Kalman f i l t e r i n g  i s employed i n the f i r s t  scheme.  The t r a n s f e r f u n c t i o n model assumes that the c o e f f i c i e n t s are t i m e - i n v a r i a n t . Flow 2 days i n advance i s given  by the  1-step f o r e c a s t :  q  where  j3  L S  H t + 2  - [Q  T t  q  N t  l I  7.2  L S  i s the l e a s t squares estimate obtained  p r i o r to  forecasting.  7.1.2 P r o p e r t i e s of Scheme 2 Kalman F i l t e r i n g model i s c a s t  i s used and the t r a n s f e r  i n t o state-space  function  format as f o l l o w s :  91 = =  q  H  "»'t-r _»'t-i. T  t  = [Q _ t  1  The  7.3  2  q  N t  + e  _ ] l 2  t  2  c o e f f i c i e n t s j3 , |3 , are modelled  s t a t e s . T h e r e f o r e , the Kalman F i l t e r estimates of £  fc  7.4  fc  as t i m e - i n v a r i a n t i s used to give  r e c u r s i v e l y at each time s t e p . The  flow 2  days i n advance i s :  q"  where |?  of  T  N  t  q ] t  l  7.5  t  i s based on the l a t e s t estimate given by the Kalman  t  Filter.  - [q  t + 2  In order to apply the Kalman a l g o r i t h m , an  estimate  the noise v a r i a n c e i s r e q u i r e d .  7.2  General D i s c u s s i o n R e s u l t s of the previous study  which does use the Kalman F i l t e r  imply that the scheme  would y i e l d b e t t e r flow  f o r e c a s t s . T h i s i s because £ i s estimated r e a l - t i m e . Consequently, time  r e c u r s i v e l y in  flow p r e d i c t i o n s are made at each  step with the l a t e s t estimate of the c o e f f i c i e n t s . In  addition,  i t i s expected  that under Scheme 2,  f o r e c a s t i n g performance would be robust to the s p e c i f i c a t i o n , j3 . 0  T h i s i s not l i k e l y  the initial  to be t r u e f o r Scheme  1. Thus, performance of Scheme 1 i s examined with respect to d i f f e r e n t v a l u e s of j 3 « rc:  Scheme 2 i s examined with respect  92 to input s p e c i f i c a t i o n of J3 and 0  It  i s not known how  R.  the f o r e c a s t i n g performance of t h i s  proposed model compares with that of the a u t o r e g r e s s i v e model. T h i s i s a q u e s t i o n of model i d e n t i f i c a t i o n and i s not the main o b j e c t i v e of t h i s study or t h e s i s . However, the f o l l o w i n g reasoning can be g i v e n . S t a t i o n s with l a r g e drainage areas such as Hope, have r e l a t i v e l y  slow system  response. T h i s i s due to the storage c h a r a c t e r i s t i c s of l a r g e b a s i n s . Quimpo (1973) showed that t h i s phenomenon can be r e p r e s e n t e d by an a u t o r e g r e s s i v e p r o c e s s . With  this  c o n s i d e r a t i o n , i t would not be s u r p r i s i n g t o f i n d that the f o r e c a s t i n g performance of the AR(1) model i s b e t t e r than that of the t r a n s f e r f u n c t i o n  7.3  model.  Experimental Procedure The f o r e c a s t i n g performance of each scheme i s  i n v e s t i g a t e d using streamflow measurements from three s t a t i o n s : Hope, Texas Creek, and Near Spences B r i d g e . The f o r e c a s t i n g p e r i o d i s from A p r i l 1982 and  1 to September 30 f o r 1981  1983. The performance of each scheme i s measured by  the three i n d i c a t o r s as d e f i n e d i n Chapter 6. These a r e : 1.  PI, which r e p r e s e n t s the average e r r o r  2.  PI  2  which denotes the maximum e r r o r  3.  PI  3  which i n d i c a t e s the frequency of poor  Small PI v a l u e s are i n d i c a t i v e of good performance.  forecasts  forecasting  93  F l o w p r e d i c t i o n u n d e r Scheme 1 r e q u i r e s an e s t i m a t e f o r J3 p r i o r  t o f o r e c a s t i n g . The m e t h o d o f l e a s t  squares i s used  t o o b t a i n £ g. Flow r e c o r d s used f o r t h i s e s t i m a t i o n a r e L  b a s e d on f i v e y e a r s o f d a t a , of t h e year the  five  i s used; A p r i l  least  1976-1980. The same t i m e p e r i o d  1 t o S e p t e m b e r 30. An a v e r a g e o f  squares estimates  ( L S E ) i s u s e d a s J3 f o r  forecasting. Real-time an e s t i m a t e likelihood  f o r e c a s t i n g w i t h t h e Kalman F i l t e r  requires  f o r t h e n o i s e v a r i a n c e , R. A p p r o x i m a t e estimates  (MLE) were o b t a i n e d  the years  1976-1980. A g a i n ,  estimates  i s used as s p e c i f i c a t i o n  maximum  i n Chapter 5 f o r  the average of these  five  f o r R.  7.4 R e s u l t s  7.4.1 E s t i m a t e  of t h e model  Regression all  years,  for  |3, a r e g i v e n  coefficients  a n a l y s i s g i v e s values of p  1976-1980. Hence o n l y t h e l e a s t below.  2  close to 0 for squares  estimates  94  Table 7.1  Least Squares Estimates of the model c o e f f i c i e n t  year  The  ^LS  1976  (1 .0572)  1977  (1.0514)  1978  (1.0745)  1 979  (1.0556)  1980  (1.0675)  average  (1.0613)  regression results  i n d i c a t e that flows from s t a t i o n N do  not have a s i g n i f i c a n t c o n t r i b u t i o n i n p r e d i c t i n g the 2-day advance flow a t Hope. Because j3 = 0, only the s t a t i o n at 2  Texas Creek i s used as the upstream input i n t h i s input-output  7.4.2  model. Thus, p °  Estimate The  A V  = 1.0613.  o f the noise v a r i a n c e 2  f i v e MLE of a  from Chapter 5 are given below.  95  Table 7.2  Maximum L i k e l i h o o d Estimates of the n o i s e v a r i a n c e  year  R  1976  .01237  1977  .01578  1978  .01081  1979  .01085  1980  .01888  average  .0137  R i s rounded o f f to .015 f o r the Kalman a l g o r i t h m  input  7.4.3 Performance I n d i c a t o r s f o r Scheme 1 R e s u l t s of the performance measures u s i n g j 3  A V  are given  below.  Table 7.3  Performance I n d i c a t o r s f o r Scheme 1  1981  1982  1 983  PI i  10%  1 7%  18%  PI  2  31%  48%  41%  PI  3  3  24  35  The analagous scheme of the AR(1) model i s compared.  96 T h i s corresponds  to flow p r e d i c t i o n using a  L g  without the  Kalman F i l t e r .  Table 7.4  Performance I n d i c a t o r s f o r the AR(1) model  1981  1982  1983  PI ,  5%  7%  6%  PI  21%  24%  16%  2  PI  0  3  0  0  Values of the PI f o r the t r a n s f e r f u n c t i o n model (Table 7.3) are higher than those of the AR(1) model (Table 7.4). T h i s i n d i c a t e s a l e v e l a performance which i s not as good as that o b t a i n e d with the a u t o r e g r e s s i v e model. The  s e n s i t i v i t y of Scheme 1 with respect to input  specification  i s examined. Streamflow data from  used as t h i s r e s u l t e d i n the worst  Table 7.5  PI, PI  2  PI  3  1982 are  performance.  S e n s i t i v i t y of the Performance I n d i c a t o r s to the model c o e f f i c i e n t  0=1.0613  0=1.04  17%  20%  48%  46%  24  37  97 The  value of 0 i s 0 y. The second  first  A  value of 0 i s an  a r b r i t r a r y c h o i c e which r e p r e s e n t s a 2% change from the f3 »  averaged and P I  3  AV  There i s an i n c r e a s e i n PI, (average e r r o r )  (frequency of poor f o r e c a s t s ) f o r 0=1.04.  7 . 4 . 4 Performance I n d i c a t o r s f o r Scheme 2 »  Values of the performance i n d i c a t o r s f o r Scheme 2 are presented:  Table 7.6  Performance I n d i c a t o r s f o r Scheme 2  1981  1 982  1983  PI 1  9%  1 9%  1 3%  PI 2  40%  54%  33%  PI ,  3  32  12  Tables 7.3 and 7.6 are compared. For the 1981 data, comparable f o r e c a s t i n g performance i s obtained f o r the two schemes. In both cases, the number of times the f o r e c a s t error  i s i n excess of 25% of the a c t u a l  flow, ( P I ) 3  is  minimal. The  1982 data r e v e a l a s l i g h t l y  larger difference in  the performance i n d i c a t o r s between the two schemes. A l l three performance i n d i c a t o r s are l a r g e r f o r Scheme 2 which uses the Kalman F i l t e r . T h i s i m p l i e s that b e t t e r flow f o r e c a s t s can be obtained without c o n t i n u a l updating of £.  98  However, performance indicate Filter  indicators  f o r the 1983 data  t h a t b e t t e r f o r e c a s t s are obtained i f the Kalman  i s used.  For t h i s scheme, the s e n s i t i v i t y of i t s performance with r e s p e c t to the i n i t i a l  s p e c i f i c a t i o n of j3, and R i s  examined u s i n g the 1982 data. The range of values used f o r these parameters a r e :  Table 7.7  Range of Values f o r input parameters  Parameter value  lower l i m i t  upper l i m i t  initial £  (-.1.9)  (1.1)  R  .0015  15  Under a l l input s p e c i f i c a t i o n s ,  Table 7.8  PI  Performance  1  the PI v a l u e s are the same.  I n d i c a t o r s f o r Scheme 2  1 9%  PI 2  54%  PI 3  32  T h i s c o n t r a s t s from Table 7.5 where the PI values under Scheme  1 are sensitive  t o the s p e c i f i c a t i o n of j3.  99 7.5 D i s c u s s i o n of R e s u l t s Performance I n d i c a t o r s f o r the two schemes do not clearly  i n d i c a t e which scheme g i v e s b e t t e r f o r e c a s t s .  However, r e s u l t s of the s e n s i t i v i t y analyses  favor the  scheme which uses the Kalman F i l t e r . T h i s i s because the f o r e c a s t i n g performance under Scheme 2 i s robust specification  of the model c o e f f i c e n t s . I t i s found that  without the Kalman F i l t e r , can occur.  a marked decrease i n performance  T h i s was i l l u s t r a t e d with a 2% change i n the  specification  of 0 f o r the 1982 d a t a s e t . A r e g r e s s i o n  performed on the 1982 data very c l o s e t o 0 represent  to i n i t i a l  A V  found that | 3  L S  = 1 .062. T h i s i s  of 1.0623. Hence, r e s u l t s i n Table  7.3  the best performance p o s s i b l e f o r the 1982 data  under Scheme 1.  100  7.6  Summary A time-invariant  describe  t r a n s f e r f u n c t i o n model i s used to  the streamflow phenomenon at Hope. The independent  v a r i a b l e s are flows from two upstream s t a t i o n s lagged 2 days behind that of Hope. I t was found t h a t : 1.  2.  The f o r e c a s t i n g performance of t h i s model i s not significantly  improved o r degraded by a p p l y i n g the  Kalman F i l t e r  to r e c u r s i v e l y estimate J3.  However,  i t i s recommended  that the Kalman F i l t e r  used because the f o r e c a s t i n g performance s p e c i f i c a t i o n of  be  i s robust t o  T h i s i s important i n s i t u a t i o n s  where the model c o e f f i c i e n t s cannot be estimated a c c u r a t e l y or i f they change slowly  through time.  8. APPLICATION: ARMAX MODEL The Kalman F i l t e r transfer  has been a p p l i e d t o a u t o r e g r e s s i v e and  f u n c t i o n models t o y i e l d estimates of the model  c o e f f i c i e n t s r e c u r s i v e l y . These models are used f o r flows a t Hope 1 and 2 days i n advance  predicting  r e s p e c t i v e l y . The advantage of u s i n g the Kalman F i l t e r i s that the f o r e c a s t i n g performance i s robust t o i n i t i a l s p e c i f i c a t i o n of the model c o e f f i c i e n t s . As these a r e u s u a l l y unknown i n h y d r o l o g i c a p p l i c a t i o n s , t h i s  feature i s  a p p r e c i a t e d by p r a c t i c i n g e n g i n e e r s . In a d d i t i o n , use of the Kalman F i l t e r  g i v e s improved flow f o r e c a s t s f o r the A R ( 1 )  model. An ARMAX model f o r d e s c r i b i n g the flow a t Hope i s considered i n t h i s  q  H t  -  c i q  chapter:  H t  _ l  +  C  z  q  T  t - 2  +  c  3 S  N T  - 2  +  e  t  8  '  1  T h i s i s an A R ( 1 ) with upstream i n p u t s . The a u t o r e g r e s s i v e part of t h i s model r e p r e s e n t s storage c h a r a c t e r i s t i c s of the r i v e r b a s i n , while the exogeneous i n p u t s represent c o n t r i b u t i o n s from upstream s t a t i o n s . In t h i s study the Kalman F i l t e r  i s a p p l i e d to the ARMAX model t o g i v e  estimates of the ARMAX c o e f f i c i e n t s predictions  r e s u r s i v e l y . Flow  1 and 2 days i n advance a r e made. The  performance i s t h e r e f o r e , measured f o r both the 1 - s t e p and 2 - s t e p flow f o r e c a s t s . The o b j e c t i v e i s t o compare the 1 - s t e p f o r e c a s t i n g performance of the ARMAX model t o that of  101  1 02  the A R ( 1 ) model and the 2 - s t e p performance t o that of the regression  model. i n d i c a t e that the 1 - s t e p  R e s u l t s of the study  f o r e c a s t i n g performance of the ARMAX model i s comparable t o that of the A R ( 1 ) model. In a d d i t i o n , the 2 - s t e p performance i s b e t t e r than that of the t r a n s f e r - f u n c t i o n model.  8.1 P r o p e r t i e s of the F o r e c a s t i n g Scheme The state-space  formulation of the ARMAX model i s :  r  Ci  =  "c  c  2  =  c  2  c  3  =  c  3  H q  H T - [q _ ! q _  t  t  The Kalman F i l t e r  t  2  t  ] 2  -. £  + e  t  , t  i s used to give estimates  c o e f f i c i e n t s c,, c , and c 2  of the model  at each time step. Performance  3  f o r both the 1 - s t e p and 2 - s t e p flow f o r e c a s t s .  i s measured The  N q -  r  1 - s t e p f o r e c a s t i s given by:  q  H  = [q  t + 1  H  q  t  T  _! q  t  N t  _i  ]  8  £ +l/t t  The 2 - s t e p f o r e c a s t i s given by:  q  where £  t + 1  y  t  t  +  2  = fq  and £  t  t + 2  /t  +  q  1  a  r  e  t  ^  8  q J S. /t t  a s e <  t+2  ^  o  n  t  n  e  latest state  5  '  103 estimate given replaced The  by the f i l t e r .  As q  by the 1-step f o r e c a s t , q autoregressive  i s unknown, i t i s t  +  1  •  nature of t h i s model can r e s u l t i n  the problem of unknown system matrix, H. T h i s occurs when predicting  8.2 General It  flows more than one time step ahead.  Discussion  i s d e s i r e d i n p r a c t i c e that the 1 and 2-step  performance of the ARMAX model be b e t t e r than or equal t o that of the AR(1) and t r a n s f e r f u n c t i o n models r e s p e c t i v e l y . First  of a l l , only one model needs t o be used. Secondly,  there  i s only one noise v a r i a n c e The  t o estimate.  s t a t i o n a t Hope i s c h a r a c t e r i z e d by a l a r g e  drainage a r e a . Thus, i t i s expected that the a u t o r e g r e s s i v e p a r t of the combined model w i l l dominate. T h i s means that estimates of c, w i l l be r e l a t i v e l y and  c . With t h i s reasoning, 3  performance should The and  l a r g e r than those of c  2  the 1-step f o r e c a s t i n g  be s i m i l a r t o that of the AR(1) model.  Kalman a l g o r i t h m  r e q u i r e s s p e c i f i c a t i o n of c ,  P ,  0  0  R. From the s t u d i e s of Chapters 6 and 7, i t has been  concluded that f o r e c a s t i n g performance i s robust  to i n i t i a l  s p e c i f i c a t i o n of the model c o e f f i c i e n t s . T h e r e f o r e , the s e n s i t i v i t y of the f i l t e r ' s performance i s examined with respect  t o the s p e c i f i c a t i o n of the noise v a r i a n c e  only.  One i s s u e of concern here i s that the 2-step p r e d i c t i o n i n v o l v e s unknown elements i n the t r a n s i t i o n matrix, H the Kalman model. For the ARMAX model, H  fc+  2 =  fc+  2 of  104 H fQ  t  T +  1  Q  t  N Q l«  T  n  e  t  Kalman a l g o r i t h m  assumes that  this i s  known. As tomorrow's flow a t Hope i s unknown a t time t , 1-step f o r e c a s t  i s used as an estimate. V i o l a t i n g t h i s  assumption r e s u l t s i n an increase forecast  its  i n the v a r i a n c e  e r r o r . T h e r e f o r e , the 2-step  performance of the f i l t e r robustness of the f i l t e r  gives  of the  forecasting  an i n d i c a t i o n of the  t o the problem of unknown H i n a  practical situation. 8 . 3 Experimental Procedure The  f o r e c a s t i n g performance i s measured by the three  indicators described  i n Chapter 6. For t h i s ARMAX model, the  performance i s measured with respect 2-step f o r e c a s t  to both the 1-step and  errors:  1-step f o r e c a s t  error:  Q  2.  2-step f o r e c a s t  error:  ^t+2/t ~ t + 2  The  i n d i c a t o r s a r e based on these f o r e c a s t  t + 1  /  _  1.  t  P^+i q  e r r o r s measured  3  in a c t u a l u n i t s of m /s. These a r e : 1.  PI, which represents the average e r r o r  2.  P I which denotes the maximum e r r o r  3.  P I which i n d i c a t e s the frequency of poor  The  nature of these i n d i c a t o r s i s such that the smaller the  2  3  values, is  the b e t t e r  from A p r i l  forecast.  the performance. The f o r e c a s t i n g  period  1 to September 30 f o r 1981 1982 1983.  105 For the Kalman a l g o r i t h m i n p u t , an estimate of the noise v a r i a n c e i s r e q u i r e d , and the approximate maximum l i k e l i h o o d estimates obtained  8.4  i n Chapter 5 are used.  Results  8.4.1  Estimate  of the noise  variance  Approximate maximum l i k e l i h o o d estimates were obtained  i n Chapter 5 f o r the years  average of these  (MLE) of a  1976-1980. The  i s used as the s p e c i f i c a t i o n  f o r R i n the  Kalman a l g o r i t h m .  Table 8.1  Maximum L i k e l i h o o d Estimates of the noise v a r i a n c e  year  R  1976  .00209  1977  .00276  1978  .00175  1979  .00259  1980  .00303  average  .0024  R i s rounded to .0025 f o r the Kalman  2  specification.  106 8.4.2  Performance i n d i c a t o r s for the ARMAX model Two s e t s of performance i n d i c a t o r s  first  (PI) are g i v e n . The  set r e f e r s t o the 1-step performance of the Kalman  F i l t e r . The second s e t r e f e r s to the 2-step performance.  1-step Performance  Table 8.2a  Variation  Indicators  of PI, to noise s p e c i f i c a t i o n  Value of R  1981  1982  1983  .025  4%  4.5%  3.5%  .0025  4%  4.5%  3.5%  .00025  4%  4.5%  4.0%  Table 8.2b  Variation  of P I  2  to noise s p e c i f i c a t i o n  Value of R  1981  1982  1983  .025  26%  17%  21%  .0025  34%  17%  24%  .00025  35%  17%  24%  107  Table 8.2c  Variation  of P I  3  t o noise  specification  Value of R  1981  1982  1983  .025  1  0  0  .0025  1  0  0  .00025  1  0  0  2-step Performance I n d i c a t o r s  Table 8.3a  Variation  of PI, t o noise  specification  Value of R  1981  1 982  1 983  .025  9%  8.5%  7%  .0025  11%  8.5%  7%  .00025  11%  8.5%  7.5%  Table 8.3b  Variation  of P I  2  t o noise  specification  Value of R  1981  1 982  1 983  .025  62%  32%  33%  .0025  104%  32%  34%  .00025  108%  32%  44%  108  Table 8.3c  V a r i a t i o n of P I t o noise s p e c i f i c a t i o n 3  Value of R  1981  1982  .025  2  2  .0025  2  2  .00025  3  2  1983  Tables 8.2a and 8.3a show that v a l u e s of P I , a r e l e s s than 12% f o r both the 1 and 2-step performance. They i n d i c a t e that forecasts  the r e l a t i v e RMS e r r o r  also  f o r the 2-step  i s twice that of the 1-step f o r e c a s t s . Tables  8.2(a,b,c)  show the r e l a t i v e i n s e n s i t i v i t y of the 1-step PI  to s p e c i f i c a t i o n of R. Table 8.3b i n d i c a t e s a l a r g e r decrease i n P I f o r the 2-step f o r e c a s t s when R i s s p e c i f i e d 2  larger.  8.4.3 Comparison The  of 1-step Performance  Indicators  1-step f o r e c a s t i n g performance of the ARMAX model  i s compared t o that of the AR(1) model. In both cases, the Kalman F i l t e r  i s used f o r updating the model c o e f f i c i e n t s .  1 09 Table 8.4  Values of PI f o r ARMAX and AR(1)  PI:ARMAX  models  1981  1982  1983  4%  5%  4%  4%  5%  4%  34%  17%  24%  17%  20%  14%  :AR(1) PI ,  PI  PI  2  3  1  0  0  Values of PI, and P I  0 0  3  0  are approximately the same under  both models. However, the ARMAX model y i e l d s l a r g e r values f o r PI # 2  8.4.4  i . e . l a r g e r maximum e r r o r s are o b t a i n e d .  Comparison of 2-step Performance The  2-step f o r e c a s t i n g performance  of the t r a n s f e r f u n c t i o n model. The  Indicators i s compared to that  f o r m u l a t i o n used  corresponds to l e t t i n g the model c o e f f i c i e n t s be the s t a t e v e c t o r i n the Kalman model.  110 Table 8.5  Values of PI f o r ARMAX and T r a n s f e r F u n c t i o n models  PI:ARMAX  1981  1982  1983  11%  8.5%  7%  9%  19%  13%  104%  32%  34%  40%  54%  33%  2  1  2  3  32  12  :TRANSFER FUNCTION PI,  PI  PI  2  3  Values of PI, f o r the ARMAX model a r e comparable or l e s s than those f o r the input-output model. R e s u l t s f o r P I are more v a r i e d . F o r e c a s t i n g under the ARMAX model y i e l d s a higher P I value f o r the 1981 data; while a lower 2  PI  2  value  i s obtained f o r 1982 data. The 1983 data r e s u l t s i n almost the same P I v a l u e s f o r both models. For P I , the ARMAX 2  3  model y i e l d s lower v a l u e s i n a l l three c a s e s . In f a c t , the number of times the 2-step  forecast error  i s g r e a t e r than  25% of the a c t u a l flow i s a t most 2. T h i s compares with 32 f o r the t r a n s f e r seldom g i v e s poor  f u n c t i o n model. Hence, the ARMAX model forecasts.  2  111  8.5  Discussion of Results  8.5.1  Performance o f t h e ARMAX model Tables 8.2c and 8.3c i n d i c a t e that t h i s model  relatively greater the  few f o r e c a s t s which r e s u l t i n f o r e c a s t  than 25% of the a c t u a l flow. T h i s  i s true  error for both  1-step and 2-step flow p r e d i c t i o n s . The 1-step  f o r e c a s t i n g performance i s r e l a t i v e l y s p e c i f i c a t i o n of the noise larger results i n better  variance.  i n s e n s i t i v e to However, s p e c i f i f y i n g R  2-step performance i n d i c a t o r s . T h i s  i s not s u r p r i s i n g as the true n o i s e v a r i a n c e forecast  8.5.2  gives  f o r the 2-step  i s expected t o be l a r g e r .  Comparison o f 1 - s t e p performance The  1-step f o r e c a s t s of the ARMAX model a r e compared  with those of the AR(1) model. R e s u l t s  do not give a strong  i n d i c a t i o n as t o which model i s b e t t e r , as comparable v a l u e s for  a l l three  PI a r e o b t a i n e d . I f one i s only  interested in  p r e d i c t i n g the flow one day i n advance, then the AR(1) model i s adequate. In f a c t , i t i s p r e f e r r e d over the ARMAX model because i t gives a parsimonious  8.5.3  representation.  Comparison o f 2 - s t e p performance As noted i n s e c t i o n 8.4.4, the frequency of poor  forecasts  (PI ) 3  i s l e s s f o r the ARMAX model than the  t r a n s f e r f u n c t i o n model. P I o v e r a l l consistency  3  i s an important  and r e l i a b i l i t y  indicator i f  of the f o r e c a s t s a r e  112 criteria  i n p r a c t i c e . In a d d i t i o n , the average f o r e c a s t  e r r o r , represented by PI,, f o r the ARMAX model i s l e s s  than  or equal t o that of the input-output model. Hence, i n terms of model i d e n t i f i c a t i o n , there i s a s l i g h t preference f o r the ARMAX model for p r e d i c t i n g streamflow The  f a c t that reasonable  2 days i n advance.  performance i s obtained f o r  the 2-step f o r e c a s t s with the ARMAX model leads to the f o l l o w i n g c o n c l u s i o n . For the Kalman model, c a l c u l a t i o n of the 2-step f o r e c a s t r e q u i r e s that q  i s known. As t h i s i s  unknown i n p r a c t i c e , i t s estimate given by the Kalman F i l t e r as the 1-step f o r e c a s t i s used. Although  t h i s v i o l a t e s the  assumption of the Kalman model, i t i s found  in this  a p p l i c a t i o n that the 2-step f o r e c a s t i n g performance of the Kalman F i l t e r  i s acceptable  f o r e n g i n e e r i n g purposes. In  f a c t , the performance i s b e t t e r than that of the t r a n s f e r f u n c t i o n model. For the h y d r o l o g i s t , i t i s important  that the flow  p r e d i c t i o n s are given with t h e i r a s s o c i a t e d standard  error.  In the case of the 1-step f o r e c a s t , t h i s v a r i a n c e i s given by the Kalman F i l t e r as HPH' + R. T h i s assumes that H i s a d e t e r m i n i s t i c known q u a n t i t y . When elements of the t r a n s i t i o n matrix H a r e unknown, an estimate illustrated  i s used; as  i n the 2-step f o r e c a s t above. R e s u l t s of the  2-step PI i n t h i s study show that reasonable f o r e c a s t i n g performance can s t i l l  be o b t a i n e d . However, there  still  remains the problem of what e x p r e s s i o n i s t o be used f o r the v a r i a n c e of the 2-step f o r e c a s t e r r o r . T h i s i s addressed i n  1 13  the  next  chapter.  8.6 Summary If  flow  required,  predictions  the  ARMAX can be u s e d .  comparable performance 2-step forecasting input-output  system m a t r i x ,  study  that  indicates  are q u i t e  This  is  i n advance  are  because  has  AR(1) m o d e l . is  applications,  H is  often  not  better  that  using  it  In a d d i t i o n ,  than  that  time  performance.  to  assumption  an e s t i m a t e ,  p a r t i c u l a r example,  insensitive  the  satisfied  more t h a n one  forecasting  in t h i s  the  performance  observations  reasonable  as  1-day  of  a the  the  model.  In h y d r o l o g i c  future  more t h a n  the  the  when  of  known  predicting  step ahead. H still  Moreover,  results  it  1 and 2 - s t e p  specification  of  R.  This  is  in a  found  forecasts  9. VARIANCE OF THE FORECAST ERROR In f l o o d management, the engineer p r e d i c t streamflows  i s o f t e n r e q u i r e d to  s e v e r a l days ahead. The r e l i a b i l i t y of  these p r e d i c t i o n s i s r e f l e c t e d by t h e i r a s s o c i a t e d standard e r r o r s . For the Kalman F i l t e r *  =  Y_ = H x t  the  +  **t-l  t  t  t  9  *t v  +  s t a t e - s p a c e model, *  1  9.2  t  1-step f o r e c a s t e r r o r f o r y_ , known as the i n n o v a t i o n ,  i s g i v e n . I t s v a r i a n c e , c a l c u l a t e d by the Kalman a l g o r i t h m assumes that the system matrix H, i s known. In t h i s  chapter,  a g e n e r a l e x p r e s s i o n f o r the v a r i a n c e of the o b s e r v a t i o n f o r e c a s t e r r o r when both H and x are unknown, i s developed. ARMAX models have been used f o r streamflow  modelling i n  t h i s t h e s i s . The state-space f o r m u l a t i o n c o n s i d e r e d here i s where the ARMAX c o e f f i c i e n t s are the s t a t e v a r i a b l e s . The system matrix, H c o n t a i n s past streamflows. forecast, £  H  t + k  i s given by t+k-t+k"  flow p r e d i c t i o n s more than  F  o  r  t  h  e  The k-step A  R  (  1  flow  ) model,  1 time s t e p ahead r e q u i r e s  knowledge of f u t u r e flows, hence H ^ t +  i s unknown. For  instance, ^t +2  =  fi  x  t + 2 -t + 2  =  ^-t+]-t + 2  In p r a c t i c e , the 1-step p r e d i c t i o n for  Y_  t+1  . Because £  values of  t + 1  t+ 2  ,  3  i s used as an estimate  are both based on past  they are not n e c e s s a r i l y independent. The  o b j e c t i v e of t h i s study for  and x.  9  i s t o develop a g e n e r a l expression  the v a r i a n c e of the f o r e c a s t e r r o r when both H and x are  unknown, and t h e i r estimates are c o r r e l a t e d with each o t h e r .  1 14  11 5 From h e r e o n , t h e t i m e  s u b s c r i p t s a r e dropped f o r n o t a t i o n a l  convenience.  9.1  Background H a r r i s o n & Stevens  (1976) s u g g e s t e d  s q u a r e e r r o r (MSE) o f t h e k - s t e p f r o m t h e Kalman F i l t e r . Cov  f o r e c a s t c a n be o b t a i n e d  F o r k=1, t h e f i l t e r  gives  (y_-£) = HPH' + R  9.4  where R a n d P a r e c o v a r i a n c e m a t r i c e s n o i s e , and s t a t e e s t i m a t e s Priestley  t h a t t h e mean  for the observation  r e s p e c t i v e l y . A s p o i n t e d o u t by  i n t h e d i s c u s s i o n o f t h e above paper  (1976),  this  i s n o t a p p r o p r i a t e i f H i s unknown. Feldstein  (1971) d e v e l o p e d  a formula  b o t h H a n d x a r e unknown. R e g r e s s i o n in  that context. H corresponds  contains  t o a design matrix  The f o r m u l a  assumes t h a t H and x a r e independent.  noted  models a r e c o n s i d e r e d which  independent v a r i a b l e s , and x i s a v e c t o r of  regression coefficients.  presents  f o r Cov ( y - y ) when  some l i m i t a t i o n  g i v e n by F e l d s t e i n This  assumption  i n h y d r o l o g i c a p p l i c a t i o n s . As  f o r t h e AR(1) example, H and x a r e l i k e l y  t o be  c o r r e l a t e d a s t h e y a r e b o t h b a s e d on p a s t v a l u e s o f y.  9.2  Problem The  definition  statistical  model c o n s i d e r e d i s :  y_ = H x + £  9.5  1 16 where  The  1  vector of o b s e r v a t i o n s  x  vector of unknown parameters  H  system matrix  e  noise term  o b j e c t i v e i s to f i n d Cov (y_-£) when H and x are both  unknown. Assumptions a r e : 1.  H, x are unbiased  e s t i m a t o r s f o r H and x r e s p e c t i v e l y .  2.  The f o l l o w i n g covariance submatrices  are assumed known.  Z,!= E (fi-h)(fi-h)' E, = E ( f i - h ) ( x - x ) ' 2  L 2= 2  E (x-x)(x-x)'  where fi = vec H, The  operator  h = vec H.  "vec" on matrix H stacks i t s columns one  under the other, r e s u l t i n g  The  i n a column v e c t o r , h.  f o r e c a s t £ i s given by Hx. No assumption i s made on the  r e l a t i o n between H and x. The  covariance matrix  f o r the f o r e c a s t e r r o r i s :  Cov (y_-£) = Cov (Hx) + Cov U)  9.6  E v a l u a t i o n of Cov (Hx) i n v o l v e s c a l c u l a t i n g the expected value of squared  q u a n t i t i e s of H and x. T h i s r e q u i r e s the  f o u r t h moment of t h e i r j o i n t d i s t r i b u t i o n . E x p r e s s i o n f o r the f o u r t h moment can be obtained  i n terms of lower order  moments, i f H and x are assumed to be j o i n t l y d i s t r i b u t e d as  117  a m u l t i v a r i a t e normal. A convenient way of the elements Kronecker  of w r i t i n g products  of H and x i s through the use of the  product. A p r e l i m i n a r y r e s u l t  i s given i n the next  s e c t i o n before the d e r i v a t i o n of the g e n e r a l formula. The r e s u l t c o n t a i n s the e x p r e s s i o n s f o r the c o v a r i a n c e s of products of normal of the Kronecker  9.3  random v a r i a b l e s . I t i s w r i t t e n i n terms  product.  C o v a r i a n c e s of Products of normal  random v a r i a b l e s  The Kronecker product i s d e f i n e d as Given 2 m a t r i c e s  A  , B mxn  follows:  ., then A ® B i s a (ms,nt) matrix sxt  with submatrices c o n s i s t i n g of a^jB. T h e r e f o r e , the Kronecker  product of a n-dimensional v e c t o r , i s a super long 2  v e c t o r of l e n g t h n . From Magnus and Neudecker following result  i s used:  If a i s d i s t r i b u t e d as N ( y , n  a, then Cov  n  n  is a n  2  V) where "n" i s the l e n g t h of  (a ® a) i s given by  (I + K )  K  (1979), the  by n  2  J  (V ® V + V ® MM'  + MM'  ® V)  matrix d e f i n e d such that  9.7  K vec(A) =  vec(A'). For the purpose  of the t h e s i s , the f o l l o w i n g  i s r e s t r i c t e d to a s c a l a r model f o r y . The AR(1) fc  c o n s i d e r e d where to H  fc  and x  fc  y  fc  = Y_ t  1  + e^..  Y^-i  a n o  -  a  t  derivation model i s c o r r e s  P  o n  cl  r e s p e c t i v e l y . Hence, both H and x are s c a l a r s .  Flow f o r e c a s t s greater than 2 steps ahead, r e q u i r e estimates  1 18 for H. The expression f o r the v a r i a n c e i s developed i n the next  9.4  of the f o r e c a s t  error  section.  V a r i a n c e of the f o r e c a s t e r r o r f o r the s c a l a r case I d e n t i f y i n g the v e c t o r  a from s e c t i o n  9.3 as  fl ft  H x x fi  the Kronecker product, a ® a =  X  The s c a l a r q u a n t i t y  9.9  X  Var (fix) i s given by the 2nd (or 3rd)  element of the above v e c t o r  product. I t i s assumed  fi  Z, ,  L,  Z1 2  2-2 2  that  2  N,  where Z^j have been d e f i n e d  in section  development, they are s c a l a r Result  9.7  9.2. In t h i s  quantities.  from s e c t i o n  9.3  i s e v a l u a t e d below,  H  "2, 1  X  2  Z! 2  i d e n t i f i e d as  K,  =  10 0 0 0 0 10 0 10 0 0 0 0 1  (I + K ) = 2  2 0 0 0  0 1 1 0  Z 2 2  0 1 1 0  0 0 0 2  119  The f o l l o w i n g three matrices  ^r 1 i  are symmetric  1^12  2  Z1 2  1^12  1^22  z  Also V ® V =  2 12 1Z  Z! Z 2  22  Z ! Z2 2 2 Z 2  2 2  2  2  H Z 1 2 HxZ j 2  H ^ 1 1 HxZ, 1  2  x Z, 1  HxZ 1 2  2  V ®  MM' =  2  X  Z!  2  H Z 2 2 HxZ 2 2  2  X  H  1 1 H Z, 2  HxZ 1 1  2  L  H Z 2  Z2 2  HxZ ! 2  HxZ 1 2 HxZ 2  2  2  MM'  2  2  ft V = x Z 1i  xZ1 2  2  2  2  X  A d d i t i o n of these  2  2 2  2  3 terms and p r e m u l t i p l y i n g by (I + K ) 2  g i v e s the e x p r e s s i o n  H Z  Z2  + Z„  f o r Var(Hx):  2  (x + Z  2 2  )  +  £  1 2  <  2 H x  +  ^2)  9.9  Since, Var (y-y) = Var e + Var (Hx) the variance of the forecast error i s :  2  (R + H Z  2 2  ) + Z  2  V 1  (x + Z  2 2  ) + Z  1 2  (2Hx + Z  1 2  )  9.10  1 20  Eqn.  9.10 g i v e s the general formula  f o r the v a r i a n c e of the  f o r e c a s t e r r o r when both H and x are unknown s c a l a r q u a n t i t i e s . S p e c i a l cases can be obtained and are d i s c u s s e d below. 1.  I f H i s unknown, but i t s estimate H i s u n c o r r e l a t e d with x, then Z  1 2  = 0. T h i s assumption i s v a l i d  for regression  models where the v a r i a b l e H i s t r u l y exogeneous. Hence, Var  2  (y-y) = (R + H Z  2 2  ) + Z ^ U  2  + Z  2 2  )  9.11  T h i s i s the same expression as that given by F e l d s t e i n ' s formula 2.  f o r a s c a l a r r e g r e s s i o n model.  I f H i s known, then Z  1 2  = Z,, = 0. T h i s assumption i s  used i n the Kalman model f o r c a l c u l a t i n g the v a r i a n c e of the  1-step i n n o v a t i o n . Equation  Var  (y-y) = R + H Z  Z  i s the mean square e r r o r f o r the s t a t e v a r i a b l e x.  2 2  It  9.10 then  reduces to  2  9. 1 2  22  i s synonymus with "P" i n the Kalman F i l t e r  9.5 I l l u s t r a t i o n of the v a r i a n c e formula  notation.  f o r the AR(1)  The AR(1) model i s used t o i l l u s t r a t e the use of the general formula  (eqn. 9.10) f o r the variance of the f o r e c a s t  e r r o r . A p p l i c a t i o n of the Kalman F i l t e r to t h i s model was s t u d i e d i n Chapter 6. The s t a t e - s p a c e here corresponds  formulation  considered  to l e t t i n g the AR c o e f f i c i e n t be the s t a t e  v a r i a b l e . In p r a c t i c e , flow p r e d i c t i o n s s e v e r a l time steps ahead are o f t e n d e s i r e d . Because H and x are both unknown and  t h e i r estimates cannot be assumed to be independent,  121  eqn.  9.10  should be  The  used.  flow at Hope t y p i c a l l y  3  7000 m /s  throughout  coefficient  ranges  from  the y e a r . In p r a c t i c e , the  i s of the order of 1.0.  for  each term of eqn.  1.  R  9.10  Studies from Chapter  3  1000  m /s  to  AR  Characteristic  values  i s examined below:  5 i n d i c a t e that the standard  d e v i a t i o n of the measurement e r r o r  i s p r o p o r t i o n a l to  the flow i t s e l f . In p r a c t i c e , these e r r o r s are u s u a l l y l e s s than  10%  (Water Survey  of Canada, p e r s o n a l  communication). Hence, R i s taken to be 2•  2  .Oly .  2 2 2  T h i s i s given by the Kalman F i l t e r  as the MSE  of the  AR  c o e f f i c i e n t , a. As noted e a r l i e r , a i s approximately  1.  Estimates of a are o b t a i n e d at each time s t e p from Kalman F i l t e r . A c o n s e r v a t i v e estimate of i t s MSE percentage in L 2  =  2  3.  2,  of the true value i s about  the as a  10%. T h i s r e s u l t s  .01.  ,  T h i s i s the MSE  f o r H. For time t+2,  y  t + 1  t + 1  .  The M S E ( y  HPH'  t h i s corresponds  ) i s given by the Kalman F i l t e r  + R. Using the above v a l u e s f o r P and R, 2,,  to  as =  2  .02y . 4.  2  1 2  Since fl and x are p o s i t i v e l y c o r r e l a t e d , an estimate f o r Z  1 2  can be obtained by c o n s i d e r i n g the c o r r e l a t i o n  between fl and x. The c o r r e l a t i o n  i s of the order of .1  122  to  1, hence I  Each term of eqn.  . O O l y to . O l y .  i s approximately  1 2  9.10  i s expressed  i n terms of the  flow  magnitude:  (.Oly  2  2  + . O l y ) + .02y (1+.01) + .01y  (2y + . O l y )  2  9.13  D i s r e g a r d i n g terms with lower orders of magnitude y i e l d s  (.Oly  2  2  + . 0 l y ) + .02y  Expression  9.14  2  + .02y  2  9.14  i n d i c a t e s that i n p r a c t i c e , a l l terms i n the  general e x p r e s s i o n are of the same order of magnitude. Therefore,  i t i s not  general formula  j u s t i f i e d to n e g l e c t any part of  of eqn.  9.10.  Use  the  of the Kalman a l g o r i t h m to  c a l c u l a t e Var  (y-y) i s e q u i v a l e n t to using the terms i n the  first  only of eqn.  is  9.6  bracket  9.14.  Thus, the a c t u a l v a r i a n c e  s i g n i f i c a n t l y l a r g e r i f the c o r r e c t expression  Conclusions A s c a l a r o b s e r v a t i o n model i s c o n s i d e r e d  chapter, of  i s used.  y = Hx  + e.  A g e n e r a l formula  the f o r e c a s t e r r o r i s developed  in t h i s  f o r the  variance  when H and x are  both  unknown and  t h e i r estimates are c o r r e l a t e d with each o t h e r .  The  obtained i s :  formula  2  (R + H Z  2 2  ) + In  (x  2  + 2 ) 2 2  +  Z12  <2Hx +  I  1 2  )  123  The  expression  for  the measurement and  The  second part r e f l e c t s the u n c e r t a i n t y  ( I , , ) ; and  c o n s i s t s of three p a r t s : The state estimation  x  (£  1 2  £22)*  in the estimate, fl correlation  ).  Two  s p e c i a l cases of the general  1.  F e l d s t e i n s formula  formula  are:  1  (R + Z  2  2 2  H ) + I,! (x  2  +  T h i s i s used i f fl and appropriate 2.  accounts  e r r o r s (R +  the f i n a l part accounts f o r the  between fl and  first,  2 2  x are u n c o r r e l a t e d which i s  i f H i s t r u l y an exogeneous v a r i a b l e .  Kalman F i l t e r (R  Z )  formula  2  +  Z22H )  This i s appropriate  for the  1-step f o r e c a s t because H i s  a known q u a n t i t y . The  importance of t h i s formula i s i l l u s t r a t e d with  a p p l i c a t i o n of the AR(1) Hope. The  variance  inappropriate 1.  model to streamflow p r e d i c t i o n at  equation  given  i n t h i s case f o r two  by the Kalman F i l t e r  is  reasons:  Flow p r e d i c t i o n s 2-steps ahead or more r e q u i r e knowledge of f u t u r e flows  2.  an  The  (unknown H).  estimates fl and  x are c o r r e l a t e d with each other,  as  they are both based on past values of y.  The  practical  illustration  the general v a r i a n c e magnitude, and  i n d i c a t e s that a l l three p a r t s of  expression  are of the same order  hence cannot be n e g l e c t e d .  of  In any  forecasting  s i t u a t i o n , d e c i s i o n s are o f t e n made based on the  reliability  124  of the f u t u r e p r e d i c t i o n s . As the r e l i a b i l i t y in the standard e r r o r of f o r e c a s t ,  the c o r r e c t  developed i n t h i s chapter should be used.  is reflected expression  10. SUMMARY AND CONCLUSIONS ARMAX models a r e o f t e n used t o d e s c r i b e s t o c h a s t i c processes such as streamflow phenomena i n hydrology. These a r e time s e r i e s models with exogeneous i n p u t s . A p p l i c a t i o n of the Kalman F i l t e r  to these types of models f o r f l o o d  i s considered i n t h i s  forecasting  thesis.  The Kalman F i l t e r  i s a r e c u r s i v e e s t i m a t i o n procedure  which g i v e s a l i n e a r , minimum v a r i a n c e e s t i m a t o r f o r the s t a t e v e c t o r a t time t . The estimate i s updated a t each time step by making use of incoming n o i s e - c o r r u p t e d o b s e r v a t i o n s . The Kalman a l g o r i t h m , based on a s t a t e - s p a c e model, accounts for u n c e r t a i n t i e s both i n the s t a t e s and i n the measurements. The performance  of t h i s e s t i m a t i o n technique depends on  s a t i s f y i n g the assumptions Use of the Kalman F i l t e r  of the Kalman s t a t e - s p a c e model.  i n aerospace a p p l i c a t i o n s i s very  s u c c e s s f u l because the p h y s i c a l equations and system dynamics a r e w e l l known. However, such i s not the case i n streamflow a p p l i c a t i o n s as there a r e many u n c e r t a i n t i e s a s s o c i a t e d with t h i s s t o c h a s t i c phenomenon. Choice of the "best" ARMAX model with the "proper" noise s t a t i s t i c s i s o f t e n an impossible g o a l t o achieve i n e n g i n e e r i n g p r a c t i c e . N e v e r t h e l e s s , the Kalman F i l t e r assumptions  i s used even when the  of the Kalman model a r e not completely  satisfied. The f o l l o w i n g s e c t i o n summarizes the main c o n t r i b u t i o n s of the t h e s i s to the f i e l d of Kalman F i l t e r i n g  125  i n streamflow  126  f o r e c a s t i n g . Subsequent s e c t i o n s give the c o n c l u s i o n s f o r each study. The aim of t h i s t h e s i s i s t o f u r t h e r the understanding  of the Kalman F i l t e r as a p p l i e d t o h y d r o l o g i c  systems. The p r a c t i c a l i t y and the performance of t h i s e s t i m a t i o n technique  i s examined i n the context  of ARMAX  flow models. I t should be p o i n t e d out that the o b j e c t i v e i s not t o i d e n t i f y the best h y d r o l o g i c model f o r a p a r t i c u l a r phenomenon. No doubt, more p h y s i c a l l y based models have been proposed and used with much success, e x t e n s i v e data base necessary  f o r basins  with  as i n p u t s . R e c o g n i t i o n of  s t o c h a s t i c elements i n h y d r o l o g i c processes use of ARMAX models i n streamflow m o d e l l i n g .  has l e d t o the I t i s f o r these  s i t u a t i o n s that the a p p l i c a t i o n of the Kalman F i l t e r i s considered.  In t h i s t h e s i s ,  i t i s assumed that whichever  ARMAX model i s chosen, i t i s an adequate d e s c r i p t i o n of the underlying  process.  10.1 Summary of T h e s i s General  Contributions  p r a c t i c a l problems which a r i s e  a p p l i c a t i o n of the Kalman F i l t e r  i n the  are described  i n the  l i t e r a t u r e review of Chapter 2. S e v e r a l problems which f r e q u e n t l y occur  i n hydrologic a p p l i c a t i o n s are investigated  in t h i s t h e s i s . There a r e three main c o n t r i b u t i o n s of t h i s r e s e a r c h , each summarized i n the f o l l o w i n g paragraphs. Advances i n the understanding  of the Kalman F i l t e r  i s made  in terms o f : 1.  f i l t e r ' s s e n s i t i v i t y t o input s p e c i f i c a t i o n , and  127  2.  p r a c t i c a l i t y of the f i l t e r  for d i f f e r e n t  state-space  f o r m u l a t i o n s of ARMAX models. Correct expression forecast  f o r the mean square e r r o r (MSE) of  i s d e r i v e d f o r the AR(1)  model. I t i s shown that  under a p p r o p r i a t e assumptions, the e x p r e s s i o n Kalman equation  reduces t o the  for the v a r i a n c e of the i n n o v a t i o n s . T h i s  t h i r d c o n t r i b u t i o n i s important  when a u t o r e g r e s s i v e models  are used t o f o r e c a s t flows s e v e r a l time steps ahead. Under the Kalman model, t h i s process v i o l a t e s the assumption of known system matrix H, and use of the Kalman underestimates The  equation  the v a r i a n c e of the f o r e c a s t e r r o r s .  problem of input s p e c i f i c a t i o n ,  o f t e n unknown i n p r a c t i c e i s f i r s t  for q u a n t i t i e s  investigated. Results  i n d i c a t e that of the four q u a n t i t i e s ( x , P , Q, R), only 0  0  the combined s p e c i f i c a t i o n QR has p r a c t i c a l e f f e c t s on the observation  f o r e c a s t s . The s e n s i t i v i t y  study gives an  i n s i g h t as t o how the noise c o v a r i a n c e s can be s p e c i f i e d i n order t o achieve filter. also  reasonable  f o r e c a s t i n g performances for the  In a d d i t i o n , the worst s p e c i f i c a t i o n combination i s  noted. The  p r a c t i c a l i t y of the Kalman F i l t e r  i s illustrated  through three s p e c i a l cases of the ARMAX model. The three models a r e used to d e s c r i b e flow phenomenon of the F r a s e r R i v e r a t Hope; t y p i c a l of basins whose response c h a r a c t e r i s t i c s are constant, or change slowly through The  time.  g e n e r a l i t y of the state-space approach allows  flexibility  i n model f o r m u l a t i o n . For f o r e c a s t i n g purposes,  128  the  studies  write  i n d i c a t e that the most u s e f u l  formulation  i s to  the ARMAX model as the measurement equation with the  ARMAX c o e f f i c i e n t s as the s t a t e v a r i a b l e s . Hence, the Kalman Filter  i s used t o give  c o e f f i c i e n t s such that the  latest  recursive  estimates of the  flow f o r e c a s t s a r e always made with  s t a t e e s t i m a t e s . Two main advantages of t h i s  formulation  over other p o s s i b l e  ones a r e l e s s data  requirements, and robustness of flow f o r e c a s t s initial  t o poor  knowledge of ARMAX c o e f f i c i e n t s .  The  Kalman F i l t e r  estimation  i s a 1-step p r e d i c t o r - f i l t e r  technique. However, f o r e c a s t s  time-steps ahead a r e r e q u i r e d  for  several  i n p r a c t i c e and the f i l t e r i s  o f t e n used f o r making these k-step f o r e c a s t s . In s i t u a t i o n s where the system matrix H i s unknown, the v a r i a n c e f o r e c a s t e r r o r should not be c a l c u l a t e d algorithm.  A correct  model. T h i s  expression  important consequences i n p r a c t i c e because management  decisions  a r e o f t e n based on the r e l i a b i l i t y of flow  predictions It  from the Kalman  expression for t h i s variance i s  developed f o r the u n i v a r i a t e AR(1) has  of the  which i s i n d i c a t e d by t h e i r mean square e r r o r s .  i s a l s o shown that  in the d e r i v e d  i n hydrologic  a p p l i c a t i o n s , a l l terms  e x p r e s s i o n a r e of comparable magnitudes.  Conclusions of each i n v e s t i g a t i o n a r e given i n the following  sections.  1 29  10.2 S e n s i t i v i t y A n a l y s i s The  Kalman a l g o r i t h m r e q u i r e s input s p e c i f i c a t i o n f o r  the noise c o v a r i a n c e s and  initial  c o n d i t i o n s of the s t a t e  v e c t o r . As these are u s u a l l y unknown to the h y d r o l o g i s t , the performance of the f i l t e r of  these  inputs (Q, R,  with r e s p e c t to m i s s p e c i f i c a t i o n  x , 0  P) 0  i s examined. T h i s problem i s  s t u d i e d by f o r m u l a t i n g an ARMAX model i n s t a t e - s p a c e n o t a t i o n with the model c o e f f i c i e n t s as the s t a t e v e c t o r . Streamflow data are generated Q* and R*.  Performance of the f i l t e r ,  observation forecast error, input It 1.  with chosen noise c o v a r i a n c e s , based on  the  i s examined with respect to  specifications.  i s found t h a t : For the state-space f o r m u l a t i o n used,  initial  s p e c i f i c a t i o n of the model c o e f f i c i e n t s are important.  Poor c h o i c e s of x  0  and P  0  not  have l i t t l e  i n f l u e n c e on the flow p r e d i c t i o n s . 2.  Of the four input f a c t o r s  (Q, R,  x , 0  P ) ; only 0  the  combined s p e c i f i c a t i o n of the noise c o v a r i a n c e s have an important 3.  e f f e c t on the flow f o r e c a s t s .  If Q and R are unknown, i t i s best to s p e c i f y them both l a r g e r than t h e i r expected are comparable to those  values. Forecasts obtained  f o r the optimal case of known Q  and R. Other combinations  of s p e c i f y i n g Q and R r e s u l t  in worse f o r e c a s t i n g performance than the above. 4.  For t h i s state-space f o r m u l a t i o n , i t i s important estimate R c o r r e c t l y . The  filter  is indifferent  to  to  1 30  m i s s p e c i f i c a t i o n of Q i f R i s given 5.  correctly.  However, the reverse i s not t r u e . Even i f the true Q i s given, the f i l t e r  performs worse i f R i s s p e c i f i e d too  s m a l l . Thus, R needs to be estimated a l s o .  10.3  Maximum L i k e l i h o o d E s t i m a t i o n of the n o i s e v a r i a n c e For each of the three models c o n s i d e r e d , the method of  maximum l i k e l i h o o d  i s used to estimate  the noise v a r i a n c e .  T h i s method i s chosen because i t gives c o n s i s t e n t and asymptotic  e f f i c i e n t estimates. E v a l u a t i o n of the l o g  likelihood  f u n c t i o n i s f a c i l i t a t e d by using the Kalman  F i l t e r . T h i s i s another other than  10.4  a p p l i c a t i o n of the Kalman F i l t e r ,  that of f o r e c a s t i n g .  The A u t o r e g r e s s i v e Model The  1-day ahead f o r e c a s t f o r the AR(1) model i s  examined. Three schemes are used to f o r e c a s t the flow at Hope, each corresponding AR(1)  t o a p o s s i b l e f o r m u l a t i o n of the  model i n t o state-space  format.  Results i n d i c a t e that: 1.  The best f o r e c a s t i n g scheme i s to use the Kalman F i l t e r to  o b t a i n updated estimates  of the AR c o e f f i c i e n t . Thus,  a i s the s t a t e v a r i a b l e i n the s t a t e - s p a c e  framework.  Flow p r e d i c t i o n s a t each time step a r e made with the latest best  s t a t e estimate. T h i s f o r m u l a t i o n r e s u l t s i n the  f o r e c a s t i n g performance. In a d d i t i o n , the flow  f o r e c a s t s are i n s e n s i t i v e to i n i t i a l  s p e c i f i c a t i o n of a.  131  2.  Formulating  the AR process as the s t a t e equation  of  the  state-space model i s e q u i v a l e n t to not u s i n g the Kalman Filter  at a l l . An estimate  of the AR c o e f f i c i e n t i s  r e q u i r e d p r i o r to f o r e c a s t i n g . Comparable f o r e c a s t i n g performance to the best scheme can be obtained  for a  p a r t i c u l a r value of a. However, the f o r e c a s t i n g performance i s s e n s i t i v e to the s p e c i f i c a t i o n of the  AR  coef f i c i e n t . 3.  The  worst performance i s obtained  "splits  up"  the AR(1)  process  f o r the scheme which  i n the  state-space  f o r m u l a t i o n . Flow i s modelled as the s t a t e v a r i a b l e while the e r r o r term i s modelled as the n o i s e . Not  observation  only i s the f o r e c a s t i n g performance the  worst, i t i s p a r t i c u l a r l y s e n s i t i v e to the c h o i c e of a determined p r i o r to f o r e c a s t i n g .  As p a r t of the Kalman F i l t e r forecast (MSE)  f o r the o b s e r v a t i o n , and  are g i v e n . The  a l g o r i t h m , the  1-step  i t s mean square e r r o r  expressions are y=Hx, and HPH'  + R  r e s p e c t i v e l y . However, i n d a i l y water management, a longer f o r e c a s t i n g h o r i z o n i s u s u a l l y r e q u i r e d . Flow p r e d i c t i o n more than  1 day  i n advance introduces the problem of unknown  f u t u r e flows f o r a u t o r e g r e s s i v e models. For  i n s t a n c e , the  2-step f o r e c a s t r e q u i r e s knowledge of the flow The  1-day  ahead.  Kalman a l g o r i t h m assumes that tomorrow's flow i s known.  In p r a c t i c e , a p r e d i c t i o n can s t i l l  be obtained u s i n g  the  132  1 - s t e p f o r e c a s t f r o m t h e Kalman F i l t e r MSE  f o r the  via  t h e K a l m a n a l g o r i t h m . The  a s an e s t i m a t e .  2 - s t e p f o r e c a s t h o w e v e r , c a n n o t be c a l c u l a t e d MSE  i s important  r e f l e c t s the h y d r o l o g i s t ' s confidence B e c a u s e c o s t l y d e c i s i o n s may the  forecast error should  The  first  be  The for  10.5  require future flows in is  second approach i s t o d e r i v e the c o r r e c t forecast. This  expression  i s summarized i n  10.7.  Model  A t r a n s f e r f u n c t i o n model u s i n g upstream f l o w 2 days behind  t h a t o f Hope i s u s e d . Two  schemes a r e c o m p a r e d , one Filter.  a  i n the next s e c t i o n .  Transfer Function  lagged  variance  used.  f l o w 2-days i n a d v a n c e . T h i s  t h e mean s q u a r e e r r o r o f  section  f o r the  of  approach to t h i s problem i s to c o n s i d e r  to p r e d i c t the  discussed  the r e l i a b i l i t y  expression  d i f f e r e n t ARMAX m o d e l w h i c h d o e s n o t order  as i t  in his forecasts.  d e p e n d on  flow p r e d i c t i o n s , the proper  of t h e  The  The  latter  without  and  scheme f o r m u l a t e s  w i t h the c o e f f i c i e n t s as  the  one the  inputs  forecasting  w i t h the  Kalman  input-output  model  state v a r i a b l e s . Conclusions  are: 1.  2.  For  constant  coefficients,  not  improve the  use  of the Kalman F i l t e r  flow f o r e c a s t s .  However, f o r e c a s t s o b t a i n e d w i t h o u t sensitive  to the  initial  c o e f f i c i e n t s . Without the  the  filter  s p e c i f i c a t i o n of filter,  these  are  these  have t o  be  does  133  estimated from past data, thus making the choice of dataset 3.  important.  Although the f i l t e r  does not improve the f o r e c a s t i n g  performance of t h i s model, flow p r e d i c t i o n s a r e robust to poor s p e c i f i c a t i o n s of the model c o e f f i c i e n t s . Large amounts of data a r e not necessary f o r e s t i m a t i n g  these  c o e f f i c i e n t s p r i o r to f o r e c a s t i n g .  10.6  Combined ARMAX Model Using a d i f f e r e n t model t o f o r e c a s t k steps ahead f o r  various  k's i n v o l v e s too many models as k gets l a r g e .  T h e r e f o r e a combined model i s considered a parsimonious r e p r e s e n t a t i o n  i n order t o achieve  of the p r o c e s s . T h i s  ARMAX model i s a combination of the AR(1) models above. The Kalman F i l t e r  combined  and r e g r e s s i o n  i s a p p l i e d t o t h i s model t o  give estimates of the ARMAX c o e f f i c i e n t s r e c u r s i v e l y . Both 1 and  2-step f o r e c a s t s are obtained and are compared t o those  of the AR(1)  and the t r a n s f e r f u n c t i o n models r e s p e c t i v e l y .  C a l c u l a t i o n of the 2-step f o r e c a s t presents the same problem as the AR(1)  model, i n that tomorrow's f l o w . i s unknown.  N e v e r t h e l e s s , the 1-step f o r e c a s t given  by the Kalman F i l t e r  i s used as an e s t i m a t e . The study shows t h a t : 1.  The 1-step f o r e c a s t i n g performance i s comparable t o that of the AR(1)  2.  model.  The 2-step f o r e c a s t i n g performance i s b e t t e r than that of the t r a n s f e r f u n c t i o n model.  Thus, i n terms of i n d e n t i f y i n g the s t a t i s t i c a l  model f o r  1 34  predicting  streamflows  up t o 2 d a y s i n a d v a n c e ,  this  c o m b i n e d ARMAX m o d e l i s a d e q u a t e . The i m p o r t a n t this  study  i s that reasonable  result of  f o r e c a s t i n g performance i s  obtained  f o r t h e 2 - s t e p f l o w p r e d i c t i o n s ; e v e n t h o u g h an  estimate  i s used f o r tomorrow's f l o w  Violation  o f t h e a s s u m p t i o n o f known H s t i l l  reasonable  of the f o r e c a s t e r r o r  A formula  It  f o r t h e v a r i a n c e o f t h e f o r e c a s t e r r o r when  H a n d x a r e unknown, i s d e v e l o p e d  f o r t h e AR(1) model.  i s shown t h a t t h e new e x p r e s s i o n c a l c u l a t e s a v a r i a n c e  significantly  l a r g e r than  This expression  2  (R + H Z  All  results in  flow p r e d i c t i o n s .  10.7 V a r i a n c e  both  i n t h e s y s t e m m a t r i x H.  t h a t g i v e n by t h e Kalman  f o r t h e A R ( 1 ) model i s :  2  2 2  Filter.  ) + Z,, ( x + Z  2 2  terms i n t h i s equation  p r a c t i c e . The f i r s t  ) + I  1  2  (2Hx + Z , ) 2  a r e o f t h e same m a g n i t u d e i n  part of t h i s expression corresponds  to  t h e c a s e o f known H, a n d i s e q u i v a l e n t t o t h e Kalman equation  f o r the variance of the innovations. Z  2 2  i s the  s t a t e e r r o r c o v a r i a n c e m a t r i x . The s e c o n d p a r t , a c c o u n t s f o r the  f a c t t h a t H i s unknown b u t i t s e s t i m a t e  H i s independent  o f x . Z,, c o n t a i n s t h e v a r i a n c e s a n d c o v a r i a n c e s e l e m e n t s i n H. The f i n a l  of the  term acknowledges p o s s i b l e  c o r r e l a t i o n between H and x, a s r e f l e c t e d  in Z  1 2  .  135  10.8 State-space F o r m u l a t i o n Results Kalman F i l t e r  of the t h e s i s  i n d i c a t e that a p p l i c a t i o n of the  t o ARMAX models can give b e t t e r  f o r e c a s t s when  the model c o e f f i c i e n t s a r e formulated as the s t a t e v a r i a b l e s . This  i s because the f i l t e r  updates these  c o e f f i c i e n t s r e c u r s i v e l y i n the l i g h t of the observation forecast  e r r o r s . Although i t presents some conceptual  difficulties,  i t i s necessary t o choose between  a problem  which can be handled i n p r a c t i c e and one which i s hard t o c o n t r o l . The a l t e r n a t e  formulation  v a r i a b l e t o be the s t a t e , r e q u i r e s  of a l l o w i n g estimation  the flow of the ARMAX  c o e f f i c i e n t s p r i o r t o f o r e c a s t i n g . T h i s emphasizes the necessity  f o r abundant and good q u a l i t y data, as f o r e c a s t i n g  performance i s s e n s i t i v e t o the s p e c i f i c a t i o n of the s t a t e transition  matrix,  10.9 Future  Directions  In t h i s t h e s i s , h y d r o l o g i c described  systems which can be  by constant c o e f f i c i e n t ARMAX models are  c o n s i d e r e d . These a r e a p p r i o p r i a t e  when modelling streamflow  phenomenon f o r basins with l a r g e drainage a r e a s . Kalman Filtering  i s applied  t o these models and the r e s u l t i n g flow  p r e d i c t i o n s are better  than or equal t o those obtained  without the Kalman updating of the model c o e f f i c i e n t s . A u s e f u l e x t e n s i o n t o t h i s would be t o i n v e s t i g a t e the f o r e c a s t i n g performance of the f i l t e r  f o r more complex  systems whose c h a r a c t e r i s t i c s change s i g n i f i c a n t l y over  136  BIBLIOGRAPHY  Akaike, H i r o t u g u . "Markovian Representation of S t o c h a s t i c Processes by Canonical V a r i a b l e s . " SIAM J . C o n t r o l , v o l . 13, No. 1 (Jan. 1973), 163-73. Benjamin, Jack R., and C. A l l i n C o r n e l l . P r o b a b i l i t y , S t a t i s t i c s , and D e c i s i o n f o r C i v i l E n g i n e e r s . New York: McGraw-Hill, 1970. Bolzern, P., M. F e r r a r o , and G. Fronza. "Adaptive Real-Time Forecast of River Flow-Rates from R a i n f a l l Data." J . of Hydrology, 47 (1980), 251-67. Bowles, David S., and W. J . Grenney. " E s t i m a t i o n of D i f f u s e Loading of Water Q u a l i t y P o l l u t a n t s by Kalman Filtering." In A p p l i c a t i o n s of Kalman F i l t e r to Hydrology, H y d r a u l i c s , and Water Resources. Ed. Chao-Lin C h i u . P i t t s b u r g : Univ. of P i t t s b u r g , 1978. Box,  G. E. P., and G. M. Jenkins. Time S e r i e s A n a l y s i s : F o r e c a s t i n g and C o n t r o l . 2nd ed. San F r a n c i s c o : Holden Day, 1976.  Campbell, W. G., S. B a l l o u , and C. Slade. Form and 6th ed. Boston: Houghton M i f f l i n Co., 1982.  Style.  Canada. Environment Canada, Inland Waters D i r e c t o r a t e , Water Resources Branch, Water Survey of Canada. S u r f a c e Water Data, B r i t i s h Columbia, 1980. Ottawa 1981. Canada. Water Resources Branch, Data S e c t i o n . Magnetic Tape Records, Ottawa, 1984. C h a t f i e l d , C. A n a l y s i s of Time S e r i e s : An ed. London: Chapman and H a l l , 1980.  Introduction.  2nd  Chiu, Chao-Lin, and E. 0. Isu. " A p p l i c a t i o n of Kalman F i l t e r to Open Channel Flow P r o f i l e E s t i m a t i o n . " In A p p l i c a t i o n s of Kalman F i l t e r to Hydrology, H y d r a u l i c s , and Water Resources. Ed. Chao-Lin Chiu. P i t t s b u r g : Univ. of P i t t s b u r g , 1978. Chow, Van Te. " E v o l u t i o n of S t o c h a s t i c Hydrology." In A p p l i c a t i o n s of Kalman F i l t e r to Hydrology, H y d r a u l i c s , and Water Resources. Ed. Chao-Lin Chiu. P i t t s b u r g : Univ. of P i t t s b u r g , 1978. C l e a r y , James P., and Hans Levenbach. The F o r e c a s t e r . Belmont: Wadsworth, Inc.,  Professional 1982.  Constable, T. W., and E. A. McBean. "Kalman F i l t e r of the Speed River Q u a l i t y . " ASCE-J. of the 137  Modelling  138 Environmental E n g i n e e r i n g D i v i s i o n , v o l . 105, No. EE5 (Oct. 1979), 961-78. C o u l t h a r d , John, ed. UBC Tape. UBC Computing Centre, March 1981. Cox, D. R., and E. J . S n e l l . A p p l i e d S t a t i s t i c s . London: Chapman and H a l l , 1982. de Jong, P i e t . UBC IMSL. UBC Computing Centre, Sept. 1980. , and Phelim P. Boyle. " M o n i t o r i n g M o r t a l i t y . " J . of Econometrics, 23 (1983), 1-16. , and B. Zehnwirth. "Claims R e s e r v i n g , State-Space Models and the Kalman F i l t e r . " The J o u r n a l of the I n s t i t u t e of A c t u a r i e s , v o l . 110, p a r t 1, No. 444 (June 1983), 157-81. Dixon, W. J . , c h i e f ed. BMDP S t a t i s t i c a l Software. B e r k e l e y : Univ. of C a l i f o r n i a P r e s s , 1983. Duong, Nguyen, Rah-Ming L i , and Y. H. Chen. "Adaptive C o n t r o l of Lock and Dam Gate Openings Using a Kalman F i l t e r f o r Real-Time I d e n t i f i c a t i o n of Upstream-Downstream Stage R e l a t i o n s h i p . " In A p p l i c a t i o n s of Kalman F i l t e r to Hydrology, H y d r a u l i c s , and Water Resources. Ed. Chao-Lin C h i u . P i t t s b u r g : Univ. of P i t t s b u r g , 1978. F e l d s t e i n , M a r t i n S. "The E r r o r of F o r e c a s t i n Econometric Models when the F o r e c a s t P e r i o d Exogeneous V a r i a b l e s are Stochastic." Econometrica, v o l . 39, No. 1 (Jan. 1971), 55-60. Fox, D a n i e l J . , and Kenneth E. G u i r e . MIDAS. 3rd ed. Michigan: S t a t i s t i c a l Research Lab. Uhiv. of Michigan, Sept. 1976. Gelb, A r t h u r , ed. A p p l i e d Optimal E s t i m a t i o n . By The A n a l y t i c Sciences C o r p o r a t i o n . 6th ed. Cambridge: MIT Press, 1980. G r e i g , Malcolm, 1978.  ed. UBC ANOVAR. UBC Computing Centre, Oct.  G r i g o r i u , M i r c e a . " P r e d i c t i o n and S i m u l a t i o n with A u t o g r e s s i v e Models i n Hydrology." In A p p l i c a t i o n s of Kalman F i l t e r t o Hydrology H y d r a u l i c s , and Water Resources. Ed. Chao-Lin C h i u . P i t t s b u r g : Univ. of P i t t s b u r g , 1978. f  H a l f o n , E. "Assessment of U n c e r t a i n t y i n Long Term P r e d i c t i o n made through Mathematical Models." In  139 C l e v e l a n d : IFAC, Water and R e l a t e d Land Resource Systems, 1980. H a r r i s o n , P. J . , and C. F. Stevens. "Bayesian F o r e c a s t i n g . " J . Royal S t a t i s t i c a l S o c i e t y , B, 38 (1976), 205-47. Harvey, A. C. Time S e r i e s Models. Oxford: P. A l l a n ,  1981.  , and P. H. J . Todd. " F o r e c a s t i n g Economic Time S e r i e s with S t r u c t u r a l and Box-Jenkins Models: A Case Study." J . of Business and Economic S t a t i s t i c s , v o l . 1, No. 4 (Oct. 1983), 299-315. H i c k s , C h a r l e s . Fundamentals Concepts i n the Design of Experiments. 2nd ed. New York: H o l t , Rinehart and Winston, Inc., 1973. I n t e g r a t e d Software Systems C o r p o r a t i o n . T e l l a g r a f . 4. San Diego,CA: ISSCO, 1981.  version  J a z w i n s k i , A. H. "Adaptive F i l t e r i n g . " Automatica, v o l . 5 (1969), 475-85.  New  . S t o c h a s t i c Processes and F i l t e r i n g York: Academic Press, 1970.  Theory.  Johnson, R. A., and Dean W. Wichern. A p p l i e d M u l t i v a r i a t e S t a t i s t i c a l A n a l y s i s . Englewood C l i f f s , N.J.: P r e n t i c e - H a l l , 1982. Kahl, Douglas R., and Johannes L e d o l t e r . "A Recursive Kalman F i l t e r F o r e c a s t i n g Approach." Management Science, v o l . 29, No. 11 (Nov. 1983), 1325-333. Kalman, R. E. "A New Approach to L i n e a r F i l t e r i n g and P r e d i c t i o n Problems." T r a n s a c t i o n s of the ASME - J . of B a s i c E n g i n e e r i n g , (March 1960), 35-45. "A R e t r o s p e c t i v e a f t e r Twenty Years: From the Pure to the A p p l i e d . " In A p p l i c a t i o n s of Kalman F i l t e r to Hydrology, H y d r a u l i c s , and Water Resources. Ed. Chao-Lin Chiu. P i t t s b u r g : Univ. of P i t t s b u r g , 1978. K e n d a l l , M. Time S e r i e s . London: G r i f f i n ,  1973.  K i t a n i d a s , Peter, and R a f a e l L. Bras. "A Study of C o l l i n e a r i t y and Parameter S t a b i l i t y i n R a i n f a l l - R u n o f f Models: Ridge Regression and Kalman F i l t e r i n g . " In A p p l i c a t i o n s of Kalman F i l t e r to Hydrology, H y d r a u l i c s , and Water Resources. Ed. Chao-Lin Chiu. P i t t s b u r g : Univ. of P i t t s b u r g , 1978. • . "Error I n d e n t i f i c a t i o n i n Conceptual H y d r o l o g i c Models." In A p p l i c a t i o n s of  140  Kalman F i l t e r to Hydrology, H y d r a u l i c s , and Water Resources. Ed. Chao-Lin C h i u . P i t t s b u r g : Univ. of P i t t s b u r g , 1978. Lettenmaier, D. P., and S. J . Burges. "Use of S t a t e E s t i m a t i o n Techniques i n Water Resources System M o d e l l i n g . " Water Resources B u l l . , 12 ( 1 ) , (1976), 88-99. Li,  Rah-Ming, Nguyen Duong, and D a r y l B. Simons. " A p p l i c a t i o n of Kalman F i l t e r f o r P r e d i c t i o n of S t a t e Discharge R e l a t i o n s h i p i n R i v e r s . " In A p p l i c a t i o n s of Kalman F i l t e r to Hydrology, H y d r a u l i c s , and Water Resources. Ed. Chao-Lin C h i u . P i t t s b u r g : Univ. of P i t t s b u r g , 1978.  L i n s l e y , Ray K. J r . , Max A. K o h l e r , and Joseph L. H. Paulhus. Hydrology f o r E n g i n e e r s . 3rd ed. New York: McGraw-Hill, 1982. Logan, L. A., W. N. Lennox, and T. E. Unny. "A Time-Varying Non-Linear H y d r o l o g i c Response Model f o r F l o o d Hydrography E s t i m a t i o n i n a Noisy Environment." In A p p l i c a t i o n s of Kalman F i l t e r to Hydrology, H y d r a u l i c s , and Water Resources. Ed. Chao-Lin C h i u . P i t t s b u r g : Univ. of P i t t s b u r g , 1978. Lyon-Lamb, V i c t o r i a . March 1982.  UBC  Tapeasy. UBC  Computing Centre,  McLaughlin, Dennis. " P o t e n t i a l A p p l i c a t i o n s of Kalman F i l t e r i n g Concepts to Groundwater Basin Management." In A p p l i c a t i o n s of Kalman F i l t e r to Hydrology, H y d r a u l i c s , and Water Resources. Ed. Chao-Lin C h i u . P i t t s b u r g : Univ. of P i t t s b u r g , 1978. Magnus, Jan R., and H. Neudecker. "The Commutation M a t r i x : some P r o p e r t i e s and A p p l i c a t i o n s . " The Annals of S t a t i s t i c s , v o l . 7, No. 2 (1979), 381-94. Mair, Susan. UBC . UBC  S u r f a c e . UBC P l o t . UBC  Computing Centre, Nov.  Computing Centre, Nov.  1982.  1982.  Marsalek, J . "Instrumentation f o r f i e l d s t u d i e s of urban r u n o f f . " Canada - O n t a r i o Agreement Research Program, P r o j . 73-3-R, Res. Rep. No. 42, 82pp. Mehra, R. K. "On the I d e n t i f i c a t i o n of V a r i a n c e s and Adaptive Kalman F i l t e r i n g . " IEEE T r a n s a c t i o n s on Automatic C o n t r o l , v o l . AC-15, No. 2 ( A p r i l 1970), 175-84. . "Approaches  to Adaptive F i l t e r i n g . "  IEEE  V  141 Transactions  on Automatic C o n t r o l ,  (Oct. 1972), 693-98.  . " P r a c t i c a l Aspects of Designing Kalman Filters." In A p p l i c a t i o n s of Kalman F i l t e r t o Hydrology, H y d r a u l i c s , and Water Resources. Ed. Chao-Lin C h i u . P i t t s b u r g : Univ. of P i t t s b u r g , 1978. . "Kalman F i l t e r s and t h e i r A p p l i c a t i o n s to Forecasting." In TIMS S t u d i e s i n the Management S c i e n c e s . Amsterdam: North-Holland, 1979. . "A Survey of Time S e r i e s M o d e l l i n g and F o r e c a s t i n g Methodology." In Real-Time F o r e c a s t i n g / C o n t r o l of Water Resource Systems. Ed. E r i c Wood. Oxford: Pergamon Press, 1980. Meinhold, Richard, and Nozer D. Singpurwalla. "Understanding the Kalman F i l t e r . " The American S t a t i s t i c i a n , v o l . 37, No. 2 (May 1983), 123-27. Mendenhall, W i l l i a m . I n t r o d u c t i o n t o L i n e a r Models and the Design and A n a l y s i s of Experiments. Belmont: Duxbury Press, 1968. , and Richard L. S c h e a f f e r . Mathematical S t a t i s t i c s with A p p l i c a t i o n s . North S c i t u a t e : Duxbury Press, 1973. Moore, John B. WATFIV: F o r t r a n Programming with the WATFIV Compiler. Reston, V i r g i n i a : Reston P u b l i s h i n g Co. Inc., 1 975. Moore, Stephen. " A p p l i c a t i o n of Kalman F i l t e r to Water Quality Studies." In A p p l i c a t i o n s of Kalman F i l t e r to Hydrology, H y d r a u l i c s , and Water Resources. Ed. Chao-Lin Chiu. P i t t s b u r g : Univ. of P i t t s b u r g , 1978. Moore, R. J . , and D. A. Jones. "Coupled Bayesian - Kalman F i l t e r E s t i m t a t i o n of Parameters and S t a t e s of Dynamic Water Q u a l i t y Model." In A p p l i c a t i o n s of Kalman F i l t e r to Hydrology, H y d r a u l i c s , and Water Resources. Ed. Chao-Lin Chiu. P i t t s b u r g : Univ. of P i t t s b u r g , 1978. Nakamura, Masatoshi. " R e l a t i o n s h i p between Steady State Kalman F i l t e r Gain and Noise V a r i a n c e s . " Int. J . Systems Science, v o l . 13, No. 10, (1982), 1153-163. Natale, L., and E. T o d i n i . "A S t a b l e Estimator f o r L i n e a r Models." Water Resources Research, v o l . 12, No. 4 (Aug. 1976), 667-76. N i c o l , Tom, ed. UBC MATRIX. UBC Computing Centre,  Mar. 1982.  N i g h t i n g a l e , Jon. An I n t r o d u c t i o n t o TEXTFORM. UBC Computing  142 Centre, A p r i l  1984.  . Textform Layouts. UBC April  Computing Centre,  1984. #  TEXTFORM. UBC  Computing Centre,  April  1 984. P a t r y , G i l l e s G., and Miguel A. Marino. "Time-Variant Nonlinear F u n c t i o n a l Runoff Model f o r Real-Time Forecasting." J . of Hydrology, 66 (1983), 227-44. . L e t t e r to author. Dec.  1983.  O'Connell, P. E., and R. T. C l a r k e . "Adaptive H y d r o l o g i c a l F o r e c a s t i n g - A Review." H y d r o l o g i c a l S c i e n c e s B u l l e t i n , 25, 2 (June 198iT, 179-205. 0'Donovan, Thomas M. Short Wiley & Sons, 1983.  Term F o r e c a s t i n g . New  York: John  Quick, M.C, and A. Pipes. "A combined snowmelt and r a i n f a l l runoff model." Canadian J o u r n a l of C i v i l E n g i n e e r i n g , Vol.3, No.3, 1976, 449-460. Rao,  A. Ramachandra, and R. L. Kashyap. " C a u s a l i t y i n Hydrologic Systems." In C l e v e l a n d : IFAC, Water and Related Land Resource Systems, 1980.  Rodriguez-Iturbe, I., Juan Valdes, and J . M. Velasquez. " A p p l i c a t i o n s of Kalman F i l t e r i n R a i n f a l l - R u n o f f Studies." In A p p l i c a t i o n s of Kalman F i l t e r to Hydrology, H y d r a u l i c s , and Water Resources. Ed. Chao-Lin Chiu. P i t t s b u r g : Univ. of P i t t s b u r g , 1978. Sage, A. P., and G. W. Husa. "Adaptive F i l t e r i n g with Unknown P r i o r S t a t i s t i c s . " Proc. J o i n t Automat. C o n t r o l , 1969, 760-69. Schweppe, " E v a l u a t i o n of L i k e l i h o o d Functions f o r Gaussian Signals." IEEE T r a n s a c t i o n s on Information Theory, 1965, 61-70. S i t t n e r , Walter T. "WMO P r o j e c t on Intercomparison of Conceptual Models used in H y d r o l o g i c a l F o r e c a s t i n g . " H y d r o l o g i c a l Sciences B u l l e t i n , XXI, 1 (March 1976), 203-222. Stevens, John, Tony Buckland, and Bruce J o l i f f e , eds. F o r t r a n . UBC Computing Centre, Nov. 1981.  UBC  S z o l l o s i - N a g y , A. "An Adaptive I d e n t i f i c a t i o n and P r e d i c t i o n Algorithm f o r the Real-Time F o r e c a s t i n g of H y d r o l o g i c a l Time S e r i e s . " H y d r o l o g i c a l Sciences B u l l . , XXI, No. 1  143 (March 1976), 163-76. , E z i o T o d i n i , and E r i c Wood. "A State-Space Model f o r Real-Time F o r e c a s t i n g of H y d r o l o g i c a l Time S e r i e s . " J . of H y d r o l o g i c a l Sciences, v o l . 4, No. 1 (1977), 61-76. T o d i n i , E. "Mutually I n t e r a c t i v e State-Parameter (MISP) Estimation." In A p p l i c a t i o n s of Kalman F i l t e r to Hydrology, H y d r a u l i c s , and Water Resources. Ed. Chao-Lin Chiu. P i t t s b u r g : Univ. of P i t t s b u r g , 1978. , and D. B o u i l l o t . "A R a i n f a l l - R u n o f f Kalman F i l t e r Model." In System Simulation i n Water Resources. Ed. G. C. V a n s t e e n k i s t e . Amsterdam: North-Holland, 1976. , and J . R. W a l l i s . "A Real Time R a i n f a l l - R u n o f f Model f o r an On-Line F l o o d Warning System." In A p p l i c a t i o n s of Kalman F i l t e r to Hydrology, H y d r a u l i c s , and Water Resources. Ed. Chao-Lin C h i u . P i t t s b u r g : Univ. of P i t t s b u r g , 1978. V a l d e s , Juan B., J . M. Velasquez, and I . R o d r i g u e z - I t u r b e . "Bayesian D i s c r i m i n a t i o n of Hydrologic Forecasting Models Based on the Kalman F i l t e r . " In A p p l i c a t i o n s of Kalman F i l t e r to Hydrology, H y d r a u l i c s , and Water Resources. Ed. Chao-Lin Chiu. P i t t s b u r g : Univ. of P i t t s b u r g , 1978. Walpole, Ronald E., and Raymond H. Myers. P r o b a b i l i t y and S t a t i s t i c s f o r Engineers and S c i e n t i s t s . 2nd ed. New York: MacMillan, 1978. Wood, E r i c , i n t r o d . Real-Time F o r e c a s t i n g / C o n t r o l of Water Resource Systems. IIASA Proceedings S e r i e s . Oxford: Pergamon Press, 1980. Wood, E r i c . "An A p p l i c a t i o n of Kalman F i l t e r i n g to River Flow F o r e c a s t i n g . " In A p p l i c a t i o n s of Kalman F i l t e r t o Hydrology, H y d r a u l i c s , and Water Resources. Ed. Chao-Lin Chiu. P i t t s b u r g : Univ. of P i t t s b u r g , 1978. , and R. K. Mehra. "Model I d e n t i f i c a t i o n and Sampling Rate A n a l y s i s f o r F o r e c a s t i n g the Q u a l i t y of Plant Intake Water." In C l e v e l a n d : IFAC, Water and Related Land Resource Systems, 1980. , and A. S z o l l o s i - N a g y . "An Adaptive A l g o r i t h m f o r A n a l y z i n g Short-Term S t r u c t u r a l and Parameter Changes i n H y d r o l o g i c P r e d i c t i o n Models." Water Resources Research, v o l . 14, No. 4 (Aug. 1978), 577-81. Young, P., and P. Whitehead. "A Recursive Approach to Time-Series A n a l y s i s f o r M u l t i - v a r i a b l e Systems." I n t .  1 44  J . C o n t r o l , v o l . 25, No. Z a b l o s k y , J . P a u l . UBC O c t . 1983.  3 (1977),  X e r o x 9700. UBC  457-82. Computing  Centre,  

Cite

Citation Scheme:

    

Usage Statistics

Country Views Downloads
India 23 6
United Kingdom 22 0
Sweden 16 0
United States 11 1
Canada 5 0
Iran 5 3
China 4 2
Algeria 4 0
Italy 4 0
New Zealand 3 0
Kenya 3 0
Germany 3 2
Hong Kong 3 0
City Views Downloads
Unknown 56 5
Stockholm 13 0
Mumbai 10 1
Nairobi 3 0
Bangalore 3 3
Ashburn 3 0
Beijing 3 0
Lund 3 0
Chicago 3 0
Farsi 2 0
Siuri 2 0
Tokyo 2 0
Zanjan 2 2

{[{ mDataHeader[type] }]} {[{ month[type] }]} {[{ tData[type] }]}
Download Stats

Share

Embed

Customize your widget with the following options, then copy and paste the code below into the HTML of your page to embed this item in your website.
                        
                            <div id="ubcOpenCollectionsWidgetDisplay">
                            <script id="ubcOpenCollectionsWidget"
                            src="{[{embed.src}]}"
                            data-item="{[{embed.item}]}"
                            data-collection="{[{embed.collection}]}"
                            data-metadata="{[{embed.showMetadata}]}"
                            data-width="{[{embed.width}]}"
                            async >
                            </script>
                            </div>
                        
                    
IIIF logo Our image viewer uses the IIIF 2.0 standard. To load this item in other compatible viewers, use this url:
http://iiif.library.ubc.ca/presentation/dsp.831.1-0062573/manifest

Comment

Related Items