UBC Open Collections: UBC Theses and Dissertations
Effects of uncertainty in hydrologic model calibration on extreme event simulation. Roche, Anthony David (2005).


EFFECTS OF UNCERTAINTY IN HYDROLOGIC MODEL CALIBRATION ON EXTREME EVENT SIMULATION

by

ANTHONY DAVID ROCHE
B.A.Sc., University of Waterloo, 2000

A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF APPLIED SCIENCE in THE FACULTY OF GRADUATE STUDIES (CIVIL ENGINEERING - HYDROTECHNICAL)

THE UNIVERSITY OF BRITISH COLUMBIA
April 2005
© Anthony David Roche, 2005

Abstract

Computer models representing the hydrologic cycle as a simplified system have become a preferred tool for estimating floods. However, scientific understanding of the uncertainty inherent in these models has not kept pace with their development and application. In many cases it is incorrectly assumed that all uncertainty in model structure, input data, and parameters is minimized or eliminated through calibration. The end result is a ubiquitous but unknowable degree of model predictive uncertainty that may or may not significantly affect the outcome of any given application. Extrapolation of a model beyond its calibration range (i.e., for extreme event simulation) invariably results in a substantial increase in this uncertainty. This work aims to promote qualitative and quantitative understanding of model predictive uncertainty in extreme event simulation. It therefore begins with a review of the many sources contributing to model predictive uncertainty, an analysis of their origins and interdependencies, and a synthesis of various methods for analyzing uncertainty.
As a pre-requisite step towards the larger goal of reducing overall model predictive uncertainty, this work investigates the variability in estimates of extreme floods (e.g., peak flow, timing, and volume) introduced by subjective decisions made during calibration. Multiple automatic calibrations of a conceptual hydrologic model are conducted using different objective functions to evaluate calibration performance, resulting in a collection of non-inferior parameter sets. Each parameter set is then used to simulate an extreme event based on hydrologic data for the Coquitlam Lake watershed in British Columbia, which is developed for hydropower by BC Hydro. The combined output of these extreme event simulations characterizes the relative variability in the hydrographs. Simulations are conducted using the University of British Columbia Watershed Model (UBCWM), which is widely used to describe and forecast watershed behaviour in mountainous areas of British Columbia. Calibrations of the UBCWM utilize the Shuffled Complex Evolution Algorithm (SCE-UA), an effective and efficient optimization-based automatic calibration routine. Because automatic calibrations fail to capture the different kinds of expert knowledge inherent in a manual calibration, extreme event hydrographs obtained using calibrated parameter sets are compared on a relative rather than absolute basis. Results show that automatic calibration may provide a straightforward method of identifying potential areas where subject models are over-parameterized with respect to the calibration data.
More importantly, preliminary results show that the variability is relatively constrained amongst simulations based on a Probable Maximum Flood (PMF) scenario, with coefficients of variation for peak flow, event volume, and time to peak of 4%, 1%, and 1% respectively. These values are negligible in comparison with other uncertainties that dominate extreme events like the PMF. Thus, the PMF-based simulations are relatively insensitive to the different measures of calibration performance used. Similar trials using other models would permit an estimate of the extent to which one could expect to resolve divergent estimates through implementing different but equally valid calibrations. Observations of this work are applicable for the management of hydropower production and flood control for these watersheds. These observations will provide insights into uncertainty in extreme event simulation and may contribute to the improved management of water, hydropower systems, and public safety in Canada and around the world.

Table of Contents

ABSTRACT
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
ACKNOWLEDGEMENTS
1. INTRODUCTION
   1.1 Guidance for the Reader
   1.2 Approaches for Estimating Floods
   1.3 Presenting Floods for Decision-making
   1.4 Expressing Uncertainty
   1.5 Model Predictive Uncertainty in Context
2. HYDROLOGIC MODELLING
   2.1 Hydrologic Processes
      2.1.1 Scale in Hydrology
      2.1.2 Precipitation
      2.1.3 Runoff
      2.1.4 Evapotranspiration
   2.2 The Evolution of Hydrologic Modelling
      2.2.1 The Beginnings of Modelling
      2.2.2 The Philosophy of Model Development and Application
      2.2.3 Model Complexity in Context
   2.3 Model Classifications
      2.3.1 Statistical, Empirical, and Black Box Models
      2.3.2 Conceptual Models
      2.3.3 Physically-Based Models
      2.3.4 Other Classifications
   2.4 Model Calibration
      2.4.1 The Nature of Calibration
      2.4.2 Manual Calibration
      2.4.3 Automatic Calibration
      2.4.4 Methods for Automatic Calibration
      2.4.5 Evaluating Model Performance
      2.4.6 Hydrologic Model Validation
3. UNCERTAINTY IN HYDROLOGIC MODELLING
   3.1 Classification of Uncertainty in Hydrologic Modelling
      3.1.1 Natural Variability
      3.1.2 Data Uncertainty
      3.1.3 Model Uncertainty
      3.1.4 Parameter Uncertainty
   3.2 Uncertainty and Calibration
   3.3 Techniques for Exploring Uncertainty
      3.3.1 Sensitivity Analysis
      3.3.2 Reliability Analysis
      3.3.3 Multi-Objective Analysis
      3.3.4 Generalized Sensitivity Analysis
      3.3.5 Equifinality
      3.3.6 Uncertainty Isolation
   3.4 Uncertainty and Extreme Event Simulation
4. RESEARCH TOOLS AND METHODS OF ANALYSIS
   4.1 Tools and Case Studies
      4.1.1 The University of British Columbia Watershed Model
      4.1.2 The SCE-UA Method
      4.1.3 Interfacing UBCWM and SCE-UA
      4.1.4 The Coquitlam Lake and Illecillewaet River Watersheds
   4.2 Experimental Design
      4.2.1 Selection of Objectives
      4.2.2 SCE-UA Calibrations
      4.2.3 Validation
      4.2.4 Event Simulations
      4.2.5 Simulations Based on a PMF Scenario
5. RESULTS AND DISCUSSION
   5.1 Preliminary Tests
   5.2 Synthetic Calibrations
   5.3 Calibrations against Observed Data
      5.3.1 Coquitlam Lake Watershed
      5.3.2 Calibrations with Expanded Parameter Space
      5.3.3 Illecillewaet River Watershed
   5.4 Sensitivity to Initial Random Seed
      5.4.1 Coquitlam Lake Watershed
      5.4.2 Illecillewaet River Watershed
   5.5 Event Simulations
   5.6 Extreme Event Simulations
      5.6.1 Simulations Based on a PMF Scenario
      5.6.2 Simulations Based on a PMP-Scale Event
6. CONCLUSIONS
7. FUTURE DIRECTIONS
8. GLOSSARY OF ACRONYMS AND WATERSHED MODELS
9. REFERENCES
APPENDIX A: SELECTION OF SCE-UA PARAMETER VALUES
APPENDIX B: DETAILED CALIBRATION RESULTS
   B.1: Complete Parameter Sets
   B.2: Annual Statistics for All Calibrations
   B.3: Mean Monthly Statistics for All Calibrations
   B.4: Summary Results from Event-based Simulations

List of Tables

Table 2-1: Common Measures for Quantitative Evaluation of Hydrologic Model Performance
Table 4-1: Commonly Calibrated Parameters of the UBCWM
Table 4-2: Expanded Parameter Ranges for Coquitlam Lake Watershed Calibrations
Table 4-3: Index of SCE-UA Calibrations
Table 4-4: Start and End Dates for Major Inflow Events to Coquitlam Lake
Table 5-1: Results of Synthetic Calibration for Campbell River Watershed
Table 5-2: Parameter Values for Successful Synthetic Calibrations - Coquitlam Lake Watershed
Table 5-3: Parameter Values for Successful Synthetic Calibrations - Illecillewaet River Watershed
Table 5-4: Summary Statistics for Multi-Objective Calibrations of Coquitlam Lake Watershed
Table 5-5: Parameter Values for Multi-Objective Calibration of Coquitlam Lake Watershed
Table 5-6: Summary Statistics for Multi-Objective Calibrations of Coquitlam Lake Watershed using Expanded Parameter Space
Table 5-7: Parameter Values for Multi-Objective Calibrations of Coquitlam Lake Watershed using Expanded Parameter Space
Table 5-8: Summary Statistics for Multi-Objective Calibrations of Illecillewaet River Watershed
Table 5-9: Parameter Values for Multi-Objective Calibrations of Illecillewaet River Watershed
Table 5-10: Summary Statistics for Coquitlam Lake Watershed Seed Sensitivity
Table 5-11: Parameter Values for Coquitlam Lake Watershed Seed Sensitivity
Table 5-12: Summary Statistics for Illecillewaet River Watershed Seed Sensitivity
Table 5-13: Parameter Values for Illecillewaet River Watershed Seed Sensitivity
Table 5-14: Summary Results for Storm 1 Simulations
Table 5-15: Average Results for All Multi-Objective Event Simulations
Table 5-16: Summary Results for PMF-based Extreme Event Simulations
Table 5-17: Summary Results for PMP-based Storm Simulations

List of Figures

Figure 1-1: Risk-Uncertainty Interaction
Figure 2-1: Flowchart of the SCE-UA Algorithm
Figure 2-2: Flowchart of the SCE-UA CCE Strategy
Figure 3-1: Accuracy and Precision as Analogues for Error and Uncertainty
Figure 3-2: Mind Map of Uncertainties in Hydrologic Modelling
Figure 3-3: Model Uncertainty in Practice
Figure 3-4: Normalized Parameter Sets for a Single-Objective Optimization where Objective Function Values Differ by < 1%
Figure 3-5: Normalized Pareto Optimal Parameter Sets for a Multi-Objective Optimization
Figure 3-6: Hydrograph Ranges Associated with a Pareto Solution Set
Figure 5-1: Parameter Convergence for Synthetic Calibration in Preliminary Testing
Figure 5-2: Objective Function Evolution for Calibrations with 10, 15, and 20 Complexes
Figure 5-3: Parameter Convergence for Successful Synthetic Calibrations
Figure 5-4: Typical Annual Hydrographs for Successful Synthetic Calibrations
Figure 5-5: Parameter Convergence for Unsuccessful Synthetic Calibrations
Figure 5-6: Typical Annual Hydrographs for Unsuccessful Synthetic Calibrations
Figure 5-7: Normalized Parameter Values for Successful Synthetic Calibrations
Figure 5-8: Normalized Parameter Values for Synthetic Calibration Seed Sensitivity Trials
Figure 5-9: Objective Function Evolution for Synthetic Calibration Seed Sensitivity Trials
Figure 5-10: Example Annual Hydrographs for Synthetic Calibration Seed Sensitivity Trials
Figure 5-11: Typical Annual Hydrograph for Multi-Objective Calibration of Coquitlam Lake Watershed
Figure 5-12: Normalized Parameter Values for Multi-Objective Calibration of Coquitlam Lake Watershed
Figure 5-13: Normalized Parameter Values for Multi-Objective Calibrations of Coquitlam Lake Watershed using Expanded Parameter Space
Figure 5-14: Typical Annual Hydrograph for Multi-Objective Calibration of Illecillewaet River Watershed
Figure 5-15: Normalized Parameter Values for Multi-Objective Calibrations of Illecillewaet River Watershed
Figure 5-16: Typical Annual Hydrograph for Coquitlam Lake Watershed Seed Sensitivity
Figure 5-17: Objective Function Evolution for Coquitlam Lake Watershed Seed Sensitivity
Figure 5-18: Normalized Parameter Values for Coquitlam Lake Watershed Seed Sensitivity
Figure 5-19: Typical Annual Hydrograph for Illecillewaet River Watershed Seed Sensitivity
Figure 5-20: Objective Function Evolution for Illecillewaet River Watershed Seed Sensitivity
Figure 5-21: Normalized Parameter Values for Illecillewaet River Watershed Seed Sensitivity
Figure 5-22: Typical Hydrograph Comparison of Initial Watershed Conditions for Storm 1
Figure 5-23: Coquitlam Lake Watershed Storm 1 Pareto Hydrograph
Figure 5-24: Pareto Hydrograph for PMF-based Extreme Event Simulations
Figure 5-25: PMP-based Storm Pareto Hydrograph

Acknowledgements

While this thesis is nominally the product of a sole author, it could never have been completed without the advice and support of many colleagues and friends. Firstly, I would like to thank my supervisor, Dr. Barbara Lence of UBC Civil Engineering. Her input, guidance, and enthusiasm were indispensable in keeping me on course and focussed. She also reviewed this entire document repeatedly throughout its evolution, and for that has captured my admiration as well as my gratitude. Secondly, I would like to thank my informal thesis committee at BC Hydro: Murray Kroeker, Graham Lang, and Dr. Des Hartford. They helped to set my course and are the "target audience" for whom this thesis is prepared. I would also like to thank Dr. Markus Weiler of UBC Forestry, who provided an experienced hydrologist's review and excellent insight as my official second reader.

Several people assisted with the applied portion of this thesis, and in doing so made my job very much easier. I benefited greatly from the assistance of UBC Professor Emeritus Dr. Michael Quick, principal author of the University of British Columbia Watershed Model, and Edmond Yu, who diligently maintains and upgrades the UBCWM source code. Dr. Zoran Micovic, who puts the "expert" in BC Hydro's "expert calibrations", provided advice on calibration of both the UBCWM and hydrologic models in general. I salute Dr. Qingyun Duan of the United States National Oceanic and Atmospheric Administration and Drs. Soroosh Sorooshian and Hoshin Gupta of the University of Arizona, who developed, support, and continue to advance the remarkable SCE-UA optimization algorithm used extensively in my work. Dr. Steve Burges of the University of Washington, Dr. Nick Kouwen of the University of Waterloo, and many other industry and research leaders also provided commentary and guidance as I was starting out.

Of course, I am eternally grateful to those whose assistance kept me financially solvent during my work. Primary support for this endeavour was provided by the Natural Sciences and Engineering Research Council of Canada (NSERC) through a Post-Graduate Scholarship. Additional financial assistance was provided through scholarships and awards from the University of British Columbia, UMA Group Ltd., Mrs. Earl R. Peterson, the Canadian Water Resources Association, and a Research Assistantship with my supervisor, Dr. Barbara Lence of Civil Engineering at UBC. BC Hydro also supported this project through their Professional Partnership Program, providing advice, access to their formidable library of hydrologic studies, and of course the data that forms the basis of the experiment herein.
Finally, I would like to thank my colleagues at UBC, BC Hydro, and Kerr Wood Leidal, my family and friends, and most especially my wife Diane, who have collectively and exhaustively put up with me talking about this thesis for nearly four years. I love you all and am pleased to say that I look forward to moving on to the next big thing.

1. Introduction

"Where there is no water, there is no life... we live by the grace of water."
- National Geographic Special Edition, November 1993

There is no substance more relevant or more necessary to the continuation of life on Earth than water. It is a constant source of sustenance, convenience, and conflict, but rare extremes of either presence or absence can quite easily become a matter of life and death. As such, there should be little surprise that efforts to manage, move, or mitigate the benefits and hazards of water have been chronicled for thousands of years. Perhaps more surprising is the extent of uncertainty that persists in our understanding and prediction of the "when", "where", and "how much" of water, and the lack of tools available to aid in its characterization. Failure to appreciate the implications of such uncertainty can result in spectacular and awesome consequences.

This thesis discusses the uncertainty in using computer models to predict watershed response. It specifically examines the influence of uncertainty in calibration objectives on the prediction of floods. Several non-inferior parameter sets are used to predict runoff from an extreme event based on a PMF scenario for the Coquitlam River watershed above Coquitlam Lake.

Floods are the most obvious of adverse consequences for a civilization characterized by lowland, riparian, and coastal habitation.
In fact, floods are at once the most common and most devastating of natural disasters. Over two years (1994-1995), floods constituted 50% of global natural disasters, and were responsible for 8,500 casualties (Rosenhagen and Halpert, 1998). Resultant social and economic impacts can and do extend far beyond the directly flooded areas (Gregory et al., 1996); for the period above, total damages were estimated at approximately US$50B (Rosenhagen and Halpert, 1998). Closer to home, 20th century Canadian floods have resulted in several billion dollars in damages and at least 198 fatalities (Natural Resources Canada, 2003). Protracted periods of low flow (drought), declining lake levels, and falling water tables are another common concern for industry, government, the environment, and the general population. An accurate assessment of low flows is critical to fish passage, aquatic habitat, irrigation, and water use planning (Pike and Scherer, 2003). Although the primary focus of this work lies in addressing uncertainty in hydrologic modelling of extreme floods, it would be remiss not to mention the possibility for cross-application of many concepts herein for low-flow prediction.

1.1 Guidance for the Reader

Defining the uncertainty surrounding flood management, or even flood magnitude estimation, is far beyond the scope of any single work. This thesis focuses on the technical uncertainties of hydrologic modelling, with an emphasis on the uncertainty incorporated through subjective model calibration. In particular, this work presents a synthesis of the current state of knowledge with regard to hydrologic model predictive uncertainty in the estimation of extreme floods. To illustrate the impact of the concepts discussed, an investigation of how model calibration affects the estimation of extreme flood events is undertaken.
This introductory chapter takes an atypical form with the intent of placing later chapters in context. The chapter outlines some of the various ways that uncertainty can affect the selection of methods for flood estimation and the interpretation of their results. As this chapter is intended to provide the reader with an understanding of the larger context of uncertainty in which hydrologic models are applied, readers seeking a more focused discussion of applied hydrologic modelling may wish to proceed directly to Chapter 2 or 3.

Chapter 2 provides a literature-based review of hydrologic modelling, beginning with a review of hydrologic processes and their potential contributions to uncertainty in modelling. The chapter explores a brief evolutionary history of hydrologic modelling, as well as the basics of model classification and various approaches for model calibration. This chapter is intended as a background for less experienced modellers or those seeking a basic review of uncertainty in hydrologic modelling. Readers with a comfortable understanding of the fundamentals of hydrologic modelling may wish to proceed directly to Chapter 3.

Chapter 3 provides a more detailed literature-based discussion of the ways in which uncertainty in hydrologic modelling can be classified and explored. This chapter is intended for both beginning and experienced hydrologic modellers seeking to understand the various ways in which uncertainty is introduced into the model output. Those familiar with the body of literature on uncertainty in hydrologic modelling may wish to proceed directly to Chapter 4.

Chapter 4 describes an experiment that applies the University of British Columbia Watershed Model (UBCWM) to quantify the variability introduced in extreme event simulation through subjective assumptions made during calibration.
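The variability quantification described above ultimately reduces to summary statistics over an ensemble of simulated hydrographs. As a minimal illustrative sketch only (not the thesis's documented procedure), the coefficient of variation of peak flow across several non-inferior parameter sets could be computed as follows; the flow values are invented, and the choice of the sample (rather than population) standard deviation is an assumption:

```python
import statistics

# Hypothetical peak flows (m^3/s) from extreme-event simulations driven by
# several non-inferior parameter sets. Values are invented for illustration.
peak_flows = [1480.0, 1455.0, 1520.0, 1410.0, 1465.0]

def coefficient_of_variation(values):
    """Coefficient of variation: sample standard deviation over the mean,
    expressed as a percentage. Small values mean the ensemble members agree."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)

print(f"CV of simulated peak flow: {coefficient_of_variation(peak_flows):.1f}%")
```

The coefficients of variation reported in the abstract (4%, 1%, and 1% for peak flow, event volume, and time to peak of the PMF-based runs) are statistics of this general kind, computed over the extreme event simulation ensemble.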
Multiple automatic calibrations of a conceptual hydrologic model are conducted using different measures of performance to provide a collection of non-inferior parameter sets. Each parameter set is then used to simulate an extreme event based on hydrologic data provided by BC Hydro. The combined output of these extreme event simulations characterizes the relative variability in the hydrographs associated with the use of different calibration objectives.

Chapter 5 discusses the results of the experiment outlined in Chapter 4. Chapter 6 provides conclusions, and Chapter 7 gives recommendations for future work. These chapters will likely be of interest to all readers. Chapter 8 provides a glossary of commonly-used abbreviations for ease of reference.

1.2 Approaches for Estimating Floods

The floods causing greatest harm are those associated with high magnitude and poor predictability. Extreme examples of this condition include debris flows and floods in steep terrain, which entrain rock fragments, earth, and air into the flood flow (Horton, 1999). Similarly, uncontrolled releases from natural (e.g., moraine-dammed) or anthropogenic reservoirs through failure of a containment structure are commonly associated with significant damages and fatalities. Large, unpredictable floods can also result from blockages of the downstream channel (e.g., ice, log or debris jams), and backwaters from other channels concurrently in flood can restrict flow and inundate significant upstream areas. All of these cases are difficult (if not impossible) to predict effectively, and all are generally beyond the scope of a watershed-scale hydrologic model.
It is appropriate to commence a discussion of how to select an approach for flood estimation with a reminder of the insufficiency of hydrologic models in characterizing the complete flood risk for certain situations. Nonetheless, with rare exceptions, even these most extreme cases are initiated by a more predictable, purely hydrologic, high-flow event. Almost every flood scenario requires heavy runoff from upstream areas. Therefore, the greatest part of scientific effort in flood management has been directed toward quantifying the effects of and relationship between precipitation and runoff.

Many approaches have emerged for estimating the magnitude of a given hypothetical flood flow (e.g., a design event). Generic approximations such as the rational method or regionally-derived drainage area - discharge curves are generally acceptable in the absence of detailed data or as checks on more detailed calculations for small projects. Perhaps the simplest method for estimating flood quantiles involves examining historical events and inferring future events based on those observed in the past (e.g., flood frequency analysis or paleoflood hydrology). More complex methods use computer applications to represent one or more phases of the hydrologic cycle as a simplified system. Wherever possible, more than one method of estimating the design event should be used (National Research Council Canada, 1989). Regardless of the tool(s) implemented, it is crucial to recognize that even the most complex tools are only approximations of the natural system (McCuen, 1973). Each approach is naturally subject to a unique set of uncertainties.
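As a concrete example of the generic approximations mentioned above, the rational method estimates peak discharge as Q = CiA. The sketch below uses a standard metric formulation (intensity in mm/h and area in hectares yield discharge in m^3/s via a divisor of 360); the catchment values are hypothetical:

```python
def rational_method_peak(runoff_coefficient, intensity_mm_per_hr, area_ha):
    """Rational method Q = C * i * A, returning Q in m^3/s.

    The divisor 360 converts (mm/h * ha) to m^3/s:
    1 mm/h over 1 ha = (0.001 m / 3600 s) * 10000 m^2 = 1/360 m^3/s.
    """
    return runoff_coefficient * intensity_mm_per_hr * area_ha / 360.0

# Hypothetical small catchment: C = 0.4, i = 30 mm/h, A = 50 ha.
print(rational_method_peak(0.4, 30.0, 50.0))  # roughly 1.67 m^3/s
```

Consistent with the text's caveat, such a one-line approximation is best reserved for small projects or as a rough check on more detailed calculations.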
Although any approach to estimating runoff may be loosely classified as a "hydrologic model", the neo-classical definition of hydrologic modelling involves the use of a computer model to predict flows under given input conditions. For this reason, a distinction is necessary between two different concepts of a "model". A fundamental hydrodynamic model or concept based on physical principles (e.g., kinematic or diffusive wave equation) is verifiable and repeatable in appropriate controlled experiments. A computer model effectively builds on more fundamental models, applying them to a complex reality with multiple assumptions and potential coding errors (Smith et al., 1994).

There are two established fundamental philosophies for defining a flood magnitude. Prior to the 1970s, the majority of floods computed for engineering design and analysis were defined probabilistically, i.e., an appropriate frequency for the design event was selected or mandated, and the magnitude defined as a co-requisite value (Berga, 1998). The most well-known example of this approach is flood frequency analysis. Klemeš (2000b) argues strongly against the hydrologist's longstanding dependence on flood frequency analysis to provide frequency and magnitude for an extreme event. Much-discussed problems with flood frequency analysis include the impact of plotting position and the applicability of the Independent and Identically Distributed Random Variable (IIDRV) concept. Perhaps most importantly, a series of N data cannot reasonably be expected to provide reliable information about probabilities less than approximately 1/N. Subsequent development of mathematical and computer models has contributed to the emergence of a deterministic approach for estimating potential flood conditions.
A deterministic analysis generates an estimate by combining a given set of initial conditions and modelling the resultant hydrologic response. The typical approach involves combining observed extreme hydrological factors using a hydrologic model to obtain a worst-case scenario. However, this a priori specification of input conditions has engendered much debate concerning just how extreme an event should be considered. Since the 1950s, dam and spillway structures with severe consequences of failure have generally adopted the "probable maximum flood" (PMF) as a design criterion (Graham, 2000). The definition of the PMF became something of a holy grail for deterministic modellers, especially in the realm of dam safety. In rainfall-driven watersheds, a PMF event would be driven by the Probable Maximum Precipitation (PMP), defined as the greatest depth of precipitation theoretically possible for a given location, areal extent, and season (Hansen et al., 1988). Input conditions for watersheds that experience significant snowmelt runoff are somewhat more convoluted. In many snowmelt-dominated areas, the maximum flood arises from some critical combination of snowmelt (as a function of accumulated snowpack and temperature sequence) and heavy precipitation (possibly occurring as rain-on-snow).

The application of the PMF as a design event has recently begun to be questioned. Some scientists believe that the Probable Maximum Flood as defined above is too vague to allow for its generic use as an approach for engineering design and analysis. There is no current set of standardized procedures for calculating the PMF, nor is there a method for quantifying how probable the flood really is.
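Unlike the PMF, a frequency-based estimate makes its exceedance probability explicit, but it inherits the record-length limitation noted earlier: a series of N annual maxima says little about probabilities below roughly 1/N. The sketch below is illustrative only; it fits a Gumbel (EV1) distribution to an invented ten-year record by the method of moments (one of several possible fitting choices), and the 100-year quantile it produces is exactly the kind of extrapolation the text cautions against:

```python
import math

# Hypothetical annual maximum flows (m^3/s); invented for illustration.
annual_maxima = [320.0, 410.0, 275.0, 500.0, 360.0, 450.0, 390.0, 305.0,
                 430.0, 345.0]

n = len(annual_maxima)
mean = sum(annual_maxima) / n
std = math.sqrt(sum((x - mean) ** 2 for x in annual_maxima) / (n - 1))

# Method-of-moments Gumbel (EV1) fit: sigma = beta * pi / sqrt(6) and
# mean = mu + gamma * beta, with gamma the Euler-Mascheroni constant.
beta = std * math.sqrt(6.0) / math.pi   # scale parameter
mu = mean - 0.5772 * beta               # location parameter

def gumbel_quantile(return_period_years):
    """Flow with an annual exceedance probability of 1/T under the fitted EV1."""
    p_non_exceedance = 1.0 - 1.0 / return_period_years
    return mu - beta * math.log(-math.log(p_non_exceedance))

# With only n = 10 years of record, the T = 100 quantile (exceedance
# probability 0.01 << 1/n) is an extrapolation, not an observed behaviour.
print(round(gumbel_quantile(100.0), 1))
```

The estimate grows without bound as T increases, while the information content of the short record does not; this is the core of the 1/N objection raised above.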
Additionally, there is a dearth of information concerning behaviour of watersheds under flooding of PMF proportions. Faulkner notes that historic floods of record for some rivers are initiated by different processes than those presumed to cause the PMF (Faulkner, 2003). Several prominent researchers have noted that the pursuit of quantification ("how to route") should be superseded by attempts to gain a better understanding of the processes involved in extreme event runoff ("what to route") (Burges, 2003; Cordova and Rodriguez-Iturbe, 1983).

There is, therefore, significant uncertainty involved in selecting how any design flood should be defined. In fields such as structural engineering, government or other regulatory bodies provide guidance when calculating appropriate design conditions (e.g., load combinations); however, regulatory guidance on design flood selection is relatively limited. Since the 1970s, the move away from probabilistic flood estimation has resulted in a general shift away from prescriptive regulatory governance of flood selection (Berga, 1998). In the field of flood estimation, definitive and inflexible standards are of questionable value. In some cases, the design conditions suggested by prescriptive standards or guidelines do not include the critical case (Dumont and Dube, 2003). One possible alternative is presented in the guidelines published by the Canadian Dam Association in 1999 for use as best management practices (CDA, 1999). These guidelines link the selection of design floods for dams to the consequences of failure based on loss of life, environmental damage, and financial loss. Although the guidelines represent current best practice in the industry, they still lack quantitative guidance for selecting an appropriate flood condition when the consequences of failure are neither severe nor negligible.
Recent thought challenges the common practice of using overly conservative "worst-case" scenarios in such cases as inconsistent, unwarranted, and philosophically pessimistic (Lohani et al., 1997). However, in the absence of an alternative with a strong legal precedent, their use is likely to continue.

1.3 Presenting Floods for Decision-making

Most computer-based hydrologic models present a single, deterministic estimate for a flood situation with each application. However, such a simple representation of a flood is often not sufficient for effective decision-making, especially where significant safety, financial, or environmental factors are involved. The problems with simple representation are obvious: if the flood has been over-estimated, inefficient or unnecessary designs translate into higher costs; if the flood is underestimated, an unsafe design results. Between these two extremes lies a range of values that would be "acceptable" if design conditions were known with certainty.

In most cases we have at best a limited understanding of where on the safety-vs.-efficiency continuum any given deterministic estimate lies. The presentation of "representative", "average", or "best" results without a description of their associated uncertainties gives an illusion of precision and objectivity, especially to those not familiar with the approaches or tools used in the estimate (Keeney and Winterfeldt, 1989). The literature review in Chapters 2 and 3 is concerned with identifying the various mechanisms that render precision and objectivity unattainable in hydrologic modelling, at least at present.

Those presenting flood estimates for analysis must be conscious of the growing involvement of the public in the decision-making process.
The portrayal of flood-mitigation projects as preventing all or virtually all floods (i.e., by designing for the PMF or other extremely rare events) has had a polarizing effect. Typically, people are left either unaware of the potential for failure or skeptical of the experts' analysis and design (Linsley et al., 1992; Slovic, 1992). The discussion of low-probability events in the absence of numerical data is particularly difficult for the public to interpret, as qualitative concepts like "a small chance of being exceeded" can have large ranges of interpretation (Keeney and Winterfeldt, 1989).

Difficulties of presenting the results of a hydrologic simulation for discussion or decision are magnified when the impact of that information is unclear. Uncertainty and disagreement can arise in translating flows into water levels, failure probabilities for hydraulic structures, or consequences for aquatic resources (e.g., Caissie and El-Jabi, 2003). This is complicated when different stakeholders have different viewpoints and objectives (Gregory et al., 1996). Mutually-agreeable objectives for identifying, measuring, and understanding impacts are a pre-requisite for any group-oriented analysis of technical issues (Fiering, 1976).

Where non-experts must interpret uncertain flood risks, biases and beliefs can play a significant role in arguments and decisions. A variety of personal characteristics (e.g., personality, opinions, values, economic or cultural context) and situational factors (e.g., voluntariness of exposure, familiarity, control) have been noted to influence the relation between perceived risk, perceived benefit, and risk acceptance; few of these factors are explicitly quantifiable (Gregory et al., 1996; Slovic, 1987; Slovic, 1992).
Therefore, experts have noted that the public assigns technically unsubstantiated perceptions of risk to certain situations (Gregory et al., 1996). Experts must understand the impacts of any biases that stakeholders may have, because these biases can potentially have as much impact on the quality of a decision as the technical uncertainties. Further difficulties can arise if non-expert stakeholders are required to use intuition to interpret statistical information. Even experienced researchers may avoid elementary errors (e.g., the gambler's fallacy) while falling prey to the same biases under intuitive judgment of more intricate and less transparent problems (Tversky and Kahneman, 1974).

1.4 Expressing Uncertainty

Since uncertainty and risk are related at a fundamental level, a decision as to what is acceptable should address both the degree of residual risk that is "acceptable" to stakeholders and the uncertainty surrounding the risk analysis. Figure 1-1 illustrates why this is necessary; it shows the relationship between risk and uncertainty, where risk is defined as the product of frequency and consequence. Note that each curve represents an equally valid but distinct interpretation of risk. Each curve is uniquely defined by its level of (un)certainty, expressed in this case as a confidence level. The figure implies that any arbitrary frequency will be associated with a range of possible consequences, each with a different co-requisite level of uncertainty; likewise, each consequence will have a range of possible frequencies. Therefore, it is impossible to explicitly define an acceptable level of risk without at least implicitly defining a corresponding level of acceptable uncertainty.
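The frequency-consequence interaction shown in Figure 1-1 can be made concrete with a small numerical sketch. All values below (the frequency-consequence pairs and the lognormal spread of the consequence uncertainty) are hypothetical illustrations, not data from this study; the point is only that a single frequency maps to a different risk at each confidence level.

```python
import math
from statistics import NormalDist

# Hypothetical frequency -> median consequence pairs (illustrative only).
EVENTS = [(1e-2, 1e5), (1e-3, 5e5), (1e-4, 2e6), (1e-5, 8e6)]

def risk_at_confidence(frequency, median_consequence, log_sigma, confidence):
    """Risk = frequency x consequence, with the consequence taken at the
    requested confidence level of an assumed lognormal uncertainty band."""
    z = NormalDist().inv_cdf(confidence)   # standard normal quantile
    return frequency * median_consequence * math.exp(log_sigma * z)

for conf in (0.10, 0.50, 0.90):
    curve = [risk_at_confidence(f, c, log_sigma=0.8, confidence=conf)
             for f, c in EVENTS]
    print(f"{conf:.0%} confidence:", ["%.1e" % r for r in curve])
```

At 50% confidence the risk reduces to frequency times the median consequence, while the 90% and 10% curves bracket it, reproducing the family of equally valid curves described above.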
One of the few less-explored areas for policy development in hydrology and other risk-related fields is the determination of what constitutes "acceptable uncertainty". Precedents exist for the selection of a single threshold value; for example, transportation engineering regularly implements projects designed to meet safety standards for a specific percentile of drivers. A more direct example is given by the United Kingdom Health and Safety Executive (HSE, 2001) through their adoption of 10⁻⁴ as an appropriate boundary between tolerable and unacceptable risk for situations in which the risk is imposed or involuntary.

As shown in Figure 1-1, different levels of uncertainty require different levels of protection and may therefore drive different choices among alternatives (NRC, 2000a).

[Figure 1-1: Risk - Uncertainty Interaction (from p. 5-20, Lohani et al., 1997); log-log axes with curves at 90%, 50%, and 10% confidence]

Projects having marginal or indeterminate benefit-cost ratios can be the most sensitive to uncertainty, as the uncertainty may be sufficient to justify or deter their approval (NRC, 1995). For example, a small uncertainty in the conditions leading to failure of a levee can make a large difference in the overall system reliability and therefore can influence decisions pertaining to the project as a whole (NRC, 2000b). In many cases, it is reasonable to expect that the available information is insufficient to even begin to properly address cost-benefit studies. This is an undesirable state of affairs from both fiscal analysis and public safety perspectives. For the various reasons above, an expression of the associated uncertainty should be required in all risk analyses (NRC, 1995).
Too often results are limited to values calculated with the "best" estimates or by averaging over the final probability distribution. Even where uncertainties are acknowledged and accounted for, answers provided by scientists are often divorced from their uncertainty as they are passed "up the tree" to stakeholders or decision-makers (Grayson et al., 1992b). The most thorough approach to expressing uncertainty involves presenting a full description of the uncertain results to decision-makers (e.g., a cumulative distribution showing the continuum of event magnitude and probability). Although this can sometimes be difficult to interpret, this full set of results effectively allows the policy decision (i.e., what constitutes acceptable risk) to be separated from the technical analysis. Alternatively, Lohani et al. (1997) argue for presenting mean values to describe magnitude while including low probability-high consequence information wherever the impacts warrant.

In cases where extensive quantitative information is not available, the use of successive approximations can sometimes provide acceptable bounds for the quantity of interest (Keeney and Winterfeldt, 1989). Examples include the limiting frequency of 10⁻⁶ implicit in the State of Washington's definition of a design flood for dams or the conceptual procedure for progressive refinement of "Ultimate Limit State" estimates (Faulkner, 2003; Hartford et al., 2001).

The above discussions have illustrated some of the ways that uncertainty can manifest in hydrologic analysis through subjective choices made in approach, event selection, acceptable risk, biases, and the inclusion of uncertainty in an analysis.
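The full-distribution presentation mentioned earlier in this section can be sketched in a few lines: an ensemble of simulated flood peaks (the values below are hypothetical, not results from this thesis) is presented as an exceedance-probability table using Weibull plotting positions, rather than collapsed to a single "best" value.

```python
# Hypothetical ensemble of simulated peak flows, m3/s.
peaks = [870, 1150, 940, 1320, 1010, 1480, 990, 1210, 1090, 1650]

def exceedance_table(values):
    """Rank the ensemble and attach Weibull plotting positions,
    P(exceed) = rank / (n + 1), largest value first."""
    ordered = sorted(values, reverse=True)
    n = len(ordered)
    return [(q, rank / (n + 1)) for rank, q in enumerate(ordered, start=1)]

for q, p in exceedance_table(peaks):
    print(f"{q:6.0f} m3/s exceeded with probability {p:.2f}")
```

Presenting the whole table (or the equivalent cumulative curve) leaves the choice of acceptable exceedance probability to the decision-maker instead of embedding it in the analysis.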
However, the process of decision-making does not occur in a static environment, and the potential for change is a major source of uncertainty. Changes in policy, knowledge, technology, or potential consequences can all create a "moving target" effect when attempting to address floods and their impacts (Faulkner, 2003). In most cases, the factors are interdependent. For example, Jarrett (1990) used paleohydrologic evidence to show that large floods observed at higher elevations in Colorado are likely due to debris flows rather than intense high-elevation precipitation. This change in knowledge affects potential consequences through a certain but indeterminate increase in the expected frequency of flows of similar magnitude. The resulting increase in uncertainty might cause a policy shift in the region, leading to the implementation of additional structural or non-structural protective measures.

1.5 Model Predictive Uncertainty in Context

The broad-based nature of the hydrologic system leads Grayson et al. (1992b) to refer to the study of hydrology as "trans-scientific" - implying the pursuit of answers to questions asked of science which cannot be answered by science. The discussions of the preceding sections do not address the many ways that uncertainty can manifest itself quantitatively within the actual process of modelling - the "scientific" portion of any flood study. These more technical uncertainties must be combined with value judgments, biases, and other non-technical uncertainties. The ultimate goal is to define the role that uncertainty plays, and identify how it interacts with resulting risks, options, and decisions for any given situation. Uncertainty is also a necessary consideration for hydrologic modelling itself (O'Connell and Todini, 1996).
In addition to the caveats applicable to computer modelling in any field (e.g., garbage in - garbage out), hydrologic models create additional challenges. Most importantly, the results generated by computer simulations fuse together diverse types of uncertainty, such as those arising from natural processes and those related to mathematical representation. Aleatory uncertainty, or natural variability, represents the variability of the physical world on the assumption that this variability cannot be mitigated. Conversely, the U.S. National Research Council (p. 41, NRC, 2000a) relates the concept of epistemic or knowledge-based uncertainty to "a lack of understanding of events and processes, or a lack of data from which to draw inferences". Knowledge-based uncertainty can be reduced under the correct conditions. Since most hydrologic models do not explicitly account for uncertainty in any form, an understanding of the magnitude and relationship of these two components is crucial. Understanding the uncertainty of the applied models and processes is an obvious prerequisite for assessing the uncertainty in any flood-related decision (NRC, 1995).

Within the category of epistemic uncertainty, potential sources of uncertainty include model structure, mathematical representation, model parameter values, and input data errors (Lei and Schilling, 1996). These different types of uncertainty in hydrologic modelling are often lumped together in an applied hydrological context to create an aggregate "model predictive uncertainty" (NRC, 1995). Any of the types of uncertainty listed above has the potential to influence results, and therefore all should be addressed in any uncertainty analysis (Vicens et al., 1975).
It has often been assumed that model calibration can resolve and reduce all components of model predictive uncertainty. However, calculating model predictive uncertainty with techniques such as sensitivity analysis is implicitly dependent on calibration; it is of only limited value unless the model and data errors are known to be insignificant (Lei and Schilling, 1996). Past analyses of "model predictive uncertainty" for hydrologic models have shown a wide range of variation in model accuracy. The accuracies of rainfall-runoff simulations reported in the literature are typically influenced by factors such as those presented by Michaud and Sorooshian (1994):

• the inclusion or exclusion of model calibration;
• the inclusion or exclusion of split-sample validation;
• the number, variety, and climatology of storms examined;
• different understandings of "good" and "poor" simulations;
• benchmark data reliability;
• the context of results (e.g., real-time forecasting, single historic results, or multiple peak flows from a set of historical storms);
• runoff dynamics (e.g., initial conditions, dominant processes); and
• model assumptions, parameter values, and spatial resolution.

Attempting to address many of the most common problems in the present era of high-powered digital computers has somewhat predictably led to the development of some immensely complex computational models of hydrology. However, theoretical rigour does not in and of itself limit uncertainty, and can imply a degree of accuracy that may not exist (Grayson et al., 1992b).
Rather than increasing model complexity, progress in reducing model predictive uncertainty will likely depend on the establishment of a new paradigm that includes an acceptance of uncertainty in the results (Beven, 2002). A best practice approach is needed to allow professionals to move on to defining appropriate principles rather than arguing about how the impacts of uncertainty should be addressed (Faulkner, 2003).

2. Hydrologic Modelling

"Models are like maps: never final, never complete until they grow as large and complex as the reality they represent."
- James Gleick, from "Genius: The Life and Science of Richard Feynman"

This chapter is intended as background for less experienced modellers or for those seeking a basic review of the fundamentals of hydrologic modelling. Readers seeking a more advanced and focussed discussion of uncertainty may wish to proceed directly to Chapter 3. A solid understanding of the processes represented in hydrologic modelling is required for any discussion of model predictive uncertainty. Therefore, this chapter begins with a summary of various processes important to modelling. Although the concepts are basic, the novice modeller is encouraged to consider the discussion in terms of the potential uncertainty inherent in modelling the more complex aspects of the system. Those familiar with the complexity of these processes and the related simplifications and assumptions implicit in different hydrologic models will likely wish to proceed directly to Section 2.2. Section 2.2 provides the reader with insight into the approach and philosophy of hydrologic model development and application. In Section 2.3, a brief overview of the various kinds of models addresses their advantages and disadvantages.
A detailed discussion of automatic and manual calibration follows in Section 2.4, highlighting the need for considering uncertainty in the calibration process. In particular, Section 2.4 introduces the Shuffled Complex Evolution method developed at the University of Arizona (SCE-UA), which is utilized in the quantitative experiment outlined in Chapter 4. This chapter concludes with a discussion of some of the factors limiting progress in hydrologic modelling, which leads into the discussion of uncertainty in Chapter 3.

2.1 Hydrologic Processes

2.1.1 Scale in Hydrology

In nature, scales of things are not arbitrary but tend to concentrate around discrete states as a function of their material substance and of the balance between the interacting forces (Klemes, 1983). Scientific progress has typically been slower in disciplines attempting to work between dominant scale levels than in those working within a single scale (ibid.). Hydrology is a classic example; component processes can be active at many different spatial and temporal scales from minute to global. Klemes (ibid.) generalizes the "characteristic" scale of hydrology as between 1 and 1000 km² in space and 100 seconds to 100 years in time. However, such generalizations serve only in philosophical discussions; the practicing hydrologic modeller must understand what is occurring at all scales to avoid making irresponsible simplifications.
According to Song and James (1992), hydrologic processes can be characterized at five typical scales, including the following:

• laboratory scale - typically less than 10 m, for describing detailed physics of water-surface interactions (e.g., infiltration) or subsurface processes;
• plot or hillslope scale - typically tens of metres, for describing runoff processes;
• catchment scale - typically hundreds to thousands of metres, for characterizing the interaction of various hillslopes feeding into a single channel;
• basin or watershed scale - typically tens to thousands of kilometres, for characterizing the generation, storage, and translation routing of a channel network or river system; and
• continental or global scale - thousands of kilometres and greater, for characterizing the atmospheric processes that drive the hydrologic cycle.

Commonalities of vegetation and land use at each scale typically have associated commonalities of underlying hydrological mechanisms or behaviours (e.g., alpine vs. sub-alpine at the catchment scale; tropical vs. temperate at the basin scale) (Singh, 1995b). Viessman and Lewis (1996) note that it is easiest to deal with hydrology at the watershed or river basin scale due to the relatively sharp boundaries of the runoff system. It is commonly accepted, however, that no adjustment of scale can place such well-defined boundaries on the other components of the water balance; the hydrologic cycle is a closed system only at the global scale (Beven, 2000; Klemes, 1983; Viessman and Lewis, 1996).

2.1.2 Precipitation

Precipitation can take many forms, but the two most common (i.e., rain and snow) are of greatest import for hydrologic modelling. The main difference from a hydrological perspective is the delayed runoff response associated with snow.
This difference is very important from the standpoint of hydrologic modelling, as rain-dominated basins promote a focus on the accurate capture of individual events while simulation of snowmelt-dominated basins is more concerned with seasonal precipitation totals (Micovic, 2003a). Other forms of precipitation (e.g., sleet, hail, graupel) are less common; their significance to hydrologic modelling is therefore limited.

There are three primary categories of precipitation events: convective, orographic, and cyclonic or frontal (Viessman and Lewis, 1996). Convective precipitation arises when moist air heated near the terrestrial interface rises and cools, and typically results in short-term high-intensity local precipitation in the area of the updraft (AMS, 2000; Horton, 1999). Orographic precipitation results from the lifting of moist air masses over natural barriers such as ridges or mountain ranges, and is controlled by barrier slope, height of barrier, and air mass stability (Quick, 1995; Viessman and Lewis, 1996). Frontal precipitation results from the interaction of two air masses of different density, almost invariably segregated by temperature (AMS, 2000). The term "frontal precipitation" is most significant in its sense of distinction from convective and orographic precipitation (Viessman and Lewis, 1996).

The different mechanisms of precipitation generation are distinct in their behaviour and should be modelled as such wherever possible, since factors such as precipitation and its timing can often control the flood response of a basin (Konrad, 2001). For example, the various precipitation events that led to the Mississippi River flood of 1993 were not among the most extreme events of record at any spatial scale.
Stationary or slow-moving storm systems can also lead to flood conditions (e.g., Spring Creek, Colorado in Ogden et al. (2000) and Kickapoo Creek, Texas in Smith et al. (2000)). Successively increasing peaks of rainfall from a storm moving in the downstream direction represent the worst-case scenario from a hydraulic standpoint. In this case, subsequent waves of runoff can propagate and overtake preceding waves (Ogden et al., 2000; Thapa and Khanal, 2001).

Since volume, timing, and distribution of precipitation are the most significant factors in determining flood magnitude, it is no surprise that techniques for the measurement of precipitation are well developed. The two main sources for rain data are gauge networks and radar measurement. Recording precipitation gauges are the most common form of data used in hydrologic research, providing continuous point estimates of precipitation at a specific location (Duchon and Essenberg, 2001). The two most commonly used classes of recording gauges are tipping bucket and weighing gauges. A tipping bucket gauge involves a small, bi-stable bucket having two chambers, each with a volume equivalent to a fraction of a millimetre of rain. The presence of water in one side of the bucket will cause it to tip to that side, spilling the full chamber and aligning the empty chamber in position to collect precipitation. A datalogger records the number of tips within a specified period. A weighing gauge records continuous or periodic mechanical measurements of the cumulative weight of precipitation using a roll plot or electronic means, and therefore requires periodic calibration in the field (Duchon and Essenberg, 2001).
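The arithmetic behind tipping bucket records is simple: each tip represents a fixed depth of rain, and tip counts per logging interval yield intensity. A minimal sketch follows; the 0.2 mm bucket resolution is an assumed value for illustration, as actual gauges vary.

```python
MM_PER_TIP = 0.2  # assumed bucket resolution (mm of rain per tip); gauges vary

def rainfall_depth_mm(tip_count):
    """Cumulative rainfall depth implied by a datalogger tip count."""
    return tip_count * MM_PER_TIP

def rainfall_intensity_mm_per_h(tip_count, interval_minutes):
    """Average rainfall intensity over a logging interval."""
    return rainfall_depth_mm(tip_count) * 60.0 / interval_minutes

# e.g., 15 tips recorded in a 5-minute interval
print(rainfall_depth_mm(15))               # ~3.0 mm of rain
print(rainfall_intensity_mm_per_h(15, 5))  # ~36.0 mm/h average intensity
```

The bucket resolution also sets the measurement floor: events smaller than one tip, or intensity variations within a single tip, are invisible to the record.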
For a more complete discussion of automatic gauges, the reader is referred to Nystuen et al. (1996). Non-recording rain gauges are much more straightforward but do not allow the user to estimate rainfall rate or intensity. Regardless, such data have proven useful to hydrologic modellers in the past (e.g., Faures et al., 1995). Non-recording rain gauges typically consist of an amplified collector which is measured manually against a fixed or removable scale. Their simplicity has led to the development of an extensive network of amateur meteorological stations across the United States (National Weather Service, 2003).

The expansion of Doppler radar across North America provides an alternative form of precipitation measurement. While recording rain gauges capture temporal variability for a given location, radar rainfall estimates can characterize spatial variability for a given series of temporal snapshots. Radar can provide three-dimensional observations over thousands of square kilometres. The physics involved in radar measurement limit its accuracy at greater distances, and usually an expert must review results prior to use to avoid incorporation of anomalies into the data set.

Solid precipitation (i.e., snow) is usually measured with a heated tipping bucket gauge or a weighing gauge treated with antifreeze to melt the snow on contact. While much less common, radar can also be used to estimate snowfall. Collier and Larke (1978) find that radar measurements can have as little as 13% error when compared to gauge-measured data. More generally, however, radar estimates of snowfall have not been very successful (Xiao et al., 1998).
Studies involving radar and ground-based snow measurements should be interpreted with care, as radar estimation relationships can be dependent on diverse factors such as range, location, temperature, snowfall type, and season (ibid.; Hunter et al., 2001). In most practical cases, error is likely to be at least several times that reported by Collier and Larke (1978), and can easily exceed a factor of two (Krajewski, 2005). Sonic snow depth sensors and snow pillows are also commonly used to record variations in snow depth and snow water equivalent over time. Manual snow depth sampling (i.e., snow courses) is routinely performed for the purposes of estimating the spring freshet. In a less traditional context, Matthews (1999) demonstrates how remote sensing can be applied to determine the snow-covered area for a basin and thus provide insight into snowpack generation and depletion.

There are two general approaches for estimating snowmelt: energy balance methods and index methods (Maidment, 1993). The physically-based energy balance method applies continuity principles to the various energy fluxes of a watershed (Viessman and Lewis, 1996). The main drawback of the energy balance method is that it requires significant amounts of data collected from within the basin (e.g., radiation, wind, vapour pressure). The various index-based methods for estimating snowmelt use calibrated parameters and index variables to estimate snowmelt on a catchment-wide basis (Matthews, 1999). Index methods are generally less accurate but easier to apply, requiring only one or two data series for their index variable(s). Air temperature is the most commonly used variable due to its wide availability and strong correlation to snowmelt processes.
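The temperature-index approach can be illustrated with the classic degree-day formulation, M = Cm(Ta - Tb), where M is daily melt, Ta the mean air temperature, Tb a base temperature, and Cm a melt factor. The melt factor and base temperature below are illustrative placeholders; in practice they are calibrated per basin, as the text notes.

```python
def degree_day_melt(mean_air_temp_c, melt_factor=3.0, base_temp_c=0.0):
    """Daily snowmelt (mm of water equivalent) from the degree-day index
    method: M = Cm * (Ta - Tb), zero when Ta <= Tb.
    melt_factor (mm/degC/day) and base_temp_c are placeholder values."""
    return max(0.0, melt_factor * (mean_air_temp_c - base_temp_c))

def season_melt(daily_temps_c, swe_mm, melt_factor=3.0):
    """Deplete a snowpack (initial snow water equivalent, mm) over a
    series of daily mean temperatures; melt cannot exceed remaining pack."""
    melted = 0.0
    for t in daily_temps_c:
        melted += min(degree_day_melt(t, melt_factor), swe_mm - melted)
    return melted

print(degree_day_melt(5.0))                     # 15.0 mm/day
print(season_melt([-2, 1, 4, 8], swe_mm=30.0))  # 30.0 (pack fully depleted)
```

The single calibrated melt factor is what makes the method easy to apply and also what limits its accuracy relative to a full energy balance.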
The World Meteorological Organization's 1986 comparison of snowmelt models concludes that the choice of model usually depends on the intent of the application and the nature of the problem and hydrologic regime (WMO, 1986).

2.1.3 Runoff

A number of processes comprise the transition of precipitation to streamflow. A complete taxonomy of the processes active in any one catchment would show that over time, the composition of outflow by source changes depending on which processes are most active in a volumetric sense. In most areas, the in situ relative contributions cannot be measured directly; however, quantitative estimates are often possible. In cases where source chemistry can be identified for each mechanism, the utilization of chemical data and tracers allows researchers to explore quantitative estimates for the proportional origin of runoff waters (Hornberger and Boyer, 1995). For decades, studies have attempted to characterize the properties of various runoff responses. For example, Dunne (1982) provides estimates of typical velocities for overland, subsurface, and channel flows. This section briefly describes the mechanics of and approaches for estimating the translation of precipitation into runoff.

Precipitation may be intercepted by vegetation or collect in minor depressions in the ground surface. These two intermediate processes are both sinks for precipitation, but neither contributes directly to runoff; most of the water detained by these processes either infiltrates into the ground or evaporates after the rain stops (Pike and Scherer, 2003). In both cases, even the most detailed hydrologic models approximate the processes on an areal basis, usually using empirical parameters. In particular, seasonal variations in interception are often not considered.
Infiltration refers to the process by which water moves from the surface into the soil matrix. Infiltration is constantly active in all wetted permeable areas except those from which active seepage is occurring. The rate of infiltration is a function of soil moisture and condition as well as soil type. Coarse soils, well-vegetated land, low soil moisture, and a macro-porous top layer (i.e., affected by burrowing insects and animals) all promote high infiltration rates (Fetter, 1994). Initial soil moisture at a given location varies over time; as a dependent condition, the limiting infiltration rate does likewise. In situ soil moisture can be assessed using a variety of apparatus (e.g., tensiometers or TDR (time-domain reflectometry) probes), and infiltration rates for soil samples under different moisture conditions can be measured in a laboratory. However, many factors can influence infiltration at a local level (e.g., leaf litter and local topography). This renders a detailed characterization of the infiltration process complicated even for ideal conditions (Viessman and Lewis, 1996).

Infiltrated water generally percolates vertically downward through the unsaturated zone until meeting the water table. However, it is common - especially where the catchment is dominated by steep hillslopes - to have horizontal flow above the nominal saturated zone, a process referred to as interflow. Preferential flow occurs through root voids and other macropores in the soil matrix under saturated or unsaturated conditions. However, interflow most commonly occurs on steep hillslopes when percolating water encounters a zone of lower vertical hydraulic conductivity (Fetter, 1994; Refsgaard and Storm, 1995).
The resulting perched water table often triggers lateral subsurface flow, either through lower-permeability strata or soil macropores (Weiler et al., 2005). Thus, interflow can occur under saturated or near-saturated conditions even though the response occurs above the nominal water table (ibid.). This is an important process for delivering water to the valley bottom at the hillslope scale (Weiler and McDonnell, 2004). Interflow is difficult to quantify due to its transient nature. An effective numerical description requires knowledge of subsurface strata topography, hydraulic conductivity, macroporosity, and connectivity, as well as antecedent soil moisture and groundwater conditions. The scientific understanding of interflow at the hillslope scale has evolved considerably over the past few decades through the application of new measurement techniques like isotope and chemical tracing (Weiler et al., 2005).

Although the prevailing understanding of interflow has advanced, interflow is not well represented in most hydrologic models. Even relatively sophisticated hydrologic models like MIKE SHE (DHI Software's popular version of the Systeme Hydrologique Europeen) calculate only vertical flow in the unsaturated zone (Refsgaard and Storm, 1995). In some cases, the soil properties governing interflow (e.g., bulk hydraulic conductivity, including macropores) cannot be accurately assessed, since the act of sampling often changes the property in question (Beven, 2002).

Saturated-zone groundwater processes are comparatively well understood but in many cases just as difficult to quantify. Groundwater baseflow sustains surface water systems through dry periods by slow depletion of subsurface storage.
Volumes and flow rates for groundwater are difficult to characterize because of their strong dependence on regional geology. Further complexity arises from the three-dimensional nature of subsurface flow. Although complex, groundwater can be modelled in three dimensions (e.g., MIKE SHE in Refsgaard and Storm, 1995). However, many models assume flow in the third dimension to be negligible, allowing for two-dimensional analysis within the saturated zone (Viessman and Lewis, 1996). Groundwater measurements typically involve recording the depth to which an unrestricted column of water will rise at a number of individual locations. A series of such measurements can determine flow direction but cannot fully describe subsurface flow without additional information on hydraulic conductivity. Typically, groundwater flow calculations are based on equations developed for a simple control volume and benchmarked to the subject catchment using empirical parameters derived from field measurements. Regardless of the quality of the parameter estimation, equations derived on the basis of a homogeneous, isotropic control volume are often inapplicable at large scales due to heterogeneity and preferential flow pathways (Beven, 2002).

Modern hydrologic literature divides flow over the ground surface into two distinct mechanisms. The dominant mechanism for a given region depends on climate, topography, ground permeability, and rainfall intensity. Horton or "infiltration excess" Overland Flow (HOF), named for hydrologic pioneer R. E. Horton, occurs when the precipitation rate exceeds the infiltration rate.
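The infiltration-excess condition just stated reduces to a simple rate comparison; the sketch below is a minimal illustration (rates are hypothetical), not a representation of any particular model's runoff routine.

```python
def infiltration_excess(precip_rate, infil_capacity):
    """Hortonian (infiltration-excess) overland flow rate (mm/h):
    runoff is generated only while rainfall intensity exceeds the
    soil's current infiltration capacity; otherwise all rain infiltrates."""
    return max(0.0, precip_rate - infil_capacity)

# 25 mm/h of rain on soil currently infiltrating at 10 mm/h
hof = infiltration_excess(25.0, 10.0)   # 15 mm/h of overland flow
```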
Excess precipitation runs over the surface until it either infiltrates elsewhere or reaches an area of surface accumulation (e.g., depression, lake, stream). HOF can occur as a result of extremely intense precipitation or moderately-intense precipitation falling on low-permeability surface layers such as exposed bedrock, frozen ground, asphalt, or extremely dry soil. HOF is usually the dominant form of overland flow observed in arid environments due to the typical combination of a dry, low-permeability surface layer and infrequent but intense precipitation events.

Saturation Overland Flow (SOF) occurs when a precipitation event causes the water table to rise to the surface (e.g., at the base of a hillside), creating a seepage face (Dunne, 1982). The term saturation overland flow refers to the combination of direct precipitation on saturated soil and flow emerging from an ephemeral seepage face. SOF is more commonly observed than HOF in wet and temperate regions due to its dependence on topography rather than intense precipitation and low permeability. SOF can be observed along creeks and stream banks during most higher-intensity precipitation events. Flooding due to SOF is typically associated with prolonged periods of rain that substantially raise local water tables.

The volume of surface flow is typically calculated by subtracting estimated losses and infiltration from estimates of areal totals for rain and snowmelt (Michaud and Sorooshian, 1994). The actual mechanics of surface flow would be nearly impossible to simulate in detail, necessitating the use of substantial approximations. Where detailed simulation of overland flow is required, the most widely-applied approach is to use the St.
Venant equations to route flow as a thin sheet of water moving over a homogeneous landscape of constant or smoothly-varying roughness. More complex computations are used in leading-edge computer software such as DHI's MIKE SHE, which applies the Saint Venant equations in two horizontal dimensions (Refsgaard and Storm, 1995). These methods implicitly assume that surface runoff remains in the form of characteristically two-dimensional sheets rather than considering well-documented but more complex real-world behaviour (e.g., rilling and backwatering around uneven micro-terrain features). These assumptions may ultimately provide the correct answer for timing and volume of water reaching the channel, but bear little resemblance to the in situ process on a hillslope (Burges, 2002).

All runoff eventually makes its way to the channel network of streams, rivers, and lakes. The acceptable representation of these surface water systems in most models arises from a fairly thorough understanding of open-channel hydraulics. However, many routing approaches still rely on empirical and subjective methods of approximation that are applicable only under specific conditions (e.g., Manning's formula). This suggests that our ability to model river hydraulics is arguably "good enough" (i.e., not unacceptable) rather than "good". Further, cases exist where the standard suite of assumptions is invalid. For example, channel losses can be important in arid regions where Hortonian runoff dominates, and ephemeral lakes can change size appreciably during runoff events, making modelling difficult (Michaud and Sorooshian, 1994; Ogden et al., 2000).
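Manning's formula, cited above as a typical empirical approximation, can be sketched as follows; the channel geometry and roughness values are illustrative assumptions only.

```python
def manning_discharge(n, area, radius, slope):
    """Discharge (m^3/s) from Manning's formula in SI units:
    Q = (1/n) * A * R^(2/3) * S^(1/2), with roughness coefficient n,
    flow area A (m^2), hydraulic radius R (m), and slope S (m/m)."""
    return (1.0 / n) * area * radius ** (2.0 / 3.0) * slope ** 0.5

# Hypothetical gravel-bed channel: n = 0.035, A = 12 m^2, R = 0.8 m, S = 0.002
Q = manning_discharge(0.035, 12.0, 0.8, 0.002)
```

Note how the single roughness coefficient n absorbs everything from bed material to vegetation, which is precisely why the formula is described as empirical and subjective.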
Streamflow measurement is critical to hydrology, as it is the most heavily used (and often the only available) indicator of the hydrological behaviour of a watershed. Fortunately, streamflow measurement can be relatively straightforward compared to measurement of other hydrological processes (e.g., areally-distributed precipitation and evapotranspiration). Structures such as weirs, flumes, and culverts can be used to measure flow directly through the well-documented relationship between their geometric properties and the location of the water surface at different flow rates. Accuracies of measurement vary with the type of structure.

A current meter, used to measure fluid velocity at a single point, can also be used to measure flow by employing a systematic pattern of velocity and depth measurements. The product of each pair of measurements is multiplied by its corresponding portion of the river cross-section. The results are then summed to attain an approximation of the streamflow. By repeating the above method at a variety of different flow levels, a relationship between ambient water surface elevation and discharge can be obtained. With this stage-discharge relationship, measurements of water stage (i.e., using a stilling basin or staff gauge) are easily converted into streamflow. This method is favoured by hydrometric agencies such as the Water Survey of Canada and the U.S. Geological Survey (Moore, 2004). Geomorphologic changes can significantly affect the stage-discharge relationship; therefore, it must be updated on a regular basis to preserve its accuracy. This is the most common method for calculating streamflow; most gauging stations record only water level.

All of the above methods of measuring streamflow have conditions under which they are either infeasible or provide meaningless results.
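The velocity-area computation described above (depth times velocity at each vertical, weighted by the width of the subsection it represents, then summed) can be sketched as below; the five verticals are hypothetical numbers for a small stream.

```python
def velocity_area_discharge(widths, depths, velocities):
    """Approximate streamflow (m^3/s) by the velocity-area method:
    sum over verticals of (subsection width * depth * point velocity)."""
    return sum(w * d * v for w, d, v in zip(widths, depths, velocities))

# Hypothetical gauging: subsection widths (m), depths (m), velocities (m/s)
Q = velocity_area_discharge([0.5, 1.0, 1.0, 1.0, 0.5],
                            [0.2, 0.5, 0.7, 0.4, 0.15],
                            [0.1, 0.4, 0.6, 0.3, 0.05])
```

Repeating such a measurement at several stages yields the point pairs from which a stage-discharge rating curve is fitted.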
In basins where colder temperatures are associated with low streamflow, significant data can be lost or made suspect due to ice conditions (e.g., Environment Canada, 2001). General problems with current metering at low flows include streamflow velocities at or below the meter's stall speed, insufficient depth for required submergence, and insufficient stream width to sample a sufficient number of vertical sections (Pike and Scherer, 2003). Turbulent environments can also disrupt current metering, as velocities may not be consistent enough at any given point to provide a reliable measurement. Even relatively accurate measurement devices such as weirs and flumes are not exempt from problems; their installation can affect the local flow regime, and conditions must be allowed to stabilize before results can be considered broadly applicable (Michaud and Sorooshian, 1994).

Low flows, irregular cross-sections, and high turbulence may preclude the use of some or all of the above gauging techniques. However, in many such cases, discharge can be measured using a conservative tracer and the principle of mass balance. This approach, called salt dilution gauging, involves injecting into the flow a solution containing a chemical tracer of known concentration. The injection can be performed as either a continuous input or a slug injection. The dilution of the tracer is then measured at an appropriate distance downstream. Moore (2004) cites common table salt (NaCl) as the most popular tracer because it is inexpensive, readily available, easily measurable (using electrical conductivity), and non-toxic for the exposures currently associated with discharge measurements.
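For a slug injection, the mass-balance principle behind salt dilution gauging reduces to dividing the injected tracer mass by the time-integral of the excess concentration observed downstream; the sketch below uses an assumed concentration wave and sampling interval purely for illustration.

```python
def slug_dilution_discharge(mass_g, concentrations, background, dt_s):
    """Discharge (L/s) from a slug salt-dilution measurement:
    Q = M / integral of (C(t) - C_background) dt, with tracer mass M (g)
    and downstream concentrations (g/L) sampled every dt_s seconds."""
    excess_area = sum(max(c - background, 0.0) for c in concentrations) * dt_s
    return mass_g / excess_area

# Hypothetical: 500 g of NaCl, downstream wave sampled at 1-s intervals (g/L)
wave = [0.0, 0.002, 0.010, 0.020, 0.015, 0.008, 0.003, 0.001, 0.0]
Q = slug_dilution_discharge(500.0, wave, 0.0, 1.0)
```

The requirement for complete lateral mixing noted in the text corresponds to the assumption that the sampled wave represents the whole cross-section.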
Because salt dilution gauging relies on attaining a complete lateral mix of the tracer solution, it is ideally suited for the irregular and highly turbulent environments typical of mountain streams. However, salt dilution gauging also has limitations. For environmental and practical reasons, it is less appropriate for flows greater than about 15 m³/s (Kite, 1993). Most importantly, the implicit requirement for steady-state flow generally limits salt dilution gauging to discrete (as opposed to continuous) measurements of discharge. There is also the potential for environmental impacts (e.g., Wood and Dykes, 2002), as high concentrations are typically observed near the point of injection. However, deleterious impacts are unlikely given the short-lived and localized nature of these higher concentrations; concentrations downstream of the mixing reach are usually far below commonly-accepted 48-hour toxicity thresholds (ibid.; Moore, 2005).

It is of great significance to hydrologic modellers that large floods often overwhelm fixed measuring structures, regardless of the measurement approach. This typically means that data collection is interrupted on the rising limb of a flood hydrograph, and is not resumed until repairs can be effected (e.g., Ogden et al., 2000). The result is that critical streamflow data - i.e., that of the flood peak and duration - is lost. In such cases, hydrologists must rely on expert analysis of qualitative data such as eye-witness accounts and high-water marks such as debris scatter and tree scarring. Approximate formulae such as the slope-area method, an approach commonly applied in such cases, incorporate substantial uncertainty into peak flow estimates.
Streamflow entering a reservoir can also be estimated where outflows are controlled or monitored and the volume of the reservoir is known. Inflows are estimated as the sum of total releases from the reservoir and any change in storage. Change in storage is typically estimated by measuring reservoir level and converting it to volume using a stage-storage curve; this often requires an implicit assumption that the lake or reservoir is groundwater-neutral (i.e., neither accumulating from nor discharging to the subsurface). Due to the potential impact of dynamic effects like wind set-up and waves on stage measurement, back-calculated estimates for reservoir inflows should be used with caution.

2.1.4 Evapotranspiration

Evaporation and transpiration together can account for as much as 80% of hydrologic activity in a typical basin (Klemes, 1986a). Calculations of pure evaporation are commonly limited in application to determining losses from lakes or reservoirs. Approaches include applying empirical adjustments to pan evaporation measurements, or using more detailed data to employ water budget, energy budget, or mass transfer techniques. In some cases, evaporation has been pre-calculated from data collected at a climate station (Environment Canada, 2005). However, few hydrologic models attempt to explicitly calculate evaporation.

Transpiration is essentially the evaporation of water taken up by plants, shrubs, and trees. In addition to the physical factors governing evaporation (e.g., exposure, heat fluxes), transpiration is known to vary widely with plant species, density, and size (Fetter, 1994). Available moisture can also limit transpiration when soil moisture drops below the plant's wilting point (Viessman and Lewis, 1996).
Changes in season will directly affect transpiration as plants respond to environmental stimuli (e.g., cf. deciduous and boreal forests). Measurement of transpiration alone is extremely difficult and must be undertaken in closely controlled laboratory conditions which eliminate external evaporation (e.g., using a potometer). The results of such experiments are naturally highly dependent on ambient and constituent conditions.

Due to the difficulty of obtaining distinct estimates for each process independently, the term evapotranspiration (ET) is used to represent the combined return of water to the atmosphere through evaporation and transpiration (Pike and Scherer, 2003). ET is the largest sink for precipitation in all but extremely humid, cool climates (Fetter, 1994). Refsgaard and Storm (1995) note that ET accounts for approximately 70% of annual precipitation in temperate zones. Klemes (1986a) points out the discrepancy between the large fraction of hydrologic activity in a basin ascribed to ET and its relative dearth of treatment in hydrologic literature and practice.

ET plays a minor role during short-term storm events, since the air is typically at or near its saturation point. Following precipitation events, depletion of the soil moisture through evapotranspiration creates a soil moisture deficit. Infiltrated water must replenish this deficit before subsurface runoff will contribute to stormflow. The dependence of transpiration on soil moisture is reflected in the parallel dependence of ET on soil moisture. To account for this, Thornthwaite (1944) calls the upper limit of transpiration losses "potential ET", defined as "the water loss which would occur if at no time there is a deficiency of water in the soil for the use of vegetation".
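One common way of reducing potential ET to an actual rate (an illustrative assumption here, not a method advanced by this thesis) is a linear ramp on soil moisture between the wilting point and field capacity:

```python
def actual_et(pet, theta, theta_wp, theta_fc):
    """Actual ET (mm/day) from potential ET under a simple (assumed)
    linear soil-moisture stress function: AET = PET at or above field
    capacity theta_fc, zero at or below wilting point theta_wp, and
    linearly interpolated for moisture theta in between."""
    if theta >= theta_fc:
        return pet
    if theta <= theta_wp:
        return 0.0
    return pet * (theta - theta_wp) / (theta_fc - theta_wp)

# Hypothetical values: PET = 4 mm/day, theta = 0.20, wp = 0.10, fc = 0.30
aet = actual_et(4.0, 0.20, 0.10, 0.30)
```

The shape of this stress function is itself uncertain, which is exactly the conversion problem the surrounding text discusses.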
As soil moisture declines from field capacity to wilting point, there is some uncertainty as to the rate at which evapotranspiration is affected (Fetter, 1994). The reduced rate of ET resulting from reduced soil moisture is referred to as "actual evapotranspiration". Although soil water content can be measured with some precision (at least as a point value), there is persistent uncertainty associated with the conversion between potential and actual evapotranspiration. Point approximations of evapotranspiration for a particular plant and location can be obtained using a lysimeter (Fetter, 1994), but such estimates are typically limited due to in situ variability of species, size, and density. Pan evaporation data have also been related to ET through empirical coefficients with varying degrees of accuracy (Allen et al., 1998).

Methods for calculating evapotranspiration in hydrologic modelling are generally either empirical or based on the mass balance and energy budget approaches noted above. The water budget approach is of little use in hydrologic modelling, since ET is commonly required to close a water balance when estimating runoff, and energy-based approaches require types of field data not commonly available in meteorologic data sets (Fetter, 1994). The end result is that many hydrologic models use purely empirical relationships that require a minimum amount of data and which rely on calibration to be locally applicable (e.g., Quick, 1995). More advanced methods such as the Penman-Monteith equation (Monteith, 1965) directly calculate actual evapotranspiration using complex energy balance inputs such as net radiation and vegetative canopy resistance to heat and vapour transfer.
However, the Penman-Monteith equation is still only intended for use with uniform expanses of vegetation - a rare condition in the natural environment (Allen et al., 1998). In general, even the most accurate methods for estimating ET have substantial margins of error. For a deeper discussion of evapotranspiration methods, the reader is referred to Jensen et al. (1990).

2.2 The Evolution of Hydrologic Modelling

2.2.1 The Beginnings of Modelling

Musings on the nature of hydrology have been traced back as far as the philosophers of Ancient Greece; however, a scientific understanding of the hydrologic cycle did not begin to emerge until the fifteenth century. Early efforts to quantify hydrologic variables first began to appear in the 17th century, while 18th century developments in hydraulic theory and instrumentation led to extensive experimentation and empirical study throughout the 19th century (Viessman and Lewis, 1996). Among these, Singh and Woolhiser (2002) identify the beginnings of mathematical modelling of hydrology in the rational method of Mulvany (1851) and the relationship between peak storm runoff and rainfall intensity developed by Imbeau (1892).

The 1930s saw systematic field experiments conducted in the US Midwest in an attempt to understand the physical processes involved in the rainfall-runoff transition (Woolhiser, 1996). Hydrologic modelling began to emerge as a process for developing concepts, theories, and models of individual components of the hydrologic cycle, such as overland flow, channel flow, infiltration, depression storage, evaporation, interception, subsurface flow, and baseflow (Singh and Woolhiser, 2002).
Early hydrology typically ignored informational uncertainty and adopted an either/or approach in its formulation rather than combining all available information (Vicens et al., 1975). Burges (2002) reviews the significance of individual contributions by Horton, Penman, Darcy, and others to models of the various component processes. However, their collective progress was limited by the intensity of the computations involved. Further difficulties emerged as researchers provided strong experimental evidence of the non-linear nature of the runoff process (Woolhiser, 1996).

The advent of the digital computer in the 1960s with its ability to manage calculations of previously prohibitive complexity led to an explosion of interest and research in hydrology (Woolhiser, 1996). The first "real" hydrologic models emerged (e.g., the Stanford Watershed Model, SWM), being conceptual in nature while retaining a degree of theoretical physical significance in the controlling parameters. Through these tools, modellers were first provided with the capability to comprehensively synthesize past events, predict future events, quantify extreme conditions, evaluate anthropogenic impacts on hydrology, and thereby improve the understanding of hydrology (p. 238, Freeze and Harlan, 1969).

Even while the earliest conceptual models were being refined, some researchers adopted a contrasting paradigm. They proposed that models link together existing but independent mathematical descriptions of the various hydrologic processes (e.g., Freeze and Harlan, 1969). The goal was the creation of a comprehensive physically-based digital hydrologic model whose output would completely describe the hydrologic system.
The dominant role of spatial and sequential variations in the model input and output is reflected in Freeze and Harlan's vision of such models as "three-dimensional boundary-value problems with spatially and sequentially distributed inputs, solved by numerical methods" (p. 255, ibid.). Although researchers noted early on that data requirements for such models would be prohibitive even for small heavily-instrumented research catchments, the intervening decades have seen wide adoption of Freeze and Harlan's framework. Beven presents one possible reason for the acceptance of the framework through his observation that "difficult sciences [...] often aspire to demonstrate progress and maturity by more advanced mathematical descriptions" (p. 203, Beven, 2002). Many different models are now based on variants of either the Stanford Watershed Model ("conceptual" models) or the FH69 blueprint ("physically-based" models). The various types of models are discussed in more detail in Section 2.3.3.

2.2.2 The Philosophy of Model Development and Application

The adoption of hydrologic models into mainstream science and engineering has resulted in the emergence of two distinct goals when developing (or applying) a hydrologic model. The first goal is research-oriented, attempting to increase our understanding of hydrology through exploration of different assumptions and theories. The second goal seeks effective prediction of real-world behaviour for applied contexts such as water resources management (Grayson et al., 1992b). Experimentation advancing the first goal cannot be concluded a priori to extend the understanding of the second, and vice versa (Beven, 1989).
Numerous mathematical models of the last century have been developed in the second sense, i.e., to address only the hydrological variable of interest at the time (Franchini and Pacciani, 1991). In contrast, several authors contend that the prevailing "cost-effective", results-oriented application of complex numerical models is of limited significance to true progress in hydrology (e.g., Bergstrom et al., 2002; Braben, 1985; Klemes, 1982, 2000a). Beven (2000) contends that, in the past, technological rather than scientific progress has fuelled the demand for hydrologic models. Many acclaimed "advances" in hydrologic modelling more correctly reflect one or more of (Beck, 1987; Beven, 2000; Beven and Feyen, 2002):

• the technological capability for enhanced data collection and management;
• the widespread implementation of more complex models;
• the application of more, less, or different calibration; or
• the easy visualization of results.

Fundamental modelling technology developed decades ago is still in use in many parts of the world, in part because new techniques have proven unable to improve model accuracy or efficacy when the underlying model structure or data are fallible (Woolhiser, 1996; Singh and Woolhiser, 2002). Since the development of computer models of hydrology, peripheral problems like model calibration have seemed to dominate the focus on process and perspective advocated by Horton in the 1930s (Klemes, 1986b). In general, hydrologic models require a sound physical basis if they are to be scientifically credible. Therefore, it is viewed as a positive development that more recent work shows a growing return to process-oriented research and a focus on field work (Klemes, 2000a).
Strong physical reasoning or empirical evidence should be used to determine which representations and simplifications are appropriate for a given natural system (O'Connell and Todini, 1996). Woolhiser (1996) notes that a good match between dominant in situ and modelled processes is a pre-requisite for success; neglecting such intuitive precepts can lead to structural inadequacies and calibration errors (Gan and Burges, 1990b). Faulkner et al. (1998) and Franchini and Pacciani (1991) present some specific examples of inconsistencies of methodology.

Very few models and data sets are substantial enough to allow modellers to explain why a given hydrologic model simulation is considerably different from its corresponding field measurements (Smith et al., 1994). Often, the lack of a parallel field program for many models interferes with advancing an understanding of the model structure and its influence on results (Grayson et al., 1992b; Michaud and Sorooshian, 1994; O'Connell and Todini, 1996; Song and James, 1992). In the absence of objectively verifiable truthing data, Woolhiser (1996) believes a modeller must constantly be asking questions such as "Does this make sense?", "What is the uncertainty of my prediction?", and "Does this level of uncertainty render the analysis meaningless?". Seibert and McDonnell (2002) argue that the modeller should also be checking the model for consistency and reasonableness against any "soft" (i.e., imprecise or qualitative) data that may be available. Loague and Freeze (1985) point out that the usefulness of results often depends on the modeller's understanding of the applied model and its relationship to the hydrologic nuances of the subject catchment.
Poor or uncertain model results should instinctively beget a return to the prototype rather than additional "fine-tuning" of an extant model. Klemes (2000a) contends that, if the substance of the model is held sacrosanct, little insight can be gained into the prototype regardless of the effort expended. Unfortunately, poor model results are seldom reported (Beven, 2000; Grayson et al., 1992b). The model developer's emotional investment often makes it more appealing to search for a new application for an unsatisfactory tool than to search for a new and better way of dealing with the problem (Klemes, 1983).

2.2.3 Model Complexity in Context

Until approximately a decade ago, limited computational capability was a substantive barrier to advanced hydrologic modelling. However, the ongoing and accelerating growth of processing power made it inevitable that computational ability would soon outpace and surpass our ability to model. With an increased ability to solve numerical problems, studies of the effects of parameter variation on model results became more viable (Gupta et al., 1999). Simultaneously, the value of investigation into simplifications and approximate solutions (e.g., Kuczera, 1997) diminished as computational advances negated the main advantage of such approaches. The complexity of most contemporary models is now constrained only by the degree of complexity appropriate to the model, subject, and research context. The definition of an "appropriate" degree of complexity has been and continues to be a major focus for discussion and research.
Some believe that the required level of accuracy should dictate the choice of model in any situation, since even simple, easy-to-use mathematical models can often explain a large part of streamflow variance (Beven, 1989; Dunne, 1982; Garen and Burges, 1981; Jakeman and Hornberger, 1993). Smith et al. (1994) point out that the theoretical rigor of complex models is not an a priori guarantee of accuracy. In many cases, simpler models may give answers of the same quality as their complex counterparts, often at a lower cost (Gan et al., 1997; Woolhiser, 1996). In general, there is a continuum of trade-offs between the limited process description of the simplest models and the uncertainty introduced by those of higher complexity (Hornberger et al., 1985).

The hydrologist should keep in mind that all models are, by definition, an abstraction of reality and are therefore to some extent incorrect (Woolhiser, 1996). If a model's representation of reality were perfect and exhaustive, the model would not be a model of the natural system but its duplicate (Klemes, 2000a). Results for even the best models should therefore be viewed with a degree of skepticism. Grayson et al. (p. 855, 1994b) present a strong case that "models should not be applied as substitutes for knowledge". Rather, the authors contend that the proper use and interpretation of model results can require more knowledge and hydrologic insight than would otherwise be necessary.

2.3 Model Classifications

When exploring an emergent field of science, the initial relationships developed are generally simple and lead to only limited knowledge and understanding. Such relationships are labeled "empirical" to distinguish them from the "causal" relationships that characterize process dynamics (Klemes, 1982).
Empirical relationships are often used as convenient summaries of complex causal chains. In hydrology the empirical approach has many different labels, such as operational, prescriptive, analytical, and statistical, which all attempt to "understand" and "explain" the behaviour of any system in terms of a relatively few comprehensible elements (ibid.). Scientific disciplines usually evolve from the construction of empirical models to the development of causal models after reaching a fairly advanced stage of development; hydrology is no exception. Models in use today span a range of complexity from purely empirical to highly detailed.

Individual hydrologic models are most commonly classified according to their complexity. Three categories are typically considered, namely empirical and black box models, conceptual models, and physically-based reductionist models (Kuczera and Parent, 1998). Boundaries between the model types are not strict, and the roles of models in different categories can be complementary rather than competitive (O'Connell and Todini, 1996). Gan (1987) provides more information on the various classes of models. Although many studies of the 1970's and early 80's compare results of models selected from within a single category, few compare or contrast the results generated by different types of models. There is even some debate about whether such comparisons are valid given the dissimilarities of process and application across model types; Smith et al. (p. 851, 1994) refer to one cross-comparison study as "a classic example of apples versus oranges". The lack of comparison studies makes it difficult to assemble a robust synthesis of experimental conclusions (Woolhiser, 1996).
Further, the choice of model for a given situation is frequently dictated by available data or other factors not considered in a comparison study. Singh (1995a) provides reviews of many models of varying complexity. A comprehensive and fairly up-to-date listing of all watershed models is enumerated by Singh and Woolhiser (2002).

2.3.1 Statistical, Empirical, and Black Box Models

From a scientific perspective, statistical, empirical, and black-box models represent the simplest class of hydrologic model. Singh and Woolhiser (2002) trace the roots of mathematical modelling in hydrology to the 19th century works of Mulvany (1851) and Imbeau (1892). More recent examples of empirical models in hydrology include statistical Flood Frequency Analysis (FFA), regression models, and the transfer function models described by Jakeman and Hornberger (1993). Models in this category can also include the simple correlation of hydrologic quantiles (e.g., flood peak, characteristic unit hydrograph, or low flow) with geologic, geomorphic, and climatologic variables (Dunne, 1982). Statistical models commonly consider a streamflow time series as a mathematical series having general descriptive properties such as central tendency, variance, and autocorrelation. Parameters are unlikely to have physical significance. When models are viewed as a purely mathematical construct, Beven (1989) notes that three to five parameters should be sufficient to reproduce most of the information in a hydrological record. Jakeman and Hornberger (1993) conclude that the "permissible model complexity" seems to be around six parameters. Grigg et al. (1999) strongly caution against blind reliance on empirically-estimated floods (e.g., through FFA).
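The FFA approach mentioned above can be illustrated with a minimal sketch: fitting a Gumbel (EV1) distribution to an annual-maximum flow series by the method of moments and reading off a design quantile. The peak values below are hypothetical and not drawn from any catchment discussed in this work.

```python
import math
import statistics

def gumbel_fit(annual_maxima):
    """Fit a Gumbel (EV1) distribution by the method of moments."""
    mean = statistics.mean(annual_maxima)
    std = statistics.stdev(annual_maxima)
    scale = std * math.sqrt(6) / math.pi      # beta
    loc = mean - 0.5772 * scale               # mu, using the Euler-Mascheroni constant
    return loc, scale

def gumbel_quantile(loc, scale, return_period):
    """Flood magnitude for a given return period (years)."""
    p_non_exceed = 1.0 - 1.0 / return_period
    return loc - scale * math.log(-math.log(p_non_exceed))

# Hypothetical annual-maximum flows (m3/s), for illustration only.
peaks = [310, 455, 280, 520, 390, 610, 345, 430, 375, 490]
loc, scale = gumbel_fit(peaks)
q100 = gumbel_quantile(loc, scale, 100)       # 100-year flood estimate
```

The extrapolation from a ten-value sample to a 100-year quantile makes the self-limiting nature of such empirical models, and the caution of Grigg et al. (1999), concrete.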
Klemes (1982) presents a stronger view, arguing that by utilizing the data to define an appropriate model, such models are self-limited and have no justification beyond their underlying data set. This does not mean that simple mathematical models are of little use to the hydrologist; on the contrary, they can be appropriate or even optimal in circumstances where the limitations of the methods are acknowledged.

The generic term "black-box" model refers to a model in which inputs are converted into outputs using formulas, calculations, or pre-programmed relationships that the end user does not need to see or understand to be able to use. Although all higher-order computer-based models of hydrology could therefore be considered "black-box" models, the more common interpretation refers to a model that - like statistical and empirical models - makes no attempt to employ the known physics of the hydrologic phenomenon (Kuczera and Parent, 1998). However, unlike statistical or simple empirical models, "black-box" models can possess significant technical complexity. Substantial knowledge is commonly required for set up, training, and analysis of process-oriented output characteristics like predictive uncertainty. The concept of a "black-box model" in hydrology is perhaps most clearly understood when used to describe an artificial neural network (ANN). ANNs create a (potentially complex) set of relationships between input and output data that is fully dependent on (and thus variable with) the set of data used to "train" the model. In this way, an ANN model differs from all other model types, which combine a fixed set of formulas and algorithms with variable parameters.
A working understanding of the complex "black-box" mechanics of an ANN hydrologic model would yield little to no hydrologic insight. Singh and Woolhiser (2002) observe that even the most complex ANNs do not model the internal processes of a catchment. However, the ability of ANNs to recursively learn from data can save substantial time in model development, especially where traditional parameter estimation techniques are not convenient (Singh and Woolhiser, 2002). For more information on ANNs, the reader is directed to a two-part, detailed discussion of the role of ANNs in hydrology authored by the ASCE (2000a, 2000b).

2.3.2 Conceptual Models

Conceptual hydrologic models comprise an intermediate level of complexity. They generally represent the hydrologic cycle as several interconnected subsystems, each of which simulates a component process through empirically or heuristically-determined but physically-plausible functions (Duan et al., 1992). Conceptual models capture broad features of catchment response but are computationally and informationally straightforward, attempting to balance structural simplicity against the physics of the problem (p. 217, Franchini and Pacciani, 1991). The intended spatial and temporal scales of application often have a profound influence on model structure development (Singh, 1995b). Beven (1989) reminds hydrologists that a conceptual model presents an approximation of the real world and therefore must introduce significant potential for error and uncertainty. In particular, Hornberger et al. (1985) note that some parts of a conceptual model may have a stronger basis in scientific or physical theory than others.
Boundary conditions used to define the behaviours of the various component sub-models may also be neglected or changed, divorcing the model structure from its basic constitutive assumptions (Franchini and Pacciani, 1991). There are different degrees of physical basis even within the category of conceptual models. Unsurprisingly, more complex models typically require greater effort in calibration and application than simpler models. In general, the level of calibration difficulty is directly related to the number of parameters and complexity of model structure. Franchini and Pacciani (1991) apply several conceptual models of varying complexity to a four-month simulation, reporting that all but one of the models generate acceptable simulations of the recorded discharges. Significantly, the more complex models require much more effort in calibration than the most abstract model, although the authors report that it is generally useless to attempt to identify linkages between the in situ and modelled runoff processes in the most abstract case (ibid.).

Certain structural features may limit the ability of a conceptual model to represent the hydrologic response of a catchment. For example, the typical "conceptual" representation of the subsurface involves partitioning it into two or more discrete zones with fixed storage capacities and infiltration thresholds. This typically results in a crude representation of processes such as infiltration, percolation, and evapotranspiration (Gan and Burges, 1990a). Klemes (p. 102, 1982) criticizes such simplicity for "[taking] shortcuts to fill the void between the data and the goals with logically plausible assumptions that are sometimes correct but often wrong and, more often than not, individually untestable".
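The kind of zoned-storage abstraction described above can be sketched as a minimal two-store "bucket" model: an upper soil zone with a fixed capacity spills saturation excess as fast runoff, while a lower store drains linearly as baseflow. All names, capacities, and coefficients here are illustrative assumptions, not those of any published model.

```python
def bucket_model(precip_series, soil_capacity=100.0, percolation=0.1, baseflow_k=0.05):
    """Return simulated runoff (same units as precipitation) for each timestep."""
    soil, ground = 0.0, 0.0
    runoff = []
    for p in precip_series:
        soil += p
        fast = max(0.0, soil - soil_capacity)  # saturation excess spills as fast runoff
        soil -= fast
        perc = percolation * soil              # drainage to the lower (groundwater) store
        soil -= perc
        ground += perc
        base = baseflow_k * ground             # linear-reservoir baseflow
        ground -= base
        runoff.append(fast + base)
    return runoff
```

Even this toy version exhibits the behaviour criticized above: the capacity and drainage coefficients are logically plausible but individually untestable against field measurements.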
Explicit validation of the model structure is usually not feasible due to the abstract nature of the model processes (Gan and Burges, 1990a). The hydrologic characteristics of a catchment may also preclude good simulation by a conceptual model, as certain conditions are more easily realized in a conceptual framework than others. Gan and Burges (1990b) observe poor calibration results from the application of a conceptual model to a catchment having multiple hydrologically-distinct subcatchments with a common outflow. Most conceptual models have been developed for application to temperate or wet climates, making simulation of dry catchments (i.e., those in which less than 20% of rainfall becomes runoff) more difficult due to the greater significance of infiltration and evapotranspiration (Gan et al., 1997).

The abstraction of physical reality in a conceptual model limits the temporal resolution that can be realized. The sensitivity of a model to the chosen timestep is greatly increased as the timestep approaches and exceeds the basin residence time (Arnaud et al., 2002). Gan and Burges (1990b) find close agreement between observed and simulated mean flow rates at the daily scale, but the comparison deteriorates for lesser timesteps. Micovic (2003a) believes that satisfactory results for a watershed-scale conceptual model will prove unattainable for any time step less than one hour. The abstract, "grey-box" nature of conceptual models is distinguished by the need to calibrate one or more parameters using observed data (Kuczera and Parent, 1998). Like empirical models, conceptual models are best suited to applications where conditions during the calibration period are hydrologically similar to those of the simulation period (p.
186S, Klemes, 1986a). While identifying a unique and objectively verifiable set of "true" parameter values is usually impossible, Gan and Biftu (1996) argue that a conceptually-realistic parameter set is a pre-requisite for good performance in forecasting or prediction. However, earlier work by Gan (Gan, 1987; Gan and Burges, 1990a) shows that even a conceptually-realistic parameter set cannot guarantee good performance. Because the parameters of conceptual models are defined by the data, the resulting structure is not general in its applicability (Beven, 2002). Like empirical models, there is no basis for assuming that a conceptual model can effectively extrapolate beyond its calibration experience (Gan and Burges, 1990a). Reducing dependence on calibration by decreasing the number of parameters in a conceptual model would have the undesirable effect of transforming the "grey-box" (i.e., conceptual) representation of the watershed into a purely black-box description (Seibert, 2000). Additional discussion concerning the extrapolation of calibrated models is provided in Section 3.4.

2.3.3 Physically-Based Models

In many fields including hydraulics, the most accurate form of modelling involves a "physical model" - an exact, scaled representation of the natural system designed and constructed under controlled conditions for investigative purposes. However, the large areal extent of most watersheds combines with small-scale hydrologic activity to preclude the use of physical models for all but the most basic hydrologic experiments. The most complex hydrologic models are instead based on highly detailed mathematical representations of the various in situ processes. Such models are referred to as physically-based.
In contrast to physical models, physically-based mathematical models apply partial differential equations to represent the various processes within the boundary conditions of the basin (Freeze and Harlan, 1969). Therefore, while not as exact as a well-developed physical model, a physically-based hydrologic model is nominally capable of simulating in detail the internal processes and variables of a catchment. There seems to be little argument that models of laboratory-scale systems are physically-based, in comparison with models operating at the catchment or basin scales (Woolhiser, 1996). Some argue that physically-based models are, in essence, an attempt to "scale up" the processes of the laboratory scale to larger contexts (e.g., Kuczera and Parent, 1998). The scaling and transposition of processes between dominant scales can lead to potential inconsistencies with their defining theoretical assumptions. The strengths and limitations of process and data scaling are discussed further in Section 3.1.

As a result of their complexity and detail, physically-based models have greater data requirements than their simplified counterparts. As a minimum, physically-based models generally require the following four kinds of input data (Freeze and Harlan, 1969):

• model definition information (e.g., grid size, time step, and topographic data);
• meteorological input data (e.g., the time-variant flux of water at the soil surface);
• flow parameter estimates (e.g., Manning's n, hydraulic conductivity); and
• mathematical basis and structure (i.e., the equations and functions that define the model).

Contemporary models can also incorporate diverse data from digital elevation models, remote sensing imagery, chemical tracer experiments, and groundwater level monitoring.
The intensive data dependence of physically-based models implies that their performance is strongly dependent on the completeness, accuracy, and representativeness of the input data. Physically-based models are arguably more "realistic" than models which must be calibrated to historical data in a curve-fitting exercise (Beven, 2001). Physically-based models offer the possibility of hydrologic relevance under conditions beyond those of the process-recorded history, and may identify efficient shortcuts which can then be used to improve empirical or conceptual models (Klemes, 1982). In particular, research has shown that physically-based models offer improved performance over calibrated models for situations where physical and hydrologic descriptions of the watershed are available but a long-term gauging record is not (Michaud and Sorooshian, 1994). Physically-based models are also able to generate distributed predictions and thereby evaluate changes in the constituent conditions of a watershed (Beven, 2002). However, internal calculations for points within the catchment are not necessarily representative of physical reality and are easily divorced from their uncertainty (Grayson et al., 1992b). Nonetheless, the distributed predictions of these models have proved popular in diverse fields where distributed hydrologic results act as inputs to other physical, chemical, biological, environmental, or ecological models (Singh, 1995b). Two classes of criticism are commonly directed at physically-based models: firstly, strictly speaking, physically-based models almost never have an absolute physical basis (i.e., in the sense that it is highly unusual that all parameters can be determined a priori from research or field measurements); secondly, physically-based models are more likely to be mis-used than simpler models (Woolhiser, 1996). Grayson et al. 
(1994a) conclude that their inability to effectively represent the large- and small-scale spatial variability of rainfall is likely to eliminate any theoretical advantages that models of this type might otherwise possess. Others disagree, cautioning against "indicting" physically-based models on the basis of a single set of poor or non-representative results (Smith et al., 1994; Woolhiser, 1996).

In many cases, concern about the predictive capabilities of physically-based models appears to come in part from difficulties in application experienced by those who do not understand the model (Woolhiser, 1996). O'Connell and Todini (1996) propose that rather than abandoning physically-based modelling, one should re-align expectations of what can be achieved, and on what time scale. In the most precise sense, a "physical basis" implies that a model does not require calibration. This is rarely the case in practice, resulting in an inability to completely abandon empirical, calibrated parameters (Madsen, 2003). For example, the prominent physically-based model MIKE SHE uses calibrated coefficients for calculating overland and channel flow, while flow through macropores in the unsaturated zone is modelled by an "empirical bypass function" (Refsgaard and Storm, 1995). For this reason, an uncalibrated physically-based model can be at a distinct disadvantage when compared directly with calibrated models (Loague and Freeze, 1985). Calibration of physically-based models is difficult due to frequent mathematical over-parameterization, which in turn results from a lack of definitive a priori parameter value estimation. Woolhiser (1996) argues that the need for calibration often justifies a reduction in model dimensionality. In some cases, "excess" parameters can be replaced by simplifying assumptions.
For example, the MIKE SHE model can be used with a lesser complement of parameters if the data are insufficient to support its full dimensionality, a feature specifically intended to reduce the risk of overparameterization (Refsgaard and Storm, 1995). The impact of reductions in model complexity or parameter dimensionality on the degree of "physical basis" has not been widely discussed in the literature. Woolhiser (1996) identifies a dichotomy between the growing acceptance of physically-based models in the engineering community and the skepticism of many in the research community. Some have gone so far as to suggest that simpler models may be more appropriate in certain situations (ibid.). Regardless of their practical utility in a forecasting or predictive context, it is generally agreed that physically-based models have great potential for exploring the details of hydrologic interactions and investigating the fundamentals of runoff processes (Dunne, 1982; Gan and Burges, 1990a).

2.3.4 Other Classifications

Models can be classified by factors other than complexity of model structure. Singh (1995b) presents a fairly comprehensive overview of alternative classification systems. More prominent distinctions include the topographical representation of the watershed (lumped or distributed), the duration of the model (single event or continuous simulation), and the mathematical approach (stochastic or deterministic). Classification according to these factors is not necessarily strict. Amongst the above, the classification most significant to the consideration of uncertainty is arguably the distinction between lumped and distributed models.
If the spatial scale of model computation is large in comparison with the scale of in situ spatial variation, model parameters become physically unmeasurable and assume "effective values" (Woolhiser, 1996). Such models are referred to as "lumped". Most lumped models use a single set of representative values for each application (i.e., each watershed or catchment). Conceptual models are almost always lumped to some degree, whereas distributed models have been developed in response to the detailed representation required by physically-based models (Hornberger and Boyer, 1995). Typically, the distributed model breaks down a catchment into smaller response units or grid cells. A discussion of the relationship between variability, scale, and distributed modelling is included in Section 3.1.1.

Hydrologic models provide either continuous or single-event simulation. Both continuous and event-based models step through their application period using discrete timesteps. However, continuous models simulate hydrologic conditions for a prolonged period, whereas event-based models are limited to consideration of a single runoff event. Event-based models typically do not simulate ET, soil moisture depreciation, or catchment recession. Either type of model can be used for a given application, but one type may be better suited than the other in certain situations. For example, implementing the event-based model KINEROS allows Faures et al. (1995) to compile a statistical representation of peak estimates from multiple simulations of storm runoff without having to extract the information from a continuous time series. In a different study, Cooper et al.
(1997) find that objective functions applied to the entire data set yield better approximations of the synthetic parameter set than objective functions that only examine peak or low flow quantiles. The results of Cooper et al. (ibid.) imply that a continuous model is more appropriate for their study.

Event-based models require a "snapshot" of watershed conditions at the beginning of each simulation. Measurements of these conditions are seldom available and therefore must be assigned a priori or calibrated as additional parameters. One could argue that event-based models exchange the uncertainty in modelling ET and soil moisture for uncertainty in initial conditions. For cases where initial conditions for an event-based simulation are obtained from a preliminary model, the user should be aware that the simulation is still subject to uncertainty in antecedent conditions; it is merely hidden as a computed variable (Grayson et al., 1992a). Michaud and Sorooshian (1994) present an example of a hydrologic model (KINEROS) applied with initial conditions excerpted from the output of another model. Although continuous models also require initial conditions, an accurate estimate of the initial conditions is less important if the data series is sufficiently long. In such cases, a "spin-up" (or "warm-up") period can be used. The idea is that, over time and regardless of initial conditions, modelled watershed conditions will gradually approach those in situ. Therefore, provided a sufficiently long data series is available, actual starting conditions are irrelevant (Houghton-Carr, 1999). Output from the spin-up period is not used to evaluate model performance.
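The decay of initial-condition memory can be demonstrated with a single linear reservoir: two runs started from very different initial storages converge to the same trajectory once the spin-up period has elapsed. The coefficient and the constant forcing are illustrative choices, not values from any study cited here.

```python
def linear_reservoir(inflows, initial_storage, k=0.2):
    """Simulate a linear reservoir (outflow = k * storage); returns the outflow series."""
    storage = initial_storage
    outflows = []
    for q_in in inflows:
        storage += q_in
        q_out = k * storage
        storage -= q_out
        outflows.append(q_out)
    return outflows

inflows = [5.0] * 100                          # constant forcing, for clarity
dry = linear_reservoir(inflows, initial_storage=0.0)
wet = linear_reservoir(inflows, initial_storage=500.0)

divergence_early = abs(dry[0] - wet[0])        # large: initial conditions dominate
divergence_late = abs(dry[-1] - wet[-1])       # negligible after spin-up
```

Because the difference between the two runs decays geometrically (by a factor of 1 - k per step), the required spin-up length grows as the storage response slows, which is consistent with the weeks-to-years range reported below.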
The proper duration of the "spin-up" period should be determined from the time required for modelled watershed conditions to converge with in situ conditions. In practice, spin-up times range from several weeks to several years (Bingeman, 2001; Madsen, 2000; Thyer et al., 1999; Vrugt et al., 2003).

Finally, although this thesis is written largely in the context of deterministic hydrologic models, the reader should be aware of the range of stochasticity that can be accounted for in a hydrologic model. Most hydrologic models currently in use are fully deterministic; from the perspective of the model, all parameters and data are known with certainty. However, it is possible for a model to incorporate probabilistic representation of both parameters and data. In most cases, stochastic models use repeated Monte Carlo simulations to establish a probability-magnitude output which supplants the simple magnitude estimate provided by deterministic models. Probability distributions for stochastic model parameters are usually either specified a priori based on expert knowledge or published data, or estimated from a field sampling program. The model is run repeatedly using random samples from the distribution of each parameter. Stochastic models, such as the one espoused by Kuchment and Gelfan (2002), are typically utilized for extreme event prediction. Extreme events are most often characterized by significant uncertainty regarding antecedent conditions and flood-generating processes. Although stochastic models are designed to partially quantify model predictive uncertainty, the user should be aware that results will typically be limited by the least accurate constituent parameters, components, or probabilities.
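The Monte Carlo procedure just described can be sketched in a few lines: parameters are drawn from a priori distributions, a deterministic model is run for each draw, and the ensemble of outputs yields a probability-magnitude result. The placeholder "model" and the distributions below are illustrative assumptions, not any published formulation.

```python
import random

def toy_peak_flow(runoff_coeff, rainfall):
    """Placeholder deterministic model: peak response proportional to rainfall."""
    return runoff_coeff * rainfall

random.seed(42)                              # fixed seed for reproducibility
peaks = []
for _ in range(10_000):
    c = random.uniform(0.3, 0.9)             # a priori range for the runoff coefficient
    p = random.gauss(80.0, 15.0)             # uncertain design rainfall (mm)
    peaks.append(toy_peak_flow(c, max(p, 0.0)))

peaks.sort()
median = peaks[len(peaks) // 2]              # central estimate
q95 = peaks[int(0.95 * len(peaks))]          # 95th-percentile peak magnitude
```

The spread between the median and the upper quantiles is exactly the probability-magnitude information that a single deterministic run cannot provide; its reliability, as noted above, is bounded by the quality of the assumed distributions.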
2.4 Model Calibration

Every hydrologic modeller faces two fundamental challenges: the first is to select an appropriate model for the study site; the second is to determine parameter values such that the model closely simulates the in situ behaviour (Sorooshian and Gupta, 1995). Poor parameter specification can lead to poor simulation regardless of model sophistication (e.g., Michaud and Sorooshian, 1994). Therefore, careful set-up and initialization is necessary to maximize the reliability of the model (Gupta et al., 1999). Model identification (e.g., system and boundary definition, data specification, and delineation of relationships between variables) cannot be treated as entirely distinct from model calibration (e.g., selection of evaluation criteria, identification of an informative data set, and parameter optimization). As the focus of this work involves neither designing nor altering a model, the model identification phase is largely set by the selection of an extant hydrologic model. However, even when applying a known model, the calibration phase should be pursued with due consideration of the context of the structure and properties of the individual model (Sorooshian and Gupta, 1985). Three stages should be explored in model application: calibration, validation, and an assessment of reliability (Singh, 1995b). The bulk of this Chapter deals with model calibration. Validation is discussed further in Section 2.4.6, while reliability (in terms of the uncertainty in model results) is the subject of Chapter 3.
In general, applying a model requires estimating values for two types of parameters: "physical" parameters which can be measured (e.g., watershed area) and "process" parameters which represent watershed properties not directly measurable (e.g., time lag constants for runoff) (Sorooshian and Gupta, 1995). Calibration involves back-calculation of the "process" parameters through trial-and-error comparisons of predictions to runoff records (Dunne, 1982). Sorooshian and Gupta (1995) observe that a priori judgment and experience are frequently used to identify a range or bounds for each "process" parameter during the preliminary stages of calibration. In general, the use of expert judgment in such matters should complement rather than replace other evidence (Keeney and Winterfeldt, 1989). The intention of calibration should be clear before commencing - i.e., is the goal simply to achieve the best possible fit for the calibration set, or does it involve the identification of a unique and realistic parameter set that closely represents the physical system (Sorooshian and Gupta, 1983)? Such general considerations are discussed in Section 2.4.1, while Sections 2.4.2 and 2.4.3 address manual and automatic calibration, respectively.

2.4.1 The Nature of Calibration

Until recently, calibration has arguably been the most intensively investigated aspect of hydrologic modelling (Beven, 2000). Through many studies, threads of consistency have emerged identifying various pre-requisites for a successful calibration. In the current era of reduced funding for hydrometric and meteorologic data networks, it is fitting that primary consideration be afforded to the data.
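The trial-and-error back-calculation of a "process" parameter can be sketched as a simple search over candidate values scored against an observed runoff record, here using the Nash-Sutcliffe efficiency (NSE) as the evaluation criterion. The one-parameter model and the short records below are hypothetical, chosen only to make the mechanics visible.

```python
def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit; 0 matches the observed mean."""
    mean_obs = sum(observed) / len(observed)
    num = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    den = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - num / den

def one_parameter_model(precip, coeff):
    """Trivial 'process' model: runoff is a fixed fraction of precipitation."""
    return [coeff * p for p in precip]

precip = [10.0, 0.0, 25.0, 5.0, 40.0, 15.0]
observed = [4.1, 0.0, 10.2, 1.9, 16.5, 6.0]   # hypothetical runoff record

# Grid search over the feasible parameter range, retaining the best NSE score.
best = max((c / 100.0 for c in range(1, 100)),
           key=lambda c: nse(observed, one_parameter_model(precip, c)))
```

With one parameter the search is trivial; the difficulties discussed below (data informativeness, process activation, non-identifiability) arise because real conceptual models calibrate many interacting parameters against records of limited length.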
Although even a single year of calibration data can assist in parameter specification, calibration is much more sensitive to data informativeness than to length of record (Gan et al., 1997; Gupta et al., 1998; Sorooshian et al., 1983). Refsgaard (1994) concludes that three to five years of streamflow data constitute an adequate basis for calibration of a continuous model. Yapo et al. (1996) find that informativeness of the data set is optimal using an 8-year span of streamflow data. While much less common, some guidance on the appropriate size of a calibration data set for event-based models is also available: Sorooshian and Gupta (1995) recommend that the number of data used in calibration exceed 20 times the number of parameters being estimated. Naturally, a model simultaneously calibrated against multiple data series (e.g., multiple streamflow measurement locations, or streamflow and groundwater together) will typically require a shorter period of data than the same model calibrated against catchment outflow alone.

While hydrologic model calibration generally implies comparison of observed and simulated catchment outflow, the use of additional information can increase confidence in the representativeness of the model. Calibration can also include comparisons to observed time series of depth to the water table, snow depth or snow water equivalent, soil moisture, stable natural isotopes (e.g., oxygen-18), alkalinity and pH, or nitrogen loadings (Bergstrom et al., 2002). Calibration against different data types may allow insights into different aspects of the model. For example, the application of conservative chemical tracer data provides an opportunity to improve descriptions of water pathways and residence times (ibid.).
At the outset, a modeller must decide what aspects of watershed behaviour will form the basis for calibration, and how those behaviours will be measured (Gupta et al., 1998). Further, the properties of the model (including model error) need to be recognized and considered when designing a calibration strategy (Yapo et al., 1996). Most importantly, all processes that are hydrologically relevant in the catchment must be activated during calibration. Unactivated process parameters cannot be considered calibrated and are likely to cause poor model performance if activated during later simulations or in other scenarios (Gan and Burges, 1990b; Lei and Schilling, 1996). Sorooshian and Gupta (1985) utilize data sets chosen specifically to ensure activation of all aspects of the model. Data sequences containing a high degree of hydrologic variability are more likely to activate the various operational modes of the model and thus result in more reliable parameter estimates (Sorooshian and Gupta, 1995; Yapo et al., 1996). Wet years (or catchments) tend to be better suited for calibration against catchment outflow than dry years (or catchments). This is due to their more variable flow regime and the dominance of outflow - as opposed to ET - as a driving flux (Gan, 1987; Gan et al., 1997; Thyer et al., 1999). Quality of response is also less consistent for dry antecedent conditions (Gan and Burges, 1990b). Yapo et al. (1996) recommend using a data set consisting of the wettest period on record. This recommendation is supported by the experience of the US National Weather Service (Hogue et al., 2000).
Gan (1987) attributes the acceptable performance of his model under extreme input to good representation of the processes important to extreme event runoff within the calibration data. He generally notes that including major flood events in the calibration data series can improve the adequacy of calibration for extreme conditions (ibid.). One may be inclined to conclude that a model should be calibrated on conditions similar to those expected during its application. However, Klemes (1986b) points out that this is not logical if the goal of calibration is to ascertain an appropriate model structure rather than merely provide the best fit for the observed data. If the goal is an appropriate model structure, Klemes advises calibrating with hydrologically dissimilar data before validating on conditions similar to those of the target simulation. Models relying solely on calibration data are often overparameterized. In such cases, the process of establishing appropriate interaction between the model and the available data can overshadow the search for a good representation of the system (Kuczera, 1997). In a large number of cases, varying parameter values obtained from multiple calibrations of the same model generate similar goodness-of-fit results. Seibert (2000) finds that the aggregate parameter range defined by multiple "successful" calibrations encompasses approximately one-third of the feasible parameter space. Non-identifiability (i.e., the inability to identify a unique and optimal parameter set) can manifest itself in two ways. Firstly, distinct output functions may yield similar results for quantitative evaluation criteria. Secondly, distinct parameter sets may generate identical output functions. The latter is generally of more concern to hydrologic modellers (Sorooshian and Gupta, 1985). Sorooshian and Gupta (1983) and Yapo et al.
(1996) present several reasons why a model parameter may be poorly identifiable in the second sense:

• parameter interdependence: parameters interact strongly with other parameters, allowing for compensating variations in the feasible parameter space;
• parameter nonstationarity: parameter location and precision are correlated to varying watershed or data characteristics not accounted for during calibration;
• data noninformativeness: the data do not contain the hydrologic conditions required to properly identify the parameters;
• criterion inadequacy: the evaluation criteria cannot properly extract the information contained in the data;
• mathematical difficulty: the calibration is constrained by local features of the response surface; and
• insensitivity: variations in the values of the parameters do not significantly affect the model output or the calibration criterion.

Testing for global identifiability is impractical in most cases where it is not impossible (Sorooshian and Gupta, 1985). Therefore, in most situations, identifying stable and realistic parameter values is as important as finding values which produce a good fit to the observed data (Micovic, 1998). Later chapters of this work focus on the need to understand and quantify the potential impact of alternate solutions on model simulations.

2.4.2 Manual Calibration

The most common approach to model calibration is manual (sometimes called "expert") calibration (Hogue et al., 2000). Typically, an expert user follows a semi-intuitive trial-and-error process, adjusting parameter values to improve the "closeness" of fit between observed data and simulated output (Boyle et al., 2000).
Although visual inspection of the observed and simulated hydrographs is the typical standard measure of performance, statistical comparisons and other factors are commonly used in conjunction with qualitative analysis of fit (Hogue et al., 2000). Emphasis is usually placed on matching peaks, flood volumes, recession slopes, and base flow (ibid.). The evaluation process conventionally involves a qualitative synthesis of visual inspection and numerical criteria, and often proceeds through sequential calibration of parameter groups or objectives. The intent of sequential calibration is to simplify the process by considering fewer variables simultaneously. This is achieved by first isolating and optimizing those parameters or objectives having the greatest influence over the results while neglecting secondary objectives and less-sensitive parameters. However, the potential for parameter interaction implies that sequential calibration - unless highly iterative - can lock the parameter values in a trap set up by compensating errors (Bergstrom et al., 2002). The process of manual calibration is highly labour-intensive. It calls for a considerable degree of training, experience, and effort, and requires a substantial commitment of time and expertise for even the most expert user (Boyle et al., 2000; Gan, 1987; Hogue et al., 2000). Achieving acceptable results requires that the modeller have a working understanding of the model, the data, and the watershed system (Boyle et al., 2000). Many organizations have developed systematic procedures to be followed by users in an attempt to standardize the process.
The US National Weather Service in particular has developed a manual procedure that provides excellent model calibrations but is knowledge-intensive and therefore not easily transferred between people or models (Boyle et al., 2000). While the calibration procedure followed by a user may be standardized, the individual's adjustments will be influenced by personal skills and experience. Houghton-Carr (1999) finds that various experts offer differing perspectives on what constitutes "good" or "bad" performance, and will typically evaluate performance differently. Manual calibration relies on the conceptual relationship between model parameters and watershed characteristics to support the hydrologist's expectation that adjustment of a given parameter value in a particular direction will have a predictable effect on the model output (Gupta et al., 1999). Sorooshian and Gupta (1995) state that the main weakness of manual calibration is in not knowing when to quit, since even experts may not always be able to assess the potential for further improvement. The semi-intuitive nature of working through the complex process of manual calibration can lead to considerable frustration (Lan, 2001). One of the greatest advantages of manual calibration is the ability to incorporate explicit allowance for potential data errors (Boyle et al., 2000). Where simulation quality is poor due to erroneous or non-representative data, the user can choose to ignore the erroneous data in favour of those believed to be more correct.

2.4.3 Automatic Calibration

Automatic calibration uses numerical comparisons between observed and simulated data to optimize parameter values without user intervention (Madsen, 2003).
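One widely used numerical comparison in rainfall-runoff modelling is the Nash-Sutcliffe efficiency, which scores a simulated series against observations relative to a baseline of simply predicting the observed mean. A minimal sketch (the flow series shown are invented purely for illustration):

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1.0 is a perfect fit; values at or
    below 0.0 mean the model does no better than the observed mean."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# Hypothetical daily flows (m^3/s), for illustration only
observed = [12.0, 35.0, 80.0, 55.0, 30.0, 18.0]
simulated = [14.0, 35.0, 72.0, 58.0, 27.0, 20.0]
print(nash_sutcliffe(observed, simulated))
```

An automatic routine would then adjust the model's process parameters to drive such a score toward 1.0 (or, equivalently, to minimize a corresponding error measure).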
An automatic model calibration is typically objective in nature and relatively easy to implement, but focuses on numerical optimization and may thus result in a hydrologically unrealistic parameter set (Boyle et al., 2000; Hogue et al., 2000). Historically, this has led the hydrologic modelling community to be highly skeptical of automatic calibration in an applied context. Nonetheless, a substantial body of opinion suggests that modern state-of-the-art automatic calibration methods can be considered a viable alternative to expert calibration for some purposes (Gan and Biftu, 1996; Gupta et al., 1999). One such situation is where multiple objectives add significant complexity to the calibration process (Madsen, 2003). Another example is the work of Seibert and McDonnell (2002), who use multi-objective calibration to demonstrate that automatic methods do not necessarily preclude the incorporation of qualitative or imprecise data. Automatic calibration can also be used as an adjunct to manual calibration; automatic optimization algorithms are commonly used for fine-tuning an expert hydrologist's manual calibration (Duan et al., 1994b). The evolution of automatic calibration has been motivated by the need for simplicity, speed, objectivity, confidence, and less reliance on hard-to-find expert calibrators (Hogue et al., 2000; Sorooshian and Gupta, 1995). Most research regarding automatic calibration has been carried out for lumped conceptual models (Madsen, 2003). Recently-developed multi-objective automatic calibration tools may increase the application of automatic calibration to more complex distributed catchment models as they are adopted by distributed modellers seeking consistency with observed data throughout the catchment.
The implementation of an automatic calibration is defined by the selection of its various components: objective function, search algorithm, calibration data, and search termination criteria (Hogue et al., 2000). Typically, the automatic portion of any calibration routine has the following steps in common (Gupta et al., 1999):

• identify components;
• obtain an initial estimate of the values (or ranges) for the parameters;
• execute the model;
• measure performance; and
• apply a search algorithm to find parameter values that improve the value of the objective function.

The question of "when to quit" is addressed for automatic calibration through explicit specification of termination criteria. Appropriate termination criteria are necessary to achieve the desired balance between effectiveness and efficiency. Termination criteria typically include parameter convergence (meaning that unique values have been identified for each parameter), function convergence (meaning the objective function cannot be significantly improved), or a maximum number of iterations (to avoid endless loops) (Hogue et al., 2000). Although parameter convergence is usually the best indicator of a mathematically-optimal parameter set, none of the above criteria can conclusively ascertain that the global optimum has been reached, since poorly-chosen criteria can halt the search procedure before it locates the best available parameter set (Singh and Woolhiser, 2002; Sorooshian et al., 1983). The strength of an automatic procedure depends on how well its design reflects the various factors important to obtaining a successful calibration in each specific case. The identification of these factors has been the subject of significant research (Boyle et al., 2000).
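The generic steps and termination criteria above can be sketched as a simple loop. The quadratic "objective" is a stand-in for running a real hydrologic model and scoring its fit, and the random-perturbation search, tolerances, and names are illustrative assumptions rather than any published scheme:

```python
import random

def objective(params):
    # Stand-in for executing the model and measuring performance;
    # here, squared distance from an arbitrary "true" parameter set.
    true = [2.0, -1.0, 0.5]
    return sum((p - t) ** 2 for p, t in zip(params, true))

def calibrate(bounds, max_iter=5000, f_tol=1e-10, patience=500):
    random.seed(1)
    best = [random.uniform(lo, hi) for lo, hi in bounds]  # initial estimate
    best_f = objective(best)                              # measure performance
    stalled = 0
    for _ in range(max_iter):                             # iteration cap
        # Search step: perturb the current best within the bounds
        trial = [min(hi, max(lo, p + random.gauss(0.0, 0.1)))
                 for p, (lo, hi) in zip(best, bounds)]
        trial_f = objective(trial)
        if best_f - trial_f > f_tol:                      # significant gain
            best, best_f = trial, trial_f
            stalled = 0
        else:
            stalled += 1
        if stalled >= patience:  # function convergence: no significant gain
            break
    return best, best_f
```

Here the search halts either on function convergence (no significant improvement for `patience` consecutive trials) or on reaching the iteration cap; as noted above, neither criterion guarantees that the global optimum has been found.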
Much early research focused on improving search methods that converged to one of the multiple points or clusters comprising the local optima of the response surface (Kuczera, 1997). Although many advanced search techniques can now avoid being trapped by local optima, the problem is far from solved. While new procedures may be able to produce reliable estimates of global optima for complex problems (e.g., Duan et al., 1994b), the focus of these procedures is still directed at optimization rather than hydrologic model calibration. Data uncertainty, model structure uncertainty, and dependence on the calibration procedure (e.g., objective function selection) imply that the best possible representation of the system will not necessarily coincide with the global optimum of the objective function (Vrugt et al., 2003). Gan (1987) observes objective function values approaching their theoretical limit for some calibrations, despite "gross simulation errors" obvious in the corresponding hydrographs. Houghton-Carr (1999) emphasizes that optimization algorithms lack an expert's knowledge of model structure and cannot judge when a parameter set is unrealistic. Automatic calibration methods are typically compared on two scales: accuracy of solution (i.e., effectiveness) and efficiency of convergence. Efficiency of convergence for an automatic calibration method refers to how quickly the final solution is identified, implicitly representing the amount of computational effort required (e.g., required number of function evaluations) (Thyer et al., 1999). Robustness of solutions (i.e., the variation in solutions across multiple trials) is another potential criterion for evaluation (Cooper et al., 1997).
The emergence of automatic calibration techniques has reduced the degree of user training required to obtain realistic-looking results from a hydrologic model. However, the simplified, "hands-off" model calibration process and the "objectivity" inherent in automatic calibration also increase the potential for error, as there is a net loss of expert guidance in intuitively assessing the feasibility of parameter values and sets. The knowledge contribution of the expert user in manual calibration avoids many of the pitfalls awaiting the incautious automatic calibrator. Consider the case of obvious data error: methods of automatic calibration implicitly assume that all data are equally correct. An automatic procedure may distort the calibration to compensate for data error that an expert would quickly and easily disregard. The concept of an "objective" automatic calibration must also be viewed with caution, as results have been shown to depend on subjective decisions such as the criteria to be optimized and the period of calibration data (Bergstrom et al., 2002). Results of comparisons for various automatic calibration algorithms may also be affected by changing the dimensionality of the parameter space (Seibert, 2000). Gan et al. (p. 97, 1997) present a study of differences between automatic calibrations applied with a variety of models, optimization algorithms, data, and objective functions. More specific limitations of automatic model calibration have been discussed by many authors.
Some of the more widely acknowledged challenges include the following:

• the inability of any single performance measure to properly address all the important characteristics of the system (Gan, 1987; Madsen, 2000);
• interdependence of parameter values, resulting in uncertain and very slow improvements (Gan, 1987; Moore and Clarke, 1981);
• problems with model or parameter identifiability, such that appreciable changes in parameter value cause little or no change in objective function (Moore and Clarke, 1981; Sorooshian and Gupta, 1983);
• the impossibility of exhaustively exploring the feasible region due to discontinuities, non-differentiable regions, or other complexities of the response surface (Gan, 1987; Moore and Clarke, 1981);
• the potential for model divergence given poor initial parameter values (Gan, 1987); and
• treatment of each combination of parameter values as "theoretically possible" when some may be incompatible with science or physics (Klemes, 2000a).

Obviously, automatic calibration is not a facile solution to model calibration, and cannot entirely replace manual involvement in the process. User information is required to customize the approach and evaluate the process (Madsen et al., 2002). Multiple studies have shown that global optimization performance improves as parameter ranges are reduced or constrained (e.g., Franchini et al., 1998; Gan and Burges, 1990a; Kuczera, 1997; Seibert and McDonnell, 2002). Thus, limiting the parameter space to be searched is one example of a key area for expert user input. The above-noted potential inadequacy of a single measure of performance is not always insurmountable, since automatic calibration is not limited to the single-stage global optimization of general objectives.
Often, the calibration problem is reduced to sub-problems, each of which may be solved manually or by using different optimization techniques (Gupta et al., 1999). However, one must be careful to avoid the possibility of an optimization algorithm negating expert knowledge incorporated earlier in the process. Model-specific knowledge-based expert systems are one example of a hybrid approach. A knowledge-based expert system uses optimization to automate various stages of a manual calibration (Madsen et al., 2002). These processes are designed to yield results at least comparable to the expert process while improving efficiency. Boyle et al. (2000) propose using expert knowledge to select feasible parameter ranges and values for dominant parameters, then implementing an automatic search to minimize parameter uncertainty with a focus on the potential for parameter interaction. Hogue et al. (2000) propose a Multistep Automatic Calibration Scheme (MACS) that sequentially applies automatic calibration to different subsets of model parameters. Subjective and statistical evaluations of MACS show a significant improvement in efficiency over the manual approach, with consistently comparable results. However, the authors caution that techniques such as MACS are not yet ready for operational use (ibid.). It is important to distinguish between successful testing of automatic calibration techniques and a readiness for "real-world" application; the first does not necessarily imply the second. Much research and testing is pursued on "sterilized" data to allow better assessment of the effectiveness and efficiency of the calibration procedure. However, results from such tests are significant only in a research context.
Before being approved for practical application, automatic techniques should be extensively tested with the type, quantity, and quality of data encountered in the field (Hogue et al., 2000).

2.4.4 Methods for Automatic Calibration

Methods for automatic calibration encompass a range of complexity. This section explores some of the many techniques available to the hydrologic modeller, beginning with some simple techniques and progressing to those of higher complexity. The simplest of automatic calibration methods is Monte Carlo Simulation (MCS). "Monte Carlo" refers to the general process of random sampling from a pre-determined statistical population with the goal of generating an approximate solution to a quantitative problem. For a Monte Carlo procedure, the accuracy of the approximation is related to the number of trials. In the hydrologic modelling context, Monte Carlo Simulation typically involves repeated trials of the candidate model using parameter values randomly sampled from a priori parameter distributions. In most cases where MCS is used as a calibration tool, the probability distribution is assumed to be uniform for each parameter to reflect the ignorance of the modeller with regard to the "most likely" parameter values. The MCS technique is straightforward, preserves nonlinear interactions within the calculations, and is not dependent on limiting assumptions about an appropriate distribution of parameter values (Binley et al., 1991). The process has no memory and each trial is independent. Therefore, a Monte Carlo approach is concerned with neither improving objectives nor updating the parameter distributions. The chances that purely random sampling will quickly result in a near-optimal solution are negligible.
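A minimal sketch of Monte Carlo Simulation as a calibration tool, drawing independent uniform samples and keeping the best-scoring set; the two-parameter quadratic stands in for an actual model run, and all names are illustrative:

```python
import random

def monte_carlo_search(objective, bounds, n_trials=5000, seed=42):
    """Uniform a priori sampling of the parameter space; each trial is
    independent, and no information from earlier trials guides later ones."""
    rng = random.Random(seed)
    best_params, best_value = None, float("inf")
    for _ in range(n_trials):
        params = [rng.uniform(lo, hi) for lo, hi in bounds]
        value = objective(params)
        if value < best_value:
            best_params, best_value = params, value
    return best_params, best_value

# Stand-in objective: squared distance from an arbitrary "true" set
f = lambda p: (p[0] - 2.0) ** 2 + (p[1] + 1.0) ** 2
params, value = monte_carlo_search(f, [(-5.0, 5.0), (-5.0, 5.0)])
```

Because the trials are independent and undirected, even thousands of samples typically land only in the general neighbourhood of the optimum, which illustrates why MCS serves better for exploring the parameter space than for finishing a calibration.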
Even if found, there is little chance that the optimal solution could be recognized as anything more than the best member of a finite set; a random search without direction requires enumeration of a statistically large portion of the parameter space before near-optimality can be declared with confidence (Lan, 2001). Therefore, one can conclude that while MCS is a good tool for exploring the parameter space or identifying a good starting point for more detailed procedures, it is not itself a strong tool for automatic calibration. However, Monte Carlo theory plays a role in most advanced optimization algorithms. Most commonly-used methods for automatic calibration involve a directed search in the form of optimization. The explicit goal of optimization methods is the systematic adjustment of parameter values in search of a set that minimizes (or alternatively, maximizes) the value of an objective function or set of functions (Cooper et al., 1997). Analytically-based solutions to optimization problems require response surface conditions such as continuity, convexity, and twice-differentiability, which (at least in combination) are uncommon in the context of applied hydrologic modelling (ibid.). This precludes the use of analytical or derived solutions, leaving numerical solution as the only viable approach. Numerically-based optimization can involve local-search or global-search strategies. Local search strategies move in the direction of improving solutions (e.g., downward gradient) until no further improvement is possible. Examples of local search methods include the Pattern Search
E x a m p l e s o f loca l search methods include the Pattern Search 52 M e t h o d , the Rosenbrock M e t h o d , Sequential Quadrat ic P r o g r a m m i n g ( S Q P ) , and the Nelder -M e a d A l g o r i t h m , also ca l led the S i m p l e x M e t h o d (Franchin i et a l . , 1998; Hogue et a l . , 2000) . T h e Pattern Search M e t h o d ( H o o k e and Jeeves, 1961) explores the response surface around the current point and then forces explora t ion i n the d i rec t ion o f greatest improvement (Franchini et a l . , 1998). The Pattern Search M e t h o d deals o n l y w i t h the value o f the objective function and therefore does not require cont inui ty or differentiabili ty. Te rmina t ion occurs when either the response surface shows negl ig ib le improvement i n any direct ion, or w h e n the calculated step size o f each improvement step reaches a specified l imi t ( ibid.) . The Rosenbrock M e t h o d (Rosenbrock, 1960) approximates a gradient search without requi r ing the evaluat ion o f derivatives. It systematical ly explores improvements i n the objective function value a long a set o f orthogonal vectors i n parameter space, rotat ing the ordinate system i n the d i rec t ion o f the computed gradient once a l l vectors have been explored. Hornberger et a l . (1985) apply the Rosenbrock method to the hydro log ic mode l T O P M O D E L , but f ind that parameter est imation is unreliable. The S Q P a lgor i thm applies a constra ined-opt imizat ion vers ion o f Newton ' s method and is extremely efficient at converg ing to a l oca l op t imum. F ranch in i et a l . (1998) demonstrate the potential for us ing S Q P as a " f ine- tuning" step i n cal ibrat ing conceptual ra infal l - runoff models . One o f the most robust and popular o f local-search techniques is the d o w n h i l l s implex a lgor i thm designed b y N e l d e r and M e a d (1965), often referred to as the S i m p l e x M e t h o d . 
The Nelder-Mead downhill simplex algorithm should not be confused with the Simplex Method of Dantzig (1951) used in linear programming. The Nelder-Mead algorithm begins with a simplex of n+1 points in n-dimensional parameter space and rank-orders them best to worst according to their objective function value. The worst point is then reflected through the centroid of all other points, replacing its parent if the reflection results in an improved objective function value. If the reflected point is better than the best member of the simplex, the simplex is expanded in the direction of the reflected point. A similar contraction step is pursued if the reflected point is worse than the worst member of the simplex. Termination typically occurs when the standard deviation of objective function values across the simplex drops below a specified value. Duan et al. (1992) find that the Nelder-Mead Simplex Method outperforms other local search methods because it adjusts to the response surface and can avoid being trapped by the local optima within a region of attraction. However, there are two noteworthy limitations with regard to the Simplex Method and its treatment of parameters. Firstly, its effectiveness is substantially diminished when dealing with more than five to seven concurrently optimized parameters (Burges, 2003). Secondly, because the original Nelder-Mead Algorithm is unconstrained, bounded parameters must be transformed before they can be optimized (Gan, 1987). Local-search methods can be applied as either single-step processes or successive optimizations. There are two possible approaches for applying multi-step local search methods, namely repeated trials of a single algorithm (e.g., Duan et al.
, 1992) or consecutive application of different search strategies (e.g., Franchini et al., 1998). While such multi-step local searches are more robust than a single trial, any improvement over the single-trial case is most likely due to the reduced dominance of the various problems affecting each individual trial. Local search methods are frequently challenged and often defeated by numerous systematic problems. Local optima can trap a search algorithm before it reaches the global solution, necessitating user intervention before the search can proceed (Sorooshian and Gupta, 1995; Thyer et al., 1999). Poor objective function sensitivity to changing parameter values and significant parameter interaction can also pose difficulties, especially as the search approaches the optimum. Other common problems include non-uniqueness of optimal solutions and constitutive assumptions that are incompatible with the hydrologic data (Cooper et al., 1997). These difficulties have led researchers to conclude that techniques designed to locate the optimum point of a localized region are mathematically insufficient for locating global optima in complex situations such as hydrologic modelling (e.g., Duan et al., 1992; McCuen, 1973). Many optimization techniques have been designed to avoid the most common problems encountered by local methods (Cooper et al., 1997; Madsen, 2003). These global (sometimes called parallel) search strategies include, among others, Adaptive Random Search, Genetic Algorithms, Simulated Annealing, and the Shuffled Complex Evolution algorithm SCE-UA (Hogue et al., 2000; Madsen et al., 2002).
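Before turning to these global strategies, the Nelder-Mead operations described earlier (reflection, expansion, contraction, and termination on the spread of objective values) can be sketched as follows. The coefficients follow common textbook values, the standard shrink fallback is included, the contraction is a simplified inward step, and the quadratic objective is a stand-in for a model-error surface:

```python
import statistics

def nelder_mead(f, x0, step=1.0, tol=1e-8, max_iter=500):
    """Minimal Nelder-Mead downhill simplex for minimizing f."""
    n = len(x0)
    # Initial simplex: starting point plus one perturbed point per axis
    simplex = [list(x0)] + [
        [x0[j] + (step if j == i else 0.0) for j in range(n)] for i in range(n)
    ]
    for _ in range(max_iter):
        simplex.sort(key=f)  # rank-order best to worst
        values = [f(p) for p in simplex]
        # Terminate when objective values across the simplex agree closely
        if statistics.pstdev(values) < tol:
            break
        worst = simplex[-1]
        centroid = [sum(p[i] for p in simplex[:-1]) / n for i in range(n)]
        reflect = [2.0 * centroid[i] - worst[i] for i in range(n)]
        if f(reflect) < values[0]:
            # Reflected point beats the best: try expanding further
            expand = [3.0 * centroid[i] - 2.0 * worst[i] for i in range(n)]
            simplex[-1] = expand if f(expand) < f(reflect) else reflect
        elif f(reflect) < values[-2]:
            simplex[-1] = reflect
        else:
            # Contract toward the centroid; shrink everything if that fails
            contract = [0.5 * (centroid[i] + worst[i]) for i in range(n)]
            if f(contract) < values[-1]:
                simplex[-1] = contract
            else:
                best = simplex[0]
                simplex = [best] + [
                    [0.5 * (best[i] + p[i]) for i in range(n)]
                    for p in simplex[1:]
                ]
    return min(simplex, key=f)
```

For example, `nelder_mead(lambda p: (p[0] - 3.0) ** 2 + (p[1] + 1.0) ** 2, [0.0, 0.0])` converges to a point near (3, -1) on this smooth convex surface.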
Global optimization methods (GOMs) are commonly viewed as having a local phase (typically replicated from well-studied pre-existing local search techniques) and a global phase that avoids entrapment at local optima. Therefore, the most significant differences between alternative GOMs are found in their exploration of the problem space (Cooper et al., 1997). Adaptive Random Search (ARS) (Masri et al., 1980) is arguably the simplest example of a GOM. Adaptive Random Search chooses a fixed number of random samples from each of several nested sub-regions of the parameter space. The nested sub-regions are typically separated in size by an order of magnitude but are all centred on the same focal point. After each subset has been examined, the sampled parameter set yielding the best objective function value becomes the focal point for the next iteration. The search continues until the best point is repeatedly identified at the smallest sub-region of the search. Duan et al. (1992) report that the effectiveness of ARS is greatly improved when coupled with a local Simplex search. Simulated Annealing (Kirkpatrick et al., 1983) is based on observations of the energy-minimizing processes undergone by cooling metals. The process walks through the parameter space, exploring new parameter combinations by evaluating their objective function value. Any point that improves the objective function becomes the starting point for the next exploration. Being a derivative of the Metropolis process, Simulated Annealing also accepts inferior points with a specified probability. The algorithm must be set up to allow a large number of "uphill" steps, but not so large as to make the process completely random (Thyer et al., 1999).
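The Metropolis acceptance rule that distinguishes Simulated Annealing, accepting some uphill moves with a probability that decays as the "temperature" cools, can be sketched as follows; the geometric cooling schedule, step size, and other settings are illustrative choices, not a recommended configuration:

```python
import math
import random

def simulated_annealing(f, x0, t0=1.0, cooling=0.995, steps=5000, seed=7):
    """Minimize f by a random walk with Metropolis acceptance."""
    rng = random.Random(seed)
    current, current_f = list(x0), f(x0)
    best, best_f = list(current), current_f
    temp = t0
    for _ in range(steps):
        # Random walk through the parameter space
        trial = [x + rng.gauss(0.0, 0.2) for x in current]
        trial_f = f(trial)
        delta = trial_f - current_f
        # Metropolis rule: always accept improvements; accept uphill
        # moves with probability exp(-delta / temperature)
        if delta < 0 or rng.random() < math.exp(-delta / temp):
            current, current_f = trial, trial_f
            if current_f < best_f:
                best, best_f = list(current), current_f
        temp *= cooling  # gradual cooling reduces uphill acceptance
    return best, best_f
```

Early on, the high temperature lets the walk climb out of local depressions; by the final steps the acceptance rule is effectively greedy, as the surrounding text describes.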
Genetic Algorithms comprise a special case of GOMs that shares the fundamental components of global optimization but is distinguished by its search technique. Although research into Genetic Algorithms began in the 1970s, the work of Goldberg (1989) was the catalyst for their wider adoption in diverse fields of optimization. Like other methods, Genetic Algorithms use quantitative measures of performance as objective functions. However, Genetic Algorithms search the parameter space by emulating genetics (i.e., through the probabilistic selection and recombination of extant parameter values) rather than using Monte Carlo or gradient-based search techniques to generate successive improvements (Seibert, 2000). Genetic selection allows individuals which are better suited to their environment (i.e., parameter sets which produce better model simulations) to have a higher probability of reproducing (Franchini et al., 1998). Various "rules" are specified to govern the creation of new parameter sets. Common examples of different rules include duplication of a parent (reproduction), re-combination of elements from multiple parents into a new individual (crossover), and selection of random values from the feasible range or a sub-range bounded by the parent values (mutation) (Lan, 2001; Seibert, 2000). Rules are typically applied on a probabilistic basis. Implementing an appropriate balance of rules and probabilities is crucial for the success of the algorithm (Seibert, 2000). As for many GOMs, parameter values can either be used in their original form (e.g., Seibert, 2000) or modified into a format more amenable to the processes used in genetic algorithms; Goldberg (1989) uses binary representations of parameter values.
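The selection, crossover, and mutation rules described above can be made concrete with a short real-coded sketch (illustrative only; the tournament selection, per-gene uniform crossover, rates, and toy objective are arbitrary choices and do not reproduce any of the cited implementations):

```python
import random

def genetic_search(fitness, bounds, pop_size=30, generations=60,
                   mutation_rate=0.1, seed=1):
    """Minimal real-coded genetic algorithm.

    Each generation: tournament selection favours fitter parents
    (reproduction), per-gene uniform crossover recombines two parents,
    and mutation redraws a gene at random from its feasible range.
    """
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    best = min(pop, key=fitness)

    def tournament():
        a, b = rng.sample(pop, 2)
        return a if fitness(a) < fitness(b) else b  # lower fitness = better

    for _ in range(generations):
        nxt = []
        while len(nxt) < pop_size:
            p1, p2 = tournament(), tournament()
            child = [g1 if rng.random() < 0.5 else g2          # crossover
                     for g1, g2 in zip(p1, p2)]
            for i, (lo, hi) in enumerate(bounds):
                if rng.random() < mutation_rate:               # mutation
                    child[i] = rng.uniform(lo, hi)
            nxt.append(child)
        pop = nxt
        best = min(pop + [best], key=fitness)  # track the best-ever individual
    return best

# Toy "calibration": recover two parameters by minimising squared error.
target = [2.5, -1.0]
sse = lambda p: sum((a - b) ** 2 for a, b in zip(p, target))
best = genetic_search(sse, bounds=[(-5.0, 5.0), (-5.0, 5.0)])
```

Note that the balance between crossover and mutation probabilities controls the trade-off between exploiting good parents and exploring the feasible range, which is the balance of rules Seibert (2000) identifies as crucial.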
In either case, discrete rather than continuous parameters are required. Naturally, therefore, the GA's optimization reflects an integer rather than continuous-variable solution. Without careful consideration of the discretization process, any GOM that produces discrete solutions may yield near- or even sub-optimal solutions with artificially precise parameter values. Results of studies applying GAs to hydrologic models are mixed (e.g., Franchini and Galeati, 1997; Lan, 2001; Seibert, 2000). Wang et al. (1995) recommend coupling a Genetic Algorithm with a secondary local-search algorithm to improve results. For a complete discussion of the development of genetic algorithms, the reader is referred to the seminal work by Goldberg (1989).

One particularly well-documented and successful global optimization technique for automatic calibration of hydrologic models is the Shuffled Complex Evolution algorithm (SCE-UA), developed by Duan et al. (1994a) at the University of Arizona. The SCE-UA method applies a version of the Nelder-Mead algorithm using concepts from natural biological evolution to address the inefficiencies of the Nelder-Mead process (Duan et al., 1992). The SCE-UA method is designed specifically for the calibration of hydrologic models and is based on a synthesis of ideas including a combination of deterministic and probabilistic approaches, competitive evolution, complex shuffling, and systematic evolution of a population spanning the parameter space (Duan et al., 1994b). The well-known Nelder-Mead algorithm is the basis of the local search component of the SCE-UA method. However, it is the global framework that makes the SCE-UA method powerful and unique.
Duan et al. (1992) describe the process in detail; their flowchart of the global search framework is included herein as Figure 2-1. The exchange of information between points - achieved by shuffling individuals between complexes - is what allows the search to avoid being trapped by local optima. The entire population will converge toward the neighborhood of the global optimum, provided the initial population size is sufficiently large (ibid.).

[Figure 2-1: Flowchart of the SCE-UA Algorithm (from p. 1027, Duan et al., 1992). Steps: given n = dimension, p = number of complexes, and m = number of points in each complex, compute the sample size s = p × m; sample s points at random in the feasible space and compute the function value at each point; sort the s points in order of increasing function value and store them in D; partition D into p complexes of m points, i.e., D = (A_k, k = 1, ..., p); evolve each complex A_k with the CCE algorithm (see Figure 2-2); replace the evolved complexes into D; repeat until the convergence criteria are satisfied.]

The separation of points into independent complexes allows each complex to explore the response surface in different directions. The evolution of each complex is driven by the "competitive complex evolution" (CCE) strategy, in which the majority of new offspring are generated by the Nelder-Mead procedure (Nelder and Mead, 1965). Duan et al. (1992) provide a detailed explanation of how each complex is evolved; Figure 2-2 is a flowchart of the CCE strategy excerpted from their work. The algorithm's process parameters control various aspects of the optimization. These parameters can influence the efficiency and effectiveness of performance, and the user must ensure that all SCE-UA parameters are assigned appropriate values to maximize the probability that the algorithm will succeed. Duan et al.
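The sort-partition-evolve-shuffle cycle of the SCE-UA global framework can be sketched in simplified form. The code below is a sketch only: it collapses the full CCE strategy into a single reflection step per complex per loop, omits the sub-complex selection and contraction steps, and uses arbitrary parameter values. It illustrates how shuffling shares information between complexes:

```python
import random

def sce_shuffle(f, bounds, n_complexes=3, complex_size=7, loops=30, seed=2):
    """Skeleton of the SCE-UA global framework (heavily simplified)."""
    rng = random.Random(seed)
    s = n_complexes * complex_size
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(s)]
    for _ in range(loops):
        pop.sort(key=f)  # sort by increasing objective function value
        # Deal point k to complex k mod p, so each complex spans the
        # full range of current solution quality.
        complexes = [pop[k::n_complexes] for k in range(n_complexes)]
        for cx in complexes:
            # One Nelder-Mead-style move: reflect the worst point of the
            # complex through the centroid of the remaining points.
            worst = max(cx, key=f)
            rest = [pt for pt in cx if pt is not worst]
            centroid = [sum(c) / len(rest) for c in zip(*rest)]
            reflected = [2.0 * g - w for g, w in zip(centroid, worst)]
            feasible = all(lo <= v <= hi
                           for v, (lo, hi) in zip(reflected, bounds))
            if feasible and f(reflected) < f(worst):
                cx[cx.index(worst)] = reflected
            else:
                # Failed reflection: substitute a random feasible point,
                # keeping the search stochastic.
                cx[cx.index(worst)] = [rng.uniform(lo, hi) for lo, hi in bounds]
        pop = [pt for cx in complexes for pt in cx]  # shuffle complexes together
    return min(pop, key=f)

# Toy response surface with its optimum at (1, 1).
quad = lambda p: (p[0] - 1.0) ** 2 + (p[1] - 1.0) ** 2
best = sce_shuffle(quad, bounds=[(-10.0, 10.0), (-10.0, 10.0)])
```

Because each complex evolves independently between shuffles, the complexes can probe different regions of the response surface, while the periodic re-sorting and re-dealing prevent any one complex from stagnating at a local optimum.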
(1994b) describe the various parameters in detail and explore the sensitivity of the algorithm to parameter values. The SCE-UA method may not always be able to reach global convergence, but usually produces at least near-optimal results (Gan and Biftu, 1996). In the decade since it was first developed, the SCE-UA method has been studied extensively, with an emphasis on establishing its performance relative to other established approaches for optimization. The consensus in the literature is that the SCE-UA method provides solutions that are equal to or better than those of all other procedures for automatic calibration of hydrologic models (Burges, 2003; Gan and Biftu, 1996; Singh and Woolhiser, 2002; Sorooshian and Gupta, 1995). It has been compared favourably to Genetic Algorithms (Cooper et al., 1997; Kuczera, 1997), semi-automated methods (Gupta et al., 1999), Pattern Search and Sequential Quadratic Programming (Franchini et al., 1998), Simulated Annealing (Cooper et al., 1997; Thyer et al., 1999), and the Multi-Start Simplex Method (Duan et al., 1992; Gan and Biftu, 1996; Sorooshian et al., 1993). Although the algorithm typically requires tens of thousands of executions of the hydrologic model, most studies also conclude that the SCE-UA method is the most efficient automatic calibration tool available. Despite the successes of the SCE-UA method in the realm of automatic calibration, Thyer et al. (p. 773, 1999) caution that "SCE-UA should not be viewed as a panacea". The SCE-UA method is still subject to general challenges like poor model formulation, unrepresentative or error-laden data, non-uniqueness of optimal solutions, and sensitivity to the choice of objective function.
[Figure 2-2: Flowchart of the SCE-UA CCE Strategy (from p. 1028, Duan et al., 1992). Steps: given dimension n, complex A, and number of points m in A, select q, α, and β, where 2 ≤ q ≤ m, α ≥ 1, and β ≥ 1; assign a triangular probability distribution to A, p_i = 2(m + 1 − i)/[m(m + 1)], i = 1, ..., m; select q points from A according to p_i, storing them in B and their relative positions in A in L; sort B and L in order of increasing function value; compute the centroid g of u_1, ..., u_{q−1} and let u_q be the worst point in B; compute the reflection r = 2g − u_q, substituting a random point from the feasible sub-space when r falls outside the bounds or fails to improve the function value; replace B into A according to L, sort A in order of increasing function value, and repeat for the prescribed number of evolution steps before returning to the SCE shuffling loop.]

Gan and Biftu (1996) suggest that these issues imply a critical need to evaluate the global optimization features of the SCE-UA method when it is applied to a real-world catchment, since even the basic concept of a global optimum may not make sense in an applied context. It is relevant to emphasize again the general distinction between methods which attempt to identify the global optima of an objective function and methods aimed at attaining the best possible representation of a natural system. Even where a unique global optimum can be found for a specific objective function, both the global optimum and the simulation quality still depend on the choice of objective function as well as the various model and data uncertainties. Vrugt et al. (2003) point out that, while the SCE-UA method is able to identify the global minimum for a problem, it has had little success in driving forward the quest for a unique "best" parameter set.
2.4.5 Evaluating Model Performance

Model performance is evaluated during calibration to provide direction to the search for a better representation of the natural system. Performance is then re-evaluated during validation to support the hypothesis that the model is correctly representing the prototype. Establishing "good" model performance usually involves attaining an "acceptable" goodness of fit between model output and the historical record, under the assumption that conditions of application will be similar to those experienced during calibration and validation (Klemes, 1986b). Such evaluations of model "correctness" apply solely to the model output and cannot be extended to imply "hydrological soundness" of the model structure (ibid.). Madsen (2000) proposes using four basic factors to evaluate a hydrologic model: water balance (average catchment runoff volume), hydrograph shape, timing / rate / volume of peak flows, and low flow agreement. Model calibration guidelines given by Burges (2002) focus on the following:

• maintaining annual, seasonal, weekly, and daily water balance;
• matching hydrograph shape, peak values, and peak timing;
• ensuring predicted ET < potential ET for the region;
• verifying simulated water storage fluctuations with precipitation patterns;
• confirming parameter values are consistent with catchment properties;
• matching the surface flow to base flow ratio to soil and geological conditions;
• keeping comparisons consistent with the accuracy and errors of recorded data.

Objectives typically have different units of measurement and often cannot be aggregated into a single measure. Such objectives are called non-commensurate.
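The non-commensurate nature of these criteria can be illustrated with a short sketch (the function name, data, and choice of three criteria are illustrative, not a standard from the literature): a volume balance error in percent, a peak flow error in percent, and a peak timing error in time steps cannot meaningfully be summed into one number.

```python
def calibration_diagnostics(q_obs, q_sim):
    """Compute three non-commensurate performance criteria for a pair of
    observed and simulated flow series (lists of equal length)."""
    vol_obs, vol_sim = sum(q_obs), sum(q_sim)
    peak_obs, peak_sim = max(q_obs), max(q_sim)
    return {
        # Water balance: relative volume error, in percent.
        "volume_error_pct": 100.0 * (vol_sim - vol_obs) / vol_obs,
        # Peak flow magnitude error, in percent.
        "peak_error_pct": 100.0 * (peak_sim - peak_obs) / peak_obs,
        # Peak timing error, in time steps (positive = simulated peak late).
        "peak_timing_error_steps": q_sim.index(peak_sim) - q_obs.index(peak_obs),
    }

obs = [1.0, 2.0, 8.0, 5.0, 3.0, 2.0]
sim = [1.0, 2.5, 6.0, 7.0, 3.5, 2.0]
d = calibration_diagnostics(obs, sim)
```

For these synthetic series the simulation closes the water balance to within about 5%, yet underestimates the peak by 12.5% and delivers it one step late: three verdicts in three different units, illustrating why trade-offs between objectives are unavoidable.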
Choosing a solution based on multiple objectives usually requires trade-offs between non-commensurate objectives (Revelle et al., 1997). In hydrologic model calibration, no criterion or set of criteria for measuring model performance can be declared incontrovertibly superior to all other techniques (Klemes, 1986b; Sorooshian et al., 1983). In a study assessing the performance of several models using multiple categories of criteria, Houghton-Carr (1999) finds that no model performs well in all categories, and no single category adequately describes model performance. The criteria chosen should therefore be related to the purpose to which the calibrated model will be applied (Klemes, 1986b). Manual calibrations and some relatively recent automatic calibration algorithms (e.g., Bastidas et al., 2001) consider multiple criteria in an attempt to reduce the reliance on any single measure of performance. Approaches to evaluation can be quantitative, qualitative, or both (Houghton-Carr, 1999). Often, a combination of statistical, graphical, and intuitive measures is used to evaluate a given model. For example, Gan (1987) utilizes statistical and graphical comparisons, while also considering the physical plausibility of parameter values given the catchment characteristics and calibration data. Houghton-Carr (1999) finds that quantitative and qualitative approaches provide different information: while qualitative criteria are ambiguous, quantitative criteria demonstrate a relationship between flow regime and performance. Reliance on any one approach to the exclusion of others is inadvisable. For example, subjective graphical analysis may not clearly reveal consistent biases.
Perhaps more importantly, quantitative comparisons of paired values from two time series can easily result in unjustified and misleading levels of error if model or data timing is wrong or uncertain. If timing is slightly off, large errors are created for rising and falling limbs. Statistical comparison of two such time series could result in a high level of error not supported by graphical analysis. As an alternative, Burges (2002) proposes calibration based on storm volumes and peak flows as a multi-objective approach. Bergstrom et al. (2002) make allowance for uncertainty in modelling nitrogen levels by considering the best match of observed and simulated nitrogen concentrations within ±3 days of the nominal date. Although graphical comparisons are usually concerned with evaluations of simulated and observed hydrographs, other types of both quantitative and qualitative data can be expressed graphically. Different graphical techniques can illustrate the properties and patterns of a data set (e.g., box plots, scattergrams, transformations). Houghton-Carr (1999) makes good use of graphical comparison by plotting two different performance measures on the x- and y-axes, while Hogue et al. (2000) improve the representation of recessions while maintaining perspective for higher flows by applying a partial log transformation. Graphical analysis can also help to clearly and concisely organize related data. Plotting the time series of rainfall, catchment outflow, and residual daily flow volume error on a single graph is an effective way of assessing performance and is particularly useful in highlighting systematic errors in the model (Burges, 2002; Gan, 1987).
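The timing sensitivity described above, and the windowed matching in the spirit of Bergstrom et al. (2002), can be illustrated with synthetic data (the function name and window width are arbitrary illustrative choices): a hydrograph simulated perfectly but one step late accumulates large paired residuals on the rising and falling limbs, while a ±3-step matching window reports no error at all.

```python
def best_match_within_window(obs, sim, t, window=3):
    """Absolute error at time t, allowing a +/- `window`-step timing
    tolerance: take the simulated value within the window that best
    matches the observation."""
    lo = max(0, t - window)
    hi = min(len(sim), t + window + 1)
    return min(abs(obs[t] - sim[k]) for k in range(lo, hi))

# A hydrograph simulated with the right shape but shifted one step late.
obs = [1, 1, 2, 6, 9, 5, 3, 2, 1, 1]
sim = [1, 1, 1, 2, 6, 9, 5, 3, 2, 1]

paired = sum(abs(o - s) for o, s in zip(obs, sim))
windowed = sum(best_match_within_window(obs, sim, t) for t in range(len(obs)))
```

The paired residuals sum to a substantial total despite the near-perfect simulation, whereas the windowed comparison correctly reports zero error, showing how a strict residual-based statistic can be dominated by a small timing offset.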
While manual calibrations use both quantitative and qualitative criteria to evaluate model performance, automatic calibration algorithms rely exclusively on quantitative calculations. Quantitative measures can be as simple as direct comparison of flow volume, peak flow, or time to peak (Loague and Freeze, 1985). More complex quantitative measures such as maximum error, Root Mean Square Error (RMSE), Coefficient of Determination (R²), efficiency, and coefficient of residual mass are typically based on the relationship between each quantile of the observed time series and its simulated counterpart. While usually applied to the entire time series, consideration of a subset of results can provide insight into the performance of a specific model component. The standard approach to quantitative evaluation involves defining some measure of the length of a vector composed of the time series of residual errors (Gupta et al., 1998). The process of calibration is then viewed as an iterative attempt to find values of the model parameters that minimize the "length" of the vector. The process is complicated by the lack of an unambiguously correct way to define the length of the error vector (Gupta et al., 1998; Lan, 2001). Further, any statistical measure explicitly comparing the residuals of two time series arguably places more weight on timing than on matching flow patterns (Burges, 2002). Todini (1988) points out that statistical techniques based on analysis of residuals neglect the physical characteristics of the model, avoiding rather than taking advantage of prior expert knowledge intrinsic to the model structure.
Popular measures for quantitative evaluation include coefficients of linear correlation (e.g., the coefficient of determination, R²), coefficients of efficiency (e.g., Nash-Sutcliffe efficiency, E!), and various equations related to regression and curve-fitting theory (e.g., simple least squares, SLS) (Lan, 2001). A wide variety of statistics are available, although an individual user will generally have a set of preferred measures. Many users of the UBCWM prefer a modified version of the Nash-Sutcliffe efficiency that corrects for overall volume error (e.g., Lan, 2001; Micovic, 1998). The US National Weather Service uses a wide variety of measures, including Daily Root Mean Squared Error, Total Mean Monthly Volume Squared Error, Mean and Maximum Absolute Error, Nash-Sutcliffe efficiency, Bias (mean daily error), Peak Difference, First Lag Autocorrelation, and Number of Sign Changes (i.e., the number of times the sequence of residuals changes sign) (Gupta et al., 1998). Some of the more common functions used in evaluating hydrologic model performance are presented in Table 2-1, with descriptions, numerical formulations, and examples of studies in which each has been applied. These common functions generally each require a set of assumptions regarding the statistical distribution of the output data errors, while errors in input data are typically ignored (Gupta et al., 1998). The coefficient of determination (R² or D!) is the square of the coefficient of linear correlation, a popular measure of the strength of the relationship between two sets of data. For independent data with no correlation whatsoever, R and R² are zero; for highly-correlated data sets, R and R² approach unity.
The coefficient of determination is defined as the quotient of explained variation and total variation. Lan (2001) and Seibert and McDonnell (2002) note the difference between the case of perfect correlation (R² = 1) and that of perfect simulation (i.e., zero error, where q_sim = q_obs). This difference suggests that R² alone is not a good measure of data similitude for an automatic calibration against a single data series. The low correlation (0.14) between the coefficient of determination (D!) and residual volume error (dV) observed by Micovic (1998) supports this conclusion.

Table 2-1: Common Measures for Quantitative Evaluation of Hydrologic Model Performance

R² (D!): Coefficient of Determination. Proportion of variance in q_obs (x) that can be explained by the modelled q_sim (y).
    R² = [nΣxy − (Σx)(Σy)]² / {[nΣx² − (Σx)²][nΣy² − (Σy)²]}   (1)
    Example applications: Lan (2001); Micovic (1998); Franchini and Pacciani (1991).

HMLE: Heteroscedastic Maximum Likelihood Estimator. Sum of weighted squared errors (w_t·e_t²). Assumes errors are Gaussian and heteroscedastic, with zero mean and covariance matrix V = σ²I.
    HMLE = [(1/n) Σ w_t e_t²] / [Π w_t]^(1/n)   (2)
    Example applications: Gan (1987); Sorooshian et al. (1983).

LAD: Least Absolute Difference (Absolute Least Value). Sum of the absolute values of the errors (i.e., q_obs − q_sim) for each pair of q_obs and q_sim.
    LAD = Σ |q_obs − q_sim|   (3)
    Example applications: Gan (1987); Hornberger et al. (1985); Houghton-Carr (1999).

E!: Nash-Sutcliffe Efficiency. A variant of the Coefficient of Determination that measures the relative magnitude of "noise" (q_sim − q_obs) to "information" (q_obs − mean of q_obs).
    E! = 1 − Σ(q_sim − q_obs)² / Σ(q_obs − mean(q_obs))²   (4)
    Example applications: Cooper et al. (1997); Lan (2001); Micovic (1998); Gupta et al. (1999); Houghton-Carr (1999).

RMSE (DRMS): (Daily) Root Mean Square Error. The square root of the mean of the squared errors (i.e., q_obs − q_sim); sometimes referred to as DRMS when calculated for a daily time series.
    RMSE = √[(1/n) Σ(q_obs − q_sim)²]   (5)
    Example applications: Cooper et al. (1997); Gupta et al. (1999); Hogue et al. (2000).

SLS (SSE): Simple Least Squares (Sum of Squared Errors). A simple summation of the squared errors (i.e., q_obs − q_sim) for each point in the time series.
    SLS = Σ(q_obs − q_sim)²   (6)
    Example applications: Cooper et al. (1997); Gan (1987); Hornberger et al. (1985); Houghton-Carr (1999); Sorooshian et al. (1983).

The Nash-Sutcliffe efficiency statistic (E!) measures the proportion of variability in the observed flow series that is explained by the hydrologic model (Loague and Freeze, 1985). Nash-Sutcliffe efficiency measures the relative magnitude of noise (q_sim − q_obs) to information (q_obs − q_mean). E! is better suited to the evaluation of hydrologic simulations than the coefficient of determination because it evaluates the magnitude and shape of the difference between observed and simulated hydrographs. However, the Nash-Sutcliffe E! is biased for data sets with large total variance, for which the residual variance (in the numerator) is dominated by the observed variance (in the denominator) (Lan, 2001). Large events are weighted more heavily in the sense that the residual error for a large event will have much more effect on the statistic than the residual error for a small event, even if the two are the same as percentages of event magnitude. The poor reproduction of low flow periods in Seibert (2000) is attributed to the bias of the Nash-Sutcliffe efficiency statistic toward matching high-flow events at the expense of low-flow events. Loague and Freeze (1985) observe that, in general, Nash-Sutcliffe efficiencies are better when calculated for volume and peak flow than for peak flow timing.
Recent work has shown that improvement is possible in some cases by calculating the Nash-Sutcliffe efficiency for the natural logarithms of the discharge time series (i.e., E! = f[ln(Q)]) (Weiler, 2005). In a Genetic Algorithm calibration of the UBCWM, Lan (2001) demonstrates that, across the 20th generation population, increasing E! is not necessarily associated with decreasing relative volume error dV/V. To emphasize conservation of mass and closing of the water balance, the UBCWM User's Manual (Quick, 1994) presents the modified Nash-Sutcliffe statistic EOPT!, defined as:

EOPT! = E! − |Σq_obs − Σq_sim| / Σq_obs   (7)

The EOPT! statistic is used for evaluating the performance of the UBCWM. The two quantitative criteria most widely used for model evaluation in the literature are the Simple Least Squares Error (SLS) and the Root Mean Square Error (RMSE) statistics. The SLS statistic is the sum of the squares of the individual residual errors (i.e., SLS = Σ(q_obs − q_sim)²). As for other statistics, data error can lead to problems when applying SLS, since equal weight is given to both valid and erroneous quantiles in the time series. However, for SLS, any distortion is exacerbated through the squaring of large residual errors (Lan, 2001). Least absolute difference functions are better suited to erroneous data because the errors are not squared before summation. The RMSE statistic is defined as the square root of the mean of the squared residual errors, effectively computing the standard deviation of the model prediction error (Gupta et al., 1999). When residual errors are calculated at a daily timestep, RMSE is sometimes labeled Daily Root Mean Square Error (DRMS).
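The efficiency statistics above translate directly into code. The sketch below (variable names are illustrative) implements E!, the volume-corrected EOPT! of equation (7), and RMSE of equation (5); note that any simulation carrying a volume bias necessarily scores lower on EOPT! than on E!:

```python
def nash_sutcliffe(q_obs, q_sim):
    """E! = 1 - sum((q_sim - q_obs)^2) / sum((q_obs - mean(q_obs))^2)."""
    mean = sum(q_obs) / len(q_obs)
    noise = sum((s - o) ** 2 for o, s in zip(q_obs, q_sim))
    info = sum((o - mean) ** 2 for o in q_obs)
    return 1.0 - noise / info

def eopt(q_obs, q_sim):
    """Volume-corrected efficiency, eq. (7): E! minus the absolute
    relative volume error, penalising failure to close the water balance."""
    vol_err = abs(sum(q_obs) - sum(q_sim)) / sum(q_obs)
    return nash_sutcliffe(q_obs, q_sim) - vol_err

def rmse(q_obs, q_sim):
    """Eq. (5): square root of the mean squared residual error."""
    n = len(q_obs)
    return (sum((o - s) ** 2 for o, s in zip(q_obs, q_sim)) / n) ** 0.5

obs = [2.0, 4.0, 8.0, 6.0, 3.0, 2.0]
perfect = list(obs)                  # E! = EOPT! = 1 and RMSE = 0
biased = [1.2 * q for q in obs]      # right shape, but a 20% volume error
```

For the biased series the hydrograph shape is reproduced exactly, yet EOPT! is 0.2 lower than E!, which is precisely the mass-balance emphasis the modified statistic is designed to provide.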
The RMSE statistic assumes the presence of Gaussian, independent error with homogeneous variance; the resulting implicit bias toward matching the highest events of record renders the RMSE statistic poorly applicable to data consisting mostly of low flows with a few large events (Gan and Biftu, 1996; Yapo et al., 1996). The tendency of the RMSE objective function to provide good peak estimates while allowing strong bias in other parts of the hydrograph is readily apparent at intermediate steps of the calibration process (Hogue et al., 2000). By assuming homogeneous variance in error, SLS and RMSE imply acceptance of the argument proposed by Arnaud et al. (2002): that since the magnitude of error is not increasing with size, absolute error is constant and therefore relative error in the data should be expected to decrease for larger events. If the accuracy of low-flow simulation is a priority, the Heteroscedastic Maximum Likelihood Estimator (HMLE), discussed in detail by Sorooshian et al. (1983), is a better choice than the RMSE or SLS statistics. Yapo et al. (1996) improve low-flow simulation by using the HMLE as an objective function instead of the RMSE, although performance on higher flows deteriorates. The HMLE assumes output errors are Gaussian with zero mean and uncorrelated, heterogeneous variance. The heteroscedastic basis of the HMLE allows for the existence of magnitude-dependent errors in streamflow measurement (Sorooshian et al., 1983; Yapo et al., 1996). Both the RMSE and the HMLE require the independent and identically distributed designation for residual errors; in neither case is the assumption of error independence supported (Yapo et al., 1996).
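Following the formulation in Table 2-1 (equation 2), the HMLE can be sketched as below, with weights of the common form w_t = q_obs^(2(λ−1)). This is a sketch only: Sorooshian et al. (1983) estimate the transformation parameter λ as part of the optimization, so fixing it as a default argument here is an illustrative simplification. Setting λ = 1 makes every weight unity and recovers a simple mean squared error, while λ < 1 down-weights errors on large flows:

```python
import math

def hmle(q_obs, q_sim, lam=0.5):
    """Heteroscedastic Maximum Likelihood Estimator (Table 2-1, eq. 2):

        HMLE = [(1/n) * sum(w_t * e_t^2)] / [prod(w_t)]^(1/n)

    with weights w_t = q_obs^(2*(lam - 1)), so the expected error
    variance is allowed to grow with flow magnitude."""
    n = len(q_obs)
    w = [o ** (2.0 * (lam - 1.0)) for o in q_obs]
    weighted_mse = sum(wi * (o - s) ** 2
                       for wi, o, s in zip(w, q_obs, q_sim)) / n
    # Geometric mean of the weights, i.e. [prod(w_t)]^(1/n), computed in
    # log space for numerical stability.
    geo_mean_w = math.exp(sum(math.log(wi) for wi in w) / n)
    return weighted_mse / geo_mean_w

obs = [1.0, 2.0, 4.0, 8.0]
sim = [1.5, 2.0, 3.0, 9.0]
```

With λ = 1 the statistic collapses to the ordinary mean squared error, which is the limiting homogeneous-variance case in which the HMLE reduces to the SLS family of measures.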
Independent calibration using the RMSE and HMLE criteria can result in drastically different values for many of the parameters (Sorooshian et al., 1993). In the limiting case of homogeneous variance, the HMLE statistic reduces to the SLS. In a study comparing SLS and HMLE, Sorooshian et al. (1983) attribute the better calibration-period statistics obtained with the SLS calibration to its strong curve-fitting abilities; the HMLE statistics are found to be superior for the validation period. One implication of these findings is that automatic calibrations using the SLS statistic may sacrifice physical realism in favour of fitting the calibration data set. Measures with a poor physical basis should be avoided if possible, as they can easily lead to overfitting (Klemes, 1982).

2.4.6 Hydrologic Model Validation

Many models can be calibrated to approximate a given data set without correctly representing the in situ processes. Therefore, a model must be tested on a distinct and independent data set to confirm that the uncertainty in applied model predictions is acceptable (Klemes, 1986b; O'Connell and Todini, 1996). This process is most commonly referred to as "model validation". Elsewhere in the literature, the process is referred to as "confirmation" or "model performance evaluation" (e.g., Gupta et al., 1999). Model validation is also sometimes referred to as "verification" (Gupta et al., 1999). Although the terms "validation" and "verification" are often used interchangeably, their respective meanings are distinct. A "validated" model must be internally consistent and have no known or detectable flaws. In contrast, a "verified" model is one whose truth has been demonstrated (Oreskes et al., 1994).
It may be convenient to think of "verification" as the limit state reached as the quality of the validation process approaches perfection and its size and scope tend to infinity. Hornberger and Boyer (1995) contend that establishing truth is not and should not be the goal of validation, since even a perfect match between observed and predicted data does not verify a model. Oreskes et al. (p. 642, 1994) argue that verification is only possible for "closed systems in which all the components of the system are established independently and known to be correct". Hydrologic models cannot constitute closed systems due to factors such as incomplete knowledge, continuum theory, and scaling of non-additive properties (ibid.). Therefore, hydrologic models cannot be truly verified. Typical criteria examined during hydrologic model validation include mathematical rigour, absence of bias, and closeness of fit. Obviously, success with these criteria does not necessarily imply scientific objectivity or hydrological insight (Klemes, 2000a). For Lei and Schilling (p. 81, 1996), a successful validation means only that "the model is not rejected for this very task in this very situation". In this sense, a successful validation should be considered as supporting evidence for, rather than proof of, an acceptable representation of reality (Bergstrom et al., 2002). While the probability of a correct model representation improves with increasing diversity of validation data, there is little basis for extrapolating conclusions beyond the temporal, spatial, and magnitude limits of the validation data set (Beven, 1989; Klemes, 1986b; Oreskes et al., 1994).
Thus, the data used for validation should ideally be hydrologically similar to the conditions expected in the applied simulations. Although validation cannot conclusively verify a given model, poor model performance, as measured against observed data in the validation phase, is an obvious indicator of error. However, the results of validation are commonly insufficient to conclusively support or refute a model, especially where validation data are limited (Goodrich and Woolhiser, 1994; Oreskes et al., 1994). For models that purport to simulate internal catchment responses, comparison of outflow hydrographs is an insufficient test of validity (Beven, 1989). Bergstrom et al. (2002) recommend that measurements other than discharge be considered to further validate the model. Where validation is inconclusive, the corresponding lack of confidence should be reflected in results. The quality of simulation may appear to deteriorate from calibration to validation given the curve-fitting nature of the calibration process and the assumed independence of the validation data set. Given a successful calibration, the change in performance from calibration to validation should be modest to negligible (e.g., Gan and Biftu, 1996; Gan et al., 1997; Madsen, 2003). Gan and Biftu (1996) find that model performance deteriorates more between calibration and validation when "stronger" automatic calibration approaches, such as the SCE-UA method, are used. This may be an example of "overfitting" a data set with a powerful optimization tool. There is often a temptation to return to the calibration phase or otherwise alter parameter values to improve the fit of the model to the validation data.
However, such contamination of the validation process automatically precludes its success (Oreskes et al., 1994). Beven (1989) notes a further limitation for event-based models: if boundary or initial conditions must be calibrated for a validation event, it cannot be considered a true validation. If a continuous model is being used in an independent validation test, allowance must be made for a "spin-up" period to avoid any influence of initial conditions on the validation period. Loague and Freeze (1985) assert that, in the case of a successful calibration and in the absence of overfitting of parameters or validation beyond the calibration range, errors in calibration and validation data should be statistically similar. The most common approach to hydrologic model validation applies this principle to the so-called "split-sample" test. The split-sample test involves the subjective division of a single record into two parts, with one used for calibration and the other reserved for validation. If the record length is not sufficient to be split 50/50, the record should be split twice (e.g., 70/30 and 30/70), and validation must pass on both sets for the model to pass this test (Klemes, 1986b). Merz and Bloschl (2004) present a case study in which they use Klemes' (1986b) concept of split-sample validation to good effect. A split-sample test is not necessarily reliable in all cases (e.g., Gupta et al., 1999), especially given the typical lack of independence between the calibration and validation data series. If the split-sample test is suspect, a more in-depth analysis of model residuals is likely required. Such situations are not addressed herein.
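The split-sample procedure can be sketched in a few lines of code. The following is a minimal illustration only: the `calibrate` and `simulate` callables are hypothetical stand-ins for a model's fitting and simulation routines, the Nash-Sutcliffe efficiency is used as an example performance criterion, and the passing threshold of 0.6 is an arbitrary placeholder rather than a published standard.

```python
def nash_sutcliffe(obs, sim):
    """Nash-Sutcliffe efficiency: 1.0 indicates a perfect fit."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - s) ** 2 for o, s in zip(obs, sim))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot

def split_sample_test(record, calibrate, simulate, split=0.7, threshold=0.6):
    """Klemes-style split-sample test.

    The record is split both ways (e.g., 70/30 and 30/70); the model
    must pass validation on BOTH held-out portions to pass the test.
    """
    n = int(len(record) * split)
    halves = [(record[:n], record[n:]), (record[n:], record[:n])]
    scores = []
    for cal, val in halves:
        params = calibrate(cal)            # fit on one portion only
        obs = [r["q_obs"] for r in val]
        sim = simulate(params, val)        # simulate the held-out portion
        scores.append(nash_sutcliffe(obs, sim))
    return all(s >= threshold for s in scores), scores
```

Note that the test deliberately never lets the calibration step see the validation portion, reflecting the contamination caveat above.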
However, it is worth noting that qualitative analysis must also play a role in model validation, as parameter values must be feasible, realistic, and able to lend a degree of certainty to simulations and forecasts (Sorooshian et al., 1983).

3. Uncertainty in Hydrologic Modelling

"It is far better to foresee even without certainty than not to foresee at all." - Henri Poincare

This chapter is intended for both beginning and experienced hydrologic modellers seeking to understand the various ways in which uncertainty is introduced into model output. Those familiar with the body of literature on uncertainty in hydrologic modelling may wish to proceed directly to Chapter 4. Despite significant progress, hydrologic models are still far from achieving desired levels of accuracy and certainty. Often, the best that can be achieved under ideal conditions is to predict behaviour with a certain degree of "confidence" - the complement of which is a degree of residual uncertainty. Any output from a hydrologic model should therefore be analyzed with regard to its significance in the context of the associated model predictive uncertainty (Binley et al., 1991). However, uncertainty itself is often indeterminate or unidentifiable; it can be quantitative, qualitative, or unknown in character (Lei and Schilling, 1996). Different aspects of model behaviour may have different degrees of uncertainty, and identifying the sources of uncertainties in model results can be extremely difficult (Garen and Burges, 1981). This chapter provides an overview of uncertainty in the context of hydrologic modelling. A distinction must be made between the concepts of error (or accuracy) and uncertainty. Although sometimes used interchangeably, the two concepts are distinct in nature.
" E r r o r " is the difference between a computed or measured value and its true or theoret ical ly correct counterpart. Therefore, the term "error" is properly used w h e n the observed response, process, or outcome has been observed to be incorrect or i nva l i d . "Uncer ta in ty" , o n the other hand, is the cond i t ion that the computed or measured output may differ f rom the baseline response i n magnitude, process, or probabi l i ty . It is usual ly expressed i n a relative or probabi l i s t ic sense, as opposed to error, w h i c h is an absolute quantity. The concept o f uncertainty is analogous i n some ways to "p rec i s ion" i n the w e l l - k n o w n example o f accuracy i n marksmanship . A s illustrated i n F igure 3-1, a result can be accurate but not 70 precise; s imi la r ly , it can be precise but not accurate. B y extension, m o d e l results can be accurate but uncertain, or certain but erroneous. Ideally, both certainty and accuracy w i l l be present; more c o m m o n l y , neither can be established conclus ive ly . (c) accurate but not precise (d) accurate and precise Figure 3-1: Accuracy and Precision as Analogues for Error and Uncertainty A p p l i e d model lers sometimes tend to focus o n the more easi ly-identif iable issue o f m o d e l accuracy at the expense o f g i v i n g due considerat ion to uncertainty. Indeed, i f m o d e l l i n g experts struggle to understand the uncertainties i m p l i c i t i n their models , it w o u l d be nai've to assume that prac t ic ing hydrologists w i l l g ive suff iciently comprehensive considerat ion to uncertainty i n their results (Grayson et a l . , 1994b). 71 Large or diff icult- to-resolve uncertainties i n results can cause users to exert pressure on the m o d e l developer to " i m p r o v e " the m o d e l ( K l e m e s , 1982). 
I f " improvements" (e.g., more data, better understanding) are not easi ly achievable, some model lers resort to attempting to extract more informat ion from the data through mathematical manipula t ion ( ibid.) . M o d e l developers and promoters should ensure that there is an appropriate leve l o f awareness and discuss ion o f m o d e l l i n g uncertainties amongst the c o m m u n i t y o f users (Grayson et a l . , 1994b). A thorough understanding o f the sources o f uncertainty and a consistent t axonomy is necessary to facilitate awareness and discuss ion i n the m o d e l l i n g communi ty . These topics are explored i n Sec t ion 3.1. Substantial uncertainty w i l l persist i n the outputs o f even a wel l -ca l ibra ted hydrologic m o d e l ( B i n l e y et a l . , 1991). A discuss ion o f the interaction between the cal ibra t ion process and m o d e l predict ive uncertainty is contained i n Sec t ion 3.2. A l t h o u g h a l l aspects o f uncertainty cannot be measured object ively, a number o f approaches exist for exp lo r ing the impact o f uncertainty o n a m o d e l or decis ion . A n ove rv iew o f various methods for analyz ing uncertainty is p rov ided i n Sec t ion 3.3. F i n a l l y , the previously-discussed caveats o f m o d e l extrapolation i m p l y that uncertainty must increase consp icuous ly w h e n s imula t ing extreme events. Speci f ic considerations o f uncertainty for extreme event s imula t ion are discussed i n Sec t ion 3.4. 3.1 Classification of Uncertainty in Hydrologic Modelling O v e r t ime, many classif icat ion schemes for uncertainty have been proposed. L o h a n i et a l . (1997) prov ide perhaps the most fundamental, p ropos ing that two basic types o f uncertainty exist: what is not k n o w n at a l l , and errors i n what is k n o w n . A l t h o u g h h igh ly significant, such a perspective is ph i lo soph ica l i n nature and does not address the more pract ical quest ion o f sources o f uncertainty. 
Beck (1987) proposes classifying uncertainty according to the processes through which it is introduced into the problem formulation, i.e., through prior assumptions and knowledge, model identification, and prediction. This approach is much better suited to understanding the sources of uncertainty in hydrologic modelling, but is still somewhat general in nature. Vicens et al. (1975) consider uncertainty as belonging to one of two basic categories: natural and informational. The US National Research Council (p. 41, NRC, 2000b) has adopted the same perspective, defining natural variability and knowledge uncertainty as follows:

"Natural variability - sometimes called aleatory uncertainty - deals with inherent variability in the physical world; [...] In the water resources context, uncertainties related to natural variability include things such as stream flow, assumed to be a random process in time, or soil properties, assumed to be random in space. Natural variability is also sometimes referred to as external, objective, random, or stochastic uncertainty."

"Knowledge uncertainty - sometimes called epistemic uncertainty - deals with a lack of understanding of events and processes, or with a lack of data from which to draw inferences; by assumption, such lack of knowledge is reducible with further information. The word epistemic is derived from the Greek "to know." Knowledge uncertainty is also sometimes referred to as functional, internal, or subjective uncertainty."

The NRC (2000a) observes that these two uncertainties affect calculations of risk differently, and cautions that the two should be clearly distinguished in practice. Nonetheless, the distinction between the two is somewhat arbitrary and hypothetical in nature.
Perception and context typically govern the distinction, since different assumptions may cause natural uncertainties to become knowledge uncertainties and vice versa (NRC, 2000b). Vicens et al. (1975) divide informational (knowledge) uncertainty into two basic components, corresponding to uncertainty in model structure and in parameter values. Later studies add a third component corresponding to uncertainty arising from uncertainty or error in observed data (e.g., Garen and Burges, 1981; Loague and Freeze, 1985). Every hydrologic study is subject to varying degrees of all three components. These three components are collectively labelled model predictive uncertainty. Melching et al. (p. 2275, 1990) summarize the resulting four basic classifications of uncertainty as follows:

• natural variability, which refers to "the random temporal and areal fluctuations inherent in natural processes";
• data uncertainty, which includes measurement inaccuracy and errors, the adequacy of the data to represent in situ conditions, and any data handling, transmission, or transcription errors;
• model parameter uncertainty, which reflects "variability in the determination of the proper parameter values to use in modelling a given causative event"; and
• model structure uncertainty, which characterizes "the ability of the model to accurately reflect the watershed's true physical runoff process".

Critical analysis of this taxonomy encourages exploration of relationships between specific sources of uncertainty in hydrologic modelling. The mind map shown in Figure 3-2 is the result of one such analysis. The figure shows the central goal of improving estimation of extreme events, surrounded by various issues that inhibit progress.
Major issues connect to the goal as trunks, with sub-issues specified as branches. The major areas of focus are analogous to the types of uncertainty discussed above, with event uncertainty corresponding to natural variability, and model uncertainty including both model structural and parameter uncertainty. Data uncertainty is separated from the other knowledge uncertainties because of its dominant role in most situations. Four connections are shown between the various regions of the map (i.e., lack of extreme event data, model extrapolation, calibration, and physical modelling), representing concepts that span the distinction between uncertainties. For example, while calibration is primarily used to reduce parameter uncertainty, the process implicitly attempts to account for any erroneous or uncertain data. Physically-based models attempt to reduce the uncertainty associated with lumping data by adopting a distributed framework and detailed process representation. Model extrapolation typically involves application of a validated structure of process dynamics under arbitrary conditions far removed from those of validation. Finally, the lack of extreme event data can be classified as a function of both data uncertainty and natural variability. Further discussion of how various sources of uncertainty fit into each of the four classifications of natural, data, parameter, and model uncertainty follows in Sections 3.1.1 through 3.1.4.

3.1.1 Natural Variability

Natural variability occurs in both spatial and temporal dimensions. It is considered irreducible, since it is a natural characteristic of a given continuum of time and space and is therefore not subject to minimization or "improvement".
Uncertainty is associated with natural variability because modellers lack the tools to fully understand, measure, and represent natural variability in hydrologic modelling. For example, Gan et al. (1997) conclude that dry catchments (i.e., those having streamflow/rainfall ratios less than 0.2) are generally more difficult to model than wet catchments due to their greater hydrologic complexity and variability. In this case, the watershed response of dry catchments is more variable. However, it cannot be said to be "less certain", unless one is discussing our ability to reflect or replicate the response through data collection and process modelling. Natural variability across a catchment can result in data, parameter, or model uncertainties, or any combination thereof. Often, it is a combination of these resultant uncertainties that is responsible for anomalous behaviour. However, the natural variabilities of a given system may also have systematic elements. For example, Arnaud et al. (2002) note that peak flows are much more strongly influenced than average runoff volumes by the spatial and temporal characteristics of the watershed and the prevailing storm. Similarly, Woolhiser (1996) observes an inverse relationship between response variability and rainfall rate. Such observations can help reduce the knowledge uncertainties associated with natural variability. Characterizing natural variability is a key focus for recent hydrologic research (Smith et al., 1994). Natural variability can be significant at many different scales (Woolhiser, 1996). It can arise from heterogeneous behaviour or properties (e.g., weather, topography), discontinuities (e.g., geological formations; land use), or processes (e.g., rainfall; infiltration) (Singh, 1995b; Song and James, 1992).
Although the ability of science to measure, record, and analyze observations has improved, a growing awareness of small-scale hydrologic complexity has not resulted in an improved management approach (Beven, 2000). No authoritative guidance is available for assessments of heterogeneity, and few methods exist for measuring the spatial patterns of hydrologic behaviour (ibid.). The most common approach to defining natural variability is to use a network or grid of point measurements. Although all point measurements represent an integration over some effective volume, this volume is often small compared with the macro-scale heterogeneity of the process being measured (Beven, 2000). Thus, in many cases the values obtained through field sampling are not representative of the larger hydrologic domain. Multi-point (e.g., grid or network) sampling programs, while generally preferable to single-point samples, may uncover greater heterogeneity and necessitate even more sampling (Grayson et al., 1992b). An unrealistic, statistically large number of samples could ultimately be required to obtain an adequate characterization of conditions (Beven, 2000). While there are opportunities for using advanced measurement and analysis techniques such as GIS and remote sensing to characterize natural variability, such techniques still require a theory or method of spatial averaging, interpretation, and processing, and may not completely eliminate uncertainty (Beven, 1989). Characterization of rainfall variability has been shown to have a particularly pronounced effect on the accuracy and quality of model results (e.g., Michaud and Sorooshian, 1994; O'Connell and Todini, 1996; Ogden et al., 2000; Steiner et al., 1999).
In a study of rainfall variability across a high-density rain gauge network, Burges (2002) finds that no single gauge adequately describes either average depth (volume) or intensity, and no three gauges adequately describe spatial variability. Steiner et al. (1999) consider the potential for combining radar and rain gauge measurements, ultimately concluding that even in ideal circumstances, the different natures of the two measurements can still result in discrepancies. They find that, in general, the rain gauge data may not be spatially representative, while radar rain measurements may not be temporally representative (ibid.). In this way, natural variability could potentially defy complete and objective characterization even where an abnormally large amount of data is available. Lumped hydrologic models attempt to account for natural variability by identifying a single set of effective parameter values that characterize the cumulative response of a heterogeneous catchment. This approach may be efficient where data are limited, but is unlikely to produce accurate results. Even distributed models lose detail and smaller-scale variability through integration and averaging over model element areas. Goodrich and Woolhiser (1994) show that the assumption of spatial uniformity, even at scales as small as 300 m, is not supported for the Walnut Gulch experimental watershed. Wood et al. (1988) contend that the various processes of hydrology should, in theory, have a measurable catchment size at which variability is at a minimum. Singh (1995b) believes that an appropriate scale must be small enough to capture any significant hydrologic heterogeneity, but not so small as to be dominated by local physical features.
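The kind of single-gauge unrepresentativeness reported by Burges (2002) is easy to demonstrate with a toy calculation. The storm totals below are hypothetical, and a simple unweighted mean is used as a stand-in for a proper areal-averaging scheme (e.g., Thiessen weighting):

```python
import statistics

# Hypothetical storm totals (mm) at five gauges across one catchment:
gauge_totals = [12.0, 31.0, 18.0, 44.0, 25.0]

# Network estimate of areal-average depth (unweighted for simplicity):
areal_mean = statistics.mean(gauge_totals)

# Spatial coefficient of variation across the network:
cv = statistics.stdev(gauge_totals) / areal_mean

# Relative error incurred by relying on any single gauge alone:
single_gauge_errors = [abs(g - areal_mean) / areal_mean for g in gauge_totals]
```

With these illustrative numbers, individual gauges misstate the areal mean by up to roughly 70%, even though every gauge reading is "correct" at its own location; the discrepancy is natural variability, not measurement error.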
Numerous studies have been directed toward determining the optimal "scale" or area for averaging hydrologic heterogeneity. These attempts have met with varying degrees of success and reached a range of conclusions. Song and James (1992) find that a scale of approximately one square mile produces the best results, with the optimal scale size varying upwards and downwards for gentle and mountainous topography, respectively. Wood et al. (1988) find that variability of response stabilizes for catchment sizes greater than approximately one square kilometre. Grayson et al. (1992b) contend that the magnitude of a representative elementary area is not fixed, but rather will depend on the desired output. In the final analysis, not all catchments have a characteristic scale, no single characteristic scale is applicable for all catchments, and output based on this characteristic scale will still be approximate. Residual uncertainty in the final result is often ignored for lack of a better alternative (Beven, 1989, 2001). While spatial heterogeneity has received the bulk of attention in the literature, a hydrologic modeller should also be aware of potential temporal effects. For example, bias in data may not be constant over time, and adjustments to the data must reflect this (Steiner et al., 1999). Garen and Burges (1981) identify the seasonal variance in streamflow measurement uncertainty as a prominent example.

3.1.2 Data Uncertainty

It is frequently impossible for hydrologists to resolve processes known to occur in the field with the input and output data available (Hornberger et al., 1985). This is not surprising, given that data uncertainty is usually the most significant component of uncertainty in hydrologic modelling (Kouwen, 2003).
In general, data uncertainty can be said to exist when there is a gap between the data themselves, either collectively or as individual quantities, and what those data are assumed to represent. Naturally, this definition implicitly includes all situations where data are simply unavailable and must be inferred, extrapolated, transposed, or otherwise approximated from other locations or sources. Uncertainty in input data can lead to poor reliability or large prediction error for any otherwise-reasonable hydrologic model (Melching et al., 1990). Obviously, the availability of data is of paramount importance, as uncertainty is highest in data-poor environments. In a study of the simple distributed model THALES, Grayson et al. (1992a, 1992b) use intensively-monitored research catchments having an unusually rich portfolio of data. However, the authors still report problems due to insufficient data. Often, data are simply not available in areas of concern to hydrologists and engineers, and subject catchments are chosen more for their data availability than for their appropriateness to the research purpose. For example, the work of Loague and Freeze (1985) is constrained to small upland catchments because these represent the only scale with sufficient data for their purposes. Data collection is frequently constrained by logistical issues such as lack of funding or limited relevance to fields other than hydrologic modelling. In particular, Quick (1995) notes the value of both high and low elevation data for modelling orographic effects in mountainous watersheds. However, data collection stations are typically sited for proximity to communities and for easy maintenance access, and thus are usually confined to lower elevations. Eaton et al.
(2002) clearly state their opinion that the hydrometric network in general is "grossly inadequate" in the face of increasing global, national, and regional pressures on water resources. To be truly effective, data collection networks should be designed to be of representative location, density, process, and scale, with due regard for the purposes to which the data will be applied. In Canada and elsewhere, lack of funding often forces data collection agencies to reduce or rotate their monitoring networks. Gauge movement makes data analysis much more difficult, as the associated step-change may render pre- and post-relocation data irreconcilable as a single record (e.g., Hunter et al., 2002). Inhomogeneity can also be introduced into a hydrologic data series when one type of gauge is replaced by another (Sevruk, 1996). For example, Michaud and Sorooshian (1994) expect streamflow measurement accuracy to change by a factor of ten following replacement of broad-crested weirs with super-critical flumes. Other challenges arise where an experimental database spans administrative boundaries, since different governments and organizations commonly use different data collection methods with different standards (Sevruk, 1994). The resulting (potentially fragmented) data series cannot always be resolved to the single continuous records usually required for hydrologic simulation. This often means that gaps in the data are filled with tenuous estimates or arbitrarily-transposed data (e.g., Michaud and Sorooshian, 1994; Seibert, 2000). Industry and professional associations are beginning to discuss the need for alternate means of maintaining and increasing data collection networks (e.g., Kulkarni and Blais-Stevens, 2003).
Beven (2000) believes that one solution is for hydrologists to execute field programs as an integrated part of hydrologic modelling rather than simply adopting whatever data are available. Steiner et al. (1999) cite data redundancy as the key to quality control. For example, the authors suggest using clusters of at least three rain gauges within a few hundred metres to establish data quality. When redundant data are not available, it is commonly assumed that any available data are of sufficient quality for modelling. This assumption tends to persist even where proof to the contrary can be considered a foregone conclusion (e.g., Michaud and Sorooshian, 1994). However, data error and uncertainty often prove difficult to detect and quantify (Steiner et al., 1999). Steiner et al. (ibid.) present a clear example of data error that is only identifiable because the study was conducted on a highly-instrumented experimental catchment. They report that, for 30 storms over the watershed, average radar rainfall (as measured by four WSR-88 stations) is less than the average rain gauge total for 80% of the storms. For 45% of the storms, the underestimation is at least 20%, and for 30% of the storms, the underestimation exceeds 30%. Data uncertainties can arise from measurement errors, inconsistent or heterogeneous data, data handling and transcription errors, or non-representative sampling caused by temporal, spatial, or financial limitations (Binley et al., 1991; NRC, 2000b). Even in the absence of obvious mistakes, translation of the data into usable (often digital) format can create uncertainty. Faures et al.
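A redundancy check of the kind Steiner et al. (1999) advocate can be sketched as a simple screening routine. The function below is illustrative only: the 25% tolerance is an arbitrary placeholder rather than a published criterion, and a real quality-control scheme would also consider gauge exposure, timing, and event type.

```python
import statistics

def flag_suspect_gauges(cluster_readings, tolerance=0.25):
    """Screen a cluster of co-located rain gauges for suspect readings.

    Each reading is compared against the cluster median; readings that
    deviate by more than `tolerance` (as a fraction of the median) are
    flagged for manual review. Returns the indices of flagged gauges.
    """
    reference = statistics.median(cluster_readings)
    if reference <= 0:
        return []  # no rain recorded at the median gauge; nothing to compare
    return [i for i, r in enumerate(cluster_readings)
            if abs(r - reference) / reference > tolerance]
```

With three gauges reading 10.0, 10.4, and 5.0 mm, only the third is flagged; with 10.0, 10.4, and 9.6 mm, none are. The choice of the median as reference reflects the premise that a single malfunctioning gauge should not drag the comparison baseline with it.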
(1995) find that subjectivity in digitizing plotted precipitation data (e.g., operator selection of breakpoints) can lead to peak flow coefficients of variation for simulated outflow in the 3-5% range. Harlin (1992) presents another example, describing how a single precipitation event occurring over two calendar days could be artificially split in a daily data series, making model results difficult to reconcile with observed streamflow. Melching (1995) expands on the concept of measurement error, citing the possibility for equipment malfunction, non-representativeness of local conditions, and bias due to sampling location. Data uncertainty can also arise from a non-informative data set, where the data do not constitute a sufficient sample set for model calibration, validation, and evaluation (Sorooshian and Gupta, 1983). The earliest data collection programs relied exclusively on manual data collection. Although manual data collection has not disappeared altogether (e.g., the network of amateur meteorological stations across the U.S. (National Weather Service, 2003)), most data is now collected electronically. The introduction of technology into data collection yielded both benefits and drawbacks: electronic data can be collected in more detail, often at less cost, and in remote locations. However, many believe that there has been a corresponding loss of qualitative information and data richness, and in some cases, a loss of accuracy (Weiler, 2005). Electronic data sets collected before telemetry technology became available may be subject to additional uncertainties such as the potential for asynchronicity between precipitation and streamflow gauges (Loague and Freeze, 1985; Melching, 1995).
In general, much early data from automatic stations cannot be assumed to exist with the same amount of certainty as modern data. In most cases the historical levels of uncertainty are indeterminate. Any assumptions made about the nature of data error are yet another possible source of uncertainty (e.g., Duchon and Essenberg, 2001). Uncertainty can also be introduced through the processes of field sampling. For example, Beven (2000) shows that the hydraulic conductivity measured from a fist-sized soil sample in a laboratory, while correct for that sample, does not reflect the larger-scale hydraulic conductivity due to the dominance of larger-scale flow pathways in situ. More generally, he cautions that measurement techniques can be "invasive with the potential to change the response of the system by the very process of observation" (p. 192, Beven, 2002). The effect of such a disturbance may or may not be permanent. Michaud and Sorooshian (1994) observe a temporary change in streamflow volume residuals following the introduction of new flow measurement structures, likely indicating an adjustment of the sediment regime. The existence of data uncertainty (i.e., potential error) is often indicated by continual failure of a variety of models and calibrations to adequately represent the response of a catchment, or by abnormally poor model performance for a specific event or period of record. For example, Franchini and Pacciani (1991) report that all models in their comparison study have poor results for certain events. In their case study, the persistence of problems across a diverse set of models indicates that the problems likely originate in the data. Alternatively, Gupta et al.
(1999) compare results for three different calibrations of SAC-SMA, observing poor performance for all calibrations in four of the eleven years simulated. In this case, data uncertainty is a likely contributor, but model uncertainty may also play a role. In another example, Hornberger et al. (1985) report that almost all of the squared error in the model residuals occurs in the largest event of record. Closer examination of this event reveals a large discrepancy between measured precipitation and runoff volumes, leading the authors to conclude that the error is a result of uncertainty in the data. Perhaps the most important step in dealing with data uncertainty is to ensure that any preconceived expectations of model accuracy are realistic. Faures et al. (1995) caution against expecting the accuracy of model outputs to exceed the resolution of input data. Smith et al. (1994) believe that computer model results will not be any more precise than the repeatability of a controlled physical experiment. The implications of their statement are wide-ranging given the impossibility of repeatable large-scale physical experiments in hydrology. Generally speaking, the best result attainable is one in which data uncertainty is insufficient to affect the basic results and conclusions of a study (e.g., Michaud and Sorooshian, 1994). The acceptability threshold for data uncertainty is in part determined by the nature of the model being applied. For example, using the conceptual, quasi-distributed UBCWM, Lan (2001) finds that good precipitation data from nearby basins can generally be used in the study catchment if the elevation is preserved. However, the UBCWM is commonly applied in alpine regions with very scarce data, and thus must be fully calibrated for each study.
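The precipitation-runoff discrepancy that led Hornberger et al. (1985) to suspect their data suggests a simple volume-balance screen. The sketch below flags events whose runoff volume exceeds the precipitation volume, which is physically implausible for a rain-driven event without significant snowmelt or storage release; the event tuples and the threshold of 1.0 are hypothetical illustrations, not a procedure from any of the cited studies.

```python
def screen_event_volumes(events, max_runoff_ratio=1.0):
    """Flag events whose runoff ratio (runoff volume / precipitation
    volume, both in mm over the catchment) exceeds a plausibility limit.

    `events` is a list of (name, precip_mm, runoff_mm) tuples. Events
    with zero recorded precipitation but nonzero runoff are also caught,
    since their ratio is treated as infinite.
    """
    suspect = []
    for name, precip_mm, runoff_mm in events:
        ratio = runoff_mm / precip_mm if precip_mm > 0 else float("inf")
        if ratio > max_runoff_ratio:
            suspect.append((name, round(ratio, 2)))
    return suspect
```

A flagged event does not prove the data are wrong, but it identifies where closer examination of gauges and timing is warranted before calibration proceeds.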
One should not expect the same result for models that do not use calibration to attenuate any potential systematic aspects of data error. As discussed in Section 2.3, simple hydrologic models require simple inputs, often only the dynamic fluxes of temperature, streamflow, and precipitation. More complex physically-based models usually require more complex and comprehensive static data (i.e., physical properties). Both static and dynamic data are subject to uncertainty. However, the uncertainty in the dynamic data dominates in most cases, since measurements of catchment properties are used only for estimating values for their corresponding model parameters. Parameter uncertainty, in its turn, can be reduced through calibration, as discussed in Section 3.1.3. Below are some specific aspects of data uncertainty related to the two hydrologic fluxes that dominate the literature on data uncertainty - streamflow and precipitation. Streamflow data are generally considered to be more accurate than any other input data with the exception of temperature (Sorooshian and Gupta, 1995). Sorooshian and Gupta (ibid.) cite an expected overall accuracy on the order of ±10%, a figure substantiated by Faures et al. (1995). Burges (2002) estimates that the best stream gauging stations in the U.S. are likely to be accurate to within ±5%. Moore (2004) provides a similar estimate of accuracy for current metering approaches for streamflow measurement. Even the replicate rainfall events on an impervious surface studied by Wu et al. (1982) measure differences of approximately 4% in peak discharge (Smith et al., 1994). Because most hydrometric stations record water level and rely on a rating curve to convert measurements to streamflow, the changing accuracy of the rating curve over time is another source of uncertainty.
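The stage-to-discharge conversion and the accuracy figures cited above can be sketched as follows. This is an illustrative example only, not drawn from any of the studies cited: the power-law rating-curve form is standard practice, but the coefficient values and the ±10% band are hypothetical placeholders for station-specific values.

```python
# Illustrative sketch of a stage-discharge rating curve with an attached
# relative accuracy band. The coefficients a, h0, b are hypothetical; real
# values are fitted to discharge measurements at a specific station.

def rating_curve_discharge(stage_m, a=25.0, h0=0.30, b=1.8):
    """Convert water level (m) to discharge (m^3/s) via Q = a * (h - h0)**b."""
    if stage_m <= h0:
        return 0.0  # below the cease-to-flow stage
    return a * (stage_m - h0) ** b

def with_uncertainty(q, rel_error=0.10):
    """Return (low, best, high) discharge bounds for a relative error band."""
    return (q * (1.0 - rel_error), q, q * (1.0 + rel_error))

low, best, high = with_uncertainty(rating_curve_discharge(1.2))
```

A shifting channel after a large flood effectively changes a, h0, and b, which is why the band around peak-flow estimates is typically much wider than the ±10% used here.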
Since large floods often result in an altered channel geometry, it is not surprising that errors in measurements of peak flow events - in cases where the flood did not interrupt data collection completely - are typically much greater than the more general values cited above. The error variance of streamflow measurements and residuals is typically heteroscedastic and tends to increase as the flow gets larger (Sorooshian et al., 1983). Other methods for streamflow measurement can have their own unique sources of uncertainty. For example, Moore (2004) states that, under good conditions, salt dilution gauging can be accurate to within 5%. However, the accuracy of the method can be affected by the choice of mixing reach, dilution of the injection solution by rainfall, or the presence of vegetation, snow, or ice within the channel (Moore, 2005). Methods for peak flow estimation such as the slope-area method are sometimes used to augment flow records, particularly in situations where the data are known to be unacceptable. For example, the gauging station at Kickapoo Creek, Texas was destroyed during the 1994 flood. However, peak flow could still be estimated because a video documentary allowed estimation of stage levels against elevation benchmarks on a bridge (Smith et al., 1996). As expected, these kinds of estimates are generally subject to even greater uncertainty than is present in computer modelling; the USGS estimates for the flood peak on Kickapoo Creek have an uncertainty of ±15% (Smith et al., 2000). The various aspects of data uncertainty in streamflow records discussed above lead Burges (p. 284, 2002) to conclude that "any model that attempts to reproduce the measured hydrograph should include explicitly uncertainty bounds on the modelled streamflow hydrograph".
If data uncertainty can be said to dominate the literature on knowledge uncertainty in hydrologic modelling, then there is little doubt that uncertainty in precipitation plays the same dominant role amongst the components of data uncertainty. Burges (2002) notes the fundamental influence of precipitation, pointing out that systematic errors in the precipitation series preclude correct representation of the water balance, ultimately reducing the credibility of model output. Precipitation gauges are a well-studied source of measurement error and uncertainty. Steiner et al. (1999) find that all rain gauges within their study catchment operate correctly in only four of the thirty storms examined, and no single rain gauge functions correctly for 100% of the two-year period. They report biological fouling and human interference as common causes of malfunctions for tipping-bucket rain gauges (ibid.). Duchon and Essenberg (2001) observe that in cases of high rain intensity, the time-to-tip for a tipping-bucket gauge may become significant. This phenomenon generally is not addressed in the manufacturer's calibration, although after-market calibration is possible. For weighing-type gauges, gauge friction commonly causes the first response recorded during a rain event to occur later than the first tip in the tipping-bucket gauge. Similarly, a weighing gauge may register a slight increase in total precipitation hours after the end of the event that might not be captured by an event analysis. Thus, gauge friction is a common source of error for mechanical weighing gauges (ibid.). An equally important degree of uncertainty in precipitation measurement is related to representativeness.
Although the natural aspects of rainfall spatial variability are addressed in Section 3.1.1, one must also consider the ability of a measurement device to capture an accurate sample of local conditions at the gauge site. Even given moderately intense, uniform rainfall over a catchment with functioning, accurate rain gauges, there is no guarantee that results measured by the gauge will be truly representative of local conditions. Factors such as local topography, landscaping, and nearby buildings can have a strong influence. The initial wetting of the sides of the rain gauge bucket can also lead to underestimation of rainfall (Sevruk, 1996). Even the fall angle of precipitation with respect to the ground can have a non-negligible effect, since most hydrologic models assume that precipitation inputs are measured perpendicular to the ground surface. Experiments have shown, however, that the most significant cause of data error in precipitation measurement is wind (Larson and Peck, 1974). Despite its prevalence and potential significance, wind-induced losses are not typically accounted for in published precipitation data (Sevruk, 1996). Larson and Peck (p. 857, 1974) explain that "as the air rises to pass over the gage, precipitation particles that would have passed through the gage orifice are instead deflected downwind" by turbulence and increased wind speed. The resulting increases in wind speed can exceed 40% (Sevruk, 1996). In this way, high wind conditions can induce significant bias (e.g., Burges, 2002). The magnitude of precipitation undercatch can vary with factors such as wind protection, height above ground, wind speed, precipitation size, precipitation form, and gauge properties (e.g., shape, diameter, and orifice rim thickness).
Different types of wind shields are sometimes attached to precipitation gauges in an attempt to minimize wind-induced undercatch. Although models such as the Alter or Nipher wind shields have been observed to substantially reduce undercatch-related errors for snow, none completely eliminate precipitation undercatch (Duchon and Essenberg, 2001). Estimates of undercatch are themselves highly uncertain due to the difficulty in obtaining objectively "correct" baseline data. Steiner et al. (1999) advocate obtaining "baseline" data from gauges buried such that their aperture is contiguous with the ground surface. When appropriately protected against in-splash, such "pit" gauges represent the reference standard advocated by the World Meteorological Organization (Sevruk and Nespor, 1998). However, pit gauges are not widely used even in experimental catchments, and are impractical for measuring snowfall. Duchon and Essenberg (2001) utilize pit gauges to provide a baseline when estimating wind-induced undercatch for above-ground shielded and unshielded tipping-bucket and Belfort gauges. The authors conclude that it is impossible to establish a direct relationship between wind speed and undercatch, likely due to the absence of any drop-size information in their study. Various studies by Sevruk (e.g., Sevruk, 1996) and Sevruk and Nespor (e.g., Sevruk and Nespor, 1998; Nespor and Sevruk, 1999) have identified a threshold value for rainfall intensity that varies with wind speed: below this threshold value, wind-induced error increases quickly; above it, the increase in wind-induced error is much slower.
Larson and Peck (1974) cite a variety of studies reporting liquid-phase precipitation undercatch from 5% to 20%, with more severe estimates of 40% to 80% for snow. More recently, Sevruk (1996) states that wind-induced loss is generally between 2% and 15% for rain, and up to 80% for snow, with even higher values observed in mountainous areas. Sevruk and Nespor (1998) and Nespor and Sevruk (1999) refer to average wind-induced errors of 2%-10% for rain and up to 50% for snow. Duchon and Essenberg (2001) observe the following undercatch values as a percentage of total rainfall:
• Tipping-bucket above-ground gauges show approximately 4% undercatch with respect to the buried tipping-bucket gauge;
• Weighing-bucket above-ground gauges show approximately 5% undercatch with respect to the buried weighing gauge;
• In both cases, the reduction in undercatch realized by implementation of the Alter shield is less than 1%; and
• For extreme meteorological situations combining high wind and rainfall simultaneously (such as squall lines), the above figures increase to 15% undercatch for the tipping-bucket gauge, 14% undercatch for the weighing gauge, and a 3% reduction in undercatch using the Alter shield.
Two basic approaches have been used to explore the impact of precipitation uncertainty on hydrologic model simulations. The first utilizes a dense rain gauge network, comparing model results based on a subset of gauges to output obtained using the full network. This approach assesses the reduction in uncertainty associated with increasing gauge density, and can lead to insights for situations in which a dense network is not available.
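As a simple illustration of how undercatch figures of this kind translate into a correction, the sketch below scales a gauge total by an assumed catch ratio. The catch ratios are hypothetical round numbers chosen from the ranges quoted above; they are not values from any of the cited studies, and operational corrections would also condition on wind speed and precipitation form.

```python
# Hypothetical sketch of a gauge undercatch correction. The catch ratios
# are illustrative values in the ranges reported above, not values taken
# from the cited studies.

CATCH_RATIO = {
    ("tipping_bucket", "calm"): 0.96,    # ~4% undercatch
    ("weighing", "calm"): 0.95,          # ~5% undercatch
    ("tipping_bucket", "squall"): 0.85,  # ~15% undercatch in high wind + rain
    ("weighing", "squall"): 0.86,        # ~14% undercatch
}

def corrected_rainfall(gauge_mm, gauge_type, conditions):
    """Scale a measured rainfall total (mm) up by the assumed catch ratio."""
    ratio = CATCH_RATIO[(gauge_type, conditions)]
    return gauge_mm / ratio

# e.g., a tipping-bucket total of 48 mm in calm conditions corresponds to
# an estimated "true" rainfall of 48 / 0.96 = 50 mm.
estimate = corrected_rainfall(48.0, "tipping_bucket", "calm")
```

Note that the correction itself inherits the uncertainty of the catch-ratio estimate, which is exactly the baseline-data problem discussed above.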
The second approach compares on a relative basis results obtained with baseline precipitation data and results associated with a perturbed version of the baseline data. This second approach is analogous to sensitivity analysis and can provide quantitative insight into the impacts of uncertainty for any extant set of data. Following the second approach, Lumb and Linsley (1971) use mathematical augmentation of actual rainfall to understand the effects of small increases in precipitation on other hydrologic variables and processes. The effect of precipitation augmentation on annual streamflow volumes is found to vary (from zero to approximately 100% of the additional precipitation) with an increasing ratio of annual streamflow to annual precipitation. To this point, discussion of uncertainty in precipitation measurement has been confined to gauge-based observations. However, uncertainties in radar measurement can be equally or even more significant. Burges (2002) writes that, in many situations, a dense network of reliable gauge data will tend to be more useful than even the best quality radar measurements. Smith et al. (1996) conclude that radar precipitation measurements for the 1995 flood on the Rapidan River are approximately one-third of actual values. The authors link the underestimation to three factors: growth of rainfall through and below the radar beam, inappropriate parameters in the reflectivity conversion calculations, and an inappropriately low hail threshold (ibid.). As the latter two factors indicate, radar estimates are highly dependent on the method used for conversion from reflectivity to rainfall. Measurement error for snow is much greater than for rain, in many cases exceeding a factor of two (Krajewski, 2005).
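The perturbation approach described above - run the model on baseline precipitation, run it again on augmented precipitation, and compare the change in runoff volume to the change in input volume - can be sketched as follows. The "model" here is a deliberately trivial, hypothetical runoff-coefficient scheme, not any model from the cited studies; in a real conceptual model the recovered fraction would vary with storage state and the streamflow/precipitation ratio, as Lumb and Linsley observed.

```python
# A minimal sketch of precipitation-perturbation sensitivity analysis.
# The toy model and all numbers are hypothetical.

def toy_runoff(precip_mm, runoff_coefficient=0.6):
    """Toy model: a fixed fraction of each day's rainfall becomes runoff."""
    return [p * runoff_coefficient for p in precip_mm]

def volume(series):
    return sum(series)

baseline = [0.0, 12.0, 3.5, 0.0, 20.1, 8.4]   # hypothetical daily rainfall (mm)
perturbed = [p * 1.10 for p in baseline]      # uniform +10% augmentation

dv = volume(toy_runoff(perturbed)) - volume(toy_runoff(baseline))
dp = volume(perturbed) - volume(baseline)
sensitivity = dv / dp   # fraction of added precipitation appearing as runoff
```

For this linear toy model the sensitivity simply equals the runoff coefficient; the interest in real applications lies precisely in how far the response deviates from such linearity.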
In general, radar measurement does not perform well in mountainous regions (Weiler, 2005). Radar data are often adjusted using rain gauge data. Arguably the most common procedure is to correct for bias in the radar data by removing the average difference of radar and rain gauge measurements under the assumption that the rain gauge data are reliable (Steiner et al., 1999). However, while this procedure can validate otherwise questionable data, one must recall that both sets of data are uncertain and, in all likelihood, neither measurement is accurate. Steiner et al. (ibid.) show that important differences remain even when bias adjustment of radar data is performed using reliable rain gauge data. Uncertainty is compounded where gauge-adjusted radar data are assumed applicable beyond the immediate vicinity of the rain gauge.

3.1.3 Model Uncertainty

The cartoon shown in Figure 3-3 is a good starting point for a discussion of model uncertainty. In this case, the modeller attempts to use a simple coin-toss model to forecast the weather, which is the result of a complex and dynamic set of natural processes. Obviously, the model has the potential to be in error a large percentage of the time, and is therefore subject to substantial model uncertainty.
Some basic properties of the coin-toss model contributing to model uncertainty include:
• the population of model outcomes consists exclusively of "rain" and "shine", each having a long-term probability of 50%, whereas natural conditions for any given day may include rain, shine, or both, mixed in varying proportions;
• successive trials (i.e., coin tosses for each new day) are independent, while natural conditions typically exhibit cyclical behaviour (e.g., frontal systems); and
• the long-term probability of the model forecasting "rain" or "shine" for any given day does not change over time or space, while natural conditions might have seasonal or regional components (e.g., wet / dry seasons or humid / arid climates).
In the context of hydrologic modelling, Gan (1987) lists the simplifying assumptions built into the physically-based Smith-Hebbert model, a hillslope-scale model intended only for research purposes. These assumptions are arguably as unrealistic from a hydrologic perspective as the expectation that a coin toss can be used to predict weather patterns. Thus, substantial model uncertainty would be present if this research-oriented model were subjected to practical application.

[Figure 3-3: Model Uncertainty in Practice. Creators Syndicate Inc. © 1996 Leigh Rubin. Used by permission of Leigh Rubin and Creators Syndicate Inc.]
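The structural deficiencies listed above are easy to demonstrate numerically. The sketch below (an illustration constructed for this discussion, not taken from any cited study) scores the coin-toss model against a synthetic "true" weather series that, unlike the coin, is persistent from day to day; all probabilities are hypothetical.

```python
# Illustrative simulation of the coin-toss "weather model". The true
# weather is generated with hypothetical day-to-day persistence, which
# the independent coin-toss model is structurally unable to exploit.

import random

random.seed(1)  # fixed seed so the illustration is reproducible

def true_weather(n_days, persistence=0.8):
    """Weather where each day repeats the previous day with
    probability `persistence` - cyclical behaviour the coin lacks."""
    days = [random.choice(["rain", "shine"])]
    for _ in range(n_days - 1):
        if random.random() < persistence:
            days.append(days[-1])
        else:
            days.append("rain" if days[-1] == "shine" else "shine")
    return days

def coin_toss_forecast(n_days):
    """The cartoon's model: an independent 50/50 toss each day."""
    return [random.choice(["rain", "shine"]) for _ in range(n_days)]

observed = true_weather(10_000)
forecast = coin_toss_forecast(10_000)
hit_rate = sum(f == o for f, o in zip(forecast, observed)) / len(observed)
# The hit rate hovers near 0.5 however long the record: no amount of
# calibration data can repair a structurally deficient model.
```

A trivial "persistence model" (forecast tomorrow = today) would score near 0.8 on the same series, which is the point of the cartoon: the dominant error here is model uncertainty, not data or parameter uncertainty.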
Examples of simplifying assumptions include (Gan, 1987):
• the catchment is rectangular in area;
• a no-flow condition exists at the upper catchment boundary;
• drainage to the channel is perpendicular;
• all movement of water in the unsaturated zone is vertical;
• soils in each subsurface layer are homogeneous and isotropic;
• subsurface flow is parallel to the soil interface; and
• vegetation consists of uniform short grass, with rooting limited to the upper soil zone.
As is obvious from the examples above, model uncertainty exists where there is a gap between the assumptions, simplifications, processes, and mathematical representations that comprise a model and the natural processes that the model is attempting to replicate (Sorooshian and Gupta, 1983). It can manifest in different ways, including as a comprehensive inability to predict runoff accurately, even given correct parameters and input data, or as performance insensitivity to changes in relevant aspects of the model structure (e.g., Grayson et al., 1992a; Loague and Freeze, 1985). In particular, a large degree of model uncertainty is common when either natural variability is neglected or processes known to be active in situ are not considered (Song and James, 1992). For example, Franchini and Pacciani (1991) note that the TANK and SSARR models lack a direct surface runoff component, but are nonetheless able to produce acceptable simulations of historical runoff. Naturally, less detailed or lumped models are subject to a higher degree of model uncertainty (Gan and Biftu, 1996; Gan and Burges, 1990a). O'Connell and Todini (1996) believe that model uncertainty fundamentally results from an incomplete understanding of how various factors affect runoff at various spatial scales.
Beven and Feyen (2002) offer a differing perspective, concluding that our understanding of hydrologic processes outstrips our ability to quantify them over a broad study area. Klemes (1982) cautions that mathematical convenience has been known to take the place of scientific accuracy in defining a particular model structure, and that awareness of the arbitrary nature of the model fades away over time. These various viewpoints are not mutually exclusive. In general, some processes are well understood and are quantifiable, while others are understood but difficult to implement. Still others are effectively impossible to simulate due to limited knowledge or understanding. Bloschl (2001) suggests that modellers should identify the dominant processes that control hydrologic response under different environmental conditions and at different scales, and focus their efforts on providing good simulations of these processes. However, modellers may be tempted to focus on those processes most amenable to modelling rather than those that are hydrologically significant for the catchment. For example, ET can represent up to 80% of hydrologic activity in a basin but receives disproportionately less treatment in hydrologic literature and practice (Klemes, 1986a). Unsurprisingly, Gan and Burges (1990b) find that conceptual model predictive capability is poor in regions where ET accounts for a very significant proportion of the precipitation. Melching (p. 73, 1995) provides a relevant quote from Cornell (1972) that implies similar concerns, stating that "it is far better to have an approximate model of the whole problem than an exact model of only a portion of it".
Pre-defined methods, process models, or formulae used within a hydrologic model - such as those used for estimating ET - may have their own wide ranges of uncertainty (Binley et al., 1991). Many hydrologic models either calculate potential ET using previously-developed empirical formulae (e.g., Penman-Monteith) or require ET data as an input (e.g., Vrugt et al., 2003). The discussion of ET in Section 2.1.3 notes that the various estimation methods available are all subject to significant uncertainty. It is reasonable to consider this uncertainty to be irreducible, and thus the model uncertainty associated with this and other model components becomes subsumed into the model uncertainty of the larger hydrologic model. Uncertainties in model structure can affect the properties of output information (e.g., volume, peak flow) in different ways and to different degrees (Melching, 1995). Model structure uncertainty may also play a large role in determining the behaviour of uncertainty in calibration (Yapo et al., 1996). Model uncertainty is often assumed to be insignificant unless and until a given calibration is proved unsuccessful, with attention instead focussed on uncertainty in the data. However, there is potential for model error to be "substantially larger" than measurement error, and many believe that model uncertainty is currently limiting model performance (e.g., Gupta et al., 1998; Yapo et al., 1996). Analyzing model uncertainty is not as straightforward as analyzing uncertainty in input and output data. First and foremost, model errors will not necessarily have probabilistic properties that can be exploited to gain insight into the problem or potential solutions (Gupta et al., 1998). According to Singh and Woolhiser (p. 283, 2002), a lack of fundamental error analysis has limited our understanding of how different errors propagate through different model components and parameters. In evaluating model structure, general hydrologic laws are often self-evident but are difficult to verify due to uncertain data and the complex and variable nature of the system boundary conditions (Beven, 2002). When considering multiple models for a task, Beven (1989) recognizes model uncertainty as the potential for error in any model, but seeks to avoid those models that are conclusively in error. This philosophy, though potentially productive, must be applied with care. Although experimentation can identify errors in a given simulation, there is not necessarily any objective basis for attributing the error to model uncertainty. For example, Kuczera and Parent (1998) use Metropolis Sampling to identify apparent structural deficiencies in the CATPRO model. However, they do not eliminate the potential for data and parameter uncertainty, and therefore cannot declare with certainty that their observations are independent of other sources of error. At the other end of the continuum, attempts to show that model uncertainty is "negligible" are even less feasible, considering that such efforts are, in essence, verification, and are doomed to failure as discussed in Section 2.4.6. The above discussion is not meant to imply that quantitative studies of model uncertainty are of limited importance. Harlin (1992) provides a strong example to the contrary. He finds that three different but equally acceptable realizations of the runoff-response function in the HBV conceptual model result in design flood simulations for four Swedish watersheds having an average uncertainty on the order of ±20%.
Although the experiment does not represent the full extent of model uncertainty, it provides - at the very least - a good starting estimate. The bulk of literature examining model uncertainty tends to focus on runoff and flow mechanics. This is not surprising, since the ability to accurately model fluid flows already exists in the form of the Navier-Stokes equations. However, Beven (2002) emphasizes a fundamental difference between flow equations in hydrology and other fluid dynamics disciplines. In hydrology, local geometry and boundary conditions replace fluid mechanics as the dominant constraint on small-scale flow pathways. A classic example is the tendency toward flow channelization even at small scales, which can distort physical representativeness at the model element scale (Woolhiser, 1996). Therefore, one could argue that model uncertainty in flow-related components of hydrologic models results from our inability to comprehensively apply the Navier-Stokes equations at the required level of detail (Beven, 2002). The inability to relate actual boundary condition measurements and processes to even the smallest of element scales provides further evidence that the scale-dependence of model structures is a significant source of model uncertainty (Beven, 2001). The scaling of process dynamics, especially from laboratory to hillslope or catchment scale, is a contentious issue with surprisingly little scientific support. Many authors believe that the ability to perform calculations at the lumped catchment scale or even the model element or grid scale will continue to elude hydrologists for the foreseeable future (e.g., Beven, 2000; Bloschl, 2001). In the absence of an acceptable, scaleable model framework, simplifications are necessary.
Data uncertainty can be translated into additional model uncertainty where a lack of data requires further simplifying assumptions within the model structure. The conceptualization of overland flow as kinematic sheet flow is a common example, despite widespread documentation that kinematic sheet flow is rare in most hydrologic environments. In another example, Beven (2002) points out that Freeze and Harlan's blueprint for physically-based models is limited by its reliance on Darcian theory, which is not applicable for large scales and heterogeneous conditions. Since the suite of model parameters is usually specified by the model structure, it should not be surprising that model uncertainty can be difficult to separate from parameter uncertainty. Consider the case where a model of forested areas includes parameters for litter-layer storage and interception storage but does not model the seasonal variation therein (e.g., Hornberger et al., 1985). In this case, the static nature of the litter-layer parameters is determined by the model structure. Therefore, even if the parameter values are correct for much of the year, processes based on the time-invariant litter-layer parameters introduce model uncertainty into the results. Conversely, if the model structure permitted seasonal variation of the litter-layer parameters but the parameter values were indeterminate, this would be a case of parameter uncertainty. Model uncertainty can be easily mistaken for parameter uncertainty when there is no basis for choosing between parameter sets producing equally acceptable simulations (e.g., Grayson et al., 1992a). Sorooshian and Gupta (1983) present a systematic exploration of the SLS function response surface for the SMA-NWSRFS model.
When objective function values are plotted against two of the percolation parameters, the authors identify a long, flat valley in the response surface. The authors conclude that the valley is a product of the structural representation of the percolation subprocess. Problems of non-identifiability can sometimes be solved through re-parameterization of the appropriate equation; in this case, a simple re-parameterization resulted in a marked improvement in parameter identifiability (Gupta and Sorooshian, 1983). Model uncertainty may arise most commonly from process-related issues, but it can take other forms. Uncertainties can also arise from fundamental assumptions about how to interpret data and implement parameters within the model structure. Even the temporal increments and parameters chosen by the modeller can have a component of uncertainty. For example, modelling frequently disregards the physical basis for many time scales, mixing physical (e.g., day, season, year) and administrative (e.g., hour, week, month, decade) time intervals indiscriminately (Klemes, 1983). Arguably more important is the need to ensure that the modelling timestep is less than the runoff response time of the system. For example, the times taken for flood peaks on the Meuse and Rhine Rivers to reach the Netherlands are several hours and several days, respectively (van Hofwegen and Schultz, 1995). In this case, it is obvious that using a daily timestep model for flood forecasting on the Meuse River would be inappropriate. Less obvious is the answer to the question of whether or not an hourly timestep model is appropriate for the Rhine River basin.
The answer would depend on the actual observed response time, since reasonable estimation of the flood hydrograph logically requires a modelling time step considerably smaller than the time of concentration. Harlin's study of extreme floods on six watersheds in Sweden using the HBV model is an example of the common practice of either assuming or failing to document that the chosen timestep is appropriate (Harlin, 1992). He uses a daily timestep without explaining how this timestep is selected, and whether it is, in fact, an appropriate choice.

3.1.4 Parameter Uncertainty

Given alternate sets of parameter values, a model may or may not achieve a good simulation of the historical data set. In this context, parameter uncertainty refers to the uncertainty associated with identifying a set of parameter values that the user believes best simulates in situ processes. Harlin and Kung (1992) illustrate the effects of parameter uncertainty by demonstrating substantial correlation between increased variance in model output and decreased parameter accuracy. As noted in the preceding section, parameter uncertainty is closely related to model uncertainty in the sense that parameter definitions are a function of the chosen model structure. Accordingly, it is reasonable to expect that, if model uncertainty can be eliminated, then issues of parameter uncertainty and identification will become more tractable (Beven, 2001). However, every application of every model will still require user-defined parameters; thus, parameter uncertainty is and will undoubtedly continue to be a significant contributor to overall model predictive uncertainty.
The US National Research Council (NRC, 1995) notes that parameter uncertainty often takes a continuous format (e.g., through statistical functions like the probability density function (pdf) and cumulative distribution function (cdf)). Conversely, model uncertainty may involve "distinct and mutually exclusive choices" (ibid.). The NRC (1994) cautions that indiscriminate combinations of different types of uncertainty (e.g., averaging predictions from "equally probable" but distinct models) can yield results inconsistent with any of the alternative models. Loague and Freeze (1985) contend that parameter uncertainty can have two distinct aspects, one related to natural variability and the other to knowledge uncertainty. The first, addressed in Section 3.1.1, deals with the potential for a poor representation of the areal distribution of catchment characteristics, and is strongly influenced by the innate variability of the catchment and the degree to which the model is distributed in nature. The second is a result of parameter interdependence, implying that the best set of parameter values may be unidentifiable due to non-uniqueness and correlation between parameters. Kuczera and Parent (1998) caution modellers against naive reliance on uniquely determined parameter values, the pursuit of which has historically commanded much attention in hydrologic literature. More recently, modellers are beginning to assess parameter uncertainty on a broader basis. Studies conclude almost universally that no single point in the parameter space can be uniquely defined as the global optimum, and in some cases, there may not even be a well-defined globally-optimum region (Vrugt et al., 2003).
This aspect of parameter uncertainty has considerable implications for distributed modelling, since a calibrated parameter value must have small variance if it is to be compared with measured counterparts (ibid.). To fully address parameter uncertainty, such comparisons must be extended across multiple field measurements to ensure that the uncertainty distribution of the model parameter is representative of in situ measurement variability.

The crux of most ambiguity and identifiability problems is that the descriptions of process dynamics contained in the model structure are of a substantially higher order than typical observations of external system dynamics (Beck, 1987). The inability to accurately estimate parameters during calibration is a typical indicator that the structural complexity of the model cannot be resolved against the calibration data set (Hornberger et al., 1985). Grayson et al. (1992a) compare the case of an overparameterized model to a high-degree polynomial with too many degrees of freedom. Arguably, any model could simulate any flow series - regardless of scientific merit - by simply adding a sufficient number of parameters. The authors demonstrate this by using several combinations of parameter values to produce similar output series (ibid.). Hornberger et al. (1985) present similar results, using a broad range of parameters to generate "optimal" results. For the reasons provided above, parameter uncertainty becomes very significant in the presence of substantial parameter complexity and nominal calibration data (Beven, 1989; Loague and Freeze, 1985; Jakeman and Hornberger, 1993).
Jakeman and Hornberger (1993) go so far as to recommend that streamflow-calibrated models be limited to the half-dozen parameters that are mathematically required to represent the outflow series. Gan et al. (1997) caution that, despite growing awareness, over-parameterization could become more common as a result of advances in computing technology and optimization theory. Of course, parameter uncertainty can be reduced through the use of multiple data sets for calibration (Bergstrom et al., 2002; Burges, 2002; Jakeman and Hornberger, 1993). This approach is discussed further in Section 3.3.3.

Interdependence between a number of parameters implies that non-optimal or deviant parameter values can compensate for each other. In many cases, this makes it difficult to associate parameter uncertainty with individual parameters. Model performance is almost always associated with parameter combinations rather than individual parameter values; a given value for a particular parameter may result in either good or poor model performance depending on the values of other parameters in the model (Beven, 2000; Harlin and Kung, 1992). A set of calibrated parameters represents merely one combination that allows a particular model and calibration approach to produce results similar to the observed response. Parameters may only have meaning within the context of their particular parameter set and model, and are unlikely to work equally well with a different model or in a different parameter set (Beven, 2000, 2001; Binley et al., 1991). Although the literature indicates that transfer of parameter values between models or hydrologic, climatic, or geographical regions may be possible in some situations, it must be undertaken carefully.
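The compensating behaviour described above can be made concrete with a deliberately extreme toy example (not drawn from the thesis; the model, parameter names, and numbers are all invented): when two conceptual loss parameters enter the model only through their product, streamflow data constrain the product but not the individual values, so different parameter pairs calibrate equally well.

```python
# Toy demonstration (invented model and numbers): two "loss" parameters that
# enter the model only through their product cannot be identified
# individually from streamflow alone -- an extreme case of compensation.

def simulate(rain, canopy_factor, soil_factor, k=0.5):
    """Effective rainfall = rain * canopy_factor * soil_factor,
    routed through a linear reservoir with rate constant k."""
    storage, flows = 0.0, []
    for p in rain:
        storage += p * canopy_factor * soil_factor
        q = k * storage          # outflow proportional to storage
        storage -= q
        flows.append(q)
    return flows

rain = [0, 10, 25, 5, 0, 0, 0]
q_a = simulate(rain, canopy_factor=0.9, soil_factor=0.4)  # product = 0.36
q_b = simulate(rain, canopy_factor=0.6, soil_factor=0.6)  # product = 0.36
# The two hydrographs coincide (to floating-point rounding), so calibration
# against flow alone cannot distinguish the two parameter sets.
```

Here the non-uniqueness is exact by construction; in real conceptual models the compensation is usually only partial, which is precisely what makes it difficult to detect.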
In general, parameter values mean very little if removed from their original context.

Some research has explored the viability of applying "average" values for less sensitive parameters and focussing calibration and field measurement on those parameters to which the model is highly sensitive. Melching et al. (1990) conclude that the bulk of parameter uncertainty lies in calculating the quantity of runoff; if this is properly estimated, "average" routing parameters should suffice to calculate the peak and timing of the outflow. Micovic (1998) investigates the use of average values for a subset of UBCWM parameters determined to have small variability across a range of catchments in British Columbia. He concludes that reasonably reliable simulations are possible using constant values. The use of "average" parameters may simplify simulations where only a rough estimate is required. However, the apparent limitation on parameter uncertainty must be interpreted with respect to the following:

• Micovic (1998) notes a 5% decrease in model efficiency when moving from case-specific to average parameters. This may not be insignificant; in some cases, even a 1% improvement can be quite difficult to achieve (e.g., Lan, 2001);

• Performance may not be as consistent on an event-by-event basis using average parameter values as it can be over the long term. Therefore, applying this approach for flood forecasting applications might not be advisable;

• "Averaging" of parameters can introduce systematic errors in model output for individual applications.
For example, of the twelve basins reviewed by Micovic (1998), five consistently overestimate peak flows, while five others consistently underestimate peak flows; and

• The applicability of "average" parameter values must be considered carefully in the context of their potential dependence on other parameter values, catchment variability, and the prevailing hydrologic, climatic, or geographic region.

Merz and Bloschl (2004) present a more in-depth examination of parameter regionalization for the HBV model. They report that using a global parameter set (in this case, the mean calibrated parameter values for 308 catchments in Austria) results in generally poor performance. In reviewing the literature, they find that most case studies report low correlations between model parameters and catchment attributes. Although their study finds that regionalization methods based on spatial proximity perform significantly better than regionalization based on catchment attributes, their results imply that there is an upper limit of appropriateness for the spatial regionalization of model parameters. Merz and Bloschl (2004) also propose exploring parameter uncertainty by cross-comparing the results from independent calibrations against two halves of a split-sample data series. In comparing their results to other studies in the literature, they suggest that uncertainty for a given set of parameters can have a significant degree of dependence on the catchment being studied and the characteristics of the available data.

Hornberger et al. (1985) investigate two methods to reduce parameter uncertainty. In the first case, as above, insensitive parameters are fixed at their median values.
Although this leads to greater parameter stability, they find that objective function results are significantly inferior to those obtained with the full calibration. This procedure, while common, is obviously not ideal for this particular application (ibid.). In the second case, parameters are eliminated rather than arbitrarily fixed. This results in a much better defined minimum on the response surface, although the objective function is still inferior to the value obtained with the full calibration.

Takyi (1991) relates parameter interdependence to parameter uncertainty in a more general context, noting that disregard of parameter correlation can lead to gross underestimation of model predictive uncertainty during later uncertainty analyses. For example, Song and Brown (1990) find that the standard deviation for predicted dissolved oxygen deficit is 20-40% larger for correlated inputs than for uncorrelated inputs. Harlin and Kung (p. 211, 1992) point out that the difficulty of studying the interaction between three or more parameters is exacerbated by the inability to graph a response surface. They note that the shape of any two-parameter response surface will depend on the values of the other parameters.

Parameter uncertainty is perhaps most obvious when ostensibly static, single-value model parameters are observed to have a non-negligible dependence on the flow sequence and climatic data used for calibration. Gan and Burges (1990b) present one such example. Other examples are noted by Woolhiser (1996), who observes parameter sensitivities that are dependent on both basin scale and rainfall magnitude, and Arnaud et al. (2002), who find it difficult to resolve the "physical interpretation" of any parameter if it is found to be sensitive to the rainfall pattern.
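A minimal way to expose this kind of dependence on the calibration data is a split-sample comparison in the spirit of Merz and Bloschl (2004): calibrate the same parameter independently on two halves of the record and compare the estimates. The sketch below uses an invented rainfall-runoff record and a hypothetical one-parameter model (Q = c · P).

```python
# Split-sample sketch (invented data): calibrate a single runoff
# coefficient c in the model Q = c * P independently on each half of the
# record, then compare the two estimates.
rain = [5, 12, 3, 20, 8, 15, 6, 10]
flow = [2.1, 5.0, 1.1, 8.6, 3.2, 6.2, 2.3, 4.1]   # invented "observations"

def fit_c(p, q):
    """Least-squares estimate of c for Q = c * P."""
    return sum(pi * qi for pi, qi in zip(p, q)) / sum(pi * pi for pi in p)

c1 = fit_c(rain[:4], flow[:4])   # first half of the record
c2 = fit_c(rain[4:], flow[4:])   # second half of the record
# A large relative difference between c1 and c2 would signal that the
# "calibrated" value is partly an artifact of the period used.
```

For this invented record the two estimates agree closely; with real data and multi-parameter models, period-to-period differences can be substantial, as the studies cited above report.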
Gupta and Sorooshian (1983) point out that while the noted uncertainty could potentially be parameter uncertainty, the cause could equally be related to identifiability problems in the model structure, or to calibration data that do not adequately "activate" the relevant process. In general, dependence of parameter values on calibration data is often an indicator that changes to the model structure are necessary (Weiler, 2005).

Another common indicator of parameter uncertainty arises when the parameters of a "best fit" model assume values at or very near the extremities of their feasible range. Franchini et al. (1998) suggest that this phenomenon is due to undesirable compensation among parameters induced by restrictive specifications of the feasible parameter space. Preliminary investigations into the effect of widening the specified parameter ranges by Hogue et al. (2000) yield inconsistent and inconclusive results.

Of the three components of knowledge uncertainty, parameter uncertainty is arguably the most amenable to statistical analysis. For subjective assessments of parameters, a mean value can be used to represent the "best guess" of the parameter value and a variance can be assigned corresponding to the level of confidence in that guess (Vicens et al., 1975). Examples of statistical representation of parameter values abound. Binley et al. (1991) calibrate the IHDM against five storm events, using the resulting five sets of parameter values to estimate statistical properties for each parameter. Garen and Burges (1981) study model predictive uncertainty using the idea that Coefficients of Variation for the uncertain parameters can be estimated and used to describe the parameter variability. Such estimation is subjective, especially for parameters lacking a physical interpretation.
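The statistical treatment described above can be sketched as follows, using invented values for a single parameter calibrated against five separate storm events (in the spirit of Binley et al., 1991); the mean, sample standard deviation, and Coefficient of Variation then summarize the parameter's uncertainty.

```python
import statistics

# Hypothetical calibrated values of one model parameter from five
# independent storm-event calibrations; the numbers are invented.
recession_k = [0.42, 0.38, 0.51, 0.45, 0.40]

mean = statistics.mean(recession_k)
stdev = statistics.stdev(recession_k)   # sample standard deviation
cv = stdev / mean                       # Coefficient of Variation

print(f"mean={mean:.3f}, stdev={stdev:.3f}, CV={cv:.2%}")
```

With only five events the estimates are of course rough; the point is that the spread, not just the central value, carries information about how well the parameter is constrained.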
Methods for estimating parameter uncertainty are much more defined than for model or even data uncertainty, and are discussed in Section 3.3.

3.2 Uncertainty and Calibration

It is unlikely that any hydrologic modeller could obtain a good simulation of observed data without recourse to calibration. The challenge of streamflow prediction in ungauged basins remains a major focus for research (e.g., IAHS, 2005). Even physically-based (i.e., field-measured) parameter values are subject to many problems and typically require some "adjustment" to improve performance (Binley et al., 1991). Therefore, the following discussion should not be viewed as being entirely limited to calibration-dependent conceptual and empirical models.

The most important aspect of uncertainty in calibration deals with the blurring of the different types of uncertainty discussed in Section 3.1 (e.g., data, parameter, and model uncertainty). Lei and Schilling (1996) highlight the widespread presumption that calibration can resolve data and model structure uncertainty as well as parameter uncertainty. While this is obviously not the case, the premise is not entirely groundless. Madsen (2000) points out that although calibration should ideally only affect parameter uncertainty, it may also compensate for errors in other areas of the simulation. The resulting complementary errors can generate unbiased output that compares favourably with the observed data (Melching, 1995; Melching et al., 1990). It would be more correct to state that model calibration partially compensates for uncertainties and errors in data and model structure by changing parameter values to force the model output to better fit the observed data.
In this way, calibration translates some of the extant model and data uncertainty into parameter uncertainty. For example, a realistic model cannot resolve biased input data (e.g., rainfall undercatch) against unbiased calibration data (e.g., streamflow) without artificially distorting parameter values (Beck, 1987; Burges, 2003). Conversely, if a model is able to reproduce an erroneous, non-representative or otherwise uncertain series of observed data, there is likely significant uncertainty in the model structure or parameter values. For example, Franchini and Pacciani (1991) find that a hydrologic model known to omit relevant processes can nonetheless be calibrated to obtain an acceptable simulation of observed streamflow. Madsen et al. (2002) advise that, if data and model uncertainty are not otherwise addressed, the goal of calibration implicitly shifts from the minimization of parameter uncertainty to the balancing of compensating errors. Lei and Schilling (1996) suggest the use of "preliminary" uncertainty analysis to separate events and models with substantial uncertainties or errors in data and model structure from those that can properly benefit from calibration.

In essence, calibration adjusts parameters in an attempt to reproduce the observed transformation of a specific input series into a specific output series. Therefore, given the non-repeatability of hydrologic events, each subsequent application of a calibrated hydrologic model cannot help but involve extrapolation or interpolation from or between response modes observed during calibration. The potential pitfalls of extrapolation are relatively well-known, and are addressed in an applied context in Section 3.4. However, even interpolation is not always acceptable.
Klemes (1986a) presents a classic example of the dangers of interpolation; he describes how if one man takes four hours to load a truck, two men take two hours, and four men take one hour, an observer might correctly assume that the loading process requires a total of four man-hours. However, it would be fundamentally inaccurate to interpolate that the same truck could be loaded over three hours by one and one third men.

The assumption that calibrations are transferable beyond the calibration period, while often valid, is sometimes tenuous. Gan et al. (1997) observe that a model calibrated against wet years tends to be biased toward over-estimation when validated against dry years, and toward under-estimation in the reverse case. In many cases, the level of uncertainty introduced during model application is related to the extent by which the nominal conditions differ from those for which the model was calibrated. For example, Gan (1987) finds that extreme flow forecasts generated by the SAC-SMA model are significantly better for surface-flow dominated watersheds than for those dominated by sub-surface processes. He contends that this is due to a better simulation of the processes involved in an extreme runoff event (i.e., overland flow) during calibration (ibid.).

The U.S. National Research Council (p. 44, NRC, 2000b) notes that uncertainty can arise from "an inability to understand the objectives that society holds important or to understand how alternative projects or designs should be evaluated". While this description is highly generalized, it nonetheless captures the idea that there is uncertainty associated with the choice of objective for evaluating a situation. Measurement of hydrologic model performance during calibration is no exception. Boyle et al.
(2000) caution that preemptive selection of a single criterion for calibration can potentially predispose the calibration process toward an inappropriate result. Insight about the nature of the various alternative objective-dependent response surfaces in the region of the final solution is necessary for quantifying uncertainty in model predictions (Kuczera, 1997).

A true set of globally-optimum parameter values should be independent of calibration data. However, in many cases the act of calibration implicitly relates the two (Gan and Biftu, 1996). The inability of calibration procedures to locate globally optimal parameter estimates with confidence translates into uncertainty regarding the accuracy of the model (Duan et al., 1992). Gupta and Sorooshian (1983) illustrate why achieving a good objective function value alone is not sufficient evidence to conclude that the calibration is successful. They prove that the use of additional data can alter the response surface such that only the global optimum point is shared by all potential distributions of local optima. Therefore, a "successful" model calibration that identifies only a local optimum could produce poor results when applied to other data sets (ibid.). This hypothesis is supported by studies of multi-objective calibration, which demonstrate that a "successful" calibration against one parameter is not necessarily associated with "successful" simulation of other aspects of the watershed (e.g., Bergstrom et al. (2002); Seibert (2000); Seibert and McDonnell (2002)).

Local optima may result from model or data uncertainty, the nature of the objective function itself, or any combination thereof. Their distribution may be scattered, clustered, or curvilinear.
Through a process of exhaustive gridding, Duan et al. (1992) identify as many as 55 local optima when considering two parameters of the SIXPAR model, a simplified version of SAC-SMA. The number of local optima increases substantially as noise is introduced into the calibration time series, and exceeds 500 for several cases when considering a three-dimensional parameter subspace. The authors conclude that the response surface is quite flat in the vicinity of the global optimum. One particular set of parameter values is far from the "true" parameter values but has a function value virtually indistinguishable from the point nearest the true solution (ibid.). Even a "global optimum" set of parameter values may not be correct; it is possible that a near-optimal point (as measured by the selected criterion) may be more representative of in situ processes than the mathematically-optimal answer. The choice between near-optimal alternative parameter values can be highly subjective and is the subject of some discussion in Section 3.3, most notably with regard to the concept of equifinality (Section 3.3.6).

The problems with calibration are well-documented, as evidenced by the dominant focus on model calibration within the literature of the past few decades. But researchers are now beginning to ask what might be gained by setting aside studies of calibration in favour of exploring the root causes of uncertainty in modelling. This perspective aligns itself with the argument that progress in modelling - at least in terms of uncertainty reduction - must come from a better understanding of hydrology, and not from more intensive manipulation of the little knowledge we have (Klemes, 1982; Melching et al., 1990).
3.3 Techniques for Exploring Uncertainty

Results from most hydrologic models are deterministic in nature, consisting of a single value or series of values presented without alternatives or probabilities. Beck (1987) contends that precise results from uncertain models offer a misleading sense of precision. This "arbitrary precision" may be acceptable in specific circumstances, but there are many cases in which an analysis of the accompanying uncertainty is critical. Scarce resources may restrict further investigation to factors or processes identified as important at the expense of other potentially important factors (NRC, 1995). A high degree of uncertainty may itself be sufficient to influence or bias sensitive decisions. Beven (2001) notes that an assessment of predictive uncertainty improves the likelihood of success while providing mitigation in the event of error or miscalculation. Uncertainty analysis can also lead to insights concerning the value of additional information and thereby allow researchers to focus their efforts more effectively (NRC, 1995).

Simple solutions to the problem of how to reduce or quantify uncertainty have been proposed, including using a longer data series for calibration, reducing the dimensionality of a model by eliminating insensitive parameters, and utilizing different evaluation criteria. However, these approaches attempt to gain more information from the same knowledge and do not address fundamental issues. A more comprehensive approach is obviously required. The U.S. National Research Council (NRC, 1995) outlines two approaches for quantifying uncertainty.
The first, being confidence intervals, expresses uncertainty in terms of the probability with which repeated sampling is expected to yield outcomes within a given interval around the "true" solution (ibid.). Confidence intervals are the most commonly used method of describing sampling uncertainty, and are frequently employed even when they are not substantiated by the larger context of the problem. For example, confidence bands based on standard error are included in many flood frequency analyses, despite the fact that differences between predictions from alternative distributions can be of a much greater magnitude.

The NRC's second approach for considering uncertainty employs Bayesian statistics to generate probabilistic outcomes (NRC, 1995). The Bayesian approach treats model parameters as probabilistic variables whose pdfs represent the likelihood of each parameter assuming different values. Output probability distributions ("posterior" pdfs) are the product of two quantities: a likelihood function, which incorporates available calibration data, and a "prior" pdf which represents the modeller's a priori knowledge. In the end, the promise of this approach in quantifying uncertainty is not fulfilled because the likelihood functions and prior estimates are still subject to the issues raised in the first two sections of this chapter.

In general, statistically-based methods of uncertainty analysis suffer from the same fundamental lack of physical basis noted for empirical and black box models in Section 2.3.1. One must also be aware that statistical representation of uncertainty can be misleading in cases where some aspects of uncertainty are not included either by design or omission.
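The mechanics of the Bayesian update can be sketched on a coarse grid for a single uncertain parameter. Everything in this example is assumed for illustration: the one-parameter model (output = 2·theta), the Gaussian error model, the prior, and the "observations" are all invented.

```python
import math

# Grid-based Bayesian update sketch: posterior ∝ likelihood × prior.
grid = [0.1 * i for i in range(1, 21)]        # candidate parameter values

def prior(theta):
    # Modeller's a priori belief: Gaussian centred on 1.0 (assumed).
    return math.exp(-0.5 * ((theta - 1.0) / 0.5) ** 2)

def likelihood(theta, obs=(1.9, 2.2, 2.0)):
    # Invented calibration data, assuming model output = 2 * theta
    # with independent Gaussian errors (sigma = 0.3).
    return math.prod(
        math.exp(-0.5 * ((y - 2.0 * theta) / 0.3) ** 2) for y in obs
    )

unnorm = [likelihood(t) * prior(t) for t in grid]
posterior = [w / sum(unnorm) for w in unnorm]  # "posterior" pdf on the grid
best = grid[posterior.index(max(posterior))]   # posterior mode
```

In practice the difficulty lies exactly where the text places it: choosing a defensible likelihood function and prior, not in the mechanics of the update itself.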
For example, a 5% probability of exceedance determined from frequency analysis does not generally make allowance for the effects of climate change; the actual probability of exceedance could vary substantially from the "accepted" value.

Subsequent sections of this chapter summarize a number of the more widely-accepted techniques for managing uncertainty, including sensitivity analysis, reliability analysis, threshold techniques, equifinality, and uncertainty isolation. These approaches are by no means mutually exclusive. Madsen (2000) demonstrates the potential for different perspectives on uncertainty to lead to different insights. He evaluates the performance of the NAM model for single- and multi-objective calibrations using overall volume error, overall RMSE, average RMSE for peak flow events, and average RMSE of low flow events. Figure 3-4 shows the variability in parameter values obtained by considering a set of solutions for single-objective calibration (based on a balanced objective function) whose objective function values are all within 1% of the optimum value. In contrast, Figure 3-5 shows the variability in parameter values obtained by considering the Pareto set of optimal parameter values obtained through a multi-objective analysis of peak flow RMSE and low flow RMSE.

Figure 3-4: Normalized Parameter Sets for a Single-Objective Optimization where Objective Function Values Differ by < 1%. The bold line indicates the optimum parameter set. (from p. 286, Madsen, 2000)

Figure 3-5: Normalized Pareto Optimal Parameter Sets for a Multi-Objective Optimization. Full and dashed bold lines indicate parameter sets associated with the five smallest RMSE values for peak flow and low flow, respectively. (from p. 284, Madsen, 2000)

Figures 3-4 and 3-5 are equally valid representations of different aspects of uncertainty for Madsen's application of the NAM model (Madsen, 2000). The importance of both perspectives to an analysis of uncertainty is obvious: the first approach provides insight into the parameter uncertainty (i.e., uniqueness) of the "optimal" solution for a single objective, while the second provides insight into parameter uncertainty across multiple objectives. A full uncertainty analysis would need to consider and account for both approaches.

3.3.1 Sensitivity Analysis

The most well-known and widely-used tool for exploring uncertainty in hydrology is sensitivity analysis. The USACE (1992) defines sensitivity analysis as "the systematic evaluation of the impacts on project formulation and justification resulting from changes in key assumptions underlying the analysis". For the purpose of characterizing model predictive uncertainty, sensitivity analysis typically takes on a more specific meaning. Takyi (1991) defines sensitivity analysis as an investigation of how different aspects of the model contribute to model predictive uncertainty, expressed as the rate at which the model output changes with variations in the uncertain component.

Naturally, the criteria used to evaluate changes in the model output must be appropriate for the subject of the analysis. Sensitivity analysis is most commonly used to identify the parameters that have the greatest impact on model performance. This is done by evaluating how model output changes with variations in the "best" set of parameter values (McCuen, 1973; Takyi, 1991). Typically, parameter values in the "best" parameter set are varied one-by-one in small increments, with all other parameters fixed.
In the same fashion, sensitivity analysis can also be used to identify "insensitive" parameters. Gan and Burges (1990b) present an example, noting that a wide variety of values could be used for selected parameters in the Sacramento (SAC-SMA) model without affecting their simulation results. Lastly, sensitivity analysis can form the basis for a more intensive analysis; for example, sensitivity analysis can be used to identify parameters that affect model results in predetermined ways as a preliminary step to a more thorough investigation of those parameters.

The benefits of the qualitative and quantitative information gained from sensitivity analysis are fairly intuitive. Information related to the effects of parameter variation can be of considerable value in deriving relationships between parameter values and basin or storm properties (McCuen, 1973). Arguably more important is the role of sensitivity analysis in determining the confidence with which the user or decision-maker regards the model results. Sensitivity analysis effectively provides an estimate of the consequences of error in the analysis; confidence is increased if results are sensitive only to those parameter values or data series that are known with a high degree of certainty.

One major advantage of sensitivity analysis is its simplicity; it can be performed with relatively little effort as part of a larger study. If applied exhaustively across all dimensions of uncertainty, sensitivity analysis would no doubt prove to be a strong tool for characterizing uncertainty. However, in its most commonly-applied format, it examines the sensitivity of the model to only one factor at a time.
This limitation implies that sensitivity analysis alone, while useful, cannot provide a complete description of model predictive uncertainty.

Users of sensitivity analysis techniques should be aware of several caveats. Firstly, sensitivity analysis assesses parameters under the often-invalid assumption that they are independent. The outcome of such an analysis may not reflect any correlations or interactions between parameters, and may therefore be misleading or invalid (Madsen, 2003). McCuen (p. 43, 1973) emphasizes that "the sensitivity of one factor depends, in the general case, on the magnitude of all factors of the system". The increase in model predictive uncertainty arising from independent perturbation of a single factor can far exceed that arising from the same perturbation accompanied by variations in other parameters. In some cases, the incautious variation of a single parameter value can result in a parameter set that is no longer appropriate for simulating the in situ conditions.

Performing sensitivity analysis for a calibrated model can add further complexity. In such cases, the analysis is limited to reflecting the uncertainty of the parameter near its "optimal" value and does not provide information about the overall importance of the parameter within the full parameter space (Takyi, 1991). Any attempt to reduce the dimensionality of the calibration problem based on the results of sensitivity analysis should proceed with caution and an eye to parameter correlations essential to model behaviour (Madsen et al., 2002). In seeking to develop a process-oriented calibration scheme for the HBV model, Harlin and Kung (1992) find that different calibration methods result in different "optimal" parameter sets.
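The one-at-a-time procedure discussed above can be sketched for a toy two-parameter model (the model, parameter names, and values are invented): each parameter is perturbed individually about its "best" value while the others are held fixed, and the relative change in a chosen output (here peak flow) is recorded.

```python
# One-at-a-time sensitivity sketch (invented toy model and parameters).

def peak_flow(params, rain=(0, 10, 25, 5, 0, 0)):
    """Peak outflow of a linear reservoir fed by effective rainfall."""
    storage, peak = 0.0, 0.0
    for p in rain:
        storage += p * params["runoff_coeff"]
        q = params["recession_k"] * storage
        storage -= q
        peak = max(peak, q)
    return peak

best = {"runoff_coeff": 0.4, "recession_k": 0.5}   # "best" parameter set
base = peak_flow(best)

sensitivity = {}
for name in best:
    perturbed = dict(best, **{name: best[name] * 1.10})  # +10% perturbation
    # Relative change in peak flow per parameter, others held fixed.
    sensitivity[name] = (peak_flow(perturbed) - base) / base
```

By construction the parameters are varied independently, so the result says nothing about joint perturbations; this is exactly the limitation the caveats above describe.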
They conclude that the application of sensitivity analysis would not be meaningful since it is restricted to analysis of variation near a single "optimal" parameter set. The implicit argument against sensitivity analysis for calibrated models is that the results of the analysis can differ from one calibration to the next.

From another perspective, Beven (1989) questions the application of sensitivity analysis in deterministic simulation because probability is not paired with each outcome. The results of a sensitivity analysis essentially comprise non-commensurate results which must be assessed qualitatively. Without an accompanying assessment of the probability for variation in each parameter value, one must be aware of Melching's observation that insensitive but uncertain parameters may have a greater influence on the reliability of model output than highly sensitive parameters that are known with certainty (Melching, 1995). Where probability is paired with each aspect of a sensitivity analysis, the end result begins to take the form of a reliability analysis, discussed in the next section.

3.3.2 Reliability Analysis

It is often desirable to associate a probability or likelihood with the output of a hydrologic model as a means of assessing its reliability. Melching et al. (1990) believe that any approach for estimating reliability should be systematic, unbiased, flexible, consistent, simple, and comprehensive (i.e., accounting for all potential sources of uncertainty). While the realization of all of these goals simultaneously is obviously unattainable at present, several existing methods can provide a probabilistic distribution of uncertain results rather than a single deterministic estimate.
For the purposes of this work, these methods are collectively labeled methods of reliability analysis. Reliability analysis requires either a stochastic model or stochastic treatment of a deterministic model, as well as descriptions of variability for all basic quantities of interest (e.g., data and parameters). Reliability analyses for hydrologic models typically estimate individual quantiles such as peak flow in the form of a cumulative distribution function or cdf (i.e., as the likelihood F(x) that the peak flow for a flood will be less than or equal to x). Representing model results as a cdf allows the modeller a means of visually assessing the uncertainty in the simulation (Melching et al., 1990). In the limiting case of absolute certainty, the cdf and pdf converge to step and spike functions, respectively, such that a single unique answer is associated with 100% probability. Therefore, provided the model is representative, a steep cdf with minimal curvature at the extremes is an indicator of good reliability (i.e., fairly certain results) (ibid.). Conversely, a broad cdf of mild slope is an indicator that the results are highly uncertain or unreliable.

Reliability analysis usually addresses uncertainty at a more fundamental level than sensitivity analysis, and is therefore less concerned with isolating the impact of any single factor on the modelled results. Therefore, reliability analysis escapes most of the potential pitfalls discussed in the preceding section. However, other challenges take their place. For example, one must be aware of the potential for parameter values randomly selected on an individual basis to combine into unrealistic parameter sets.
Simply removing any unrealistic scenarios from consideration after they have been identified is a poor solution, since this compromises the relationship between the set of "acceptable" alternatives and the prior probability distributions of their components. If, for example, parameters are found to be correlated, these relationships must be explored and, if necessary, incorporated into the analysis (e.g., Garen and Burges, 1981).

While resultant probabilities from a reliability analysis must, by definition, sum to unity, they almost never reflect the full extent of model predictive uncertainty. Reliability analyses rarely address any uncertainties associated with basic assumptions made during the formulation and execution of the model. Reliability analyses also rarely account for any indeterminate uncertainties that do not lend themselves well to mathematical representation. Modellers should respect the context of any analysis by documenting which aspects of uncertainty are included in their analysis, and which are not addressed. In the absence of a clear explanation of context, probabilistic representation of results could conceivably generate a disproportionate level of confidence in results (e.g., Song and James, 1992).

Monte Carlo Simulation (MCS) is generally the most accurate approach for reliability analysis; in particular, Takyi (1991) cites the advantages of MCS in developing the statistical properties of the aggregate model output, which in turn characterize the uncertainty in the model response.
The MCS approach typically involves multi-thousand evaluations of a performance function such as Z = Qmax − Qsimulated, where Qsimulated is the peak flow obtained from a deterministic model with a unique, randomly-selected combination of model, data, and parameters. The probability that the peak flow will be less than Qmax is approximated by the frequency with which the free variable Z is greater than zero. By considering a range of values for Qmax, one can construct the cdf F(Qsim ≤ Q) for all values of Q.

The main criticism levelled at MCS-based reliability analysis has historically involved its computational intensity. Subsequent advances in computing technology have led to more advanced investigations of uncertainty based on extensive random sampling from input distributions. However, it is widely recognized that MCS relies on the assumption of purely random sample selection. In most cases, random numbers are generated via one of many possible computer algorithms, and there is potential for the assumption of randomness to be violated. The user should be well aware of the famous quote attributed to John von Neumann, one of the fathers of MCS, which states that "anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin". MCS has been widely accepted as the best method for reliability analysis over the past decade. However, alternatives exist in the form of less accurate but computationally simple methods of approximate solution. Naturally, their results are subject to additional uncertainty, which can be explored through comparison against results obtained via MCS (e.g., Tung, 1990).
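A minimal sketch of the MCS reliability procedure might look as follows. The stand-in deterministic model and the input distributions are invented for illustration, and Python's default pseudo-random generator stands in for whatever sampling scheme a real study would need to justify.

```python
import random

# Monte Carlo reliability sketch: approximate F(Q) = P(Qsim <= Qmax) by
# the frequency with which Z = Qmax - Qsim is non-negative.  The
# deterministic "model" and the input distributions are illustrative.

def simulated_peak(rng):
    """Stand-in deterministic model run with randomly drawn inputs."""
    c = rng.uniform(0.3, 0.5)        # runoff coefficient (uncertain parameter)
    i = rng.gauss(25.0, 4.0)         # rainfall intensity (uncertain input)
    a = 12.0                         # basin area (assumed known)
    return c * i * a

def cdf_estimate(q_max, n=20000, seed=1):
    """P(Qsim <= q_max) ~ fraction of realizations with Z = q_max - Qsim >= 0."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n) if q_max - simulated_peak(rng) >= 0.0)
    return hits / n

# Sweeping q_max over a range of values traces out the cdf.
for q in (80.0, 120.0, 160.0):
    print(f"F({q:.0f}) ~ {cdf_estimate(q):.3f}")
```

Plotting the swept values of F against Q gives the cdf whose steepness, as discussed above, indicates the reliability of the simulation.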
Melching (1995) outlines several approximation methodologies including Mean-Value First-Order Second-Moment (MFOSM) and Advanced MFOSM, the Point Estimation Methods of Rosenblueth and Harr, and Latin Hypercube Sampling. Brief discussion herein is afforded to the most common of these, being first-order approximation. Also, some discussion of more advanced methods involving Markov Chains and the Metropolis Algorithm is merited, as they represent great potential for exploring model predictive uncertainty and are likely to grow in popularity.

First-order methods for reliability analysis are computationally straightforward, requiring only the first and second statistical moments of the input variables. These methods use a linear approximation of the performance function, usually through truncation of a Taylor series expansion at the mean point of all uncertain variables. The linear approximation is used to estimate the expected value and variance of the performance function. The system reliability index β can then be calculated as the inverse of the coefficient of variability for the performance function (i.e., β = μZ / σZ), and related to probability by P(Qsim ≤ Q) = Φ(β), where Φ( ) is the standard normal integral.

First-order methods involve several assumptions concerning the nature of the problem that can potentially result in misleading results. Most importantly, the linear approximation becomes less accurate with both increasing non-linearity in the system and increasing distance from the mean values, and can yield incorrect results (e.g., g(E[x]) ≠ E[g(x)]) (Harlin and Kung, 1992). Garen and Burges (1981) compare the results of first-order approximations with results obtained through MCS.
They confirm that first-order analyses underestimate means and standard deviations, as expected given the truncation of the Taylor Series expansion. The authors conclude that where first-order analysis is known to be a poor approximation of MCS, its results are likely subject to an unacceptable degree of uncertainty (ibid.).

Monte Carlo Markov Chain (MCMC) algorithms are growing in popularity as a means for assessing uncertainty in parameter values. MCMC algorithms, which began as models of physical systems seeking a state of minimal free energy, are a more advanced form of sampling that use both expert knowledge and simulation history to probabilistically guide Monte Carlo selection. Vrugt et al. (2003) explain that MCMC methods use stochastic techniques to successively explore solutions throughout the parameter space, updating their component probability distributions as they progress. Rather than seeking a probabilistic description of reliability based on a priori or unknown prior distributions, MCMC algorithms attempt to define the appropriate probability distribution for each parameter. Conventional expressions of reliability (e.g., 90% confidence interval for streamflow) are calculated from the posterior parameter distributions. By far the most common instance of MCMC is the Metropolis-Hastings algorithm (Hastings, 1970). Vrugt et al. (2003) present a concise outline of the Metropolis-Hastings process. Kuczera and Parent (1998) use the Metropolis algorithm to investigate parameter uncertainty for a conceptual hydrologic model.
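The Metropolis acceptance rule at the heart of these algorithms can be sketched as follows. The single-parameter target density is an invented stand-in for a real model's likelihood surface, so this is a sketch of the sampling mechanics rather than a hydrologic application.

```python
import math
import random

# Metropolis sketch: random-walk sampling of a single-parameter
# posterior.  The unnormalized target density (a Gaussian centred on
# 2.5) is a stand-in for a calibrated model's likelihood surface.

def target_density(theta):
    """Illustrative unnormalized posterior for one model parameter."""
    return math.exp(-0.5 * ((theta - 2.5) / 0.6) ** 2)

def metropolis(n_samples, start=0.0, step=0.5, seed=7):
    rng = random.Random(seed)
    theta, samples = start, []
    for _ in range(n_samples):
        proposal = theta + rng.gauss(0.0, step)
        # Always accept a more probable proposal; accept a less probable
        # one with probability equal to the density ratio, by comparing
        # the ratio against a uniform draw from [0, 1].
        ratio = target_density(proposal) / target_density(theta)
        if ratio >= 1.0 or rng.random() < ratio:
            theta = proposal
        samples.append(theta)
    return samples

chain = metropolis(20000)
burn_in = chain[5000:]              # discard the warm-up portion of the chain
mean = sum(burn_in) / len(burn_in)
print(f"posterior mean ~ {mean:.2f}")
```

Quantiles of the retained chain give conventional expressions of reliability such as a 90% confidence interval for the parameter.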
The key to the adaptive ability of the algorithm is that it will always accept a solution of higher probability, but will also accept less probable solutions with a frequency obtained by randomly sampling from the interval [0,1] (ibid.). It is important to preserve the ergodicity of the Markov Chain (i.e., its representation of low-probability areas) throughout MCMC processes. In this way, the intent of MCMC techniques contrasts with most other automatic calibration techniques, which typically abandon "sub-optimal" regions of the parameter space in favour of regions of higher probability.

Vrugt et al. (2003) combine the Metropolis algorithm with the SCE-UA algorithm to form the hybrid Shuffled Complex Evolution Metropolis algorithm, abbreviated SCEM-UA. The SCEM-UA algorithm utilizes the demonstrated power of the SCE-UA algorithm while avoiding the tendency to collapse into a single "best" region of attraction. The authors find that parameter values identified as most probable in the SCEM-UA simulation are virtually identical to the unique solution generated by the SCE-UA method. Therefore, it is reasonable to assume that the SCEM-UA method can replace the two-step process of identifying a "global" optimum and then applying other techniques to investigate parameter uncertainty. The authors illustrate the utility of the SCEM-UA method by using its results to generate confidence limits for flood hydrographs (ibid.).

The results of an MCMC algorithm can be plotted as a series of hydrographs showing the range of solutions associated with a selected confidence level. For example, Vrugt et al.
(2003) show that the band of hydrographs associated with the 95% confidence region does not include the observed records. They conclude that this bias is a strong indicator of uncertainty or error in the model structure. There is great potential for such techniques to contribute to our understanding of model predictive uncertainty. However, most relevant MCMC techniques are relatively recent, and the independence of their description of uncertainty has not been fully verified.

3.3.3 Multi-Objective Analysis

A large number of hydrologic model studies have focussed on the pursuit of a single "best" solution for the calibration problem. However, a single "best" solution is often unattainable. Gupta et al. (1998) explain that the multi-objective nature of model calibration makes any single-objective calibration necessarily subjective. Yapo et al. (1998) caution that even a carefully-chosen objective function can still fail to adequately measure important characteristics of, and differences between, observed and simulated data series. Much of the subjectivity in calibration results from the lack of an unambiguously correct way to minimize the length of an error vector containing non-commensurable components (Bastidas et al., 2001). And ultimately, even a perfect simulation of the observed data series of interest (e.g., an observed hydrograph) is not necessarily a robust indicator of an appropriate model for a system (Seibert and McDonnell, 2002).

If a modeller concludes that acceptable model performance for all required objectives cannot be obtained from a single unique solution, the recourse is multi-objective analysis. A multi-objective solution comprises the set of all non-inferior alternatives in the feasible space (Gupta et al., 1998; Yapo et al., 1998). Revelle et al. (p.
104, 1997) define a solution as non-inferior "if there exists no other feasible solution with better performance with respect to any one objective, without having worse performance in at least one other objective". Commonly-used synonyms for non-inferiority include dominance, efficiency, and Pareto optimality, after Italian economist Vilfredo Pareto (Revelle et al., 1997; Yapo et al., 1998).

The key assumption of multi-objective calibration is that the resulting non-inferior solutions collectively present a more reliable description of catchment behaviour than can be attained using any one single objective (Seibert, 2000). The concept of a set of non-inferior results can be compared with the set of calibrations that would result from a team of experts performing independent manual calibrations, where each expert uses a unique combination of knowledge, experience, insights, and priorities to generate equally good but different solutions. Because of the additional validation inherent in the process, models subjected to multi-objective calibration are typically considered more robust internally than their single-objective counterparts. For a complete multi-objective analysis, residual uncertainty would arise only from concerns of subjectivity in selecting objectives and from statistical sampling issues. However, including alternative model structures and data sets in a multi-objective analysis would be difficult at best, and would likely result in the identification of discrete solutions rather than subjective trade-offs. For this reason, the bulk of the work in multi-objective analysis has focused on parameter uncertainty.
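The non-inferiority definition above translates directly into a dominance test. In this sketch, each candidate parameter set is reduced to a vector of objective values (assumed to be minimized); the candidate values themselves are purely illustrative.

```python
# Pareto (non-inferiority) filter: a solution is non-inferior if no
# other feasible solution is at least as good in every objective and
# strictly better in at least one.  Objectives are minimized here; the
# candidate error values are invented for illustration.

def dominates(a, b):
    """True if objective vector `a` dominates `b` (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_set(candidates):
    """Return the non-inferior subset of a list of objective vectors."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates if other != c)]

# Each tuple is (peak-flow error, volume error) for one parameter set.
candidates = [(0.20, 0.90), (0.35, 0.40), (0.60, 0.25), (0.70, 0.70), (0.25, 0.85)]
print(pareto_set(candidates))
```

Only the fourth candidate is dominated (the second is better in both objectives); the remaining four are mutually non-inferior and would all be retained as "equally good but different" solutions.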
Other dimensions of model predictive uncertainty are better addressed using the more philosophical techniques described in subsequent sections of this chapter. Yapo et al. (1998) suggest that further research into multiple-objective calibration is best directed toward understanding how to select the set of objective functions, the sensitivity of results to the number of objective functions, and the amount of data used in optimization.

The different "objectives" of multi-objective analysis need not conform to their familiar context as numerical measures of similarity between observed and simulated streamflow. A listing of various alternative "objectives" might include:

• different response characteristics (e.g., peak flows over threshold or annual water balance) (Boyle et al., 2000; Madsen, 2003);
• different model state variables or processes (e.g., groundwater level, chemical data, snowpack) (Bastidas et al., 2001; Bergstrom et al., 2002; Madsen, 2003);
• distributed or multi-site measurements (e.g., runoff measured at points other than the catchment outflow) (Madsen, 2003);
• multiple sets of calibration data (e.g., different events for an event-based model) (BC Hydro, 2004; Binley et al., 1991); and
• the use of both "hard" and "soft" (e.g., spotty, discontinuous, numerically approximate, or qualitative) data (Seibert and McDonnell, 2002; Merz and Blöschl, 2004).

Many objectives currently in use are themselves mathematical composites of multiple objectives; for example, the EOPT! statistic commonly used for evaluating the UBCWM sums the Nash-Sutcliffe efficiency E! and total volume error. Merz and Blöschl (2004) demonstrate the potential for improved performance using a compound objective.
Their objective function combines runoff efficiency, volume error, and penalty functions for three kinds of "soft" data (expected parameter distributions, snow accumulation, and moisture accumulation). Using this compound objective function results in a decrease in E! for calibration compared to the conventional calibration against runoff. However, performance during validation improves, indicating a better representation of the catchment hydrology (ibid.). Nevertheless, one could argue that a single quantitative combination of multiple objectives is still a single measure; it is not definitively "multi-objective" in the sense that it does not provide multiple alternative solutions.

As for multi-objective research in other fields, hydrologic model studies have shown that trade-offs in performance between members of the non-inferior set can be significant (e.g., Boyle et al., 2000; Loague and Freeze, 1985). Seibert (2000) finds that a multi-objective calibration will tend to simulate a single variable less accurately than a dedicated calibration. However, he contends that, for the right model, the decrease in performance should be small and will most likely be offset by improvements in the simulation of other variables. He goes on to demonstrate substantial improvement in groundwater level simulations (from R² = 0.313 to R² = 0.834) accompanied by a relatively minimal decrease in the quality of runoff simulation (from E! = 0.879 to E! = 0.834) when conventional calibration is expanded to include groundwater level data. Bergstrom et al. (2002) reach similar conclusions using snow depth, depth to the water table, runoff, and ¹⁸O groundwater content, as do Seibert and McDonnell (2002) using both "hard" and "soft" runoff and groundwater data.
Seibert and McDonnell (ibid.) also note that their multi-objective calibration reduced parameter uncertainty by an average of 60% compared to the corresponding conventional single-objective calibration.

While near-unilateral improvement in objectives is possible, objectives and Pareto solutions can also be mutually exclusive. This situation means subjective preference must be used to choose between the various objectives and solutions (Lohani et al., 1997); the modeller must decide whether the required reduction in some aspect of performance is justified by the better representation of others (Seibert and McDonnell, 2002). In such cases, the non-inferior set is often plotted in two- or three-dimensional objective space to graphically represent the trade-offs between objectives (Revelle et al., 1997). Where trade-offs are significant, unacceptable trade-offs and impractical solutions are often eliminated as a preliminary step in paring down the non-inferior set (e.g., Boyle et al., 2000).

Analysis of the non-inferior set can reveal characteristic model or parameter properties and behaviours. This information can sometimes be used to guide structural improvements to the model itself (Boyle et al., 2000). For example, a parameter exhibiting small variability across the Pareto set is usually important to overall model behaviour, since parameters to which the model is highly sensitive will implicitly be estimated with greater accuracy (Madsen, 2003). Conversely, the Pareto set of parameter values may cover a large percentage of the feasible parameter space; in one case, Madsen (ibid.) observes Pareto parameter values varying by over 50% of their feasible range.
A high degree of variability across the non-inferior set usually indicates that related aspects of the model require further study. Gupta et al. (1998) plot all the hydrographs associated with the non-inferior set on a single graph to show "the space of possible hydrograph solutions" to the multi-objective problem. This type of plot is referred to herein as a Pareto hydrograph. Using a Pareto hydrograph, one can subjectively assess the fit between the set of non-inferior solutions and the observed data, as well as evaluating whether the fit can be improved through further calibration. This makes the Pareto hydrograph a good tool for identifying data error or model non-representativeness. Based on the plot shown in Figure 3-6, Yapo et al. (1998) conclude that the Pareto hydrograph for their multi-objective analysis does not bracket the observed data for all low-flow events.

Figure 3-6: Hydrograph Ranges Associated with a Pareto Solution Set (from p. 94, Yapo et al., 1998)

As for the MCMC calibrations of Vrugt et al. (2003) discussed above, the systematic nature of the deviations observed in Figure 3-6 is an indicator of possible model error. Respecting the non-commensurate nature of the results, the authors do not attempt to quantitatively combine the Pareto solutions. The final product is in fact the set of non-commensurate hydrographs, which collectively provide a qualitative description of uncertainty.

The most common approach to generating a non-inferior set is to separate the multi-objective calibration into a set of single-objective optimizations. This approach has the advantage of including in the Pareto set the "best" solution for each individual objective (Bastidas et al., 2001).
The process can be computationally prohibitive, given that applied engineering problems may have hundreds of thousands of feasible extreme points and tens of possible objective functions (Revelle et al., 1997; Yapo et al., 1998). For this reason, Madsen (2003) advocates pursuing a good estimate of the Pareto frontier rather than an exhaustive identification of the full Pareto set. Breaking the multi-objective problem down into a series of single-objective optimizations requires that each optimization be assigned a unique objective function. These objectives can be defined using one of two possible "generating techniques". The two methods are generally referred to as the constraint method and the weighting method.

The constraint method optimizes the objectives one by one, with all other objectives held at pre-selected values. The Pareto set is populated by successively optimizing each objective while iteratively varying the values of the constrained objectives. This method is most efficient when information on desired outcomes can be used to guide the selection of constraint values for each objective (Revelle et al., 1997). The constraint method yields only an approximation of the non-inferior set, since the a priori selection of constraint limits may preclude the identification of the extreme points of the feasible region in objective space.

The weighting method additively combines multiple objective functions using variable weighting factors to form a "grand objective function" (Revelle et al., 1997). To balance the grand objective function, transformation constants are applied to normalize the magnitudes of the component objective functions before the weighting factors are considered (e.g., Madsen, 2000).
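A sketch of the weighting method follows. The two component objectives, the transformation constants, and the crude random-search "optimizer" are all illustrative stand-ins for a real calibration problem.

```python
import random

# Weighting-method sketch: combine two component objectives into a
# "grand objective" with normalizing transformation constants, then
# sweep the weights to trace out candidate Pareto solutions.  The
# objectives and the random-search "optimizer" are illustrative only.

def peak_error(p):          # illustrative objective 1 (minimize)
    return (p - 2.0) ** 2

def volume_error(p):        # illustrative objective 2 (minimize)
    return (p - 4.0) ** 2

# Transformation constants normalize objective magnitudes so neither
# term dominates the sum purely through its units or scale.  Both
# objectives here happen to share a scale, so the constants are unity.
T1, T2 = 1.0, 1.0

def grand_objective(p, w):
    return w * T1 * peak_error(p) + (1.0 - w) * T2 * volume_error(p)

def crude_minimize(f, lo, hi, n=20000, seed=3):
    """Toy optimizer: best of n uniform random trials on [lo, hi]."""
    rng = random.Random(seed)
    return min((rng.uniform(lo, hi) for _ in range(n)), key=f)

for w in (0.0, 0.25, 0.5, 0.75, 1.0):
    p = crude_minimize(lambda x: grand_objective(x, w), 0.0, 6.0)
    print(f"w={w:.2f}: p*~{p:.2f} (peak={peak_error(p):.2f}, volume={volume_error(p):.2f})")
```

Each weight value yields a different compromise solution; collecting the solutions across the weight sweep approximates the non-inferior set, subject to the coarseness cautions noted below.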
The non-inferior set is then obtained by repeatedly optimizing the grand objective function while systematically varying the weighting factor assigned to each objective. The weighting method is often more amenable to hydrologic modelling than the constraint method because of its suitability for automatic methods of optimization (Madsen, 2003).

Revelle et al. (1997) present two cautions pertaining to the weighting method. Firstly, if the variation of the weights is too coarse, the weighting method could fail to identify the full set of non-inferior solutions. Some solutions (e.g., "gap point solutions") may not be found with any combination of weights. Secondly, caution must be used in interpreting alternate optima. Not all alternate optima found while solving for an individual objective will be non-inferior; the non-inferior solutions must be identified by adding infinitesimal weights to the other objectives (ibid.).

Gupta et al. (1998) propose an alternate approach for optimizing multi-objective problems in the form of a dedicated algorithm called Multi-Objective COMplex evolution (MOCOM-UA). The authors contend that MOCOM-UA provides an effective and efficient estimate of the non-inferior solution space in a single run, without resorting to subjective weightings (ibid.). The MOCOM-UA method is based on an extension of the SCE-UA algorithm, combining controlled random search, competitive evolution, Pareto ranking, and a new strategy for multi-objective downhill simplex searches.
The various strategies of "population evolution" produce rapid convergence without sacrificing global search ability, making the approach ideal for solving multi-objective problems such as hydrologic model calibration (Yapo et al., 1998). Comparisons to more standard methods show that MOCOM-UA provides a fairly consistent approximation to the Pareto set.

3.3.4 Generalized Sensitivity Analysis

Generalized Sensitivity Analysis (GSA) differs from the above techniques by attempting to identify factors governing model behaviour rather than quantitatively exploring the feasible region. GSA was originally developed by Spear and Hornberger (1980) as an attempt to identify key parameters in water quality models, but has since diversified into other fields, including runoff prediction. The GSA method uses MCS to randomly sample parameter values from a specified distribution, repeating the process for a number of simulations. The user then classifies each simulation in terms of whether or not it mimics known system "behaviour". Finally, tests of statistical similarity are used to determine whether parameter values associated with the "behaviour" and "non-behaviour" classes are statistically different. Naturally, additional steps are required if input parameters or data are correlated (Takyi, 1991).

Advantages of the GSA approach include simplicity, flexibility, and enforced declaration of arbitrary assumptions. However, it may be difficult to examine behavioural aspects of multivariate joint distributions. Also, the apparent ease of applying GSA may mask the need for underlying rigorous analysis of model hypotheses (Beck, 1987). Hornberger et al.
(1985) demonstrate that GSA can introduce significant subjectivity through arbitrary selection of the "behavioural" and "non-behavioural" thresholds. Harlin and Kung (1992) also note the lack of a clear limit separating acceptable and unacceptable simulations. Seibert and McDonnell (2002) suggest that qualitative information and other forms of "soft" data could be of use in defining appropriate thresholds.

Widespread use of GSA in model-building is due to its ability to quickly and easily classify the relevance of certain parameters or processes for achieving the desired result (Beck, 1987). As such, it is most useful in cases where model structure uncertainty is dominant. For this reason, GSA has seen more application in water quality modelling than hydrologic modelling, since even the processes and parameters of environmental models are often unknown (Takyi, 1991).

A variety of techniques for sample generation, classification of behaviour, and statistical comparison have been used within the GSA framework. Because of its simplicity, it is easily implemented into studies of uncertainty in conjunction with other techniques. Hornberger et al. (1985) present a simple analysis which effectively demonstrates the GSA approach with multiple objectives. After several hundred simulations, they visually compare parameter value distributions for the lowest and highest 30% of objective function values. Significant differences between the distributions indicate that the model is sensitive to that parameter, as measured by the corresponding objective function. Certain parameters exhibit consistent trends across objective functions, clustering into the same sub-range of the parameter space for all "good" simulations. The "good" cluster regions for other parameters differ markedly from one objective function to the next (ibid.).
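The basic GSA loop can be sketched as follows. The toy model, the behaviour threshold, and the simple comparison of class medians (standing in for a formal test of statistical similarity such as Kolmogorov-Smirnov) are all illustrative choices.

```python
import random
from statistics import median

# Generalized Sensitivity Analysis sketch: sample parameter sets at
# random, classify each simulation as "behaviour" / "non-behaviour"
# against a threshold, then compare parameter distributions between the
# two classes.  The toy error function is built so that it is sensitive
# to k and nearly insensitive to s.

def model_error(k, s):
    """Stand-in simulation error for a parameter pair (k, s)."""
    return abs(k - 0.6) + 0.02 * abs(s - 5.0)

rng = random.Random(11)
behaviour, non_behaviour = [], []
for _ in range(5000):
    k = rng.uniform(0.0, 1.0)       # a parameter the model responds to
    s = rng.uniform(0.0, 10.0)      # a parameter the model barely responds to
    target = behaviour if model_error(k, s) < 0.15 else non_behaviour
    target.append((k, s))

medians = {}
for idx, name in ((0, "k"), (1, "s")):
    medians[name] = (median(p[idx] for p in behaviour),
                     median(p[idx] for p in non_behaviour))
    print(f"{name}: behaviour median {medians[name][0]:.2f}, "
          f"non-behaviour median {medians[name][1]:.2f}")
```

The "behaviour" class clusters k near its true value while the two classes are nearly indistinguishable in s, which is the signature GSA uses to separate governing parameters from irrelevant ones.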
Beven and Binley (1992) take GSA a step further with their Generalized Likelihood Uncertainty Estimation (GLUE) procedure. The GLUE procedure uses GSA results to produce quantitative estimates under uncertain conditions. It proposes that all "non-behavioural" results be identified and discarded. For the remaining "behavioural" points, the objective used to evaluate behaviour is normalized such that its sum over all "behavioural" points is unity. Where predictions are required, simulations are run with each of the "behavioural" parameter sets. A weighted mean estimate and qualitative probability bounds for the variable of interest can then be calculated.

3.3.5 Equifinality

Limiting an investigation of model predictive uncertainty to a single predetermined model or group of parameters artificially constrains the scope of the work and, as such, can preclude a full exploration of the problem (Braben, 1985; Welles, 1984). Researchers such as Caissie and El-Jabi (2003) advocate exploring all available approaches (e.g., models) rather than exhaustively investigating any single method. The concept of "equifinality" proposed by Beven (2000) champions this view. The basic premise of equifinality is that no model that works acceptably well can be rejected; rather, each acceptable model should be viewed as "equifinal" (i.e., non-inferior) to all other acceptable models. Equifinality is less a quantitative approach than a conceptual philosophy. For example, traditional statistical theory prefers to minimize the risk of Type I error (rejection of a potentially true hypothesis) while accepting Type II error (acceptance of a potentially false hypothesis). In the same sense, the concept of equifinality minimizes the risk of rejecting a good model while accepting the risk of retaining a poor one. Equifinality rejects the pursuit of a single "optimal" model as both impossible and undesirable (Beven, 2001).
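The normalize-then-weight step at the heart of the GLUE procedure can be sketched as follows. The likelihood values and peak-flow predictions below are invented purely for illustration; they do not come from any of the cited studies.

```python
# Hypothetical (likelihood, prediction) pairs for behavioural parameter sets,
# e.g. a goodness-of-fit score and a simulated peak flow in m^3/s.
behavioural = [(0.82, 410.0), (0.64, 455.0), (0.91, 395.0),
               (0.55, 520.0), (0.73, 430.0)]

total = sum(lik for lik, _ in behavioural)
weights = [lik / total for lik, _ in behavioural]  # normalized to sum to unity
preds = [q for _, q in behavioural]

# Weighted mean estimate of the variable of interest.
weighted_mean = sum(w * q for w, q in zip(weights, preds))

def weighted_quantile(pairs, frac):
    # pairs: (weight, value); returns the value at cumulative weight >= frac.
    cum = 0.0
    for w, v in sorted(pairs, key=lambda p: p[1]):
        cum += w
        if cum >= frac:
            return v
    return max(v for _, v in pairs)

# Qualitative probability bounds from the weighted empirical distribution.
pairs = list(zip(weights, preds))
lo, hi = weighted_quantile(pairs, 0.05), weighted_quantile(pairs, 0.95)
print(f"weighted mean = {weighted_mean:.1f}, 5-95% bounds = [{lo}, {hi}]")
```

The bounds here are "qualitative" in exactly the sense used above: they depend on the subjective likelihood measure and behavioural cutoff, not on a formal probability model.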
Beven (2000) explains that, for a given catchment, many models are often compatible with available data and observed processes; there is no basis for removing any such model from consideration. For example, in a study of eight different runoff-response functions for the HBV conceptual model, Harlin (1992) concludes that most of the functions could be calibrated to simulate floods as well as the original. It is reasonable to expect that equifinal models will be scattered across the possible range of models and parameter sets. While all equifinal models will have some commonality of function, being in some sense similar to the real catchment, some predictive uncertainty will be irreducible (Beven, 2002). The greatest practical difficulty in applying equifinality lies in the conceptual and computational intensity required to identify the suite of behavioural models (Beven, 2000). Specifically, establishing an exhaustive model space is conceptually difficult, and identifying behavioural models in the model space requires significant computational effort. While computationally intense, identifying an equifinal set of models and parameter combinations is conceptually straightforward, and is similar to the GSA procedure. Like the GLUE approach, equifinality essentially disregards any models found to be inappropriate or "non-behavioural". Beven (p. 9, 2001) proposes the following blueprint for determining equifinality amongst physically-based models:

(i) Define the range of model structures to be considered.
(ii) Reject any model structures that cannot be justified as physically feasible for the catchment of interest.
(iii) Define an appropriate range for each parameter in each model.
(iv) Reject any parameter combinations that cannot be justified as physically feasible.
(v) Compare the predictions of each potential model with the available observed data and reject any models which produce unacceptable predictions, taking account of estimated error in the observations.
(vi) Make the desired predictions with the remaining successful models to estimate the range of possible outcomes.

The "blueprint" laid out above necessarily recognizes the importance of high-quality, representative data in evaluating model feasibility, since equifinality cannot address or compensate for poor spatial representation or data error (Beven, 2001, 2002). It is important to note that the blueprint contains neither assumptions nor equations. Instead, the equifinal set embraces all independently-developed models, regardless of their character or paradigm. Equifinality analysis will not necessarily provide a quantitative outcome. Beven (2002) reminds us that all models could conceivably be rejected at Stage (ii) of the blueprint. Further testing or conditioning of the model set using additional information may also lead to the rejection of all models as non-behavioural, since good simulation of catchment outflow does not necessarily imply a good simulation of internal hydrological processes (Beven, 2000). Similar to multi-objective calibration, a trade-off will be required in many cases between successful reproduction of some observations and poor reproductions of others (ibid.). However, Beven (2001) cites the potential for insights and progress even in cases where all models are rejected as unsatisfactory. Complete rejection of all models requires that the study re-examine the suitability of each model structure and reconsider the relationships between measures of performance and desired outcomes.
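As a schematic only, the reject-and-retain logic of the blueprint might be coded as follows. The model structures, parameter ranges, forcing, "observations", and acceptability test are all invented stand-ins; none correspond to a real model or library.

```python
import random

def run_model(structure, params, inputs):
    # Hypothetical stand-in for one candidate model; a real study would
    # dispatch on 'structure'. Here every structure reduces to a simple gain.
    gain, lag = params
    return [gain * x for x in inputs]

def acceptable(simulated, observed, obs_error=0.15):
    # Step (v): reject if any simulated value falls outside the observation
    # error band, "taking account of estimated error in the observations".
    return all(abs(s - o) <= obs_error * o for s, o in zip(simulated, observed))

random.seed(0)
# Steps (i)-(ii): candidate structures judged physically feasible.
structures = ["storage_cascade", "transfer_function"]
# Steps (iii)-(iv): feasible parameter ranges, sampled at random.
candidates = [(m, (random.uniform(0.5, 1.5), random.uniform(1.0, 10.0)))
              for m in structures for _ in range(500)]

inputs = [10.0, 20.0, 15.0]     # invented forcing
observed = [9.5, 19.0, 14.5]    # invented "observations"

# Step (v): retain only behavioural model/parameter combinations.
behavioural = [(m, p) for m, p in candidates
               if acceptable(run_model(m, p, inputs), observed)]

# Step (vi): the retained set brackets the range of possible predictions.
preds = [run_model(m, p, [30.0])[0] for m, p in behavioural]
print(len(behavioural), "behavioural candidates;",
      f"prediction range [{min(preds):.1f}, {max(preds):.1f}]")
```

Note that the final answer is a range, not a point: the spread of `preds` is precisely the irreducible predictive uncertainty that the equifinal set makes explicit.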
Beven (2000) proposes combining the predictions of equifinal models into a cumulative distribution function. While substantial insight could be gained from this approach, it also has several weaknesses. Most significantly, it requires that arbitrary probabilities be assigned to each solution. As discussed for GSA in Section 3.3.4, there is also substantial subjectivity in defining the acceptability criteria. For example, accepting and rejecting models based solely on their simulation of catchment discharge can yield a broader range of "behavioural" models than if multiple diverse or distributed measurements were considered (Beven, 2002). The results of Weiler et al. (2003) support this line of reasoning; the authors improve parameter identifiability for event water transfer functions by combining multiple methods for hydrograph separation in the TRANSEP model. There is also potential for reviewers to assume that the final set of non-rejected equifinal models represents a comprehensive and exhaustive (or at least unbiased) range of outcomes; in reality, this is true only if the inventory of included uncertainties is exhaustive (ibid.). A better alternative for interpreting results is to view them as a non-commensurate set of multi-objective solutions.

3.3.6 Uncertainty Isolation

Model predictive uncertainty is usually far too complex to be tackled en masse. Uncertainty isolation is a term used herein to refer to approaches that attempt to isolate specific aspects or components of uncertainty and examine their influence on results under various conditions. Some studies accomplish this by ignoring or trivializing any uncertainties beyond their limited scope.
In general, however, uncertainty isolation is achieved through simplification of some aspect of the hydrologic modelling process. If a model cannot adequately emulate a simple system, it is questionable whether better performance can be expected for more complex systems (Gan, 1987). Therefore, one could argue that acceptable simulation of a simple system is a necessary but not sufficient milestone for model validation. The data, the model structure, or both can be simplified. Simplifying a model structure also usually simplifies its parameter representation. Duan et al. (1992) use uncertainty isolation in their design testing of the SCE-UA method, applying it to calibrations of the SIXPAR model, a simplified six-parameter version of the SAC-SMA conceptual model. It is critical to recognize that, although the authors reduce parameter uncertainty, they neither attempt nor claim to achieve a net reduction of model predictive uncertainty. Harlin (1992) applies the principle of uncertainty analysis differently, using data selection rather than model structure simplification to isolate a subset of model processes. His study focusses on rain floods to gain a clearer picture of the behaviour of his model's runoff-response function without the complicating influence of the model's snow routine. Hydrologic models commonly exhibit pronounced sensitivity to data error, uncertainty, and spatial variability. If data uncertainty can be eliminated, any uncertainty in the output must result from parameter uncertainty and model structural limitations (Gan and Burges, 1990a). However, critical reviews have identified persistent data problems even for heavily-instrumented experimental catchments like Goodwin Creek (e.g., Steiner et al.
, 1999) and Hydrohill (e.g., Jakeman and Hornberger, 1993). It is not surprising that a large number of studies attempt to sidestep data uncertainty entirely. One approach for avoiding data uncertainty is to focus on relative changes in model output resulting from controlled changes in the model input. This approach assumes that error and uncertainty are equally present in both the original and modified simulations, and therefore will not affect conclusions based on relative results. In one example of such a study, Bashford et al. (2002) investigate the effects of scaling and aggregating data by comparing output based on 30-metre gridded data to results obtained using the same input data discretized to a 1-km scale. Their focus is on examining the differences between the two sets rather than establishing absolute accuracy. Other examples include Lumb and Linsley's comparison of streamflow series based on measured and mathematically-augmented rainfall data (Lumb and Linsley, 1971), and the exploration of the effects of land-use changes on flow regime by Kuchment and Gelfan (2002). Other studies elect to use virtual or synthetic data, which can be assumed error-free for experimental purposes. In such cases, model results are evaluated against a simulated "true" data series rather than against actual field data. In drawing conclusions from studies utilizing virtual data, Bashford et al. (p. 310, 2002) caution that a virtual hydrologic reality may not be complete, accurate or realistic in its process representation. Klemes (2000a) proposes that all models be subjected to testing with synthetic data produced by an exact model of a hypothetical system.
In its simplest interpretation, this procedure has become known as synthetic calibration, and is the most easily identifiable example of uncertainty isolation. Synthetic calibration uses a specified set of parameter values to generate an arbitrary output series. These synthetic data then take the place of observed data in an automatic calibration. To be successful, the synthetic calibration must identify a definitively globally-optimal solution consisting of the parameter values used to generate the "observed" data. To ensure a realistic calibration process, synthetic calibration should be based on realistic input data and parameter values (Bashford et al., 2002; Seibert, 2000). Synthetic calibration is most commonly used for evaluating the performance of automatic calibration algorithms. However, it can also provide insight into model structure uncertainty by highlighting any systematic failures with regard to various performance characteristics. If synthetic calibration fails to successfully reproduce the synthetic "truth" data, the pairing of calibration tool and model cannot be expected to perform well in the presence of real-world uncertainties. However, a synthetic calibration that reproduces the "true" streamflow time series without identifying the target parameter values is a half-success at best, and is likely an indicator of overparameterization in the model structure. For example, Seibert (2000) finds that some optimized parameters differ from their "true" values by 10% or more, despite near-perfect convergence of the objective function. Beven (2002) reminds readers that processes in hydrologic models are not always an accurate reflection of their in situ counterparts.
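The logic of a synthetic calibration, generating "observations" from known parameter values and then checking whether calibration recovers both the series and the values, can be illustrated with a toy model. The two-parameter model and data below are invented, and a crude random search stands in for a real optimizer such as SCE-UA.

```python
import random

def toy_model(params, rain):
    # Hypothetical two-parameter rainfall-runoff stand-in (not a real model).
    k, base = params
    return [k * r + base for r in rain]

def sse(a, b):
    # Sum of squared errors between two series.
    return sum((x - y) ** 2 for x, y in zip(a, b))

rain = [0.0, 12.0, 25.0, 8.0, 3.0]            # invented input series
true_params = (0.6, 1.5)                       # values the calibration must recover
synthetic_obs = toy_model(true_params, rain)   # stands in for "observed" data

# Crude random-search "automatic calibration" against the synthetic record.
random.seed(42)
best, best_err = None, float("inf")
for _ in range(20000):
    trial = (random.uniform(0.0, 1.0), random.uniform(0.0, 5.0))
    err = sse(toy_model(trial, rain), synthetic_obs)
    if err < best_err:
        best, best_err = trial, err

# Success requires BOTH a near-zero objective AND recovery of the true values;
# a good fit with the wrong parameters would hint at overparameterization.
print(f"recovered {best}, objective {best_err:.2e}")
```

The dual check in the final comment is the point of the exercise: a near-zero objective alone is the "half-success" described above.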
Uncertainty isolation approaches have been used in various attempts to explore the resulting model structure uncertainty. For example, the hypothetical case of a small, hydrologically simple catchment with uniform properties and processes could be modelled quite accurately using existing physically-based models (Gan and Burges, 1990a). Therefore, it should be possible to design a realistic virtual catchment for which a given physically-based model is absolutely accurate and completely certain (e.g., Gan, 1987). Gan and Burges (1990a) explain how a mathematical model of a virtual catchment with known properties, inputs, and outputs can take the place of an ideal experimental catchment. By creating a virtual hydrologic reality which can be exhaustively explored through a mathematical model, the authors in effect create an ersatz duplicate of the controlled conditions of a field-scale, fully-contrived, repeatable laboratory experiment. Assuming the physically-based model is fully representative of the virtual catchment, fluxes calculated by the physical model detail what would actually occur in this catchment under the same input conditions. This effectively eliminates data uncertainty (Gan and Burges, 1990a). Weiler and McDonnell (2004) pursue a similar goal using virtual experiments of hillslope-scale hydrologic processes. They refer to their virtual experiments as "numerical experiments with a model driven by collective field intelligence" (p. 6, ibid.). Rather than using a detailed physical model as a baseline training tool, the authors provide a qualitative discussion of their results in relation to other findings documented from field experiments.
In general, results of the virtual experiment are found to correspond well with field observations. Output from virtual experiments such as those of Gan and Burges (1990a, b) and Weiler and McDonnell (2004) can be used in the same manner as results from a laboratory experiment: to confirm the structure and performance of other models, and to evaluate the uncertainties introduced by various simplifying assumptions (e.g., Troch et al., 1993). Weiler and McDonnell (2004) suggest that the results of a virtual experiment be viewed in the context of equifinality (Section 3.3.5) as an "equifinality reducing instrument". In particular, a strong case can be made that extending the work of Gan and Burges (1990a) to include a rigorous multi-objective analysis (e.g., SCEM-UA) would account for both parameter and data uncertainty. If this were achieved, any residual deviations between the Pareto hydrograph and the "observed" data would theoretically be due to model uncertainty alone.

3.4 Uncertainty and Extreme Event Simulation

The crux of uncertainty in extreme event simulation is the relative dearth of detailed data and observations for historical events. The lack of records can be attributed to the infrequency, unpredictability, and power of extreme events, whose sheer magnitude often overwhelms observation equipment. In the absence of extensive quantitative study of extreme events, it is not surprising that there is no scientifically proven means of estimating them; hydrologic model simulations of extreme events are highly uncertain. Nonetheless, hydrologic models are widely used because they represent the best tool available, are based on scientifically plausible if not proven assumptions, and can make use of the best information available (Micovic, 2003a).
Modellers and decision-makers both need to be aware of the uncertainties involved in extreme event simulations. Decisions should make explicit allowance for uncertainty rather than accepting at face value the arbitrary precision of deterministic model results. Decision makers must fundamentally realize that precise estimates of extreme or design events are both unattainable and inappropriate. At the same time, modellers must act to prevent non-experts from adopting the view that extreme events like the PMF have a near-infinite return period and an infinitesimal probability of occurrence (NRC, 1985). To this end, Klemes (1992) suggests replacing discrete estimates of extreme event magnitude and probability with the more defensible concept of a window of credibility in probability-magnitude space. Deterministic estimates for extreme precipitation (e.g., the PMP) are particularly subject to problems of uncertainty. Smith et al. (1996) demonstrate that the storm transposition procedure historically recommended by the World Meteorological Organization for PMP analyses is physically implausible in certain situations. Similarly, the work of Bingeman (2001) supports the existence of an upper limit for precipitation and suggests that "traditional" PMP methods can result in overestimation of the PMF. Wind-induced precipitation undercatch also causes problems when parameter values calibrated against undercatch-laden data are used to generate predictions using deterministic precipitation inputs. All of the above considerations seem to support the argument of Arnaud et al. (2002) that extreme flows are generally overestimated.
Since the best parameter set arising from nominal calibration and validation is unlikely to be optimal for extreme or design flood conditions, the largest floods of record are often used in a final calibration stage to "fine tune" parameter values (Harlin, 1992; Harlin and Kung, 1992). Current practice at BC Hydro is to perform an event-based calibration over the largest discrete events of the historical record prior to simulating the PMF for any watershed (Micovic, 2003a). However, even the largest events of record are often far removed from the hydrologic conditions of the extreme event. One must therefore question to what extent an extreme event model can be said to be "validated". Applying a hydrologic model to predict extreme floods implicitly presupposes that the mechanisms driving observed behaviours are simulated correctly (Seibert, 2000). However, there can be fundamental differences between processes controlling normal and extreme hydrologic responses. In fact, some researchers have argued outright that hydrologists do not have a clear idea of what to route in extreme situations (Cordova and Rodriguez-Iturbe, 1983). Runoff during an extreme precipitation event (i.e., one of very low probability) is almost always dominated by overland flow; regardless of the significance of other processes under normal conditions, they are almost always relegated to minor roles (Kouwen, 2003). It is not surprising, therefore, that Gan and Burges (1990b) show that catchments dominated by surface runoff respond more "accurately" to extreme precipitation than their subsurface-dominated counterparts.
Differences between uniform and non-uniform precipitation can become less important as precipitation events become spatially extensive and storage effects become small (ibid.; Arnaud et al., 2002). Woolhiser et al. (1996) urge caution in extrapolating models based on situations in which the active processes differ from those of the target application. Alpine basins prone to debris flows and mass wasting are a good example of a situation for which the validity of model extrapolation is obviously questionable. Jakob and Jordan (2001) show that a purely hydrologic approach will underestimate the magnitude of design floods in mountainous regions due to the dominance of geomorphic processes for the highest observed peak discharges. Harlin and Kung (1992) present a more specific example, demonstrating how the influence of one particular parameter increases considerably when their model is extrapolated to extreme events. Garen and Burges (1981) also identify some parameters as insensitive except under extreme conditions. More generally, BC Hydro (Kroeker et al., 2003) has observed that different models, equally "well-calibrated" by their respective designers, can diverge significantly when extrapolated to extreme events. Klemes (p. 17, 1986b) argues that the incautious extrapolation of models to problems far beyond their established capabilities has "created a false impression that hydrology has answers to problems which may remain beyond its reach for decades to come or whose solution lies outside its framework". This situation may result in grossly incorrect estimates of hydrological conditions leading to poor decisions, which may in turn lead to a cynical waning of support for genuine hydrologic research.
Ultimately, this cycle hinders the ability of hydrological science to provide better answers (ibid.).

4. Research Tools and Methods of Analysis

"Prediction is very difficult, especially if it's about the future." - Niels Bohr

Any critical analysis of hydrologic modelling will undoubtedly identify model calibration as highly significant in its potential effects on uncertainty. Most hydrologic modellers share the curiosity of O'Connell and Todini (p. 7, 1996) as to whether "alternative but physically acceptable parameterizations, consistent with the available information, [can] still give rise to the same set of responses". In the context of extreme flood estimation, the question may be extended to include the degree of variability that these "alternative parameterizations" introduce into estimates of extreme floods such as the PMF. This thesis develops and demonstrates one possible approach for exploring these issues, focussing on the questions:

• What kind of variability in parameter values is associated with alternative "acceptable" calibrations?
• How much variability arises when the parameter values identified by these alternative calibrations are used to estimate extreme flood events?

As a pre-requisite step towards the larger goal of reducing overall model predictive uncertainty, this work investigates the impact of subjective decisions made during calibration on model predictive uncertainty for extreme events. Multiple automatic calibrations of a conceptual hydrologic model are conducted using different measures of performance, resulting in a collection of non-inferior parameter sets. This approach is demonstrated using hydrologic data for the Coquitlam Lake and Illecillewaet River watersheds in British Columbia.
Each non-inferior parameter set for the Coquitlam Lake watershed is then used to simulate an extreme event based on data provided by BC Hydro. The combined output of these extreme event simulations characterizes the relative variability in the hydrographs.

Simulations are conducted using the University of British Columbia Watershed Model (UBCWM), which is widely used to describe and forecast watershed behaviour in mountainous areas of British Columbia. Calibrations of the UBCWM utilize the Shuffled Complex Evolution Algorithm (SCE-UA), an effective and efficient optimization-based automatic calibration routine. Because automatic calibrations do not capture the level of knowledge inherent in a manual calibration, the extreme event hydrographs obtained using the alternative parameter sets are compared on a relative rather than absolute basis. This work focusses on exploring uncertainty, and as such does not attempt to produce a practicable calibration of the UBCWM for nominal or extreme-flood prediction. Section 4.1 of this chapter presents a brief review of the UBCWM, SCE-UA, and the watersheds selected as case studies. A more detailed description of the experimental approach is given in Section 4.2.

4.1 Tools and Case Studies

4.1.1 The University of British Columbia Watershed Model

The UBCWM (Quick et al., 2003) has become BC Hydro's primary tool for forecasting watershed behaviour in mountainous areas. The UBCWM was chosen as the conceptual model for this study to ensure that results are useful both in an applied context as well as in the more general study of model predictive uncertainty. Also, technical support for the UBCWM is available locally through UBC and BC Hydro. The following brief overview of the model is intended to provide the reader with a working level of familiarity. For a more detailed description, the reader should refer to the UBCWM User's Manual (Quick, 1994).
The UBCWM is a conceptual model designed to balance simplicity against physical realism (Micovic, 1998). Topographic and land use data are required to define watershed properties. Once the watershed is set up, the model requires only precipitation and temperature data as inputs. Streamflow data are necessary for calibration; Micovic (2003a) recommends a minimum of ten years of daily data to ensure reliable calibration and validation. The UBCWM is primarily intended to model mountain runoff resulting from snowmelt, glacial melt, and rainfall (Zhu, 1997). In example applications, the model exhibits better performance for a snowmelt-driven watershed (91% efficiency) than for a rainfall-driven watershed (80% efficiency) (Quick, 1995). As always, watershed complexity and data accuracy tend to govern model performance. Although the UBCWM can be classified as a lumped model, it nonetheless has a distributed component. The UBCWM subdivides a watershed into elevation bands to capture orographic gradients of precipitation and temperature. Precipitation, snowmelt and ET are calculated independently for each elevation band. An analysis of the dependence of routing parameters on watershed area leads Micovic (1998) to conclude that the channel phase can be neglected in the routing calculations for small and medium-sized catchments. Therefore, a simple summation of runoff from all bands is used to calculate outflow. Since the UBCWM contains both distributed and lumped elements, it is useful to distinguish the model from either group by referring to it as "quasi-distributed". Model timing is based on precipitation and temperature inputs, and can assume either a daily or hourly basis (Quick, 1995). The UBCWM can operate as either an event-based or continuous model.
For continuous simulations, a one-year spin-up period is commonly used with initial conditions set to zero and the first year of results not considered in the analysis. Initial conditions for event-based (hourly) calibrations are typically determined by excerpting data from longer-term continuous (daily) model simulations (Micovic, 2003b). While primarily focused on streamflow estimation, the UBCWM can also provide estimates of snowpack, soil moisture, groundwater storage, geographic runoff contributions, and surface-subsurface contributions (Quick, 1995). The accuracy of the model in a forecasting role depends largely on accurate reproduction of snowpack, soil moisture and groundwater storage (Quick, 1995). Every simulation using the UBCWM involves three major subcomponents of the model structure:

• The meteorological subroutine distributes input data throughout the catchment. This is the most important component due to its control of moisture input and snowmelt.
• The soil moisture subroutine calculates evaporation and runoff, apportioning the runoff between four response modes: fast (e.g., surface); medium (e.g., interflow); slow (e.g., upper groundwater); and very slow (e.g., deep groundwater). The response modes are conceptually analogous to their physical examples, but lack a physical basis in computation.
• The watershed routing subroutine calculates the time distribution of runoff from each of the four response modes apportioned above. Each response mode is subjected to storage using a cascade of linear reservoirs. Typical time constants range from less than one day for fast runoff to 100-200 days for very slow runoff.
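The cascade-of-linear-reservoirs idea used in the routing subroutine can be sketched as below. The explicit-Euler discretization, the number of reservoirs, and the time constants are invented for illustration; this is not the UBCWM's actual routing formulation.

```python
def route_cascade(inflow, k, n_reservoirs=3, dt=1.0):
    """Route a runoff series through n linear reservoirs in series.

    Each reservoir obeys Q = S / k (k = time constant, in days), with a
    simple explicit-Euler storage update (stable here since dt < 2k).
    """
    series = inflow
    for _ in range(n_reservoirs):
        storage, out = 0.0, []
        for q_in in series:
            storage += (q_in - storage / k) * dt  # dS = (inflow - outflow) * dt
            out.append(storage / k)
        series = out  # outflow of one reservoir feeds the next
    return series

# A unit pulse routed with a short time constant ("fast" runoff) versus a
# longer one ("slow" runoff); values are illustrative only.
pulse = [1.0] + [0.0] * 29
fast = route_cascade(pulse, k=2.0)
slow = route_cascade(pulse, k=10.0)
print("fast peak on day", fast.index(max(fast)),
      "- slow peak on day", slow.index(max(slow)))
```

The larger time constant delays and flattens the response, which is the behaviour the four response modes are designed to reproduce, from sub-day surface runoff to deep groundwater recession.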
An effective calibration must involve all four of the runoff response modes described above to ensure the respective timing parameters have all been optimized. Outflow from the four response modes is summed for each timestep. These three components are applied consecutively and the output from one becomes the input to the next. Various other engines and utilities can be called on through the user interface to assist in data preparation and analysis. Temperature data are used to determine snowmelt, evaporation, and precipitation form. Parameters required to calculate snowmelt, evaporation, and temperature lapse rates are all pre-calibrated. Snowmelt can be driven by either energy budget or degree-day approximations (Quick, 1995). The UBCWM calculates ET in three stages. First, potential ET is calc