Probabilistic Dynamic Rating Curves using Auxiliary InformationbyLuis Camilo Galindo RuizB.S. Civil Engineering, University of Texas at Austin, 2015A THESIS SUBMITTED IN PARTIAL FULFILLMENTOF THE REQUIREMENTS FOR THE DEGREE OFMaster of Applied ScienceinTHE FACULTY OF GRADUATE AND POSTDOCTORAL STUDIES(Civil Engineering)The University of British Columbia(Vancouver)December 2017c© Luis Camilo Galindo Ruiz, 2017AbstractRating curves play a vital part in hydrology for producing streamflow time-series. The derivedstreamflow is an integral component to any hydrological study and therefore requires proper quan-tification of not only a discharge point value, but also an uncertainty measure. Using multivariateGaussian distributions as kernels, a probabilistic rating curve was developed from the conditionaldistribution as an alternative model for the standard deterministic rating curve. Auxiliary informa-tion from a run-of-river hydroelectric project, as well as the temporal variability from the gaugingmeasurements, were used to study the possible reduction in the uncertainty of the developed ratingcurve. The temporal information was modeled using an exponential function that updated uponreceiving new gaugings and the sluicing model was a continuously updated kernel distribution thatassigned more weight to gaugings taken after a sluicing event. Four models of varying complex-ity were created and their performance was evaluated using information theory measures such assurprise and the Kullback-Leibler divergence measure. The results indicate that probabilistic ratingcurves are useful tools for modeling and evaluating the dynamic uncertainty of the curves. Theuncertainty was shown to be reduced by up to 19% by including the temporal information of thegaugings and sluicing information. Auxiliary information can be beneficial to rating curve devel-opment and an argument is made for why probabilistic rating curves should become a norm in thehydrology field.iiLay SummaryThe volume of water (streamflow) flowing through rivers has large applications in engineering de-sign and modeling. All these applications require that the proper amount of streamflow be quantifiedto ensure the best possible design and usage of the water. For this reason, stations that continuouslyrecord streamflow have been placed at various locations along rivers all over the world. Directlymeasuring streamflow however is highly expensive and an indirect method must be used. Thismethod is called a rating curve, but the largest flaw in the method is that it does not properly cap-ture the uncertainty in the streamflow or give any insight into the probability of the discharge. Thisresearch addresses the flaw by developing a model that highlights the truly probabilistic nature ofthe streamflow. Available information impacting the streamflow uncertainty was also introduced tohelp reduce the uncertainty of the probabilistic dynamic rating curve.iiiPrefaceThis thesis is original, unpublished, and independent work by the author, Luis Galindo.ivTable of ContentsAbstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iiLay Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iiiPreface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ivTable of Contents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vList of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viiList of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viiiGlossary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiAcknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xii1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42.1 Stage-discharge rating curve . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42.1.1 Hydraulic Controls . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42.2 Development of a rating curve . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52.2.1 Criteria . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52.2.2 Site investigation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62.2.3 Interpretation of the rating curve . . . . . . . . . . . . . . . . . . . . . . . 62.3 Uncertainties in rating curve . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82.3.1 Bayesian and MCMC methods . . . . . . . . . . . . . . . . . . . . . . . . 102.3.2 Time variance in rating curves . . . . . . . . . . . . . . . . . . . . . . . . 112.3.3 Machine learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122.3.4 Information is informative . . . . . . . . . . . . . . . . . . . . . . . . . . 132.4 Research objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15v3 Data and methods for fitting rating curves . . . . . . . . . . . . . . . . . . . . . . . . 173.1 Site description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173.2 Data availability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193.2.1 Sluicing data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203.3 Rating Curve Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223.3.1 Forecast model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243.3.2 Hindcast model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 263.3.3 Development of rating curve . . . . . . . . . . . . . . . . . . . . . . . . . 283.4 Metrics for evaluating rating curve . . . . . . . . . . . . . . . . . . . . . . . . . . 323.5 Information theory metrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 323.5.1 The model that is less surprised is better . . . . . . . . . . . . . . . . . . . 333.5.2 Information gain . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 344 Results of probabilistic rating curves . . . . . . . . . . . . . . . . . . . . . . . . . . . 354.1 Forecast mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 354.1.1 Probabilistic dynamic rating curve . . . . . . . . . . . . . . . . . . . . . . 354.1.2 Dynamic conditional uncertainty bands for hydrograph . . . . . . . . . . . 374.1.3 Evaluation of rating curves . . . . . . . . . . . . . . . . . . . . . . . . . . 394.2 Hindcast mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 454.3 Sensitivity analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 464.3.1 Discharge uncertainty . . . . . . . . . . . . . . . . . . . . . . . . . . . . 464.3.2 Time model parameter (µ) . . . . . . . . . . . . . . . . . . . . . . . . . . 475 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 495.1 Cease-to-flow water level (h0) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 495.2 The effect of fixing the slope of all kernels to be equal . . . . . . . . . . . . . . . . 505.3 Extrapolation of the rating curve . . . . . . . . . . . . . . . . . . . . . . . . . . . 515.4 Maximizing what is known to help solve the unknown . . . . . . . . . . . . . . . 515.5 Dynamic uncertainty bands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 525.6 The need for probabilistic rating curves . . . . . . . . . . . . . . . . . . . . . . . 536 Conclusion and future work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59A Snapshots of the rating curve in time . . . . . . . . . . . . . . . . . . . . . . . . . . . 63B Hydrographs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66C 20% assumed discharge uncertainty . . . . . . . . . . . . . . . . . . . . . . . . . . . 68viList of TablesTable 4.1 Total surprise per model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41Table 4.2 RMSE and surprise during hindcast evaluation of rating curve model . . . . . . 45Table 4.3 Sensitivity analysis of discharge uncertainty used in the conditional distributionof the kernels. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47Table 4.4 Sensitivity analysis of µ parameter in the exponential function used for the timeweights while in forecast mode. Note that the Equal Weights and the Only Sluic-ing models are not shown since they are not effected by the time weights. AW =All Weights and OT = Only Time. . . . . . . . . . . . . . . . . . . . . . . . . . 48viiList of FiguresFigure 2.1 Compound channel cross-section . . . . . . . . . . . . . . . . . . . . . . . . . 5Figure 2.2 Illustrates a shift correction. The red line represents the baseline rating curvewith triangular markers used to construct linear lines. The purple line is a tem-porary shift that occurred during the active period of the rating curve. . . . . . 7Figure 2.3 Variable-shift diagram displaying a negative shift. . . . . . . . . . . . . . . . . 7Figure 2.4 Different physical processes that are common in rating curve and an example ofextracting more information than stage and discharge relationship (from Her-schy (1995)). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14Figure 3.1 Location of Forrest Kerr Hydroelectric project. The Iskut river flows from eastto west. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18Figure 3.2 A schematic of the intake structure just upstream of the IFR hydrometric station.Courtesy of Tim Argast of Northwest Hydraulic Consultants (NHC). . . . . . . 19Figure 3.3 Time series of stage and discharge at two two different locations. A clear pat-tern for when sluicing was identified as having occurred. The forebay stage islocated right before the power tunnel intake in the desander basin. . . . . . . . 21Figure 3.4 Time series of stage and discharge at two different locations. Changes in thetime series with identified sluicing events detailed as red vertical lines. Thecolored patches represent time periods in which new NHC rating curves weredeveloped. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22Figure 3.5 A bar graph showing which months experienced the most sluicing events be-tween January 2015 and July 2017. . . . . . . . . . . . . . . . . . . . . . . . 23Figure 3.6 Hydrograph at IFR hydrometric station between January 2015 and July 2017. . 23Figure 3.7 A bar graph showing which months experienced the most precipitation betweenJanuary 2015 and July 2017. . . . . . . . . . . . . . . . . . . . . . . . . . . . 24Figure 3.8 More weight from the exponential function (dotted line) is assigned to the gaug-ings (colored boxes) that are closer to the current time step, Ti. . . . . . . . . . 25viiiFigure 3.9 The kernel probability distribution function (PDF) and CDF are updated as moreinformation on possible sluicing events enter the model. The first panel showsthe PDF and CDF of the first sluicing event, the second panel shows and updatedversion, and the final panel shows the model with all the sluicing information. . 26Figure 3.10 An evolution of how the sluicing weights change as new gaugings are observed(x-axis) in time. Each row is a vector of the sluicing weights . . . . . . . . . . 26Figure 3.11 An equal weight assignment for the gaugings prior to sluicing information andthen the weights increase in magnitude for values later in time. The figure is asnapshot if Figure 3.10 at the time of the last added gauging. . . . . . . . . . . 27Figure 3.12 Evolution of the time model at different time period. . . . . . . . . . . . . . . 27Figure 3.13 Evolution of the sluicing model when applied in hindcast mode. An evolutionof how the sluicing model changes in the hindcast mode of the rating curve. Thetop left panel shows the function typical for the first 51 gaugings. The top rightpanel shows the function for the 53rd gauging. The bottom left is the typicalfunction for the gaugings between 53rd and 104th gauging, with the peak ofthe function centered around the gauging being evaluated during the hindcastmodel. The bottom right function is the function for the last gauging. . . . . . 29Figure 3.14 A multivariate Gaussian kernel with its marginal distributions and a conditionaldistribution. Note that only the scaling parameters are shown for illustrativepurposes. The location parameters per kernel are fixed to the value of the gaugings. 31Figure 4.1 The conditional probabilistic rating curve for all four models at a snapshot intime. May 15, 2016. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36Figure 4.2 A comparison of the conditional distributions for a give logarithmic water levelmeasurement. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37Figure 4.3 A zoomed in version of Figure 4.1 used to understand the bimodal conditionaldistribution shown in Figure 4.2. . . . . . . . . . . . . . . . . . . . . . . . . . 38Figure 4.4 A hydrograph with dynamic conditional uncertainty bands for the All Weightsmodel. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38Figure 4.5 A zoomed in version of Figure 4.4 . . . . . . . . . . . . . . . . . . . . . . . . 39Figure 4.6 Scatter plot describing the differences between the predicted discharge from thefour models and the observed values. . . . . . . . . . . . . . . . . . . . . . . 40Figure 4.7 Total surprise per month. The translucent bars represent the surprise before thegauging is added (during prediction) and the opaque bars are the surprise after. 41Figure 4.8 Cumulative gaugings per month are shown in the figure to the left and the rangeof values recorded per month and their frequency are shown in the heatmap tothe right. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42Figure 4.9 Discharge versus surprise and color coded with date in which gauging wasrecorded. The three largest values were removed from the main plot to bettershow the trend. The inset shows what the original date looked like. . . . . . . . 43ixFigure 4.10 Trend in the gaugings. The reason for the nonlinear shape is due to the initiallyassumed h0 of 234 m. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43Figure 4.11 The lines represent the information gain from the Kullback-Leibler divergencemeasure between the probability distribution prior to a prediction being madefor the gauging and the probability distribution after adding the gauging (”true”distribution). The black bars represent the discharge values of the gaugings usedin the model. The light gray bars are the initial 19 gaugings used to develop therating curve. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44Figure 4.12 A comparison of how the joint and conditional distributions change when usinga 10% discharge uncertainty. . . . . . . . . . . . . . . . . . . . . . . . . . . . 47Figure 5.1 Last dynamic probabilistic rating curve produced in simulations. . . . . . . . . 50Figure 5.2 A display of how parallel rating curves can emerge using the described linearregression for fixing the slope of all the kernels. . . . . . . . . . . . . . . . . . 51Figure 5.3 A closeup as to how the conditional uncertainty bands change when a gaugingis added. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53Figure A.1 November 28, 2013 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63Figure A.2 January 1, 2015 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64Figure A.3 June 20, 2017 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65Figure B.1 Hydrograph for Only Time model . . . . . . . . . . . . . . . . . . . . . . . . 66Figure B.2 Hydrograph for Only Sluicing model . . . . . . . . . . . . . . . . . . . . . . 66Figure B.3 Hydrograph for Equal Weights model . . . . . . . . . . . . . . . . . . . . . . 67Figure C.1 A comparison of how the joint and conditional distributions change when usinga 20% discharge uncertainty. . . . . . . . . . . . . . . . . . . . . . . . . . . . 68xGlossaryUSGS United States Geological SurveyPDF probability distribution functionWSC Water Survey of CanadaV-SHIFT variable shift diagramMCMC Markov Chain Monte CarloOU Ornstein-UhlenbeckGLUE Generalized Likelihood Uncertainty EstimationANN artificial neural networkENNF explicit neural network formulationENSO El Nin˜o Southern OscillationPDO Pacific Decadal OscillationEC electricial conductivityIFR instream flow requirementFKC Forrest Kerr CreekLOOCV leave-one-out cross validationCDF cumulative distribution functionNHC Northwest Hydraulic ConsultantsROR run-of-riverBC British ColumbiaRMSE root mean square errorxiAcknowledgmentsI would like to thank Dr. Steven Weijs for all his help throughout the past few years, and for thenotorious 1+ hour long meetings in your office. It has been great fun working under your supervi-sion. Thank you Dan Moore for being my second reader. Thank you to Akhil Kumar and HosseinFoorozand for your fruitful discussions and friendship during the past year. Valentine Arrieta andPauline Millet, those few months that you were each in Vancouver were unforgettable moments.Thank you to Stu Hamilton and Touraj Farahmand from Aquatic Informatics for the support onmy thesis and for helping with the NSERC Engage grant. Also thank you to Andre Zimmermanfrom NHC for generously providing me the data to complete my thesis.To everyone in the upstairs CEME graduate offices, thank you for all the great conversations. ToKevin, Elnaz, and Glenda, thank you for all the distractions, coffee, candy, and help.My experience in the hydrotechnical group would not have been the same without the followingpeople: Abhishek Agrawal, Mehretab Tadasse, Jonathan Van Groll, Andy Lee, Erica Kennedy, andJulio Portocarrero. Thank you all so much for all the great times that we have had.To the many faces of 2116, it has been a pleasure living with all those who have come and gone;you guys have been great roommates! James Roberts, thank you for all the fun that we have had thepast few months in the house—that haunted escape room was brilliant! Simone Tengattini, livingwith you was a beautiful experience, and the friendship that flourished will never be forgotten.Sylvia Woolley, thank you for everything. Your companionship the past year has been an extraordi-nary adventure.To all my friends from Texas that I have maintained in contact with, I am incredibly happy tohave you all in my life. Thank you to all those who have traveled to Vancouver to see me, especiallyJuan De Loera, Drake Hernandez, Brandon Comisarenco, Anais Picarelli, and Joanne Neira.Lastly, thank you to my family. None of this would have been possible without their support.Gracias a mi papa´ y mi abuela por todo tu amor y apoyo—los amo con todo mi corazo´n.xiiChapter 1IntroductionThis thesis is focused on the one compound that everyone in the world needs for survival—water.It’s both vast, powerful, and dynamic as well as finite, fragile, and lethargic, all at the same time.Human life depends on this compound, but its usability is threatened every single day by pollutionand anthropogenic climate change. This makes one wonder how water will play a factor in the futureand/or how much usable water will there be? x m3? y m3? This information is useful, but it leaves alingering question of, ”How sure are you?” This question instinctively forces the person providingthe x and y estimates to give a response like—”I’m fairly sure” or, ”About 67% confident.” So whengiven a prediction, it is natural to know how certain the answer is. Think about it. What if youturned on your television and the weather spokesperson said, ”Today it will rain.” This statementleaves one wondering if the individual meant that it will rain all day, or if the rain will be scatteredwith higher probability soon after lunch. Without any further information on the probability of raintoday, one is left to make the judgment of when he/she believes it will rain. So now, one can makea subjective statement on what the probability of rain will be, most likely determined by lookingoutside and examining the gloom in the clouds. However, one is not an expert in weather forecastingand the probability of rain determined by individual users may be different, and more likely to beincorrect, from the probability that the expert would assign. The reason for this is that the experthas all the information available to him/her and is more qualified to be the one making the judgmenton the chances of rain today. The aforementioned is true when trying to predict how much water isflowing in a river channel.Today, all around the world the task of quantifying how much water is flowing in any river is dele-gated to primarily government agencies. Here in North America, the two largest agencies are WaterSurvey of Canada (WSC) and the United States Geological Survey (USGS). Hydrographers for WSCand USGS perform field visits to rivers, and carefully measure water level and discharge valuesto develop an empirical relationship between the two variables. The true relationship is groundedon fundamental hydraulic principles and is known as the rating curve. However, the rating curveproduces a purely deterministic result. Therefore, similar to the scenario described in the priorparagraph, users of the assembled discharge time series are tasked with the exercise of applying a1probability to the values, or in many cases, applying no probability and using the time series as is,causing an ”illusion of certainty” (Krzysztofowicz, 2001). Many hydrologists have come to termswith the importance of probabilistic models and their importance (Beven, 2010; Liu & Gupta, 2007;McMillan, Freer, Pappenberger, Krueger, & Clark, 2010; Montanari & Brath, 2004; Weijs, 2011).Unfortunately, the rating curve has yet to be transformed into a truly probabilistic model and it ishere, where the author argues that a probabilistic rating curve is essential for not only the hydrol-ogy field, but those that are dependent on discharge time series. The results from a rating curveare fundamental to accurate depictions of floodplain mapping, emergency and evacuation decisions,engineering design structures, and more.The topic of rating curve uncertainty is continuously being studied to develop models that canproduce more accurate values. Again, the author has not found in the literature an explicit proba-bilistic dynamic rating curve model. The literature does contain probabilistic rating curves in thesense that the parameters of the rating curve are described by a probability distribution, but no rat-ing curve has been developed as a pure conditional distribution, like the one in this thesis. Theauthor’s first objective in this thesis is to develop a probabilistic rating curve model that producessimilar discharge values but with the added benefit of providing a probability distribution with thedischarge point value. The second objective of this thesis is to illustrate to the reader the impor-tance of adding available information to improve the accuracy of a rating curve model. Availableinformation in the context of rating curves means information that may help explain the behaviorof rating curves throughout the temporal and spatial domains, as well as any relevant data that mayhave also been collected while recording water levels and discharge, such as electrical conductiv-ity, as done in (Weijs, Mutzner, & Parlange, 2013). Information has the main benefit of possiblyreducing uncertainty. In the context of rating curves, auxiliary information that can be brought intothe model is dependent on what kind of other signals were collected at the hydrometric station, ornearby. Rivers with high energy, perched in the mountains, tend to have their discharge measuredby electrical conductivity. Weijs et al. (2013) showed that electrical conductivity could be useful inreducing rating curve uncertainty. Other signals that have a strong correlation with the dischargemay also serve as good side information to include, but they must be independent of the water level.Thinking about what local information relates to the discharge is a useful exercise that can havehigh impacting results on modeling rating curve uncertainty. Caution must be adhered to ensurethat a complex model is not developed as a substitute for the current practical rating curve. Instead,the ideal model should balance complexity and practicality. Thus, by grounding the rating curveon hydraulic principles and identifying useful information, one can create a model that meets bothcomplexity and practicality needs.More specifically, this thesis will demonstrate a probabilistic rating curve that uses informationavailable from a run-of-river hydroelectric project in northwestern BC. The thesis is organized asfollows. Chapter 2 will provide the background on the current state of developing rating curves, as2well as present a literature review of the current work that has been done on modeling rating curveuncertainty. The literature review over rating curve uncertainty extrapolates some of the work pre-sented in the literature review done by Le Coz in 2012, and expands on it by work done since thatmanuscript’s publication. Chapter 3 describes the methodology used in producing the probabilisticrating curve, as well as a description of how the added information was incorporated into the ratingcurve. This chapter finishes by introducing a few information theory metrics and measures used tohelp evaluate the added information, and the performance of the models. Chapter 4 is a presentationof the probabilistic rating curve produced, as well as an assessment of the auxiliary informationused in the model. Finally, Chapter 5 provides a discussion on the results, the importance of prob-abilistic models and using available information, and future work that can be done to expand thework presented herein this thesis.3Chapter 2Background2.1 Stage-discharge rating curveA stage-discharge rating curve is an empirical relationship between water level and discharge alonga cross-section of a river. A rating curve ideally consists of 10 or more points that are time invariantand supports the hypothesis of a stable relationship between the variables (Ministry of EnvironmentScience and Information Branch for the Resources Information Standards Committee, 2009). Apower law (Equation 2.1) is often used and forms the one-to-one relationship between the twovariables.Q = a · (ht −h0)b (2.1)Equation 2.1, when plotted on a logarithmic scale, creates a linear relationship (when the true cease-to-flow water level is known) that can be interpreted in terms of hydraulic principles. The scalingparameter (a) and exponent (b) have been attributed to the geometry and physical constraints ofthe river. In his seminal work, Chow (1959) demonstrated that the scaling parameter is related tothe characteristics of the channel or section control, while the exponent parameter can approximatethe shape of the cross-section. This can provide an indication as to whether the stage-dischargerelationship is controlled by a section or channel control. An exponent less than two would indicatea channel control while a larger exponent is typical of a section control. This led Chow (1959) togeneralize the following:• b = 3/2 is typical of a rectangular channel• b = 2 is typical of a rough parabolic shape• b = 5/2 is typical of a triangular or semi-circular section2.1.1 Hydraulic ControlsThe stage-discharge relationship at a gauging station is governed by the downstream hydraulic con-trols of the river (Rantz, 1982). The three types of hydraulic controls are section, channel, and4Figure 2.1: Compound channel cross-sectioncompound controls. A section control is either a natural or man-made structure, such as a rockoutcrop or bridge/weir, while a channel control is a function of the channel geometry, slope, androughness. Typically, a channel control is more evident in large and wide rivers, while a sectioncontrol is more commonly found in small narrow rivers. The third control is the compound con-trol, a combination of section and channel controls as shown in Figure 2.1. Here, the section controlmanages the low end of the rating curve and as the flow increases, the section control is drowned outand the relationship is dominated by the channel control. In such a hydraulic situation, it is commonto identify a transition region between the two controls. When plotting the rating curve, this canusually be identified by changes in the slope between low and high flows (Herschy, 1995). It is alsoimportant to note that the hydraulic equation may also be segmented to account for a compoundsection. In this case, a segmented hydraulic equation will have unique parameters for the differentsections that correspond to the dominant control. Also when considering hydraulic controls, oneneeds to account for the offset value, or h0 in Equation 2.1. This offset value is the height adjust-ment for the control, and differs if the stage-discharge rating curve is compounded to represent thedifferent physical characteristics of the channel.2.2 Development of a rating curve2.2.1 CriteriaWhen developing a rating curve, it is pertinent that the location of a rating curve satisfy the followingcriteria (when possible) as is mentioned in the Manual of British Columbia Hydrometric Standards.These criteria include:• Accurate water level measurements must be available at low and high water levels• A stable hydraulic control must be available, either natural or artificial• The station must be accessible year long• Straight, stable, and aligned banks• No tributaries between gauge and metering site• Metering section should have uniform depth and velocity flow lines5• Accountability for backwater effects• The station should be able to withstand large peak events and intense rainfall and snow events• The proper station should be constructed for the type of body of water one wishes to gaugeUnfortunately, not all criteria will always be met, and this could cause a less stable stage-dischargerelationship. If this is the case, one may need to perform more site visits.2.2.2 Site investigationSite visits allow the hydrographer to develop an intuition for how the river is behaving at a givenpoint in time; as well as give an image of changes that may have occurred since the last visit. Carefulobservations regarding high flowing rivers should be exercised and evidence of past floods shouldbe sought out. These include, but are not limited to, water marks, debris deposition, and fragmentedvegetation in the river’s floodplain. It is also important to ensure that if a housing structure, such asa stilling well, is on site, that it remains intact and fully operational.A careful observation of the channel cross-section shape (including banks of the river) will helpthe hydrographer develop a hypothesis for what a possible exponent value (b) may be in Equation2.1. Signs of large boulders along the river may indicate high energy and temporal variations of thecontrols (Baker, 2009; Kieffer, 1985; Le Coz, Renard, Bonnifait, Branger, & Le Boursicaud, 2014).It is recommended that during site visits, the controls should be identified to ensure no changesto the regime that controls the stage-discharge relationship have happened. Gaugings taken duringsite investigations should be aggregated to the most current rating curve. This practice allows thehydrographer to identify if the data collected fall within 5% of the current rating curve, or if a shiftor new curve may be required (Herschy, 1999).2.2.3 Interpretation of the rating curveA shift is described as a temporary deviation from the main rating curve. Shifts can be attributedto a number of reasons, primarily temporary changes in the control, such as a log/debris jam, ice,and vegetation. Backwater effects may also cause a temporary shift to be implemented. If the shiftpersists for a period of months, then this is may be indication that a permanent effect may havetaken place in the stream and a new rating curve is required. It is sometimes possible to predictwhen a shift is going to occur based on the temporal patterns of the stream and basin. For exam-ple, it is possible that through a thorough ecological investigation, a hydrographer may notice thatbeavers always build their dams during a specific time period and/or that during spring and sum-mer months, vegetation tends to grow and a shift in the curve is needed until that time period is over.Determining shifts can be an art, as stage and discharge have natural scatter due to the volatilityof the cross-section as impacted by erosion/deposition, vegetation, and/or ice. Gaugings that devi-ate from the adopted threshold (typically ± 5%) may indicate a possible shift in the stage-discharge6Figure 2.2: Illustrates a shift correction. The red line represents the baseline rating curve withtriangular markers used to construct linear lines. The purple line is a temporary shift thatoccurred during the active period of the rating curve.Figure 2.3: Variable-shift diagram displaying a negative shift.relationship, but a well-grounded physical reason must justify the decision on when to apply a shift,and where the shift should occur on the rating curve. Figure 2.2 presents a shift for the activerating curve of a gauging station between September 2008 and September 2009. A variable shiftdiagram (V-SHIFT) diagram is often used to help identify shifts and their location. A sample shiftdiagram is shown in Figure 2.3 The V-shift diagram plots the water level of the gaugings on theordinate, and the distance between the measured water level and that from the baseline rating curveon the abscissa. When the data are plotted, the gaugings should resemble a Gaussian distributionaround the zero shift reference line. If not, this could indicate a bias and supports the notion that7a shift may be necessary. A negative shift will cause the temporary aberration in the rating curveto be to the left of the baseline curve, and the opposite for a positive shift. Shifts are meant to betemporary and should always yield back to the baseline curve. If over time, gaugings systematicallyplot above or below the baseline curve, a new rating curve should be considered.Typically, rating curves can exhibit a pattern of moving to the left or right of the previous ratingcurve. A simple sensitivity analysis was performed by Rantz (1982) in which he stated that by plot-ting the rating curve on log paper, changes in the control are identifiable. For example, an increasein the width of the channel control causes the new rating curve to plot parallel and to the right ofthe original, while the opposite indicates a decrease in control width. Scouring and deposition arealso possible to detect and Rantz proposes that scouring will cause a new curve to plot to the rightbut will be concave downward, rather than parallel, while deposition will plot to the left and will beconcave upward.2.3 Uncertainties in rating curveAs one can see, the stage-discharge relationship thus far is affected by the hydraulic controls andtheir variability over time. However, there are many more factors that can affect the stability of arating curve. The following is a list outlining some of the most common factors affecting a ratingcurve as shown in Di Baldassarre and Montanari (2009); Hamilton and Moore (2012); Herschy(1999); Turnipseed and Sauer (2010):• Scour and fill sand-bed channel• Aquatic vegetation• Ice cover• Variable backwater• Changing dischargeUnderstanding how these factors play a role in modeling the rating curve is crucial for ensuring anaccurate discharge time series. For example, during the summer when vegetation is higher than therest of the year, a shift may be probable until the end of the summer when the vegetation no longerhas a strong influence on the river’s discharge. Ice plays a unique role in measuring streamflow, andin many cases, forces hydrographers to have to remove water level sensors to prevent cold weatherdamage to the instruments. This poses a challenge in trying to recreate the discharge time series.In many cases, WSC will interpolate their winter discharge time series with as few as two gaugings(Moore, Hamilton, & Scibek, 2002). Variable backwater effects in streams are usually caused byan obstruction in the river downstream that causes the water to propagate upstream, such as whena river tributary enters the main stem. Changing discharge is typically seen during passing floodwaves where the rising slope of the wave is steeper than the falling. This phenomenon creates ahysteresis effect where the discharge has two different water levels for the same discharge. In a8stage-discharge plot, this is usually identifiable by a loop around the main rating curve.One other source of uncertainty in a rating curve, is the one associated with extrapolation. Highflows can make it dangerous for hydrographers to visit the site, and safely record water level anddischarge. At sites with permanent measuring devices, high flows can cause the instruments to bewashed away. Extrapolation is required for an array of purposes, but special care should be takento ensure that the extrapolation is confined to the physical constraints of the channel cross-section.It is best practice to plot stage vs area on arithmetic or logarithmic axes to identify the relationshipof the two variables. Ideally, both variables will exhibit a linear relationship until a break in thecross-section is observed, similar to the break in a rating curve when a compound cross-section isused. Typically, one can extrapolate until the point where the stage-area curve is outside the boundsof the channel area. This however is still an assumption, and is never guaranteed to give the truerelationship, but can provide a reasonable approximation in some cases.The first published work on modeling rating curve uncertainty was done by Venetis (1970), in whichresidual variance was used to determine the bounds of the uncertainty in a power law rating curve.The data was log transformed and a least square regression was performed on the log-transformeddischarge. Venetis however did not account for measurement uncertainty. The work done by Rantz(1982) and Herschy (1999) really grounded the hydrometry field and provided a foundation for thework that follows.Di Baldassarre and Montanari (2009) presented a general uncertainty framework that assumes aglobal uncertainty accounting for the discharge measurement technique, interpolation and extrap-olation errors, the presence of unsteady flow conditions, and seasonal changes. They performeda one-dimensional hydraulic model on the Po River in Italy to carry out their estimation of ratingcurve uncertainty. Their results found that discharge uncertainty could be well over 40% and havehigh implications on hydrological models such as rainfall-runoff models that are dependent on theresults of a rating curve to calibrate the models. However, Di Baldassarre and Montanari also madean exhaustive list of assumptions, primarily that the cross-section of the river is time invariant andthat the uncertainties affecting the discharge values are independent, implying that the measuringtechnique used, velocity-area method, is independent of the flow conditions and vegetation.Heteroscedasticity is a common problem observed in rating curve modeling in which dischargeuncertainty increases as the magnitude increases (Sorooshian & Dracup, 1980). Using least squareregression on a power law function, however, violates this assumption, unless log-transformed.Petersen-Øverleir (2004) pointed out this common error of unaccounted heteroscedasticity uncer-tainty affecting rating curve parameters and estimated discharge. Petersen-Øverleir suggested amaximum likelihood method that better accounts for the natural uncertainty exhibited in the stage-discharge relationship. However, in that manuscript, only the uncertainty associated between stage9and discharge was examined, and not the individual gauging errors. The same author has also in-vestigated the impact of hysteresis (Petersen-Øverleir, 2006) and overbank flow in rivers (Petersen-Øverleir, 2008). In the first paper, the Jones Formula was used to identify a change of stage rate.In the latter paper, the author discussed issues with modeling multi-segmented rating curves, inwhich the range of the parameter estimates might be incorrect and the location of the breakpointmay also not have strong physical meaning. The author also made simplifying assumptions aboutchannel characteristics and hydraulic controls. Some of the numerical challenges were also not ad-dressed by the author in his 2005 manuscript (Petersen-Øverleir & Reitan, 2005), but were in laterpublications.2.3.1 Bayesian and MCMC methodsAs computation power has increased in recent years, so has the number of works using Bayesianand Markov Chain Monte Carlo (MCMC) methods. In Bayesian analyses, hydraulic knowledge canbe incorporated into the modeling process via prior distributions. As the model receives new data, itupdates its beliefs, via a likelihood function, and creates a posterior distribution (Reitan & Petersen-Øverleir, 2008). It is most common to use Gaussian distributions as priors and posteriors becauseof the closed form solution. However, if the prior and posterior are not conjugate to each other, aclosed form solution does not exist and MCMC methods must be used to sample from the posterior(Godsill, 2001). The usage of these methods naturally allows the modeler to sample from a proba-bility distribution, and therefore assess the uncertainty associated with the parameters of Equation2.1 and the predicted discharge. Incorporating hydraulic knowledge into the model via priors allowsthe modeler to use more information, rather than the gaugings alone.The segmented rating curve problem by Petersen-Øverleir (2008) was improved using Bayesianand MCMC methods, and further helped with modeling the uncertainty of the rating curve (Reitan& Petersen-Øverleir, 2009). A compounded channel was evaluated using Bayesian methods to es-timate the parameters in each power law segment, as well as the breakpoints in the stage-dischargerelationship. The posterior distribution that resulted from the Markov Chain Monte Carlo samplingwas then used to determine the uncertainty of the rating curve. The authors acknowledged that theirmodel was time invariant and assumed that their rating curves were not affected by changes in theriver geomorphology, hysteresis, or variable backwater. For backwater effects, Petersen-Øverleirand Reitan used two gauging stations to determine the slope of the reach and the correct dischargein their 2009 publication. While this solution does help with estimating a more correct discharge,it is considered impractical in many situations to include more than one gauging station. Multi-segmented rating curves were also looked at in McMillan and Westerberg (2015). Here, the authorscreated a new likelihood function called the ”Voting Point likelihood.” Their model assumed a lo-gistic distribution for the discharge gaugings and prior distributions for the parameters of the powerlaw function. A rating curve was then created from sampling the prior distributions of the param-eters, and the curve is assigned a likelihood weight, dependent on the logistic distributions it has10intersected with, as well as the range of stage and discharge that the rating curve spans. This proce-dure was then repeated using a MCMC method to create a posterior distribution of possible ratingcurves. The results of this method allowed to explicitly model aleatory and epistemic uncertainty.The authors treated all gaugings as equal, and do not account for permanent changes that may occurin the river after larger altering rainfall events.Le Coz et al. (2014) also showed the value of adding hydraulic knowledge to the developmentof the rating curve, as well as the importance of properly classifying the gauging error. The uncer-tainty in this study was divided into the uncertainty associated with the gaugings, the location ofthe gaugings, and the remnant uncertainty. Since the remnant uncertainty was unknown, a MCMCsimulation was used to estimate the value. What the authors show is that when the uncertainty inthe gaugings was assumed to be 5%, as suggested by (Herschy, 1995; Rantz, 1982; Turnipseed &Sauer, 2010), the remnant uncertainty increases. When the actual (true) uncertainty in the gaugingswas used, the remnant uncertainty was lower. These results indicate the importance of capturingthe correct discharge uncertainty, rather than assuming a standard value for all. The work done byLe Coz et al. in 2014 helped the community acknowledge the importance of Bayesian and MCMCusage in rating curve development, through their incorporation of the techniques into their userfriendly developed method called, BaRatin. However, a major limitation of the work is its station-ary assumption.2.3.2 Time variance in rating curvesIt is accepted that river characteristics naturally change over time and therefore the controls whichaffect the stage-discharge relationship are also evolving temporally (Di Baldassarre & Montanari,2009; Hamilton & Moore, 2012; Leonard, Mietton, Najib, & Gourbesville, 2000; Schmidt, 2002).Common practice is to apply time-varying shifts and/or increase the frequency of gaugings. How-ever, the latter option is often highly expensive. A few studies have tackled the task of time-variantrating curves. In Reitan and Petersen-Øverleir (2011), the authors transformed the assumed timeinvariant parameters to time dependent through the combination of a Bayesian hierarchy approachand the Ornstein-Uhlenbeck (OU) process. For those interested in the OU process, please refer toReitan and Petersen-Øverleir (2011) for further information. The OU process serves as the con-tinuous time-varying function for the power law and the parameters are estimated by updating themodels with information as it is received. The results produced 95% credibility bands around theestimated parameters in the power law function, and the rating curve. The uncertainty bands aroundthe parameters, however, have sharp changes in the uncertainty due to the OU process, implying nogradual change in the uncertainty over time. It can be argued this sharp increase in uncertainty is notalways observable in nature. Measurement errors were also not considered by the authors. The cred-ible bands around the rating curve are purely Bayesian and uniform in shape, rather than dynamicuncertainty bands that illustrate regions of high and low confidence with respect to the gaugingsand their date of measurement. Guerrero, Westerberg, Halldin, Xu, and Lundin (2012) studied the11dynamic relationship of stage and discharge on six hydrometric stations in Honduras through theusage of topographic surveys and Manning’s equation to develop their rating curve model. Theuncertainty in their model was evaluated using Monte Carlo analysis on a moving window of 30data points using the Generalized Likelihood Uncertainty Estimation (GLUE) method. Their resultsindicated a temporal variability in the scaling, offset, and exponent parameters of Equation 2.1. Thescaling exponent appeared to be the most sensitive to time and was the focus of their conclusions.The uncertainty, however, did not account for the uncertainty in each individual gauging and theother parameters were not discussed. In Westerberg, Guerrero, Seibert, Beven, and Halldin (2011),time variability of the rating curve and its uncertainty was derived through a weighted fuzzy re-gression, but some of the assumptions are too subjective. For example, their inclusion of the topthree largest discharge values in the time series into each 30 data point window, and the usage ofwhat appears to be an arbitrary window does not account for anything related to changes in thegeomorphology of the river or vegetation growth. McMillan et al. (2010) divided the window ofgaugings used in the uncertainty analysis of rating curves as being dependent on a 0.5 year returnperiod. This return period is used as a threshold to classify major flood events and is subjective.In this study, the authors looked into discharge errors stemming from rating curve uncertainty andthe impact that they had on rainfall-runoff models. They provided a methodology to quantify theuncertainty in discharge as a probability density function conditioned on errors in stage and veloc-ity measurements, assumptions about the stage-discharge relationship, extrapolation of the ratingcurve, and changes to the cross-section due to vegetation growth and bed movement. Coxon et al.(2015) showed that there are regional differences in the stage-discharge relationship of 500 gaugingstations in the UK, but that local conditions such as weed growth and channel instability typicallydominated the magnitude of discharge uncertainty.2.3.3 Machine learningIn this day and age, it would not give justice to this thesis to not briefly mention machine learningand its impact on hydrology and rating curves. There are also similarities between the Gaussiankernels used in this thesis, and those in Gaussian Processes. At its core, machine learning is aboutlearning and adapting, as most notability seen in nature (Goldberg & Holland, 1988). Machinelearning reduces the burden of having to code every rule, and instead lets the machine learn therules via the data. This however puts a large emphasis on data driven models and can cause er-roneous estimates when the physical constraints of the model and system being investigated areignored. The reader is referred to Solomatine and Ostfeld (2008) to learn more about the commonpitfalls in data driven models using machine learning, as well as interesting new avenues that thisexciting field can take hydrology and water resources. For a more practical overview, the readermay find Hsieh (2009) to be useful.Bhattacharya and Solomatine (2005) used an artificial neural network (ANN) in combination with anM5 model tree to develop a stage-discharge relationship. ANN, as of 2013, is the most common used12machine learning algorithm in water resources research (Govindaraju & Rao, 2013). ANN works asa collection of nodes, or neurons, that transmit information from neuron to neuron as received fromthe input-ouput data. Weights are assigned, by learning, to each neuron and are used to develop afunction that explains the relationship between input and outputs (Hsieh, 2009). The M5 model treeused is a type of tree like regression where the algorithm splits the parameter space (into subspaces)and estimates a local regression for each subspace. The machine learning model proved to be moreeffective and had a lower root mean square error in training and verification modes. However, eachtraditional rating curve is fundamentally connected to the physical constraints of the river, whichhelps in modeling and predicting values past the highest measured discharge. The extrapolationof rating curves was looked at by Sivapragasam and Muttil (2005) but again, no connection to thephysical constraints of the river were investigated, although having mentioned a few of the mostcommon methods in hydrometry. In Guven, Aytek, and Azamathulla (2013), the authors used thestage and discharge time series to teach an explicit neural network formulation (ENNF) model. Theauthors state the performance of their model is better than the traditional rating curve. Using thetime series over the gaugings forces the uncertainty associated with the original rating curve modelto propagate into their machine learning model. This therefore suggests overfitting by the ENNF.While machine learning does provide a good alternative for models with sparse data, caution shouldbe exercised to ensure modeling within the confinements of the natural system, and using localconditions to drive the model. This notion that local conditions impact rating curve uncertainty themost (Coxon et al., 2015) triggers an interesting question of, ”What local information is most usefulin quantifying and reducing rating curve uncertainty?”2.3.4 Information is informativeInformation can be defined as the particular arrangement of signals that can provide a change inbelief about the current state (Schement & Ruben, 1993). This belief, or better yet, the current stateof our knowledge, changes over time as one receives information (Nearing et al., 2016). If we canprovide information to our current state of knowledge, we implicitly can increase how much morewe know. This gives rise to the idea that information has value and can lead to better decisions(Weijs, 2011). But information does not arise out of nothing, it is simply a matter of extractingall the available information. A classic example in hydrometry is shown in Figure 2.4. The gaug-ings provide more information than the relationship between stage and discharge alone, but alsodescribe what is causing the change. Therefore, providing more information to the hydrographerand an example of increasing our knowledge, by updating our belief about the state. But just likeyou can extract more information from the data, one can also include new sources of information tothe model.Hamlet, Huppert, and Lettenmaier (2002) showed that the inclusion of El Nin˜o Southern Oscil-lation (ENSO) and Pacific Decadal Oscillation (PDO) signals into their streamflow prediction model13Figure 2.4: Different physical processes that are common in rating curve and an exampleof extracting more information than stage and discharge relationship (from Herschy(1995)).increased the lead time of their model by six months, and improved the operating systems of ahydroelectric power plant in the Columbia River basin. It can be argued that the hydroelectricstakeholders may have found high value in including this exogenous information into the model. Interms of rating curves, Weijs et al. (2013) found that by using electricial conductivity (EC) signals,the uncertainty in the rating curve of an alpine stream in Switzerland was reduced by over 40%. Thepredictive power of the EC in the stream was found to be of similar magnitude to that of the waterlevel. This manuscript is the only example known to the author, in which auxillary informationhas been used to improve the rating curve and its associated uncertainty, and raises an interestingquestion of, ”what other auxiliary information can be used to improve rating curve uncertainty?”In British Columbia (BC) hydroelectric power plants are a common mean of producing electric-ity because of the advantageous geographical attributes in the province. As of 2014, there were 56independent run-of-river (ROR) projects in BC, with 25 others expected to be operational by 2018(Clean Energy BC, 2015). Each run-of-river hydroelectric plant must ensure compliance with the14regulatory authorities and meet an instream flow requirement (IFR) downstream of their stream in-take. An IFR is placed to ensure the livelihood of aquatic and terrestrial life. To comply with IFRrequirements, a rating curve may be used to predict the discharge. Depending on the location ofthe ROR project, a rating curve may be volatile due to the sediment load and energy of the river.These dynamics can create high uncertainty in the predicted discharge and can have high econom-ical impact on the profit of a ROR project. This scenario sets up an interesting research landscapeto identify the information available from a ROR project to be utilized in the development and im-provement of a rating curve and its uncertainty.This research will address the current gaps in the literature and develop a probabilistic dynamicrating curve that highlights regions of high and low conditional probability. The new rating curvewill be described using conditional distributions that are updated as new gaugings are added to themodel thus forcing the uncertainty bands to by dynamic. The new model will also allows usersto produce continuous discharge time series, as well as identify the conditional probability of dis-charge occurring, given a water level measurement. The research will further explore the notion ofidentifying and utilizing available information towards uncertainty reduction as seen in the Weijs etal. (2013). In addressing these two gaps, the author hopes to convince the audience the necessity fortransitioning into a more probabilistic mindset, as well as the importance of assessing, quantifying,and modeling rating curve uncertainty.2.4 Research objectivesThis thesis will focus on exploring the data/information available from a run-of-river hydroelectricproject in northwestern BC to model a probabilistic dynamic rating curve. The central researchquestions to be addressed at the end of this journey are the following:1. Can a rating curve be developed using probability distributions to produce similar results suchas from a deterministic model?2. Can the inclusion of auxiliary information from a run-of-river hydroelectric plant reduce theuncertainty in the IFR rating curve?The thesis will also examine big picture ideas such as:• The importance of utilizing available information to reduce model uncertainty• Probabilistic versus deterministic models• Dynamic uncertainty bands for hydrographs• Economic considerations of modeling rating curve uncertaintyA handful of information theory measures and metrics will also be considered when evaluating thedeveloped probabilistic rating curve and the auxiliary information used. Employing an information15theory framework becomes a natural choice when analyzing problems concerned with informationand uncertainty.16Chapter 3Data and methods for fitting ratingcurves3.1 Site descriptionThe data used in this project is from a hydrometric stations located on the Iskut River between theconfluence with the Forrest Kerr Creek (FKC) and the braided reach of the Iskut River, a distance ofabout 20 km (Figure 3.1). The net drainage area of the watershed is approximately 9,500 km2, butthe watershed affecting the project is a fraction of the net drainage area, an estimated area of 6,978km2. The watershed includes a high plateau with broad valleys to the east and glaciated mountainsto the west. The maximum elevation in the watershed is 2,558 m with a minimum of 244 m down-stream at the FKC intake. The region around the Iskut River includes high glacial landforms thatcovers approximately 10.2% of the Iskut watershed (Northwest Hydraulic Consultants, 2016).The area has a volcanic history that has helped form the geology of the region. Heavy volcanicrock is found directly upstream of the confluence on the Iskut River, that is transported throughdeep narrow volcanic bedrock valleys. The sediment enters the reach of the Iskut River where theproject is located and is fed by small tributaries that originate from meltwater and precipitation. Dueto the location of the project, the climate of the watershed is influenced by its mountainous terrainand its proximity to the Pacific Ocean. The annual precipitation of the watershed exceeds 2,500mm/year (Northwest Hydraulic Consultants, 2016).All the hydrometric stations have stage-discharge rating curves that are commonly altered by theunique characteristics of the Iskut Watershed. The high sediment load of the Iskut is a driving forcefor the production of new rating curves. The Iskut River is a highly active river that transports largeamounts of sediment. It is estimated that the river caries over six million tonnes of fine sediment peryear. Such high quantity of particles and boulders traversing the reach in which the six hydrometricstations are located is a cause for the volatile relationship between stage and discharge at each sta-17Figure 3.1: Location of Forrest Kerr Hydroelectric project. The Iskut river flows from east towest.tion. To mitigate the impact of the sediment on the intake of the hydroelectric project, a box culvertwas designed upstream of the intake, with a sluiceway and radial gate to facilitate the passage ofsediment (refer to Figure 3.2). The sediment, however, does aggregate over time and as a result, theradial gate is opened to flush the sediment out (sluicing). This has an impact on the downstreamhydrometric station that is used to ensure IFR compliance of 10 m3/s since the large sediment loadstypically alter the stage-discharge relationship.Due to the altering effects that sluicing may have on the geomorphology of the river, new gaug-ings must be recorded immediately after a sluicing event to ensure an accurate discharge estimate.Altering events like this also imply that prior gaugings are probably not representative of the newstage-discharge relationship and therefore, older gaugings should have less weight in the estimationof the most recent rating curve.Current practice involves utilizing only the most recent gaugings, and sometimes a subset of oldergaugings, to create a rating curve. Unfortunately, it may not always be possible to measure a newfull range of stage and discharge values to form a new rating curve, and so previously recordedgaugings are selected based on how the hydrographer believes the rating curve has changed. Al-though the modeler may have a strong intuition for how to select the gaugings, information on howthe rating curve is changing may be lost due to selecting only a subsample of gaugings. Instead,this author argues that all the gaugings should be used with a weighted coefficient that gives more18Figure 3.2: A schematic of the intake structure just upstream of the IFR hydrometric station.Courtesy of Tim Argast of NHC.weight to the most recent values, as well as those after a sluicing event. This allows the rating curveto include all information from the gaugings, as well as the new sluicing and time information.3.2 Data availabilityThe data used to establish the rating curve originate from the IFR hydrometric station, while theadded information is from the intake structure upstream of the IFR station shown in Figure 3.1. Thehydrometric station has a record between April 2013 to July 2017 for a total of 105 gaugings. Thefirst 19 gaugings were used to develop the initial rating curve on which the remaining gaugings wereadded. The reasoning for this is that a subset of gaugings is required to develop any rating curve,and it would not make sense, nor is it practical, to develop a rating curve with a small set of gaugingsas it is unsure what the true relationship will be. The first 19 gaugings were selected as the initialpoints to develop the rating curve as this was the first curve created by NHC during their analysis.19The offset value (cease-to-flow water level) was determined to be 234 m using only the initial 19gaugings. This offset value was carried into all the rating curve analyses and was not updated toreflect changes in the controls. A discussion on the decision for selecting 234 m is presented in thediscussion section of this thesis. Lastly, all the data have undergone a natural log transformation tobetter fit the assumptions of normality. A bias does occur when the variables are log-transformed,used in linear regression by applying least-squares regression, and then back-transformed withoutaccounting for the error term in the regression. When working with log-transformed variables, theerror term in the regression has a mean of zero in logarithmic units, but the mean parameter of theerror changes when the variables are back-transformed, and the error must be properly compensatedfor to ensure accurate predictions of the discharge. Readers should take note of this and can furtherread on the issue found in Newman (1993). However, in this thesis, the author is concerned morewith producing probability distributions for the discharges, rather than mean values, but mean valuescan be calculated, and if back-transformed, the necessary corrections should be applied.3.2.1 Sluicing dataAlthough sluicing events were not explicitly recorded, they were inferred by evaluating certain pat-terns from four other signals at the intake structure of the FKC project (Figure 3.3). High sedimentbuild-up necessitates sluicing, and to prevent damage to the turbines and structures downstream ofthe intake pipeline, the power tunnel must be turned off (or reduced significantly). Less water isnow diverted through the power tunnel and the radial gate must be opened to allow the water to flowdownstream. The radial gate causes the water levels at the forebay and the sluiceway to decrease—forcing a higher discharge downstream at the IFR hydrometric station. This pattern is not perfect,and the water level at the sluiceway may not always drop when sluicing occurs. The reason for thisis a combination of how fast the intake to the power tunnel is closed, how fast the radial gate opens,and the discharge of the Iskut River coming in. The four time series used in the identification of thesluicing events are on a one minute interval with records for each beginning at different stages ofthe project. The overlap between the time series begins in 2015 with each time series having over1,500,000 values. A manual inspection of how the four time series change in relation to each otherwould be a laborious task. To aid in identifying sluicing events, a MATLAB code was written tofind the location of the sharp changes in three of the four time series. The time series of the sluice-way water level was not used since it was identified by NHC employees to not always follow thepattern. Each time series is constrained with a threshold that was characterized by a sluicing eventin October 2016. This sluicing event and its pattern is the only sluicing event that was used as anillustration to the author for what the general pattern in the other time series are during sluicing. Thethreshold was relaxed slightly to ensure the algorithm did not miss events. The dates extracted fromeach time series when the power tunnel discharge decreased, IFR increased, and forebay level de-creased were found by the findpeaks MATLAB algorithm and then analyzed. The sluiceway signalwas not used in the analysis because NHC employees found that the signal did not always behaveas expected. Since the changes of all time series were not expected to align perfectly, a window of20Oct/26/16 Oct/27/16 Oct/27/16 Oct/28/16 Oct/28/16 Oct/29/16-20020406080Powertunneldischarge (cms)Oct/26/16 Oct/27/16 Oct/27/16 Oct/28/16 Oct/28/16 Oct/29/160100200IFR discharge(cms)Oct/26/16 Oct/27/16 Oct/27/16 Oct/28/16 Oct/28/16 Oct/29/16244246248Sluiceway stage(m)Oct/26/16 Oct/27/16 Oct/27/16 Oct/28/16 Oct/28/16 Oct/29/16220240260Forebay stage(m)Figure 3.3: Time series of stage and discharge at two two different locations. A clear patternfor when sluicing was identified as having occurred. The forebay stage is located rightbefore the power tunnel intake in the desander basin.2 hours was selected as a possible period of overlap. The window was selected to try and ensure agreater overlap in dates between the time series, but the selection of the window was an arbitrarydecision.The algorithm identified over 70 possible events that shared the pattern described above. The eventswere then used to perform a second round of inspection by visually scanning through the time seriesmore efficiently, and verifying if the identified events were sluicing episodes, or not. The secondround of inspection done manually found a few sluicing events that the algorithm missed, as well asa few misclassified patterns. After filtering, 25 possible sluicing events were identified and verified.The reason that many of the original sluicing events identified by the algorithm were not selectedas possible sluicing events is due to 1) characteristics and unpredictable behavior of the sluicewayand 2) storm events. The latter made it difficult to assess with certainty if sluicing occurred or not,since the pattern would display high variability in the signals. Although one could argue that sluic-ing is possible during heavy rain events, it may not always be the case, and for this reason, thesedates from the original set identified by the algorithm, were not used. The estimated time for eachsluicing event was also identified. This was done by measuring the average time that it took theforebay water level and discharge from the power tunnel to go back to the level prior to dropping.The forebay and powertunnel were used since they proved to be the most consistent and stable time21Figure 3.4: Time series of stage and discharge at two different locations. Changes in the timeseries with identified sluicing events detailed as red vertical lines. The colored patchesrepresent time periods in which new NHC rating curves were developed.series. Figure 3.4 presents the four time series with red lines indicating periods where sluicing waspresumed to occur. The colored patches represent different rating curves created by NHC and theduration the curves were active. Figure 3.4 shows the relationship that the duration of some of therating curves had with the number of sluicing events. Although not every new curve was a con-sequence of a sluice event, there were a few that appeared to be. To better illustrate the sluicingevents, a histogram of the events was created (Figure 3.5). The months with the largest events wereMay, June, and October, which coincided with the behavior of the Iskut watershed. May and Juneare typical months when snow and ice melt and force an increase in discharge on the Iskut River(Figure 3.6). With an increased discharge and steep valleys near the confluence, large sedimentsand boulders can be carried and deposited at the intake structure. The reason for a high number ofsluicing events in October may be due to a partial combination of the amount of precipitation thatthe region receives from the Pacific Ocean, the volcanic geology, and the high glacial concentrationin the watershed. High rainfall can cause sediment from across the watershed, including glacialsediment, to be transported downstream through the Iskut and into the sluiceway.3.3 Rating Curve ModelThe rating curve model developed in this thesis is purely probabilistic and can operate in forecastmode, which uses real-time data to develop the most recent rating curve. This mode is desirablefor operators needing to monitor IFR to ensure compliance. The model predicts the next discharge22Jan Feb March April May June July Aug Sept Oct Nov DecMonths0123456Cumulative sluicing eventsFigure 3.5: A bar graph showing which months experienced the most sluicing events betweenJanuary 2015 and July 2017.Oct/28/14 Feb/05/15 May/16/15 Aug/24/15 Dec/02/15 Mar/11/16 Jun/19/16 Sep/27/16 Jan/05/17Time050010001500Discharge (cms)Figure 3.6: Hydrograph at IFR hydrometric station between January 2015 and July 2017.23Jan Feb March April May June July Aug Sept Oct Nov DecMonths050100150200250Cumulative precipitationFigure 3.7: A bar graph showing which months experienced the most precipitation betweenJanuary 2015 and July 2017.gauging by propagating the information used (i.e. time/sluicing weights) to update the Gaussiankernels to the most up-to-date model, prior to the next gauging being added.3.3.1 Forecast modelThe forecast model operates by updating the rating curve with gaugings as they are observed andmakes a prediction for the next gauging. The model uses a special case of the leave-one-out crossvalidation (LOOCV) method, in which the time and sluicing models are propagated to the period inwhich the next gauging is observed, and then makes a prediction for that value without using thatgauging in the subset of data. Once the prediction is made, the observed gauging is added, and aprediction for the next gauging is made using the same process. The LOOCV method was preferredfor the validation of the rating curve due to the limited dataset, and is a realistic evaluation of un-certainty in the way rating curves would be used without using data from the future. Unfortunately,dividing the dataset into training and test sets would not provide enough data in either set.Time modelThe time model used in the forecast option of the rating curve resembles an exponential distributiony = f (x|µ) = 1µe−xµ (3.1)24where y represents the weight, x is the time difference between the current time step and previouslyobserved gaugings, and µ is the expected value of the exponentially distributed variable—its in-verse, λ , is the rate at which the weights of the gaugings change. Equation 3.1 is typically usedin financial models for monitoring future market prices (Longerstaey & Spencer, 1996; Pafka &Kondor, 2001). Recent data are assigned larger weights than the older data that may not necessarilyrepresent the present. In hydrometry, rivers of high energy carrying large sediment can generatenew cross-sections and invalidate prior gaugings.The variable µ was selected to be 365 so as to make the time weights change at a rate of 1365 .A sensitivity analysis was performed and results showed minimal variability between the final re-sults and the value of µ chosen. An illustration of the process is shown in Figure 3.8. As timeFigure 3.8: More weight from the exponential function (dotted line) is assigned to the gaug-ings (colored boxes) that are closer to the current time step, Ti.progresses towards the next observed gauging, the exponential distribution is updated to reassignweights to the already observed gaugings.Sluicing modelTo create a probabilistic model of the sluicing information, a Gaussian kernel was drawn aroundeach event and then summed to create a kernel distribution. The weights of the individual kernelswere determined by the time duration of the sluicing events. As sluicing information becomesavailable in time, the kernel distribution is updated to eventually span between May 2015 and July2017 (Figure 3.9). In the forecast mode, only the observed sluicing events are used to develop thekernel distribution. The cumulative distribution function (CDF) is then obtained and used to assignweights to the observed gaugings. Gaugings after the last recorded sluicing event are assignedhigher weight than those that occurred prior to sluicing. Figure 3.10 shows the evolution of thenormalized weights taken from the updated CDFs, while Figure 3.11 shows the final sluicing weightassignment when all the gaugings were observed. A comparison of Figures 3.10 and 3.11 show asharp change in the value of the weights when the weight for the 52nd gauging is evaluated. Thiswas done since sluicing information was unfortunately only available post May 2015 and there were2516-May-201521-May-201526-May-201500.0050.010.0150.020.025PDF16-May-201521-May-201526-May-201500.20.40.60.81CDF05-Feb-201524-Aug-201511-Mar-201600.010.020.030.0405-Feb-201524-Aug-201511-Mar-201600.20.40.60.8111-May-201205-Feb-201501-Nov-201728-Jul-202000.0050.010.0150.020.02511-May-201205-Feb-201501-Nov-201728-Jul-202000.20.40.60.81Figure 3.9: The kernel PDF and CDF are updated as more information on possible sluicingevents enter the model. The first panel shows the PDF and CDF of the first sluicingevent, the second panel shows and updated version, and the final panel shows the modelwith all the sluicing information.Figure 3.10: An evolution of how the sluicing weights change as new gaugings are observed(x-axis) in time. Each row is a vector of the sluicing weights51 gaugings that were recorded prior to the first sluicing event available to the model. Using thecurrent distribution model would assign a significantly smaller weight to those values and would notbe a fair weight assignment. Instead, gaugings prior to the first sluicing event were assigned equalweight of one, which after normalization the weights would be lowered as more sluicing eventsenter the model.3.3.2 Hindcast modelIn hindcast mode, the probabilistic rating curve can be tested to evaluate how accurate the predicteddischarge output is with the known value, given all the information. The predicted gauging is re-260 20 40 60 80 100 120Gaugings00.0050.010.0150.020.0250.03Weight assignedFigure 3.11: An equal weight assignment for the gaugings prior to sluicing information andthen the weights increase in magnitude for values later in time. The figure is a snapshotif Figure 3.10 at the time of the last added gauging.Figure 3.12: Evolution of the time model at different time period.moved from the dataset so as to eliminate any bias, prevent overfitting, and keep testing independentof predictors. This procedure is repeated for every gauging, and therein utilizes a LOOCV method.Time modelTo perform the time model in hindcast mode, the exponential function was modified into a Laplacefunction so as to assign the greatest weight to the gaugings closest in time to the one being predicted,and less to those further out (Figure 3.12. The weights were computed as follows:y = f (x|ζ ,b) = 12be−ζ−xb , if x < ζex−ζb , if x > ζ(3.2)where the parameter ζ is the location parameter which is set as the current timestamp of the gaugingunder evaluation and the parameter b was set to 365, equivalent to the µ parameter in Equation 3.1.An example of what the distribution looks like is shown in Figure 3.12.27Sluicing modelThe sluicing model was modified slightly to more accurately echo the effect of sluicing events. Themodel follows a similar pattern as the time model, where the effect of sluicing events decreases thefurther the gaugings are from the gauging that the hindcast model is evaluating. However, sincethe first 51 gaugings did not have any sluicing information available, those gaugings remained witha uniform distribution. Then, once sluicing information became available for the remaining gaug-ings, the weighted CDF described earlier was used to produce the function necessary to assign theremaining weights. The negative of the function was taken so as to have the function vary fromzero to negative one and then a value of one was added to the function to align it with the equalweights of the first 51 gaugings. This was necessary to again ensure that the later gaugings had lessinfluence on the first 51 gaugings. The function was then normalized and the sluicing model for thefirst 51 gaugings in hindcast mode is shown in the top left panel of Figure 3.13. As the hindcastmodel propagated towards the later gaugings, the model had to be altered to ensure highest weightis assigned to the gaugings closest to the gauging being predicted. For this, three discrete functionshad to be appended together. The first function described the uniform weight for the first 51 gaug-ings, and the second and third functions were CDFs. The first CDF used all the gaugings startingat the 52nd gauging, up to the gauging being evaluated, while the second CDF used the remaininggaugings. The second CDF was negatively transformed, and a value of one was added to translate itup and align it with the first CDF. This then produced a function as shown in the bottom left panelof Figure 3.13. The CDF was chosen because of its natural curve, and because of its feasibility toadd the time duration of the sluicing events.A few changes had to be made to the sluicing model when the hindcast model was evaluatingthe 52nd , 53rd , and the last two gaugings since producing a CDF using the ksdensity function inMATLAB produced undesirable results when appending all three functions. Instead, the 52nd and53rd gaugings were assigned the highest weight of the CDF using the remaining gaugings (aka avalue of one), and then normalized. This produced a function that looks like the top right panel inFigure 3.13. For the last two gaugings, the same was done in which the 104th and 105th gaugingwere assigned values of one (bottom right panel in Figure 3.13).When the functions were all appended, it produced a discrete function of length L, that was 151(100 from the ksdensity function and 51 from the first gaugings) when the hindcast model was eval-uating the first 51 gaugings, and then had a max of 251 when evaluating the other gaugings. Usingthe function and the timestamps of the gaugings, the proper weights were then selected.3.3.3 Development of rating curveUsing the information provided above, four different models per rating curve mode (forecast andhindsight) can be developed and evaluated. The four models are:1. Rating curve that utilizes both sluicing and time information28Figure 3.13: Evolution of the sluicing model when applied in hindcast mode. An evolution ofhow the sluicing model changes in the hindcast mode of the rating curve. The top leftpanel shows the function typical for the first 51 gaugings. The top right panel shows thefunction for the 53rd gauging. The bottom left is the typical function for the gaugingsbetween 53rd and 104th gauging, with the peak of the function centered around thegauging being evaluated during the hindcast model. The bottom right function is thefunction for the last gauging.2. Rating curve that utilizes only time information3. Rating curve that utilizes only sluicing information4. Rating curve with no extra information addedNote that the model where no extra information is used assumes that all the gaugings have equalweight. The foundation of how the model operates is based on a weighted joint distribution betweenall the gaugings. The joint distribution for the rating curve was calculated using two dimensionalkernel density estimations for the gaugings, with weights that give more, or less, importance tosome gaugings than others. The kernel for each gauging was a multivariate normal probability29distribution:y = f (x,µ,Σ) =1√|Σ|(2pi)d e− 12 (x−µ)Σ−1(x−µ)′ (3.3)where y is the probability distribution, µ in the multivariate distribution equation is a 1-by-d vectorthat describes the location, or center, of the observed gauging and Σ is a d-by-d symmetric posi-tive definite covariance matrix that describes the spread and scale between the two variables. Thevalue d in Equation 3.3 is two since only stage and discharge are evaluated. The summation ofeach Gaussian distribution, and the weights, create the joint distribution for the stage and dischargerelationship.Covariance functionThe covariance matrix in Equation 3.3 helps describe the geometry of the Gaussian kernel. Thecovariance function is written as such:Σ=[σ2H σHQσQH σ2Q](3.4)Here, the parameters σH and σQ explain the standard deviation, or spread, in the x and y directionof the kernel, which are related to the marginal distributions, while σHQ represent how the two pa-rameters change with respect to each other. Note that H represents h-h0.In Equation 3.4, the parameters are unknown but can be solved for using the measurement un-certainty. In Figure 3.14, the measurement uncertainties are used to represent the uncertainty in thegaugings. These uncertainties are related to the conditional probability of the kernel at the gaugings,and can be used to solve for the parameters in Equation 3.4. The relationship connecting the un-certainties in the conditional and marginal distributions are shown in Equation 3.5 and Equation 3.6.The parameters σH∗ and σQ∗ in Figure 3.14 represent the standard deviations of the conditionaldistributions at the gaugings, and are related to the marginal distributions via Equations 3.5 and 3.6.σH =√σ2H∗(1−ρ2) (3.5)σQ =√σ2Q∗(1−ρ2) (3.6)σH∗ =ρ ∗σQ∗S(3.7)The parameter ρ , a weighted correlation coefficient, was found by using only the gaugings observedthus far by the rating curve model and their assigned weights, and every time a new gauging wasobserved, a new correlation coefficient was calculated. However, as shown in Equation 3.7, there30Figure 3.14: A multivariate Gaussian kernel with its marginal distributions and a conditionaldistribution. Note that only the scaling parameters are shown for illustrative purposes.The location parameters per kernel are fixed to the value of the gaugings.exists a relationship between the uncertainty parameters of a Gaussian multivariate distribution, itsslope, and the correlation coefficient, so therefore, all four parameters cannot be fixed without en-suring Equation 3.7 is satisfied. In this thesis, the slope (S) of each kernel was fixed to a value θthat was found by using the same subset of data as described in the correlation analysis, and per-forming a weighted linear regression. Ensuring that all the slopes of the kernels are aligned aids ininterpolation and extrapolation of discharge values, as well as produces a model that resembles theshape of a deterministic rating curve. To ensure that Equation 3.7 was met, the slope and correlationfactors were fixed by the described methods, and σQ∗ was assumed to be 5% of the magnitude of thedischarge gaugings. This left σH∗ , the conditional uncertainty of Q given Hgauging as the unknownparameter. This was done because it was assumed that the original assumption of 2 mm for σH∗would limit the ability to correctly identify σQ∗ , and might underestimate the uncertainty in the dis-charge. Given that if real measurement uncertainty data were available, the correct numbers couldbe used to determine the shortcomings in the parameters. The assumption of 5% of the dischargemagnitude as ΣQ∗ was shown to be a reasonable value in the literature (Di Baldassarre & Montanari,312009; Herschy, 1995; Rantz, 1982). The author does recognize that this assumption could have aninfluence on evaluating the uncertainty of a rating curve, but due to the lack of data on how dis-charge was measured, this assumption had to be made, but can be improved on, if the true dischargemeasurement uncertainties are available.Solving for all the parameters in Equation 3.4, the individual covariance matrices per gauging werecalculated and Equation 3.3 was used to develop the Gaussian kernels. The summation of all theindividual kernels produced a joint distribution for the gaugings numerically evaluated on a discretegrid of size 500x500 with a resolution of 0.0123 logarithm units that is based on the minimumand maximum values for the gaugings. From the joint distribution, the conditional distribution wascalculated by normalizing every column in the joint distribution to one—therefore producing theprobabilistic dynamic rating curve. The conditional distribution is then useful for extracting theconditional distributions of Q given a water level measurement from the continuous logger. Toobtain a point value estimate, the mean or mode of the distribution can be taken. For this thesis,the weighted mean of the distributions were used. From the rating curve, the conditional distribu-tion of Q given the continuous water level measurement can be extracted and appended to create ahydrograph with dynamic uncertainty bands, as shown in the results.3.4 Metrics for evaluating rating curveTo understand the accuracy of the probabilistic rating curve model, an evaluation of its performancemust be undertaken. In hydraulic and hydrologic models, one of the most commonly used methodsis the root mean square error (RMSE) where O represents observations, P predicted values, and n isthe sample size:RMSE =√∑ni=1(Oi−Pi)2n(3.8)RMSE spans between 0 and ∞, and a value of 0 indicates a perfect fit. Many local and federal au-thorities designated with standardizing and ensuring quality projects grade the nature of the ratingcurve based on the RMSE. In this thesis, the RMSE was performed on the logarithmic discharge.When in forecast mode, the RMSE is calculated prior to adding the gauging and after. The resultsonly show the RMSE before the gauging is added to the model, only otherwise stated. The RMSEafter the gauging was added always decreased.Since the model developed is centralized around probability, it makes sense to use metrics thatdo justice to the probabilistic nature.3.5 Information theory metricsIn 1948, Claude Shannon published his seminal paper on the quantification of information frommessages and signals. In it, he introduces the concept of entropy as a measure of uncertainty and32developed a mathematical form to quantify the uncertainty in bits as shown in Equation 3.9:He(x) =−n∑i=1P(xi)log2(P(xi)) (3.9)where He is the entropy of the discrete probability distribution and P is the probability of the eventxi. Uncertainty is defined as the information that is unknown and is a function of the current stateof knowledge as defined by a probability distribution. So as one receives more information, theprobability distribution can change to reflect a better outcome of the model. When evaluating thesurprise of the models, the conditional probability for a given water level and expected dischargewas used.3.5.1 The model that is less surprised is betterTake the example of a coin toss. It is known that a fair coin has a 50-50 chance of landing on eitherheads or tails. However, after many tosses, the coin flies through the air and instead of landing onone of the faces, it lands on the edge of the coin. An event like this has such a small probabilityof occurring that it causes a huge surprise for you. Using information theory, surprise can also bequantified:Sx =−log2(P(xi)) (3.10)where S represents surprise. What Equation 3.10 says is that an event with a small probability ofoccurrence will have a much larger surprise. This principle of surprise can be extremely usefulwhen evaluating how good a predictive model is. When thinking about surprise in terms of therating curve, a predicted discharge significantly higher than the largest gauging in the model, willtend to surprise the modeler. But in current applications, rating curves only use a subset of the data,rather than the historical set. So although a higher gauging may exist in the set of gaugings, therating curve may not be using that gauging and the higher discharge value then becomes more of asurprise than if having used the gauging from the historical dataset.There is of course a trade-off between robustness and accuracy when considering using all thegaugings, and the decision for this is dependent on the purpose for which the model serves. In thisthesis, the robustness of the rating curve is improved by the inclusion of all the gaugings, ratherthan selecting a subset of the data, and as will be shown in the results section, the accuracy can alsoimprove by including more information.The surprise of all four models will be evaluated and compared for every predicted gauging inforecast and hindsight mode. The results will provide an indication as to which model was sur-prised the least, and help identify which out of the four, might be a better predictive model. Thesurprise will also be measured before and after an observation (when a gauging is added) to comparehow the gauging impacts the surprise of the model.333.5.2 Information gainInformation gain of individual gaugings can also be quantified through the usage of informationtheory by evaluating the divergence of the prior PDF and the posterior PDF when a gauging isadded. The gaugings with the highest information gain, could prove beneficial to the modeler, andmight provide insights to any shortcomings of the rating curve. The measure of information isdetermined by using the Kullback-Leibler divergence measure:DKL(P||Q) =∑iP(i)log2(P(i)Q(i)) (3.11)Equation 3.11 measures how far the discrete probability distribution Q is from the true distribu-tion P. A DKL of 0 would indicate that the two distributions are equal and that no information wasgained, while a larger value means a difference is observed. There are two considerations to usingthe Kullback-Leibler divergence that should always be accounted for. The first is that the Kullback-Leibler divergence is not symmetric and so DKL(P||Q) 6= DKL(Q||P) and the second is that althoughinformation gain was observed, the DKL(P||Q) will not provide details on the quality of the informa-tion (i.e. good or bad). Inferences must be made using other available data to determine, if possible,that the information was beneficial.If the true probability distributions for the gaugings were available, rather than the 5% assump-tion, the information gain from the forecast to the gaugings, could be evaluated. The results wouldhelp indicate which model most closely resembles the true probability distribution of the gauging(DKL(Obs||Forecastmodel). Given that the true data were not available, this analysis was not per-formed since the results may not necessarily portray what is actually happening.In this thesis, the information gain for each gauging will be evaluated while in forecast mode wherethe ”true” distribution will be the probability distribution function given to the gauging, after thegauging was added to the model. The prior distribution will be the probability function assigned bythe models prior to seeing the gauging. In other words DKL(Gauginga f ter)||(Gaugingbe f ore). Theinformation between models will also be examined, where it is assumed that the true distribution isequal to the models using weights, and the prior distribution comes from the Equal Weights model.This is done by taking the probability distributions of each gauging during forecast mode for the AllWeights, Only Time, and Only Sluicing models, and comparing it with the probability distributionfrom the Equal Weights.34Chapter 4Results of probabilistic rating curves4.1 Forecast mode4.1.1 Probabilistic dynamic rating curveTo compare all four models, the probabilistic dynamic rating curves are plotted together in Figure4.1. Note that a rating curve was produced every day during the simulation and Figure 4.1 is onlyshowing the rating curve for May 15, 2016. Immediately, a few observations become apparent, oneof them being what appears to be a linear relationship between stage and discharge, and the otherbeing a fragment in the model around a logarithmic discharge of -1.5. The linear relationship isas expected due to the properties of log transforming a power law. Also, since the slope of all thekernels were fixed to be identical, it was expected that all the kernels would be oriented in the samedirection. The fragment in the linear relationship is a possible consequence of how the slopes forthe kernels were calculated, and assumed to be equal to the slope of the all the gaugings seen upto the time period at which the rating curve was produced. This implies that if a new observedgauging does not reside near the existing linear relationship, its slope will be parallel to all the othergaugings, and therefore create a parallel kernel as seen in Figure 4.1.A closer inspection of Figure 4.1 demonstrates the differences in the models. For example, lookingat the high end of the probabilistic dynamic rating curve for all four models shows that the EqualWeights and Only Sluicing model have more similar conditional distributions than the Only Timeand All Weights models. The latter two models appear to have more regions of higher probabilitywhich are caused by the influence of the assigned weights decreasing the surface area of the kernelson the two dimensional space, and increasing it in the third dimension (probability) centered at thegauging. From Figure 4.1 it can be seen that the Only Sluicing model resembles the Equal Weightsrating curve more than the Only Time rating curve. Prior to May 15, 2016 (date of the rating curvein Figure 4.1), there were 14 sluicing events of varying magnitudes that influenced the shape of therating curve kernels and are the reason for the distinctions between the Equal Weights and the Only35Figure 4.1: The conditional probabilistic rating curve for all four models at a snapshot in time.May 15, 2016.Sluicing rating curves. Finally, the All Weights rating curve, which is a product of the time andsluicing weights, seems to have more concentrations of higher conditional distribution peaks thanthe other models. This is possible in the event where gaugings were taken after a sluicing event andwere also the latest observed gaugings.To better understand how the models differ, the conditional distributions of Q given an arbitrarylogarithmic water level measurement of 1.3686 are compared in Figure 4.2. The results display thatall four conditional distributions appear to have two large peaks which indicates two unique pointsof high conditional probability. The reason for this is due to the influence of two discharge gaugingswith values centered around the peaks of the conditional distributions and their assigned weight.The results of Figure 4.2 are taken from the rating curve shown in Figure 4.1. Zooming into therating curves, a better inspection of the kernels can be performed. The kernels from Figure 4.3 alignwith what is seen in Figure 4.2, where two regions of high conditional distribution are influencingthe conditional distributions. The fact that the conditional distributions can be multi-modal, is thereason why a weighted mean is used to calculate the expected point value discharge rather than365 5.2 5.4 5.6 5.8 6 6.2Logarithmic discharge00.020.040.060.080.10.120.140.16Conditional probabilityAllWeightsEqualWeightsOnlyTimeWeightsOnlySluicingWeightsFigure 4.2: A comparison of the conditional distributions for a give logarithmic water levelmeasurement.using the mode.4.1.2 Dynamic conditional uncertainty bands for hydrographUsing the conditional distributions for a given water level measurement, a hydrograph with condi-tional uncertainty bands can be produced (Figure 4.4). The figure shows how the uncertainty in thehydrograph varies in time and also illustrates areas where the conditional density is highest. Alsoshown in Figure 4.4, are the predicted continuous discharges from the NHC deterministic ratingcurves, as well as the predicted discharges from the probabilistic dynamic rating curves. Takinga closer examination at the hydrograph, one can get a better understanding at how the conditionaluncertainty is varying in time. Figure 4.5 also compares the predicted continuous discharge valuesfrom the deterministic and the probabilistic rating curves. Figure 4.5 shows the area of highestprobability in the dynamic uncertainty band follows the same pattern as the predicted dischargevalues from the deterministic and probabilistic rating curves for the All Weights rating curve. Thehydrographs for the other three models are attached in the Appendix.An important take-away from the probabilistic dynamic rating curves is that the answer to an impor-tant question that could not have been asked in a deterministic rating curve is now possible. Prior toa probabilistic rating curve, one could not receive a response to, ”What is the probability distributionof a discharge, given a specific water level measurement?” This central question helped motivatethe research undergone thus far and is to be considered by the author, an important question thatpresumably does not get asked often when dealing with deterministic rating curves. However, it is aquestion that should be considered highly, by not only those who produce rating curves, but anyone37Figure 4.3: A zoomed in version of Figure 4.1 used to understand the bimodal conditionaldistribution shown in Figure 4.2.Figure 4.4: A hydrograph with dynamic conditional uncertainty bands for the All Weightsmodel.38Figure 4.5: A zoomed in version of Figure 4.4using a discharge time series.4.1.3 Evaluation of rating curvesRMSEWhen comparing the continuous discharge measurements in Figure 4.4 for the All Weights modeland the deterministic rating curve predictions, it was seen that both values were closely aligned. Tobetter understand how well the probabilistic model is performing, a scatter plot was created to com-pare the predicted discharge from the probabilistic model with the measured discharge gaugings(Figure 4.6). Immediately, one can identify the lower discharge values were both over and underestimated, and that higher discharge values tended to be well predicted. It was expected that gaug-ings taken later in time (darker colored markers) would reside closer to the one-to-one line in Figure4.6 as the model would learn from previously observed gaugings. A possible reason for not seeingthis result could be due to the small width of the kernels, but this makes sense since the influenceof gaugings should only be felt along the slope of the kernel and not necessarily along the width.Figure 4.6 also has the RMSE values for the four models. Both the Only Sluicing and Only Timemodels had lower RMSE values than the Equal Weights model. These results serve as an indicationthat the added information was useful in reducing the uncertainty. When using both signals, theRMSE decreased to 0.42, and helped reduce the uncertainty by a percent difference of 19% whencompared to the Equal Weights model.SurpriseAn analysis of the surprise about the observed value before, and after, a gauging is added to therating curve is helpful to identify gaugings that can improve the performance of the model. Tobetter investigate and compare the results of surprise per model, the data were categorized into thetotal surprise per month per model (Figure 4.7). The light colored bars represent the surprise before390 1 2 3 4 5 6 7Logarithmic QObserved01234567Logarithmic QPredictedAllWeightsRMSE = 0.4220-Jul-201405-Feb-201524-Aug-201511-Mar-201627-Sep-201615-Apr-20170 1 2 3 4 5 6 7Logarithmic QObserved01234567Logarithmic QPredictedEqualWeightsRMSE = 0.520-Jul-201405-Feb-201524-Aug-201511-Mar-201627-Sep-201615-Apr-20170 1 2 3 4 5 6 7Logarithmic QObserved01234567Logarithmic QPredictedOnlyTimeWeightsRMSE = 0.4520-Jul-201405-Feb-201524-Aug-201511-Mar-201627-Sep-201615-Apr-20170 1 2 3 4 5 6 7Logarithmic QObserved01234567Logarithmic QPredictedOnlySluicingWeightsRMSE = 0.4620-Jul-201405-Feb-201524-Aug-201511-Mar-201627-Sep-201615-Apr-2017Figure 4.6: Scatter plot describing the differences between the predicted discharge from thefour models and the observed values.the gauging was added, while the darker colored bars are the surprise after adding the gauging. Asexpected, the surprise decreased for all when the gauging was added to the rating curve. No clearpattern is shown in the figure as to which model is best. What can be seen is that surprise results forApril before the gaugings are added are not visible in Figure 4.7—this is due to two surprise valuesequal to infinity. The magnitude of the discharge gaugings were identified as logarithmic discharge2.17 and 2.35, but because these gaugings fell significantly outside the linear relationship observedthus far, the rating curve models had assigned a small conditional probability. This small probabilityis also a consequence of the numerical limitations by the computer used in which the probability issmall enough to round to zero, and therefore produces a default value of infinity. If the computationprecision was not limiting, a real number would be achievable, but nonetheless, the surprise wouldstill be of a large magnitude.Since the infinity values prevented the remainder of the surprise for April to be seen, they wereremoved, and the total surprise per model were recalculated (Table 4.1). The total surprise prior toadding the gauging was highest for Equal Weights, followed by the Only Sluicing, Only Time, and40Figure 4.7: Total surprise per month. The translucent bars represent the surprise before thegauging is added (during prediction) and the opaque bars are the surprise after.Table 4.1: Total surprise per modelModel name Total surprise before Total surprise afterAll Weights 1194 274Equal Weights 1414 344Only Sluicing 1306 307Only Time 1260 292the All Weights model. Another observation from Figure 4.7 is that the total surprise per monthappears highest in March and October. March had a total of four gaugings, with the first three offour taken in 2015, all within a week of each other. October had a total of 14 gaugings, 10 of whichwere taken in 2016. When the infinity values were removed for April, the results showed that themonth had a total of 534 bits of surprise. The heatmap in Figure 4.8 shows that all months recordedpredominantly low valued discharges, and as Figure 4.9 displays, low values tended to have a highersurprise since low probability was assigned by the rating curve. The first three discharges recordedin March 2015 were values that had not been observed by the rating curve and as the results show,41Figure 4.8: Cumulative gaugings per month are shown in the figure to the left and the rangeof values recorded per month and their frequency are shown in the heatmap to the right.high surprise was achieved. In October 2016, the relationship between discharge and water levelhad changed, and therefore a handful of gaugings were designated a low probability. From Figure4.9, a decrease in surprise and magnitude can be observed. A possible hypothesis for this couldbe that lower discharge values are more sensitive to minor changes in the channel characteristics,which is why more scatter is observed at the lower discharge gaugings. This scatter in the gaugingscan have an influence on the probability that the rating curve assigns to the localized region.A reason for why the surprise of large discharge values was low may be due to the influence that thefirst 19 gaugings had on the rating curve. A look at the first 19 gaugings show that NHC had donea good job at collecting a wide range of discharge values for their first rating curve, especially incollecting large discharges. Aside from the probabilistic rating curve having already observed theselarge discharges, Figure 4.10 shows that the large discharges have minimal scatter and all tend tofollow a linear path. Therefore, the rating curve can do a decent job at extrapolating, and the condi-tional distribution of gaugings along this line tends to be high—meaning low surprise. However, inthe case where a new gauging is recorded, is much larger than the rest, and falls outside the expectedlinear path, the model will assign a large surprise value to the gauging. If the gauging falls on thelinear path of the previous gaugings, the conditional distribution at that point is influenced by theother gaugings and so the probabilistic rating curve should not be as surprised.Kullback-Leibler DivergenceProbability distributions can be extracted and used to analyze the information gain from incorpo-rating the gaugings, for the different models. Figure 4.11 shows the Kullback-Leibler divergence421 2 3 4 5 6 7Logarithmic discharge05101520253035404550Surprise (bits)20-Jul-201405-Feb-201524-Aug-201511-Mar-201627-Sep-201615-Apr-20170 5 100200400Figure 4.9: Discharge versus surprise and color coded with date in which gauging wasrecorded. The three largest values were removed from the main plot to better showthe trend. The inset shows what the original date looked like.-5 -4 -3 -2 -1 0 1 2 3log(h-h0)1234567log(Q)All gaugingsFirst 19 gaugingsFigure 4.10: Trend in the gaugings. The reason for the nonlinear shape is due to the initiallyassumed h0 of 234 m.43Figure 4.11: The lines represent the information gain from the Kullback-Leibler divergencemeasure between the probability distribution prior to a prediction being made for thegauging and the probability distribution after adding the gauging (”true” distribution).The black bars represent the discharge values of the gaugings used in the model. Thelight gray bars are the initial 19 gaugings used to develop the rating curve.(colored time series) between the probability distributions of the predicted gauging before it wasadded to the model, and after. The black bars represent all the added gaugings for all four models.Immediately one notices numerous large spikes in the information gain for all the models. Theselarge spikes represent an infinite amount of information gain and implies that having observed thesegaugings was valuable to the probabilistic rating curve. However, what it also means is that priorto observing the gaugings, none of the models had assigned a large enough nonzero probability tothat gauging (combination of Q and H) occurring. This can be attributed to several reasons. Oneof the reasons is that since the widths of the kernels are small, the influence of one kernel does notincrease the probability space in the direction of its second principal component and so therefore,even if a gauging of similar value does occur, if it is parallel to a previous gauging, it is possible forthe model to assign it a low conditional probability. The effect can be seen in Figure 4.5 where thekernels are stretched primarily along its first principal component and the width of the kernel (sec-ond principal component) is marginally small and until a new kernel is added, that probability spacewill remain with low values close to zero. A second possible reason for the large spikes could be dueto the grid size chosen for the analysis. A 500x500 mesh was used to define the probability space.Such a fine grid would cause more areas of low probability than if a coarser mesh (i.e. 50x50) wasused. Although the size of the mesh does affect the probabilities, as long as the grid size remainsconstant throughout the analysis, the results are meaningful. While Figure 4.11 showed the infor-mation gain of adding a gauging to a model, the Kullback-Leibler divergence can also be used toexamine the information added going from one model to another. For example, it may be desirableto compare the probability distributions before a gauging is added to the All Weights model and theEqual Weights model to examine how much information was gained by using one model over theother. For this analysis, the models that are assumed to be the correct, or true, are the All Weights,the Only Time, and Only Sluicing models. These models were selected as such since the auxiliaryinformation being brought into the rating curves better describes what is actually happening. The44results of the information gain between models showed similar infinity values as those previouslydescribed in Figure 4.11. The All Weights and Only Time models had many more infinity values (47and 50, respectively) than the Only Sluicing model, which had 24, when calculating the informationgain assuming the Equal Weights model was not the true model. Due to the large difference in thenumber of infinity values, it did not seem appropriate to neglect the values and do a summation ofthe information gain. Instead, the number of infinity values are being interpreted as an indication ofhow far the models were to the Equal Weights model. Since the Only Sluicing model had the leastamount of infinity models, it was concluded that in general, this model was closer than the othertwo. This conclusion is similar to what has been seen so far in the results.4.2 Hindcast modeThe RMSE and surprise are calculated for each gauging when the model operates in hindcast mode.Rating curves at each gauging in hindcast can also be produced, but are not shown since they alllook and behave as the ones in Figures 4.1. The average of all 105 RMSE values and the totalsurprise are presented in Table 4.2. During hindcast mode, the surprise of the Only Sluicing modelTable 4.2: RMSE and surprise during hindcast evaluation of rating curve modelModel name RMSE Total surprise (bits)All Weights 0.27 453Equal Weights 0.34 456Only Sluicing 0.29 433Only Time 0.30 442was the lowest, followed by the Only Time, All Weights, and Equal Weights. When comparingthe results with the RMSE, the lowest value is from the All Weights model, preceded by the OnlySluicing, Only Time, and the Equal Weights model. The pattern of the surprise results are differentfrom those reported in Table 4.1, while the RMSE values continue to follow the pattern found inthe forecast mode of the probabilistic dynamic rating curve. What the RMSE values show is thatthe auxiliary information does help with improving the predictive capabilities of the rating curve.However, the model with the best RMSE, had the second largest total surprise. What this couldimply is that the All Weights model is over confident about the precision of the discharge estimates,and so the surface area of the kernel decreases (since over confident implies a more ”peakier” PDF)and therefore, the probability decreases in the surrounding area. Individually, the Only Sluicing andOnly Time models are not as over confident as the All Weights model, and the surface of their PDFsis larger (more spread out), hence a lower surprise. The Equal Weights model behaved as expectedwhen examining the results of both metrics.454.3 Sensitivity analysisThere were a few assumptions used in the development of the probabilistic dynamic rating curve.The most notable one is σQ∗ in Equation 3.5 where 5% was assumed for the uncertainty in dischargegaugings. However, it is possible for gauging measurements to have more than the assumed error,and in fact, it is expected that the extreme ends of the discharge gaugings will have more uncertainty.The data used in the analysis unfortunately did not have any uncertainty values attached to it andtherefore the 5% assumption had to be used. The author notes that this is a bit hypocritical to whatwas stated in earlier sections of the manuscript where the author argues that uncertainty assignmentshould not be done by the end-users, but rather the experts. Yet this contradiction serves to ignitefuel to the idea that the experts should be transparent with the uncertainty in the data, and thatthis important detail should be available to anyone who is interested in using the data for furthermodeling. If the uncertainty had been provided for each of the gaugings, this study would haveused those uncertainty readings as direct inputs into the geometry of each individual multivariateGaussian distribution. This information would have created kernels that better represent the actualuncertainty, thus forcing the global uncertainty of the rating curve and its dynamic probability bandsto be more accurate. A sensitivity analysis using the forecast mode of the rating curve on thisassumed uncertainty value is provided to better examine the results of the rating curve and identifythe robustness in the model.4.3.1 Discharge uncertaintyIn this analysis, the water level uncertainty continued to be solved for using Equation 3.5 but thedischarge uncertainty was varied between 10% and 20%, with the results from the original 5%assumption also shown for comparison purposes. The higher uncertainty values are more repre-sentative in stream gauging methods than anything lower than the originally assumed 5%. It wasexpected that as the discharge uncertainty increased to 10%, the kernels would also increase inwidth, in the ordinate direction. The results from Table 4.3 show that the RMSE stayed roughly thesame, and that the surprise explodes. An example of the probabilistic dynamic rating curve, andits joint distribution, with an assumed 10% discharge measurement uncertainty is shown in Figure4.12. By causing larger kernels, the rating curve becomes less confident and the multivariate dis-tribution increases its surface area in the x and y direction to accommodate for the fixed slope andcorrelation, and decreases in the z direction. As the discharge uncertainty continues to increase, thekernels expand, and the rating curve continues to lose it confidence. Appendix C compares the jointand conditional distributions for kernels assuming a 20% discharge measurement uncertainty. Theresults from Table 4.3 show that in general, the RMSE stays relatively the same as the uncertainty inthe discharge changes. A constant RMSE means that although the geometry of the kernels change,the mean of the kernels remains roughly the same.46Figure 4.12: A comparison of how the joint and conditional distributions change when usinga 10% discharge uncertainty.Table 4.3: Sensitivity analysis of discharge uncertainty used in the conditional distribution ofthe kernels.StdQ (%Q) Metrics AllWeights EqualWeights OnlySluicing OnlyTime5RMSE 0.42 0.50 0.46 0.45Surprise (bits) 1194 1414 1306 126010RMSE 0.40 0.47 0.44 0.43Surprise (bits) 858 1120 1005 95920RMSE 0.40 0.50 0.45 0.43Surprise (bits) 575 653 618 6054.3.2 Time model parameter (µ)The other assumed parameter in the probabilistic rating curve was the rate parameter of the expo-nential function (Equation 3.1) used for assigning the time weights. In the equation, a rate parameterof 1365 was used. Various other values were examined to understand the impact of the parameter onthe model and the results are shown in Table 4.4. The results of varying the parameter show thatthe model’s surprise is more sensitive than the RMSE. However, there is not a general pattern thatemerges when varying µ . What is occurring is that as the rate of the exponential function increases,the most recent gauging is receiving a larger weight than when assuming a value of 365 days, whileprevious gaugings are losing their weights at a faster rate. This can create a push-pull effect wheredifferent combinations of kernel influence can occur depending on the weights. A further analysisshould be performed to better understand the impact of µ on the rating curve.47Table 4.4: Sensitivity analysis of µ parameter in the exponential function used for the timeweights while in forecast mode. Note that the Equal Weights and the Only Sluicing modelsare not shown since they are not effected by the time weights. AW = All Weights and OT= Only Time.µ = 365 days µ = 180 days µ = 90 days µ = 30 daysMetrics AW OT AW OT AW OT AW OTRMSE 0.42 0.45 0.39 0.41 0.35 0.36 0.44 0.42Surprise (bits) 1194 1260 1830 1193 1465 1559 1201 1191The exponential function used in the development of the time weights has a memoryless prop-erty. This property states that the probability between events stays constant, and only when a newevent is observed, will the weights update. This property is desirable, because it states that theweights of the gaugings should only change when a new gauging is observed. If sufficient timefrom the last measured gauging has occurred, an idea could be to let the exponential distributionconverge to a theoretical equilibrium that could potentially relate to the river’s natural equilibrium.Leopold, Wolman, and Miller (2012) presented the idea that rivers are always seeking to be in astate of lowest energy, and so after an event has occurred that disrupts the current state, the riverwill slowly transition back to its natural equilibrium. Applying this similar idea to the exponentialfunction will imply that the weights of the gaugings that previously had little weight will now beset to an equilibrium. In a river like the Iskut that is constantly changing because of the upstreamintake, this idea may not work, but could be interesting to explore in other river systems.The parameter µ in this thesis was assumed, but it is possible for the performance of the probabilis-tic dynamic rating curve to be improved if the parameter learns from the data. Using a Bayesianapproach is a possible alternative to optimizing the parameter. A prior probability on the param-eter could be selected based on the characteristics of the river and/or watershed. As gaugings areobserved, and added to the rating curve model, the posterior distribution that best describes theparameter can be determined.48Chapter 5Discussion5.1 Cease-to-flow water level (h0)The h0 parameter is important because it helps establish the relationship between discharge and theriver’s water level, but also because it helps identify the height of the control. In practice, sometimesthe height of the control is not known, and so this parameter is typically calibrated in hindsight, incombination with field knowledge, such as possible physical limitations to the exponential param-eter in Equation 2.1. For calibration, the gaugings are often log transformed, as done in this thesis,to ensure a straight line. The value of 234 m for the h0 water level was selected by plotting all thefirst 19 gaugings used in the development of the first rating curve, independent of the simulationsperformed on the remaining gaugings, and identifying how well the data fell on a linear relationship.The value was then fixed for the duration of all the simulations in producing the new rating curves.The h0 parameter could be fixed by producing a moving parameter, similar to µ , that is optimizedevery time a new gauging is observed. More accurate cease-to-flow water levels were available tothe author from NHC, but were not used, because again, the value of their h0s was calculated inhindsight, and the author felt that using their values would be more unfair than fixing the data to theassumed value.A consequence of the assumed 234 m for h0 can be seen in the final probabilistic rating curve(Figure 5.1). It is known that the log transformation of data following a power law function pro-duces a linear relationship. However, Figure 5.1 shows otherwise. The reason for this is attributedto two gaugings that occurred on April 29th and 30th of 2017 with low water level measurementsthat did not conform to a linear relationship with the other data. This error was discovered whendeciding which value for h0 to assume, and since the gaugings occurred near the end of the dataset(4th and 5th from the last gauging), it was decided that 234 m would work for the other gaugings.A much better calibration could have been done, but again, this would produce an even more unfairforecast.49Figure 5.1: Last dynamic probabilistic rating curve produced in simulations.5.2 The effect of fixing the slope of all kernels to be equalAs described in the data and methods section, the slopes of the kernels were all fixed to be the sameby using the observed gaugings and performing a weighted linear regression to obtain the slope.A consequence of such method is that there is a chance that gaugings that do not fall exactly onthe linear function, risk forming parallel lines as shown in Figure 5.2, and producing a fragmentedrating curve. The example shown is from May 2016 in which only five gaugings with low waterlevel measurements had been recorded up to that point. These five gaugings were on the lower endof the original linear regression and slightly away from all the other data points and so therefore, aparallel line was formed when their slopes were fixed. This causes a similar bimodal distribution,as seen in Figure 4.2 and is a reason for why the weighted mean of the conditional PDF was used,rather than the mode. A possible fix to this problem is to use localized regression on the ’k’ nearestgaugings. In doing so, the original gauging of the five gaugings, highlighted within the red box inFigure 5.2, would have had its slope fixed with gaugings that fell closer to the linear regression, andas more gaugings entered the model, the slopes would update locally. For a proper linear function tobe produced, it would be important to ensure that the ’k’ nearest gaugings selected would producea monotonic function.50Figure 5.2: A display of how parallel rating curves can emerge using the described linearregression for fixing the slope of all the kernels.5.3 Extrapolation of the rating curveExtrapolation of the probabilistic rating curve works by the influence of the kernels and the ori-entation at which they are facing. Since the kernels in theory ”stretch-out” to infinity along theirprimary principal component, there will always be a probability assigned to the extreme ends of therating curve where extrapolation might be desired. The conditional would then be taken, and thePDF would be renormalized to a sum of one. An exception to this could be that if the extrapolationregion is far enough, computational limitations might take over and assign values of zeros for theprobability due to limited computer precision. In this case, a correct conditional PDF might not beguaranteed.Similar to extrapolating in a deterministic rating curve plotted on log-log space, caution shouldalways be adhered to ensure that extrapolation is only occurring within the bounds of the linearrelationship between the physical constraints of the river (i.e. stage and cross-sectional area). If ashift in the stage-area relationship occurs, then a shift in the orientation of the kernels should alsooccur to ensure proper extrapolation. This is one way to ensure that the physical characteristics ofthe river are also observed in the probabilistic model. These shifts in the stage-area relationship arealso easily adoptable in the probabilistic rating curve if a multi-segmented rating curve in the de-terministic domain is known to exist. The weighted linear regression would have to be altered intopossibly a piecewise function with the nodes being influenced by the height of the correspondingcontrols.5.4 Maximizing what is known to help solve the unknownIn the development of the probabilistic rating curve, time and sluicing information were broughtin to help reduce model uncertainty. It was argued that gaugings after sluicing, and those mostrecent in time, were more representative of the current stage-discharge relationship than those prior51to sluicing, and taken earlier. The results showed that both the time and sluicing models did helpreduce the uncertainty of a probabilistic rating curve when compared to a model using no auxiliaryinformation, by up to a maximum of 19%, when in forecast mode. This shows that the informationused to building the rating curves can become of practical importance for hydroelectric companiesthat rely on a rating curve downstream of their intake to ensure an accurate estimate of the IFRcompliance. Operators of the intake plant can use the probabilistic results to make better informeddecisions on whether more water should be released downstream to help ensure IFR compliance,within a degree of certainty, or, to stay business-as-usual because enough confidence is assigned tothe predicted discharges. What can also be done is the inclusion of operator knowledge on expectedsluicing events. By giving an approximate date and duration of the event, the model can evaluatehow a near future sluicing event could affect the probability of ensuring the IFR after the expectedsluicing event.The model can also be used to help plan gaugings. For example, it could be possible to opti-mize the dynamic probabilistic rating curve to ensure that a minimum conditional uncertainty isalways met. In doing so, it could help identify when to best take gaugings. Similar to how in thefinancial sector different portfolios are built, evaluated, and then the best one is selected, the samecan be done with gaugings. Different scenarios can be used to train the dynamic probabilistic ratingcurve, and once calibrated and validated, an optimal plan for when to take gaugings may be possible.Studies evaluating the usefulness of adding/collecting different information signals in conjunctionwith a cost-benefit analysis, may be performed. This analysis can also be used to define the value ofadded information to the rating curve to making better informed decisions. This idea was exploredby Raso, Weijs, and Werner (2017) in their recent manuscript, in which they used the OptimalDesign (OD) problem to evaluate the value of adding extra gaugings to the rating curve.5.5 Dynamic uncertainty bandsThe dynamic uncertainty bands are a beneficial product of the produced rating curve and could po-tentially serve as a good incentive for producing probabilistic dynamic rating curves. Figure 5.3shows a closer examination at how the conditional probabilities change when a gauging is addedwhile in forecast mode. The figure illustrates that the column where the gauging is found (greendot) has a flatter conditional PDF than after the gauging is added, column where the vertex in thepredicted continuous discharge value begins. The conditional PDF becomes ”peakier” and is shownby the change in color in the column only after the gauging. This result is important because itprovides an answer to a question that maybe does not get frequently asked when looking at dis-charge time series because of how the deterministic rating curve shows its results. The fact thatthe conditional PDF is changing when the data are added, helps answer the question of, ”What isthe probability distribution of the discharge, given a water level measurement of hi when a gaugingis recorded?” Or another question that could be interesting, ”What is the probaiblity distribution52Figure 5.3: A closeup as to how the conditional uncertainty bands change when a gauging isadded.of the discharge, given a water level, prior to adding a gauging to the rating curve?” Both thesequestions help set up a canvas where the data are now more useful than they were when analyzed ina deterministic domain. When working outside a stochastic framework, 95% confidence intervalsare typically appended to rating curves. However, the answers from these intervals are to differentquestions because the interval does not imply that the result has a 95% confidence of occurringwithin the interval, but rather, it states that if the samples were to be repeated, and the 95% con-fidence intervals recalculated, that 95% of the intervals would capture the population mean. Soreally, the 95% confidence interval is useful, but it answers a question that is not necessarily theimmediate question that someone using predicted discharges is truly concerned with. This methodfor displaying the uncertainty around a rating curve gives no valuation to knowing prior informationthat may be useful in the design of rating curve uncertainty. For this reason, Bayesian methods havegained traction as a valuable tool in modeling rating curves (Le Coz et al., 2014; Moyeed & Clarke,2005; Petersen-Øverleir & Reitan, 2009). Within a Bayesian framework, prior knowledge may beused to bound the limits of the possible parameters in the power law function (Equation 2.1), andas new data enter the framework, a likelihood function is used to determine the posterior that bestdescribes the parameters. The method used in the development of the probabilistic rating curve inthis thesis incorporates a similar approach where the probabilities are continuously updated as newdata is observed. The method used for updating the priors is not formally defined as Bayesian, butimplicitly, it is of a similar nature.5.6 The need for probabilistic rating curvesIn many cases when one downloads discharge time series data from WSC or USGS, there is no in-formation on the probability distribution of the values. WSC does provide the percent deviation that53the last gauging measured is from the current rating curve. WSC and USGS will also sometimesprovide categorical uncertainty measures that inform the users about any possible anomalies withthe data. A categorical value of ”B” for WSC indicates that the data were collected under the influ-ence of ice coverage (Hamilton & Moore, 2012). However, this is not a true measure of uncertaintyand is not a pragmatic approach to data uncertainty.With the probabilistic rating curve developed in this thesis, a conditional probability value, or func-tion, can easily be included with the predicted discharge and give a much more transparent, andinformative, result. This is shown in Figure 4.4 where the conditional PDFs are extracted and a dy-namic uncertainty band showcasing the conditional uncertainty of the discharge given the recordedwater level is produced. For decision makers who depend on the discharge time series, having thisextra information can help in making more informed decisions. For example, operators who use thepredicted discharge from a hydrometric station to ensure IFR compliance can use the conditionaldistributions to better assess the decision to release more water downstream to ensure that the IFRis met, or to take in more water for energy production/profit. The scenario described earlier aboutoptimizing the rating curve to better plan out when to take gaugings can also be applied for IFRrating curves. The results of such a scenario could imply a larger up front cost by having to sendpeople out to the field, but the decision could potentially save money in the long run by preventingthe streamflow to fall under the IFR, and force a hydroelectric company to pay a large monetarysum due to infraction fees.Probabilistic rating curves allow for better decision-making processes in which decision makers canhave more, or less, confidence in their actions as a result of the probability from the rating curve.In this thesis, an example of decision makers from a hydroelectric ROR project has been used, butwhat has not been discussed are the other factors influencing the decision on river withdrawals (i.e.electricity market demands) as well as the other stakeholders involved. In British Columbia, salmonhave economic, ecological, and cultural value. To ensure that salmon keep coming back to theirspawning grounds, sufficient water and the proper conditions must be available for the salmon toswim upstream. The amount of water flowing downstream can fluctuate due to the local climateand this can have an impact on water licenses. One now enters a multiobjective decision makingproblem in which enough water must be supplied downstream so that salmon can swim upstream,while trying to maximize economic benefit for the hydroelectric company. Decisions as these canbe improved by having a sense of the uncertainty in the predicted discharge and can aid in takingeither more conservative or liberal actions, as well as reduce the risk associated with these decisions.What is also beneficial from probabilistic rating curves is that one can condition the probabilities onall available information. This is to say that the probability of a discharge value is conditioned on notonly the recorded water level, but also time and sluicing, and possibly electrical conductivity. Theseconditional probabilities are derived from a probability function—in our case a mixture model of54multivariate Gaussians. Therefore, one can present much more information to the end users of adischarge series by not only including the point value conditional probability, but also the distri-bution function from which it stems. Krzysztofowicz (2001) identified four potential benefits forprobabilistic predictions/forecasts. These equally apply to measurements of discharge, which are infact also predictions. They have all have been discussed in this section, but are worth summarizingbelow:1. Probabilistic models tend to be more scientifically honest and eliminate the illusion of cer-tainty2. Risk-based criteria can be established for example in flood forecasting3. Rational decisions can be enhanced by knowing the true uncertainty of the data4. There is a potential for economic benefit by using probabilistic modelsThese four points help promote accountability for the research, and decisions, that are taken byexperts and leaders and have large benefits to society. Engineers are entrusted with the lives ofthousands, and even millions, of people every single day and better ethical decisions and designscan be developed by knowing the uncertainty in the data that is used. However, for this notionof uncertainty transparency to kickstart, a shift in how data is collected and viewed must occur.Those collecting the data (i.e. streamflow) must be conscientious of how the data is recorded and aneffort to calculate/estimate the uncertainty should be made so that the collecting authorities (wateragencies like WSC and USGS) can publicly publish these results. Water agencies may be hesitantto do such a move, because it may show flaws in their data, but again, only through this can honestscience progress and rational decisions be made.55Chapter 6Conclusion and future workThe primary objectives laid out at the beginning of this thesis were to develop a probabilistic dy-namic rating curve, and to explore the usability of auxiliary information from a run-of-river hydro-electric project to reduce model uncertainty. The data and methods section of this thesis highlightthe development of the probabilistic dynamic rating curve. For a probabilistic model to be devel-oped, the gaugings were converted into multivariate Gaussian distributions (kernels). The geometryof the kernel was shown to be important to control to ensure that all the kernels were aligned in theproper orientation so that interpolation and extrapolation of discharge values could be accuratelypredicted. Had the slope of the kernels not been fixed, they would be free to align themselves ac-cording to the variables used in Equation 3.6. While although the gaugings would align along astraight line as expected because of the log transformation, the kernels would not have followedthat line. This would cause less accurate conditional distribution functions, and using a weightedmean of the function would have produced erroneous predictions. Although the method used forthe kernel orientation was, per say, not the best (Figure 5.2), it is possible to improve it by using alocal regression approach, which would be expected to produce better results.The conditional distribution of the kernel was used to assign the measurement uncertainty of thedischarge, and hence control the spread of the (Q|h) distribution. The combination of using themarginal and conditional distributions served as the primary method to fully controlling the kernels.A value of 5% of the discharge magnitude was assumed for every kernel’s discharge conditionaldistribution uncertainty. Under this assumption, reasonable results were produced and it is expectedthat in events where the discharge uncertainty is known, those values could be substituted in. Theeffect would be that the produced kernels would vary more than what are seen in this thesis, butwould be a better representation of the actual uncertainty.With the kernels constructed, their summation was taken to develop a mixture model of multivariateGaussian distributions that represented the joint distribution. Using this finely discretized densitygrid, the conditional distributions (P(Q|h = hi)) were calculated and normalized to one. These con-ditional distributionsm when appended together as shown in Figure 4.1, represent the probabilistic56dynamic rating curve. Using this model, the continuous discharge predictions could be extractedby using the continuous water level time series. What is more is that the conditional distributionscan also be displayed in conjunction with the discharge time series to show a continuous dischargetime series with dynamic uncertainty bands (Figure 4.5). These uncertainty bands were discussedand an argument was provided as to why these bands are more relevant to users than 95% confi-dence intervals. Using these conditional uncertainty bands also helps in improving models that usethe discharge time series as inputs since the uncertainty in the time series is now transparent to theuser. This helps reduce the ambiguity that modelers use to assign the uncertainty to the dischargeand helps improve the overall performance of hydrological models and promotes better decision-making.The probabilistic dynamic rating curve model developed is also simple enough to be adopted byengineering practitioners who develop rating curves for clients. In combination with current meth-ods for developing deterministic rating curves, the slopes could be extracted after manual calibrationof a linear regression through the gaugings. A simple graphic user interface could be created wherethe user inputs the measurement uncertainty and grid size, and the probabilistic rating curve couldbe produced instantly. Calibration of the curve could be done by changing a few of the assumedparameters measured to minimize the RMSE. Predictions from the deterministic and probabilisticresults could be easily compared and dynamic uncertainty bands can be extracted for the probabilis-tic model.An exploration into reducing the uncertainty was undergone by assessing the usage of auxiliaryinformation. The information used in this thesis were sluicing signals from a ROR hydroelec-tric project and the timestamps of gaugings. The sluicing signal was inferred by referencing fourunique signals in the intake structure of the hydroelectric project. Weighted normal CDFs wereused to derive a function that was believed to help best describe the change in sluicing weights. Theweights used were taken from the duration of the individual sluicing events, where large events weregiven more weight than short sluicing events. For the time weights, an exponential model was usedto assign more weight to the recent gaugings and less to those that were taken in the past. Threeunique models were created that utilized different combinations of the weights and one model wasproduced that used no weights, which implies that all the gaugings had an equal weight assignmentof 1n where n is the number of gaugings observed. The results show that using the auxiliary infor-mation helped reduce the RMSE of the model that used an equal weight distribution by up to 19%.The surprise of the model that used both the time and sluicing weights (All Weights model) wasalso shown to be that smallest of the four. This implies that its ability to predict future gaugings wasthe best. For hindcast mode, the results of the surprise varied and it showed that the All Weightsmodel was over confident, which is possibly why that model had the largest total surprise. In all, itwas shown that when used in forecast mode, the probabilistic dynamic rating curve proved to be auseful model with auxiliary information being capable of reducing the model’s uncertainty.57There are many opportunities in which the work presented in this thesis can be taken further. Thefirst would be an improvement on the assumed value for the exponential distribution parameter. Asdescribed, it may be possible to train this parameter using a Bayesian approach. In doing so, themodel will adapt to the data and theoretically provide even better results then what were shown. Itwould also be interesting to explore how the model behaves when working with segmented ratingcurves. It is anticipated that as long as the individual slopes are extracted from the deterministicweighted linear regression, that a reasonable rating curve should be produced. The probabilisticdynamic rating curve can also be optimized to minimize RMSE, or any other performance metric,like maximizing entropy. In doing so, different scenarios can be studied to better understand whento best take gaugings. Lastly, in this thesis, multivariate Gaussian distributions were used. Thesesame distributions are also used in Gaussian Processes and a further exploration of how to tie the re-sults of this thesis with machine learning could be of high value, especially at a time when machinelearning is currently undergoing a lot of development.58ReferencesBaker, V. (2009). High-energy megafloods: planetary settings and sedimentary dynamics(Vol. 32). Special Publication.6Beven, K. (2010). Environmental modelling: An uncertain future? CRC Press.2Bhattacharya, B., & Solomatine, D. P. (2005). Neural networks and M5 model trees in modellingwater level–discharge relationship. Neurocomputing, 63, 381–396.12Chow, V. T. (1959). Open channel hydraulics. McGraw-Hill Book Company, Inc; New York.4Clean Energy BC, B. C. (2015). Run-of-river hydro power. Retrieved 2017-10-13, fromhttps://www.cleanenergybc.org/about/clean-energy-sectors/run-of-river14Coxon, G., Freer, J., Westerberg, I., Wagener, T., Woods, R., & Smith, P. (2015). A novelframework for discharge uncertainty quantification applied to 500 UK gauging stations.Water Resources Research, 51(7), 5531–5546.12, 13Di Baldassarre, G., & Montanari, A. (2009). Uncertainty in river discharge observations: aquantitative analysis. Hydrology and Earth System Sciences, 13(6), 913.8, 9, 11, 31Godsill, S. J. (2001). On the relationship between Markov Chain Monte Carlo methods for modeluncertainty. Journal of Computational and Graphical Statistics, 10(2), 230–248.10Goldberg, D. E., & Holland, J. H. (1988). Genetic algorithms and machine learning. MachineLearning, 3(2), 95–99.12Govindaraju, R. S., & Rao, A. R. (2013). Artificial neural networks in hydrology (Vol. 36).Springer Science & Business Media.13Guerrero, J.-L., Westerberg, I. K., Halldin, S., Xu, C.-Y., & Lundin, L.-C. (2012). Temporalvariability in stage–discharge relationships. Journal of Hydrology, 446, 90–102.11Guven, A., Aytek, A., & Azamathulla, H. M. (2013). A practical approach to formulatestage–discharge relationship in natural rivers. Neural Computing and Applications, 23(3-4),873–880.13Hamilton, A. S., & Moore, R. D. (2012). Quantifying uncertainty in streamflow records.59Canadian Water Resources Journal, 37(1), 3–21.8, 11, 54Hamlet, A. F., Huppert, D., & Lettenmaier, D. P. (2002). Economic value of long-lead streamflowforecasts for Columbia River hydropower. Journal of Water Resources Planning andManagement, 128(2), 91–101.13Herschy, R. W. (1995). Streamflow measurement. CRC Press.viii, 5, 11, 14, 32Herschy, R. W. (1999). Hydrometry: principles and practices. CRC Press.6, 8, 9Hsieh, W. W. (2009). Machine learning methods in the environmental sciences: Neural networksand kernels. Cambridge university press.12, 13Kieffer, S. W. (1985). The 1983 hydraulic jump in Crystal Rapid: Implications for river-runningand geomorphic evolution in the Grand Canyon. The Journal of Geology, 93(4), 385–406.6Krzysztofowicz, R. (2001). The case for probabilistic forecasting in hydrology. Journal ofHydrology, 249(1), 2–9.2, 55Le Coz, J. (2012). A literature review of methods for estimating the uncertainty associated withstage-discharge relations (Tech. Rep. No. 6).3Le Coz, J., Renard, B., Bonnifait, L., Branger, F., & Le Boursicaud, R. (2014). Combininghydraulic knowledge and uncertain gaugings in the estimation of hydrometric rating curves:A Bayesian approach. Journal of Hydrology, 509, 573–587.6, 11, 53Leonard, J., Mietton, M., Najib, H., & Gourbesville, P. (2000). Rating curve modelling withManning’s equation to manage instability and improve extrapolation. Hydrological SciencesJournal, 45(5), 739–750.11Leopold, L. B., Wolman, M. G., & Miller, J. P. (2012). Fluvial processes in geomorphology.Courier Corporation.48Liu, Y., & Gupta, H. V. (2007). Uncertainty in hydrologic modeling: Toward an integrated dataassimilation framework. Water Resources Research, 43(7).2Longerstaey, J., & Spencer, M. (1996). Riskmetricstmtechnical document. Morgan GuarantyTrust Company of New York: New York.25McMillan, H., Freer, J., Pappenberger, F., Krueger, T., & Clark, M. (2010). Impacts of uncertainriver flow data on rainfall-runoff model calibration and discharge predictions. HydrologicalProcesses, 24(10), 1270–1284.2, 12McMillan, H., & Westerberg, I. (2015). Rating curve estimation under epistemic uncertainty.Hydrological Processes, 29(7), 1873–1882.10Ministry of Environment Science and Information Branch for the Resources Information StandardsCommittee. (2009). Manual of British Columbia hydrometric standards. Resources60Information Standards Committee.4Montanari, A., & Brath, A. (2004). A stochastic approach for assessing the uncertainty ofrainfall-runoff simulations. Water Resources Research, 40(1).2Moore, R. D., Hamilton, A. S., & Scibek, J. (2002). Winter streamflow variability, YukonTerritory, Canada. Hydrological Processes, 16(4), 763–778.8Moyeed, R. A., & Clarke, R. T. (2005). The use of Bayesian methods for fitting rating curves, withcase studies. Advances in Water Resources, 28(8), 807–818.53Nearing, G. S., Tian, Y., Gupta, H. V., Clark, M. P., Harrison, K. W., & Weijs, S. V. (2016). Aphilosophical basis for hydrological uncertainty. Hydrological Sciences Journal, 61(9),1666–1678.13Newman, M. C. (1993). Regression analysis of log-transformed data: Statistical bias and itscorrection. Environmental Toxicology and Chemistry, 12(6), 1129–1133.20Northwest Hydraulic Consultants, N. (2016). Forrest Kerr Hydroelectric Project HydrometricReport 2016 (Tech. Rep.). Northwest Hydraulic Consultants. (unpublished report)17Pafka, S., & Kondor, I. (2001). Evaluating the riskmetrics methodology in measuring volatilityand value-at-risk in financial markets. Physica A: Statistical Mechanics and its Applications,299(1), 305–310.25Petersen-Øverleir, A. (2004). Accounting for heteroscedasticity in rating curve estimates. Journalof Hydrology, 292(1), 173–181.9Petersen-Øverleir, A. (2006). Modelling stage-discharge relationships affected by hysteresis usingthe Jones formula and nonlinear regression. Hydrological Sciences Journal, 51(3), 365–388.10Petersen-Øverleir, A. (2008). Fitting depth–discharge relationships in rivers with floodplains.Hydrology Research, 39(5-6), 369–384.10Petersen-Øverleir, A., & Reitan, T. (2005). Objective segmentation in compound rating curves.Journal of Hydrology, 311(1), 188–201.10Petersen-Øverleir, A., & Reitan, T. (2009). Bayesian analysis of stage–fall–discharge models forgauging stations affected by variable backwater. Hydrological Processes, 23(21),3057–3074.10, 53Rantz, S. E. (1982). Measurement and computation of streamflow: volume 2, computation ofdischarge (Tech. Rep.). USGPO,.4, 8, 9, 11, 32Raso, L., Weijs, S., & Werner, M. (2017). Balancing costs and benefits in selecting newinformation: Efficient monitoring using deterministic hydro-economic models. WaterResources Management, 1–19.5261Reitan, T., & Petersen-Øverleir, A. (2008). Bayesian power-law regression with a locationparameter, with applications for construction of discharge rating curves. StochasticEnvironmental Research and Risk Assessment, 22(3), 351–365.10Reitan, T., & Petersen-Øverleir, A. (2009). Bayesian methods for estimating multi-segmentdischarge rating curves. Stochastic Environmental Research and Risk Assessment, 23(5),627–642.10Reitan, T., & Petersen-Øverleir, A. (2011). Dynamic rating curve assessment in unstable riversusing Ornstein-Uhlenbeck processes. Water Resources Research, 47(2).11Schement, J. R., & Ruben, B. D. (1993). Between communication and information (Vol. 4).Transaction Publishers.13Schmidt, A. R. (2002). Analysis of stage-discharge relations for open-channel flows and theirassociated uncertainties (Unpublished doctoral dissertation). University of Illinois atUrbana-Champaign.11Shannon, C. (1948). A mathematical theory of communication. Bell System Technical J., 27(3),379–423.32Sivapragasam, C., & Muttil, N. (2005). Discharge rating curve extension–a new approach. WaterResources Management, 19(5), 505–520.13Solomatine, D. P., & Ostfeld, A. (2008). Data-driven modelling: some past experiences and newapproaches. Journal of Hydroinformatics, 10(1), 3–22.12Sorooshian, S., & Dracup, J. A. (1980). Stochastic parameter estimation procedures for hydrologierainfall-runoff models: Correlated and heteroscedastic error cases. Water ResourcesResearch, 16(2), 430–442.9Turnipseed, D. P., & Sauer, V. B. (2010). Discharge measurements at gaging stations. In U.s.geological survey techniques and methods book 3 (chap. A8). United States GeologicalSurvey.8, 11Venetis, C. (1970). A note on the estimation of the parameters in logarithmic stage-dischargerelationships with estimates of their error. Hydrological Sciences Journal, 15(2), 105–111.9Weijs, S. V. (2011). Information theory for risk-based water system operation. 2, 13Weijs, S. V., Mutzner, R., & Parlange, M. B. (2013). Could electrical conductivity replace waterlevel in rating curves for alpine streams? Water Resources Research, 49(1), 343–351.2, 14, 15Westerberg, I., Guerrero, J.-L., Seibert, J., Beven, K., & Halldin, S. (2011). Stage-dischargeuncertainty derived with a non-stationary rating curve in the Choluteca River, Honduras.Hydrological Processes, 25(4), 603–613.1262Appendix ASnapshots of the rating curve in timeFigure A.1: November 28, 201363Figure A.2: January 1, 201564Figure A.3: June 20, 201765Appendix BHydrographsFigure B.1: Hydrograph for Only Time modelFigure B.2: Hydrograph for Only Sluicing model66Figure B.3: Hydrograph for Equal Weights model67Appendix C20% assumed discharge uncertaintyFigure C.1: A comparison of how the joint and conditional distributions change when using a20% discharge uncertainty.68
- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Theses and Dissertations /
- Probabilistic dynamic rating curves using auxiliary...
Open Collections
UBC Theses and Dissertations
Featured Collection
UBC Theses and Dissertations
Probabilistic dynamic rating curves using auxiliary information Galindo Ruiz, Luis Camilo 2017
pdf
Page Metadata
Item Metadata
Title | Probabilistic dynamic rating curves using auxiliary information |
Creator |
Galindo Ruiz, Luis Camilo |
Publisher | University of British Columbia |
Date Issued | 2017 |
Description | Rating curves play a vital part in hydrology for producing streamflow time-series. The derived streamflow is an integral component to any hydrological study and therefore requires proper quantification of not only a discharge point value, but also an uncertainty measure. Using multivariate Gaussian distributions as kernels, a probabilistic rating curve was developed from the conditional distribution as an alternative model for the standard deterministic rating curve. Auxiliary information from a run-of-river hydroelectric project, as well as the temporal variability from the gauging measurements, were used to study the possible reduction in the uncertainty of the developed rating curve. The temporal information was modeled using an exponential function that updated upon receiving new gaugings and the sluicing model was a continuously updated kernel distribution that assigned more weight to gaugings taken after a sluicing event. Four models of varying complexity were created and their performance was evaluated using information theory measures such as surprise and the Kullback-Leibler divergence measure. The results indicate that probabilistic rating curves are useful tools for modeling and evaluating the dynamic uncertainty of the curves. The uncertainty was shown to be reduced by up to 19% by including the temporal information of the gaugings and sluicing information. Auxiliary information can be beneficial to rating curve development and an argument is made for why probabilistic rating curves should become a norm in the hydrology field. |
Genre |
Thesis/Dissertation |
Type |
Text |
Language | eng |
Date Available | 2018-06-30 |
Provider | Vancouver : University of British Columbia Library |
Rights | Attribution-NonCommercial-NoDerivatives 4.0 International |
DOI | 10.14288/1.0362966 |
URI | http://hdl.handle.net/2429/64323 |
Degree |
Master of Applied Science - MASc |
Program |
Civil Engineering |
Affiliation |
Applied Science, Faculty of Civil Engineering, Department of |
Degree Grantor | University of British Columbia |
GraduationDate | 2018-02 |
Campus |
UBCV |
Scholarly Level | Graduate |
Rights URI | http://creativecommons.org/licenses/by-nc-nd/4.0/ |
AggregatedSourceRepository | DSpace |
Download
- Media
- 24-ubc_2018_february_galindo_luis.pdf [ 7.95MB ]
- Metadata
- JSON: 24-1.0362966.json
- JSON-LD: 24-1.0362966-ld.json
- RDF/XML (Pretty): 24-1.0362966-rdf.xml
- RDF/JSON: 24-1.0362966-rdf.json
- Turtle: 24-1.0362966-turtle.txt
- N-Triples: 24-1.0362966-rdf-ntriples.txt
- Original Record: 24-1.0362966-source.json
- Full Text
- 24-1.0362966-fulltext.txt
- Citation
- 24-1.0362966.ris
Full Text
Cite
Citation Scheme:
Usage Statistics
Share
Embed
Customize your widget with the following options, then copy and paste the code below into the HTML
of your page to embed this item in your website.
<div id="ubcOpenCollectionsWidgetDisplay">
<script id="ubcOpenCollectionsWidget"
src="{[{embed.src}]}"
data-item="{[{embed.item}]}"
data-collection="{[{embed.collection}]}"
data-metadata="{[{embed.showMetadata}]}"
data-width="{[{embed.width}]}"
async >
</script>
</div>
Our image viewer uses the IIIF 2.0 standard.
To load this item in other compatible viewers, use this url:
http://iiif.library.ubc.ca/presentation/dsp.24.1-0362966/manifest