PLANNING REPAIR AND REPLACEMENT PROGRAM FOR WATER MAINS: A BAYESIAN FRAMEWORK
by
Golam Kabir
M.Sc., Bangladesh University of Engineering and Technology (BUET), 2011
A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY
in
THE COLLEGE OF GRADUATE STUDIES
(Civil Engineering)
THE UNIVERSITY OF BRITISH COLUMBIA
(Okanagan)
April 2016
© Golam Kabir, 2016

Abstract

Aging water infrastructure is a major concern for water utilities throughout the world. Developing an extensive water main renewal program and predicting the performance of water mains are challenging tasks. Uncertainties become an integral part of a repair and replacement (R&R) action program because of incomplete and partial information, the integration of data from different sources, and the involvement of expert judgment in data interpretation. Moreover, these uncertainties differ among utilities because the amount and quality of data available for developing or implementing an R&R action program vary. In this research, a Bayesian framework that accounts for these uncertainties is developed for the R&R action program of water mains. The research begins with a state-of-the-art critical review of existing regression-based, survival analysis, and heuristic-based failure models and of life cycle cost (LCC) studies in the field of water mains. To identify the influential covariates and to predict the failure rates of water mains under model uncertainty and with limited failure information, Bayesian model averaging and Bayesian regression based models are developed. In these models, the decision maker’s degree of optimism and credibility are integrated using the ordered weighted averaging operator. A robust Bayesian updating based framework is proposed to update the performance of the water main failure model for medium to large-sized utilities with adequate failure information.
An LCC framework is prepared for the water mains of small to medium-sized utilities. Finally, a Bayesian belief network (BBN) based water main failure risk framework is developed for small to medium-sized utilities with no or limited failure information. Integrating the proposed robust Bayesian models with the geographic information system (GIS) of a water utility will provide information at both the operational level and the network level. The proposed tool will help utility engineers and managers predict suitable new installation and rehabilitation programs, as well as their corresponding costs, for effective and proactive decision-making, thereby avoiding unexpected and unpleasant surprises.

Preface

I, Golam Kabir, prepared all the contents of this thesis, including the literature review, data access, model development and analysis, interpretation of results, and writing of the manuscripts, under the supervision of Dr. Solomon Tesfamariam. The other co-authors of the articles are Dr. Rehan Sadiq, Dr. Jason Loeppky, Dr. Alex Francisque, and Gizachew Demissie. Drs. Solomon Tesfamariam and Rehan Sadiq reviewed all the manuscripts and provided critical feedback that improved the manuscripts and the thesis, while the other co-authors provided technical advice, GIS mapping, and manuscript editing. Most of the contents of this thesis have been published, accepted, or submitted for publication in the following journals and conference proceedings.

A version of Chapter 3 has been published in the ASCE-ASME Journal of Risk and Uncertainty in Engineering Systems, Part A: Civil Engineering under the title “Integrating Bayesian Linear Regression with Ordered Weighted Averaging: Uncertainty Analysis for Predicting Water Mains’ Failures” (Kabir et al. 2015a).

A version of Chapter 4 has been published in the Canadian Journal of Civil Engineering under the title “Bayesian Model Averaging for the Prediction of Water Main Failure for Small to Large Canadian Municipalities” (Kabir et al. 2016).

A portion of Chapter 5 has been published in the Proceedings of the 12th International Conference on Applications of Statistics and Probability in Civil Engineering (ICASP) under the title “Prediction of water mains failure: a Bayesian approach” (Kabir et al. 2015b).

A version of Chapter 5 has been published in Reliability Engineering & System Safety under the title “Predicting Water Main Failures using Bayesian Model Averaging and Survival Modelling Approach” (Kabir et al. 2015c).

A version of Chapter 6 is under review in Knowledge-Based Systems under the title “Predicting Water Main Failures: A Bayesian Model Updating Approach” (Kabir et al. 2015d).

A version of Chapter 7 is under review in Urban Water Journal under the title “A Life Cycle Cost Approach for Water Mains’ Renewal in Small to Medium-sized Water Utilities” (Kabir et al. 2015e).

A version of Chapter 8 has been published in the European Journal of Operational Research under the title “Evaluating Risk of Water Mains Failure using a Bayesian Belief Network Model” (Kabir et al. 2015f).

Table of Contents

Abstract
Preface
Table of Contents
List of Tables
List of Figures
List of Abbreviations
List of Symbols
Acknowledgements
Dedication
Chapter 1 Introduction
1.1 Background and motivation
1.2 Research objectives
1.3 Thesis organization
1.4 Research methodology
1.4.1 Methodology: Objective 1
1.4.2 Methodology: Objective 2
1.4.3 Methodology: Objective 3
1.4.4 Methodology: Objective 4
1.4.5 Methodology: Objective 5
1.4.6 Methodology: Objective 6
Chapter 2 Literature Review
2.1 Literature review
2.1.1 Pipe failure models
2.1.1.1 Regression models
2.1.1.2 Survival analysis
2.1.1.3 Heuristic models
2.1.2 Life cycle cost (LCC)
Chapter 3 Bayesian Regression Based Failure Model
3.1 Background
3.2 Methodology
3.2.1 Multiple linear regression
3.2.2 Bayesian linear regression
3.2.2.1 Confidence intervals vs credible intervals
3.2.3 Model selection
3.2.4 OWA operators
3.2.4.1 Basic concepts
3.2.4.2 Determination of weights
3.3 Case study: The City of Calgary
3.3.1 Covariate selection
3.3.2 Data collection and preparation
3.3.3 Model selection
3.3.4 Bayesian regression
3.3.5 Application of OWA
3.4 Summary
Chapter 4 Bayesian Model Averaging (BMA) Based Failure Model
4.1 Background
4.2 Bayesian model averaging (BMA)
4.3 Case studies
4.3.1 City of Kelowna, BC
4.3.2 Greater Vernon Water, BC
4.3.3 City of Calgary, Alberta
4.4 Summary
Chapter 5 Bayesian Weibull Proportional Hazard Based Failure Model
5.1 Background
5.2 Proposed Methodology
5.3 Case study: City of Calgary
5.3.1 Data collection
5.3.2 Results and discussions
5.4 Summary
Chapter 6 Bayesian Updating Based Failure Model
6.1 Background
6.2 Methodology
6.3 A Case study for City of Calgary
6.3.1 Data collection
6.3.2 Covariate selection
6.3.3 Model development
6.3.4 Model performance evaluation
6.4 Summary
Chapter 7 Life Cycle Cost (LCC) for Repair and Replacement
7.1 Background
7.2 Proposed LCC framework
7.2.1 Pipe deterioration/condition states
7.2.2 Life cycle cost (LCC)
7.3 Case studies
7.3.1 Pipe deterioration curves development
7.3.2 Life cycle cost determination
7.3.3 Overview of LCC based DST in RiLCAMP tool
7.4 Summary
Chapter 8 Bayesian Belief Network (BBN) for Prioritization of Water Mains
8.1 Background
8.2 Bayesian Belief Network (BBN)
8.3 BBN for risk of failure for water mains
8.3.1 Water quality index
8.3.2 Hydraulic capacity index
8.3.3 Structural integrity index
8.3.4 Consequence index
8.3.5 Implementation of water mains risk in BBN
8.3.6 Sensitivity Analysis of Aggregated Failure Risk Index
8.3.7 Scenario Analysis
8.4 City of Kelowna: Case Study
8.4.1 Database management
8.4.1.1 Water quality data
8.4.1.2 Water velocity and pressure
8.4.1.3 Soil corrosivity
8.4.1.4 Land use
8.4.1.5 Population density
8.4.2 Results and discussions
8.5 Summary
Chapter 9 Conclusions and Recommendations
9.1 Summary and conclusions
9.2 Originality and contributions
9.3 Limitations and recommendations for future work
References
Appendix A: Summary of the survival analysis models for pipe failure

List of Tables

Table 1.1: Drinking water/ wastewater infrastructure condition of different countries
Table 2.1: Summary of linear and exponential regression models for water main failures
Table 2.2: Summary of survival analysis methods for water main failures
Table 2.3: Risk factors of water main failure recommended by different literature
Table 2.4: Comparison of various network based techniques
Table 2.5: Summary of LCC studies of water mains
Table 2.6: Break cost factors considered by Dandy and Engelhardt (2001, 2006)
Table 3.1: Summary of control variables used the regression model
Table 3.2: Parameter significance results of different models for CI pipes
Table 3.3: Bayes factor or odd ratio of different models for CI pipes
Table 3.4: Parameter significance results of different models for DI pipes
Table 3.5: Bayes factor or odd ratio of different models for DI pipes
Table 3.6: Linguistic quantifier, optimistic condition, and weights for different orness
Table 4.1: Summary of control variables for CoK
Table 4.2: Classical and Bayesian analysis of the CoK
Table 4.3: Performance of BMA and regression models for the CoK
Table 4.4: Summary of control variables for GVW
Table 4.5: Classical and Bayesian analysis of the GVW
Table 4.6: Performance of BMA and regression models for the Greater Vernon Water
Table 4.7: Summary of control variables for City of Calgary
Table 4.8: Classical and Bayesian analysis of the City of Calgary
Table 4.9: Performance of BMA and regression models for the City of Calgary
Table 5.1: Summary of control variables
Table 5.2: Mean, SD and posterior probabilities of the coefficients for CI pipes (NOPF >0) based on BMA
Table 5.3: Mean and SD of the coefficients of BWPHMs for CI and DI pipe strata
Table 5.4: Performance of BWPHM and Cox-PHM for break prediction
Table 5.5: Description and failure history of M21738 and A16386-01 pipes
Table 6.1: Mean, SD and posterior probabilities of the coefficients for DI pipes using BMA
Table 6.2: Mean, SD and posterior probabilities of the coefficients for CI pipes using BMA
Table 6.3: Number of datasets for four periods for different models
Table 6.4: Mean, standard deviation, and MC error of the coefficients of BWPHMs for CI pipes (NOPF > 0)
Table 6.5: Mean, standard deviation, and MC error of the coefficients of BWPHMs for CI pipes (NOPF = 0)
Table 6.6: Mean, standard deviation, and MC error of the coefficients of BWPHMs for DI pipes (NOPF > 0)
Table 6.7: Mean, standard deviation, and MC error of the coefficients of BWPHMs for DI pipes (NOPF = 0)
Table 6.8: Performance of BWPHM and Cox-PHM for CI and DI pipes break prediction
Table 7.1: Number of water mains failure of GVW and City of Kelowna
Table 7.2: Mean and standard deviation of the Bayesian regression coefficient for different models
Table 7.3: Average repair costs for different situations for GEID and GVW
Table 7.4: Pipe replacement or new pipe installation cost for GEID and GVW
Table 7.5: Basic pipe characteristics data and net present value
Table 7.6: List of different M/R/R techniques and corresponding costs
Table 7.7: GEID_1323 pipe condition after each M/R/R actions over the planning period
Table 7.8: Weight of deterioration rate after intervention or actions
Table 7.9: Cost for each M/R/R actions of GEID_1323 pipe over the planning period
Table 8.1: Details of indices for water quality index
Table 8.2: Details of indices for hydraulic capacity index
Table 8.3: Details of indices structural integrity index (physical factors)
Table 8.4: Details of indices structural integrity index (external factors)
Table 8.5: Details of indices structural integrity index (internal factors)
Table 8.6: Details of indices consequence index
Table 8.7: Example of conditional probability table for Aggregated Failure Risk Index
Table 8.8: Sensitivity analysis using proposed BBN model
Table 8.9: Results of 3 scenarios using proposed BBN model
Table 8.10: Water main pipe inventory for City of Kelowna
Table 8.11: Water quality sampling station name and geographic coordinates
Table 8.12: Available metallic pipe features of City of Kelowna

List of Figures

Figure 1.1: Age of Canada’s infrastructure (modified after Mirza 2007)
Figure 1.2: Classification of uncertainties (modified after Klir and Yuan 1995)
Figure 1.3: Thesis organization
Figure 1.4: Proposed framework for objective 1
Figure 1.5: Proposed framework for objective 3
Figure 2.1: Flow chart of literature review
Figure 2.2: Classification of survival analysis method for pipe failure analysis
Figure 3.1: Water distribution system of the City of Calgary
Figure 3.2: Number of pipe installation of City of Calgary
Figure 3.3: Number of water main breaks of City of Calgary
Figure 3.4: Histogram of different covariates of CI pipes
Figure 3.5: Histogram of different covariates of DI pipes
Figure 3.6: Histogram of Bayesian regression parameters for CI pipes
Figure 3.7: Histogram of Bayesian regression parameters for DI pipes
Figure 3.8: Observed Vs predicted curve of log(TBRKS) for CI pipes
Figure 3.9: CPDF of TBRKS for CI pipes using normal & Bayesian regression
Figure 3.10: Comparison of RCI and BCI for CI pipes
Figure 3.11: Comparison of mean response prediction of normal and Bayesian regression for testing dataset for CI and DI
Figure 3.12: Comparison of predicted response of normal and Bayesian regression for testing dataset for CI and DI
Figure 3.13: Response prediction curves of CI pipes for 5%, 50% & 95% Bayesian credible interval
Figure 3.14: Response prediction curve of CI pipes for different orness values
Figure 4.1: Posterior distributions and probabilities of the regression coefficients for plastic pipes
Figure 5.1: Number of failures of CI and DI pipes with respect to age and diameter
Figure 5.2: Number of failures of CI and DI pipes with respect to soil resistivity and soil corrosivity index
Figure 5.3: Posterior distribution of the coefficients for CI pipes (NOPF >0) based on BMA
Figure 5.4: Selection of scale parameters for BWPHMs
Figure 5.5: Histogram of the coefficients of BWPHM for CI pipes (NOPF > 0)
Figure 5.6: Survival curves for CI pipe strata with NOPF=0 and NOPF>0
Figure 5.7: Survival curves for DI pipe strata with NOPF=0 and NOPF>0
Figure 5.8: Observed breaks and predicted uncertainty bounds of BWPHM for CI pipes
Figure 5.9: Observed breaks and predicted uncertainty bounds of BWPHM for DI pipes
Figure 5.10: Observed and predicted breaks for CI pipe strata
Figure 5.11: Observed and predicted breaks for DI pipe strata
Figure 5.12: Survival function estimation of M21738 and A16386-01
Figure 6.1: Bayesian updating framework for the water main failure prediction
Figure 6.2: Posterior distribution of the coefficients for DI pipes (NOPF > 0)
Figure 6.3: Posterior distribution of the coefficients for CI pipes (NOPF > 0)
Figure 6.4: Survival curves for CI pipe with NOPF>0 and NOPF=0
Figure 6.5: Survival curves for DI pipe with NOPF>0 and NOPF=0
Figure 6.6: Observed and predicted breaks of BWPHMs and Cox-PHMs for CI pipes
Figure 6.7: Observed and predicted breaks of BWPHMs and Cox-PHMs for DI pipes
Figure 7.1: Proposed methodology for LCC of small to medium-sized water utilities
Figure 7.2: Example timing of future activities
Figure 7.3: Posterior distribution of the regression coefficients of metallic pipes (Diameter > 200 mm)
Figure 7.4: The welcome page of the of the RiLCAMP tool
Figure 7.5: Deterioration profile of GEID_1323 pipe over the planning period
Figure 7.6: LCC profile of GEID_1323 over the planning period
Figure 7.7: Net present value ($) of GEID_1323 for different discount rate (%)
Figure 8.1: A schematic of BBN (modified after Cockburn and Tesfamariam 2012; Tesfamariam and Liu 2013)
Figure 8.2: Proposed BBN model for risk of metallic water main failure
Figure 8.3: Number of water mains with respect to pipe material for City of Kelowna
Figure 8.4: Land uses of water main
Figure 8.5: Population distribution (person/km²) of City of Kelowna by DA
Figure 8.6: Aggregated risk index for City of Kelowna water mains (a) Summer 2012, (b) Winter 2012
Figure 8.7: Aggregated risk index map for City of Kelowna water mains (Summer 2012)

List of Abbreviations

ABC: Activity based costing
AC: Asbestos-cement
ACU: Color
AGE: Pipe age
AHP: Analytical hierarchy process
AI: Aggressiveness index
AIC: Akaike information criterion
ANN: Artificial Neural Networks
ANP: Analytic Network Process
ASCE: American Society of Civil Engineers
AWWA: American Water Works Association
BBN: Bayesian Belief Networks
BC: British Columbia
BCI: Bayesian credible interval
BIC: Bayesian information criterion
BMA: Bayesian model averaging
BMR: Bayesian multiple regression
BWPHM: Bayesian Weibull Proportional Hazard Model
CI: Cast iron
CM: Cognitive Maps
CN: Credal Network
COI: Consequence index
Cox-PHM: Cox proportional hazard model
CPDF: Cumulative probability distribution functions
CPT: Conditional probability tables
DA: Dissemination area
DF: Degree of freedom
DI: Ductile iron
DIA: Pipe diameter
Disp: Dispersion
DM: Decision maker
DP: Dynamic programming
DST: Decision support tool
EM: Expectation-maximization
ERM: Exponential Regression Model
FCM: Fuzzy Cognitive Maps
FI: Freezing index
FLQ: Fuzzy linguistic quantifier
FRBM: Fuzzy Rule-Based Models
GEID: Glenmore-Ellison Improvement District
GIS: Geographic information system
GLM: Generalised Linear Model
GVW: Greater Vernon Water
HCI: Hydraulic capacity index
HDPE: High-Density Poly Ethylene
LANUSE: Land use
LCA: Life cycle analysis
LCC: Life cycle cost
LENGTH: Pipe length
log-like: log-likelihood
LoS: Level of service
LRM: Linear Regression Model
LU: Land use
MC: Moisture content
MCDA: Multicriteria Decision Analysis
MCMC: Markov chain Monte Carlo
MCS: Monte Carlo simulation
MDP: Markov Deterioration Process
ME: Maximum entropy
MEM: Multivariate exponential model
MERM: Multivariate Exponential Regression Model
MEWM: Multivariate Exponential/Weibull Models
MLRM: Multiple Linear Regression Model
MOGA: Multi-objective genetic algorithm
MRM: Multiple Regression Model
M/R/R: Maintenance/ rehabilitation/ replacement
MSE: Mean square error
NBRKS: Number of previous breaks
NHPP: Non-homogeneous Poisson process
NMR: Normal multiple regression
NOPF: Number of observed previous failures
NPV: Net present value
NPVC: Net present value of cost
NSGA: Non-dominated sorting genetic algorithm
NTU: Turbidity
OWA: Ordered weighted averaging
PD: Pipe diameter
PDF: Probability density functions
PE: Polyethylene
PFP: Pipe failure prediction
PP: Population density
PRM: Poisson Regression Model
PVC: polyvinyl chloride
PV: Present value
R²: Coefficient of determination
RC: Free residual chlorine
RCI: Regression confidence interval
RD: Rain deficit
RESIS: Soil resistivity
RiLCAMP: Risk based Life Cycle Asset Management tool for water Pipes
RMSE: Root mean square error
ROCOF: Rates of occurrence of failures
RP: Redox potential
R&R: Repair and replacement
R/R/R: Repair, rehabilitation and replacement
SBC: Schwarz’s Bayesian criterion
SC: Sulfide content
SCI: Soil corrosivity index
SCR: Soil corrosivity
SD: Standard deviation
SII: Structural integrity index
SOREG: Soil resistivity
SpH: Soil pH
SR: Soil resistivity
SSE: Sum square error
TBRKS: Break/failure rate
TEMP: Average temperature
TR: Type of road
TRM: Technology road map
TS: Transition-state
TT: Type of traffic
UP: Unconditional probability
VINT: Vintage
WA: Water age
WDN: Water distribution network
WP: Water pressure
WpH: Water pH
WPHM: Weibull proportional hazard model
WQI: Water quality index
WV: Water velocity

List of Symbols

A: variance-covariance matrix
BKjObserved: observed breakage rate of the pipe i
BKjpredicted: predicted breakage rate of the pipe i
Cc: construction cost
Ct: sum of maintenance, repair, replacement/rehabilitation costs and salvage value
Cijt: total incurred cost at condition state i by choosing decision j at period t
d: discount rate
DA: dissemination area
DDp: all the days below the threshold temperature
e: evapotranspiration
E(Q|f): expected real value of query node after the new finding f
E(β|D): mean of the coefficients of different models
f: state of the varying variable node
FIp: freezing index for day i
FLUCW: final land use consequence weight
FPCW: final population consequence weight
h(t|X): hazard function
h0(t): baseline hazard function
ln: natural logarithm
L(θ): partial likelihood estimate L of θ
Li: length of the pipe ith portion that intersects the ith DA
m: total number of data points
P: precipitation
Pavg: pipe average pressure
Pi: nodal pressures
PDi: population density of the ith intersected DA
p(X): prior occurrence probability of X
p(Y): marginal (total) occurrence probability of Y
p(X|Y): posterior occurrence probability of X given the condition that Y occurs
p(Y|X): conditional occurrence probability of Y given that X occurs
p(q|f): conditional probability of q given f
Pr(Mk): prior probability of model Mk
Pr(θk|Mk): prior density of θk under model Mk
Pr(D|Mk): marginal likelihood of model Mk
Pr(βi≠0|D): posterior probabilities of the coefficients of different models
q: state of the query node
S(t|X): survival function
S0(t): baseline survivor function
SD(β|D): standard deviation of the coefficients of
different models sl planning period TLCC present value of the total life cycle cost wi preference weights α significance level (regression) α scale parameter (survival analysis) β column vector of unknown regression parameters or coefficients μ mean vector σ2 variance ∞ orness Δ quantity of interest ϵ random component δi censoring indicator Φ threshold temperature χ threshold window θ vector of covariate coefficients xxii Acknowledgements At first, I want to convey my deepest gratitude to the almighty ALLAH, the beneficial, the merciful for granting me to materialize this research in the present form. My deepest gratitude goes to my supervisor Professor Dr. Solomon Tesfamariam, who had always been there to help me as a mentor. I really appreciate all his contributions of dedicated time, and great ideas to make my PhD experience dynamic and exciting. I would like to thank for his encouragement that allowed me to grow as a researcher. I would like to thank Dr. Rehan Sadiq for his valuable guidance, thoughtful suggestions and critical comments and encouragement throughout the progress of this research work. I would like to also express my heartfelt thanks to my committee member Dr. Cigdem Eskicioglu for her valuable comments and constructive suggestions. I am also grateful to Professor Dr. Jason Loeppky for sharing his knowledge and guidance for conducting this research. I am indebted to my amazing colleagues of my research groups, Dr. Alex Francisque, Dr. Husnain Haider, Gizachew Demissie and Hassan Iqbal. It was a great pleasure to work with such a wonderful research group. This research would not have been possible without their help and support. I would also like to thank all the staff of various water utilities, who helped me in data collection and shared their experiences with me. The appreciation is certainly due to faculty and staff at the UBC who supported me on several occasions during the study period. 
The author is also grateful to all the writers and publishers of the books and journals that have been taken as references while conducting this research. This research would have been difficult without the financial support from the Natural Sciences and Engineering Research Council Collaborative Research and Development (NSERC CRD), Mitacs-Accelerate and partial financial support from the University of British Columbia in the form of the University Graduate Fellowship and International Partial Tuition Scholarship.

I would like to thank all my friends in Kelowna for their efforts and selfless care, which alleviated my loneliness and helped me to concentrate on my research. I am also grateful to my friends in Bangladesh for their support and encouragement throughout my graduate study. Words cannot express how grateful I am to my mother, father, mother-in-law, father-in-law, my brother (G M Russel), my sister (Razia Sultana Sumi), my nephews and niece, as well as all the other members of my families, who provided continuous inspiration, sacrifice and support to complete the research work successfully. With very special recognition, I would like to thank my beloved wife and best friend Afruna Lizu Dona. I have no other words to express her contribution in shaping my career throughout my life. This acknowledgment would be incomplete if I did not mention my little angel Ruqayaah, who made my PhD life challenging as well as memorable. You are the best gift during this research work.

Dedication

To My Family

Chapter 1 Introduction

1.1 Background and motivation

For the last few decades, concerns have repeatedly been raised about deteriorating public infrastructure in Canada, which can pose a serious threat to the wellbeing of Canadians (Mackenzie 2013). According to the Technology Road Map (TRM) (2003), about 41%, 31% and 28% of Canadian infrastructure is 40 years old or less, between 40 and 80 years and more than 80 years, respectively (Figure 1.1).
TRM further reported that 79% of the total service life of Canada’s public infrastructure has passed and that infrastructure deterioration accelerates with age (Mirza 2007).

Figure 1.1: Age of Canada’s infrastructure (modified after Mirza 2007)

Water distribution networks (WDNs) are among the most important and expensive municipal infrastructure assets (Berardi et al. 2006) and are vital to public health. Table 1.1 shows the infrastructure condition ratings of Australia, Canada, South Africa, UK and USA. The American Society of Civil Engineers (ASCE) 2013 Report Card for America’s Infrastructure gave a grade of “D” to water/wastewater infrastructure. ASCE (2013) estimated that the investment gap for water/wastewater infrastructure is $84 billion (in 2010 dollars) to maintain a state of good repair or to achieve a grade of B.

Table 1.1: Drinking water/wastewater infrastructure condition of different countries

Australia (grades reported for 1999, 2001, 2005, 2010)
  Potable Water: C-, C, B-, B-
  Wastewater: D-, C-, C+, B-
  Stormwater: D, C-, C
  Water overall: C, C+

Canada, 2012 (Very poor, Poor, Fair, Good, Very good)
  Drinking Water: 0.7%, 0.3%, 14.4%, 80.5%, 4.2%
  Stormwater: 0.8%, 4.9%, 17.7%, 36.2%, 40.5%
  Wastewater: 1.2%, 6.5%, 22.4%, 36.1%, 33.7%

Canada, 2016 (Very poor, Poor, Fair, Good, Very good)
  Drinking Water: 3.0%, 9.0%, 17.0%, 35.0%, 36.0%
  Stormwater: 2.0%, 6.0%, 16.0%, 33.0%, 59.0%
  Wastewater: 3.0%, 8.0%, 24.0%, 26.0%, 39.0%

South Africa (2006, 2011)
  Water: 2006: ¹D+, ²C+, ³D-; 2011: ¹D-, ²C+, ³D-

UK (2002, 2003, 2004, 2005, 2006, 2010)
  Water and wastewater: B, B+, B+, B+, B, B

USA (1988, 1998, 2001, 2005, 2009, 2013)
  Drinking Water: B-, D, D, D-, D-, D
  Wastewater: C, D+, D, D-, D-, D

Grade:
Australia: A = Very good, B = Good, C = Adequate, D = Poor, F = Inadequate
Canada: A = Very good, B = Good, C = Fair, D = Poor, F = Very poor
South Africa: A = World class, B = Fit for future, C = Satisfactory for now, D = At risk, E = Unfit for purpose
UK: A = Fit for the future, B = Adequate for now, C = Requires attention, D = At risk, E = Unfit for purpose
USA: A = Exceptional, B = Good, C = Mediocre, D = Poor, F = Failing
¹Department of Water Affairs infrastructure; ²For major urban areas; ³For all other areas

Canada’s first national infrastructure Report Card (2012) indicated that a substantial amount of drinking-water, stormwater and wastewater infrastructure is in “fair” to “very poor” condition. The replacement costs of these assets are $25.9, $15.8 and $39 billion (in 2010 dollars), respectively. Moreover, according to the more recent infrastructure Report Card (2016), the percentage of drinking water infrastructure in “fair” to “very poor” condition increased significantly, from 15.4% to 29%, which is an indication of alarming future conditions. According to the United States and Canada Infrastructure Report (2007), there have been more than 2 million breaks in Canada and the United States since January 2000, with an average of 700 water main breaks every day, costing more than $10 billion/year. Moreover, water main leaks affect other nearby infrastructure such as sewer, stormwater, pavement, and gas pipes, which may lead to catastrophic failures (US EPA 2011). Table 1.1 indicates only a few substantial improvements in the condition of drinking water/wastewater infrastructure in spite of significant research contributions.

The water and wastewater infrastructure repair and replacement (R&R) action program has transformed from corrective or reactive (actions after failure) to preventive (actions before failure) maintenance. A successful preventive R&R action program provides predictive tools to evaluate water main failure, determine the total life cycle cost (LCC) associated with the assets, and recommend optimum prioritization or decision-making strategies (Almeida et al. 2011; Christodoulou et al. 2009; Clair and Sinha 2014; Fares and Zayed 2010).
Better understanding and modelling of water main failure will help utility managers address the structural and hydraulic failure of water mains proactively while meeting financial constraints, level of service, and regulatory requirements. However, uncertainties become an integral part of the water main R&R action program because of incomplete and partial information, the integration of data/information from different sources, and the involvement of human (expert) judgment in the interpretation of data and observations. Klir and Yuan (1995) identify uncertainties as vagueness or fuzziness and ambiguity (discord or conflict and non-specificity) (Figure 1.2).

Figure 1.2: Classification of uncertainties (modified after Klir and Yuan 1995) [boxes: Uncertainty; Vagueness: the lack of definite or sharp distinctions; Ambiguity: one-to-many relationships; Discord (Conflict): disagreement in choosing among several alternatives; Non-Specificity: two or more alternatives are left unspecified]

Moreover, the decision-making problem becomes more complex and uncertain when multiple experts are involved who have different levels of credibility about their knowledge of the problem (Tesfamariam et al. 2010). Data quality also becomes a serious issue, as many data sets contain uncertainty, e.g. due to the unreliable recording of failure times, inaccurate measurements of the confounding factors, or even the lack of the actual failure times (Economou et al. 2008). Although a significant body of research has focused on the development of R&R action programs, the difficulty remains in interpreting, incorporating and synthesizing uncertainties into the research.

The amount and quality of water main break data available for developing or implementing different water main failure models vary among utilities. Most previous water main failure and LCC studies focused on larger utilities, where adequate pipe failure information and a sufficient number of experts and resources are available (e.g., Engelhardt et al. 2003; Herstein et al. 2011; Kleiner et al. 2001; Lim et al. 2006; Shahata 2006; Shahata and Zayed 2013). For small and medium-sized utilities, uncertainties become a vital and inevitable element of the decision-making process due to scarcity of data/information, lack of technical and financial resources, and the involvement of human (expert) judgment with limited experience in the interpretation of data and observations (Haider et al. 2013; Wood and Lence 2009). For this reason, it is not effective to develop one general water main failure prediction model that can handle different levels of uncertainty and be applicable to water networks of all sizes, i.e., small, medium and large. This research aims to overcome these limitations by developing a robust R&R action program using a Bayesian framework that considers the different levels of uncertainty in the pipe failure model and LCC for small, medium and large water utilities.

1.2 Research objectives

The overall objective of this research is to develop a robust water main R&R action program for water utilities considering uncertainty at various levels. The specific objectives of this research are to:
1. Develop a methodology that addresses the uncertainties of water main failure modeling with limited failure information and considers the decision maker’s degree of optimism and credibility.
2. Develop a methodology to identify the influential covariates and to predict the failure rates of water mains considering model uncertainties with limited failure information.
3. Develop survival curves to predict the failure rates of water mains considering model uncertainties for medium to large-sized utilities with adequate failure information.
4. Develop a robust Bayesian updating based water main failure framework to update the performance of the water main failure model with adequate failure information.
5. Conduct an in-depth critical review of existing LCC studies in the field of water mains and develop an LCC framework for water mains of small to medium-sized utilities.
6. Develop a Bayesian belief network (BBN) based water main failure framework for small to medium-sized utilities with no or limited failure information.

1.3 Thesis organization

The organization of the thesis is presented in Figure 1.3. The thesis is organized based on six journal articles which are either published or submitted. Each publication achieves a specific objective of this research, as shown in Figure 1.3. Chapter 2 provides the literature review that presents background information and develops the framework for this research. Chapters 3 to 8 are based on research gaps and present articles which are published or under review. Finally, Chapter 9 contains the summary of research outcomes and recommendations for future research.

Chapter 3 (Publication #1) presents a comparative evaluation of the prediction accuracy of normal multiple linear regression and Bayesian regression models using water main failure data/information from the City of Calgary, Alberta. The ordered weighted averaging (OWA) operator is integrated to capture the optimistic or pessimistic attitude of the decision maker (DM).

Chapter 4 (Publication #2) evaluates the Bayesian model averaging (BMA) method to identify the influential covariates and to predict the failure rates of water mains considering model uncertainties. To validate the proposed model, it is implemented to predict the failure of pipes in the water distribution networks of the City of Kelowna, BC, Greater Vernon Water, BC, and the City of Calgary, Alberta, Canada.
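As a rough illustration of the BMA idea summarized for Chapter 4, the sketch below enumerates every subset of candidate covariates, weights each linear model by the common BIC approximation to its posterior probability (with equal prior model probabilities), and sums the weights of the models that contain each covariate to obtain a posterior inclusion probability. The BIC approximation, the function names `bic_linear` and `bma_inclusion`, and the synthetic data are assumptions made for illustration; they are not taken from the thesis, whose own BMA implementation may differ.

```python
import itertools
import numpy as np


def bic_linear(X, y):
    """BIC of an ordinary least-squares fit of y on X plus an intercept."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])          # design matrix with intercept
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)  # least-squares coefficients
    rss = float(np.sum((y - A @ beta) ** 2))
    return n * np.log(rss / n) + A.shape[1] * np.log(n)


def bma_inclusion(X, y, names):
    """Approximate posterior inclusion probability of each covariate."""
    p = X.shape[1]
    subsets = [s for r in range(p + 1)
               for s in itertools.combinations(range(p), r)]
    bics = np.array([bic_linear(X[:, list(s)], y) for s in subsets])
    # exp(-BIC/2) approximates the marginal likelihood; equal model priors assumed
    weights = np.exp(-0.5 * (bics - bics.min()))
    weights /= weights.sum()
    return {name: float(sum(w for w, s in zip(weights, subsets) if j in s))
            for j, name in enumerate(names)}


# Hypothetical synthetic data: the break rate depends on age only.
rng = np.random.default_rng(0)
age = rng.uniform(10, 80, 200)
diameter = rng.uniform(100, 400, 200)
breaks = 0.05 * age + rng.normal(0, 0.5, 200)

probs = bma_inclusion(np.column_stack([age, diameter]), breaks, ["AGE", "DIA"])
print(probs)
```

Exhaustive enumeration is only practical for a handful of covariates, since the number of candidate models doubles with each one added.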
Chapter 5 (Publications #3 & 4) presents a Bayesian framework for predicting the failure of water mains, in which BMA is used to identify the influential covariates and the Bayesian Weibull Proportional Hazard Model (BWPHM) is applied to develop the survival curves. The utility of the proposed framework is illustrated with the City of Calgary, Alberta pipe failure data.

Chapter 6 (Publication #5) presents a Bayesian updating based water main failure prediction framework to improve the performance of the water main failure prediction models presented in the previous chapter.

Chapter 7 (Publication #6) presents an LCC model for small to medium-sized water utilities to prioritize repair, rehabilitation and replacement (R/R/R) strategies for water mains. The proposed model is implemented in two small water utilities in interior British Columbia, namely Glenmore-Ellison Improvement District (GEID) and Greater Vernon Water (GVW).

Figure 1.3: Thesis organization

Chapter 8 (Publication #7) provides a BBN model for evaluating the risk of failure of metallic water mains, especially for small to medium-sized water utilities with no or limited failure information. To demonstrate the application of the proposed model, the water distribution network of the City of Kelowna, British Columbia (BC) has been studied.

1.4 Research methodology

The following methodologies were adopted to achieve the objectives mentioned in Section 1.2. This research used data from the City of Kelowna, BC, Glenmore-Ellison Improvement District (GEID), BC, Greater Vernon Water (GVW) and the City of Calgary, Alberta as case studies. The research work starts with data collection. The research methodology for each objective is outlined below.

1.4.1 Methodology: Objective 1

The proposed framework for predicting water main failure using Bayesian regression is presented in Figure 1.4. A detailed literature review was performed on linear and exponential regression based pipe failure prediction models.
The influential covariates were selected from the literature review. The proposed methodology was applied to the water distribution network of the City of Calgary. Normal linear regression and Bayesian linear regression models were developed. The performance of the different models was evaluated based on significant parameters, different error measures and the Bayes factor. Bayesian regression analysis was performed for the final selected model. Different response prediction curves were integrated based on the decision maker’s degree of optimism and credibility using OWA operators.

Figure 1.4: Proposed framework for objective 1 [boxes: Literature Review; Covariate Selection; Model Development (Multiple Linear Regression, Bayesian Linear Regression); Model Selection & Analysis; OWA Operators]

1.4.2 Methodology: Objective 2

The proposed BMA methodology was applied to the water distribution networks of the City of Kelowna, BC, Greater Vernon Water, BC, and the City of Calgary, AB using the following steps:
• Pipe age, diameter, length, soil resistivity and soil corrosivity index data were collected from the GIS databases of the utilities.
• The pipes were grouped based on pipe material and diameter.
• The classical regression analysis and the BMA approach were performed for the different groups.
• The significant parameters for the different models were identified using p-values for the classical regression analysis and posterior probabilities of the regression coefficients for the BMA approach.
• The posterior distributions and posterior inclusion probabilities of the regression coefficients were determined using the BMA approach.
• The performance of the different models was assessed using multiple error measures.

1.4.3 Methodology: Objective 3

The framework of the proposed research is shown in Figure 1.5. Pipe characteristics data, soil information and pipe breakage data were collected from the water utility’s Geographic Information System (GIS).
Influential and significant covariates were selected using the BMA approach. The water main failure prediction model was developed using BWPHM. The models were evaluated using the 5-fold cross-validation method.

Figure 1.5: Proposed framework for objective 3

1.4.4 Methodology: Objective 4

Pipe characteristics data, soil information and pipe breakage data were collected from the water utility’s GIS. Influential and significant covariates were selected using the BMA approach. The entire water main failure dataset was divided into multiple periods for model development. Using the failure data of the first period, the BWPHM based water main failure prediction model was developed. Using the Bayesian updating approach, the model parameters were updated with the water main failure data of the second period. The same Bayesian updating approach was followed for the water main failure data of the remaining periods. The performance of the models after each period was determined using different error measures.

1.4.5 Methodology: Objective 5

The LCC framework for water main renewal in small to medium-sized utilities was developed using the following steps:
• Different types of data (i.e., pipe characteristics, failure, soil and hydraulic data) were collected from the water utilities’ databases.
• Pipe deterioration and condition curves were generated based on the gathered information.
• Different LCC related data (i.e., repair, rehabilitation and replacement (R/R/R) methods, R/R/R costs, planning period, discount rates) were collected.
• The R/R/R profile over the design period was determined to calculate the LCC of each pipe.

1.4.6 Methodology: Objective 6

The most current literature was thoroughly reviewed to determine the contributing risk factors of water main failure.
A knowledge and data based BBN model was developed in which causal relationships between structural integrity, hydraulic capacity, water quality, and consequence factors were established based on expert knowledge, published literature and training from available data or information. Sensitivity analysis was performed to identify critical input parameters that have a significant impact on water main failure. The performance of the proposed BBN model was checked with multiple hypothetical scenarios. The City of Kelowna, BC, Canada water distribution network was selected to demonstrate the application of the BBN model. The aggregated risk index of the City of Kelowna water mains was determined for summer and winter 2012. The proposed BBN model was integrated with the GIS of the City of Kelowna to create a risk index map.

Chapter 2 Literature Review

This chapter presents the state-of-the-art literature related to this research and identifies weaknesses and limitations of the existing literature.

2.1 Literature review

To fully appreciate the multi-dimensional nature of this research, the literature review in the following sections covers pipe failure models, prioritization, and LCC analysis of water mains (Figure 2.1).

Figure 2.1: Flow chart of literature review [boxes: Pipe Failure Models; Statistical Models; Heuristic Models; Life Cycle Cost Analysis]

2.1.1 Pipe failure models

Water utilities often rely on water main break prediction models for effective R&R action plans. Pipe failures occur not only because of environmental and operational stresses acting upon pipes but also because the structural integrity of the pipes has been altered by degradation, corrosion, poor installation and manufacturing defects (Kleiner and Rajani 2001a, 2001b). A number of physically-based and statistically-based water main prediction models have been developed in the last 30 years. Kleiner and Rajani (2001a, 2001b) provide a thorough review of statistical (Kleiner and Rajani 2001a) and physical (Kleiner and Rajani 2001b) pipe break models prior to 2001. Furthermore, Nishiyama and Filion (2013) presented a critical review of statistical water main break prediction models from 2002 to 2012.

Physical models predict breaks by simulating the mechanics of pipe failure and a pipe’s capacity to resist failure (Rajani and Tesfamariam 2004; Tesfamariam et al. 2006). Physical models can be categorized as either deterministic or probabilistic (Kleiner and Rajani 2001b; Nishiyama and Filion 2013). Statistical models are developed with historical data on pipe breaks to identify failure patterns, and they extrapolate these patterns to predict future pipe breaks (Kleiner and Rajani 2001a; Nishiyama and Filion 2013). Statistical models can be categorized into deterministic, probabilistic and soft computing methods (Kleiner and Rajani 2001a; Nishiyama and Filion 2013).

Physical modeling of individual pipes can provide strong inferences about pipe condition if sufficient data about the pipes are available, but it is prohibitively expensive to gather the information needed for physical models of every pipe in a given water distribution system (Le Gat and Eisenbeis 2000; Rajani et al. 2012). Thus, statistical models provide an efficient and ‘cheap’ surrogate. However, statistical models attempting to predict the behavior of water pipes are affected not only by the quantity and quality of available data, but also by the statistical techniques applied (Boxall et al. 2007; Economou et al. 2007, 2008). This research develops statistically based pipe failure models, and the subsequent discussion is limited to them.
2.1.1.1 Regression models

In order to successfully implement long- and short-term preventive management plans, utility managers and other authorities need access to location-specific information on pipe failure. Substantial efforts have been made to develop pipe failure prediction models based on linear and exponential regression. Researchers have applied regression methods such as exponential regression (Goulter and Kazemi 1988; Kleiner and Rajani 1999; Shamir and Howard 1979; Walski and Pelliccia 1982), multiple linear regression (Asnaashari et al. 2009; Clark et al. 1982; Wang et al. 2009; Yamijala et al. 2009), multivariate exponential regression (Clark et al. 1982; Kleiner and Rajani 2000, 2002; Rajani and Kleiner 2001; Yamijala et al. 2009), linear regression (Jacobs and Karney 1994; Kettler and Goulter 1985), Poisson regression (Asnaashari et al. 2009; Boxall et al. 2007; Christodoulou 2011), generalised linear model (Bubtiena et al. 2011; Yamijala et al. 2009), and logistic generalised linear model (Yamijala et al. 2009). To develop water main failure models, researchers considered different pipe specific, site specific, and environmental factors, and divided or grouped the data according to the material (Asnaashari et al. 2009; Boxall et al. 2007; Wang et al. 2009), number of previous breaks (Pelletier et al. 2003), diameter (Wood and Lence 2009), and installation year (Mailhot et al. 2000). Table 2.1 summarizes different linear and exponential regression based pipe failure prediction models.

Table 2.1: Summary of linear and exponential regression models for water main failures

References | Variables | Model
Shamir and Howard 1979 | age | ERM¹
Walski and Pelliccia 1982 | age, diameter, and number of previous breaks | ERM
Clark et al. 1982 | age, number of previous breaks | MLRM², MERM³
Kettler and Goulter 1985 | age, diameter | LRM⁴
Goulter and Kazemi 1988 | time, space from previous break | ERM
Jacobs and Karney 1994 | length, age | LRM
Kleiner and Rajani 1999 | age | ERM
Rajani and Kleiner 2001 | age, freezing index, cumulative rain deficit, snapshot rain deficit, cumulative length of replaced water mains, cumulative length of cathodic protection retrofit | MERM
Kleiner and Rajani 2000, 2002 | freezing index, rainfall deficit, cumulative length of replaced mains, cumulative length of cathodic protection | MERM
Boxall et al. 2007 | diameter, age, length, material, soil corrosivity | PRM⁵
Asnaashari et al. 2009 | age, length, diameter, wall thickness, maximum pressure, pipe location, cover depth, failure history | MLRM, PRM
Wang et al. 2009 | age, size, length | MLRM
Wood and Lence 2009 | age, material, diameter, soil type | MERM
Yamijala et al. 2009 | diameter, material, length, year of installation, time since last break, pressure, land use, soil type, temperature, rainfall, maximum soil moisture, max-min soil moisture, soil corrosivity | MLRM, MERM, GLM⁶, logistic GLM
Bubtiena et al. 2011 | age, material, length, diameter, depth, type of soil, water quality | GLM
Christodoulou 2011 | material, diameter, pipe type, traffic load in the vicinity, number of previous breaks, incident type | PRM

¹ERM: Exponential Regression Model; ²MLRM: Multiple Linear Regression Model; ³MERM: Multivariate Exponential Regression Model; ⁴LRM: Linear Regression Model; ⁵PRM: Poisson Regression Model; ⁶GLM: Generalised Linear Model

However, most of these studies did not consider the uncertainties in water main failure prediction, which can arise from incomplete and partial information, integration of data/information from different sources, involvement of human (expert) judgment in the interpretation of data and observations, unreliable recording of failure times or inaccurate measurements, and involvement of multiple experts with different levels of credibility about their knowledge (Kabir et al. 2015f; Rajani and Tesfamariam 2004, 2007; Tesfamariam et al. 2006, 2010; Wood and Lence 2009). Therefore, accurate quantification of model uncertainties in prediction is necessary to improve our understanding of failure processes.

In the linear and exponential regression models, the regression parameters are considered fixed, and least-squares estimation or maximum likelihood methods are used to determine the regression parameters or coefficients (Asnaashari et al. 2009; Boxall et al. 2007; Kleiner and Rajani 2002). Most researchers determine model performance based on the coefficient of determination (R²) (e.g. Asnaashari et al. 2009; Boxall et al. 2007; Christodoulou 2011; Kleiner and Rajani 2000, 2002; Wang et al. 2009). Some researchers divide their dataset into calibration (train) and validation (test) subsets to assess model performance (e.g. Bubtiena et al. 2011; Wood and Lence 2009; Yamijala et al. 2009).
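The least-squares fitting, R² scoring and calibration/validation split described above can be illustrated with a minimal sketch. It assumes a generic single-covariate exponential form, rate = a · exp(b · age), fitted by ordinary least squares on the log-transformed rate; the exact model forms used in the cited studies differ, and the function names and synthetic data here are hypothetical.

```python
import numpy as np


def fit_exponential(age, rate):
    """Fit rate = a * exp(b * age) by least squares on log(rate)."""
    b, log_a = np.polyfit(age, np.log(rate), 1)  # slope first, then intercept
    return float(np.exp(log_a)), float(b)


def r_squared(observed, predicted):
    """Coefficient of determination on the original (untransformed) scale."""
    ss_res = np.sum((observed - predicted) ** 2)
    ss_tot = np.sum((observed - np.mean(observed)) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Hypothetical synthetic break rates with multiplicative noise.
rng = np.random.default_rng(1)
age = rng.uniform(5, 80, 120)
rate = 0.02 * np.exp(0.03 * age) * np.exp(rng.normal(0, 0.1, 120))

# Calibrate on the first 90 pipes, validate on the remaining 30.
train, test = slice(0, 90), slice(90, None)
a, b = fit_exponential(age[train], rate[train])
r2_validation = r_squared(rate[test], a * np.exp(b * age[test]))
print(f"a={a:.4f}, b={b:.4f}, validation R2={r2_validation:.3f}")
```

The fit returns single point estimates of a and b, which is the "fixed parameter" treatment that the Bayesian alternatives discussed in this section relax.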
Inference is then carried out conditionally on the selected “best” model, but this ignores the model uncertainty implicit in the variable selection process (Leamer 1978; Raftery et al. 1997; Viallefont et al. 2001). As a consequence, uncertainty about quantities of interest can be underestimated (Hoeting et al. 1999; Raftery et al. 1997; Viallefont et al. 2001).

Bayesian regression based pipe break models can handle these issues by treating the model parameters as random variables and by incorporating external information, relevant historical information and elicited expert opinions into the model through the prior distributions of the parameters, which the data update to posterior distributions. Bayesian regression also provides an uncertainty assessment. These uncertainties can be integrated into the analysis in order to see their impact on the decision-making process. Thus, to reduce the uncertainty of regression-based pipe failure models, there is a need for effective Bayesian regression based models that consider uncertainties formally for effective decision making.

2.1.1.2 Survival analysis

Survival analysis is a statistical approach that deals with deterioration and failure over time and involves modeling the elapsed time between an initiating event and a terminal event (Christodoulou 2011; Cox 1972; Klein and Moeschberger 1997; Røstum 2000). Survival analysis incorporates the fact that while some pipes break, others do not, and this information has a strong impact on pipe failure analysis (Mailhot et al. 2000; Park et al. 2008a; Røstum 2000). The models use covariates to differentiate the pipe failure distributions without splitting the failure data, thereby giving a better understanding of how covariates influence the failure of the pipe (Martins 2011; Osman and Bainbridge 2011; Park et al. 2008b). The classification of survival analysis methods is provided in Figure 2.2. Kleiner and Rajani (2001a) presented a brief review of survival analysis methods for modeling pipe failure.
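To make concrete how survival analysis uses pipes that have not yet broken, the following minimal sketch implements the Kaplan–Meier estimator, one of the non-parametric methods in this family. Pipes without an observed break are treated as right-censored: they count as at risk up to their last observed age but contribute no failure. The function name and the five-pipe data set are hypothetical and serve only as an illustration.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Return [(time, survival probability)] at each distinct failure time.

    durations: age at first break, or age at end of observation if no break.
    observed:  1 if the break was observed, 0 if the pipe is right-censored.
    """
    failures = Counter(t for t, o in zip(durations, observed) if o)
    survival, curve = 1.0, []
    for t in sorted(failures):
        at_risk = sum(1 for d in durations if d >= t)  # still under observation
        survival *= 1.0 - failures[t] / at_risk        # product-limit step
        curve.append((t, survival))
    return curve


# Hypothetical pipes: ages in years; the third and fifth never broke.
ages = [5, 8, 8, 12, 15]
broke = [1, 1, 0, 1, 0]
for t, s in kaplan_meier(ages, broke):
    print(f"S({t}) = {s:.2f}")
```

Dropping the two censored pipes instead would shrink the at-risk sets and bias the survival estimate downward, which is why the estimator keeps them in the denominator.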
Table 2.2 indicates that researchers have applied a range of survival analysis methods, including the Kaplan–Meier estimator (Christodoulou 2011), homogeneous Poisson process or Poisson regression (Asnaashari et al. 2009; Boxall et al. 2007), nonhomogeneous Poisson process (NHPP) (Kleiner and Rajani 2010; Rogers 2011), zero-inflated NHPP (Economou et al. 2008; Rajani et al. 2012), exponential/Weibull model (Dridi et al. 2009, 2005; Park et al. 2008a), multivariate exponential model (Kleiner and Rajani 2002; Mailhot et al. 2000), Cox proportional hazard model (Cox-PHM) (Park et al. 2011, 2008b; Vanrenterghem-Raven 2004), Weibull proportional hazard model (WPHM) (Le Gat and Eisenbeis 2000; Vanrenterghem-Raven 2004), and Bayesian inference/analysis (Dridi et al. 2005, 2009; Economou et al. 2007, 2008; Watson et al. 2004).

[Figure 2.2: Classification of survival analysis methods for pipe failure analysis. Diagram labels: Survival Analysis; Non-parametric; Semi-parametric; Parametric; Cox proportional hazard model; Weibull proportional hazard model; Homogeneous Poisson process; Nonhomogeneous Poisson process; Univariate Exponential/Weibull model; Multivariate Exponential/Weibull model; Exponential/Weibull model; Poisson model; Bayesian Analysis]

Table 2.2: Summary of survival analysis methods for water main failures

References | Covariates | Cohorts/Groups | Methods
Kimutai et al. 2015 | length, diameter, material, soil resistivity, freezing index, rain deficit | material, number of previous breaks | Cox-PHM[1], W-PHM[2], Poisson model
Economou et al. 2012 | age, number of previous failures | material | Zero-inflated NHPP
Fuchs-Hanusch et al. 2012 | diameter, length, vintage | material | Cox-PHM
Rajani et al. 2012 | water and air temperature | material | Zero-inflated NHPP[3]
Toumbou et al. 2012 | number of pipe breaks, diameter, length, installation dates | material, number of previous breaks | MEWM[4]
Christodoulou 2011 | material, diameter, pipe type, traffic load in the vicinity, number of previous breaks, incident type | material, number of previous breaks | ANN[5], Kaplan–Meier estimator, PRM[6]
Clark and Thurnau 2011 | material, diameter, frailty factor | material, diameter | Cox-PHM / shared frailty model
Osman and Bainbridge 2011 | age, length, freezing index | material | MEM[7]
Park et al. 2011 | length, pipe internal pressure, land development, number of customers in a grid | material, number of previous breaks | Cox-PHM
Renaud et al. 2011 | material, length, installation year, break history | material, number of previous breaks | Linear extension of the Yule process
Rogers 2011 | age, material, diameter, leak type, number of previous breaks | material, number of previous breaks | NHPP
Alvisia and Franchinia 2010 | diameter, material, year of installation, number of breakages | installation year, material, number of previous breaks | MEWM, W-PHM
Clark et al. 2010 | material, diameter, frailty factor | material, diameter | Cox-PHM
Debón et al. 2010 | age, length, diameter, pressure, traffic, under a sidewalk, material | not mentioned | Cox-PHM, W-PHM, MRM[8]
Fadaee and Tabatabaei 2010 | type of failure, cumulative number of failures | type of failure | Power law model
Kleiner and Rajani 2010 | age, diameter, material, length, service connection, vintage, soil type, freezing index, cumulative rain deficit, snapshot rain deficit, cathodic protection, retrofit cathodic protection | diameter | NHPP
Asnaashari et al. 2009 | age, length, diameter, wall thickness, maximum pressure, pipe location, cover depth | material | MRM, PRM
Dridi et al. 2009 | age | number of previous breaks | Bayesian MEWM
Rogers and Grigg 2009 | age, material, diameter, number of previous breaks | material, number of previous breaks | NHPP, MCDA[9]
Gorji-Bandpy and Shateri 2008 | material, diameter, length, year of installation, type of soil, land use | number of previous breaks | MEWM
Park et al. 2008b | material, length, diameter, installation time, pipe internal pressure, land development | material, number of previous breaks | Cox-PHM
Park et al. 2008a | age, previous failure times | material, diameter | Rate of occurrence of failures (ROCOF) model
Boxall et al. 2007 | diameter, age, length, material, soil corrosivity | material | PRM
Park et al. 2007 | material, diameter | survival time | Cox-PHM
Economou et al. 2007 | age, pipe length, pipe diameter, pressure, absolute pressure | not mentioned | Bayesian NHPP
Dridi et al. 2005 | age | number of previous breaks | Bayesian MEWM
Vanrenterghem-Raven 2004 | age, length, diameter, material, traffic, soil, subway, location in the street, presence of ancient water zones | material, number of previous breaks | Cox-PHM, W-PHM
Watson et al. 2004 | age | not mentioned | Bayesian NHPP
Pelletier et al. 2003 | diameter, length, material, year of installation, type of soil, land use | number of previous breaks | MEWM
Kleiner and Rajani 2002 | cumulative length of replaced mains, cumulative length of cathodic protection, freezing index, rainfall deficit | material | MEM
Le Gat and Eisenbeis 2000 | age, length, diameter, types of pipe assembling, pressure, soil type, level of traffic, kind of supply | material, number of previous breaks | W-PHM
Mailhot et al. 2000 | material, length, diameter, installation year, soil type, land use above the pipe | installation year | MEWM

[1] Cox-PHM: Cox Proportional Hazard Models; [2] W-PHM: Weibull Proportional Hazard Models; [3] NHPP: Nonhomogeneous Poisson Process; [4] MEWM: Multivariate Exponential/Weibull Models; [5] ANN: Artificial Neural Network; [6] PRM: Poisson Regression Model; [7] MEM: Multivariate Exponential Model; [8] MRM: Multiple Regression Model; [9] MCDA: Multicriteria Decision Analysis

Le Gat and Eisenbeis (2000) used the Weibull Proportional Hazard Model (WPHM) to calculate and forecast the water main breaks of two water utilities (Charente-Maritime and Lausanne) with short and long-term records. To estimate the vector of coefficients of the explanatory variables and the scalar regression parameter, the authors maximized the log-likelihood function, which is the log-transform of the joint probability of the observations. Mailhot et al. (2000) applied a two-parameter Weibull distribution and a single-parameter exponential distribution to model the structural state of water pipe networks using a 21-year break history from the Municipality of Chicoutimi. The authors analyzed Weibull-Exponential, Weibull-Exponential-Exponential, Weibull-Weibull-Exponential, and Weibull-Weibull-Exponential-Exponential models with an increasing number of parameters. The authors reported that after the first break, the occurrence of subsequent breaks did not vary with time; they therefore concluded that the Weibull distribution can be used to describe the time to first break and the exponential distribution to model subsequent breaks. Vanrenterghem-Raven (2004) applied Cox-PHM and WPHM models to determine pipe ranking-based prioritization and optimization of rehabilitation techniques using 20 years of data from New York City.
Pipe characteristics (length, diameter, material, age) and pipe environment characteristics (traffic, soil, subway, location in the street, the presence of ancient water zones) were considered as covariates. Data were stratified by material and number of previous breaks. Boxall et al. (2007) used a Poisson generalized linear model (GLM) with a logarithmic link function to study the burst rates of cast iron and asbestos cement pipes, using two data sets from water utilities in the UK. Park et al. (2008a) proposed a method focused on modeling the failure rate and estimating the economically optimal replacement time of an individual water main using two widely used rate of occurrence of failures (ROCOF) models: the log-linear ROCOF and the power law process. The authors found that the log-linear ROCOF performed better than the power law process for the failure time-based model. The failure-time model was also shown to be an improvement over the failure number-based model, implying that recording each failure time results in better modeling of the failure rate than observing failure numbers over time intervals. Asnaashari et al. (2009) compared the performance of a multiple regression model and a Poisson model for modeling pipe breaks in the City of Sanandaj, Iran, with 10 years of recorded data. Rogers and Grigg (2009) proposed a schematic pipe failure assessment model for prioritizing water pipe renewal using data from Colorado Springs Utilities and Laramie Water utilities. The model's pipe failure prediction (PFP) module used a nonhomogeneous Poisson process (NHPP) for pipes with at least three recorded breaks, whereas a multicriteria decision analysis (MCDA) module was used for pipes with fewer than three breaks. Park et al. (2008b) applied Cox-PHM to model hazard rates for consecutive failures of 6-inch cast iron pipes in a water distribution system, using pipe failure data from 1903 to 1997.
The individual pipes were categorized into seven ordered survival time groups according to the minimum total number of breaks recorded. The covariates analyzed included pipe material, diameter, installation time, length, pipe internal pressure and land development. The results indicated that the failure times of all of the survival time groups follow a Weibull distribution. Moreover, the hazard rate for the first break increased with time, while for the second and subsequent breaks the hazard decreased with time. In another study, Park et al. (2011) presented a comprehensive process for constructing proportional hazard models (PHMs) for the time intervals between consecutive pipe breaks using the break data of a water distribution system in the United States. Christodoulou (2011) presented a framework to determine the time-to-failure of a pipe using a combination of artificial neural network (ANN), parametric and non-parametric survival analysis with five years of incident data (2002-2007) from the City of Limassol, Cyprus. Osman and Bainbridge (2011) compared two common statistical deterioration models, rate-of-failure (ROF) models and transition-state (TS) models, for the City of Hamilton, Ontario, Canada, with more than 25 years of recorded data. The ROF models extrapolate the break rate for a given data set on the basis of pipe characteristics and environmental factors, and include nonhomogeneous Poisson models and multivariate exponential models. To study the effects of temperature-based covariates on breaks in pipes of different materials (ductile iron (DI), cast iron (CI) and galvanized steel), Rajani et al. (2012) applied a nonhomogeneous Poisson model to three sets of data from US and Canadian municipalities. Data collected from Connellsville, Pennsylvania included both pipe water temperature and air temperature, whereas the data from Ottawa and Scarborough were limited to air temperature.
The authors found that water-based covariates had a more significant impact on pipe breaks than air-based covariates, and that air temperature data alone are usually sufficient for predicting water main breaks, although predictions can be enhanced with additional water temperature data. Moreover, the authors indicated that the impacts of these covariates on pipes differed according to the type of pipe material. A state-of-the-art review of the applications of survival analysis methods from 2000-2012 is presented in Appendix A. To develop water main failure models, researchers have considered different pipe-specific, site-specific, and environmental factors. Pipe failures or deterioration can be affected by a number of physical characteristics, environmental characteristics or operational factors, some of which are static while others vary over time (Kleiner and Rajani 2002, 2010; Røstum 2000; Rajani and Kleiner 2001). The combination of these factors determines the pipe failure processes and modes, though the role played by each factor varies from one water main to another owing to site-specific conditions (Hu and Hubble 2007). Table 2.2 indicates that the important covariates influencing pipe failure are pipe physical attributes (i.e., pipe material, age, diameter, length) and environmental factors (i.e., soil type, land use/development). For most of the studies, the selection of covariates depends on data availability (Boxall et al. 2007; Rogers 2011), and pipe physical attributes (i.e., length, age) were found to contribute more to pipe failure than environmental covariates, as environmental covariates are highly site/location specific (Asnaashari et al. 2009; Mailhot et al. 2000; Rajani et al. 2012). Researchers divide or group the data according to material (Asnaashari et al. 2009; Boxall et al. 2007; Osman and Bainbridge 2011; Rajani et al. 2012), number of previous breaks (Dridi et al. 2009, 2005; Le Gat and Eisenbeis 2000; Pelletier et al.
2003), diameter (Kleiner and Rajani 2010; Park et al. 2008a), installation year (Mailhot et al. 2000), and type of failure (Fadaee and Tabatabaei 2010). Appendix A presents a brief summary of the survival analysis methods for water main failure. In most of the parametric and semi-parametric models, the model/distribution parameters have been considered fixed, and they are estimated using least-squares regression (Osman and Bainbridge 2011; Kleiner and Rajani 2002) or the maximum likelihood method (Kleiner and Rajani 2010; Mailhot et al. 2000; Park et al. 2008a; Rajani et al. 2012; Rogers 2011; Rogers and Grigg 2009). For this reason, those models do not capture the model uncertainties, such as parameter uncertainty and measurement errors. Some researchers divide their dataset into calibration (training) and validation (test) sets and assess model performance based on R², ignoring uncertainties (Asnaashari et al. 2009; Boxall et al. 2007; Christodoulou 2011; Economou et al. 2008; Kleiner and Rajani 2002, 2010; Park et al. 2008a, 2008b; Rajani et al. 2012; Rogers and Grigg 2009; Rogers 2011). Uncertainties are nevertheless an integral part of water main failure prediction models. They arise from incomplete and partial information, integration of data/information from different sources, involvement of human (expert) judgment in the interpretation of data and observations, unreliable recording of failure times, inaccurate measurements of the confounding factors or even the lack of actual failure times, and involvement of multiple experts having different levels of credibility and knowledge related to the problem (Kabir et al. 2015b; Francis et al. 2014; Tesfamariam et al. 2010; Economou et al. 2007, 2012). To deal with these uncertainties in water main failure prediction models, some researchers have applied Bayesian inference or analysis.
In Bayesian analysis, model parameters are treated as random variables, and external information (e.g. elicited expert opinions, relevant historical information) is incorporated into the model by constructing a probability distribution that describes the uncertainty in the model parameters prior to observing the data from the experiment (Dridi et al. 2009, 2005; Economou et al. 2012, 2007; Watson et al. 2004). Most of the Bayesian analyses considered the nonhomogeneous Poisson process (Economou et al. 2007; Watson et al. 2004) or exponential/Weibull models (Dridi et al. 2009, 2005). Watson et al. (2004) proposed an NHPP-based Bayesian hierarchical model for pipe failure that combines the failure rates of individual pipes by assuming similarity between their parameters, which is encoded in a hyperprior. Dridi et al. (2005) developed a Bayesian exponential/Weibull model and integrated structural and hydraulic indicators to develop optimal replacement strategies for water pipes. Dridi et al. (2009) proposed a Bayesian exponential/Weibull-based pipe failure model and developed a pipe replacement strategy considering the replacement cost, the expected cost of pipe break repairs, and hydraulic performance. In their analysis, they considered the Weibull probability distribution function for the first breaks and the exponential probability distribution function for successive breaks. Economou et al. (2012) compared a Bayesian NHPP model with a Bayesian zero-inflated NHPP model, which handles the excess of zeros in the number of failures (known as zero-inflation). The authors considered a power law process for the intensity function. Gamma and normal distributions were assumed as uninformative priors. The authors found that the zero-inflated NHPP model fitted the calibration data better than the NHPP model.
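To make the two ingredients of such models concrete, the sketch below evaluates the expected number of failures under a power-law NHPP and the zero-inflated Poisson probability of a given failure count. The parameterization Lambda(t) = (t / scale) ** shape, the function names and all numerical values are illustrative assumptions; they are not the formulation or the estimates of Economou et al. (2012), and no Bayesian estimation of the parameters is attempted here.

```python
import math


def power_law_mean(t_start, t_end, scale, shape):
    """Expected failures in (t_start, t_end] for a power-law NHPP.

    Assumed cumulative intensity: Lambda(t) = (t / scale) ** shape.
    shape > 1 gives a failure rate that increases with pipe age.
    """
    return (t_end / scale) ** shape - (t_start / scale) ** shape


def zero_inflated_poisson_pmf(n, mean, p_zero):
    """P(N = n) when a fraction p_zero of pipes never fail ("excess zeros")
    and the remaining pipes fail according to a Poisson count with `mean`."""
    poisson = math.exp(-mean) * mean ** n / math.factorial(n)
    if n == 0:
        return p_zero + (1.0 - p_zero) * poisson
    return (1.0 - p_zero) * poisson


# Hypothetical pipe observed between ages 40 and 50 years.
mean_40_50 = power_law_mean(40.0, 50.0, scale=30.0, shape=1.5)
p_no_break_plain = zero_inflated_poisson_pmf(0, mean_40_50, p_zero=0.0)
p_no_break_zi = zero_inflated_poisson_pmf(0, mean_40_50, p_zero=0.3)
```

With p_zero = 0 the function reduces to the ordinary Poisson probability, so the two models can be compared on the same failure counts by changing a single parameter.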
However, pipe age was the only governing factor considered in most of these studies, which ignored other influential physical (i.e., diameter, length, and manufacturing period) and environmental (i.e., soil condition, temperature) factors. In addition, very few water main failure prediction studies mention any preliminary covariate or model selection method that considers these uncertainties. Moreover, none of the studies discussed how to improve or update the performance of the model whenever new failure information/data become available; for most of the models, the analysis has to be performed again to handle new failure or pipe information/data. Nor did they rigorously discuss the effect of time-dependent parameters (i.e., temperature difference, freezing index, etc.) on pipe failure. Hence, to accurately quantify the model and prediction uncertainties of the pipe failure model, there is a need for a robust and effective model that not only represents the uncertainty of the pipe-dependent, time-dependent, and pipe-time-dependent model/distribution parameters but also updates the model performance whenever new data/information are available.

2.1.1.3 Heuristic models

Most of the statistical pipe failure models discussed in the earlier sections focused mainly on medium to large utilities, where a large amount of pipe failure data/information and a sufficient number of experts are available (Economou et al. 2008; Kleiner and Rajani 2002, 2010; Le Gat and Eisenbeis 2000; Park et al. 2008b, 2011). The amount of water main break data needed for such extensive model development is not commonly available in many utilities (Wood and Lence 2006). In particular, most small to medium-sized utilities do have some information on their water mains (such as pipe diameter, pipe material, and date of installation), but few have been maintaining thorough records of pipe breaks for longer than a decade (Mailhot et al. 2000; Pelletier et al. 2003).
Moreover, due to various uncertainties, many utilities are not confident about their data. Thus, it will not be effective to derive a general pipe failure prediction model that could be applied to any water network (small, medium or large utilities) (Mailhot et al. 2000; Røstum 2000; Wood and Lence 2009). To handle these issues, different heuristic water main failure models have been reported. Table 2.3 summarizes the contributing risk factors and the heuristic water main failure models that have been reported to quantify risk factors for water main failure and infrastructure deterioration. Yan and Vairavamoorthy (2003) considered six factors and classified them into pipe physical and environmental factors. Christodoulou et al. (2003) included eight significant factors in their study: number of observed previous breaks, diameter, length, material, traffic load, proximity to highway, proximity to subway, and proximity to roadway/block intersection. Al-Barqawi and Zayed (2008) classified factors contributing to water main deterioration into physical, environmental and operational factors. Fares and Zayed (2009) categorized the model structure into four main factors comprising sixteen sub-factors, which represent the deterioration and post-failure factors. Francisque et al. (2009) grouped the factors into hydraulics, structural integrity and water quality parameters. Despite some differences, the performance objectives of these water main failure models can broadly be categorized as: water quality index, hydraulic capacity index, structural integrity index, and consequences index (Fares and Zayed 2010; Francisque et al. 2009; Infrastructure Report 2007).

Table 2.3: Risk factors of water main failure recommended by different literature

References | Failure index factors (structural integrity, hydraulic capacity, water quality, consequence) | Model/Framework
Shi et al. 2013 | diameter, age, material, temperature | Spatial analysis
Singh 2011 | diameter, age, material, soil resistivity | Bayesian analysis
Jafar et al. 2010 | diameter, age, material, length, thickness, soil resistivity pressure | ANN[1]
Fares and Zayed 2010 | diameter, age, material, type of traffic, soil resistivity velocity turbidity, water age population, land use | Fuzzy rule-based
Christodoulou et al. 2009 | age, material, diameter, length, type of traffic | ANN & fuzzy logic
Francisque et al. 2009 | diameter, age, material, water pH turbidity, Free residual Cl, age land use | AHP[2] & fuzzy logic
Al-Barqawi and Zayed 2008 | age, material, diameter, length, type of traffic pressure | AHP & ANN
Amaitik and Amaitik 2008 | age, diameter, soil resistivity pressure | ANN
Al-Barqawi and Zayed 2006a | age, material, diameter, soil resistivity, type of traffic | ANN
Al-Barqawi and Zayed 2006b | age, material, type of traffic diameter, soil resistivity, pressure | AHP
Kleiner et al. 2006 | age | Fuzzy rule-based
Rogers 2006 | age, material, diameter, soil resistivity | NHPP[3] & MCDA[4]
Rogers and Grigg 2006 | age, material, diameter, soil resistivity pressure | NHPP & MCDA
Najafi and Kulandaivel 2005 | age, material, diameter, length | ANN
Kleiner et al. 2004 | age | Fuzzy MDP[5]
Christodoulou et al. 2003 | age, material, diameter, length, type of traffic | ANN
Jafar et al. 2003 | age, material, diameter, thickness length, soil resistivity, type of traffic pressure | ANN
Yan and Vairavamoorthy 2003 | age, diameter, material, soil condition, road loading, surroundings | Fuzzy composite programming
Kettler and Goulter 1985 | age, material, diameter | Regression analysis
O’Day 1982 | diameter, material, soil resistivity | Function analysis

[1] ANN: Artificial Neural Network; [2] AHP: Analytic Hierarchy Process; [3] NHPP: Nonhomogeneous Poisson Process; [4] MCDA: Multicriteria Decision Analysis; [5] MDP: Markov Deterioration Process

Different water main failure models have been reported to quantify risk factors for water main failure and infrastructure deterioration. When significant historical data exist, model-free methods, such as Artificial Neural Networks (ANN), can provide insights into cause-effect relationships and uncertainties through learning from data (e.g. Christodoulou et al. 2003; Ismail et al. 2011; Najafi and Kulandaivel 2005). However, if historical data are scarce and/or the available information is ambiguous and imprecise, other soft computing techniques can provide an appropriate framework to handle such relationships and uncertainties (e.g., Bolar et al. 2013; Cockburn and Tesfamariam 2011; Deng et al. 2011; Flintsch and Chen 2004; Ismail et al. 2011; Janssens et al. 2006; Najjaran et al. 2005; Sadiq and Rodriguez 2004; Sun and Shenoy 2007; Tesfamariam and Najjaran 2007). It is important to determine the causes and effects of the probability of failure, where failure can manifest in structural integrity, hydraulic capacity and water quality (Al-Barqawi and Zayed, 2008; Christodoulou et al. 2009; Fares and Zayed 2010; Sadiq et al. 2010) and in consequences (e.g. Fares and Zayed 2010; Francisque et al. 2009). It is widely acknowledged that the relationships among these factors are not linear and require a network-based model to represent them and to determine a pipe failure risk index.
Moreover, strategies should be built on scientific approaches that combine human knowledge and experience as well as expert judgment to consider the risk of water main failure. Table 2.4 provides a qualitative comparison of six network-based computing techniques: ANN, Analytic Network Process (ANP), BBN, Cognitive Maps/Fuzzy Cognitive Maps (CM/FCM), Credal Network (CN) and Fuzzy Rule-Based Models (FRBM). Central to this comparison is an assessment of how each technique treats inherent uncertainties and its ability to handle interacting factors that encompass issues specific to the failure of water mains.

Table 2.4: Comparison of various network based techniques

Attributes | ANN | ANP | BBN | CM/FCM | CN | FRBM
Network capability | L | VH | H[1] | VH[2] | H[1] | L[3]
Ability to express causality | N | H | VH | VH | H | M
Formulation transparency | N[4] | VH | H | VH | H | H
Ease in model development | M | H | M | VH | M | M
Ability to model complex systems | VH | M | H | VH | H | H
Ability to handle qualitative inputs | N | VH | H | VH | H | H
Scalability and modularity | VL[5] | VH[6] | H | VH[6] | H | L
Data requirements | VH | L[7] | M[8] | L[9] | L[10] | L
Difficulty in modification | M | N | L | N | L | H
Interpretability of results | VH | VH | VH | H | VH | VH
Learning/training capability | VH[11] | H[12] | H[13] | H[14] | H[13] | M[15]
Time required for simulation | H | L | L | L | M | L
Maturity of science | H | H | VH | M | M | H
Ability to handle dynamic data | H | M | H | M | H | H
Examples of hybrid models (ability to combine with other approaches) | VH[16] | H[16] | H | H[17] | H | VH[16]

Ratings: N = No or Negligible; VL = very low; L = low; M = medium; H = high; VH = very high
Network based techniques: ANN = Artificial Neural Networks; ANP = Analytic Network Process; BBN = Bayesian Belief Networks; CM/FCM = Cognitive Maps/Fuzzy Cognitive Maps; CN = Credal Network; FRBM = Fuzzy Rule-Based Models
[1] Can manage networks but cannot handle feedback loops, therefore referred to as directed acyclic graphs
[2] Can handle feedback loops
[3] Dimensionality is a major problem and formulation becomes complicated for network systems
[4] Generally referred to as black box models
[5] ANN needs to be retrained for new set of conditions
[6] Very easy to expand, because algorithm is in the form of matrix algebra
[7] Minimal data requirement, because causal relationships are given by decision makers
[8] Medium data requirement for using precise probability
[9] Minimal data requirement, because causal relationships are generally soft in nature
[10] Minimal data requirement for using imprecise probability
[11] Algorithms, e.g., Hebbian learning
[12] Algorithms, e.g., minimizing the error function
[13] Algorithms, e.g., evolutionary algorithms and Markov chain Monte Carlo
[14] Training algorithms are available which have been successful in training ANNs
[15] Clustering techniques, e.g., Fuzzy C-means
[16] Examples are available in the literature to develop models using hybrid techniques, e.g., neuro-fuzzy models, fuzzy analytic network process
[17] Has a potential to be used with other soft techniques

ANN was used by Christodoulou et al. (2003) to analyze the failure risk of water mains in an urban area with historical breakage data. Christodoulou et al. (2003) indicated that the number of previous breakages, diameter, material, and length of pipe segments were the most important factors for water main failure. Jafar et al. (2003) modeled breaks in water networks using multiple linear regression and ANN. The results indicate that the ANN model performs better than the multiple linear regression models in predicting the number of failures in the water network. Najafi and Kulandaivel (2005) developed a neural network model to predict the condition of sewer pipes based on historic data, considering seven input variables: length, size, type of material, age, depth, slope, and type of sewer. However, the authors did not perform any model validation. Al-Barqawi and Zayed (2006a, 2008) developed rehabilitation priorities for water mains using an ANN approach.
Their results showed that the breakage rate has the highest relative contribution factor, followed by age. Amaitik and Amaitik (2008) developed a wire-break prediction model for pre-stressed concrete cylinder pipes using ANN. Jafar et al. (2010) used ANN to estimate the failure rate and the optimal replacement time for individual pipes of an urban water distribution system. Christodoulou et al. (2009) proposed neuro-fuzzy systems, in which artificial neural networks are combined with fuzzy logic to detect patterns in the underlying data and then convert these patterns into knowledge and generic rules that assist risk-of-failure analysis and preventive maintenance of water distribution networks. Kleiner et al. (2004) used a fuzzy rule-based nonhomogeneous Markov process to model the deterioration process of buried pipelines. The deterioration rate at a specific time is estimated from the asset’s age and condition state using a fuzzy rule-based algorithm. In another study, Kleiner et al. (2006) proposed fuzzy sets and fuzzy-based techniques to evaluate pipeline failure risk. To prioritize monitoring locations (zones) in a water distribution network (WDN), Francisque et al. (2009) coupled the concept of risk with fuzzy synthetic evaluation and a fuzzy rule base. To evaluate the risk of water main failure considering both consequence and deterioration factors, and to develop a risk scale of failure, Fares and Zayed (2010) used a hierarchical fuzzy expert system framework. Yan and Vairavamoorthy (2003) proposed a Multi-Criteria Decision Making (MCDM) technique to assess pipeline condition, which combines the available pipe condition indicators into one single indicator. Al-Barqawi and Zayed (2006b) used the analytic hierarchy process (AHP) to set rehabilitation priorities for water mains and to consider physical, environmental, and operational factors and their effects on different types of water mains.
Rogers (2006) and Rogers and Grigg (2006) compiled pipe inventory and break data from the utility’s existing operation and maintenance records to develop a Multicriteria Decision Analysis (MCDA) module based on the weighted average method. Tesfamariam et al. (2006) proposed a possibilistic approach to pipe failure risk using the mechanistic models of Rajani and Tesfamariam (2004). Rajani and Tesfamariam (2007) extended these models with a fuzzy deterioration model to estimate remaining service lives. Joseph et al. (2010) proposed a BBN to support the water quality compliance of small or rural water distribution systems. Expert judgment was used to quantify the required probability relationships. However, it is usually difficult to establish mutual relationships among nodes in the network based solely on the knowledge of experts, particularly for complex problems (Nadkarni and Shenoy 2001). If a node in a BBN has several parent nodes, or each parent node and child node has several states, the number of conditional probabilities increases exponentially (Tang and McCabe 2007). For example, if a child node has three parent nodes and each node has five states, the total number of conditional probability table (CPT) values can be as great as 5^4 (625) (Lauría and Duchessi 2007). The conditional probabilities elicited from experts can be inconsistent, especially when the CPT is large and complex. Joseph et al. (2010) limited the maximum number of parent nodes for any variable to three. However, it is not always possible to represent the causal effect properly with only three variables. In that situation, it is more reliable and consistent to construct CPTs by training from data (Cooper and Herskovits 1992; Hager and Andersen 2010; Tang and McCabe 2007).
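The growth in CPT size and the idea of estimating CPT entries from data can be illustrated with a minimal sketch. It uses simple frequency counting with Laplace smoothing, which is an assumed stand-in and not the learning algorithms of the cited studies; the function names, variable names and the four records are hypothetical.

```python
from collections import Counter, defaultdict
from math import prod


def cpt_size(child_states, parent_states):
    """Number of CPT entries: child states times every parent combination."""
    return child_states * prod(parent_states)


def learn_cpt(records, child, parents, child_states, alpha=1.0):
    """Estimate P(child | parents) by counting, with Laplace smoothing alpha.

    Only parent combinations that occur in the records are returned.
    """
    counts = defaultdict(Counter)
    for row in records:
        key = tuple(row[p] for p in parents)
        counts[key][row[child]] += 1
    cpt = {}
    for key, seen in counts.items():
        total = sum(seen.values()) + alpha * len(child_states)
        cpt[key] = {s: (seen[s] + alpha) / total for s in child_states}
    return cpt


# Three parents and a child with five states each give 5**4 entries.
n_entries = cpt_size(5, [5, 5, 5])

# Hypothetical break records (CI = cast iron, DI = ductile iron).
records = [
    {"material": "CI", "soil": "corrosive", "failure": "yes"},
    {"material": "CI", "soil": "corrosive", "failure": "yes"},
    {"material": "CI", "soil": "corrosive", "failure": "no"},
    {"material": "DI", "soil": "benign", "failure": "no"},
]
cpt = learn_cpt(records, "failure", ["material", "soil"], ["yes", "no"])
```

Each row of the learned table sums to one by construction, which avoids the inconsistencies that can occur when hundreds of entries are elicited individually from experts.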
Singh (2011) used Bayes’ theorem to obtain the posterior probabilities of water main failure, starting from prior probabilities of failure based on the age at failure, pipe diameter, cause of break and type of soil for a specific type of pipe. However, the author assumed these contributing risk factors to be independent, and the causal dependencies among the variables were not taken into consideration. Furthermore, the author did not indicate what the failure probabilities would be if all these risk factors acted simultaneously, or how to update the posterior probabilities when new failure information becomes available. Therefore, there is a need for a robust model to prioritize water mains for R&R that can provide information at pipe level and network level, can visualize the most ‘vulnerable’, the most ‘sensitive’ and the ‘highest risk’ pipes within the distribution network, and can consider the relationships between the factors and handle missing data/information.

2.1.2 Life cycle cost (LCC)

LCC is performed to evaluate investment options more effectively and to consider the impact of all costs rather than only initial capital costs (Ammar et al. 2012; Farran and Zayed 2012; Moselhi et al. 2009). The LCC of an asset is the total cost over its life span, including planning, design, acquisition and support costs and any other costs directly attributable to owning or using the asset (Ammar et al. 2012; Rajani and Kleiner 2001, 2007). LCC takes into account all aspects of the lifetime cost of the system, such as agency costs and the impacts of the system on the users in particular and on society in general (Engelhardt et al. 2003; Moselhi et al. 2009; Rajani and Kleiner 2007). The main motivation for using LCC is to increase the possibility of cost reductions during operation and maintenance (Lim et al. 2006; Shahata 2006; Singh and Tiong 2005). The LCC studies reviewed can be divided into deterministic and probabilistic methods.
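The difference between the two families of methods can be sketched as follows: a deterministic LCC discounts fixed yearly costs to a single present value, whereas a probabilistic LCC draws the costs from probability distributions and reports a distribution of present values. The function names, the normal distribution for yearly repair cost and all cost figures are hypothetical assumptions for illustration only.

```python
import random
import statistics


def present_value(yearly_costs, discount_rate):
    """Discount a sequence of yearly costs (year 0 first) to present value."""
    return sum(cost / (1.0 + discount_rate) ** year
               for year, cost in enumerate(yearly_costs))


def simulate_lcc(capital, repair_mean, repair_sd, years, discount_rate,
                 runs=2000, seed=1):
    """Probabilistic LCC by Monte Carlo simulation.

    Yearly repair cost is drawn from a normal distribution (an assumption)
    truncated at zero; one present value is returned per simulation run.
    """
    rng = random.Random(seed)
    results = []
    for _ in range(runs):
        repairs = [max(0.0, rng.gauss(repair_mean, repair_sd))
                   for _ in range(years)]
        results.append(present_value([capital] + repairs, discount_rate))
    return results


# Hypothetical pipe: 1000 cost units to install, about 50 per year to repair.
deterministic_lcc = present_value([1000.0] + [50.0] * 20, discount_rate=0.05)
samples = simulate_lcc(1000.0, 50.0, 10.0, years=20, discount_rate=0.05)
mean_lcc = statistics.mean(samples)
spread_lcc = statistics.stdev(samples)
```

The deterministic result is a single number, while the simulated samples also give a spread that a decision maker can use to judge how sensitive the comparison of alternatives is to cost uncertainty.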
In the deterministic method, a discount rate is used to express all costs in present value, whereas in the probabilistic method each cost follows a probability distribution function (Herstein et al. 2011; Nafi and Kleiner 2010; Rajani and Kleiner 2007; Rajani et al. 2004; Shahata 2006; Shahata and Zayed 2013). The extensive literature proposing LCC for comprehensive decision-making on water mains is summarized in Table 2.5.

Table 2.5: Summary of LCC studies of water mains

Reference | Different costs | Method (D/P) | Techniques | Applications
Du et al. 2013 | Capital, operation and maintenance | √ | LCA | LCA for water and wastewater pipe materials
Glick and Guggemos 2013 | Construction, maintenance, and operational | √ | NPV | For wastewater treatment facility
Shahata and Zayed 2013 | Operation and maintenance, repair, renovation, replacement | √ | MCS | To predict suitable new installation and/or rehabilitation programs
Ammar et al. 2012 | Repair, renovation, replacement, operation and maintenance | √ | Fuzzy-based LCC | Selection of rehabilitation methods of water mains
Shahata and Zayed 2012 | Operation and maintenance, repair, renovation, replacement | √ | Data acquisition analysis | For water main rehabilitation techniques
Herstein et al. 2011 | Capital cost of new pipes, pipe cleaning and lining, and new tanks | √ | NSGA | Evaluating the environmental impacts of water distribution systems
Herstein and Filion 2010 | New and duplicate pipe cost, pipe cleaning cost, tank cost | √ | NSGA | To evaluate air emissions in the manufacturing of PVC and ductile iron pipes, steel tanks
Nafi and Kleiner 2010 | Failure repair, direct damage, indirect damage, water loss, pipe replacement | √ | NSGA II | Renewal of water pipes considering adjacency of infrastructure works and economies of scale
Moselhi et al. 2009 | Total cost of rehabilitation, social cost (cleaning costs, loss of sales tax, number of impacted vehicles and pedestrians) | √ | AHP | Rehabilitation method selection of water main networks
Lim et al. 2008 | Design and supervision, construction (costs for piping, pump, motor), pump pits, construction expenses, contractor’s overhead and profits, operations and maintenance (consumption of water, electricity, maintenance, repairs), disposal (costs for recycling, landfill, construction expenses, contractor’s overhead) | √ | NPV | Environmental and economic performance of a water network system
Rajani and Kleiner 2007 | Break repair, water loss, damage, social, break repair, construction cost, hot spot anode, retrofit existing pipe, retrofit at pipe installation | √ √ | Pareto efficiency front | For planning of cathodic protection in water
Dandy and Engelhardt 2006 | Replacement costs, direct and indirect break costs | √ | MOGA | Cost and reliability trade-off for the replacement of water mains
Lim et al. 2006 | Engineering and supervision, construction, operation & maintenance, disposal | √ | NPV, cost & benefit analysis | To evaluate the economic feasibility of a water network system
Shahata 2006 | Operation and maintenance, repair, renovation, replacement | √ | MCS | Stochastic LCC for water mains
Rajani and Kleiner 2004 | Renewal mitigation cost | √ | NPV | Failure management of small & large diameter transmission pipeline
Engelhardt et al. 2003 | Total cost, private cost, social cost, operating costs, capital expenditure, regulatory penalties | √ | ABC | Whole life costing of water distribution network
Wirahadikusumah and Abraham 2003 | Maintenance, rehabilitation cost | √ | DP with MCS | To determine sewer maintenance/rehabilitation plans
Dandy and Engelhardt 2001 | Direct and indirect costs of repair, replacement cost | √ | MOGA | To find out near optimal scheduling of water pipe replacement
Kleiner et al. 2001 | Rehabilitation, maintenance cost | √ | DP | For selection and scheduling of pipe rehabilitation alternatives
Rajani and Kleiner 2001 | Pipe replacement, construction cost, breakage repair, water loss, social cost of failure | √ √ | NPV | For water main renewal planning

D: Deterministic; P: Probabilistic; LCA: Life cycle analysis; NPV: Net present value; MCS: Monte Carlo simulation; NSGA: Non-dominated sorting genetic algorithm; AHP: Analytical hierarchy process; MOGA: Multi-objective genetic algorithm; ABC: Activity based costing; DP: Dynamic programming

Most of the previous water main LCC studies focused mainly on larger utilities, where a large amount of pipe failure data and a sufficient number of experts and resources are available (Engelhardt et al. 2003; Herstein et al. 2011; Kleiner et al. 2001; Lim et al. 2006; Shahata 2006; Shahata and Zayed 2013). The amount of water main break data needed to develop such extensive models is not commonly available in smaller utilities (Haider et al. 2013; Wood and Lence 2009). Most small to medium-sized water utilities do have some basic information on their water mains (e.g., pipe diameter, pipe material, and date of installation), but very few of them have maintained thorough records of pipe breaks for longer than a decade (Mailhot et al. 2000; Pelletier et al. 2003; Wood and Lence 2009). These utilities often suffer from data/information scarcity, poor inventory management, and a lack of skilled technical personnel (Haider et al. 2013). Moreover, uncertainties become a vital and inevitable element of the decision-making process for small to medium-sized utilities (Dridi et al. 2009; Haider et al. 2013; Kabir et al. 2015b, 2015f; Tesfamariam et al. 2010; Rajani and Tesfamariam 2007; Wood and Lence 2009). Furthermore, most of the studies considered an average agency cost for repair, rehabilitation and renovation (Ammar et al. 2012; Du et al. 2013; Lim et al.
2006; Shahata 2006; Shahata and Zayed 2012, 2013; Rajani and Kleiner 2004), although these costs are highly uncertain. Maintenance, repair and renovation costs depend strongly on the land use and topographical condition of the location where the pipes are buried. For example, if one pipe is buried under a highway and another under an open or agricultural field, the repair/rehabilitation/renovation cost of the former will be much higher than that of the latter. If only an average cost is considered, the repair/rehabilitation/renovation cost of both pipes will be the same, which is not realistic and makes the cost estimate too uncertain for effective decision making. Dandy and Engelhardt (2001, 2006) incorporated break cost factors in their analysis (Table 2.6), but they determined those factors based on land use only. The authors did not consider site conditions, and they applied the same cost factor to residential and industrial areas, and likewise to commercial areas and major roads.

Table 2.6: Break cost factors considered by Dandy and Engelhardt (2001, 2006)

Land use | Cost factor
Residential | 1.5
Industrial | 1.5
Residential/industrial | 1.5
Commercial | 3
Major roads | 3
Rural | 1

Most of the previous LCC studies assumed that water mains will perform as new, or will follow the same deterioration pattern or profile, after repair or rehabilitation, which is not realistic (Ammar et al. 2012; Moselhi et al. 2009; Shahata 2006; Shahata and Zayed 2013; Rajani and Kleiner 2001). Different authors observed that pipe failure behavior differs between the first break and subsequent breaks (Le Gat and Eisenbeis 2000; Mailhot et al. 2000; Pelletier et al. 2003). The authors also mentioned that the likelihood of a pipe experiencing more breaks is much higher after the first break (Dridi et al. 2009; Kimutai et al. 2015; Pelletier et al. 2003). For these reasons, it is not effective to derive a general LCC model that would be applicable to water networks of all sizes, i.e., small, medium and large.
Therefore, to reduce the uncertainty of the LCC framework for water mains, there is a need for an effective LCC-based decision support tool (DST) for the renewal of water mains in small to medium-sized water utilities that takes uncertainties into consideration.

Chapter 3 Bayesian Regression Based Failure Model

A version of this chapter has been published in ASCE-ASME Journal of Risk and Uncertainty in Engineering Systems, Part A: Civil Engineering with a title “Integrating Bayesian Linear Regression with Ordered Weighted Averaging: Uncertainty Analysis for Predicting Water Mains’ Failures.” The lead author is Golam Kabir and the coauthors are Dr. Solomon Tesfamariam, Dr. Jason Loeppky and Dr. Rehan Sadiq.

3.1 Background

Different researchers have applied different regression techniques to develop pipe failure prediction models, which are presented in Table 2.1. To enhance the predictive capability of pipe failure models, accurate quantification of model and prediction uncertainties is necessary. To handle these issues, Bayesian regression based pipe break models are developed in this study. Moreover, it is essential to consider the optimistic or pessimistic attitude of the decision maker (DM). The ordered weighted averaging (OWA) operator is able to model this degree of optimism, which is one of the important motivations for using it in this study. The aim of this study is to compare the performance and predictive capability of normal linear regression and Bayesian linear regression models and to integrate the Bayesian prediction curves with the OWA operator for effective decision making.

3.2 Methodology

The framework of the proposed study is shown in Figure 1.4. The following sections briefly discuss multiple linear regression, Bayesian linear regression, model selection, and the OWA method.

3.2.1 Multiple linear regression

In the multiple linear regression problem, the variation in a response variable $y$ is described in terms of $k$ predictor variables $x_1, \ldots, x_k$.
The mean value of $y_i$, the response for the $i$th individual, can be expressed as:

$$E(y_i \mid \beta, X) = \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_k x_{ik} = x_i \beta, \quad i = 1, 2, \ldots, n \tag{3.1}$$

where $x_i = (x_{i1}, \ldots, x_{ik})$ = row vector of predictor values for the $i$th individual and $\beta = (\beta_1, \ldots, \beta_k)$ = column vector of unknown regression parameters or coefficients (Albert 2009; Carlin and Louis 2009). The $\{y_i\}$ are assumed to be conditionally independent given the predictor variables and the values of the parameters (Albert 2009; Bolstad 2007). Equal variances are assumed in the normal linear regression, where $\mathrm{var}(y_i \mid \theta, X) = \sigma^2$. Let $\theta = (\beta_1, \ldots, \beta_k, \sigma^2)$ denote the vector of unknown parameters. The errors $\varepsilon_i = y_i - E(y_i \mid \beta, X)$ are assumed to be independent and normally distributed with mean 0 and variance $\sigma^2$ (Albert 2009; Bolstad 2007; Carlin and Louis 2009). This model can be written in matrix notation for all observations as:

$$y \mid \beta, \sigma^2, X \sim N_n(X\beta, \sigma^2 I) \tag{3.2}$$

where $y$ = vector of observations, $X$ = design matrix with rows $x_1, \ldots, x_n$, $I$ = identity matrix, and $N_k(\mu, A)$ = multivariate normal distribution of dimension $k$ with mean vector $\mu$ and variance-covariance matrix $A$ (Albert 2009; Bolstad 2007; Carlin and Louis 2009).

3.2.2 Bayesian linear regression

Bayesian regression differs from normal regression in that the model parameters are treated as random variables rather than as fixed (unknown) constants (Box and Tiao 1992; Gelman et al. 1995). As the model parameters are random variables, external information can be incorporated into the model by constructing a probability distribution (prior) that describes the uncertainty in the model parameters (Casella and Berger 2002; Gelman et al. 1995). In many real-life situations, a prior distribution that contains virtually no information about the values of the model parameters (a noninformative prior) is used for Bayesian regression (Hill and Tiedeman 2007).
However, this prior can be updated whenever new information/data becomes available, which is the main advantage of Bayesian analysis (Gelman et al. 1995; Lu et al. 2012). To complete the Bayesian formulation of the normal regression model, it can be assumed that $(\beta, \sigma^2)$ has the typical noninformative prior:

$$g(\beta, \sigma^2) \propto \frac{1}{\sigma^2} \tag{3.3}$$

For the normal regression model, the posterior analysis has a form similar to the posterior analysis of a mean and variance for a normal sampling model. The joint posterior density of $(\beta, \sigma^2)$ can be represented as the product (Albert 2009):

$$g(\beta, \sigma^2 \mid y) = g(\beta \mid y, \sigma^2)\, g(\sigma^2 \mid y) \tag{3.4}$$

The posterior distribution of the regression vector $\beta$ conditional on the error variance $\sigma^2$, $g(\beta \mid y, \sigma^2)$, is multivariate normal with mean $\hat{\beta}$ and variance-covariance matrix $V_\beta \sigma^2$, where

$$\hat{\beta} = (X'X)^{-1} X'y \tag{3.5}$$

$$V_\beta = (X'X)^{-1} \tag{3.6}$$

If the inverse gamma$(a, b)$ density is defined as proportional to $y^{-a-1}\exp\{-b/y\}$, then the marginal posterior distribution of $\sigma^2$ is inverse gamma$((n-k)/2,\ S/2)$, where

$$S = (y - X\hat{\beta})'(y - X\hat{\beta}) \tag{3.7}$$

For the prediction of a future observation $\tilde{y}$ corresponding to a covariate vector $x^*$, the regression sampling model states that $\tilde{y}$, conditional on $\beta$ and $\sigma^2$, is normal with mean $x^*\beta$ and variance $\sigma^2$. The posterior predictive density of $\tilde{y}$, $p(\tilde{y} \mid y)$, can be represented as a mixture of these sampling densities $p(\tilde{y} \mid \beta, \sigma^2)$, averaged over the posterior distribution of the parameters $\beta$ and $\sigma^2$ (Albert 2009):

$$p(\tilde{y} \mid y) = \int p(\tilde{y} \mid \beta, \sigma^2)\, g(\beta, \sigma^2 \mid y)\, d\beta\, d\sigma^2 \tag{3.8}$$

3.2.2.1 Confidence intervals vs credible intervals

Both confidence and credible intervals can be defined symbolically as $\mathrm{Prob}(l \le X \le u) = 1 - \alpha$ for a random variable $X$, where $u$ and $l$ = upper and lower interval limits and $\alpha$ = significance level. However, this definition is interpreted differently for the two kinds of intervals, as they are conceptually different.
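The posterior of Equations 3.4 to 3.7, the predictive density of Equation 3.8 and an equal-tailed credible interval can be illustrated by simulation. The following is a minimal sketch under stated assumptions, not the LearnBayes implementation used later in this chapter: it handles only an intercept and one covariate (k = 2), the data are synthetic, and the function names (`ols_fit`, `sample_posterior`, `equal_tailed_interval`) are hypothetical.

```python
import math
import random


def ols_fit(x, y):
    """Fit y = b0 + b1*x; return beta_hat (Eq. 3.5), V_beta (Eq. 3.6), S (Eq. 3.7)."""
    n = len(x)
    sx, sxx = sum(x), sum(v * v for v in x)
    sy, sxy = sum(y), sum(a * b for a, b in zip(x, y))
    det = n * sxx - sx * sx  # determinant of X'X for X = [1, x]
    v = [[sxx / det, -sx / det], [-sx / det, n / det]]  # (X'X)^-1
    b0 = v[0][0] * sy + v[0][1] * sxy
    b1 = v[1][0] * sy + v[1][1] * sxy
    s = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    return (b0, b1), v, s


def sample_posterior(beta_hat, v, s, n, n_draws, rng):
    """Draw (b0, b1, sigma2) from the joint posterior of Eq. 3.4."""
    k = len(beta_hat)
    # Cholesky factor of the 2x2 matrix V_beta
    l11 = math.sqrt(v[0][0])
    l21 = v[1][0] / l11
    l22 = math.sqrt(v[1][1] - l21 * l21)
    draws = []
    for _ in range(n_draws):
        # sigma2 | y ~ inverse gamma((n - k)/2, S/2): reciprocal of a gamma draw
        sigma2 = 1.0 / rng.gammavariate((n - k) / 2.0, 2.0 / s)
        sd = math.sqrt(sigma2)
        # beta | y, sigma2 ~ N(beta_hat, sigma2 * V_beta)
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        draws.append((beta_hat[0] + sd * l11 * z1,
                      beta_hat[1] + sd * (l21 * z1 + l22 * z2),
                      sigma2))
    return draws


def equal_tailed_interval(samples, alpha=0.05):
    """Equal-tailed (1 - alpha) interval from simulated draws."""
    ordered = sorted(samples)
    lower = ordered[int((alpha / 2) * len(ordered))]
    upper = ordered[int((1 - alpha / 2) * len(ordered)) - 1]
    return lower, upper


rng = random.Random(2016)
x = [i / 10 for i in range(50)]                          # synthetic covariate
y = [1.0 + 2.0 * xi + rng.gauss(0.0, 0.5) for xi in x]   # synthetic response
beta_hat, v, s = ols_fit(x, y)
draws = sample_posterior(beta_hat, v, s, len(x), 4000, rng)
slope_lo, slope_hi = equal_tailed_interval([d[1] for d in draws])

# Posterior predictive draws at x* = 2.5 (Eq. 3.8): add sampling noise
x_star = 2.5
y_tilde = [d[0] + d[1] * x_star + math.sqrt(d[2]) * rng.gauss(0.0, 1.0)
           for d in draws]
pred_lo, pred_hi = equal_tailed_interval(y_tilde)
print("95% credible interval for the slope:", slope_lo, slope_hi)
print("95% prediction interval at x* = 2.5:", pred_lo, pred_hi)
```

The prediction interval is wider than the interval for the expected response because each predictive draw adds the sampling variance $\sigma^2$ on top of the parameter uncertainty.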
A confidence interval with a confidence level of 95% ($\alpha = 0.05$) is an interval that is expected to include the true value of the prediction 95% of the time in repeated sampling of the observations and prior information used in the regression (McClave and Sincich 2000; Lu et al. 2012). In contrast, a 95% credible interval is an interval within which the prediction lies with a posterior probability of 95%, and the interval is determined as (Box and Tiao 1992; Casella and Berger 2002; Lu et al. 2012):

$$\int_{l}^{u} p(g(\beta) \mid y)\, dg(\beta) = 1 - \alpha \tag{3.9}$$

where $\beta$ and $g(\beta)$ = model parameters and predictions, respectively, and $p(g(\beta) \mid y)$ = posterior distribution of $g(\beta)$ conditioned on the data $y$. The credible interval limits can be determined using the equal-tailed method (Casella and Berger 2002) or the highest posterior density interval (Box and Tiao 1992). However, the linear confidence and credible intervals of linear models are mathematically equivalent (Lu et al. 2012). For detailed discussions, see Box and Tiao (1992), Hill and Tiedeman (2007), and Lu et al. (2012).

3.2.3 Model selection

Bayes factors can be used to compare different models of the same data. Suppose that the observed data $y$ can be generated using two models $M_1$ and $M_2$. Using Bayes’ theorem, the posterior probability that $M_1$ is true, assuming that either $M_1$ or $M_2$ is true, can be expressed as (Martin et al. 2011):

$$\Pr(M_1 \mid y) = \frac{p(y \mid M_1)\Pr(M_1)}{p(y \mid M_1)\Pr(M_1) + p(y \mid M_2)\Pr(M_2)} \tag{3.10}$$

It is informative to look at the posterior odds in favor of one model (i.e., $M_1$):

$$\frac{\Pr(M_1 \mid y)}{\Pr(M_2 \mid y)} = \frac{p(y \mid M_1)}{p(y \mid M_2)} \times \frac{\Pr(M_1)}{\Pr(M_2)} \tag{3.11}$$

That is, to move from the prior odds in favor of $M_1$ to the posterior odds in favor of $M_1$, one simply multiplies the prior odds by:

$$M_{12} = \frac{p(y \mid M_1)}{p(y \mid M_2)} \tag{3.12}$$

which is called the Bayes factor for $M_1$ relative to $M_2$. The strength of evidence for $M_1$ over $M_2$ is provided by $M_{12}$ (Martin et al. 2011).

There are a number of other ways to assess the relative goodness of fit of different regression models.
One of these is to compare models based on the Bayesian information criterion (BIC), also known as Schwarz’s Bayesian criterion (SBC), of each model according to the formula:

$$\mathrm{BIC} = -2(\text{log-likelihood}) + (\text{number of parameters}) \times \log(\text{number of observations}) \tag{3.13}$$

Another way is to compare models based on the Akaike information criterion (AIC) score of each model. The AIC for any model is given by:

$$\mathrm{AIC} = -2(\text{log-likelihood}) + 2(\text{number of parameters}) \tag{3.14}$$

Minimizing the BIC or AIC favors a model that attains a high log-likelihood with a small number of parameters.

3.2.4 OWA operators

3.2.4.1 Basic concepts

A new family of aggregation techniques called the ordered weighted average (OWA) operators was introduced by Yager (1988), and since then it has been applied in many fields, including decision theory. The reason for using the OWA operator to aggregate the response prediction curves is its capability to encompass a range of operators from minimum to maximum, including various averaging (compromising) operators such as the arithmetic mean (Tesfamariam and Sadiq 2008). The OWA operator provides flexibility by utilizing the whole range from ‘‘or’’ to ‘‘and’’ associated with the attitude of a decision maker in the aggregation process (Yager and Filev 1994). An OWA operator of dimension $n$ is a mapping OWA: $R^n \rightarrow R$, which has an associated $n$-dimensional weighting vector $W = (w_1, w_2, \ldots, w_n)^T$. The requirements to be satisfied are $R \in [0, 1]$, $w_i \in [0, 1]$, and $\sum_{i=1}^{n} w_i = 1$. Hence, for $n$ given input parameters, the input vector $(X_1, X_2, \ldots, X_n)$ is aggregated using the OWA operator as follows:

$$L = \mathrm{OWA}(X_1, X_2, \ldots, X_n) = \sum_{i=1}^{n} w_i h_i \tag{3.15}$$

where $h_i$ = $i$th largest element in the vector $(X_1, X_2, \ldots, X_n)$ (i.e., $h_1 \ge h_2 \ge \cdots \ge h_n$) (Tesfamariam et al. 2010). Therefore, the OWA weight $w_i$ is not associated with any particular value $X_i$; rather, it is associated with the ordinal position of $h_i$.
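The aggregation of Equation 3.15 can be written in a few lines. The sketch below is illustrative only: the function name `owa` and the input values are hypothetical, and the weights are chosen by hand rather than derived from a decision maker's attitude.

```python
from typing import Sequence


def owa(values: Sequence[float], weights: Sequence[float]) -> float:
    """Ordered weighted average (Eq. 3.15).

    The weights are applied to the values after sorting them in
    descending order, so w_i multiplies the i-th largest value h_i.
    """
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if any(w < 0 or w > 1 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must lie in [0, 1] and sum to 1")
    ordered = sorted(values, reverse=True)  # h_1 >= h_2 >= ... >= h_n
    return sum(w * h for w, h in zip(weights, ordered))


x = [0.2, 0.9, 0.5]               # illustrative inputs in [0, 1]
print(owa(x, [1, 0, 0]))          # all weight on the largest value: maximum
print(owa(x, [0, 0, 1]))          # all weight on the smallest value: minimum
print(owa(x, [1 / 3] * 3))        # equal weights: arithmetic mean
```

Because the weights attach to ranks rather than to particular inputs, permuting `x` leaves every result unchanged.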
Although the OWA equation is linear in form, the reordering of the input vector $(X_1, X_2, \ldots, X_n)$ makes the aggregation nonlinear (Tesfamariam et al. 2010; Yager and Filev 1999). Two characterizing measures were further introduced by Yager (1998), called orness (also known as the degree of optimism) and “dispersion” (Disp) or entropy, which are associated with the weighting vector $W$ and are computed as:

$$\mathrm{orness}\,(\alpha) = \frac{1}{n-1} \sum_{i=1}^{n} w_i (n - i), \quad \text{where } \mathrm{orness} \in [0, 1] \tag{3.16}$$

$$\mathrm{Disp} = -\sum_{i=1}^{n} w_i \ln(w_i) \tag{3.17}$$

The orness characterizes where the aggregation lies between an “and” (minimum) and an “or” (maximum). Therefore, when orness (degree of optimism) = 1, the OWA becomes a “maximum” operator and, conversely, when orness (degree of optimism) = 0, the operator becomes a “minimum” operator (Tesfamariam et al. 2010). The measure Disp provides the degree to which the information is distributed and is bounded by $0 \le \mathrm{Disp} \le \ln(n)$. For orness = 1 or 0, the dispersion becomes zero, and when $w_i = 1/n$ (like a uniform distribution), the dispersion is at its maximum, i.e., $\ln(n)$ (Tesfamariam et al. 2010).

3.2.4.2 Determination of weights

The generation of weights is one of the challenging tasks in the OWA technique. Various OWA weight generation methods have been proposed (Filev and Yager 1998; Sadiq and Tesfamariam 2007; Xu 2005) since the introduction of OWA operators by Yager (1988). The procedure proposed by O’Hagan (1988) is adopted in this study; it fixes the orness at a specified value and maximizes the Disp function (entropy). The weights calculated in this way are referred to as maximum entropy (ME) OWA weights (Tesfamariam et al. 2010). The objective function and associated constraints can be written as:

Objective function:

$$\max\,(\mathrm{Disp}) = \max \left( -\sum_{i=1}^{n} w_i \ln(w_i) \right) \tag{3.18}$$

Subject to:

$$\mathrm{orness}\,(\alpha) = \frac{1}{n-1} \sum_{i=1}^{n} w_i (n - i) \tag{3.19}$$

and

$$\sum_{i=1}^{n} w_i = 1; \quad 0 \le w_i \le 1 \tag{3.20}$$

3.3 Case study: The City of Calgary

The proposed methodology is applied to the water distribution network of the City of Calgary.
The City of Calgary is located in Alberta, Canada, and has a population of 1.1 million people. It is situated approximately 1,048 m above sea level and has a humid continental climate (Peel et al. 2007). According to Environment Canada (2013), the average daytime high temperature ranges from 26°C in July to -3°C in mid-January. The city receives an annual precipitation of 412.6 mm, of which 320.6 mm occurs as rain. A snapshot of the city and the distribution network is shown in Figure 3.1.

Figure 3.1: Water distribution system of the City of Calgary

The City of Calgary water distribution network consists of 4,281 km of pipe with a total of 49,531 individual pipes, of which 21.92% are ductile iron (DI), 16.10% cast iron (CI), 54.64% polyvinyl chloride (PVC), and 7.34% other pipes (asbestos, concrete cylinder, steel, and copper). The number of CI, DI, PVC, and other pipes installed in different years is shown in Figure 3.2. Most of the CI pipes were installed during 1910-1968, whereas the DI pipes were installed during 1962-2010. After 1970, the city water network started using PVC pipes instead of metallic pipes, and the proportion of this non-metallic pipe has increased significantly.

Figure 3.2: Number of pipe installation of City of Calgary (frequency of CI, DI, PVC and other pipe installations by period, 1910-1955 to 2006-2013)

The City of Calgary started systematic recording of pipe breaks in 1956, and the database contains 13,692 individual pipe breaks. The number of CI, DI, PVC, and other pipe breaks in different years is shown in Figure 3.3. According to Figure 3.3, the majority of breaks occurred in CI (63.73%) and DI pipes (32.08%), whereas PVC and other (i.e., asbestos, concrete, copper and steel) pipes experienced very few breaks (1.18% and 3.01%, respectively). It has been found that 2,882 CI pipes and 2,067 DI pipes experienced breaks from 1956-2013.
Figure 3.3 indicates that the city network experienced a steady increase in the number of breaks for both CI and DI pipes in the first 30 years of data recording. However, owing to a retrofitting program, the utility has experienced a drop in the number of breaks over the last two decades, especially for the DI pipes. Moreover, the number of breaks has also been reduced by a replacement program that targeted pipes showing high break patterns and by the installation of PVC pipes in the system (Brander 2001). As almost 96% of the breaks occurred in CI and DI pipes, the authorities are more concerned about these pipes. Therefore, only CI and DI pipes are considered in the further analysis.

Figure 3.3: Number of water main breaks of City of Calgary (frequency of CI, DI, PVC and other pipe breaks by period, 1910-1955 to 2006-2013)

3.3.1 Covariate selection

For the development of an effective water main failure prediction model, it is important to determine the appropriate factors causing pipe failures. A variety of factors causing failures have been reported in the literature. However, there is no disagreement in the literature about the influence of age on pipe breaks (for example: Asnaashari et al. 2009; Clark et al. 1982; Shamir and Howard 1979). Some authors have observed a linear relationship (Hu and Hubble 2007; Kettler and Goulter 1985) and others a non-linear relationship (Boxall et al. 2007; Goulter and Kazemi 1988; Kleiner and Rajani 1999) between age and pipe breaks. Pipe diameter has been found to contribute significantly to the number of breaks observed (for example: Christodoulou 2011; Kettler and Goulter 1985; Yamijala et al. 2009). An inverse relationship between the pipe failure rate and the diameter has been reported by many authors (Boxall et al.
2007; Hu and Hubble 2007; Kettler and Goulter 1985). Pipe length is also considered one of the most influential parameters for pipe breaks (Asnaashari et al. 2009; Bubtiena et al. 2011; Jacobs and Karney 1994; Wang et al. 2009). It has been found that the number of breaks increases considerably with the length of the pipes. Mailhot et al. (2000) and Pelletier et al. (2003) demonstrated that pipe failures are affected not only by age but also by peculiarities associated with manufacturing and installation techniques as well as the types of material used. A high proportion (80%) of the breaks observed in the City of Calgary network is associated with pipes installed during the period 1960-1976. Therefore, two groups are considered for vintage: pipes installed during 1960-1976 and all others. Kabir et al. (2015b), Pelletier et al. (2003), and Yamijala et al. (2009) also pointed out a relationship between land use and pipe breaks. It has been found that the number of breaks increases considerably from the end of summer and peaks at the beginning of winter. The number of breaks begins to decrease as spring sets in, and the lowest number of breaks is experienced at the end of the summer period. Therefore, time-dependent covariates, i.e., the effects of soil and temperature, are considered in this analysis. Although soil is considered a static medium, many soil properties are time dependent, including moisture content (Kleiner and Rajani 2000, 2002). Several researchers observed the effects of electrical resistivity, temperature and moisture conditions on the breakage rate of water mains (Boxall et al. 2007; Walski and Pelliccia 1982; Wood and Lence 2009; Yamijala et al. 2009). Soil resistivity is a measure of how strongly a soil opposes the flow of electric current, and it significantly affects the deterioration or corrosion of metallic pipes (Gould et al. 2009; Kabir et al. 2015f).
Low resistivity results in a higher probability of corrosion, while high resistivity results in a lower probability of corrosion (Sadiq et al. 2005). Researchers have reported a high number of water main breaks due to volumetric swelling and shrinkage of clays (Clark 1971) after a hot summer or just prior to the spring thaw (Baracos et al. 1955; Hudak et al. 1998; Kleiner and Rajani 2002). As soil-moisture conditions are difficult to quantify directly, Kleiner and Rajani (2000, 2002) used rain deficit (RD) as a surrogate for soil moisture. On the other hand, in continental climates, a significant number of breaks has been observed in winter or under extremely cold conditions (Hu and Hubble 2007; Kleiner and Rajani 2000, 2002). To measure the severity of winter during a specified period, Kleiner and Rajani (2000, 2002) used the freezing index (FI) as a surrogate measure for temperature effects. Therefore, soil resistivity, temperature, RD, and FI are considered in this study. The literature has pointed out a strong relationship between break type and pipe material. Gould et al. (2009) and Rajani et al. (1996) mentioned that circumferential (circular) breaks in water mains are caused by soil movement, temperature change and a decrease in soil moisture content, and occur mostly in small-diameter pipes, with CI pipes being the most affected (Makar et al. 2001). Corrosion breaks result from continuous damaging reactions that reduce the pipe’s factor of safety (Tesfamariam and Rajani 2007). The literature has shown that corrosion, apart from being a major factor for CI pipes (Makar 2000), also strongly affects DI pipes, which can corrode at a rate equal to that of CI pipes (Rajani and Kleiner 2001), and some researchers noted that DI pipes fail mainly as a result of corrosion (Makar 2000). Thus, separate models have been developed for predicting the failure of CI and DI pipes.
3.3.2 Data collection and preparation

Pipe characteristics data such as age, diameter, length, vintage or manufacturing period, number of connections of each pipe (for land use determination), and soil resistivity were collected from the GIS database of the City of Calgary. Weather data such as temperature, precipitation and rainfall were acquired from Environment Canada (2014). Data from the Calgary International Airport CS weather station (Latitude: 51°06'31.080" N, Longitude: 114°00'52.000" W) were used, as this station is within the study area and has the most frequent (hourly, daily and weekly) and longest record. The daily mean temperature is considered in the analysis, as Calgary experiences wide temperature swings over the course of a year and even over a few days. The RD is the difference between precipitation (P) and evapotranspiration (e) (Markovic et al. 2012):

$$RD = P - e \tag{3.21}$$

Negative values indicate a precipitation deficit, while positive values indicate a surplus of precipitation. Evapotranspiration is calculated using the Thornthwaite equation (Thornthwaite 1948):

$$e = 1.6 \left( \frac{10T}{I} \right)^{a} \tag{3.22}$$

where $e$ is the monthly evapotranspiration in cm, $T$ is the mean monthly temperature (°C), $I$ is a constant representing the heat index that ranges from 0-160, and $a$ is a constant that ranges from 0-4.25 and is a function of $I$. The FI accumulates, over the days of a given period on which the temperature is below a threshold, the difference between the threshold and the daily temperature (Kleiner and Rajani 2002):

$$FI_p = \sum_{i \in DD_p} (\phi - X_i) \tag{3.23}$$

where $FI_p$ is the freezing index for period $p$, $DD_p$ is the set of all days in the period with a temperature below the threshold temperature $\phi$, and $X_i$ is the daily temperature for day $i$. The FI is calculated immediately before the break based on the daily mean temperature and the pipe failure history. The variables used to represent the input data, together with summary statistics of these data, are given in Table 3.1.
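Equations 3.21 to 3.23 can be sketched in code as follows. This is a hedged illustration, not the procedure actually used to build the data set: the expressions for the heat index I and the exponent a are the standard ones associated with the Thornthwaite method and are assumed here because the text does not state them, months at or below 0°C are assumed to contribute no evapotranspiration, no day-length correction is applied, and the function names and temperature values are hypothetical rather than Calgary records.

```python
from typing import List, Sequence


def thornthwaite_monthly_pet(monthly_temp_c: Sequence[float]) -> List[float]:
    """Monthly evapotranspiration e in cm (Eq. 3.22) from 12 mean monthly temperatures."""
    if len(monthly_temp_c) != 12:
        raise ValueError("expected 12 monthly mean temperatures")
    # Assumed standard heat index: sum of (T/5)^1.514 over months above 0 degC
    heat_index = sum((t / 5.0) ** 1.514 for t in monthly_temp_c if t > 0)
    if heat_index == 0:
        return [0.0] * 12
    # Assumed standard cubic expression for the exponent a as a function of I
    a = (6.75e-7 * heat_index ** 3 - 7.71e-5 * heat_index ** 2
         + 1.792e-2 * heat_index + 0.49239)
    return [1.6 * (10.0 * t / heat_index) ** a if t > 0 else 0.0
            for t in monthly_temp_c]


def rain_deficit(precip_cm: float, pet_cm: float) -> float:
    """RD = P - e (Eq. 3.21); negative values indicate a precipitation deficit."""
    return precip_cm - pet_cm


def freezing_index(daily_temp_c: Sequence[float], threshold_c: float = 0.0) -> float:
    """FI (Eq. 3.23): sum of (threshold - temperature) over days below the threshold."""
    return sum(threshold_c - t for t in daily_temp_c if t < threshold_c)


# Illustrative monthly mean temperatures (degC), January to December
temps = [-7.0, -5.0, -1.0, 5.0, 10.0, 14.0, 16.5, 15.5, 11.0, 5.5, -2.0, -6.0]
pet = thornthwaite_monthly_pet(temps)
print("July evapotranspiration (cm):", pet[6])
print("July rain deficit for P = 6.0 cm:", rain_deficit(6.0, pet[6]))
print("Freezing index for a three-day spell:", freezing_index([-5.0, 2.0, -1.5]))
```

In the last call only the two days below 0°C contribute, giving 5.0 + 1.5 degree-days.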
Table 3.1: Summary of control variables used in the regression model

Variable | Description | Unit | Measured scale: Cast Iron (CI) | Measured scale: Ductile Iron (DI)
NBRKS | Number of previous breaks | NA | min: 1, max: 22, mean: 2.284, SD: 1.86 | min: 1, max: 16, mean: 2.271, SD: 1.85
TBRKS | Break/failure rate | /year/km | min: 0.025, max: 7.937, mean: 0.336, SD: 0.422 | min: 0.018, max: 5.435, mean: 0.40, SD: 0.433
AGE | Pipe age | years | min: 18, max: 58, mean: 55.46, SD: 3.95 | min: 27, max: 51, mean: 40.4, SD: 3.95
DIA | Pipe diameter | mm | min: 20, max: 600, mean: 184.3, SD: 63.7 | min: 100, max: 400, mean: 205.3, SD: 63.74
LENGTH | Pipe length | m | min: 3, max: 937, mean: 164.3, SD: 89.13 | min: 4, max: 1691, mean: 185.6, SD: 102.36
TEMP | Average temperature | degrees | min: -24.6, max: 19.6, mean: -0.8397, SD: 9.25 | min: -17.1, max: 19.4, mean: 0.49, SD: 8.56
FI | Freezing index | degree-days | min: -232.96, max: 672.86, mean: 9.18, SD: 67.75 | min: -232.96, max: 622.43, mean: -18.614, SD: 58.12
RD | Rain deficit | cm | min: -22.226, max: 218.8, mean: 2.88, SD: 21.51 | min: -21.29, max: 218.8, mean: 21.65, SD: 23.75
VINT | Vintage | NA | min: 0, max: 1, mean: 0.82, SD: 0.384 | min: 0, max: 1, mean: 0.35, SD: 0.475
LANUSE | Land use | NA | min: 0 (residential), max: 1 (commercial), mean: 0.218, SD: 0.413 | min: 0 (residential), max: 1 (commercial), mean: 0.166, SD: 0.372
SOREG | Soil resistivity | Ωm | min: 0, max: 17044, mean: 2586, SD: 1802.37 | min: 0, max: 12411, mean: 1435, SD: 1174.25

NA symbolizes that there are no units.

Figures 3.4 and 3.5 present the histograms of the different covariates of CI and DI pipes, respectively. Figures 3.4 and 3.5 indicate that smaller-diameter pipes experience more breaks than larger-diameter pipes and that the probability of failure after 40 years is much higher. Vintage, or pipe manufacturing period, has more effect on CI pipes than on DI pipes, as most of the CI pipes were installed before 1976. Lower soil resistivity plays a vital role in the failure of both CI and DI pipes.
Pipe failures also decrease with increasing soil resistivity. For both CI and DI pipes, pipes with residential connections experience more breaks than commercial pipes because of their smaller diameter.

Figure 3.4: Histogram of different covariates of CI pipes (panels, each showing frequency: number of breaks, NBRKS; number of breaks/year/km, TBRKS; age, AGE; length, LENGTH; diameter, DIA; vintage, VINT; land use, LANUSE; rain deficit, RD; freezing index, FI; temperature, TEMP; soil resistivity, SOREG)

Figure 3.5: Histogram of different covariates of DI pipes (same panels as Figure 3.4)

3.3.3 Model selection

The normal linear regression model and the Bayesian linear regression model were developed using R, an open-source statistical software, with the LearnBayes package. The normal linear regression was fit in R using the ‘‘lm’’ command. The Bayesian linear regression model was fit using the “bayesglm” command, and the posterior distribution and predictions were obtained using the “blinreg”, “blinregexpected” and “blinregpred” commands, respectively. These models were all applied only to a reduced data set consisting of those pipes that had experienced at least one break during the data-recording period from 1956-2013.
The goal was to estimate the number of breaks on each pipe segment per year per km. The data set was reduced in this way to avoid the zero-inflation problem that would have resulted if all pipes had been included in the analysis (Asnaashari et al. 2009; Yamijala et al. 2009). The entire data set, consisting of 5044 observations (2882 observations for CI and 2067 observations for DI), was divided into two sets. The first 75% of the data (2162 observations for CI and 1551 observations for DI) was used for training, and the remaining 25% (720 observations for CI and 516 observations for DI) was used for testing or validation. However, if the year used for the validation test was biased for some reason (e.g., it was a particularly freezing or dry year), this would not be a good test of the model. To avoid this problem, the complete data set was divided randomly into training and validation data sets.

The distribution of the break/failure rate (TBRKS) in Figures 3.4 and 3.5 deviates from a bell-shaped normal distribution. Therefore, log(TBRKS) is used as the dependent variable for model development. Based on the 75% training data set, thirteen and twelve models were considered for CI and DI pipes, respectively. Table 3.2 presents the significant parameters, errors, R², AIC and BIC of the different models for CI pipes. Table 3.2 indicates that the sum square error (SSE, 1094.88), mean square error (MSE, 0.38), and root mean square error (RMSE, 0.6182) of Model 1 are smaller than those of the other models. On the other hand, the R² of Model 1 is very small (0.364) compared to Model 12 and Model 13 (0.8551). The BIC and AIC of the different models are determined using Equations 3.13 and 3.14. The AIC (5423.45) and BIC (5434.13) of Model 12 are lower than those of the other models. The log-likelihood indicates the overall fit of the model, with smaller log-likelihood values indicating a worse fit. The log-likelihood of Model 1 (-2695.7) is higher than that of Model 12 (-2703.7).
However, the 2 × (number of parameters) term of Equation 3.14 and the (number of parameters) × log(number of observations) term of Equation 3.13 penalize additional variables and overly complex models. Because Model 1 has a larger number of non-significant parameters (i.e., AGE², VINT, FI and TEMP), its AIC and BIC (5425.45 and 5448.14, respectively) are higher than those of Model 12. The Bayes factors of the different models are determined using Equation 3.12 and are presented in Table 3.3. The Bayes factors of Model 12 in Table 3.3 indicate the strength of evidence for Model 12 over the other models. Based on the results presented in Tables 3.2 and 3.3, Model 12, in which all parameters are significant, has been selected for CI pipes for further analysis.

Similarly, Table 3.4 presents the significant parameters, errors, R², AIC and BIC of the different models, and Table 3.5 shows the Bayes factors of the different models for DI pipes. Model 12 has been selected for DI pipes for further analysis based on the results presented in Tables 3.4 and 3.5.
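The penalty terms described above can be illustrated with a minimal Python sketch. It assumes the standard textbook forms AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L, which are taken here to correspond to Equations 3.14 and 3.13. The parameter count k = 18 and the sample size n = 2882 are illustrative assumptions, not values stated in the text; the log-likelihood is the Model 1 entry listed in Table 3.2.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: k ln(n) - 2 ln(L)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


# Log-likelihood of CI Model 1 as listed in Table 3.2.
ll_model_1 = -2694.7
# ASSUMPTIONS for illustration only: number of estimated parameters
# and number of observations are not stated explicitly in the text.
k_assumed = 18
n_assumed = 2882

aic_1 = aic(ll_model_1, k_assumed)
bic_1 = bic(ll_model_1, k_assumed, n_assumed)
print(f"AIC = {aic_1:.2f}, BIC = {bic_1:.2f}")
```

Under these assumed inputs the sketch gives values close to the AIC and BIC listed for Model 1 in Table 3.2. Because ln(n) is larger than 2 for any realistic sample size, BIC penalizes each additional parameter more heavily than AIC does.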
Table 3.2: Parameter significance results of different models for CI pipes

Models: 1 2 3 4 5 6 7 8 9 10 11 12 13
Intercept: * *** * # # # N N N
AGE: N *** *** ** *** *** ** ** ** ** *** ** **
AGE^2: N
LENGTH: *** *** *** *** *** *** *** *** *** *** *** *** ***
LENGTH^2: *** *** *** *** *** *** *** *** *** *** *** *** ***
DIA: *** *** ** ** ** ** ** ** ** ** ** ** **
DIA^2: * N
VINT: N # # * * * # * * *** *** *** ***
LANUSE: * * # # # N N N
RD: # * * # N # N ** ** * ** ** **
RD^2: # N N N N
FI: N N *** *** ** ** * * * * N
FI^2: ** *** N N *** ***
TEMP: N *** *** # # N N
TEMP^2: ** N N
SOREG: *** *** *** *** *** *** *** *** *** *** *** *** ***
SOREG^2: * * # # # N
SSE: 1094.88 1094.88 1097.39 1102.37 1107.72 1107.72 1109.56 1110.37 1110.85 1111.7 1114 1106.8 1106.6
MSE: 0.38 0.38 0.38 0.38 0.39 0.39 0.39 0.39 0.39 0.4 0.4 0.4 0.4
RMSE: 0.6182 0.6181 0.6187 0.6200 0.6214 0.6213 0.6217 0.6218 0.6218 0.6219 0.6225 0.6206 0.6206
log-like: -2694.7 -2694.7 -2698.0 -2704.5 -2711.5 -2711.5 -2713.9 -2715.0 -2715.6 -2716.6 -2719.6 -2710.2 -2710.0
DF: 2865 2866 2867 2868 2869 2870 2871 2872 2873 2874 2875 2874 2873
R2: 0.364 0.3641 0.362 0.359 0.3566 0.3566 0.3566 0.3551 0.3548 0.8545 0.8542 0.8551 0.8551
AIC: 5425.45 5423.45 5428.05 5439.09 5451.05 5449.06 5451.83 5451.94 5451.17 5451.29 5455.25 5438.52 5440.01
BIC: 5532.8 5524.9 5523.5 5528.6 5534.6 5526.6 5523.4 5517.6 5510.8 5505.0 5503.0 5492.2 5499.6

‘***’ signifies that the parameter is significant at p-values between 0 and 0.001.
‘**’ signifies that the parameter is significant at p-values between 0.001 and 0.01.
‘*’ signifies that the parameter is significant at p-values between 0.01 and 0.05.
‘#’ signifies that the parameter is significant at p-values between 0.05 and 0.1.
‘N’ signifies that the variable was included in the model but was not significant, and a blank cell signifies that the parameter was not used in the model.
SSE: sum square error; MSE: mean square error; RMSE: root mean square error; log-like: log-likelihood; DF: degrees of freedom; AIC: Akaike Information Criterion; BIC: Bayesian Information Criterion.

Table 3.3: Bayes factor or odds ratio of different models for CI pipes

| Model | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1.0E+00 | 6.6E-05 | 1.0E-11 | 1.5E-20 | 1.3E-25 | 6.9E-29 | 2.1E-13 | 1.9E-04 | 1.0E-11 | 3.7E-28 | 1.3E-25 | 6.9E-29 | 1.5E-31 |
| 2 | 1.5E+04 | 1.0E+00 | 1.5E-07 | 2.2E-16 | 2.0E-21 | 1.0E-24 | 1.5E-07 | 4.0E-08 | 2.0E-04 | 8.7E-23 | 1.1E-28 | 1.0E-11 | 9.8E-27 |
| 3 | 9.5E+10 | 6.3E+06 | 1.0E+00 | 1.4E-09 | 1.2E-14 | 6.6E-18 | 2.2E-16 | 2.1E-13 | 1.1E-09 | 2.7E-21 | 2.1E-25 | 2.9E+04 | 2.2E-21 |
| 4 | 6.6E+19 | 4.4E+15 | 6.9E+08 | 1.0E+00 | 8.9E-06 | 4.6E-09 | 2.0E-21 | 2.1E-13 | 8.2E-10 | 6.4E-30 | 1.1E-25 | 2.1E-25 | 7.3E-20 |
| 5 | 7.4E+24 | 4.9E+20 | 7.7E+13 | 1.1E+05 | 1.0E+00 | 5.1E-04 | 2.1E-25 | 1.1E-25 | 1.5E+04 | 1.4E-24 | 2.2E-21 | 1.1E-25 | 5.6E-16 |
| 6 | 1.4E+28 | 9.6E+23 | 1.5E+17 | 2.1E+08 | 1.9E+03 | 1.0E+00 | 4.1E+04 | 5.8E+01 | 9.5E+10 | 4.7E-23 | 1.4E-01 | 7.4E-02 | 1.8E-10 |
| 7 | 3.6E+15 | 6.6E-05 | 2.0E-21 | 2.1E-25 | 1.1E-25 | 1.0E-11 | 1.0E+00 | 1.9E-04 | 4.0E-08 | 2.1E-13 | 1.6E-13 | 6.3E-09 | 3.6E-02 |
| 8 | 9.0E+08 | 1.0E-11 | 1.0E-24 | 2.7E+23 | 3.6E+15 | 6.6E-05 | 5.1E+03 | 1.0E+00 | 2.0E-04 | 1.1E-09 | 8.2E-10 | 3.2E-05 | 2.6E+01 |
| 9 | 3.6E+19 | 1.1E+14 | 2.7E+23 | 3.6E+19 | 6.6E-05 | 4.0E-08 | 2.4E+07 | 4.8E+03 | 1.0E+00 | 5.3E-06 | 3.9E-06 | 1.5E-01 | 1.5E+03 |
| 10 | 9.0E+08 | 1.5E-07 | 5.5E+05 | 1.1E+14 | 1.0E-11 | 2.0E-04 | 4.6E+12 | 9.0E+08 | 1.8E+05 | 1.0E+00 | 7.4E-01 | 2.9E+04 | 2.0E+04 |
| 11 | 3.0E+04 | 2.1E+08 | 9.6E+23 | 1.5E+17 | 1.2E+09 | 5.5E+05 | 1.5E+08 | 3.0E+04 | 6.3E+00 | 3.3E-05 | 1.0E+00 | 2.5E-05 | 1.0E+04 |
| 12 | 4.9E+20 | 7.4E+24 | 3.0E+04 | 7.7E+13 | 2.5E+05 | 1.4E+28 | 6.2E+12 | 1.2E+09 | 2.5E+05 | 1.3E+00 | 3.9E+04 | 1.0E+00 | 1.9E+07 |
| 13 | 6.5E+30 | 1.0E+26 | 4.3E+20 | 1.3E+19 | 1.7E+15 | 5.4E+09 | 2.7E+01 | 3.8E-02 | 6.5E-04 | 5.0E-08 | 9.3E-05 | 4.9E-05 | 1.0E+00 |

Table 3.4: Parameter significance results of different models for DI pipes

Models: 1 2 3 4 5 6 7 8 9 10 11 12
Intercept: # # # # # N N N N *
AGE: # # # # # ** ** ** ** ** ** ***
AGE^2: N N N N N
LENGTH: *** *** *** *** *** *** *** *** *** *** *** ***
LENGTH^2: *** *** *** *** *** *** *** *** *** *** *** ***
DIA: ** ** ** ** ** ** ** ** ** *** *** ***
DIA^2: * * * * * * # # # **
VINT: ** ** ** ** ** *** *** *** *** *** *** ***
LANUSE: *** *** *** *** ** *** *** *** *** *** *** ***
RD: N N N N N N N
RD^2: N N N N
FI: N N N N N N # #
FI^2: N N N
TEMP: N N N N N N
TEMP^2: N
SOREG: * * * *** *** *** *** *** *** *** *** ***
SOREG^2: N N N
SSE: 768.52 768.60 768.64 768.80 769.80 770.62 797.81 798.08 799.49 800.91 802.6 799.9
MSE: 0.38 0.38 0.38 0.38 0.38 0.38 0.39 0.39 0.39 0.40 0.4 0.4
RMSE: 0.6197 0.6196 0.6195 0.6194 0.6196 0.6198 0.6229 0.6229 0.6233 0.6237 0.6242 0.6233
log-like: -1889.33 -1889.43 -1889.49 -1889.70 -1891.02 -1892.09 -1949.07 -1949.43 -1951.25 -1953.08 -1955.30 -1951.78
DF: 2003 2002 2003 2004 2005 2006 2056 2057 2058 2059 2060 2059
R2: 0.4047 0.4046 0.4046 0.4045 0.4037 0.4031 0.3942 0.394 0.393 0.3919 0.8252 0.8258
AIC: 3914.67 3912.87 3910.98 3909.40 3910.04 3910.18 3922.14 3920.85 3822.50 3824.16 3826.59 3821.56
BIC: 3915.647 3908.237 3900.739 3993.551 3988.576 3983.105 3989.75 3982.822 3978.8 3974.9 3972.3 3971.7

‘***’ signifies that the parameter is significant at p-values between 0 and 0.001.
‘**’ signifies that the parameter is significant at p-values between 0.001 and 0.01.
‘*’ signifies that the parameter is significant at p-values between 0.01 and 0.05.
‘#’ signifies that the parameter is significant at p-values between 0.05 and 0.1.
‘N’ signifies that the variable was included in the model but was not significant, and a blank cell signifies that the parameter was not used in the model.
SSE: sum square error; MSE: mean square error; RMSE: root mean square error; log-like: log-likelihood; DF: degrees of freedom; AIC: Akaike Information Criterion; BIC: Bayesian Information Criterion.

Table 3.5: Bayes factor or odds ratio of different models for DI pipes

| Model | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1.00E+00 | 1.55E-05 | 6.73E-11 | 2.10E-12 | 2.72E-16 | 8.39E-22 | 4.18E-30 | 5.85E-33 | 1.01E-34 | 7.76E-39 | 1.43E-35 | 7.52E-36 |
| 2 | 6.45E+04 | 1.00E+00 | 4.34E-06 | 1.35E-07 | 1.75E-11 | 5.41E-17 | 2.70E-25 | 3.77E-28 | 6.48E-30 | 5.00E-34 | 9.25E-31 | 4.85E-31 |
| 3 | 1.49E+10 | 2.31E+05 | 1.00E+00 | 3.12E-02 | 4.04E-06 | 1.25E-11 | 6.22E-20 | 8.70E-23 | 1.49E-24 | 1.15E-28 | 2.13E-25 | 1.12E-25 |
| 4 | 4.76E+11 | 7.38E+06 | 3.20E+01 | 1.00E+00 | 1.29E-04 | 3.99E-10 | 1.99E-18 | 2.78E-21 | 4.78E-23 | 3.69E-27 | 6.83E-24 | 3.58E-24 |
| 5 | 3.68E+15 | 5.71E+10 | 2.48E+05 | 7.74E+03 | 1.00E+00 | 3.09E-06 | 1.54E-14 | 2.15E-17 | 3.70E-19 | 2.86E-23 | 5.28E-20 | 2.77E-20 |
| 6 | 1.19E+21 | 1.85E+16 | 8.02E+10 | 2.50E+09 | 3.24E+05 | 1.00E+00 | 4.98E-09 | 6.97E-12 | 1.20E-13 | 9.25E-18 | 1.71E-14 | 8.96E-15 |
| 7 | 2.39E+29 | 3.71E+24 | 1.61E+19 | 5.02E+17 | 6.49E+13 | 2.01E+08 | 1.00E+00 | 1.40E-03 | 2.40E-05 | 1.85E-09 | 3.43E-06 | 1.80E-06 |
| 8 | 1.71E+32 | 2.65E+27 | 1.15E+22 | 3.59E+20 | 4.64E+16 | 1.43E+11 | 7.15E+02 | 1.00E+00 | 1.72E-02 | 1.33E-06 | 2.45E-03 | 1.29E-03 |
| 9 | 9.95E+33 | 1.54E+29 | 6.69E+23 | 2.09E+22 | 2.70E+18 | 8.35E+12 | 4.16E+04 | 5.82E+01 | 1.00E+00 | 7.72E-05 | 1.43E-01 | 7.48E-02 |
| 10 | 1.33E+35 | 2.06E+30 | 8.95E+24 | 2.79E+23 | 3.61E+19 | 1.12E+14 | 5.56E+05 | 7.78E+02 | 1.55E-05 | 1.00E+00 | 1.03E-03 | 1.00E+00 |
| 11 | 6.97E+34 | 1.08E+30 | 4.69E+24 | 1.46E+23 | 1.89E+19 | 5.85E+13 | 2.92E+05 | 4.08E+02 | 7.01E+00 | 5.41E-04 | 1.00E+00 | 5.24E-01 |
| 12 | 1.29E+38 | 2.00E+33 | 8.67E+27 | 2.71E+26 | 3.50E+22 | 1.08E+17 | 5.39E+08 | 7.54E+05 | 1.30E+04 | 9.69E+02 | 1.85E+03 | 1.00E+00 |

3.3.4 Bayesian regression

After the model was selected, Bayesian regression analysis was performed for Model 12. Because the pipe materials differ, separate multiple regression equations were estimated for CI and DI pipes.
For CI pipes, the resulting regression equation was:

$$\log(\mathrm{TBRKS}) = \beta_1\,\mathrm{AGE} + \beta_2\,\mathrm{LENGTH} + \beta_3\,\mathrm{LENGTH}^2 + \beta_4\,\mathrm{DIA} + \beta_5\,\mathrm{RD} + \beta_6\,\mathrm{FI}^2 + \beta_7\,\mathrm{VINT} + \beta_8\,\mathrm{SOREG} \tag{3.24}$$

Similarly, for DI pipes, the resulting regression equation was:

$$\log(\mathrm{TBRKS}) = \beta_1\,\mathrm{AGE} + \beta_2\,\mathrm{LENGTH} + \beta_3\,\mathrm{LENGTH}^2 + \beta_4\,\mathrm{DIA} + \beta_5\,\mathrm{DIA}^2 + \beta_6\,\mathrm{VINT} + \beta_7\,\mathrm{LANUSE} + \beta_8\,\mathrm{SOREG} \tag{3.25}$$

where β1, β2, …, β8 are regression parameters. The distributions of the regression parameters of the CI and DI pipe models, obtained from simulated draws from the joint posterior distribution, are shown in Figures 3.6 and 3.7.

Figure 3.6: Histogram of Bayesian regression parameters for CI pipes (panels: Age, Length, Length^2, Diameter, Rain Deficit, Freezing Index^2, Vintage, Soil Resistivity, Error SD; vertical axis: frequency)

Figure 3.7: Histogram of Bayesian regression parameters for DI pipes (panels: Age, Length, Length^2, Diameter, Diameter^2, Vintage, Land Use, Soil Resistivity, Error SD; vertical axis: frequency)

Figure 3.8 shows the observed versus predicted log(TBRKS) plot for CI pipes, where the gray lines indicate the 95% credible interval of the corresponding observation. The figure also shows that the points are scattered randomly around the line and that very few outliers are present, which indicates a good fit.

Figure 3.8: Observed vs predicted curve of log(TBRKS) for CI pipes

Figure 3.9 shows the 5%, 50% and 95% cumulative probability distribution functions (CPDF) for CI pipes using Bayesian regression and normal regression.
Figure 3.10 compares the 5%, 50% and 95% regression confidence interval (RCI) and Bayesian credible interval (BCI) for CI pipes. Figure 3.10(b) indicates that the 50% RCI and 50% BCI for CI pipes are the same. However, Figures 3.10(a) and 3.10(c) indicate that the Bayesian credible intervals are wider than the normal regression confidence intervals, as they include the true value of the prediction 95% of the time in repeated sampling of observations. Therefore, Bayesian regression models capture the uncertainty more effectively than normal regression models. Similar results are obtained for DI pipes as well.

Figure 3.9: CPDF of TBRKS for CI pipes using normal & Bayesian regression; (a) normal regression, (b) Bayesian regression

Figure 3.10: Comparison of RCI and BCI for CI pipes; (a) 5% RCI vs 5% BCI, (b) 50% RCI vs 50% BCI, (c) 95% RCI vs 95% BCI

To validate the proposed models, the remaining 25% testing data are used to determine the mean and predicted responses of both the normal regression and Bayesian regression models. Figures 3.11(a) and 3.11(b) indicate that the mean responses of the Bayesian and normal regression models, determined using Equation 3.1, are almost the same for both CI and DI pipes. Figures 3.12(a) and 3.12(b) compare the predicted responses of the Bayesian and normal regression models for the 25% testing data set of CI and DI pipes. The predicted response of the Bayesian regression models is determined using Equation 3.8. Figures 3.12(a) and 3.12(b) indicate that Bayesian regression models perform better at predicting future observations than normal regression models. Bayesian regression models consider not only the measurement errors but also the variability of the models in prediction. Therefore, Bayesian regression models give better performance for pipe failure prediction than normal regression models.
However, Figures 3.12(a) and 3.12(b) also show that Bayesian regression models cannot predict the failure rates properly for some observations, or that uncertainty remains for those observations. This can be due to the absence of pipe failure information before 1956, weather data (e.g., temperature, FI, and RD) based on only one sample station, uncertainty in the weather data, preventive or retrofitting programs for DI pipes, land use based only on the number of residential and commercial connections, and the lack of information about soil corrosiveness. More detailed and reliable information will reduce these uncertainties. To develop the final pipe failure prediction model, the 75% training data set and the 25% testing data set are combined. The response prediction curves for the 5%, 50% and 95% Bayesian credible intervals are then developed for CI and DI pipes.

Figure 3.11: Comparison of mean response prediction of normal and Bayesian regression for testing dataset for CI and DI; (a) mean response for CI pipes, (b) mean response for DI pipes

Figure 3.12: Comparison of predicted response of normal and Bayesian regression for testing dataset for CI and DI; (a) predicted response for CI pipes, (b) predicted response for DI pipes

3.3.5 Application of OWA

To support water utility managers in selecting suitable response prediction curves for planning and operation purposes, this study uses ordered weighted averaging (OWA) to integrate the 5%, 50%, and 95% response prediction curves based on the managers' degree of optimism and credibility. Compared to other weighting methods, the OWA operator is able to model this optimism degree more effectively. The first step is to determine the orness, or optimism degree, and then the order weights. In this study, the orness or optimism degree of the decision maker (DM) is a measure defined to vary from one (for a very risk-prone DM) to zero (for a very risk-averse DM). The order weights are determined from the optimism degree of the DM (Yager 1998).
The greater the weights at the beginning of the vector, the higher the optimism degree. In practice, some DMs prefer to express their evaluations, based on their risk attitude, in natural language rather than in mathematical terms (Mianabadi et al. 2014). One popular method is to apply a fuzzy linguistic quantifier (FLQ), which is used to characterize the aggregation imperatives. Some examples of FLQs are: at least one of them, few of them, half of them or many of them (Mianabadi et al. 2014; Zarghami and Szidarovszky 2009; Zarghami et al. 2008). For instance, a risk-prone DM may prefer an option satisfying “a few” or “at least half” of the criteria, whereas a risk-averse DM would like to select an alternative that satisfies “most” of the attributes. Table 3.6 presents the linguistic quantifier and optimistic condition for the different orness values used in this study (Mianabadi et al. 2014; Tesfamariam et al. 2010; Zarghami and Szidarovszky 2009; Zarghami et al. 2008). If the orness is less than 0.5, the DM is assumed to be conservative or pessimistic about the problem, and if the orness is greater than 0.5, the DM is in an optimistic condition. The central value (0.5) represents the normative or neutral situation. For example, if the DM wants to consider the evaluations of the failure rates with respect to some of the criteria, then he/she is considered to be moderately optimistic and the orness or optimism degree becomes 0.7 from Table 3.6. By applying this optimism degree and using the objective function and constraints, the order weights are calculated as 0.554, 0.292, and 0.154. Similarly, the weights for seven different orness values are presented in Table 3.6.
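The weight calculation described above can be sketched in Python under one stated assumption: that the "objective function and constraints" refer to the maximum-entropy OWA formulation, whose solution is a geometric sequence of weights whose common ratio is fixed by the target orness. The function names and the failure rates in the usage lines are hypothetical. The ordering of the three curves from lowest to highest failure rate is also an assumption, chosen so that a very conservative DM (weights 0, 0, 1) ends up with the highest failure rate.

```python
import math


def max_entropy_owa_weights(orness, n=3, tol=1e-12):
    """Maximum-entropy OWA weights for a target orness (assumed formulation).

    The weights are geometric, w_i proportional to r**(i-1); the ratio r is
    found by bisection so that orness(w) = sum((n-i)/(n-1) * w_i) matches.
    """
    if not 0.0 <= orness <= 1.0:
        raise ValueError("orness must lie in [0, 1]")
    if orness == 1.0:
        return [1.0] + [0.0] * (n - 1)
    if orness == 0.0:
        return [0.0] * (n - 1) + [1.0]

    def weights_for(ratio):
        raw = [ratio ** i for i in range(n)]
        total = sum(raw)
        return [x / total for x in raw]

    def orness_of(weights):
        return sum((n - 1 - i) * w for i, w in enumerate(weights)) / (n - 1)

    # Orness decreases monotonically as the ratio grows; bisect on log(ratio).
    lo, hi = -50.0, 50.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if orness_of(weights_for(math.exp(mid))) > orness:
            lo = mid
        else:
            hi = mid
    return weights_for(math.exp((lo + hi) / 2.0))


def owa_failure_rate(rates_low_to_high, weights):
    """Weighted sum of failure rates ordered from lowest to highest (assumed order)."""
    return sum(w * r for w, r in zip(weights, rates_low_to_high))


weights_07 = max_entropy_owa_weights(0.7)
print([round(w, 3) for w in weights_07])  # [0.554, 0.292, 0.154]

# Hypothetical failure rates (breaks/year/km) read from the three curves.
example_rates = [0.2, 0.4, 0.9]
print(owa_failure_rate(example_rates, weights_07))
```

For an orness of 0.7 the sketch returns the weights quoted in the text, which suggests, but does not prove, that the same formulation was used in the study.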
Table 3.6: Linguistic quantifier, optimistic condition, and weights for different orness

| Linguistic quantifier | Orness (α) | w1 | w2 | w3 | Optimistic condition |
|---|---|---|---|---|---|
| At most one of them | 1.0 | 1 | 0 | 0 | Very optimistic |
| Few of them | 0.9 | 0.826 | 0.147 | 0.026 | Optimistic |
| Some of them | 0.7 | 0.554 | 0.292 | 0.154 | Moderately optimistic |
| Half of them | 0.5 | 0.333 | 0.333 | 0.333 | Normative |
| Many of them | 0.3 | 0.154 | 0.292 | 0.554 | Moderately conservative |
| Most of them | 0.1 | 0.026 | 0.147 | 0.826 | Conservative |
| All of them | 0 | 0 | 0 | 1 | Very conservative |

The next step is to combine the 5%, 50% and 95% response prediction curves using the weights generated in the previous step. For this, the failure rates of CI and DI pipes at different probabilities are calculated from the 5% BCI, 50% BCI, and 95% BCI cumulative probability distribution functions (CPDF), or response prediction curves (Figure 3.13).

Figure 3.13: Response prediction curves of CI pipes for 5%, 50% & 95% Bayesian credible interval

These failure rates are multiplied by the weights presented in Table 3.6 using Equation 3.15. Figure 3.14 shows the predicted response curves of CI pipes for the different linguistic quantifiers, orness values or optimistic conditions. Decision makers can therefore choose an appropriate response prediction curve based on their credibility and degree of optimism for further analysis and decision making.

Figure 3.14: Response prediction curve of CI pipes for different orness values

3.4 Summary

In the prediction modeling of water mains' failure, uncertainty is inherent regardless of the quality and quantity of data used in model-data fusion. This study focuses on predictive uncertainty, since accurate quantification of predictive uncertainty is necessary for improving our understanding of water mains' failure processes.
A comparison of the prediction accuracy of multiple linear and Bayesian regression models for water mains failure, using 57 years of historical data collected for the City of Calgary, demonstrated that the Bayesian regression model had superior prediction capabilities. The case study may bring insight into selecting appropriate covariates and methods (e.g., regression and Bayesian) for uncertainty quantification with consideration of predictive performance. To assist water utility managers in decision making, this study integrated OWA to select suitable response prediction curves based on their degree of optimism and credibility.

As expected, the important covariates influencing pipe failure were found to be pipe physical attributes and environmental factors. The impacts of these covariates differ according to material type. However, pipe physical attributes were found to contribute more to pipe failure than environmental covariates. The impacts of FI and RD are greater on CI pipes than on DI pipes. The results presented in this work also suggest that confidence and credible intervals are almost identical, mathematically and numerically, for linear models, because the classical model uses no prior and the Bayesian model uses a noninformative prior. The mean response predictions of the normal regression and Bayesian regression models are found to be the same. However, the Bayesian regression model provides a better predicted response than the normal regression model. Therefore, decision makers can choose an appropriate response prediction curve based on their credibility and degree of optimism for further analysis and decision making. The results presented in the OWA section indicate that a conservative or pessimistic DM will consider higher failure rates than an optimistic DM in order to avoid risk. The proposed study can be strengthened a great deal by integrating it with the GIS of municipalities to develop an effective program.
Chapter 4 Bayesian Model Averaging (BMA) Based Failure Model

A version of this chapter has been published in the Canadian Journal of Civil Engineering with the title “Bayesian Model Averaging for the Prediction of Water Main Failure for Small to Large Canadian Municipalities.” The lead author is Golam Kabir and the coauthors are Dr. Solomon Tesfamariam and Dr. Rehan Sadiq.

4.1 Background

Very few regression-based water main failure prediction studies mention any preliminary covariate or model selection method that considers model uncertainties. In the linear and exponential regression models, inference was carried out conditionally on the selected “best” model, but this ignores the model uncertainty implicit in the variable selection process (Leamer 1978; Raftery et al. 1997; Viallefont et al. 2001). As a consequence, uncertainty about quantities of interest can be underestimated (Hoeting et al. 1999; Raftery et al. 1997; Viallefont et al. 2001). A Bayesian model averaging based model can deal with this problem by averaging over all possible models (i.e., combinations of predictors) when making inferences about quantities of interest (Raftery et al. 1997; Viallefont et al. 2001).

The objective of this chapter is thus to develop a BMA based water main failure prediction model that formally takes uncertainties into consideration. The BMA based water main failure prediction model also provides an uncertainty assessment. The effectiveness of the proposed model is illustrated with pipe failure data from the City of Kelowna, Greater Vernon Water, and the City of Calgary.

4.2 Bayesian model averaging (BMA)

In the linear multiple regression problem, the variation in a response variable $y$ is described in terms of $a$ predictor variables $x_1, \ldots, x_a$. The mean value of $y_i$, the response for the $i$th individual, can be expressed as:

$$E(y_i \mid \beta, X) = \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_a x_{ia} = x_i \beta, \quad i = 1, 2, \ldots, n \tag{4.1}$$

where $x_i = (x_{i1}, \ldots, x_{ia})$ is the row vector of predictor values for the $i$th individual and $\beta = (\beta_1, \ldots, \beta_a)$ is the column vector of unknown regression parameters or coefficients (Albert 2009; Carlin and Louis 2009). The $\{y_i\}$ are assumed to be conditionally independent given the predictor variables and the values of the parameters (Albert 2009; Bolstad 2007). Equal variances are assumed in the normal linear regression, where $\mathrm{var}(y_i \mid \theta, X) = \sigma^2$. Let $\theta = (\beta_1, \ldots, \beta_a, \sigma^2)$ denote the vector of unknown parameters. The errors $\varepsilon_i = y_i - E(y_i \mid \beta, X)$ are assumed to be independent and normally distributed with mean 0 and variance $\sigma^2$ (Albert 2009; Bolstad 2007; Carlin and Louis 2009).

In classical regression analysis, inference is performed based on the single “best” model, which can underestimate the uncertainty about the quantity of interest. This problem can be solved by averaging over all possible combinations of predictors when making inferences about quantities of interest (Raftery et al. 1997; Wasserman 2000). BMA is an average of the posterior distributions under each model, weighted by the corresponding posterior model probabilities (Leamer 1978; Wasserman 2000). If $M = \{M_1, \ldots, M_K\}$ denotes the set of all models being considered and $\Delta$ is the quantity of interest, then the posterior distribution of $\Delta$ given the data $D$ is (Raftery et al. 1997; Wasserman 2000)

$$\Pr(\Delta \mid D) = \sum_{k=1}^{K} \Pr(\Delta \mid M_k, D)\,\Pr(M_k \mid D) \tag{4.2}$$

The posterior probability of model $M_k$ is given by (Leamer 1978; Raftery et al. 1997)

$$\Pr(M_k \mid D) = \frac{\Pr(D \mid M_k)\,\Pr(M_k)}{\sum_{l=1}^{K} \Pr(D \mid M_l)\,\Pr(M_l)} \tag{4.3}$$

where $\Pr(D \mid M_k)$ is the marginal likelihood of model $M_k$, computed as:

$$\Pr(D \mid M_k) = \int \Pr(D \mid \theta_k, M_k)\,\Pr(\theta_k \mid M_k)\,d\theta_k \tag{4.4}$$

Here $\theta_k$ is the vector of parameters of model $M_k$, $\Pr(D \mid \theta_k, M_k)$ is the likelihood, $\Pr(\theta_k \mid M_k)$ is the prior density of $\theta_k$ under model $M_k$, and $\Pr(M_k)$ is the prior probability that $M_k$ is the true model (Hoeting et al. 1999; Raftery et al. 1997).
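Equations 4.2 and 4.3 can be illustrated with a short Python sketch. It assumes that the log marginal likelihood of each candidate model is already available (the integral in Equation 4.4 is not evaluated here) and works on the log scale to avoid numerical underflow. The function names, the three log marginal likelihoods and the model-specific predictions are hypothetical values chosen only for illustration.

```python
import math


def posterior_model_probs(log_marginal_likelihoods, prior_probs=None):
    """Posterior model probabilities Pr(M_k | D) following Equation 4.3.

    Uses a uniform model prior when none is supplied.
    """
    k = len(log_marginal_likelihoods)
    if prior_probs is None:
        prior_probs = [1.0 / k] * k
    log_terms = [
        lml + math.log(p)
        for lml, p in zip(log_marginal_likelihoods, prior_probs)
    ]
    # Subtract the maximum before exponentiating (log-sum-exp trick).
    largest = max(log_terms)
    unnormalised = [math.exp(t - largest) for t in log_terms]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]


def bma_mean(model_means, model_probs):
    """Model-averaged posterior mean of the quantity of interest (from Equation 4.2)."""
    return sum(m * p for m, p in zip(model_means, model_probs))


# Hypothetical log marginal likelihoods for three candidate models.
log_ml = [-1890.0, -1891.0, -1895.0]
# Hypothetical posterior mean predictions of each model.
means = [0.30, 0.35, 0.50]

probs = posterior_model_probs(log_ml)
print(probs)
print(bma_mean(means, probs))
```

The averaged prediction is pulled towards the models with the largest posterior probability, while weaker models still contribute in proportion to their support from the data.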
Averaging over all of the models in this fashion provides better predictive ability than using any single model $M_j$, as measured by a logarithmic scoring rule:

$$-E\left[\log\left\{\sum_{k=1}^{K} \Pr(\Delta \mid M_k, D)\,\Pr(M_k \mid D)\right\}\right] \le -E\left[\log\left\{\Pr(\Delta \mid M_j, D)\right\}\right], \quad j = 1, \ldots, K \tag{4.5}$$

where $\Delta$ is the observable to be predicted and the expectation is with respect to $\sum_{k=1}^{K} \Pr(\Delta \mid M_k, D)\,\Pr(M_k \mid D)$ (Hoeting et al. 1999; Raftery et al. 1997).

For $a$ explanatory variables, the initial number of possible models with no interactions is $2^a$. If $a$ is large, this set will be quite large and needs to be reduced. For this, the Occam's window approximation (Madigan and Raftery 1994) can be used, which follows three steps:

1. calculate the posterior probabilities of all possible models using a workable fast approximation,
2. identify the “best” model $M_m$, i.e., the model with the highest posterior probability, and
3. eliminate the models that are unlikely a posteriori, i.e., the models that are more than $\chi$ times less probable than the best one ($M_m$).

Specifically, the Occam's window approximation retains the models $M_n$ that satisfy

$$\frac{\Pr(M_m \mid D)}{\Pr(M_n \mid D)} < \chi \tag{4.6}$$

In practice, the leaps and bounds algorithm can be used to identify the most likely models a posteriori, avoiding the calculation of the posterior probabilities of all the models (Hoeting et al. 1999; Volinsky et al. 1997). To eliminate models whose posterior probabilities are much smaller than that of the best model, the Bayesian information criterion (BIC) approximation to the Bayes factor and a threshold window for $\chi$ can be used.

4.3 Case studies

The proposed BMA methodology is applied to the water distribution networks of the City of Kelowna, BC, Greater Vernon Water, BC, and the City of Calgary, Alberta. Determining the appropriate factors causing pipe failures is vital for developing an effective water mains failure prediction model. A variety of factors causing failures have been reported in the literature.
However, all the literature agrees on the influence of age on pipe breaks (for example: Asnaashari et al. 2009; Clark et al. 1982; Shamir and Howard 1979). Some researchers observed a linear relationship (Hu and Hubble 2007; Kettler and Goulter 1985) and others a non-linear relationship (Boxall et al. 2007; Goulter and Kazemi 1988; Kleiner and Rajani 2001) between pipe breaks and age. Pipe diameter is also considered one of the most influential parameters for pipe breaks (Christodoulou 2011; Kettler and Goulter 1985; Yamijala et al. 2009). Researchers such as Boxall et al. (2007), Hu and Hubble (2007), and Kettler and Goulter (1985) reported an inverse relationship between the pipe failure rate and the diameter. Pipe length has been found to contribute significantly to the number of breaks observed (Asnaashari et al. 2009; Bubtiena et al. 2011; Jacobs and Karney 1994; Wang et al. 2009). Different studies have observed that the number of pipe breaks increases considerably with pipe length. Several researchers observed the effects of soil resistivity and corrosivity on the breakage rate of water mains (Boxall et al. 2007; Wood and Lence 2009; Yamijala et al. 2009). Soil resistivity measures how strongly a soil opposes the flow of electric current (AWWA 1999). Soil resistivity significantly affects the deterioration or corrosion of metallic pipes. High soil resistivity results in a lower probability of corrosion, while low resistivity results in a higher probability of corrosion and failure (Sadiq et al. 2005). The corrosion process of metallic pipes is predominantly facilitated by the corrosive nature of the soil environment (Hubell 2003). For metallic pipes, soil corrosivity is mainly affected by soil resistivity, redox potential, soil pH, soil moisture content, and sulphide content (AWWA 1999; Sadiq et al. 2005).
The probability of external corrosion, and hence the failure rate of metallic pipes, increases with the soil corrosivity index (Hubell 2003; Sadiq et al. 2004). As data for only these five parameters are common to the City of Kelowna, Greater Vernon Water and the City of Calgary, they are considered for further analysis. The overall match between observed and predicted values was assessed using the mean square error (MSE), mean absolute error (MAE), root mean square error (RMSE), and percent bias (PBIAS), as shown in Equations 4.7-4.10.

$$MSE = \frac{\sum_{i=1}^{n} (O_i - P_i)^2}{n} \tag{4.7}$$

$$MAE = \frac{\sum_{i=1}^{n} |O_i - P_i|}{n} \tag{4.8}$$

$$RMSE = \sqrt{\frac{\sum_{i=1}^{n} (O_i - P_i)^2}{n}} \tag{4.9}$$

$$PBIAS = \frac{\sum_{i=1}^{n} (O_i - P_i)}{\sum_{i=1}^{n} O_i} \times 100 \tag{4.10}$$

where $O_i$, $P_i$, and $n$ are the observed value, the predicted value, and the number of observations, respectively.

4.3.1 City of Kelowna, BC

The City of Kelowna (CoK) is located in British Columbia, Canada, and serves more than 50,000 residential consumers and 1,700 industrial, commercial, and institutional properties. The CoK water distribution network consists of 403.4 km of pipe with a total of 2,598 individual pipes. The CoK water network comprises 11.00% metallic pipes (CI, DI, copper, steel, and galvanized), 33.95% cementitious pipes (asbestos cement (AC) and concrete), and 55.05% plastic pipes, which include PVC and high-density polyethylene (HDPE). The metallic, cementitious, and plastic pipes were installed during 1939-2009, 1939-2010, and 1963-2010, respectively. The database contains 199 pipe breaks, with the first break records for the cementitious, metallic, and plastic pipes dating from 1976, 1977, and 1993, respectively. The majority of breaks occurred in AC (65.83%) and CI (27.14%) pipes, whereas very few breaks were found for plastic pipes (7.03%). In total, 99 cementitious pipes, 35 metallic pipes, and 11 plastic pipes experienced breaks during 1976-2013.
Four groups are considered for the analysis: metallic, cementitious (diameter ≤ 150 mm and diameter > 150 mm), and plastic. Pipe characteristics data such as age (year), diameter (mm), length (m), soil resistivity (Ω-cm), and soil corrosivity index are collected from the GIS database of the CoK. The variables used to represent the input data, together with summary statistics of these data, are given in Table 4.1.

Table 4.1: Summary of control variables for CoK

| Variable | Unit | Scale | Metallic | Cementitious (Dia ≤ 150) | Cementitious (Dia > 150) | Plastic |
|---|---|---|---|---|---|---|
| NBRKS | NA | Min | 1.00 | 1.00 | 1.00 | 1.00 |
| | | Max | 5.00 | 4.00 | 4.00 | 3.00 |
| | | Mean | 1.54 | 1.44 | 1.22 | 1.27 |
| | | SD | 1.04 | 0.82 | 0.58 | 0.65 |
| AGE | Year | Min | 35.00 | 22.00 | 15.00 | 9.00 |
| | | Max | 66.00 | 57.00 | 59.00 | 28.00 |
| | | Mean | 56.66 | 40.96 | 41.06 | 22.45 |
| | | SD | 8.53 | 6.06 | 6.72 | 5.47 |
| DIA | mm | Min | 100.00 | 100.00 | 200.00 | 150.00 |
| | | Max | 400.00 | 150.00 | 350.00 | 350.00 |
| | | Mean | 182.86 | 136.46 | 241.18 | 195.45 |
| | | SD | 70.65 | 22.45 | 46.59 | 61.05 |
| LENGTH | m | Min | 47.25 | 27.39 | 77.78 | 86.10 |
| | | Max | 714.52 | 591.63 | 545.61 | 735.12 |
| | | Mean | 211.23 | 200.08 | 232.39 | 294.22 |
| | | SD | 142.77 | 114.64 | 104.18 | 188.99 |
| RESIS | Ω-cm | Min | 1950.00 | 1950.00 | 1950.00 | 1950.00 |
| | | Max | 2750.00 | 10750.00 | 3000.00 | 10750.00 |
| | | Mean | 2292.86 | 3071.88 | 2090.20 | 3718.18 |
| | | SD | 401.68 | 2343.53 | 331.82 | 3003.85 |
| SCI | NA | Min | 4.00 | 1.00 | 1.00 | 0.00 |
| | | Max | 12.50 | 5.00 | 5.00 | 5.00 |
| | | Mean | 12.04 | 2.96 | 3.98 | 2.55 |
| | | SD | 1.42 | 1.88 | 1.58 | 1.81 |

NBRKS: number of previous breaks; AGE: pipe age; DIA: diameter; LENGTH: length; RESIS: soil resistivity; SCI: soil corrosivity index. NA indicates that the variable has no unit.

Table 4.2 shows the posterior means, posterior standard deviations, and posterior probabilities of the regression coefficients obtained with the BMA approach for the CoK. The estimated values, standard errors and p-values of the regression coefficients from the classical analysis for the CoK are also presented in Table 4.2. Table 4.2 also indicates the significant parameters for the different models. The variables that were ‘not significant’ in the classical regression analysis generally had posterior probabilities below 50 per cent (Viallefont et al.
2001). Table 4.2 shows that pipe length is the only significant parameter for metallic pipes, whereas pipe age and length are the most influential parameters for the cementitious pipes (for both diameter ≤ 150 mm and diameter > 150 mm). For the plastic pipes, according to the classical approach, pipe age is the only significant parameter and pipe length is non-significant (p = 0.0564 > 0.05). According to the BMA approach, however, pipe length is a significant parameter, as its posterior probability (93.48%) is greater than 50%. Thus the classical regression analysis missed the evidence in the data for an effect of this variable. This high posterior probability indicates that pipe length is an influential covariate for the failure of plastic pipes. Soil resistivity and soil corrosivity index are not significant for any material group or model because of the small number of failure records; their posterior probabilities are below 25%. The only other non-significant variable whose posterior probability approached 50% was pipe age for metallic pipes (40.0%). Again, the BMA approach gives a more nuanced result than the classical regression approach: it does not indicate evidence for pipe age, but it does not rule it out either. If pipe age were an important variable for the metallic pipes of CoK, this result would point towards the need for more failure data or information on its possible effect.

Table 4.2: Classical and Bayesian analysis of the CoK (the first three numeric columns are the classical variable selection results; the last three are the Bayesian model averaging results)

| Model | Variable | $\hat{\beta}$ | SE | p-value | $E(\beta \mid D)$ | $SD(\beta \mid D)$ | $\Pr(\beta_i \neq 0 \mid D)$ |
|---|---|---|---|---|---|---|---|
| Metallic | Intercept | 0.82124 | 4.66109 | 0.8614 | 2.3043 | NA | 100.00%* |
| | AGE | -0.02015 | 0.01160 | 0.0931 | -0.0065 | 0.0104 | 40.00% |
| | DIA | -0.00121 | 0.00135 | 0.3768 | -0.0002 | 0.0008 | 20.30% |
| | log(LENGTH) | -0.86505 | 0.15638 | < 0.0001* | -0.8481 | 0.1420 | 100.00%* |
| | SCI | 0.03141 | 0.07827 | 0.6911 | 0.0017 | 0.0265 | 11.90% |
| | log(RESIS) | 0.34164 | 0.53387 | 0.5272 | 0.0617 | 0.2684 | 21.05% |
| Cementitious (Dia ≤ 150) | Intercept | 3.68387 | 1.36080 | 0.0098* | 3.71779 | NA | 100.00%* |
| | AGE | -0.04691 | 0.01067 | 0.0001* | -0.04720 | 0.01005 | 100.00%* |
| | DIA | -0.00195 | 0.00377 | 0.6074 | -0.00038 | 0.00139 | 16.40% |
| | log(LENGTH) | -0.68677 | 0.11747 | < 0.0001* | -0.67596 | 0.11071 | 100.00%* |
| | SCI | -0.00834 | 0.04926 | 0.8664 | -0.00410 | 0.01661 | 15.45% |
| | log(RESIS) | 0.04710 | 0.12630 | 0.71107 | 0.00830 | 0.05023 | 15.80% |
| Cementitious (Dia > 150) | Intercept | 0.20290 | 2.91300 | 0.94479 | 2.19677 | NA | 100.00%* |
| | AGE | -0.02197 | 0.00674 | 0.00213* | -0.02112 | 0.00799 | 95.35%* |
| | DIA | 0.00010 | 0.00114 | 0.93117 | 0.00002 | 0.00040 | 15.50% |
| | log(LENGTH) | -0.76750 | 0.09900 | < 0.0001* | -0.75772 | 0.09824 | 100.00%* |
| | SCI | -0.02027 | 0.03321 | 0.54479 | -0.00471 | 0.01582 | 17.80% |
| | log(RESIS) | 0.37340 | 0.34760 | 0.28836 | 0.09540 | 0.23139 | 23.55% |
| Plastic | Intercept | 6.14582 | 2.71310 | 0.07290 | 4.68895 | NA | 100.00%* |
| | AGE | -0.09348 | 0.03197 | 0.03290* | -0.08338 | 0.03718 | 92.77%* |
| | DIA | -0.00100 | 0.00278 | 0.73380 | -0.00009 | 0.00109 | 17.12% |
| | log(LENGTH) | -0.71817 | 0.29055 | 0.05640 | -0.72552 | 0.31025 | 93.48%* |
| | SCI | 0.00937 | 0.10475 | 0.93220 | 0.00827 | 0.04288 | 18.88% |
| | log(RESIS) | -0.17837 | 0.32116 | 0.60250 | -0.04115 | 0.14323 | 21.46% |

* Significant variable

Figure 4.1 shows the posterior distributions and posterior inclusion probabilities of the regression coefficients for plastic pipes, together with the means and standard deviations of the coefficients. As the posterior inclusion probabilities of pipe age and length are greater than 50%, they are considered significant parameters for the failure of plastic pipes of CoK.
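How a posterior inclusion probability such as Pr(βi ≠ 0|D) arises from a set of candidate models can be sketched as follows. The sketch assumes that each model's marginal likelihood is approximated through its BIC, under the usual definition BIC = −2 log L + d log n, and that all models are equally likely a priori. The candidate models and BIC values are hypothetical, and the helper names `posterior_model_probs` and `inclusion_probs` are illustrative rather than part of any package used in this study.

```python
import math


def posterior_model_probs(bics, priors=None):
    """Approximate Pr(M_k | D) with weights proportional to prior * exp(-BIC_k / 2)."""
    if priors is None:
        priors = [1.0 / len(bics)] * len(bics)  # equal prior model probabilities
    best = min(bics)  # subtract the minimum for numerical stability
    weights = [p * math.exp(-0.5 * (b - best)) for b, p in zip(bics, priors)]
    total = sum(weights)
    return [w / total for w in weights]


def inclusion_probs(models, probs):
    """Sum the posterior probabilities of all models that contain each variable."""
    variables = sorted({v for m in models for v in m})
    return {v: sum(p for m, p in zip(models, probs) if v in m) for v in variables}


# Hypothetical candidate models (sets of included covariates) and BIC values.
models = [{"AGE", "LENGTH"}, {"AGE"}, {"LENGTH"}, {"AGE", "LENGTH", "DIA"}]
bics = [100.0, 104.0, 103.0, 103.5]

probs = posterior_model_probs(bics)
pips = inclusion_probs(models, probs)
for name, value in pips.items():
    print(f"{name}: {100 * value:.2f}%")
```

A variable that appears only in poorly supported models receives a low inclusion probability, while a variable present in every well-supported model approaches 100%; the 50% threshold used in the text is then applied to these sums.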
Figure 4.1: Posterior distributions and probabilities of the regression coefficients for plastic pipes

The comparison of the BMA and regression models for the different pipe strata of the CoK is presented in Table 4.3. For all pipe strata, the MSE, MAE, and RMSE of the BMA models are lower than those of the regression models. Moreover, because the BMA approach treats pipe length as a significant parameter, the MSE, MAE, and RMSE for the plastic pipes are much lower than for the regression model, and the normal regression model strongly underestimates the failure rate of plastic pipes (BIAS = -97.80) compared with the BMA approach (BIAS = -25.00). The underestimation of the failure rates by the normal regression model increases as the amount of failure data or information decreases. It can therefore be concluded that the BMA model performs much better than the normal regression model for all pipe strata of the CoK.

Table 4.3: Performance of BMA and regression models for the CoK

| Error | Model | Metallic | Cementitious (Dia ≤ 150) | Cementitious (Dia > 150) | Plastic |
|---|---|---|---|---|---|
| MSE | BMA | 0.0145 | 0.0091 | 0.0023 | 0.1459 |
| | REG | 0.2151 | 0.2272 | 0.7781 | 1.9111 |
| MAE | BMA | 0.0751 | 0.0648 | 0.0297 | 0.1506 |
| | REG | 0.3417 | 0.6531 | 0.6457 | 1.4029 |
| RMSE | BMA | 0.1204 | 0.0956 | 0.0479 | 0.3819 |
| | REG | 0.8466 | 0.4127 | 0.8645 | 1.7817 |
| BIAS | BMA | -14.4000 | -6.7000 | -4.9000 | -25.0000 |
| | REG | -25.5000 | -15.1000 | -11.1000 | -97.8000 |

4.3.2 Greater Vernon Water, BC

Greater Vernon Water (GVW) is located in British Columbia, Canada. GVW has approximately 695 connections with farm or agricultural status, 1,450 commercial, institutional, or industrial connections, and 17,484 residential connections. The GVW water distribution network consists of a total of 144,119 individual pipes and 666.14 km of pipe.
The GVW network comprises 23.46% metallic (CI, DI, copper, steel, and galvanized), 22.69% cementitious (AC, concrete cylinder, and concrete), and 36.76% plastic pipes, which include polyethylene (PE), PVC, and HDPE, as well as 17.10% of unknown material. The database contains 306 pipe breaks from 1926 to 2009. The majority of breaks occurred in CI (35.95%), DI (19.61%), and AC (21.57%) pipes, whereas very few breaks were recorded for plastic pipes (8.17%). In total, 178 metallic, 69 cementitious, and 25 plastic pipes experienced breaks. Five groups are considered for the analysis: metallic (diameter ≤ 150 mm and diameter > 150 mm), cementitious (diameter ≤ 150 mm and diameter > 150 mm), and plastic. Pipe characteristics such as age, diameter, length, soil resistivity, and soil corrosivity index were collected from the GIS database of the GVW. Table 4.4 presents the summary statistics of the variables for the GVW models.

Table 4.4: Summary of control variables for GVW

| Variable | Unit | Scale | Metallic (Dia ≤ 150) | Metallic (Dia > 150) | Cementitious (Dia ≤ 150) | Cementitious (Dia > 150) | Plastic |
|---|---|---|---|---|---|---|---|
| NBRKS | NA | Min | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| | | Max | 4.00 | 2.00 | 5.00 | 2.00 | 3.00 |
| | | Mean | 1.23 | 1.11 | 1.20 | 1.17 | 1.40 |
| | | SD | 0.55 | 0.32 | 0.69 | 0.38 | 0.71 |
| AGE | Year | Min | 17.00 | 6.00 | 36.10 | 35.10 | 4.00 |
| | | Max | 88.00 | 88.00 | 50.10 | 48.00 | 47.00 |
| | | Mean | 50.59 | 42.26 | 43.92 | 45.43 | 22.41 |
| | | SD | 17.95 | 17.94 | 3.45 | 2.97 | 13.49 |
| DIA | mm | Min | 50.00 | 200.00 | 50.00 | 200.00 | 25.00 |
| | | Max | 150.00 | 600.00 | 150.00 | 900.00 | 300.00 |
| | | Mean | 130.63 | 277.78 | 128.89 | 325.00 | 133.20 |
| | | SD | 29.68 | 106.53 | 27.15 | 197.81 | 68.04 |
| LENGTH | m | Min | 4.27 | 5.12 | 2.26 | 17.09 | 6.10 |
| | | Max | 605.67 | 321.05 | 332.69 | 383.17 | 314.44 |
| | | Mean | 84.93 | 101.80 | 101.83 | 136.07 | 110.29 |
| | | SD | 68.08 | 76.10 | 66.17 | 87.47 | 81.30 |
| RESIS | Ω-cm | Min | 1000.00 | 1000.00 | 1000.00 | 1250.00 | 1000.00 |
| | | Max | 2800.00 | 2800.00 | 2800.00 | 2800.00 | 2800.00 |
| | | Mean | 2577.46 | 2287.50 | 2490.00 | 2637.50 | 2442.00 |
| | | SD | 540.54 | 712.98 | 513.41 | 457.37 | 536.90 |
| SCI | NA | Min | 0.00 | 0.00 | 0.00 | 0.80 | 0.00 |
| | | Max | 1.00 | 0.79 | 1.00 | 1.00 | 0.50 |
| | | Mean | 0.08 | 0.17 | 0.80 | 0.83 | 0.02 |
| | | SD | 0.22 | 0.30 | 0.13 | 0.07 | 0.10 |

NBRKS: Number of previous breaks, AGE: Pipe age, DIA: Diameter, LENGTH: Length, RESIS: Soil resistivity, SCI: Soil corrosivity index; NA indicates that the variable has no unit.

The posterior means, posterior standard deviations, and posterior probabilities of the regression coefficients obtained with the BMA approach for the GVW are presented in Table 4.5, together with the estimated values, standard errors, and p-values of the regression coefficients from the classical analysis. Table 4.5 shows that pipe age and length are the most influential parameters for the metallic pipes (for both diameter ≤ 150 mm and diameter > 150 mm). Pipe length is the only significant parameter for cementitious (diameter > 150 mm) pipes. For the cementitious (diameter ≤ 150 mm) pipes, pipe diameter is non-significant under the classical approach (p = 0.0632 > 0.05). According to the BMA approach, however, pipe diameter is a significant parameter, as its posterior probability (60.35%) is greater than 50%. Pipe age, diameter, and length are the most significant parameters for plastic pipes. Because of the small number of failure records, soil resistivity and soil corrosivity index are not significant for any material group or model.

Table 4.5: Classical and Bayesian analysis of the GVW (the first three numeric columns are the classical variable selection results; the last three are the Bayesian model averaging results)

| Model | Variable | $\hat{\beta}$ | SE | p-value | $E(\beta \mid D)$ | $SD(\beta \mid D)$ | $\Pr(\beta_i \neq 0 \mid D)$ |
|---|---|---|---|---|---|---|---|
| Metallic (Dia ≤ 150) | Intercept | 6.23455 | 1.75817 | 0.00054* | 3.90527 | NA | 100.00%* |
| | AGE | -0.01812 | 0.00159 | < 0.0001* | -0.01783 | 0.00159 | 100.00%* |
| | DIA | -0.00067 | 0.00095 | 0.48319 | -0.00007 | 0.00037 | 9.35% |
| | log(LENGTH) | -0.95761 | 0.03353 | < 0.0001* | -0.94872 | 0.03411 | 100.00%* |
| | SCI | -0.42841 | 0.29236 | 0.14514 | -0.01560 | 0.08131 | 9.40% |
| | log(RESIS) | -0.28009 | 0.22059 | 0.20634 | -0.00324 | 0.05398 | 9.60% |
| Metallic (Dia > 150) | Intercept | 7.13294 | 4.01595 | 0.0859 | 4.33849 | NA | 100.00%* |
| | AGE | -0.03115 | 0.00324 | < 0.0001* | -0.02969 | 0.00364 | 100.00%* |
| | DIA | -0.00093 | 0.00061 | 0.1387 | -0.00025 | 0.00053 | 29.00% |
| | log(LENGTH) | -0.92372 | 0.06948 | < 0.0001* | -0.94153 | 0.07280 | 100.00%* |
| | SCI | -0.49654 | 0.65881 | 0.4569 | -0.03147 | 0.16154 | 17.04% |
| | log(RESIS) | -0.32048 | 0.51265 | 0.5366 | 0.01100 | 0.11855 | 16.61% |
| Cementitious (Dia ≤ 150) | Intercept | 4.96442 | 1.75586 | 0.0074* | 3.45109 | NA | 100.00%* |
| | AGE | -0.00736 | 0.01430 | 0.6096 | -0.00215 | 0.00802 | 18.05% |
| | DIA | 0.00354 | 0.00185 | 0.0632 | 0.00244 | 0.00245 | 60.35%* |
| | log(LENGTH) | -0.91664 | 0.05609 | < 0.0001* | -0.89495 | 0.06023 | 100.00%* |
| | SCI | 0.07931 | 0.35606 | 0.8249 | 0.01199 | 0.14248 | 13.35% |
| | log(RESIS) | -0.29397 | 0.18179 | 0.1139 | -0.11618 | 0.19250 | 36.50% |
| Cementitious (Dia > 150) | Intercept | 3.77192 | 2.24511 | 0.1100 | 3.75444 | NA | 100.00%* |
| | AGE | -0.03094 | 0.02074 | 0.1530 | -0.01348 | 0.02113 | 40.80% |
| | DIA | -0.00033 | 0.00032 | 0.3110 | -0.00011 | 0.00024 | 29.15% |
| | log(LENGTH) | -0.97626 | 0.10895 | < 0.0001* | -0.99670 | 0.10706 | 100.00%* |
| | SCI | -0.88139 | 0.99988 | 0.3900 | -0.15076 | 0.57838 | 21.60% |
| | log(RESIS) | 0.19910 | 0.25734 | 0.4490 | 0.02680 | 0.13112 | 17.60% |
| Plastic | Intercept | 4.81012 | 3.41649 | 0.1753 | 5.78553 | NA | 100.00%* |
| | AGE | -0.06605 | 0.00856 | < 0.0001* | -0.06299 | 0.00898 | 100.00%* |
| | DIA | -0.00486 | 0.00170 | 0.0102* | -0.00442 | 0.00217 | 90.70%* |
| | log(LENGTH) | -0.99597 | 0.11365 | < 0.0001* | -0.96448 | 0.11805 | 100.00%* |
| | SCI | 0.16936 | 1.08113 | 0.8772 | 0.02064 | 0.45325 | 15.50% |
| | log(RESIS) | 0.20035 | 0.39965 | 0.6219 | 0.04108 | 0.20391 | 17.95% |

* Significant variable

Table 4.6 compares the BMA and regression models for the different pipe strata of the GVW.
The MSE, MAE, and RMSE of the BMA models are lower than those of the regression models for all pipe strata. Moreover, the MSE, MAE, and RMSE for the cementitious (diameter ≤ 150 mm) pipes are much lower for the BMA approach than for the regression model because the BMA approach treats pipe diameter as a significant parameter. For metallic pipes (diameter ≤ 150 mm), the underestimation by the normal regression model (BIAS = -3.70) is slightly smaller than that of the BMA approach (BIAS = -4.30). For the remaining four models, the normal regression model strongly underestimates the failure rate compared with the BMA approach. According to Table 4.6, the BMA model performs much better than the normal regression model for all pipe strata of the GVW.

Table 4.6: Performance of BMA and regression models for the Greater Vernon Water

| Error | Model | Metallic (Dia ≤ 150) | Metallic (Dia > 150) | Cementitious (Dia ≤ 150) | Cementitious (Dia > 150) | Plastic |
|---|---|---|---|---|---|---|
| MSE | BMA | 0.0579 | 0.3325 | 0.0681 | 0.0150 | 4.0741 |
| | REG | 0.1586 | 1.8230 | 0.6260 | 0.0590 | 14.7546 |
| MAE | BMA | 0.1207 | 0.2220 | 0.0996 | 0.0699 | 0.9972 |
| | REG | 0.2236 | 0.6378 | 0.2793 | 0.1154 | 1.7815 |
| RMSE | BMA | 0.2407 | 0.5766 | 0.2609 | 0.1226 | 2.0184 |
| | REG | 0.4422 | 1.3502 | 0.7912 | 0.2429 | 3.8412 |
| BIAS | BMA | -4.3000 | -11.8000 | -9.0000 | -4.8000 | -19.8000 |
| | REG | -3.7000 | -61.2000 | -48.1000 | -27.1000 | -60.1000 |

4.3.3 City of Calgary, Alberta

The details of the water distribution network of the City of Calgary were presented in the previous chapter. Systematic recording of pipe break data began in 1956, and the database contains 17,682 pipe breaks. Only CI, DI, cementitious, and plastic pipes are considered for further analysis because of their high percentage of breaks. From 1956 to 2013, 2,882 CI, 2,067 DI, 199 cementitious, and 159 plastic pipes experienced breaks. Six groups are considered for the analysis: CI (diameter ≤ 150 mm and diameter > 150 mm), DI (diameter ≤ 150 mm and diameter > 150 mm), cementitious, and plastic.
Pipe characteristics such as age, diameter, length, soil resistivity, and soil corrosivity index were collected from the GIS database of the City of Calgary. The summary statistics of the control variables for the City of Calgary are presented in Table 4.7.

Table 4.7: Summary of control variables for City of Calgary

| Variable | Unit | Scale | CI (Dia ≤ 150) | CI (Dia > 150) | DI (Dia ≤ 150) | DI (Dia > 150) | Cementitious | Plastic |
|---|---|---|---|---|---|---|---|---|
| NBRKS | NA | Min | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| | | Max | 21.00 | 22.00 | 13.00 | 16.00 | 17.00 | 2.00 |
| | | Mean | 2.24 | 2.37 | 2.35 | 2.21 | 2.03 | 1.07 |
| | | SD | 1.61 | 2.26 | 1.90 | 1.83 | 1.84 | 0.25 |
| AGE | Year | Min | 18.00 | 22.00 | 30.00 | 27.00 | 2.00 | 10.00 |
| | | Max | 104.00 | 104.00 | 50.00 | 51.00 | 58.00 | 41.00 |
| | | Mean | 64.40 | 61.69 | 40.25 | 40.52 | 29.02 | 27.15 |
| | | SD | 17.05 | 16.59 | 3.75 | 3.93 | 12.69 | 6.48 |
| DIA | mm | Min | 20.00 | 200.00 | 100.00 | 200.00 | 150.00 | 150.00 |
| | | Max | 150.00 | 600.00 | 150.00 | 400.00 | 1200.00 | 400.00 |
| | | Mean | 147.36 | 254.90 | 148.14 | 248.05 | 327.14 | 199.37 |
| | | SD | 12.71 | 62.54 | 9.48 | 52.54 | 195.32 | 61.36 |
| LENGTH | m | Min | 3.00 | 3.00 | 4.00 | 5.00 | 14.00 | 12.00 |
| | | Max | 562.00 | 937.00 | 515.00 | 1691.00 | 1870.00 | 637.00 |
| | | Mean | 156.34 | 179.64 | 171.00 | 196.54 | 363.80 | 175.72 |
| | | SD | 74.03 | 110.94 | 89.66 | 109.72 | 314.57 | 83.24 |
| RESIS | Ω-cm | Min | 579.00 | 573.00 | 507.00 | 505.00 | 677.00 | 746.00 |
| | | Max | 20000.00 | 10526.00 | 11111.00 | 12500.00 | 8333.00 | 6250.00 |
| | | Mean | 2737.95 | 2224.59 | 1825.35 | 1816.60 | 1859.08 | 2201.58 |
| | | SD | 1860.85 | 1138.35 | 997.61 | 948.56 | 973.06 | 1073.92 |
| SCI | NA | Min | 0.00 | 1.93 | 1.93 | 1.93 | 0.00 | 0.00 |
| | | Max | 20.43 | 20.43 | 20.43 | 20.43 | 18.18 | 20.43 |
| | | Mean | 6.22 | 7.92 | 10.85 | 11.08 | 0.60 | 0.60 |
| | | SD | 4.67 | 5.43 | 6.30 | 6.22 | 2.40 | 2.97 |

NBRKS: Number of previous breaks, AGE: Pipe age, DIA: Diameter, LENGTH: Length, RESIS: Soil resistivity, SCI: Soil corrosivity index; NA indicates that the variable has no unit.

Table 4.8 presents the posterior means, posterior standard deviations, and posterior probabilities of the regression coefficients obtained with the BMA approach. The estimated values, standard errors, and p-values of the regression coefficients from the classical analysis are also shown in Table 4.8.
Table 4.8 indicates that pipe age and length are the most influential parameters for all pipe types, whereas soil resistivity and soil corrosivity index affect only the metallic pipes (CI and DI pipes). The posterior probabilities of soil resistivity and soil corrosivity index were less than 25% for both cementitious and plastic pipes.

Table 4.8: Classical and Bayesian analysis of the City of Calgary (the first three numeric columns are the classical variable selection results; the last three are the Bayesian model averaging results)

| Model | Variable | $\hat{\beta}$ | SE | p-value | $E(\beta \mid D)$ | $SD(\beta \mid D)$ | $\Pr(\beta_i \neq 0 \mid D)$ |
|---|---|---|---|---|---|---|---|
| CI (Dia ≤ 150) | Intercept | 7.01039 | 0.32686 | < 0.0001* | 6.90069 | NA | 100.00%* |
| | AGE | -0.01824 | 0.00077 | < 0.0001* | -0.01823 | 0.00077 | 100.00%* |
| | DIA | -0.00078 | 0.00102 | 0.445 | -0.00003 | 0.00025 | 3.90% |
| | log(LENGTH) | -0.77871 | 0.02089 | < 0.0001* | -0.77938 | 0.02084 | 100.00%* |
| | SCI | -0.02345 | 0.00361 | < 0.0001* | -0.02339 | 0.00361 | 100.00%* |
| | log(RESIS) | -0.42319 | 0.03256 | < 0.0001* | -0.42301 | 0.03254 | 100.00%* |
| CI (Dia > 150) | Intercept | 7.26099 | 0.51284 | < 0.0001* | 7.35011 | NA | 100.00%* |
| | AGE | -0.01444 | 0.00117 | < 0.0001* | -0.01401 | 0.00117 | 100.00%* |
| | DIA | 0.00070 | 0.00031 | 0.2420 | 0.00019 | 0.00035 | 27.01% |
| | log(LENGTH) | -0.72228 | 0.02617 | < 0.0001* | -0.71363 | 0.02629 | 100.00%* |
| | SCI | -0.02242 | 0.00473 | < 0.0001* | -0.02275 | 0.00474 | 99.98%* |
| | log(RESIS) | -0.57113 | 0.05782 | < 0.0001* | -0.57445 | 0.05791 | 100.00%* |
| DI (Dia ≤ 150) | Intercept | 6.86396 | 0.69030 | < 0.0001* | 6.79038 | NA | 100.00%* |
| | AGE | 0.02439 | 0.00542 | < 0.0001* | 0.02437 | 0.00541 | 100.00%* |
| | DIA | -0.00041 | 0.00213 | 0.847 | -0.00001 | 0.00030 | 1.85% |
| | log(LENGTH) | -0.76226 | 0.03050 | < 0.0001* | -0.76286 | 0.02968 | 100.00%* |
| | SCI | -0.01806 | 0.00448 | < 0.0001* | -0.01791 | 0.00475 | 99.15%* |
| | log(RESIS) | -0.66666 | 0.06707 | < 0.0001* | -0.66455 | 0.06910 | 100.00%* |
| DI (Dia > 150) | Intercept | 6.24678 | 0.56524 | < 0.0001* | 6.15718 | NA | 100.00%* |
| | AGE | 0.02045 | 0.00430 | < 0.0001* | 0.02049 | 0.00433 | 99.94%* |
| | DIA | -0.00061 | 0.00032 | 0.0541 | -0.00009 | 0.00025 | 15.14% |
| | log(LENGTH) | -0.74965 | 0.02637 | < 0.0001* | -0.75203 | 0.02636 | 100.00%* |
| | SCI | -0.01715 | 0.00405 | < 0.0001* | -0.01724 | 0.00412 | 99.77%* |
| | log(RESIS) | -0.57345 | 0.06041 | < 0.0001* | -0.57725 | 0.06105 | 100.00%* |
| Cementitious | Intercept | 4.69100 | 0.29560 | < 0.0001* | 4.48470 | NA | 100.00%* |
| | AGE | -0.03644 | 0.00227 | < 0.0001* | -0.03569 | 0.00229 | 100.00%* |
| | DIA | -0.00010 | 0.00024 | 0.679 | -0.00001 | 0.00008 | 8.55% |
| | log(LENGTH) | -0.97080 | 0.02604 | < 0.0001* | -0.96226 | 0.02661 | 100.00%* |
| | SCI | 0.00504 | 0.00502 | 0.318 | 0.00052 | 0.00225 | 9.25% |
| | log(RESIS) | -0.01913 | 0.03228 | 0.554 | -0.00224 | 0.01195 | 8.55% |
| Plastic | Intercept | 7.16854 | 0.84074 | < 0.0001* | 6.26074 | NA | 100.00%* |
| | AGE | -0.03607 | 0.00353 | < 0.0001* | -0.03578 | 0.00357 | 100.00%* |
| | DIA | -0.00039 | 0.00025 | 0.1154 | -0.00010 | 0.00022 | 23.50% |
| | log(LENGTH) | -1.06230 | 0.06193 | < 0.0001* | -1.08510 | 0.06154 | 100.00%* |
| | SCI | -0.02580 | 0.01864 | 0.1679 | -0.00500 | 0.01315 | 19.15% |
| | log(RESIS) | -0.19555 | 0.10225 | 0.0573 | -0.07172 | 0.11420 | 24.65% |

* Significant variable

Table 4.9 compares the BMA and regression models for the different pipe strata of the City of Calgary. According to Table 4.9, the BMA and regression models perform almost identically for metallic pipes because of the large number of CI (2,882) and DI (2,067) pipe failure records. However, the BMA model performs much better than the regression model for both cementitious and plastic pipes, for which very few failure records are available (199 and 159, respectively). The normal regression model strongly underestimates the failure rate of plastic pipes (BIAS = -24.00) compared with the BMA approach (BIAS = -1.80). It can therefore be concluded that the BMA model performs noticeably better than the normal regression model whenever very few pipe failure data or little information is available.
Table 4.9: Performance of BMA and regression models for the City of Calgary

| Error | Model | CI (Dia ≤ 150) | CI (Dia > 150) | DI (Dia ≤ 150) | DI (Dia > 150) | Cementitious | Plastic |
|---|---|---|---|---|---|---|---|
| MSE | BMA | 0.0572 | 0.0580 | 0.0963 | 0.0701 | 3.5273 | 0.0032 |
| | REG | 0.0575 | 0.0477 | 0.0964 | 0.0700 | 4.0268 | 0.1143 |
| MAE | BMA | 0.1334 | 0.1330 | 0.2013 | 0.1575 | 0.4045 | 0.0299 |
| | REG | 0.1337 | 0.1353 | 0.2016 | 0.1577 | 0.4179 | 0.1348 |
| RMSE | BMA | 0.2394 | 0.2407 | 0.3103 | 0.2643 | 1.8781 | 0.0567 |
| | REG | 0.2398 | 0.2184 | 0.3108 | 0.2646 | 2.0067 | 0.3381 |
| BIAS | BMA | -14.50 | -7.60 | -14.00 | -14.90 | -32.60 | -1.80 |
| | REG | -14.50 | -7.61 | -14.10 | -14.90 | -42.10 | -24.00 |

4.4 Summary

Uncertainty is inherent in the prediction of water main failure regardless of the quality and quantity of data. In this chapter, Bayesian model averaging based water main failure prediction models are proposed to account formally for model uncertainty. The advantages of the proposed BMA-based models are demonstrated by predicting pipe failures in the water distribution networks of the City of Kelowna, BC, Greater Vernon Water, BC, and the City of Calgary, Alberta, Canada. Results indicate that BMA provides a transparent statement of the probability that a variable is associated with water main failure through the posterior probability Pr (βi ≠ 0|D), in contrast to the p-value given by the classical analysis. Moreover, the BMA approach performs noticeably better than the classical normal regression model whenever limited pipe failure data or information is available. Such an approach could help to identify the influential variables and to achieve better predictive performance in real-life studies. Results also showed that the impacts of the covariates differ according to material type. However, physical pipe attributes such as pipe age and length were found to contribute more to pipe failure when limited data are available.
The results from these BMA models could be further integrated with an economic assessment model (e.g., life cycle costing) to estimate the costs of inspection, repair, and rehabilitation, and to develop optimal maintenance or replacement plans. The proposed approach can also incorporate the judgement or knowledge of utility managers or authorities as a prior distribution and assess the overall impact.

Chapter 5 Bayesian Weibull Proportional Hazard Based Failure Model

A version of this chapter has been published in Reliability Engineering & System Safety with the title “Predicting Water Main Failures using Bayesian Model Averaging and Survival Modelling Approach.” The lead author is Golam Kabir and the coauthors are Dr. Solomon Tesfamariam and Dr. Rehan Sadiq.

5.1 Background

Statistical models that attempt to predict the behaviour of water pipes are affected not only by the quantity and quality of available data but also by the adopted statistical techniques (Kleiner and Rajani 2002, 2001). Survival analysis is one of the most widely used statistical approaches for modelling water main failures, and substantial efforts have been made to develop pipe failure prediction models based on it. Table 2.2 provides a brief summary of pipe failure prediction models based on survival analysis. The objective of this chapter is thus to develop an effective Bayesian analysis framework for failure rate prediction of water mains that formally takes uncertainties into consideration. To this end, Bayesian model averaging (BMA) is used to select influential covariates while accounting for model uncertainty, whereas the Bayesian Weibull Proportional Hazard Model (BWPHM) is used to develop survival curves for the failure prediction of water mains. The utility of the proposed framework is illustrated with the City of Calgary pipe failure data.
5.2 Proposed Methodology

The Cox Proportional Hazard Model (Cox-PHM) is one of the most widely used semi-parametric survival analysis models for water main failure. The Cox-PHM was developed by Cox (1972) to examine the effects of different covariates on the time-to-failure of a system and is of the form:

$$h(t \mid X) = h_0(t) \exp(X\theta) \tag{5.1}$$

where t is the elapsed time from the last failure, h(t|X) is the hazard function, X = [x1, x2, ..., xp] is the covariate vector, θ = [θ1, θ2, ..., θp] is the vector of covariate coefficients, and h0(t) is the baseline hazard function, which is equivalent to the total hazard rate when covariates have no influence on pipe failure (Clark et al. 2010; Park et al. 2011). To describe the time to failure of a water main, the survival function is given by

$$S(t \mid X) = S_0(t)^{\exp(X\theta)} \tag{5.2}$$

where S0(t) is the baseline survivor function. Let t_i, i = 1, 2, ..., n, represent the time-to-failure of the ith of n pipes, and let the ith pipe have a vector of covariates X_i = [x1, x2, ..., xp]; then the partial likelihood L of θ = [θ1, θ2, ..., θp] is (Ibrahim et al. 2005):

$$L(\theta) = \prod_{i=1}^{n} \left[ \frac{\exp(\theta X_i)}{\sum_{k \in R(t_i)} \exp(\theta X_k)} \right]^{\delta_i} \tag{5.3}$$

where R(t_i) is the risk set of all pipes k still at risk at time t_i, and δ_i is a censoring indicator. If δ_i = 1, the observation is not censored and t_i is the actual survival time. Otherwise, when δ_i = 0, the observation t_i is the censored time.

To better capture the failure hazards of a water main, it is important to determine the influential covariates to be included in a model. However, most previous Cox-PHM studies did not carry out any preliminary covariate selection analysis (Debón et al. 2010; Fuchs-Hanusch et al. 2012; Park et al. 2008b; Vanrenterghem-Raven 2004), except Park et al. (2011), who examined the log-likelihood ratio statistic and the Akaike Information Criterion (AIC). After that, Park et al.
(2011) carried out inference conditionally on the selected model, but this ignores the model uncertainty implicit in the covariate selection process and thus may underestimate the uncertainty about relative risks. Viallefont et al. (2001) showed that p-values computed after covariate selection can greatly overstate the strength of conclusions. For this reason, BMA is proposed in this study as a formal way of selecting covariates and accounting for model uncertainty in water main failure prediction models.

BMA is an average of the posterior distributions under each model, weighted by the corresponding posterior model probabilities (Leamer 1978; Wasserman 2000). If Μ = {M1, ..., MK} denotes the set of all models being considered and Δ is the quantity of interest, then the posterior distribution of Δ given the data D is (Park and Grandhi 2014; Raftery et al. 1997; Wasserman 2000)

$$\Pr(\Delta \mid D) = \sum_{k=1}^{K} \Pr(\Delta \mid M_k, D) \Pr(M_k \mid D) \tag{5.4}$$

The posterior probability of model Mk is given by (Leamer 1978; Raftery et al. 1997)

$$\Pr(M_k \mid D) = \frac{\Pr(D \mid M_k) \Pr(M_k)}{\sum_{l=1}^{K} \Pr(D \mid M_l) \Pr(M_l)} \tag{5.5}$$

where Pr(D|Mk) is the marginal likelihood of model Mk, computed as:

$$\Pr(D \mid M_k) = \int \Pr(D \mid \theta_k, M_k) \Pr(\theta_k \mid M_k) \, d\theta_k \tag{5.6}$$

Here θk is the vector of parameters of model Mk, Pr(D|θk, Mk) is the likelihood, Pr(θk|Mk) is the prior density of θk under model Mk, and Pr(Mk) is the prior probability that Mk is the true model (Hoeting et al. 1999; Raftery et al. 1997). Since the integrals required for BMA do not have a closed-form solution for Cox-PHMs, Hoeting et al. (1999), Raftery et al. (1997), and Volinsky et al. (1997) adopted the Laplace approximation or the Bayesian information criterion (BIC) approximation:

$$\log \Pr(D \mid M_k) \approx \log \Pr(D \mid \hat{\theta}_k, M_k) - d_k \log m \tag{5.7}$$

where $\hat{\theta}_k$ is the posterior mean of θk, dk is the dimension of θk, and m is the total number of uncensored cases. To implement BMA for Cox-PHMs, the Occam’s window method is considered, and the “leaps and bounds” algorithm is adopted following Volinsky et al.
(1997) considering a threshold window χ = 200 (see section 5.2 for details). To do this, an adapted leaps and bounds algorithm is implemented using the bic.surv function of the BMA package.

One of the limitations of the Cox-PHM is that it assumes a proportional fixed effect on the baseline hazard function (Clark et al. 2010), and the baseline hazard function depends on time but not on the covariates (Le Gat and Eisenbeis 2000). The baseline hazard function represents the aging process, such as the effect of internal and external corrosion (Clark et al. 2010), which occurs not only as a function of time but also of other stressing variables such as soil resistivity, soil corrosivity, and moisture content (Alvisi and Franchini 2010). For this reason, the Weibull proportional hazard model (WPHM) is gaining popularity for water main failure modelling, as it considers the interaction between covariates and time (Alvisi and Franchini 2010; Debón et al. 2010; Le Gat and Eisenbeis 2000; Vanrenterghem-Raven 2004). Therefore, the water main failure prediction models in this study are developed using a Bayesian WPHM.

The WPHM is a parametric version of the Cox-PHM in which the baseline hazard function is assumed to follow a specific distribution when the model is fitted to data (Le Gat and Eisenbeis 2000). In the WPHM, a vector of covariates X = [x1, x2, ..., xm] is assumed to be linearly related to the logarithm of the time interval T (interarrival time), which can be expressed as the log-linear model for a generic pipe (Kimutai et al. 2015; Le Gat and Eisenbeis 2000):

$$\ln T = \beta X + \alpha \epsilon \tag{5.8}$$

where β = [β0, β1, ..., βm] is the row vector of coefficients, α is the scale parameter (written σ in the equations that follow), and ϵ is a random component. The model assumes that the random component ϵ of the break interarrival time has a distribution of extreme values with survival function F (Alvisi and Franchini 2010; Kimutai et al.
2015):

$$F(\epsilon) = \exp[-\exp(\epsilon)] \tag{5.9}$$

The survival function of the WPHM, which represents the probability that a generic pipe will survive beyond time T, is (Alvisi and Franchini 2010; Kimutai et al. 2015; Le Gat and Eisenbeis 2000):

$$F(T, \boldsymbol{\beta}, \boldsymbol{X}) = \exp\left[-\exp\left(\frac{\ln T - \boldsymbol{\beta X}}{\sigma}\right)\right] = \exp\left[-T^{1/\sigma} \exp\left(-\frac{\boldsymbol{\beta X}}{\sigma}\right)\right] \tag{5.10}$$

where σ denotes the scale parameter, written α in Equation 5.8. The corresponding probability density function is (Alvisi and Franchini 2010; Le Gat and Eisenbeis 2000):

$$f(T, \boldsymbol{\beta}, \boldsymbol{X}) = -\exp\left[-T^{1/\sigma} \exp\left(-\frac{\boldsymbol{\beta X}}{\sigma}\right)\right] \left[-\frac{1}{\sigma} T^{(1/\sigma)-1} \exp\left(-\frac{\boldsymbol{\beta X}}{\sigma}\right)\right] \tag{5.11}$$

Model parameters can be obtained by maximising the following log-likelihood function:

$$\ln[L(T_1, T_2, \ldots, T_i, \boldsymbol{\beta}, \boldsymbol{X})] = \sum_{j \in nrc} \ln f(T_j, \boldsymbol{\beta}, \boldsymbol{X}) + \sum_{i \in rc} \ln F(T_i, \boldsymbol{\beta}, \boldsymbol{X}) \tag{5.12}$$

in which the generic interarrival time relative to a breakage contributes to the probability density function if it is associated with a breakage occurring within the observation period, that is, if the interarrival time is not ‘right censored’ (nrc); if it is ‘right censored’ (rc), it contributes to the survival function (Alvisi and Franchini 2010; Kimutai et al. 2015).

For the Bayesian inference of the WPHM, independent priors are assumed for the parameters β and α. The joint posterior density function is proportional to the product of the likelihood function and the priors (Albert 2009):

$$f(\boldsymbol{\beta}, \alpha \mid \boldsymbol{D}) \propto L(\boldsymbol{\beta}, \alpha \mid \boldsymbol{D}) \, f(\boldsymbol{\beta}) f(\alpha) \tag{5.13}$$

Posterior inference can be based on joint posterior samples of (β, α), which can be drawn using a Markov chain Monte Carlo (MCMC) technique (Gelman et al. 2003; Gilks et al. 1996). Whenever new data or information become available, the model parameters (β, α) can be updated.
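The censored Weibull log-likelihood and the sampling of the joint posterior can be sketched in plain Python as follows. This is a minimal illustration rather than the implementation used in this study: it assumes a flat prior on β and a prior proportional to 1/σ on the scale parameter, samples in (β, log σ) with a random-walk Metropolis step, and runs on a small set of hypothetical interarrival times. All function names and data values are illustrative.

```python
import math
import random


def weibull_loglik(times, covariates, censored, beta, sigma):
    """Censored Weibull log-likelihood: ln f for observed breaks, ln F (survival) otherwise.

    censored[i] is True for a right-censored interarrival time. Each covariate
    row starts with 1.0 so that beta[0] acts as the intercept.
    """
    total = 0.0
    for t, x, is_rc in zip(times, covariates, censored):
        linear = sum(b * xi for b, xi in zip(beta, x))
        z = (math.log(t) - linear) / sigma
        if z > 700.0:                      # exp(z) would overflow; likelihood is ~0
            return float("-inf")
        total += -math.exp(z)              # ln of the survival function
        if not is_rc:
            total += z - math.log(sigma) - math.log(t)  # ln f minus ln F
    return total


def log_posterior(params, times, covariates, censored):
    """Unnormalised log posterior in (beta, log sigma).

    With a flat prior on beta and a 1/sigma prior on sigma, the change of
    variable to log sigma cancels the prior term, leaving the log-likelihood.
    """
    *beta, log_sigma = params
    if abs(log_sigma) > 50.0:              # guard against degenerate scales
        return float("-inf")
    return weibull_loglik(times, covariates, censored, beta, math.exp(log_sigma))


def metropolis(log_target, start, step, n_iter, rng):
    """Random-walk Metropolis sampler; returns the samples and the acceptance rate."""
    current, current_lp = list(start), log_target(start)
    samples, accepted = [], 0
    for _ in range(n_iter):
        proposal = [c + rng.gauss(0.0, step) for c in current]
        proposal_lp = log_target(proposal)
        if proposal_lp >= current_lp or rng.random() < math.exp(proposal_lp - current_lp):
            current, current_lp = proposal, proposal_lp
            accepted += 1
        samples.append(tuple(current))
    return samples, accepted / n_iter


# Hypothetical interarrival times (years), one binary covariate, censoring flags.
times = [12.0, 20.0, 7.0, 30.0, 15.0, 25.0, 9.0, 18.0]
covariates = [[1.0, 0.0], [1.0, 1.0], [1.0, 0.0], [1.0, 1.0],
              [1.0, 0.0], [1.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
censored = [False, False, False, True, False, True, False, False]

rng = random.Random(0)
start = [math.log(15.0), 0.0, 0.0]         # beta0, beta1, log sigma
samples, rate = metropolis(
    lambda p: log_posterior(p, times, covariates, censored), start, 0.2, 2000, rng
)
print(f"acceptance rate: {rate:.2f}")
```

The proposal step (0.2 here) is a tuning constant: larger steps lower the acceptance rate and smaller steps raise it, so in practice it is adjusted after a pilot run, and early samples are usually discarded as burn-in before summarising the posterior.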
If no prior information is available, uniform priors can be assigned to β and the usual noninformative prior proportional to 1/σ can be assigned to the scale parameter σ (written α in Equation 5.8); the posterior density, up to a proportionality constant, can then be expressed as (Albert 2009)

$$f(\beta, \alpha \mid data) \propto \frac{1}{\sigma} L(\beta, \alpha) \tag{5.14}$$

The scale of the proposal density should be chosen so that the Metropolis random walk chain has an acceptance rate in the 20-40% range (Albert 2009).

5.3 Case study: City of Calgary

The proposed methodology is applied to the water distribution network of the City of Calgary.

5.3.1 Data collection

The distribution network of the City of Calgary is presented in Figure 3.1. The distributions of pipe installations and breaks are presented in Figures 3.2 and 3.3. Only CI and DI pipes are considered for further analysis because of their high percentage of breaks. Pipe characteristics such as age (year), diameter (mm), length (m), vintage or manufacturing period, number of connections of each pipe (for land use determination), soil resistivity (Ω-cm), and soil corrosivity index were collected from the GIS database of the City of Calgary. In addition to the basic information related to the pipe breaks, the water main break database contained information on the land development status and valve circuit. Individual pipes or pipe segments were identified using the unique ID defined by the City of Calgary. Two groups were considered for vintage (VINT): pipes installed during 1960-1976 and all others. The weather data were acquired from Environment Canada for the Calgary International Airport weather station.

Soil resistivity (Ω-cm) measures how strongly a soil opposes the flow of electric current. Soil resistivity can be affected by soil temperature, degree of soil compaction, concentration of different ions (salts), and moisture content (AWWA 1999; Sadiq et al. 2004). Soil resistivity plays a vital role in determining the corrosive nature of soil.
High soil resistivity (> 3,000 Ω-cm) results in a lower probability of corrosion, while low soil resistivity (< 1,500 Ω-cm) results in a higher probability of corrosion (Sadiq et al. 2004). Soil corrosivity has a significant role in the corrosion of buried metallic pipes, since the corrosive nature of the soil environment is the main driver of the metallic pipe corrosion process (Hubell 2003). For metallic pipes, soil corrosivity is mainly affected by soil resistivity, soil pH, redox potential, soil sulphide content, and moisture content (AWWA 1999; Sadiq et al. 2004). A high soil corrosivity index increases the probability of external corrosion in metallic pipes and hence their failure rates (Hubell 2003; Sadiq et al. 2004). The rain deficit (RD) is calculated using Equations 3.21 and 3.22. The freezing index (FI) is calculated immediately before the break from the daily mean temperature using Equation 3.23. The variables used to represent the input data, together with their summary statistics, are given in Table 5.1.
Table 5.1: Summary of control variables

| Variable | Description | Unit | Cast Iron (CI) | Ductile Iron (DI) |
|---|---|---|---|---|
| NBRKS | Number of previous breaks | NA | min: 1, max: 22, mean: 2.284, SD: 1.86 | min: 1, max: 16, mean: 2.271, SD: 1.85 |
| DIA | Pipe diameter | mm | min: 20, max: 600, mean: 184.3, SD: 63.7 | min: 100, max: 400, mean: 205.3, SD: 63.74 |
| LENGTH | Pipe length | m | min: 3, max: 937, mean: 164.3, SD: 89.13 | min: 4, max: 1691, mean: 185.6, SD: 102.36 |
| FI | Freezing index | degree-days | min: -232.96, max: 672.86, mean: 9.18, SD: 67.75 | min: -232.96, max: 622.43, mean: -18.614, SD: 58.12 |
| RD | Rain deficit | cm | min: -22.226, max: 218.8, mean: 2.88, SD: 21.51 | min: -21.29, max: 218.8, mean: 21.65, SD: 23.75 |
| VINT | Vintage | NA | min: 0, max: 1, mean: 0.82, SD: 0.384 | min: 0, max: 1, mean: 0.35, SD: 0.475 |
| RESIS | Soil resistivity | Ω-cm | min: 0, max: 15382, mean: 2304, SD: 1366.14 | min: 0, max: 16666, mean: 1894, SD: 1170.459 |
| SCI | Soil corrosivity index | NA | min: 0, max: 20.430, mean: 7.293, SD: 5.138 | min: 0, max: 20.428, mean: 10.373, SD: 6.269 |
| LANUSE | Land use | NA | min: 0 (residential), max: 1 (commercial), mean: 0.21, SD: 0.41 | min: 0 (residential), max: 1 (commercial), mean: 0.16, SD: 0.37 |

NA indicates that the variable has no unit.

5.3.2 Results and discussions

Figure 5.1 shows the number of failures of CI and DI pipes with respect to age and diameter. According to Figure 5.1(a), the number of breaks increases with age up to 30 years for both CI and DI pipes, and most CI and DI pipes experienced breaks at 21-30 years. The initial failure rate (0-10 years) is higher for DI pipes than for CI pipes because of installation problems, manufacturing defects, and similar causes. The number of breaks is fairly steady for CI pipes between 31-40 years and 41-50 years, whereas DI pipes show a sudden drop in breakage. In fact, no breaks are found for DI pipes at 51-60 years, which may indicate that the survival time of CI pipes is longer than that of DI pipes.
Figure 5.1(b) shows the relationship between the number of breaks, pipe diameter, and pipe material. For both CI and DI pipes, the number of breaks in smaller diameter pipes (especially 150 mm) is much higher than in large diameter pipes.

Figure 5.1: Number of failures of CI and DI pipes with respect to age and diameter: (a) Age; (b) Diameter [vertical axis: number of failures; series: CI, DI]

Similarly, Figure 5.2 shows the number of failures of CI and DI pipes with respect to soil resistivity and soil corrosivity index. Figure 5.2(a) indicates that a significant number of CI and DI pipe breaks occur in soils with low resistivity (< 1,500 Ω-cm) and that the effect of low soil resistivity is much higher on DI pipes than on CI pipes. The number of breaks in DI pipes decreases as soil resistivity increases. Figure 5.2(b) presents the relationship between the number of breaks, pipe material, and soil corrosivity index; it indicates that the effect of a high soil corrosivity index is greater on DI pipes than on CI pipes.

The interaction between pipe material and the other explanatory factors is not the same for the two materials. For this reason, the data are first stratified by material type, cast iron (CI) and ductile iron (DI), in order to establish the influence of the covariates. Different authors observed that pipe failure behaviors differ between the first break and subsequent breaks (Le Gat and Eisenbeis 2000; Mailhot et al. 2000; Pelletier et al. 2003). These authors also noted that the likelihood of a pipe experiencing further breaks is much higher after its first break.
Accordingly, for both CI and DI pipes, two strata are defined by the number of observed previous failures (NOPF): no previous failure (NOPF = 0) and one or more previous failures (NOPF > 0) (Le Gat and Eisenbeis 2000; Pelletier et al. 2003). The covariates to be included in the models are selected using BMA. In the BMA model, time and status denote the vector of values of the dependent variable and the vector of censoring indicators (0 = censored, 1 = failure), respectively. LENGTH, DIA, VINT, FI, RD, log.RESIS., SCI, and LANUSE are considered as independent variables. The BMA approach starts by considering all possible combinations of the 8 variables, yielding an initial set of 256 models, which is then reduced using Occam’s window (see section 5.2). There were 52 models in Occam’s window, and these were used to calculate the BMA estimates of the regression coefficients.

Figure 5.2: Number of failures of CI and DI pipes with respect to soil resistivity and soil corrosivity index: (a) Soil Resistivity; (b) Soil Corrosivity Index [vertical axis: number of failures; series: CI, DI]

The posterior distributions of the coefficients of the CI pipes (NOPF > 0) model based on the model averaging results are shown in Figure 5.3. The posterior distributions of LENGTH, DIA, and log.RESIS. are centred away from zero, with a moderate spike at zero for log.RESIS., whereas those of FI, RD, and SCI are centred close to zero with a large spike at zero. The posterior distributions of VINT and LANUSE are centred at zero. Table 5.2 gives the mean, standard deviation (SD), and posterior probability of the coefficients of the CI pipes (NOPF > 0) model.
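The size of the initial model space and the covariate screening rule applied to the BMA output can be illustrated with a short sketch. It is only an illustration: the inclusion probabilities are copied from Table 5.2 for the CI pipes (NOPF > 0) model, the 50 per cent cut-off is the rule attributed to Viallefont et al. (2001) in the text, and the variable names `candidate_models`, `inclusion_probability`, and `selected` are introduced here for convenience. The sketch does not perform the model averaging itself.

```python
from itertools import combinations

# Candidate covariates considered in the BMA step (names as used in the text).
covariates = ["LENGTH", "DIA", "VINT", "FI", "RD", "log.RESIS.", "SCI", "LANUSE"]

# Every subset of the covariates (including the empty set) defines one candidate model.
candidate_models = [
    subset
    for size in range(len(covariates) + 1)
    for subset in combinations(covariates, size)
]

# Posterior inclusion probabilities Pr(beta_i != 0 | D) in per cent,
# copied from Table 5.2 for the CI pipes (NOPF > 0) model.
inclusion_probability = {
    "LENGTH": 100.0, "DIA": 99.8, "VINT": 1.0, "FI": 24.9,
    "RD": 33.2, "log.RESIS.": 83.2, "SCI": 8.2, "LANUSE": 3.2,
}

THRESHOLD = 50.0  # per cent; variables below this are treated as not significant

# Keep the covariates whose inclusion probability reaches the threshold.
selected = [name for name in covariates if inclusion_probability[name] >= THRESHOLD]

print(len(candidate_models))  # size of the initial model space
print(selected)               # covariates retained for the BWPHM
```

With eight candidate covariates the enumeration gives the 256 initial models mentioned in the text, and the screening step retains LENGTH, DIA, and log.RESIS. for this stratum.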
According to Table 5.2, the posterior probabilities of LENGTH and DIA are 100% and 99.8%, respectively, which indicates that these two variables are highly significant for the CI pipes (NOPF > 0) model. With a posterior probability of 83.2%, log.RESIS. is also a significant factor for this model. On the other hand, the posterior probabilities of FI and RD are 24.9% and 33.2%, respectively, whereas those of VINT, LANUSE, and SCI are very small, within the range of 1.0%-8.2%. Variables with posterior probabilities below 50 per cent are regarded as ‘not significant’ (Viallefont et al. 2001), so none of these five variables is significant enough to be considered for the CI pipes (NOPF > 0) model. Therefore, LENGTH, DIA, and log.RESIS. are finally selected for the CI pipes (NOPF > 0) model. The influential covariates for the remaining three strata models are determined in the same way.

Figure 5.3: Posterior distribution of the coefficients for CI pipes (NOPF > 0) based on BMA

Table 5.2: Mean, SD and posterior probabilities of the coefficients for CI pipes (NOPF > 0) based on BMA

| Variables | Mean | SD | Posterior probability |
|---|---|---|---|
| LENGTH | 1.4e-03 | 0.00015 | 100.0% |
| DIA | 9.1e-04 | 0.00021 | 99.8% |
| VINT | 4.7e-05 | 0.00375 | 1.0% |
| FI | 7.0e-05 | 0.00013 | 24.9% |
| RD | 6.2e-04 | 0.00097 | 33.2% |
| log.RESIS. | -1.1e-01 | 0.06240 | 83.2% |
| SCI | -6.8e-04 | 0.00257 | 8.2% |
| LANUSE | 1.5e-03 | 0.01015 | 3.2% |

Mean, SD, and posterior probability denote E (β|D), SD (β|D), and Pr (βi ≠ 0|D), respectively.

The influential covariates identified by BMA are considered for BWPHM development. The final scale parameters of the different BWPHMs are chosen to give a 30% acceptance rate (Figure 5.4). Figure 5.4 shows the relationship between the scale parameter and the acceptance rate for both CI and DI pipe strata.
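The link between a proposal scale parameter and the acceptance rate can be sketched as follows. This is a minimal illustration under stated assumptions: it assumes a random-walk Metropolis sampler (the sampler is not named in this passage), and it uses a standard normal density as a stand-in target rather than the BWPHM posterior. The function names, the grid of scales, and the seed are all hypothetical.

```python
import math
import random

def log_target(x):
    # Stand-in log posterior (standard normal, up to a constant); NOT the BWPHM posterior.
    return -0.5 * x * x

def acceptance_rate(scale, n_iter=20000, seed=1):
    """Run a random-walk Metropolis chain and return the fraction of accepted proposals."""
    rng = random.Random(seed)
    x = 0.0
    accepted = 0
    for _ in range(n_iter):
        proposal = x + rng.gauss(0.0, scale)  # symmetric proposal with step size `scale`
        log_ratio = log_target(proposal) - log_target(x)
        # Metropolis rule: always accept uphill moves, accept downhill moves with prob exp(log_ratio).
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x = proposal
            accepted += 1
    return accepted / n_iter

# Hypothetical grid of scale parameters, similar in range to the axis of Figure 5.4.
scales = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
rates = {s: acceptance_rate(s) for s in scales}

# Pick the scale on the grid whose acceptance rate is closest to the 30% target.
TARGET_RATE = 0.30
best_scale = min(scales, key=lambda s: abs(rates[s] - TARGET_RATE))

for s in scales:
    print(f"scale={s:.1f}  acceptance rate={100 * rates[s]:.1f}%")
```

Larger proposal steps are rejected more often, so the acceptance rate falls as the scale grows; tuning then amounts to reading off the scale that gives the desired rate, which is what a curve such as Figure 5.4 supports.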
Figure 5.4: Selection of scale parameters for BWPHMs [horizontal axis: scale parameter; vertical axis: acceptance rate (%); series: CI (NOPF > 0), CI (NOPF = 0), DI (NOPF > 0), DI (NOPF = 0)]

The posterior distributions of the coefficients of the BWPHM for CI pipes (NOPF > 0), obtained from simulated draws from the joint posterior distribution, are shown in Figure 5.5. The posterior distributions of the coefficients of the other BWPHMs are determined in the same way. Table 5.3 presents the mean and SD of the coefficients of the BWPHMs for the CI and DI pipe strata with NOPF = 0 and NOPF > 0.

Figure 5.5: Histogram of the coefficients of BWPHM for CI pipes (NOPF > 0)

Table 5.3 shows the significant covariates (from BMA) and the corresponding coefficients of each model for the CI and DI pipe strata. Past studies have found that the number of breaks increases considerably with pipe length (Le Gat and Eisenbeis 2000; Røstum 2000). Table 5.3 also indicates that pipe length marginally increases the rate of breakage (except for DI, NOPF = 0); that is, longer pipes break more often than shorter pipes for both CI and DI pipes, with either one or more breaks. For DI pipes, the risk of breakage is higher for small diameter pipes (< 200 mm) than for large diameter pipes. This result agrees with Figure 5.1(b) and with previous studies such as Fuchs-Hanusch et al. (2012), Kleiner and Rajani (2010), Park et al. (2008a), and Boxall et al. (2007). The results show that DI pipes are more affected by vintage than CI pipes, as most of the DI pipes were installed between 1960 and 1985. DI pipes installed between 1960 and 1976 are more prone to failure than pipes installed outside this period.
Table 5.3: Mean and SD of the coefficients of BWPHMs for CI and DI pipe strata

| Covariate | CI, NOPF = 0: Mean | CI, NOPF = 0: SD | CI, NOPF > 0: Mean | CI, NOPF > 0: SD | DI, NOPF = 0: Mean | DI, NOPF = 0: SD | DI, NOPF > 0: Mean | DI, NOPF > 0: SD |
|---|---|---|---|---|---|---|---|---|
| Intercept | 7.17E+00 | 3.11E-02 | 4.29E+00 | 2.76E-01 | 6.64E+00 | 3.22E-01 | 4.76E+00 | 9.98E-02 |
| LENGTH | -3.46E-03 | 1.35E-04 | -1.36E-03 | 1.39E-04 | -2.85E-01 | 2.52E-02 | -8.81E-04 | 2.45E-04 |
| DIA | 0 | 0 | -9.52E-04 | 2.01E-04 | 1.74E-03 | 2.60E-04 | 0 | 0 |
| VINT | 0 | 0 | 0 | 0 | -5.62E-01 | 3.72E-02 | -3.02E-01 | 7.62E-02 |
| FI | -1.46E-03 | 1.26E-04 | 0 | 0 | -2.73E-03 | 1.42E-04 | 0 | 0 |
| RD | 1.02E-01 | 2.92E-02 | 0 | 0 | -1.53E-02 | 3.72E-04 | 0 | 0 |
| log.RESIS. | 0 | 0 | 1.20E-01 | 3.59E-02 | 1.72E-01 | 3.77E-02 | 0 | 0 |
| SCI | 0 | 0 | 0 | 0 | 0 | 0 | -1.77E-02 | 3.80E-03 |
| LANUSE | 1.82E-01 | 4.84E-02 | 0 | 0 | 4.72E-01 | 3.90E-02 | 2.30E-01 | 5.92E-02 |

The results also show that FI and RD affect the occurrence of the first failure more than successive failures. Low RD was found to marginally decrease the likelihood of pipe breakage, compared with high RD, through reduced shrinkage of expansive soil. Fuchs-Hanusch et al. (2013) and Kleiner and Rajani (2002) obtained similar results. Brander (2001) also mentioned that the impact of RD is limited to areas with expansive soils (clay) because of the heterogeneity of soils in the City of Calgary water system. On the other hand, the negative coefficients of FI indicate that more breaks are expected as temperature decreases for both CI and DI pipes. This result agrees with the findings of Kleiner and Rajani (2010) and Kleiner and Rajani (2002). However, the increase in hazard due to FI is marginal because the 3 m installation depth in the city masks the influence of temperature by limiting the penetration of frost loads (Brander 2001). Zhao et al. (2001) also mentioned that the extent of frost penetration is influenced by the type of backfill material and the burial depth of the pipe. Table 5.3 shows that metallic pipes installed in soils with high resistivity (> 3,000 Ω-cm) are less likely to fail than those installed in soils with low resistivity (< 1,500 Ω-cm).
After a CI pipe has experienced more than one break, soil resistivity dominates the other covariates for successive failures. An increase in hazard rate is observed for DI pipes in soils with low resistivity. Brander (2001) confirmed this observation, reporting that most breaks in the City of Calgary water system occur in areas with highly corrosive soils rather than in areas with noncorrosive soils. Similar findings were obtained by Liu et al. (2010), Kleiner and Rajani (2002), and Makar et al. (2001). Table 5.3 also reveals that DI pipes are more sensitive to the soil corrosivity index than CI pipes, which is consistent with Figure 5.2(b). Table 5.3 indicates that, after a DI pipe has experienced more than one break, a high soil corrosivity index increases the hazard of the pipe while low resistivity decreases it. Table 5.3 also shows that DI pipes are more affected by land use than CI pipes. More breaks occurred in pipes serving commercial units than in those serving residential units because of the small diameter pipes. The number of breaks increases as the number of residential units served increases.

Based on the mean and standard deviation of the coefficients presented in Table 5.3, survival curves are determined for the CI and DI pipe strata with NOPF = 0 and NOPF > 0. Figures 5.6 and 5.7 show the 2.5th, 50th, and 97.5th percentile survival curves, i.e., the posterior median and 95% Bayesian interval estimates of survival, for the CI and DI pipe strata with NOPF = 0 and NOPF > 0. The survival curves for pipes with previous breaks (NOPF > 0) are steeper than those for pipes with no previous breaks (NOPF = 0). Hence, the survival time of CI and DI pipes with NOPF = 0 is higher than that of pipes with NOPF > 0, which is normal for repairable components. The results indicate that after the first break, the likelihood of a pipe experiencing more breaks is higher. Similar results were found by Fuchs-Hanusch et al. (2012), Park et al.
(2008), and Pelletier et al. (2003).

Figures 5.6 and 5.7 also indicate that the survival time of DI pipes is lower than that of CI pipes; that is, CI pipes have a better survival rate than DI pipes for both NOPF = 0 and NOPF > 0. Figure 5.1(a) supports this finding, as no breaks are found after 50 years of age for DI pipes. The 95% Bayesian credible interval (BCI) also provides the range of uncertainty needed for effective decision making.

As most previous Bayesian studies considered only pipe age in their analysis (Dridi et al. 2009, 2005; Economou et al. 2007; Watson et al. 2004), Figures 5.6 and 5.7 also show the 2.5th, 50th, and 97.5th percentile survival curves of BWPHMs that consider only pipe age (BWPHM-AGE) for the CI and DI pipe strata with NOPF = 0 and NOPF > 0. The survival times of the BWPHM-AGE models are lower than those of the proposed BWPHMs for both CI and DI pipe strata with NOPF = 0 and NOPF > 0. Moreover, because only one variable (pipe age) is considered, the 95% Bayesian interval, or range of uncertainty, is narrower than that of the proposed BWPHMs. Figures 5.6 and 5.7 also show that the survival times of both CI and DI pipes with NOPF = 0 are higher than those with NOPF > 0 and that CI pipes have a better survival rate than DI pipes for both NOPF = 0 and NOPF > 0.

Figure 5.6: Survival curves for CI pipe strata with NOPF=0 and NOPF>0

Figure 5.7: Survival curves for DI pipe strata with NOPF=0 and NOPF>0

In order to check the predictive capability of the proposed BWPHMs, the entire data set is randomly divided into five testing data sets. If the data used for testing came from a year that was biased for some reason (e.g., a particularly cold or dry year), this would not be a good test of the model. To avoid this problem, the complete data set was divided randomly into five testing data sets, each containing 20% of the full data set.
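The percentile survival curves shown in Figures 5.6 and 5.7 can be illustrated by a short sketch that turns posterior draws into a median curve and a 95% band. Everything numerical here is hypothetical: the draws are simulated rather than taken from the fitted BWPHMs, and the survival form S(t) = exp(-λ t^α) with λ = exp(linear predictor) is an assumed Weibull parameterization used only for illustration (section 5.2 defines the form actually used in this work).

```python
import math
import random

def weibull_survival(t, alpha, linear_predictor):
    # Assumed parameterization: S(t) = exp(-lambda * t**alpha), lambda = exp(linear predictor).
    return math.exp(-math.exp(linear_predictor) * t ** alpha)

def percentile(sorted_values, q):
    # Nearest-rank percentile of an already sorted list (q in per cent).
    rank = max(1, math.ceil(q / 100.0 * len(sorted_values)))
    return sorted_values[rank - 1]

rng = random.Random(0)

# Hypothetical posterior draws of (shape alpha, linear predictor) for one pipe stratum.
draws = [(abs(rng.gauss(1.2, 0.05)), rng.gauss(-4.5, 0.2)) for _ in range(2000)]

ages = list(range(0, 61, 5))  # pipe age in years

# For each age, evaluate S(t) for every draw, then take the 2.5th, 50th, and 97.5th percentiles.
bands = []
for t in ages:
    values = sorted(weibull_survival(t, alpha, lp) for alpha, lp in draws)
    bands.append((percentile(values, 2.5), percentile(values, 50.0), percentile(values, 97.5)))

for t, (lower, median, upper) in zip(ages, bands):
    print(f"age={t:2d}  2.5%={lower:.3f}  50%={median:.3f}  97.5%={upper:.3f}")
```

Taking the percentiles age by age, rather than plugging percentile coefficients into the survival function, keeps the band a statement about the survival probability itself.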
Figures 5.8 and 5.9 present the observed breaks and predicted uncertainty bounds of the BWPHMs for the CI and DI pipe strata, respectively. Figure 5.8 shows the observed breaks and the predicted breaks at the 2.5%, 50%, and 97.5% BCI levels, i.e., the median and 95% uncertainty bounds of the proposed BWPHMs, for the five testing data sets of the CI pipe strata. For the five testing data sets, the 95% uncertainty bounds capture most of the observed breaks, especially the early and late breaks. Because of the high number of breaks in CI pipes at ages of 21-30 years (Figure 5.1(a)), some data points lie outside the 95% uncertainty bounds in all five testing data sets. Apart from these, the proposed BWPHMs predict relatively well for early failures and for old CI pipes. On the other hand, Figure 5.9 shows that the performance of the BWPHMs is significantly better for the DI pipe strata. For the five testing data sets, very few data points lie outside the 95% uncertainty bounds. Therefore, for DI pipes, the proposed BWPHMs predict well over the entire time period.

Figure 5.8: Observed breaks and predicted uncertainty bounds of BWPHM for CI pipes

Figure 5.9: Observed breaks and predicted uncertainty bounds of BWPHM for DI pipes

To evaluate the performance of the BWPHM against the BWPHM-AGE (considering only age) and Cox-PHM models, the 5-fold cross validation method is used. The traditional holdout method is not suitable when data or information are limited, which is common for small to medium sized utilities. Moreover, for a single train-and-test experiment, the holdout estimate of the error rate will be misleading if an “unfortunate” split happens (Efron and Tibshirani 1997). The 5-fold cross validation method eventually uses all the data for both training and testing, and the true error estimate is obtained as the average of the five test estimates (Efron and Tibshirani 1997).
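The 5-fold procedure described above can be sketched in a few lines. This is a generic illustration, not the code used for the BWPHMs: the break counts are hypothetical, the "model" is a deliberately trivial mean predictor, and the helper names `five_fold_indices` and `cross_validate` are introduced here.

```python
import random

def five_fold_indices(n_records, n_folds=5, seed=0):
    """Shuffle record indices and deal them into n_folds disjoint test sets."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    return [indices[k::n_folds] for k in range(n_folds)]

def cross_validate(records, fit, error, n_folds=5, seed=0):
    """Train on four folds, test on the held-out fold, and average the test errors."""
    folds = five_fold_indices(len(records), n_folds, seed)
    fold_errors = []
    for test_idx in folds:
        held_out = set(test_idx)
        train = [records[i] for i in range(len(records)) if i not in held_out]
        test = [records[i] for i in test_idx]
        model = fit(train)
        fold_errors.append(error(model, test))
    return sum(fold_errors) / n_folds, fold_errors

def fit_mean(train):
    # Toy stand-in model: predict the mean number of breaks in the training data.
    return sum(train) / len(train)

def mean_absolute_error(model, test):
    return sum(abs(y - model) for y in test) / len(test)

# Hypothetical break counts for 20 pipes (not data from the case study).
records = [1.0, 2.0, 2.0, 3.0, 1.0, 4.0, 2.0, 5.0, 3.0, 2.0,
           1.0, 2.0, 6.0, 2.0, 3.0, 1.0, 2.0, 4.0, 3.0, 2.0]

folds = five_fold_indices(len(records))
mean_error, fold_errors = cross_validate(records, fit_mean, mean_absolute_error)
print(fold_errors, mean_error)
```

Each record appears in exactly one test fold, so every observation is used once for testing and four times for training, and the reported error is the average of the five fold errors.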
The observed and predicted breaks for the CI and DI pipe strata using the BWPHM, BWPHM-AGE, and Cox-PHM models are shown in Figures 5.10 and 5.11, respectively. Figure 5.10 indicates that the Cox-PHM cannot properly capture the initial failures (due to installation problems, manufacturing defects, etc.) and overestimates failures until 43 years, while the BWPHM overestimates failures after 45 years. For the CI pipe strata, the BWPHM performs better than the Cox-PHM from installation to 43 years. Figure 5.11 shows that the BWPHM performs significantly better than the Cox-PHM for the DI pipe strata. The Cox-PHM underestimates the number of failures and predicts an almost constant number of failures after 13 years, whereas the BWPHM estimates the number of failures properly for almost 40 years. After 40 years, both the BWPHM and the Cox-PHM overestimate the number of failures because very few breaks occur.

On the other hand, the BWPHM-AGE model overestimates the number of breaks for both CI and DI pipes because of the lower survival time, or higher failure rate, shown in Figures 5.6 and 5.7. For CI pipes, the BWPHM-AGE model overestimates for almost 30 years, and the performance of the BWPHM and BWPHM-AGE models is almost the same thereafter. Thus, the BWPHM-AGE model fails to predict early failures but works relatively well for old CI pipes. For DI pipes, the BWPHM-AGE model overestimates over the entire life; especially after 30 years, the predicted number of breaks is much higher than that of the BWPHM and Cox-PHM models. Therefore, the performance of the BWPHM-AGE models is unsatisfactory compared to the BWPHM.

Figure 5.10: Observed and predicted breaks for CI pipe strata

Figure 5.11: Observed and predicted breaks for DI pipe strata

The overall match between observed and predicted values for each fold was assessed using MSE, MAE, RMSE, and mean relative absolute error (MRAE).
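As a quick illustration of three of these error measures, the sketch below implements MSE, MAE, and RMSE using their standard textbook definitions (assumed here to match Equations 4.7 to 4.9, which are not reproduced in this section) and checks that the RMSE values reported in Table 5.4 for the CI BWPHM are the square roots of the corresponding MSE values. MRAE is left out because it is defined separately in Equation 5.15. The function names and the small example vectors are hypothetical.

```python
import math

def mse(observed, predicted):
    # Mean squared error: average of squared differences.
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)

def mae(observed, predicted):
    # Mean absolute error: average of absolute differences.
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)

def rmse(observed, predicted):
    # Root mean squared error: square root of the MSE.
    return math.sqrt(mse(observed, predicted))

# Small hypothetical example (not data from the case study).
observed = [1.0, 2.0, 3.0]
predicted = [1.0, 2.0, 5.0]
print(mse(observed, predicted), mae(observed, predicted), rmse(observed, predicted))

# Consistency check on values copied from Table 5.4 (CI pipes, BWPHM, folds 1 to 5).
table_mse = [52.529, 55.678, 53.250, 67.342, 98.409]
table_rmse = [7.247, 7.461, 7.297, 8.206, 9.920]
max_gap = max(abs(math.sqrt(m) - r) for m, r in zip(table_mse, table_rmse))
print(max_gap)  # small gap: the tabulated RMSE is the square root of the tabulated MSE
```

Because RMSE is a monotone transform of MSE, the two measures always rank the models identically within a fold; MAE can rank them differently because it penalizes large errors less heavily.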
The equations for MSE, MAE, and RMSE are presented in Equations 4.7, 4.8, and 4.9, and the MRAE is calculated by the following equation:

$$MRAE = \frac{1}{n}\,\frac{\sum_{i=1}^{n} |O - P|}{\sum_{i=1}^{n} |O - \bar{P}|} \qquad (5.15)$$

where O and P are the observed and predicted values, respectively, $\bar{P}$ is the mean of P, and n is the number of observations. Table 5.4 compares the BWPHM, BWPHM-AGE, and Cox-PHM models for the CI and DI strata. For CI pipes, the MRAE of the Cox-PHM (1.409) for fold 5 is lower than that of the BWPHM (2.696) and BWPHM-AGE (3.113), whereas the MRAE of the BWPHM-AGE (1.884) is slightly lower than that of the BWPHM (1.995) for fold 3. For fold 3 of the DI pipe strata, the MRAEs of the BWPHM (1.374) and Cox-PHM (3.304) are higher than that of the BWPHM-AGE (1.051). Apart from these cases, the model errors of the BWPHM are lower than those of the BWPHM-AGE and Cox-PHM for both CI and DI pipe strata. It can therefore be concluded that the BWPHM performs noticeably better than the BWPHM-AGE and Cox-PHM models.

Table 5.4: Performance of BWPHM and Cox-PHM for break prediction

| Model error | Material | Model | Fold 1 | Fold 2 | Fold 3 | Fold 4 | Fold 5 |
|---|---|---|---|---|---|---|---|
| MSE | CI | BWPHM | 52.529 | 55.678 | 53.250 | 67.342 | 98.409 |
| MSE | CI | BWPHM-AGE | 186.631 | 185.004 | 193.320 | 116.505 | 227.204 |
| MSE | CI | Cox-PHM | 93.375 | 98.088 | 108.752 | 71.200 | 101.960 |
| MSE | DI | BWPHM | 78.930 | 91.983 | 75.650 | 87.795 | 67.153 |
| MSE | DI | BWPHM-AGE | 139.888 | 133.830 | 134.772 | 155.893 | 149.308 |
| MSE | DI | Cox-PHM | 191.766 | 180.023 | 171.545 | 191.586 | 172.412 |
| MAE | CI | BWPHM | 5.820 | 6.125 | 5.470 | 6.196 | 7.115 |
| MAE | CI | BWPHM-AGE | 8.701 | 8.615 | 8.501 | 7.893 | 9.304 |
| MAE | CI | Cox-PHM | 7.529 | 7.646 | 8.206 | 6.520 | 7.700 |
| MAE | DI | BWPHM | 5.970 | 5.511 | 6.207 | 5.755 | 5.335 |
| MAE | DI | BWPHM-AGE | 8.486 | 8.205 | 8.422 | 8.846 | 8.277 |
| MAE | DI | Cox-PHM | 10.369 | 9.806 | 9.506 | 10.092 | 9.340 |
| RMSE | CI | BWPHM | 7.247 | 7.461 | 7.297 | 8.206 | 9.920 |
| RMSE | CI | BWPHM-AGE | 13.661 | 13.601 | 13.903 | 10.793 | 15.073 |
| RMSE | CI | Cox-PHM | 9.663 | 9.903 | 10.428 | 8.438 | 10.097 |
| RMSE | DI | BWPHM | 8.884 | 9.590 | 8.697 | 9.369 | 8.194 |
| RMSE | DI | BWPHM-AGE | 11.827 | 11.568 | 11.609 | 12.485 | 12.219 |
| RMSE | DI | Cox-PHM | 13.847 | 13.417 | 13.097 | 13.841 | 13.130 |
| MRAE | CI | BWPHM | 2.037 | 1.103 | 1.995 | 1.180 | 2.696 |
| MRAE | CI | BWPHM-AGE | 2.135 | 1.581 | 1.884 | 1.264 | 3.113 |
| MRAE | CI | Cox-PHM | 3.013 | 3.403 | 2.880 | 1.737 | 1.409 |
| MRAE | DI | BWPHM | 1.152 | 0.557 | 1.374 | 0.550 | 0.613 |
| MRAE | DI | BWPHM-AGE | 1.182 | 0.712 | 1.051 | 0.637 | 1.029 |
| MRAE | DI | Cox-PHM | 3.336 | 1.546 | 3.304 | 1.807 | 3.900 |

The proposed BWPHMs can also estimate the survival functions of individual pipes. For example, Figure 5.12 presents the estimated survival functions of two individual pipes, pipe IDs M21738 and A16386-01, at the 2.5%, 50%, and 97.5% BCI levels. Table 5.5 gives the description and failure history of these pipes. Table 5.5 indicates that the diameter and length of M21738 are greater than those of A16386-01, while its soil resistivity is lower. Pipe M21738 has already experienced 16 breaks (15 leaks and 1 corrosion failure), whereas A16386-01 has failed twice, once by a leak and once by a circumferential break. For this reason, the survival time of M21738 is much lower than that of A16386-01 (Figure 5.12). Table 5.5 also shows that M21738 and A16386-01 last failed in 2005 and 1970, respectively; according to Figure 5.12, the utility can therefore expect these two water mains to fail within a couple of years.

Figure 5.12: Survival function estimation of M21738 and A16386-01

Table 5.5: Description and failure history of M21738 and A16386-01 pipes

| Pipe ID | Material | Diameter | Length | Installation year | Soil resistivity | Failure history and type |
|---|---|---|---|---|---|---|
| M21738 | CI | 600 mm | 817 m | 1951 | 1428 Ω-cm | Corrosion: 1985; Leak: 1963, 1967, 1969, 1971, 1972, 1973, 1976, 1976, 1977, 1978, 1979, 1983, 1986, 1990, 2005 |
| A16386-01 | CI | 150 mm | 139 m | 1954 | 2380 Ω-cm | Circumferential: 1969; Leak: 1970 |

5.4 Summary

Accurate quantification of uncertainty is necessary for improving our understanding of water mains’ failure processes. To this end, this study has sought to develop Bayesian water main failure prediction models for water distribution systems that account for uncertainty. In this study, BMA is used to provide insight into selecting influential and appropriate covariates, and the BWPHM is applied to develop survival curves for CI and DI pipes using 57 years of historical data collected for the City of Calgary.
The results indicated that CI and DI water mains respond differently to the covariates. The results also showed that the survival times of CI and DI pipes with NOPF = 0 are higher than those with NOPF > 0. After the first break, soil resistivity is the most influential parameter increasing the hazard of CI pipes, whereas DI pipes are more sensitive to the soil corrosivity index. The DI pipes installed during 1960 to 1985 are more prone to failure. Because of the 3 m installation depth in the city, the effect of temperature on pipe failures is marginal compared to the physical and soil parameters. FI and RD were found to have more impact on the occurrence of the first failure than on successive failures.

Utility managers or authorities can use the proposed model to obtain information at both the network and operation levels. At the network level, they can identify the number of water mains that need immediate maintenance, repair or replacement (M/R/R) actions in order to address the structural and hydraulic failure of water mains proactively. The developed survival curves for CI and DI water mains can be integrated with an economic assessment model (e.g., life cycle costing) to estimate the costs of M/R/R and to develop optimal M/R/R plans while meeting level of service, financial constraints, and regulatory requirements. At the operation level, the model can estimate the survival functions, with their range of uncertainty, of individual pipes within the distribution network and help in taking appropriate preventive or corrective action. The proposed approach can also incorporate the judgement of utility managers or authorities as a prior distribution and assess the overall impact of different management strategies on the long-term plan as well as on the level of funding or investment needed to maintain specific structural and hydraulic performance levels.
Chapter 6 Bayesian Updating Based Failure Model

A version of this chapter has been submitted to Knowledge-Based Systems with the title “Predicting Water Main Failures: A Bayesian Model Updating Approach.” The lead author is Golam Kabir and the coauthors are Dr. Solomon Tesfamariam, Dr. Jason Loeppky and Dr. Rehan Sadiq.

6.1 Background

In the previous chapter, a BWPHM was developed to produce survival curves for the failure prediction of water mains. However, no existing water main failure prediction model or study provides a formal guideline or framework for dealing with new data or information, or attempts to improve or update the performance of the model whenever new data or information become available. The objective of this study is to develop an effective Bayesian updating based water main failure prediction framework that not only incorporates the uncertainties but also provides a rational procedure for updating the performance of the model. The applicability of the proposed framework is illustrated with the City of Calgary pipe failure data. The framework supports efficient probabilistic updating and the assessment of model performance in light of uncertain and evolving information, particularly for post-event failures or pipe replacement.

6.2 Methodology

The proposed Bayesian updating framework for water main failure prediction is shown in Figure 6.1. The first step entails gathering pipe characteristics data, soil information, and pipe breakage data from the water utility’s GIS. In the second step, the influential and significant covariates are selected using the BMA approach. For model development, the entire water main failure dataset is divided into multiple periods. The BWPHM based water main failure prediction models are then developed using the failure data of the first period. The detailed methodology of the BWPHM is provided in section 5.2.
After that, the model parameters are updated by the Bayesian updating approach using the water main failure data of the second period. The same Bayesian updating approach is followed for the water main failure data of the remaining periods. Finally, the performance of the models after each period is determined. Each step is discussed in detail in the case study section with an appropriate example.

Figure 6.1: Bayesian updating framework for the water main failure prediction [flowchart stages: Data Collection; Covariate Selection (Bayesian Model Averaging, BMA); Model Development & Updating (Bayesian Weibull Proportional Hazard Model); Model Performance Evaluation. Updating chain: 1st time period, non-informative prior to Posterior 1; 2nd time period, informative prior 1 to Posterior 2; 3rd time period, informative prior 2 to Posterior 3; Nth time period, informative prior (N-1) to Posterior N; Prediction]

6.3 A Case study for City of Calgary

In order to demonstrate the applicability of the proposed approach, it was applied to the water distribution network of the City of Calgary.

6.3.1 Data collection

The details of the water distribution network of the City of Calgary are presented in the previous chapter. Because almost 96% of breaks occurred in CI and DI pipes, the authorities are more concerned about these pipes. Therefore, only CI and DI pipes are considered for further analysis in this study. Pipe length (LENGTH), diameter (DIA), vintage (VINT), land use (LANUSE), freezing index (FI), rain deficit (RD), soil resistivity (RESIS), and soil corrosivity index (SCI) were considered as the explanatory variables. The effects of these variables on water main breaks are discussed in Kabir et al. (2015a) and Kabir et al. (2015d). The summary statistics of the variables are presented in Table 5.1.

6.3.2 Covariate selection

The effect of the explanatory variables on CI and DI pipes differs significantly.
In order to establish the influence of the covariates, the data were first stratified by material type (CI and DI). According to Kabir et al. (2015e), Kimutai et al. (2015), Alvisi and Franchini (2010), Pelletier et al. (2003), Le Gat and Eisenbeis (2000) and Mailhot et al. (2000), pipe failure behaviors differ between the first break and subsequent breaks, and the likelihood of a pipe experiencing more breaks is much higher after the first break. Therefore, the CI and DI pipes were further stratified by the number of observed previous failures (NOPF): no previous failure (NOPF = 0) and one or more previous failures (NOPF > 0). BMA was then used to identify the influential covariates for the different models. The concept of BMA and the development of the BMA models are discussed in more detail in the previous chapter. The BMA approach started with an initial set of 256 models, which was reduced to 52 models by the use of Occam’s window.

Figure 6.2 shows the posterior distributions of the coefficients of the DI pipes (NOPF > 0) model based on the model averaging results. The spike at 0 is an artifact of the approach developed by Raftery et al. (1997) and Viallefont et al. (2001), in which it is possible to consider models with a predictor or variable fully removed from the model. The posterior distributions of VINT and LANUSE were centred away from 0, and those of LENGTH and SCI had a very small spike at 0, whereas those of log.RESIS. and DIA were centred close to zero with a large spike at zero. The posterior distributions of FI and RD were centred at zero. The posterior distributions of the coefficients of the other models were generated in the same way.
Figure 6.2: Posterior distribution of the coefficients for DI pipes (NOPF > 0)

Tables 6.1 and 6.2 give the mean (E (β|D)), standard deviation (SD (β|D)), and posterior probability (Pr (βi ≠ 0|D)) of the coefficients of the different models for DI and CI pipes, respectively. According to Table 6.1, the posterior probabilities of the coefficients of VINT and LANUSE are 97.7% and 97.8%, respectively, which indicates that these two variables are highly significant for the DI pipes (NOPF > 0) model.

Table 6.1: Mean, SD and posterior probabilities of the coefficients for DI pipes using BMA

| Variables | DI (NOPF > 0): Mean | DI (NOPF > 0): SD | DI (NOPF > 0): Posterior probability | DI (NOPF = 0): Mean | DI (NOPF = 0): SD | DI (NOPF = 0): Posterior probability |
|---|---|---|---|---|---|---|
| LENGTH | 6.3E-04 | 3.7E-04 | 81.5% | 1.5E-03 | 1.40E-04 | 100.0% |
| DIA | -3.5E-05 | 1.6E-04 | 6.1% | -2.2E-03 | 3.60E-04 | 100.0% |
| VINT | 2.5E-01 | 7.9E-02 | 97.7% | 8.0E-01 | 5.17E-02 | 100.0% |
| FI | 1.2E-06 | 2.2E-05 | 1.7% | 3.7E-03 | 1.90E-04 | 100.0% |
| RD | -1.2E-05 | 1.3E-04 | 2.1% | 2.1E-02 | 4.30E-04 | 100.0% |
| log.RESIS. | -2.9E-02 | 7.6E-02 | 14.7% | -2.5E-01 | 5.25E-02 | 100.0% |
| SCI | 1.3E-02 | 5.9E-03 | 88.0% | -2.9E-05 | 7.80E-04 | 2.2% |
| LANUSE | -2.0E-01 | 6.2E-02 | 97.8% | -7.8E-01 | 5.02E-02 | 100.0% |

Table 6.2: Mean, SD and posterior probabilities of the coefficients for CI pipes using BMA

| Variables | CI (NOPF > 0): Mean | CI (NOPF > 0): SD | CI (NOPF > 0): Posterior probability | CI (NOPF = 0): Mean | CI (NOPF = 0): SD | CI (NOPF = 0): Posterior probability |
|---|---|---|---|---|---|---|
| LENGTH | 1.4E-03 | 1.50E-04 | 100.0% | 4.3E-03 | 1.70E-04 | 100.0% |
| DIA | 9.1E-04 | 2.10E-04 | 99.8% | -9.3E-06 | 7.00E-05 | 3.1% |
| VINT | 4.7E-05 | 3.75E-03 | 1.0% | 2.9E-04 | 5.25E-03 | 1.5% |
| FI | 7.0E-05 | 1.30E-04 | 24.9% | 1.8E-03 | 1.60E-04 | 100.0% |
| RD | 6.2E-04 | 9.70E-04 | 33.2% | 1.2E-02 | 6.60E-04 | 100.0% |
| log.RESIS. | -1.1E-01 | 6.24E-02 | 83.2% | -5.5E-04 | 7.96E-03 | 1.8% |
| SCI | -6.8E-04 | 2.57E-03 | 8.2% | -2.9E-04 | 1.52E-03 | 5.1% |
| LANUSE | 1.5E-03 | 1.02E-02 | 3.2% | -1.1E-01 | 5.52E-02 | 85.6% |

LENGTH and SCI are also significant factors for the DI pipes (NOPF > 0) model, with posterior probabilities of 81.5% and 88.0%, respectively.
On the other hand, the posterior probability of log.RESIS. is 14.7%, whereas the posterior probabilities of DIA, FI, and RD are very small, within the range of 1.7%-6.1%. According to Viallefont et al. (2001), variables with posterior probabilities below 50 per cent are ‘not significant’; therefore, none of these variables is significant enough to be considered for the DI pipes (NOPF > 0) model. Accordingly, LENGTH, VINT, SCI, and LANUSE were finally selected for the DI pipes (NOPF > 0) model. The significant or influential covariates for the remaining three strata models were determined in the same way.

6.3.3 Model development

For Bayesian updating, the 58 years of pipe break data of the City of Calgary were arbitrarily divided into four periods: 1st period: 1956-1980 (25 years), 2nd period: 1981-1998 (18 years), 3rd period: 1999-2008 (10 years), and 4th period: 2009-2013 (5 years). A different data division strategy could also be adopted. In the first stage of the analysis, the failure data from 1956-1980 were used for Bayesian model development, where all the model parameters were treated as random variables with non-informative prior distributions. After developing the model with the 25 years of failure data, the posterior distributions of the model parameters were determined. The posterior distributions of the parameters from the 1st period were then used as informative priors for the 2nd period, and the models were updated using the 18 years of failure data from 1981-1998. Similarly, the posteriors of the 2nd and 3rd periods were used as priors for the 3rd and 4th periods to update the model with the 10 years and 5 years of failure data from 1999-2008 and 2009-2013, respectively. After updating the models, the performance after each of the four periods was compared. Table 6.3 gives the number of datasets in the four periods for the different models of DI and CI pipes. According to Figure 3.2, very few DI pipes (156) were installed before 1966, whereas very few CI pipes (505) were installed after 1966.
The installation of DI pipes dominated that of CI pipes after 1966, and no CI pipes were installed after 1970. For this reason, the number of CI pipes that experienced single or multiple breaks is relatively higher than that of DI pipes during 1956-1980. Table 6.3 also indicates that the number of pipes that experienced successive breaks increased significantly for both DI and CI pipes. The number of breaks for the 4th period (2009-2013) decreased for both DI and CI pipes due to the replacement of metallic pipes with PVC pipes and the retrofitting program of DI pipes.

Table 6.3: Number of datasets for four periods for different models

| Period | DI pipe, NOPF > 0 | DI pipe, NOPF = 0 | CI pipe, NOPF > 0 | CI pipe, NOPF = 0 |
|---|---|---|---|---|
| After 1st period | 72 | 692 | 983 | 3,090 |
| After 2nd period | 1,468 | 5,968 | 1,658 | 2,606 |
| After 3rd period | 660 | 5,477 | 1,022 | 780 |
| After 4th period | 147 | 790 | 521 | 296 |

Initially, the datasets of the 1st period (1956-1980) were used for the BWPHM development. The BWPHM development and analysis were performed in WinBUGS, a software package for Bayesian inference using Gibbs sampling. Figure 6.3(a) shows the posterior distribution of the coefficients for CI pipes (NOPF > 0) After 1st period, where alpha, beta0, beta1, beta2, and beta3 indicate the scale parameter, intercept, and regression coefficients for LENGTH, DIA, and log.RESIS., respectively.

(a) After 1st period (b) After 4th period
Figure 6.3: Posterior distribution of the coefficients for CI pipes (NOPF > 0)

As there was no information about the prior, a normal distribution with mean 0.0 and precision 0.001 (variance 1,000) was used for beta0, beta1, beta2, and beta3, whereas a Gamma distribution with mean 1.0 and precision 0.001 was used for alpha, the scale parameter. The chain was run with a burn-in of 1,000 iterations, 100,000 retained draws, and thinning to every 100th draw. The posterior distributions of the coefficients presented in Figure 6.3(a) were used as informative priors (using their means and standard deviations) for the Bayesian updating with the datasets of the 2nd period.
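The period-by-period updating described above can be illustrated with a deliberately simplified sketch. It is not the WinBUGS Weibull model used in this study: it assumes a single unknown mean with a known observation variance, so that the normal prior is conjugate and the posterior mean and variance of one period can be passed on exactly as the prior of the next. The class `NormalBelief`, the function `update`, the three data batches and the value of `OBS_VAR` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class NormalBelief:
    """Normal distribution summarised by its mean and variance."""
    mean: float
    var: float


def update(prior, observations, obs_var):
    """Conjugate normal update of an unknown mean (observation variance known)."""
    n = len(observations)
    if n == 0:
        return prior
    prior_precision = 1.0 / prior.var
    data_precision = n / obs_var
    post_var = 1.0 / (prior_precision + data_precision)
    post_mean = post_var * (prior_precision * prior.mean + sum(observations) / obs_var)
    return NormalBelief(post_mean, post_var)


# Vague prior mirroring the text: mean 0.0, precision 0.001 (variance 1,000).
vague_prior = NormalBelief(mean=0.0, var=1000.0)

# Hypothetical observations for three consecutive periods (illustration only).
periods = [[1.2, 0.8, 1.1], [0.9, 1.0], [1.3, 1.1, 0.7, 1.0]]
OBS_VAR = 0.25  # assumed known observation variance

# Sequential updating: the posterior of one period is the prior of the next.
belief = vague_prior
for batch in periods:
    belief = update(belief, batch, OBS_VAR)

# Reference: a single update with all observations pooled together.
pooled = update(vague_prior, [x for batch in periods for x in batch], OBS_VAR)
```

In this conjugate case the sequential and pooled results coincide. When only the mean and standard deviation of an MCMC posterior are carried forward, as in the procedure above, the next prior is an approximation of the full posterior.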
Similarly, the posterior distributions of the coefficients After 2nd period were used as informative priors for the next period. Figure 6.3(b) indicates the posterior distribution of the coefficients for CI pipes (NOPF > 0) after updating the model successively using the datasets of the 2nd, 3rd, and 4th periods. The mean, standard deviation, and Monte Carlo (MC) error of the coefficients of the BWPHMs after different periods for CI pipes (NOPF > 0) are presented in Table 6.4. Likewise, the mean, standard deviation, and MC error of the coefficients for CI pipes (NOPF = 0) and DI pipes (NOPF > 0 and NOPF = 0) are presented in Tables 6.5-6.7, respectively.

Based on the values presented in Tables 6.4-6.7, the survival curves after different periods were determined for both CI and DI pipes and are shown in Figures 6.4 and 6.5, respectively. Figures 6.4 and 6.5 indicate that the survival curves for pipes with no previous breaks (NOPF = 0) are less steep than the curves for pipes with previous breaks (NOPF > 0). The survival time of CI and DI pipes with NOPF > 0 is lower than with NOPF = 0, which is the normal phenomenon for repairable components like water mains. In other words, the likelihood of a pipe experiencing more breaks is higher after its first break, and this result is in agreement with previous studies such as Kabir et al. (2015b), Fuchs-Hanusch et al. (2012), Park et al. (2008), and Pelletier et al. (2003). According to the Figures 6.4 and 6.5, for both NOPF = 0 and NOPF > 0, the survival time of CI pipes is high compared to DI pipes or DI pipes has a better survival rate than CI pipes. Figures 6.4 and 6.5 also indicate that, for both NOPF = 0 and NOPF > 0, the survival time is much lower After 1st period for both CI and DI pipes.

Table 6.4: Mean, standard deviation, and MC error of the coefficients of BWPHMs for CI pipes (NOPF > 0)

| Period | Statistic | Alpha | Intercept | LENGTH | DIA | VINT | FI | RD | log.RESIS. | SCI | LANUSE |
|---|---|---|---|---|---|---|---|---|---|---|---|
| After 1st period | Mean | 1.032 | -3.114 | 7.85E-04 | 1.28E-03 | …. | …. | …. | -2.08E-01 | …. | …. |
| | SD | 2.59E-02 | 5.87E-01 | 2.62E-04 | 3.82E-04 | …. | …. | …. | 7.57E-02 | …. | …. |
| | MC Error | 1.40E-03 | 4.98E-02 | 7.37E-06 | 1.41E-04 | …. | …. | …. | 6.36E-03 | …. | …. |
| After 2nd period | Mean | 1.144 | -4.829 | 1.35E-03 | 3.13E-04 | …. | …. | …. | -1.16E-01 | …. | …. |
| | SD | 2.35E-02 | 3.51E-01 | 2.46E-04 | 3.25E-04 | …. | …. | …. | 4.35E-02 | …. | …. |
| | MC Error | 1.38E-03 | 3.25E-02 | 7.01E-06 | 1.17E-05 | …. | …. | …. | 4.01E-03 | …. | …. |
| After 3rd period | Mean | 1.142 | -5.311 | 1.63E-03 | 5.47E-04 | …. | …. | …. | 1.57E-02 | …. | …. |
| | SD | 4.08E-02 | 1.05 | 5.80E-04 | 6.66E-04 | …. | …. | …. | 1.29E-01 | …. | …. |
| | MC Error | 2.65E-03 | 1.03E-01 | 1.70E-05 | 2.61E-05 | …. | …. | …. | 1.57E-02 | …. | …. |
| After 4th period | Mean | 1.142 | -5.027 | 1.60E-03 | -5.43E-04 | …. | …. | …. | -1.42E-01 | …. | …. |
| | SD | 4.04E-02 | 9.18E-01 | 5.12E-04 | 6.87E-04 | …. | …. | …. | 1.14E-01 | …. | …. |
| | MC Error | 2.07E-03 | 4.92E-02 | 1.30E-05 | 2.08E-05 | …. | …. | …. | 6.15E-03 | …. | …. |

Table 6.5: Mean, standard deviation, and MC error of the coefficients of BWPHMs for CI pipes (NOPF = 0)

| Period | Statistic | Alpha | Intercept | LENGTH | DIA | VINT | FI | RD | log.RESIS. | SCI | LANUSE |
|---|---|---|---|---|---|---|---|---|---|---|---|
| After 1st period | Mean | 1.63 | -8.13 | 1.61E-04 | …. | …. | -4.13E-04 | -2.11E-03 | …. | …. | -1.66E-02 |
| | SD | 5.76E-02 | 2.99E-01 | 1.92E-04 | …. | …. | 1.26E-04 | 1.26E-03 | …. | …. | 3.75E-02 |
| | MC Error | 4.50E-03 | 2.34E-02 | 5.36E-06 | …. | …. | 1.38E-06 | 1.45E-05 | …. | …. | 6.11E-04 |
| After 2nd period | Mean | 1.80 | -1.03E+01 | 4.73E-04 | …. | …. | -3.37E-04 | -7.26E-03 | …. | …. | 3.70E-02 |
| | SD | 4.01E-02 | 2.40E-01 | 2.44E-04 | …. | …. | 1.51E-04 | 9.91E-04 | …. | …. | 4.20E-02 |
| | MC Error | 3.56E-03 | 2.13E-02 | 7.24E-06 | …. | …. | 1.65E-06 | 1.12E-05 | …. | …. | 8.53E-04 |
| After 3rd period | Mean | 1.10 | -5.58 | 1.60E-04 | …. | …. | 5.38E-04 | -1.06E-02 | …. | …. | 3.18E-02 |
| | SD | 4.48E-02 | 2.94E-01 | 4.84E-04 | …. | …. | 3.95E-04 | 2.56E-03 | …. | …. | 7.96E-02 |
| | MC Error | 3.26E-03 | 2.16E-02 | 1.72E-05 | …. | …. | 4.10E-06 | 3.05E-05 | …. | …. | 1.66E-03 |
| After 4th period | Mean | 1.24 | -7.10 | 1.95E-04 | …. | …. | 3.76E-04 | -1.10E-02 | …. | …. | -2.46E-03 |
| | SD | 3.68E-02 | 2.69E-01 | 8.01E-04 | …. | …. | 6.53E-04 | 4.05E-03 | …. | …. | 1.21E-01 |
| | MC Error | 2.05E-03 | 1.55E-02 | 2.44E-05 | …. | …. | 8.63E-06 | 3.90E-05 | …. | …. | 2.27E-03 |

Table 6.6: Mean, standard deviation, and MC error of the coefficients of BWPHMs for DI pipes (NOPF > 0)

| Period | Statistic | Alpha | Intercept | LENGTH | DIA | VINT | FI | RD | log.RESIS. | SCI | LANUSE |
|---|---|---|---|---|---|---|---|---|---|---|---|
| After 1st period | Mean | 1.024 | -3.28 | -2.48E-03 | …. | 7.33E-01 | …. | …. | …. | 6.83E-03 | -2.10E-01 |
| | SD | 9.37E-02 | 6.78E-01 | 1.38E-03 | …. | 5.48E-01 | …. | …. | …. | 1.86E-02 | 3.53E-01 |
| | MC Error | 3.65E-03 | 3.99E-02 | 4.04E-05 | …. | 3.00E-02 | …. | …. | …. | 5.17E-04 | 5.72E-03 |
| After 2nd period | Mean | 0.98 | -4.13 | 7.74E-04 | …. | -4.13E-02 | …. | …. | …. | 2.10E-02 | -1.36E-01 |
| | SD | 2.04E-02 | 1.66E-01 | 3.10E-04 | …. | 1.08E-01 | …. | …. | …. | 4.34E-03 | 7.30E-02 |
| | MC Error | 9.57E-04 | 1.09E-02 | 1.13E-05 | …. | 5.97E-03 | …. | …. | …. | 1.54E-04 | 1.11E-03 |
| After 3rd period | Mean | 1.110 | -5.08 | 1.09E-03 | …. | 1.78E-01 | …. | …. | …. | 1.41E-02 | -1.15E-01 |
| | SD | 3.41E-02 | 2.40E-01 | 4.21E-04 | …. | 1.19E-01 | …. | …. | …. | 6.31E-03 | 9.22E-02 |
| | MC Error | 1.71E-03 | 1.52E-02 | 1.41E-05 | …. | 4.64E-03 | …. | …. | …. | 2.08E-04 | 1.17E-03 |
| After 4th period | Mean | 1.103 | -4.76 | 8.79E-04 | …. | 3.02E-01 | …. | …. | …. | 1.78E-02 | -2.30E-01 |
| | SD | 6.75E-02 | 4.52E-01 | 9.70E-04 | …. | 1.82E-01 | …. | …. | …. | 1.44E-02 | 1.91E-01 |
| | MC Error | 3.85E-03 | 2.83E-02 | 3.37E-05 | …. | 3.86E-03 | …. | …. | …. | 4.11E-04 | 3.32E-03 |

Table 6.7: Mean, standard deviation, and MC error of the coefficients of BWPHMs for DI pipes (NOPF = 0)

| Period | Statistic | Alpha | Intercept | LENGTH | DIA | VINT | FI | RD | log.RESIS. | SCI | LANUSE |
|---|---|---|---|---|---|---|---|---|---|---|---|
| After 1st period | Mean | 1.102 | -2.98 | -4.98E-05 | -1.15E-03 | -4.27E-01 | -6.87E-04 | -1.01E-02 | -2.11E-01 | …. | -1.83E-01 |
| | SD | 3.74E-02 | 8.60E-01 | 4.35E-04 | 6.30E-04 | 1.20E-01 | 2.46E-04 | 2.18E-03 | 1.11E-01 | …. | 9.20E-02 |
| | MC Error | 1.73E-03 | 8.32E-02 | 1.47E-05 | 2.71E-05 | 4.85E-03 | 3.10E-06 | 2.87E-05 | 1.07E-02 | …. | 1.76E-03 |
| After 2nd period | Mean | 1.18 | -4.69 | 7.10E-04 | -1.14E-04 | -6.05E-03 | 1.51E-04 | -2.58E-05 | -2.09E-01 | …. | -1.67E-01 |
| | SD | 2.13E-02 | 2.49E-01 | 1.43E-04 | 2.23E-04 | 3.91E-02 | 1.29E-04 | 5.90E-04 | 2.74E-02 | …. | 3.24E-02 |
| | MC Error | 1.58E-03 | 2.31E-02 | 4.40E-06 | 9.33E-06 | 1.55E-03 | 1.34E-06 | 7.01E-06 | 2.60E-03 | …. | 5.28E-04 |
| After 3rd period | Mean | 1.16 | -5.35 | 4.08E-04 | 7.34E-04 | 1.25E-01 | -5.12E-04 | -5.36E-03 | -2.40E-01 | …. | -1.24E-01 |
| | SD | 1.28E-02 | 2.54E-01 | 1.50E-04 | 2.12E-04 | 3.71E-02 | 2.84E-04 | 8.51E-04 | 3.39E-02 | …. | 3.25E-02 |
| | MC Error | 7.55E-04 | 2.42E-02 | 4.98E-06 | 8.20E-06 | 1.29E-03 | 2.85E-06 | 9.18E-06 | 3.22E-03 | …. | 3.25E-02 |
| After 4th period | Mean | 1.13 | -5.35 | 4.08E-04 | 7.34E-04 | 1.25E-01 | -5.18E-04 | -5.44E-03 | -2.40E-01 | …. | -1.24E-01 |
| | SD | 2.47E-02 | 8.56E-01 | 2.74E-04 | 5.99E-04 | 8.14E-02 | 4.37E-04 | 1.90E-03 | 1.14E-01 | …. | 8.35E-02 |
| | MC Error | 1.16E-03 | 8.30E-02 | 8.18E-06 | 2.60E-05 | 2.55E-03 | 6.02E-06 | 2.59E-05 | 1.11E-02 | …. | 1.84E-03 |

Figure 6.4: Survival curves for CI pipe with NOPF>0 and NOPF=0

Figure 6.5: Survival curves for DI pipe with NOPF>0 and NOPF=0

6.3.4 Model performance evaluation

A testing dataset was randomly chosen from the entire dataset to check the performance of the proposed BWPHM and to compare it with the traditional Cox-PHM. If a single year were used for the testing and that year was biased for some reason (e.g., it was a particularly freezing or dry year), this would not be a good test of the model. To avoid this problem, a testing dataset containing 20% of the full dataset was selected randomly. The overall match between observed and predicted water main breaks for CI and DI pipes was assessed using MSE, MAE, RMSE, MRAE and PBIAS using Equations 4.7, 4.8, 4.9, 5.15 and 4.10, respectively. The observed and predicted breaks for the CI and DI pipe strata using BWPHM and Cox-PHM after different periods are shown in Figures 6.6 and 6.7, respectively. Figure 6.6 indicates that both BWPHM and Cox-PHM After 1st period underestimated the number of failures, especially after 30 years, due to more datasets in NOPF = 0 (3,090) compared to NOPF > 0 (983) for CI pipes. The estimate improves after updating the BWPHM, i.e., After 2nd period, whereas the model overestimates the number of failures After 3rd period. However, after updating the model with the datasets of the 4th period, the estimation of the number of failures improved for CI pipes.
However, the estimation of the number of failures is almost the same After 3rd period and After 4th period for the Cox-PHMs. On the other hand, for DI pipes, both BWPHM and Cox-PHM After 1st period overestimated the number of failures due to the few datasets in both the NOPF > 0 (72) and NOPF = 0 (692) models. For BWPHM, the estimation of the initial failures in DI pipes improved After 2nd period, but the model still overestimated after 30 years. After updating the BWPHMs with the datasets of the 3rd and 4th periods, the failure estimation for the DI pipes improved. However, for the Cox-PHMs, the failure estimation improved After 2nd period but remained almost the same After 3rd period and After 4th period.

Figure 6.6: Observed and predicted breaks of BWPHMs and Cox-PHMs for CI pipes

Figure 6.7: Observed and predicted breaks of BWPHMs and Cox-PHMs for DI pipes

Table 6.8 gives the comparison results of the BWPHMs and Cox-PHMs after the 1st, 2nd, 3rd, and 4th periods for the CI and DI strata. Table 6.8 indicates that the model errors After 4th period are lower compared to the other three models for both CI and DI pipe strata. For CI pipes, the performance improvement After 4th period is not significant enough using the Cox-PHM compared to the BWPHM. Both BWPHMs and Cox-PHMs underestimate the number of breaks, except After 3rd period for BWPHM. For DI pipes, on the other hand, the performance of BWPHM and Cox-PHM After 1st period was not significant enough due to the small number of datasets. The model performance improved significantly for both BWPHM and Cox-PHM due to more datasets in the second period. Both BWPHMs and Cox-PHMs overestimate the number of breaks. However, the overestimations of the Cox-PHMs were significantly higher compared to the BWPHMs. The overall performances of the BWPHMs after updating were significantly better compared to the traditional Cox-PHMs.
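For readers who want to reproduce this kind of comparison, a minimal sketch of four of the error measures is given below. Equations 4.7-4.10 are not repeated in this section, so the forms used here are the common textbook definitions and should be treated as assumptions. In particular, PBIAS is written so that a positive value means overestimation, which matches the way the signs are read in the discussion above. MRAE is omitted because its definition (Equation 5.15) is not given here. The function names and the sample data are hypothetical.

```python
import math


def mse(observed, predicted):
    """Mean squared error."""
    return sum((p - o) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def mae(observed, predicted):
    """Mean absolute error."""
    return sum(abs(p - o) for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root mean squared error."""
    return math.sqrt(mse(observed, predicted))


def pbias(observed, predicted):
    """Percent bias; positive means overestimation (assumed sign convention)."""
    return 100.0 * sum(p - o for o, p in zip(observed, predicted)) / sum(observed)


# Hypothetical yearly break counts, for illustration only.
observed_breaks = [10, 20, 30]
predicted_breaks = [12, 18, 33]

errors = {
    "MSE": mse(observed_breaks, predicted_breaks),
    "MAE": mae(observed_breaks, predicted_breaks),
    "RMSE": rmse(observed_breaks, predicted_breaks),
    "PBIAS": pbias(observed_breaks, predicted_breaks),
}
```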
Table 6.8: Performance of BWPHM and Cox-PHM for CI and DI pipes break prediction

| Pipe | Model Errors | Models | After 1st period | After 2nd period | After 3rd period | After 4th period |
|---|---|---|---|---|---|---|
| CI | MSE | BWPHM | 375.92 | 249.34 | 122.11 | 74.83 |
| | | Cox-PHM | 405.63 | 293.42 | 139.56 | 137.61 |
| | MAE | BWPHM | 15.41 | 12.26 | 8.22 | 6.97 |
| | | Cox-PHM | 17.96 | 14.92 | 7.70 | 7.59 |
| | RMSE | BWPHM | 19.39 | 15.79 | 11.22 | 8.65 |
| | | Cox-PHM | 20.14 | 17.13 | 11.81 | 11.73 |
| | MRAE | BWPHM | 2.86 | 2.18 | 1.78 | 0.92 |
| | | Cox-PHM | 3.67 | 2.52 | 1.03 | 1.22 |
| | PBIAS | BWPHM | -46.90 | -47.60 | 16.50 | -14.70 |
| | | Cox-PHM | -58.40 | -47.10 | -12.00 | -11.20 |
| DI | MSE | BWPHM | 735.14 | 135.40 | 86.29 | 85.97 |
| | | Cox-PHM | 1357.89 | 272.45 | 251.00 | 243.94 |
| | MAE | BWPHM | 13.55 | 10.01 | 7.44 | 6.67 |
| | | Cox-PHM | 16.12 | 14.13 | 14.64 | 14.24 |
| | RMSE | BWPHM | 27.11 | 11.64 | 9.29 | 9.27 |
| | | Cox-PHM | 45.36 | 16.51 | 15.84 | 15.62 |
| | MRAE | BWPHM | 3.88 | 0.86 | 0.57 | 0.62 |
| | | Cox-PHM | 2.39 | 1.80 | 1.99 | 1.78 |
| | PBIAS | BWPHM | 72.00 | 47.00 | 16.00 | 18.00 |
| | | Cox-PHM | 73.10 | 76.80 | 70.70 | 69.80 |

6.4 Summary

In this study, a Bayesian updating based water main failure prediction framework is developed to update the performance of the pipe failure models and to quantify the uncertainties. For this, BMA is conducted to select influential or important covariates, whereas the BWPHM is updated to develop the survival curves for CI and DI pipes using 58 years of historical data collected for the City of Calgary. The test results confirmed that the Bayesian updating model effectively improves the performance of the water main failure models. The proposed Bayesian updating model can be a unique tool for small to medium sized utilities due to their data/information scarcity, lack of technical and financial resources, and limited experience. The utility managers or authorities of small to medium sized utilities can consider expert knowledge or judgement as a prior for the analysis and can update the model whenever new data/information is available.
On the other hand, due to the availability of sufficient data or information for large utilities, the Bayesian updating methodology can be applied to improve the performance of the existing pipe failure models through quantifying the uncertainties arising from modeling errors. To develop optimal M/R/R action plans through estimating the costs of M/R/R while meeting an acceptable level of service and financial constraints, the utility managers or authorities can integrate the developed survival curves with an economic assessment model such as life cycle costing. The proposed model can also incorporate the judgement/opinion of the utility managers or authorities as a prior distribution and assess the overall impact of the different management strategies on the short and long term plans as well as on the level of funding or investment needed to maintain acceptable structural and hydraulic performance levels.

Chapter 7 Life Cycle Cost (LCC) for Repair and Replacement

A version of this chapter has been submitted in Urban Water Journal with a title “A Life Cycle Cost Approach for Water Mains’ Renewal in Small to Medium-sized Water Utilities.” The lead author is Golam Kabir and the coauthors are Gizachew Demissie, Dr. Rehan Sadiq and Dr. Solomon Tesfamariam.

7.1 Background

It is unrealistic and impractical to replace all the damaged and aging water mains simultaneously (Kleiner et al. 2001; Rajani and Kleiner 2004). Localized repair, general rehabilitation and replacement are decisions that can be taken to improve the condition of water mains to an acceptable level of service (LoS) (Engelhardt et al. 2003; Pelletier et al. 2003). The selection of cost effective method(s) is critical for determining when to repair, renovate or replace a section of water main. Moreover, an effective tool is required to help a decision maker reach the optimal repair, rehabilitation or replacement decision (Herstein et al. 2011; Lim et al. 2006).
LCC can be used effectively as a DST to aid utility managers/engineers in comparing and selecting the most cost effective rehabilitation alternative (Boussabaine and Kirkham 2004). The main objective of this research is to develop an LCC based DST for the renewal of water mains in small to medium-sized water utilities, taking uncertainties into consideration. The applicability of the proposed framework is illustrated with the Glenmore-Ellison Improvement District (GEID) and Greater Vernon Water (GVW) data.

7.2 Proposed LCC framework

The LCC framework developed in this study for water mains renewal of small to medium-sized utilities is shown in Figure 7.1. Initially, different types of data (i.e., pipe characteristics, failure, soil and hydraulic data) were collected from the water utilities’ databases. Based on those data, pipe deterioration and condition curves were generated. After that, different LCC related data (i.e., repair, rehabilitation and replacement (R/R/R) methods, R/R/R costs, planning period, discount rates) were collected. Finally, the R/R/R profile over the design period was determined to calculate the LCC of each pipe. The main components of the framework are described in the following sections.

Figure 7.1: Proposed methodology for LCC of small to medium-sized water utilities

7.2.1 Pipe deterioration/condition states

For developing pipe deterioration curves, the most widely used statistical approach is generating a break frequency curve using different regression methods (Boussabaine and Kirkham 2010). Water utilities keep records of pipe breaks, so the basic data to populate a curve are available and the approach can be applied to all types of pipes. The application of these data can range from simple direct analysis of break rates (e.g., breaks/km/year) to sophisticated mathematical models incorporating a variety of factors (i.e., soil condition, operating conditions, pipe types, location, land use etc.) to predict remaining life (Kimutai et al. 2015; Rajani and Tesfamariam 2007). This method is well suited when an adequate amount of break data over a sufficient time period is available (Banciulescu and Sekuler 2010). However, due to the scarcity of data/information for small and medium-sized water utilities, the results of the break frequency curve using normal regression methods are highly uncertain (Kabir et al. 2015a). For this reason, in this study, a Bayesian regression based model is developed to determine the break frequency curves. The details of Bayesian regression are provided in section 4.2.2. After developing the break frequency curves using Bayesian regression, the results of the analysis can be used to develop pipe deterioration curves to facilitate proactive renewal by replacing pipelines based on projected breakage rates that exceed an acceptable threshold. For this, the methodology proposed by Banciulescu and Sekuler (2010) was followed in this study. Banciulescu and Sekuler (2010) illustrated six different break frequency curves for 4 to 6 inch cast iron pipes of different ages, materials, and soil types.

7.2.2 Life cycle cost (LCC)

The LCC of an asset is defined as “the total cost throughout its life including planning, design, acquisition and support costs and any other costs directly attributable to owning or using the asset” (NSW Treasury 2004; Woodward 1997). The LCC for water mains takes into account construction costs, operating costs, maintenance costs, capital replacement costs and any resale, salvage, or disposal cost over the life-time of the water mains (Shahata, 2006; Boussabaine and Kirkham 2004; Engelhardt et al. 2003). Modeling of LCC enables decision makers to compare different alternatives and select the most feasible and economic solution (Gransberg and Diekmann 2004; NSW Treasury 2004). In order to estimate the timing of future activities, breakage rate analysis can be used to express the deterioration rate of the existing water mains.
The timing of future breaks and pipe condition can be analyzed from the breakage rate analysis. Next, the costs of each new installation/rehabilitation alternative have to be collected. After defining the new installation/rehabilitation alternatives and the timing of future activities, a set of scenarios combining repair, renovation, and replacement alternatives can be suggested (Figure 7.2).

Figure 7.2: Example timing of future activities

The present value (PV) for each suggested scenario will be calculated. The cost data, service life, and discount rate can be expressed using the following formula (Ammar et al. 2012; Engelhardt et al. 2003).

$$TLCC = C_c + \sum_{t=0}^{sl} \frac{C_t}{(1 + d)^t} \qquad (7.1)$$

where TLCC is the present value of the total life cycle cost, C_c is the construction cost, C_t is the sum of maintenance, repair, replacement/rehabilitation costs and salvage value, d is the discount rate and sl is the service life of the water mains or the planning period. In most cases, it is very challenging to get the construction costs and the salvage value of the water mains. For this reason, only R/R/R costs were considered in most of the studies (Rahman and Vanier 2004). Future costs are expressed in constant dollars and then discounted to the present at a discount rate that reflects only the opportunity value of time (real discount rate). This is because public sector project benefits should be dependent only upon real gains (cost savings or expanded output), rather than purely price effects (Rahman and Vanier 2004). After computing the PV for each scenario, the results of each scenario will be analyzed and compared to predict the best new installation/rehabilitation scenario for water main rehabilitation.
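The discounting in Equation 7.1 can be sketched in a few lines of code. The sketch below is only an illustration under stated assumptions: the function names, the two scenarios, their costs and their timing are hypothetical, and the construction cost is set to zero to mirror the common practice of considering only R/R/R costs.

```python
def present_value(cost, year, discount_rate):
    """Discount a cost incurred in a given year back to year 0."""
    return cost / (1.0 + discount_rate) ** year


def total_life_cycle_cost(construction_cost, costs_by_year, discount_rate):
    """Equation 7.1: TLCC = Cc + sum over t of Ct / (1 + d)^t.

    costs_by_year maps a year t to the cost Ct incurred in that year;
    years that are not listed are treated as having zero cost.
    """
    discounted = sum(
        present_value(cost, year, discount_rate)
        for year, cost in costs_by_year.items()
    )
    return construction_cost + discounted


DISCOUNT_RATE = 0.03  # real discount rate, assumed for illustration

# Hypothetical scenarios: year -> cost in constant dollars.
scenarios = {
    "repair only": {10: 2000.0, 20: 2000.0, 30: 2000.0},
    "repair then replace": {10: 2000.0, 25: 7000.0},
}

tlcc = {
    name: total_life_cycle_cost(0.0, costs, DISCOUNT_RATE)
    for name, costs in scenarios.items()
}
cheapest = min(tlcc, key=tlcc.get)
```

Keeping the costs in a year-indexed mapping makes it straightforward to add or move an intervention and recompute the present value of the scenario.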
[Figure 7.2 labels: watermain condition versus time, showing initial construction, first and second rehabilitation, the service life of each, and the agency’s minimum acceptable condition]

7.3 Case studies

The proposed LCC methodology is applied to the water distribution networks of the Glenmore-Ellison Improvement District (GEID), BC and Greater Vernon Water (GVW), BC. The GEID is located in British Columbia, Canada and is one of the five water purveyors in the City of Kelowna and in a portion of the Regional District of Central Okanagan. The water distribution network of GEID consists of a total of 1,328 individual pipes with 152.17 km length of pipe. C900 (1071 pipes having total length of 901.81km) and asbestos-cement (AC) (199 pipes having 45.73km of total length) pipes are the most frequent water mains. Other than that, the network consists of concrete cylinder (18 pipes), ductile iron (DI) (29 pipes), galvanized (2 pipes), steel (4 pipes), Polyvinyl Chloride (PVC) (3 pipes) and High-Density Poly Ethylene (HDPE) (2 pipes) pipes. The pipe characteristics data such as age, diameter, length, and soil corrosivity index were collected from the GIS database of the GEID, and the water pressure, velocity, and aggressiveness index were collected from the EPANET database of the GEID. No pipe failure information was found in the GIS database. Only a few (2-3) failure records were collected from the paper based work orders. However, the GEID personnel were not confident about that information. For this reason, no pipe failure information of GEID was considered in this study. The details of the water distribution network of the GVW are already presented in section 4.3.2. Pipe characteristics data such as age, diameter, length, soil corrosivity index and number of breaks were collected from the GIS database, whereas water pressure, velocity, and aggressiveness index were collected from the WaterCAD database of the GVW.
7.3.1 Pipe deterioration curves development

The historical break data of GVW were not sufficient to develop individual statistically significant pipe failure models or break frequency curves. Moreover, no failure data were found for GEID. For this reason, the pipe break inventories of GVW and another medium-sized water utility of the Okanagan valley, BC (City of Kelowna) were pooled, on the basis of their demographic and environmental similarities, to develop more statistically significant pipe failure models. The most common method for generating break frequency curves is to allocate pipes into relatively homogenous groups or pipe cohorts based upon specific pipe characteristics (e.g., pipe type, diameter).

The pipe failure data of City of Kelowna and GVW were classified using pipe material and pipe diameter. The pipes have been grouped into five: Metallic (Diameter ≤ 200 mm and Diameter > 200 mm), Cementitious (Diameter ≤ 200 mm and Diameter > 200 mm) and Plastic. The metallic pipes include the CI, DI, steel, galvanized (Galv), and copper (Cop) pipes; the cementitious pipes cover the asbestos-cement (AC) and concrete (Conc) pipes; and the plastic pipes consist of PVC. Table 7.1 presents the number of water main failures of GVW and City of Kelowna based on pipe material and diameter. Table 7.1 indicates that the majority of the GVW pipes that experienced breaks are CI, DI and AC, whereas for City of Kelowna they are mostly AC and CI. The numbers of breaks in plastic pipes are few due to their recent installation.

Table 7.1: Number of water mains failure of GVW and City of Kelowna Utility Diameter Metallic Cementitious Plastic CI Cop DI GALV Steel AC Conc PVC GVW ≤ 200mm 96 1 58 3 3 56 13 > 200mm 6 15 11 3 2 City of Kelowna ≤ 200mm 38 99 12 > 200mm 16 1 32 2

Bayesian regression analyses proposed by Kabir et al. (2015b) were performed to develop a model for each pipe group (cementitious, metallic and plastic pipes).
The break frequency curves for water mains of different materials and sizes were developed based on pipe characteristics (i.e., age, length, and diameter), soil parameter (i.e., soil corrosivity index), hydraulic factors (i.e., velocity and pressure), and aggressiveness index. The installation year of each pipe was subtracted from the base year “2014” to get the age of the pipe. Soil corrosivity index (SCI), aggressiveness index (AI), water velocity and water pressure were calculated based on the methodology proposed in Francisque et al. (2014). For each material-diameter class, the total number of recorded breaks, the age of the pipes, and the pipe lengths were computed and eventually used to calculate the breakage rate (# of break/km/year).

The models developed using Bayesian regression techniques predicted the pipe breakage rate, and the acceptance/rejection of the models was evaluated using the root mean square error (RMSE) given by Equation 7.2 and the square of the correlation coefficient (R2).

$$RMSE = \sqrt{\frac{1}{m} \sum_{j=1}^{m} \left( BK_j^{Observed} - BK_j^{Predicted} \right)^2} \qquad (7.2)$$

where BK_j^{Observed} and BK_j^{Predicted} are the jth observed breakage rate of the pipe and the corresponding value predicted by the model, respectively, and m is the total number of data points. As there was no information about the prior, a normal distribution with mean 0.0 and precision 0.001 (variance 1,000) was used for the regression parameters. The distributions of the regression parameters of the models were generated from the joint posterior distribution. Figure 7.3 shows the posterior distribution of the Bayesian regression coefficients of metallic pipes (Diameter > 200 mm). In order to test the normality of the posterior distributions of the Bayesian regression coefficients, Anderson-Darling and Kolmogorov-Smirnov normality tests were performed. For all the coefficients of the different Bayesian regression models, the p-value was found to be greater than 0.05.
Thus, we failed to reject the hypothesis that the data is normally distributed (Thode 2002).

Figure 7.3: Posterior distribution of the regression coefficients of metallic pipes (Diameter > 200 mm)
[Figure 7.3 panels: histograms (Frequency) for ln(Age), ln(Length), SCI, Pressure, Velocity, and Error SD]

Table 7.2 presents the significant parameters and the mean and standard deviation (SD) of the posterior distributions of the Bayesian regression coefficients of the models to predict the breakage rate for cementitious, metallic and plastic pipes. The RMSE of these models were low and the R2 were high, except for plastic pipes due to very few data or failure information. Based on these regression coefficients, the break frequency curves were generated for the different groups of pipes.

Table 7.2: Mean and standard deviation of the Bayesian regression coefficient for different models

| Coefficient | Metallic, Dia ≤ 200mm (Mean) | Metallic, Dia ≤ 200mm (SD) | Metallic, Dia > 200mm (Mean) | Metallic, Dia > 200mm (SD) | Cementitious, Dia ≤ 200mm (Mean) | Cementitious, Dia ≤ 200mm (SD) | Cementitious, Dia > 200mm (Mean) | Cementitious, Dia > 200mm (SD) | Plastic (Mean) | Plastic (SD) |
|---|---|---|---|---|---|---|---|---|---|---|
| ln(AGE) | 0.2005 | 0.0899 | 0.3880 | 0.1337 | 0.5765 | 0.0927 | 0.1881 | 0.1598 | 0.2510 | 0.1580 |
| ln(LENGTH) | -0.8526 | 0.0618 | -0.8197 | 0.1023 | -0.7654 | 0.0554 | -0.4989 | 0.1170 | -0.5292 | 0.1899 |
| SCI | 0.0000 | 0.0000 | 0.0398 | 0.2837 | -0.1867 | 0.1236 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| AI | 2.7956 | 0.4398 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| PRESSURE | 0.0000 | 0.0000 | 0.0087 | 0.0019 | 0.0016 | 0.0015 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| VELOCITY | 0.0000 | 0.0000 | 0.1887 | 0.0019 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |

7.3.2 Life cycle cost determination

In order to optimize the usage of available funds for a cost effective R/R/R action program, an improved and generic LCC framework was developed by incorporating the pipe deterioration model with R/R/R costs.
Life cycle cost profile development involves three steps: 1) the estimation of the time of future activities, 2) the identification of rehabilitation/replacement (new installation) scenarios, and 3) the estimation of the water mains R/R/R costs. As stated earlier, the future deterioration of water mains can be assessed by pipe deterioration curves. If a minimum LoS corresponding to a certain value of pipe condition is set, the pipe deterioration curves will allow the managers to estimate the time of future activities. The second step refers to the identification of rehabilitation/replacement (new installation) scenarios to improve the pipe condition up to a certain percentage. Finally, for the third step, the costs of water main repair/rehabilitation/renovation and replacement alternatives will be collected. Currently, both the GEID and GVW utilities use only open trench repair for renewal. They have not considered any type of trenchless renewal method. Moreover, they do not have any reliable information on the costs of water main breaks. Most of the LCC studies of water mains considered average agency costs for R/R/R (Ammar et al., 2012; Engelhardt et al. 2003; Lim et al., 2006; Rajani and Kleiner, 2004; Shahata, 2006; Shahata and Zayed, 2012, 2013), which are highly uncertain. Moreover, repair costs depend highly on the land use and topographical condition of the location where the pipes are buried. For example, if one pipe is buried under a concrete pavement and another pipe is buried under an open or agricultural field or native soil, the repair cost of the former will be much higher than that of the latter. But if only an average cost is considered, then the repair cost for both these pipes will be the same, which is not practical and is highly uncertain for effective decision making.
The average repair costs for different site conditions (i.e., gravel, concrete pavement and agricultural/no pavement/native soil) for different asset types (i.e., mainline valve, service line valve and water main) shown in Table 7.3 were considered in this study based on the discussion with the authorities of GEID and RDNO. According to Table 7.3, if the asset is in a road crossing, the average repair cost will be double that of an asset on the same side. Moreover, due to conflicts with other utilities such as sewer, internet, and gas, the average repair cost will be higher than in the normal condition. For example, if a mainline valve lies under a concrete pavement road crossing and also conflicts with other utilities, then the total average R/R cost will be $ 11,000.00 ($ 5,000.00 × 2 + $ 1,000.00). This will help to reduce the uncertainty in decision making. However, the repair cost can be updated whenever new information is available.

Table 7.3: Average repair costs for different situations for GEID and GVW

| Site condition | Mainline Valve | Serviceline Valve | Water Main |
|---|---|---|---|
| Gravel | $ 2,000.00 | $ 2,000.00 | $ 2,500.00 |
| Concrete pavement | $ 5,000.00 | $ 5,000.00 | $ 5,500.00 |
| Agricultural/No pavement/Native soil | $ 1,500.00 | $ 1,500.00 | $ 2,000.00 |
| Same side (multiplier) | 1 | 1 | 1 |
| Road crossing (multiplier) | 2 | 2 | 2 |
| Utility conflict (added cost) | $ 1,000.00 | $ 1,000.00 | $ 1,500.00 |

For the pipe replacement or new pipe installation cost (including fitting & valves), the costs presented in Table 7.4 were used in this study for GEID and RDNO. However, the replacement or new pipe installation cost can be updated whenever new information is available.
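The repair cost rule implied by Table 7.3 and its worked example (base cost for the site condition, multiplied by the crossing factor, plus a surcharge for utility conflicts) can be sketched as a small lookup. The cost figures are those of Table 7.3; the dictionary layout, the key spellings and the function name `average_repair_cost` are illustrative assumptions and not part of the tool itself.

```python
# Base average repair cost ($) by site condition and asset type (Table 7.3).
BASE_COST = {
    "gravel": {
        "mainline valve": 2000.0, "serviceline valve": 2000.0, "water main": 2500.0,
    },
    "concrete pavement": {
        "mainline valve": 5000.0, "serviceline valve": 5000.0, "water main": 5500.0,
    },
    "agricultural/no pavement/native soil": {
        "mainline valve": 1500.0, "serviceline valve": 1500.0, "water main": 2000.0,
    },
}

# Multiplier applied to the base cost: same side = 1, road crossing = 2.
CROSSING_MULTIPLIER = {False: 1.0, True: 2.0}

# Additional cost ($) when the asset conflicts with other utilities.
CONFLICT_SURCHARGE = {
    "mainline valve": 1000.0, "serviceline valve": 1000.0, "water main": 1500.0,
}


def average_repair_cost(site, asset, road_crossing=False, utility_conflict=False):
    """Base cost times the crossing multiplier, plus the conflict surcharge if any."""
    cost = BASE_COST[site][asset] * CROSSING_MULTIPLIER[road_crossing]
    if utility_conflict:
        cost += CONFLICT_SURCHARGE[asset]
    return cost


# Worked example from the text: mainline valve under a concrete pavement
# road crossing that also conflicts with other utilities.
example_cost = average_repair_cost(
    "concrete pavement", "mainline valve", road_crossing=True, utility_conflict=True
)
```

Keeping the figures in plain lookup tables reflects the remark that the repair costs can be updated whenever new information becomes available.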
Table 7.4: Pipe replacement or new pipe installation cost for GEID and GVW (water main replacement cost, including fittings and valves)

| GEID diameter (mm) | GEID unit cost (per m) | RDNO diameter (mm) | RDNO unit cost (per m) |
|---|---|---|---|
| 50 | $ 65.00 | 100 | $ 230.00 |
| 100 | $ 70.00 | 150 | $ 250.00 |
| 150 | $ 80.00 | 200 | $ 300.00 |
| 200 | $ 110.00 | 250 | $ 350.00 |
| 250 | $ 135.00 | 300 | $ 400.00 |
| 300 | $ 150.00 | 350 | $ 465.00 |
| 350 | $ 200.00 | 400 | $ 515.00 |
| 400 | $ 250.00 | 450 | $ 565.00 |
| 450 | $ 300.00 | 500 | $ 615.00 |
| 500 | $ 350.00 | 600 | $ 740.00 |
| 600 | $ 400.00 | 750 | $ 890.00 |
| | | 900 | $ 1,040.00 |
| | | 1200 | $ 1,350.00 |

After the determination of the pipe deterioration curves and cost profiles, the LCC of the different scenarios was calculated. For the LCC model, the objective function was the minimization of the discounted net present value of cost (NPVC) over the life cycle (or desired analysis period) of the water mains, as shown in the following equation.

$$\min \; NPVC = \sum_{t=1}^{sl} \frac{C_{ijt}}{(1 + d)^{t}} \qquad (7.3)$$

where
sl = planning period
d = discount rate
C_ijt = total incurred cost at condition state i by choosing decision j at period t

7.3.3 Overview of LCC based DST in RiLCAMP tool

The developed LCC-based DST is a module of the Risk based Life Cycle Asset Management tool for water Pipes (RiLCAMP) model for the prioritization of the water main repair or replacement program at the Glenmore-Ellison Improvement District (GEID) and Greater Vernon Water (GVW) based on existing data/information (Figure 7.4). The other two modules are RISK ANALYSIS and EXECUTE GIS MAP, which are not discussed here. Interested readers are referred to Francisque et al. (2014) for a more elaborate discussion of the RISK ANALYSIS module. The proposed LCC methodology for GEID and GVW is described with an illustrative example, for which one pipe (ID: GEID_1323) is selected.

Figure 7.4: The welcome page of the RiLCAMP tool

The GEID_1323 pipe is a C900 plastic pipe with a 200 mm diameter and a 7 m length, installed in 2009 (Table 7.5).
The current age of the pipe is taken as 3 years because the soil and hydraulic data of 2012 were used for the risk analysis and pipe failure model development. The discount rate is set to 3% by default.

Table 7.5: Basic pipe characteristics data and net present value

| Attribute | Value |
|---|---|
| Pipe ID | GEID_1323 |
| Pipe Material | C900 |
| Age (year) | 3 |
| Year of Installation | 2009 |
| Pipe Length (m) | 7 |
| Pipe Diameter (mm) | 200 |
| Discount Rate (%) | 3% |
| Net Present Value | $ 879.56 |

Multiple maintenance, repair and rehabilitation (M/R/R) techniques such as cathodic protection, wrapping, lining, repair, and wrapping & lining are considered in this tool. At present, GEID and GVW use only the open trench repair technique for pipe repair, but other M/R/R techniques such as cathodic protection, wrapping & lining and Other 1 are also incorporated in the tool so that the utilities can use them in the future. Hypothetical costs are assigned to the other M/R/R techniques (cathodic protection, wrapping, lining, Other 1, Other 2), and the decision makers can update or change these costs whenever new information is available. The list of the different M/R/R techniques and their costs is shown in Table 7.6. The average repair costs and the pipe replacement or new pipe installation costs for GEID and GVW are presented in Tables 7.3 and 7.4, respectively.

Table 7.6: List of different M/R/R techniques and corresponding costs

| List of M/R/R Interventions | Today_Cost |
|---|---|
| Cathodic Protection | $ 100 /m |
| Wrapping | $ 50 /m |
| Lining | $ 150 /m |
| Relining | $ 200 /m |
| Wrapping & Lining | $ 125 /m |
| Wrapping & Relining | $ 230 /m |
| Lining & Cathodic Protection | $ 125 /m |
| Relining & Cathodic Protection | $ 170 /m |
| Other 1 | $ 125 /m |
| Other 2 | $ 225 /m |
| Other 3 | $ 325 /m |
| No intervention | $ 0 /m |

The break frequency curves were converted into pipe deterioration curves based on the methodology proposed by Banciulescu and Sekuler (2010). The deterioration of GEID_1323 over the planning period (100 years after the pipe installation) is shown in Figure 7.5(a).
If no intervention is selected over the planning period, the pipe will follow the dotted deterioration profile, and the pipe condition and LoS will deteriorate gradually. However, if the decision makers want to keep a very good LoS, the threshold pipe condition is 75%, and an action or intervention is required after 54 years to maintain or improve the LoS. The decision maker can choose repair as an action that improves the pipe condition by 15%. The new pipe condition will then be 86.25% (Action 1 in Table 7.7), and the pipe will start deteriorating from there. However, it is quite challenging to determine the percentage improvement in pipe condition after any intervention or action because multiple factors are involved. For this reason, the decision makers can easily change the percentage improvement in pipe condition after any intervention based on their judgment and experience. Similarly, the decision makers can choose other actions to maintain a satisfactory LoS or pipe condition by selecting appropriate interventions, as shown in Table 7.7. The pipe deterioration profile changes according to the actions selected to improve the pipe condition. Figure 7.5 also shows the deterioration profile of the GEID_1323 pipe after the selection of different actions.
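The condition values listed in Table 7.7 are consistent with applying the percentage improvement relative to the threshold condition at the time of the action. The sketch below assumes that multiplicative reading (it is inferred from the tabulated numbers, not stated explicitly in the text); the function name `condition_after_action` is hypothetical.

```python
def condition_after_action(threshold_pct, improvement_pct):
    """Pipe condition (%) after an action, assuming the improvement is
    applied relative to the threshold condition (inferred from Table 7.7)."""
    return threshold_pct * (100.0 + improvement_pct) / 100.0


# (action, LoS threshold condition in %, percent increase) from Table 7.7.
ACTIONS = [
    ("Action 1", 75.0, 15.0),
    ("Action 2", 75.0, 10.0),
    ("Action 3", 55.0, 10.0),
    ("Action 4", 55.0, 5.0),
    ("Action 5", 55.0, 5.0),
]

for name, threshold, improvement in ACTIONS:
    new_condition = condition_after_action(threshold, improvement)
    print(f"{name}: {threshold:.0f}% -> {new_condition:.2f}%")
```

Because the percentage improvement is an input, a decision maker's revised judgment about the effect of a repair only changes one number in the list.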
Figure 7.5: Deterioration profile of GEID_1323 pipe over the planning period: (a) considering new pipe; (b) not considering new pipe. [Plot axes: pipe condition (%) versus age of pipe (year); series: No intervention, After_Action 1 to After_Action 5, and the current age of the pipe.]

Table 7.7: GEID_1323 pipe condition after each M/R/R action over the planning period

| Actions | Level of Service (LoS) | Threshold Pipe Condition (%) | Possible Interventions | Percent of Increase of Pipe Condition | New Condition of Pipes (%) |
|---|---|---|---|---|---|
| Action 1 | Very Good | 75 | Repair | 15 | 86.25 |
| Action 2 | Very Good | 75 | Repair | 10 | 82.5 |
| Action 3 | Good | 55 | Repair | 10 | 60.5 |
| Action 4 | Good | 55 | Repair | 5 | 57.75 |
| Action 5 | Good | 55 | Repair | 5 | 57.75 |

Most previous studies assume that the pipe follows the same deterioration pattern or profile after repair or rehabilitation. In practice, this assumption is not appropriate, yet it is quite challenging to determine the deterioration pattern or profile after any intervention or action. Therefore, a weight of deterioration rate after intervention or action is considered in this study. If the decision makers choose weights equal to 1, the pipe follows the same deterioration pattern as before the intervention. If the weights are less than or greater than 1, the pipe deteriorates faster or slower, respectively, than before the intervention. The deterioration pattern or profile changes according to the weights. Table 7.8 lists the weights of deterioration rate after intervention or action, both considering a new pipe and not considering a new pipe.
If the decision makers consider that the GEID_1323 pipe will follow the same deterioration, or will behave as a new pipe, after any intervention, then the deterioration profile of the GEID_1323 pipe over the planning period is the one shown in Figure 7.5(a).

Table 7.8: Weight of deterioration rate after intervention or actions

Weight of Deterioration Rate after Intervention (Considering New Pipe), by age of pipe (years)

| Actions | 0-20 | 20-40 | 40-60 | 60-80 | >80 |
|---|---|---|---|---|---|
| Action-1 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| Action-2 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| Action-3 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| Action-4 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| Action-5 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |

Weight of Deterioration Rate after Intervention (Not Considering New Pipe), by age of pipe (years)

| Actions | 0-20 | 20-40 | 40-60 | 60-80 | >80 |
|---|---|---|---|---|---|
| Action-1 | 0.95 | 0.95 | 0.95 | 0.95 | 0.95 |
| Action-2 | 0.90 | 0.90 | 0.90 | 0.90 | 0.90 |
| Action-3 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| Action-4 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| Action-5 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |

If the decision makers consider that the pipe will not be as good as new after an R/R action, and that its condition will be 95% and 90% after the first and second repairs respectively, then the weights after Action 1 and Action 2 will be 0.95 and 0.90 respectively. These weights of deterioration rate, not considering a new pipe after intervention, are presented in Table 7.8. In this case, the deterioration profile of the GEID_1323 pipe over the planning period also changes, as shown in Figure 7.5(b). Figure 7.5(b) shows that the deterioration profile of the GEID_1323 pipe will not be the same after an R/R action: the pipe deteriorates much faster than in Figure 7.5(a). Similarly, the decision makers can try different options to see their impact. Based on the actions taken in Table 7.7 and the weights presented in Table 7.8, the corresponding R/R/R costs are also calculated. Table 7.9 lists the costs of the different actions, the total cost over the planning period and the net present value at a 3% discount rate.
The cost of the actions also changes with the site and road conditions. Figures 7.6(a) and 7.6(b) show the LCC profile of the GEID_1323 pipe over the planning period based on the deterioration profiles presented in Figures 7.5(a) and 7.5(b). The net present value of the total LCC of the GEID_1323 pipe is $ 879.56 for Figure 7.6(a), whereas for Figure 7.6(b) it is $ 1,027.83, which is $ 148.27 or 16.86% higher than in the previous scenario. The proposed LCC module will help the decision makers to make practical decisions regarding their assets.

Table 7.9: Cost for each M/R/R action of GEID_1323 pipe over the planning period

| Actions | Site condition | Utility conflict | Road Condition | Cost | Net Present Value (1) | Net Present Value (2) |
|---|---|---|---|---|---|---|
| Action-1 | Agricultural/No pavement/Native soil | No | Same side | $ 3,018.50 | $ 668.49 | $ 668.49 |
| Action-2 | Agricultural/No pavement/Native soil | No | Same side | $ 3,018.50 | $ 211.08 | $ 359.34 |
| Action-3 | Not applicable | No | Same side | $ 0.00 | $ 0.00 | $ 0.00 |
| Action-4 | Not applicable | No | Same side | $ 0.00 | $ 0.00 | $ 0.00 |
| Action-5 | Not applicable | No | Same side | $ 0.00 | $ 0.00 | $ 0.00 |
| Total | | | | $ 6,037.00 | $ 879.56 | $ 1,027.83 |

(1): Considering New Pipe; (2): Not Considering New Pipe

Figure 7.6: LCC profile of GEID_1323 over the planning period: (a) considering new pipe; (b) not considering new pipe. [Plot axes: cost ($) versus age of pipe (year).]

The discount rate is another important element of R&R decision making. The discount rate represents the investor's minimum acceptable rate of return. Different provinces of Canada use different discount rates for decision making.
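The net present values in Table 7.9 can be reproduced from Equation (7.3) by discounting each action cost at 3% per year. The sketch below is illustrative only: the timings (51, 90 and 72 years from the current pipe age of 3 years) are assumptions back-calculated from the tabulated values, with the first one matching the stated action at a pipe age of 54 years; the function names are hypothetical.

```python
def present_value(cost, discount_rate, years_from_now):
    """Discount a single future cost, i.e. one term of Equation (7.3)."""
    return cost / (1.0 + discount_rate) ** years_from_now


def npvc(cash_flows, discount_rate):
    """Sum of discounted costs; cash_flows is a list of (years_from_now, cost)."""
    return sum(present_value(cost, discount_rate, t) for t, cost in cash_flows)


DISCOUNT_RATE = 0.03   # default rate used in the text
REPAIR_COST = 3018.50  # cost of Action-1 and Action-2 in Table 7.9

# Assumed action timings in years from the current year (pipe age 3).
# These are back-calculated from Table 7.9, not stated in the source.
considering_new_pipe = [(51, REPAIR_COST), (90, REPAIR_COST)]
not_considering_new_pipe = [(51, REPAIR_COST), (72, REPAIR_COST)]

npv_a = npvc(considering_new_pipe, DISCOUNT_RATE)
npv_b = npvc(not_considering_new_pipe, DISCOUNT_RATE)

print(f"NPV (considering new pipe):     ${npv_a:,.2f}")
print(f"NPV (not considering new pipe): ${npv_b:,.2f}")
print(f"Relative increase: {100.0 * (npv_b - npv_a) / npv_a:.2f}%")
```

Under these assumed timings, the second repair arrives earlier when the pipe is not treated as new, so it is discounted less and the total net present value rises. Re-running `npvc` with other values of `DISCOUNT_RATE` gives the kind of sensitivity curve the tool reports for different discount rates.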
The recently gazetted 2016 Ontario discount rates of 0% for the initial 15 years from the trial commencement date and 2.5% per annum thereafter represent a decrease from the 2015 rates of 0.3% and 2.5% per annum respectively (https://www.ontario.ca/laws/regulation/900194). For British Columbia, the prescribed rates changed from 2.5% to 1.5% for income loss and from 3.5% to 2.0% for future care, effective April 30, 2014. On the other hand, Saskatchewan and Manitoba use a 3.0% discount rate, whereas New Brunswick, Prince Edward Island, the Northwest Territories and Nunavut use 2.5% (http://www.collinsbarrow.com/uploads/offices/toronto/Litigation_Accounting_and_Valuation_Services/Discount_Rates_2016.pdf). In the RiLCAMP tool, the discount rate is set to 3% by default, although the decision makers can change or update it based on the location and other factors. The developed module can also handle multiple discount rates. Figure 7.7 shows the net present value ($) of the LCC of the GEID_1323 pipe for different discount rates (%). This helps the decision makers to make appropriate decisions.

Figure 7.7: Net present value ($) of GEID_1323 for different discount rate (%). [Plot axes: net present value ($) of GEID_1323 versus discount rate (%).]

7.4 Summary

In this study, an LCC-based DST is developed to prioritize R/R/R recommendations for water mains for small to medium-sized water utilities. The proposed tool will aid utility engineers and managers in predicting suitable new installation and/or rehabilitation programs, as well as their corresponding costs, for effective and proactive decision making, thereby avoiding unexpected and unpleasant surprises. A combination of M/R/R techniques is integrated in the model to develop different scenarios for the rehabilitation of water mains. The proposed LCC module of the RiLCAMP model provides the decision maker with an effective and easy-to-use tool for making an informed decision.
To validate the proposed framework, it was applied to develop a water main renewal program for GEID and GVW. Although the proposed LCC module is developed for the WDN of GEID and GVW, the methodology developed in this research can be used for any type of WDN, in particular those of small to medium-sized utilities. The proposed LCC module is a unique tool for small to medium-sized utilities given their scarcity of data/information, lack of technical and financial resources, and limited experience. In water main failure prediction or break frequency models, uncertainty is inherent regardless of the quantity and quality of the data used in model-data fusion. Therefore, a Bayesian regression model is proposed for its superior prediction capabilities compared to normal linear regression. This method is flexible enough to include more contributing variables or new information, as well as the decisions of the utility engineers and managers. The proposed model also demonstrates how to fuse the failure data/information from multiple utilities with demographic and environmental similarities to develop more statistically significant pipe failure models. The R/R/R costs of the water mains provided by the decision makers based on site and road conditions also reduce uncertainty and unexpected expenses. Moreover, representing the change in pipe condition after any R/R action through the weight of deterioration rate after intervention makes the model more realistic and practical for effective decision making.

Chapter 8 Bayesian Belief Network (BBN) for Prioritization of Water Mains

A version of this chapter has been published in European Journal of Operational Research with the title “Evaluating Risk of Water Mains Failure using a Bayesian Belief Network Model.” The lead author is Golam Kabir and the coauthors are Dr. Solomon Tesfamariam, Dr. Alex Francisque and Dr. Rehan Sadiq.
8.1 Background

Failure risk is a combination of the probability and impact severity of a particular situation that negatively affects the ability of infrastructure to meet municipal objectives (InfraGuide 2006), which is in congruence with Lawrence’s (1976) risk definition. A successful risk assessment program provides predictive tools to evaluate water main failures, assess the consequences associated with such failures, and recommend prioritization strategies for capital and operating spending (Moustafa 2010). The contributing risk factors of water main failure recommended in the literature are presented in Table 2.3. A Bayesian Belief Network (BBN) can provide an appropriate framework to handle the cause and effect relationships between the risk factors and the uncertainties when historical data are scarce and/or the available information is ambiguous and imprecise. The objective of this chapter is to develop an effective model to evaluate the risk of water main failure considering structural integrity, hydraulic capacity, water quality, and consequence factors. This chapter explores both knowledge-based and data-based BBN models to evaluate a water main failure risk index that can be used to rank or prioritize the water mains in a network for maintenance, repair or replacement (M/R/R). In this chapter, both the deterioration factors that lead to the failure event and the consequence factors that result from the failure event (failure impact) are studied.

8.2 Bayesian Belief Network (BBN)

A Bayesian Belief Network is a graphical model that represents probabilistic relationships among a set of variables (Pearl 1988). A BBN is a Directed Acyclic Graph, where the nodes represent variables of interest and the links between them indicate informational or causal dependencies among the variables (e.g., Cockburn and Tesfamariam 2012; Hager and Andersen 2010; Laskey 1995). As depicted in Figure 8.1, a BBN is composed of:
a. a set of variables (e.g.
A1, A2 and B) and a set of directed links between the variables;
b. a set of mutually exclusive and exhaustive states for each variable (e.g. for A1, A2 and B the states are High (H), Medium (M) and Low (L) or {H, M, L}); and
c. an assigned conditional probability for each variable with ‘parents’, which will be defined shortly (e.g. for B).

Figure 8.1: A schematic of BBN (modified after Cockburn and Tesfamariam 2012; Tesfamariam and Liu 2013)

The relations between the variables in a BBN are expressed in terms of family relationships, where a variable A1 is said to be the parent of B (and B the child of A1) if the link goes from A1 to B (Figure 8.1). The dependencies are quantified through a set of conditional probability tables (CPTs); each variable is assigned a CPT of the variable given its parents. In the case of a variable with no parents, the conditional probability structure reduces to the unconditional probability (UP) of that variable (e.g., A1 and A2, Figure 8.1). The UPs of the basic input parameters are often not known a priori; consequently, equal weights (1/n, where n is the number of categories considered for each basic input) can be assigned using the principle of insufficient reason (e.g., Ismail et al. 2011; Tesfamariam and Liu 2013). For example, if the states for A1 are categorized as H, M and L, the UPs will be P(A1=H) = 1/3, P(A1=M) = 1/3 and P(A1=L) = 1/3. However, if the evaluation determines with certainty that the state of A1 is M, the UPs will be P(A1=H) = 0, P(A1=M) = 1 and P(A1=L) = 0. If there is still uncertainty about the state of A1, the appropriate values should be used. BBN is based on Bayes’ theorem, which has been proven to be a coherent method to manage uncertainty by explicitly representing the conditional probability dependencies between variables (e.g., Ismail et al.
2011; Sun and Shenoy 2007; Tesfamariam and Liu 2013). In a BBN analysis, for n mutually exclusive parameters Xi (i = 1, 2, …, n) and given observed data Y, the updated probability is computed by:

$$p(X_i \mid Y) = \frac{p(Y \mid X_i)\, p(X_i)}{\sum_{j} p(Y \mid X_j)\, p(X_j)} \qquad (8.1)$$

where p(X|Y) represents the posterior occurrence probability of X given that Y occurs; p(X) denotes the prior occurrence probability of X; p(Y), the denominator of Equation (8.1), denotes the marginal (total) occurrence probability of Y and is effectively constant since the observed data are in hand; and p(Y|X) refers to the conditional occurrence probability of Y given that X occurs (in this sense it is often viewed as the likelihood distribution) (Pearl 1988).

The conditional probabilities shown in Figure 8.1, which are used in Equation (8.1), can be obtained through expert knowledge elicitation (e.g., Joseph et al. 2010; Nadkarni and Shenoy 2001) or training from data (e.g., Cooper and Herskovits 1992; Lauría and Duchessi 2007; Tang and McCabe 2007). Where multiple experts are considered, the credibility of each decision maker can be elicited by considering their experience and confidence in the assessment (Tesfamariam et al. 2010). The credibility factor can be used to reduce the overall decision (Cockburn and Tesfamariam 2012). The efficacy of a BBN lies in its flexibility to capture both bottom-up inference, i.e., observing the effect (child) and inferring the possible cause (parent) (diagnostic analysis), and top-down inference, i.e., observing the cause (parent) and inferring the possible effect (child) (predictive analysis) (e.g., Cockburn and Tesfamariam 2012; Ismail et al. 2011). In predictive analysis, the probability distribution of a particular variable can be found by marginalizing the joint probability distribution with respect to the variable (e.g., Janssens et al. 2006; Tesfamariam and Liu 2013).
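Equation (8.1) and the predictive calculation can be illustrated for a single parent A1 with states {H, M, L} and a child B. The sketch below is a minimal, hedged example: the uniform prior follows the 1/n rule described in the text, but the likelihood values and the CPT are hypothetical numbers chosen for illustration, and the function names are assumptions.

```python
def bayes_update(prior, likelihood):
    """Posterior p(X_i | Y) from Equation (8.1).

    prior:      dict mapping state -> p(X = state)
    likelihood: dict mapping state -> p(Y observed | X = state)
    """
    joint = {state: likelihood[state] * prior[state] for state in prior}
    evidence = sum(joint.values())  # p(Y), the denominator of Eq. (8.1)
    if evidence == 0.0:
        raise ValueError("observation has zero probability under the prior")
    return {state: value / evidence for state, value in joint.items()}


def predict_child(parent_dist, cpt):
    """Predictive analysis: p(B = b) = sum over a of p(B = b | A = a) p(A = a)."""
    child_states = next(iter(cpt.values())).keys()
    return {
        b: sum(cpt[a][b] * parent_dist[a] for a in parent_dist)
        for b in child_states
    }


# Uniform prior for A1 (principle of insufficient reason, n = 3).
prior_a1 = {"H": 1 / 3, "M": 1 / 3, "L": 1 / 3}

# Hypothetical likelihood of the observed data Y for each state of A1.
likelihood_y = {"H": 0.8, "M": 0.5, "L": 0.2}

# Hypothetical CPT p(B | A1); each row sums to 1.
cpt_b_given_a1 = {
    "H": {"H": 0.7, "M": 0.2, "L": 0.1},
    "M": {"H": 0.2, "M": 0.6, "L": 0.2},
    "L": {"H": 0.1, "M": 0.2, "L": 0.7},
}

posterior_a1 = bayes_update(prior_a1, likelihood_y)      # diagnostic step
predicted_b = predict_child(posterior_a1, cpt_b_given_a1)  # predictive step

print(posterior_a1)
print(predicted_b)
```

With the uniform prior, the posterior is simply the normalized likelihood, which makes it easy to see how an observation shifts belief away from the 1/3 starting point before it is propagated to the child.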
Summing the joint probability distribution over the remaining variables in this way is called marginalization, and it can be used to compute the reliability of systems based on statistical data (e.g., Cai et al. 2013; Nadkarni and Shenoy 2001; Poropudas and Virtanen 2011). Fundamentally, a BBN is used to update probabilities when new information becomes available. The network supports the computation of the probabilities of any subset of variables given evidence about any other subset (e.g., Hager and Andersen 2010; Janssens et al. 2006; Tesfamariam and Liu 2013).

8.3 BBN for risk of failure for water mains

Water mains of different materials, such as metallic, cementitious and plastic, exhibit significantly different causes of deterioration and modes of failure. Therefore, different considerations have to be made in determining the system failure index for different materials. Based on a comprehensive literature review (Table 2.3) and expert opinions, the risk factors of failure for a metallic pipe have been identified, and the proposed BBN is illustrated in Figure 8.2.

Figure 8.2: Proposed BBN model for risk of metallic water main failure

Twenty-one different factors, representing the deterioration and post-failure factors, are incorporated in this research (Al-Barqawi and Zayed 2008; Fares and Zayed 2010; Francisque et al. 2009; Jafar et al. 2003). The factors selected for the pipeline failure risk model are clustered into four main categories: (i) Water Quality Index (WQI), (ii) Hydraulic Capacity Index (HCI), (iii) Structural Integrity Index (SII) and (iv) Consequence Index (COI). These attributes can be gathered from different types of documents such as design information, visual inspection and maintenance reports. The consequence factors are hard to quantify, and thus a combined qualitative and quantitative approach is followed. Details of the model shown in Figure 8.2 are discussed in the subsequent subsections.
[Figure 8.2 image: the network nodes include the Aggregated Failure Risk Index; the Structural Integrity, Hydraulic Capacity, Water Quality and Consequence risk indices; and their input factors.]

8.3.1 Water quality index

Water quality in a water supply network can be described by specific microbiological, physicochemical and aesthetic attributes of the water. Water quality can be further classified into turbidity, biofilm growth and water colour. Turbidity (NTU) is a measure of the cloudiness of water and is used to assess microbial water quality and filtration effectiveness in a water supply network (e.g., Sadiq and Tesfamariam 2007; Sadiq et al. 2009). Turbidity below 1 NTU is classified as good, and turbidity of 1 NTU or more as bad. Biofilm growth includes the following sub-factors: water age, water velocity and free residual chlorine. Water age (WA), or water residence time inside the water supply network, is an important water quality parameter directly linked to the network hydraulic characteristics (Fares and Zayed 2010; Francisque et al. 2009). WA is measured in hours. Free residual chlorine (RC) represents the suitability of the water for human consumption, as it ensures microbial inactivation (e.g., Francisque et al. 2009; Rodriguez et al. 2003; Sadiq and Rodriguez 2004). RC is measured in mg/L. Colour in water is aesthetically undesirable and is also an indicator used to track contaminants in the water (e.g., Francisque et al. 2009; Sadiq et al. 2010).
Colour can be measured in apparent colour units (ACU), which indicate the levels of dissolved and suspended materials, iron and manganese in a water sample. The granularity of WA, RC and ACU is given in Table 8.1.

Table 8.1: Details of indices for water quality index

| Sub criteria | Surrogates and Unit | Performance Measure | Range |
|---|---|---|---|
| Turbidity | Turbidity (NTU) | Good | 0 ≤ NTU < 1 |
| | | Bad | NTU ≥ 1 |
| Biofilm growth | Water Age (WA) (hour) | Good | 0 ≤ WA < 30 |
| | | Moderate | 30 ≤ WA < 70 |
| | | Bad | WA ≥ 70 |
| | Free Residual Chlorine (RC) (mg/L) | Low | 0 ≤ RC < 0.30 |
| | | Medium | 0.30 ≤ RC < 0.60 |
| | | High | 0.60 ≤ RC < 1 |
| | | Very High | RC ≥ 1 |
| Colour | Colour (ACU) | Low | 0 ≤ ACU < 8 |
| | | Medium | 8 ≤ ACU < 15 |
| | | High | ACU ≥ 15 |

8.3.2 Hydraulic capacity index

The hydraulic capacity of a pipe ensures that there is sufficient pressure at the point of supply to provide an adequate flow to the consumers. The hydraulic capacity of the pipe or network depends on water pressure (WP) and water velocity (WV). WP represents hydraulic capacity failure through inadequate water supply to the customers, inadequate pressure for firefighting, etc., and possible water loss by leakage (e.g., Hu and Hubble 2007; Pickard and Levine 2006; Rogers 2011). WV maintains the network hydraulic capacity and also limits the microbial, physicochemical and aesthetic deterioration of water quality (e.g., Fares and Zayed 2010; Sadiq et al. 2009). The granularity of WP and WV is given in Table 8.2.

Table 8.2: Details of indices for hydraulic capacity index

| Sub criteria | Surrogates and Unit | Performance Measure | Range |
|---|---|---|---|
| Water Pressure | Water Pressure (WP) (m) | Very Low | -25 ≤ WP < -10 |
| | | Low | -10 ≤ WP < 35 |
| | | Medium | 35 ≤ WP < 70 |
| | | High | 70 ≤ WP < 120 |
| | | Very High | WP > 120 |
| Water Velocity | Water Velocity (WV) (m/s) | Low | 0 ≤ WV < 0.2 |
| | | Medium | 0.2 ≤ WV < 1 |
| | | High | 1 ≤ WV < 1.5 |
| | | Very High | WV ≥ 1.5 |

8.3.3 Structural integrity index

The structural deterioration of water mains and their subsequent failure are affected by physical factors, external factors and internal factors (Rajani and Tesfamariam 2004; Tesfamariam et al. 2006).
Physical factors include the following sub-factors: pipe diameter (PD), pipe age (PA), pipe length (PL) and pipe thickness (PT). PD is considered one of the vital parameters for water main failure (Kettler and Goulter 1985; Tesfamariam et al. 2006). PD is reported in mm and can be grouped as very small, small, medium, large and very large, corresponding respectively to [0-50], [50-200], [200-400], [400-600] and greater than 600. PA, or installation date, is reported in most of the literature as the most important parameter causing pipe deterioration (e.g., Berardi et al. 2008; Hu and Hubble 2007; Kim et al. 2006; Kleiner et al. 2004; Rajani and Tesfamariam 2007). PA is measured in years. PL has been considered one of the basic static parameters for structural integrity and is measured in m (e.g., Christodoulou et al. 2009; Hu and Hubble 2007; Jafar et al. 2003; Najafi and Kulandaivel 2005). PT, or wall thickness, is another important factor for metallic pipes (Tesfamariam et al. 2006) and is measured in mm. The granularity of PD, PA, PL and PT is given in Table 8.3.

Table 8.3: Details of indices for structural integrity index (physical factors)

| Sub criteria | Surrogates and Unit | Performance Measure | Range |
|---|---|---|---|
| Physical Factors | Pipe Diameter (PD) (mm) | Very Small | 0 ≤ PD < 50 |
| | | Small | 50 ≤ PD < 200 |
| | | Medium | 200 ≤ PD < 400 |
| | | Large | 400 ≤ PD < 600 |
| | | Very Large | PD ≥ 600 |
| | Pipe Age (PA) (years) | Very Low | 0 ≤ PA < 20 |
| | | Low | 20 ≤ PA < 40 |
| | | Medium | 40 ≤ PA < 60 |
| | | High | 60 ≤ PA < 80 |
| | | Very High | PA ≥ 80 |
| | Pipe Length (PL) (m) | Very Small | 0 ≤ PL < 25 |
| | | Small | 25 ≤ PL < 100 |
| | | Medium | 100 ≤ PL < 200 |
| | | Large | 200 ≤ PL < 500 |
| | | Very Large | PL ≥ 500 |
| | Pipe Thickness (PT) (mm) | Small | 0 ≤ PT < 10 |
| | | Medium | 10 ≤ PT < 20 |
| | | Large | PT ≥ 20 |

External factors deteriorate the exterior of the pipe through electrochemical corrosion, with the damage occurring in the form of corrosion pits. External factors can be further classified into soil corrosivity, freezing index (FI), type of traffic and type of road.
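Before a measured attribute can enter the BBN, it has to be mapped to one of the discrete states tabulated above. The sketch below shows one way to do this for the physical factors of Table 8.3, using half-open intervals (lower bound included, upper bound excluded). The helper name `discretize` and the data layout are assumptions made for illustration; only the break points and state labels come from Table 8.3.

```python
from bisect import bisect_right

# Break points and state labels transcribed from Table 8.3.
PHYSICAL_FACTOR_BINS = {
    "pipe_diameter_mm": ([50, 200, 400, 600],
                         ["Very Small", "Small", "Medium", "Large", "Very Large"]),
    "pipe_age_years": ([20, 40, 60, 80],
                       ["Very Low", "Low", "Medium", "High", "Very High"]),
    "pipe_length_m": ([25, 100, 200, 500],
                      ["Very Small", "Small", "Medium", "Large", "Very Large"]),
    "pipe_thickness_mm": ([10, 20],
                          ["Small", "Medium", "Large"]),
}


def discretize(factor, value):
    """Map a non-negative measurement to its state label.

    bisect_right counts how many break points are <= value, which is
    exactly the index of the half-open interval [lower, upper) containing it.
    """
    if value < 0:
        raise ValueError(f"{factor} must be non-negative, got {value}")
    edges, states = PHYSICAL_FACTOR_BINS[factor]
    return states[bisect_right(edges, value)]


# Illustrative pipe: 150 mm diameter, 45 years old, 120 m long, 12 mm wall.
sample = {
    "pipe_diameter_mm": 150,
    "pipe_age_years": 45,
    "pipe_length_m": 120,
    "pipe_thickness_mm": 12,
}
for factor, value in sample.items():
    print(f"{factor} = {value} -> {discretize(factor, value)}")
```

The same pattern would extend to the other tables in this section by adding further entries to the dictionary, since they all use ordered thresholds.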
Soil corrosivity (SCR) plays a very significant role in the corrosion process of different types of pipe material (e.g., Al-Barqawi and Zayed 2006b; Gorji-Bandpy and Shateri 2008; Rogers and Grigg 2006; Sadiq et al. 2005). Soil properties such as soil resistivity, soil pH, redox potential, soil sulphide content and moisture content are mainly responsible for the external corrosion of water mains (e.g., DIPRA 2000; Jafar et al. 2010; Najjaran et al. 2005; Sadiq et al. 2005; Singh 2011). Soil resistivity (SR) is a measure of how strongly a soil opposes the flow of electric current (e.g., Najjaran et al. 2005; Rogers and Grigg 2006; Sadiq et al. 2005). Soil pH (SpH) is a measure of the acidity or alkalinity of the soil. Redox potential (RP) is a measure of the degree of aeration in a soil. Low redox values may indicate that conditions are conducive to anaerobic microbial activity (Najjaran et al. 2005; Sadiq et al. 2005). RP is measured in mV. Sulphide content (SC) is responsible for microbiological corrosion, which results in metallic pipe deterioration (e.g., Hu and Hubble 2007; Najjaran et al. 2005; Sadiq et al. 2005). Moisture content (MC) is a measure of the drainage conditions for different types of soil (Najjaran et al. 2005; Sadiq et al. 2005). The granularity of SR, SpH, RP, SC and MC is shown in Table 8.4.

The freezing index (FI) is a surrogate measure for cold temperature effects on water main breaks. FI can be considered a surrogate measure for the live load induced by frost and is expressed in degree-days (e.g., Hu and Hubble 2007; Kleiner and Rajani 2002). Type of traffic (TT) indicates the average daily traffic on the road (e.g., Al-Barqawi and Zayed 2008; Christodoulou et al. 2009; Christodoulou et al. 2003). Type of road (TR) indicates the condition of the road under which the pipe passes (e.g., Al-Barqawi and Zayed 2006b; Al-Barqawi and Zayed 2008; Fares and Zayed 2010). The granularity of FI, TT and TR is given in Table 8.4.
Table 8.4: Details of indices for structural integrity index (external factors)

| Sub criteria | Surrogates and Unit | States | Range | AWWA(1) points |
|---|---|---|---|---|
| External Factors | Soil resistivity (SR) (Ω cm) | Very Low | 0 ≤ SR < 1500 | 10 |
| | | Low | 1500 ≤ SR < 1800 | 8 |
| | | Medium | 1800 ≤ SR < 2100 | 5 |
| | | Medium High | 2100 ≤ SR < 2500 | 2 |
| | | High | 2500 ≤ SR < 3000 | 1 |
| | | Very High | SR ≥ 3000 | 0 |
| | Soil pH (SpH) | Very Low | 0 ≤ SpH < 2 | 5 |
| | | Low | 2 ≤ SpH < 4 | 3 |
| | | Medium | 4 ≤ SpH < 7.5 | 0 |
| | | High | 7.5 ≤ SpH < 8.5 | 0 |
| | | Very High | SpH ≥ 8.5 | 3 |
| | Redox potential (RP) (mV) | Low | 0 ≤ RP < 50 | 4 |
| | | Medium | 50 ≤ RP < 100 | 3.5 |
| | | High | RP ≥ 100 | 0 |
| | Sulphide Content (SC) (mg/Kg) | Positive | SC ≥ 3 | 3.5 |
| | | Trace | 2 ≤ SC < 3 | 2 |
| | | Negative | SC < 2 | 0 |
| | Moisture Content (MC) | Poor Drainage | Continually wet | 2 |
| | | Fair Drainage | Generally moist | 1 |
| | | Good Drainage | Generally dry | 0 |
| | Freezing Index (FI) (Degree-day) | Low | FI < 0 | |
| | | Medium | 0 ≤ FI < 0.5 | |
| | | High | FI ≥ 0.5 | |
| | Type of Traffic (TT) | Heavy, Moderate, Low | | |
| | Type of Road (TR) | Local, Primary, Secondary, Free way, Arterial | | |

(1) AWWA: American Water Works Association

Internal factors deteriorate the interior of a pipe by tuberculation, erosion and crevice corrosion, which results in a reduced effective inside diameter as well as a breeding ground for bacteria (Sadiq et al. 2009). Water pH (WpH) and free residual chlorine (RC) are the two most important internal factors (Table 8.5). WpH indicates the aggressiveness of the water inside the pipe, which may cause leaching and deterioration of the water main (e.g., Francisque et al. 2009; Sadiq and Tesfamariam 2007). WpH between 0 and 5.5 is classified as low, between 5.5 and 8.75 as medium, and greater than 8.75 as high. The classification of RC is defined in Table 8.5.
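The AWWA points attached to the soil properties in Table 8.4 lend themselves to a simple additive score. The sketch below assumes that the points of the five soil properties are summed into one total, as in the AWWA 10-point soil evaluation approach; the source does not spell out how the points enter the BBN, so the summation, the function names and the sample soil values are all illustrative assumptions. The break points and points are as read from Table 8.4.

```python
from bisect import bisect_right

# Break points and AWWA points as read from Table 8.4.
SR_EDGES, SR_POINTS = [1500, 1800, 2100, 2500, 3000], [10, 8, 5, 2, 1, 0]
PH_EDGES, PH_POINTS = [2, 4, 7.5, 8.5], [5, 3, 0, 0, 3]
RP_EDGES, RP_POINTS = [50, 100], [4, 3.5, 0]
SULPHIDE_POINTS = {"Positive": 3.5, "Trace": 2, "Negative": 0}
MOISTURE_POINTS = {"Poor Drainage": 2, "Fair Drainage": 1, "Good Drainage": 0}


def points_from_range(value, edges, points):
    """Look up the points of the half-open interval that contains value."""
    return points[bisect_right(edges, value)]


def soil_corrosivity_points(resistivity, ph, redox, sulphide, moisture):
    """Total points, assuming a plain sum over the five soil properties."""
    return (
        points_from_range(resistivity, SR_EDGES, SR_POINTS)
        + points_from_range(ph, PH_EDGES, PH_POINTS)
        + points_from_range(redox, RP_EDGES, RP_POINTS)
        + SULPHIDE_POINTS[sulphide]
        + MOISTURE_POINTS[moisture]
    )


# Hypothetical soil sample: 1600 ohm-cm, pH 7.0, 75 mV,
# trace sulphides, fair drainage.
total = soil_corrosivity_points(1600, 7.0, 75, "Trace", "Fair Drainage")
print(total)
```

For this hypothetical sample the individual contributions are 8, 0, 3.5, 2 and 1 points. A higher total indicates a more corrosive soil under this scoring scheme.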
Table 8.5: Details of indices for structural integrity index (internal factors)

| Sub criteria | Surrogates and Unit | Performance Measure | Range |
|---|---|---|---|
| Internal Factors | Water pH (WpH) | Low | 0 ≤ WpH < 5.5 |
| | | Medium | 5.5 ≤ WpH < 8.75 |
| | | High | WpH ≥ 8.75 |
| | Free Residual Chlorine (RC) (mg/L) | Low | 0 ≤ RC < 0.30 |
| | | Medium | 0.30 ≤ RC < 0.60 |
| | | High | 0.60 ≤ RC < 1 |
| | | Very High | RC ≥ 1 |

8.3.4 Consequence index

In the case of a water main failure, the consequences depend on various factors such as the land use around the water main, the population that might be affected, and the diameter and length of the water main. The consequence index (COI) is further classified into pipe diameter (PD), population density (PP) and land use (LU) (Table 8.6). PD has an important impact on the consequence level in the event of a water main failure. The classification of PD is defined in Table 8.6. PP is used as a surrogate measure to quantify the number of people who will be affected by the water main failure (Fares and Zayed 2010). PP is measured in person/km2, and its extent is classified as very low, low, medium, high and very high, corresponding respectively to [0-415], [415-595], [595-830], [830-1195] and greater than 1195. LU indicates the usage of the territory that will be affected by the water main failure (Fares and Zayed 2010). The impact on land used for a hospital, a school or a vital day-to-day activity of the city is more important than the impact of the same failure on agricultural land (Francisque et al., 2009). LU, or the zoning system, is classified into low, medium and high.
Table 8.6: Details of indices for the consequence index

| Sub-criteria | Surrogate (unit) | Performance measure state | Range |
|---|---|---|---|
| Pipe diameter | Pipe diameter (PD) (mm) | Very Small | 0 ≤ PD < 50 |
| | | Small | 50 ≤ PD < 200 |
| | | Medium | 200 ≤ PD < 400 |
| | | Large | 400 ≤ PD < 600 |
| | | Very Large | PD ≥ 600 |
| Land use | Final land use consequence weight (FLUCW) | Low | 0 ≤ FLUCW ≤ 0.03 |
| | | Medium | 0.03 < FLUCW ≤ 0.05 |
| | | High | 0.05 < FLUCW ≤ 1 |
| Population | Population density (PP) (person/km²) | Very Low | 0 ≤ PP < 415 |
| | | Low | 415 ≤ PP < 595 |
| | | Medium | 595 ≤ PP < 830 |
| | | High | 830 ≤ PP < 1195 |
| | | Very High | PP ≥ 1195 |

8.3.5 Implementation of water mains risk in BBN

The propagation of the Bayesian belief network model for the risk of metallic water main failure is performed using the commercially available software package Netica (Norsys Software Corp, Vancouver, Canada, 2006). Using this software, the overall graphical representation of the proposed BBN model for metallic water main failure is generated (Figure 8.2). The Aggregated Failure Risk Index is defined using three states, AggreFailureRiskInL, AggreFailureRiskInM and AggreFailureRiskInH, which correspond to low (L), medium (M) and high (H) risk. Similarly, Structural Integrity Risk Index, Hydraulic Capacity Risk Index, Water Quality Risk Index, Consequence Risk Index, Structural Integrity Index, Hydraulic Capacity Index, Water Quality Index, Physical Factors, Internal Factors, External Factors, Impact of Freezing Index, Impact of Corrosion, Impact of Traffic, Biofilm Growth and Soil Corrosivity are also defined with low (L), medium (M) and high (H) states. Based on an algorithm provided in Netica, 38 nodes (16 parent or independent nodes and 22 child or dependent nodes), 47 links and 4108 conditional probabilities are generated. It is possible to compile and update nets whose CPTs are absent or incomplete, because Netica treats missing entries as uniform probabilities. A snapshot of these conditional probabilities is shown in Table 8.7. It should be noted that these CPT values can also be obtained through expert knowledge elicitation.
The CPT values summarized in Table 8.7 can be interpreted as follows.

Table 8.7: Example of conditional probability table for Aggregated Failure Risk Index

| Criteria (HydraulicCapRiskIn, WaterQuaRiskIn, StructuralInteRiskIn) | Aggregated Failure Risk Index (AggreFailureRiskInL, AggreFailureRiskInM, AggreFailureRiskInH) |
|---|---|
| (Low, Low, Low) | (95, 5, 0) |
| (Low, Low, Medium) | (80, 20, 0) |
| … | … |
| (Medium, Low, Medium) | (50, 50, 0) |
| (Medium, Low, High) | (30, 40, 30) |
| … | … |
| (High, Medium, High) | (0, 25, 75) |
| (High, High, Low) | (5, 30, 65) |
| … | … |

HydraulicCapRiskIn = Hydraulic Capacity Risk Index; StructuralInteRiskIn = Structural Integrity Risk Index; WaterQuaRiskIn = Water Quality Risk Index

Assume the values for the input set (HydraulicCapRiskIn, WaterQuaRiskIn, StructuralInteRiskIn) are (Medium, Low, High). The corresponding CPT values for the Aggregated Failure Risk Index are then (Table 8.7): (AggreFailureRiskInL, AggreFailureRiskInM, AggreFailureRiskInH) = (30, 40, 30). In other words, the conditional probabilities of the Aggregated Failure Risk Index being in the low, medium and high states are 30%, 40% and 30%, respectively.

8.3.6 Sensitivity Analysis of Aggregated Failure Risk Index

Sensitivity analysis assumes that the input parameters of the model are uncertain; it shows the designer how the system reliability varies given some variation in the input parameter values (Yang et al. 2009). It refers to how sensitive the performance of a model is to minor changes in the input parameters (Nadkarni and Shenoy 2001). Since the final output of a BBN depends on a priori assigned probabilities, sensitivity analysis is needed to identify the critical input parameters that have a significant impact on the output results (Ismail et al. 2011). Sensitivity analysis based on a Bayesian network also helps to identify uncertainties and thereby prioritize data collection (Laskey 1995).
Various methods have been proposed for carrying out sensitivity analysis in a Bayesian network (e.g., Castillo et al. 1997; Jensen 1996; Laskey 1995; Pearl 1988; Spiegelhalter and Lauritzen 1990). Because the input parameters required to evaluate failure risk take both discrete and continuous values, the variance reduction method (e.g., Cheng 1986; Janssens et al. 2006; Norsys Software Corp. 2006; Pearl 1988) is used here to determine the sensitivity of the BBN model's output to variation in a particular input parameter. The variance reduction method computes the reduction in variance of the expected real value of a query node Q (e.g., aggregated failure risk index) due to a finding at a varying variable node F (e.g., water quality index, WQI). The variance of the real value of Q given evidence f, V(Q|f), is computed using the following equation (Norsys Software Corp. 2006; Pearl 1988):

$$V(Q \mid f) = \sum_{q} p(q \mid f)\,\left[X_q - E(Q \mid f)\right]^2 \tag{8.2}$$

where q is the state of the query node Q, f is the state of the varying variable node F, p(q|f) is the conditional probability of q given f, X_q is the numeric value corresponding to state q, and E(Q|f) is the expected real value of Q after the new finding f for node F.

The results of the sensitivity analysis for the Aggregated Failure Risk Index node (AggreFailureRiskIn) are summarized in Table 8.8, which shows the variance reduction and the percentage of variance reduction for the parent nodes (independent input factors). Pipe Age makes the highest contribution to the variance reduction, 54.41%. To a lesser degree, Pipe Diameter and Population show sensitivities of 15.39% and 11.89%, respectively. Finally, Water Pressure, Resistivity, Pipe Length and Water Velocity show sensitivities in the range of 1.54% to 6.03%. The sensitivity of the other nodes is comparatively small.
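The variance calculation of Equation 8.2 can be illustrated with a minimal sketch. The function names, the numeric state values and the probabilities below are hypothetical and are not taken from the Netica model; the sketch also assumes that the variance reduction is the prior variance of Q minus the conditional variance averaged over the states of F.

```python
def expected_value(probs, values):
    """E(Q): probability-weighted mean of the numeric state values X_q."""
    return sum(p * x for p, x in zip(probs, values))


def variance(probs, values):
    """Equation 8.2: sum over q of p(q|f) * [X_q - E(Q|f)]^2."""
    mean = expected_value(probs, values)
    return sum(p * (x - mean) ** 2 for p, x in zip(probs, values))


def variance_reduction(p_f, p_q_given_f, values):
    """Prior variance of Q minus the expected variance after a finding at F.

    p_f          -- probabilities of the states f of node F
    p_q_given_f  -- one row of p(q|f) per state f
    values       -- numeric values X_q of the states of Q
    Averaging V(Q|f) over p(f) is an assumption of this sketch.
    """
    n_q = len(values)
    # Marginal p(q) = sum over f of p(f) * p(q|f)
    p_q = [sum(pf * row[k] for pf, row in zip(p_f, p_q_given_f))
           for k in range(n_q)]
    prior_var = variance(p_q, values)
    expected_post_var = sum(pf * variance(row, values)
                            for pf, row in zip(p_f, p_q_given_f))
    return prior_var - expected_post_var, prior_var


# Hypothetical example: Q has states low/medium/high, F has two states.
state_values = [15.0, 45.0, 80.0]   # illustrative X_q only
p_f = [0.5, 0.5]
p_q_given_f = [[0.8, 0.2, 0.0],
               [0.0, 0.2, 0.8]]
reduction, prior_var = variance_reduction(p_f, p_q_given_f, state_values)
print(f"variance reduction = {reduction:.2f} "
      f"({100 * reduction / prior_var:.1f}% of prior variance)")
```

Expressing the reduction as a share of the prior variance corresponds to the percentage column of a sensitivity table: a node whose findings leave the distribution of Q unchanged yields zero reduction, while a node that fully determines Q removes all of the variance.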
The total contribution of the parent (independent) nodes to the total percentage of variance reduction of the Aggregated Failure Risk Index is only 12.10%. The results of this analysis indicate that the sensitivity of a parent node depends significantly on the variability of its dependent children. Similarly, sensitivity analysis can be performed for other nodes such as HydraulicCapRiskIn, WaterQuaRiskIn and StructuralInteRiskIn to identify the significant factors for those nodes.

Table 8.8: Sensitivity analysis using proposed BBN model

| Node | Variance reduction | % Variance reduction |
|---|---|---|
| Pipe age | 279 | 54.41% |
| Pipe diameter | 78.93 | 15.39% |
| Population | 60.97 | 11.89% |
| Water pressure | 30.94 | 6.03% |
| Resistivity | 26.45 | 5.16% |
| Pipe length | 24.12 | 4.70% |
| Water velocity | 7.874 | 1.54% |
| Water age | 2.411 | 0.47% |
| Colour | 0.71 | 0.14% |
| Water pH | 0.68 | 0.13% |
| Land use | 0.3529 | 0.07% |
| Turbidity | 0.09714 | 0.02% |
| Free residual Cl | 0.08296 | 0.02% |
| Season | 0.0495 | 0.01% |
| Freezing index | 0.0343 | 0.01% |
| Redox potential | 0.02153 | 0.004% |
| Sulphide content | 0.01941 | 0.004% |
| Soil pH | 0.0044 | 0.001% |
| Moisture content | 0.00424 | 0.001% |
| Type of road | 0.00421 | 0.001% |
| Type of traffic | 0.00414 | 0.001% |

8.3.7 Scenario Analysis

The proposed BBN model has been checked against multiple hypothetical scenarios. The states of the criteria for three hypothetical scenarios are summarized in Table 8.9. Scenario 1 depicts a water distribution situation in which all the criteria are in their worst condition states, Scenario 2 represents all the criteria in medium or moderate condition states, and Scenario 3 shows all the criteria in favorable condition states. The proposed BBN is applied to these scenarios and the results are summarized in Table 8.9. For scenarios 1 to 3, the three states of AggreFailureRiskIn (AggreFailureRiskInL, AggreFailureRiskInM, AggreFailureRiskInH) are estimated as (0.34, 10.6, 89.1), (48.7, 32.3, 19.0) and (89.9, 9.84, 0.24) percent, respectively.
Table 8.9: Results of 3 scenarios using proposed BBN model

| Node | Scenario 1 | Scenario 2 | Scenario 3 |
|---|---|---|---|
| Water Pressure | Very Low | Medium | Medium |
| Water Velocity | Low | Medium | Medium |
| Water Age | Bad | Good | Good |
| Turbidity | Bad | Good | Good |
| Free Residual Chlorine | Low | High | Medium |
| Colour | High | Medium | Low |
| Season | Summer | Summer | Winter |
| Water pH | High | Medium | Medium |
| Pipe Diameter | Very Small | Medium | Very Large |
| Pipe Age | Very High | Medium | Very Low |
| Pipe Length | Very Large | Medium | Very Small |
| Freezing Index | High | Medium | Low |
| Resistivity | Very Low | Medium | Very High |
| Soil pH | Very High | Medium | Medium |
| Redox Potential | Low | Medium | High |
| Sulphide Content | Positive | Trace | Negative |
| Moisture Content | Poor Drainage | Fair Drainage | Good Drainage |
| Population | Very High | Medium | Very Low |
| Land Use | High | Moderate | Low |
| Type of Traffic | Heavy | Moderate | Low |
| Type of Road | Arterial | Secondary | Local |
| Aggregated Failure Risk Index: Low (%) | 0.34 | 48.7 | 89.9 |
| Aggregated Failure Risk Index: Medium (%) | 10.6 | 32.3 | 9.84 |
| Aggregated Failure Risk Index: High (%) | 89.1 | 19.0 | 0.24 |
| Aggregated Failure Risk Index: expected value | 76.4 ± 15 | 39.1 ± 26 | 20.6 ± 14 |

Table 8.9 shows that for scenario 1 the highest probability mass (89.1%) is assigned to the High condition state, whereas for scenarios 2 and 3 the highest probability masses (48.7% and 89.9%) are assigned to the Low condition state. This indicates that the failure risk of a water main will be very high if it is in poor structural condition (due to very high age, and very large diameter and length) and poor hydraulic condition (due to low pressure and velocity), and lies in a highly corrosive soil (due to high pH, low resistivity and redox potential, and poor drainage) and a populated area (due to high population and land use).

8.4 City of Kelowna: Case Study

The water distribution network of the City of Kelowna (BC, Canada) has been selected to demonstrate the application of the BBN model.
To prioritize water mains for M/R/R, a BBN-based water main prioritization tool has been developed that combines the likelihood of failure of each water main with the associated consequences in case of failure. The risk index helps water utility managers to take informed M/R/R decisions about each water main and, consequently, about all the metallic pipes in the network (graphics and maps for the whole network illustrate the situation for Summer 2012, but only the aggregated risk indices of the metallic pipes are shown here). The City of Kelowna's pipe inventory includes numerous pipe materials and diameters of highly variable age and condition (Table 8.10, Figure 8.3). In this study, the 259 metallic water mains (cast iron (CI), ductile iron (DI), galvanized (GALV) and steel (ST)) of the City of Kelowna are considered for further analysis.

Table 8.10: Water main pipe inventory for City of Kelowna

| Material | Number of pipes | Total length (m) | Diameter range (mm) |
|---|---|---|---|
| AC | 849 | 151430 | 100 to 500 |
| CI | 126 | 19541 | 100 to 400 |
| CON | 33 | 14044 | 500 to 900 |
| COP | 22 | 1532 | 25 to 50 |
| DI | 130 | 26354 | 100 to 1350 |
| GALV | 2 | 77 | 50 |
| HDPE | 4 | 617 | 50 to 812 |
| PVC | 1448 | 187387 | 50 to 750 |
| ST | 1 | 212 | 500 |

AC: asbestos cement; CI: cast iron; CON: concrete; COP: copper; DI: ductile iron; GALV: galvanized; HDPE: high density polyethylene; PVC: polyvinyl chloride; ST: steel

Figure 8.3: Number of water mains with respect to pipe material for City of Kelowna (bar chart; x-axis: material, y-axis: number of water mains)

8.4.1 Database management

Pipe attributes (e.g., age, diameter, length, material, breakage history) and water quality parameters are collected from the City of Kelowna's pipe inventory and from the sampling station data of its water quality department. Geographic Information System (GIS)-based data on water main pipe characteristics have been collected from the database.
8.4.1.1 Water quality data

Monthly water quality data for 2005 to 2012 have been collected from the City of Kelowna water quality department. To account for seasonal variations, the year is divided for simplicity into two seasons: summer, from May to October, and winter, from November to April. Free residual chlorine, water turbidity and colour have been collected from 14 water quality monitoring stations/sampling points (Table 8.11). The average value of each parameter has been calculated for summer and winter.

Table 8.11: Water quality sampling station name and geographic coordinates

| No. | Sampling station | GIS zone name | Latitude | Longitude |
|---|---|---|---|---|
| 1 | Fisher and Leader SS | FisherRoad | 49.8629695 | -119.4464278 |
| 2 | Gordon and Young Sample Station | GordonYoung | 49.82840135 | -119.4830400 |
| 3 | View crest & View crest Sample Station | ViewCrest | 49.79877409 | -119.5184612 |
| 4 | Water Lab @ 951 Raymer (sink tap) | Raymer | 49.86498202 | -119.4822085 |
| 5 | Bulk Water Filling Station | BulkFilling | 49.88581456 | -119.4384456 |
| 6 | Clement and Graham | CementGraham | 49.89344647 | -119.4794338 |
| 7 | Cooper Road Sampling Station | CooperRoad | 49.8762464 | -119.4430724 |
| 8 | Knox Mtn. Sample Station | Knox | 49.90452369 | -119.4928515 |
| 9 | Lawrence & Ellis Sampling Station | LawrenceEllise | 49.88516217 | -119.4936468 |
| 10 | Parkview Cr. Sampling Station | Parkview | 49.88167366 | -119.4224757 |
| 11 | Poplar Point Sampling Station | PoplandPoint | 49.91520544 | -119.4889891 |
| 12 | Sutherland Sampling Station | Sutherland | 49.88143515 | -119.4633982 |
| 13 | Clifton Road Sampling Station | Clifton | 49.91145397 | -119.4671881 |
| 14 | City Yard Office | CityYard | 49.88650583 | -119.4474363 |

8.4.1.2 Water velocity and pressure

A hydraulic EPANET-based model is used to predict the WDN nodal pressures and water velocities after defining the pipe elevations, the flow demand pattern and the basic characteristics of the pipes. The average pressure of a particular pipe (Pavg) can be calculated by averaging its two corresponding nodal pressures (Pi and Pj), as shown by Equation 8.3.
$$P_{avg} = \frac{P_i + P_j}{2} \tag{8.3}$$

Table 8.12 shows the available metallic pipes and their features: the ranges of pipe diameter, pipe length, installation year, pipe thickness, water velocity, water pressure, water age, water pH, free residual Cl, turbidity and colour.

Table 8.12: Available metallic pipe features of City of Kelowna

| Parameter | CI | DI | GALV | Steel |
|---|---|---|---|---|
| Pipe diameter (mm) | 100-400 | 100-1350 | 50 | 500 |
| Pipe length (m) | 13.69-412.92 | 12.03-1203.95 | 16.65-61.18 | 212.29 |
| Installation year | 1939-2004 | 1942-2009 | 1990 | 1980 |
| Pipe thickness (mm) | 9.91-15.24 | 8.1-20.6 | 3.6 | 12.7 |
| Water velocity (m/s) | 0.02-1.35 | 0.03-2.65 | 0.13-0.58 | 0.6 |
| Water pressure (m) | 14.11-74.17 | 2.57-210.76 | 3.75-10.67 | 17.99 |
| Water age (hour) | 3.75-12.96 | 0.87-120 | 0.67-112 | 11.32 |
| Water pH | 7.63-7.92 | 7.65-7.95 | 7.63-7.71 | 7.63-7.70 |
| Free residual Cl (mg/L) | 0.57-1.19 | 0.37-1.19 | 0.93-1.09 | 0.68-0.98 |
| Turbidity (NTU) | 0.29-0.44 | 0.28-0.44 | 0.38-0.39 | 0.36-0.38 |
| Colour (ACU) | 4.17-5.08 | 3.5-6.07 | 4.58-4.80 | 4.39-5.37 |

8.4.1.3 Soil corrosivity

The water distribution network has been divided into 15 zones based on the positions of 15 bore holes used to investigate soil and groundwater. Soil property data (resistivity, pH, redox potential, sulphide content and moisture content) have been collected for each of the 15 zones. Soil properties are classified according to the 10-point scoring method introduced by DIPRA (2000) and the American Water Works Association (AWWA 1999). A total score (Table 8.4) greater than 13 indicates high soil corrosivity, a score between 4.5 and 13 indicates medium soil corrosivity, and a score below 4.5 indicates low soil corrosivity (e.g., Najjaran et al. 2005; Sadiq et al. 2005). Because a pipe may intersect several soil zones, the highest soil corrosivity value among them is assigned to the pipe as a conservative choice. After the soil corrosivity of each water main has been generated, a parameter learning method based on the Expectation-Maximization (EM) algorithm is used to determine the CPTs at each node, given the link structure and the data.
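The point-scoring procedure described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function and variable names are hypothetical, the point values are those listed in Table 8.4, each band is treated as lower-inclusive, and a total of exactly 4.5 or 13 is assigned to the medium class.

```python
import bisect

# Band upper bounds and the points assigned to each band (Table 8.4).
# bisect_right sends a value equal to a bound into the next band, which
# matches the "lower <= value < upper" convention assumed here.
SCORING = {
    "soil_resistivity": ([1500, 1800, 2100, 2500, 3000], [10, 8, 5, 2, 1, 0]),
    "soil_ph": ([2, 4, 7.5, 8.5], [5, 3, 0, 0, 3]),
    "redox_potential": ([50, 100], [4, 3.5, 0]),
    "sulphide_content": ([2, 3], [0, 2, 3.5]),
}
MOISTURE_POINTS = {"poor": 2, "fair": 1, "good": 0}  # drainage condition


def soil_points(resistivity, ph, redox, sulphide, drainage):
    """Total points for one soil zone (higher means more corrosive)."""
    readings = {
        "soil_resistivity": resistivity,   # ohm cm
        "soil_ph": ph,
        "redox_potential": redox,          # mV
        "sulphide_content": sulphide,      # mg/kg
    }
    total = MOISTURE_POINTS[drainage]
    for name, value in readings.items():
        bounds, points = SCORING[name]
        total += points[bisect.bisect_right(bounds, value)]
    return total


def corrosivity_class(total_points):
    """Map a total score to high (> 13), medium (4.5 to 13) or low (< 4.5)."""
    if total_points > 13:
        return "high"
    if total_points >= 4.5:
        return "medium"
    return "low"


def pipe_corrosivity(zone_totals):
    """A pipe crossing several zones takes the most corrosive zone score."""
    return corrosivity_class(max(zone_totals))


# Hypothetical readings for two soil zones crossed by one pipe.
zone_a = soil_points(1200, 7.0, 40, 3.2, "poor")
zone_b = soil_points(2000, 7.0, 120, 1.0, "fair")
print(zone_a, zone_b, pipe_corrosivity([zone_a, zone_b]))
```

Taking the maximum over the intersected zones mirrors the conservative assignment described in the text; a length-weighted alternative would be less conservative and is not what the section describes.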
In case of a water main failure, the consequences depend on the land use above the water main, the population that might be affected, and the water main diameter. The procedure used to assign these factors, particularly land use and population, to each water main is described in the following sections.

8.4.1.4 Land use

The impacts of a water main failure vary according to the usage of the territory that will be affected. The land use or zoning system for the City of Kelowna has been collected from the GIS coordination department. The GIS database named Zoning has 96 land use codes, which are described in “Bylaws” (2012) and regrouped under agricultural, rural residential, urban residential, commercial, industrial, public & institutional, health district, and comprehensive development. Each land use received a weight expressing its relative importance. Using the analytic hierarchy process (AHP) developed by Saaty (1988), this relative weight has been attributed to each land use based on information on each of the 96 land uses gathered from documents provided by the City of Kelowna and other sources. AHP uses pairwise comparisons to estimate the preference weights (wi) of each factor of a group (Saaty 1988). For n contributory factors, these preference weights are normalized to sum to 1:

$$W = \sum_{i=1}^{n} w_i = 1; \qquad 0 \le w_i \le 1 \tag{8.4}$$

A relative weight (wi) for each land use has been estimated by taking the geometric mean of each row of the pairwise comparison matrix and normalizing it by the sum of the geometric means of all rows. However, a water main can be buried under various land uses.
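The geometric-mean weighting described above for the AHP step can be sketched as follows. The function name and the 3 × 3 pairwise comparison matrix are hypothetical and serve only to show the arithmetic; they are not the comparisons used for the 96 land use codes.

```python
import math


def ahp_weights(pairwise):
    """Row geometric means of a pairwise comparison matrix, normalized to sum to 1."""
    geo_means = [math.prod(row) ** (1.0 / len(row)) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


# Hypothetical comparison of three land use groups: entry [i][j] states how
# much more important group i is than group j, and [j][i] is its reciprocal.
pairwise = [
    [1.0, 3.0, 5.0],
    [1.0 / 3.0, 1.0, 2.0],
    [1.0 / 5.0, 1.0 / 2.0, 1.0],
]
weights = ahp_weights(pairwise)
print([round(w, 3) for w in weights], round(sum(weights), 6))
```

By construction the weights satisfy Equation 8.4: each lies between 0 and 1 and they sum to 1. For a perfectly consistent matrix the result coincides with the principal-eigenvector weights; for inconsistent judgments the geometric mean is an approximation.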
To account for all the land uses that a pipe is related to, and for the consequences on these land uses in case of its failure, the following equation is used:

$$FLUCW = \frac{L_1\,LUW_1 + L_2\,LUW_2 + \cdots + L_n\,LUW_n}{L_1 + L_2 + \cdots + L_n} \tag{8.5}$$

where FLUCW is the final land use consequence weight; Li is the length of the water main portion i that intersects land use i; LUWi is the relative weight of land use i; and n is the number of land uses intersected by the water main. For example, the FLUCW for water main 125566 (Figure 8.4) can be calculated from Equation 8.5 using LUW1 = weight assigned to I4 = 0.006; LUW2 = weight assigned to I2 = 0.006; and LUW3 = weight assigned to RU2 = 0.025:

$$FLUCW = \frac{31.96 \times 0.006 + 60.66 \times 0.006 + 4.87 \times 0.025}{31.96 + 60.66 + 4.87} \approx 0.01$$

For a qualitative approach, three classes or granularities, Low (L), Medium (M) and High (H), have been defined using the lower and upper limits shown in Table 8.6.

Figure 8.4: Land uses of water main

8.4.1.5 Population density

The impacts of a water main failure also vary according to the population size of the area that will be affected. The population density (person/km²) for each dissemination area (DA) within the territory served by the city water supply network has been calculated (Figure 8.5). A DA is a small, relatively stable geographic unit composed of one or more adjacent dissemination blocks (Statistics Canada 2001). It is the smallest standard geographic area for which all Statistics Canada census data are disseminated. DAs are uniform in terms of population size, which is targeted at 400 to 700 persons to avoid data suppression. A GIS file from the 2006 Statistics Canada census containing the size and population of each DA has been used.

Figure 8.5: Population distribution (person/km²) of City of Kelowna by DA

However, a water main can be related to many DAs.
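The length-weighted averaging of Equation 8.5 can be checked numerically with a short sketch. The function name is hypothetical; the segment lengths and land use weights are those quoted for water main 125566.

```python
def length_weighted_mean(lengths, weights):
    """Sum of L_i * W_i divided by the sum of L_i (form of Equation 8.5)."""
    if len(lengths) != len(weights):
        raise ValueError("lengths and weights must have the same size")
    total_length = sum(lengths)
    if total_length <= 0:
        raise ValueError("total intersected length must be positive")
    return sum(l * w for l, w in zip(lengths, weights)) / total_length


# Water main 125566: three segments crossing land uses I4, I2 and RU2.
segment_lengths = [31.96, 60.66, 4.87]        # m
land_use_weights = [0.006, 0.006, 0.025]      # LUW values from the text
flucw = length_weighted_mean(segment_lengths, land_use_weights)
print(f"FLUCW = {flucw:.4f} (rounds to {round(flucw, 2)})")
```

The unrounded value is about 0.0069, which rounds to the 0.01 reported in the text and falls in the Low class of Table 8.6 (0 ≤ FLUCW ≤ 0.03).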
To account for all the DAs that a pipe is related to and, in case of its failure, for the consequences on the population of these DAs, the following equation is used:

$$FPCW = \frac{\sum_{i=1}^{n} L_i\,PD_i}{\sum_{i=1}^{n} L_i} \tag{8.6}$$

where FPCW is the final population consequence weight; Li is the length of the ith portion of the pipe, which intersects the ith DA; PDi is the population density of the ith intersected DA; and n is the number of DAs intersected by the pipe.

After the data for the different criteria of each water main have been determined, the proposed BBN is applied to determine the aggregated failure risk index of the individual water mains of the City of Kelowna. The aggregated failure risk index is converted into three states, namely low (0 ≤ AggreFailureRiskIn < 30), medium (30 ≤ AggreFailureRiskIn < 60) and high (AggreFailureRiskIn ≥ 60). The user can fine-tune these thresholds if needed.

8.4.2 Results and discussions

In a BBN, the conditional dependencies are retained in the network structure, and the interactive links enable “evidence” to be incorporated into the model quickly. Simply selecting a particular state for the root node “Pipe Diameter” (e.g., specifying that the probability of high pipe diameter is 100%) provides an immediate estimate of the flow-on effects of that change on all nodes in the network. For example, for a very small diameter pipe in a water distribution system, the BBN suggests (Figure 8.2) that the structural integrity risk index will be 50.1 with a maximum probability of 43.1% for the high state, and that the aggregated pipe failure risk index will be 48.4 with maximum probabilities of 36.2% for the high state and 36.1% for the low state. For a very large diameter pipe, the structural integrity risk index will be 56.9 with a maximum probability of 54.4% for the high state, and the aggregated pipe failure risk index will be 54.7 with a maximum probability of 46.5% for the high state.
The above cases suggest that very large diameter pipes have a higher consequence or overall failure risk than small diameter pipes with all other variables held constant, as a water main with a larger diameter is expected to have a higher thickness than a water main with a smaller diameter. Figure 8.6 shows the aggregated risk index of the City of Kelowna water mains for summer and winter 2012. Only about 8% of the 259 metallic pipes are at low risk, that is, at a level of risk that can be considered acceptable. Almost 84% of the water mains incur a medium risk, and around 8% of the water mains are at high risk in both summer and winter. The utility managers must give immediate and special attention to these high-risk pipes and take appropriate preventive action (e.g., maintenance, or replacement after inspection) to avoid their failure. A risk index map for summer 2012 is also overlaid on the city GIS WDN frame, as shown in Figure 8.7, to improve the spatial comprehension of the risk distribution and thereby facilitate appropriate intervention by the utility managers.

Figure 8.6: Aggregated risk index for City of Kelowna water mains: (a) Summer 2012 (Low 8.49%, Medium 83.40%, High 8.11%); (b) Winter 2012 (Low 8.49%, Medium 83.78%, High 7.72%)

Figure 8.7: Aggregated risk index map for City of Kelowna water mains (Summer 2012)

8.5 Summary

A detailed literature review is performed to identify the major deterioration and consequence factors that are represented as nodes in the proposed model. It is widely acknowledged that the relationships among these factors are not linear, so a sophisticated process is required to represent these relationships and determine the pipe failure risk index.
In this study, a proof-of-concept BBN model is proposed to prioritize metallic water mains for maintenance, repair and replacement using structural integrity, hydraulic capacity, water quality and consequence factors. The causal relationships among these factors were established based on expert knowledge, published literature and training from available data or information. The proposed method is flexible enough to include more contributing factors and new information (consequences), as well as to incorporate priorities set by the agencies concerned. The water distribution network of the City of Kelowna is investigated to demonstrate the applicability of the proposed concept. The model is capable of providing qualitative and quantitative information on the effect of different factors on the pipe failure risk index. For medium and small WDNs, data or information is likely to be scarce and the number of experts available for a given system is likely to be small. The critical element of the BBN model is the use of experts to generate probabilities for system conditions for which data are not available. Using a BBN, it is possible to investigate changes to the distribution of states within root nodes by modifying the underlying probability distribution. By propagating new observations through the network, the BBN updates the prior probabilities, yielding posterior probabilities. These posteriors, unlike priors that are based mainly on generic data and expert knowledge, are more specific to the failure risk studied and better reflect its characteristics. This flexibility is particularly useful for modelling natural processes where the complexity of the interactions may preclude the collection of large data sets on which a model can be tested. A BBN can be used for both forward (prognostic) and backward (diagnostic) reasoning. For a BBN, model validation is relatively simple, though not necessarily rigorous, especially in data-poor environments.
The proposed BBN model can be a unique tool for small and medium utilities given their scarcity of data/information, lack of technical and financial resources, and limited experience. The sensitivity analysis will help decision makers to identify the significant parameters and guide them in modifying the network.

Chapter 9 Conclusions and Recommendations

9.1 Summary and conclusions

The main objective of this research was to develop a robust water main R&R action program for water utilities considering uncertainty at various levels. The amount and quality of water main break data available for developing or implementing different water main failure models vary among utilities. Medium to large sized water utilities have more data/information than small to medium sized utilities. The research starts with the development of water main failure models based on data/information availability. To handle predictive uncertainty, a Bayesian regression based water main failure model is developed that integrates OWA to select suitable response prediction curves based on the decision makers' degree of optimism and credibility. The results of the proposed Bayesian regression based failure model suggest that confidence and credible intervals are almost identical, mathematically and numerically, for linear models, because the classical model uses no prior and the Bayesian model uses a noninformative prior. Although the mean response predictions of the normal regression and Bayesian regression models are the same, the Bayesian regression model provides a better predicted response than the normal regression model. The model also allows decision makers to choose an appropriate response prediction curve, based on their credibility and degree of optimism, for further analysis and decision making. To provide a preliminary covariate or model selection method that considers uncertainties, a Bayesian model averaging (BMA) based water main failure prediction model is proposed to account for model uncertainty in a formal way.
Results indicate that, compared to classical analysis, BMA provides a transparent statement of the probability that a variable is associated with water main failure. Such an approach can help to identify the influential variables and shows better performance in a real-life study. The performance of the BMA approach is noticeably better than that of the classical normal regression model whenever only limited pipe failure data or information is available. The proposed approach can also incorporate the judgment or knowledge of the utility managers or authorities as a prior distribution and assess the overall impact.

For medium to large sized water utilities where sufficient data or failure information is available, a Bayesian updating based water main failure model is proposed to handle the uncertainty systematically. In the first part, BMA is conducted to provide insight into the selection of influential and appropriate covariates, and BWPHM is applied to develop survival curves for CI and DI pipes. The results indicate that the survival times of CI and DI pipes with NOPF=0 are higher than those with NOPF>0. The proposed model is able to estimate survival functions with a range of uncertainties for individual pipes within the distribution network and helps in taking appropriate preventive or corrective action. In the second part, a Bayesian updating based water main failure prediction framework is developed to update the performance of the pipe failure models and to quantify the uncertainties. The test results confirm that the Bayesian updating model effectively improves the performance of the water main failure models. For large utilities, where sufficient data or information is available, the Bayesian updating methodology can be applied to improve the performance of existing pipe failure models by quantifying the uncertainties arising from modeling errors.
Utility managers or authorities of small to medium sized utilities, by contrast, can use expert knowledge or judgment as a prior for the analysis and update the model whenever new data/information becomes available. The proposed model is able to incorporate the judgment of the utility managers or authorities as a prior distribution and to assess the overall impact of different management strategies on the long-term plan as well as on the level of funding or investment needed to maintain specific structural and hydraulic performance levels. The developed survival curves for CI and DI water mains can be integrated with an economic assessment model (e.g., life cycle costing) to estimate the costs of M/R/R and to develop optimal M/R/R plans.

To prioritize R/R/R recommendations for the water mains of small to medium-sized water utilities, an LCC-based DST is developed in this study. To develop water main failure prediction or break frequency models, a Bayesian regression model is proposed for its superior prediction capabilities and its ability to handle uncertainties. The proposed model also demonstrates how to fuse failure data/information from multiple utilities with demographic and environmental similarities in order to develop more statistically significant pipe failure models. The R/R/R costs of the water mains, provided by the decision makers based on site and road conditions, also reduce the uncertainty and unexpected expenses. To make the model more realistic and practical for effective decision making, the weight of the deterioration rate after an intervention or action is incorporated to represent the change in pipe condition. The proposed LCC module of the RiLCAMP model provides the decision maker with an effective and easy-to-use tool for making an informed decision.
The proposed tool will help utility engineers and managers to predict suitable new installation and/or rehabilitation programs as well as their corresponding costs, supporting effective and proactive decision making and thereby avoiding unexpected and unpleasant surprises. The LCC methodology developed in this research can be used for any type of WDN, in particular by small to medium-sized utilities. The proposed LCC module is a unique tool for small to medium utilities given their scarcity of data/information, lack of technical and financial resources, and limited experience.

The developed BBN-based pipe failure model can likewise be a unique tool for small to medium sized utilities for the same reasons. The causal relationships between structural integrity, hydraulic capacity, water quality and consequence factors were established based on expert knowledge, published literature and training from available data or information. The model is capable of providing qualitative and quantitative information on the effect of different factors on the pipe failure risk index, and it is flexible enough to include more contributing factors and new information (consequences) as well as to incorporate priorities set by the agencies concerned. The critical element of the BBN model is the use of experts to generate probabilities for system conditions for which data are not available.

Integrating the proposed robust Bayesian models with the GIS of the water utilities provides information at both the operation level and the network level. At the operation level, it can visualize the most ‘vulnerable’ and ‘sensitive’ pipes within the distribution network and help in taking appropriate preventive or corrective action.
At the network level, it can identify the total number of pipes that need immediate M/R/R actions, which will help the utility manager to address the structural and hydraulic failure of water mains proactively while meeting financial constraints, level of service and regulatory requirements.

9.2 Originality and contributions

Different levels of uncertainty pose a major challenge for water utilities in the development of a repair and replacement (R&R) action program. This research investigated and developed effective R&R action plans for small, medium and large sized water distribution systems considering uncertainties in a formal way. This work evaluated various water main failure prediction and life cycle cost analysis models. The literature shows that no comprehensive and general water main failure prediction model is available that could be applied to small, medium and large sized utilities, because of the variability and limitations of data/information and resources. Little has been done on how to assess the water main failure risk of small to medium-sized water utilities, owing to data/information inadequacy. This research developed a BBN-based pipe failure model for small to medium-sized water utilities, using expert knowledge and judgment to establish causal relationships between structural integrity, hydraulic capacity, water quality and consequence factors. A Bayesian updating framework is developed for large utilities with sufficient data/information to improve the performance of the pipe failure models and to quantify the uncertainties. Moreover, the research developed a methodology to consider uncertainties in model selection and to reduce the predictive uncertainties of water main failure using Bayesian model averaging and Bayesian regression models, respectively.
To develop an optimal and cost-effective R&R profile considering uncertainties at different levels, a Bayesian regression model is proposed for the water main break frequency models, and special attention is given to estimating the repair costs considering site location factors. Therefore, utilities would be able to estimate their future financial requirements for R&R plans with less risk and uncertainty. The proposed integrated framework will help the utility manager to better address the failure and consequences of water mains and to take effective decisions about water main R&R plans proactively while meeting financial constraints, level of service, and regulatory requirements.

9.3 Limitations and recommendations for future work

The deterioration of water mains leads to structural failure that has grave economic impacts due to treated water loss (loss of revenue, taxes), flooding of streets, pavement (loss of nearby infrastructure) and sometimes homes, and loss of business and costs associated with emergency response (loss of private property) (Clair and Sinha 2014; Makropoulos and Butler 2006; Sinha et al. 2008). In addition, water main failure affects the air, water, noise levels, and ecology of the environment, and also disrupts the social life of people through, for example, high traffic flow and pressure problems (Clair and Sinha 2014; Fares and Zayed 2010; Gorji-Bandpy and Shateri 2008; Hellstrom et al. 2000; Makropoulos and Butler 2004; Sinha et al. 2008; Studziński and Pietrucha-Urbanik 2012). In the proposed BBN-based water main failure model, more emphasis was placed on the likelihood than on the consequences of water main failure. Thus, a consequence-based decision support tool can be considered to improve knowledge of the condition of water mains and the magnitude of their potential consequences for consumers, businesses, and socio-economic activities.
The proposed water main failure BBN model will be extended into a multi-hazard framework, where external factors such as earthquakes (Jeon and O’Rourke 2005; Pineda-Porras and Najafi 2010; Tesfamariam and Liu 2013), seismic ground movement (Liu and Tesfamariam 2012; O’Callaghan 2012; Pineda-Porras and Ordaz 2010), ground rupture (Da 2007), landslides (Kinash and Najafi 2012), scouring (Rajani and Tesfamariam 2004), and climate change (Wols and van Thienen 2013) can be incorporated into the structural integrity index to determine their effect on water main failure. The proposed Bayesian regression and BMA models can be strengthened considerably by integrating them with the GIS of the water utilities to identify zones with high breakage rates. To improve the performance of the BMA models, other environmental parameters (e.g., rain deficit, freezing index) and hydraulic information (e.g., water velocity, pressure) can be integrated. The weather data used in this study, such as temperature, precipitation, and rainfall, were acquired from Environment Canada (2014) for the Calgary International Airport CS weather station (Latitude: 51°06'31.080" N, Longitude: 114°00'52.000" W). To reduce the uncertainty in environmental factors and land use, weather data from multiple sample stations, the zoning system of the City of Calgary, and the traffic load in the vicinity can be considered. In the proposed Bayesian updating based water main failure model, a normal distribution with mean 0.0 and precision 0.001 was used as the prior for the betas, whereas a Gamma distribution with mean 1.0 and precision 0.001 was used for alpha, the scale parameter, because no prior information was available. The performance of the proposed model can be further compared using other types of prior distribution, such as the log-normal, beta, and exponential distributions.
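To illustrate how weakly informative the normal prior stated above is, the following Python sketch converts the stated precision into a standard deviation and draws samples from the prior on the betas. It assumes that precision is defined as the reciprocal of the variance, as in BUGS-style software; this parameterization and the names precision_to_sd and sample_beta_prior are assumptions introduced here for illustration and are not taken from the thesis. The Gamma prior on alpha is not sketched, because its parameterization is not specified above.

```python
import math
import random

# Prior hyperparameters for the regression coefficients (betas),
# as stated in the text: mean 0.0 and precision 0.001.
BETA_PRIOR_MEAN = 0.0
BETA_PRIOR_PRECISION = 0.001


def precision_to_sd(precision: float) -> float:
    """Convert a precision to a standard deviation.

    Assumption: precision = 1 / variance (BUGS-style convention).
    """
    if precision <= 0:
        raise ValueError("precision must be positive")
    return math.sqrt(1.0 / precision)


def sample_beta_prior(n: int, seed: int = 0) -> list[float]:
    """Draw n samples from the normal prior on a single beta."""
    rng = random.Random(seed)  # seeded for reproducibility
    sd = precision_to_sd(BETA_PRIOR_PRECISION)
    return [rng.gauss(BETA_PRIOR_MEAN, sd) for _ in range(n)]


beta_prior_sd = precision_to_sd(BETA_PRIOR_PRECISION)

# Central 95% interval of the normal prior (z = 1.96 for a normal distribution).
prior_interval_95 = (
    BETA_PRIOR_MEAN - 1.96 * beta_prior_sd,
    BETA_PRIOR_MEAN + 1.96 * beta_prior_sd,
)

if __name__ == "__main__":
    print(f"prior sd: {beta_prior_sd:.2f}")
    print(f"95% prior interval: {prior_interval_95}")
    print(f"example draws: {sample_beta_prior(5)}")
```

Under this assumption, a precision of 0.001 corresponds to a variance of 1000 and a standard deviation of about 31.6, so the prior places very little constraint on the regression coefficients.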
For future research, the performance of the proposed BWPHM can be compared with other Bayesian survival analysis models such as NHPP, Poisson regression, and Weibull/exponential parametric models, or with models using other types of distributions as priors. How the accuracy of the model is affected by the division of the data per period can also be analyzed. Owing to the lack of available economic data, average repair costs for different site locations and pipe replacement or new pipe installation costs were considered in the proposed LCC-based DST for small to medium-sized water utilities. Although the social cost component could be significant in certain projects, most studies either do not account for social cost or estimate it only approximately (Rajani and Kleiner 2001, 2007). None of the studies provides any specific guideline on how to calculate that cost (Engelhardt et al. 2003; Moselhi et al. 2009; Rajani and Kleiner 2001, 2007). However, the mathematical formulation of this problem is very difficult, and the question remains whether all equations are valid for all times and all places. Moreover, collecting data related to social cost, such as the number of pedestrians, vehicles, lost taxes, etc., is very difficult and highly uncertain (Moselhi et al. 2009). In future, a network-based model can be developed to capture the nonlinear and complex interrelationships between these factors based on expert judgment and available data/information to reduce the uncertainty of decision making.

References

Agarwal, M. (2010). Developing a framework for selecting condition assessment technologies for water and wastewater pipes (Doctoral dissertation, Virginia Polytechnic Institute and State University). Albert, J. (2009). Bayesian Computation with R, 2nd Ed., Springer, New York, 304. Al-Barqawi, H., & Zayed, T. (2006a). Condition rating model for underground infrastructure sustainable water mains. Journal of Performance of Constructed Facilities, 20(2), 126-135.
Al-Barqawi, H., & Zayed, T. (2006). Assessment model of water main conditions. In The Pipeline Division Specialty Conference, Chicago, USA. doi: 10.1061/40854(211)27. Al-Barqawi, H., & Zayed, T. (2008). Sustainable infrastructure management: Performance of water main. Journal of Infrastructure Systems, 14(4), 305–318. Almeida, M.C., Leitão, J.P., & Coelho, S.T. (2011). Risk management in urban water infrastructures: application to water and wastewater systems. In Almeida, B., Gestão da Água, Incertezas e Riscos: Conceptualização operacional (Water management, uncertainty and risks: operational conceptualisation). Esfera do Caos, Lisbon, Portugal (in Portuguese). Alvisi, S., & Franchini, M. (2010). Comparative analysis of two probabilistic pipe breakage models applied to a real water distribution system. Civil Engineering and Environmental Systems, 27(1), 1-22. Amaitik, N. M., & Amaitik, S. M. (2008). Development of PCCP wire breaks prediction model using artificial neural networks. In Proceedings of the International Pipelines Conference, July 22-27, Atlanta, Georgia, USA. American Society of Civil Engineers (ASCE) Report Card. (2013). Retrieved from: http://www.infrastructurereportcard.org/a/#p/drinking-water/overview (Accessed on 18 March 2013). Ammar, M. A., Moselhi, O., & Zayed, T. M. (2012). Decision support model for selection of rehabilitation methods of water mains. Structure and Infrastructure Engineering, 8(9), 847-855. Anwar, P., Koester, P., & Harlow, K. (2005). Should I keep or replace it? A risk based approach to making asset rehabilitation and replacement (R&R) decisions. Proceedings of the Water Environment Federation, 2005(15), 1034-1046. Asnaashari, A., McBean, E. A., Shahrour, I., & Gharabaghi, B. (2009). Prediction of watermain failure frequencies using multiple and Poisson regression. Water Science and Technology: Water Supply, 9(1), 9-19. American Water Works Association. (1993).
American national standard for polyethylene encasement for ductile-iron pipe systems. In AWWA standards. AWWA. Banciulescu, C., & Sekuler, L. (2010). Developing a CIP Using a Deterioration Modeling and Field Sampling Approach. Proceedings of the Water Environment Federation, 2010(1), 645-661. Baracos, A., Hurst, W. D., & Legget, R. F. (1955). Effects of physical environment on cast-iron pipe. Journal (American Water Works Association), 47(12), 1195-1206. Barker, K., & Baroud, H. (2014). Proportional hazards models of infrastructure system recovery. Reliability Engineering & System Safety, 124, 201-206. Bennett, J. C., Bohoris, G. A., Aspinwall, E. M., & Hall, R. C. (1996). Risk analysis techniques and their application to software development. European Journal of Operational Research, 95(3), 467-475. Berardi, L., Giustolisi, O., Kapelan, Z., & Savic, D. A. (2008). Development of pipe deterioration models for water distribution systems using EPR. Journal of Hydroinformatics, 10(2), 113-126. Bolar, A., Sadiq, R., & Tesfamariam, S. (2013). Condition assessment for bridges: a hierarchical evidential reasoning (HER) framework. Structure and Infrastructure Engineering, 9(7), 648-666. Bolstad, W. (2007). Introduction to Bayesian Statistics, 2nd Ed., Hoboken, NJ: John Wiley and Sons, 464. Boussabaine, A., & Kirkham, R. (2004). Whole lifecycle costing: risk and risk responses. Journal of Construction Management and Economics, 2(10), 1103-1108. Box, G.E.P., & Tiao, G.C. (1992). Bayesian inference in statistical analysis, 1st Ed., Wiley, New York, 608. Boxall, J., O’Hagan, A., Pooladsaz, S., & Saul, A.J. (2007). Estimation of burst rates in water distribution mains. Water management, 160(2): 73–82. Brander, R. (2001). Water pipe materials in Calgary, 1970–2000. In AWWA Infrastructure Conference Proceedings. Brander, R., & Bill, N. (2000). Developing a condition assessment technique for water mains. Bubtiena, A.M., ElShafie, A.H., & Jaafar, O. (2011).
Performance improvement for pipe breakage prediction modeling using regression method. International Journal of the Physical Sciences, 6(25), 6025-6035. Cai, B., Liu, Y., Liu, Z., Tian, X., Zhang, Y., & Ji, R. (2013). Application of Bayesian Networks in Quantitative Risk Assessment of Subsea Blowout Preventer Operations. Risk Analysis, 33(7), 1293-1311. Canadian Infrastructure Report Card. (2012). Retrieved from: http://www.canadainfrastructure.ca (Accessed on 27 October 2012). Carlin, B., & Louis, T. (2009). Bayesian Methods for Data Analysis, 3rd Ed., Boca Raton, FL: Chapman and Hall, 552. Casella, G., & Berger, R.L. (2002). Statistical Inference. 2nd Ed. Duxbury Press, New York, 660. Castillo, E., Gutiérrez, J. M., & Hadi, A. S. (1997). Sensitivity analysis in discrete Bayesian networks. IEEE Transactions on Systems, Man, and Cybernetics Part A: Systems and Humans, 27(4), 412-423. Cox, D. (1972). Regression models and life-tables. Journal of the Royal Statistical Society, 34(2): 187–220. Cheng, R. C. H. (1986). Variance reduction methods. In Proceeding of the 1986 Winter Simulation Conference, 60-68. Christodoulou, S.E. (2011). Water network assessment and reliability analysis by use of survival analysis. Water Resources Management, 25(4): 1229–1238. Christodoulou, S., Aslani, P., & Vanrenterghem, A. (2003). Risk analysis framework for evaluating structural degradation of water mains in urban settings using neurofuzzy systems and statistical modeling techniques. In Proceedings of World water and environmental resources congress, ASCE, Reston, Va., USA, June 23-26, 1-9. Christodoulou, S., Deligianni, A., Aslani, P., & Agathokleous, A. (2009). Risk-based asset management of water piping networks using neurofuzzy systems. Computers, Environment and Urban Systems, 33(2): 138-149. City of Kelowna. (2012). Bylaws. Retrieved from: http://www.kelowna.ca/cm/page1329.aspx Clair, A.M.S., & Sinha, S. (2014).
Development of a standard data structure for predicting the remaining physical life and consequence of failure of water pipes. Journal of Performance of Constructed Facilities, 28(1): 191–203. Clark, C.M. (1971). Expansive-soil effect on buried pipe. Journal (American Water Works Association), 63, 424–427. Clark, R., Stafford, C., & Goodrich, J. (1982). Water distribution systems: a spatial and cost evaluation. Journal of the Water Resources Planning and Management Division, 108(WR3), 243-256. Clark, R. M., Carson, J., Thurnau, R. C., Krishnan, R., & Panguluri, S. (2010). Condition assessment modeling for distribution systems using shared frailty analysis. American Water Works Association. Journal, 102(7), 81. Clark, R. M., & Thurnau, R. C. (2011). Evaluating the risk of water distribution system failure: A shared frailty model. Frontiers of Earth Science, 5(4), 400-405. Cockburn, G., & Tesfamariam, S. (2012). Earthquake disaster risk index for Canadian cities using Bayesian belief networks. Georisk: Assessment and Management of Risk for Engineered Systems and Geohazards, 6(2), 128-140. Cooper, G. F., & Herskovits, E. (1992). A Bayesian method for the induction of probabilistic networks from data. Machine Learning, 9, 309-347. Cox, D.R. (1992). Regression models and life-tables. In Breakthroughs in Statistics (pp. 527-541). Springer New York. Da, H. (2007). Evaluation of ground rupture effect on buried HDPE pipelines. PhD dissertation, Rensselaer Polytechnic Institute, New York, USA. Dandy, G.C., & Engelhardt, M. (2006). Multi-objective trade-offs between cost and reliability in the replacement of water mains. Journal of water resources planning and management, 132(20): 79–88. Dandy, G. C., & Engelhardt, M. (2001). Optimal scheduling of water pipe replacement using genetic algorithms. Journal of Water Resources Planning and Management, 127(4), 214-223. Debón, A., Carrión, A., Cabrera, E., & Solano, H. (2010).
Comparing risk of failure models in water supply networks using ROC curves. Reliability Engineering & System Safety, 95(1), 43-48. Demissie, G., Tesfamariam, S., Brander, R., & Sadiq, R. (2014). Modelling Soil Corrosivity Using Bayesian Belief Network. Journal of Infrastructure Systems, Manuscript submitted for publication. Deng, Y., Sadiq, R., Jiang, W., & Tesfamariam, S. (2011). Risk analysis in a linguistic environment: a fuzzy evidential reasoning-based approach. Expert Systems with Applications, 38(12), 15438-15446. DIPRA (Ductile Iron Pipe Research Association) (2000). Polyethylene encasement – effective, economical protection for ductile iron pipe in corrosive environments. Retrieved from: http://www.dipra.org/wp-content/uploads/Corrosion-Control-PolyEncasement.pdf Dridi, L., Mailhot, A., Parizeau, M., & Villeneuve, J.P. (2009). Multiobjective approach for pipe replacement based on Bayesian inference of break model parameters. Journal of Water Resources Planning and Management, 135(5): 344-354. Dridi, L., Mailhot, A., Parizeau, M., & Villeneuve, J. P. (2005, September). A strategy for optimal replacement of water pipes integrating structural and hydraulic indicators based on a statistical water pipe break model. In Proceedings of the 8th International Conference on Computing and Control for the Water Industry (pp. 65-70). Du, F., Woods, G. J., Kang, D., Lansey, K. E., & Arnold, R. G. (2012). Life cycle analysis for water and wastewater pipe materials. Journal of Environmental Engineering, 139(5), 703-711. Economou, T., Kapelan, Z., & Bailey, T. C. (2012). On the prediction of underground water pipe failures: zero inflation and pipe-specific effects. Journal of Hydroinformatics, 14(4), 872–883. Economou, T., Kapelan, Z., & Bailey, T.C. (2008). A zero-inflated Bayesian model for the prediction of water pipe bursts. In Proceedings of the 10th Annual Water Distribution Systems Analysis Conference WDSA2008, Van Zyl, J.E., Ilemobade, A.A., Jacobs, H.E.
(eds.), August 17-20, 2008, Kruger National Park, South Africa. Economou, T., Kapelan, Z., & Bailey, T.C. (2007). An aggregated hierarchical Bayesian model for the prediction of pipe failures. In Proceedings of 9th International Conference on Computing and Control for the Water Industry (CCWI), Leicester, UK. Efron, B., & Tibshirani, R. (1997). Improvements on cross-validation: the 632+ bootstrap method. Journal of the American Statistical Association, 92(438), 548-560. Engelhardt, M., Savic, D., Skipworth, P., Cashman, A., Saul, A., & Walters, G. (2003). Whole life costing: application to water distribution network. Water Science and Technology: Water Supply, 3(1-2), 87-93. Environment Canada. (2013). Climate data online. <http://weather.gc.ca/city/pages/ab-52_metric_e.html>. EPA (2009). State of Technology Review Report on Rehabilitation of Wastewater Collection and Water Distribution Systems. United States Environmental Protection Agency, Cincinnati, Ohio, USA. Fadaee, M., & Tabatabaei, R. (2010). Estimation of failure probability in water pipes network using statistical model. World Applied Sciences Journal, 11(9), 1157-1163. Farran, M., & Zayed, T. (2012). New life-cycle costing approach for infrastructure rehabilitation. Engineering, Construction and Architectural Management, 19(1), 40-60. Fares, H., & Zayed, T. (2010). Hierarchical fuzzy expert system for risk of failure of water mains. Journal of Pipeline Systems Engineering and Practice,1(1), 53-62. Filev, D.P., & Yager, R.R. (1998). On the issue of obtaining OWA operator weights. Fuzzy Sets and Systems, 94, 157–169. Flintsch, G. W., & Chen, C. (2004). Soft computing applications in infrastructure management. Journal of Infrastructure Systems, 10(4), 157-166. Francis, R. A., Guikema, S. D., & Henneman, L. (2014). Bayesian Belief Networks for predicting drinking water distribution system pipe breaks. Reliability Engineering & System Safety, 130, 1-11. 
Francisque, A., Shahriar, A., Islam, N., Betrie, G., Siddiqui, R.B., Tesfamariam, S., & Sadiq, R. (2014). A decision support tool for water mains renewal for small to medium sized utilities: a risk index approach. Journal of Water Supply: Research and Technology-AQUA, 63(4), 281–302. Francisque, A., Rodriguez, M. J., Sadiq, R., Miranda, L. F., & Proulx, F. (2009). Prioritizing monitoring locations in a water distribution network: a fuzzy risk approach. Journal of Water Supply: Research and Technology-AQUA, 58(7), 488-509. Fuchs-Hanusch, D., Kornberger, B., Friedl, F., & Scheucher, R. (2012). Whole of life cost calculations for water supply pipes. Water asset management international, 8(2), 19-24. Gelman, A., Carlin, J.B., Stern, H.S., & Rubin, D.B. (1995). Bayesian Data Analysis. Chapman & Hall: London. Gilks, W. R., & Wild, P. (1992). Adaptive rejection sampling for Gibbs sampling. Applied Statistics, 337-348. Glick, S., & Guggemos, A.A. (2013). Rethinking wastewater-treatment infrastructure: case study using life-cycle cost and life-cycle assessment to highlight sustainability considerations. Journal of Construction Engineering and Management, doi: 10.1061/(ASCE)CO.1943-7862.0000762. Gorji-Bandpy, M., & Shateri, M. (2008). Analysis of pipe breaks in urban water distribution network. Greater Mekong Subregion Academic And Research Network International Journal, 2(3), 117-124. Gould, S., Boulaire, F., Marlow, D., & Kodikara, J. (2009). Understanding how the Australian climate can affect pipe failure. In Proceedings of OzWater, 9. Goulter, I.C., & Kazemi, A. (1988). Spatial and temporal groupings of watermain pipe breakage in Winnipeg. Canadian Journal of Civil Engineering, 15(1), 91-97. Gransberg, D. D., & Diekmann, J. (2004). Quantifying pavement life cycle cost inflation uncertainty. AACE International Transactions, RI81. Hager, D., & Andersen, L. B. (2010).
A knowledge based approach to loss severity assessment in financial institutions using Bayesian networks and loss determinants. European Journal of Operational Research, 207(3), 1635-1644. Haider, H., Sadiq, R., & Tesfamariam, S. (2013). Performance indicators for small-and medium-sized water supply systems: a review. Environmental reviews, 22(1), 1-40. http://www.collinsbarrow.com/uploads/offices/toronto/Litigation_Accounting_and_Valuation_Services/Discount_Rates_2016.pdf https://www.ontario.ca/laws/regulation/900194 http://www.watermainbreakclock.com/ He, Y., & Huang, R. H. (2008). Risk attributes theory: decision making under risk. European Journal of Operational Research, 186(1), 243–260. Hellstrom, D., Jeppsson, U., & Karrman, E. (2000). A framework for systems analysis of sustainable urban water management. Environmental Impact Assessment Review, 20, 311–321. Herstein, L., & Filion, Y. (2010). Life-cycle analysis of water main materials in the optimal design of the 'Anytown' water network. In Water Distribution Systems Analysis 2010 (pp. 822-832). ASCE. Herstein, L. M., Filion, Y. R., & Hall, K. R. (2010). Evaluating the environmental impacts of water distribution systems by using EIO-LCA-based multiobjective optimization. Journal of Water Resources Planning and Management, 137(2), 162-172. Hill, M.C., & Tiedeman, C.R. (2007). Effective calibration of ground water models, with analysis of data, sensitivities, predictions, and uncertainty. John Wiley and Sons, New York, 480. Hoeting, J.A., Madigan, D., Raftery, A.E., & Volinsky, C.T. (1999). Bayesian model averaging: a tutorial. Statistical science, 382-401. Hubell. (2003). Corrosion Guide. Helical Screw Foundation System Design Manual for New Construction. http://www.vickars.com/screwpile_manual/PDF_Files/Step7-Corrosion Guide. pdf. Hu, Y., & Hubble, D. W. (2007). Factors contributing to the failure of asbestos cement water mains. Canadian Journal of Civil Engineering, 34(5), 608-621.
Hudak, P.F., Sadler, B., & Hunter, B. A. (1998). Analyzing underground water-pipe breaks in residual soils. Water Engineering and Management, 145(12), 15-20. Ibrahim, J. G., Chen, M. H., & Sinha, D. (2005). Bayesian survival analysis. John Wiley & Sons, Ltd. InfraGuide. (2006). Managing risk: Decision making and investment planning. Retrieved from: http://www.infraguide.ca/bestPractices/PublishedBP_e.asp#dmip_ _Jan. 17, 2007. Infrastructure Report. (2007). The water main break clock. Retrieved from: Canada Free Press, http://www.canadafreepress.com/infrastructure.htm (Accessed on January 23, 2012). Ismail, M. A., Sadiq, R., Soleymani, H. R., & Tesfamariam, S. (2011). Developing a road performance index using a Bayesian belief network model. Journal of the Franklin Institute, 348(9), 2539-2555. Jacobs, P., & Karney, B. (1994). GIS development with application to cast iron water main breakage rates. In Proceedings of 2nd International Conference on Water Pipeline Systems, Edinburgh. Mechanical Engineering Publication Ltd, London. Jafar, R., Eisenbeis, P., & Shahrour, I. (2003). Modeling of the structural degradation of an urban water distribution system. In Proceeding of the 17th (EJSW) on Rehabilitation management of urban infrastructure networks, Dresden, Germany, 57-65. Jafar, R., Shahrour, I., & Juran, I. (2010). Application of artificial neural networks (ANN) to model the failure of urban water mains. Mathematical and Computer Modelling, 51(9-10), 1170-1180. Janssens, D., Wets, G., Brijs, T., Vanhoof, K., Arentze, T., & Timmermans, H. (2006). Integrating Bayesian networks and decision trees in a sequential rule-based transportation model. European Journal of Operational Research, 175(1), 16–34. Jensen, F.V. (1996). An Introduction to Bayesian Networks. UCL Press, London. Jeon, S.S., & O’Rourke, T.D. (2005). Northridge earthquake effects on pipelines and residential buildings. Bulletin of the Seismological Society of America, 95(1), 294-318. Joseph, S.
A., Adams, B. J., & McCabe, B. (2010). Methodology for Bayesian belief network development to facilitate compliance with water quality regulations. Journal of Infrastructure Systems, 16(1), 58-65. Kabir, G., Tesfamariam, S., Loeppky, J., & Sadiq, R. (2015a). Integrating Bayesian Linear Regression with Ordered Weighted Averaging (OWA): An Uncertainty Analysis for Predicting Water mains' Failure. ASCE-ASME Journal of Risk and Uncertainty in Engineering Systems Part A: Civil Engineering, 1(3), 04015007, 1-17. Kabir, G., Tesfamariam, S., & Sadiq, R. (2016). Bayesian Model Averaging for the Prediction of Water Main Failure for Small to Large Canadian Municipalities. Canadian Journal of Civil Engineering, 43(3): 233-240. Kabir, G., Tesfamariam, S., & Sadiq, R. (2015b). Prediction of water mains failure: a Bayesian approach. In the proceedings of the 12th International Conference on Applications of Statistics and Probability in Civil Engineering (ICASP), Vancouver, Canada, July 12-15, 2015. Kabir, G., Tesfamariam, S., & Sadiq, R. (2015c). Predicting water main failures using Bayesian model averaging and survival modelling approach. Reliability Engineering & System Safety, 142, 498-514. Kabir, G., Tesfamariam, S., Loeppky, J., & Sadiq, R. (2015d). Predicting Water Main Failures: A Bayesian Model Updating Approach. Knowledge-based Systems, Under Review, Submitted on December 13, 2015. Kabir, G., Demissie, G., Sadiq, R., & Tesfamariam, S. (2015e). A Life Cycle Cost Approach for Water Mains' Renewal in Small to Medium-sized Water Utilities. Urban Water Journal, Under Review, Submitted on December 14, 2015. Kabir, G., Tesfamariam, S., Francisque, A., & Sadiq, R. (2015f). Evaluating Risk of Water Mains Failure using a Bayesian Belief Network Model. European Journal of Operational Research, 240(1): 220–234. Kabir, G., Demissie, G., Sadiq, R., & Tesfamariam, S. (2015g). Integrating Failure Prediction Models for Water Mains: Bayesian Belief Network Based Data Fusion.
Knowledge-Based Systems, 85, 159-169. Kettler, A. J., & Goulter, I. C. (1985). An analysis of pipe breakage in urban water distribution networks. Canadian Journal of Civil Engineering, 12(2), 286-93. Kim, J., Bae, C., & Kim, J. (2006). Evaluation models of deteriorated water mains for replacement/rehabilitation by field surveying. In Proceedings of 8th annual water distribution systems analysis symposium, Ohio, USA. Kimutai, E., Betrie, G., Brander, R., Sadiq, R., & Tesfamariam, S. (2015). Comparison of Statistical Models for Predicting Pipe Failures: Illustrative Example with the City of Calgary Water Main Failure. Journal of Pipeline Systems Engineering and Practice, 6(4), 04015005. Kinash, O., & Najafi, M. (2012). Large-Diameter Pipe Subjected to Landslide Loads. Journal of Pipeline Systems Engineering and Practice, 3(1), 1–7. Klein, J. P., & Moeschberger, M. L. (2005). Survival analysis: techniques for censored and truncated data. Springer Science & Business Media. Kleiner, Y., & Rajani, B. B. (2010). I-WARP: Individual water main renewal planner. Drinking Water Engineering and Science, 3, 71-77. Kleiner, Y., & Rajani, B. (2002). Forecasting variations and trends in water-main breaks. Journal of infrastructure systems, 8(4), 122-131. Kleiner, Y., & Rajani, B. (2001a). Comprehensive review of structural deterioration of water mains: statistical models. Urban water, 3(3), 131-150. Rajani, B., & Kleiner, Y. (2001b). Comprehensive review of structural deterioration of water mains: physically based models. Urban water, 3(3), 151-164. Kleiner, Y., & Rajani, B. (1999). Using limited data to assess future needs. American Water Works Association. Journal, 91(7), 47-62. Kleiner, Y., & Rajani, B. (2000). Considering time-dependent factors in the statistical prediction of water main breaks. In Proceedings of American Water Works Association Infrastructure Conf. AWWA, Baltimore, Denver. Kleiner, Y., Adams, B.J., & Rogers, J.S. (2001).
Water distribution network renewal planning. Journal of Computing in Civil Engineering, 15(1): 15-26. Kleiner, Y., Rajani, B., & Sadiq, R. (2006). Failure risk management of buried infrastructure using fuzzy-based techniques. Journal of Water Supply Research and Technology: AQUA, 55(2), 81-94. Kleiner, Y., Rajani, B., & Sadiq, R. (2005). Risk management of large-diameter water transmission mains. Denver, Colo.: AWWA, AwwaRF and National Research Council of Canada. Kleiner, Y., Sadiq, R., & Rajani, B. (2004, August). Modeling failure risk in buried pipes using fuzzy Markov deterioration process. In ASCE international conference on pipeline engineering and construction (pp. 1-12). Klir, G., & Yuan, B. (1995). Fuzzy sets and fuzzy logic (Vol. 4). New Jersey: Prentice Hall. Kumar, D., & Klefsjö, B. (1994). Proportional hazards model: a review. Reliability Engineering & System Safety, 44(2), 177–188. Laskey, K.B. (1995). Sensitivity analysis for probability assessments in Bayesian networks. IEEE Transactions on Systems Man and Cybernetics, 25(6), 901-909. Lauría, E.J.M., & Duchessi, P.J. (2007). A methodology for developing Bayesian networks: An application to information technology (IT) implementation. European Journal of Operational Research, 179(1), 234-252. Lawrence, W. W. (1976). Of acceptable risk. William Kaufmann Inc., Los Altos, CA. Le Gat, Y., & Eisenbeis, P. (2000). Using maintenance records to forecast failures in water networks. Urban Water, 2(3), 173-181. Leamer, E.E. (1978). Specification Searches, New York: Wiley. Lim, S. R., Park, D., & Park, J. M. (2008). Analysis of effects of an objective function on environmental and economic performance of a water network system using life cycle assessment and life cycle costing methods. Chemical Engineering Journal, 144(3), 368-378. Lim, S. R., Park, D., Lee, D. S., & Park, J. M. (2006).
Economic evaluation of a water network system through the net present value method based on cost and benefit estimations. Industrial & engineering chemistry research, 45(22), 7710-7718. Liu, Z., Sadiq, R., Rajani, B., & Najjaran, H. (2009). Exploring the relationship between soil properties and deterioration of metallic pipes using predictive data mining methods. Journal of Computing in Civil Engineering, 24(3), 289-301. Liu, Z., & Tesfamariam, S. (2012). Prediction of lateral spread displacement: data-driven approaches. Bulletin of Earthquake Engineering, 10(5), 1431-1454. Lu, D., Ye, M., & Hill, M.C. (2012). Analysis of regression confidence intervals and Bayesian credible intervals for uncertainty quantification. Water Resources Research, 48(9), 1753. Mackenzie, H. (2013). Canada's Infrastructure Gap: Where it Came from and why it Will Cost So Much to Close. Canadian Centre for Policy Alternatives. Madigan, D., & Raftery, A.E. (1994). Model selection and accounting for model uncertainty in graphical models using Occam’s window. Journal of the American Statistical Association, 89, 1535–1546. Mailhot, A., Pelletier, G., Noël, J.-F., & Villeneuve, J.-P. (2000). Modeling the evolution of the structural state of water pipe networks with brief recorded pipe break histories: Methodology and application. Water Resources Research, 36(10): 3053–3062. Makar, J. (2000). A preliminary analysis of failures in grey cast iron water pipes. Engineering Failure Analysis, 7(1), 43–53. Makar, J., Desnoyers, R., & McDonald, S. (2001). Failure modes and mechanisms in gray cast iron pipe. Underground infrastructure research, 1–10. Makropoulos, C. K., & Butler, D. (2006). Spatial ordered weighted averaging: incorporating spatially variable attitude towards risk in spatial multi-criteria decision-making. Environmental Modelling & Software, 21(1), 69-84. Makropoulos, C. K., & Butler, D. (2004). Spatial decisions under uncertainty: fuzzy inference in urban water management.
Journal of Hydroinformatics, 6(1), 3-18. Malandain, J., Le Gauffre, P., & Miramond, M. (1999). Modeling the aging of water infrastructure. Proceedings of the 13th EJSW, 8. Markovic, S., Cerekovic, N., Kljajic, N., & Rudan, N. (2012). Rainfall analyses and water deficit during growing season in Banja Luka Region. In Proceedings of The 1st International Congress of Ecologists, Ecological Spectrum 2012, 20-21 April, Bosnia and Herzegovina. Marlow, D. R., Beale, D. J., & Mashford, J. S. (2012). Risk-based prioritization and its application to inspection of valves in the water sector. Reliability Engineering & System Safety, 100, 67-74. Martins, A. (2011). Stochastic models for prediction of pipe failures in water supply systems. Instituto Superior Tecnico, Universidade tecnica de Lisboa, 85. Martin, A.D., Quinn, K.M., & Park, J.H. (2011). MCMCpack: Markov Chain Monte Carlo in R. Journal of Statistical Software, 42(9), 1-21. Matos, M. A. (2007). Decision under risk as a multicriteria problem. European Journal of Operational Research, 181(3), 1516–1529. McClave, J.T., & Sincich, T. (2000). Statistics, 8th Ed., Prentice Hall, Englewood Cliffs, NJ. Mianabadi, H., Sheikhmohammady, M., Mostert, E., & Van de Giesen, N. (2014). Application of the Ordered Weighted Averaging (OWA) method to the Caspian Sea conflict. Stochastic Environmental Research and Risk Assessment, 28, 1359–1372. Mirza, S. (2007). Danger ahead: the coming collapse of Canada’s municipal infrastructure. Federation of Canadian Municipalities. Moselhi, O., Zayed, T., & Salman, A. (2009). Selection method for rehabilitation of water distribution networks. In Proceedings of ICPTT 2009: Advances and Experiences with Pipelines and Trenchless Technology for Water, Sewer, Gas, and Oil Applications, 1390-1402. Moustafa, A. M. (2010). Risk Based Decision Making Tools for Sewer Infrastructure Management. PhD dissertation, University of Cincinnati, Cincinnati, USA. Nadkarni, S., & Shenoy, P. P. (2001).
A Bayesian network approach to making inferences in causal maps. European Journal of Operational Research, 128(3), 479–498.
Nafi, A., & Kleiner, Y. (2009). Scheduling renewal of water pipes while considering adjacency of infrastructure works and economies of scale. Journal of Water Resources Planning and Management, 136(5), 519-530.
Najafi, M., & Kulandaivel, G. (2005). Pipeline condition predicting using neural network models. In Proceedings of Pipelines 2005: Optimizing Pipeline Design, Operations, and Maintenance in Today’s Economy, ASCE, New York, 767-781.
Najjaran, H., Sadiq, R., & Rajani, B. (2005). Condition assessment of water mains using fuzzy evidential reasoning. In Proceedings of IEEE International Conference on Systems, Man and Cybernetics, Ottawa, Canada, 10-12 October, pp. 3466-3471.
Nishiyama, M., & Filion, Y. (2013). Review of statistical water main break prediction models. Canadian Journal of Civil Engineering, 40(10), 972-979.
Norsys Software Corp. (2006). Netica TM application. Retrieved from http://www.norsys.com (Accessed on August 20, 2012).
NSW Treasury (2004). Total Asset Management Life Cycle Costing Guideline. New South Wales Treasury, TAM04-10, September 2004.
O’Callaghan, F. W. (2012). Pipe performance and experiences during seismic events in New Zealand over the last 25 years. In Pipelines 2012: Innovations in Design, Construction, Operations, and Maintenance, Doing More with Less (pp. 1136-1146). ASCE.
O'Day, D. K. (1982). Organizing and Analyzing Leak and Break Data for Making Main Replacement Decisions. Journal-American Water Works Association, 74(11), 588-594.
O’Hagan, M. (1988). Aggregating template or rule antecedents in realtime expert systems with fuzzy set logic. In Proceedings of the 22nd Annual IEEE Asilomar Conference on Signals, Systems and Computers, Pacific Grove, CA: IEEE and Maple Press, 681–689.
Osman, H., & Bainbridge, K. (2011). Comparison of statistical deterioration models for water distribution networks.
Journal of Performance of Constructed Facilities, 25(3), 259-266.
Park, I., & Grandhi, R. V. (2014). A Bayesian statistical method for quantifying model form uncertainty and two model combination methods. Reliability Engineering & System Safety, 129, 46-56.
Park, S., Jun, H., Kim, B., & Im, G. (2008a). Modeling of water main failure rates using the log-linear ROCOF and the power law process. Water Resources Management, 22(9), 1311–1324.
Park, S., Kim, J., Newland, A., Kim, B.J., & Jun, H. (2008b). Survival Analysis of Water Distribution Pipe Failure Data Using the Proportional Hazards Model. In Proceedings of World Environmental and Water Resources Congress 2008: Ahupua’a.
Park, S., Jun, H., Agbenowosi, N., Kim, B.J., & Lim, K. (2011). The proportional hazards modeling of water main failure data incorporating the time-dependent effects of covariates. Water Resources Management, 25(1), 1–19.
Park, S., Kim, J., Newland, A., & Jun, H. (2007). A methodology to estimate economically optimal replacement time interval of water distribution pipes. Water Science & Technology: Water Supply, 7(5-6), 149-155.
Pearl, J. (1988). Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann Publishers, Inc., San Francisco.
Peel, M., Finlayson, B.L., & McMahon, T.A. (2007). Updated world map of the Köppen-Geiger climate classification. Hydrology and Earth Systems, 4, 439–473.
Pelletier, G., Mailhot, A., & Villeneuve, J. P. (2003). Modeling water pipe breaks-three case studies. Journal of Water Resources Planning and Management, 129(2), 115-123.
Pickard, B. D., & Levine, A. D. (2006). Development of a GIS based infrastructure replacement prioritization system; a case study. In Proceedings of Water Distribution Systems Analysis Symposium, Cincinnati, Ohio, USA.
Pineda-Porras, O., & Najafi, M. (2010). Seismic Damage Estimation for Buried Pipelines: Challenges after Three Decades of Progress.
Journal of Pipeline Systems Engineering and Practice, 1(1), 19–24.
Pineda-Porras, O., & Ordaz, M. (2010). Seismic fragility formulations for segmented buried pipeline systems including the impact of differential ground subsidence. Journal of Pipeline Systems Engineering and Practice, 1(4), 141–146.
Poropudas, J., & Virtanen, K. (2011). Simulation metamodeling with dynamic Bayesian networks. European Journal of Operational Research, 214(3), 644-655.
Raftery, A.E., Madigan, D., & Hoeting, J.A. (1997). Bayesian model averaging for linear regression models. Journal of the American Statistical Association, 92(437), 179-191.
Rahman, S., & Vanier, D.J. (2004). Life cycle cost analysis as a decision support tool for managing municipal infrastructure. In Proceedings of CIB 2004 Triennial Congress, Toronto, Ontario, May 2-9, 2004, pp. 1-12.
Rajani, B., & Kleiner, Y. (2007). Quantifying effectiveness of cathodic protection in water mains: case studies. Journal of Infrastructure Systems, 13(1), 1-11.
Rajani, B., & Kleiner, Y. (2004). Alternative strategies for pipeline maintenance/renewal. In Proceedings of AWWA 2004 Annual Conference, Orlando, Florida, June 13-17, 1-16.
Rajani, B., & Kleiner, Y. (2001). WARP–water mains renewal planner. In Proceedings of Int. Conf. on Underground Infrastructure Research, 1-6.
Rajani, B., Kleiner, Y., & Sink, J. E. (2012). Exploration of the relationship between water main breaks and temperature covariates. Urban Water Journal, 9(2), 67-84.
Rajani, B., & Tesfamariam, S. (2007). Estimating time to failure of ageing cast iron water mains under uncertainties. ICE Water Management Journal, 160(2), 83-88.
Rajani, B., & Tesfamariam, S. (2004). Uncoupled axial, flexural, and circumferential pipe soil interaction analyses of partially supported jointed water mains. Canadian Geotechnical Journal, 41(6), 997-1010.
Rajani, B., Zhan, C., & Kuraoka, S. (1996). Pipe soil interaction analysis of jointed water mains.
Canadian Geotechnical Journal, 33(3), 393-404.
Renaud, E., Le Gat, Y., & Poulton, M. (2011). Using a break prediction model for drinking water networks asset management: From research to practice. Water Science and Technology: Water Supply, 12(5), 674-682.
Rodriguez, M., Milot, J., & Serodes, J. (2003). Predicting trihalomethane formation in chlorinated waters using multivariate regression and neural networks. Journal of Water Supply Research and Technology-AQUA, 52(3), 199-215.
Rogers, P. D. (2011). Prioritizing water main renewals: case study of the Denver water system. Journal of Pipeline Systems Engineering and Practice, 2(3), 73-81.
Rogers, P. D., & Grigg, N. S. (2009). Failure assessment modeling to prioritize water pipe renewal: two case studies. Journal of Infrastructure Systems, 15(3), 162-171.
Rogers, P. D., & Grigg, N. S. (2006). Failure assessment model to prioritize pipe replacement in water utility asset management. In Proceedings of 8th Annual Water Distribution Systems Analysis Symposium, Cincinnati, Ohio, USA, August 27-30.
Rogers, P. (2006). Failure assessment model to prioritize pipe replacement in water utility asset management. Ph.D. dissertation, Colorado State University, Colorado, USA.
Røstum, J. (2000). Statistical modelling of pipe failures in water networks. PhD Dissertation, Norwegian University of Science and Technology, Trondheim, Norway.
Saaty, T. L. (1988). Multicriteria decision-making: The Analytic Hierarchy Process. University of Pittsburgh, Pittsburgh, PA.
Sadiq, R., Kleiner, Y., & Rajani, B. (2010). Modelling potential for water quality failures in distribution networks – framework (I). Journal of Water Supply: Research & Technology – AQUA, 59(4), 255-276.
Sadiq, R., Kleiner, Y., & Rajani, B. (2009). Proof-of-concept model to predict water quality changes in distribution pipe networks (Q-WARP). National Research Council of Canada, Ottawa, Canada.
Sadiq, R., Rajani, B., & Kleiner, Y. (2004).
Fuzzy-based method to evaluate soil corrosivity for prediction of water main deterioration. Journal of Infrastructure Systems, 10(4), 149-156.
Sadiq, R., & Rodriguez, M. J. (2004). Fuzzy synthetic evaluation of disinfectant by-product-a risk-based indexing system. Journal of Environmental Management, 73(1), 1–13.
Sadiq, R., & Tesfamariam, S. (2007). Probability density functions based weights for ordered weighted averaging (OWA) operators: An example of water quality indices. European Journal of Operational Research, 182(3), 1350–1368.
Sadiq, R., Veitch, B., Husain, T., & Bose, N. (2005). Prioritizing environmental effects monitoring (EEM) programs: a risk-based strategy. In S. L. Armsworthy, P. J. Cranford, & K. Lee (Eds), Offshore Oil and Gas Environmental Effects Monitoring: Approaches and Technologies (pp. 95-110). Battelle Press, OH.
Salloum, D. (2012). Canada's First Infrastructure Report Card. The Canadian Society for Civil Engineering, 1.
Sha, N., & Pan, R. (2014). Bayesian analysis for step-stress accelerated life testing using Weibull proportional hazard model. Statistical Papers, 55(3), 715-726.
Shahata, K. (2006). Stochastic life cycle cost modeling approach for water mains. Masters Dissertation, Concordia University, Montreal, Quebec, Canada.
Shahata, K., & Zayed, T. (2013). Simulation-based life cycle cost modeling and maintenance plan for water mains. Structure and Infrastructure Engineering, 9(5), 403-415.
Shahata, K., & Zayed, T. (2012). Data acquisition and analysis for water main rehabilitation techniques. Structure and Infrastructure Engineering, 8(11), 1054-1066.
Shamir, U., & Howard, C. (1979). Analytic approach to scheduling pipe replacement. Journal of AWWA, 71(5), 248-258.
Shi, W. Z., Zhang, A. S., & Ho, O. K. (2013). Spatial analysis of water mains failure clusters and factors: a Hong Kong case study. Annals of GIS, 19(2), 89-97.
Singh, A. (2011). Bayesian analysis for causes of failure at a water utility.
Built Environment Project and Asset Management, 1(2), 195-210.
Singh, D., & Tiong, R. L. (2005). Development of life cycle costing framework for highway bridges in Myanmar. International Journal of Project Management, 23(1), 37-44.
Sinha, S., Angkasuwansiri, T., & Thomasson, R. (2008). Phase-1: Development of standard data structure to support wastewater pipe condition and performance prediction. Development of protocols and methods for predicting the remaining economic life of wastewater pipe infrastructure assets, Water Environment Research Foundation, Alexandria, VA.
Sorge, C., Christen, T., & Malzer, H. J. (2013). Maintenance strategy for trunk mains: development and implementation of a high spatial resolution risk-based approach. Water Science & Technology: Water Supply, 13(1), 104-113.
Spiegelhalter, D. J., & Lauritzen, S. L. (1990). Sequential updating of conditional probabilities on directed graphical structures. Networks, 20, 579-605.
Statistics Canada (2001). 2001 Census. Retrieved from http://www12.statcan.ca/english/census01/Products/Reference/dict/geo021.htm
Studziński, A., & Pietrucha-Urbanik, K. (2012). Water main failure risk assessment. Journal of KONBiN, 4(24), 115-124.
Sun, L., & Shenoy, P.P. (2007). Using Bayesian networks for bankruptcy prediction: some methodological issues. European Journal of Operational Research, 180(2), 738-753.
Tang, Z., & McCabe, B. (2007). Developing complete conditional probability tables from fractional data for Bayesian belief networks. Journal of Computing in Civil Engineering, 21(4), 265–276.
Technology Roadmap: 2003-2013. 2003. CSCE, CCPE, CPWA and National Research Council Canada.
Tesfamariam, S., & Liu, Z. (2013). Seismic risk analysis using Bayesian belief networks. In Tesfamariam, S., and Goda, K. (Eds.), Handbook of seismic risk analysis and management of civil infrastructure systems, Woodhead Publishing Limited, Cambridge, UK.
Tesfamariam, S., & Najjaran, H. (2007).
Adaptive network-fuzzy inferencing to estimate concrete strength using mix design. ASCE Journal of Materials in Civil Engineering, 19(7), 550-560.
Tesfamariam, S., & Rajani, B. (2007). Estimating time to failure of cast-iron water mains. Proceedings of the ICE - Water Management, 160(2), 83–88.
Tesfamariam, S., Rajani, B., & Sadiq, R. (2006). Consideration of uncertainties to estimate structural capacity of ageing cast iron water mains - a possibilistic approach. Canadian Journal of Civil Engineering, 33(8), 1050-1064.
Tesfamariam, S., & Sadiq, R. (2008). Probabilistic risk analysis using ordered weighted averaging (OWA) operators. Stochastic Environmental Research and Risk Assessment, 22, 1–15.
Tesfamariam, S., Sadiq, R., & Najjaran, H. (2010). Decision making under uncertainty-an example for seismic risk management. Risk Analysis, 30(1), 78-94.
Thode, H. C. (2002). Testing for normality (Vol. 164). CRC Press.
Thornthwaite, C. (1948). An approach toward a rational classification of climate. Geographical Review, 38(1), 55–94.
Toumbou, B., Villeneuve, J. P., Beardsell, G., & Duchesne, S. (2012). General model for water-distribution pipe breaks: Development, methodology, and application to a small city in Quebec, Canada. Journal of Pipeline Systems Engineering and Practice, 5(1), 04013006.
US EPA. (2011). Aging water infrastructure research: science and engineering for a sustainable future. Publication No. EPA/600/F-11/010.
US EPA. (1995). Use of risk-based decision-making in UST corrective action programs. Retrieved from: http://www.epa.gov/oust/directiv/od961017.htm#Implementation (Accessed on 10 January, 2013).
Vanrenterghem-Raven, A., Eisenbeis, P., Juran, I., & Christodoulou, S. (2004). Statistical modeling of the structural degradation of an urban water distribution system: case study of New York City. In World water & environmental resources congress 2003 and related symposia, Pennsylvania, USA.
Viallefont, V., Raftery, A.E., & Richardson, S. (2001).
Variable selection and Bayesian model averaging in case-control studies. Statistics in Medicine, 20(21), 3215-3230.
Volinsky, C. T., Madigan, D., Raftery, A. E., & Kronmal, R. A. (1997). Bayesian model averaging in proportional hazard models: assessing the risk of a stroke. Journal of the Royal Statistical Society: Series C (Applied Statistics), 46(4), 433-448.
Walski, T.M., & Pelliccia, A. (1982). Economic analyses of watermain breaks. Journal of AWWA, 74(3), 140-147.
Wang, Y., Zayed, T., & Moselhi, O. (2009). Prediction models for annual break rates of water mains. Journal of Performance of Constructed Facilities, 23(1), 47–54.
Wasserman, L. (2000). Bayesian model selection and model averaging. Journal of Mathematical Psychology, 44(1), 92-107.
Watson, T. G., Christian, C. D., Mason, A. J., Smith, M. H., & Meyer, R. (2004). Bayesian-based pipe failure model. Journal of Hydroinformatics, 6(4), 259-264.
Wirahadikusumah, R., & Abraham, D. M. (2003). Application of dynamic programming and simulation for sewer management. Engineering, Construction and Architectural Management, 10(3), 193-208.
Wols, B.A., & van Thienen, P. (2013). Impact of weather conditions on pipe failure: a statistical analysis. Journal of Water Supply: Research and Technology. doi:10.2166/aqua.2013.088
Wood, A., & Lence, B. J. (2009). Using water main break data to improve asset management for small and medium utilities: District of Maple Ridge, BC. Journal of Infrastructure Systems, 15(2), 111-119.
Wood, A., & Lence, B. J. (2006). Assessment of water main break data for asset management. Journal (American Water Works Association), 98(7), 76-86.
Woodward, D. G. (1997). Life cycle costing-theory, information acquisition and application. International Journal of Project Management, 15(6), 335-344.
Xu, Z. (2005). An overview of methods for determining OWA weights. International Journal of Intelligent Systems, 20, 843–865.
Yager, R.R. (1988).
On ordered weighted averaging aggregation in multicriteria decision making. IEEE Transactions on Systems, Man, and Cybernetics, 18(1), 183–190.
Yager, R.R., & Filev, D.P. (1999). Induced ordered weighted averaging operators. IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 29(2), 141–150.
Yager, R.R., & Filev, D.P. (1994). Parameterized ‘‘andlike’’ and ‘‘orlike’’ OWA operators. International Journal of General Systems, 22(3), 297–316.
Yamijala, S., Guikema, S.D., & Brumbelow, K. (2009). Statistical models for the analysis of water distribution system pipe break data. Reliability Engineering & System Safety, 94(2), 282-293.
Yan, J., & Vairavamoorthy, K. (2003). Fuzzy Approach for Pipeline Condition Assessment. In Pipeline Technologies, Security, and Safety, ASCE, Baltimore, MD, USA, 467-476.
Yang, Z. L., Wang, J., Bonsall, S., & Fang, Q. G. (2009). Use of fuzzy evidential reasoning in maritime security assessment. Risk Analysis, 29(1), 95-120.
Zarghami, M., Ardakanian, R., Memariani, A., & Szidarovszky, F. (2008). Extended OWA operator for group decision making on water resources projects. Journal of Water Resources Planning and Management, 134(3), 266-275.
Zarghami, M., & Szidarovszky, F. (2009). Stochastic-fuzzy multi criteria decision making for robust water resources management. Stochastic Environmental Research and Risk Assessment, 23(3), 329-339.
Zhao, J. Q., Rajani, B. B., & Daigle, L. (2001). Thermal performance of trench backfills used for frost protection of water service lines. Canadian Geotechnical Journal, 38(1), 161-174.

Appendix A: Summary of the survival analysis models for pipe failure

Each entry below gives the four columns of the summary table: Reference, Model, Notation, and Parameter estimation.

Reference: Rajani et al. 2012
Model: Nonhomogeneous Poisson analysis
$P(k_i) = \dfrac{\lambda_i^{k_i}\, e^{-\lambda_i}}{k_i!}$
$\lambda_i = \exp\left[\beta_0 + \psi\,\tau(g_i) + \boldsymbol{\beta}\mathbf{q}_i\right]$
Notation: $\lambda_i$ = expected number of breaks; $k_i$ = number of breaks; $\psi$ = aging coefficient; $\beta_0$ = constant; $\mathbf{q}_i$ = row vector of time-dependent covariates; $\boldsymbol{\beta}$ = column vector of the corresponding coefficients; $g_i$ = pipe age at time $i$
Parameter estimation: Maximum likelihood method

Reference: Osman and Bainbridge 2011
Model: Weibull distribution for transition state model
$f(x, \alpha, \beta) = \dfrac{\alpha}{\beta}\left(\dfrac{x}{\beta}\right)^{\alpha-1} e^{-\left(x/\beta\right)^{\alpha}}$
Multivariate exponential model
$N(\mathbf{x}_t) = N(\mathbf{x}_{t_0})\, e^{\mathbf{a}\cdot\mathbf{x}_t}$
Notation: $\alpha$ = shape parameter of Weibull distribution; $\beta$ = scale parameter of Weibull distribution; $\mathbf{x}_t$ = vector of time-dependent covariates; $N(\mathbf{x}_t)$ = number of breaks resulting from $\mathbf{x}$; $\mathbf{a}$ = vector of parameters of covariates $\mathbf{x}$; $\mathbf{x}_{t_0}$ = vector of baseline $\mathbf{x}$ values at $t_0$
Parameter estimation: Least squares regression and maximum likelihood method

Reference: Park et al. 2011
Model: Proportional hazard model
$S_i(t) = \left[S_0(t)\right]^{\exp(\boldsymbol{\beta}'\cdot\mathbf{x}_i)}$
Notation: $\mathbf{x}_i$ = vector of influencing covariates; $S_0(t)$ = baseline survival function; $\boldsymbol{\beta}'$ = vector of parameters of covariates $\mathbf{x}$
Parameter estimation: Log-log transformed values of baseline survival probabilities

Reference: Asnaashari et al. 2009
Model:
$FF = a_0 + a_1 DR + a_2 LL + a_3 DP + a_4 TK + a_5 MP + a_6 AG + a_7 PL + a_8 FH$
$FF = \exp(b_0 + b_1 DR + b_2 LL + b_3 DP + b_4 TK + b_5 MP + b_6 AG + b_7 PL + b_8 FH)$
Notation: DR, TK, DP, AG, LL, MP, PL, FH = covariates; $a_0, \ldots, a_8$ = multiple linear regression coefficients; $b_0, \ldots, b_8$ = Poisson regression coefficients
Parameter estimation: Degree of contribution to FF

Reference: Rogers 2011 and Rogers and Grigg 2009
Model:
$P(t) = 1 - e^{-\left[\lambda (t+\Delta t)^{\beta} - \lambda t^{\beta}\right]}$
$R(t, t+\Delta t) = e^{-\left[\lambda (t+\Delta t)^{\beta} - \lambda t^{\beta}\right]}$
$S_i = \sum_{j=1}^{n} W_j \cdot RI_{i,j}$
Notation: $\lambda$ = scale parameter of power law model; $\beta$ = shape parameter of power law model; $S_i$ = total MCDA score for pipe $i$; $W_j$ = points assigned to variable $j$; $RI_{i,j}$ = relative importance weight of variable $j$ on pipe $i$
Parameter estimation: Maximum likelihood estimates

Reference: Christodoulou 2011
Model:
$\log(Y) = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \cdots + \beta_P X_{iP}$
$\lambda_i(t) = \lambda_0(t)\exp\left[\beta_1 X_{i1} + \beta_2 X_{i2} + \cdots + \beta_P X_{iP}\right]$
Notation: $\beta_0, \beta_1, \ldots, \beta_P$ = regression coefficients; $\lambda_0(t)$ = baseline hazard function; $X_{iP}$ = covariates
Parameter estimation: Degree of contribution to response / Maximum likelihood method

Reference: Kleiner and Rajani 2010
Model: Nonhomogeneous Poisson analysis
$p(k_{i,t}) = \dfrac{\lambda_{i,t}^{k_{i,t}}\, e^{-\lambda_{i,t}}}{k_{i,t}!}$
$\lambda_{i,t} = \exp\left[\alpha_0 + \theta\,\tau(g_{i,t}) + \boldsymbol{\alpha}\mathbf{z}_i + \boldsymbol{\beta}\mathbf{p}_t + \boldsymbol{\gamma}\mathbf{q}_{i,t}\right]$
Notation: $\lambda_{i,t}$ = mean intensity; $k_{i,t}$ = number of breaks; $\alpha_0$ = constant; $\tau(g_{i,t})$ = age covariate; $\theta$ = coefficient of age; $\mathbf{z}_i$ = row vector of pipe-dependent covariates; $\boldsymbol{\alpha}$ = column vector of the corresponding coefficients; $\mathbf{p}_t$ = row vector of time-dependent covariates; $\boldsymbol{\beta}$ = column vector of the corresponding coefficients; $\mathbf{q}_{i,t}$ = row vector of pipe- and time-dependent covariates; $\boldsymbol{\gamma}$ = column vector of the corresponding coefficients
Parameter estimation: Maximum likelihood method

Reference: Dridi et al. 2009
Model: Weibull-Exponential for installation and first break
$f_w(t) = \lambda_1 \beta\, t^{\beta-1} e^{-\lambda_1 t^{\beta}}$
Exponential for successive breaks
$f_e(t) = \lambda_k e^{-\lambda_k t}$
Posterior distribution for exponential distribution
$g(\lambda_k) = \left(\dfrac{\beta_0 \tilde{t}_k + 1}{\beta_0}\right)^{s_k+\alpha_0} \dfrac{\lambda_k^{\,s_k+\alpha_0-1}}{\Gamma(s_k+\alpha_0)}\, e^{-\lambda_k\left(\tilde{t}_k + 1/\beta_0\right)}$
Notation: $\lambda$ = rate parameter of exponential distribution; $\beta$ = shape parameter of Weibull distribution; $k$ = recorded pipe breaks; $\alpha_0$ = shape parameter of gamma distribution; $\beta_0$ = scale parameter of gamma distribution; $\Gamma$ = gamma function; $s_k$ = pipe identified by index $j$; $\tilde{t}_k$ = sum of the times between successive breaks
Parameter estimation: Gamma distribution for prior

Reference: Park et al. 2008b
Model: Proportional hazard model
$S_i(t) = \left[S_0(t)\right]^{\exp(\boldsymbol{\beta}\cdot\mathbf{x}_i)}$
Notation: $\mathbf{x}_i$ = vector of influencing covariates; $S_0(t)$ = baseline survival function; $\boldsymbol{\beta}$ = vector of parameters of covariates $\mathbf{x}$
Parameter estimation: Maximizing the partial likelihood

Reference: Boxall et al. 2007
Model:
$\lambda(D, L, A) = \exp(\alpha + \beta_D D + \beta_L L + \beta_A A)$
Notation: $\alpha, \beta_D, \beta_L, \beta_A$ = Poisson regression coefficients; $D, L, A$ = diameter, length and age covariates
Parameter estimation: Degree of contribution to response

Reference: Watson et al. 2004
Model: Power law model for intensity function
$\lambda(t) = a t^{b-1}$
Likelihood function of NHPP
$p(\alpha \mid N) \propto \alpha^{a+N-1} e^{-\alpha(b+T)}$
Notation: $\lambda = \alpha$ = intensity function; $N$ = number of breaks; $T$ = time interval; $a$ = shape parameter of gamma distribution; $b$ = scale parameter of gamma distribution
Parameter estimation: Gamma distribution for prior and posterior

Reference: Dridi et al. 2005
Model: Weibull-Exponential for installation and first break; Exponential for successive breaks
$\lambda_i = a + b\, i^{c}$
Notation: $i$ = pipe break order; $a, b, c$ = parameters for non-linear relationship
Parameter estimation: Gamma distribution for prior

Reference: Economou et al. 2008
Model: Power law model for baseline hazard function
$\lambda(t) = \theta t^{\theta-1}$
Likelihood function of NHPP
$L(\cdot) = \prod_{i=1}^{N}\left[\left[\prod_{j=1}^{n_i} \lambda_i(t_{ij}, x_i)\right]^{\delta_i} e^{-\Lambda_i\left((t_{0i}, T_i],\, x_i\right)}\right]$
Likelihood function of Zero-inflated NHPP
$L(\cdot) = \prod_{i=1}^{N}\left[u_i\left(\left[\prod_{j=1}^{n_i} \lambda_i(t_{ij}, x_i)\right]^{\delta_i} e^{-\Lambda_i\left((t_{0i}, T_i],\, x_i\right)}\right) + (1-u_i)(1-\delta_i)\right]$
Notation: $\lambda$ = intensity function; $\theta$ = shape parameter; $n$ = number of breaks; $N$ = number of pipes; $t_0$ = start of observation period; $T$ = end of observation period; $t_{ij}$ = time of $j$th break of $i$th pipe; $x_i$ = vector of related explanatory variables; $\delta_i$ = likelihood function parameter; $u_i$ = Bernoulli distribution parameter
Parameter estimation: Gamma and Normal Gamma distribution for prior

Reference: Park et al. 2008a
Model: Failure-time-based model
$v(t) = e^{(\beta_0+\beta_1 t)}$ and $v(t) = \gamma\delta t^{\delta-1}$
Failure-number-based model
$l = \sum_{i=1}^{n} n_i\left\{\beta_0 + \ln\left(\dfrac{e^{\beta_1 b_i} - e^{\beta_1 a_i}}{\beta_1}\right)\right\} - e^{\beta_0}\,\dfrac{e^{\beta_1 a_{m+1}} - e^{\beta_1 a_1}}{\beta_1} - \sum_{i=1}^{m} \ln n_i!$
$l = \sum_{i=1}^{n} n_i \ln\left\{\gamma\left(b_i^{\delta} - a_i^{\delta}\right)\right\} - \gamma\left(a_{m+1}^{\delta} - a_1^{\delta}\right) - \sum_{i=1}^{m} \ln n_i!$
Notation: $\beta_0, \beta_1$ = parameters of log-linear ROCOF model; $\gamma, \delta$ = parameters of power law model; $n_i$ = number of observed failures; $a_i, b_i$ = overlapping time intervals
Parameter estimation: Maximum likelihood estimate

Reference: Le Gat and Eisenbeis 2000
Model:
$S(t, \mathbf{x}, \boldsymbol{\beta}) = \exp\left[-t^{1/\sigma}\exp\left(-\dfrac{\mathbf{x}^{t}\boldsymbol{\beta}}{\sigma}\right)\right]$
Notation: $S$ = survival function; $t$ = time to failure; $\mathbf{x}$ = vector of related explanatory variables; $\boldsymbol{\beta}$ = vector of coefficients of explanatory variables; $\sigma$ = scalar regression parameter
Parameter estimation: Maximisation of the log-likelihood function

Reference: Economou et al. 2007
Model: Power law model for baseline hazard function
$\lambda(t) = a t^{b-1}$
Likelihood function of NHPP
$L(\boldsymbol{\theta}, \boldsymbol{\beta}_i) = \left[\prod_{j=1}^{n_i} \theta_i\, t_{ij}^{\theta_i-1} e^{\beta_i x_i}\right]^{\delta_i} e^{-\left[T_i^{\theta_i} - t_{0i}^{\theta_i}\right] e^{\beta_i x_i}}$
Notation: $n$ = number of failures; $t_0$ = start of observation period; $T$ = end of observation period; $t_{ij}$ = time of $j$th break of $i$th pipe; $x_i$ = vector of related explanatory variables; $\beta_i$ = vector of coefficients of explanatory variables; $\delta_i$ = likelihood function parameter; $\theta_i$ = shape parameter
Parameter estimation: Gamma distribution for prior

Reference: Vanrenterghem-Raven 2004
Model:
$h(t) = h_0(t)\exp\left(\sum_{i=1}^{n} \beta_i x_i\right)$
$h(t) = \lambda p(\lambda t)^{p-1}\exp\left(\sum_{i=1}^{n} \beta_i x_i\right)$
Notation: $x_i$ = risk factors; $n$ = number of risk factors; $\beta_i$ = regression parameters of $x_i$; $h_0(t)$ = baseline function; $\lambda$ = scale parameter of Weibull distribution; $p$ = shape parameter of Weibull distribution
Parameter estimation: Maximization of log-likelihood function

Reference: Kleiner and Rajani 2002
Model:
$N(\mathbf{x}_t) = N(\mathbf{x}_{t_0})\, e^{\bar{a}\cdot\bar{x}_t^{T}}$
Notation: $\mathbf{x}_t$ = vector of time-dependent covariates; $N(\mathbf{x}_t)$ = number of breaks resulting from $\mathbf{x}_t$; $\bar{a}$ = vector of covariates parameters; $N(\mathbf{x}_{t_0})$ = vector of baseline
Parameter estimation: Least-squares regression and maximum likelihood method

Reference: Mailhot et al. 2000
Model:
$$\begin{aligned}
\ln\left[L(p_1, k_1, k_2)\right] ={}& \sum_{i \mid \beta_i = 0} \ln\left\{\exp\left[-(k_1 T_{ai})^{p_1}\right] + \exp\left[k_2(T_{bi} - T_{ai})\right]\left[1 - \exp\left[-(k_1 T_{bi})^{p_1}\right]\right]\right\} \\
&+ n' \ln k_2 - k_2 \sum_{i \mid \beta_i \ge 1} T_{ai} \\
&+ \sum_{i \mid \beta_i \ge 1} \ln\left\{p_1 k_1^{p_1} t_{1i}^{\,p_1-1} \exp\left[-(k_1 t_{1i})^{p_1}\right]\exp(k_2 t_{1i}) + k_2 \exp(k_2 T_{bi})\left[1 - \exp\left[-(k_1 T_{bi})^{p_1}\right]\right]\right\}
\end{aligned}$$
Notation: $n' = (n_b + n_0 - n_t)$; $p_1, k_1, k_2$ = model parameters of the Weibull/exponential distributions; $T_{bi}$ = times when the observation period began; $T_{ai}$ = times of analysis; $t_{1i}$ = times of the first recorded break in the observation period; $\beta_i$ = number of breaks observed; $n_b$ = total number of recorded breaks for the entire water pipe network; $n_0$ = number of pipe segments that have not failed during the observation period; $n_t$ = total number of pipe segments
Parameter estimation: Maximum likelihood function
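
The gamma prior and posterior listed for Watson et al. 2004 can be illustrated with a minimal Python sketch. It assumes a constant break intensity α and reads the kernel p(α|N) ∝ α^(a+N−1) e^(−α(b+T)) as a gamma density with shape a + N and rate b + T; treating b as a rate is an assumption made here because of how b enters the exponent, even though the table labels it a scale parameter. The class name `GammaBelief` and all numerical values are hypothetical and are not taken from the thesis or from Watson et al.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GammaBelief:
    """Gamma(shape, rate) belief about a constant break intensity (breaks per year)."""

    shape: float  # a in the appendix notation
    rate: float   # b in the appendix notation, treated as a rate (assumption)

    @property
    def mean(self) -> float:
        # Mean of a gamma distribution parameterised by shape and rate.
        return self.shape / self.rate

    def update(self, breaks: int, years: float) -> "GammaBelief":
        # The kernel alpha^(a+N-1) * exp(-alpha*(b+T)) is a gamma density
        # with shape a + N and rate b + T, so updating is two additions.
        return GammaBelief(self.shape + breaks, self.rate + years)


# Hypothetical prior: mean intensity of 0.2 breaks per year.
prior = GammaBelief(shape=2.0, rate=10.0)

# Hypothetical record: 3 breaks observed over 5 years.
posterior = prior.update(breaks=3, years=5.0)

print(f"prior mean     = {prior.mean:.3f} breaks/year")
print(f"posterior mean = {posterior.mean:.3f} breaks/year")
```

Because the update only adds the observed break count to the shape and the observation time to the rate, a utility with a short break record stays close to the prior, while a long record dominates it.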
Title | Planning repair and replacement program for water mains : a Bayesian framework |
Creator |
Kabir, Golam |
Publisher | University of British Columbia |
Date Issued | 2016 |
Genre |
Thesis/Dissertation |
Type |
Text |
Language | eng |
Date Available | 2016-04-13 |
Provider | Vancouver : University of British Columbia Library |
Rights | Attribution-NonCommercial-NoDerivatives 4.0 International |
DOI | 10.14288/1.0228867 |
URI | http://hdl.handle.net/2429/57568 |
Degree |
Doctor of Philosophy - PhD |
Program |
Civil Engineering |
Affiliation |
Applied Science, Faculty of Engineering, School of (Okanagan) |
Degree Grantor | University of British Columbia |
GraduationDate | 2016-05 |
Campus |
UBCO |
Scholarly Level | Graduate |
Rights URI | http://creativecommons.org/licenses/by-nc-nd/4.0/ |
AggregatedSourceRepository | DSpace |