UBC Theses and Dissertations

UBC Theses Logo

UBC Theses and Dissertations

Risk management of acid rock drainage under uncertainty Betrie, Getnet Dubale 2014

Your browser doesn't seem to have a PDF viewer, please download the PDF to view this item.

Item Metadata

Download

Media
24-UBC_2015_febraury_Betrie_Getnet.pdf [ 3.56MB ]
Metadata
JSON: 24-1.0074404.json
JSON-LD: 24-1.0074404-ld.json
RDF/XML (Pretty): 24-1.0074404-rdf.xml
RDF/JSON: 24-1.0074404-rdf.json
Turtle: 24-1.0074404-turtle.txt
N-Triples: 24-1.0074404-rdf-ntriples.txt
Original Record: 24-1.0074404-source.json
Full Text
24-1.0074404-fulltext.txt
Citation
24-1.0074404.ris

Full Text

RISK MANAGEMENT OF ACID ROCK DRAINAGE UNDER UNCERTAINTY     by  Getnet Dubale Betrie   A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF  THE REQUIREMENTS FOR THE DEGREE OF  DOCTOR OF PHILOSOPHY  in   The College of Graduate Studies (Civil Engineering)   THE UNIVERSITY OF BRITISH COLUMBIA (Okanagan)  December 2014   © Getnet Dubale Betrie, 2014ii  ABSTRACT Acid rock drainage (ARD) is a major environmental problem that causes local and global pollution. ARD occurs when sulfide bearing materials are exposed to oxygen and water during mining activities. This reaction between sulfide and oxygen with the presence of water generates elevated metals and metalloids that may cause potential environmental and human health risks. The remediation costs of potentially acid-generating wastes at abandoned minesites are estimated to be over $20 billion in USA. The major objective of this research is to propose a risk management framework for ARD that can improve the prediction of ARD chemistry, assess and manage environmental and human health risks to guide decision-making under uncertainty.  The proposed framework consists of methodologies for filling in missing data, predict ARD chemistry, assess environmental risks, and manage risks of ARD. In the first methodology, missing values of ARD data are filled in using imputation algorithms that reduce loss of information and introduction of biases. After having the complete data, future ARD chemistry is predicted using machine learning techniques. The predictive uncertainty due to data, parameters and model is quantified using a probability bounds approach. Models are integrated using aggregation methods to reduce the uncertainty of the individual model. Case studies in minesites show that the developed methodology improves the prediction of future ARD chemistry under uncertainty. For ecological and human health risks assessment of ARD, two methodologies are developed based on the fugacity and PHREEQC approaches. The fugacity and PHREEQC approaches are applied in minesites with limited and adequate hydrogeological information, respectively. Case studies in minesites show that these methodologies are useful to quantify ecological and human health risks in the mining industry. In addition, they quantify the associated uncertainties in the risk assessments using the probability bounds and fuzzy-probabilistic approaches. For risk management of ARD, a methodology that uses multicriteria decision analysis (MCDA) and probabilistic approaches is developed. The methodology enables decision-makers to evaluate mitigation measures with various criteria, such as risk, cost, and technical feasibility, identifies the optimal mitigation measure, and quantifies the associated uncertainties in decision-making. In general, it is believed that the proposed framework enhances the decision-making ability of the mining industry under uncertainty and reduces environmental risks and remediation costs of managing ARD.  iii  PREFACE I, Getnet Dubale Betrie, prepared all the contents of this thesis including accessing data, literature review, development and application of analytical and numerical models, interpretation of results and writing of the manuscripts under the supervision of Drs. Rehan Sadiq and Solomon Tesfamariam. The co-authors of the manuscripts are Dr. Kevin A. Morin and Dr. Craig Nichol who provided technical advice and editing the manuscripts. This thesis consists of manuscripts that are either published or submitted for publication in international journals and conferences as discussed below: • A version of Chapter 3 has been accepted for publication in Mine Water and the Environment entitled “On the issue of incomplete and missing water-quality data in minesites databases: Comparing three imputation methods” (Betrie et al. 2014e). • Versions of Chapter 4 have been published in Environmental Monitoring and Assessment Journal and The Proceedings of 9th International Conference on Acid Rock Drainage (ICARD) entitled “Predicting copper concentrations in acid mine drainage: A comparative analysis of five machine learning techniques”  (Betrie et al. 2012a and Betrie et al. 2012b). • A version of Chapter 5 has been published in Science of the Total Environment entitled “Uncertainty quantification and integration of machine learning techniques for predicting acid rock drainage quality: A probability bounds approach” (Betrie et al. 2014d). • A version of Chapter 6 has been submitted to Environmental Technology and Innovation entitled “Ecological risk assessment of acid rock drainage under uncertainty” (Betrie et al. 2014c). • A version of Chapter 7 has been submitted to Journal of Hazardous Materials entitled “Environmental risk assessment under uncertainty in the mining industry: A PHREEQC approach” (Betrie et al. 2014a). • Versions of Chapter 8  have been published in Journal of Environmental Management  entitled “Selection of remedial alternatives for minesites: A multicriteria decision analysis”  (Betrie et al. 2013a) and The Proceedings of the 2013 Annual International Mine Water Association Conference-Reliable Mine Water Technology entitled with “A multicriteria decision analysis approach  for selection of remedial alternatives in mine sites” (Betrie et al. 2013b). • Versions of Appendix I have been published in the Proceedings of 12th International Environmental Specialty conference, CSCE Annual Conference entitled “Uncertainty iv  analysis of an aquivalence-based fate and transport model using a hybrid fuzzy-probabilistic approach” (Betrie et al. 2012c) and submitted to Stochastic Environment and Risk Assessment entitled “Human health risk assessment in the mining industry: A fuzzy-probabilistic approach” (Betrie et al. 2014b).    v  TABLE OF CONTENTS Abstract ........................................................................................................................................... ii Preface ............................................................................................................................................ iii Table of Contents ............................................................................................................................ v List of Tables ................................................................................................................................ viii List of Figures................................................................................................................................. ix List of Abbreviations ...................................................................................................................... xi List of Symbols............................................................................................................................. xiii Acknowledgements ..................................................................................................................... xvii Dedication................................................................................................................................... xviii Chapter 1 INTRODUCTION ........................................................................................................ 1 1.1 Motivation ........................................................................................................................ 1 1.2 Research Objectives ......................................................................................................... 3 1.3 Thesis Organization ......................................................................................................... 4 Chapter 2 LITERATURE REVIEW AND PROPOSED RAMEWORK ..................................... 6 2.1 Literature Review ............................................................................................................. 6 2.1.1 ARD Prediction Techniques ..................................................................................... 6 2.1.2 Ecological Risk Assessment ..................................................................................... 9 2.1.3 ARD Risk Management.......................................................................................... 11 2.1.4 Uncertainty Analysis .............................................................................................. 12 2.2 Proposed Framework ..................................................................................................... 17 2.2.1 Filling in Missing Data ........................................................................................... 19 2.2.2 Prediction of ARD Chemistry using Machine Learning Techniques: A Deterministic Approach ......................................................................................................... 19 2.2.3 Prediction ARD Chemistry using Machine Learning Techniques: A Probability Bounds Approach .................................................................................................................. 19 2.2.4 Risk Assessment: A Fugacity Approach ................................................................ 19 2.2.5 Risk Assessment: A PHREEQC Approach ............................................................ 20 2.2.6 Risk Management: A MCDA Approach ................................................................ 20 Chapter 3 FILLING IN MISSING DATA .................................................................................. 21 3.1 Background .................................................................................................................... 21 3.2 Methodology .................................................................................................................. 22 3.2.1 Description of Algorithms ...................................................................................... 22 3.3 Data preparation and methods evaluation ...................................................................... 25 vi  3.3.1 Data preparation ..................................................................................................... 25 3.3.2 Performance Evaluation ......................................................................................... 26 3.4 Results and Discussion ................................................................................................... 28 3.5 Summary ........................................................................................................................ 33 Chapter 4 PREDICTION OF ACID ROCK DRAINAGE CHEMISTRY USING MACHINE LEARNING TECHNIQUES: A DETERMINSTIC APPROACH ............................................... 35 4.1 Background .................................................................................................................... 35 4.2 Methodology .................................................................................................................. 36 4.2.1 Machine learning techniques .................................................................................. 36 4.2.2 Parameter selection for drainage quality ................................................................ 43 4.2.3 Model development and validation ........................................................................ 46 4.2.4 Model evaluation .................................................................................................... 46 4.3 Results and Discussion ................................................................................................... 47 4.4 Summary ........................................................................................................................ 50 Chapter 5 PREDICTION OF ACID ROCK DRAINAGE CHEMISTRY USING MACHINE LEARING TECHNIQUES: A PROBABILITY BOUNDS APPROACH ................................... 52 5.1 Background .................................................................................................................... 52 5.2 Methodology .................................................................................................................. 53 5.2.1 Probability bounds .................................................................................................. 55 5.2.2 Aggregation methods .............................................................................................. 55 5.2.3 Selection of variables for drainage chemistry ........................................................ 56 5.2.4 Model development and evaluation ........................................................................ 58 5.3 Results and Discussion ................................................................................................... 58 5.4 Summary ........................................................................................................................ 65 Chapter 6 ECOLOGICAL RISK ASSESSMENT: A FUGACITY APPROACH ..................... 67 6.1 Background .................................................................................................................... 67 6.2 Ecological Risk Analysis ............................................................................................... 68 6.2.1 Problem Formulation .............................................................................................. 68 6.2.2 Exposure Characterization ...................................................................................... 68 6.2.3 Effect Characterization ........................................................................................... 73 6.2.4 Risk Characterization ............................................................................................. 74 6.3 Results and Discussion ................................................................................................... 74 6.4 Summary ........................................................................................................................ 79 Chapter 7 ECOLOGICAL RISK ASSESSMENT: A PHREEQC APPROACH ....................... 80 7.1 Background .................................................................................................................... 80 vii  7.2 Methodology .................................................................................................................. 81 7.3 Case study ...................................................................................................................... 84 7.3.1 Problem Formulation .............................................................................................. 84 7.3.2 Exposure Characterization ...................................................................................... 84 7.3.3 Effect Characterization ........................................................................................... 89 7.3.4 Risk Characterization ............................................................................................. 89 7.4 Results and Discussion ................................................................................................... 90 7.5 Summary ........................................................................................................................ 97 Chapter 8 RISK MANAGEMENT: A MCDA APPROACH ..................................................... 99 8.1 Background .................................................................................................................... 99 8.2 Proposed Methodology ................................................................................................ 100 8.2.1 Deterministic Multicriteria Decision Analysis ..................................................... 103 8.2.2 Probabilistic Multicriteria Decision Analysis ....................................................... 107 8.3 Case study .................................................................................................................... 108 8.3.1 Risk Assessment ................................................................................................... 108 8.3.2 Development of Potential Alternatives ................................................................. 109 8.3.3 Deterministic MCDA ........................................................................................... 109 8.3.4 Probabilistic MCDA ............................................................................................. 112 8.4 Results and Discussion ................................................................................................. 113 8.5 Summary ...................................................................................................................... 118 Chapter 9 CONCLUSIONS AND RECOMMENDATION ..................................................... 120 9.1 Conclusions .................................................................................................................. 120 9.2 Limitations and Recommendations for Future Work ................................................... 122 9.3 Originality and Contributions ...................................................................................... 122 BIBLIOGRAPHY ....................................................................................................................... 124 APPENDIX A  HUMAN HEALTH RISK ASSESSMENT: A FUZZY-PROBABLISTIC    APPROACH  .............................................................................................................................. 136        viii  LIST OF TABLES Table 2.1: Summary of static and kinetic tests ................................................................................ 7 Table 2.2: Open source code models for ARD prediction .............................................................. 8 Table 2.3: Typology of uncertainty ............................................................................................... 13 Table 2.4: Application of uncertainty analysis methods in environmental risk assessment ......... 16 Table 3.1: Statistical summary of the test data used to evaluate the imputation methods ............ 26 Table 3.2: The performance of the imputation methods ............................................................... 29 Table 4.1: Correlation between copper concentration and other variables with time lags ............ 45 Table 4.2:  Variables and summary of the data used in the study ................................................. 46 Table 4.3: Performance of models over testing sets (n = 128) ...................................................... 48 Table 4.4: The adjusted p-values of the paired t-test on error residuals ........................................ 50 Table 5.1: Correlation between input and output variables using the MIC (n = 4757) ................. 57 Table 5.2: Performance of ANN and SVM models in terms of four evaluation measures ........... 59 Table 6.1: Algorithm of the fugacity model .................................................................................. 72 Table 6.2. Input parameters of the fugacity-based fate model ...................................................... 73 Table 7.1: Chemistry of native-water and infiltration-water in /mmol L  .................................... 88 Table 7.2: Model parameters and their values ............................................................................... 88 Table 7.3: LC50 data for the fish species in the study area ............................................................ 89 Table 7.4: Predicted and observed values at different percentiles ................................................ 92 Table 8.1: Types of preference function ...................................................................................... 105 Table 8.2: Alternatives considered and their effectiveness to reduce contaminants ................... 109 Table 8.3: Criteria and scoring scale ........................................................................................... 110 Table 8.4: Alternatives and their respective score at each criterion ............................................ 111 Table 8.5: Comparison matrix of criteria .................................................................................... 112 Table 8.6: Evaluation table .......................................................................................................... 113 Table 8.7: Preference indices, incoming, outgoing, and net flow ............................................... 114 Table 8.8: Probabilistic complete ranking of alternatives in percentage ..................................... 116    ix  LIST OF FIGURES Figure 1.1: Thesis organization ....................................................................................................... 5 Figure 2.1. Proposed ARD risk management framework ............................................................. 18 Figure 3.1: Box plots showing variables in the water-quality database ........................................ 25 Figure 3.2: Raw (left) and test (right) datasets with their observed (grey) and missing (red) values. ........................................................................................................................ 27 Figure 3.3: Comparing the logarithm of observed and estimated values of variables pH, Zn, Ca, and Cd. ....................................................................................................................... 31 Figure 3.4: Comparing the logarithm of observed and estimated values of variables Mg, Cu, Conductivity, and Al. ................................................................................................. 32 Figure 3.5: Autocorrelation of analysis of the test dataset. ........................................................... 33 Figure 4.1: A machine learning algorithm using real system data to predict output ..................... 37 Figure 4.2: Multilayer perceptrons neural networks ..................................................................... 38 Figure 4.3: Schematic representation of model tree, where iX  and iM  represent the input parameters and local models ...................................................................................... 41 Figure 4.4: Scatter plots of observed and predicted copper concentrations .................................. 49 Figure 4.5: The probability distribution of the error residuals of five techniques ......................... 50 Figure 5.1: A schematic representation of the methodology used in this study. ........................... 54 Figure 5.2: Predicted and observed P-boxes for Cu, Cd, and Zn .................................................. 60 Figure 5.3: Integrating ANN and SVM predictions for Cu using four methods and observed P-boxes .......................................................................................................................... 61 Figure 5.4: Integrating ANN and SVM predictions for of Cd using four methods and observed P-boxes ............................................................................................................................ 62 Figure 5.5: Integrating ANN and SVM predictions for Zn using four methods and observed P-boxes .......................................................................................................................... 63 Figure 5.6: Observed concentrations of Cd; index on the x-axis represents time. ........................ 65 Figure 6.1: A schematic representation of the conceptual model .................................................. 70 Figure 6.2: Observed and predicted concentrations of copper and zinc in the lake ...................... 75 Figure 6.3: Derived and regulatory PNEC concentrations of copper and zinc ............................. 77 Figure 6.4: Predicted exposure and effect concentrations of copper and zinc .............................. 78 Figure 7.1: Proposed methodology for ecological risk assessment ............................................... 82 Figure 7.2: Example of constructed P-box (dashed lines) and its cuts (solid lines) ...................... 83 Figure 7.3: Schematic representation of the mine system ............................................................. 87 x  Figure 7.4: Observed and predicted concentrations of sodium, calcium, magnesium and iron in the groundwater ......................................................................................................... 91 Figure 7.5: Observed and predicted concentrations of copper, zinc, sulfate and alkalinity in the groundwater ............................................................................................................... 93 Figure 7.6: Derived and guideline PNECs of copper and zinc ...................................................... 94 Figure 7.7: Predicted exposure and effect concentrations of copper and zinc .............................. 95 Figure 7.8: Comparisons between the predictions of fugacity and PHREEQC models, and   observed p-boxes on a logarithmic scale. .................................................................. 97 Figure 8.1: The proposed methodology for selection of mitigation alternatives at minesites ..... 102 Figure 8.2: Evaluation matrix ...................................................................................................... 103 Figure 8.3: Conceptual model for risk assessment ...................................................................... 108 Figure 8.4: Cumulative distribution function of net flow ............................................................ 117 Figure 8.5: Contribution (%) of criteria on ranking of alternatives ............................................. 118   xi  LIST OF ABBREVIATIONS AHP  Analytical Hierarchy Process AMD Acid Mine Drainage ANN Artificial Neural Network  ARD Acid Rock Drainage AMELIA  Multiple Imputations of Incomplete Multivariate Data  BPNN Back Propagation Neural Network  CD-MUSIC  Charge Distribution MUltiSIte Complexation CW Constructed Wetland   DN Do-nothing EDCM Empirical Drainage-Chemistry Model ELECTRE  Elimination Et Choix Traduisant La Realite EMB Expectation-Maximization with Bootstrapping  EOF Excavation and Off-site disposal   EQ3/6 Calculation of chemical Equilibrium between Aqueous and Minerals EON Excavation and On-site disposal  ERS Environmental Risk Assessment FU Future use and aesthetic  GRNN General Regression Neural Network  IA Ion-Association  IM Implementability  IMPSEQ  Sequential Imputation for Missing Values  IRMI Iterative Robust Model-Based Imputation  K-NN K-nearest neighbors  LP Long-term performance  MAE  Mean Absolute Error  MAR Missing At Random  MAU/VT  Multi-Attribute Utility/Value Theory MC Membrane System  MCAR Missing Completely At Random  MCDA Multiple Criteria Decision Analysis  MCS Monte Carlo Simulation  MIN3P Multi-component Reactive Flow and Transport Model MINMAX  Minimum and Maximum MINMAXMODE  Minimum Maximum and Mode MINTEQA2 Metal Speciation Equilibrium Model for Surface and Ground Water ML Machine Learning mmms Minimum Maximum Mean and Standard Deviation MRA  Multiple Regression Analysis  MVRA Multivariate Regression Analysis  NETPATH  Net Geochemical Reactions Along a Flow Path NMAR Not Missing At Random NOEC No Observed Effect Concentration  xii   OTEQ  One-dimensional Transport with Equilibria Calculations PBIAS Percent Bias  P-boxes Probability boxes  PHREEQC           PH (pH), RE (Redox), EQ (Equilibrium), and C (programming language) PNEC  Predicted-No-Effect-Concentration PROMETHEE                 Preference Ranking Organization Method for Enrichment of Evaluations RAE Relative Absolute Error RT Reduction of toxicity  SB Sulfate Reducing Bacteria  SC Soil Cover  SMT Sustainable Mine Technology SOLMINEQ.88  A computer program for geochemical modeling of water-rock interactions SP Short-term performance  Stdev  Standard deviation SVM Support Vector Machine SVM-Poly Support Vector Machine with Polynomial Kernel SVM-RBF Radial Base Function Kernel WATEQF4 A computer program for calculating chemical equilibrium of natural waters WC Water cover WT Water Treatment     xiii    LIST OF SYMBOLS A chemical transport by advection flow 𝑎𝑖 Alternative i 𝑎𝑖𝑖 Weight vectors of input layers 𝑎0𝑖 Bias term of hidden layers  𝑏𝑖𝑗 weight vectors of hidden layers 𝑏0𝑗 Bias term of output layers  𝐵𝐵 Bulk density 𝛽 Regression coefficients  𝐶𝐶𝐶 Concentration of copper 𝐶𝑍𝑍 Concentration of zinc 𝑐 Total number observed values  𝐵 Chemical transport by diffusion 𝐵𝑜𝑜𝑜 Observed elements  𝐵𝑚𝑖𝑜 Missing elements 𝑑𝑔𝑔 Thickness of aquifer 𝑑𝑜; Thickness of soil 𝑑𝑑 Dilution factor  𝐸𝑆  Emission into soil ε Error term G Mass phase flow rate 𝐺𝑜𝑠  Flow rate in soil 𝐺𝑔𝑔 Flow rate in groundwater g Transfer function 𝑔𝑖 Criterion K Nonlinear mapping kernel xiv  L Lagrange function l Length 𝐾𝑑 Partitioning coefficient M Matrix Max Maximum Mi Local models Min Minimum  MW Molecular weight  n(vi)  Number of missing values in variable P Preference function 𝑝𝑝 Pruning factor Q Aquivalence q Quartile 𝑣𝑠 Variable 𝑉𝑜𝑖 Observed values  𝑉𝑍𝑖 Missing values into two matrices  𝑋𝑐 Complete dataset matrix 𝑋𝑚 Missing dataset matrix 𝑋 Input variables 𝑋�𝑐 Mean of observed values 𝑥𝑖 Input pattern Xi Input parameters  θ Distribution parameter T Training set Y System output 𝑌� Predicted output 𝑍𝑖 Outputs of hidden layers 𝑦𝑖 Output pattern W Width xv  w Degree of flatness ζ Slack variables p’ Prediction passed up to a next higher node p  Prediction passed to current node from below P(x) Imprecise probability function 𝑃�(𝑥) Upper probability 𝑃(𝑥) Lower probability q  Predicted value by a model at current node n The number of training instances S Area 𝑥𝑞 Unknown test instance 𝑤𝑖 Weight 𝑌𝑜 Observed output 𝑌𝑝 Predicted output 𝑌�𝑝 Mean of predicted output 𝑍′ Aquivalence capacity 𝑍𝑊′  Aquivalence capacity of water T Chemical transformation by reaction V Compartment volume  k Constant of first order rate 𝐿𝐶50 Median lethal concentration 𝐵𝑊𝑆 Diffusion of metals from water to soil 𝐵𝑆𝑊 Diffusion of metals soil to water 𝐵𝑆𝑆𝑊 Diffusion of solid-particles to water 𝐴𝑜𝑠  Advection through soil  𝐴𝑔𝑔 Advection through groundwater z Charge number of ion  𝐼 Ionic strength 𝑎𝑖 Ion size parameter xvi  𝛽𝑜& 𝛽′  Specific-ion parameters α A constant for a similarly charged class of electrolytes v+ The sum of the stoichiometric coefficients for the cation   v- The sum of the stoichiometric coefficients for the anion 𝛽𝐼 The equivalent fraction of the activity of an exchangeable ion 𝛽𝐼𝑀 The molar fraction of the activity of an exchangeable ion 𝐼 − 𝑋𝑖  The activity of exchangeable ion with respect to the number of exchangeable cations 𝐼1/𝑖 − 𝑋 The activity of exchangeable ion with respect to the number of exchange sites t Time  v Pore water flow velocity DL The hydrodynamic dispersion coefficients q Concentration in the solid phase π Multicriteria preference index Φ Incoming and outgoing flow PI Preference II Indifference RI Incomparability CR Consistency ratio ρ Spearman rank correlation coefficient   xvii  ACKNOWLEDGEMENTS This thesis would not have been possible without the support of various individuals and organization. First and most, I thank God who is the beginner and finisher of everything, for providing me the strength and inspiration to finish this work. Also, I thank my wife for her patience, understanding, support and unceasing love that helped me to finish this work.  I am very grateful to my academic supervisor, Dr. Rehan Sadiq for his constant support, encouragement, and invaluable guidance in the last three years. Without his liberal thought, I could not have discovered the limit of my inner-self. I am also grateful to him for inspiring me to be a true scholar and for living by example. I would like to thank my co-supervisor, Dr. Solomon Tesfamariam for his advice, support and encouragement during this research. My deepest gratitude goes to Dr. Kevin A. Morin for introducing me to the mining industry, providing data and unwavering moral support to materialize this work. Special thank goes to Dr. Craig Nichol for his insightful guidance on geochemistry and providing me various supports. Also, I thank my committee members Dr. Bahaman Naser and Dr. Cigdem Eskicioglu for their constructive comments and suggestions.  I thank the University of British Columbia, Natural Sciences and Engineering Research Council (NSERC), MITACS-Accelerate and Pacific Booker Minerals (PBM) Inc. for their financial support.   xviii    DEDICATION To my mother and wife   Chapter 1  1  CHAPTER 1 INTRODUCTION 1.1 Motivation Acid rock drainage (ARD), also called acid mine drainage (AMD), is a major environmental problem causing pollution locally and globally (Gray, 1996; Gray, 1998; Azapagic, 2004).  ARD is generated when a sulfide-bearing material reacts with oxygen and water during mining activities (Morin and Hutt 2001a; Price 2009). This reaction changes relatively insoluble chemical species in sulfide minerals into more easily dissolved free ionic species (e.g., 2Cu + and 2Zn + ) or secondary minerals (e.g., sulfate, carbonates and hydroxides). Moreover, the oxidation of some sulfide minerals produces acid that may lower the drainage pH. A lower drainage pH increases the rate of sulfide oxidation, solubility of many products of sulfide oxidation and rate of weathering for other minerals.  2 22 2 2 42 2 7 4 4 2FeS H O O H SO Fe+ − ++ + − − + +   1.1   2 32 24 4 4 2Fe O H Fe H O+ + ++ + − − +   1.2  3 2 22 2 414 8 15 2 16FeS Fe H O Fe SO H+ + − ++ + − − + +   1.3  32 33 ( ) 3Fe H O Fe OH H+ −+ −− +   1.4  The sequence of oxidation reactions for pyrite ( 2FeS ) to generate ARD is given above in equations 1.1 through 1.4 (Ferguson and Erickson, 1988) although there are many sulfide-bearing materials (e.g., chalcopyrite ( 2Cu S ), Sphalerite ( ZnS ), and Millerite ( NiS ). In equation 1.1, the pyrite oxidizes and releases hydrogen ions, sulphate, and ferrous ions. Next, the soluble ferrous ions react with oxygen to form the ferric ions as shown in equation 1.2, and it occurs slowly under low pH (Ferguson and Erickson, 1988). If the formed ferric ions contact with pyrite, the third reaction (equation 1.3) occurs catalyzed by Thiobacillus ferrooxidans bacteria under a low pH. On the other hand, if the ferric ions contacted with water, the last reaction (equation 1.4) occurs, and the ferric ions precipitates as hydroxysulfate (Ferguson and Erickson, 1988). However, the sequence of oxidation reactions for pyrite presented by Ferguson and Erickson (1988) has many simplifications. For instance, equations 1.1-1.3 are only valid for 2.3pH < , whereas equation 1.4 is only valid for4pH > . In addition, the effect of Thiobacillus as catalysis has been greatly exaggerated and not well demonstrated (Morin and Hutt, 2010).  Chapter 1  2  The global area covered with mine waste is in the order of 100 million hectares that contain several hundred thousand million tonnes of mine wastes, and 20,000-25,000 million tonnes of solid waste are added every year (Lottermoser 2010). In the eastern and western U.S., approximately 7000-16,000 kilometers of streams are affected by acid generating rock and 8,000-16,000 kilometers by ARD (USEPA, 1994). In Canada, approximately 351 million tonnes of waste rocks, 510 million tones sulfide tailings, and more than 55 million tonnes of other mining sources have the potential to release ARD (Environmental Mining Council of BC, 1995). The associated liability costs of potentially acid-generating wastes at minesites are estimated to be US$ 1.2-20.6 billion in USA, US$ 1.3-3.3 billion in Canada, and US$ 530 million in Australia (Miller et al. 2006; Pelley 2007). However, the author suspects that the estimate for Canada should be at least an order of magnitude higher. Once ARD is produced, water transports these toxic substances (i.e., metals) into the environment that may contaminate receiving water resources and soils. Exposure to these toxic substances may cause serious human and ecological risks (Azapagic, 2004). Potential human health risks include increased chronic diseases and various types of cancers (USEPA, 2007). Ecological risks range from elimination of species to a significant reduction in ecological stability and bioaccumulation of metals in the flora and fauna (Gray, 1996).  An appropriate ARD management plan can reduce the environmental risks, reduce remediation costs by the mining industry as well as regulators/ governments, and help gain public support for future mining operations (INAP (The International Network for Acid Prevention), 2009). Consequently, several guidelines have been developed at the provincial and national levels in Canada and internationally (Price and Errington, 1998; Price, 2005; World Bank Group, 2007; INAP, 2009; Price, 2009). For instance, the Canadian and International Network for Acid Prevention (INAP) guidelines are briefly discussed below.  The guidelines developed by the province of British Columbia (Price and Errington, 1998) and MEND (Price, 2009) comprise of identification and characterization of geological materials, prediction of ARD potential for different units of sulfide materials using various techniques (e.g., static tests, kinetic tests and modeling approaches), and development of mitigation measures based on various criteria (e.g., environmental risk, liability, and long-term land alienation). The INAP guideline comprises of characterization, prediction, and prevention/mitigation phases (INAP, 2009; Chapter 1  3  Verburg et al., 2009). In the characterization phase, a conceptual site model is developed once the sources, pathways, and receptors have been characterized. In the prediction phase, the drainage quality is predicted using various techniques (e.g., static tests, kinetic tests, and modeling approaches). In the prevention/mitigation phase, mitigation measures are evaluated based on predefined environmental objectives (e.g., protectiveness, regulatory acceptance, community approval, and costs) and a measure that performs best overall is selected. The existing guidelines for ARD management have limitations with respect to prediction future drainage chemistry, assessing environmental risk, evaluating mitigation measures, and handling uncertainty. Often models are used to predict future drainage chemistry from mine wastes; however, research showed that 76% of U.S. mines formed ARD although they are predicted low or no ARD formation (Kuipers et al. 2006; Pelley 2007). The guidelines emphasize the importance of assessing uncertainties (e.g., Price, 2005; Price, 2009) at each phase of mining; however, little has been done to address this issue (Maest et al. 2005; Nordstrom 2007). Although humans are not capable of solving multiple objectives unaided (McDabueks et al. 1999), evaluation of mitigation measures are conducted without incorporating multicriteria decision analysis (MCDA) aids. Methodologies to assess and manage environmental risks due to ARD are not presented in the existing guidelines.  1.2 Research Objectives The main objective of this research is to propose a risk management framework for ARD that can improve the prediction of ARD chemistry, assess, and manage environmental and human health risks to guide decision-making under uncertainty. The specific objectives of this research are to: 1. Investigate different imputation methods to fill in missing values of ARD chemistry data; 2. Evaluate different machine learning techniques to develop models that predict future drainage chemistry using operational data; 3. Develop a methodology that addresses the uncertainties of machine learning techniques for predicting ARD chemistry using probability bounds approach; 4. Develop a methodology that quantifies ARD risks under uncertainty using a fugacity-based model for minesites with limited hydrogeological information; Chapter 1  4  5. Develop a methodology that quantifies ARD risks under uncertainty for minesites with adequate hydrogeological information; and 6. Develop a methodology that uses multicriteria decision analysis for risk management of ARD under uncertainty.   1.3 Thesis Organization Figure 1.1 presents the organization of the thesis. The thesis is organized based on seven publications which are either published or submitted. Each publication achieves a specific objective of this research as shown in Figure 1.1. Chapter 2 provides literature review that presents background information relevant to this research. The proposed framework for this research is also presented. Chapters 3 through 8 present articles which are published or under review. Chapter 9 provides conclusion and recommendation of this research. The bibliographies are provided for entire thesis to avoid duplication.   Chapter 3 (Publication #1) extracts complete dataset from raw data of ARD chemistry and produces missing values artificially in the complete dataset. It then uses three imputation methods to estimate the missing values. The promising methods to fill in missing values are identified by comparing the estimates of each method to the observed values using statistical techniques. Chapter 4 (Publication #2) evaluates four machine learning techniques to predict future ARD chemistry using operational data. The operation data are used to identify variables that control drainage chemistry and develop models. The predictions from each technique are compared with observed data to identify the promising techniques. Chapter 5 (Publication #3) is an extension and improvement of the earlier chapter. It uses the promising machine learning techniques to predict future ARD chemistry, quantifies the predictive uncertainties of each method, and integrates the output of each technique to reduce the predictive uncertainties due to data and model. Chapter 6 (Publication #4) provides a risk assessment methodology for minesites with poor hydrogeological information. It develops and validates fugacity-based model to simulate the fate-and-transport of metals released waste rock into groundwater under uncertainty. It then characterizes ecological and human risks by comparing the degree of overlaps between the exposure and response concentrations. Chapter 7 (Publication #5) extends the methodology presented in the previous chapter for minesites with adequate hydrogeological information. It sets up and validates the PHREEQC model to estimate exposure concentration instead of a fugacity model. Chapter 8 (Publication #6) presents a methodology that Chapter 1  5  uses MCDA to manage ARD risk under uncertainty. It uses the risk results from Chapters 6 and 7 and other criteria to evaluate mitigation measures and identify an optimal measure that reduce the impact risk of ARD. Appendix I (Publication #7) presents a methodology for human health risk assessment in minesites using a fuzzy-probabilistic approach.  Figure 1.1: Thesis organization Chapter 2 6   CHAPTER 2 LITERATURE REVIEW AND PROPOSED RAMEWORK This chapter consists of the literature review and the proposed framework. First a brief review of literature is presented that gives background knowledge, identifies drawbacks and limitations of existing practices. A proposed framework for ARD management is then presented.  2.1 Literature Review In this section, techniques for ARD prediction, environmental risk assessment, risk management, and uncertainty analysis are discussed. 2.1.1 ARD Prediction Techniques The goal of predicting ARD drainage chemistry is to determine potential environmental impacts due to future ARD generations and to develop measures required to prevent adverse environmental impacts (Price, 2009). Techniques used for ARD prediction include laboratory and field tests (e.g., static and kinetic tests) and predictive models, which are discussed in more detail below. Laboratory and field tests that include static and kinetic tests are used to determine the potential to generate acid and drainage chemistry of mine wastes (Price 2009; U.S. EPA 1994; Verburg et al. 2009; Morin and Hutt 2001a). Static tests are used to determine whether a sample (e.g., rock or soil) will be acid producing when exposed to weathering (Blowes et al. 2003). Kinetic tests are employed to estimate the rates of sulphide oxidation and metal leaching rates of mine wastes under a designed experiment that mimics the field conditions (Blowes et al. 2003; Price 2009). Kinetic tests provide information on the rate of sulfide oxidation, the rate of acid generation, time to net acidic conditions, and drainage chemistry (USEPA 1994; Price 2009). Static tests differ from kinetic tests based on time required to undertake the test. Static tests are conducted at one-time, whereas kinetic tests are conducted by applying aqueous solution or humidity on the sample repeatedly overtime (Jambor et al. 2002). A summary of static and kinetic tests is presented in Table 2.1. The detail descriptions and procedures of these tests are given in Morin and Hutt (2001) and Price (2009). Static and kinetic tests have uncertainties because they are conducted at small-scale (e.g., laboratory and field tests) that might inadequately represent the process of acid generations at full-scale of a mine due to temporal and spatial variations (USEPA 1994; Morin and Hutt, 2007).  Chapter 2 7  Table 2.1: Summary of static and kinetic tests Static tests Kinetic tests Whole-rock and near-total elemental analysis Humidity tests Soluble constituents analysis Modified humidity tests Sulphur and carbonate analysis Column tests Neutralization potential Shake flask test Acid Base accounting Soxhlet test Net Acid Generation (NAG) tests Wet-dry cycle test Physical Analyses Elevated temperature test Mineralogical analysis Field tests (tubs and test piles) Predictive models are used to describe water chemistry, assess fate and transport of metals, and plan mitigation measures in order to reduce the environmental damage. These models are broadly classified into empirical and geochemical models (USEPA 1994; Perkins et al. 1995; Maest et al. 2005; Price 2009). Empirical models describe the time-dependent behavior of a mine waste system using previously observed data (USEPA, 1994). These empirical models are highly site-specific and depend on years of monitoring data of a minesite. Therefore, the prediction accuracy of the empirical models depends heavily on the quality of available data. An example of empirical model is provided by Morin and Hutt (2001), named empirical drainage-chemistry model (EDCM), and is used for the prediction of drainage quality using historical data from minesites.       Chapter 2 8   Table 2.2: Open source code models for ARD prediction Model Description MINTEQ (Allison et al., 1991) Computes the equilibrium chemistry of dilute aqueous solutions in the laboratory and include sorption WATEQ4F (Ball and Nordstrom 1991) Computes major and trace element speciation and mineral saturation for natural waters SOLMINEQ.88 (Kharaka et al. 1988) Computes distribution of mass among aqueous species and complexes and calculates the saturation indices of minerals at temperatures up to 350°C and pressures up to one kilo bar PHREEQC (Parkhurst and Appelo 1999) Compute aqueous speciation, mass transfer, and reaction paths and one-dimensional transport OTEQ (Runkel et al., 1999) Computes advection, dispersion, lateral inflow and transient storage, and speciation and complexation of aqueous species, precipitation/dissolution and sorption NETPATH (Plummer et al. 1991) Computes the mass transfers in every possible combination of the selected phases that accounts for the observed changes in the selected chemical and (or) isotopic compositions observed along the flow path EQ3/6 (Wolery, 1979) EQ3 computes distribution of species for natural water compositions; EQ6 uses the results of EQ3 to predict the consequences of heating and cooling aqueous solutions and of irreversible reaction in rock-water systems MINTRAN (Walter et al. 1994) It simulates transport of multiple thermodynamically reacting chemical substances in groundwater systems MIN3P (Mayer et al. 2002) It simulates hydrology, reaction-path, sorption, oxidation kinetics, and gas diffusion Chapter 2 9  MINTEQA2=Metal Speciation Equilibrium Model for Surface and Ground Water; EQ3/6= Calculation of chemical Equilibrium between Aqueous and Minerals; OTEQ = One-dimensional Transport with Equilibria Calculations; PHREEQC = PH (pH), RE (Redox), EQ (Equilibrium), and C (programming language); MIN3P= Multi-component Reactive Flow and Transport Model; NETPATH = Net Geochemical Reactions Along a Flow Path; SOLMINEQ.88 = A computer program for geochemical modeling of water-rock interactions; WATEQF4= A computer program for calculating chemical equilibrium of natural waters Geochemical models describe mine waste system in terms of chemical and/or physical processes that are believed to control ARD. These models are based on either the ion-association (IA) or Pitzer methods (Nordstrom and Campbell 2014). In the ion-association method, the extended Debye–Hückel equation and the mass-action equation are used to solve a chemical equilibrium problem. In the Pitzer method, the Pitzer equation and the mass-action equation is used to solve a chemical equilibrium problem. The ion-association method works better for dilute waters, whereas the Pitzer method works for brines (Nordstrom and Campbell 2014). These models predict the drainage chemistry of mine wastes by calculating chemical speciation and mineral saturation indices that indicate whether minerals will dissolve or precipitate in the environment (Price 2009). Moreover, these models simulate acid-base equilibrium, redox equilibrium, and adsorption/desorption (Blowes et al. 2003). The limitation of geochemical models is that they may require intensive site-specific studies and data, but collecting those data with sufficient accuracy is often difficult and expensive. A summary of open source geochemical models commonly applied to evaluate mine drainage chemistry is presented in Table 2.2. A critical review of geochemical models can be seen in Perkins et al. (1995), Alpers and Nordstrom (1999), and Nordstrom (2014). Their applications for metal risk assessment are well documented in the literature (Blowes et al. 2003; Nordstrom, 2014; Caruso et al. 2008; Runkel et al. 2012). 2.1.2 Ecological Risk Assessment Environmental or ecological risk assessment (ERA) is a process that evaluates the likelihood that adverse ecological effects may occur or are occurring as a result of exposure to one or more stressors (U.S. EPA 1992). Several guidelines and frameworks have been developed (U.S. EPA 2007; U.S. EPA 1998; U.S. EPA 1992; USEPA 1998; CCME 1996). These frameworks have common components, which include receptor characterization, exposure assessment, hazard assessment and risk characterization. These guidelines differ from each other based on the level of uncertainty analysis conducted and number of tiers involved in the frameworks. The definitions of each component are given below following the Environment Canada framework (1996): Chapter 2 10  • Receptors are parts of the environment including individuals, populations, communities, or ecosystems that can be adversely affected; • Exposure is the co-occurrence of a contaminant with receptors and is determined by measuring/estimating the concentrations of the contaminant in the environment; • Hazard refers to  the impact of a contaminant on receptors and is determined by conducting toxicity tests in laboratory or field tests; and • Risk is the evaluation of whether an adverse effect will occur as a result of exposure to a contaminant. The ERA framework (CCME, 1996a) has three tiers: (i) screening assessment, (ii) preliminary quantitative ERA, and (iii) detailed quantitative ERA. Screening assessment is characterized by simple, qualitative, and/or comparative methods, and relies heavily on literature information and previously collected data. The purposes of screening assessment include compilation and evaluation of available data and information, identification of relevant exposure pathways, chemical of concern, information gaps; refinement and update of conceptual models and assessment and measurement of endpoints; and determination of whether further next tiers are needed to design and implement remedial actions. The purposes of preliminary quantitative ERA include producing a preliminary quantitative risk estimate for receptors exposed to chemicals, refine and re-evaluate the conceptual model, define and re-evaluate the assessment and measurement endpoints, and set terms of reference for higher tier if necessary. The detailed quantitative ERA relies on site-specific data and predictive modeling to supply quantitative information, particularly on complex ecosystem responses. It has the purposes to produce quantitative predictions regarding current and future risks to assessment endpoints, and to develop an adaptive process for selecting unique, site-specific, quantitative remediation objectives, which may be revised through time. Each tier has receptor characterization, exposure assessment, hazard assessment, and risk characterization components. As the risk assessment progress from tier-I to tier-III,  the levels of assessment increase from simple to complex, qualitative to quantitative, descriptive to predictive, and literature to field. In addition, the framework has planning and reporting elements. A detailed review of each section is provided following the Environment Canada guideline (1996). Chapter 2 11  2.1.3 ARD Risk Management ARD risk management involves selecting a best mitigation measure (i.e., alternative) from a set of mitigation measures based on predefined environmental objectives (e.g., protectiveness, regulatory acceptance, community approval, and costs). Therefore, ARD risk management is a problem of MCDA that should be dealt through MCDA methods. MCDA methods deal with a problem whose alternatives are predefined and decision-makers rank available alternatives based on the evaluation of multicriteria (Tesfamariam et al. 2010; Sadiq and Tesfamariam, 2009; Tesfamariam and Sadiq, 2006).  There are many MCDA methods, most of which consist of four steps: (i) structuring the decision problem, (ii) articulating and modeling preferences, (iii) aggregating the alternative evaluations (preferences); and (iv) making recommendations (Guitouni and Martel, 1998). The main differences among these methods lie in the type of algorithm used to aggregate alternative evaluations, the types of input data required, and their final results (i.e., a single alternative vs. rank of alternatives).  The literature classifies the MCDA methods into elementary, multi-attribute utility and value theories (MAU/VT), outranking, and mixed methods although the literature shows there are many classifications (Guitouni and Martel, 1998; Belton and Stewart, 2002; and Figueria et al., 2005). The elementary methods identify a non-dominated alternative based on the performance of alternatives in at least one criterion. These methods are suitable for reducing the number of alternatives into potential alternatives; however, they are not suitable for environmental problems that involve many alternatives and criteria (Kiker et al., 2005). Examples of elementary methods include weighted sum, maxmin, conjunctive and other methods.  Utility theory methods use a utility/value function for each criterion to evaluate alternatives and to aggregate the utility/value of each criterion in order to identify the best alternative (Keeney and Raiffa, 1993). Examples of these methods include MAU/VT (multi-attribute utility/value theory), AHP (analytical hierarchy process), TOPSIS, and other methods.  The outranking methods build an outranking relation (i.e., a binary relation that is used to evaluate the performance of the outranking character of one alternative over another alternative), and then exploit this relation to identify the best alternative, sort alternatives into groups, or rank them (Belton Chapter 2 12  and Stewart, 2002). Examples of outranking methods include ELECTRE (elimination et choix traduisant la realite), PROMETHEE (preference ranking organization method for enrichment of evaluations), and other methods.  The mixed methods integrate the above methods with uncertainty analysis techniques. Examples of mixed methods include QUALIFLEX, fuzzy conjunctive/disjunctive, Martel and Zaras, and other methods. Guitouni and Martel (1998) presented a comprehensive review of MCDA methods and their description. Moreover, a comprehensive literature review of MCDA applications in the environmental decision making was conducted by Kiker et al. (2005) and Huang et al. (2011). 2.1.4 Uncertainty Analysis The nature of uncertainty refers to whether the uncertainty is due to incomplete knowledge or inherent variability (Walker et al., 2003). The uncertainty due to incomplete knowledge, also referred as epistemic, arises due to lack of complete knowledge (Ayyub and Klir 2006). This type of uncertainty can be reduced by more study and research. While the uncertainty due to inherent variability, also called aleatory, arises due to random nature of the process (Walker et al. 2003; Ayyub and Klir 2006). Aleatory uncertainty cannot be reduced by more studies. Table 2.3 summarizes various typologies of uncertainty used in environmental decision-making. In general, uncertainty is defined as incomplete knowledge about a particular subject (Ayyub and Klir 2010; Walker et al. 2003). An appropriate method used to quantify and propagate uncertainty requires identifying the source and nature of uncertainty (Morgan and Henrion 1990; Ferson and Ginzburg 1996). The source and nature of uncertainty are discussed in the section below. The source and nature of uncertainty are discussed based on typology presented by Morgan and Henrion (1990), Walker et al. (2003), and Loucks and van Beek (2005). The source refers to where the uncertainty manifested from (Walker et al., 2003), which can include model structure uncertainty, model parameter uncertainty, input data uncertainty, and uncertainty related to decision-making. The typologies of uncertainty that are used throughout this research are described below. • Model structure uncertainty is the conceptual uncertainty that arises due to incomplete understanding and simplified descriptions of modeled processes as compared to reality  (Refsgaard et al. 2007; Walker et al. 2003). For example, the flow-path of water and air in Chapter 2 13  mine rock may follow curved-path. But during the model conceptualization process, the flow-path could be assumed as straight-line that affects the amount and chemistry of leachate obtained.  • Input uncertainty can arise due to system data that are used to drive the model. An example of such data includes contaminant concentrations from waste rocks, particles size of waste rocks, gas concentrations in the waste rocks and other data.  • Parameter uncertainty arises from uncertain estimates of various model parameters (Loucks and van Beek 2005). This type of uncertainty attributed to data sets and the methods used to calibrate the model parameters. For example, suppose a reaction rate of pyrite is a parameter used in a geochemical model. Such parameter is often estimated from measured leachate materials during model calibration.  However, the estimated parameter value has uncertainty because of the measurement errors.  • Linguistic uncertainty arises from imprecise (vagueness) nature of human language (Morgan and Henrion 1990).  It is common practice that expert knowledge is used where there is lack of data during the pre-mining phase. For example, an expert could tell that the particle size of rock is medium for a given site, but medium could have different values for other experts.  • Decision uncertainty arises from the preference of the decision-makers, which includes human goals, interests, activities, demands, and impacts (Morgan and Henrion, 1990; Loucks and van Beek 2005). An example of such uncertainty can be the degree of protectiveness of a mitigation measure perceived by various decision-makers.  The nature of uncertainty refers to whether the uncertainty is due to incomplete knowledge or inherent variability (Walker et al., 2003). The uncertainty due to incomplete knowledge, also referred as epistemic, arises due to lack of complete knowledge (Ayyub and Klir 2006). This type of uncertainty can be reduced by more study and research. While the uncertainty due to inherent variability, also called aleatory, arises due to random nature of the process (Walker et al. 2003; Ayyub and Klir 2006). Aleatory uncertainty cannot be reduced by more studies.     Chapter 2 14  Table 2.3: Typology of uncertainty Literature Types of uncertainty Morgan and Henrion (1990) Statistical variation, subjective judgment, Linguistic imprecision, variability, inherent randomness, disagreement and approximation US-EPA (1997) Scenario, parameter and model Walker et al. (2003) Location or source: context, inputs, model (structure and technical), parameter and model outcome; level: statistical, scenario, recognized ignorance and total ignorance; nature: epistemic and variability Loucks and van Beek (2005) Knowledge (model and parameter), natural variability (temporal and spatial), decision (goals-objectives and values-preferences) Ascough et al. (2008) Knowledge (process understanding, parametric/data, model structure, technical, and model output), variability (natural, human, institutional, technological), linguistic (vagueness, ambiguity, non-specificity); decision (goals/objectives or assessment criteria) Matott et al. (2009) Input (input data,  response data and model parameters) and model (structure, resolution, and code)  Several methods for uncertainty analysis are presented by Ayyub and Klir (2006). Summary of probability theory, fuzzy set theory, Dempster-Shafer theory (DST), and imprecise probability are presented below and their applications are summarized in Table 2.4. • Probability theory: for random variable x  which is an element of the universal set X , a probability distribution function  maps x  in the ranges zero and one (Klir 2005). Monte Carlo Analysis (MCA) is the most widely used probabilistic technique for determining the true alternative of a variable in the environmental decision-making framework. Types of MCA commonly applied for uncertainty analysis include one-dimensional (1D) and two-dimensional (2D) techniques. The techniques could vary depending on the way the random variates are generated (e.g., straight versus Latin Hypercube sampling). • Fuzzy set theory: elements belong to a given set by degree of membership that takes value in the interval zero to one in fuzzy sets (Ayyub and Klir 2006). For x  which is an element of X , the membership function defines the degree of membership by mapping x  in the interval of real values [0, 1].  Chapter 2 15  • Possibility theory: it is characterized by possibility and necessity measures (Ayyub and Klir 2006). A possibility measure is a function that maps the power set of the universal set between the values zero or one. The necessity (Nec) measure is defined as a dual measure of possibility function. • The Dempster-Shafer theory (DST) maps the lower probability and upper probability on the real values of [ ]0,1  (Dempster 1967). The three most important concepts of this theory are belief measure, plausibility measure, and a basic probability assignment. The belief, plausibility, and a basic assignment show the likelihood that an element belongs to each subset as strong, weak and collected evidence, respectively (Ayyub and Klir 2006). There are many combination rules for combining evidence. Combination rules are special types of aggregation methods for data obtained from multiple sources (Sentz and Ferson 2002). The purpose of combining (aggregating) evidence is to synthesize and extract useful information from data coming from multiple sources. • Imprecise probability is a generalization of probability theory when one is not able to define a precise probability function P for an event x , which is element of the universal set X  (Walley 1991). An imprecise probability function ( )P x  is characterized by its lower probability ( )P x  and upper probability ( )P x . Lower probability and upper probability functions map an event x X∈  in interval values between zero and one. It is worth noting that there many methods to model imprecise probability such as interval probability (Cui and Blockley 1990; Hall et al. 1998), probabilistic arithmetic (Wiiliamson and Downs 1990), and probability bounds (Ferson 2001).            Chapter 2 16  Table 2.4: Application of uncertainty analysis methods in environmental risk assessment Literature Approach Application area Criscenti et al. (1996) 1D MC Inputs uncertainties of geochemical code Guyonnet  et al. (1999) 1D MC and Possibilistic Risk assessment modeling Cabaniss (1999) 1D MC and derivative approximation Inputs uncertainties of geochemical calculation Guyonnet et al. (2003) Fuzzy-probabilistic Environmental risk assessment Ferson et al. (2004) DST and probability bound analysis Epistemic and aleatory uncertainty in risk assessment Denison and Garnier-Laplace 2005) 1D MC Database parameter uncertainty in mineral equilibrium calculation Baudrit et al. (2005) Fuzzy-probabilistic and Dempster-Shafer Imprecision and randomness uncertainty analysis Kentel and Aral (2005) 2D MC and 2D fuzzy-Monte Carlo Environmental risk assessment Baudrit et al. (2007) DST and fuzzy-probabilistic Risk of groundwater contamination Li et al. (2007) Fuzzy-stochastic Risk of groundwater contamination  Mansour-Rezaei et al. (2012) Fuzzy set theory, 1D and 2D MC, and Probability bound analysis Contaminant transport in groundwater Leavitt et al. (2011) 1D MC Inputs uncertainties of equilibrium modeling   Chapter 2 17  2.2 Proposed Framework To address the highlighted limitations in the background (Chapter 1) and previous section (Chapter 2), a framework for ARD risk management is proposed and presented in Figure 2.1. The proposed framework starts by collecting drainage chemistry data from a minesite. The drainage chemistry data could be either from operational or from kinetic tests depending on whether a mine is at a pre-mining phase or an operational phase. If the data have missing values, they will be filled in the second step. Once the missing data are filled in using the imputation methods presented, the complete data can be used in the next steps. In the third step, machine learning techniques are used to develop models that predict future ARD drainage chemistry deterministically and under uncertainty. The developed methodologies allow predicting future drainage chemistry and quantifying uncertainties due to data, parameter, and model. This step completes addressing limitations associated with ARD prediction. In fourth step, two methodologies are developed to conduct risk assessment in minesites based on the availability of hydrogeological information. If there is limited hydrogeological information, a fugacity approach is suitable to conduct risk assessment since it does not require detailed hydrogeological information. On the other hand, a PHREEQC approach is suitable to conduct risk assessment in minesites where adequate hydrogeological information is available. If the outcome of a risk assessment indicates unacceptable risk level, the risk management step will be initiated.  In fifth step, a methodology that uses MCDA is developed for risk management of ARD deterministically and under uncertainty. It evaluates various mitigation measures that have multiple criteria and conflicting objectives and identify the optimal measure. It also quantifies the uncertainty associated with ranking mitigation measures. The proposed framework is implemented through six original methodologies that have been developed during this research. While a summary of each methodology is presented in the next section, a thorough explanation of each methodology is presented in chapters 3-8.   Chapter 2 18   Figure 2.1. Proposed ARD risk management framework  Chapter 2 19  2.2.1 Filling in Missing Data The methodology (Chapter 3) is developed to fill in missing values of ARD drainage chemistry data. It identified two imputation methods that are suitable to fill in missing values in drainage chemistry data. These methods are iterative robust model-based imputation (IRMI) and sequential imputation for missing values (IMPSEQ). The methodology compares the performance of these methods to estimate missing values and suggests which method to use to address the issue of missing values in ARD chemistry database.  2.2.2 Prediction of ARD Chemistry using Machine Learning Techniques: A Deterministic Approach The methodology (Chapter 4) identifies machine learning techniques that are suitable to predict ARD chemistry without considering uncertainty. The techniques that are suitable for prediction are support vector machine and artificial neural network. The methodology also identifies physical and chemical parameters that control drainage chemistry to develop empirical models using a linear approach.  2.2.3 Prediction ARD Chemistry using Machine Learning Techniques: A Probability Bounds Approach  The methodology (Chapter 5) presents an approach to quantify and reduce predictive uncertainties of support vector machine and artificial neural network. The sources of uncertainties include data, parameter and model. It uses a probability bounds approach to address the issue of predictive uncertainty and aggregation method to integrate the predictions of the two techniques. In addition, it provides a nonlinear approach to identify physical and chemical parameters that control drainage chemistry. 2.2.4 Risk Assessment: A Fugacity Approach The methodology (Chapter 6) simulates fate-and-transport of metals in environment using a fugacity approach, estimates the response of receptors to exposure concentration, and estimate the risk poses to receptors. It also quantifies and propagates the uncertainties associated with data and parameters in Chapter 2 20  an integrated approach using a probability bounds approach. This methodology is suitable to minesites where there is scarcity of hydrogeological information. 2.2.5 Risk Assessment: A PHREEQC Approach The methodology (Chapter 7) extends the previous methodology for minesites with adequate hydrogeological information. It simulates fate-and-transport of metals in environment using a PHREEQC model, estimates the response of receptors to exposure concentration, and estimate the risk poses to receptors. It also quantifies and propagates the uncertainties associated with data and parameters in an integrated approach using a probability bounds approach.  2.2.6 Risk Management: A MCDA Approach The methodology (Chapter 8) presents a deterministic and probabilistic MCDA approach to evaluate mitigation measures and identify optimal mitigation measures. The type of MCDA approach used is PROMETHEE. It ranks mitigation measures and quantifies the uncertainty associates with ranking. It also provides the impact of decision criteria on ranking through sensitivity analysis.  Chapter 2 21  CHAPTER 3 FILLING IN MISSING DATA A version of this chapter has been published in Mine Water and the Environment with a title” On the Issue of Incomplete and Missing Water-quality Data in Minesites Databases: Comparing Three Imputation Methods.” The lead author is Getnet D. Betrie and the coauthors are Dr. Rehan Sadiq, Dr. Kevin A. Morin and Dr. Solomon Tesfamariam.  3.1 Background Large ARD chemistry databases are valuable for predicting drainage chemistry, identifying optimal measures for mitigation and remediation, and refuting/refining models and theories (Morin et al. 2012). These databases often contain analyses of dozens of chemical elements at many locations over extended periods of time. Some data are often missing within such databases, due to periodic lack of sampling and analysis or input errors. Missing data present challenges in data driven modeling that includes machine learning, soft-computing and data mining. The challenges associated with missing data include the use and interpretation of partially collected data (Betrie et al. 2013), accuracy of learning algorithms (Bello 1995; Acuna and Rodriguez 2004), and the use of software as statistical software are developed under the assumption that input is provided as a complete data matrix (Schafer and Olsen 1998). In the latter case, the software would delete listwise/pairwise or substitute missing values with mean values if the data matrix has one or more missing values. In listwise deletion, any records containing a missing value in one of the variables would be deleted. On the other hand, a variable containing missing values would be deleted and the other variables used for analysis in pairwise deletion. The downside of such methods to filling in missing values is that it leads to a severe loss of information or introduction of bias (Little and Rubin 2002). Therefore, better methods are required to treat missing values before modeling. Methods for treating missing values can be grouped into list/pairwise deletion, parameter estimation, and imputation (Allison 2001; Little and Rubin 2002). In list/pairwise deletion, records or variables that contain missing value are deleted in order to create a complete data matrix. In the parameter estimation method, a joint probability function is defined for the data matrix and parameters are estimated using expectation-maximization algorithm or direct maximum likelihood. In the imputation method, missing values are estimated using single or multivariate imputation methods. Single imputation substitutes missing values with mean  Chapter 2 22  estimation or a value estimated from a model. On the other hand, the multiple imputation method substitutes missing values with estimated values from two or more single imputation results. In this chapter, imputation methods are examined because the literature (e.g. Allison 2001; Little and Rubin 2002) shows that imputation methods are superior to other methods. Prior to applying an imputation method, however, the mechanism of handling incompleteness should be determined as different assumptions were considered with the developed methods (Little and Rubin 2002). The literature classifies the mechanism of accounting for missing data into (e.g., Allison 2001; Little and Rubin 2002; Graham 2012): missing completely at random (MCAR), missing at random (MAR), and not missing at random (NMAR). In MCAR, the probability of a missing value for a variable does not depend on the probability of missing values of this variable and observed values of other variables. In MAR, the probability of a missing value does not depend on the missing values of this variable, but it does depend on observed values of other variables. In NMAR, the probability of a missing value depends on the other missing values. It is worth noting that the literature (e.g., Güler et al. 2002) indicates that water-quality data from minesites has the nature to follow MAR mechanism.  The objective of this chapter is to investigate three imputation methods to fill in missing values of ARD chemistry data. These methods include iterative robust model-based imputation (IRMI), multiple imputations of incomplete multivariate data (AMELIA), and sequential imputation for missing values (IMPSEQ). Note that these methods are selected because their assumptions are based on MAR.  3.2 Methodology 3.2.1  Description of Algorithms  In this study, two single imputations (IRMI and IMPSEQ) and one multiple imputation (AMELIA) methods were compared to impute missing values of multivariate ARD chemistry data from the example minesite. The algorithms of the three methods used to estimate missing values are implemented in RStudio (RStudio 2012) and their description are presented below. 3.2.1.1 Iterative Robust Model-based Imputation (IRMI) The IRMI algorithm is a model-based imputation method where missing values are estimated using sequence of regression models (Templ et al. 2011). A brief of summary of this algorithm is Chapter 2 23  presented below, but a detailed description of this algorithm, including its development and verification, can be found in Templ et al. (2011). The six steps procedure for IRMI algorithm is: Step 1: Sort the variables based on the number of missing values as show in equation 3.1,   1 2( ) ( ) ( )pn v n v n v≤ ≤ ⋅⋅⋅  3.1 where ( )in v  denotes the number of missing values in variable lv , for 1, ,i p= ⋅⋅⋅ . Step 2: For a given variable v, separate the observed and the missing values into two matrices ioV and inV , where io  and in  are the indices of observed and missing values. Step 3: A regression equation as shown below will be developed using the matrices with unknown regression coefficients β  and error termε .    i io ov V β ε= +  3.2 Step 4:  Estimate the regression coefficients ?̂? and the error term ε  from the observed values and use them to estimate the missing values as shown below. 𝑣𝑍𝑖 = 𝑉𝑍𝑖?̂? + 𝜀  3.3  Step 5: Repeat Steps 2 to 4 for the other variables. Step 6: Repeat Steps 2 to 5 until the estimated values converge. 3.2.1.2 Sequential imputation for missing values (IMPSEQ) The IMPSEQ algorithm is a covariance-based imputation method where missing values are estimated sequentially by minimizing the determinant of the covariance matrix of the data (Verboven et al. 2007). The brief of summary of the IMPSEQ algorithm is presented below, but a detailed description of this algorithm can be found in Verboven et al. (2007). The five steps procedure for the IMPSEQ algorithm is: Step 1: Partition the dataset X  into complete and missing matrices, which are cX  and mX , respectively.  Step 2: Sort the variables based on the increasing number of missing values, mX .   Chapter 2 24  Step 3: For a variable with least number of missing values, estimate the missing values by minimizing the determinant of the covariance matrix that is shown in equation 3.4. 'c c1 1cov( ) cov(X ) ( )( )m m m ccX X X X Xc c−= + − − 3.4 where c is the total number observed values and cX  is the mean of observed values.  Step 4: Update the missing values in cX .  Step 5: Repeat Steps 2 to 4 for the other variables. 3.2.1.3 Multiple Imputation of incomplete multivariate data (AMELIA) The AMELIA algorithm estimates missing values using the expectation-maximization with bootstrapping algorithm (EMB) (Honaker et al. 2006). This algorithm assumes that the complete dataset has multivariate normal distribution and the data are missing at random. The brief summary of the AMELIA algorithm is presented below, but a detailed description of this algorithm can be found in the literature (e.g., Honaker et al. 2006; Honaker and King 2010). The six steps procedure for AMELIA algorithm is: Step 1: Partition the dataset D  into observed obsD  and missing misD  elements, which is{ },obs misD D D= . The missing values are estimated based on the Bayesian concept as shown in the equation below,  ( | ) ( | )obs misp D p D dDθ θ= ∫  3.5 where θ  is the distribution parameter. Step 2: Define a matrix M  that has zero and one elements. The zero and one represent the missing and observed values, respectively. Step 3: Create m  samples from M  using the bootstrapping algorithm. Step 4: Estimate the posterior θ  for the M samples using the EMB algorithm. Step 5: Estimate the missing values in each m  sample using the posterior parameter and the conditional distribution of obsD . Chapter 2 25  Step 6: Take the averages of m  estimates to obtain each missing value. 3.3 Data preparation and methods evaluation 3.3.1 Data preparation The water-quality database was obtained from a copper-molybdenum-gold-silver-rhenium minesite located in British Columbia, Canada. The database contains time series dataset that is collected at daily bases. Figure 3.1 shows the variables in the database and their median, first and third quartiles, range, and outlier values using box plots. The dataset consists of 5720 indices and 12 variables as shown in Figure 3.2 (left). The red and grey colors represent, respectively, the missing and observed values in the water-quality dataset. Censored (below a detection limit) values are replaced by 0.55 of the detection limit, as recommended by Sanford et al. (1993).   Figure 3.1: Box plots showing variables in the water-quality database In order to compare the methods (i.e., IRMI, IMPSEQ, and AMELIA) previously described, a test dataset with complete observation is extracted from the raw dataset. The test dataset consists Chapter 2 26  of 8 variables (i.e., pH, electrical conductivity (EC), Cu, Zn, Cd, Ca. Mg, and Al) and 825 indices that are extracted from indices 4700 to 5525 of the raw dataset (Figure 3.2, left). It is worth noting that these indices contain no missing values. Following the missing values pattern in the raw dataset, some observed values in the test dataset were deleted to reproduce the missing values pattern in the raw dataset (Figure 3.2, right). Before the analysis begins, the dataset was transformed into normal distribution using the logarithmic transformation because the original dataset was log-normally distributed (Betrie et al. 2013). This test dataset with its missing values is used to compare the three methods. The statistical summary of this data set is presented in Table 3.1. This table shows that the missing values in Al is highest, followed by EC, Cu, Mg, Cd, Ca, pH, and Zn.  Table 3.1: Statistical summary of the test data used to evaluate the imputation methods  Zn (mg/L) pH  Ca (mg/L) Cd (mg/L) Mg (mg/L) Cu (mg/L) EC (mS/cm) Al (mg/L) Minimum 0.1 4.43 140 0 22 0.01 840 0.1 1st Quartiles 0.7 6.72 240 0.01 35 0.02 1330 0.1 Median  1.9 7.13 270 0.02 39 0.03 1440 0.2 Mean   1.91 7.07 274.1 0.02 40.76 0.05 1435 0.31 3rd Quartiles 2.7 7.53 310 0.02 45 0.05 1509 0.2 Maximum 7.3 10.65 430 0.05 72 0.73 3450 5.2 Sample size 826 826 826 826 826 826 826 826 Missing (%) 4 8 11 15 21 26 62 78 3.3.2 Performance Evaluation The accuracy between observed and estimated values was evaluated for each imputation method. The evaluation techniques include mean absolute error (MAE), relative absolute error (RAE), and percent bias (PBIAS), as shown in equations 3.6 to 3.8. MAE and RAE compare the match between estimated and observed values. MAE and RAE values equal to zero indicate optimal values; in general, a method that has the smallest values is preferred over the other methods (Betrie et al. 2013). PBIAS measures whether the average tendency of estimated values is larger or smaller than their observed values (Gupta et al. 1999; Betrie et al. 2011). Values of PBIAS equal to zero, positive, or negative indicate, respectively,  ideal estimation, overestimation and Chapter 2 27  underestimation (Gupta et al. 1999). In addition to these three evaluation techniques, visual inspection was used.  1no piY YMAEn=−=∑ 3.6 11no pino oiY YRAEY Y==−=−∑∑ 3.7 ( )( )11100np oinoiY YPBIASY==−= ×∑∑ 3.8   where oY , pY , oY and n   are the observed, estimated, mean of the observed sample, and size of the dataset, respectively.  Figure 3.2: Raw (left) and test (right) datasets with their observed (grey) and missing (red) values.  Chapter 2 28  3.4 Results and Discussion The performance of the imputation methods is presented in Table 3.2 and Figure 3.3 and Figure 3.4. For Zn, which had 4% missing values, the MAE and RAE results show that the IRMI method performed best followed by IMPSEQ and AMELIA. The PBIAS results showed the estimation of the IRMI and IMPSEQ methods underestimated the observed values, whereas the estimation of AMELIA highly overestimated the observed values. It is interesting to note that the magnitude of underestimation for IRMI is larger than IMPSEQ. Figure 3.3 shows that most of the estimated values of Zn by IRMI and IMPSEQ are above the ideal line (the line of equality along which the estimated and observed values are equal), which indicates that the two methods underestimated the observed values. On the other hand, the entire estimation of Zn by AMELIA is below the ideal line, which indicates this method highly overestimated the observed values of Zn. For pH, which had 8% missing values, the MAE, RAE, and PBIAS results show that IMPSEQ performed best followed by IRMI and AMELIA. The PBIAS results indicate that IRMI AMELIA, IMPSEQ slightly overestimated, highly overestimated, and slightly underestimated the observed values, respectively. Figure 3.3 shows most of the imputed values of IMPSEQ and IRMI methods are equally distributed below and above the ideal line, which indicates that the imputed values are well estimated. However, the imputed values of AMELIA are far below the ideal line, indicating that the imputed values are highly overestimated.  For Ca, which had 11% missing values, the results of all evaluation techniques show that the performance of IRMI is the best followed by IMPSEQ and AMELIA. The PBIAS results show that the imputed values of IRMI is slightly underestimated, whereas the imputed values of IMPSEQ and AMELIA are slightly and highly overestimated the observed values, respectively. Visual inspection of Figure 3.3 shows that most of the imputed values of IMPSEQ and IRMI are distributed around the ideal line and this indicates that the imputations of these two methods are quite good. On the other hand, the imputed values of AMELIA are way below the ideal line, indicating that the method tended to highly overestimate the observed values.   Chapter 2 29  Table 3.2: The performance of the imputation methods    Variable Missing (%) MAE RAE PBIAS IRMI Zn 4 0.84 1.71 -32 IMPSEQ Zn  0.90 1.84 -17 AMELIA Zn  34 71 1483 IRMI pH 8 0.28 0.82 0.25 IMPSEQ pH  0.22 0.65 -0.13 AMELIA pH  64 184 893 IRMI Ca 11 18 0.47 -0.63 IMPSEQ Ca  20 0.29 3 AMELIA Ca  2359 56 944 IRMI Cd 15 0.004 1.02 2.60 IMPSEQ Cd  0.003 0.86 2.82 AMELIA Cd  0.14 35 1115 IRMI Mg 21 5.97 1.43 18 IMPSEQ Mg  3.2 0.77 8.057 AMELIA Mg  297 71 1006 IRMI Cu 26 0.020 0.92 -3 IMPSEQ Cu  0.01 0.76 -14 AMELIA Cu  0.41 19 1048 IRMI EC 62 230 1 5 IMPSEQ EC  158 0.68 7 AMELIA EC  12771 55 966 IRMI Al 78 0.26 1.09 -1 IMPSEQ Al  0.18 0.77 -16 AMELIA Al   2.40 10 681 For Cd, which had 15% missing values, the results of all evaluation techniques show that the performance of IMPSEQ is the best followed by IRMI and AMELIA. In addition, the PBIAS results show that imputed values of all three methods overestimated the observed values, but the Chapter 2 30  overestimation of IRMI and IMPSEQ is slight, whereas the overestimation of AMELIA is very high. Visual inspection of Figure 3.3 shows that the estimations of IMPSEQ and IRMI are well distributed above and below the ideal line. This indicates that the central tendency of imputed values and the observed values would tend to be similar, but the estimates could have larger error on-sample-by-sample basis. On the other hand, the imputed values of AMELIA are way below the ideal line, indicating that the method is highly overestimated the observed values. The imputed values are clustered into three values since most of the observed values are close to the analytical-method detection limit. For Mg, which had 21% of missing values, the results of all evaluation techniques show that IMPSEQ was the best, followed by IMRI and AMELIA. PBIAS shows that the imputed values are overestimated by all methods, but the overestimation of IMPSEQ and IRMI is slight, whereas the overestimation of AMELIA is very high. Visual inspection of Figure 3.4 shows that most of the imputed values of IMPSEQ and IRMI are distributed below the ideal line, indicating that these methods slightly overestimated the observed values. On the other hand, the imputed values of AMELIA are clustered way below the ideal line, indicating that this method is highly overestimated the observed values.   For Cu, which had 26% of missing values, the results of all evaluation techniques show that IMPSEQ is the best, followed by IRMI and AMELIA. The PBIAS values show that the imputed values of IMPSEQ and IRMI are slightly underestimated, whereas the imputed values of AMELIA are highly overestimated. Figure 3.4 shows that the imputed values of IMPSEQ and IRMI are distributed along the ideal line, indicating that imputations are reasonable. All of the imputed values of AMELIA are below the ideal line, indicating that this method is highly overestimated the observed values.  Chapter 2 31   Figure 3.3: Comparing the logarithm of observed and estimated values of variables pH, Zn, Ca, and Cd. For conductivity, which had 62% missing values, MAE and RAE show that IMPSEQ performed the best, followed by IRMI and AMELIA. PBIAS shows that the imputed values of all methods are overestimated, but the overestimation of IMPSEQ and IRMI is slight, whereas the overestimation of AMELIA is very high compared to other methods. Visual inspection of Figure 3.4 shows that most of the imputed values of conductivity by IMPSEQ and IRMI are below the ideal line, indicating that the imputed values are slightly overestimated the observed values. The imputed values of AMELIA are way below the ideal line, indicating that the imputed values are highly overestimated the observed values. For Al, which had 78% missing values, MAE and RAE show the imputation of IMPSEQ performed the best, followed by IRMI and AMELIA. PBIAS shows that imputed values of IRMI and IMPSEQ are slightly underestimated, whereas the imputed values of AMELIA are highly Chapter 2 32  overestimated. Figure 3.4 shows that the imputed values of IRMI and IMPSEQ are distributed along the ideal line, indicating that imputation of these methods are reasonable. Most of the imputed values of AMELIA are below the idea line, indicating that imputation of AMELIA is overestimated the observed values.    Figure 3.4: Comparing the logarithm of observed and estimated values of variables Mg, Cu, Conductivity, and Al. In general, the results show that IMPSEQ is the best imputation method followed by IRMI, whereas the imputation of AMELIA is the worst. The AMELIA algorithm did not perform best probably because the multivariate normality assumption is violated as the transformed pH variable may result in bimodal distribution. According to (Templ et al. 2011), the assumption of multivariate normality becomes inappropriate for multimodal distributions. It is interesting to note that the imputations of IRMI and IMPSEQ introduce certain bias. This could be attributed to the dataset is highly autocorrelated. Figure 3.5 shows the result of autocorrelation analysis. Any Chapter 2 33  values above the dashed-line are significant. This reveals that there is a significant autocorrelation within each variable.  Figure 3.5: Autocorrelation of analysis of the test dataset. 3.5 Summary Minesites drainage chemistry databases often have missing values due to periodic lack of sampling and analysis or input errors. These missing values often limit the use and interpretation of data in making appropriate decisions. This chapter presents advanced methods to fill missing values in drainage chemistry databases to improve decision-making in minesites. Three imputation methods were investigated to impute missing values of multivariate drainage chemistry data from minesites: iterative robust model-based imputation (IRMI), sequential imputation for missing values (IMPSEQ), and multiple imputations of incomplete multivariate data (AMELIA). The performance of these imputation methods were evaluated using mean Chapter 2 34  absolute error (MAE), relative absolute error (RAE), and percent bias (PBIAS) techniques. In addition, visual inspection was used to evaluate the imputation methods. The results show that IMPSEQ and IRMI are suitable to impute missing values of drainage chemistry databases at minesites, whereas AMELIA is not.  Chapter 4 35  CHAPTER 4 PREDICTION OF ACID ROCK DRAINAGE CHEMISTRY USING MACHINE LEARNING TECHNIQUES: A DETERMINSTIC APPROACH  A version of this chapter has been published in Environmental Monitoring and Assessment with a title” Predicting copper concentrations in acid mine drainage: a comparative analysis of five machine learning techniques.” The lead author is Getnet D. Betrie and the coauthors are Dr. Solomon Tesfamariam, Dr. Kevin A. Morin, and Dr. Rehan Sadiq.  4.1 Background Predicting the future drainage chemistry is important to assess potential environmental risks of ARD and implement appropriate mitigation measures. However, predicting the potential for ARD can be exceedingly challenging because its formation is highly variable that varies from site-to-site depending upon mineralogy and other operational and environmental factors (U.S. EPA 1994). For this reason, laboratory test, field test and a variety of predictive modeling approaches have been used for predicting the potential of mined materials to generate acid and contaminant (U.S. EPA 1994; Maest et al. 2005; Price 2009). Laboratory and field tests are often undertaken for short periods of time with respect to the potential persistence period of ARD; hence, they may inadequately mimic the evolutionary nature of the process of acid generation (U.S. EPA 1994). Predictive modeling approaches have been used to overcome the uncertainties inherent in short-term testing and avoid the prohibitive costs of very long-term testing. Empirical models describe the time-dependent behavior of one or more variables of a mine waste geochemical system in terms of observed behavior trends (U.S. EPA 1994). The most commonly used empirical model in Canada was developed by Morin and Hutt (Morin and Hutt 2001b; Morin and Hutt 1993; Morin and Hutt 1994; Morin et al. 1995). These researchers developed an empirical model, named empirical drainage-chemistry model (EDCM), and applied it for the prediction of drainage chemistry using historical data from minesites. The EDCM approach involves defining correlation equations using least-linear fitting between concentrations and other geochemical parameters typically pH and sulphate.  In this chapter, advanced empirical models, namely machine learning techniques have been explored to develop models that predict future drainage chemistry using operational data. This approach is widely applied to solve environmental and civil engineering problems (Hsieh 2009; Reich 1997). An example of using one of machine learning techniques to predict ARD was  Chapter 4 36  presented by Khandelwal and Singh (2005) although it did not involve the use of existing minesites database. These researchers compared artificial neural network (ANN) to multivariate regression analysis (MVRA) for prediction of mine water quality. They reported that ANN provided acceptable results compared to MVRA. However, this comparison lacks adequate performance evaluation measures. Machine learning approaches are useful to develop predictive models but their use requires insight into the learning problem formulation, selection of appropriate learning methods, and evaluation of modeling results to achieve the stated goal of the modeling activity (Reich and Barai 1999; Cherkassky et al. 2006). This chapter evaluates machine learning techniques to develop models that predict future drainage chemistry using operational data. The selected machine learning techniques are artificial neural networks (ANN), support vector machine with polynomial (SVM-Poly) and radial basis function (SVM-RBF) kernels, model tree (M5P), and K-nearest neighbors (K-NN). The prediction accuracy refers to the difference between observed and predicted values, whereas predictive uncertainty refers to the variability of the overall error around the mean error. The detailed description of machine learning methods and the approach taken are presented in the following sections. 4.2 Methodology 4.2.1 Machine learning techniques  Machine learning is an algorithm that estimates an unknown dependency between mine waste geochemical system inputs and its outputs from the available data (Mitchell 1997). The available mine waste geochemical data are usually represented as a pair ( , )i ix y  which is called an example or an instance. The machine learning consists of input variables X , mine waste geochemical system that return output Y for each input variable, and a machine learning algorithm that selects mapping functions (i.e., ( )Y f X= ), which describes how the mine waste geochemical system behaves (Figure 4.1). The goal of learning (training) is to select the best function that minimizes the error between the system output (Y ) and predicted output 𝑌� based on examples data. These examples data used for training purpose are called a training data set. The process of building a machine learning model follows general principles adopted in modeling: study the problem, collect data, select model structure, build the model, test the model and iterate (Solomatine and Ostfeld 2008). There are various types of machine learning techniques but artificial neural networks, support vector machine, model tree and K-neighbors are explored in this study. These Chapter 4 37  techniques are implemented using WEKA 3.6.4 Software (Bouckaert et al. 2010) and they are described in detail in the following sections.  Figure 4.1: A machine learning algorithm using real system data to predict output 4.2.1.1 Artificial Neural Network (ANN) Artificial Neural Network (ANN) is one of machine learning techniques that consists of neurons with massively weighted interconnections (Bishop 1995). These neurons are arranged as input layer, hidden layer and output layer as displayed in Figure 4.2. The task of the input layer is only to send the input signals to the hidden layer without performing any operations. The hidden and output layers multiply the input signals by set of weights and either linearly or nonlinearly transforms results into output values. These weights are optimized during ANN training (calibration) process to obtain reasonable predictions accuracy.  MACHINE   LEARNINGAMD SYSTEMINPUT DATAMINIMIZEYXYChapter 4 38   Figure 4.2: Multilayer perceptrons neural networks In this study, multilayer perceptron is used although there are various types of ANN algorithms (Bishop 1995). Multilayer perceptron is a feedforward neural network, where signals always travel in the direction of the output layer. A typical multilayer perceptron with one hidden layer can be mathematically expressed in equations 4.1 to 4.4.  The outputs of hidden layer ( jZ ) is obtained as (i) summing products of the inputs ( iX ) and weight vectors ( ija ) and a hidden layer’s bias term ( 0 ja ) as seen in equation 4.1, and (ii) transforming this sum using transfer function gusing equation 4.2. There are many transfer functions, but the logistic function was used in this research. Similarly, the outputs of the output layer kY  is obtained by (i) summing products of hidden layer’s outputs ( jZ ) and weight vectors ( jkb ) and output layer’s bias term ( 0kb ) as shown in equation 4.3, and (ii) transforming this sum using transfer function g  as seen in equation 4.4.  01inpNj i ij jiu X a a== +∑  4.1  ( )j jZ g u=  4.2  01hidNk j jk kjv Z b b== +∑  4.3  ( )k kY g v=  4.4   XNinpX1 aij Z1ZNhidY1YNoutbijChapter 4 39  4.2.1.2 Support Vector Machine (SVM) The Support Vector Machine was mainly developed by Vapnik and co-workers (Vapnik 1998; Cherkassky and Mulier 2007). Its principle is based on the Structural Risk Minimization that overcomes the limitation of the traditional Empirical Risk Minimization technique under limited training data. The Structural Risk Minimization aims at minimizing a bound on the generalization error of a model instead of minimizing the error on the training dataset. The SVM algorithm was first developed for classification problems and then adapted to address regression problems. In this study, the basic idea of SVM regression is illustrated since a regression problem is solved.  The complete description of SVM regression is well presented by Smola and Schölkopf (2004) and a summary of it is presented in this study. Given a training dataset ( ),i ix y , where ix  is thei th−   input pattern and iy  is corresponding target value iy ∈ℜ . The goal of SVM regression is to find a function ( )f x  that has at most ε  deviation from actually obtained targets iy  for all training data, and at the same time, is as flat as possible (Vapnik 2000). The function f  is represented using a linear function in the feature space ( ) , ,f x w x b with w X b= + ∈ ∈ℜ  4.5 where .,.  denotes the dot product in X . In this case, the flatness means seeking a small w . This can be ensured by minimizing the norm (i.e., 2.w w w= ) if the assumption that a function f  is known a priori  to approximate all pairs ( ),i ix y with precision. If such function is not known a priori, it is possible to introduce slack variables ,i iζ ζ∗  and allow for some errors. This minimization problem can be mathematically expressed as  2i1( )2,,, 0iii i ii i ii iMinimize w Cy w x bSubject to w x b yζ ζε ζε ζζ ζ∗∗∗+ + − − ≤ + + − ≤ + ≥∑ 4.6  The constant 0C >  determines the tradeoff between the flatness of f  and the amount up to which deviations larger than ε  are tolerated. The constrained optimization problem is converted into unconstrained optimization by introducing Lagrange function. The Lagrange function is Chapter 4 40  constructed from the objective function and the corresponding constraints by introducing a dual set of variables as follows: 2i i i1 1i i i1 11: ( ) ( y b)2( y , b) ( )l li ii il li i i i ii iL w Cw xζ ζ α ε ζα ε ζ η ζ η ζ∗= =∗ ∗ ∗ ∗= == + + − + − +− + + − − − +∑ ∑∑ ∑ 4.7  It follows from the saddle point condition that the partial derivatives of L  with respect to the primal variables (w,b, , )i iζ ζ∗  have to vanish for optimality. Substituting the results of this derivation into equation 4.7 yields the dual optimization problem. [ ]i1 1 111( )( ) , ( ) y ( )2( ) 0 , 0,l l li i j j i j i i i ii i ili i i iiMaximize x xSubject to and Cα α α α ε α α α αα α α α∗ ∗ ∗ ∗= = =∗ ∗=− − − − + + −− = ∈∑ ∑ ∑∑ 4.8  Once the coefficient iα   and iα∗  are determined from equation 4.8, the desired vectors can be written as follows: 11( )li i iiw xα α∗=== −∑  , and therefore 1( ) ( ) ,li i iif x x x bα α∗== − +∑  4.9  Nonlinear regression problems are very common in most engineering applications. In such case, a nonlinear mapping kernel K  is used to map the data into a higher-dimensional feature space or hyperplane by the function φ . The kernel function, i( , ) ( ), ( )iK x x x xφ φ=  can assume any form. In this study, the polynomial (SVM-Poly) and radial basis function (SVM-RBF) kernels are used. These kernels are displayed in equations 4.10 and 4.11.  Polynomial Kernel: i( , ) ( , ) , 0diK x x x xγ τ γ= + >  4.10  Radial Basis Function Kernel: 2i( , ) exp( ), 0)iK x x x xγ γ= − − >  4.11  where , , ,and dγ τ are kernel parameters. These parameters were optimized using an optimization tool in the software. 4.2.1.3 Model Trees (M5P) Model trees are tree-based models for dealing with continuous-class learning problems with piecewise linear functions, originally developed by Quinlan (1992). The schematic representation of model tree is depicted in Figure 4.3. Given a training set T , this set is either associated with a leaf or some test is chosen that splits T into subsets corresponding to the test outcomes and the Chapter 4 41  same process is applied recursively to the subsets. For a new input vector, (i) it is classified to one of the subsets and (ii) the corresponding model is run to produce the prediction. The steps to build the M5P model trees are building the initial tree, pruning and smoothing.    Figure 4.3: Schematic representation of model tree, where iX  and iM  represent the input parameters and local models In the building tree procedure, the splitting criterion in each node is determined. The splitting criterion is based on treating the standard deviation of the class values that reach a node as a measure of the error at that node, and calculating the expected reduction in error as a result of testing each attribute at that node (Wang and Witten 1997).  The attribute which maximizes the expected error reduction is chosen. The standard deviation reduction (SDR) is calculated using equation 4.12.  ( ) ( )i iiTSDR sd T sd TT= − ×∑  4.12  where T  is the set of examples that reach the node and 1 2, ,T T ⋅ ⋅ ⋅  are the subsets that result from splitting the node according to chosen attribute. The splitting process will terminate if the output values of all the instances that reach the node vary only slightly, or only a few instances remain.  M1x1<=1.3M2M3x2<=6.0M4 M5M6YesYesYesNoNoNo YesYes Nox1<=0.9 x1<=1.5x2<=4.5Chapter 4 42  The pruning procedure makes use of an estimate of the expected error that will be experienced at each node for the test data. First, the absolute difference between the predicted value and the actual class value is averaged for each of the training examples that reach that node. This average will underestimate the expected error for unseen cases, to compensate this it is multiplied by the following equation:  n v pfn v+ ×− 4.13  where n  is the number of training instances that reach that node,  v  is the number of parameters in the model that represents the class value at that node, pf  is pruning factor. The resulting linear model is simplified by dropping terms to minimize the estimated error calculated using the above multiplication factor, which may be enough to offset the inevitable increase in average error over the training instances. Terms are dropped one by one until the error estimate stops decreasing. Once a linear model is in place for each interior node, the tree is pruned back from the leaves, so long as the expected error decreases.  The smoothing process is used to compensate for the sharp discontinuities that will inevitably occur between adjacent linear models at the leaves of the pruned trees. This is a particular problem for models constructed from a small number of training instances. The smoothing procedure in M5P first uses the leaf model to compute the predicted value, and then filters that value along the path back to the root, smoothing it at each node by combining it with the value predicted by linear model for that node. The formula used for smoothing is: 'np kqpn k+=+ 4.14  where 'p  is the prediction passed up to the next higher node, p is the prediction passed to this node from below, q  is the value predicted by the model at this node, n is the number of training instances that reach below and k  is a constant.  4.2.1.4 K-Nearest Neighbors (K-NN) K-nearest neighbor (K-NN) technique is an instance-based learning, where training examples are stored and the generalization is postponed until a prediction made (Mitchell 1997).  The K-NN classifies an unknown input vector qx  by choosing the class of the nearest example x  in the training set as measured by a Euclidean distance. For real valued target functions, the estimate is the mean value of the K-nearest neighboring examples. However, this method is slow for a bigger Chapter 4 43  test set because it involves finding which member of the training set is closest to an unknown test instance ( qx ), which calculates the distance from every member of the training set and select the smallest (Witten and Frank 2005). One way of improving this limitation is to consider weighted distance. Thus, the distance weighted K-NN algorithm was used in this study. In this algorithm, each K-neighbor ix  is weighted according to their distance from the query point qx  as follows: 1q1( )(x )ki iikiiw f xfw=== ∑∑ 4.15  where weight iw  is a function of distance i( , )qd x x between qx  and ix . The most commonly used weight functions are provided in equations 4.16 to 4.18, in this study, however, equation 4.15 is used. Linear: 1 ( , )i q iw d x x= −   4.16  Inverse: 1( ( , ))i q iw d x x−=  4.17  Inverse square: 2( ( , ))i q iw d x x−=  4.18  4.2.2 Parameter selection for drainage quality Modeling of ARD chemistry using machine learning methods requires defining control variables that dictate the process. According to the literatures (Morin and Hutt 1994; Morin et al. 2010; Morin and Hutt 2001a), the most important factors that control drainage chemistry are: (i) geochemical production rates, (ii) infiltration of water,  (iii) elapsed time between infiltration events,  (iv) residence time of water within rocks,   (v) internal temperatures and pore-gas concentrations of oxygen and carbon dioxide, (vi) particle size, and (vii) iron and sulfur-oxidizing bacteria. Geochemical production rates refer to the production rates of elements, acidity and alkalinity under acid and pH-neutral conditions rock. Once produced, these reaction products are either Chapter 4 44  flushed by flowing water or accumulate in the rocks. Whenever, the products flushed out of waste rocks, it highly affects the drainage quality. The infiltration of water controls the amount of reaction products to be flushed. The importance of the elapsed time between infiltration events is that it provides opportunity for reaction products to accumulate in the flow channel. It is worth noting that both the volume of infiltrating water and the elapsed time between infiltration events affect the concentrations and loadings observed in the basal seepage. The residence time of water within waste rocks refers to time required for infiltrated water to pass through rocks. Thus residence time determines the time of reaction products to occur in the basal seepage. The internal temperatures and pore-gas concentrations of oxygen and carbon dioxide can affect the rates of pyrite oxidation and acid generation, and the amount of reactions products. For instance, higher temperatures, lower oxygen and higher carbon dioxide could be associated with higher rates of pyrite oxidation and acid generation. The particle size is an important factor since it primarily affects the surface area exposed to weathering and oxidation. In addition, this factor affects the amount of water and air percolating into waste rock. Iron and sulfur oxidizing bacteria affect oxidation rates since they catalyze the oxidation reaction.  In this study, the physico-chemical parameters monitored for over 25 years from waste rocks were obtained from a copper-molybdenum-gold-silver-rhenium minesite, located in British Columbia, Canada. The monitoring was routinely performed by collecting the drainage samples at well established stations and these samples were analyzed by qualified personnel at the minesite laboratory. The chemical parameters include pH, conductivity, alkalinity, acidity, sulphate and metals. The physical parameters include flow rate, dissolved oxygen, temperature. The amount of precipitation infiltrated into waste rocks was estimated from climatic data. The climatic data such as precipitation, minimum and maximum temperature were obtained from Environment Canada. The evapotranspiration of the site was calculated using the Hargreaves method (Hargreaves and Riley 1985). The Hargreaves method uses minimum and maximum temperature, and solar radiation to estimate evapotranspiration. Then, the site effective precipitation was estimated as difference between precipitation and evapotranspiration.  The input variables to machine learning techniques should consist of all relevant variables that influence the ARD’s generation process. However, overlapping information of input variables should be avoided to simplify the task of the training algorithms. In order to make parsimonious selection of inputs, the linear correlations between input and output variables were examined. It is worth noting that a nonlinear machine learning technique could be able to make use of more information than is revealed by this linear technique. The correlation between the copper concentrations and the others variables with their time lags are shown in Table 4.1. It shows that Chapter 4 45  the current copper concentration highly correlated to previous time copper concentrations (i.e., 1t −  to 5t − ) and other variables except the effective precipitation. While the pH is negatively correlated to current time copper concentration, conductivity and acidity are positively correlated to it. Moreover, the table shows that the current time concentration has strong correlations with pH, conductivity and acidity at previous time state. Therefore, pH, conductivity, acidity and previous time copper concentrations were used as control variables and the current time copper concentrations were used as output. Table 4.1: Correlation between copper concentration and other variables with time lags Time, t ( day ) pH Conductivity ( /S cmm ) Acidity (3 /mgCaCO L ) Effective Precipitation ( mm ) Cu ( /mg L ) t -0.74 0.52 0.81 -0.02 1.00 t-1 -0.69 0.51 0.78 -0.05 0.94 t-2 -0.68 0.50 0.76 0.01 0.90 t-3 -0.65 0.50 0.74 -0.01 0.89 t-4 -0.62 0.49 0.72 -0.01 0.86 t-5 -0.59 0.49 0.71 -0.01 0.84  The statistical summary of the input and output variables considered in this study are summarized in Table 4.2. These variables are pH, conductivity, acidity, and dissolved copper. This table shows that pH dataset distribution has the lowest variability, followed by the conductivity, acidity and copper. In addition, the variability of the independent variables (i.e., pH, acidity, conductivity and effective precipitation) and the dependent variable (copper) are within a reasonable range.        Chapter 4 46  Table 4.2:  Variables and summary of the data used in the study  pH* Conductivity* ( /S cmm ) Acidity* (3 /mgCaCO L ) Cu** ( /mg L ) Minimum 3.56 500 1.5 0.01 1st quartile 4.24 1200 87.0 0.54 Median 4.40 1700 200.0 0.95 Mean 4.531 1718 202.5 0.93 3rd quartile 4.69 2210 290.0 1.30 Maximum 6.46 3140 570.0 2.50 CoV$ 9.93 34.8 63.5 51.74 Sample size (n) 1278 1278 1278 1278 * Used as inputs in model development; ** Used as a model output; and $ Coefficient of variation 4.2.3 Model development and validation The dataset was divided into training and testing sets following the k-fold cross-validation method (Mitchell 1997). In the k fold−  cross-validation method, the dataset is subdivided into k  subsets preferably of equal size. Next, the 1k −  subsets are used to train the machine learning models and the remaining one subset are used for testing the models. In this study, each subset has the size of 128 values and ten-fold cross-validation with stratification was repeated 10 times. This exercise provided a total 100 independent models error for each machine learning techniques. This method is computationally very intensive; however, the author strongly believes that it provided reliable results.  4.2.4 Model evaluation The prediction accuracy helps to evaluate the overall match between observed and predicted values for each machine learning technique. The predictive accuracy of each machine learning technique was evaluated using the Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), Root Relative Squared Error (RRSE), Relative Absolute Error (RAE), where the smaller value indicates a better technique. Moreover, a paired t-test was used to determine whether the mean of error estimates of one machine learning technique is significantly different from another technique. The equations of the error estimates are given in equations 4.19 to 4.22: Chapter 4 47  21( )no piY YRMSEn=−= ∑  4.19  1no piY YMAEn=−=∑  4.20  2121( )( )no pino oiY YRRSEY Y==−=−∑∑  4.21  11no pino oiY YRAEY Y==−=−∑∑  4.22  where oY  and pY  represent the observed and predicted values, oY  represents the mean of the observed values and n  represents the number of examples that are presented to the learning algorithms. Predictive uncertainty refers to the variability of the overall error around the mean error. The predictive uncertainty of each machine learning technique was evaluated using averaged error residuals of the models. Next the averaged residuals of the five techniques are assumed as random variable and probability distributions were fitted using @Risk software (Palisade Corporation Inc. 2005) to compare the predictive uncertainty of the techniques. 4.3 Results and Discussion Performances of the five machine learning techniques in terms of predicting copper concentrations, four evaluation methods are presented in Table 4.3. This summary shows the best (minimum), mean and the worst (maximum) performance of the selected five techniques. The minimum and maximum values show the performance range of the techniques, whereas the mean value shows the average performance of the techniques over testing datasets. These indicators are important for making decision in environmental risk analysis. Comparison of the mean performance indicates that the SVM-Poly is the best technique, followed by SVM-RBF, ANN, M5P, and K-NN techniques on all evaluation methods. The K-NN technique was found to be the poorest performing predictive model.  The observed and predicted copper concentrations using different models are presented in Figure 4.4. This figure shows that the overall predictions of the SVM-Poly fits best to the ideal line (i.e., the diagonal line), followed by SVM-RBF, ANN, M5P and K-NN. This confirms that the conclusion made above on the performance of the techniques. Most of the predicted values of the Chapter 4 48  M5P technique are below the ideal prediction line, whereas the K-NN predictions are above the ideal prediction line. This implies that the K-NN technique underestimates and the M5P overestimates the overall predictions. As can be seen in Figure 4.4, the SVM-Poly predicted the high values better, which is desirable for conservative decision-making in environmental risk analysis. On the other hand, the high values are under predicted by SVM-RBF, ANN and M5P techniques and the K-NN could not predict the high values at all. This suggests K-NN should not be used for decision-making in which the associated risk is high. It is interesting to note that there are a few outliers in Figure 4.4. These outliers data are seen only once in the testing dataset. Subsequently, these values were either underestimated if the observed data have high values (e.g., 2.5 mg/L) or overestimated if the observed data have low values (e.g., 0.96 mg/L) by the machine learning techniques. This indicates that the prediction of machine learning techniques for outliers’ data should be carefully analyzed. Table 4.3: Performance of models over testing sets (n = 128) Models MAE RMSE RAE (%) RRSE (%)  Min Mean Max Min Mean Max Min Mean Max Min Mean Max ANN 0.07 0.17 0.56 0.09 0.22 0.62 12 41 269 14 43 233 SVM-Poly 0.06 0.14 0.42 0.08 0.18 0.46 9 32 152 11 36 134 SVM-RBF 0.06 0.15 0.38 0.09 0.20 0.41 11 36 218 15 39 189 K-NN 0.09 0.28 0.68 0.13 0.34 0.73 18 61 261 18. 61 261 M5P 0.05 0.21 0.68 0.06 0.26 1.44 11 47 205 14 51 193   Chapter 4 49   Figure 4.4: Scatter plots of observed and predicted copper concentrations A paired t-test was used to determine whether the mean of error estimates of one machine learning technique is significantly different from another technique. This t-test is important to ensure that the obtained results are not because of a particular dataset used. The adjusted p-values using the Benjamin- Yekutieli method (Benjamin and Yekutieli 2001), at a significance level of 0.05α = , of the paired t-test on prediction error residuals of the five techniques are shown in Table 4.4. The test results show that the obtained results are statistically significant except the SVM-Poly and SVM-RBF predictions. Although SVM-Poly performed better than SVM-RBF on all model evaluation methods, this test indicates that difference is not statistically significant.      Chapter 4 50  Table 4.4: The adjusted p-values of the paired t-test on error residuals  ANN K-NN M5P SVM-Poly SVM-RBF ANN 1 0.000 0.029 0.000 0.018 K-NN  1 0.004 0.000 0.000 M5P   1 0.000 0.004 SVM-Poly    1 0.1 SVM-RBF     1 The predictive uncertainty of each machine learning technique was evaluated using error residuals. These error residuals were computed as a difference between measured and predicted copper concentrations. For each machine learning technique, the residuals of 100 independent models were calculated and averaged. Next the averaged residuals of the five techniques are assumed as random variable and 18 probability distributions were fitted. The lognormal probability distribution was the best fit to the residuals of the five techniques (Figure 4.5). The best technique is the one that has residuals represented by narrowest, symmetrical and highest probability distribution. The SVM-Poly is the best in terms of the predictive uncertainty, followed by SVM-RBF and ANN techniques. The M5P shows a wider range of predictive uncertainty, whereas the K-NN showed the worst predictive uncertainty.   Figure 4.5: The probability distribution of the error residuals of five techniques 4.4 Summary This chapter explored five machine learning techniques to predict ARD chemistry from minesites without considering uncertainty. The machine learning techniques investigated include artificial 02468100 0.2 0.4 0.6 0.8 1 1.2 1.4ProbabilityError ResidualsK-NNM5PSVM-PolySVM-RBFANNChapter 4 51  neural networks (ANN) with multilayer perceptrons, support vector machine with polynomial and radial base function kernels, model tree with M5P algorithm, and K-nearest neighbors. Physico-chemical parameters and time lag that influence the drainage chemistry were identified as important parameters and were used as inputs in the five techniques. Although the precipitation has been mentioned in the literature as the most controlling parameter of ARD generation, it has not affected the ARD process for this case study area. The prediction results show that the identified inputs parameters represented the system dynamic. However, the predicted results likely improve if more parameters (e.g., flow rate, internal gases concentrations and temperature) are considered as more data become available in the future. The experimental results showed that the support vector machine with polynomial kernel performed best both in terms of predictive accuracy and uncertainty evaluation methods. The support vector machine with radial base function kernel and artificial neural network provided better prediction results, followed by model tree and the K-nearest neighbors showed the worst performance in terms both evaluation measures. These results indicate that the process of ARD generation is highly nonlinear and could not be captured with techniques that build local linear models such as model tree and K-nearest neighbor techniques.  Chapter 5 52   CHAPTER 5 PREDICTION OF ACID ROCK DRAINAGE CHEMISTRY USING MACHINE LEARING TECHNIQUES: A PROBABILITY BOUNDS APPROACH  A version of this chapter has been published in Science of the Total Environment with a title” Uncertainty quantification and integration of machine learning techniques for prediction of acid rock drainage chemistry.” The lead author is Getnet D. Betrie and the coauthors are Dr. Rehan Sadiq, Dr. Kevin A. Morin, and Dr. Solomon Tesfamariam.  5.1 Background Predicting the future drainage chemistry is important to assess potential environmental risks of ARD and implement appropriate mitigation measures that reduce adverse environmental risks (Betrie et al. 2012). Predictive models are one of the approaches used to predict the future drainage chemistry of minesites. These models are classified as process-based and empirical (data-driven) models (Maest et al. 2005; Price 2009). Process-based models describe the ARD system in terms of chemical and/or physical processes that are believed to control ARD generation (Betrie et al. 2013). Nevertheless, the physical/chemical processes that govern generation of ARD are not fully understood (Price 2009). Subsequently, uncertainty is introduced in the prediction of drainage chemistry due to poor representation of the ARD system. In addition, process-based models introduce uncertainty due to data because they use database information (e.g., solubility product) that might not match a given site (Price 2009). On the other hand, data-driven models (e.g., machine learning, soft-computing, computational intelligence) describe the time-dependent behavior of one or more variables of the ARD system in terms of observed data trends obtained from years of monitoring at a minesite (Betrie et al. 2012). Therefore, these models are prone to uncertainties in the data that arise due to epistemic (e.g., measurement errors and limited sample size) and aleatory (e.g., temporal and spatial variation) uncertainties, where these uncertainties arise due to incomplete knowledge and natural stochasticity, respectively (Sentz and Ferson 2002).  The literature review shows that machine learning techniques (e.g., ANN and SVM) have been used to predict ARD drainage chemistry. Khandelwal and Singh (2005) compared ANN and multiple regression analysis (MRA) to predict chemical parameters (sulphate, chloride, total Chapter 5 53  dissolved solid (TDS) and others) as a function of physical parameters (pH, temperature, and hardness). They reported that ANN provided acceptable results compared to MRA. Rooki et al. (2011) evaluated two types of ANN (back propagation neural network (BPNN), and general regression neural network (GRNN)) and MRA to predict heavy metals (Cu, Fe, Mn, Zn) as function of physical/chemical parameters (pH, sulfate, and Mg) in the Shur River near Sarcheshmeh Copper mine, Iran. They reported that the predictive accuracy of BPNN is the best followed by GRNN and MRA. For the Shur River and the same input-output variables, Aryafar et al. (2012) applied SVM and compared to their GRNN model results. The results showed that the predictive accuracy of SVM was slightly better than ANN. Previous chapter evaluated the predictive accuracy and uncertainty of four machine learning techniques (ANN, SVM, mode trees, and K-nearest neighbors) to predict copper concentration as function of physical/chemical parameters and their time lags. It was reported that SVM performed best followed by ANN, model trees and K-nearest neighbors both in terms of predictive accuracy and uncertainty. The prediction accuracy refers to the difference between observed and predicted values, whereas the predictive uncertainty refers to the variability of the overall error around the mean error (Betrie et al. 2012). Although identification and quantification of uncertainties are integral parts of ARD assessment and risk mitigation (Price, 2005; Price, 2009), previous studies have not addressed uncertainty issues except a minor attempt by Betrie et al. (2012). The objective of this chapter is to develop a methodology that quantifies the uncertainties of machine learning techniques for predicting ARD chemistry using a probability bounds approach. The probability bounds approach is an uncertainty analysis method that combines probability theory and interval arithmetic to produce probability boxes, which allow the comprehensive propagation of both variability and uncertainty rigorously (Tucker and Ferson 2003). Furthermore, predictions of ANN and SVM are integrated using four aggregation methods in order to reduce the predictive uncertainty of the individual technique. Aggregation methods are used to combine information obtained from various sources in order to improve the reliability of information (Sentz and Ferson 2002). 5.2 Methodology Figure 5.1 depicts the methodology applied in this research that consists of five blocks. In the first block, data pre-processing is done that includes filling in missing values and outlier analysis. In the second block, variables that control drainage chemistry of ARD are identified and used to Chapter 5 54  develop model using the machine learning techniques. In the third block, the dataset is divided into training and testing sets using ten-fold cross-validation technique. The training dataset is used to optimize parameters of the models, whereas the test dataset is used for predicting drainage chemistry. In the fourth block, first the predictive accuracy of training models is evaluated using four statistical techniques. If the results of training are not acceptable based on the obtained statistics, the modeling process would be reinitiated from the second block. However, the predictive accuracy and uncertainty for test models would be initiated if the training models provide acceptable results. In the last block, the uncertainties due to data and model are quantified using probability bounds approach. Also, predictions from ANN and SVM are integrated using four aggregation methods to reduce the predictive uncertainty of individual models.  Figure 5.1: A schematic representation of the methodology used in this study. DATA PREPROCESSING•FILLING MISSING VALUES•OUTLIER ANALYSISSELECTION OF MODEL VARIABLESACCEPTABLEPERFORMANCE ?UNCERTAINITY ANALYSIS•QUANTIFICATION OF UNCERTAINTY•INTEGRATION OF MODELSNOYESMODEL DEVELOPMENT•MODEL TRAINING•MODEL TESTINGMODEL EVALUATION•PREDICTIVE ACCURACY•PREDICTIVE UNCERTAINTYENDChapter 5 55  5.2.1 Probability bounds  Probability bounds (P-box) is a method used to represent imprecise probability. Imprecise probability is a generalization of probability theory when one is not able to define a precise probability function P  for an event x (Walley 1991). An imprecise probability function ( )P x  is characterized by its lower probability ( )P x  and upper probability ( )P x . Lower and upper probability functions map an event x X∈  in interval values between zero and one (Ferson et al. 2003). The lower and upper bounds on  ( )P x  are the probability that a random variable X  give the possible range of probabilities that X  exceeds any particular value. These bounds are close together when the imprecision is small but may be far apart when it is large. In this study, the predicted and observed P-boxes are constructed to compare the uncertainties and this is implemented using Risk Cal 4.0 software (Ferson 2000).  5.2.2 Aggregation methods Aggregation methods are used to combine information obtained from various sources in order to improve the reliability of information for better decision-making (Sentz and Ferson 2002). In this study, four aggregation methods are used to integrate the prediction results of ANN and SVM. The methods are intersection, envelope, mixture, and averaging.  The intersection method gives the smallest region that all predictions agree with high degree of confidence (Ferson et al. 2003). This method is appropriate to use when a modeler strongly believes that the prediction of each model encloses the distribution of the observed data. Suppose 1 1 1 2 2 2, , , ,..., P ,n n nP P P P P P P P     = = =       are prediction P-boxes of many models, the intersection method mathematically expressed using the equation below 1 2 1 2 1 2... max( , ,..., ),min( , ,..., )n n nP P P P P P P P P ∗ ∗ ∗ =    5.1 The envelope method gives the biggest region that each prediction agrees with certain degree of confidence. It is appropriate when a modeler believes that at least one of the predictions enclose the observed distribution (Ferson et al. 2003). For n prediction P-boxes mentioned above, the envelope method mathematically expressed as 1 2 1 2 1 2... min( , ,..., ),max( , ,..., )n n nP P P P P P P P P ∗ ∗ ∗ =    5.2 Chapter 5 56  The mixture method treats disagreements between the prediction P-boxes from each model and gives a condensed P-box without erasing disagreements (Ferson et al. 2003). For n prediction P-boxes mentioned above, the mixture method mathematically expressed by equation below 1 1 2 2 31 1 2 2(w ... ) /(w w ... ) /n in n iP P w P w P wP P P w P w= + + += + + +∑∑ 5.3 where 1 2, ,..., nw w w  are weights of the P-boxes.  The average method simply horizontally averages the edges of the P-box by finding the inversion of the lower and upper bounds (Ferson et al. 2003). For n-boxes the average method mathematically expressed by the following equation 1 1 111 21 1 111 2( *) (1 )( ... )( *) (1 )( )nnP n P P PP n P P P− − −−− − −−= + + += + + 5.4  where -1 represents the inverse function.  5.2.3 Selection of variables for drainage chemistry The input variables to machine learning techniques should consist of all relevant variables that influence the ARD generation (Betrie et al. 2013). However, overlapping information of input variables is avoided to simplify the task of the training algorithms. For this reason, a nonlinear correlation analysis was conducted using the maximum criterion information (MIC) (Reshef et al. 2011) algorithm implemented in RStudio to identify correlation between input and output variables. Unlike other correlation analysis methods (e.g., Spearman, Pearson, and Kendall), MIC not only detect a linear relationship but also other nonlinear relationships such as cubic, exponential, categorical, periodic, hyperbolic and various sinusoidal types. Its value ranges between zero and one, where zero and one indicate no and perfect relationships, respectively. The correlation between input and output variables are shown in Table 5.1. A value greater than or equal to 0.3 was used from the values presented in Table 5.1 as criterion to select input variable. It shows that the copper concentration correlated to pH, alkalinity, sulfate, acidity, and Al. The Cd concentration is correlated to pH and Al. The Zn concentration is correlated to pH and Al. Therefore, heavy metals (Cu, Cd, and Zn) are predicted as function of pH, alkalinity, sulfate, acidity, and flow. Although flow has a weak correlation with other variables, it is included as input variable because the author believes that the machine learning techniques can extract more complex nonlinear relationships than MIC.  Chapter 5 57   Table 5.1: Correlation between input and output variables using the MIC (n = 4757)   pH Conductivity Sulfate Alkalinity Acidity Flow Ca Mg Al Cu Cd Zn pH 1.0            Conductivity (𝜇𝜇/𝑐𝑐) 0.3 1.0           Sulfate (𝑐𝑔/𝐿) 0.3 0.5 1.0          Alkalinity (𝑐𝑔/𝐿𝑎𝐿 𝐶𝑎𝐶𝐶3) 0.4 0.2 0.1 1.0         Acidity (𝑐𝑔/𝐿𝑎𝐿 𝐶𝑎𝐶𝐶3) 0.4 0.4 0.5 0.2 1.0        Flow (𝐿/𝐿) 0.2 0.2 0.2 0.1 0.2 1.0       Ca (𝑐𝑔/𝐿) 0.1 0.4 0.5 0.1 0.2 0.2 1.0      Mg (𝑐𝑔/𝐿) 0.3 0.5 0.5 0.1 0.3 0.2 0.6 1.0     Al (𝑐𝑔/𝐿) 0.5 0.3 0.4 0.2 0.5 0.2 0.3 0.4 1.0    Cu (𝑐𝑔/𝐿) 0.8 0.3 0.4 0.3 0.5 0.1 0.2 0.3 0.7 1.0   Cd (𝑐𝑔/𝐿) 0.5 0.1 0.2 0.2 0.3 0.1 0.3 0.2 0.5 0.5 1.0  Zn (𝑐𝑔/𝐿) 0.7 0.2 0.3 0.3 0.3 0.1 0.2 0.2 0.5 0.7 0.6 1.0 Chapter 5 58  5.2.4 Model development and evaluation The dataset was divided into training and testing sets following the k-fold cross-validation method (Mitchell 1997). In this method, the dataset is subdivided into k subsets preferably of equal size. Next, the k-1 subsets are used to train the machine learning algorithms and the remaining one subset are used for testing the models. In this study, each subset has the size of 475 values and ten-fold cross-validation is used.  The predictive accuracy of each machine learning technique was evaluated using Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), Root Relative Squared Error (RRSE), Relative Absolute Error (RAE), where the smaller value indicates a better technique (Betrie et al. 2013). The predictive accuracy helps to evaluate the overall match between observed and predicted values for each machine learning technique. The equations of the error estimates are given in equations 4.19 through 4.22. On the other hand, the predictive uncertainty of each machine learning technique is evaluated by comparing the observed and predicted P-boxes. 5.3 Results and Discussion Performances of the ANN and SVM models in predicting the heavy metals in terms of the four evaluation methods are presented in Table 5.2. For prediction of Cu using ANN, it shows that Cu-5 and Cu-4 models are the best and the worst predictions, respectively. For prediction of Cu using SVM, Cu-2 and Cu-1 models are the best and worst prediction, respectively. The range of the performance indicates that the ANN model has a higher prediction uncertainty than SVM model. For prediction of Cd, both models have the same uncertainty band and the performances of the models are not good as shown by the RRSE and RAE measures. For prediction of Zn, Zn-6 and Zn-2 models of ANN are the best and worst, whereas Zn-8 and Zn-2 of SVM are the best and the worst models, respectively. The range of the performance measures indicates that the prediction of ANN is higher than SVM.         Chapter 5 59  Table 5.2: Performance of ANN and SVM models in terms of four evaluation measures       ANN       SVM   Model MAE RMSE RRSE RAE MAE RMSE RRSE RAE Cu-1 0.11 0.20 34.20 23.23 0.16 0.25 41.73 32.81 Cu-2 0.12 0.20 37.65 27.89 0.13 0.20 39.17 30.98 Cu-3 0.13 0.18 32.21 26.67 0.15 0.21 38.66 31.99 Cu-4 0.29 0.32 57.32 60.94 0.15 0.22 39.08 31.75 Cu-5 0.09 0.16 28.20 20.00 0.14 0.21 37.06 30.21 Cu-6 0.13 0.18 31.32 26.86 0.14 0.20 34.97 27.76 Cu-7 0.13 0.21 34.83 26.07 0.15 0.22 36.77 29.76 Cu-8 0.13 0.20 36.54 26.91 0.15 0.22 39.58 32.06 Cu-9 0.11 0.17 30.86 22.19 0.14 0.20 36.31 29.21 Cu-10 0.11 0.17 33.28 24.56 0.14 0.21 40.24 32.03 Cd-1 0.01 0.02 73.89 60.39 0.01 0.02 73.89 60.39 Cd-2 0.01 0.02 77.82 64.53 0.01 0.02 77.82 64.53 Cd-3 0.01 0.02 75.19 61.14 0.01 0.02 75.19 61.14 Cd-4 0.01 0.01 71.89 63.86 0.01 0.01 71.89 63.86 Cd-5 0.01 0.01 71.62 61.78 0.01 0.01 71.62 61.78 Cd-6 0.01 0.01 71.15 59.63 0.01 0.01 71.15 59.63 Cd-7 0.01 0.02 74.43 60.73 0.01 0.02 74.43 60.73 Cd-8 0.01 0.01 71.54 58.65 0.01 0.01 71.54 58.65 Cd-9 0.01 0.02 72.73 59.23 0.01 0.02 72.73 59.23 Cd-10 0.01 0.02 79.26 62.12 0.01 0.02 79.26 62.12 Zn-1 1.34 2.41 57.68 42.30 1.48 2.55 60.95 46.91 Zn-2 1.51 2.52 59.56 46.19 1.58 2.69 63.66 48.30 Zn-3 1.48 2.53 59.28 43.93 1.58 2.69 62.81 46.95 Zn-4 1.25 1.83 49.11 40.94 1.42 2.04 54.65 46.35 Zn-5 1.19 1.94 50.09 39.84 1.36 2.15 55.58 45.62 Zn-6 1.19 1.86 48.67 38.06 1.38 2.12 55.27 44.21 Zn-7 1.34 2.33 56.95 41.87 1.50 2.50 61.05 46.90 Zn-8 1.16 2.06 52.37 36.52 1.34 2.19 55.82 42.12 Zn-9 1.32 2.25 54.88 41.75 1.45 2.47 60.29 45.70 Zn-10 1.33 2.68 62.81 42.93 1.49 2.79 65.38 48.05  Chapter 5 60  The predicted and observed P-boxes are shown in Figure 5.2. This figure shows that the predicted Cu concentration using ANN very well enveloped the observed Cu distribution, whereas the prediction of SVM has not enveloped the observed Cu distribution completely. As indicated by the performance measure above, the upper bound of ANN prediction is higher than the upper bound of SVM prediction. The observed of Cd concentration is not enveloped by both the ANN and SVM models prediction. This is attributed to the majority of Cd data (over 90%) are below 0.05 mg/L, which is a value of analytical detection limit. The data above 0.05 mg/L do not have information to identify an empirical model. Nevertheless, it is interesting to note that 90% of the Cd data are well enveloped by ANN and SVM. The majority of observed Zn distribution is well enveloped by prediction of ANN except that a small portion of the upper and lower bounds. On the other hand, the prediction of SVM has not enveloped some part of the observed Zn.   Figure 5.2: Predicted and observed P-boxes for Cu, Cd, and Zn The comparison between integrated (ANN-SVM) prediction of Cu and observed P-boxes are shown in Figure 5.3. This figure shows that the integrated prediction using the envelope and intersection methods gives the individual prediction of ANN and SVM, respectively. The Chapter 5 61  integrated prediction using the mixture method is higher than the observed upper bound distribution. However, the mixture result is closer to the observed upper bound than the individual prediction of ANN. The integrated prediction of Cu using the average method well enveloped the P-box for the observed concentrations.  However, the major part of the mixture upper bound is higher than the observed upper bound.    Figure 5.3: Integrating ANN and SVM predictions for Cu using four methods and observed P-boxes  The comparison between integrated (ANN-SVM) prediction of Cd and observed P-boxes are shown in Figure 5.4. This figure shows that any of the integration methods could not improve the prediction uncertainty of individual models. Also, it shows that the integrated predictions are exactly the predictions of individual ANN and SVM. As discussed previously, these poor predictions are attributed to lack of heterogeneity in the Cd data.   Chapter 5 62   Figure 5.4: Integrating ANN and SVM predictions for of Cd using four methods and observed P-boxes The comparison between integrated (ANN-SVM) prediction of Zn and observed P-boxes are shown in Figure 5.5. This figure shows that the integrated prediction using the envelop method well bounded the observed P-box except the slight portion of the upper bound.  It is interesting to note that this method improved the poor predictions of ANN at the lower part of the upper bound distribution. The integrated prediction using the intersection method has not enveloped the upper bound of the observed distribution. The integrated prediction of Zn using the mixture and average methods has not enveloped the upper portion of the observed upper bound. The correlation coefficients between variables reported in this study are quite different from values reported in previous studies. For instance, the correlations in this study between Cu (i.e., dependent) and pH, conductivity, sulfate, and alkalinity (i.e., independent) are equal to 0.8, 0.3, 0.4, and 0.3, respectively as seen in Table 5.1. Aryafar et al. (2012) reported the relationship between Cu (i.e., dependent) and pH, conductivity, sulfate, and alkalinity (i.e., independent) are equal to -0.697, 0.757, 0.663, and -0.199, respectively. Betrie et al. (2012) reported that the relationship between Cu (i.e., dependent) and pH and conductivity are equal to -0.74, and 0.52, Chapter 5 63  respectively. The absolute values of these and previous studies for Cu and pH show that there is an agreement, whereas the absolute values of this study and previous work disagree for correlation between Cu and the other variables. This disagreement is likely attributed to the effect of outliers. All previous studies used linear correlation methods (Spearman and Pearson) that are sensitive to outliers (Gideon et al. 1987). Note that Pearson is more sensitive to outliers than Spearman because it is a parametric method. Therefore, the correlation results of previous studies are relatively inflated compared to this study that does not have outliers.   Figure 5.5: Integrating ANN and SVM predictions for Zn using four methods and observed P-boxes The evaluation among machine learning techniques based on predictive accuracy by Aryafar et al. (2012) and Betrie et al. (2012) reported that SVM performed best compared to other techniques. Furthermore, Betrie et al. (2012) reported that SVM performed best compared to other methods based on predictive uncertainty. However, this study shows that ANN is better than SVM since it envelops very well the observed data as depicted in Figure 5.2. This finding indicates that the higher predictive undertainty of ANN, which is often considered as a limitation, has enabled ANN to envelop the observed P-box. It is interesting to note that this finding Chapter 5 64  indicates that selection of an optimal model based on model accuracy (i.e., evaluation techniques) could be misleading. This agrees with Cherkassky et al. (2006) who stated a selection of model based on evaluation techniques cannot guarantee an optimal model in critical situations. Therefore, it could be reliable to select an optimal model based on predictive uncertainty rather than predictive accuracy of models. It is worth noting that while the gap between the lower and upper bounds determines the degree of epistemic uncertainty, the shapes of lower and upper bounds determine the degree of aleatory uncertainty (i.e., variability).  The predicted and observed lower bounds have no variability as they are almost straight, whereas the upper bounds have some variability as shown in the figures. The lack of variability in the lower bound could be attributed to analytical detection limits. Both the observed and predicted data have epistemic uncertainty as shown in the figures. Thus, this indicates that there is a need for further study and data collection to reduce the epistemic uncertainty.  Although the machine learning techniques predicted well the observed distribution of Cu and Zn, they have not performed well for Cd prediction as seen in Figure 5.2. Most of the observed data (over 90%) are below 0.05 mg/L (Figure 5.6)  and those data above this value do not have enough information to identify an empirical model. Subsequently, the maximum concentration value predicted by ANN and SVM is slightly over 0.05 mg/L. This result indicates that machine learning techniques work best if the data have heterogeneity. Of course, one of the limitations of data-driven paradigms including machine learning techniques is the requirement of a large database (Cherkassky et al. 2006; Solomatine and Ostfeld 2008). In case the data lack heterogeneity, a classical modeling approach (e.g., multiple linear regression) should be used instead of machine learning techniques. Integrating the prediction of ANN and SVM using the aggregation methods improved the prediction results except for prediction of Cd as shown in Figures Figure 5.3 to Figure 5.5. For integration of Cu prediction, the average method performed best followed by mixture, envelope and intersection. On the other hand, the envelope method performed best followed by mixture, average and intersection for integrated of Zn prediction. It is interesting to note that the intersection method has the worst performance and no single method performed best for prediction of Cu and Zn. This result indicates that all aggregation methods should be investigated by modelers if the intention is to improve prediction results. Chapter 5 65   Figure 5.6: Observed concentrations of Cd; index on the x-axis represents time. 5.4 Summary This study presented a probability-bounds-based methodology to predict ARD chemistry using machine learning techniques. The two machine learning techniques used are artificial neural networks (ANN) with multilayer perceptrons and support vector machine with polynomial kernel. Missing values in the data were estimated using the iterative robust model-based imputation algorithm. Multivariate outlier analysis was conducted using the adaptive outlier detection algorithm to remove outlier values from the data. Note that a selection of percentile value for outlier analysis should be done in consultation with an expert opinion since it has a serious implication for risk analysis. A nonlinear correlation analysis was conducted using the maximum information criterion algorithm to identify the relationship between independent and dependent variables. Based on the correlation analysis, heavy metals (Cu, Cd, and Zn) were predicted as functions of pH, alkalinity, sulfate, acidity, and flow. The predictive accuracy of ANN and SVM algorithms were evaluated using the Root Mean Squared Error, Mean Absolute Error, Root Relative Squared Error, and Relative Absolute Error. The epistemic and aleatory uncertainties in the prediction results were quantified using P-boxes and compared with the observed P-boxes graphically. A visual comparison of observed and predicted P-boxes was used as a measure of prediction uncertainty. The predictions of ANN and SVM were integrated using envelope, intersection, mixture, and average methods.  Chapter 5 66  The results of this study show that the predictions of ANN enveloped well the observed Cu and Zn concentrations than SVM prediction. On the other hand, both algorithms did not envelope the observed Cd concentrations. The prediction of Cd was not good because there was little heterogeneity in the Cd dataset since 90% of the dataset is below analytical detection limit. These results indicate that the success of machine learning techniques not only depends on the amount of data but also on heterogeneity within the data.  Integrating the prediction of ANN and SVM using the aggregation methods improved the prediction results except for Cd. While envelope, mixture and average showed good performances, the intersection method showed poor performances. These results indicated that there is no best aggregation method, but rather analysts should investigate to determine which one improves predictions.   This study not only quantified prediction uncertainty, but also identified the sources of uncertainties, and whether there is possibility to reduce them. In addition, the study showed the danger of selecting an optimal technique model using predictive accuracy and thus highlights the use of predictive uncertainty using P-boxes for selecting an optimal model.  Chapter 6 67   CHAPTER 6 ECOLOGICAL RISK ASSESSMENT: A FUGACITY APPROACH A version of this chapter has been submitted to Environmental Technology and Innovation with a title” Ecological risk assessment of acid rock drainage under uncertainty” The lead author is Getnet D. Betrie and the coauthors are Dr. Rehan Sadiq, Dr. Kevin A. Morin, and Dr. Solomon Tesfamariam.  6.1 Background ARD poses environmental risks such as elimination of biological species, significant reduction in ecological stability, and bioaccumulation of metals in the flora and fauna (Gray 1997; Gray 1998). The remediation costs of these damages could be billions of dollars in Canada. For instance, the remediation costs at orphaned mines have been estimated at ten billions of dollars in North America alone (INAP 2012). To minimize environmental risks and to lower remediation costs, ecological risk assessments are highly important in various phases of mining. Ecological risk assessments evaluate the likelihood that adverse ecological effects may occur or are occurring as a result of exposure to one or more contaminants (U.S. EPA 1992). The assessment involves problem formulation, exposure characterization, effect characterization, and risk characterization.  Exposure characterization is often conducted using measured environmental data or using fate and transport models (Solomon et al. 1996). Fate and transport models are used to estimate environmental concentrations of contaminants/stressors. Such models for the mining industry are well-documented in the literature (Perkins et al. 1995; Caruso et al. 2008). Although these models are expected to provide accurate predictions, they have inherent uncertainties such as model, parameter, and data uncertainties (Walker et al. 2003; Matott et al. 2009). In the mining industry, however, the uncertainties associated with fate and transport modeling are rarely stated or recognized (Maest et al. 2005).  Effect characterizations are obtained either from laboratory tests or micro/mesocosm studies (Solomon et al. 1996). Laboratory tests include the “median lethal concentration” (LC50) and the “no observed effect concentration” (NOEC). LC50 is the concentration of a chemical that is required to kill 50% of the test population, whereas NOEC is the concentration at which no long-Chapter 6 68  term or chronic effects occur in the test population. Extrapolating these toxicity data from test samples to the population (e.g., species) introduces data uncertainty (Hall et al. 1998).  Data and parameter uncertainties may originate from randomness due to natural variability and from imprecision due to systematic measurement errors or expert opinions (Walker et al. 2003; Baudrit et al. 2006). In ecological risk assessments, addressing these uncertainties is integral (CCME 1996; USEPA 1998). Uncertainties due to imprecision and variability in the risk assessment should be addressed separately using the available methods (Hoffman and Hammonds 1994; Ferson and Ginzburg 1996). The objective of this chapter is to develop a methodology that quantifies ARD risks under uncertainty using a fugacity-based model for minesites with limited hydrogeological information. 6.2 Ecological Risk Analysis 6.2.1 Problem Formulation Problem formulation includes defining assessment endpoints, a conceptual model, and an analysis plan. The assessment endpoints express the environmental value that is to be protected. The selection of endpoint characteristics is based on the ecological relevance, susceptibility or relevance to management goals. The assessment endpoints are defined based on the mine’s management goal, which are to protect at least 90% of the fish species 90% of the time from stressor exposure in a nearby lake. The conceptual model is the description and visual representation of possible relationships between the endpoints and stressors. The conceptual model for the risk assessment consists of the stressors, media, exposure routes, and receptors. The stressors are elevated copper and zinc released from waste rock. The media that could be contaminated are groundwater and surface water. The two potential exposure routes that are considered at this site are ingestion and contact. In the analysis plan, risk hypotheses are evaluated to determine how they will be assessed using the available data. The hypothesis evaluated in this research was as follows: copper and zinc may cause a permanent reduction in the fish species in the lake.  6.2.2 Exposure Characterization In the exposure characterization, exposure is analyzed by examining stressors, their distribution in the environment and the extent of co-occurrences between stressors and receptors. The analysis also describes the impact of variability and uncertainty on exposure estimates and Chapter 6 69  determines the likelihood that exposure will occur. Exposure characterization is conducted using the fugacity-based fate and transport model. In the following paragraphs, the concept of the fugacity-based model, a literature review of this model, the model development, input parameters and their uncertainty, and a model validation are presented. A fugacity-based model is an approach for estimating of chemical distributions in multimedia based on the complexity of transport and transformation processes (Diamond et al. 1992; Ling et al. 1993; Mackay et al. 1983). This model uses aquivalence ( 3, /Q mol m ) as the controlling variable instead of using concentration ( 3, / mC mol ). A linear relationship exists between these quantities (i.e., 'C QZ= ), where 'Z  is the aquivalence capacity, which depends on the characteristics of a chemical, a medium, and temperature. The value of aquivalence capacity for water ( 'WZ ) is defined as 1.0 and the values for other media are obtained by multiplying 'wZ  and the partition coefficient of a medium. Rates of various processes, such as advection, diffusion, and chemical transformation are expressed, as the product of aquivalence and a transport or transformation parameter value ( 3 /m h ). There are three types of transport or transformation parameter values: (i) For a chemical transport by advective flow, 'A G Z= ⋅ , where G  is a mass-phase flow rate in 3 /m h ,  (ii) For a chemical transport by diffusion, 'D USZ= , where U  and S  are the mass transfer coefficient in /m h , and area in 2m , respectively, and  (iii) For a chemical transformation by reaction, 'T VkZ= , where V  is the compartment volume ( 3m ), and k  is a first-order rate constant ( 1h− ). The fugacity/aquivalence model is widely applied to predict the fate and transport of toxic substances in the environment. Mackay and colleagues (Diamond et al. 1992; Ling et al. 1993; Mackay et al. 1983) applied a steady-state non-equilibrium model to rivers and lakes to study the distribution of heavy metals. This model was used in the deterministic mode, and uncertainties associated with input parameters were not considered. Sadiq et al. (2003) applied a steady-state aquivalence-based model in a probabilistic mode to study the fate of heavy metals that were released from an offshore drilling well into a marine environment. Additionally,  Luo and Yang (2007) used a fugacity-based model to predict chemical concentrations in the multiple media (i.e., air, surface water, and sediment) and propagated the uncertainty of chemical properties and Chapter 6 70  hydrogeological parameters using Monte Carlo simulations in the Great Lakes region. Sommerfreund et al. (2010) used a fugacity/aquivalence model with Monte Carlo analysis to quantify the environmental parametric uncertainty in the Venice Lagoon. The Center for the Study of Metals in the Environment (CSME) has developed the Unit World Model (UWM) based on the fugacity approach to assess the fate of metals in rivers and streams (U.S. EPA 2006). In this model, a probabilistic method is implemented to capture the variability associated with fate simulation of metals in rivers.  In this study, a steady-state model is developed to simulate the fate of heavy metals in soil, water, and solid particle media from a minesite. The mine system consists of waste rock, soil layers, an aquifer and solid particles. A conceptual model showing different mechanisms of transport in three layers is shown in Figure 6.1. Waste rock assumed to be the source of copper and zinc metal emissions ( SE ) into the soil. Intermedia transports through diffusion and advection were considered. The diffusion of copper and zinc from water to soil ( WSD ), soil to water ( SWD ), water to solid particles ( WSPD ), and solid particles to water ( SPWD ) is calculated based on mass transport coefficients. The values of advection through the soil ( slA ) and the aquifer ( gwA ), which correspond to the transport of copper and zinc through water pores of the soil and the aquifer, are calculated based on infiltration and groundwater flow rates, respectively. An algorithm for calculating the copper and zinc concentrations in soil, solid particles, and water is presented in Table 6.1.   Figure 6.1: A schematic representation of the conceptual model Waste rockGroundwaterASl DSWDSWAgwESClay tillInfiltrationDWSP DSPWChapter 6 71  The input parameters of the fate and transport model are provided in Table 6.2. A clay till thickness ( sld ) of 10 m and aquifer thickness ( gwd ) of 700 m were used. The clay till overlies an aquifer that is considered homogenous and well mixed. The aquifer has a length and width of 1000 m. It is assumed that flow rates through the clay till ( slG ) and aquifer ( gwG  ) are gravity driven and that copper and zinc are transported by advection. The soil-water mass transport coefficient ( 11U ) is obtained from Mackay and Guardo (1996). Table 6.2 also shows the parameter types (constant, random or imprecise) and values. The imprecise variables are defined by their statistics such as mmms (minimum, maximum, mean, and standard deviation), minmaxmode (minimum, maximum, and mode) or minmax (minimum and maximum). The random variable of the partitioning coefficient is defined using a lognormal distribution, and the distribution’s parameter and shape are obtained from Allison and Allison (2005). The concentrations of copper and zinc are multiplied by a dilution factor ( dl ) to obtain the concentrations of copper and zinc in the lake. The dilution value is obtained by measuring the effective surface area of the lake using the Google earth pro program (Google 2013), and its depth was assumed. Note that concentrations and physical properties were obtained from a copper-molybdenum-gold-silver-rhenium minesite located in British Columbia, Canada. Table 6.1: Algorithm of the fugacity model Steps Parameters and Equations Define physico-chemical properties of contaminants  Partitioning with soil ( dK ) and molecular weight ( MW ) Define media physical properties Length ( l ), width ( w ), thickness of aquifer ( gwd ) and soil ( sld ), and bulk density ( BD ) Define flow rates in the soil and aquifer Flow rates in the soil ( slG ) and aquifer ( gwG )  Calculate the aquivalence capacity ( 'Z ) of all layers Aquivalence capacities of water ( ' 1wZ = ) and soil ('s dZ BD K= ⋅ )  Calculate the A -values of advection Advection  in the aquifer ( 'gw w gwA Z G= ⋅ ) and soil ('Sl s sA Z G= )  Calculate the D -values of diffusion   Diffusion from the soil to water ( '11SW sD SU Z= ), water to the soil ('11WS wD SU Z= ), solid particles to water ( '11SPW sD PU Z= ), and water to solid particles ('11WSP WD PU Z= ). 11U is the mass transfer coefficient (MTC), S is the area of soil-aquifer interface, and P  is the area of solid particles-aquifer interface Define contaminant loads by direct emissions in the soil ( sE ) slSC GEMW×= C and MW are the concentration and molecular weight, respectively, of copper and zinc Calculate aquivalence of water ( wQ ), soil ( sQ ), and solid particles   ( spQ ) Swgw WSPEQA D=+, WSP( )( D )( )S gw WSP WSsgw sl SWE A D DQA A D+ +=+ +, and D ( )S WSPspSPW gw WSPE DQA D=+ Calculate concentrations (mg/L) in media dl  is the dilution factor in the lake (0.001)  Concentrations in the aquifer ( 'w W wC MW Z Q= ⋅ ⋅ ), soil ('s s sC MW Z Q= ⋅ ⋅ ),  solid particles ( 'sp s spC MW Z Q= ⋅ ⋅  ), and lake ( LC Cw dl= ⋅ )   The developed model was validated by visually comparing the predicted copper and zinc with observed data at the lake. The observed dataset is monitored at the lake, which is downstream of the waste rock. The observed dataset at the lake was monitored occasionally and consists of a few data points. On the other hand, the copper and zinc data used as inputs for the model were measured at the toe of the waste rock, where monitoring was done more frequently; thus many data points are available. Table 6.2. Input parameters of the fugacity-based fate model Parameter Symbol Units Variable Type Value/Distribution Molecular weight of Cu and Zn, respectively MW  g/mol constant 63.6 and 65.39 Aquivalence capacity 'wZ    constant 1 Width w  m constant 1000 Length l  m constant 1000 Thickness of aquifer gwd  m constant 712 Thickness of soil sld  m constant 10 Soil-water mass transport 11U  m/h constant 10-05 Density of solids BD  kg/m3 imprecise minmaxmode~ (1840,1981,1965) Flow rates in soil slG  L/sec imprecise minmax~(1,6) Flow rates in aquifer gwG  L/sec imprecise minmax~ (0.5,4) Partition coefficient of Cu log CuKd  L/kg random lognormal~(3.5,1.7) Partition coefficient of Zn log ZnKd  L/kg random lognormal~(4.1,1.6) Copper concentration CuC   mg/L imprecise mmms~(0.0015,3.2,0.31,0.82) Zinc concentration ZnC  mg/L imprecise mmms~(0.023,18,4.97,0.78) 6.2.3 Effect Characterization The characterization of ecological effects presents the relationship between stressor levels and ecological effects. Ecological effects are characterized using the response of organisms to Chapter 6 74  stressors. This response is reported as NOEC and LC50. The five fish species that inhabit the lake are Oncorhynchus kisutch, Cottus asper, Gasterosteus aculeatus, Oncorhynchus clarki and Salvelinus malma. In this study, the LC50  response of these species to  copper and zinc were obtained from the AQUIRE database (US EPA 2014).  To consider the effect of pH on toxicity, toxicity tests that are conducted with pH ranges 4 to 6 were used in this study. The predicted-no-effect concentration (PNEC) was derived by dividing the LC50 value by 10, as suggested in the literature (Roman et al. 1999). The PNEC values are assumed to represent any possible exposure pathways, such as ingestion of contaminated water and consumption of lower trophic-level organisms.  6.2.4 Risk Characterization The risk characterization phase involves estimating of the probability of adverse effects on selected endpoints. In this phase, risk is estimated and described, and uncertainties and assumptions are summarized. In this research, the degree of overlap between p-boxes of exposure and effect is used for risk characterization. First, the effect concentration at the lower or upper bounds that is defined to protect the assessment endpoints is calculated. Second, this concentration is recorded on the lower or upper bound of the exposure, and its corresponding non-exceedance level is computed. Third, the risk value is calculated by subtracting the non-exceedance values from one. 6.3 Results and Discussion The validation of the exposure model is presented in Figure 6.2. This figure depicts the lower and upper bounds of the cumulative distribution functions of the observed and predicted concentrations of copper and zinc in the lake. The observed p-boxes are computed with Kolmogrov-Simrnov 95% confidence bounds based on eleven values. The lower and upper bounds provide concentration values that are not exceeded at a given percentile level. For example, the upper bounds of copper and zinc show that their predicted concentrations of copper and zinc do not exceeded 0.027 and 0.186 mg/L, respectively, at the 80% probability level.  The predicted copper concentration underestimated the lower bound of the observed concentration as shown in this figure. The upper bound of the observed concentration of copper is underestimated between the 0 and 0.8 probability levels and overestimated at probability levels higher than 0.8, as shown in the figure. The predicted concentration of zinc underestimated the lower bound of the observed concentration of zinc. For the upper bound, the observed zinc Chapter 6 75  concentration is well estimated, underestimated, and overestimated at the probability levels of 0-0.4, 0.4-0.5, above 0.5, respectively. The distance between the lower and upper bounds indicates the degree of uncertainty due to imprecision. The uncertainty band of the predicted copper ranges from 0 to 0.035 mg/L, whereas the uncertainty band of the observed copper ranges from 0.003 to 0.033 mg/L. Similarly, the uncertainty band of the predicted zinc ranges from 0 to 0.198 mg/L, whereas the uncertainty band of the observed concentration of zinc ranges from 0.028 to 0.16 mg/L.  The reason this model overestimated the observed concentrations is most likely attributed to metals removal by chemical processes are not represented in the model. The literatures shows that chemical processes,  such as precipitation, ion exchange and reduction-oxidation, remove heavy metals in aquatic systems (Luo and Yang 2007; Sommerfreund et al. 2010). However, this model excludes those important processes; subsequently, the maximum values of the predicted concentrations of copper and zinc are higher than their observed counterparts.  Figure 6.2: Observed and predicted concentrations of copper and zinc in the lake Chapter 6 76  The reason for the difference between the shape of the observed and predicted p-boxes is discussed as follows. The smooth edges of the predicted p-boxes are attributed to the types of distributions (e.g., minmax, mmms, minmaxmode, and lognormal) used to represent the input parameters of the exposure and effect assessments. However, the rough edges of the observed p-boxes are attributed to the small sample sizes used to compute the p-boxes.    The results of the effect characterization (i.e., the PNEC-derived probability box) for copper and zinc are shown in Figure 6.3. The probability boxes of copper and zinc have ranges of values at different percentile levels. For instance, the PNEC concentration of copper varies within 0.002-0.01, 0.002-0.014, 0.003-0.029, and 0.027-0.055 at the 10th, 30th, 60th, 99th percentiles, respectively. Similarly, the PNEC concentration of zinc varies within 0.001-0.002, 0.001-0.003, 0.001-0.007, and 0.003-0.01 at the 10th, 30th, 60th, 99th percentiles, respectively. These probability boxes of copper and zinc are validated by comparing them with the Canadian regulatory value (PNEC-Regulatory) for freshwater, which is obtained from Canadian Environmental Quality guidelines (CCME 2014). The guideline value of copper overlaps with the lower bound of copper PNEC values up to the 58th percentile. The guideline value of zinc falls between the lower and upper bound distributions of zinc, from the 30th to 99th percentiles. It is worth noting that the guideline values are constant at various percentiles because they are deterministic estimates. The results of these comparisons indicate that derived PNEC probability boxes of copper and zinc are acceptable.  Chapter 6 77   Figure 6.3: Derived and regulatory PNEC concentrations of copper and zinc The risk characterization was conducted by comparing the degree of overlap between exposure and effect (PNEC) concentrations, as shown in Figure 6.4. This comparison is specifically performed by comparing the lower and upper bounds of the effect concentrations at a given percentile with their respective exposure concentrations. For risk characterization with respect to the lower bound of copper, the value of the effect concentration that will be exceeded at the 10th percentile is equal to 0.002 mg/L. This concentration of copper is not exceeded 100% of the time by the lower bound of the exposure distribution. This indicates that the risk is equal to 0%, and there is no risk from copper with respect to the lower bound. For risk characterization with respect to the upper bound of copper, the value of the effect concentration that will be exceeded at the 10th percentile is equal to 0.01 mg/L. This value is not exceeded 73% of the time in the exposure distribution. This indicates that the associated risk of copper is equal to 27% with respect to the upper bound of the distribution. The estimated value of risk (27%) is not acceptable because it exceeds the acceptable risk value of 10%.    Chapter 6 78   Figure 6.4: Predicted exposure and effect concentrations of copper and zinc For the lower bound of zinc, the value of the effect concentration that will be exceeded at the 10th percentile is equal to 0.006 mg/L. This value is not exceeded 100% of the time at the lower bound of the exposure distribution. This indicates that the associated risk of zinc is equal to 0% and there is no risk with respect to the lower bound of zinc. For risk characterization with respect to the upper bound of zinc, the value of the effect concentration that will be exceeded at the 10th percentile is equal to 0.015 mg/L. All concentrations of the upper bound of the exposure distribution exceed 0.015 mg/L at all the times. This indicates that the associated risk of zinc with respect to the upper bound is equal to 100%, and this value is beyond the acceptable level of risk of 10%. The results of the risk characterization for both methods indicate that the fish are at risk from copper and zinc because the assessment endpoints that are defined to protect at least 90% of the fish species 90% of the time from stressors are not met. However, the estimated risk might be overestimated because the exposure model used in this research does not represent important Chapter 6 79  processes (e.g., complexation) in aquatic systems that determine the availability and fate of metals. 6.4 Summary This chapter presents an ecological risk assessment methodology based on probability bounds. The method quantifies and propagates uncertainties for minesites with scarce hydrogeological information. This risk assessment involved problem formulation, exposure characterization, effect characterization, and risk characterization. In the problem formulation, assessment endpoints, a conceptual model, and a hypothesis of the risk assessment were defined. In the exposure characterization, a fugacity-based model was developed to determine the fate and transport of metals from waste rock in the environment.  In the effect characterization, the response of organisms to heavy metals was estimated. Data and parameter uncertainties are addressed using the probability-boxes method. In the risk characterization, the likelihood of risk was estimated by comparing the degree overlaps between probability boxes of exposure and effect concentrations at a given percentile level.  The exposure modeling results show that the predicted concentrations of copper and zinc slightly overestimated the observed concentrations. This model most likely overestimates the observed concentrations because chemical processes that remove metals are not represented. The results of the effect characterization show that the derived effect concentrations for copper and zinc are acceptable because the probability bounds of were within the deterministic guideline values. The risk characterization results indicate that there is a high probability of ecological risk due to metals transported into the nearby lake.     Chapter 7  80  CHAPTER 7 ECOLOGICAL RISK ASSESSMENT: A PHREEQC APPROACH A version of this chapter has been submitted to Journal of Hazardous Materials with a title” Environmental risk assessment under uncertainty in the mining industry: A PHREEQC approach.” The lead author is Getnet D. Betrie and the coauthors are Dr. Rehan Sadiq, Dr. Nichol Craig, Dr. Kevin A. Morin, and Dr. Solomon Tesfamariam.  7.1 Background The transport of ARD products into the environment and exposure to the aquatic life pose serious ecological risks. The adverse risks of ARD on the aquatic life is well document in the literatures (Gray 1997; Soucek and Cherry 2000; Parsons 1977; Payette and Delwaide 2003; Niyogi et al. 2001; Bray et al. 2009; Luís et al. 2008; Hogsden and Harding 2012; Harding 2005). Gray (1997) summarized the overall ecological effects of ARD that include habitat modification, niche loss, bioaccumulation within food chain, loss of food source, elimination of sensitive species, reduction in primary productivity, and food chain modification.  Ecological risk assessment evaluates the likelihood that adverse ecological effects may occur or are occurring as a result of exposure to one or more contaminants (U.S. EPA 1992). It is imperative to conduct ecological risk assessment during various phases of mining to minimize ecological risks in minesites. Consequently, Betrie et al. (2014) proposed a probability-bounds-based methodology of ecological risk assessment for the mining industry. This methodology consists of problem formulation, exposure characterization, effect characterization, and risk characterization. However, a risk estimate of this methodology has a limitation. This limitation is attributed to the fate-and-transport model used for exposure characterization. Fate-and-transport models are often used to estimate exposure concentration in the environment for risk assessment (Solomon et al. 1996). The fate-and-transport model used in Betrie et al. (2014) is a fugacity-based model that uses transport parameters (advection and diffusion) to transport metals and partitioning coefficient (Kd) to estimate the distribution of metals into different media such as water, lipid, soil and sediment. This model does not have the capability to represent complex processes that are going in minesites.  The review of commonly used fate-and-transport models in minesites is well documented in the literature (Alpers and Nordstrom 1999; Caruso et al. 2008; Nordstrom and Campbell 2014; Perkins et al. 1995). Examples of these models include MINTEQ (Allison et al. 1991), Chapter 7  81  PHREEQC (Parkhurst and Appelo 1999), MIN3P (Mayer et al. 2002), and others. These models have capabilities to simulate processes such as geochemical speciation, acid-base equilibrium, redox equilibrium, precipitation/dissolution, adsorption/desorption, and transport (Blowes et al. 2003). The objective of this chapter is to develop a methodology that quantifies ARD risks under uncertainty using a PHREEQC model for minesites with adequate hydrogeological information.  7.2 Methodology The improved methodology for ecological risk assessment is presented in Figure 7.1. In the first block, assessment endpoints, a conceptual model and analysis plan are defined. The assessment endpoints express the environmental value that is to be protected. The selection of endpoints characteristics are based on the ecological relevance, susceptibility or relevance to management goals. The conceptual model is the description and visual representation of possible relationships between the endpoints and stressors. The analysis plan evaluates risk hypotheses to determine how they will be assessed using available data. In the second block, probability boxes (P-boxes) are constructed for input and validation parameters of the PHREEQC model. A P-box is used to propagate and quantify uncertainty. The input parameters include concentration of cations and anions from waste rocks, input of chemical reactions (e.g., cation exchange capacity) of the aquifer, and transport parameters such as advection, diffusion and dispersion. The parameters for model validation are measured concentrations of cations and anions in groundwater that are used to validate the simulation of PHREEQC.  In the third and fourth blocks, a probability level is defined and this level is used to cut the P-boxes into j  intervals. An example that demonstrates these blocks are shown in Figure 7.2. This figure shows a P-box for copper, a defined level of probability and the cuts. In this example, a probability level of 0.2 is defined to cut the P-box into 6 cuts ( 1 6j = ⋅⋅⋅ ). Each cut has minimum and maximum values from the lower and upper cumulative distribution functions, respectively. For instance, the cut 6j =   consists of lower and upper bound values of 0.05 and 0.263 /mg L , respectively.  Chapter 7  82    Figure 7.1: Proposed methodology for ecological risk assessment Chapter 7  83  In the fifth block, the PHREEQC model is run using values of input parameters at each cut. The total number of runs will be 2 j× . For the above examples, the total number of simulations is 12, which are two simulations at each cut. In the sixth and seventh blocks, P-boxes are constructed for simulated parameters and validated against P-boxes of the observed parameters. If an acceptable match is not obtained between simulated and observed P-boxes, the process is started from the second block.   Figure 7.2: Example of constructed P-box (dashed lines) and its cuts (solid lines) In the eighth block, ecological effects are characterized by examining the response of organisms to stressor levels. In this block, the LC50 data would be collected and P-boxes are constructed for each stressor. In the ninth block, ecological risks are estimated on selected endpoints. Three steps are followed to estimate the ecological risks. First, the effect concentration that is defined to protect the assessment endpoints is estimated from lower and upper distributions. Second, the probabilities of non-exceedance for these concentrations are determined from their corresponding exposure distributions. The results of the second step determine the percentage of time the assessment endpoints is exposed to a stressor and whether the defined protection level is satisfied. There is a risk if the defined protection level is violated. Third, the risk is estimated, which is equal to one minus the probability of non-exceedance.  j=1j=2j=3j=4j=5j=6Chapter 7  84  7.3 Case study In this section, the proposed framework was implemented for Minesite B, located in British Columbia, Canada.  7.3.1 Problem Formulation As mentioned in the previous chapter, the assessment endpoints are defined to protect at least 90% of the fish species 90% of the time from stressor exposure in the nearby Lake. The conceptual model for risk assessment consists of the source of stressors, media, exposure routes, and receptors. The stressors are elevated copper and zinc released from waste rock. The media that could be contaminated are groundwater and surface water. The two potential exposure routes that are considered at this site are ingestion and contact. The hypothesis evaluated in this study was: copper and zinc may cause permanent reduction of fish species in the lake. The fish species that inhabit the lake are Oncorhynchus mykiss, Oncorhynchus kisutch, Gasterosteus aculeatus, and Oncorhynchus clarki. 7.3.2 Exposure Characterization 7.3.2.1 Description of the PHREEQC Model PHREEQC (pH-REdox-Equilibrium C (programming language)) is a computer model that simulates aqueous equilibrium speciation, batch chemical reactions and one-dimensional reactive transport in aqueous environments (Parkhurst and Appelo 1999). This program has capabilities of simulating: speciation, sorption and desorption, and one dimensional transport through sequential reactions. The speciation component calculates concentrations and activities of ionic species and complexes using the law of mass action, mineral saturation indices as a logarithmic ration of ion activity product and solubility product, density and specific conductance of solution compositions. Solute activities are calculated as product of the activity coefficient and concentration of ion i . The activity coefficient is determined using the Debye-Hückel equation (Hückel 1923), or the Pitzer equation (Pitzer 1973), which are shown in equations 7.1 and 7.2, respectively. 2 1/21/2log1iiiAZ IBa Iγ = −+ 7.1  where z  is charge number of ion i ; I  is ionic strength; A  and B  are temperature dependent constants; and ia  is ion size parameter. Chapter 7  85  3/222 2( )ln ( ) ( )3v v v vAz z f I B I m Cmv vγ + − + −± + −= − + +  7.2  where  1/21/21/21/2'1/2 2( ) 1.67 ln(1 1.2 I )1 1.2and2 1B(I) 2 1 (1 I I) e2o IIf IIIαββ α αα−= + ++ = + − + −   and where oβ  and 'β  are specific-ion parameters, α  is a constant for a similarly charged class of electrolytes, v  is the sum of the stoichiometric coefficients for the cation ( v+ ) and the anion (v− ). Sorption and desorption are simulated as surface complexation reactions or ion exchange reactions. Surface complexation reactions are simulated either by the Dozombak and Morel database for complexation of heavy metals ions on hydrous ferric oxide or CD-MUSIC (Charge Distribution MUltiSIte Complexation) for complexation of heavy metals on goethite (Parkhurst and Appelo 1999).  Ion exchange reactions are simulated with the Gaines-Thomas convention, the Gapon conventions or the Vanselow convention (Parkhurst and Appelo 1999). The Gaines-Thomas convention calculates the activity of exchange species using the equivalent fraction of the exchangeable cation as shown in equation 7.3 (Gaines and Thomas 1953). The Gapon convention calculates the activity of exchange species using equivalent fraction of exchange sites as shown in equation 7.4 (Gapon 1933). The Vanselow convention calculates the activity of exchange species using the mole-fraction of exchangeable cations (Vanselow 1932).  , , ,...iiI XII XI J Kmeqmeqβ −−=∑ 7.3 1/1/, , ,...ii iI XMII XI J Kmeqmeqβ −−=∑ 7.4  where Iβ  is the activity of an exchangeable ion I , iI Xmeq − is concentration of species I  on the solid phase expressed in milliequivalents ( /meq kg ), and , , ,...I J K  are the exchangeable ions. Chapter 7  86  The transport of solutions through a column or 1D flow path is modeled using the advection-reaction-dispersion equation, which is shown in equation 7.5. This equation consists of the transport (i.e., the first two terms) and the chemical reactions (the last term) parts. A finite difference scheme is used to solve the transport part of the equation. The scheme is forward in time, central in space for dispersion, and upwind for advective transport (Parkhurst and Appelo 1999). In this equation, firstly the advective transport is calculated, secondly the chemical reactions, thirdly the dispersive transport and finally the chemical reactions again. 22LC C C qv Dt x x t∂ ∂ ∂ ∂= − + −∂ ∂ ∂ ∂ 7.5  where C  is concentration in water, t  is time, v is pore water flow velocity, LD  is the hydrodynamic dispersion coefficients, and q is concentration in the solid phase. LD  is computed as shown below:  x* longitudinal dispersivityv velocity in the x directionD* effective diffusion coefficientL L xLD v Dαα= +=== 7.3.2.2 Fate and Transport modeling In this study, the PHREEQC model is used to simulate the fate and transport of heavy metals in an aquifer that is located in a minesite. The mine system consists of waste rock and the aquifer, which has geological units of clay and bedrock. The conceptual model that shows the modeled system is depicted in Figure 7.3. The waste rock is in the unsaturated zone, and clay till and bedrock are in the saturated zone. The column model consists of 30 cells with equal cell size. The flow direction of the groundwater is lateral in each cell and simulated as a column. Then groundwater reports to an adjust lake once the flow reaches the last cell.   The modeling assumptions are described as follows. The aquifer initially filled with native-water and this water represents pre-mining water chemistry of the aquifer. The aquifer contains clay with cation exchange capacity and calcite. Initially, the cation exchanger is in equilibrium with the native water. The aquifer is recharged with rain water that infiltrates through waste rock. The infiltrating water is concentrated with cations (e.g., copper, zinc, and iron), anions (sulfate, Chapter 7  87  alkalinity) and carbon dioxide in the unsaturated zone from oxidation processes. The drainage water enters the saturated zone and reacts with calcite in the presence of the cation exchanger.   Figure 7.3: Schematic representation of the mine system  The chemistry of the native-water and infiltration-water are presented in Table 7.1. The chemistry data is used as an input to PHREEQC. The chemistry of native-water represents the pre-mine water chemistry obtained from monitoring wells installed during pre-mining phase in different locations around the minesite. The values presented are the mean of concentration and it was assumed that the uncertainty associated with the values is negligible. The concentrations of sulfate, iron, copper and zinc are below detection-limit, indicating ARD was not there initially in the minesite.  The chemistry of infiltration-water represents the infiltration water that carries the pyrite oxidation products from the base of the waste rock into the unsaturated zone. This water carries oxidation products (i.e., cations and anions). The datasets were obtained from monitoring stations at the base of the unsaturated zone. These drain water from the waste rock/soil interface, and are used therefore to represent the chemistry of water that continues downwards through the till layer to the underlying aquifer. The uncertainty in these datasets is represented by their statistics, which are minimum, maximum, mean and standard deviation. These statistics are used to construct P-boxes using the RAMAS Risk Calc software (Ferson 2000). It was assumed for the purposes of this initial model demonstration that the variables are independent. It is worth noting that the values of the chemistry variables have significantly increased including for sulfate, iron, copper and zinc.  INFLITRATIONWASTE ROCKCLAY TILLBEDROCKUNSATURATEDZONESATURATEDZONEChapter 7  88  Table 7.1: Chemistry of native-water and infiltration-water in /mmol L   pH Alk. SO4 Cl Na Ca Mg Fe Cu Zn Native-water 8 2.68 - 0.05 5.48 0.65 0.37 - - - Infiltration-water           Minimum 0.3 0.5 70 - 0.005 0.007 0.01 0.001 0.001 0.001 Maximum 11.4 3118 21000 - 239 798 1430 2520 650 40 Mean 6.8 290.22 2533 - 35 375 297 13 10 0.421 Standard deviation 1.4 590.74 1567 - 23 153 230 111 44 1.522  Table 7.2 shows the model parameter, their values, and whether each parameter is uncertain or constant. The infiltration velocity was obtained from climate data at the minesite and its value is treated as constant (certain). The total number of cells used in the model is 30. Each cell in the model has the same length and has equal residence time. A simulation period of 168 months is used, which is the period between 1978 and 1992. The model was run at a monthly time step and new infiltrating water is picked at each time step. A specified flux was used as boundary condition at the start and end cells. There are no measured values for diffusion coefficient, dispersivity and cation exchange capacity (CEC) at the minesite. Thus, these variables are represented with their minimum and maximum values that are obtained from the literatures. The values of diffusion coefficient and longitudinal dispersivity were obtained from Batu (2006), and values for CEC are obtained from Appelo and Postma (2005).     Table 7.2: Model parameters and their values Parameter Constant Minimum Maximum Infiltration velocity (mm/year) 100 - - Cell length (m) 1.25 - - Number of cells 30 - - Simulation period (month) 168 months (1978-1992) - - Time step (s) 2.63 x 106 - - Diffusion coefficient ( D∗ , m2s-1)  - 0.1 x 10-9 2 x 10-9 Longitudinal dispersivity ( Lα , m) - 0 1 CEC (meq/L) - 0.1 0.5  Chapter 7  89  7.3.3 Effect Characterization In this study, the LC50  response of these species to  copper and zinc were obtained from the AQUIRE database (US EPA 2014). The summary of LC50 of copper and zinc for the fish species is presented in Table 7.3. This table shows the statistics in term of minimum (Min), maximum (Max.), Mean, and standard deviation (Stdev). The values of LC50 were divided by 10 to calculate the predicted-no-effect-concentration as recommended by Roman et al. (1999). The PNEC values were assumed to represent any possible exposure pathways such as ingestion of contaminated water and consumption of the lower-trophic level organisms. For each stressor, the calculated PNEC values of all species are combined to construct a P-box. This P-box assumes to represent the response of the entire fish species.  Table 7.3: LC50 data for the fish species in the study area   Oncorhynchus mykiss Oncorhynchus clarkii Oncorhynchus kisutch Gasterosteus aculeatus Copper (mg/L)     Min 0.004 0.021 0.014 0.185 Max 5.1 0.55 0.601 2.78 Mean 0.388 0.132 0.132 0.842 Stdev 0.875 0.128 0.137 0.881 Zinc (mg/L)     Min 0.00034 0.061 0.15 4.82 Max 15.61 1 6.04 9.33 Mean 2.312 0.329 2.489 6.383 Stdev 3.175 0.289 1.662 2.05  7.3.4 Risk Characterization Risk characterization involves estimation of the probability of adverse effects on selected endpoints. In this phase, risk is estimated and described, and uncertainties and assumptions are summarized. Risks are estimated by comparing exposure and effects data using either the risk-quotient method or distributions (USEPA 1998; Solomon et al. 1996; Hall et al. 1998; USEPA 2001). The risk-quotient method compares the maximum exposure concentration and effect Chapter 7  90  concentration of the most sensitive species. This method may not be suitable to evaluate the impact of mitigation measures in reducing the estimated risk (USEPA 1998). In this case study, the degree of overlap between the probability bounds of exposure and effect were compared.  7.4 Results and Discussion The p-boxes of the observed and simulated concentrations of cations and anions in water compared to validate the PHREQC outputs (Figure 7.4 and Figure 7.5). For sodium, the lower bound of the observed concentration is well simulated by PHREEQC, whereas the upper bound of the observed concentration is underestimated (Figure 7.4). A detailed analysis of the lower bounds of sodium reveals that the predictions are underestimated the observed values at the 10th and 99th and overestimated at the 50th percentiles (Table 7.4), respectively. For the upper bound of sodium, the predictions are overestimated at the 10th percentile and underestimated at the, 50th and 99th percentiles. The PHREEQC simulations for calcium overestimate both the lower and upper bounds of the observed p-box. The predicted values of calcium at the lower bound overestimate at the 10th and 50th percentiles, and underestimate the observed values at the 99th percentiles. For the upper bound of calcium, the prediction values overestimate the observed calcium at the 10th, 50th, and 99th percentiles.  For magnesium, the simulations of PHREEQC slightly overestimate the lower and upper bounds of the observed p-box. Table 7.4 shows that the predicted values of magnesium at the 10th, 50th, and 99th percentiles slightly overestimate their observed counterparts for both the lower and upper bounds of the p-box. The PHREEQC simulations for iron well capture the lower and upper bounds of the observed p-box. For the lower bound of iron, the prediction values are equal to the observed values at the 10th, 50th, and 99th percentiles (Table 7.4). For the upper bound of iron, the prediction values are equal to the observed values at the 50th and 99th percentiles, and underestimate the observed values at the 10th percentile. For copper, the PHREEQC simulations reproduce the lower and upper bounds of the observed p-box (Figure 7.5). The predicted values of copper are equal to the observed values at the 10th, 50th, and 99th percentiles of lower bound. For the upper bound, the prediction values are slightly lower at the 10th percentiles and higher than the observed values, at the 50th and 90th percentiles. The Chapter 7  91  comparison of the simulated and observed copper at the given percentile level reveals that copper is simulated very well by PHREEQC.   Figure 7.4: Observed and predicted concentrations of sodium, calcium, magnesium and iron in the groundwater The PHREEQC simulation for zinc at the lower bound slightly underestimates its observed counterpart (Figure 7.5). The simulation at the upper bound slightly overestimates the observed upper bound for probability level greater than 0.6 and underestimates the observed upper bound for the probability level less than 0.6. For instance, comparisons between the predicted and observed values of the lower bounds at the 10th, 50th, and 99th percentiles reveal that the predicted values underestimate the observed values. For the upper bound, the predicted values slightly underestimate the observed value at the 10th and 50th percentiles and overestimate the observed value at the 99th percentile, respectively.  Chapter 7  92  Table 7.4: Predicted and observed values at different percentiles Parameter 10th percentile  (mg/L) 50th percentile (mg/L) 99th percentile (mg/L)  Min Max Min Max Min. Max Na (Simulated) 0.01 61 13 64 36 255 Na (Observed) 0.97 54 0.97 120 125 367 Ca (Simulated) 98 578 269 894 316 1091 Ca (Observed) 4 238 97 466 394 671 Mg (Simulated) 33 196 72 320 426 1007 Mg (Observed) 7 106 7 225 276 765 Fe (Simulated) 0.001 1.5 0.001 16.59 1.86 95 Fe (Observed) 0.001 9 0.001 16.63 1.86 95 Cu (Simulated) 0.001 0.004 0.001 0.019 0.05 0.26 Cu (Observed) 0.001 0.008 0.001 0.014 0.08 0.25 Zn (Simulated) 0 0 0 0.01 0.001 0.20 Zn(Observed) 0.003 0.009 0.003 0.016 0.062 0.16 Sulfate (Simulated 70 2732 966 4132 2376 5770 Sulfate (Observed) 78 879 94 2066 1939 5770 Alkalinity (Simulated 168 264 169 334 226 817 Alkalinity (Observed) 48 288 169 401 334 587  Chapter 7  93   Figure 7.5: Observed and predicted concentrations of copper, zinc, sulfate and alkalinity in the groundwater For sulfate, the PHREEQC predictions overestimate the observed p-box except in the lower portion of the lower bound (Figure 7.5). For instance, the predicted values of the lower bound slightly underestimate at the 10th percentile and overestimate at the 50th and 99th percentiles the observed values, respectively. For the upper bound, the prediction values overestimate or are equal to the observed values at the 10th and 50th percentiles, and the 99th percentile, respectively. The comparisons at the given percentiles reveal that sulfate is poorly simulated. The prediction of alkalinity overestimates and underestimates the lower bound of the observed p-box at probability level less or greater than 0.5, respectively. The prediction of alkalinity underestimates and overestimates the upper bound of the observed p-box at probability level less or greater than 0.7. For instance, the predicted values of the lower bound overestimate, equal to and underestimate the observed values at the 10th, 50th and 99th percentiles, respectively. For the upper bound, the predicted values underestimate the observed values at the 10th and 50th Chapter 7  94  percentiles, and overestimate the observed values at the 99th percentile. The comparison between the simulated and observed concentrations reveals that the simulated alkalinity is acceptable. The effects of copper and zinc that were derived in this study and their guideline values are presented in Figure 7.6. The P-boxes of copper and zinc provide range of concentration values at different percentile levels. For instance, the PNEC of copper (mg/L) varies from 0.001-0.021, 0.001-0.04, and 0.07-0.51 at the 10th, 50th, 99th percentiles, respectively. Similarly, the PNEC of zinc (mg/L) varies from 0.001-0.19, 0.001-0.41, and 0.49-1.56 at the 10th, 50th, 99th percentiles, respectively. These P-boxes of copper and zinc are validated by comparing them with Canadian water quality guideline (Guideline-PNEC) for aquatic life (CCME 2014). For the copper effect, the results show that the guideline value and the lower bound of the derived p-box are almost equal, whereas the upper bound of the derived p-box is greater than the guideline value. For zinc, the result of the validation shows the guideline value is slightly greater than the lower bound of the derived p-box for a probability level less than or equal to 67%, whereas the guideline value is slightly less than the lower bound of the derived p-box for the probability level greater than 67%. The upper bound of the derived zinc is greater than the guideline value at all probability levels. These comparisons reveal that Derived-PNECs are acceptable. It is worth noting that the guideline values are constant at various probability levels since they are deterministic estimates.  Figure 7.6: Derived and guideline PNECs of copper and zinc Chapter 7  95  The results of risk characterization show the exposure and PNEC P-boxes for copper are overlapped both on their lower and upper bounds (Figure 7.7). This overlap suggests that there would be risk from the exposure concentration at both bounds. The computed risk of copper based on the lower bound shows that the effect concentration (0.001 mg/L) is not exceeded 92% of the time and the associated risk is equal to 8%. The computed risk of copper based on the upper bound shows that the effect concentration (0.021 mg/L) is not exceeded 52% of the time and the associated risk is 48%. These results reveal that the computed risk of copper based on the lower bound is acceptable, whereas the risk of copper based on the upper bound is not acceptable since it violates the defined assessment endpoint (10%).  Figure 7.7: Predicted exposure and effect concentrations of copper and zinc In case of zinc, the exposure and PNEC concentrations are overlapped only at their lower bounds (Figure 7.7). This overlap suggests that there would be risk from exposure concentration at the lower bound. The computed risk based on the lower bound of zinc shows the effect concentration (0.001 mg/L) is not exceeded 50% of the time and the resulting risk is 50%. In addition, the computed risk based on the upper bound shows that the effect concentration (0.2 mg/L) is not exceeded 80% of the time and the associated risk is 20%. These results reveal that the risk of zinc Chapter 7  96  is not acceptable based the lower and upper bound measures since it violates the acceptable risk, which is equal to 10%. In general, the simulation of PHREEQC is quite acceptable except for calcium and sulfate provided that the limited information available to describe the minesite.  The overestimation of calcium and sulfate is attributed to precipitation of gypsum ( 4CaSO ) which was not included in the simulation since there was no information from the minesite that indicates which minerals precipitate.  The comparison between the predictions of copper and zinc using a fugacity model (previous chapter), PHREEQC and observed p-boxes is shown in Figure 7.8. The result shows that the upper bounds of the predicted copper and zinc using the fugacity model overestimate their observed counterparts by order of magnitudes. A reason for overestimation could be the model’s inability to represent precipitation and dissolution processes that control the transport of metals. This result well agrees with previous studies. Zhu (2003) compared PHREEQC and partitioning coefficient (Kd) based models for simulating the transport of sulfate at an abandoned minesite. The results showed the Kd based model predicted lower and higher sulfate concentration in groundwater for larger and smaller Kd values used, respectively. Bethke and Brady (2000) compared Kd based and surface complexation-based models. Their results showed that their Kd based model overestimates transport of the metals plume. Therefore, the use of partitioning-coefficient-based models (e.g., a fugacity-based model) for simulating fate and transport of metals in minesites might overestimate the exposure concentration and risks posed to the environment.   The methodology presented for risk characterization provides two ways to estimate a risk. These are risk characterization based on the lower and upper bounds. The risk characterization based on the upper and lower bounds provides conservative and non-conservative risk estimates, respectively. Although it is the task of decision-makers to choose lower or upper bounds to estimate a risk, it is often recommended to use the conservative one (Guyonnet and Come 1999). In this study, the risks of copper and zinc to fish species are not acceptable based on the conservative measure. On the other hand, the risk of copper is acceptable based on the non-conservative measure, whereas the risk of zinc is not acceptable.   Chapter 7  97   Figure 7.8: Comparisons between the predictions of fugacity and PHREEQC models, and observed p-boxes on a logarithmic scale.  7.5 Summary This chapter presents a methodology of ecological risk assessment in the mining industry using probability bounds and the PHREEQC model. This risk assessment involved problem formulation, exposure characterization, effect characterization, and risk characterization. In problem formulation, assessment endpoint, conceptual model, and hypothesis of risk were defined. In exposure characterization, the PHREEQC model was used determine the fate-and-transport of metals in the environment.  In effect characterization, the response of organisms to metals is estimated. Data and parameter uncertainties are addressed using the probability bounds approach. In risk characterization, the likelihood of risk is estimated by comparing the degree overlaps between probability bounds of exposure and effect concentrations at a given percentile level.  The simulations of PHREEQC determined the concentrations of cations and anions in groundwater. These simulations were acceptable except for calcium and sulfate. These variables are overestimated because the precipitation of gypsum was not included in the simulation. The prediction of fugacity model greatly overestimates the amounts of minor ions in groundwater compared to prediction of PHREEQC.  The results of the effect characterization showed that the Chapter 7  98  derived effect concentrations for copper and zinc are comparable with guideline values. The risk characterization result indicated that there is an adverse of ecological risk due to metals transported into groundwater based on conservative measure.  Chapter 8 99  CHAPTER 8 RISK MANAGEMENT: A MCDA APPROACH A version of this chapter has been published in Journal of Environmental Management with a title “Selection of remedial alternatives for minesites: A multicriteria decision analysis.” The lead author is Getnet D. Betrie and the coauthors are Dr. Rehan Sadiq, Dr. Kevin A. Morin, and Dr. Solomon Tesfamariam.  8.1 Background ARD from mine wastes that contains elevated metals and metalloids poses human and environmental risks (Gray, 1996; Azapagic, 2004). Often mitigation measures or alternatives are used to reduce human and environmental risks that are caused by ARD in minesites. These alternatives are categorized into source control and migration control (Johnson and Hallberg 2005; Kuyucak 1999; Miller et al. 2006; Blowes et al. 2003). Alternatives of source control prevent the release of ARD generation by preventing the ingress of oxygen and water in waste rocks or tailings. Examples of source control include water-covers, soil-covers, chemical treatments and bactericides and others. Alternative of migration control prevent the transport of ARD products into the environment once it is generated. Examples of migration control include water treatment through chemicals, constructed wetlands, and permeable reactive barriers and others.  The selection of an optimal alternative for a minesite is quite complex (Miller et al. 2006) because most environmental decision-making (i.e., the selection process) involves multiple and conflicting objectives (e.g., minimizing risk and cost, maximizing benefit, and maximizing stakeholder preferences) (Sadiq and Tesfamariam 2009; Kiker et al. 2005). Moreover, input information for each objective is often obtained in different forms (i.e., quantitative and qualitative), which are non-commensurable and thus exacerbate the decision making process (Tesfamariam and Sadiq 2008). It is worth noting that there is a move away from decision-making based on single objective (e.g., cost or technical feasibility) toward selecting a sustainable mitigation technology (SMT) by considering multiple objectives such as environmental, economic and social objectives. SMT refers to technologies that protect the environment, economically feasible, and socially acceptable. The existing guidelines (Price 2009; Verburg et al. 2009) emphasize the selection of a mitigation alternative that satisfies various criteria such as protectiveness of the environment, regulatory acceptance, cost and others. In these guidelines, experts conduct remedial selection analysis without multiple criteria decision analysis (MCDA) aids. However, the literature shows that Chapter 8 100  humans are not capable of solving multiple objectives unaided. When they attempt to do so, opposing views are often discarded (McDaniels et al. 1999).  Uncertainty is an unavoidable and inevitable component of any environmental decision making process (Sadiq and Tesfamariam 2009). MCDA methods require input data such as weights of criteria and preference of alternatives with respect to each criterion provided by decision makers. However, these data have uncertainty because of decision makers’ judgment and subjectivity. Before informed decisions can be made, this uncertainty must be quantified, through the application of available techniques. The objective of this chapter is to develop a methodology that uses MCDA for risk management of ARD under uncertainty. Although there are many MCDA methods, as previously discussed, the PROMETHEE methods (Brans and Vincke 1985) are implemented in this study. The PROMETHEE methods are implemented for two reasons (i) they are easy to apply compared to other outranking methods such as ELECTRE since PROMETHEE methods require fewer parameters from decision makers; and (ii) they rank alternatives as well as identify the best alternative, whereas the utility theory methods only identify the best alternatives.  8.2 Proposed Methodology The proposed methodology for mitigating ARD risk is presented in Figure 8.1. It consists of baseline risk assessment, identify potential alternatives, future risk assessment, deterministic and probabilistic multictriteria decision analysis, and result analysis blocks. The methodology starts by conducting baseline risk assessment. Baseline risk assessment consists of problem formulation, exposure, effect, and risk characterizations. This block could be conducted using the methodology presented in Chapters 6 or 7 depending upon the level of site characterizing information available.   If there is risk due to contaminants in a site, potential alternatives that prevent the generation of ARD or control the transport of ARD products are developed in the develop potential alternatives block. The development and implementation of alternatives are governed by geochemical and hydrological processes that control the generation of ARD in a minesite (Blowes et al. 2003). A single or multiple technologies could be combined to develop alternatives. Once the potential alternatives is developed, their effectiveness to prevent generation or transport of oxidation products are determined. This is obtained either from experience of other minesites or conducting experiment for a given minesite. After the effectiveness of potential alternatives is determined, future risk assessment is conducted. Chapter 8 101   In future risk assessment block, future risk under each potential alternative is estimated. This is specifically done by running scenarios using a fate-and-transport model under each alternative. The exposure concentrations obtained from the fate-and-transport modeling is compared with the baseline effect to characterize the risks under each alternative.  Next, the risk values obtained from risk characterization for each alternative is used as inputs to MCDA blocks.  MCDA is done either in a deterministic or a probabilistic approach. The deterministic block consists of identifying criteria to evaluate alternatives, identifying the weight of criteria using AHP, and applying the PROMETHEE I and II methods. The probabilistic block consists of identifying criteria to evaluate alternatives, defining the weight of criteria by using AHP, defining probability distribution for the weight of criteria, conducting Monte Carlo Simulation, applying PROMETHEE II, and conducting a sensitivity analysis.  In the result analysis block, the deterministic and probabilistic alternative rankings are analyzed. The deterministic alternative rankings are obtained from partial and complete rankings of PROMETHEE methods. The probabilistic alternative rankings are obtained from a complete ranking of the PROMETHEE II method. These results provide probability distributions for the ranking of alternatives. In addition, the results provide the probability of given alternatives belonging to various ranks.   If an optimal alternative obtained in the result analysis block, the process is terminated; otherwise, the process repeated from development of potential alternatives block until an optimal alternative is obtained. In the subsection below, the elements of the deterministic and probabilistic MCDAs block are presented. Since the probabilistic MCDA block contains the first two elements of the deterministic block, the elements of the probabilistic MCDA block beginning from “defining the probability distribution criteria weight” are discussed. Chapter 8 102   Figure 8.1: The proposed methodology for selection of mitigation alternatives at minesites BASELINE RISK ASSESSMENTIS  THERE RISK?DEVELOPMENT OF POTENTIAL ALTERNATIVESNOYESSTARTFUTURE RISK ASSESSMENTDETERMINSTIC MULTICRITERIA DECISION ANALYSISPROBABLISTIC MULTICRITERIA DECISION ANALYSISRESULT ANAYSISIS  THERE OPTIMAL ALTERNATIVE?YESENDChapter 8 103   8.2.1 Deterministic Multicriteria Decision Analysis 8.2.1.1 Preference Ranking Organization Method for Enrichment Evaluation (PROMETHEE) The PROMETHEE I and II methods are multicriteria decision making techniques developed by Brans and Vincke (1985). These methods build a valued outranking relation and exploit this relation to obtain a partial ranking (PROMETHEE I) or complete ranking (PROMETHEE II). When A  of n  alternatives (i.e., 1 2 3, , a na a a⋅ ⋅ ⋅ ) have to be ranked, andG  of k  criteria (i.e., 1 2 3 k, , , ,gg g g ⋅ ⋅ ⋅ ) have to be maximized or minimized as shown in Figure 8.2, the resulting multicriteria problem has the following form as shown in equation 8.1 (Brans and Vincke 1985):  1 2{ ( ), ( ),..., ( ) | }kmax g a g a g a a A∈  8.1   Figure 8.2: Evaluation matrix The basic steps of PROMETHEE methods to solve such multicriteria problem are (Brans and Vincke 1985): i. Defining a preference function for each criterion, w1 w2 .  .  . wka1 g1(a1) g2(a1) .  .  . gk(a1)a2 g1(a2) g2(a2) .  .  . gk(a2)...............an g1(an) g2(an) .  .  . gk(an)Chapter 8 104  ii. Calculating the multicriteria preference index as a weighted average of the preference functions, iii. Calculating a leaving flow, an entering flow, and a net flow for each alternative, and iv. Assigning a partial or complete ranking.  A preference function ( ( , )( , )jP a b a b A∈ ) associated with criterion ig  gives the degree of preference, expressed by decision-makers, for alternative a over b . In other words, the preference function takes the deviation of the preference a over b as an input and provides a normalized output as shown by equations 8.2 and 8.3, respectively.  ( , ) [ ( ) ( )]j j i iP a b P g a g b= −   8.2 0 ( , ) 1jP a b≤ ≤   8.3 If ( , ) 0jP a b = , there is no preference a over b and if ( , ) 1jP a b = , there is a strict preference of aoverb . Brans and Mareschal (2005) provided six types of preference functions as displayed in Table 8.1. If a weight jw  is given, which shows the relative importance of each criterion, it is used to calculate the multicriteria preference index as defined by equation 8.4: 1( , ) ( , )kj iia b w P a bπ==∑  8.4 The leaving (outgoing) flow that shows how an alternative ia  is outranking other alternatives is defined by the equation below: ( ) ( , )x Ka a xφ π+∈= ∑  8.5 The incoming (leaving) flow that shows how an alternative ia  is outranked by other alternatives is defined by the following equation: ( ) ( , )x Ka x aφ π−∈= ∑ 8.6   Chapter 8 105  Table 8.1: Types of preference function Preference function Definition  0, 0( )1, 0dp dd≤=  >    0,( )1,d qp dd q≤=  >   0 0( ) 01ddp d d ppd p ≤= ≤ ≤ >    0,1( ) ,21,d qp d q d pd p≤= ≤ ≤>    0,( ) ,1,d qd qp d q d pp qd p ≤ −= ≤ ≤− >   2220, 0( )1 , 0dsdp de d−≤=  − >   PROMETHEE I is used to derive a partial ranking which is obtained by comparing the leaving and incoming flows, as defined by equation 8.7: Type I:UsualCriterion1Pd01PdType II:U-shapeCriterionq0Pp1dType III:V-shapeCriterion0q p0.51Pd0Type IV:LevelCriterionq p1Pd0Type V:V-Shape with indif-ferenceCriterion1Pd0Type VI:GaussianCriterionChapter 8 106  ( ) ( ) ( ) ( ),( ) ( ) ( ) ( ),( ) ( ) ( ) ( );( ) ( ) ( ) ( )( ) ( ) ( ) ( ),( ) ( ) ( ) ( );IIIa b and a b orap b iff a b and a b ora b and a baI b iff a b and a ba b and a b oraR b iffa b and a bφ φ φ φφ φ φ φφ φ φ φφ φ φ φφ φ φ φφ φ φ φ+ + − −+ + − −+ + − −+ + − −+ + − −+ + + +  > <  = < > = = =  > > < < 8.7 where ,I IP I and IR  PI, II, andRIrepresent preference, indifference, and incomparability, respectively. PROMETHEE II is used to derive a complete ranking which is calculated as a net flow between outgoing and incoming flows, as shown in equation 8.8. If the net flow of a  is greater thanb , aoutranksb . Otherwise, a is indifferent tob . ( ) ( ) ( )a a aφ φ φ+ −= −  8.8 8.2.1.2 Analytical Hierarchy Process (AHP) AHP is one of the MCDA methods that assists decision makers in solving complex problems by organizing thought, experiences, knowledge, and judgments into a hierarchical framework, and by guiding them through a sequence of pairwise comparison judgments (Saaty 1982). In general, it derives the ratio scale from a pairwise comparison. In this study, AHP is used only to generate weight. The process of generating weight using AHP consists of the following steps (Saaty 1982): i. Identify the criteria to generate weights,  ii. Make a pairwise comparison for each criterion on a scale of 1 to 9. Note that  the relative preference of one criterion over another is obtained from decision makers, iii. Compute eigen values and eigen vectors to obtain a normalized weight for each criterion, and iv. Compute the consistency ratio (CR). If the CR is more than 10%, then the results are inconsistent. Thus, the assigned weight values are modified until the CR value is within an acceptable range.  Chapter 8 107  8.2.2 Probabilistic Multicriteria Decision Analysis 8.2.2.1 Define Probability Distribution for Criteria Weight The criteria weights are generated using the AHP technique from preference information provided by actors involved in the decision process. The definition of probability distribution for criteria weight solely depends on the number of actors who are participating in the decision process. If an adequate number of actors participated in the decision process, the obtained information on preference is fitted to a probability distribution function using available techniques or software. In a situation where there are no adequate actors participating in the decision process, the type of probability distribution that is recommended is uniform distribution (Rietveld and Ouwersloot 1992). 8.2.2.2 Conduct Monte Carlo Simulation Monte Carlo Simulation (MCS) is used to propagate uncertainties in random variables. In this technique, the weight of the criterion is represented by a probability density function that defines both the range of values which the weight can take and the likelihood that the weight has a value in any subinterval of that range (Aral 2010). A random sampling is used to select the weight of each criterion set at different probability levels, and repeated several times. For each combination of sampled weights, the PROMETHEE II method is run. The MCS runs until sufficient sample values have been collected to approximate the input parameter distribution.  8.2.2.3 Sensitivity Analysis Sensitivity analysis is used to identify the contribution of each criterion in the overall ranking of alternatives. A sensitivity analysis is often carried out using MCS and a rank correlation method is applied to the results of MCS in order to identify the contribution of parameters (i.e., criteria) to the output (Tesfamariam 2006). The latter method involves the determination of correlation coefficients that measure the strength of the linear relationship of two variables. In this study, the Spearman rank correlation coefficient (equation 8.9), which measures the linear relationship of two ranked variables, was used.  21261( 1)niidn nρ == −−∑ 8.9  Chapter 8 108  where id  is the difference between the rank of values of alternatives and weights of criteria, and n  is the total number of MCS realizations.  8.3 Case study 8.3.1 Risk Assessment Site characterization information was obtained from the closure plan report of the minesite. The remediation objective at this minesite is to reduce environmental risk. The conceptual model for the risk assessment is shown in Figure 8.3. It shows the sources of contaminants, release mechanisms, media, exposure routes, and receptors. According to the mine’s closure plan report, the chemical of concern (COC) is elevated copper and zinc from reactive waste rocks and tailings. The possible release mechanisms are leaching and erosion. The media that could be contaminated are surface water and soil. The two potential exposure routes that are considered at this site are ingestion and contact. The major ecological entity that should be protected is fish. The baseline risk assessment for this site was conducted as discussed in Chapter 7 and the results of the risk analysis are used in this chapter.   Figure 8.3: Conceptual model for risk assessment SOURCES• TAILINGS•WASTE ROCKSRELEASE MECHANISMS• LEACHING• EROSIONMEDIA• SURFACE WATER• SEDIMENT/ SOILEXPOSURE ROUTES•INGESTION•CONTACTRECEPTORS• FISHChapter 8 109  8.3.2 Development of Potential Alternatives Table 8.2 shows the alternatives that were considered in this study and their effectiveness to reduce contaminants. The literature review  (Johnson and Hallberg 2005; Miller et al. 2006; Skousen et al. 1998; Skousen et al. 2000; U.S. EPA 2000) shows various remedial alternatives. These alternatives reduce exposure of reactive wastes to oxygen and water, or reduce the amount of contaminants transported so that any adverse effects on humans and the environment are eliminated. The potential effectiveness (in percent) of the alternatives to reduce acid generation is shown in Table 8.2. These values are obtained from the literature as presented in this table. Note that in this step, potential technologies should have been identified, screened, and then assembled into alternatives. In this case study, however, this was not done; instead, a single technology was considered as an alternative in order to make the illustration of proposed methodology simple.  Table 8.2: Alternatives considered and their effectiveness to reduce contaminants Alternatives  Effectiveness (%) Sources Do-nothing (DN) 0 - Soil Cover (SC) 33 MEND (2000) Membrane System (MC) 70 Meek and Allen (1994) Water cover (WC) 95 Yanful and Simms (1997; Vigneault et al. (2007) Excavation and On-site disposal (EON) 99 U.S. EPA (1995) Excavation and Off-site disposal (EOF) 100 U.S. EPA (1995) Water Treatment (WT) 90 U.S. EPA (2000) Constructed Wetland (CW) 30 MEND (1999; Skousen et al. (1998) Sulphate Reducing Bacteria (SB) 80 Miller et al. (2006) 8.3.3 Deterministic MCDA Criteria to evaluate the alternatives were identified by an expert from the minesite and the authors in order to ensure that the proposed remedial objectives are achieved. These criteria are presented below: i. Minimize risk posed to fish, Chapter 8 110  ii. Minimize the cost of an alternative, iii. Maximize the short-term performance of an alternative, iv. Maximize the long-term performance of an alternative, v. Maximize the implementability of an alternative, vi. Maximize the reduction of toxicity,  mobility, or volume of wastes, and vii. Maximize the future use and aesthetic of the minesite. Table 8.3 shows these criteria and their respective scoring scale. Each criterion except risk was rated by experts on a scale of 1 to 9, as shown in this table. An alternative that has low cost rated as 9 and high cost rated as 1. For short-term performance, an alternative that has immediate effect after implementation is rated as 9 and an alternative that would take a longer time to be effective is rated as 1. For long-term performance, an alternative that is effective for longer durations (e.g., over 20 years) is rated as 9; otherwise, it is rated close to 1. If an alternative is easy to implement, it is rated as 9; otherwise, it is rated at a lesser value. An alternative that has an excellent effect on the reduction, mobility, and transport of contaminants is rated as 9; otherwise, it is rated at a lesser value. An alternative that increases future use and aesthetic of the site is rated as 9.  Table 8.3: Criteria and scoring scale Criteria Score Risk Obtained from Chapter 7 Cost 1(Low) to 9 (Extremely high) Short-term performance (SP) 1(Poor) to 9 (Excellent) Long-term performance (LP) 1(Poor) to 9 (Excellent) Implementability (IM) 1(impossible) to 9 (absolutely possible) Reduction of toxicity (RT) 1(Poor) to 9 (Excellent) Future use and aesthetic (FU) 1(Poor) to 9 (Excellent)  Table 8.4 presents the score of the alternatives with respect to each criterion. This table shows that the cost associated with DN is the lowest (1), SC is high (6), and EOF is the highest (9). The short-term performance of DN is poor (1) but the WC alternative has an excellent (9) short-term performance. The long-term performance of DN is poor (1) but the EON and EOF alternatives have an excellent (9) long-term performance. The MC alternative is difficult (2) to implement, Chapter 8 111  while the DN and SB alternatives are the easiest (9) to implement. The DN alternative is the least effective (1) but EOF is most effective (9) in reducing toxicity. The DN and SB alternatives have the least impact (1) on improving the future use of this site, while EOF and EON have the most impact on improving the future of this site.   The Risk values of the DN alternative are obtained from Figure 7.7. The risk characterization results of the upper bound are used to estimate the risks of copper and zinc. The upper bound is used because it provides very conservative estimate of risk that protects the assessment endpoint. The risk estimates of copper and zinc is summed to find the risk value of DN, which is equal to 68%. It is worth noting that the scoring shown in Table 8.4 is highly site specific and may not necessarily be applicable to other minesites. Table 8.4: Alternatives and their respective score at each criterion Criteria* DN SC MC WC EON EOF WT CW SB Risk 68 46 20.4 3.4 1 0 9 48 14 Cost 1 6 9 3 8 9 8 3 3 SP 1 2 2 9 6 8 9 4 7 LP 1 5 7 8 9 9 6 7 4 IM 9 5 2 8 7 7 5 8 9 RT 1 4 4 6 8 9 8 3 5 FU 1 4 6 5 8 9 2 7 1 * Do-nothing (DN), Soil Cover (SC), Membrane System (MC), Water cover (WC), Excavation and On-site disposal (EON), Excavation and Off-site disposal (EOF), Water Treatment (WT), Constructed Wetland (CW), Sulphate Reducing Bacteria (SB), Short-term performance (SP), Long-term performance (LP), Implementability (IM), Reduction of toxicity (RT), Future use and aesthetic (FU) Table 8.5 presents a comparison matrix of the criteria. The comparison matrix shows the relative importance of one criterion against another and reflects the preference of actors who participated in the decision making process. A pairwise comparison was done on a scale of 1 to 9. For instance, a decision-maker strongly prefers (9) the Risk criterion compared to the Future use and aesthetic (FU) criterion. The preference of FU over Risk (i.e., 0.11) was obtained by taking the reciprocal preference value of Risk over FU. The normalized weights of the criteria were obtained by computing eigenvectors of the matrix. The remainder values of the matrix can be explained in a similar fashion.   Chapter 8 112  Table 8.5: Comparison matrix of criteria * Risk Cost LP SP IM RT FU Risk 1 7 4 6 8 3 9 Cost 0.14 1 5 7 8 6 3 LP 0.25 0.2 1 9 9 3 5 SP 0.17 0.14 0.11 1 0.33 0.17 0.2 IM 0.13 0.12 0.11 3 1 3 3 RT 0.33 0.17 0.33 6 0.33 1 9 FU 0.11 0.33 0.2 5 0.33 0.11 1 * Long-term performance (LP), Short-term performance (SP), Implementability (IM), Reduction of toxicity (RT), Future use and aesthetic (FU) The input information for multicriteria methods (e.g., PROMETHEE) are provided in Table 8.6. This evaluation table consists of criteria; whether the criteria have to be minimized (Min) or maximized (Max); criteria weight that shows relative importance one criterion over other criteria; the alternatives’ evaluation outcome with respect to each criterion;  preference function that gives the degree of preference of an alternative against another alternative; and preference parameter value that has to be fixed. The criteria weights were obtained from the AHP computation. The results of AHP show that Risk is the most prioritized (i.e., 0.41) criterion followed by Cost, Long-term performance (LP), Reduction of toxicity (RT), Implementability (IM), Future use and aesthetic (FU), and Short-term performance (SP). Type III (i.e., V-shape) preference function was selected for all criteria. This preference type was selected because it best represents the data compared to other preference functions. The parameter value of type III function that was assumed for the Risk criterion was equal to 230 and for the other criteria was equal to 8.8. 8.3.4 Probabilistic MCDA A uniform probability distribution function was defined for the weights of criteria. Inputs to this distribution were defined by multiplying the original weight of each criterion by 10± . Each probability distribution was randomly sampled 5000 times using the MCS technique. For each combination of randomly sampled weight, the MCDA method (PROMETHEE II) was run. The results of criteria weights and PROMETHEE II outputs obtained from 5000 MCS run were ranked and then correlation coefficients were estimated using the Spearman rank correlation method. The estimated correlation coefficients of each alternative and criterion were normalized Chapter 8 113  and multiplied by 100 to obtain the contribution (in percent) of the criteria on overall ranking of the alternatives. Table 8.6: Evaluation table  Criteria* Min or Max Criteria weight DN SC MC WC EON EOF WT CW SB PF PP Risk min 0.41 68 46 20.4 3.4 1 0 9 48 52 III 14 Cost min 0.23 1 6 9 3 8 9 8 3 3 III 8.8 SP max 0.02 1 2 2 9 6 8 9 4 7 III 8.8 LP max 0.16 1 5 7 8 9 9 6 7 4 III 8.8 IM max 0.06 9 5 2 8 7 7 5 8 9 III 8.8 RT max 0.08 1 4 4 6 8 9 8 3 5 III 8.8 FU max 0.04 1 4 6 5 8 9 2 7 1 III 8.8 * Do-nothing (DN), Soil Cover (SC), Membrane System (MC), Water cover (WC), Excavation and On-site disposal (EON), Excavation and Off-site disposal (EOF), Water Treatment (WT), Constructed Wetland (CW), Sulphate Reducing Bacteria (SB), Short-term performance (SP), Long-term performance (LP), Implementability (IM), Reduction of toxicity (RT), Future use and aesthetic (FU), Preference function (PF), Preference parameter (PP) 8.4 Results and Discussion Table 8.7 presents the computed preference indices of the deterministic PROMETHEE analysis. These preference indices were computed by evaluating the preference of one alternative over another, using the preference function of each criterion. Next, the obtained value was multiplied by the weight of each criterion and summed. A preference index indicates the preference of an alternative over another. For instance, Do-nothing (DN) is preferred over Soil cover (SC) by 0.13, while SC is preferred over DN by 0.248. A higher preference value indicates a stronger preference for that alternative.  The computed outgoing (ϕ+), incoming (ϕ-), and net (ϕNET) flows for the deterministic PROMETHEE analysis are shown in the bottom of Table 8.7. The outgoing flow of an alternative was obtained by summing the preference indices in that row. For example, the incoming flow of DN (0.967) was obtained by summing the preference indices in the first row. On the other hand, the incoming flow of an alternative was obtained by summing the preference indices in that column. For example, the incoming flow of DN (4.095) was obtained by summing the preference indices of in the first column. The incoming and outgoing flows provide an indication of whether Chapter 8 114  an alternative outranks, or is outranked, by other alternatives, respectively. If an alternative is characterized by a greater incoming flow and a smaller outgoing flow, the better this alternative dominates other alternatives. These values were used for partially ranking the alternatives. The net flow was obtained as a net difference between the incoming and the outgoing flows and it indicates whether an alternative globally outranks other alternatives. The higher the net flow of an alternative, the more this alternative dominates other alternatives. The net flow is used to completely rank the alternatives.  Table 8.7: Preference indices, incoming, outgoing, and net flow  * DN SC MC WC EON EOF WT CW SB DN - 0.13 0.209 0.052 0.183 0.209 0.182 0 0 SC 0.248 - 0.078 0.02 0.066 0.092 0.061 0.043 0.043 MC 0.454 0.206 - 0.045 0.034 0.034 0.041 0.235 0.077 WC 0.642 0.523 0.42 - 0.185 0.175 0.172 0.422 0.149 EON 0.628 0.442 0.283 0.059 - 0.026 0.076 0.418 0.172 EOF 0.678 0.523 0.311 0.078 0.055 - 0.094 0.472 0.208 WT 0.662 0.43 0.282 0.039 0.068 0.058 - 0.435 0.146 CW 0.259 0.185 0.25 0.061 0.183 0.209 0.208 - 0.041 SB 0.523 0.449 0.355 0.052 0.201 0.209 0.183 0.304 - ϕ+ 0.967 0.653 1.127 2.688 2.105 2.42 2.121 1.397 2.277 ϕ- 4.095 2.888 2.191 0.408 0.975 1.013 1.018 2.330 0.838 ϕNET -3.128 -2.235 -1.064 2.28 1.130 1.408 1.103 -0.933 1.439 *Do-nothing (DN), Soil Cover (SC), Membrane System (MC), Water cover (WC), Excavation and On-site disposal (EON), Excavation and Off-site disposal (EOF), Water Treatment (WT), Constructed Wetland (CW), Sulphate Reducing Bacteria (SB) The deterministic partial ranking (PROMETHEE I) of the alternatives is shown below: , , , , , , , ;, , , , , ;, , , , ;, , , ;, , ,I I I I I I I II I I I I II I I I II I I II I I IWCP DN WCP SC WCP MC WCP EON WCP EOF WCP WT WCP CW WCP SBSBP DN SBP SC SBP MC SBP EON SBP WT SBP CWEOFP DN EOFP SC EOFP MC EOFP WT EOFP CWEONP DN EONP SC EONP MC EONP CWWTP DN WTP SC WTP MC WTP CW ;, , ;, ;I I II ICWP DN CWP SC CWP MC andMCP DN MCP SC Chapter 8 115  The Water cover (WC) alternative dominates all the alternatives. The Sulphate reducing bacteria (SB) alternative dominates all the alternatives except WC and Excavation and off-site disposal (EOF); the SB alternative is dominated by WC and is incomparable with EOF. The EOF alternative dominates the Do-nothing (DN), Soil cover (SC), Membrane cover (MC), Water treatment (WT) and Constructed wetland (CW) alternatives, is dominated by WC alternative, and is incomparable with the SB and Excavation and on-site disposal (EON) alternatives. The EON alternative dominates the DN, SC, MC and CW alternatives, is dominated by the WC and SB alternatives, and is incomparable with EOF and WT. The WT alternative dominates the DN, SC, and CW alternatives, is dominated by the WC, SB, and EOF alternatives, and is incomparable with the EON and MC alternatives. The CW alternative dominates the DN and SC alternatives, is incomparable with MC, and is dominated by the WC, SB, EOF, EON and WT alternatives. The DN and SC alternatives are incomparable. The alternatives are incomparable whenever they have conflicting relative rankings. Incomparability between the alternatives occurs if an alternative is superior according to some criteria and inferior according to other criteria compared to other alternatives. For instance, the incomparability of SB and EOF occurs because the EOF alternative performs better on the Risk, Long-term performance (LP), Reduction of toxicity (RT) and Future use and aesthetic (FU) criteria than does the SB alternative, whereas the SB alternative performs better on Cost and Implementability (IM) criteria.   The deterministic complete ranking (PROMETHEE II) shows that WC ranked first followed by the SB, EOF, EON, WT, CW, MC, SC, and DN alternatives. It is interesting to note that WC and DN have the same rank in both partial and complete ranking methods. However, the incomparability information is lost in the complete ranking method which may affect decision-makers ability to make informed decision.  The results of the probabilistic complete ranking of the alternatives are displayed in Table 8. This table shows the probability of an alternative obtaining a certain rank. The WC alternative obtains the first rank with a probability of 100%. The chances of the SB alternative ranking second and third are 70% and 30%, respectively, whereas the chances of the EOF alternative ranking second and third are 30% and 70%, respectively. The probability of the EON alternative obtaining fourth and fifth ranks is 67% and 33%, respectively, whereas the WT alternative has a probability of 33% and 67% of obtaining the fourth and fifth ranks, respectively. The chance of the CW alternative ranking sixth and seventh is 97% and 3%, respectively, whereas the MC alternative ranks similar to the CW alternative with the values of probability reversed. The SC and DN alternatives rank eighth and ninth with a probability of 100%. It is interesting to note that this Chapter 8 116  method clearly showed the degree of conflict in relative ranking that led to incomparability in the deterministic partial ranking which is presented above. Table 8.8: Probabilistic complete ranking of alternatives in percentage  *Do-nothing (DN), Soil Cover (SC), Membrane System (MC), Water cover (WC), Excavation and On-site disposal (EON), Excavation and Off-site disposal (EOF), Water Treatment (WT), Constructed Wetland (CW), Sulphate Reducing Bacteria (SB) Figure 8.4 shows the cumulative distribution functions of the net flow for the probabilistic PROMETHEE analysis. It shows a possible range of net flow for the alternatives. For example, the net flow of SB has a possible range between 1.305 and 1.573, and the EOF alternative has a possible range between 1.24 and 1.569. It is interesting to compare the probabilistic net flow values with the deterministic net flow results. The information obtained from the probabilistic analysis explains the reason for the conflicting relative ranking among alternatives (e.g., SB vs. EOF and EON vs. WT). This probabilistic output enables decision-makers to obtain net flow values at the required confidence levels and to rank the alternatives with a certain level of confidence. Rank DN SC MC WC EON EOF WT CW SB 1 0 0 0 100 0 0 0 0 0 2 0 0 0 0 0 30 0 0 70 3 0 0 0 0 0 70 0 0 30 4 0 0 0 0 67 0 33 0 0 5 0 0 0 0 33 0 67 0 0 6 0 0 3 0 0 0 0 97 0 7 0 0 97 0 0 0 0 3 0 8 0 100 0 0 0 0 0 0 0 9 100 0 0 0 0 0 0 0 0 Chapter 8 117   Figure 8.4: Cumulative distribution function of net flow  The contribution (in percent) of the criteria on ranking the alternatives is shown in Figure 8.5. For instance, the three most sensitive criteria that impact the WC alternative are Risk, SP and Cost.  Although SP has the least weight among the criteria, its effect is greater than the Cost criterion which is the second most highly weighted criterion. It is worth noting that the ranking of each alternative could be sensitive to different criteria as is seen in this figure. For instance, the SB alternative is sensitive to Cost followed by Risk, SP, FU, IM, and RT criteria, and the CW alternative is sensitive to Risk followed by Cost, RT, LP, SP, FU and IM criteria.  Chapter 8 118   Figure 8.5: Contribution (%) of criteria on ranking of alternatives 8.5 Summary In this chapter, a methodology of ARD risk management under uncertainty is developed. This methodology is based on probabilistic and deterministic MCDA methods. The MCDA consists of identifying criteria to evaluate alternatives, calculating criteria weight using AHP, defining the probability distribution of criteria weights, conducting MCS, applying PROMETHEE I and II methods, conducting a sensitivity analysis, determining the deterministic and probabilistic rankings of alternatives, and identifying the sensitivity of the criteria. Finally, a case study was used to demonstrate the developed methodology. The results of the deterministic PROMETHEE I analysis showed and ranked the comparable alternatives, and showed the incomparable alternatives. The deterministic PROMETHEE II results only showed rankings of the alternatives without incomparability information. For instance, the deterministic PROMETHEE II method at the case study minesite showed that the WC (Water cover) alternative ranks first followed by Sulphate reducing bacteria (SB), Excavation and off-site disposal (EOF), Excavation and on-site disposal (EON), Water treatment (WT), Constructed wetland (CW), Membrane system (MC), Soil cover (SC) and Do-nothing (DN) alternatives. In both methods, the uncertainty in the ranking of the alternatives due to input -100 -75 -50 -25 0 25 50 75 100RiskCostSPLPImRTFUContribution (%)SBCWWTEOFEONWCMCDNSCChapter 8 119  information variation was not shown to decision-makers. On the other hand, the probabilistic PROMETHEE II results showed the probability level at which an alternative could yield certain rankings. For instance, the application of the probabilistic complete ranking at the case study minesite showed that WC ranked 1st with a probability level of 100%, SB and EOF ranked 2nd and 3rd with a probability level of 70%, EON and WT ranked 4th and 5th with a probability level of 67%, CW and MC ranked 6th and 7th with a probability level of 97%, and SC and DN ranked 8th and 9th with a probability level of 100%. Also, this method was used to rank the alternatives in the form of a cumulative distribution function that will help decision-makers rank alternatives with the required level of confidence.  The sensitivity analysis showed the impact of the criteria on ranking the alternatives. For instance, the Risk and Future use and aesthetic (FU) criteria impacted the WC alternative by 44% and -2%, respectively. Based on the contribution of the criteria, decision makers can leave insensitive criteria for further analysis in order to reduce the complexity of decision making. Meanwhile, they can collect more information on the most sensitive criteria in order to reduce the uncertainty associated with the criteria. The study also showed that the assigned weight of criteria has little effect on ranking the alternatives.   Chapter 9  120   CHAPTER 9 CONCLUSIONS AND RECOMMENDATION 9.1 Conclusions The main objective of this research was to propose a risk management framework for ARD that can improve the prediction of ARD chemistry, assess, and manage environmental and human health risks to guide decision-making under uncertainty. The proposed framework has the capability to fill in missing data of ARD chemistry using imputation methods, predict future ARD chemistry using machine learning techniques under uncertainty, assess environmental risks under uncertainty for minesites using a fugacity-based and PHREEQC models, and manage risks of ARD under uncertainty using a multicriteria decision analysis approach. In general, the outcomes of this research improve the ability of the mining industry to predict ARD chemistry, assess, and manage risks of ARD under uncertainty.  To address the issue of missing data in the proposed framework, different imputation methods were investigated including iterative robust model-based imputation (IRMI), sequential imputation for missing values (IMPSEQ), and multiple imputations of incomplete multivariate data (AMELIA). The results showed that IMPSEQ and IRMI are suitable to fill missing data of ARD chemistry databases in minesites, whereas AMELIA is not suitable.  In this research different machine learning techniques that predict future ARD chemistry in minesites were evaluated to identify promising techniques to implement in the proposed framework. The machine learning techniques evaluated include artificial neural networks (ANN) with multilayer perceptrons, support vector machine with polynomial and radial base function kernels, model tree with M5P algorithm, and K-nearest neighbors. The research identified key physico-chemical parameters that control drainage chemistry using linear correlation analysis. Moreover, it identified that support vector machine with polynomial kernel and radial base function kernel and artificial neural can accurately predict future drainage chemistry. Model tree and K-nearest neighbor techniques that build local linear models cannot capture the highly nonlinear process of ARD generation.  Furthermore, the research identified the sources of uncertainties in machine learning techniques to predict ARD chemistry, quantified and showed the possibility of reducing those uncertainties. Also, it identified nonlinear parameters that control ARD chemistry; and it found that the predictive accuracy of machine learning techniques for accurately predicting ARD chemistry Chapter 9  121  depends on the size of data as well as heterogeneity within the data. In general, the research showed that machine learning techniques are useful to predict future ARD chemistry in minesites.  In the proposed framework, two methodologies are developed to conduct risk assessment based on the availability of hydrogeological information. For minesites where adequate hydrogeological information is available, a methodology that simulates fate and transport of metals in the environment using the PHREEQC model, estimates the response of receptors to contaminants, and estimates the risk posed to the environment under uncertainty is applied. The case study in a minesite showed that the methodology can estimate contaminant concentration in the environment and provides reasonable risk estimates under uncertainty. It also provided conservative and non-conservative measures to assess risk, propagated and quantified the risk throughout the risk assessment processes. The presented methodology could be applied to conduct risk assessment at various mining phases in the mining industry. For minesites where hydrogeological information is limited, another methodology that simulates the fate and transport of contaminants concentration in the environment using a fugacity-based fate-and-transport model, characterizes the response of receptors to contaminants, and estimates risk under uncertainty is applied. The uncertainty associated with fate-and-transport, effect assessment and risk characterization are quantified and propagated in an integrated manner. The case study in a minesite showed that the method tends to overestimate the degree of risk because the fugacity-based model does not represent important processes in minesites such as precipitation, dissolution, desorption and others. This methodology could be used to assess risk at a preliminary level when hydrogeological information that characterizes minesites is scarce. However, the PHREEQC-based risk assessment should be used when data become available through time. This research has developed a methodology that uses MCDA to evaluate mitigation measures and identify an optimal measure to manage ARD risk deterministically and under uncertainty. It can rank alternatives deterministically and probabilistically. Also, it ranks the alternatives in the form of a cumulative distribution function that will help decision-makers rank alternatives with the required level of confidence. In addition, it can identify the impact of criteria on ranking alternatives. In general, the methodology could be used to select mitigation measures to manage the risk of ARD in the mining industry under uncertainty. Also, it could be used in other systems (e.g., contaminates sites) either in its present form or by replacing the multicriteria decision analysis method.  Chapter 9  122  9.2 Limitations and Recommendations for Future Work • The methods (IMPSE and IRMI) recommended to fill in missing values in ARD chemistry data are superior to traditional methods that introduce serious bias and loss of information, but they introduce a little bias, which could be attributed to the non-stationarity nature of the data. This limitation should be addressed in future work by developing a method based on the non-stationarity assumption.  • This research identified machine learning techniques that can predict future ARD chemistry in the mining industry. However, it is believed that the predictions of these techniques could be improved further by considering more parameters (e.g., internal gases concentrations and temperature) when data become available in the future. •  The fate-and-transport of modeling of metals using PHREEQC simulates sorption process for exchange-sites and precipitation of a few minerals. In order to simulate sorption due to surface complexation and precipitation of more minerals, experiment should be done to determine the amount of reactive minerals surface available for surface complexation and types of minerals precipitates.   • The methodology for risk management of ARD addresses uncertainties associated with criteria weight. Further work is required to consider uncertainties associated with input data (e.g., decision-makers’ preference for alternatives) and model in the future. • The proposed framework in this research has presented each methodology separately at different minesites due to the lack of complete data. Future work could be done to integrate these methodologies and develop a comprehensive standalone risk-management program. 9.3 Originality and Contributions • Missing data in large databases of ARD chemistry pose major challenge in the development of models that are used to predict ARD chemistry and evaluate mitigation measures. Often the mining industry treats missing data using traditional methods that introduce loss of information and bias. This research investigated and identified advanced methods to treat missing data in the mining industry. • The literature shows that predictions of ARD made using mechanistic or empirical models provide inaccurate results. This work evaluated various machine learning techniques and identified two techniques that simulate the nonlinear nature of ARD Chapter 9  123  generation. Moreover, the research developed a methodology to account and reduce uncertainties of machine learning techniques in predicting ARD chemistry. • Little has been done on how to assess environmental risks of ARD in the existing guidelines of the mining industry. This research developed two methodologies that help the mining industry to assess environmental risks of ARD.  • The literature shows that mitigation measures to manage risks of ARD are selected without the use of MCDA aids although humans are not capable of solving multiple objectives unaided. This work has developed a methodology to manage risks of ARD using a MCDA approach. • The issue of uncertainty is rarely addressed in the mining industry during prediction, assessment, and management of ARD risks although the guidelines emphasize the importance of addressing uncertainties. This work demonstrated probability bounds (p-boxes), fuzzy-probabilistic, probabilistic algorithms to propagate and quantify uncertainties due to data, parameters, and model throughout the process of risk assessment and management. The research also successfully addressed variability and uncertainty in risk assessment separately using probability bounds (p-boxes) and fuzzy-probabilistic algorithms. • Proposed a framework to manage risk of ARD under uncertainty and successfully applied it at minesites located in British Columbia, Canada.     124   BIBLIOGRAPHY Acuna, E. & Rodriguez, C., 2004. The Treatment of Missing Values and its Effect on Classifier Accuracy. In D. Banks et al., eds. Classification, Clustering and Data Mining Applications. Chicago: Springer Berlin Heidelberg, pp. 639–647. Allison, J., Brown, D. & Kevin, J., 1991. MINTEQA2/PRODEFA2, a geochemical assessment model for environmental systems: Version 3.0 user’s manual. Available at: http://www.epa.gov/ceampubl/mmedia/minteq/USERMANU.PDF. Allison, J.D. & Allison, T.L., 2005. Partition coefficients for metals in surface water, soil, and waste, Washington, DC. Alpers, C. & Nordstrom, D., 1999. Geochemical modeling of water-rock interactions in mining environments. In The environmental geochemistry of mineral deposits. Part A. Process, Methods, and Health Issue. Littleton, Colorado: Society of Econmic Geologists, pp. 289–323.  Appelo, C.A.J & Postma, D., 2005. Geochemistry, groundwater and pollution, Boca Raton, FL: Taylor & Francis Group. Aral, M.M., 2010. Environmental modeling and health risk analysis (ACTS/RISK), Springer. Aryafar, A., Gholami, R., Rooki, R. & Doulati Ardejani, F., 2012. Heavy metal pollution assessment using support vector machine in the Shur River, Sarcheshmeh copper mine, Iran. Environmental Earth Sciences, 67(4), pp.1191–1199.  Ascough, J.C., Maier, H.R., Ravalico, J.K. & Strudley, M.W., 2008. Future research challenges for incorporation of uncertainty in environmental and ecological decision-making. Ecological Modelling, 219(3-4), pp.383–399.  Ayyub, B. & Klir, G., 2010. Uncertainty modeling and analysis in engineering and the sciences, Boca Raton, FL: Taylor & Francis Group. Azapagic, A., 2004. Developing a framework for sustainable development indicators for the mining and minerals industry. Journal of Cleaner Production, 12(6), pp.639–662.  Batu, V., 2006. Applied flow and solute transport modeling in aquifers: fundamental principles and analytical and numerical methods, Boca Raton, FL: Taylor & Francis Group.  Baudrit, C.,  Dubois, D., Member, S. & Guyonnet, D., 2006. Joint Propagation and Exploitation of Probabilistic and Possibilistic Information in Risk Assessment. IEEE Transactions on Fuzzy Systems, 14(5), pp.593–608. Baudrit, C., Guyonnet, D. & Dubois, D., 2007. Joint propagation of variability and imprecision in assessing the risk of groundwater contamination. Journal of contaminant hydrology, 93(1-4), pp.72–84.   125  Baudrit, C., Guyonnet, D. & Dubois, D., 2005. Postprocessing the hybrid method for addressing uncertainty in risk assessments. Journal of Environmental Engineering, 131(12), pp.1750–1754. Bello, A.L., 1995. Imputation techniques in regression analysis: Looking closely at their implementation. Computational Statistics & Data Analysis, 20(1), pp.45–57.  Benjamini, Y., & Yekutieli, D., 2001. The control of the false discovery rate in multiple testing under dependency. Annals of Statistics, 29, pp. 1165–1188. Bethk C.M. and Brady P.V., 2000. How the Kd approach undermines groundwater clean up. Ground Water, 38, pp. 435-443. Betrie, G.D., Sadiq, R., Craig, N., Morin, K.A. & Tesfamariam, S., 2014a. Environmental risk assessment under uncertainty in the mining industry: A PHREEQC approach. Journal of Hazardous Materials, submitted. Betrie, G.D., Sadiq, R., Morin, K.A. & Tesfamariam, S., 2014b. Human health risk assessment: A fuzzy-probabilistic approach. Stochastic Environmental Research and Risk Assessment, submitted. Betrie, G.D., Sadiq, R., Morin, K.A. & Tesfamariam, S., 2014c. Ecological risk assessment of acid rock drainage under uncertainty. Environmental Technology and Innovation, Submitted, pp.1–19. Betrie, G.D., Sadiq, R., Morin, K.A. & Tesfamariam, S.,  2014d. Uncertainty quantification and integration of machine learning techniques for predicting acid rock drainage quality: A probability bounds approach. Science of the Total Environment, 490:182-190. Betrie, G.D., Sadiq, R., Morin, K.A. & Tesfamariam, S., 2014e. On the issue of incomplete and missing water-quality data in minesites databases: comparing three imputation methods. Mine Water and the Environment, doi :10.1007/s10230-014-0322-4. Betrie, G.D., Sadiq, R., Morin, K.A. & Tesfamariam, S., 2013a. Selection of remedial alternatives for mine sites: A multicriteria decision analysis. Journal of Environmental Management, 119, pp. 36-46. Betrie, G.D., Sadiq, R., Morin, K.A. & Tesfamariam, S., 2013b. A multicriteria decision analysis approach for selection of remedial alternatives at mine sites. In The Proceedings of the 2013 Annual International Mine Water Association Conference-Reliable Mine Water Technology, Golden, Colorado USA. Betrie, G.D., Sadiq, R., Morin, K.A. & Tesfamariam, S., 2012a. Predicting copper concentrations in acid mine drainage: a comparative analysis of five machine learning techniques. Environmental monitoring and assessment, 185(5), pp.4171–82.  Betrie, G.D. , Tesfamariam, S., Morin, K.A. & Sadiq, R.  2012b. Predicting copper concentrations in acid mine drainage: a comparative analysis of five machine learning techniques. In The Proceedings of 9th International Conference on Acid Rock Drainage (ICARD), May 20-26, 2012, Ottawa, Canada.  126  Betrie, G.D., Sadiq, R., Morin, K.A. & Tesfamariam, S. 2012c. Uncertainty analysis of an aquivalence-based fate and transport model using a hybrid fuzzy-probabilistic approach. In The Proceedings of 12th International Environmental Specialty Conference, Canadian Society for Civil Engineers, June 6-9, Edmonton, Alberta, Canada.  Betrie, G.D. Mohamed, Y.A., van Griensven, A. & Srinivasan, R., 2011. Sediment management modelling in the Blue Nile Basin using SWAT model. Hydrology and Earth System Sciences, 15(3), pp.807–818.  Bishop, C.M., 1995. Neural Networks for Pattern Recognition. Journal of the American Statistical Association, 92(440), p.1642. Blowes, D.W., Ptacek, C.J., Jambor, J.L. & Weisener, C.G., 2003. The geochemistry of acid mine drainage. Treatise on Geochemistry, 9, pp.149–204.  Bouckaert, R.R., Frank, E., Hall, M., Kirkby, R., Reutemann, P., Seewald, A. & Scuse, D., 2010. WEKA manual for version 3-6-4. Hamilton, p.300. Brans, J. & Mareschal, B., 2005. PROMETHEE methods. In J. Figueria, S. Greco, & M. Ehrgott, eds. Multiple criteria decision analysis: State of the art survey. Boston, MA: Springer Science + Business Media, Inc., pp. 164–195.  Brans, J. & Vincke, P., 1985. A preference ranking organisation method: (The PROMETHEE method for multiple criteria decision-making). Management science, 31(6), pp.647–656. Bray, J., Broady, P.A., Niyogi, D.K. & Harding, J.S., 2009. Periphyton communities in New Zealand streams impacted by acid mine drainage. Marine and Freshwater Research, (2000), pp.1084–1091. Cabaniss, S.E., 1999. Uncertainty propagation in geochemical calculations: non-linearity in solubility equilibria. Applied Geochemistry, 14(2), pp.255–262.  Caruso, B.S. Cox, T.J., Runkel, R.L., Velleux, M.L., Bencala, K.E., Julien, P.Y., Butler, B.A., Alpers, C.N., Marion, A. & Smith, K.S., 2008. Metals fate and transport modelling in streams and watersheds: state of the science and USEPA workshop review. Hydrological Processes, 4021, pp.4011–4021. CCME, 1996. A framework for ecological risk assessment at contaminated sites in Canada: General Guidance, Winnipeg, Manitoba. CCME, 2014. Canadian Environmental Quality Guidelines. Available at: http://ceqg-rcqe.ccme.ca/. Cherkassky, V., Krasnopolsky, V., Solomatine, D.P. & Valdes, J., 2006. Computational intelligence in earth sciences and environmental applications: issues and challenges. Neural networks: the official journal of the International Neural Network Society, 19(2), pp.113–21.  Cherkassky, V. & Mulier, F., 2007. Learning from data: concepts, theory, and methods Second., Hoboken, New Jersey: A John Wiley and Sons, Ltd.   127  Criscenti, L., Laniak, G. & Erikson, R., 1996. Propagation of uncertainty through geochemical code calculations. Geochimica et cosmochimica acta, 60(19), pp.3551–3568.  Cui, W., Blockley, I., 1990. Interval probability theory for evidential support. International Journal of Intelligent Systems, 5(2), pp.183–192. Denison, F.H. & Garnier-Laplace, J., 2005. The effects of database parameter uncertainty on uranium(VI) equilibrium calculations. Geochimica et Cosmochimica Acta, 69(9), pp.2183–2191.  Diamond, M.L., Mackay, D. & Welbourn, P.M., 1992. Models of multi-media partitioning of multi-species chemicals: The fugacity/aquivalence approach. Chemosphere, 25(12), pp.1907–1921.  Environmental Mining Council of British Columbia, 1995. Acid Mine Drainage: Mining and Water Pollution issues in BC. Available at http://www.miningwatch.ca/sites/www.miningwatch.ca/files/amd_0.pdf Ferson, S., Kreinovich, V., Ginzburg, L., Myers, D.S. & Sentz, K., 2003. Constructing Probability Boxes and Dempster-Shafer Structures(SAND2002-4015), New Mexico. Ferson, S. et al., 2004. Dependence in probabilistic modeling , Dempster-Shafer theory , and probability bounds analysis (SAND2004-3072), New Mexico. Ferson, S., 2001. Probability bounds analysis solves the problem of incomplete specification in probabilistic risk and safety assessments. In Y. Haimes et al., eds. Risk-Based Decisionmaking in Water Resources IX. Reston, VA: American Society of Civil Engineers, pp. 173–188. Ferson, S., 2000. RAMAS Risk Calc 4.0 Software: Risk Assessment with Uncertain Numbers, Lewis Publishers.  Ferson, S. & Ginzburg, L.R., 1996. Different methods are needed to propagate ignorance and variability. Reliability Engineering & System Safety, 54(2-3), pp.133–144.  Gaines, G.L. & Thomas, H.C., 1953. Adsorption Studies on Clay Minerals. II. A Formulation of the Thermodynamics of Exchange Adsorption. The Journal of Chemical Physics, 21(4), p.714.  Gapon, E.N., 1933. On the theory of exchange adsorption in soils. J. Gem. Chem., 3, pp.144–163. Gideon, R.A., Hollister, R.A. & Hollister, A., 1987. A Rank Correlation Coefficient Resistant to Outliers. Journal of American Statistical Association, 82(398), pp.656–666. Google, 2013. Google earth Pro. Available at: http://www.google.com/intl/en/enterprise/mapsearth/products/earthpro.html. Graham, J.W., 2012. Missing data: Analysis and Design, Springer Berlin Heidelberg. Gray, N., 1997. Environmental impact and remediation of acid mine drainage: a management problem. Environmental Geology, 30, pp.62–71.   128  Gray, N.F., 1998. Acid mine drainage composition and the implications for its impact on lotic systems. Water Research, 32(7), pp.2122–2134.  Gray, N.F., 1996. Field assessment of acid mine drainage contamination in surface and ground water. Environmental Geology, 27(4), pp.358–361. Güler, C., Thyne, G., McCray, J. & Turner, K., 2002. Evaluation of graphical and multivariate statistical methods for classification of water chemistry data. Hydrogeology Journal, 10(4), pp.455–474.  Gupta, H.V., Sorooshian, S. & Yapo, P.O., 1999. Status of Automatic Calibration for Hydrologic Models: Comparison with Multilevel Expert Calibration. Journal of Hydrologic Engineering, 4(2), pp.135–143.  Guyonnet, D. Bourgine, B., Dubois, D., Fargier, H., Colme, B. & Chilès, J., 2003. Hybrid Approach for Addressing Uncertainty in Risk Assessments. Journal of Environmental Engineering, 129(1), pp.68–78.  Guyonnet, D. & Come, B., 1999. Comparing two methods for addressing uncertainty in risk assessments. Journal of Environmental Engineering, 125(7), pp.660–666.  Hall, J.W., Blockley, D.I. & Davis, J.P., 1998. Uncertain inference using interval probability theory. International Journal of Approximate Reasoning, 19(3-4), pp.247–264.  Hall, L., Scott, M. & Killen, W., 1998. Ecological risk assessment of copper and cadmium in surface waters of Chesapeake Bay watershed. Environmental Toxicology and Chemistry, 17(6), pp.1172–1189.  Harding, J., 2005. Impacts of metals and mining on stream communities. In T. A. Morre et al., eds. Metal Contaminants in New Zealand. Christchurch, NZ: resolutionz press, pp. 343–357. Hargreaves, G., & Riley, J., 1985. Agricultural benefits for Senegal River basin. Journal of Irrigation Drainage, E-ASCE, 111, pp.113–124. Hoffman, F.O. & Hammonds, J.S., 1994. Propagation of uncertainty in risk assessments: the need to distinguish between uncertainty due to lack of knowledge and uncertainty due to variability. Risk analysis, 14(5), pp.707–12.  Hogsden, K.L. & Harding, J.S., 2012. Consequences of acid mine drainage for the structure and function of benthic stream communities: a review. Freshwater Science, 31(1), pp.108–120.  Honaker, J. & King, G., 2010. What to Do about Missing Values in Time-Series Cross-Section Data. American Journal of Political Science, 54(2), pp.561–581.  Honaker, J., King, G. & Blackwell, M., 2006. Amelia II: A program for missing data. Journal of Statistical Software, 45(7).  Hsieh, W.W., 2009. Machine learning methods in the environmental sciences, New York: Cambridge University Press.  129  Debye, P. & Hückel, E., 1923. On the theory of electrolytes. I. Freezing point depression and related phenomena. Physikalische Zeitschrift, 24(9), pp.185–206. INAP, 2012. Global Acid Rock Drainage (GARD) Guide. Available at: http://www.gardguide.com/index.php/Summary. Jambor, J., Dutrizac, J., Groat, L. & Raudsepp, M., 2002. Static tests of neutralization potentials of silicate and aluminosilicate minerals. Environmental Geology, 43(1-2), pp.1–17.  Johnson, D.B. & Hallberg, K.B., 2005. Acid mine drainage remediation options: a review. The Science of the total environment, 338(1-2), pp.3–14. Khandelwal, M. & Singh, T., 2005. Prediction of mine water quality by physical parameters. Journal of Scientific and Industrial Research, 64, pp.564–570. Khandelwal, M. & Singh, T.N., 2005. Prediction of mine water quality by physical parameters. Journal of Scientific & Industrial Research, 64(August), pp.564–570. Kiker, G., Bridges, T.S., Varghese, A., Seager, P., Thomas, P. & Linkov, Igor , 2005. Application of multicriteria decision analysis in environmental decision making. Integrated environmental assessment and management, 1(2), pp.95–108.  Klir, G., 2005. Uncertainty and information: foundations of generalized information theory, Hoboken, NJ: John Wiley & Sons. Kuipers, J.R., Maest, A.S., MacHardy, K.A. & Lawson, G., 2006. Comparison of predicted and actual water quality at hardrock mines: The reliability of predictions in environmentalimpact statements. Kuyucak, N., 1999. Acid mine drainage prevention and control options. In Mine Water and the Environment. 1999 IMWA Congress. Sevilla, Spain. International Mine Water Association, pp. 599–606. Leavitt, J.J., Howe, K.J. & Cabaniss, S.E., 2011. Equilibrium modeling of U(VI) speciation in high carbonate groundwaters: Model error and propagation of uncertainty. Applied Geochemistry, 26(12), pp.2019–2026.  Li, J., Huang, Gordon, H., Zeng, G., Maqsood, I. & Huang, Y., 2007. An integrated fuzzy-stochastic modeling approach for risk assessment of groundwater contamination. Journal of environmental management, 82(2), pp.173–88.  Ling, H., Diamond, M. & Mackay, D., 1993. Application of the QWASI Fugacity/Aquivalence Model to Assessing Sources and Fate of Contaminants in Hamilton Harbour. Journal of Great Lakes Research, 19(3), pp.582–602.  Little, R.J.A. & Rubin, D.B., 2002. Statistical Analysis with Missing Data, New York: Wiley. Lottermoser, B.G., 2010. Mine Wastes Characterization, Treatment and Environmental Impacts Third., Heidelberg Dordrecht London New York: Springer.  130  Loucks, D. & van Beek, E., 2005. Water resources systems planning and management, Paris: UNESCO. Luís, A.T., Teixeira, P., Almeida, S.F.P., Ector, L., Matos, J.X. & Ferreira da Silva, E. A., 2008. Impact of Acid Mine Drainage (AMD) on Water Quality, Stream Sediments and Periphytic Diatom Communities in the Surrounding Streams of Aljustrel Mining Area (Portugal). Water, Air, and Soil Pollution, 200(1-4), pp.147–167.  Luo, Y. & Yang, X., 2007. A multimedia environmental model of chemical distribution: fate, transport, and uncertainty analysis. Chemosphere, 66(8), pp.1396–407.  Mackay, D. & Guardo, A. Di, 1996. Evaluating the environmental fate of a variety of types of chemicals using the EQC model. Environmental Toxicology and Chemistry, 15(9), pp.1627–1637.  Mackay, D., Joy, M. & Paterson, S., 1983. A quantitative water, air, sediment interaction (QWASI) fugacity model for describing the fate of chemicals in lakes. Chemosphere, 12(7), pp.981–997.  Maest, A.S., Kuipers, J.R., Travers, C.L. & Atkins, D.A., 2005. Predicting water quality at hardrock mines: methods and models, uncertainties, and state-of-the-art. Mansour-Rezaei, S., Naser, G. & Sadiq, R., 2012. A comparison of various uncertainty models: An example of subsurface contaminant transport. Journal of Hydro-environment Research, 6(4), pp.311–321.  Matott, L.S., Babendreier, J.E. & Purucker, S.T., 2009. Evaluating uncertainty in integrated environmental models: A review of concepts and tools. Water Resources Research, 45(6), p.n/a–n/a.  Mayer, K.U., Frind, E.O. & Blowes, D.W., 2002. Multicomponent reactive transport modeling in variably saturated porous media using a generalized formulation for kinetically controlled reactions. Water Resources Research, 38(9), pp.13–1–13–21.  McDaniels, T.L., Gregory, R.S. & Fields, D., 1999. Democratizing risk management: Successfulpublic involvement in local water management decisions. Risk Analysis, 19(3), pp.497–510.  Meek, F. & Allen, J., 1994. Evaluation of acid prevention techniques used in surface mining. In Proceedings of International Land Reclamation and Mine Drainage Conference. Pittsburgh, Pennsylvania. MEND, 1999. A review of passive systems for the treatment of acid mine drainage (MEND Report 3.14.1), Canadian Mine Environment Neutral Drainage. MEND, 2000. Construction and instrumentation of a multilayer cover (MEND Report 2.22.4a), Canadian Mine Environment Neutral Drainage. Miller, G.C., Kempton, H., Figueroa, L. & Pantano, J., 2006. Engineering issue: management and treatment of water from hard rock mines (EPA/625/R-06/014), U.S. Environmental Protection Agency.  131  Mitchell, T., 1997. Machine Learning, McGraw Hill. Morgan, M. & Henrion, M., 1990. Uncertainty: a guide to dealing with uncertainty in quantitative risk and policy analysis, Cambridge, UK: Cambridge University Press. Morin, K. & Hutt, N., 1994. An empirical technique for predicting the chemistry of water seeping from mine-rock piles. In The Interantional Land Reclamation and Mine Drainage Conference on the Abatement of Acidic Drainage, Pittsburgh, PA, USA, April 24-29, 1994. Morin, K. & Hutt, N., 1993. The use of routine monitoring data for assessment and prediction of water chemistry. In Proceedings of the 17th Annual Mine Reclamation Symposium, Port Hardy, British Columbia, May 4-7. pp. 191–201. Morin, K., Hutt, N. & Aziz, M., 2010. Twenty-three years of monitoring minesite-drainage chemistry, during operation and after closure: The Equity Silver Minesite, British Columbia, Canada, Surrey. Available at: http://www.mdag.com/case_studies/MDAG-com Case Study 35 - 23 Years of Minesite-Drainage Chemistry at Equity Silver Minesite.pdf. Morin, K., Hutt, N. & Horne, I., 1995. Prediction of future water chemistry from Island Copper Mine’s on-land dumps. In Proceedings of the 19th Annual British Columbia Mine Reclamation Symposium in Dawson Creek, BC, 1995. The Technical and Research Committee on Reclamation. Morin, K.A. & Hutt, N.M., 2001a. Environmental geochemistry of minesite drainage: practical theory and case studies, Vancouver: MDAG Publishing. Morin, K.A. & Hutt, N.M., 2001b. Prediction of minesite-drainage chemistry through closure using operational monitoring data. Journal of Geochemical Exploration, 73(2), pp.123–130.  Morin, K.A., Hutt, N.M. & Aziz, M., 2012. Case Studies of Thousands of Water Analyses through Decades of Monitoring: Selected Observations from Three Minesites in British Columbia , Canada. In Proceedings of the 2012 International Conference on Acid Rock Drainage, Ottawa, Canada, May 22-24. Ottawa. Niyogi, D., Jr, W.L. & McKnight, D., 2001. Litter breakdown in mountain streams affected by mine drainage: biotic mediation of abiotic controls. Ecological Applications, 11(2), pp.506–516.  Nordstrom, D. & Campbell, K., 2014. Modeling low-temperature geochemical processes. Treatise on geochemistry, 7, pp.27–68.  Palisade Corporation Inc., 2005. Guide to using @RISK: Advanced risk analysis for spreadsheets. Parkhurst, D. & Appelo, C., 1999. User’s guide to PHREEQC (Version 2): A computer program for speciation, batch-reaction, one-dimensional transport, and inverse geochemical calculations.  Parsons, J., 1977. Effects of acid mine wastes on aquatic ecosystems. Water, Air, and Soil Pollution, 7(3).   132  Payette, S. & Delwaide, A., 2003. Shift of Conifer Boreal Forest to Lichen-Heath Parkland Caused by Successive Stand Disturbances. Ecosystems, 6(6), pp.540–550.  Pelley, J., 2007. Unreliabe water-quality forecasts at mines. Enviromental Science and Technology, 41(5), pp.1511–1512. Perkins, E.H.,  Nesbitt, H.W., Gunter, W.D., St-Arnaud, L.C. & Mycroft, J.R., 1995. Critical review of geochemical processes and geochemical models adaptable for prediction of acidic drainage from waste rock. Pitzer, K., 1973. Thermodynamics of electrolytes. I. Theoretical basis and general equations. The Journal of Physical Chemistry, 7242(6).  Price, W., 2009. Prediction manual for drainage chemistry from sulphidic geologic materials (Mend Report 1.20.1). Quinlan, J., 1992. Learning with continuous classes. In Proceedings of the 5th Australian Joint Conference on Artificial Intelligence, Singapore. pp. 343–348.  Refsgaard, J.C., van der Sluijs, J.P., Højberg, A.L. & Vanrolleghem, P.A., 2007. Uncertainty in the environmental modelling process – A framework and guidance. Environmental Modelling & Software, 22(11), pp.1543–1556.  Reich, Y., 1997. Machine Learning Techniques for Civil Engineering Problems. Computer-Aided Civil and Infrastructure Engineering, 12(4), pp.295–310.  Reich, Y. & Barai, S.V., 1999. Evaluating machine learning models for engineering problems. Artificial Intelligence in Engineering, 13(3), pp.257–272. Reshef, D.N. et al., 2011. Detecting novel associations in large data sets. Science (New York, N.Y.), 334(6062), pp.1518–24.  Rietveld, P. & Ouwersloot, H., 1992. Ordinal data in multicriteria decision making, a stochastic dominance approach to siting nuclear power plants. European Journal of Operational Research, 56(2), pp.249–262. Roman, G., Isnard, P. & Jouany, J., 1999. Critical analysis of methods for assessment of predicted no-effect concentration. Ecotoxicology and environmental safety, 43(2), pp.117–25.  Rooki, R., Doulati Ardejani, F., Aryafar, A. & Bani Asadi, A. , 2011. Prediction of heavy metals in acid mine drainage using artificial neural network from the Shur River of the Sarcheshmeh porphyry copper mine, Southeast Iran. Environmental Earth Sciences, 64(5), pp.1303–1316.  RStudio, 2012. RStudio Software. Available at: http://www.rstudio.org/. Runkel, R.L.,  Kimball, B.A., Walton-Day, K., Verplanck, P.L. & Broshears, R.E., 2012. Evaluating remedial alternatives for an acid mine drainage stream: a model post audit. Environmental science & Technology, 46(1), pp.340–7.   133  Saaty, T.L., 1982. Decision making for leaders: The analytical hierarchy process for decisions in a complex world, Belmont,Ca: Lifetime Learning. Sadiq, R., Husain, T., Veitch, B. & Bose, N., 2003. Distribution of arsenic and copper in sediment pore water: an ecological risk assessment case study for offshore drilling waste discharges. Risk analysis: an official publication of the Society for Risk Analysis, 23(6), pp.1309–21.  Sadiq, R. & Tesfamariam, S., 2009. Environmental decision-making under uncertainty using intuitionistic fuzzy analytic hierarchy process (IF-AHP). Stochastic Environmental Research and Risk Assessment, 23(1), pp.75–91.  Sanford R.F., Pierson C.T., & Crovelli R.A., 1993. An objective replacement method for censored geochemical data. Mathematical Geology, 25(1), pp. 59–80. Schafer, J.L. & Olsen, M.K., 1998. Multivariate Behavioral Multiple Imputation for Multivariate Missing-Data Problems: A Data Analyst ’ s Perspective. Multivariate Behavioral Research, 33(4), pp.545–571.  Sentz, K. & Ferson, S., 2002. Combination of Evidence in Dempster-Shafer Theory, Albuquerque, NM. Skousen, J.,  Rose, A., Geidel, G., Foreman, J., Evans, R. & Hellier, W., 1998. Handbook of technologies for avoidance and remediation of acid mine drainage, The National Mine Land Reclamation Center Located at East Virginia University in Morgantown, West Virginia. Skousen, J.G., Sexstone, A. & Ziemkiewicz, P.F., 2000. Acid mine drainage control and treatment. In R. I. Branhisel, R. G. Darmody, & W. L. Daniels, eds. Reclamation of Drastically Disturbed Lands. Madison: American Society of Agronomy, pp. 1–42. Smola, A. & Schölkopf, B., 2004. A tutorial on support vector regression. Statistics and computing, 14, pp.119–222.  Solomatine, D.P. & Ostfeld, A., 2008. Data-driven modelling: some past experiences and new approaches. Journal of Hydroinformatics, 10(1), p.3.  Solomon, K., Baker, D.B., Richards, P.R., Kenneth, R.D., Stepen, J.K., La Point, T.W., Kendall, R.J., Weisskopf, C.P., Giddings, J.M., Giesy, J.P., Hall, L.W. & Williams, W.M ., 1996. Ecological risk assessment of atrazine in North American surface waters. Environmental Toxicology and Chemistry, 15(1), pp.31–76.  Sommerfreund, J.,  Arhonditsis, G.B., Diamond, M.L., Frignani, M., Capodaglio, G., Gerino, M., Bellucci, L., Giuliani, S. & Mugnai, C., 2010. Examination of the uncertainty in contaminant fate and transport modeling: a case study in the Venice Lagoon. Ecotoxicology and environmental safety, 73(3), pp.231–9.  Soucek, D. & Cherry, D., 2000. Laboratory to field validation in an integrative assessment of an acid mine drainage–impacted watershed. Environmental Toxicology and Chemistry, 19(4), pp.1036–1043.  134  Templ, M., Kowarik, A. & Filzmoser, P., 2011. Iterative stepwise regression imputation using standard and robust methods. Computational Statistics & Data Analysis, 55(10), pp.2793–2806.  Tesfamariam, S., 2006. Possibilistic approach for consideration of uncertainties to estimate structural capacity of ageing cast iron water mains. Canadian Journal of Civil Engineering, 33, pp.1050–1064. Tesfamariam, S. & Sadiq, R., 2008. Probabilistic risk analysis using ordered weighted averaging (OWA) operators. Stochastic Environmental Research and Risk Assessment, 22(1), pp.1–15.  Tucker, W.T. & Ferson, S., 2003. Probability bounds analysis in environmental risk assessments, New York. U.S. EPA, 2000. Abandoned minesite characterization and cleanup handbook (EPA 910-B-00-001), San Francisco, CA: U.S. Environmental Protection Agency. U.S. EPA, 2006. Center for the Study of Metals in the Environment, Washington, DC: U.S. Environmental Protection Agency. U.S. EPA, 1995. Contaminants and remedial options at selected metal-contaminate sites (EPA 910-B-00-001), Washington, DC: U.S. Environmental Protection Agency. U.S. EPA, 1998. Ecological risk assessment guidance for superfund: process for designing and conducting risk assessments, Washington, DC: U.S. Environmental Protection Agency.. U.S. EPA, 1992. Framework for ecological risk assessment(EPA/630/R-92/00), Washington, DC: U.S. Environmental Protection Agency. U.S. EPA, 2007. Framework for Metals Risk Assessment (EPA 120/R-07/001), Washington, DC: U.S. Environmental Protection Agency. U.S. EPA, 1994. Technical document Acid mine draiange prediction (EPA530-R-94-036), Washington, DC: U.S. Environmental Protection Agency.  US EPA, 2014. ECOTOX | MED | US EPA. Available at: http://cfpub.epa.gov/ecotox. USEPA, 1998. Guidelines for ecological risk assessment (EPA/630/R-95/002F), Washington, DC: U.S. Environmental Protection Agency. USEPA, 2001. Risk assessment guidance for superfund: Volume III. Part A: Process for conducting probabilistic risk assessment (EPA 540-R-02-002), Washington, DC: U.S. Environmental Protection Agency.  Vanselow, A.P., 1932. Equilibria of the base exchange reactions of bentonites, permutites, soil colloid, and zeolites. Soil Sci., 33, pp.95–113. Vapnik, V., 1998. Statistical learning theory, Toronto: John Willey & Sons. Vapnik, V.N., 2000. The nature of statistical learning theory 2nd ed. M. Jordan et al., eds., New York: Springer-Verlag.  135  Verboven, S., Branden, K. Vanden & Goos, P., 2007. Sequential imputation for missing values. Computational biology and chemistry, 31(5-6), pp.320–7.  Verburg, R., Bezuidenhout, N., Chatwin, T. & Ferguson, K., 2009. The Global Acid Rock Drainage Guide (GARD Guide). Mine Water and the Environment, 28(4), pp.305–310.  Vigneault, B., Kwong, Y. & Warren, L., 2007. Assessing the long term performance of a shallow water cover to limit oxidation of reactive tailings at Louvicourt Mine (MEND report 2.12.2), Canadian Mine Environment Neutral Drainage. Walker, W., Harremoës, P., Rotmans, J., van der Sluijs, J.P., Van Asselt, M.B.A., Janssen, P., & Von Krauss, M.P.K., 2003. Defining uncertainty: a conceptual basis for uncertainty management in model-based decision support. Integrated assessment, 4(1), pp.5–17.  Walley, P., 1991. Statistical reasoning with imprecise probabilities 1st ed., London: Chapman and Hall/CRC. Wang, Y. & Witten, I., 1997. Inducing model trees for continuous classes. In Proceedings of the 9th European Conference on Machine Learning Poster Papers. Wiiliamson, R.C. & Downs, T., 1990. Probabilistic arithmetic I: numerical methods for calculating convolutions and dependency bounds. International Journal of Approximate Reasoning, 4, pp.89–158. Witten, I. & Frank, E., 2005. Data mining: practical machine learning tools and techniques, Oxford: Elsevier Inc. Yanful, E. & Simms, P., 1997. Review of water cover sites and research Projects (MEND Project 2.18.1). Zhu C., 2003. A case against Kd-based transport model: natural attenuation at a mill tailings site. Computers & Geosciences, 29(3), pp. 35-359.   136  APPENDIX A HUMAN HEALTH RISK ASSESSMENT: A FUZZY-PROBABLISTIC APPROACH  Background The mining industry plays a paramount role in economy and overall employment of Canadian (Natural Resources Canada 2013). However, this industry produces vast amount of solid waste, which may contain sulfide materials. The exposure of sulfide materials to water and oxygen can result in acid rock drainage (ARD) (Morin and Hutt 2001; Price 2009). ARD produces dissolved heavy metals (e.g., copper, zinc, and arsenic) or secondary minerals (e.g., sulfates and hydroxides). Moreover, the oxidation of some sulfide minerals produces acidity that may lower the drainage pH. This lower drainage pH could increase the rate of sulfide oxidation, solubility of many products of sulfide oxidation and rate of weathering of other minerals.  The transport of dissolved metals into the environment through water and soil poses adverse human health risks. Some risks to human health include increased chronic diseases and various types of cancers. Examples of chronic include cardiovascular diseases, reproductive impairments, allergies, liver and kidney diseases (Nriagu 1988; Hu 2002). Examples of cancers include skin, lung, kidney, breast, and bladder (Mishra et al. 2010; Hayes 1997). To prevent adverse effects of metal in human health, it is critical to conduct health risk assessment in the mining industry.  Human health risk assessment evaluates the probability of adverse health effects in human that have occurred or may occur as a result of exposure to one or more chemicals in contaminated media (US EPA 2004). It mainly consists of exposure assessment that includes fate-and-transport modeling and receptor-dose assessment, toxicity assessment, and risk characterization phases. Uncertainty analysis should be conducted in all steps of risk (Abt et al. 2010).  The sources of uncertainties in risk assessment are mainly related to parameters that are used in each step. Parameter uncertainty may originate from randomness due to natural variability resulting from heterogeneity or stochasticity and/or imprecision due to lack of information  137  resulting from systematic measurement error or expert opinions (Baudrit et al. 2006). The uncertainty related to randomness and imperfect knowledge is referred to as stochastic/aleatory uncertainty and epistemic uncertainty, respectively (Walker et al. 2003). The aleatory uncertainty is non-reducible, whereas epistemic uncertainty is reducible by more studies such as more research and data collection (Refsgaard et al. 2007).  In cases where parameters of risk assessment are represented by random and imprecise variables, two separate methods must be used to characterize uncertainties (Hoffman and Hammonds 1994; Ferson and Ginzburg 1996). However, parameter uncertainty is rarely recognized or addressed using a probabilistic approach in the mining industry (Maest et al. 2005). The use of probabilistic distribution functions (PDFs) to represent parameters involve collecting sufficient number of site specific measurements and fitting the theoretical distribution to a histogram of measured data (Guyonnet et al. 2003). In a pre-mining stage where data are scarce, it is practically impossible to collect all site-specific parameters information due to financial and time constraints. The use of PDFs, without justification by available information, may lead to a non-conservative estimation of risk (Guyonnet and Come 1999). In this study, a methodology of human health risk assessment using a fuzzy-probabilistic approach is presented for the mining industry.  Kwakernaak presented the concepts and algorithm of fuzzy random variables (Kwakernaak 1978; Kwakernaak 1979). Guyonnet et al. (2003) proposed a fuzzy-probabilistic approach that is based on Monte Carlo and fuzz calculus and applied it for exposure assessment via vegetation consumption in a cadmium-contaminated soil. Kentel and Aral compared fuzzy-probabilistic with 2D Monte Carlo for risk characterization (Kentel and Aral 2005). Zhang et al. (2011) applied fuzzy-probabilistic for human health risk assessment in a petroleum-contaminated site. In this study, fuzzy-probabilistic is used to estimate and propagate uncertainties in each step of human health risk assessment in the mining industry.  138  The remainder of this paper is organized as follows. In the next section, the methodology is presented. The Case Study section presents the demonstration of the proposed methodology at a minesite. The results and conclusion are presented in the last two sections. Methodology Human health risk assessment evaluates the probability of adverse health effects in human that have occurred or may occur as a result of exposure to one or more chemicals in contaminated media (US EPA 2004). The proposed methodology of human health risk assessment using a fuzzy-probabilistic approach is shown in Figure 1. It consists of fuzzy-probabilistic, fate-and-transport modeling receptor-dose assessment, toxicity assessment, and risk characterization steps.   Fuzzy-Probabilistic  In fuzzy-probabilistic, uncertainty analysis of parameters is estimated and propagated in the risk assessment process. The fuzzy-probabilistic approach is an integration of fuzzy set and Monte Carlo approaches. A fuzzy set A  in X  is characterized by the membership function Am , which takes value in the interval zero to one. The fuzzy set theory is a fundamental of possibilistic distribution (fuzzy number). A fuzzy number describes the relationship between an uncertain quantity X  and a membership functionm . This function, which ranges between zero and one, describes the possibility that the quantity X  may take on a certain value x  (Zadeh 1978). A fuzzy set is considered as a fuzzy number if it is normal, convex, there is exactly one x∗  element of real number with ( ) 1A xm∗ = , and the membership function ( )A xm , where x   is an element of real numbers, is at least piecewise continuous (Hanss 2005). An important concept of fuzzy sets is the concept of an cutα −  level (Figure 2), Aα . Aα is a subset of A that represents a cut on a fuzzy set A  and consists of an interval for which membership grades in A  are greater than or equal to the specified value of α . The membership value of a real number reflects the “likeliness” of occurrence of that number; the alpha-cut level reflects the different sets of numbers with a given minimum likeliness (Kaufmann and Gupta 1991).  139  In a Monte Carlo approach, parameters are represented in a PDF that defines the range of values and their likelihood (USEPA 2001). A random sampling is used to sample a set of parameters several times. When a sufficient number of samples is used, the variance of the output reflects the combined impact of the variances in the parameters as the uncertainty is propagated through a model.  140   Figure 1: The proposed methodology of human health risk assessment  DEFINE α-cut LEVEL TO CUT THE FUZZY  PARAMETERS M TIMES TO FIND A1 … AMRUN THE FATE-AND-TRANSPORT MODEL USING A1 AND B1…BNSAMPLE THE RANDOM PARAMETERS N TIMES TO FIND B1…BNCONSTRUCT THE PREDICTED CDFs FOR EACH α-cut GENERATE THE FUZZY CONCENTRATION FOR SELECTED PERCENTILEIS IT  THE AM VALUE?NOTOXCITY ASSESSMENT YESCHARACTERIZE RISK RECEPTOR DOSE ASSESSMENT STARTEND 141   Figure 2: Illustration for concept of cutα −  Fate-and-Transport Modeling In fate-and-transport modeling, exposure concentration of contaminant in environmental media is estimated. A fugacity/aquivalence-based model is a fate-and-transport model that is used to estimate exposure concentration of chemical in multimedia based on transport and transformation processes. According to  Mackay et al. (1983), this model uses aquivalence ( 3, /A mol m ) as the controlling variable instead of using concentration ( 3/mol m ). A linear relationship existing between these quantities (i.e., 'C A Z= ⋅ ). 'Z is the aquivalence capacity, which depends on the characteristics of the chemical, the medium, and temperature. The aquivalence value of water ('wZ ) is actually defined as 1.0 and the values for other media are obtained by multiplying 'wZ  by partition coefficients of particular medium. Rates of diverse processes ( /mol h ) such as wet and dry deposition from the atmosphere, a chemical transformation, and re-suspension of bottom sediments can be expressed as the product of an aquivalence and a transport or transformation parameters ( 3, /D m h ). For chemical transport by advection flow D  is expressed as 'GZ  where G is a mass phase flow rate (m3/h. For chemical transport by diffusion, D  is given as 'UAZ , where U ( /m h  ) is a mass transfer coefficient, and A  ( 2m ) is area. For chemical transformation, D  is equal to 'VKZ , where V ( 3m ) is the compartment volume, and K  ( 1h− ) is a first order rate constant.   142  Receptor-Dose Assessment In receptor-dose assessment, the actual dose of contaminants to the exposed population is estimated. The receptor-dose equation shown below (USEPA 2001) is used to estimate receptor-dose (chronic daily intake, mg kg day− ). This equation consists of fish ingestion rate ( FIR , / .mg kg day ), exposure concentration ( C , / )mg L that is obtained from fate-and-transport modeling, bioconcentration factor ( BCF , /L kg ), lipid content of fish ( ,%LC ), exposure frequency ( EF , /day year ), exposure duration ( ED , year ), and average life time exposure (,AT days  kg ).  ( )FIR C BCF LC EF EDCDIAT× × × × ×=  Toxicity Assessment Toxicity assessment estimates the adverse health effects using dose-response relationships. Dose response relationships could be expressed as reference dose (RFD) and slope factors (SF) for non-carcinogenic and carcinogenic chemicals, respectively. RFD is a daily dose that is believed not to cause any adverse health effect. SF ( 1( / )mg kg day −− ) is a 95% confidence limit of the increased cancer risk from a lifetime exposure to an agent by ingestion (USEPA 1997).  Risk Characterization Risk characterization estimates non-carcinogenic and carcinogenic risks. Risk characterization is conducted by comparing the fuzzy hazard quotient (HQ) with a threshold value of one using possibility theory. HI is the ratio of CDI and RFD ( /HQ CDI RFD=  ). A threshold value one indicates an acceptable level of intake that is equal to the RFD value. Possibility theory represents a state of knowledge about an issue, distinguishes between what is “plausible” and what is “surprising” and what is “expected” (Dubois and Henri 1990). In possibility theory, the fuzzy hazard index HI  is lower or equal to the compliance guideline T  is defined as HI T≤  (Guyonnet and Come 1999). The possibility theory describes uncertainty using a possibility  143  measure (Π ) and a necessity measure ( N ) (Dubois and Henri 1990). The possibility and necessity measures that the proposition A C≤  is defined by the following relationships:  uu( ) Sup min[ (u), (u)]( ) max[1 (u), (u)]HI THI THI TN HI T Infm mm mΠ ≤ =≤ = −                             where HIm  is membership function of HI  for any value u ; Tm  is the membership function of T  for any value u ; uSup  and uInf  are largest and smallest value, respectively; and min  and max  are minimum and maximum operators, respectively. Note that 1Cm =  if u T≤  and 0Cm =  if u T> . Guyonnet and Come (1999) defined five conditions based on the possibility and necessity measures  for fuzzy number HI T≤  as shown below: • The proposition is necessarily true if 1, 1NΠ = =  ; • The proposition is possibly true with necessity measure 1-α if 1, 1N αΠ = = − ; • The proposition is necessarily true with necessity measure zero if 1, 0NΠ = = ; • The proposition is necessarily true with a possibility measure of β if , 0NβΠ = = ; and • The proposition is necessarily false if 0, 0NΠ = = . Case Study In this study, a steady-state aquivalence based model is developed to simulate the fate of heavy metals in soil and water media from a mine site. The mine system consists of waste rocks, clay till and an aquifer. A conceptual model showing different mechanisms of transport in two compartments is shown in Figure 3. It was assumed that the waste rocks is source of copper metal emission ( sE ) into the clay till. Intermedia transport, water-to-soil and soil-to-water mass flow through diffusion and advection were considered. The diffusion of copper from water to the clay till ( WSD ) and soil to water ( SWD ) are calculated based on mass transport coefficient. The advection through the clay till ( ASD ) and the aquifer ( AWD ) values are calculated based on infiltration and groundwater flow rates, respectively. Eventually, copper is transported from the  144  groundwater to an adjust lake through advection and taken up by fish. Mine workers who live around this lake are exposed to copper through ingestion of the fish.  An algorithm for calculating the metal concentration in soil and water is presented in Table 2. This algorithm is programmed in terms of fuzzy arithmetic operation as described in Ross ( 2009) and Monte Carlo algorithm.   Figure 3: Schematic representation of the conceptual model  The input parameters (i.e., physico-chemical characteristics of chemicals and the multimedia environment) of the aquivalence-based model are provided in Table 3. Clay till thickness ( sd ) of 8 m and groundwater thickness ( gwd ) of 712 m at a mine site were used. The clay till overlies the aquifer that is considered as homogenous and well mixed. The aquifer has a length and width of 1245 m. It is assumed that infiltration through the clay till ( sG ) is gravity driven and it transports the copper concentrations to the aquifer. The soil-water mass transport coefficient ( 11U ) is obtained from Mackay and Guardo (1996). The concentration in the waste rocks is equal to aqueous solubility of copper and it migrates vertically through the clay till by advection and ldsTailingsGroundwaterDAS DSWDWSDAWESClay tilldgwInfiltration 145  diffusion. The mass flux at the clay-aquifer interface mixes homogenously over a given aquifer thickness. Copper also migrates from the aquifer to the surrounding lake by advection flow ( gwG).  Table 2. Algorithm for computing concentration of copper in the clay till and aquifer Steps Parameters and Equations Define physico-chemical properties of chemicals.  Partitioning with soil ( dK )  Define multimedia physical properties.  Length ( l ), width ( w ), the depths of aquifer (gwd ) and clay till ( sd ) ,and others.  Define advection velocities in clay till-aquifer system.  gwG  and sG    Calculate the aquivalence capacity ( 'Z ) of all compartments.  Z'w = 1,     Z's = ' 's w dZ Z K=     Calculate the advection D-values.  Advection  'AW w gwD Z G=  and  'AS s sD Z G=  Calculate the diffusion D-values A is the area of clay-aquifer interface.  '11SW sD AU Z=  and '11WS wD AU Z=  11U  is mass transfer coefficient  Define pollutant loads by direct emissions in soil CuC  and MW  are copper emissions and molecular weight MWGCE SCuS×=  Calculate aquivalence water ( wA ) and clay ( sA)   AWWSSW DDEA−=      and )(*)( SWASAWWSAWSS DDDDDEA+−=  Calculate concentrations in clay till ( SC , /mg L )  and aquifer ( , /WC mg L  ).  CW = MW × Z'W × AW  CS = MW × Z'S × AS     Table 3 also shows the probabilistic or possibilistic distributions of input parameters together with their values. The probabilistic distributions of the random parameter are determined by  146  fitting the concentrations of copper using @Risk Software (Palisade Corporation 2010) and from literature for partitioning coefficients of copper (Allison and Allison 2005). The triangular distribution type of the fuzzy parameters is subjectively selected and their values are obtained from studies of the mine site. The developed integrated model was used to simulate two scenarios (i) Scenario 1 represents current conditions and (ii) Scenario 2 represents an introduction of a water cover system. A water cover system reduces the transport of copper concentration by 99% (Vigneault et al. 2007).  Table 3. Input parameters of an aquivalence-based model Parameter Symbol Units Variable Type Distribution type Molecular weight MW g/mol constant 63.6 Half-lives in water H1/2-water h constant 1E+40 Half-lives in soil H1/2-soil h constant 1E+40 Aquivalence capacity Z'w  constant 1 Width w m constant 1245 Length l m constant 1245 Thickness of groundwater dgw m constant 712 Thickness of soil ds m constant 8 Soil-water mass transport U11 m/h constant 1.0E-05 Density of solids ρs kg/m3 fuzzy triangular~   (2600,2740,2800) Advection of soil Gs m3/hr fuzzy triangular~(2,10,18) Advection of groundwater Ggw m3/hr fuzzy triangular~(0.01,0.3,0.6) Partition coefficient of Cu log Kd l/kg random lognormal~(2.5,0.6) Copper concentration for scenario 1 CCu mg/l random lognormal~(8.9,1.56) Copper concentration for scenario 2 Ccu mg/l random lognormal~(0.089,0.0156)  In this study, the exposure concentration is estimated using the aquivalence model and this value is used in the receptor-dose equation to estimate the values of CDI. In this study, FIR is  147  equal to 3,888 / .mg kg day is used based on USEPA estimation (USEPA 1997). FIR refers to the amount of fish “as consumed” by the general public. BCF is the ratio of concentration in fish tissue (mg/kg) to the concentration in water (mg/L). BCF of copper is equal to 300 /L kg  is used as reported by Conklin et al. (1987). Lipid content of fish denotes the percentage of fat available to attach chemicals in the fish body. An LC value of 4.56 is used in this study based on USEPA recommendation (USEPA 1997). ED refers to household residence time that would be supplied with contaminated fish. A value of 30 years is used based on the history of mine operation. An exposure frequency of 365 days/yr and average time of 25,550 days are used based on the recommendation of USEPA (USEPA 1997). Information of dose response is obtained from the US DOE (Department of Energy) website (USDOE). For copper, only the RFD value is reported, which is equal to 0.04 mg/kg-day.  Results  The comparison of copper concentrations in groundwater under Scenario 1 and Scenario 2 is presented in Figure 4. Figure 4 depicts cumulative distribution functions of copper concentration in groundwater at cutα −  0, 0.5, and 1 for Scenarios 1 and 2. The shapes of lower distribution (LD) and upper distribution (UD) at cutα −  0, cutα −  0.5, and  cutα −  1 indicate these distributions have randomness. The horizontal distance between LD and UD at  cutα −  0 shows the uncertainty due to fuzziness. The LD and UD at cutα −  0 provide possible values of copper concentration in groundwater at a given percentile level. For example, the copper concentration could ranges between 0.44 mg/L and 4 mg/L in the groundwater under Scenario 1 at 60% percentile level as shown in Figure 4 (top). For Scenario 2, the possible values of copper concentration at 60 percentile range between 0.004 mg/L and 0.04 mg/L. The distribution at cutα −  1 provides the most likely value of copper concentration in groundwater at a given percentile level. For example, the most likely value of copper concentration in groundwater under Scenario 1 is equal to 2.22 mg/L at 60th percentile level. For Scenario 2, the most likely value of copper concentration is equal to 0.022 mg/L as shown in Figure 4 (bottom).   148  Fuzzy concentrations of copper in groundwater can be extracted at each percentile level from Figure 4. For example, the fuzzy concentrations of copper in groundwater at the 50th, 75th and 95th percentiles are presented in Figure 5. For any fuzzy number, the membership at zero shows that the possible values of copper concentration in groundwater at a given percentile. The membership at one shows the most likely value of copper concentration in the groundwater. For the 95th percentile under Scenario 1, the possible values of fuzzy concentration in groundwater range between 0.56 and 5.07 /mg L  and with a most likely value of 2.83 mg/L. The values outside this range are not a member of the fuzzy number at the 95th percentile. For Scenarios 1 and 2, the possible and most likely values of copper concentration increase as the percentile level increase from 50th to 95th. The comparison of the fuzzy concentration for the two scenarios reveals that the introduction of Scenario 2 reduced the transport of copper by two orders of magnitudes at each percentile. For instance, the possible concentration of copper has reduced from 0.56 - 5.07 /mg L  (i.e., Scenario 1) to 0.005-0.051 /mg L  (i.e., Scenario 2) for fuzzy numbers at the 95th percentile level.  The results of risk characterization are shown in Figure 6. This figure shows that the fuzzy HQs for the two scenarios and compare them with a compliance value. The values of HQ for the 50th, 75th, and 95th percentiles under Scenario 1 are equal to 6.39-58.08, 7.19-65.23, and 8.48-77.1, respectively. On the other hand, the values of HQ for the 50th, 75th, and 95th percentiles under Scenario 2 are equal 0.24-2.17, 0.27-2.46, and 0.32-2.91, respectively. In this study, the compliance value is defined as “the fuzzy HQ should not exceed 1 with possibility and necessity measures of 0.6.” However, it is worth noting that this value should be defined by decision-makers.    149   Figure 4. Cumulative distribution of copper in groundwater for Scenarios 1 and 2   Figure 5. Membership functions of copper in groundwater at the 50th, 75th and 95th percentiles for Scenario 1 and 2  150  For Scenario 1, the above proposition that the fuzzy HQ T≤  is false with possibility and necessity measures of 0 for all percentiles of fuzzy HQ. For example, HQ values of 50th percentile, which ranges between 6.39 and 58.08, exceed the compliance guideline that is equal to 1. For Scenario 2, different results are obtained for each percentile. For the 50th percentile, the above proposition is true with possibility measure of 60% since HQ (equal to0.79) is less than 1, whereas the above proposition is false with necessity measure of 60% since HQ (equal to1.8) exceeds 1. For the 75th percentile, the above proposition is true with possibility measure of 60% since HQ (equal to 0.8) does not exceed 1, whereas the above proposition is false with necessity measure of 60% since HQ (equal to 2.01) exceeds 1. For the 95th percentile, the above proposition is false with possibility and necessity measure of 60% since the values of HQ are equal to 1.1 and 2.4 exceed 1.    Figure 6. Risk characterization for Scenarios 1 and 2   151  Often decision-makers will choose which percentile and fuzzy measure to use to estimate the fuzzy hazard quotient and to decide whether the risk is acceptable or not, respectively. For instance, consider the decision-makers decided to use 75th percentile fuzzy hazard index. If the decision-makers are optimistic (non-conservative), then they may decide to use the possibility measure. For 75th percentile fuzzy hazard index and using the possibility measure, the decision makers would definitely accept the risk. However, the risk would not be acceptable if the decision-makers are pessimistic (conservative) and use the necessity measure. When an analyst is asked to report whether the risk is acceptable or not, it is recommended to use the necessity measure (i.e., to be conservative) (Guyonnet and Come 1999;, Kentel and Aral 2005). For this case study, the risk is not acceptable for the 50th, 75th, and 95th percentiles of fuzzy HQs.  Summary and Conclusions In this study, a methodology of human health risk assessment based on a fuzzy-probabilistic approach is presented for the mining industry.  This methodology consists of fuzzy-probabilistic, receptor-dose assessment, toxicity assessment and risk characterization. The fuzzy-probabilistic approach is used to estimate and propagate uncertainty in each steps of risk assessment. Exposure concentration in the environment is estimated using a fate-and-transport model. The exposure concentration is used to estimate dose of contaminants in mine workers using the receptor-dose equation. The toxicity assessment is conducted for non-carcinogenic risk using reference dose that is obtained from a toxicity database. The risk characterization was conducted by calculating a fuzzy hazard quotient (HQ) and comparing it with a compliance value using possibility theory.  The prediction results of the aquivalence model showed a in terms of cumulative distribution functions (CDFs) and fuzzy number for a given percentile level. The CDFs results show the magnitude of the predictive uncertainties and sources of input uncertainties (i.e., randomness or fuzziness). The information on sources of uncertainties will help modelers and decision makers to determine needs for further research and data collection activities to reduce parameter uncertainty. The results of risk characterization provide decision-makers with possibility and  152  necessity measures instead of one measure (i.e., most likely values) to characterize risks. For the minesite studied, the results of the case study showed that the risk may not be acceptable for both scenarios using the necessity measure, which is s conservative measure.  The fuzzy-probabilistic approach provided results that are impossible to obtain using individual fuzzy and probabilistic approaches. However, this approach does not provide reasonable results in case a model has complex equations (e.g., subtraction operation in a denominator) and repeated use of a fuzzy number in the equations. This limitation could be attributed to fuzzy interval arithmetic and it may be improved if advanced fuzzy arithmetic such as the transformation method is used instead of standard fuzzy arithmetic implemented in this study. The methodology presented in this study could be used for risk assessment and make informed decision for sustainability of the mining industry. Biblography Abt E, Rodricks J V, Levy JI, et al. (2010) Science and decisions: advancing risk assessment. Risk Anal 30:1028–36. doi: 10.1111/j.1539-6924.2010.01426.x Allison JD, Allison TL (2005) Partition coefficients for metals in surface water, soil, and waste. Washington, DC Baudrit C, Dubois D, Member S, Guyonnet D (2006) Joint Propagation and Exploitation of Probabilistic and Possibilistic Information in Risk Assessment. IEEE Trans FUZZY Syst 14:593–608. Conklin R, Galles S, Shane E, Tondreau R (1987) Bioconcentration of heavy metals (Cu,Cr,Pb, and Zn) in the Missouri river near Sioux city, Iowa. US Army Corps Eng.  Dubois D, Henri P (1990) Possibility theory: An approach to computerized processing of uncertainty, First. Plenum Press, New York and London Ferson S, Ginzburg LR (1996) Different methods are needed to propagate ignorance and variability. Reliab Eng Syst Saf 54:133–144. doi: 10.1016/S0951-8320(96)00071-3 Guyonnet D, Bourgine B, Dubois D, et al. (2003) Hybrid Approach for Addressing Uncertainty in Risk Assessments. J Environ Eng 129:68–78. doi: 10.1061/(ASCE)0733-9372(2003)129:1(68) Guyonnet D, Come B (1999) Comparing two methods for addressing uncertainty in risk assessments. J Environ Eng 660–666. Hanss M (2005) Applied fuzzy arithmetic: An introduction with engineering application. Springer  153  Hayes RB (1997) The carcinogenicity of metals in humans. Cancer Causes Control 8:371–85. Hoffman FO, Hammonds JS (1994) Propagation of uncertainty in risk assessments: the need to distinguish between uncertainty due to lack of knowledge and uncertainty due to variability. Risk Anal 14:707–12. Hu H (2002) Human health and heavy metals exposure. Life Support Environ. Hum. Heal.  Kaufmann A, Gupta MM (1991) Introduction to fuzzy arithmetic: theory and applications. 361. Kentel E, Aral MM (2005) 2D Monte Carlo versus 2D Fuzzy Monte Carlo health risk assessment. Stoch Environ Res Risk Assess 19:86–96. doi: 10.1007/s00477-004-0209-1 Kwakernaak H (1978) Fuzzy random variables—I. Definitions and theorems. Inf Sci (Ny) 15:1–29. Kwakernaak H (1979) Fuzzy random variables—II. Algorithms and examples for the discrete case. Inf Sci (Ny) 278:253–278. Mackay D, Guardo A Di (1996) Evaluating the environmental fate of a variety of types of chemicals using the EQC model. Environ Toxicol Chem 15:1627–1637. Mackay D, Joy M, Paterson S (1983) A quantitative water, air, sediment interaction (QWASI) fugacity model for describing the fate of chemicals in lakes. Chemosphere 12:981–997. Maest AS, Kuipers JR, Travers CL, Atkins DA (2005) Predicting water quality at hardrock mines: methods and models, uncertainties, and state-of-the-art. 77. Mishra S, Dwivedi SP, Singh RB (2010) A Review on Epigenetic Effect of Heavy Metal Carcinogens on Human Health. Open Nutraceuticals J 3:188–193. doi: 10.2174/18763960010030100188 Morin KA, Hutt NM (2001) Environmental geochemistry of minesite drainage: practical theory and case studies. MDAG Publishing, Vancouver Natural Resources Canada (2013) Publications and Reports @ NRCan - Additional Statistics - Minerals and Metals. http://www.nrcan.gc.ca/publications/statistics-facts/1245#sec1. Accessed 17 May 2014 Nriagu JO (1988) A silent epidemic of environmental metal poisoning? Environ Pollut 50:139–61. Palisade Corporation (2010) Guide to Using @RISK. 14850: Price W (2009) Prediction manual for drainage chemistry from sulphidic geologic materials. Mend Rep 1201 579. Refsgaard JC, van der Sluijs JP, Højberg AL, Vanrolleghem P a. (2007) Uncertainty in the environmental modelling process – A framework and guidance. Environ Model Softw 22:1543–1556. doi: 10.1016/j.envsoft.2007.02.004  154  Ross T (2009) Fuzzy logic with engineering applications, Third. A John Wiley and Sons, Ltd US EPA (2004) Risk Assessment Principles and Practices. Washington, DC USDOE The Risk Assessment Information System. http://rais.ornl.gov/cgi-bin/tools/TOX_search. Accessed 10 Jun 2014 USEPA (2001) Risk assessment guidance for superfund: Volume III. Part A: Process for conducting probabilistic risk assessment. III: USEPA (1997) Exposure factors handbook. U.S. Environmental Protection Agency, Washington, DC Vigneault B, Kwong Y, Warren L (2007) Assessing the long term performance of a shallow water cover to limit oxidation of reactive tailings at Louvicourt Mine. MEND Rep. 2.12.2  Walker W, Harremoës P, Rotmans J, et al. (2003) Defining uncertainty: a conceptual basis for uncertainty management in model-based decision support. Integr Assess 4:5–17. Zadeh L (1978) Fuzzy sets as a basis for a theory of possibility. Fuzzy sets Syst 9–34.    

Cite

Citation Scheme:

        

Citations by CSL (citeproc-js)

Usage Statistics

Share

Embed

Customize your widget with the following options, then copy and paste the code below into the HTML of your page to embed this item in your website.
                        
                            <div id="ubcOpenCollectionsWidgetDisplay">
                            <script id="ubcOpenCollectionsWidget"
                            src="{[{embed.src}]}"
                            data-item="{[{embed.item}]}"
                            data-collection="{[{embed.collection}]}"
                            data-metadata="{[{embed.showMetadata}]}"
                            data-width="{[{embed.width}]}"
                            async >
                            </script>
                            </div>
                        
                    
IIIF logo Our image viewer uses the IIIF 2.0 standard. To load this item in other compatible viewers, use this url:
http://iiif.library.ubc.ca/presentation/dsp.24.1-0074404/manifest

Comment

Related Items