THE WORTH OF DATA IN PREDICTING AQUITARD CONTINUITY IN HYDROGEOLOGICAL DESIGN By BRUCE RENNIE JAMES B.A.Sc, University of Waterloo, 1986 A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY in THE FACULTY OF GRADUATE STUDIES (Department of Geological Sciences) We accept this thesis as conforming to the required standard THE UNIVERSITY OF BRITISH COLUMBIA April 1992 Â® Bruce Rennie James, 1992 In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission. Department The University of British Columbia Vancouver, Canada DE-6 (2/88) ABSTRACT A Bayesian decision framework is developed for addressing data worth questions for hydrogeological design in heterogeneous geological environments. This framework is developed specifically for aiding hydrogeologists, dealing with groundwater contamination, in the design of exploration programs searching for aquitard discontinuities. It can be used to evaluate, and compare, the cost effectiveness of (a) patterns of precise point measurements (e.g. boreholes), which are often expensive, and (b) areal geophysical surveys which are imprecise, but usually less expensive. The framework consists of two basic modules: a geostatistical indicator algorithm for simulating aquitard heterogeneity and a numerical model for simulating contaminant transport. Bayesian decision analysis ties these two modules together. The Bayesian nature of the framework also provides a methodology for combining a conceptual understanding of the local geology with quantitative information. Indicator geostatistics allows the handling of hydrogeological parameters which behave in space as non-Gaussian random variables. The estimated worth of a measurement was found to be particularly sensitive to economic parameters. It was less sensitive to hydrogeological and geostatistical parameters. The framework was applied in a retrospective fashion to the design of a remediation program for soil contaminated by radioactive waste disposal at the Savannah River Site, in South Carolina. The cost effectiveness of different patterns of point measurements was studied. This study included determining the number and spacing of the most cost effective pattern. Contour maps were produced of the net worth of a single, point measurement. These contour maps can be used to design sequential sampling programs involving single or multiple measurements. Good potential was also shown for determining the cost effectiveness of an areal geophysical survey. The net worth of patterns of precise, point measurements was compared to that of an imprecise, areal seismic survey. These results indicate that the framework can be very valuable in determining if additional exploration is cost effective and in designing efficient exploration programs. Results also show that ignoring a conceptual understanding of geology can lead to erroneous data worth analysis. The framework could be modified to handle other data worth questions in hydrogeology or other disciplines, such as mining, or petroleum reservoir engineering. TABLE OF CONTENTS ABSTRACT i i LIST OF TABLES vii LIST OF FIGURES viii ACKNOWLEDGEMENTS xi CHAPTER 1: INTRODUCTION 1 1.1 HYDROGEOLOGICAL DESIGN AND GEOLOGICAL HETEROGENEITY 2 1.2 PREVIOUS WORK 3 1.3 APPROACH AND CONTRIBUTION OF THESIS 6 CHAPTER 2: A N OVERVIEW OF BAYESIAN DECISION ANALYSIS 10 2.1 INTRODUCTION 10 2.2 PRIOR ANALYSIS 10 2.3 POSTERIOR ANALYSIS 15 2.4 DATA WORTH THROUGH PREPOSTERIOR ANALYSIS 19 2.4.1 EXPECTED VALUE OF PERFECT INFORMATION (EVPI) 20 2.4.2 EXPECTED VALUE OF SAMPLE INFORMATION (EVSI) 22 2.4.2.1 Method 1: The EVSI from the Expected Increase in the Maximum Objective Function 22 2.4.2.2 Method 2: The EVSI from Expected Reduction in Minimum Expected Regret 24 2.4.2.3 Method 3: Expected Posterior Worth of a Measurement 26 2.5 APPLICATION AND EXTENSION OF THE METHODS 29 2.6 NOTATION 29 CHAPTER 3: GEOLOGICAL PREDICTION OF AQUITARD CONTINUITY 33 3.1 INTRODUCTION 33 3.2 THE EFFECT OF DEPOSITIONAL ENVIRONMENT ON THE GEOMETRIC CHARACTERISTICS OF C L A Y LAYERS 33 3.2.1 BRAIDED STREAM 33 3.2.2 MEANDERING STREAM 34 3.2.3 DELTAIC 35 3.2.4 ESTUARINE 36 3.2.5 SUBMARINE FAN 37 3.2.6 MARINE 37 3.2.7 GLACIAL 37 3.3 QUANTITATIVE DESCRIPTION OF SHALE HETEROGENEITY 39 3.4 THE IMPORTANCE OF GEOLOGY IN PREDICTING AQUITARD CONTINUITY 40 CHAPTER 4: SEQUENTIAL INDICATOR SIMULATIONS OF AQUITARDS 45 4.1 INTRODUCTION 45 4.2 BASIC GEOSTATISTICAL CONCEPTS 45 4.3 INDICATOR KRIGING (IK) 50 4.3.1 INDICATOR RANDOM VARIABLE 50 4.3.2 TYPES OF DATA 52 4.3.3 SIMPLE INDICATOR KRIGING (SIK) WITH HARD DATA 53 4.3.4 ORDINARY INDICATOR KRIGING (OIK) WITH HARD DATA 54 4.3.5 COMPARISON OF OIK AND S K 55 4.3.6 EXTENSION OF SIMPLE INDICATOR KRIGING TO HANDLE TYPE (a) SOFT DATA 57 4.4 SEQUENTIAL INDICATOR SIMULATION ALGORITHM (SIS) 58 4.5 INFERENCE OF GEOSTATISTICAL PARAMETERS 61 4.5.1 NON-BAYESIAN INFERENCE 61 4.5.1.1 Inference of the Mean and Variance 61 4.5.1.2 Inference of Cov(h) 65 4.5.2 BAYESIAN INFERENCE 68 4.5.3 SUMMARY OF INFERENCE OF GEOSTATISTICAL PARAMETERS 76 4.6 LIMITATIONS OF SIS ALGORITHM 77 4.6.1 GEOSTATISTICAL ASSUMPTIONS 78 4.6.2 FINITE SIZE OF AQUITARD REALIZATIONS 79 4.6.3 VOLUME VARIANCE RELATIONSHIP 79 4.6.4 TYPES OF UNCERTAINTY HANDLED 80 4.6.5 NUMBER OF CONDITIONING DATA 81 4.7 ALTERNATIVE SIMULATION ALGORITHMS 81 4.8 GEOSTATISTICAL NATURE OF C L A Y LAYERS 83 4.9 NOTATION 83 CHAPTER 5: CONTAMINANT TRANSPORT 86 5.1 INTRODUCTION 86 5.2 TRANSPORT PROCESSES 86 5.2.1 ADVECTION 87 5.2.1 HYDRODYNAMIC DISPERSION 88 5.2.3 IMPORTANCE OF ADVECTION 90 5.3 MODELING OF CONTAMINANT TRANSPORT 90 5.3.1 2-D MODELING OF SOLUTE TRANSPORT 91 5.3.2 3-D MODELING OF SOLUTE TRANSPORT 94 5.4 NOTATION 95 CHAPTER 6: OUTLINE OF FRAMEWORK 96 6.1 INTRODUCTION 96 6.2 MEASURING AQUITARD DISCONTINUITIES 96 6.3 EVALUATING THE WORTH OF HARD, POINT MEASUREMENTS 99 6.3.1 EXAMPLE DESIGN 99 6.3.2 TWO METHODS OF GENERATING REALIZATIONS FOR THE FRAMEWORK 101 6.3.2.1 Realizations Generated During Both Prior and Preposterior Analysis (Abandoned Approach) 102 6.3.2.1.1 The Prior Analysis 102 6.3.2.1.2 Preposterior Analysis 102 6.3.2.2 Realizations Generated During the Preposterior Analysis (Method Used in Thesis) 107 6.4 SENSITIVITY OF ESTIMATED WORTH OF SINGLE HARD MEASUREMENT TO NUMERICAL ARTIFACTS 110 6.4.1 THE NUMBER OF REALIZATIONS GENERATED PER SAMPLE OUTCOME I l l 6.4.2 STARTING SEED USED IN RANDOM NUMBER GENERATOR 112 6.4.3 NUMBER OF FIRST AQUITARD BLOCK GENERATED 112 6.4.4 NUMBER OF BLOCKS INTO WHICH AQUITARD IS DISCRETIZED 113 6.4.5 COMBINED AFFECT ON RELIABILITY OF DATA WORTH 113 6.5 PREPOSTERIOR ANALYSIS OF A SOFT, AREAL SURVEY 114 6.6 MODIHCATION TO FRAMEWORK TO HANDLE DATA WORTH PROBLEMS IN OTHER DISCIPLINES 116 6.7 SUMMARY OF CHAPTER 6 118 6.8 NOTATION 118 CHAPTER 7: GENERIC DESIGN CASE: A SENSITIVITY ANALYSIS OF DATA WORTH 123 7.1 INTRODUCTION 123 7.2 TWO CONTAMINATION SCENARIOS 124 7.2.1 SCENARIO NUMBER 1 124 7.2.2 CONTAMINATION SCENARIO 2: 127 7.3. SENSITIVITY TO ECONOMIC PARAMETERS 127 7.3.1 KNOWN COST OF CONTAINMENT (SCENARIO 1) 128 7.3.2 COST OF FAILURE (SCENARIO 1) 130 7.3.3 DISCOUNT RATE (SCENARIO 1) 131 7.4 SENSITIVITY TO HYDROGEOLOGICAL PARAMETERS 131 7.4.1 VERTICAL HYDRAULIC CONDUCTIVITY OF AQUITARD (SCENARIO 2) 132 7.4.2 HORIZONTAL HYDRAULIC CONDUCTIVITY OF LOWER AQUIFER (SCENARIO 2) 132 7.4.3 CONSTANT HEAD BOUNDARY CONDITION, h4, IN LOWER AQUIFER (SCENARIO 2) 133 7.5 SENSITIVITY TO GEOSTATISTICAL PARAMETERS 135 7.5.1 PRIOR ESTIMATE OF MEAN OF I(x) (SCENARIO 1) 135 7.5.2 CORRELATION LENGTH of I(x) (SCENARIO 2) 136 7.5.3 CONFIDENCE IN THE PRIOR ESTIMATE OF M E A N (SCENARIO 1) 137 7.6 SENSITIVITY TO EXISTING HARD DATUM 139 7.7 SUMMARY OF CHAPTER 7 140 7.8 NOTATION 141 CHAPTER 8: SAVANNAH RIVER SITE CASE HISTORY 150 8.1 INTRODUCTION 150 8.1.1 PHYSIOGRAPHY AND GEOLOGY 151 8.1.2 HYDROGEOLOGY 153 8.2 SET UP OF BASE CASE 155 8.2.1 ALTERNATIVE CLOSURE DESIGNS 155 8.2.2 MODELING OF CONTAMINANT TRANSPORT NEAR H-AREA SEEPAGE BASINS 157 8.2.3 GREEN C L A Y DATA BASE 161 8.2.4 INFERENCE OF GEOSTATISTICAL PARAMETERS 164 8.2.4.1 Inference of Correlation Length 164 8.2.4.2 Inference of Mean and Variance 165 8.3 PRIOR ANALYSIS 167 8.4 PREPOSTERIOR ANALYSIS OF HARD DATA 168 8.4.1 SINGLE HARD MEASUREMENT 168 8.4.2 SENSITIVITY OF NET WORTH OF HARD SINGLE HARD MEASUREMENT TAKEN AT POINT B TO PARAMETERS SET UP IN BASE CASE 170 8.4.2.1 Numerical Artifacts 171 8.4.2.2 Economic Parameters 173 8.4.2.3 Geostatistical Parameters 175 8.4.3 NET WORTH OF SINGLE, HARD MEASUREMENT TAKEN AT DIFFERENT LOCATIONS 177 8.4.4 PATTERNS OF MULTIPLE, HARD MEASUREMENTS 179 8.5 PREPOSTERIOR ANALYSIS OF SOFT, AREAL SURVEY 182 8.6 SUMMARY OF CHAPTER 8 186 8.7 NOTATION 188 CHAPTER 9: SUMMARY AND CONCLUSIONS 202 REFERENCES 206 APPENDIX 1: GREEN C L A Y DATA BASE 213 LIST OF TABLES Table 6-1: Economic parameters used in example design 101 Table 6-2: Numerical parameters used in example design 101 Table 6-3: The values of parameters used in the prior analysis and the expected value of parameters used in the preposterior analysis 106 Table 7-1: Costs and benefits for the containment and no containment alternatives 126 Table 7-2: Hydrogeological parameters used in Scenario 1 126 Table 7-3: Numerical parameters used in Scenario 1 127 Table 8-1: Summary of cost and benefits associated with the no action and waste removal alternatives 158 Table 8-2: Hydraulic parameters of hydrostratigraphic units used in the base case 161 Table 8-3: Summary of geostatistical parameters used in base case 167 Table 8-4: Aquitard discretizations used in evaluating net worth of hard measurement taken at point B 172 LIST OF FIGURES Figure 2-1: Cross section througli example landfill 12 Figure 2-2: Decision tree used in the prior analysis of the example landfill 15 Figure 2-3: Decision tree used in posterior analysis of example landfill 19 Figure 2-4: Expected regret calculations for the landfill example 21 Figure 2-5: Decision tree of preposterior analysis of example borehole using method 1 25 Figure 2-6: Decision tree used in preposterior analysis of example borehole using method 2 26 Figure 3-1: Cumulative probability distribution of shale length in different depositional environments 41 Figure 3-2: Prediction of shale continuity in the Ivishak Formation without geological information 42 Figure 3-3: Prediction of shale continuity in the Ivishak Formation with geological information 42 Figure 4-1: Plot of the exponential, spherical and Gaussian covariance functions, for <5-^ = 1, A, = 10 m and a = 10 m 49 Figure 4-2: Plot of i*(x) on 1-d line by both OIK and SIK 56 Figure 4-3: Variance of i*(x) fi-om SIK and OIK 57 Figure 4-4: One dimensional stratigraphie section divided into blocks 59 Figure 4-5: An example of nh hard data which are declustered 63 Figure 4-6: F*I(0) versus different cell size d 64 Figure 5-1: Boundary conditions on two-dimensional flow field 91 Figure 5-2: Sample finite element grid for two dimensional vertical cross secdon 93 Figure 6-1: Boundary conditions, hydrogeological and geostatisdcal parameters used in the example design 100 Figure 6-2: Finite element mesh used in example design with eight times vertical exaggeration 101 Figure 6-3: Decision tree used in prior analysis and its simplified version to right 102 Figure 6-4: Preposterior analysis of example design when realizations are generated in both the prior and the preposterior analysis 105 Figure 6-5: The preposterior analysis for example design for case where realizations are generated during preposterior analysis only 109 Figure 6-6: Sensitivity of the worth and median failure time to number of realizations per sample outcome I l l Figure 6-7: Sensitivity of the worth to the starting seed used in the random number generator 112 Figure 6-8: Sensitivity of the worth to first aquitard block in starting path 113 Figure 6-9: Sensitivity of W to the number of blocks that the aquitard is discretized into 113 Figure 7-1 : Plan view of contaminated aquifer system 125 Figure 7-2: Cross section through contaminated aquifer system 125 Figure 7-3: Sensitivity of W to the known cost of containment 129 Figure 7-4: The expected prior objective function of the two alternatives and the expected expected objective function of the posterior best design alternative versus known cost of containment 129 Figure 7-5: The effect of samphng a window on the objective functions of the different alternatives 129 Figure 7-6: The effect of sampling no window on the objective functions of the different alternatives 130 Figure 7-7: W versus cost of failure 130 Figure 7-8: The expected prior objective function of the two alternatives and the expected expected objective function of the posterior best design alternative versus cost of failure 131 Figure 7-9 : W versus discount rate 131 Figure 7-10: The expected prior objective function of the two alternatives and the expected expected objective function of the posterior best design alternative versus discount rate 131 Figure 7-11: The sensitivity of W and prior probability of failure to aquitard hydraulic conductivity for Scenario 2 132 Figure 7-12: The sensitivity of W and prior probabiUty of failure to the transmissivity of the lower aquifer 133 Figure 7-13: The sensitivity of W to the constant head boundary condition in lower aquifer 134 Figure 7-14: Sensitivity of W to the prior estimate of the mean 135 Figure 7-15: Effect of sampling no window on the prior and posterior objective function of the containment and no containment alternatives 135 Figure 7-16: Effect of sampling a window on the prior and posterior objective functions of the containment and no containment alternatives 135 Figure 7-17: Probability of sampling a window versus the mean chance of a window 136 Figure 7-18: Sensitivity of W and prior probability of failure to correlation length 137 Figure 7-19: Sensitivity of W to the prior confidence in the estimates of the geostatistical parameters 137 Figure 7-20: The updated mean versus ne 138 Figure 7-21: Effect of sampling a window on the objective functions 139 Figure 7-22: Effect of an existing hard datum on the worth of single borehole taken along an east-west line through the datum 140 Figure 8-1: Location map of the Savannah River Site 150 Figure 8-2: H-Area seepage basins in the General Separations Area 151 Figure 8-3: Geologic cross section through the General Separations Area 153 Figure 8-4: Area where flow and transport are modeled around the H-Area seepage basins 158 Figure 8-5: Contours of hydraulic head in Barnwell Formation and paths of particles of contamination near the H Area seepage basins 162 Figure 8-6: Locations of global hard data 163 Figure 8-7: Locations of global soft data 163 Figure 8-8: Local hard and soft data used to condition aquitard realizations 163 Figure 8-9: Decision tree used in prior analysis, where Cost of failure = $70 million 168 Figure 8-10: Location of hard measurement taken at point B 169 Figure 8-11: Decision tree used in preposterior analysis of hard measurement taken at point B, for cost of failure = $70 million 170 Figure 8-12: Sensitivity of net worth to aquitard block discretization 172 Figure 8-13: Sensitivity of net worth to the discount rate 174 Figure 8-14: Sensitivity of net worth to cost of failure 174 Figure 8-15: Sensitivity of net worth to known cost of lay cap alternative 175 Figure 8-16: Sensitivity of net worth to known cost of no clay cap alternative 175 Figure 8-17: Sensitivity of net worth to geological estimate of mean 176 Figure 8-18: Sensitivity of net worth to correlation length 176 Figure 8-19: Sensitivity of net worth to confidence in prior estimate of mean 177 Figure 8-20: Contour map of net worth of a single, hard measurement taken at different locations for cost of failure = $45 million 177 Figure 8-21: Contour map of net worth of a single, hard measurement taken at different locations for cost of failure = $70 million 179 Figure 8-22: Pattern of hard measurement taken on a 29 m spacing 179 Figure 8-23: Pattern of hard measurement taken on a 145 m spacing 180 Figure 8-24: Pattern of hard measurement taken on a 290 m spacing 180 Figure 8-25: Net worth of multiple hard measurements taken at spacings of 29,145, and 290 m versus number of measurements taken, for cost of failure = $70 million 181 Figure 8-26: Net worth of multiple hard measurements taken at spacings of 29 145 and 290 m versus number of measurements taken, for cost of failure =$45 million 181 Figure 8-27: Contour plot of total worth for single, hard, point measurements taken throughout the region of study, for cost of failure=$45 million 183 Figure 8-28: Decision tree used in preposterior analysis of soft geophysical survey 184 Figure 8-29: Net worth of the geophysical survey vs the probabiUty of sampling a window that will cause failure, given that a window that will cause failure exists 186 Figure 8-30: Net worth of the seismic survey compared to net worth of patterns of hard, point measurements taken on a 290 m spacing 186 ACKNOWLEDGEMENTS First and foremost, I would like to thank A l Freeze, my advisor, for giving me the opportunity to study under him. He has given me many things: a role model to follow and an understanding of what good research is really about, to just name a few. Secondly, I would like to thank Les Smith. He helped me on many occasions with technical problems relating to my thesis. I barged into his office dozens of times over the last five years with, "Have you got a sec?" Never once do I remember him saying no. It is a unique opportunity to be able to work under two scientists who are both world class researchers and first rate people. One of the nicest aspects of my graduate studies was the rest of the groundwater gang. I would like to thank Brian Berkowitz, Joann Bessler, Elizabeth Hill , Karen Jardine, Dina Lopez, Tony Sperling, Dan Walker, and Chistoph Wels for many technical discussions, for feedback on practice talks that I have given, and for providing a really enjoyable working environment. One outstanding individual who I am particularly in debt to is Tom Clemo. He not only helped me solve a couple of very critical problems that I was faced with in my research, but was continuous a source of feedback. He is one of the best people that I have ever met. I would also like to thank Bryon Cranston for getting me into hockey, Keith Everard for baking lots of muffins, and Peiwen Ke for teaching me about China. The hardest thing about leaving Vancouver is missing all of the people who I have met and worked with. Many thanks are also given to the Natural Sciences and Engineering Research Council of Canada who funded my research through scholarships and research operating grants. Finally, I would like to thank Dale Stephenson of Westinghouse Savannah River Co. and Glenn Duffield of the Geraghty & Miller ModeUng Group. They provided invaluable aid in helping me complete the Savannah River Site case history, an important component of my thesis. CHAPTER 1: INTRODUCTION High costs are the major challenge presently facing the remediation of groundwater/soil contamination. In 1990, die U.S. EPA estimated tiiat diere were 951 Superfund sites, 27 000 potential Superfund sites, as well as up to 100 000 other potential major sources of groundwater/soil contamination in the U.S. alone (National Research Council, 1990). These numbers do not include the many sites on US military installations. The cost of cleaning up individual major contamination sites is commonly in the $10's of millions, but can range up to the billions of dollars. The total cost of cleaning up all of these sites could exceed a tiillion dollars. Such high costs are unacceptable to society, particularly when tiiere are odier more pressing social and environmental problems, such as economic competitiveness and air pollution, facing us. Therefore, it is critical to remediate these contamination sites in a cost effective fashion. A crucial key to cost effective remediation is making good design decisions for remediation programs. However, making good decisions on designs is not a simple task since there can be much uncertainty in the quality of any particular design. Making good design decisions requires two important questions be answered: (1) Given uncertain conditions, how does one choose a good design? (2) Is it cost effective to collect more information so that one can improve a design? Massmann and Freeze (1987a and b), Massmann, et al. (1991), Sperling (1991), and SperUng et al. (in press) address the former question. The objective of diis thesis is to address the second question. This objective is reached by developing a Bayesian decision framework for addressing data worth questions in reahstic hydrogeological design problems that deal with groundwater contamination. The framework can be valuable in reducing costs because it provides the site engineer with a tool not only for spending site exploration dollars more efficientiy, but for deciding when enough information has been collected and a final remediation design decision can be chosen. This latter point is very important as it reduces the time necessary to make design decisions. The framework is specifically applicable to the hydrogeological design of new waste disposal sites or remediation programs for existing sites of groundwater contamination. The framework must able to handle geological heterogeneity for it to be useful in realistic design. 1.1 HYDROGEOLOGICAL DESIGN AND GEOLOGICAL HETEROGENEITY The design of new waste disposal sites, or remediation programs for sites of existing groundwater contamination, involves selecting the best design from a set of possible design alternatives. For example, in the design of a new landfill, one must consider whether an expensive synthetic liner is required to contain leachate, or whether no Uner is needed. Selecting the best design is often an expensive balancing act. On one side, the range of costs of constructing new landfill sites, or remediating existing contamination, is large. For example, the cost of proposed design alternatives for closing down four seepage basins at the Savannah River Site, South Carolina ranged from only $3.3 million to over $100 million (Killian et al., 1987). An overly conservative design can be a huge waste of resources. On the other side, a poorly designed remediation program that does not halt contamination, or a poorly designed waste disposal site that leaks leachate and contaminates local groundwater supplies, can result in large remediation costs and irreparable damage. The best design minimizes the combined cost of carrying out the design as well as any subsequent costs if the design performance fails to meet expectations. Choosing a poor design can be expensive. Uncertainty caused by geological heterogeneity can make the choice of the best design alternative difficult. The performance of any alternative may depend upon where and how fast leaking contaminants will be transported. Often, the major factor controlUng contaminant transport is geological heterogeneity, Uiat is the variability of hydrogeological parameters such as the hydraulic conductivity, dispersivity, or bed diickness. The true states of these parameters are uncertain since the porous medium is hidden below die surface, and is often highly variable. Therefore, there is a wide range of possible states for die three dimensional distiibution, and a wide range of possible performances of the various design alternatives. The choice of design can be facilitated by reducing the uncertainty in geological heterogeneity by gathering additional information through a sampling program. However, gathering data can be expensive. Will the expense of additional data collection be cost effective in allowing a better design decision to be made? And if so, what is the best sampling program to obtain this information? Decisions are needed as to the number and types of measurements to be taken, and their locations. Present exploration programs are generally based solely on the professional judgment of the site engineer or geologist. However, the complexity of die heterogeneity of hydrogeological parameters can be overwhelming. On one hand, many cases exist where insufficient exploration resulted in expensive, poor designs. For these cases, more information might have prevented the poor design. On the other hand, the adversarial and regulatory environment that exists at many Superfund sites in the U.S.A. has sometimes led to too many data being collected and delay in action. This has resulted in a waste of resources that could have been better spent cleaning up odier contamination sites. It is clear Uiat in many cases the worth of data is not analyzed properly. Consequentiy, diere is need for a tool which can help hydrogeologists/engineers evaluate the worth of data in hydrogeological design. 1.2 PREVIOUS WORK Evaluating the worth of data is not a new question. Over die last 50 years, much work has been done in designing sampling programs in geology. However, "with few exceptions, the research to date on sample network design has focused on the goal of improved estimation of spatial averages over a predefined area of interest." (Barnes, 1989. p.l2). Some selected references are Slichter (1955), Drew (1979), Delhomme (1983), and Bogardi et al. (1985). While this type of work makes valuable contributions to the design of sampling networks, there is no consideration of the monetary worth of a sampling program. Excluding monetary worth misses the fundamental point of a sampling program in hydrogeological design: the question of whether it is worth paying for a sampling in the first place. The work which has put a monetary worth on sampling programs has been based on Bayesian decision theory. (Bayesian decision theory will be discussed in Chapter 2.) Davis and Dvoranchick (1971), Davis et al. (1972), Dawdy (1979), Davis et al. (1979), and Attanasi (1979) utilized Bayesian decision theory to evaluate the worth of hydrological data. Ben-Zvi et al. (1988) evaluated the worth of a borehole in a simple groundwater contamination problem. Maddock (1973), in one of the pioneering works, extended Bayesian decision analysis to the management of a farm, utilizing an analytical groundwater flow model. Management decisions involved many factors including pumping rates for irrigation and types of crops to be planted. However, hydrogeological parameters were assumed to be homogeneous. Such an approach can be used in some hydrogeological design problems. Often, however, geological heterogeneity must be dealt with to effectively carry out design, particularly when dealing with groundwater contamination. Gates and Kisiel (1974) did a very interesting study that evaluated the worth of measurements of hydrauhc conductivity, storativity, hydraulic head and recharge/discharge rates using a computer model that predicted hydraulic head in the Tucson Basin in Arizona. The parameters were heterogeneous; however, the worth of data was quantified in terms of a cost associated with the precision of the predicted heads, rather than a design decision. None of the previous work that puts a monetary value on a sampling program has utilized geostatistics, or the concept of regional variables. Utilizing geostatistics is important because it is the most powerful tool available today for handling geological heterogeneity. Many hydrogeological parameters that govern groundwater flow and contaminant transport behave as regional variables (de Marsily, 1984, and Rouhani, 1985). A regional variable is correlated in space. Geostatistics utiUzes the spatial correlation to constrain the variability of possible states of the three dimensional distiibution of hydrogeological parameters. Thus, by using the spatial correlation, an improved estimate of geological heterogeneity can be achieved and the performance of a design alternative can be estimated more closely. Nevertheless, these estimates can still involve great uncertainty. Massman and Freeze (1987a and b), Massman et al. (1991), Sperling (1991), and Sperling et al. (in press) used geostatistical methods to address the problems of hydrogeological design under conditions of uncertainty, but they did not address the issue of data worth. Cahn (1987) developed sti-ategies for sampling programs utilizing geostatistical concepts. However, she did not evaluate the wordi of a measurement before it was taken. Rouhani (1985) linked the worth of a measurement of hydraulic head to the economic value related to die precision of die predicted hydraulic head, but he did not link the worth to a design decision. More recendy, Marin et al. (1989) and Medina (1989) oudined a Bayesian risk methodology for the permitting of waste disposal sites under conditions of uncertainty. Permitting was based upon whether the contaminant concend"ation would reach a critical level at some compliance point. Uncertainty in the contaminant concentration was estimated using Monte-Carlo simulations. In dieir work, diey discuss how the worth of a measurement is quantified in terms of the increased precision of the estimated concenti^tion. However die worth is not linked to an engineering design decision, nor do diey actually carry out an analysis of die worth of data. Freeze et ai. (1988) and Freeze et al. (in press) produced a general oudine of a Bayesian stochastic framework for addressing the data wordi question in hydrogeological design involving correlated random variables. However, diey did not put dieir ideas into practice. 1.3 APPROACH AND CONTRIBUTION OF THESIS Many hydrogeological parameters affect contaminant transport. As a first step, the framework will deal with only one hydrogeological parameter: aquitard continuity. This parameter has been chosen for three reasons. Firsdy, aquitard continuity is one of the major parameters controlling contaminant d"ansport. In aquifer systems of non-zero vertical gradients, aquitard discontinuities are often the most significant factor controlling flow behavior and the spread of contaminants (Haldorsen and Chang, 1986; Fogg, 1986; and Duffield et al. 1989). This factor is particularly important when there is a great difference between the hydraulic conductivity of the aquifers and the aquitard of interest. Variations of other hydraulic parameters, such as porosity, hydraulic conductivity, or storativity, within a hydrogeological unit are of lesser importance. This contention is supported by Fogg (1986), Desbarats (1987) and Haldorsen and Chang (1986) who report that the most significant factor controlling the flow pattern is often the transition between sand and shale units rather than variations within a particular unit. Newson and Wilson (1988) report diat even abandoned wells have provided important paths for contamination to spread vertically through aquifer/aquitard systems. Secondly, it is a common problem. Aquitard discontinuities have enhanced die spread of contamination, or have been a significant problem at numerous contamination sites (Roux and Altoff, 1980; Jackson et al., 1985; Duffield et al., 1989; and Kennedy, et al., 1990). Finally, aquitard continuity affects the ti-ansport that is most critical: die vertical spreading of contamination to clean aquifers. Previous experience has shown diat the clean up of a contaminated aquifer is often economically impossible. Therefore, the horizontal spreading of contamination in an aquifer diat is akeady contaminated is not as important as die vertical spreading of contamination to a clean aquifer. To the best of the author's knowledge, no other work has been done on evaluating the effect of uncertainty of aquitard continuity on the prediction of contaminant transport. The framework consists of two basic modules. The first is an indicator geostatistical simulation algorithm for handling aquitard heterogeneity. It accounts for the spatial correlation between measurements. Indicator geostatistics allows the handling of (a) hard, point measurements (which are precise, but are probably few and expensive), (b) soft, point measurements (which are imprecise, but are probably cheaper and more numerous), and (c) hydrogeological parameters which behave in space as non Gaussian random variables. Bed thickness determined from complete borehole logs represent an example of hard data. Bed thickness determined from incomplete borehole logs could represent soft data. Geophysical measurements could also represent soft data. The incorporation of soft data is important because there is often a lack of hard data. The second module is a numerical model (finite element or finite difference) for simulating contaminant transport. Bayesian decision analysis ties the two modules together. The Bayesian nature of the framework provides a methodology for combining a conceptual understanding of the local geology with quantitative information. Including a conceptual understanding of geology is vital in many cases. For example, whether an aquitard is formed in a meandering stream environment or in a deep marine environment has immense implications as to how far an aquitard can be extrapolated from a measurement point. A case study illustrates that ignoring a conceptual understanding of geology can lead to erroneous data worth predictions, resulting in a poor design decision. The framework can evaluate and compare the cost effectiveness of sampling programs consisting of (a) point, hard measurements, or (b) soft, geophysical surveys covering a large area. The framework will determine the best sampling strategy from a series of proposed ones. The best sampling strategy is not guaranteed to be the optimal pattern because the optimal pattern is the best of all possible sampling strategies, not just the best of a finite set of proposed ones. This thesis will concentrate on unconsolidated aquifer/aquitard systems formed in clastic depositional environments. It assumes that the discontinuities in an aquitard are a result of depositional processes and post depositional erosional processes only. Discontinuities caused by tectonic or other processes, such as faulting, are not considered. In summary, the major contribution of this thesis is the development of a Bayesian decision framework for evaluating the worth of data in hydrogeological design in heterogeneous geological environments. It can be used for tackling one of the most significant problems in environmental contamination: reducing the high costs associated with it. The framework provides a tool for allowing decision makers to not only expend site exploration dollars more efficiently, but to determine when enough information has been collected. This latter point is particularly important as it reduces the time necessary to make design decisions. The second major contribution is the holistic approach used to develop the framework. This approach is accompUshed by two factors. The first, and primary factor, is the Bayesian nature of the framework. This is the key to the integration of all the factors affecting a design and the expression of data worth in monetary terms, the major driving force of any exploration program. The second factor is the incorporation of many different types of information, including soft data, hard data, and a conceptual understanding of the local geology. Including as many sources of information as possible is critical because there are often few reliable hard data. In addition, to the best of the author's knowledge, the framework is unique in its tackling of aquitard continuity, often the critical uncertainty confronting decision makers involved in groundwater contamination. Also, the framework is not just conceptual in nature, but has been demonstrated with a real field case. Finally, the framework is applicable to other exploration programs in hydrogeology and other fields. The methodology is sufficienUy general diat it can be easily modified to handle odier hydrogeological parameters such as hydraulic conductivity, or to similar problems in other disciplines, particularly mining and petroleum engineering. Chapters 2 to 5 present the different components of the framework. Chapter 6 describes how the different components are tied into the framework. Chapter 7 presents a sensitivity study of data worth using generic design examples, while Chapter 8 demonsti-ates the framework using a case history. The case history is the closure of four seepage basins at the Savannah River Site in South Carolina. CHAPTER 2: A N OVERVIEW OF BAYESIAN DECISION ANALYSIS 2.1 INTRODUCTION Bayesian decision analysis is a powerful mediod of madiemadcally selecting die best design from a number of alternatives under uncertainty. It is powerful because it combines bodi engineering judgement and measured data in the decision process. It also provides a method for evaluating the worth of obtaining further information, which is one of die principal questions asked in design under uncertain conditions: Is it cost effective to get more information so that uncertainty will be reduced, allowing a better decision to be made? Or would it be more cost effective to go with the currentiy available data, and use a more-conservative design approach. In hydrological design, the number of uncertain events affecting different design alternatives can be overwhelming. This chapter will introduce the basic concepts of Bayesian decision theory using a simple design problem. The incorporation of the methodology into the framework described in this thesis will be discussed in later chapters. The three major components of Bayesian decision analysis are die prior, posterior, and preposterior analyses. The prior analysis is used to choose the best design alternative from a series of available ones, based on current information. A posterior analysis is used to choose the best design alternative once new information becomes available. A preposterior analysis is used to determine whether collecting new information is cost effective. 2.2 PRIOR ANALYSIS The prior analysis selects the best design alternative, A ^ , from a set of "I" possible alternatives, A;, based on present information. For diis discussion, diere will be J possible events, Ej, that affect the performance of die design alternatives. Only one event can be ti-ue, Ep, but it is unknown. The methodology will be illustrated with the design of a simple landfill (Fig. 2-1). The landfill is to be constructed on an unconfined surficial aquifer. Beneath the surficial aquifer is a thick clay aquitard, which in turn is underlain by a confined aquifer of economic importance for water supply. A downward vertical gradient exists across the aquitard. The lower aquifer is bounded from below by impermeable bedrock. In this thesis, a landfill failure is defined as any event in which contamination of the lower aquifer occurs. If failure occurs, then the owner of the landfill will have to pay for an expensive cleanup. It is assumed that failure will only occur if a permeable discontinuity, or window, exists in the aquitard. The aquitard is a marine clay formed during a marine transgression over the lower sandy aquifer. Its deposition was followed by a marine regression during which fluvial sands were deposited, forming the upper aquifer. Marine clay is generally continuous over distances on the order of kilometers. However, during this regression, there existed a large fluvial channel existed which could have eroded through the clay layer, creating a window. The approximate location of this fluvial channel is assumed known, but it is not known whether the channel has created a discontinuity in the form of a window through the aquitard. The landfill is to be designed to minimize the combined cost of landfill construction and cleanup if failure occurs. For simpUcity, only two alternative designs are considered here: 1) A L , install a synthetic liner, or 2) A [ ^ , install no liner. It is assumed that the synthetic liner will leak to some degree in the future. Therefore, its benefit will be in reducing the amount of leachate escaping, rather than completely containing it. The two events affecting the performance of the design alternatives are 1) E^f/, a window is present in the aquitard at the fluvial channel, and 2) E f ^ , no window is present in the aquitard at the fluvial channel. land fill water table confined aquifer potential window sand â€” â€˘ impermeable bedrock Figure 2-1: Cross section tiu-ougli example site. If no window is present, then failure cannot occur and there is no need for a synthetic hner. However, if a window does exist, then a liner will substantially reduce the amount of contamination and associated cleanup costs. An expected maximum utility criterion will be used here to select the best design alternative. It takes into account the probability of different events occurring. It is based on an economic objective function which is defined as the net present value of the expected sti-eam of future benefits, costs, and risks, taken over an engineering time horizon. The best alternative, or the design alternative, Aj), is die one with the maximum expected value of die objective function. The objective function, Z, (Crouch and Wdson, 1982) takes the form: Z=l7â€”7^[B(t)-C(t)-R(t)] (2.1) where: t=0 i = discount rate (decimal fraction) t = time (years) T = time horizon (years) B(t) = benefits in year t ($) C(t) = costs in year t ($) R(t) = risks in year t ($) B(t) represents known benefits. For the landfill, these would include the annual income from disposing of waste. C(t) represents known costs, such as the cost of building and operating the landfill, and the costs of future monitoring to check for leachate plumes leaking from the landfill. R(t) is defined as the expected cost of failure. It is calculated by: R(t) =P^(t)C/t)U(Cp (2.2) where: = probability of failure in year t (decimal fraction) C (^t) = cost of failure in year t ($) U(Cp = utility function (decimal fraction >=1) The cost of failure, Cf, represents all costs that could occur as a result of failure. These could include cleanup costs of any contamination as well as legal or other costs. The utility function, U(Cp, accounts for risk-averse tendencies of some decision makers. Conservative decision makers, leery of failure, will set U(Cp > 1 to increase the weighting of the risk. Risk neutral decision makers will use an expected value approach, that is, one for which U(Cp = 1. Refer to any introductory textbook on decision analysis such as Benjamin and Cornell (1970) for more details on utility theory and how to choose values for U(Cp. An expected value approach will be followed here. Therefore, equation (2.1) becomes: T 1 Z=Iâ€”[B(t)-C(t)-P(t)C^(t)] t=0^ (2.3) In the simple landfill example to be presented now, the probability of a window at the fluvial channel is arbitrarily assumed to be 0.2. Therefore: P(Ew) =0.2 (2.4) P(ENW) = 0.8 (2.5) For simplicity, all benefits, known costs, and costs of failure will be assumed to occur immediately; therefore, the time term in equation (2.3) drops out and (2.3) becomes: Z = B - C - P f C f (2.6) It is further assumed that B for both alternatives will be $1 000 000 and C will be $500 000 and $200 000 for A L and A ^ L , respectively. Cf will be $1 000 000 for A^^. Cj for A L will be only $200 000 because of the reduced volume of leachate. The prior analysis can be visualized using a decision tree (Fig. 2-2). The square at the left of the tree marks a decision between different alternatives. Each branch from the square represents a possible design alternative . At the end of each branch is a circle with another series of branches emanating from it. Each branch of this second set represents a possible event, which is beyond the control of the decision maker. Each event affects the objective function of the alternative. Here there are only two possible events: E j ^ and E ^ . The probability of each event is marked below the event's branch. The objective function, Z, for each alternative, given that a particular event occurs, is shown at the right of the decision tree. The objective functions for the different combinations of alternatives and events are 1) for a liner and a window Z ( A L , E W ) = $ 1 000 000 - $500 000 - $200 000 (2.7) = $300 000 2) for a liner and no window Z ( A L , E N W ) = $1 000 000 - $500 000 (2.8) = $500 000 3) for no liner and a window Z(Ai^,Ew) = $ 1 000 000 - $200 000 - $1 000 000 (2.9) = - $ 2 0 0 000 4) for no liner and no window Z ( A N L , E N W ) = $1 000 000 - $200 000 (2.10) = $800 000 Figure 2-2: Decision Q-ee used in die prior analysis of die example landfill. The expected value of the objective function for each alternative, E[Z(Aj)], is marked beside it. E[Z{A^] is $600 000 (= 0.2 x (-$200 000) + 0.8 x $800 000). E[Z(A^] is only $460 000 (= 0.2 x $300 000 + 0.8 x $500 000). Therefore, the no-liner design, A^ is the best design because it has the maximum expected objective function. The maximum expected objective function has been enclosed in a box. This convention will be followed throughout this thesis. 2.3 POSTERIOR ANALYSIS The posterior analysis incorporates new information into the decision process. The new information is used to reduce the uncertainty by updating the prior probabilities, P(E{), of the different events, Ej, occurring. Traditionally, Bayes' equation is used to update the probabilities. Once the prior probabilities have been updated, the steps of the prior analysis are repeated to choose the best posterior design alternative. Bayes' equation updates P(Ej), given that a sample Sj has been taken, by: P(SilEi)P(Ei) P(EilSj)= 'p^ g^ ^ (2.11) where, - PCEjISj) is the posterior probability of Ej, given that sample Sj has been taken, - P(SjlEi) is the probability of sampUng Sj, given that Ej exists, - P(Ei) is the prior probability of event Ej, and - P(Sj) is the probability of sampling Sj. P(SjlEj) is also referred to as the likelihood function of samphng Sj. P(Sj) is calculated from the total probability of all of the different ways of sampUng S:: P(Sj)=IP(SjlEi)P(Ei) (2.12) i=i One of the powers of Bayes' equation is that the quality of the sample is automatically included in the updating by the likelihood function. The likelihood function can, for example, take into account the precision of the sampling device. Bayes' equation also allows the inclusion of subjective information through the prior term. The prior probability can be based on previous measurements and/or intuition. Returning to the earlier landfill example, a single borehole is to be drilled at the probable location of the fluvial channel to determine if a window exists in die aquitard. It is arbiti-arily assumed diat if die window exists there is a 70% chance that it will be found with the measurement technology selected. It is also assumed diat die borehole wUl not give a false outcome; dierefore, if a window is sampled dien it exists widi certainty. Obviously, if no window exists, then there is no chance of finding one. Therefore, the likelihood functions for the lx)rehole are P(SwlEw) =0 .7 (2.13) P(SNWIEW) = 1. - P(SwIEw) (2.14) = 1.-0.7 = 0.3 P(SWIENW) = 0 . (2.15) P(SNWIENW) = 1-0 (2.16) For the case where a window is sampled, S^, die updated probability of a window existing is determined from Bayes' equation by P(SwlEw)P(Ew) P ( E W I S W ) - P ( S ^ ) (2.17) The probability of sampling a window, P{Sy^) is calculated using equation (2.12): P(Sw) =P(SWIEW)P(EW) + P(SWIENW)P(ENW) (2-18) = (0.7)(0.2) + (0)(0.8) = 0.14 Therefore, from equations (2.4), (2.13), and (2.18): P ( E â€ž . â€ž , (2.,9) = 1.0 PCENW'^W) ^so be calculated using Bayes' equation, but since there are only two events it can be more easily calculated by: P(ENWISW) = 1 . 0 - P ( E W I S W ) (2.20) = 1 .0 -1 .0 = 0. For the case of S^w where: P(SNW) =1.0-P(Sw) (2.22) = 1 .0 -0 .14 = 0.86 Therefore, from equations (2.22), (2.14) and (2.4): PCEw'^Nw) _(0-3)(0-2) ~ (0.86) (2.23) = 0.07 Note, that has reduced the probability of a window from its prior value of 0.2 to 0.07. The updated probability of no window, given S j ^ , P(Ej^lSj,^), is For the landfill example, let us assume that no window was found at die fluvial channel by die borehole. The updated probabilities from equations (2.23) and (2.24) are marked on the decision tree shown in Figure 2-3. The same steps used in the prior analysis are then used to select the posterior best alternative. The posterior expected objective function for A L is $490 000, but the expected objective function for A j ^ is now $752 000, or $152 000 higher dian calculated during die prior analysis . Ajs^ ^ is still the best design. In fact, now that additional information is available, it is better than originally thought during the prior analysis. 2.4 DATA WORTH THROUGH PREPOSTERIOR ANALYSIS The preposterior analysis can be separated into two components: the expected value of perfect information (EVPI) and the expected value of sample information (EVSI). The EVPI is an estimate of die value of having perfect information so that uncertainty is reduced to zero and a perfect design decision can be made. The EVSI is an estimate of die expected value of a normal, imperfect, sampling program which will not reduce uncertainty to zero and diere will still be uncertainty in any design decision. P(ENWISNW) - 1 â€˘ P(EW'SNW) (2.24) = 1 - 0.07 = 0.93 â€˘biect iye Function ^ $300 E C Z ( ALlS|^y):=*486 $ 5 0 0 E [Z (A n l IS |^^ )3=$730 | -$eoo Note: All dollars are in thousonols $ 8 0 0 Figure 2-3: Decision tree used in posterior analysis of example landfill 2.4.1 EXPECTED V A L U E OF PERFECT INFORMATION (EVPl) The EVPI can be explained using the concept of regret. The regret associated with a decision represents the monetary loss incurred by not making the best decision. It is defined as the difference between the value of the objective functions ZCA-j-) and Z(AQ), where A ^ represents the true best design alternative and Ap represents the chosen design alternative. Both are evaluated at the true event, E.^. The regret associated with A Q , Reg(AQ,Ej), is calculated by: Reg(AD,Er) = Z(AT,ET) - Z (AD,ET) (2.25) where. - Z(Aj,Ej) is the objective function for A.j. evaluated at E j , and - Z(Aj),Ej) is the objective function for A ^ evaluated at Ep. Note, that the regret can be calculated for any alternative, A , not just A Q . Unfortunately, is never known. Therefore, we must calculate an expected regret, E[Reg(A0)], over all the possible events, Ej: I E[Reg(AD)] = X [Z(AT,Ei) - Z{A^,E{)]m;) (2-26) i=l The expected regret gives the expected monetary loss when A ^ is not A j . In other words, it is the maximum improvement that could be expected in Aj), if Ej- was known. The E[Reg(Aj))] represents the EVPI. Since no exploration program will yield perfect information, die EVPI is an estimate of the maximum worth of any normal exploration program. An exploration program should not be carried out if its costs exceed die EVPI. An example calculation of the expected regret for the prior design of a liner is presented in Figure 2-4. The regret associated with each design/event pair is shown at the right hand side of Figure 2-4. If E^ is a window, dien from Figure 2-2 and equations (2.7) and (2.9), Z ( A L , E W ) = $300 000 and Z(A^,E^ff) =- $200 000. Hence, the prior A Q of the no hner alternative really would have been poor. The tine best alternative would really have been a liner. The regret in choosing AJ^L the prior analysis would be: Reg(ANL,Ew) = Z ( A T , E W ) - Z ( A N L , E W ) (2.27) = Z ( A L , E W ) - Z ( A N L , E W ) = $300 000 - (-$200 000) = $500 000 However, if Eq. is no window , dien from Figure 2-2 and equations (2.8) and (2.10) Z ( A L , E N W ) = $500 000 and Z(ANL,Ef^) = $800 000. Under diis outcome, die prior A Q of no-liner would still be the best. Hence the regret would be zero. No-te= AU dollars are in thousands Figure 2-4: Expected regret calculations for the landfill example. The expected regret of die prior best design alternative of no-liner is E[Reg(ANL)] =Reg(ANL,ENw)P(ENw) + Reg(ANL,Ew)P(Ew) (2.28) = ($0)(0.8) + ($500 000)(0.2) = $100 000 therefore, EVPI = $100 000 (2.29) It would not be worth carrying out any exploratory work diat cost more dian $ 100 000. Expected regret can also be used in the prior analysis to choose the best alternative. A ^ has die lowest expected regret. For example, E[Reg(AL)] = $240 000 while E[Reg(ANL)] = $100 000 (Fig. 2-4). The alternative with the minimum expected regret is the alternative with the highest expected objective function. 2.4.2 EXPECTED VALUE OF SAMPLE INFORMATION (EVSI) There are two standard approaches for calculating the EVSI. The first is based on the increase in the maximum expected objective function. The second is based on the reduction in the minimum expected regret. These are well established in the literature (refer to Benjamin and Cornell, 1970; Davis and Dvoranchik (1971); or Davis, Kisiel, and Duckstein (1972)). A third approach is developed by rearranging the equations used in the first method to isolate the individual contributions of the different sample outcomes to the EVSI. It is utilized in this thesis to visualize how different factors affect the data worth process. The three methods will first be outlined in more detail below. 2.4.2.1 Method 1: The EVSI from the Expected Increase in the Maximum Objective Function In this method, the worth of data, Wj, is calculated from the expected increase of the expected objective function of the posterior best design alternative, A^'^ over the expected objective function of the prior best design alternative, A ^ . Wj is calculated by: Wi = E[E(Z(AD' ) ) ] - E(Z(AD)) (2.30) where, - Z(AÂŁ)) is the objective function of A Q - E(Z(Ap)) is the expected value of Z(Aj5), and - E[E(Z(AD'))] is the expected value of E ( Z ( A D ' ) ) . For the discrete case of I events, Ej, and J sample outcomes, Sj, equation (2.30) becomes: W, = 1 j=i XZ(ADj'.Ei)P(EilSj)]P(Sj) Li=l -XZ(AD,Ei )P(Ei ) (2.31) J i=l where, - Apj' is the best design alternative given Sj, - Z(AD,Ei) is the objective function A Q evaluated for event Ej, - POBjISj) is the posterior probability of event, Ej, given diat sample Sj has been taken, - PCE) is die prior probability diat event E; will occur, and - P(Sj) is the probability of samphng Sj. An example preposterior analysis using method 1 is carried out below. The worth of the example borehole carried out in the posterior analysis (Section 2.3) will be evaluated. In a preposterior analysis, the outcome of the lx)rehole is not known. A decision tree used in die preposterior analysis is shown in Figure 2-5. It consists of two parts. The first part, associated with the decision not to drill the borehole, is identical to the prior analysis. The second part, associated with die decision to drill the borehole represents the preposterior analysis. The preposterior analysis can be diought to consist of two stages. Stage 1 is a series of branches for each possible sample outcome. In this example, there are two: sampling either a window, S^, or no window, S(^ . The probability that a particular outcome will be sampled is marked in brackets beside each branch. Recall diat diese were calculated in (2.18) and (2.22). Stage 2 consists of a series of pseudo-posterior analyses. A pseudo-posterior analysis is associated with each sample outcome. The probabilities used in each pseudo-posterior analysis are updated according to the sample outcome. For the case of S[,j^, P(EY f^lSj,jw) and P(Ef^lSf^) have already been calculated in equations (2.23) and (2.24). The expected objective functions are E [Z(AJ^ ISNW)] = $730 000, and E[Z(ALISNW)] = $486 000. Therefore, A Q ' would be A ^ L -For the case of Syff, PCE^IS^) and P(Ej^lS^) have aheady been calculated in equations (2.19) and (2.20). E[Z(ANLISW)] = -$200 000, and E[Z(ALISW)] = $300 000. Therefore, A ^ ' would now be A L . The worth of the borehole from equation (2.30) is = $670 000-$600 000 = $70 000 Therefore, the borehole is worth $70 000. It would not be worth drilling the borehole, if it is going to cost more than $70 000. 2.4.2.2 Method 2: The EVSI from Expected Reduction in Minimum Expected Regret Method 2 calculates the worth of data, Wj, from the expected reduction in the E[Reg(AL,')] from the E[Reg(AD)]. W2 is calculated by where, - Reg(AL,) is the regret of A Q , - E(Reg(AQ)) is the expected regret of A Q , and - E[E(Reg(AD'))] is the expected expected regret of A Q ' . For the discrete case, (2.33) becomes = E [ E ( Z ( A D ' ) ) ] - E ( Z ( A D ) ) (2.32) W2 = E(Reg(AD)) - E[E(Reg(ADO)] (2.33) W2= I [Z(ATEi) - Z(AD,Ei)]P(Ei) -i=i J I 1 1 ([Z(AT,/,Ei) -Z(ADj',Ei)]P(EilSj)}P(Sj) (2.34) j=l i=l â€˘b iectivc Function Stage 1 Stage 2 Figure 2-5: Decision tree of preposterior analysis of example borehole using method 1. where - Z(ATj',Ei) is die objective function of die posterior true best alternative evaluated for event, Ej, given sample, S:, and - Z(Aj,Ei) is the objective function of the prior true best alternative evaluated for the event, E;. A decision tree used in the preposterior analysis is shown in Figure 2-6. The regret for each combination of Ej and Aj is marked to the right of the branch representing Ej . The calculation of these values was discussed in Section 2.3. In the case of Sj ,^ , the expected regret for A Q ' (no liner) is $34 900. In the case of S^, the expected regret for A ^ (a liner) is $0. Therefore, the E[E(Reg(Aj3'))] = $30 000. Therefore, from equation (2.33) is W2 =E(Reg(AD))-E[E(Reg(AD'))] (2.35) = $100 000-$30 000 = $70 000 Therefore, the worth of the borehole is again $70 000. 2.4.2.3 Method 3: Expected Posterior Worth of a Measurement Method 3 is developed by rearranging Method 1 to show contributions of the different sample outcomes to the EVSI. The basic philosophy of the method is to put the EVSI, or worth, of the measurement in terms of the posterior worth of sample outcomes. Or in other words, what is the real worth of a measurement that samples a window (or no window) after it has been taken. Recall from Method 1 that the worth of taking a measurement for the discrete case, is calculated by J I I W, = II Z(ADj',Ei)P(EilSj)P(Sj) -I Z(AD,Ei)P(Ei) (2.36) j=l i=l i=i Figure 2-6: Decision tree used in preposterior analysis of example borehole using mediod 2. For a given particular event, Ej, die sum of all possible sample outcomes, S:, is one; hence. J I P ( S j l E i ) = l j=l (2.37) Therefore, equation (2.36) can be rewritten as J I I J W i = 11 Z(ADj',Ei)P(EilSj)P(Sj) -I I Z(AD,Ei)P(SjlEi)P(Ei) (2.38) j=li=i i=lj=i Substituting Bayes' equation. P(SilEi)P(Ei) P(EilSj)= 'p ;^^ ^ (2.39) into equation (2.38) yields: J I I J W i = X I Z(ADj',Ei)P(EilSj)P(Sj) -I I Z(AD,Ei)P(EilS3)P(S3) (2.40) j=li=l i=lj=l Reversing the summation terms on the second term of the right hand side yields: J I J I Wi = S I Z(AD/,Ei)P(EilSj)P(Sj) -I X Z(AD,Ei)P(EilSj)P(Sj) (2.41) j=l i=l j=l i=l The above equation can be rearranged to give: J I I W 3 = I [ I Z(ADj'3i)P(EilSj) -I Z(AD,Ei)P(EilSj)]P(Sj) (2.42) j=l i=l i=l where Wj has been replaced by W 3 to denote that the E V S I is now being calculated by Method 3. For the continuous case, W 3 can be calculated by: W3 = E[E(Z(AD'IS)) - E(Z(ADIS))] (2.43) where E(Z(AQIS)) is the expected value of the prior best design altemadve, but evaluated with addidonal sample information S. The contributions of the different sample outcomes have now been isolated. One can use this method to examine die posterior worth of a measurement. For example, if the sample outcome was S ^ , its worth would be W(Sw) = E(Z(AD'ISW)) - E(Z(ADISW)) (2.44) which represents the difference between the expected value of the new posterior best design alternative, Aj)', and the new estimate of die expected value of die old prior best design alternative, A ^ . In other words, it represents the difference between the new best alternative and what the Ap is really worth. If the outcome of a measurement does not change A ^ to a different one, then the outcome has no value. A preposterior analysis using mediod 3 can be illusti-ated with die decision ti"ee used in method I (Fig. 2-5). Recall diat for S^w, E[Z(ANLISNW)] = $730 000 and E[Z(ALISNW)] = $486 000. Therefore, S^w has only reconfirmed that the prior best alternative of no-liner is still the best and sampling no window is worth nothing. In die case of S ^ , die best alternative is now to have a liner (E[Z(ALIS^)] = $300 000). The expected objective function of the prior no-liner decision was found to be really only -$200 000, radier than its previous value of $600 000 calculated in die prior analysis. Therefore, has resulted in an improvement of $500 000 ($300 000 - - $200 000) in die design decision. The worth of taking a sample from equation (2.42) is W3 = ($500 000)(0.14) + ($0)(0.86) = $70 000 (2.45) which is the same as calculated by methods 1 and 2. In general, the worth of data will be expressed in terms of Method 1 because it is commonly used in the literature. Method 3 will only be used for some demonstration examples. It is felt that Method 3 is a more intuitive approach which simplifies the understanding of how different factors affect the estimated worth of a measurement. 2.5 APPLICATION AND EXTENSION OF THE METHODS The methods presented in Section 2.4 for discrete events will be applied to more complex cases involving spatial correlation and multiple realizations to determine probabilities. The presentation of these ideas must await the presentation of the geological arguments (Chapter 3), sequential indicator simulation (Chapter 4) and contaminant transport concepts (Chapter 5). 2.6 NOTATION A D prior best design alternative A L liner alternative A N L no liner alternative A.J. true best design alternative B(t) known benefits in year t C(t) known costs in year t j * possible event event of no window in aquitard E T true event Ew event of a window in aquitard i discount rate Pf(t) probability of failure in year t R(t) risk in year t Reg(Aj) regret of design alternative Aj Sj sample outcome Sj ,^ sample outcome is no window Syj, sample outcome is a window t time T engineering time horizon U(Cf) utility function W J worth of data by the expected increase in the maximum objective function W J worth of data by the expected reduction in minimum regret W 3 worth of data by the expected posterior worth of a measurement Z objective function CHAPTER 3: GEOLOGICAL PREDICTION OF AQUITARD CONTINUITY 3.1 INTRODUCTION The purpose of Chapter 3 is to explain why a conceptual understanding of local geology can be important in predicting aquitard continuity. There is a sti-ong relationship between patterns of heterogeneity of clastic rocks and die depositional environment that formed them (Le Blanc, 1977a and b; Weber, 1982; and Weber, 1986). A brief overview is presented below of a number of depositional environments and the geometric characteristics of clay layers deposited in them. Refer to Ravenne et al. (1989), Sneider et al. (1978), Jardine et al. (1977), and Le Blanc (1977a and b) for a more detailed overview of patterns of heterogeneity formed in different depositional environments. A review of previous work in quantifying shale/clay layer heterogeneity is then presented. Finally, the importance of a conceptual understanding of die local geology in predicting aquitard continuity is emphasized. In general, die relationship between depositional envkonment and patterns of geological heterogeneity is poorly understood for two reasons. Firsdy, the patterns of heterogeneity can be very complex. And secondly, the available data base on the three dimensional geometiy of clastic rock sd^atigraphy is small. 3.2 THE EFFECT OF DEPOSITIONAL ENVIRONMENT ON THE GEOMETRIC CHARACTERISTICS OF C L A Y LAYERS. 3.2.1 BRAIDED STREAM Braided streams occur in high energy environments where there is an abundant supply of coarse grained sediments (Walker, 1983). They also have easily erodible banks which allows the stream to laterally migrate. The Brahmaputra, for example laterally migrates up to 900 m per year (Selby, 1985). The result is that fine grained material generally washes through the system and any clay deposits formed are often eroded by the constantly changing river. Patches of shale are rare, elongate in shape and are often less than a meter is size (Potter et al. 1980). Leopold and Wolman (1957), cited by Wadman et al. (1979) found in a study of 17 braided and straight channel deposits that the average length to width ratio of shale layers was three. 3.2.2 MEANDERING STREAM Meandering rivers form under lower energy environments than braided streams and generally carry smaller volumes of sediments, which are fined grained. The meander belt is generally 15 to 20 times the width of the channel (Le Blanc, 1977a). The Mississippi River meander belt is up tol5 to 20 miles wide. Shale in meandering streams predominantly occurs as accretionary flood plain deposits, as plugs in abandoned stream channels, or as clay drapes. Clay plugs are formed from clay being deposited in abandoned channels. The amount of clay deposited depends upon the rate at which the channel was abandoned. Rapidly abandoned channels are predominantly filled with clay, while slowly abandoned channels are composed of sand and silt as well. The size of the plugs are limited to the size of the stream channel (Richardson et ai., 1978). The plugs are generally resistant to erosion by the meandering stream. Flood plain deposits are formed during flood periods when the river overflows its banks. The flood plain deposits are areally extensive, but are subjected to erosion by the meandering river. Consequently, flood plain shales are preserved in elongated isolated bodies that range in size from 400 to 16(X) m wide and 1600 to 3200 m long. Studies of ancient flood plain deposits indicate that shales make up from five to 15% of the flood plain on an areal basis (Richardson et al., 1978). Potter et al. (1980) report diat flood plain shales can extend up to tens of km parallel to shoreline and can be up to ten or more meters thick. Some flood plain deposits in back-swamps and flood basins are not on the meander belt, and will not be eroded by the migrating river (Le Blanc, 1977a). Drapes are formed when clay is deposited on point bars at the ebb of a flood when die sd-eam velocity is almost zero (Le Blanc, 1977a). Drapes are generally small in size compared to either flood plain or abandoned channel clays with their size ranging from that of local depressions to entire point bars. Clay drapes generally range from one mm to more than 30 to 60 mm in diickness and are more common in the upper half of the point bar sequence (Le Blanc, 1977a). 3.2.3 DELTAIC There are many types of deltas depending upon the energy of the environment, sediment source and supply. For simplicity, the discussion here will be limited to the two extreme types of deltas. These are (a) high energy sand deltas and (b) low energy mud deltas. High energy sand deltas are characterized by a few active meandering, disd-ibutary channels. Low energy mud deltas are characterized by many bifurcating distributary channels which tend to be straight, and not meander as much. There are a vast number of different types of deltas in between the two extremes. Al l deltas can be divided into a number of subenvironments consisting of distributary channels, delta flood plains, the delta front and the prodelta. Delta flood plains commonly contain dozens of abandoned distiibutary channels and have only a few active channels (Le Blanc, 1977a). Just as in the meandering stream environment, die abandoned channels are fdled with sediments. Quickly abandoned channels are predominandy filled widi clay whUe slowly abandoned channels are filled widi more silt and sand. Delta flood plain deposits are formed in the same manner as meandering river flood plain deposits. The size of a distributary channel, and hence its associated deposits is dependent upon several factors including: the delta size, type of bed material being eroded, and the position of the channel in the delta. For example distributary channels in the Mississippi near the sea are typically less than 200 m wide and 10 m deep, while the upstream channels are up to 1000 m wide and 60 m deep (Sneider et al., 1978). In high energy sand deltas, fines are winnowed away from the delta fi'ont deposits. Consequently, only very discontinuous shale layers are deposited. Shale layers are more common at the bottom of distributary mouth bars. In low energy deltas, shale layers have a much greater continuity. In distributary mouth bars, 2 cm thick clay beds extend over hundreds of meters, particularly in the lower half of the bar deposits (Sneider et al., 1978). The prodelta shales can be very continuous, having dimensions similar to that of marine shales (Richardson et al., 1978). Shallow deltaic clays can be gready affected by bioturbation, gas production, and growth faults. Bioturbation is important because it can greatly increase the hydraulic conductivity of a clay. 3.2.4 ESTUARINE Estuaries occur where a river mouth is affected by tidal action. Tidal channels meander between mudflats and channel bars of mud and sand. The difference between deltas and estuaries is that deltas have a large of amount of sediments being deposited in the river mouth. In estuaries, tidal currents and river discharge keep the river channels open (Selby, 1985). However, similar depositional features are associated with both estuaries and deltas. For example, barrier bars, lagoons and coastal swamps can be found in both types of environments. The nature of the clay deposits is dependent upon the forces affecting them. For example, in esmaries with low wave energy, mud banks can occur with areal dimensions which are up to lOO's to 10(X)'s of meters squared. In estuaries with high wave energy, clay mosdy occurs in small to medium sized patches which are a few meters thiclc. In tidally dominated estuaries, drapes can occur on point bars and on sheets parallel to die channel (Potter et ai, 1980). Estuarine clays can be gready affected by bioturbadon. 3.2.5 SUBMARIISfEFAN The classical depositional process of the submarine fan is the turbidity current. Submarine fans predominandy occur on delta fronts, continental shelves and deeper ocean basins. Turbidite layers tend to be laterally extensive covering hundreds of meters and with litde variation in thickness (Walker, 1983). Shales are predominandy deposited in the lower portion of submarine fans and can cover large areal extents. However, shales of smaller size can be deposited in the proximal portions of the fan. Hazue et al. (1988), in a study of a submarine fan found that the average lengdi of non-correlative shales was approximately 500 m in distal portions of the fan and 200 m in proximal portions. Shale layers often occur in thin layers interl)edded with sandstone and silt. Shales layers can be eroded in cases of increased sediments supply because fan channels may cut down into lower fan sediments (WaUcer, 1983), incising die clay layers. 3.2.6 MARINE Deep marine shales have large areal extents with thin (0.3 m) shale layers commonly covering two to five square kilometers. Thicker shales are commonly continuous for a hundred km (Richardson et al., 1978). Shallow marine clays are very susceptible to bioturbadon. 3.2.7 GLACIAL There are two main types of deposits associated with glaciation: tills which are deposited directly by the ice, and sediments deposited by meltwater. Deposits of glaciolacustrine clay form some of the most extensive shallow aquitards in North America (Freeze and Cherry, 1979). Clay deposited by meltwater streams will be similar in character to those deposited by fluvial processes and therefore will not be discussed here. Tills are generally composed of clay sized particles and are hydrologically important because they often form aquitards. The hydraulic conductivity of dense, fine grained unfractured glacial till, typically ranges from 10'^Â° to 10"^ ^ m/s. Glacial tills form a major portion of near-surface aquitards in Canada and the northern U.S.A. because the last period of glaciation covered most of northern North America only 8 000 to 14 000 years ago (Press and Siever, 1974). Not enough geologic time has occurred for other surface deposits to form. Till can be formed by both alpine and continental glaciation. However, till deposits from alpine glaciation will not be discussed here because they comprise a small percentage of the till relative to those formed in continental glaciation in North America. Only the two most significant types will be discussed: supraglacial and subglacial till. Supraglacial till is carried on top of the glacier. Subglacial, or basal, till is laid down at the bottom of the glacier. The subglacial till forms a ground moraine that can be several meters thick. Massive basal till is the most widespread Pleistocene facies on land (Reading, 1983). Supraglacial till forms an ablation moraine. It forms on the sloping snout of the glacier and is left behind as the glacier recedes. Sheets of ablation and basal till can be continuous over extensive areas. For example, in Southern Ontario till sheets are continuous over tens of kilometers. Till units are generally thicker near the source rock and thinner towards the outer margins of the glacier. However, the thickness of till sheets is also controlled by the relief. The till is diicker in hollows and thinner or absent in topographic highs. The erodibility of the substi-ata is a major factor in controlling till thickness. Even though extensive deposits of massive till occur, they often contain isolated bodies of coarse, stratified sediments deposited by englacial and subglacial streams. In general, die stratigraphy of glacial deposits tends to he very complex. The composition of till can be heavily dependent upon its source. For example, glaciers that cross over old lakes pick up fine grained lacustrine sediments and leave a clayey till. In general, when the source rocks are sedimentary, till is predominandy clay sized. When the source rocks are granite, or other crystalline rocks, till is more sandy or silty. Tills formed from crystalhne source rocks are less likely to form aquitards. 3.3 QUANTITATIVE DESCRIPTION OF SHALE HETEROGENEITY Most of die work in quantifying the relationship between shale/clay layer heterogeneity and depositional environments is recent and is restricted to die field of peti-oleum geology. Zeito (1965), noted the dimensions of shales in different types of environments. Weber (1982) extended diis work. He presented the cumulative probability distribution of the lengdis of shale layers formed in marine, delta barrier, delta fringe, delta plain, distiibutary channel, and coarse point bars environments (Fig. 3-1). Haldorsen (1989) and Hazeu et al. (1988) present similar cumulative probability distributions for shale layers formed in submarine fans. Geehan et al. (1986) also presents similar curves for shales formed in flood plains, abandoned channels, and as drapes on point bars. Haldorsen and Lake (1984, p. 449) note that "No universal validity is claimed for die findings of Zeito, but there appears to be significant confidence in some of the data." The important point to note is that there is a distinct difference between the different cumulative distributions. However, there is uncertainty in the exact probability distribution for a particular environment for two reasons. The first is the cumulative probability distributions are estimated from a limited number of measurements. The second reason, as pointed out by Haldorsen (1989), is that the dimensions measured in outcrop of shale layers, as reported by Weber (1982), are not the true length of the shale, but a cross section through the shale layer at some unknown angle. Little work has been done in quantifying the spatial distribution and correlation of shale/clay layers formed in different sedimentary environments. Zeito (1965) found in a study of outcrops that the centers of shale layers were randomly distributed. Inconclusive work has been done on relating the length of a shale layer to its thickness. Delhomme and Giannesini (1979), cited by Haldorsen and Chang (1986), concluded that in the deltaic system that they studied, the thickness of shale layers was only related to local topography that prevailed at the time of deposition and was independent of shale length. However, Wu et al. (1973) found an approximate relationship between the thickness of different layers and thek lengths in Mississippi alluvial deposits. Kossack (1989), in the most comprehensive study, reported that there were only six publications with enough layer length and thickness data to permit analysis. His results are inconclusive. Some work has been done on quantifying the geostatistical nature of clay/shale layers. However, this work will be presented in Chapter 4, after geostatistical concepts have been introduced. 3.4 THE IMPORTANCE OF GEOLOGY IN PREDICTING AQUITARD CONTINUITY A conceptual understanding of the local geology can be the most critical factor in predicting aquitard continuity in clastic depositional environments. This will be particularly true when diere is a shortage of measurements of the aquitard, which will be the norm in most hydrogeological design situations. This is 0 100 200 300 400 500 600 L E N G T H DF S H A L E L A Y E R Cn> Figure 3-1: Cumulative probability distribution of shale lengdi in different depositional environments, (modified from Weber (1982)) clearly demonsti-ated in an interesting study of shale continuity in the Ivishak peti-oleum reservoir at Prudhoe Bay by Geehan et al. (1986). The Ivishak Formation is a combination of sandstones, conglomerates, and shales deposited in a largely progradational, fluvial/deltaic complex. The continuity of shales in one chronostratigraphic horizon was studied. The thickness of intersected shales was first contoured without any geological input (Fig. 3-2). This results in a maximum continuity case. (Note, for example the three data points at X , Y, and Z.) However, a very different picmre is obtained once die data are contoured to geologically conform to the genetic type of shale (Fig. 3-3). The reason for this change can be seen by examining the shale at X , Y, and Z. The shale at these points was assumed to be continuous in Figure 3-2. However, the shale at points X , Y , and Z represent drape, abandoned channel, and flood plain facies respectively and dierefore cannot be one continuous unit. The flood plain shale will have a large lateral condnuity, but the abandoned channel and drape shales will have a lateral continuity on the order of lO's of meters. Hence, even diough the three shale units are present in three adjacent wells, are chronostradgraphically equivalent, and are of the same approximate thickness, diey do not correlate. Consequendy, the addition of geological information has fundamentally altered the picture of shale continuity. The importance of using geological information is shown by the sti^ ong relationship which exists between patterns of clay heterogeneity and depositional environment. Even if the relationship is not well quantified, it could be vital to include because the range of characteristics is immense. For example, a clay layer formed in a braided stream environment would be expected to have a continuity on the order of meters while a clay layer formed in a deep marine environment would be expected to have a continuity on the order of kilometers. Therefore, simply knowing the depositional environment that formed an aquitard can provide crucial information in predicting aquitard continuity. Geological information will be most useful when it can be combined with existing quantitative information. The Bayesian methodology for carrying out this combination is developed in Chapter 4 and is carried out for the Savannah River Site case history in Chapter 8. However, it must be noted that combining geological understanding with quantitative data can be limited at present. Not enough work has been done in estabUshing the necessary quantitative relationships between environment of deposition and patterns of shale/clay layer heterogeneity. But, these relationships will become better established in the future as more research is carried out. 0 1 ^Jl-E 1 SHALE THICKNESSES ARE IN FEET nodified fron Geehan et al. (1986) - 'CDNTINUDUS' SHALE: ND GEDLDGIC BIAS - ZERD THICKNESS CONTOUR - DATUM POINT Figure 3-2: Prediction of shale continuity in the Ivishak Formation without geological information, (modified from Geehan et al. (1986)) Figure 3-3: Prediction of sliale condnuity in die Ivisiiak Formation widi geological information, (modified from Geehan et al. (1986)) CHAPTER 4: SEQUENTIAL INDICATOR SIMULATIONS OF AQUITARDS 4.1 INTRODUCTION This chapter describes how aquitards are numerically generated using the sequential indicator simulation (SIS) algorithm. The SIS algorithm is powerful because it not only accounts for the spatial correlation of a random variable of interest, but provides a means of incorporating measurements that have a wide range of reliability. Incorporating measurements with a wide range of reliabihty is vital in hydrogeological design because precise (hard) data is usually scarce and expensive while cheaper, imprecise measurements are often more readily available. Some general geostatistical concepts will be presented in Section 4.2. This will be followed by a discussion of indicator kriging, the mathematical basis of SIS (Section 4.3), and how the SIS algorithm works (Section 4.4). Section 4.5 will present non-Bayesian and Bayesian methods of inference of the geostatistical parameters used by the SIS algorithm. The Bayesian method will be used as much as possible in this thesis; however, the non-Bayesian method will also be used in some cases for practical reasons. Bayesian inference is the key to incorporating geological understanding into the SIS algorithm. Limitations of the SIS algorithm are then discussed in Section 4.6. Finally, alternative simulation algorithms will be covered (Section 4.7) and the geostatistical nature of clay layers will be reviewed (Section 4.8). 4.2 BASIC GEOSTATISTICAL CONCEPTS Only the basic geostatistical concepts that are necessary for understanding this thesis will be covered here. For a more detailed presentation refer to de Marsily (1984), Clark (1979), or Joumel and Huijbregts (1978). Geostatistics is analogous to classical statistics. A realization, or data set, z (denoted by a lower case letter) is used to infer statistical parameters governing a random variable, Z, (denoted by a capital letter) from which z was drawn. However, in geostatistics the random variable is a regional variable, Z(x), with spatial correlation. From now on, unless otherwise stated, all random variables are assumed to be regionalized variables. Only statistical parameters based on the first two moments of Z(x) are of interest here: the mean, ^ i ^ , the variance, cs^, and the covariance function, Cov(h). In all practical situations, there wUl never be enough available data to estimate parameters based on higher order moments. The covariance function quantifies the spatial correlation of Z(x). It is similar in nature to the variogram, a term more familiar to many practitioners of geostatistics. The variogram will be discussed in more detail later in diis section. However, this diesis will utilize the covariance function rather dian the variogram. The covariance between Z at two points X j and X2 is defined by: where Xj and X2 represent locations in two dimensional space and E denotes the expected value. The covariance between two coincident points is die variance: Cov(Z(xi),Z(x2)) = E[(Z(xi) - Mz(,,))(Z(x2) - Mz(x,))] (4.1) Cov(Z(xi),Z(xi)) = E[(Z(xi)-Mz(,_))(Z(xi)-Hz(,^))] (4.2) = Var(Z(xi)) Geostatistics is based on two fundamental assumptions: ergodicity and stationarity. Ergodicity says that die actual realization, z, has the same statistical properties, or probability density function, as the ensemble of possible realizations generated by die random process Z(x). This is analogous to the assumption in classical statistics that the sample is representative of the population from which it was drawn. The assumption of ergodicity is necessary for the inference of the geostatistical parameters governing Z(x) to be made because it allows one to trade spatial averages for ensemble averages. The assumption of stationarity greatly simpUfies the statistical properties of Z(x) allowing tractable estimations of Z(x) to be made. Z(x) is stationary if, for a finite number of points n and any separation distance h, the joint distribution of Z(xj), Z(x2) , . . . , Z(Xn) is the same as the joint distribution of Z(xj + h), Z(x2 + h) Z(Xj, + h) (Meyers, 1989). This means that all of the moments of Z(x) are independent of spatial location, x. This assumption is too strong to be useful in practical situations; therefore, a weaker assumption of second order stationarity is often made. In second order stationarity, only the first two moments are stationary. Therefore, the mean and the covariance of Z(x) are independent of x. Second order stationarity will be assumed in this thesis. If the spatial correlation is isotropic, then the covariance of Z(x) at two points is only dependent upon the separation distance, h, and not on the direction of the separation. Equation (4.1) then becomes: Therefore, for simplicity Cov(Z(x),Z(x - h)) will be referred to as Cov(h). Cov(h) is usually assumed to follow a well defined analytical function, such as the exponential, spherical or Gaussian functions. The exponential covariance function has the form: Cov(Z(x),Z(x - h)) = E[(Z(x) - M^(Z(x - h) - M ^ ) ] (4.3) Cov(h) = az2exp(-hA) (4.4) while the Gaussian covariance function has the form: Cov(h) = az2exp(-(hA)2) (4.5) where o-^ is the variance of Z(x) and X is the correlation length. The correlation length is a measure of the strength of the spatial correlation. It represents the distance at which the covariance function drops to a value of Cov(0)*e"*. The spherical model has die form Cov(h) = C T z 2 ( l - | ^ + ^ ) h<a (4.6) = 0 h>a where a is the range. The range is another common term used to express the strength of the spatial correlation. It differs from the correlation length in that it represents the separation distance, h, at which the correlation becomes zero. Therefore, an exponential or Gaussian covariance function with X=10 m has a stronger correlation than a spherical covariance function with a = 10 m. Example spherical, Gaussian and exponential covariance functions with X and a =10 m are shown in Figure 4-1. h (m) Figure 4-1: Plot of die exponential, spherical and Gaussian covariance functions, for o-^ = 1, A, = 10 m and a = 10 m. The assumption of isotropy of Cov(h) may be poor in many real geological situations because correlation is often dependent upon direction. For example, the spatial correlation in an aquitard is generally higher parallel to bedding than perpendicular to bedding. Unless stated otherwise, it will be assumed that the spatial correlation of an aquitard is isotropic in the horizontal plane. A weaker form of stationarity than second order stationarity is the intrinsic hypothesis. It assumes that the variance of the first increment is finite and weakly stationary. This concept is utilized with the variogram, 7(h), which is defined by: Consequently, 7(h) is only dependent upon the squared differences of Z(x) and not on (J^. Under conditions of second order stationarity, the Cov(h) and 7(h) are related by: An important concept in geostatistics is the volume-variance relationship. The variance and covariance of Z depend upon the volume, or scale, of the measurement of Z(x). The covariance functions shown in Fig 4-1 are for the point scale. As the volume increases, and Cov(h) will decrease, until the extreme where the volume represents the entire domain of interest and both (s^ and Cov(h) are zero. Normally the volume variance relationship poses problems in estimating the geostatistical parameters of Z(x). The volume variance relationship will be ignored in this thesis for reasons that are explained in Section 4.5. 7(h) = 0.5E[(Z(x+h)-Z(x))2] (4.7) Cov(h) = Cov(O) - 7(h) (4.8) 4.3 INDICATOR KRIGING (}K.) This section will introduce indicator kriging (IK). It begins with a discussion of indicator random variables, the different types of data, and the estimation of geostatistical parameters. The problem of estimating the occurrence of the aquitard at a point x, wdl then be discussed. There are several geostatistical methods for tackling this problem, depending upon the stiictness of the statistical assumptions made. Only two methods will be covered here: simple and ordinary indicator kriging. For a more detailed description of indicator geostatistics, refer to Joumel (1989), Alabert (1987), or Joumel (1983). 4.3.1 INDICATOR RANDOM VARIABLE In IK, it is not the specific value of a random variable Z diat is of interest, but whether Z falls into a particular class or not. The indicator random variable I(x) is defined as: i(x,z) =lifz(x)<z^ (4.9) = 0 if z(x) > Zj. where z^ , is some critical cutoff. I(x) has a mean jij and variance a^. The mean represents the expected value of I(x) at a point x. Therefore, = E[I(x)] = P(z(x) < z,)(l) + P(z(x) > zJ(0) (4.10) = P(z(x)<z,) In odier words, iXj represents die expected probability diat z(x) at some point x is less dian die critical value z^. Since it has only two possible states (0 or I) it is a binary, random variable with variance, cs^, equal to: CT,2 = n i ( l - HI) (4.11) Hence, the exponential covariance function will have the form: Cov( h) = Hi(l - (4.12) The concept of the indicator random variable fits exactiy with our interest in aquitard continuity. Is the aquitard present or not? If Z(x) represents the thickness of the aquitard at point x then: where the inequality in equation (4.9) has been changed to an equality since negative thicknesses are impossible. In the above case, there are only two classes; however, there can be any number of classes in indicator kriging. Hence, it is possible to handle a variety of different types of heterogeneities, such as a sand, silt, clay sequence. The only drawback is that computational requirements dramatically increase with the number of class intervals handled. 4.3.2 TYPES OF DATA i(x,z) = 1 if z(x) =0 (i.e. aquitard is not present) (4.13) = 0 if z(x) > 0 (i.e. aquitard is present) There are two main classes of data in indicator kriging: hard and soft data. Hard data represent measurements which have a negligible uncertainty. For hard data, the presence, or absence, of the aquitard will be known with certainty and I(x) will have a value of either 1 or 0. An example of a hard measurement would be a borehole which was cored in its entirety, with a distinct aquitard that could be easily recognized. In this thesis, a hard random variable is referred to as I(x) and a hard datum as i(x). Soft data represent measurements which have uncertainty. Geophysical surveys, or a borehole that was incompletely cored could represent soft measurements. In this thesis a soft random variable is referred to as Ij,(x) and a soft datum as i^(x). Alabert (1987) defined three types of soft data: types (a), (b), and (c). For a more detailed description, refer to Alabert (1987). Type (a) soft data represents a single valued measurement z^(x) which is only an estimate of the tine value z(x). Therefore, a type (a) soft measurement will have a value of either is(x)=0 or is(x)=l. However, it is of questionable reliability. The reliabiUty of type (a) data is quantified by two probabilities Pi(x) and p2(x) defined as Pl(x) = P(Z,(x) = OIZ(x) = 0) or P(I,(x) = lll(x) = 1) P2(x) = P(Z/x) = OIZ(x) > 0) or P(I,(x) = lll(x) = 0) Pi(x) is the probability diat a window will be sampled given diat one exists. This represents die confidence that the soft measurement is correct. P2(x) is the probability diat a window will be sampled, given that one does not exist. This represents die chance of the measurement being incorrect. The error in die soft data is assumed to be independent of spatial location; dierefore, pj(x) and P2(x) will simply be referred to as pj and P2. Alabert (1987) discusses how these probabilities can be estimated using calibration samples. Calibration samples are samples which have been sampled by both "hard" and "soft" techniques. They allow die soft data to be direcdy compared to die hard data. Type (b) soft data represents an interval (Zj^jâ€ž(x), Z^^^(\)) over which the true value is known to be bound. In type (c) soft data, the true value of Z(x) is assumed to follow a probability distribution. Only type (a) soft data will be used in this thesis; therefore, types (b) and (c) will not be discussed further. 4.3.3 SIMPLE INDICATOR KRIGING (SIK) WITH HARD DATA In SIK, I(XQ) is estimated at point x^ by the following linear combination of the Uj, hard data points: where i(xâ€ž)* is the estimate of l(xj. It is an unbiased estimator. The weights, aj, j=l,..,n, in equation (4.14) are calculated by solving the following system of equations: where Cjj^ is a covariance coefficient, representing Cov(I(Xj),I(xjj)). C j^j. represents the covariance between data point x,;^ and the estimation point x^. The development of the system of equations (4.15) is given in Joumel (1989) and will not be repeated here. They are developed as an optimization problem by minimizing the variance, Var(i(xQ)* - i(xQ)Y), between iix^)* and the true value, Kx^W-"h i(Xo)* = III + X aj(i(Xj) - HI) j=i (4.14) "h X ^j^jk ~ ^ o k ' â€˘ â€˘ â€˘> n j=l (4.15) Var(i(xâ€ž)* - i(xâ€ž)T) = X I (-aj)(-ak)Cjk (4.16) j=0 k=0 where ao = -1 and Cj,j = Cov(I(Xj),I(x^)). After some simplification of equation (4.16) Var(i(xQ)* - i(xQ)j) can be calculated by: Var(i(xâ€ž)* - i(xâ€ž)T) = - i a^ Coj (4.17) j=i From equations (4.14) and (4.15) i ( X o ) * , is dependent upon knowledge of fij and Cov(h). SIK also requires second order stationarity. If no data points exist, then i(Xj,)* = Hj. 4.3.4 ORDINARY INDICATOR KRIGING (OIK) WITH HARD DATA In OIK, I(x ,^), is estimated by the following linear combination of the n^ data points: "h i ( X o ) * = S aji(Xj) (4.18) j=i The weights aj j = l , n ^ , are calculated by solving die following system of equations: "h S HjCjk + Ti = Câ€žk, k= l , . . . , n (4.19) j=l "h I a j = l (4.20) j=i where TI is a Lagrange multiplier. These equations are derived similarly to die SIK equations by calculating die weights aj such diat die variance of die estimate, Var(i(xQ)* - is a minimum, widi the constiaint that die estimate is unbiased. For a more detailed development of the OIK equations, refer to Joumel (1989) or Joumel and Huijbreghts (1978). The statistical assumptions necessary to apply OIK are more relaxed than those needed to apply SIK. As noted in the above three equations, only the covariance function is assumed known. In addition, OIK does not require second order stationarity. OIK can be carried out under the intrinsic hypothesis using the variogram. 4.3.5 COMPARISON OF OIK AND SIK The difference between estimates based on SIK and OIK are illustrated in Figure 4-2. The probability of a hole existing at fifteen points (xj, Xj, X3 ... X15) spaced ten meters apart along a one dimensional line has been estimated by both SIK and OIK. Two hard measurements exist: i(x3)=l and i(xg)=0. For SIK, it has been assumed that lij is equal to the sample average, mi=0.5. Therefore, An exponential covariance function with = 5 m has been assumed for both OIK and SIK. Therefore, Note that both OIK and SIK give the same estimates. This is because Hj was assumed to be equal to mj. Note that at the measurement points, the measured datum is reproduced. However, as estimation locations get further from the measured data points, i*(x) approaches HJ in the case of SIK and mj in the case of OIK. = (0.5)(1 - 0.5) (4.21) = 0.25 Cov(h) = 0.25e-5/h (4.22) X (m) Figure 4-2: Plot of i*(x) on 1-d line by bodi OIK and SIK. The variance of the esdmates, Vai(i(Xg)* - i(Xo)j), by both SIK and OIK are zero at the data points (Fig. 4-3). However, at esdmadon points away from data points, die variance of esdmates by SIK are lower than those by OIK. There is more certainty in i*(x) by SIK because of the more rigid statistical assumptions necessary to carry out SIK. Recall diat in SIK, |ij is assumed known while in OIK it is not. Knowing jXj provides an exti-a consti-aint on i*(x), reducing die variance or increasing die certainty. 4.3.6 EXTENSION OF SIMPLE INDICATOR KRIGING TO HANDLE TYPE (a) SOFT DATA There are now soft data available in addition to the n^^ hard data. The indicator value, i*(Xo) at some point X Q is estimated by co-kriging the soft and hard data through the following system of equations: where. " h "s i * ( X o ) = ILti + S aj(i(Xj) - t^I) + I bk(is(Xk) - Hj^ ) j=i k=i (4.23) Ordinary Kriging Simple Kriging â€” â€” Assunned Var iance Â° 6 10 2 0 3 0 4 0 50 6 0 70 8 0 9 0 100 110 120 130 140 150 X (m) Figure 4-3: Variance of i*(x) from SIK and OIK. - Hi is the average of the hard data, and - H-i is the average of the soft data. The co-kriging weights are calculated by solving the following system of equations: " h "s X ajCovi(x,-Xj)-H X bkCovii^(X|-Xk) = Covj(Xo-X|) l=l,...,nh j=l k=l ^ " h I ajCovii^(xâ€ž-Xj) + X bkCovj^(xâ€ž-Xk) = Cov,(xâ€ž-xâ€ž) m=l,...,n, j=l k=i (4.24) where - Covj(xj-Xj) represents the covariance between two hard datum, one at point Xj and the other at point Xj. - Covjj (x,-x^ )^ represents the covariance tÂ»etween a soft datum at point X | and a hard datum at point x^, or vice versa. - Covj (Xn,-x^) represents the covariance between two soft data, one at point x^ and the other pointĂ˘t x^. The covariances, Covjj (xj-Xj^ )^ and Covj (x^-x^) are calculated from Covj(x,-Xj) using the probabilities Pj, and pj. These calculations are discussed in Secdon 4.5. For a more detailed development of equation (4.24), refer to Alabert (1987). A number of different sources of soft data could be incorporated into the above system of equations. However, the probabilities pj and P2 would have to be known for each source of information. The indicator kriging technique used here for soft data is a Taylor-series type linear approximation of Bayes' equation (Alabert, 1987). For the case of one soft datum and no hard data, it is equivalent to Bayesian updating. However, for greater than one soft datum, soft data points are not reproduced exacdy. Consequendy, the conditioning of a soft datum may not be as sttong as it should be. Al l hard data points will be reproduced exacdy. 4.4 SEQUEISfTIAL INDICATOR SIMULATION ALĹ“RITHM (SIS) The SIS algoridim is a simple mediod of numerically generating patterns of the indicator random variable in one, two or diree dimensions. Only one and two dimensional patterns will be generated here. A one dimensional pattern will represent a two dimensional vertical cross section through an aquitard. A two dimensional pattern will represent a map view of an aquitard, for use in three dimensional hydrogeological representations. The methodology will be illustrated with the generation of a one dimensional pattern of an aquitard. The stratigraphie horizon representing the possible aquitard is first divided into a series of equal sized blocks (Fig 4-4). The selection of the block size will be discussed in Chapter 6. I I I I I I I I I I I I T T T l Figure 4-4: One dimensional stratigraphie section divided into blocks The SIS algorithm is then carried out in the following steps: Step 1: A random generation path is chosen through all of the blocks. This has been done in this thesis using the following congruential generator (Bratiey et al., 1987): Bj = (5 X Bj.i + l)mod2" (4.25) where Bj is the j ' * block along the path. Any block can be the starting point of the path. Each integer between 1 and 2" will be generated once. Therefore, n is chosen to be as small as possible, while maintaining 2" greater than the total number of blocks. Integers greater than the number of blocks are discarded. A random path ensures that generated realizations have a maximum degree of disorder. Step 2: An estimate, i(Bj)* is kriged at block j , the first block in the path along the aquitard. Step 3: A random number, p, is generated between 0 and 1 using a uniform random number generator. Step 4: The random number, p, is transformed into an indicator value by die following criteria: if p > i(Bj)* , then i(Bj) = 0 and the block represents aquitard, or if p < i(Bj)*, dien i(Bj) = 1 and die block represent a window. Step 5: The indicator value, i(Bj), is dien added to die data set where it is used to condition the generation of die next block in the path. Step 6: Return to step 2 until aquitard or a discontinuity has been generated at all aquitard blocks. In step two, either SIK or OIK can be used; however, the framework developed in this thesis utilizes SIK. While SIK is more restrictive to apply than OIK because it requires an estimate of Hj and Cov(h), using bodi parameters is a distinct advantage. Utilizing die mean in addition to die covariance function allows greater power in manipulating the characteristics of the numerically generated aquitards. Through Bayesian updating, the mean can incorporate one's conceptual understanding of the geology. Hence, the generated aquitards can be forced to conform more with one's understanding of the geology in SIK dian in OIK. Using SIK can be of fundamental importance, particularly widi sparse data sets. This will be illusti-ated with the following example. As discussed in section 4.3.2, kriged estimates range between the sample average, or mean, and the measured data values. If no holes are measured, which is likely in the advent of a sparse data set, the indicator value at all data points will be zero and the average indicator value will be zero. Hence, in OIK diere will be a zero chance of a discontinuity everywhere. This is clearly unacceptable because an aquitard formed in any depositional environment, wdl always have a chance of having a discontinuity. In SIK, |ij can be assumed to have a value greater than zero, forcing a finite chance of a discontinuity. As data sets increase in size, the difference between SIK and OIK decreases. 4.5 INFERENCE OF GEOSTATISTICAL PARAMETERS The geostatistical parameters can be estimated using either Bayesian updating or non-Bayesian approaches. In the non-Bayesian approach, only sampled data are used to estimate the parameters. In Bayesian updating, one's conceptual understanding of geology can be included in the inference. However, as will be explained below, the Bayesian approach can only be practically used for estimating the mean. The correlation length will have to be estimated using non-Bayesian methods. Both of the methods of inference will now be discussed below. 4.5.1 NON-BAYESIAN INFERENCE 4.5.1.1 Inference of the Mean and Variance The mean, fij, is the expected value of I(x). Therefore, lil =E[I(x)] (4.26) = (l)P(Z(x)=0) + (0)P(Z(x)>0) since for Z(x)=0,1(x)=l and for Z(x)>0, l(x)=0. The above relationship then simplifies to HI =P(Z(X)=0) (4.27) = F(0) where F(0) is the cumulative probability distribution of Z(x). Therefore, F(0) is used as an estimator of die mean. Hence, mj = F*(0) (4.28) where F*(0) is an estimate of F(0) and mj is an estimate of As a convention that will be used du-oughout diis thesis, unless odierwise stated an asterisk will be used to denote an estimate of a parameter or a random variable. An estimate of F(0) by hard and soft data will be referred to by Fi*(0) and Fj5*(0), respectively. Fj*(0) is calculated from n^ hard data by: F V 0 ) = I aji(Xj) (4.29) j=i where aj is a weight for i(Xj). If the data were independent random variables, then aj = However, the data are spatially correlated and is generally clustered about areas of special interest. These factors must be accounted for if mj is to be unbiased. One way of accounting for these two factors is cell declustering, which is a simple nonprobabilistic technique (Joumel, 1983). Cell declustering will calculate non equal weights aj in the above equation. Cell declustering is a very useful technique because knowledge of die correlation length is not needed. There are odier possible methods of estimating F(0) from spatially correlated data. For example, could be estimated over a block representing die entire domain of interest using ordinary kriging. Cell declustering has been chosen here because of its simplicity. Consider die case of n^ hard data located in some area D (Fig. 4-5) Declustering is carried out in die following steps: (1) Area D is overlaid with a regular grid of iij cells of size d. (2) The number of data Uj is counted in each cell dj. ( 3 ) Each datum in cell dj is weighted by H J and then summed. ( 4 ) The sum of data for each cell is then weighted equally. ( 5 ) F* j (0 ) is then calculated by: FVO) = ^ I I ^ i ( x , ) ^=1 kj=l i ( 4 . 3 0 ) where n^ is the number of cells that contain data points and Uj is the number of data points in cell j . M * Cell o f Size ol m This ce l l h a s 6 d a t a , e a c h rece iv ing a weight of 1/6 â€˘ u t l i n e o f Ar-ea D Figure 4 - 5 : An example of hard data which are declustered (modified from Joumel, 1 9 8 3 ) . A number of different cell sizes d should be tried. The d providing the lowest F* j (0 ) should be used (Fig. 4 - 6 ) . For very small d and very large d, F* j (0) simply represents an equal weighting of all data. F",<0> â€˘ p t i n o l X T 0 <^ d Figure 4-6: F*j(0) versus different cell size d (modified from Joumel, 1983) F*i (0) can be calculated in a similar manner from n^ soft data. However, die lack of reliability of die soft data will bias F*j (0) relative to F*(0). The reliability is accounted for by the following relation: Fi^ (O) =P(Z/x) = 0) = E[l,(x)] = Ei[E[I,(x)II(x)]] = E[I,(x)II(x)]P(I(x)=l) + E[I,(x)II(x)]P(I(x)=0) = PlP(I(x)=l) + p2(l-P(I(x)=l)) = PlF(0) + P2(l-F(0)) (4.31) Therefore, die unbiased estimate of F*(0), based on soft data, is F*(0) = -F*i /0)-p2 P1-P2 (4.32) F*(0) can be calculated from a combination of hard and soft data by: F*(0) = (Ă»F*i(0) + (1 - coy F*i /0)-p2 (4.33) Pi -P2 The relative weight, CĂą, represents the relative number of hard and soft data and the quality of the soft data. Alabert (1987) suggests that when the number of soft data is large, and pj and pj are correctly estimated, o) = nj(n^ + n )^. If the estimates of p, and pj are unreliable, or the number of soft data is small, then co should be increased. 4.5.1.2 Inference of Cov(h) Recall that the covariance function, defined by: is dependent upon [ij. Under conditions of second order stationarity and ergodicity, the covariance function can be estimated from the variogram, which is estimated without knowledge of |J.j, using equation (4.7). However, Isaaks and Srivastava (1987), quoted by Alabert (1987) report that it is better to estimate a covariance function directly from data, rather than deriving it indirectly from a variogram. Therefore, HJ will have to be estimated first by some method which accounts for the spatial correlation when it is not known, such as cell declustering. The estimate of Cov(h) from hard data alone will be referred to as Cov*j(h). A number of classes of separation distance,dj, between measurement points are defined. For example. Cov(I(x),I(x - h)) = E[(I(x) - Hi)(I(x - h) - HI)] (4.34) di = 0- l(X)m dj = 100 - 200 m dj = 200 - 300 m. For each class, die total number of pairs of points, H J , whose separation distance falls within that class, and their average separation distance, h^ .^ are calculated. Then for each class the average covariance between all pairs of points H J is calculated by: "J "j Cov(h,.) =11 [i(x^) - mi)(i(xâ€ž ) - m,)] (4.35) m=ln=m The Cov(h3.) is dien plotted versus h^.. An analytical function representing Cov*j(h), such as the J J exponential function, is then fitted to the plotted points. Soft data can be used to improve Cov*j(h). This involves the soft-hard indicator cross covariance and the soft indicator covariance. The soft-hard indicator cross covariance, Cov*,, (h) is estimated with the same procedure as Cov*i(h). Covu^(h) is related to Cov(h) by: Covji (h) Cov(h)= 1 (4.36) Pi -P2 This relation is only valid if the number of soft data is large. It is developed in more detail in Alabert (1987). It is based upon the following assumptions: (1) I(x), Ij(x), pj, and P2 are stationary (2) P(Z(xi) = 0IZ(X2) = 0, Z ,(X2) = 0) = P(Z(x) = 0IZ(X2) = 0) and P(Z(xi) = 0IZ(X2) > 0, Z ,(X2) = 0) = P(Z(x) = 0IZ(X2) > 0) Assumption (2) states that the information contained in a soft datum at point X2 Zj(x2), is negligible compared to the information in a hard datum taken at the same point, Z(x2). A combined estimate of Cov(h) could be: Cov*ii (h) Cov*(h) = coCov*i(h) + (1 - Ă» ) ) - ^ â€” ^ (4.37) The weight, (o, should be chosen to account for bodi the number of hard and soft data and die quality of the soft data. The soft indicator covariance, Cov*j (h), is again calculated from the soft data using die same procedure as the hard data. Cov(h) is related to Covj (h) by: Covj (h) Cov(h) = ; ^ (4.38) (Pl -P2) This relationship is developed in detail in Alabert (1987) and is based on the previous two assumptions plus the following additional assumptions: (3) P(Z,(x) = OlZ(y) = 0, Z,(y) = 0) = P(Z(x) = OlZ(y) = 0) and P(Z,(x) = OIZ(y) > 0, Z3(y) = 0) = P(Z(x) = OlZ(y) > 0) Assumption (3) states diat die information contained in Zfy) is negligible compared to Z(y). Assumption (3) is stionger than assumption (2) because it relates to the spatial correlation of the soft information. Assumption (3) would be invahd in die case of a stiong spatial correlation in the error Zj(x) - Z(x). For example, this could occur in die case of subjective guesses which are not made independendy of each other. A combined estimate of Cov(h) is Cov*â€ž (h) Cov*, (h) Cov*(h) = co,Cov*,(h) -H c o j - ^ ^ + C03 (4.39) The weights cOj are chosen to sum up to one. They represent the number of pairs used to estimate each of die covariances and the quality of the soft information. However, a fundamental problem arises in hydrogeology. While there may be enough data to quantitatively estimate tiiere will often not be enough data to quantitatively estimate Cov(h). Even, in cases where there is much data, the measurements are rarely spaced uniformly through the area of interest, but are usually clustered in particular locations. This problem could be tackled through a Bayesian approach where data is combined with geological intuition. For example, Cov(h) could be estimated by comparing the area of interest to similar areas where much data are available. Unfortunately, as will be explained in the next Section, the author is unaware of any straightforward Bayesian methodology for carrying this out. 4.5.2 BAYESIAN INFERENCE Bayesian statistics is philosophically different from classical statistics. The basic approach is that an estimate of a geostatistical, or other parameter, is updated by Bayes' equation as new information becomes available. In this way, an estimate is continually improved as new information becomes available. The Bayesian approach is very powerful because it allows one to combine different types of data. A geological model built from a conceptual understanding of geology can be combined with measured data. With very large data sets, estimates of geostatistical parameters based on the classical and Bayesian methods approach each other. Utilizing one's conceptual understanding of geology is of great importance in many cases where there is a lack of measured data. Recall from Chapter 3, that there should be a quandtadve relationship between HI (and Cov(h)) and depositional environment. This relationship could be used to estimate Hj, or Cov(h), based on information from a geologically similar area where many data are available. However, as will be discussed in this section, only the mean will be updated here. The Bayesian updating equation is developed below. Much work has been done on the Bayesian updating of probability density functions of independent random variables. The updating of probabilities for independent random variables is discussed in Section 2.3. However, here we will deal with correlated random variables. Unfortunately, litde work has been done on updating probability distributions of multivariate, correlated random variables. Kitanidis (1986) presents a methodology for updating the probability density function of a multivariate Gaussian spatially correlated random process for three different cases of parameter uncertainty. In the first case, die variance and covariance function are fully known. Therefore, only the mean is updated. In the second case, the covariance function is known except for a multiplicative constant. For example, this case corresponds to an exponential covariance function (Cov(h)=CT^exp(-h/A,)) with known correlation length Xj. In the final case, only the form of the covariance function is known: the mean and all parameters in die covariance function are unknown. This diesis will utdize die methodology developed for die second case to update the mean of I(x), even though I(x) is a binary random variable rather than a Gaussian random variable. The methodology should be very applicable to updating die mean of I(x) because it can be developed using only the first two moments and ignoring any distiibutional assumptions. A spatially correlated binary random variable is fully described by its first moment (or die mean) and covariance stincture. The author is unaware of any Bayesian updating equations developed specifically for multivariate spatially correlated binary random variables. The framework could be updated to incorporate new theoretical probabilistic methods direcdy applicable to binary random variables as they become available. The correlation length is not updated for two reasons. The first is to simplify the numerical calculations carried out in the framework. In the first of Kitanidis' two cases the likelihood functions of the parameters can be expressed with a fixed number of sufficient statistics, and conjugate priors can be used in the Bayesian updating. Therefore, the updating breaks down to simply solving a series of algebraic equations. However, Kitanidis' third case is much more difficult because the likelihood function of the parameters cannot be expressed in a fixed number of sufficient statistics. Therefore, no general analytical method is available for calculating the updated parameters. The second reason is that in any field investigation in hydrogeology, it is difficult to get enough data to numerically estimate the correlation structure, let alone update it. If enough data was available, and the effort was warranted, the framework could be adapted so that the covariance function could be updated by Bayes' equation. The details of the updating procedure for the second case are listed below. Kitanidis (1986) assumes that the random function, Z(x), has the following general linear form: Z(x) = ifj(x)Pj + v(x) (4.40) j=l where, - X is a spatial point where Z is sampled - fj(x) j = l , p are known functions of x - Pj j=l,..., p are parameters - \j/(x) is a zero mean spatial random function. The first term on the right hand side represents die deterministic component of Z(x). The second term on the right hand side represents a zero mean random field. For the special case where Z(x) is a second order stationary random field Z(x) = n , + V(x) (4.41) For a vector of n measurements, z, z = u[i^ + \j/ (4.42) where u is a vector of I's and \|/ has zero mean and covariance matrix Q^JiB). Qzz(9) is a function of parameters 0. It is assumed that the form of the covariance function is known except for a multiplicative constant, 0. Therefore, Qzz(0) = ^ z z (4.43) where S^ ^ is known. The likelihood function for and 0, given a measurement vector z, is P(Hz, 0lz) a 0^/2exp[-| (nj. b,)TH,(Hi - b,)]0^/2exp(.ivq,0) (4.44) where - H , b , = uTS,,-lz -r = rank(H,) - V = n - r - q , = (zX-'-bs'^uS,,-lz)/v - a represents "proportional to". The conjugate prior for [i^ and 0 is a normal gamma 2 function: P(H,. e)' a e^^exp[-| (m - m^Wdi, - m,')] 0^'/2-iexp(-^'q'0) (4.45) where - m ,^' = the prior estimated mean - v'=therankofH' - V = vector of ones, n^ long - Ug = number of prior measurements. The updated estimated mean, m / ' is H"m/ ' = H 'm/ + H,b, (4.46) where, H " = H ' + H^. H ' and represent weights for the prior and sample estimates of the mean, respectively. The above equation can be rearranged into m / =H"-i[H'm/ + H,bJ = [H' + H,]-i[H'm/ + uX - i ^ ] = [H' + uTS^^-iu]-l[H'm/ + u'^S^-h] (4.47) Refer to Kitanidis (1986) for a more detailed development of die updating equations. The Bayesian updating in this thesis will be carried out using a modified version of the above equations. This modified version is developed below. Since I(x) is assumed to be a second order stationary random variable, it will have the following form: I(x) = HI + (p(x) (4.48) where <p(x) is a zero mean binary random variable widi variance M.i(l - Hi). Therefore, the equation used to update mj, has the following form: mi" = [H' + uTSii-iu]-l[H'mi' + u^S^fH] (4.49) where i is a vector of data n long, and the term u^Sy'^u represents the weighting of existing measurements (or H^). The variance is direcdy updated from die updated mean by equation (4.11). To incorporate a geological estimate of the mean widi measured data, m{ would represent die estimate of Hi based on the geological model. The confidence in the prior estimate of the mean can be quantified through an equivalent number of n^ measurements which are spaced enough apart to be independent. Therefore, H ' can be calculated by H' = vTSii-lv (4.50) where v is a vector of ones n^ long and Su is the covariance matrix between the n^ measurements. Since the n^ measurements are independent. Su only contains a^^ terms along the diagonal, and all other terms are zero. Therefore, the above equation simplifies to (4.51) Equation (4.49) can also accommodate the case where no prior knowledge is assumed. No prior knowledge is also termed a diffuse prior. In this case H ' is set to zero and equation (4.49) collapses to One problem in using equation (4.49) in this framework is that is does not directly accommodate soft data because soft data is biased. The development of a Bayesian updating equation that will simultaneously handle both hard and soft spatially correlated data is a current research topic. If the number of soft data is small relative to other information, or its quality is poor, then its affect on the updated mean could be small. Under these circumstances, the soft data could be excluded from the updating, or estimation of the mean. However, if the soft data provides a substantial quantity of information, then it should be incorporated into the estimation of the mean. The incorporation could be carried out by assuming that the soft data forms an independent data set from the hard data. The estimation of the mean would then be carried out in two steps. In the first step, the prior geologically estimated mean, mj', would be updated by the soft data alone. The updating would be done by modifying equation (4.49). Since [u^ Sj j u][u'^ Sj^ j^ "^ u]-^ equals an identity matrix, equation (4.49) can be rearranged into (4.52) = [H'+uTSi,-lu]-l[H'm,' + [uTSii-1 u][uTSii-iu]-iuTSii-'i] (4.53) where Sj j is the covariance matrix between the soft data. Using equations (4.52) and (4.50), die above S S equation can t>e reduced to mj" =[H' + HJ-i[H'm,'+Hsmi] (4.54) where H is the precision matiix of the soft data and mj is the estimate of the mean from the soft data alone. The updated mean, mj", is simply the weighted average of the prior mean, mj', and the sample mean, mj, from the soft data weighted by the precision matrices. However, mj is still biased from the sample precision. Therefore, the above equation is modified using equation (4.32) to unbias mj, to form mi" = [H' + H J-1 [H'mj' +H, ]^ (4.55) Pl -P2 The above equation can then be used to update the mean using the soft data. In die second step, die final estimate of the mean, mj"', would be estimated by updating mj" with die hard data using equations (4.49) or (4.54). Note diat for hard data, equation (4.55) and equation (4.54) are equivalent because for hard data Pj = 1 and P2 = 0 allowing the precision terms in (4.55) to drop out. Note, that the updating of the mean is independent of the exact value of the variance of I(x), which is used to calculate the matrix Sjj and hence the weights, H ' and Hj. The magnitudes of both H ' and H^ are affected by the variance, but their relative sizes remain unchanged. Even diough die mean is updated using equations developed for Gaussian random variables, in reality its estimate wdl follow a Beta distribution, which is continuous between zero and one. 4.5.3 SUMMARY OF INFERENCE OF GEOSTATISTICAL PARAMETERS Both Bayesian and non-Bayesian approaches can be used to estimate the geostatistical parameters. Non-Bayesian approaches utilize measured data only, where Bayesian updating allows one to incorporate geological intuition into the estimate. Both approaches have advantages and disadvantages. The non-Bayesian methods are the most theoretically complete and will readily incorporate both hard and soft data. However, they cannot incorporate geological intuition which can be critical, particularly in the case of sparse data sets. Nor do they provide a methodology for updating estimates as new information becomes available; therefore, they do not direcdy fit into the Bayesian data worth framework used here. Bayesian updating can handle geological intuition and will readily fit into the Bayesian data worth framework. Unfortunately, work on the Bayesian estimation of geostatistical parameters of correlated random variables is not as complete as that of non-Bayesian methods. Only the mean can be readily updated and not the correlation length. In this diesis, the correlation length will be estimated by the non-Bayesian approach and is assumed to remain constant. The correlation length is not updated because the numerical complexity of carrying out the calculations is exti^ eme and because the required number of data to carry out effective updating will rarely be available. The updating equation does not simultaneously handle hard and soft data, but it has been modified to handle hard and soft data in two different steps. 4.6 LIMITATIONS OF SIS ALGORITHM There are several factors which provide limitations on the SIS algorithm. First, a number of geostatistical assumptions have been made to make the statistics tractable. Secondly, in reality die aquitard realizations are finite in size when theoretically they are assumed to be infinite. Thirdly, volume variance relationships do not work for the form of die SIS algoridim used here. Finally, some kinds of uncertainty are ignored by die SIS algorithm and diere is a practical hmit to die number of conditioning data points. 4.6.1 GEOSTATISTICAL ASSUMPTIONS Geostatistical assumptions which have been made include diose of ergodicity, stationarity, and isoti-opy. These assumptions were explained in detail in Section 4.2. If ergodicity is not true then geostatistics breaks down. The assumption of ergodicity is axiomatic and must be taken for granted. It is apparendy just accepted in the geostatistical community as being valid and there is no standard way of testing its validity (Sinclair, 1991). In many cases, I(x), may not be second order stationary, resulting in aquitard realizations which are not representative of the true state of nature. An example would be a facies change in the area of interest. The result would be a ttend in aquitard continuity. There are a number of ways of deaUng with conditions that are not second order stationary. For example low order ttends in the mean could be filtered out, leaving a residual which could be second order stationary. Completely non-second-order stationarity conditions could possibly be handled by replacing the SIS algorithm with a different scheme for numerically generating aquitards. One such approach might be to use an algoridim based on fractal-based geostatistics such as have been carried out in the petroleum industry (refer to Hewett and Behrens, 1988). The assumption of isoti^ opy is not critical. If anisotropic conditions exist, they can easily be built into the SIS algorithm. However, with limited data sets, anisotropy can be difficult to detect and even more difficult to estimate. 4.6.2 FINITE SIZE OF AQUITARD REALIZATIONS Simulation algorithms assume that the domain is infinite when in reality it is bounded and finite. The boundaries reduce the variability of the realizations, damping the sample covariance function of the realizations below its true covariance function. To reduce this effect, the total size of the discretized aquitard must be approximately four times greater than die correlation length used in generating the realizations. 4.6.3 VOLUME VARIANCE RELATIONSHIP Several kinds of data representing different volumes, or scales, may be available at a given site to condition the indicator simulations. These include: - boreholes, which are point size, - geophysical surveys, which cover a large area, and - previously simulated blocks, which can be up to hundreds of meters in size. Recall, that Uie SIS algoritiim sequentially simulates the aquitard blocks so tiiat a simulated block is conditioned on the previously simulated blocks. The weighting of a datum in estimating I(x) at point x is dependent upon the volume of the measurement because of the volume variance relationship. A datum with a large volume will have more weight in estimation than a datum with a small volume, given that the two data are equidistant from die point of estimation. If I(x) was a continuous random variable, then diis problem could be tackled using volume variance relationships. However, in diis thesis, I(Bj) is a discrete random variable with a value of either 0 or 1. The entire block represents eidier aquitard or a window. Under diese circumstances, I(Bj) is insensitive to the volume of Bj. Therefore, volume variance relationships do not apply (Soivastava, 1990). Therefore, the mediods oudined in this thesis are only able to handle measurements of one volume. This volume is set equal to the volume of the simulated blocks. I(x) would be a continuous random variable if it represented the average point value of I(x) averaged over every point within an aquitard block. Measurements with different volumes will be accounted for as follows. It will be assumed that the outcome of a borehole can be exd-apolated to the entire block. This will be realisuc if the size of the block is small and the minimum size of any clay layer is on the order of the size of the block. It is further assumed that there will only be one borehole taken per block. Geophysical measurements covering a large area will be handled by breaking up the area into a number of blocks covering the area, 4.6.4 TYPES OF UNCERTAINTY HANDLED There are three types of uncertainty: natural, stadsdcal, and model (Benjamin and Cornell, 1970). Natural uncertainty represents the uncertainty in the random variable itself. Stadsdcal uncertainty is the uncertainty in the parameters, such as mean and variance, governing the probabUity distribution. Model uncertainty is the uncertainty in the exact form of the probability density function, e.g. "does a random variable really follow a normal distribution, or does it follow a Gamma distribution?" Simple kriging only accounts for the natural uncertainty and ignores the statistical and model uncertainty because it assumes that HJ and Cov(h) are known. The statistical uncertainty in the assumed parameters can be accounted for by using their Bayesian disfributions (Benjamin and Cornell, 1970). Kitanidis (1986) presents a series of matiix equations for estimating spatially correlated random variables that accounts for the uncertainty in die different parameters. There is no straightforward procedure to account for model uncertainty (Benjamin and Cornell, 1970). As an initial first order approximation, model and statistical uncertainty will be ignored in this thesis. 4.6.5 NUMBER OF CONDITIONING DATA Recall that there is potentially a large amount of data to condition the estimate of 1(B) at some block, B. These data will not only include measured data, but previously simulated blocks. Ideally the estimate should be conditioned on all data. However, this is impractical since each conditioning data point adds one equation to the system diat must be solved for Uie weights aj, j=I,...,n. The solution quickly becomes numerically awkward as n increases because the effort needed to solve this system of equations is proportional to n^. Therefore, die number of conditioning data points is restiicted to die ones nearest to the estimation point. This limitation will not cause numerical problems because data points that are far away from the estimation point, or have odier data points in between themselves and the estimation point, will have little effect on the estimation result. For one-dimensional simulations, the limit is the five nearest previously simulated blocks and/or measured data on either side of die block being simulated. In two dimensional simulations, the 10 to 20 nearest measured data points or previously simulated blocks are used. 4.7 ALTERNATIVE SIMULATION ALGORITHMS There are several standard methods for generating realizations of correlated random variables. These include turning bands (see Mantoglou, 1987) and Cholesky decomposition (see Clifton and Neumann, 1982; Alabert, 1987; and Davis 1987). These methods deal widi continuous random variables rather than discrete random variables, and assume that the random variable comes from a multi-variate Gaussian distribution. It would be possible to use any of diese alternative methods for dealing with aquitard continuity by generating realizations of aquitard block thickness. Holes would exist if the aquitard diickness was below some critical value. However, the SIS algorithm has several distinct advantages over these methods. The first is diat the conditioning of realizations on bodi hard and soft data is built into die SIS algorithm. There is no sttaightforward method for conditioning the realization from the standard mediods on soft data. A second important advantage is that die SIS algorithm does not require that the simulated random variable come from a multivariate Gaussian distribution.; a wide range of different probability density functions can be handled (Gomez-Hemadez and Srivastava, 1990). Aquitard thickness may not follow a Gaussian distiibution. For example, if there is a finite chance of a hole, i.e. thickness equal to zero, then the thickness cannot follow a Gaussian distiibution because the probability of die random variable having a particular value is zero. Another important advantage of the SIS algorithm is that it deals widi extreme values much better than the Gaussian based methods. This is of direct interest here because one extreme thickness, or window, can have a major impact on the flow system and transport of contaminants. One of die specific purposes of IK, on which the SIS algorithm is based, is to handle extteme values (Joumel, 1983). The Gaussian assumption has a damping effect which greatly reduces the probability of extreme oudiers. In addition, the Gaussian assumption does not allow for the spatial correlation of exti^ eme values (Joumel, 1989), which again is one of the specific purposes of die SIS algorithm. Including die spatial correlation of extreme values is of utmost importance because it provides valuable information on predicting the locations of extreme values. The advantage of die Gaussian-based mediods is diat diey are very sti-aight forward to apply. If die framework were used to evaluate the worth of hydraulic conductivity measurements, which are generally assumed to follow a mulitvariate log-normal distribution, then die SIS algorithm could be readily replaced with one of die above standard methods. 4.8 GEOSTATISTICAL NATURE OF C L A Y LAYERS In general, litde work has been done in quantifying the geostatistical nature of clay/shale layers, or relating it to depositional environments. Desbarats (1987a) found that, while the shale layers in the vertical direction had no correlation, they were exponentially correlated in the horizontal direction with a range of 15 m and no apparent nugget effect. The shales were deposited in an alluvial environment (Desbarats, 1987b). He noted that die physical significance of the range was not clear because indicator variography expresses spatial continuity of shales and sands rather than shale continuity alone. The range does not correspond to the average shale length. In the only other study relating spatial correlation and geology that the author is aware of, Phillips and Wilson (1989) produce a method for estimating the correlation lengdi of hydraulic conductivity based on the dimensions of geologic units. However, they did not study shale continuity, nor did they study the spatial correlation as it relates to specific depositional environments. 4.9 NOTATION a range of covariance function j " kriging weight or declustering weight j * aquitard block generated by sequential indicator simulation algorithm covariance coefficient between I(Xj) and I(X|j^ ) Cov(h) covariance function of I(x) Cov*i(h) estimated covariance function of I(x) Cov*ii (h) estimated cross covariance function between I(x) and Ij(x) Covji (h) cross covariance function between I(x) and Ij(x) Covi^(h) covariance function of l^{x) Cov*i (h) estimated covariance function of Ij(x) dj j ' ' ' distance class fj (x) known function of x representing drift in Z(x) F(0) cumulative probability distiibution of I(x) F*(0) estimate of F(0) F*i(0) estimate of F(0) from hard data F*, (0) estimate of F(0) from soft data h separation distance h .^ average separation distance of data points within distance class dj Hj weight of measured sample data H ' weight of prior data H " updated weight of data i vector of data i(x) i(x) value of indicator random variable I(x) i(x)* estimated value of indicator random variable I(x) i{x)j true value of indicator random variable I(x) I(x) indicator random variable at point x lj(x) soft indicator random variable at point x mj estimate of HJ mj' prior estimate of |J.j mj" updated estimate of HJ m^' prior estimate of [i^ m^" updated estimate of Hg equivalent number of data Uj number of data point in dj n,, numtÂ»er of hard data Uj numt)er of soft data p random numl)er drawn form uniform random number generator Pl(x) probabUity that a window will be sampled at point x, given diat one exists, P(l5,(x)=lll(x)=l) Pl stationary form of Pi(x) P2(x) probabihty that a window wdl be sampled at point x, given diat one does not exist, P(l,(x)=lll(x)=0) P2 stationary form of P2(x) QjJQ) covariance matiix of Z(x) S i i ( 0 ) known covariance matiix of l(x), except for unknown multipUcative constant 8^2(9) known covariance matiix of Z(x), except for unknown multipUcative constant u vector of I's V vector of I's z vector of measurements of z z(x) value of Z(x) at point x Zj. critical cutoff Z(x) continuous random regional variable at location x 0 unknown parameter Xj correlation lengdi of I(x) HI mean of random variable I(x) Hi mean of random variable Ij(x) Hz mean of random variable Z(x) variance of I(x) variance of Z(x) zero mean binary random function zero mean continuous random function relative weight of hard data CHAPTERS: CONTAMINANT TRANSPORT 5.1 INTRODUCTION Recall that failure occurs when contaminants from the landfill penetrate an aquitard and spread to a lower aquifer. The failure time is needed in the objective function so that future costs can be converted into present day dollars. The purpose of diis chapter is to explain how contaminant ti-ansport is modeled and failure times calculated for an aquitard realization generated by the SIS algorithm. For simplicity it is assumed that the contaminants are miscible in water, are at low concentrations which do not significandy change the density of groundwater, and are non reactive and non-radioactive. It is further assumed that groundwater flow conditions are fully saturated and at steady state. More complicated contaminants or flow conditions could easily be incorporated into the framework by increasing the complexity of the numerical model needed to calculate failure times. A brief overview of contaminant transport processes pertinent to this thesis will be given first. For a more detailed discussion on contaminant d-ansport processes refer to Domenico and Schwartz (1990), Freeze and Cherry (1979), or any odier standard text on hydrogeology. A description of how contaminant transport and fadure times are calculated in bodi the two dimensional and three dimensional cases follows. 5.2 TRANSPORT PROCESSES The ttansport of a contaminant that is miscible, at low concentrations, and non reactive, is governed by an advection dispersion equation of die following form (de Marsily, 1986): 3 C VÂ«(D V C - C v ) = - ^ (5.1) where 2 - D is the dispersion tensor (L /T) 3 - C is the concentration (M/L ) - V is die average linear velocity vector of die groundwater flow (L/T) -1 is time (T) Bold upper case letters represent tensors and bold lower case letters represent vectors. For one dimensional tiansport in the x direction, in a homogeneous medium, equation (5.1) becomes: d^c dc a c The first term on the left hand side represents die hydrodynamic dispersive component of tiansport while the second term accounts for the advective component. These different components are discussed in more detail below. 5.2.1 ADVECTION The advective component represents die amount of dissolved contaminant physically tiansported by flowing groundwater at the average linear velocity of die groundwater. The average linear velocity of groundwater is calculated using Darcy's law: v = - ^ V h (5.3) where K is the hydraulic conducdvity tensor, n is the effective porosity, and h is the hydraulic head. From equation (5.3) it is seen that the rate of advective transport is dependent upon the hydraulic conductivity, die porosity, and die hydraulic gradient. If die co-ordinate axes coincide with die principal directions of anisotropy, then the off diagonal components of K become zero. Therefore, K = K , . 0 0 1 0 K ^ , 0 0 0 K , J (5.4) Darcy's law has the foUowing form for flow in the x direction: Kxxah " n ax (5.5) The hydraulic head field is determined by solving the flow equation, which has the following form for steady state flow: VÂ»KVh = 0 (5.6) 5.2.1 HYDRODYNAMIC DISPERSION Hydrodynamic dispersion accounts for the spreading of contaminants as they are transported. As a resuh, it is a dilution process; however, contaminants can arrive at a particular point faster than they would by simple advection at the average linear velocity. Hydrodynamic dispersion has two components: diffusion and mechanical dispersion. Diffusion is Q-ansport due to concentration gradients. The rate of diffusion is governed by Pick's law: f, = -D,WC (5.7) where - f(i is the mass flux of contaminant flowing through a unit area of aquifer per unit time, and - D(i is die apparent diffusion coefficient of the porous media. Mechanical dispersion is a spreading process caused by local variations in groundwater velocity. These local variations are the result of flow through heterogeneities diat occur at both the microscopic and macroscopic scale. On the pore scale, mixing occurs for a number of reasons. First, fluid in the center of a pore will flow faster than at the edges because of boundary effects. In addition groundwater will flow faster through some pores than others due to pore size and pore geometry. Finally, the path through any pore system will be tortuous and branching, causing flow dirough the pore system to keep dividing. On larger scales, groundwater will flow faster through more permeable parts of a flow system, such as a sand lens, than through less permeable parts of the flow system. As a contaminant tiavels farther through a flow system, dispersion is increasingly contioUed by larger scale heterogeneities. Mechanical dispersion typically predominates over molecular diffusion in most groundwater flow situations. Diffusion is only an important process at low groundwater velocities. 5.2.3 IMPORTANCE OF ADVECTION Massmann (1987) reported from a study of ten case histories, diat die advection component generally predominates over die dispersive component if die average linear groundwater velocity is greater than several meters per year. He further notes that hydraulic conductivities of between 10'^ and 10"^ m/s are needed for diis to occur. This range of values would apply to sands and gravels for unconsolidated deposits. For simplicity, it is assumed in this thesis that the average linear groundwater velocities are high enough that advection predominates, allowing dispersion to be ignored. Travel paths from the landfill dirough the upper aquifer and through an existing window in the aquitard to the lower aquifer are assumed to occur in sand and gravel. This thesis assumes that contaminant ti-ansport occurs only by advection. Nevertheless, if dispersion is an important component of transport, it would be a simple matter to incorporate it into the framework. 5.3 MODELING OF CONTAMINANT TRANSPORT Contaminant transport is modeled by first solving the steady state flow equation (5.6) for the hydraulic head field using a finite element or finite difference model. The advective velocity field is then calculated from the hydraulic head field. The ti-avel time for a plume to reach a point of failure in the lower aquifer is then calculated from the velocity field by a particle ti-acking routine. Recall that all factors governing contaminant transport are assumed known, except for aquitard continuity. The modeling of contaminant ttansport for the two different cases (two-dimensional and three-dimensional) covered in this thesis will now be discussed. Two dimensional modeling is used in Chapter 6 for explanatory purposes only. Three-dimensional modeling in used in both Chapters 7 and 8. 5.3.1 2-D MODELING OF SOLUTE TRANSPORT A Galerkin finite element model is used to calculate die two-dimensional hydraulic head field from equation (5.6). The boundary conditions on die flow field are shown in Figure 5-1. The flow field is split into diree regions. The centtal region is die zone of interest where aquitard continuity is unknown. The aquitard in diis region is generated by die SIS algorithm. The purpose of the outer regions on either side of the cend-al region is to reduce die effect of the vertical boundary conditions on flow in the cential zone of interest. The aquitard in these regions is assumed to be continuous. Figure 5-1: Boundary conditions on two-dimensional flow field. The bottom of the system is an impermeable boundary. On the vertical sides, constant head boundaries, Hj, Hj, H3, and H4 occur on the edges of the aquifers while impermeable boundaries occur on the edges of the aquitard. Hj, H2, H3, and H4 have been chosen so that flow is from left to right and is vertically downward across the aquitard at die right hand side. The top boundary is a fixed water table which is assumed to vary linearly between the two vertical constant head boundaries on either side of the upper aquifer. The contaminant source is represented as a vertical line over which particles are released at a uniform spacing. A vertical line source was chosen to spread the plume over a large thickness of the upper aquifer. tine s o u r c e o f c o n t a n i n a n t s Linear tiiangular elements spaced on a rectangular mesh are used in the finite element model (Fig. 5-2). A variable mesh size can be used so that a finer mesh can exist in the central region than in die two outer regions. The aquitard layer is represented by two or more layers of elements. Once the hydraulic head field has been calculated by the finite element model, the average element linear velocities are calculated by (Frind and Matanga, 1985): j where, - hj is the hydraulic head at node j , - n is the porosity, and - Wj is die basis function for node j . The element velocities are ti-ansferred to a rectangular grid of cells to simplify the particle tracking. This is done by assigning to die grid point the velocity of the element that contains the grid point. The velocity of the particle, Vp, at any point (x,y) in some cell is dien calculated by linear interpolation of die velocities at the four grid points at the cell comers. Particles are tt-acked through die velocity field by the tangent, or Euler, mediod (see Boyce and DiPrima, 1969): Xp(t) = Xp(t - At) + Vp Jxp(t - At),yp(t - At)]At (5.10) yp(t) = yp(t - At) + Vpy[Xp(t - At),yp(t - At)]At (5.11) where - Xp(t) is die X co-ordinate of die particle at time t, - yp(t) is the y co-ordinate of the particle at time t. - At is the time step, and - Vpj[Xp(t),yp(t)] is die x component of die particle velocity at point [Xp(t),yp(t)]. m m ///Ă» Ă»///JJJ7 / miiy 77777777777777777J77 \/wwvwvwwwwm Figure 5-2: Sample finite element grid for two dimensional vertical cross section. The time step. At, is kept small so that it takes at least ten steps for a particle to cross one cell. This is to keep particles from jumping sti-eamlines. A number of particles are released at different points in the source area and are ttacked through the flow region. Failure occurs if any of die particles reaches die lower aquifer. It is assumed that any of the particles reaching die lower aquifer represents a major breakthrough of contamination. There are several limitations associated with the approach carried out here for modeling contaminant tiansport and groundwater flow. For example, the position of the water table is fixed. In reality die position of die water table is dependent upon the geometry of the aquitard used in the flow model. This could cause errors in the calculated hydraulic head field. Another potential numerical problem is created by ttacking particles dirough die velocity field using die Euler or tangent method. For large time steps, particularly in non smooth velocity fields, particles could jump stieam lines resulting in an incorrect paths being followed. Both of these limitations could be removed by making the analysis more complicated and numerically intensive, thus increasing the reliability of the calculated failure dmes. For example, a free surface model would account for a changing water table. Problems of particles jumping stream lines could be solved by having a very fine finite element mesh or by tiacking die particle using a modified Euler or Runge Kutta approach. However, the purpose of the two dimensional cross section is to provide a simple test case on which to build die more compUcated and more realistic diree dimensional case. Hence, we are only concerned with first order accuracy for this case, and for simplicity, inaccuracies due to the above numerical problems will be ignored. The overall rehability of estimated worths will be more gready affected by uncertainty in other factors such as aquitard heterogeneity and boundary conditions on die flow model. 5.3.2 3-D MODELING OF SOLUTE TRANSPORT In the three-dimensional modeling, die hydraulic head field is solved using die USGS finite difference MODFLOW package. MODFLOW is a commercially available computer model that has been widely used throughout the groundwater world over the last 20 years and has been well validated. The ti-ansport of contaminant is solved using die USGS MODPATH particle tracking routine. MODPATH assumes that advective ti-ansport dominates odier forms of ttansport. MODPATH uses flow velocities calculated direcdy by MODPATH to move particles of contamination dirough die flow field. For a detailed description of MODFLOW or MODPATH, refer to McDonald and Harbaugh (1989) and Pollack (1989), respectively. 5.4 NOTATION C concentration of contaminant D dispersion tensor apparent diffusion coefficient of die porous media Dj^ dispersion in the x direcdon mass flux of contaminant flowing through a unit area of aquifer per unit time h hydrauhc head K hydraulic conductivity tensor Kj^j hydrauhc conductivity in x direction Kyy hydrauhc conductivity in y direction K^z hydraulic conductivity in z dkection n porosity t time V average linear velocity of groundwater flow Vj^ average linear velocity of groundwater flow in x direction Vpj^ X component of particle velocity Wj basis function for node j Xp(t) X co-ordinate of particle at time t yp(t) y co-ordinate of particle at time t CHAPTER 6: OUTLINE OF FRAMEWORK 6.1 INTRODUCTION Chapter 6 ties together the different components introduced in Chapters 2 through 5 into the data worth framework. The framework is able to evaluate die worth of hard, point measurements and soft, areal measurement. A soft, areal measurement could consist of some type of geophysical survey such as a shallow seismic, or a radar survey. Soft, point measurements cannot be handled. Different mediods of measuring aquitard discondnuities are discussed in Secdon 6.2. Secdon 6.3 describes how the framework evaluates the wordi of a single hard measurement. Secdon 6.4 discusses the sensitivity of diis estimate to numerical artifacts of die mediodology. Section 6.5 presents how die methodology developed in Section 6.3 can be used to estimate the worth of a soft geophysical survey which covers a large area. Section 6.6 discusses how the framework could be modified to handle data wordi problems in other disciplines, such as mining engineering and reservou- engineering. 6.2 MEASURING AQUITARD DISCONTINUITIES Aquitard continuity can be measured direcdy using hard data provided by boreholes, or soft data provided by geophysical surveys, or by odier methods. It can also be measured indirecdy by methods such as hydraulic head measurements, pump tests, barometiic efficiency, and age dating of water. Boreholes provide both hard and soft point measurements. If die borehole is continuously cored, the presence of a distinct aquitard should be determined with certainty. However, if one meter split spoon samples were taken every two meters, a thin aquitard could be missed. Geophysical measurements could range from areal surveys covering the entire region of interest, such as radar or seismic surveys, to point measurements such as geophysical logs in a borehole. It is assumed that geophysical measurements will be imprecise. No specific geophysical technique is targeted in this thesis. However, several methods would be applicable such as seismic, resistivity, and radar surveys, and gamma ray or self potential logs. Measurements of hydraulic head, H, are available in virtually all hydrogeological investigations. They can be used in two forms to predict aquitard continuity. In the first form, many H measurements are taken throughout the aquifer, but at the same time. They can be used to estimate, or constrain, patterns of hydraulic conductivity through inverse techniques (see Yeh ,1986; Kitanidis and Vomvoris ,1983; and Clifton and Neuman, 1982). However, these techniques are not widely used because dieir numerical complexity is high and stabihty problems can lead to a difficulty in obtaining solutions. Previous work has also shown that such measurements of H are not very effective in determining the continuity of aquitards or aquifers. Fogg et. al. (1979), quoted by Rouhani (1985), found that radical changes in transmissivity created only small changes in hydraulic head, while Fogg (1986) reported that hydraulic heads give litde indication of interconnected zones of aquifer/aquitard systems, even with carefully calibrated models. Duffield et. al. (1989) confirmed these results in their study of the Savannah River Site in South Carolina. They found that windows and faults in aquitards, with significant vertical gradients across them, produced only small changes in the H field which were difficult to detect widiout closely spaced monitoring wells in the vicinity of the window. In the second form, H measurements taken fi-om a single well at different points in time form a time series. H measurements in this form are potentially a very valuable way of investigating aquitard continuity. The fluctuations in hydraulic head, both natural and induced, will be correlated for two aquifers diat are hydraulically connected. Fluctuations in hydraulic head for aquitards that are not hydrauUcally connected should be independent or show a significant time lag. Pump tests can be valuable in testing aquitard continuity. A well in one aquifer can be pumped while drawdown is observed in adjoining aquifers. A large drawdown is a good indication of hydraulic connection. Such a connection may be caused by an aquitard discontinuity between the aquifer with the pumping well and the aquifer with the observed drawdown. However, no analyses are known to the author that determine exactiy how a window will affect a pump test. The age dating of water through isotope analysis could be very useful in predicting aquitard continuity. If die water in two adjacent aquifers have gready different residence times, dien the aquitard separating them is probably continuous. For a more detailed explanation of the age dating of groundwater, refer to Domenico and Schwartz (1990) or Fritz and Fontes (1980). Barometric efficiency, B^, could potentially be used to establish the continuity of aquitards that act as confining layers for near-surface aquifers. B^ relates the decline in peizometiic head in an aquifer to a given increase in air pressure. It is defined by: Be = ^ (6.1) where - 7 is die specific weight of water - dh is the change in the peizometiic level, and - dP^ is the change in the air pressure. The greater is die value of B^ , die greater is die confining ability of the aquitard and the greater die likeUhood of its continuity. Refer to Freeze and Cherry (1979) for a more detailed explanation of barometiic efficiency. Some of the above methods could be very promising in helping to predict aquitard continuity. However, as a first step, only the worth of boreholes, geophysical surveys, and/or local geological information will be directly evaluated in this diesis. Other types of data could be added to die framework later if desked. However, such additions could involve a great deal of numerical complexity. As noted in Chapter 3, a regional geological understanding of how an aquitard was formed can be very valuable in predicting continuity. In this thesis prior geological understanding can be incorporated into the framework, but its worth will not be evaluated. In diis thesis, only the presence or absence of aquitard is of interest; aquitard thickness is ignored. Therefore, data consists of a set of yes or no values; eidier the aquitard is present beneath a location (xj) or it is not. If desired, aquitard thickness could be put into the framework through co-kriging or probability kriging. However, this will not be done for two reasons. First, it is unclear whether a relationship exists between layer thickness and continuity (see Chapter 3). Secondly, co-kriging will substantially increase the complexity of the framework, but is expected to bring litde improvement over simple indicator kriging (Joumel, 1983). Practice in probability kriging has shown diat generally die increase in precision over normal indicator kriging is small and not worth the increased effort (Sinclair, 1990). For a detailed discussions on cokriging or probability kriging refer to Joumel and Huijbreghts (1978) or Sullivan (1985), respectively. 6.3 EVALUATING THE WORTH OF HARD, POINT MEASUREMENTS This section ties together all of the ideas presented earlier to show how the framework evaluates the worth of a sampling program for hard, point measurements. This presentation uses an example design where the worth of a single hard measurement is evaluated. The example design is presented below. 6.3.1 EXAMPLE DESIGN The example design is similar to the landfdl presented in Chapter 2. The boundary conditions, hydrogeological and geostatistical parameters are shown in Figure 6-1. The confidence in the estimates of die mean is arbitiarily set to be equal to diat of 15 prior measurements. (Recall, diat the updating of the mean and die confidence in die prior estimate of die mean are discussed in Section 4.5.2.) The finite element mesh is shown in Figure 6-2. The mesh consists of 2 520 elements and 1 397 nodes. The worth of a single, hard measurement is evaluated. The hard measurement in this case will represent a borehole which is cored and logged in its entirety. It is taken 75 m (block 5) from the left hand side of the zone of interest. 1 00 n 500 n -line s o u r c e o f con-taninants 100 n â€˘ 20 n _L_ h=100n/ 30 n h = 100n . Measured B l o c k n=0,3 -i I I I I I I II I I 1 I I I I I I I I I I I I I I I I K^=Ky=10"''n/s n = 0.3-/ / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / / h=95n >h=90n ;Li[=0.05 X,= 30 n Figure 6-1: Boundary conditions, hydrogeological and geostatistical parameters used in the example design. The two design alternatives are again to have either a liner or no liner. The economic parameters are shown in Table 6-1 while numerical parameters are shown in Table 6-2. A discount rate of 10% is used. A minimum of 100 realizations is used to estimate the objective function of an alternative for each prior analysis for a particular set of sample outcomes in die preposterior analysis. The number of realizations needed to obtain a reliable estimate of the objective function for an alternative is discussed in Section 6.4. Five particles were released from the source. Experimentation indicated that five was the minimum number of particles released from the source which yielded consistent data worth estimates. Results were not significandy improved by using more than five particles. iiiiiiiiiinninniiinmuiiinnmuniitiniimninnniiinmimniim Figure 6-2: Finite element mesh used in example design with eight times vertical exaggeradon. Altemadve Benefits Costs Cost of Failure (millions $) (millions $) (millions $) Liner 10. 5. 1. No Liner 10. 0. 10. Table 6-1: Economic parameters used in example design. Parameter Value minimum number of realizations to estimate Z number of particles released from source # of aquitard blocks starting seed first aquitard block generated minimum number of conditioning data points on either side of est. pt. 100 5 30 6874462 2 5 Table 6-2: Numerical parameters used in example design. 6.3.2 TWO METHODS OF GENERATING REALIZATIONS FOR THE FRAMEWORK Recall that there are two basic components in analyzing the worth of a sampling pattem: die prior analysis and the preposterior analysis. Aquitard realizations generated in the prior analysis are created by the SIS algorithm using the prior estimates of the geostatistical parameters. Aquitard realizations generated during the preposterior analysis are created by die SIS algorithm using estimates of die geostatistical parameters updated from sample outcomes. In a sti^ghtforward application of the principles discussed in Chapter 2, aquitard realizations would be first generated during die prior analysis. A second set of realizations would dien be generated during die preposterior analysis. An alternative way is to generate realizations during the preposterior analyses and then "reuse" them in die prior analyses. The advantages and disadvantages of each method are discussed below. It will be shown diat numerical problems lead to the abandonment of the first method. 6.3.2.1 ReaUzations Generated During Both Prior and Preposterior Analysis (Abandoned Approach) 6.3.2.1.1 The Prior Analysis The prior analysis can be visualized by a decision tree (Fig. 6-3) that is slighdy different from the one presented in Chapter 2 (Fig. 2-2). In Figure 2-2, a design alternative only had two possible outcomes: failure or no failure. The tine outcome was uncertain. Here, the design alternative has as many outcomes as realizations that are generated. Each outcome is assumed to be equally likely. To simplify the rest of die Figures presented in this Chapter, the effect of all of die possible outcomes is represented by die expected value of the objective. The prior analysis is shown as the upper limb in Figure 6-4. Using die objective function given as equation (2.3) and the data fi-om Table 6-1, die prior best design, Ap, is die no liner alternative widi E(Z(Ai^))=$7.743 x 10^ and E(Reg(Ai^)=$2.195 x 10^. The prior probability of failure is 0.37. objective aquitard realizations function Figure 6-3: Decision tree used in prior analysis and its simplified version to right. 6.3.2.1.2 Preposterior Analysis The preposterior analysis is carried out in the following basic steps: 1) A pattern of n measurements is proposed. 2) The 2" combinations of different sample outcomes Sj are calculated. 3) The mean is updated, based on sample outcome Sj. 4) A series of M aquitard realizations, R^^ij, are generated using die updated parameters and conditioned on Sj. (The subscript mlj on R refers to m* realization which is conditioned on sample outcome Sj.) 5) For each realization, R^|j, contaminant transport is modeled for each alternative, Aj, to determine if failure occurs for Aj. Failure times are calculated. 6) The failure time is dien used to calculate die objective function, Z(A,,R^|jlSj), for each A,, for all M realizations, R^ij. 7) The alternative, A ^ , ' , with the highest expected objective function, over all M realizations is then selected. 1 ^ ECZCADO) = M I Z(AD'.Rmij I Sj) (6.2) ni=l Z(AQ',R^IJISJ) is die objective function of die posterior best design alternative for realization R^|j, given Sj. Steps 3 through 7 are repeated for each of the 2" sample outcomes Sj. The expected expected objective function of the best preposterior design is calculated by j=l m=l 10) The worth of die sampling pattern is calculated from Wordi = E [ E ( Z ( A D ' ) ) ] - E(Z(AD)) (6.4) Recall that all hard measurements are assumed to be taken at die same scale as the blocks that form die discretized stratigraphie horizon containing die aquitard. If a measurement is taken at a block, the outcome of the measurement is assumed to apply to the entire block. In step 2, it is assumed that there are only two possible outcomes at any sampled block: either an aquitard is measured or a window is measured. Since there are n sampling locations, there are 2" possible combinations of sample outcomes. For the preposterior analysis of the borehole taken at B5, there are only two possible outcomes: or S ^ . Each sample outcome is used in turn to update the geostadsdcal parameters. The updating of geostatistical parameters was discussed in Section 4.5.2. One hundred new reahzations are generated for each set of updated geostatistical parameters. The new best design, A ^ ' , is chosen for each set of realizations. These results are summarized in Figure 6-4. The worth, Wj, of the borehole from the expected increase in the maximum expected objective function is Wi = E [ E ( Z ( A D O ) ] - E [ Z ( A D ) ] (6.5) = [($4,355 X 106)(0.05) + ($7,802 x 106)(0.95)] - $7,743 x 10^ = $-112 700. The worth, W2, of the borehole from the expected decrease in the minimum expected regret is W2 =E[Reg(AD)]-E[E((Reg(AD'))] (6.6) = $2,195 X 105 - [($0)(0.05) + ($2,158 x 105)(0.95) = $145 000. Note, diat even though die two mediods ought to be equivalent (see Chapter 2), here they give different worths for the same borehole. Wj even has a negative value, which is impossible. The minimum possible worth for a measurement is zero. This difference is caused by changes in the system between the prior and the preposterior analysis. The system refers here to the entire set of parameters used in either the preposterior or the prior analysis. These include the geostatistical parameters, the expected value of die objective function of die different alternatives, and die probability of failure. If anyone of diese parameters changes between the prior and die preposterior analysis, die system has changed. It is explained below how the system has changed. The properties of the ensemble of realizations used in the prior analysis must be idendcal to the properdes of the ensemble of realizations used in die preposterior analysis; no sample in reality has been taken to change the ensemble of realizations. Therefore, all parameters must remain consistent between the prior and preposterior analysis; their expected values in the prior analysis must equal their expected expected values in the preposterior analysis. However, in die example analysis, the probability of failure and objective functions for the different alternatives are inconsistent (Table 6-3). Only the geostatistical parameters remain consistent. PRIOR ANALYSIS Zm\iUS4 774 E[Reg(A,^ )]=S3 188 PREPDSTERIDR ANALYSIS Note: All dollars ore in thousonds ECZ<A^IS^)]-$4 355 E[Rg9(A|^ ISy>] = Â«0 =0.109 P(FaillS,<>= 1.0 ECZ<A^Sy)3=Â«3 550 E[ReQ<A^S^M=$805.0 E[Z<ALIS,^>]=*4 780 E[Re9<A^^IS^^Â»*3 238 f^m^'0047 P(roillSHv)=0.36 E[Z(A^IS^>]=*7 802 ECRegC A ^ IS^ ^ >]= S215.8 Figure 6-4: Preposterior analysis of example design when realizations are generated in both the prior and the preposterior analysis. Theoretically, the generated realizations are only dependent upon the geostatistical parameters; therefore, the properties of the prior and preposterior ensemble of realizations would approach each other as the number of generated realizations increased, since the geostatistical parameters remain consistent. Therefore, if a huge number of realizations were generated, the ensembles of realizations would be almost identical. Parameter EfPrior Value! EfE(Preposterior Value)! mj 0.05 0.05 =(.05)(.109)+(.95)(.047) 30 30. =(.05)(30)+(.95)(30) Z ( A L ) $4,774 X 10^ $4,759 X 106 =(.05)($4.355xl06)+(.95)($4.780xl06) Z ( A N L ) $7,743 x 106 $7,589 X l(P =(.05)($3.550xl0V(.95)($7.802xl06) P(failure) 0.37 0.392 =(.05)(l.)+(.95)(0.36) Table 6-3: The values of parameters used in the prior analysis and the expected value of parameters used in the preposterior analysis. Unfortunately, in reality this does not happen. The realizations are only pseudo-random and are dependent upon a number of "artificial" parameters, which are functions only of the algorithm used to generate die realizations. These parameters are: die starting seed, the number of the first aquitard block generated, the order in which the aquitard blocks are generated, and the number of aquitard blocks generated. Change any of these parameters and a different set of realizations are generated. This effect is defined as noise in diis diesis. Because of this pseudo-randomness, the properties of die ensemble of realizations generated during the prior and the preposterior analyses will not be exactiy identical, regardless of how many realizations are generated. Given the sensitivity of the system, even diese very small inconsistencies in Z ( A L ) , Z{A^), and Pf are enough to cause die large discrepancies in data worth noted in equations (6.5) and (6.6). The geostatistical parameters remained consistent because the updating of the mean is carried out direcdy by Bayes' equation and is unaffected by parameters used in die SIS algoridim. The correlation lengdi is not updated, and therefore, must be consistent. The updating of geostatistical parameters is discussed in Section 4.5.2. In view of the foregoing, it was therefore recognized diat this mediod of generating realizations would not work. 6.3.2.2 Realizations Generated During the Preposterior Analysis (Method Used in Thesis) In this method, all realizations are generated during the preposterior analysis. For each combination of sample outcome, the geostatistical parameters are updated and 100 realizations are produced. The ensemble of realizations used in the prior analysis is formed from die sum of die realizations generated during die preposterior analysis, weighted by die probability of the different sample combinations occurring. Therefore, die ensemble of realizations used in die prior and preposterior analysis are identical. The preposterior analysis is carried out in die following basic steps: 1) A pattem of n measurements is proposed. 2) The 2" combinations of different sample outcomes Sj are calculated. 3) The mean is updated, based on sample outcome Sj. 4) A series of M aquitard realizations, R^y, are generated using the updated parameters and conditioned on Sj. (The subscript mlj on R refers to m''' realization which is conditioned on sample outcome Sj.) 5) For each realization, R^|j, contaminant ti-ansport is modeled for each altemadve. A,, to determine if failure occurs for A,. Failure times are calculated. 6) The failure time is then used to calculate die objective function, Z(A|,R^|jlSj), for each Aj, for all M reahzations R^|j. 7) The alternative, A ^ ' , widi die highest expected objective function, over all M realizations is dien selected. E ( Z ( A D ' ) ) = ^ I Z ( A D ' , R â€ž | J I Sj) (6.7) m=l Z ( A Q ' , R ^ | J I S J ) is the objective function of the posterior best design alternative for realization R ^ | j , given Sj. 8) Steps 3 through 7 are repeated for each of the 2Â» sample outcomes Sj. 9) The expected expected objective function of the best preposterior design is calculated by B l B ( Z ( A â€ž 0 ) , = i X l 2 ĂŽ ^ P ( S p j=l m=l 10) The prior expected objective function of each alternative is calculated by j=l m=l The alternative with die highest objective function is die prior best design, A ^ . 11) The wordi of die sampling pattern is calculated from Wordi = E [ E ( Z ( A D ' ) ) ] - E ( Z ( A D ) ) (6.10) The steps are identical to those presented in Section 6.3.2.1., except a new step 10 has been inserted. This new step carries out die prior analysis using realizations generated during die preposterior analysis. Since the ensemble of realizations used in the prior analysis and the preposterior analysis are die same. the expected objecdve funcdon for an altemadve is idendcal for both the prior and the preposterior analysis. Therefore, using step 10 E(Z(AL)) = E(Z(ALISW))P(SW) + E(Z(ALISNW))P(SNW) (6.11) = ($4.3550 X 10^)(0.05) + ($4.7802 x 10^)(0.95) = $4.7590 x 10^ and E ( Z ( A N L ) ) = E(Z(ANLISW))P(SW) + E(Z(ANLISNW))P(SNW) (6.12) = ($3.5501 X 106)(0.05) + ($7.8024 x 106)(0.95) = $7.5900 X 10^ Therefore, the prior best altemadve is to have no liner. Using the results shown Figure 6-5, the worth of data from the expected increase in maximum expected objecdve funcdon is W i = E[E(Z(AD'))] - E(Z(AD)) (6.13) = [($4.3550 X 106)(0.05) + ($7.8024 x \Q^){0.95)] - $7.5900 x 10^ = $40 000 The worth of the borehole from the expected decrease in the minimum expected regret is W 2 = E(Reg(AD)) - E[E(Reg(AD'))] (6.14) = $2.4526 X 105 - [($0)(0.05) + ($2.1580 x 105)(0.95)] = $40 000 The worth of the borehole by both die above methods is $40 000. PRIOR ANALYSIS E[ZCA^>]=*4 7590 E[Res(A^>]=Â»3 0761 ^=0.05 P(f 010=0.392 E[Z<A^>]-*7 589.8 E[Reo<A^?>S245.26 Note: All dollors ore in thousands EtZ<A^IS^>]=Â«4 355 ECReB<AJS,^ >] = *0 P<FaillSi^ = 1.0 EtZCA^Sy>]=Â«3 550 ECReB<A^S )^3=$805.0 PREPDSTERIDR ANALYSIS E[ZCALiSÂ»n,>]=*4 780 EtRe8<A I^S^ ,^>]=$3 238 '^I.S^=0047 P<FaillSt(V>-0.36 i^.lS^>]=Â»7 802.4 EtZCA ECRegC A ^ IS^ y, >]= $215.8 Figure 6-5: The preposterior analysis for example design for case where realizations are generated during preposterior analysis only. One problem is that the total number of required realizations Umits the number of sample locations. For example, if a minimum of 100 realizations is needed per sample outcome, dien 3 200 realizations would be needed for a sampling pattem of five hard measurements. (Recall that there are 32 (ie. 2 )^ possible combinations of sample outcomes for a pattem of five measurements.) The practical upper limit of the sampling pattem is six to seven boreholes. It may well be diat phased site investigation programs often consider suites of this order of magnitude. 6.4 SENSITIVITY OF ESTIMATED WORTH OF SINGLE HARD MEASUREMENT TO NUMERICAL ARTIFACTS There are many numerical parameters which are only artifacts of the methodology used in the framework. These parameters affect die estimated worth, W, of a proposed sample. The purpose of Section 6.4 is to study the sensitivity of W to these parameters and to determine if they could affect the decision of whether a sample is cost effective or not. Numerical parameters are used in bodi die deterministic modeling of contaminant tiansport and die stochastic generation of the aquitards. Some numerical parameters used in the deterministic modeling of contaminant tiansport are 1) the density of the mesh used in the numerical model to solve for hydrauhc head 2) the number of particles released at the contaminant source for particle ttacking 3) the distance of vertical boundaries from the zone of interest 4) the absolute difference used to stop iterative head calculations (used in MODFLOW). In this thesis, it is assumed that the correct values of the above parameters have been set by sensitivity analyses and that the deterministic tiansport of contamination is correcUy modeled. The remainder of this section is concerned with numerical parameters that affect the generation of the aquitards. These parameters are 1) the number of realizations generated per sample outcome 2) the starting seed used in the random number generator 3) the number of the first aquitard block generated 4) the number of blocks into which the aquitard is discretized. The sensitivity of the worth of a single borehole to each of the above parameters will be discussed below. The total effect of these numerical parameters on the estimated worth will be discussed last. 6.4.1 THE NUMBER OF REALIZATIONS GENERATED PER SAMPLE OUTCOME In the example design, 100 aquitard realizations were used to estimate W. However, W is dependent upon the number of realizations used to estimate it. In the example presented here, as the number of reahzations increases W decreases and becomes asymptotic after 50 realizations (Fig 6-6); however, there remains an irreducible fluctuation. Therefore, it is assumed that if enough realizations are generated W wiU be unbiased, but diere wiU be a random error. The standard deviation of die random error, a^, is estimated to be $830. This corresponds to a 2% fluctuation from die average W of $44 200; dierefore, for the example design, die random error will have a small effect on W. S JZ s C M â€” I â€” 50 100 150 Number of Realizations 200 250 Figure 6-6: Sensitivity of the worth and median failure time to number of realizations per sample outcome. 6.4.2 STARTING SEED USED IN RANDOM NUMBER GENERATOR Ideally die realizations generated by die SIS algoridim should be completely random. Unfortunately, they are not because they are dependent upon the U(0,1) random number generator. The U(0,1) random number generator generates pseudo-random, rather than truly random, sequences of numbers which are dependent upon a starting seed. The pseudo-random sequence changes for every different starting seed. The result of this incomplete randomness is that the ensemble of realizadons generated with one starting seed will be different from an ensemble generated widi anodier starting seed. Every different ensemble of realizations will yield a slighdy different probability of failure and a different set of failure times, and hence, a different estimated worth for a sample. As die starting seed increases, W moves randomly about a constant mean value (Fig 6-7). Therefore, as in die above case, varying the starting seed is assumed to create a random error, but will not bias W. The standard deviation in W,a^, is estimated to be $1 440, which corresponds to a 3.5% fluctuation about die average W of $41 500. Therefore, for diis example die random error will again have a small effect on W. 10 ' 10 * 10 â€˘ 10 â€˘ 1 0 ' 10 â€˘ 10 â€˘ Storting Seed Figure 6-7: Sensitivity of die worth to the starting seed used in die random number generator. 6.4.3 NUMBER OF FIRST AQUITARD BLOCK GENERATED The realizations are also dependent upon a congruendal random generator used to determine the order in which the aquitard blocks in a realization are generated. The order is dependent upon the numl)er of the fh-st block generated, which is supplied by the user. A different order, and consequendy a different set of realizations, will result for each different starting block number. W again varies randomly about a constant mean as the first aquitard block generated changes (Fig 6-8). Therefore, the starting block number will not bias W, but will create a random error with standard deviation equal to $2 080. This corresponds to a 4.4% fluctuation about the average W of $44 500. Â§ o -o C O o o o -o o o -o C M -1 1 1 1 5 1 0 1 5 2 0 First Aqu i t a rd B lock Genero ted 2 5 3 0 Figure 6-8: Sensitivity of die worth to first aquitard block in starting padi. 6.4.4 NUMBER OF BLOCKS INTO WfflCH AQUITARD IS DISCRETIZED W is dependent upon the number of blocks into which the aquitard is discretized. Fortunately, W varies randomly around a mean of $42 760 (Fig. 6-9). Therefore, at least for diis example, W is unbiased by die number of aquitard blocks, but diere is a random error widi a standard deviation of $2 630, which is 6% of die average worth. o o O n o o_l , , , , , , 15 20 25 30 35 40 45 Number of Aquitard Blocks Figure 6-9: Sensitivity of W to the number of blocks that the aquitard is discretized into. 6.4.5 COMBINED AFFECT ON RELIABILITY OF DATA WORTH For die example design case studied here, if die numerical parameters are set correcdy, W will be unbiased, but each will contiibute to a random error in it. Assuming that die errors caused by any one parameter are independent of the others and normally disttibuted, then the standard deviation of the total random error, O^, in W can be estimated by the central limit theorem. Therefore, n a â€ž 2 = X a , 2 (6.15) where G ^ ^ jg variance in W caused by parameter j . For die example design presented here. = ($830)2 + ($ 1 440)2 + ($2 080)2 + ($2 060)2 = ($3 400)2 The is only $3 400 compared to an overall average W which is in excess of $40 000. Therefore, for the example presented here the total random error in W is small compared to the average W. The magnitude of the random error may be site dependent; therefore, it should be estimated for each site. In addition, possible bias in W due to numerical artifacts should be checked for each site. 6.5 PREPOSTERIOR ANALYSIS OF A SOFT, AREAL SURVEY Section 6.5 discusses how the framework is used to evaluate die worth of a soft, geophysical survey which covers a large area. The survey must cover the entire area where potential windows could cause not only fadure, but a failure which would alter the prior best design alternative. These latter type of windows are denoted as Wfj. Some windows may allow failure (i.e. contamination to reach die lower aquifer), but the cost of the failure may not be high enough to change die prior best design alternative. As wUl be shown in Chapter 8, diis area is not fixed, but is a function of odier factors such as die cost of failure. Since the survey covers the entire area where Wf^ may occur, it is assumed that the survey will sample any such window, given diat one exists, widi some probability, P(Swj^lWfj,). A false outcome may also be possible. Its probability is denoted by P(SwjjlNWfj). These two probabilities represent die likelihood functions (or the precision pj and P2) of die survey. They must be estimated a priori for the data worth analysis to be carried out; however, this will not be a simple task for two reasons. The first is that geophysicists generally do not quantify the precision, or quality, of geophysical surveys in these terms (Cross, 1992). Therefore, diere are no readily available estimates. Secondly, diey depend on may factors which may be poorly understood, such as stratigraphy and material properties of geological materials. The above assumption allows the multivariate Bayesian updating problem described previously for hard, point measurements to break down to a univariate case. The worth of the survey can then estimated using die mediodology for univariate random variables outUned in Chapter 2. The following information is needed to carry out die analysis: the prior probability of failure, the updated probability of failure for each sample outcome, the expected value of the objective functions for the alternatives for both failure and no failure, and the lUceUhood functions of the measurement. As previously mentioned, the likelihood functions of the survey must be estimated. Al l the remaining information is calculated or derived from the prior analysis, which is carried out by the framework. The prior analysis direcdy evaluates the prior probability of failure and the expected value of the objective functions for the alternatives for both failure and no failure. Note that the expected values of die objective functions of the design alternatives now have a different form from diat used for the hard, point measurements. Instead of simply calculating the total expected value of an altemadve, such as E(Z(A(^)), the expected value is calculated twice for each alternative, once for the realizations that fail and once for the realizations that don't fad, ie. E(Z(AclFailure)) and E(Z(AĂ§lNo Failure)). Recall, from Section 6.3.2 that the expected values of the objective functions for the different alternatives, given failure or no failure, remain unchanged by sampling. The prior probabdity of failure is updated by Bayes' equation for the univariate case, using the likelihood functions. The probability of sampling different outcomes is calculated using the likelihood functions and the prior probabUity of failure. An example analysis is carried out for the case study in Section 8.5. Note, because of the uncertainty in the estimated lUcelihood functions, there will be uncertainty in the estimated worth of a geophysical survey. Therefore, instead of concenti-ating on die actual worth of a geophysical survey, the break even point, or minimum precision for the survey to be cost effective, will be estimated. The minimum precision of die survey should be much easier to estimate dian die actual precision of the survey. 6.6 MODIHCATION TO FRAMEWORK TO HANDLE DATA WORTH PROBLEMS IN OTHER DISCIPLINES The framework forms a foundation for addressing data worth questions in both groundwater and other disciplines, that deal with heterogeneous geological envkonments. For example, if one were interested in determining the patterns of hydraulic conductivity within an aquifer, die framework could be easily modified to deal with this situation. Hydraulic conductivity is generally assumed to behave as a multivariate lognormal distribution. The framework could be modified to handle it by replacing the SIS algorithm with a turning bands algorithm. Flow or transport could still be simulated using the numerical model. It could also be modified for use with exploration programs used in petioleum reservoir engineering. The major factor in controlling flow in a petioleum reservoir is often patterns of sand/shale continuity. This factor is particularly ttue in enhanced oil recovery programs. The SIS algorithm could still be used to simulate die reservoir geology. The major modification would be replacing die numerical contaminant tiansport model with a numerical multiphase flow model. One could address a question such as: Will the increased recovery and information gained from a new well justify the cost of installing the well? The framework could be particularly applicable to petioleum engineering because the economic consequences of decisions are high. An oil well can cost $10's millions, but increased recovery could be worth $l(X)'s millions. In addition, die economic consequences are more stiaightforward to quantify dian diose in groundwater contamination because one does not have to deal with factors which are hard to put a monetary value on, such as die loss of a river for recreational use. It would be sttaight forward to modify die framework to address data worth questions in mine development. For this case, only an algoridim for simulating ore bodies would be requked. Again, if grade/tonnage of the ore body behaved as lognormal random variable, the SIS algorithm could be replaced by a turning bands algorithm. One could investigate whether one should spend (a) $100 000 on diamond drill holes, (b) more than a $1 million on building a shaft and collecting a large bulk sample, (c) nothing and build the mine, or (d) nothing and abandon the site. Mining could be a very appropriate area to adopt the framework for three reasons. The first is that potential mine sites often have much data available. Secondly, geostatistics has been employed for a number of years to many types of mineral deposits; therefore, the geostatistical nature of ore bodies is better understood than diat of geological sti-atigraphy. Finally, diere is a need because many poor decisions have been made in mine development. Often $10's of million have been spent developing a mine, only to find that the mine is uneconomic. 6.7 SUMMARY OF CHAPTER 6 Chapter 6 outiines how the framework evaluates die worth of a sampling program using a design example which involves two dimensional contaminant tiansport. The framework can only direcdy evaluate the wordi of hard, point measurements. Soft, point measurements cannot be handled. However, die framework can be used to estimate the worth of a soft, geophysical survey which covers the entire area where windows can cause a failure which alters the prior best design alternative. There are two mediods by which realizations might be generated by die SIS algoridim for die framework to estimate W. In die first mediod, realizations are generated in both die prior and preposterior analysis. This mediod was rejected because noise in die SIS algorithm causes die data worth calculations to break down. In die second mediod, realizations are generated during the preposterior analysis and are "reused" in the prior analysis. This method leads to correct results and was selected for use in this thesis. The estimated worth of a hard measurement is sensitive to numerical artifacts in the SIS algorithm. However, for the example presented here, this sensitivity is very small compared to average value of the estimated worth. The framework could be modified to handle odier types of data wordi problems in both hydrogeology and other disciplines. 6.8 NOTATION Ajy prior best design alternative Ap' posterior best design alternative A L liner alternative A N L " Â° li"Â®"" alternative Bg barometiic efficiency h hydrauhc head hydraulic conductivity in x dkection Ky hydrauhc conductivity in y direction mj estimate of mean of I(x) NWfj no window exists which causes failure that changes prior best design, A] Pj air pressure Pi probability of sampling a window, given that one exists P2 probability of sampling a window, given that one does not exist Reg(Aj) regret of alternative A; Rn,ij m* aquitard realizadon wliicli is condidon on sample outcome Sj Sj j * sample outcome W worth of data Wfj window exists which causes failure and changes prior best design, W J worth of data from the expected increase in the maximum objective function W2 worth of data from the expected reduction in minimum expected regret Z objective function Y specific weight of water Hi mean of indicator random variable I(x) Xi correlation length of indicator random variable I(x) aâ€ž2 variance of W 2 variance of W caused by parameter j CHAPTER?: GENERIC DESIGN CASE: A SENSITIVITY ANALYSIS OF DATA WORTH 7.1 INTRODUCTION Chapter 7 presents a sensitivity study of the different factors affecting the worth, W, of a single, point, hard measurement. The factors include economic, hydrogeological, and geostatisdcal parameters needed to carry out the data worth analysis, as well as die presence, or absence, of existing data. Economic parameters include die discount rate, known benefits, known costs and cost of failure for all alternatives. The hydrogeological parameters include the three-dimensional geometry of the hydrostiatigraphic units, the distribution of hydraulic conductivity and porosity throughout each unit, and the hydraulic boundary conditions. The geostatistical parameters include the prior estimate of the mean, mj, the correlation length, Xj, and the confidence in mj. The major purpose of diis study is to present an analysis of how and why different factors can affect W. The secondary purpose is to show how sensitive W is to errors in parameter estimates. The effect of diis sensitivity on the decision of taking a measurement will be discussed briefly. A measurement will only be taken if its estimated wordi is greater than its cost. However, this concept will be ignored until Chapter 8. The decision of whedier a measurement should be taken is covered in detail in Chapter 8 using the case history. W is quantified in this Chapter in terms of method 3 because it is easier to visualize how different factors affect W. Method 3 was presented in Section 2.4.2.3. Recall from equation (2.44) diat W was quantified by: W = E[E(Z(Aj)'IS)) - E(Z(ADIS))] (7.1) where, - is the prior best design - A Q ' is the posterior best design - Z(ADIS) is the objective function of Ap, but evaluated with additional sample information S - E(Z(Ao)) is the expected value of ZCAp). For the case of two sample outcomes, S^ v and S^w, the above equation can be rearranged to form: W = {E[Z(AD'ISW)] - E[Z(ADISW)]}P(SW) + (7.2) {E[Z(AD'ISNW)] - E[Z(ADISNW)]}P(SNW) The first set of terms on the right hand side of the above equation represents the contribution of to W . It is the product between the probability of sampling the window, PCS^v), and the difference between E[Z(AJ)'ISW)] and E[Z(Ai5lS^)]. Simdarly, the second term on die right hand side represents the contribution of to W . It is important to note diat conclusions drawn in this Chapter may not be exti-apolated to every design problem. Parameters which are numerical artifacts of the methodology have already been discussed in Chapter 6. The sensitivity study will be carried out using the two contamination scenarios presented below. 7.2 T W O C O N T A M I N A T I O N S C E N A R I O S 7.2.1 S C E N A R I O N U M B E R 1 Scenario 1 is a layered system consisting of an unconfmed upper aquifer, an impermeable aquitard and a lower aquifer (Fig. 7-1 and 7-2). Groundwater contamination has been discovered in die unconfined aquifer. A municipal water supply well is located in the lower aquifer 400 m down gradient from the contamination. The well is pumping at a rate of 5 x 10"^ m^s (79 US gal/min). The clay aquitard is known to exist at the water supply well, but its presence is uncertain elsewhere. 800 n y. r contanlnatlon REGIONAL GROUND WATER FLOW \ 400 n ' B punping well 1000 n N I boundary vy ^ zone o f 0 100 n scote Figure 7-1: Plan view of contaminated aquifer system. If the contamination reaches the lower aquifer, then the municipal water supply well will be lost and an expensive remediation program wdl be necessary to protect furdier water supply wells. Failure is defined here as contamination reaching any part of the lower aquifer. The cost of failure is assumed to be $40 million. A decision must be made whether to contain the contamination, A^ , widi pumping wells, or to provide no containment, Aj^^, and hope for die best. The containment will be long-term, but its net present cost is estimated to be $4 million. It is assumed to be 100% effective. No benefits are associated with either altemadve and there are no known costs associated with AJ^Q. The costs and benefits for die alternatives are summarized in Table 7-1. A discount rate of 10% is used in the analysis. A a q u i t o r d o f unknown con t inu i t y^ A ' uncon f i ned a q u i f e r contaminat ion h, =100n =100n ^ F L D V â€” â€˘ \ FLOW â€˘ well hg =95n =95n c o n f i n e d a q u i f e r N o t e : v e r t i c a l e x a g g e r a t i o n = 5 Figure 7-2: Cross section through contaminated aquifer system. s c a l e lOOn I Alternative Known Known Costs Cost of Failure Benefits (millions $) (millions $) (millions $) Ac 0 4. 0 A N C 0 0. 40. Table 7-1: Costs and benefits for the containment and no containment alternatives. The boundary conditions are shown in die cross section du-ough die aquifer/aquitard system (Fig. 7-2) and the hydrogeological parameters are summarized in Table 7-2. The correlation length, and estimated mean, mj, are 50 m and 0.03, respectively. The confidence in the prior estimate of mj is equivalent to 15 independent hard data. The boundary conditions are 100 m away from the zone of interest. Recall, that windows are only generated in the zone of interest. Note that the downward vertical gradient across the aquitard is local in nature because it is due entirely to die pumping well. Layer Parameter Value Unconfined Aquifer Kh 2 X 10-5 m/s Kv 2 x 10-5 m/s Tiiickness 20. m porosity 0.3 Aquitard Kv 10-'" m/s Thickness 10. m porosity 0.4 Confined Aquifer Kh 8 X 10-5 m/s Thickness 20. m porosity 0.3 Table 7-2: Hydrogeological parameters used in Scenario 1. Two layers of nodes were used in the upper aquifer while one layer of nodes was used in the lower confined aquifer. The clay aquitard was accounted for using a leakage layer. The node spacing in both horizontal dkections is 25 m. The contamination source is represented by a series of particles released along an east/west horizontal Une in the middle of the upper aquifer. Numerical parameters used in die stochastic simulations are summarized in Table 7-3. The value of W is evaluated for a single borehole taken at point B, 50 m north and 25 m west of the municipal water supply well (Fig. 7-1). From the prior analysis, best design alternative is no containment, Aj^^^. Its expected objective function is E[Z(Af^Q] = -$2.45 million. The expected objective function for die containment alternative, E[Z(Ac)] = -$4 million. W is estimated to be $25 700. The prior probability of failure is 0.38 and die median failure time is 21.0 years. Parameter Value number of realizations per sam. outcome first aquitard block generated # of aquitard blocks starting seed 100 39 768 687453 Table 7-3: Numerical parameters used in Scenario 1. 7.2.2 CONTAMINATION SCENARIO 2: The second Scenario is idendcal to die first, except for the foUowing: no conditioning data are present, the well is absent, nij is estimated to be 0.01, and the boundary condition h4 on the south side of die lower aquifer has been reduced from 95 m to 90 m. Therefore, the vertical gradient is forced entirely by the lower boundary condition, rather than die pumping well. The definition of failure is still contamination of any part of die lower aquifer. From the prior analysis, E[Z{AQ)] = -$4 million and E[Z(Ai^c)] = -$3.01 mUlion. Therefore, the prior best design is again to have AJ^Q. The value of W for the same borehole as in the first scenario is estimated to be $55 600. The prior probabihty of failure is 0.28 and the median failure time is 17.4 years. 7.3. SENSITIVITY TO ECONOMIC PARAMETERS The sensitivity of W to die known cost of containment, the cost of failure, and the discount rate are studied below for scenario 1. 7.3.1 KNOWN COST OF CONTAINMENT (SCENARIO 1) W can be very sensitive to the known cost of containment, C^ (Fig. 7-3). Attention is drawn to three points D, E, and F in Figure 7-3 where C^ equals $2.39, $2.45, and $5.27 mUlion, respectively. W is zero for Cj, below point D, but rises rapidly to a maximum of $60 000 at point E. W dien more gradually drops off to $0 at point F where C ,^ = $5.27 mUlion. The significance of the three containment costs at D, E, and F can be visualized in Figure 7-4. Here, the expected prior objective functions of the two alternatives are shown in solid lines. The expected expected objective function of the posterior best design, E[E(Z(A]) ' ) ) ] , is shown in a dashed Une. Recall, that W is positive only if E [ E ( Z ( A Q ' ) ) ] is greater than the expected objective function of the prior best design, E[Z(Aj))]. W increases with this difference. The containment cost at point E represents where E(Z(Ac)) = E(Z(AJ^Q)) (Fig. 7-4). Below E, A^. is die best alternative, while above E, Aj^^ the best alternative. Point E also marks an instability in the data worth analysis. Any data worth analysis carried out near it can be unreliable. There will always be uncertainty in most parameters, such as C .^. Therefore, if C ,^ is estimated to be close to point E, it is possible that the true value of W could be dramatically different from its calculated value. This instability in W , at the point where E(Z(Ac)) = E(Z(Aj^(.)), will be seen for different parameters throughout this sensitivity study. o Known Costs for Containment (f) Figure 7-3: Sensitivity of W to the known cost of containment. For a positive W , the ttue cost of containment must lie between $2.45 mUlion (E) and $5.27 million (F). It is informative to graphically examine the effect of each sample outcome on the prior and posterior objecdve functions of die alternatives, and hence on W. The effect of sampling a window, S^ yr, is shown in Figure 7-5. Recall that determines that failure will occur with certainty. The prior and posterior objective functions are again shown in solid and dashed lines, respectively. Note that Z(Ac) is constant, therefore, E(Z(AĂ§)) = E(Z(AclSw))- For between E and F, S^ v results in a change in decision. A Q changes from Aj^j^ to A^. when the certainty of failure is determined. Sampling a window stops a poor design from being carried out. For C ,^ greater than F and less than E , S^ ^ has no effect on Ap. Known Costs for Contoinnnent (S) Figure 7-4: The expected prior objective function of the two alternatives and the expected expected objecdve function of the posterior best design alternative versus known cost of containment. Known Costs for Containment ($) Figure 7-5: Tlie effect of sampling a window on the objecdve functions of the different alternatives. Similarly in Figure 7-6, S]sj^ , only results in a change in the prior design alternative for between D and E. For C ,^ greater than E and less than D, Sj^^ has no effect on the prior design decision. to o o Known Costs for Contoinnnent ($) Figure 7-6: The effect of sampling no window on the objective functions of the different alternatives. Consequently, for C .^ below D and above F, neither or affects A Q and W = $0. W > $0 for Cc between D and F. For C .^ between D and E, only S^w affects A ^ , while for C^ between E and F, only affects A Q . Recall from Chapter 2, that a sample only has worth if its outcome results in a change in A Q . 7.3.2 COST OF FAILURE (SCENARIO 1) The value of W exhibits a similar sensitivity to the cost of failure, Cf (Fig 7-7). W is very unstable at point E (Cf=$67 million) where A ^ changes (E(Z(Ac)) = E(Z(ANC)) (Fig. 7-8). For Cf between E and F (Cf=30.4 million) W drops from a maximum to zero with less than a 3% change in Cf. For Cf between D (Cf = $30.4 million) and E , W rises more slowly. For positive W , $30.4 million < Cf < $65.4 million. Figure 7-7: W versus cost of failure. CO O o oĂ‰fOOO 2E+007 4E+007 6E+007 ' Cost of Failure ($) Figure 7-8: Tlie expected prior objecdve funcdon of die two altemadves and die expected expected objecdve funcdon of the posterior best design alternative versus cost of failure. 7.3.3 DISCOUNT RATE (SCENARIO 1) The value of W is very sensitive to the discount rate (Fig 7-9). It rises from $0 at i=0.066 (D) to $110 000 at i=0.0675 (E). It then gradually drops to zero at i=0.114. The maximum W at i=0.0675 (E) again represents an instability point for W where Ap changes (Fig. 7-10). For a positive W, the discount rate must lie between 0.0675 (E) and 0.114 (F). O . o o_ E o SI -*-Â« o o -> o o -o D V F oâ€” O.C 1 1 1 0 1 1 1 1 0.05 1 1 1 1 1 1 1 0.10 Discount Rote (decimal %) Figure 7-9: W versus discount rate. Discount Rote (decinnal %) Figure 7-10: The expected prior objective function of the two alternatives and the expected expected objective function of the posterior best design alternative versus discount rate. The change in W with discount rate demonstiates an important point. Different parties may view discount rates differendy, and consequendy can get large differences in W. For example, an owner operator of a land fill will be interested in making a profit. Therefore, future cost could be discounted back into present day dollars by some acceptable market discount rate. At a discount rate of 0.1, W is $25 700. However, for an environmentalist, a future failure is just as important as present day failure. Therefore, an envh-onmentalist would be more inclined to use a discount rate of 0 and would evaluate W = $0. For a discount rate of 0, the best prior alternative is A^. under all circumstances. 7.4 SENSITIVITY TO HYDROGEOLOGICAL PARAMETERS The sensitivity of W to the vertical hydraulic conducdvity (K^) of the clay aquitard, the horizontal hydraulic conductivity (K,,) of the lower aquifer, and the fixed boundary condition (h4) are studied using Scenario 2. 7.4.1 VERTICAL HYDRAULIC CONDUCTIVITY OF AQUITARD (SCENARIO 2) The value of is quantified here in terms of a leakage coefficient, L^,, which is defined as K divided by the aquitard thickness. At low Lj,, W is relatively constant at over $55 000 (Fig. 7-11). However, W drops to $0 as L^, increases above 10"^ 1/s. As the aquitard becomes more permeable at high L^., contamination can peneĂ»^ate the aquitard, regardless of the presence or absence of windows. This is shown by the prior probability of failure climbing to 1 at high L ,^ (Fig. 7-11). When die probability of failure is 1, it is certain that die best design is Ac. and W is zero. 1 0 - " 1 0 " ' Â° 1 0 1 0 " " 1 0 " ' 1 0 ^ Aquitard Leakage Coefficient ( 1 / s ) Figure 7-11: Tlie sensitivity of W and prior probability of failure to aquitard hydraulic conducdvity for Scenario 2. Therefore, for this case it is important to know if L^,, or , is above or below a threshold value. Knowing its actual value is less important. 7.4.2 HORIZONTAL HYDRAULIC CONDUCTIVITY OF LOWER AQUIFER (SCENARIO 2) The horizontal hydraulic conducdvity of the lower aquifer has been quandfied in terms of die ttansmissivity, T, which is equal to the product of die hydraulic conducdvity times the aquifer diickness. At high T, W is relatively insensitive to changes in T (Fig 7-12). As T decreases, W starts to decrease more rapidly and then becomes zero at T = 10-5 (mys). The reason for this decrease lies in die interaction between the downward vertical gradient across the aquitard and die transmissivity of the lower aquifer. At very low T, die lower aquifer can transmit littie water and there is litde leakage, and hence litde vertical gradient across the aquitard. With litde vertical gradient across the aquitard, diere is littie tendency for contamination to be pulled through it to the lower aquifer and for failure to occur, even if a window is present. Since fadure cannot occur it is certain that the best design is A^Q and W is zero. Transmissivity of Lower Aquifer (m**2/s) Figure 7-12: Tiie sensitivity of W and prior probability of failure to the transmissivity of die lower aquifer. However, as T increases, a cridcal point is reached (T = 10"'* m /^s) when a significant vertical gradient across the aquitard is created and significant leakage can occur dirough a window. The prior probability of failure increases from 0 to 0.168 and die prior design alternative of A^c could now be in error. W then increases from $0 to $26 700. However, as T increases further, it has little affect on the prior probability of failure. At this point, the probability of failure is controUed predominandy by the presence or absence of windows in the aquitard and not by the vertical gradient. Therefore, it is more important to know whether T, or Kj,, is above or below a critical value rather than knowing its actoal value. However, for diis case, the ti-ansition to the critical value is not as sharp as was die case for die hydraulic conductivity of the aquitard. Alternatively, if a sttong vertical gradient is known to exist across the aquitard, knowledge of T, or K,,, of the lower aquifer may not be important, at least for the definition of failure used here. 7.4.3 CONSTANT HEAD BOUNDARY CONDITION, h4, IN LOWER AQUIFER (SCENARIO 2) Several different boundary conditions could affect the flow model and the tiansport of contamination. The two Scenarios only account for a few of die possible boundary conditions. For example, the upper boundary could be a recharge boundary rather than a water table boundary and the vertical boundaries could be flow boundaries radier dian constant head boundaries. However, to simpUfy the situation, only the constant head boundary condition h4, will be investigated here. Al l odier boundary conditions are kept constant. The value of h4 was gradually reduced below the boundary condition, hj, on the aquifer above, increasing the downward vertical gradient across the south end of the aquitard. The value of W is plotted versus the difference in h3 and h4 (Fig. 7-13). The results are very similar to those encountered in the last section in the study of the hydraulic conductivity of the lower aquifer. At very low differences in head across the aquitard, UtUe vertical gradient exists, and failure does not occur. Consequendy, W equals $0. As the difference increases to 1 m, a critical vertical gradient is achieved and failure can occur. At a head difference of two meters W suddenly increases to approximately $8 (XX). W then increases more or less linearly with the difference in hydraulic head. Therefore, it is critical to know if the vertical gradient across die aquitard is above a critical value. Since W increases with the difference in hg and h4, if hg is known it can be important to know the actual value ofh4. h3 - h4 (m) Figure 7 - 1 3 : Tiie sensitivity of W to tiie constant iiead boundary condidon in lower aquifer. 7 . 5 SENSITIVITY TO GEOSTATISTICAL PARAMETERS The sensitivity of W to the prior estimate of the mean, mj, the correlation length Xj, and the prior confidence in m[ are presented in this section using Scenario 1 and 2 . 7 . 5 . 1 PRIOR ESTIMATE OF M E A N OF I(x) (SCENARIO 1 ) The value of W increases almost linearly with mj from $ 0 at mj = 0 to over $ 1 0 0 0 0 0 for mj = 0 . 1 5 (Fig. 7 - 1 4 ) . Data worth is totally due to sampling a window, S^. S^ again switches A Q from Aj^^. to A,-. (Fig 7 - 1 6 ) . S f ^ makes no contribution to W for all tested because, A^Q is the best alternative regardless of die sample outcome (Fig. 7 - 1 5 ) . Figure 7-14: Sensitivity of W to tiie prior estimate of the mean. CO O O +â€˘ L l J g T CO c o o o 'â€˘^ +-â€˘ I ? . <U L l J o I CO o o +-o o o o o o o o o - EfzrAc)) E Z A H C ) ) 0 0 * 0 ^ EfZfAcISNw)) A ^ r ^ ^ E ( Z ( A N C I S N W ) ) T- T 1 1 1 1 1 1 0.10 0.15 prior est imate of mean ^.00 0.05 Figure 7-15: Effect of sampling no window on the prior and posterior objecdve funcdon of the containment and no containment alternadves. o o o + O-l C O CĂ» 'â€˘^ o <D I 1 ,> O .eu lu" -o o o + o o o o o o o o o e-o o Â° Â° o E (Z (Ac) ) E : ( Z ( A N C ) ) ^ ^ o o - o < Â« . E(Z(AclSw)) E ( Z ( A N C I S W ) ) â€”I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 r y.OO 0.05 0.10 0.15 I prior estimate of mean Figure 7-16: Effect of sampling a window on the prior and posterior objective functions of the containment and no containment alternatives. The reason for this increase can be visualized as follows. W from equation (7.2) is W = {E[Z(AD'ISW)] - E[Z(ADISW)]}P(SW) + {E[Z(AD'ISNW)] - E[Z(ADISNW)]1P(SNW) (7.3) The prior design decision is no containment, hence Aj , = AJ^Q. If a window is sampled, A^,' = A ^ and if no window is sampled then A Q ' = A^^Q. Therefore, the above equation can be rewritten as W = {E[Z(AclSw)] -E[Z(ANciSw)])P(Sw) + {E[Z(ANCISNW)] - E[Z(ANCISNW)]}P(SNW) (7.4) Note that die second difference term is zero, and the above equation simplifies to W = {E[Z(AclSw)] -E[Z(ANCISW)] )P(SW) (7.5) From Figure 7-16, one can see that the difference in E[Z(AclS^)] and E[Z(A]^clSxv)] is almost constant. Therefore, the increase in W with mj is caused by the increase in P(S^) with mj (Fig. 7 -17 ) . O cs-â€˘a c cn I s . CO o J3 O O CL O o-O. 00 0.05 0.10 0.15 prior estimate of mean Figure 7-17: Probability of sampling a window versus die mean chance of a window. 7.5.2 CORRELATION LENGTH of I(x) (SCENARIO 2) The value of W rises from $0 at = 40 m to over $50 000 at A., = 45 m (Fig. 7-18). W dien rises slighdy, but then slowly drops off as increases. As A,; increases, the realizations become increasingly constrained resulting in fewer windows. Consequendy, the prior probability of failure decreases (Fig. 7-18). At low XJ, the probabihty of failure is so high that A -^. is the only viable alternative, regardless of the outcome of the sampling program, and W = $0. At A,j = 45 m, die probability of failure is low enough that Aj) becomes A j^Ă§.. W jumps to over $50 000 because samphng a window wdl find that A^Q was a poor choice. For this case, it is more important to know whether A-j is above or below a cridcal threshold, rather than its actual value. The decrease in W with increasing Xj at higher values of A,j is conttadictory to what might have been expected. Intuitively, one would expect that W would increase with Xj. As A-j increases, information from the sample can be extrapolated over a larger area and the effect of the sample on the system increases. However, die ti-end in W has more factors affecting it dian simply A,i. It is worth noting, however, that in die case history described in Section 8.4.3, W increases with A,j. Correlation Length (m) Figure 7-18: Sensitivity of W and prior probability of failure to correlation length. 7.5.3 CONFIDENCE IN THE PRIOR ESTIMATE OF MEAN (SCENARIO 1) Recall, that the prior confidence in mj is quantified by relating it to an equivalent number of prior independent hard measurements, n^. The value of W initially increases with increasing confidence in mj. or Ug, but then t)ecomes asymptotic for n^ greater than 15 (Fig 7-19). Therefore, for scenario 1 the confidence in the prior estimate of the mean is relatively unimportant. O o O n o -x: Â° I â€˘ â€˘ ' ' I ' ' â€˘ ' I I ' I I I I I I I I I I I I I ' I I I I I I I I I ' I I I I I I I I I 0 10 20 30 40 50 60 70 80 90 Equivalent Number of Measurements Figure 7-19: Sensitivity of W to die prior confidence in the estimates of die geostatistical parameters. As with die correlation length, W reacts in an opposite manner to what might have been expected. As n^ increases, die measurement outcome has less affect on the updated estimate of die mean (Fig. 7-20). As the measurement has less effect on the updated mean, one would expect that W would decrease. This paradox is explained below. to., o in - 1 c o a> -t-> I . I . o . â€˘ â€”â€˘ M|ISw oo-oo< M|ISNW Prior M, V â€˘ I ' 10 ' I ' I I ' ' ' ' I ' 20 30 40 50 60 70 80 Equivalent Nunnber of Measurennents Figure 7-20: The updated mean versus n^. Since W is completely due to S^, = Aj^^' and Aj)' = A^, given S^, equation (7.2) simplifies to W = (E[Z(AclSw)] -E[Z(ANCISW)]}P(SW) P(SYV) is controlled by the prior estimate of mj and and is independent of n^. Therefore, the increase in W is due completely to die difference between E[Z(ACISYV)] and E[Z(Aj^(-.ISw)], which decreases for increasing n^ (Fig. 7-21). E[Z(AclS^)] stays constant, but E[Z(Aj^(-.IS^)] decreases. The objective funcdon for A^^Q will have the following form since its known costs and known benefits are equal to zero: PfCf(t) (1 +iy (7.6) The cost of failure, Cf(t) is fixed and the probability of failure, Pf, remains constant at one. The only variable is the failure time, t. Therefore, the decrease in E[Z(Af^c''5w)l increasing n^ . is due entirely to a corresponding increase in t with n^ (Fig. 7-21). o o o +" Ld O C o to â€˘â€˘i^ o _> o .a O o o +-I Â°â€”a â€” Â« - â€” â€˘ â€” B - â€” â€˘ â€” Hâ€” â€” a â€” a- â€”a â€” â€” â€”o â€” â€” â€”o â€” â€” - o E(ZfAc)X CM (0 _ t f l â€” E E(Z(ANC ISW)) â€˘ B-o M Median Fai lure T ime -I ( I I I I I I I I I I I I I I 1 I I I I I I I I I \ I I I I I I I I 1 I R 10 2 0 3 0 4 0 5 0 6 0 7 0 8 0 Equivalent N u m b e r of M e o s u r e m e n t s _ o '6 Li. c g T3 9 0 Figure 7-21: Effect of sampling a window on the objecdve functions The reason for this decrease in t lies in the interaction between the vertical gradient through the window and the velocity in the upper aquifer at which the flow approaches the window. At high n^, when the updated mean approaches its prior value of 0.03, only 3% of the aquitard blocks will represent windows. Leakage wdl be concentrated through a few windows. Consequendy, die vertical gradient through the windows will be strong. Therefore, the contaminant will be transported more quickly as it approaches towards and ti-avels through the window in die aquitard. At n^ =1, die updated mean is 0.515; dierefore, approximately 50% of die blocks will represent windows. Leakage through the aquitard is distributed through many windows; dierefore, the vertical gradient pulling contaminant towards and through the window is small. Hence contaminants wiU not travel as quickly towards and through the window. 7.6 SENSITIVITY TO EXISTING HARD DATUM The sensitivity of the worth of a single borehole to an existing datum was studied for Scenario 1. The worth was calculated of single hard measurements taken on a 25 m spacing along an east west line dirough die hard datum located at the well. These results are shown in Figure 7-22. The well is located at 512.5 m. As expected, W is zero at die well, where the existing hard datum is located. There is no point in taking a measurement at a location where the aquitard is already known to exist with certainty. As die measurement gets further from die existing datum, uncertainty about the presence, or absence, of aquitard increases and hence, W increases. However, on eidier side W reaches a maximum about 50 m from the datum and then decreases to zero. As windows get further from the well the vertical gradient induced by the well through diem will decrease and failure is less likely to occur, accounting for the decrease in W. Note that the plot of W versus distance is not symmetric about the existing datum. This lack of symmetry is because the well does not lie in the middle of the path of contaminants, but off to the right hand side. 200 400 Distance (m) 600 800 Figure 7-22: Effect of an existing hard datum on die worth of single borehole taken along an east-west line dirough die datum. 7.7 SUMMARY OF CHAPTER 7 Chapter 7 presents a sensitivity analysis of the worth of a single, hard measurement to economic, hydrogeological, and geostadsdcal parameters. The economic parameters include the cost of failure, the discount rate, and the known cost of containment. The hydrogeological parameters include a constant head boundary condition in lower aquifer, the vertical hydraulic conductivity of die aquitard, and the horizontal hydraulic conductivity of the lower aquifer. The geostatistical parameters include the correlation length, the mean, and the confidence in the estimate of the mean. The worth is much more sensitive to the economic parameters than to the geostatistical or to the hydrogeological parameters. For the hydrogeological parameter in general, it is much more important to know whether the value of the parameter is above or below some threshold value, rather than knowing its actual value. For the geostatistical parameters, the wordi is relatively insensitive to the correlation length and the confidence in the prior estimate of die mean, but is sensitive to die estimate of the mean. As die measurement point approaches an existing datum point, spatial correlation causes W to decrease until W=$0 where the measurement and existing datum coincide. The estimated worth of a measurement can be unstable when the objective function of the prior best design alternative is close in value to that of anodier prior design alternative. Different parties, which view discount rates differendy, may get very dtfferent estimates of W. 7.8 NOTATION Ac containment alternative prior best design alternative posterior best design alternative A, no containment alternative Cj. cost of containment Cf cost of failure h hydraulic head i discount rate hydrauhc conductivity in horizontal direction hydraulic conductivity in vertical direction Lg leakage coefficient mj estimate mean of l(x) Ug number of equivalent measurements Pf probability of failure outcome of sampling no window outcome of sampling a window t time T transmissivity W wordi of a sampling program Z objective function A, I correlation lengdi of indicator random variable I(x) HI mean of indicator random variable I(x) CHAPTERS: SAVANNAH RIVER SITE CASE HISTORY 8.1 INTRODUCTION Chapter 8 uses the Savannah River Site (SRS) to demonstrate as realisdcally as possible how the framework developed in this thesis can be used to evaluate die worth of data in a hydrogeological design problem. The design problem utdized here is die closure of the H-Area seepage basins at the Savannah River Site (SRS). The SRS is a nuclear weapons production faciUty approximately 3 400 square km in size which is located in South Carolina on the Savannah River (Fig 8-1). The H-Area seepage basins are located in the General Separations Area (GSA) of the SRS (Fig. 8-2). The H-Area seepage basins are unlined eardi basins diat were used for the disposal of effluent water for nearly 30 years. The effluent contained dissolved tritium and other radioactive and nonradioactive species (Buss et al. 1987). The effluent would seep through the soil to die water table where it would then travel to nearby streams. Contaminants in the disposed effluent water were filtered by die soil as it infiltt-ated through the sod. Figure 8-1: Location map of die Savannah River Site (modified from Duffield et al., 1990). The seepage basins are presently no longer being used for waste disposal. However, if simply abandoned, they pose a potential long term threat to the local envh-onment and water supplies. It is possible that the contaminants remaining in the soil could be remobilized in die future by infiltrating precipitation. This remobilization could provide a contaminant source over the next several hundred years. To deal with this potential long term threat, various options were considered for the closure of the seepage basins. These options ranged from basically leaving die basins "as is" to removing all of the contaminated soil. The remobUized contaminants pose two environmental problems. Fu-sdy, they could be tiansported to Four Mile Creek (Fig. 8-2), which would cause it to become contaminated. Secondly, die contaminants could be tiansported through the aquitard discontinuities to lower aquifers, causing the contamination of clean aquifers. This case study will focus only on lower aquifers. The continuity of only one aquitard will be dealt with. The remainder of diis inttoduction will discuss die general GSA geology, physiography, and hydrogeology. Section 8.2 defines the base case, and presents all parameters needed by die framework to evaluate the worth of data. The prior analysis is carried out in Section 8.3. The preposterior analyses of hard, point data and soft, areal geophysical surveys are carried out in Sections 8.4 and 8.5, respectively. The case stody is presented here for demonstiation purposes only. It is not for determining the true closure design, which has aheady been invoked. The case study is as reaUstic as possible, but idealizations have been carried out to simplify the numerical modeling. In addition, values of certain parameters which are difficult to estimate, such as the cost of failure and the mean of I(x) have been set to be realistic, but also to force die worth of data to have a positive value. 8.1.1 PHYSIOGRAPHY ANfD GEOLOGY The GSA is approximately 5.5 km by 3.7 km in dimension (Fig. 8-2). The surface topography is characterized by gently rolling hills. It is bounded on the north by Upper Three Runs Creek, on the south by Four Mile Creek and on die east by die McQueen Branch. Figure 8-2: H-Area seepage basins in the General Separations Area (modified fi-om Parizek and Root, 1986) It is underlain by a thick sequence of unconsolidated sediments which form a layered system of aquifers and aquitards, that dip to the south east at 1.6 to 2 m per km. Sediments range in depth from 150 to 400 m (Duffield et al. 1989). The sediments were deposited during periods of successive transgressions and regressions (Cooke, 1936, quoted by Parizek and Root, 1986). This dme period also included several periods of erosion. This study will focus on die sttatigraphic units which form die upper diree aquifers and upper two aquitards. Several types of nomenclature have been used to describe die stratigraphy of die SRS. The nomenclature used by Parizek and Root (1986) is used here. The Congaree Formation forms the lowermost aquifer studied (Fig. 8-3). It is about 30 m in thickness and consists of mosdy well sorted sand. The Congaree Formation is of marine origin and was deposited during a transgression (Price, 1989). Tlie Congaree is lx)unded from below by the EUenton Formation, which is a thick, areally extensive clay unit. The Green Clay was deposited on top of the Congaree as the ti^ansgression continued, forming an aquitard. The Green Clay is generally continuous throughout the GSA and it's diickness ranges from 0.3 to 6.1 m. It consists of a grey to green dense clay. Glauconite present within the Green Clay confirms diat it is of marine origin (Price, 1989). However, the Green Clay is not always green or a clay. It is typically recognized by a very fine grained material (Parizek and Root, 1986). The Green Clay is the aquitard of interest in this study. A regression then occurred during which the McBean Formation was deposited. It forms an aquifer; however, the lower part of the McBean is carbonaceous and fine grained and could act as a confining layer widi the Green Clay. Glauconite in die carbonaceous zone can make it appear Green. It consists of well sorted to clayey sand and ranges in diickness from 20 to 25 m. Siple (1967), quoted by Parizek and Root (1986), determined that the McBean is of marine origin. Clay and silt lenses are widely distiibuted throughout the McBean. The McBean Formation is bounded from above by the Tan Clay, which forms the uppermost aquitard. It is a discontinuous layer diat occurs diroughout the GSA. It's diickness ranges from between 0.3 and 3 m. The Congaree Formation, Green Clay, McBean Formation and Tan Clay were all deposited during the Eocene period from 49 to 35 million years ago. The Barnwell formation forms the uppermost aquifer. It consists of clayey to well sorted sand. Discontinuous clay and silt lenses occur throughout. It is about 30 m in thickness. The bottom of the Barnwell could have formed in a near shore marine environment. Nysti-om and Willoughby (1982), quoted by Parizek and Root (1986), reported broad scour features at die base of the Barnwell Formation. Parizek and Root (1986) suggest diat die deposition of die Tan Clay on diis irregular feature could account for its discontinuous nature. The Barnwell was formed during the OligocĂ¨ne from 35 to 22 million years ago. 100 n 50 -0 0 Iâ€” GREEN CLAY-CONGAREE FORMATION E L L E N T D N FORMATION 1 k n Figure 8-3: Geologic cross section through the General Separations Area (modified from Parizek and Root, 1986). (The location of die cross section is shown in Figure 8-2.) A major potential cause of discontinuities in the Green Clay would be erosion by channeling (Price, 1989). Channeling could occur during any post Eocene period of unconformity or alluvial action. Several such periods of alluvial action did occur. For example, the Barnwell is overlain by an unconformity and much alluvial activity occurred during the Late Tertiary periods. For a more detailed description of the site geology, refer to Parizek and Root (1986). 8.1.2 HYDROGEOLOGY The water table is predominandy located in the Barnwell, but also in some parts of the McBean and Congaree near Upper Three Runs Creek (Fig. 8-3). In general, water enters the system through infilti-ation into die Barnwell. The diree bounding creeks form discharge zones. The hydrauhc head distribution in the Barnwell and McBean is strongly influenced by all three creeks. In the Congaree, only Upper Three Runs Creek has an influence on the hydraulic head distiibution. The vertical hydraulic gradient is downward across bodi die Tan Clay and die Green Clay. Groundwater flows from the Barnwell either laterally to discharge in the smrounding creeks or vertically across the Tan Clay into the McBean. The groundwater in the McBean predominandy moves laterally towards the three creeks. However, some groundwater will penetrate the Green Clay and pass into the Congaree. In general, the Tan Clay does not appear to be a significant confining layer or to gready affect the hydrogeological system. Hydrogeological modeling studies by Parizek and Root (1986) and a comparative study of hydrographs from the McBean and Barnwell suggest that the two formations act as one hydrostratigraphic unit. The Tan Clay best acts as a confining layer in H-Area where the hydraulic head difference across die aquitard is up to 5 m. The Green Clay acts as a very sttong confining layer. The hydraulic head difference across die Green Clay is as high as 25 m. Parizek and Root (1986) report that diis head difference is persistent throughout the GSA and, dierefore, it is unlikely diat die Green Clay is extensively discontinuous. However, Duffield et al. (1989) found that small discontinuities in the Green Clay in the GSA could be difficult to detect with hydraulic head data because the perturbations that they created in the hydraulic head field were small. Groundwater in the Congaree comes from leakage dirough die Green Clay and from horizontal flow from off site towards the soudi east. Groundwater in die Congaree flows laterally north west where it discharges into Upper Three Runs Creek. The leakage across die Green Clay is probably not great (Parizek and Root, 1986). The Congaree is bounded from below by die Ellenton Formation, which is a thick, areally extensive aquitard. The Ellenton Formation is assumed to be completely impermeable in this diesis. The clay and silt lenses in the Barnwell and the McBean will cause the hydraulic conductivity to be anisotiopic. Groundwater flow velocities in both the Barnwell and the McBean are up to 100 m/yr. Therefore, the assumption of advective tiansport dominating dispersive tiansport will be valid. The dominance of advective transport over dispersive tiansport was discussed in Chapter 5. For a more detailed description of the site hydrogeology, refer to Parizek and Root (1986). 8.2 SET UP OF BASE CASE Section 8.2 sets up the base case, or the basic modeled system. Setting up the modeled system requires: (a) determining the closure design alternatives, (b) determining how to model contaminant tiansport near the H-Area seepage basins, (c) analyzing available Green Clay data, and (d) evaluating the geostatistical parameters using the available Green Clay data. 8.2.1 ALTERNATIVE CLOSURE DESIGNS Three design alternatives were considered for closure: (a) low permeability clay cap, (b) no clay cap (or no action), and (c) waste removal and low permeability clay cap (Killian et al. 1987). For simplicity, this diesis will only deal with the first two alternatives. In the low permeabihty clay cap option, A^c, the basins would be back filled with soil and compacted. A low permeabUity cap would then be placed on the top to reduce infiltration. The basins would be fenced off and maintenance would be carried out for 30 years. Maintenance would consist of mowing grass in the basins, checking for soil erosion, and monitoring groundwater. This alternative is assumed to be 100% efficient in stopping die contaminant source. The basins were closed with diis option. In the no clay cap alternative, Aj^^-., the basins would fenced off and maintenance similar to the A ^ ^ alternative would be carried out. With this alternative the seepage basins represent an active source of contamination. Therefore, failure, or contamination of the lower aquifer can only occur for Aj^ Ă§.-The total estimated costs in fourth quarter 1985 dollars for A^c. and AJ^Q are $22.8, and $3.3, respectively (Killian et al., 1987). The benefits of all altemadves are zero (that is they do not lead to any direct income to the decision maker). Unless stated otherwise, all monetary values used in Chapter 8 are in fourth quarter 1985 dollars. Preventing contamination of the Congaree is important for several reasons. Firsdy, under existing regulations, if contamination occurs, it will have to be cleaned up. Secondly, contamination will result in regulatory penalties and bad publicity for the SRS. The critical factor in preventing contamination of the Congaree is the continuity of the Green Clay. The author is unaware of any study that has direcdy estimated the cost of contamination of die Congaree by the H-Area seepage basins, and determining a cost of failure is difficult. The easiest of the above cost factors to estimate will be regulatory penalties. However, even these may not be that sti-aight forward to estimate because fadure may not happen for several years and it then may not be detected for several more years. The penalties wdl be dependent upon future laws which may be different than those now in effect. Clean up costs are difficult to estimate. Again, since fadure may happen a number of years into the future, the clean up costs will be highly dependent upon future, unknown clean up standards and techniques. Based on present clean up standards and techniques, the clean up costs wdl be very expensive. Clean up costs at some contamination sites have reached over $100 million. However, it is suggested here that this is an absolute maximum of any possible clean up cost for two reasons. Firsdy, present clean up standards are unrealistically stiict. For example, under some legislation the contaminated aquifer must be cleaned up to drinking water standards. Practice has shown diat in most cases, this is an impossible goal. In the future, clean up standards will likely be relaxed. Secondly, much of the past clean up costs have been spent on litigation rather than the actual clean up. It is also likely in the future as legislation improves, that these costs will be reduced. The costs associated with bad publicity are extremely difficult, if not impossible to quantify. Therefore, since the cost of failure is unknown and very difficult to estimate, and this case study is presented for demonstiation purposes only, two costs of failure will be assessed, and diese will be arbitiarily set at $70 million and $45 million. These values were chosen after some preliminary sensitivity analyses of the case history were carried out. The $70 million cost of failure was chosen to force data to have a large stable positive net worth. Recall from Chapter 7 that if the prior best design alternative is unstable, the estimated W is unstable. The $45 million cost of failure still gives a positive net worth, but it is less stable. From the sensitivity analysis carried out in Section 8.4.2.2, a cost of failure of approximately $35 milhon results is a net data worth of $0. The costs and benefits associated with the three alternatives are summarized in Table 8-1. A discount rate of 0.10 is used in the analysis. A discussion on the choice of different discount rates is presented by Massmann and Freeze (1987a and 1987b). 8.2.2 MODELING OF CONTAMINANT TRANSPORT NEAR H-AREA SEEPAGE BASINS Root (1987) and Parizek and Root (1986) carried out comprehensive hydrogeological studies of die GSA. Odier studies include: Duffield et al. (1990), Duffield et al. (1989), and Duffield et al. (1987). Alternative Benefits Known Costs Cost of Failure (million $) (million $) (million $) Acc 0 22.8 0 0 3.3 70 Table 8-1: Summary of cost and benefits associated with the no action and waste removal alternatives. Ideally, die flow widiin die endre GSA should be modeled because of die presence of die du-ee bounding creeks makes die boundary condidons on all sides easy to define . However, it is impractical to model such a large area because of the number of Monte-Carlo reahzations needed for the data wordi analysis. Therefore, flow in a much smaller region around the H-Area seepage basins will be modeled instead (Fig. 8-4). Steady state flow conditions are assumed. Figure 8-4: Area where flow and transport are modeled around the H-Area seepage basins. The Tan clay was not included in die numerical model of die site hydrogeology. The Tan Clay has a very weak influence on the hydrologie system (see Section 8.1.2) and it is very discontinuous and does not prevent contamination from entering the McBean Formation. Yet, it is almost as thick as die Green clay and its vertical hydraulic conductivity was estimated by Parizek and Root (1986) and Duffield et al. (1987) to be of the same order of magnitude as the Green Clay. Therefore, if it were included as a continuous layer, it would exert an artificially stiong influence on the vertical movement of groundwater and prevent contamination from entering the McBean Formation. Alternatively, it could be included as a distinct layer with discontinuities placed in it. However, it is not known where these discontinuities are; therefore, this approach was rejected. However, leaving out the Tan Clay completely will artificially increase the vertical ttansport of contaminants. Contamination is generally ti^ansported horizontally in the Barnwell and upper part of die McBean (Duffield at al. 1989). To maintain ttansport in the upper portions of the system, die vertical hydrauhc conductivity of the Green Clay was arbittarily decreased by a factor of two from that estimated by Parizek and Root (1986). Al l vertical boundaries on the edges of the flow region are defined as constant head boundaries. Since the Tan Clay is assumed to not be present, die vertical boundary conditions in the McBean and the Barnwell will be identical. However, those in the Congaree will be different because of the presence of the Green Clay. The southern and eastern boundaries in die McBean and Barnwell are placed along Four Mde Creek and the unnamed creek, respectively. The western boundary was placed so that approximately 50% of it overlapped with another unnamed creek. The northern boundary was placed approximately 600 m north of the main seepage basin. The hydraulic head along all boundaries was set equal to the elevation of the 1982 water table, presented by Parizek and Root (1986). In die Congaree, all vertical boundaries are placed along die edges of die modeled flow region. The values on the constant head boundaries are set by interpolation from a 1982 contour map of hydraulic head in the Congaree presented by Parizek and Root (1986). The water table represents the upper boundary of the hydrogeological system. The recharge to the water table was estimated by Parizek and Root (1986) to be 38.1 cm/yr. Ideally, widi die clay cap alternative, inflation will be reduced over die seepage basins, creating a depression in die water table. However, for simplicity this change in boundary condition for the clay cap alternative will be ignored. The bottom of die Congaree is assumed to be impermeable. The only two hydrogeological studies of die GSA presented in enough detail to sufficiendy follow die important modeling steps are Parizek and Root (1986) and Duffield et al. (1987). (Root (1987) and Parizek and Root (1986) present equivalent hydrogeological information.) Parizek and Root (1986) modeled the same system of aquifers and aquitards modeled here. Duffield et al. (1987) included the aquifer and aquitard below the Congaree in their modeling study. The modeling carried out in this thesis will closely follow diat of Parizek and Root (1986) radier than Duffield et al. (1989) because of die greater similarity of the system modeled and the more comprehensive nature of the presentation. The same layering of nodes used by Parizek and Root (1986) will be used in this thesis. The Barnwell will be represented by a single layer of nodes, while the McBean will be represented by two layers of nodes to account for vertical gradients. The Green Clay is represented as a leakage layer, and die Congaree is represented by a single layer of nodes. This density of nodes should be sufficient to model the groundwater velocities as accurately as needed here. More layers are used here than are used in the particle tiacking study carried out by Duffield et al. (1987) in the same area. A nodal spacing of 29 m has been used in die horizontal discretizations of die aquifer and aquitard units. Therefore, the aquitard has been discretized into 29 m square blocks. This discretization corresponds to 60 columns and 50 rows. The effect of the density of discretization on the worth of data is investigated in Section 8.4.2.1. The hydraulic parameters used in diis thesis for die hydrostradgraphic units are summarized in Table 8-2. Except the hydraulic conductivity of die Green Clay, all of die values were taken direcdy from die results of a steady state calibration by Root (1986). Recall that die hydraulic conductivity of the Green Clay was reduced by a factor of two. Root's calibration was carried out by trial and error. The calibration was carried out by matching the calculated hydraulic head widi maps of hydraulic head in the diree aquifers diat were produced from 150 measurements. The contaminant source is represented by 33 particles spaced along a line centered in each of the four seepage basins. The particles are all initiaUy released from an elevation corresponding to 99% of die saturated thickness of the Barnwell. The flow solution and the padis of contaminant particles tracked are shown in Figure 8-5, for the case where no windows are present in the Green Clay. hydrosttatigraphic unit parameter Estimate Barnwell Kh 8.8 X 10-6 (m/s) Kv 7.4 X 10-8 (m/s) porosity 0.2 Upper McBean Kh 1.8 X 10-5 (m/s) Kv 1.5 X lO-'' (m/s) porosity 0.2 Lower McBean Kh 8.8 X 10-6 (m/s) Kv 7.4 X 10-8 (m/s) porosity 0.2 Green Clay Kv 1.75 X 10-10 (m/s) porosity 0.4 Congaree Kh 2.2 X 10-'* (m/s) Kv 2.2 X 10-5 (m/s) porosity 0.2 Table 8-2: Hydraulic parameters of hydrosti-atigraphic units used in die base case. Windows will be generated in the Green Clay up to 50 m from the fixed boundary conditions in the McBean and Barnwell. 8.2.3 GREEN C L A Y DATA BASE This thesis will utilize the same Green Clay data base presented by Parizek and Root (1986). It consists of 147 boreholes which penetrated the Green Clay horizon. They are summarized in Appendbc 1. Al l of the boreholes had lithologie logs available while 47 boreholes also had geophysical logs. More Recent boreholes have been taken, but they will not be used because their results are not readily available. The boreholes have been split into both hard and soft data. The hard data comprises the 47 boreholes which have geophysical logs. The geophysical logs include gamma ray, self potential, and resistivity logs. These boreholes are defined as hard data because the gamma ray signature of the Green Clay is relatively consistent and widespread throughout the GSA (Parizek and Root 1986) and is easy to pick out (Aaland, 1989). Al l of these boreholes found Green Clay present. The data points are mosdy concenti^ted near H-Area, but a few are spread throughout the GSA; however, only one is located near the H-Area seepage basins (Fig. 8-6). The soft data comprises the remaining 100 boreholes which have only lithologie logs. These data are defined as soft because often only a percentage of the core was recovered in many boreholes (Parizek and Root, 1986). The quality of the logging and the amount of core recovered varied over time with earlier boreholes having poorer core recovery and a poorer quality of logging than the later boreholes. However, for simplicity, this variation is ignored. Thirty four of diese boreholes (34%) did not encounter the Green Clay. Their locations are again concentrated in particular areas of interest, but some locations are scattered throughout the GSA (Fig. 8-7). It is worth noting diat boreholes diat did not encounter the Green Clay are generally located widiin 1(X) m of a borehole that definitely did encounter die Green Clay. N 15 I 1 1 1 1 1 1 I I I 15 215 415 615 815 1015 1215 1415 1615 distance (m) Figure 8-5: Contours of iiydraulic liead in Barnwell Formation and paths of pardcles of contamination near the H Area seepage basins. (Contour interval is 1 m.) The Green Clay data base has been split into two categories. The 147 boreholes taken du-ough out the GSA wiU be referred to as global data. They are used to estimate and A,j. (The estimation of Hj and is covered in the next Section.) A subset of the global data (14 soft and 1 hard) are located within the region where flow and transport are modeled (Fig. 8-8). These are referred to as local data. The aquitard realizations are conditioned on local data only. Recall from Chapter 4 that die reliability, or quality, of the soft data is quantified by Pl = P(Is(x) = lll(x) = 1) = P(Sw I Window is present) P2 = P(Is(x) = lll(x) = 0) = P(SY^ I No window is present) + Figure 8-6: Locations of global hard data. where pj represents die probability that the lithologie log will indicate a window given that a window is really present. In other words, pj represents the probability that the sample outcome is correct. In this thesis it is assumed diat Pi =1. The value of pj represents die probability that die lidiologic log indicates a window is present, while in reality Green Clay is present. In odier word, P2 represents the probability diat die sample outcome is wrong. Here, P2 is estimated from die lidiologic logs of die 47 hard data. Of all the hard data sampled, 18, or 38%, of these lithologie logs report a window even though the geophysical logs indicate aquitard; dierefore, P2 = 0.38. Hence, it is expected diat 38% of the soft data will show a window when aquitard is actually present. THREE DATA POINT Figure 8-7: Locations of global soft data. Figure 8-8: Local hard and soft data used to condition aquitard realizations. It is likely that none of the soft data truly sampled a window. Since, P2 = 0.38, it is expected that 38% of the lithologie logs will indicate a window even when Green Clay is actually present If some lithologie logs are truly finding windows, then it would be expected that more than 38% of the logs would indicate windows. However, only 34% of the hthologic logs alone indicate windows, which is below the expected 38%. Furthermore, all of the lidiologic logs of the hard data which indicate a window are apparendy wrong. It is assumed diat leakage will not occur through existing boreholes. 8.2.4 INFERENCE OF GEOSTATISTICAL PARAMETERS Recall from Section 4.3.3 that the mean will be estimated by Bayesian updating, while the correlation length must be estimated by non Bayesian mediods. The Bayesian estimation of [Xj is dependent upon Xj; therefore, A-j must be estimated fh-st. However, the estimation of presents a problem because it cannot be direcdy estimated from the data. This problem and the method used to circumvent it are discussed in die next Section. 8.2.4.1 Inference of Correlation Length. The correlation length cannot be direcdy estimated for either the hard or soft data. Al l of the hard data encounter aquitard; therefore, there is no variability in the data and a covariance function cannot be calculated. The soft data does have variability in the data; however, as discussed in Section 8.4, all of the encountered windows are probably due to false sample outcomes. Therefore, the variability is likely due to false sample outcomes, rather dian representing any true variability in the aquitard. A covariance function based on these data would be completely misleading as to the true spatial variability of the Green Clay. In lieu of data to estimate the correlation length, it will be arbittarily assumed that is equal to 200 m. It assumed that the correlation will be large since the Green Clay was deposited in a marine environment. Fortunately, as will be shown in Section 8.4.2.3, the worth of a sample is relatively insensitive to the correlation lengdi. 8.2.4.2 Inference of Mean and Variance There are three sources of information available to estimate die mean: geological intuition, global soft data and global hard data. The mean will be first estimated based on geological intuition. A combined estimate using all of the information will be obtained by updating the geologically based estimate in two separate steps, one for die soft data and one for die hard data, using equation (4.55) which is presented below: where - H ' = [u^ S I J I J I u]"^ or the weight of the prior data - H = u^Sij"^], or the weight of the sample data - mj' is the prior mean - mj is the sample mean - mj" is die updated mean - Pl and P2 represent the precision of the measurement. It is believed that the Green Clay was deposited over 35 million years ago. Since then, unconformities and periods of alluvial action have occurred which could have eroded a discontinuity in the Green Clay. Upper Three Runs Creek is a present day example of channeling creating a discontinuity in the Green Clay. Therefore, based on the geology alone, it could be estimated that 5% of the Green Clay under the m (8.1) GSA has been eroded by channeling in die past 35 mdlion years. Therefore, die geological estimate of the mean, mj', is 0.05. The confidence in the prior estimate is assumed equivalent to five independent hard measurements. Updating die geological estimate widi the 100 global soft data yields: ^ â€ž _(105.26)(0.05) +(132.116)(0) â€˘ " l - 105.26+132.116 ^ -^^ ^ = 0.0222 Note that the unbiased estimate of die sample mean based on soft data alone is 0 and diat the weight of the soft data, H, is 132.1 compared to the weight, H ' = 105.3 for geological intuition. The second updating step using the 47 global hard data yields: _(0.0222)(237.38)-H (336.18X0) " " l - 237.38 + 336.18 = 0.0092 Note, that die sample mean based on die hard data alone is zero and that die weight of the hard data alone, H, is 336.2 which is three times the weight of the soft data alone and over three times the weight of the geological estimate. The overall updated mean, mj'", is 0.0092. The updated estimate of die variance is a2"' = 0.0092 (1 - 0.0092) = 0.00912 The geostatistical parameters used in the base case are summarized in Table 8-3. Parameter description mean variance correlation length confid. in geol. est. of mean 0.0092 0.00912 200 m 5 ind. samples Table 8-3: Summary of geostatistical parameters used in base case. 8.3 PRIOR ANALYSIS A prior analysis is carried out using the base case set up in Section 8.2. Recall from Section 6.4 that the prior analysis is calculated using the same realizations generated for the preposterior analysis. One hundred realizations are used for each sample outcome in die preposterior analysis; therefore, a total of 200 realizations are used in the prior analysis. A decision free illustiating the prior analysis is shown in Figure 8-9 for Cf = $70 million. AJ^Q is the best alternative with an expected objective function of -$11.296 million. The expected objective function for Ac is -$22.8 million. The prior probability of failure is 0.226. The expected regret of the clay cap alternative is $3.79 million; therefore, no more dian approximately $3.79 milhon should be spent on any exploration program. Note that the expected regret of the prior best design alternative is equal to the expected value of perfect information, EVPI. When Cf = $45 million, the clay is still the best alternative with an objective function of -$8.44 million. The expected objective function for the clay cap alternative is still -$22.8 million. The prior probability of failure is still 0.226, but the expected regret of die clay cap alternative is $1.19 million. No more dian approximately $1.19 million should be spent on any exploration program. ECZ(Ac)] = - $ 2 2 800 E[Re9(Ac)]=$12 600 Note : AU d o l l a r s a r e in t h o u s a n d s P(fai l )= 0.226 E[Z(A^C^^ = -^11 296 E[Re9(A,^c^^=^3 790 Figure 8-9: Decision ttee used in prior analysis, where Cost of failure = $70 million. Note that reducing Cf from $70 million to $45 million has reduced the estimate of maximum expenditure for an exploration program by over three dmes. Cf could be very difficult to estimate. Failure can happen decades into die future; therefore, the cost of failure is dependent upon many future factors which are unknown at present, such as legislation and clean up techniques. Therefore, the estimated maximum expenditure is highly uncertain. 8.4 PREPOSTERIOR ANALYSIS OF HARD DATA Section 8.4 presents die preposterior analysis of hard data. This analysis includes estimating data wordi for bodi Cf = $70 and $45 million. For simphcity, Cf = $70 million and Cf = $45 mdlion, will be referred to as Cf7 and Cf4, respectively. The framework is used in Section 8.4.1 to determine whether a single hard measurement is cost effective, or in other words should it be taken. The robusmess of this decision is studied in Section 8.4.2. In Section 8.4.3, die framework is used to prioritize sampling locations and to determine the maximum area around the seepage basins where hard measurements should be taken. Finally, in Section 8.4.4, the worth patterns of multiple hard measurements is studied. It is assumed in this Chapter that a hard measurement is idendcal to existing hard data and consists of a borehole and geophysical logs including: gamma ray, self potential, and resisdvity. The preposterior analysis of a soft geophysical survey that covers a large area is carried out in Secdon 8.5. 8.4.1 SINGLE HARD MEASUREMENT The framework is used in diis Section to determine whether a single hard measurement should be taken at point B, approximately 300 m soudi of die seepage basins (Fig. 8-10). A decision tiee for the preposterior analysis of this measurement is shown in Figure 8-11, for . The expected expected objective function of the posterior best design alternative is E [ E ( Z ( A D ' ) ) ] = E(Z(AD'1SW))P(SW) + E(Z(AD ' ISNW))P(SNW) (8.4) = (-S22 800 000)(0.00730) + (-$11 034 000)(0.99270) = -$11 119 892 Therefore, W from equation (2.30) is W = E [ E ( Z ( A D ' ) ) ] - E ( Z ( A D ) ) (8.5) = -$11 119 892-(-$11 296 000) = $176100 For Cf4 , W is reduced to $62 400. 1415 1215 1015 vĂŠ- 815 (U O g 615 tn T5 415 215 15 15 215 415 615 815 1015 1215 1415 1615 distance (m) Figure 8-10: Location of hard measurement taken at point B. We are really interested in the net worth, Wâ€žg( of the hard measurement, where Wâ€žgj is defined as the difference between W and the cost of taking the measurement. It represents whether the measurement is cost-effective or not. It is assumed that if the measurement, or borehole, is taken, a monitoring well will also be installed. The cost of installing a monitoring well at die SRS in 1989 dollars is approximately $420/m. This cost includes drilling, materials, and oversight by a state registered geologist. The depth to the Green Clay at point B is approximately 33m, but it is assumed that the borehole will be drilled a further 20 m into the Congaree to ensure that the Green Clay horizon is penetiated and to get a sampling point for the hydraulic head towards the lower half of the Congaree Formation. Therefore the total cost of die borehole and monitoring well is 22 260 ($420/m x 53 m) 1989 dollars. o - future sample point J i I I I I I PRIOR ANALYSIS "^7 E[Z<At^3=-*22 800 E[Re9(/^g>]-*15 293 tij" 0.0091 P<foil>= 0.226 E[Z<A^>]=-*11 296 ECRe9<Aâ€žc >3=$3 788 0' Note: All dollors are in thousands E[Z<AtJS^>]=-$22 80( E[Reo</^ JS,,>J = Â»0 P(FoillS>^= 1.0 ECZCA IS^)]=-S46 937 E[Re9(AjS^>]=S24 137 PREPDSTERIDR ANALYSIS '9 ECZ</^JS,^>]=-$22 800 E[Re9<A^JS^)3=Â«l5 405 P(FaillSlw>= 0.220 E[Z<A^ I S ^ Â» - * 1 1 034 EIRegCA^ IS^y)]=$3 639 Figure 8-11: Decision tree used in preposterior analysis of hard measurement taken at point B, for cost of failure = $70 million. The cost of taking die geophysical logs in one borehole 50 m deep at the beginning of 1992 will be on the order of $700 (Sperling, 1992). When all costs are discounted back at die annual rate of inflation, the total cost of die hard datum is approximately 20 000 diird quarter 1985 dollars. Hence, for Q.^ Wâ€žg( is approximately $156 100 (=$176 100 - $20 000) in diird quarter 1985 dollars. For Cf4, Wâ€žgt is reduced to $42 400 (=$62 400 - $20 000). Note diat die cost of die data wordi analysis is assumed to be zero. 8.4.2 SENSITIVITY OF NET WORTH OF HARD SINGLE HARD MEASUREMENT TAKEN AT POINT B TO PARAMETERS SET UP IN BASE CASE. In Section 8.4.1, it was determined that die hard measurement taken at point B was cost effective, or hence, that it should be taken. For this data worth analysis to be carried out, values had to be assumed for all parameters in the base case (Section 8.2). These included geostatistical parameters, hydrogeological parameters, economic parameters, and numerical artifacts of the SIS algorithm. Errors, or variations, in estimates of these parameters will cause variations in the estimated net worth of die measurement. The purpose of the this Section is two fold. First, and foremost, it is to test die robusmess of the decision made in the last Section that die measurement is cost effective. Recall that for the measurement to be cost effective, Wâ€žgj > $0. The second reason is to get an idea of how large these variations in Wâ€žg( can be. Both of these purposes are accomplished by carrying out a sensitivity of Wâ€žgt of the measurement taken at point B to parameters set up in the base case. These parameters include: economic and geostatistical parameters, and numerical artifacts of die SIS mediodology. The sensitivity of W to hydrogeological parameters will not be studied. Recall from Chapter 7 that the following two conditions are important with respect to the hydrogeology: (1) The aquitard is impermeable enough to prevent major vertical migration of contamination, except where windows are present. (2) The vertical gradient across die aquitard is sufficient to draw contamination through a window, if present, to the lower aquifer. The GSA hydrogeology is well understood; therefore, both of these conditions are known to be tine (Refer to Section 8.1). Therefore, for simplicity it was decided not to carry out a sensitivity analysis for the hydrogeological parameters. 8.4.2.1 Numerical Artifacts Parameters which are numerical artifacts of the SIS methodology include the number of blocks into which the aquitard has been discredzed, the starting seed, die number of die first aquitard block generated, and die number of realizadons generated. The sensitivity of the net wordi to aquitard discretization was studied by carrying out the preposterior analysis using five different densities of aquitard discretization, or finite difference mesh (Table 8-4). Recall, each finite difference cell represents one aquitard block. grid size aquitard block dimension (m) 40x33 43 50x42 35 60x50 29 70x59 25 80x67 21 Table 8-4: Aquitard discretizations used in evaluating net worth of hard measurement taken at point B. For C0 as the aquitard block size changes, Wâ€žg( fluctiiates randomly and very slighdy about its mean value (Fig. 8-12) with a standard deviation of $3 820. This standard deviation is only 2.6% of die mean of $151 040. For Cf^ the fluctuations are similar with a standard deviation of $2017. This standard deviation corresponds to only 3.4% of the mean of $58 900. Note, that a block size of 29 m is used in die base case. in o o +â€˘ LJJ es -C o o o o L J 0 o Net Worth, C,=$70 million - â€” Average Net Worth, Ci=$70 million (Â»eeeo Net Worth. Ci=$45 million Average Net Worth, C,=$45 million 1 0 ~ ~ 1 â€” 3 0 (m) Block 2 0 Dimension 4 0 Figure 8-12: Sensitivity of net worth to aquitard block discretization. For both C^; and Cf^, Wâ€žgj has very litde sensitivity to the starting seed, the number of die fkst aquitard block generated, or die number of reahzations. For all cases, it fluctuates randomly and slighdy about its mean value. The standard deviations of die fluctuations are $835, $3 147, and $3 121 for the number of realizations, the starting seed, and die number of the fh-st aquitard block generated, respectively. The standard deviation of the total random error from all of the numerical artifacts in Wâ€žg( from equation (6.15) is Â«'w.. = ( i c T w . . ' ) i ^ (8.6) i=i =($3 8202 + $8352 + $3 1472 + $3 1212 )i/2 = $5 910 Therefore, die standard deviation of die total random error is only 3.4% of W^^ ^ of $156 400 evaluated for die hard datum taken at point B. For 0^4, die standard deviations of die fluctuations are $474, $1917, and $1968 for die number of realizadons generated, the starting seed, and the number of the first aquitard block generated. The total standard deviation is $3441 which is only 12.3% of die Wâ€žet of $42 000. In summary, for both costs of failure, parameters which are numerical artifacts will only have a small effect on Wâ€žgj of the measurement taken at point B and hence will not affect whether it is cost-effective or not. 8.4.2.2 Economic Parameters The value of Wâ€žet is almost equally sensitive to the discount rate for both costs of failure (Fig. 8-13). As die discount rate increases, Wâ€žej decreases. The bounds over which the measurement is cost-effective is wide for bodi cost of failure. For C^^, die discount rate must increase to 0.15, while for C^j, it must increase to greater than 0.15, before the measurement is no longer cost-effective. Errors of more than a few percent in the discount rate are unlikely; therefore, Wâ€žgj will have a positive value for all lUcely discount rates. Therefore, for the example carried out here, the discount rate is not a critical parameter in determining whedier die measurement is cost-effective or not. However, as Cf decreases, it will become more important. Turning now to the cost of failure, Wâ€žgj is zero for Cf < $35 million, but dien increases linearly with Cf (Fig. 8-14). Cf could be a critical parameter in determining whedier die datum is cost-effective. A 50% decrease in die base case Cf of $70 miUion would result in Wâ€žg, = $0. Unfortunately, a 50% error in the estimated Cf is possible since as discussed in Section 8.3, it is very hard to estimate. O Â§^ O ÂŁ 8 , ^ o CN -4-Â» O O 0.00 Cf=$70 million oo-oo-o Cf=$45 million 1 1 1 1 1 1 1 1 1 1 f r 0.05 0.10 0.15 Discount Rote (decimal Figure 8-13: Sensitivity of net wortli to tlie discount rate. <N S s -s o -OP+OOO 2E-H007 4E-I-007 6E-I-007 8E-I-007 Cost of Foilure ($) 1E-I-008 Figure 8-14: Sensitivity of net worth to cost of failure. With respect to the known cost of failure, W^^ has a similar sensitivity to the cost of the clay cap, C,,,,. for both Cfj and Cf^ (Fig. 8-15). For C^j, die measurement is cost-effective for C^ .^ , between $15 million and $40 million. The estimated base case value of C^^ is $22.3 million. Therefore, an underestimate of 50% or an over estimate of 100% is necessary to reduce the Wng( to zero. For Cf^, C^^ has a greater impact on the measurement's cost-effectiveness. The measurement is cost-effective for C^^ between $10 and $29 million. The base case value of now only has to increase by 30% to make the measurement no longer cost-effective. The value of C^^ will be much easier to estimate than Cf because the time horizon of construction will be much shorter and will involve many fewer uncertainties. Therefore, for high Cf, C^^ will likely have litde impact on the cost-effectiveness of the measurement. However, this impact will grow as the cost of failure decreases. Figure 8-15: Sensitivity of net worth to known cost of lay cap alternative. The value of Wâ€žg( has an identical sensitivity to the known cost of the no clay cap alternative, C^^ for bodi Cf7 and Cf4. The measurement is cost-effective for < $10 million, for bodi costs of failure. Therefore, Câ€žj, must at least tiiple before the measurement is no longer cost-effective. The value of C, should be easy to estimate. Therefore, uncertainty in Câ€žg is unldcely to affect die measurement's cost-effectiveness. OE- I-OOO 5 E - I - 0 0 6 1 E- ( -007 K n o w n C o s t o f N o C l o y C o p ( $ ) Figure 8-16: Sensitivity of net worth to known cost of no clay cap alternative. 8.4.2.3 Geostatistical Parameters The sensitivity of Wâ€žg, to the geological estimate of the mean, mj, is shown in Figure 8-17, for both costs of failure. The measurement is cost-effective for 0 < mj < 0.16 for Cf; and for all values of mj tested for Cf4. Therefore, there is a large margin of error over which the net worth will maintain a positive value and measurements will be cost-effective. However, it is impossible to estimate possible errors in the geological estimate of the mean because of a lack of available data. The geological estimate of the mean could potentially be a critical parameter in determining whether the measurement is duly cost-effective. Figure 8-17: Sensitivity of net worth to geological estimate of mean. The value of Wâ€žgt is relatively insenstive to for bodi costs of failure. W^ j^ > $0, and hence the measurement is cost-effective for 60 m < < 250 m for C^j and 20 m < Xj < 300 m for Cf4 (Fig. 8-18). Correlation lengths greater than 250 m become difficult to study because becomes a significant percentage of the domain and the generated realizations do not reproduce die statistical parameters used to generate them. The correlation length for this case study and likely for many others, is difficult to measure because of a lack of information. Nevertheless, it seems to be potentially a less critical parameter than other parameters such as the cost of failure and mj. Correlation lengths greater then 60 m are likely for an aquitard formed in a deep marine environment such as the Green Clay. Figure 8-18: Sensitivity of net worth to correlation length. The sensitivity of Wâ€žg( to the confidence in the geological esdmate of the mean is shown in Figure 8-19, for both Cf7 and C^^. Note that die confidence is quandfied in terms of an equivalent number, n^, of independent hard data. The measurement is cost-effective for 1< n^ <29 for Cfy and 1< n^ <30 for Cf^. Those bounds correspond to weights of 21.05 and 589.5 for the prior geological mean compared to a wight of 336.2 for the 40 existing hard data. It is extremely unlikely that die weight of the geological mean is greater dian diat of 40 hard data points, or equal to zero. Therefore, die net worth will be positive, and the datum cost-effective, for all likely values of the prior confidence in the geological estimate of the mean. Note that at n^ = 0, Wâ€žet = $20 000. At n^ = 0, die estimate of die mean is based completely on die sample data which estimates a mean value of zero. A mean value of zero allows no windows to occur and hence does not allow for failure to occur. If failure cannot occur, W=$0. Figure 8-19: Sensitivity of net worth to confidence in prior estimate of mean. 8.4.3 NET WORTH OF SINGLE, HARD MEASUREMENT TAKEN AT DIFFERENT LOCATIONS In Section 8.4.1 it was determined that a hard measurement taken at point B was cost effective. Section 8.4.2 studied die robustness of this decision. In Section 8.4.3, die framework will be used to make two more realistic and useful data worth decisions. The first is sequentially locating measurement points and the second is determining the maximum size of a grid of measurement points. The former question is more important because sampling programs are much more Idcely to be carried out in sequential stages of a few measurements at a time, radier dian many measurements taken simultaneously on a large grid. Bodi of diese decisions can be made by estimating W^^j at a large number of points around die seepage basins and contouring the results. A contour map of Wâ€žg, is shown in Figure 8-20 for Cf^. It is assumed that the cost of taking a hard measurement is constant throughout the region of study. The net worth is a maximum of approximately $50 000 in the south east comer between the seepage basins and two creeks. The maximum occurs in this area because it is the most probable location for die contaminant plume. Refer to Figure 8-5 for the most probable contaminant plume locations. distance (nn) Figure 8-20: Contour map of net worth of a single, hard measurement taken at different locations for cost of failure = $45 million. (Numbers are in thousands of dollars.) The value of Wâ€žg, decreases towards the west and nordi of the seepage basins because of both the conditioning effect of existing soft data and the reduced likelihood of a window causing a failure. Windows towards the north and west are less likely to cause failure because they are further from the likely path of contaminant plumes. The stiong effect of an existing hard datum on Wâ€žgj is shown by the depression in the north east comer where Wâ€žgj is reduced to -$20 000. The value of W^^j decreases as the sampling location approaches the existing datum because of tlie correlation effect. The much weaker correlation effect of existing soft data on Wâ€žgt is shown by soft data at points A and B west of the seepage basins. Note that hard measurements taken on the opposite side of Four Mile Creek and the unnamed creek in the south east comer of the map have a net wordi of up to $40 000, even though the creeks prevent contamination from reaching any window located there. However, because of spatial correlation, a window there may extend across die creeks, allowing failure to occur. Therefore, sampling outside of die zone of where failure can occur can have significant value. A sequential sampling program could be easily designed as foUows. In die first step, a contour map of Wâ€žm would be created. In the second step, the measurement would be located where Wâ€žĂ§j was a maximum. After the measurement was taken, a new contour map would be produced to locate the next measurement location. Samphng would continue as long as the maximum Wâ€žgj > $0. The robustness of a decision to take a measurement could be tested by carrying out a sensitivity analysis of measurement's Wnet to base case parameters. The contour map can be used as well to determine the maximum limits of a zone over which a grid of measurement points can be taken. For C^^, this zone is bounded by the $0 Wâ€žgj contour to the west and north-west, Four Mile Creek to the south and die unnamed creek to the east A small portion of the zone's north end is off die map. However, the exact boundaries of diis zone are uncertain. A contour map of Wng, is shown in Figure 8-20, for Cfj. It is very similar to diat for C^^, except diat Wâ€žg, has increased in value and the area where measurements are cost-effective has increased. Note that a hard measurement now is cost-effective at every location in the region of study, except at the existing hard datum. Therefore, the zone in which hard measurements are cost-effective is variable and depends on the base case parameters, such as the cost of failure. Note, however, diat die location of die maximum Wâ€žg,, and hence the highest priority measurement location, has not changed. Figure 8-21: Contour map of net worth of a single, hard measurement taken at different locations for cost of failure =$70 million. (Numbers are in thousands of dollars.) 8.4.4 PATTERNS OF MULTIPLE, HARD MEASUREMENTS Section 8.4.4 evaluates the worth of patterns of multiple hard measurements. Three patterns of 1,2,3,4, and 5 measurements taken on 29,145, and 290 m spacing are used. These patterns are shown in Figures 8- 22 to 8-24. The data are collected in the order of the numbers marking the measurement points. Note that point 1 is the same for all three patterns. The patterns were picked where W^^j at the individual measittement points was reasonably constant. BIS t2l5 distance (m) Figure 8-22: Pattem of liard measurement taken on a 29 m spacing. 15 ĂŹ - hord dotum 9- soft dotum o - mMSuromant point 15 415 815 1215 distance (m) 1615 Figure 8-23: Pattem of hard measurement taken on a 145 m spacing. The pattems were resdicted to a maximum of five data l)ecause of die CPU dme involved in carrying out die data worth analysis. Recall diat 100 x 2" realizadons are needed to esdmate the worth of the pattem, where n is the number of measurements in the proposed pattem. Therefore, 3200 realizadons are required to evaluate the worth of a pattem of five measurements. distance (m) Figure 8-24: Pattern of hard measurement taken on a 290 m spacing. The value of Wâ€žgj for die patterns are shown in Figure 8-25, for C^j. Wâ€ž^^ initially increases with the number of measurements in the pattern. The greater the separation distance between the adjacent measurement, the greater the initial increase. This behavior is because the more independent die data, the greater their Wnet. However, for the 29 m spacing pattern, Wâ€žg( reaches a maximum at 3 measurements and then starts to decline. A similar behavior is shown for the 145 m spacing pattern where a maximum is being approached at five measurements. This maximum is at a greater number of measurement than the 29 m spacing pattern because of the greater separation distance and mdependence of the measurements. The value of Wâ€žet continually increases over the five measurements for the 290 m spacing. Note that all patterns are cost-effective. The net wordi for the patterns are shown in Figure 8-26, for C^^. Note diat W^^^ for all patterns has been reduced below diat for Cf;. The value of W^^f for the pattern on a 29 m spacing continually decreases as die number of measurements increases and becomes negative for greater dian five measurements. For die patterns on die 145 m spacing, Wâ€žgj initially increases and reaches a maximum at four measurement and then starts to decrease. The measurements on the 290 m spacing are almost independent; therefore, W, increases continuously between one and five measurements. Note, diat all pattems except five measurement on the 29 m spacing are cost-effective. o o 0_ , O CM o_l 1 1 1 1 , 0 1 2 3 4 5 Number of Data Figure 8-25: Net worth of multiple hard measurements taken at spacings of 29,145, and 290 m versus number of measurements taken, for cost of fadure = $70 million The most cost effective pattern was the five measurements with the 290 m spacing. With the 290 m spacing, no limit was found for die optimum number of measurements. The measurements are independent enough that their net worth increases with the number of measurements. However at smaller spacing, or with much existing data, spatial dependence will become important. Under these conditions, the net worth wdl increase with the number of measurements, reach a maximum and then start to decrease. Hence there will be an optimum number of measurements, which may be as small as one. O O O - , O Number of Data Figure 8-26: Net worth of multiple hard measurements taken at spacings of 29 145 and 290 m versus number of measurements taken, for cost of failure =$45 million. The framework could be easily used to design sequential sampling programs involving mulUple measurements. In the fkst step, a contour map of the net worth for a single measurement would be produced and the location of the maximum net worth found. In the second step, the measurements would be located in the region of the maximum net wordi. The optimum number and spacing of the measurements would then be determined. Once die measurements were taken, a new contour map would be created to pick out the next measurement locations. Measurements would be taken as long as the maximum net worth was > $0. The robusmess of a decision to take a pattern of measurements could be tested by carrying out a sensitivity analysis of the measurements' Wâ€žg, to base case parameters. 8.5 PREPOSTERIOR ANALYSIS OF SOFT, AREAL SURVEY Section 8.5 evaluates die worth of a soft, areal survey. The assumptions behind the mediodology used here are discussed in Section 6.5. It will be assumed that a reflection seismic survey will be carried out. Based on cost alone, a radar survey would Idcely be the measurement of choice for delineadng shallow stradgraphy. However, a radar survey would unldcely be feasible because it would probably not be able to map the so-adgraphy down to die 30 m depdi of die Green Clay. Groundwater, die Tan Clay, and clay lenses in the Barnwell and McBean Formations would likely attenuate the radar too much (Knoll, 1992). It is assumed that the smallest discontinuity wiU represent a 29 m square. An absolute minimum of two measurements, or geophones, on a line across a window would be required to sample it (Cross, 1992). To be conservative, it is assumed that geophones will be placed on a square grid every five meters to ensure that an adequate number of sampling points fall widiin any potential window. The analysis wdl be carried out here for a cost of failure of $45 million. Recall that die survey covers die entire area where windows can result in a fadure which causes a change in A Q . This area is marked by the $1 000 contour of the total wordi of a single hard measurement and Four Mile Creek and the unnamed discharge creek (Fig. 8-27). (Measurement points to the west of the $1000 contour are actually equal to $0.) Its exact size is not known because part of it is off the north end of the map; however, it will be estimated here that the entire area is approximately equal to a 1.3 x 1.3 km square. It is assumed that the total cost of carrying out a seismic survey on a square grid with a five meter spacing over this area is on the order of $200 000. Road access, type of geology, and odier factors are ignored in this estimate. A more detailed cost estimate would requked for an in depth analysis. When discounted back to 1985 at die rate of inflation, die cost of die survey is approximately $163 000. Note from Section 8.4, that the area of the survey, and hence total cost, is a function of the cost of failure. The foUowing likelihood functions for the survey are assumed: P(Sw,INWfd) = 0. (8.7) Figure 8-27: Contour plot of total worth for single, hard, point measurements taken throughout die region of study, for cost of failure =$45 million. (All numbers are in thousands of dollars.) where Wjj represents a window is present which will cause a failure which will alter the prior best design and NWf J represents no window is present which will alter the prior best design. In other words, P(SFIF ) = 0.2 (8.8) P(SplNF) = 0. where F denotes failure and NF, no failure. The decision tree used in die prior and the preposterior analysis is shown in Figure 8-28. Note that the decision d-ee now resembles die decision tree used for die univariate case presented in Chapter 2. Objective Function Note: oU dollars ore in nillions Figure 8-28: Decision tree used in preposterior analysis of soft geophysical survey. From the prior analysis, the prior probability of failure, PCF), is 0.266 and the expected objecdve funcdons for the two altemadves are E(Z(AcclFailure)) = -$22.8 million E(Z(AcclNo Failure)) = -$22.8 million E(Z(ANclFailure)) = -$25.938 miUion E ( Z ( A N C I N O Fadure)) = -$3.2759 million. The prior best design altemadve is A ^ ^ ^ with an expected objecdve funcdon of -$8.4401 million ((0.226)(-$25.938 miUion) + (0.774)(-$3.2759 million)). The updated probabdity of fadure, P(FISp), from Bayes' equation, given that a window/fadure is sampled, is P(FISp) = P(SplF)P(F) P(SF) (8.9) where P(Sp) =P(SplF)P(F) + (P(SNplNF)P(NF) = (0.2)(0.226) + (0)(1 - 0.226) = 0.45 (8.10) Therefore, _(0 .2 ) (0 .226)_ PCFISp) - 0.045 - ^ and P(FISNP)= P(SNF1F)P(F) P(SNF) _(0.8)(0.226) (1-0.045) " - ^ ^ ^ (8.11) The expected expected objective function for the best posterior design is E [ E ( Z ( A D ' ) ) ] = E(Z(AD'ISP))P(SP) + E(Z(AD'ISNP))P(SNF) (8.12) = (-$22 800 000)(0.045) + (-$7 581 700)(0.955) = -$8 266 524 Therefore, the worth of the soft, areal survey is W =E[E(Z(AD') ) ] - E ( Z ( A D ) ) (8.13) = -$8 266 524 - (-$8 440 100) = $0,173 million Therefore, W^^j is $13 000 ($173 000 - $163 000) and the geophysical survey is cost-effective. However, as noted in Section 6.5, Wâ€žgj for the survey is highly uncertain; therefore, our major concern is not estimating Wâ€žgt. but calculating die break even, or minimum precision, P(SplF ), needed to make the survey cost-effective. It is assumed that P(SplNF ) will remain equal to zero. To calculate die break even precision, die above calculations were repeated for P(SplF ) between zero and one (Fig. 8-29). The break even P(SplF ) is approximately, 0.14. The seismic survey will be cost-effective if P(SplF ) > 0.14. This estimate could be potentially very valuable, because while it may be difficult to estimate die exact precision of survey, it will be much simpler to estimate if die precision if above some minimum bound, particular the one in this case which is so low. Besides evaluating die cost-effectiveness of a seismic survey or a pattern of hard, point measurements, one is concerned with which is die more cost effect measurement program. For example, should a soft, areal geophysical survey be carried out, or should a pattern of five hard, point measurements be taken? This question is tackled below where the Wâ€žg( values of patterns of hard, point measurements taken on the 290 m spacing are compared to die Wâ€žgt value of die seismic survey (Fig. 8-30). Wâ€žet for die survey equals W^g, for one hard measurement at approximately P(SplF ) = 0.19. Therefore, if P(SplF) > 0.19, die survey is die best option and if P(SplF) < 0.19, die single, hard measurement is the best option. Similarly for five hard measurements, if P(SplF ) > 0.23, the survey is die best option and if P(SplF ) < 0.23, the five hard measurements are the best option. Figure 8-29: Net worth of the geophysical survey vs the probability of sampling a window that wdl cause failure, given that a window that will cause fadure exists. 8.6 SUMMARY OF CHAPTER 8 The closure design of die H-Area seepage basins located at the Savannah River Site was used as a case study for die framework. The two altemadve closure designs considered were to either (a) put a clay cap on the basins or (b) put no clay cap on the basins. The clay cap altemadve is assumed to be 100% effective in preventing any contamination from leaving the seepage basins, while the no clay cap altemadve is assumed to result in the seepage basins being an active source of contamination. Therefore, failure can only result for the no clay cap alternative. The cost of failure for the no clay cap alternative is not known; therefore, the data wordi analyses are carried out with two assumed costs of failure of $45 million and $70 million. These costs of failure were arbitiarily selected to force measurements to have a positive worth. The cost of failure of $45 mdlion results in a more unstable prior best design altemative. The framework was first used to evaluate the net worth of a single, hard, point measurement. It was found to be cost effective for both costs of failure. A sensitivity analysis was carried out to determine the robustness of diis decision. The sensitivity analyses included numerical parameters, economic parameters, geostatistical parameters, and numerical artifacts of the SIS algorithm. In general, none of the parameters tested had a serious effect on die cost effectiveness of die measurement. At lower costs of failure, when the prior best design altemative becomes more unstable, the effect of these parameters on Wne( increases. The net worth of pattems of up to five hard, point measurements at sample spacings of 29,145, and 290 m were studied. The most cost effective pattem was die five measurements widi die 290 m spacing. With die 290 m spacing, no limit was found for die optimum number of measurements. The measurements are independent enough that their net worth increases with the number of measurements. However at small spacings, or with much existing data, spatial dependence becomes important. Under these conditions, the net worth will increase with die number of measurements, reach a maximum and then start to decrease. Hence there will be an optimum number of measurements, which may be as small as one. The framework could be easUy used to design sequendal sampling programs involving single or multiple measurements. In the first step a contour map of die net worth for a single measurement would be produced and the location of the maximum net worth found. In the second step, the measurement(s) would be located in the region of the maximum net worth. In die case of multiple measurements, the optimum number and spacing of the measurements would then be determined. Once the measurement(s) was taken, a new contour map would be created to pick out the next measurement location(s). Measurements would only be taken as long as the maximum net worth was > $0. The robusmess of a decision to take a measurement(s) can be tested by carrying out a sensitivity analysis of Wâ€žm of the measurement(s) to base case parameters. Contour plots of W^^, could also be used to map out the maximum area over which a grid of point measurements could be cost effective. However, this area is uncertain because it is a function of die cost of failure and other base case parameters. The fi-amework was found to be effective in determining the cost effectiveness of a geophysical survey covering a large area because only die break even precisions (pj and pj) had to be estimated, radier dian the exact precision. For the example survey used here, it was estimated that the break even precision, P(SplF) (i.e. Pl), was only 0.14. Similarly, the framework also is potentially very useful for determining the most cost effective sampling technique. In the case study, the net worth of pattems of hard, point measurements were compared to the net worth of an areal, soft seismic survey. The seismic survey only needed a precision of P(SplF) > 0.23 for it to be more cost effective than five hard measurements taken 290 m apart. 8.7 NOTATION A^Q clay cap alternative Ap prior best design altemadve Ap' posterior best design altemadve A^Q no clay cap alternative Cgj. cost of clay cap altemative cost of no clay cap altemative Cf cost of failure Cf4 cost of failure of $45 million Cfj cost of failure of $70 million F failure H ' weight of prior data Hj weight of sample data Kj, hydrauhc conductivity in horizontal direction hydraulic conductivity in vertical direction mj estimate of die mean of I(x) mj' prior estimate of die mean of I(x) mj" updated estimate of the mean of I(x) Ug equivalent prior number of measurements NF no failure Pi probability of sampling a window, given diat one is present P2 probability of sampling a window, given that one is not present Sp outcome of sampling failure Sjyjp outcome of sampling no failure outcome of sampling no window ^NWfj outcome of sampling NWfj outcome of sampling a window outcome of sampling Wfj W worth of data Wfj window exists which causes failure and changes the prior best design altemadve NWf J no window exists which will cause a failure which will alter the prior best design altemadve ^aei "et worth of data correlation lengdi of indicator random variable I(x) | i i mean of indicator random variable I(x) aj-^ variance of indicator random variable l(x) Oj^'" updated variance of indicator random variable I(x) â€˘^ Wnet^ variance of net worth ^Wnet:^ variance of net worth caused by parameter i CHAPTER 9: SUMMARY AND CONCLUSIONS The objective of this thesis is to develop a Bayesian decision framework for answering data worth questions pertaining to hydrogeological design in heterogeneous geological environments. Previous Bayesian methods dealt only with homogeneous systems. The framework is developed specifically for aiding hydrogeologists, dealing with groundwater contamination, in the design of exploration programs searching for aquitard discontinuities. The framework can be valuable in carrying out cost effective remediation: It provides the site engineer with a tool not only for spending site exploration resources more efficientiy, but also for deciding when enough information has been collected. The framework consists of two basic modules: a geostatisdcal indicator (SIS) algorithm for simulating aquitard heterogeneity and a numerical model for simulating contaminant transport. Bayesian decision analysis ties the two modules together. The Bayesian nature of the framework also provides a mediodology for incorporating a conceptual understanding of the local geology widi quantitative information. Indicator geostatistics allows the handling of (a) hard, point measurements (which are precise, but are probably few and expensive), (b) soft, point measurements (which are imprecise, but are probably cheaper and more numerous), and (c) hydrogeological parameters which behave in space as non-Gaussian random variables. It is assumed that die SIS algoridim correcdy reproduces the essential characteristics of die local geology. If it does not, dien die data worth analyses could be affected. Nevertheless, it is felt diat the SIS algorithm is the best method available for handling geological heterogeneity. Finding mediods of realistically and numerically representing geological heterogeneity is a current research problem. As improved mediods are found, diey could be incorporated into die framework. Many parameters which must be estimated to carry out data worth analyses. For hard measurements, these include geostadsdcal parameters, hydrogeological parameters, economic parameters, and numerical artifacts of the SIS algorithm. The analyses of soft geophysical surveys are also dependent upon the survey's precision. The sensitivity of die worth of a single, hard measurement to these parameters was studied using the Savannah River Site case history and two generic design examples. The worth was found to be most sensitive to the economic parameters, in particular to the discount rate and the cost of failure. For the hydrogeological parameters in general, it is much more important to know whedier the value of the parameter is above or below some threshold value, rather than to know its actual value. For the geostatistical parameters, die worth is relatively insensitive to die correlation lengdi and the confidence in the prior estimate of the mean, but is sensitive to the estimate of the mean. The worth is relatively insensitive to the numerical artifacts of the SIS algorithm. Sensitivity results indicate that the framework can be robust in determining the cost effectiveness of a measurement program. However, data worth analyses can be unstable when die objective function of die prior best design alternative is close in value to that of another prior design altemative. In addition, the sensitivity analyses indicate that die perspective of the decision maker can have a major impact on the worth of a sampling program. For example, the owner/operator of a waste disposal site interested in making a rate of retum on an investment will evaluate a very different measurement worth than an environmentalist interested in die long-term preservation of die environment. The source of this difference will come primarily from the assumed discount rate, and the perceptions of die cost associated with failure. For the Savannah River case history, die net wordi of pattems of up to five hard, point measurements at spacings of 29,145, and 290 m were studied. The most cost effective pattem was the five measurements with the 290 m spacing. With the 290 m spacing, no limit was found for the optimum number of measurements. The measurements are almost independent; therefore, dieir net wordi increases widi the number of measurements. However at smaller spacing, or with many existing data, spadal dependence can become important. Under these condidons, the net worth will increase with the number of measurements, reach a maximum and then start to decrease. Hence, the analysis predicts an opdmum number of measurements, which may be as small as one. The framework could be easdy used to design sequential sampling programs involving single or multiple measurements. In the first step a contour map of the net worth for a single measurement would be produced and die location of die maximum net worth found. In the second step, the measurement(s) would be located in the region of the maximum net wordi. In die case of multiple measurements, the optimum number and spacing of the measurements would be determined. Once the measurement(s) was taken, a new contour map would be created to pick out the next measurement location(s). Measurements would be only taken as long as the maximum net worth is > $0. The robusmess of a decision to take a measurement(s) can be tested by carrying out a sensitivity analysis of the measurement's net worth to base case parameters. These include geostatistical, economic, and hydrogeological parameters and numerical artifacts of the SIS algoridim. The framework was found to be effective in determining the cost effectiveness of a geophysical survey covering a large area because only the break even precision needed to be estimated, rather than the exact precision. The break even precision is die precision at which die net worth = $0. In die case study, the geophysical survey was cost effective if the probability of its locating a window in the area sampled, given that one existed, was > 0.14. For similar reasons, the framework was also found to be effective in comparing the cost effectiveness of geophysical surveys and pattems of hard point measurement. In die case study, die seismic survey needed only a precision of P(SplF) > 0.23 to be more cost effective than five hard measurements taken 290 m apart. Consequendy, die developed framework was found to be effective in making data worth decisions involving exploration programs searching for aquitard continuity. This accompUshment represents the ; / major contribution of this thesis. The present applicability of the framework to answering real data worth question should be tested by back analyzing sampling programs at real groundwater contamination sites where many data have been collected. A second contribution of diis diesis lies in combining geological understanding widi quantitative data to gain a geostatistical description of sand/shale heterogeneity. This combination is important because the environment of deposition contains much information about the heterogeneity. The case study demonsttated how the inclusion of geological intuition can be critical in correcdy carrying out a data wordi analysis. Analysis based on existing quantitative data alone is handicapped because if no windows have been previously found, then the probability of a window existing is zero. However, it is known that alluvial action over die last several million years could have created a window; Upper Three Runs Creek is an existing example. Consequendy, ignoring the geological information would have resulted in the incorrect conclusion that data have zero worth and may have resulted in a poor design decision. Application is not stiaightforward at this time because of the lack of a quantitative relationship between heterogeneity and environment of deposition, but research at other centers is apparendy progressing in this dh-ection. A third contribution is the handling of die effect of uncertainty of aquitard continuity on the prediction of contaminant tiansport. A final contribution lies in the adaptability of the framework to handle other types of data worth questions. The framework provides a foundation for addressing new data worth questions not only in hydrogeology, but also in other disciplines such as mining and petioleum reservoir engineering. REFERENCES Aaland, R., Personal Communication, Wesdnghouse Savannah River Company, Savannah River Laboratory, Ailcen, South Carolina, 1989. Alabert, F.G., Stochastic Imaging of Spatial Distributions Using Hard and Soft Information, Masters diesis, Stanford University, 1987a. Alabert, F.G., The practice of fast conditional simulations through die L U decomposition of die covariance matiix. Mathematical Geology, 19 (5), 369-386,1987b. Attanasi, E.D., and M.R. Karlinger, Worth of data and natural disaster insurance. Water Resources Research, 15 (6), 1763-1766, 1979. Barnes, R.J., A partial history of spatial sampling design, Geostatistics, Spring, 10-13, 1989. Ben-Zvi, M . , B. Berkowitz, and S. Kesler, Preposterior analysis as a tool for data evaluation: Application to aquifer contamination. Water Resources Management, 2, 11-20,1988. Ben-Zvi, M . and Y. Bachmat, Management of a Water Resource Under Uncertainty, technical report, Hydrological Serv., Jerusalem, Israel, 1979. Benjamin, J.R. and C.A. Cornell, Probability, Statistics, and Decision for Civil Engineers, McGraw-HiU, Toronto, 1970. Benjamin, J.R. and C.A. Cornell, Probability, Statistics, and Decision for Civil Engineers, McGraw-Hill, 1970. Bogardi, I., A. Bardossy, and L. Duckstein, Multicriterion network design using geostatistics. Water Resources Research, 21 (2), 199-208, 1985. Boyce, W.E. and R.E. DiPrima, Elementary Differential Equations and Boundary Value Problems, John Wdey and Sons, 1969. Bradey, P. B.L. Fox, and L.E. Schrage, A Guide to Simulation, Springer-Verlag, New York, 1987. Brownlow, A.H. , Geochemistry, Prentice Hall, Englewood Chffs, N.J., 1982. Bury, K.V. , Statistical Models in Applied Science, John Wdey and Sons, 1975. Buss, D.R., G.M. Duffield, T.S. Wadsworth, and J.M. Mercer, Characterization of Groundwater Flow and Transport in the General Separations Areas Savannah River Plant: Evaluation of a Corrective Groundwater Action for the F and H-Area Seepage Basins, final report prepared for E.I. duPont de Nemours & Company, Environmental Sciences Division, Savannah River Laboratory, Aiken, South Carolina, 29808 by GeoTrans Inc., 1987. Cahn, L.S., Development of Guidelines for Design of Sampling Programs to Predict Groundwater Discharge, M.S. Thesis, University of British Columbia, Vancouver, B.C., 1987. Clark, I., Practical Geostatistics, Applied Science Publishers, 1979. Clifton P.M. and S.P. Neuman, Effects of kriging and inverse modeling on conditional simulation of die Avra Valley in soudiem Arizona, Water Resources Research, 18 (4), 1215-1234, 1982. Cooke, C.W., Geology of die Coastal Plain of Soudi Carolina, U.S. Geol. Surv. Bull. 867,1936. Cross, G. Personal Communication, Department of Geophysics, University of British Columbia, Vancouver, B.C., 1992. Crouch, E.A.C., and R. y^ilson, Risk/Benefit Analysis, Ballinger, 1982. Davis, D.R. and W.M. Dvoranchik, Evaluation of the worth of additional data. Water Resources Bulletin, 7(4), 700-707, 1971. Davis, D.R., L. Duckstein, and R. Krzystofowicz, The worth of hydrologie data for nonoptimal decision making. Water Resources Research, 15 (6), 1733-1742, 1979. Davis, D.R., C C . Kisiel, and L. Duckstein, Bayesian decision theory applied to design in hydrology. Water Resources Research, 8 (1), 33-41,1972. Davis, M.W., Production of conditional simulations via the L U triangular decomposition of the covariance matiix. Mathematical Geology, 19 (2), 91-98,1987. Dawdy, D.R., The wordi of hydrologie data. Water Resources Research, 15 (6), 1726-1732, 1979. de Marsily, G. Spatial variability of properties in porous media: A stochastic approach. Fundamentals of Transport Phenomena in Porous Media, J. Bear and M.Y. Corapcioglu (eds.), NATO ASl Series, 719-770, 1984. de Marsily, G., Quantitative Hydrogeology, Academic Press, New York, 1986. Delhomme, J.P., Kriging in die hydrosciences. Flow Through Porous Media-Recent Developments, G.F. Pinder (ed.), C M L Publications, 99-113, 1983. Delhomme, K., and Giannesini, F., Reservoir description techniques improve simulation results in Hassi-Messaoud Field Algeria; SPE Paper No. 8435,54th Annual Technical Conference and Exhibition of the SPE in Las Vegas, Nevada, Sept 23-25, 1979. Desbarats, A.J., Numerical estimation of effective permeability in sand-shale formations. Water Resources Research, Vol. 23 (2), 273-286, 1987a. Desbarats, A.J., Stochastic Modeling of Flow in Sand - Shale Sequences, Ph.D. Dissertation, Stanford University, 1987b. Domenico, D.A., and F.W. Schwartz, Physical and Chemical Hydrogeology, John Wiley, Toronto, 1990. Drever, J.I. The Geochemistry of Natural Water, Prentice Hall, Englewood CUffs, N.J., 1982. Drew, L.J., Pattem drilling exploration: optimum pattem types and hole spacings when searching for elliptical targets. Math. Geology, 11 (2), 223-254,1979. Duffield, G.M., D.R. Buss, and D.E. Stephenson, Velocity prediction errors related to flow model cahbration uncertainty, ModelCARE 90: Calibration and Reliability in Groundwater Modelling, lAHS Publ. no. 195, 1990. Duffield, G.M., D.R. Buss, D.E. Stephenson, and J.W. Mercer, A grid refinement approach to flow and transport modeling of a proposed ground-water corrective action at the Savannah River Plant, Aiken, South Carolina, Proceedings of Conference on Solving Ground Water Problems with Models, mm A, Denver, Co., 2,1097-1120,1987. Duffield, G.M., D.E. Stephenson, D.R. Buss, and T.D. Wadsworth, Effects of heterogeneous porous geology on ground-water flow and ti-ansport modeling in multiaquifer systems. Proceedings of the Solving of Ground-Water Problems with Models Conference and Exposition, NWWA, Indianapolis, IN, 1989. Fogg, G.E., Groundwater flow and sand body interconnectedness in a thick multiple-aquifer system. Water Resources Research, 22 (5), 679-694, 1986. Fogg, G.E., E. Simpson, and S.P. Neuman, Aquifer Modeling Applied to an Arizona Groundwater Basin, Tech. Rep. 32., Dept. of Hydrol. and Water Res., Univ. of Arizona, Tucson, 1979. Freeze, R.A., et. al. Advances in die assessment of data worth for engineering decision analysis in groundwater contamination problems. Groundwater Flow and Quality Modelling, E. Custodio et al. (eds.), Reidel Pub. Co., 665-697,1988. Freeze, R.A. and J.A. Cherry, Groundwater, Prentice Hall, Englewood Chffs, N.J., 1979. Frind, E.G., and G.B. Matanga, The dual formulation and flow for contaminant transport modeling 1. Review of theory and accuracy aspects. Water Resources Research, 21, (2), 159-169,1985. Fritz, P. and J.C. Fontes eds.. Handbook of Evironmental Isotope Geochemisty, V.l., The Terrestrial Environment A, Elsevier, New York, 1980. Gates, J.S. and C.C. Kisiel, Worth of additional data to a digital computer model of a groundwater basin. Water Resources Research, 10(5), 1031-1038, 1974. Geehan, G.W., Lawton, T.F., Sakurai, S., Klob, H., Clifton, T.R., Inman, K.F. and Nitzberg, K.E., Geologic prediction of shale continuity Prudhoe Bay Field, Reservoir Characterization, Lake, L.W and Carrol, H.B. Jr. (eds.) Academic Press, 1986. Ghauri, W.K., A.F. Osborne, and W.L. Magnuson, Changing concepts in carbonate waterflooding-West Texas Denver unit project~an illustrative example, / . of Petroleum Technology, June, 595-606, 1974. Gomez-Hernandez, J.J, Personal Communication, Stanford University, Dept. of Applied Eardi Science, 1990. Gomez-Hernandez, J.J, and R.M., Srivastava, IsimSd: A Three Dimensional Multiple Indicator Conditional Simulation Program, Computers and Geosciences, 16 (4), 395-440,1990. Hachich, W. and E.H. Vanmarcke, Probabilistic updating of pore pressure fields, / . ofGeot. Eng., 109 (3), 373-387, 1983. Haldorsen, H.H., On modeling of vertical permeabiUty barriers in single-well simulations models, SPE Formation Evaluation, Sept., 349-358, 1989. Haldorsen, H.H. and Chang, D.M., Notes on stochastic shales; from outcrop to simulation model. Reservoir Characterization, Lake, L.W and Carrol, H.B. Jr. (eds.) Academic Press, 1986. Haldorsen, H.H. and L.W., Lake, A new approach to shale management in field-scale models, SPEJ, Aug., 447-457, 1984. Hazeu, G.J.A., Krakstad, O.S., Rian, D.T., Skaug, M . , The application of new approaches for shale management in a three dimensional simulation study of die Frigg Field, SPE Form. Eva!., Sept., 493-502,1988. Hewett, T. A. and R. A. Behrens, Conditional simulation of reservoir heterogeneity with fractals, SPE 18326,63'''^ Annual Technical Conference and Exhibition of the Society of Petroleum Engineers, Houston, TX, Oct 2-5, 645-660,1988. Isaaks, E.H. and M . Srivastava, Spatial continuity measures for probabilistic and deterministic geostatistics, AfGf/S conference, 1987. Jackson, R.E., R.J. Patterson, B.W. Graham, J. Bakr, D BĂ©langer, H Lockwood, and M . Priddle, Contaminant Hydrogeology of Toxic Organic Chemicals at a Disposal Site, Gloucester, Ontario, 1. Chemical Concepts and Site Assessment, NHRI paper No. 23, FWD Scientific series No 141, National Hydrology Research Institute Inland Waters Directorate, Ottawa, Canada, 1985. Jardine, D., D.P. Andrews, J.W. Wishart, and J.W. Younge, Distribution and continuity of carbonate reservoirs, / . of Petroleum Technology, July, 873-885, 1977. Joumel, A.G., Fundamental of geostatistics in five lessons. Short Course in Geology: Volume Eight, American Geophysical Union, 1989. Joumel, A.G., Nonparametiic estimation of spatial distiibutions. Mathematical Geology, 15 (3), 445-468, 1983. Joumel, A.G. and Ch. J. Huijbreghts, Mining Geostatistics, Academic Press, 1978. Kennedy, Jenks, and Chilton, Soil and Groundwater Investigation, Pennwalt Inorganic Chemical Division, Tacoma, WA, Feb., 1990. Killian, T.H., N.L. Kolb, P Corbu, and I.W. Marine, Environmental Information Document, H-Area Seepage Basins, E.I. du Pont de Nemours and co.. Savannah River Laboratory, DPST-85-706, 1987. Kitanidis, P.K., Parameter uncertainty in estimation of spatial functions: Bayesian analysis. Water Resources Research, 22 (4), 499-507, 1986. Kitanidis, P.K. and E.G. Vomvoris, A geostatistical approach to the inverse problem in groundwater modeling (steady state) and one-dimensional simulations, Water Resources Research, 19 (3), 677-690,1983. Kossack, C.A., Prediction of layer lengdis from layer heights for reservoir simulation: a statistical analysis of outcrop data,/. Pet. Tech., 81 (8), 867-971,1989. Knoll, M . , Personal Communication, Department of Geological Sciences, U.B.C., Vancouver, B.C., 1992. Le Blanc, R. J. Sr. Distribution and continuity of sandstone reservoir--part 1, / . of Pet. Tech., July, 776-792,1977a. Le Blanc, R. J. Sr. Distribution and continuity of sandstone reservoir-part 2, / . of Pet. Tech., July, 793-805,1977b. Leopold, L.B., and Wolman, M.G., River channel patterns: braided, meandering, and sd-aight; Professional Paper 282-B, USGS, Washington, D.C., 39-85, 1957. Lewis, C , Personal Communication, Wesdnghouse Savannah River Company, Savannah River Laboratory, Aiken, South Carolina, 1989. Maddock 111, T. Management model as a tool for studying the worth of data. Water Resources Research, 9 (2), 270-280, 1973. Mantoglou, A., Digital simulation of multivariate two- and three- dimensional stochastic processes with spectial turning band method. Mathematical Geology, 19, 129-150,1987. Marin, C M . M.A. Medina Jr., and J.B. Butcher, Monte Carlo analysis and Bayesian decision theory for assessing the effects of waste sites on groundwater, I: Theory, / . of Contaminant Hydrology, 5, 1-13, 1989a. Massmann, R. A Freeze, L. Smith, T. Sperling, and B. James, Hydrogeological decision analysis: 2. Applications to ground-water contamination. Groundwater, 29 (4), 536-548, 1991. Massmann, J. and R.A. Freeze, Groundwater contamination from waste management sites: The interaction between risk-based engineering design and regulatory policy 1. methodology. Water Resources Research, 23 (2), 351-367, 1987a. Massmann, J. and R.A. Freeze, Groundwater contamination from waste management sites: The interaction between risk-based engineering design and regulatory policy 2. results. Water Resources Research, 23 (2), 368-380, 1987b. Massmann, J.W., Groundwater Contamination from Waste-Management Sites: The Interaction Between Risk-Based Engineering Design and Regulatory Policy, Ph.D. Thesis, University of British Columbia, Vancouver, B.C., 1987. McDonald, M.G. and A.W. Harbaugh, A Modular Three-Dimensional Finite-Difference Ground-Water Flow Model, U.S. Geological Survey Open File Report 83-875, 1988. Medina M.A. Jr., J.B. Butcher, and M . M . Carlos., Monte Carlo analysis and Bayesian decision theory for assessing the effects of waste sites on groundwater, II: Applications, J. of Contaminant Hydrology, 5,15-31,1989b. Meyers, D.E. To be or not to be ... stationary? That is the question. Mathematical Geology, 21 (3), 347-362,1989. National Research Council, Ground Water Models: Scientific and Regulatory Applications, Water Science and Technology Board, 1990. Newson, J.M. and J.L. Wilson, Vertical flow in wellbores: short-circuits for contaminants, Eos, Transactions, American Geophysical Union, 69 (44), 1215, 1988. NUS Corp, Remedial Investigation, CIBA-GEIGY Site. Toms River, NJ., Sept. 1986. Nystrom, P.G., Jr. and R.H. Wdloughby, eds. Geological investigations related to stratigraphy in the Kaolin mining disti-ict, Aiken Co., South Carolina, Car. Geol. Soc. Field Trip Guide Book, 1982. Parizek, R.R., and R.W. Root Jr., Development of a Ground-Water Velocity Model for the Radioactive Waste Management Facility Savannah River Plant, South Carolina, Final Report submitted to E.I. duPont and Co., Savannah River Laboratory, Aiken, Soudi Carolina, 1986. Phdlips, F.M., and J.L. Wilson, An approach to estimating hydrauhc conductivity spatial correlation scales using geological characteristics. Water Resources Research, 25 (1), 141-143, 1989. Pollack, D.W., Documentation of Computer Programs to Compute and Display Pathlines Using Results from the U.S. Geological Survey Modular Three-Dimensional Finite-Difference Ground-Water Flow Model, U.S.G.S, 1989. Potter, P.E., Maynard, J.B., and Pryor, W.A., Sedimentology of Shale, Springer-Verlag, New York, 1985. Press, F. and R. Siever, Earth, W.H. Freeman and Co., 1978. Price, v. . Personal Communication, Wesdnghouse Savannah River Company, Savannah River Laboratory, Aiken, South Carolina, 1989. Ravenne, C , R. Eschard, A. Gali, Y. Mathieu, L. Montadert, J.L. Rudkiewicz, Heterogeneities and geometiy of sedimentary bodies in a fluvio-deltaic reservou-, SPE Formation Evaluation, June, 239-247, 1989. Reading, H.G., Sedimentary Environments and Facies, Elsevier, New York, 1983. Richardson, J.G., D.G. Harris, R.H. Rossen, and G. Van Hee, The effect of small, discontinuous shales on od recovery, 7. of Petroleum Technology, Nov. 1531-1537, 1978. Riggs, J.L., W.F. Rentz, and A.L. Kahl, Essentials of Engineering Economics, McGraw-Hill Ryerson Ltd., 1983. Root, R.W. Jr., A Comparison of Deterministic and Geostatistical Modeling Methods as Applied to Numerical Simulation of Ground-Water Flow at the Savannah River Plant, South Carolina, Ph. D. Thesis, Pennsylvania University, 1987. Rouhani, S. Variance Reduction Analysis, Water Resources Research, 21 (6), 837-846, 1985. Roux, P.H. and W.F. Althoff, Investigation of organic contamination of groundwater in South Brunswick Township, New Jersey, Groundwater, 18 (5), 464-471, 1980. Selby, M.J., Earth's Changing Surface, Clarendon Press, Oxford, 1985. Sinclair, A.J., Personal Communication, Dept. of Geological Sc., U.B.C., Vancouver, B.C., Canada, 1991. Sinclair, A.J., Personal Communication, Dept. of Geological Sc., U.B.C., Vancouver, B.C., Canada, 1990. Slichter, L.B. , Geophysics applied to prospecting for ores, in Pt. 2 of Bateman, A . M . (ed.). Economic Geology, 50di anniversary volume, 1905-1955: Urbana lU., Econ. Geology Pub. Co., 885-969, 1955. Sneider, R.M., Tinker, C.N. and Meckel, L.D., Deltaic environment reservoir types and died-characteristics,/, of Pet. Tech., Nov., 1538-1546, 1978. Sperling, T. Personal Communication, Gartner Lee Ltd., Vancouver, B.C. 1992. Sperhng, T., R. Allan Freeze, J. Massmann, L . Smith, B. James, Hydrogeological decision analysis: 3. Application to design of a groundwater control system at an open pit mine. Groundwater, (in press). Sperling, T., Risk-Cost-Benefit Framework for the Design ofDewatering Systems in Open-Pit Mines, Ph.D Thesis, University of British Columbia, Vancouver, B.C., 1991. Srivastava, M . , Personal Communication, FSS Canada, Vancouver, B.C., 1990. SuUivan, J.A., Non-Parametric Estimation of Spatial Distributions, Ph.D. Thesis, Stanford University, Stanford, CA, 1985. Wadman, D.H., Lamprecht, D.E. and Mrosovsky, I., Joint geologic/engineering analysis of the Sadlerochit reservok, Pudhoe Bay Field / . of Pet. Tech., July, 933-940, 1979. Walker R.G., Facies Models, Geoscience Canada, Reprint Series 1, 1983. Weber, K.J., How heterogeneity affects oil recovery. Reservoir Characterization, Lake, L.W and Carrol, H.B. Jr. (eds.) Academic Press, 1986. Weber, K.J., Influence of common sedimentary structures on fluid flow in reservoir models, J. of Pet. Tech., March, 665-672., 1982. Weber, K.J., Klootwijk, P.H., Konieczek, J., van der Vlugt, W.R., Simulation of water injection in a barrier-bar-type. Oil-rim reservoir in Nigeria, / . Pet. Tech,, July, 1555-1565, 1978. Wu, T.H., Vyas, S.K., and Chang, N.Y., Probabihstic analysis of seepage J. Soil Mech. and Foundation Division, ASCE, April, 323-340, 1973. Yeh, W.W-G., Review of parameter identification procedures in groundwater hydrology: the inverse problem. Water Resources Research, 22 (2), 95-108, 1986. Zeito, G.A., Interbedding of shale breaks and reservoir heterogeneities, J. Pet. Tech., Oct, 1223-1228, 1965. APPENDIX 1: GREEN CLAY DATA BASE THICK - thickness of Green Clay (m) GCTOP - top elevation of Green Clay (m) GCBOT - bottom elevation of Green Clay (m) GLOG - G =geophysical log available. GC - Y = Green Clay indicated on hdiologic logs - N = Green Clay not indicated on lidiologic logs DH - difference in head across green clay (m) Borehole Co-ordinate THICK GC GCTOP GCBOT GLOG DH E N (m) (m) (m) (m) BGC-IA 57196 73551 0.9 Y 36.3 35.4 BGC-2A 55887 74350 2.4 Y 41.3 38.9 BH-1 64590 76000 3.9 N 47.7 43.7 G BH-10 65000 74000 3.3 Y 46.3 43.0 G BH-11 65199 74301 1.6 Y 45.9 44.3 G BH-12 65998 74001 3.3 Y 48.0 44.7 G BH-13 64002 72999 3.3 N 47.0 43.7 G BH-14 65018 72993 2.3 N 43.5 41.2 G BH-15 66008 72992 1.4 Y 46.5 45.0 G BH-159 64380 73866 3.9 N 43.3 39.4 G BH-16 64592 72504 3.3 Y 43.5 40.3 G BH-160 64438 74365 3.3 Y 45.8 42.6 BH-18 65165 75035 4.9 Y 48.2 43.4 BH-2 64000 74999 3.3 N 45.9 42.7 G BH-20 65175 75025 1.6 Y 48.2 46.6 G BH-22 64880 75115 0.0 N 43.8 43.8 BH-23 65050 75115 6.6 Y 43.8 37.2 21.3 BH-24 64600 73693 3.0 N 43.3 40.4 G BH-25 64675 75035 3.3 Y 46.0 42.7 BH-29 64880 74940 0.0 N 43.3 43.3 BH-3 64601 75008 3.3 N 44.8 41.5 G BH-31 65050 74940 1.6 Y 49.4 47.8 BH-32 64520 74840 0.0 N 44.7 44.7 BH-33 64675 74840 3.3 Y 42.6 39.3 BH-37 64520 74660 0.0 N 43.4 43.4 BH-38 64590 74745 0.7 Y 46.6 46.0 BH-39 64675 74660 0.0 N 46.0 46.0 BH-4 65305 75106 1.6 Y 47.2 45.6 G 21.3 BH-41 64880 74775 3.3 Y 42.7 39.4 BH-42 65050 74775 4.9 Y 43.4 38.5 BH-45 64880 74610 0.0 N 43.5 43.5 BH-46 62894 73662 2.1 N 44.9 42.8 G BH-47 65050 74610 3.6 Y 45.4 41.8 G BH-48 65505 74590 3.8 Y 48.6 44.8 17.4 BH-49 65640 74590 3.8 Y 48.5 44.8 BH-5 66001 75001 3.8 Y 44.3 40.5 G BH-50 65790 74590 1.6 Y 45.3 43.7 17.1 BH-54 64188 73869 3.3 N 43.0 39.7 G BH-55 64520 74420 1.5 Y 43.4 41.9 G BH-56 64675 74465 1.4 Y 46.1 44.8 BH-57 64880 74440 3.3 Y 46.0 42.7 BH-58 65050 74440 3.3 Y 42.0 38.7 BH-6 64586 74500 1.5 Y 46.4 44.9 G BH-62 64190 74024 3.0 N 42.6 39.7 G BH-63 65640 74459 3.3 Y 48.7 45.4 G BH-64 65790 74405 1.6 Y 44.8 43.2 17.4 Borehole Co-ordinate THICK GC GCTOP GCBOT GLOG DH E N (m) (m) (m) (m) BH-66 64378 74025 1.6 Y 45.7 44.1 G BH-69 64765 74245 0.0 N 46.2 46.2 23.3 BH-7 63015 73995 2.6 Y 42.9 40.3 G BH-72 65502 74332 1.6 Y 44.5 42.8 G BH-73 65705 74340 0.0 N 43.0 43.0 BH-77 64520 74165 0.0 N 44.2 44.2 BH-78 64771 74088 3.3 Y 45.6 42.3 BH-8 63995 73994 2.5 N 43.4 40.9 G BH-80 64465 73990 0.0 N 44.3 44.3 BH-81 64765 73955 2.0 N 44.1 42.1 G BH-83 64259 74365 2.0 N 43.6 41.6 G BH-84 64253 74256 1.6 N 40.8 39.1 G BH-85 64440 74256 0.0 N 44.2 44.2 BH-G6L1 64362 74334 0.0 N 42.8 42.8 BH-86L2 64367 74292 1.6 N 44.1 42.5 G BH-89 64465 73820 3.3 Y 43.9 40.6 BH-9 64601 74011 0.0 N 43.0 43.0 BH-90 64591 73851 3.3 Y 44.4 41.1 G BH-92 64765 73785 3.3 Y 42.3 39.0 BH-98 64465 73620 0.0 N 43.3 43.3 22.3 DH-2 62090 71760 0.0 N 39.5 39.5 DH-3 61520 71629 0.0 N 37.8 37.8 DH-4 53190 77000 0.0 N 40.0 40.0 DH-5 52650 77335 0.0 N 40.3 40.3 F-10 53472 78985 0.0 N 42.7 42.7 F-11 54200 78400 0.0 N 42.7 42.7 F-13 53480 78275 2.6 Y 43.9 41.2 F-14 53480 79270 1.6 Y 45.9 44.2 F-28 53660 78657 0.0 N 42.7 42.7 F-6 53549 79843 0.0 N 42.7 42.7 F-8 53400 78475 1.6 Y 44.1 42.5 F-9 53400 79480 0.0 N 42.7 42.7 FC-IA 53115 79665 3.6 Y 48.9 45.3 22.3 FC-2A 55424 79244 3.3 Y 48.6 45.3 21.0 FC-3A 57620 78727 0.7 Y 51.2 50.5 18.4 FC-4A 53897 82243 0.3 Y 42.3 42.0 G 4.3 FC-5A 54672 87988 1.6 N 53.0 51.3 G FF-4 52942 76922 0.0 N 42.7 42.7 FU-1 53475 79121 0.0 N 42.7 42.7 FU-2 53478 78648 5.7 Y 46.8 41.0 FU-3 54390 78320 0.0 N 42.7 42.7 H-35D 58548 71918 3.3 Y 38.1 34.8 G HC-IA 61867 71755 0.7 Y 40.4 39.7 G 24.6 HC-lOA 61531 75815 1.3 Y 41.3 40.0 G 14.8 H C - l l A 62147 74519 1.0 Y 44.6 43.6 G HC-13A 63610 73394 1.0 Y 41.1 40.1 G Borehole Co-ordinate THICK GC GCTOP GCBOT GLOG DH E N (m) (m) (m) (m) HC-14A 60658 67560 1.6 Y 38.2 36.6 G HC-16A 65462 72596 1.8 Y 43.0 41.2 G 19.4 HC-17A 61700 73200 0.7 Y 42.0 41.3 HC-18A 63409 71560 1.0 Y 37.4 36.4 HC-2A 61866 71794 2.6 Y 41.6 38.9 G 26.2 HC-3A 62266 74742 3.0 Y 39.2 36.3 G 24.9 HC-7A 66992 74352 0.7 Y 46.6 45.9 G HC-8A 59990 77492 1.0 Y 50.2 49.2 G 13.8 HC-9A 64084 75135 1.5 Y 44.0 42.4 G 22.3 PDM-5 54695 74819 0.7 Y 42.7 42.0 PWAA-17 54826 72978 0.3 Y 40.4 40.0 PWCC-13 54387 73057 0.7 Y 42.3 41.7 PWCC-25 55350 72350 2.3 Y 40.7 38.4 PWEE-9 53941 73132 1.0 Y 40.0 39.0 PWM--1 54210 75182 1.0 Y 41.7 40.7 PWM--4 53968 75360 1.0 Y 41.3 40.4 PWM-1 54370 75063 1.0 Y 41.7 40.7 PWM-13 55338 74344 2.0 Y 42.3 40.4 PWM-17 55660 74106 2.0 Y 41.7 39.7 PWM-5 54695 74819 0.3 Y 41.7 41.3 PWM-9 55016 74582 0.7 Y 41.0 40.4 PWQ-3 54300 74614 1.0 Y 40.0 39.0 PWQ-5 54457 74498 2.3 Y 39.7 37.4 PWQ-7 54619 74378 1.0 Y 41.3 40.4 PWQ-9 54779 74260 1.0 Y 41.7 40.7 PWU-17 55184 73463 0.7 Y 39.7 39.0 PWU-5P 54219 74176 2.0 Y 41.0 39.0 PWW-5 54106 74010 0.0 N 39.4 39.4 PWW-5P 54106 74010 0.0 N 39.4 39.4 PWY-21 55268 72903 0.3 Y 41.0 40.7 SDS-12A 66610 77742 3.3 N 50.7 47.4 G 16.4 SDS-7A 67681 76515 1.3 N 44.0 42.7 G 12.1 12H-11 61150 72100 1.3 Y 42.7 41.3 12H-18 62473 71310 0.0 N 41.0 41.0 12H-19 62473 71270 0.0 N 41.0 41.0 12H-20 62498 70975 2.6 Y 40.0 37.4 12H-21U 62636 70966 1.3 Y 39.8 38.5 12H-22U 62730 70956 3.3 Y 39.4 36.1 12H-23 62850 70956 0.0 N 39.4 39.4 12H-27U 62794 70945 3.3 Y 40.6 37.3 12H-3 62708 71300 1.0 Y 40.7 39.7 12H-30 62900 70956 1.2 Y 39.4 38.2 12H-39 62670 70826 0.0 N 39.4 39.4 12H-40U 62740 70832 0.0 Y 39.4 39.4 14F-1 52800 76720 1.6 Y 43.3 41.7 Borehole Co-ordinate THICK GC GCTOP GCBOT GLOG DH E N (m) (m) (m) (m) 14F-2 52786 76814 2.3 Y 44.3 42.0 14F-3U 52786 76940 1.3 Y 43.3 42.0 14F-4 52786 77072 1.6 Y 43.6 42.0 14F-6 52600 76694 1.6 Y 42.7 41.0 14F-7U 52600 76816 0.0 N 42.7 42.7 14F-8 52600 76940 1.6 Y 44.3 42.7 14F-9 52600 77072 1.6 Y 41.7 40.0 55F-1U 52332 77072 0.0 N 41.0 41.0 55F-2 52332 76696 1.6 Y 42.0 40.4 55F-3 52332 76940 1.6 Y 41.7 40.0 55F-4U 52332 76816 0.3 Y 38.4 38.1
- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Theses and Dissertations /
- The worth of data in predicting aquitard continuity...
Open Collections
UBC Theses and Dissertations
Featured Collection
UBC Theses and Dissertations
The worth of data in predicting aquitard continuity in hydrogeolgical design James, Bruce Rennie 1992
pdf
Page Metadata
Item Metadata
Title | The worth of data in predicting aquitard continuity in hydrogeolgical design |
Creator |
James, Bruce Rennie |
Date Issued | 1992 |
Description | A Bayesian decision framework is developed for addressing data worth questions for hydrogeological design in heterogeneous geological environments. This framework is developed specifically for aiding hydrogeologists, dealing with groundwater contamination, in the design of exploration programs searching for aquitard discontinuities. It can be used to evaluate, and compare, the cost effectiveness of (a) patterns of precise point measurements (e.g. boreholes), which are often expensive, and (b) areal geophysical surveys which are imprecise, but usually less expensive. The framework consists of two basic modules: a geostatistical indicator algorithm for simulating aquitard heterogeneity and a numerical model for simulating contaminant transport. Bayesian decision analysis ties these two modules together. The Bayesian nature of the framework also provides a methodology for combining a conceptual understanding of the local geology with quantitative information. Indicator geostatistics allows the handling of hydrogeological parameters which behave in space as non-Gaussian random variables. The estimated worth of a measurement was found to be particularly sensitive to economic parameters. It was less sensitive to hydrogeological and geostatistical parameters. The framework was applied in a retrospective fashion to the design of a remediation program for soil contaminated by radioactive waste disposal at the Savannah River Site, in South Carolina. The cost effectiveness of different patterns of point measurements was studied. This study included determining the number and spacing of the most cost effective pattern. Contour maps were produced of the net worth of a single, point measurement. These contour maps can be used to design sequential sampling programs involving single or multiple measurements. Good potential was also shown for determining the cost effectiveness of an a real geophysical survey. The net worth of patterns of precise, point measurements was compared to that of an imprecise, a real seismic survey. These results indicate that the framework can be very valuable in determining if additional exploration is cost effective and in designing efficient exploration programs. Results also show that ignoring a conceptual understanding of geology can lead to erroneous data worth analysis. The framework could be modified to handle other data worth questions in hydrogeology or other disciplines, such as mining, or petroleum reservoir engineering. |
Extent | 7781812 bytes |
Genre |
Thesis/Dissertation |
Type |
Text |
FileFormat | application/pdf |
Language | eng |
Date Available | 2008-12-18 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
DOI | 10.14288/1.0052848 |
URI | http://hdl.handle.net/2429/3188 |
Degree |
Doctor of Philosophy - PhD |
Program |
Geological Sciences |
Affiliation |
Science, Faculty of Earth, Ocean and Atmospheric Sciences, Department of |
Degree Grantor | University of British Columbia |
GraduationDate | 1992-05 |
Campus |
UBCV |
Scholarly Level | Graduate |
AggregatedSourceRepository | DSpace |
Download
- Media
- 831-ubc_1992_spring_james_bruce_rennie.pdf [ 7.42MB ]
- Metadata
- JSON: 831-1.0052848.json
- JSON-LD: 831-1.0052848-ld.json
- RDF/XML (Pretty): 831-1.0052848-rdf.xml
- RDF/JSON: 831-1.0052848-rdf.json
- Turtle: 831-1.0052848-turtle.txt
- N-Triples: 831-1.0052848-rdf-ntriples.txt
- Original Record: 831-1.0052848-source.json
- Full Text
- 831-1.0052848-fulltext.txt
- Citation
- 831-1.0052848.ris
Full Text
Cite
Citation Scheme:
Usage Statistics
Share
Embed
Customize your widget with the following options, then copy and paste the code below into the HTML
of your page to embed this item in your website.
<div id="ubcOpenCollectionsWidgetDisplay">
<script id="ubcOpenCollectionsWidget"
src="{[{embed.src}]}"
data-item="{[{embed.item}]}"
data-collection="{[{embed.collection}]}"
data-metadata="{[{embed.showMetadata}]}"
data-width="{[{embed.width}]}"
async >
</script>
</div>
Our image viewer uses the IIIF 2.0 standard.
To load this item in other compatible viewers, use this url:
https://iiif.library.ubc.ca/presentation/dsp.831.1-0052848/manifest