IMPRECISE PROBABILITY AND DECISION IN CIVIL ENGINEERING: DEMPSTER-SHAFER THEORY AND APPLICATION

By
Wuben Luo
B.Eng., Tsinghua University, Beijing, China, 1985
M.A.Sc., The University of British Columbia, 1988

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY
in
THE FACULTY OF GRADUATE STUDIES
DEPARTMENT OF CIVIL ENGINEERING

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
June 1993
© Wuben Luo, 1993

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Abstract

Over the last three decades, Bayesian theory has been widely adopted in civil engineering for dealing with uncertainty and for purposes of decision making under uncertainty. However, the Bayesian approach is not without criticism. One major concern has been that information or knowledge, no matter how weak or sparse, must necessarily be represented by conventional, precisely specified probabilities. This has led to the development of statistical methods which allow for more flexible expressions of both information inputs and inferred results. More recently a general concept, called imprecise probability, which embraces a number of these methods, has been described [Walley, 1991].

Weak information is often encountered in civil engineering. This is especially true in decision making as major decisions are often dominated by random, but infrequent, extreme natural events.
For these rare events the sample record is usually short and the relevant subjective knowledge based on human experience is also likely to be very limited. The imprecise probability concept therefore has potential relevance to some important civil engineering decision problems.

Among the existing imprecise probability schemes, Dempster-Shafer (D-S) theory is prominent. This theory has attracted considerable attention in the Artificial Intelligence (AI) field, but the applications are different from those considered here. This has largely overshadowed the relevance of the theory to the more conventional inference and decision making types of problems encountered in civil engineering.

In this thesis, some applications of the D-S theory, primarily to water resources engineering decision problems, are developed. The engineering examples presented throughout the thesis provide some indications of the impact of implementing imprecise probabilities on engineering decision analysis. In most instances a closest equivalent Bayesian analysis is performed and the results compared with those obtained from the D-S scheme.

The concept of imprecise probability is philosophically important to the research and is briefly reviewed. The theoretical ingredients of D-S theory which are necessary to support engineering applications are then introduced. This is followed by presentation of several different procedures for translating weak sampling information inputs into D-S inference results. The elicitation of subjective prior inputs for the D-S scheme is also discussed and includes representing some typical engineering expressions of subjective knowledge. Two civil engineering models, one in hydrologic design and the other in reliability analysis, are also developed, and they demonstrate how the scheme can be applied in more complex engineering situations.

When presented with weak information input, the D-S decision analysis yields upper and lower expected utilities.
This reduces the ability to choose between the best decision alternatives, especially when the expected utility intervals for different decisions overlap. But this reduction in resolution is believed to more realistically reflect the true decision making situation. The factors governing the size of the expected utility interval are also discussed.

Table of Contents

Abstract
List of Tables
List of Figures
Acknowledgement
Preface
1 Introduction
  1.1 Uncertainty and the Bayesian View
  1.2 Alternatives to the Bayesian View
  1.3 Structure of the Thesis
2 Imprecise Probability
3 Introduction to Dempster-Shafer Theory
  3.1 D-S Representation of Uncertainties
  3.2 Some important quantities based on a BPA
    3.2.1 Belief and Plausibility Functions
    3.2.2 Commonality Function
    3.2.3 Compatible Probability Distributions
  3.3 Combining BPA's via Dempster's Rule of Combination
  3.4 Discretized Contiguous Frame
  3.5 The Continuous Contiguous Frame
  3.6 D-S Decision Analysis
    3.6.1 Review of the Utility Function
    3.6.2 The D-S Upper and Lower Expected Utilities
    3.6.3 Making a Decision
  3.7 Summary
4 D-S Statistical Inference: Binomial Parameter
  4.1 Introduction
  4.2 D-S Binomial Parameter Inference from Sample Data
  4.3 Incorporating Prior Information
  4.4 An Application
    4.4.1 Description of the Problem
    4.4.2 Case I: Complete Ignorance
    4.4.3 Case II: Using the Record Information
    4.4.4 Case III: Using Both Prior and Record Information
  4.5 Summary
5 D-S Statistical Inference: Likelihood Based BPA
  5.1 Introduction
  5.2 Sample Likelihood Based BPA
  5.3 The Axiomatic Justifications of a Likelihood-Based BPA
  5.4 An Application
  5.5 Summary
6 D-S Statistical Inference: General Model Parameters
  6.1 Introduction
  6.2 The Multinomial Model and D-S Statistical Inference
  6.3 D-S Inference of Parameter with Monotone Density Ratio
  6.4 Application to Normal and Lognormal Models
    6.4.1 Inference on $\mu$, with $\sigma$ Fixed
    6.4.2 Inference on $\sigma$ with $\mu$ Fixed
  6.5 Application to Gumbel Model
  6.6 Summary
7 Representing Subjective Knowledge with a BPA
  7.1 The BPA Based on Contamination of a Prior Distribution
  7.2 The BPA Based on Perturbation of a Prior Distribution
    7.2.1 Local Perturbation Interval with Constant Width
    7.2.2 Local Perturbation Interval with Varying Width
  7.3 BPA Based on the Range and Most Likely Value of $\theta$
  7.4 BPA Based on Constructive Probability of Canonical Example
  7.5 Summary
8 The D-S Theory Applied in Hydrologic Design
  8.1 Hydrologic Engineering Design
  8.2 The Basic Hydrologic Design Model
  8.3 The D-S Approach to Parameter Uncertainties
    8.3.1 Uncertainties Associated with $E(p_i)$
    8.3.2 Uncertainty with A
    8.3.3 Uncertainty with the PMF Estimate $x_p$
  8.4 Numerical Example
    8.4.1 Description of the Problem
    8.4.2 D-S Theory Applied to Hydrologic Design
  8.5 Summary
9 Application of D-S Theory in Reliability Analysis
  9.1 Reliability Analysis in Civil Engineering
  9.2 D-S Reliability Analysis
  9.3 An Illustration
  9.4 Summary
10 The D-S Upper Lower Expected Utilities
11 Summary and Conclusions
Bibliography
Appendices
A Determination of K, g(v), and f(u)
B Upper Lower Marginal Distributions, $g_\mu(\mu)$ and $f_\mu(\mu)$, of $\mu$
  B.1 Expressing $g_\mu(\mu)$ and $f_\mu(\mu)$ in Terms of the Marginal Distributions of $\mu_x$ and $\mu_y$
  B.2 The $g_\mu(\mu)$ and $f_\mu(\mu)$ for Normal Marginal Distributions of $\mu_x$ and $\mu_y$

List of Tables

3.1 Belief, plausibility and commonality functions of $m_1(A)$
3.2 The combination of $m_1(A)$ and $m_2(A)$
4.1 D-S and Bayesian decision results in Case I: complete ignorance (from Caselton and Luo [1992])
4.2 D-S and Bayesian decision results in Case II: record information only (from Caselton and Luo [1992])
4.3 D-S decision results in Case III: prior and record information
5.1 Prior probabilities and costs in dollars (from McAniff et al. [1980])
5.2 Sample likelihood values from x = 10.5% (from McAniff et al. [1980])
5.3 Expected costs based on Bayesian analysis (from McAniff et al. [1980])
5.4 Sample likelihood based BPA $m_x(A)$
5.5 The combined BPA's m(A) for different a
5.6 Upper lower expected utilities (costs) for different decisions $d_i$ and different a values
5.7 The modified costs for $d_2$
5.8 The corresponding modified upper lower expected costs for $d_2$ and different a values
8.1 Design options and cost (in dollars) (from Stedinger and Grygier [1985])
8.2 The annual flood intervals and average damages $\times 10^3$ (in dollars) for intervals
8.3 Upper lower expected costs $\times 10^3$ (in dollars) with respect to $p_i$, A
8.4 Sensitivity analysis to p
8.5 Upper lower expected costs $\times 10^3$ (in dollars) with respect to $p_i$, A and $x_p$

List of Figures

3.1 A conventional probability assignment
3.2 The multivalued mapping from T to $\Theta$ which generates a BPA on $\Theta$ (following Wasserman [1990a])
3.3 Graphical representation of a contiguous frame: discrete case
3.4 BPA elements of discrete contiguous frame contributing to commonality H([u,v]) (the shaded elements) (from Caselton and Luo [1992])
3.5 Area of continuous contiguous frame BPA where m([u,v]) is integrated to determine H([u,v]) (the shaded area) (from Caselton and Luo [1992])
4.1 The inferred relationship between w and $\theta$ given an observation Z = S (from Caselton and Luo [1992])
4.2 The BPA density for observation (a) Z = S; (b) Z = F (from Caselton and Luo [1992])
4.3 The BPA density m([u,v]) for observations (a) n = 6, r = 2; (b) n = 30, r = 10 (from Caselton and Luo [1992])
4.4 The D-S marginal distributions $f(\theta)$ and $g(\theta)$, and the Bayesian posterior distribution $\pi(\theta)$ with uniform prior for a) n = 6, r = 2; b) n = 30, r = 10 (from Caselton and Luo [1992])
5.1 The upper and lower expected costs as functions of a for $d_1$
5.2 The upper expected costs as functions of a for $d_j$, j = 1, ..., 5
5.3 The upper expected cost as functions of a for $d_j$, j = 1, 2, 3, and 5, after the cost function for $d_2$ is modified
6.1 The BPA density functions for normal model parameter $\mu$, $m_N([\mu_1,\mu_2])$, for a) n = 2; b) n = 10
6.2 The D-S upper and lower marginals, $g_N(\mu)$ and $f_N(\mu)$, and the Bayesian posterior $\pi_N(\mu)$ for normal mean $\mu$, a) n = 2; b) n = 10
6.3 The BPA density functions for normal model parameter $\sigma$, $m_N([\sigma_1,\sigma_2])$, for a) n = 2; b) n = 10
6.4 The D-S upper and lower marginals, $g_N(\sigma)$ and $f_N(\sigma)$, and the Bayesian posterior $\pi_N(\sigma)$ for normal standard deviation $\sigma$, a) n = 2; b) n = 10
6.5 The BPA density functions for Extreme Type I model parameter $\mu$, $m_I([\mu_1,\mu_2])$, for a) n = 2; b) n = 12
6.6 The D-S upper and lower marginals, $g_I(\mu)$ and $f_I(\mu)$, and the Bayesian posterior $\pi_I(\mu)$ for Extreme Type I parameter $\mu$, a) n = 2; b) n = 12
7.1 The BPA based on local perturbation with constant interval width
7.2 The BPA based on local perturbation with varying interval width
7.3 The contiguous frame of parameter $\theta \in [\theta_l,\theta_u]$
7.4 a) The BPA density functions $m_1([\theta_1,\theta_2])$; b) the plausibility functions $Pl_1(\theta)$
7.5 a) The BPA density functions $m_2([\theta_1,\theta_2])$; b) the plausibility functions $Pl_2(\theta)$
8.1 The probability decay function
8.2 Possible tail distributions in conventional frequency analysis (from Figure 6.7 of Haimes et al. [1988])
8.3 The BPA function on parameter A
9.1 The upper lower expected failure probabilities as functions of b for different $s/\sigma$ and $\theta$ values
A.1 The contiguous frame

Acknowledgement

I would like to express my most sincere gratitude to Dr. Bill Caselton, my supervisor and teacher, for his invaluable guidance and advice in the process of this research. This thesis would not have existed without his efforts and cooperation.
He has taught me a great deal, both academically and in other aspects of my life, which will continue to benefit me in the future. His great help and understanding have made my stay in Canada and my graduate studies much more enjoyable.

I would also like to thank Dr. Karl Bury, Dr. Alan Russell, Dr. Denis Russell and Dr. Ricardo Foschi for reviewing this dissertation and for their valuable comments. Thanks are also extended to Dr. Michael Quick and Dr. Alan Freeze for serving on my supervisory committee.

Financial support provided through the University of British Columbia Graduate Fellowship throughout this research is gratefully acknowledged.

Special thanks to my parents and friends; their support and encouragement have contributed to the completion of this thesis.

Preface

The Dempster-Shafer (D-S) statistical theory, on which this thesis is largely based, dates back only about 30 years. The theory was originally developed by Dempster and later extended by Shafer. The term "imprecise probability" has been used by Walley [1991] to describe a broader concept which embraces the D-S and other relatively new statistical developments aimed at relaxing the restrictions of the conventional Bayesian expression of uncertainty under conditions where the available information is distinctly weak. In civil engineering, and water resources engineering in particular, important professional decisions are often made under such conditions.

D-S theory is of ongoing interest to statisticians and has also received considerable attention from the AI community. The application of D-S theory to decision making in a professional field, other than possibly AI, does not appear to have been explored. This is certainly the case in civil engineering where concerns about weak information have been expressed in the literature. There have, however, been a few papers concerned with other kinds of applications of D-S theory in civil engineering.
The material presented in the early chapters represents an original synthesis of the earlier work of Dempster, the subsequent extensions by Shafer, as well as contributions by others. This material is considered necessary for the D-S decision making theory to be applied to civil engineering and to be judged from an engineering perspective. It includes new explanations of both fundamental theories and computational aspects and utilizes a graphical representation of the basic distributional form, known as a basic probability assignment (BPA), which has not previously been exploited for expository purposes.

New theoretical developments of the D-S theory were necessary when it was applied to some common water resources problems and also to an application in reliability analysis. Examples include the D-S inference for parameters of the lognormal and maximum extreme value type I distributions in Chapter 6, the D-S theory applied to a hydrologic design model in Chapter 8, and the resultant BPA for a function of two uncertain variables in Chapter 9.

Numerical examples are presented for every type of application considered. They include adaptations of previously published Bayesian examples as well as some newly developed ones. In most instances the nearest Bayesian analysis and its result are presented for comparison. The examples not only demonstrate the implementation process but also provide a first indication of the quantitative consequences when the theory is applied in an engineering context. One paper arising from this work has already been published in the journal Water Resources Research [Caselton and Luo, 1992] and is cited in this thesis. This paper was based on an early draft of Chapters 2, 3, and 4. Some diagrams and text from that paper appear in these chapters.

Overall, the thesis describes research into just one of a number of possible schemes for implementing the concept of imprecise probability.
It therefore represents only a first, though necessary, step towards establishing whether imprecise probability can be of value in civil engineering decision making.

Chapter 1

Introduction

1.1 Uncertainty and the Bayesian View

In water resources engineering, design and planning decisions often have to be made under uncertainty. In situations where the degree of uncertainty and its effect on the final design are not significant, uncertainty can be ignored by using the best estimates for the uncertain factors and treating the problem as deterministic. However, there are many situations where both the magnitude of uncertainty and its consequences in decision making are significant. In these cases, uncertainty has to be dealt with explicitly.

Uncertainties in civil engineering can be classified into three types [Benjamin and Cornell, 1970]. The first type deals with the inherent uncertainty of some naturally random processes, e.g. the annual maximum flood in a hydrological design. This type of uncertainty, called natural uncertainty, cannot be reduced or eliminated by obtaining more information about the random variable. The future outcome of such a random variable is unpredictable and can take various values with different frequencies or probabilities. In conventional statistics, this type of uncertainty is described by a statistical model or probability distribution.

The second type is called statistical uncertainty and is related to the estimation of the parameters of a statistical model. The parameters for a selected model are unknown and must be estimated from the information available about the random process, usually in the form of sampling. Since this information is invariably incomplete, there are consequent uncertainties associated with the parameter estimates.
The parameter uncertainty can be reduced if more information is obtained, but can only be entirely eliminated by sampling the entire population of outcomes, in effect with complete information.

The third type of uncertainty is model uncertainty, which reflects the lack of precision in model selection. Model uncertainty is also induced by the lack of information about the random process. It can likewise be reduced if more information is available but, with natural processes, can never be eliminated.

The parameters of a selected model, and the model itself, both represent factors whose truths are fixed for a given random process but are unknown due to the lack of information. The second and third types of uncertainty can therefore be considered as uncertainty of unknown "states of nature".

In classical statistics, a conventional statistical model (e.g. a log-normal distribution) is first selected as a contending model to describe the random variable. The parameters of this model are then estimated from past sampling data of the random variable. Usually a point estimate, based on the method of maximum likelihood, is obtained for each parameter and a confidence interval is then calculated to indicate the potential errors of the estimation.

With the point estimates of the parameters the model is completely determined and it is then verified by comparing its predictions with other sampled data. The verification is performed through goodness-of-fit tests such as the $\chi^2$ test or the Kolmogoroff test. The verified model is then ready for its intended engineering purpose.

Accurate estimates of the parameters, and the selection of the model, require a large sample record. In engineering practice, however, the sample record of a random event is usually short, typically less than 50 years in water resources engineering. As a result, the estimated parameters and selected model involve large uncertainties.
These uncertainties are not considered in the final engineering design in the classical statistical approach; as a consequence, the resultant design can be inefficient [Bodo and Unny, 1976] and could be substantially under- or over-designed [Afshar and Mariño, 1990].

Indeed, for a given set of data, several probability models might be selected that fit the data equally well. The distribution models may be very close to each other in the central part where the sample records are concentrated, but significantly different in the tails where the samples occur with low frequency or are unavailable. On the other hand, economic losses due to damage from civil engineering projects are generally caused, not by the occurrence of the more frequent events, but by the occurrence of rare events with low frequencies. Stedinger and Grygier [1985] have shown, explicitly, that hydraulic dam spillway design is very sensitive to the shape of the tail portion of the distribution model.

In designing an engineering project, engineers often have some personal or subjective knowledge, based on their past experience about the random process, which contributes to the design. In situations where the sample record is limited, as is almost invariably the case in water resources engineering, this subjective knowledge can be an important supplement to the sample information in determining the unknown parameters and selecting the model. The classical statistical approach to the parameter and model inferences uses only the sampling data and ignores this subjective information, resulting in less informative inferential results [Martz and Ray, 1982].

Over the last three decades, Bayesian theory has been widely adopted in civil engineering for inference and decision making under uncertainty (see Benjamin and Cornell [1970] or Ang and Tang [1975, Vol. I] as examples of engineering texts which confirm this).
This theory explicitly recognizes subjective probability and is philosophically distinct from classical statistics and the frequentist interpretation of probability.

According to the classical or frequentist interpretation, the probability of an event is the long run relative frequency of that event in a sequence of repeated trials. The parameters of a selected model, which usually do not have the characteristics of frequency, are considered as unknown constants and cannot be assigned probabilities in the classical sense. As a result, the confidence interval of a point estimate of an unknown parameter does not have a probability interpretation. Subjective probability theory, however, interprets the probability of an event as a person's subjective belief that the event will occur in the future. Based on subjective probability theory, an unknown parameter, though recognized as being a constant, can be considered as a random variable and assigned a probability distribution. The latter distribution, when established solely on, say, an engineer's stated opinion, would be assumed to reflect the engineer's personal knowledge of the unknown parameter. Over the last 200 years, there has been a debate over the subjectivist and frequentist interpretations of probability. For a review of the history of this debate, which is still inconclusive, see Shafer and Pearl [1990].

Recall that the parameters of a model, and the model itself, can all be defined as different states of nature. Bayesian theory deals with the uncertainty of a state of nature by first representing subjective knowledge through a so-called prior probability distribution and the sampling information through the sample likelihood function. The prior distribution and the sample likelihood function are then combined through the use of Bayes' theorem to yield a posterior distribution. This posterior distribution then represents the resultant inference on the state of nature based on both subjective and sampling information.
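As a minimal illustration of the Bayesian updating just described, and not an example taken from this thesis, the following Python sketch combines a discrete prior distribution with sample likelihood values through Bayes' theorem. The three candidate states of nature and all numerical values are hypothetical and were chosen only to show the mechanics.

```python
def bayes_posterior(prior, likelihood):
    """Combine a discrete prior with sample likelihoods via Bayes' theorem.

    prior:      dict mapping each state of nature to its prior probability
    likelihood: dict mapping each state of nature to the likelihood of the
                observed sample given that state
    """
    # Numerator of Bayes' theorem for every state: prior times likelihood.
    unnormalized = {s: prior[s] * likelihood[s] for s in prior}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("the sample has zero likelihood under every state")
    # Normalize so that the posterior probabilities sum to one.
    return {s: w / total for s, w in unnormalized.items()}


# Hypothetical states of nature (e.g. candidate flood-frequency regimes).
prior = {"low": 0.5, "medium": 0.3, "high": 0.2}
# Hypothetical likelihoods of the observed record under each state.
likelihood = {"low": 0.1, "medium": 0.4, "high": 0.8}

posterior = bayes_posterior(prior, likelihood)
for state, probability in posterior.items():
    print(f"{state}: {probability:.3f}")
```

Note that the result is a single, exactly specified probability for every state, however weak the prior or the sample may have been.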
The posterior distributions of the parameters of a model, along with the sampling model itself, are then used in the final decision making. Just a few examples of the many uses of Bayesian theory in the field of water resources engineering are Shane and Gayer [1970], Davis et al. [1972], Vicens et al. [1975], Wood and Rodriguez-Iturbe [1975a, 1975b], Bodo and Unny [1976], Kuczera [1982, 1983] and McAniff et al. [1980].

Bayesian theory represents a step forward over the conventional statistical approach in that it is able to represent the decision maker's subjective information about an unknown state of nature through the prior probability distribution. Furthermore, the uncertainty of a state of nature which is represented by the Bayesian posterior distribution can be incorporated systematically into the final decision making. But the Bayesian approach is not without shortcomings and in recent years new criticisms have emerged beyond those traditionally expressed by the classical or frequentist schools of statistics. Of particular interest here, and central to this thesis, are those concerns which arise when the prior and sampling information might be characterized as being very "weak", implying that it is inadequate, ill defined, incomplete, or even nonexistent.

In water resources engineering, decision problems are often dominated by natural, extreme, rarely occurring, uncertain events. Because these extreme events are rare it is not unusual for them to be absent from the record.
Furthermore, because the time span of human experience is short relative to the typical recurrence times of these large events, any relevant subjective prior knowledge is also likely to be very limited. It is not uncommon then, in a water resources context, that important decisions have to be made when the information available on important but uncertain events can be characterized as being weak.

The principal difficulty with conventional Bayesian analysis when presented with weak information sources arises from what Walley [1991] has described as the "Bayesian dogma of precision": that the information concerning uncertain statistical parameters or the states of nature, no matter how vague, must necessarily be represented by a conventional, exactly specified, probability distribution. The danger is that this exaggerated, though inadvertent, statistical precision may lead to inappropriately strong conclusions being drawn from the decision analysis. This has stimulated research into new methodologies for dealing with the uncertainties associated with weak information.

It should be stressed at the outset, however, that any such methodology is likely to produce a more realistic but, at the same time, more equivocal view of the decision alternatives. This weaker result reflects more accurately the scarcity of information and the true decision making situation. It provides the decision maker with only that information which can justifiably be extracted from the inputs provided.

1.2 Alternatives to the Bayesian View

Alternatives to the Bayesian scheme, which claim to accommodate weak information more appropriately, have been proposed. Some approaches, such as fuzzy logic, involve entirely new theories. Others are based on the familiar concepts of probability and retain some connection to the conventional Bayesian framework.
Currently the principal representatives of this group are:

• Bayesian Robustness [Berger, 1984, 1985]
• Upper Lower Probability [Walley, 1991; Walley and Fine, 1982]
• Dempster-Shafer [Dempster, 1966, 1967a, 1967b, 1968a, 1968b, 1969; Shafer, 1976, 1982a, 1982b]

All three of the above listed methods address the consequences of weak information through a concept which has recently been called "imprecise probability" [Walley, 1991]. Only key references are given in the above list but the cited literature in each of these references is extensive.

The growing literature on these approaches reflects the considerable interest and support which each enjoys. But, just as there is debate over the concepts of frequentist and subjective probabilities, not surprisingly there is also a debate between the concepts of conventional probability and this newly emerged imprecise probability. There is also debate over the quite different implementations of the imprecise probability concept. Representative views on all of these debates can be found in Shafer and Pearl [1990], Shafer [1982a, 1982b] and Walley [1987].

This thesis aims at exploring the relevance, implementability, and consequences of being able to express imprecise probabilistic inputs, primarily in a water resources engineering decision making context but also in civil engineering in general. Only the Dempster-Shafer (D-S) approach will be considered. The D-S approach to solving decision problems under uncertainty has been overshadowed by its prominence in artificial intelligence and expert systems.
Investigation of Dempster's original work, and those aspects of Shafer's advancement of Dempster's ideas relevant to decision analysis, during the course of this research revealed aspects of the theory that appeared to be appealing for engineering application.

In civil engineering, the D-S approach has been applied to the prediction of soil erosion [Toussi and Khanbilvardi, 1988], and to uncertainty inference in expert systems applications [Blockley and Baldwin, 1988; Caselton et al., 1988]. However, these have not addressed the problem of decision analysis. A more comprehensive application of D-S theory, including decision analysis, is given by Caselton and Luo [1992], which was developed from an early draft of Chapters 2, 3 and 4 of this thesis.

It is still premature to recommend any one of these three new approaches over the others. The question of superiority will likely be answered only after extensive applications of all three have been attempted in a particular field of application.

1.3 Structure of the Thesis

The concept of imprecise probability is fundamental to D-S and the other new alternative approaches described in the previous section. This concept is both philosophically and technically important to the thesis and will be elaborated on in Chapter 2.

In Chapter 3, the theoretical ingredients of the D-S scheme, including the D-S representation of uncertainty, the combination of independent sources of information, and D-S decision analysis, will be introduced. The D-S expression of uncertainty is supported by a graphical representation which, together with other explanations, will help non-statisticians to visualize this new way of representing uncertainty.

In Chapter 4, Dempster's inference of the parameter of a binomial model from sample observations is described. The incorporation of prior information in the binomial parameter inference is also discussed.
Finally, the inferential results are used in a water resources decision making example where increasing levels of information are considered.

The D-S inference for the binomial parameter is presented here for several reasons. Firstly, the D-S inference for the binomial parameter simplifies the demonstration of the application of D-S theory and its comparison with the equivalent Bayesian results. Secondly, it is fundamental to D-S inference as it acts as a basis for the more general probability model parameter inference necessary for more advanced water resources problems. And thirdly, the binomial model is well accepted in the water resources area for describing natural event uncertainties, and the results are of practical value.

In Chapter 5, the D-S inference of an unknown parameter of any statistical model based on a sample likelihood function is introduced. The inferential results are simple and have some very attractive features from a practical standpoint. Again, a water resources decision making example is used to demonstrate the application of these results.

In Chapter 6, a more formal theoretical approach to the general model parameter inference from sample observations, based on Dempster [1969], is presented. The approach requires the assumption that the unknown parameter of the statistical model satisfies the "monotone density ratio condition". Some specific inferences for several commonly used statistical model parameters are obtained.

In Chapter 7, the issue of representing subjective information about the unknown parameter in the D-S scheme will be discussed. Closely related to this issue is the more fundamental and profound aspect of axiomatic justification of the imprecise probability theory in general and the D-S approach in particular. The issue of axiomatic justification of different probability approaches has been discussed extensively in the literature.
Savage's axioms, which support the Bayesian approach, are under criticism from both statisticians and psychologists [e.g. Ellsberg, 1961; Levi, 1982; Tversky and Kahneman, 1974, 1986; Shafer, 1986]. Shafer and others have been trying to justify all of the statistical approaches mentioned above from the common "constructive probability" point of view [Shafer, 1981, 1982a; Shafer and Tversky, 1985]. The literature on axiomatic and elicitation issues, both Bayesian and non-Bayesian, appears to be inconclusive, though, and a thorough and detailed discussion of these issues is outside the scope of this thesis.

In Chapters 8 and 9, two applications of the D-S theory to solving practical engineering problems which involve uncertainties are presented. In Chapter 8 a hydrologic design model based on the D-S scheme is presented. In Chapter 9, the D-S scheme is used to characterize parameter uncertainties in civil engineering reliability analysis.

The D-S decision analysis generally yields upper and lower expected utilities for each decision. This is in contrast with the conventional Bayesian approach, which yields a single expected utility for each decision option. The interpretation of this interval formed by the upper and lower expected utilities, and an understanding of its implications, are important for D-S application and will be discussed in Chapter 10. The conclusion and summary of this research are presented in Chapter 11.

Chapter 2

Imprecise Probability

Imprecise probability is a general concept, but is best explained in the context of specifying prior subjective knowledge. Its application to inference is invariably more complex technically.

Consider a conventional probability specification concerning knowledge of some unknown state of nature. This will involve a probability assignment to each of a number of possible states of nature, only one of which can be the truth.
Such a specification will always reveal, with absolute precision, whether any one of the possible states of nature is either more likely, or equally likely, or less likely, than any other. This precision is unwarranted if the information or knowledge about the unknown state of nature is truly weak, yet it is inescapable.

By contrast, an imprecise probability scheme avoids having to make such a precise statement by introducing an extra dimension or degree of freedom into the formal expression of uncertainty. This makes it possible to represent the knowledge according to its quality without introducing any extra unsubstantiated information. The resulting indeterminacy in the belief about the unknown state of nature reflects the weakness of the knowledge. This inevitably introduces some indeterminism into any subsequent decision analysis as we are increasing the dimensionality of the problem while dealing with less, not more, information. The weakness of the knowledge or information therefore is explicitly represented in the imprecise probability approach and is carried through to the final decision analysis. The resultant indeterminism, rather than being a shortcoming, can be viewed as a strength, more faithfully reflecting the reality of the situation faced by a decision maker.

While the concept of imprecise probability has been proposed by many statisticians [e.g. Dempster, 1966, 1968a; Smith, 1961; Good, 1962; Shafer, 1976; Kmietowicz and Pearman, 1981; Gärdenfors and Sahlin, 1982, 1983; Levi, 1982, 1985; Berger, 1984; Einhorn and Hogarth, 1985] as a way of representing weak information, and has been discussed at length in the statistical literature, no single universal technical definition has emerged. Its translation into a quantitative form depends upon the implementing methodology.
As mentioned in the introduction, three representative imprecise probability approaches are: the Bayesian Robustness, the upper lower probability and the D-S theory.

The Bayesian Robustness approach is closely linked to the conventional Bayesian representation of uncertainty. Imprecision in the specification of prior probabilities is represented by adopting a set of prior distributions rather than just one. Unlike other imprecise probability schemes, Bayesian Robustness remains faithful to the conventional Bayesian view by assuming that there still exists a true conventional Bayesian representation, even though this cannot be identified.

According to Shafer [1981, 1982a], the upper lower probability theory involves specifying a class of conventional probability distributions and defining the upper and lower probabilities of an event A as the supremum and infimum probabilities of A from these distributions. This is a simplified version of the formal upper lower probability scheme, which is theoretically more complex (see Fine [1987], and Walley and Fine [1982] for a more rigorous introduction). Walley [1991] further extended the upper lower probability theory and developed the so-called theory of "upper lower previsions".

Fundamental to the D-S approach is the representation of uncertain knowledge in the form of a Basic Probability Assignment (BPA) in which probabilities can be assigned directly to subsets of the states of nature as well as to individual states of nature. The direct consequence of this kind of assignment is that, while the actual probability of any individual state of nature may not be specified, its minimum and maximum values will be specified. Imprecision is thus reflected in a possibilistic type of specification of the probabilities.
Note that the conventional precise probability distribution is a special form of BPA in which the probabilities are all assigned to the singleton subsets.

Comparisons of different imprecise probability approaches can be found in Shafer [1981, 1982a] and Walley [1991]. A noteworthy technical discussion of the common and dissimilar aspects of imprecise probability schemes can be found in Wasserman [1990a].

Chapter 3

Introduction to Dempster-Shafer Theory

The Dempster-Shafer theory is derived from Dempster's original work [Dempster, 1966, 1967a, 1967b, 1968a, 1968b, 1969] which was aimed at relaxing certain Bayesian restrictions when dealing with the inference of unknown parameters. Shafer [1976, 1982a, 1982b] expanded Dempster's original concepts and produced what is now generally referred to as the D-S theory. In this chapter the basic ingredients of the D-S theory, including the D-S representation of uncertainties, the combination of different sources of information and the D-S decision analysis, will be introduced. To facilitate understanding of the more unfamiliar aspects, the theory will first be described for the discrete variable case and then extended to the continuous case in the latter part of this chapter.

3.1 D-S Representation of Uncertainties

Let $\theta$ represent a random variable whose true value is unknown. Let $\Theta = \{\theta_1, \theta_2, \ldots, \theta_n\}$ represent the individual, mutually exclusive, discretized values of the possible outcomes of $\theta$. In conventional probability theory uncertainty about $\theta$ is represented by assigning probability values $p_i$ to the elements $\theta_i$, $i = 1, \ldots, n$, which satisfy $\sum p_i = 1.0$. As an example, consider a random variable with only four outcome values $a$, $b$, $c$ and $d$. Then a typical probability assignment might be as shown in Figure 3.1.

[Figure 3.1: A conventional probability assignment: 0.20 on a, 0.35 on b, 0.40 on c, 0.05 on d.]

The representation of uncertainties in the D-S theory is similar to that in conventional probability theory and involves assigning probabilities to the space $\Theta$. However, the D-S theory has one significant new feature: it allows the probability to be assigned to subsets of $\Theta$ as well as to the individual elements $\theta_i$. Here the collection of all the subsets of $\Theta$, including the singleton elements, defines the D-S frame of $\Theta$. In the context of this theory the union of the elements in any subset is always implied. The complete representation of uncertainty in D-S theory therefore involves assigning probability values to the D-S frame. The complete probability specification on the frame is defined as a D-S Basic Probability Assignment (BPA) [Shafer, 1976], and is denoted as $m(A)$ where $A$ represents any subset of $\Theta$. As in conventional discrete probability theory, the discrete BPA function $m(A)$ must satisfy

$$0 \le m(A) \le 1 \quad \text{for all } A \subseteq \Theta, \qquad m(\emptyset) = 0, \qquad \sum_{A \subseteq \Theta} m(A) = 1.0 \tag{3.1}$$

where $\emptyset$ represents the null subset.

The discrete BPA denoted by $m(A)$ is analogous to the conventional discrete probability assignment usually denoted by $P(\theta)$ in that the subsets $A$ act like the individual elements $\theta$, and the summation of all the BPA values in $m(A)$ is 1.0 as expressed in (3.1).

When probability represents an individual's subjective knowledge about some unknown state of nature, it is also referred to as the "degree of belief" (e.g. Walley [1991], p. 19), or simply the "belief". Thus the expressions probability, probability assignment, and belief will be used interchangeably throughout this thesis when concerning subjective knowledge.

Being able to assign one's belief to a subset as well as to a singleton element of $\Theta$ in the D-S BPA provides extra freedom in expressing uncertainties, and it is this freedom which makes the theory potentially attractive and useful. The BPA value $m(A)$ represents that amount of probability or belief which can be committed to $A$ but to nothing smaller.
This reflects the imprecision of the evidence by indicating that, through lack of information, this probability value cannot be further subdivided among the elements in that subset. Furthermore, the BPA value on any subset is not the total probability value attachable to that subset. Because of the freedom of expressing separate probabilities on overlapping subsets, this total probability value may be unknown. This is in contrast with conventional probability theory, where the summation of the probability values on the elements of the subset always equals the subset probability.

For the more rigorously minded, the concept of BPA in D-S theory and its resemblance to the conventional probability distribution can be further clarified by considering a multivalued mapping from one space to another [Dempster, 1967a; Wasserman, 1990a]. Here one space, $\Theta$, which represents the possible values of an unknown random variable, has already been defined. Let $T$ represent another space on which a conventional probability distribution $\mu_T$ is known. Let $\Gamma(t) \subseteq \Theta$ represent a multivalued mapping from $T$ to $\Theta$, which means that an observation $t$ in $T$ is equivalent to the observation that the true value of $\theta$ is in $\Gamma(t) \subseteq \Theta$. The concept of multivalued mapping is illustrated in Figure 3.2. Through the multivalued mapping from space $T$ to space $\Theta$, a conventional probability distribution $\mu_T$ in $T$ thus corresponds to a probability distribution on $\Theta$, which is a D-S BPA assigning probabilities to subsets as well as singleton elements of $\Theta$. Such a probability distribution on $\Theta$ is also called by Walley [1991] an "imprecise probability distribution".

[Figure 3.2: The multivalued mapping from space $T$ to space $\Theta$ which generates a BPA on $\Theta$ (following Wasserman [1990a]).]

The following example illustrates how the BPA might be used to express real world information.

EXAMPLE 3.1. A site is being considered for the construction of a dam.
The hydraulic conductivity of the fissured bedrock is of concern and will be assessed, for feasibility purposes, as either High, Medium, or Low, which are represented as $\theta_1$, $\theta_2$, $\theta_3$ respectively. A geologist's assessment, based on a review of geological maps of the area and his past experience on similar sites, is that there is a 60% chance of it being $\theta_1$, a 30% chance of $\theta_2$ and a 10% chance of $\theta_3$. Without a site visit the geologist evaluates the chance of his overall assessment being meaningful at 0.8 and the chance of it being worthless at 0.2. An assessment is "worthless" when the geologist knows nothing about the hydraulic conductivity of the bedrock at the site. This subjective information might be represented in the form of a BPA as

$$m_1(\{\theta_1\}) = 0.48, \qquad m_1(\{\theta_2\}) = 0.24, \qquad m_1(\{\theta_3\}) = 0.08, \qquad m_1(\Theta) = 0.20$$

with all remaining $m_1(A)$ values equal to 0.0. Here the subscript "1" in $m_1(A)$ indicates that this BPA is based on the first piece of information. The chance 0.2 is assigned to the whole set $\Theta = \{\theta_1, \theta_2, \theta_3\}$ to represent the ignorance, and the proportional chances of $\theta_1$, $\theta_2$, $\theta_3$ are maintained but scaled by 0.8 (for example $0.6 \times 0.8 = 0.48$) to meet the requirement $\sum m_1(A) = 1.0$.

The BPA is the fundamental expression of uncertainty in D-S theory. In the special case when the BPA has non-zero values only on the individual elements, it becomes a conventional probability distribution. The conventional probability distribution therefore is a special type of BPA. It is referred to as a Bayesian BPA in the D-S scheme as it conveys the same information as does a conventional distribution in the Bayesian scheme. The Bayesian BPA satisfies

$$m(\{\theta_i\}) = p_i \quad \text{for } i = 1, \ldots, n, \qquad m(A) = 0 \quad \text{for all non-singleton } A \subseteq \Theta, \qquad \sum_{i=1}^{n} p_i = 1.0 \tag{3.2}$$

There are situations where a person has no knowledge about $\theta$ other than knowing that the truth lies within the bounds of $\Theta$. In these cases, he will assign his total belief 1.0 to the whole set $\Theta$, i.e.

$$m(\Theta) = 1, \qquad m(A) = 0 \quad \text{for } A \subset \Theta \tag{3.3}$$

This type of BPA represents complete ignorance in D-S theory and therefore is referred to as an ignorance BPA. This provides an interesting contrast with the Bayesian representation of ignorance using a noninformative prior distribution. There are situations where there could be several different noninformative prior distributions and often there is no clear reason to prefer any one over the others [Berger, 1985]. Furthermore, Walley [1991] points out that the Bayesian noninformative prior is not noninformative but that its precise specification of probability values on the singleton elements implies substantial knowledge. In comparison, provided the range of possible values implied by $\Theta$ is sufficiently large, the ignorance BPA appears to be truly vacuous.

When the subsets $A_i$, $i = 1, 2, \ldots, n$, which have the non-zero BPA values, are nested, then the corresponding BPA function is called a consonant BPA. The consonant BPA has the important feature that the beliefs assigned to the subsets do not conflict with each other. A consonant BPA can be expressed as

$$0 \le m(A_i) \le 1 \quad \text{for } A_1 \subset A_2 \subset \cdots \subset A_n \subseteq \Theta, \qquad \sum_{i=1}^{n} m(A_i) = 1.0 \tag{3.4}$$

3.2 Some important quantities based on a BPA

A BPA on the frame of $\Theta$ is the basic representation of uncertainty associated with the unknown parameter $\theta$ in the D-S theory. Once the BPA is obtained, a number of quantities which can be used for a variety of purposes can be determined. Some of the more important quantities based on a BPA are described in the following subsections.

3.2.1 Belief and Plausibility Functions

Consider a subset $A \subseteq \Theta$ for which the probability value is unknown as a result of imprecision in the evidence, which is reflected in the BPA. For any subset $B$ of $A$ the BPA value which is committed to $B$ also naturally supports $A$.
The total belief on $A$, which is the collection of all the beliefs committed to all the subsets of $A$, was defined as the belief of $A$ and denoted as $Bel(A)$ by Shafer [1976]. If a subset $B$ does not conflict with $A$, i.e. the intersection of $B$ and $A$ is not empty, the BPA value which is assigned to $B$ also provides possible belief supporting $A$. The collection of these possible beliefs to $A$ defines the plausibility of $A$, which is denoted as $Pl(A)$. The complete specifications of $Bel(A)$ and $Pl(A)$ for any $A \subseteq \Theta$ are called belief and plausibility functions, which can be calculated from the BPA on $\Theta$ as follows

$$Bel(A) = \sum_{B \subseteq A} m(B), \qquad Pl(A) = \sum_{B \cap A \neq \emptyset} m(B) \tag{3.5}$$

$Bel(A)$ represents the least probability and $Pl(A)$ the most probability that the true $\theta$ value is in $A$, based on the information expressed in the BPA. They can therefore be considered as the lower and upper bounds of the probability on $A$ respectively. Indeed, Dempster originally called the belief and plausibility on subset $A$ the lower and upper probabilities of $A$.

3.2.2 Commonality Function

A further quantity of interest, known as the commonality, can also be determined from the BPA. The commonality of subset $A$ collects all of the BPA values that could potentially be committed to $A$ from all of the supersets which include $A$. It is analogous to the cumulative probability in conventional statistics. The commonality of subset $A$ is denoted $H(A)$ and is defined as

$$H(A) = \sum_{A \subseteq B \subseteq \Theta} m(B) \tag{3.6}$$

If values of $H(A)$, $A \subseteq \Theta$, are specified for the entire frame of $\Theta$, it is referred to as a commonality function. When the commonality function is determined, the corresponding BPA will also be uniquely determined [Shafer, 1976, Eq. 2.4, 2.2].
The commonality function is introduced here mainly as a computational device as it facilitates combining different sources of information.

The following example demonstrates the calculation of belief, plausibility and commonality functions from a known BPA function.

EXAMPLE 3.1 (continued). The belief, plausibility and commonality functions of the BPA, $m_1(A)$, in the previous example can be calculated and are shown in Table 3.1. Sample calculations are given below

$$Bel_1(\{\theta_1, \theta_2\}) = m_1(\{\theta_1\}) + m_1(\{\theta_2\}) + m_1(\{\theta_1, \theta_2\}) = 0.48 + 0.24 = 0.72$$

$$Pl_1(\{\theta_1, \theta_2\}) = m_1(\{\theta_1\}) + m_1(\{\theta_2\}) + m_1(\{\theta_1, \theta_2\}) + m_1(\{\theta_1, \theta_3\}) + m_1(\{\theta_2, \theta_3\}) + m_1(\{\theta_1, \theta_2, \theta_3\}) = 0.48 + 0.24 + 0.20 = 0.92$$

$$H_1(\{\theta_1, \theta_2\}) = m_1(\{\theta_1, \theta_2\}) + m_1(\{\theta_1, \theta_2, \theta_3\}) = 0.2$$

Table 3.1: Belief, plausibility and commonality functions of $m_1(A)$.

| Subset $A$ | $\{\theta_1\}$ | $\{\theta_2\}$ | $\{\theta_3\}$ | $\{\theta_1,\theta_2\}$ | $\{\theta_1,\theta_3\}$ | $\{\theta_2,\theta_3\}$ | $\{\theta_1,\theta_2,\theta_3\}$ |
|---|---|---|---|---|---|---|---|
| $Bel_1(A)$ | 0.48 | 0.24 | 0.08 | 0.72 | 0.56 | 0.32 | 1.0 |
| $Pl_1(A)$ | 0.68 | 0.44 | 0.28 | 0.92 | 0.76 | 0.52 | 1.0 |
| $H_1(A)$ | 0.68 | 0.44 | 0.28 | 0.20 | 0.20 | 0.20 | 0.20 |

3.2.3 Compatible Probability Distributions

The true conventional probability distribution on $\Theta$ is unknown but, based on the belief and plausibility functions, a set of conventional probability distributions $P(A)$ which satisfy

$$Bel(A) \le P(A) \le Pl(A), \qquad A \subseteq \Theta \tag{3.7}$$

can be determined [Dempster, 1967a; Wasserman, 1990a, 1990b]. Individual distributions in this set are referred to as the compatible probability distributions of the BPA. The BPA therefore can be interpreted as implying a set of conventional probability distributions, or density functions when $\theta$ is treated as a continuous variable. This interpretation provides another perspective on the concept of BPA by relating it to conventional probability distributions. It also forms the basis of D-S decision analysis, as will be discussed in Section 3.6.
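As a minimal sketch (not part of the original analysis), the quantities of Section 3.2 can be computed directly from the BPA $m_1(A)$ of Example 3.1. The Python names below (`belief`, `plausibility`, `commonality`, `is_compatible`) and the two trial distributions are illustrative assumptions. Subsets are stored as `frozenset` objects, so the conditions in (3.5), (3.6) and (3.7) become simple set comparisons.

```python
from itertools import combinations

# Frame of Example 3.1: High, Medium, Low hydraulic conductivity.
THETA = frozenset({"theta1", "theta2", "theta3"})

# BPA m1 of Example 3.1; subsets that are not listed carry zero mass.
m1 = {
    frozenset({"theta1"}): 0.48,
    frozenset({"theta2"}): 0.24,
    frozenset({"theta3"}): 0.08,
    THETA: 0.20,
}


def nonempty_subsets(frame):
    """Yield every non-empty subset of the frame as a frozenset."""
    items = sorted(frame)
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def belief(m, a):
    """Bel(A): total mass on subsets B contained in A, as in (3.5)."""
    return sum(mass for b, mass in m.items() if b <= a)


def plausibility(m, a):
    """Pl(A): total mass on subsets B that intersect A, as in (3.5)."""
    return sum(mass for b, mass in m.items() if b & a)


def commonality(m, a):
    """H(A): total mass on subsets B that contain A, as in (3.6)."""
    return sum(mass for b, mass in m.items() if a <= b)


def is_compatible(m, p, frame, tol=1e-9):
    """Check Bel(A) <= P(A) <= Pl(A) for every non-empty A, as in (3.7)."""
    for a in nonempty_subsets(frame):
        prob = sum(p[x] for x in a)
        if prob < belief(m, a) - tol or prob > plausibility(m, a) + tol:
            return False
    return True


# Hypothetical trial distributions (assumptions, not taken from the thesis).
# p_inside moves all of m1(Theta) onto theta3; p_outside puts more on
# theta1 than Pl({theta1}) = 0.68 allows.
p_inside = {"theta1": 0.48, "theta2": 0.24, "theta3": 0.28}
p_outside = {"theta1": 0.70, "theta2": 0.20, "theta3": 0.10}

for a in nonempty_subsets(THETA):
    print(sorted(a), round(belief(m1, a), 2),
          round(plausibility(m1, a), 2), round(commonality(m1, a), 2))
print(is_compatible(m1, p_inside, THETA), is_compatible(m1, p_outside, THETA))
```

Rounded to two decimals, the printed rows should agree with Table 3.1. The compatibility check simply tests (3.7) on every subset, which is practical only for small frames because the number of subsets grows as $2^n$.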
Interpreting the BPA as representing a set of conventional probability distributions permits a comparison with another of the imprecise probability schemes mentioned in Chapters 1 and 2: the Bayesian Robustness approach. There, also, a set of "possible" prior distributions on $\Theta$, rather than a single prior distribution, is adopted to reflect the lack of knowledge about $\theta$.

The next example illustrates the determination of a compatible probability distribution from a known BPA function.

EXAMPLE 3.1 (continued). Considering again the BPA $m_1(A)$, one compatible probability distribution would be

$$P(\{\theta_1\}) = 0.547, \qquad P(\{\theta_2\}) = 0.306, \qquad P(\{\theta_3\}) = 0.147$$

which is obtained by distributing the ignorance 0.2 equally among the three singleton elements and can be seen to meet the $Bel(A) \le P(A) \le Pl(A)$ condition by inspection of Table 3.1.

3.3 Combining BPA's via Dempster's Rule of Combination

If two BPA's, $m_1(A)$ and $m_2(B)$, on $\Theta$ are obtained as a result of two pieces of independent information, they can be combined via Dempster's rule of combination [Dempster, 1967a] to yield a new BPA. The combination is actually the orthogonal sum of the two BPA's, which is based on the fact that if $m_1(A)$ from one BPA supports subset $A$ and $m_2(B)$ from another independent BPA supports the subset $B$, then the product $m_1(A) m_2(B)$ should naturally support the subset $C$ which is the intersection of $A$ and $B$, i.e. $C = A \cap B$. Since the BPA value should be zero on empty subsets, the resultant products must be normalized by dividing by a factor which is one minus the sum of the products falling on these empty subsets.
Dempster's rule of combination can be expressed as

$$m(C) = \frac{1}{1 - k} \sum_{A \cap B = C} m_1(A)\, m_2(B) \quad \text{for } C \subseteq \Theta,\ C \neq \emptyset \tag{3.8}$$

where $k = \sum_{A \cap B = \emptyset} m_1(A)\, m_2(B)$ and $m(\emptyset) = 0$.

In addition to combining the two BPA's directly, Dempster's rule of combination can also be performed more conveniently through the commonality functions

$$H(A) = K H_1(A) H_2(A) \quad \text{for } A \subseteq \Theta \tag{3.9}$$

where $H(A)$ is the commonality function of the resultant BPA, $K = (1 - k)^{-1}$ and $k$ is defined as above.

Equations (3.8) and (3.9) follow Shafer ([1976], Theorems 3.1, 3.3). Once the resultant commonality function is obtained, as in equation (3.9), the corresponding BPA can then be calculated and will be identical to the resultant BPA from (3.8). If more than two BPA's are obtained as a result of several independent sources of information, they can be combined sequentially using either of the two procedures, and the combination is both commutative and associative. The combined BPA represents the inference of the unknown parameter $\theta$ based on the information from those independent sources. Since the combination procedure using the commonality functions is much simpler, (3.9) will be used when combining BPA's representing independent sources of information.

In addition to the properties discussed above, Dempster's rule of combination has some other important features [Shafer, 1976; Wasserman, 1990a]. When a BPA is combined with an ignorance BPA, the resultant BPA is always identical to the original BPA. This feature further supports the idea that the ignorance BPA does represent an informationless input. When any general BPA, which includes a Bayesian BPA, is combined with a Bayesian BPA, the result will always be another Bayesian BPA. This leads to the claim that the conventional Bayesian theory is actually a special case of the more general D-S theory.
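The two properties just stated can be checked directly from (3.8). The short derivation below is a sketch added for illustration, with $m_1$ denoting the general BPA and $m_2$ the ignorance or Bayesian BPA; the notation $p_i = m_2(\{\theta_i\})$ follows (3.2).

```latex
% Combination with the ignorance BPA, m_2(\Theta) = 1:
% every intersection A \cap \Theta = A is non-empty, so k = 0 and
m(C) = \frac{m_1(C)\, m_2(\Theta)}{1 - 0} = m_1(C)
\qquad \text{for all } C \subseteq \Theta .

% Combination with a Bayesian BPA, m_2(\{\theta_i\}) = p_i:
% A \cap \{\theta_i\} is either \{\theta_i\} or \emptyset,
% so only singletons can receive mass:
m(\{\theta_i\})
  = \frac{p_i \sum_{A \ni \theta_i} m_1(A)}
         {\sum_{j=1}^{n} p_j \sum_{A \ni \theta_j} m_1(A)}
  = \frac{p_i \, Pl_1(\{\theta_i\})}
         {\sum_{j=1}^{n} p_j \, Pl_1(\{\theta_j\})} .
```

In the second case the plausibilities $Pl_1(\{\theta_i\})$ act as weights on the Bayesian probabilities $p_i$, in much the same way that a likelihood weights a prior in Bayes' theorem.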
While Dempster's rule of combination has a credible basis and has been widely accepted, it cannot claim to be theoretically perfect and has its critics. For more discussion on Dempster's rule of combination, see [Shafer, 1982a, 1982b; Walley, 1987; Weichselberger and Pöhlmann, 1987]. The application of Dempster's rule of combination is illustrated in the following simple example.

EXAMPLE 3.1 (continued). Suppose that an opportunity arises for a crew to undertake just one test drilling at the site. The test reports $\theta_1$ but, because a single test drilling cannot sense spatial variability over the entire site, its indication is known to be incorrect 30% of the time. If the test drilling result is incorrect, then the truth could be either $\theta_2$ or $\theta_3$. The test result might be characterized by the following BPA:

$$m_2(\{\theta_1\}) = 0.70, \qquad m_2(\{\theta_2, \theta_3\}) = 0.30$$

with all remaining $m_2(A)$ values equal to 0.0.

Since this test result and the geologist's subjective knowledge are two independent sources of information, they can be combined using Dempster's rule of combination. The combination can be performed using either (3.8) or (3.9), and the resultant BPA's are the same. The combination of $m_1(A)$ and $m_2(A)$ using the commonality functions, i.e. equation (3.9), is given in Table 3.2. It can be seen that since both BPA's clearly support $\theta_1$, the resultant BPA, $m(A)$, also gives strong support to $\theta_1$.
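The combination in this example can also be sketched in a few lines of Python. The function names `combine` and `commonality` are illustrative assumptions; `combine` applies the intersection form (3.8) directly, and the final check confirms numerically that the result satisfies the commonality form (3.9) on every subset.

```python
from itertools import combinations

THETA = frozenset({"theta1", "theta2", "theta3"})

# Geologist's BPA m1 and test-drilling BPA m2 from Example 3.1.
m1 = {
    frozenset({"theta1"}): 0.48,
    frozenset({"theta2"}): 0.24,
    frozenset({"theta3"}): 0.08,
    THETA: 0.20,
}
m2 = {
    frozenset({"theta1"}): 0.70,
    frozenset({"theta2", "theta3"}): 0.30,
}


def combine(ma, mb):
    """Dempster's rule (3.8): return the combined BPA and the conflict k."""
    products = {}
    conflict = 0.0
    for a, mass_a in ma.items():
        for b, mass_b in mb.items():
            c = a & b
            if c:
                products[c] = products.get(c, 0.0) + mass_a * mass_b
            else:
                conflict += mass_a * mass_b  # product falls on the empty set
    if conflict >= 1.0:
        raise ValueError("the two BPA's are totally conflicting")
    combined = {c: v / (1.0 - conflict) for c, v in products.items()}
    return combined, conflict


def commonality(m, a):
    """H(A): total mass on subsets B that contain A, as in (3.6)."""
    return sum(mass for b, mass in m.items() if a <= b)


m, k = combine(m1, m2)
K = 1.0 / (1.0 - k)

# Numerical check of (3.9): H(A) = K * H1(A) * H2(A) for every non-empty A.
subsets = [frozenset(c) for r in range(1, 4)
           for c in combinations(sorted(THETA), r)]
rule_3_9_holds = all(
    abs(commonality(m, a) - K * commonality(m1, a) * commonality(m2, a)) < 1e-9
    for a in subsets
)

for subset, mass in sorted(m.items(), key=lambda item: sorted(item[0])):
    print(sorted(subset), round(mass, 3))
print("k =", round(k, 3), "| (3.9) holds:", rule_3_9_holds)
```

Carried out without intermediate rounding, the conflict is $k = 0.368$ and the combined masses are about 0.753, 0.114, 0.038 and 0.095 on $\{\theta_1\}$, $\{\theta_2\}$, $\{\theta_3\}$ and $\{\theta_2, \theta_3\}$. These differ from the $m(A)$ row of Table 3.2 in the second decimal place, which appears consistent with the table having been computed from commonality values already rounded to two decimals.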
Note that the value 0.1 assigned to the subset $\{\theta_2, \theta_3\}$ contains some ambiguity which is not resolved by all of the current information.

Table 3.2: The combination of $m_1(A)$ and $m_2(A)$.

| Subset $A$ | $\{\theta_1\}$ | $\{\theta_2\}$ | $\{\theta_3\}$ | $\{\theta_1,\theta_2\}$ | $\{\theta_1,\theta_3\}$ | $\{\theta_2,\theta_3\}$ | $\{\theta_1,\theta_2,\theta_3\}$ |
|---|---|---|---|---|---|---|---|
| $m_1(A)$ | 0.48 | 0.24 | 0.08 | 0.00 | 0.00 | 0.00 | 0.20 |
| $m_2(A)$ | 0.70 | 0.00 | 0.00 | 0.00 | 0.00 | 0.30 | 0.00 |
| $H_1(A)$ | 0.68 | 0.44 | 0.28 | 0.20 | 0.20 | 0.20 | 0.20 |
| $H_2(A)$ | 0.70 | 0.30 | 0.30 | 0.00 | 0.00 | 0.30 | 0.00 |
| $H(A)$ | 0.48 | 0.13 | 0.08 | 0.00 | 0.00 | 0.06 | 0.00 |
| $m(A)$ | 0.76 | 0.11 | 0.03 | 0.00 | 0.00 | 0.10 | 0.00 |

3.4 Discretized Contiguous Frame

In the general case the BPA value can be assigned to any subset in the frame. But where the singleton elements arise from the discretization of a continuous real variable, many of the subsets in the full frame become unreasonable. In practical applications the unknown variables often represent physical quantities that, at least in theory, are measurable on a continuum. Once such a variable has been discretized, say, into a sequence of ranked elements $a$, $b$, $c$ and $d$, it is rare that real world knowledge would result in a non-zero probability assignment to a union involving non-adjacent discrete elements such as the subset $\{a, c\}$, as this would imply that the knowledge somehow explicitly excluded the intermediate value $b$. Therefore in many practical situations it is only necessary to consider unions of elements which are contiguous. Eliminating non-contiguous unions substantially reduces the complexity of the frame. Henceforth this type of frame is referred to as a contiguous frame.

Since in a contiguous frame the elements in any subset can be arranged in an ascending or descending sequence, a subset can be identified by just its starting element and ending element. For example, $[b, d]$ represents a subset which contains elements $b$, $c$ and $d$.
Thus all of the subsets in a contiguous frame can be generated by considering each of the elements in $\Theta$ as a starting element and, for any starting element, each of the elements which are greater than it as the ending element. A contiguous frame therefore can be organized so that it can be represented graphically [Strat, 1984] in a triangular diagram as illustrated for the discrete case in Figure 3.3.

[Figure 3.3: Graphical representation of a contiguous frame, discrete case. Columns give the starting element and rows the ending element of each subset.]

| Ending element | $a$ | $b$ | $c$ | $d$ |
|---|---|---|---|---|
| $d$ | 0.15 | 0.18 | 0.00 | 0.05 |
| $c$ | 0.00 | 0.15 | 0.07 | |
| $b$ | 0.30 | 0.00 | | |
| $a$ | 0.10 | | | |

In this triangular diagram, any subset is now represented by its starting element, which is written above the horizontal axis, and its ending element, which is written beside the vertical axis. Thus, in Figure 3.3, the entry of 0.18 in the box in row $d$, column $b$ represents a probability assignment to the contiguous subset $\{b, c, d\}$. Entries in the boxes on the diagonal from lower left to upper right correspond to probabilities assigned to the singleton elements. An allocation of the non-zero probability values that is confined solely to these diagonal elements therefore represents a Bayesian BPA, i.e. a conventional discrete probability distribution. Note that, since the starting and ending elements of subsets vary over the whole set $\Theta$, they can be considered as two random variables which, except for their new role in forming subsets in a contiguous frame, are otherwise identical to the random variable $\theta$.

This triangular diagram, along with its continuous equivalent introduced in the next section, is, to the D-S scheme, the direct analogy of the familiar probability mass or density plots in conventional statistics.
It follows that it is equally valuable when conceptualizing actual BPA's.

The remainder of this thesis will be confined to the contiguous variable case and a subset $A$ will be represented by an interval "$[\theta_i, \theta_j]$" where $\theta_i$ and $\theta_j$ represent the starting and ending elements of the interval and $\theta_i \le \theta_j$, $i, j = 1, \ldots, n$.

The expression for the commonality of a subset $A = [\theta_i, \theta_j]$, as defined by (3.6), is simplified in the contiguous frame case and becomes

$$H([\theta_i, \theta_j]) = \sum_{x=1}^{i} \sum_{y=j}^{n} m([\theta_x, \theta_y]) \tag{3.10}$$

where $\theta_1$ and $\theta_n$ represent the first and last elements of the discretized continuous variable. This formula is also demonstrated graphically in Figure 3.4. Dempster's rule of combination expressed in (3.9) can also be simplified for the contiguous case and rewritten as

$$H([\theta_i, \theta_j]) = K H_1([\theta_i, \theta_j])\, H_2([\theta_i, \theta_j]) \tag{3.11}$$

[Figure 3.4: BPA elements of discrete contiguous frame contributing to commonality $H([u, v])$ (the shaded elements) (from Caselton and Luo [1992]).]

The combination of any number of BPA's specified on the same contiguous frame always yields a resultant BPA function that also has its non-zero probability values confined to the same contiguous frame.

Knowing a BPA on a discrete contiguous frame, two compatible probability distributions (see Section 3.2.3) of particular significance can be determined. These are the two marginal distributions of $\theta$. The lower of these two distributions is generated by concentrating the BPA value on each subset onto its starting element (i.e. onto its lower bound). The probability value for any one of these elements is obtained by summing all the components of the BPA values which can be concentrated on the element. Since the resulting distribution concentrates as much probability as possible on the smallest elements within subsets, it is defined here as the lower marginal distribution. It is denoted by $f(\theta)$.
Similarly, by concentrating the BPA value on each subset onto its ending element, the resultant compatible distribution, which concentrates as much probability as possible on the largest elements within subsets, is defined as the upper marginal distribution and denoted by $g(\theta)$. Thus the upper and lower marginal distributions $g(\theta)$ and $f(\theta)$ are expressed in the discrete case as

$$g(\theta_j) = \sum_{i=1}^{j} m([\theta_i, \theta_j]) \quad \text{for } j = 1, 2, \ldots, n$$

$$f(\theta_i) = \sum_{j=i}^{n} m([\theta_i, \theta_j]) \quad \text{for } i = 1, 2, \ldots, n \tag{3.12}$$

3.5 The Continuous Contiguous Frame

In the previous section, the continuous random variable $\theta$ was discretized into $n$ elements, i.e. $\theta_1, \theta_2, \ldots, \theta_n$. Now with the range of $\theta$ fixed, as the number of elements is increased, the interval represented by every individual element is decreased, and the contiguous frame diagram is discretized more finely. In the limiting case, as the number of discretized elements becomes infinite, the diagram evolves into a continuous surface bounded by a triangle formed by the two axes and the diagonal line. It then represents the continuous contiguous frame for a real continuous variable $\theta$. Any interval is now represented by a point in this triangular region and its coordinates $[u, v]$, measured on the horizontal and vertical axes respectively, are the lower and upper bounds of the interval.

The calculation of the commonality value for $\theta$ in any interval $[u, v]$ is essentially the same as in the discrete case except that the summations are replaced by integrals:

$$H([u, v]) = \int_{u_0}^{u} \int_{v}^{v_1} m([x, y])\, dy\, dx \tag{3.13}$$

where $u_0 = v_0$ represents the starting point and $u_1 = v_1$ the ending point of the full range in which the variable $\theta$ can lie. In the discrete case $m(\cdot)$ was analogous to a probability mass assignment whereas in the continuous case it is analogous to probability density.
The integration of the BPA density in the above equation occursover areas which can be readily identified in the contiguous frame represented by atriangular BPA diagram, as shown in Figure 3.5. For the continuous contiguous frame,the Dempster's rule of combination in (3.9) can be expressed asFigure 3.5: Area of continuous contiguous frame BPA where m([u,v]) is integrated todetermine H([u,v]) (the shaded area) (from Caselion and Luo, [1992]).H([u,v]) = KI-liCu,vpH2([u,vi)^(3.14)conversely, if the continuous commonality function H({u,v]) is known, the BPA densitymCu,vi) can easily be determined from the derivatives of H(iu,vp as follows:Chapter 3. Introduction to Dempster-Shafer Theory^ 32d2 H(Lu,v])rri,([u, v]) = auav(3.15)As in the discrete case, the lower and upper bounds, u and v, of an interval [u,v] canthemselves be considered as two continuous random variables and are denoted as U andV respectively, where U < V. Again these two random variables are identical to therandom variable B except for their additional roles in forming intervals in a continuouscontiguous frame. Specifying a continuous probability measure BPA on the contiguousframe for the variable of interest is then analogous to specifying a bivariate densityfunction on U and V in conventional statistics, though the implications are completelydifferent. Given the BPA density m([u,v]), the continuous upper and lower marginaldistributions are defined for V and U and can be calculated as:G(v) = Pr(V v) =^lv m([u,v])dudvvo =A,F(u) = Pr(U < u) = 1:: fuvi m([u,v])dvdu^(3.16)and the upper and lower marginal density functions can be determined asg(v ) = —dv G(v ) = fo m([u, v]) duAu) = —ddu-F(u) = f tl m([u,v])dv^ (3.17)An equivalent but more direct way of determining the upper and lower marginal densityfunctions are from the commonality functions H([u,v]), i.e.Chapter 3. 
$$
g(v) = -\left. \frac{\partial H([u,v])}{\partial v} \right|_{u=v}, \qquad
f(u) = \left. \frac{\partial H([u,v])}{\partial u} \right|_{v=u}
\tag{3.18}
$$

The upper and lower marginal distributions $G(v)$ and $F(u)$ are members of the compatible probability distributions implied by the BPA $m([u,v])$ and represent two extreme interpretations of the information, contained in the BPA, about the unknown parameter $\theta$. They play an important role in the D-S decision analysis, as will be discussed in the next section.

3.6 D-S Decision Analysis

In the previous sections, the representation of uncertainties of some unknown state of nature $\theta$ using D-S theory has been described. In engineering practice, the ultimate purpose of quantifying uncertainties is to support decision making. In this section, D-S decision analysis, i.e. decision making based on the D-S representation of uncertainties, will be described.

3.6.1 Review of the Utility Function

Consider a typical decision problem where uncertainties stem from a state of nature $\theta$ whose true value is unknown. Let $D = \{d_j,\ j = 1, \ldots, m\}$ represent the possible decision options among which the decision maker can choose. For any decision option $d_j$, there is a consequence corresponding to each possible outcome of the state of nature $\theta$. According to the conventional decision criterion, which involves comparing expected values, the consequence must be expressed in terms of the "utility", which is a function of the decision option $d_j$ and the possible outcome $\theta$, and is denoted by $U(d, \theta)$. The utility function reflects the decision maker's attitude towards risk and may or may not be the same as the monetary consequence. In water resources engineering decision analysis, the decision maker often represents a large corporation or government, and the amount of monetary loss may not be so severe as to cause the decision maker to deviate from being "risk neutral". If this is the case, the utility function can be equated with monetary consequences.
This will be assumed to be the case throughout the rest of this thesis but does not preclude the use of other utility forms in the D-S scheme.

3.6.2 The D-S Upper and Lower Expected Utilities

In conventional Bayesian analysis, uncertainty of the state of nature $\theta$ is represented by a posterior distribution of $\theta$. Bayesian decision analysis involves calculating the expected utility for each decision option based on the utility function $U(d, \theta)$ and the posterior distribution of $\theta$. The decision criterion is to choose the option which will yield the maximum expected utility.

In the D-S approach, the final inferential result is a BPA density on $\theta$. Unlike a conventional probability distribution, the BPA density offers no direct way to compute a single representative expected utility. Furthermore, it is important to avoid any approach which, by computing a single-valued result, effectively destroys information concerning the full extent of the uncertainties reflected in the BPA.

In Section 3.2.3 it was explained how a BPA can be interpreted as implying a set of compatible probability distributions. Together with the utility function, it is therefore possible to calculate an expected utility value from any one of the compatible distributions. The maximum and minimum among these are called the upper and lower expected values and form the basis of D-S decision making, as will be discussed in the next subsection.

In general the upper expected value is obtained by choosing a compatible probability distribution which is derived from the BPA in such a way that, for any interval $[u,v]$, the density $m([u,v])$ is concentrated at the point in the interval $[u,v]$ at which the utility function reaches its maximum. The compatible probability distribution which results in the lower expected value is obtained in a similar fashion by concentrating the $m([u,v])$ value at the minimum utility point within $[u, v]$.
With the utility function $U(d, \theta)$, the upper and lower expected utilities are thus defined as

$$
\begin{aligned}
E^*[U(d)] &= \int_{[u,v] \subset \Theta} m([u,v]) \left[ \sup_{\theta \in [u,v]} U(d,\theta) \right] d[u,v] \\
E_*[U(d)] &= \int_{[u,v] \subset \Theta} m([u,v]) \left[ \inf_{\theta \in [u,v]} U(d,\theta) \right] d[u,v]
\end{aligned}
\tag{3.19}
$$

In the case where the utility function is a monotone function of the state of nature $\theta$, equation (3.19) can be simplified in the following way. Consider the upper and lower marginal density functions $g(v)$ and $f(u)$ defined by (3.17) or (3.18). Since the random variables $U$ and $V$ are identical to the state of nature $\theta$, $g(v)$ and $f(u)$ are the upper and lower marginal density functions of $\theta$, and are simply referred to as $g(\theta)$ and $f(\theta)$. Note that $f(\theta)$ is constructed in a way that concentrates as much of the probability measure as possible, within the constraints of the BPA function, towards the smaller $\theta$ values, while $g(\theta)$ is constructed in an analogous fashion so that the probability measure is concentrated towards larger $\theta$ values.

In conjunction with a monotone utility function, $g(\theta)$ and $f(\theta)$ will produce the upper and lower expected values defined in (3.19). If the utility function is monotone increasing, then (3.19) simplifies to:

$$
\begin{aligned}
E^*[U(d)] &= \int_{\theta \in \Theta} U(d,\theta) \, dG(\theta) = \int_{\theta \in \Theta} U(d,\theta) \, g(\theta) \, d\theta \\
E_*[U(d)] &= \int_{\theta \in \Theta} U(d,\theta) \, dF(\theta) = \int_{\theta \in \Theta} U(d,\theta) \, f(\theta) \, d\theta
\end{aligned}
\tag{3.20}
$$

and in the monotone decreasing utility function case $E^*[U(d)]$ and $E_*[U(d)]$ are simply reversed in the above.

Thus, in contrast with the Bayesian approach, which produces a single precise expected utility value for each decision option, the D-S approach yields both upper and lower expected utilities. The size of the interval formed by the difference between the upper and lower expected utilities, $E^*[U(d)]$ and $E_*[U(d)]$, is maximum when the extent of knowledge about the unknown $\theta$ is complete ignorance.
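As a rough numerical illustration of (3.19), the following Python sketch evaluates the two bounds on a discrete contiguous frame, where the integrals reduce to sums over intervals. The function name `expected_utility_bounds`, the five-point grid, the BPA masses and the linear cost function are all hypothetical choices made for illustration; none of them comes from the thesis. The three BPAs place all mass on the whole frame, on a mixture of intervals, and on singletons only.

```python
def expected_utility_bounds(bpa, utility, grid):
    """Return (lower, upper) expected utility for a discrete contiguous frame.

    bpa     -- dict mapping index pairs (i, j), i <= j, to BPA masses summing to 1
    utility -- function of theta for one fixed decision option
    grid    -- ordered list of the discrete theta values
    """
    lower = 0.0
    upper = 0.0
    for (i, j), mass in bpa.items():
        # Utility values attainable inside the interval [theta_i, theta_j].
        values = [utility(grid[k]) for k in range(i, j + 1)]
        lower += mass * min(values)  # mass placed at the minimum-utility point
        upper += mass * max(values)  # mass placed at the maximum-utility point
    return lower, upper


grid = [0.0, 0.25, 0.5, 0.75, 1.0]


def cost(theta):
    """Illustrative linear cost, not taken from the thesis."""
    return 40.0 + 60.0 * theta


ignorance = {(0, 4): 1.0}                       # all mass on the whole frame
weak = {(0, 4): 0.5, (1, 2): 0.3, (3, 3): 0.2}  # mixed interval masses
bayesian = {(1, 1): 0.4, (2, 2): 0.6}           # singletons only

for name, bpa in [("ignorance", ignorance), ("weak", weak), ("bayesian", bayesian)]:
    print(name, expected_utility_bounds(bpa, cost, grid))
```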
The interval becomes zero either when the knowledge about $\theta$ is sufficiently strong and precise to specify a conventional probability distribution, as in the Bayesian situation, or when the true value of $\theta$ is known and the problem becomes deterministic. In the intermediate situation where the information is weak or imprecise, the width of the interval is affected by the form of the utility function, the amount of information available and the degree of disagreement among different sources of information brought to bear on the problem. The significance of this interval will be discussed further in Chapter 10.

3.6.3 Making a Decision

Since for each decision choice $d$ the D-S analysis will yield both upper and lower expected utilities, there is no obvious way to rank the decisions and consequently select the "optimal" decision alternative. In engineering practice, one criterion for making a decision is to choose the alternative which minimizes the maximum, i.e. upper, expected cost. This is called the mini-upper decision criterion in D-S analysis [Dempster and Kong, 1987]. Since this criterion minimizes the worst possible consequence, under the constraint of the information available, it generally leads to a conservative decision. The mini-upper decision criterion (i.e. minimizing the upper expected cost) closely resembles the conventional minimax decision principle [Berger, 1985], where the best decision is also the one that minimizes the worst possible consequence.

The decision based on the mini-upper criterion may not be the optimal choice, which is unknown under the weak information condition. Rather, it can be considered as a reliable, convenient, and satisfactory decision with some desired properties. For more discussion on the topic of decision making under imprecise probability, see Walley [1991].

When making a
decision based on the mini-upper decision approach, the information reflected in the intervals formed by the upper and lower expected utilities is effectively ignored. A more interpretive approach which utilizes the information contained in all of the results would seem appropriate. Such an alternative will be discussed in Chapter 10.

3.7 Summary

The theoretical elements of the D-S theory, including representation of uncertainty, Dempster's rule of combination of independent sources of information and D-S decision analysis, were introduced in this chapter. In engineering practice, the unknown states of nature are often measurable quantities on a continuum and rankable. The relevant knowledge reflects this and therefore the D-S theory is confined to the appropriate contiguous frame. The graphical representation of the contiguous frame is invaluable in both understanding and utilizing the D-S theory. The D-S decision analysis leads to indeterminism in decision making which is a direct result of the weak information input. One reasonable way of making a decision, when the utility function resembles cost, is the mini-upper criterion, which is analogous to the conventional minimax decision criterion, but more interpretive alternatives will be proposed in Chapter 10. In Chapters 4 to 6, some engineering-relevant examples of D-S statistical inference of unknown states of nature will be discussed.

Chapter 4

D-S Statistical Inference - Binomial Parameter

4.1 Introduction

In Chapter 3 the basic ingredients of the D-S theory, including the representation of knowledge and uncertainty for some unknown state of nature using a BPA and D-S decision analysis, were introduced. As in conventional Bayesian analysis, the BPA inference in the D-S approach can be based on two sources of information.
One source is the subjective knowledge which represents some expert's personal experience about the state of nature, and the other source is the sampling record in conjunction with a statistical model. In this chapter, the D-S inference of a binomial parameter will be discussed. The theory is based on Dempster's work [Dempster, 1968b]. The inferential results for the unknown binomial parameter based on sample information are presented in Section 4.2. In Section 4.3, the incorporation of prior information in drawing the parameter inference is discussed. An example is then given in Section 4.4 to demonstrate the application of D-S inferential results in decision making in a water resources engineering context.

The D-S inference of the simple binomial model parameter is of great significance for a number of reasons. Firstly, it clearly demonstrates one D-S inferential procedure, and is easily contrasted with the nearest Bayesian case. Secondly, it forms the basis for D-S inference of the parameters for a more general statistical model. And thirdly, the binomial model has been widely adopted in water resources engineering practice as a probability model to describe the occurrence of random events through return periods.

4.2 D-S Binomial Parameter Inference from Sample Data

Let $Z$ be a Bernoulli random variable whose outcome is either S (success) or F (failure). The probability of observing $Z = S$ is denoted as $\theta$, $0.0 \le \theta \le 1.0$, which is the unknown parameter of the Bernoulli variable $Z$. If a set of $n$ observations is obtained, then the total number of S's, which is denoted as $r$, follows a binomial distribution with the same parameter $\theta$. The binomial probability distribution with parameter $\theta$ is

$$
p(r) = \binom{n}{r} \theta^{r} (1 - \theta)^{n-r} \tag{4.1}
$$

The inference procedure presented here involves introducing another random variable with known probability distribution.
Let $W$ represent this random variable, whose sampling distribution is denoted as $\mu(w)$ and is uniform on $[0.0, 1.0]$. The introduction of $W$ is simply recognition of the fact that the process of observing $Z$ is itself a random process. Thus, obtaining an observation of $Z$ is equivalent to drawing a sample from $W$, but the true value of this sample cannot be identified. The purpose and justification of introducing the random variable $W$ in the inference of $\theta$ has been discussed by Dempster [1968b]. A simple explanation of the role of $W$ in the inference will be given in the familiar context of a Monte Carlo simulation of the outcome $Z$. This explanation will lead more naturally, if less rigorously, to the inferential result for $\theta$ from the observation of $Z$.

A Monte Carlo simulation technique can be used to generate random values which follow a given probability distribution. The simulation can be performed by first generating a random number from a uniform distribution on $[0.0, 1.0]$ and then transforming it to another random number which follows the required distribution. A detailed introduction to Monte Carlo simulation and its application can be found in most introductory books on probability and statistical methods in civil engineering, e.g. Ang and Tang [1984, Vol. II]. In the situation considered here, if the value of parameter $\theta$ is known, Monte Carlo simulation can be used to generate the outcome of $Z$, which follows a Bernoulli trial with parameter $\theta$. This involves drawing a sample value $w$ from $W$ according to the uniform distribution $\mu(w)$. If $w \le \theta$, then $Z = S$ is obtained, and if $w > \theta$, $Z = F$ is obtained.

The difference between the conventional Monte Carlo simulation procedure and the inference situation is that, in the latter case, the observations of $Z$ are obtained while the value of $\theta$ is unknown and must be inferred. First, consider just a single observation of $Z$ which is assumed to be a success, i.e. $Z = S$.
It can be concluded from this observation that the value of $w$ must have been less than or equal to $\theta$, or, equivalently, that $\theta$ must lie somewhere in the interval $[w, 1.0]$, as shown in Figure 4.1. In the representation provided by the contiguous BPA diagram for $\theta$, which is bounded by 0.0 and 1.0, this interval must correspond to a point on the upper horizontal edge of this diagram. Since the probability density of obtaining any specific value $w$ is 1.0, as it is drawn from the uniform distribution, the density of $\theta$ being in the interval $[w, 1.0]$ is also 1.0. The specific value of $w$ cannot be identified, but this density of 1.0 is applicable to all possible values of $w$ in its range from 0.0 to 1.0. If this result is entered into the contiguous BPA diagram it will be seen that a continuous uniform BPA density is obtained on the edge BC of the contiguous frame, as shown in Figure 4.2a. This continuous uniform BPA density therefore represents the BPA inference from a single observation $Z = S$.

Figure 4.1: The inferred relationship between $w$ and $\theta$ given an observation $Z = S$ (from Caselton and Luo [1992]).

Figure 4.2: The BPA density for observation (a) $Z = S$; (b) $Z = F$ (from Caselton and Luo [1992]).

The adoption of the uniform distribution for $W$ facilitates explanation of this key step from sample result to BPA. It can be shown that any other assumed distribution for $W$, together with an appropriately modified simulation rule, produces the same uniform BPA density result on the edge BC.
Thus the inferred uniform distribution on the edge BC is "exact" and not simply duplicating the distribution adopted for $W$.

Once the bivariate density distribution on the contiguous frame has been obtained for the single sample $Z = S$, the corresponding commonality function can be determined. From the definition in (3.6) and Figure 4.2a, the result is

$$
H([u,v]) = u \tag{4.2}
$$

Similarly, if an observation $Z = F$ is obtained, it implies that $\theta$ lies in an interval $[0, w]$. Again, since the probability density of obtaining $w$ is 1.0, the probability density of $\theta$ being in the interval $[0, w]$ is also 1.0. As $w$ varies uniformly from 0.0 to 1.0, this leads to a continuous probability density function on the edge AB of the contiguous frame, as shown in Figure 4.2b. The expression of $H([u,v])$ for this sample result can be deduced from Figure 4.2b and is

$$
H([u,v]) = 1 - v \tag{4.3}
$$

So far, it has been shown that for any single observation of $Z$ (i.e. either $Z = F$ or $Z = S$), there is an inferential result about the unknown parameter $\theta$ which is expressed by the commonality function (4.2) or (4.3). If a set of observations of $Z$ is obtained, then the BPAs from individual observations can be combined, using Dempster's rule of combination (3.9), to produce a resultant BPA. This is demonstrated in the following example.

EXAMPLE 4.1. If just two observations are made and $Z = F$ and $Z = S$ are obtained, then the resultant commonality function is given by (3.9) and will be:

$$
H([u,v]) = K u (1 - v)
$$

The BPA density function $m([u,v])$ can be obtained through (3.15) and is equal to $K$. The integral of this density over the triangular region is $K/2$ and must be set to unity. $K$ must therefore be equal to 2.
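The normalization in Example 4.1 can be checked numerically. The sketch below is only an illustration under stated assumptions: the function name `normalizing_constant`, the grid resolution, and the half-weighting of the cells cut by the diagonal are choices made here, not part of Dempster's procedure. It applies (3.15) in integrated form, cell by cell, to an unnormalized commonality function on the frame [0, 1], sums the resulting BPA mass over the triangular region, and takes K as the reciprocal of that sum.

```python
def normalizing_constant(h, steps=400):
    """Estimate K such that K * h(u, v) is a normalized commonality function.

    h is an unnormalized commonality function on the frame [0, 1]. The BPA
    mass in a grid cell is the negative mixed difference of h over the cell,
    which equals the integral of -d2h/dudv there (relation (3.15)).
    """
    d = 1.0 / steps
    total = 0.0
    for i in range(steps):
        u0, u1 = i * d, (i + 1) * d
        for j in range(i, steps):
            v0, v1 = j * d, (j + 1) * d
            mass = -(h(u1, v1) - h(u0, v1) - h(u1, v0) + h(u0, v0))
            # Cells on the diagonal lie only half inside the region u <= v.
            total += 0.5 * mass if i == j else mass
    return 1.0 / total


# One failure and one success: the product of (4.3) and (4.2).
k = normalizing_constant(lambda u, v: u * (1.0 - v))
print(k)
```

For this bilinear commonality function the cell masses are all equal, so the estimate should agree with the analytical value of 2 to within rounding error. For other products of (4.2) and (4.3) the half-cell treatment of the diagonal is only approximate, and a finer grid may be needed.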
The BPA density for this two sample case is thus uniform over the entire contiguous frame.

The upper and lower marginal density functions $g(v)$ and $f(u)$ of $V$ and $U$, which are two members of the compatible probability distributions of $m([u,v])$, can be calculated from (3.18). They are

$$
f(u) = 2(1 - u), \qquad g(v) = 2v
$$

For the general case when there are $n$ observations of $Z$, $r$ of them being S's and $n - r$ of them being F's, the commonality function (from Dempster [1968b]) can be determined as

$$
H([u,v]) = \binom{n}{r} u^{r} (1 - v)^{n-r} \tag{4.4}
$$

The corresponding BPA density function can be obtained from (3.15), which is

$$
m([u,v]) = r (n - r) \binom{n}{r} u^{r-1} (1 - v)^{n-r-1} \tag{4.5}
$$

The lower and upper marginal density functions for this general case are

$$
\begin{aligned}
f(u) &= r \binom{n}{r} u^{r-1} (1 - u)^{n-r} \\
g(v) &= \binom{n}{r} (n - r) v^{r} (1 - v)^{n-r-1}
\end{aligned}
\tag{4.6}
$$

for $0 < r < n$. For the extreme cases: when $r = 0$ then $f(u) = 0$ and $g(v) = n(1 - v)^{n-1}$; and when $r = n$ then $f(u) = n u^{n-1}$ and $g(v) = 0$. Note that the two marginal density functions $f(u)$ and $g(v)$ are Beta distributions. The corresponding cumulative probability distributions, $F(u)$ and $G(v)$, were given by Dempster and are

$$
\begin{aligned}
F(u) &= 1 - \sum_{x=0}^{r-1} \binom{n}{x} u^{x} (1 - u)^{n-x} \\
G(v) &= \sum_{x=r+1}^{n} \binom{n}{x} v^{x} (1 - v)^{n-x}
\end{aligned}
\tag{4.7}
$$

The following example shows the plots of the BPA density and the two marginal distributions obtained above for some general sample situations. A comparison of the D-S inferential results with the parallel Bayesian results is also addressed.

EXAMPLE. Consider a series of situations where the ratio $r/n$ is kept at 1/3 for two different $n$ values reflecting relatively small, $n = 6$, and large, $n = 30$, sample sizes. The corresponding two BPA density functions can be determined from (4.5) and are plotted in Figure 4.3. When $n = 6$, indicating that the sampling information is very weak, the inferred BPA of $\theta$, see Figure 4.3a, is widely spread over the contiguous frame.
When the sample size $n$ is increased to 30, the sampling information becomes stronger and the corresponding BPA converges towards $\theta = 1/3$, as shown in Figure 4.3b. In the extreme case as $n$ goes to infinity, the BPA density function converges to a unit spike at the point $\theta = 1/3$.

Figure 4.3: The BPA density $m([u,v])$ for observations (a) $n = 6$, $r = 2$; (b) $n = 30$, $r = 10$ (from Caselton and Luo [1992]).

The two marginal density functions $f(u)$ and $g(v)$ for these two cases are obtained from (4.6). Since they also represent two possible compatible probability distributions of parameter $\theta$, we will simply refer to them as $f(\theta)$ and $g(\theta)$. These are plotted in Figures 4.4a and 4.4b for $n = 6$ and $n = 30$ respectively.

Figure 4.4: The D-S marginal distributions $f(\theta)$ and $g(\theta)$, and the Bayesian posterior distribution $\pi(\theta)$ with uniform prior for (a) $n = 6$, $r = 2$; (b) $n = 30$, $r = 10$ (from Caselton and Luo [1992]).

Also superimposed onto Figures 4.4a and 4.4b are the Bayesian posteriors $\pi(\theta)$ obtained when the same two cases are solved using the conventional Bayesian methodology and a uniform prior. Recall that $f(\theta)$ and $g(\theta)$ are just two of the many compatible distributions which can be drawn from a resultant BPA. As compatible distributions are considered to be conventional distributions, $f(\theta)$ and $g(\theta)$ qualitatively reflect uncertainty concerning $\theta$ in a similar fashion to the Bayesian posterior. However, they have no quantitative equivalence to the Bayesian posterior except for a tendency, shared with all the other compatible distributions of a resultant BPA, to converge on the Bayesian posterior when the sample size $n$ grows large. This convergence is not
fully achieved until both methods produce unit spikes with an infinite sample.

The "distance" between $f(\theta)$ and $g(\theta)$ might be viewed as an indication of the uncertainty due to imprecision which is detected by the D-S scheme. Exactly how this distance should be measured is an interesting but still open question. If, as in the Bayesian approach, the posterior mean is used as a point estimate of the parameter $\theta$, then $g(\theta)$ and $f(\theta)$ will yield, respectively, the highest and lowest posterior mean values attainable from any of the compatible distributions. The distance between these point estimates might then be viewed as one possible quantification of imprecision. As described in Section 3.6 of Chapter 3, a related property of $f(\theta)$ and $g(\theta)$ that plays a central role in the D-S decision analysis is that, with any monotone utility function on $\theta$, they will yield the highest and lowest expected utilities of all compatible distributions of the resultant BPA.

4.3 Incorporating Prior Information

In Section 4.2 the inference of the unknown binomial parameter $\theta$ based solely on sampling information was discussed. This can be considered as synonymous with the situation where the prior knowledge is complete ignorance. In the more general situation one usually has some prior knowledge about the parameter $\theta$, and this may have a profound effect on the inferential results for $\theta$, especially when the sampling information is weak. The incorporation of such prior information for the case of binomial parameter inference is therefore considered here. The D-S representation of subjective knowledge for more general situations will be discussed in Chapter 7.

First consider the situation where the prior evidence is strong and without any imprecision, meaning that it can be correctly expressed as a Bayesian prior probability distribution or, equivalently, as a Bayesian BPA. The discrete case has been considered by Dempster [1968b]. Here the continuous Bayesian BPA density, denoted as $p(\theta)$ on $\theta$ where $0 \le \theta \le 1.0$, will be considered. The commonality function in this case can be expressed as

$$
H([u,v]) =
\begin{cases}
p(\theta) & \text{for } u = v = \theta \\
0 & \text{otherwise}
\end{cases}
$$

Combining this type of prior information with the Binomial sampling information (4.4), using Dempster's rule of combination (3.9), leads to the commonality function

$$
H([u,v]) =
\begin{cases}
\binom{n}{r} \, p(\theta) \, \theta^{r} (1 - \theta)^{n-r} & \text{when } u = v = \theta \\
0 & \text{otherwise}
\end{cases}
\tag{4.8}
$$

Since the combined commonality function is non-zero only at the singleton points $u = v = \theta$, it represents a conventional probability distribution which, once normalized, is easily recognized as a Bayesian posterior distribution. This demonstrates the basis of the claim that the Bayesian approach is a special case of the more general D-S approach when the prior information can be represented by a conventional probability distribution, i.e. a Bayesian BPA.

Consider now the more general situation where the prior subjective information is weak or imprecise, indicating that it is inappropriate to specify a precise probability distribution. This type of prior information can be more properly represented by a BPA. Here one simple form of prior information, which can be expressed entirely in the statement "the parameter $\theta$ is within interval $[a, b]$ with probability $\alpha$", will be considered. Note that nothing is said in this statement about the assignment of the residual probability $1 - \alpha$.
This piece of information can be expressed in the D-S scheme as

$$
\Pr([u,v]) =
\begin{cases}
\alpha & \text{when } [u,v] = [a,b] \\
1 - \alpha & \text{when } [u,v] = [0,1]
\end{cases}
\tag{4.9}
$$

The corresponding commonality function is

$$
H([u,v]) =
\begin{cases}
1 & \text{when } u \ge a \text{ and } v \le b \\
1 - \alpha & \text{otherwise}
\end{cases}
\tag{4.10}
$$

Combining this with the commonality function from the Binomial sampling information given by (4.4) yields the commonality function

$$
H([u,v]) = K
\begin{cases}
\binom{n}{r} u^{r} (1 - v)^{n-r} & \text{when } u \ge a \text{ and } v \le b \\
(1 - \alpha) \binom{n}{r} u^{r} (1 - v)^{n-r} & \text{otherwise}
\end{cases}
\tag{4.11}
$$

The normalizing factor $K$ can be determined analytically (see Appendix A for mathematical details). Written in a form consistent with (4.7), (4.13) and (4.14), it is

$$
K = \left\{ 1 - \alpha + \alpha \left[ \sum_{x=0}^{r} \binom{n}{x} a^{x} (1 - a)^{n-x} - \sum_{x=0}^{r-1} \binom{n}{x} b^{x} (1 - b)^{n-x} \right] \right\}^{-1}
\tag{4.12}
$$

However, in numerical analysis, the best strategy found in this research for determining the constant $K$ is to sum the probability weights and normalize. Since the commonality function is discontinuous on the triangular region, the corresponding BPA function is also discontinuous. The derivation of the BPA function is given in Appendix A and will not be presented here. The upper and lower marginal density functions $g(v)$ and $f(u)$, which can be determined from the commonality function in (4.11), as also shown in Appendix A, are discontinuous and involve probability spikes at some points. They are

$$
g(v) =
\begin{cases}
K \binom{n}{r} (1 - \alpha)(n - r)(1 - v)^{n-r-1} v^{r} & \text{(density along } 0 < v < a \text{ and } b < v < 1.0\text{)} \\
K \binom{n}{r} (n - r)(1 - v)^{n-r-1} v^{r} & \text{(density along } a < v < b\text{)} \\
K \binom{n}{r} \alpha \, b^{r} (1 - b)^{n-r} & \text{(probability at } v = b\text{)}
\end{cases}
\tag{4.13}
$$

$$
f(u) =
\begin{cases}
K \binom{n}{r} (1 - \alpha) \, r \, u^{r-1} (1 - u)^{n-r} & \text{(density along } 0 < u < a \text{ and } b < u < 1.0\text{)} \\
K \binom{n}{r} \alpha \, a^{r} (1 - a)^{n-r} & \text{(probability at } u = a\text{)} \\
K \binom{n}{r} r \, u^{r-1} (1 - u)^{n-r} & \text{(density along } a < u < b\text{)}
\end{cases}
\tag{4.14}
$$

Note that each of the two marginal distributions expressed in (4.13) and (4.14), though somewhat complex in appearance, is composed of simple Beta distribution forms, except for the probability spikes at $v = b$ and $u = a$ for $g(v)$ and $f(u)$ respectively.
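The numerical strategy mentioned above, summing the probability weights and normalizing, might be sketched as follows. This is an assumption-laden illustration rather than the procedure of Appendix A: the function names, the midpoint grid, and the example values of n, r, a, b and the prior probability (called `alpha` here) are hypothetical. Each grid cell of the sampling BPA density (4.5) is intersected with the two focal intervals of the prior, [0, 1] and [a, b]; weight falling on an empty intersection is discarded, and K is taken as the reciprocal of the weight that remains.

```python
from math import comb


def sample_bpa_density(u, v, n, r):
    """BPA density (4.5) for r successes in n trials; assumes 0 < r < n."""
    return r * (n - r) * comb(n, r) * u ** (r - 1) * (1.0 - v) ** (n - r - 1)


def normalizing_constant(n, r, a, b, alpha, steps=400):
    """Estimate K in (4.11) by summing the non-conflicting probability weights."""
    d = 1.0 / steps
    retained = 0.0
    for i in range(steps):
        for j in range(i, steps):
            if i == j:
                # Diagonal cell: only the triangle u <= v belongs to the frame,
                # so use its centroid and half the cell area.
                u, v, area = (i + 1.0 / 3.0) * d, (j + 2.0 / 3.0) * d, 0.5 * d * d
            else:
                u, v, area = (i + 0.5) * d, (j + 0.5) * d, d * d
            weight = sample_bpa_density(u, v, n, r) * area
            # The intersection of [u, v] with [0, 1] is never empty.
            retained += (1.0 - alpha) * weight
            # The intersection with [a, b] is non-empty only if the intervals overlap.
            if u <= b and v >= a:
                retained += alpha * weight
    return 1.0 / retained


# Hypothetical inputs: r = 2 in n = 6, prior "theta in [0.2, 0.5] with probability 0.8".
print(normalizing_constant(n=6, r=2, a=0.2, b=0.5, alpha=0.8))
```

Choosing a and b on grid lines, as in this example, keeps every cell entirely on one side of each bound; for other values a finer grid or an explicit treatment of the boundary cells would be advisable.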
The information described by the statement "the parameter $\theta$ is within interval $[a, b]$ with probability $\alpha$" represents a very simple but commonly occurring expression of weak knowledge. It represents the situation where the individual expressing his subjective knowledge has some sense of an interval in which the unknown parameter value might lie, with some specifiable probability, but is unable to specify the distribution of probability within this interval. Furthermore, he feels noncommittal about the remaining unassigned probability but uncomfortable with simply allocating it to the complement of the interval.

In this section, the D-S inference of the binomial parameter based on both sampling information and subjective prior information has been discussed. The results can then be used in the D-S decision analysis. In the following section, a water resources decision making example is presented to demonstrate such an application.

4.4 An Application

4.4.1 Description of the Problem

The example adopted here is developed from one given by Benjamin and Cornell [1970, pp. 232-235]. The example is simple but sufficient to demonstrate the D-S decision analysis based on the inferential results presented in the preceding section.

Suppose that a highway drainage culvert needs to be designed and thus requires the adoption of a design flood value. The design is to be based on the criterion of minimizing the combined construction and expected flood damage costs. Only two decision choices for the design flood will be considered: a design flow value of 40 m³/s, which will be identified as $Q_{d,1}$; and a flow value of 50 m³/s, which will be identified as $Q_{d,2}$.

A 6 year record of annual maximum floods at the design site indicates only that the annual maximum flood has exceeded the 40 m³/s value on just two occasions while the 50 m³/s value has been exceeded on only one occasion.
No other sample information is available.

A simple Binomial model will be adopted to describe the random sequence of annual exceedances of any specified flood value. This implies the existence of fixed probabilities, $\theta_1$ and $\theta_2$, of exceedance by annual maximum floods for the two design flood choices. The true values of these probabilities are not known.

Utilities are considered to be directly equivalent to monetary cost. The utility value corresponding to the construction cost is assumed to be proportional to the design flood, i.e. a constant $s$ times the design flood. The value of $s$ is 1.0. In any year, if the flood exceeds the design value, a constant damage cost of utility $c$ will occur; if the flood is less than the design value, there will be no damage. The value of $c$ is 3.0. It will also be assumed that, after each exceeding flood, the culvert will not be destroyed and the capacity of the culvert is not changed. With the 20 year service life of the culvert the total expected cost can be expressed as:

$$
U(Q_{d,i}, \theta_i) = s Q_{d,i} + \sum_{x=0}^{20} c \, x \, P_i[x \text{ failures occur}] = s Q_{d,i} + 20 c \, \theta_i \tag{4.15}
$$

where $i$ identifies a choice of design flood value, $i = 1$ or $i = 2$. The construction cost for any given design $Q_{d,i}$ is a constant and the expected cost becomes a function of just the uncertain annual exceedance probability $\theta_i$. Note that the utility function expressed in (4.15) is a linear monotone increasing function of parameter $\theta_i$. Therefore, (3.20) can be used in calculating the upper and lower expected utilities in the D-S approach.

Three cases will be considered here. The first case assumes no prior information and also ignores the entire record data, i.e. considers the decision analysis based on complete ignorance. This extreme case is presented here mainly to demonstrate the maximum imprecision embodied in complete ignorance. The second case considers the record as the sampling information but still assumes that there is no prior knowledge.
And the third case considers both prior and sampling information. The parallel conventional Bayesian approach will also be presented and the results will be compared with the D-S analysis for the first and second cases.

4.4.2 Case I: Complete Ignorance

Bayesian Analysis. Bayesian decision analysis utilizes a noninformative prior distribution to represent complete ignorance. For the binomial parameter $\theta$, there is no unique noninformative prior for the Bayesian analysis, and four noninformative priors which are considered to be plausible have been proposed [Berger, 1985]:

$$
\pi_1(\theta) = 1, \qquad
\pi_2(\theta) = \theta^{-1} (1 - \theta)^{-1}, \qquad
\pi_3(\theta) \propto \theta^{-1/2} (1 - \theta)^{-1/2}, \qquad
\pi_4(\theta) \propto \theta^{\theta} (1 - \theta)^{1 - \theta}
$$

Note that $\pi_2(\theta)$ is an improper probability density function. Since there is no sampling information, the Bayesian posterior for $\theta$ will be the same as the noninformative prior.

Since the utility $U(Q_{d,i}, \theta_i)$ is a linear function of the unknown parameter $\theta_i$, only the expected value of $\theta_i$ is needed to calculate the expected utility. The expected value of $\theta_i$ is 0.5 for $\pi_1$, $\pi_3$, and $\pi_4$ and is nonexistent for the improper prior $\pi_2$. The expected utilities $E_{\pi_j}[U(Q_{d,i})]$ based on the four noninformative priors $\pi_j$, $j = 1, \ldots, 4$, with respect to $\theta_i$ for the two decisions are given in Table 4.1. Note that, for each decision, the expected utilities from the three proper noninformative priors are the same because of the linearity of the utility function. If the utility function had been nonlinear with respect to the parameter $\theta_i$, then the noninformative priors would not necessarily produce identical expected utilities.

According to the usual Bayesian decision criterion of minimizing expected utility (here equated with cost),
$Q_{d,1}$ would be chosen as the design flood. Here, the Bayesian decision analysis gives a clear preference between the two design options even though there is no information about the unknown parameter.

Table 4.1: D-S and Bayesian decision results in Case I: complete ignorance (from Caselton and Luo [1992]).

Decision    D-S                   Bayesian
choice      E_*[ ]    E^*[ ]      E_pi1[ ]   E_pi2[ ]   E_pi3[ ]   E_pi4[ ]
Q_d,1       40        100         70         n/a        70         70
Q_d,2       50        110         80         n/a        80         80

Here n/a denotes "not applicable".

D-S Analysis. The D-S method represents total ignorance by the complete ignorance prior, which assigns probability 1.0 to the union representing the entire parameter range, i.e. $m(\Theta) = 1.0$. With the contiguous frame this is equivalent to a unit spike at the upper left corner point of the triangular BPA diagram. The commonality function in this case is $H([u,v]) = 1.0$ and the upper and lower marginal densities for both decision choices are also unit spikes at the extreme ends of the range of $\theta$, i.e.

$$
f(\theta) =
\begin{cases}
1 & \theta = 0 \\
0 & \text{otherwise}
\end{cases}
\qquad
g(\theta) =
\begin{cases}
1 & \theta = 1 \\
0 & \text{otherwise}
\end{cases}
$$

The utility function and the above two marginal density functions, in conjunction with (3.20), provide the following expressions for the upper and lower expected costs:

$$
\begin{aligned}
E^*[U(Q_{d,i})] &= s Q_{d,i} + 20 c \\
E_*[U(Q_{d,i})] &= s Q_{d,i}
\end{aligned}
$$

With $s = 1$ and $c = 3$, the upper and lower expected values can be calculated and, together with the Bayesian results, these are given in Table 4.1.

If the mini-upper expected utility criterion is adopted then the D-S decision choice in this case should be $Q_{d,1}$, which yields the value 100. At the same time it can be noted that the difference between the lower and upper expected utility values for both decision choices is very large compared with the difference between the upper expected consequences of the two decisions, as well as between the lower expected consequences. This is a reflection of the substantial uncertainty involved in the parameters $\theta_1$ and $\theta_2$.
In effect, by providing a more explicit recognition of the full uncertainties involved, the D-S analysis is indicating near indifference between the two decision choices. In contrast, the Bayesian result produces just a single expected cost for each decision while the posterior parameter uncertainty is the same as the adopted informationless prior.

4.4.3 Case II: Using the Record Information

Bayesian Analysis. It is again assumed that there is no prior information and the four noninformative prior distributions will be considered.

The sample likelihood can be expressed as:

$$l(\theta_i/x) \propto \theta_i^{r_i}(1-\theta_i)^{6-r_i}$$

where $r_i$ is the number of annual exceedances of the design flood $Q_{d,i}$ observed in the record. With the four noninformative priors, the four posterior distributions are

$$\pi_1(\theta_i/x) \propto \theta_i^{r_i}(1-\theta_i)^{6-r_i}$$

$$\pi_2(\theta_i/x) \propto \theta_i^{r_i-1}(1-\theta_i)^{5-r_i}$$

$$\pi_3(\theta_i/x) \propto \theta_i^{r_i-0.5}(1-\theta_i)^{5.5-r_i}$$

$$\pi_4(\theta_i/x) \propto \theta_i^{r_i+\theta_i}(1-\theta_i)^{7-r_i-\theta_i}$$

Note that $\pi_1(\theta_i/x)$, $\pi_2(\theta_i/x)$ and $\pi_3(\theta_i/x)$ are Beta distributions. For each decision option $Q_{d,i}$ the expected utilities from the four posterior distributions can then be calculated and are shown in Table 4.2. The expected costs are different for each decision option depending on the choice of the noninformative prior, and there is no obvious reason to prefer any individual result over the others. The noninformative priors $\pi_1$, $\pi_3$ and $\pi_4$ all support $Q_{d,1}$ as the design flood decision, but $\pi_2$ indicates indifference between $Q_{d,1}$ and $Q_{d,2}$.

Table 4.2: D-S and Bayesian decision results in Case II: record information only (from Caselton and Luo [1992]).

| Decision choices | D-S $E_*[\cdot]$ | D-S $E^*[\cdot]$ | Bayesian $E_{\pi_1}[\cdot]$ | Bayesian $E_{\pi_2}[\cdot]$ | Bayesian $E_{\pi_3}[\cdot]$ | Bayesian $E_{\pi_4}[\cdot]$ |
|---|---|---|---|---|---|---|
| $Q_{d,1}$ | 57.14 | 65.71 | 62.50 | 60.00 | 61.43 | 61.75 |
| $Q_{d,2}$ | 58.57 | 67.14 | 65.00 | 60.00 | 62.86 | 63.78 |

D-S Analysis. The D-S analysis can be performed with just the information drawn from a data record, and a prior is not necessary.
Furthermore, as was mentioned previously, the introduction of the noninformative prior BPA into the D-S analysis has no effect on the result.

From (4.6), if the total number of exceedances of the design flood in the record is $r_i$ in 6 observations, then the two marginal distributions for parameter $\theta$ are

$$f(u) = r_i\binom{6}{r_i}u^{r_i-1}(1-u)^{6-r_i}$$

$$g(v) = (6-r_i)\binom{6}{r_i}v^{r_i}(1-v)^{5-r_i} \qquad (4.16)$$

for $0 < r_i < n$.

Knowing the utility function and the marginal density functions, the upper and lower expected values of $U(Q_{d,i}, \theta_i)$ can then be calculated from (3.20) as

$$E^*[U(Q_{d,i})] = sQ_{d,i} + 20c\int_0^1 \theta_i\, g(\theta_i)\,d\theta_i = sQ_{d,i} + 20c\,\frac{r_i+1}{7}$$

$$E_*[U(Q_{d,i})] = sQ_{d,i} + 20c\int_0^1 \theta_i\, f(\theta_i)\,d\theta_i = sQ_{d,i} + 20c\,\frac{r_i}{7} \qquad (4.17)$$

Equation (4.17) reveals, for binomial sampling and this simple utility function, the influence of the various elements of the problem on the magnitude of, and difference between, the upper and lower expected utilities. In the general case with record length $n$ years, the denominator of the second term in (4.17) is $n + 1$. For comparison, the equivalent Bayesian expected utility with the uniform prior, i.e. $\pi_1$, yields $E_{\pi_1}[U(Q_{d,i})] = sQ_{d,i} + 20c(r_i + 1)/(n + 2)$.

With $s = 1$ and $c = 3$, the upper and lower expected values are 65.71 and 57.14 for $Q_{d,1}$, and 67.14 and 58.57 for $Q_{d,2}$. These, together with the Bayesian results, are summarized in Table 4.2.

The D-S results indicate that the ranges between upper and lower values have narrowed considerably for both decisions as a result of introducing the record. Also, the upper D-S expected values and the Bayesian expected values have all been reduced as a result of the diminished assessment of risk, while the lower D-S expected values have risen from their Case I values. There is some overlap between the ranges for the two decisions which makes the choice less clear cut.
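The closed forms in (4.17) and the Beta posterior means can be evaluated directly. The sketch below makes two assumptions that should be checked against the full example: the exceedance counts $r_1 = 2$ and $r_2 = 1$ are inferred from the maximum likelihood return periods of 3 and 6 years for the 6-year record, and the three Beta-type priors are encoded by hypothetical shape pairs (a, b) so that the posterior mean is $(r + a)/(n + a + b)$. The prior $\pi_4$ is omitted because its posterior mean has no closed form.

```python
# Case II (record only): D-S bounds from (4.17) and Bayesian posterior means.
S, C, N = 1.0, 3.0, 6                       # s, c and record length n from the text
RECORD = {"Qd1": (40.0, 2), "Qd2": (50.0, 1)}   # (design flood, exceedances r_i); r_i inferred

# Prior theta^(a-1) * (1-theta)^(b-1): uniform, the improper pi_2, and pi_3.
PRIOR_SHAPES = {"pi1": (1.0, 1.0), "pi2": (0.0, 0.0), "pi3": (0.5, 0.5)}


def cost(q, mean_theta):
    """Expected cost for a linear utility s*Q + 20*c*theta."""
    return S * q + 20.0 * C * mean_theta


def ds_bounds(q, r, n=N):
    """Lower and upper expected costs: marginal means r/(n+1) and (r+1)/(n+1)."""
    return cost(q, r / (n + 1)), cost(q, (r + 1) / (n + 1))


def bayes_costs(q, r, n=N):
    """Expected cost under each Beta posterior, mean (r+a)/(n+a+b)."""
    return {
        name: cost(q, (r + a) / (n + a + b))
        for name, (a, b) in PRIOR_SHAPES.items()
    }


case2_ds = {name: ds_bounds(q, r) for name, (q, r) in RECORD.items()}
case2_bayes = {name: bayes_costs(q, r) for name, (q, r) in RECORD.items()}
mini_upper_choice = min(case2_ds, key=lambda name: case2_ds[name][1])

for name in RECORD:
    lower, upper = case2_ds[name]
    print(name, round(lower, 2), round(upper, 2),
          {k: round(v, 2) for k, v in case2_bayes[name].items()})
```

Under these assumptions the printed values can be compared with the first five numerical columns of Table 4.2, and `mini_upper_choice` gives the decision selected by the mini-upper criterion.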
According to the mini-upper criterion the decision choice should be $Q_{d,1}$.

4.4.4 Case III: Using Both Prior and Record Information

In the preceding two cases the prior information was nonexistent, and was treated as being equivalent to complete ignorance. In the case considered here, suppose that an experienced engineer overseeing the culvert design project expressed some subjective knowledge which is independent of the record. He has a feeling that the return period of the 40 m$^3$/s flood event is somewhere in the 2 to 4 year region. Furthermore, he also feels that the 50 m$^3$/s event has a return period somewhere above 4 years but less than 10 years. In neither case has he any sense of the way that the unknown parameter might be distributed within or outside these intervals. His prior information might be expressed along the following lines: there is a 0.9 probability that the parameter $\theta_1$ lies in the interval [0.25, 0.50] and a residual probability 0.1 that says nothing about $\theta_1$; also, there is a 0.9 probability that the parameter $\theta_2$ lies in the interval [0.10, 0.25] and a residual probability 0.1 that says nothing about $\theta_2$.

The above conforms to the type of weak prior information discussed in the previous section on incorporating prior information and represents a simple and natural expression of knowledge under uncertainty. While the D-S scheme is able to reflect such a statement faithfully, it presents some difficulties in the Bayesian approach. To use the Bayesian approach, further elicitation is needed and this will involve more time, effort, and possibly a risk of distorting the knowledge.

It should be noted that the record suggests that the maximum likelihood return period for the 40 m$^3$/s event is 3 years and for the 50 m$^3$/s event is 6 years. Thus
the prior information can be considered to be in general agreement with the record for both the $Q_{d,1}$ and $Q_{d,2}$ events.

For each decision option $Q_{d,i}$, the resultant commonality function and the upper and lower marginal density functions are determined from (4.11), (4.13) and (4.14). With the utility function expressed in equation (4.15), the upper and lower expected utilities can then be calculated from (3.20), and are given in Table 4.3.

Table 4.3: D-S decision results in Case III: prior and record information.

| Decision choices | D-S $E_*[\cdot]$ | D-S $E^*[\cdot]$ | Bayesian $E_{\pi_1}[\cdot]$ | Bayesian $E_{\pi_2}[\cdot]$ | Bayesian $E_{\pi_3}[\cdot]$ | Bayesian $E_{\pi_4}[\cdot]$ |
|---|---|---|---|---|---|---|
| $Q_{d,1}$ | 57.62 | 65.15 | n/a | n/a | n/a | n/a |
| $Q_{d,2}$ | 58.05 | 63.41 | n/a | n/a | n/a | n/a |

Here n/a denotes "not applicable."

Comparing with the D-S results for Case II, as shown in Table 4.2, it can be seen that the interval formed by the upper and lower expected utilities in Table 4.3 is narrower for each decision $Q_{d,i}$. This is a result of incorporating the prior information which, as pointed out earlier, is in general agreement with the sample information. According to the mini-upper decision criterion, $Q_{d,2}$ should be chosen as the decision. This is in contrast with Case II where $Q_{d,1}$ was the decision. Hence the consideration of prior information in this case has changed the decision. The reason for the alteration in decision is that the prior information about $Q_{d,2}$ indicates rather strongly that the decision $Q_{d,2}$ has even smaller risk.

4.5 Summary

In this chapter, the D-S inference of a binomial model parameter based on sampling information was discussed. The interpretation of the inferential process from a Monte Carlo simulation point of view is convincing and facilitates understanding of the inferential scheme.
The incorporation of prior information into the inference was also discussed and it was revealed that when the prior knowledge supports a conventional Bayesian prior, the final inferential results will be the Bayesian posterior.

The inferential results were demonstrated in a simple water resources decision making example where three cases with increasing levels of information were considered. The comparison of D-S results with the parallel Bayesian results in Case II revealed that for each decision option, the four Bayesian expected utilities corresponding to the four noninformative priors all fall within the D-S upper and lower expected utility range. This indicates that there is a general agreement on the magnitude of the assessed risk between the conventional Bayesian and D-S methods. This kind of agreement is also mentioned by Wasserman [1988], and adds support to the D-S results from a Bayesian perspective.

On the other hand, the four Bayesian expected utilities obtained for each decision option span only about 30% of the D-S range in the case of $Q_{d,1}$ and 60% in the case of $Q_{d,2}$. Thus only some of the imprecision detected by the D-S analysis appears to be reflected in the Bayesian analysis even when four quite different ignorance priors are considered. While this one example is clearly insufficient to draw general conclusions, it does indicate that some substantial uncertainties have been overlooked in the Bayesian approach.

Chapter 5

D-S Statistical Inference - Likelihood Based BPA

5.1 Introduction

In Chapter 4, the D-S inference of a binomial model parameter, based on both sampling information and subjective prior knowledge, was discussed. In the more general situation, however, a random event may be assumed to follow a more complex probability model such as a normal or lognormal distribution.
The parameter of the probability model may be unknown and thus needs to be inferred from the information available. This includes the sample observations of the random event, and some experts' subjective knowledge about the unknown parameter.

The D-S approach to the inference of the unknown parameter involves representing the sampling information and the independent subjective prior knowledge by BPA functions, and then combining these via Dempster's rule of combination to yield a resultant BPA. In this chapter, the D-S inference of the unknown parameter from sampling information will be discussed. A general discussion of D-S representation of subjective information using BPA functions will be presented in Chapter 7.

Once sampling data are obtained and a statistical model is assumed, the sample likelihood function, which summarizes the sample information, is established. It would seem reasonable to expect that the D-S statistical inference of the unknown parameter could be obtained from this sample likelihood function view of the information. The interpretation presented in Section 5.2 represents one such approach. The method was originally developed by Shafer [1976] and subsequently extended and endorsed by Wasserman [1988, 1990a, 1990b]. The axiomatic justification of the results, which reveals some very attractive features, is discussed in Section 5.3. An example is then given in Section 5.4 to demonstrate an application of the D-S inferential results in water resources engineering decision making.

5.2 Sample Likelihood Based BPA

Let $X$ represent a random variable whose outcome $x$ is assumed to be best described by a probability distribution $f(x/\theta)$ where $\theta$ is the unknown parameter and $\theta \in \Theta$. Let $\mathbf{x} = (x_1, x_2, \ldots, x_m)$ be $m$ independent and identically distributed observations of the random variable $X$.
The sample likelihood function from $\mathbf{x}$ is denoted as $L(\theta)$ and calculated from

$$L(\theta) = \prod_{j=1}^{m} f(x_j/\theta) \qquad (5.1)$$

The ratio of the sample likelihood function to the supremum likelihood value is defined as the relative likelihood function and denoted as $R(\theta)$. That is

$$R(\theta) = \frac{L(\theta)}{\sup_{\theta \in \Theta} L(\theta)} \qquad (5.2)$$

If a set of observations $\mathbf{x}$ is obtained, it tends to provide more support to the true $\theta$ value being the one which has the greater chance of producing $\mathbf{x}$. The relative likelihood function $R(\theta)$ expresses the relative chance of each $\theta$ value being able to produce $\mathbf{x}$, i.e. being the truth. Thus for any $\theta_i$ and $\theta_j$, if $R(\theta_i) > R(\theta_j)$, then $\theta_i$ is more likely to be the true parameter value than $\theta_j$. It is then assumed that the plausibility of any individual $\theta$ value being the truth is equal to its relative likelihood $R(\theta)$, i.e. $Pl(\theta) = R(\theta)$.

A further assumption is needed in order to determine the D-S inference of the unknown parameter from the sample likelihood function. Without knowing theoretically how the sample information determines the D-S inference, a reasonable assumption would be that the sample evidence should be consonant (as defined in Section 3.1 of Chapter 3) in providing support to the inference of the unknown parameter $\theta$. Consequently, the BPA inference based on the sample likelihood function should be consonant, i.e. the subsets on which the BPA values are not zero are nested.

Indeed, consider the discrete case where the parameter space $\Theta$ is discretized into $n$ possible values, i.e. $\Theta = \{\theta_1, \theta_2, \ldots, \theta_n\}$, among which only one is the truth. Reorder the parameter space into $\Theta = \{\theta'_1, \theta'_2, \ldots, \theta'_n\}$ according to their relative likelihood values from high to low. Then $\theta'_1$, which would have the highest relative likelihood value, and thus the highest tendency of producing $\mathbf{x}$, deserves some explicit support and therefore should be assigned some non-zero BPA value $m(\theta'_1)$.
The element $\theta'_2$ will not be given any exclusive support as this would conflict with the support for $\theta'_1$. But the subset $\{\theta'_1, \theta'_2\}$ should have more support than $\theta'_1$ alone as it has a greater tendency to produce $\mathbf{x}$. This indicates that the subset $\{\theta'_1, \theta'_2\}$ should be assigned a non-zero BPA value. The same argument can be repeated for progressively larger subsets like $\{\theta'_1, \theta'_2, \ldots, \theta'_i\}$, $i < n$, justifying non-zero BPA values for each of them. Let $A_i = \{\theta'_1, \theta'_2, \ldots, \theta'_i\}$ for $i \le n$. Since all $A_i$ are also nested and include the singleton element $\theta'_1$, the resultant BPA therefore will be a consonant BPA centered on $\theta'_1$.

So far, two assumptions have been introduced and some arguments supporting them have been discussed. These two assumptions can be summarized as follows:

Assumption I. The plausibility of any singleton element $\theta'_i$ is equal to its relative likelihood value, i.e. $Pl(\theta'_i) = R(\theta'_i)$;

Assumption II. The likelihood-based BPA is consonant, assigning BPA values only to the nested subsets $A_i = \{\theta'_1, \theta'_2, \ldots, \theta'_i\}$ where $i = 1, 2, \ldots, n$.

For more discussion on the justification of these two assumptions, see Shafer [1976] and Wasserman [1988]. Based on these two assumptions, a BPA function representing the D-S inference from the sampling information is uniquely determined from the likelihood function. Let $m_x(\cdot)$ denote this BPA, then

$$m_x(A_i) = R(\theta'_i) - R(\theta'_{i+1}) \quad \text{for } i = 1, \ldots, n-1$$

$$m_x(\Theta) = R(\theta'_n) \qquad (5.3)$$

$$m_x(A) = 0 \quad \text{otherwise}$$

The example given later in this chapter demonstrates the generation of a BPA from sample likelihoods.

While the likelihood-based BPA derived above is based on two straightforward assumptions, its derivation can be justified in a more formal way. This involves some basic axioms developed by Wasserman [1988] which any BPA based on the likelihood function would be required to satisfy. Furthermore, it has been shown by Wasserman that the above derived likelihood-based BPA is the only one which satisfies these basic axioms.
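The construction in (5.3) is mechanical enough to sketch in code. The following is a minimal illustration, not the thesis's own implementation: it takes the ten likelihood values listed in Table 5.2 of the application, forms the relative likelihoods, and assigns mass to the nested sets. States with equal relative likelihood are grouped into one focal set, which gives the same result as (5.3) because the intermediate sets would receive zero mass. The function name `consonant_bpa` is hypothetical.

```python
# Consonant BPA from a discrete likelihood function, following (5.3).
# Likelihood values L(theta_i), i = 1..10, as listed in Table 5.2.
LIKELIHOOD = {1: 0.031, 2: 0.041, 3: 0.071, 4: 0.143, 5: 0.408,
              6: 0.143, 7: 0.071, 8: 0.041, 9: 0.031, 10: 0.02}


def consonant_bpa(likelihood):
    """Return a list of (focal_set, mass) pairs, nested from smallest to largest."""
    sup = max(likelihood.values())
    rel = {state: value / sup for state, value in likelihood.items()}
    # Distinct relative-likelihood levels, ordered from high to low.
    levels = sorted(set(rel.values()), reverse=True)
    bpa = []
    for idx, level in enumerate(levels):
        # Focal set: every state at least as plausible as the current level.
        focal = frozenset(s for s, r in rel.items() if r >= level)
        next_level = levels[idx + 1] if idx + 1 < len(levels) else 0.0
        bpa.append((focal, level - next_level))
    return bpa


m_x = consonant_bpa(LIKELIHOOD)
for focal, mass in m_x:
    print(sorted(focal), round(mass, 4))
```

The masses produced this way agree with Table 5.4 to within about 0.002; the small differences are consistent with the tabulated values having been computed from rounded relative likelihoods, which is an inference from the numbers rather than something stated in the text.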
This axiomatic justification of the likelihood-based BPA reveals some very interesting and attractive features and will be discussed in Section 5.3.

5.3 The Axiomatic Justifications of a Likelihood-Based BPA

The first axiomatic condition states that when the sample likelihood values of all $\theta_i \in \Theta$ are the same, meaning that the sample likelihood function does not provide any effective information concerning the state of nature, the corresponding likelihood-based BPA should be a complete ignorance BPA. The second axiomatic condition is that the state of nature $\theta_i$ which has the higher likelihood value should also be given greater support in the derived BPA. And the third axiomatic condition requires that, when the prior subjective knowledge can be accurately expressed by any Bayesian prior distribution, its combination with the likelihood-based BPA should yield a Bayesian posterior distribution and the conventional Bayesian analysis result.

Consider the likelihood-based BPA derived in Section 5.2 and expressed by (5.3). When the likelihood values are the same for all $\theta_i \in \Theta$, the corresponding relative likelihoods $R(\theta_i)$ become 1.0, in which case (5.3) specifies a complete ignorance BPA. This satisfies the first axiomatic condition. The second condition is similar to Assumption I in Section 5.2 and therefore is clearly satisfied in the derived BPA. Now assume that a Bayesian prior distribution is determined from the subjective knowledge and is expressed with exact equivalence as a Bayesian BPA in the D-S theory, i.e. as $m_0(\theta'_i) = p_i$ and $m_0(A) = 0$ otherwise, where $\sum_i p_i = 1.0$. To combine this with the likelihood-based BPA, the commonality functions from the two BPA's will first have to be determined. For the Bayesian BPA, the commonality function $H_0(A)$ has non-zero values only on the singleton elements. That is

$$H_0(\theta'_i) = p_i$$

$$H_0(A) = 0 \quad \text{otherwise}$$
Since $H_0(A)$ has non-zero values only on singleton elements, only the commonality values of singleton elements of the likelihood-based BPA, i.e. $H_x(\theta'_i)$, need be determined to combine it with the prior Bayesian BPA. According to the definition, the commonality value for any singleton element is always equal to its plausibility value. Therefore

$$H_x(\theta'_i) = R(\theta'_i) = \frac{L(\theta'_i)}{\sup_{\theta \in \Theta} L(\theta)}$$

The combination of the two commonality functions using Dempster's rule of combination (3.9) results in the posterior commonality function which is non-zero only for singleton elements, i.e.

$$H(\theta'_i) = K p_i L(\theta'_i) \qquad (5.4)$$

where $K$ is a normalizing factor. This combined commonality function corresponds to a Bayesian BPA and is identical to the conventional Bayesian posterior distribution result. Thus the likelihood-based BPA also satisfies the third axiomatic condition, i.e. when the prior knowledge can be expressed as a Bayesian BPA, the D-S approach should be identical to the conventional Bayesian result.

The three axioms are the basic conditions that, from a practical standpoint, any BPA based on the sample likelihood function should satisfy. Furthermore, Wasserman [1988] has shown that the resulting BPA is the unique BPA which satisfies the three conditions. Therefore it can also be considered as a proper BPA, where proper implies that it is the only type of BPA, derived from the sample likelihood function, which satisfies the basic requirements.

The BPA derived in Section 5.2 has been based on the likelihood function from the total set of sample observations $\mathbf{x}$. An alternative method would be to determine a BPA from each individual sample result and then combine the $m$ BPA's from the $m$ individual observations to produce a resultant BPA. This resultant BPA would be different from the BPA based on the likelihood function of all samples, one major difference being that the final BPA would no longer be consonant.
The reason is that this approach reveals conflicting information in the samples which is suppressed in the likelihood function summary. Arguments have been presented which favor obtaining the BPA directly from the likelihood function of the sample set [Wasserman, 1988] as it is only this approach that meets the axiomatic requirements described above. Also, as a minor pragmatic consideration, the BPA based on the likelihood function of the total sample is easier to use in, say, engineering practice.

5.4 An Application

The following water resources engineering decision problem was taken from McAniff et al. [1980]. The problem was originally solved using Bayesian decision analysis. It will be used here to demonstrate the application of the likelihood-based inference in the D-S decision analysis and the results will be contrasted with those from the original Bayesian approach.

An agricultural producer needs to choose one irrigation system, among a range of options, which will minimize his total cost. The total cost of an irrigation system includes the initial capital expenditures and operation and maintenance expenses. The operational cost depends on the future energy price increases which are uncertain. Therefore the decision has to be made under uncertainty.

Let $D = (d_j, j = 1, \ldots, 5)$ represent the decision options, i.e. the possible irrigation systems, among which the decision maker must choose. The five options considered here are Center pivot units, Travelling trickle units, Gated pipe units with tailwater return, Open ditch units and Dead level units. The future energy price level is the unknown parameter, or state of nature, denoted here as $\theta$. The possible values of $\theta$ are represented by $\Theta = (\theta_i, i = 1, \ldots, 10)$ among which only one value can be the truth. Each of the ten possible values represents an annual average percentage energy price increase. For each decision option $d_j$ and state of nature $\theta_i$, there is a cost (i.e.
negative utility) which is denoted as $C(d_j, \theta_i)$ and is given in Table 5.1.

Table 5.1: Prior probabilities and costs in dollars (from McAniff et al. [1980]).

| States of nature | Price increase range (%) | Average price increase (%) | Prior probabilities | Cost $d_1$ | Cost $d_2$ | Cost $d_3$ | Cost $d_4$ | Cost $d_5$ |
|---|---|---|---|---|---|---|---|---|
| $\theta_1$ | < 0 | < 0 | 0.11 | 268190 | 269750 | 273780 | 287430 | 299780 |
| $\theta_2$ | 0-3 | 1.5 | 0.07 | 280020 | 280930 | 284440 | 306540 | 313300 |
| $\theta_3$ | 3-6 | 4.5 | 0.09 | 308620 | 307540 | 310050 | 331370 | 330720 |
| $\theta_4$ | 6-9 | 7.5 | 0.11 | 344760 | 341510 | 342810 | 371150 | 358800 |
| $\theta_5$ | 9-12 | 10.5 | 0.12 | 382080 | 385320 | 385060 | 421980 | 394810 |
| $\theta_6$ | 12-15 | 13.5 | 0.12 | 452790 | 441870 | 439530 | 488280 | 441610 |
| $\theta_7$ | 15-18 | 16.5 | 0.11 | 531180 | 515060 | 509990 | 573170 | 501540 |
| $\theta_8$ | 18-21 | 19.5 | 0.09 | 646880 | 609830 | 601250 | 683540 | 579410 |
| $\theta_9$ | 21-24 | 22.5 | 0.07 | 764400 | 726960 | 719420 | 826540 | 680290 |
| $\theta_{10}$ | > 24 | > 24 | 0.11 | 903500 | 862160 | 844350 | 977600 | 787020 |

A prior probability distribution on $\Theta$ has been specified and this, together with the cost function $C(d_j, \theta_i)$, is given in Table 5.1. It should be stressed that a prior distribution is mandatory in Bayesian analysis and its precision is governed entirely by, in this case, the discretization of $\Theta$ adopted. This is in contrast with the D-S analysis where it is entirely optional and, if adopted, can be expressed at any level of precision equal to or lower than the discretization of $\Theta$.

The sample information is represented by the sample estimate $x$ of the true state of nature obtained from a forecasting model. For a given true state of nature $\theta_i$, the model will predict different future energy price levels with different probabilities, reflecting the precision of the model. In the situation considered here a prediction of $x = 10.5\%$ has been provided. The sample likelihoods for this prediction are given in Table 5.2.

Table 5.2: Sample likelihood values from $x = 10.5\%$ (from McAniff et al. [1980]).

| $\theta_i$ | $\theta_1$ | $\theta_2$ | $\theta_3$ | $\theta_4$ | $\theta_5$ | $\theta_6$ | $\theta_7$ | $\theta_8$ | $\theta_9$ | $\theta_{10}$ |
|---|---|---|---|---|---|---|---|---|---|---|
| $L(\theta_i)$ | 0.031 | 0.041 | 0.071 | 0.143 | 0.408 | 0.143 | 0.071 | 0.041 | 0.031 | 0.02 |

Bayesian analysis. This was documented by McAniff et al.
[1980] and is briefly reviewed here.

The posterior probability distribution of $\theta$ is determined, via Bayes' theorem, from the prior probabilities and sample likelihood function. The Bayesian expected cost for each decision option is calculated from this posterior distribution and the cost function. The results are given in Table 5.3. According to the Bayesian decision criterion of minimizing the expected cost, the decision $d_3$, i.e. the Gated pipe with return irrigation system, is the optimal choice.

Table 5.3: Expected costs based on Bayesian analysis (from McAniff et al. [1980]).

| Decisions | $d_1$ | $d_2$ | $d_3$ | $d_4$ | $d_5$ |
|---|---|---|---|---|---|
| Expected costs | 414799 | 410139 | 409050 | 451284 | 415465 |

D-S decision analysis. The D-S analysis can draw an inference on $\theta$ solely from the sampling information, which is then combined with a BPA representing the subjective prior knowledge to produce the final inference for $\theta$.

First consider the sampling information alone. The D-S analysis involves deriving a BPA, $m_x(A)$, from the sample likelihood function following the procedure described in Section 5.2. Based on the likelihood values in Table 5.2, the BPA $m_x(A)$, determined from (5.3), is shown in Table 5.4.

Table 5.4: Sample likelihood based BPA $m_x(A)$.

| $A$ | $\{\theta_5\}$ | $\{\theta_4 \ldots \theta_6\}$ | $\{\theta_3 \ldots \theta_7\}$ | $\{\theta_2 \ldots \theta_8\}$ | $\{\theta_1 \ldots \theta_9\}$ | $\{\theta_1 \ldots \theta_{10}\}$ |
|---|---|---|---|---|---|---|
| $m_x(A)$ | 0.65 | 0.175 | 0.075 | 0.025 | 0.025 | 0.050 |

Consider now the prior information. Here a precise prior probability distribution on $\Theta$ has already been provided but the situation will be amended to reflect some concern about the precision of specifying such a distribution. In other words, allow for the fact that there might be some doubt about the specification of the precise probability distribution given based on the prior information. One way of reflecting this doubt is by introducing a factor $\alpha$, ranging from 0 to 1, which represents the confidence one has about the prior probability distribution.
The value $\alpha$ can therefore be viewed as the probability that the prior distribution is "correct" and $1 - \alpha$ the probability that the prior is meaningless and there is only ignorance about the unknown parameter $\theta$. This way of weakening the prior information is known as "contamination of the prior". It will be adopted somewhat arbitrarily here but will be discussed in more detail in Chapter 7. Note that, for the extreme cases, $\alpha = 0$ means that one is in complete ignorance about parameter $\theta$, and $\alpha = 1.0$ indicates that one is entirely confident about representing the prior information by the given precise probability distribution.

Given some value for the factor $\alpha$, the precise prior probability distribution can be "discounted" by $\alpha$ and this leads to the prior BPA

$$m_0(\theta_i) = \alpha P(\theta_i) \quad \text{for } i = 1, 2, \ldots, 10$$

$$m_0(\Theta) = 1 - \alpha \qquad (5.5)$$

A sensitivity approach to $\alpha$ will be adopted here so as to give some insight into its influence.

The prior BPA $m_0(A)$ and the BPA based on sample information are combined using Dempster's rule of combination to yield a posterior BPA $m(A)$ which represents the D-S inference of the unknown state of nature based on both subjective prior and sampling information. The combined BPA's for different $\alpha$ values are given in Table 5.5.
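The combination step can be sketched as follows. This is an illustrative implementation of Dempster's rule on a finite frame, assuming the prior probabilities of Table 5.1, the likelihood-based BPA of Table 5.4 and the discounted prior of (5.5); the function names `discounted_prior` and `dempster_combine` are hypothetical and states are indexed 1 to 10.

```python
from collections import defaultdict

# Prior probabilities P(theta_i), i = 1..10, from Table 5.1.
PRIOR = [0.11, 0.07, 0.09, 0.11, 0.12, 0.12, 0.11, 0.09, 0.07, 0.11]
FRAME = frozenset(range(1, 11))

# Likelihood-based BPA m_x(A) from Table 5.4.
M_X = {
    frozenset({5}): 0.65,
    frozenset(range(4, 7)): 0.175,
    frozenset(range(3, 8)): 0.075,
    frozenset(range(2, 9)): 0.025,
    frozenset(range(1, 10)): 0.025,
    FRAME: 0.05,
}


def discounted_prior(alpha):
    """Prior BPA of (5.5): alpha*P(theta_i) on singletons, 1-alpha on the frame."""
    m0 = {frozenset({i}): alpha * p for i, p in enumerate(PRIOR, start=1)}
    m0[FRAME] = 1.0 - alpha
    return m0


def dempster_combine(m1, m2):
    """Dempster's rule: multiply masses, intersect focal sets, renormalize."""
    raw = defaultdict(float)
    for a, mass_a in m1.items():
        for b, mass_b in m2.items():
            common = a & b
            if common and mass_a * mass_b > 0.0:
                raw[common] += mass_a * mass_b
    total = sum(raw.values())  # equals 1 minus the conflicting mass
    return {focal: mass / total for focal, mass in raw.items()}


posterior = {
    alpha: dempster_combine(discounted_prior(alpha), M_X)
    for alpha in (0.0, 0.2, 1.0)
}
for alpha, bpa in posterior.items():
    print(alpha, round(bpa[frozenset({5})], 4))
```

With $\alpha = 1$ only singleton focal sets remain, so the result is an ordinary probability distribution, in line with the third axiomatic condition of Section 5.3; with $\alpha = 0$ the likelihood-based BPA is returned unchanged.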
Note that when $\alpha = 0$, the combined BPA is the same as that obtained using the sample likelihood function alone and when $\alpha = 1$, the combined BPA becomes the familiar previously obtained Bayesian posterior distribution.

Table 5.5: The combined BPA's $m(A)$ for different $\alpha$.

| Subset $A$ | $\alpha = 0.0$ | $\alpha = 0.2$ | $\alpha = 0.4$ | $\alpha = 0.6$ | $\alpha = 0.8$ | $\alpha = 1.0$ |
|---|---|---|---|---|---|---|
| $\theta_1$ | 0 | 0.0019 | 0.0047 | 0.0088 | 0.0159 | 0.0305 |
| $\theta_2$ | 0 | 0.0016 | 0.0040 | 0.0075 | 0.0134 | 0.0259 |
| $\theta_3$ | 0 | 0.0037 | 0.0089 | 0.0168 | 0.0303 | 0.0582 |
| $\theta_4$ | 0 | 0.0090 | 0.0217 | 0.0411 | 0.0740 | 0.1423 |
| $\theta_5$ | 0.6500 | 0.6369 | 0.6185 | 0.5904 | 0.5427 | 0.4436 |
| $\theta_6$ | 0 | 0.0098 | 0.0237 | 0.0448 | 0.0807 | 0.1553 |
| $\theta_7$ | 0 | 0.0045 | 0.0109 | 0.0205 | 0.0370 | 0.0712 |
| $\theta_8$ | 0 | 0.0021 | 0.0051 | 0.0096 | 0.0173 | 0.0333 |
| $\theta_9$ | 0 | 0.0012 | 0.0030 | 0.0056 | 0.0101 | 0.0194 |
| $\theta_{10}$ | 0 | 0.0013 | 0.0031 | 0.0059 | 0.0106 | 0.0203 |
| $\theta_4 \ldots \theta_6$ | 0.1750 | 0.1639 | 0.1483 | 0.1245 | 0.0841 | 0 |
| $\theta_3 \ldots \theta_7$ | 0.0750 | 0.0702 | 0.0635 | 0.0534 | 0.0360 | 0 |
| $\theta_2 \ldots \theta_8$ | 0.0250 | 0.0234 | 0.0212 | 0.0178 | 0.0120 | 0 |
| $\theta_1 \ldots \theta_9$ | 0.0250 | 0.0234 | 0.0212 | 0.0178 | 0.0120 | 0 |
| $\theta_1 \ldots \theta_{10}$ | 0.0500 | 0.0468 | 0.0424 | 0.0356 | 0.0240 | 0 |

With the posterior BPA $m(A)$ and the cost function $C(d_j, \theta_i)$, the D-S upper and lower expected costs for each decision $d_j$ over the range of values for $\alpha$ are calculated using the procedure described in Section 3.6. Note here that the cost is a monotone increasing function of the unknown parameter $\theta$ so that upper and lower expected costs can be calculated using the upper and lower marginal distributions $g(\cdot)$ and $f(\cdot)$. The results are shown in Table 5.6.

Table 5.6: Upper and lower expected utilities (costs) for different decisions $d_j$ and different $\alpha$ values.

| Decision | Bound | $\alpha = 0$ | $\alpha = 0.2$ | $\alpha = 0.4$ | $\alpha = 0.6$ | $\alpha = 0.8$ | $\alpha = 1$ |
|---|---|---|---|---|---|---|---|
| $d_1$ | $E_*[C(d_1)]$ | 358946 | 362484 | 367479 | 375067 | 387972 | 414798 |
| $d_1$ | $E^*[C(d_1)]$ | 447885 | 445789 | 442830 | 438335 | 430690 | 414798 |
| $d_2$ | $E_*[C(d_2)]$ | 360542 | 363683 | 368119 | 374857 | 386317 | 410138 |
| $d_2$ | $E^*[C(d_2)]$ | 442942 | 440864 | 437930 | 433474 | 425894 | 410138 |
| $d_3$ | $E_*[C(d_3)]$ | 361179 | 364211 | 368492 | 374996 | 386057 | 409050 |
| $d_3$ | $E^*[C(d_3)]$ | 440690 | 438686 | 435856 | 431557 | 424247 | 409050 |
| $d_4$ | $E_*[C(d_4)]$ | 393311 | 396983 | 402168 | 410044 | 423439 | 451283 |
| $d_4$ | $E^*[C(d_4)]$ | 489355 | 486944 | 483539 | 478366 | 469570 | 451283 |
| $d_5$ | $E_*[C(d_5)]$ | 374536 | 377128 | 380789 | 386349 | 395806 | 415464 |
| $d_5$ | $E^*[C(d_5)]$ | 442367 | 440663 | 438257 | 434602 | 428386 | 415464 |

From Table 5.6 it can be seen that the smaller the $\alpha$ value, indicating greater weakness of the prior information, the bigger the difference between the upper and lower expected costs for any decision option. When $\alpha$ is 1.0, i.e. the Bayesian case, the upper and lower expected costs coincide. This can be seen clearly from Figure 5.1 where the upper and lower expected costs vs. $\alpha$ for $d_1$ only are plotted.

The upper expected costs vs. $\alpha$ for the different decisions $d_j$, $j = 1, \ldots, 5$, are plotted in Figure 5.2. It can be seen that the upper expected costs for decision $d_3$ are the smallest for all $\alpha$ values. According to the D-S mini-upper decision criterion, the decision $d_3$, i.e. Gated pipe with return, should be chosen as the alternative.

Figure 5.1: The upper and lower expected costs as functions of $\alpha$ for $d_1$.

Figure 5.2: The upper expected costs as functions of $\alpha$ for $d_j$, $j = 1, \ldots, 5$.

In this particular example, D-S decision analysis for any $\alpha$ value yields the same final decision as the Bayesian approach, namely $d_3$. However this is not necessarily always the case.
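The $\alpha = 0$ column of Table 5.6 can be reproduced with a short calculation. The sketch below assumes that, for each focal set, the lower bound uses the smallest cost in the set and the upper bound the largest; for a cost that increases monotonically with $\theta$ this should coincide with the marginal-distribution procedure of Section 3.6, but that equivalence is taken here as an assumption. Focal sets are encoded as (first, last) state indices, and the names are illustrative.

```python
# Costs C(d_j, theta_i) in dollars for theta_1..theta_10, from Table 5.1.
COSTS = {
    "d1": [268190, 280020, 308620, 344760, 382080, 452790, 531180, 646880, 764400, 903500],
    "d2": [269750, 280930, 307540, 341510, 385320, 441870, 515060, 609830, 726960, 862160],
    "d3": [273780, 284440, 310050, 342810, 385060, 439530, 509990, 601250, 719420, 844350],
    "d4": [287430, 306540, 331370, 371150, 421980, 488280, 573170, 683540, 826540, 977600],
    "d5": [299780, 313300, 330720, 358800, 394810, 441610, 501540, 579410, 680290, 787020],
}

# alpha = 0 column of Table 5.5; each focal set is a contiguous run of states.
BPA_ALPHA0 = {(5, 5): 0.65, (4, 6): 0.175, (3, 7): 0.075,
              (2, 8): 0.025, (1, 9): 0.025, (1, 10): 0.05}


def expected_cost_bounds(bpa, cost):
    """Weight the least (lower) and greatest (upper) cost in each focal set by its mass."""
    lower = upper = 0.0
    for (first, last), mass in bpa.items():
        focal_costs = cost[first - 1:last]   # states are 1-indexed
        lower += mass * min(focal_costs)
        upper += mass * max(focal_costs)
    return lower, upper


bounds = {d: expected_cost_bounds(BPA_ALPHA0, c) for d, c in COSTS.items()}
mini_upper = min(bounds, key=lambda d: bounds[d][1])

for d, (lower, upper) in bounds.items():
    print(d, round(lower), round(upper))
print("mini-upper choice:", mini_upper)
```

The computed bounds differ from the tabulated ones by less than one dollar, and `mini_upper` names the decision with the smallest upper expected cost, which is the quantity used by the mini-upper criterion.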
Consider for example a situation where the cost values for $d_2$ are slightly lower than specified in Table 5.1 but are unchanged for the other decision options. The modified costs for $d_2$ are given in Table 5.7, from which the corresponding upper and lower expected costs are calculated and presented in Table 5.8. The upper and lower expected costs for the other decision options remain the same as in Table 5.6. The modified upper expected cost vs. $\alpha$ for $d_2$, together with the unchanged upper expected cost curves for $d_1$, $d_3$ and $d_5$, are plotted in Figure 5.3. From Tables 5.8 and 5.6 or Figure 5.3, it can be seen that when $\alpha$ is greater than about 0.6, the D-S decision based on the mini-upper criterion is $d_2$. But when $\alpha$ is less than about 0.6, the D-S decision becomes $d_3$.

Table 5.7: The modified costs for $d_2$.

| States of nature | Costs in dollars, $d_2$ |
|---|---|
| $\theta_1$ | 265050 |
| $\theta_2$ | 276230 |
| $\theta_3$ | 302840 |
| $\theta_4$ | 336810 |
| $\theta_5$ | 382820 |
| $\theta_6$ | 441570 |
| $\theta_7$ | 514760 |
| $\theta_8$ | 609530 |
| $\theta_9$ | 726660 |
| $\theta_{10}$ | 861860 |

Table 5.8: The corresponding modified upper and lower expected costs for $d_2$ and different $\alpha$ values.

| Decision | Bound | $\alpha = 0$ | $\alpha = 0.2$ | $\alpha = 0.4$ | $\alpha = 0.6$ | $\alpha = 0.8$ | $\alpha = 1$ |
|---|---|---|---|---|---|---|---|
| $d_2$ | $E_*[C(d_2)]$ | 357272 | 360468 | 364981 | 371836 | 383496 | 407732 |
| $d_2$ | $E^*[C(d_2)]$ | 441212 | 439091 | 436097 | 431548 | 423813 | 407732 |

Figure 5.3: The upper expected costs as functions of $\alpha$ for $d_j$, $j = 1, 2, 3$ and 5, after the cost function for $d_2$ is modified.

5.5 Summary

The sample information about an unknown state of nature is often expressed in the form of the sample likelihood function. In this chapter, the D-S inference of an unknown state of nature based on the sample likelihood function was presented. The approach, though based on some intuitively reasonable assumptions, can be given a more formal axiomatic justification. Again, when the prior knowledge can be expressed by a conventional probability distribution, the combination of it with the likelihood-based BPA yields the conventional Bayesian posterior.
The inferential results were demonstrated in a water resources decision making example where weak prior information, based on contamination of a conventional prior probability distribution, was also considered. The results indicate that when the input information is weak, the imprecision reflected in the upper and lower expected costs is also significant. This kind of imprecision, however, cannot be obtained in the conventional Bayesian analysis.

Chapter 6

D-S Statistical Inference - General Model Parameters

6.1 Introduction

In Chapter 5, the D-S inference of an unknown parameter of a general statistical model, based on the sample likelihood function, was introduced. In this chapter a more formal and theoretical approach to the D-S inference of general statistical model parameters, based on sampling information, will be presented. This will be based on Dempster's [1969] original work which was confined to single parameter situations.

The general D-S statistical inference is introduced through the multinomial model with a single parameter. Such a multinomial model can be considered as a discretization of a continuous statistical model, and the inferential results can therefore be easily extended to the general continuous model situation. The multinomial model, together with a general description of the D-S inferential procedures for the unknown parameter from sampling information, will be presented in Section 6.2.

The general results described in Section 6.2 are of theoretical importance but have limited practical use. The practically more useful results can be developed by introducing an additional requirement, known as the monotone density ratio, for a statistical model. This important condition, together with the corresponding D-S inferential results, will be described in Section 6.3.
This restricts the applicability of the D-S theory to some degree but, as will be seen, still permits an imprecise probability approach to be applied to commonly used statistical models in engineering such as the normal, lognormal and maximum extreme type I models. In Sections 6.4 and 6.5, these general inferential results will be used to derive specific inferential results for the unknown parameters of some commonly used probability models which satisfy the monotone density ratio requirement.

It should be noted that Section 6.2 is included here mainly for the purpose of continuity of the theoretical development in this chapter. The more practically relevant results presented in Section 6.3 can be appreciated without developing a full understanding of the basic theory in Section 6.2.

Only the D-S inference of general statistical model parameters based exclusively on sampling information will be presented in this chapter. The sample based BPA can be combined with a BPA representing subjective knowledge to yield a resultant BPA. Once the BPA inference on the unknown parameter is obtained, it can then be used in engineering decision analysis following the procedure described in Section 3.6 of Chapter 3. Since this is a routine procedure which has been demonstrated through examples in Chapters 4 and 5, it will not be illustrated again in this chapter. Nevertheless, decision making is often the ultimate purpose of any statistical inference dealing with uncertainties.

6.2 The Multinomial Model and D-S Statistical Inference

The multinomial model with a single parameter has been described in detail by Dempster [1969]. The following is a brief introduction to this model.

Consider a random event $x$ whose outcomes are from $k$ categories, where the integer $k \ge 2$. This random event $x$ can then be described by a multinomial model with probability values specified on the $k$ categories. 
The probability value specified on any category represents the relative frequency of the random outcome of $x$ being from that category.

Let $K$ represent the set of $k$ categories, i.e.

$$K = \{1, 2, \ldots, k\} \tag{6.1}$$

and let $\pi(i)$ represent a probability value assigned to category $i$ for $i \in K$. Then a typical multinomial probability distribution can be expressed as

$$\pi = [\pi(1), \pi(2), \ldots, \pi(k)] \tag{6.2}$$

where $\sum_{i=1}^{k} \pi(i) = 1$. Here the true multinomial frequency model is assumed unknown but to lie among a set of possible multinomial models represented by a parametric form governed by a single parameter

$$\pi(\theta) = [\pi(1,\theta), \pi(2,\theta), \ldots, \pi(k,\theta)] \tag{6.3}$$

where $\theta \in \Theta$, i.e. $\theta$ represents an unknown state of nature whose true value lies within $\Theta$. Once the $\theta$ value is determined, the true multinomial model becomes known. Thus the problem of determining the true multinomial model is equivalent to the problem of dealing with the uncertainty of the single unknown parameter $\theta$.

Suppose a single sampling outcome of $x$ is obtained and is from category $j$. For a random event $x$, obtaining any outcome of $x$ is itself random in nature. This fact is recognized in the D-S approach through the introduction of another space $U$ whose points are in one to one correspondence with the observations of $x$. That is, when an observation of $x$ is obtained, this corresponds to a random sample in space $U$. This random sample, however, is not directly observable and thus cannot be specifically identified. The space $U$ takes the same mathematical form as the multinomial model $\pi$, therefore a general point of $U$ can be expressed as

$$u = \{u(1), u(2), \ldots, u(k)\} \tag{6.4}$$

where $\sum_{i=1}^{k} u(i) = 1$. The space $U$ is a collection of $u$, i.e. $U = \{u\}$. The distribution of $u$ over $U$ is assumed known, and governs the random sampling operation. The choice of this distribution is not critical as one distribution can always be transformed into another distribution. 
For convenience a uniform distribution of $u$ over $U$ is adopted here.

Suppose a single observation of the random event $x$ is obtained which is in category $j$. This random observation is equivalent to drawing a sample $u$ from the space $U$ though, as mentioned above, this sample itself cannot be identified. However, the sample $u$ must match the observation $j$ and the parametric multinomial model $\pi(\theta)$, and this can be achieved if and only if

$$\pi(i,\theta)\,u(j) \le \pi(j,\theta)\,u(i) \tag{6.5}$$

for all $i \in K$ (see Dempster [1969] for a detailed discussion). In other words, the above relationship ensures that the sample $u$ will be in correspondence with the observation $j$ under the multinomial model $\pi(\theta)$.

The understanding of the relationship in (6.5) is facilitated by considering the special case of the binomial model, which has been discussed in Chapter 4. Let $\pi(1,\theta) = \theta$ represent the probability of S(uccess) and $\pi(2,\theta) = 1 - \theta$ the probability of F(ailure), where $\pi(1,\theta) + \pi(2,\theta) = 1.0$. Let the space $U$ be represented by $u(1) = w$ and $u(2) = 1 - w$, where $0 \le w \le 1.0$ and $u(1) + u(2) = 1.0$. If a sample of S is obtained, i.e. $j = 1$, then according to (6.5) there is

$$(1 - \theta)\,w \le \theta\,(1 - w)$$

which can be simplified as

$$w \le \theta \le 1.0$$

This is exactly the relationship developed in Chapter 4 and demonstrated in Figure 4.1 for a sample of S.

Since $u$ is uniformly distributed over $U$, the relationship in (6.5) determines a probabilistic distribution, i.e. the D-S inference, on $\theta$ based on the observation $j$. Let $R(j,u)$ represent the random subset of $\Theta$ determined by the above relationship when $j$ is fixed. The D-S inference of $\theta$, which is expressed in the form of a commonality function on any subset $A$ of $\Theta$, can then be determined as [Dempster, 1969]

$$H(A) = P\big(R(j,u) \supseteq A\big) = \begin{cases} C \left[ \displaystyle\sum_{i=1}^{k} \sup_{\theta \in A} \frac{\pi(i,\theta)}{\pi(j,\theta)} \right]^{-1} & \text{if } \inf_{\theta \in A} \pi(j,\theta) > 0 \\[2ex] 0 & \text{otherwise} \end{cases} \tag{6.6}$$

where $C$ is a normalizing constant.

The general D-S inferential result for $\theta$ expressed in (6.6) is of little practical value. Rather, it serves as an intermediate step towards more practical D-S inference under the condition that the parametric multinomial model has a property called "monotone density ratio". The monotone density ratio condition, together with the inferential results for $\theta$ based on the sampling observations, are presented in the next section.

6.3 D-S Inference of Parameter with Monotone Density Ratio

Consider the multinomial model $\pi(\theta)$ defined by (6.3) in Section 6.2. The parameter here is assumed to be a real continuous variable in the range $\alpha \le \theta \le \beta$, where $\alpha$ and $\beta$ can go to $-\infty$ and $+\infty$ respectively. For any two categories $i, j \in K$, the ratio of the probabilities

$$\frac{\pi(i,\theta)}{\pi(j,\theta)} \tag{6.7}$$

is a function of $\theta$ on $\alpha \le \theta \le \beta$. If this ratio is a monotone nonincreasing or nondecreasing function of $\theta$ for all $i, j \in K$, then the multinomial model $\pi(\theta)$ is defined as meeting the monotone density ratio condition.

The parametric model $\pi(\theta)$ with monotone density ratio has some attractive features [Dempster, 1969]. First, the subset $R(j,u)$ of $\theta$ in (6.6) will be a closed interval for all $j \in K$, i.e. $R(j,u) = [\theta_1, \theta_2]$ for $\alpha \le \theta_1 \le \theta_2 \le \beta$. This suggests that the BPA inference for $\theta$ based on an observation $j$ can be expressed on the continuous contiguous frame of $\theta$ as defined in Chapter 3.

Additionally, when $\pi(\theta)$ has monotone density ratio, the $k$ categories of random variable $x$ can be reorganized into $r$ partially ordered, mutually exclusive, subsets $K_1, K_2, \ldots, K_r$, $r \le k$, so that if $i$ and $j$ belong to a common subset $K_s$, then

$$\frac{\pi(i,\theta_1)}{\pi(j,\theta_1)} = \frac{\pi(i,\theta_2)}{\pi(j,\theta_2)}$$

while if $i \in K_s$ and $j \in K_t$ with $s < t$, then

$$\frac{\pi(i,\theta_1)}{\pi(j,\theta_1)} \ge \frac{\pi(i,\theta_2)}{\pi(j,\theta_2)}$$

for all $\theta_1$, $\theta_2$ on $\alpha \le \theta_1 \le \theta_2 \le \beta$. 
The partially ordered subsets $K_1, K_2, \ldots, K_r$ are called equivalence classes and $K$ is the union of these equivalence classes.

With the above two properties of the monotone density ratio assumption, the D-S inference for parameter $\theta$ from the single sample observation $j \in K_s$, as expressed in (6.6), can be significantly simplified. Let the $k$ categories of random variable $x$ be arranged so that they are in the same order as the equivalence classes $K_1, K_2, \ldots, K_r$. Then for any closed interval $[\theta_1, \theta_2]$ on $\alpha \le \theta_1 \le \theta_2 \le \beta$, the supremum of the ratio $\pi(i,\theta)/\pi(j,\theta)$ in (6.6) is attained either at $\theta_1$ or at $\theta_2$, depending on which equivalence class the value $i$ is in. If $i$ is in the same equivalence class $K_s$ as $j$, then the probability ratio will be a constant for any $\theta \in [\theta_1, \theta_2]$; if $i$ is in an equivalence class $K_t$ where $t < s$, then the ratio reaches its supremum at $\theta_1$; and if $i$ is in an equivalence class $K_t$ where $t > s$, the ratio achieves its supremum at $\theta_2$. Thus under the monotone density ratio assumption, (6.6) simplifies to

$$H([\theta_1,\theta_2]) = \begin{cases} C \left[ \dfrac{\Pi(j,\theta_1)}{\pi(j,\theta_1)} + \dfrac{1 - \Pi(j,\theta_2)}{\pi(j,\theta_2)} \right]^{-1} & \text{if } \pi(j,\theta) > 0 \text{ on } \theta_1 \le \theta \le \theta_2 \\[2ex] 0 & \text{otherwise} \end{cases} \tag{6.8}$$

where $\Pi(j,\theta) = \sum_{i \le j} \pi(i,\theta)$ is the cumulative probability value of $j$. Note that this cumulative probability value is the summation of all of the probability values in the equivalence classes $K_t$ for any $t \le s$. Equation (6.8) is the commonality function on the subsets of $\Theta$ representing D-S inference on $\theta$ from a single sample $j$.

If $n$ observations $j_1, j_2, \ldots, j_n$ are obtained, $n$ commonality functions can be determined which can be combined via Dempster's rule of combination (3.9) to yield a resultant commonality function, i.e.

$$H([\theta_1,\theta_2]) = \begin{cases} C' \displaystyle\prod_{h=1}^{n} \left[ \dfrac{\Pi(j_h,\theta_1)}{\pi(j_h,\theta_1)} + \dfrac{1 - \Pi(j_h,\theta_2)}{\pi(j_h,\theta_2)} \right]^{-1} & \text{if } \pi(j_h,\theta) > 0 \text{ on } \theta_1 \le \theta \le \theta_2 \\[2ex] 0 & \text{otherwise} \end{cases} \tag{6.9}$$

where $C'$ is a normalizing constant.

The above discussion has been based on the multinomial model $\pi(\theta)$ with $k$ categories. This model can be extended to the continuous sampling model situation. 
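
As a rough check on (6.8), the following sketch evaluates it for the binomial model of Chapter 4 and compares it with a direct simulation of the relationship $w \le \theta \le 1.0$ derived from (6.5). It is an illustration under stated assumptions rather than part of the original development: the categories are ordered as (F, S) so that they follow the equivalence class order, the constant $C$ is taken as 1, and the function names are hypothetical.

```python
import random


def commonality_binomial(j, theta1, theta2):
    """Evaluate (6.8) with C = 1 for a two-category model ordered (F, S).

    Category 1 is F with pi(1, theta) = 1 - theta and category 2 is S with
    pi(2, theta) = theta. Pi(j, theta) is the inclusive cumulative sum.
    """
    def pi(cat, theta):
        return 1.0 - theta if cat == 1 else theta

    def big_pi(cat, theta):
        return sum(pi(i, theta) for i in range(1, cat + 1))

    # pi(j, theta) is monotone in theta here, so checking the two end points
    # is enough to enforce pi(j, theta) > 0 on the whole interval.
    if min(pi(j, theta1), pi(j, theta2)) <= 0.0:
        return 0.0
    lower_term = big_pi(j, theta1) / pi(j, theta1)
    upper_term = (1.0 - big_pi(j, theta2)) / pi(j, theta2)
    return 1.0 / (lower_term + upper_term)


def monte_carlo_success(theta1, theta2, n_draws=100_000, seed=0):
    """Estimate P(R contains [theta1, theta2]) for one observed S.

    For an observed S the random set is R = [w, 1] with w uniform on [0, 1],
    so R contains the interval exactly when w <= theta1 and theta2 <= 1.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_draws):
        w = rng.random()
        if w <= theta1 and theta2 <= 1.0:
            hits += 1
    return hits / n_draws


h_success = commonality_binomial(2, 0.3, 0.7)  # observation S
h_failure = commonality_binomial(1, 0.3, 0.7)  # observation F
h_simulated = monte_carlo_success(0.3, 0.7)
```

Under these assumptions the formula returns $\theta_1$ for an observed S and $1 - \theta_2$ for an observed F, and the simulated frequency for S should agree with the former up to sampling error.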
Let $\psi(x,\theta)$ represent the continuous sampling model, and let the $x$ values be arranged so that they are in the same order as the equivalence classes. Then for a set of samples $x_1, x_2, \ldots, x_n$, the D-S inference for $\theta$ is

$$H([\theta_1,\theta_2]) = \begin{cases} C'' \displaystyle\prod_{i=1}^{n} \left[ \dfrac{\Psi(x_i,\theta_1)}{\psi(x_i,\theta_1)} + \dfrac{1 - \Psi(x_i,\theta_2)}{\psi(x_i,\theta_2)} \right]^{-1} & \text{if } \psi(x_i,\theta) > 0 \text{ on } \theta_1 \le \theta \le \theta_2 \\[2ex] 0 & \text{otherwise} \end{cases} \tag{6.10}$$

where $\Psi(x_i,\theta)$ is the cumulative probability of $x_i$. If $x_i$ is in the equivalence class $K_s$, then this cumulative probability is equal to the integration of $\psi(x,\theta)$ over the $x$ values in the equivalence classes $K_t$ for all $t \le s$.

From Equation (6.10), it can be seen that the commonality function for single point subsets, i.e. when $\theta_1 = \theta_2 = \theta$, becomes the conventional sample likelihood function, i.e.

$$H([\theta,\theta]) = C'' \prod_{i=1}^{n} \psi(x_i,\theta) \tag{6.11}$$

Thus in the Bayesian case, when the prior knowledge can be expressed as a conventional precise probability distribution or a Bayesian BPA on $\theta$, say $p(\theta)$, the corresponding commonality function has non-zero values only on the single value subsets. The combination of this prior knowledge with the sampling information, using Dempster's rule of combination, yields a posterior commonality function which has non-zero values $C''' p(\theta) \prod_{i=1}^{n} \psi(x_i,\theta)$, where $C'''$ is a normalizing factor, only on single value subsets. The corresponding posterior BPA is a Bayesian BPA which is identical to the conventional Bayesian posterior distribution. This further supports Dempster's [1969] claim that the Bayesian analysis is a special case of the more general D-S approach.

Once the commonality function for the unknown parameter $\theta$ is obtained for $n$ sample observations from (6.10), the BPA density $m([\theta_1,\theta_2])$ and the two marginal distributions $f(\theta)$ and $g(\theta)$ for $\theta$ can then be determined. To facilitate the formulation, define
To facilitate the formulation, defineChapter 6.^D-S Statistical Inference - General Model Parameters 88( x i ,^)t i 0 1(^)( x^)1 - 1if(x 2 ,8 2 )si(e92) (6.12)0(x,, 02)then the commonality function in (6.10) becomesnII( [01, 02]) = c" II [tt( 91) + si(92)] (6.13)i=1According to equation (3.15), the BPA is then02 11([91,82j)m([01,02})^=^a81002=^021) {t[ii(01) + si(92)1 -2 t:(90s:(02) +1=1[t (^) 1 ( 02 )]^( 01 )1 [E.^[t 1 (0 1 ) + 8102 )1 -1 s:(02 )1} (6.14)i=1 i=1where tat 9 1 ) and s1(02 ) are defined asi:(91)^ii(01)= de l(6.15)s:(02)^=^d9 2 si(92)The two marginal distributions f(0) and g(0) can be determined from (3.18) andH([0 1 ,H({0 0 2]) as followsChapter 6. D-S Statistical Inference — General Model Parameters^890H([01, 0 2]) f(0) =^801^162=61=6=^ii'cb(xi,6)t;(6)1011([01,02]) g(0) =002^161,-.92=8= C i^Ik(xj, 0) [E 91)(X/,0)S;(6)1^(6.16)i=1 /=1In this section, the D-S statistical inference of an unknown parameter of a generalsampling model with the monotone density ratio condition has been described. Theinferential results are represented by the commonality function in (6.10), the BPAdensity function in (6.14) and the upper lower marginal distributions in (6.16). In thefollowing section, the specific D-S inferential results for unknown parameters of somecommonly used sampling models will be presented.6.4 Application to Normal and Lognormal ModelsIf a random variable x follows the lognormal distribution with parameters and a, thenlog(x) follows the normal distribution with IL as the mean and o- as the standard devi-ation. Dealing with the uncertainties of the lognormal model parameters is thereforeequivalent to dealing with the uncertainties of the corresponding normal model param-eters. In the following discussion, only the D-S inference of normal model parametersare presented. 
The D-S inference of lognormal model parameters can be obtained by first transforming the lognormal model into a normal model and then performing D-S inference on the corresponding normal model parameters.

The probability density function, $\psi_N(x;\mu,\sigma)$, of a normal distribution with mean $\mu$ and standard deviation $\sigma$ can be expressed as

$$\psi_N(x;\mu,\sigma) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left( -\frac{(x-\mu)^2}{2\sigma^2} \right) \tag{6.17}$$

Verification of the monotone density ratio condition for each of the two parameters of the normal model follows the same procedure as for the multinomial model. For any two values $x_i$ and $x_j$ of $x$, the probability density ratio of $x_i$ and $x_j$ can be expressed as

$$\frac{\psi_N(x_i;\mu,\sigma)}{\psi_N(x_j;\mu,\sigma)} = \exp\left( -\frac{(x_i-\mu)^2 - (x_j-\mu)^2}{2\sigma^2} \right) = \exp\left( -\frac{x_i^2 - x_j^2 - 2\mu(x_i - x_j)}{2\sigma^2} \right) \tag{6.18}$$

When the parameter $\sigma$ is fixed, this ratio is either a monotone nonincreasing (when $x_i < x_j$) or nondecreasing (when $x_i > x_j$) function of $\mu$. Therefore, according to the definition, the normal model with parameter $\mu$ satisfies the monotone density ratio assumption. Similarly, when $\mu$ is fixed, the normal model with parameter $\sigma$ also satisfies the monotone density ratio condition.

Since D-S inference deals only with single parameter situations, one parameter has to be assumed fixed while the inference for the other parameter is considered. In practice there might be uncertainties associated with both parameters. In this situation the parameter whose uncertainty is greater, or has the more significant influence on the consequent results, such as those from a decision analysis, should be treated as the uncertain parameter and the other parameter assumed constant. For example, Ang and Tang [1984] suggested that, for a normal sampling model, the uncertainty associated with the mean is more significant than with the standard deviation. 
Only the uncertainty associated with the mean value is then considered, and the standard deviation is considered as a constant. In the following two subsections, the D-S inferential results for each of the two parameters, while assuming the other is fixed, are presented.

6.4.1 Inference on $\mu$ with $\sigma$ Fixed

Before discussing the D-S inference on $\mu$, the equivalence classes of the values of $x$ under parameter $\mu$ are determined. Consider again the probability density ratio expressed in (6.18). For any different $x_i$ and $x_j$ values of $x$, the density ratio is different for different $\mu$ values. Thus any individual value of $x$ forms an "equivalence class". Also for any $\mu_1 < \mu_2$ and $x_i < x_j$, there is

$$\frac{\psi_N(x_i;\mu_1)}{\psi_N(x_j;\mu_1)} \ge \frac{\psi_N(x_i;\mu_2)}{\psi_N(x_j;\mu_2)} \tag{6.19}$$

Therefore the order of the equivalence classes is from the class with small $x$ value to the one with large $x$ value. It is seen that the natural order of $x$ values is consistent with the order of the equivalence classes. Thus for any observation $x_i$, the cumulative probability $\Psi_N(x_i;\mu)$, necessary for determining the D-S inference on $\mu$, can be expressed as

$$\Psi_N(x_i;\mu,\sigma) = \int_{-\infty}^{x_i} \psi_N(x;\mu,\sigma)\,dx = \Phi\left( \frac{x_i - \mu}{\sigma} \right) \tag{6.20}$$

where $\Phi(\cdot)$ is the standardized cumulative normal distribution. The functions $t_i(\mu_1)$ and $s_i(\mu_2)$ in (6.12) can then be determined as

$$t_i(\mu_1) = \frac{\Psi_N(x_i;\mu_1)}{\psi_N(x_i;\mu_1)} = \sqrt{2\pi}\,\sigma \exp\left( \frac{(x_i-\mu_1)^2}{2\sigma^2} \right) \Phi\left( \frac{x_i-\mu_1}{\sigma} \right)$$

$$s_i(\mu_2) = \frac{1 - \Psi_N(x_i;\mu_2)}{\psi_N(x_i;\mu_2)} = \sqrt{2\pi}\,\sigma \exp\left( \frac{(x_i-\mu_2)^2}{2\sigma^2} \right) \left[ 1 - \Phi\left( \frac{x_i-\mu_2}{\sigma} \right) \right]$$

and the derivatives of $t_i(\mu_1)$ and $s_i(\mu_2)$ with respect to $\mu_1$ and $\mu_2$ respectively are

$$t_i'(\mu_1) = -\frac{x_i-\mu_1}{\sigma^2}\, t_i(\mu_1) - 1, \qquad s_i'(\mu_2) = -\frac{x_i-\mu_2}{\sigma^2}\, s_i(\mu_2) + 1 \tag{6.21}$$

The following numerical example demonstrates the application of the results developed in this subsection.

EXAMPLE 6.1. The annual maximum flow $x$, in cfs, of a stream has been recorded for the last 10 years as follows

 4025.8     3536.9
 6506.5     9093.3
 2012.0    12823.3
 2409.0    12646.9
 1347.7     4017.3

Studies indicate that the maximum flood values can be modeled by a lognormal distribution with parameters $\mu$ and $\sigma$. The logarithm of the data therefore follows a normal distribution with mean $\mu$ and standard deviation $\sigma$. The corresponding natural logarithms of the above data are

 8.30    8.17
 8.78    9.12
 7.61    9.46
 7.79    9.45
 7.21    8.30

from which the maximum likelihood estimate of the standard deviation can be determined as $\hat{\sigma} = 0.734$. Assume the standard deviation $\sigma$ is known and is equal to the sample standard deviation 0.734. The D-S inference on the uncertain parameter $\mu$ can then proceed. The BPA density functions, $m_N([\mu_1,\mu_2])$, of $\mu$ from 2 and 10 data are plotted in Figures 6.1a and 6.1b respectively. Here, and in the examples following, only the first two data are used in the 2 data case for the purpose of demonstration. It is found that in all the examples presented in this chapter, choosing any other two data will not have a significant effect on the results. From the plots it can be seen that when the number of data is small, as in Figure 6.1a, the BPA density is widely spread, indicating large uncertainty associated with $\mu$. As the number of data increases, the BPA density function concentrates about the true $\mu$ value, as in Figure 6.1b. In the extreme case when the number of data becomes infinite, the BPA density will become a unit spike at the mean value, indicating the deterministic situation. Only in this extreme case do the D-S inference and the sample likelihood estimate of $\mu$ become the same.

Figure 6.1: The BPA density functions for normal model parameter $\mu$, $m_N([\mu_1,\mu_2])$, for a) n = 2; b) n = 10.

For comparison, the Bayesian posterior distributions $\pi_N(\mu)$ for $\mu$ were determined for different sized samples. The commonly used noninformative prior for the location parameter $\mu$ is the uniform distribution on $\mu$, which is an improper distribution as the integration over $\mu$ is infinite. The Bayesian posterior distributions $\pi_N(\mu)$ with the uniform prior for $\mu$, together with the D-S upper and lower marginal distributions, $g_N(\mu)$ and $f_N(\mu)$, of $\mu$ from 2 and 10 samples are plotted in Figures 6.2a and 6.2b respectively. It can be seen that the Bayesian posterior distribution lies between the D-S upper and lower marginal distributions. Furthermore, as the number of data increases, the upper and lower marginal distributions and the Bayesian posterior distribution all tend towards the true mean value. Again, when the number of data becomes infinite both D-S and Bayesian approaches lead to the same result.

Figure 6.2: The D-S upper and lower marginals, $g_N(\mu)$ and $f_N(\mu)$, and the Bayesian posterior $\pi_N(\mu)$ for normal mean $\mu$, a) n = 2; b) n = 10.

A qualitative evaluation of these results suggests that significant additional uncertainty due to imprecision has been recognized by the D-S scheme when the sample size is small, i.e. n = 2. But the difference between the D-S and Bayesian results at n = 10 and, in particular, its implications in a decision analysis are probably negligible.

6.4.2 Inference on $\sigma$ with $\mu$ Fixed

The equivalence classes for values of $x$ and the order of the equivalence classes, under parameter $\sigma$, are determined first. Consider the probability density ratio expressed in (6.18). This ratio does not depend on $\sigma$ (it is equal to one) for any two different values, $x_i$ and $x_j$, which satisfy $x_i = 2\mu - x_j$. This suggests that any two values of $x$ which are symmetrical about the mean $\mu$ are in the same equivalence class. 
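
The symmetry argument can be checked numerically. The short sketch below evaluates the density ratio in (6.18) over several $\sigma$ values for a mirrored pair and for a pair that is not mirrored. The numerical values of $\mu$, $x_i$, $x_j$ and the $\sigma$ grid are arbitrary illustrative choices, not values taken from the examples in this chapter.

```python
import math


def normal_pdf(x, mu, sigma):
    """Normal density psi_N(x; mu, sigma) as written in (6.17)."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2)) / (
        math.sqrt(2.0 * math.pi) * sigma
    )


def density_ratio(x_i, x_j, mu, sigma):
    """Density ratio psi_N(x_i) / psi_N(x_j) of (6.18)."""
    return normal_pdf(x_i, mu, sigma) / normal_pdf(x_j, mu, sigma)


mu = 8.0                      # illustrative fixed mean
x_j = 9.2                     # illustrative value above the mean
x_i = 2.0 * mu - x_j          # its mirror image about the mean
sigmas = [0.25, 0.5, 1.0, 2.0, 4.0]

# Mirrored pair: the ratio should not depend on sigma.
mirror_ratios = [density_ratio(x_i, x_j, mu, s) for s in sigmas]

# Pair that is not symmetric about mu: the ratio changes with sigma.
other_ratios = [density_ratio(8.5, 9.2, mu, s) for s in sigmas]
```

For the mirrored pair every ratio equals one up to rounding, whereas for the second pair the ratio varies with $\sigma$, so those two values cannot share an equivalence class.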
A typical equivalence class can be expressed as $\{x, 2\mu - x\}$ where $x > \mu$, and the order of the equivalence classes can be determined as follows.

Since for any $\mu < x_i < x_j$, the density ratio in (6.18) satisfies

$$\frac{\psi_N(x_i;\sigma_1)}{\psi_N(x_j;\sigma_1)} \ge \frac{\psi_N(x_i;\sigma_2)}{\psi_N(x_j;\sigma_2)} \tag{6.22}$$

for all $\sigma_1 < \sigma_2$, the order of the equivalence classes is the same as the order of the $x$ values for $x > \mu$. If all the $x$ values are arranged so that they are in the same order as the equivalence classes, the cumulative probability of $x_i$, $\Psi_N(x_i;\sigma)$, is

$$\Psi_N(x_i;\sigma) = \int_{2\mu - x_i}^{x_i} \psi_N(x;\sigma)\,dx = 2\,\Phi\left( \frac{x_i - \mu}{\sigma} \right) - 1 \tag{6.23}$$

where an observation below the mean is represented by its mirror image $2\mu - x_i$, so that $x_i - \mu$ is taken as positive. The two functions $t_i(\sigma_1)$ and $s_i(\sigma_2)$ are then

$$t_i(\sigma_1) = \frac{\Psi_N(x_i;\sigma_1)}{\psi_N(x_i;\sigma_1)}, \qquad s_i(\sigma_2) = \frac{1 - \Psi_N(x_i;\sigma_2)}{\psi_N(x_i;\sigma_2)} \tag{6.24}$$

and the two derivatives $t_i'(\sigma_1)$ and $s_i'(\sigma_2)$ are

$$t_i'(\sigma_1) = \frac{1}{\sigma_1} \left( 1 - \frac{(x_i-\mu)^2}{\sigma_1^2} \right) t_i(\sigma_1) - \frac{2(x_i-\mu)}{\sigma_1}, \qquad s_i'(\sigma_2) = \frac{1}{\sigma_2} \left( 1 - \frac{(x_i-\mu)^2}{\sigma_2^2} \right) s_i(\sigma_2) + \frac{2(x_i-\mu)}{\sigma_2} \tag{6.25}$$

The following example demonstrates the results developed in this subsection.

EXAMPLE 6.1 (continued). Assume that the population mean value $\mu$ of the normal distribution is known to be 7.414. The D-S inference on the standard deviation $\sigma$ from the sample observations can then proceed. The BPA density functions of $\sigma$, $m_N([\sigma_1,\sigma_2])$, from 2 and 10 data are plotted in Figures 6.3a and 6.3b. The plots show again the tendency of the BPA density to converge towards the true $\sigma$ value as the number of data increases. Compared with the BPA density for $\mu$ in the previous example, the convergence of the BPA density for $\sigma$ is much slower. This will be seen more clearly below.

For comparison, the Bayesian posterior distributions of $\sigma$, $\pi_N(\sigma)$, for 2 and 10 data are determined using the noninformative prior, $1/\sigma$, for the scale parameter $\sigma$. The Bayesian posteriors, together with the D-S upper and lower marginal distributions, $g_N(\sigma)$ and $f_N(\sigma)$, are plotted in Figures 6.4a and 6.4b. 
From the plots it can be seen that the two Bayesian posterior distributions all lie between the corresponding D-S upper and lower marginal distributions. Also when the sample size is small, as in Figure 6.4a, the D-S upper and lower marginal distributions are significantly different. However as the sample size increases, as represented by Figure 6.4b, the two D-S marginal distributions become closer, indicating a decrease in the amount of uncertainty. In the extreme case as the sample size becomes infinite, both the D-S marginal and the Bayesian posterior distributions will converge to the true $\sigma$ value. Note that in Figure 6.4b, the D-S upper and lower marginal distributions are still quite different and fairly widely spread, indicating still some considerable uncertainty associated with $\sigma$ even with the sample size of 10.

Figure 6.3: The BPA density functions for normal model parameter $\sigma$, $m_N([\sigma_1,\sigma_2])$, for a) n = 2; b) n = 10.

Figure 6.4: The D-S upper and lower marginals, $g_N(\sigma)$ and $f_N(\sigma)$, and the Bayesian posterior $\pi_N(\sigma)$ for normal standard deviation $\sigma$, a) n = 2; b) n = 10.

6.5 Application to Gumbel Model

The Gumbel, or maximum extreme value type I, model has been widely used in engineering to describe the distribution of the random extreme value events, such as annual maximum floods. 
The probability density function of the type I distribution is

$$\psi_I(x;\mu,\sigma) = \frac{1}{\sigma} \exp\left[ -\frac{x-\mu}{\sigma} - \exp\left( -\frac{x-\mu}{\sigma} \right) \right] \tag{6.26}$$

for all $-\infty < x < \infty$.

For any $x_i$ and $x_j$ of $x$, the probability density ratio is

$$\frac{\psi_I(x_i;\mu,\sigma)}{\psi_I(x_j;\mu,\sigma)} = \exp\left\{ -\frac{x_i - x_j}{\sigma} - \exp\left( \frac{\mu}{\sigma} \right) \left[ \exp\left( -\frac{x_i}{\sigma} \right) - \exp\left( -\frac{x_j}{\sigma} \right) \right] \right\} \tag{6.27}$$

For any given $\sigma$ this ratio is a monotone nonincreasing (i.e. when $x_i < x_j$) or nondecreasing (i.e. when $x_i > x_j$) function of the parameter $\mu$. Thus the type I model with uncertain parameter $\mu$ and fixed $\sigma$ satisfies the monotone density ratio condition. However, since the density ratio is not a monotone nondecreasing or nonincreasing function of $\sigma$ for any fixed $\mu$, the type I model with parameter $\sigma$ uncertain does not satisfy the monotone density ratio condition. Therefore for the type I model, the D-S inference can only be performed on parameter $\mu$ with $\sigma$ assumed as fixed.

Considering again the density ratio in (6.27), since for any $x_i < x_j$

$$\frac{\psi_I(x_i;\mu_1)}{\psi_I(x_j;\mu_1)} \ge \frac{\psi_I(x_i;\mu_2)}{\psi_I(x_j;\mu_2)} \tag{6.28}$$

for $\mu_1 < \mu_2$, any single $x$ value forms an equivalence class, and the order of the equivalence classes is the same as the natural order of the $x$ values. Thus the cumulative probability $\Psi_I(x_i;\mu)$ for any observation $x_i$, which is necessary for determining the D-S inference for $\mu$, can be determined as

$$\Psi_I(x_i;\mu) = \exp\left[ -\exp\left( -\frac{x_i-\mu}{\sigma} \right) \right] \tag{6.29}$$

The functions $t_i(\mu_1)$ and $s_i(\mu_2)$ and their corresponding derivatives are

$$t_i(\mu_1) = \sigma \exp\left( \frac{x_i-\mu_1}{\sigma} \right)$$

$$s_i(\mu_2) = \sigma \exp\left\{ \frac{x_i-\mu_2}{\sigma} + \exp\left( -\frac{x_i-\mu_2}{\sigma} \right) \right\} - \sigma \exp\left( \frac{x_i-\mu_2}{\sigma} \right)$$

$$t_i'(\mu_1) = -\exp\left( \frac{x_i-\mu_1}{\sigma} \right)$$

$$s_i'(\mu_2) = -\exp\left\{ \frac{x_i-\mu_2}{\sigma} + \exp\left( -\frac{x_i-\mu_2}{\sigma} \right) \right\} + \exp\left[ \exp\left( -\frac{x_i-\mu_2}{\sigma} \right) \right] + \exp\left( \frac{x_i-\mu_2}{\sigma} \right) \tag{6.30}$$

The results developed in this section are illustrated in the following example.

EXAMPLE 6.2. The following observations are from a random event which can be modeled by a type I distribution (see Bury [1975] for the original example).

 4.0    5.5
 4.9    6.3
 4.6    4.5
 5.0    5.9
 7.5    6.9
 5.4    5.7

The maximum likelihood estimates of parameters $\sigma$ and $\mu$ are 0.818 and 5.05 respectively [Bury 1975]. Assume parameter $\sigma$ is equal to the sample estimate, i.e. $\sigma = 0.818$. The D-S inference on $\mu$ from sampling information can then be performed. The BPA density functions for $\mu$, $m_I([\mu_1,\mu_2])$, from 2 and 12 data are illustrated in Figures 6.5a and 6.5b respectively. The plots again demonstrate that as the number of data increases, the uncertainty associated with parameter $\mu$ decreases, and the BPA density function tends towards the true value of $\mu$.

Using a uniform distribution as the noninformative prior for the location parameter $\mu$, the Bayesian posterior distributions, $\pi_I(\mu)$, of $\mu$ from 2 and 12 samples can be determined. These, together with the corresponding D-S upper and lower marginal distributions, $g_I(\mu)$ and $f_I(\mu)$, are all plotted in Figures 6.6a and 6.6b. It is again seen that the Bayesian posterior distribution lies between the D-S upper and lower marginal distributions. Also as the number of sampled values increases, the Bayesian posterior distribution and the D-S upper and lower marginal distributions all converge to the true value. As the number of data becomes infinite, the D-S approach, the conventional Bayesian method, as well as the classical maximum likelihood estimate of parameter $\mu$, all converge on the same result.

Figure 6.5: The BPA density functions for Extreme Type I model parameter $\mu$, $m_I([\mu_1,\mu_2])$, for a) n = 2; b) n = 12.

Figure 6.6: The D-S upper and lower marginals, $g_I(\mu)$ and $f_I(\mu)$, and the Bayesian posterior $\pi_I(\mu)$ for Extreme Type I parameter $\mu$, a) n = 2; b) n = 12.

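
As a numerical companion to Example 6.2, the sketch below evaluates the commonality function (6.13) for the type I model using the expressions for $t_i$ and $s_i$ in (6.30). It is only an illustration under stated assumptions: the normalizing constant is set to 1, the function names are hypothetical, and the grid search over $\mu$ is a convenience rather than the procedure used to produce the figures.

```python
import math

# Observations of Example 6.2 and the fixed scale parameter from the text.
DATA = [4.0, 5.5, 4.9, 6.3, 4.6, 4.5, 5.0, 5.9, 7.5, 6.9, 5.4, 5.7]
SIGMA = 0.818


def gumbel_pdf(x, mu, sigma=SIGMA):
    """Type I density of (6.26)."""
    z = (x - mu) / sigma
    return math.exp(-z - math.exp(-z)) / sigma


def t_fn(x, mu1, sigma=SIGMA):
    """t_i(mu1) = Psi / psi, first expression in (6.30)."""
    return sigma * math.exp((x - mu1) / sigma)


def s_fn(x, mu2, sigma=SIGMA):
    """s_i(mu2) = (1 - Psi) / psi, second expression in (6.30)."""
    z = (x - mu2) / sigma
    # sigma * exp(z) * (exp(exp(-z)) - 1), written with expm1 for accuracy
    return sigma * math.exp(z) * math.expm1(math.exp(-z))


def commonality(mu1, mu2, data=DATA):
    """Unnormalized H([mu1, mu2]) from (6.13), i.e. with the constant set to 1."""
    h = 1.0
    for x in data:
        h /= t_fn(x, mu1) + s_fn(x, mu2)
    return h


def likelihood(mu, data=DATA):
    """Conventional sample likelihood of mu with sigma fixed."""
    value = 1.0
    for x in data:
        value *= gumbel_pdf(x, mu)
    return value


# By (6.11) the singleton commonality is the likelihood, so its maximizer
# over a grid approximates the maximum likelihood estimate of mu.
grid = [3.0 + 0.001 * k for k in range(5001)]
mu_best = max(grid, key=lambda m: commonality(m, m))
```

With $\sigma$ fixed at 0.818, the grid maximizer of the singleton commonality lands close to the estimate of 5.05 quoted from Bury [1975], and widening the interval $[\mu_1, \mu_2]$ can only reduce the commonality value.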
6.6 Summary

A more formal and theoretical approach to the D-S statistical inference of an unknown parameter of a general statistical model, based on sampling information, was presented in this chapter. Though this approach requires that the unknown parameter in the model satisfies the monotone density ratio condition, it can still be applied to commonly used statistical models such as the normal, lognormal and maximum extreme value type I models. Again, the inferential results, when combined with a Bayesian prior probability distribution, yield the conventional Bayesian posterior. The application of the inferential results to several common statistical models showed that when sampling information is very limited, the D-S result reveals significant imprecision which is overlooked by the conventional Bayesian results with noninformative priors. The two methods converge to deterministic results when the sampling information becomes infinite.

Chapter 7

Representing Subjective Knowledge with a BPA

In Chapters 4, 5 and 6, the D-S inference of an unknown parameter of a statistical model based on objective sampling information was described. In addition to the sampling information there may be other sources of knowledge, often categorized as subjective in nature, about the unknown parameter. These sources of information can play an important role in situations where the sampling information is very limited. In this chapter several methods of determining BPAs from subjective knowledge, each for a different situation, will be discussed. Unlike determining BPA functions from objective sampling information, where a formal analytical approach is possible, the representation of subjective knowledge will necessarily involve an elicitation process. The issue of elicitation is, however, closely related to the more fundamental and profound issue of axiomatic justification of any uncertainty representation scheme. 
The literature is extensive on this subject but inconclusive and one can find authoritative arguments both for and against most schemes. But just as the debate on the Bayesian approach has not deterred it from being used in engineering practice, the dispute on the D-S scheme should also not impede the exercise of exploring the potential implementation of this new scheme in the engineering world. The axiomatic issue itself is considered to be outside the scope of this thesis though. Representative discussion on this subject can be found in Shafer and Pearl [1990].

7.1 The BPA Based on Contamination of a Prior Distribution

In conventional Bayesian theory, the subjective knowledge about the unknown state of nature $\theta$ must be represented by a precise probability distribution. Suppose, after taking into consideration the subjective knowledge, this prior probability distribution is determined and denoted as $\pi(\theta)$. From the viewpoint of the D-S theory, $\pi(\theta)$ represents subjective knowledge which is qualitatively precise or complete, i.e. provides a basis on which a precise probability judgement about $\theta$ can be made.

In practice, however, when the subjective knowledge is weak or imprecise, it is virtually impossible to characterize this knowledge as a single precise probability distribution [Berger, 1984]. In effect there may always be some degree of doubt associated with the precise probability distribution $\pi(\theta)$. In spite of this, $\pi(\theta)$ might still be considered as representing the best effort in characterizing the subjective knowledge in the context of conventional Bayesian theory.

The lack of precision in $\pi(\theta)$ can be reflected by introducing a factor $\epsilon$, $0.0 \le \epsilon \le 1.0$, to represent the degree of doubt one has about the precise nature of $\pi(\theta)$, as has been briefly discussed in Section 5.4 of Chapter 5. This is generally referred to as the contamination of a prior distribution in the literature. 
In Robust Bayesian analysis an almost identical scheme is used, and $1 - \varepsilon$ is interpreted as the probability that the true prior probability distribution is $\pi(\theta)$ and $\varepsilon$ as the probability that the prior distribution can take any form [Berger, 1984]. In the D-S scheme, however, $\varepsilon$ is simply interpreted as the amount of ignorance one has about any specification of $\theta$, and $1 - \varepsilon$ as the confidence in the belief assignment $\pi(\theta)$ on $\theta$. The $\pi(\theta)$ ordinates can then be discounted by a factor $1 - \varepsilon$ and the remaining portion of belief, $\varepsilon$, can be assigned to the whole frame $\Theta$. This naturally leads to a BPA, known as a contamination BPA [Wasserman, 1990b], as follows

$$m(\theta) = (1 - \varepsilon)\,\pi(\theta), \qquad m(\Theta) = \varepsilon \tag{7.1}$$

The contamination BPA expressed by (7.1) is governed by the factor $\varepsilon$. When $\varepsilon$ is zero, indicating that precise probability values can be elicited from the subjective knowledge with complete confidence, the BPA becomes a conventional precise probability distribution or a Bayesian BPA. At the other extreme, when $\varepsilon$ is 1.0, indicating that one knows nothing other than that the true $\theta$ value is within $\Theta$, (7.1) specifies a complete ignorance BPA. Any intermediate value of $\varepsilon$, $0.0 < \varepsilon < 1.0$, corresponds to a general BPA which represents weak or imprecise subjective knowledge. The determination of the $\varepsilon$ value is clearly subjective, as it necessarily involves an individual's personal judgement.

Though the above is a simple and easily understood method, one concern is that the derived BPA assigns the same degree of imprecision to all individual elements, including the extreme values in the tails where the knowledge is usually particularly weak. This is inappropriate, especially when the tails of the distributions are of crucial importance in decision making.
In the next section, a more flexible approach to characterizing imprecision will be introduced.

7.2 The BPA Based on Perturbation of a Prior Distribution

The basis of the method is again a Bayesian prior $\pi(\theta)$ on $\theta$ which is to be altered to reflect the weakness of the knowledge it purports to represent. The modification can be achieved by attaching, to any singleton value $\theta$ (or element in the discrete case), an interval (or subset) which contains this value, and assigning the probability density on the singleton value to this interval. Unless there is evidence to suggest otherwise, a reasonable assumption for the interval is that its bounds are symmetrically located on either side of the singleton value. This modification of a precise probability distribution to reflect weakness in the subjective knowledge is called local perturbation of a prior probability distribution [Wasserman, 1990b], and the corresponding intervals are referred to as local perturbation intervals.

The width of the perturbation interval reflects an individual's personal assessment of the weakness of the subjective knowledge. In the simplest case, the width can be considered as a constant for all the intervals. But in the more general and realistic situation, the width of the interval should vary, reflecting different knowledge on different $\theta$ values. For example, the width should be smaller for the $\theta$ values in the central part of the distribution, where there is greater support from experiential information and therefore less uncertainty. In the tails of the distribution, where the uncertainty is typically much greater, the width should be larger.

As with $\varepsilon$ in Section 7.1, the determination of the width of the perturbation interval is necessarily subjective. The constant width situation, which has been briefly discussed by Wasserman [1990b], is presented and further expanded in Section 7.2.1.
The more general case of the width being a function of $\theta$ is considered in Section 7.2.2.

7.2.1 Local Perturbation Interval with Constant Width

Let $[\theta - k, \theta + k]$ denote the interval attached to the point value $\theta$, where $k$ is an unknown constant which needs to be determined. The local perturbation of the distribution $\pi(\theta)$, which involves reassigning the density $\pi(\theta)$ to the interval $[\theta - k, \theta + k]$ as described at the beginning of this section, leads to a BPA density

$$m([\theta - k, \theta + k]) = \pi(\theta) \tag{7.2}$$

This is demonstrated in Figure 7.1, where the points on the diagonal line $l$-$l'$ represent the singleton values of $\theta$. For any singleton value $\theta$, the corresponding perturbation interval $[\theta - k, \theta + k]$ is represented by a point with horizontal and vertical coordinates, $\theta_1$ and $\theta_2$, being $\theta - k$ and $\theta + k$ respectively. Note that the line connecting the point $[\theta - k, \theta + k]$ with the corresponding singleton value point $\theta$ on $l$-$l'$ is perpendicular to the diagonal line $l$-$l'$, and the distance between the two points is $\sqrt{2}\,k$. For constant $k$, the points $[\theta - k, \theta + k]$ for all $\theta$ values form a straight line which is parallel to $l$-$l'$, as represented by line $r$-$r'$ in Figure 7.1. Thus the precise probability distribution $\pi(\theta)$ is situated on the diagonal line $l$-$l'$ and the BPA density expressed in (7.2) is located on line $r$-$r'$, which, as can be seen from Figure 7.1, is simply a parallel shift of $\pi(\theta)$ in the direction perpendicular to $l$-$l'$.

When the width of the perturbation interval is a constant for all $\theta$ values, its determination is simplest. One reasonable way suggested by Wasserman [1990b] involves first choosing an interval, called a reference interval, for $\theta$. A reference interval can be of any form depending on an individual's subjective choice, but it should be selected such that a person feels confident in making a probability judgement within such an interval. For this reason, it should be located in the central part of the precise distribution $\pi(\theta)$.
For example, a reference interval might be $[\mu_\theta - \sigma_\theta, \mu_\theta + \sigma_\theta]$, where $\mu_\theta$ is the mean and $\sigma_\theta$ the standard deviation of $\theta$, i.e. one standard deviation around the mean value. Note that a reference interval should not be confused with the perturbation interval $[\theta - k, \theta + k]$, and the value of $k$ should therefore not be equated with $\sigma_\theta$.

Figure 7.1: The BPA based on local perturbation with constant interval width.

Henceforth a reference interval is assumed to be of the form $[\mu_\theta - \sigma_\theta, \mu_\theta + \sigma_\theta]$. Once a reference interval is selected, the probability of $\theta$ being within this interval based on the precise probability distribution $\pi(\theta)$ is calculated as a single precise value. The plausibility or upper probability, $\mathrm{Pl}([\mu_\theta - \sigma_\theta, \mu_\theta + \sigma_\theta])$, of the reference interval based on the perturbation BPA in (7.2) can also be obtained, and is a function of $k$. One is then asked to set an upper bound $p^*$, which one feels is reasonable, for the probability of the reference interval. This upper probability bound $p^*$ is then set equal to $\mathrm{Pl}([\mu_\theta - \sigma_\theta, \mu_\theta + \sigma_\theta])$, and the constant $k$ is thus determined from this relationship.

The plausibility of the reference interval, $\mathrm{Pl}([\mu_\theta - \sigma_\theta, \mu_\theta + \sigma_\theta])$, can be calculated by definition in conjunction with Figure 7.1. The reference interval $[\mu_\theta - \sigma_\theta, \mu_\theta + \sigma_\theta]$ is represented by point $A$ in Figure 7.1. By definition, $\mathrm{Pl}([\mu_\theta - \sigma_\theta, \mu_\theta + \sigma_\theta])$ is the integral of the BPA density over the area $D_1 D_2 D_3 D_4$. In the particular case considered here, this is equivalent to integrating the BPA density expressed in (7.2) from $B$ to $C$ on line $r$-$r'$. Since the two points $B$ and $C$ represent the perturbation intervals of the two singleton values $\theta_E$ and $\theta_F$ at points $E$ and $F$, the integral is again equivalent to the integration of the precise probability distribution $\pi(\theta)$ from $E$ to $F$ on line $l$-$l'$.

For $\theta_E$ at point $E$, its perturbation interval represented by $B$ is expressed as $[\theta_E - k, \theta_E + k]$.
The upper bound of this interval, $\theta_E + k$, is the same as the singleton value represented by $D_2$, which is $\mu_\theta - \sigma_\theta$, i.e. the lower bound of the interval represented by point $A$. That is

$$\theta_E + k = \mu_\theta - \sigma_\theta \tag{7.3}$$

The value $\theta_E$ is then $\theta_E = \mu_\theta - \sigma_\theta - k$. A similar procedure can be used to determine $\theta_F$, which is $\theta_F = \mu_\theta + \sigma_\theta + k$. Therefore, the plausibility of the reference interval $[\mu_\theta - \sigma_\theta, \mu_\theta + \sigma_\theta]$ can be calculated as

$$\mathrm{Pl}([\mu_\theta - \sigma_\theta, \mu_\theta + \sigma_\theta]) = \int_{\theta_E}^{\theta_F} \pi(\theta)\,d\theta = \int_{\mu_\theta - \sigma_\theta - k}^{\mu_\theta + \sigma_\theta + k} \pi(\theta)\,d\theta \tag{7.4}$$

As an example, consider a situation where $\pi(\theta)$ is a normal density with mean $\mu_\theta$ and standard deviation $\sigma_\theta$. For the reference interval $[\mu_\theta - \sigma_\theta, \mu_\theta + \sigma_\theta]$, the single precise probability value based on $\pi(\theta)$ is $\Phi(1) - \Phi(-1) = 0.683$, where $\Phi(\cdot)$ denotes the cumulative standard normal distribution. If the upper bound of the probability is considered to be $p^*$, then the value for $k$ can be determined from

$$k = -\sigma_\theta\,\Phi^{-1}\!\left(\frac{1 - p^*}{2}\right) - \sigma_\theta = \sigma_\theta\,\Phi^{-1}\!\left(\frac{1 + p^*}{2}\right) - \sigma_\theta \tag{7.5}$$

As a numerical illustration, if an upper probability of 0.80, i.e. $p^* = 0.8$, is considered to be reasonable, then $k$ is approximately $0.29\sigma_\theta$.

7.2.2 Local Perturbation Interval with Varying Width

Let $[\theta - k(\theta), \theta + k(\theta)]$ denote the perturbation interval for the singleton value $\theta$, where $k(\theta)$ varies as a function of $\theta$. The perturbation BPA from $\pi(\theta)$ for this more general situation is similar to the BPA expressed in (7.2), except that the constant $k$ is now replaced by the functional form $k(\theta)$, i.e.

$$m([\theta - k(\theta), \theta + k(\theta)]) = \pi(\theta) \tag{7.6}$$

The determination of $k(\theta)$ can be performed in a similar fashion to that described in Section 7.2.1. At first, a simple reasonable form of $k(\theta)$ as a function of $\theta$ is assumed, which contains some unknown coefficient. Then a reference interval for $\theta$, say $[\mu_\theta - \sigma_\theta, \mu_\theta + \sigma_\theta]$, is selected, and an upper bound of the probability of this interval, $p^*$, is specified by the engineer based on the subjective knowledge.
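The calibration just described reduces, for a normal prior, to inverting the standard normal distribution function. The sketch below is a minimal illustration under that assumption: `constant_half_width` evaluates (7.5), and `linear_width_coefficient` applies the same inversion to the linearly growing half-width $k(\theta) = c\,|\theta - \hat{\theta}|$ that is developed in (7.7) to (7.11). The function names are hypothetical, and Python's standard-library `statistics.NormalDist` stands in for $\Phi$.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def reference_plausibility(z_half_width):
    """Probability that a normal variate lies within z_half_width
    standard deviations of its mean."""
    return STD_NORMAL.cdf(z_half_width) - STD_NORMAL.cdf(-z_half_width)


def _check_upper_bound(p_star):
    # The upper bound cannot be smaller than the precise probability
    # Phi(1) - Phi(-1) of the reference interval, and must be below 1.
    if not reference_plausibility(1.0) <= p_star < 1.0:
        raise ValueError("p_star must lie in [0.683, 1)")


def constant_half_width(p_star, sigma=1.0):
    """Half-width k of (7.5): k = sigma * (Phi^-1((1 + p*) / 2) - 1)."""
    _check_upper_bound(p_star)
    return sigma * (STD_NORMAL.inv_cdf((1.0 + p_star) / 2.0) - 1.0)


def linear_width_coefficient(p_star):
    """Coefficient c of (7.11): c = 1 - 1 / Phi^-1((1 + p*) / 2)."""
    _check_upper_bound(p_star)
    return 1.0 - 1.0 / STD_NORMAL.inv_cdf((1.0 + p_star) / 2.0)


k = constant_half_width(0.8)        # in units of sigma
c = linear_width_coefficient(0.8)
# Consistency check: both calibrations reproduce the requested upper bound.
pl_constant = reference_plausibility(1.0 + k)
pl_linear = reference_plausibility(1.0 / (1.0 - c))
```

With $p^* = 0.8$ this direct evaluation gives $k \approx 0.28\sigma_\theta$ and $c \approx 0.22$. The values of $0.29\sigma_\theta$ and 0.225 quoted in the numerical illustrations of Sections 7.2.1 and 7.2.2 are what one obtains when $\Phi^{-1}(0.9)$ is rounded to 1.29.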
The $p^*$ value is then set equal to the plausibility of the reference interval based on the BPA density in (7.6), and the unknown coefficient in $k(\theta)$ is thus determined from this relationship.

As an example, consider a simple situation where $k(\theta)$ is assumed to be zero at the maximum likelihood point $\hat{\theta}$ (i.e. where $\pi(\theta)$ is maximum) and to increase linearly as $\theta$ moves toward the two tails, i.e.

$$k(\theta) = c \cdot |\theta - \hat{\theta}| \tag{7.7}$$

where $c$ is an unknown coefficient which needs to be determined. This is demonstrated in Figure 7.2, where the local perturbation intervals $[\theta - k(\theta), \theta + k(\theta)]$ are represented by the points on the two lines $r$-$o$ and $o$-$r'$. Note here that the point $o$ on line $l$-$l'$ represents the maximum likelihood value $\hat{\theta}$. The perturbation BPA expressed in (7.6) is situated on lines $r$-$o$ and $o$-$r'$, as shown in Figure 7.2.

Figure 7.2: The BPA based on local perturbation with varying interval width.

The reference interval $[\mu_\theta - \sigma_\theta, \mu_\theta + \sigma_\theta]$ is represented by point $A$ in Figure 7.2. Again, by definition, the plausibility of this interval, $\mathrm{Pl}([\mu_\theta - \sigma_\theta, \mu_\theta + \sigma_\theta])$, is the integral of the BPA density expressed in (7.6) over the area $D_1 D_2 D_3 D_4$ which, for this particular situation, is equivalent to the integration of the BPA density on lines $B$-$o$ and $o$-$C$. Since the points $B$ and $C$ represent the local perturbation intervals of the singleton values $\theta_E$ and $\theta_F$ at points $E$ and $F$, the integral is equivalent to integrating the precise probability distribution $\pi(\theta)$ from $E$ to $F$ on line $l$-$l'$.

For the singleton value $\theta_E$ at $E$, its perturbation interval represented by point $B$ can be expressed as $[\theta_E - k(\theta_E), \theta_E + k(\theta_E)]$.
From Figure 7.2, the upper bound of this interval, $\theta_E + k(\theta_E)$, is the same as the lower bound of the interval represented by $A$, which is $\mu_\theta - \sigma_\theta$, i.e.

$$\theta_E + k(\theta_E) = \mu_\theta - \sigma_\theta \tag{7.8}$$

Note that this expression is the same as (7.3) except that the constant $k$ is replaced by the functional form $k(\theta)$. Substituting (7.7) for $k(\theta)$ in (7.8), $\theta_E$ is determined as $\theta_E = (\mu_\theta - \sigma_\theta - c\hat{\theta})/(1 - c)$. Similarly, $\theta_F$ at point $F$ can be found as $\theta_F = (\mu_\theta + \sigma_\theta - c\hat{\theta})/(1 - c)$. The plausibility of the reference interval, $\mathrm{Pl}([\mu_\theta - \sigma_\theta, \mu_\theta + \sigma_\theta])$, can then be calculated as

$$\mathrm{Pl}([\mu_\theta - \sigma_\theta, \mu_\theta + \sigma_\theta]) = \int_{\frac{\mu_\theta - \sigma_\theta - c\hat{\theta}}{1 - c}}^{\frac{\mu_\theta + \sigma_\theta - c\hat{\theta}}{1 - c}} \pi(\theta)\,d\theta \tag{7.9}$$

Setting (7.9) equal to $p^*$, the coefficient $c$ can then be determined from this relationship. For example, if $\pi(\theta)$ is normal with mean $\mu_\theta$ and standard deviation $\sigma_\theta$, then $\hat{\theta} = \mu_\theta$ and (7.9) becomes

$$\mathrm{Pl}([\mu_\theta - \sigma_\theta, \mu_\theta + \sigma_\theta]) = \Phi\!\left(\frac{1}{1 - c}\right) - \Phi\!\left(-\frac{1}{1 - c}\right) \tag{7.10}$$

Letting this expression be equal to the upper probability bound $p^*$ of the reference interval, the coefficient $c$ can then be determined from

$$c = 1 + \frac{1}{\Phi^{-1}\!\left(\frac{1 - p^*}{2}\right)} = 1 - \frac{1}{\Phi^{-1}\!\left(\frac{1 + p^*}{2}\right)} \tag{7.11}$$

As a numerical illustration, if the engineer specifies $p^*$ as 0.8, then $c$ is approximately 0.225.

7.3 BPA Based on the Range and Most Likely Value of $\theta$

In civil engineering, it may be convenient and more realistic for one to express one's knowledge about $\theta$ in terms of the range and the most likely true value of $\theta$ [Ang and Tang, 1984]. The application of conventional statistics requires a precise probability distribution within this range to represent the uncertainty of $\theta$. See Ang and Tang [1984] for possible types of distributions. However, from the imprecise probability theory point of view, the adoption of any type of precise probability distribution within the range will necessarily imply significant extra information. In this section a possible way of using D-S theory to express weak subjective knowledge of this kind will be presented.

Let $\theta_l$ and $\theta_u$ represent the lower and upper limits of the possible values for $\theta$, and let $\hat{\theta}$ represent the most likely value of $\theta$.
It is reasonable to assume that the possibility of any $\theta$ value being the truth decreases as it moves away from $\hat{\theta}$. Without presuming any specific form of BPA, it is also fair to assume that the subjective knowledge be represented in a consonant way, hence that a consonant BPA be adopted. Finally, for practical purposes, the form of BPA should be simple.

Figure 7.3: The contiguous frame of parameter $\theta \in [\theta_l, \theta_u]$.

Consider the triangular diagram representing the contiguous frame of the unknown state of nature $\theta$, as shown in Figure 7.3. The point $D$ on the diagonal line $AC$ represents the most likely value $\hat{\theta}$ of $\theta$, and the upper left corner, i.e. point $B$, represents the whole set $\Theta = [\theta_l, \theta_u]$. Now consider the straight line connecting points $D$ and $B$; any point on this line represents an interval which contains the singleton value $\hat{\theta}$. Note that the size of the interval corresponding to each point increases linearly as one moves from point $D$ to $B$. Thus the intervals represented by all the points on $DB$ are nested, and any BPA function expressed entirely on this line represents a consonant BPA.

When specifying a BPA density on line $DB$, the only singleton point which could receive positive support, i.e. a positive BPA density value, is $\hat{\theta}$. Furthermore, the plausibility of $\hat{\theta}$ always has the maximum value of 1.0, and the plausibility of a $\theta$ value decreases as it moves away from $\hat{\theta}$.

A BPA density expressed on line $DB$ captures the main features of the subjective knowledge. However, the question of how to specify the BPA density along the line $DB$ for any particular situation still remains unresolved.
In the rest of this section, the implication of different forms of BPA on line $DB$ when representing subjective knowledge will be discussed.

Since the size of the interval represented by any point on $DB$ increases linearly from $D$ to $B$, any interval represented on the line $DB$, $[\theta_1, \theta_2]$, can be expressed by the general form

$$[\theta_1, \theta_2] = [\hat{\theta} + w(\theta_l - \hat{\theta}),\; \hat{\theta} + w(\theta_u - \hat{\theta})] \tag{7.12}$$

where $w$ varies from 0.0 to 1.0. When $w$ is 0.0, the interval becomes $[\hat{\theta}, \hat{\theta}]$, which is a singleton value represented by point $D$. When $w$ is 1.0, the interval becomes the whole set $[\theta_l, \theta_u]$, which is represented by point $B$ in the upper left corner of the diagram. Therefore, the interval represented by any point on $DB$ is a function of $w$.

Before discussing general subjective knowledge, it is helpful to consider first the two simple but important extreme cases. One is the deterministic case, where the true value of $\theta$ is known to be $\hat{\theta}$ with certainty. This is represented by a BPA which assigns the total belief 1.0 to $\{\hat{\theta}\}$. The other case is complete ignorance, where the true $\theta$ value is only known to be within $[\theta_l, \theta_u]$. This is represented by the BPA which assigns the total belief 1.0 to $[\theta_l, \theta_u]$.

The general subjective knowledge representation by a BPA density on $DB$ can take various forms depending on how strongly one feels about $\hat{\theta}$ being the truth compared with the other $\theta$ values. If $\hat{\theta}$ is considered much more likely to be the truth than the other $\theta$ values, one would like to reflect this fact by assigning greater belief to the points closer to $D$, i.e. to the intervals which are smaller. This suggests a monotone decreasing BPA density from $D$ to $B$, which might be expressed by the following functional form of a simple Beta distribution

$$m_1([\theta_1, \theta_2]) = (\lambda_1 + 1)(1 - w)^{\lambda_1} \tag{7.13}$$

where the parameter $\lambda_1 \ge 0$. When $\lambda_1$ increases, (7.13) assigns more belief to the points closer to $D$, indicating stronger subjective knowledge.
In the extreme case when $\lambda_1$ goes to infinity, (7.13) converges to a unit spike at point $D$, which represents the deterministic case. On the other hand, when $\lambda_1$ goes to zero, (7.13) becomes a uniform BPA density on $DB$, which can be viewed as representing the situation where one just knows that $\hat{\theta}$ is the most likely value of $\theta$ and knows little about how it compares with other $\theta$ values. The BPA densities represented by (7.13) for different $\lambda_1$ values are demonstrated in Figure 7.4(a). Let $\mathrm{Pl}_1(\theta)$ be the corresponding plausibility distribution on the singleton values. In a different context, Wasserman [1990b] called $\mathrm{Pl}_1(\theta)$ the upper probability distribution of $\theta$. The plots of $\mathrm{Pl}_1(\theta)$ for different $\lambda_1$ values are shown in Figure 7.4(b). From Figure 7.4(b) it can be seen that as $\lambda_1$ increases from zero to infinity, the plausibility plot varies from a triangular shape to a unit value at $\hat{\theta}$, which represents the deterministic case.

Figure 7.4: a) The BPA density functions $m_1([\theta_1, \theta_2])$; b) the plausibility functions $\mathrm{Pl}_1(\theta)$.

Now consider the situation where one feels that while $\hat{\theta}$ is the most likely value of $\theta$, the other $\theta$ values are also very likely to be the truth. This represents weaker knowledge than discussed above, and it can be reflected by assigning greater belief to the larger intervals, i.e. to the points closer to $B$. In contrast to the previous case, this suggests a monotone increasing BPA density from $D$ to $B$, which might be expressed by the general functional form

$$m_2([\theta_1, \theta_2]) = (\lambda_2 + 1)\,w^{\lambda_2} \tag{7.14}$$

where $\lambda_2 \ge 0$. When $\lambda_2$ is zero, (7.14) becomes a uniform distribution representing the situation where one only knows that $\hat{\theta}$ is the most likely value. As $\lambda_2$ increases, (7.14) assigns more belief to the larger intervals and thus represents weaker subjective knowledge.
In the extreme case when $\lambda_2$ goes to infinity, the BPA density converges to a unit spike at point $B$ and represents complete ignorance. The BPA densities for different $\lambda_2$ values are demonstrated in Figure 7.5(a), and the corresponding plausibility distributions $\mathrm{Pl}_2(\theta)$ are plotted in Figure 7.5(b). From Figure 7.5(b) it can be seen that as $\lambda_2$ increases from zero to infinity, the plausibility plot varies from the triangular shape to a uniform distribution of plausibility, which represents complete ignorance.

Figure 7.5: a) The BPA density functions $m_2([\theta_1, \theta_2])$; b) the plausibility functions $\mathrm{Pl}_2(\theta)$.

In engineering practice, subjective knowledge of $\theta$ can often be expressed as knowing the range and the most likely value of $\theta$. In this section, a flexible yet reasonable method of representing this type of subjective knowledge in D-S theory has been proposed. It can accommodate widely differing feelings concerning how strongly a person feels about $\hat{\theta}$ being the most likely value compared to the other $\theta$ values. To achieve this, the BPA function takes on various forms as represented by (7.13) and (7.14). It should be noted here that the specification of values for either $\lambda_1$ or $\lambda_2$ in (7.13) or (7.14) would have to be based on subjective judgement.

Similar to the approaches of Sections 7.1 and 7.2, the approach proposed in this section for representing subjective knowledge is not justified by rigorous theoretical argument but rather by intuition and practical needs. But as subjective knowledge is likely to be imprecise, the approach captures the main features of the subjective knowledge and has some intuitively appealing properties.
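The plausibility curves described for Figures 7.4(b) and 7.5(b) can be sketched numerically. The closed forms used below are not stated in the text; they follow from one assumption, namely that (7.13) and (7.14) are densities with respect to $w$ on $[0, 1]$. A value $\theta$ is contained in the nested interval (7.12) exactly when $w \ge w_0(\theta)$, so integrating the densities from $w_0$ to 1 gives $\mathrm{Pl}_1(\theta) = (1 - w_0)^{\lambda_1 + 1}$ and $\mathrm{Pl}_2(\theta) = 1 - w_0^{\lambda_2 + 1}$. All function and argument names are hypothetical.

```python
def nesting_level(theta, lower, mode, upper):
    """Smallest w in [0, 1] for which the interval of (7.12) contains theta."""
    if not lower <= mode <= upper:
        raise ValueError("mode must lie within [lower, upper]")
    if not lower <= theta <= upper:
        raise ValueError("theta must lie within [lower, upper]")
    if theta < mode:
        return (mode - theta) / (mode - lower)
    if theta > mode:
        return (theta - mode) / (upper - mode)
    return 0.0


def plausibility_m1(theta, lower, mode, upper, lam):
    """Singleton plausibility under the decreasing density (7.13)."""
    w0 = nesting_level(theta, lower, mode, upper)
    return (1.0 - w0) ** (lam + 1.0)


def plausibility_m2(theta, lower, mode, upper, lam):
    """Singleton plausibility under the increasing density (7.14)."""
    w0 = nesting_level(theta, lower, mode, upper)
    return 1.0 - w0 ** (lam + 1.0)


# Hypothetical range [0, 10] with most likely value 4, evaluated at theta = 7.
triangular = plausibility_m1(7.0, 0.0, 4.0, 10.0, lam=0.0)
strong = plausibility_m1(7.0, 0.0, 4.0, 10.0, lam=3.0)
weak = plausibility_m2(7.0, 0.0, 4.0, 10.0, lam=3.0)
```

Under this assumption the sketch reproduces the limiting behaviour described above: with a zero exponent both forms give the triangular plausibility $1 - w_0$, a large $\lambda_1$ concentrates plausibility at $\hat{\theta}$, and a large $\lambda_2$ pushes the plausibility of every $\theta$ in the range toward 1.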
Furthermore, the adoption of a simple BPA expression to formalize subjective knowledge with imprecision makes the approach easier to understand and more practical to implement. Nevertheless, the approach should not be considered in any sense as a rigorous theoretical approach, which does not yet exist.

7.4 BPA Based on Constructive Probability of Canonical Example

In Sections 7.1, 7.2 and 7.3, the D-S representation of subjective knowledge in three different situations has been discussed. However, in real practice, the subjective knowledge may be expressed in other ways which do not conform with these three situations. In this section, yet another approach for representing subjective knowledge using a BPA will be discussed.

The approach is based on the concept of "constructive probability" proposed by Shafer and Tversky [Shafer, 1981, 1982a, 1987; Shafer and Tversky, 1985]. According to constructive probability theory, the probability assessment of any subjective knowledge can be achieved by comparing the subjective knowledge with a set of canonical examples. The canonical examples are based on situations which involve familiar but uncertain events and about which one feels comfortable making probability judgements. Through a comparative approach, one can choose a specific canonical example with known probability specification which best matches the subjective knowledge in strength. This probability specification is then considered as representative of the subjective knowledge. It should be noted that canonical examples do not need to match the real uncertainty problem. Rather, they are meant only to be standard examples with which the subjective knowledge can be compared and the probabilities thereby assessed.

Different canonical examples might lead to different probability representations, such as Bayesian or D-S, of the subjective knowledge [Shafer, 1981, 1982a; Shafer and Tversky, 1985].
For example, a set of canonical examples might be based on the game of tossing a die in which the possible outcomes are governed by different known chances. Thus the probability judgement $p$ of some event $A$ based on subjective knowledge might be compared to a chance distribution which produces the event $A$ with exact frequency $p$. The comparison of the subjective knowledge with this set of canonical examples will lead to a precise Bayesian probability distribution representation of the subjective knowledge.

As another illustration, a set of canonical examples was given by Shafer and Tversky [1985] which leads to a special type of BPA called a simple BPA. Imagine a truth-telling machine which is only sometimes reliable. That is, the machine operates in two modes: reliable and unreliable. If the machine is reliable, it will send out a message telling, for example, that the truth is in $A$. On the other hand, if the machine is unreliable, its message tells nothing at all. The chances of the machine being in the two modes can vary, thus forming a set of canonical examples. If the chance of the machine being reliable is believed to be $p$, the resulting BPA will be $m(A) = p$ and $m(\Theta) = 1 - p$.

This elementarily structured BPA is referred to as a simple BPA. In practice, the type of subjective knowledge which can be expressed through comparison with this set of canonical examples should be relatively explicit about the bounds of the uncertain parameter $\theta$, even though it is not entirely confident about $\theta$ lying in this interval. For example, one may conclude that, based on the subjective knowledge, the parameter $\theta$ is in the subset $A$ but be only $100p\%$ sure about this conclusion. This leads to the simple BPA function as expressed at the end of the preceding paragraph.

More complex types of canonical examples, potentially representing more complex subjective knowledge, are possible.
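To make the simple BPA concrete, the sketch below writes $m(A) = p$, $m(\Theta) = 1 - p$ on a small discrete frame and, should several such pieces of evidence need to be pooled, combines two of them with Dempster's rule of combination, the standard combination rule of D-S theory. The frame, the subsets and the reliabilities are hypothetical, and the two pieces of evidence are assumed to be unrelated.

```python
from itertools import product


def simple_bpa(frame, subset, p):
    """Simple BPA: mass p on `subset`, the remaining 1 - p on the whole frame."""
    frame, subset = frozenset(frame), frozenset(subset)
    if not subset or not subset <= frame:
        raise ValueError("subset must be a non-empty part of the frame")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if subset == frame:
        return {frame: 1.0}
    return {subset: p, frame: 1.0 - p}


def dempster_combine(m1, m2):
    """Dempster's rule: multiply masses, intersect focal sets,
    and renormalise by the mass that did not fall on the empty set."""
    combined, conflict = {}, 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            combined[common] = combined.get(common, 0.0) + mass_a * mass_b
        else:
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("totally conflicting evidence cannot be combined")
    return {s: m / (1.0 - conflict) for s, m in combined.items()}


# Two hypothetical, unrelated pieces of evidence about a flood severity class.
frame = {"low", "medium", "high"}
m_a = simple_bpa(frame, {"medium", "high"}, 0.7)
m_b = simple_bpa(frame, {"low", "medium"}, 0.6)
pooled = dempster_combine(m_a, m_b)
```

In this hypothetical case the two focal subsets overlap, so no mass is lost to conflict and the pooled BPA places its largest share on the common element `"medium"`.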
See Shafer and Tversky [1985] and Shafer [1981, 1982a] for examples. However, it is very difficult, or virtually impossible, to obtain probability assessments by comparing the subjective knowledge with complex canonical examples. A more practical way to deal with complicated subjective knowledge, also suggested by Shafer [1981] and Shafer and Tversky [1985], is to decompose it into several simple and unrelated pieces of evidence, then represent each piece of evidence by a simple BPA and, finally, combine all the simple BPA's to form a more complex BPA which represents the total subjective knowledge. This decomposition may not be entirely straightforward, and always involves considerable personal judgement in the process.

The constructive probability theory has been briefly introduced in this section. Although the idea of probability elicitation by comparing the subjective knowledge with canonical examples is very attractive, the process may be difficult to apply in practice. Nevertheless, as emphasized by Shafer and others [e.g. Shafer, 1981, 1982a; Shafer and Tversky, 1985], the theory has important philosophical significance in that it tends to provide an alternative way of justifying existing probability theories. Thus the introduction of the constructive probability theory is perhaps more significant in its philosophical aspects than as a tool for probability assessment. For a rigorous introduction to the concept of constructive probability theory, see Shafer [1981].

7.5 Summary

In this chapter, the representation of weak subjective knowledge in the form of a BPA has been discussed. In Sections 7.1 and 7.2, intuitive, approximate approaches based on ideas drawn from the statistical literature were introduced for a situation where the subjective knowledge is represented by a precise probability distribution but there is not complete confidence in this distribution. A third and new development, applicable to a situation where the range and most likely value of the unknown parameter are known, together with some qualitative idea of how this most likely value compares with the other $\theta$ values, was presented in Section 7.3. This represents a typical form of weak subjective knowledge often encountered in civil engineering circumstances. These three approaches might all be considered as reasonable practical methods of representing weak subjective knowledge. In Section 7.4, the constructive probability theory was described, and this provides a more theoretical basis for generating a BPA from subjective knowledge. The importance of the constructive probability theory, however, is that it provides a possible alternative justification for all of the existing probability theories.

The approaches discussed in this chapter are not in any sense exhaustive, and other ways of generating a BPA from the subjective knowledge are possible. But in any practical situation a general rule in representing complex subjective knowledge should be to decompose it into several simple, unrelated pieces of evidence, represent each piece of knowledge by the simplest form of BPA, and combine these BPA's to obtain a BPA reflecting the total subjective knowledge. While this and other strategies have been discussed at some length in the AI literature, it is still too early to judge the superiority of any one strategy over the others.

Chapter 8

The D-S Theory Applied in Hydrologic Design

8.1 Hydrologic Engineering Design

In designing a new water resources project, or upgrading an existing one, a decision often has to be made on the choice of a design flood.
The economic criterion is to choose the design flood that will minimize the total annual expected cost, which includes the construction cost and the expected damage cost [Linsley and Franzini, 1979]. The construction cost depends on the design flood adopted, and the expected damage depends on the probabilistic characteristics of the flood, the damage as a function of the flood magnitude, and the design flood adopted.

For a long project life, the flood process is usually characterized by the annual maximum flood values, called the annual floods. In traditional flood frequency analysis, a statistical model based on a conventional probability distribution is used to describe the annual floods. One way of determining this statistical model is to fit the past observations of the annual floods to a variety of probability distributions and then choose the distribution which gives the best fit to the sampling data. The parameters of the model can be estimated either by the conventional maximum likelihood method, which uses only sampling information, or by the Bayesian method, which uses both sampling and subjective prior information. The annual expected damage is calculated from the resulting statistical model and a specified damage function.

Hydrologic design for large important projects is invariably concerned with rare flood events. Thus, typically, the design flood has a long return period or, equivalently, a small probability of being exceeded, and is much larger than the maximum observed annual flood. Also, the major flood damage is caused not by the small, more frequent events, but by the rarer floods which occupy the tail portion of the distribution of annual floods.
Therefore the shape of the tail of the distribution is significant in conventional hydrologic design and has attracted much attention amongst hydrologists [e.g. Shen et al., 1980; Stedinger and Grygier, 1985; Haimes et al., 1988].

In a typical hydrologic engineering design, the duration of recorded observations of annual floods is usually short relative to the return period of the design flood. As a consequence, a variety of statistical distributions may fit the data equally well. These equally qualified distributions may have similar shapes in the central part, where the observed data are concentrated, but may be very different in the tails. Yet for the purpose of hydrologic design, these distributions must be extrapolated to the region of rare floods with long return periods, where there are usually no sample data. The enormous uncertainties associated with these extrapolations have been recognized by engineers. For example, Benjamin and Cornell [1970, p. 499] warn that estimates of the probabilities of rare events based on extrapolations of different distributions can differ by an order of magnitude. Shen et al. [1980] fitted sampling data to two different types of distributions, Gumbel type I and type II, and concluded that "a decision to use a type I model for data more appropriately modeled by type II model can lead to terrible underestimates of extreme events." Clearly, hydrologic design based on the direct extrapolations of the distributions is potentially unreliable.

The U.S. Bureau of Reclamation [1986] also reports that the uncertainty associated with the estimation of a rare flood based on the extrapolation of a fitted distribution is enormous. As a result the report concludes that "the frequency relationships should not be extrapolated beyond twice the length of record or 100-yr return period whichever provides the rarer annual probability of being equalled or exceeded".
Since it is not adequate simply to extrapolate a fitted distribution to the rare events, two separate curves, one for the more frequent floods, say those with less than a 100-yr return period, and another for the rare floods, say those between the 100-yr event and the PMF, are adopted to describe the annual floods [e.g. U.S.B.R., 1986; Stedinger and Grygier, 1985]. Here the PMF denotes the Probable Maximum Flood, and its probability of exceedance, according to the definition, should be zero. The PMF should therefore be the upper bound of the annual flood distribution.

The flood frequency curve for the more frequent events can be determined through conventional flood frequency analysis. For the rare floods between the 100-yr event and the PMF, observations are usually not available, and other experiential information about the probabilistic aspects of these rare events is invariably limited. A frequency curve for these rare events can be established by first assigning a return period to the PMF and then fitting a two-parameter distribution using the 100-yr flood and the PMF as quantiles. The obvious problem with this approach is that it attaches a finite return period to the PMF, which contradicts the definition of the PMF. The value of the return period assigned to the PMF is therefore entirely arbitrary. Furthermore, the selection of the distribution in this critical interval is also arbitrary, as there is little evidence to support any one choice.

Indeed Haimes et al. [1988] stated that "the return period of the PMF is highly uncertain: different reports recommended values ranging from $10^4$ to $10^{12}$ years. Most of these recommendations are actually based on subjective judgement and experience rather than on statistical evidence.
Similar problems cripple the estimation of the flood-frequency distribution for floods ranging from the 100-yr event to the PMF." Stedinger and Grygier [1985] have extensively studied the sensitivity of the optimal hydrological design to various factors and conclude that this design is very sensitive to the type of distribution tail and the return period assigned to the PMF.

From the above discussion it can be seen that, in the realm of the extreme or limiting design condition, a large amount of uncertainty can be involved in hydrologic engineering design. One main source of uncertainty is the lack of information concerning floods of design-event magnitude, and the conventional approach is limited in dealing with such weak information. In this chapter, a design approach based on the D-S theory, which accommodates the weak information situation, is introduced. It should be noted here that only the uncertainty associated with the annual maximum floods is considered. The damage function is assumed to be known, as its variation does not generally have a significant effect on the optimal design [Stedinger and Grygier, 1985].

8.2 The Basic Hydrologic Design Model

Let the random variable $x \in X$ represent the annual maximum flood. A set of observations of $x$ is known and is denoted as $\{x_1, x_2, \ldots, x_n\}$, ordered and relabelled so that $x_1 < x_2 < \cdots < x_n$. Let $A = \{a_j,\ j = 1, \ldots, m\}$ represent a set of design options; for each $a_j$ there is an annual construction cost, denoted $c_j$. The damage, which is a function of both the flood value $x$ and the design option $a$, is assumed known with certainty and denoted as $D(x, a)$. The problem is to choose the decision $a^* \in A$ that minimizes the total annual expected cost, including construction cost and expected damage.

Since the PMF, which is denoted here as $x_p$, is considered as the upper bound of the annual flood, the domain of the annual flood is from zero to $x_p$, i.e. $x \in (0, x_p)$. The annual flood domain is partitioned by the observations of $x$ into $n+1$ intervals which are represented by $I_0 = (x_0, x_1),\ I_1 = [x_1, x_2),\ \ldots,\ I_n = [x_n, x_p)$, with $x_0 = 0$. Here the square bracket means that the boundary point is contained in the interval while the parenthesis means it is excluded. Thus each of the intervals $I_i$, $i = 1, \ldots, n$, contains exactly one sample observation.

Let $q_i$ denote the probability that $x$ is within the interval $I_i$. The average damage in the interval $I_i$ is denoted by $d_i$ and can be determined through discretization of the damage function. If the number of samples $n$ is very large, the sample data will tend to cover the whole range of the annual flood domain from zero to the PMF, and the probability $q_i$ for each interval $I_i$ can be approximated by $1/n$. The annual expected damage in this case can be calculated by summing the products of $q_i$ and $d_i$ over all intervals, and the result should closely approximate that obtained from the conventional approach based on flood frequency analysis.

In practice however, even though for large hydrologic engineering design there is usually a moderate length of sampling data, typically of the order of 50 to 100 years, the sample record is still short relative to the design event return period, and concentrated over a small portion of the whole annual flood domain. Thus there are no sampling data in the rare flood region and, as a result, the last interval $I_n$, i.e. the interval between the maximum observed flood $x_n$ and $x_p$, is extremely large relative to the rest of the intervals.
While the components of the annual expected damage for the other intervals can still be calculated in the same way as above, it would be inappropriate to calculate the damage component for this last interval using the average damage $d_n$ and probability $q_n$.

For a moderate length of sample record of the order of, say, 50 to 100 years, the last interval $I_n$ then contains the floods from about the 50-yr return period event to the PMF event. In the approach presented here, a function, which is called a probability decay function and denoted by $\phi_n(x)$, is introduced for the last interval $I_n$ to describe the distribution of the probability value $q_n$ within this interval. The function $\phi_n(x)$ is considered to be monotone decreasing and bounded by $x_n$ and $x_p$. Furthermore $\phi_n(x)$ is normalized so that its integral over the interval $I_n$ is 1.0, i.e. $\int_{x_n}^{x_p} \phi_n(x)\,dx = 1.0$.

The probability decay function $\phi_n(x)$ introduced here is somewhat similar to the "tail probability distribution" in conventional flood frequency analysis. However, unlike the conventional approach, it treats the PMF as the bound of the annual floods. Furthermore, the probability decay function does not follow one specific type of probability distribution.

Since there is no evidence about the occurrence of rare floods between $x_n$ and $x_p$, the shape of the probability decay curve could take any of a variety of different forms. The mathematical function adopted for $\phi_n(x)$ should therefore be flexible enough to represent a wide range of different shapes. The following parametric function adopted for $\phi_n(x)$ satisfies this need

\[ \phi_n(x/\lambda) = \frac{\lambda\,(x_p - x)^{\lambda-1}}{(x_p - x_n)^{\lambda}} \tag{8.1} \]

where the parameter $\lambda$ should satisfy $\lambda \ge 1.0$ for $\phi_n(x/\lambda)$ to be a monotone decreasing function.
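The normalization and monotonicity of a decay function of this form can be checked numerically. The following is a minimal Python sketch, assuming the Beta-type form $\phi_n(x/\lambda) = \lambda (x_p - x)^{\lambda-1} / (x_p - x_n)^{\lambda}$; the function names, the midpoint integration rule and the bounds of 30,000 and 150,000 cfs are illustrative choices rather than part of the model.

```python
def phi_n(x, lam, x_n, x_p):
    """Probability decay function of equation (8.1), zero outside [x_n, x_p]."""
    if lam < 1.0:
        raise ValueError("lam must be at least 1.0")
    if x < x_n or x > x_p:
        return 0.0
    width = x_p - x_n
    # Written in terms of the ratio (x_p - x) / width to avoid very large powers.
    return lam / width * ((x_p - x) / width) ** (lam - 1.0)


def integrate(f, a, b, steps=20_000):
    """Composite midpoint rule (an illustrative choice of quadrature)."""
    h = (b - a) / steps
    return h * sum(f(a + (k + 0.5) * h) for k in range(steps))


X_N, X_P = 30_000.0, 150_000.0  # illustrative bounds in cfs

# Area under the decay function for a few shape parameters; each should be near 1.
areas = {
    lam: integrate(lambda x, lam=lam: phi_n(x, lam, X_N, X_P), X_N, X_P)
    for lam in (1.0, 3.0, 10.0)
}
```

Larger values of `lam` concentrate the probability near `x_n`, which can be seen by comparing `phi_n` at two flood values for a fixed `lam`.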
The function $\phi_n(x/\lambda)$ is recognizable as a special form of the general Beta distribution with one shape parameter being $\lambda$, the other shape parameter being 1.0, and two location parameters $x_n$ and $x_p$ (see Bury [1975] for a more thorough discussion of the Beta distribution). The integral of $\phi_n(x/\lambda)$ from $x_n$ to $x_p$ is 1.0.

Once the two location parameters $x_n$ and $x_p$ are fixed, the probability decay function $\phi_n(x/\lambda)$ depends entirely on the parameter $\lambda$. As an illustration, consider a situation where the maximum annual flood observation is $x_n = 30{,}000$ cfs and the PMF is $x_p = 150{,}000$ cfs. The resulting shapes of $\phi_n(x/\lambda)$ for different $\lambda$ values are plotted in Figure 8.1.

Figure 8.1: The probability decay function

From Figure 8.1 it can be seen that when $\lambda$ is 1.0, $\phi_n(x/\lambda)$ represents a uniform distribution. When $\lambda$ is greater than 1.0, $\phi_n(x/\lambda)$ becomes a monotone decreasing function from $x_n$ to $x_p$. As $\lambda$ becomes bigger the probability decay curve indicates a faster initial probability decay from $x_n$. In the extreme case when $\lambda$ tends to infinity, $\phi_n(x/\lambda)$ becomes a unit spike at the point $x_n$, meaning that the total probability $q_n$ is concentrated at the point $x_n$. Thus the probability decay function $\phi_n(x/\lambda)$ defined in (8.1) is very flexible. In the following section, the uncertainty associated with the parameter $\lambda$ will be assessed using the D-S theory.

The damage function for the interval $I_n$ is defined by $D(x, a)$ but within this interval is denoted here by $D_n(x, a)$. Knowing the probability decay function $\phi_n(x/\lambda)$ and the damage function $D_n(x, a)$ for the interval $I_n$, the total annual expected damage $L(q_i, \lambda, x_p)$ can then be calculated by summing the contributions from the smaller discretized intervals 1 to $n-1$ and integrating over the larger final interval:

\[ L(q_i, \lambda, x_p) = \sum_{i=1}^{n-1} q_i d_i + q_n \int_{x_n}^{x_p} \phi_n(x/\lambda)\, D_n(x, a)\,dx \tag{8.2} \]

Now let $p_i$ represent the probability that $x$ is in the interval $(0, x_{i+1})$, where $i = 1, \ldots, n-1$. The value $p_n$ then is the probability that $x$ is within $(0, x_p)$ and is 1.0. The probabilities $q_i$ and $p_i$ are related as follows

\[ q_i = p_i - p_{i-1} \quad \text{for } i = 1, \ldots, n \tag{8.3} \]

Substituting (8.3) in (8.2) yields

\[ L(p_i, \lambda, x_p) = \sum_{i=1}^{n-1} (p_i - p_{i-1})\, d_i + (1 - p_{n-1}) \int_{x_n}^{x_p} \phi_n(x/\lambda)\, D_n(x, a)\,dx \tag{8.4} \]

For any given $i$, the unknown probability $p_i$ can also be interpreted as a binomial parameter, and its value might be estimated from the observations of $i$ "successes" in the interval $(0, x_{i+1})$ and $n - i$ "failures" outside this interval. Let $E(p_i)$ represent the expected value of $p_i$. Since the loss function $L(p_i, \lambda, x_p)$ expressed in (8.4) is a linear function of $p_i$, the expectation of $L(p_i, \lambda, x_p)$ with respect to $p_i$ can then be expressed as

\[ E[L(p_i, \lambda, x_p)] = \sum_{i=1}^{n-1} \big(E(p_i) - E(p_{i-1})\big)\, d_i + \big(1 - E(p_{n-1})\big) \int_{x_n}^{x_p} \phi_n(x/\lambda)\, D_n(x, a)\,dx \tag{8.5} \]

The annual expected damage therefore is a function of the unknown parameters $E(p_i)$, $\lambda$ and $x_p$. In the next section, the D-S theory perspective on the uncertainties in these parameters will be discussed and incorporated into the expected damage calculation.

8.3 The D-S Approach to Parameter Uncertainties

For a moderate sample record of, say, 50 to 100 years, it is reasonable to assume that the shape of the probability decay function $\phi_n(x/\lambda)$ is independent of the samples and thus of the probability values $p_i$. The parameter $\lambda$ therefore can be considered to be independent of the parameters $E(p_i)$. It is also reasonable to consider the PMF estimate $x_p$ independent of the parameters $E(p_i)$ and $\lambda$, since the PMF governs only the termination of the distribution tail but not its shape. The three sets of parameters $E(p_i)$, $\lambda$ and $x_p$ are thus considered to be mutually independent and can therefore be dealt with separately.
In the following three subsections the uncertainties associated with the three sets of parameters $E(p_i)$, $\lambda$ and $x_p$ will be discussed individually. It will be seen, however, that the order of dealing with these parameters does not affect the results.

8.3.1 Uncertainties Associated with $E(p_i)$

Recall that the probabilities $p_i$ are a set of binomial parameters. The D-S inference of the binomial parameter $p_i$ based on the sampling information was presented in Chapter 4. Let $[p_{i,u}, p_{i,v}]$ represent the continuous interval in the contiguous frame of $p_i$. From equation (4.5) of Chapter 4, the D-S inference of the parameter $p_i$ with $i$ successes and $n - i$ failures is expressed by the following BPA,

\[ m([p_{i,u}, p_{i,v}]) = \binom{n}{i}\, i\,(n-i)\; p_{i,u}^{\,i-1}\,(1 - p_{i,v})^{\,n-i-1} \tag{8.6} \]

The two variates $p_{i,u}$ and $p_{i,v}$ have the same meaning as the variable $p_i$, except that they are used to form the random interval $[p_{i,u}, p_{i,v}]$. The two marginal distributions of $p_i$, as expressed in (4.6), are

\[ f(p_i) = \binom{n}{i}\, i\; p_i^{\,i-1}(1 - p_i)^{\,n-i}, \qquad g(p_i) = \binom{n}{i}\,(n-i)\; p_i^{\,i}(1 - p_i)^{\,n-i-1} \tag{8.7} \]

From these two marginal distributions, the upper and lower expected values of $p_i$ can then be calculated as

\[ E^*(p_i) = \frac{i+1}{n+1}, \qquad E_*(p_i) = \frac{i}{n+1} \tag{8.8} \]

where $i = 1, \ldots, n-1$. The upper and lower expected values $E^*(p_i)$ and $E_*(p_i)$ can be considered as the upper and lower bounds of the unknown expected probability $E(p_i)$.

Since the expected damage (8.5) is a linear function of $E(p_i)$, the upper and lower expected damages can then be calculated by determining all of the $E(p_i)$, subject to the constraining effect of the upper and lower expected probability values. This involves solving the following two simple linear programming problems

I. maximize $E[L(p_i, \lambda, x_p)]$
II. minimize $E[L(p_i, \lambda, x_p)]$

where

\[ E[L(p_i, \lambda, x_p)] = \sum_{i=1}^{n-1} \big[E(p_i) - E(p_{i-1})\big]\, d_i + \big[1 - E(p_{n-1})\big] \int_{x_n}^{x_p} \phi_n(x/\lambda)\, D_n(x, a)\,dx \tag{8.9} \]

all subject to

\[ E_*(p_i) \le E(p_i) \le E^*(p_i) \quad \text{for } i = 1, 2, \ldots, n-1, \qquad E(p_i) \ge 0 \]

Simple analytical solutions for the upper and lower expected damages with respect to $E(p_i)$ can be obtained from the above linear programming problems. Provided the average damages $d_i$ do not decrease with $i$ and do not exceed the damage term of the last interval, every $E(p_i)$ enters (8.9) with a non-positive coefficient, so the maximum is attained at the lower bounds $E_*(p_i)$ and the minimum at the upper bounds $E^*(p_i)$, giving

\[ E^*_{p_i}[L(\lambda, x_p)] = \frac{1}{n+1} \sum_{i=1}^{n-1} d_i + \frac{2}{n+1} \int_{x_n}^{x_p} \phi_n(x/\lambda)\, D_n(x, a)\,dx \]
\[ E_{*\,p_i}[L(\lambda, x_p)] = \frac{1}{n+1} \sum_{i=1}^{n-1} d_i + \frac{1}{n+1} \int_{x_n}^{x_p} \phi_n(x/\lambda)\, D_n(x, a)\,dx \tag{8.10} \]

For any decision $a$, the upper and lower expected damages are now functions of the unknown parameters $\lambda$ and $x_p$. In the next subsection, uncertainty concerning the parameter $\lambda$ will be studied.

8.3.2 Uncertainty with $\lambda$

When the PMF, $x_p$, is specified, the probability decay function $\phi_n(x/\lambda)$ defined by equation (8.1) is totally dependent on the parameter $\lambda$. If enough information is available for determining the shape of $\phi_n(x/\lambda)$, the parameter $\lambda$ will be a constant and the uncertainty of the probability decay function will be eliminated. However, in a real design problem, little evidence is available about the rare floods. Thus there is large uncertainty associated with the shape of the probability decay function or, equivalently, large uncertainty associated with the parameter $\lambda$.

Since the evidence about the unknown parameter $\lambda$ is very weak and likely to be highly subjective in nature, the uncertainty associated with $\lambda$ would appropriately be addressed using the D-S theory and the techniques discussed in Chapter 7. Suppose then that, after considering all the information available, the uncertainty about $\lambda$ is represented by a BPA function in the contiguous frame of $\lambda$. Let $m_\lambda([\lambda_u, \lambda_v])$ denote this BPA, where $[\lambda_u, \lambda_v]$ is a continuous random interval in the contiguous frame of $\lambda$.
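Returning briefly to the two linear programs of Section 8.3.1: because the objective is linear and the constraints form a box, the optimum sits at a corner of the box, so the closed forms in (8.10) can be checked by brute force for a small case. The Python sketch below does this under stated assumptions: the damage values and the tail term are made up for illustration, $E(p_0)$ is treated as one more bounded variable with bounds $0$ and $1/(n+1)$, and the function names are hypothetical.

```python
from itertools import product


def expected_damage(e, d, tail):
    """Equation (8.5): e[i] stands for E(p_i), i = 0..n-1; d[i] is the average
    damage in interval I_i (d[0] is unused); tail is the last-interval integral."""
    n = len(e)
    body = sum((e[i] - e[i - 1]) * d[i] for i in range(1, n))
    return body + (1.0 - e[n - 1]) * tail


def bounds_by_enumeration(d, tail):
    """Evaluate (8.5) at every corner of the box E_*(p_i) <= E(p_i) <= E^*(p_i)."""
    n = len(d)
    box = [(i / (n + 1), (i + 1) / (n + 1)) for i in range(n)]
    values = [expected_damage(corner, d, tail) for corner in product(*box)]
    return max(values), min(values)


def bounds_closed_form(d, tail):
    """Upper and lower expected damages as written in (8.10)."""
    n = len(d)
    body = sum(d[1:]) / (n + 1)
    return body + 2.0 * tail / (n + 1), body + tail / (n + 1)


# Made-up example with n = 6: damages increase with i and stay below the tail term.
d_example = [0.0, 1.0, 2.0, 4.0, 7.0, 11.0]
tail_example = 30.0
enumerated = bounds_by_enumeration(d_example, tail_example)
closed = bounds_closed_form(d_example, tail_example)
```

The agreement relies on the damages being non-decreasing and bounded by the tail term; if that ordering fails, the corner search still returns the correct bounds while the closed form need not.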
The upper and lower marginal distributions of $\lambda$ can then be calculated from $m_\lambda([\lambda_u, \lambda_v])$ and equation (3.12), and are denoted as $f_\lambda(\lambda)$ and $g_\lambda(\lambda)$ respectively.

Now consider the upper and lower expected damages expressed in (8.10). The discussion of the probability decay function $\phi_n(x/\lambda)$ in Section 8.2 reveals that when $\lambda$ becomes large, the probability curve decays rapidly and more probability is distributed to the smaller values of $x$. Since the damage function $D_n(x, a)$ is a monotone increasing function of the annual flood magnitude $x$, it follows from (8.10) that the upper and lower expected damages are monotone decreasing functions of the parameter $\lambda$. Thus, knowing the upper and lower marginal distributions of $\lambda$, $f_\lambda(\lambda)$ and $g_\lambda(\lambda)$, a pair of upper and lower expected damages with respect to $\lambda$ can be calculated for the upper expected damage function. Similarly, the upper and lower expected damages with respect to $\lambda$ can also be calculated for the lower expected damage function. Of interest here are the upper of the upper expected damage, denoted as $E^*_{p_i,\lambda}[L(x_p)]$, and the lower of the lower expected damage, $E_{*\,p_i,\lambda}[L(x_p)]$. These are the upper and lower expected damages with respect to both $E(p_i)$ and $\lambda$. They therefore represent the upper and lower bounds of the damage after considering the uncertainties associated with the parameters $E(p_i)$ and with $\lambda$. According to the definitions, $E^*_{p_i,\lambda}[L(x_p)]$ and $E_{*\,p_i,\lambda}[L(x_p)]$ can be expressed as

\[ E^*_{p_i,\lambda}[L(x_p)] = \frac{1}{n+1} \sum_{i=1}^{n-1} d_i + \frac{2}{n+1} \int_{\lambda} \left[ \int_{x_n}^{x_p} \phi_n(x/\lambda)\, D_n(x, a)\,dx \right] f_\lambda(\lambda)\,d\lambda \]
\[ E_{*\,p_i,\lambda}[L(x_p)] = \frac{1}{n+1} \sum_{i=1}^{n-1} d_i + \frac{1}{n+1} \int_{\lambda} \left[ \int_{x_n}^{x_p} \phi_n(x/\lambda)\, D_n(x, a)\,dx \right] g_\lambda(\lambda)\,d\lambda \tag{8.11} \]

Depending on the information available, various approaches may be used to construct the BPA function for $\lambda$, as discussed in Chapters 4, 5, 6 and 7.
This result will be demonstrated in a numerical example presented in Section 8.4.

8.3.3 Uncertainty with the PMF Estimate $x_p$

The estimation of the PMF requires transforming the Probable Maximum Precipitation using some hydrological model, such as a rainfall-runoff model, or some simulation technique [Linsley et al., 1982]. Models for this transformation are well established and there is relatively less uncertainty associated with the PMF than with the shape of the probability decay function.

Stedinger and Grygier [1985] conclude that the hydrological design is not very sensitive to any adjustment of $x_p$ within a reasonable range. This is equivalent to saying that the annual expected damage for a given design is not very sensitive to variations of $x_p$. If this is the case, the uncertainty of $x_p$ can be ignored and the best estimate of $x_p$ can be used in the hydrological design. However, if the uncertainty of $x_p$ or its influence on the design is considered to be potentially significant, it should be reflected in the design analysis. If information about $x_p$ is weak then the D-S approach should be considered.

Depending on the particular circumstances, the various approaches discussed in Chapters 4, 5, 6 or 7 may be used to obtain a BPA for the unknown $x_p$. Suppose that, after considering all of the information available, a BPA describing the uncertainty of $x_p$ is obtained. The corresponding upper and lower marginal distributions of $x_p$ can then be calculated and are denoted as $f_{x_p}(x_p)$ and $g_{x_p}(x_p)$ respectively. Consider the upper and lower expected damages with respect to $E(p_i)$ and $\lambda$ expressed in (8.11). Since the function $\int_{x_n}^{x_p} \phi_n(x/\lambda)\, D_n(x, a)\,dx$ is a monotone increasing function of $x_p$, and $f_\lambda(\lambda)$ and $g_\lambda(\lambda)$ are both positive functions, the upper and lower expected damages with respect to $E(p_i)$ and $\lambda$ are also monotone increasing functions of the value $x_p$.
Thus, using the upper and lower marginal distributions of $x_p$, the upper and lower expected damages with respect to all the unknown parameters $E(p_i)$, $\lambda$ and $x_p$ can then be calculated as

\[ E^*_{p_i,\lambda,x_p}[L] = \frac{1}{n+1} \sum_{i=1}^{n-1} d_i + \frac{2}{n+1} \int_{x_p} \left[ \int_{\lambda} \left[ \int_{x_n}^{x_p} \phi_n(x/\lambda)\, D_n(x, a)\,dx \right] f_\lambda(\lambda)\,d\lambda \right] g_{x_p}(x_p)\,dx_p \]
\[ E_{*\,p_i,\lambda,x_p}[L] = \frac{1}{n+1} \sum_{i=1}^{n-1} d_i + \frac{1}{n+1} \int_{x_p} \left[ \int_{\lambda} \left[ \int_{x_n}^{x_p} \phi_n(x/\lambda)\, D_n(x, a)\,dx \right] g_\lambda(\lambda)\,d\lambda \right] f_{x_p}(x_p)\,dx_p \tag{8.12} \]

In this section, the uncertainties of the unknown parameters $E(p_i)$, $\lambda$ and $x_p$ have been incorporated, using the D-S theory, into the calculation of the annual expected damage, leading to upper and lower annual expected damages for any decision option $a_j$. The upper and lower annual expected costs for each decision option $a_j$ can be calculated simply by adding the annual equivalent construction cost $c_j$ to the upper and lower expected damages of $a_j$. According to the D-S mini-upper decision criterion as discussed in Chapter 3, the optimal design would be the one which minimizes the upper annual expected cost. In the next section, a numerical example demonstrates the application of the model developed in this chapter.

8.4 Numerical Example

8.4.1 Description of the Problem

The hydrologic design problem presented here is from Stedinger and Grygier [1985]. An existing dam has a spillway which is capable of passing 40,000 cfs (1100 m³/s), which is a fraction of the PMF value whose estimate is 150,000 cfs (4200 m³/s). Four options concerning the spillway are to be considered and these, together with the corresponding design floods and the annual construction costs, are listed in Table 8.1.
Note that option 1 is to do nothing and the other three options expand the spillway capacity.

Table 8.1: Design options and cost (in dollars) (from Stedinger and Grygier [1985]).

Option                                 Design flow a (cfs)    Annual construction cost (in dollars)
1  Do nothing                          40,000                 0
2  Modify spillway                     60,000                 50,000
3  Rebuild spillway and raise dam      120,000                120,000
4  Rebuild spillway and lower crest    150,000                130,000

The flood damage is a function of the flood magnitude. For small floods of less than 10,000 cfs (280 m³/s), equivalent to a 10-yr return period flood, the damage is negligible. For floods between 10,000 cfs and the design flood $a$, the damage is a monotone increasing function of the flood value. When the flood is bigger than the design flood, the dam is considered to be destroyed and the damage is considered as a constant. This damage function is expressed by the following formula (damage in dollars)

\[ D(x) = \begin{cases} 0 & x < 10{,}000 \text{ cfs} \\[4pt] \dfrac{10^8}{1 + (60{,}000/x)^3} & 10{,}000 \text{ cfs} \le x < a \\[8pt] 1.3 \times 10^8 & a \le x \le x_p \end{cases} \tag{8.13} \]

Now, instead of adopting two separate probability distributions for the floods from 0 to the 100-yr return period and from the 100-yr event to the PMF $x_p$ respectively, as Stedinger and Grygier [1985] did, it is assumed here that only a set of sampling observations of the annual floods is available. In order to demonstrate the methodology presented in this chapter, a sampling record was created by simulating a set of $n$ data values from a log-normal distribution with $\mu = 8.374$ and $\sigma = 0.651$. Suppose $n = 100$, i.e. there are 100 annual observations; these data then divide the flood range from zero to $x_p$ into $n + 1 = 101$ intervals. Here only the intervals formed by the floods greater than 10,000 cfs, which cause damage, are of interest.
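The discretization of the damage function into interval averages can be sketched in a few lines of Python. The sketch assumes that the damage function reads $D(x) = 10^8 / [1 + (60{,}000/x)^3]$ between 10,000 cfs and the design flow, uses a midpoint rule chosen for illustration, and takes as interval endpoints the three largest simulated observations listed in Table 8.2; the function names are hypothetical.

```python
def damage(x, a):
    """Assumed reading of (8.13): flood damage in dollars for design flow a (cfs)."""
    if x < 10_000.0:
        return 0.0  # negligible damage below the 10-yr flood
    if x < a:
        return 1.0e8 / (1.0 + (60_000.0 / x) ** 3)
    return 1.3e8  # dam considered destroyed once the design flow is exceeded


def average_damage(lo, hi, a, steps=2_000):
    """Mean of damage(x, a) over [lo, hi) by the composite midpoint rule."""
    h = (hi - lo) / steps
    return sum(damage(lo + (k + 0.5) * h, a) for k in range(steps)) / steps


# Averages in 10^3 dollars for the two intervals below the largest observation,
# evaluated for option 1 (design flow 40,000 cfs).
d_98 = average_damage(21_511.9, 29_880.1, 40_000.0) / 1.0e3
d_99 = average_damage(29_880.1, 30_710.0, 40_000.0) / 1.0e3
```

Under the assumed formula these two averages can be compared directly with the last two damage entries of Table 8.2, which gives a quick consistency check on the reading of (8.13).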
These intervals, together with the average damages calculated from the damage function (8.13), are listed in Table 8.2.

Table 8.2: The annual flood intervals and average damages ×10³ (in dollars) for intervals.

i      flood value x_i    flood value x_(i+1)    average damage d_i over [x_i, x_(i+1))
90     9886.58            11044.7                529.5
91     11044.7            11548.2                663.3
92     11548.2            11664.8                718.7
93     11664.8            11805.0                742.6
94     11805.0            12646.9                839.9
95     12646.9            12823.3                947.2
96     12823.3            17155.3                1565.3
97     17155.3            21511.9                3273.2
98     21511.9            29880.1                7422.1
99     29880.1            30710.0                11405.7

In the last interval $[x_{100}, x_p]$, the damage function, which is denoted as $D_n(x, a)$, is used and expressed as

\[ D_n(x) = \begin{cases} \dfrac{10^8}{1 + (60{,}000/x)^3} & x_n \le x < a \\[8pt] 1.3 \times 10^8 & a \le x \le x_p \end{cases} \tag{8.14} \]

The probability decay function for this interval is expressed, as in (8.1), by

\[ \phi_n(x/\lambda) = \frac{\lambda\,(x_p - x)^{\lambda-1}}{(x_p - x_n)^{\lambda}} \]

8.4.2 D-S Theory Applied to Hydrologic Design

Since the sampling information is given, the upper and lower expected annual damages with respect to the unknown parameters $E(p_i)$, $i = 0, 1, \ldots, 100$, can be calculated from (8.10) as

\[ E^*_{p_i}[L(x_p, \lambda)] = 279 + \frac{2}{101} \int_{x_{100}}^{x_p} \phi_n(x/\lambda)\, D_n(x, a)\,dx \]
\[ E_{*\,p_i}[L(x_p, \lambda)] = 279 + \frac{1}{101} \int_{x_{100}}^{x_p} \phi_n(x/\lambda)\, D_n(x, a)\,dx \tag{8.15} \]

The above upper and lower expected damages are still functions of the unknown parameters $\lambda$ and $x_p$. Consider first the uncertainty associated with $\lambda$. There is no objective information about $\lambda$, but some subjective knowledge may be brought to bear here. In conventional flood frequency analysis, it is generally believed that the shapes of the tail distributions vary within some range [e.g. Stedinger and Grygier, 1985; Haimes et al., 1988]. This corresponds to a range of the parameter $\lambda$ in terms of the probability decay function. Stedinger and Grygier [1985] state that the shapes of the tail distributions are generally thought to be bounded by the thick-tailed Pareto distribution and the thin-tailed Gumbel distribution.
Haimes et al. [1988] consider the tail distribution to lie in the range between the Pareto and the Normal, the latter having a thinner tail than the Gumbel distribution, as can be seen from Figure 8.2. Based on these sources of evidence, the $\lambda$ value is considered to be within the range $[9, 25]$, which approximately embraces the possible shapes of the tail distributions from Pareto to Normal.

Figure 8.2: Possible tail distributions in conventional frequency analysis (from Figure 6.7 of Haimes et al. [1988]).

If no other information is available, a complete ignorance BPA on $\lambda$, which assigns BPA value 1.0 to the whole interval $[9, 25]$, should be used. However, if there is some knowledge about $\lambda$, it should be used to construct the BPA. As an example, consider a situation where one feels that the tail distribution is most likely to be a Gumbel distribution. This is equivalent to saying that $\lambda = 12$ is the most likely value, as the probability decay function with $\lambda = 12$ is close to the shape of a Gumbel tail distribution. A consonant BPA on $\lambda$ in this case, as discussed in Section 7.3 of Chapter 7, is a uniform BPA density on the diagonal line from $[12, 12]$ to $[9, 25]$, as shown in Figure 8.3. The upper and lower marginal distributions of $\lambda$ are also shown in Figure 8.3; they are uniform on $[9, 12]$ and on $[12, 25]$, with densities $1/3$ and $1/13$ respectively.

Figure 8.3: The BPA function on parameter $\lambda$.

The upper and lower expected damages with respect to $\lambda$ as well as $E(p_i)$ can then be calculated as follows

\[ E^*_{p_i,\lambda}[L(x_p)] = 279 + \frac{2}{101} \int_{x_{100}}^{x_p} \left[ \int_{9}^{12} \phi_n(x/\lambda)\, \frac{1}{3}\,d\lambda \right] D_n(x, a)\,dx \]
\[ E_{*\,p_i,\lambda}[L(x_p)] = 279 + \frac{1}{101} \int_{x_{100}}^{x_p} \left[ \int_{12}^{25} \phi_n(x/\lambda)\, \frac{1}{13}\,d\lambda \right] D_n(x, a)\,dx \tag{8.16} \]

The above upper and lower expected damages are still functions of $x_p$. Using the best estimate of $x_p = 150{,}000$ cfs, the upper and lower expected damages for each of the four design options can then be calculated.
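The calculation just described can be sketched numerically. The Python sketch below is an illustration under stated assumptions, not the procedure used to produce the tabulated results: it assumes the damage function $10^8/[1 + (60{,}000/x)^3]$ below the design flow and $1.3 \times 10^8$ above it, uniform marginal densities of $1/3$ on $[9, 12]$ and $1/13$ on $[12, 25]$ for $\lambda$, a discretized damage term of 279 (in $10^3$ dollars), and a midpoint rule for both integrals. All names are hypothetical.

```python
X_N, X_P = 30_710.0, 150_000.0  # largest observation and PMF estimate (cfs)
BODY = 279.0                    # discretized damage term, in 10^3 dollars
# Design flow (cfs) and annual construction cost (10^3 dollars) from Table 8.1.
OPTIONS = {"a1": (40_000.0, 0.0), "a2": (60_000.0, 50.0),
           "a3": (120_000.0, 120.0), "a4": (150_000.0, 130.0)}


def tail_integral(lam, a, steps=400):
    """Integral of phi_n(x/lam) * D_n(x, a) over [X_N, X_P], in 10^3 dollars."""
    width = X_P - X_N
    upper = min(a, X_P)
    h = (upper - X_N) / steps
    smooth = 0.0
    for k in range(steps):  # midpoint rule below the design flow
        x = X_N + (k + 0.5) * h
        phi = lam / width * ((X_P - x) / width) ** (lam - 1.0)
        smooth += phi * 1.0e5 / (1.0 + (60_000.0 / x) ** 3) * h
    # Above the design flow the damage is constant, so that part is exact.
    failure = 1.3e5 * ((X_P - upper) / width) ** lam
    return smooth + failure


def mean_over_lambda(lam_lo, lam_hi, a, steps=60):
    """Average of the tail integral for lam uniform on [lam_lo, lam_hi]."""
    h = (lam_hi - lam_lo) / steps
    return sum(tail_integral(lam_lo + (k + 0.5) * h, a) for k in range(steps)) / steps


def expected_costs(a, cost):
    """Upper and lower annual expected costs following the form of (8.16)."""
    upper = BODY + 2.0 / 101.0 * mean_over_lambda(9.0, 12.0, a) + cost
    lower = BODY + 1.0 / 101.0 * mean_over_lambda(12.0, 25.0, a) + cost
    return upper, lower


costs = {name: expected_costs(a, c) for name, (a, c) in OPTIONS.items()}
```

The dictionary `costs` holds one (upper, lower) pair per option in $10^3$ dollars. Splitting the integral at the design flow avoids integrating numerically across the jump in the damage function.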
Subsequently the upper and lower total expected costs for each of the four design options are calculated, as shown in Table 8.3.

Table 8.3: Upper and lower expected costs ×10³ (in dollars) with respect to $p_i$, $\lambda$

Decision                         a_1 = 40,000    a_2 = 60,000    a_3 = 120,000    a_4 = 150,000
$E^*_{p_i,\lambda}[\cdot]$       1566            893             878              898
$E_{*\,p_i,\lambda}[\cdot]$      699             528             591              601

Based on the mini-upper decision criterion, the design option $a_3$, i.e. rebuild the spillway and raise the dam, should be selected in this particular example.

As discussed in Section 8.3.3, there is uncertainty associated with the PMF estimate $x_p$. Here the effect of the variation of $x_p$ on the upper and lower expected costs in (8.16) is studied first. The results of varying the PMF within one third of the current estimated value, as suggested by Stedinger and Grygier [1985], are presented in Table 8.4.

Table 8.4: Sensitivity analysis to $x_p$ (expected costs ×10³, in dollars)

decision    x_p = 100,000      x_p = 125,000      x_p = 175,000      x_p = 200,000
option      E*[.]   E_*[.]     E*[.]   E_*[.]     E*[.]   E_*[.]     E*[.]   E_*[.]
a_1         1096    519        1361    612        1725    777        1851    845
a_2         711     488        803     516        1001    553        1108    582
a_3         776     558        832     574        942     698        994     625
a_4         786     568        842     584        951     618        1004    635

From Table 8.4 it can be seen that, in this particular example, varying the PMF estimate within one third of its estimated value can have some effect on the upper and lower expected costs. In fact, the design option based on the mini-upper decision criterion is $a_3$ for $x_p \ge 150{,}000$ cfs and becomes $a_2$ if $x_p < 150{,}000$ cfs. This suggests that the uncertainty associated with the PMF estimate, $x_p$, should be incorporated into the analysis.

If there is some information about $x_p$, it can be represented by a BPA function and (8.12) can be used to calculate the upper and lower expected costs. Here it is assumed that the decision maker knows only that $x_p$ lies within one third of its estimated value and knows nothing else. This can be represented by a complete ignorance BPA for $x_p$.
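Under complete ignorance about $x_p$, each option keeps only its most pessimistic upper cost and its most optimistic lower cost over the admissible range. A minimal Python sketch of this step and of the mini-upper selection follows; it copies the values of Tables 8.3 and 8.4 as printed, and takes the maximum and minimum over the tabulated $x_p$ values as a stand-in for evaluating the bounds at the ends of the range. The function names are hypothetical.

```python
# (upper, lower) expected costs in 10^3 dollars, keyed by x_p in cfs,
# copied from Tables 8.3 and 8.4.
TABLE = {
    "a1": {100_000: (1096, 519), 125_000: (1361, 612), 150_000: (1566, 699),
           175_000: (1725, 777), 200_000: (1851, 845)},
    "a2": {100_000: (711, 488), 125_000: (803, 516), 150_000: (893, 528),
           175_000: (1001, 553), 200_000: (1108, 582)},
    "a3": {100_000: (776, 558), 125_000: (832, 574), 150_000: (878, 591),
           175_000: (942, 698), 200_000: (994, 625)},
    "a4": {100_000: (786, 568), 125_000: (842, 584), 150_000: (898, 601),
           175_000: (951, 618), 200_000: (1004, 635)},
}


def ignorance_bounds(by_xp):
    """Largest upper cost and smallest lower cost over the tabulated x_p values."""
    uppers = [upper for upper, _ in by_xp.values()]
    lowers = [lower for _, lower in by_xp.values()]
    return max(uppers), min(lowers)


def mini_upper(bounds):
    """Option whose upper expected cost is smallest."""
    return min(bounds, key=lambda name: bounds[name][0])


bounds = {name: ignorance_bounds(by_xp) for name, by_xp in TABLE.items()}
choice = mini_upper(bounds)
```

Here `bounds` collects one (upper, lower) pair per option and `choice` names the option with the smallest upper cost.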
For each decision, the resultant upper expected cost corresponds to the upper expected cost for $x_p = 200{,}000$ cfs in Table 8.4, and the resultant lower expected cost corresponds to the lower expected cost for $x_p = 100{,}000$ cfs. The final upper and lower expected costs after considering uncertainties for all parameters, $p_i$, $\lambda$ and $x_p$, are given in Table 8.5. According to the mini-upper decision criterion, the D-S decision in this case should be $a_3$. Note however that for each decision, the difference between the upper and lower expected costs is significant, indicating substantial imprecision which is otherwise overlooked by the conventional statistical approach. An alternative strategy of making a decision based on a more complete interpretation of the D-S upper and lower expected utilities is presented in Chapter 10.

Table 8.5: Upper and lower expected costs ×10³ (in dollars) with respect to $p_i$, $\lambda$ and $x_p$

Decision                             a_1 = 40,000    a_2 = 60,000    a_3 = 120,000    a_4 = 150,000
$E^*_{p_i,\lambda,x_p}[\cdot]$       1851            1108            994              1004
$E_{*\,p_i,\lambda,x_p}[\cdot]$      519             488             558              568

8.5 Summary

Major hydrologic engineering design is invariably concerned with rare flood events which are much larger than the maximum recorded annual flood, and therefore involves a large uncertainty. In this chapter a hydrologic design model based on the D-S theory was developed which does not depend on any specific flood frequency distribution assumption and is able to deal with weak information input about unknown parameters associated with the rare flood events. The model generally produces, for each design option, upper and lower expected costs as a result of the weak information input. A D-S decision can then be made based on the mini-upper criterion as demonstrated in the above example, or following a more complete interpretation of the D-S results as will be discussed in Chapter 10.
The model was demonstrated in a numerical example which revealed that weak information input about the unknown parameters can cause significant indeterminism in the choice of hydrologic design.

Chapter 9

Application of D-S Theory in Reliability Analysis

9.1 Reliability Analysis in Civil Engineering

The problem of safety or reliability of an engineering system is essentially the problem of balancing the capacity of the system against the demand on the system [Ang and Tang, 1984]. For example, the capacity of a flood control project is the design flood, which the project is intended to withstand, while the demand is the extreme flood which the project actually encounters during its lifetime. The flood control project is said to be safe if the design flood is greater than the actual extreme flood which occurs during the service life of the project.

The conventional approach to reliability analysis uses the concept of safety margin or safety factor. The levels of the margins or factors of safety are determined from some conservative, deterministic estimates of the capacity and demand, e.g. choosing the "minimum" capacity and the "maximum" demand. However, engineering design is usually based on incomplete knowledge and, as a result, uncertainties concerning the design events are unavoidable. The uncertainties are addressed only in a qualitative fashion and consequently the reliability of the system cannot be described quantitatively [Ang and Tang, 1984]. As a result, as pointed out by Duckstein and Bogardi [1981], the engineering project could be either over- or under-designed.

The probabilistic approach to reliability analysis considers the design events as naturally occurring random variables and uses statistical models to describe the inherent uncertainties associated with these random variables.
Under the probabilistic approach, the reliability of a system can be quantitatively defined as the probability that the project will fulfil its original design purpose. Conversely, the failure probability is defined as the probability that the system will not perform adequately. In terms of the capacity ($X$) and demand ($Y$) of a system, the failure probability in the continuous case can be defined as

\[ p_F = P(X \le Y) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f_{X,Y}(x, y)\, 1(x \le y)\,dx\,dy \tag{9.1} \]

where $f_{X,Y}(x, y)$ is the joint density of $X$ and $Y$ and $1(\cdot)$ is the indicator function. If $X$ and $Y$ are independent random variables with known distributions $f_X(x)$ and $f_Y(y)$, then $f_{X,Y}(x, y) = f_X(x) f_Y(y)$ and the above equation becomes [Ang and Tang, 1984]

\[ p_F = \int_{-\infty}^{\infty} F_X(y)\, f_Y(y)\,dy = \int_{-\infty}^{\infty} [1 - F_Y(x)]\, f_X(x)\,dx \tag{9.2} \]

Define the safety margin $Z$ as

\[ Z = X - Y \tag{9.3} \]

If $X$ and $Y$ are independent random variables with normal distributions, i.e. $X \sim N(\mu_x, \sigma_x)$ and $Y \sim N(\mu_y, \sigma_y)$, where $\mu_x$ and $\mu_y$ are the mean values and $\sigma_x$ and $\sigma_y$ the standard deviations of $X$ and $Y$, then $Z$ also follows a normal distribution with mean and standard deviation

\[ \mu_z = \mu_x - \mu_y, \qquad \sigma_z = \sqrt{\sigma_x^2 + \sigma_y^2} \tag{9.4} \]

Then the failure probability $p_F$ in this case becomes

\[ p_F = \Phi\!\left( -\frac{\mu_x - \mu_y}{\sqrt{\sigma_x^2 + \sigma_y^2}} \right) \tag{9.5} \]

where $\Phi$ is the standard normal distribution function. In reliability analysis, another convenient and widely used quantity is the reliability index $\beta$, which is defined as the mean value of the safety margin expressed in units of its standard deviation, i.e.

\[ \beta = \frac{\mu_z}{\sigma_z} = \frac{\mu_x - \mu_y}{\sqrt{\sigma_x^2 + \sigma_y^2}} \tag{9.6} \]

Equations (9.5) and (9.6) demonstrate that the failure probability $p_F$ and the reliability index $\beta$ are related through $p_F = \Phi(-\beta)$. This relationship is exact only if the two variables $X$ and $Y$ are independent and normally distributed. In situations where those two conditions are not satisfied, this relationship and the calculated value of $p_F$ will be, at best, only an approximation to the real failure probability.

Under the probabilistic approach, reliability analysis therefore becomes a problem of determining the probability models of the system.
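For independent normal capacity and demand, the relation $p_F = \Phi(-\beta)$ can be checked against a direct numerical evaluation of $\int F_X(y) f_Y(y)\,dy$. The Python sketch below uses made-up parameter values and an illustrative midpoint rule truncated at ten standard deviations of the demand; the function names are hypothetical.

```python
from math import erf, exp, pi, sqrt


def std_normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def reliability_index(mu_x, sigma_x, mu_y, sigma_y):
    """Mean of the safety margin Z = X - Y divided by its standard deviation."""
    return (mu_x - mu_y) / sqrt(sigma_x ** 2 + sigma_y ** 2)


def failure_probability(mu_x, sigma_x, mu_y, sigma_y):
    """Failure probability as Phi(-beta) for independent normal X and Y."""
    return std_normal_cdf(-reliability_index(mu_x, sigma_x, mu_y, sigma_y))


def failure_probability_by_integration(mu_x, sigma_x, mu_y, sigma_y, steps=20_000):
    """Integrate F_X(y) * f_Y(y) over y by the midpoint rule."""
    lo, hi = mu_y - 10.0 * sigma_y, mu_y + 10.0 * sigma_y
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        y = lo + (k + 0.5) * h
        f_y = exp(-0.5 * ((y - mu_y) / sigma_y) ** 2) / (sigma_y * sqrt(2.0 * pi))
        total += std_normal_cdf((y - mu_x) / sigma_x) * f_y * h
    return total


# Made-up capacity N(10, 1.5) and demand N(6, 2), in arbitrary units.
beta = reliability_index(10.0, 1.5, 6.0, 2.0)
p_f = failure_probability(10.0, 1.5, 6.0, 2.0)
p_f_check = failure_probability_by_integration(10.0, 1.5, 6.0, 2.0)
```

With these values the safety margin has mean 4 and standard deviation 2.5, and the two routes to the failure probability should agree to within the quadrature error.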
If the statistical models for $X$ and $Y$ are fixed with known parameters, the failure probability $p_F$ will be a deterministic constant. However, in practice, the parameters of these models and even the models themselves will have to be based on the available, but often inadequate, knowledge; therefore substantial uncertainties may be associated with the parameters of these models, and with the models themselves. As a consequence, the failure probability $p_F$ is itself an unknown random variable. Of interest here, and in line with the central theme of this thesis, is the situation of weak knowledge and its consequence for the results of the reliability analysis.

In general situations, the variables $X$ and $Y$ are not two simple variables but are functions of many factors. The calculation of the failure probability $p_F$ in these situations is more complex. Various approaches have been developed for calculating $p_F$ where the statistical models for the random variables and the parameters of these models are all known, or where only the first and second moments of the random variables are known. See Yen et al. [1986] for a brief review of these approaches, and Ang and Tang [1984] and Melching et al. [1990] for examples of applications. In situations where the parameters are unknown, a Bayesian approach has been used to deal with the parameter uncertainties [Martz and Waller, 1982].

The Bayesian approach presumes that the subjective knowledge can be represented by a precise probability distribution and then combined with a precisely specified sample likelihood function to produce a posterior distribution for the unknown parameters. But if the knowledge about the unknown parameters is weak, we may be unable to specify such precise probability distributions and likelihood functions.
In these cases, where the Bayesian representation is not supported by the knowledge, the D-S approach can be used to deal with the parameter uncertainties.

The application of D-S theory to uncertainties in reliability analysis will be explored in the remainder of this chapter. The discussion will be confined to the simple situation where the capacity and demand of a system are two independent random variables with normal distributions, i.e. $X \sim N(\mu_x, \sigma_x)$ and $Y \sim N(\mu_y, \sigma_y)$. There could be errors or uncertainties associated with both the mean values $\mu_x$, $\mu_y$ and the standard deviations $\sigma_x$, $\sigma_y$ of these normal distributions. But it is believed that uncertainties associated with the mean values are of primary importance while those associated with the standard deviations are of secondary importance [Ang and Tang, 1984, p387]. Only uncertainties in the two mean values are therefore considered here, and the two standard deviations are treated as known constants.

9.2 D-S Reliability Analysis

Traditionally, the uncertainties of the mean values $\mu_x$, $\mu_y$ of $X$, $Y$ are represented by conventional posterior probability distributions. Note here that the two mean values $\mu_x$ and $\mu_y$ are both greater than zero. The determination of these posterior distributions by the Bayesian method is based on both the engineer's subjective judgement and real sampling information. Once the posterior distributions on $\mu_x$ and $\mu_y$ are obtained, the distribution for $\mu = \mu_x - \mu_y$ can then be determined.
Based on this distribution, and the deterministically specified values of $\sigma_x$ and $\sigma_y$, various modified quantities describing the reliability, such as the expected failure probability $E(p_F)$, the different percentiles of $p_F$ and the reliability index $\beta$, can then be calculated from (9.5) and (9.6).

In the D-S approach, precise probability distributions for $\mu_x$ and $\mu_y$ are not requisite and the uncertainties about $\mu_x$ and $\mu_y$ can be represented less restrictively by BPA functions. The determination of these BPA functions, as discussed in Chapters 4, 5, 6 and 7, can be based on an engineer's subjective knowledge, if any, or the sampling information alone, or a combination of these two sources of information. Since the parameters $\mu_x$ and $\mu_y$ are real and continuous variables, the BPA's on the two parameters should be specified on the corresponding contiguous frames as discussed in Chapter 3.

Let $[\mu_{x_u}, \mu_{x_v}]$ and $[\mu_{y_u}, \mu_{y_v}]$ represent any intervals in the contiguous frames of $\mu_x$ and $\mu_y$ respectively. Suppose that the BPA's on the two contiguous frames are obtained for $\mu_x$ and $\mu_y$ and denoted as $m_{\mu_x}([\mu_{x_u}, \mu_{x_v}])$ and $m_{\mu_y}([\mu_{y_u}, \mu_{y_v}])$ respectively. As in the conventional approach, it is necessary to determine first the resultant BPA $m_{\mu}([\mu_u, \mu_v])$ on the variable $\mu = \mu_x - \mu_y$, where $[\mu_u, \mu_v]$ represents any interval in the contiguous frame of $\mu$.

If $\mu_x \in [\mu_{x_u}, \mu_{x_v}]$ and $\mu_y \in [\mu_{y_u}, \mu_{y_v}]$, then the difference $\mu = \mu_x - \mu_y$ must be in the interval $[\mu_u, \mu_v]$ where

$$\mu_u = \mu_{x_u} - \mu_{y_v}, \qquad \mu_v = \mu_{x_v} - \mu_{y_u} \tag{9.7}$$

i.e. $\mu \in [(\mu_{x_u} - \mu_{y_v}),\ (\mu_{x_v} - \mu_{y_u})]$. Note that $\mu_u$ and $\mu_v$ can each be either positive or negative.
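A discrete analogue may help to fix ideas about the interval arithmetic in (9.7). The sketch below assumes that each BPA has only a finite number of focal intervals, stored as a dictionary from `(lower, upper)` tuples to masses, and that the two parameters are independent so that masses can be multiplied. The names `interval_difference` and `combine_difference` and the numerical masses are hypothetical.

```python
from collections import defaultdict


def interval_difference(x_interval, y_interval):
    """Interval containing mu_x - mu_y, following (9.7)."""
    x_u, x_v = x_interval
    y_u, y_v = y_interval
    return (x_u - y_v, x_v - y_u)


def combine_difference(m_x, m_y):
    """Discrete BPA on mu = mu_x - mu_y.

    Assumes independent parameters: each pair of focal intervals
    contributes the product of its masses to the difference interval.
    """
    m_mu = defaultdict(float)
    for x_interval, mass_x in m_x.items():
        for y_interval, mass_y in m_y.items():
            m_mu[interval_difference(x_interval, y_interval)] += mass_x * mass_y
    return dict(m_mu)


# Hypothetical focal intervals and masses for the two means.
m_x = {(9.0, 11.0): 0.6, (8.0, 12.0): 0.4}
m_y = {(5.0, 6.0): 0.5, (4.0, 7.0): 0.5}
m_mu = combine_difference(m_x, m_y)
for interval, mass in sorted(m_mu.items()):
    print(interval, round(mass, 3))
```

Two different pairs of focal intervals can map to the same difference interval, in which case their products are accumulated, which is the discrete counterpart of collecting all products that satisfy the two endpoint conditions.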
If $\mu_x$ and $\mu_y$ are independent random variables, then the two corresponding BPA's can be multiplied and the product assigned to the resulting interval of $\mu$. The BPA value on any interval $[\mu_u, \mu_v]$ is the collection of all the products of $m_{\mu_x}([\mu_{x_u}, \mu_{x_v}])$ and $m_{\mu_y}([\mu_{y_u}, \mu_{y_v}])$ whose intervals satisfy $\mu_{x_u} - \mu_{y_v} = \mu_u$ and $\mu_{x_v} - \mu_{y_u} = \mu_v$, i.e.

$$m_{\mu}([\mu_u, \mu_v]) = \sum \left\{ m_{\mu_x}([\mu_{x_u}, \mu_{x_v}])\, m_{\mu_y}([\mu_{y_u}, \mu_{y_v}]) : \mu_{x_u} - \mu_{y_v} = \mu_u,\ \mu_{x_v} - \mu_{y_u} = \mu_v \right\}$$

$$= \int_0^{\infty}\!\int_0^{\infty} m_{\mu_x}([x_u, x_v])\, m_{\mu_y}([x_v - \mu_v,\ x_u - \mu_u])\,dx_u\,dx_v, \qquad x_u \le x_v \tag{9.8}$$

In the above integration, $m_{\mu_x}([\mu_{x_u}, \mu_{x_v}])$ and $m_{\mu_y}([\mu_{y_u}, \mu_{y_v}])$ can be of any form and are left unspecified, so the integration cannot be carried out in general. As a result, a universal analytical expression for the BPA on $\mu$ cannot be obtained. However, since the failure probability $p_F$ and the reliability index $\beta$, as defined in equations (9.5) and (9.6), are both monotonic functions of $\mu$ (see Sec. 3.6 of Chapter 3), only the upper and lower marginal distributions $g_{\mu}(\mu)$ and $f_{\mu}(\mu)$ are needed in calculating the quantities of interest in a reliability analysis.

According to the definition in (3.17) of Chapter 3, the upper and lower marginal density functions of $\mu$ are

$$g_{\mu}(\mu_v) = \int_{-\infty}^{\mu_v} m_{\mu}([\mu_u, \mu_v])\,d\mu_u, \qquad f_{\mu}(\mu_u) = \int_{\mu_u}^{\infty} m_{\mu}([\mu_u, \mu_v])\,d\mu_v \tag{9.9}$$

Now, let $g_{\mu_x}(\mu_x)$ and $f_{\mu_x}(\mu_x)$ represent the upper and lower marginal densities of $\mu_x$ based on the BPA $m_{\mu_x}([\mu_{x_u}, \mu_{x_v}])$. Similarly, let $g_{\mu_y}(\mu_y)$ and $f_{\mu_y}(\mu_y)$ denote the upper and lower marginal densities for $\mu_y$ based on the BPA $m_{\mu_y}([\mu_{y_u}, \mu_{y_v}])$. The upper and lower marginal densities $g_{\mu}(\mu)$ and $f_{\mu}(\mu)$ expressed in (9.9) can now be specified in terms of these upper and lower marginal densities of $\mu_x$ and $\mu_y$ as follows (see Section B.1 of Appendix B for detailed mathematical derivations of (9.10))

$$g_{\mu}(\mu) = \int g_{\mu_x}(\mu_x)\, f_{\mu_y}(\mu_x - \mu)\,d\mu_x, \qquad f_{\mu}(\mu) = \int f_{\mu_x}(\mu_x)\, g_{\mu_y}(\mu_x - \mu)\,d\mu_x \tag{9.10}$$

The upper and lower expected failure probabilities $E^*(p_F)$ and $E_*(p_F)$ are obtained by calculating the expectation of (9.5). Because $p_F$ decreases as $\mu$ increases, the upper expectation is taken with respect to the lower marginal density and the lower expectation with respect to the upper marginal density:

$$E^*(p_F) = \int_{-\infty}^{\infty} \Phi\!\left(-\frac{\mu}{\sqrt{\sigma_x^2 + \sigma_y^2}}\right) f_{\mu}(\mu)\,d\mu, \qquad E_*(p_F) = \int_{-\infty}^{\infty} \Phi\!\left(-\frac{\mu}{\sqrt{\sigma_x^2 + \sigma_y^2}}\right) g_{\mu}(\mu)\,d\mu \tag{9.11}$$

The upper and lower reliability indices can also be obtained by calculating the expectation of (9.6)

$$\beta^* = \frac{1}{\sqrt{\sigma_x^2 + \sigma_y^2}} \int_{-\infty}^{\infty} \mu\, g_{\mu}(\mu)\,d\mu, \qquad \beta_* = \frac{1}{\sqrt{\sigma_x^2 + \sigma_y^2}} \int_{-\infty}^{\infty} \mu\, f_{\mu}(\mu)\,d\mu \tag{9.12}$$

Equations (9.11) and (9.12) summarize the consequences of applying the D-S theory to reliability analysis. In order to use these results, the BPA's for the two parameters $\mu_x$ and $\mu_y$ have to be provided. Depending on the particular problem and the information available, various approaches, as described in Chapters 4 to 7, may be used to obtain these BPA's. In Section 9.3, a situation will be considered where two precise probability distributions, denoted by $h_{\mu_x}(\mu_x)$ and $h_{\mu_y}(\mu_y)$ for the two unknown parameters $\mu_x$ and $\mu_y$, have been obtained, but the engineer has some concern about the precision of this information. The method described in Section 7.2 of Chapter 7 will be used to derive a BPA from the precise probability distribution for each of the two parameters to account for the engineer's concern. The consequent D-S reliability analysis will be carried out, and the results will be compared with the conventional reliability analysis.

The conventional reliability analysis based on precise probability distributions on $\mu_x$ and $\mu_y$ has been considered by Ang and Tang [1984, Example 6.17] and is first briefly reviewed here. Suppose the two precise probability distributions $h_{\mu_x}(\mu_x)$ and $h_{\mu_y}(\mu_y)$ on $\mu_x$ and $\mu_y$ are normal, i.e.

$$h_{\mu_x}(\mu_x) = N(\bar{x}, s_x), \qquad h_{\mu_y}(\mu_y) = N(\bar{y}, s_y) \tag{9.13}$$

where $\bar{x}$ and $\bar{y}$ represent the mean values and $s_x$ and $s_y$ the standard errors of $\mu_x$ and $\mu_y$ respectively. Then the difference $\mu = \mu_x - \mu_y$ is also normally distributed with mean $\bar{\mu} = \bar{x} - \bar{y}$ and standard deviation $s = \sqrt{s_x^2 + s_y^2}$.
The expected failure probability $E(p_F)$ can be calculated as

$$E(p_F) = \Phi\!\left(-\frac{\bar{x} - \bar{y}}{\sqrt{\sigma_x^2 + \sigma_y^2 + s_x^2 + s_y^2}}\right) \tag{9.14}$$

and the reliability index becomes

$$\beta = \frac{\bar{x} - \bar{y}}{\sqrt{\sigma_x^2 + \sigma_y^2}} \tag{9.15}$$

Equations (9.14) and (9.15) are from Ang and Tang [1984].

9.3 An Illustration

Consider first the precise probability distribution $h_{\mu_x}(\mu_x)$ on $\mu_x$ whose precision is suspect in the mind of the engineer. Two possible ways of addressing this concern have been discussed in Chapter 7 but, for illustrative purposes, the method called local perturbation, as described in Section 7.2, is adopted here. Full details of this method can be found in Section 7.2 but, briefly stated, it involves attaching to any point $\mu_x$ an interval $[\mu_x - c_x, \mu_x + c_x]$, where $c_x$ is taken as a constant, and reassigning the probability density on $\mu_x$ to this interval. This process thus leads to a BPA which can be expressed as

$$m_{\mu_x}([\mu_x - c_x,\ \mu_x + c_x]) = h_{\mu_x}(\mu_x) \tag{9.16}$$

If $h_{\mu_x}(\mu_x)$ is assumed to be normal with mean $\bar{x}$ and standard deviation $s_x$, as was the case in (9.13), then the upper and lower marginal distributions of $\mu_x$ are also normal, with the same variance as $h_{\mu_x}(\mu_x)$ but different means, i.e.

$$g_{\mu_x}(\mu_x) = N(\bar{x} + c_x,\ s_x), \qquad f_{\mu_x}(\mu_x) = N(\bar{x} - c_x,\ s_x) \tag{9.17}$$

As described in Section 7.2, the determination of the constant $c_x$ requires the selection of a reference interval and the engineer's specification of an upper probability bound for this interval. Choose $[\bar{x} - s_x, \bar{x} + s_x]$ as the reference interval; the probability of this interval based on $h_{\mu_x}(\mu_x)$ is then 0.683. Suppose the upper probability bound of this interval is specified as $p_x^*$. Then, from (7.5), $c_x$ can be expressed as

$$c_x = -s_x\, \Phi^{-1}\!\left(\frac{1 - p_x^*}{2}\right) - s_x \tag{9.18}$$

As numerical examples, if $p_x^* = 0.9$ then $c_x = 0.645 s_x$, and if $p_x^* = 0.99$ then $c_x = 1.575 s_x$. In the extreme case when $p_x^* = 1.0$, $c_x = \infty$, which represents complete ignorance about $\mu_x$.
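The numerical values quoted for $c_x$ can be checked with a few lines of code. The sketch below evaluates (9.18) directly; the function name `perturbation_half_width` is hypothetical, and the range check on the upper probability bound is an added assumption, namely that the bound is not below the precise probability of the reference interval, so that $c_x$ is non-negative.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()

# Precise probability of the reference interval [mean - s, mean + s].
REFERENCE_PROBABILITY = 2.0 * _STD_NORMAL.cdf(1.0) - 1.0  # about 0.683


def perturbation_half_width(s, p_upper):
    """Half-width c of the local perturbation interval, following (9.18).

    s       : standard error of the precise normal distribution
    p_upper : engineer's upper probability bound for the reference interval
    """
    if not REFERENCE_PROBABILITY <= p_upper < 1.0:
        raise ValueError("p_upper must lie in [0.683, 1.0)")
    return -s * _STD_NORMAL.inv_cdf((1.0 - p_upper) / 2.0) - s


for p in (0.90, 0.99):
    # Ratios c/s of roughly 0.645 and 1.576; the text quotes 0.645 and 1.575.
    print(p, round(perturbation_half_width(1.0, p), 3))
```

The half-width scales linearly with the standard error, so only the ratio $c_x / s_x$ depends on the chosen bound.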
In reality, if the engineer claims to have at least some knowledge, then a value less than 1.0 should be assigned to $p_x^*$. But it would be impractical for an engineer to specify probability values for $p_x^*$ above, say, 0.99, as it then becomes impossible to distinguish any practical difference among these values. Therefore in practice, a value like $p_x^* = 0.99$ can be considered as representing the situation where the engineer has some very serious doubts about any specification of this probability distribution, precise or otherwise. That is, the knowledge is not only weak but virtually nonexistent. The value of $c_x$ corresponding to $p_x^* = 0.99$ thus provides some insight into the upper limit of $c_x$ in practice.

Similarly, if the engineer has some doubt about the precise probability distribution $h_{\mu_y}(\mu_y)$ of $\mu_y$, the same process described above can be used to determine a local perturbation BPA for $\mu_y$. Let $[\mu_y - c_y, \mu_y + c_y]$ be the interval associated with the point $\mu_y$, where $c_y$ is considered to be a constant. The local perturbation BPA, $m_{\mu_y}([\mu_y - c_y, \mu_y + c_y])$, can then be expressed as

$$m_{\mu_y}([\mu_y - c_y,\ \mu_y + c_y]) = h_{\mu_y}(\mu_y) \tag{9.19}$$

If $h_{\mu_y}(\mu_y)$ is assumed to be normal with mean $\bar{y}$ and standard deviation $s_y$, as was done in (9.13), then the upper and lower marginal distributions of $\mu_y$, which are also normal, are

$$g_{\mu_y}(\mu_y) = N(\bar{y} + c_y,\ s_y), \qquad f_{\mu_y}(\mu_y) = N(\bar{y} - c_y,\ s_y) \tag{9.20}$$

Again choose the reference interval as $[\bar{y} - s_y, \bar{y} + s_y]$. If the engineer feels that the probability of $\mu_y$ being in this interval can be as high as $p_y^*$, then from (7.5) $c_y$ can be determined as

$$c_y = -s_y\, \Phi^{-1}\!\left(\frac{1 - p_y^*}{2}\right) - s_y \tag{9.21}$$

As in the $c_x$ case, if $p_y^* = 0.90$ then $c_y = 0.645 s_y$, and if $p_y^* = 0.99$ then $c_y = 1.575 s_y$. Once the upper and lower marginal distributions for $\mu_x$ and $\mu_y$ are obtained from the corresponding local perturbation BPA's, the upper and lower marginal distributions, $g_{\mu}(\mu)$ and $f_{\mu}(\mu)$, for $\mu$ can then be determined from (9.10).
After some mathematical operations (see Section B.2 of Appendix B), $g_{\mu}(\mu)$ and $f_{\mu}(\mu)$ are found to be normal, i.e.

$$g_{\mu}(\mu) = N\!\left((\bar{x} - \bar{y}) + (c_x + c_y),\ \sqrt{s_x^2 + s_y^2}\right), \qquad f_{\mu}(\mu) = N\!\left((\bar{x} - \bar{y}) - (c_x + c_y),\ \sqrt{s_x^2 + s_y^2}\right) \tag{9.22}$$

Note that in the conventional approach, $\mu$ is normally distributed with mean $\bar{x} - \bar{y}$ and standard deviation $\sqrt{s_x^2 + s_y^2}$. From (9.22) it can be seen that, in the particular situation considered here, the upper and lower marginal distributions of $\mu$ in the D-S approach are of the same shape as the distribution of $\mu$ in the conventional approach, but the mean values are shifted by $(c_x + c_y)$ and $-(c_x + c_y)$ respectively. Based on the upper and lower marginal distributions of $\mu$, the upper and lower expected failure probabilities $E^*(p_F)$ and $E_*(p_F)$ can be determined as

$$E^*(p_F) = \Phi\!\left(\frac{-(\bar{x} - \bar{y}) + (c_x + c_y)}{\sqrt{\sigma_x^2 + \sigma_y^2 + s_x^2 + s_y^2}}\right), \qquad E_*(p_F) = \Phi\!\left(\frac{-(\bar{x} - \bar{y}) - (c_x + c_y)}{\sqrt{\sigma_x^2 + \sigma_y^2 + s_x^2 + s_y^2}}\right) \tag{9.23}$$

The upper and lower reliability indices $\beta^*$ and $\beta_*$ can also be calculated. They are

$$\beta^* = \frac{(\bar{x} - \bar{y}) + (c_x + c_y)}{\sqrt{\sigma_x^2 + \sigma_y^2}}, \qquad \beta_* = \frac{(\bar{x} - \bar{y}) - (c_x + c_y)}{\sqrt{\sigma_x^2 + \sigma_y^2}} \tag{9.24}$$

It can be seen from (9.23) and (9.24) that the influence of the two coefficients $c_x$ and $c_y$ can be studied by considering only the sum $c_x + c_y$. Comparing (9.23) and (9.24) with (9.14) and (9.15) reveals that the difference between the D-S and the conventional approaches is also determined by the magnitude of the total $c_x + c_y$. Indeed, when $c_x$ and $c_y$ are both zero, indicating that the engineer has complete confidence in the precise probability distributions on $\mu_x$ and $\mu_y$, (9.23) and (9.24) are identical to (9.14) and (9.15). When $c_x + c_y$ becomes larger, indicating weaker information, the discrepancy between the D-S approach and the conventional approach becomes more significant.

Consider the upper and lower expected failure probabilities expressed in (9.23). They can be expressed in terms of the ratio $s/\sigma = \sqrt{s_x^2 + s_y^2}\,/\sqrt{\sigma_x^2 + \sigma_y^2}$, a measure of the modeling error relative to the inherent variability $\sigma$ in the conventional approach, as

$$E^*(p_F) = \Phi\!\left(-\frac{1}{\sqrt{1 + (s/\sigma)^2}} \left(\frac{\bar{x} - \bar{y}}{\sqrt{\sigma_x^2 + \sigma_y^2}} - \frac{c_y + c_x}{\sqrt{\sigma_x^2 + \sigma_y^2}}\right)\right)$$

$$E_*(p_F) = \Phi\!\left(-\frac{1}{\sqrt{1 + (s/\sigma)^2}} \left(\frac{\bar{x} - \bar{y}}{\sqrt{\sigma_x^2 + \sigma_y^2}} + \frac{c_y + c_x}{\sqrt{\sigma_x^2 + \sigma_y^2}}\right)\right) \tag{9.25}$$

Note that the term $(\bar{x} - \bar{y})/\sqrt{\sigma_x^2 + \sigma_y^2}$ in (9.25) is the conventional reliability index $\beta$ as expressed in (9.15). The new term $(c_y + c_x)/\sqrt{\sigma_x^2 + \sigma_y^2}$, henceforth denoted by $\delta$, is introduced by the D-S analysis and reflects the influence of imprecision. Also, the expression in the inner parentheses in the first equation of (9.25) is the lower reliability index while that in the second equation is the upper reliability index, as specified in (9.24).

The magnitude of $\delta = (c_y + c_x)/\sqrt{\sigma_x^2 + \sigma_y^2}$ depends, among other factors, on the two subjectively specified upper probability bounds $p_x^*$ and $p_y^*$. To gain some insight into the extent of variation of $\delta$, consider the rather extreme practical case where $p_x^* = p_y^* = 0.99$. As has been shown earlier in this section, this gives $c_x = 1.575 s_x$ and $c_y = 1.575 s_y$. The term $\delta$ then becomes

$$\delta = \frac{c_y + c_x}{\sqrt{\sigma_x^2 + \sigma_y^2}} = 1.575\, \frac{s}{\sigma}\, \frac{s_x + s_y}{\sqrt{s_x^2 + s_y^2}} \tag{9.26}$$

Since an algebraic property of the expression $(s_x + s_y)/\sqrt{s_x^2 + s_y^2}$ is that it must lie in the interval 1.0 to $\sqrt{2}$ for any positive $s_x$ and $s_y$, the upper limit of $\delta$, namely $\delta_u$, is given by (9.26) as $\delta_u = 1.575\sqrt{2}\, s/\sigma$. When the ratio $s/\sigma = 1.0$, $\delta_u = 2.227$, and when $s/\sigma = 0.5$, $\delta_u = 1.114$.

The D-S upper and lower expected failure probabilities $E^*(p_F)$ and $E_*(p_F)$, expressed in (9.25), are plotted in Figures 9.1a-d as functions of $\delta = (c_y + c_x)/\sqrt{\sigma_x^2 + \sigma_y^2}$ for different $s/\sigma$ and $\beta$ values, where $s/\sigma = 0.5, 1.0$ and $\beta = 1.5, 3.5$.
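The dependence of the two bounds on $\delta$ can be explored numerically. The following sketch evaluates (9.25) in terms of the conventional reliability index $\beta$, the ratio $s/\sigma$ and $\delta$, and also evaluates the limiting value $\delta_u$ from (9.26). The function names are hypothetical, and the printed values are computed from the formulas rather than read from the thesis figures.

```python
from math import sqrt
from statistics import NormalDist

_PHI = NormalDist().cdf


def expected_failure_bounds(beta, ratio, delta):
    """Lower and upper expected failure probabilities from (9.25).

    beta  : conventional reliability index, (9.15)
    ratio : s / sigma, modeling error relative to inherent variability
    delta : (c_x + c_y) / sqrt(sigma_x^2 + sigma_y^2), the imprecision term
    """
    scale = 1.0 / sqrt(1.0 + ratio ** 2)
    upper = _PHI(-scale * (beta - delta))  # uses the lower reliability index
    lower = _PHI(-scale * (beta + delta))  # uses the upper reliability index
    return lower, upper


def delta_upper_limit(ratio, k=1.575):
    """Upper limit of delta from (9.26); k = c/s for an upper bound of 0.99."""
    return k * sqrt(2.0) * ratio


for delta in (0.0, 0.2, 0.8):
    lower, upper = expected_failure_bounds(beta=3.5, ratio=1.0, delta=delta)
    print(f"delta = {delta:.1f}: lower = {lower:.2e}, upper = {upper:.2e}")
print(round(delta_upper_limit(1.0), 3), round(delta_upper_limit(0.5), 3))
```

Setting `delta` to zero collapses the two bounds onto the conventional result of (9.14), which gives a convenient check on any implementation.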
Here the two values 1.5 and 3.5 for $\beta$ are considered as representing low and high reliability situations, respectively, in civil engineering practice.

It can be seen from these graphs that when $\delta = 0$, $E^*(p_F)$ and $E_*(p_F)$ based on the D-S analysis become identical and coincide with the expected failure probability $E(p_F)$ based on conventional reliability analysis. Thus when the engineer is confident about the precise probability distributions specified on the two parameters $\mu_x$ and $\mu_y$, the D-S reliability analysis will yield the same result as the conventional approach. When $\delta > 0$, indicating that there is some imprecision in the information about $\mu_x$ and $\mu_y$, $E^*(p_F)$ and $E_*(p_F)$ will differ and form upper and lower bounds on the expected failure probability; the interval between them contains the conventional reliability analysis result $E(p_F)$.

When $\delta$ is relatively small, the difference between $E^*(p_F)$ and $E_*(p_F)$ is also relatively small, indicating that the conventional reliability analysis result $E(p_F)$ is insensitive to some insignificant imprecision in the information. For example, from Figure 9.1b, when $\delta = 0.2$ then $E^*(p_F) = 11.6 \times 10^{[\ldots]}$ and $E_*(p_F) = 4.6 \times 10^{[\ldots]}$ while the conventional reliability analysis result is $E(p_F) = 9 \times 10^{[\ldots]}$.

However when $\delta$ is large, indicating significant imprecision in the information, $E^*(p_F)$ and $E_*(p_F)$ can differ by several orders of magnitude. As can be seen from the plots, this difference becomes more significant for larger $\beta$ values. Considering again Figure 9.1b, when $\delta = 0.8$ then $E^*(p_F) = 8 \times 10^{[\ldots]}$ and $E_*(p_F) = 6 \times 10^{[\ldots]}$, which contrast with the conventional analysis result $E(p_F) = 9 \times 10^{[\ldots]}$. It is thus seen that if there is some serious doubt about the precise probability distributions specified on $\mu_x$ and $\mu_y$, i.e. significant imprecision, the conventional reliability analysis result is suspect. In fact, a large difference between the D-S upper and lower expected failure probabilities indicates that more information should be obtained about the unknown parameters in order to get more meaningful reliability analysis results.

[Figure 9.1: The upper and lower expected failure probabilities $E^*(p_F)$ and $E_*(p_F)$, together with $E(p_F)$, as functions of $\delta$ for different $s/\sigma$ and $\beta$ values. Panels: (a) $s/\sigma = 1.0$, $\beta = 1.5$; (b) $s/\sigma = 1.0$, $\beta = 3.5$; (c) $s/\sigma = 0.5$, $\beta = 1.5$; (d) $s/\sigma = 0.5$, $\beta = 3.5$.]

9.4 Summary

In this chapter, the D-S theory has been used in reliability analysis to accommodate weak information inputs about the parameters of the random variables. Only a simple reliability analysis situation was considered, where the demand and capacity are two independent, normally distributed random variables with the two means as the unknown parameters and the two standard deviations assumed to be known. Analytical results were developed which produce upper and lower expected failure probabilities as a direct result of the weak information input about the unknown parameters. The results were demonstrated in a numerical example which showed that, from the D-S perspective, when the input information is weak, the conventional reliability analysis result is not representative of the true situation. It suggests that, in this case, more information would have to be obtained in order to get a more satisfactory reliability analysis result.

Chapter 10

The D-S Upper Lower Expected Utilities

So far in this thesis the D-S theory has been used to draw inferences concerning some unknown state of nature and to make decisions based on these inferences.
The imprecision expressed in the inference results leads to the upper and lower expected utilities in any subsequent decision analysis, and will often result in indeterminacy in the decision choice. The actual interval formed by the upper and lower expected utilities is a significant feature of the D-S approach, but caution should be exercised when interpreting such an interval. In this chapter, the way in which the D-S interval is affected by various factors, and its implication in engineering practice, will be explored.

Before proceeding, it is interesting to pause for a conceptual comparison of the D-S upper and lower expected utility results with the outputs from some conventional approaches. Recall that in conventional Bayesian analysis, uncertainty about the unknown state of nature is reflected in the degree of dispersion of the posterior distribution, which is always a conventional precise probability distribution even when the prior information approaches ignorance and the sample data are sparse. This measure of uncertainty is suppressed when, for each decision option, a single expected utility is calculated from the posterior distribution. The Bayesian Robustness approach, on the other hand, adopts a set of possible prior distributions to reflect the weakness or imprecision in the prior information. As a result, the final output is expressed in the form of the two posteriors which yield the highest and lowest expected utilities. When the utility resembles cost, one decision criterion, which is analogous to the D-S mini-upper criterion, would be to choose the decision which minimizes the highest expected cost [Berger, 1985].
However the choice of the priors is somewhat discretionary, and the sample data are still represented by the conventional sample likelihood.

Unlike the conventional Bayesian and the Bayesian Robustness approaches, the D-S scheme, as described in Chapter 3, represents both prior and sample information in the form of BPA's which are possibilistic in nature, and then combines them via Dempster's rule of combination to yield a resultant BPA. The prior and sample based inputs are treated symmetrically and furthermore, if there is no prior information, a prior BPA is not required. The resultant BPA inference can be interpreted as representing a set of conventional precise probability distributions known as the compatible distributions. Amongst these are the two marginal distributions which represent two extreme interpretations of the BPA function and which, when combined with the utility function, produce the upper and lower expected utilities. As pointed out in Chapter 4, the two marginals are not equivalent in any quantitative sense to the Bayesian posterior except for a tendency, shared with all other compatible distributions of the resultant BPA and the Bayesian posterior, to converge on the deterministic result when the information becomes infinite.

Suppose, after considering all the available sources of information, the D-S inference has been obtained for an unknown state of nature $\theta$, which is expressed by a BPA function on the frame of $\theta$. The corresponding upper and lower marginal distributions are denoted as $g(\theta)$ and $f(\theta)$. While no formal method has been suggested in the literature to measure the degree of imprecision embodied in the BPA function, the "distance" between the two marginal distributions might be considered as having some bearing on this measure. One possible way of measuring this distance might be to calculate the difference between the expectations, i.e. between the mean values of the upper and lower marginal distributions. The units of this measure will thus be the same as those of the particular state of nature under consideration.

Another measure of the distance can be obtained in terms of utility. If a monotone increasing utility function applies and is expressed as $U(d, \theta)$, then the upper and lower expected utilities can be calculated as described in Section 3.6, and are

$$E^*[U(d)] = \int U(d, \theta)\, g(\theta)\,d\theta, \qquad E_*[U(d)] = \int U(d, \theta)\, f(\theta)\,d\theta$$

The width of the interval formed by the upper and lower expected utilities is more meaningful when the utility function is in monetary terms, which, as stated in Chapter 3, has been assumed to be the case throughout this thesis. It is denoted here as $\Delta E[U(d)]$ and can be expressed as

$$\Delta E[U(d)] = \int U(d, \theta)\, g(\theta)\,d\theta - \int U(d, \theta)\, f(\theta)\,d\theta = \int U(d, \theta)\,\bigl(g(\theta) - f(\theta)\bigr)\,d\theta \tag{10.1}$$

Equation (10.1) shows that $\Delta E[U(d)]$ is affected by both of the marginal distributions, $g(\theta)$ and $f(\theta)$, and by the utility function $U(d, \theta)$. Note that, in the very simple extreme situation where the utility function is a constant, $\Delta E[U(d)]$ becomes zero no matter how much imprecision the BPA function implies, or how weak the knowledge about the unknown state of nature $\theta$. This result is reasonable, as a constant utility function means that the value of $\theta$ has no effect on the outcome.

Consider now the situation where the utility function is either monotone increasing or decreasing. Greater imprecision reflected in the BPA function means that $g(\theta)$ and $f(\theta)$ will be further apart, and as a result the width of the interval, $\Delta E[U(d)]$, will be larger. Thus the weaker the knowledge about the unknown state of nature $\theta$, the larger the width of the interval.

Now, for given marginal distributions $g(\theta)$ and $f(\theta)$, $\Delta E[U(d)]$ will depend on the form of $U(d, \theta)$. Consider the simple case where $U(d, \theta)$ is a linear function of $\theta$, i.e. $U(d, \theta) = c\theta$, with coefficient $c > 0$. It is clear from (10.1) that the greater the $c$ value, the larger the width $\Delta E[U(d)]$ will be.
Since a greater $c$ value indicates that the utility function is more sensitive to $\theta$, it can be said that the more sensitive the utility function is with respect to $\theta$, the larger the width $\Delta E[U(d)]$ will be. Although this conclusion is drawn from the linear utility function case, it can be easily extended to the general nonlinear monotone utility function case. It can be concluded that the more sensitive the utility function is to $\theta$ in the area where $g(\theta)$ and $f(\theta)$ are concentrated, the larger will be the width $\Delta E[U(d)]$. Note here that a more sensitive utility function also means that the weakness of knowledge about $\theta$ will tend to make the decision analysis less conclusive. See the example in Section 4.4.2 of Chapter 4 as a demonstration of this effect.

The above discussion, though essentially qualitative, indicates that the width of the interval $\Delta E[U(d)]$ can provide some useful information to a decision maker in the subsequent decision analysis. A small width $\Delta E[U(d)]$ indicates that the weakness of the information and the utility function have, in combination, less effect on the decision consequence. This could be a result of either relatively strong information input, or a utility function which is insensitive to $\theta$, or both. In all of these situations the weakness of information has only a small effect on the decision analysis, indicating that the decision maker can make a decision with confidence commensurate with a true Bayesian situation.

However, when $\Delta E[U(d)]$ is very large, the weakness of information will almost invariably lead to significant ambiguity or indeterminism in the decision choice. This arises because the width of the utility interval for each individual decision then overwhelms the differences between the various alternative decision outcomes.
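The effect of wide intervals on the decision choice can be made concrete with a short sketch. It assumes that utilities are expressed as costs, that each decision option has already been assigned a lower and an upper expected cost, and that the mini-upper criterion selects the option with the smallest upper expected cost. The function names, the option labels and the cost figures are hypothetical.

```python
def mini_upper_choice(cost_intervals):
    """Decision with the smallest upper expected cost.

    cost_intervals : dict mapping decision name -> (lower_cost, upper_cost)
    """
    return min(cost_intervals, key=lambda d: cost_intervals[d][1])


def is_indeterminate(cost_intervals):
    """True if the interval of the mini-upper choice overlaps another option's.

    Because the chosen option has the smallest upper cost, another option
    overlaps it exactly when that option's lower cost lies below this upper cost.
    """
    best = mini_upper_choice(cost_intervals)
    best_upper = cost_intervals[best][1]
    return any(
        lower < best_upper
        for name, (lower, upper) in cost_intervals.items()
        if name != best
    )


# Hypothetical expected cost intervals under weak and strong information.
weak = {"A": (10.0, 14.0), "B": (12.0, 20.0), "C": (15.0, 18.0)}
strong = {"A": (10.0, 11.0), "B": (12.0, 13.0), "C": (15.0, 16.0)}

print(mini_upper_choice(weak), is_indeterminate(weak))
print(mini_upper_choice(strong), is_indeterminate(strong))
```

In both cases the criterion returns a choice, but only in the second case are the intervals narrow enough for that choice to be clearly separated from the alternatives.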
Although distinctions between decisions can still be drawn using the mini-upper criterion, there can be little confidence in coming to any conclusion in the face of the extensive uncertainty due to imprecision reflected in the large interval. In this case, the decision maker would have to collect more information to reduce the size of the interval and strengthen the ability to discriminate between the decisions.

In addition to its usefulness in a decision analysis, the D-S interval also provides important information when justifying the use of the conventional Bayesian approach, i.e. the Bayesian analysis and its conclusions are clearly most supportable in situations where the D-S interval is relatively small. Under these conditions it is, of course, highly likely that the D-S and Bayesian analyses would support identical decision choices.

Chapter 11

Summary and Conclusions

Water resources engineering decisions often have to be based on knowledge which is weak, and there is some question whether this is adequately expressed by a conventional precise probability distribution as required by the Bayesian and other conventional statistical approaches. The objective of this research was to explore the implementation of D-S theory in characterizing weak information situations in a water resources engineering decision context and to compare the approach with the conventional Bayesian analysis.

D-S theory is now seen, in retrospect, to be based on the concept of imprecise probability, which provides a much more possibilistic expression of probabilistic information than does a conventional probability distribution. It therefore offers an alternative statistical approach for dealing with weak information situations in real engineering practice. The adoption of D-S theory in engineering decision analysis, when faced with weak information, will often lead to results which are somewhat indeterminate but are believed to be more realistic.
As pointed out by Walley [1991], when weak information is involved, the decision choice should naturally be indeterminate to some degree.

The D-S theory has some additional features which are both appealing and important from a practical point of view. Unlike the conventional Bayesian scheme, both prior and sample based inputs are treated in exactly the same fashion in the D-S framework. Also, if there is no prior information, then a prior BPA is not required. Complete ignorance is represented by the ignorance BPA, which has the desired feature of not affecting the resultant BPA even if included in the analysis. Furthermore, in spite of its radically different approach, the D-S framework does not represent a complete break from the conventional Bayesian scheme. Indeed, when the prior is a conventional Bayesian distribution, the two approaches produce identical conventional distribution results for the same sample information. The D-S framework thus includes the conventional Bayesian approach as a special case. One benefit of this research, therefore, is that it also provides a new and broader perspective from which the appropriateness and limitations of the conventional Bayesian analysis can be judged.

The idea of a graphical representation of BPA's using the triangular diagrams is fully exploited in the exposition of the D-S theory in this thesis. Other than appearing in Strat's original conference paper in 1984, this idea appears to have been totally overlooked in the literature, but was found to be invaluable both in understanding basic D-S theory and in conceptualizing the expression of inputs and results in the form of BPA's. It also proved to be useful in the subsequent D-S analysis.
Without it, much of the new theoretical development which occurs in this thesis would not have occurred.

Engineering examples of applications of D-S theory, which include in some instances the adaptation of previously published Bayesian examples, are presented throughout the thesis. These examples are used not only to demonstrate implementation of the theory in an engineering context but also to provide a preliminary investigation of the quantitative implications of weak information in some practical decision making circumstances.

The D-S inference procedures presented in Chapters 4 to 6 provide a variety of routines for translating sampling information into BPA's. In most cases, the closest equivalent Bayesian approaches and results are also presented. The comparison shows that the Bayesian results generally fall within the range established by the D-S approaches. This indicates that there is some general agreement between the conventional Bayesian and D-S results. This kind of agreement is also mentioned by Wasserman [1988] and adds support to the notion that the D-S results are "reasonable" from a Bayesian perspective.

It might also be added that, on the basis of experience gained with this research, the Bayesian results are "reasonable" from the D-S perspective. But the D-S scheme has the distinct advantage of analytically detecting the imprecise probabilities due to weak information inputs. Furthermore, it is able to detect imprecise probabilities arising solely from sample information without obscuring them by insisting on a prior. As the information becomes stronger, the degree of imprecision decreases, and both D-S and conventional Bayesian approaches tend to converge. Thus the D-S scheme establishes how quickly the imprecision decreases as the sample information grows stronger.
This is of practical importance as it provides some quantitative indication of the size of sample inputs, with or without the presence of subjective knowledge, necessary for reducing the imprecision to a level acceptable to the decision maker. The incorporation of imprecise probability into inference leads, in the D-S decision analysis, to upper and lower expected utilities. This reduces the ability to choose the best decision, especially when the expected utility intervals for different decision alternatives overlap. The resultant indeterminism in the decision making is not necessarily bad but can be considered as more faithfully reflecting the true decision situation than does the "precise" Bayesian decision result.

The elicitation of prior BPA's from subjective knowledge discussed in Chapter 7 includes several different approaches which can be used to describe some typical forms of subjective knowledge in engineering circumstances. As demonstrated, the D-S scheme, along with the exploitation of the triangular BPA diagram, provides a far more flexible framework in which more varied expressions of subjective knowledge are possible.

In the early introductory chapters of this thesis an original synthesis of ideas concerning D-S theory, extracted mostly from the existing literature, is presented. These ideas have been developed by several statisticians over the last 30 years, and were dispersed widely in the statistical and AI literatures. Some entirely new explanations are also incorporated to assist the exposition. They are presented in a form which best serves the needs of engineers and engineering applications. New theoretical developments were necessary when the D-S theory was brought to bear on water resources engineering applications where the concept of imprecise probability appeared to have its greatest potential relevance.
These appear in the D-S inference of parameters of lognormal and extreme type I distributions in Chapter 6 and the adaptation of the two more general models in Chapters 8 and 9.

The hydrologic design model in Chapter 8 addresses an important problem posed by the US Bureau of Reclamation [1988], which was one of the principal motivations for this research. The model is able to incorporate the weak information inputs about various parameters associated with extreme annual flood events. In a situation which is commonly associated with weak inputs, a significant advantage of this approach is that it does not depend on any presumed, necessarily arbitrary, statistical model for the annual extreme floods.

The reliability analysis model presented in Chapter 9 represents an example of the application of D-S theory to civil engineering reliability analysis. This entails finding the BPA for an elementary function of two variables whose individual uncertainties are expressed in the form of BPA's. With weak input concerning the unknown means of the two random variables, i.e. load and resistance, it produces upper and lower expected failure probabilities, and the expected failure probability interval reflects the weakness in the information provided. When this interval is large the reliability analysis result is ambiguous, and more information would have to be supplied in order to reduce this interval and obtain a more meaningful reliability analysis result.

The expected utility interval is the most conspicuous new feature arising from a D-S decision analysis. A general interpretation of the factors governing the size of the interval for monotone utility functions is provided in Chapter 10. It reveals that with a monetary utility function, the size of the interval not only reflects the weakness of the input information but also tends to be enlarged if the utility function is sensitive to the unknown state of nature in the region of the interval.
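
As a minimal sketch of how such an interval arises, assume a BPA whose focal elements are closed intervals of a scalar state of nature, and a monotone utility function, so that the extreme utilities over each focal element occur at its end points. The lower and upper expected utilities are then the mass-weighted sums of the worst and best utility on each focal element. The function name, the BPA and the two utility functions below are hypothetical illustrations and are not taken from the thesis.

```python
from typing import Callable, Dict, Tuple

Interval = Tuple[float, float]


def expected_utility_bounds(
    bpa: Dict[Interval, float],
    utility: Callable[[float], float],
) -> Tuple[float, float]:
    """Return (lower, upper) expected utility for a BPA on closed intervals.

    Assumes the utility is monotone in the state of nature, so its minimum
    and maximum over each focal interval are attained at the end points.
    """
    total = sum(bpa.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("BPA masses must sum to 1")
    lower = 0.0
    upper = 0.0
    for (lo, hi), mass in bpa.items():
        u_lo, u_hi = utility(lo), utility(hi)
        lower += mass * min(u_lo, u_hi)  # worst case on this focal element
        upper += mass * max(u_lo, u_hi)  # best case on this focal element
    return lower, upper


# Hypothetical weak information: 0.6 of the belief is committed to [0.2, 0.4],
# the remaining 0.4 is uncommitted and spread over the whole frame [0, 1].
weak_bpa = {(0.2, 0.4): 0.6, (0.0, 1.0): 0.4}


def utility_a(theta: float) -> float:
    """Hypothetical decision A: no protection cost, large state-dependent loss."""
    return -100.0 * theta


def utility_b(theta: float) -> float:
    """Hypothetical decision B: fixed cost of 30, smaller state-dependent loss."""
    return -30.0 - 40.0 * theta


bounds_a = expected_utility_bounds(weak_bpa, utility_a)
bounds_b = expected_utility_bounds(weak_bpa, utility_b)
print("Decision A interval:", bounds_a)
print("Decision B interval:", bounds_b)
```

With these illustrative inputs the two intervals overlap, and the interval for decision A, whose utility is more sensitive to the state of nature, is the wider of the two. When every focal element is a single point the two bounds coincide and the conventional Bayesian expected utility is recovered.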
Of interest here is the situation when the size of the interval for each decision option is large compared with, say, the differences between the upper (or lower) expected utilities amongst the decision options. In this case, the engineer's ability to discriminate between decisions is substantially diminished and may even be nonexistent. When facing such a reality, more information would have to be collected to reduce the size of the intervals in order to make a more definitive decision.

In the past the introduction of Bayesian decision analysis into resource management problems represented a major advance, permitting, amongst other things, more varied qualities of information to be brought to bear on decision problems. The imprecise probability concept further expands the input information options so that, with weak information, the full extent of the uncertainties is more completely reflected than in the conventional Bayesian inputs. Equally important, it brings a broader perspective to uncertainty and a new viewpoint from which to judge when the Bayesian approach is adequate. Implementation of this concept in the form of the D-S scheme, as presented in this thesis, reveals both theoretically and quantitatively the nature and extent of imprecision arising from weak inputs and the degree of indeterminism that this brings to the decision analysis.

While the theoretical basis for applying D-S decision analysis to some civil engineering problems has been demonstrated in this thesis, full implementation in engineering practice cannot occur until a number of conditions are met. As was stated in Chapter 1, it would be premature to recommend this particular implementation of imprecise probability until the other alternative approaches have also been investigated.
Also, one might argue that because the D-S scheme can replicate any Bayesian decision result if conventional Bayesian inputs are provided, then the more general D-S scheme should always be used. But generalized equivalents of all Bayesian procedures have not yet been developed. Furthermore, there is no question that when the inputs can be satisfactorily specified in Bayesian terms, the less complicated and more familiar Bayesian scheme is preferable. However, an awareness of the alternatives for inputs offered by the D-S and other imprecise probability schemes would seem to be essential before this satisfaction with Bayesian type of inputs can be appropriately judged. Finally, the unfamiliar nature of the D-S theory presents a significant educational challenge which must be overcome before acceptance by engineering practitioners could occur. The greater flexibility and possible realism with which the D-S scheme represents weak information provides a significant incentive to overcome this major obstacle.

Bibliography

[1] Afshar, A. and M. A. Marino, Optimizing spillway capacity with uncertainty in flood estimator, J. Water Resour. Plng. and Mgmt., ASCE, 116(1), 71-84, 1989.
[2] Ang, A. H.-S. and W. H. Tang, Probability Concepts in Engineering Planning and Design, Vol. I: Basic principles, John Wiley & Sons, Inc., New York, 1975.
[3] Ang, A. H.-S. and W. H. Tang, Probability Concepts in Engineering Planning and Design, Vol. II: Decision, Risk, and Reliability, John Wiley & Sons, Inc., 1984.
[4] Benjamin, J. R., and C. A. Cornell, Probability, Statistics, and Decision for Civil Engineers, McGraw-Hill, New York, 1970.
[5] Berger, J. O., The robust Bayesian viewpoint (with discussion), in Robustness of Bayesian Analysis, edited by J. Kadane, pp. 63-124, North-Holland, Amsterdam, 1984.
[6] Berger, J. O., Statistical Decision Theory and Bayesian Analysis, Springer-Verlag, New York Inc., NY, 1985.
[7] Blockley, D. I., and J. F.
Baldwin, Uncertain inference in knowledge-based system, Journal of Engineering Mechanics, ASCE, 113(4), 467-481, 1987.
[8] Bodo, B. and T. E. Unny, Model uncertainty in flood frequency analysis and frequency-based design, Water Resour. Res., 12(6), 1109-1117, 1976.
[9] Bury, K. V., Statistical Models in Applied Science, John Wiley & Sons, New York, 1975.
[10] Caselton, W. F., Froese, T. M., Russell, A. D. and Luo, W., Belief input procedures for Dempster-Shafer based expert system, Artificial Intelligence in Engineering: Robotics and Processes, Computational Mechanics Publications, Southampton, pp. 351-370, 1988.
[11] Caselton, W. F. and Luo, W., Decision making with imprecise probabilities: Dempster-Shafer theory and application, Water Resour. Res., 28(12), 3071-3083, 1992.
[12] Davis, D. R., C. C. Kisiel, and L. Duckstein, Bayesian decision theory applied to design in hydrology, Water Resour. Res., 8(1), 33-41, 1972.
[13] Dempster, A. P., New methods for reasoning towards posterior distributions based on sample data, Ann. Math. Statist., 38, 325-339, 1966.
[14] Dempster, A. P., Upper and lower probabilities induced by a multivalued mapping, Ann. Math. Statist., 37, 355-374, 1967a.
[15] Dempster, A. P., Upper and lower probability inferences based on a sample from a finite univariate population, Biometrika, 54, 515-528, 1967b.
[16] Dempster, A. P., A generalization of Bayesian inference, J. Roy. Statist. Soc., Ser. B, 30, 205-247, 1968a.
[17] Dempster, A. P., Upper and lower probabilities generalized by a random closed interval, Ann. Math. Stat., 39(3), 957-966, 1968b.
[18] Dempster, A. P., Upper and lower probability inferences for families of hypotheses with monotone density ratios, Ann. Math. Stat., 40, 953-969, 1969.
[19] Dempster, A. P., and A. Kong, Comment, Statist. Sci., 2(1), 32-36, 1987.
[20] Duckstein, L. and I. Bogardi, Application of reliability theory to hydraulic engineering design, Journal of the Hydraulics Division, ASCE, Vol. 107, No.
HY7, 799-815, 1981.
[21] Einhorn, H. J. and R. M. Hogarth, Ambiguity and uncertainty in probabilistic inference, Psychol. Rev., 92, 433-461, 1985.
[22] Ellsberg, D., Risk, ambiguity, and the Savage axioms, Quart. J. of Econom., 75, 643-669, 1961.
[23] Fine, T. L., Lower probability models for uncertainty and nondeterministic processes, Presented in Workshop on assessing uncertainty, Stanford, California, 1987.
[24] Gardenfors, P. and N. Sahlin, Unreliable probabilities, risk-taking, and decision making, Synthese, 53, 361-386, 1982.
[25] Gardenfors, P. and N. Sahlin, Decision making with unreliable probabilities, Brit. J. Math. Statist. Psychol., 36, 240-251, 1983.
[26] Good, I. J., Subjective probability as the measure of a non-measurable set, Logic, Methodology and Philosophy of Science (edited by Ernest Nagel, Patrick Suppes, and Alfred Tarski), Stanford Univ. Press, 319-329, 1962.
[27] Haimes, Y. Y., R. Petrakian, P.-O. Karlsson and J. Mitsiopoulos, Multiobjective risk-partitioning: an application to dam safety risk analysis, U.S. Army Inst. for Water Resour. IWR Report 88-R-4, by Environmental Systems Management, Inc., 1988.
[28] Kmietowicz, Z. W. and A. D. Pearman, Decision Theory and Incomplete Knowledge, Gower, Aldershot, UK, 1981.
[29] Kuczera, G., Combining site-specific and regional information: An empirical Bayes approach, Water Resour. Res., 18(2), 361-324, 1982.
[30] Kuczera, G., A Bayesian surrogate for regional skew in flood frequency analysis, Water Resour. Res., 19(3), 821-832, 1983.
[31] Levi, I., Ignorance, probability and rational choice, Synthese, 53, 387-417, 1982.
[32] Levi, I., Imprecision and indeterminacy in probability judgement, Philos. Sci., 52, 390-409, 1985.
[33] Linsley, R. K. and J. B. Franzini, Water Resources Engineering, McGraw-Hill Book Company, N.Y., 1979.
[34] Linsley, R. K., M. A. Kohler and J. L. H. Paulhus, Hydrology for Engineers, McGraw-Hill, Inc., New York, 1982.
[35] Martz, H. F., and A. W.
Ray, Bayesian Reliability Analysis, John Wiley & Sons, USA, 1982.
[36] McAniff, R. J., Flug, M. and Wade, J., Bayesian analysis of energy prices on irrigation, J. Water Res. Div. ASCE, 106(WR2), 401-408, 1980.
[37] Melching, C. S., B. C. Yen and H. G. Wenzel, Jr., A reliability estimation in modeling watershed runoff with uncertainties, Water Resour. Res., 26(10), 2275-2286, 1990.
[38] Shafer, G., A Mathematical Theory of Evidence, Princeton Univ. Press, Princeton, N. J., 1976.
[39] Shafer, G., Constructive probability, Synthese, 48, 1-60, 1981.
[40] Shafer, G., Belief functions and parametric models (with discussion), J. Roy. Statist. Soc., Ser. B, 44, 322-352, 1982a.
[41] Shafer, G., Lindley's paradox, J. Amer. Statist. Assoc., 77, 325-351, 1982b.
[42] Shafer, G., Savage revisited (with discussion), Statistical Science, 1(4), 155-179, 1986.
[43] Shafer, G., Probability judgement in artificial intelligence and expert systems, Statistical Science, 2(1), 3-16, 1987.
[44] Shafer, G. and A. Tversky, Languages and designs for probability judgement, Cognitive Science, 9, 309-339, 1985.
[45] Shafer, G. and J. Pearl (Eds), Readings in Uncertain Reasoning, Morgan Kaufmann Publishers, Inc., San Mateo, California, 1990.
[46] Shane, R. M. and D. P. Gaver, Statistical decision theory techniques for the revision of mean flood flow regression estimates, Water Resour. Res., 6(6), 1649-1654, 1970.
[47] Shen, T. W., M. C. Bryson and I. D. Ochoa, Effect of tail behavior assumptions on flood predictions, Water Resour. Res., 16(2), 361-364, 1980.
[48] Smith, C. A. B., Consistency in statistical inference and decision (with discussion), J. Roy. Statist. Soc., Ser. B, 23, 1-25, 1961.
[49] Stedinger, J. and J. Grygier, Risk-cost analysis and spillway design, in Computer applications in water resources, H. Torno, ed., ASCE, N.Y., 1985.
[50] Strat, T.
M., Continuous belief functions for evidential reasoning, Proceedings Fourth National Conference on Artificial Intelligence, Austin, TX, 308-313, 1984.
[51] Toussi, S. and Khanbilvardi, R. M., Dempster's rule of combination in soil-loss prediction, J. Water Res. Div. ASCE, 114(4), 432-445, 1988.
[52] Tversky, A. and D. Kahneman, Judgement under uncertainty: heuristics and biases, Reprinted in Readings in Uncertain Reasoning (edited by G. Shafer and J. Pearl), 32-39, 1990, originally appeared in Science, 1124-1131, 1974.
[53] Tversky, A. and D. Kahneman, Rational choice and the framing of decisions, J. Business, 59, S251-S278, 1986.
[54] Vicens, G. J., I. Rodriguez-Iturbe and J. C. Schaake, A Bayesian framework for the use of regional information in hydrology, Water Resour. Res., 11(3), 405-414, 1972.
[55] U.S. Bureau of Reclamation, Guidelines to decision analysis, ACER Technical memorandum No. 7, Denver, Colorado, March 1986.
[56] Walley, P., Belief function representations of statistical evidence, Ann. Statist., 15, 1439-1465, 1987.
[57] Walley, P., Statistical Reasoning with Imprecise Probabilities, Chapman and Hall, London, 1991.
[58] Walley, P. and T. L. Fine, Towards a frequentist theory of upper and lower probability, Ann. Statist., 10, 741-761, 1982.
[59] Wasserman, L. A., Belief functions and likelihood, Technical Report 420, Department of Statistics, Carnegie-Mellon University, Pittsburgh, PA, 1988.
[60] Wasserman, L. A., Belief function and statistical inference, Canadian J. Statist., 18(3), 183-196, 1990a.
[61] Wasserman, L. A., Prior envelopes based on belief functions, Ann. Statist., 18, 454-464, 1990b.
[62] Weichselberger, K. and S. Pöhlmann, A Methodology for Uncertainty in Knowledge-based Systems, Lecture notes in artificial intelligence (edited by J. Siekmann), Subseries of lecture notes in computer science, 419, Springer-Verlag, NY, 1987.
[63] Wood, E. F., and I.
Rodriguez-Iturbe, Bayesian inference and decision making for extreme hydrologic events, Water Resour. Res., 11(4), 533-542, 1975a.
[64] Wood, E. F., and I. Rodriguez-Iturbe, A Bayesian approach to analyzing uncertainty among flood frequency models, Water Resour. Res., 11(6), 839-843, 1975b.
[65] Yen, B. C., S.-T. Chang, and C. S. Melching, First-order reliability analysis, in Stochastic and Risk Analysis in Hydraulic Engineering, edited by B. C. Yen, 1-34, Water Resources Publications, Littleton, Colo., 1986.

Appendix A

Determination of K, g(v), and f(u)

From equation (4.11), the commonality function is

$$H([u,v]) = \begin{cases} K\binom{n}{r}u^{r}(1-v)^{n-r} & \text{when } u \ge a \text{ and } v \le b, \text{ i.e. AREA I} \\ K\binom{n}{r}(1-\alpha)u^{r}(1-v)^{n-r} & \text{AREA II} \end{cases}$$

where AREA I and AREA II are indicated in Figure A.1.

Since the commonality function is not continuous, the corresponding BPA function will also not be continuous and, in particular, there will be probabilities concentrated on lines AB, AC, and at point A in Figure A.1. Based on the commonality function, the probability distributions and probability weights on different parts of the triangular area can be determined as follows.

Because of an abrupt change of commonality value at point A, there is a probability spike concentrated at A which is

$$\Pr\nolimits_{A} = \lim_{\epsilon \to 0}\left\{H([a,b]) - H([a-\epsilon, b+\epsilon])\right\} = K\binom{n}{r}\alpha a^{r}(1-b)^{n-r} \tag{A.1}$$

Figure A.1: The contiguous frame

Similarly line AB also carries a positive probability. The cumulative probability from point A to any point represented by $[a, v]$, $v < b$, on AB can be calculated as

$$\lim_{\epsilon \to 0}\left\{H([a+\epsilon, v]) - H([a-\epsilon, v])\right\} - \Pr\nolimits_{A} = K\binom{n}{r}\alpha a^{r}(1-v)^{n-r} - \Pr\nolimits_{A} \tag{A.2}$$

from which the probability density on AB can be determined as $K\binom{n}{r}\alpha(n-r)a^{r}(1-v)^{n-r-1}$. The probability $\Pr_{AB}$ carried by line AB is obtained by setting $v = a$ in equation (A.2), that is
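
$$\Pr\nolimits_{AB} = K\binom{n}{r}\alpha a^{r}(1-a)^{n-r} - K\binom{n}{r}\alpha a^{r}(1-b)^{n-r} \tag{A.3}$$

Using the same procedure as above, the probability on line AC can be determined as

$$\Pr\nolimits_{AC} = K\binom{n}{r}\alpha b^{r}(1-b)^{n-r} - K\binom{n}{r}\alpha a^{r}(1-b)^{n-r} \tag{A.4}$$

which is distributed according to the density $K\binom{n}{r}\alpha r u^{r-1}(1-b)^{n-r}$.

The probability densities (i.e. the BPA densities) on AREA I and AREA II can be determined directly from the commonality function, which are

$$m([u,v]) = \begin{cases} K r(n-r)\binom{n}{r}u^{r-1}(1-v)^{n-r-1} & \text{AREA I} \\ K r(n-r)\binom{n}{r}(1-\alpha)u^{r-1}(1-v)^{n-r-1} & \text{AREA II} \end{cases} \tag{A.5}$$

The probability weight in AREA I, $\Pr_{\text{AREA I}}$, can then be calculated by integrating the corresponding BPA density function in (A.5), i.e.

$$\begin{aligned} \Pr\nolimits_{\text{AREA I}} &= \int_{a}^{b}\left[\int_{u}^{b} K r(n-r)\binom{n}{r}u^{r-1}(1-v)^{n-r-1}\,dv\right]du \\ &= K\sum_{x=0}^{r-1}\binom{n}{x}\left[a^{x}(1-a)^{n-x} - b^{x}(1-b)^{n-x}\right] - K\binom{n}{r}\left(b^{r}-a^{r}\right)(1-b)^{n-r} \\ &= K\binom{n}{r}a^{r}(1-b)^{n-r} + K\sum_{x=0}^{r-1}\binom{n}{x}a^{x}(1-a)^{n-x} - K\sum_{x=0}^{r}\binom{n}{x}b^{x}(1-b)^{n-x} \end{aligned} \tag{A.6}$$

The probability weight in AREA II, $\Pr_{\text{AREA II}}$, can be determined in the same fashion as above but a study of Figure A.1 shows that $\Pr_{\text{AREA II}}$ can be calculated from $\Pr_{\text{AREA I}}$ by the following relationship

$$\Pr\nolimits_{\text{AREA II}} = (1-\alpha)K - (1-\alpha)\Pr\nolimits_{\text{AREA I}} \tag{A.7}$$

Once the probability components for different parts of the triangular area are obtained, the total probability over this area can then be computed by adding all these components, i.e.

$$\Pr = \Pr\nolimits_{A} + \Pr\nolimits_{AB} + \Pr\nolimits_{AC} + \Pr\nolimits_{\text{AREA I}} + \Pr\nolimits_{\text{AREA II}} \tag{A.8}$$

Setting $\Pr$ equal to 1.0, the normalizing factor $K$ can be determined as

$$K = \frac{1}{1 - \alpha + \alpha\sum_{x=0}^{r}\binom{n}{x}a^{x}(1-a)^{n-x} - \alpha\sum_{x=0}^{r-1}\binom{n}{x}b^{x}(1-b)^{n-x}} \tag{A.9}$$

Given the probability distribution in the triangular area in Figure A.1, the upper and lower marginal distributions $g(v)$ and $f(u)$ can be determined from equation (3.17) where the integrations must be performed separately for different regions as the density function is not continuous. The $g(v)$ and $f(u)$ distributions are

$$g(v) = \begin{cases} K\binom{n}{r}(1-\alpha)(n-r)v^{r}(1-v)^{n-r-1} & \text{(density along } 0 < v < a \text{ and } b < v < 1.0) \\ K\binom{n}{r}(n-r)v^{r}(1-v)^{n-r-1} & \text{(density along } a < v < b) \\ K\binom{n}{r}\alpha b^{r}(1-b)^{n-r} & \text{(probability at } v = b) \end{cases} \tag{A.10}$$

$$f(u) = \begin{cases} K\binom{n}{r}(1-\alpha)r u^{r-1}(1-u)^{n-r} & \text{(density along } 0 < u < a \text{ and } b < u < 1.0) \\ K\binom{n}{r}\alpha a^{r}(1-a)^{n-r} & \text{(probability at } u = a) \\ K\binom{n}{r}r u^{r-1}(1-u)^{n-r} & \text{(density along } a < u < b) \end{cases} \tag{A.11}$$

Appendix B

Upper and Lower Marginal Distributions, $g_{\mu}(\mu)$ and $f_{\mu}(\mu)$, of $\mu$

B.1 Expressing $g_{\mu}(\mu)$ and $f_{\mu}(\mu)$ in Terms of the Marginal Distributions of $\mu_x$ and $\mu_y$

Consider first the lower marginal distribution $f_{\mu}(\mu)$ of $\mu$. Substituting equation (9.8) for $m_{\mu}([\mu_l, \mu_u])$ in equation (9.9), $f_{\mu}(\mu)$ can be expressed as

$$\begin{aligned} f_{\mu}(\mu_l) &= \int_{\mu_l}^{\infty} m_{\mu}([\mu_l, \mu_u])\,d\mu_u \\ &= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} m_{\mu_x}([\mu_{xl}, \mu_{xu}])\left[\int_{\mu_l}^{\infty} m_{\mu_y}([\mu_{xu}-\mu_u,\ \mu_{xl}-\mu_l])\,d\mu_u\right]d\mu_{xu}\,d\mu_{xl} \end{aligned} \tag{B.1}$$

Consider the integration in the square brackets in (B.1). Let $s = \mu_{xu} - \mu_u$, then $d\mu_u = -ds$ and this integration becomes

$$\int_{\mu_l}^{\infty} m_{\mu_y}([\mu_{xu}-\mu_u,\ \mu_{xl}-\mu_l])\,d\mu_u = -\int_{\mu_{xu}-\mu_l}^{-\infty} m_{\mu_y}([s,\ \mu_{xl}-\mu_l])\,ds = \int_{-\infty}^{\mu_{xu}-\mu_l} m_{\mu_y}([s,\ \mu_{xl}-\mu_l])\,ds \tag{B.2}$$

From (9.7), $\mu_l$ is equal to $\mu_{xl} - \mu_{yu}$. Hence, the upper limit of the range of integration in (B.2) becomes $\mu_{xu} - (\mu_{xl} - \mu_{yu})$. Because $\mu_{xu} \ge \mu_{xl}$, this expression is always greater than or equal to $\mu_{yu}$. Since the BPA density $m_{\mu_y}([s, \mu_{xl}-\mu_l])$ is zero whenever $s$ is greater than $\mu_{yu}$, (B.2) is equivalent to integrating $m_{\mu_y}([s, \mu_{xl}-\mu_l])$ from $s = -\infty$ to $s = \mu_{yu}$ which, according to the definition in Chapter 3, is the upper marginal density $g_{\mu_y}(\mu_{xl} - \mu_l)$ of $\mu_y$. Thus, equation (B.1) becomes

$$\begin{aligned} f_{\mu}(\mu_l) &= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} m_{\mu_x}([\mu_{xl}, \mu_{xu}])\,g_{\mu_y}(\mu_{xl}-\mu_l)\,d\mu_{xu}\,d\mu_{xl} \\ &= \int_{-\infty}^{\infty}\left[\int_{-\infty}^{\infty} m_{\mu_x}([\mu_{xl}, \mu_{xu}])\,d\mu_{xu}\right]g_{\mu_y}(\mu_{xl}-\mu_l)\,d\mu_{xl} \end{aligned} \tag{B.3}$$

The integration in the square brackets in (B.3) is recognized as the lower marginal density function $f_{\mu_x}(\mu_{xl})$ of $\mu_x$.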
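Equation (B.3) then becomes

$$f_{\mu}(\mu_l) = \int_{-\infty}^{\infty} f_{\mu_x}(\mu_{xl})\,g_{\mu_y}(\mu_{xl}-\mu_l)\,d\mu_{xl} \tag{B.4}$$

Similarly the upper marginal distribution $g_{\mu}(\mu)$ of $\mu$ in (9.8) can be expressed in terms of the upper and lower marginal distributions of $\mu_x$ and $\mu_y$, which is

$$g_{\mu}(\mu_u) = \int_{-\infty}^{\infty} g_{\mu_x}(\mu_{xu})\,f_{\mu_y}(\mu_{xu}-\mu_u)\,d\mu_{xu} \tag{B.5}$$

B.2 The $g_{\mu}(\mu)$ and $f_{\mu}(\mu)$ for Normal Marginal Distributions of $\mu_x$ and $\mu_y$

If the upper and lower marginal distributions of $\mu_x$ and $\mu_y$ are normal, as represented by equations (9.17) and (9.20) respectively, the $f_{\mu}(\mu)$ and $g_{\mu}(\mu)$ in (B.4) and (B.5) are also normally distributed as shown in the following.

First, consider the $f_{\mu}(\mu)$ in (B.4). Based on (9.17) and (9.20), $f_{\mu}(\mu)$ can be expressed as

$$f_{\mu}(\mu) = \int_{-\infty}^{\infty} (2\pi s_x s_y)^{-1}\exp\left\{-\frac{1}{2}\left[\frac{\left(\mu_{xl} - (\bar{x} - c_x)\right)^2}{s_x^2} + \frac{\left(\mu_{xl} - \mu - (\bar{y} + c_y)\right)^2}{s_y^2}\right]\right\}d\mu_{xl} \tag{B.6}$$

The term in the square brackets in (B.6) can be expressed in complete squares as follows

$$\begin{aligned} &\frac{\left(\mu_{xl} - (\bar{x} - c_x)\right)^2}{s_x^2} + \frac{\left(\mu_{xl} - \mu - (\bar{y} + c_y)\right)^2}{s_y^2} \\ &= \left(\frac{1}{s_x^2} + \frac{1}{s_y^2}\right)\mu_{xl}^2 - 2\left[\frac{\bar{x} - c_x}{s_x^2} + \frac{\mu + (\bar{y} + c_y)}{s_y^2}\right]\mu_{xl} + \frac{(\bar{x} - c_x)^2}{s_x^2} + \frac{\left(\mu + (\bar{y} + c_y)\right)^2}{s_y^2} \\ &= \left(\frac{1}{s_x^2} + \frac{1}{s_y^2}\right)\left[\mu_{xl} - \frac{s_x^2 s_y^2}{s_x^2 + s_y^2}\left(\frac{\bar{x} - c_x}{s_x^2} + \frac{\mu + (\bar{y} + c_y)}{s_y^2}\right)\right]^2 - \frac{s_x^2 s_y^2}{s_x^2 + s_y^2}\left(\frac{\bar{x} - c_x}{s_x^2} + \frac{\mu + (\bar{y} + c_y)}{s_y^2}\right)^2 + \frac{(\bar{x} - c_x)^2}{s_x^2} + \frac{\left(\mu + (\bar{y} + c_y)\right)^2}{s_y^2} \\ &= \left(\frac{1}{s_x^2} + \frac{1}{s_y^2}\right)\left[\mu_{xl} - \frac{s_x^2 s_y^2}{s_x^2 + s_y^2}\left(\frac{\bar{x} - c_x}{s_x^2} + \frac{\mu + (\bar{y} + c_y)}{s_y^2}\right)\right]^2 + \frac{1}{s_x^2 + s_y^2}\left[\mu - \left((\bar{x} - c_x) - (\bar{y} + c_y)\right)\right]^2 \end{aligned} \tag{B.7}$$

Therefore $f_{\mu}(\mu)$ in (B.6) becomes

$$f_{\mu}(\mu) = \frac{1}{\sqrt{2\pi\left(s_x^2 + s_y^2\right)}}\exp\left\{-\frac{1}{2\left(s_x^2 + s_y^2\right)}\left[\mu - \left((\bar{x} - c_x) - (\bar{y} + c_y)\right)\right]^2\right\} \tag{B.8}$$

which is recognized as a normal distribution with mean $(\bar{x} - c_x) - (\bar{y} + c_y)$ and standard deviation $\sqrt{s_x^2 + s_y^2}$. Similarly the upper marginal distribution $g_{\mu}(\mu)$ expressed in (B.5) can be determined as

$$g_{\mu}(\mu) = \frac{1}{\sqrt{2\pi\left(s_x^2 + s_y^2\right)}}\exp\left\{-\frac{1}{2\left(s_x^2 + s_y^2\right)}\left[\mu - \left((\bar{x} + c_x) - (\bar{y} - c_y)\right)\right]^2\right\} \tag{B.9}$$

which is also a normal distribution with mean $(\bar{x} + c_x) - (\bar{y} - c_y)$ and standard deviation $\sqrt{s_x^2 + s_y^2}$.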
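
The closed form in (B.8) can be checked numerically. The sketch below evaluates the integral in (B.4) by the trapezoidal rule and compares it with the normal density with mean $(\bar{x} - c_x) - (\bar{y} + c_y)$ and standard deviation $\sqrt{s_x^2 + s_y^2}$. It assumes, as in B.2, a normal lower marginal for $\mu_x$ and a normal upper marginal for $\mu_y$. The numerical values and function names are hypothetical and are not taken from Chapter 9.

```python
import math


def normal_pdf(x: float, mean: float, sd: float) -> float:
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def lower_marginal_numeric(mu, mean_fx, s_x, mean_gy, s_y,
                           half_width=12.0, steps=4000):
    """Trapezoidal evaluation of the integral in (B.4):
    integral of f_mux(t) * g_muy(t - mu) dt, truncated to mean_fx +/- half_width * s_x.
    """
    a = mean_fx - half_width * s_x
    b = mean_fx + half_width * s_x
    h = (b - a) / steps
    total = 0.0
    for k in range(steps + 1):
        t = a + k * h
        weight = 0.5 if k in (0, steps) else 1.0  # end points get half weight
        total += weight * normal_pdf(t, mean_fx, s_x) * normal_pdf(t - mu, mean_gy, s_y)
    return total * h


def lower_marginal_closed(mu, mean_fx, s_x, mean_gy, s_y):
    """Closed form (B.8): normal with mean mean_fx - mean_gy and variance s_x^2 + s_y^2."""
    return normal_pdf(mu, mean_fx - mean_gy, math.sqrt(s_x ** 2 + s_y ** 2))


# Hypothetical inputs: x_bar = 10, c_x = 1, s_x = 2 and y_bar = 6, c_y = 0.5, s_y = 1.5.
MEAN_FX = 10.0 - 1.0   # mean of the lower marginal of mu_x, x_bar - c_x
MEAN_GY = 6.0 + 0.5    # mean of the upper marginal of mu_y, y_bar + c_y
S_X, S_Y = 2.0, 1.5

for mu in (-3.0, 0.0, 2.5, 6.0):
    numeric = lower_marginal_numeric(mu, MEAN_FX, S_X, MEAN_GY, S_Y)
    closed = lower_marginal_closed(mu, MEAN_FX, S_X, MEAN_GY, S_Y)
    print(f"mu={mu:5.1f}  numeric={numeric:.8f}  closed={closed:.8f}")
```

The same check applies to (B.9) by passing the mean of the upper marginal of $\mu_x$, $\bar{x} + c_x$, and the mean of the lower marginal of $\mu_y$, $\bar{y} - c_y$.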
Imprecise probability and decision in civil engineering : Dempster-Shafer theory and application Luo, Wuben 1993
Title | Imprecise probability and decision in civil engineering : Dempster-Shafer theory and application |
Creator |
Luo, Wuben |
Date Issued | 1993 |
Description | Over the last three decades, Bayesian theory has been widely adopted in civil engineering for dealing with uncertainty and for purposes of decision making under uncertainty. However, the Bayesian approach is not without criticisms. One major concern has been that information or knowledge, no matter how weak or sparse, must necessarily be represented by conventional, precisely specified, probabilities. This has led to the development of statistical methods which allow for more flexible expressions of both information inputs and inferred results. More recently a general concept, called imprecise probability, which embraces a number of these methods, has been described [Walley, 1991]. Weak information is often encountered in civil engineering. This is especially true in decision making as major decisions are often dominated by random, but infrequent, extreme natural events. For these rare events the sample record is usually short and the relevant subjective knowledge based on human experience is also likely to be very limited. The imprecise probability concept therefore has potential relevance to some important civil engineering decision problems. Among the existing imprecise probability schemes, Dempster-Shafer (D-S) theory is prominent. This theory has attracted considerable attention in the Artificial Intelligence (AI) field, but the applications are different from those considered here. This has largely overshadowed the relevance of the theory to the more conventional inference and decision making types of problems encountered in civil engineering. In this thesis, some applications of the D-S theory primarily to water resources engineering decision problems are developed. The engineering examples presented throughout the thesis provide some indications of the impact of implementing imprecise probabilities on engineering decision analysis.
In most instances a closest equivalent Bayesian analysis is performed and the results compared with those obtained from the D-S scheme. The concept of imprecise probability is philosophically important to the research and is briefly reviewed. The theoretical ingredients of D-S theory which are necessary to support engineering applications are then introduced. This is followed by presentation of several different procedures for translating weak sampling information inputs into D-S inference results. The elicitation of subjective prior inputs for the D-S scheme is also discussed and includes representing some typical engineering expressions of subjective knowledge. Two civil engineering models, one in hydrologic design and the other in reliability analysis, are also developed, and they demonstrate how the scheme can be applied in more complex engineering situations. When presented with weak information input, the D-S decision analysis yields upper and lower expected utilities. This reduces the ability to choose between the best decision alternatives, especially when the expected utility intervals for different decisions overlap. But this reduction in resolution is believed to more realistically reflect the true decision making situation. The factors governing the size of the expected utility interval are also discussed. |
Extent | 7907424 bytes |
Genre |
Thesis/Dissertation |
Type |
Text |
FileFormat | application/pdf |
Language | eng |
Date Available | 2008-09-10 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
DOI | 10.14288/1.0050446 |
URI | http://hdl.handle.net/2429/1813 |
Degree |
Doctor of Philosophy - PhD |
Program |
Civil Engineering |
Affiliation |
Applied Science, Faculty of Civil Engineering, Department of |
Degree Grantor | University of British Columbia |
GraduationDate | 1993-11 |
Campus |
UBCV |
Scholarly Level | Graduate |
AggregatedSourceRepository | DSpace |