TWO REPRESENTATION THEOREMS AND THEIR APPLICATION TO DECISION THEORY

by

CHEW SOO HONG
M.A., Claremont Graduate School, 1977

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY
in
THE FACULTY OF GRADUATE STUDIES
(Interdisciplinary Programme)
(Mathematics, Economics, Management Science)

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
November 1980
© Chew Soo Hong, 1980

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the Head of my Department or by his representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of Interdisciplinary
The University of British Columbia
2075 Wesbrook Place
Vancouver, Canada V6T 1W5

Date

ABSTRACT

This dissertation consists of two parts. Part I contains the statements and proofs of two representation theorems. The first theorem, proved in Chapter 1, generalizes the quasilinear mean of Hardy, Littlewood and Polya by weakening their axiom of quasilinearity. Given two distributions with the same mean, quasilinearity requires that mixtures of these distributions with another distribution in the same proportions share the same mean, regardless of the distribution with which they are mixed. We weaken the quasilinearity axiom by allowing the proportions that give rise to the same mean to be different. This leads to a more general mean, denoted by $M_{\alpha\phi}$, which has the form:

\[ M_{\alpha\phi}(F) = \phi^{-1}\left( \int_R \alpha\phi \, dF \Big/ \int_R \alpha \, dF \right), \]

where $\phi$ is continuous and strictly monotone, $\alpha$ is continuous and strictly positive (negative) and $F$ is a probability distribution.
The quasilinear mean, denoted by $M_\phi$, results when the $\alpha$ function is constant. We showed, in addition, that the $M_{\alpha\phi}$ mean has the intermediate value property, and can be consistent with the stochastic dominance (including higher degree ones) partial order. We also generalized a well known inequality among quasilinear means, via the observation that the $M_{\alpha\phi}$ mean of a distribution $F$ can be written as the quasilinear mean $M_\phi$ of a distribution $F^\alpha$, where $F^\alpha$ is derived from $F$ via $\alpha$ as the Radon-Nikodym derivative of $F^\alpha$ with respect to $F$. We noted that the $M_{\alpha\phi}$ mean induces an ordering among probability distributions via the maximand, $\int_R \alpha\phi \, dF / \int_R \alpha \, dF$, that contains the (expected utility) maximand, $\int_R \phi \, dF$, of the quasilinear mean as a special case.

Chapter 2 provides an alternative characterization of the above representation for simple probability measures on a more general outcome set where mean values may not be defined. In this case, axioms are stated directly in terms of properties of the underlying ordering. We retained several standard properties of expected utility, namely weak order, solvability and monotonicity, but relaxed the substitutability axiom of Pratt, Raiffa and Schlaifer, which is essentially a restatement of quasilinearity in the context of an ordering.

Part II of the dissertation concerns one specific area of application: decision theory. Interpreting the $M_{\alpha\phi}(F)$ mean of Chapter 1 as the certainty equivalent of a monetary lottery $F$, the corresponding induced binary relation has the natural interpretation as 'strict preference' between lotteries. For non-monetary (finite) lotteries, we apply the representation theorem of Chapter 2. The hypothesis, that a choice agent's preference among lotteries can be represented by a pair of $\alpha$ and $\phi$ functions through the induced ordering, is referred to as alpha utility theory.
This is logically equivalent to saying that the choice agent obeys either the mean value (certainty equivalent) axioms or the axioms on his strict preference binary relation. Alpha utility theory is a generalization of expected utility theory in the sense that the expected utility representation is a special case of the alpha utility representation. The motivation for generalizing expected utility comes from difficulties it faced in the description of certain choice phenomena, especially the Allais paradox. These are summarized in Chapter 3.

Chapter 4 contains the formal statements of assumptions and the derivations of normative and descriptive implications of alpha utility theory. We stated conditions, taken from Chapter 1, for consistency with stochastic dominance and global risk aversion and derived a generalized Arrow-Pratt index of local risk aversion. We also demonstrated how alpha utility theory can be consistent with those choice phenomena that contradict the implications of expected utility, without violating either stochastic dominance or local risk aversion. The chapter ended with a comparison of alpha utility with two other theories that have attracted attention, namely Allais' theory and prospect theory.

Several other applications of the representation theorems of Part I are considered in the Conclusion of this dissertation. These include the use of the $M_{\alpha\phi}$ mean as a model of the equally-distributed-equivalent level of income (Atkinson, 1970), and as a measure of asymmetry of a distribution (Canning, 1934). The alpha utility representation can also be used to rank social situations in the sense of Harsanyi (1977). We ended by pointing out an open question regarding conditions for comparative risk aversion and stated an extension of Samuelson's (1967) conjecture that Arrow's impossibility theorem would hold if individuals and society express their preferences by von Neumann-Morgenstern utility functions.
CONTENTS

INTRODUCTION

PART I  REPRESENTATION THEOREMS

1 GENERALIZING THE QUASILINEAR MEAN OF HARDY, LITTLEWOOD AND POLYA
  1.1 INTRODUCTION
  1.2 AXIOMS OF MEAN VALUE
  1.3 REPRESENTATION THEOREM
  1.4 PROPERTIES OF THE $M_{\alpha\phi}$ MEAN

2 GENERALIZING THE EXPECTED UTILITY REPRESENTATION THEOREM
  2.1 INTRODUCTION
  2.2 PRELIMINARY DEFINITIONS
  2.3 AXIOMS
  2.4 REPRESENTATION THEOREMS

PART II  APPLICATION TO DECISION THEORY

BACKGROUND

3 CRITIQUE OF EXPECTED UTILITY THEORY
  3.1 INTRODUCTION
  3.2 SYSTEMATIC VIOLATION OF THE STRONG INDEPENDENCE PRINCIPLE
  3.3 CONCURRENCE OF RISK SEEKING AND RISK AVERTING BEHAVIOR
  3.4 SOME PROBLEMS WITH PROBLEM REPRESENTATION
  3.5 SUMMARY

4 A NEW THEORY
  4.1 INTERPRETING MEAN VALUE AS CERTAINTY EQUIVALENT
  4.2 REPRESENTATION OF A PREFERENCE BINARY RELATION
  4.3 NORMATIVE IMPLICATIONS
    4.3.1 Ratio Consistency
    4.3.2 Assessment
    4.3.3 Stochastic Dominance
    4.3.4 Global Risk Aversion
    4.3.5 Local Risk Aversion: The Arrow-Pratt Index
  4.4 DESCRIPTIVE IMPLICATIONS
    4.4.1 Systematic Violation of the Strong Independence Axiom
    4.4.2 Stochastic Dominance
    4.4.3 Local and Global Risk Properties: Concurrence of Risk Averting and Risk Seeking Behavior
    4.4.4 Some Problems with Problem Representation
  4.5 CRITIQUE OF ALLAIS' THEORY AND PROSPECT THEORY
    4.5.1 Allais' Theory
    4.5.2 Prospect Theory
    4.5.3 Comparison

CONCLUSION

5 CONCLUSION
  5.1 SUMMARY
  5.2 EXTENSIONS

TABLES

3.1 Summary of Empirical Results Relevant to the HILO Lottery Structure
4.1 Allowable Choice Patterns under Alpha Utility Theory
4.2 Comparison among Theories

FIGURES

3.1 A "typical" von Neumann-Morgenstern utility function
3.2 Four decision problems
3.3 The composition of the AT(B ) lottery from A (Bn)
3.4 The HILO structure of three consequence lotteries
3.5 Standard lottery comparison
3.6 Examples of uc and u^ (based on data from Allais (1977), Appendix C)
3.7 Examples of u^, and u^ (based on data from MacCrimmon et al. (1972))
3.8 Graphical representation of two lotteries
3.9 A sequential representation of lottery B
4.1 Ratio Consistency illustrated using barycentric coordinates
4.2 Geometric proof of the Ratio Consistency property
4.3 Probability Equivalent method
4.4 Test of Substitution axiom
4.5 Preference Pattern for $\alpha(I) < 1$
4.6 Consistency Conditions for Stochastic Dominance
4.7 Conditions for Local Risk Aversion
4.8 An admissible alpha function
4.9 A pair of u£ and u^ derived from an alpha utility decision maker
4.10 A pair of u and u derived from an alpha utility decision maker
5.1 An alpha function that discriminates against the rich

PREFACE

The preface is perhaps a suitable place for an informal discussion of ideas leading to this dissertation.

In their axiomatization of expected utility, von Neumann and Morgenstern likened the "utility", loosely speaking, of a lottery to the center of gravity of a mass distribution. In other words, the utility, $u(x_1,\ldots,x_n;p_1,\ldots,p_n)$, of a lottery, $(x_1,\ldots,x_n;p_1,\ldots,p_n)$, which pays $x_i$ with probability $p_i$, should be the mean or average of the utilities associated with each probable outcome. This led to the expected utility expression:

\[ u(x_1,\ldots,x_n;p_1,\ldots,p_n) = \sum_{i=1}^{n} p_i \, u(x_i). \]

Together with their elegant axiomatic characterization, there seemed to be a strong case for the adoption of expected utility. The Allais-Savage controversy, however, clouded the picture somewhat. Some, including Allais, suggested replacing the probabilities, $\{p_i\}_{i=1}^{n}$, with a more general set of weights, $\{\omega_i\}_{i=1}^{n}$, which depend on the lottery and sum to unity. In this case,

\[ u(x_1,\ldots,x_n;p_1,\ldots,p_n) = \sum_{i=1}^{n} \omega_i(x_1,\ldots,x_n;p_1,\ldots,p_n) \, u(x_i). \]

At this level of generality, the only testable implication is that the utility of a lottery is intermediate in value between the maximum and the minimum attainable utilities. This property, which may be called the intermediate value property, is compatible with our intuition about the utility of a lottery as a mean value. To impose more structure, one may consider restricting the $\omega_i$ weights to:

\[ \omega_i(x_1,\ldots,x_n;p_1,\ldots,p_n) = \frac{w_i(x_1,\ldots,x_n;p_1,\ldots,p_n)}{\sum_{j=1}^{n} w_j(x_1,\ldots,x_n;p_1,\ldots,p_n)}, \]

where $\{w_i\}_{i=1}^{n}$ is a set of positively valued weight functions that depend on the lottery, $(x_1,\ldots,x_n;p_1,\ldots,p_n)$. A further restriction is obtained by imposing a desirable property called the combination property: the utility of a lottery remains unchanged if we combine different probabilities of getting the same outcome. This implies that the $w_i$ functions are of the form:

\[ w_i(x_1,\ldots,x_n;p_1,\ldots,p_n) = p_i \, w_i(x_i; x_{-i}, p_{-i}), \]

where $x_{-i}$ and $p_{-i}$ denote the outcome vector and probability vector with the $i$th component deleted. With a bold sleight of hand, as yet unsubstantiated by any a priori reason, the $w_i$ values are assumed to be obtained from a single function $\alpha$ evaluated at the $i$th outcome, $x_i$. This leads to a fairly tractable generalization of expected utility:

\[ u(x_1,\ldots,x_n;p_1,\ldots,p_n) = \sum_{i=1}^{n} \frac{p_i \, \alpha(x_i)}{\sum_{j=1}^{n} p_j \, \alpha(x_j)} \, u(x_i). \]

Of course, a nice expression is just the first step. The next thing is to work back and forth in order to identify a minimal set of characteristic properties from which the $\alpha$ and $u$ functions can be constructed. These properties, once found, would then necessarily be weaker than the corresponding ones for expected utility. The first proof of such a representation theorem is in the context of generalizing the quasilinear mean of Hardy, Littlewood and Polya, given in Chapter 1. The corresponding result in terms of a preference ordering is given in Chapter 2.
Although the basic result follows from a straightforward reinterpretation of those of Chapter 1, a self-contained treatment of Chapter 2 is presented so that readers who are used to the preference ordering approach can skip Chapter 1. Bob Weber provided an elegant geometrical interpretation of one of the axioms (Ratio Consistency) in an earlier version of the preference ordering theorem and demonstrated that it was redundant. This led to a much weaker axiom in the current proof of the mean value representation theorem.

As is usual, the organization of the dissertation assumes the normal trappings of academic writing. Formal results, which are developed rigorously in Part I, precede interpretations in the context of decision theory in Part II, designed to build a prima facie case for the adoption of the more general expected utility hypothesis on both theoretical and empirical grounds. This division may cause the appearance of repetitiousness, but has the advantage of making Part II self-contained.

Parts of the material in Part II are the result of joint work with Ken MacCrimmon, my research supervisor. I would also like to acknowledge my debt to the other members of my guidance committee. The numerous instances when I went to Shelby Brumelle and Cindy Greenwood for help were instrumental in enabling me to carry through the analysis in the proofs of the representation theorems, and also in the understanding of some basic mathematics. I have benefited from Daniel Kahneman's research with Amos Tversky on the psychology of judgement and decision-making under uncertainty and also from the seminars in his home. I have also benefited from discussions with Dave Donaldson on his work with Charles Blackorby on the measurement of inequality and poverty. I was exposed to research in the economics of uncertainty during Yoshitsugo Kanemoto's seminars.
Thanks are also due to a non-member, John Butterworth, for suggesting the problem during the final examination of his information choice course.

To my Grandmother.

INTRODUCTION

Two well known representation theorems provide the starting point for this dissertation. The first, due to Hardy, Littlewood and Polya (1934), is an axiomatic characterization of a rather general class of mean values called the quasilinear mean:

\[ M_\phi(F) = \phi^{-1}\left( \int_R \phi \, dF \right). \]

$M_\phi(F)$ denotes the quasilinear mean associated with a probability distribution $F$ and characterized by a strictly monotone function $\phi$. Hardy, Littlewood and Polya proved their characterization for simple probability distributions defined on a compact interval. Examples of quasilinear means include the widely used arithmetic mean ($\phi$ is linear), the geometric mean ($\phi$ is logarithmic), the harmonic mean ($\phi$ is of the form $1/x$) and the $r$th moment mean, also known as the general mean of order $r$ ($\phi$ is of the form $x^r$).

The second representation theorem has its genesis in the St. Petersburg paradox: an individual is not willing to stake all that he possesses to take part in a lottery that pays $2^i$ dollars with $2^{-i}$ chance, which demonstrates the limitation of using the mathematical expectation of payoffs as a general rule for the ordering of risky prospects. This led Bernoulli to propose, in 1738, the expectation of a 'moral worth' function, $u$, of wealth as an alternative. In particular, he used the logarithmic function, derived by assuming that an infinitesimal increase in the worth of wealth is proportional to an infinitesimal increase in wealth but inversely proportional to wealth itself. The monetary worth or certainty equivalent $M(F)$ corresponding to a lottery $F$ is then given by $u(M(F)) = \int_R u \, dF$; or alternatively, $M(F) = u^{-1}\left( \int_R u \, dF \right)$. Note that this expression is the same as that defining the quasilinear mean.
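The quasilinear mean and Bernoulli's certainty equivalent described above share one formula, which can be sketched numerically for simple (finitely supported) distributions. The helper name `quasilinear_mean`, the three-point distribution, and the truncation of the St. Petersburg lottery at 40 rounds are assumptions made only for this illustration; the logarithm is applied to the payoffs directly, which is a simplification.

```python
import math

def quasilinear_mean(outcomes, probs, phi, phi_inv):
    """Return phi_inv(sum_i p_i * phi(x_i)) for a simple distribution."""
    return phi_inv(sum(p * phi(x) for x, p in zip(outcomes, probs)))

# Hypothetical three-point distribution on a compact interval.
xs = [1.0, 2.0, 4.0]
ps = [0.25, 0.5, 0.25]

arithmetic = quasilinear_mean(xs, ps, lambda x: x, lambda y: y)        # phi linear
geometric = quasilinear_mean(xs, ps, math.log, math.exp)               # phi logarithmic
harmonic = quasilinear_mean(xs, ps, lambda x: 1.0 / x, lambda y: 1.0 / y)
order_2 = quasilinear_mean(xs, ps, lambda x: x ** 2, math.sqrt)        # phi = x^r, r = 2

# Truncated St. Petersburg lottery (an assumption of this sketch):
# 2**i dollars with chance 2**-i for i = 1..n, with the leftover mass 2**-n
# placed on the top prize so that the probabilities sum to one.
n = 40
prizes = [2.0 ** i for i in range(1, n + 1)]
chances = [2.0 ** -i for i in range(1, n + 1)]
chances[-1] += 2.0 ** -n

expected_payoff = quasilinear_mean(prizes, chances, lambda x: x, lambda y: y)
log_certainty_equivalent = quasilinear_mean(prizes, chances, math.log, math.exp)
```

For the three-point distribution the four means come out ordered as harmonic, geometric, arithmetic, order 2. For the truncated lottery the expected payoff grows by one dollar with every extra round, while the logarithmic certainty equivalent stays close to 4 dollars, which is the contrast the paradox turns on.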
The first axiomatic treatment leading to the expectation of a function of payoffs as a rule for the ordering of lotteries is given by Ramsey (1926) in his "Foundation of Mathematics". von Neumann and Morgenstern (1947) provided an alternative axiomatization independently in their "Theory of Games and Economic Behavior" and initiated the use of the term "utility". They proved the existence of an order-preserving map on a mixture set (e.g. the space of probability distributions) subject to a minimal set of postulates such that the order-preserving map is the expectation of a utility function. Their result is now commonly referred to as the expected utility theorem.

The usefulness of quasilinear means needs no elaboration.¹ Expected utility theory, i.e., the application of the expected utility theorem to decision-making by interpreting the binary relation as a preference relation and the mixture set as a set of risky alternatives or, equivalently, the interpretation of the quasilinear mean as a certainty equivalent, has attracted considerable attention since its inception. Alternative axiomatizations were given by Marschak (1950), Samuelson (1952), Herstein and Milnor (1953), Savage (1954), Anscombe and Aumann (1963), Pratt, Raiffa and Schlaifer (1964), Jensen (1967), DeGroot (1970), Fishburn (1970), Arrow (1971) and others. Savage (1954), Blackwell and Girshick (1961) and DeGroot (1970) applied expected utility theory to statistical decisions.

¹ For a survey of the use of the rth moment mean in statistics, see Norris (1976). Blackorby & Donaldson (1978a) contains examples of the use of the quasilinear mean in the measurement of income inequality. Weerahandi & Zidek (1979) provides an alternative characterization of the rth moment mean for probability distributions defined on the positive half-line. Ben-Tal (1977) showed that quasilinear means are ordinary arithmetic means defined on linear spaces with suitably chosen operations of addition and multiplication.
In addition, it served as the foundation for Arrow (1971) and Marschak and Radner (1972) in their investigation of the economics of uncertainty, and for Howard (1964) and Keeney and Raiffa (1976) in their work on decision analysis.

Expected utility theory, though, has been less successful in describing and explaining actual choices (Edwards, 1961; Slovic, Fischhoff and Lichtenstein, 1977). Even von Neumann and Morgenstern realized at the outset that expected utility rules out complementarity among mutually exclusive consequences, a utility for gambling per se, and other behaviors that seem relatively common (von Neumann and Morgenstern, 1947, Appendix A, Sec. 3). Subsequently, various challenges, beginning with the Allais paradox (Allais, 1953), have called into question the empirical validity of a key property of expected utility theory, the strong independence principle (Marschak, 1950; Samuelson, 1952). The strong independence principle requires that the ranking among lotteries remains unaltered when each lottery is composed with an identical lottery using the same probability. This is closely linked to the axiom of quasilinearity of Hardy, Littlewood and Polya (1934), the "substitution of lotteries" of Pratt, Raiffa and Schlaifer (1964) and the "sure-thing" principle of Savage (1954).

Part I of this dissertation contains the statements and proofs of two representation theorems. The first theorem, proved in Chapter 1, generalizes the quasilinear mean of Hardy, Littlewood and Polya by weakening their axiom of quasilinearity. Given two distributions with the same mean, quasilinearity requires that mixtures of these distributions with another distribution in the same proportions share the same mean, regardless of the distribution with which they are mixed. We weaken the quasilinearity axiom by allowing the proportions that give rise to the same mean to be different.
This gives rise to a more general mean, $M_{\alpha\phi}$, that is specified by a continuous and strictly positive (negative) function, $\alpha$, and a continuous and strictly monotone function, $\phi$. The quasilinear mean results when the $\alpha$ function is constant. In addition, we show that the $M_{\alpha\phi}$ mean has the Intermediate Value Property, and provide necessary and sufficient conditions for consistency with the stochastic dominance (including higher degree ones) partial order. We also generalize a well known inequality among quasilinear means, by the observation that the $M_{\alpha\phi}$ mean of a distribution $F$ can be written as the quasilinear mean $M_\phi$ of a distribution $F^\alpha$, where $F^\alpha$ is derived from $F$ via $\alpha$ as the Radon-Nikodym derivative of $F^\alpha$ with respect to $F$.

As was noted earlier, the $M_\phi$ mean induces an ordering among distributions via the (expected utility) maximand, $\int_R \phi \, dF$. Correspondingly, the $M_{\alpha\phi}$ mean induces a more general ordering via the maximand, $\int_R \alpha\phi \, dF / \int_R \alpha \, dF$. We prove, in Chapter 2, an alternative characterization of the above representation for simple probability measures on a more general outcome set where mean values may not be defined. In this case, axioms are stated directly in terms of properties of the underlying ordering. We retain several standard properties of expected utility, namely weak order, solvability and monotonicity, but relax the substitution principle of Pratt, Raiffa and Schlaifer, which is essentially a restatement of quasilinearity in the context of an ordering.

The motivation for the research contained in Part I comes from paradoxes in the field of decision theory. Hence, the formulation of a new theory of choice (called alpha utility theory) that generalizes expected utility theory constitutes Part II (Chapters 3 and 4) of this dissertation. Chapter 3 gathers together the criticisms and empirical findings which contradict the implications of expected utility theory, to pave the way for the development of alpha utility in the first two sections of Chapter 4.
Sections 3 and 4 contain, respectively, the derivation of the normative implications (e.g. stochastic dominance, global and local risk aversion) and descriptive implications (in particular, relating to the descriptive inadequacy of expected utility theory) of alpha utility theory. The chapter ends with a comparison in Section 5 of alpha utility with two alternative theories that have attracted significant interest.

PART I  REPRESENTATION THEOREMS

1 GENERALIZING THE QUASILINEAR MEAN OF HARDY, LITTLEWOOD AND POLYA

1.1 INTRODUCTION

What is mean value? Conventional wisdom tells us that it represents, typifies or in some way measures the central tendency of a distribution. We are rescued from the ambiguity, as in elementary statistics texts, by examples. Some familiar notions of mean value include median, mode, arithmetic mean, geometric mean, harmonic mean and root-mean-square or, more generally, the rth root of the rth moment of a positive random variable (known also as the general mean of order r). Of these, the arithmetic mean is the most widely used. There are, however, situations for which the arithmetic mean may not be the most appropriate 'typical' value: notably, the discrepancy between per capita income and the 'typical' income for a society, the bulk of whose wealth is in the hands of a few.

In their pioneering work of 1934, Hardy, Littlewood and Polya showed how a general class of mean values, called the quasilinear mean, can be tailored to our needs subject to a necessary and sufficient set of axioms. This class includes as special cases all the examples of mean values mentioned above except for median and mode, which do not satisfy their axioms. We generalize, in this chapter, the quasilinear mean via a weaker set of axioms stated in section 2. They are shown in section 3 to be necessary and sufficient for the representation of a class of mean values that generalizes the quasilinear mean. Finally, we derive some properties of our mean value in section 4.
Applications will be discussed in Part II and the conclusion of this dissertation.

1.2 AXIOMS OF MEAN VALUE²

Let $D_J$ denote the space of probability distributions with all their mass concentrated in some interval $J$ of the real line $R$ ($J$ need not be bounded). We consider a functional $M$ whose domain is $D_J$. What properties should $M$ possess in order to be a mean value? A natural candidate, motivated by the mean-value theorems of elementary calculus, is:

Property 1: Intermediate Value Property
$M(F) \in \mathrm{conv\,Supp}(F)$, $\forall F \in D_J$.

The support of a distribution $F$, $\mathrm{Supp}(F)$, consists of each point $x$ such that every open set containing $x$ has positive mass. $\mathrm{conv\,Supp}(F)$ is the smallest interval containing $\mathrm{Supp}(F)$. The intermediate value property requires that the mean of a distribution be neither greater than the maximum attainable value nor less than the minimum attainable value. Axiom 1 below is a consequence of the intermediate value property.

Axiom 1: Consistency with Certainty
$M(\delta_x) = x$, $\forall x \in J$.

The distribution $\delta_x$ refers to the step function at $x$ which, in terms of probability, indicates obtaining $x$ with probability 1. Another property that seems reasonable is given by Axiom 2.

² In this section, the terms axiom and property are used interchangeably. Properties carry the "axiom" label if they appear in the final representation theorem as a characteristic property.

Axiom 2: Betweenness
$\forall F, G \in D_J$, if $M(F) < M(G)$ then $\forall \beta \in (0,1)$, $M(\beta F + (1-\beta)G) \in (M(F), M(G))$.

It is straightforward to check that Betweenness is equivalent to Property 2 stated below:

Property 2: Mixture-monotonicity
$\forall F, G \in D_J$, if $M(F) < M(G)$, then $M(\beta F + (1-\beta)G) < M(\gamma F + (1-\gamma)G)$ if $1 \ge \beta > \gamma \ge 0$.

Lemma 1.1: Axiom 2 $\Leftrightarrow$ Property 2.
Proof: Omitted.

A distribution $G$ is said to stochastically dominate another distribution $F$ in the first degree, denoted by $G \ge^1 F$, if $G(x) \le F(x)$ at every point $x$.
If, in addition, $G$ is strictly less than $F$ at some point, then $G$ stochastically dominates $F$ strictly in the first degree, denoted by $G >^1 F$. Stochastic dominance of the first degree is an appealing partial order. Consistency of mean value with this partial order is stated as Property 3:

Property 3: Monotonicity
$\forall F, G \in D_J$, $G >^1 F \Rightarrow M(G) > M(F)$.

The next axiom deals with the effect on mean value of certain changes in the composition of the underlying distribution.

Axiom 3: Weak Substitution
$\forall F, G \in D_J$, if $M(F) = M(G)$ then $\forall \beta \in (0,1)$ $\exists \gamma \in (0,1)$ such that $\forall H \in D_J$, $M(\beta F + (1-\beta)H) = M(\gamma G + (1-\gamma)H)$.

Hardy, Littlewood and Polya (1934) used a special case of Axiom 3 (they called it quasilinearity), stated as Property 4 below, for their quasilinear mean.

Property 4: Substitution (Quasilinearity)
$\forall F, G, H \in D_J$, if $M(F) = M(G)$, then $\forall \beta \in (0,1)$, $M(\beta F + (1-\beta)H) = M(\beta G + (1-\beta)H)$.

Starting with two distributions with the same mean value, quasilinearity or the substitution property requires that mixtures of these distributions with another distribution in the same proportions share the same mean, regardless of the distribution with which they are mixed. The weak substitution axiom allows the mixture proportions that give rise to the same mean value to be different.

The following property, called Ratio Consistency, is a consequence of Axioms 2 and 3. (In an earlier paper (Chew, 1979), the Ratio Consistency property was implicit in a stronger statement of Axiom 3. The weaker version is due to suggestions of Weber, Myerson, and Milgrom. For a discussion of an interesting geometrical interpretation of Ratio Consistency due to Weber, see Chapter 4, section 3.1.)

Property 5: Ratio Consistency
Suppose $F, G, H \in D_J$ and $\beta_1, \beta_2, \gamma_1, \gamma_2 \in (0,1)$ are such that $M(F) = M(G) \ne M(H)$, and $M(\beta_i F + (1-\beta_i)H) = M(\gamma_i G + (1-\gamma_i)H)$ for $i = 1, 2$. Then

\[ \frac{\beta_1/(1-\beta_1)}{\gamma_1/(1-\gamma_1)} = \frac{\beta_2/(1-\beta_2)}{\gamma_2/(1-\gamma_2)}. \]

Lemma 1.2: Axioms 2 and 3 imply Property 5.

Proof: Suppose that $M(H) < M(F) = M(G)$ without loss of generality.
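Axiom 3 $\Rightarrow$ $\exists f : (0,1) \to (0,1)$ such that $\forall \beta \in (0,1)$,

\[ M(\beta F + (1-\beta)H) = M(f(\beta)G + (1-f(\beta))H); \]

Lemma 1.1 $\Rightarrow$ $f$ is a strictly increasing function. This together with Axioms 3 and 4 implies that $f^{-1}$ exists. Therefore, $f$ and $f^{-1}$ are continuous functions, and hence differentiable a.e. Define $\tau : (0,1) \to R^+$ by

\[ \tau(\beta) = \frac{f(\beta)/(1-f(\beta))}{\beta/(1-\beta)}. \tag{1.1} \]

Note that $\tau$ is continuous and differentiable a.e. We show below that $\tau$ is a constant to complete the proof.

Consider $0 < \beta < \beta + \delta < 1$. It follows, after substituting $G$ for $F$ using Axiom 3, that

\[ M\big((\beta+\delta)F + (1-(\beta+\delta))H\big) = M\left( \frac{(\beta+\delta)\tau(\beta+\delta)}{(\beta+\delta)\tau(\beta+\delta) + 1-(\beta+\delta)}\, G + \frac{1-(\beta+\delta)}{(\beta+\delta)\tau(\beta+\delta) + 1-(\beta+\delta)}\, H \right). \tag{1.2} \]

But

\[ (\beta+\delta)F + (1-(\beta+\delta))H = \beta F + (1-\beta)\left[ \frac{\delta}{1-\beta}\, F + \frac{1-\beta-\delta}{1-\beta}\, H \right]. \]

Therefore

\[ \text{L.H.S. of (1.2)} = M\left( \beta F + (1-\beta)\left[ \frac{\delta}{1-\beta}\, F + \frac{1-\beta-\delta}{1-\beta}\, H \right] \right) \tag{1.3} \]

\[ = M\left( \frac{\beta\tau(\beta)}{\beta\tau(\beta)+1-\beta}\, G + \frac{1-\beta}{\beta\tau(\beta)+1-\beta}\left[ \frac{\delta}{1-\beta}\, F + \frac{1-\beta-\delta}{1-\beta}\, H \right] \right) \tag{1.4} \]

after substituting $G$ for $F$ using Axiom 3. Applying the same argument for the remaining $F$-component in expression (1.4), we obtain, with $\delta' = \delta/(\beta\tau(\beta)+1-\beta)$,

\[ \text{L.H.S. of (1.2)} = M\left( \frac{\delta\tau(\delta') + \beta\tau(\beta)}{\delta\tau(\delta') + \beta\tau(\beta) + 1-(\beta+\delta)}\, G + \frac{1-(\beta+\delta)}{\delta\tau(\delta') + \beta\tau(\beta) + 1-(\beta+\delta)}\, H \right). \tag{1.5} \]

Comparing expressions (1.2) and (1.5), it follows that

\[ (\beta+\delta)\tau(\beta+\delta) = \beta\tau(\beta) + \delta\,\tau\!\left( \frac{\delta}{\beta\tau(\beta)+1-\beta} \right). \tag{1.6} \]

Suppose without loss of generality that $\tau$ is differentiable at $\beta$. Then

\[ \lim_{\delta \to 0^+} \frac{(\beta+\delta)\tau(\beta+\delta) - \beta\tau(\beta)}{\delta} = \tau(\beta) + \beta\tau'(\beta) = \lim_{\delta \to 0^+} \tau\!\left( \frac{\delta}{\beta\tau(\beta)+1-\beta} \right). \tag{1.7} \]

Therefore, the right hand limit of $\tau$ at 0, denoted by $\tau(0^+)$, exists. Applying the same argument for other $\beta$'s for which $\tau$ is differentiable, we obtain

\[ \frac{d}{d\beta}\big[\beta\tau(\beta)\big] = \tau(0^+) \quad \text{a.e.} \tag{1.8} \]

Therefore $\tau(\beta) = \tau(0^+)$ a.e., and hence $\tau(\beta) = \tau(0^+)$ by continuity. Q.E.D.

Finally, we require our mean value to be a continuous functional in the sense of Axioms 4 and 5.

Axiom 4: Continuity
If $\{F_n\}_{n=1}^{\infty} \subset D_J$ converges in distribution to $F \in D_J$ and $F$ has compact support, then $M(F) = \lim_{n \to \infty} M(F_n)$.

Convergence in distribution has the following characterization, which is sometimes used as its definition.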
$F_n$ converges in distribution to $F \in D_J$ if and only if $\int_J f \, dF_n$ converges to $\int_J f \, dF$ $\forall f \in C(J)$, where $C(J)$ is the space of all bounded continuous functions on $J$. Note that when $J$ is unbounded, the arithmetic mean of $F_n$ does not necessarily converge to the arithmetic mean of $F$ since the function $x$ does not belong to $C(J)$. We impose the condition of compact support in Axiom 4 in order not to exclude the arithmetic mean from our class of mean values.

The requirement of Continuity is useful because it tells us that the mean of a distribution may be approximated by the mean of a different distribution that is close to it. When $J$ is a compact interval, Axiom 4 is equivalent to continuity of the mean value, $M$, with respect to the $L^1$-norm. When $J$ is unbounded, the following condition tells us how to estimate the mean value of a distribution $F$ without compact support (if it exists) by its restriction to a compact interval, $K$, denoted by $F_K$.

Axiom 5: Extension
Let $\{K_n\}_{n=1}^{\infty}$ be an increasing family of compact intervals such that $\lim K_n = J$; then $M(F) = \lim M(F_{K_n})$, $\forall F \in D_J$.

The mean value for a distribution $F$ without compact support is given by the limit of the mean values of the sequence of truncated distributions, $\{F_{K_n}\}_{n=1}^{\infty}$, for any increasing family, $\{K_n\}_{n=1}^{\infty}$, whose limit is $J$. Since the sequence of mean values, $\{M(F_{K_n})\}_{n=1}^{\infty}$, does not always converge, the mean value for a distribution without compact support need not exist. A good example is the arithmetic mean.

1.3 REPRESENTATION THEOREM

We begin with a statement of the quasilinear mean representation theorem. $D^0[A,B]$ denotes the restriction of $D[A,B]$ to simple distributions with a finite number of discontinuities.

Theorem 1.1: (Hardy, Littlewood & Polya) Suppose $\exists M : D^0[A,B] \to R$. Then $M$ satisfies Axiom 1, Property 3 and Property 4, if and only if $\exists \phi : [A,B] \to R$, continuous, strictly monotone, such that

\[ M(F) = \phi^{-1}\left( \int_A^B \phi \, dF \right), \quad \forall F \in D^0[A,B]. \tag{1.9} \]

Moreover, if $\exists \phi^* : [A,B] \to R$ such that

\[ M(F) = \phi^{*-1}\left( \int_A^B \phi^* \, dF \right), \quad \forall F \in D^0[A,B], \]

then $\forall x \in [A,B]$,

\[ \phi^*(x) = a\phi(x) + b, \quad \text{for some } a, b \text{ with } a \ne 0. \tag{1.10} \]

Proof: (Omitted since it is a special case of Theorem 1.2).

In other words, the most general certainty consistent, monotone and quasilinear functional of $F$ is that defined by (1.9). We shall call it the quasilinear mean. Since the quasilinear mean $M(F)$ is completely specified by a continuous, strictly increasing function $\phi$, up to an affine transformation, (1.10), it is convenient to write it as $M_\phi(F)$.

The following theorem generalizes the theorem of Hardy, Littlewood and Polya and extends their analysis from $D^0[A,B]$ to $D[A,B]$. A further extension to $D_J$ for arbitrary $J$ follows later.

Theorem 1.2: Suppose $\exists M : D[A,B] \to R$. Then $M$ satisfies Axiom 1, Axiom 2, Axiom 3 and Axiom 4 if and only if $\exists \phi : [A,B] \to R$, continuous, strictly monotone, and $\alpha : [A,B] \to R^+$, continuous, strictly positive, such that

\[ M(F) = \phi^{-1}\left( \int_A^B \alpha\phi \, dF \Big/ \int_A^B \alpha \, dF \right), \quad \forall F \in D[A,B]. \tag{1.11} \]

Moreover, if $\exists \phi^* : [A,B] \to R$ and $\alpha^* : [A,B] \to R^+$ such that

\[ M(F) = \phi^{*-1}\left( \int_A^B \alpha^*\phi^* \, dF \Big/ \int_A^B \alpha^* \, dF \right), \quad \forall F \in D[A,B], \]

then $\forall x \in [A,B]$,

\[ \phi^*(x) = a\,\frac{k(\phi(x)-\phi(A))}{k(\phi(x)-\phi(A)) + (\phi(B)-\phi(x))} + b \tag{1.12} \]

and

\[ \alpha^*(x) = c\,\alpha(x)\,\{k(\phi(x)-\phi(A)) + (\phi(B)-\phi(x))\}, \tag{1.13} \]

for some $a, b, c, k$ with $a, c \ne 0$, $k > 0$.

Proof: (Necessity) Axiom 1 follows immediately. Axiom 2 follows from the observation that

\[ \phi(M(\beta F + (1-\beta)G)) = \frac{\beta\left(\int_A^B \alpha \, dF\right)\phi(M(F)) + (1-\beta)\left(\int_A^B \alpha \, dG\right)\phi(M(G))}{\beta\left(\int_A^B \alpha \, dF\right) + (1-\beta)\left(\int_A^B \alpha \, dG\right)} \]

increases strictly in $\beta$ when $M(F) > M(G)$.

Consider $F, G, H \in D[A,B]$. Suppose $M(F) = M(G)$. Then

\[ \phi(M(\beta F + (1-\beta)H)) = \frac{\beta\left(\int_A^B \alpha \, dF\right)\phi(M(F)) + (1-\beta)\left(\int_A^B \alpha \, dH\right)\phi(M(H))}{\beta\left(\int_A^B \alpha \, dF\right) + (1-\beta)\left(\int_A^B \alpha \, dH\right)} = \phi(M(\gamma G + (1-\gamma)H)) \]

with

\[ \gamma = \frac{\beta \int_A^B \alpha \, dF}{\beta \int_A^B \alpha \, dF + (1-\beta)\int_A^B \alpha \, dG}, \quad \forall \beta \in (0,1). \]

Hence, Axiom 3.

Let $\{F_n\}_{n=1}^{\infty}$ converge to $F$. Since $\alpha$, $\phi$ and $\phi^{-1}$ are continuous on a compact interval $[A,B]$,

\[ \int_A^B \alpha \, dF_n \to \int_A^B \alpha \, dF \quad \text{and} \quad \int_A^B \alpha\phi \, dF_n \to \int_A^B \alpha\phi \, dF \]

\[ \Rightarrow \int_A^B \alpha\phi \, dF_n \Big/ \int_A^B \alpha \, dF_n \to \int_A^B \alpha\phi \, dF \Big/ \int_A^B \alpha \, dF. \]
-1 fB fB -1 rB ,B =» M(Fn) = cj> (/ a* dFn/JAa dFn) •+ <j, (J^acj) dF//Aa dF) = M(F) . Axiom 4 follows. (Sufficiency) Define ^ : [0,l] + [A,B] as follows. Kp) = M(Sp), VP e [0,1], where Sp = pSg + (l-p)6A. (1.14) Axiom 1 =* ^CO) = A and 4^(1) = B. Axiom 2 =* IJJ is strictly increasing. Let (Pn^-i converge to P- Then Sp^ converges in distribution to S . Axiom 4 =» Kp) = M(Sp) = Lim M(Spn) = Lim * (pn) . It follows that i> is continuous and strictly increasing and there fore has an inverse a) : [A.B] -»- [0,l] which is continuous and strictly increasing. If x = »Kp) > then p = <j>(x) and M(6x) = x = Kp) = M(S ) = M(S~(x)). (1.15) Lemma 1.2 implies the existence of a strictly positive constant that depends on x such that, VH e D[A,B], V 6 e ['0,l], M(36x + (l-B)H) = MCg^.gj S~ (x)+ H) . (1.16) 17 Construct a: (A,B) -> (0,°°) by assigning a(x) = x , V x e (A,B). The following argument establishes the continuity of a on (A,B) and then extends its domain to include the end-points. Consider g(x) = H(h& +h&J = M( "C>0 S2r , + ? ,iA) x AJ ^a(x)+l <|>(x) a(x)+l AJ M(S {a(x)Kx)/5(x)+l}) = Ka(x)Kx)/a(x)+l) Let txn} "^ converge to x e (A,B). Then ^^xn+^A converges i-n distri bution to %& + %6A. Axiom 4 implies that, g(x) = m&x + %6A) = Lim M(%6Xn + h&p) = Lim g (xn) . Therefore, g is continuous in (A,B). It follows that a is continuous in (A,B). Let (xn} "^ converge to B from below, then, Uh) = MCS^) = Lim H(h6Xn + h&A) = Lim g(xn) = Lim *($(xn)/(1 + 1/5(xn))) . => Lim a(x ) = 1, since i(B) = 1. Similarly, we can show that Lim a(*n) = 1 as x^ converges to A from above. We extend a to [A,B] continuously by assigning 1 to a at the end-points. Now, we are ready to show that, V F e D [A,B], the functions a and <f> satisfy condition (1.11). n °r Let {xi^_i be the support of a distribution F in D [A,B], and n represent F in the form, F = £ ei<5x. » wnere 6. = F(x^)-F(x. ). i=l 1 n M(F) = M( I 6.6x ) i=l i 6l°(xi) " 9j = M(oia(x1)+ze. 
s *(Xl) + i£2e15Cx1)+Eej 5xi) • after substituting S^x ^ for 6^ using expression (.1.16). Repeating (n-1) times on the remaining 6X^, i=2,...,n, yields, ~-l ,B . ,-B = ^(Ee.a(x )<j>(x )/E6.a(x.)) = <j, A(/ acJ>dF// adF) . : 3 A A Finally, we extend our construction to F e D[A,B]. Suppose F E D[A,B] - D°[A,B]. Construct the following sequence {F } °° in D°[A,B], n n=l V FCAJ«A . fiFCA* F(A. Ii^l„5A+i(B.A)/2n By construction, Fn(x) F(x), Vx e {A+^~I : i,n e I+, i < 2n} which is dense in [A,B]. Therefore {F^} °° converges to F. n=l Axiom 4 =» M(F) = Lim M(F ) ~ 1 B B = Lim <j> (/ 54 dF // 5 dF ) n-x=° Y VJA Y n JA ny — 1 B B = $ (/ S$ dF// a dF), A A since c]>,<f> and a are continuous on [A,B], 19 (Uniqueness) Suppose 3 a : [A.B-] -»• R+and cf> : [A,B] -»- R, that satisfy condition (1.11) Then x = M(6x) = M(S-(x)) = -1ra(B)c^(B)|(x) + a(A)^>(A)(l-$(x)) " * 1 a(B)$(x) + a(A)(l-$(x)) J ' U'1/J and V B e (0,1), -1 Ba(x)(Kx) + (l-B)a(A)<|>(A) , * L Ba(x) + (l-B)a(A) J = M(B6x + (1-B)6A) = Mr + Cl-B) , UlBa(x) + (l-B) <Kx) Ba(x) + (1-B) °A -» -1 B5(x)»(x)a(B)(l>(B) + (Ba(x)(l-»(X))-f(l-g))a(A)(|,(A) * 1 B5(x)$(x)a(B) + (B5(x)(l-<|)(x)) + (l-B))a(A) J' (1.18) after applying a and $ to the equalities (1.15) and (1.16). Let a = <j>(B) - <j>(A), b = <KA), c = a (A) , and k = ct(B)/ct(A). It is straightforward to check that, Vx e [A,B ], <j>(x) = a{k$(x)/(k*(x) + (l-$(x)))} + b, and a(x) = ca(x){k$(x)+(l-$(x))}. Suppose a* and cj>* are another pair of functions that satisfy condi tion (1.11). Then (J>*(x) = -a*{k**(.x)/(k*$ (x) + Q-* (x))) } + b* , 20 and a *(x) = c*o(xHk*$(x) + (l-$(x))}, with a* = t}.*(B) - <)>*(A), b* = <f>*(A), c* = a*(A), and k* = a*(B)/a*(A). Finally, we check that, Vx e [A,B], 4>*(x) = a'{k-(Hx)-KA))/(k-(<Kx)-KA)MKB)-Kx)))}+b' , a*(x) = c'a(x){k'(<Kx)-<KA)) + (<£ (B) -<J> (x)) }, Our generalization of the quasilinear mean, defined by (1.1-1), is completely specified by a pair of functions (ct,£>) . 
According to Theorem 1.2, this is the most general mean for distributions defined on a compact interval that satisfies Consistency with Certainty, Betweenness, Weak Substitution and Continuity. In keeping with precedent, we denote our generalized mean by M^. The pair of functions (a,<j)) denotes a particular member of the class, {a,<)>},, of functions that yield the same mean on DT. When J is a compact interval, such as [A,B] in Theorem 1.2, we can form the follow ing subclass of {a,<j)}T, called k-ratio subclass, for k > 0. for a' = (cf>*(B)-<}>*(A))/a = a*/a, b' = <j>*(A) = b*. c' = a*(A)/{a(A)(<HB)-<j>(A))} = c*/ca, and k' = {a*(B)a(A)/a*(A)a(B)} = k*/k. Q.E.D. [A,B] = { (a,<j>) e {a, tj)} [A,B] : a(B)/a(A) = k) . 21 k ,k k rk, We denote by (a ,<f> ) a generic element and (ci , cj> ) [A,B] the oanoniaal element of {a,<J>} [A,B] that satisfies: \ (A)=0, $(B)=1, a (A)=l and aK(B)=k. It can be shown, using expressions (1.12) and (1.13), that the elements of a k-ratio subclass are related to each other via an affine transformation k k for the (j> component and a scalar transformation for the a component. The class {ct.cbjj-^ can be obtained from its k-ratio subclasses by taking their union over all positive k's. The following corollaries of Theorem 1.2 are needed to extend our results to include noncompact intervals. The restriction of {a,<f>}r. D -i LAi, to the interval [An,Bn] is denoted by {a,4)}r. „ -. r D . U U LA1 >Bl-I LAQ' o Corollary 1.1: Let Aj < AQ < BQ < Bj. Then (a, aS} [AJ.BJ] U, {a, aS}. [A0,B0] = KEC!?01'H01> K.Bj ' o*"o-where 01 a|(Bn) ^(Bp) a^lAoT FfAoT (1.19) and •01 q^Bp) (1 - $*(Bn)) ^(AQ) (1 - $KA0)) (1.20) Proof: Denote by (o^1,^1) an element of {ct^}.-^1 n -, LAi, v>\J ki ki Observe that (ai1,^!1) [A0,B0] E {a^}[A°,B0] where k _"l|(Bo) _ &i1 (Bp) (M11 (Bp ) * (1-ji. 
1 (Bn))) ( , 0 " ^P(A0) " a1l(A0)(k1$1l(A0) + (1-$11(A0)))' U-'iJ which is a continuous, strictly increasing and onto function of k 22 with domain (0,°°) and range (h ,h ) "01 01 Corollary 1.2: Let A2 < Ai < A0 < B0 < Bi < B2 1 I 1-Q.E.D. Denote by (cL1,^1) the canonical element of the unitary subclass-for the interval [A..B.1, i=l,2. ii Then where and bo2<h01 and h02 > h01 , 01 a.^Ao) ^(AQ) ' h0i = gJ^BoHl - ^iHBn)) d.UA0)(l - $.UA0)) ' (1.22) for i = 1,2, Proof: Construct the functions £. . : (0,°°) -> (h..,h..), for i < j, 3 >1 -ij ij via k. = K. • (k.) i ].i 3 a.HB.JCk.^.lCB.) + (1-ijL 1 (B.))) a^CA.JCk.^.UA.) + (l-^.^A.))) Note that, by construction, (1.23) k. {a'*}[A.,B.] 3 3 [A.,B.] [A.,B.] Note also that is continuous, strictly increasing, and onto from (0,°°) to (h. . ,h. .) . -1J ir Suppose h02 > h0i. Pick kj_ e [h12,°°). Then Cio^i) < h01 < h02. =* 3 k2 e (0,°°) such that t;20(k2) = £,l0{k1) . 23 But {a'*}[A2,B2] [A0,B0] [Ai.Bi] [Ai.Bi] [A0,B0] [A0, B0] A similar argument establishes h01 < h02. Q.E.D. According to Corollary 1.1, we have to restrict the kQ-ratio correspond ing to the [AO,BQ] interval to within a range of values if we want (co0'*!)0) to agree with (cxi1,^1) restricted to [AQ.BQ] for a larger interval [AX,BI]. Corollary 1.2 tells us that the range of permissible kg-ratio's gets squeezed as we go from [Ai,Bi] to a larger interval [A2,B2]. NOW we extend Theorem 1.2 to the case of arbitrary interval J. Theorem 1.3: Suppose 3 M : D R. Then M satisfies Axiom 1, Axiom 2 , Axiom 3 , Axiom 4 and Axiom 5 if and only if 3 <J> : J -»• R, continuous, strictly monotone, and a : J -*• R+, continuous, strictly positive, such that3M(F) = f1 (jja<|>dF/JpdF) , V F e Dj. (1.24) Moreover, if (a*,<j>*) is another pair of functions that satisfies 3 The ratio fj a^dF/fj adF for F without compact support is defined by expression (1.25) . 24 condition (1.24), then V interval [ A,B] C J, 3 a, b, c, k with a, c * 0 and k > 0 a V x e [A,B] , Avrfx! 
= a k(d)(x) - 4>(A)) * 1 j d k(Kx) - cj) (A)) + (cf,(B) - <j>(x)) b' a*(x) = ca(x){k(<})(x) - cf>(A)) + (<j> (B) - <|>(x))}. Proof: (Necessity) Verification of Axiom 1, Axiom 2 , Axiom 3 , and Axiom 4 is the same as in proof of Theorem 1.2. Axiom 5 follows trivi ally from the definition (expression (1.25)) of M(F) for F without compact support. (Sufficiency) If J is compact, then we are done. Otherwise, let OO j- _ CO {K } = {[A ,B J} be a sequence of intervals such that n n=0 n n n=0 00 CO {An}n_Q '-'lBn^n-o-' :""s a stri-ctly decreasing (increasing) sequence, and Lim{Kn}= J. Corollary 1.1 says that, {ai^i}tAi5B.]|[Ao,B0]= k/lhoi,ho.){aO^O>[;0)Bo] f a.^Bo) (l-^^CBo)) a^CBo) $^(60) where (h . ,h .) = ( , ) • '01 01 a.^Ao) (l-^.'fAo)) a.^Ao) $il(A0) Corollary 1.2 says that, {hQ^} 00 is strictly increasing and {h0.} °° is strictly i=l i=l decreasing. 25 Let C. = [(h .+h . J/2 , (h .+h . J/2 ] , 1 1 -01 -0 i+l Oi 0 l+l and D. = (h0 .+1 , h0 . Then Di ^ C. ^ (hoi>noi) for i = 1,2,3,... . Observe that (h^jh^J, C^,D^ are strictly decreasing sequences of sets by inclusion. Since C\ is compact for each i, therefore, Lim C. = C t <t> . Since Di C. C. 5 (h . ,h J V i it follows that, 1 i l ^ -oi 01 Lim D. c C C Lim (h..,h.J. But Lim D. = Lim (h„.,h„J. Hence Lim (h . ,h\.) = C . -Oi Oi 00 To construct (a,<j>) defined on J that satisfies condition (1.24), pick k0 e CM . Define (a(x),<Kx)) = (aj°(x) , <J»J°(x)) for x e [A0,B0], = (a^fx) , ^(x)) for x e [A^BjJ - [A0,B„], Ca^Cx) , ^(x)) for x e [A. ,B.] - [A.^.B. such that ^iO^i-1 = ko=50°(Bo)> (see expression (1.23)) , 4i(A0) = l = a5°(A0), ^i(A0) = 0 = $5°(A0), 26 <f>*i(B0) = 1 = $^0(Bo) Observe that (a1^1!1 , d^it1) v l+l ' vi + l -* agree at A0 and at B0. k • k • [A.,B.] E '•"i1 ' ^i1^ since they Given any distribution F with compact support, pick [A^,B^] such that Supp(F) C [A ,Bi]. Then M(F) = «j>ki 1 (J®1^1^1 dF/J^a^dF) , i i = f\jja<i> dF//jCt dF) . 
For any distribution $F \in D_J$, if $\mathrm{Supp}(F)$ is not compact, then we obtain $M(F)$, if it exists, from Axiom 5 as follows. Let $\{K_n\}_{n=0}^{\infty}$ be an increasing sequence of compact intervals whose limit is equal to $J$. Denote by $F_{K_n}$ the restriction of $F$ to $K_n$. Then Axiom 5 $\Rightarrow$
$$M(F) = \lim M(F_{K_n}) = \lim \phi^{-1}\Big(\int_{K_n} \alpha\phi\,dF_{K_n} \Big/ \int_{K_n} \alpha\,dF_{K_n}\Big). \tag{1.25}$$
When the limit (1.25) exists and does not depend on the choice of the sequence $\{K_n\}_{n=0}^{\infty}$, it is denoted by $\phi^{-1}(\int_J \alpha\phi\,dF / \int_J \alpha\,dF)$. However, the above limit does not always exist. An example is given by the arithmetic mean for a Cauchy distribution.

(Uniqueness) This follows directly from applying Theorem 1.2 to arbitrary intervals $[A,B]$ in $J$. Q.E.D.

We have characterized the class of mean values for distributions on the real line having the properties of Consistency with Certainty, Betweenness, Weak Substitution, Continuity and Extension with a pair of functions $(\alpha,\phi)$. For distributions without compact supports, their corresponding means do not necessarily exist (see (1.25)). A necessary and sufficient condition that ensures existence is given below.

Corollary 1.3: $M_{\alpha\phi}(F)$ exists $\forall F \in D_J$ if and only if either $\phi$ is bounded or $\alpha\cdot\phi$ is bounded.

Proof: The sufficiency part of the proof is straightforward. To prove necessity, suppose that for the pair $(\alpha,\phi)$, neither $\phi$ nor $\alpha\cdot\phi$ is bounded. We may assume, without losing generality, that $\phi$ is not bounded from above. There are two cases. As $x$ tends to $+\infty$, either i) $\alpha(x)$ is bounded from above, or ii) $\alpha(x)$ tends to $+\infty$.

Case i): Consider a sequence $\{x_i\}_{i=1}^{\infty}$ such that $\alpha(x_i)\phi(x_i) = 2^i$. Then
$$\phi\Big(M_{\alpha\phi}\Big(\sum_{i=1}^{\infty} \frac{1}{2^i}\,\delta_{x_i}\Big)\Big) = \lim_{m\to\infty} \frac{m}{\sum_{i=1}^{m} \alpha(x_i)/2^i}$$
does not converge.

Case ii): Consider a sequence $\{x_i\}_{i=1}^{\infty}$ such that $\phi(x_i) = 2^i$. Then
$$\phi\Big(M_{\alpha\phi}\Big(\sum_{i=1}^{\infty} \frac{1}{2^i}\,\delta_{x_i}\Big)\Big) = \lim_{m\to\infty} \frac{\sum_{i=1}^{m} \alpha(x_i)}{\sum_{i=1}^{m} \alpha(x_i)/2^i}$$
does not converge.

A similar argument establishes the result for the case when $\phi$ is unbounded from below. Q.E.D.
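The representation (1.11) and the truncation limit (1.25) can be tried out numerically on discrete distributions. The following is a minimal sketch and not part of the original text: the names `m_alpha_phi` and `truncated_ratio`, and the particular choices of $\alpha$ and $\phi$, are assumptions made for illustration. The second function evaluates the ratio inside (1.25) for $F = \sum_i 2^{-i}\delta_{x_i}$ with $x_i = 2^i$, truncated to its first $m$ atoms.

```python
import math

def m_alpha_phi(xs, ps, alpha, phi, phi_inv):
    """Generalized mean phi^{-1}( sum p*alpha*phi / sum p*alpha ) of a
    finite discrete distribution with atoms xs and probabilities ps."""
    num = sum(p * alpha(x) * phi(x) for x, p in zip(xs, ps))
    den = sum(p * alpha(x) for x, p in zip(xs, ps))
    return phi_inv(num / den)

def truncated_ratio(m, alpha, phi):
    """Ratio in (1.25) for F = sum_i 2^{-i} delta_{x_i}, x_i = 2^i,
    restricted to the first m atoms. Renormalizing the truncated
    probabilities is unnecessary: the factor cancels in the ratio."""
    xs = [2.0 ** i for i in range(1, m + 1)]
    ps = [2.0 ** -i for i in range(1, m + 1)]
    num = sum(p * alpha(x) * phi(x) for x, p in zip(xs, ps))
    den = sum(p * alpha(x) for x, p in zip(xs, ps))
    return num / den

def identity(x):
    return x

def one(x):
    return 1.0

# Constant alpha recovers quasilinear means (arithmetic, geometric).
arithmetic = m_alpha_phi([1.0, 3.0], [0.5, 0.5], one, identity, identity)
geometric = m_alpha_phi([1.0, 4.0], [0.5, 0.5], one, math.log, math.exp)
# alpha(x) = x shifts weight toward the larger outcome.
weighted = m_alpha_phi([1.0, 3.0], [0.5, 0.5], identity, identity, identity)
# With alpha bounded and phi(x) = x unbounded, the truncated ratios keep growing.
ratios = [truncated_ratio(m, one, identity) for m in (5, 10, 20, 40)]
```

With constant $\alpha$ and $\phi(x) = x$, each truncated ratio equals $m/(1 - 2^{-m})$, which grows without bound, in line with case i) of the proof above.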
The above corollary is useful in Chapter 4 when we interpret mean value as the certainty equivalent of a lottery and insist that a certainty equivalent should always be finite.

1.4 PROPERTIES OF THE $M_{\alpha\phi}$ MEAN

Of possible properties for mean value, the Intermediate Value property (Property 1) enjoys a rather special status, somewhat like a defining property. After all, even measures such as median and mode, which are rejects of the quasilinear mean, exhibit this property. The conclusion that $M_{\alpha\phi}$ has the intermediate value property follows from the observation that
$$M_{\alpha\phi}(F) = c \iff \frac{\int_J \alpha(x)(\phi(x) - \phi(c))\,dF(x)}{\int_J \alpha(x)\,dF(x)} = 0.$$
Hence,

Corollary 1.4: $M_{\alpha\phi}$ satisfies Property 1 (Intermediate Value property).

Consistency with strict stochastic dominance '$\overset{1}{>}$' (Property 3: Monotonicity) is deemed desirable for many applications of mean value. The corollary below gives the condition under which $M_{\alpha\phi}$ is consistent with stochastic dominance (nonstrict)$^4$ '$\overset{1}{\ge}$'.

$^4$ The partial order '$\overset{1}{\ge}$' is defined by $G \overset{1}{\ge} F$ if $G(x) \le F(x)$, $\forall x \in J$. The stronger partial order '$\overset{1}{>}$' defined earlier (Property 3) is the above with strict inequality for some $x$.

Corollary 1.5: Suppose $\alpha$ and $\phi$ are both bounded on $J$. Then $\forall F, G \in D_J$, $F \overset{1}{\ge} G \Rightarrow M_{\alpha\phi}(F) \ge M_{\alpha\phi}(G)$ if and only if, $\forall s \in J$,
$$\alpha(x)(\phi(x) - \phi(s)) \tag{1.26}$$
is a nondecreasing (nonincreasing) function of $x$.

Proof: We shall assume without loss of generality that $\phi$ is strictly increasing.

(Sufficiency) Suppose $G \overset{1}{\ge} F$. Then $F_{\theta'} \overset{1}{\ge} F_\theta$ whenever $\theta' > \theta$, where
$$F_\theta = (1-\theta)F + \theta G, \quad \forall \theta \in (0,1). \tag{1.27}$$
Define
$$\zeta(x;F) = \Big\{\alpha(x) \Big/ \int_J \alpha\,dF\Big\}\{\phi(x) - \Omega(F)\}, \tag{1.28}$$
where $\Omega(F) = \int_J \alpha\phi\,dF / \int_J \alpha\,dF$. Then
$$\frac{d}{d\theta}\Omega(F_\theta) = \int_J \zeta(x;F_\theta)\,d(G(x) - F(x)) \tag{1.29}$$
$$= \int_J (F(x) - G(x))\,d\zeta(x;F_\theta) \ge 0,$$
where the second line follows on integrating by parts, and the inequality holds since the integrand $F(x) - G(x)$ is nonnegative and $\zeta(\cdot;F)$ is nondecreasing $\forall F \in D_J$. Hence
$$\Omega(G) - \Omega(F) = \int_0^1 \Big\{\int_J (F(x) - G(x))\,d\zeta(x;F_\theta)\Big\}\,d\theta \ge 0 \ \Rightarrow\ M_{\alpha\phi}(G) \ge M_{\alpha\phi}(F).$$

(Necessity) Suppose $\alpha(x)(\phi(x) - \phi(s^*))$ is strictly decreasing at some $x^*$ for some $s^* \in \mathrm{int}\,J$. Since $\alpha(x)(\phi(x) - \phi(s))$ is a continuous function, it is strictly decreasing on some open neighbourhood $(x^* - \epsilon, x^* + \epsilon)$. Assume without loss of generality that $s^* > x^*$. Pick any $y^* > s^*$ and compute $p^*$ such that
$$s^* = M_{\alpha\phi}(p^*\delta_{y^*} + (1-p^*)\delta_{x''}) = M_{\alpha\phi}(F^*), \quad \text{where } x'' = x^* - \tfrac12\epsilon.$$
Consider $G^* = p^*\delta_{y^*} + (1-p^*)\delta_{x'}$, for some $x' \in [x^*, x^* + \epsilon)$. Compute
$$\int_J (F^*(x) - G^*(x))\,d\zeta(x;F^*) = (1-p^*)\big(\zeta(x';F^*) - \zeta(x'';F^*)\big) < 0.$$
But $M_{\alpha\phi}((1-\theta)F^* + \theta G^*) = M_{\alpha\phi}(F^*_\theta)$ is nondecreasing in $\theta$
$$\Rightarrow \frac{d}{d\theta}\Omega(F^*_\theta) = \int_J (F^*(x) - G^*(x))\,d\zeta(x;F^*_\theta) \ge 0.$$
Since the R.H.S. is continuous, its limit as $\theta$ approaches 0 from above is nonnegative, which is a contradiction. The extension to possible end-points of $J$ follows from the continuity and boundedness of $\alpha$ and $\phi$. Q.E.D.

The function $\zeta(x;F)$ can be used to generate the following linear functional,
$$\zeta^*_F(\cdot) = \int_J \zeta(x;F)\,d(\cdot)(x).$$
Observe that expression (1.29) is the Gateaux differential of $\Omega$ at $F_\theta$ in the direction $G - F$, which may be written as
$$\frac{d}{d\theta}\Omega(F_\theta) = \zeta^*_{F_\theta}(G - F).$$
The functional $\zeta^*_F(\cdot)$ and the function $\zeta(\cdot;F)$ are both referred to as the Gateaux derivative of $\Omega$ at $F$. We now interpret condition (1.26) as follows. The Gateaux derivative of $\Omega$ at $F$, $\zeta(\cdot;F)$, is nondecreasing for every $F$ in $D_J$. This generalizes the corresponding condition for the quasilinear mean if we observe that the Gateaux derivative of $(\phi \circ M_\phi)$ at $F$ is simply $\phi$, which is strictly increasing irrespective of $F$.

Another useful partial order is second degree stochastic dominance '$\overset{2}{\ge}$', defined by
$$G \overset{2}{\ge} F \ \text{ if } \int_{J^x} (G(y) - F(y))\,dy \le 0,\ \forall x \in J, \ \text{ and } \int_J (G(y) - F(y))\,dy = 0, \quad \text{where } J^x = \{y \in J : y \le x\}. \tag{1.30}$$
The above says that $G$ dominates $F$ in the second degree if they have the same arithmetic mean (if they exist) and the arithmetic mean of $G$ truncated by $J^x$ is not less than that of $F$ truncated by $J^x$ for every $x$ in $J$. This is equivalent to the notion of mean preserving spread (Rothschild & Stiglitz, 1970) in uncertainty economics, and the principle of transfer (Dalton, 1920), which states that a society's welfare is not diminished by a transfer of wealth from the rich to the poor. The quasilinear mean $M_\phi$ is known to be consistent with second degree stochastic dominance when $\phi$ is increasing and concave or decreasing and convex. Having noted the similarity between $\phi$ and $\zeta(\cdot;F)$ in deriving consistency conditions for first degree stochastic dominance, we entertain the conjecture that the corresponding second degree condition for $M_{\alpha\phi}$ is that $\zeta(\cdot;F)$ is concave (convex) if $\phi$ is increasing (decreasing) for every $F$ in $D_J$. The verification of this conjecture is contained as a special case of a more general result developed in the next paragraph.

We begin with the following definition of kth degree stochastic dominance: $G \overset{k}{\ge} F$ if
$$\int_J \Big\{\int_{J^{z_{n-1}}} \Big\{\cdots \Big\{\int_{J^{z_3}} \Big\{\int_{J^{z_2}} (G(z_1) - F(z_1))\,dz_1\Big\}dz_2\Big\}dz_3 \cdots\Big\}dz_{n-2}\Big\}dz_{n-1} = 0, \quad \text{for } n = 2, \ldots, k,$$
and
$$\int_{J^{z_k}} \Big\{\int_{J^{z_{k-1}}} \Big\{\cdots \text{(as above)} \cdots\Big\}dz_{k-2}\Big\}dz_{k-1} \le 0, \quad \forall z_k \in J.$$
When the nth moment about the origin exists for distributions $F$ and $G$ for $n = 1, \ldots, k$, then $G$ dominates $F$ in the kth degree if their nth moments agree for $n = 1, \ldots, k$, and the kth moment about the origin of $G$ truncated by $J^{z_k}$ is not less (greater) than that of $F$ truncated at $J^{z_k}$ if $k$ is odd (even) for every $z_k$ in $J$. The following corollary gives conditions on $\alpha$ and $\phi$ for consistency of $M_{\alpha\phi}$ with kth degree stochastic dominance.

Corollary 1.6: Suppose $\alpha, \alpha', \alpha'', \ldots, \alpha^{(k-1)}$ and $\phi, \phi', \phi'', \ldots, \phi^{(k-1)}$ are continuous and bounded on $J$. Then $\forall F, G \in D_J$, $G \overset{k}{\ge} F \Rightarrow M_{\alpha\phi}(G) \ge M_{\alpha\phi}(F)$ if and only if, $\forall F \in D_J$, $\zeta^{(k-1)}(x;F)$ is a nondecreasing (nonincreasing) function if $\phi$ is increasing (decreasing) when $k$ is odd, or is a nonincreasing (nondecreasing) function if $\phi$ is increasing (decreasing) when $k$ is even.

Proof: Assume without loss of generality that $\phi$ is increasing and $k$ is even.

(Sufficiency) Suppose $G \overset{k}{\ge} F$. Then $F_{\theta'} \overset{k}{\ge} F_\theta$ whenever $\theta' > \theta$, where $F_\theta \equiv (1-\theta)F + \theta G$, for $\theta \in (0,1)$. Then
$$\frac{d}{d\theta}\Omega(F_\theta) = \int_J \zeta(x;F_\theta)\,d(G(x) - F(x)) = (-1)^k \int_J I_k(x)\,d\zeta^{(k-1)}(x;F_\theta) \ge 0,$$
where $I_k$ is the k-time iterated integral of $(G(x) - F(x))$ on the interval $J$ appearing in the definition of kth degree stochastic dominance (see also expression (1.30)), since $I_k$ is nonpositive and $\zeta^{(k-1)}$ is nondecreasing (nonincreasing) for $k$ odd (even) $\forall F \in D_J$. It follows that
$$\Omega(G) - \Omega(F) = \int_0^1 \Big\{(-1)^k \int_J I_k(x)\,d\zeta^{(k-1)}(x;F_\theta)\Big\}\,d\theta \ge 0 \ \Rightarrow\ M_{\alpha\phi}(G) \ge M_{\alpha\phi}(F).$$

(Necessity) This follows from an argument that is essentially the same as the one used in the necessity proof of Corollary 1.5. Q.E.D.

We end this section by offering a link between $M_{\alpha\phi}$ and $M_\phi$ that leads to a useful condition under which certain known inequalities for $M_\phi$ can be extended to $M_{\alpha\phi}$. We derive from a distribution $F$, through the function $\alpha$, another distribution $F^\alpha$,
$$F^\alpha(x) = \int_{J^x} \alpha\,dF \Big/ \int_J \alpha\,dF, \quad \text{for every } x \in J, \tag{1.31}$$
if the denominator exists. In this case, $M_{\alpha\phi}$ and $M_\phi$ are related in the following manner:
$$M_{\alpha\phi}(F) = M_\phi(F^\alpha).$$
This leads immediately to:

Lemma 1.3: Suppose $M_\chi(F) \ge M_\phi(F)$ $\forall F \in V \subset D_J$. Then, if $F^\alpha \in V$ whenever $F$ does, $M_{\alpha\chi}(F) \ge M_{\alpha\phi}(F)$ $\forall F \in V$.

One use of the above is the extension of the result,

If $r \ge s$ then $M_r(F) \ge M_s(F)$ for every $F \in D_{(0,\infty)}$, where $M_r(F) = M_{x^r}(F) = \{\int_0^\infty x^r\,dF\}^{1/r}$,

to $M_{\alpha,r}(F) = \{\int_0^\infty \alpha(x)x^r\,dF(x) / \int_0^\infty \alpha(x)\,dF(x)\}^{1/r}$. The function $\alpha$ has the standard measure-theoretic interpretation as a Radon-Nikodym derivative of $F^\alpha$ with respect to $F$. We may, on the other hand, consider $F^\alpha$ as an 'integral' of $F$ through the function $\alpha$.

Can we define $F^\alpha$ even when $\int_J \alpha\,dF$ does not exist? Our definition of $M_{\alpha\phi}(F)$ when $F$ does not have compact support (expression (1.25)) suggests the following. Let $\{K_n\}_{n=1}^{\infty}$ be an increasing family of compact intervals whose limit is $J$. Then
$$\int_J f\,dF^\alpha = \lim \int_{K_n} \alpha f\,dF \Big/ \int_{K_n} \alpha\,dF, \quad \text{for every } f \in C^0(J),$$
where $C^0(J)$ denotes the space of continuous functions on $J$.
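For a discrete distribution, the identity $M_{\alpha\phi}(F) = M_\phi(F^\alpha)$ and the extension of the power mean inequality described above can be checked directly. This is a minimal sketch under stated assumptions: the function names, the three-point distribution and the choice $\alpha(x) = 1/x$ are hypothetical and serve only as an illustration of (1.31) and Lemma 1.3.

```python
import math

def tilt(xs, ps, alpha):
    """Probabilities of F^alpha from (1.31): p_i*alpha(x_i) / sum_j p_j*alpha(x_j)."""
    w = [p * alpha(x) for x, p in zip(xs, ps)]
    total = sum(w)
    return [wi / total for wi in w]

def quasilinear_mean(xs, ps, phi, phi_inv):
    """M_phi(F) = phi^{-1}( sum_i p_i * phi(x_i) )."""
    return phi_inv(sum(p * phi(x) for x, p in zip(xs, ps)))

def generalized_mean(xs, ps, alpha, phi, phi_inv):
    """M_{alpha phi}(F) = phi^{-1}( sum p*alpha*phi / sum p*alpha )."""
    num = sum(p * alpha(x) * phi(x) for x, p in zip(xs, ps))
    den = sum(p * alpha(x) for x, p in zip(xs, ps))
    return phi_inv(num / den)

def power_mean_alpha(xs, ps, alpha, r):
    """M_{alpha,r}(F) with phi(x) = x**r, for r != 0 and positive outcomes."""
    return generalized_mean(xs, ps, alpha, lambda x: x ** r, lambda y: y ** (1.0 / r))

xs = [1.0, 2.0, 4.0]          # hypothetical positive outcomes
ps = [0.2, 0.5, 0.3]          # hypothetical probabilities
alpha = lambda x: 1.0 / x     # hypothetical strictly positive weight function

lhs = generalized_mean(xs, ps, alpha, math.log, math.exp)
rhs = quasilinear_mean(xs, tilt(xs, ps, alpha), math.log, math.exp)
m1 = power_mean_alpha(xs, ps, alpha, 1.0)
m2 = power_mean_alpha(xs, ps, alpha, 2.0)
```

Here `lhs` and `rhs` agree up to rounding, and `m2` is at least `m1`, as the extended inequality requires for $r = 2 \ge s = 1$.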
We have defined $F^\alpha$ so that the equality $M_{\alpha\phi}(F) = M_\phi(F^\alpha)$ holds even when $F$ does not have compact support. It is straightforward to check that Lemma 1.3 holds for the extended definition.

2 GENERALIZING THE EXPECTED UTILITY REPRESENTATION THEOREM

2.1 INTRODUCTION

The preceding chapter generalized the quasilinear mean by weakening the axiom of quasilinearity. As we noted in the introduction, the quasilinear mean, $M_\phi$, represents another way to axiomatize expected utility, via the maximand $\int_R \phi\,dF$. Correspondingly, our generalized mean, $M_{\alpha\phi}$, induces a more general ordering via the maximand $\int_R \alpha\phi\,dF / \int_R \alpha\,dF$.

This chapter treats the problem of extending the above representation to the case of simple probability measures on a more general outcome space than the real line. Since the notion of mean value may not be defined for a more general outcome space (consider, e.g., the outcome space consisting of getting a promotion, status quo and being fired), we need to state axioms directly in terms of properties of the underlying ordering.

The development of results parallels that of Chapter 1. Consequently, the proofs here are straightforward adaptations of the corresponding ones in Chapter 1. They are nonetheless included so that Chapter 2 may be read independently of Chapter 1. Unlike Chapter 1, most definitions used here are given explicitly because they are relatively unfamiliar.

2.2 PRELIMINARY DEFINITIONS

Definition 2.1: A simple probability measure (spm) $P$ on a set $X$ is a real-valued function defined on the set of all subsets of $X$ such that:
1) $P(A) \ge 0$, $\forall A \subset X$;
2) $P(X) = 1$;
3) $P(A \cup B) = P(A) + P(B)$ when $A, B \subset X$ and $A \cap B = \emptyset$;
4) $P(A) = 1$ for some finite $A \subset X$.

A simple probability measure $P$ on a set $X$ has the property $P(\{x\}) = 0$ for all but a finite number of $x \in X$, and for all $A \subset X$, $P(A) = \sum_{x \in A} P(x)$, where $P(\{x\})$ is written as $P(x)$.

Definition 2.2: A point mass, $\delta_x$, at $x$ is the spm with $P(x) = 1$.
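Definitions 2.1 and 2.2 translate directly into a finite data structure. The sketch below is an illustration only: representing an spm as a Python dictionary from outcomes to probabilities, and the names `is_spm`, `prob` and `point_mass`, are assumptions of this example rather than anything in the original text. Outcomes absent from the dictionary carry probability zero.

```python
import math

def is_spm(P, tol=1e-12):
    """Check conditions 1), 2) and 4) of Definition 2.1 for a finite dict:
    nonnegative masses that sum to one on a finite set of outcomes."""
    return all(p >= 0 for p in P.values()) and abs(sum(P.values()) - 1.0) <= tol

def prob(P, A):
    """P(A) = sum of P(x) over x in A (additivity, condition 3)."""
    return sum(p for x, p in P.items() if x in A)

def point_mass(x):
    """The point mass delta_x of Definition 2.2."""
    return {x: 1.0}

# The non-monetary outcome space mentioned in the introduction.
P = {"promotion": 0.2, "status quo": 0.7, "fired": 0.1}
change = prob(P, {"promotion", "fired"})
certain = point_mass("status quo")
```

A dictionary is used because an spm has finite support, so no assumption about the structure of $X$ itself (ordering, arithmetic) is needed.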
Definition 2.3: For $\beta \in (0,1)$, the $\beta$-mixture of a spm $P$ with another spm $Q$, $\beta P + (1-\beta)Q$, is the real-valued function that assigns $\beta P(A) + (1-\beta)Q(A)$ to every $A \subset X$.

It is clear that $\beta P + (1-\beta)Q$ is a spm when $P$, $Q$ are spm's. In general, $\sum_{i=1}^{n} \theta_i P_i$ is a spm if $P_i$ is a spm for $i = 1, 2, \ldots, n$ and $\sum_{i=1}^{n} \theta_i = 1$ with $\theta_i > 0$ for $i = 1, 2, \ldots, n$. For a spm $P$ on a set $X$, let $\{x_i\}_{i=1}^{n} \subset X$ be the set of points for which $P(x_i) > 0$ for $i = 1, 2, \ldots, n$. It is easy to check that $P = \sum_{i=1}^{n} p_i\delta_{x_i}$, where $p_i = P(x_i)$ for $i = 1, 2, \ldots, n$.

Definition 2.4: The expectation, $E(f,P)$, of a real-valued function $f$ defined on $X$ relative to a spm $P$ on $X$ is defined by
$$E(f,P) = \sum_{x \in X} f(x)P(x).$$
For $P \equiv \sum_{i=1}^{n} p_i\delta_{x_i}$, $E(f,P) = \sum_{i=1}^{n} p_i f(x_i)$.

Definition 2.5: A binary relation $\prec$ on a set $Y$ is a weak order if $\prec$ is asymmetric (i.e. $\forall x, y \in Y$, $x \prec y \Rightarrow$ not $(y \prec x)$) and negatively transitive (i.e. $\forall x, y, z \in Y$, not $(x \prec y)$ and not $(y \prec z)$ $\Rightarrow$ not $(x \prec z)$).

We summarize some properties of a weak order, $\prec$, via the following.

Lemma 2.1: Suppose $\prec$ is a weak order on $Y$. Define binary relations $\sim$, $\preceq$ on $Y$ by
$x \sim y \iff$ not $(x \prec y)$ and not $(y \prec x)$, $\forall x, y \in Y$, and
$x \preceq y \iff (x \prec y)$ or $(x \sim y)$, $\forall x, y \in Y$.
Then,
i) $\sim$ is an equivalence relation,
ii) $\preceq$ is transitive,
iii) $\preceq$ is connected (i.e. $\forall x, y \in Y$, $(x \preceq y)$ or $(y \preceq x)$),
iv) $(x \prec y)$ and $(y \sim z)$ $\Rightarrow$ $x \prec z$, and $(x \sim y)$ and $(y \prec z)$ $\Rightarrow$ $x \prec z$, $\forall x, y, z \in Y$.

Proof: (Omitted).

In a preference context, $\prec$ is called 'strict preference' and $x \prec y$ is read as 'y is strictly preferred to x'; $\preceq$ is called 'weak preference' and $x \preceq y$ is read as 'x is not preferred to y'; $\sim$ is called 'indifference' and $x \sim y$ is read as 'x is indifferent to y'.

2.3 AXIOMS

The following are conditions on a binary relation $\prec$ on $L_X$, the set of spm's defined on a set $X$.

Axiom 1: Ordering
$\prec$ is a weak order.

Axiom 2: Solvability
$\forall P, Q, R \in L_X$, $P \prec Q$ and $Q \prec R$ $\Rightarrow$ $\exists \beta \in (0,1)$ such that $\beta P + (1-\beta)R \sim Q$.

Axiom 3: Monotonicity
$\forall P, Q \in L_X$, $P \prec Q$ $\Rightarrow$ $\beta P + (1-\beta)Q \prec \gamma P + (1-\gamma)Q$ for $0 < \gamma < \beta < 1$.
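Axiom 4: Weak Independence
$\forall P, Q \in L_X$, $P \sim Q$ $\Rightarrow$ $\forall \beta \in (0,1)$ $\exists \gamma \in (0,1)$ such that $\forall R \in L_X$, $\beta P + (1-\beta)R \sim \gamma Q + (1-\gamma)R$.

Axioms 1, 2 and 3 are standard properties of a binary relation that can be represented by the expectation of a utility function. Axiom 4 is our only departure. If we insist that $\beta$ and $\gamma$ be identical, then Axiom 4 reduces to the substitution principle, which is another property of expected utility. The following property is a restatement of Property 5 of Chapter 1 in the context of a weak order, $\prec$, on $L_X$.

Definition 2.6: (Ratio Consistency) If $\exists P, Q, R \in L_X$ and $\beta_1, \beta_2, \gamma_1, \gamma_2 \in (0,1)$ such that $P \sim Q$ and $\beta_i P + (1-\beta_i)R \sim \gamma_i Q + (1-\gamma_i)R$ for $i = 1, 2$, then
$$\frac{\gamma_1/(1-\gamma_1)}{\beta_1/(1-\beta_1)} = \frac{\gamma_2/(1-\gamma_2)}{\beta_2/(1-\beta_2)}.$$

Lemma 2.2: Axioms 1, 3 and 4 $\Rightarrow$ Ratio Consistency.

Proof: The proof of Lemma 2.2 is essentially identical to that of Lemma 1.2 in Chapter 1.

The interpretation of our axioms and the Ratio Consistency property in the context of choice will be deferred until Chapter 4, where we apply the representation theorems of this chapter and that of Chapter 1 to decision theory.

2.4 REPRESENTATION THEOREMS

To facilitate the statement of our representation theorem, we have

Definition 2.7: Let $\prec$ be a binary relation on $L_X$, the set of simple probability measures on a set $X$. The induced binary relation on $X$, written here as $\prec_X$ to distinguish it from the relation on $L_X$, is defined by, $\forall x, y \in X$, $x \prec_X y \iff \delta_x \prec \delta_y$.

If $\prec$ is a weak order, then $\prec_X$ is also a weak order. We derive the binary relations $\preceq_X$ and $\sim_X$ from $\prec_X$ as in Lemma 2.1.

Definition 2.8: Let $\prec$ be a weak order on a set $Y$. An element $y \in Y$ is a maximal (minimal) element if $\forall x \in Y$, $x \preceq y$ ($y \preceq x$).

Theorem 2.1: Let $L_X$ be the set of simple probability measures defined on a set $X$. Suppose $\prec$ is a binary relation on $L_X$ with the induced binary relation on $X$ denoted by $\prec_X$. Then there exist functions $\alpha : X \to R^+$ and $v : X \to R$ such that $v$ is nonconstant and attains its supremum and infimum over $X$ and, $\forall P, Q \in L_X$,
$$P \prec Q \iff \frac{E(\alpha v,P)}{E(\alpha,P)} < \frac{E(\alpha v,Q)}{E(\alpha,Q)}, \tag{2.1}$$
if and only if $\prec$ satisfies Axioms 1-4 and $X$ contains a maximal element $\bar{x}$ and a minimal element $\underline{x}$ such that $\underline{x} \prec_X \bar{x}$. Moreover, if $\alpha, v$ and $\alpha^*, v^*$ satisfy the condition of this theorem, then $\exists a, b, c, k$ with $a, c, k > 0$ such that $\forall x \in X$,
$$\alpha^*(x) = c\,\alpha(x)\{k[v(x) - v(\underline{x})] + [v(\bar{x}) - v(x)]\}$$
and
$$v^*(x) = a\,\frac{k[v(x) - v(\underline{x})]}{k[v(x) - v(\underline{x})] + [v(\bar{x}) - v(x)]} + b. \tag{2.2}$$

Proof: Necessity: Let $\underline{x}, \bar{x} \in X$ be such that $v(\underline{x}) = \inf_{x \in X} v$ and $v(\bar{x}) = \sup_{x \in X} v$. $\forall x \in X$ observe that
$$v(\underline{x}) \le v(x) \le v(\bar{x}) \ \Rightarrow\ \delta_{\underline{x}} \preceq \delta_x \preceq \delta_{\bar{x}} \ \Rightarrow\ \underline{x} \preceq_X x \preceq_X \bar{x}$$
$\Rightarrow$ $\underline{x}$, $\bar{x}$ are minimal and maximal elements of $X$ respectively. Furthermore, $v$ is nonconstant $\Rightarrow$ $\inf v < \sup v$ $\Rightarrow$ $\delta_{\underline{x}} \prec \delta_{\bar{x}}$ $\Rightarrow$ $\underline{x} \prec_X \bar{x}$.

Axiom 1 follows immediately. Axiom 2 follows from the observation that, $\forall P, Q \in L_X$,
$$\frac{E(\alpha v,\beta P + (1-\beta)Q)}{E(\alpha,\beta P + (1-\beta)Q)} \ \text{ is continuous in } \beta.$$
$\forall P, Q \in L_X$, $P \prec Q$ $\Rightarrow$ $\dfrac{E(\alpha v,P)}{E(\alpha,P)} < \dfrac{E(\alpha v,Q)}{E(\alpha,Q)}$ $\Rightarrow$
$$\frac{E(\alpha v,\beta P + (1-\beta)Q)}{E(\alpha,\beta P + (1-\beta)Q)} \ \text{ decreases strictly in } \beta \tag{2.3}$$
$\Rightarrow$ Axiom 3.

Suppose $P, Q \in L_X$ and $P \sim Q \iff \dfrac{E(\alpha v,P)}{E(\alpha,P)} = \dfrac{E(\alpha v,Q)}{E(\alpha,Q)}$. It follows that $\forall R \in L_X$ and $\forall \beta \in (0,1)$,
$$\frac{E(\alpha v,\beta P + (1-\beta)R)}{E(\alpha,\beta P + (1-\beta)R)} = \frac{E(\alpha v,\gamma Q + (1-\gamma)R)}{E(\alpha,\gamma Q + (1-\gamma)R)}, \quad \text{where } \frac{\gamma/(1-\gamma)}{\beta/(1-\beta)} = \frac{E(\alpha,P)}{E(\alpha,Q)}.$$
Hence, Axiom 4.

Sufficiency: Let $\underline{x}$, $\bar{x}$ be minimal and maximal elements of $X$, respectively. Define, $\forall p \in [0,1]$, $S_p = p\,\delta_{\bar{x}} + (1-p)\,\delta_{\underline{x}}$. By hypothesis $\delta_{\underline{x}} \prec \delta_{\bar{x}}$. It follows from Axiom 3 that
$$S_p \prec S_q \iff 0 < p < q < 1. \tag{2.4}$$
$\forall x \in X$ such that $\delta_{\underline{x}} \prec \delta_x$ and $\delta_x \prec \delta_{\bar{x}}$, it follows from Axiom 2 that $\exists q \in (0,1)$ such that $\delta_x \sim S_q$. It is clear from (2.4) that $q$ is unique. We construct a real-valued function $\hat{v} : X \to [0,1]$ in the following manner: $\hat{v}(\underline{x}) = 0$, $\hat{v}(\bar{x}) = 1$, and, $\forall x \in X - \{\underline{x},\bar{x}\}$, $\hat{v}(x) = q$ such that $\delta_x \sim S_q$. From construction, $\forall x \in X - \{\underline{x},\bar{x}\}$,
$$\delta_x \sim S_{\hat{v}(x)}. \tag{2.5}$$
Lemma 2.2 $\Rightarrow$ $\exists \alpha_x > 0$ such that $\forall R \in L_X$ and $\beta \in (0,1)$,
$$\beta\delta_x + (1-\beta)R \sim \frac{\beta\alpha_x}{\beta\alpha_x + (1-\beta)}\,S_{\hat{v}(x)} + \frac{1-\beta}{\beta\alpha_x + (1-\beta)}\,R. \tag{2.6}$$
Construct a positive real-valued function $\hat\alpha$ on $X$ in the following manner: $\hat\alpha(\underline{x}) = \hat\alpha(\bar{x}) = 1$, and, $\forall x \in X - \{\underline{x},\bar{x}\}$, $\hat\alpha(x) = \alpha_x$.

Given a spm $P \equiv \sum_{i=1}^{n} p_i\delta_{x_i}$, applying (2.6) to $x_1, x_2, \ldots, x_n$ sequentially, it follows that
$$P = \sum_{i=1}^{n} p_i\delta_{x_i} \sim \frac{p_1\hat\alpha(x_1)}{p_1\hat\alpha(x_1) + \sum_{j=2}^{n} p_j}\,S_{\hat{v}(x_1)} + \sum_{i=2}^{n} \frac{p_i}{p_1\hat\alpha(x_1) + \sum_{j=2}^{n} p_j}\,\delta_{x_i}$$
$$\sim \frac{p_1\hat\alpha(x_1)\,S_{\hat{v}(x_1)} + p_2\hat\alpha(x_2)\,S_{\hat{v}(x_2)} + \sum_{i=3}^{n} p_i\,\delta_{x_i}}{p_1\hat\alpha(x_1) + p_2\hat\alpha(x_2) + \sum_{j=3}^{n} p_j}$$
$$\sim \cdots \sim \sum_{i=1}^{n} \frac{p_i\hat\alpha(x_i)}{\sum_{j=1}^{n} p_j\hat\alpha(x_j)}\,S_{\hat{v}(x_i)} = S_{\sum_i p_i\hat\alpha(x_i)\hat{v}(x_i) / \sum_i p_i\hat\alpha(x_i)}.$$
Hence, $\forall P \in L_X$, $P \sim S_{E(\hat\alpha\hat{v},P)/E(\hat\alpha,P)}$. It follows from Lemma 2.1 (iv) that $\forall P, Q \in L_X$,
$$P \prec Q \iff S_{E(\hat\alpha\hat{v},P)/E(\hat\alpha,P)} \prec S_{E(\hat\alpha\hat{v},Q)/E(\hat\alpha,Q)} \iff \frac{E(\hat\alpha\hat{v},P)}{E(\hat\alpha,P)} < \frac{E(\hat\alpha\hat{v},Q)}{E(\hat\alpha,Q)}.$$

Uniqueness: Suppose $\exists \alpha : X \to R^+$ and $v : X \to R$ such that $v$ is nonconstant and attains its infimum and supremum over $X$ at $\underline{y}$, $\bar{y}$, respectively, and $\forall P, Q \in L_X$,
$$P \prec Q \iff \frac{E(\alpha v,P)}{E(\alpha,P)} < \frac{E(\alpha v,Q)}{E(\alpha,Q)}. \tag{2.7}$$
Clearly, $\underline{y}$, $\bar{y}$ are minimal and maximal elements of $X$ $\Rightarrow$ ($\underline{y} = \underline{x}$ or $\delta_{\underline{y}} \sim \delta_{\underline{x}}$) and ($\bar{y} = \bar{x}$ or $\delta_{\bar{y}} \sim \delta_{\bar{x}}$) $\Rightarrow$ $v(\underline{x}) = v(\underline{y})$ and $v(\bar{x}) = v(\bar{y})$. By construction, $\forall x \in X - \{\underline{x},\bar{x}\}$,
$$\delta_x \sim S_{\hat{v}(x)} \quad \text{(see relation (2.5))} \tag{2.8}$$
$$\Rightarrow\ v(x) = \frac{\hat{v}(x)\alpha(\bar{x})v(\bar{x}) + (1-\hat{v}(x))\alpha(\underline{x})v(\underline{x})}{\hat{v}(x)\alpha(\bar{x}) + (1-\hat{v}(x))\alpha(\underline{x})}, \tag{2.9}$$
after applying (2.7) to (2.8). Also, $\forall \beta \in (0,1)$,
$$\beta\delta_x + (1-\beta)\delta_{\underline{x}} \sim \frac{\beta\hat\alpha(x)}{\beta\hat\alpha(x) + (1-\beta)}\,S_{\hat{v}(x)} + \frac{1-\beta}{\beta\hat\alpha(x) + (1-\beta)}\,\delta_{\underline{x}} \tag{2.10}$$
$$\Rightarrow\ \frac{\beta\alpha(x)v(x) + (1-\beta)\alpha(\underline{x})v(\underline{x})}{\beta\alpha(x) + (1-\beta)\alpha(\underline{x})} = \frac{\beta\hat\alpha(x)\hat{v}(x)\alpha(\bar{x})v(\bar{x}) + [\beta\hat\alpha(x)(1-\hat{v}(x)) + (1-\beta)]\alpha(\underline{x})v(\underline{x})}{\beta\hat\alpha(x)\hat{v}(x)\alpha(\bar{x}) + [\beta\hat\alpha(x)(1-\hat{v}(x)) + (1-\beta)]\alpha(\underline{x})}, \tag{2.11}$$
after applying (2.7) to (2.10).

Let $a = v(\bar{x}) - v(\underline{x})$, $b = v(\underline{x})$, $c = \alpha(\underline{x})$, $k = \alpha(\bar{x})/\alpha(\underline{x})$. It is easy to check that (2.9) and (2.11) become
$$v(x) = a\,\frac{k\hat{v}(x)}{k\hat{v}(x) + (1-\hat{v}(x))} + b \quad \text{and} \quad \alpha(x) = c\,\hat\alpha(x)[k\hat{v}(x) + (1-\hat{v}(x))]. \tag{2.12}$$
Suppose $\alpha^*$, $v^*$ are another pair of functions that satisfy the hypotheses of the theorem. Then
$$v^*(x) = a^*\,\frac{k^*\hat{v}(x)}{k^*\hat{v}(x) + (1-\hat{v}(x))} + b^* \quad \text{and} \quad \alpha^*(x) = c^*\hat\alpha(x)[k^*\hat{v}(x) + (1-\hat{v}(x))], \tag{2.13}$$
where $a^* = v^*(\bar{x}) - v^*(\underline{x})$, $b^* = v^*(\underline{x})$, $c^* = \alpha^*(\underline{x})$, and $k^* = \alpha^*(\bar{x})/\alpha^*(\underline{x})$. Finally, it is straightforward to check that
$$v^*(x) = a'\,\frac{k'[v(x) - v(\underline{x})]}{k'[v(x) - v(\underline{x})] + v(\bar{x}) - v(x)} + b' \quad \text{and} \quad \alpha^*(x) = c'\alpha(x)\{k'[v(x) - v(\underline{x})] + v(\bar{x}) - v(x)\} \tag{2.14}$$
for
$a' = v^*(\bar{x}) - v^*(\underline{x}) = a^*$ (evaluate (2.14) at $x = \bar{x}$),
$b' = v^*(\underline{x}) = b^*$,
$c' = \dfrac{\alpha^*(\underline{x})}{\alpha(\underline{x})}\cdot\dfrac{1}{v(\bar{x}) - v(\underline{x})} = \dfrac{c^*}{c\,a}$,
$k' = \dfrac{\alpha^*(\bar{x})}{\alpha^*(\underline{x})}\cdot\dfrac{\alpha(\underline{x})}{\alpha(\bar{x})} = \dfrac{k^*}{k}$.
Q.E.D.

We showed that any binary relation on $L_X$ that satisfies the ordering, monotonicity, solvability and weak independence axioms is characterized by a pair of functions $(\alpha,v)$ defined on $X$. When $\alpha$ is constant, our representation, $E(\alpha v,P)/E(\alpha,P)$, reduces to the expected utility representation $E(v,P)$. If we define a simple probability measure $P^\alpha$ derived from $P$ in the following manner,
$$P^\alpha(A) = E(\alpha 1_A,P)/E(\alpha,P), \quad \forall A \subset X,$$
where $1_A$ denotes the indicator function of $A$, then we can state our representation in the alternative fashion $E(v,P^\alpha)$. Our representation is then simply the expectation of the v-function with respect to the measure $P^\alpha$ derived from $P$ via the $\alpha$-function, which is the Radon-Nikodym derivative of $P^\alpha$ with respect to $P$.

We render the role of $\alpha$ transparent by considering a uniform measure $P = \sum_{i=1}^{N} \frac{1}{N}\delta_{x_i}$. The corresponding representation of $P$ is given by
$$E(\alpha v,P)/E(\alpha,P) = \sum_{i=1}^{N} \frac{\alpha(x_i)}{\sum_{j=1}^{N} \alpha(x_j)}\,v(x_i).$$
This is a weighted average of $\{v(x_i)\}_{i=1}^{N}$ with weights $\{\alpha(x_i)\}_{i=1}^{N}$.

The statement of Theorem 2.1 requires the set $X$ to be bounded by a maximal and a minimal element. The remainder of this section deals with the extension of Theorem 2.1 to the case where $X$ has neither a maximal nor a minimal element. This parallels the development towards the proof of Theorem 1.3 in Chapter 1.

Definition 2.9: Let $\prec$ be a weak order on a set $X$. For $s, t \in X$, an interval $[s,t] \subset X$ is defined by
$$[s,t] = \{x \in X : s \preceq x,\ x \preceq t\}.$$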
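The representation $E(\alpha v,P)/E(\alpha,P)$ of Theorem 2.1 and the interval of Definition 2.9 can be illustrated on a small non-monetary outcome set. The sketch below is an assumption-laden illustration and not part of the original text: the outcome names, the numerical values of $\alpha$ and $v$, and the function names are all hypothetical. It uses the fact that the representation assigns the value $v(x)$ to a point mass $\delta_x$, so the induced order on $X$ ranks outcomes by $v$.

```python
import math

def expectation(f, P):
    """E(f, P) = sum over the support of f(x) * P(x) (Definition 2.4)."""
    return sum(f(x) * p for x, p in P.items())

def value(P, alpha, v):
    """The representation E(alpha*v, P) / E(alpha, P) of condition (2.1)."""
    return expectation(lambda x: alpha(x) * v(x), P) / expectation(alpha, P)

def interval(X, s, t, v):
    """[s, t] of Definition 2.9 under the induced order, which compares
    outcomes by v because value(delta_x) equals v(x)."""
    return {x for x in X if v(s) <= v(x) <= v(t)}

X = {"fired", "status quo", "promotion"}
v_table = {"fired": 0.0, "status quo": 0.5, "promotion": 1.0}       # hypothetical v
alpha_table = {"fired": 2.0, "status quo": 1.0, "promotion": 1.0}   # hypothetical alpha

def v(x):
    return v_table[x]

def alpha(x):
    return alpha_table[x]

def constant_alpha(x):
    return 1.0

uniform = {x: 1.0 / 3.0 for x in X}
weighted_value = value(uniform, alpha, v)            # weights 2 : 1 : 1
expected_utility = value(uniform, constant_alpha, v)
lower_part = interval(X, "fired", "status quo", v)
```

With the uniform measure, the value is the weighted average of $v$ with weights $\alpha$, so doubling the weight on the worst outcome lowers the value below the expected utility obtained with constant $\alpha$.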
When X contains both a maximal element x and a minimal element x relative to a weak order -<, then X = [x,x]. When s^, s2, t^, t2 e X, such that s2 •< Sj -< tj -< t2, then [Sj.tj] ^ [s2,t2]. Definition 2.10: A pair of functions (a,v) is said to represent a weak order-< on LY if (<*,v) satisfies condition (2.1). 48 Definition 2.11: The uniqueness class representing a weak order -< on l_x> {ct>v}x> consists of all pairs (a,v) that represent -< on L^-Definition 2.12: Let s, t £ X 3 s -< t, we denote by {a,v)r , , the [s, t j k-ratio subclass of the uniqueness class {a,v)r , representing , t J -< on L|-s ^-j, consisting of those pairs (a,v) that satisfy aft") k k k r \ = k. A generic element of {a,v}r is denoted by (a ,v ) . Definition 2.13: Let s, t e X 9 s-< t, the pair (ct*,v*) is said to be an (a,b,c,k) transformation of (a,v) on [s,t] if 3 a, b, c, k such that a, c, k > 0 and Vx € [s,t] a*(x) = ca(x){k(v(x) - v(s)) + v(t) - v(x)} and v*(x)=a— kK*) ' ( , + b k[v(x) - v(s)] + v(t) - v(x) We denote such a transformation by (a*,v*) = TajbjC)k (<x,v) on [s,t] C X-II k It is clear that {ct,v}r , = ^ + {a,v}r ^, . Let (a,v) be an [s,t] kSR [s,t] element of {a,v}r ,, we can then generate all other elements [s, t j via the (a,b,c,k) transformation. Note that T ... is an a, D , l, i affine transformation on v and T, . . is a positive scalar 1,1,c,l multiple of a; and that we can use a unitary ratio pair (a*,v*) s {a,v)r to generate all elements of {a,v>r , Ls,tj [s,tj 49 since Tajb)C)k(al.vl) e {a'v}^s t]' We denote by (Sk.vk), the k k k canonical member of {a,v}f , which satisfies v (s) = 0, v (t)=l, [ s , t J ak(s) = 1, and &k(t) = k. It is clear that (ak,vk) is unique for each k. In general, a member a, v of {ct,v}r , is uniquely L S j t J specified by the values of a, v at.s and t. Let a = v(t) -v(s), b = v(s), c = a(s), k = a(t)/a(s), then Vx € [s,t]. 
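\[ \alpha(x) = c\,\hat\alpha^1(x)\,[k\hat v^1(x) + (1 - \hat v^1(x))] \quad\text{and}\quad v(x) = a\,\frac{k\hat v^1(x)}{k\hat v^1(x) + 1 - \hat v^1(x)} + b. \]

Corollary 2.1: Let \(s_0, t_0, s_1, t_1 \in X\) be such that \(s_1 \prec s_0 \prec t_0 \prec t_1\), and let \((\hat\alpha_1^1, \hat v_1^1)\) denote the canonical unitary pair on \([s_1,t_1]\). Then

\[ \{\alpha,v\}_{[s_1,t_1]}\big|_{[s_0,t_0]} = \bigcup_{k \in (\underline\ell_{01}, \bar\ell_{01})} \{\alpha,v\}^k_{[s_0,t_0]}, \]

where

\[ \bar\ell_{01} = \frac{\hat\alpha_1^1(t_0)}{\hat\alpha_1^1(s_0)}\cdot\frac{\hat v_1^1(t_0)}{\hat v_1^1(s_0)}, \tag{2.15} \]

\[ \underline\ell_{01} = \frac{\hat\alpha_1^1(t_0)}{\hat\alpha_1^1(s_0)}\cdot\frac{1 - \hat v_1^1(t_0)}{1 - \hat v_1^1(s_0)}. \tag{2.16} \]

Proof: \([s_0,t_0] \subset [s_1,t_1] \Rightarrow L_{[s_0,t_0]} \subset L_{[s_1,t_1]}\). Therefore, if \((\alpha, v)\) defined on \([s_1,t_1]\) represents \(\prec\) on \(L_{[s_1,t_1]}\), then \((\alpha,v)|_{[s_0,t_0]}\) represents \(\prec\) on \(L_{[s_0,t_0]}\), i.e. \(\{\alpha,v\}_{[s_1,t_1]}|_{[s_0,t_0]} \subset \{\alpha,v\}_{[s_0,t_0]}\). Observe that \(\{\alpha,v\}^{k_1}_{[s_1,t_1]}|_{[s_0,t_0]} = \{\alpha,v\}^{k_0}_{[s_0,t_0]}\), where

\[ k_0 = \xi(k_1) = \frac{\hat\alpha_1^1(t_0)}{\hat\alpha_1^1(s_0)}\cdot\frac{k_1\hat v_1^1(t_0) + (1 - \hat v_1^1(t_0))}{k_1\hat v_1^1(s_0) + (1 - \hat v_1^1(s_0))}. \tag{2.17} \]

Note that \(\xi\) is a continuous, strictly increasing and onto function from \((0,\infty)\) to \((\underline\ell_{01}, \bar\ell_{01})\). Hence

\[ \{\alpha,v\}_{[s_1,t_1]}\big|_{[s_0,t_0]} = \bigcup_{k_1 \in (0,\infty)} \{\alpha,v\}^{k_1}_{[s_1,t_1]}\big|_{[s_0,t_0]} = \bigcup_{k_0 \in (\underline\ell_{01}, \bar\ell_{01})} \{\alpha,v\}^{k_0}_{[s_0,t_0]}. \]

Corollary 2.1 tells us it is possible to extend our representation of a weak order on \(L_{[s_0,t_0]}\) to a weak order on \(L_{[s_1,t_1]}\) if the ordering has not changed on \(L_{[s_0,t_0]}\). It also gives conditions on \((\alpha_0,v_0) \in \{\alpha,v\}_{[s_0,t_0]}\) so that \((\alpha_0,v_0)\) can be extended to a member \((\alpha_1,v_1) \in \{\alpha,v\}_{[s_1,t_1]}\) such that \((\alpha_1,v_1)|_{[s_0,t_0]} = (\alpha_0,v_0)\), i.e. the extended pair \((\alpha_1,v_1)\) defined on \([s_1,t_1]\) agrees with \((\alpha_0,v_0)\) on \([s_0,t_0]\). These conditions ((2.15) and (2.16)) are given in terms of \(\hat\alpha_1^1(t_0)\), \(\hat\alpha_1^1(s_0)\), \(\hat v_1^1(t_0)\), \(\hat v_1^1(s_0)\), and can be obtained via relations (2.5) and (2.6) in the constructive proof of Theorem 2.1.

Corollary 2.2: Let \(s_0, s_1, s_2, t_0, t_1, t_2 \in X\) be such that \(s_2 \prec s_1 \prec s_0 \prec t_0 \prec t_1 \prec t_2\), and let \((\hat\alpha_i^1, \hat v_i^1)\) denote the canonical unitary \((\alpha,v)\) on \([s_i,t_i]\), for \(i = 1,2\). Then

\[ \bar\ell_{02} < \bar\ell_{01} \quad\text{and}\quad \underline\ell_{02} > \underline\ell_{01}, \tag{2.18} \]

where

\[ \bar\ell_{0i} = \frac{\hat\alpha_i^1(t_0)}{\hat\alpha_i^1(s_0)}\cdot\frac{\hat v_i^1(t_0)}{\hat v_i^1(s_0)} \quad\text{and}\quad \underline\ell_{0i} = \frac{\hat\alpha_i^1(t_0)}{\hat\alpha_i^1(s_0)}\cdot\frac{1 - \hat v_i^1(t_0)}{1 - \hat v_i^1(s_0)} \quad\text{for } i = 1,2. \]

Proof: From Corollary 2.1,

\[ \{\alpha,v\}_{[s_2,t_2]}\big|_{[s_0,t_0]} = \bigcup_{k_0 \in (\underline\ell_{02}, \bar\ell_{02})} \{\alpha,v\}^{k_0}_{[s_0,t_0]}, \qquad \{\alpha,v\}_{[s_1,t_1]}\big|_{[s_0,t_0]} = \bigcup_{k_0 \in (\underline\ell_{01}, \bar\ell_{01})} \{\alpha,v\}^{k_0}_{[s_0,t_0]}, \]

and

\[ \{\alpha,v\}_{[s_2,t_2]}\big|_{[s_1,t_1]} = \bigcup_{k_1 \in (\underline\ell_{12}, \bar\ell_{12})} \{\alpha,v\}^{k_1}_{[s_1,t_1]}, \]

where

\[ \bar\ell_{12} = \frac{\hat\alpha_2^1(t_1)}{\hat\alpha_2^1(s_1)}\cdot\frac{\hat v_2^1(t_1)}{\hat v_2^1(s_1)} \quad\text{and}\quad \underline\ell_{12} = \frac{\hat\alpha_2^1(t_1)}{\hat\alpha_2^1(s_1)}\cdot\frac{1 - \hat v_2^1(t_1)}{1 - \hat v_2^1(s_1)}. \]

Construct the function \(\xi_{ji} : (0,\infty) \to (\underline\ell_{ij}, \bar\ell_{ij})\), for \(i < j\), via

\[ k_i = \xi_{ji}(k_j) = \frac{\hat\alpha_j^1(t_i)}{\hat\alpha_j^1(s_i)}\cdot\frac{k_j\hat v_j^1(t_i) + 1 - \hat v_j^1(t_i)}{k_j\hat v_j^1(s_i) + 1 - \hat v_j^1(s_i)}. \tag{2.19} \]

Note that, by construction,

\[ \{\alpha,v\}^{k_j}_{[s_j,t_j]}\big|_{[s_i,t_i]} = \{\alpha,v\}^{\xi_{ji}(k_j)}_{[s_i,t_i]}. \]

Note also that \(\xi_{ji}\) is continuous, strictly increasing, and onto from \((0,\infty)\) to \((\underline\ell_{ij}, \bar\ell_{ij})\).

Suppose \(\bar\ell_{02} > \bar\ell_{01}\). Pick \(k_1 \in [\bar\ell_{12}, \infty)\); then \(\xi_{10}(k_1) < \bar\ell_{01} < \bar\ell_{02}\). There exists \(k_2 \in (0,\infty)\) such that \(\xi_{20}(k_2) = \xi_{10}(k_1)\), i.e.

\[ \{\alpha,v\}^{k_2}_{[s_2,t_2]}\big|_{[s_0,t_0]} = \{\alpha,v\}^{\xi_{10}(k_1)}_{[s_0,t_0]}. \]

But

\[ \{\alpha,v\}^{k_2}_{[s_2,t_2]}\big|_{[s_0,t_0]} = \{\alpha,v\}^{k_2}_{[s_2,t_2]}\big|_{[s_1,t_1]}\big|_{[s_0,t_0]} = \{\alpha,v\}^{\xi_{21}(k_2)}_{[s_1,t_1]}\big|_{[s_0,t_0]} \ \Rightarrow\ \xi_{21}(k_2) = k_1. \]

But \(\xi_{21}(k_2) \in (\underline\ell_{12}, \bar\ell_{12})\), which contradicts the choice of \(k_1\). A similar argument establishes \(\underline\ell_{01} < \underline\ell_{02}\).

Thus far, we have considered extending from an interval of \(X\) to a larger interval. Presumably \(X\) is not bounded; otherwise, we would have constructed \((\alpha,v)\) on \(X\) with Theorem 2.1 once and for all. The next interesting case then is when \(X\) contains neither a maximal nor a minimal element, for example, the real line. With a structural condition on \(X\), we show in Theorem 2.2 that even in this case, an \(\alpha\)-\(v\) representation exists on \(L_X\).

Definition 2.14: Let \(\prec\) be a weak order on a set \(Y\). A sequence \(\{y_i\}_{i=1}^{\infty} \subset Y\) is cofinal (coinitial) if \(\forall x \in Y\), \(x \preceq y_i\) (\(y_i \preceq x\)) for some positive integer \(i\).

Theorem 2.2: Let \(L_X\) be the set of simple probability measures (spm) defined on a set \(X\), and let \(\prec\) be a binary relation on \(L_X\) with the induced binary relation on \(X\) denoted by \(\prec\). There exist functions \(\alpha : X \to R^+\) and \(v : X \to R\) such that

(i) \(v(X)\) contains a strictly increasing cofinal sequence and a strictly decreasing coinitial sequence, and

(ii) \(\forall P, Q \in L_X\),

\[ P \prec Q \iff \frac{E(\alpha v,P)}{E(\alpha,P)} < \frac{E(\alpha v,Q)}{E(\alpha,Q)}, \]

if and only if \(\prec\) satisfies Axioms 1-5 and \(X\) ordered by \(\prec\) contains a strictly increasing cofinal sequence and a strictly decreasing coinitial sequence. Moreover, if \(\alpha, v\) and \(\alpha^*, v^*\) both satisfy (i) and (ii), then \(\forall s, t \in X\) with \(s \prec t\), there exist \(a, b, c, k\) with \(a, c, k > 0\) such that \(\forall x \in [s,t]\),

\[ \alpha^*(x) = c\,\alpha(x)\{k[v(x) - v(s)] + [v(t) - v(x)]\}, \quad\text{and}\quad v^*(x) = a\,\frac{k[v(x) - v(s)]}{k[v(x) - v(s)] + [v(t) - v(x)]} + b. \]

Proof:

Necessity: Let \(\{d_i\}_{i=0}^{\infty}\) and \(\{e_i\}_{i=0}^{\infty} \subset v(X)\) be a strictly decreasing coinitial sequence and a strictly increasing cofinal sequence, respectively. Pick \(s_i, t_i \in X\) such that \(v(s_i) = d_i\)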
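and \(v(t_i) = e_i\) for \(i = 0,1,2,\ldots\). It follows that \(\{s_i\}_{i=0}^{\infty}\) (\(\{t_i\}_{i=0}^{\infty}\)) is a strictly decreasing coinitial sequence (strictly increasing cofinal sequence) of \(X\). Verification of Axioms 1-4 is straightforward (see Necessity Proof of Theorem 2.1).

Sufficiency: Let \(\{s_i\}_{i=0}^{\infty}\) (\(\{t_i\}_{i=0}^{\infty}\)) be a strictly decreasing coinitial sequence (strictly increasing cofinal sequence) of \(X\). Suppose without loss of generality that \(s_0 \prec t_0\). It is easy to check that

\[ X = \bigcup_{i=0}^{\infty} [s_i,t_i]. \]

Let \((\alpha_i, v_i)\) represent \(\prec\) on \(L_{[s_i,t_i]}\) for \(i = 0,1,2,\ldots\). This can always be done because of Theorem 2.1. Corollary 2.1 implies

\[ \{\alpha,v\}_{[s_i,t_i]}\big|_{[s_0,t_0]} = \bigcup_{k_0 \in (\underline\ell_{0i}, \bar\ell_{0i})} \{\alpha,v\}^{k_0}_{[s_0,t_0]}, \]

where

\[ (\underline\ell_{0i}, \bar\ell_{0i}) = \left( \frac{\hat\alpha_i^1(t_0)}{\hat\alpha_i^1(s_0)}\cdot\frac{1 - \hat v_i^1(t_0)}{1 - \hat v_i^1(s_0)},\ \frac{\hat\alpha_i^1(t_0)}{\hat\alpha_i^1(s_0)}\cdot\frac{\hat v_i^1(t_0)}{\hat v_i^1(s_0)} \right). \]

Corollary 2.2 implies that \(\{\underline\ell_{0i}\}_{i=1}^{\infty}\) is strictly increasing and \(\{\bar\ell_{0i}\}_{i=1}^{\infty}\) is strictly decreasing. Let

\[ A_i = \left[ \frac{\underline\ell_{0i} + \underline\ell_{0,i+1}}{2},\ \frac{\bar\ell_{0i} + \bar\ell_{0,i+1}}{2} \right] \quad\text{and}\quad B_i = (\underline\ell_{0,i+1}, \bar\ell_{0,i+1}). \]

Then \(B_i \subset A_i \subset (\underline\ell_{0i}, \bar\ell_{0i})\) for \(i = 1,2,3,\ldots\). Observe that \((\underline\ell_{0i}, \bar\ell_{0i})\), \(A_i\), \(B_i\) are strictly decreasing sequences by inclusion. Since \(A_i\) is compact for each \(i\), the nested interval theorem implies

\[ \lim_{i\to\infty} A_i = A_\infty \neq \emptyset. \]

Since \(B_i \subset A_i \subset (\underline\ell_{0i}, \bar\ell_{0i})\ \forall i\), we have

\[ \lim_{i\to\infty} B_i \subset A_\infty \subset \lim_{i\to\infty} (\underline\ell_{0i}, \bar\ell_{0i}). \]

But \(\lim_{i\to\infty} B_i = \lim_{i\to\infty} (\underline\ell_{0i}, \bar\ell_{0i})\). Hence \(\lim_{i\to\infty} (\underline\ell_{0i}, \bar\ell_{0i}) = A_\infty\).

To construct \((\alpha,v)\) defined on \(X\) that represents \(\prec\) on \(L_X\), pick \(k_0 \in A_\infty\). Define

\[ (\alpha(x),v(x)) = \begin{cases} (\alpha_0^{k_0}(x), v_0^{k_0}(x)) & \text{for } x \in [s_0,t_0], \\ (\alpha_1^{k_1}(x), v_1^{k_1}(x)) & \text{for } x \in [s_1,t_1] - [s_0,t_0], \\ \quad\vdots \\ (\alpha_i^{k_i}(x), v_i^{k_i}(x)) & \text{for } x \in [s_i,t_i] - [s_{i-1},t_{i-1}], \end{cases} \]

such that \(\xi_{i0}(k_i) = k_0\) and

\[ \alpha_i^{k_i}(s_0) = 1 = \alpha_0^{k_0}(s_0), \qquad v_i^{k_i}(s_0) = 0 = v_0^{k_0}(s_0), \qquad v_i^{k_i}(t_0) = 1 = v_0^{k_0}(t_0). \]

Observe that \((\alpha_{i+1}^{k_{i+1}}, v_{i+1}^{k_{i+1}})|_{[s_i,t_i]} = (\alpha_i^{k_i}, v_i^{k_i})\), since they agree at \(s_0\) and at \(t_0\). \(\forall P, Q \in L_X\), pick \([s_i,t_i]\) such that \(P, Q \in L_{[s_i,t_i]}\); then

\[ P \prec Q \iff \frac{E(\alpha_i^{k_i} v_i^{k_i}, P)}{E(\alpha_i^{k_i}, P)} < \frac{E(\alpha_i^{k_i} v_i^{k_i}, Q)}{E(\alpha_i^{k_i}, Q)} \iff \frac{E(\alpha v,P)}{E(\alpha,P)} < \frac{E(\alpha v,Q)}{E(\alpha,Q)} \]

by construction.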
To complete the sufficiency proof, observe that \(\{v(s_i)\}_{i=0}^{\infty}\) (\(\{v(t_i)\}_{i=0}^{\infty}\)) is a strictly decreasing coinitial sequence (strictly increasing cofinal sequence) of \(v(X)\).

Uniqueness: This follows directly from applying Theorem 2.1 to arbitrary intervals \([s,t]\) in \(X\).

Q.E.D.

PART II

APPLICATION TO DECISION THEORY

BACKGROUND

A choice situation exists when more than one course of action is available to a decision maker. A theory of choice specifies, for each set of available alternatives, the one that will be chosen. We have a valid descriptive theory if, for the relevant domain of choice situations, the theory can be compatible with the actual choices. The theory is normatively compelling if the underlying postulates are of sufficient appeal so that a decision maker is willing to change his choice to conform to the theory's specifications.

Expected utility has been considered an example of such a theory because for many researchers (e.g. Savage (1954), MacCrimmon (1965) and Raiffa (1968)), it satisfies the latter requirements. Yet, there is enough empirical evidence (cf. Chapter 3) to suggest that it is not a very good descriptive theory. People tend to systematically violate the implications of a key property of expected utility called the strong independence principle or the substitution axiom. Many of them would not change their choices after being told of their violations (MacCrimmon, 1968; Slovic & Tversky, 1975).

Due to the success of expected utility in the modeling of phenomena in the economics of uncertainty and its application to statistical decision theory, it has been fashionable to discount violations as mistakes needing correction.
A departure from this trend is evident in the appearance of several recent papers (Meginniss, 1977; Handa, 1977; Karmarkar, 1978; Kahneman & Tversky, 1979; Machina, 1980) proposing alternative theories of choice to account for Allais' 'paradox' and other empirical findings that contradict the implications of expected utility.

We develop in Chapter 4 a new theory of choice called alpha utility theory which generalizes expected utility via a necessary and sufficient set of axioms that weaken the corresponding ones for expected utility. Specifically, the substitution axiom is replaced by a weaker axiom called Weak Independence. Given two lotteries that are indifferent to each other, Weak Independence allows for different probabilities in composing each of these lotteries with a third lottery to preserve indifference. However, these mixture probabilities, once determined, must be independent of the third lottery. The axioms imply that the ratio of the mixture (probability) odds is constant. We call this the Ratio Consistency property. Expected utility results when this ratio is identically unity.

Our theory has descriptive relevance in that it can represent the usual responses given to the Allais paradox and is compatible with other reported empirical findings contradicting the implications of expected utility. Yet, it can be consistent with such normatively appealing partial orders as stochastic dominance and global risk aversion. As with expected utility, the constructive proof of our representation theorem furnishes a procedure for the assessment of the alpha utility functions, from which empirically testable predictions can be derived.

CRITIQUE OF EXPECTED UTILITY THEORY

3.1 INTRODUCTION

Expected utility theory has attracted considerable attention since its revival by von Neumann and Morgenstern in their "Theory of Games and Economic Behavior".
It serves as the foundation for the economics of uncertainty (Arrow, 1971; Marschak and Radner, 1972; Diamond and Rothschild, 1978), statistical decision theory (Savage, 1954; Blackwell and Girshick, 1961; Raiffa and Schlaifer, 1961; DeGroot, 1970) and decision analysis (Howard, 1964; Keeney and Raiffa, 1976). Expected utility has been less successful, though, in explaining and describing actual choices (Edwards, 1954, 1961; MacCrimmon, 1965; Kahneman and Tversky, 1979), thus providing a strong impetus for further theoretical development.

In this critique, we review briefly the empirical findings that pose difficulty for expected utility. First, we provide a summary, based on a recent paper (Chew and MacCrimmon, 1979b), of the systematic violations of the strong independence principle. The first example of such a violation is provided by the Allais paradox, which inspired extensive follow-up studies and modifications.

The next section discusses the concurrence of risk-averting and risk-seeking behavior evident in the prevalence of the purchase of insurance and gambling (Friedman and Savage, 1948). For instance, Markowitz (1952) noted prevalent risk-proneness for lotteries involving losses among his subjects. This observation is also noted by Kahneman and Tversky (1979), particularly in their probabilistic insurance example. Figure 3.1 displays a typical von Neumann-Morgenstern utility function which is used to account for the joint risk-averting/risk-seeking behavior discussed above. The "convex" ("concave") regions correspond to risk-proneness (risk-aversion).

Fig. 3.1: A "typical" von Neumann-Morgenstern utility function u(x)

Since the von Neumann-Morgenstern utility function cannot be convex and concave at the same time, expected utility rules out concurrence of risk proneness and risk aversion within the same region of wealth levels.
Whether this is actually the case is an empirical question that has yet to be fully investigated. There is, however, indirect evidence (MacCrimmon et al., 1972; Allais, 1977) to the contrary. Different theoretically equivalent procedures for the elicitation of von Neumann-Morgenstern utility, e.g., the certainty equivalent, the gain equivalent, and the chaining method, tend to yield different curves with opposing risk-propensities.

A difficulty with expected utility, one that touches on the largely unexplored area of problem representation and its effect on the decision maker's preference, is the controversy over the domain on which a utility function is defined. Should it be final wealth levels, the normatively compelling position as in Friedman and Savage (1948), Pratt (1964) and practically all the literature on the economics of uncertainty, or changes in asset position relative to some "customary" wealth level? Markowitz (1952) and others have observed that preferences are relatively independent of the current wealth levels. Another difficulty is related to the finding (Kahneman and Tversky, 1979) that preferences among two-stage lotteries may depend on whether the decision maker represents these lotteries in their simple equivalent forms.

This chapter expands on the issues introduced above without duplicating unduly the contents of other critiques already cited. We shall explore the descriptive implications of our generalization of expected utility theory in the next chapter in light of the examples considered here.

3.2 SYSTEMATIC VIOLATION OF THE STRONG INDEPENDENCE PRINCIPLE

As a lead-in to a more general structure of lotteries, consider the four decision problems given in Figure 3.2.
Problem 0:
  A_0: $1,000,000 for sure
  B_0: 10/11 chance of $5,000,000; 1/11 chance of $0

Problem L:
  A_L: 11/100 chance of $1,000,000; 89/100 chance of $0
  B_L: 10/100 chance of $5,000,000; 90/100 chance of $0

Problem H:
  A_H: 89/100 chance of $5,000,000; 11/100 chance of $1,000,000
  B_H: 99/100 chance of $5,000,000; 1/100 chance of $0

Problem I:
  A_I: $1,000,000 for sure
  B_I: 10/100 chance of $5,000,000; 89/100 chance of $1,000,000; 1/100 chance of $0

Figure 3.2: Four decision problems

Under the expected utility hypothesis, the only permissible patterns of choices are either \(A_H, A_I, A_L, A_0\) or \(B_H, B_I, B_L, B_0\). If your choices are like most people's, you will have chosen \(A_H\), \(A_I\), \(B_L\) and \(A_0\), which is not consistent with the implications of expected utility. The choice of \(A_I\) and \(B_L\) constitutes the well known Allais paradox. A lesser known paradox, the Allais ratio paradox, is given by the choice of \(B_L\) and \(A_0\). The violating choice of \(A_H\) and \(B_L\) has not been studied. The insight one gains from the structure in Figure 3.2, rather than simply considering separate binary lotteries, is that the violating pairs \((A_I, B_L)\), \((B_L, A_0)\) and \((A_H, B_L)\) are all derivatives of the basic violation, \(A_H, A_I, B_L, A_0\) versus \(A_H, A_I, A_L, A_0\).

Several features of the structure in Figure 3.2 are worth noting. It is based on three consequences, $0, $1,000,000 and $5,000,000, denoted by L, I, and H respectively. The \(A_x\) (\(B_x\)) alternative, where x stands for one of the consequences L, I, H, is obtained from the \(A_0\) (\(B_0\)) alternative by composition with consequence x at probability 89/100. This is illustrated in Figure 3.3 for the case x = L.

Figure 3.3: The composition of the \(A_L\) (\(B_L\)) lottery from \(A_0\) (\(B_0\))

Since \(A_L\) (\(B_L\)) in Figure 3.3 has the same final outcomes and probabilities as \(A_L\) (\(B_L\)) in Figure 3.2, these lotteries are equivalent.
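The composition step can be checked numerically. The following is a minimal sketch, assuming lotteries are stored as dictionaries that map consequences to exact probabilities; the helper name `mix` and the variable names are illustrative only and do not come from the text. Each basic alternative is mixed with a sure consequence x, with weight 11/100 on the basic alternative and 89/100 on x, and the reduced lotteries are compared with those listed in Figure 3.2.

```python
from fractions import Fraction as Fr


def mix(lottery, weight, other):
    """Return the reduced lottery weight*lottery + (1 - weight)*other."""
    out = {}
    for x, p in lottery.items():
        out[x] = out.get(x, 0) + weight * p
    for x, p in other.items():
        out[x] = out.get(x, 0) + (1 - weight) * p
    # Drop consequences that end up with zero probability.
    return {x: p for x, p in out.items() if p != 0}


# Consequences in dollars: low, intermediate, high.
L, I, H = 0, 1_000_000, 5_000_000

A0 = {I: Fr(1)}
B0 = {H: Fr(10, 11), L: Fr(1, 11)}
w = Fr(11, 100)  # weight on the basic alternative; x receives 89/100

AL, BL = mix(A0, w, {L: Fr(1)}), mix(B0, w, {L: Fr(1)})
AI, BI = mix(A0, w, {I: Fr(1)}), mix(B0, w, {I: Fr(1)})
AH, BH = mix(A0, w, {H: Fr(1)}), mix(B0, w, {H: Fr(1)})

for name, lottery in [("A_L", AL), ("B_L", BL), ("A_I", AI),
                      ("B_I", BI), ("A_H", AH), ("B_H", BH)]:
    print(name, {x: str(p) for x, p in sorted(lottery.items())})
```

Exact fractions are used so that the reduced probabilities can be compared with the figure without rounding error.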
Although for illustrative purposes we have only considered the consequences $0, $1,000,000 and $5,000,000 and the composition probability .89, it seems reasonable to expect violations of expected utility for other consequence values and other probability levels. This leads to a more general structure of decision problems, illustrated in Fig. 3.4. \(A_0\) is a sure prospect of the intermediate consequence I. \(B_0\) offers a q chance at the most preferred outcome H and a 1-q chance at the least preferred outcome L. The \(A_x^\beta\) (\(B_x^\beta\)) alternative is obtained from the \(A_0\) (\(B_0\)) alternative by composing with the x-consequence at probability \(\beta\).

Figure 3.4: The HILO structure of three consequence lotteries

Note that the H (for "high"), I (for "intermediate") and L (for "low"), given in the boxes of Figure 3.4, exhaust the possible compositions from the basic problem (denoted as "0"). For ease of reference this will be called the "HILO" lottery structure.

Expected utility theory imposes some severe restrictions on the choices in this lottery structure. The strong independence principle requires that preference between two alternatives be preserved when each alternative is composed with a common alternative at the same probability. Since this is how the \(A_x^\beta\) and \(B_x^\beta\) alternatives are generated, it implies that the choice of alternative \(A_0\) entails the choice of \(A_x^\beta\), while the choice of alternative \(B_0\) entails the choice of \(B_x^\beta\), for all values of x and \(\beta\). Hence, the choice of an A alternative in one of the cases of the HILO structure and a choice of a B alternative in another case (such as the standard Allais choice of \(A_I^\beta\), \(B_L^\beta\)) violates expected utility.
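The restriction that strong independence places on the HILO structure can be illustrated numerically. This is a sketch under stated assumptions: the utility values in `u` are hypothetical, `w` denotes the weight placed on the basic alternative in the composition (the remaining weight 1 - w goes to the sure consequence x), and the helper names are illustrative. Under expected utility the gap EU(A_x) - EU(B_x) equals w times the gap EU(A_0) - EU(B_0), so its sign cannot depend on x or on w.

```python
from fractions import Fraction as Fr


def expected_utility(lottery, u):
    """Expected utility of a simple lottery {consequence: probability}."""
    return sum(p * u[x] for x, p in lottery.items())


def compose(base, w, x):
    """Return w*base + (1 - w)*(consequence x for sure)."""
    out = {k: w * p for k, p in base.items()}
    out[x] = out.get(x, 0) + (1 - w)
    return out


# Hypothetical utilities for the low, intermediate and high consequences.
u = {"L": Fr(0), "I": Fr(9, 10), "H": Fr(1)}

q = Fr(10, 11)
A0 = {"I": Fr(1)}
B0 = {"H": q, "L": 1 - q}
base_gap = expected_utility(A0, u) - expected_utility(B0, u)

gaps = {}
for w in (Fr(1, 4), Fr(11, 100), Fr(1)):
    for x in ("L", "I", "H"):
        gap = (expected_utility(compose(A0, w, x), u)
               - expected_utility(compose(B0, w, x), u))
        gaps[(w, x)] = gap
        # The common component (1 - w)*u[x] cancels, leaving w * base_gap.
        assert gap == w * base_gap

print("base gap:", base_gap)
print("all composed gaps share its sign:",
      all((g > 0) == (base_gap > 0) for g in gaps.values()))
```

With these particular utilities the B alternative is preferred in every case; choosing other values for `u` can flip the sign of the base gap, but never the sign in only some of the cases.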
In addition to these violations across problems, violations may occur within each problem, since \(A_x^{\beta_1}\) may be chosen for some particular level \(\beta_1\) while \(B_x^{\beta_2}\) may be chosen for some different level \(\beta_2\).

Although empirical results are not available for all the across-case and within-case combinations (since this lottery structure has not appeared in the literature), there are results for some of the particular cases. Table 3.1 summarizes the choices from the main empirical studies in terms of the HILO lottery structure. From this table it should be apparent that most of the effort has been devoted to studying various versions of the standard Allais paradox, the I-L case.†

Table 3.1: Choices from the main empirical studies in terms of the HILO lottery structure. Cases tabulated: H-H case (\(\beta > \beta'\)), 0-H case, L-L case (\(\beta > \beta'\)), 0-L case, and I-L case (standard Allais paradox).

Note that while the frequency of choosing the violating choices, \(A_I^\beta\), \(B_L^\beta\), varies across studies, the violation seems robust over quite different levels of consequences and probabilities. Receiving increasing attention recently has been the L-L pattern, \(A_L^{\beta_1}\), \(B_L^{\beta_2}\) (including the special case 0-L, of choices \(A_0\), \(B_L^{\beta}\)).

The only studies which have considered several cases simultaneously are those of MacCrimmon and Larsson (1975) and Kahneman and Tversky (1979). The former study introduced the 0-H and the H-H cases in the context of negative outcomes and is the only attempt to map patterns of preferences between \(A_x^\beta\) and \(B_x^\beta\) for various levels of \(\beta\) (including the special "0" case of \(\beta = 1.0\)) and various levels of the intermediate outcome, I.
It seems clear from Table 3.1 that our understanding of actual choices for the decision problems in Figure 3.4 is incomplete. Studies thus far conducted have covered the I-L, L-0, and L-L cases, for gains, and the H-0 and H-H cases, for losses. With the HILO lottery structure, there are potentially six distinct binary violation patterns, "I-L", "H-I", "I-0", "L-0", "H-0" and "H-L", across problems and three binary violation patterns, "H-H", "I-I" and "L-L", within problems. The cases "H-I", "I-0", "I-I" and "H-L" remain unexplored. To obtain a more complete picture, a systematic study of choices related to Figure 3.4 is needed.

† The terminology "I-L case" refers to the pair of lotteries comprising the I case being presented in conjunction with the lotteries of the L case.

3.3 CONCURRENCE OF RISK SEEKING AND RISK AVERTING BEHAVIOR

We shall not discuss here the existence of non-overlapping risk-seeking and risk-averting regions of a utility function (see e.g. Fig. 3.1) corresponding to the purchase of lottery tickets, purchase of insurance and greater risk-seeking propensity for lotteries involving losses. This has been given adequate coverage elsewhere (Friedman and Savage, 1948; Markowitz, 1952; Kahneman and Tversky, 1979). Instead, we focus our attention on the possible concurrence of risk proneness and risk aversion within the same region, thus negating any explanation based on modifications of the von Neumann-Morgenstern utility function.

Several measurement procedures to elicit a decision maker's von Neumann-Morgenstern utility function are based on lottery comparisons of the sort given in Figure 3.5.

Figure 3.5: Standard lottery comparison

Lottery A is a sure consequence of \(X_c\), an amount greater than \(X_\ell\) but less than \(X_g\). Lottery B is a p chance of getting \(X_g\) and a 1-p chance of getting \(X_\ell\).
If we fix the amounts in lottery B at \(X_g^0\) and \(X_\ell^0\) respectively but allow \(X_c\) and p to vary, then the pairs of numbers \((X_c, p)\) such that lottery A is indifferent to lottery B define a function which we denote by \(u_c(X)\), i.e. lottery A is indifferent to lottery B whenever \(p = u_c(X_c)\). We denote an affine transformation of \(u_c(X)\) by \(\bar u_c(X)\), i.e. \(\bar u_c(X) = a\,u_c(X) + b\), for some a, b with a > 0.

Suppose the decision maker is an expected utility maximizer with von Neumann-Morgenstern utility function \(u(X)\). Then A indifferent to B implies that

\[ u(X_c) = u_c(X_c)\,u(X_g^0) + (1 - u_c(X_c))\,u(X_\ell^0), \]

or alternatively

\[ u(X_c) = [u(X_g^0) - u(X_\ell^0)]\,u_c(X_c) + u(X_\ell^0), \tag{3.1} \]

which is an affine transformation of \(u_c(X_c)\). Therefore, the function \(u_c(X)\) is a von Neumann-Morgenstern utility function. The above measurement procedure is usually known as the Certainty Equivalent method.

We can alternatively fix the sure amount in A at \(X_c^0\) and the loss amount in B at \(X_\ell^0\) and obtain the pairs \((X_g, p_g(X_g))\) such that A remains indifferent to B. Applying expected utility again, we obtain the following relation:

\[ u(X_c^0) = p_g(X_g)\,u(X_g) + (1 - p_g(X_g))\,u(X_\ell^0), \]

or alternatively

\[ u(X_g) = \frac{u(X_c^0) - u(X_\ell^0)}{p_g(X_g)} + u(X_\ell^0). \tag{3.2} \]

Since \(u(X)\) is related to \(p_g^{-1}(X) = 1/p_g(X)\) through an affine transformation, \(p_g^{-1}(X)\) is also a von Neumann-Morgenstern utility function; this elicitation procedure is called the Gain Equivalent method.

Similarly, we can determine the von Neumann-Morgenstern utility using the Loss Equivalent method by fixing \(X_g^0\) and \(X_c^0\) to obtain the pairs \((X_\ell, p_\ell(X_\ell))\) such that a \(p_\ell(X_\ell)\) chance of \(X_g^0\) versus getting \(X_\ell\) otherwise is indifferent to getting \(X_c^0\) for sure. We apply expected utility again and obtain:

\[ u(X_\ell) = -\frac{u(X_g^0) - u(X_c^0)}{1 - p_\ell(X_\ell)} + u(X_g^0). \tag{3.3} \]

Therefore, \(-1/(1 - p_\ell(X_\ell))\) is a von Neumann-Morgenstern utility function.

The certainty, gain and loss equivalent methods are obtained from Fig. 3.5 by holding constant the pairs \((X_\ell, X_g)\), \((X_\ell, X_c)\) and \((X_c, X_g)\) respectively.
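The certainty equivalent and gain equivalent relations can be illustrated with a simulated expected utility maximizer. This is a minimal sketch, not an elicitation procedure: the square-root utility, the fixed amounts, and the function names `ce_probability` and `ge_probability` are all assumptions made for illustration. It checks that the elicited indifference probabilities recover u through the affine relation (3.1), and through the corresponding relation between u and the reciprocal of the gain equivalent probability.

```python
import math


def ce_probability(u, x_c, x_gain0, x_loss0):
    """Probability p = u_c(x_c) at which 'x_c for sure' is indifferent to
    'x_gain0 with probability p, x_loss0 otherwise' under expected utility."""
    return (u(x_c) - u(x_loss0)) / (u(x_gain0) - u(x_loss0))


def ge_probability(u, x_gain, x_c0, x_loss0):
    """Probability p = p_g(x_gain) at which 'x_c0 for sure' is indifferent to
    'x_gain with probability p, x_loss0 otherwise' under expected utility."""
    return (u(x_c0) - u(x_loss0)) / (u(x_gain) - u(x_loss0))


u = math.sqrt  # hypothetical von Neumann-Morgenstern utility
x_loss0, x_c0, x_gain0 = 0.0, 25.0, 100.0  # hypothetical fixed amounts

# Certainty equivalent method: u is an affine transformation of u_c.
for x_c in (1.0, 4.0, 25.0, 64.0):
    p = ce_probability(u, x_c, x_gain0, x_loss0)
    recovered = (u(x_gain0) - u(x_loss0)) * p + u(x_loss0)
    assert math.isclose(recovered, u(x_c))

# Gain equivalent method: u is an affine transformation of 1 / p_g.
for x_g in (36.0, 49.0, 100.0, 400.0):
    p = ge_probability(u, x_g, x_c0, x_loss0)
    recovered = (u(x_c0) - u(x_loss0)) / p + u(x_loss0)
    assert math.isclose(recovered, u(x_g))

print("both methods recover u up to an affine transformation")
```

For a true expected utility maximizer the two procedures must agree; the empirical findings discussed in this section concern subjects for whom they do not.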
If instead we hold \(p^0\) constant, then there are three remaining candidates for additional methods, known collectively as the Chaining Methods, which are obtained by holding constant the pairs \((p^0, X_\ell^0)\), \((p^0, X_c^0)\) and \((p^0, X_g^0)\) respectively. Of the three cases, we describe only the first, i.e. fixing \((p^0, X_\ell^0)\), which is more often used in practice. Beginning with an amount \(X_1\) larger than \(X_\ell^0\), determine an amount \(X_2\) such that a sure consequence of \(X_2\) in A would be indifferent to a \(p^0\) chance of getting \(X_1\) and obtaining \(X_\ell^0\) otherwise in B. Determine a third value \(X_3\) by replacing \(X_1\) by \(X_2\) and repeating the process. Thus, we obtain a decreasing sequence \((X_1, X_2, X_3, \ldots)\) with the following property:

\[ u(X_{i+1}) = p^0 u(X_i) + (1 - p^0)\,u(X_\ell^0). \tag{3.4} \]

Assigning an arbitrary value to \(u(X_\ell^0)\) and a greater value to \(u(X_1)\), we observe that expression (3.4) determines the von Neumann-Morgenstern utility on the decreasing set of points \((X_1, X_2, X_3, \ldots)\). We shall denote the utility function obtained using the chaining method with fixed probability \(p^0\) and fixed loss amount \(X_\ell^0\) by \(u_{p^0}(X)\).

As we have noted earlier, if the decision maker is a 'true' expected utility maximizer, then the utility functions \(u_c\), \(u_g\), \(u_\ell\) and \(u_{p^0}\) would be affine transformations of each other, so that any one of them suffices. Allais (1979, Appendix C), in an experiment conducted in 1952, found that the \(u_c\) and \(u_{1/2}\) curves obtained from the same subjects are generally very different. A fairly typical plot is given in Figure 3.6. Note that under the expected utility hypothesis, the convex region near X=0 of \(u_c\) is not compatible with the global concavity of \(u_{1/2}\). Behaviorally, the risk seeking region of \(u_c\) corresponds to our intuition about the psychology of lottery purchase: people tend to forgo a small certain amount in favour of a small chance of a large gain. The concavity of \(u_{1/2}\), in turn, reaffirms the reluctance of individuals to engage in symmetric bets.
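The chaining recursion (3.4) can be sketched for a simulated expected utility maximizer. The utility function, anchor values and function names below are hypothetical choices made for illustration. The first function generates the decreasing sequence of amounts such a decision maker would report; the second assigns utilities to those amounts from two arbitrary anchor values, as described above. The assigned values then differ from the true utilities only by an affine transformation.

```python
import math


def chain_amounts(u, u_inv, x_first, x_loss0, p0, n):
    """Amounts X_1 > X_2 > ... reported by an expected utility maximizer:
    X_{i+1} for sure is indifferent to (X_i with probability p0,
    x_loss0 otherwise)."""
    xs = [x_first]
    for _ in range(n - 1):
        xs.append(u_inv(p0 * u(xs[-1]) + (1 - p0) * u(x_loss0)))
    return xs


def chain_utilities(u_loss, u_first, p0, n):
    """Utilities assigned to X_1, X_2, ... by recursion (3.4), starting from
    arbitrary anchor values u_loss < u_first."""
    values = [u_first]
    for _ in range(n - 1):
        values.append(p0 * values[-1] + (1 - p0) * u_loss)
    return values


u = math.sqrt               # hypothetical true utility
u_inv = lambda y: y * y     # its inverse on the non-negative reals

xs = chain_amounts(u, u_inv, x_first=100.0, x_loss0=0.0, p0=0.5, n=4)
assigned = chain_utilities(u_loss=0.0, u_first=1.0, p0=0.5, n=4)

for x, value in zip(xs, assigned):
    # With these anchors the assigned utility equals u(x) / u(X_1).
    assert math.isclose(value, u(x) / u(100.0))
    print(x, value)
```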
In an ongoing study on the risk attitudes of top-level business managers carried out by MacCrimmon and others (1972), chaining and gain equivalents were among the methods used to assess von Neumann-Morgenstern utility functions. Fig. 3.7 displays a typical pair of curves \(u_{p^0}\) and \(u_g\) obtained from a subject using the chaining method and gain equivalent method respectively. Note again that the convexity of \(u_{p^0}\) near X=0 is inconsistent, under the expected utility hypothesis, with the concavity of \(u_g\) in the same region. The same inconsistency also applies, though in the opposite direction, to the convex region of \(u_g\) versus the concavity of \(u_{p^0}\) beyond its initial convex region.

Even though the empirical evidence on the concurrence of risk proneness and risk aversion within the same range of wealth levels is scant and fragmentary, what we already know about the actual application of different methods to elicit utility functions suggests that expected utility does not account for the results. This suggests that further investigation of the concurrence of risk proneness and risk aversion, and especially of the mutual incompatibility of the different measurement procedures used to obtain von Neumann-Morgenstern utility functions, is warranted.

3.4 SOME PROBLEMS WITH PROBLEM REPRESENTATION

Problem representation and its influence on preferences is a relatively untouched area of research on decision-making. Normatively, a decision maker's preference should not depend on the way alternatives are perceived or represented as long as it does not affect the desirability of the underlying consequences of his alternatives. That this may not be the case is demonstrated by Kahneman and Tversky (1979) through a class of phenomena termed Isolation Effects. Consider the choice between A and B in Figure 3.8.

Fig. 3.8: Graphical representation of two lotteries

Fig. 3.9: A sequential representation of lottery B

If you are an expected utility decision maker, then your preference does not depend on how the probabilities of final outcomes are obtained, so that lottery C in Fig. 3.9 is equivalent to lottery B in Fig. 3.8, i.e. preference between A and B should be in the same direction as preference between A and C. Kahneman and Tversky found that, for one group of subjects (n = 95), 65% prefer lottery B to lottery A. However, the modal preference pattern between lottery A and lottery C for another group of subjects is found to be the opposite (78% prefer A to C; with n = 141). The problem description that elicited the modal preference between lottery C and lottery A is reproduced below. (4000,.80) refers to a lottery that pays $4000 with .8 probability and $0 with .2 probability, and (3000) denotes the sure consequence of $3000.

Consider the following two-stage game. In the first stage, there is a probability of .75 to end the game without winning anything, and a probability of .25 to move into the second stage. If you reach the second stage you have a choice between (4000,.80) and (3000). Your choice must be made before the game starts, i.e., before the outcome of the first stage is known.

The problem description above focuses the subjects' attention on the choice between (4000,.80) and (3000) rather than on the common outcome $0 with the same probability .75. Kahneman and Tversky conjectured that most subjects then ignore the common outcome-probability component under the above problem representation, so that their choice becomes identical to that between (4000,.80) and (3000).

Another kind of isolation effect considered by Kahneman and Tversky is related to Markowitz's (1952) observation that preferences are relatively independent of current wealth levels. They presented the following problems to two different groups of subjects.

Problem 1. In addition to whatever you own, you have been given 1000.
You are now asked to choose between A: (1000,.50) and B: (500).

Problem 2. In addition to whatever you own, you have been given 2000. You are now asked to choose between C: (-1000,.50) and D: (-500).

The majority of subjects chose B in Problem 1 and C in Problem 2. Note, however, that in terms of final outcomes, the two choice problems are equivalent, i.e. you are either $1500 richer if you choose B or D, or you have an even chance of ending up with $1000 or $2000 more if you choose A or C. People, however, seem to perceive Problem 1 as a choice between (1000,.50) and (500), and Problem 2 as a choice between (-1000,.50) and (-500), with the lump sums of $1000 in Problem 1 and $2000 in Problem 2 safely integrated into their current wealth levels.

As an alternative to the final outcome position normally associated with expected utility, Kahneman and Tversky proposed that people perceive outcomes as gains and losses relative to some neutral reference point. Allowing the reference point to be determined by the decision maker in the context of the choice situation he faces, expected utility, with a utility function defined on changes in asset position relative to the reference point, is compatible with the modal preferences in Problems 1 and 2. The reference point need not be the status quo, especially when the choice situation involves a sure gain of $1000.

3.5 SUMMARY

We have reviewed briefly some of the literature on empirical evidence that contradicts the implications of expected utility. It has been classified under the headings: systematic violations of the strong independence principle (section 3.2), concurrence of risk proneness and risk aversion (section 3.3), and some problems with problem representation (section 3.4). The phenomena considered include the Allais paradox and its various modifications, incompatibility among different methods of measuring a von Neumann-Morgenstern utility function, and Kahneman and Tversky's isolation effects.
In the next chapter, we generalize expected utility theory by applying the representation theorems of Part I as a theory of choice. We then explore the descriptive implications of our generalization with respect to the phenomena considered in this critique.

4 A NEW THEORY

We develop in this chapter a new theory of choice called alpha utility theory which generalizes expected utility by interpreting, in sections 1 and 2, the representation theorems of Chapters 1 and 2 in terms of choice among lotteries. We explore in section 3 some normative implications of our theory, including consistency conditions with stochastic dominance and local and global risk aversion. The question of descriptive validity is considered in section 4, where we show that alpha utility is compatible with the phenomena reviewed in the critique of expected utility (Chapter 3). We end the chapter with a comparison of our theory with two other alternative theories of choice.

4.1 INTERPRETING MEAN VALUE AS CERTAINTY EQUIVALENT

In Chapter 1, we proved a representation theorem of a mean value functional, M, for probability distributions subject to a necessary and sufficient set of axioms. The present section explores the implications of our representation theorem for choice among lotteries by interpreting mean values as certainty equivalents.² In the ensuing discussion, we assume that lotteries, defined on some single-attribute consequence set, e.g., monetary gains, can be represented by probability distributions defined on the real line. (The case of a more general consequence space is considered in section 4.2.)

² This was discussed in Chew (1979).

The decision maker is assumed to have complete and transitive preference over lotteries. The following axiom asserts that he is able to assign, corresponding to any lottery F, a certainty equivalent M(F), which is an amount such that the decision maker is indifferent between getting it for sure and taking the lottery F.
D denotes the space of probability distributions (lotteries) defined on the real line.

Axiom M0: Existence
\(\forall F \in D\), \(M(F)\) exists.

Axiom M0 rules out infinite certainty equivalents, thus pre-empting any possibility of a St. Petersburg type paradox. The following five axioms M1-M5 are taken directly from Chapter 1. We did not state existence (M0) as an axiom in Chapter 1 because it is not an intrinsic property of mean values. For example, the arithmetic mean does not always exist.

Axiom M1: Certainty Consistency
\(M(\delta_x) = x \quad \forall x \in R\).

It is difficult to take issue with Axiom M1, which requires that the certainty equivalent of \(\delta_x\), i.e. getting x for sure, is x. Another normatively appealing axiom is the following:

Axiom M2: Betweenness
\(\forall F, G \in D\), if \(M(F) < M(G)\), then \(\forall \beta \in (0,1)\), \(M(\beta F + (1-\beta)G) \in (M(F), M(G))\).

Axiom M2 requires that the certainty equivalent of a mixture of two lotteries be intermediate in value between the certainty equivalents of the respective lotteries. The next axiom weakens the axiom of quasilinearity of Hardy, Littlewood and Polya, or the "substitution of lotteries" principle of Pratt, Raiffa and Schlaifer.

Axiom M3: Weak Substitution
\(\forall F, G \in D\), if \(M(F) = M(G)\), then \(\forall \beta \in (0,1)\ \exists \gamma \in (0,1)\) such that \(\forall H \in D\), \(M(\beta F + (1-\beta)H) = M(\gamma G + (1-\gamma)H)\).

Suppose F and G are two lotteries with the same certainty equivalents, i.e. indifferent to each other. Consider the mixtures or compound lotteries \(\beta F + (1-\beta)H\) and \(\gamma G + (1-\gamma)H\) of the respective lotteries with a third lottery H at probabilities \(\beta\) and \(\gamma\). Weak substitution weakens the substitution principle in the sense that the mixture probabilities \(\beta\) and \(\gamma\) that preserve equality of certainty equivalents, or indifference, need not be the same. However, it requires that these mixture probabilities, once determined, be independent of the third lottery H.
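Weak substitution can be illustrated for simple lotteries when the certainty equivalent has the ratio form of alpha utility. The sketch below rests on several assumptions that are not taken from the text: the particular functions `alpha` and `v`, the lotteries F, G and H, and the closed form used in `matching_gamma`, which is obtained by equating the two mixture ratios and is consistent with the constant odds ratio described earlier as Ratio Consistency. Because v is the identity here, the ratio itself is the certainty equivalent.

```python
from fractions import Fraction as Fr


def alpha_ratio(lottery, alpha, v):
    """E(alpha*v, F) / E(alpha, F) for a simple lottery {outcome: prob}."""
    num = sum(p * alpha(x) * v(x) for x, p in lottery.items())
    den = sum(p * alpha(x) for x, p in lottery.items())
    return num / den


def mix(F, w, H):
    """Return the reduced lottery w*F + (1 - w)*H."""
    out = {}
    for x, p in F.items():
        out[x] = out.get(x, 0) + w * p
    for x, p in H.items():
        out[x] = out.get(x, 0) + (1 - w) * p
    return out


def matching_gamma(beta, F, G, alpha):
    """Mixture weight for G whose odds gamma/(1-gamma) equal the odds
    beta/(1-beta) scaled by E(alpha, F) / E(alpha, G)."""
    a_F = sum(p * alpha(x) for x, p in F.items())
    a_G = sum(p * alpha(x) for x, p in G.items())
    return beta * a_F / (beta * a_F + (1 - beta) * a_G)


alpha = lambda x: Fr(1 + x)   # hypothetical strictly positive weight function
v = lambda x: Fr(x)           # identity, so the ratio equals M(F)

F = {1: Fr(1)}                    # 1 for sure
G = {0: Fr(3, 4), 2: Fr(1, 4)}    # chosen so that M(G) = M(F) = 1
assert alpha_ratio(F, alpha, v) == alpha_ratio(G, alpha, v) == 1

beta = Fr(1, 2)
gamma = matching_gamma(beta, F, G, alpha)
print("beta =", beta, " gamma =", gamma)

# The same gamma works for every third lottery H tried here.
for H in ({3: Fr(1)}, {0: Fr(1, 2), 5: Fr(1, 2)}, {2: Fr(1)}):
    lhs = alpha_ratio(mix(F, beta, H), alpha, v)
    rhs = alpha_ratio(mix(G, gamma, H), alpha, v)
    assert lhs == rhs
```

Mixing G with H at the same weight as F generally breaks the equality unless alpha is constant, which is the expected utility case.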
The next two axioms are technical assumptions which are introduced to ensure that our notion of certainty equivalent is well behaved relative to certain limiting operations on probability distributions. Axiom M4 permits the approximation of lotteries with numerous discrete outcomes by a continuous distribution, e.g., a uniform distribution.

Axiom M4: Continuity
If $\{F_n\}$ converges in distribution to F and F has compact support, then $M(F) = \lim_{n\to\infty} M(F_n)$.

For a more technical discussion of Axiom M4, see Chapter 1. Our last axiom, stated below, asserts that the certainty equivalent of a lottery $F_K$, truncated by the interval K, is 'close' to that of the original lottery F if the interval of truncation K is sufficiently large.

Axiom M5: Extension
Let $\{K_n\}_{n=0}^{\infty}$ be an increasing family of compact intervals such that $\lim_{n\to\infty} K_n = R$. Then $M(F) = \lim_{n\to\infty} M(F_{K_n})$, $\forall F \in D$.

We are now ready to restate Theorem 1.3 of Chapter 1 in terms of certainty equivalents. Suppose a decision maker assigns a certainty equivalent M(F) corresponding to a lottery F. Then the assignment of M(F) satisfies Axioms M1-M5 if and only if there exist a strictly positive valued function $\alpha$ and a strictly increasing function v such that $\forall F \in D$,

    $M(F) = v^{-1}\left( \int_R \alpha v\,dF \Big/ \int_R \alpha\,dF \right)$.    (4.1)

Moreover, if $(\alpha^*, v^*)$ is another pair of functions that satisfies condition (4.1), then, for every interval $[A,B] \subset R$, there exist constants a, b, c, k with a, c, k > 0 such that $\forall x \in [A,B]$,

    $v^*(x) = a\,\frac{k[v(x) - v(A)]}{k[v(x) - v(A)] + v(B) - v(x)} + b$, and
    $\alpha^*(x) = c\,\alpha(x)\,\{k[v(x) - v(A)] + v(B) - v(x)\}$.    (4.2)

Note that the expression for certainty equivalent (4.1) is more general than the corresponding expression for expected utility,

    $M(F) = v^{-1}\left( \int_R v\,dF \right)$,    (4.3)

which results when $\alpha$ is constant. The uniqueness relation (4.2) subsumes, as special cases (k = 1), an affine transformation for v and a positive scalar transformation for $\alpha$. To avoid the possibility of any St.
Petersburg type paradox, we have asserted through Axiom M0 that the certainty equivalent M(F) must be finite for any lottery F. We can then apply Corollary 1.3 in Chapter 1 to conclude that either v is bounded or $\alpha v$ is bounded.

Thus, we have obtained a generalization of expected utility, called alpha utility, in the sense that the decision maker's preference over lotteries is represented by a more general expression for his certainty equivalent, characterized by an additional alpha-function. We shall defer discussion of the actual shapes of the $\alpha$ and v functions till the sections after next. The next section contains a parallel development of the above theory using the representation theorems of Chapter 2.

4.2 REPRESENTATION OF A PREFERENCE BINARY RELATION

In the preceding section, we developed alpha utility theory for lotteries defined on a single-attribute consequence set, e.g., monetary lotteries, via an axiomatization of the decision maker's assignment of certainty equivalents. We showed, using the mean value representation theorems of Chapter 1, that the certainty equivalent M(F) corresponding to a lottery F is given by:

    $M(F) = v^{-1}\left( \int_R \alpha v\,dF \Big/ \int_R \alpha\,dF \right)$    (4.4)

for some strictly positive function $\alpha$ and strictly increasing function v. When a lottery F is preferred to a lottery G, the certainty equivalent M(F) is greater than that of G, M(G). It follows (since $v^{-1}$ is order-preserving) that

    $\int_R \alpha v\,dF \Big/ \int_R \alpha\,dF > \int_R \alpha v\,dG \Big/ \int_R \alpha\,dG$.    (4.5)

Observe that the preference represented by the above functional is more general than that of expected utility, $\int_R v\,dF$, which is a special case of (4.5) when $\alpha$ is constant.

The present section applies the representation theorems of Chapter 2 to provide an alternative axiomatization of alpha utility theory [3]. Unlike the certainty equivalent approach, we do not limit ourselves to choice situations where the consequences are monetary values or quantities of some commodity.
For example, the relevant consequence set for a child contemplating whether to steal a cake from a bakery may be: status quo, having the cake, and getting caught in the process. We denote by X (= {x, y, z, ...}) the consequence set corresponding to a choice situation and by $L_X$ (= {P, Q, R, ...}) the space of simple probability measures defined on X (cf. Chapter 1). A simple probability measure is completely specified by knowledge of the probabilities of occurrence of a finite number of consequences. The child in the above example may feel that he has an even chance of getting the cake without being caught. Simple probability measures are convenient representations of actual lotteries or risky decisions when the probabilities of occurrence of the underlying consequences can be subjectively estimated or determined based on symmetry considerations, e.g., a game of craps or roulette.

[3] This was discussed in Chew and MacCrimmon (1979a).

We denote the alternative of obtaining some consequence x for sure by $\delta_x$. A finite lottery P is then represented as a probability weighted combination of sure consequences:

    $P = \sum_{i=1}^{n} p_i\,\delta_{x_i}$,    (4.6)

where $p_i$ is the probability of occurrence of the consequence $x_i$. Our representation functional for simple probability measures, corresponding to that obtained from the certainty equivalent approach (see expression (4.5)), is:

    $E(\alpha v, P) \,/\, E(\alpha, P)$,    (4.7)

where $\alpha$ and v are functions on X, and $E(\cdot, P)$ denotes taking expectation of a function with respect to the simple probability measure P. The theorems of Chapter 2 show that the axioms on the strict preference binary relation '$\prec$' of a decision maker stated below are necessary and sufficient for representing his preference via expression (4.7).

Axiom U1: Ordering
$\prec$ is a weak order.
The strict preference relation $\prec$ is a weak order if it is asymmetric (i.e., if a lottery P is strictly preferred to a lottery Q, then the converse does not hold), and negatively transitive (i.e., if a lottery P is not strictly preferred to a lottery Q which is in turn not strictly preferred to another lottery R, then P is not strictly preferred to the lottery R). Both asymmetry and negative transitivity seem like basic consistency requirements for a strict preference relation that few would want to violate.

Axiom U2: Solvability
$\forall P, Q, R \in L_X$, $P \prec Q$ and $Q \prec R$ $\Rightarrow$ $\exists \beta \in (0,1)$ $\ni$ $\beta P + (1-\beta)R \sim Q$.

Axiom U2 says that whenever a lottery Q is between two lotteries P and R in preference, then there is a mixture between P and R which is indifferent to Q. The reasonableness of the above axiom stems from the intuition that a mixture between two lotteries is always intermediate in preference between them (see the Betweenness axiom in section 4.1 and also Axioms 3:B:a and 3:B:b in von Neumann and Morgenstern (1947)). It follows that a mixture with a greater probability weight on the better lottery should be preferred to another mixture with a smaller probability weight on the better lottery. This is the substance of the following axiom.

Axiom U3: Monotonicity
$\forall P, Q \in L_X$, $P \prec Q$ $\Rightarrow$ $\beta P + (1-\beta)Q \prec \gamma P + (1-\gamma)Q$ for $0 < \gamma < \beta < 1$.

The axioms discussed thus far are standard normative properties common to expected utility theory. The next axiom, which weakens the substitution principle or its close counterpart, the strong independence principle, is the only departure.

Axiom U4: Weak Independence
$\forall P, Q \in L_X$, $P \sim Q$ $\Rightarrow$ $\forall \beta \in (0,1)$, $\exists \gamma \in (0,1)$ $\ni$ $\forall R \in L_X$, $\beta P + (1-\beta)R \sim \gamma Q + (1-\gamma)R$.

Weak Independence is a restatement of the Weak Substitution axiom (M3) for certainty equivalents. Given two lotteries that are indifferent to each other, Weak Independence allows for different probabilities in composing each of these lotteries with a third lottery to preserve indifference.
However, these mixture probabilities, once determined, must be independent of the third lottery.

We are now ready to interpret the representation theorems of Chapter 2. A decision maker, whose preference among finite lotteries defined on a consequence set X satisfies Axioms U1-U4, chooses as if maximizing the functional $E(\alpha v, \cdot) / E(\alpha, \cdot)$ for some strictly positive-valued function $\alpha$ and real valued function v, both defined on the consequence set X.

When the $\alpha$ function is constant, the representation $E(\alpha v, \cdot)/E(\alpha, \cdot)$ becomes the expected utility representation $E(v, \cdot)$, with v assuming the role of a von Neumann-Morgenstern utility function. Although the approach taken in this section does not require the consequence set to be single-dimensional, the certainty equivalent approach has the advantage of being able to treat general probability distributions, e.g., continuous random variables or even unbounded ones like the normal distribution.

In the next two sections, where we discuss the normative and descriptive implications of alpha utility theory, we shall represent lotteries in terms of either probability distributions or simple probability measures. Representing lotteries with probability distributions is more appropriate when we deal with numerical outcomes, especially if the outcome can take on continuous values. Otherwise, simple probability measures would be used.

4.3 NORMATIVE IMPLICATIONS

Despite growing evidence demonstrating its descriptive inadequacy, expected utility remains dominant mainly because of the normative appeal of its underlying postulates and the elegance with which expected utility characterizes stochastic dominance (increasing utility function), local (Arrow-Pratt index) and global (concave utility function) risk aversion. We show in this section that alpha utility is also an attractive normative theory before considering its descriptive relevance in the following section.
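Before turning to these implications, it may help to see the representation $E(\alpha v, P)/E(\alpha, P)$ of section 4.2 in computational form. The sketch below is only an illustration: the numerical values assigned to `v` and `alpha` for the three consequences of the cake example are invented for this purpose and are not taken from the text.

```python
def expectation(f, lottery):
    # E(f, P) for a simple probability measure P given as {consequence: probability}.
    return sum(p * f[x] for x, p in lottery.items())

def alpha_utility(lottery, alpha, v):
    # The functional E(alpha*v, P) / E(alpha, P).
    av = {x: alpha[x] * v[x] for x in v}
    return expectation(av, lottery) / expectation(alpha, lottery)

# Hypothetical v and alpha values on a non-numerical consequence set.
v = {"caught": 0.0, "status quo": 0.6, "cake": 1.0}
alpha = {"caught": 1.0, "status quo": 0.5, "cake": 1.0}
constant = {x: 1.0 for x in v}   # constant alpha: expected utility

steal = {"cake": 0.5, "caught": 0.5}        # even chance of not being caught
abstain = {"status quo": 1.0}
hedged = {"status quo": 0.5, "cake": 0.5}   # an arbitrary third lottery

for name, lottery in (("steal", steal), ("abstain", abstain), ("hedged", hedged)):
    print(name,
          round(alpha_utility(lottery, alpha, v), 4),
          round(alpha_utility(lottery, constant, v), 4))
```

With a constant `alpha` the second printed column is ordinary expected utility. The two columns agree on lotteries whose outcomes all carry the same `alpha` value and differ otherwise, as for `hedged` here.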
4.3.1 Ratio Consistency

As we noted earlier, except for the Weak Substitution axiom (M3) or equivalently the Weak Independence axiom (U4), the other axioms of alpha utility are standard properties common to expected utility. We showed in Chapter 1 (Lemma 1.2) and Chapter 2 (Lemma 2.2) that Weak Independence together with Monotonicity implies the Ratio Consistency Property (Chapter 1, Property 5; Chapter 2, Definition 2.6), which is crucial to our proofs and to the assessment of the $\alpha$ function. The Ratio Consistency property has been given a rather striking geometrical interpretation by Weber (1980) [4].

[4] Ratio Consistency was originally Axiom U5 (see Chew and MacCrimmon, 1979a). It was also implicit in a stronger statement of Axiom M3 (see Chew, 1979). Weber (1980) demonstrated via "an approach through the analysis of iso-preference sets" the redundancy of Ratio Consistency with respect to the other axioms by showing that Theorem 2.1 holds without assuming Ratio Consistency. This motivated the statements of Lemma 1.2 and Lemma 2.2.

Consider a simplex formed by the three lotteries P, Q and R in Figure 4.1. Each lottery X in the simplex is specified by its barycentric (areal) coordinates $(\beta_1, \beta_2, \beta_3)$ given by the areas of the triangles XQR, RPX and XPQ. Assuming that the area of PQR is 1, X then represents a probability mixture $\beta_1 P + \beta_2 Q + \beta_3 R$ among the three vertices P, Q and R. The lottery Y (Z) is a mixture between P (Q) and R with weight $\beta$ ($\gamma$) on P (Q) and weight $1-\beta$ ($1-\gamma$) on R.

[Figure 4.1: Ratio Consistency illustrated using barycentric coordinates]

Mixing Y and Z with weights varying from 0 to 1 generates the line segment YZ. Suppose the lotteries P and Q are indifferent to each other; then Monotonicity implies that the line segment PQ is an isopreference set. For each Y on PR, Weak Independence implies there is a Z on QR such that Y and Z are indifferent. Hence YZ is an isopreference.
Project YZ to meet PQ extended at the point O with coordinates (s, 1-s, 0). (Note that either s > 1 (O is on the left of P) or s < 0 (O is on the right of Q).) The Ratio Consistency property simply requires all other isopreference sets such as Y'Z' joining points on PR to points on QR to originate at the same point O. That this is the case follows from the observation that, in terms of areas,

    $\triangle OZR = \triangle OYR + \triangle YZR$,    (4.8)

which is given by

    $\gamma s = \beta(s-1) + \gamma\beta$.    (4.9)

Solving for $\gamma$ gives

    $\gamma = \beta(s-1)/(s-\beta)$.    (4.10)

Let $\tau = (s-1)/s$. Expression (4.10) becomes

    $\frac{\gamma/(1-\gamma)}{\beta/(1-\beta)} = \tau$.    (4.11)

A similar observation for the line segment Y'Z' gives

    $\frac{\gamma'/(1-\gamma')}{\beta'/(1-\beta')} = \frac{\gamma/(1-\gamma)}{\beta/(1-\beta)}$.    (4.12)

Hence, line segments originating from the same point O satisfy the Ratio Consistency property (4.12). For expected utility, the mixture probabilities $\beta$ and $\gamma$ such that Y is indifferent to Z are always equal, so that the constant $\tau$ equals unity. Geometrically, the line segments YZ and Y'Z' are parallel.

The simplicial representation can now be used as in Weber (1980) to provide an alternative proof of Lemma 1.2 or equivalently Lemma 2.2 without having to solve the functional equation (1.6). The basic idea is to show, once an isopreference, say YZ, is found, that all line segments such as Y'Z' originating from the point O of intersection between the extensions of YZ and PQ are also isopreferences. Monotonicity, which implies that these isopreferences are non-intersecting, then ensures that there are no other isopreferences. Lemma 2.2 is proved below using the simplicial approach.

Lemma 2.2: Ordering (U1), Monotonicity (U3) and Weak Independence (U4) imply Ratio Consistency (Definition 2.6).

Proof: Suppose $P \sim Q$ but not $(P \sim R)$ in Figure 4.2. Then from Monotonicity, PQ is an isopreference. Corresponding to a mixture $Y = \beta P + (1-\beta)R$, Weak Independence implies that there is a mixture $Z = \gamma Q + (1-\gamma)R$ such that $Y \sim Z$, so that YZ is also an isopreference. Let O be the point of intersection of the extensions of YZ and PQ.
(Suppose without loss of generality that O is on the left of P.)

[Figure 4.2: Geometric proof of the Ratio Consistency property. Labelled points: R(0,0,1), O(s,1-s,0), P(1,0,0), A, Q(0,1,0).]

Consider another line segment Y'Z' below YZ originating from the same point O. We shall show that Y'Z' is also an isopreference and, by the preceding discussion, satisfies the Ratio Consistency property (4.12). Draw the median from R to bisect PQ at A. Draw a line from Y parallel to RA and intersecting Y'Z' at S. Produce PS to meet RA at V. Complete the triangle PVQ by connecting V and Q. Denote by T the intersection between VQ and Y'Z'. Draw a line from Z parallel to RA and intersecting VQ at T'. To see that O, S and T' are collinear, and hence that T' must be the same as T, view the figure in three dimensions with V the top vertex of a tetrahedron with base PQR. The plane through the parallel lines YS and ZT' contains O, S and T'. The plane through PV and Q also contains O, S and T'. The conclusion that Y'Z' is also an isopreference follows from applying Weak Independence to $S = \beta P + (1-\beta)V$ and $T = \gamma Q + (1-\gamma)V$. A similar argument establishes the result for the case of Y'Z' above YZ. Q.E.D.

4.3.2 Assessment

The derivation of a procedure for eliciting alpha utility functions directly from the proofs of the representation theorems is a feature that is shared with expected utility and sets alpha utility apart from other alternative theories in the literature. This is illustrated below via a simple consequence set X with distinct outcomes L, I, and H arranged in ascending order of preference. Chapter 3, section 1 contains a discussion of the systematic violations of the Strong Independence principle by stating the empirical studies on the Allais paradox in terms of lotteries defined on such a 3-outcome consequence set.
Suppose an alpha utility decision maker chooses among lotteries defined on such a consequence set. Then we can measure his $\alpha$ and v functions in the following way. Set the v-values for the worst and the best consequences to 0 and 1 respectively, i.e., v(L) = 0 and v(H) = 1. The v-value of the intermediate outcome I is given by the probability q such that the decision maker is indifferent between the sure consequence I and a q chance of obtaining H and 1-q chance of obtaining L, as illustrated in Fig. 4.3.

[Figure 4.3: Probability Equivalent Method. v(L) = 0; v(H) = 1; v(I) = q such that $P \sim Q$.]

Having constructed the v-function, we form in Fig. 4.4 the following lotteries to determine the $\alpha$ function.

[Figure 4.4: Test of Substitution Axiom]

If the decision maker subscribes to the Substitution principle, P' and Q' would be indifferent whenever $\beta$ equals $\gamma$, since Q is constructed to be indifferent to P. If, however, P' and Q' are indifferent with $\beta$ and $\gamma$ unequal, then Ratio Consistency tells us that

    $\frac{\gamma/(1-\gamma)}{\beta/(1-\beta)} = \tau$, a constant.    (4.13)

The decision maker's $\alpha$-function is then given by $\alpha(I) = \tau$ and $\alpha(L) = \alpha(H) = 1$, i.e., we assign 1 to the $\alpha$-values of the best (H) and the worst (L) outcomes of the consequence set X. (See the sufficiency proof of Theorem 2.1 in Chapter 2 for more details.)

For consequence sets with more than three outcomes, the above procedure can be repeated for the other intermediate outcomes. In the case of an interval of the real line, the v and $\alpha$ functions can be obtained by interpolating among a finite number of measurement points. Just as there are many different ways to measure von Neumann-Morgenstern utility functions, this would also be the case for alpha utility. Since we only touch on the issue of assessment here, more work is needed. In the next sub-sections, we obtain conditions for consistency of alpha utility with stochastic dominance, local and global risk aversion.
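The assessment procedure for a three-outcome consequence set can be sketched in a few lines of Python. This is an illustration under stated assumptions: the elicited numbers q, $\beta$ and $\gamma$ are made up, and, since Figure 4.4 is not reproduced here, P' and Q' are assumed to be the mixtures $\beta P + (1-\beta)R$ and $\gamma Q + (1-\gamma)R$ with a common third lottery R, in line with Axiom U4. The function names are hypothetical.

```python
def alpha_utility(lottery, alpha, v):
    # E(alpha*v, P) / E(alpha, P) for a lottery given as {consequence: probability}.
    num = sum(p * alpha[x] * v[x] for x, p in lottery.items())
    den = sum(p * alpha[x] for x, p in lottery.items())
    return num / den

def assess(q, beta, gamma):
    # v(I) from the probability equivalent q; alpha(I) from the ratio (4.13).
    tau = (gamma / (1.0 - gamma)) / (beta / (1.0 - beta))
    v = {"L": 0.0, "I": q, "H": 1.0}
    alpha = {"L": 1.0, "I": tau, "H": 1.0}
    return v, alpha

def mix(b, first, second):
    out = {}
    for x, p in first.items():
        out[x] = out.get(x, 0.0) + b * p
    for x, p in second.items():
        out[x] = out.get(x, 0.0) + (1.0 - b) * p
    return out

# Hypothetical elicited responses.
q, beta, gamma = 0.8, 0.5, 0.4
v, alpha = assess(q, beta, gamma)

P = {"I": 1.0}                 # intermediate outcome for sure
Q = {"H": q, "L": 1.0 - q}     # indifferent to P by construction
gaps = []
for R in ({"L": 1.0}, {"H": 1.0}, {"L": 0.3, "I": 0.3, "H": 0.4}):
    p_prime = alpha_utility(mix(beta, P, R), alpha, v)
    q_prime = alpha_utility(mix(gamma, Q, R), alpha, v)
    gaps.append(abs(p_prime - q_prime))
print("alpha(I) =", round(alpha["I"], 4), "max gap =", max(gaps))
```

The loop confirms that, with the assessed functions, the indifference between P' and Q' does not depend on which third lottery R is used.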
4.3.3 Stochastic Dominance

Definition 4.1: A distribution G is said to stochastically dominate another distribution F in the first degree, denoted by $G \geq^{1} F$, if $G(x) \leq F(x)$, $\forall x \in R$.

The above definition of what is usually called Stochastic Dominance has its origin in the works of Hadar and Russell (1969), Lehmann (1955) and Hanoch and Levy (1969). It is well known that

    $G \geq^{1} F \Rightarrow \int_R u\,dG \geq \int_R u\,dF$

for every increasing function u on R. Therefore, every expected utility decision maker would prefer G to F when G dominates F in the first degree. This is, however, not necessarily true of every alpha utility decision maker. To the extent that consistency with first degree stochastic dominance is normatively desirable (and, in the context of monetary lotteries, probably descriptively valid), it can be imposed as an additional requirement. The following corollary taken from Chapter 1 provides conditions under which an alpha utility decision maker would be consistent with first degree stochastic dominance.

Corollary 1.4: Suppose $\alpha$, v are bounded. Then $\forall F, G \in D$, $F \geq^{1} G \Rightarrow \Omega(F) \geq \Omega(G)$ iff

    $\alpha(x)\,(v(x) - v(s))$    (4.14)

is an increasing function of x, $\forall s \in R$.

The functional $\Omega(F)$ refers to the alpha utility representation $\int_R \alpha v\,dF / \int_R \alpha\,dF$. It is easy to see that relation (4.14) can be restated as:

    $\alpha(x)\,(v(x) - \Omega(F))$    (4.15)

is an increasing function of x, $\forall F \in D$.

We obtain a "familiar" interpretation of (4.15) through the following expository discussion of the necessity proof of Corollary 1.5. Observe that $G \geq^{1} F \Rightarrow F_{\theta'} \geq^{1} F_{\theta}$ if $\theta' > \theta$, where $F_\theta = (1-\theta)F + \theta G$. Suppose $\Omega$ is consistent with $\geq^{1}$; then $\Omega(F_\theta)$ is increasing in $\theta$. Differentiating with respect to $\theta$ yields

    $\frac{d}{d\theta}\Omega(F_\theta) = \int_R u(x;F_\theta)\,d[G(x) - F(x)] \geq 0 \quad \forall \theta \in (0,1]$,    (4.16)

where

    $u(x;F_\theta) = \frac{\alpha(x)}{\int_R \alpha\,dF_\theta}\,\{v(x) - \Omega(F_\theta)\}$.    (4.17)

By continuity of $u(x;F_\theta)$ in $\theta$, it follows that

    $\frac{d}{d\theta}\Omega(F_\theta)\Big|_{\theta=0+} = \int_R u(x;F)\,d[G(x) - F(x)] \geq 0$.    (4.18)

Since G is any distribution that dominates F, u(x;F) has to be increasing for every $x \in R$.
The function u(x;F) has the functional-analytic interpretation as a Gateaux derivative (Luenberger, 1969) of the functional $\Omega(F)$ at F, and

    $\frac{d}{d\theta}\Omega(F_\theta)\Big|_{\theta=0+} = \int_R u(x;F)\,d(G-F)$

is interpreted as the Gateaux differential of $\Omega$ at F in the direction of G-F. For expected utility, the corresponding Gateaux derivative and Gateaux differential are u(x) and $\int_R u(x)\,d[G(x)-F(x)]$ respectively. Note that u(x) does not depend on F since the expected utility representation $\int_R u\,dF$ is linear in F. This suggests the term Lottery Specific Utility Function (abbv. LOSUF) for u(x;F) [5]. Corollary 1.5 has the familiar interpretation: Alpha utility is consistent with 1st degree stochastic dominance if its LOSUF is increasing for each F.

[5] Machina (1980) investigated the properties of a twice Fréchet-differentiable functional V(F) on D[0,M] with respect to the stochastic dominance and global risk aversion partial orders. Fréchet differentiability is a stronger notion than Gateaux differentiability and does not admit a natural extension of the analysis to the real line. Machina called the first Fréchet derivatives of V(F) local utility functions. The alpha utility functional $\Omega(F)$, when restricted to a compact interval such as [0,M], is Fréchet differentiable, so that we can identify a lottery specific utility function with a local utility function.

This condition assumes the following more manageable form if we impose 1st order differentiability on $\alpha$ and v:

    $\alpha v' \geq \max\,[\alpha'(\bar v - v(x)),\ -\alpha'(v(x) - \underline v)] \quad \forall x \in R$,    (4.19)

where $\bar v = \lim_{x\to\infty} v(x)$ and $\underline v = \lim_{x\to-\infty} v(x)$. Alternatively, the above can be stated as: $\forall x \in R$,

    $(\log \alpha(x))' \leq \frac{v'(x)}{\bar v - v(x)}$ if $\alpha'(x) \geq 0$,
    $(\log \alpha(x))' \geq -\frac{v'(x)}{v(x) - \underline v}$ if $\alpha'(x) < 0$.    (4.20)

The rate of change of $\log \alpha$ is bounded from above and below by $v'(x)/(\bar v - v(x))$ and $-v'(x)/(v(x) - \underline v)$ respectively. Expression (4.20) can be used directly to find all those $\alpha$, given a certain v, that will be consistent with 1st degree stochastic dominance.
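Condition (4.14) lends itself to a simple numerical check. The sketch below is illustrative only: the logistic v and the two $\alpha$ functions are hypothetical choices, and the test is performed on a finite grid rather than proved for all x and s. It also exhibits a pair of lotteries for which the $\alpha$ function that fails the condition ranks a first-degree dominated lottery higher.

```python
import math

def v(x):
    # Hypothetical bounded increasing v with limits 0 and 1.
    return 1.0 / (1.0 + math.exp(-x))

def alpha_mild(x):
    # Hypothetical alpha that varies slowly relative to v.
    return 1.0 + 0.5 * v(x)

def alpha_steep(x):
    # Hypothetical alpha that rises too quickly where v is small.
    return 1.0 + 20.0 * v(x)

def omega(lottery, alpha):
    num = sum(p * alpha(x) * v(x) for x, p in lottery.items())
    den = sum(p * alpha(x) for x, p in lottery.items())
    return num / den

def satisfies_4_14(alpha, xs):
    # Grid check that x -> alpha(x) * (v(x) - v(s)) is nondecreasing for each s.
    for s in xs:
        vals = [alpha(x) * (v(x) - v(s)) for x in xs]
        if any(b < a - 1e-12 for a, b in zip(vals, vals[1:])):
            return False
    return True

xs = [-8.0 + 0.05 * i for i in range(321)]
F = {-3.0: 0.1, 5.0: 0.9}
G = {-1.0: 0.1, 5.0: 0.9}   # G dominates F: the bad outcome is improved

for name, a in (("mild", alpha_mild), ("steep", alpha_steep)):
    print(name, satisfies_4_14(a, xs), omega(G, a) > omega(F, a))
```

For the slowly varying $\alpha$ the grid check passes and G is ranked above F; for the steep $\alpha$ the check fails and the ranking of this particular pair is reversed.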
4.3.4 Global Risk Aversion

Definition 4.2: A distribution G is said to stochastically dominate another distribution F in the second degree [6], denoted by $G \geq^{2} F$, if

    $\int_{-\infty}^{y} (G(x) - F(x))\,dx \leq 0 \quad \forall y \in R$    (4.21)

and

    $\int_{-\infty}^{\infty} (G(x) - F(x))\,dx = 0$.    (4.22)

[6] Definition 4.2 extends the definition of increasing risk by Rothschild and Stiglitz (1970) for a compact interval. Similar or related results have been obtained in Blackwell (1951), Hanoch and Levy (1969) and Strassen (1965). Whitmore (1970) defined third-degree stochastic dominance and related it to the Arrow-Pratt index. The general kth-degree stochastic dominance was defined in Chapter 1. However, both definitions will not be treated here.

When the means associated with the distributions F and G exist, condition (4.22) implies that they are equal. Condition (4.21) requires that for each x, the mean of G truncated at $(-\infty, x]$ is not less than that of F truncated at $(-\infty, x]$. Note also that $F \geq^{2} G$ if and only if $\int_R u\,dF \geq \int_R u\,dG$ for all concave u on R. Consequently, an expected utility decision maker with a concave utility function always prefers a distribution F that dominates another distribution G in the second degree. The normative content of 2nd-degree Stochastic Dominance (or Global Risk Aversion) is derived from the idea that a prudent person should be risk averse.

The following corollary adapted from Chapter 1, which is a restriction of Corollary 1.6 to the case of second-degree stochastic dominance, provides conditions under which an alpha utility decision maker would be consistent with second-degree stochastic dominance.

Corollary 1.6*: Suppose $\alpha$, v, $\alpha'$, v' are bounded and continuous. Then $\forall F, G \in D$,

    $F \geq^{2} G \Rightarrow \Omega(F) \geq \Omega(G)$    (4.23)

iff $\{\alpha(x)[v(x) - v(s)]\}'$ is a decreasing function of x, $\forall s \in R$.

Note that condition (4.23) requires the LOSUF u(x;F) to be a concave function for each F. This further confirms our intuition that the LOSUF plays a von Neumann-Morgenstern utility-like role.
Requiring $\alpha$, v to be twice differentiable, the condition (4.23) becomes: $\forall x \in R$,

    $\alpha v'' + 2\alpha' v' \leq \min\,[\alpha''(\bar v - v(x)),\ -\alpha''(v(x) - \underline v)]$.    (4.24)

This condition is related to the Arrow-Pratt measure for alpha utility developed in the next subsection.

4.3.5 Local Risk Aversion: The Arrow-Pratt Index

Consider an alpha utility decision maker with assets x. The Cash Equivalent C(x;Z) corresponding to a risk Z is given by $M(F_{x+Z}) - x$, which is the difference between the certain asset position $M(F_{x+Z})$, such that the decision maker is indifferent to taking the risk Z, and his current asset position x. The risk premium $\pi(x;Z)$ is then defined by

    $\pi(x;Z) = E(Z) - C(x;Z)$,    (4.25)

which is the difference between the actuarial value E(Z) of the risk Z and its cash equivalent C(x;Z). Since x+Z and $(x+\mu)+(Z-\mu)$ have the same distribution on final assets, $\pi$ and C from expression (4.25) have the properties:

    $C(x+\mu; Z-\mu) = C(x;Z) - \mu$, and    (4.26)
    $\pi(x+\mu; Z-\mu) = \pi(x;Z)$.    (4.27)

Following Pratt (1964), we consider the behaviour of $\pi(x;Z)$ for an actuarially fair risk Z as $\sigma_Z^2 \to 0$, assuming the third absolute central moment of Z is of order $o(\sigma_Z^2)$. Thus,

    $v(x - \pi(x;Z)) = \Omega(F_{x+Z}) = \frac{E[\alpha(x+Z)\,v(x+Z)]}{E[\alpha(x+Z)]}$.    (4.28)

Expanding both sides after cross multiplication, we get

    $[v(x) - \pi v'(x) + O(\pi^2)]\,[\alpha(x) + \tfrac{1}{2}\sigma_Z^2\,\alpha''(x) + o(\sigma_Z^2)] = \alpha(x)v(x) + \tfrac{1}{2}\sigma_Z^2\,(\alpha v)''(x) + o(\sigma_Z^2)$.    (4.29)

This reduces to

    $\pi(x;Z) = \tfrac{1}{2}\sigma_Z^2\,r(x) + o(\sigma_Z^2)$,    (4.30)

where

    $r(x) = -\frac{d}{dx}\left[\log \alpha^2(x)\,v'(x)\right] = -\left(\frac{v''(x)}{v'(x)} + \frac{2\alpha'(x)}{\alpha(x)}\right)$.    (4.31)

As in expected utility, the decision maker's risk premium for a small, actuarially neutral risk Z is approximately half the variance times r(x), which, in keeping with precedent, we call the Arrow-Pratt index. When Z is not actuarially fair, we obtain from (4.27)

    $\pi(x;Z) = \tfrac{1}{2}\sigma_Z^2\,r(x+E(Z)) + o(\sigma_Z^2)$.    (4.32)

It is straightforward to check that r(x) has the following alternative interpretation in terms of the probability premium (Pratt, 1964; Arrow, 1971),

    $p(x;h) = \tfrac{1}{2}h\,r(x) + O(h^2)$,    (4.33)

where $\tfrac{1}{2}(1+p(x;h))$ and $\tfrac{1}{2}(1-p(x;h))$ are the probabilities of obtaining x+h and x-h respectively, such that the decision maker is indifferent between the status quo x and taking the risk.

It is also straightforward to check that r(x) is invariant under the uniqueness transformation (expressions (1.12) and (1.13)) for the functions $\alpha$ and v. In particular, it is invariant under an affine transformation for v and a scalar multiple for $\alpha$. It is comforting to note that r(x) has local properties similar to those of its expected utility counterpart. Unlike expected utility, however, we cannot recover the functions $\alpha$ and v only from the knowledge of r(x) pointwise. We would not therefore expect to be able to characterize, in general, the global risk propensities of an alpha utility decision maker in terms of his local risk aversion function r(x). In particular, we note from subsection 4.3.4 that global risk aversion, in the sense of consistency with second degree stochastic dominance, implies that $r(x) \geq 0$ pointwise, but the converse does not in general hold.

This result seems intuitively appealing. Consistency with second degree stochastic dominance implies a positive risk premium $\pi(x;Z)$ at any asset position x, for any actuarially fair risk Z, which in turn implies that r(x) is positive for actuarially fair infinitesimal risks about x. Even if r(x) is positive pointwise, indicating aversion to infinitesimal actuarially fair risks, it is still possible for an alpha utility decision maker to be risk seeking over some interval. This corresponds to the observation that people purchase insurance and gamble at the same time (Friedman & Savage, 1948; Markowitz, 1952). The implications of this and other empirically observed choice behavior for alpha utility will be considered in the next section, where we use the consistency conditions developed in the last subsections to identify regions of local and global risk aversion.
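The small-risk approximation $\pi(x;Z) \approx \tfrac{1}{2}\sigma_Z^2\,r(x)$ can be checked numerically. In the sketch below the functions $v(x) = \log x$ and $\alpha(x) = x^{A}$ are hypothetical choices made only for illustration; for them the index $r(x) = -(v''/v' + 2\alpha'/\alpha)$ works out to $(1-2A)/x$. The risk Z is a symmetric two-point gamble of size h.

```python
import math

A = -0.5   # hypothetical exponent of the weight function alpha(x) = x**A

def alpha(x):
    return x ** A

def v(x):
    return math.log(x)

def v_inv(u):
    return math.exp(u)

def arrow_pratt(x):
    # r(x) = -(v''/v' + 2*alpha'/alpha); for v = log and alpha = x**A this is (1 - 2A)/x.
    return (1.0 - 2.0 * A) / x

def risk_premium(x, risk):
    # risk: list of (z, probability) pairs describing Z; returns E(Z) - C(x;Z).
    num = sum(p * alpha(x + z) * v(x + z) for z, p in risk)
    den = sum(p * alpha(x + z) for z, p in risk)
    cash_equivalent = v_inv(num / den) - x
    mean_z = sum(p * z for z, p in risk)
    return mean_z - cash_equivalent

x, h = 10.0, 0.1
Z = [(h, 0.5), (-h, 0.5)]                  # actuarially fair, variance h**2
exact = risk_premium(x, Z)
approx = 0.5 * h * h * arrow_pratt(x)
print(exact, approx)
```

Shrinking h further brings the ratio of the two printed numbers closer to one, as the $o(\sigma_Z^2)$ remainder suggests.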
4.4 DESCRIPTIVE IMPLICATIONS

We showed in the preceding section that alpha utility, like expected utility, is compatible with such normative notions as stochastic dominance, local and global risk aversion. We also illustrated how the constructive proof of our representation theorem furnishes a procedure for the assessment of the alpha utility functions. In this section, we will show that alpha utility is not so general that it has no testable implications, nor is it such a minute departure from expected utility that it is susceptible to the same set of violations.

4.4.1 Systematic Violations of the Strong Independence Principle [7]

[7] The material in this sub-section appeared in Chew and MacCrimmon (1979b).

The empirical findings violating the implications of the Strong Independence principle that stem from the Allais paradox were summarized in Table 3.1 in terms of the HILO structure of lotteries developed in section 3.1. We consider here the implications of alpha utility theory for choice patterns for the HILO structure, and derive some testable predictions. Applying alpha utility theory to the decision choices in Figure 3.4, with v(L) = 0, v(H) = 1 and, without loss of generality, $\alpha(L) = \alpha(H) = 1$ (cf. Theorem 2.1), we obtain

    $\Omega(A_0) = v(I)$,    $\Omega(B_0) = q$,
    $\Omega(A_I^\beta) = v(I)$,    $\Omega(B_I^\beta) = \frac{\beta q + (1-\beta)\alpha(I)v(I)}{\beta + (1-\beta)\alpha(I)}$,
    $\Omega(A_L^\beta) = \frac{\beta\alpha(I)v(I)}{\beta\alpha(I) + (1-\beta)}$,    $\Omega(B_L^\beta) = \beta q$,
    $\Omega(A_H^\beta) = \frac{\beta\alpha(I)v(I) + (1-\beta)}{\beta\alpha(I) + (1-\beta)}$,    $\Omega(B_H^\beta) = \beta q + (1-\beta)$.

These yield the following inequalities corresponding to preferences in each decision box:

    $A_0 \succ B_0 \iff v(I) > q \iff A_I^\beta \succ B_I^\beta$,    (4.34)
    $A_L^\beta \succ B_L^\beta \iff v(I) > q\left(\beta + \frac{1-\beta}{\alpha(I)}\right)$,    (4.35)
    $A_H^\beta \succ B_H^\beta \iff v(I) > 1 - (1-q)\left(\beta + \frac{1-\beta}{\alpha(I)}\right)$.    (4.36)

The inequalities (4.34), (4.35) and (4.36) are plotted in Figure 4.5 for the case in which $\alpha(I) < 1$.

[Figure 4.5: Preference pattern for $\alpha(I) < 1$]

Note that the four regions, denoted as regions I-IV, correspond to four distinct choice patterns. These, along with the choice patterns for the case of $\alpha(I) > 1$, are summarized in Table 4.1.
Table 4.1: Allowable Choice Patterns Under Alpha Utility Theory

Region | $\alpha(I) < 1$ | Expected Utility, $\alpha(I) = 1$ | $\alpha(I) > 1$
I   | $A_H^\beta, A_I^\beta, A_L^\beta, A_0$ | $A_H^\beta, A_I^\beta, A_L^\beta, A_0$ | $A_H^\beta, A_I^\beta, A_L^\beta, A_0$
II  | $A_H^\beta, A_I^\beta, B_L^\beta, A_0$ | Region does not exist | $B_H^\beta, A_I^\beta, A_L^\beta, A_0$
III | $A_H^\beta, B_I^\beta, B_L^\beta, B_0$ | Region does not exist | $B_H^\beta, B_I^\beta, A_L^\beta, B_0$
IV  | $B_H^\beta, B_I^\beta, B_L^\beta, B_0$ | $B_H^\beta, B_I^\beta, B_L^\beta, B_0$ | $B_H^\beta, B_I^\beta, B_L^\beta, B_0$

Apart from the patterns corresponding to regions I and IV (the only ones consistent with expected utility), alpha utility theory allows for 4 out of 14 additional patterns. These are given by the entries under the cases $\alpha(I) < 1$ and $\alpha(I) > 1$ for regions II and III. The region II and region III patterns under $\alpha(I) > 1$ have not been reported in the literature (cf. Table 3.1). On the other hand, all the empirical findings to date of violations of expected utility correspond to either of the region II and III patterns with $\alpha(I) < 1$. In particular, both the standard Allais paradox (i.e., $A_I^\beta$, $B_L^\beta$) and the Allais ratio paradox (i.e., $B_L^\beta$, $A_0$) occur in region II. The existence of region III also has some empirical support; note (Table 3.1) that both the ($A_H^\beta$, $B_0$) and the ($A_H^{\beta_1}$, $B_H^{\beta_2}$) violations have been reported.

Before we continue with further implications based on the allowable patterns of choice, we can gain some intuition about alpha utility theory in relation to the HILO structure by examining, in Figure 4.5, the effects of changes in the parameters $\beta$ and v(I) on the resulting pattern of choices. Looking first at changes in $\beta$, we see that if consequence I is sufficiently attractive (i.e., $v(I) > q/\alpha(I)$), then the choice will be the A alternative and will be unaffected by changes in $\beta$. This indicates a basis for choice that may be called the security effect. Correspondingly, if I is unattractive (i.e., $v(I) < 1 - (1-q)/\alpha(I)$), then no change in $\beta$ will induce a shift away from the B alternative. This may be called a nothing-to-lose effect. Note that these two regions are the only regions consistent with the substitution principle.

If I is somewhat attractive (i.e., v(I) between q and $q/\alpha(I)$), then decreasing $\beta$ from 1 will cause a switch from $A_L^\beta$ to $B_L^\beta$.
The smaller value of $\beta$ acts as a dilution probability to narrow the perceived gap between $A_L^\beta$ and $B_L^\beta$, due to the attractiveness of I, until finally the attractiveness is sufficiently diluted to cause a switch to $B_L^\beta$. This is the dilution effect. The special case of the dilution from $A_L^{\beta_1}$ to $B_L^{\beta_2}$ (where $\beta_1 = 1.0$ and $\beta_2 \ll 1.0$) has been called a certainty effect (Kahneman and Tversky, 1979). When I is somewhat unattractive (i.e., v(I) is between $1 - (1-q)/\alpha(I)$ and q), a decrease in $\beta$ will cause a switch from $B_H^\beta$ to $A_H^\beta$. This is a reverse dilution effect.

In addition to studying the effect of changes in $\beta$ for given values of v(I), fresh insight can be gained by examining the effect of changes in v(I) for given levels of $\beta$. For $\beta = 1$, we no longer have a compound structure and so revert to the simple $A_0$, $B_0$ choice, which depends on whether v(I) > q. At low and intermediate levels of $\beta$, the regions II and III, giving rise to the paradoxical choices, can be quite large. Note that as v(I) decreases from being very attractive, at some point the $A_L^\beta$ choice will switch to $B_L^\beta$. Then as v(I) drops below q, the choices in the I and 0 cases change to $B_I^\beta$ and $B_0$ respectively. Finally, as v(I) drops lower (i.e., below $1 - (1-q)(\beta + (1-\beta)/\alpha(I))$), $A_H^\beta$ changes to $B_H^\beta$.

Two main implications should be noted from these observations. First, alpha utility theory can describe a richer set of preferences than can expected utility theory. Specifically, it covers the Allais-type preferences (I-L, L-0, and L-L) and other observed violations (H-0 and H-H). It allows for a dependence on the values of the parameter $\beta$ and hence captures the observed dilution effect. On the other hand, it is not so general that it can describe any preferences. Only 6 of the possible 16 preference patterns of the HILO structure are consistent with alpha utility theory (and only 4 patterns for the empirically supported $\alpha(I) < 1$ case).
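The count of admissible patterns can be checked by brute force. The following Python sketch is an illustration under assumptions: since Figure 3.4 is not reproduced in this chapter, the lotteries are taken to be $A_0$ = I for sure, $B_0$ = (H with probability q, L otherwise), and $A_c^\beta$, $B_c^\beta$ the mixtures of these with the sure outcome c in {H, I, L} at weight $\beta$ on the basic lottery, which is the structure implied by the expressions for $\Omega$ given earlier in this subsection. The grid values and function names are arbitrary choices of this sketch.

```python
from itertools import product

def omega(lottery, alpha, v):
    num = sum(p * alpha[x] * v[x] for x, p in lottery.items())
    den = sum(p * alpha[x] for x, p in lottery.items())
    return num / den

def mix(beta, base, c):
    # beta * base + (1 - beta) * (c for sure)
    out = {x: beta * p for x, p in base.items()}
    out[c] = out.get(c, 0.0) + (1.0 - beta)
    return out

def pattern(v_I, a_I, q, beta):
    # Choices in the H, I, L and 0 cases, in that order, as a string of A/B.
    v = {"L": 0.0, "I": v_I, "H": 1.0}
    alpha = {"L": 1.0, "I": a_I, "H": 1.0}
    A0 = {"I": 1.0}
    B0 = {"H": q, "L": 1.0 - q}
    choices = []
    for c in ("H", "I", "L"):
        a = omega(mix(beta, A0, c), alpha, v)
        b = omega(mix(beta, B0, c), alpha, v)
        choices.append("A" if a > b else "B")
    choices.append("A" if omega(A0, alpha, v) > omega(B0, alpha, v) else "B")
    return "".join(choices)

def patterns_for(a_I, q=0.8):
    # The v(I) grid avoids exact ties with q.
    v_grid = [(i + 0.5) / 40 for i in range(40)]
    betas = [0.1, 0.3, 0.5, 0.7, 0.9]
    return {pattern(v_I, a_I, q, b) for v_I, b in product(v_grid, betas)}

below, equal, above = patterns_for(0.5), patterns_for(1.0), patterns_for(2.0)
print(sorted(below))
print(sorted(equal))
print(sorted(above))
print(len(below | equal | above))
```

On this grid, each of the cases $\alpha(I) < 1$ and $\alpha(I) > 1$ produces four patterns, the expected utility case produces two, and their union contains six of the sixteen conceivable patterns.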
As Figure 4.5 suggests, our theory makes very specific predictions about the preference patterns and the way they change as the parameters $\beta$ and $v(I)$ change. In particular, monotonicity requires that preferences for the O and I cases agree completely, i.e., $A_O^{\beta} \succ B_O^{\beta} \iff A_I^{\beta} \succ B_I^{\beta}$. More interestingly, note that the standard Allais paradox ($A_I^{\beta}, B_L^{\beta}$) occurs if, and only if, the Allais ratio paradox ($B_L^{\beta}, A_O^{\beta}$) occurs. Both new regions of permissible choice (i.e., regions II and III under $\alpha(I) > 1$) provide for $B_H^{\beta}$ and $A_L^{\beta}$; hence this previously unreported case would seem to be a prime candidate for a new "paradox". Further, note that as $v(I)$ decreases, the switch from $A^{\beta}$ to $B^{\beta}$ occurs first for the L case, then for the I and O cases, and finally for the H case. All these implications from alpha utility theory are empirically testable.

4.4.2 Stochastic Dominance

Consistency with stochastic dominance is an intrinsic property of expected utility. This is not the case with alpha utility. To the extent that it is prescriptively desirable and descriptively valid (see Chapter 5 for an example of a potential violation of stochastic dominance in the context of income distributions), we can restrict the alpha utility functions considered to those that do not violate stochastic dominance via the following consistency conditions taken from Corollary 1.4:

\[
\forall x \in R,\quad [\log \alpha(x)]'
\begin{cases}
\le -[\log(\bar v - v(x))]' & \text{if } \alpha'(x) > 0,\\
\ge -[\log(v(x) - \underline v)]' & \text{if } \alpha'(x) < 0,
\end{cases}
\qquad \text{where } \bar v = \lim_{x\to\infty} v(x) \text{ and } \underline v = \lim_{x\to-\infty} v(x).
\tag{4.37}
\]

These conditions are depicted graphically in Figure 4.6 for a bounded $v$ normalized to $\underline v = 0$ and $\bar v = 1$. The slope of $\log\alpha$ is bounded by the slope of $-\log(1-v)$ from above and by that of $-\log v$ from below. The $\alpha$ function that is compatible with (4.37) is then recovered from the graph of $\log\alpha$. The $\alpha$ function considered has a dent in the middle and increases in both the positive and the negative directions in order not to exclude the possibility of region II and region III preferences in Table 4.1.
(Recall that the condition for the existence of region II and III preferences is that $\alpha(I) < q\alpha(H) + (1-q)\alpha(L)$.) The limiting behavior of $\alpha$ is bounded above by a multiple of $1/(1-v)$, and below by a multiple of $1/v$. It is interesting to note that after satisfying stochastic dominance and also the "Allais" type preference for the HILO structure, we are still left with a fairly large class of $\alpha$ functions. We investigate in the next sub-section the risk propensities of the $\alpha$ and $v$ functions of Figure 4.6.

[Fig. 4.6: Consistency Conditions for Stochastic Dominance (bounding curves $-\ln[v(x)-\underline v]$ and $-\ln[\bar v - v(x)]$)]

4.4.3 Local and Global Risk Properties: Concurrence of Risk Averting and Risk Seeking Behavior

Local risk aversion, in the sense that the Arrow-Pratt index (see sub-section 4.3.5) is nonnegative, corresponds to the observation that people tend to avoid taking a small gamble in favor of its expected return. For expected utility, local risk aversion pointwise is equivalent to having a concave von Neumann-Morgenstern utility function, which is in turn equivalent to global risk aversion. Thus, the behaviorally plausible local risk aversion hypothesis is not compatible with any concurrent risk seeking behavior, e.g., the purchase of a lottery ticket. Alpha utility does not share the above difficulty since local risk aversion is necessary but not sufficient for global risk aversion. The condition for local risk aversion, namely that the Arrow-Pratt index ($= -v''(x)/v'(x) - 2\alpha'(x)/\alpha(x)$) is nonnegative, is:

\[
(\log\alpha(x))' \le -\tfrac{1}{2}\,(\log v'(x))'. \tag{4.38}
\]

This is depicted graphically in Figure 4.7 for the $v$ function of Figure 4.6. Note that a constant $\alpha$ (which corresponds to expected utility) is not admissible. The function $\log\alpha$ has to decrease sufficiently rapidly near the ruin point $-x_r$ in order to correct for the convexity of $v$ in the same region.
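As a rough numerical illustration of conditions (4.37) and (4.38), the sketch below checks both on a grid for one hypothetical pair of functions. The logistic $v$ (bounded, normalized to $\underline v = 0$, $\bar v = 1$, convex for negative $x$) and the choice $(\log\alpha)' = v - 0.75$, i.e. $\alpha(x) = (1+e^{x})e^{-0.75x}$, are assumptions made only for this example; they are not the functions plotted in Figures 4.6 to 4.8.

```python
import math

def v(x):
    """Assumed bounded value function, normalized to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def dlog_alpha(x):
    """Assumed slope of log(alpha); corresponds to alpha(x) = (1 + e^x) * e^(-0.75 x)."""
    return v(x) - 0.75

def sd_consistent(x, slope):
    """(4.37) with v_lower = 0 and v_upper = 1: -(log v)' <= (log alpha)' <= -(log(1 - v))'.
    Only one side binds, depending on the sign of alpha'; the other holds automatically."""
    dv = v(x) * (1.0 - v(x))               # derivative of the logistic v
    return -dv / v(x) <= slope <= dv / (1.0 - v(x))

def locally_risk_averse(x, slope):
    """(4.38): (log alpha)' <= -1/2 * (log v')'."""
    dlog_dv = 1.0 - 2.0 * v(x)             # (log v')' = v''/v' for the logistic v
    return slope <= -0.5 * dlog_dv

grid = [i / 10.0 for i in range(-50, 51)]
pair_ok = all(sd_consistent(x, dlog_alpha(x)) and locally_risk_averse(x, dlog_alpha(x))
              for x in grid)
constant_alpha_ok = all(locally_risk_averse(x, 0.0) for x in grid)
print(pair_ok, constant_alpha_ok)
```

For this assumed pair both conditions hold on the grid, while a constant $\alpha$ fails (4.38) wherever the assumed $v$ is convex, in line with the remark that a constant $\alpha$ is not admissible.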
At this point, we have a pair of functions $(\alpha, v)$ that satisfy stochastic dominance, exhibit local risk aversion pointwise, and are compatible with the Allais type preference for the HILO structure. It remains to check whether the functions have the correct global risk properties that correspond to actual choice behavior. First, we establish that our $\alpha$ and $v$ functions describe risk seeking behavior by showing that they do not satisfy the conditions for global risk aversion:

\[
\forall x \in R,\quad (\alpha(x)\,v(x))''
\begin{cases}
\le 0 & \text{if } \alpha''(x) \ge 0,\\
\le \alpha''(x) & \text{if } \alpha''(x) < 0.
\end{cases}
\tag{4.39}
\]

[Fig. 4.7: Conditions for Local Risk Aversion]

[Fig. 4.8: An admissible alpha function]

Since the $\alpha$ function considered is convex, it suffices to observe in Figure 4.8 that the product $\alpha\cdot v$ admits a convex region near the ruin point. This means that there is a concurrence of risk seeking and risk averting behavior for that region. This corresponds to the prevalence of risk seeking behavior for losses observed by Markowitz (1952) and recently by Kahneman and Tversky (1979). However, the local risk aversion hypothesis (see Figure 4.6) rules out the possibility of global risk proneness, so that the alpha utility decision maker will at the same time have the opposite, risk averse, propensity for some other gambles.

Finally, we consider the mutual incompatibility of different procedures for the assessment of a von Neumann-Morgenstern utility function. Suppose we apply the certainty equivalent method to the alpha utility decision maker considered, with status quo $x = 0$ and some substantial gain amount $x_g$ as the endpoints of lottery B in Figure 3.5. The pairs of measurements $(x_c, u_c(x_c))$ that correspond to indifference between getting $x_c$ for sure in lottery A, and getting $x_g$ with $u_c(x_c)$ chance and getting 0 with $1-u_c(x_c)$ chance, are related by the following expression:

\[
v(x_c) = \frac{\alpha(x_g)\,u_c(x_c)\,v(x_g) + \alpha(0)\,[1-u_c(x_c)]\,v(0)}{\alpha(x_g)\,u_c(x_c) + \alpha(0)\,[1-u_c(x_c)]}. \tag{4.40}
\]

After rearranging, we get:

\[
u_c(x) = \frac{\alpha(0)\,[v(x)-v(0)]}{\alpha(0)\,[v(x)-v(0)] + \alpha(x_g)\,[v(x_g)-v(x)]}. \tag{4.41}
\]

[Fig. 4.9: A pair of $u_c$ and $u_{1/2}$ derived from an alpha utility decision maker]

[Fig. 4.10: A pair of $u_g$ and $u_{3/4}$ derived from an alpha utility decision maker]

Alternatively, if we fix a loss amount at half the ruin position $-x_r$ and use $x = 0$ as the intermediate amount, we can apply the gain equivalent method to obtain $u_g$ below:

\[
v(0) = \frac{\alpha(x_g)\,p_g(x_g)\,v(x_g) + \alpha(-\tfrac{x_r}{2})\,[1-p_g(x_g)]\,v(-\tfrac{x_r}{2})}{\alpha(x_g)\,p_g(x_g) + \alpha(-\tfrac{x_r}{2})\,[1-p_g(x_g)]}. \tag{4.42}
\]

Hence,

\[
u_g(x) = 1 + \frac{\alpha(x)\,[v(x)-v(0)]}{\alpha(-\tfrac{x_r}{2})\,[v(0)-v(-\tfrac{x_r}{2})]}. \tag{4.43}
\]

Finally, we consider the chaining method. With $x^0 = 0$ and $x_1 = x_g$ as the endpoints, and $p^0 = 1/2$ as the probability parameter, we obtain the sequence $(x_1, x_2, \ldots)$ based on the following relation:

\[
v(x_{i+1}) = \frac{\alpha(x_i)\,v(x_i) + \alpha(0)\,v(0)}{\alpha(x_i) + \alpha(0)}. \tag{4.44}
\]

We have thus determined $u_{1/2}$ on the sequence of points $(x_1, x_2, \ldots)$ given by expression (4.44). Changing the probability parameter to $p^0 = 3/4$ and the lower endpoint to $x^0 = -x_r/2$, we obtain:

\[
v(x_{i+1}) = \frac{3\,\alpha(x_i)\,v(x_i) + \alpha(-\tfrac{x_r}{2})\,v(-\tfrac{x_r}{2})}{3\,\alpha(x_i) + \alpha(-\tfrac{x_r}{2})}. \tag{4.45}
\]

As before, we have determined $u_{3/4}$ on the sequence of points $(x_1, x_2, \ldots)$ determined by expression (4.45). Figure 4.9 shows the graphs of $u_c$ and $u_{1/2}$ with the same axes for ease of comparison with Allais' results displayed in Figure 3.6. Similarly, the curves of $u_{3/4}$ and $u_g$ are plotted in Figure 4.10 for comparison with the results of MacCrimmon et al. (1972) in Figure 3.7. The rather striking fit between the theoretical prediction and the empirical measurement supports the validity of alpha utility theory, thus providing a "rational" explanation for an otherwise puzzling phenomenon.

4.4.4 Some Problems with Problem Representation

In this final subsection, we comment briefly on how alpha utility handles the two difficulties with problem representation discussed in section 3 of Chapter 3.
Our position on the question of whether alpha utility functions should be defined on gains and losses relative to a customary wealth level or in terms of final asset positions is the same as that of expected utility: it depends on the particular application. The latter should be adopted if we are interested in modeling the choice behavior of the "economic" man or if a decision analyst is helping someone who professes his belief in integrating possible outcomes into his wealth position prior to evaluation (Kahneman and Tversky called this the asset integration position). On the other hand, the former position comes in handy if we want to describe the actual choice behavior of people who do not conform to the asset integration position.

The other question is whether a two-stage lottery is equivalent to its single-stage decomposition. Kahneman and Tversky showed that for a particular two-stage lottery which is closely related to their version of the Allais paradox (cf. the O-L case in Table 3.1), a problem description that focuses the subject's attention on a conditional comparison between one of the branches can elicit a majority preference that is the reverse of what would be the case if the two-stage lottery is stated in terms of its single-stage equivalent. Expected utility cannot be consistent with this phenomenon because the strong independence principle, or the sure-thing principle, dictates that the direction of the conditional preference must be the same as that of the overall preference. Alpha utility does not share this difficulty, since the conditional preference for $A_O$ against $B_O$ and the preference for $B_L$ against $A_L$ is a special case of the HILO choice pattern (Table 4.1) considered in sub-section 4.4.1. The phenomena considered have implications for studies in decision making in general.
Apart from situations where a clear normative position dictates which is the 'correct' problem representation, the lesson seems to be that one should be sensitive to the context associated with a particular choice situation. People appear to use different schemes to represent the same choices, resulting in apparent inconsistency if the rankings among alternatives are different for different problem representations.

4.5 CRITIQUE OF ALLAIS' THEORY AND PROSPECT THEORY

Having developed alpha utility theory in the preceding sections, we consider in this section two other alternatives to expected utility that have attracted significant interest.

4.5.1 Allais' Theory

Allais (1953) assumed the existence of a functional $V(F)$ that represents preference among probability distributions. Such an approach has the immediate implication that preference thus represented satisfies several standard properties of expected utility: completeness, transitivity, combination and composition. Allais further assumed that preference shares another property of expected utility, called consistency with stochastic dominance. This assumption restricts the preference functional $V(F)$ to those that increase when the underlying distribution increases in the stochastic dominance sense. For finite lotteries of the form $F = \sum_{i=1}^{n} p_i \delta_{x_i}$, this condition has the following simple form:

\[
V\Big(\sum_{i=1}^{n} p_i \delta_{x_i}\Big) \text{ increases in each } x_i. \tag{4.46}
\]

Unlike expected utility, the theory outlined above is not an axiomatic theory, in the sense that the existence of the representation $V(F)$ is asserted rather than being a consequence of the assumed properties of the underlying preference.
Note also that the adoption of a more general representation is traded against the convenience of having a simple von Neumann-Morgenstern utility function, which captures our intuition about diminishing marginal utility and offers a simple characterization of risk proneness (aversion) via the convexity (concavity) of the utility function. In order to obtain a utility-like function without resorting to expected utility, Allais revived the Frisch (1926) notion of quaternary preference among intervals of wealth and asserted the existence of a cardinal utility of wealth he termed psychological value (denoted by $\bar s$). In other words, getting \$100 at status quo is "better" than getting \$100 after you have just received \$1000 if the difference in psychological value from status quo to getting \$100, $\bar s(\$100)-\bar s(\$0)$, is greater than the corresponding difference going from \$1000 to \$1100, $\bar s(\$1100)-\bar s(\$1000)$.

According to Allais, a choice agent's preference depends on the mathematical expectation (first moment), the dispersion (second moment) and in general the shape of the probability distribution of psychological values. This led him to assert that the preference functional $V(F)$ evaluated at a distribution $F$ can be stated in terms of some functional $h(F_{\bar s})$ of the distribution of psychological values, $F_{\bar s}$, as follows:

\[
V(F) = h(F_{\bar s}). \tag{4.47}
\]

For a finite lottery $F = \sum_{i=1}^{n} p_i \delta_{x_i}$, the corresponding $F_{\bar s}$ is given by:

\[
F_{\bar s} \equiv \sum_{i=1}^{n} p_i\, \delta_{\bar s(x_i)}. \tag{4.48}
\]

After rescaling, we obtain the functional $h$ below:

\[
h(\delta_{\bar s(x)}) = \bar s(x). \tag{4.49}
\]

An important property of $h$, which we will derive shortly, was however not noted by Allais. Since $\bar s$ is an interval scale, the functional $h$ must represent the same preference under an affine transformation of $\bar s$. Consider a lottery $F$ and its certainty equivalent $M(F)$. It follows that

\[
h(F_{\bar s}) = h(\delta_{\bar s(M(F))}) = \bar s(M(F)). \tag{4.50}
\]

Under the affine transformation $a\bar s + b$, where $a > 0$,

\[
h(F_{a\bar s+b}) = h(\delta_{a\bar s(M(F))+b}) = a\,\bar s(M(F)) + b = a\,h(F_{\bar s}) + b. \tag{4.51}
\]

It turns out that the above property of $h$ enables us to draw very specific conclusions about the admissible functional forms of $h$. Consider a uniform lottery $F_{\bar s} = \frac{1}{N}\sum_{i=1}^{N} \delta_{\bar s(x_i)}$. Applying Aczel's theorem (Aczel, 1966, p. 236) to $h$ evaluated at $F_{\bar s}$ yields:

\[
h\Big(\frac{1}{N}\sum_{i=1}^{N}\delta_{\bar s(x_i)}\Big) = \mu + \sigma\, g_N\Big(\frac{\bar s(x_1)-\mu}{\sigma}, \ldots, \frac{\bar s(x_N)-\mu}{\sigma}\Big), \tag{4.52}
\]

where $\mu = \frac{1}{N}\sum_{i=1}^{N} \bar s(x_i)$, $\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (\bar s(x_i)-\mu)^2$, and $g_N$ is an arbitrary symmetric function, provided $\sigma > 0$. Define the functional $g$ on the space of uniform distributions as follows:

\[
g\Big(\frac{1}{N}\sum_{i=1}^{N}\delta_{y_i}\Big) = g_N(y_1, \ldots, y_N). \tag{4.53}
\]

It follows that

\[
h\Big(\frac{1}{N}\sum_{i=1}^{N}\delta_{\bar s(x_i)}\Big) = \mu + \sigma\, g\Big(\frac{1}{N}\sum_{i=1}^{N}\delta_{(\bar s(x_i)-\mu)/\sigma}\Big). \tag{4.54}
\]

Note that the above result can be applied to finite lotteries with rational probabilities by taking $N$ to be the least common denominator. Hence

Lemma 4.1:
\[
h(F_{\bar s}) = \mu + \sigma\, g\big(F_{(\bar s-\mu)/\sigma}\big), \tag{4.55}
\]
where $F_{\bar s} = \sum_{i=1}^{n} p_i\,\delta_{\bar s(x_i)}$ with $p_i$ rational, $\mu = \sum_{i=1}^{n} p_i\,\bar s(x_i)$, $\sigma^2 = \sum_{i=1}^{n} p_i\,[\bar s(x_i)-\mu]^2$, and $g$ is an arbitrary functional on the space of finite distributions with rational probability weights.

Without getting into details, we remark that the extension of Lemma 4.1 to remove the restriction of rational weights is straightforward. We need $h$ to be continuous with respect to weak convergence (cf. Chapter 2 Axiom M4), and then note that any finite distribution is the weak limit of a sequence of finite distributions with rational weights. This argument can be extended to include the case of distributions with compact supports since they are weak limits of sequences of finite distributions. Further extension to the case of non-compact support can be obtained through an assumption that is similar to Axiom M5 of Chapter 2.

In a recent paper, Allais (1979) considered the following functional form for $h$:

\[
h(F_{\bar s}) = \mu + w(F_{\bar s-\mu}). \tag{4.56}
\]

He then expressed $w$ in terms of the normalized central moments of $F_{\bar s-\mu}$, where $m_n = \int_R (\bar s-\mu)^n\, dF_{\bar s}$. The resultant expression for $h$, the sum of $\mu$ and a function $f$ of these normalized moments, however does not satisfy property (4.51). Therefore it is not admissible.

The idea of representing a distribution via its moments can be applied to the standardized distribution in expression (4.55). We define the function $f$ in the following way:

\[
f(m_3, m_4, \ldots, m_n, \ldots) = g\big(F_{(\bar s-\mu)/\sigma}\big), \tag{4.58}
\]

where $m_n$ is the $n$th moment of $F_{(\bar s-\mu)/\sigma}$. Note that the value of $f$ is not affected by an affine transformation of $\bar s$.

A key idea in Allais' criticism of expected utility is that it neglects the higher moments of the psychological value $\bar s$. Presumably, the more moments we include, the more closely the resulting $h$ approximates actual preference. As a first step, one is tempted to consider only the first two moments. The general form of $h$ for this case is obtained by setting the function $f$ in (4.58) equal to some constant $\lambda$, i.e.,

\[
h(F_{\bar s}) = \mu + \lambda\sigma. \tag{4.59}
\]

It is however easy to see that $h$ above with a nonzero $\lambda$ violates the assumption of consistency with stochastic dominance. Consider $F_{\bar s} \equiv p\,\delta_{\bar s(x)} + (1-p)\,\delta_{\bar s(y)}$, with $\bar s(x) > \bar s(y)$. Differentiating $h(p\,\delta_{\bar s(x)} + (1-p)\,\delta_{\bar s(y)})$ with respect to $p$ yields:

\[
\frac{\partial}{\partial p}\Big\{p\,\bar s(x) + (1-p)\,\bar s(y) + \lambda\big[p^{1/2}(1-p)^{1/2}(\bar s(x)-\bar s(y))\big]\Big\}
= (\bar s(x)-\bar s(y))\Big[1 + \frac{\lambda}{2}\,\frac{1-2p}{p^{1/2}(1-p)^{1/2}}\Big]. \tag{4.60}
\]

This derivative is negative for any negative (positive) value of $\lambda$, when $p$ is sufficiently close to zero (one). Hence, $h$ restricted to the first two moments of $\bar s$ cannot be consistent with stochastic dominance even for lotteries with only two outcomes. We are compelled to include the higher moments if we want to get away from expected utility but stay within Allais' framework. The most general $h$ that depends on the first three moments is given by

\[
h(F_{\bar s}) = \mu + \sigma f(m_3/\sigma^3). \tag{4.61}
\]

A linear approximation of $f$ by $\lambda + \gamma\, m_3/\sigma^3$ produces:

\[
h(F_{\bar s}) = \mu + \lambda\sigma + \gamma\, m_3/\sigma^2. \tag{4.62}
\]

This is essentially the same expression adopted by Hagen (1979).
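The sign argument for the mean-plus-dispersion form (4.59) can be checked numerically. The following is a minimal sketch for the two-outcome lottery used in (4.60), with the psychological values normalized to $\bar s(x) = 1$ and $\bar s(y) = 0$ and with $\lambda = \pm 0.5$; these numbers and the function names are hypothetical choices made for the illustration.

```python
import math

def h_two_moment(p, lam, s_x=1.0, s_y=0.0):
    """mu + lam*sigma for the lottery p*delta_{s(x)} + (1-p)*delta_{s(y)}, as in (4.59)."""
    gap = s_x - s_y
    mu = s_y + p * gap
    sigma = math.sqrt(p * (1.0 - p)) * gap
    return mu + lam * sigma

def slope_in_p(p, lam, s_x=1.0, s_y=0.0):
    """Right-hand side of (4.60); defined for 0 < p < 1."""
    return (s_x - s_y) * (1.0 + 0.5 * lam * (1.0 - 2.0 * p) / math.sqrt(p * (1.0 - p)))

# Negative lambda: the slope is negative near p = 0, so adding a small chance of
# the better outcome lowers h below the value of the worse outcome for sure.
print(slope_in_p(0.001, -0.5), h_two_moment(0.0001, -0.5), h_two_moment(0.0, -0.5))

# Positive lambda: the slope is negative near p = 1, so a lottery just short of
# certainty is valued above the better outcome for sure.
print(slope_in_p(0.999, 0.5), h_two_moment(0.9999, 0.5), h_two_moment(1.0, 0.5))
```

With $\lambda = 0$ the slope equals $\bar s(x) - \bar s(y) > 0$ for every $p$, which is the expected utility case.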
Hagen showed that the Allais type choice behavior is compatible with a positive $\gamma$ as long as the magnitude of $\lambda$ is not too large. The positive skewness dependence matches our intuition about people's preference, evident in the prevalence of lottery ticket purchase. The question of whether the above form of $h$ is consistent with stochastic dominance remains. Consider again the two-outcome lottery $F_{\bar s} = p\,\delta_{\bar s(x)} + (1-p)\,\delta_{\bar s(y)}$ with $\bar s(x)$ greater than $\bar s(y)$. The functional $h$ evaluated at $F_{\bar s}$ gives:

\[
h(p\,\delta_{\bar s(x)} + (1-p)\,\delta_{\bar s(y)}) = \bar s(y) + \big\{p + \lambda[p(1-p)]^{1/2} + \gamma[p^2+(1-p)^2]\big\}\,[\bar s(x)-\bar s(y)]. \tag{4.63}
\]

In order that the derivative of $h$ with respect to $p$ is always positive, we again require $\lambda$ to be zero. The dependence of $h$ on $\sigma$ is then subsumed in the denominator of the $m_3/\sigma^2$ term. The resultant $h$ without the $\lambda\sigma$ term is given by:

\[
h(p\,\delta_{\bar s(x)} + (1-p)\,\delta_{\bar s(y)}) = \bar s(y) + \big\{p + \gamma[p^2+(1-p)^2]\big\}\,[\bar s(x)-\bar s(y)]. \tag{4.64}
\]

As $p$ tends to one, $h$ converges to $\bar s(x) + \gamma[\bar s(x)-\bar s(y)]$. Yet, when $p$ equals one, $h$ is equal to $\bar s(x)$, which is less than $\bar s(x) + \gamma[\bar s(x)-\bar s(y)]$. This implies that the lottery $p\,\delta_x + (1-p)\,\delta_y$ is preferred to getting the higher amount $x$ for sure when $p$ is sufficiently close to unity. The conclusion that $h$ violates stochastic dominance still holds for negative $\gamma$ if we consider the behavior of $h$ as $p$ tends to zero.

Instead of extending our analysis to include higher moments, we shall pause to take stock of what we have learned. In place of the expectation of a utility function, Allais asserted that preference is represented by a functional that depends on the first moment, the dispersion and, in general, the shape of the distribution of a psychological value function, which is obtained from comparison among hypothetical changes in wealth position. Based on the property of the psychological value function as an interval scale, we obtained a restriction on the class of admissible preference functionals.
It turns out that the particular form of dependence on the moments of the psychological value considered by Allais is not admissible. Next, we considered those admissible functionals that depend only on the first two moments. The resultant functional form, the sum of the first moment and the standard deviation scaled by some constant, is shown to be inconsistent with stochastic dominance. Extending the analysis to include the third moment, we obtained the functional form that Hagen (1979) considered in a recent paper. This functional form is however again shown to violate stochastic dominance. It is not known whether the problem with stochastic dominance can be averted by incorporating even higher moments. The difficulty with the first three moments suggests however that the psychological value assumption would not lead to a "clean" way to characterize preferences that expected utility fails to capture. In the next sub-section, we introduce prospect theory, which represents a different approach to the problem of descriptive validity of expected utility first identified by Allais via the famed Allais paradox.

4.5.2 Prospect Theory

Prospect theory, developed by Kahneman and Tversky (1979), distinguishes two phases in the choice process: an editing phase, in which problem representation rules called editing operations are applied to the offered prospects, followed by an evaluation phase. We first introduce these editing operations before discussing their relation to the form of the evaluation functions.

Coding. The perception of outcomes in terms of gains and losses relative to some reference wealth level. This is usually taken to be the current asset position, in which case gains and losses are the actual amounts to be received or paid.

Cancellation. This refers to the possible discarding of common components that are shared by the offered prospects.
An example is the cancellation of a common probability-outcome pair, which is a restatement of Savage's sure-thing principle.

Segregation. An offered prospect with a riskless component, such as a minimum gain (loss), is decomposed into a riskless component and the prospect with the riskless component taken from each outcome.

Combination. The probabilities associated with equal outcomes are combined to yield a single outcome with probability given by the sum of the respective probabilities.

Detection of Dominance. The dominated prospects are eliminated from the choice set prior to evaluation.

Simplification. This refers to the possible simplification of prospects by rounding off probability or outcome values.

The coding and the cancellation operations are prompted by the empirical evidence described in Chapter 3 section 4. As a result of coding, the outcome values of offered prospects are always stated in terms of gains and losses. Although more remains to be known about the conditions under which cancellation applies, incorporating such an operation does yield an additional degree of freedom in the description of choice phenomena. Even less is known about the simplification operation, which seems to be a plausible problem representation heuristic.

The other editing operations are related to the evaluation phase, which concerns the way a value function $v(x)$ of the outcome $x$ and a decision weight function $\pi(p)$ of the probability $p$ of an outcome combine to give the overall value of a prospect. The value function is concave for gains and convex for losses. The $\pi$ function has the following properties:

1) $\pi$ increases from $\pi(0)=0$ to $\pi(1)=1$.
2) $\pi(p) > p$, for small $p$.
3) $\pi(p)+\pi(1-p) < 1$, for $p \in (0,1)$.
4) $\pi(pq)/\pi(p) < \pi(pqr)/\pi(pr)$, for $p, q, r \in (0,1]$.

Prospect theory is developed for simple prospects of the form

\[
P = p\,\delta_x + q\,\delta_y + (1-p-q)\,\delta_0, \tag{4.65}
\]

which have at most two non-zero outcomes.
If the outcomes are strictly positive (negative) and $p$ and $q$ add to unity, the simple prospect is known as a strictly positive (negative) prospect. A simple prospect is regular if it is neither strictly positive nor strictly negative. For regular prospects, the overall value $V$ is obtained from the scales $v$ and $\pi$ in the following manner:

\[
V(p\,\delta_x + q\,\delta_y + (1-p-q)\,\delta_0) = \pi(p)\,v(x) + \pi(q)\,v(y). \tag{4.66}
\]

Unlike the von Neumann-Morgenstern utility, the value function $v(x)$ is a ratio scale, i.e. $v$ vanishes at the reference point. This property was noted in Edwards (1961). There are however some difficulties with the use of the above expression, stemming from the nonlinearity of $\pi$. Since $\pi(p)+\pi(s-p)$ is not equal to $\pi(p')+\pi(s-p')$,

\[
V(p\,\delta_x + (s-p)\,\delta_x + (1-s)\,\delta_0) \ne V(p'\,\delta_x + (s-p')\,\delta_x + (1-s)\,\delta_0). \tag{4.67}
\]

In other words, two lotteries each yielding outcome $x$ with probability $s$ are not equivalent in preference. To get around this, the choice agent is assumed to apply the combination editing operation prior to evaluation.

The other difficulty necessitates the detection-of-dominance operation. Consider the following comparison:

\[
\pi(p)\,v(x) + \pi(s-p)\,v(x+\varepsilon) \;\; ? \;\; \pi(s)\,v(x). \tag{4.68}
\]

Suppose the L.H.S. is less (more) than the R.H.S. for $\varepsilon$ equal to zero; then the inequality would still hold for a small but positive (negative) $\varepsilon$. Thus, dominance is violated. Eliminating the dominated alternative before evaluation circumvents this problem, but not completely. Suppose

\[
\pi(p)\,v(x) + \pi(s-p)\,v(x+\varepsilon^0) < \pi(s)\,v(x), \tag{4.69}
\]

for some positive $\varepsilon^0$. We can find $q^0 < s$ such that

\[
\pi(p)\,v(x) + \pi(s-p)\,v(x+\varepsilon^0) < \pi(q^0)\,v(x+\varepsilon^0) < \pi(s)\,v(x). \tag{4.70}
\]

This has the implication that $p\,\delta_x + (s-p)\,\delta_{x+\varepsilon^0} + (1-s)\,\delta_0 \prec q^0\,\delta_{x+\varepsilon^0} + (1-q^0)\,\delta_0$ and $s\,\delta_x + (1-s)\,\delta_0 \succ q^0\,\delta_{x+\varepsilon^0} + (1-q^0)\,\delta_0$, which is both normatively and empirically untenable.

For the strictly positive and strictly negative prospects, we will run into the same problem with violations of dominance if we adopt expression (4.66).
Instead, Kahneman and Tversky proposed that people decompose a strictly positive (negative) prospect $P = p\,\delta_x + (1-p)\,\delta_y$ with $y > x > 0$ ($0 > x > y$) into a riskless component $\delta_x$ and a risky component $P' = p\,\delta_0 + (1-p)\,\delta_{y-x}$, and evaluate the segregated prospect via the following expression:

\[
V(p\,\delta_x + (1-p)\,\delta_y) = v(x) + \pi(1-p)\,[v(y)-v(x)]. \tag{4.71}
\]

The use of a different expression to evaluate strictly positive or strictly negative prospects leads to a new difficulty which is not covered by the editing operations. Consider the regular prospect $P = p\,\delta_x + q\,\delta_{x+\varepsilon} + (1-p-q)\,\delta_0$, with $x, \varepsilon > 0$. Then

\[
V(P) = \pi(p)\,v(x) + \pi(q)\,v(x+\varepsilon) < \pi(p)\,v(x) + \pi(1-p)\,v(x+\varepsilon).
\]

Since $v$ is concave for positive $x$,

\[
V(P) < \pi(p)\,v(x) + \pi(1-p)\,[v(x) + \varepsilon\,v'(x)]. \tag{4.72}
\]

Choose $\varepsilon_0$ such that

\[
\varepsilon_0 = \frac{v(x)}{v'(x)}\cdot\frac{1-\pi(p)-\pi(1-p)}{\pi(1-p)}. \tag{4.73}
\]

This implies that

\[
p\,\delta_x + q\,\delta_{x+\varepsilon_0} + (1-p-q)\,\delta_0 \prec \delta_x \qquad \forall\, q \in [0, 1-p).
\]

For the case $q = 1-p$, we apply expression (4.71) and obtain:

\[
V(P) = v(x) + \pi(1-p)\,[v(x+\varepsilon)-v(x)] > v(x). \tag{4.74}
\]

This implies that $p\,\delta_x + (1-p)\,\delta_{x+\varepsilon} \succ \delta_x$.

The above implication is rather pathological. No matter how close $q$ gets to $1-p$, the L.H.S. is strictly worse than $\delta_x$. Yet, when $q$ is equal to $1-p$, the L.H.S. is strictly preferred to $\delta_x$. We can get some feeling for the problem by estimating $\varepsilon_0$ for some reasonable values of $x$, $p$ and $\pi(p)$. Suppose $x = \$100$, $p = .50$ and $\pi(0.50) = .45$. (Note that property (3) of the $\pi$ function implies that $\pi(0.5) < 0.5$.) We have that

\[
100\cdot\frac{1-2\pi(0.5)}{\pi(0.5)} < \frac{v(100)}{v'(100)}\cdot\frac{1-2\pi(0.5)}{\pi(0.5)} = \varepsilon_0, \tag{4.75}
\]

since the concavity of $v$ implies that $x < v(x)/v'(x)$. A conservative estimate for $\varepsilon_0$ is then given by $\varepsilon_0 = 100 \times .1 \div .45 \approx 22$. Consider the following choice problems:

$P_1 \equiv .5\,\delta_{\$100} + .45\,\delta_{\$122} + .05\,\delta_{\$0}$ vs. $\delta_{\$100}$
$P_2 \equiv .5\,\delta_{\$100} + .49\,\delta_{\$122} + .01\,\delta_{\$0}$ vs. $\delta_{\$100}$
$P_3 \equiv .5\,\delta_{\$100} + .499\,\delta_{\$122} + .001\,\delta_{\$0}$ vs. $\delta_{\$100}$

Prospect theory would predict that $P_1$, $P_2$, $P_3$ are all strictly worse than $\delta_{\$100}$. Moreover, the direction of preference remains unchanged as long as the probability of obtaining \$122 is less than 0.5!
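The jump at $q = 1-p$ can be made concrete with a short sketch. It uses the figures from the text ($x = \$100$, $\varepsilon = \$22$, $p = .5$, $\pi(0.5) = .45$) together with a hypothetical concave value function $v(x) = \sqrt{x}$, which is an assumption of this example only. Because $\pi$ is increasing, $\pi(q) \le \pi(0.5)$ for every $q < .5$, so a single upper bound covers $P_1$, $P_2$ and $P_3$ without specifying $\pi$ elsewhere.

```python
import math

PI_HALF = 0.45            # decision weight pi(0.5), as in the text

def value(x):
    """Hypothetical concave value function for gains with value(0) = 0."""
    return math.sqrt(x)

x, eps = 100.0, 22.0

# Regular prospect .5*delta_x + q*delta_{x+eps} + (.5 - q)*delta_0 with q < .5:
# by (4.66) and pi(q) <= pi(.5), its value is bounded above by
regular_upper_bound = PI_HALF * value(x) + PI_HALF * value(x + eps)

# Strictly positive prospect .5*delta_x + .5*delta_{x+eps}, evaluated by (4.71):
segregated_value = value(x) + PI_HALF * (value(x + eps) - value(x))

sure_value = value(x)     # value of receiving x for certain

print(regular_upper_bound, sure_value, segregated_value)
```

Under these assumptions every regular prospect with $q < .5$ is valued below the sure \$100, while the prospect with $q = .5$ exactly is valued above it.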
Note that the inequality $x < v(x)/v'(x)$ used to obtain $\varepsilon_0$ is highly conservative, since $v'(x)$ is a decreasing function for a concave $v(x)$. We can arrive at the above conclusion with a generally much higher $\varepsilon_0$ given a specific $v(x)$. If $v$ is bounded, then $v'(x)$ tends to zero so that $\varepsilon_0$ can be made arbitrarily large if we consider a sufficiently large $x$.

Prospect theory builds on the form of the evaluation function first suggested by Edwards (1955). It treats systematically several classes of choice phenomena (cf. Chapter 3) that violate the implications of expected utility. However, the nonlinearity of the decision weight function $\pi$, which constitutes its main deviation from expected utility, generates some serious difficulties for prospect theory. The immediate ones, namely the violation of the combination principle and of stochastic dominance, were circumvented via the combination and the detection-of-dominance editing operations. Two problems however remain. One of the problems is the implication that there is always a prospect $Q$ which is strictly worse than some prospect $P$ but strictly better than another prospect $P'$ that dominates $P$. The other problem has to do with the use of a different evaluation function for the strictly positive (negative) prospects. This gives rise to a discontinuity in the preference represented, which leads to untenable predictions of actual choice behavior.

4.5.3 Comparison

We conclude the section with a comparison of alpha utility with Allais' theory and prospect theory. The alpha utility representation $\Omega(F)$ can be stated in terms of the expectation of a "value" function, $v$, with respect to an '$\alpha$-weighted' distribution $F^{\alpha}$ given by:

\[
F^{\alpha}(x) = \int_{-\infty}^{x} \alpha\, dF \Big/ \int_R \alpha\, dF,
\]

where $\alpha$ is a strictly positive function. The role of $\alpha$ becomes clearer if we consider a simple distribution, $F = \sum_{i=1}^{n} p_i \delta_{x_i}$. In this case,

\[
\Omega(F) = \int_R v\, dF^{\alpha} = \sum_{i=1}^{n} q_i(F)\, v(x_i), \qquad \text{where } q_i(F) = p_i\,\alpha(x_i) \Big/ \sum_{j=1}^{n} p_j\,\alpha(x_j).
\]

Like prospect theory, $\Omega(F)$ is obtained from the $v(x_i)$'s via a set of "decision weights", $\{q_i\}_{i=1}^{n}$, with each $q_i$ being a nonlinear function of the probability $p_i$ of obtaining the $i$th outcome, $x_i$. We should however note three distinctions. The $q_i$ weights sum to unity, but the $\pi(p_i)$ weights of prospect theory do not. In addition to $p_i$, $q_i$ depends on the rest of the $p_j$'s and all the $x_j$'s. Finally, the $q_i$-weights have the combination property, since if the $j$th and $k$th outcomes are both equal to $x$, then

\[
\Omega\Big(p_j\delta_x + p_k\delta_x + \sum_{i\ne j,k} p_i\delta_{x_i}\Big)
= \frac{[p_j\alpha(x) + p_k\alpha(x)]\,v(x) + \sum_{i\ne j,k} p_i\,\alpha(x_i)\,v(x_i)}{\sum_{i=1}^{n} p_i\,\alpha(x_i)}
= \frac{(p_j+p_k)\,\alpha(x)\,v(x) + \sum_{i\ne j,k} p_i\,\alpha(x_i)\,v(x_i)}{\sum_{i=1}^{n} p_i\,\alpha(x_i)}
= \Omega\Big((p_j+p_k)\delta_x + \sum_{i\ne j,k} p_i\delta_{x_i}\Big).
\]

Some of Allais' ideas can also be expressed in terms of the alpha utility representation. We can write $\Omega(F)$ as the sum of $\bar v$, the first moment of $v$ with respect to $F$, and the deviation term $\int_R (v-\bar v)\,dF^{\alpha}$ shown below:

\[
\Omega(F) = \bar v + \int_R (v-\bar v)\, dF^{\alpha}, \qquad \text{where } \bar v = \int_R v\, dF.
\]

Thus, the preference of an alpha utility decision maker depends on the first moment of $v$, and also on the distribution of the deviation, $(v-\bar v)$, of $v$ from $\bar v$ through the $\alpha$-weighted distribution $F^{\alpha}$.

According to the proofs of the representation theorems in Chapters 1 and 2, the $v$-function, much like the von Neumann-Morgenstern utility, can be constructed from preference comparisons using a standard lottery. Allais may insist however that $v$ be a psychological value function that is determined from introspective comparison among hypothetical changes in wealth position. In this case, we can apply the principle of Occam's razor to weed out the "psychological value" assumption since it is redundant.

Finally, we summarize via Table 4.2 some of the salient features of the theories treated in this chapter. A '+' sign means the corresponding theory is compatible with the property referred to. Otherwise, we use a '-' sign.
Properties         Transitivity   Dominance   Combination   Continuity   Allais Paradox
Expected Utility        +             +            +             +             -
Alpha Utility           +             +            +             +             +
Allais Theory           +             -            +             +             +
Prospect Theory         -             +            +             -             +

Table 4.2: Comparison among Theories

A novelty of prospect theory is the explicit use of editing operations prior to evaluation. Two of these operations are however needed to ensure consistency with dominance and combination. If Allais' theory were to adopt the detection-of-dominance editing operation, it would exhibit systematic intransitivity, as is the case for prospect theory (cf. expression (4.70)). Since lotteries are stated in terms of gains and losses, the coding operation seems a reasonable hypothesis about how people perceive monetary outcomes. Kahneman and Tversky (1979) have provided preliminary empirical evidence in support of two other editing operations, namely cancellation and segregation. These may be adopted by alpha utility theory if future empirical studies ascertain their validity as behavioral hypotheses.

5. CONCLUSION

5.1 SUMMARY

Part I of this dissertation contains the statements and proofs of two representation theorems. The first generalizes the quasilinear mean, $M_{\phi}$, of Hardy, Littlewood and Polya:

\[
M_{\phi}(F) = \phi^{-1}\Big(\int_R \phi\, dF\Big), \tag{5.1}
\]

where $\phi$ is continuous and strictly monotone, and $F$ is a probability distribution. We have weakened a characteristic property of the quasilinear mean, $M_{\phi}$, called quasilinearity, to obtain a more general mean value, $M_{\alpha\phi}$, which is characterized by an additional function $\alpha$. The $M_{\alpha\phi}$ mean of a probability distribution $F$ has the following form:

\[
M_{\alpha\phi}(F) = \phi^{-1}\Big(\int_R \alpha\phi\, dF \Big/ \int_R \alpha\, dF\Big), \tag{5.2}
\]

where $\alpha$ is continuous and strictly positive (negative). Through the strict inequality binary relation, $<$, the mean induces a binary relation, $\prec$, among probability distributions:

\[
F \prec G \iff M_{\alpha\phi}(F) < M_{\alpha\phi}(G). \tag{5.3}
\]

The R.H.S. is equivalent to

\[
\int_R \alpha\phi\, dF \Big/ \int_R \alpha\, dF < \int_R \alpha\phi\, dG \Big/ \int_R \alpha\, dG, \tag{5.4}
\]

for a strictly increasing $\phi$.
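For a finite distribution the mean (5.2) reduces to a ratio of two weighted sums, which the following minimal sketch evaluates. The outcomes, probabilities, $\phi = \sqrt{\cdot}$ and $\alpha(x) = 1/x$ are hypothetical choices made for the illustration, and `weighted_mean` is an illustrative name rather than notation from the thesis.

```python
import math

def weighted_mean(outcomes, probs, alpha, phi, phi_inv):
    """M_{alpha phi}(F) of (5.2) for a finite distribution F = sum_i p_i * delta_{x_i}."""
    num = sum(p * alpha(x) * phi(x) for x, p in zip(outcomes, probs))
    den = sum(p * alpha(x) for x, p in zip(outcomes, probs))
    return phi_inv(num / den)

outcomes, probs = [1.0, 4.0, 9.0], [0.5, 0.3, 0.2]

# Constant alpha: the mean reduces to the quasilinear mean (5.1) with phi = sqrt.
quasilinear = weighted_mean(outcomes, probs, lambda x: 1.0, math.sqrt, lambda u: u * u)

# Hypothetical non-constant, strictly positive alpha that favors small outcomes.
general = weighted_mean(outcomes, probs, lambda x: 1.0 / x, math.sqrt, lambda u: u * u)

print(quasilinear, general)
```

Both values lie between the smallest and the largest outcome, and the non-constant $\alpha$ pulls the mean toward the outcomes it weights more heavily.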
Note that the ordering represented by (5.4) is more general than that represented by the expected utility representation, $\int_R \phi\,dF$. For convenience, we use $\Omega(F)$ to label the representation functional, $\int_R \alpha\phi\,dF / \int_R \alpha\,dF$.

An alternative approach to obtain the representation (5.4) is given in Chapter 2. Instead of probability distributions, we consider simple probability measures defined on some arbitrary set $X$. (A simple probability measure is a convex linear combination of a finite number of point masses in $X$.) Axioms were stated directly in terms of a binary relation, $\prec$, to obtain the corresponding $\Omega$ representation for simple probability measures:

$$\Omega(P) = E(\alpha v, P) / E(\alpha, P), \tag{5.5}$$

where $P$ is a simple probability measure on a set $X$, and $\alpha$ and $v$ are real-valued functions on $X$. This is contrasted with the approach based on axioms on mean values in Chapter 1. The reason why the mean value approach cannot be extended to simple probability measures on an arbitrary set $X$ is that the mean value itself may not be defined in $X$. As an example, consider the outcome set $X$ = {status quo, being promoted, being fired}. Being able to deal with an arbitrary outcome set such as the above example is an advantage of the simple probability measure approach. There are, however, certain drawbacks. Without further structural assumptions on $X$, we cannot discuss the notions of continuity and differentiability of the $\alpha$ and $v$ functions, nor can we generalize the $\Omega$ representation to include more general probability measures.

Part II of the dissertation concerns one specific area of application: decision theory. Interpreting the $M_{\alpha\phi}(F)$ mean of Chapter 1 as the certainty equivalent of a monetary lottery $F$, the corresponding induced binary relation, $\prec$, has the natural interpretation as 'strict preference' between lotteries. For non-monetary (finite) lotteries, we apply the representation theorem of Chapter 2.
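To make representation (5.5) concrete on a non-numeric outcome set, here is a minimal sketch using the three outcomes named above. The numerical values assigned to $v$ and $\alpha$, and the helper name `omega`, are purely hypothetical; they only illustrate how the ratio is formed.

```python
def omega(P, alpha, v):
    """Omega(P) = E(alpha*v, P) / E(alpha, P) for a simple probability measure.

    P, alpha and v are dictionaries keyed by outcome (hypothetical encoding).
    """
    num = sum(p * alpha[x] * v[x] for x, p in P.items())
    den = sum(p * alpha[x] for x, p in P.items())
    return num / den

# Assumed value and weight functions on X = {status quo, promoted, fired}.
v = {"being fired": 0.0, "status quo": 0.5, "being promoted": 1.0}
alpha = {"being fired": 2.0, "status quo": 1.0, "being promoted": 1.0}
unit_alpha = {x: 1.0 for x in v}   # constant alpha gives expected utility

P = {"status quo": 1.0}                            # a sure outcome
Q = {"being fired": 0.5, "being promoted": 0.5}    # a 50-50 gamble

omega_P = omega(P, alpha, v)
omega_Q = omega(Q, alpha, v)
expected_utility_Q = omega(Q, unit_alpha, v)
```

With these assumed numbers an expected utility maximizer is indifferent between P and Q (both evaluate to 0.5), whereas the alpha utility value of Q drops to 1/3 because the outcome "being fired" carries twice the weight, so Q is ranked strictly below P.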
The hypothesis that the $\Omega$ representation of either approach represents a choice agent's preference among lotteries is referred to as alpha utility theory. This is logically equivalent to saying that the choice agent obeys either the mean value (certainty equivalent) axioms or the axioms on the '$\prec$' (strict preference) binary relation.

Alpha utility theory is a generalization of expected utility theory in the sense that the expected utility representation is a special case of the alpha utility representation, and that alpha utility assumes a weaker form of a key property of expected utility, called substitutability (Pratt, Raiffa and Schlaifer, 1964), which is essentially the same as the quasilinearity property of Hardy, Littlewood and Polya. (A close counterpart of substitutability is the strong independence principle of Marschak (1950) and Samuelson (1952).)

The motivation for generalizing expected utility comes from difficulties it faced in the description of certain choice phenomena, especially the Allais paradox. These are summarized in Chapter 3. Chapter 4 contains the formal statements of assumptions and the derivations of normative and descriptive implications of alpha utility theory. We stated conditions, taken from Chapter 1, for consistency with stochastic dominance and global risk aversion, and derived a generalized Arrow-Pratt index of local risk aversion. We also demonstrated how a pair of $\alpha$ and $v$ functions that satisfy both stochastic dominance and local risk aversion can be consistent with those choice phenomena, summarized in Chapter 3, that contradict the implications of expected utility. The chapter ended with a comparison of alpha utility with two other theories that have attracted attention, namely Allais' theory and prospect theory.

5.2 EXTENSIONS

We conclude by pointing out some potential areas of application.
The quasilinear mean was given an interpretation as the 'equally-distributed-equivalent' level of income corresponding to an income distribution by Atkinson (1970) in his paper on the measurement of inequality. We may use the $M_{\alpha\phi}$ mean as a more general model of equally-distributed-equivalent income. Is there any need for a more general measure? Consider two societies with income distributions $F$ and $G$ given by:

$$F \equiv 0.50\,\delta_{\$1{,}000} + 0.50\,\delta_{\$2{,}000},$$

and

$$G \equiv 0.50\,\delta_{\$1{,}000} + 0.49\,\delta_{\$2{,}000} + 0.01\,\delta_{\$1{,}000{,}000}.$$

To many (including probably Mao Tse-Tung), society $F$ fares better than society $G$. But the $G$ distribution stochastically dominates the $F$ distribution, so that $M_{\phi}(F)$ is less than $M_{\phi}(G)$. Therefore, the $M_{\phi}$ measure fails to reflect the relative welfare of the two societies for those who believe that society $F$ is better off. This does not pose any difficulty for the $M_{\alpha\phi}$ measure, since consistency with stochastic dominance is not an intrinsic property.

The departure of $M_{\alpha\phi}$ from $M_{\phi}$ can be made clearer if we consider a society of $N$ individuals with incomes $\{x_i\}_{i=1}^{N}$. The corresponding equally-distributed-equivalent is given by:

$$M_{\alpha\phi}(F) = \phi^{-1}\Big(\sum_{i=1}^{N} \alpha(x_i)\phi(x_i) \Big/ \sum_{i=1}^{N} \alpha(x_i)\Big). \tag{5.6}$$

A remarkable feature of (5.6) is the presence of complementarity across incomes of different individuals. That this is a desirable property is reinforced by the fact that an individual perceives concurrently the incomes of other individuals in the distribution, whereas only one of a set of mutually exclusive outcomes will obtain in a lottery. We can think of the role of $\alpha$ as assigning discriminatory weights to individuals based on their attained incomes. Mao Tse-Tung's (hypothetical) preference for society $F$ may then be explained in terms of a decreasing discrimination function $\alpha$ that treats wealthy individuals less 'equally' than the poorer folks.
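The two-society comparison can be checked numerically. The sketch below assumes the distributions put shares 0.50/0.50 on $1,000 and $2,000 for F, and 0.50/0.49/0.01 on $1,000, $2,000 and $1,000,000 for G. It further assumes $\phi(x) = x$ and a decreasing weight $\alpha(x) = x^{-2}$; that particular $\alpha$ is an illustrative choice, not one proposed in the text.

```python
def alpha_mean(incomes, shares, alpha):
    """Equally-distributed-equivalent income with phi(x) = x (an assumption).

    Computes sum(share * alpha(x) * x) / sum(share * alpha(x)).
    """
    num = sum(s * alpha(x) * x for x, s in zip(incomes, shares))
    den = sum(s * alpha(x) for x, s in zip(incomes, shares))
    return num / den

F = ([1_000.0, 2_000.0], [0.50, 0.50])
G = ([1_000.0, 2_000.0, 1_000_000.0], [0.50, 0.49, 0.01])

def constant(x):
    return 1.0          # quasilinear case: the arithmetic mean here

def decreasing(x):
    return x ** -2      # assumed weight that discounts high incomes

ede_F_quasi = alpha_mean(*F, constant)     # 1500
ede_G_quasi = alpha_mean(*G, constant)     # 11480
ede_F_alpha = alpha_mean(*F, decreasing)   # 1200
ede_G_alpha = alpha_mean(*G, decreasing)   # slightly below 1200
```

With a constant weight, G ranks far above F, as stochastic dominance requires. Under the assumed decreasing weight the ranking reverses, with F at 1,200 and G a little under 1,197, which is the kind of preference for society F described above.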
In a recent paper on the measurement of poverty, Blackorby and Donaldson (1978b) applied Atkinson's equally-distributed-equivalent to the 'censored' distribution, i.e., the income distribution truncated at some exogenously established poverty line. This is tantamount to having an $M_{\alpha\phi}$ measure with an $\alpha$ that is constant up to the poverty line and zero beyond. It is natural to suggest that a decreasing $\alpha$ with an inflexion point at the poverty line (see Figure 5.1) would integrate the contribution due to the whole distribution and at the same time be particularly representative of the poorer folks with incomes below the poverty line.

[Fig. 5.1: An alpha function that discriminates against the rich. Horizontal axis: income, with the poverty line marked.]

The $M_{\alpha\phi}$ mean can also be used to generate measures of inequality as follows:

Relative Inequality Index $= 1 - M_{\alpha\phi}/\mu$, and
Absolute Inequality Index $= \mu - M_{\alpha\phi}$,

where $\mu$ denotes the arithmetic mean. The relative index is due to Atkinson (1970) and the absolute index is due to Kolm (1976a,b). When the distribution of income is completely equal, both indices equal zero. The coefficient of variation, an occasional measure of income inequality, is related to $M_{\alpha\phi}$ by:

$$\text{coefficient of variation} = \Big\{\frac{M_{1,1}(F) - \mu}{\mu}\Big\}^{1/2}, \tag{5.7}$$

where

$$M_{s,t}(F) = \Big\{\int_0^{\infty} x^{s+t}\,dF \Big/ \int_0^{\infty} x^{s}\,dF\Big\}^{1/t}$$

(i.e. $\alpha(x) = x^{s}$ and $\phi(x) = x^{t}$). As an equally-distributed-equivalent, $M_{1,1}$ is undesirable because it is always higher than the arithmetic mean except at equality, and so encourages inequality. Also, its weighting function $\alpha(x) = x$ assigns progressively more weight to the more wealthy.

The $M_{s,t}$ mean suggested above may be of use in statistics. We offer some examples. For a distribution $F$ measured in deviations from its arithmetic mean,

$M_{2,1}(F)$ = standard deviation $\cdot$ coefficient of skewness, and
$M_{2,2}(F)$ = standard deviation $\cdot$ (coefficient of kurtosis + 3)$^{1/2}$.
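The link between the coefficient of variation and the $M_{s,t}$ family can be verified on a small sample. The sketch below reads relation (5.7) as the square root of $(M_{1,1} - \mu)/\mu$, a reading that follows from $M_{1,1} = \mu + \sigma^2/\mu$, and treats the income list as equally weighted. The incomes, and the use of the harmonic mean $M_{0,-1}$ as the equally-distributed-equivalent in the two indices, are assumptions for illustration only.

```python
import math

def m_st(xs, s, t):
    """M_{s,t} for equally weighted values xs: (mean x^(s+t) / mean x^s)^(1/t)."""
    n = len(xs)
    num = sum(x ** (s + t) for x in xs) / n
    den = sum(x ** s for x in xs) / n
    return (num / den) ** (1.0 / t)

incomes = [10.0, 20.0, 30.0, 40.0]          # hypothetical incomes
mu = sum(incomes) / len(incomes)
sigma = math.sqrt(sum((x - mu) ** 2 for x in incomes) / len(incomes))

cv_direct = sigma / mu
cv_from_mean = math.sqrt((m_st(incomes, 1, 1) - mu) / mu)

# Inequality indices built on an assumed equally-distributed-equivalent,
# here the harmonic mean M_{0,-1} (constant alpha, phi(x) = 1/x).
ede = m_st(incomes, 0, -1)
relative_index = 1.0 - ede / mu
absolute_index = mu - ede

# Both indices vanish when incomes are completely equal.
equal_relative = 1.0 - m_st([25.0, 25.0], 0, -1) / 25.0
```

For this sample $M_{1,1}$ equals 30 against an arithmetic mean of 25, which illustrates why it would be an unattractive equally-distributed-equivalent: it exceeds the arithmetic mean whenever incomes differ.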
An equality that generalizes the well known result that the product of the arithmetic mean and harmonic mean is equal to the square of the geometric mean for two positive numbers may be stated in terms of $M_{-t,2t}$ as follows:

$$M_{-t,2t}(F) = M_{\log x}(F) \quad \text{(geometric mean)}, \tag{5.8}$$

where $F = \sum_{i=1}^{N} \frac{1}{N}\,\delta_{x_i}$, with $x_i > 0$, when the frequency polygon of $\log x_i$ is symmetrical about the axis of ordinate at $\frac{1}{N}\sum_{i=1}^{N} \log x_i$. Canning (1934) proposed the use of

$$\{M_{-t,2t} / M_{\log x}\} \tag{5.9}$$

as a descriptive measure of asymmetry.

Harsanyi (1977) applied expected utility to an individual making a moral judgement about alternative social situations. Making a moral judgement, in this case, means making a hypothetical best choice under the assumption that the individual assumes the position of any one member of the society with equal chance. A social situation $X$, which is a listing of the $N$ persons' states, would be perceived as a lottery that assigns the individual to any particular individual's position with $1/N$ chance. An expected utility decision maker would then maximize the expected utility corresponding to social situation $X$. This is simply the arithmetic average, $\sum_{i=1}^{N} u_i(X)/N$, of his von Neumann-Morgenstern utility, $u_i(X)$, for taking the position of the $i$th person. An alpha utility decision maker will, however, maximize a slightly more general expression:

$$\sum_{i=1}^{N} \alpha_i(X)\,v_i(X) \Big/ \sum_{i=1}^{N} \alpha_i(X), \tag{5.10}$$

where $\alpha_i(X)$ denotes his $\alpha$ value for being in the $i$th person's shoes within the social situation $X$. This is a weighted average of the $v_i(X)$'s with weights given by the $\alpha_i(X)$'s. Expression (5.10) may be interpreted as the average of the $v_i(X)$'s, but the alpha utility decision maker uses a 'biased' estimate, $\alpha_i(X) / \sum_{i=1}^{N} \alpha_i(X)$, for his probability of being the $i$th person. Again, the alpha utility expression has the advantage of allowing for complementarity, which is supported by our intuition about how moral judgements are made.

Several questions remain untouched.
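Returning to equality (5.8) and Canning's ratio (5.9), a short numerical check may help. The sketch assumes equally weighted samples and reads $M_{-t,2t}$ as $\{\text{mean}(x^{t}) / \text{mean}(x^{-t})\}^{1/(2t)}$. The sample values and the helper names are hypothetical.

```python
import math

def m_st(xs, s, t):
    """M_{s,t} for equally weighted values xs: (mean x^(s+t) / mean x^s)^(1/t)."""
    n = len(xs)
    num = sum(x ** (s + t) for x in xs) / n
    den = sum(x ** s for x in xs) / n
    return (num / den) ** (1.0 / t)

def geometric_mean(xs):
    return math.exp(sum(math.log(x) for x in xs) / len(xs))

# Two positive numbers: sqrt(arithmetic mean * harmonic mean) = geometric mean.
pair = [4.0, 9.0]
pair_mean = m_st(pair, -1, 2)

# Logs symmetric about 2, so (5.8) should hold for every t tried here.
symmetric = [math.exp(y) for y in (1.0, 2.0, 3.0)]
symmetric_means = [m_st(symmetric, -t, 2 * t) for t in (0.5, 1.0, 2.0)]

# Canning's ratio (5.9) departs from 1 once the logs are asymmetric.
skewed = [1.0, 1.0, 8.0]
asymmetry_ratio = m_st(skewed, -1, 2) / geometric_mean(skewed)
```

For the pair (4, 9) the arithmetic mean is 6.5 and the harmonic mean is 72/13, whose product is 36, so the computed mean is 6. The skewed sample gives a ratio of roughly 1.08, which is how the ratio would flag asymmetry in the logarithms.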
There are other situations where the representation functions have the additive structure of expected utility. One example is the averaging of individual utility functions using a set of fixed weights to obtain a group utility function (Keeney and Raiffa, 1976). Another example is the time-honored practice of discounting a time stream of utilities (Koopmans, 1972). The intuition that complementarity should not be ruled out is strong in both cases. For these and other similar examples, the alpha utility representation provides a useful first candidate for a departure from the additive structure in order to incorporate complementarity across attributes.

The last decade saw a tremendous growth in the literature on the microeconomics of uncertainty. Practically all of these works are based on the assumption that choice agents maximize expected utility. We have not investigated what new results would follow or what old results are robust if we introduce alpha utility choice agents. One example is comparative risk aversion. The condition for one expected utility maximizer to always be more risk averse than another is given in Pratt (1964). An equivalent result appeared in Hardy, Littlewood and Polya (1934) in the form of an inequality between two quasilinear means. The corresponding result for alpha utility choice agents or the $M_{\alpha\phi}$ mean, however, remains an open question.

Another example is Samuelson's (1967) conjecture, which was confirmed by Kalai and Schmeidler (1977), that Arrow's impossibility theorem will hold in a cardinal setup where individuals and society express their preferences by von Neumann-Morgenstern utility functions. It is perhaps safe to conjecture that Arrow's impossibility theorem would still hold if individuals and society express their preferences via alpha utility functions.

REFERENCES

Aczel, J. (1966), Functional Equations and Their Application, Academic Press, New York and London.
Allais, M. (1953), "Le comportement de l'homme rationnel devant le risque," Econometrica 21, No. 4, 503-546.
Allais, M. (1979), "The So-Called Allais Paradox and Rational Decisions Under Uncertainty," in M. Allais and O. Hagen, eds., Expected Utility Hypothesis and the Allais Paradox, Reidel Publishing Company, 437-682.
Allais, M. and O. Hagen (1979), Expected Utility Hypothesis and the Allais Paradox, D. Reidel Publishing Co.
Anscombe, F.J. and R.J. Aumann (1963), "A Definition of Subjective Probability," Annals of Mathematical Statistics, Vol. 34, No. 1, 199-205.
Arrow, K.J. (1971), Theory of Risk Bearing, Markham Publishing Company, Chicago.
Atkinson, A.B. (1970), "On the Measurement of Inequality," Journal of Economic Theory 2, 244-263.
Becker, G.M. and C.G. McClintock (1967), "Value: Behavioral Decision Theory," Annual Review of Psychology, Vol. 18, 239-286.
Bernoulli, D. (1954), "Specimen theoriae novae de mensura sortis," Commentarii academiae scientiarum imperialis Petropolitanae, Vol. 5, 175-192; translated as "Exposition of a New Theory on the Measurement of Risk," Econometrica, Vol. 22, 23-26.
Ben-Tal, A. (1977), "On Generalized Means and Generalized Convex Functions," Journal of Optimization Theory and Applications, Vol. 21, No. 1, January, 1-13.
Blackorby, C. and D. Donaldson (1978a), "A Theoretical Treatment of Ethical Indices of Absolute Inequality," forthcoming in International Economic Review, Department of Economics Discussion Paper 78-03, University of British Columbia.
Blackorby, C. and D. Donaldson (1978b), "Ethical Indices for the Measurement of Poverty," Department of Economics Discussion Paper 78-04, University of British Columbia.
Blackwell, D. (1951), "Comparison of Experiments," Proceedings of Symposium on Mathematical Statistics and Probability, 2nd, Berkeley, J. Neyman and L.M. Lecam, eds., University of California Press, 93-102.
Blackwell, D. and M.A. Girshick (1954), Theory of Games and Statistical Decisions, Wiley and Sons, Inc.
Canning, J.B. (1934), "A Theorem Concerning a Certain Family of Averages of a Certain Type of Frequency Distribution," Econometrica 2, 442. The abstract is a reporter's summary of an unpublished paper.
Chew, S.H. (1979), "A Generalization of the Quasilinear Mean of Hardy, Littlewood and Polya," Institute of Applied Mathematics and Statistics Technical Report 79-39, University of British Columbia.
Chew, S.H. and K.R. MacCrimmon (1979a), "Alpha Utility Theory: A Generalization of Expected Utility," Faculty of Commerce and Business Administration Working Paper #669, University of British Columbia.
Chew, S.H. and K.R. MacCrimmon (1979b), "Alpha Utility, Lottery Composition and the Allais Paradox," Faculty of Commerce and Business Administration Working Paper #686, University of British Columbia.
Dalton, H. (1920), "The Measurement of Inequality of Incomes," Economic Journal 20, 348-361.
DeGroot, M.H. (1970), Optimal Statistical Decisions, McGraw-Hill Book Co., New York.
Diamond, P. and M. Rothschild (1978), Uncertainty in Economics, Academic Press, New York.
Edwards, W. (1954), "The Theory of Decision Making," Psychological Bulletin, Vol. 51, 380-417.
Edwards, W. (1955), "The Prediction of Decisions Among Bets," Journal of Experimental Psychology 60, 265-277.
Edwards, W. (1961), "Behavioral Decision Theory," Annual Review of Psychology 12, 473-498.
Fishburn, P.C. (1970), Utility Theory for Decision Making, John Wiley, New York.
Friedman, M. and L.J. Savage (1948), "The Utility Analysis of Choice Involving Risk," Journal of Political Economy, Vol. 56, 279-304.
Frisch, R. (1926), "Sur un problème d'économie pure," Norsk Matematisk Forenings Skrifter, 1.
Hadar, J. and W.R. Russell (1969), "Rules for Ordering Uncertain Prospects," AER 49, 25-34.
Hagen, O. (1979), "Towards a Positive Theory of Preferences Under Risk," in Expected Utility Hypothesis and the Allais Paradox, M. Allais and O. Hagen, eds., Reidel Publishing Company, 271-302.
Handa, J. (1977), "Risk, Probabilities and a New Theory of Cardinal Utility," Journal of Political Economy, Vol. 85, No. 1, February, 97-122.
Hanoch, G. and C. Levy (1969), "Efficiency Analysis of Choices Involving Risk," Review of Economic Studies 36, 335-346.
Hardy, G.H., J.E. Littlewood and G. Polya (1934), Inequalities, Cambridge University Press.
Harsanyi, J.C. (1977), Rational Behavior and Bargaining Equilibrium in Games and Social Situations, Cambridge: Cambridge University Press.
Herstein, I.N. and J. Milnor (1953), "An Axiomatic Approach to Measurable Utility," Econometrica, Vol. 21(2), April.
Howard, R.A. (1964), "The Foundations of Decision Analysis," IEEE Trans. SSC-4:211-19.
Jensen, N.E. (1967), "An Introduction to Bernoullian Utility. I. Utility Functions," Swedish Journal of Economics.
Kahneman, D. and A. Tversky (1979), "Prospect Theory: An Analysis of Decision Under Risk," Econometrica, March, 263-291.
Kalai, E. and D. Schmeidler (1977), "Aggregation Procedure for Cardinal Preferences: A Formulation and Proof of Samuelson's Impossibility Conjecture," Econometrica 45, No. 6, September, 1431-1437.
Karmarkar, U.S. (1978), "Subjectively Weighted Utility: A Descriptive Extension of Expected Utility Model," OBHP, Vol. 21, 61-72.
Keeney, R.L. and H. Raiffa (1976), Decisions with Multiple Objectives: Preferences and Value Tradeoffs, Wiley and Sons, Inc.
Kolm, S.Ch. (1976a), "Unequal Inequalities I," Journal of Economic Theory 12, June, 416-442.
Kolm, S.Ch. (1976b), "Unequal Inequalities II," Journal of Economic Theory 13, August, 82-111.
Koopmans, T.C. (1972), "Representation of Preference Orderings Over Time," in Decision and Organization, C.B. McGuire and R. Radner, eds., North Holland Publishing Company, Amsterdam.
Krantz, D.H., D.R. Luce and A. Tversky (1971), Foundations of Measurement Vol. 1, Academic Press, New York and London.
Lehmann, E.L. (1955), "Ordered Families of Distributions," Annals of Mathematical Statistics 26, 399-419.
Lichtenstein, S. and P. Slovic (1971), "Reversal of Preference Between Bids and Choices in Gambling Decisions," Journal of Experimental Psychology 89, 46-55.
Luenberger, D.G. (1969), Optimization by Vector Space Methods, New York: John Wiley and Sons.
MacCrimmon, K.R. (1965), "An Experimental Study of the Decision-Making Behavior of Business Executives," unpublished dissertation, University of California, Los Angeles.
MacCrimmon, K.R. (1968), "Descriptive and Normative Implications of the Decision Theory Postulates," in K. Borch and J. Mossin, eds., Risk and Uncertainty, St. Martin's Press, New York, 3-32.
MacCrimmon, K.R., John F. Bassler and William T. Stanbury (1972), Unpublished Results, Risk Study Project, University of British Columbia.
MacCrimmon, K.R. and S. Larsson (1979), "Utility Theory: Axioms Versus 'Paradoxes'," in M. Allais and O. Hagen, eds., Expected Utility and the Allais Paradox, Holland: D. Reidel.
Machina, M.J. (1980), "Expected Utility Analysis Without the Independence Axiom," unpublished working paper, Department of Economics, University of California - San Diego.
Markowitz, H. (1952), "The Utility of Wealth," Journal of Political Economy, Vol. 60, 151-158.
Marschak, J. (1950), "Rational Behavior, Uncertain Prospects, and Measurable Utility," Econometrica 18, 111-141.
Marschak, J. and R. Radner (1972), Economic Theory of Teams, Yale University Press.
Meginniss, J.R. (1977), "Alternatives to the Expected Utility Rule," unpublished Ph.D. dissertation, University of Chicago.
von Neumann, J. and O. Morgenstern (1947), Theory of Games and Economic Behavior, Princeton University Press, 2nd Edition, Princeton.
Norris, Nilan (1976), "General Means and Statistical Theory," The American Statistician, February, Vol. 30, No. 1.
Pratt, J. (1964), "Risk Aversion in the Small and in the Large," Econometrica, Vol. 32, No. 1-2, January-April, 122-136.
Pratt, J.W., H. Raiffa and R. Schlaifer (1964), "The Foundations of Decisions Under Uncertainty: An Elementary Exposition," JASA 59, 353-375.
Preston, M.G. and P. Baratta, "An Experimental Study of the Auction Value of An Uncertain Outcome," Journal of Psychology 61, 183-193.
Raiffa, H. (1968), Decision Analysis: Introductory Lectures on Choice Under Uncertainty, Reading, Massachusetts: Addison-Wesley.
Raiffa, H. and R. Schlaifer (1961), Applied Statistical Decision Theory, Division of Research, Harvard Business School, Boston, 356.
Ramsey, F.P. (1931), "Truth and Probability," (1926) in The Foundations of Mathematics, R.B. Braithwaite, ed., Humanities Press.
Rothschild, M. and J.E. Stiglitz (1970), "Increasing Risk I. A Definition," Journal of Economic Theory 2, 225-243.
Samuelson, P.A. (1952), "Probability, Utility, and the Independence Axiom," Econometrica 20, 670-78.
Samuelson, P. (1967), "Arrow's Mathematical Politics," in Human Values and Economic Policy, S. Hook, ed., New York: New York University Press, 41-51.
Savage, L.J. (1954), The Foundations of Statistics, Wiley, New York.
Slovic, P., B. Fischhoff and S. Lichtenstein (1977), "Behavioral Decision Theory," Annual Review of Psychology, 39.
Slovic, P. and A. Tversky (1974), "Who Accepts Savage's Axiom?" Behavioral Science 19, 368-373.
Weber, R. (1980), Personal Communication.
Weerahandi, S. and J. Zidek (1979), "A Characterization of the General Mean," forthcoming in the Canadian Journal of Statistics, Department of Mathematics Discussion Paper, University of British Columbia.
Whitmore, G.A. (1970), "Third-Degree Stochastic Dominance," American Economic Review 60, 457-459.
Two representation theorems and their application to decision theory Chew, Soo Hong 1980-03-26
Item Metadata
Title | Two representation theorems and their application to decision theory |
Creator |
Chew, Soo Hong |
Date Issued | 1980 |
Description | This dissertation consists of two parts. Part I contains the statements and proofs of two representation theorems. The first theorem, proved in Chapter 1, generalizes the quasilinear mean of Hardy, Littlewood and Polya by weakening their axiom of quasilinearity. Given two distributions with the same means, quasilinearity requires that mixtures of these distributions with another distribution in the same proportions share the same mean, regardless of the distribution that they are mixed with. We weaken the quasilinearity axiom by allowing the proportions that give rise to the same means to be different. This leads to a more general mean, denoted by M[sub=αФ], which has the form: M[sub=αФ](F) = Ф⁻¹(ʃ[sub=R] αФdF/ʃ[sub=R] αdF), where Ф is continuous and strictly monotone, α is continuous and strictly positive (negative) and F is a probability distribution. The quasilinear mean, denoted by M[sub=Ф], results when the α function is constant. We showed, in addition, that the M[sub=αФ] mean has the intermediate value property, and can be consistent with the stochastic dominance (including higher degree ones) partial order. We also generalized a well known inequality among quasilinear means, via the observation that the M[sub=αФ] mean of a distribution F can be written as the quasilinear mean of a distribution F[sup=α], where F[sup=α] is derived from F via α as the Radon-Nikodym derivative of F[sup=α] with respect to F. We noted that the M[sub=αФ] mean induces an ordering among probability distributions via the maximand, ʃ[sub=R] αФdF/ʃ[sub=R] αdF, that contains the (expected utility) maximand, ʃ[sub=R] ФdF, of the quasilinear mean as a special case. Chapter 2 provides an alternative characterization of the above representation for simple probability measures on a more general outcome set where mean values may not be defined. In this case, axioms are stated directly in terms of properties of the underlying ordering. 
We retained several standard properties of expected utility, namely weak order, solvability and monotonicity, but relaxed the substitutability axiom of Pratt, Raiffa and Schlaifer, which is essentially a restatement of quasilinearity in the context of an ordering. Part II of the dissertation concerns one specific area of application: decision theory. Interpreting the M[sub=αФ](F) mean of Chapter 1 as the certainty equivalent of a monetary lottery F, the corresponding induced binary relation has the natural interpretation as 'strict preference' between lotteries. For non-monetary (finite) lotteries, we apply the representation theorem of Chapter 2. The hypothesis that a choice agent's preference among lotteries can be represented by a pair of α and Ф functions through the induced ordering is referred to as alpha utility theory. This is logically equivalent to saying that the choice agent obeys either the mean value (certainty equivalent) axioms or the axioms on his strict preference binary relation. Alpha utility theory is a generalization of expected utility theory in the sense that the expected utility representation is a special case of the alpha utility representation. The motivation for generalizing expected utility comes from difficulties it faced in the description of certain choice phenomena, especially the Allais paradox. These are summarized in Chapter 3. Chapter 4 contains the formal statements of assumptions and the derivations of normative and descriptive implications of alpha utility theory. We stated conditions, taken from Chapter 1, for consistency with stochastic dominance and global risk aversion and derived a generalized Arrow-Pratt index of local risk aversion. We also demonstrated how alpha utility theory can be consistent with those choice phenomena that contradict the implications of expected utility, without violating either stochastic dominance or local risk aversion. 
The chapter ended with a comparison of alpha utility with two other theories that have attracted attention; namely, Allais' theory and prospect theory. Several other applications of the representation theorems of Part I are considered in the Conclusion of this dissertation. These include the use of the M[sub=αФ] mean as a model of the equally-distributed-equivalent level of income (Atkinson, 1970), and as a measure of asymmetry of a distribution (Canning, 1934). The alpha utility representation can also be used to rank social situations in the sense of Harsanyi (1977). We ended by pointing out an open question regarding conditions for comparative risk aversion and stated an extension of Samuelson's (1967) conjecture that Arrow's impossibility theorem would hold if individuals and society express their preferences by von Neumann-Morgenstern utility functions. |
Subject |
Statistical decision |
Genre |
Thesis/Dissertation |
Type |
Text |
Language | eng |
Date Available | 2010-03-26 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
DOI | 10.14288/1.0076820 |
URI | http://hdl.handle.net/2429/22726 |
Degree |
Doctor of Philosophy - PhD |
Program |
Interdisciplinary Studies |
Affiliation |
Graduate and Postdoctoral Studies |
Degree Grantor | University of British Columbia |
Graduation Date | 1980-11 |
Campus |
UBCV |
Scholarly Level | Graduate |
Aggregated Source Repository | DSpace |