USES AND ABUSES OF QALY ANALYSIS

By
Ann Michele Holmes

B. A. (Economics) University of Victoria
M. A. (Economics) Queen's University at Kingston

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY
in
THE FACULTY OF GRADUATE STUDIES
ECONOMICS

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
December 1992
© Ann Michele Holmes, 1992

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of Economics
The University of British Columbia
Vancouver, Canada

Abstract

A major contribution of economics to health services research has been the development of QALYs (quality adjusted life years) as a measure of health status. This thesis investigates, in three essays, the use of QALYs in health care project evaluation and as an indicator of societal health.

The first essay examines the validity (defined as consistency with preferences) and feasibility of various QALY construction methods. Conditions for validity, derived from welfare principles, are used to assess the different methods. A new QALY instrument is devised that has interpersonal content (i.e. is valid for choices involving different individuals). Bias is shown to depend on various independence relationships within preferences. A number of these conditions are tested using data from the General Social Survey of 1985 (Canada. Statistics Canada [1987]).

The second essay examines the welfare properties of the QALY-based index as it is commonly employed to make health policy decisions. A comparison with alternative economic-based health indexes (human capital and willingness-to-pay) is provided. The QALY-based measure does indicate which treatment is best for an individual. In choosing patients for treatment, however, QALY-based measures probably discriminate against certain types of individuals, including those who are risk averse with respect to health and in poor health. In choosing between health programs, aggregate QALY-based measures do order community health profiles sensibly (except where people endure states worse than death), unlike the other measures considered. The QALY-based index may, however, favour unequal distributions of health.

The final essay assesses the appropriateness and feasibility of QALYs as a foundation for an index of societal health. Results suggest that, theoretically, the QALY serves as an imperfect measure of societal health, but that these problems are endemic to any index based on individual preferences. Using the best available data, a QALY based index is calculated to measure the level and distribution of ill-health in Canada and indicate where health policy can be most effectively targeted.
The essay concludes with a discussion of what improvements in data collection are required to obtain more accurate figures.

Table of Contents

Abstract
List of Tables
List of Figures
Acknowledgement
1 Introduction
2 Separating Good Health Measures from Bad
  2.1 Introduction
  2.2 Part I: Validity
    2.2.1 Motivation
    2.2.2 Literature Review
    2.2.3 Analysis
    2.2.4 Independence Results
    2.2.5 Equivalence Conditions
    2.2.6 Instrument Selection
    2.2.7 Summary of Validity Results
  2.3 Part II: Implementation
    2.3.1 Nature of the Problem
    2.3.2 Literature Review
    2.3.3 Analysis
    2.3.4 Model
    2.3.5 Production Method
    2.3.6 Parameterization
    2.3.7 Bias from Instruments
    2.3.8 Bias from Aggregation
    2.3.9 Synthesis
  2.4 Conclusion
3 The Welfare Properties of Three Health Status Statistics
  3.1 Introduction
    3.1.1 Review of Health Status Measures
    3.1.2 Evaluation Criteria
  3.2 Theoretical Framework for Assessment
    3.2.1 Individual's Optimization Problem
    3.2.2 Derivation of the Health Status Indexes
    3.2.3 Human Capital Measures (HK)
    3.2.4 Willingness-to-Pay
    3.2.5 QALY/HYE Measures
  3.3 Choosing Treatments for a Given Individual
    3.3.1 Exactness of the Health Status Index
    3.3.2 Discussion
  3.4 Choices Between Individuals
    3.4.1 Variations Due to Endowments
    3.4.2 Preference Variation
  3.5 Choices Between Programs
    3.5.1 Rationality of the Social Ordering
    3.5.2 Welfarism
    3.5.3 Ethics
    3.5.4 Summary
  3.6 Conclusion
4 A QALY Based Societal Health Statistic for Canada, 1985
  4.1 Introduction
    4.1.1 Purpose
    4.1.2 Criteria for a Societal Health Index
    4.1.3 Literature Review
  4.2 Model
    4.2.1 Theoretical Assessment
    4.2.2 Completeness
    4.2.3 Consistency
    4.2.4 Ethical Content
    4.2.5 Summary
  4.3 Empirical Assessment
    4.3.1 Data
    4.3.2 Procedure
  4.4 Results and Implications
    4.4.1 Quality Adjusted Life Expectancy
    4.4.2 Male-Female Differentials
    4.4.3 The Importance of Morbidity
  4.5 Avenues for Future Research
5 Conclusion
Bibliography
Appendices
  A Chapter 2 Proofs
  B Simulation Results
  C Data Sources for Chapter 2
  D Empirical Results for Chapter 2
  E Proofs to Chapter 3
  F Proofs to Chapter 4
  G Empirical Analysis to Chapter 4
    G.0.1 Data Description
    G.0.2 Calculation of Satisfaction Estimates
    G.0.3 Calculation of Expected Satisfaction
    G.0.4 Linkage to QALY Values
    G.0.5 Comment on Institutional Data
    G.0.6 Calculation of QALY Averages
    G.0.7 Life Expectancy
    G.0.8 QALT Calculations

List of Tables

3.1 Conditions for Exactness
4.1 Quality Adjusted Life Expectancy, Canada and the Provinces
4.2 Quality Adjustment with Institutional Data, Canada and the Provinces
4.3 Male-Female Differentials, Canada
4.4 Morbidity by Major Category, Men
4.5 Morbidity by Major Category, Women
D.1 Estimation Results
D.2 Additive Tests
D.3 Joint Additive Tests
D.4 Multiplicative Tests (all data)
D.5 Joint Tests
D.6 Multiplicative Tests (< 30 yrs)
D.7 Joint Tests
D.8 Multiplicative Tests (30-65 yrs)
D.9 Joint Tests
D.10 Multiplicative Tests (> 65 yrs)
D.11 Joint Tests
D.12 Multilinear Tests
D.13 Joint Tests
D.14 Simulated Holistic QALY Values
D.15 Aggregated Values (cs)
D.16 Aggregated Values (sg)
D.17 Aggregated Values (tto)
D.18 Aggregated Values (es)
D.19 Aggregated Values (pe)
D.20 Distortions from cs Values
D.21 Distortions from sg Values
D.22 Distortions from tto Values
D.23 Distortions from es Values
D.24 Distortions from pe Values
G.1 Satisfaction Function Estimates (t-statistics in brackets)
G.2 Marginal Disutilities (taken at perfect health)

List of Figures

B.1 Time Trade-Off
B.2 Standard Gamble and Person Equivalents
B.3 Extended Sympathy
B.4 Most Likely Case

Acknowledgement

I would like to thank the members of my supervisory committee, David Donaldson, John Cragg, and Erwin Diewert, for their intellectual and emotional support during the preparation of this thesis. Despite the fact that none works in health economics, or perhaps because of this, they were able to guide my research on this subject in innovative directions and the final product has been greatly improved by their efforts. I would also like to thank my peers in "640" for their constructive comments and fellow inmates of C-block for their support.

This research was supported in part by the National Health Research and Development Program through National Health Fellowship number 6610-1557-47.

Chapter 1

Introduction

One of the most significant contributions economics can make to health services research is to provide appropriate means to evaluate the health care system so that policy may be directed to best serve the interests of society. These interests include not only the overall level of health and its distribution across members of society, but also those opportunities forgone because of the consumption of resources by the health care system.

Unfortunately, valuation techniques which are based on demand curves as revealed by market transactions (like cost-benefit analysis) are inappropriate because the link between the value of health outcomes and market data is made tenuous by peculiarities inherent in the health care system. Because health is not directly exchangeable between agents (with the possible exception of organ transplants), hedonic prices for health must be inferred from the demand functions for tradeable goods with health consequences. The legitimacy of such methods requires that markets for these goods exist without externalities (e.g.
asymmetric information across patient types does not drive markets out of existence, insurance does not obscure the relationship between price paid and value received), and that consumers are willing and able to choose their purchases to maximize their own well-being (i.e. they understand completely the relationship between consumption of a good and its effect on health - which implies the health care professional's role as provider of this information is redundant - and there is no habit formation or addiction that can cause deviations from this optimal choice path over time). Since these characterizations are grossly atypical of actual health care markets, the valuations derived by such methods are usually invalid and can misdirect policy.

It is these same peculiarities of health care and related markets that ensure the failure of the price mechanism to allocate resources efficiently. This provides the justification for market intervention by a social planner. As demands on the health care system increase relative to the resources available to this sector (owing to the joint effect of rapid technological innovation allowing suppliers to offer more services, some of dubious worth, coupled with rising expectations on the part of consumers as to what these services can provide), decisions about which health care interventions to offer are apt to increase in frequency and importance.

The effectiveness of the social planner's directives depends on his or her ability to identify optimal health policies. Despite the difficulties noted above, a substantial amount of the evaluation literature has employed market data to evaluate health effects (see, for instance, Longmore and Rehahn [1975], Jones-Lee [1976], Mooney [1977]). While such methods are well established in the economics literature, this advantage is offset by the measurement errors that are engendered by imperfect markets.
An increasingly popular alternative is to value the health effects by means independent of the market and its distortions. Such a procedure requires, in effect, the construction of a health index. During the 1970's, a variety of such indexes were proposed (see McDowell and Newell [1987] for a compendium). Among these was the QALY (quality adjusted life year) index (Torrance et al. [1972]).

This thesis is an assessment of the QALY index as an instrument of policy formation. The QALY index is an appropriate topic of study for a variety of reasons. First, of all the health status indexes currently available, it has the strongest welfare foundation, being derived from preferences for health states. In fact, Torrance (1976c) has demonstrated that most other health status indexes are approximations of the QALY index. Second, the QALY is a very flexible instrument and can be used to evaluate a wide range of health states in a variety of applications. Other indexes are restricted to specific contexts which are special cases of the QALY's. Third, and perhaps most important, the QALY is being increasingly employed by policy makers in practice. These applications include not only choosing the optimal treatment path for specific individuals (e.g. McNeil et al. [1981]), but also choosing between people for treatment (e.g. Boyle et al. [1983]), and choosing among projects for funding (e.g. Oregon Medicaid experiment as described in Klevit et al. [1991]). Furthermore, in some jurisdictions (e.g. Ontario, Australia), the index is on the verge of being institutionalized in the decision making apparatus. In their eagerness to find some systematic means to direct health policy that is free of market distortions, decision makers have employed QALYs in ever broader contexts where their validity is dubious.
Research has focused on deriving a more comprehensive list of QALY values, leaving the analysis of the appropriateness of decisions made with these data outpaced by the applications which use them. This thesis attempts to narrow this gap between theory and practice.

As originally conceived, the QALY value acts as an adjustment factor on time alive. These scaling factors reflect the morbidity (or sickness) endured during a given period of time. Summing these values over all years of life yields a health index that incorporates both the morbidity and mortality (quality and quantity) aspects of health.

What distinguishes the QALY index from other health status indexes is not that length of life is scaled to reflect morbidity, but that these scaling factors are chosen to reflect the preferences of individuals for these different morbid states, assigning higher values to more preferred states of health than less preferred states of health. This is assured by the methods used to derive QALY values: Hypothetical health states are described in a survey questionnaire. Respondents report their preferences for these states. The strength of these preferences is measured with a QALY instrument, a device which compares preferences for morbid states against preferences for some other measurable variable (the "metric variable").

Since these values are based on assessments of outcomes only (which respondents should be able to evaluate without assistance), not production processes (where the information gaps prevent agents from linking goods consumed to health outcomes), the index should be free of market distortions. Furthermore, because the index is based explicitly on preferences, it has a stronger welfare foundation than most other health status indexes available (it assigns the highest values to states that leave the individual best off according to his or her own assessment).

The outstanding feature of QALYs is this relationship with preferences.
This allows the possibility that the QALY may be used in utility based welfare assessments, free of the ethical problems inherent in traditional cost-benefit analysis (Blackorby and Donaldson [1990]). As a result, the concept of a QALY health index used in this thesis is somewhat broader than the standard definition, and any index derived from the individual's preference ordering over health states as described above is included in the assessment, not just those that fit the strict definition of a QALY as a factor used to scale time alive. Thus, indexes such as Mehrez and Gafni's (1989) healthy year equivalent (where both the quality and quantity aspects of a health state are assessed together in a single function) are treated as special cases of QALY indexes.

Through the 1970's, QALY research focused on the development of decision statistics (methods for combining QALY values with other relevant data and accompanying rules to determine policy priorities) for a wide variety of policy applications. These included: the quality adjusted lifetime (the sum of the QALY values over all years of life), used to evaluate optimal treatment paths; the cost-utility ratio (the difference in quality adjusted lifetimes caused by a health care project divided by its cost), used to determine how the health care budget should be allocated across different projects; the societal health index (the mean QALY or mean quality adjusted lifetime), used to measure the health status of a community (Torrance et al. [1972], Torrance [1976a,b,c]). Although Torrance (1986) subsequently argued that QALYs were only intended as a measurement device for input-output analysis, it is clear from the literature that much of the appeal of QALY analysis is its (presumed) consistency with welfare maximization (i.e. that decisions based on QALYs lead to society being as well off as possible). This has become increasingly apparent as policy moves towards a "patient centred ethic" (i.e.
consumer sovereignty).

By the 1980's, other researchers began to raise concerns about the ethical underpinnings of QALYs. These concerns included (i) whether the use of QALYs as scaling factors produced a health status index that reflected preferences over the whole domain of health (see Pliskin et al. [1980], and Mehrez and Gafni [1989]), (ii) the underlying interpersonal assumptions of QALY-based decision statistics (see Harris [1987], Hilden [1985]), (iii) the social ethics of aggregate QALY statistics (Loomes and McKenzie [1989]), and (iv) whether resource allocations based on QALY data were consistent with welfare maximization (see Anderson et al. [1986], Birch and Donaldson [1987], Feeny and Torrance [1989]). These assessments have been incomplete, covering only select QALY applications. Some have raised the possibility that welfare inconsistencies may arise, but not under what conditions (Loomes and McKenzie, Harris, Hilden), while others have used assessment criteria that are not utility based and therefore do not reveal if the statistics are consistent with welfare maximization (Anderson et al., Birch and Donaldson, Feeny and Torrance). Virtually none has provided superior alternatives or strategies to compensate for biases. This might explain why such research has failed to curb the use of QALY data in decision-making.

The three essays of this thesis constitute an attempt to close the gap between the theory and practice of QALY based evaluations, and to evaluate the index as an indicator of health status for individuals and society and its suitability for directing resource allocation in the health sector. The assessment is utility based: the QALY or associated decision statistic is evaluated as to whether or not it can identify the state which the individual (or society) most desires. When currently employed statistics fail to do this, the probable distortions on decision making are identified and unbiased alternatives are discussed.
Also considered is the feasibility of implementing the various alternatives in practice.

The first essay examines the validity (defined as consistency with preferences) and feasibility of various QALY construction methods. Necessary and sufficient conditions for validity are derived from welfare principles and these are used to assess the different methods. A new QALY instrument is devised that has interpersonal content (i.e. is valid for choices involving different individuals). The degree of bias is shown to depend on various independence relationships within preferences. A number of these conditions are tested using data from the General Social Survey of 1985 (Canada. Statistics Canada [1987]). The more important conclusions drawn are: (1) while there does not exist any one QALY instrument that is universally valid, for every current application using QALY data, there exists at least one instrument that is valid; (2) over broadly defined categories of morbidity, multiplicative aggregation methods can be used to reduce significantly the costs of QALY data collection without inducing bias.

The second essay examines the welfare properties of a QALY-based index when it is used as a health status measure in three types of allocation decisions commonly faced by policy makers: choosing treatments for an individual, choosing an individual for treatment, and choosing programs for implementation. A comparison is made with the two other economic-based health status indexes, the human capital and willingness-to-pay measures. The QALY-based measure does indicate appropriate treatments for an individual, as does the willingness-to-pay measure. The human capital measure generally fails to do this. All measures discriminate against some types of individuals. The QALY-based measure likely discriminates against unhealthy people and people who are risk averse with respect to health, which distinguishes it from the other measures.
As an aggregate health status statistic (used whenever broadly based programs are compared), the QALY-based measure orders community health profiles sensibly (unless states worse than death exist), unlike the alternative measures. Decisions made with the QALY-based index may favour unequal distributions of health.

The final essay assesses the appropriateness and feasibility of QALYs as a foundation for an index of societal health. Results suggest that, theoretically, the QALY serves as an imperfect measure of societal health, but that these problems are endemic to any index based on individual preferences. Using the best available data, a QALY based index is calculated to measure the level and distribution of ill-health in Canada and indicate where health policy can be targeted most effectively. The essay concludes with a discussion of what improvements in data collection are required to obtain more appropriate figures.

Chapter 2

Separating Good Health Measures from Bad

2.1 Introduction

In this chapter, the issue of how to obtain QALY (quality adjusted life year) values is addressed. Various methods are assessed for theoretical validity (i.e. whether they assign values to health states in a manner consistent with preferences) and ease of implementation (i.e. costs of obtaining a given set of values). Conditions for validity are shown to depend on the nature of the decision statistic used, where the choice of decision statistic depends on what relevant factors change between states of the world. It is shown that no single method of construction is valid in every QALY application. Conditions for each construction method to be valid in specific applications are derived.
A new method is devised that, unlike those currently in use, is valid in situations involving interpersonal comparisons.

Implementation considerations are viewed as an optimization problem where the cost of obtaining a set of QALY values is minimized by choice of construction method subject to some acceptable level of bias (where bias is measured by the deviation from true values that results in incorrect policy choices being made). Two strategies are assessed: (1) substitution of QALY instruments (the method by which QALY values are derived from preferences), and (2) reconstruction of values for multiple illness states from the values for single illness states. Bias is shown to depend on independence in preferences over morbidity from non-morbid and other morbid factors respectively. These independence conditions are evaluated using simulation analysis and empirical tests. Results suggest that the standard conceptual framework, where health consists of sub-groupings of morbidity characteristics, is appropriate, and that cost savings in QALY construction may be achieved without significant bias by the adoption of multiplicative aggregation structures. The optimal choice of additional cost saving measures depends on how the QALY is used in decision making.

This chapter is organized in two sections: issues of validity are addressed in the first, while implementation issues are addressed in the second.

2.2 Part I: Validity

2.2.1 Motivation

One of the main advantages of QALY indexes over other health status indexes is that they are derived from preferences. Because preferences for health states cannot be inferred from behaviour (since rational and informed choices between health states are seldom made or observed), they must be obtained in some experimental setting. A subject is presented with a range of health states and asked which he or she prefers.
These preferences are then converted to a numerical scale by the use of a survey instrument, a device used to measure the strength of preference between two states. A variety of such instruments are available (those used in the past include category scaling (CS), magnitude estimation (ME), standard gamble (SG), time trade-off (TTO), and person equivalents (PE)). To these is added the extended sympathy time trade-off instrument (ES). These are discussed in greater detail below.

Much energy has been expended to find the "gold standard" of QALY instruments, one which is universally appropriate and against which all other QALY instruments are to be judged. Such a concept underlies all empirical studies comparing the mapping functions produced by the different instruments. After twenty years, no consensus in this debate has been reached. This is partially because the literature has been so fragmented that assessments have often been based on partial evidence, but also because the whole issue of what is appropriate and when has not been adequately addressed.

In this paper, a rigorous definition of appropriateness (validity) is derived from utility theory and the various instruments are evaluated against the conditions generated by this definition. The analysis is both complete and uniform. Theoretical results are supported by examples of commonly used QALY based analyses.

Nature of the Problem

In essence, each QALY instrument generates a mapping function from the multidimensional health space to a single-dimensioned health index by comparing the value of various health states to some measurable entity (called the metric variable). Since the metrics and the means of comparison vary across instruments, one might expect the mapping functions to vary as well. This poses the problem of which instrument, if any, to use.

Recall that the purpose of QALY analysis is to assist with health care decision making.
The thrust of this paper is that the QALY to be used is the one whose information, used in the appropriate decision statistic, leads to the correct policy choice being made. All subsequent analysis proceeds with this in mind.

2.2.2 Literature Review

Papers in this area can be found under the general rubric of validity studies. Validity has been approached from both theoretical and empirical perspectives.

Theoretical analyses have often been hampered by vague and ill-defined concepts of validity. The definition usually employed is that of content validity: "A measurement instrument is valid to the extent that it actually measures the phenomenon it claims to measure." (Phillips as quoted in Torrance [1976b, p. 132]). Such a definition requires that the purpose of the measurement be clearly understood. Then all conditions for validity should be derived from this starting point. QALY values are used to evaluate states of the world according to preferences. It is inappropriate to say that the QALY must represent these preferences since the QALY alone usually does not provide enough information to evaluate all aspects of a state. Rather, it is typically combined with other relevant information in a decision statistic and it is this statistic which must represent preferences. The conditions for a QALY instrument to be valid must be derived in the appropriate context. Instead, many of the studies undertaken to date have investigated some particular characteristic without first justifying that such a property is either necessary or sufficient for valid measurement.

Like the theoretical approach, the empirical approach suffers from a poorly defined concept of validity. Until such time as a "gold standard" is identified, statistical differences must be taken in relation to an arbitrary point and are meaningless. There is no reason why convergence would be expected to occur around the correct value (e.g. all the instruments could be biased).
However, once such a standard is established theoretically, such investigations will be very useful.

Theoretical Investigations

Attention has been focused on two characteristics: whether the QALY instrument generates an interval scale (i.e. is cardinally measurable), and whether the mapping function depends on aspects of the survey instrument itself apart from the morbidity description. The justification for the former condition has been that the QALY is often employed to measure health status differences and that the difference operator is only meaningful on scales with at least cardinal measurability. Such a position ignores the fact that differences usually occur over more than just morbidity (e.g. time alive may also differ between states) or those situations where only a ranking of levels, not differences, is adequate (e.g. measures of societal health, where the QALYs must be comparable, though not necessarily cardinal).

The independence condition is necessary if the non-morbid context in which the QALY is derived differs from the situation which it is supposed to evaluate. If preferences over morbidity are conditioned on these other factors, then a separate QALY value may have to be calculated for each context. Independence is not necessary for validity unless QALY values are used across contexts. In fact, if independence does not hold in preferences, it should not hold in QALYs either since these are supposed to be preference based.

Neither of these conditions can be considered either necessary or sufficient for validity in general. The appropriate set of conditions must be derived from the decision statistic. Furthermore, evaluation of these conditions has been done in isolation, as though either was sufficient for validity.
Unless they are perfectly congruent, this cannot be the case.

Interval Properties: QALY results in this area have been based exclusively on the expected utility literature, notably Luce and Raiffa (1957) and Fishburn (1964). Torrance (1986) asserts that only the SG instrument generates a valid QALY function because, being based on choice under uncertainty, it recovers the cardinal utility function that orders differences as well as levels of morbidity. But this position ignores (1) recent evidence indicating the underlying axioms necessary for cardinality do not hold (see Shoemaker [1982] for a general discussion, Loomes and McKenzie [1989] for applications specific to QALY analysis) and (2) that other value functions, notably pseudo-money metrics (i.e. which are structurally equivalent to money metrics but are based on different value units), like TTO and PE, also exist. Thus, work in this area is incomplete and biased in favour of SG. Questions of which value function, if any, is appropriate go unasked.

Context: Because the QALY is only a hypothetical construct, some concern exists as to whether or not it reflects actual choices. First, if preferences are defined only as a binary choice relationship, the QALY must be based on choice to have economic content (eliminating the CS and ME instruments)¹. A related concern is whether the QALY depends on aspects of the hypothetical state that do not concur with actual situations. Attention has been focused on the TTO instrument and its dependence on the survey time frame. Pliskin et al. (1980) establish the conditions for independence (the marginal utility of time for any given health level must be constant, conditions which may be overly strong for some QALY applications). Loomes and McKenzie (1989) review the ensuing 10 years work, most of which suggests this strong condition is not satisfied in general.
They also describe implications for risk analysis (results very similar to Blackorby and Donaldson's [1988]). In response to this, Mehrez and Gafni (1989, 1991) modify the TTO and develop the healthy year equivalent (HYE), which is not dependent on time. Similar rigour has not been applied to the study of other instruments' dependence on survey variables. Nor have the implications of this dependence on decision making been studied.

Empirical Investigations

The health services literature has focused not on the theoretical issues, but on whether there exist statistically significant differences between instruments². Studies by Stevens (1959), Torrance (1976b), Read et al. (1984), Wolfson et al. (1982), and Rosser and Kind (1978) compare the empirical performance of CS, SG, and TTO. The general conclusion is that the instruments generate results that are highly correlated, but not equivalent, and that convergence appears to be situation dependent (some results conflict). No study covering all instruments with adequate research design has been attempted.

¹ There exists an alternative position, commonly held by utilitarians like van Praag (1968), that utility is a measure of satisfaction rather than a representation of the preference ordering (see Sen [1985] for a discussion). There would then be no basis for excluding CS or ME.

² Such analyses neglect those situations where the QALY need only rank states, not measure them.

Avenues for Further Research

This review reveals there has been no complete and uniform analysis of validity across all instruments. Part of the problem appears to be a weak concept of validity. An appropriate set of sufficient conditions has yet to be developed, and the existing necessary conditions are often suspect because their derivation from the decision statistic is seldom apparent. Finally, the evaluation has been constrained to a set of well established instruments that were developed at least twenty years ago.
With one exception (Mehrez and Gafni [1989, 1991]), no attempt has been made to find superior instruments outside this set.

2.2.3 Analysis

Purpose

The purpose of this section is to examine the validity of different QALY instruments. This is needed to establish which instrument generates the appropriate QALY values and to establish the cost-efficiency trade-offs across instruments.

Definitions

The above statement raises the question: what is validity? In this context, validity is defined as the ability of the QALY index to indicate the state most preferred by an individual (or by society when decisions affect many individuals). Such a criterion will ultimately depend on how the QALY is applied in decision making. Some definitions useful in the construction of this concept are presented below.

equivalence describes what ranking properties two functions share:
a) f and g, two functions, are ordinally equivalent (rank levels of x the same way) if and only if there exists an increasing monotonic function, $\psi$, such that, for all x in their common domain, $f(x) = \psi(g(x))$; this is written $f(x) \overset{o}{=} g(x)$.
b) f and g are cardinally equivalent (rank differences in x the same way) if and only if, for all x in their common domain, $f(x) = a\,g(x) + b$, $a > 0$; this is written $f(x) \overset{c}{=} g(x)$.
c) f and g are ratio scale equivalent (rank proportions of x the same way) if and only if, for all x in their common domain, $f(x) = a\,g(x)$, $a > 0$; this is written $f(x) \overset{r}{=} g(x)$.

uniqueness $f_y(x)$ ("the function f of x evaluated in situation y") is unique if and only if it generates one and only one value for any x for every possible y. Otherwise, $f_y(x)$ is not a function.

completeness f is complete if and only if the domain of f, D(f), is the entire set of x (X).

measurability describes the classification of functions that are informationally equivalent (i.e.
provide the same information about preferences):
a) a function, f, is ordinally measurable if any $\tilde{f}$ in $\{\tilde{f} \mid \tilde{f}(x) = g(f(x)) \ \forall x \text{ in the domain}\}$, where g is some increasing monotonic function, is informationally equivalent to f. This set of functions is described by $S^o$. The values generated by such functions cannot be manipulated by addition or multiplication operations.
b) a function, f, is cardinally measurable if any $\tilde{f}$ in $\{\tilde{f} \mid \tilde{f}(x) = a f(x) + b,\ a > 0,\ \forall x \text{ in the domain}\}$ is informationally equivalent to f. This set of functions is described by $S^c$. Such functions exhibit interval properties and are suitable for addition operations (i.e. the difference between two function values is meaningful).
c) a function, f, is ratio scale measurable if any $\tilde{f}$ in $\{\tilde{f} \mid \tilde{f}(x) = a f(x),\ a > 0,\ \forall x \text{ in the domain}\}$ is informationally equivalent to f. This set of functions is described by $S^r$. Such functions are suitable for multiplicative and lower order operations.

independence $f(x; z)$ ("the function f of x conditioned on z") is independent of z if and only if $f(x; z) = f(x; z') \ \forall x, z'$.

Assumptions

The following assumptions establish the framework for the subsequent analysis.
a) Let $q_k$ denote the kth element of the K-dimensional vector q of all morbidity characteristics, each with an upper and lower bound, denoted by $\bar{q}_k$ and $\underline{q}_k$ respectively. Then the set q is bounded by $\bar{q}$ and $\underline{q}$. It is assumed that q may be described in a quantified fashion that is comprehensible to the survey respondent.
b) Let x denote the m-dimensional set of all variables over which preferences are defined, including morbidity. This set includes q, and various "context" variables: time alive in the state (t), other commodities (y), and personal characteristics (a). The last two (non-health) factors are grouped in the subset $\kappa$ ($\kappa = (y, a)$).
c) Let the survey respondent have preferences over gambles involving x, denoted by $\tilde{R}$.
It is assumed these preferences are complete, reflexive, transitive, and continuous. Then assume there exists a utility function, $\tilde{U} : \mathbb{R}^{s(m+1)} \to \mathbb{R}^1$ (s denotes the number of possible states of the world), which represents these preferences in the sense $\tilde{U}(x'_{win}, p', x'_{loss}, 1 - p') \ge \tilde{U}(x_{win}, p, x_{loss}, 1 - p) \leftrightarrow (x'_{win}, p', x'_{loss}, 1 - p')\,\tilde{R}\,(x_{win}, p, x_{loss}, 1 - p)$, where p denotes the probability of receiving the "win" value of x, and (1 - p) is the probability of receiving the "loss" value of x (this is easily extended to more than two states).
d) Assume the survey respondent has preferences over certain outcomes, denoted by R, and that these preferences may be represented by a utility function defined over certain outcomes, $U : \mathbb{R}^m \to \mathbb{R}^1$, in the sense $U(x') \ge U(x) \leftrightarrow x' R x$. In the case where U is homothetic in time, $U(x) \overset{o}{=} \mu(q, \kappa)\,t$ (where $\overset{o}{=}$ indicates the functions are ordinally equivalent). Assume further that preferences over gambles are separable from states of the world that never occur. Then the utility function over certain outcomes must also represent preferences over gambles when there is no uncertainty (i.e. p = 1): $U(x'_{win}) \ge U(x_{win}) \leftrightarrow x'_{win} R x_{win} \leftrightarrow (x'_{win}, 1, x'_{loss}, 0)\,\tilde{R}\,(x_{win}, 1, x_{loss}, 0)$. This implies $\tilde{U}(x_{win}, 1, x_{loss}, 0) \overset{o}{=} U(x_{win})$.
e) Assume preferences also exist over differences in outcomes, denoted by $\hat{R}$, and that these preferences are represented by a cardinal utility function, $\hat{U}$. If $(x^1, x^0)$ denotes the move from $x^0$ to $x^1$, then $\hat{U}(x^1) - \hat{U}(x^0) \ge \hat{U}(x^3) - \hat{U}(x^2) \leftrightarrow (x^1, x^0)\,\hat{R}\,(x^3, x^2)$.³
f) Assume preferences vary across individuals according to some finite number of characteristics. Then it is possible to express the preferences, as represented in a utility function, of an individual as $U^i(q, t) = U(q, t, \kappa_i)$, where $U^i$ is the utility function of individual i, and $\kappa_i$ are the characteristics of person i which affect preferences for outcomes over (q, t).
g) Assume an individual has social preferences ($R^W$) defined over the outcomes endured by all N members of society ($(x_1, \ldots, x_N)$, the subscripts denoting individuals), and that these social preferences may be represented by a social welfare function, W, in the sense that $W(x'_1, \ldots, x'_N) \ge W(x_1, \ldots, x_N) \leftrightarrow (x'_1, \ldots, x'_N)\,R^W\,(x_1, \ldots, x_N)$.
h) Let $\varphi^j(q; t, \kappa)$ denote the QALY value for morbidity profile q, conditioned on the context $(t, \kappa)$, derived by instrument j (j = CS, ME, SG, TTO, ES, or PE). Let $\Gamma(\varphi^j(q; t, \kappa), z)$ denote the decision statistic which employs the QALY value used in the evaluation and z, non-morbid factors which are relevant to the decision. z may contain elements of x besides q that differ from those used in the QALY valuation (see below). To distinguish between these, those factors which condition the QALY function are denoted $(t, \kappa)$, while those used in the decision statistic are denoted $(\tilde{t}, \tilde{\kappa})$. The decision rule is to select the state with the highest value of $\Gamma$.

³ Notice that the cardinal utility function is defined directly over outcome differences and not implicitly from expected utility over gambles. This concept of a value function holds regardless of the validity of the von Neumann-Morgenstern axioms and is slightly different from the value function typically used in the decision sciences.

This last assumption requires some discussion. The QALY value, $\varphi^j(q; t, \kappa)$, measures the value of q as conditioned on $(t, \kappa)$. In a sense, it only provides information about the value of morbidity. In reality, projects seldom affect morbidity alone. The policy maker must base his or her decision on all effects of the project. To do this, the QALY data must be incorporated with some measure of these other effects in a decision statistic, $\Gamma$. There have been three such decision statistics used in the past.
First is the QALT (quality adjusted lifetime), used when both morbidity and mortality change as a result of the decision:

$\Gamma(\varphi^j(q; t, \kappa), z) = \varphi^j(q; t, \kappa)\,\tilde{t}$

(this is used to compare different health states for any one individual or between individuals). In this case, z includes the actual time spent alive in the morbid state ($\tilde{t}$), which may differ from the hypothetical time frame on which the QALY value is based (the actual time alive may not be known at the time QALY values are calculated or may be so variable that the replication of the QALY valuation exercise for each possible survival curve is prohibitively expensive). Hence, z may contain elements of x besides q that differ from those used in the QALY valuation.

Second is the CUR (cost-utility ratio), used to evaluate projects which affect health status at some level of costs:

$\Gamma(\varphi^j(q; t, \kappa), z) = (\varphi^j(q^A; t, \kappa)\,\tilde{t}^A - \varphi^j(q^B; t, \kappa)\,\tilde{t}^B)/C$

(where A and B denote after and before the project respectively, and C denotes the cost of the project). In this case, the change in health is compared to the resources needed to achieve it. z includes not only the time spent in the state, but the costs of the project as well. The costs of the project are not included as context variables in the QALY valuation exercise because the patient does not usually incur these costs (because of health insurance), yet the decision maker is aware of the resource drain on society that results from this decision and wants to take this into account.

Third is the ex post measure of societal health, the sum of the QALTs of the N members of the society, used to measure community health:

$\Gamma(\varphi^j(q; t, \kappa), z) = \sum_{i=1}^{N} \varphi^j(q_i; t_i, \kappa_i)\,\tilde{t}_i$

Again, z contains elements other than x (the number of people in society) and elements of x that differ from those on which the QALY value is based (the lengths of life).

Validity

An index is valid if it measures what it is supposed to measure.
Most assessments of QALYs to date have evaluated validity by comparing empirical QALY values to other established measures of health status (content validity). Validity is established if the QALY index assigns values to health states in a manner similar to these other indexes. But this approach assumes these other measures are valid, which effectively negates the need to construct QALY values.

In this chapter, a theoretical assessment of the validity of the QALY index is undertaken. The validity results obtained do not require validity of any other health index, but only that preferences for health states reflect the well-being associated with those health states. One of the main contributions of this chapter is to operationalize the above definition of validity so that it may be used in a theoretical assessment.

QALYs are supposed to measure morbid states. More specifically, they must assign values to morbid states in a manner consistent with preferences so that more preferred states are assigned higher values than less preferred states. Recall, however, that the QALY is seldom used alone to evaluate states, but is usually combined with other relevant information in a decision statistic. Thus, the conditions for validity require that the decision statistic, which is defined over the QALY, be an exact representation of preferences over the state:

$\Gamma(\varphi^j(q; t, \kappa), z) \overset{o}{=} U(q, \tilde{t}, \tilde{\kappa}) = U(\tilde{x}) \quad \forall \tilde{x} \in D(U)$  (2.1)

(where $\overset{o}{=}$ means the two functions are ordinally equivalent). Obviously, certain assumptions about the form of $\Gamma$ and the content of z must be made (the form of $\Gamma$ must be consistent with preferences, z can only contain fixed levels of any variables not included in the utility function) for the above condition to hold. The focus of this chapter is not the appropriateness of $\Gamma$, but the validity of $\varphi^j(q; t, \kappa)$. The above condition imposes two (necessary) conditions on $\varphi^j(q; t, \kappa)$ for validity: (1) representation and (2) independence.

Consider the case where only morbidity is allowed to vary between states of the world. In this case, the QALY alone is a sufficient statistic to evaluate states of the world. Then the validity condition holds if and only if the QALY function is an exact representation of preferences over all morbidity levels for any given configuration of other factors, i.e.

$\varphi^j(q; \tilde{t}, \tilde{\kappa}) \overset{o}{=} U(q, \tilde{t}, \tilde{\kappa}) \quad \forall q \in D(U)$  (2.2)

for any $(\tilde{t}, \tilde{\kappa})$. In cases where changes in morbidity, rather than morbidity levels, are to be assessed, the QALY values must reflect value differences and the conditions on the QALY function become more strict: the QALY must be related to the cardinal utility function, $\hat{U}$, which represents preferences for such changes:

$\varphi^j(q; \tilde{t}, \tilde{\kappa}) \overset{c}{=} \hat{U}(q, \tilde{t}, \tilde{\kappa}) \quad \forall q \in D(\hat{U})$  (2.3)

(where $\overset{c}{=}$ indicates the functions are cardinally equivalent). The first necessary condition then may be summarized: the QALY function is (i) an (ordinally or cardinally) exact representation of preferences over its domain, and this representation is (ii) complete and (iii) unique.

Now consider a more likely scenario where both morbidity and other factors are allowed to change between states of the world. In this case, the "context" on which the QALY function is based is likely to differ from the contexts over which the decision statistic is defined (recalculating the QALY value for every given context is prohibitively expensive; the actual context, which is used in the decision statistic, may not be known when the QALY values are obtained). In this case, the QALY function must represent preferences over morbidity not only for any given context, but for every possible context. Thus, states of the world are appropriately ranked (assuming the conditions on $\Gamma$ and z hold) if and only if

$\Gamma(\varphi^j(q; t, \kappa), \tilde{t}, \tilde{\kappa}) \overset{o}{=} U(q, \tilde{t}, \tilde{\kappa})$  (2.4)

where the fixed z elements are suppressed in $\Gamma$. Given the right hand side of this equation is independent of $(t, \kappa)$, so must be the left hand side. But this is the case if and only if the QALY function is independent of the survey context, or

$\varphi^j(q; t, \kappa) = \bar{\varphi}^j(q) \quad \forall (t, \kappa)$.  (2.5)

The evaluation of instruments proceeds according to the following structure: (1) does the QALY represent ordinal and cardinal preferences over morbidity for a given context, (2) is the QALY independent of changes in context, and (3) is the decision statistic consistent with preferences when context is allowed to change? Preceding this is a discussion of the instruments to be evaluated.

Description of Instruments and Relationships to Preferences

The following discussion includes the five instruments used in the past (the category scale, magnitude estimation, standard gamble, time trade-off, and person equivalents) and a new instrument presented here as a superior alternative for decisions involving interpersonal comparisons (the extended sympathy time trade-off).

Category Scaling (CS). With CS, health states are valued directly relative to one another. The respondent is asked in the survey to place the described health state on a line in relation to two distinct identified points (usually of perfect health and a state equivalent to death) according to the relative worth of each. The distance of this point from the death mark, divided by the distance between the perfect health and death marks, is the QALY value.⁴

⁴ There is also a discrete version of CS. In this case, the respondent is presented with a fixed number of points (or categories) where the health states may be ranked, perfect health and death usually assumed to be in the extreme categories. This reduces respondent effort in evaluation since only approximations are required.

Assuming value assignments are made according to preferences (since "values" are involved, one might be tempted to assume that some "true" (i.e. cardinal) utility function is used, although this assumption is not made here), then the respondent would report values, conditional on his or her context (described by $(t, \kappa)$), as $U(q, t, \kappa)$ = the value of the described health state, $U(q^1, t, \kappa)$ = the value of the better reference state (perfect health), $U(q^0, t, \kappa)$ = the value of the worse reference state (death equivalent), such that the QALY is:

$\varphi^{CS}(q; t, \kappa) = \frac{U(q, t, \kappa) - U(q^0, t, \kappa)}{U(q^1, t, \kappa) - U(q^0, t, \kappa)}$  (2.6)

(the QALY function depends on the reference states as well, although this is suppressed in the notation since these states are assumed to be fixed).

CS requires that preferences reflect the level of satisfaction as well as the preference ordering, that this ordering exist, and that the best and worst states be closed.

Magnitude Estimation (ME). ME is the ratio scale version of CS (which is an interval scale). The researcher identifies a single reference state, $q^1$, as a numeraire and the respondent reports the value of the described health state as a multiple (e.g. fraction) of the value of the reference state.

Assume, as before, that value assignments are made according to some utility function defined over outcomes. Then

$\varphi^{ME}(q; t, \kappa) = \frac{U(q, t, \kappa)}{U(q^1, t, \kappa)}$  (2.7)

ME requires the same conditions on preferences as in the CS case (see above), in addition to the assumption that $U(q^1, t, \kappa) \ne 0$.

Standard Gamble (SG). The standard gamble is based on principles of indifference between games of chance and known intermediate outcomes, as described in Fishburn (1964).
Under certain conditions, depending on preference axioms and how the problem is presented to the respondent, the instrument shows similarities to fractionalization techniques in psychometrics.

The respondent is presented with a health state and is asked to pick the odds (subjective probabilities) for a game with fixed win-loss ($q^1$ and $q^0$ respectively) states (the standard or reference gamble) such that the respondent is indifferent between taking the gamble and the described health state (i.e. the respondent is asked to pick p and 1 - p, given q, such that $\tilde{U}((q, t, \kappa), 1) = \tilde{U}((q^1, t, \kappa), p, (q^0, t, \kappa), (1 - p))$). Then p becomes the numerical assignment of the ordering over health states:

$\varphi^{SG}(q; t, \kappa) = p$  (2.8)

For p to exist, an ordering over gambles must exist and be continuous over q such that agents are willing to accept gambles involving risk of death; for p to be unique, monotonicity must exist in preferences. If the von Neumann-Morgenstern axioms hold, p may be solved for explicitly as a function of the cardinal utility function:

$\varphi^{SG}(q; t, \kappa) = p = \frac{\hat{U}(q, t, \kappa) - \hat{U}(q^0, t, \kappa)}{\hat{U}(q^1, t, \kappa) - \hat{U}(q^0, t, \kappa)}$  (2.9)

Time Trade-Off (TTO). Like the SG, TTO is an economic instrument in that it involves decision making on the part of the respondent about how much of one good should be given up in order to receive more of another. Some health state, q, is described to the respondent, who is told it will last some (hypothetical and exogenous) number of time periods, t, whereupon death will occur. The individual is asked to pick the portion or proportion of this length of life ((t - m)/t) spent in perfect health ($q^1$) that is equivalent to the state initially described (i.e. $U(q, t, \kappa) = U(q^1, t - m, \kappa)$; the hypothetical state is generally described as occurring with certainty). Then

$\varphi^{TTO}(q; t, \kappa) = (t - m)/t$.  (2.10)

Person Equivalents (PE). Like SG and TTO, PE is based on choice and has economic content, although it is based on ethical or social preferences rather than selfish preferences.
In this case, the respondent is told that N people live in health state q and is asked how many persons can be given up (killed off or assigned health state $q^0$) if the rest are given perfect health ($q^1$), leaving society equally well off. Let W denote social preferences in the same way that U denotes selfish preferences. Then the problem becomes to find m such that

$W(q_1, \ldots, q_N, t_1, \ldots, t_N, \kappa_1, \ldots, \kappa_N) = W(q^1_1, \ldots, q^1_{N-m}, q^0_{N-m+1}, \ldots, q^0_N, t_1, \ldots, t_N, \kappa_1, \ldots, \kappa_N)$.

The QALY is

$\varphi^{PE}(q_1, \ldots, q_N; t_1, \ldots, t_N, \kappa_1, \ldots, \kappa_N) = (N - m)/N$,  (2.11)

where the subscripts on the qs, ts, and $\kappa$s identify the individual affected by the health status change.⁵ The respondent must consider the values, needs, and prognoses of the N persons in the sample and, if they differ, the selection mechanism which decides who dies and who is cured. A description of the N individuals' characteristics must be included in the health state description.

⁵ Health characteristic k endured by individual i is denoted

Extended Sympathy Time Trade-off (ES). Sen (1985) has suggested that preference information obtained in the fashion of the first four instruments above is appropriate for intrapersonal valuations, but not for interpersonal valuations. While PE may be based on the ethical preferences necessary for such comparisons, its various administrative and cognitive difficulties, combined with the fact that individual characteristics necessary for such comparisons are seldom provided, make it unsuitable for interpersonal comparisons. Yet most applications of QALY analysis require some interpersonal content in the values since they examine allocations across individuals.

One possible method that does incorporate interpersonal comparisons and may have greater acceptance among respondents than PE (because immediate death is not involved) is a numerical version of extended sympathy.
As originally developed (see Arrow [1978]), extended sympathy involves choosing who is better off: person i in state a or person j in state b. Such exercises provide an ordering over health states and individuals so interpersonal comparisons are possible. In this form, the procedure lacks the necessary measurability properties for many QALY applications. This problem is overcome here by introducing time trade-offs across individuals' states (i.e. the respondent is asked to choose what amount of time in state b endured by person j leaves j as well off as i who endures state a for some specified time). The proportion of time j spends alive relative to i is then a measure of the relative advantage of i's situation over j's.

For instance, if one felt occupation affected preferences over health, one would change the standard time trade-off question to read: "John Doe currently has no use of his left hand and, as a result, works in a service sector job earning 10,000 dollars a year. He will live another ten years in this state. Bill Smith lives in perfect health and has a job in the manufacturing sector earning 20,000 dollars a year. How long will Bill Smith have to live to be as well off as John Doe?" The ES QALY value is the answer to this question divided by 10 years.

Because there are a finite number of personal characteristics, it is possible to define preferences over personal characteristics and outcomes (again, uncertainty does not enter the problem as posed) such that $U(q, t, \kappa_i) = U^i(q, t) \ \forall i$. Then the problem is to choose m such that $U(q_i, t_i, \kappa_i) = U(q^1_j, t_j - m, \kappa_j)$, where

$\varphi^{ES}(q_i; t_i, \kappa_i, t_j, \kappa_j) = (t_j - m)/t_i$.  (2.12)

Representation Results

This section examines whether each of the above instruments assigns values to morbid states in a fashion consistent with preferences, either cardinal or ordinal, for any given context.
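
As an informal numerical companion to the instrument definitions (2.6), (2.8) and (2.10), the following sketch evaluates the CS, SG and TTO values for a single health state. Everything in it is assumed for illustration and is not taken from the thesis: morbidity is collapsed to a scalar q in [0, 1], utility takes the hypothetical form U(q, t) = mu(q) * time_value(t), the SG respondent is assumed to maximise expected utility, and all function names are invented. With a discount rate d = 0 utility is linear in time; with d > 0 it is not, so the TTO value then depends on the survey horizon t.

```python
import math

def mu(q):
    """Hypothetical per-period value of a scalar morbidity level q in [0, 1]."""
    return math.sqrt(q)

def time_value(t, d=0.0):
    """Value of t periods alive: linear when d == 0, discounted at rate d otherwise."""
    return t if d == 0.0 else (1.0 - math.exp(-d * t)) / d

def utility(q, t, d=0.0):
    """Assumed utility over certain outcomes: U(q, t) = mu(q) * time_value(t, d)."""
    return mu(q) * time_value(t, d)

def solve_increasing(f, lo, hi, steps=200):
    """Bisection for the root of an increasing f with f(lo) <= 0 <= f(hi)."""
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def qaly_cs(q, t, q1=1.0, q0=0.0, d=0.0):
    """Category scale, eq. (2.6): position of U(q) between the reference values."""
    u, u1, u0 = utility(q, t, d), utility(q1, t, d), utility(q0, t, d)
    return (u - u0) / (u1 - u0)

def qaly_sg(q, t, q1=1.0, q0=0.0, d=0.0):
    """Standard gamble, eq. (2.8): the p making the gamble indifferent to q,
    under the added assumption that the respondent maximises expected utility."""
    u, u1, u0 = utility(q, t, d), utility(q1, t, d), utility(q0, t, d)
    return solve_increasing(lambda p: p * u1 + (1.0 - p) * u0 - u, 0.0, 1.0)

def qaly_tto(q, t, q1=1.0, d=0.0):
    """Time trade-off, eq. (2.10): (t - m) / t with U(q, t) = U(q1, t - m)."""
    target = utility(q, t, d)
    healthy_time = solve_increasing(lambda s: utility(q1, s, d) - target, 0.0, t)
    return healthy_time / t

if __name__ == "__main__":
    for horizon in (10.0, 40.0):
        print(horizon,
              qaly_cs(0.25, horizon),
              qaly_sg(0.25, horizon),
              qaly_tto(0.25, horizon),
              qaly_tto(0.25, horizon, d=0.05))
```

Under these assumptions the three instruments agree when utility is linear in time (each returns mu(0.25) = 0.5 for either horizon), while the discounted TTO value falls as the horizon lengthens, which is the kind of context dependence the independence condition (2.5) rules out.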
The following lemmata are structured according to whether the QALY acts as an appropriate utility function for complete, reflexive and transitive preferences. Such a function must
1) over its domain preserve the preference ordering over morbid states, whether orderings are defined over levels (i.e. is the function ordinally equivalent to the utility function, U, which is defined over levels of x) or differences in morbidity (i.e. is the function cardinally equivalent to the cardinal utility function, $\hat{U}$, which is defined over changes in states),
2) have as its domain the entire set of morbid states over which preferences are defined,
3) have an image which varies only with relevant changes in preferences (e.g. if morbid states need only be ordered, then only changes in the preference ordering should affect the QALY value; if morbid state differences must be ordered, then changes in cardinal utility or strength of preferences should affect QALY values).

If (1) fails, then the QALY has no welfare content (a state with a higher QALY value is not necessarily preferred to one with a lower QALY value). If (2) fails, states beyond the QALY domain cannot be evaluated even though the respondent has preferences for such states. If (3) fails, care must be taken that extraneous factors are held constant across QALY values for different states or the same health state may be assigned multiple values.

Lemma 1 (CS): Given the assumptions above pertaining to CS
1.a) over its domain, $\varphi^{CS}$ preserves the ordering over outcomes for any given context $(t, \kappa)$.
  b) over its domain, $\varphi^{CS}$ preserves the ordering over morbid state differences, for any given context, if and only if $U \in S^c(\hat{U})$.
2) the domain of $\varphi^{CS}$ is the set of all q over which preferences are defined for given $(t, \kappa)$.⁶
3.a) $\varphi^{CS}$s based on different representations of the same preference ordering over outcomes (U and U') have identical images if and only if $U \overset{c}{=} U'$; otherwise, the functions are ordinally equivalent.
  b) $\varphi^{CS}$s based on different reference states ($(q^1, q^0)$ and $(\bar{q}^1, \bar{q}^0)$) have identical images if and only if $U(q^1, t, \kappa) = U(\bar{q}^1, t, \kappa)$ and $U(q^0, t, \kappa) = U(\bar{q}^0, t, \kappa)$; otherwise, the functions are cardinally equivalent.
Proof: see Appendix A.

⁶ This condition imposes fixity of $(t, \kappa)$ and therefore ignores situations where the domains of q and $(t, \kappa)$ in preferences are not independent. Since this section assumes such fixity throughout, it does not seem appropriate to address issues of independence which assume variability.

Lemma 1 (ME): Given the assumptions above pertaining to ME
1.a) over its domain, $\varphi^{ME}$ preserves the ordering over outcomes for any given context $(t, \kappa)$ if and only if $U(q^1, t, \kappa) > 0$.
  b) over its domain, $\varphi^{ME}$ preserves the ordering over morbid state differences, for any given context and non-zero reference values, if and only if $U \in S^r(\hat{U})$.
2) the domain of $\varphi^{ME}$
is the set of all q over which preferences are defined (for given $(t, \kappa)$) if and only if $U(q^1, t, \kappa) \ne 0$; otherwise, the function is undefined so the domain is the null set.
3.a) $\varphi^{ME}$s based on different representations of the same preference ordering over outcomes (U and U') have identical images if and only if $U \overset{r}{=} U'$; otherwise, the functions are ordinally equivalent.
  b) $\varphi^{ME}$s based on different reference states ($q^1$ and $\bar{q}^1$) have identical images if and only if $U(q^1, t, \kappa) = U(\bar{q}^1, t, \kappa)$; otherwise, the functions are ratio scale equivalent.
Proof: see Appendix A.

Lemma 1 (SG): Given the assumptions above pertaining to SG
1.a) over its domain, $\varphi^{SG}$ preserves the ordering over outcomes for any given context $(t, \kappa)$.
  b) over its domain, $\varphi^{SG}$ preserves the ordering over morbid state differences, for any given context, if and only if $\tilde{U}((q_{win}, t, \kappa), p, (q_{loss}, t, \kappa), 1 - p) = p\,U(q_{win}, t, \kappa) + (1 - p)\,U(q_{loss}, t, \kappa)$ (i.e. the von Neumann-Morgenstern axioms hold), where, by definition, $U \in S^c(\hat{U})$.
2) the domain of $\varphi^{SG}$ is the set of all q whose values are bounded by the values of the two reference states ($q^1$ and $q^0$) for any given level of $(t, \kappa)$.
3.a) $\varphi^{SG}$s based on different representations of the same preference ordering over gambles ($\tilde{U}$ and $\tilde{U}'$) have identical images. If the von Neumann-Morgenstern axioms hold, $\varphi^{SG}$s based on different representations of the same preference ordering over outcomes (U and U') have identical images if and only if $U \overset{c}{=} U'$; otherwise, the functions are ordinally equivalent.
  b) $\varphi^{SG}$s based on different reference states ($(q^1, q^0)$ and $(\bar{q}^1, \bar{q}^0)$) have identical images if and only if $U(q^1, t, \kappa) = U(\bar{q}^1, t, \kappa)$ and $U(q^0, t, \kappa) = U(\bar{q}^0, t, \kappa)$; otherwise, the functions are cardinally equivalent if $\tilde{U} = \sum_i p_i U(q_i, t, \kappa)$, and are ordinally equivalent if not.
Proof: see Appendix A.

Lemma 1 (TTO): Given the assumptions above pertaining to TTO
1.a) over its domain, $\varphi^{TTO}$ preserves the ordering over outcomes for any given context $(t, \kappa)$ if and only if t is fixed; $\varphi^{TTO}$ preserves the ordering over morbidity levels for any t if and only if $U(x) \overset{o}{=} \mu(q, \kappa)\,t$.
  b) over its domain, $\varphi^{TTO}$ preserves the ordering over morbid state differences, for any given context, if and only if $U(q, t, \kappa) = \mu(q, \kappa)\,t$ (i.e. utility is homothetic in time) and $U \in S^c(\hat{U})$.
2) the domain of $\varphi^{TTO}$ is the set of all q such that, for any given $\kappa$, $U(q^1, t, \kappa) \ge U(q, t, \kappa) \ge U(q^1, 0, \kappa)$.⁷
3.a) $\varphi^{TTO}$s based on different representations of the same preference ordering over outcomes (U and U') have identical images. When utility is homothetic in time, $\varphi^{TTO}$s based on different representations of the same preference ordering over morbidity ($\mu$ and $\mu'$) have identical images if and only if $\mu \overset{r}{=} \mu'$; otherwise, the functions are ordinally equivalent.
  b) $\varphi^{TTO}$s based on different reference states ($q^1$ and $\bar{q}^1$) have identical images if and only if $U(q^1, t, \kappa) = U(\bar{q}^1, t, \kappa)$; otherwise, the functions are ratio scale equivalent if utility is homothetic in time and are ordinally equivalent if not.
Proof: see Appendix A.

⁷ Note: The domain of the time trade-off instruments may be extended if some means of borrowing time is introduced. Torrance (1982) suggests that respondents choose the proportion of time spent in perfect health as opposed to some state worse than death such that indifference is achieved between this state and never being born (i.e. $U(\cdot, 0, \kappa) = 0 = U(\bar{q}, (1 - a)t, q, at, \kappa)$). While this value is related to the regular TTO index, the correction factor required to make the two consistent is a function of the surveyed state: assuming homotheticity in time, $\varphi^{TTO}(q; t, \kappa) = 1 - a = \mu(q, \kappa)/(\mu(q, \kappa) - \mu(\bar{q}, \kappa))$. But normal TTO values are scaled to the numeraire $\mu(\bar{q}, \kappa)$, so to make this index comparable with the normal index one must scale by the factor $(\mu(q, \kappa) - \mu(\bar{q}, \kappa))/\mu(\bar{q}, \kappa)$, which itself depends on the surveyed state q. Since this correction factor is based on an unknown value, comparability cannot be achieved. If, instead, one told respondents that this particularly bad state was preceded by a set period of perfect health and that deductions were to be made over the entire period (i.e. $U(\bar{q}, (1 - a)t, q, at, \kappa) = U(\bar{q}, t - m, \kappa)$), then $\varphi^{TTO}(q; t, \kappa) = (a\,\mu(q, \kappa)/\mu(\bar{q}, \kappa)) + ((1 - a)\,\mu(\bar{q}, \kappa)/\mu(\bar{q}, \kappa))$. To make this comparable with the normal index, one would subtract (1 - a) and divide by a. Since a is known, this is a feasible operation.

Lemma 1 (ES): Given the assumptions above pertaining to ES
1.a) over its domain, $\varphi^{ES}$ preserves the ordering over outcomes for any given context $(t_i, \kappa_i, t_j, \kappa_j)$ if and only if the reference time frame ($t_j$) is fixed; $\varphi^{ES}$ preserves the ordering over morbidity levels for any $t_j$ if and only if $U(x_i) \overset{o}{=} \mu(q_i, \kappa_i)\,t_i$.
  b) over its domain, $\varphi^{ES}$ preserves the ordering over morbid state differences, for any given context, if and only if utility is homothetic in time and $U \in S^c(\hat{U})$.
2) the domain of $\varphi^{ES}$ is the set of all $q_i$ such that, for any given $(t_i, \kappa_i)$, $U(q^1_j, t_j, \kappa_j) \ge U(q_i, t_i, \kappa_i) \ge U(q^1_j, 0, \kappa_j)$.
3.a) $\varphi^{ES}$s based on different representations of the same preference ordering over outcomes (U and U') have identical images. When utility is homothetic in time, $\varphi^{ES}$s based on different representations of the same preference ordering over morbidity ($\mu$ and $\mu'$) have identical images if and only if $\mu \overset{r}{=} \mu'$; otherwise, the functions are ordinally equivalent.
  b) $\varphi^{ES}$s based on different reference morbid states ($q^1_j$ and $\bar{q}^1_j$) have identical images if and only if $U(q^1_j, t_j, \kappa_j) = U(\bar{q}^1_j, t_j, \kappa_j)$; otherwise, the functions are ratio scale equivalent if utility is homothetic in time, and are ordinally equivalent if not.
  c) $\varphi^{ES}$s based on different reference person types ($\kappa_j$ and $\bar{\kappa}_j$) have identical images if and only if $U(q^1_j, t_j, \kappa_j) = U(q^1_j, t_j, \bar{\kappa}_j)$; otherwise, the functions are ratio scale equivalent if utility is homothetic in time, and are ordinally equivalent if not.
Proof: see Appendix A.

Lemma 1 (PE): Given the assumptions above pertaining to PE
1.a) over its domain, $\varphi^{PE}$ preserves the social ordering over morbid states for any given context $(t_1, \ldots, t_N, \kappa_1, \ldots, \kappa_N)$ if and only if N, the number of people in the group under consideration, is fixed.
  b) over its domain, $\varphi^{PE}$ preserves the ordering over outcomes for any given context if and only if (i) the social welfare function is welfarist, (ii) N is fixed or social welfare is homothetic, and (iii) $t_i = t$, $\kappa_i = \kappa \ \forall i$, or $\partial^2 U / \partial \xi\, \partial q_k = 0$ for all $(\xi, k)$ and for all i, where $\xi$ is an element of $(t_i, \kappa_i)$. $\varphi^{PE}$ is consistent with the average preference ordering if and only if (i) the social welfare function is welfarist and homothetic and (ii) the selection rule is random and N is large.
  c) $\varphi^{PE}$ preserves the ordering over morbid state differences, for given context, if and only if (i) the social welfare function is welfarist and homothetic, (ii) either $t_i = t$, $\kappa_i = \kappa \ \forall i$, or $\partial^2 U / \partial \xi\, \partial q_k = 0$, and (iii) $U \in S^c(\hat{U})$.
2) the domain of $\varphi^{PE}$ is the set of all morbid status distributions where $W(q^1_1, \ldots, q^1_N, t_1, \ldots, t_N, \kappa_1, \ldots, \kappa_N) \ge W(q_1, \ldots, q_N, t_1, \ldots, t_N, \kappa_1, \ldots, \kappa_N) \ge W(q^0_1, \ldots, q^0_N, t_1, \ldots, t_N, \kappa_1, \ldots, \kappa_N)$ for any given distribution of $(t_i, \kappa_i)$.
3.a) $\varphi^{PE}$s based on different representations of the same social preference ordering (W and W') have identical images.
If the social welfare functionis welfarist and homothetic, then y;r's based on different representationsof the same preference ordering over outcomes (U and U') have identicalimages if and only if U' = aU b i d i = 1, N .b) Let S be the selection rule for deciding who is cured and who dies inthe alternative state. Then 90 's based on different selection rules haveidentical images if and only if (i) the social welfare function is welfarist^and homothetic and (ii) U(q,t i , n i )^v(q) bi (t i ,rc i ) V i = 1,^N .c) cioPE s based on different reference morbid states ((q 1 , q° ) and (41,40)) haveidentical images if and only if^•••, q1V-m , qk-m+1 )^dr, tl ,^t^•••, KN )TkrV1(-i^n-0 °=^n17 " • 7^-7n) `IN -m+1, •••; CT , t 1, •••,t N 1 11'1,^N);otherwise, the functions are cardinally equivalent if the social welfare func-tion is welfarist and homothetic, and ordinally equivalent if not.Proof: see Appendix A.Chapter 2. Separating Good Health Measures from Bad^ 34Summary of RepresentationThese six lemmata demonstrate that assumptions are required for each of these in-struments to represent legitimately preferences. 
To order morbidity levels, only SG requires no additional assumptions. CS and ME require the assumption that valuations are based on preferences, and the other instruments are exact only at the given levels of their respective metric variables (in the neighbourhood of that level, these QALY values linearly approximate the true ordering).

To order differences in morbidity levels, no instrument is exact without further assumptions: SG extracts the cardinal utility function only if the von Neumann-Morgenstern axioms hold, CS and ME require individuals to make valuations based on a "cardinal" utility function, and the others require linearity in the metric variable in the "cardinal" utility function (regardless of the representation of these preferences actually used in the exercise).

The completeness of the trade-off instruments depends on the choice of reference points, the values of which bound the states that may be evaluated. The feasibility of changing these reference points is discussed above, and an improved method (which generates comparable values) is suggested. Changing the reference points changes the interval onto which the image of the QALY function is projected, and thereby changes the values. Where the metric of valuation enters utility homothetically, the two images are cardinally related and may be spliced together using any two common observations. Otherwise, the two functions are very difficult to relate.

Finally, only SG is invariant to affine but not general transforms of the utility function [8], so its image changes if and only if strength of preference changes. TTO, ES, and PE are invariant to general transforms as well, so their images are perturbed only by changes in the ordering itself. If morbid states are only to be ranked, the latter situation is more desirable; if morbidity differences are to be ranked, the former is more appropriate (only differences in preferences that matter then affect the QALY values; otherwise, intransitivities can occur).

[8] CS is as well, but, since this instrument is based on valuations that need not be cardinally related to the cardinal utility function, an affine change in the cardinal utility function may manifest itself as a non-affine change in the utility function used in the CS evaluation.

2.2.4 Independence Results

This section enlarges the policy scope of the QALY by assessing whether the QALY can still evaluate morbid states when the context changes. This is one of the conditions for validity when policy affects more than just morbidity. In addition, a major result from the previous section is that some instruments depend on the level of their respective metrics. Seemingly, the other instruments are independent of these conditioning factors (e.g. Loomes and McKenzie [1989] seem to imply the TTO is inferior to other instruments because of a dependency on time, but do not discuss whether other instruments are also dependent on time). This position ignores that all instruments are conditioned on context $(t, \kappa)$ in the above analysis.

In this section, the conditions for each instrument to be independent of any non-morbid context variable are established. These results indicate when QALYs are valid measures of morbidity in different contexts. Context variables fall into two categories: those controlled by the researcher because they are specified in the health state description, and those supplied by the respondent by default. Dependence on the latter set of factors is more problematic because they are beyond the direct control of the researcher and can only be manipulated through sample selection of the respondent group.

Lemma 2 (CS): $\varphi^{CS}$ is independent of any element of $(t, \kappa)$ if and only if $U(x)$ can be written as $v(q)\,w(t,\kappa) + z(t,\kappa)$ for all $(q, t, \kappa)$ in the domain.
Proof: see Appendix A.

If not, differentiation of the expression for $\varphi^{CS}(q;t,\kappa)$ with respect to any element of $(t,\kappa)$, say $\xi$, reveals

$$\frac{\partial \varphi^{CS}(q;t,\kappa)}{\partial \xi} \lessgtr 0 \iff \big(U_\xi(q,t,\kappa) - U_\xi(q^0,t,\kappa)\big) - \varphi^{CS}(q;t,\kappa)\big(U_\xi(q^1,t,\kappa) - U_\xi(q^0,t,\kappa)\big) \lessgtr 0.$$

A sufficient condition for the CS QALY value to be lower when the level of another good increases is that health status and this good be complements in the utility function.

Lemma 2 (ME): $\varphi^{ME}$ is independent of any element of $(t, \kappa)$ if and only if $U(x)$ can be written as $v(q)\,w(t,\kappa)$ for all $(q, t, \kappa)$ in the domain.
Proof: see Appendix A.

If not, differentiation reveals

$$\frac{\partial \varphi^{ME}(q;t,\kappa)}{\partial \xi} \gtrless 0 \iff U_\xi(q,t,\kappa) - \varphi^{ME}(q;t,\kappa)\,U_\xi(q^1,t,\kappa) \gtrless 0.$$

The ME QALY value for a given morbidity level falls as the level of another good increases if health and this other good are substitutes in the utility function and the reference morbid state is preferred to the measured morbid state.

Lemma 2 (SG): If the von Neumann-Morgenstern axioms hold, and $(t, \kappa)$ are fixed for all states of the world, then $\varphi^{SG}$ is independent of any element of $(t, \kappa)$ if and only if $U(q,t,\kappa)$ can be written as $v(q)\,a(t,\kappa) + b(t,\kappa)$ for all $(p, q, t, \kappa)$ in the domain.
Proof: see Appendix A.

If not, differentiation reveals that $\varphi^{SG}(q;t,\kappa)$ relates to $\xi$ in the same way as $\varphi^{CS}(q;t,\kappa)$.

Lemma 2 (TTO):
a. $\varphi^{TTO}$ is independent of any element of $\kappa$ if and only if $U(x)$ can be written as $U(v(q,t), \kappa)$ for all $(q, t, \kappa)$ in the domain.
b. $\varphi^{TTO}$ is independent of the level of $t$ if and only if $U(q,t,\kappa)$ can be written as $t\,a(q,\kappa)$ for all $(q, t, \kappa)$ in the domain.
Proof: see Appendix A.

If not, the sign of $\partial \varphi^{TTO}/\partial \xi$ is determined by the sign of $\partial MRS/\partial \xi$, and

$$\operatorname{sign}\frac{\partial \varphi^{TTO}(q;t,\kappa)}{\partial t} = \operatorname{sign}\left(\frac{U_t(q,t,\kappa)}{U_t(q^1,t-m,\kappa)} - \varphi^{TTO}(q;t,\kappa)\right).$$

The TTO value increases with $t$ if the marginal utility of time is higher in the measured state than in the reference state.
The effecton the TTO value of any other factor change depends on how that factor affects therelative value of morbidity versus time.Lemma 2 (ES) 9a) If Ki KJ, then (pES is independent of the level of any element of K i j if andonly if au4' ) = 0 for all (q, t i , K i ) in the domain.b) If t i^ti , then cpEs is independent of the level of ti , and the level of t i , ifand only if Ut (x i )t i = Ut (xj )(ti — m) for all (q, ti,^Kj) in the domain (i.e.U is homothetic in t).Proof: see Appendix A.If not, differentiation yields &P BS (Chati§'iti "i "i ) >av^8(pE (q,ti ,tj "iosi)> < 0 and ^i^> / < 0 as Ut(Xi)ti8tthe ES case is similar to the TTO case, althoughutility functions.< 0 and awEs(q:945;ti"."i ) < / > 0 as> / < Ut (x j )(tj — m). Intuitively,it is defined over a larger class ofLemma 2 (PE)a. CpP E is independent of the level of (t, JO (i.e. t i = t, K i = tc) if and only if'In this case, only situations with differing personal characteristics or time frames are considered;the converse situation reduces to the TTO case.Chapter 2. Separating Good Health Measures from Bad^ 38171.7 (v(qi,•••,qN), tl, •••7^N1) •••)ICN)for all (N, qi, t i , n i ) in the domain. In the case where the social welfare func-tion is welfarist, this requires (i) that social welfare be homothetic and (ii)U(qi ,t i) v(qi)a(ti,Ki)+ b(t i ,b. (pPE is independent of the distribution of (t i , KO (i.e. t i^ti , K i^Ki ) if andonly ifo1;17 (v(q17.••,qN),ti,.••,tN,Ki,.•., N)for all (N, qi ,t„ n i ) in the domain. In the case where the social welfare func-tion is welfarist, this requires (i) that social welfare be homothetic and (ii)U(qi ,ti, n i ) cv(qi )H- b(t i , KO.Proof: see Appendix A.As before, differentiation of the expression for PE reveals the direction of biaswhen these conditions do not hold. In the case where a non-metric variable changes(i.e. 
not the number of people alive), the effect on the PE QALY values depends on how this factor affects the willingness to trade off one person's health for another's, i.e.

$$\frac{\partial \varphi^{PE}(q_1,\ldots,q_N; t_1,\ldots,t_N,\kappa_1,\ldots,\kappa_N)}{\partial \xi} \gtrless 0 \iff \frac{\partial MRS_{q_i,q_j}}{\partial \xi} \gtrless 0, \quad \text{where } \xi \in (t_i,\kappa_i),\ i \in [1,\ldots,N].$$

Summary of Independence

These lemmata reveal an important distinction between the independence conditions for context variables set by respondents and those set by the researcher (the levels of the metrics). The latter are much more stringent, since metrics must enter utility homothetically [10], while preferences for morbid states need only be separable from the other context variables (the exact form of this separability is determined by the class of transformations of utility that do not affect the QALY values for a particular instrument).

[10] Note that a function of one variable is homothetic if it is increasing in that one variable.

It is possible to determine how biased the QALY values are across contexts if the second-derivative properties of the utility function are known. Independence requires null second derivatives: cross-context QALYs are essentially linear approximations to the true values. The degree and sign of the bias depend on the amount and direction of curvature in preferences.

2.2.5 Equivalence Conditions

Each QALY instrument is some transformation of some conditional utility function. The Representation section establishes the conditions necessary for the transformations to be the same (i.e. for the mapping to occur to the same interval), while the Independence section shows the conditions for the conditional orderings of morbidity implied by the QALY values to be the same regardless of the context in which the QALY is derived.
Combining the results from the two sections, the following Proposition may be stated:

Proposition 1: Equivalence

a) all instruments yield identical results, given appropriate choice of reference points, identical individuals who share a common but arbitrary level of non-health characteristics, and a homothetic, welfarist SWF, if and only if preferences may be represented by
$$U(x_s, p_s) = \sum_s p_s\, v(q_s)\, t_s\, w(\kappa_s),$$

b) all economic instruments (SG, TTO, ES, PE) are identical, given the conditions in (a), if and only if preferences may be represented by
$$U(x_s, p_s) = \sum_s p_s\, v(q_s)\, t_s\, a(\kappa_s) + b(\kappa_s),$$

c) both selfish economic instruments (SG, TTO) are identical, given appropriate choice of reference points, if and only if preferences may be represented by
$$U(x_s, p_s) = f\Big(\sum_s p_s\, v(q_s)\, t_s,\ \kappa\Big),$$

d) both interpersonal instruments (PE, ES) are identical, given the conditions in (a), if and only if preferences may be represented by
$$U(x) = v(q_i)\, t\, a(\kappa_i) + b_i(\kappa_i),$$
but if individuals are not identical, preferences must be of the form
$$U(x) = v(q_i)\, t.$$

Proof: see Appendix A.

If, after appropriately rescaling values (by adjusting reference health states to map to the same interval), results differ, it can be concluded that these conditions do not hold. A choice must then be made between instruments.

2.2.6 Instrument Selection

This section serves two purposes: first, the implications of inexactness in the decision statistics are brought forward and, second, it is demonstrated how problems of this nature can be overcome by appropriately fixing the reference context.

In this section, four common policy spaces and their respective decision statistics are examined: (1) to choose between health states for a single individual (QALT), (2) to measure the health status improvement from a project (CUR), (3) as a measure of societal health (sum of QALTs), and (4) to choose between patients for a given health status change (QALT).
The conditions under which these decision statistics are exact are stated, and their performance under realistic parameterizations of the utility function is considered.

QALY as an element in a health index

The decision statistic in this case is $H(q,t) = \varphi^j(q)\,t$ (context variables are held fixed and suppressed in the notation). This health index orders $(q,t)$ appropriately if and only if preferences can be represented by $U = v(q)\,t$. Suppose this is not so. Then, from Lemma 1 (TTO), $\varphi^{TTO}$ is inexact except when the survey and actual time frames coincide. The non-time based indexes order morbid states appropriately, but are inexact over sets of $(q,t)$. In fact, over $(q,t)$, only the TTO with time appropriately set is exact (this result can be found in Mehrez and Gafni [1989], although they do not examine the exactness properties of the other instruments).

Example: Let $U(q,t) = \int_0^t e^{-r\tau} v(q)\, d\tau$. This is a commonly accepted utility function defined over time. Usually, $r$ is taken to be between .05 and .10. Then $\varphi^j(q)\,t = v(q)\,t \neq U(q,t)$ unless $r = 0$ ($j$ = CS, ME, SG, and PE), whereas

$$t - m = \varphi^{TTO}(q,t)\,t = \begin{cases} -\tfrac{1}{r}\ln\big(1 - r\,U(q,t)\big) & \text{if } r > 0, \\ v(q)\,t & \text{if } r = 0, \end{cases}$$

so that $t - m$ is an increasing transformation of $U(q,t)$ for every $r$.

The bias from using one of the other instruments may then be found: $B = (t - m)[r = 0] - (t - m)[r > 0]$, which is approximated to first order using the derivative $\partial (t-m)/\partial r$ and the change $dr$. Since $dr > 0$, and it is assumed that $U(q,t) > 0$ in order for TTO to be well-defined, $B > 0$ if $1/U(q,t) > r$ and $B < 0$ if $1/U(q,t) < r$. But since $(t - m)$ is undefined under the latter condition, it can be asserted that $B$ is always positive, more so the larger is $t$ or $r$.

This implies that, under usual preferences, the TTO instrument assigns lower values to morbid states than the other QALY instruments. If values derived by different instruments are compared, morbid states evaluated by non-TTO instruments are overvalued relative to those valued by TTO (morbid states are consistently ranked within any given instrument, but not when different instruments are used).
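The discounting example above can be checked numerically. The following is a minimal sketch, assuming a constant morbidity weight v(q), a reference state with v = 1 (full health), and the exponential-discounting utility of the example; the function names and the parameter values are illustrative assumptions, not part of the original analysis.

```python
import math


def discounted_utility(v, t, r):
    """U(q,t) = integral from 0 to t of exp(-r*tau) * v dtau, with v = v(q) constant."""
    if r == 0:
        return v * t
    return v * (1.0 - math.exp(-r * t)) / r


def tto_value(v, t, r):
    """TTO value (t - m) / t, where t - m is the time in the reference state
    (assumed to have v = 1) that yields the same utility as t years at level v."""
    u = discounted_utility(v, t, r)
    if r == 0:
        return u / t
    healthy_time = -math.log(1.0 - r * u) / r  # solves U(q1, t - m) = U(q, t)
    return healthy_time / t


if __name__ == "__main__":
    v, t = 0.5, 20.0  # hypothetical morbidity weight and time horizon
    for r in (0.0, 0.05, 0.10):
        print(f"r = {r:.2f}: TTO value = {tto_value(v, t, r):.4f}, other instruments = {v:.4f}")
```

Under these assumptions the TTO value coincides with v(q) only at r = 0 and falls below it as r rises, which is the direction of bias described in the example.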
The problem becomes significant when both morbidity and longevity change between states. Non-TTO instruments (inappropriately) undervalue states with greater length of life, whereas the TTO instrument assigns appropriate weight to length of life and morbidity during it. Thus, if non-TTO instruments are used, health care policy will improve the quality of life more and the length of life less than what would achieve the greatest level of satisfaction.

QALY as a measure of health status improvement

The decision statistic in this case is $\varphi^j(q^A)\,t^A - \varphi^j(q^B)\,t^B$ (non-health factors are fixed and suppressed in the notation). This is exact if and only if $\varphi^j(q)\,t$ is cardinally equivalent to $U(q,t)$ (i.e. the QALY value is cardinally related to the cardinal utility function, so that differences in the QALY value appropriately measure changes in morbidity). From Lemma 1 (SG), if the von Neumann-Morgenstern axioms are imposed, $\varphi^{SG}$ is cardinally related to $U$ for any given context (i.e. when the hypothetical and actual time frames coincide). Additional and even more unrealistic assumptions must be imposed for the other instruments to generate this result. However, as is shown below, this is not sufficient for the SG to rank health status differences appropriately, because neither the dependence of SG on $t$ nor the functional form of F has been considered. In fact, the very disappointing result is that, in general, the SG ranks only changes in morbidity appropriately.

Consider first the case where $U = v(q)\,t$. Then $\varphi^{SG}(q) = a\,v(q) + b$ and $\varphi^{TTO}(q) = v(q)$. The decision statistic values are:

$$\Delta \varphi^{SG} t = \big(a\,v(q^A) + b\big)t^A - \big(a\,v(q^B) + b\big)t^B = a\,\Delta U + b\,(t^A - t^B) \qquad (2.13)$$

(where $\Delta$ is the difference operator), which is exact if either $b = 0$ or $t^A = t^B$ in all applications.
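Equation (2.13) can be illustrated with a small numerical sketch. It assumes U = v(q)t and an SG value of the form a·v(q) + b; the parameter values and function names are hypothetical and chosen only to show how a nonzero b combined with unequal time frames can reverse the ranking of two states.

```python
def sg_statistic(v_a, t_a, v_b, t_b, a, b):
    """SG-based decision statistic: (a*v_A + b)*t_A - (a*v_B + b)*t_B."""
    return (a * v_a + b) * t_a - (a * v_b + b) * t_b


def true_change(v_a, t_a, v_b, t_b):
    """Change in utility between states A and B when U = v(q) * t."""
    return v_a * t_a - v_b * t_b


if __name__ == "__main__":
    # Hypothetical states: A is long but poor in quality, B is short but good.
    v_a, t_a = 0.4, 10.0
    v_b, t_b = 0.9, 5.0
    a, b = 0.5, 0.5  # hypothetical affine scaling of the SG value

    delta_u = true_change(v_a, t_a, v_b, t_b)
    delta_sg = sg_statistic(v_a, t_a, v_b, t_b, a, b)
    print("true change in utility:", delta_u)
    print("SG decision statistic: ", delta_sg)
    print("a*dU + b*(tA - tB):    ", a * delta_u + b * (t_a - t_b))
```

In this hypothetical case the true change is negative (B is preferred) while the SG statistic is positive, because the term b(t^A - t^B) dominates; setting b = 0 or t^A = t^B removes the discrepancy.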
Conversely, since TTO employs some utility function equivalent to $v(q)\,t$,

$$\Delta \varphi^{TTO} t = \big(a\,v(q^A)\big)t^A - \big(a\,v(q^B)\big)t^B = a\,\Delta U, \qquad (2.14)$$

which is exact regardless, suggesting TTO is superior to SG.

Now consider the case $U = v(q)\,w(t)$ (since this case allows for the possibility of discounting, it may be considered fairly general). One might think that the result above does not hold because the TTO is time dependent while the SG is not. This is only partially true. The decision statistic values are:

$$\Delta \varphi^{SG} t = \big(a\,v(q^A) + b\big)t^A - \big(a\,v(q^B) + b\big)t^B = a\,\Delta\!\left(U\,\frac{t}{w(t)}\right) + b\,(t^A - t^B), \qquad (2.15)$$

which is exact only if $t^A = t^B$ in all applications or $w(t) = at$. In contrast,

$$\Delta \varphi^{TTO} t = w^{-1}\big(a\,v(q^A)\,w(t^A)\big) - w^{-1}\big(a\,v(q^B)\,w(t^B)\big), \qquad (2.16)$$

which is exact if and only if $w^{-1}(U) = aU + b$.

In both cases, exactness requires that $w$ be linear. Concavity/convexity of $w$ biases both measures upwards/downwards. The bias in the SG case depends only on the level of $t$, whereas in the TTO case it depends also on the level of $q$. Which generates the larger bias cannot be determined. The SG is superior to the TTO only if only morbid factors vary in a health state.

This result is quite startling, since it suggests the SG, even with linear expected utility, cannot rank differences in health states correctly unless the conditions for the TTO to do so hold as well (note that it is not necessary to impose cardinal utility functions in the TTO case). This is because the TTO function is not a value function in this case (and cannot measure differences appropriately), and the SG does not value time appropriately relative to morbidity. The result is similar to that obtained in the cost-benefit literature, which shows that the conditions for willingness-to-pay measures and money metrics to be exact are essentially the same: homotheticity of income in the indirect utility function.
Instead of adding up across individuals, this decision statistic adds up across time periods, hence the difference in restriction.

QALY as an indicator of societal health

Consider the case where QALYs are used to compare health profiles across large groups of people. The indicator used is $\sum_{i=1}^{N} \varphi^j(q_i) = \varphi^j(q)\,N$. This index has the same structure as the health status index above, except that $t$ is replaced by $N$, selfish preferences are replaced by social preferences, and PE replaces TTO as the principal instrument.

Example: A commonly accepted and flexible expression for the social welfare function is $W = \big(\sum_i v(q_i)^r\big)^{1/r}$. At $r = 1$ this is the inequality neutral SWF. Larger values of $r$ indicate inequality affection, while lower values indicate inequality aversion. Then $\varphi^{PE}(q) = v(q)^r$, while $\varphi^j(q) = v(q)$ for all other instruments (assuming homotheticity in all other metrics). The bias may then be expressed as $B = \big(\varphi^j(q) - \varphi^{PE}(q)\big)N = \big(v(q) - v(q)^r\big)N$. Since $v(q) \in [0,1]$, $B = 0$ if $r = 1$, $B < 0$ as $r$ falls, and $B > 0$ as $r$ rises (an unlikely situation). Thus, one would expect the bias to be non-positive.

Typically, under inequality averse preferences, non-PE instruments assign values to morbid states that are lower than those assigned by the PE instrument. In situations where the number of people occupying any given state may vary, the non-PE instruments overvalue states where there is inequality in outcomes and those with larger numbers of people. If these values are used to direct policy instead of PE values, too many resources will be directed to propagation and not enough to improving the health status of those who are already alive but unwell.

QALYs as a rationing device across patients

Suppose there are only two types of prospective patients, characterized by a high or low level of some factor, $y$ (say, income). Suppose also that $y$ is an enabling factor in the enjoyment of health (i.e. $U_{qy} > 0$). Let this relationship be expressed as $U(q, y) = v(q)\,w(y)$.
Then $\varphi^{ES} = v(q)\,\dfrac{w(y)}{w(y_H)}$, while $\varphi^j = v(q)$ for all $j \neq ES$ (assuming homotheticity in the relevant metrics). Finally, for ease of demonstration, assume that social welfare is additive (this is a more restrictive function than the one above). The problem is to choose which patient gets the health improvement. The true value of a health status change is given by the SWF:

$$\Delta SWF(H) = \big(v(q^A) - v(q^B)\big)\,w(y_H), \qquad (2.17)$$
$$\Delta SWF(L) = \big(v(q^A) - v(q^B)\big)\,w(y_L), \qquad (2.18)$$

where H (L) denotes that the high (low) individual experienced the change from $q^B$ to $q^A$. Then one would choose the H (L) individual if

$$\big(v(q^A) - v(q^B)\big)\big(w(y_H) - w(y_L)\big) > (<)\ 0. \qquad (2.19)$$

By non-ES QALY analysis, one would obtain

$$\Delta \sum_{1}^{2} \varphi^j(H) = v(q^A) - v(q^B), \qquad (2.20)$$
$$\Delta \sum_{1}^{2} \varphi^j(L) = v(q^A) - v(q^B), \qquad (2.21)$$

indicating indifference as to who received the change. By ES QALY analysis, however,

$$\Delta \sum_{1}^{2} \varphi^{ES}(H) = v(q^A) - v(q^B), \qquad (2.22)$$
$$\Delta \sum_{1}^{2} \varphi^{ES}(L) = \big(v(q^A) - v(q^B)\big)\left(\frac{w(y_L)}{w(y_H)}\right). \qquad (2.23)$$

Thus,

$$\Delta \sum_{1}^{2} \varphi^{ES}(H) > \Delta \sum_{1}^{2} \varphi^{ES}(L) \iff \Delta SWF(H) > \Delta SWF(L). \qquad (2.24)$$

Thus, only the ES instrument indicates the choice consistent with social welfare. Other instruments inappropriately discriminate against people who can derive greater benefit from a health status improvement.

The above examples all carry two lessons. First, the choice of decision statistic cannot be made without reference to preferences defined over the whole policy space. Arbitrary decision statistics cannot order states of the world appropriately, regardless of which QALY is used. Second, the problem of identifying the decision statistic may be circumvented if the QALY is redefined with a broader context (all relevant factors are included explicitly in the reference state), such that the QALY becomes a sufficient statistic over the whole policy space.
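The patient rationing comparison in equations (2.17) to (2.24) can be traced with a short sketch. It assumes U(q, y) = v(q)w(y), an additive social welfare function, and the high-y patient as the ES reference person; all numerical values and function names are hypothetical.

```python
def swf_gain(v_after, v_before, w_y):
    """True welfare gain under an additive SWF: (v(qA) - v(qB)) * w(y)."""
    return (v_after - v_before) * w_y


def qaly_gain_non_es(v_after, v_before):
    """QALY gain from any non-ES instrument: it ignores the enabling factor y."""
    return v_after - v_before


def qaly_gain_es(v_after, v_before, w_y, w_ref):
    """ES QALY gain: scaled by w(y) relative to the reference person's w."""
    return (v_after - v_before) * w_y / w_ref


if __name__ == "__main__":
    v_before, v_after = 0.4, 0.7  # hypothetical morbidity values for qB and qA
    w_high, w_low = 2.0, 1.0      # hypothetical w(y_H) and w(y_L)

    print("SWF gain, H vs L:   ", swf_gain(v_after, v_before, w_high),
          swf_gain(v_after, v_before, w_low))
    print("non-ES gain, H vs L:", qaly_gain_non_es(v_after, v_before),
          qaly_gain_non_es(v_after, v_before))
    print("ES gain, H vs L:    ", qaly_gain_es(v_after, v_before, w_high, w_high),
          qaly_gain_es(v_after, v_before, w_low, w_high))
```

With these illustrative numbers the non-ES gains are identical for the two patients, while the ES gains rank the patients in the same order as the social welfare function.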
Redefining the QALY over a broader context in this way is basically the path Mehrez and Gafni (1989, 1991) have begun to follow for the first two cases above. The results are easily extended to other policy spaces.

2.2.7 Summary of Validity Results

The conclusions of this section are fourfold. First, necessary and sufficient conditions for validity of a QALY function exist and stem from the necessary welfare properties of the decision statistics which incorporate QALY values for program evaluation. Only necessary conditions exist for the QALY in those situations where the QALY is not a sufficient statistic (i.e. must be combined with other information to evaluate states). Second, any QALY index can be made exact if the state description in the survey is set equal to the actual policy situation in every characteristic. The conditions for exactness to be robust to changes in characteristic values differ across instruments, but while the conditions for some instruments are more stringent than for others, no instrument is completely independent without some form of preference restriction. Third, because of this (variable) dependence on non-morbid factors, different QALY instruments do not, in general, generate the same QALY values. Fourth, the appropriate choice of QALY instrument depends on the choice of decision statistic in which its values are to be used, which in turn depends on the types of characteristics which change between the states under consideration.

Combining the last two points, one can conclude that the use of the wrong QALY function in a decision statistic is likely to lead to biased results and to the wrong policy being implemented. The obvious conclusion is that the researcher must choose his or her QALY instrument according to what is being evaluated. But this creates the dilemma of how to compare states which affect different characteristics when the evaluation of each state requires the use of different QALY functions which are noncomparable.
Fortunately, the second result implies that exactness and comparability can be achieved in these cases by selecting one metric (i.e. one instrument) defined over an enhanced information set (i.e. expand the morbid state description to include not only the value of the metric, but the levels of all other factors which vary between states as well). In this way, correct decisions in health policy can be made.

2.3 Part II: Implementation

2.3.1 Nature of the Problem

The recent proliferation of medical procedures and increased demand for these services from an aging population have increased the need for health care programme evaluations. [11] Yet, despite the potential benefits to be had from such information, QALYs have been constructed for relatively few health states. [12] The premise of this section is that this discrepancy arises because, in most cases, the cost of a QALY analysis is too high to be recouped by the anticipated gains (over randomly allocated resources). This inference is supported by the fact that only high cost projects, where gains are apt to be large, have been evaluated to date.

[11] Because of the peculiarities of the health care system, the price mechanism fails to allocate resources efficiently. Instead, efficient allocation must be undertaken by a central planner according to the information provided by program evaluations. Torrance developed the QALY in the early 1970's as a health status index to be used for health care program evaluation. The premise of the QALY index is that time spent in poor health is worth less than time spent in good health. The QALY value is used to scale time alive according to the quality of the health endured during it, so that time spent in good health is assigned a higher value than time spent in ill-health. Since the index is based explicitly on preferences, it has a stronger welfare foundation than the other health status indexes available.

[12] QALY evaluations done to date have focused on cancer therapy (Sutherland et al. [1983] and McNeil et al. [1982]), major organ dysfunction (Weinstein and Stason [1977] and Churchill et al. [1984]), and neonatal care (Boyle and Torrance [1984]). Torrance (1987) and Maynard (1991) provide the most comprehensive QALY value tables, but even these are limited to about a dozen distinct states and tend to be drawn from the above three categories. Rosser and Kind (1978) have attempted to produce generalized QALY values based on morbid characteristics and not specific disease states. While this method may be an efficient solution, the use of only two vaguely defined characteristics is inadequate for many situations.

There are several factors which contribute to these high costs. First and foremost, QALY values are not freely observable in the market, but must be obtained through expensive and time consuming surveys. Second, these surveys must be replicated across a large number of respondents because (a) responses to hypothetical survey questions tend to have high variances, (b) a representative sample of the population must be surveyed since the marginal consumer cannot be identified ex ante, and (c) a critical mass of QALY values is needed for meaningful comparisons of projects (there is no absolute critical cost per QALY value, so selection criteria can only be relative). Third, because health has many aspects, there exist many possible health states that have to be evaluated (see Williams [1983] or Boyle and Torrance [1984] for a discussion of how just a few characteristics and severity levels can generate a large number of distinct health states).

The usual response to this situation has been to evaluate only high profile projects. Recently, some attention has been focused on how to produce QALY values more cheaply. The purpose of this paper is to examine the proposed cost saving techniques to determine what bias may ensue and its significance, and to find the most cost-effective method of obtaining QALY values to encourage their use. This is important because, if biased values are used, incorrect policy decisions (those that do not maximize welfare) may be made, and, if QALY based allocations are forsaken, allocation may become arbitrary (e.g. according to the lobbying effort of interest groups).

2.3.2 Literature Review

The choice of production technique is a two-step procedure. First, the instrument or method used to evaluate each state must be chosen. Second, the aggregation rule, which determines how many of these states need to be evaluated directly, must be selected. Both decisions involve trade-offs between the quality of the QALY values produced and the costs of producing them. The literature review reveals that these two parts of the production process have only been considered independently, even though the decisions and outcomes are necessarily joint.

Instrument Selection

Preferences over health states are converted to a numerical scale by the use of a survey instrument, a device used to measure strength of preference between two states. While there are as many instruments as there are metrics against which to measure strength of preference, the analysis is restricted to a set of five which, between them, provide valid indexes for all current QALY applications (see the previous section).
These are: category scaling (CS), standard gamble (SG), time trade-off (TTO), person equivalents (PE) and extended sympathy time trade-off (ES). [13] The different survey techniques/instruments require different amounts of time, effort and other inputs, and are associated with different cost structures. They also involve distinct decision processes, so they may yield different results. While quality-input trade-offs are obviously present, no cost function for instrument selection has been specified because the cost and output concepts are vague and are usually considered independently of one another.

The best sources for cost information are user manuals and surveys (such as Furlong et al. [1990], Drummond et al. [1987], and McDowell and Newell [1987]). Costs arise from the various inputs used in the evaluation process (see Drummond et al. for a description) and are difficult to calculate because the different inputs are often measured in dissimilar units and, to the extent common monetary values are available, these are often specific to the researcher's particular environment. Thus, most cost analyses resort to the use of imperfect and partial signals, such as the time taken for a respondent to complete the task (Furlong et al.), response consistency (Froberg and Kane [1989]), and non-response rates (Patrick et al. [1973]). Such measures are often only ordinally measurable, providing a ranking of costliness (in descending order: SG, TTO, CS) but no quantification of these costs (although it is commonly believed the difference between SG and TTO is less than the difference between TTO and CS).

Apart from the lack of quantification, such analyses are inadequate because they do not consider the effects of cost reduction on the quality of the index produced, focusing only on the number of states valued and not on how accurate these values are.
Accuracy considerations can be found in an almost distinct body of literature dealing with the "validity" of various instruments. Theoretical concepts of validity are often ill-defined, resulting in partial and misleading analysis (e.g. Torrance [1976a] and Loomes and McKenzie [1989] deal with only the interval properties of the function, focusing primarily on the SG; Pliskin et al. [1980] and Mehrez and Gafni [1989] examine context dependence, but only for the TTO). [14]

[13] The magnitude estimation and category scaling instruments are equivalent in this framework.

Such theoretical work has been accompanied by various empirical investigations into equivalence (see Stevens [1959], Torrance [1976b], Read et al. [1984], Wolfson et al. [1982], Bombardier et al. [1982] and Rosser and Kind [1978]). This body of work suggests that different instruments generate results that are highly correlated but not equivalent. While one must first identify the valid instrument before the significance of these differences can be ascertained, it can be concluded that some inaccuracy will arise when instruments are interchanged, suggesting a cost-quality trade-off exists.

Aggregation Methods

An aggregation rule is a procedure whereby values for states which have not been assessed directly (i.e. from a survey respondent) are obtained by some extrapolation from the values of states that have been assessed. The concept of valuation by select aspects was first discussed by van Praag (1968), but it was not until Keeney and Raiffa's book (1976) that due consideration was given to how these evaluations should be combined. Keeney and Raiffa establish the functional forms that must characterize utility in order for multi-attribute theory to be applicable [15]. These are the additive, the multiplicative, the multi-linear, and the holistic.
Typically, each morbid characteristic is valued in isolation of any other morbid characteristic by direct means, and the value for any configuration of morbid characteristics is then reconstructed according to some function of the values associated with its component parts. Different functions are associated with different aggregation rules with different parameter requirements (which entail additional valuations to identify).

¹⁴ Conditions for validity are derived in the previous section.

¹⁵ While multi-attribute theory developed in economics, the theory of conjoint measurement evolved independently in the psychometric literature. Additive conjoint measurement is similar to additive forms of utility, while polynomial measurement resembles multiplicative forms.

Cost savings depend on the combination rule chosen. If $K$ is the number of mutually exclusive characteristics, then the holistic approach requires $2^K$ QALY valuations, the multi-linear approach requires $2^K - 1$, the multiplicative approach requires $K$ and the additive approach requires $K - 1$.

Verification of the various independence conditions necessary for the different methods to produce identical values has been irregular. Such reconstruction methods are standard procedure for virtually all health status indexes (e.g. Activities of Daily Living, Sickness Impact Profile) and have been advocated by Torrance (1986) as a reasonable approximation to the holistic (direct) QALY method, yet very few authors have examined the underlying assumptions of such an approach and even fewer have attempted to verify these conditions empirically. Culyer (1976) recognizes explicitly the assumptions which make the atomistic (additive) approach viable in a QALY-specific context.
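As a rough illustration of how quickly the valuation burden diverges across aggregation rules, the counts stated in this passage ($2^K$, $2^K - 1$, $K$ and $K - 1$) can be tabulated for a few values of $K$. The function name and the values of $K$ below are illustrative assumptions, not part of the original analysis.

```python
def valuations_required(K: int) -> dict:
    """Number of QALY valuations needed for K present/absent morbid
    characteristics, using the counts stated in the text.
    (Hypothetical helper, for illustration only.)"""
    if K < 1:
        raise ValueError("K must be a positive integer")
    return {
        "holistic": 2 ** K,        # every configuration valued directly
        "multi-linear": 2 ** K - 1,
        "multiplicative": K,
        "additive": K - 1,
    }


if __name__ == "__main__":
    # Arbitrary example sizes; the holistic count doubles with each extra characteristic.
    for K in (3, 5, 10):
        print(K, valuations_required(K))
```

The holistic and multi-linear counts grow exponentially in $K$, whereas the multiplicative and additive counts grow only linearly, which is the source of the cost savings discussed above.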
Rosser and Kind (1978) provide enough information for a crude test of additivity in their disability-distress matrix (the marginal disutility of distress increases with disability, refuting independence over these two characteristics), but neither perform such a test nor provide sufficient information to evaluate the statistical significance of the result. Giauque and Peebles (1976) employ multidimensional utility theory in the valuation of various states related to streptococcal sore throat and rheumatic fever (ten attributes), but only so far as to impose independence assumptions, not to test for them (with only 13 respondents, this is not feasible). Krischer (1976), in his study of cleft palate, finds that up to three-quarters of his sample of 119 exhibit pairwise independence of preferences over the three attributes of speech, appearance, and hearing. Although these results appear to contradict the evidence of Rosser and Kind, it must be remembered that the test procedures and the relevant vector of characteristics differ between the two studies. Indeed, the biggest problem with studies of this nature is their specificity: the results are applicable to a very narrow set of morbid characteristics and are not useful for most applications.

Boyle et al. (1983, 1984) provide the first comprehensive and thorough analysis of atomistic methods in their work on neonatal intensive care. Faced with 23 characteristics that generated 960 possible states, they test whether the holistic values are additive functions of atomistic values. They find specifications with interactive terms are significantly superior to those without such terms.
Thus, they adopt the multiplicative form of Keeney and Raiffa because it better accommodates these effects. Their analysis only tests for independence properties over extreme ranges of severity and, by the authors' own admission, only tests over a small subset of health states. They advise against the extrapolation of their results to other subspaces. A far more serious flaw in the analysis, one which they apparently do not recognize, is that the multiplicative form is never tested against more flexible alternatives: the alternative hypothesis of multiplicative separability is accepted even though the conditions necessary for it to be unbiased are never assessed.

Avenues for Further Research

This review demonstrates four gaps in the literature to date. First, the production processes are treated disjointly, even though this is infeasible in practice. Second, the objective function is not stated in terms consistent with the objectives of QALY analysis (welfare maximization). Third, the relationship between costs and the accuracy of QALY values is seldom made explicit. Fourth, meaningful estimates of the trade-offs involved are either unavailable or apply only to very specific contexts.

In this paper, trade-offs are modelled explicitly, with the bias generated by the choice of instrument appropriately nested within the bias function for the aggregation method chosen. These bias functions are related to the welfare losses caused by inaccurate QALY values (this being a function of the expectation of choosing the wrong state and the consequences of this choice). Estimates of the trade-offs involved are provided for general health states that are useful in most applications.

2.3.3 Analysis

The purpose of this paper is to provide enough information about the cost-output trade-offs in QALY calculation so that the policy maker can make rational choices about which production methods to use.
Because cost data are location specific, the production function, the primal to the cost function, is analyzed. The information provided is equivalent, but is robust to environments with different input prices.

Assumptions

The assumptions made in Part I regarding preferences and their domain are sufficient for the analysis here. However, a slightly modified notation for the decision statistics needs to be adopted to accommodate the additional QALY construction methods analyzed here. Hence, the earlier assumptions are augmented by:

i) Let $\Xi_i$ represent the "single-dimensioned" morbid state:
$$\Xi_i = (\xi_1^1, \ldots, \xi_{i-1}^1, \xi_i, \xi_{i+1}^1, \ldots, \xi_K^1)$$
(i.e. $\Xi_i$ is the vector of morbid characteristics with all but the $i$th element at perfect health).

j) The holistic QALY value for any health state $q$ is obtained by surveying $q$ with instrument $j$ ($j$ = CS, SG, TTO, PE, ES):
$$\varphi^j(q; t, \kappa).$$
(This corresponds to the QALY functions used in the previous section.)

k) The atomistic QALY value is the (necessarily holistic) QALY value for a single-dimensioned morbid state:
$$\varphi^j(\Xi_i; t, \kappa) \equiv \varphi^j(\xi_1^1, \ldots, \xi_{i-1}^1, \xi_i, \xi_{i+1}^1, \ldots, \xi_K^1; t, \kappa).$$

l) Let $\varphi^{j,A}(q; t, \kappa)$ denote the QALY function reconstructed by method $A$ ($A$ = multi-linear, multiplicative, additive) over atomistic functions derived with instrument $j$, which is analogous to the holistic QALY value for the multi-dimensioned state $q$:
$$\varphi^{j,A}(q; t, \kappa) = f^A\big(\varphi^j(\Xi_1; t, \kappa), \ldots, \varphi^j(\Xi_K; t, \kappa)\big),$$
where $f^A(\cdot)$ denotes the aggregation function for reconstruction method $A$.

2.3.4 Model

The policy maker seeks to obtain a set of QALY values which, when used in the appropriate decision statistic, correctly rank more preferred health care projects higher than less preferred projects. Such QALY values are called valid. QALY values which do not satisfy this property are biased.

QALY values are not freely available and the policy maker, faced with a limited research budget, may have to adopt cheaper and more biased methods of constructing QALY values than would be the case if QALY values were costless.
The problem is to minimize the amount of bias incurred for a given cost saving, i.e.
$$\min B(p) \tag{2.25}$$
subject to
$$C \geq c(p) \tag{2.26}$$
where $p$ is the production process, $B(p)$ is the aggregate bias of the QALY values generated by the production method $p$, $c(p)$ is the cost of producing these values by production method $p$, and $C$ is the evaluation budget.¹⁶ This problem is not actually solved in this paper, since the optimal choice depends on cost structures and the value of life, both of which are unknown.

¹⁶ The problem may be reformulated such that efficiency gains must recoup the costs of evaluation. This approach requires the monetary value of incorrect project selection be stated explicitly. In the case above, this value is implicit in the Lagrange multiplier.

2.3.5 Production Method

The production process consists of two parts: a survey instrument ($j$) and an aggregation rule ($A$):
$$p = \{j, A\}$$
There are five instruments available: the category scale (CS), the standard gamble (SG), the time trade-off (TTO), the extended sympathy time trade-off (ES), and the person equivalent (PE).

The CS requires that the survey respondent value a described health state relative to two reference states, say $q^1$ and $q^0$. If such valuations are consistent with the utility function over certain outcomes, this instrument generates the following QALY function
$$\varphi^{CS}(q; t, \kappa) = \frac{U(q,t,\kappa) - U(q^0,t,\kappa)}{U(q^1,t,\kappa) - U(q^0,t,\kappa)} = aU(q,t,\kappa) + b, \tag{2.27}$$
where $a$ and $b$ are constants.

The SG requires that the survey respondent choose the probability vector for a reference or standard gamble involving given win and loss states ($q_{win} = q^1$ and $q_{loss} = q^0$ respectively) such that he or she is indifferent between accepting the gamble or the described health state with certainty. Then this instrument generates the following QALY function
$$\varphi^{SG}(q; t, \kappa) = p \tag{2.28}$$
where
$$\mathcal{U}\big(U(q,t,\kappa)\big) = \mathcal{U}\big(U(q^1,t,\kappa), p;\; U(q^0,t,\kappa), (1-p)\big). \tag{2.29}$$
If the von Neumann-Morgenstern axioms hold,
$$\varphi^{SG}(q; t, \kappa) = \frac{U(q,t,\kappa) - U(q^0,t,\kappa)}{U(q^1,t,\kappa) - U(q^0,t,\kappa)} = aU(q,t,\kappa) + b, \tag{2.30}$$
where $a$ and $b$ are constants.

The TTO requires that the survey respondent select how much time in perfect health ($q^1$ for convenience) is equivalent to a given period of time spent in the described health state. Then this instrument generates the following QALY function
$$\varphi^{TTO}(q; t, \kappa) = \frac{t - m}{t} \tag{2.31}$$
where
$$U(q, t, \kappa) = U(q^1, t - m, \kappa). \tag{2.32}$$
If utility is homothetic in time, then
$$\varphi^{TTO}(q; \kappa) = \frac{\mu(q,\kappa)}{\mu(q^1,\kappa)} = a\mu(q,\kappa), \tag{2.33}$$
where $a$ is some constant.

The ES is similar to the TTO, except the reference state is characterized by a given health state ($q^1$) and personal characteristics ($\kappa_j$). Then this instrument generates the following QALY function
$$\varphi^{ES}(q; t_i, \kappa_i, t_j, \kappa_j) = \frac{t_j - m}{t_i} \tag{2.34}$$
where
$$U(q, t_i, \kappa_i) = U(q^1, t_j - m, \kappa_j) \tag{2.35}$$
where $i$ indicates the respondent's characteristics. If utility is homothetic in time, then
$$\varphi^{ES}(q; t_i, \kappa_i, t_j, \kappa_j) = \frac{\mu(q,\kappa_i)}{\mu(q^1,\kappa_j)} = a\mu(q,\kappa_i), \tag{2.36}$$
where $a$ is some constant.

The PE is based on social, rather than selfish, preferences, trading off the health status of groups of individuals to achieve states of social indifference.
If social welfare, $W$, is welfarist, then this instrument generates the following QALY function
$$\varphi^{PE}(q_1, \ldots, q_N; t_1, \ldots, t_N, \kappa_1, \ldots, \kappa_N) = \frac{N - m}{N} \tag{2.37}$$
where
$$W\big(U(q,t_1,\kappa_1), \ldots, U(q,t_N,\kappa_N)\big) = W\big(U(q^1,t_1,\kappa_1), \ldots, U(q^1,t_{N-m},\kappa_{N-m}), U(q^0,t_{N-m+1},\kappa_{N-m+1}), \ldots, U(q^0,t_N,\kappa_N)\big). \tag{2.38}$$
If $W$ is homothetic in population and $\kappa_i = \kappa$, $t_i = t \;\forall\, i$, then
$$\varphi^{PE}(q; t, \kappa) = \frac{U(q,t,\kappa) - U(q^0,t,\kappa)}{U(q^1,t,\kappa) - U(q^0,t,\kappa)} = aU(q,t,\kappa) + b, \tag{2.39}$$
where $a$ and $b$ are constants.

Except when preferences over morbidity are weakly separable from non-morbid factors and utility is homothetic with respect to all metrics, the different instruments can be expected to generate different QALY functions (see the previous section).

There are three aggregation methods available in addition to the holistic approach which is based on the entire configuration of the health state. These are described in Keeney and Raiffa (1976). Modifying their equations for the QALY case (where values are based on differences from a maximal rather than a minimal point), these are:

multi-linear: If each characteristic is weak difference independent (WDI) of the remaining characteristics (i.e. the ordering of preference differences over characteristics does not depend on the level of characteristics which do not change), then $\varphi^j(q; t, \kappa)$ can be represented by a multi-linear combination of the atomistic QALY functions:
$$\varphi^{j,A}(q; t, \kappa) = 1 - \Big(\sum_{k=1}^{K} \lambda_k \big(1 - \varphi^j(\Xi_k; t, \kappa)\big) + \sum_{k=1}^{K}\sum_{h>k} \lambda_{k,h} \big(1 - \varphi^j(\Xi_k; t, \kappa)\big)\big(1 - \varphi^j(\Xi_h; t, \kappa)\big) + \cdots + \lambda_{1,\ldots,K} \big(1 - \varphi^j(\Xi_1; t, \kappa)\big) \cdots \big(1 - \varphi^j(\Xi_K; t, \kappa)\big)\Big) \tag{2.40}$$
with $\sum_k \lambda_k + \sum_k \sum_{h>k} \lambda_{k,h} + \cdots + \lambda_{1,\ldots,K} = 1$, where the $\lambda$'s are preference weights (or trade-off values) and $K$ is the number of mutually exclusive morbid characteristics (assuming morbidity can only be present or not present for each characteristic (there are no severity levels within elements), this equals the number of elements in the set $q$).

multiplicative: If each characteristic exhibits mutual preference independence (MPI) (i.e. preference for each characteristic is independent of the level of all other characteristics, or weak separability), then $\varphi^j(q; t, \kappa)$ can be represented by a multiplicative combination of the atomistic QALY functions:
$$\varphi^{j,M}(q; t, \kappa) = 1 - \Big(\sum_{k=1}^{K} \lambda_k \big(1 - \varphi^j(\Xi_k; t, \kappa)\big) + \lambda \sum_{k=1}^{K}\sum_{h>k} \lambda_k \lambda_h \big(1 - \varphi^j(\Xi_k; t, \kappa)\big)\big(1 - \varphi^j(\Xi_h; t, \kappa)\big) + \cdots + \lambda^{K-1} \lambda_1 \cdots \lambda_K \big(1 - \varphi^j(\Xi_1; t, \kappa)\big) \cdots \big(1 - \varphi^j(\Xi_K; t, \kappa)\big)\Big) \tag{2.41}$$
with $1 + \lambda = \prod_{k=1}^{K} (1 + \lambda\lambda_k)$.

additive: If each characteristic is strong difference independent (SDI) of the remaining characteristics (the actual value of the preference differences, not just the ordering, is invariant to the level of the characteristics that do not change, or strong additive separability), then $\varphi^j(q; t, \kappa)$ can be represented by an additive combination of the atomistic QALY functions:
$$\varphi^{j,D}(q) = 1 - \Big(\sum_{k=1}^{K} \lambda_k \big(1 - \varphi^j(\Xi_k; t, \kappa)\big)\Big) \tag{2.42}$$
with $\sum_{k=1}^{K} \lambda_k = 1$.

When the independence properties between the morbid characteristics do not hold, the different aggregation methods yield different values. The net difference between production methods depends on how the effects of non-independence between morbid characteristics interact with the effects of non-independence between morbid and non-morbid characteristics.

The budget constraint

Assume the policy maker's resources for evaluation are set at some level, $C$. The cost of producing a given set of QALY values by production method $p$ is $c(p)$. This function is assumed to have the following structure
$$c(p) = L(A)\,c^j(q) \tag{2.43}$$
where $c^j(q)$ denotes the unit cost of evaluating any given health state, $q$, by instrument $j$¹⁷ and $L(A)$ is the number of values that need to be found when aggregation method $A$ is employed.¹⁸

¹⁷ Furlong et al. (1990) report that respondents are able to evaluate five to six health states at a time. Thus, while there are significant fixed learning costs, these are quickly dissipated by respondent limitations. Efficiency dictates that the whole process be repeated frequently to achieve minimum costs. Because there are many states to be evaluated, replication will have to be undertaken and the average costs will appear to be relatively constant.

¹⁸ All methods of reconstruction require the estimation of $K$ atomistic functions (corresponding to the $K$ mutually exclusive types and levels of health ailments), but different numbers of trade-off values (the higher order $\lambda$'s). Thus, the multi-linear method requires $(2^K - 1)$ parameters to be calculated, the multiplicative requires $(K + 1 - 1)$ such parameters, while the additive requires only $(K - 1)$. In comparison, the holistic approach requires $2^K$ valuations. Parsimony in these parameters is purchased with more stringent preference assumptions.

The bias function

The bias function represents the losses the policy maker incurs when the estimated QALY values differ from the true or valid QALY values. In the previous section, utility theory was used to generate the conditions for a valid QALY index. It was found that different applications had different requirements which were fulfilled by different instruments. The difference between the values assigned to health states by the "true" index (denoted by $\varphi^t(q; t, \kappa)$) and the values associated with any other index are relevant only to the extent that they cause incorrect project rankings. This must be included in the specification of the maximand.

In the past, equivalence between the approximate index and the true index was assumed to be necessary for unbiasedness, regardless of how differences between the two indexes affected the decision statistic and, therefore, choices. The problem may be viewed from two contexts: ex post (true values are known), and ex ante (values are unknown).

In the ex post case, the QALY may or may not be a sufficient statistic (if it is, the QALY value alone determines the project choice; otherwise, the QALY value must be combined with other information before a choice can be made). If the QALY is a sufficient statistic, then the equivalence condition is overly strong since only the ordering of health states by the approximate and true indexes need be the same, i.e.
$$\varphi^{j,A}(q; t, \kappa) = \theta\big(\varphi^t(q; t, \kappa)\big), \tag{2.44}$$
where $\theta$ is an increasing monotonic function. In this case, the appropriate distortion measure is¹⁹
$$\big|\theta^{-1}\big(\varphi^{j,A}(q; t, \kappa)\big) - \varphi^t(q; t, \kappa)\big|. \tag{2.45}$$
If the QALY is not a sufficient statistic, equivalence is necessary (because of the measurability conditions imposed when QALY values are combined with other data), but is not sufficient since the welfare consequences of incorrect choices depend on the additional non-QALY information. The welfare loss is equal to this difference only if the QALY is cardinally related to utility (which may or may not be the case depending on the properties of the QALY and whether aspects of the state other than morbidity change). Since this information is specific to each assessment, the welfare consequences of a wrong decision cannot be evaluated until the nature of the assessment is identified and the sum of incorrect decisions must serve as a first order approximation of the losses generated by a wrong decision.

In the ex ante case, the policy maker does not know the true values of the health states under consideration, just the health outcomes themselves. This situation is more likely to arise than the ex post case: after all, if the true values are known, there would be no point in collecting the QALY data. The values for these health states may be viewed as random draws from the QALY interval (the set of all possible values). One may assume that the distribution of these values is uniform.²⁰
Then, given that there are a large number of states to be evaluated, the probability that any one state is incorrectly ranked is a linear function of the difference between the calculated and the true QALY values. (The larger is this difference, the larger the interval of neighbouring states against which this state is incorrectly ranked, the more likely the alternate health state will draw a value in this interval. Linearity arises from the assumption of random draws from a uniform distribution.) Then the probability of making a wrong choice for a single health state is
$$a\,\big|\theta^{-1}\big(\varphi^{j,A}(q; t, \kappa)\big) - \varphi^t(q; t, \kappa)\big| + b \tag{2.46}$$
($\theta^{-1}$ reflects the required degree of measurability), where $b = 0$ (since this probability must necessarily equal zero when the two functions are identical) and $a > 0$ ($a = 1$ only if the values of all states lie within the unit interval).

¹⁹ $\theta$ may be found formally if the functional forms of the two QALY generating functions are known, or by Box-Cox estimation if the only information available is the set of values generated by each function for some set of states.

²⁰ The assumption of uniformity is not supported by the distribution of QALY values found to date (which appear to follow a Beta distribution). However, this may be an artifact caused by selection bias. Since there are not enough observations to confidently specify a distribution's parameters, and because the uniform distribution provides a good deal of intuition for the results obtained, and because the assumption probably holds over sub-intervals of the QALY range, the uniform density is adopted.

The expected number of incorrect responses is the probability that any one response is wrong taken over all possible configurations of the $\xi$'s, i.e.
the sum of these probabilities.

If it is unknown what, if any, information is to be combined with the QALY values, then this sum represents an undominated expression for the welfare losses incurred when incorrect decisions are made (i.e. no other expression can be considered better). Thus, equivalence is undominated in situations of complete ignorance. However, if additional information regarding the true distribution of the QALY values in question, or how the values are to be subsequently used, becomes available, another representation of the welfare loss could be superior. This discussion indicates under what conditions equivalence is the appropriate criterion.

The policy maker's problem can be specified
$$\min \sum_{\{k\}} \big|\theta^{-1}\big(\varphi^{j,A}(q; t, \kappa)\big) - \varphi^t(q; t, \kappa)\big| \tag{2.47}$$
(where $\{k\}$ is the number of configurations of the morbid characteristics) subject to
$$C \geq L(A)\,c^j(q), \tag{2.48}$$
(the $a$ above can be suppressed in the minimand if the Lagrangian is correspondingly adjusted).

2.3.6 Parameterization

In this section, parameter values for the above problem are derived for a broad range of health states. Such information is necessary to assess the cost-accuracy trade-off. While the policy maker is apt to know the nature of the constraint, having access to specific information about $C$ and $c^j$, he or she is less likely to know the nature of the minimand. The main contribution of this paper is to provide a reasonable estimate of the minimand over a general range of health states so that rational choices can be made by the policy maker about how to obtain QALY values.

The bias generated by any particular production technique is a function of the biases arising from the two components of the production decision.
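The constrained choice in (2.47) and (2.48) can be sketched as a small enumeration over instrument and aggregation-rule pairs. This is only an illustration under stated assumptions: every number below (unit costs, bias figures, budgets) is made up, the name `select_method` is hypothetical, and the bias table simply adds an instrument term to a rule term, whereas the text treats instrument bias as nested within the aggregation bias. The only features taken from the text are the cost structure $L(A)\,c^j$, the valuation counts for $K = 3$, and the cost ordering SG > TTO > CS.

```python
from itertools import product


def select_method(instruments, rules, unit_cost, n_values, bias, budget):
    """Pick the (instrument, rule) pair with the smallest bias among those
    whose cost L(A) * c^j does not exceed the budget C.
    Returns (instrument, rule, bias), or None if no pair is affordable."""
    feasible = [
        (bias[(j, A)], j, A)
        for j, A in product(instruments, rules)
        if n_values[A] * unit_cost[j] <= budget
    ]
    if not feasible:
        return None
    best_bias, j, A = min(feasible)
    return j, A, best_bias


# Hypothetical inputs for K = 3 characteristics (illustration only).
INSTRUMENTS = ["CS", "TTO", "SG"]
RULES = ["holistic", "multi-linear", "multiplicative", "additive"]
UNIT_COST = {"CS": 1.0, "TTO": 2.0, "SG": 3.0}  # assumed values, ordered SG > TTO > CS
N_VALUES = {"holistic": 8, "multi-linear": 7, "multiplicative": 3, "additive": 2}
INSTRUMENT_BIAS = {"CS": 0.30, "TTO": 0.10, "SG": 0.05}  # made-up figures
RULE_BIAS = {"holistic": 0.0, "multi-linear": 0.05,
             "multiplicative": 0.10, "additive": 0.20}   # made-up figures
BIAS = {(j, A): INSTRUMENT_BIAS[j] + RULE_BIAS[A]
        for j in INSTRUMENTS for A in RULES}

if __name__ == "__main__":
    for budget in (1.0, 8.0, 100.0):
        print(budget, select_method(INSTRUMENTS, RULES, UNIT_COST,
                                    N_VALUES, BIAS, budget))
```

With a tight budget the cheapest feasible pair may combine a mid-cost instrument with a restrictive aggregation rule; as the budget grows, the selection moves toward the holistic approach.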
Since the bias associated with the choice of instrument is nested within the bias associated with the aggregation rule, it is dealt with first in isolation and then in conjunction with the aggregation rule.

2.3.7 Bias from Instruments

Five QALY instruments are considered: category scaling, standard gamble, time trade-off, person equivalents, and extended sympathy time trade-off. The assumptions and functional representations of these instruments are discussed in the Analysis section.

From Part I, equivalence requires (1) independence of preferences over morbidity from non-morbid factors, and (2) homotheticity of preferences with respect to the metric variable (i.e. the variable against which the health state is measured must, for some monotonic transform, enter utility linearly). While empirical analysis is necessary to verify these equivalence conditions, resources are not available to do so. Because these results are necessary to proceed with the rest of the analysis, they are reconstructed axiomatically and a sensitivity analysis is performed to assess how reasonable an approximation this approach provides.

Reference to the literature (see the review) suggests that independence is satisfied (comparisons show no disjointness in the relationships between instrument functions), but homotheticity is not (the functions are not perfectly congruent). Using this as a starting point, a utility function is posited with a functional structure and parameter values defined over a range consistent with empirical observation.
While a great manyassumptions must be made to support this function, it has been chosen so that noneof them is very unreasonable given empirical evidence.Consider the following utility functions which are able to accommodate discount-ing over time, non-linear expected utility (including non-independence in probabilitiesand regret theory), preference variation, and, in the case of social welfare, inequalityaversion:U (q, t, tc) = fo e -" 0(p)qw(K)dr. (2.49)SW F (EUic)(11c) (2.50)i=1(where the social welfare function is defined over all N members of society). In thesimulation, the discount rate, r , is set to 0, 5, and 10 percent, a range which spansmost reasonable estimates of time preference. Expected utility is expressed witha weighted probability function along the lines suggested by Chew (1980): q(p) =aP(1—p)6 . This function can represent a variety of behaviours under uncertainty,including non-independence (a 1) and regret theory (d > 1), as well as the standardvon Neumann-Morgenstern case ((a, d) = (1, 1)). The range selected is (a, d) =(1, 1), (.75, 1.5), (.5, 3), where d = f (a) such that the responses follow the patternChapter 2. Separating Good Health Measures from Bad^ 66found in experimental settings by Kahneman and Tversky (1979). w(k) = lc is ageneric function which scales the marginal utility of morbidity according to the level ofthe enabling factors, K . The sub-utility function over morbidity is, for convenience, setequal to the level of q, which is assumed to be measurable as a single aggregate statistic(results generalize to richer specifications). The range of this function is restricted to[Ildeath to avoid situations where the QALY functions could be undefined. Noticethat this functional form assumes preferences over q are separable from non-q factors,a position that appears to be supported by the empirical evidence of Torrance (1976b),Read et al. 
(1984) and others that show mean orderings suffer no discontinuities,which would arise if this were not the case. The social welfare function is given bythe mean of order c, with c = 1, .5, 0 (corresponding to inequality neutrality, and twocases with greater inequality aversion).The QALY functions associated with these utility functions (after appropriatenormalization) are:cocs(q; n)^ qw a5 G (q; t, lc ) dql /aioTTO (q; t n)^ (2.51)ln(1 — q(1 — exp{rt}))10ES (q; t i, ni,ti,ni)^= -4- ln(1^q(: (C: ) )(1 — exp{rt}))soPE^qN ; t 1 , ...,t N 1 ,^N ) = qc.The results of the sensitivity analysis are depicted graphically in Appendix B.The specification of q affects only the SG function, the levels of r and t affect bothTTO and ES, the level of K affects the ES function, while the value of c affects PE.Divergence depends on which instruments are being compared and over what range ofq. At times, differences can be quite large, even with conservative parameter values.While many conclusions can be drawn from this simulation exercise, probably the twomost significant are that divergence tends to be greatest in the upper-middle range ofChapter 2. Separating Good Health Measures from Bad^ 67morbidity (where the most prevalent morbid states lie) and that differences betweenthe instruments are evident even with conservative estimates of the parameter values.Inferences must be made with caution, however, since the simulation does not generateresults consistent with some actual evidence. (The work of Sackett and Torrance[1978] suggests that 0 has been misspecified and that a > 1. Since values at this levelcontradict the evidence of Kahneman and Tversky [1979] and the properties Chew[1980] writes the function 95 should possess, the sensitivity analysis is not extendedto this range.)These results suggest that the biases from using inappropriate QALY functionscan be quite significant, but are sensitive to the choice of parameter values. 
²¹ For the purposes of this paper, a given set of parameter values must be employed. These are chosen, where possible, to match empirical evidence in the literature, and, where a range of values is reported, small, conservative figures are chosen (this strengthens the final conclusions of the paper). These are: $r = .05$ (a value consistent with most reports of the social rate of discount (including Drummond et al. [1987])), g5(p) = p.„ .f15(1 _p).„ (which mimics the pattern found by, among others, Kahneman and Tversky [1979]), $\omega(\kappa_i) = (1 \pm .25)\,\omega(\kappa_j)$ (i.e. preferences are allowed to vary around the "norm" by 25 per cent, an arbitrary but not unreasonable range), and $c = .9$ (to reflect a small amount of inequality aversion, again set arbitrarily but not unreasonably). These values seem to be a reasonable starting point for the rest of the analysis.

²¹ It is straightforward to show equivalence is achieved if $r = 0$, $\phi(p) = p$, $\omega(\kappa_i) = \omega(\kappa_j)$ and $c = 1$ (i.e. when all metrics enter utility homothetically and independence with respect to the other variables holds). These values are not consistent with casual observation.

2.3.8 Bias from Aggregation

There are four methods of reconstructing QALY values: holistic, multi-linear, multiplicative, and additive. These are described, along with equivalence conditions, in the Literature Review (the holistic imposes no structure on preferences, the multi-linear requires weak difference independence (WDI), the multiplicative requires mutual preference independence (MPI), and the additive requires strong difference independence (SDI)).

A review of the literature shows what little work that has been done to verify these conditions is applicable to very specific health states, is often based on flawed test design (not all hypotheses are refutable), and considers only subsets of the strategy space (the aggregation choice is not nested within the instrument choice).
Since previous work is unreliable or unsuitable, the independence conditions are evaluated in this paper.

Literature Review

The multi-attribute literature presents three methods of testing for preference restrictions directly (with utility rather than price and income data).

regression methods: This method is based on the estimated relationship between the function values and the arguments of the function. The parameter estimates are checked to see if they satisfy certain restrictions (usually the analysis is confined to verifying the insignificance of higher order terms to support SDI, although other tests are theoretically possible). An example of this approach can be found in Klein et al. (1985).

Problems with this approach include: (1) the test is dependent on the model specification (and on the assumption that the statistically best model is the true model), (2) the test may be sensitive to the scales of measurement of the independent variables (see Veit and Ware [1982] or Birnbaum [1973]), (3) the procedure tests for homogeneous separability, so that the test may reject the hypothesis of separability in separable structures because they are not also homogeneous (see Blackorby et al. [1978]), and (4) the parameter restrictions for the non-additive cases, particularly the multi-linear, are typically neither derived nor evaluated (especially in the health services research literature).

experimental methods: This method requires the solicitation of values for both the atomistic and holistic value functions. The atomistic values are then aggregated, according to the weight restrictions, and compared to the holistic values. Boyle et al. (1983) provide an example of this approach. They set the $\lambda_i$'s (the preference weights used in the aggregation function) equal to the atomistic values and evaluate the expression $(1 + \lambda) = \prod_i (1 + \lambda\lambda_i)$.
They find $\lambda$ is statistically different from zero, refuting the additive case, and, accepting this rejection of the one null hypothesis as support for the other hypothesis, adopt a multiplicative approach instead. Problems with their approach include: (i) the atomistic and holistic values are obtained by different methods (and, hence, may be on different scales), (ii) the atomistic values are evaluated only at the extrema (where distortions between instruments are most likely) so that values for severity are never calculated and (iii) only the additive case is posed as a refutable hypothesis. As with the regression approach, the tests are transformation dependent (so the caveat above in the regression section still applies).

axiomatic methods: This method examines the underlying behaviour of preferences directly. Birnbaum's (1973) factorial methods are one example of this approach (although generalizations beyond the additive case are required). Closer examination of the independence requirements reveals each independence condition generates a restriction on how utility may respond to changes in morbidity. For SDI, the size of the utility change caused by a change in some morbidity characteristic must be invariant to the level of any other characteristic, i.e.
$$\frac{\partial^2 U(q,t,\kappa)}{\partial \xi_i\, \partial \xi_j} = 0 \quad \forall\, i \neq j. \tag{2.52}$$
MPI requires weak separability across all morbid characteristics.
This in turn requires that the marginal rate of substitution between any two characteristics be invariant to the level of any third characteristic, i.e.

$$\frac{\partial}{\partial \xi_k}\left(\frac{\partial U(q,t,\kappa)/\partial \xi_i}{\partial U(q,t,\kappa)/\partial \xi_j}\right) = 0 \quad \forall\, k \neq i,j. \qquad (2.53)$$

WDI requires that the ordering of utility differences be invariant to the level of any morbid characteristic not generating these differences, i.e., whenever $U_{\xi_i}(q,t,\kappa)/U_{\xi_j}(q,t,\kappa) > 1$,

$$\frac{\partial MRS_{\xi_i,\xi_j}}{\partial \xi_k} > 1 - MRS_{\xi_i,\xi_j} \quad \forall\, k \neq i,j, \qquad (2.54)$$

where the subscript on $U$ denotes the first partial derivative with respect to that variable. Note that SDI $\Rightarrow$ MPI $\Rightarrow$ WDI. Note also that SDI is transformation dependent.

All independence conditions are refutable. Complexity of the tests increases linearly with the number of characteristics, not exponentially (an additional characteristic requires additional tests be performed on that characteristic, but does not affect the tests over the other characteristics). Tests can be done on subsets of characteristics without having to specify the relationship over all characteristics. However, test results cannot be interpolated between severity levels, so observations must exist for every configuration to be tested. Further, weaker versions of independence require more data points to be refuted.

Data

The data are comprehensively described in Appendix C. All variables are drawn from the General Social Survey (GSS) of 1985. This survey collected data on satisfaction with health and health status from a stratified sample of 11,000 Canadians.

Dependent Variable

The dependent variable chosen is satisfaction with health. This is the satisfaction concept most consistent with utility over morbidity, the variable of interest. Even so, additional assumptions are required.
First, it must be assumed that this measure is cardinally equivalent to utility over morbidity or to utility over health after standardization for age (cardinality may be relaxed in the non-additive cases). This requires that responses be time independent (i.e. the satisfaction associated with a particular morbid state does not depend on the duration of that state) and that all individuals use the same time period (through which the morbid state endures) in assessing their well-being. It is also assumed that individuals assign the same meaning to the different categories (e.g. that "very satisfied" means the same thing to different people). Additional assumptions are required if structure in the preference relationships is used to infer structure in the QALY functions.

The above independence properties all apply directly to the function under consideration. In this project, however, verification of these independence conditions in the QALY function must be done by reference to the properties of the utility function for which the data are available.²² Since $\varphi_j(q;t,\kappa) = f_j(U(q,t,\kappa))$ (see equation 2.51), what restrictions on $U(q,t,\kappa)$ are necessary for independence properties over $q$ in $\varphi_j(q;t,\kappa)$ to hold (the necessity part of the following lemmata)?

Lemma 3: $\varphi_j(q;t,\kappa)$ exhibits WDI over $q$ if and only if $U(q,t,\kappa)$ exhibits WDI over $q$.

Lemma 4: $\varphi_j(q;t,\kappa)$ exhibits MPI over $q$ if and only if $U(q,t,\kappa)$ exhibits MPI over $q$.

Lemma 5: $\varphi_j(q;t,\kappa)$ exhibits SDI over $q$ if and only if $U(q,t,\kappa)$ exhibits SDI over $q$ and $\varphi_j \sim U$ (i.e. $\varphi_j$ is cardinally equivalent to the utility function which is strongly additive).

Proofs: see Appendix A.

Because SDI is transformation dependent, there exists the potential for Type I error (i.e. there might exist some representation of preferences, $\theta(U)$, non-linear, which satisfies SDI when $U$ does not).
In this case, there exists some transform of the QALY data that can be aggregated, and the resulting sum can be untransformed to yield the appropriate QALY value. Depending on the form of this representation and how the QALY and $U$ are related, these transformations can be quite complex.

In these cases, verification of reconstructability in $U$ is sufficient to prove the possibility of reconstructability in $\varphi$. The functional form of the aggregator function is the same, although the parameter estimates can be expected to differ.

²² In fact, satisfaction, not utility, data are available. It is assumed here that $S \sim U$ (see Sen [1985] for arguments why this relationship may or may not exist). The lemmata above are easily modified to be consistent with this relationship over satisfaction.

Independent Variables

The set of independent variables is determined by what is available in the data set and what is feasible for the statistical program used (SHAZAM). The GSS collects data on four categories of ill-health: chronic disability, including endurance (stair climb, walk, carry, stand), agility (bend, grasp, reach), and perception (hearing, sight); short term incapacitation (number of bed and sick days in the two week period prior to the interview); social health (number of contacts and visits with friends and family); and functional health (ability to carry out everyday activities). The last category is available only for persons over age 55, so it is dropped from the analysis. With a second order approximation, it is impossible to use all the variables in the analysis without exceeding the capacity of the statistical program.
Instead, variables are grouped into five categories: endurance, agility, perception, short-term, and social (these groupings are suggestive of the conceptual frameworks often used to analyze health status, so that this paper evaluates the appropriateness of such concepts in addition to its primary goal of assessing independence across morbid characteristics). If ill-health is indicated in any component of these aggregates, then the variable is assigned a value of one; otherwise, it is assigned a zero value. The significance of these groupings is tested on a ten per cent sample of the data by isolating one component and testing whether the coefficients on the isolated component and its respective interaction terms differ significantly from the coefficients associated with the aggregate variable based on the remaining components. Except for the agility variable, where the results become unreliable (coefficient values exceed the bounds required in a probability model), no statistically significant differences are found. Such an approach also helps to minimize the multicollinearity that is present in the data.

Specification

To verify the independence conditions, a hybrid of the regression and axiomatic approaches is employed. Basically, the utility function is estimated by regression techniques and the predicted values are examined to see if they are consistent with the utility restrictions derived from the axiomatic approach. The coefficients on the interaction terms are critical for these tests, so a flexible functional form is desirable. To curtail the profusion of parameters, however, only a second order approximation is used (this assumes higher order terms are insignificant). Since the multilinear expression is essentially a second order approximation, it cannot be refuted in this specification.
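The grouping rule and the second order specification can be illustrated with a short sketch. It is an illustration only, not the SHAZAM code used for the estimates: the component names, the dictionary layout, and the helper functions are hypothetical, and each component is assumed to be already coded as 1 when ill-health is indicated. Only the any-component rule, the labels E, A, P, S and L used in Appendix D, and the resulting count of one constant, five first order terms and ten interaction terms come from the text.

```python
from itertools import combinations

# Hypothetical component names for the five aggregates:
# E = endurance, A = agility, P = perception, S = short-term, L = social.
GROUPS = {
    "E": ["stairs", "walk", "carry", "stand"],
    "A": ["bend", "grasp", "reach"],
    "P": ["hearing", "sight"],
    "S": ["bed_days", "sick_days"],
    "L": ["contacts", "visits"],
}


def aggregate(respondent):
    """Return one 0/1 indicator per group: 1 if any component shows ill-health."""
    return {
        group: int(any(respondent.get(component, 0) for component in components))
        for group, components in GROUPS.items()
    }


def second_order_row(indicators):
    """Build the regressors: constant, first order terms, pairwise interactions."""
    names = list(GROUPS)
    row = {"const": 1}
    for name in names:
        row[name] = indicators[name]
    for a, b in combinations(names, 2):
        # With binary indicators the interaction is 1 only if both are present.
        row[f"{a}x{b}"] = indicators[a] * indicators[b]
    return row


# A made-up respondent with a walking limitation and a sight problem.
example = aggregate({"walk": 1, "sight": 1})
print(second_order_row(example))
```

Because the indicators are binary, raising them to any positive power leaves them unchanged, so the same regressors serve any second order form built on powers of the indicators.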
The tests are evaluated anyway, more as a demonstration of methodology than for the results themselves.

Since the independent variables are characterized by binary structures,²³ the best choice of flexible functional form is the Generalized Leontief (GL).²⁴ ²⁵ ²⁶ Thus, the true relationship to be estimated is

$$S = \beta_0 + \sum_{i=1}^{5} \beta_i q_i^{0.5} + \sum_{i=1}^{5} \sum_{j>i}^{5} \beta_{ij} q_i^{0.5} q_j^{0.5} + R \qquad (2.55)$$

(where the third and higher order terms, included in the remainder $R$, are assumed to be insignificant and not to affect the estimates of the lower order terms).

²³ Note that this ensures Birnbaum's (1973) criticisms are unjustified (because there is no measurement scale that can distort the model specification).

²⁴ Of the alternative specifications, the translog has undefined terms because of variables with zero values (and if these values are rescaled away from zero, the estimates will be biased since the translog is sensitive to measurement scale); the generalized CES requires evaluation of differences from base levels, but levels are a meaningless concept with ordinal data.

²⁵ Actually, the binary structure of the independent variables poses problems for any flexible functional form since derivatives are not well-defined. As such, a Taylor's series expansion, on which most flexible functional forms are based, is not valid for these data. However, if Diewert's (1973) original definition of flexibility is reinterpreted in a discrete framework, some support is given to the flexibility of the specification.

²⁶ Even without the structural problems of binary data, the GL is not truly flexible since it imposes homogeneity (see Blackorby et al. [1978]). It is not possible to rectify this problem with the data available since (1) higher order approximations exceed the capacity of the statistical package used and (2) the data are not generated by any optimizing behaviour, so the function cannot be partitioned into variable and scale components.

Estimation

Since the dependent variable is an ordered categorical measure, taking on one of four ranked values, the method used to estimate the utility relationship is an ordered probit analysis.²⁷ The ordered probit assumes normality in the distribution of errors. While the Central Limit Theorem does not hold in this case (because the individuals choose the "state" they are in, the independence conditions are not satisfied), the ordered probit is still chosen over the alternative distributions (e.g. the ordered logit) because it is not clear that the data are better represented by any alternate extreme value distribution and because the probit is easily modified to represent ordered rather than unordered data. It is also assumed that the errors are homoskedastic. These two very strong assumptions about the distribution of errors are necessary if the ordered probit estimates are to be used to represent preferences.

²⁷ Ordinary regression methods impose the condition that differences between levels are equal. Unordered probit methods fail to recognize that responses exist on the same continuum. Ordered probit methods not only recognize both facts, but can be used to estimate the differences between the levels so that the ordinal information provided by the dependent variable can be converted to numerical values.

Censored analysis is not necessary since response rates for both the dependent and independent variables are high and the collection of data is well stratified to represent all segments of society. Curvature need not be imposed since the function to be estimated is not the result of optimizing behaviour.

Instead of estimating the true relationship directly, ordered probit analysis estimates the probability that a response will fall into a particular category if the responses are generated by this relationship. Denote the true relationship as $S(q) = f(b^T q)$. If the $i$th observation falls into the $j$th category, assign $z_{ij}$ a value of one; otherwise, assign it a value of zero. Then the log-density for a single observation is

$$\ln P(z_{ij}) = \sum_{j=1}^{4} z_{ij} \log\left(\Phi(\alpha_j - f(b^T q)) - \Phi(\alpha_{j-1} - f(b^T q))\right), \qquad (2.56)$$

where $\Phi$ is the normal cumulative distribution function on which probit analysis is based, and $\alpha_j$ is the estimated step function measuring the difference between the $j$th and $(j-1)$th categories (although there are four categories, there are only two step functions because the density of the last category is one less the sum of the other three). The log-densities summed across all observations generate the likelihood function, which is estimated by maximum likelihood methods. The functional form of $f(b^T q)$ is the second order Generalized Leontief (equation (2.55) above).

Estimates

Estimation is done with the SHAZAM statistical package as a non-linear regression problem using numeric derivatives and the Davidson-Fletcher-Powell algorithm. While the coefficient estimates are very robust to the choice of starting values, the standard errors are sensitive to the number of iterations needed for convergence (this is typical of the algorithm employed to calculate the information matrix). Since the basic starting values employed (based on linear approximations) do not lead to convergence until after twenty or more iterations, one may assume the approximations of the covariances are reasonable.

Of greater concern is the sensitivity of the results to the inclusion of certain observations. Estimates are sensitive to the inclusion of respondent groups which are characterized by particular morbidity profiles (e.g. the elderly, who are typically in poorer health than the general population, or farmers, who are in good health or select out of the occupation).
This could indicate preference variation or, more likely, under-represented cells. For this reason, the entire data set is used for the final estimates.

The results for the entire sample are presented in Appendix D. Examination of the data suggests satisfaction with health may contain a longevity component as well as satisfaction with morbidity, and that these variations fall under three groupings: the young, the middle-aged, and the elderly.²⁸ The demarcation for the young is at thirty years, while the cut-off for the elderly can be put anywhere between 55 and 65 years (the pattern of preference variation is discernible at the lower value, but does not achieve significance until the higher value; the upper cut-off is chosen). The estimation is performed on these sub-groupings as well (see Appendix D for results).

Probit estimation yields the results presented in Appendix D. Since the satisfaction variable takes on higher values the more dissatisfied the respondent is with his or her health, and the health variables take on positive values in the case of ill-health, one would expect the first order coefficients (E, A, P, S, and L) to be positive (i.e. that ill-health would be associated with higher levels of dissatisfaction). Over the entire data set, all first order coefficients have the expected sign, and all but social ill-health are significantly different from zero (in all tests, the critical significance level is set at 5 per cent).

Among the higher order terms, one would expect positive signs if the health components are substitutes (e.g. loss of one faculty makes a person more reliant on other abilities) and negative if they are complements (e.g. if one needs a combination of abilities to perform a certain function, then the marginal disutility of losing one ability is less if the other ability is absent than if it is present).
Six of the interaction terms are positive in sign (half being significant), while four are negative (none being significant). The results suggest, among other things, that individuals who lack communication faculties (P) compensate by relying on social contacts (L) and vice versa. The set of interaction terms is significant overall (Wald test of 43.2 with ten degrees of freedom).

²⁸ Preference variation by socio-economic characteristics is found when a Hausman-Wise (1978) residual test on the original specification is performed. The most significant variables are age and occupation, followed by ethnicity, marital status, and household size. The age effect suggests the satisfaction with health variable is not perfectly congruent with satisfaction with morbidity and that age standardization is called for. The other effects, with the possible exception of ethnicity, can be explained in a simple human capital-time use model. While conclusions can be drawn more confidently if standardization is performed on these characteristics as well, this is not feasible because further subgroupings result in very poor representation in some cells (e.g. there are few young farmers in poor health) and this generates unreliable results.

Looking at the estimates in Table D.1 of Appendix D, it is clear that they vary across the three age subgroupings.²⁹ In particular, the estimates suggest substantial variation exists over the parameter estimates for short term and social ill-health, and that the elderly are, ceteris paribus, more satisfied with their health (because the constant term is smaller). Given that the estimates are robust to choice of starting values, these differences can only be explained as caused by some underlying difference in the preference ordering (e.g. conditioning on longevity) or by biased estimates resulting from poorly represented cells (e.g. the young tend to be in very good health, the elderly are often in poor health).
Coefficient patterns tend to replicate across the three sub-groups, but with lower statistical significance (probably due to the smaller sample sizes), suggesting the latter reasoning may be more correct.

²⁹ LR tests indicate the sub-equation estimates for the young and elderly differ significantly from those for the equation estimated over the whole data set. The likelihood ratio test is constructed by comparing the log likelihood for any specific sub-group of data with the parameter estimates from the whole data set imposed, against the log likelihood for that sub-group of data evaluated at the parameter values found when only the sub-group data are used in the estimation. Test values for the young and the elderly are 84.99 and 138.33 respectively (distributed Chi-squared with 18 degrees of freedom), while the middle-aged equation does not differ significantly (test value of 22.57).

Tests

The procedure is to test the parameter estimates to see if they significantly differ from the following restrictions, which are generated by the respective independence conditions associated with each aggregation rule:

i) $\partial^2 U(q,t,\kappa)/\partial \xi_i \partial \xi_j = 0 \Leftrightarrow \beta_{ij} = 0 \ \ \forall\, j \neq i$, for SDI³⁰ (10 conditions).

ii) $\partial MRS_{\xi_i,\xi_j}/\partial \xi_k = 0 \Leftrightarrow \beta_j \beta_{ik} - \beta_i \beta_{jk} = 0 \ \ \forall\, k \neq i,j$, for MPI³¹ (45 conditions).³²

iii) if $\dfrac{\partial U(q,t,\kappa)/\partial \xi_i}{\partial U(q,t,\kappa)/\partial \xi_j} > 1$, then $\dfrac{\partial}{\partial \xi_k}\left(\dfrac{\partial U(q,t,\kappa)/\partial \xi_i}{\partial U(q,t,\kappa)/\partial \xi_j}\right) > 1 - \dfrac{\partial U(q,t,\kappa)/\partial \xi_i}{\partial U(q,t,\kappa)/\partial \xi_j} \ \ \forall\, k \neq i,j$, for WDI (30 conditions).

The parameter restrictions for SDI are tested using t-tests on the coefficients on the individual interaction terms (the $\beta_{ij}$'s), and Wald statistics are used to test the joint restrictions. The other conditions (for MPI and WDI) require non-linear tests (since they involve combinations of the coefficients) and a Wald test is chosen for these cases.
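The counts of restrictions quoted for the three conditions follow from simple combinatorics over the five aggregates. The sketch below is only a check on those counts; the variable names are illustrative, and the labels E, A, P, S and L are the ones used for the aggregates in Appendix D. It enumerates the ten pairs tested for SDI, the thirty combinations of a pair with a third characteristic that arise in the MRS form of the MPI and WDI conditions, and the forty-five pairs of interaction terms compared in the pairwise form of the MPI test.

```python
from itertools import combinations

characteristics = ["E", "A", "P", "S", "L"]

# SDI: one interaction coefficient per unordered pair of characteristics.
sdi_conditions = list(combinations(characteristics, 2))

# MRS form (MPI and WDI): each pair (i, j) combined with every third characteristic k.
mrs_conditions = [
    (i, j, k)
    for (i, j) in sdi_conditions
    for k in characteristics
    if k not in (i, j)
]

# Pairwise form of the MPI test: every pair of interaction terms is compared.
mpi_pairwise_conditions = list(combinations(sdi_conditions, 2))

print(len(sdi_conditions), len(mrs_conditions), len(mpi_pairwise_conditions))
# prints: 10 30 45
```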
(The advantage of the Wald test is that it requires only unrestricted estimates (saving the non-trivial resources needed to re-estimate the equations with restrictions imposed). The disadvantage is that the test is dependent on the representation of restrictions (although the structure of the problem suggests the representation used is the most reasonable). Because the number of observations is large, asymptotic equivalence between this test and the LR and LM tests may be assumed (when the LR test is used to evaluate the SDI conditions, the resultant test statistic differs from the Wald statistic in only the second decimal place, and this does not affect the conclusions drawn). Thus, the results of the Wald tests may be accepted with confidence.)

³⁰ The estimated coefficients are not the true marginal utilities, but rather the incremental probability effects. The marginal utilities may be recovered by premultiplying these values by the appropriate probability distribution weights. However, these weights are the same within each test restriction and cancel out. Therefore the tests may be performed on the estimated coefficients without adjustment.

³¹ The point of approximation is assumed to be perfect health.

³² The MRS relationship described in the text generates 30 conditions, while the approach actually used (a pairwise test of the interaction terms to determine if the scale factors, $\lambda$, necessary for $\beta_{ij} = \lambda \beta_i \beta_j$ to hold, are statistically equivalent across all pairs of interaction terms) generates 45. The conditions are redundant in one direction: if all 30 of the MRS conditions hold, then all 45 of these hold as well; if none of the 30 hold, some of the 45 can still hold, in which case some cost saving combinations can be made. Theoretically, this case is described by U(,,6,ek) (V(Ct )6),G) and U(Ck,t,E1 )^U(v(G,6),e3 ), and U(2,e),Ek,W^U(v(Et.,6),11)(ek,6)).
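As an illustration of the kind of non-linear Wald test discussed here, the sketch below computes the statistic for a single restriction $g(\beta) = 0$ by the delta method, $W = g(\hat\beta)^2 / (G V G')$, where $G$ is the gradient of $g$ and $V$ is the estimated covariance matrix of the coefficients. Everything in it is an assumption made for illustration: the function names, the numerical gradient, and the coefficient and covariance values are invented, and it is not the SHAZAM routine behind the results in Appendix D. The example restriction is that the scale factors linking two interaction coefficients to the products of their first order coefficients are equal.

```python
import math


def wald_single(g, beta, cov, step=1e-6):
    """Wald statistic and p-value for one restriction g(beta) = 0.

    The statistic is referred to a chi-squared distribution with one degree
    of freedom. `cov` is the covariance matrix as a list of lists.
    """
    n = len(beta)
    grad = []
    for i in range(n):
        up, down = list(beta), list(beta)
        up[i] += step
        down[i] -= step
        grad.append((g(up) - g(down)) / (2.0 * step))  # central difference

    # Delta-method variance of g(beta_hat): G V G'.
    variance = sum(grad[i] * cov[i][j] * grad[j] for i in range(n) for j in range(n))
    if variance <= 0.0:
        raise ValueError("restriction has zero estimated variance at beta")

    w = g(beta) ** 2 / variance
    p_value = math.erfc(math.sqrt(w / 2.0))  # chi-squared(1) upper tail
    return w, p_value


def equal_scale_factors(b):
    """Hypothetical restriction: b_EA / (b_E * b_A) - b_EP / (b_E * b_P) = 0."""
    b_e, b_a, b_p, b_ea, b_ep = b
    return b_ea / (b_e * b_a) - b_ep / (b_e * b_p)


# Invented estimates and a diagonal covariance matrix, for illustration only.
beta_hat = [0.50, 0.40, 0.30, 0.12, 0.03]
cov_hat = [[0.01 if i == j else 0.0 for j in range(5)] for i in range(5)]
w, p = wald_single(equal_scale_factors, beta_hat, cov_hat)
print(round(w, 3), round(p, 3))
```

Writing the same hypothesis in a different algebraic form (for example as a difference of products rather than a difference of ratios) generally changes the value of the statistic in finite samples, which is the dependence on representation noted in the text.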
In addition to these individual cases, it is also interesting to know if the restrictions hold jointly, both overall and with respect to some important subgroupings (such as whether the components of chronic ill-health are independent of one another,³³ or whether the three main groupings of ill-health are separable from each other³⁴) which indicate when certain types of ill-health can be evaluated in isolation of the other categories. This requires that the independence conditions hold over all morbid characteristics within the relevant group (e.g. to test whether short term ill-health is separable from chronic ill-health, one would have to test jointly the hypotheses that short term ill-health is separable from endurance, that short term ill-health is separable from agility, and that short term ill-health is separable from perception, since endurance, agility, and perception describe chronic ill-health in this model). Such joint tests are also more reliable than the individual tests because they are based on more information.

Because parameter estimates vary according to age, independence tests are performed for the whole sample (this is the more reliable equation if the differences arise from representation problems), and for each of the three subgroups (if the differences are due to preference variation, then these equations yield the more valid results).

Additive Tests

SDI is evaluated by examining the second order coefficients (SDI is refuted if they differ significantly from zero).³⁵ There are ten conditions to evaluate.
Table D.2 of Appendix D reveals 30 per cent of these restrictions are violated for the entire sample. The t-test associated with the variable ExA (the cross term on endurance and agility) is greater than the critical value (at 2.48), suggesting that the value of (E+A) will misrepresent the true value of the state where both E and A are present (since the coefficient is positive, the predicted level of satisfaction is too high). Conversely, the t-test for the variable ExL is 0.526, which indicates the sum of the individual values for E and L does not differ significantly from the value of the state where both E and L are present.

³³ This assumption is commonly made in the construction of other health status indexes which measure the value of a health state by summing the component values.

³⁴ There is a common demarcation of health according to physical, social, and emotional well-being (see Torrance [1986] for a discussion); it is also common to treat chronic ill-health in isolation of periodic ill-health episodes, although the justification for this has never been clear.

³⁵ Actually, reference must also be made to step values between categories since these indicate what transform of the response variable obeys SDI if the higher order terms are insignificant. Assuming the higher order terms are insignificant, then, if the steps are equally spaced, AI is supported, but if not, AI holds only for some non-linear transform of the response scale.

Across categories, it is clear that the additive structures are not supported, with the chronic and short variables having the strongest result (a Wald test statistic of 21.38, see Table D.3 of Appendix D). This result also drives the significance of the more inclusive tests (all characteristics and the tri-category test). The only joint tests which are insignificant are those which separate only social ill-health from a given category. This is not surprising since the social ill-health variable itself is insignificant, the high standard errors allowing a broad range of acceptable proxy coefficients. Within categories (i.e. within chronic ill-health), the additive structure is marginally rejected.

Results suggest that simple additive structures across the categories studied here impose significant bias on the QALY values obtained, and are doubtful within the chronic disability category (although the results hold only for affine transformations of the satisfaction measure). Since the cost savings associated with the additive case can only be achieved by accepting significant bias, the other aggregation methods should be assessed to see if unbiasedness can be attained with more moderate cost savings.

Multiplicative Tests

MPI is evaluated by examining whether the scaling factors on the first order coefficients which produce the higher order coefficients differ from each other (e.g. if $\beta_{ExA} = \lambda_1 \beta_E \beta_A$ and $\beta_{ExP} = \lambda_2 \beta_E \beta_P$, one asks if $\lambda_1$ and $\lambda_2$ differ significantly from one another). There are 45 such combinations to evaluate. Table D.4 of Appendix D reveals 24 per cent of these restrictions are violated. For instance, the Wald test associated with the pairs (ExA) and (ExP) is 4.62, which is significant when compared to the appropriate chi-squared value. Thus, the use of the scaling factor derived from (ExA) on E and P will produce a biased value of (ExP). Conversely, the test value for the pair (ExA) and (ExS) is 0.001, which is clearly insignificant, so that the scaling factor derived from (ExA) may be used on E and S without biasing the value of (ExS). Note that the PxL variable (perception and social ill-health) is a focal point for rejection.

Individual test results seem to suggest that the multiplicative structure is not much better than the additive.
However, in the joint tests, multiplicative structures cannot be refuted (the joint test over all categories could not be performed because the number of restrictions exceeded the number of parameters; however, all the sub-tests, which together encompass all restrictions contained in the global test, do not reject and, except in extreme cases (see Kennedy [1985]), this result implies the global restriction will also be satisfied). As a demonstration, consider the tri-category test, which cannot be performed. The components of this test are chronic-short, chronic-social, and short-social. The first two are given in Table D.4, the last in Table D.2 (SxL; the multiplicative test is not defined over two variables, but is a weaker version of the additive test). None of the three sub-tests is significant, so it would be surprising, though not impossible, for the tri-test itself to be significant. Furthermore, PxL is a focal point of rejection in the individual tests, but since these are characterized by very high standard errors, the joint tests should not weigh these rejection points as heavily as the more definite non-refutable points. The joint tests should then be more inclined to the non-refutable cases. Since joint tests are based on more information than the individual tests, assigning more weight to the less variable restrictions, they are usually assumed to be more reliable. The multiplicative test within the chronic ill-health category could not be performed because of singularity. This singularity simply reflects the fact that the estimates nearly satisfy additive independence.
Because the conditions for additive structures are more stringent than for multiplicative structures, it can be said that mutual preference independence cannot be rejected because the conditions for strong difference independence cannot be rejected.

These results suggest that multiplicative structures may be used across the categories studied here without significantly biasing the QALY values obtained. Since the multiplicative structures are nearly exact, the significant extra costs of the less restrictive aggregation methods cannot be easily justified. Also, the sub-aggregates of physical and social health, and short and long term health that are frequently used in the literature are proper aggregate commodities (see Deaton and Muellbauer [1980]) and may be treated as distinct aspects of health.

Multi-linear Tests

WDI is evaluated by examining a different combination of coefficients than MPI. It is evaluated by Wald tests. Since the second order approximation used to estimate the coefficients imposes the multilinear structure on the satisfaction relationship, rejection of any of the restrictions is theoretically impossible. Table D.12 of Appendix D reveals that any combination which is significantly different from zero has a sign that is consistent with the restriction (unlike SDI or MPI, these are one-tailed tests).

Caveats and Implications

The test results suggest that health does indeed consist of aggregate commodities (lending partial support to the World Health Organization's conceptual framework) and that these may be used to reduce substantially the costs of valuing multidimensional health states. Results also suggest that simple additive structures are inappropriate across the categories studied here and are doubtful within the chronic disability category (although the result holds only for affine functions of this satisfaction measure).
This result suggests indexes like the ADL (activities of daily living),which sum the values of various health characteristics to obtain a value for a healthstate, are probably biased. Certainly, a multiplicative format would yield more reli-able results, yet it appears not to have been considered despite the little extra effortinvolved. 36Some cautionary notes do need to be made, however. First, these results are basedon average preferences. There is some evidence that the representative consumer isa fallacious concept in this situation. More experimental work, like that of Krischer(1976), needs to be done to see if the results hold at the individual level. Second,while these results hold over general variables over which some interpolation is possible(i.e. individual characteristics within distinct groups should still obey the patternsexhibited by their respective aggregates), there are still relationships within theseaggregates that need to be specified. Furthermore, emotional health is not included inthe analysis, so this can be considered only a partial analysis of the usual demarcationof health into physical, emotional, and social well-being. Third, the test results forWDI are vacuous since only a second order approximation is used. Higher order'In the sub-groups, the rate of violation is lower, although the pattern of violations is preserved inthe two older groups. It is assumed that this reflects sample size problems, and that the underlyingpreference relationships are the same as for the whole sample.Chapter 2. Separating Good Health Measures from Bad^ 85approximations, even if possible, are not called for since, because MPI cannot berejected, one can assert that the weaker condition of WDI also cannot be rejected.Fourth, the significance of the results may have low power in both an empirical and atheoretical sense.' The Blackorby et al. 
(1978) criticism that the test might be overly strong because it depends not only on independence, but homogeneity as well, is not of great concern for those hypotheses that are not refutable. Only SDI is rejected and, as is shown below, the QALY values generated by an additive aggregation rule are all biased regardless of the transformation (instrument) employed (the QALY instruments restrict the set of allowable transforms as well, further weakening the criticism).

[37] The grouping of variables artificially linearizes the utility function, which is then estimated in linear segments. However, the number of independent variables is large enough to make such segment-wise approximations fairly close to the true function. The tests are carried out only at the point of approximation. In theory, this point is assumed to be perfect health, although empirically this assumption is false. In either case, acceptance of the independence condition has low power since the test result only indicates that independence holds at this one point, but not necessarily at any other. Rejection of the condition is robust, however, since, if independence does not hold at some point in the function, it obviously cannot hold over the whole function. Statistically, the tests have low power (none of the individual rejected hypotheses had power exceeding 40 per cent in the interval around the calculated value). This result may be due to mis-specification (the data do not fit the normal distribution well) or to the categorical nature of the dependent variable (the imprecision of the response creates problems typical of small samples). For a majority of the individual restrictions (60 per cent), acceptance of the null is supported more than rejection (when the probabilities are set equal to each other, the calculated values lie outside the rejection region). One would expect the joint tests to have higher power since they are based on more information.

2.3.9 Synthesis

The purpose of this section is to synthesize the results obtained so far by integrating the bias functions with each other and with costs. While the Modelling section of this chapter describes how bias is generated by the various production methods, it cannot indicate how large the differences generated are. This section also deals with the fourth caveat above by demonstrating that the amount of bias caused by additive structures does not vary significantly across the transformations associated with the different QALY instruments.

A parameterization of the bias function is generated using simulated QALY values for the fifteen health states in the model specified in the Empirical section (see Appendix D). The QALY values are constructed according to the specifications given in the Modelling section. The expected satisfaction estimates from the Empirical section of this paper are transformed so that the values generated are positively related to satisfaction. The transformation used is

$$v(q) = \frac{\hat{S}(q) - \hat{S}(q^{\circ})}{\hat{S}(q^{*}) - \hat{S}(q^{\circ})}, \tag{2.57}$$

where $\hat{S}$ is the predicted satisfaction response, $v(q)$ is the transformed predicted response, and $q^{*}$ and $q^{\circ}$ are the perfect health (no ill health in any category) and worst health (ill health in every component) states respectively. Assuming $T = 20$ and $v(q) = A(q, K)$, the utility function over certain outcomes is:

$$U(q, T) = \int_{t=0}^{20} \exp\{-rt\}\, v(q)\, dt. \tag{2.58}$$

This expression is substituted into the instrument functions to generate $\omega_j(q)$, or the holistic QALY values, using the arbitrarily set parameter values described in the Empirical section under instrument generated bias (r=.05, ¢(p) = v^ 75 +1.75(1 -p)• 75.75gq,^= p(q, j ), and c = .9) and common reference points of perfect health ($q^{*}$) and worst health ($q^{\circ}$). 
[38] These are given in Table D.14 of Appendix D.

The QALY values associated with single morbid characteristics are then used to generate values under the different aggregation regimes (the trade-off values may be found from the values for single aspect states because of parameter restrictions).[39] The multilinear structure is identical to the holistic structure in this specification. The calculated values are given in Tables D.15 through D.19 of Appendix D.

[38] This level is implicitly assumed to be a death equivalent (i.e. $v(q^{\circ}) = 0$). If it is not, then the simulated QALY values will not be mapped to the usual interval and will not be comparable to QALY values reported in the literature. The results in this paper will be internally consistent regardless.

The results should be accepted cautiously since the significance of the differences is not accounted for (significance cannot be incorporated since the instrument values are not generated empirically and therefore are not associated with any margin of uncertainty).[40] The results show that the ordering of states is preserved across all instruments, but not across the different aggregation methods ((ExP,ExL) and (AxP,AxL,PxL,SxL) are reversed in the CS and PE estimates, (ExP,ExL,AxS) and (AxP,AxL,SxL) are reversed in the SG estimates, and (ExP,ExL,PxS) and (AxP,AxL,PxL,SxL) are reversed in the ES and TTO estimates). This result stems from the fact that the utility function imposes separability of morbid characteristics from non-morbid characteristics (so there exists an appropriate transform, $\phi^{-1}$, that generates equivalence). When equivalence is required, the appropriate measure of bias is the sum of the absolute deviations of the estimated values from the values of the "gold standard" or true production method over all health states defined for every instrument.[41] 
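The two comparisons described above (whether a reconstruction preserves the ordering of states, and how far its values deviate from the holistic values) can be sketched in a few lines. This is only an illustration: the state labels are borrowed from the text, but the numerical values and the function names `total_bias` and `ranking` are hypothetical and are not taken from the tables in Appendix D.

```python
def total_bias(estimated, gold):
    """Sum of absolute deviations from the gold-standard values over all states."""
    return sum(abs(estimated[state] - gold[state]) for state in gold)


def ranking(values):
    """Health states sorted from highest to lowest value."""
    return sorted(values, key=values.get, reverse=True)


# Hypothetical holistic ("gold standard") and reconstructed values for three states.
holistic = {"ExP": 0.62, "AxP": 0.58, "PxL": 0.40}
multiplicative = {"ExP": 0.61, "AxP": 0.57, "PxL": 0.41}
additive = {"ExP": 0.50, "AxP": 0.55, "PxL": 0.30}

for name, values in [("multiplicative", multiplicative), ("additive", additive)]:
    preserved = ranking(values) == ranking(holistic)
    print(name, "bias:", round(total_bias(values, holistic), 3),
          "ordering preserved:", preserved)
```

With these made-up numbers the multiplicative reconstruction keeps the holistic ordering and has a small total deviation, while the additive reconstruction reverses two states and has a larger one. The actual magnitudes depend entirely on the simulated values reported in the appendix.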
Results suggest that substitution across aggregation methods (read down the columns in Tables D.20 through D.24 of Appendix D) typically involves less bias than substitution across instruments (as read across the rows in the same tables). Note that the bias effects from instrument substitution and aggregation technique may offset each other. These results are due to a combination of the estimated preference structure between morbid characteristics and the imposed curvature on the non-morbid characteristics.

[39] The trade-off value for the multiplicative structure is found by solving the polynomial restriction associated with such structures (see the Modelling section). Only one non-zero real root within the range allowed exists for each of the instruments. Except for the SG, the values are in line with those found by Boyle et al. (1983).

[40] The biases resulting from multiplicative aggregation are insignificant, while most of those from the additive reconstruction are probably significantly different from zero.

[41] Recall that the choice of "gold standard" depends on how the QALY is used to evaluate projects and that different QALY functions are appropriate in different contexts. Thus, differences are taken with respect to each instrument's holistic values so that the bias of using any one production method in any particular context can be assessed.

Additive structures do generate less bias for those instruments which are non-linear transforms of the satisfaction response, but in no case does this bias fall by more than ten per cent. This suggests that the dependence of the test for SDI on homogeneity does not decrease its validity for the QALY applications considered here.

The choice of how much bias to accept should not be made without consideration of cost savings. 
Since very little bias arises from the use of multiplicative aggregation structures and the cost savings are substantial (in this case, $c_j(q) \cdot 10$, where $c_j(q)$ is the unit cost of evaluating any health state with instrument $j$), it is hard to justify the holistic approach. If only orderings of morbid states are required, then the cheapest instrument should also be adopted (resulting in savings of up to $(\max_j\{c_j\} - \min_j\{c_j\}) \cdot 5$ and no additional bias). If equivalence is required, as is more likely to be the case, the bias-cost trade-offs have to be considered. Switching instruments saves $(c_{\max} - c_{\min}) \cdot 5$, but results in additional bias as read across rows in the simulation table; adopting an additive aggregation structure over a multiplicative structure saves $c_j \cdot 1$, but results in additional bias as read down the columns in the same table. In general, the more health characteristics that are valued, the greater the absolute cost savings of switching from multi-linear to multiplicative aggregation structures, and the greater the relative cost savings of switching instruments rather than switching aggregation structures from multiplicative to additive. The decision about whether and how much bias to accept ultimately depends on the expected costs from making the wrong project choice (this information is summarized in the Lagrange multiplier) and how this measure of bias increases the likelihood of making such a choice (the relationship is linear only in large samples).

One of the problems with QALY analysis is obtaining the QALY values themselves. This process is costly, so much so that QALY-based evaluations can seldom be justified. If the policy maker is willing to accept reasonable approximations of these values (which might be sensible given the high standard errors associated with empirical QALY values), cost saving strategies are available.

This chapter assesses these strategies. 
It improves on earlier work by (1) fully specifying the strategy space, (2) identifying the loss functions in terms consistent with the goals of QALY analysis, (3) testing for the least well known separability conditions over broadly defined morbid states, and (4) partially quantifying the trade-offs involved in the different strategies.

Conclusions include that preference structures support the use of multiplicative, but not additive, aggregation techniques (a by-product of this result is the justification of commonly held but unsubstantiated conceptual frameworks of health based on grouped characteristics). The result is of significant practical importance since it suggests substantial cost savings in QALY construction can be achieved without generating distortions in the QALY index, and that other health status indexes based on additive structures can be substantially improved at slight additional cost by adopting a multiplicative aggregation rule. This result suggests the most commonly invoked cost-saving strategy (instrument substitution) is sub-optimal. Unlike all previous work in this area, which focuses on and applies to only specific health states, these results hold over broadly defined and nearly comprehensive health categories. Results may therefore be generalized to specific characteristics which lie within these categories.

Additional cost savings can only be achieved with some amount of bias. The researcher's decision at this stage will depend on how inaccuracies in the QALY index distort project selections and how averse he or she is to incorrect project choices as opposed to financial losses. Simulations indicate substitution of instruments distorts project orderings less than the adoption of additive structures if the QALY index need only order morbid states, but that the situation is reversed if the QALY must measure values (which case applies depends on how the QALY is used to evaluate projects). 
The appropriate choice cannot be determined without reference to the researcher's cost situation and preferences.

2.4 Conclusion

One of the major challenges in health policy is the measurement of health status outcomes. QALYs provide measures which are free of the distortions and ethical biases inherent in health care markets, but they are not freely observable and must be obtained in experimental settings. QALYs assign appropriate values to health states (values that lead to policy decisions that maximize well-being) if the survey question used to derive the value is couched in the appropriate context for the type of policy being considered. Unfortunately, this increases the dimensionality of the problem facing the survey respondent and can dramatically increase the costs of obtaining QALY values, particularly if different types of policies are to be compared (e.g. those that affect length of life versus those that affect large groups of people). The second part of this chapter addressed the feasibility aspects of obtaining QALY values and verified a reconstruction method that can significantly reduce the number of health states that must be valued directly by survey respondents, but maintains the validity of the resultant QALY values.

This chapter suggests two avenues for further investigation. One is to carefully reassess the sorts of policy decisions which can be assisted with QALY-type data, and the information content of these QALY values necessary for them to be valid in these applications. It may be that some types of policy analysis require more information content in the QALY measures than is feasible. The second avenue of research is to determine if reconstruction methods are possible over these extra dimensions of policy context (i.e. beyond morbidity). 
If this is the case, it should be feasible to use QALY data in a wide variety of health policy analyses.

Chapter 3

The Welfare Properties of Three Health Status Statistics

3.1 Introduction

A health status index is an index number which measures both the quantity (how long one lives) and quality (how well one is during this time) of health. It acts as an aggregator function of the various dimensions of health. This has two important implications. First, the various aspects of health must be valued relative to one another. Second, the index allows disparate health states to be compared.

Such statistics can help decision makers in three types of allocation decisions. First, one can determine which treatment path yields the best possible health outcome for a given patient. For instance, a patient with kidney disease may be treated with dialysis or organ transplantation. The health outcomes of these treatments are different (dialysis is very time consuming and leaves the patient lethargic; transplantation is followed by a variety of complications due to surgery and immunosuppressants and has a higher risk of immediate death). A health index can be used to compare these different outcomes and determine whether transplantation or dialysis yields "more" health.

Second, one may use a health status statistic to determine which patient should receive a given treatment (i.e. who benefits more from this treatment). This sort of decision confronts policy makers whenever treatments must be rationed. An example would be heart transplants, which are limited by the availability of donor hearts. When one becomes available, a recipient must be selected from the waiting list. 
If policy makers wish to allocate resources to get the greatest possible level of health, then the patient with the greatest improvement in health, as measured by the health status index, should be chosen.

Third, one can use health status statistics to determine which programs generate the most health benefits for society and should be implemented (i.e. compare different treatments affecting different people). This involves comparisons of the health of groups of people. An example would be the decision to implement either a pancreatic transplant or a lung transplant program in a given region. The former improves the health of persons suffering from severe diabetes and the latter improves the health of persons suffering from severe respiratory disease. Assuming the costs of either program are equal, the decision will depend on how many people can be transplanted in each program with the given resources, the health improvements of the people treated, and how these improvements are measured.

Obviously, how a health status index measures the various aspects of a health state critically affects the types of decisions made above. The goal of this chapter is to assess three prominent health status indexes, human capital, willingness-to-pay, and QALYs (quality adjusted life years), to determine if decisions made on the basis of these measurements are consistent with individual and social welfare maximization. If the valuations inherent in these indexes are inconsistent with such objectives, policy makers may make decisions that leave society with a sub-optimal level of health.

3.1.1 Review of Health Status Measures

Three health status measures have been used by economists in health policy assessments: human capital (HK), willingness-to-pay (WTP), and the quality adjusted life year (QALY). These are described below.

Human Capital Measures

Human capital measures equate the value of a health state to the income that can be earned in that health state (usually expressed in present value terms).[1] For instance, if a person after kidney transplantation is only willing or able to work twenty hours a week at a non-strenuous job paying ten dollars per hour, the value of a year of life in this health state would be about ten thousand dollars. If the person lives five years in this state and the rate of discount is ten per cent a year, the value of this life would be about forty thousand dollars. The human capital measure of societal health is the level of G.N.P. (gross national product) generated under a given community profile of health status (i.e. the sum of all the individual human capital values).

The index is based on principles of national income accounting and measures the potential contribution to G.N.P. of an improvement in health status. The principal advantage of this statistic is that it is based on readily available income and workforce data. Its disadvantages are twofold. First, the statistic is inconsistent with Pareto optimality because the measure of marginal benefit ignores all non-financial aspects of well-being[2][3] (if transplantation made you feel better, but you could not find work because of discrimination by employers, the value of the health state with transplantation would be zero). Second, entitlement is based on income generating potential so that poor persons, particularly those not in the "formal" labour market and the retired, are discriminated against. The corporate executive is more likely to get a heart transplant than the housewife.

[1] Mishan (1976) describes both the traditional measure above and variations with either consumption or savings deducted (reflecting the position that only benefits to others and only benefits to the individual should matter respectively).

[2] This assumes the measure of marginal benefit is undistorted by imperfections in the labour market.

[3] Brent (1991) has devised a human capital model with time, valued at the wage rate, as the metric rather than income. This essentially equates the value of life with the value of full income, so the valuation problem remains. The conditions under which marginal income and marginal utility are synonymous were not identified.

Willingness-to-Pay Measures

Willingness-to-pay (WTP) measures health status changes in a standard revealed preference framework. The value of a health status change is the maximum amount of income the individual is willing to give up in order to secure the change. Assuming perfectly competitive markets for kidney transplants, fully informed patients and no insurance that distorts marginal prices, one could (hypothetically) find the value of a kidney transplant by observing the market price at which such transplants were traded. The usual aggregate statistic is the sum of the individual willingnesses-to-pay.

This index is based on principles of standard cost-benefit analysis. Because of various market imperfections, accurate willingness-to-pay values are difficult to achieve.[4] It is obvious that willingness-to-pay is determined in part by ability to pay so that the wealthy are favoured over poor people in treatment, regardless of the source of their income (see Torrance [1986], Brent [1991]). There are serious consistency problems with the aggregate statistic, as has been shown by Boadway (1974) and Blackorby and Donaldson (1990). 
For instance, it is possible for the aggregate statistic to indicate that societal health is higher when a dialysis program replaces a kidney transplant program, and that societal health is higher when a transplant program replaces dialysis.

[4] WTP measures may be obtained through empirical observation and market data or by hypothetical survey questionnaires. Both approaches can be problematic since health care markets are badly distorted and do not span all relevant health characteristics, and surveys typically suffer from low response rates. These difficulties contributed to the search for alternative measures.

QALY Based Measures

QALYs were developed in response to concerns that health care resources should not be deployed by ability to pay[5] (a fact reflected in numerous health care institutions), but on a basis still consistent with individual preferences. The QALY value is derived from preferences using a QALY instrument (see Chapter 2), a device by which the strength of preference between two (hypothetical) health states (described in a questionnaire) can be measured. The healthy year equivalent (HYE) is a special type of QALY where the duration of a health state in addition to morbidity is explicitly part of the health state description. In the kidney transplant example above, the individual would be given a description of the morbidity associated with kidney transplants and told it would last, say, ten years. The individual would then be asked how many years in perfect health would leave him or her as well off as living for ten years with a transplant. This answer is the HYE value. The aggregate statistic is the arithmetic mean of the individual statistics (with a fixed population, this is equivalent to the sum of the individual statistics).

QALY values are more reliable than market based values because they are not influenced by market imperfections. They are not freely observable and can be expensive to obtain. 
Almost all attention has focused on the methods used to derive such a health status index (i.e. which instrument to use) and whether it is a utility number or not (see Chapter 2 and Butler [1990]). Hilden (1985) has questioned the interpersonal properties of QALYs because variations in the value of health relative to other goods are suppressed (Ware and Young [1979] provide evidence that such variations in the value of health exist). Butler (1990) has claimed QALYs are proportional to willingness-to-pay measures, but, like Hilden, does not demonstrate the conditions under which this is so. While the arithmetic mean has been the only aggregate statistic used to date, "the assumptions which underlie any method of aggregation, and the policy implications of alternative methods, need to be more fully explored" (Loomes and McKenzie: 1989, p. 305).

[5] Torrance et al. [1972] suggest that "economic" well-being should be held constant in the QALY survey, claiming this purges the measure of any ability to pay considerations.

Almost twenty years ago Torrance argued "the theory [of QALYs] should be extended to include multiple utility maximizing individuals, interpersonal comparisons of utility, and group decision-making" (1976a, p. 369). This review demonstrates the limited progress made since then. Furthermore, the performance of the QALY index has never been compared rigorously to the alternative health status indexes used by economists in the past. It is this assessment which is now undertaken.

3.1.2 Evaluation Criteria

The purpose of this chapter is to evaluate the performance of these indexes as measures of health status, not the decision statistics which sometimes incorporate them with other non-health information (such as the cost-utility ratio). 
Three principles form the basis of evaluation.

i) the individual is the best judge of his or her well-being

For the purpose of this paper, it is assumed well-being is synonymous with utility. Then this condition requires that the health status index assigns higher values to states more preferred by the individual, i.e. the index is an exact representation of the preference ordering. This position is adopted to prevent the policy maker from imposing treatments that the patient would not want. For instance, in the kidney example above, a health index may assign higher values to transplantation than dialysis. If the patient actually preferred dialysis, the policy maker who based his or her decision on the health index information would make the wrong choice.

While this position is gaining acceptance in the medical field, several controversial implications need to be identified. First, more paternalistic positions must be abandoned (one cannot say, as did Avorn (1984) and Harris (1987), that prevention of death should be the paramount concern unless the patient agrees that all quality of life should be sacrificed to extend life). Second, the extrinsic value of an individual's health (i.e. the value to others of his or her health) is deemed irrelevant. Finally, because the health status index measures health outcomes, not the processes by which they are achieved, it must be assumed that preferences are also defined over outcomes. 
People may think lotteries are a preferred selection mechanism for transplants over some medically determined criteria, but this is irrelevant since only the well-being associated with the health outcomes of transplantation counts, not how they are achieved.

ii) all individuals are entitled to equal access to health benefits

Equal access to health care is a controversial principle, although it has often been promulgated by policy makers and occasionally manifests itself in health care institutions (e.g. social insurance). The reader should be mindful that other ethical positions exist (e.g. entitlement by "social worth" was once used to determine access to dialysis).

Equal entitlement requires that the value of a health status change (as measured by the health status index) be equal for all persons with the same preference ordering over health. If two people would rather have treatment than not, the policy maker should not be able to discriminate against one or the other in treatment allocation. For instance, suppose two people both prefer kidney transplantation to dialysis. One person is elderly and financially well-endowed, while the other is young and poor. Because both individuals prefer the same alternative, each should have an equal opportunity to receive a transplant. Given that the policy maker will allocate the donor kidney according to which patient achieves the greatest benefit from the transplant over dialysis, equal entitlement requires that the health status improvement, as measured by the health status index, be the same for both individuals.

iii) policy decisions should be guided by distributionally sensitive welfarist social preferences defined over the health states of the individuals in the community

Assume that social well-being may be represented by some set of preferences defined over community health profiles and this may in turn be represented by a social welfare function. 
Then this third principle implies social preferences must be defined over individual preferences (i.e. the social welfare function is a welfarist Bergson-Samuelson social welfare function), and that the aggregation of these individual preferences must reflect the degree of inequality aversion in the distribution of health outcomes across members of society held by the community (or a benevolent social planner). Welfarism is necessary so that the first and third principles do not contradict one another. Ethical flexibility in the distribution of health outcomes does not mean the second and third principles are inconsistent because the second is defined over identical health states and the third is defined over health states that differ.

3.2 Theoretical Framework for Assessment

In this section, each decision statistic is derived within a utility maximizing framework. This establishes the relationship of each decision statistic to individual well-being and allows the comparison of the welfare properties of each.

3.2.1 Individual's Optimization Problem

The individual seeks to maximize his or her well-being, which may be represented by an objective function defined over personal consumption and health status. Health status affects utility directly (via general wellness) and indirectly through productivity (e.g. wages) and by constraining the amount of time that may be spent in any particular activity. Utility is assumed to accumulate over the lifetime.

Let $U : \mathbb{R}^N \rightarrow \mathbb{R}^1$ describe preferences over health (represented by morbidity, $q$, and a length of life, $T$), and other goods (described by the vector $x$). The individual maximizes utility by choosing the optimal allocation of time to various activities and income to various purchased goods subject to financial, institutional, and time use constraints. The individual's choices are constrained in three ways. 
It is assumed that the individual may allocate time to three activities: active leisure ($t_1$), passive leisure ($t_2$), and paid labour ($L$). Active leisure and labour time may be constrained by poor health (i.e. $t_1 \le \tau_1(q)$, $L \le \tau_L(q)$), but passive leisure is not, by definition, ever constrained. Time spent in all activities, including the labour market ($L$), must not exceed total time ($T$). Finally, the value of all purchases of $x$, valued at prices, $p$, must not exceed income earned (total time, $T$, less time spent at leisure activities, valued at the wage rate, $w(q)$) plus endowed income, $I$. Note that the wage rate is assumed to depend on health status. Health can then affect utility in three ways: through earning capacity, through time constraints, and directly.

Then the individual's problem is to find $x$, $t_1$, and $t_2$ to solve:

$$\max_{x, t_1, t_2} \Big\{ U(x, t_1, t_2, q) : \text{(i) } \sum_i p_i x_i \le w(q)L + I; \ \text{(ii) } \sum_{j=1}^{2} t_j + L \le T; \ \text{(iii) } t_1 \le \tau_1(q); \ \text{(iv) } L \le \tau_L(q) \Big\}. \tag{3.1}$$

(This problem can be restated in a dynamic context, with all variables time-scripted so that their evolution can be tracked:

$$\max_{x_1, \ldots, x_T,\ t_{1,1}, \ldots, t_{1,T},\ t_{2,1}, \ldots, t_{2,T}} \Big\{ U(x_1, \ldots, x_T, t_{1,1}, \ldots, t_{1,T}, t_{2,1}, \ldots, t_{2,T}) : \sum_{k=1}^{T} \sum_i p_{ik} x_{ik} \le \sum_{k=1}^{T} w_k(q_k) L_k + I; \ \sum_j t_{jk} + L_k \le 1 \ \forall k = 1, \ldots, T; \ L_k \le \tau_{L,k}(q_k); \ t_{1k} \le \tau_{1k}(q_k) \ \forall k = 1, \ldots, T \Big\},$$

where $k$ denotes the time period, $i$ the purchased commodity, and $j$ the activity. Such a representation provides greater detail, although most of it is superfluous for the purposes here. The only critical difference comes in the calculation of the HYE function. This is discussed in the section where this statistic is derived.)

After solving this problem, optimal choices may be substituted into the utility function, $U$, to obtain an indirect utility function defined over given parameters. It is sometimes easier to derive the health statistics from a dual representation, but problems arise because of the presence of multiple constraints. 
The expenditure functions are not well-defined in these cases. There are two ways to remedy this situation.

The first is to view the indirect utility function as made up of a set of constrained indirect utility functions, one for each possible combination of binding constraints. There are four possible cases: neither labour nor leisure is constrained, active leisure is constrained, labour is constrained, and both labour and active leisure are constrained. If $u$ is the realized value of utility at $(x^*, t^*, q)$ (i.e. $u = U(x^*, t_1^*, t_2^*)$), then

if time allocations are unconstrained:

$$u = V(p, w(q), w(q)T + I, q) = \max\{U(x, t_1, t_2) : \sum_i p_i x_i \le w(q)(T - t_1 - t_2) + I\};$$

if $L = \tau_L(q)$ (i.e. labour is constrained):

$$u = \bar{V}(p, w(q), w(q)\tau_L(q) + I, T - \tau_L(q), q) = \max\{U(x, t_1, t_2) : \sum_i p_i x_i \le w(q)\tau_L(q) + I; \ t_1 + t_2 = T - \tau_L(q)\};$$

if $t_1 = \tau_1(q)$ (i.e. active leisure is constrained):

$$u = \hat{V}(p, w(q), w(q)(T - \tau_1(q)) + I, \tau_1(q), q) = \max\{U(x, t_1, t_2) : \sum_i p_i x_i \le w(q)(T - \tau_1(q) - t_2) + I; \ t_1 = \tau_1(q)\};$$

if $L = \tau_L(q)$ and $t_1 = \tau_1(q)$ (i.e. both active leisure and labour are constrained):

$$u = \tilde{V}(p, w(q), w(q)\tau_L(q) + I, T - \tau_L(q) - \tau_1(q), \tau_1(q), q) = \max\{U(x, t_1, t_2) : \sum_i p_i x_i \le w(q)\tau_L(q) + I; \ t_1 = \tau_1(q); \ t_2 = T - \tau_L(q) - \tau_1(q)\}. \tag{3.2}$$

For each constrained indirect utility function there exists one well-defined constraint (the time use constraints which bind in each case may be substituted into the income constraint to achieve a single binding constraint). Then for each constrained indirect utility function there exists a well-defined expenditure or cost function which is dual to it, and this may be recovered by inverting the constrained indirect utility function. This yields a set of expenditure functions:

if time use is not constrained:

$$w(q)T + I = E(u, p, w(q), q) = \min\{\sum_i p_i x_i + w(q)(t_1 + t_2) : U(x, t_1, t_2) \ge u; \ t_1 + t_2 + L = T\};$$

if $L = \tau_L(q)$:

$$w(q)\tau_L(q) + I = \bar{E}(u, p, w(q), T - \tau_L(q), q) = \min\{\sum_i p_i x_i + w(q)(t_1 + t_2) : U(x, t_1, t_2) \ge u; \ t_1 + t_2 = T - \tau_L(q)\};$$

if $t_1 = \tau_1(q)$:

$$w(q)(T - \tau_1(q)) + I = \hat{E}(u, p, w(q), \tau_1(q), q) = \min\{\sum_i p_i x_i + w(q)(t_1 + t_2) : U(x, t_1, t_2) \ge u; \ t_1 = \tau_1(q); \ t_1 + t_2 + L = T\};$$

if $L = \tau_L(q)$ and $t_1 = \tau_1(q)$:

$$w(q)\tau_L(q) + I = \tilde{E}(u, p, w(q), T - \tau_L(q) - \tau_1(q), \tau_1(q), q) = \min\{\sum_i p_i x_i + w(q)(t_1 + t_2) : U(x, t_1, t_2) \ge u; \ t_1 = \tau_1(q); \ t_1 + t_2 + \tau_L(q) = T\}. \tag{3.3}$$

While these functions are well-defined, they are restrictive to work with in cases where the set of binding constraints changes.

The second method is to endogenize prices so that the time use constraints never bind and the individual in effect faces one "composite" budget constraint. In this case the individual's problem becomes:

$$\max_{x, t_1, t_2} \{U(x, t_1, t_2, q) : \sum_i p_i x_i + w_1 t_1 + w_2 t_2 \le w(q)L + I + w_1 t_1 + w_2 t_2\}, \tag{3.4}$$

where $w_1$ and $w_2$ are shadow prices on the two leisure activities (see Sen [1972] for a discussion of shadow pricing constrained goods). If labour is constrained at market prices, the individual must be induced to consume more leisure (until the labour constraint is no longer binding), which is accomplished by reducing the (shadow) price of both leisure activities below the market wage ($w_1 = w_2 < w(q)$). If active leisure is constrained, the individual must be induced to consume less active leisure (until the constraint is no longer binding), which is accomplished by raising the (shadow) price of active leisure above the market wage rate ($w_1 > w(q) = w_2$). Finally, if both active leisure and labour are constrained, the individual must be induced to consume more passive leisure and less active leisure by reducing the shadow price of passive leisure and increasing the shadow price of active leisure ($w_2 < w(q) < w_1$). 
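The four constraint regimes and their shadow price orderings can be illustrated numerically. The sketch below is not the thesis's method: it assumes a single purchased good, a Cobb-Douglas utility function with arbitrary exponents, and a brute-force grid search, and the names `solve_individual` and `regime` are hypothetical. The ordering reported for the unconstrained case (both shadow prices equal to the market wage) is an inference rather than something stated in the text.

```python
from itertools import product


def solve_individual(T, tau1, tauL, w, I, p=1.0, a=0.4, b=0.3, c=0.3, steps=200):
    """Grid search over active leisure t1 and labour L for a one-good version of (3.1).

    Assumed utility (illustrative only): U = x**a * t1**b * t2**c.
    Returns (t1, L, t2, x) at the best grid point.
    """
    best_u, best = -1.0, None
    for i, j in product(range(1, steps + 1), repeat=2):
        t1 = tau1 * i / steps      # active leisure, never above its cap tau1(q)
        L = tauL * j / steps       # labour, never above its cap tauL(q)
        t2 = T - t1 - L            # passive leisure takes the remaining time
        if t2 <= 0:
            continue
        x = (w * L + I) / p        # all income is spent on the single good
        u = x**a * t1**b * t2**c
        if u > best_u:
            best_u, best = u, (t1, L, t2, x)
    return best


def regime(t1, L, tau1, tauL, tol=1e-9):
    """Name the binding constraints and the shadow price ordering described in the text."""
    leisure_binds = abs(t1 - tau1) < tol
    labour_binds = abs(L - tauL) < tol
    if labour_binds and leisure_binds:
        return "both constrained", "w2 < w(q) < w1"
    if labour_binds:
        return "labour constrained", "w1 = w2 < w(q)"
    if leisure_binds:
        return "active leisure constrained", "w1 > w(q) = w2"
    return "unconstrained", "w1 = w2 = w(q)"  # inferred, not stated in the text


# Hypothetical health states expressed only through their time caps (tau1, tauL).
for tau1, tauL in [(100, 100), (100, 20), (10, 100), (10, 20)]:
    t1, L, t2, x = solve_individual(T=100, tau1=tau1, tauL=tauL, w=10, I=0)
    print((tau1, tauL), regime(t1, L, tau1, tauL))
```

A grid search is used only to keep the example free of external dependencies. A cap is treated as binding when the best grid point sits on it, which is adequate for this well-behaved utility function but is not a general test.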
Obviously, theshadow prices depend on the market wages and prices, the levels of the constraintsand endowments, and preferences.Solving the individual's problem and substituting the optimal allocations into theutility function, it is possible to obtain the global indirect utility function (the solutionto equation (3.1)):u = V(p, w(q), w(q)C(T, q,^T (4)) + I, H (T, q, 7 1 (4 TL (q)), q^V (p, q, T, I )(3.5)Chapter 3. The Welfare Properties of Three Health Status Statistics^104where G(T, q, T i (q), TL (q)) and H(T, q,7 1 (q),7L(q)) are functions of the shadow prices,endowments and constraints that indicate the full income and the leisure allocationspossible.3.2.2 Derivation of the Health Status IndexesA health care project is described by its effect on health status (which changes from(qB, TB) to (qA , TA), where A denotes "after" and B "before" the project). Associatedwith these two states are utility levels uA and uB. It is assumed costs of the project arecovered by a third party payer and do not, directly or indirectly, affect the individual'ssituation.3.2.3 Human Capital Measures (HK)Recall that the human capital measure equates the value of a health state with theincome that can be earned in that health state. This requires that the amount oflabour effort in a particular health state supplied by the individual be identified, aswell as the return to this labour effort (i.e. the wage rate). To do this, the set ofconstrained expenditure functions is employed, and each case is assessed separately.Unconstrained CaseTo find the labour supply in this case, apply Shepard's lemma to the unconstrainedexpenditure function:L'(u,p,w(q),q) = DE (u,p,w(q), eq) aw(q)(3.6)A health status improvement is measured by the change in earned income (AHK)arising from the health status change. Given the labour supply function above, andChapter 3. 
The Welfare Properties of Three Health Status Statistics^105assuming labour and active leisure use are unconstrained in both the before and afterstates, 6 this may be expressed asaE ( uAA) ,qA) w(^aE (uB ,p,w (e) , e)AHK w(e'^ q'd )) awo,A)^ aw(e)^• (3.7)Labour Constrained CaseIf labour choice is constrained, then labour can only be provided to the level of theconstraint andHK w(q)TL(q),^(3.8)such that the health status change is measured aszHK w(qA )71(9,A ) _ w(gB)71(e).^(3.9)Leisure Constrained CaseThe individual's labour choice incorporates any binding leisure restrictions. This maybe recovered using Shepard's lemma on the leisure constrained expenditure function:= 8t(u,p,w(q),1-1(q),q)aw(q)Thus, the value of a health status change may be expressed asat(uA p w(qA), Ti (qA) , qA)ZIHK = w(qA^" (qA )w (qB E (uB p w (q1 ), n(qB ) qB)•aw (qB )(3.10)(3.11)6This assumption is made for convenience. The human capital function is still well defined ifthe after state is characterized by one set of constraints, and the before state by a different set.One would have to substitute the appropriate labour supply function according to the constraintsbinding in that case.Chapter 3. The Welfare Properties of Three Health Status Statistics^106Labour and Leisure are ConstrainedIf both constraints are binding, the amount of labour supplied equals the labourconstraint and the labour constrained case holds.3.2.4 Willingness-to-PayThe willingness-to-pay (cv) is the maximum amount of income the individual is willingto give up in order to have the change in health status (the compensating variation,rather than the equivalent variation, is adopted since health care projects typicallyinvolve moving to better states that people are willing to pay for - the "do no harm"ethic ensures few converse situations arise). 
This is the amount of income that can be deducted in the after state and leave the individual as well off as in the before state. Then, recalling equation (3.5), cv may be defined implicitly:

$$\tilde V(p, q^A, T^A, I - cv) = \tilde V(p, q^B, T^B, I) \equiv u^B. \tag{3.12}$$

3.2.5 QALY/HYE Measures

In this analysis, the HYE (healthy year equivalent) is employed rather than a QALY based on an arbitrary time frame. This is done so that subsequent welfare results are not conditioned on measurement error. It is also assumed that the individual accounts for the effects on his or her income that result from a change in health status. This contradicts the position taken by Torrance et al. (1972) that the QALY be purged of all income effects (they suggest using an income replacement scheme so that there are no financial effects to the respondent - or his or her family - from the health status change). The position adopted by Torrance et al. implicitly assumes all financial effects are captured in the cost assessment, rather than in the health outcome assessment (e.g. increases in earnings would be counted as cost savings). In reality, the cost assessment only incorporates costs incurred by the funding agency (e.g. the Ministry of Health). This is reasonable when one considers the funding agency must choose programs to maximize health subject to its own budget constraint. If the cost assessment included costs for which the agency was not responsible or cost savings which it could not recoup (such as income effects of the patients), the choices made would either not exhaust or exceed the budget constraint.

The HYE is similar to the willingness-to-pay measure since both are based on trade-offs that achieve welfare equivalence.
However, the numeraire for HYEs is time, not income, and the reference state for HYEs is fixed at perfect health and not some arbitrary "after" state.

In the HYE assessment, the individual gives up time in perfect health until he or she is indifferent between that state and the one described, i.e.

$$\tilde V(p, \hat q, (T - m), I) = \tilde V(p, q, T, I) \equiv u \tag{3.13}$$

(with $\hat q$ denoting perfect health). The HYE value for a project is the difference between the HYEs for the before and after states

$$\Delta HYE = (T^A - m(q^A)) - (T^B - m(q^B)). \tag{3.14}$$

Note that in the unconstrained case the change in healthy year equivalent and the willingness-to-pay measures are proportionately related by the market wage rate.

(In the dynamic context discussed before, time foregone must be valued at an average wage rate, i.e. in the unconstrained case: $\bar w\,(T - m) = \sum_t w_t(\hat q_t)$. Then $T - m = \frac{1}{\bar w}\bigl(E(u, p_t, w_t(\hat q_t), \hat q_t) - I\bigr)$. If the wage schedule at perfect health varies over time, then $\bar w$ varies as the time frame varies. Different health states are then valued by different functions of utility, causing potential problems with exactness. However, if the wage schedule is relatively continuous and health states are not too disparate, a reasonable approximation is achieved.)

3.3 Choosing Treatments for a Given Individual

This section examines the welfare properties of the three health status indexes when they are used to determine the "best" treatment for an individual. Recall that the efficiency criteria dictate that the appropriate choice is the one which assigns the individual to his or her most preferred outcome state. This requires that the indicator be consistent (exact) with the individual's preference ordering over outcomes.

A health status index is said to be exact if it assigns higher values to health states that are more preferred by the individual in question. A health state is completely described by two characteristics: morbidity ($q$), and length of life ($T$).
By definition, the utility function, $U$, is exact.

3.3.1 Exactness of the Health Status Index

Because a health status change affects only $q$ and $T$, it is possible to express the utility function over different health states as $\tilde v(q,T) = \tilde V(p,q,T,I)$ (where $p$ and $I$ are fixed variables). Then a health status index, $HS$, is exact if and only if

$$HS(q,T) = \phi(\tilde V(p,q,T,I)) = \phi(\tilde v(q,T)) \tag{3.15}$$

($\phi$ is some increasing monotonic transform).

Human Capital Measures

Human capital measures are not based directly on preferences, but on one behavioral manifestation of these preferences:

$$HK = w(q)L^*, \tag{3.16}$$

where the labour supply, $L^*$, usually depends on the utility level (in the labour constrained case, the labour supply depends on the constraint only).

Lemma 1 (HK): When labour supply is unconstrained, HK is exact (i.e. assigns higher values to more preferred states) if and only if the health care project affects $T$ only and labour is a normal good. Labour constrained HK is exact if and only if the health care project affects only the labour constraint.

Proof: see Appendix E.

Note that the leisure constrained case is identical to the unconstrained case, except for the additional argument in the expenditure function, $\tau_1(q)$. It, like $w(q)$, must be the same for all projects compared. Because of this, the human capital measure is never exact between states of the world characterized by different binding time use constraints.

It is clear that the human capital statistic is not exact for most health state comparisons. It cannot compare two states where both morbidity and mortality change, nor can it compare two states characterized by different binding time use constraints. Even for the very narrow set of health states which it potentially can evaluate, it is exact only for very restricted (and unlikely) preferences.
Hence, it is not a suitable index for evaluating health states at the individual level.

Willingness-to-Pay Measures

Unlike the human capital measure, the willingness-to-pay (WTP) measure is based on a direct comparison of well-being associated with any two states.

Lemma 1 (WTP): WTP is exact (i.e. takes on positive values when the "after" state is preferred to the "before" state, and vice versa).

Proof: see Appendix E.

Note that the presence of constraints has no effect on the exactness properties of the WTP measure.

Healthy Year Equivalents

The healthy year equivalent (HYE) is similar to WTP, although the valuation of health states depends on the value of time, rather than income.

Lemma 1 (HYE): HYE is exact (i.e. it assigns higher values to more preferred states).

Proof: see Appendix E.

Note that the presence of time use constraints has the same effect on HYE as on WTP.

3.3.2 Discussion

Exactness results are summarized in Table 3.1. HK statistics cannot compare projects involving different levels of morbidity, and can only rank projects involving different lengths of life if labour is a normal good (implying at least one type of leisure is inferior) and the set of binding time use constraints is constant. Willingness-to-pay measures and healthy year equivalents, on the other hand, are always exact.

In conclusion, the human capital measures are the least satisfactory in this type of decision framework (single individual). They cannot evaluate health states with different degrees of morbidity, and are consistent over length of life only under restricted (and unlikely) preferences. It is likely that the wrong treatments will be chosen if treatment choices are made according to the rankings of the human capital index.

The willingness-to-pay measure is exact, its only disadvantage being the difficulty in obtaining accurate data.
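The exactness of the willingness-to-pay and HYE measures can be checked numerically. The following is a minimal sketch, not part of the original analysis: the functional form of `utility`, the value `PERFECT_HEALTH = 1.0`, the income level and the candidate health states are all hypothetical choices made for illustration. It solves the implicit definitions in equations (3.12) and (3.13) by bisection and compares the resulting rankings with the ranking by utility.

```python
import math

PERFECT_HEALTH = 1.0  # assumed value of q in perfect health (hypothetical scale)


def utility(q, T, income):
    """Hypothetical reduced-form indirect utility V(q, T, I), increasing in each argument."""
    return q * math.log(1.0 + T) + math.log(income)


def bisect_root(fn, lo, hi, tol=1e-9):
    """Root of a decreasing function fn on [lo, hi], assuming fn(lo) >= 0 >= fn(hi)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if fn(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def willingness_to_pay(after, before, income):
    """cv solving V(qA, TA, I - cv) = V(qB, TB, I), as in equation (3.12).

    Assumes the after state is weakly preferred, so that 0 <= cv < income.
    """
    target = utility(*before, income)

    def gap(cv):
        return utility(*after, income - cv) - target

    return bisect_root(gap, 0.0, income * (1.0 - 1e-9))


def healthy_year_equivalent(state, income):
    """Years T - m in perfect health that are as good as (q, T), as in equation (3.13)."""
    q, T = state
    target = utility(q, T, income)

    def gap(years):
        return target - utility(PERFECT_HEALTH, years, income)

    return bisect_root(gap, 0.0, T)


income = 40_000.0
before = (0.3, 10.0)  # hypothetical "before" state (q, T)
candidates = [(0.5, 30.0), (0.9, 12.0), (0.7, 20.0), (1.0, 8.0), (0.3, 40.0)]

by_utility = sorted(candidates, key=lambda s: utility(*s, income))
by_wtp = sorted(candidates, key=lambda s: willingness_to_pay(s, before, income))
by_hye = sorted(candidates, key=lambda s: healthy_year_equivalent(s, income))

# Exactness means all three orderings of the candidate states coincide.
print(by_utility == by_wtp == by_hye)
```

Bisection is used only because the two measures are defined implicitly; with this particular utility function closed forms exist, but the solver makes no use of them and would work for any utility that is increasing in income and in healthy time.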
The QALY based measures are also exact, and may be more reliable than the willingness-to-pay measures. Theoretically, both indexes indicate appropriate treatment paths, although the willingness-to-pay measures may be biased because of market distortions. These distortions have to be assessed in light of the significant costs of acquiring the alternate QALY data.

         q          T
HK       inexact    $L_I > 0$
cv       exact      exact
HYE      exact      exact

Table 3.1: Conditions for Exactness

3.4 Choices Between Individuals

In this section, attention is turned from decisions over which treatment to give to a particular individual to decisions about which individual should receive a particular treatment (e.g. who should receive a heart transplant when donor hearts are scarce). Decisions of this nature involve the type of allocative issues which led to dissatisfaction with both the willingness-to-pay and human capital approaches and the development of QALY based alternatives.

It is assumed that the policy maker chooses to allocate treatments to the people who benefit the most by them (as measured by the health status index). For instance, a heart transplant allows person A to return to work as a corporate executive making 100,000 dollars a year, or person B to continue to live on a pension in retirement. Not receiving a transplant results in death for either person. The human capital statistic measures the annual benefit to person A as 100,000 dollars more than the benefits to person B. The policy maker would then choose person A over person B for the next donor heart because this choice results in the greater measured health benefit.

The question is whether such decisions are consistent with the principle of equal entitlement.^7 This requires that all persons who prefer one treatment path over another have an equal opportunity to receive that treatment (i.e. the health status statistic measures the benefits of such a treatment as equal for all persons who want this treatment). This is done in two steps. First, it is determined if such health status measures vary across individuals with identical preferences but different endowments. Second, the variations in health status measures across individuals with identical endowments but different preferences are assessed.

^7 Other ethical positions are possible (e.g. merit based allocations), although equal entitlement is fairly common. What is important is to recognize that some such ethical judgement is inherent in this type of decision and that these judgements should be made explicitly. Acceptance of arbitrary value standards is morally inept.

3.4.1 Variations Due to Endowments

In this section, each health status statistic is evaluated for independence from (i) income (wealth and earnings), (ii) health (life expectancy and co-morbid effects), and (iii) time use constraints. Because choices between individuals are based on measured differences in health status as a result of treatment, the appropriate function for analysis is the difference in the health status statistic evaluated at the treatment and no treatment states.

Human Capital Measures

The human capital statistic measures the value of a health status change (denoted by the move from $(q^B, T^B)$ to $(q^A, T^A)$) as the change in earned income due to the health status change:

$$\Delta HK = w(q^A)L^*(q^A, T^A) - w(q^B)L^*(q^B, T^B), \tag{3.17}$$

where $L^*(q,T)$ is the supply of labour in health state $(q,T)$.

income effects

The (unconstrained labour supply) human capital statistic is invariant to the level of wealth ($I$) if and only if labour supply is perfectly inelastic with respect to income (a zero income elasticity), and is invariant to earning capacity ($w$) if and only if the price elasticity of labour supply is unitary.^8

^8 In the labour constrained case, the labour supply is invariant to the level of income or wages, so the conditions do not hold.

It is relatively easy to derive these results by differentiating equation (3.17) (with the appropriate unconstrained labour supply functions substituted into the equation) by $I$ and $w$ respectively. In the former case, $\partial \Delta HK/\partial I = 0$ (the statistic is independent of wealth) if and only if $-dw/(w(q^A) - dw) = d(\partial L/\partial I)/(\partial L^A/\partial I)$ (where 'd' denotes the change between the before and after states and $L^A$ is the labour supply in the after state). This must apply to all projects, including those that affect only length of life. But then $dw = 0$, so independence requires $d(\partial L/\partial I) = 0$ (i.e. the income elasticity must be the same for all states). Imposing this condition when the project affects wages, the independence condition becomes $dw\,(\partial L/\partial I) = 0$. Since $dw \neq 0$, this requires that the income elasticity of labour supply be zero.

Assume that wages differ between individuals by a constant amount. Then the effect of a change in wages may be found by differentiating equation (3.17) with respect to this factor. Then $\partial \Delta HK/\partial w = 0$ if and only if $L(\varepsilon_{Lw} + 1) = c$ (where $\varepsilon_{Lw}$ is the price elasticity of labour supply) for all states. Unless the case where $L = 0$ can be eliminated a priori, this can only hold if $c = 0$. Then independence requires $\varepsilon_{Lw} = -1$ (unitary price elasticity).

Consider the effects when these elasticities do not hold. Suppose the income elasticity of labour is negative (people with large wealth holdings work less). A heart transplant allows a person to live and labour activity is not constrained after the transplant. Consider two people, one with large wealth holdings and one with little wealth, but otherwise identical. Then, because the income elasticity of labour is negative, the individual with small wealth holdings spends a larger portion of this extended life working than the wealthier individual so that the human capital statistic favours poor individuals when the donor heart is allocated.

health effects

The human capital statistic is invariant to life expectancy ($T$) if and only if the supply of labour is invariant to the length of life remaining and is invariant to co-morbid effects if and only if the change in earnings is the same for all health states.

These results can be found by the same methods above. Differentiating equation (3.17) with respect to $T$ and following the same steps as in the wealth case, independence from life expectancy requires $\partial L/\partial T = 0$. Co-morbid effects can also be found by differentiating (3.17) with respect to $q_j$, some morbid characteristic unaffected by the health status change. This yields the independence condition:

$$\frac{\partial \Delta HK}{\partial q_j} = w(q^A)\frac{\partial L^A}{\partial q_j} - w(q^B)\frac{\partial L^B}{\partial q_j} + \frac{\partial w(q^A)}{\partial q_j}L^A - \frac{\partial w(q^B)}{\partial q_j}L^B = 0.$$

In general, people with more years to live will work more ($\partial L/\partial T > 0$), with the exception of people who are constrained in labour time. Then, the human capital statistic discriminates against the elderly (e.g. a young person will get an organ transplant over an older person because, ceteris paribus, the period in which income can be earned is longer for the youngster). The effects of co-morbidity are more difficult to predict: a pre-existing disability may either inflate or dampen the earnings gain from the health status improvement. It is not possible to predict if this statistic discriminates against the disabled.

time use constraints

To examine the effects of time use constraints, the constrained versions of labour supply are used.
It is assumed that only one set of constraints is binding in both the before and after treatment states. This allows problems with non-linear constraints to be circumvented, while allowing comparisons across persons who are differentially constrained.

Generally, human capital measures are independent of time use constraints if and only if labour supply is unaffected by such constraints. In the labour constrained case, this requires that the labour constraint not be binding.

If the labour constraint is binding (regardless of whether the leisure constraint is binding), the effect on the human capital measure may be found by differentiating the labour constrained case by the labour constraint: $\partial \Delta HK/\partial \tau_L = (w(q^A) - w(q^B))\,\partial L/\partial \tau_L$. Given that wages cannot be restricted a priori to be equal, this requires that $\partial L/\partial \tau_L = 0$ (i.e. the labour constraint does not bind). Similarly, leisure constraint effects can be found by differentiating the leisure constrained labour functions: $\partial \Delta HK/\partial \tau_1 = w(q^A)(\partial L^A/\partial \tau_1) - w(q^B)(\partial L^B/\partial \tau_1)$.

While it is clear labour constrained individuals (the mandatorily retired) are discriminated against when the human capital statistic is used to allocate resources, the leisure constrained individual may or may not be.

Willingness-to-Pay Measures

The willingness-to-pay statistic may be represented in a global indirect utility function which encompasses all possible constraints (multiple constraints are a problem for the dual function only):

$$V(p, w(q^A), w(q^A)G(q^A, T^A, \tau_1(q^A), \tau_L(q^A)) + I - cv, H(q^A, T^A, \tau_1(q^A), \tau_L(q^A)), q^A) = V(p, w(q^B), w(q^B)G(q^B, T^B, \tau_1(q^B), \tau_L(q^B)) + I, H(q^B, T^B, \tau_1(q^B), \tau_L(q^B)), q^B). \tag{3.18}$$

This function is used to derive several of the independence conditions below because the results obtained can be generalized beyond the time use unconstrained case.

To compare the effects of time use constraints, however, the segmented constrained indirect utility functions are used. This is allowable because it is the difference in
This is allowable because it is the difference inChapter 3. The Welfare Properties of Three Health Status Statistics^116willingness-to-pay for a given health status change between the case when the sametime use constraints bind in both the before and after states, and when they do notbind in either state that is of interest.income effectsThe willingness-to-pay measure is independent of wealth (I) if and only if incomeis additively separable from health status (q and T) in the utility function, and isindependent of earning potential if the effect on the marginal utility of income of achange in wages is proportional to the effect on the marginal utility of health status.To demonstrate the above, express equation (3.18) in its reduced form (withall constants suppressed) to obtain the equivalent expression V(qA,TA ,I — cv) =V (V', TB , I). Then cv is independent of I if and only if it is the same for all values ofI. If cv is independent of I, then cv f(qA ,TA ,qB ,TB ). Set (qA ,TA ) = (q,T) and( qB , TB) = (q, T) (i.e. the before state is fixed at (q, T)). Then the fixed values maybe suppressed in the expression for cv: cv f(q,T). Substituting into the indirectutility function, one obtains: V (q, T, I — cv) = V (4,T, I). Given that cv is indepen-dent of I, choose I to be I = cv m, where m can take on any value. Then theindirect utility function becomes V(q,T,m) = P, cv + m) which, because (4,T)are fixed, may be expressed V (q, T, m) = v(cv m) cv m. But cv f (q, T ), soV (q, T, m) f (q, T) + m, which is the desired result. If V (q, T, I) =12-- f (q, T) + I , thencv = NA,TA)_ f(qB ,TB) , which is independent of I.To derive the earnings result, assume that dw(qA ) = dw(qB ), and that the changein health status is small (so that cv may be defined as a function of derivatives 9 ).Expressing (3.18) in the reduced form, independence requires V (qA , TA, I — cv , w (qA )+dw) = V (qB , TB , I , u) (qB ) dw) for all dw. 
Solving for cv and differentiating with9The results are easily extended to large changes in health status since any discrete change inhealth status may be expressed as a series of small changes. Given that these conditions hold overall small changes, they must then hold for any large change.Chapter 3. The Welfare Properties of Three Health Status Statistics^117respect to dw, one obtains acv/aw = a(virs )/(9w (where VHS dHS = Vqdq-EVTdT, andsubscripts denote derivatives), and Ocv/aw 0 if and only if VHs,w/VHs =In both cases, the effect of a change in income on willingness-to-pay depends onhow that change affects the marginal value of health relative to the marginal value ofincome. For the statistic to be biased to the rich, the marginal value of health mustincrease more than proportionately than the marginal value of income when wealthor wages change. In the case of wealth, it is generally accepted that the marginalvalue of income falls as wealth increases, and evidence from Viscusi and Evans (1990)suggests the marginal value of morbidity increases with income. Thus, the measureis biased against the people with small wealth holdings. It is not clear that the samebias holds for earning potential, however.health effectsThe willingness-to-pay measure is independent of life expectancy (T) or co-morbideffects (qj ) if and only if each element of health status is weakly separable from allother elements.This result can be demonstrated using either T or qj (qj is used here). Assumethat the change in health status is very small (so that derivatives may be used). Then,since cv may be expressed cv = dT rE i dqi , independence requires acv/aq ; = 0,or c9( 1/70/aqi = (9( 17/ ,)/aqj = 0 for all qi . But this is the definition of weak separabilityof qj from all other elements of health and income. Since this must hold for everyaspect of health, this implies all elements must be weakly separable from each other,i.e. 
$V(q_i, q_j, T, I) = V(v^1(q_i), v^2(q_j), v^3(T), v^4(I))$.

Again, the effects of a change in health status unrelated to treatment on willingness-to-pay depend on the relative effects on the marginal value of income and health. Evidence from Viscusi and Evans (1990) suggests the marginal utility of income is increasing in health, and standard economic theory suggests the marginal utility of any one good (e.g. health) should be subject to diminishing returns. In this case, the marginal utility of health actually falls as the marginal utility of income increases, suggesting the aged and disabled are, ceteris paribus, favoured in treatment selection when this measure is used.

time use constraints

Time use constraints (either labour or leisure) do not affect willingness-to-pay if they are weakly separable from health and income in the utility function.

The purpose of this section is to compare the willingness-to-pay of an individual whose time is always constrained against one whose time is never constrained. Thus, the segmented constrained indirect utility functions are employed since only one set of constraints is ever binding in either case. In the unconstrained case, denoted by "u", $cv^u = E(u_u^A, p, w(q^A), q^A) - E(u_u^B, p, w(q^A), q^A) = E^u(u_u^A) - E^u(u_u^B)$. The constrained case, denoted by "c", may be expressed in similar fashion as $cv^c = E^c(u_c^A) - E^c(u_c^B)$. Then the difference in the willingness-to-pay due to the constraint is $cv^u - cv^c$. This is equal to zero if and only if $E^u(u_u^A) - E^u(u_u^B) = E^c(u_c^A) - E^c(u_c^B)$, or $(\partial E^u/\partial u)\,du^u = (\partial E^c/\partial u)\,du^c$. Since $\partial E/\partial u = (\partial V/\partial I)^{-1}$, and $du = (\partial V/\partial HS)\,dHS$ (where $dHS = \sum_i dq_i + dT$), this implies $MRS^u_{HS,I} = MRS^c_{HS,I}$ (i.e. the $MRS_{HS,I}$ is unaffected by the presence of the constraint, which defines weak separability).

Not surprisingly, the condition for independence of willingness-to-pay from a constraint is very similar to the condition for any other factor.
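As a reminder of how such an independence condition operates in the simplest case, wealth, the following sketch compares two hypothetical utility functions. Both functional forms, the scale factor in `health_value`, and the income levels are assumptions made for illustration only. In the first, income enters as $f(q,T) + I$, the form derived in the wealth-independence argument above; in the second, income enters through $\log I$, so utility does not take that form.

```python
import math


def health_value(q, T):
    """Hypothetical money-metric value f(q, T) of a health state."""
    return 1_000.0 * q * T


def additive_utility(q, T, income):
    """V = f(q, T) + I: income is additively separable from health."""
    return health_value(q, T) + income


def log_utility(q, T, income):
    """V = q * log(1 + T) + log(I): not of the form f(q, T) + I."""
    return q * math.log(1.0 + T) + math.log(income)


def solve_cv(V, after, before, income, tol=1e-7):
    """Bisection for cv in V(qA, TA, I - cv) = V(qB, TB, I); assumes 0 <= cv < income."""
    target = V(*before, income)
    lo, hi = 0.0, income * (1.0 - 1e-9)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if V(*after, income - mid) > target:
            lo = mid  # still better off after paying mid, so cv is larger
        else:
            hi = mid
    return 0.5 * (lo + hi)


after, before = (0.9, 20.0), (0.6, 15.0)
incomes = [20_000.0, 50_000.0, 100_000.0]

cv_additive = [solve_cv(additive_utility, after, before, inc) for inc in incomes]
cv_log = [solve_cv(log_utility, after, before, inc) for inc in incomes]

print([round(cv, 2) for cv in cv_additive])  # the same value at every income level
print([round(cv, 2) for cv in cv_log])       # rises with income
```

Under the additive form the willingness-to-pay equals $f(q^A,T^A) - f(q^B,T^B)$ at every income level, while under the second form it rises with income, so an allocation rule based on it would favour the wealthier individual.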
If, for instance, the presence of a labour constraint makes the marginal value of income increase relative to the marginal value of health (as would be expected for improvements in $T$), the willingness-to-pay measure will favour the unconstrained individual (e.g. the individual in mandatory retirement is less likely than someone who may choose to work to be selected for an organ transplant).

Healthy Year Equivalents

As with the willingness-to-pay measure, the healthy year equivalent can be implicitly defined in a global utility function. Results derived from these functions can be generalized to all sets of constraints. The HYE may be expressed:

$$V(p, w(q^A), w(q^A)G(q^A, T^A, \tau_1(q^A), \tau_L(q^A)) + I, H(q^A, T^A, \tau_1(q^A), \tau_L(q^A)), q^A) = V(p, w(\hat q), w(\hat q)G(\hat q, T^A - m, \tau_1(\hat q), \tau_L(\hat q)) + I, H(\hat q, T^A - m, \tau_1(\hat q), \tau_L(\hat q)), \hat q). \tag{3.19}$$

Then $\Delta HYE = (T^A - m(q^A, T^A)) - (T^B - m(q^B, T^B))$. When constraints themselves are assessed, the segmented constrained indirect utility function is used for the reasons given in the willingness-to-pay section.

income effects

Wealth does not affect HYEs if and only if wealth is separable from health ($q$ and $T$). HYEs are independent of earning capacity (wages) if and only if the effect on the marginal utility of morbidity is proportional to the effect on the marginal utility of longevity.

These results are similar to those in the willingness-to-pay case, although their derivation is somewhat more involved. Differentiating the $\Delta HYE$ function with respect to income, one obtains

$$\frac{\partial \Delta HYE}{\partial I} = \frac{\partial HYE(q^A, T^A)}{\partial I} - \frac{\partial HYE(q^B, T^B)}{\partial I}.$$

Then $\partial \Delta HYE/\partial I = 0$ if and only if $\partial HYE(q,T)/\partial I = c$ for all $(q,T)$. But since $\partial HYE(\hat q, T)/\partial I = 0$ (perfect health is assigned a value of $T$ regardless of the level of wealth), $\partial HYE(q,T)/\partial I = 0$ for all $(q,T)$. Given $HYE(q,T) = MRS_{q,T}\,(q - \hat q)$ (again, assuming the change in $q$ is sufficiently small to express HYE explicitly), $\partial HYE/\partial I = 0$ if and only if $\partial MRS_{q,T}/\partial I = 0$, which defines weak separability (because any large change in health status may be viewed as a series of small changes, and because this condition holds at every point, then this condition must hold for all changes in health status regardless of size). Similarly, independence from earning capacity requires $\partial HYE(q,T)/\partial w = 0$ for all $(q,T)$. Using $HYE(q,T) = (V_q/V_T)(q - \hat q)$, this requires $V_{qw}/V_q = V_{Tw}/V_T$.

The HYE favours the wealthy if an increase in wealth (or wage) increases the marginal utility of morbidity relative to the marginal utility of time. A priori, it is not clear whether this is the case or not. However, if time use is unconstrained, it is the case that $V_I = wV_T$. If this substitution is made in the independence expression for wealth, the condition for independence and the direction of bias is the same as the willingness-to-pay case ($\partial HYE/\partial I = w\,\partial(V_q/V_I)/\partial I$) and the wealthy are probably favoured by this statistic. This proportionality between the HYE and the willingness-to-pay statistics is lost if time use is constrained.

health effects

The HYE is independent of the length of life ($T$) if and only if length of life is additively separable in the utility function, whereas it is independent of co-morbid effects if and only if each element of morbidity is weakly separable from all the other elements.

To derive these effects, apply the same arguments as in the income case to obtain $\partial HYE(q,T)/\partial T = c$ for all $(q,T)$. But $\partial HYE(\hat q, T)/\partial T = 1$, since a year in perfect health is worth a year in perfect health. Thus, $\partial HYE(q,T)/\partial T = 1$ for all $(q,T)$. This requires $V(q, \lambda T) = V(\hat q, \lambda T - m)$ for all $\lambda > 0$. Let $\lambda = 1/T$. Then $m = m(q)$. Since $T - m \cong U(q,T)$, this implies $U(q,T) \cong \bar u(q) + T$. Sufficiency is obvious.
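The sufficiency direction can be written out in a few lines. The following is a short worked derivation, offered only as a sketch: it assumes utility takes the additively separable form $U(q,T) = \bar u(q) + T$, where $\bar u$ is an arbitrary function of morbidity and $\hat q$ denotes perfect health, and it introduces $\delta$ as a hypothetical common shift in remaining life expectancy.

```latex
\begin{align*}
% Indifference between (q, T) and T - m years in perfect health:
\bar u(\hat q) + (T - m) &= \bar u(q) + T
  \;\Longrightarrow\; m = \bar u(\hat q) - \bar u(q) \equiv m(q), \\
% The HYE and its derivative with respect to length of life:
HYE(q,T) &= T - m(q), \qquad \frac{\partial HYE(q,T)}{\partial T} = 1, \\
% A common shift \delta in remaining life leaves the project value unchanged:
\Delta HYE &= \bigl(T^A + \delta - m(q^A)\bigr) - \bigl(T^B + \delta - m(q^B)\bigr)
            = \bigl(T^A - m(q^A)\bigr) - \bigl(T^B - m(q^B)\bigr).
\end{align*}
```

Because $m$ depends on $q$ alone, two individuals who differ only in baseline life expectancy are assigned the same $\Delta HYE$ for the same treatment.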
To derive the results for co-morbid effects, repeat the exercise used in the willingness-to-pay case with $T$ substituted for $I$.

These results suggest the aged are discriminated against if the marginal utility of time falls more than the marginal utility of morbidity as time increases. There is some empirical evidence to suggest that this is the case (see Brooks [1986] for evidence that QALY values are concave with respect to time). This is the HYE ability to pay analogue. The disabled are discriminated against if the disability reduces the marginal value of the morbid state proportionately less than the marginal value of length of life. Because the value of an extra year is reduced by a disability, it is possible that the HYE discriminates against the disabled, especially if the different components of morbidity are substitutes (e.g. the loss of one faculty, such as vision, makes one more dependent on other faculties, such as hearing).

time use constraints

Time use constraints (either labour or leisure) do not affect HYEs if the constraints are weakly separable from health and income in utility.

This result may be derived by the same methods used in the willingness-to-pay case. Thus, one would expect the retired individual to be discriminated against by the HYE or either of the other two health status statistics.

Summary

The majority of the conditions for independence derived above are inconsistent with observed preferences. Where possible, empirical evidence is used to determine the most probable direction of bias of the three health statistics. Not surprisingly, the human capital measure is biased towards individuals who are more inclined or able to earn labour income (the people with small financial reserves, those who earn the greatest return on work effort, the young and those who are allowed to work).
It is not clear what the direction of bias is for those who are disabled or otherwise constrained in their leisure activities because these factors may increase or decrease the incentives to work. The willingness-to-pay measure is probably biased towards the wealthy (an ability to pay argument) and, surprisingly, the aged and disabled (because the marginal utility of income has been found to be positively correlated with health: as a person becomes healthier, he or she is less inclined to sacrifice wealth for additional health). While it is clear the retired are also discriminated against, the effects of wage differentials and leisure constraints are not so apparent. Finally, it seems likely that the young and healthy are favoured by the HYE statistic, as well as those who are not labour constrained. It is not clear how the HYE is affected by wealth or earnings, although the wealthy are favoured if time use is unconstrained.

In conclusion, it is clear the person in mandatory retirement is discriminated against by all three statistics. The elderly and disabled are the most favourably treated by the willingness-to-pay statistic, while they may be least favourably treated by the HYE. The people with substantial wealth holdings are favoured the most by the willingness-to-pay statistic (and maybe the HYE) and the least by the human capital statistic.

3.4.2 Preference Variation

In this section, the implications of preference variation are assessed. It is assumed everyone has identical endowments, including time use constraints, and the same preference ordering over the two health states under consideration (this assumption restricts the marginal rate of substitution between morbidity and longevity to be the same among all individuals and imposes weak separability of health from non-health goods in the utility function).
Because the preference ordering over the sub-space of health is restricted to be the same for all agents, equal entitlement requires the health status measures be invariant to any other aspect of preferences.

Human Capital Measures

Because the human capital measure is an inexact measure of health, it is not possible to relate it to the marginal rate of substitution between morbidity and length of life. It is obvious that it discriminates against people who, at the margin, value the consumption of leisure more than purchased goods (as reflected in the value of income), since these people, ceteris paribus, work less.

Willingness-to-Pay Measures

Assume that the health status change is very small (see the previous section for the sufficiency argument that any condition derived for small changes also holds for large changes). Then the willingness-to-pay measure is directly related to the marginal rate of substitution between any element of health and income (and, by transitivity, is consistent with the marginal rate of substitution between any two elements of health), evaluated at the "after" health state. Given that $cv = \frac{1}{V_I} V_{HS}\,dHS$ (where $V_{HS}\,dHS = V_T\,dT + \sum_i V_{q_i}\,dq_i$), it is clearly inversely related to the marginal utility of income. Thus, agents who value income more highly than other agents are assigned lower values for a given health status improvement and are discriminated against in the allocation of treatments.

Healthy Year Equivalents

The healthy year equivalent is a function of the marginal rate of substitution between morbidity and longevity, much as the willingness-to-pay measure is a function of the marginal rate of substitution between health and income. This is supposed to ensure that individuals with the same preference ordering over any two health states are treated equally with respect to these two health states. However, the HYE is constructed such that the marginal rate of substitution is evaluated at perfect health, and this can destroy the egalitarian nature of the statistic if agents differ in their risk aversion to health.

Lemma 2: The HYE value for any given health status improvement is lower the more risk averse with respect to health the agent is.

Proof: see Appendix E.

Equal entitlement is not assured if agents differ in their risk aversion to health and the HYE statistic is used to allocate treatments. The more risk averse an agent is, the lower the value assigned to his or her health status improvement, and the less likely he or she will be assigned a beneficial treatment. Thus, agents who are very risk averse to health outcomes are discriminated against.

3.5 Choices Between Programs

In this section, attention is turned to the use of health status statistics to evaluate broadly based health care programs. These choices involve comparing different treatments which affect different people (e.g. pancreatic transplants affect the health of severe diabetics, lung transplants affect the health of people with severe respiratory diseases). These assessments involve measuring the health of (sometimes large) groups of people. Whether these measurements are consistent with social preferences is now assessed.

Assumptions

To make the analysis more tractable, a number of assumptions are made. First, the number of people in society is fixed (at $N$) so that issues of optimal population size need not be addressed. Each person is characterized by a health state $(q_i, T_i)$ and income level $(I_i)$. The social profiles of these characteristics are then $Q = (q_1, \ldots, q_N)$, $T = (T_1, \ldots, T_N)$, and $I = (I_1, \ldots, I_N)$.
It is assumed all individuals face constant prices of purchased goods, $p$, but may earn person-specific wage rates, $w_i$.

The purpose here is to determine the welfare properties of the aggregate statistics as currently used, not to invent new statistics that satisfy certain welfare properties. Because $N$ is assumed fixed, all three aggregate health status measures are the sum of the individual statistics over all members of the community:

a) HK: $HK(Q,T) = \sum_i HK_i(q_i, T_i)$
b) WTP: $cv(Q,T) = \sum_i cv_i(q_i, T_i)$
c) HYE: $HYE(Q,T) = \sum_i HYE_i(q_i, T_i)/N$

The policy maker chooses the program which results in the highest valued aggregate health statistic. For instance, if there were twenty people willing to pay twenty thousand dollars each for pancreatic transplants, and ten people willing to pay thirty thousand dollars each for lung transplants, the aggregate value of a pancreatic transplant program (four hundred thousand dollars) would be one hundred thousand dollars more than that of a lung transplant program (three hundred thousand dollars) by the willingness-to-pay measure, and the pancreatic program would be selected by the policy maker.

Evaluation Criteria

The choices made on the basis of these aggregate health status measures should be consistent with social preferences (so that programs which society prefers most are the first to be adopted). This requires that the health status indexes themselves be consistent with such preferences.

Social preferences order health status profiles of the community. Such an ordering must possess basic consistency properties: completeness, transitivity, and asymmetry. The evaluation criteria restrict the characterization of these social preferences. Such preferences must be welfarist (depend only on individual assessments of well-being) and sensitive to the prevailing aversion to social inequality in the community. It is assumed such social preferences can be represented by a social welfare function (SWF).
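The additive aggregation rule and the transplant illustration given under Assumptions can be checked with a short calculation. The sketch below is only an illustration: the function name `aggregate_wtp` and the list layout are hypothetical, and the dollar figures are those of the example in the text.

```python
# Illustrative sketch (hypothetical names): the aggregate willingness-to-pay
# statistic is taken to be the simple sum of individual willingness-to-pay values.

def aggregate_wtp(individual_wtp):
    """Sum individual willingness-to-pay values (dollars) over a community."""
    return sum(individual_wtp)

# Figures from the example in the text.
pancreatic = [20_000] * 20   # twenty people, $20,000 each
lung = [30_000] * 10         # ten people, $30,000 each

total_pancreatic = aggregate_wtp(pancreatic)
total_lung = aggregate_wtp(lung)
difference = total_pancreatic - total_lung

# The policy maker selects the program with the larger aggregate value.
chosen = "pancreatic" if total_pancreatic > total_lung else "lung"
print(total_pancreatic, total_lung, difference, chosen)
```

The pancreatic program is selected even though each lung patient values treatment more highly, because the rule looks only at the total.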
This SWF must be an ethically flexible Bergson-Samuelson SWF to satisfy the above restrictions.

The aggregate health status statistics are assessed to determine (i) if they are supported by any set of rational social preferences, (ii) whether these social preferences are welfarist, and (iii) the ethical position of these preferences.

3.5.1 Rationality of the Social Ordering

The first step is to determine if the social ordering of states implied by these functions is complete, transitive, and asymmetric. The ordering is complete if the statistic is increasing in each individual's utility (i.e. is Pareto inclusive) and, therefore, in every aspect of health which contributes to utility (an incomplete ordering is called a quasi-ordering). An ordering is transitive if, when state A is weakly preferred to state B, and B to C, then A is weakly preferred to C. An ordering is asymmetric if, when A is preferred to B, B is never preferred to A.

Human Capital

The social preferences which support the human capital health status statistic (recall that the aggregate human capital statistic used in health policy and examined here is the sum of the individual statistics: $HK(Q,T) = \sum_i HK_i(q_i, T_i)$) are typically incomplete, excluding those health effects which do not affect earned income and those people not in the formal labour market (either because of institutional constraints or co-morbid conditions). Furthermore, in the unconstrained case, the index is exact only if $q$ is fixed. Hence, inter-morbidity comparisons cannot be made.
Thus, the human capital statistic is supported by only a quasi-ordering.

Willingness-to-Pay

Recall that the aggregate willingness-to-pay statistic evaluated here is the one that is typically used in health policy: the sum of the individual willingnesses-to-pay. Willingness-to-pay statistics are complete over all health states and can be chosen to be Pareto inclusive (if the summation is defined over all individuals in society). They are not, in general, supported by social orderings because of serious rationality problems of asymmetry (see Boadway [1974], Blackorby and Donaldson [1985]). These arise because the statistics are conditioned on end state variable values: the function comparing the move from the before to the after state may differ from the function used to compare the move from the after to the before state, so that both moves may appear to be welfare enhancing in aggregate. Obviously, this is not an issue if the index is independent of these factors. Such independence is also necessary for the ordering to be welfarist, the subject of the next section, and the conditions for independence are covered there.

Healthy Year Equivalents

The aggregate healthy year equivalent health status statistic is supported by a quasi-ordering, being transitive and reflexive (the reference points are fixed for all comparisons), but not necessarily complete (while Pareto inclusive, the index is undefined for states worse than death). The incompleteness is less severe than in the human capital case, and probably occurs infrequently in practice.

3.5.2 Welfarism

A social welfare function is welfarist if it depends only on individual assessments of well-being.
This requires not only that each individual statistic be exact, but that the aggregate statistic depend only on the utility information in the individual statistics, not on any conditioning or reference factors (since these factors are supposed to be irrelevant to the rankings of social states).

Roberts (1980) has derived conditions for the sum of compensating variations to represent a Bergson-Samuelson SWF. He does this in two steps. First, he shows that the ordering implied by the aggregate statistic is consistent with a Bergson-Samuelson SWF if and only if it is independent of the reference variables (prices). Second, if the social decision statistic is additive, then the social preferences which support it are independent of reference prices if and only if the indirect utility function is of the form $V^i(p, I_i) = a(p) I_i + b_i(p)$. These results are now generalized to the utility based statistics considered here.

Proposition: the social ordering which is implied by a given health statistic is independent of reference variables if and only if the ordering is consistent with a Bergson-Samuelson SWF.

Proof: see Roberts (1980) or Chapter 4, Theorem 1.

Proposition: given that the aggregate health status statistic is the sum of the individual statistics, the social preferences which support the aggregate statistic are independent of reference variables if and only if

$HS_i(u_i, x_i, y) = a(y)\,\gamma_i(u_i) + b_i(x_i, y)$, (3.20)

where $x_i$ are variables that may be person-specific and $y$ are variables that are the same for all individuals.

Proof: see Roberts (1980) or Blackorby and Donaldson (1985) for the original proof based on compensating variations.

The proof is easily modified for the health index case. Consider an arbitrary index, $HS_i$, defined over utility, $u_i$, and some reference variables, $\theta$. Then independence requires that $\sum_i HS_i(u_i^A, \theta) \geq \sum_i HS_i(u_i^B, \theta)$ for all $\theta$.
Set $\theta = \theta^0$ and define $HS_i(u_i, \theta^0) = \gamma_i(u_i) = z_i$, where $\gamma_i$ is increasing and continuous. Then the independence condition may be restated: $\sum_i HS_i(u_i, \theta) = \sum_i HS_i(\gamma_i^{-1}(z_i), \theta) = \sum_i h_i(z_i, \theta) = H(\sum_i z_i, \theta)$, which is a Pexider equation whose solution is of the form $HS_i(u_i, \theta) = a(\theta)\,\gamma_i(u_i) + b_i(\theta)$ (obviously, no person-specific elements of $\theta$ may appear in $a(\cdot)$).

The implications of these propositions for the three health statistics under consideration are now examined.

Human Capital

In the case of human capital, there are two cases where the statistic may be exact: when only longevity changes (and labour supply is increasing in longevity) in the unconstrained case, and when the only effect is on the labour constraint in the labour constrained case.

Corollary 1: The unconstrained human capital statistic is consistent with (3.20) if and only if (1) wages are fixed and the same for all individuals and (2) there is no utility in leisure (e.g. $L = T$). The labour constrained human capital statistic is consistent with (3.20) under the same conditions as the unconstrained statistic.

Proof: see Appendix E.

In both cases, independence requires that wages are the same for all individuals and that no individual gets any utility from leisure (i.e. each spends as much time as possible working). Obviously, these conditions do not hold in the real world, so the social preferences which support the human capital statistic cannot be welfarist.

Willingness-to-Pay

The willingness-to-pay statistic is the one Roberts (1980) originally dealt with.
One important difference between Roberts' work and the problem addressed here is that health status $(q_i, T_i)$ is person-specific (different people have different health).

Corollary 2: The willingness-to-pay statistic is consistent with (3.20) if and only if income and health are additively separable in the indirect utility function and utility is homothetic in income, i.e.

$V^i(q_i, T_i, I_i, y, x_i) = a(y) I_i + b_i(q_i, T_i, y, x_i)$.

Proof: see Appendix E.

Unlike the Blackorby-Donaldson (1985) result for person-specific prices (that no preferences exist which satisfy independence when prices are person-specific and the aggregator function is linear), preferences do exist that satisfy independence when health status is person-specific. This is because there are no a priori restrictions on the relationship between health and income in the indirect utility function as there are between income and prices. Even so, empirical evidence suggests the preferences required for independence are not observed in practice (Viscusi and Evans [1990]).

Healthy Year Equivalents

In the case of healthy year equivalents, the fact that health may be person-specific is offset by the fact that the index is based on a common and fixed reference point.

Corollary 3: The healthy year equivalent statistic is consistent with (3.20) if and only if income and health are additively separable in the indirect utility function and utility is homothetic in length of life, i.e.

$V^i(q_i, T_i, I_i, y, x_i) = a(q_i, y) T_i + b_i(I_i, y, x_i)$.

Proof: see Appendix E.

This restriction implies that the marginal utility of time in any given morbid state must be constant and the same for all individuals (since any morbid state may be the reference health state).
This is in contrast to the willingness-to-pay case where the marginal utility of income is restricted to be constant and equal for all individuals. Empirical observation suggests neither set of restrictions holds in practice (in the healthy year equivalent case, discounting over time must be ruled out).

3.5.3 Ethics

The final consideration is the ethical position implied by these health status statistics. The degree of inequality aversion is reflected in the amount of curvature in the aggregator function. Each of the aggregate health status indexes is additive, reflecting inequality neutrality over the health status measure. This does not reflect inequality neutrality in the supporting social preferences, which are defined over individual utilities, not health status measures.

Human Capital

Consistency with welfarist social preferences imposes restrictions on individual preferences such that $L = T$. This in turn implies $V(p, w, T, I) = (wT + I)\,a(p)$ (or $V(p, w, \bar{L}, I) = (w\bar{L} + I)\,a(p)$). Hence, $\sum_i HK_i$ is cardinally related to the sum of individual incomes, and the human capital statistic is indifferent to inequality in income.

Willingness-to-Pay

Blackorby and Donaldson (1985) have shown that the aggregate willingness-to-pay measure used in health policy (the sum) is indifferent to income inequality. The gist of their argument (which is supported by a formal proof) is that the willingness-to-pay statistic measures changes in well-being and therefore cannot be sensitive to inequality aversion, which requires comparing levels of well-being across people.

Healthy Year Equivalents

The sum of the individual healthy year equivalents is a (non-strictly) concave function of a convex function of well-being. This is because preferences over time are assumed to be concave so that the transformation of utility imposed by the healthy year equivalent is convex.
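A small numerical illustration may help with the convexity argument. The sketch below assumes, purely for illustration, that the utility of $T$ years in perfect health is $\sqrt{T}$, so that the healthy year equivalent associated with a utility level $u$ is $u^2$. This functional form, the function names, and the two utility profiles are hypothetical and are not taken from the thesis.

```python
import math

def utility_of_healthy_years(years):
    """Assumed concave utility of time in perfect health (hypothetical form)."""
    return math.sqrt(years)

def hye_from_utility(u):
    """Healthy years that yield utility u: the inverse of the concave utility, hence convex."""
    return u ** 2

# Two hypothetical two-person utility profiles with the same total utility (6).
equal_profile = [3.0, 3.0]
unequal_profile = [1.0, 5.0]

# The aggregate statistic is the sum of individual healthy year equivalents.
sum_hye_equal = sum(hye_from_utility(u) for u in equal_profile)      # 9 + 9
sum_hye_unequal = sum(hye_from_utility(u) for u in unequal_profile)  # 1 + 25

print(sum_hye_equal, sum_hye_unequal)
```

Under these assumptions the summed statistic is larger for the less equal profile even though total utility is the same in both, which is the kind of outcome the convexity of the transformation makes possible.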
But a concave function of a convex function is not generally concave, so the inequality aversion (neutrality in this case) built into the aggregator function will not represent the inequality aversion of the social preferences which support this health statistic. In fact, the social preferences which support this index are very likely to be characterized by inequality affection, in which case the health statistic assigns higher values to health profiles that are less equal.

The only exception to this situation is when individual preferences are all homothetic in length of life. Then the healthy year equivalent is a linear transform of utility and the sum of the individual healthy year equivalents is supported by social preferences that are indifferent to the distribution of well-being (see Chapter 4, Theorem 6 for a formal discussion).

3.5.4 Summary

None of these health status statistics satisfactorily orders all possible community health profiles. The human capital statistic cannot evaluate programs that affect quality of life. It could not, for instance, compare a kidney transplant program (which generally improves the quality of life but does not increase the quantity of life of kidney patients on dialysis) against a heart transplant program (which primarily increases the length of life of heart patients) in a way consistent with individual preferences. The healthy year equivalent is also incomplete and cannot assess programs that may result in health states worse than death (e.g. bone marrow transplants that reject). But the most problematic of the health status measures is the willingness-to-pay, which does not consistently rank projects (i.e.
it may indicate both that a heart transplant program yields more health than a kidney transplant program, and that a kidney transplant program yields more health than a heart transplant program).

Generally, independence from extraneous factors is not achieved by any of the measures. While the theoretical requirements for independence are different for the three statistics (one could argue that they are most stringent for the human capital measure and least for the healthy year equivalent), empirical evidence indicates no set of restrictions is satisfied. Thus, no measure is supported by welfarist social preferences and the rankings of programs depend on the choice of reference variables.

Ethically, the human capital and willingness-to-pay measures are indifferent to inequality. The healthy year equivalent is the most perverse statistic, however, since its measure of health status is a convex transformation of well-being associated with health. Thus, health improvements to people who are already relatively healthy are assigned greater values than health improvements to people who are relatively unhealthy.

3.6 Conclusion

QALY analysis was developed to provide policy makers with a decision statistic that was consistent with individual preferences, but free of the ethical ramifications of linking value to ability to pay. The results of this paper suggest the success of this endeavour has been somewhat limited. The healthy year equivalent is exact, and therefore appropriately indicates which treatments are most beneficial to a given individual. In choosing individuals for treatment, observed preference patterns indicate the healthy year equivalent may actually discriminate against people who have poor health endowments (the reverse of the willingness-to-pay statistic), and may not be any less discriminatory against the financially poor than the willingness-to-pay statistic.
Furthermore, it discriminates against people who are risk averse with respect to health. In evaluating health care programs, the healthy year equivalent is supported by rational social preferences, an improvement over the other measures. The consequences of the fact that these preferences are not welfarist are minimized by (in practice) the use of a fixed reference health state for all comparisons. However, the ethics of the aggregate measure may be perverse if individual utility functions are concave in time, and the measure may actually favour very unequal distributions of health over more equal distributions. Overall, the healthy year equivalent does seem to be superior to the available alternatives as a measure of health status, although it is not without its own difficulties.

Additional work needs to be done before these results may be considered conclusive. Two avenues of empirical work need to be undertaken. First, the nature of preferences must be empirically identified, particularly how the marginal utilities of time and income vary with income, age, and co-morbid effects. Second, the three statistics should be assessed for relative variance and any systematic bias that arises in their calculation in practice, and the theoretical results should be reconsidered given this information.

Theoretically, this assessment could be strengthened in a number of ways. Decisions of this nature are inherently dynamic and involve variable population sizes, and the present analysis has ignored these facts. The few comments made in this chapter about the effect of dynamic contexts on the results, and the fact that many of the results hinge on the relationship between morbidity and longevity in preferences, suggest this is one avenue deserving of further investigation. The effects of externalities should also be considered.
Finally, the analysis should be extended to include the other dimensions of these rationing decisions, most notably costs.

Chapter 4

A QALY Based Societal Health Statistic for Canada, 1985

4.1 Introduction

4.1.1 Purpose

This paper assesses the appropriateness and feasibility of QALYs (quality adjusted life years) as a foundation for an index of societal health. Just as G.N.P. acts as a measure of the economy's performance in the aggregate, such an index of societal health could act as a gauge of the performance of the health care system as a whole. Such an index is necessary to assess the overall allocation of health care resources, and to effectively target policy to improve both the level and distribution of health status in society.

Results in this paper suggest that, theoretically, the QALY serves as an imperfect measure of societal health, but that these imperfections are far less severe than those associated with currently used indexes, and that such failures are probably endemic to any index based on individual preferences. A QALY based index, constructed using the best available data, indicates that morbidity has a significant effect on Canadian health status (e.g. Canadians, as a group, are prepared to give up 10 per cent of their longevity in order to eliminate morbidity from their society, and more if they are inequality averse), that the distribution of health across regions and gender shifts when morbidity is accounted for (e.g. the advantage women have over men in terms of life expectancy falls by half when morbidity is taken into account, the people of Quebec are relatively much healthier once morbidity is included in the inter-regional assessment, but people in the Atlantic provinces are worse off), and that resources might be best targeted on alleviating role (the ability to fulfil social functions) and motor (broadly defined mobility) dysfunction.

4.1.2 Criteria for a Societal Health Index

To be useful, a societal health index must assign higher values to more preferred community health status profiles than to less preferred profiles. To do this, an index must satisfy the following conditions.

Completeness

An index must encompass all aspects of health status and the health of all members of society. Otherwise, improvements (deteriorations) in the excluded aspects or in the health of the excluded individuals are not reflected in higher (lower) values of the index, even though relevant changes have occurred.

Completeness presupposes a clear definition of what health status is. Well-being in health may be distinguished from the broader concept of utility by limiting the range of aspects to those that the health care system attempts to (as opposed to does) impact directly (Evans [1984], p. 5), assuming all other aspects of well-being can be held fixed. Such a definition encompasses those interpretations of illness relating to physical and psychological pathology, as well as broader dysfunction. More importantly, it is necessary if the index is to be used to judge the performance of the health care system.

Health status evolves over the lifespan. Point- and period-of-time indexes cannot incorporate the longevity aspect of health. Since many health care resources are expended to reduce mortality, only lifetime health status measures are complete.¹

A societal health index must encompass all members of society and be increasing in each individual's health.
This requires that the index be Pareto inclusive, though not necessarily egalitarian (where the amount of the increase is constant for all individuals), as has sometimes been suggested.

As a corollary, note that completeness also requires that the index be sensitive to relevant changes within the domain. Otherwise, the index may not detect important health status changes that have occurred.

Consistency

An index must be consistent with the set of value judgements pertaining to health status held by a benevolent social planner: health status changes which are trivial in their impact must not be assigned greater weight than those that are more important to society. The question of whose values should count and by how much is essentially normative, involving a value judgement itself.

The first normative judgement made here is that the individual who endures the health state is the best judge of the well-being associated with that state. Paternalism, on the other hand, can lead to the situation where the societal index indicates an improvement in health status even when every individual feels worse off than he or she was before.² This is an explicitly welfarist position which is well established in economics and has received increasing support in the medical literature (see, for instance, Geigle and Jones [1990]).

¹ Notice that a purely outcome approach, which excludes aspects of process (how outcomes are achieved) and prognosis, has been adopted. This reflects the ex post position, where only health outcomes achieved matter, rather than the ex ante position, where all possible future health states, not just those actually attained, are included in the assessment.

² Note that the individual values the health status outcome, so that agency relationships, which arise from complicated processes, need not apply.

These individual valuations must then be aggregated to a societal level.
This must be done in a manner consistent with social preferences (Patrick [1976]). Otherwise, the health status index may indicate that it is better to devote all resources to effect a very small health status improvement in a less deserving person (in "society's" opinion) than a very large improvement in the health status of many more deserving individuals. Wagstaff (1991) has suggested the aggregate QALY index be treated as a social welfare function. Yet, he gives no consideration to the assumptions inherent in this approach, or the resultant social ethics. It is therefore necessary to consider what social ethics may be compromised by this approach and the implications of these compromises.

Ethical Content

The social welfare function chosen must be sensitive to ethics regarding justice and fairness held by the social planner. In terms of outcomes³, such ethics can usually be couched in terms of the distribution of health status. An index must be increasing in each individual's health because the health of each individual is socially valuable. Also, there appears to be a trend in Canadian society towards preference for health status distributions that are more equal rather than less equal (see the evolution of this principle from the Royal Commission on Health Services [1964], where only health maximization mattered, through to the Epp Report [1986], where equality was the first of three policy priorities).

³ Notice that a function defined over outcomes can only take an ethical position over outcomes. If justice is based on something else, such as the fairness of the process by which health outcomes are determined, such an index cannot reflect the ethics involved. This suggests the validity of the social welfare functional approach may have to be reconsidered at a later date.

Feasibility

Finally, an index must be feasible to construct.
This condition really acts as a constraint on the optimization problem defined by the above conditions. An index cannot be considered worthwhile if the resources required for its construction are greater than any resource savings that could ensue from the use of the index as an instrument to better allocate resources. An index that does not fit the criteria above may still direct the health care system to better resource allocation than an index which satisfies the criteria but is impractical to apply. This criterion may change over time as more data and better techniques in measurement are developed.

4.1.3 Literature Review

The practice of health status measurement has closely followed contemporary data availability. The first index to be widely used was life expectancy. Its most serious drawback is incompleteness: it ignores non-fatal illness. It is consistent with preferences over longevity only if there are no states worse than death (because the index is strictly increasing in time). It is Pareto inclusive, covering all members of society, and the ethics are inequality neutral since the index is invariant to the distribution of years of life across members of society. Data are available from census information and are highly reliable.

The second class of indexes incorporates labour force data with life expectancy. These include Sullivan's (1966) index (using occupational disability), Chiang's (1968) index (using worker absenteeism), and Miller's Q (1970) (using wages lost due to illness). While such measures do account for some aspects of morbidity, they are less complete in other dimensions since only the health of persons in the labour market is counted. Furthermore, health effects which do not have an effect on worker productivity are still ignored.
The values assigned to states are inconsistent with most preferences because states in which a person is unable to work are assigned the same value as the state where that individual is dead, regardless of the utility associated with other aspects of life, whereas ability to work is deemed equivalent to perfect health, regardless of the discomfort associated with the morbid state. Ethically, these measures are seriously flawed since the worthiness of a person's health is tied to his/her labour force productivity. Data are readily available but may be biased due to incentives on the part of workers to misreport.

The third class of indexes attempts to adjust life expectancy figures with morbidity values (such as Patrick et al.'s Q.W.B. (quality of well-being index) [1973], and Torrance's QALY [1976c]) to construct a quality adjusted life expectancy index that incorporates both length and quality of life. These measures encompass an even broader range of morbid effects, although completeness is limited to the scope of the measure in question (the Q.W.B. scale has a fixed set of components; the QALY is generic, since it can be based on any set of components). Valuations are consistent with preferences over morbidity only if the value weights are obtained from a representative sample of the population; none of the measures accommodates preference variation within the population. In order for the morbidity weighted life expectancy values to be consistent with the value of health status, the morbidity values must be appropriately weighted in terms of the value of being alive. Only the QALY explicitly does this. Individual scores are then aggregated by some function (usually additive), with the cardinality of the index dictating the nature of the comparability across utilities (e.g. when the indexes are added, this imposes the social valuation that one year of life is worth the same regardless of to whom it accrues).
Data for these indexes are not readily available: even though value weights for states have been derived, the assessment of the prevalence of these states in the population has only recently been attempted.

There have been two attempts to employ the data available to approximate an index of this class. Erickson et al. (1989) cross-reference the morbid states surveyed in the National Health Interview Survey (N.H.I.S.) to the Q.W.B. scale to construct a morbidity adjustment factor which is then combined with life expectancy figures for the U.S. They encounter a number of difficulties including: (1) The information collected in the N.H.I.S. does not correlate perfectly with the components of the Q.W.B. scale. Since the Q.W.B. scale is inclusive (measures must exist for all components and only those components), this requires proxies for some states not in the N.H.I.S. and the exclusion of other states described in the N.H.I.S. (2) One has to assume the preferences underlying the Q.W.B. scale are representative of the sample in the N.H.I.S. (3) No linkage exists between the morbidity data and the life expectancy data (i.e. the transition probabilities for moving from one morbid state to another or from any morbid state to death are unknown). Erickson et al. assume independence throughout the transition matrix (describing the probabilities of moving between morbid states over time)^4 and match mean values from the two data sets. Furthermore, no linkage exists between the values associated with the morbid states and death. This cannot be overcome by statistical assumptions or by better data collection.

Wilkins and Adams (1983) use the data in the Canada Health Survey (1981) and purport to link these to QALY values. The statistical problems affecting Erickson et al. are still present, since the data collection is similar.
The difference between the two lies in how these states are valued: Wilkins and Adams adopt a system that is potentially better both theoretically (values are linked to life) and practically (no states are excluded). However, the morbid states chosen are very crude, resembling the labour force models of the previous decade in scope. The valuations, while they may be in a reasonable range, are arbitrary since preferences for the states used have not been surveyed. Many of the conclusions are sensitive to these choices.

^4 They have to assume that the likelihood of occupying a given health state in any period is the same for every individual regardless of the health status of that individual in the previous period.

While it appears the QALY based index, in principle, may be superior to other indexes, it has yet to be implemented in a fashion that achieves its advantages. The remainder of this paper assesses the theoretical performance of the QALY based index against the criteria stated above, examines the feasibility of constructing such an index with available data and what improvements in data collection may be required if current standards are inadequate, and considers what sorts of policy decisions may be assisted by such information, in order to judge whether these improvements are warranted.

4.2 Model

4.2.1 Theoretical Assessment

The environment in which the health states exist is described by the following assumptions.

i) An individual's state is described by $x$. $x$ includes health status ($q$ describing the morbidity profile over life, $t$ the length of this life) and non-health factors (denoted by $\kappa$, including such aspects as income and personal characteristics). Thus, $x = (q, t, \kappa)$.

ii) It is assumed each individual ($i = 1, \ldots, N$) has a well-defined preference ordering, $\mathcal{R}_i$, over states of the world and that this ordering has a representation, $U^i : \mathbb{R}^m \to \mathbb{R}^1$, such that

$$x^1 \,\mathcal{R}_i\, x^0 \iff U^i(x^1) \ge U^i(x^0). \tag{4.1}$$

It is assumed this function may be conditioned on non-health factors such that $U^i(q,t)$ represents preferences over $(q,t)$ conditioned on $\kappa$ (if preferences over $(q,t)$ are separable from $\kappa$, then this function is independent of, not conditioned on, these variables). The value of the health state is represented by $u_i$ ($u_i = U^i(q,t)$). It is assumed $U^i$ is continuous in $t$. It is further assumed that, when necessary, these individual utility functions are interpersonally comparable and numerically measurable.

iii) It is assumed social preferences exist over the distribution of health states across the $N$ individuals in society. It is further assumed that these social preferences are based on individual utilities. Let $\mathcal{R}$ denote these social preferences and assume they may be represented by $W : \mathbb{R}^N \to \mathbb{R}^1$. Let $Q = (q_1, \ldots, q_N)$ and $T = (t_1, \ldots, t_N)$. Then

$$\begin{aligned}
(Q^A, T^A)\,\mathcal{R}\,(Q^B, T^B)
&\iff W(u_1^A, \ldots, u_N^A) \ge W(u_1^B, \ldots, u_N^B) \\
&\iff W(U^1(q_1^A, t_1^A), \ldots, U^N(q_N^A, t_N^A)) \ge W(U^1(q_1^B, t_1^B), \ldots, U^N(q_N^B, t_N^B)) \\
&\iff \tilde W(q_1^A, \ldots, q_N^A, t_1^A, \ldots, t_N^A) \ge \tilde W(q_1^B, \ldots, q_N^B, t_1^B, \ldots, t_N^B).
\end{aligned} \tag{4.2}$$

A social welfare function is said to be welfarist if it depends on individual preferences or utilities alone, and is extra-welfarist if it depends on factors in addition to $(u_1, \ldots, u_N)$. The former is often described as a Bergson-Samuelson social welfare function (B-S SWF).

iv) Assume the QALY, $\varphi(q)$, is based on the time trade-off instrument, i.e.

$$\varphi_i = \frac{m_i}{t_i} \iff U^i(q_i, t_i) = U^i(\bar q, m_i) \tag{4.3}$$

(where $\bar q$ is the same for all individuals and is usually perfect health). Assume the time frame used in the time trade-off exercise is equal to time alive such that

$$\varphi_i t_i = m_i = M^i(q_i, t_i, \bar q), \tag{4.4}$$

which is the quality adjusted life time (QALT), or healthy year equivalent (Mehrez and Gafni [1989]), defined over the whole life-cycle. Note that $M^i$ is ordinally equivalent to $U^i$. Thus,

$$m_i = M^i(q_i, t_i, \bar q) = f_i(U^i(q_i, t_i), \bar q). \tag{4.5}$$

Such assumptions ensure the QALY based societal health status index is not distorted by imperfections in the valuation of individual health status, but only by flaws inherent in the aggregate index.

Note that $U^i(q_i, 0)$ is, logically, invariant to the level of $q_i$ (because the individual does not endure the state, but dies immediately). Hence, because the individual is indifferent between zero years in perfect health and zero years in any other health state, he or she is not prepared to give up any time to move between these two morbidity levels. Thus, $M^i(q_i, 0, \bar q) = 0$. This is described as

Condition N: $M^i(q_i, 0, \bar q) = 0$ for all $q_i$.

v) The societal health index may be expressed by some aggregator function, $\Gamma$, over the QALTs:

$$HS = \Gamma(m_1, \ldots, m_N). \tag{4.6}$$

$\Gamma$ is supported by a set of preferences^5 over health states, denoted by $\mathcal{R}_\Gamma$, such that

$$(Q^A, T^A)\,\mathcal{R}_\Gamma\,(Q^B, T^B) \iff \Gamma(m_1^A, \ldots, m_N^A) \ge \Gamma(m_1^B, \ldots, m_N^B). \tag{4.7}$$

Whether these preferences are consistent with those of a Bergson-Samuelson social welfare function is now addressed.

The properties of $\Gamma$, given $M^i$, $U^i$, and $W$, are now assessed.

^5 Torrance (1986) suggests $\Gamma$ should be additive (i.e. $\Gamma = \sum_{i=1}^{N} m_i$), claiming such a measure is egalitarian. The accuracy of this claim is addressed later in this paper.

4.2.2 Completeness

Incompleteness over health states may occur if states worse than death exist and the QALY is bounded from below by zero. In this case, the QALY value is undefined, since no non-negative value of $m$ solves the time trade-off problem.
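The time trade-off definition in (4.3) and the incompleteness problem just described can be illustrated numerically. The sketch below is not the thesis's instrument: it assumes, purely for illustration, a geometrically discounted utility $U(q,t) = a(q)(1-\delta^t)/(1-\delta)$, with hypothetical flow values `a` and a hypothetical discount factor `delta = 0.95`. Under that assumption the trade-off equation has a closed-form solution, Condition N holds automatically, and a state with a negative flow value (worse than death) has no non-negative solution.

```python
import math


def discounted_utility(a, t, delta=0.95):
    """Assumed utility of t years at flow value a, discounted geometrically."""
    return a * (1.0 - delta ** t) / (1.0 - delta)


def time_trade_off(a_q, t, a_ref=1.0, delta=0.95):
    """Solve U(q, t) = U(q_ref, m) for m under the assumed utility.

    Returns None when no non-negative m exists, for example when the
    state is worse than death (a_q < 0).
    """
    if a_ref <= 0.0:
        raise ValueError("the reference state must be preferred to death")
    # Rearranging the trade-off equation gives delta**m = x.
    x = 1.0 - (a_q / a_ref) * (1.0 - delta ** t)
    if x <= 0.0 or x > 1.0:
        return None
    # abs() only turns a possible -0.0 into 0.0; the ratio is never negative here.
    return abs(math.log(x) / math.log(delta))


if __name__ == "__main__":
    for years in (0.0, 10.0, 40.0):
        print(years, time_trade_off(0.5, years))
    print("worse than death:", time_trade_off(-0.2, 10.0))
```

With these assumed values, zero years in any state maps to zero healthy-year equivalents, and the implied weight $m/t$ falls as the time frame lengthens, which is the kind of curvature in time that matters for the aggregation results in this chapter.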
Whether such a situation is apt to hold over an entire lifetime is doubtful (especially if suicide is an option), so the problem may be more academic than practical.

To overcome this problem, the following regularity condition is imposed:

Condition C: $U^i(q_i, t_i)$ is increasing in $t_i$ for all $q_i$.

This requires that the domain of $q$ be restricted such that all health states are preferred to death.

The domain of the aggregator function is chosen to encompass all members of society. Because time trade-off values exist only for those members of society who are able to express their preferences, or for whom appropriate proxy values exist, there is a risk that the very young or the severely incapacitated may be misrepresented.

4.2.3 Consistency

By assumption (iv) above, the QALT measure is consistent with each individual's preferences over morbidity and mortality as conditioned on non-health factors (see Lemma 1 of Chapter 2). Thus, only consistency with social preferences need be assessed.

Since the QALTs measure levels rather than changes in health, any aggregation of QALTs represents some transitive and reflexive ordering over health status profiles over society ($\Gamma$ cannot be a function of social preferences if intransitivities occur). The ordering need not be consistent with the social values of the benevolent social planner.

Dependence on Reference Health States

Recall that

$$m_i = M^i(q_i, t_i, \bar q) = f_i(U^i(q_i, t_i), \bar q), \tag{4.8}$$

so that

$$\begin{aligned}
\Gamma(m_1, \ldots, m_N) &= \Gamma(f_1(U^1(q_1, t_1), \bar q), \ldots, f_N(U^N(q_N, t_N), \bar q)) \\
&= \Gamma(M^1(q_1, t_1, \bar q), \ldots, M^N(q_N, t_N, \bar q))
\end{aligned} \tag{4.9}$$

and

$$(Q^A, T^A)\,\mathcal{R}_\Gamma\,(Q^B, T^B) \iff \Gamma(M^1(q_1^A, t_1^A, \bar q), \ldots, M^N(q_N^A, t_N^A, \bar q)) \ge \Gamma(M^1(q_1^B, t_1^B, \bar q), \ldots, M^N(q_N^B, t_N^B, \bar q)). \tag{4.10}$$

This implies the social ordering, $\mathcal{R}_\Gamma$, depends on the choice of the reference health state used in the time trade-off exercise.^6

This is clearly an undesirable property: while one health status profile might be preferred to another under one reference point, the ordering may be reversed under another reference point.
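A small numerical sketch may help make this reversal concrete. Everything in it is hypothetical: two individuals whose utilities are linear in time, $U^i(q,t) = a_i(q)\,t$, with person-specific flow values `FLOW_VALUES`, an additive aggregator, and two candidate reference states ("perfect" and "mild"). The profiles are chosen only to exhibit the effect.

```python
# Hypothetical flow values a_i(q); utility is assumed to be U_i(q, t) = a_i(q) * t.
FLOW_VALUES = [
    {"poor": 0.2, "fair": 0.3, "mild": 0.5, "perfect": 1.0},  # person 1
    {"poor": 0.2, "fair": 0.4, "mild": 0.9, "perfect": 1.0},  # person 2
]

# Each profile lists (health state, years of life) for person 1 and person 2.
PROFILE_A = [("fair", 20.0), ("poor", 20.0)]
PROFILE_B = [("poor", 20.0), ("fair", 16.0)]


def qalt(person, state, years, reference):
    """Time trade-off answer m solving a_i(q) * t = a_i(reference) * m."""
    values = FLOW_VALUES[person]
    return values[state] * years / values[reference]


def additive_index(profile, reference):
    """Sum of individual QALTs, all measured against the same reference state."""
    return sum(
        qalt(person, state, years, reference)
        for person, (state, years) in enumerate(profile)
    )


def preferred_profile(reference):
    """Name of the profile ranked higher by the additive index."""
    a = additive_index(PROFILE_A, reference)
    b = additive_index(PROFILE_B, reference)
    return "A" if a > b else "B"


if __name__ == "__main__":
    for reference in ("perfect", "mild"):
        print(reference, preferred_profile(reference))
```

Under these assumed numbers the index ranks B above A when perfect health is the reference state and A above B when the mild state is used, although neither the states occupied nor the utilities attained differ between the two calculations. The reversal arises because the two people value the mild reference state differently.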
Even though the actual states and satisfaction levels achieved do not change, the ordering of the two profiles is reversed. While the reference point is usually fixed at perfect health, so that such asymmetries do not occur in practice, it is perverse that the ordering of states should depend on one state which is not among those ranked. It implies that the aggregator function cannot be purely welfarist, so that health states cannot be evaluated simply by the well-being associated with them.

Roberts (1980) addresses a similar problem in evaluating income distributions with reference prices. He found that independence requires individual and social preferences to interact in a particular way. The problem may be restated for health states: independence requires that the ordering of health status profiles be invariant to the choice of reference point,

$$\begin{aligned}
&\Gamma(f_1(U^1(q_1^A, t_1^A), \bar q), \ldots, f_N(U^N(q_N^A, t_N^A), \bar q)) \ge \Gamma(f_1(U^1(q_1^B, t_1^B), \bar q), \ldots, f_N(U^N(q_N^B, t_N^B), \bar q)) \\
&\iff \Gamma(f_1(U^1(q_1^A, t_1^A), \bar q'), \ldots, f_N(U^N(q_N^A, t_N^A), \bar q')) \ge \Gamma(f_1(U^1(q_1^B, t_1^B), \bar q'), \ldots, f_N(U^N(q_N^B, t_N^B), \bar q'))
\end{aligned} \tag{4.11}$$

for all $\bar q, \bar q'$ in the set of available reference points.

^6 It is this reference point which makes the individual valuations cardinal and allows their aggregation (in Arrow's model, the independence of irrelevant alternatives is violated). This imposed comparability invokes a certain ethical position between individuals (this is discussed in the section on ethics).

This condition restricts the nature of $\mathcal{R}_\Gamma$, the social preferences described by the decision statistic, $\Gamma$. These restrictions are summarized in the following five theorems.

Theorem 1: The ordering, $\mathcal{R}_\Gamma$, is independent of $\bar q$ (the reference health state) if and only if it is consistent with a Bergson-Samuelson social welfare function.

Proof: see Appendix F.

In addition, independence in the social preferences places joint restrictions on the functional form of the decision statistic and individual preferences.
Consider the following theorems, which describe what individual preferences must be, given four popular ethical positions on distribution.

Theorem 2: Given that $\Gamma$ is additively separable, i.e.

$$\Gamma(m_1, \ldots, m_N) \doteq \sum_i \phi_i(m_i) \tag{4.12}$$

(where "$\doteq$" means "is ordinally equivalent to", such that the two expressions are related by an increasing monotonic transform), where each $\phi_i$ is continuous and increasing, then $\mathcal{R}_\Gamma$ is independent of the reference point, $\bar q$, if and only if

$$U^i(q_i, t_i) \doteq a(q_i)\,\phi_i(t_i) + b_i(q_i) \tag{4.13}$$

for all $i = 1, \ldots, N$.

Proof: see Appendix F.

Theorem 3: Given that $\Gamma$ is linear (inequality neutral) and symmetric, i.e.

$$\Gamma(m_1, \ldots, m_N) \doteq \sum_i m_i, \tag{4.14}$$

and that Condition N holds, $\mathcal{R}_\Gamma$ is independent of the reference point, $\bar q$, if and only if

$$U^i(q_i, t_i) \doteq a(q_i)\,t_i \tag{4.15}$$

for all $i = 1, \ldots, N$.

Proof: see Appendix F.

Theorem 4: Given that $\Gamma$ is symmetric Cobb-Douglas, i.e.

$$\Gamma(m_1, \ldots, m_N) \doteq \prod_{i=1}^{N} m_i, \tag{4.16}$$

$\mathcal{R}_\Gamma$ is independent of the reference point, $\bar q$, if and only if

$$U^i(q_i, t_i) \doteq b_i(q_i)\,t_i^{\,a(q_i)} \tag{4.17}$$

for all $i = 1, \ldots, N$.

Proof: see Appendix F.

Theorem 5: Given that $\Gamma$ is maximin (extreme inequality aversion), i.e.

$$\Gamma(m_1, \ldots, m_N) \doteq \min\{m_1, \ldots, m_N\}, \tag{4.18}$$

and that for each $\bar q$, the range of $M^i(\cdot, \cdot, \bar q)$ is the same for all $i$, then $\mathcal{R}_\Gamma$ is independent of the reference point, $\bar q$, if and only if

$$U^i(q_i, t_i) \doteq U^k(q_k, t_k) \tag{4.19}$$

for all $i = 1, \ldots, N$. That is, all individuals have identical preferences.

Proof: see Appendix F.

Notice the restrictions on individual preferences implied by the above. The value of an extra year in the reference health state must be the same, regardless of to whom it accrues. Unless the choice of reference state can be limited, this restriction must apply to all health states.^7 This assumes a degree of preference similarity across the individuals in society.

Normally, the preferences of individuals must be taken as given and the social preferences are chosen.
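To see why a restriction of the kind in Theorem 3 delivers independence, the sufficiency direction can be checked directly. The short derivation below is only a sketch of that one direction (the full proof is in Appendix F) and assumes $a(\bar q) > 0$ for every admissible reference state.

```latex
\begin{align*}
U^i(q_i,t_i) = a(q_i)\,t_i
  &\;\Longrightarrow\; a(q_i)\,t_i = a(\bar q)\,m_i
  \;\Longrightarrow\; m_i = \frac{a(q_i)}{a(\bar q)}\,t_i, \\
\Gamma(m_1,\dots,m_N) = \sum_{i=1}^{N} m_i
  &= \frac{1}{a(\bar q)} \sum_{i=1}^{N} a(q_i)\,t_i .
\end{align*}
```

Because $a(\cdot)$ is common to all individuals, a change of reference state from $\bar q$ to $\bar q'$ only rescales the index by the positive constant $a(\bar q)/a(\bar q')$, so the ranking of any two health status profiles is unchanged. If the function $a$ differed across individuals, the scaling factor would be person-specific and could not be pulled outside the sum.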
If, as is widely believed, time is discounted geometrically, the only ethical position above that can be taken without inducing reference point dependence is the Rawlsian, and this only if preferences are identical.

These results are disappointing since they suggest, given realistic assumptions about individual preferences, that there is no aggregator of QALTs that is consistent with a Bergson-Samuelson social welfare function.^8

^7 In Chapter 2, the extended sympathy QALY instrument was developed. It differed from the established QALY instruments in its use of a reference individual as well as a reference health state. Hence, resultant QALY comparisons are consistent with interpersonal utility comparisons. In this case, conditions for independence apply only to the reference individual's utility function: in the additive SWF case, independence requires $U^r(q,t) = \phi^r(a(q)t + b(q))$ (i.e. the reference individual's utility function must be quasi-homothetic). While this condition must apply to all individuals (since any individual may be chosen to be the reference individual), no cross-person restrictions apply (e.g. it is not necessary that $a^i = a^k$). Hence, the conditions for independence when the extended sympathy instrument is used are less restrictive than when any of the established instruments are used. Only the extended sympathy case can reflect that the marginal value of a year of life in any given health state can be of different value to different people.

^8 This is quite apart from the concern of Wagstaff (1991) that such a function is defined over only a partial welfare space.

4.2.4 Ethical Content

The previous section suggests that the ethical flexibility of the social welfare function is limited by the structure of individual preferences if welfarism is maintained. If the welfarism condition is relaxed, can the ethical position of a QALY based societal health status index be made more defensible?

At a minimum, the social welfare function should be increasing in each individual's level of health. But the ethical position should also incorporate some amount of inequality aversion: a society in which some individuals live long healthy lives while others live short and miserable lives is less preferred than one where all individuals enjoy a moderate amount of good health. To reflect such a position, the aggregator function must be S-concave^9 across levels of individual health status.

^9 If the social welfare function, $W$, is symmetric and quasi-concave, then it satisfies S-concavity. $W$ is quasi-concave if its upper level sets are convex.

Inequality Aversion

The first issue is whether the index can incorporate inequality aversion. If it cannot, the index is inadequate for any policy evaluations where the distribution of outcomes is considered important.

The aggregator function can easily be chosen so as to be concave in its arguments (the QALTs). This implies inequality aversion to the distribution of well-being only if the QALT is concave in time (since an increasing concave function of a concave function is itself concave, but a concave function of a convex function need not be). This again depends on the structure of preferences.

Theorem 6 (Blackorby and Donaldson [1988]): $M^i(q_i, t_i, \bar q)$ is concave in $t$ if and only if

$$U^i(q_i, t_i) \doteq a_i(q_i)\,t_i. \tag{4.20}$$

Proof: see Appendix F.

Since preferences are probably strictly concave in $t$ (as they are when time is discounted geometrically), the aggregation of QALTs may imply perverse social preferences where social states characterized by greater inequality are assigned higher values than states with less inequality.

Combining the results from Theorems 2 to 5 and 6, the only ethical position which satisfies inequality aversion and welfarist principles is the utilitarian (i.e. $\Gamma \doteq \sum_i M^i(U^i, \bar q)$), and this only under the assumption that individuals do not discount over time.

Horizontal Equity

Torrance (1986, p. 17) has argued in favour of the arithmetic mean as an aggregator function, claiming it is ethically just because it assigns equal weight to a year in perfect health for each individual. However, if preferences vary, the aggregation involved is horizontally inequitable.

QALY analysis imposes $M^i(\bar q, 1, \alpha_i) = 1$ for every individual $i$, where $\bar q$ is perfect health and $\alpha_i$ describes the personal characteristics on which preferences are conditioned. In addition, the individual selects a morbid state deemed equivalent to death (a variable reference point) such that $M^i(q(\alpha_i), 1, \alpha_i) = M^i(q, 0, \alpha_i) = 0$, where $q(\alpha_i)$ is the death equivalent. Suppose there are two agents, A and B, and that A is more averse to death than B (i.e. $q(\alpha_A) < q(\alpha_B)$). Assuming both individuals' preferences are homothetic in time, so that the conditions for Theorem 3 are satisfied and the aggregator function reflects the inequality neutral equity position,

$$M^i(q, t, \alpha_i) = \frac{a^i(q) - a^i(q(\alpha_i))}{a^i(\bar q) - a^i(q(\alpha_i))}\,t. \tag{4.21}$$

Assume only one element of $\alpha_i$, $\xi$, affects this choice of death equivalent and differentiate with respect to this element:

$$\operatorname{sign}\frac{\partial M^i}{\partial \xi} = -\operatorname{sign}\frac{\partial q(\alpha_i)}{\partial \xi} \tag{4.22}$$

and

$$\operatorname{sign}\frac{\partial^2 M^i}{\partial q\,\partial \xi} = \operatorname{sign}\frac{\partial q(\alpha_i)}{\partial \xi}. \tag{4.23}$$

This reflects that agent A is more likely to prefer projects that increase longevity, while B is more likely to prefer projects which improve the quality of life. Suppose A and B occupy the same health state and that a project which improves quality of life may be given to only one of them.
Both would prefer to have the project, since, for a given length of life, both prefer better morbid states to worse morbid states. Yet B will always be chosen to receive the project even though the aspect over which preferences vary (death) is unaffected by the project. Even though linearity exists in individual and social preferences, egalitarianism (defined as equal entitlement to the same resources by people who occupy the same observed state) is not ensured. This is an example of how ethics may be capricious when cardinality is assumed to impose comparability.

4.2.5 Summary

It is clear that aggregated QALTs do not generate completely satisfactory indexes of health status, even when individual QALTs are measured in a way completely consistent with individual preferences. Identical individual preferences are neither necessary nor sufficient for aggregated QALTs to reflect acceptable social preferences. Of greatest concern is that the curvature in individual preferences may cause aggregated QALTs to favour extreme distributions of health status, rather than more equal distributions. This contradicts the egalitarian spirit with which such QALY based indexes were first developed (Torrance [1986]).

Problems of incompleteness and cross-person assessments which are inconsistent with interpersonal utility comparisons can be circumvented by slight modifications in the measurement instrument; but the other concerns raised above are inherent in the QALY approach. Thus, some ethical compromises must be made. However, such compromises are characteristic of all well-being based indexes, and those associated with QALY based health indexes are more agreeable than most of the alternatives. Whether it is feasible to implement an index of this nature is now assessed.

4.3 Empirical Assessment

This section examines the feasibility of constructing a QALY based societal health index given current data availability.
The theoretical model described above requires a tremendous amount of data since it is based on the value of the morbidity path over the entire lifetime. Such data are not available, nor are they apt to be in the near future. This section describes what approximations must be accepted if only currently available data are used in the construction of such an index.

4.3.1 Data

Data available include life expectancy (Statistics Canada [1991]), a point in time morbidity profile, stratified by age (Statistics Canada [1987]), and preferences over these morbid states. These data represent all adult (age 15 and over) Canadians living in the ten provinces (the territories are excluded). These data are supplemented with data on persons living in institutions (Statistics Canada [1990]). Morbidity aspects include endurance, agility, and perception (long-term physical ill-health), role and socio-emotional function, and short-term ill-health. The path of morbidity over the lifetime remains as yet unknown. Data sources and the following procedure are described in greater detail in the data appendix (Appendix G).

4.3.2 Procedure

Briefly, the procedure is as follows.

i) QALY weights for the non-institutionalized population are calculated from the General Social Survey data. First, reports of health satisfaction are compared to reports of morbidity factors, including chronic ill-health (endurance, perception, and agility), short-term ill-health, social ill-health and emotional ill-health. Each factor is represented by a binary structure and may be either present or absent and, if present, severe or not severe. The construction of these morbidity factors is discussed in Appendix G. Observations with multiple chronic or short-term morbid states are deleted so that estimates reflect the marginal disutility of the morbid state as taken from a point of perfect health and not any cross-effects from other illnesses. The function is estimated by Probit analysis because the satisfaction responses are reported as ordered categories (it must be assumed that all individuals use the same time frame and the same cardinal scale for these responses). These estimates are given in Table G.1. Satisfaction values for all possible configurations of morbid states are then reconstructed using a functional form chosen to be consistent with a multiplicative multi-attribute utility function.^10 Second, morbid states in the G.S.S. are matched with QALY values reported in the literature. Ten such matches were found (these are given in Appendix G). These QALY values were then regressed against the above estimated satisfaction values associated with these states, with the restriction imposed that perfect health be assigned a value of one. A logarithmic functional form was found to provide the best fit of this estimated relationship. All satisfaction values derived in the first step are transformed according to this estimated relationship so that they are consistent with a time trade-off scale. This procedure is repeated separately for men and women since their estimated satisfaction functions are found to be significantly different from each other. These transformation functions are given in Appendix G as well.

ii) Because the sampling methods used in the G.S.S. excluded persons in institutions, including those institutionalized because of ill-health, the G.S.S. data provide an over-estimate of the health of Canadians. To adjust for this bias, numbers of persons institutionalized for ill-health by age, province, and sex are taken from the H.A.L.S. (1990) and are incorporated into the morbidity data base (these figures are adjusted for population growth between 1985 and 1986 and for persons living in the territories to make the population base comparable with that of the G.S.S.; see Appendix G for details). The state of being institutionalized is then assigned a range of values found in the literature (.33 to .56).

iii) Measures of morbidity for five and ten year age groupings are then calculated. Morbid states endured by people falling within these age groups are weighted by their estimated QALY values and summed. The arithmetic means of these QALY-weighted morbidity values are calculated using the population weights in the G.S.S. and the adjusted population counts from the H.A.L.S.

iv) The expectation of living to a given age is calculated, conditioned on being alive at age fifteen (this is necessary to make the population base comparable with that of the G.S.S.). These expectations are then summed according to the same age groupings on which the average estimated QALYs are based.

v) Life expectancy is then weighted by the average QALY value for each age grouping, i.e.

$$HSE = \sum_{t=15}^{80} \left( \tfrac{1}{2} P_D(t \mid 15) + P_S(t \mid 15) \right) \left( \frac{1}{N_t} \sum_{j=1}^{N_t} \hat\varphi(q_{j,t}) \right), \tag{4.24}$$

where $HSE$ is the estimated health status index, $P_D(t \mid 15)$ is the probability of dying in the $t$th year given the individual lived to age 15 (this is multiplied by 1/2 on the assumption that people, on average, die at the mid-point of the time interval), $P_S(t \mid 15)$ is the probability of surviving to the $t$th year given the individual survived till age 15, $N_t$ is the number of people in age category $t$ for whom morbidity data are available (weighted as described above), and $\hat\varphi(q)$ is the estimated QALY value for health state $q$. Standard errors for these estimates are approximated using a $\delta$-method as described in Rothenberg (1984), under the assumption of independence between the estimators of the value of health states (the estimated QALYs) and the prevalence of those states (the arithmetic averages).

^10 The discussion in Chapter 2 explains why this estimation procedure is appropriate.

$HSE$ is an estimate of the quality adjusted life expectancy for adult Canadians living in the provinces. How good an estimate it is depends on the appropriateness of the following three preference assumptions, necessary to statistically link the available data sets to construct a societal health index.

i) Additivity of Preferences.

The value of the lifetime path for morbidity must be constructed by adding the values of health states endured for subperiods of life, rather than valuing the lifetime health status as a whole. The bias associated with this procedure depends on the structure of preferences.

Lemma 1: The sum of QALYs defined over periods of life less than full life is equivalent to the QALT if and only if preferences may be represented as

$$U(q, t, \kappa) = \phi(a(q, \kappa)\,t). \tag{4.25}$$

Proof: see Appendix F.

Since observed preferences are typically concave in time, the sum of QALYs typically over-estimates the true quality adjusted lifetime.

ii) Independence from Time Frame.

G.S.S. satisfaction levels are regressed on QALY values in the literature. The latter are based on a time frame of 70 years. The former are not dimensioned by a fixed time. In fact, there is no reason to believe the time frames used by respondents do not vary. If responses depend on time, such a procedure is biased.

Lemma 2: QALYs are independent of time if and only if preferences may be represented by

$$U(q, t, \kappa) = \phi(a(q, \ldots\ \text{[illegible]} \tag{4.26}$$

Proof: see Appendix F.

Since QALYs are believed to be concave in time, QALY values estimated over longer time frames will be less than QALYs estimated over shorter time frames. Since the G.S.S. satisfaction responses are probably based on some time frame less than the lifespan, the estimated function which transforms the G.S.S.
preferences to QALY values will produce QALY estimates that are biased downwards.

iii) Strict Independence Must Exist Between (1) Mortality and Morbidity in Social Preferences or (2) in Their Bivariate Joint Distribution.

Since the true bivariate distribution is unknown (the G.S.S. data are not linked to the mortality data), one either has to combine time alive and morbidity in a fashion that is independent of this distribution or make assumptions about this distribution given the aggregator function chosen.

Consider first the restrictions which must be imposed on the functional form of the SWF if no restrictions are imposed on the bivariate distribution. Given that the true SWF, $W$, is defined over QALTs and the estimated SWF, $\hat W$, is defined, in the absence of data linking the occurrence of morbidity with mortality, over a QALY index, $I(\varphi(q))$, and a life expectancy index, $J(t)$, then unbiasedness ensues if and only if

$$W(\varphi(q_1)t_1, \ldots, \varphi(q_N)t_N) = \phi(\hat W(I(\varphi(q)), J(t))). \tag{4.27}$$

But this is a Pexider equation with the solution

$$W(\cdot) = a \prod_i (\varphi(q_i)\,t_i)^{b_i}. \tag{4.28}$$

Let $b_i$ be the proportion of people in states $(\varphi(q), t)$, $\varphi(q)$, and $t$ respectively. Then the condition for unbiasedness may be expressed

$$\prod (\varphi(q_i)\,t_i)^{\Pr(\varphi(q),\,t)} = \prod \varphi(q_i)^{\Pr(\varphi(q))} \prod t_i^{\Pr(t)}. \tag{4.29}$$

Taking logarithms yields

$$\sum_t \sum_q \Pr(\varphi(q), t) \ln(\varphi(q_i)\,t_i) = \sum_q \Pr(\varphi(q)) \ln \varphi(q_i) + \sum_t \Pr(t) \ln t_i. \tag{4.30}$$

Using the properties of logarithms and probabilities, (4.30) may be re-expressed as

$$\sum_t \Pr(t) \ln(t_i) + \sum_q \Pr(\varphi(q)) \ln(\varphi(q_i)) = \sum_q \Pr(\varphi(q)) \ln \varphi(q_i) + \sum_t \Pr(t) \ln(t_i). \tag{4.31}$$

This result holds regardless of the joint distribution between morbidity and mortality. However, the solution is based on lifetime QALYs (which are unavailable). If the function is based on QALT segments, each segment is treated as a different but far less well off person and the Cobb-Douglas social preferences underestimate the true value of societal health.
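The step from (4.30) to (4.31) can be checked on a toy example. The sketch below uses two made-up joint distributions over a QALY weight and a length of life, one with dependence and one without, that share the same marginals; the numbers and helper names are hypothetical and serve only to show that the probability-weighted log index is the same whether it is computed from the joint distribution or from the two marginals.

```python
import math
from collections import defaultdict

# Hypothetical joint distributions over (QALY weight, years of life).
# Both have the same marginals; only the dependence between the two differs.
CORRELATED = {(0.5, 40.0): 0.4, (0.5, 60.0): 0.1, (1.0, 40.0): 0.1, (1.0, 60.0): 0.4}
INDEPENDENT = {(0.5, 40.0): 0.25, (0.5, 60.0): 0.25, (1.0, 40.0): 0.25, (1.0, 60.0): 0.25}


def marginals(joint):
    """Marginal distributions of the QALY weight and of years of life."""
    p_phi = defaultdict(float)
    p_t = defaultdict(float)
    for (phi, t), p in joint.items():
        p_phi[phi] += p
        p_t[t] += p
    return dict(p_phi), dict(p_t)


def log_index_from_joint(joint):
    """Left-hand side of (4.30): expected log of the weight times years."""
    return sum(p * math.log(phi * t) for (phi, t), p in joint.items())


def log_index_from_marginals(joint):
    """Right-hand side of (4.30), which uses only the two marginals."""
    p_phi, p_t = marginals(joint)
    weight_part = sum(p * math.log(phi) for phi, p in p_phi.items())
    years_part = sum(p * math.log(t) for t, p in p_t.items())
    return weight_part + years_part


if __name__ == "__main__":
    for name, joint in (("correlated", CORRELATED), ("independent", INDEPENDENT)):
        print(name, log_index_from_joint(joint), log_index_from_marginals(joint))
```

Both distributions give the same value on both sides, which is the sense in which the multiplicative (Cobb-Douglas) form does not require knowledge of how morbidity and mortality are jointly distributed.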
Thus, this solution is not very practical.

Alternatively, one can begin with a particular social welfare function and determine what sorts of conditions must be imposed on the distribution function. Since the inequality neutral SWF is invariant to the use of piecemeal QALY values, begin with the function

$$W(\varphi(q)_1, \ldots, \varphi(q)_N, t_1, \ldots, t_N) = \int_t \int_{\varphi(q)} (\varphi(q)t)\, f(\varphi(q),t)\, d\varphi(q)\, dt = E(\varphi(q)t), \tag{4.32}$$

where $f(\varphi(q),t)$ is the joint distribution function. The estimated function is

$$\hat{W} = \left(\int_{\varphi(q)} \varphi(q) f(\varphi(q))\, d\varphi(q)\right)\left(\int_t t f(t)\, dt\right) = E(\varphi(q))E(t). \tag{4.33}$$

But $E(\varphi(q)t) = E(\varphi(q))E(t) \iff \mathrm{Cov}(\varphi(q), t) = 0$ (i.e. morbidity and mortality are uncorrelated, which holds when their distributions are independent).

If morbidity and mortality are positively correlated (i.e. well people live longer), then the estimated health status index overestimates the true value of health status in society.

With these caveats in mind, attention is now turned to the calculated quality adjusted life expectancies.

Table 4.1: Quality Adjusted Life Expectancy, Canada and the Provinces

men        CDA     NWFD    PEI     NS      NB      QUE
E(t)       58.98   58.85   58.39   58.22   58.44   57.88
E(φt)      53.94   52.88   52.00   52.38   53.26   53.84
           (4.45)  (7.36)  (8.56)  (6.97)  (6.87)  (7.42)

           ONT     MAN     SASK    ALTA    BC
E(t)       59.32   59.07   59.90   59.57   60.03
E(φt)      53.74   53.61   55.08   53.87   55.40
           (7.76)  (6.86)  (7.13)  (6.64)  (7.22)

women      CDA     NWFD    PEI     NS      NB      QUE
E(t)       65.52   65.13   65.93   64.89   65.80   65.14
E(φt)      57.70   56.19   58.67   56.45   57.36   57.79
           (4.39)  (7.19)  (8.61)  (6.86)  (7.45)  (7.29)

           ONT     MAN     SASK    ALTA    BC
E(t)       65.47   65.73   66.47   65.82   66.13
E(φt)      57.34   57.01   57.56   58.24   59.30
           (7.23)  (6.74)  (7.52)  (7.17)  (7.40)

4.4 Results and Implications

4.4.1 Quality Adjusted Life Expectancy

The calculated quality adjusted life expectancies are given in Table 4.1 (standard errors are in brackets where applicable).
Life expectancy in Canada at age fifteen is 58.98 years for men and 65.52 years for women (ranging, for men, from 57.88 years in Quebec to 60.03 years in British Columbia and, for women, from 64.89 years in Nova Scotia to 66.47 years in Saskatchewan). After adjusting for quality, the national figures become 53.94 years for men and 57.70 years for women (ranging, for men, from 52.00 years in P.E.I. to 55.40 years in British Columbia and, for women, from 56.19 years in Newfoundland to 59.30 years in British Columbia).

These figures suggest that morbidity is an important component of ill health and that the population as a whole would be willing to give up ten per cent of life expectancy to eradicate ill-health while alive.[11] It is interesting to note the differential impact of the adjustment across the provinces. Before adjusting for quality, Quebec has the poorest average level of health, while the westernmost provinces have the best health indicators. After adjusting for quality, Quebec ranks fourth among the ten provinces, while, among men, only British Columbia maintains a clear advantage in the west, and the Atlantic provinces prove to be at a greater health risk than is indicated by mortality alone.[12][13]

Although the average falls after quality adjustment, the variance increases (the range nearly doubles). This may reflect different policies towards institutionalizing the severely disabled across the provinces. Since this effectively withdraws the sicker members from the sample used in the survey, the above estimates may be biased. To check this, the adjustment is repeated with observations on the disabled in institutions accounted for. These results are presented in Table 4.2[14] (again, standard errors appear in brackets where applicable).

[11] It is possible to make such claims because morbidity has been measured using a time trade-off instrument. Hence, the final index indicates how much time is worth how much morbidity because morbidity is measured in terms of time. This is one of the principal advantages of using QALY data rather than another health status index in the calculation of a societal health index.

[12] Such differences are not due to variations in tolerance for certain states (since average preferences are used) but to actual advantages in morbidity.

[13] One must be cautious when interpreting these figures since the standard errors of these estimates, particularly for the smaller provinces, tend to be quite high. The national figures are based on a much larger sample and are correspondingly that much more reliable. For this reason, most of the ensuing analysis focuses on national figures. It is interesting to note that the greater part of these high standard errors is driven by high variance in the morbid states achieved and not the estimated values of these states. This reinforces the argument that the arithmetic mean is an unsatisfactory index of societal health because it is incapable of reflecting this wide dispersion of achieved outcomes, and that additional data must be collected to allow such distributionally sensitive measurements.

[14] The unadjusted figures may differ between Tables 4.1 and 4.2 because aggregation occurs over 10 year periods instead of 5 year periods. If morbidity increases at higher ages, one would expect the 10 year averages to be higher than the 5 year averages. The results indicate this holds in most cases.

Table 4.2: Quality Adjustment with Institutional Data, Canada and the Provinces

men                   CDA     NWFD    PEI     NS      NB      QUE
E(t)                  58.98   58.85   58.40   58.22   58.44   57.88
E(φt | no I)          53.94   52.64   52.71   52.42   53.32   53.82
                      (4.45)  (7.36)  (8.56)  (6.97)  (6.87)  (7.42)
E(φt | φ(I) = .56)    53.72   52.46   52.53   52.27   53.14   53.58
                      (4.38)  (7.57)  (8.55)  (6.98)  (6.85)  (7.48)
E(φt | φ(I) = .33)    53.52   52.37   52.41   52.15   52.99   53.37
                      (4.38)  (7.57)  (8.55)  (6.98)  (6.85)  (7.48)

                      ONT     MAN     SASK    ALTA    BC
E(t)                  59.32   59.07   59.90   59.57   60.03
E(φt | no I)          53.72   53.78   55.15   53.91   55.45
                      (7.76)  (6.86)  (7.13)  (6.64)  (7.22)
E(φt | φ(I) = .56)    53.50   53.63   54.94   53.65   55.17
                      (7.71)  (7.04)  (7.02)  (6.56)  (7.14)
E(φt | φ(I) = .33)    53.30   53.49   54.75   53.39   54.95
                      (7.71)  (7.04)  (7.02)  (6.56)  (7.14)

women                 CDA     NWFD    PEI     NS      NB      QUE
E(t)                  65.52   65.13   65.95   64.89   65.80   65.14
E(φt | no I)          57.75   56.47   58.34   56.52   57.56   57.92
                      (4.39)  (7.19)  (8.61)  (6.86)  (7.45)  (7.29)
E(φt | φ(I) = .56)    57.40   56.25   58.13   56.29   57.10   57.52
                      (4.32)  (7.29)  (8.64)  (6.74)  (7.41)  (7.33)
E(φt | φ(I) = .33)    56.97   55.95   57.82   55.98   56.66   57.10
                      (4.32)  (7.29)  (8.64)  (6.74)  (7.41)  (7.33)

                      ONT     MAN     SASK    ALTA    BC
E(t)                  65.47   65.73   66.47   65.82   66.13
E(φt | no I)          57.34   57.15   57.58   58.28   59.33
                      (7.23)  (6.74)  (7.52)  (7.17)  (7.40)
E(φt | φ(I) = .56)    57.04   56.82   57.30   57.84   58.94
                      (7.11)  (6.88)  (7.51)  (7.04)  (7.30)
E(φt | φ(I) = .33)    56.60   56.43   56.91   57.35   58.52
                      (7.11)  (6.88)  (7.51)  (7.04)  (7.30)

The effect of the institutional adjustment on rankings is minimal. However, the adjustments are directly related to the healthiness of the population (i.e. the higher the quality adjusted health status of the province, the higher the rate of institutionalization), with the two most western provinces and Quebec showing the highest proclivity to institutionalize, and the Atlantic provinces showing the least. Thus, while differential rates of institutionalization account for part of the differential, they are not sufficient to explain away the patterns observed.
Furthermore, the adjustment is greater for women than for men in every case, suggesting that women are institutionalized at a much higher rate and that the non-institutionalized quality adjusted figures are biased in favour of women.

4.4.2 Male-Female Differentials

The above results suggest that, while women may live longer than men, they may not live as well. In fact, the advantage is nearly halved between the unadjusted figures and the quality adjusted figures (the unadjusted life expectancies give women a 6.54 year advantage over men; this falls to 3.76 years after adjusting for non-institutional morbidity and 3.45 years after adjusting for institutionalization as well).

A residual test indicates preferences for health states differ significantly between men and women (see Appendix G for details: women generally associate more disutility with short term ill-health and severe cases of ill-health than men, while men associate greater disutility with chronic and more moderate cases of ill-health). For this reason, the estimation procedure is repeated for each group separately. The linkage to the QALY values is complicated by the fact that the QALY data are averaged over both sexes (supposedly because the values do not differ between the two groups; see Torrance et al. [1982] and Torrance [1976b]). Estimated satisfaction levels are converted to estimated QALY values both by the common transformation used before and by sex-specific transforms that are estimated on the assumption that QALY values are the same for the two groups (the latter of these two methods is probably the more theoretically sound; see the data appendix for details).

Table 4.3: Male-Female Differentials, Canada

             no adjust.   joint pref.    spec. pref.    spec. pref.
                          joint trans.   joint trans.   spec. trans.
men          58.98        53.94          53.12          54.01
                          (4.45)         (4.85)         (5.74)
women        65.52        57.70          57.05          55.73
                          (4.39)         (5.23)         (4.34)
difference   6.54         3.76           3.93           1.72
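The differentials reported in Table 4.3 follow directly from the sex-specific point estimates. As a quick arithmetic check, the sketch below recomputes each difference from the figures in the table. The dictionary keys are illustrative labels for the table's four columns rather than names used in the thesis, and the standard errors are ignored.

```python
# Point estimates of (quality adjusted) life expectancy at age 15, in years,
# copied from Table 4.3. The keys are hypothetical labels for its columns.
table_4_3 = {
    "unadjusted": {"men": 58.98, "women": 65.52},
    "joint pref., joint trans.": {"men": 53.94, "women": 57.70},
    "specific pref., joint trans.": {"men": 53.12, "women": 57.05},
    "specific pref., specific trans.": {"men": 54.01, "women": 55.73},
}


def differential(row):
    """Female advantage in years: women's estimate minus men's estimate."""
    return round(row["women"] - row["men"], 2)


differentials = {label: differential(row) for label, row in table_4_3.items()}

# Share of the unadjusted advantage that survives the fully sex-specific
# adjustment (last column of the table).
share_remaining = (
    differentials["specific pref., specific trans."] / differentials["unadjusted"]
)

for label, years in differentials.items():
    print(f"{label}: {years:.2f} years")
print(f"share of unadjusted advantage remaining: {share_remaining:.3f}")
```

On these figures the fully sex-specific adjustment leaves a little over a quarter of the unadjusted 6.54 year advantage.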
These results are presented in Table 4.3 (standard errors appear in brackets).

Using common preference functions and common QALY transforms, the non-institutional difference is 3.76 years. With specific preference functions and common transforms, the difference is 3.93 years. With specific preference functions and specific transforms, the difference falls to only 1.72 years, a quarter of the unadjusted differential. One possible interpretation of these results is that women are prepared to give up about twice as much longevity as men in order to spend their remaining years in perfect health. This is in part due to the fact that women experience more illness while alive than men (the difference is apparent when common values are used to weight morbid states), and in part because women seem to place greater value on living well rather than longer (the differences are greater when gender-specific values of morbid states are used).

While these results must be accepted cautiously because of the high standard errors, they do suggest that, overall, more resources should be devoted to women's health than had heretofore been considered appropriate. Especially in medical research directed at alleviating ill-health rather than prolonging life, women should be given more consideration than men, not less.

Table 4.4: Morbidity by Major Category, Men

illness     prevalence  prevalence  disutility  disutility  societal health
            (present)   (severe)    (present)   (severe)    (all)
endurance   14.22       2.56        .1300       .0087       1.14
                                    (.006)      (.115)      (.176)
role        10.40       1.19        .1818       .3175       1.49
                                    (.013)      (.095)      (.385)
emotional   3.20        N/A         .2206       N/A
                                    (.021)
social      12.71       N/A         .0183       N/A         .500
                                    (.003)                  (.215)
hearing     8.55                    .0071
                                    (.006)
sight       3.37        .72         .0626       .1041       .280
                                    (.023)      (.153)      (.259)
short       8.02        5.15        .0551       .1184       .810
                                    (.009)      (.011)      (.247)
agility     5.82        .80         .0643       .1386       .25
                                    (.023)      (.651)      (.406)

4.4.3 The Importance of Morbidity

It appears that morbidity significantly detracts from health in Canada.
This is hardly surprising for an industrialized country. But if more resources are to be targeted at the alleviation of morbidity, where should they best be directed? The impact of morbidity has been assessed in terms of prevalence and the disutility of any given factor, but not in terms of social disutility. Results for all three measures are given in Tables 4.4 and 4.5. (Note: prevalence is the percentage of the population with the condition, disutility is the estimated marginal amount of dissatisfaction taken at perfect health, and societal health is the mean number of quality adjusted life years that is achieved if the condition is eradicated.)

Note that the social and perception values for societal health include both social and emotional, and hearing and sight components respectively.

Table 4.5: Morbidity by Major Category, Women

illness     prevalence  prevalence  disutility  disutility  societal health
            (present)   (severe)    (present)   (severe)    (all)
endurance   23.67       2.56        .0725       .1471       2.17
                                    (.003)      (.042)      (.180)
role        12.47       1.19        .1412       .1801       2.10
                                    (.013)      (.057)      (.388)
emotional   3.56        N/A         .1468       N/A
                                    (.17)
social      9.59        N/A         .0070       N/A         .560
                                    (.003)                  (.203)
hearing     7.00                    .0721
                                    (.010)
sight       4.58        .92         .0273       .0965       .330
                                    (.18)       (.153)      (.217)
short       12.90       7.12        .0928       .1535       1.33
                                    (.005)      (.007)      (.246)
agility     9.26        .86         .0011       .3772       .40
                                    (.017)      (.254)      (.380)

Problems of endurance are the most prevalent, affecting nearly 20 per cent of the Canadian population, while endurance and short-term ill-health are the most prevalent severe illnesses. Social ill-health is relatively more prevalent among men, while short term ill-health is relatively more prevalent among women. Emotional ill-health is the least prevalent morbid effect for both groups.

The estimated disutility is greatest for morbidity in the role category, and smallest in the perception category. For men, the role category is far more important than for women. Social and emotional function are also more important.
Women, on the other hand, are far more concerned about agility and relatively more concerned with short-term episodes of ill-health.

The aggregate effects are, as is to be expected, a combination of the two results above. The most significant category for men is role, while for women it is endurance (although both categories rank highly for both men and women). The least important categories are agility for men and perception for women (again, both categories are relatively unimportant for both men and women). The societal index gives a clear indication of where resources should be targeted to alleviate morbidity. While the components of the endurance category fall plainly in the medical realm, one has to consider if the role category could be better approached through more socially oriented programs. This lends support to the idea that health and welfare programs should be integrated to achieve the greatest returns.

4.5 Avenues for Future Research

This paper has identified a number of problems associated with QALY based measures of societal health. The supposed advantage of such measures is their relationship to preferences, both individual and social. Yet these relationships have been shown to be flawed.

The ranking of different health status profiles may depend on some health state not even under consideration. The comparability assumptions imposed by this reference state are also unsatisfactory. More distressing, the QALY based index may be inconsistent with the distributional ethics held by the benevolent social planner, even when the aggregator function is chosen to accommodate these ethics. The inconsistencies between the QALY based index and social preferences are theoretical in nature and cannot be overcome by improved data collection.
One should recognize that social orderings of this nature are invariably flawed, that the flaws associated with this index have been identified, and that it may well be the best compromise available.

QALY based measures may also distort the preferences of individuals, although this problem is empirical rather than theoretical, and can be overcome by improved data collection. The first problem is to identify the paths of morbidity over lifetimes. While it is impractical to wait for every member of society to complete his or her morbidity path, longitudinal surveys (such as the Canada Sickness Survey [1954]), rather than the periodic ones currently planned, could be used to estimate the transition matrix over time and the joint distribution over morbidity and mortality. The second problem is to collect valuation data over these states. The exercise carried out above is not completely satisfactory since satisfaction levels are reported on an arbitrary scale that has no linkage to the value of life and may vary across individual respondents. Furthermore, values constructed from periodic QALYs (rather than QALTs) are apt to be biased and correction factors need to be estimated. This requires a much broader set of QALY values than currently exists. Finally, if the collected QALY data support the hypothesis that QALY values differ across people, or that different individuals value health differently (or should have their health valued differently from the rest of the population), the assumption of identical preferences must be relaxed and an index with interpersonal comparability properties must be adopted instead of the time trade-off.

Chapter 5

Conclusion

One of the problems in health care evaluation is the measurement of health outcomes. "Natural" or clinical units (such as number of cases of a disease in a population) are unsatisfactory because they have no underlying value basis.
It is impossible to compare different health conditions or different changes in the same condition. Standard economic measures (such as willingness-to-pay) are, on the other hand, badly distorted by imperfections in health and related markets (e.g. supplier induced demand), and favour allocations that are biased towards the wealthy. Such values may not reflect the true "worth" of a health state, and the worthiness of an individual's health depends on his or her income.

The QALY (quality adjusted life year) has been put forth as a solution to this measurement problem. It is a health index whose weights are based on preferences for health states (hence, it has a value foundation), and these values are free of many of the less attractive features of the standard economic measures. QALYs are now used to (a) choose treatments for a given individual (e.g. cancer therapy), (b) choose individuals for a given treatment (e.g. organ transplants), and (c) choose programs for funding (e.g. Oregon Medicaid reforms, Ontario formulary lists). Unfortunately, it is not clear if such allocations are appropriate. While there has been considerable research done on the intrapersonal properties of QALYs (i.e. whether they are utility numbers or not), there has been an inadequate investigation of the interpersonal and aggregation properties that are involved, particularly in the latter two types of decisions. This thesis investigates these neglected welfare properties.

The first essay addresses two issues of how to obtain QALY values: how to value health states and how to identify which health states must be valued. The first goal of this chapter is to identify valid measurement instruments (which convert preferences for health states to a numerical scale). The definition of construct validity employed explicitly incorporates how QALY values are used to make policy decisions.
Two necessary conditions for validity are derived from this concept: QALY values must represent the individual's preferences over health states and QALY values should be independent of the level of non-health factors. All known QALY instruments are then evaluated for consistency with these conditions. All QALYs are found to rank health states appropriately, but none can rank improvements (or changes) in health (without imposing unrealistic restrictions on preferences). Nor does any QALY instrument generate values that are independent of context. It is concluded that there is no "gold standard" instrument. It is shown that the presumption that the standard gamble instrument is theoretically correct (or that the time trade-off instrument is theoretically incorrect) is wrong. Policy makers must instead choose the QALY instrument which fits the type of decision to be made. For decisions involving different lengths of life, a time trade-off instrument should be used, while for decisions involving different numbers of people, a person equivalent instrument should be used. In general, the description of the reference state should be expanded to include a fixed level for every aspect of the state that could change as a result of the health care projects being compared.

One significant contribution made in this chapter is the development of the extended sympathy instrument. The literature to date has focused on the measurability properties of QALY values, and comparability properties, which are involved whenever choices between individuals are made, have been neglected. Unlike any other instrument, the extended sympathy instrument is based on comparable utility functions. The values obtained are thus appropriate whenever the policy decision involves choosing between individuals.
The extended sympathy instrument also provides one method to improve the interpersonal ethics involved in some QALY statistics discussed in subsequent chapters.

The second part of this chapter examines whether any of the reconstruction methods proposed by Keeney and Raiffa (1976) apply over broadly defined categories of ill-health. This is the only known example where mutual preference independence over widely applicable health conditions is treated as the null hypothesis. Additive structures are found to be biased, but multiplicative structures are found to be valid. This result suggests there can be significant cost savings in QALY data collection if a "utility index" approach is adopted. When a large variety of health conditions must be valued (for instance, when system-wide evaluations are to be undertaken, such as was done recently in Oregon), the most cost-effective method of obtaining QALY values is to identify the attributes of each condition and the value for each attribute. The value for any condition can be found just from the values for its component attributes. This result also suggests that other health status indexes, most of which have an additive structure, do not appropriately value health states.

The third chapter assesses if the types of allocation decisions suggested by QALY indexes are appropriate and whether the alternative measures (willingness-to-pay and human capital) perform any better. Different but mutually consistent ethical criteria are imposed on the different types of decisions which use these values.

When choosing a treatment for a given individual, the goal is to give the patient the health outcome he or she most prefers. Although this "patient centred ethic" is gaining widespread acceptance, it contradicts the paternalism that occasionally appears in health policy. Both the healthy year equivalent (the QALY index chosen
for this analysis) and the willingness-to-pay measure achieve this goal, but the human capital measure generally does not. Hence, if the human capital statistic is used to determine the treatment path for a patient, the patient may be left in what is, in his or her opinion, an inferior health state among those available.

When choosing an individual for a given treatment, the principle followed in this chapter is equal entitlement. Equal entitlement requires that any two individuals who have the same preferences over two health states should have an equal opportunity to achieve either health state. This is perhaps the most contentious principle used in the chapter and it is not uncommon to find other positions based on merit or health maximization. The human capital statistic is found to discriminate against the wealthy, the retired, and those who value leisure. The willingness-to-pay measure is found to discriminate against the poor, the retired, and those who value income. Finally, the healthy year equivalent is found to discriminate against the retired, those who are risk averse with respect to health, and, very likely, those who are otherwise in poor health. These results seriously undermine the egalitarian justification for using QALYs. QALYs are not only inconsistent with equal entitlement, but the people who are discriminated against are, in some cases, the very ones widely believed to deserve extra consideration (i.e. those endowed with poor health and those who have taken care of their health endowments).

One method proposed in the literature (Wagstaff [1991]) to overcome this problem is to assign "named weights" to individuals so that more deserving people carry greater weight in social decisions than less deserving people. But this method is inadequate to prioritize individuals by their health status since, when health status changes, the weight remains fixed.
Instead, the position adopted in this thesis is that such apparent discrimination can be overcome by the use of a more distributionally sensitive aggregation rule. Such a rule assigns higher values to whoever is in worse health. This requires that attention be turned from measuring health status changes to measuring health status levels. The use of health status improvement (change) measures implies utilitarian ethics, which are sometimes inconsistent with egalitarian principles. The use of health status change measures is only defensible, if at all, for small projects that have little impact on the overall distribution of health in society.

When choosing a program for funding, society wants to pick the set of programs that gives the best level and distribution of community health. Thus, a good index should encompass all aspects of community health (that is, be complete). It should measure these aspects such that community health increases whenever the health of a constituent improves. To be consistent with the first ethical principle, this requires the index be consistent with welfarist social preferences. Because, ceteris paribus, equal distributions of health are socially preferred to unequal distributions of health, these preferences should also be distributionally sensitive. This is consistent with the second ethical principle, which deals with horizontal rather than vertical equity. The aggregate versions of the three health statistics commonly used in practice are assessed for consistency with these objectives.

The human capital measure is found to be seriously incomplete. It is not consistent with welfarism since this requires that everyone earn the same wage and spend all their time working. It is distributionally neutral. The willingness-to-pay measure generally fails to order health profiles sensibly. It is consistent with welfarism only if the individual willingness-to-pay for a health status change is the same for all levels of income.
It is also distributionally neutral. The mean (or sum of) healthy year equivalent(s) can order most health profiles sensibly (only states worse than death pose a problem). It is welfarist only if the individual healthy year equivalents are proportional to the length of life. However, the distributional ethics of this index are (likely) perverse, and the index favours unequal distributions of health in the community. Clearly, the first two indexes are unsatisfactory, neither being able to even identify which community health profiles are better than others.

While the healthy year equivalent can order most community health profiles, the manner in which it measures them is not completely satisfactory. Typically, the index depends on the reference health state. This result simply reflects the consequences of Arrow's Impossibility Theorem: reasonable social orderings based solely on individual preferences do not exist. To combine individual orderings over health states, the aggregate QALY index converts these preferences to a cardinal scale by anchoring each individual ordering such that a year in the reference health state is the same value for everyone (i.e. the independence of irrelevant alternatives is violated).[1] It should be recognized that some such ethical compromise is involved for any index based on individual (or patient) preferences. The alternative is to adopt a paternalistic position (dictatorship) that the QALY index was designed to overcome.

The policy maker can circumvent the most serious implications of reference state dependence by ensuring that all community health measures are fixed to the same reference state (usually perfect health). Then a consistent ranking of health profiles is generated.

A more distressing ethical implication of aggregate QALY statistics is that they assign higher values to improving the health of people who are already relatively healthy rather than those who are unhealthy.
To compensate for this, the policy maker can choose an aggregation rule that is characterized by a high degree of inequality aversion (the feasibility of such rules is addressed in the fourth chapter).

[1] An alternative approach to combine individual orderings is to make these orderings interpersonally comparable. The extended sympathy instrument generates such interpersonally comparable values. If the policy maker found the implications of cardinality (a year of life in perfect health is worth the same to all) too offensive, the extra investment required to collect extended sympathy QALY values would be warranted.

These results suggest that the QALY is an acceptable decision tool for choosing treatments for a given patient, is better than the alternatives for choosing between programs for funding (although the policy maker may be well advised to use an aggregation rule with strict inequality aversion to compensate for the distributional effects), but is perhaps unacceptable for choosing between people for treatment. Policy makers must be prepared to favour the healthy and those who are less cautious regarding their health if they are going to allocate resources on the basis of improvements in health measured by QALY-type statistics.

The fourth chapter of this thesis examines the theoretical and empirical properties of a QALY-based societal health index. In response to some of the issues raised in the previous chapter, different aggregation rules over individual healthy year equivalents are examined. It is investigated whether there exists any aggregation of individual healthy year equivalents that represents acceptable social preferences for community health. Aggregation is shown to be possible, but, as with any social ordering generated by individual orderings for health states, some ethical compromises must be made.
It is found that the only aggregation rule which represents complete, welfarist, and distributionally defensible social preferences is the sum, but that this requires the assumption that individuals do not discount over time. Given that this condition usually does not hold in practice, some ethical compromise must be made. Most health planners would probably sacrifice welfarism to obtain more appropriate distributional ethics. However, the indexes which fall into this category are based on quality adjusted lifetimes and cannot be estimated with the piecemeal data that are available. Hence, an additive structure, with all its accompanying distributional faults, is used to estimate the health status statistic.

A measure of health status for adult Canadians living in the provinces is calculated with the available data. Results indicate (1) that morbidity significantly reduces the health of Canadians (society would be willing to give up 10 per cent of its life expectancy to rid itself of all morbidity), (2) that women suffer from more morbidity than men (their health advantage is cut in half when life expectancy is adjusted for quality), and (3) that the most socially significant categories of ill-health are role and motor dysfunction. Policy makers could use information of this type to more effectively target health care resources (e.g. allocation of funds between life-saving versus life-improving interventions, allocation of funds between prevention programs for gender-specific diseases, targeting programs that assist people with moving around their community and occupational retraining programs).

This chapter demonstrates that, while the construction of such indexes is certainly feasible, it could be greatly improved with additional data.
First, information on the dynamics of illness would allow a broader range of aggregation rules (and more distributionally defensible ethics) and overcome the positive bias in this index because mortality and morbidity had to be assumed uncorrelated. Second, individual health state values which are appropriately scaled to a QALY interval (rather than satisfaction with health as reported on an ordinal scale) would overcome the need to assume restrictions on individual preferences to construct the social index, and increase its empirical reliability.

This thesis has answered a number of questions relating to the use of QALYs in health policy making. Its main contribution to the literature has been to identify the underlying interpersonal and social ethics of QALY-based decision statistics that are currently used to guide health care policy. For decisions involving choices between individuals, the interpersonal ethics are shown to be non-egalitarian. In fact, the QALY index discriminates against the very people to whom most societies would, if anything, choose to offer preferential access to health care (the people born with poor endowments of health and those who have taken care of their health endowments). Thus, the interpersonal properties of the QALY index are in some ways inferior to those of the competing health status measures it was designed to replace.

These interpersonal problems can be overcome if a decision statistic is chosen which incorporates concerns for inequality. This requires abandoning the currently used change in health measure (which, with utilitarian underpinnings, is inconsistent with egalitarianism) in favour of a social welfare function defined over individual QALYs. The "more deserving" can then be given more weight in policy decisions by choosing an aggregation rule that incorporates strict inequality aversion.
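To illustrate what an aggregation rule with strict inequality aversion does, the sketch below compares the arithmetic mean of individual QALYs with an Atkinson-style "equally distributed equivalent". The functional form, the parameter name `aversion`, and the two hypothetical two-person health profiles are assumptions chosen for illustration only; the thesis does not prescribe this particular rule.

```python
import math


def mean_qaly(qalys):
    """Inequality-neutral index: the arithmetic mean of individual QALYs."""
    return sum(qalys) / len(qalys)


def equally_distributed_equivalent(qalys, aversion):
    """Atkinson-style index (an assumed functional form, for illustration).

    Returns the QALY level which, if held equally by everyone, would be
    judged as good as the actual profile. aversion = 0 reproduces the mean;
    larger values put more weight on the people in the worst health.
    """
    if aversion < 0:
        raise ValueError("aversion must be non-negative")
    if any(q <= 0 for q in qalys):
        raise ValueError("this form requires strictly positive QALYs")
    n = len(qalys)
    if math.isclose(aversion, 1.0):
        # Limiting case of the power mean: the geometric mean.
        return math.exp(sum(math.log(q) for q in qalys) / n)
    power = 1.0 - aversion
    return (sum(q ** power for q in qalys) / n) ** (1.0 / power)


# Two hypothetical profiles with the same mean but different distributions.
equal_profile = [50.0, 50.0]
unequal_profile = [30.0, 70.0]

mean_equal = mean_qaly(equal_profile)
mean_unequal = mean_qaly(unequal_profile)
ede_equal = equally_distributed_equivalent(equal_profile, aversion=2.0)
ede_unequal = equally_distributed_equivalent(unequal_profile, aversion=2.0)

print(mean_equal, mean_unequal)
print(round(ede_equal, 2), round(ede_unequal, 2))
```

Both profiles have a mean of 50 QALYs, so the mean cannot tell them apart, whereas with `aversion=2.0` the unequal profile is valued at about 42 and therefore ranks below the equal one. The positivity restriction means that states valued as worse than death would need separate treatment under this form.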
The results of this thesis indicate such social aggregation is both theoretically and practically possible, although some compromise in social ethics is involved. However, these ethical compromises are inherent in any social ordering based on individual preferences (Arrow's theorem), and must be accepted unless the policy maker is prepared to give up the position that the "patient knows best". In order to adhere to this principle, the position taken throughout the thesis is that the individual is the best judge of his or her own welfare (this is an underlying principle in QALY analysis) and that health states should be valued by the individual enduring them (i.e. QALY values should come from patients, or from people who have the same preferences as these patients). Average community preferences should only be used to determine the nature of the aggregation rule (i.e. the degree of inequality aversion across health in the community). The use of average preferences to evaluate individual health states could result in the situation where society's health improves by forcing patients to accept treatments (and health outcomes) that they do not want.

Current aggregation rules are shown to be inequality promoting. They also impose the social value that a year of life in perfect health (or whatever the reference health state is) is worth the same to everyone. It is shown that the first problem can be overcome by choosing an aggregation rule that is more distributionally sensitive (this is clearly possible in theory, although more data are required to implement this). The second problem, if it is indeed considered a problem, can be overcome by using an extended sympathy QALY instrument. These values are interpersonally comparable, so it is unnecessary to impose preference similarity arbitrarily.
The implied social ethics of the extended sympathy QALY-based aggregate index may or may not be more acceptable to the policy maker, and the extra resources required by this measurement technique may or may not be warranted.

This thesis has raised as many questions as it has answered. This new research agenda is summarized below. Issues of who should provide QALY values when individuals are unable or unwilling to do so are not addressed. This is not a trivial problem. Some people are too incapacitated to provide QALY values (the very young, the very old, the mentally ill), while others may recognize incentives to misreport the value of certain health states (they could under-report the value of their current health state to attract more resources to treatment of this state). One response to these problems is to use average QALY value functions (as opposed to average QALY values that entail a certain ethical position), where the average is taken from a representative group (representative in the sense that people in this group have the same preferences as the person whose health state needs to be valued). There has not been adequate empirical analysis as to whether such representative groups exist (evidence from Chapters 2 and 4 indicates that statistically significant differences do appear to exist across some characteristics), and whether responses are normally distributed so that the use of an average response is appropriate. Another line of empirical analysis that should be undertaken is to investigate (1) whether the extended sympathy instrument is feasible in practice, and (2) if so, whether responses vary significantly enough to justify the use of such an instrument. The discussion above should suffice as justification for this analysis. These two lines of research constitute a departure from the empirical analysis that has been undertaken up till now. The typical focus has been on content validity (whether different instruments generate the same values). Statistical differences consistent with theory are now well established. Hence, it is argued here that field research should be taken in new directions.

The welfare assessment can be improved in a number of ways. The analysis in this thesis treated morbidity and mortality as completely separable. A more realistic representation would have morbidity evolving over the life-cycle. This complicates the analysis considerably and may change some of the results obtained in the simpler model (e.g. the healthy year equivalent function may become discontinuous). Another issue is how to incorporate other data that are relevant to the policy maker in a decision statistic, particularly resource use. The analysis in the thesis presupposed that a fixed level of funding for health care had already been determined. The issue of how, if at all, QALYs could be used to help determine what this overall budget should be was not addressed. Many of the results in the thesis suggested QALYs could be used to determine the optimal distribution (as opposed to the level) of resources across health programs by asking which generated the greatest societal health. This requires evaluating the health of the entire community, even those not affected by the project, for every decision. More feasible piecemeal rules, which approximate the best strategy plan, should be devised and assessed for accuracy.

The most exciting line of research involves extensions in the area of societal health measurement. While the measure is theoretically sound, greater confidence could be placed in the empirical results (and the policies suggested by them) with two improvements in data collection. The first is to obtain more reliable values for the health states endured by members of society.
No work has been done on whether or not the health conditions recorded in the General Social Survey capture all the relevant dimensions of ill-health. It would also be worthwhile to learn how close the values assigned to the health states in this research are to the values that would have been obtained if they had been measured directly with QALY instruments. Obviously, if these values are significantly different, the societal health measures obtained and the policy implications drawn from them may have to be adjusted. The second improvement in data collection is to find the "careers" of certain health conditions (i.e. the transitional probabilities of moving between health states over time). This would not only reduce the statistical bias in the calculated quality adjusted life expectancy, but would also allow the use of other aggregate statistics that reflect a greater degree of inequality aversion. Even with the data available, continued measurement of societal health can generate policy relevant information. Such measures can be repeated over time (to track the health of a population), or across regions (to compare the relative performance of different health care systems or the impact of other factors that affect the health of a community). Such data are, in fact, necessary for any type of system-wide assessment.

Bibliography

[1] Anderson, J., J. Bush, M. Chen, and D. Dolenc (1986). "Policy Space Areas and Properties of Benefit-Cost/Utility Analysis." JAMA, vol. 255, 794-795.
[2] Atkinson, A. and F. Bourguignon (1982). "The Comparison of Multi-Dimensioned Distributions of Economic Status." Review of Economic Studies, vol. 49, 183-202.
[3] Arrow, K. (1978). "Extended Sympathy and the Possibility of Social Choice." Philosophia, vol. 7, 223-237.
[4] Avorn, J. (1984). "Benefit and Cost Analysis in Geriatric Care." New England Journal of Medicine, vol. 310, 1294-1301.
[5] Birch, S. and C. Donaldson (1987). "Applications of Cost-Benefit Analysis to Health Care."
Journal of Health Economics, vol. 6, 211-225.
[6] Birch, S. and A. Gafni (1991). "Cost-Effectiveness/Utility Analyses: Do Current Decision Rules Lead Us to Where We Want to Be?" CHEPA D.P. 91-6.
[7] Birnbaum, M. (1973). "The Devil Rides Again: Correlations as an Index of Fit." Psychological Bulletin, vol. 79, 239-240.
[8] Blackorby, C. and D. Donaldson (1990). "A Review Article: The Case Against the Use of the Sum of Compensating Variations in Cost-Benefit Analysis." Canadian Journal of Economics, vol. 23, 471-494.
[9] Blackorby, C. and D. Donaldson (1988). "Money Metric Utility: A Harmless Normalization?" Journal of Economic Theory, vol. 46, 120-129.
[10] Blackorby, C. and D. Donaldson (1985). "Consumers' Surpluses and Consistent Cost-Benefit Tests." Social Choice and Welfare, vol. 1, 251-262.
[11] Blackorby, C., D. Primont, and R. Russell (1978). Duality, Separability, and Functional Structure: Theory and Economic Applications. New York: North-Holland.
[12] Boadway, R. (1974). "The Welfare Foundations of Cost-Benefit Analysis." Economic Journal, vol. 84, 426-439.
[13] Boadway, R. and N. Bruce (1984). Welfare Economics. New York: Basil Blackwell.
[14] Bombardier, C., A. Wolfson, A. Sinclair, and A. McGeer (1982). "A Comparison of Three Preference Measurement Methodologies in the Evaluation of a Functional Health Status Index." In R. Deber and G. Thompson (eds.). Choices in Health Care: Decision Making and Evaluation of Effectiveness. Toronto: University of Toronto Press.
[15] Boyle, M. and G. Torrance (1984). "Developing Multiattribute Health Indexes." Medical Care, vol. 22, 1045-1057.
[16] Boyle, M., G. Torrance, J. Sinclair, and S. Horwood (1983). "Economic Evaluation of Neonatal Intensive Care of Very-low-birth-weight Infants." New England Journal of Medicine, vol. 308, 1330-1337.
[17] Brent, R. (1991). "A New Approach to Valuing Life." Journal of Public Economics, vol. 44, 165-172.
[18] Brooks, R. (1986). The Development and Construction of Health Status Measures.
IHE Report 1986:4.
[19] Broome, J. (1978). "Trying to Value a Life." Journal of Public Economics, vol. 9, 91-100.
[20] Butler, J. (1990). "Welfare Economics and Cost-Utility Analysis." A.N.U. Dept. of Economics W.P. 205.
[21] Canada. Dept. of National Health and Welfare (1954). Canada Sickness Survey. Ottawa: Dominion Bureau of Statistics.
[22] Canada. Dominion Bureau (1960). Illness and Health Care in Canada. Canadian Sickness Survey, 1950-1951. Cat. 82-518. Ottawa: Queen's Printer.
[23] Canada. Parliament (1964). Royal Commission on Health Services Report. Ottawa: Queen's Printer.
[24] Canada. Statistics Canada (1990). The Health and Activity Limitation Survey Highlights: Disabled Persons in Canada. Cat. 82-620. Ottawa: Minister of Regional Industrial Expansion.
[25] Canada. Statistics Canada (1987). General Social Survey: Health and Social Support, 1985. Cat. 11-612. Ottawa: Minister of Supply and Services.
[26] Canada. Statistics Canada (1981). Health of Canadians: Report of the Canada Health Survey. Ottawa: Minister of Supply and Services.
[27] Canada. Statistics Canada, Canadian Centre for Health Information (1991). Health Reports: Life Tables: Canada and the Provinces, 1985-1987, supp. 13, vol. 2. Ottawa: Minister of Industry, Science and Technology.
[28] Carr-Hill, R. (1989). "Assumptions of the QALY Procedure." Social Science and Medicine, vol. 29, 469-477.
[29] Carr-Hill, R. (1985). "The Evaluation of Health Care." Social Science and Medicine, vol. 21, 367-75.
[30] Chew, S. (1980). Two Representation Theorems and Their Application to Decision Theory. Unpublished Ph.D. thesis. U.B.C.
[31] Chiang, C. (1968). Introduction to Stochastic Processes in Biostatistics. New York: John Wiley.
[32] Churchill, D., B. Lemon and G. Torrance (1984). "A Cost-Effectiveness Analysis of Continuous Ambulatory Peritoneal Dialysis and Hospital Hemodialysis." Medical Decision Making, vol. 4, 489-500.
[33] Culyer, A. (1989). "The Normative Economics of Health Care Finance and Provision." Oxford Review of Economic Policy, vol. 5, 34-58.
[34] Culyer, A. (1976). Need and the National Health Service. London: Martin Robertson and Company.
[35] Deaton, A. and J. Muellbauer (1980). Economics and Consumer Behaviour. New York: Cambridge University Press.
[36] Diewert, W. E. (1982). "Duality Approaches to Microeconomic Theory." In K. Arrow and M. Intriligator (eds.). Handbook of Mathematical Economics, vol. 2. New York: North-Holland Publishing Company.
[37] Diewert, W. E. (1973). "Functional Forms for Profit and Transformation Functions." Journal of Economic Theory, vol. 6, 284-316.
[38] Donabedian, A. (1971). "Social Responsibility for Personal Health Services: An Examination of Basic Values." Inquiry, vol. 8, 3-19.
[39] Donaldson, David (1991). "On the Aggregation of Money Measures of Well-Being in Applied Welfare Economics." Paper presented to the Western Agricultural Association/American Agricultural Association.
[40] Drummond, M. (1987). "Cost Benefit Analysis in Health Care: Future Directions." In G. Teeling Smith (ed.). Health Economics: Prospects for the Future. New York: Croom Helm.
[41] Drummond, M., G. Stoddart, and G. Torrance (1987). Methods for the Economic Evaluation of Health Care Programs. Toronto: Oxford University Press.
[42] Eichhorn, W. (1978). Functional Equations in Economics. Don Mills, Ontario: Addison-Wesley Publishing Company.
[43] Epp, J. (1986). A Framework for Health Promotion. Ottawa: Health and Welfare Canada.
[44] Erickson, P., E. Kendall, J. Anderson, and R. Kaplan (1989). "Using Composite Health Status Measures to Assess the Nation's Health." Medical Care, vol. 27, S66-76.
[45] Evans, R.G. (1984). Strained Mercy: The Economics of Canadian Health Care. Toronto: Butterworths.
[46] Feeny, D. and G. Torrance (1989). "Incorporating Utility Based Quality-of-Life Assessment Measures in Clinical Trials: Two Examples." Medical Care, vol. 27, S190-204.
[47] Fishburn, P.
(1988). Non-linear Preference and Utility Theory. Johns Hopkins University Press.
[48] Fishburn, P. (1964). Decision and Value Theory. New York: Wiley.
[49] Froberg, D. and R. Kane (1989). "Methodology for Measuring Health State Preferences - I-IV." Journal of Clinical Epidemiology, vol. 42, 345-354, 459-471, 585-592, 675-685.
[50] Furlong, W., D. Feeny, G. Torrance, R. Barr, and J. Horsman (1990). "Guide to Design and Development of Health State Utility Instrumentation." McMaster University, CHEPA, D.P. 90-9.
[51] Gafni, A. and S. Birch (1991). "Equity Considerations in Utility-Based Measures of Health Outcomes in Economic Appraisals: An Adjustment Algorithm." Journal of Health Economics, vol. 10, 329-342.
[52] Gafni, A. and G. Torrance (1984). "Risk Attitude and Time Preference in Health." Management Science, vol. 30, 440-451.
[53] Geigle, R. and S. Jones (1990). "Outcomes Measurement: A Report from the Front." Inquiry, vol. 27, 7-13.
[54] Giauque, W. and T. Peebles (1976). "Application of Multiattribute Utility Theory in Determining Optimal Test Treatment Strategies for Streptococcal Sore Throat and Rheumatic Fever." Operations Research, vol. 24, 933-950.
[55] Haldane, J. (1988). "Persons and Values." Journal of Medical Ethics, vol. 14, 39-41.
[56] Hanke, S. (1981). "On the Feasibility of Benefit-Cost Analysis." Public Policy, vol. 29, 147-158.
[57] Harris, J. (1987). "QALYfying the Value of Life." Journal of Medical Ethics, vol. 13, 117-123.
[58] Harsanyi, J. (1955). "Cardinal Welfare, Individual Ethics, and Interpersonal Comparisons of Utility." Journal of Political Economy, vol. 63, 309-321.
[59] Hausman, J. and D. Wise (1978). "A Conditional Probit Model for Qualitative Choice: Discrete Decisions Recognizing Interdependence and Heterogeneous Preferences." Econometrica, vol. 46, 403-426.
[60] Hilden, J. (1985). "The Non-Existence of Interpersonal Scales: A Missing Link in Medical Decision Theory." Medical Decision Making, vol. 5, 215-228.
[61] Kahneman, D. and A.
Tversky (1979). "Prospect Theory: An Analysis of Decision Under Risk." Econometrica, vol. 47, 263-291.
[62] Jones-Lee, M. (1976). The Value of Life: An Economic Analysis. Chicago: University of Chicago Press.
[63] Keeney, R. and H. Raiffa (1976). Decisions with Multiple Objectives: Preferences and Value Tradeoffs. New York: Wiley.
[64] Kennedy, P. (1985). A Guide to Econometrics, 2nd ed. Cambridge: MIT Press.
[65] Klein, G., H. Moskowitz, S. Mahesh, and A. Ravindran (1985). "Assessment of Multi-Attributed Measurable Value and Utility Functions via Mathematical Programming." Decision Sciences, vol. 16, 309-324.
[66] Klevit, H., A. Bates, T. Castanares, E. Kirk, P. Sipes-Metzler, and R. Wopat (1991). "Prioritization of Health Care Services. A Progress Report by the Oregon Health Services Commission." Archives of Internal Medicine, vol. 151, 912-916.
[67] Krischer, J. (1976). "Utility Structure of a Medical Decision." Operations Research, vol. 24, 951-972.
[68] Labelle, R. and J. Hurley (1991). "Implications of Basing Health Care Resource Allocation on Cost-Utility Analysis in the Presence of Externalities." CHEPA D.P. 91-3.
[69] Lipscomb, J. (1980). "Value Preferences for Health: Meaning, Measurement, and Use in Program Evaluation." In R. Kane and R. Kane (eds.). Values and Long-Term Care. Toronto: Lexington.
[70] Longmore, D. and H. Rehahn (1975). "The Cumulative Cost of Death." Lancet, vol. 1, 1023-1025.
[71] Loomes, G. and L. McKenzie (1989). "The Use of QALY's in Health Care Decision Making." Social Science and Medicine, vol. 28, 299-308.
[72] Luce, R. and H. Raiffa (1957). Games and Decisions. John Wiley and Sons.
[73] Machina, M. (1982). "Expected Utility Analysis Without the Independence Axiom." Econometrica, vol. 50, 277-323.
[74] Maynard, A. (1991). "Developing the Health Care Market." Economic Journal, vol. 101, 1277-1286.
[75] McDowell, I. and C. Newell (1987). Measuring Health: A Guide to Rating Scales and Questionnaires.
Oxford University Press.
[76] McGuire, A., J. Henderson, and G. Mooney (1988). The Economics of Health Care. New York: Routledge and Kegan Paul.
[77] McNeil, B. and S. Pauker (1980). "Optimizing Patient and Societal Decision Making by the Incorporation of Individual Values." In R. Kane and R. Kane (eds.). Values and Long-Term Care. Toronto: Lexington.
[78] McNeil, B., S. Pauker, H. Sox, and A. Tversky (1982). "On the Elicitation of Preferences for Alternative Therapies." New England Journal of Medicine, vol. 306, 1259-1262.
[79] McNeil, B., R. Weichselbaum and S. Pauker (1981). "Speech and Survival Tradeoffs Between Quality and Quantity of Life in Laryngeal Cancer." New England Journal of Medicine, vol. 305, 982-987.
[80] Mehrez, A. and A. Gafni (1991). "The Healthy Years Equivalents: How to Measure Them Using the Standard Gamble Approach." Medical Decision Making, vol. 11, 140-146.
[81] Mehrez, A. and A. Gafni (1989). "Quality Adjusted Life Years, Utility Theory, and Healthy-years Equivalents." Medical Decision Making, vol. 9, 142-149.
[82] Miller, J. (1970). "An Indicator to Aid Management in Assigning Program Priorities." Public Health Reports, vol. 85, 724-731.
[83] Mishan, E. (1976). Cost Benefit Analysis. New York: Praeger Publishers.
[84] Mooney, G. (1977). The Valuation of Life. London: MacMillan Press.
[85] Neu, C. (1980). "Individual Preferences for Life and Health: Misuses and Possible Uses." In R. Kane and R. Kane (eds.). Values and Long-Term Care. Toronto: Lexington.
[86] Patrick, D. (1976). "Constructing Social Metrics for Health Status Indexes." International Journal of Health Services, vol. 6, 443-453.
[87] Patrick, D., J. Bush, and M. Chen (1973). "Methods for Measuring Levels of Well-Being for a Health Status Index." Health Services Research, vol. 8, 228-245.
[88] Patrick, D. and M. Bergner (1990). "Measurement of Health Status in the 1990's." Annual Review of Public Health, vol. 11, 165-183.
[89] Pliskin, J., D. Shepard and M. Weinstein (1980). "Utility Functions for Life Years and Health Status." Operations Research, vol. 28, 206-224.
[90] Read, J., R. Quinn, D. Berwick, H. Fineberg, and M. Weinstein (1984). "Preferences for Health Outcomes." Medical Decision Making, vol. 4, 315-329.
[91] Roberts, K. (1980). "Price Independent Welfare Prescriptions." Journal of Public Economics, vol. 13, 277-297.
[92] Rosser, R. and P. Kind (1978). "A Scale of Valuations of States of Illness. Is There a Social Consensus?" International Journal of Epidemiology, vol. 7, 347-358.
[93] Rothenberg, T. (1984). "Approximating the Distributions of Econometric Estimators and Test Statistics." In Z. Griliches and M. Intriligator (eds.). Handbook of Econometrics, vol. 2. New York: North-Holland, 882-935.
[94] Sackett, D. and G. Torrance (1978). "The Utility of Different Health States as Perceived by the General Public." Journal of Chronic Diseases, vol. 31, 697-704.
[95] Sen, A. (1986). "Social Choice Theory." In K. Arrow and M. Intriligator (eds.). The Handbook of Mathematical Economics, vol. 3. New York: North-Holland.
[96] Sen, A. (1985). Commodities and Capabilities. North-Holland.
[97] Sen, A. (1972). "Control Areas and Accounting Prices: an Approach to Economic Evaluation." Economic Journal, vol. 82, S486-501.
[98] Schoemaker, P. (1982). "The Expected Utility Model: its Variants, Purposes, Evidence, and Limitations." Journal of Economic Literature, vol. 20, 529-563.
[99] Stevens, S. (1959). "Measurement, Psychophysics and Utility." In C. Churchman and P. Ratoosh (eds.). Measurement Definitions and Theories. Wiley, 18-63.
[100] Sullivan, D. (1966). "Conceptual Problems in Developing an Index of Health." Vital and Health Statistics, vol. 2, 17.
[101] Sutherland, H., V. Dunn, and N. Boyd (1983). "Measurement of Values for States of Health with Linear Analog Scales." Medical Decision Making, vol. 3, 477-487.
[102] Torrance, G. (1987). "Utility Approach to Measuring Health Related Quality of Life." Journal of Chronic Diseases, vol.
40, 593-603.
[103] Torrance, G. (1986). "Measurement of Health State Utilities for Economic Appraisal." Journal of Health Economics, vol. 5, 1-30.
[104] Torrance, G. (1982). "Multi-Attribute Utility Theory as a Method of Measuring Social Preferences for Health States in Long-Term Care." In R. Kane and R. Kane (eds.). Values and Long-Term Care. Lexington: Lexington Books.
[105] Torrance, G. (1976a). "Toward a Utility Theory Foundation for Health Status Index Models." Health Services Research, vol. 11, 349-369.
[106] Torrance, G. (1976b). "Social Preferences for Health States: An Empirical Evaluation of Three Measurement Techniques." Socio-Economic and Planning Sciences, vol. 10, 129-136.
[107] Torrance, G. (1976c). "Health Status Index Models: A Unified Mathematical View." Management Science, vol. 9, 990-1001.
[108] Torrance, G., M. Boyle, and S. Horwood (1982). "Application of Multi-attribute Utility Theory to Measure Social Preferences for Health States." Operations Research, vol. 30, 1053-1069.
[109] Torrance, G., G. Stoddart, M. Drummond, and A. Gafni (1981). "Cost Benefit Analysis versus Cost-Effectiveness Analysis for the Evaluation of Long Term Care Programs." Health Services Research, vol. 16, 474-476.
[110] Torrance, G., W. Thomas, and D. Sackett (1972). "A Utility Maximization Model for Evaluation of Health Care Programs." Health Services Research, vol. 7, 118-33.
[111] van Praag, B. (1968). Individual Welfare Functions and Consumer Behaviour. A Theory of Irrational Rationality. Amsterdam: North-Holland.
[112] Veit, C. and J. Ware (1982). "Measuring Health and Health Care Outcomes." In R. Kane and R. Kane (eds.). Values and Long-Term Care. Lexington: Lexington Books.
[113] Viscusi, W. and W. Evans (1990). "Utility Functions that Depend on Health Status: Estimates and Economic Implications." American Economic Review, vol. 80, 353-374.
[114] Wagstaff, A. (1991). "QALYs and the Equity Efficiency Trade-off." Journal of Health Economics, vol. 10, 21-41.
[115] Ware, J.
and J. Young (1979). "Issues in the Conceptualization and Measurement of Value Placed on Health." In S. Mushlin and D. Dunlop (eds.). Health: What is it Worth? Toronto: Pergamon Press.
[116] Weinstein, M. and W. Stason (1977). "Allocation of Resources to Manage Hypertension." New England Journal of Medicine, vol. 296, 732-739.
[117] White, K., S. Wong, D. Whistler, and S. Haun (1990). SHAZAM User's Reference Manual Version 6.2. Toronto: McGraw-Hill.
[118] Wilkins, R. and O. Adams (1983). Healthfulness of Life. Montreal: The Institute for Research on Public Policy.
[119] Williams, A. (1988). "Priority Setting in Public and Private Health Care." Journal of Health Economics, vol. 7, 173-83.
[120] Williams, A. (1987). "Measuring Quantity of Life." In G. Teeling Smith (ed.). Health Economics: Prospects for the Future. New York: Croom Helm.
[121] Williams, A. (1983). "Economics of Coronary Artery Bypass Grafting." British Medical Journal, vol. 291, 326-329.
[122] Wolfson, A., A. Sinclair, C. Bombardier, and A. McGeer (1982). "Preference Measurements for Functional Status in Stroke Patients: Interrater and Intertechnique Comparisons." In Values in Long-Term Care.
[123] Wright, S. (1985). "Health Satisfaction: A Detailed Test of the Multiple Discrepancies Theory Model." Social Indicators Research, vol. 17, 299-313.

Appendix A

Chapter 2 Proofs

Proof of Lemma 1 (CS):

Given the assumptions in the description section,
\[ \varphi^{CS}(q;t,\kappa) = \frac{U(q,t,\kappa) - U(q^0,t,\kappa)}{U(q^1,t,\kappa) - U(q^0,t,\kappa)}. \tag{A.1} \]

1.a) Ordering levels
Since $q^1$ and $q^0$ are fixed points, $U(q^1,t,\kappa)$ and $U(q^0,t,\kappa)$ are fixed values ($\bar{u}$ and $\underline{u}$ respectively) so
\[ \varphi^{CS}(q;t,\kappa) = \frac{U(q,t,\kappa) - \underline{u}}{\bar{u} - \underline{u}} = aU(q,t,\kappa) + b \sim U(q,t,\kappa), \tag{A.2} \]
where $a = \frac{1}{\bar{u} - \underline{u}}$ and $b = \frac{-\underline{u}}{\bar{u} - \underline{u}}$. If $U(q^1,t,\kappa) > U(q^0,t,\kappa)$, then $a > 0$ by strict monotonicity.

1.b) Ordering differences
Since $\varphi^{CS}(q;t,\kappa) \overset{c}{\sim} U(q,t,\kappa)$, $\varphi^{CS} \in S^{c}(\tilde{U}) \leftrightarrow U \in S^{c}(\tilde{U})$.

2) Completeness:
Obviously, if $\bar{u} \neq \underline{u}$, then any $q$ defined for $U(q,t,\kappa)$ is defined for $\varphi^{CS}(q;t,\kappa)$.

3) Uniqueness:
a) Invariance to $\phi(U)$.
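Let $U' = \phi(U)$.

Necessity: By the definition of uniqueness, $\varphi^{CS}(q;t,\kappa)_{U} = \varphi^{CS}(q;t,\kappa)_{U'}$. Substituting for $\varphi^{CS}(q;t,\kappa)_{U'}$:
\[ \varphi^{CS}(q;t,\kappa)_{U} = \frac{\phi(U(q,t,\kappa)) - \phi(U(q^0,t,\kappa))}{\phi(U(q^1,t,\kappa)) - \phi(U(q^0,t,\kappa))}. \tag{A.3} \]
Rearranging this expression yields
\[ \phi(U(q,t,\kappa)) - \phi(U(q^0,t,\kappa)) = \varphi^{CS}(q;t,\kappa)_{U}\,\bigl(\phi(U(q^1,t,\kappa)) - \phi(U(q^0,t,\kappa))\bigr). \tag{A.4} \]
Since $q^1$ and $q^0$ are fixed, so are $\phi(U(q^1,t,\kappa))$ and $\phi(U(q^0,t,\kappa))$ (equal to $v$ and $u$ respectively).
\[ \phi(U(q,t,\kappa)) - u = \varphi^{CS}(q;t,\kappa)_{U}\,(v - u). \tag{A.5} \]
\[ \phi(U(q,t,\kappa)) = \varphi^{CS}(q;t,\kappa)_{U}\,a + b. \tag{A.6} \]
Since $\varphi^{CS}(q;t,\kappa)_{U} \sim U(q,t,\kappa)$,
\[ \phi(U(q,t,\kappa)) = (dU(q,t,\kappa) + e)a + b = fU(q,t,\kappa) + g \sim U(q,t,\kappa). \tag{A.7} \]

Sufficiency: If $U' = aU + b$,
\[ \varphi^{CS}(q;t,\kappa)_{U'} = \frac{aU(q,t,\kappa) + b - aU(q^0,t,\kappa) - b}{aU(q^1,t,\kappa) + b - aU(q^0,t,\kappa) - b} = \frac{U(q,t,\kappa) - U(q^0,t,\kappa)}{U(q^1,t,\kappa) - U(q^0,t,\kappa)} = \varphi^{CS}(q;t,\kappa)_{U}. \tag{A.8} \]
Suppose not. Then $U' \not\sim U$ and
\[ \varphi^{CS}(q;t,\kappa)_{U'} = \frac{\phi(U(q,t,\kappa)) - \phi(U(q^0,t,\kappa))}{\phi(U(q^1,t,\kappa)) - \phi(U(q^0,t,\kappa))} \not\sim U(q,t,\kappa). \]
Since $\varphi^{CS}(q;t,\kappa)_{U} \sim U(q,t,\kappa)$ and $\varphi^{CS}(q;t,\kappa)_{U'} \not\sim U(q,t,\kappa)$, this implies $\varphi^{CS}(q;t,\kappa)_{U'} \not\sim \varphi^{CS}(q;t,\kappa)_{U}$.

b) Invariance to $(q^1, q^0)$.
Given $(q^1, q^0)$ and $(\hat{q}^1, \hat{q}^0)$ are fixed points, $\varphi^{CS}(q;t,\kappa)_{(q^1,q^0)} = aU(q,t,\kappa) + b$ and $\varphi^{CS}(q;t,\kappa)_{(\hat{q}^1,\hat{q}^0)} = dU(q,t,\kappa) + e$.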
Thus, cp cs (q;t,/c) (04 0 ) =dU(q,t,K) -Fe4-4b=dandU(q° ,t, tc) = U (q° ,t, k).If not, cpcs (q;t,K) (q i,q 0 )4PCS (q; t, K)(41,e)•fixed points,+ b and cpcs (q;t,low ,(0 ) , dU (q,t, lc) +(p c s (Et, ow ,e)^aU (q,t, ic)^b =a = c 4-4 U(q1 ,t,K) = U (4 1 ,t, K) andProof of Lemma 1 (ME):Given the assumptions in the description section,U(q,t, ic)cpmE (q; t, K) = U (q.1 t , K)1.a) Ordering of levelsSince q l is a fixed point, U (q l , t, K) = E so(pmE (q; t, lc) = -:_U (q,t,^U(q,t, K),(A.9)(A.10)> 0^U (q 1 ,t, tc) > O.1.b) Ordering of DifferencesSince yomE (q;t, tc) r U (q,t, 10 (if U(q 1 ,t,^0), cpME C Sc (ü) +-+ U Esc(ü) .2) Completeness:Obviously, if U(q 1 ,t, K,)^0, then any q defined for U(q,t, K) is defined for,pmE(q;^but if U (q 1 ,t, K) -= 0, (pmE (q;t,K) is undefined for all q.3) Uniqueness:Appendix A. Chapter 2 Proofs^ 193a) Invariance to (U). Let U' = O(U).Necessity:^By the definition of uniqueness soME(q., 1 , K )uV; 1kIE (q; t, is )ut . Substituting for (,omE (q; t, K)u , :(pME (4; t, K)u — O(U (q, t, K))^(1)(U (q, t, K))O(U (q 1 ,t, K))since q' is fixed. Since (pm E (q; t, K)u^U (q, t,ck(U (q, t, K)) = -a7 U(q,t,k) r U (q, t , K).(A.11)(A.12)Sufficiency: If U' = aU ,^(pmE 07;t, low = aU(q,t,K)^U(q,t, K)^MEcp^(q; t, K)u .^(A.13)aU(q 1 ,t,K)^U(q 1 ,t,K)Suppose not.^Then U'^(7 .^(pME (q; t, K)u, = Vuu((qpq' t:ik ))))aO(U (q,t , ic))^U (q,t, K). Since (pmE(q;t,K)u^U (q,t, K), this im-plies cp m E (q; t, ,)v, u cpmE (q; t, K)ub) Invariance to q l .com E (q; t , K )0 = soME (q; t, K yo^uLT((qqi,t,tts")) = uU(C:; t:tk"))^u(q1 t , K^UN,t,K)^_U (41 , t, ). Suppose not.^Then^ME (q; t,^=^(41 ,t")^U(e1 ,t,k) U(q,t,i)^— UN 1 ,t"),^lei(q; t ic )^U(q 1^4") u (41 ,t,K) u(41,0=Yr^iomE( q; t, tc ) i^(romE(q; loqizp ME(q; t , Thus,Proof of Lemma 1 (SG):Given the assumptions in the description section, cp sG (q; t, K) = p where p isimplicitly defined((q, t, K), 1) = CI ((q l ,t, K), p, (q ° ,t, K), (1 — p)).^(A.14)1.a) Ordering of levels:Appendix A. 
Chapter 2 Proofs^ 194Solve the above equation for p. Since all arguments of the right hand sidebut p are fixed: CI ((q, t, is ^= U(q,t,K) = U (p). Then,p = U 1 (U(q, t, ic)).^(A.15)Since U is an increasing monotonic function, so is its inverse. Thus,se,sG(q;t,ic) =0 U (q,t, K).1.b) Ordering of differencesSufficiency: If OH pl (q, t, K), then the problem above becomes1 -0(q,t,K) = (q, t,^= pt) (q 1 ,t,^+ (1 —^(q° ; t,^(A.16)such thatSG^U (q,t, rz) — (q° ,t,P = (70 kg, (A.17)0. (qi,t,K) — U (q°,t,K)which has the same structure as the CS problem. Following that proofcpSG^t ) ZE (q, t, K) and (pSG E Sc(ü).Necessity: Ifscp c(q;t,K) P = aü (q, t, ic)+ b, then p_e_ = ü(q,t,K). Giventhat there exists1.1((q,t, n), 1) = ((q 1^(q° ,t, K), 1 — p)^(A.18)and that there exists 0 such that0(0 ((q, t, K), 1)) = (q, t,^(A.19)which does not affect the choice of p above, then these two facts may becombined to yieldah^-= O(U ((q ,t, K),p,(q ° , t, ,c), 1 — p)). (A.20)Appendix A. Chapter 2 Proofs^ 195Since the left hand side is linear in p, so is the right hand side. Since theright hand side is ordinally equivalent to U, then U must be homotheticin p (i.e. 0(x.in,P,x/0„, 1 — p) (x.in) + (1 — p)I7If Cr(xwin, p, x /08„, 1 — p) -(1'-- p0- (X win ) + (1 — p) 0 (x108s), iosc (q; t,K) --f--I I (q, t, 10. Thus, ,,o .s.G c S° (U). If not, then cp sG (q;t,K) -f-- U (q,t, K) andcoSG E so (0) .2) Completenesscosc ( q;Domain:^t, K) E [0, 1] by the properties of a distribution func-tion. By certainty equivalence, U (q° ,t, K) = (( q l , t, K), 0, (q° ,t, K), 1) andU(q 1 ,t, K) = ((q 1 ,t, 10,1, (q° ,t, K), 0). By assumption, Op (.), 0e (-) > 0 forall (p, 6,). Then the domain is defined by (q1 ,t, K)32(q,t, K)R(q° ,t, K) since,if (q,t, (P indicating strict preference), it is not possible toincrease p above one to achieve indifference, whereas if (q ° ,t, OP(q,t, K),one cannot reduce p below zero.3) Uniqueness:a) Invariance to 0(0). 
Let U' = 0(0).cpsc(q;t, 100,Sufficiency:^=^0-1 (0 -1 (0( 17 ((q,t, 10, 1 ))))^=U-1(0((q,t,K),1))= (pSG VI such that any two representationsrelated by an increasing monotonic (i.e. with an inverse) function willyield the same result.soSG (q;^)0, = (19 SG (q; K ,\eNecessity: If ) then p must solveCI ((q,t, K), 1) =^((q 1 ,t, K), p, (q ° ,t, K), (1 — p)),and0(ü(q,t, K ), 1) = 0(CI ((q 1 ,t, K),p, (q°, t, K), (1 — p))).^(A.21)Appendix A. Chapter 2 Proofs^ 196But the choice of p is invariant to^(or 0 -1 ), so they are the sameproblem.Invariance to U' c U if 0(x„,i,,,p, x ic,„ , 1 —p) u pa (x tein ) + (1 — p)(1(x10„).This is a standard result of expected utility theory.^IfU(xwin p, x loss , 1 — p)^+ (1 — p)0 (x iass ), thencp sG (q;t, lc) = ^(q, t, K) — (q ° ,t, K) (A.22)U (q tc,' ,t, K) — 0(q°,t, )'which is the same as for CS, but with U replacing U. The resultfollows.b) Invariance to choice of standard gamble.SG^\kg, (. 7 /C gq i ,e) = pC r((q,t, K), 1) = a ((q 1 ,t, K), p, (q° ,t, K), (1 — p)) = U (p),^(A.23)co sa (g;t,n)(qie) =CT ((q,t, K ), 1) = ((q 1 ,1, K), p, (e,t, K),^- p)) = U'(p').^(A.24)Thenp = 4-4 U (U) = -1 (U) .'((q 1 ,t, K), p, (q ° ,t, K), (1 — p))= 0((4 1 ,t,K),p,(4°,t,K), (1 — p)) V p.^(A.25)Pick p = 1 and p = 0. Then the above condition reduces totwo certainty equivalents: U (q 1 ,t, K) = (41 ,1, K) and U (q ° ,t, K) =u(4°,t,K).Appendix A. Chapter 2 Proofs^ 197Suppose not. Then, if CI (x ipin , p, x loss , 1 — p)^pCI^+ (1 —p)0K) — (q° , t, K) `p(9 ,4°)(q; t, K) = ^aU(q,t, K) + b,^(A.26)U (g l , t , K) — U (q° , t, K)K) — 0(4,t,K)(pW,40 ) (q;t, K) = ^(q,t, K) + b', (A.27)U(4 1 ,t,K)— (10-0,t, lc)which is the same as the CS case, with U replacing U. 
Thus, $\varphi^{SG}_{(q^1,q^0)}(q;t,\kappa) \overset{c}{=} \varphi^{SG}_{(\hat{q}^1,\hat{q}^0)}(q;t,\kappa)$.

If $\tilde{U}(x_{win},p,x_{loss},1-p)$ is not ordinally equivalent to $p\hat{U}(x_{win}) + (1-p)\hat{U}(x_{loss})$,

$$\varphi^{SG}(q;t,\kappa)_{(q^1,q^0)} = \bar{U}^{-1}(\tilde{U}((q,t,\kappa),1)), \tag{A.28}$$

$$\varphi^{SG}(q;t,\kappa)_{(\hat{q}^1,\hat{q}^0)} = \bar{U}'^{-1}(\tilde{U}((q,t,\kappa),1)), \tag{A.29}$$

$$\varphi^{SG}(q;t,\kappa)_{(q^1,q^0)} = \bar{U}^{-1}(\bar{U}'(\varphi^{SG}(q;t,\kappa)_{(\hat{q}^1,\hat{q}^0)})) \tag{A.30}$$

$$\overset{o}{=} \varphi^{SG}(q;t,\kappa)_{(\hat{q}^1,\hat{q}^0)}. \tag{A.31}$$

□

Proof of Lemma 1 (TTO):

Given the assumptions in the description section, $\varphi^{TTO}(q;t,\kappa) = (t-m)/t$ where $m$ is implicitly defined:

$$U(q,t,\kappa) = U(q^1,t-m,\kappa). \tag{A.32}$$

1.a) Ordering levels:

Solve the above equation for $m$. Since all arguments of the right hand side but $m$ are fixed: $U(q,t,\kappa) = \bar{U}(t-m)$. Then

$$t - m = \bar{U}^{-1}(U(q,t,\kappa)). \tag{A.33}$$

Since $\bar{U}$ is an increasing monotonic function, so is its inverse. Thus, $t - m \overset{o}{=} U(q,t,\kappa)$. But $\varphi^{TTO}(q;t,\kappa) = (t-m)/t = (1/t)\bar{U}^{-1}(U(q,t,\kappa))$, which is not an allowable transformation if $t$ varies since it would depend on an argument of the function itself. If $t$ is fixed, this is simply a ratio scale transform of an allowable transform, which is itself allowable.

Ordering levels of morbidity

Sufficiency: If $U(\cdot) \overset{o}{=} t\mu(q,\kappa)$, then the problem above becomes

$$\phi(t\mu(q,\kappa)) = \phi((t-m)\mu(q^1,\kappa)), \tag{A.34}$$

such that $\varphi^{TTO}(q;t,\kappa) = (t-m)/t = \mu(q,\kappa)/\mu(q^1,\kappa) \overset{c}{=} \mu(q,\kappa)$ since $q^1$ is fixed such that $\mu(q^1,\kappa) = d > 0$.

Necessity: If $\varphi^{TTO}(q;t,\kappa) \overset{o}{=} \mu(q,\kappa)$, $\varphi^{TTO}(q;t,\kappa)$ must be independent of $t$ since the right hand side of the equation is independent of $t$. By the definition of independence, this means that $\varphi^{TTO}(q;t,\kappa)$ must be the same for all values of $t$, i.e.

$$U(q,\lambda t,\kappa) = U(q^1,\lambda(t-m),\kappa) \ \forall\, \lambda > 0. \tag{A.35}$$

Set $\lambda = 1/t$. Then $U(q,1,\kappa) = U(q^1,(t-m)/t,\kappa) = \bar{U}((t-m)/t)$. But then all arguments but $q$ are fixed on the left hand side, so $U(q,1,\kappa) = \bar{U}((t-m)/t)$ such that $(t-m)/t = \bar{U}^{-1}(U(q,1,\kappa))$. Let $\bar{U}^{-1}(U(q,1,\kappa)) = \mu(q,\kappa)$. Then rearrangement of the above expression yields $t - m = t\mu(q,\kappa)$. From above, $t - m \overset{o}{=} U$, so $U(q,t,\kappa) \overset{o}{=} t\mu(q,\kappa)$.

Suppose not. Then $(t-m)/t = f(t)\,\phi(\mu(q,\kappa))$.

1.b) Ordering differences

If $U(x) \overset{o}{=} t\mu(q,\kappa)$, $\varphi^{TTO}(q;t,\kappa) \overset{c}{=} \mu(q;\kappa)$.
Thus $\varphi^{TTO} \in S^c(\mu)$. If $U(x) \overset{c}{=} \mu(q,\kappa)t$, then $\varphi^{TTO} \in S^c(U)$ for any choice of $t$. If $U(x)$ is not ordinally equivalent to $\mu(q,\kappa)t$, then $\varphi^{TTO}(q;t,\kappa) \overset{o}{=} U(q,t,\kappa)$ if and only if $t$ is fixed, and then $\varphi^{TTO} \in S^o(U)$.

2) Completeness

Domain: Since $U_q(x), U_t(x) > 0$ and $U(q^0,t,\kappa) = U(q,0,\kappa)$ by assumption, $(q,t,\kappa)R(q^0,t,\kappa)$ for all $(t,\kappa)$ since $t - m \geq 0$ (i.e. cannot increase $m$ to achieve indifference) and $(q^1,t,\kappa)R(q,t,\kappa)$ for all $(t,\kappa)$ since $m \geq 0$ (cannot reduce $m$ to achieve indifference). For $q \notin [q^0,q^1]$, it is not possible to adjust $t - m$ to achieve indifference.

3) Uniqueness:

a) Invariance to $\phi(U)$. Let $U' = \phi(U)$.

Sufficiency: $\varphi^{TTO}(q;t,\kappa)_{U'}$ is defined by $\bar{U}^{-1}(\phi^{-1}(\phi(U(q,t,\kappa)))) = \bar{U}^{-1}(U(q,t,\kappa))$, which also defines $\varphi^{TTO}(q;t,\kappa)_{U}$, such that any two representations related by an increasing monotonic (i.e. with an inverse) function will yield the same result.

Necessity: If $\varphi^{TTO}(q;t,\kappa)_{U'} = \varphi^{TTO}(q;t,\kappa)_{U}$, then $t - m$ must solve

$$U(q,t,\kappa) = U(q^1,t-m,\kappa) \tag{A.36}$$

and

$$\phi(U(q,t,\kappa)) = \phi(U(q^1,t-m,\kappa)). \tag{A.37}$$

But the choice of $t - m$ is invariant to $\phi$ (or $\phi^{-1}$), so they are the same problem.

Invariance to $\mu' \overset{c}{=} \mu$ if $U(x) \overset{o}{=} t\mu(q,\kappa)$.

If $U(x) \overset{o}{=} t\mu(q,\kappa)$, then

$$\varphi^{TTO}(q;t,\kappa) = \frac{\mu(q,\kappa)}{\mu(q^1,\kappa)}, \tag{A.38}$$

which is the same as for ME, but with $\mu$ replacing $U$. The result follows.

b) Invariance to choice of reference state.

Since

$$\varphi^{TTO}(q;t,\kappa)_{q^1} = \bar{U}^{-1}(U(q,t,\kappa))/t, \tag{A.39}$$

$$\varphi^{TTO}(q;t,\kappa)_{\hat{q}^1} = \bar{U}'^{-1}(U(q,t,\kappa))/t, \tag{A.40}$$

then

$$\bar{U}^{-1}(U) = \bar{U}'^{-1}(U) \leftrightarrow \bar{U} = \bar{U}' \leftrightarrow U(q^1,t,\kappa) = U(\hat{q}^1,t,\kappa) \tag{A.41}$$

for all $(t,\kappa)$.

Suppose not. Then, if $U(x) \overset{o}{=} t\mu(q,\kappa)$, this is the same problem as ME and the result follows. If $U(x)$ is not ordinally equivalent to $t\mu(q,\kappa)$, then

$$\varphi^{TTO}(q;t,\kappa)_{q^1} = \bar{U}^{-1}(\bar{U}'(\varphi^{TTO}(q;t,\kappa)_{\hat{q}^1})) \overset{o}{=} \varphi^{TTO}(q;t,\kappa)_{\hat{q}^1}. \tag{A.42}$$

□

Proof of Lemma 1 (ES):

Given the assumptions in the description section, $\varphi^{ES}(q;t_i,\kappa_i,t_j,\kappa_j) = (t_j - m)/t_i$ where $m$ is implicitly defined

$$U(q_i,t_i,\kappa_i) = U(q_j,t_j - m,\kappa_j). \tag{A.43}$$

1.a) Ordering levels:

Recognizing that all arguments but $m$ are fixed on the right hand side, solve the above equation for $m$: $U(q_i,t_i,\kappa_i) = \bar{U}_j(t_j - m)$.
Then

$$t_j - m = \bar{U}_j^{-1}(U(q_i,t_i,\kappa_i)). \tag{A.44}$$

Since $\bar{U}_j$ is an increasing monotonic function, so is its inverse. Thus, $t_j - m \overset{o}{=} U(q_i,t_i,\kappa_i)$. But $\varphi^{ES}(q;t_i,\kappa_i,t_j,\kappa_j) = (t_j - m)/t_i = (1/t_i)\bar{U}_j^{-1}(U(q_i,t_i,\kappa_i))$, which is not an allowable transformation if $t_i$ varies since it would depend on an argument of the function itself. If $t_i$ is fixed, this is simply a ratio scale transform of an allowable transform, which is itself allowable.

Ordering levels of morbidity:

Sufficiency: If $U(x_i) \overset{o}{=} t_i\mu(q_i,\kappa_i)$, then, since $q_j$ is fixed and $\mu(q_j,\kappa_j) > 0$, the proof from TTO may be applied.

Necessity: If $\varphi^{ES}(q;t_i,\kappa_i,t_j,\kappa_j) \overset{o}{=} \mu(q_i,\kappa_i)$, $\varphi^{ES}(q;t_i,\kappa_i,t_j,\kappa_j)$ must be independent of $t_i$ since the right hand side of the equation is independent of $t_i$. By the definition of independence, this means that $\varphi^{ES}$ must be the same for all values of $t_i$, i.e.

$$U(q_i,\lambda t_i,\kappa_i) = U(q_j,\lambda(t_j - m),\kappa_j) \ \forall\, \lambda > 0. \tag{A.45}$$

Set $\lambda = 1/t_i$ and the proof for TTO may be applied.

1.b) Ordering differences

If $U(x_i) \overset{o}{=} t_i\mu(q_i,\kappa_i)$ for all $i$, $\varphi^{ES}(q;t_i,\kappa_i,t_j,\kappa_j) \overset{c}{=} \mu(q_i,\kappa_i)$. Thus, $\varphi^{ES} \in S^c(\mu)$. If not, then $\varphi^{ES}(q;t_i,\kappa_i,t_j,\kappa_j) \overset{o}{=} U(q_i,t_i,\kappa_i)$ and $\varphi^{ES} \in S^o(U)$ if and only if $t_i$ is fixed.

2) Completeness

Domain: Since $U_q(x_j), U_t(x_j) > 0$ for all $q, t$ by assumption, and $t_j - m \in [0,t_j]$, then $(q_j,t_j,\kappa_j)R(q_i,t_i,\kappa_i)R(q_j,0,\kappa_j)$ since if state $i$ is preferred to $j$ at $t_j$, one cannot increase $t_j$ to achieve indifference, whereas if state $j$ at $t_j = 0$ is preferred to state $i$, one cannot decrease $t_j$ to achieve indifference.

3) Uniqueness:

a) Invariance to $\phi(U)$. Let $U' = \phi(U)$.

$\varphi^{ES}(q;t_i,\kappa_i,t_j,\kappa_j)_{U}$ is defined by $U(q_i,t_i,\kappa_i) = U(q_j,t_j - m,\kappa_j)$ and $\varphi^{ES}(q;t_i,\kappa_i,t_j,\kappa_j)_{U'}$ is defined by $U'(q_i,t_i,\kappa_i) = U'(q_j,t_j - m,\kappa_j)$. But this is the same problem as for TTO, except the arguments of the right hand side function have changed.
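Since that proof is unaffected by changes in the argument values, the result carries through.

Invariance to $\mu' \overset{c}{=} \mu$ if $U(x_i) \overset{o}{=} t_i\mu(q_i,\kappa_i)$.

Again, this problem is the same as for TTO with changes in the values of the arguments which do not affect the proofs. The result follows.

b) Invariance to choice of reference state

Again, this is the same problem as for TTO, although the dimensionality of the reference state has increased by the inclusion of personal characteristics. □

Proof of Lemma 1 (PE):

Given the assumptions in the description section, $\varphi^{PE}(q_1,\ldots,q_N;t_1,\ldots,t_N,\kappa_1,\ldots,\kappa_N) = (N-m)/N$ where $m$ is implicitly defined

$$W(q_1,\ldots,q_N,t_1,\ldots,t_N,\kappa_1,\ldots,\kappa_N) = W(q^1_1,\ldots,q^1_{N-m},q^0_{N-m+1},\ldots,q^0_N,t_1,\ldots,t_N,\kappa_1,\ldots,\kappa_N), \tag{A.46}$$

which may be rewritten as

$$\tilde{W}(\{q\},N) = \tilde{W}(\{q^1\},N-m,\{q^0\},m). \tag{A.47}$$

1.a) Ordering levels of societal health

Since $q^1, q^0, N, t_i, \kappa_i$ are fixed in the expression above, the right hand side may be expressed

$$\tilde{W}(\{q^1\},N-m,\{q^0\},m) = \bar{W}(m). \tag{A.48}$$

Taking the inverse of $\bar{W}$ on both sides,

$$m = \bar{W}^{-1}(\tilde{W}(\{q\},N)) \overset{o}{=} \tilde{W}(\{q\},N), \tag{A.49}$$

since $\bar{W}$ is monotonic (in order to have an inverse).

But $\varphi^{PE}(q_1,\ldots,q_N;t_1,\ldots,t_N,\kappa_1,\ldots,\kappa_N) = (N-m)/N = 1 - (m/N) = 1 - (1/N)(\bar{W}^{-1}(\tilde{W}(\cdot)))$, which is not an allowable transformation if $N$ varies since it then depends on an argument of the function itself. If $N$ is fixed, this is simply a ratio scale transform of an allowable transform, which is itself allowable.

1.b) Ordering outcomes

Sufficiency: If $W(\cdot) = \sum_{i=1}^{N} U(q_i,t_i,\kappa_i)$ and $t_i = t$, $\kappa_i = \kappa$ for all $i$, then the problem above becomes

$$U(q_i,t_i,\kappa_i)N = U(q^1,t_i,\kappa_i)(N-m) + U(q^0,t_i,\kappa_i)m, \tag{A.50}$$

$$\varphi^{PE}(q_1,\ldots,q_N;t_1,\ldots,t_N,\kappa_1,\ldots,\kappa_N) = (N-m)/N = \frac{U(q_i,t_i,\kappa_i) - U(q^0,t_i,\kappa_i)}{U(q^1,t_i,\kappa_i) - U(q^0,t_i,\kappa_i)}. \tag{A.51}$$

Since $q^1, q^0, t_i, \kappa_i$ are fixed, $\varphi^{PE}(q_1,\ldots,q_N;t_1,\ldots,t_N,\kappa_1,\ldots,\kappa_N) = a(U(q_i,t_i,\kappa_i)) + b \overset{c}{=} U(q_i,t_i,\kappa_i)$, where $a > 0$ since $(q^1,t_i,\kappa_i)P(q^0,t_i,\kappa_i)$.

If $W(\cdot) = \sum_i U(q_i,t_i,\kappa_i)$ and $t_i \neq t_j$ or $\kappa_i \neq \kappa_j$ and $U(q_i,t_i,\kappa_i) \overset{c}{=} v(q) + b(t_i,\kappa_i) + c$ (i.e. $U_{q,\varsigma} = 0$), then the problem above becomes

$$Nv(q) + \sum_{i=1}^{N} b(t_i,\kappa_i) = (N-m)v(q^1) + mv(q^0) + \sum_{i=1}^{N} b(t_i,\kappa_i), \tag{A.52}$$

such that $\varphi^{PE}(q_1,\ldots,q_N;t_1,\ldots,t_N,\kappa_1,\ldots,\kappa_N) = \dfrac{v(q) - v(q^0)}{v(q^1) - v(q^0)}$ which, since $v(q^1), v(q^0), q^1, t_1,\ldots,t_N,\kappa_1,\ldots,\kappa_N$ are fixed, is cardinally related to $U$.

Necessity: Consider the restriction on the SWF. If $W$ is not of the additive form $\sum_i U(q_i,t_i,\kappa_i)$, then the independence from $t$ results of the TTO section may be applied here (with respect to $N$): then $\varphi^{PE}$ is not cardinally related to $U$.

Consider the restriction on $(t_i,\kappa_i)$. If these are allowed to vary across $i$, then the problem becomes

$$\sum_{i=1}^{N} U(q,t_i,\kappa_i) = \sum_{i=1}^{N-m} U(q^1,t_i,\kappa_i) + \sum_{i=N-m+1}^{N} U(q^0,t_i,\kappa_i). \tag{A.53}$$

Standardizing to a given $(\bar{t},\bar{\kappa})$:

$$NU(q,\bar{t},\bar{\kappa}) + \sum_{i=1}^{N}\sum_{\varsigma} U_{\varsigma}(q,\bar{t},\bar{\kappa})\,d\varsigma_i = (N-m)U(q^1,\bar{t},\bar{\kappa}) + mU(q^0,\bar{t},\bar{\kappa}) + \sum_{i=1}^{N-m}\sum_{\varsigma} U_{\varsigma}(q^1,\bar{t},\bar{\kappa})\,d\varsigma_i + \sum_{i=N-m+1}^{N}\sum_{\varsigma} U_{\varsigma}(q^0,\bar{t},\bar{\kappa})\,d\varsigma_i. \tag{A.54}$$

Solving,

$$\varphi^{PE}(q_1,\ldots,q_N;t_1,\ldots,t_N,\kappa_1,\ldots,\kappa_N) = \frac{U(q,\bar{t},\bar{\kappa}) - U(q^0,\bar{t},\bar{\kappa})}{U(q^1,\bar{t},\bar{\kappa}) - U(q^0,\bar{t},\bar{\kappa})} - \frac{1}{N}\left[\frac{\sum_{i=1}^{N-m}\sum_{\varsigma} U_{\xi,\varsigma}(q,\bar{t},\bar{\kappa})\,d\varsigma_i\,(q^1 - q)}{U(q^1,\bar{t},\bar{\kappa}) - U(q^0,\bar{t},\bar{\kappa})}\right] - \frac{1}{N}\left[\frac{\sum_{i=N-m+1}^{N}\sum_{\varsigma} U_{\xi,\varsigma}(q,\bar{t},\bar{\kappa})\,d\varsigma_i\,(q^0 - q)}{U(q^1,\bar{t},\bar{\kappa}) - U(q^0,\bar{t},\bar{\kappa})}\right], \tag{A.55}$$

which $= f(U(q),q)$ unless $U_{\xi,\varsigma} = 0$ for all $\xi, \varsigma$, or $d\varsigma = 0$. The latter condition is the identical agents condition. The former condition requires $U(q,t,\kappa) \overset{c}{=} v(q) + b(t,\kappa)$.

Ordering expected outcomes

Sufficiency: Let $N$ be large. Define $P^I$ as the proportion of agents in the sample with characteristics $(t_I,\kappa_I)$. Since the selection rule is random and $N$ is large, $P^I_{N-m} = P^I_{m} = P^I$. For an additive SWF, this implies $W(\cdot) \overset{c}{=} \sum_I P^I U(q,t_I,\kappa_I)N$, such that

$$\sum_I P^I U(q,t_I,\kappa_I)N = \sum_I P^I U(q^1,t_I,\kappa_I)(N-m) + \sum_I P^I U(q^0,t_I,\kappa_I)m, \tag{A.56}$$

$$\varphi^{PE}(q_1,\ldots,q_N;t_1,\ldots,t_N,\kappa_1,\ldots,\kappa_N) = (N-m)/N = \frac{\sum_I P^I U(q,t_I,\kappa_I) - \sum_I P^I U(q^0,t_I,\kappa_I)}{\sum_I P^I U(q^1,t_I,\kappa_I) - \sum_I P^I U(q^0,t_I,\kappa_I)} \tag{A.57}$$

$$= a\sum_I P^I U(q,t_I,\kappa_I) + b \overset{c}{=} E(U(q,t_i,\kappa_i)). \tag{A.58}$$

Necessity: Suppose (i) $N$ is small or (ii) the selection rule (represented by $S$) is non-random. Then $P^I_N = f(t_1,\ldots,t_N,\kappa_1,\ldots,\kappa_N)$, $P^I_{N-m} = f(t_1,\ldots,t_{N-m},\kappa_1,\ldots,\kappa_{N-m})$, $P^I_{m} = f(t_{N-m+1},\ldots,t_N,\kappa_{N-m+1},\ldots,\kappa_N)$, and $P^I_{N-m} \neq P^I_{m} \neq P^I_N$.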
Then

$$\varphi^{PE}(q_1,\ldots,q_N;t_1,\ldots,t_N,\kappa_1,\ldots,\kappa_N) = a(t_1,\ldots,t_N,\kappa_1,\ldots,\kappa_N)\sum_I P^I U(q,t_I,\kappa_I) + b(t_1,\ldots,t_N,\kappa_1,\ldots,\kappa_N), \tag{A.59}$$

which is not cardinally related to $E(U(q,t_i,\kappa_i))$, since the transform depends on arguments of the function itself.

1.c) Ordering differences

If $W(\cdot) \overset{c}{=} \sum_i U(x_i)$, $U_i(x_i) = v_i(q)$ and $v_i(q) = v(q) \ \forall\, i$, or $U_{q,\varsigma} = 0 \ \forall\, (q,\varsigma)$, then from exactness $\varphi^{PE}(q_1,\ldots,q_N;t_1,\ldots,t_N,\kappa_1,\ldots,\kappa_N) \overset{c}{=} U(q_i,t_i,\kappa_i)$, such that $\varphi^{PE} \in S^c(U)$. If $U \in S^c(\hat{U})$, then $\varphi^{PE} \in S^o(\hat{U})$. If not, but $N$ and $(t_1,\ldots,t_N,\kappa_1,\ldots,\kappa_N)$ are fixed, $\varphi^{PE}(q_1,\ldots,q_N;t_1,\ldots,t_N,\kappa_1,\ldots,\kappa_N) \overset{o}{=} \phi(U(q_1,t_1,\kappa_1),\ldots,U(q_N,t_N,\kappa_N))$, such that $\varphi^{PE} \in S^o(U)$.

2) Completeness

Domain: Follow the same argument used in TTO, but replace $R$ with $R_{social}$, $q$ with $\{q\}$ and $t$ with $N$. Since $W_q, W_N > 0$ and $N - m \in [0,N]$, the result follows.

3) Uniqueness:

a) Invariance to $\phi(W)$. Let $W' = \phi(W)$.

Then this is the same problem as for TTO but $W$ replaces $U$.

Invariance to $\phi_i(U_i)$. Let $U_i' = \phi_i(U_i)$.

Because the SWF involves interpersonal comparisons (except in the case of dictatorship), and individual specific transforms affect these comparisons, $\varphi^{PE}$ will not be unique to individual specific transforms (this is a basic result of Arrow's theorem) unless the ordering is somehow flawed. When the restrictions on the ordering are relaxed, the set of allowable transforms will depend on the nature of $W$. If $W$ is homothetic, $U$ must be cardinally unit comparable, i.e. $U_i' = aU_i + b_i$. If $W$ is based on levels rather than differences, then $U$ must be cardinally fully comparable, i.e. $U_i' = aU_i + b$. See Sen (1986) for proofs of these and similar propositions.

b) Invariance to $\{(t_i,\kappa_i)\}$ and choice of selection rule.

Since there are a finite number of agents, there exists $U(q,t_i,\kappa_i) = U_i(q)$, such that $(t_i,\kappa_i)$ may be treated as an agent specific transform. The results of (a) above then apply, with $(t_i,\kappa_i)$ allowed to affect utility over $q$ only as individual specific transforms are allowed above, and common characteristics allowed to affect utility over $q$ only as common transforms are allowed above.

A change in selection rule may be viewed as a change in the individual specific transforms of the cured and the deceased such that (a) may be applied once again.

c) Invariance to choice of reference points $(q^1,q^0)$

Again, this is the same problem as for TTO, with $W$ and $N$ replacing $U$ and $t$. □

Proof of Lemma 2 (CS):

Sufficiency: If $U(x) = av(q)w(t,\kappa) + az(t,\kappa) + b$, then

$$\varphi^{CS}(q;t,\kappa) = \frac{v(q) - v(q^0)}{v(q^1) - v(q^0)} = \varphi^{CS}(q) \ \forall\, (t,\kappa).$$

Necessity: If $\varphi^{CS}(q;t,\kappa)$ is independent of $(t,\kappa)$, then, by the definition of independence, $\varphi^{CS}(q;t,\kappa) = \varphi^{CS}(q)$. Now

$$\varphi^{CS}(q;t,\kappa) = \frac{U(q,t,\kappa) - U(q^0,t,\kappa)}{U(q^1,t,\kappa) - U(q^0,t,\kappa)},$$

but since $q^1$ and $q^0$ are fixed, this may be expressed

$$\varphi^{CS}(q;t,\kappa) = \frac{U(q,t,\kappa) - z(t,\kappa)}{y(t,\kappa) - z(t,\kappa)} = \frac{U(q,t,\kappa) - z(t,\kappa)}{w(t,\kappa)}.$$

Substituting this into the independence condition: $\varphi^{CS}(q) = \dfrac{U(q,t,\kappa) - z(t,\kappa)}{w(t,\kappa)}$. Let $\varphi^{CS}(q) = v(q)$ and rearrange: $U(q,t,\kappa) = v(q)w(t,\kappa) + z(t,\kappa)$. Since $\varphi^{CS}(q;t,\kappa)$ is invariant to affine transforms of $U$ (see Lemma 1), this completes the proof. □

Proof of Lemma 2 (ME):

Sufficiency: If $U(x) = av(q)w(t,\kappa)$, then

$$\varphi^{ME}(q;t,\kappa) = \frac{av(q)w(t,\kappa)}{av(q^1)w(t,\kappa)} = \frac{v(q)}{v(q^1)} = \varphi^{ME}(q) \ \forall\, (t,\kappa).$$

Necessity: Independence requires $\varphi^{ME}(q;t,\kappa) = \varphi^{ME}(q)$. Substituting the expression for $\varphi^{ME}(q;t,\kappa)$ and setting $\varphi^{ME}(q) = v(q)$ and $U(q^1,t,\kappa) = w(t,\kappa)$ (since $q^1$ is fixed) yields $U(q,t,\kappa) = v(q)w(t,\kappa)$. Invariance to ratio scale transforms (Lemma 1 (ME)) completes the proof. □

Proof of Lemma 2 (SG): If the von Neumann-Morgenstern axioms hold and $(t,\kappa)$ are fixed for all states of the world, then

$$\varphi^{SG}(q;t,\kappa) = \frac{\hat{U}(q,t,\kappa) - \hat{U}(q_{loss},t,\kappa)}{\hat{U}(q_{win},t,\kappa) - \hat{U}(q_{loss},t,\kappa)},$$

which is the same expression as for the CS instrument, except the restrictions derived must apply to $\hat{U}$ and not an arbitrary $U$.
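The sufficiency direction of Lemma 2 (CS), which the SG result inherits, can be checked numerically. The sketch below is illustrative only: the functional forms chosen for $v$, $w$ and $z$, the constants, and the counter-example are all hypothetical and do not come from the thesis. It evaluates the CS ratio under a utility of the separable form $av(q)w(t,\kappa) + az(t,\kappa) + b$ and under a utility in which $q$ and $t$ interact.

```python
def cs_score(utility, q, q0, q1, t, kappa):
    """Category-scaling value: position of U(q) between the two reference states."""
    u = lambda state: utility(state, t, kappa)
    return (u(q) - u(q0)) / (u(q1) - u(q0))


def separable_utility(q, t, kappa, a=2.0, b=1.0):
    # Hypothetical instance of U = a*v(q)*w(t,kappa) + a*z(t,kappa) + b.
    v = q ** 0.5              # assumed v(q)
    w = t * (1.0 + kappa)     # assumed w(t,kappa), positive for t > 0, kappa >= 0
    z = 0.3 * t - kappa       # assumed z(t,kappa)
    return a * v * w + a * z + b


def nonseparable_utility(q, t, kappa):
    # Hypothetical counter-example: q and t interact through the exponent.
    return q ** (1.0 / t) + kappa


# Same health state, two different (t, kappa) settings.
sep_short = cs_score(separable_utility, 0.49, 0.0, 1.0, t=1.0, kappa=0.0)
sep_long = cs_score(separable_utility, 0.49, 0.0, 1.0, t=20.0, kappa=0.5)
non_short = cs_score(nonseparable_utility, 0.49, 0.0, 1.0, t=1.0, kappa=0.0)
non_long = cs_score(nonseparable_utility, 0.49, 0.0, 1.0, t=20.0, kappa=0.5)
```

Under the separable form both settings return $(v(q) - v(q^0))/(v(q^1) - v(q^0))$, so the score does not move with $(t,\kappa)$; under the interacting form the two scores differ, which is the dependence the necessity argument rules out.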
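□

Proof of Lemma 2 (TTO):

a) Independence from $\kappa$

Sufficiency: If $U(x) = U(v(q,t),\kappa)$, then, for any $\kappa$ there exists some $\bar{U}$ such that $U(x) = \bar{U}(v(q,t))$. By Lemma 1 (TTO), TTO is invariant to transforms of $U$.

Necessity: Since $\varphi^{TTO}(q;t,\kappa) = \dfrac{t-m}{t}$, $\dfrac{\partial \varphi^{TTO}(q;t,\kappa)}{\partial \varsigma} = -\dfrac{\partial m}{\partial \varsigma}\dfrac{1}{t}$. From the expression for TTO, $U(q,t,\kappa) = U(q^1,t-m,\kappa)$. Totally differentiate this expression to get $\sum_k U_{\xi_k}(x)\,d\xi_k + U_t(x)\,dt = 0$, which yields $dt = -m = -\sum_k U_{\xi_k}(x)\,d\xi_k / U_t(x)$. Then

$$\frac{\partial \varphi^{TTO}(q;t,\kappa)}{\partial \varsigma} = -\frac{1}{t}\sum_k \frac{\partial\,(U_{\xi_k}(x)/U_t(x))}{\partial \varsigma}\,d\xi_k.$$

Thus, $\mathrm{sign}\left(\dfrac{\partial \varphi^{TTO}(q;t,\kappa)}{\partial \varsigma}\right) = -\mathrm{sign}\left(\dfrac{\partial m}{\partial \varsigma}\right) = -\mathrm{sign}\left(\sum_k \dfrac{\partial\,(U_{\xi_k}(x)/U_t(x))}{\partial \varsigma}\,d\xi_k\right)$.

b) Independence from $t$

Sufficiency: If $U \overset{o}{=} t\mu(q,\kappa)$, then TTO is defined by $\phi(\mu(q,\kappa)t) = \phi(\mu(q^1,\kappa)(t-m))$ such that $\varphi^{TTO}(q;t,\kappa) = \dfrac{t-m}{t} = \dfrac{\mu(q,\kappa)}{\mu(q^1,\kappa)} = \varphi^{TTO}(q,\kappa) \ \forall\, t$.

Necessity: see Lemma 1 (TTO).

If not, then

$$\frac{\partial \varphi^{TTO}(q;t,\kappa)}{\partial t} = -\frac{\partial m}{\partial t}\frac{1}{t} + \frac{m}{t^2}. \tag{A.60}$$

Totally differentiate the basic TTO expression to obtain

$$U_t(q,t,\kappa)\,dt = U_t(q^1,t-m,\kappa)\,dt - U_t(q^1,t-m,\kappa)\,dm \tag{A.61}$$

$$\frac{dm}{dt} = 1 - \frac{U_t(q,t,\kappa)}{U_t(q^1,t-m,\kappa)}. \tag{A.62}$$

Substituting this into the above expression yields

$$\frac{\partial \varphi^{TTO}(q;t,\kappa)}{\partial t} = \frac{1}{t}\left(\frac{U_t(q,t,\kappa)}{U_t(q^1,t-m,\kappa)} - \varphi^{TTO}(q;t,\kappa)\right) > / < 0. \tag{A.63}$$

Note that if $U_t(q,t,\kappa) \geq U_t(q^1,t-m,\kappa)$, $\dfrac{\partial \varphi^{TTO}(q;t,\kappa)}{\partial t} > 0$ since $\varphi^{TTO}(q;t,\kappa) < 1$ for all $q$.

Proof of Lemma 2 (ES):

a) Independence from $\kappa$

If $\kappa_i \neq \kappa_j$, let $\kappa_j = \kappa_i + \delta_j$. Then $U(q_i,t_i,\kappa_i) = U(q_j,t_j - m,\kappa_i + \delta_j)$. Differentiate this expression to obtain

$$\sum_k U_{\xi_k}(x_i)\,d\xi_k + U_t(x_i)\,dt + \sum_{\varsigma} U_{\varsigma}(x_j)\,\delta_j = 0.$$

Solving, obtain

$$\varphi^{ES}(q_i;t_i,\kappa_i,t_j,\kappa_j) = 1 - \frac{\sum_k U_{\xi_k}(x_i)\,d\xi_k + \sum_{\varsigma} U_{\varsigma}(x_j)\,\delta_j}{U_t(x_i)\,t_i}.$$

Since $\delta_j > 0$ signifies $\kappa_j > \kappa_i$, whereas $\delta_j < 0$ signifies $\kappa_j < \kappa_i$,

$$\frac{\partial \varphi^{ES}(q_i;t_i,\kappa_i,t_j,\kappa_j)}{\partial \kappa_j} = -\frac{U_{\varsigma}(x_j)}{U_t(x_i)\,t_i} > / < 0 \leftrightarrow U_{\varsigma} < / > 0$$

and

$$\frac{\partial \varphi^{ES}(q_i;t_i,\kappa_i,t_j,\kappa_j)}{\partial \kappa_i} = \frac{U_{\varsigma}(x_j)}{U_t(x_i)\,t_i} > / < 0 \leftrightarrow U_{\varsigma} > / < 0.$$

b) Independence from $t$

If $t_i \neq t_j$, then the result depends on which time frame differs. Let $t_j$ change. Then the effect on the ES may be found by differentiation of the ES problem: $0 = U_t(x_j)\,dt_j + U_t(x_j)(-dm)$, which implies $dt_j = dm$ since $U_t(x_j) = U_t(x_j)$.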
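Then

$$\frac{\partial \varphi^{ES}(q_i;t_i,\kappa_i,t_j,\kappa_j)}{\partial t_j} = \frac{1}{t_i}\left(1 - \frac{dm}{dt_j}\right) = 0.$$

Let $t_i$ change. The differentiation exercise becomes $U_t(x_i)\,dt_i = -U_t(x_j)\,dm$. Then

$$\frac{\partial \varphi^{ES}(q_i;t_i,\kappa_i,t_j,\kappa_j)}{\partial t_i} = -\frac{1}{t_i}\left(\varphi^{ES}(q_i;t_i,\kappa_i,t_j,\kappa_j) + \frac{dm}{dt_i}\right) = -\frac{1}{t_i}\left(\varphi^{ES}(q_i;t_i,\kappa_i,t_j,\kappa_j) - \frac{U_t(x_i)}{U_t(x_j)}\right) = \frac{1}{t_i^2\,U_t(x_j)}\left(U_t(x_i)(t_i) - U_t(x_j)(t_j - m)\right) > / < 0$$

as the utility of time for $i$ is greater/less than the utility of time for $j$. □

Proof of Lemma 2 (PE):

a) If $t_i = t$, $\kappa_i = \kappa$, $\forall\, i \in [1,N]$, but $t_i \neq \bar{t}$ or $\kappa_i \neq \bar{\kappa}$, the PE problem becomes $W(q_1,\ldots,q_N,t,\kappa) = W(q^1_1,\ldots,q^1_{N-m},q^0_{N-m+1},\ldots,q^0_N,t,\kappa)$. For any fixed $N$, this has the same structure as Lemma 2 (SG), with $W$ and $N$ replacing $U$ and $q$ respectively, and the proof follows the same structure.

Suppose the SWF is welfarist (such that $N$ is separable from $(q,t,\kappa)$) and that utility over $q$ depends on $(t,\kappa)$. Then $W(\cdot) = \sum_i U(q,t_i,\kappa_i)$ and

$$\varphi^{PE}(q_1,\ldots,q_N;t_1,\ldots,t_N,\kappa_1,\ldots,\kappa_N) = \frac{U(q,t,\kappa) - U(q^0,t,\kappa)}{U(q^1,t,\kappa) - U(q^0,t,\kappa)}.$$

But this is the same as Lemma 2 (CS) and the results follow from that proof.

b) Suppose $t_i \neq t_j$ or $\kappa_i \neq \kappa_j$. Then the PE problem is to find $m$ such that $W(q_1,\ldots,q_N,t_1,\ldots,t_N,\kappa_1,\ldots,\kappa_N) = W(q^1_1,\ldots,q^1_{N-m},q^0_{N-m+1},\ldots,q^0_N,t_1,\ldots,t_N,\kappa_1,\ldots,\kappa_N)$. But this is essentially the same problem as the identical agents case except $(t_1,\ldots,t_N,\kappa_1,\ldots,\kappa_N)$ is a $2N$-dimensional vector instead of a two-dimensional vector. The same argument applies over the vector.

Consider when the SWF is welfarist and utility over $q$ depends on $(t_i,\kappa_i)$. Then

$$\varphi^{PE}(q_1,\ldots,q_N;t_1,\ldots,t_N,\kappa_1,\ldots,\kappa_N) = \frac{\sum_{i=1}^{N}\left(U(q,t_i,\kappa_i) - U(q^0,t_i,\kappa_i)\right)}{\sum_{i=1}^{N}\left(U(q^1,t_i,\kappa_i) - U(q^0,t_i,\kappa_i)\right)}.$$

Then $\varphi^{PE}_{\varsigma}(\cdot) = \left[(U_{\varsigma}(q,t_i,\kappa_i) - U_{\varsigma}(q^0,t_i,\kappa_i)) - \varphi^{PE}(\cdot)(U_{\varsigma}(q^1,t_i,\kappa_i) - U_{\varsigma}(q^0,t_i,\kappa_i))\right]/b$, $b > 0$ since $(q^1,t_i,\kappa_i)P(q^0,t_i,\kappa_i)$. Then $\varphi^{PE}_{\varsigma}(\cdot) = 0 \ \forall\, q \leftrightarrow U_{\varsigma,\xi}(x_i) = 0 \leftrightarrow U(x_i) \overset{c}{=} v(q) + b(t_i,\kappa_i)$. Otherwise, reference must be made to the above expression. □

Proof of Proposition 1:

Instruments generate identical values if and only if they are identical transformations of identical utility functions.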
Given reference points that fit all instruments to the same interval, this requires all transformations have the same curvature at all points (i.e. linearity which requires the respective metrics to have constant marginal utility) and that the orderings not be influenced by factors peculiar to each instrument (i.e. independence). Thus, the proposition may be proved by combining the independence results across instruments:

(a) intersection of independence conditions across all six instruments yields the result.
(b) intersection of independence conditions across SG, TTO, ES, and PE yields the result.
(c) intersection of independence conditions across SG, TTO, and ES yields the result.
(d) intersection of independence conditions across ES and PE yields the result. □

Proof of Lemma 3

Necessity: If the QALY satisfies WDI, then utility must satisfy WDI.

If the QALY satisfies WDI,

$$\frac{\partial\left(\varphi_{\xi_i}(q;t,\kappa)/\varphi_{\xi_j}(q;t,\kappa)\right)}{\partial \xi_k} > 1 - \left(\varphi_{\xi_i}(q;t,\kappa)/\varphi_{\xi_j}(q;t,\kappa)\right) \tag{A.64}$$

(subscripts denote partial derivatives) when $\varphi_{\xi_i} > \varphi_{\xi_j}$ (see derivation under axiomatic methods in the text). If $\varphi(q;t,\kappa) = \phi(U(q,t,\kappa))$ (as is always the case for CS, SG, and PE, and is the case when $t$ is fixed for TTO and ES (see equations 2.3 to 2.15)), then (A.64) translates to

$$\frac{\partial\left(\phi_U U_{\xi_i}(q,t,\kappa)/\phi_U U_{\xi_j}(q,t,\kappa)\right)}{\partial \xi_k} > 1 - \left(\phi_U U_{\xi_i}(q,t,\kappa)/\phi_U U_{\xi_j}(q,t,\kappa)\right), \tag{A.65}$$

which reduces to

$$\frac{\partial\left(U_{\xi_i}(q,t,\kappa)/U_{\xi_j}(q,t,\kappa)\right)}{\partial \xi_k} > 1 - \left(U_{\xi_i}(q,t,\kappa)/U_{\xi_j}(q,t,\kappa)\right), \tag{A.66}$$

which is the condition for all $\xi_k$ to be WDI in $U$. If $U$ is homothetic in $t$, then $\varphi(q;t,\kappa) = \phi(\mu(q,\kappa))$ (for TTO and ES), so the same argument applies to $\mu$ as to $U$ and the result is that all $\xi_k$ are WDI in $\mu$, which is equivalent to saying all $\xi_k$ are WDI in $U$, since WDI is transformation independent.

Sufficiency: If utility satisfies WDI, then the QALY satisfies WDI.

The above argument holds in reverse since all steps are if and only if.
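The step from (A.65) to (A.66) rests on the common factor $\phi_U$ cancelling in the ratio of partial derivatives. A minimal numerical sketch of that cancellation follows; the utility function, the increasing transform, and the evaluation point are hypothetical choices made only for illustration, and derivatives are approximated by central finite differences.

```python
import math


def partial(f, point, index, h=1e-6):
    """Central finite-difference estimate of the partial derivative of f."""
    up = list(point)
    down = list(point)
    up[index] += h
    down[index] -= h
    return (f(*up) - f(*down)) / (2.0 * h)


def utility(xi_1, xi_2):
    # Hypothetical utility over two health characteristics.
    return xi_1 ** 0.5 + 2.0 * xi_2 + 0.5 * xi_1 * xi_2


def qaly(xi_1, xi_2):
    # Hypothetical increasing transform phi(U) = 1 - exp(-U).
    return 1.0 - math.exp(-utility(xi_1, xi_2))


def marginal_ratio(f, point):
    """Ratio of the first to the second partial derivative at the given point."""
    return partial(f, point, 0) / partial(f, point, 1)


point = (0.4, 0.7)
ratio_utility = marginal_ratio(utility, point)
ratio_qaly = marginal_ratio(qaly, point)
```

The two ratios agree up to finite-difference error, which is why a condition stated on the ratio of marginal QALY values carries over unchanged to the ratio of marginal utilities.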
□

Proof of Lemma 4

Necessity: If the QALY satisfies MPI, then utility must satisfy MPI.

If the QALY satisfies MPI,

$$\frac{\partial\left(\varphi_{\xi_i}(q;t,\kappa)/\varphi_{\xi_j}(q;t,\kappa)\right)}{\partial \xi_k} = 0. \tag{A.67}$$

If $\varphi(q;t,\kappa) = \phi(U(q,t,\kappa))$ (as is always the case for CS, SG, and PE, and is the case when $t$ is fixed for TTO and ES (see equations 2.3-2.15)), then (A.67) translates to

$$\frac{\partial\left(\phi_U U_{\xi_i}(q,t,\kappa)/\phi_U U_{\xi_j}(q,t,\kappa)\right)}{\partial \xi_k} = 0, \tag{A.68}$$

which reduces to

$$\frac{\partial\left(U_{\xi_i}(q,t,\kappa)/U_{\xi_j}(q,t,\kappa)\right)}{\partial \xi_k} = 0, \tag{A.69}$$

which is the condition for all $\xi_k$ to be MPI in $U$. If $U$ is homothetic in $t$, then $\varphi = \phi(\mu)$ (for TTO and ES). Then the same argument applies to $\mu$ as to $U$ and the result is that all $\xi_k$ are MPI in $\mu$, which is equivalent to saying all $\xi_k$ are MPI in $U$, since MPI is transformation independent.

Sufficiency: If utility satisfies MPI, then the QALY satisfies MPI.

The above argument holds in reverse since all steps are if and only if. □

Proof of Lemma 5

Necessity: If the QALY satisfies SDI, then the utility function must satisfy SDI and the QALY must be an affine transform of the utility function.

If the QALY satisfies SDI, then

$$\varphi(q;t,\kappa) = \sum_{j=1}^{K} \varphi_j(\xi_j;t,\kappa). \tag{A.70}$$

Since

$$\varphi(q;t,\kappa) = \phi(U(q,t,\kappa)) \tag{A.71}$$

$$\varphi_j(\xi_j;t,\kappa) = \phi(U(\xi_j,t,\kappa)) = \phi(v_j(\xi_j)) \tag{A.72}$$

(from equations 2.3 to 2.15), (A.70) becomes

$$\phi(U(q,t,\kappa)) = \sum_{j=1}^{K} \phi(v_j(\xi_j)) \tag{A.73}$$

or

$$U(q,t,\kappa) = f\left(\sum_{j=1}^{K} \phi(v_j(\xi_j))\right). \tag{A.74}$$

Deriving the expressions for $\varphi$ given in equations 2.3 to 2.15 using this functional form for utility, one obtains:

$$\varphi(q;t,\kappa) = \phi\left(f\left(\sum_j v_j(\xi_j)\right)\right) \tag{A.75}$$

$$\varphi_j(\xi_j;t,\kappa) = \phi\left(f\left(v_j(\xi_j) + \sum_{k \neq j} v_k(\xi_k)\right)\right) = \phi(f(\tilde{v}_j(\xi_j))) \tag{A.76}$$

such that (A.70) now becomes

$$\phi\left(f\left(\sum_j v_j(\xi_j)\right)\right) = \sum_j \left(\phi(f(\tilde{v}_j(\xi_j)))\right), \tag{A.77}$$

which, given the definition of $f$, becomes

$$\phi\left(\sum_j \tilde{v}_j(\xi_j)\right) = \sum_j \left(\phi(\tilde{v}_j(\xi_j))\right), \tag{A.78}$$

which is a Pexider equation with the solution $\phi(U) = aU + b$ (where $\phi(U) = \varphi$).

Sufficiency: If utility is affinely related to the QALY and exhibits SDI, then the QALY exhibits SDI.

If $U$ satisfies SDI, then $U_{\xi_i,\xi_j}(q,t,\kappa) = 0$.
Since $\varphi = aU + b$, this implies $\varphi_{\xi_i,\xi_j}(q;t,\kappa) = aU_{\xi_i,\xi_j}(q,t,\kappa) = 0$, which is the condition for $\varphi$ to be SDI. □

Appendix B

Simulation Results

Figure B.1: Time Trade-Off (two panels, "tto: r=.05, t=1,10,20" and "tto: r=.1, t=1,10,20"; both axes run from 0.0 to 1.0)

Figure B.2: Standard Gamble and Person Equivalents (two panels, "sg: (a,d)=(1,1),(.75,1.5),(5,3)" and "pe: s=1, .5, 0"; both axes run from 0.0 to 1.0)

Figure B.3: Extended Sympathy (two panels, "es: Yi=.5,1,2xYj, r=0, t=20" and "es: Yi=.5,1,2xYj, r=.10, t=20"; horizontal axis q, both axes run from 0.0 to 1.0)

Figure B.4: Most Likely Case (one panel, "likely example"; horizontal axis q, both axes run from 0.0 to 1.0)
Note: (a,d)=(.75,1.5), r=.05, t=10, yi=.75yj, s=.9

Appendix C

Data Sources for Chapter 2

The data used in this paper are all drawn from the Canadian G.S.S. (General Social Survey) of 1985. This survey covered 11,200 respondents in a stratified sample of the Canadian population. The advantages of the scope of the study include a large number of observations from a cross-section of the population (as opposed to medical personnel or persons in a very localized area). Each individual is surveyed only for his or her own health state. This means (1) the observation advantage is much smaller than it might appear and (2) assumptions will have to be made that (like) individuals have the same preferences so that a utility function over several points in morbidity space can be estimated.

Variables

Utility, the dependent variable in this analysis, is taken to be the satisfaction with health variable (exists on a four-point scale). This is an ordered response variable with no interval properties (the categories were not assigned numerical ratings). Response rates on this variable were high (99 per cent), although the data are highly skewed towards higher levels of satisfaction (the smallest cell, very dissatisfied, had 500 observations).

The independent variables are various health characteristics. These form the arguments of the utility function. This assumes utility is based on levels of characteristics and does not depend on expectations or differences from some reference level (see Wright [1985] for supporting evidence on this stance). These variables fall into three categories.

The first category is long-term activity limitation. These variables fit the very narrow definition of illness as physical limitation. There are four groups of variables pertaining to mobility (four), agility (three), sight (one), and hearing (one), as well as general activity limitation. These are multivariate binary variables: the ailment may be present or absent, and if present, moderate or severe. About 30 percent of the population reported some chronic ailment, so it should be expected that some cells will have few observations. While these four categories of ill-health are much more general than heretofore tested for in independence analysis, they are only a subset of those factors that affect satisfaction with health.
For analysis at this level to be valid, one has to make assumptions about the distribution of these excluded variables.

The second category of independent variables consists of short term ailments, measured by their impact on activities normally undertaken in a two week period (hence, these measures should be independent of the long term measures above). While the "loss days" provide a numerical measure of severity of illness, these variables are less appealing than those in the first category because (1) the loss function depends on other non-health characteristics and reflects need as well as health itself, and (2) days may be lost to varying degrees which cannot be numerically compared (bed days are worse than restricted days, but it is not known by how much). Only 6 percent of respondents reported any bed days, while only 7 percent reported any restricted days. While few cells are well represented, it is believed the data are sufficient to determine if short run ailments are separable from more permanent ones. This does not appear to have been tested for before.

The final possible category of variables is more social in nature, including such factors as number of visits and contacts with friends and relatives. These are consistent with the broader WHO definition of health and Torrance's classification scheme (1986). Unfortunately, emotional health is still absent. What needs to be determined is (1) whether these factors influence satisfaction over health, (2) if they do, whether it is in a fashion that would allow physical dysfunction to be analyzed in isolation, and (3) whether these factors can be combined in a piecemeal fashion with physical factors in broadly based QALY analysis.

Finally, age is used as a conditioning variable.

Construction of Variables

To estimate

$$U = \alpha + \sum_i \beta_i q_i^{0.5} + \sum_i \sum_k \beta_{ik}\, q_i^{0.5} q_k^{0.5}$$

the variables are defined:

dependent variable
  satisfaction with health
    construction: survey item 73(a)
    structure: ordered categorical (1=very satisfied, 2=satisfied, 3=dissatisfied, 4=very dissatisfied)
    response: 11088/11200 (61 non-respondents in this category were also non-respondents in the independent variable category)

independent variables
  chronic morbidity
    endurance (E)
      construction: poor health indicated (assigned a value of one) if poor health reported in any one of its factor components, otherwise good health indicated (assigned a value of zero), where the factor components are:
        survey item q(27) 400m walk
        survey item q(28) stair climb
        survey item q(29) 5 kg carry
        survey item q(30) standing
      structure: binary
    agility (A)
      construction: poor health indicated (assigned a value of one) if poor health reported in any one of its factor components, otherwise good health indicated (assigned a value of zero), where the factor components are:
        survey item q(31) bending
        survey item q(33) grasping
        survey item q(34) reaching
      structure: binary
    perception (P)
      construction: poor health indicated (assigned a value of one) if poor health reported in any one of its factor components, otherwise good health indicated (assigned a value of zero), where the factor components are:
        survey item q(35) seeing
        survey item q(36) hearing
      structure: binary
  short term morbidity (S)
    construction: poor health indicated (assigned a value of one) if any number of sick or bed days reported, otherwise good health indicated (assigned a value of zero). Factor components are:
      survey item q(13) bed days
      survey item q(17) activity limit days
    structure: binary
  social health (L)
    construction: poor health indicated (assigned a value of one) if the respondent had fewer than average contacts or visits (average being the mean number of visits or contacts in the sample) across a majority of contact groups (e.g.
    parents, children, siblings, friends, others); good health indicated (assigned a value of zero) if more than the average across most contact/visit groups. Factor components are:
      visits (mother q(107), father (112), children (116), siblings (121), other relatives (124), friends (127))
      contacts (mother q(108), father (113), children (117), siblings (122), other relatives (125), friends (128))
    structure: binary
    response rate: 11064/11200

conditioning factors
  age
    construction: survey item 44
    structure: ordered categorical (5 year age groups, 1=15-19 years, ..., 14=80 years and over)
    response: 11200/11200

Appendix D

Empirical Results for Chapter 2

Table D.1: Estimation Results

variable          all data     < 30 yrs     30-65 yrs    > 65 yrs
constant          -.12061      .033990      -.08887      -.47021
  (t-test)        (-7.5760)    (1.2451)     (-4.0536)    (-11.681)
endurance (E)     .55324       .70668       .57805       .70867
  (t-test)        (13.773)     (7.0033)     (9.4992)     (10.297)
agility (A)       .31733       .56996       .33122       .36632
  (t-test)        (6.001)      (4.4172)     (4.2674)     (3.8977)
perception (P)    .16695       .29324       .31818       .25607
  (t-test)        (3.5098)     (2.2423)     (4.1021)     (3.4210)
short (S)         .44632       .29584       .49361       .61696
  (t-test)        (10.096)     (4.2780)     (7.2285)     (5.1158)
social (L)        .054817      .048678      .021129      -.079966
  (t-test)        (1.2846)     (.61175)     (.37935)     (-.51763)
ExA               .15977       -.29063      .22885       .10513
  (t-test)        (2.4791)     (-1.5221)    (2.2870)     (1.0070)
ExP               -.10795      -.034529     -.15608      -.18840
  (t-test)        (-1.5324)    (-.15539)    (-1.2294)    (-1.8560)
ExS               .23008       -.11095      .25493       .21252
  (t-test)        (3.0861)     (-.64301)    (2.2306)     (1.4803)
ExL               .051348      -.16422      .14384       .050799
  (t-test)        (.52604)     (-.67486)    (1.1037)     (.23410)
AxP               .033642      -.20137      -.014383     .082248
  (t-test)        (.47541)     (-.77462)    (-.10467)    (.83324)
AxS               .11732       .20946       .049534      -.016288
  (t-test)        (1.5140)     (1.0354)     (.41348)     (-.12960)
AxL               -.14345      -.21466      -.24133      .20035
  (t-test)        (-1.3012)    (-.82644)    (-1.5902)    (.87907)
PxS               -.090396     .12764       -.041435     -.23521
  (t-test)        (-1.1311)    (.54995)     (-.28075)    (-2.0565)
PxL               .28658       .58248       .34481       .087443
  (t-test)        (2.4075)     (2.0517)     (1.7705)     (.42160)
SxL               -.15281      .0095979     -.17761      -.35416
  (t-test)        (-1.3523)    (.043895)    (-1.1314)    (-1.2877)
$\alpha_1$        1.4828       1.6859       1.5164       1.2962
  (t-test)        (81.057)     (43.354)     (54.981)     (39.667)
$\alpha_2$        2.3947       2.7675       2.4709       2.1416
  (t-test)        (79.262)     (33.047)     (52.248)     (45.104)
likelihood        -10693       -2624.7      -4814.3      -3131.1
d.o.f.            11064        2913         5062         3089

Tests

Table D.2: Additive Tests

          all data    < 30 yrs    30-65 yrs   > 65 yrs
ExA       2.48*       -1.52       2.29*       1.01
ExP       -1.53       -.155       -1.23       -1.86*
ExS       3.09*       -.643       2.23*       1.48
ExL       .526        -.675       1.10        .234
AxP       .475        -.775       -.105       .833
AxS       1.51        1.04        .413        -.130
AxL       -1.30       -.836       -1.59       .879
PxS       -1.13       .550        -.281       -2.06*
PxL       2.41*       2.05*       1.77*       .422
SxL       -1.35       .044        -1.13       -1.29

Note: t-test, degrees of freedom 11064, 2895, 5046, and 3071. "*" indicates significance at 5 per cent level.

Table D.3: Joint Additive Tests

                                all data    < 30 yrs    30-65 yrs   > 65 yrs
all characteristics (10 dof)    43.16*      9.82        24.87*      14.23
chronic-short-social (6 dof)    27.56*      6.83        13.33*      7.74
physical-social (4 dof)         8.99        5.17        7.80        2.56
chronic-social (3 dof)          7.06        5.17        6.21        1.61
chronic-short (3 dof)           21.38*      1.50        7.75*       5.10
within chronic (3 dof)          7.78        3.38        6.51        4.09

Note: Wald test.
Note: chronic = {E,A,P}, short = {S}, physical = {E,A,P,S}, social = {L}.

Table D.4: Multiplicative Tests (all data)

       ExP     ExS     ExL     AxP     AxS     AxL     PxS     PxL     SxL
ExA    4.62*   .001    .049    .033    .012    3.03    3.37    4.31*   2.16
ExP            4.88*   .760    1.06    3.76*   1.20    .001    6.08*   1.13
ExS                    .047    1.17    .016    1.99    3.08    4.84*   2.23
ExL                            .076    .058    1.52    .752    4.31*   1.85
AxP                                    .016    1.74    1.07    4.51*   1.78
AxS                                            1.96    2.38    4.30*   2.11
AxL                                                    1.13    5.83*   .058
PxS                                                            5.94*   1.03
PxL                                                                    6.81*

Note: Wald test, 1 degree of freedom.

Table D.5: Joint Tests

chronic-short      3.09 (3 dof)
chronic-social     6.57 (3 dof)
physical-social    5.03 (6 dof)
within chronic     (sing. matrix)

Note: Wald test.
Note: joint tests where the number of individual conditions exceeds the number of estimated parameters cannot be done.
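The asterisks in Table D.2 can be reproduced mechanically from the reported t-statistics. The sketch below is an illustration rather than the procedure used in the thesis: it assumes that the 5 per cent flags correspond to the one-sided large-sample normal critical value (about 1.645), which is inferred from the starred pattern (for example, 1.77 is starred while -1.59 is not) and is not stated in the note to the table. The 30-65 yrs column is used as the example.

```python
from statistics import NormalDist

# t-statistics for the 30-65 yrs column of Table D.2.
T_STATS_30_65 = {
    "ExA": 2.29, "ExP": -1.23, "ExS": 2.23, "ExL": 1.10, "AxP": -0.105,
    "AxS": 0.413, "AxL": -1.59, "PxS": -0.281, "PxL": 1.77, "SxL": -1.13,
}
# Interaction terms carrying an asterisk in that column.
STARRED_30_65 = {"ExA", "ExS", "PxL"}


def flag_significant(t_stats, level=0.05):
    """Return the terms whose |t| exceeds the assumed one-sided normal critical value."""
    critical = NormalDist().inv_cdf(1.0 - level)
    return {name for name, t in t_stats.items() if abs(t) > critical}


flagged = flag_significant(T_STATS_30_65)
```

With the conventional two-sided critical value (about 1.96) the PxL entry of 1.77 would not be flagged, so the choice of threshold matters for which additive separability conditions are treated as rejected.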
Table D.6: Multiplicative Tests (< 30 yrs)

       ExP     ExS     ExL     AxP     AxS     AxL     PxS     PxL     SxL
ExA    .254    .044    .344    .092    2.06    .120    .643    3.24    .009
ExP            .084    .407    .263    .525    .612    .295    3.82*   .003
ExS                    .368    .151    1.14    .601    .476    3.17    .006
ExL                            .248    .581    .054    .554    3.99*   .106
AxP                                    1.36    .454    .699    3.49    .016
AxS                                            .801    .005    2.48    .001
AxL                                                    .737    3.75*   .196
PxS                                                            2.96    .002
PxL                                                                    1.92

Note: Wald test, 1 degree of freedom.

Table D.7: Joint Tests

chronic-short      1.30 (3 dof)
chronic-social     4.77 (3 dof)
physical-social    5.04 (4 dof)
within chronic     (sing. matrix)

Note: Wald test.

Table D.8: Multiplicative Tests (30-65 yrs)

       ExP     ExS     ExL     AxP     AxS     AxL     PxS     PxL     SxL
ExA    4.45*   .119    .711    .749    .763    5.44*   1.64    2.41    1.26
ExP            4.01*   1.37    .187    1.37    2.21    .217    3.18    1.11
ExS                    .863    .981    .306    2.49    1.10    2.68    1.31
ExL                            1.03    .946    2.86    1.17    1.40    2.18
AxP                                    .086    2.24    .005    2.71    1.16
AxS                                            2.50    .201    2.51    1.26
AxL                                                    2.17    4.70*   .387
PxS                                                            2.98    1.15
PxL                                                                    3.81*

Note: Wald test, 1 degree of freedom.

Table D.9: Joint Tests

chronic-short      1.22 (3 dof)
chronic-social     5.41 (3 dof)
physical-social    (sing. matrix)
within chronic     (sing. matrix)

Note: Wald test.

Table D.10: Multiplicative Tests (> 65 yrs)

       ExP     ExS     ExL     AxP     AxS     AxL     PxS     PxL     SxL
ExA    4.39*   .016    .094    .151    .407    .486    5.97*   .199    1.25
ExP            4.78*   .001    2.07    1.81    .514    .238    .112    1.67
ExS                    .102    .043    .520    .811    5.56*   .205    1.35
ExL                            .132    .046    .350    .025    .096    1.26
AxP                                    .599    .812    3.23    .221    .872
AxS                                            .733    1.91    .169    1.53
AxL                                                    .409    .031    1.84
PxS                                                            .085    1.79
PxL                                                                    .934

Note: Wald test, 1 degree of freedom.

Table D.11: Joint Tests

chronic-short      5.71 (3 dof)
chronic-social     .415 (3 dof)
physical-social    2.31 (3 dof)
within chronic     (sing. matrix)

Note: Wald test.
Table D.12: Multilinear Tests

            all data    < 30 yrs    30-65 yrs   > 65 yrs
(E,A),P     .217        .734        .155        .000
(E,A),S     2.27        .582        2.53        3.32
(E,A),L     5.11+       .294        6.20+       .000
(E,P),A     3.36        .842        4.07+       1.38
(E,P),S     9.52+       .000        3.73        10.04+
(E,P),L     .981        1.27        .293        .184
(E,S),A     1.16        .495        2.30        1.19
(E,S),P     .720        .014        .010        .656
(E,S),L     3.94+       .155        3.56        1.55
(E,L),A     2.01        .346        .183        .556
(E,L),P     .866        .274        .120        .545
(E,L),S     2.23        .016        .196        .098
(A,P),E     15.15+      .015        6.09+       9.85+
(A,P),S     4.72+       .241        .235        1.84
(A,P),L     2.97        1.62        5.13+       .211
(A,S),E     3.61        .302        .915        7.11+
(A,S),P     .082        .112        .581        .021
(A,S),L     1.15        .006        1.86        .213
(A,L),E     .037        .314        .064        .721
(A,L),P     .775        .285        .117        .945
(A,L),S     1.97        .056        .182        .109
(P,S),E     44.87+      .073        13.59+      31.96+
(P,S),A     12.65+      1.75        1.98        2.74
(P,S),L     .220        2.52        1.46        .191
(P,L),E     .221        .343        .098        .223
(P,L),A     1.43        .273        .169        .727
(P,L),S     .915        .054        .172        .161
(L,S),E     .050        .316        .068        .729
(L,S),A     1.91        .489        .174        .512
(L,S),P     .882        .260        .117        .492

Table D.13: Joint Tests

chronic-short (3 dof)      11.16+    4.26     5.26      12.26+
chronic-social (3 dof)     10.41+    2.20     7.25      2.20
physical-social (6 dof)    17.08+    10.25    17.47+    140.86+
w/in chronic (3 dof)       27.72+    1.31     12.35+    13.82+

Note: Wald test. One degree of freedom for single tests. "+" indicates test is significantly different from zero, but sign is consistent with WDI.
Simulated QALY Values

Table D.14: Simulated Holistic QALY Values

q       cs      sg      tto     es      pe
q̄       1.000   1.000   1.000   0.643   1.000
E       0.759   0.912   0.653   0.446   0.780
A       0.868   0.965   0.796   0.530   0.880
P       0.933   0.987   0.891   0.584   0.939
S       0.809   0.939   0.717   0.484   0.827
L       0.979   0.997   0.964   0.623   0.981
ExA     0.514   0.708   0.393   0.279   0.549
ExP     0.733   0.895   0.619   0.425   0.753
ExS     0.404   0.572   0.295   0.212   0.442
ExL     0.707   0.879   0.592   0.408   0.732
AxP     0.776   0.922   0.674   0.458   0.796
AxS     0.594   0.789   0.470   0.330   0.625
AxL     0.907   0.979   0.851   0.562   0.916
PxS     0.773   0.920   0.671   0.457   0.793
PxL     0.780   0.924   0.680   0.462   0.800
SxL     0.854   0.960   0.776   0.519   0.868

Aggregated values

Table D.15: Aggregated Values (cs)

q       holistic/multilinear    additive    multiplicative
ExA     0.514                   0.627       0.564
ExP     0.730                   0.692       0.660
ExS     0.404                   0.568       0.479
ExL     0.707                   0.737       0.727
AxP     0.776                   0.801       0.784
AxS     0.594                   0.677       0.628
AxL     0.907                   0.846       0.841
PxS     0.773                   0.742       0.717
PxL     0.780                   0.911       0.909
SxL     0.854                   0.788       0.780

Note: λ = 1.944.

Table D.16: Aggregated Values (sg)

q       holistic/multilinear    additive    multiplicative
ExA     0.708                   0.877       0.805
ExP     0.895                   0.899       0.872
ExS     0.572                   0.851       0.726
ExL     0.879                   0.909       0.904
AxP     0.922                   0.952       0.941
AxS     0.789                   0.904       0.855
AxL     0.979                   0.962       0.960
PxS     0.920                   0.926       0.907
PxL     0.924                   0.984       0.983
SxL     0.960                   0.937       0.933

Note: λ = 23.415.

Table D.17: Aggregated Values (tto)

q       holistic/multilinear    additive    multiplicative
ExA     0.393                   0.449       0.445
ExP     0.619                   0.544       0.542
ExS     0.295                   0.370       0.364
ExL     0.592                   0.617       0.616
AxP     0.674                   0.686       0.685
AxS     0.470                   0.512       0.509
AxL     0.851                   0.759       0.759
PxS     0.671                   0.607       0.606
PxL     0.680                   0.854       0.854
SxL     0.776                   0.680       0.679

Note: λ = 0.05469.
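As a hedged illustration (not taken from the thesis itself), the additive and multiplicative columns of the aggregated-value tables appear to be reproducible from the single-state holistic values in Table D.14 by aggregating on the disutility scale, d = 1 - value. The function names below are hypothetical, and the two-state multiplicative form d1 + d2 + λ·d1·d2 is an assumption inferred from the tabulated numbers. Differences in the third decimal are to be expected because the published inputs are rounded.

```python
# Sketch: aggregate two single-state values into a joint value.
# Assumption: aggregation is done on the disutility scale d = 1 - value.

def additive(v1, v2):
    """Additive aggregation: joint disutility is the sum of the parts."""
    return 1.0 - ((1.0 - v1) + (1.0 - v2))


def multiplicative(v1, v2, lam):
    """Two-state multiplicative aggregation with interaction parameter lam."""
    d1, d2 = 1.0 - v1, 1.0 - v2
    return 1.0 - (d1 + d2 + lam * d1 * d2)


# cs values for states E and A from Table D.14, and the lambda
# reported in the note to Table D.15.
E_CS, A_CS, LAMBDA_CS = 0.759, 0.868, 1.944

print(round(additive(E_CS, A_CS), 3))                   # compare with ExA, additive column
print(round(multiplicative(E_CS, A_CS, LAMBDA_CS), 3))  # compare with ExA, multiplicative column
```

Setting lam to zero collapses the multiplicative rule to the additive one, which is why the two columns are close whenever λ or the product of the two disutilities is small (as in the tto table).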
Table D.18: Aggregated Values (es)

q       holistic/multilinear    additive    multiplicative
ExA     0.279                   0.333       0.259
ExP     0.425                   0.387       0.348
ExS     0.212                   0.287       0.182
ExL     0.408                   0.427       0.414
AxP     0.458                   0.471       0.449
AxS     0.330                   0.372       0.312
AxL     0.562                   0.511       0.504
PxS     0.457                   0.425       0.394
PxL     0.462                   0.565       0.561
SxL     0.519                   0.465       0.455

Note: λ = 3.361.

Table D.19: Aggregated Values (pe)

q       holistic/multilinear    additive    multiplicative
ExA     0.549                   0.660       0.590
ExP     0.753                   0.719       0.684
ExS     0.442                   0.607       0.505
ExL     0.732                   0.761       0.749
AxP     0.796                   0.820       0.800
AxS     0.625                   0.707       0.652
AxL     0.916                   0.861       0.855
PxS     0.793                   0.766       0.738
PxL     0.800                   0.920       0.917
SxL     0.868                   0.807       0.798

Note: λ = 2.661.

Measures of distortion

Note: The measures of distortion do not take into account whether the calculated differences are statistically significant or not. The distortions are measured from the holistic value for the instrument indicated at the top over all fifteen morbid states.

Table D.20: Distortions from cs Values

aggregation       cs      sg      tto     es      pe
holistic          0.000   1.962   1.345   4.962   0.296
additive          0.744   2.618   1.436   4.833   0.871
multiplicative    0.583   2.302   1.455   5.198   0.689

Table D.21: Distortions from sg Values

aggregation       cs      sg      tto     es      pe
holistic          1.962   0.000   3.307   6.925   1.667
additive          1.611   .735    3.250   6.795   1.382
multiplicative    1.911   .503    3.269   7.160   1.652

Table D.22: Distortions from tto Values

aggregation       cs      sg      tto     es      pe
holistic          1.345   3.307   0.000   3.618   1.640
additive          1.706   3.962   .712    3.488   1.994
multiplicative    1.416   3.647   .702    3.854   1.655

Table D.23: Distortions from es Values

aggregation       cs      sg      tto     es      pe
holistic          4.962   6.925   3.618   0.000   5.258
additive          5.314   7.580   3.675   .479    5.612
multiplicative    5.013   7.265   3.655   .446    5.272

Table D.24: Distortions from pe Values

aggregation       cs      sg      tto     es      pe
holistic          .296    1.667   1.640   5.258   0.000
additive          .699    2.322   1.692   5.128   .707
multiplicative    .572    2.007   1.711   5.494   .525

Appendix E

Proofs to Chapter 3

Proof Lemma 1 (HK): To be exact, $w(q)L^{-}(u,p,w(q),q)$ cannot depend on the variable
arguments of U (this would make HK a non-monotonic transform of U). Thus, q must be the same for all states compared, i.e. $V(p,w(q),w(q)T+I,q) = V(p,w,wT+I)$. Then $u^A > u^B \leftrightarrow V(p,w,wT^A+I) > V(p,w,wT^B+I) \leftrightarrow wT^A+I > wT^B+I$. By duality relationships: $L^{-}(u,p,w) = L^{-}(V(p,w,wT+I),p,w) = L(wT+I,p,w)$. Then exactness requires $L(wT^A+I,p,w) > L(wT^B+I,p,w) \leftrightarrow u^A > u^B$. This requires that $\partial L/\partial I > 0$ (i.e. labour is a normal good).

This result is unchanged by the presence of leisure constraints (as can be seen if the leisure constrained indirect utility function is substituted for the unconstrained indirect utility function above). However, exactness does not hold if the set of binding constraints changes because this makes HK a function of the variable arguments of U.

In the case where labour is constrained, the proof proceeds as above, except $T_L$ replaces T. No restriction on income elasticities is required: the transform is positive since, ceteris paribus, utility increases with the constraint because time can be better allocated. □

Proof Lemma 1 (WTP): WTP may be expressed:
$$ V(p,q^A,T^A,I-cv) = u^B. \qquad \text{(E.1)} $$
To be exact, cv must be positive whenever state A is preferred to state B. Let $u^A > u^B$. Then for (E.1) to hold, income must be changed in state A to reduce the utility of state A. By assumption, $\partial V/\partial I > 0$, so this requires that income be reduced, or alternately, cv > 0. Thus, this is exact. □

Proof Lemma 1 (HYE): HYE may be expressed:
$$ \Delta HYE = (T^A - m^A) - (T^B - m^B), \quad \text{where} \quad V(p,\bar{q},T^i - m^i,I) = u^i. \qquad \text{(E.2)} $$
To be exact, $T^i - m^i$ must be larger the larger is $u^i$. Let $u^i$ increase by $du^i$. Then the equality in (E.2) requires that $T^i - m^i$ be changed to increase the left hand side of the equation by $du^i$. Since V is assumed to be increasing in T (not unreasonable at a reference morbid state of perfect health), this requires that $T^i - m^i$ increase. Thus, this is exact.¹ □

Proof Lemma 2: Consider a project affecting longevity alone (i.e. $(q,T^B) \to (q,T^A)$).
The true willingness to pay in time units is given by U(q,T A —m)=U(q,T B )which is, of course, m = TA — TB . The HYE approach instead sets U(q,T A )u(4,-,TA — mA) , or mA = 111 RSQT (4,TA )dq, and U(q,TB ) = U(4,T A — mB), orMB = 11IRS9 ,744,T B )dq, and the value of the health status change is mB — mA .Given that mB — mA = amRs,T d^(y921 BRAaT^q =^v7, + v,TT )dq, the HYE value is fallingas the absolute risk aversion with respect to time increases.Now consider a project affecting q alone (i.e. (qB , T)^(qA,T)). The truewillingness to pay in time units is given by U(qA , T — m) = U(qB ,T). The HYEapproach instead sets U(qA , T) = U(4,T — mA), or mA = AIRSqx(4,-,T)dqA , andU(qB ,T) = U(4,,T — mB ), or mB = MRSq ,T (a,T)dqB , and the value of the healthstatus change is mB — mA . Given that mB — mA = —MRSqT (4, T)( -4 - qA — q + qB )and that m = .111RSqT(qA,T)(qA — qa) , m > < mB mA as amaRqsqT < / > 0. But1 A. problem may arise if the wage schedule varies with time. In this case, the marginal valueof time varies, and as a result, HYE can be based on two different transformations of the utilityfunction where the transformation depends on the level of utility (i.e. is inexact).Appendix E. Proofs to Chapter 3^ 239since^)amRsiL^v , this implies the HYE value is falling as the absoluteat/^171zvg^p A risk aversion with respect to morbidity increases. EProof to Corollary 1: If E i HKi (Ti ) is consistent with a Bergson-Samuelsonsocial welfare function, thenxi,y)^w iL i (v i (TiB),w:L i (v i (TiA),^> (E.3)where x i are person-specific variables, y are variables held in common by all persons(prime denotes a change in value), and v(T) = V(p,w,wT + I , q).For the wage rate, this requires that w i (aL i /aw) + L i = 0 for all i. For otherelements of (x i , y), (3.20) requires L i (v(Ti ), x i , y) = b(y)(1 i (vi (Ti ))^ai (x i , y)). 
Butthis requires that —atiTR ^ Yet, when income rises, the level of consumption ofpurchased goods goes up and, normally the marginal value of leisure increases relativeto the marginal value of purchased goods (so more time is allocated to leisure and lessto labour). This does not happen only if leisure has no utility, which implies L" = T(the case where the marginal utility of purchased goods is zero can be ruled out sinceaLlaT > 0 by assumption). But if L" = T, 8L/8w = 0 and the wage constraintholds if and only if dw = 0 (w0 dwT = 0) and w i = w for all i.In the labour constrained case, u i = 27i(7-1,(qi )), so 71(qi) = cki (u i ). Then thehuman capital statistic becomes H K i (qi) = = wi fi(u i , x i , y). Then (3.20)requires that wi f i (u i , x i , y) = a(y)-yqui) bi (x i , y). But this implies (1) w i = w forall i, and (2) u t w" (gia)lyb)` ( ''Y ) . This means that T — 71(qi) does not enter utility(i.e. leisure has no utility; if it did, this representation of preferences implies thatutility would be decreasing in the constraint).Proof of Corollary 2: From (3.20), cv(qi , Ti ) = a(y)-yqui) bi(x i ,y). For themove from any health state to itself, this may be re-expressed as u i^a(y)/i + bi (x i , y),Appendix E. Proofs to Chapter 3^ 240where x i includes (qi ,Ti ), but not 4. Note that this applies to the global indirectutility function, V, and thus applies to all constrained and unconstrained cases. ^Proof of Corollary 3: From (3.20), HY E(qi ,Ti )^a(y)-y i (ui) bi (x i ,y). At(the reference health state), this may be re-expressed as u i a(y)Ti bi (x i , y).x i includes (Ii ), but not (qi ,Ti ). The argument for excluding qi is based on the as-sumption that all morbid states are indifferent in death because they are not actuallyexperienced. Hence, if Ti .,-- 0, then no utility from qi can be realized (bequests allowutility to be derived from wealth even after death, however). 
The restriction derivedfrom (3.20) then seems to imply that qi be the same for all individuals (i.e. an elementof y). But this restriction was derived when qi was set equal to the reference healthstate, which is fixed for all evaluations, and only applies to this reference health state.The marginal utility of time spent in the reference health state must be the same forall individuals (i.e. a year of perfect health is worth the same to everyone). Theoret-ically, any q may be chosen as the reference health level, so this restriction must holdfor any q. ^Appendix FProofs to Chapter 4Theorem 1:Necessity: If Rr is independent of the reference point, 4- , then(QA, TA )R.r(Q B , TB )r(fi(U 1 (e, tt), 4), ..., fiv (uN(qpv , tk), 4)) >r(f1(U 1 (4,4'), 4),^f N (UN^47 ), ii))^(F.1)for all possible reference points, 4. If Rr is independent of q , then an arbitraryfixed reference point, 4, may be chosen from the set of reference points. Letfi (U i (qi ,t i ), = gi (u i ). Substitute this into (F.1) to obtainr(gi (ut ), g N (uk)) > 11(g ju in, g N (4)).^(F.2)Then there exists some function, W(u i , ...uN)^r(gi(ui),...,gN(uN)), such that(QA , TA )rer (QB , TB) 4_4 w(ut ,^> w(uBi^(F.3)Sufficiency: If Rr is consistent with a Bergson-Samuelson social welfare function,it must be independent of the reference point, i.e.(QA , TA )zr (QB , TB) 4_4 w(ut, ...uN) > w(uBi _4j^(F.4)Theorem 2: Additively separable case.241Appendix F. Proofs to Chapter 4^ 242From the proof of Theorem 1, it is possible to obtain(QA, TÄ )Rr (QB , TB)w (uAi^) w(usi^)E 02(gi (4)) > E oz(gi (up))E oz(fi (uf ,^= E Oz(fi (u? 3 ,0)for all q. This means thatE ot(fi(ui, q)) = (NE 02(gi(ui)),where 4 is increasing in its first argument.Define zi = Oi (gi (ui )), and hi (zi,)^Oi (fi (u i ,q-)) where z i is a continuous variableand h i is increasing. 
Substitute this into (F.6),E^= (1)(Ez i3 O.^ (F.7)Because each h i is continuous in z i , 4 is continuous in its first argument and, for eachq, (F.7) is a Pexider equation with the solution= a(q)zi^bi (q) (F.8)for all -4 (see Eichhorn [1978], Theorem 3.1.5). This implies^ckl(fi (u i , (I)) = a(4)02 (g i (u i )) + bi(4)^(F.9)or thatOi(Mi(qi,ti,q)) = a(4)0i (gi (U i (qi ,t i )))^ (F.10)Set qi = q in (F.10). By definition, fi (u i ,q) = t i at this value of q (i.e. no t isgiven up for the move from q to 4-). Thus (F.10) becomes(F.5)(F.6)O i (t i ) = ei(qi )0i (gi (U i (qi ,t i )))^bi (qi )^(F.11)Appendix F. Proofs to Chapter 4 243Orti))) = a(qi )0i (t i )^bi (qi ),where a(qi ) = 1/ä(qi ) and bi (qi ) = —bi (qi )/(i(qi ). It follows thatti) = eb i (a(qi )0i (ti )^bi (qi )),(F.12)(F.13)where I is increasing. This implies (4.13).Sufficiency:Suppose that (4.13) holds. Thenmi = M i (qi , t i , 4-) = oi[a(%)0i(ti) (g)bi(qi) — bi (4) }a(where O i is the inverse of the function 0 i ) andE^Ei [a(qi)0i (t i) bi(qi )] — E i bi(q)a(4)so that(F.14)(F.15)(QA ,TA )R,r (QB ,TB )^E[a(e)oi(t.iii )+ bi (e)] > E[a(qr)0 1 (tr)-Fbi(e)] (F.16)which makes R.r independent of 4. ^Theorem 3: Additive CaseFrom Theorem 2, independence for any additively separable case requiresU i (qi ,t i )^a(qi )t i^baqi )•^ (F.17)Set t i = 0. Then Condition N requires that b i (qi ) be independent of qi and (4.15) isimmediate.Sufficiency follows from Theorem 2. ^Appendix F. Proofs to Chapter 4^ 244Theorem 4: Cobb-Douglas case.Combining the results of Theorem 1 with equation (4.16), if R,r is independent of4, , then(QA , TA )7Z,r(QB , TB )Mut, > llfauli3, 4)11 gi (4) > H gi (u/i3)^> fJ 43,^ (F.18)where gi (ui ) = fi (ui ,q) (4 being a fixed reference value of q), and z i = gi (u i ) for alli = 1,...,N. 
Defining h i (zi , 4) = fi (u i , 4), it follows thatll h i (zi , (I) = 0(11 z i , 4)•^(F.19)For every -4, this is a Pexider equation with the solutionh i (zi ,^= bi (4)t7(4)^(F.20)(see Eichhorn [1978], Theorem 3.5.5). Setting qi = q as in Theorem 2 yieldst i = bi (qi )[gi (udia ( qi) .^ (F.21)Rearranging,ui = U i (qi, t i )^bi (qi )t7 (gi) ,^(F.22)wherea(qi) = 1 /a(qi)^ (F.23)andbi (qi ) = b(qi ) -1 /a (qi) .^ (F.24)Sufficiency is obvious from inspection. ^Appendix F. Proofs to Chapter 4^ 245Theorem 5:If Rr is independent of 4,(QA TÄ )7Zr (QB TB )>4-> min{ fi (u11 , q), fN (uk ,^> min{fi (uBi. ,^fN(ug, 4)}4-> minIg i (uAl ),...,gN (uk)} > minIg i (uBi ),...,gN (ug)}4-4 min{4 , 4[1 } > (F.25)where g i (ui ) = fi (u i ,4), q is some fixed reference value of 4, and zi = gi (ui ) for alli =^N. Defining h i (zi , -4) = fi (u i , q), it follows thatminfhi(zi, 4),^h N (zN , q)} = F(nin{ ,^Z N}, 4).^(F.26)This requires^ i (zi , 4) = hgzk , 4)^zi^zk . (F.27)The range condition ensures that there are z's satisfying (F.27) at every level. There-fore h i = hk = h for all i, k, and(qi , ti), 4) = h i (gi (u i , 4.)) = h(gi (ui ), 4).^(F.28)Setting^qi as in Theorem 2, the following is obtainedti = h(gi (u i , gi ))^(F.29)and, therefore^u i =gi-1 (v(qi ,t i ))^v(qi ,t i )^(F.30)for some function v. This implies (4.19).Appendix F. Proofs to Chapter 4^ 246Sufficiency is immediate. ^Theorem 6 (Blackorby and Donaldson [19881): ConcavityRecall that m is defined by the followingm = minft/U(4,m) > U(q,t) = u}.^(F.31)M is a dual function of the utility function, albeit based on time rather than income.Thus, M resembles the cost function in standard economic theory and the proof byBlackorby and Donaldson is easily modified for this case.Equation (4.20) requires that preferences be homothetic in t. Suppose they arenot. Then there exist two health states, (q° ,t ° ) and (q 1 ,t 1 ), such that U(q° ,t ° ) =u( q1 , 0) . 
This impliesM(q0, tO , (1)^(q1 tl,^ (F.32)If preferences are not homothetic in t, then there exists A > 0 such that U (q° , )t ° ) >U(q 1 , )1O), implyingM(q°, At°, > M(q 1 , At', 0^(F.33)for all 4. Note that A may be chosen to be less than one (the inequality continues tohold).Now choose^q° so thatFrom (F.32),andFrom (F.33),/1//(q° , t ° , g.° ) = to .M (q° ,t ° , q° )^114- (q 1 , , q ° ) = t °AM(q l ,t 1 ,q° ) = At ° .m(qo ,At o ,go ) >(F.34)(F.35)(F.36)Appendix F. Proofs to Chapter 4^ 247By the definition of a minimum, (F.31) yieldsAt ° > 111 (q° , At ° , q° ).^ (F.37)Equations (F.35), (F.36), and (F.37) implyAM (q i ,t 1 , q° ) > M (q 1 ,^, q° ).^(F.38)By invoking Condition N, the following is obtainedtl1 (g l , 0, q ° ) = M (q 1 , a0, q ° ) = 0 = AO.^(F.39)Equations (F.38) and (F.39) combine to yieldM (q l ,^+ (1 — a)0, q ° ) =^(q l , At', q ° )< AM (q l , t i q° ) (1 — A)M(q 1 ,0,q° ) = A (q 1 , t1, q ° )^(F.40)which contradicts the definition of concavity.Sufficiency:If U (q,t) = 0.(a(q)t b(q)), then m(u, 4)^b(q), which is concave (non-strictly) in t. 1:3Lemma 1: Additivity of PreferencesIn the two period case (which can be generalized), additivity requires(p ( 9,1 + 9,21t1^t2 )(t1^t2)^(p(q11t1)(t1)^(100721t2 ) (t2)^(F.41)(where cp denotes the QALY function and the superscripts represent different timeperiods). This requiresU(q,t)^0(a(q)t)•Necessity:Appendix F. Proofs to Chapter 4^ 248Recall thatU (q, t) = U (q, m),^ (F.42)U (q, t) = U (m), (F.43)U (U (q,t)) = m^ (F.44)Since cp(0) = U -1 (U(q,t))/t, yo(qjt)t'^U (U (q,t)) if t = t'.Then the additivity condition becomes^U-1(u( q l , 17 2 ,^t2))^U-1(u(ql, t l))^U1(u(q2, t2)) .^(F.45)Set q 1 = q 2 and incorporate as a parameter:f(ti^t2)lt l ) + f(t2) .^(F.46)This is a Cauchy equation with the solution: f (t) = at + b. For each q there existsf (t; q) = a(q)t b(q). 
Since f (q, t) = U -1 (U (q,t)) = ck(U (q, t)),U (q, t) = 0.(a(q)t^b(q)).^ (F.47)
However, since f (q, 0) is normalized to be zero, regardless of q, b(q) = 0 for all q.
Sufficiency:
If U(q,t) = q(a(q)t), then cp(q1t) = aa (grirt . Summing over two periods:^ta(q1)ti (t i )^a(q2)t2 ( 2 ) = a(q1)t1^a(q2)I2 (t 1^t 2 ).a(4)0^a(q)t2a(q)(t1^t2) (F.48)
Lemma 4: Time Independence:
Necessity:
Time independence requires that cp(q 1 1t 1 )t i = cp(q 1 1t 2 )t i ,^ (F.49) or that cp(q) be the same for all values of t (i.e. A(AT ) =^for — for any positive value of A). From the definition of cp, this requires U(q, ) t) = U(4, Am)^ (F.50) for all A > 0. Set A = 1/t. Then the problem becomes U(q, 1) = U(q) = U(q,m/t) = U(m/t).^(F.51) Taking the inverse of U, U-1(0 (q)) = a(q) = (m/t)^ (F.52) or m = a(q)t But m = ck(U(q,t)), so this implies U(q,t) = 0(a(q)t).
Sufficiency:
If U(q,t) = 0(a(q)t), then m = a(q)t/a(q) and cp(q/t) = m/t = a(q)/a(q), which is clearly independent of LEI

Appendix G

Empirical Analysis to Chapter 4

G.0.1 Data Description

Data are extracted from four sources: (1) morbidity data are taken from the 1985 General Social Survey (G.S.S., Statistics Canada [1987]), (2) institutional data are taken from the 1986 Health and Activity Limitation Survey (H.A.L.S., Statistics Canada [1990]), (3) QALY values are taken from Torrance et al. (1982) (as reported in Drummond et al. [1987]), and (4) life expectancy data are taken from the 1985-87 life tables (Statistics Canada [1991]).

The G.S.S. covers a stratified sample of all Canadians age 15 and over (i.e. children are excluded), living in the ten provinces (i.e. persons residing in the territories are excluded), and not living in institutions.¹

The health characteristics surveyed in the G.S.S. depend on the age of the respondent, persons over 55 years of age being asked a supplemental set of questions.
Thus, two data sets are constructed: one includes all respondents but covers a smaller set of variables, the other includes only older respondents, but has a larger set of variables.

¹ Families with members over 65 years of age who were either living on Indian Reserves or were members of the Armed Forces are also excluded, although the population weights apparently account for these biases.

Variables chosen include

endurance=1 if a positive response to any of questions 27 (walk), 28 (climb), 29 (carry), or 31 (bend);
=0 otherwise.

endurance2=1 if a severe response to any of the above;
=0 otherwise.

role=1 if a positive response to question 37 (activity limitation);
=0 otherwise.

role2=1 if the response to question 151 is permanently unable to work;
=0 otherwise.

emotion=1 if the response to question 75 is unhappy;
=0 otherwise.

social=1 if, with respect to either contacts or visits, the respondent reports less than the average frequency of encounters in a majority of categories of types of people with whom such contacts can be made (questions 107, 108, 112, 112, 116, 117, 121, 122, 124, 125, 127, 128);
=0 otherwise.

hearing=1 if positive response to question 36 (hearing);
=0 otherwise.

sight=1 if positive response to question 35 (reading);
=0 otherwise.

perception=1 if severe response to either question 35 or 36;
=0 otherwise.
short=1 if positive response to question 17 (sick days);
=0 otherwise.

short2=1 if positive response to question 13 (bed days);
=0 otherwise.

agility=1 if positive response to question 33 (grasp) or 34 (reach);
=0 otherwise.

agility2=1 if severe response to either question 33 or 34;
=0 otherwise.

Additional variables for the over 55 group include

mobility=1 if positive response to question 87 (yardwork) or 91 (light housework);
=0 otherwise.

mobility2=1 if severe response to either question 87 or 91;
=0 otherwise.

selfcare=1 if positive response to question 103 (self care);
=0 otherwise.

selfcare2=1 if severe response to question 103;
=0 otherwise.

Analysis includes the comparison of the above variables with reported satisfaction (question 73). Stratification variables include province, sex, and age. The weight variable is used in the calculation of averages. Observations with inadmissible responses to any of the above (e.g. not stated, did not know, no opinion) are omitted. These comprise less than .5 per cent of the sample. The data set defined over all age groups has 10739 observations, while the data set defined over the over-55 group has 6688 observations.

G.0.2 Calculation of Satisfaction Estimates

The results of the second chapter suggest a multiplicative functional form should be chosen for estimation.²

Satisfaction with health is regressed on the health variables described above. This assumes that all individuals in the sample assign the same meaning to each of the categories (e.g. that very satisfied is very satisfied to everyone). Probit analysis is used because the dependent variable is categorical. This is done with the non-linear procedure in SHAZAM, using the log-density for an ordered probit. Linear estimates are used as starting values, although the final estimates are robust to the choice of initial values.
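The ordered probit log-density mentioned above can be sketched as follows. This is a minimal Python illustration rather than the SHAZAM code used in the thesis, which is not shown in the source. The function names, the toy data and the convention that the first threshold is fixed at zero (so that a constant can be estimated) are assumptions made for the example.

```python
import math


def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def category_probs(index, cuts):
    """Probability of each ordered category given a latent index.

    cuts are the ascending thresholds; category j is observed when the
    latent variable (index plus standard normal noise) falls between
    consecutive thresholds.
    """
    bounds = [-math.inf] + list(cuts) + [math.inf]
    return [norm_cdf(hi - index) - norm_cdf(lo - index)
            for lo, hi in zip(bounds[:-1], bounds[1:])]


def ordered_probit_loglik(beta, cuts, X, y):
    """Sum of log-densities; y holds category codes 0..len(cuts)."""
    total = 0.0
    for row, cat in zip(X, y):
        index = sum(b * x for b, x in zip(beta, row))
        p = category_probs(index, cuts)[cat]
        total += math.log(max(p, 1e-300))  # guard against log(0)
    return total


# Hypothetical toy data: a constant plus one ill-health dummy.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 0.0]]
y = [0, 2, 1]
print(ordered_probit_loglik([-0.1, 0.4], [0.0, 1.6, 2.6], X, y))
```

In practice the coefficients and the free thresholds would be chosen to maximise this log-likelihood with a numerical optimiser, starting from linear estimates as described in the text.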
The performance of the estimates based on the whole population is superior to that of the estimates based on the over 55 data set, even though the latter contains two extra explanatory variables.³ Thus, the all-inclusive data set is adopted.

² The same conclusion appears to be supported with this estimation, despite slight differences in the variable set considered. This conclusion is drawn from results obtained by imposing additivity on the equation estimated, but deleting observations with multiple health ailments. If these health ailments are additively separable, the resultant parameter estimates should not be significantly different from those based on the entire data set. If additive separability does not hold, then the parameter estimates will differ. (Use of such a procedure is necessary since it is impossible to estimate the equation with higher order terms.) Over all age groups, additive independence is imposed alternately on (1) all variables (no responses are deleted), (2) on the social variables only (all multiples except those involving social ill-health and one other variable are deleted), (3) on the social variables and within chronic ill-health (multiples within endurance, agility, and perception are allowed), and (4) on no variables (all multiple responses are omitted). LR tests based on the different parameter estimates indicate additive independence is violated in the first case (χ² = 216.40, with 13 degrees of freedom), marginally violated in the third (χ² = 20.55), but not violated by the second case (χ² = 2.154). These results mimic those found in Chapter 2, and the other structural results found there are assumed to apply here as well (e.g. multiplicative structures cannot be refuted). Because more observations entail more precise estimates, the second case is adopted for estimation (8407 observations are left). The performance of the parameter estimates from the third case is inferior at the stage when they must be linked to QALY values.

The data set used for estimation is then the one for all age groups with observations having more than one ill-health state (social ill health excluded) deleted. This assumes additive independence only of social ill health from the other ill health categories. The estimate associated with any other ill health variable is then a measure of the marginal disutility of moving to that health state from a state of perfect health and is free of any interaction effects from any co-morbid state (e.g. the parameter value for endurance measures the change in satisfaction, or rather the probability associated with reporting this level of satisfaction, when an individual moves from a state of perfect health to one characterized by some endurance impairment; it does not indicate the change in satisfaction when someone who also has a perception impairment acquires an endurance impairment; the two values coincide only if preferences are additively separable over endurance and perception). The estimates based on the general data set with additive independence imposed on social ill-health are reported in Table G.1. All parameter estimates have the expected sign (since ill-health should contribute to health dissatisfaction). Notice that, given how the severe variables are defined, the marginal probabilities associated with severe ailments are found by adding the coefficients for the ailment being present and severe.

The estimates are checked for variation across identifiable groups to verify the constancy of preferences. This is done by regressing calculated residuals against the independent variables after sorting observations by group characteristics.
³ In the latter case, the procedure sometimes fails to converge, the estimates are more sensitive to the inclusion of certain observations, and the estimates are occasionally of the "wrong" sign. This may be due to misspecification of the extra variables, multicollinearity with other variables, or age dependence in the estimates. Correlation statistics and the performance of the parameter estimates lend support to the latter two hypotheses.

Table G.1: Satisfaction Function Estimates (t-statistics in brackets)

variable      all        women      men
constant      -.107      -.093      -.120
              (-6.54)    (-4.13)    (-5.10)
endurance     .402       .337       .539
              (8.80)     (5.97)     (6.83)
endurance2    .077       .314       -.499
              (.440)     (1.52)     (-1.47)
role          .685       .627       .733
              (8.57)     (5.51)     (6.47)
role2         .260       .155       .476
              (1.40)     (.648)     (1.54)
emotional     .749       .650       .873
              (7.79)     (5.02)     (6.00)
social        .061       .034       .083
              (1.54)     (.589)     (1.54)
hearing       .144       .336       .032
              (2.31)     (3.33)     (.404)
sight         .191       .132       .272
              (1.88)     (.974)     (1.78)
perception    .078       -.027      .134
              (.283)     (-.068)    (.344)
short         .358       .425       .241
              (6.10)     (5.73)     (2.49)
short2        .259       .251       .254
              (4.07)     (3.11)     (2.40)
agility       .104       -.022      .279
              (1.05)     (-.168)    (1.82)
agility2      1.08       1.41       .293
              (2.58)     (2.79)     (.363)
a1            1.61       1.64       1.58
              (68.9)     (50.7)     (46.4)
a2            2.65       2.60       2.75
              (48.1)     (36.9)     (30.1)
LL            -7368.544  -3923.268  -3431.698

The set of stratification variables used is more limited than in Chapter 2, since the focus here is to establish whether preferences vary only across the groups whose health is being compared where such variability would affect the conclusions drawn. The stratification variables used are sex, age, and province. Tests indicate there is some variation in each group.
For age, significant variation occurs in the constant (as is expected since satisfaction with health rather than satisfaction with morbidity is reported and the former probably contains a longevity component that is related to age), but the relationships over the morbidity variables do not vary, suggesting preferences over morbidity are constant across age groups. Preference variation across provinces cannot be assessed because of insufficient sample size in the smaller provinces. Regional subaggregates reveal little variation. Preferences do vary by sex. This is confirmed by the re-estimation of the satisfaction equations by sex (see Table G.1). LR tests indicate the differences are statistically significant (χ² = 68.04 between the men and women with 16 degrees of freedom).

G.0.3 Calculation of Expected Satisfaction

The estimates from the probit equation for the general data set with additive independence imposed on the social ill-health variables are used to generate expected dissatisfaction (D(q)) for each single dimensioned health state (i.e. when only one characteristic indicates poor health). These are converted to utility values by the following negative affine transformation
$$ S(q) = \frac{D(q) - D(\text{all } q)}{D(\text{no } q) - D(\text{all } q)} = \frac{D(q) - 3.947410}{-2.44372} \qquad \text{(G.1)} $$
where S(q) is the estimated satisfaction associated with health characteristic q, D(q) is the estimated dissatisfaction, D(all q) is the dissatisfaction associated with the poorest possible health state, and D(no q) is the dissatisfaction associated with the best possible health state. For women, the transformation is
$$ S_F(q) = \frac{D_F(q) - 3.982166}{1.508255 - 3.982166}, \qquad \text{(G.2)} $$
whereas for men it is
$$ S_M(q) = \frac{D_M(q) - 3.777051}{1.498834 - 3.777051}. \qquad \text{(G.3)} $$
These satisfaction values are reported in Table G.2.

Table G.2: Marginal Disutilities (taken at perfect health)

variable      all     women   men
endurance     .089    .072    .130
endurance2    .107    .147    .009
role          .158    .141    .182
role2         .226    .180    .317
emotional     .174    .147    .221
social        .013    .007    .018
hearing       .015    .072    .007
sight         .040    .027    .063
perception    .075    .097    .104
short         .078    .093    .055
short2        .141    .153    .118
agility       .022    .001    .064
agility2      .292    .377    .139

To find the values associated with the multidimensional health states, the parameter restriction associated with multiplicative utility structures is invoked
$$ (1 + \lambda) = \prod_i (1 + \lambda\lambda_i) \qquad \text{(G.4)} $$
where $\lambda_i$ is the marginal probability associated with the worst level of characteristic i (i.e. endurance2, role2, social with emotional, perception (with both blindness and deafness), short2, and agility2).⁴ This is consistent with the procedure adopted by Boyle and Torrance (1982). The restriction is solved for $\lambda$. Although the restriction is a fifth order polynomial, only one root satisfying rationality restrictions was found for each equation (all data, men, and women). The only non-zero real root for the whole data set is -.8312363.

⁴ Note that this procedure may be somewhat biased since estimated rather than actual values are used. The unbiased version (in the two characteristic case) requires $1 + \lambda = 1 + \lambda(\lambda_1^e + \lambda_2^e) + \lambda^2(\lambda_1\lambda_2)^e$ (where "e" denotes the expectation operator), whereas the estimated version gives $1 + \lambda = 1 + \lambda(\lambda_1^e + \lambda_2^e) + \lambda^2\lambda_1^e\lambda_2^e$.
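The conversion in (G.1) and the restriction in (G.4) can be sketched in Python as follows. This is an illustration under stated assumptions, not the thesis's own computation: it assumes four satisfaction categories coded 1 to 4 with the first probit threshold fixed at zero, a coding that is inferred from the constants quoted in (G.1) rather than stated in the text. All function names are hypothetical, and the weights passed to `solve_lambda` in the usage lines are made-up values chosen so that the root can be checked by hand.

```python
import math


def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def expected_dissatisfaction(index, cuts=(0.0, 1.61, 2.65)):
    """Expected category (coded 1..K) for a given probit index.

    Assumption: first threshold fixed at zero, the other two taken from
    the "all" column of Table G.1.
    """
    # E[category] = K - sum_j Phi(cut_j - index), with K = len(cuts) + 1
    return (len(cuts) + 1) - sum(norm_cdf(c - index) for c in cuts)


def utility(index, d_all=3.947410, span=-2.44372):
    """Affine transformation (G.1) using the constants quoted in the text."""
    return (expected_dissatisfaction(index) - d_all) / span


def marginal_disutility(index):
    """Utility loss relative to the best state, which maps to roughly 1."""
    return 1.0 - utility(index)


def solve_lambda(weights, iters=200):
    """Non-zero root of (1 + lam) = prod(1 + lam * k_i), by bisection.

    Assumes at least two weights, each strictly between 0 and 1.
    """
    if len(weights) < 2:
        raise ValueError("need at least two weights")

    def f(lam):
        prod = 1.0
        for k in weights:
            prod *= 1.0 + lam * k
        return prod - (1.0 + lam)

    total = sum(weights)
    if abs(total - 1.0) < 1e-12:
        return 0.0                  # additive case
    if total > 1.0:
        lo, hi = -1.0, -1e-9        # root lies in (-1, 0)
    else:
        lo, hi = 1e-9, 1.0          # root is positive
        while f(hi) < 0.0:
            hi *= 2.0
    f_lo = f(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid < 0.0) == (f_lo < 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


constant = -0.107                             # "all" column of Table G.1
print(marginal_disutility(constant + 0.402))  # endurance; compare with Table G.2
print(solve_lambda([0.5, 0.25]))              # hypothetical weights; closed form gives 2
```

With the rounded coefficients from Table G.1 the first value comes out close to the endurance entry in Table G.2; small differences in the third decimal are to be expected because the published coefficients are rounded. The bisection bounds follow the usual multi-attribute restriction that the root lies in (-1, 0) when the weights sum to more than one and is positive when they sum to less than one.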
The only non-zero real root for the male equation is -.2345702, whereas for the female equation there are three non-zero real roots, but only one falls in the bounds dictated by multi-attribute theory: .2010460.

These values are then used to calculate a second-order approximation of a multiplicative utility function:
$$ U(q_1, \ldots, q_N) = \sum_{i}^{N} S(q_i) + \sum_{i}^{N} \sum_{j>i}^{N} \lambda S(q_i) S(q_j). \qquad \text{(G.5)} $$
Notice that the same root value of $\lambda$ is used regardless of the severity level of the characteristic (as is done by Torrance et al. [1982]) and that only mutually exclusive characteristics are combined in this way.

⁴ (continued) Since $(\lambda_1\lambda_2)^e = \lambda_1^e\lambda_2^e + \mathrm{cov}(\lambda_1, \lambda_2)$, the estimated version yields $\lambda(\lambda_1^e + \lambda_2^e - 1) = -\lambda^2((\lambda_1\lambda_2)^e - \mathrm{cov})$. Let $a = \lambda_1^e + \lambda_2^e - 1$ and $b = (\lambda_1\lambda_2)^e - \mathrm{cov}$. Solving for $\lambda$: $\lambda = (-a \pm a)/2b = 0, -a/b$. Since $\lambda = 0$ is the additive case, assume $\lambda = -a/b$. Differentiating with respect to cov gives $\partial\lambda/\partial\mathrm{cov} = -a/b^2$. Since $b^2 > 0$, $\partial\lambda/\partial\mathrm{cov} > / < 0$ as $a < / > 0$ (i.e. if the agent is risk averse, the bias is negative; if the agent is risk seeking, the bias is positive).

G.0.4 Linkage to QALY Values

Because the expected utility values calculated above are not linked to the value of life, against which they will eventually be compared, it is necessary to transform them so that they map into the QALY time trade-off interval. Since the time trade-off function is assumed to be a monotonic function of utility over morbidity conditioned on time, it is necessary to either match QALY and satisfaction time frames or assume QALYs are independent of time. Since the former cannot be assessed with the G.S.S. (no time dimension is given on the satisfaction response), it is necessary to assume time independence to proceed.

The monotonic function relating expected utility with QALYs is recovered by matching morbid states surveyed in the G.S.S. with those for which QALY values (obtained by time trade-off) exist. These values are taken from Torrance et al.
(1982) (found in Drummond et al. [1987]) and include

endurance = P2
endurance2 = P3
role = (R2+R4)/2
role2 = (R3+R5)/2 [5]
social = S2
emotional = S3
sight = H6
hearing = H3
perception = H8

Perfect health is matched to the absence of any reported health impairments for the tenth match.

Footnote 5: For both P and R, the G.S.S. provides only partial information (i.e. a subset of the characteristics actually used in the Torrance et al. valuation exercise). Since the correlation of the missing and observed characteristics is unknown, the values of P and R are chosen to most closely approximate the ordering of health states by expected utility. For state P, this requires assuming endurance and mobility are uncorrelated, whereas for R it requires that selfcare be inhibited for half the respondents who are role dysfunctional (only the one mid-point is considered).

The QALY values for the states above are derived from middle-aged parents living in and around Hamilton, Ontario. The linkage generates biased values if the preferences of these parents differ from those of the general Canadian population. To ascertain whether this is the case, preference variation between middle-aged Ontario residents and the Canadian population is tested using a Hausman-Wise (1978) residual test (the calculated errors for this group are regressed on the explanatory variables). No significant differences are found (no t-test on any of the explanatory variables is significant, and the overall F-test is only .629 (with 13 degrees of freedom), far below the critical level), so the linkage is performed on the parameter estimates for the whole G.S.S. sample.

The QALY and expected utility values associated with these states are then used to uncover the relationship between the two value scales. The linkage is difficult to make since both the functional form and the parameters must be estimated and there are only ten common observations, all skewed towards perfect health.
Four functional forms are evaluated: a basic linear, a translog (first and second order), a quadratic (first and second order) and a Box-Cox. In both cases where they are tried, the performance of second order approximations is found to be inferior to first order approximations (t-tests indicate that none of the second order terms is statistically significant, and the second order approximations are far more sensitive to the inclusion or exclusion of matched observations). The non-linear forms, the Box-Cox and the double log (first order translog), dominate the linear forms (the linear and first order quadratic). Of the two non-linear forms, the double log is chosen because (1) the Box-Cox estimates are less stable, especially when observations are omitted, (2) model selection statistics are generally better for the double log, and (3) the double log provides the intermediate values of all the functional forms tried (plots of the Box-Cox, double log, and linear functions are virtually identical in the range where data observations exist, and differ only in the extremely bad states; the Box-Cox gives the highest values to these states, the linear the lowest). [6]

The logarithmic transformation used is $\phi(q) = \sigma(q)^{1.4321}$, where $\sigma(q)$ is the satisfaction value of state q and $\phi(q)$ the corresponding QALY value (standard error on the parameter is .20840).

Because the preferences of men and women for different health states apparently differ, this procedure has to be repeated for men and women separately. Unfortunately, the mean QALY values reported are not broken down by gender. This creates the potential for two sorts of bias: first, a linear average of the estimated QALY may misrepresent the true average QALY and, second, the weights used in the two value indexes may differ.

The Case Against Common Transforms

Consider the case where mean QALYs are regressed against average utility, but the proportion of men and women in the two groups is the same, expressed as $a_M$ and $a_F$ respectively.
If the true relationship is described by f, then the regression involves

$$a_M \phi_M + a_F \phi_F = f(a_M u_M + a_F u_F) \qquad (G.6)$$

$$a_M f_M(u_M) + a_F f_F(u_F) = f(a_M u_M + a_F u_F). \qquad (G.7)$$

But this is a Cauchy equation with the solution

$$f_i(u_i) = a u_i + b. \qquad (G.8)$$

Suppose this is not the case, but that f is in fact strictly convex (as the estimated coefficients suggest). Then

$$f(a_M u_M + a_F u_F) > a_M \phi_M + a_F \phi_F \qquad (G.9)$$

(i.e. the estimated QALY overstates the true average QALY, for any proportion, regardless of which gender has a higher utility for any particular state). The greater the difference in men's and women's preferences, the greater is this bias.

Footnote 6: This decision is supported by Torrance (1976b), who found that time trade-off values relate to category scale values (which resemble the G.S.S. responses in design) by a logarithmic function.

The Case Against Specific Transforms

Conversely, consider the case where the proportion of men versus women differs between the two groups. Even if QALYs and utilities are cardinally related, bias is present whenever preferences differ (the estimated QALY is biased upwards if the utilities are based on fewer men than the true QALY and men value the state more than women, et cetera). Since Torrance et al. (1982) do not provide a breakdown by sex of the respondents, it is not possible to assess the impact of this situation. For the purposes here, it is assumed that their random sample of parents reasonably approximates the proportions found in the G.S.S. after weighting. This does pose a problem when the specific transforms are calculated, since the proportional representation on the utility side is (0, 1), while it is closer to 50-50 on the QALY side. In this case, the bias depends entirely on the differences in utility values between the sexes.

Since it is not clear that either of the above two methods generates less biased transforms, both are estimated.
It is obvious that both methods generate some bias (although the specific transform case probably generates less than the common transform case).

As for the grouped data, the logarithmic form is found to perform the best for both the male and female equations. Parameter estimates do differ:

$$\phi_T(q) = \sigma_T(q)^{1.4321}, \qquad \phi_M(q) = \sigma_M(q)^{1.1568}, \qquad \phi_F(q) = \sigma_F(q)^{1.7578}$$

(with standard errors on the parameters of .20840, .19595, and .22186 respectively).

G.0.5 Comment on Institutional Data

The most serious omission from the G.S.S. as a source of health status data is its exclusion of institutionalized persons. Data on the disabled population residing in institutions are available for Canada and the provinces by age and sex in the H.A.L.S. (1990). Three adjustments are necessary before these data can be incorporated into the above morbidity data.

First, the groups excluded from the G.S.S. (those under 15 years and those residing in the two territories) have to be deleted from the H.A.L.S. figures.

Second, because the H.A.L.S. is based on a survey taken the year after the G.S.S., the population figures have to be scaled down to 1985 levels. This is done by taking the population base of the G.S.S. (total number of observations in the sample multiplied by the average weight: 11200 x 1756) and dividing by the total population in the H.A.L.S. (after the above deductions have been made). This figure is used to scale down the H.A.L.S. data (this assumes a constant proportional relationship between the 1985 and 1986 populations under 15 years, residing in the territories and living in institutions). These adjusted populations give the appropriate weights to be used when the institutional data are combined with the G.S.S. morbidity data.

The third adjustment is to link the morbid state (being institutionalized) to the QALY value scale. Since these states are external to the G.S.S. preference function, the procedure used for the G.S.S. states cannot be followed.
Instead, values for hospital confinement found in Sackett and Torrance (1978) are used. The value for the state depends critically on the duration of the state, which is not available in the H.A.L.S. (it being a point-in-time survey like the G.S.S.). It is assumed the average duration must lie between 3 months (a QALY value of .56) and 8 years (a QALY value of .33). These values, combined with the population weights above, are then incorporated into the QALY averages.

G.0.6 Calculation of QALY Averages

QALY averages are calculated from the estimated QALY values, using the G.S.S. weights (and the institutional weights when the H.A.L.S. is incorporated). In the G.S.S. data set, this is done by five year age groupings, but when combined with the H.A.L.S. data set, only ten year demarcations are available. Because morbidity appears to increase with age, but the number of people alive does not, less severe morbid states receive a higher weight in these averages than the more severe states. This bias worsens the more years on which the average is based. Thus, the 5 year groupings should give a lower average morbidity level than the 10 year groupings, although both are biased upwards.

Because of data limitations (longitudinal data are unavailable), arithmetic averages are used above. These mask the degree of inequality in the population's morbidity and health statuses. Although morbidity averages are misleading, because they do not incorporate length of life, it is still interesting to observe how inequality aversion affects the results. For the adult, non-institutionalized population, over all ages, the arithmetic mean of estimated QALYs (based on common preferences) is .917 (this reflects the position of inequality neutrality). The geometric mean (reflecting Cobb-Douglas social preferences and some inequality aversion) is .903 and the minimum value (reflecting Rawlsian preferences) is .203.
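The three summary measures just described (arithmetic mean, geometric mean, and minimum) can be illustrated with a minimal sketch. The QALY values and population weights below are hypothetical and are not the G.S.S. figures; the function name `weighted_means` is likewise an assumption made for illustration only.

```python
import math


def weighted_means(qalys, weights):
    """Return (arithmetic mean, geometric mean, minimum) of QALY values.

    qalys   : QALY value of each health state (must be > 0 for the log)
    weights : population weight attached to each state
    """
    total = sum(weights)
    # Inequality-neutral aggregate: weighted arithmetic mean.
    arithmetic = sum(w * q for q, w in zip(qalys, weights)) / total
    # Cobb-Douglas aggregate: weighted geometric mean.
    geometric = math.exp(
        sum(w * math.log(q) for q, w in zip(qalys, weights)) / total
    )
    # Rawlsian aggregate: value of the worst-off state.
    return arithmetic, geometric, min(qalys)


# Hypothetical data, for illustration only (not the thesis data).
qalys = [1.0, 0.95, 0.80, 0.40]
weights = [600, 250, 100, 50]
arith, geom, worst = weighted_means(qalys, weights)
print(arith, geom, worst)
```

Whenever the QALY values are not all equal, the geometric mean lies below the arithmetic mean and above the minimum, which is the ordering of the three figures reported in the text.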
If everyone lives a seventy year lifespan, the inequality aversion in the Cobb-Douglas case increases the willingness to give up longevity in order to eradicate morbidity by 1 year, nearly a twenty per cent increase over the inequality neutral case. As inequality aversion increases, so does this difference. This underscores the need for adequate longitudinal data, since without them, the health index is forced to adopt inequality neutral ethics, despite the obvious inequalities present in society. This is particularly relevant in health policy, which can often effect discrete changes in society's health status profile.

G.0.7 Life Expectancy

Life expectancy data are taken from Statistics Canada (1991), since the population base for these figures most closely approximates that of the G.S.S. (1985-1987 versus 1985). Three adjustments are made so that the bases are as comparable as possible.

First, life expectancy is conditioned on living to the fifteenth birthday, rather than being born, i.e.

$$E(t \mid t > 15) = .5P_D(15 \mid 15) + P_S(15 \mid 15) + P_S(15 \mid 15)\left(.5P_D(16 \mid 16) + P_S(16 \mid 16)\right) + \ldots + P_S(15 \mid 15) \cdots P_S(T-1 \mid T-1)\, .5P_D(T \mid T), \qquad (G.10)$$

where $P_D(t \mid t)$ = probability of dying at age t given survival to the t-th birthday ($= D_t / L_t$; $D_t$ is the number of people who die in the t-th year and $L_t$ is the number of people who are alive at the beginning of the t-th year), $P_S(t \mid t)$ = probability of surviving the t-th year given survival to the t-th birthday ($= L_{t+1} / L_t$), and T is the maximum age.

By cumulative laws of probability

$$E(t \mid t > 15) = .5P_D(15 \mid 15) + P_S(15 \mid 15) + .5P_D(16 \mid 15) + \ldots + .5P_D(T \mid 15), \qquad (G.11)$$

where $P_D(t \mid 15) = D_t / L_{15}$ and $P_S(t \mid 15) = L_{t+1} / L_{15}$. Since T is not given in the tables, life expectancy for the last period (80+) has to be calculated:

$$E(80 \mid 15) = \frac{L_{80}}{L_{15}}\, E(80 \mid 80) \qquad (G.12)$$

(i.e. life expectancy at age 80 is scaled so that the probabilities are conditioned on living to the 15th birthday and not the 80th).
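The equivalence of the chained form (G.10) and the cumulative form (G.11) can be checked numerically. The sketch below assumes a hypothetical life table given as survivor counts at each exact age from 15 up to the age at which nobody survives; the function names and the numbers are illustrative assumptions, and the open-ended 80+ interval of (G.12) is not modelled.

```python
def life_expectancy_cumulative(survivors):
    """Expected years lived after the first age in the table, as in (G.11).

    survivors[k] is the number alive at exact age 15 + k; only the last
    entry may be 0 (everyone has died by the maximum age).
    """
    base = survivors[0]
    years = 0.0
    for alive_now, alive_next in zip(survivors, survivors[1:]):
        deaths = alive_now - alive_next
        # P_S(t|15) + .5 P_D(t|15): a full year for survivors and half a
        # year, on average, for those who die during the year.
        years += (alive_next + 0.5 * deaths) / base
    return years


def life_expectancy_chained(survivors):
    """Same quantity built from one-year conditional probabilities (G.10)."""
    years = 0.0
    reach = 1.0  # probability of being alive at the start of the year
    for alive_now, alive_next in zip(survivors, survivors[1:]):
        p_survive = alive_next / alive_now
        p_die = 1.0 - p_survive
        years += reach * (0.5 * p_die + p_survive)
        reach *= p_survive
    return years


# Hypothetical survivor counts, for illustration only.
table = [1000, 900, 600, 200, 0]
print(life_expectancy_cumulative(table), life_expectancy_chained(table))
```

Both functions return the same figure for the same table, which is the content of the step from (G.10) to (G.11).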
If life expectancy from birth is used, the health status index is biased upwards since no morbidity is reported for the under 15 group. The measure proposed is necessarily incomplete, but unbiased.

It is assumed that people die at the mid-point of the t-th year on average (hence, $P_D$ is multiplied by .5). Since people over fifteen generally die at an increasing rate, this involves some positive bias, which can be minimized by choosing the time interval to be as small as possible (i.e. one year) and aggregating these into the larger intervals by simple summation. This is the second adjustment. Thus, the number of years of expected life between any two ages is calculated as the sum of $P_S(t \mid 15) + .5P_D(t \mid 15)$ over all t lying within the interval. Intervals are 5 years when G.S.S. data alone are used, and 10 years when H.A.L.S. data are used as well.

Third, the life expectancy figures for Canada incorporate data from the two territories. Since the effect of these persons on the national figures cannot be isolated, the analysis is performed for each province as well as for the country. The provincial figures are uncontaminated by the territories' influence, but are subject to the vagaries of small sample sizes in some cases. Since the effect of the territories' different mortality profiles is apt to be dominated by the provinces' larger populations, the bias should be quite small.

G.0.8 QALY Calculations

Life expectancy at each age is adjusted by the mean QALY value for persons of that age:

$$HSE = \sum_{t=15}^{80} \left[P_S(t \mid 15) + .5P_D(t \mid 15)\right] \sum_{i=1}^{N_t} \frac{\phi(q_i \mid t)}{N_t} \qquad (G.13)$$

where $N_t$ is the number of people in age category t.

Standard errors are estimated using Rothenberg's (1984) Taylor's series expansion
over the non-linear function of random variables used to calculate the quality adjusted life expectancy (a function of the $b_i$, the $\alpha$ and the $q_{it}$, where the $b_i$ are the satisfaction values given in Table G.2, the $\alpha$ are the power transforms used to convert the satisfaction values to a time trade-off consistent scale, and the $q_{it}$ are the morbid states endured by the people surveyed in age group t; note that the probability of survival is treated as parametric since it is based on the entire population of interest). Such a method approximates the true standard errors asymptotically if the estimators of the random variables used in the function are consistent and asymptotically normal. It is assumed in these calculations that the estimators are independent of one another.

A problem of bias may result from the fact that the life tables are based on a steady state estimate of the population, while the G.S.S. is based on the actual population. Because of the post-war baby boom, this means the G.S.S. has more observations in the younger and middle-aged segments of the population. If the proportion exceeds the probabilities at the younger years when people are generally healthier, the estimated health status is biased upwards. However, since the average is taken over 5 year intervals, there does not seem to be much opportunity for such a bias to have an effect.
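The quality adjustment of life expectancy and a first-order Taylor series (delta method) standard error can be sketched as follows. This is a simplified illustration under stated assumptions, not the thesis procedure: only the power-transform exponent is treated as random (the thesis also treats the satisfaction values as estimated), the derivative is taken numerically, and all function names and data are hypothetical.

```python
def quality_adjusted_life_expectancy(years_lived, satisfaction_by_age, alpha):
    """Sum over age groups of expected years lived times the mean QALY value.

    years_lived[t]         : expected years lived in age group t, conditional
                             on reaching age 15
    satisfaction_by_age[t] : satisfaction values (in [0, 1]) of the
                             respondents observed in age group t
    alpha                  : exponent of the power transform sigma ** alpha
    """
    total = 0.0
    for years, values in zip(years_lived, satisfaction_by_age):
        mean_qaly = sum(v ** alpha for v in values) / len(values)
        total += years * mean_qaly
    return total


def delta_method_se(years_lived, satisfaction_by_age, alpha, se_alpha,
                    step=1e-6):
    """First-order standard error, treating only alpha as random (assumption)."""
    up = quality_adjusted_life_expectancy(
        years_lived, satisfaction_by_age, alpha + step)
    down = quality_adjusted_life_expectancy(
        years_lived, satisfaction_by_age, alpha - step)
    gradient = (up - down) / (2 * step)  # central difference in alpha
    return abs(gradient) * se_alpha


# Hypothetical inputs, for illustration only.
years = [5.0, 4.0]
satisfaction = [[1.0, 0.9], [0.8, 0.5]]
qale = quality_adjusted_life_expectancy(years, satisfaction, 1.4321)
se = delta_method_se(years, satisfaction, 1.4321, 0.20840)
print(qale, se)
```

With an exponent of 1 the function reduces to years lived weighted by mean satisfaction; because the satisfaction values lie between 0 and 1, a larger exponent lowers the quality adjusted figure.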
Title | Uses and abuses of QALY analysis |
Creator |
Holmes, Ann M. |
Date Issued | 1993 |
Description | A major contribution of economics to health services research has been the development of QALYs (quality adjusted life years) as a measure of health status. This thesis investigates, in three essays, the use of QALYs in health care project evaluation and as an indicator of societal health. The first essay examines the validity (defined as consistency with preferences) and feasibility of various QALY construction methods. Conditions for validity, derived from welfare principles, are used to assess the different methods. A new QALY instrument is devised that has interpersonal content (i.e. is valid for choices involving different individuals). Bias is shown to depend on various independence relationships within preferences. A number of these conditions are tested using data from the General Social Survey of 1985 (Canada. Statistics Canada [1987]). The second essay examines the welfare properties of the QALY-based index as it is commonly employed to make health policy decisions. A comparison with alternative economic-based health indexes (human capital and willingness-to-pay) is provided. The QALY-based measure does indicate which treatment is best for an individual. In choosing patients for treatment, however, QALY-based measures probably discriminate against certain types of individuals, including those who are risk averse with respect to health and in poor health. In choosing between health programs, aggregate QALY-based measures do order community health profiles sensibly (except where people endure states worse than death), unlike the other measures considered. The QALY-based index may, however, favour unequal distributions of health. The final essay assesses the appropriateness and feasibility of QALYs as a foundation for an index of societal health. Results suggest that, theoretically, the QALY serves as an imperfect measure of societal health, but that these problems are endemic to any index based on individual preferences. 
Using the best available data, a QALY based index is calculated to measure the level and distribution of ill-health in Canada and indicate where health policy can be most effectively targeted. The essay concludes with a discussion of what improvements in data collection are required to obtain more accurate figures. |
Extent | 11204186 bytes |
Genre |
Thesis/Dissertation |
Type |
Text |
FileFormat | application/pdf |
Language | eng |
Date Available | 2008-09-17 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
DOI | 10.14288/1.0086362 |
URI | http://hdl.handle.net/2429/2149 |
Degree |
Doctor of Philosophy - PhD |
Program |
Economics |
Affiliation |
Arts, Faculty of Vancouver School of Economics |
Degree Grantor | University of British Columbia |
GraduationDate | 1993-05 |
Campus |
UBCV |
Scholarly Level | Graduate |
AggregatedSourceRepository | DSpace |