UBC Theses and Dissertations

Identification of risk groups : study of infant mortality in Sri Lanka Kan, Lisa 1988

IDENTIFICATION OF RISK GROUPS: STUDY OF INFANT MORTALITY IN SRI LANKA

by

LISA KAN

B.Sc., Simon Fraser University, 1986

A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE

in

THE FACULTY OF GRADUATE STUDIES
The Department of Statistics

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
September 1988
© Lisa Kan, 1988

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

The University of British Columbia
Vancouver, Canada

ABSTRACT

Multivariate statistical methods, including recent computing-intensive techniques, are explained and applied in a medical sociology context to study infant death in relation to socioeconomic risk factors of households in Sri Lankan villages.

The data analyzed were collected by a team of social scientists who interviewed households in Sri Lanka during 1980-81. Researchers would like to identify characteristics (risk factors) distinguishing those households at relatively high or low risk of experiencing an infant death. Furthermore, they would like to model temporal and structural relationships among important risk factors. Similar statistical issues and analyses are relevant to many sociological and epidemiological studies. Results from such studies may be useful to health promotion or preventive medicine program planning.

With respect to an outcome such as infant death, risk groups and discriminating factors or variables can be identified using a variety of statistical discriminant methods, including Fisher's parametric (normal) linear discriminant, logistic linear discrimination, and recursive partitioning (CART). The usefulness of a particular discriminant methodology may depend on distributional properties of the data (whether the variables are dichotomous, ordinal, normal, etc.), and also on the context and objectives of the analysis.

There are at least three conceptual approaches to statistical studies of risk factors. An epidemiological perspective uses the notion of relative risk. A second approach, generally referred to as classification or discriminant analysis, is to predict a dichotomous outcome, or class membership. A third approach is to estimate the probability of each outcome, or of belonging to each class. These three approaches are discussed and compared, and appropriate methods are applied to the Sri Lankan household data.

Path analysis is a standard method used to investigate causal relationships among variables in the social sciences. However, the normal multiple regression assumptions under which this method is developed are very restrictive. In this thesis, limitations of path analysis are explored, and alternative loglinear techniques are considered.

TABLE OF CONTENTS

Abstract
Table of Contents
List of Tables
List of Figures
Acknowledgements
1.  Introduction
2.  A Study of Infant Mortality in Sri Lanka
    2.1  Infant Mortality in Medical Sociology
    2.2  The Sri Lankan Household Data
3.  Discriminant Applications to Identify Risk Groups
    3.1  Basic Approaches
    3.2  Optimality Criteria for Discriminants
        3.2.1  Relative Risk
        3.2.2  Decision Theoretic Bayes Rules
    3.3  Sample Space Partitions Corresponding to Bayes Rules
        3.3.1  Linear Discriminants for Normal Distributions
        3.3.2  Logistic Linear Discriminants
        3.3.3  Classification Trees: Recursive Partitioning
    3.4  Construction of Discriminants from Sample Data
        3.4.1  Logistic Discriminant
        3.4.2  CART Discriminant: Growing a Classification Tree
        3.4.3  CART Discriminant: Pruning a Classification Tree
            3.4.3.1  Test Sample Estimates of Risk
            3.4.3.2  Cross-Validation Estimates of Risk
4.  Path Analysis
    4.1  Structural Modelling with Quantitative Data
        4.1.1  Path Models
        4.1.2  Estimation and Interpretation of Path Coefficients
    4.2  Structural Modelling with Qualitative Data
        4.2.1  Loglinear and Logit Models
        4.2.2  Path Models
        4.2.3  Estimation of Path Coefficients
        4.2.4  Goodness-of-Fit for Path Models
5.  Statistical Analyses on the Sri Lankan Household Data
    5.1  Identification of Infant Mortality Risk Groups
        5.1.1  Logistic Discrimination
        5.1.2  Discrimination Using CART
        5.1.3  Discussion
    5.2  Causal Modelling
        5.2.1  Structural Modelling with Quantitative Data
        5.2.2  Structural Modelling with Qualitative Data
        5.2.3  Discussion
6.  Remarks and Recommendations on Statistical Methods Used to Identify Risk Groups
Bibliography
Appendix I    Partitioning the Sample Space Using Logistic Discrimination (Younger Women)
Appendix II   Modified Path Analysis - Model Selection (Younger Women)
Appendix III  Modified Path Analysis - Model Selection (Older Women)

LIST OF TABLES

Table I     Variables used in the Sri Lankan household study
Table II    Households used in the analysis
Table III   Estimated direct and indirect effects for path model (4.3)
Table IV    Various loglinear models for three-dimensional tables
Table V     Results of forward stepwise logistic regression
Table VI    Comparison of sample space partitioning between logistic discrimination and CART
Table VII   Estimated logistic regression equations for younger women
Table VIII  Estimated direct and indirect effects on infant death
Table IX    Variables used in modified path analysis
Table X     Goodness-of-fit statistics for loglinear models (younger women)
Table XI    Goodness-of-fit statistics for loglinear models (older women)

LIST OF FIGURES

Figure 1   Conceptual model of medical sociological approach to research on infant mortality
Figure 2   Examples of relative risk functions for known probability densities
Figure 3   An example of a binary tree
Figure 4   An example of a path diagram
Figure 5   An example of a path diagram with path coefficients
Figure 6   An example of a colored path diagram
Figure 7   A path model with estimated path coefficients
Figure 8   A path model with dichotomous variables
Figure 9   CART results for the younger women
Figure 10  CART results for the older women
Figure 11  Path model specifying temporal relationships among selected variables
Figure 12  Path analysis results for the younger women
Figure 13  Path analysis results for the older women
Figure 14  Path diagram showing causal links implied by selected logit models for younger women
Figure 15  Path diagram showing causal links implied by selected logit models for older women

ACKNOWLEDGEMENTS

I would like to thank Dr. Nancy E. Waxler-Morrison for providing the data and the stimulus for my research. I am grateful to Dr. Ned Glick for his guidance, suggestions, and patience in producing this thesis. Dr. A. John Petkau's helpful comments are also greatly appreciated. Finally, I thank my husband, Scott, for his continuous encouragement and support. Without his belief in me, it might have taken me longer to get here.

1.  Introduction

A study of infant mortality in Sri Lanka was conducted by a team of social scientists during 1980-81 (before the current civil war) to identify households and socioeconomic conditions in which there was a high risk of experiencing an infant death. Further, relationships among risk factors would also be of interest to future planning of any preventive health programs in developing countries. Similar applications of multivariate analyses are widely used to identify risk groups in epidemiology, urban planning, economics, business, etc. This thesis explores and applies various statistical methods for assessing risk groups, and relationships among risk factors.

Risk groups and discriminating factors can be identified by a variety of statistical discriminant and modelling methods. The most often used criterion for determining the goodness of a discriminant rule has been the rate of misclassification. However, the importance of misclassification rate varies depending on the purpose of discrimination.
In medical diagnosis, the objective is to pinpoint as accurately as possible the cause of symptoms. Since it is not desirable to subject a healthy individual to possibly detrimental treatments, such as chemotherapy, nor to leave an infection untreated because of misdiagnosis, discriminant rules with low misclassification rates are preferred. In medical screening, say early breast cancer detection, examinations are performed on apparently healthy volunteers from the general population, for the purpose of separating them into groups with high and low probabilities for breast cancer (Sackett and Holland 1975). The idea of a screening discriminant is to use a few inexpensive measurements to capture all those with the disorder in a high risk group, so that more complicated, and often more expensive, examinations need be performed only on this smaller group of individuals. Thus, factors considered to be good screening factors may not be acceptable diagnosis factors. In epidemiology and medical sociology, the main objective is to discover the context in which a disorder may occur. For example, homosexual men were identified as the first high risk group in studies of AIDS, although homosexuality per se is not the cause of disease; and clearly, by using sexual orientation as a discriminant rule, the misclassification rate would be high. In our Sri Lankan household study, the risk of infant death is being examined from the socioeconomic and political perspective. Health planning involves not only the understanding of biomedical causes of infant death, but also the social context in which infant death may occur. Although discriminant rules constructed using socioeconomic and political variables may not have low misclassification rates, the socioeconomic and political conditions under which a family is most likely to experience an infant death can still be identified.
Thus, the goal is to find discriminating variables and discriminant rules that partition the households into distinguishable groups with respect to the risk of infant mortality. In this thesis, two other criteria for determining the goodness of a discriminant rule are investigated, and discriminant methods that are appropriate for the Sri Lankan household data set are applied.

A second objective of the Sri Lankan household study is to test a theoretical model that places infant mortality at the center of an expanding series of social contexts. Infant deaths may be affected by proximate factors such as inadequate nutrition or poor sanitation creating conditions for tetanus or diarrhea. These proximate factors may be influenced by the education level of the mother, and the economic status of the family, which in turn, may be linked to ethnic group membership. Path analysis is the standard method used to analyze such models in the social sciences. However, the assumptions under which this method is developed are highly restrictive. Thus, the use of path analysis is limited. In this thesis, limitations of the methodology are explored, and alternative techniques are considered.

2.  A Study of Infant Mortality in Sri Lanka

2.1  Infant Mortality in Medical Sociology

In medical sociology, infant mortality is viewed as a consequence of biosocial interactions. The key idea behind the biomedical model of disease is that etiology is biologically specific. Hence, medical research is primarily focused on disease agents and host-agent interactions. On the other hand, social science research on infant mortality has been traditionally concentrated on the association between socioeconomic status and the level and pattern of mortality in the population.
The specific medical causes of death are generally not addressed by social scientists. Medical sociology attempts to bridge these two approaches to the study of infant mortality. Mosley and Chen (1984) proposed a framework based on the premise that "all social and economic determinants of child mortality necessarily operate through a common set of biological mechanisms, or proximate determinants, to exert an impact on mortality". This framework can be summarized by the following illustration.

socioeconomic factors -> biomedical factors -> infant mortality

Figure 1  Conceptual model of medical sociological approach to research on infant mortality

Primary causes of infant death in developing countries are well understood from the medical perspective. One of the factors that contributes to high infant mortality rates is risk of infection. Patel (1980) noted the common use of dung as a healing agent prior to 1940 in Sri Lanka. As documented by the Registrar of Ceylon Medical College in 1906, tetanus, a common cause of infant death, often resulted from infection to the navel after separation of the umbilical cord in childbirth. This source of infection can easily be eliminated by abolishing such practice. Another source of infection is the contaminated water supply caused by lack of proper sanitation facilities. This source of infection may be eliminated by construction of sanitary latrines. In general, most infant deaths are preventable with current understanding of disease transmission and existing health technology.

Although most infant deaths are preventable with the available technology, the social context in which infant death occurs may block the use of such technology. The Sri Lankan government has created a subsidy program for the construction of latrines. However, many families are too poor to take advantage of such subsidies. Another example involves the use of hospitals for childbirths. Waxler et al. (1985) suggest that childbirth may not be considered serious enough to require a doctor's care. Thus, hospitals for maternity care are sometimes not used, even though these hospitals, which are essentially free, are within short distances. Therefore, in order to design an effective package of health policies to promote infant survival, the biomedical and the social context of the problem must be examined concurrently (Mosley 1984).

Two recent developments in sociological research have also altered the approach to infant mortality studies, as pointed out by Waxler et al. (1985). McKeown (1976) has argued that changes in health status across time are probably better predicted by changes in sanitation and available food supplies, than by health care or narrowly defined medical variables that are often considered. Secondly, infant mortality has been used, by development economists and others, as a central indicator of the state of development, or quality of life, of populations in developing countries (Morris 1979). These developments have called for expanded models that place infant mortality in a larger social context.

The proximate causes of infant death may be inadequate nutrition (Puffer and Serrano 1973) or poor sanitation and water supply that create the conditions for tetanus or diarrhea (Patel 1980, and Smucker et al. 1980). However, these proximate causes may be related to maternal education level (Caldwell and McDonald 1982, Simmons and Bernstein 1982, and Chowdhury 1982), economic status of the family (Grosse and Perry 1982, and Waxler et al. 1985), and access to health services (World Bank 1975), which in turn, may be related to ethnic group membership (Waxler et al.
1985). In the Sri Lankan household study, relationships between infant mortality and various biomedical and socioeconomic factors are examined.

2.2  The Sri Lankan Household Data

As described in Waxler et al. (1985), the 22 districts of Sri Lanka were divided into three clusters having different patterns of quality of life based on results of a previous study (Morrison and Waxler 1984). Four villages representative of a typical district from each of the three clusters were selected. For each village, a random sample of 40 households was drawn from the population list. A household was substituted only if the sampled house was empty, or if both male and female head of household were absent in several calls over a period of weeks. Approximately 30 substitutions were made in the sample of 480 households. The researchers who devised this sampling scheme regard the sampled households as being representative of the Sri Lankan village population.

A long systematic set of open questions was used for interviewing both the male and the female head of household. The questions elicited information on health, housing, nutrition, employment, education, etc. The female head of the household, in addition, reported on the number of live births in her lifetime, and the number of her children who died before reaching age one. Information on the cause of death (or symptoms at death) was also obtained for each infant that died.

The variable of primary interest in our analysis is a dichotomous response indicating whether or not the female head of household has experienced at least one infant death. All explanatory variables used in the study are listed in Table I.

391 households (82% of the total sample) have complete information on the variables of interest.
Table II shows that 92% of the total sample satisfied the initial inclusion criterion: a female head of household with known child-bearing history, and known number of infant deaths, must be present in the household. Further, the table shows that 12% of these households had missing information (11% have one missing variable and 1% have two missing variables). Most missing values appear in the variables concerning family income, and among older female heads of household; otherwise, there was no noticeable pattern when the distribution of households with missing information was examined for each variable.

Several populations may require separate analysis in this study. Women with more childbirths are more likely to have experienced at least one infant death. Thus, the Sri Lankan village population is separable with respect to the dichotomous response on infant death by the number of childbirths. Furthermore, several explanatory variables may have different relevance to women of different age groups. For example, the use of health services for childbirth is restricted by availability, which may vary across time. The impact of ethnicity may also vary for the different generations. Thus, analysis should be performed separately for the various age groups. However, the available sample size restricts the number of allowed strata. Since older women also tend to have more childbirths, the sample is divided into two groups based on the woman's age (<44 and 44+). Most women in the latter age group are postmenopausal; thus, women in this age group have similar numbers of childbirths. In contrast, the number of childbirths varies for women in the younger age group. Since there is a one-to-one correspondence between household and female head of the household, the terms household and woman will be used interchangeably to refer to a unit of observation throughout this thesis.
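The household counts summarized in Table II can be cross-checked with a few lines of arithmetic. The Python sketch below is illustrative only: the variable names are hypothetical and not part of the study, and the counts are simply those reported in Table II.

```python
# Counts as reported in Table II (all variable names are illustrative).
sampled = 480
invalid = {
    "no female head of household": 12,
    "no child birth or no information on child birth": 25,
    "no information on infant deaths": 1,
}
missing = {"one variable": 48, "two variables": 3}
strata = {"age <44": 250, "age 44+": 141}

# Households meeting the initial inclusion criterion.
valid = sampled - sum(invalid.values())
# Households with complete information on all variables of interest.
included = valid - sum(missing.values())


def percent(part, whole):
    """Return `part` as a percentage of `whole`."""
    return 100.0 * part / whole


if __name__ == "__main__":
    print(f"valid households:    {valid} ({percent(valid, sampled):.1f}% of sampled)")
    print(f"excluded (missing):  {sum(missing.values())} "
          f"({percent(sum(missing.values()), valid):.1f}% of valid)")
    # 391/480 is about 81.5% of the total sample.
    print(f"included households: {included} ({percent(included, sampled):.1f}% of sampled)")
    print(f"strata sum to included: {sum(strata.values()) == included}")
```

The two age strata (250 and 141 cases) add up to the 391 households retained for analysis, and the percentages are taken relative to the sampled or the valid households as indicated.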
In our analysis of this Sri Lankan household survey, the two data sets corresponding to those women of age <44 (250 cases), and those of age 44+ (141 cases), are treated as simple random samples.

Table I  Variables used in the Sri Lankan household study

Explanation: Codes
Infant death indicator: 1 at least one; 2 none
No. of languages spoken at home: 1 one; 2 two or more
Current usage of health services (where was the last child born?): 1 hospital; 2 home with midwife; 3 home without midwife
Nutrition (no. of protein foods consumed in the past week, from four most common types listed): 0 none; 1 one type; ...; 4 four types
Sanitation: codes 1 to 5 (none; communal latrine; own / open-pit type; own / water-sealed toilet)
Economic status (no. of household items owned, from five listed): 0 none; 1 one; ...; 5 five
No. of hrs a day female head of household worked outside the home: 0 none; 1 one - three; 2 four; ...; 7 nine; 8 ten or more
No. of household members currently employed: 0 none; 1 some; 2 all
Primary source of income: 1 salary; 2 land/business/boat; 3 piece rate; 4 food stamps etc.
No. of bus trips taken in the last week: 0 none; 1 one; ...; 7 seven; 8 eight or more
Ethnicity: 1 Sinhalese; 2 others
Years of schooling for female head of household: 0 none; 1 one; ...; 11 eleven; 12 twelve or more
Education level of female head relative to that of male head: 1 lower; 2 same; 3 higher
AGE, age of female head of household: as reported

Table II  Households used in the analysis

Total number of households sampled: 480
    no female head of household: 12
    no child birth or no information on child birth: 25
    no information on infant deaths: 1
    number of invalid households: 38
Total number of valid households: 442
    missing information on one variable: 48
    missing information on two variables: 3
    number of excluded households: 51
Total number of households included in analysis: 391
    number of women with age <44: 250
    number of women with age 44+: 141

3.  Discriminant Applications to Identify Risk Groups

3.1  Basic Approaches

In the Sri Lankan household study, we are interested in deriving discriminant rules that partition the households into distinguishable groups with respect to the risk of experiencing infant death. There are at least three basic approaches to this problem.

An epidemiological perspective uses the notion of relative risk. If a population $t$ can be divided into two disjoint subpopulations, say $t_1$ and $t_2$, then relative risk of a particular phenomenon is defined to be the occurrence probability in $t_2$ relative to the occurrence probability in $t_1$. For example, we would like observable variables to define some groups $t_1$ and $t_2$ such that the probability of infant death is higher for households in $t_2$ relative to the probability for households in $t_1$. In general, a variable which can partition the population so that one subset has high relative risk is considered an important risk factor.

A second approach is to predict
a dichotomous outcome based on some collected information; for example, classify a family as likely or unlikely to experience an infant death based on the sanitation facility, nutrition, etc. available to the family. This approach is generally referred to as discriminant analysis or classification, and as pattern recognition in engineering. The idea is to select discriminating variables and to derive discriminant rules that minimize the expected cost of misclassification. This will be referred to hereafter as the classification approach.

A third approach is to estimate the probability of each outcome, or of belonging to each class, given some collected information; for example, estimate the probability of infant death given the educational level of the mother. Using the terminology in Classification and Regression Trees (CART) by Breiman et al. (1984), this approach is called class probability estimation. The methods used in this approach search for variables and rules that minimize a squared error loss function to be defined later (Section 3.2.2).

Obviously, these three approaches are related. For instance, class probability estimation for an observation (e.g. for a family) suggests a discriminant that assigns the observation to whichever class has the maximum probability; and relative risk can be estimated for the resulting discriminant partition. The similarities and differences between these perspectives can be described in terms of various conditional probabilities, and in the more general context of decision theory. Some statistical techniques and software may be adapted to more than one of these approaches. We will first consider the roles of these approaches in characterizing a good discriminant. The underlying principles of discrimination will be discussed in the context where the various conditional probabilities are known. However, in practice these
The the  underlying context  However,  principles  where in  the  practice  of  various these  conditional probabilities are often unknown, and need to be estimated from the sample data.  The last section describes how  obtained.  14  these estimates may  be  3.2 O p t i m a l i t y C r i t e r i a f o r  Discriminants  R e l a t i v e Risk  3.2.1  Relative risk  i s generally considered  in a context relating the  presence or absence of a specific disease to exposure levels possible risk factor(s) (Schlesselman 1982).  for some  The concept of relative risk  is simplest when exposure level i s dichotomous (presence or absence of a factor). that  A high relative risk (of disease) among those exposed suggests  the factor  Schlesselman 1982,  may  be a  cause  of disease (Breslow  and Day 1980,  Hennekens and Buring 1987).  Let X be a random variable that indicates the level of exposure to a specific risk factor. D e f i n i t i o n 3.1  Suppose there are only two levels.  Relative  risk, is defined as  P(d*sease\X=2) P(dr.sease\X = 1 )  When RR > I, the probability  of disease in the population with X = 2 i s  higher than the probability of disease in the population with X = l. The reverse relationship  Historically, variables.  is implied when RR < 1 .  relative  risk  used  primarily  for dichotomous  But suppose the random variable X i s continuous on the real  line, or positive half-line, etc.. we  was  are interested  Then by considering X as a risk factor,  in partitioning  the real  line  distinguishable with respect to the risk of disease.  15  into  two regions,  Is i t reasonable to use relative risk as a partitioning Suppose  the  disease-present  and  the  disease-absent  populations  densities of X denoted respectively by p(x|disease) and p ( x | n ©  smooth  unimodal  densities,  the  ideal  have  disease),  If p(x|disease) is  which, in practice, may be estimated from sample data. right-shifted with respect to p(x|n© disease),  criterion?  
then, at least for most  partition  is  in the  half-lines, { X < c} and { X > c}, for some c on the real line.  form  of  Thus, by  Bayes theorem, for any c e R, the corresponding relative risk is  RR(c) =  W ° " \ X P(disease|X  > c) < c)  {  P ( X > c\disease) P(X  The  two  > c)  P(X  P{X  < c)  <  c\disease)  3  2  )  examples illustrated in Figure 2 show that for densities with  monotone likelihood ratio, RR(c) may increase to infinity as c decreases; but the discriminants corresponding to such extreme c are of no practical value.  Thus, choosing c to maximize RR{c)  partitioning.  is not a useful criterion for  Furthermore, because RR{c) may not be a monotone function,  relative risk values do not provide information on how well separated are the two populations, disease-absent and disease-present.  For example, a  relative risk value of about 2 can arise from different partitions of the real  line  in either  of  the  two  situations  Since relative risk does not indicate disease-present  and  disease-absent  illustrated  in Figure 2.  the magnitude of shift between the densities,  relative  risk  is  not  necessarily informative about the practical discriminating nature of a risk factor that is continuous rather than dichotomous.  16  Figure 2  Examples of relative risk function for known probability densities  Density Plots: N(0,1) vs. N(1,1)  p(x)  i i i i I i i i i I i i i i I i i i i I i i i i I i i i i -1.0 0.0 1.0 2.0 3.0 4.0 5.0  x Relative Risk Function for N(0,1) vs. N(1,1) 6.0 5.0  -  4.0  RR(c) 3.0  -  2.0  -  1.0  M  1.0  I  I  |  0.0  I  I  I  I  | I  1.0  I  I  I  |  2.0  17  I  I  I  I  |  3.0  I  I  I  I  |  4.0  I  I  I  I  5.0  Density Plots: N(0,1) vs. N(2,1) 0.5  not diseased  ll  0.4  -  diseased  0.3 _ z  p(x) 0.2 _ z  0.1  -  0.0  i  -1.0  I i i I i | i i i i | i i i i fl 0.0 1.0 2.0 3.0  i I i  l I I | i i i i 4.0 5.0  x Relative Risk Function for N(0,1) vs. 
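The behaviour of $RR(c)$ in (3.2) can be explored numerically. The following Python sketch is a minimal illustration, not a reproduction of Figure 2: it assumes unit-variance normal densities, with the disease-absent population at N(0,1) and the disease-present population shifted to the right, and it assumes a disease prevalence of 0.1, a value chosen only for illustration since the marginal probabilities $P(X > c)$ and $P(X \le c)$ in (3.2) depend on it. The function names and parameters are hypothetical.

```python
import math


def normal_cdf(x, mean=0.0, sd=1.0):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def relative_risk(c, shift=1.0, prevalence=0.1):
    """RR(c) as in (3.2) for N(0,1) (no disease) vs. N(shift,1) (disease).

    `prevalence` is an assumed value of P(disease), used only to form
    the marginal probabilities P(X > c) and P(X <= c).
    """
    lower_d = normal_cdf(c, mean=shift)   # P(X <= c | disease)
    upper_d = 1.0 - lower_d               # P(X >  c | disease)
    lower_h = normal_cdf(c)               # P(X <= c | no disease)
    upper_h = 1.0 - lower_h               # P(X >  c | no disease)
    # Marginal probabilities of X as a mixture of the two populations.
    upper = prevalence * upper_d + (1.0 - prevalence) * upper_h
    lower = prevalence * lower_d + (1.0 - prevalence) * lower_h
    return (upper_d / upper) * (lower / lower_d)


if __name__ == "__main__":
    for c in (-3.0, -1.0, 0.0, 1.0, 2.0, 3.0):
        print(f"c = {c:5.1f}   RR(c) = {relative_risk(c):8.3f}")
```

Under these assumptions, $RR(c)$ is very large for small $c$, decreases over the central range, and rises again for large $c$, so the same relative risk value can correspond to quite different cut-offs, in line with the discussion above.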
These properties indicate that relative risk may not be a meaningful criterion for selecting discriminating variables. Even though relative risk associated with a particular discriminant may be of interest, relative risk per se is not usually an appropriate criterion for construction of a discriminant.

3.2.2  Decision Theoretic Bayes Rules

Although the formal objective differs for classification and class probability estimation, both approaches use discriminant methods that can be described in a general framework of decision theory as presented in Classification and Regression Trees (CART) by Breiman et al. (1984). In the following, discussion will be restricted to the two-class problem, which is appropriate for the Sri Lankan household study. Generalization to more than two classes can easily be made.

Let $\mathcal{X}$ be the sample space of possible measurement vectors, and let $\mathcal{C} = \{1, 2\}$ denote the set of possible classes. Further, let $X \in \mathcal{X}$ be a random variable whose distribution is denoted by $P(dx)$, and let $Y \in \mathcal{C}$ denote the class membership. Suppose $\mathcal{A}$ is the set of possible actions.

Definition 3.2  A decision rule $d$ is an $\mathcal{A}$-valued function on $\mathcal{X}$: $d : \mathcal{X} \to \mathcal{A}$.

Definition 3.3  A loss function $L$ is a real-valued function on $\mathcal{C} \times \mathcal{A}$: $L : \mathcal{C} \times \mathcal{A} \to \mathbb{R}$. Thus $L(y, a)$ is the loss when $Y = y$ and $a \in \mathcal{A}$ is the action taken.

Definition 3.4  The risk $R(d)$ is the expected loss when the decision rule $d$ is used. That is, $R(d) = E[\,L(Y, d(X))\,]$.

In the classification approach, we are interested in predicting the class membership of an object with measurement vector $X = x$. Thus, we want to construct decision rules that assign class membership in $\mathcal{C}$ to every measurement vector $x \in \mathcal{X}$, and so, let the action space $\mathcal{A}_C$ be $\mathcal{C}$.
Furthermore, any decision rule $d$ is equivalent to the partition of sample space $\mathcal{X}$ into two regions, $t_1$ and $t_2$, such that an object with measurement vector $x \in t_j$ is classified as class $j$, for $j = 1,2$. These rules will be called classification rules. The loss function, $L_c(y,a)$, in this situation is the cost of classifying a class $y$ object as a class $a$ object, denoted by $C(a \mid y)$. Suppose $C(a \mid y)$ is positive when $a \neq y$ and is 0 otherwise. Then the risk or expected cost of using decision rule $d$ is given by

$$R_c(d) = C(1 \mid 2)\, P(Y = 2, X \in t_1) + C(2 \mid 1)\, P(Y = 1, X \in t_2). \tag{3.3}$$

Let the probability that an observation comes from class $j$ be $\pi_j$ for $j = 1,2$. In epidemiological terms, these a priori probabilities are prevalences of the two classes. Further, let the conditional probability density of $X$, given an object from class $j$, be denoted by $p(x \mid j)$ for $j = 1,2$. Then the risk in (3.3) can be re-expressed as

$$R_c(d) = C(1 \mid 2)\, \pi_2 \int_{t_1} p(x \mid 2)\, dx + C(2 \mid 1)\, \pi_1 \int_{t_2} p(x \mid 1)\, dx. \tag{3.4}$$

In the class probability estimation approach, we are interested in obtaining an estimate of the probability that an object with measurement vector $X = x$ belongs to class $j$. That is, we are interested in estimating

$$p(j \mid x) = P(Y = j \mid X = x), \quad j = 1,2.$$

Thus, we want to construct rules of the type

$$d(x) = (\, d(1 \mid x),\ d(2 \mid x)\,)$$

with $d(j \mid x) \ge 0$ for $j = 1,2$, and $\sum_j d(j \mid x) = 1$, for every $x \in \mathcal{X}$. Such rules will be called class probability estimators. Hence, the action space $\mathcal{A}_p$ consists of all pairs of nonnegative numbers that sum to 1. Let the loss function $L_p(y,a)$ for $a = (a_1, a_2) \in \mathcal{A}_p$ be defined by

$$L_p(y,a) = \sum_j (a_j - \delta_j(y))^2, \tag{3.5}$$

where $\delta_j(y)$ is the Kronecker delta (1 if $y = j$ and 0 otherwise), for $j = 1,2$. Then the risk of a decision rule $d$ is given by

$$R_p(d) = E[\,L_p(Y, d(X))\,] = E\Big[\sum_j (d(j \mid X) - \delta_j(Y))^2\Big]. \tag{3.6}$$

But given $X = x$, $\delta_j(Y)$ is a Bernoulli random variable with success probability $p(j \mid x)$, for $j = 1,2$. Thus, $E[\,\delta_j(Y) \mid X = x\,] = p(j \mid x)$ and

$$E[\,(\delta_j(Y) - p(j \mid x))^2 \mid X = x\,] = \mathrm{Var}[\,\delta_j(Y) \mid X = x\,] = p(j \mid x)\,[1 - p(j \mid x)]. \tag{3.7}$$

Hence, for any $a \in \mathcal{A}_p$,

$$\begin{aligned}
E[\,L_p(Y,a) \mid X = x\,] &= \sum_j E[\,(\delta_j(Y) - p(j \mid x) + p(j \mid x) - a_j)^2 \mid X = x\,] \\
&= \sum_j p(j \mid x)\,[1 - p(j \mid x)] + \sum_j (p(j \mid x) - a_j)^2 \\
&= 2\,p(1 \mid x)\,p(2 \mid x) + \sum_j (p(j \mid x) - a_j)^2,
\end{aligned}$$

from (3.7). Therefore, for class probability estimation, the risk of a rule $d$ is given by

$$R_p(d) = 2\,E[\,p(1 \mid X)\,p(2 \mid X)\,] + E\Big[\sum_j (p(j \mid X) - d(j \mid X))^2\Big], \tag{3.8}$$

where the first term does not depend on the rule.

Definition 3.5 A Bayes rule is a decision rule $d_B$ that minimizes the risk function $R(d)$.

In the classification approach, a Bayes rule $d_B$ that minimizes the expected cost as expressed in (3.4) is obtained by choosing

$$t_1 = \Big\{ x \in \mathcal{X} : \frac{p(x \mid 1)}{p(x \mid 2)} > \frac{C(1 \mid 2)\,\pi_2}{C(2 \mid 1)\,\pi_1} \Big\}, \quad\text{and}\quad t_2 = \Big\{ x \in \mathcal{X} : \frac{p(x \mid 1)}{p(x \mid 2)} < \frac{C(1 \mid 2)\,\pi_2}{C(2 \mid 1)\,\pi_1} \Big\}, \tag{3.9}$$

as shown in Anderson (1984), with the Bayes risk as given in (3.4) with the above regions $t_1$ and $t_2$.

In the class probability estimation approach, the unique Bayes rule is given by $d_B(x) = (\,p(1 \mid x),\ p(2 \mid x)\,)$ for $x \in \mathcal{X}$, with risk

$$R_p(d_B) = 2\,E[\,p(1 \mid X)\,p(2 \mid X)\,] = 2 \int p(1 \mid x)\,p(2 \mid x)\,P(dx), \tag{3.10}$$

which can be seen easily from (3.8).

Bayes rule and Bayes risk can also be defined for a partition of the sample space $\mathcal{X}$.

Definition 3.6 The partition function $\tau$ associated with the partition $\mathcal{T}$ is defined as $\tau : \mathcal{X} \to \mathcal{T}$ such that $\tau(x) = t$ if and only if $x \in t$, for all $x \in \mathcal{X}$ and $t \in \mathcal{T}$.

A decision rule $d$ is said to correspond to the partition $\mathcal{T}$ if it is constant on each subset of $\mathcal{T}$. That is, there exists some $\mathcal{A}$-valued function $\omega$ on $\mathcal{T}$ such that, for every $t \in \mathcal{T}$, $\omega(t) = d(x)$ for every $x \in t$. Then a decision rule $d_{\mathcal{T}}$ corresponding to the partition $\mathcal{T}$ is explicitly given by $d_{\mathcal{T}}(x) = \omega(\tau(x))$, and the associated risk is given by

$$R(d_{\mathcal{T}}) = \sum_{t \in \mathcal{T}} E[\,L(Y, \omega(t)) \mid X \in t\,]\, P(t), \tag{3.11}$$

where $P(t) = P(X \in t)$. Thus $d_{\mathcal{T}}$ is a Bayes rule corresponding to the partition $\mathcal{T}$ if and only if $d_{\mathcal{T}}(x) = \omega(\tau(x))$ such that for each $t \in \mathcal{T}$, $a = \omega(t)$ minimizes $E[\,L(Y,a) \mid X \in t\,]$. For convenience, let $\omega(t)$ be a value that minimizes $E[\,L(Y,a) \mid X \in t\,]$ over $a \in \mathcal{A}$, for $t \in \mathcal{T}$. Furthermore, for $t \in \mathcal{T}$, let

$$r(t) = E[\,L(Y, \omega(t)) \mid X \in t\,].$$

Then the Bayes risk corresponding to the partition $\mathcal{T}$ can be written as

$$R(\mathcal{T}) = \sum_{t \in \mathcal{T}} r(t)\,P(t). \tag{3.12}$$

In the classification approach to discrimination, a Bayes rule $d_B$ corresponding to the partition $\mathcal{T}$ is obtained by setting $d_B(x) = \omega_c(\tau(x))$ for $x \in \mathcal{X}$, where, for $t \in \mathcal{T}$, $\omega_c(t)$ is a value $i \in \{1,2\}$ that minimizes

$$E[\,L_c(Y,i) \mid X \in t\,] = C(i \mid 1)\,p(1 \mid t) + C(i \mid 2)\,p(2 \mid t),$$

where $p(j \mid t) = P(Y = j \mid X \in t)$, $j = 1,2$. Thus, the minimum conditional expected cost of misclassification on subset $t \in \mathcal{T}$ is given by

$$r_c(t) = \min[\, C(2 \mid 1)\,p(1 \mid t),\ C(1 \mid 2)\,p(2 \mid t)\,]. \tag{3.13}$$

Then the Bayes risk for partition $\mathcal{T}$ can be written as

$$R_c(\mathcal{T}) = \sum_{t \in \mathcal{T}} r_c(t)\,P(t). \tag{3.14}$$

In the class probability estimation approach, the unique Bayes rule $d_B$ corresponding to partition $\mathcal{T}$ is obtained by setting $d_B(x) = \omega_p(\tau(x))$ for $x \in \mathcal{X}$, where $\omega_p(t)$ is the pair of nonnegative values $a = (a_1, a_2)$ that minimizes

$$\begin{aligned}
E[\,L_p(Y,a) \mid X \in t\,] &= \sum_j E[\,(\delta_j(Y) - p(j \mid t) + p(j \mid t) - a_j)^2 \mid X \in t\,] \\
&= \sum_j E[\,(\delta_j(Y) - p(j \mid t))^2 \mid X \in t\,] + \sum_j (p(j \mid t) - a_j)^2 \\
&= \sum_j p(j \mid t)\,[1 - p(j \mid t)] + \sum_j (p(j \mid t) - a_j)^2,
\end{aligned}$$

since $\delta_j(Y)$ given $X \in t$ is a Bernoulli random variable with success probability $p(j \mid t) = P(Y = j \mid X \in t)$ for $j = 1,2$. Thus for $t \in \mathcal{T}$, $\omega_p(t) = (\,p(1 \mid t),\ p(2 \mid t)\,)$, and the minimum conditional expected loss is given by

$$r_p(t) = 2\,p(1 \mid t)\,p(2 \mid t). \tag{3.15}$$

The Bayes risk for partition $\mathcal{T}$ can then be written as

$$R_p(\mathcal{T}) = \sum_{t \in \mathcal{T}} r_p(t)\,P(t). \tag{3.16}$$

Suppose the sample space $\mathcal{X}$ is to be divided into two regions using the class probability estimation approach. How do these two regions compare with those selected by the classification approach? For any two-region partition $\mathcal{T} = \{t_1, t_2\}$,

$$R_p(\mathcal{T}) = \sum_{t \in \mathcal{T}} r_p(t)\,P(t) = 2\,p(1 \mid t_1)\,p(2 \mid t_1)\,P(t_1) + 2\,p(1 \mid t_2)\,p(2 \mid t_2)\,P(t_2). \tag{3.17}$$

Suppose $\pi_1$, $\pi_2$, $p(x \mid 1)$ and $p(x \mid 2)$ are known as in the classification approach. Then (3.17) can be re-expressed as

$$\begin{aligned}
R_p(\mathcal{T}) &= 2\,p(1 \mid t_1)\,P(X \in t_1 \mid 2)\,\pi_2 + 2\,p(2 \mid t_2)\,P(X \in t_2 \mid 1)\,\pi_1 \\
&= 2\,p(1 \mid t_1)\,\pi_2 \int_{t_1} p(x \mid 2)\,dx + 2\,p(2 \mid t_2)\,\pi_1 \int_{t_2} p(x \mid 1)\,dx.
\end{aligned} \tag{3.18}$$

But this is the same as the expected cost (3.4) of a classification rule if $2\,p(1 \mid t_1) = C(1 \mid 2)$ and $2\,p(2 \mid t_2) = C(2 \mid 1)$. Let $\mathcal{T}^* = \{t_1^*, t_2^*\}$ be the partition with minimum risk $R_p(\cdot)$ among all two-region partitions; that is, let $\mathcal{T}^*$ be the best two-region partition using the class probability estimation approach. Suppose the cost ratio is given by

$$\frac{C(1 \mid 2)}{C(2 \mid 1)} = \frac{p(1 \mid t_1^*)}{p(2 \mid t_2^*)}.$$

Then from (3.9), a Bayes rule that minimizes the expected cost in (3.4) is determined by the partition $\mathcal{T}^*$. Therefore, by varying the cost ratio, the best two-region partition determined by the class probability estimation approach can be obtained from the classification approach.

3.3 Sample Space Partitions Corresponding to Bayes Rules

In the following sections, some of the commonly used methods for discriminant analysis are presented. The most widely used method assumes multivariate normality for the observations from both classes. In this case, a Bayes rule is obtained by choosing a linear partition that minimizes the risk function. The logistic discrimination procedure also provides a linear partition for use with both normal and certain non-normal populations.
Methods based on nonparametric density estimation algorithms, such as kernel and nearest neighbor methods, are also available, but will not be covered in this thesis. Instead, the method of classification trees is explored. A recent report produced by a panel on Discriminant Analysis and Clustering (DAC report), which was created under the Committee on Applied and Theoretical Statistics, National Research Council (1988), provides a helpful summary of all these methods. In the following, we present three of these methods from the decision theoretic perspective. In addition, we examine the classification trees method in much greater detail.

3.3.1 Linear Discriminants for Normal Distributions

In the classification problem, by assuming the two class-conditional distributions are known multivariate normal with equal covariance matrices, namely $N(\mu_1, \Sigma)$ and $N(\mu_2, \Sigma)$, Wald (1944) showed a Bayes rule is obtained by choosing the linear partition given by

$$t_1 = \{ x \in \mathcal{X} : x^T \Sigma^{-1} (\mu_1 - \mu_2) > k_1 \}, \quad\text{and}\quad t_2 = \{ x \in \mathcal{X} : x^T \Sigma^{-1} (\mu_1 - \mu_2) < k_1 \}, \tag{3.19}$$

where the point $k_1$ is a function of $\pi_1$, $\pi_2$, $C(1 \mid 2)$, $C(2 \mid 1)$, $\mu_1$, $\mu_2$ and $\Sigma$; see Anderson (1984), Hand (1981), Dillon and Goldstein (1984), and others. The linear projection given by $x^T \Sigma^{-1} (\mu_1 - \mu_2)$ is sometimes called the normal linear discriminant function.

However, in most applications, the mean vectors and the covariance matrices are unknown. Suppose there is a sample of size $N_1$ from class 1 and a sample of size $N_2$ from class 2. Let $\mu_j$ be estimated by the usual mean $\bar{x}_j$ of the sample from the class $j$ population for $j = 1,2$, and let $\Sigma$ be estimated by the pooled sample covariance $S$ defined by

$$S = \frac{(N_1 - 1)\,S_1 + (N_2 - 1)\,S_2}{N_1 + N_2 - 2},$$

where $S_1$ and $S_2$ are the corresponding sample covariance matrices. Then the Bayes decision regions are estimated by

$$\hat{t}_1 = \{ x \in \mathcal{X} : x^T S^{-1} (\bar{x}_1 - \bar{x}_2) > k_2 \}, \quad\text{and}\quad \hat{t}_2 = \{ x \in \mathcal{X} : x^T S^{-1} (\bar{x}_1 - \bar{x}_2) < k_2 \}, \tag{3.20}$$

where the point $k_2$ is a function of $\pi_1$, $\pi_2$, $C(1 \mid 2)$, $C(2 \mid 1)$, $\bar{x}_1$, $\bar{x}_2$ and $S$. The linear projection given by $x^T S^{-1} (\bar{x}_1 - \bar{x}_2)$ is the Fisher linear discriminant function suggested by Fisher (1936).

3.3.2 Logistic Linear Discriminants

In the classification problem, logistic discrimination also provides a linear partition of the sample space for use with normal and certain non-normal populations; see Lachenbruch (1975), Hand (1981), Dillon and Goldstein (1984), DAC report (1988), and others.

Suppose that the two class population densities can be expressed as

$$p(x \mid j) = \exp(\alpha_j + x^T \beta_j), \quad\text{for } j = 1,2. \tag{3.21}$$

Then by invoking Bayes theorem,

$$p(1 \mid x) = \frac{p(x \mid 1)\,\pi_1}{p(x \mid 1)\,\pi_1 + p(x \mid 2)\,\pi_2} = \frac{\exp(\eta_0 + x^T \eta)}{1 + \exp(\eta_0 + x^T \eta)}, \tag{3.22}$$

where $\eta_0 = \log(\pi_1 / \pi_2) + (\alpha_1 - \alpha_2)$ and $\eta = \beta_1 - \beta_2$. This is called a multivariate logistic function, which can be re-expressed as

$$\log\Big[\frac{p(1 \mid x)}{1 - p(1 \mid x)}\Big] = \eta_0 + x^T \eta. \tag{3.23}$$

Thus the probability of belonging to a class given a measurement vector $X = x$ can be estimated by modeling the logit of $p(1 \mid x)$ as a linear function of $x$. Furthermore, by substituting (3.22) into (3.9), the best decision region in the classification setting is given by the partition,

$$t_1 = \{ x \in \mathcal{X} : x^T \eta > k_3 \}, \quad\text{and}\quad t_2 = \{ x \in \mathcal{X} : x^T \eta < k_3 \}, \tag{3.24}$$

where the point $k_3$ is a function of $\alpha_1$, $\alpha_2$, $\pi_1$, $\pi_2$, $C(1 \mid 2)$ and $C(2 \mid 1)$.

So far the logarithm of each class-conditional probability function is assumed to be adequately modeled by a linear function. A slightly more general approach assumes the difference between the logarithms of the class-conditional probability functions is linear. This is equivalent to the approach adopted by Anderson (1972) which assumes the logit of $p(1 \mid x)$ is linear as expressed in (3.23).
The equivalence relationship can easily be seen by examining expression (3.22). Clearly, the model expressed in (3.22) is exact when the class conditional probability density functions are multivariate normal with identical variance-covariance matrices. Thus, for known normal $p(x \mid 1)$ and $p(x \mid 2)$, the logistic regression coefficients are functions of normal parameters, and the Bayes decision regions given in (3.24) correspond to Wald's linear partition in (3.19). However, if the underlying class conditional probability densities are multivariate normal with unknown parameters, then the logistic discrimination procedure cannot be expected to classify as well as does the linear discriminant function (Efron 1975, and Press and Wilson 1978).

3.3.3 Classification Trees: Recursive Partitioning

The technique of classification trees for discriminant analysis was initially developed by Morgan and Sonquist (1963), and Morgan and Messenger (1973) under the name automatic interaction detection (AID). This technique has been pursued and refined by several people. Recent development, under the name classification and regression trees (CART), is described in detail in the book by Breiman et al. (1984). The primary difference between AID and CART is in the tree construction.

The technique of CART creates a binary tree-structured discriminant by repeatedly splitting subsets of sample space $\mathcal{X}$ into two descendant sets, starting with $\mathcal{X}$ itself. An example is illustrated in Figure 3, where $t_1 = \mathcal{X}$, $t_2$ and $t_3$ are disjoint subsets of $t_1$ with $t_2 \cup t_3 = t_1$, and $t_4$ and $t_5$ are disjoint subsets of $t_2$ with $t_4 \cup t_5 = t_2$.

[Figure 3. An example binary tree: root $t_1$ with descendants $t_2$ and $t_3$; $t_2$ with descendants $t_4$ and $t_5$.]

Those subsets with no descendant sets are called terminal subsets. In the above example, $t_3$, $t_4$ and $t_5$ are the terminal subsets.
Thus the technique of CART constructs discriminant rules that partition the sample space as specified by the terminal subsets. That is, $\{t_3, t_4, t_5\}$ forms a partition of the sample space that corresponds to some decision rule.

The tree is constructed based on a set of binary questions of the form { Is $x \in t$? } for some subset $t$ of $\mathcal{X}$. Let the measurement vector $X$ be $M$ dimensional, $X = (X_1, \dots, X_M)$, with mixture of ordered and categorical types.¹ Then the allowable set of splits is defined as follows:

a. Each split depends on the value of a single variable.
b. For each ordered variable $X_m$, the questions are of the form { Is $x_m < c$? }, for all $c$ in the range of $X_m$.
c. For each categorical variable $X_m$, the questions are of the form { Is $x_m \in S$? }, for all subsets $S$ of possible $X_m$-values.

¹ As defined in Breiman et al. (1984), a variable is ordered if its measured values are real numbers; and a variable is categorical if it takes on values from a finite set with no natural ordering. Thus an ordered variable can be a continuous or an ordinal variable.

Let $\mathcal{T}$ be a fixed partition and let $t \in \mathcal{T}$ be a fixed subset of $\mathcal{X}$ in $\mathcal{T}$. Consider a split $s$ of $t$ into two disjoint subsets $t_L$ and $t_R$. Let $\mathcal{T}^*$ be the modification of $\mathcal{T}$ after applying split $s$ to $t$. Then the risk reduction $\Delta R(s,t) = R(\mathcal{T}) - R(\mathcal{T}^*)$ due to the split $s$ is given by

$$\Delta R(s,t) = R(t) - [\,R(t_L) + R(t_R)\,] = P(t)\,[\,r(t) - P_L\,r(t_L) - P_R\,r(t_R)\,], \tag{3.25}$$

where $R(t) = r(t)\,P(t)$, $P_L = P[X \in t_L \mid X \in t]$ and $P_R = P[X \in t_R \mid X \in t]$. The relative risk reduction due to the split is then given by

$$\Delta R(s \mid t) = \Delta R(s,t) / P(t) = r(t) - P_L\,r(t_L) - P_R\,r(t_R). \tag{3.26}$$

Thus, the risk reduction partition is achieved by choosing the split $s$ that maximizes the relative risk reduction.
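The allowable splits (a) to (c) above can be enumerated mechanically for a finite sample. The following is a minimal sketch, not the CART software's implementation; the function names are hypothetical, and the use of midpoints between consecutive observed values as thresholds is an assumption (any $c$ between two adjacent observed values yields the same division of the sample).

```python
from itertools import combinations


def ordered_splits(values):
    """Thresholds c for questions 'Is x_m < c?' on an ordered variable.

    Assumption: one midpoint per pair of consecutive distinct observed
    values, since other thresholds in the same gap divide the sample
    identically.
    """
    distinct = sorted(set(values))
    return [(lo + hi) / 2 for lo, hi in zip(distinct, distinct[1:])]


def categorical_splits(levels):
    """Subsets S for questions 'Is x_m in S?' on a categorical variable.

    S and its complement define the same split, so one fixed level is
    always kept in S. With L distinct levels this gives 2**(L-1) - 1
    non-trivial splits.
    """
    levels = sorted(set(levels))
    first, rest = levels[0], levels[1:]
    splits = []
    for size in range(len(rest)):  # never take all of `rest`: S must omit a level
        for extra in combinations(rest, size):
            splits.append({first, *extra})
    return splits


# Hypothetical data: an ordinal score and a three-level categorical variable.
print(ordered_splits([1, 2, 2, 4]))               # [1.5, 3.0]
print(len(categorical_splits(["a", "b", "c"])))   # 3
```

Each candidate produced this way defines a pair $(t_L, t_R)$, to which a criterion such as the relative risk reduction (3.26) can then be applied.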
In the class probability estimation approach,

$$p(j \mid t) = P_L\,p(j \mid t_L) + P_R\,p(j \mid t_R), \quad j = 1,2.$$

Thus by substituting the above into $r_p(t)$ in (3.15), $\Delta R_p$ can be shown to be

$$\Delta R_p(s \mid t) = 2\,P_L P_R\,[\,p(1 \mid t_L) - p(1 \mid t_R)\,]^2. \tag{3.27}$$

Hence the relative risk reduction is maximized if the difference between class probabilities in the two resulting subsets is maximized. Suppose class 1 corresponds to the class of households with infant death. Then the class probability estimation approach seeks splits that maximize the difference in probability of infant death between the two resulting groups. Furthermore, because of the multiplicative factor $P_L P_R$, the criterion also favors those splits which divide the set $t$ more evenly into two subsets.

Note that relative risk, in epidemiology, as defined in Definition 3.1, involves a ratio rather than a difference:

$$RR = \frac{p(1 \mid t_L)}{p(1 \mid t_R)}.$$

Thus a desirable split should have a very high or very low relative risk value. In any case, there is no way of ensuring even splits. Therefore, as discussed in Section 3.2.1, using relative risk as a partitioning criterion may not provide splits of practical value.

Risk reduction is not a good criterion for choosing a split in the classification approach. Breiman et al. (1984: pp. 95-96) showed that for any split of $t$ into $t_L$ and $t_R$, $R_c(t) \ge R_c(t_L) + R_c(t_R)$, with equality if $j^*(t) = j^*(t_L) = j^*(t_R)$, where $j^*(u)$ minimizes $C(j \mid 1)\,p(1 \mid u) + C(j \mid 2)\,p(2 \mid u)$, for subset $u$ of $\mathcal{X}$. Thus, it is conceivable that every allowable split of $t$ may produce a partition for which $\Delta R_c(s,t)$ is zero. In situations where the population is dominated by a single class, the risk reduction criterion may result in no splits. The second defect is caused by the fact that risk reduction partition (in the classification approach) is a one-step optimization process that does not account for the future splits.
In some situations, the best current choice of split may not provide the best overall improvement in strategic position. For further discussion of these considerations, see Breiman et al. (1984: pp. 94-98).

Two splitting criteria for the classification approach have been implemented in the CART software: Gini criterion and Twoing criterion. In the two-class problem, these criteria can be shown to coincide (Breiman et al. 1984: pp. 104-108). Thus, in this thesis only the Gini criterion is considered. Let $\mathcal{T}$ be any partition of sample space $\mathcal{X}$. For $t \in \mathcal{T}$, instead of $r(t)$ consider an impurity function $i(t)$ defined by

$$i(t) = 2\,p(1 \mid t)\,p(2 \mid t), \tag{3.28}$$

called the Gini diversity index. Then, the partition impurity for $\mathcal{T}$ is defined by

$$I(\mathcal{T}) = \sum_{t \in \mathcal{T}} i(t)\,P(t). \tag{3.29}$$

Thus the impurity reduction due to the split $s$ is $\Delta I(s,t) = I(\mathcal{T}) - I(\mathcal{T}^*)$, where $\mathcal{T}$ and $\mathcal{T}^*$ are as defined in $\Delta R(s,t)$ earlier; and the relative impurity reduction due to the split $s$ is given by

$$\Delta I(s \mid t) = \Delta I(s,t) / P(t) = 2\,P_L P_R\,[\,p(1 \mid t_L) - p(1 \mid t_R)\,]^2. \tag{3.30}$$

But this is precisely the risk reduction criterion used in the class probability estimation approach as expressed in (3.27). Thus, the impurity reduction partition using Gini diversity index in the classification approach is the same as the risk reduction partition in the class probability estimation approach. Therefore, the sample space $\mathcal{X}$ is partitioned in the same manner by both approaches when CART is used.

3.4 Construction of Discriminants from Sample Data

Since the measurement variables available in the Sri Lankan household study are mainly ordinal, not continuous, partitioning of the sample space by assuming normal populations may not be appropriate. Thus, only the latter two techniques, logistic linear discrimination and CART, are discussed in this section.
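Before turning to sample data, the two splitting criteria of Section 3.3.3 can be compared numerically. The sketch below is illustrative only: the function names, the unit costs, and the probabilities in the example are hypothetical. It evaluates the relative reduction $r(t) - P_L\,r(t_L) - P_R\,r(t_R)$ once with the Gini index (3.28) and once with the misclassification cost (3.13), for a node in which class 2 is the majority on both sides of the split.

```python
def gini(p1):
    """Gini diversity index i(t) = 2 p(1|t) p(2|t), as in (3.28)."""
    return 2.0 * p1 * (1.0 - p1)


def misclassification_cost(p1, c21=1.0, c12=1.0):
    """r_c(t) = min[C(2|1) p(1|t), C(1|2) p(2|t)], as in (3.13).

    Unit costs are an assumption made for this illustration.
    """
    return min(c21 * p1, c12 * (1.0 - p1))


def relative_reduction(r, p_left, p1_left, p1_right):
    """r(t) - P_L r(t_L) - P_R r(t_R), with p(1|t) = P_L p(1|t_L) + P_R p(1|t_R)."""
    p_right = 1.0 - p_left
    p1_parent = p_left * p1_left + p_right * p1_right
    return r(p1_parent) - p_left * r(p1_left) - p_right * r(p1_right)


# Hypothetical split: 40% of the node goes left; class 1 has probability
# 0.30 on the left and 0.05 on the right, so class 2 dominates everywhere.
p_left, p1_left, p1_right = 0.4, 0.30, 0.05

gini_drop = relative_reduction(gini, p_left, p1_left, p1_right)
closed_form = 2 * p_left * (1 - p_left) * (p1_left - p1_right) ** 2
cost_drop = relative_reduction(misclassification_cost, p_left, p1_left, p1_right)

print(round(gini_drop, 6), round(closed_form, 6))  # both 0.03
print(round(cost_drop, 6))                         # 0.0
```

In this example the Gini reduction agrees with the closed form in (3.27) and (3.30), while the misclassification-cost reduction is zero because the same class is chosen in the parent and in both descendants, which is the situation in which the risk reduction criterion yields no split.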
In practice, classification or discrimination problems begin with a sample of correctly classified objects, each with a set of measurements, $x$. The classification approach uses the sample to derive rules that partition the sample space into disjoint regions with each region purely or predominantly inhabited by members of a single class. The partitioning of a population into classification regions is similar to, but not quite the same as, the partitioning of a population into groups distinguishable with respect to high and low risk of belonging to a specific class. In principle, class is clearly defined while the terms high risk and low risk are relative. Both the logistic discrimination and the CART technique (for class probability estimation) estimate the class probabilities for each possible measurement vector $x$ in the sample space $\mathcal{X}$. The high and low risk groups are then defined by choosing a probability threshold.

3.4.1 Logistic Discriminant

Let $\{ (X_n, Y_n) : n = 1, \dots, N \}$ be a random sample of size $N$ from the joint distribution of $(X,Y)$, where $X$ is an $\mathcal{X}$-valued random variable and $Y$ is a $\mathcal{C}$-valued random variable that denotes the class membership of the observation. Logistic discrimination assumes that

$$\log\Big[\frac{P(Y = 1 \mid x)}{1 - P(Y = 1 \mid x)}\Big] = \eta_0 + x^T \eta \quad\text{for } x \in \mathcal{X}.$$

Thus, for $x \in \mathcal{X}$,

$$P(Y = 1 \mid x) = \frac{\exp(\eta_0 + x^T \eta)}{1 + \exp(\eta_0 + x^T \eta)}. \tag{3.31}$$

Therefore, the parameters $\eta_0$ and $\eta$ can be estimated by maximizing the likelihood function,

$$\prod_{n=1}^{N} P(Y = 1 \mid x_n)^{\delta_1(y_n)}\,[\,1 - P(Y = 1 \mid x_n)\,]^{\delta_2(y_n)}.$$

All logistic discriminant analyses performed for the Sri Lankan household study are accomplished by using a logistic regression program, PLR, of BMDP Statistical Software.

3.4.2 CART Discriminant: Growing a Classification Tree

Let $\{ (X_n, Y_n) : n = 1, \dots, N \}$ be a random sample of size $N$ from the joint distribution of $(X,Y)$, where $X$ is an $\mathcal{X}$-valued random variable and $Y$ is a $\mathcal{C}$-valued random variable that denotes the class membership of the observation. In both the classification and the class probability estimation approach, there are two situations to consider: one when the prior probabilities $\pi_1$ and $\pi_2$ are known, and another when the prior probabilities are unknown.

Consider first the situation where the prior probabilities are known. Let $N_j$ be the number of observations with $y = j$, $j = 1,2$. Suppose $\mathcal{T}$ is a partition of the sample space $\mathcal{X}$. Given $t \in \mathcal{T}$, let $N_j(t)$ be the number of observations with $x \in t$ and $y = j$, for $j = 1,2$. Then estimate $P(t) = P(X \in t)$ by

$$\hat{P}(t) = \sum_j \pi_j\,\frac{N_j(t)}{N_j}. \tag{3.32}$$

Suppose $\hat{P}(t) > 0$ for all $t \in \mathcal{T}$. Then for $j = 1,2$, estimate $p(j \mid t) = P(Y = j \mid X \in t)$ by

$$\hat{p}(j \mid t) = \frac{\pi_j\,N_j(t) / N_j}{\hat{P}(t)}. \tag{3.33}$$

In practical applications, however, the prior probabilities are often unknown. Then for any $t \in \mathcal{T}$, let $N(t)$ be the number of observations with $x \in t$, and estimate $P(t) = P(X \in t)$ by the proportion of observations in $t$,

$$\hat{P}(t) = \frac{N(t)}{N}. \tag{3.34}$$

Suppose $\hat{P}(t) > 0$ for all $t \in \mathcal{T}$. Then estimate $p(j \mid t)$ by the proportion of observations belonging to class $j$ in the subset $t$,

$$\hat{p}(j \mid t) = \frac{N_j(t)}{N(t)}. \tag{3.35}$$

For any $t \in \mathcal{T}$, let $p(j \mid t)$ be estimated by the appropriate $\hat{p}(j \mid t)$, $j = 1,2$. In the classification approach, let $\hat{\omega}_c(t)$ be the smallest $i \in \{1,2\}$ that minimizes $C(i \mid 1)\,\hat{p}(1 \mid t) + C(i \mid 2)\,\hat{p}(2 \mid t)$, and estimate $r_c(t)$ in (3.13) by

$$\hat{r}_c(t) = \min[\, C(2 \mid 1)\,\hat{p}(1 \mid t),\ C(1 \mid 2)\,\hat{p}(2 \mid t)\,]. \tag{3.36}$$

In the class probability estimation approach, let $\hat{\omega}_p(t)$ denote the vector $(\hat{p}(1 \mid t), \hat{p}(2 \mid t))$, and estimate $r_p(t)$ in (3.15) by

$$\hat{r}_p(t) = 2\,\hat{p}(1 \mid t)\,\hat{p}(2 \mid t). \tag{3.37}$$

Using the appropriate $\hat{P}(t)$ and $\hat{r}(t)$, the Bayes risk associated with the partition $\mathcal{T}$ is then estimated by

$$\hat{R}(\mathcal{T}) = \sum_{t \in \mathcal{T}} \hat{r}(t)\,\hat{P}(t). \tag{3.38}$$
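The estimates of $P(t)$ and $p(j \mid t)$ for known and unknown prior probabilities can be sketched as follows. This is an illustration under stated assumptions rather than the CART software's code: the function name, the argument layout, and the toy labels are hypothetical, and the sketch presumes that both classes occur in the sample and that the node contains at least one observation.

```python
from collections import Counter


def node_estimates(labels, in_node, priors=None):
    """Estimate P(t) and p(j|t), j = 1, 2, for one subset t.

    labels  : class labels y_n (1 or 2) for the whole sample
    in_node : booleans, True when x_n falls in t
    priors  : {1: pi_1, 2: pi_2} if the priors are known, else None
    """
    n_j = Counter(labels)                                   # N_j
    n_j_t = Counter(y for y, inside in zip(labels, in_node) if inside)  # N_j(t)
    if priors is None:
        # Unknown priors: sample proportions, as in (3.34) and (3.35).
        n_t = sum(n_j_t.values())
        p_t = n_t / len(labels)
        p_j_t = {j: n_j_t[j] / n_t for j in (1, 2)}
    else:
        # Known priors: reweight within-class proportions, as in (3.32) and (3.33).
        weighted = {j: priors[j] * n_j_t[j] / n_j[j] for j in (1, 2)}
        p_t = sum(weighted.values())
        p_j_t = {j: weighted[j] / p_t for j in (1, 2)}
    return p_t, p_j_t


# Hypothetical sample: N_1 = 2, N_2 = 6; the node holds one class 1 case
# and two class 2 cases.
labels = [1, 1, 2, 2, 2, 2, 2, 2]
in_node = [True, False, True, True, False, False, False, False]

p_t, p = node_estimates(labels, in_node)
print(p_t, round(p[1], 4))        # 0.375 0.3333

p_t, p = node_estimates(labels, in_node, priors={1: 0.1, 2: 0.9})
print(round(p_t, 4), round(p[1], 4))  # 0.35 0.1429
```

With either pair of estimates, the within-node Gini estimate (3.37) is `2 * p[1] * p[2]`; the example shows how much a known prevalence that differs from the sample proportion can shift $\hat{p}(1 \mid t)$.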
Recall from Section 3.3.3 the desirable splitting criterion for either the classification or the class probability estimation approach (see (3.27) or (3.30)). Let $\mathcal{T}$ be a partition of sample space $\mathcal{X}$ with $\hat{P}(t) > 0$ for every $t \in \mathcal{T}$. Consider a split $s$ of $t \in \mathcal{T}$ into $t_L$ and $t_R$, where $\hat{P}(t_L) > 0$ and $\hat{P}(t_R) > 0$. Let

$$\hat{P}_L = \frac{\hat{P}(t_L)}{\hat{P}(t)} \quad\text{and}\quad \hat{P}_R = \frac{\hat{P}(t_R)}{\hat{P}(t)}.$$

Then, the empirical splitting rule for either approach is to choose an allowable split $s$ of $t$ that maximizes

$$2\,\hat{P}_L \hat{P}_R\,[\,\hat{p}(1 \mid t_L) - \hat{p}(1 \mid t_R)\,]^2. \tag{3.39}$$

This partitioning procedure will continue to split until each subset of the current partition contains either observations of the same class, or observations with identical measurement vector $x$. Discriminant rules obtained in this manner are artificial and highly data dependent. Furthermore, it is conceivable that this splitting procedure may continue until each terminal set contains only one observation. In the following, the construction of a parsimonious partition suggested by Breiman et al. (1984) is summarized.

3.4.3 CART Discriminant: Pruning a Classification Tree

The stop-splitting criterion initially consists of setting a threshold and deciding not to split further if the decrease in the estimated impurity for the classification approach, or the decrease in the estimated risk for the class probability estimation approach, is less than the threshold. This may lead to two problems. If the threshold is set too low, then there are too many subsets in the resulting partition. If the threshold is set too high, good splits may be lost. That is, a subset $t$ may not produce a split with a large enough decrease, but its descendants $t_L$ and $t_R$ may be able to do so.

Breiman et al. (1984) suggest the following alternative. The basic procedure can be summarized in three steps which are more easily described by tree terminologies. Recall the construction of binary tree-structured discriminants. Since each node on a tree corresponds to some set on the sample space $\mathcal{X}$, the terms node and set will be used interchangeably henceforth. So far, the terminal nodes of a given tree, which constitute a partition of the sample space, is the only tree terminology introduced.

Definition 3.7 The root node of a given binary tree is the node with no ancestor; that is, the set on the tree which is not a subset of any other sets on the tree.

Let a binary tree be denoted by $T$. Any node on the tree $T$ is denoted by $t \in T$, and the set of terminal nodes is denoted by $\tilde{T}$.

Definition 3.8 A branch $T_t$ of $T$ with root node $t \in T$ consists of the node $t$ and all descendants of $t$ in $T$.

Definition 3.9 Pruning a branch $T_t$ from a tree $T$ involves cutting off $T_t$ just below the node $t$. The resulting tree is denoted by $T - T_t$.

Definition 3.10 $T'$ is a pruned subtree of $T$ if $T'$ is obtained by successively pruning off the branches of $T$.

The alternative to the stop-splitting procedure has three basic steps. The sample space $\mathcal{X}$ is first partitioned into an overly large binary tree; that is, the sample space is partitioned into fine sets. This tree is then pruned upward until only the root node is left. By using a more appropriate estimate of the risk, the right sized tree is selected from among the pruned subtrees. The most obvious criterion for selecting a right sized tree is to choose the pruned subtree with minimum estimated risk. This criterion may also be adjusted to compensate for estimation errors. However, these criteria may not always select a sensible tree. In most practical applications, the pruned subtrees and their corresponding risk estimates are inspected; and by using external information about the variables and by noting the context of the problem, the right sized tree is selected.
The first step is to grow a large tree $T_0$ by continuing the splitting procedure until all the terminal nodes are either pure, or contain only identical measurement vectors. Let $T_1$ be the smallest pruned subtree of $T_0$ with $\hat{R}(T_1) = \hat{R}(T_0)$. Note that the pruning criterion may differ for the classification and the class probability estimation approach. The estimated risk $\hat{R}(T) = \sum_{t \in \tilde{T}} \hat{r}(t)\,\hat{P}(t)$ is defined differently for the two cases: $\hat{r}(t)$ is the estimated within-node misclassification cost in the classification approach, while $\hat{r}(t)$ is the estimated within-node Gini diversity index in the class probability estimation approach.

Now for any branch $T_t$ of $T_1$, define $\hat{R}(T_t)$ by

$$\hat{R}(T_t) = \sum_{t' \in \tilde{T}_t} \hat{R}(t'),$$

where $\tilde{T}_t$ is the set of terminal nodes of $T_t$. Breiman et al. (1984: pp. 287-288) showed that for any nonterminal node $t$ of $T_1$, $\hat{R}(t) > \hat{R}(T_t)$.

Definition 3.11 Let $\lambda \ge 0$ be a real number called the complexity parameter and define the cost-complexity measure $\hat{R}_\lambda(T)$ as

$$\hat{R}_\lambda(T) = \hat{R}(T) + \lambda\,|\tilde{T}|,$$

where $|\tilde{T}|$ is the number of terminal nodes in the tree $T$.

The complexity parameter $\lambda$ may be thought of as the penalty on each terminal node of a tree. Thus the cost-complexity measure takes into account the risk associated with a tree, as well as the complexity of the tree.

Consider any nonterminal node $t$ of $T_1$. As long as $\hat{R}_\lambda(T_t) < \hat{R}_\lambda(\{t\})$, the tree with the branch $T_t$ intact is preferred over the pruned subtree without the branch $T_t$. However, at some critical value of $\lambda$, the two cost-complexities become equal. Then the smaller tree with the branch $T_t$ pruned off is preferred over $T_1$.

Definition 3.12 Consider a nontrivial tree $T$. Define a function $g(t;T)$ for $t \in T$ by

$$g(t;T) = \begin{cases} \dfrac{\hat{R}(t) - \hat{R}(T_t)}{|\tilde{T}_t| - 1}, & t \notin \tilde{T}, \\[2ex] +\infty, & t \in \tilde{T}. \end{cases}$$

Then define the weakest link $t^*$ in $T$ as the node satisfying

$$g(t^*;T) = \min_{t \in T} g(t;T).$$

Let $\lambda_2 = g(t_1^*; T_1)$, where $t_1^*$ is the weakest link in $T_1$. Then the node $t_1^*$ is the weakest link in the sense that as the complexity parameter $\lambda$ increases, it is the first node for which $\hat{R}_\lambda(\{t\})$ equals $\hat{R}_\lambda(T_t)$, where $T_t$ is a branch of $T_1$ with root node $t$. Thus, when the complexity parameter is $\lambda_2$, the pruned subtree $T_2$, obtained by pruning away the branch $T_{t_1^*}$ from $T_1$, is preferred over $T_1$. Now define recursively, for $k = 2, 3, \dots$, as long as $T_k$ is not just a terminal node,

$$\lambda_{k+1} = g(t_k^*; T_k), \qquad T_{k+1} = T_k - T_{t_k^*},$$

where $t_k^*$ is the weakest link in $T_k$. Continuing pruning in this manner, a decreasing sequence of subtrees is obtained: $T_1, T_2, \dots, T_K$, where $T_K$ is the root node on all subtrees. Furthermore, a corresponding increasing sequence of complexity parameters is also obtained (Breiman et al. 1984: p. 286).

The next step is to select one of these pruned subtrees as the right sized tree. If $\hat{R}(T_k)$ is used to estimate the risk associated with $T_k$, the largest tree will always have the minimum estimated risk. Furthermore, this estimate is biased. Thus a more accurate estimate of $R(T_k)$ is needed. Two methods of estimation are discussed by Breiman et al. (1984): use of an independent test sample and cross-validation.

As noted earlier, the sequence of subtrees, $T_1, \dots, T_K$, may differ for the classification and the class probability estimation approach. Since the class probability estimation approach seems more appropriate for the discrimination objectives of the Sri Lankan household study, the description of the estimation methods will be restricted to the class probability estimation approach. Extension to the classification approach can be made similarly.

3.4.3.1 Test Sample Estimates of Risk

The sample is divided randomly into two sets, where one set is used to construct the decision rules, and the other is used to estimate the risk associated with each rule constructed. These two sets are generally called the training sample and the test sample respectively.

Let $\mathcal{S}$ denote the random sample $\{ (X_n, Y_n) : n = 1, \dots, N \}$.
A sample of fixed size A / The  2 >  is randomly selected from y to form the test sample J^ . <2>  remainder J^  (1>  = ? - J*  constitutes the training sample, which is  <2)  used to construct the decreasing sequence of pruned subtrees, T ,...,T. 1  For each pruned subtree T^,  r^.  l e t p ^ ( j | x ) estimate the probability of  belonging to class j given measurement vector x , j = 1,2, by applying 3 ^ to the cases in the test sample. Then for j = 1,2, define  l  m  r  = C^<*l2n> " < W  *  * j < K —£r )m  N .  < 2)  where n .  <2>  (2)  ( 3  '  4 0 )  and Y = j }, and 6. (y ) is the Kronecker  n  ~ n  2  .  = { n : (X ,Y ) e ?  J  ]'  n  I  n  delta (/ i f y = £ and 0 otherwise). Test sample estimate of the Bayes risk associated with the tree 3 ^  * then given by s  R (r ) »E*)"<*V j la  n  M  •  (3.41)  If the prior probabilities are unknown, estimate «  by  A/^. / A / 2 >  2 >  ,  j =  The standard error estimate for R (J*^) denoted by S£"(R (J"^)), i8  obtained et al.  by  standard  statistical  t8  methods  as  described  in  may  be  Breiman  (1984).  A large sample is needed for this method.  In particular, a large  number of cases i s required in the training sample so that the rules constructed are somewhat reliable.  45  3.4.3.2 C r o s s - V a l i d a t i o n E s t i m a t e s o f Risk  When the data set i s large, test sample estimation i s a reasonable approach.  However, when the number of cases i s only a few hundred as i n  the Sri Lankan household study, test sample estimation can be inefficient in i t s use of available data.  Thus, cross-validation is preferred.  In V-fold cross-validation, the original sample f i s randomly divided into V subsets of similar sizes, f^, v = 1, ...,V. Then the v-th sample i s defined as ^  < v )  = ? - J» , for v = 1,...,V.  By using the entire sample J " , the decreasing sequence of pruned subtrees, T ,...,T^ is constructed.  
with corresponding complexity parameters $\lambda_1, \dots, \lambda_K$. Then for each $k = 1, \dots, K$, let $\lambda'_k$ denote the geometric mean of $\lambda_k$ with $\lambda_{k+1}$, with $\lambda'_K = \infty$.

Now for each sample $\mathcal{L}^{(v)}$, $v = 1, \dots, V$, construct $T_k^{(v)}$, the optimally pruned subtree with respect to the complexity parameter $\lambda'_k$, $k = 1, \dots, K$. Then for each tree $T_k^{(v)}$, let $p_k^{(v)}(j \mid x)$ estimate the probability of belonging to class $j$ given measurement vector $x$, $j = 1, 2$, by applying $T_k^{(v)}$ to the cases in $\mathcal{L}_v$. Then for $j = 1, 2$, define

$$R_j^{cv}(T_k) = \frac{1}{N_j} \sum_{v=1}^{V} \sum_{n \in \eta_j^{(v)}} \sum_{\ell=1}^{2} \left[ p_k^{(v)}(\ell \mid x_n) - \delta_\ell(y_n) \right]^2, \tag{3.42}$$

where $\eta_j^{(v)} = \{ n : (X_n, Y_n) \in \mathcal{L}_v \text{ and } Y_n = j \}$, and $\delta_\ell(y_n)$ is the Kronecker delta (1 if $y_n = \ell$ and 0 otherwise). The cross-validated estimate of the Bayes risk associated with the tree $T_k$ is then given by

$$R^{cv}(T_k) = \sum_{j=1}^{2} R_j^{cv}(T_k)\, \pi_j. \tag{3.43}$$

If the prior probabilities are unknown, estimate $\pi_j$ by $N_j / N$, $j = 1, 2$. The standard error estimate for $R^{cv}(T_k)$, denoted by $SE(R^{cv}(T_k))$, may be obtained by heuristic arguments as described in Breiman et al. (1984).

The right sized tree may be defined as the pruned subtree with minimum estimated risk, say $T_{k_0}$, or, as recommended by Breiman et al. (1984), as the tree selected by the 1 SE rule: instead of the tree with minimum estimated risk, the smallest tree $T_{k_1}$ satisfying

$$R^{ts}(T_{k_1}) \le R^{ts}(T_{k_0}) + SE(R^{ts}(T_{k_0})) \quad \text{or} \quad R^{cv}(T_{k_1}) \le R^{cv}(T_{k_0}) + SE(R^{cv}(T_{k_0})),$$

whichever is appropriate, is selected. This rule was created to take into account the instability of the minimum estimated risk, and to select the simplest tree whose estimated risk is comparable to the minimum estimated risk. Note that $T_{k_1}$ is a pruned subtree of $T_{k_0}$.

4. Path Analysis

Path analysis investigates causal
patterns in a set of variables, in contrast to the focus of discriminant analysis on patterns among individuals or cases. This statistical methodology, which was introduced by a geneticist, Sewall Wright, in the 1920's, has been popularized in the sociological literature (see Duncan 1966, Land 1969, Blalock 1970 and others). Path analysis utilizes a visual representation, called a path diagram, which consists of arrows leading from one variable to another, to illustrate the cause-and-effect relationships among the variables. The statistical part of the method does not specify the direction of cause-and-effect relations between the variables, but does provide quantitative assessments of the relationships via what are called path coefficients. Thus, this is not a method for discovering causal relationships among the variables, but rather a method for assessing whether or not a specified set of relationships among the variables is compatible with the observations. Hence, directions of causality between variables are specified by using non-statistical information or substantive theory. In practice, the natural temporal ordering of the variables usually indicates the direction of causality between the variables.

The method of path analysis was initially developed for quantitative data, where a path diagram is based on a sequence of linear regression models. However, most sociological data are qualitative instead of quantitative. Thus, the assumptions under which path analysis was developed are generally not satisfied. Goodman (1972, 1973a,b) proposed a method for studying causal relationships among discrete variables, where a path diagram is based on one or more loglinear or logit models. However, causal models thus constructed have limitations, and are not directly analogous to causal models with continuous variables (Fienberg 1980, Rosenthal 1980).
Various problems in causal modelling with quantitative or qualitative data have been explored recently (Wermuth 1980 and 1987, Wermuth and Lauritzen 1983, Kiiveri, Speed and Carlin 1984, and others). In this thesis, only the basic approach which led to the more recent developments for qualitative data is examined.

4.1 Structural Modelling with Quantitative Data

4.1.1 Path Models

A path model can be represented by a path diagram. Suppose we are interested in the relationship between infant mortality ($X_0$), a dichotomous variable, and two explanatory variables, say age ($X_1$) and education ($X_2$) of the mother. We suspect that both age and education influence infant mortality directly. Further, we rule out the possibility that education affects age, but will postulate that age affects the level of education attained. Then this model can be represented pictorially as in Figure 4.

[Figure 4: An example of a path diagram. Arrows lead from Age ($X_1$) to Level of education ($X_2$), and from both $X_1$ and $X_2$ to Infant death ($X_0$).]

The directed arrow, leading from one variable to another, indicates that the first variable has direct influence on the second. A path is formed by moving along the arrows. In our example, $X_1 \to X_2$, $X_1 \to X_0$, $X_2 \to X_0$, and $X_1 \to X_2 \to X_0$ are the possible paths. If a path diagram contains a path that traces back onto itself, then the diagram is said to have a feedback loop. Any path model represented by a diagram with no feedback loop is called a recursive system. All path models considered hereafter are recursive.

The method of path analysis assumes that all relationships are linear. Thus for the above example,

$$X_2 = \beta_{21} X_1, \qquad X_0 = \beta_{01} X_1 + \beta_{02} X_2. \tag{4.1}$$

But in practice
this is not exact; there are unmeasured sources of variation. Thus, the above system of equations is more appropriately expressed as

$$X_2 = \beta_{21} X_1 + \epsilon_2, \qquad X_0 = \beta_{01} X_1 + \beta_{02} X_2 + \epsilon_0, \tag{4.2}$$

where the error terms, $\epsilon_2$ and $\epsilon_0$, have mean 0 and are uncorrelated with the other variables in the corresponding equations. Without loss of generality, assume hereafter that all variables are standardized to mean 0 and unit variance. Conventionally, coefficients in the equations with standardized variables are referred to as path coefficients, and are denoted by $p_{ij}$, where the subscripts indicate that $p_{ij}$ represents the direct effect of standardized variable $X'_j$ on standardized variable $X'_i$. Thus, our path model can be re-expressed as

$$X'_2 = p_{21} X'_1 + p_{22} e_2, \qquad X'_0 = p_{01} X'_1 + p_{02} X'_2 + p_{00} e_0, \tag{4.3}$$

where coefficients such as $p_{22}$ and $p_{00}$ are generally referred to as the residual path coefficients. The path diagram is then modified as follows.

[Figure 5: An example of a path diagram with path coefficients.]

Since a path model can be represented by a sequence of linear submodels, the corresponding path diagram can be modified to better reflect this key concept by the use of colors. For instance, the earlier example can be represented by a path diagram with colored arcs as in Figure 6. The modified path diagram is visually more attractive, in the sense that vital information can be extracted more easily. Suppose we want to know which variables have direct effect on a specific variable in a more complicated path model. Instead of staring at a maze of arcs, we can focus on a particular color and obtain the desired information. This feature is especially useful in specifying the system of linear equations that represents a path model.
[Figure 6: An example of a colored path diagram.]

The basic assumptions underlying the application of path analysis for quantitative data are summarized as follows:

i. Causal (or temporal) ordering of the variables in the model is assumed as specified. Validity of the model cannot be evaluated from the data; external criteria or substantive theory must provide justification for the model proposed.
ii. Relationships among the variables are linear and additive.
iii. Error terms are not correlated with variables preceding them in the submodel, nor with each other.
iv. The variables are measured on an interval scale (at least), with the exception of dichotomous variables, which can be included as interval-scaled by assigning numerical scores to the two categories.

4.1.2 Estimation and Interpretation of Path Coefficients

Path coefficients may be estimated in two ways. The first method, decomposing correlation coefficients, was employed by Wright (1934, 1960) in the development of path analysis. The second method consists of applying ordinary least squares regression to each submodel in the system.

The latter method of estimation automatically provides estimates of the precision of the coefficients, and a framework in which hypotheses concerning the coefficients may be tested. Although the regression method is generally preferred, the method of decomposing correlation coefficients offers a more fundamental understanding of the relationships among the variables considered. In the following, these two estimation methods are illustrated in the context of the earlier example using a random sample of size $N$.

Since the variables are standardized, the sample correlation coefficient between $X_i$ and $X_j$ can be expressed as

$$r_{ij} = \frac{1}{N} \sum x'_i x'_j .$$
Let the sample correlation coefficient be zero if the two variables are assumed to be uncorrelated. Then in path model (4.3), the sample correlation between each error term and the explanatory variables in its equation is zero.

Let $\hat{p}_{ij}$ denote the estimate of path coefficient $p_{ij}$. Then path model (4.3) implies that

$$r_{21} = \frac{1}{N} \sum x'_2 x'_1 = \frac{1}{N} \sum x'_1 \left( \hat{p}_{21} x'_1 + \hat{p}_{22} e_2 \right) = \hat{p}_{21}, \tag{4.4}$$

since $\frac{1}{N} \sum {x'_1}^2 = 1$ and $\frac{1}{N} \sum x'_1 e_2 = 0$. Similarly,

$$r_{01} = \hat{p}_{01} + \hat{p}_{02}\, r_{21}, \quad \text{and} \quad r_{02} = \hat{p}_{02} + \hat{p}_{01}\, r_{12}. \tag{4.5}$$

In general, Wright (1934) showed that

$$r_{ij} = \sum_s \hat{p}_{is}\, r_{sj}, \tag{4.6}$$

where $s$ runs over all variables with direct effect on $X'_i$. Therefore, estimates of the path coefficients can be obtained by solving for the $\hat{p}_{ij}$'s in the decomposition of correlation coefficients. In our example,

$$\hat{p}_{21} = r_{21}, \qquad \hat{p}_{01} = \frac{r_{01} - r_{02}\, r_{21}}{1 - r_{21}^2}, \qquad \hat{p}_{02} = \frac{r_{02} - r_{01}\, r_{12}}{1 - r_{12}^2}. \tag{4.7}$$

Now the residual path coefficients can be obtained by noting

$$r_{22} = \frac{1}{N} \sum {x'_2}^2 = \frac{1}{N} \sum \left( \hat{p}_{21} x'_1 + \hat{p}_{22} e_2 \right)^2 = \hat{p}_{21}^2 + \hat{p}_{22}^2, \quad \text{and}$$

$$r_{00} = \frac{1}{N} \sum {x'_0}^2 = \hat{p}_{01}^2 + \hat{p}_{02}^2 + \hat{p}_{00}^2 + 2\, \hat{p}_{01}\, \hat{p}_{02}\, \hat{p}_{21}.$$

Thus, since $r_{22} = r_{00} = 1$,

$$\hat{p}_{22} = \left( 1 - \hat{p}_{21}^2 \right)^{1/2}, \qquad \hat{p}_{00} = \left( 1 - \hat{p}_{01}^2 - \hat{p}_{02}^2 - 2\, \hat{p}_{01}\, \hat{p}_{02}\, \hat{p}_{21} \right)^{1/2}. \tag{4.8}$$

For a simple path model as in our example, this method of estimation seems straightforward. However, for a more complicated model, this method can be very tedious.

Since a path model is essentially a sequence of linear submodels, path coefficients can be estimated by applying the method of ordinary least squares regression to each submodel. Thus for path model (4.3), the ordinary least squares estimate of $p_{21}$ is

$$\hat{p}_{21} = \frac{\sum x'_1 x'_2}{\sum {x'_1}^2} = r_{21},$$

since $x'_1$ and $x'_2$ are standardized; and the normal equations for the second linear relationship are as expressed in (4.5).
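The agreement between the two estimation routes can be checked numerically. The following is a minimal sketch, not the thesis's own computation: the data are simulated, and the function names and the generating coefficients are hypothetical choices made only for illustration. It solves the decomposition (4.7) from sample correlations and, separately, fits each submodel of (4.3) by least squares on standardized variables.

```python
import numpy as np


def path_coefficients_from_correlations(r21, r01, r02):
    """Solve the correlation decomposition (4.7) for path model (4.3)."""
    p21 = r21
    p01 = (r01 - r02 * r21) / (1.0 - r21 ** 2)
    p02 = (r02 - r01 * r21) / (1.0 - r21 ** 2)
    return p21, p01, p02


def path_coefficients_by_ols(x1, x2, x0):
    """Fit each submodel of (4.3) by least squares on standardized variables."""
    def standardize(v):
        # np.std uses the 1/N convention by default, matching r_ij = (1/N) sum x'_i x'_j
        return (v - v.mean()) / v.std()

    z1, z2, z0 = standardize(x1), standardize(x2), standardize(x0)
    p21 = np.linalg.lstsq(z1[:, None], z2, rcond=None)[0][0]
    p01, p02 = np.linalg.lstsq(np.column_stack([z1, z2]), z0, rcond=None)[0]
    return p21, p01, p02


# Simulated data; the generating coefficients below are arbitrary illustrations.
rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = -0.3 * x1 + rng.normal(size=n)
x0 = 0.2 * x1 - 0.4 * x2 + rng.normal(size=n)

r = np.corrcoef(np.vstack([x0, x1, x2]))  # rows/columns ordered as X0, X1, X2
from_corr = path_coefficients_from_correlations(r[2, 1], r[0, 1], r[0, 2])
from_ols = path_coefficients_by_ols(x1, x2, x0)
print(from_corr)
print(from_ols)
```

Both routes return the same three numbers up to rounding error, which is the two-predictor case of the general result cited from Land (1973).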
It can be shown easily that each residual path coefficient equals $\sqrt{1 - R^2}$, where $R^2$ is the coefficient of multiple determination between the dependent variable in question and those variables with direct influence on it. Thus for model (4.3),

$$\hat{p}_{22} = \sqrt{1 - R^2_{2 \cdot 1}} = \sqrt{1 - \frac{1}{N} \sum \left( \hat{p}_{21} x'_1 \right)^2} = \sqrt{1 - \hat{p}_{21}^2},$$

$$\hat{p}_{00} = \sqrt{1 - R^2_{0 \cdot 12}} = \sqrt{1 - \frac{1}{N} \sum \left( \hat{p}_{01} x'_1 + \hat{p}_{02} x'_2 \right)^2} = \sqrt{1 - \hat{p}_{01}^2 - \hat{p}_{02}^2 - 2\, \hat{p}_{01}\, \hat{p}_{02}\, \hat{p}_{21}},$$

where $R^2_{2 \cdot 1}$ is the coefficient of multiple determination between dependent variable $X'_2$ and independent variable $X'_1$, and $R^2_{0 \cdot 12}$ is the coefficient of multiple determination between dependent variable $X'_0$ and independent variables $X'_1$ and $X'_2$. Therefore, estimates of the path coefficients agree for both methods. Proof of the general result can be found in Land (1973).

By treating the data from the Sri Lankan household study as a simple random sample, the path coefficients for our example path model (4.3) are estimated (see Figure 7).

[Figure 7: A path model with estimated path coefficients: -0.16 (age to level of education), 0.13 (age to infant death), -0.15 (level of education to infant death); residual path coefficient 0.98 for infant death.]

All path coefficients are significantly nonzero at the 5% level. But, as shown by the residual path coefficients, or equivalently the coefficients of multiple determination, linear models do not fit the data well. For further analysis, one may try transforming the variables.

Wright developed the method of path analysis as a means of studying the direct and indirect effects of variables. Direct effect refers to the effect of an independent variable on a dependent variable directly without any mediating variables. Indirect effect pertains to the effect of an independent variable on a dependent variable through a third variable which affects the dependent variable either directly or indirectly.
In our example, $X'_1$ has an indirect effect on $X'_0$ through $X'_2$, which has a direct effect on $X'_0$. In another model, $X'_2$ may not have a direct effect on $X'_0$, but may have an indirect effect through another variable, say $X'_3$, that has a direct effect on $X'_0$.

The observed correlation between two variables can be expressed as a sum of three components. The direct and indirect effects of one variable on the other account for two of the components. The third component of the correlation coefficient is attributable to the antecedent variables common to the two variables under consideration. This component is referred to as the spurious component. The decomposition of the correlation coefficient as shown in (4.6) may be re-expressed as follows:

$$r_{ij} = \underbrace{p_{ij}}_{\text{direct effect}} + \underbrace{\sum_s p_{is}\, r_{sj}}_{\text{indirect effects}} + \underbrace{\sum_q p_{iq}\, r_{qj}}_{\text{spurious component}},$$

where both $X'_s$ and $X'_q$ have direct influence on $X'_i$, with $s$ running over all variables $X'_s$ which are influenced by $X'_j$, and $q$ running over all variables $X'_q$ which influence $X'_j$; that is, $s$ runs over all variables that have a direct path to $X'_i$ and can be reached by following the arrows from $X'_j$, and $q$ runs over all variables that have a direct path to $X'_i$ and can reach $X'_j$ by following the arrows. The sum of direct and indirect effects is called the total effect. For our path model (4.3),

$r_{21} = p_{21}$ (direct effect only),
$r_{01} = p_{01} + p_{02}\, r_{21}$ (direct effect plus indirect effect),
$r_{02} = p_{02} + p_{01}\, r_{12}$ (direct effect plus spurious component).

Using data from the Sri Lankan household study, the estimated direct and indirect effects are shown in the following table.

Table III  Estimated direct and indirect effects for path model (4.3)

Effect                        Direct    Indirect
Age on education              -0.16     -
Age on infant death            0.13     0.02
Education on infant death     -0.15     -

Thus, the effect of age on infant death is mainly direct.
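The entries of Table III can be cross-checked with a few lines of arithmetic. This is only an illustrative sketch: the three path coefficients are the rounded values reported above, the variable names are hypothetical, and the results are therefore accurate only to the rounding of the inputs.

```python
# Rounded path coefficient estimates reported in Table III.
p21 = -0.16  # age -> education
p01 = 0.13   # age -> infant death
p02 = -0.15  # education -> infant death

r21 = p21  # by (4.4), the correlation between age and education equals p21

# Age on infant death: direct effect p01, indirect effect through education.
indirect_age = p02 * r21
total_age = p01 + indirect_age

# Education on infant death: direct effect p02, spurious component through age.
spurious_education = p01 * r21

# Residual path coefficient for infant death, from (4.8).
residual_infant_death = (1 - p01**2 - p02**2 - 2 * p01 * p02 * p21) ** 0.5

print(round(indirect_age, 2))           # 0.02
print(round(total_age, 2))              # 0.15
print(round(spurious_education, 2))     # -0.02
print(round(residual_infant_death, 2))  # 0.98
```

The indirect effect of 0.02 and the residual coefficient of 0.98 reproduce the values shown in Table III and Figure 7, which is a useful consistency check on the reconstruction of those two displays.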
Therefore, decomposition of a correlation coefficient provides a way of separating the direct effect on the dependent variable from the indirect effect which manifests itself through the correlations with other explanatory variables.

4.2 Structural Modelling with Qualitative Data

4.2.1 Loglinear and Logit Models

Goodman (1972, 1973a,b) proposed using loglinear and logit models to study the causal patterns in a set of discrete variables. Commonly used terminologies and notations for the analysis of categorical variables are reviewed in the context of three-dimensional contingency tables. A more complete presentation of this methodology can be found in Fienberg (1980), Haberman (1978), Bishop, Fienberg and Holland (1975), and others.

Consider three variables, $A$, $B$ and $C$, with $I$, $J$ and $K$ categories respectively. Suppose a random sample of size $N$ has been collected. Let $m_{ijk}$ denote the expected number of observations with $(A,B,C) = (i,j,k)$ for $i = 1, \dots, I$, $j = 1, \dots, J$ and $k = 1, \dots, K$. Then the general loglinear model is given by

$$\log m_{ijk} = u + u_{1(i)} + u_{2(j)} + u_{3(k)} + u_{12(ij)} + u_{13(ik)} + u_{23(jk)} + u_{123(ijk)}, \tag{4.9}$$

where

$$\sum_{i=1}^{I} u_{1(i)} = \sum_{j=1}^{J} u_{2(j)} = \sum_{k=1}^{K} u_{3(k)} = 0,$$

$$\sum_{i=1}^{I} u_{12(ij)} = \sum_{j=1}^{J} u_{12(ij)} = \sum_{i=1}^{I} u_{13(ik)} = \sum_{k=1}^{K} u_{13(ik)} = \sum_{j=1}^{J} u_{23(jk)} = \sum_{k=1}^{K} u_{23(jk)} = 0,$$

$$\sum_{i=1}^{I} u_{123(ijk)} = \sum_{j=1}^{J} u_{123(ijk)} = \sum_{k=1}^{K} u_{123(ijk)} = 0.$$

This general loglinear model does not impose any restriction on the expected cell counts $\{m_{ijk}\}$, and is denoted by [ABC]. By setting some of the u-terms to zero, special cases of the model can be obtained:

Table IV  Various loglinear models for three-dimensional tables

Model            u-terms set to zero
[AB][AC][BC]     $u_{123(ijk)}$
[AB][AC]         $u_{123(ijk)}$, $u_{23(jk)}$
[AB][BC]         $u_{123(ijk)}$, $u_{13(ik)}$
[AC][BC]         $u_{123(ijk)}$, $u_{12(ij)}$
[AB][C]          $u_{123(ijk)}$, $u_{13(ik)}$, $u_{23(jk)}$
[AC][B]          $u_{123(ijk)}$, $u_{12(ij)}$, $u_{23(jk)}$
[BC][A]          $u_{123(ijk)}$, $u_{12(ij)}$, $u_{13(ik)}$
[A][B][C]        $u_{123(ijk)}$, $u_{12(ij)}$, $u_{13(ik)}$, $u_{23(jk)}$

Model [AB][AC][BC] assumes that each two-variable interaction is unaffected by the value of the third variable. Models [AB][AC], [AB][BC], and [AC][BC] are obtained by assuming conditional independence of two variables given the third. For example, model [AB][AC] assumes that variables B and C are independent given variable A. Models [AB][C], [AC][B], and [BC][A] are obtained by assuming one variable is jointly independent of the other two. For example, model [AB][C] assumes that variable C is jointly independent of variables A and B. Lastly, model [A][B][C] assumes that the three variables are mutually independent.

The method proposed by Goodman is restricted to a hierarchical set of models in which higher-ordered terms may appear only if the related lower-ordered terms are present. An example of a nested hierarchy of models is given below:

[A][B][C] $\subset$ [AB][C] $\subset$ [AB][AC] $\subset$ [AB][AC][BC] $\subset$ [ABC],

where $\subset$ means "is a special case of".

Effects of categorical predictors, say A and B, on a dichotomous response, say C, can also be assessed by a logit model:

$$\text{logit}^{C|AB}_{ij} = \log \frac{m_{ij1}}{m_{ij2}} = w + w_{1(i)} + w_{2(j)} + w_{12(ij)} \tag{4.10}$$

for $i = 1, \dots, I$, and $j = 1, \dots, J$, where

$$\sum_{i=1}^{I} w_{1(i)} = \sum_{j=1}^{J} w_{2(j)} = \sum_{i=1}^{I} w_{12(ij)} = \sum_{j=1}^{J} w_{12(ij)} = 0.$$

Note that this logit model can be obtained from the general loglinear model by making the following identifications:

$$w = 2 u_{3(1)}, \qquad w_{1(i)} = 2 u_{13(i1)}, \qquad w_{2(j)} = 2 u_{23(j1)}, \qquad w_{12(ij)} = 2 u_{123(ij1)}.$$

Special cases of this logit model can again be obtained by setting some of the w-terms to zero.
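The identifications between w-terms and u-terms can be verified numerically for the simplest case. The sketch below assumes I = J = K = 2, so that every sum-to-zero constraint reduces to a sign pattern of +1 for level 1 and -1 for level 2; the u-values are random numbers chosen only for illustration, and the variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# With two levels per variable, u_1(2) = -u_1(1), u_12(12) = -u_12(11), and so on,
# so each u-term is its level-1 value times a sign pattern of +1 / -1.
sign = np.array([1.0, -1.0])
u, u1, u2, u3, u12, u13, u23, u123 = rng.normal(size=8)  # arbitrary level-1 values

i, j, k = np.meshgrid(sign, sign, sign, indexing="ij")
log_m = (u + u1 * i + u2 * j + u3 * k
         + u12 * i * j + u13 * i * k + u23 * j * k + u123 * i * j * k)  # model (4.9)

# Logit of C given (A, B): log m_ij1 - log m_ij2, as in (4.10).
logit_from_loglinear = log_m[:, :, 0] - log_m[:, :, 1]

# The same logits built from the w-terms, each twice the matching u-term.
w, w1, w2, w12 = 2 * u3, 2 * u13, 2 * u23, 2 * u123
si, sj = np.meshgrid(sign, sign, indexing="ij")
logit_from_w = w + w1 * si + w2 * sj + w12 * si * sj

print(np.allclose(logit_from_loglinear, logit_from_w))  # True
```

Terms that do not involve C (u, the A and B main effects, and the AB interaction) cancel in the difference of logs, which is why they have no counterpart among the w-terms.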
Logit models for categorical predictors are special cases of logistic response models introduced in Section 3.3.2. Let $p_{ijk}$ denote the probability that $(A,B,C) = (i,j,k)$, for $i = 1, \dots, I$, $j = 1, \dots, J$, and $k = 1, 2$. Then, (4.10) can be rewritten as

$$\log \frac{p_{ij1}}{p_{ij2}} = w + w_{1(i)} + w_{2(j)} + w_{12(ij)} \tag{4.11}$$

with the same restrictions on the w-terms. Suppose $I = J = 2$. Let $X_A$ and $X_B$ be dummy variables defined as

$X_A = 1$ if $A = 1$, and $X_A = -1$ if $A = 2$,
$X_B = 1$ if $B = 1$, and $X_B = -1$ if $B = 2$,

and let $X_{AB} = X_A X_B$. Further, let $p(k \mid X)$ denote the probability of $C = k$ given $X_A$ and $X_B$, i.e. let $p(k \mid X) = p_{ijk} / (p_{ij1} + p_{ij2})$. Then (4.11) can be rewritten as

$$\log \frac{p(1 \mid X)}{p(2 \mid X)} = w + w_{1(1)} X_A + w_{2(1)} X_B + w_{12(11)} X_A X_B. \tag{4.12}$$

Thus, logit models are special cases of logistic response models where the predictors need not necessarily be categorical. Extension to predictors with more than two categories can be made similarly by defining the appropriate dummy variables.

4.2.2 Path Models

As in Section 4.1, suppose we are interested in the relationship between infant death (C) and two explanatory variables, say age (A) and education (B) of the mother. But now assume that each variable has only two levels. The relationship between variables A and B can then be expressed by the logit model

$$\text{logit}^{B|A}_{i} = w^{B|A} + w^{B|A}_{1(i)}, \tag{4.13}$$

where $\sum_i w^{B|A}_{1(i)} = 0$. Now build a logit model with C (infant death) as the response variable, and A and B as the explanatory variables. The three unsaturated loglinear models corresponding to such a logit model are

1. [AB][AC][BC]
2. [AB][AC]
3. [AB][BC].

The best model among those providing acceptable fit is chosen using external information, or substantive theory. The fit of a recursive system of logit models can be assessed by two approaches, which are presented in a later section. Suppose model 1 is the best model.
Then the path model can be represented by the following diagram with path coefficients given by the w-terms.

[Figure 8: A path model with dichotomous variables. Arrows lead from Age (A) to Level of education (B), and from A and B to Infant death (C), each labelled with the corresponding w-term.]

Several drawbacks of this method proposed by Goodman (1972, 1973a,b) are illuminated by the above example. Although Goodman does assign numerical values to arrows in the diagram, these values do not have the same interpretation as in path analysis for continuous variables. There is no calculus of path coefficients; so there is no way of evaluating the indirect effect of a variable. Further, variables with multiple categories have multiple path coefficients associated with a given arrow in the diagram. Thus, interpretation of the model may be complicated. Since a sparse contingency table will pose problems in estimation of the u-terms, and thus the w-terms, the number of categories for each variable, and the number of variables considered, must be restricted. In view of these obstacles, we will limit ourselves to variables with two categories, and consider only a small number of variables.

4.2.3 Estimation of Path Coefficients

The path coefficients are estimated by the maximum likelihood method, which will be illustrated using a two-dimensional table. The method can easily be extended to higher dimensional tables. Our Sri Lankan household data set is assumed to be a fixed sample, in which each member is cross-classified according to its values for the variables under consideration. Since a multinomial sampling model is assumed for the Sri Lankan household study, the estimation procedure will be developed based on such models. Estimation procedures are similar for other commonly encountered sampling models, such as product-multinomial and Poisson (see Bishop, Fienberg and Holland 1975, and Fienberg 1980).
Consider a random sample of $N$ subjects, where $(A_h, B_h)$ for subject $h$ is observed, $h = 1, \dots, N$. Let $p_{ij}$ denote the probability that $(A,B) = (i,j)$, and let $Z_{ij}$ be the number of subjects with $A = i$ and $B = j$, for $i,j = 1,2$. Then, under the multinomial sampling model, the expected number of subjects with $A = i$ and $B = j$ is given by

$$m_{ij} = E(Z_{ij}) = N p_{ij}. \tag{4.14}$$

The general loglinear model for a two-dimensional table is

$$\log m_{ij} = u + u_{1(i)} + u_{2(j)} + u_{12(ij)} \tag{4.15}$$

for $i,j = 1,2$, where

$$\sum_{i=1}^{2} u_{1(i)} = \sum_{j=1}^{2} u_{2(j)} = \sum_{i=1}^{2} u_{12(ij)} = \sum_{j=1}^{2} u_{12(ij)} = 0.$$

Alternatively, the matrix representation of this model is

$$\log \begin{pmatrix} m_{11} \\ m_{12} \\ m_{21} \\ m_{22} \end{pmatrix} = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & 1 & -1 \\ 1 & -1 & -1 & 1 \end{pmatrix} \begin{pmatrix} u \\ u_{1(1)} \\ u_{2(1)} \\ u_{12(11)} \end{pmatrix}$$

or $\log \mathbf{m} = W \boldsymbol{\beta}$.

The likelihood function is given by

$$L(\boldsymbol{\beta}) \propto \prod_{i,j} p_{ij}^{z_{ij}} \propto \prod_{i,j} m_{ij}^{z_{ij}},$$

where $z_{ij}$ are the observed cell counts. Thus the maximum likelihood equations are given by

$$\frac{\partial}{\partial \boldsymbol{\beta}} \log L(\boldsymbol{\beta}) = W^T (\mathbf{z} - \hat{\mathbf{m}}) = \mathbf{0}, \tag{4.16}$$

where $\mathbf{z} = (z_{11}, z_{12}, z_{21}, z_{22})^T$ and $\hat{\mathbf{m}}$ is the maximum likelihood estimate of $\mathbf{m}$. Further, the observed Fisher information matrix is given by

$$\mathcal{I}_o = -\frac{\partial^2}{\partial \boldsymbol{\beta}\, \partial \boldsymbol{\beta}^T} \log L(\boldsymbol{\beta}) = W^T \hat{M} W, \tag{4.17}$$

where $\hat{M} = \text{diag}(\hat{m}_{11}, \hat{m}_{12}, \hat{m}_{21}, \hat{m}_{22})$.

Hence, the maximum likelihood estimates of $\boldsymbol{\beta}$ can be obtained by the Newton-Raphson iterative procedure:

$$\boldsymbol{\beta}^{(l+1)} = \boldsymbol{\beta}^{(l)} + \left[ W^T M^{(l)} W \right]^{-1} W^T \left( \mathbf{z} - \mathbf{m}^{(l)} \right), \qquad l = 0, 1, \dots,$$

where $\boldsymbol{\beta}^{(l)}$ is the estimate of $\boldsymbol{\beta}$ at the $l$-th stage, $\mathbf{m}^{(l)} = \exp(W \boldsymbol{\beta}^{(l)})$, and $M^{(l)}$ is the diagonal matrix corresponding to $\mathbf{m}^{(l)}$. Since the choice of initial estimate $\boldsymbol{\beta}^{(0)}$ will affect the rate of convergence, the initial estimate should be chosen carefully. In general, a weighted least squares estimate of $\boldsymbol{\beta}$ based on the observed cell counts $z_{ij}$ will provide a satisfactory initial estimate.

The u-terms can also be estimated by using various other methods (see Bishop, Fienberg and Holland 1975).
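The Newton-Raphson iteration described above can be sketched in a few lines. This is an illustrative implementation under stated assumptions, not the program used in the thesis: the function name `fit_loglinear`, the cell counts, and the convergence settings are hypothetical, and the starting value is an unweighted least squares fit to the log counts (which requires all counts to be positive) rather than a weighted one.

```python
import numpy as np

# Design matrix W for the saturated 2x2 model; rows are cells (1,1), (1,2), (2,1), (2,2)
# and columns correspond to u, u_1(1), u_2(1), u_12(11).
W_SATURATED = np.array([[1, 1, 1, 1],
                        [1, 1, -1, -1],
                        [1, -1, 1, -1],
                        [1, -1, -1, 1]], dtype=float)


def fit_loglinear(z, W, tol=1e-10, max_iter=50):
    """Newton-Raphson for log m = W beta; returns estimates, standard errors, fitted counts."""
    z = np.asarray(z, dtype=float)
    beta = np.linalg.lstsq(W, np.log(z), rcond=None)[0]  # simple starting value
    for _ in range(max_iter):
        m = np.exp(W @ beta)
        information = W.T @ (m[:, None] * W)              # W^T M W
        step = np.linalg.solve(information, W.T @ (z - m))
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    m = np.exp(W @ beta)
    information = W.T @ (m[:, None] * W)                  # observed information at convergence
    std_err = np.sqrt(np.diag(np.linalg.inv(information)))
    return beta, std_err, m


z = np.array([30.0, 20.0, 10.0, 40.0])  # hypothetical counts for cells (1,1), (1,2), (2,1), (2,2)

beta_sat, se_sat, m_sat = fit_loglinear(z, W_SATURATED)         # saturated model
beta_ind, se_ind, m_ind = fit_loglinear(z, W_SATURATED[:, :3])  # independence: u_12 dropped

w_b_given_a = 2.0 * beta_sat[3]  # w-term of the logit model for B given A, twice u_12(11)
print(m_ind)                     # fitted counts under independence
print(beta_sat, se_sat)
```

Dropping the last column of W illustrates the remark that other loglinear models only require a modified W-matrix. For the saturated model the starting value already reproduces the observed counts, so the iteration stops after one step; the independence fit takes a few iterations and recovers the familiar row total times column total over N.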
However, only the Newton-Raphson iterative procedure provides a readily available estimate of the precision of $\hat{\boldsymbol{\beta}}$. The maximum likelihood estimator $\hat{\boldsymbol{\beta}}$ is asymptotically normally distributed with mean $\boldsymbol{\beta}$ and variance $\mathcal{I}^{-1}$, where $\mathcal{I}$ is the Fisher information matrix. In practical applications, the observed information matrix $\mathcal{I}_o$, which is available upon convergence in the Newton-Raphson procedure, is often used in place of $\mathcal{I}$. Therefore, statistical inference for the u-terms (in vector $\boldsymbol{\beta}$) is possible.

Although the above iterative procedure is described for the saturated loglinear model in the case of two-dimensional tables, extension to other loglinear models simply involves modifying the $\mathbf{m}$-vector, the $W$-matrix, and others accordingly. Thus, estimates of the u-terms can be obtained similarly. Since path coefficients (w-terms) are twice the appropriate u-terms, they can be estimated from the estimates of the u-terms.

4.2.4 Goodness-of-Fit for Path Models

A path model is specified by a recursive system of models. The fit of a system of logit models can be assessed by directly checking the fit of each component model, or by computing a set of estimated expected cell counts for the combined system.

Once the expected cell counts are estimated, the fit of the model can be assessed by either the Pearson chi-square statistic $X^2$ or the likelihood-ratio statistic $G^2$:

$$X^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}}, \qquad G^2 = 2 \sum \text{observed} \cdot \log \frac{\text{observed}}{\text{expected}},$$

where the summation in both cases is over all cells in the table. If the fitted model is correct and the total sample size is large enough, both $X^2$ and $G^2$ are approximately $\chi^2$ distributed with degrees of freedom given by

$$\text{d.f.} = \text{\# of cells} - \text{\# of parameters}. \tag{4.20}$$

In the context of causal modelling, Goodman uses the likelihood-ratio test statistic $G^2$ to evaluate the fit of a model.
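As a small worked example of the two statistics, the sketch below evaluates $X^2$ and $G^2$ for a hypothetical 2x2 table against the expected counts under independence. The counts and function names are illustrative assumptions; the tail probability uses the identity that a chi-square variable with 1 degree of freedom is the square of a standard normal, so it applies to 1 d.f. only.

```python
import math

import numpy as np


def pearson_x2(observed, expected):
    """Pearson chi-square statistic summed over all cells."""
    observed = np.asarray(observed, dtype=float)
    expected = np.asarray(expected, dtype=float)
    return float(np.sum((observed - expected) ** 2 / expected))


def likelihood_ratio_g2(observed, expected):
    """Likelihood-ratio statistic; cells with zero observed count contribute nothing."""
    observed = np.asarray(observed, dtype=float)
    expected = np.asarray(expected, dtype=float)
    positive = observed > 0
    ratio = observed[positive] / expected[positive]
    return float(2.0 * np.sum(observed[positive] * np.log(ratio)))


observed = [30, 20, 10, 40]  # hypothetical 2x2 counts
expected = [20, 30, 20, 30]  # expected counts under independence of the two variables

x2 = pearson_x2(observed, expected)
g2 = likelihood_ratio_g2(observed, expected)
df = 4 - 3                   # (4.20): 4 cells minus 3 parameters (u, u_1, u_2)

# Upper tail probability of a chi-square variable with 1 degree of freedom.
p_value_g2 = math.erfc(math.sqrt(g2 / 2.0))
print(x2, g2, df, p_value_g2)
```

For these counts both statistics are large relative to a chi-square distribution with 1 degree of freedom, so the independence model would be rejected; the two statistics are close but not identical, as is typical in moderate samples.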
Improvement in the fit of a model by adding or deleting some interaction terms can also be assessed by chi-square statistics. Consider two models, model I and model II, where model II is a special case of model I. That is, model II is obtained from model I by setting some of the u-terms to zero. Then the likelihood-ratio test statistic,

$$\Delta G^2 = G^2(\text{II}) - G^2(\text{I}) = 2 \sum \text{observed} \cdot \log \frac{\text{expected}_{\text{I}}}{\text{expected}_{\text{II}}} \tag{4.21}$$

with d.f. = d.f.(II) - d.f.(I), where the degrees of freedom of each model are as given by (4.20), can be used to test whether the difference between the expected cell counts for the two models is simply due to random variation, given that the true expected cell counts satisfy model I. For instance, in our example, the effect of adding the relationship between A (age) and C (infant death) to the model [AB][BC] can be evaluated by using the test statistic

$$\Delta G^2 = G^2(\text{[AB][BC]}) - G^2(\text{[AB][AC][BC]})$$

with 1 degree of freedom.

Goodness-of-fit of a path model can also be assessed by using the expected cell counts of the combined system of logit or loglinear models. The computation of these combined estimates is best illustrated by an example. Suppose we have three variables with the following causal ordering:

$$A \text{ precedes } B \text{ precedes } C, \tag{4.22}$$

as shown in Figure 8. Then the estimated expected cell counts for a system, consisting of the pair of unrestricted logit models implied by (4.22), are given by

$$\hat{m}_{ijk} = \frac{\hat{m}^{B|A}_{ij}\, \hat{m}^{C|AB}_{ijk}}{z_{ij+}} = \frac{\hat{m}^{B|A}_{ij}\, \hat{m}^{C|AB}_{ijk}}{\hat{m}^{C|AB}_{ij+}}, \tag{4.23}$$

where $z_{ij+}$ is the number of observations with $(A,B) = (i,j)$, and $\{\hat{m}^{B|A}_{ij}\}$ and $\{\hat{m}^{C|AB}_{ijk}\}$ are the estimated expected cell counts for the logit models with variables B and C as the response variables respectively. Since the latter model involves conditioning on the marginal totals $z_{ij+}$, which can be seen from the maximum likelihood equations, the second equality in (4.23) is obtained. Thus, the likelihood-ratio test statistic is given by

$$G^2 = 2 \sum_{i,j,k} z_{ijk} \log \frac{z_{ijk}}{\hat{m}_{ijk}} = 2 \sum_{i,j} z_{ij+} \log \frac{z_{ij+}}{\hat{m}^{B|A}_{ij}} + 2 \sum_{i,j,k} z_{ijk} \log \frac{z_{ijk}}{\hat{m}^{C|AB}_{ijk}} = G^2_{B|A} + G^2_{C|AB}, \tag{4.24}$$

where $G^2_{B|A}$ is the likelihood-ratio test statistic for the logit model specified on the 2x2 table obtained by collapsing over variable C, and $G^2_{C|AB}$ is the likelihood-ratio test statistic for the logit model specified on the complete 2x2x2 table. Thus, the overall likelihood-ratio test statistic has degrees of freedom given by the sum of the degrees of freedom corresponding to the two component $G^2$'s. A more detailed discussion on this approach can be found in Goodman (1973b), and Fienberg (1980).

5. Results of Statistical Analyses on the Sri Lankan Household Data

The Sri Lankan infant mortality data set was first analyzed by discriminant methods to identify risk factors and to characterize households with high risk of infant mortality. Methods for path analysis were then applied to the identified risk factors, in order to assess the relationships among them, and their relationship to infant death.

5.1 Identification of Infant Mortality Risk Groups

The main objective of this analysis is to identify risk factors that discriminate between households with relatively high and low infant mortality. By using the terminologies and notations introduced in Section 3, the problem can be formalized as follows. For each household sampled in the Sri Lankan household study, let $Y$ be a dichotomous variable indicating whether or not an infant death has occurred, and let $X$ be a vector of explanatory variables. Then, $Y$ specifies the class to which the household belongs. The explanatory variables are listed as X-variables in Table I, which includes information on nutrition, sanitation, education of the mother, economic status, childbirth environment, ethnicity of the family, etc. Then, the sample space $\mathcal{X}$ consists of all possible combinations of the x-values. Using decision theoretic criteria, estimates of infant death probability at each x-value partition the sample space $\mathcal{X}$ into two regions corresponding to relatively high and low risk groups. Two discriminant methods are advocated in Section 3: logistic discrimination and class probability estimation by CART. For each of these methods, the analysis was performed separately for those women of age less than 44 (N = 250) and those of age greater than or equal to 44 (N = 141).

5.1.1 Logistic Discrimination

A forward stepwise procedure implemented in the logistic regression program PLR of BMDP was used to select explanatory or predictive variables that may adequately model the logit of infant death probability, as described in Section 3. The results of this analysis are shown in Table V.

Consider the results for younger women (Table Vb). About 25% of these women with age less than 44 have experienced at least one infant death. Maximum likelihood estimates of the regression coefficients in the most parsimonious model indicate that the probability of infant death seems to be greater for those who gave birth at home, and for those whose families have lower economic status. By setting some threshold value $p_0$, the Sri Lankan village households can be partitioned into two risk groups, with the higher risk group composed of households with estimated infant death probability greater than the threshold value. Using the maximum likelihood estimation results, the sample space can be partitioned as follows: the region of high risk corresponds to families with

1.
last child born in hospital, and economic status < -4.732 ( logit  72  p  + 1.134  ), or  2.  last child born at home with a midwife, and economic status < -4.732 ( logit  3.  p  last child born at home without a midwife, and economic status < -4. 762 ( logi t p  Details  + 0.305 ), or  Q  + O. 352 ).  on formulation of the above partition are shown i n Appendix I.  Although this partition of the sample space can be interpreted easily, this may not always be the case where more variables are in the final model.  Next, consider likelihood  estimates  the results of  parsimonious model indicate  for older  the regression that  women (Table Vb). coefficients  probability  of infant  Maximum  i n the most death  for  the  non-Sinhalese families may be twice as high as that for the Sinhalese families.  Thus, for the older women, the relatively high and low risk  groups may be defined by ethnic group membership.  73  Table V  a.  Results of forward stepwise logistic regression  Model selection  Study group  Model  Women of age<44  constant  -2 log X  constant, X  8.509  1  0.004  7.003  2  0.030  11.665  1  0.001  5  constant, X , X s'  Women of age44  +  p-value  d.f.  z  constant constnat, X  10  maximum likelihood under previous model where X. =  , maximum likelihood under current model  X  i s the environment of child birth,  X_  i s the economic status, and  z  5  X^  i s the ethnicity.  o  Note that X  is treated as continuous variable, while X and  5  X  2  are treated as categorical variables represented by dummy  variables as defined on the following page.  74  Maximum likelihood estimates of the coefficients  in the final model  Maximum likelihood estimate Study group  Variable  coefficient  s.e.  
Women of age<44  constant  -0.597  0.209  X  -0.210  0.098  X  0.292  0.224  X  0.245  0.238  constant  -0.683  0.187  0.622  0.187  5 2<2> 2<9>  Women of age44  +  X  10<2>  where X_ i s the economic status, 5  i f the last child was born at home with a midwife, 2<2>  -1 O  i f the last child was born in hospital, otherwise, i f the last child was born at home without a midwife,  X  2(3)  X  10<2>  -1  o  i f the last child was born in hospital, otherwise, and i f the household i s non-Sinhalese, i f the household i s Sinhalese.  75  5.1.2  D i s c r i m i n a t i o n u s i n g CART  The probability of infant death at each point i n the sample space was estimated using the CART software described in Section 3, using the 10-fold cross-validation procedure.  As i n the previous section, younger and older  women were analyzed separately.  For the younger women, the pruned subtree  with the minimum cross-validated estimate of risk i s shown i n Figure 9. If the same criterion is used for the older women, then a t r i v i a l tree with one terminal node would be selected. Thus, the next largerbe obtained  tree which can  by growing a tree with an appropriate complexity  parameter  using the entire sample, is considered (see Figure 10).  For younger women, the binary tree (Figure 9) has three  terminal  groups corresponding to low risk, and one terminal group corresponding to high. risk.  Women who gave birth in the hospitals, or whose families have  high economic status appear to have a relatively low risk of experiencing at least one infant death.  For those women who gave birth at home, and  whose families have low economic status, families whose major source of income i s from piece-rate work or hourly labor seem to be at a much lower risk than those families whose income i s from other sources. households in poverty,  For those  piece-rate work or hourly labor may provide a  steadier source of income.  
Thus, women who give birth at home, live in poverty, and whose families have no steady income, are at the highest risk of experiencing at least one infant death.

For older women, the binary tree (Figure 10) suggests that Sinhalese families may have been at a lower risk than the non-Sinhalese families. The estimated probability of infant death indicates that the risk of infant death may be twice as high in non-Sinhalese families as in Sinhalese families.

Figure 9  CART results for the younger women

All younger women: 63 class 1, 187 class 2 (25%)
  Where was the last child born?
    in hospital: 24 class 1, 115 class 2 (17%)
    at home: 39 class 1, 72 class 2 (35%)
      Economic status
        3-5: 5 class 1, 24 class 2 (17%)
        0-2: 34 class 1, 48 class 2 (41%)
          Primary source of income
            piece rate: 10 class 1, 31 class 2 (24%)
            others: 24 class 1, 17 class 2 (59%)

class 1: households with infant death experiences,
class 2: households with no infant death experience.
Proportions of class 1 households are reported in the brackets.

Figure 10  CART results for the older women

All older women: 48 class 1, 93 class 2 (34%)
    Sinhalese: 16 class 1, 59 class 2 (21%)
    others: 32 class 1, 34 class 2 (48%)

class 1: households with infant death experiences,
class 2: households with no infant death experience.
Proportions of class 1 households are reported in the brackets.

5.1.3  Discussion

Explanatory variables considered important by the logistic discrimination method were also considered important by the CART method. However, the partition of the sample space into regions of relatively high and low risk may be different for the two methods. Logistic discrimination forces a linear partition, whereas the CART partition is piecewise linear.

For younger women, economic status of the family is considered an important risk factor by both methods. But in the CART result, the partition uses this variable only for those women giving birth at home.
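The logistic partition of Section 5.1.1 can be evaluated at any chosen threshold, which makes the comparison with the CART partition concrete. The following Python sketch is an illustration only and is not part of the original BMDP analysis. It takes the fitted equations for the younger women (intercepts -1.134, -0.305 and -0.352, common slope -0.210), solves logit p > logit p0 for the economic status score, and lists the scores that fall in the high risk region. The function names are hypothetical, and the code assumes that economic status takes the integer values 0 to 5, as suggested by the grouping into categories 0-2 and 3-5.

```python
import math

# Fitted equations for the younger women: logit p = intercept + SLOPE * X5,
# with one intercept per childbirth environment and a common slope.
SLOPE = -0.210
INTERCEPTS = {
    "in hospital": -1.134,
    "at home with midwife": -0.305,
    "at home without midwife": -0.352,
}


def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))


def economic_status_cutoff(intercept, p0, slope=SLOPE):
    """Return c such that the estimated p exceeds p0 exactly when X5 < c.

    Valid for a negative slope: dividing the inequality
    intercept + slope * X5 > logit(p0) by slope reverses its direction.
    """
    return (logit(p0) - intercept) / slope


def high_risk_scores(intercept, p0, scores=range(6)):
    """Economic status scores (assumed integers 0..5) in the high risk region."""
    cutoff = economic_status_cutoff(intercept, p0)
    return [x for x in scores if x < cutoff]


if __name__ == "__main__":
    p0 = 0.17  # threshold taken from the low risk terminal nodes of the CART tree
    for setting, intercept in INTERCEPTS.items():
        cutoff = economic_status_cutoff(intercept, p0)
        print(f"{setting}: cutoff = {cutoff:.2f}, "
              f"high risk scores = {high_risk_scores(intercept, p0)}")
```

At a threshold of 0.17 the hospital cutoff falls between 2 and 3, whereas both home-birth cutoffs exceed 5, so every home birth is classified as high risk regardless of economic status.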
Suppose the threshold value, $p_0$, in Section 5.1.1 equals 0.17 as in the CART result. Then the logistic discrimination method partitions the sample space into high and low risk regions as follows: the region of high risk corresponds to families with

1.  last child born in hospital, and economic status < 3, or
2.  last child born at home with a midwife, or
3.  last child born at home without a midwife.

Thus, women who gave birth at home are in the high risk group, and so are women who gave birth in the hospital but whose family is poor. But this contradicts the CART result (Figure 9), where all women giving birth in hospital are in the low risk group. Consider the $3 \times 2$ contingency table formed by cross-tabulating the environment of childbirth and the economic status dichotomy created by grouping the categories 0-2 and 3-5, as shown in Table VI. The table shows that the partition provided by the CART method seems more coherent than the partition provided by logistic discrimination.

The logistic discrimination method assumes that the relationship between the logit of infant death probability (logit p) and economic status ($X_5$), for each environment of childbirth ($X_2$), can be modelled by parallel straight lines (Table VII). This assumption seems reasonable for the latter two childbirth conditions, but not for all three conditions. By imposing this parallelism on the results, the more appropriate partitioning of the sample space is overlooked. However, if interactions between the two explanatory variables were allowed, logistic discrimination might have obtained the appropriate partitioning. In general, logistic discrimination may require fitting many different models with various interaction terms before a partitioning comparable to that found by the CART method is discovered.

Discrepancies between results for the two age groups may be explained by several factors.
Health services may be more readily available at time of child bearing for the younger women. The younger generation may also be less inhibited by health technologies, and thus utilize the services more frequently. Ethnicity may have been more relevant to everything (including infant mortality) when the older women were child bearing. Ethnicity may still be pertinent to economic status and usage of health services in the younger generation, but the effect of ethnicity on infant mortality may have lessened. Lastly, economic status at time of study may be strongly related to economic status at time of child bearing for the younger women, but perhaps not for the older women.

Table VI  Comparison of sample space partitioning by logistic discrimination and by CART

The following table is constructed based on women of age less than 44.

                                          Economic status - ownership of household items ($X_5$)
Where was the last child born? ($X_2$)          0-2            3-5
In hospital
At home with midwife
At home without midwife

[Cell entries are not recoverable from the source; the legible values are 0.13, 0.41 and 0.13.]

The high risk group identified by logistic discrimination is the group of households in the highlighted region given by the union of the first column and the last two rows.
The high risk group identified by CART is the group of households in the highlighted region given by the intersection of the first column and the last two rows.

Table VII  Estimated logistic regression equations for younger women

Where was the last child born? ($X_2$)      Estimated logistic regression equation
In hospital                                 logit p = -1.134 - 0.210 $X_5$
At home with midwife                        logit p = -0.305 - 0.210 $X_5$
At home without midwife                     logit p = -0.352 - 0.210 $X_5$

5.2  Causal Modelling

Discriminant analysis performed earlier indicates that economic status, environment of childbirth, and ethnic group membership may be associated with infant mortality. To understand how these variables work together to affect infant mortality, a path model is constructed based on the natural temporal ordering of the variables (Figure 11).

Figure 11  A path model specifying temporal relationships among selected variables

5.2.1  Structural Modelling with Quantitative Data

The following analysis is performed using the REG procedure in the SAS statistical software, treating all four variables as continuous. Results of path analysis for the two age groups are shown in Figures 12 and 13 respectively. The estimated direct and indirect effects of explanatory variables on infant mortality are summarized in Table VIII for the two age groups.

Comparing the path models shown in Figures 12 and 13 suggests that the relationships among the variables may differ for the two age groups. The effect of ethnic group membership on childbirth environment seems stronger for the younger women. Economic status and childbirth environment appear to affect infant mortality for the younger women, whereas only ethnicity appears to have a substantial effect on infant mortality for the older women.

Consider the estimated direct and indirect effects of explanatory variables on infant mortality for the younger women (Table VIII). Although ethnicity has virtually no direct effect on infant mortality, it does seem to influence the other two variables, economic status and environment of childbirth, to affect infant mortality. Thus minority group status may adversely affect the economic status, and may obstruct access to better childbirth environment, which in turn increases the risk of infant death.

Estimated direct and indirect effects of explanatory variables on infant mortality for the older women (Table VIII) indicate that neither economic status nor childbirth environment has strong direct or indirect effects on infant mortality.
Therefore, minority group status seems to be the only factor, among the three considered, to increase the risk of infant death.

For both path models (Figures 12 and 13), the path coefficients corresponding to the unobserved sources of variation are high. Thus, the linear models considered by path analysis do not seem to fit the data well. Since the occurrence of infant death is a relatively rare event, and the variables investigated are not immediate biological causes of infant death, a linear model is not likely to fit the data well. However, this type of model still provides some useful information on the relationships among the variables.

Figure 12  Path analysis results for the younger women

[Path diagram linking Ethnicity, Economic status, Environment of childbirth ($X_2$) and Infant death; the path coefficients are not recoverable from the source apart from the values 0.93 and 0.90.]

where * signifies a statistically nonzero path coefficient at the 10% level (excluding residual path coefficients).

Figure 13  Path analysis results for the older women

[Path diagram linking Ethnicity, Economic status, Environment of childbirth ($X_2$) and Infant death (y); the path coefficients are not recoverable from the source.]

where * signifies a statistically nonzero path coefficient at the 10% level (excluding residual path coefficients).

Table VIII  Estimated direct and indirect effects on infant death

                                                             Effect on infant mortality
Study group   Variable (source)                              Direct      Indirect
Age <44       Ethnicity                                       0.00        -0.12
              Economic status                                 0.13         0.05
              Use of health services for childbirth          -0.16
Age 44+       Ethnicity                                      -0.25        -0.04
              Economic status                                 0.10        -0.01
              Use of health services for childbirth           0.02

5.2.2  Structural Modelling with Qualitative Data

The preceding section applied statistical analysis that was originally derived for continuous variables; but most of the variables in this study are ordered categorical.
In this section, the relationships between the variables are analyzed using the method for categorical variables proposed by Goodman, which was described in Section 4.2. Due to limitations of the method as discussed in Section 4.2.2, the variables considered are recoded into two categories (Table IX). Let A - D be the recoded variables for ethnicity, economic status, environment of childbirth, and infant death respectively. Then the following causal ordering of the variables is assumed: A precedes B precedes C precedes D.

Programs written in a language implemented in the statistical software package called S were used for the analysis. Path diagrams depicting the causal connections implied by the best logit or loglinear models for women of the two age groups are shown in Figures 14 and 15. Details on the model selection are given in Appendices II and III respectively for the two groups of women.

The path diagram for the younger women (Figure 14) indicates that: (1) minority group status may adversely affect economic status, and may obstruct access to better childbirth environment; (2) poverty may have blocked access to better childbirth environment; (3) lastly, poverty and childbirth environment may be linked to infant mortality. Although minority group status does not seem to have a direct effect on infant mortality, it does seem to have an indirect effect through economic status and environment of childbirth.

The path diagram for the older women (Figure 15) indicates that: (1) minority group status may have negative effects on both economic status and infant mortality; (2) poverty may have blocked access to better childbirth environment; but (3) neither economic status nor childbirth environment has any significant effect on infant mortality. Therefore, for older women, no variables in addition to ethnicity (among those considered) can significantly improve discrimination between high and low risk groups.

Table IX  Variables used in modified path analysis

Variable   Variable in original data set                         Codes
A          $X_{10}$  Ethnicity                                   1 = Sinhalese, 2 = non-Sinhalese
B          $X_5$     Economic status                             1 = 0-1, 2 = 2+
C          $X_2$     Use of health services for childbirth       1 = in hospital, 2 = at home
D          y         Infant death                                1 = at least one, 2 = none

Figure 14  Path diagram showing causal links implied by selected logit models for the younger women

Figure 15  Path diagram showing causal links implied by selected logit models for the older women

[Path diagram linking Ethnicity (A), Economic status (B), Environment of childbirth (C) and Infant death (D). Legible coefficients: -0.73 (Ethnicity to Economic status), -0.59 (Ethnicity to Infant death) and -0.41 (Economic status to Environment of childbirth). A separate line style, lost in the source, marks non-significant relationships.]

5.2.3  Discussion

Causal interpretations of the path diagrams constructed by the quantitative and qualitative approaches are similar. For the younger women, both path diagrams (Figures 12 and 14) show that minority group status seems to result in poverty, and seems to obstruct access to better childbirth environment, which in turn leads to infant deaths. For the older women, both path diagrams (Figures 13 and 15) indicate that minority group status per se appears to be the only factor that has any effect on infant mortality. Discrepancies between results for the two age groups may be explained as in Section 5.1.3.

None of the linear regression models in Figures 12 and 13 fit the data particularly well, as shown by the path coefficients corresponding to the unobserved sources of variation. On the other hand, the loglinear or logit models considered in Figures 14 and 15 provide a reasonable fit to the data sets. However, the method for qualitative data does not provide quantitative assessments of indirect effects as provided by the method for quantitative data.

6.  Remarks and Recommendations on Statistical Methods Used to Identify Risk Groups

An objective of the Sri Lankan household survey was to identify a small number of risk factors that distinguish groups of women having relatively high or low probability of experiencing at least one infant death. This study examined socioeconomic factors (not medical causes) that are relevant to resource allocation priorities, and to cultural obstacles in the planning of health services and health promotion programs. Structural or temporal relationships among the risk factors are also of interest to the researchers.

Statistical discrimination methods were used to select significant risk factors, and to identify the high risk group (or groups) in the Sri Lankan households. Although both logistic discrimination and CART are computing-intensive, the logistic discrimination method requires less computing resources, and has more readily available software packages. Otherwise, the CART technique is preferable, since it provides more informative and more easily interpretable results. Furthermore, the CART technique does not require any distributional assumptions.

After a small set of risk factors had been identified by discriminant analysis, the structural or temporal relationships among selected risk factors and infant mortality were investigated using path analysis.

The classical method of path analysis using linear regression models has often been applied to social science data that are ordinal or categorical in nature, where a modified method using logistic quantal response models would be more appropriate. When the classical method is applied inappropriately, the resulting path model usually does not fit the data well, as indicated by high residual path coefficients. Although the modified method does provide a better fit, it is highly computing-intensive, and is restrictive in the number of variables allowed in the proposed path model.

In practice, social scientists would use path models with more variables than the models considered here. Variables that were not selected by the discrimination methods might still be of interest to the researchers, when considering infant mortality in a larger socioeconomic and political context.

The approach used in this thesis, and recommended for similar studies to identify risk groups, applies discriminant analysis (preferably CART) as an exploratory tool, and then uses path analysis (preferably logistic quantal response modelling) to confirm the significance of relationships among variables.

In our Sri Lankan household study, discriminant analysis identified economic status and environment of childbirth as significant risk factors for the younger women. In contrast, ethnic group membership is the only risk factor identified for the older women. Younger women who gave birth at home, and whose families have low economic status, appear to be at a high risk of experiencing at least one infant death, whereas younger women who gave birth in the hospital, or whose families have high economic status, seem to be at a substantially lower risk. For the older women, non-Sinhalese families appear to have a higher risk of experiencing at least one infant death than the Sinhalese families.

Results of path analysis on infant mortality using the three identified risk factors suggest that the changing role of ethnicity may have partially explained the discrepancies between previous results for the two age groups. While ethnic group membership may be relevant to many things, including infant mortality, for the older generation, its influence on infant mortality seems to have lessened for the younger generation.

The discrepancies between results for the two age groups may also be explained by other factors. Health services may not have been as readily available at time of child bearing for the older women as for the younger women. The use of better childbirth environment by the younger women may also be explained by the changing attitude toward the seriousness of childbirth by the families. Finally, the economic status at the time of study may be strongly related to the economic status at time of child bearing for the younger women, but may not be so for the older women.

In order to plan an effective health program to promote infant survival, one must understand the socioeconomic conditions in which infant death is likely to occur, as well as the biomedical causes of infant death. Our analysis suggests most of the high risk households will be too poor to take advantage of the government's subsidy program for the construction of sanitary latrines. Although Sri Lanka has a well-organized network of essentially free health services that extend into rural areas, access to and usage of better childbirth environment can still be improved. Health planning entails more than designing a program that treats or prevents a health disorder; it must also ensure health care delivery to those in need.

BIBLIOGRAPHY

Anderson, J.A. (1972). Separate sample logistic discrimination. Biometrika, 59:19-35.
Anderson, T.W. (1984). An Introduction to Multivariate Statistical Analysis, 2nd ed. New York: John Wiley & Sons.
Bishop, Y.M.M., Fienberg, S.E., and Holland, P.W. (1975). Discrete Multivariate Analysis: Theory and Practice. Cambridge: The MIT Press.
Blalock, H.M., ed. (1970). Causal Models in the Social Sciences. Chicago: Aldine.
Breiman, L., Friedman, J.H., Olshen, R.A., and Stone, C.J. (1984). Classification and Regression Trees. Belmont: Wadsworth & Brooks.
Breslow, N.E. and Day, N.E. (1980). The Analysis of Case-Control Studies. Statistical Methods in Cancer Research, Vol. 1. Lyon: International Agency for Research on Cancer.
Caldwell, J. and McDonald, P. (1982). Influence of maternal education on infant and child mortality: levels and causes. Health Policies and Education, 2:251-267.
Chowdhury, A. (1982). Education and infant survival in rural Bangladesh. Health Policies and Education, 2:369-374.
Cox, D.R. (1970). The Analysis of Binary Data. London: Methuen.
Dillon, W.R., and Goldstein, M. (1984). Multivariate Analysis. New York: John Wiley & Sons.
Duncan, O.D. (1966). Path analysis: sociological examples. The American Journal of Sociology, 72:1-16.
Efron, B. (1975). The efficiency of logistic regression compared to normal discriminant analysis. Journal of the American Statistical Association, 70:891-898.
Fienberg, S.E. (1980). The Analysis of Cross-Classified Categorical Data. Cambridge: The MIT Press.
Fisher, R.A. (1936). The use of multiple measurements in taxonomic problems. Annals of Eugenics, 7:179-188.
Goodman, L.A. (1972). A general model for the analysis of surveys. The American Journal of Sociology, 77:1035-1086.
Goodman, L.A. (1973a). Causal analysis of data from panel studies and other kinds of surveys. The American Journal of Sociology, 78:1135-1191.
Goodman, L.A. (1973b). The analysis of multidimensional contingency tables when some variables are posterior to others: a modified path analysis approach. Biometrika, 60:179-192.
Grosse, R. and Perry, B. (1982). Correlates of life expectancy in less developed countries. Health Policies and Education, 2:275-304.
Haberman, S.J. (1978). Analysis of Qualitative Data. Volume 1: Introductory Topics. New York: Academic Press.
Hand, D.J. (1981). Discrimination and Classification. New York: John Wiley & Sons.
Heise, D.R., ed. (1975). Sociological Methodology 1976. San Francisco: Jossey-Bass.
Kendall, M.G. and O'Muircheartaigh, C.A. (1977). Path analysis and model building. World Fertility Survey, Technical Bulletin No. 414.
Kiiveri, H., Speed, T.P. and Carlin, J.B. (1984). Recursive causal models. Journal of the Australian Mathematical Society A, 36:30-52.
Lachenbruch, P.A. (1975). Discriminant Analysis. New York: Hafner Press.
Land, K.C. (1969). Principles of path analysis. Sociological Methodology 1969, ed. E. Borgatta. San Francisco: Jossey-Bass.
Land, K.C. (1973). Identification, parameter estimation and hypothesis testing in recursive sociological models. Structural Equation Models in the Social Sciences, eds. A.S. Goldberger and O.D. Duncan. New York: Seminar Press.
Leik, R. (1975). Causal models with nominal and ordinal data: retrospective. Sociological Methodology 1976, ed. D.R. Heise. San Francisco: Jossey-Bass.
McKeown, T. (1976). The Modern Rise of Population. New York: Academic Press.
Mishler, E.G., Amarasingham, L.R., Hauser, S.T., Liem, R., Osherson, S.D., and Waxler, N.E. (1981). Social Contexts of Health, Illness, and Patient Care. Cambridge: Cambridge University Press.
Morgan, J.N. and Sonquist, J.A. (1963). Problems in the analysis of survey data, and a proposal. Journal of the American Statistical Association, 58:415-435.
Morgan, J.N. and Messenger, R.C. (1973). THAID: a sequential search program for the analysis of nominal scale dependent variables. Ann Arbor: Institute for Social Research, University of Michigan.
Morris, M.D. (1979). Measuring the Condition of the World's Poor. New York: Pergamon Press.
Morrison, B. and Waxler, N.E. (1984). Three patterns of basic needs within Sri Lanka: 1971-1973. Unpublished paper.
Mosley, W.H. (1984). Child survival: research and policy. Child Survival. Population and development review, a supplement to volume 10. New York: The Population Council, Inc.
Mosley, W.H., and Chen, L. (1984). An analytical framework for the study of child survival in developing countries. Child Survival. Population and development review, a supplement to volume 10. New York: The Population Council, Inc.
Mueller, J.H., Schuessler, K.F., and Costner, H.L. (1977). Statistical Reasoning in Sociology. Boston: Houghton Mifflin.
Panel on Discriminant Analysis, Classification, and Clustering (1988). Discriminant Analysis and Clustering. Washington, D.C.: National Academy Press.
Patel, M. (1980). Effects of the health service and environmental factors on infant mortality: the case of Sri Lanka. Journal of Epidemiology and Community Health, 34:76-82.
Press, S.J. and Wilson, S. (1978). Choosing between logistic regression and discriminant analysis. Journal of the American Statistical Association, 73:699-705.
Puffer, R.R. and Serrano, C.V. (1973). Patterns of Mortality in Childhood. Scientific Publication No. 262. Washington, D.C.: Pan American Health Organization.
Rosenthal, H. (1980). The limitation of log-linear analysis. Contemporary Sociology, 9:207-212.
Sackett, D.L. and Holland, W.W. (1975). Controversy in the detection of disease. The Lancet, 2:357-359.
SAS Institute, Inc. (1985). SAS User's Guide: Statistics, Version 5 Edition. Cary, NC: SAS Institute Inc.
Schlesselman, J.J. (1982). Case-Control Studies: Design, Conduct, Analysis. New York: Oxford University Press.
Simmons, G. and Bernstein, S. (1982). The educational status of parents, and infant and child mortality in rural North India. Health Policies and Education, 2:349-367.
Smucker, C., Simmons, G., Bernstein, S., and Misra, B. (1980). Neo-natal mortality in South Asia: the special role of tetanus. Population Studies, 34:321-335.
Wald, A. (1944). On a statistical problem arising in the classification of an individual into one of two groups. Annals of Mathematical Statistics, 15:145-163.
Waxler, N.E., Morrison, B.M., Sirisena, W.M., and Pinnaduwage, S. (1985). Infant mortality in Sri Lankan households: a causal model. Social Science and Medicine, 20:381-392.
Wermuth, N. (1980). Linear recursive equations, covariance selection, and path analysis. Journal of the American Statistical Association, 75:963-997.
Wermuth, N. (1987). Parametric collapsibility and the lack of moderating effects in contingency tables with a dichotomous response variable. Journal of the Royal Statistical Society, 49:353-364.
Wermuth, N., and Lauritzen, S.L. (1983). Graphical and recursive models for contingency tables. Biometrika, 70:537-552.
Winship, C. and Mare, R.D. (1983). Structural equations and path analysis for discrete data. The American Journal of Sociology, 89:54-110.
World Bank (1975). Health Sector Policy Paper. Washington, D.C.: World Bank.
Wright, S. (1934). The method of path coefficients. Annals of Mathematical Statistics, 5:161-215.
Wright, S. (1960). Path coefficients and path regression: alternative or complementary concepts? Biometrics, 14:189-202.
Van de Geer, J.P. (1971). Introduction to Multivariate Analysis for the Social Sciences. San Francisco: W.H. Freeman and Company.

Appendix I  Partitioning the Sample Space Using Logistic Discrimination (Younger Women)

Let $p_0$ be some threshold value chosen, so that the high risk group is composed of households with estimated probability of experiencing at least one infant death greater than $p_0$. Then using maximum likelihood estimates of the regression coefficients (Table Vb), the high risk households have explanatory variables satisfying the following inequality:

$$-0.597 - 0.210\, X_5 + 0.292\, X_{2(2)} + 0.245\, X_{2(3)} > \operatorname{logit} p_0, \qquad (A.1)$$

where $X_5$ denotes the economic status, and $X_{2(2)}$ and $X_{2(3)}$ are dummy variables representing the categorical variable $X_2$ as defined below,

$$X_{2(2)} = \begin{cases} 1 & \text{if the last child was born at home with a midwife,} \\ -1 & \text{if the last child was born in hospital,} \\ 0 & \text{otherwise,} \end{cases}$$

$$X_{2(3)} = \begin{cases} 1 & \text{if the last child was born at home without a midwife,} \\ -1 & \text{if the last child was born in hospital,} \\ 0 & \text{otherwise.} \end{cases}$$

Alternatively, the partition region can be described by examining each childbirth environment in (A.1): the region of high risk corresponds to families with

1.  last child born in hospital, and economic status < -4.762 ( logit $p_0$ + 1.134 ), or
2.  last child born at home with a midwife, and economic status < -4.762 ( logit $p_0$ + 0.305 ), or
3.  last child born at home without a midwife, and economic status < -4.762 ( logit $p_0$ + 0.352 ).

Appendix II  Modified Path Analysis - Model Selection (Younger Women)

Using the method proposed by Goodman as described in Section 4.2, the relationship between variables A and B is investigated through the logit model

$$\operatorname{logit}^{B|A}_{i} = w^{B|A} + w^{B|A}_{1(i)}, \qquad (A.2)$$

with estimated effect parameter $\hat{w}^{B|A}_{1(1)} = -0.63$. By examining results of fitting the three unsaturated loglinear models corresponding to the logit model with C as the response variable, and A and B as the explanatory variables (models M1 - M3 in Table X), we see that models [AB][AC][BC] (M1) and [AB][AC] (M2) provide reasonable fits for the data. That is, their goodness-of-fit statistics (either $X^2$ or $G^2$) are not statistically significant.
However, $G^2(\mathrm{M2}) - G^2(\mathrm{M1}) = 3.087$ with 1 degree of freedom is significant at the 10% level, suggesting that the relation between variables B and C may be important. Thus, the larger model, M1, is preferred. The corresponding logit model is

$$\operatorname{logit}^{C|AB}_{ij} = w^{C|AB} + w^{C|AB}_{1(i)} + w^{C|AB}_{2(j)}, \tag{A.3}$$

with estimated effect parameters $\hat{w}^{C|AB}_{1(1)} = 0.82$ and $\hat{w}^{C|AB}_{2(1)} = -0.24$.

Now examine the effects of A on D, B on D, and C on D, as suggested by the assumed causal ordering. The results of fitting the seven unsaturated loglinear models corresponding to the logit model with D as the response variable, and A, B and C as the explanatory variables (M4-M10 in Table X), show that all except model [ABC][AD] (M8) fit the data well. Since model M7 is a special case of model M4, and $G^2(\mathrm{M7}) - G^2(\mathrm{M4}) = 0.216$ with 1 degree of freedom is not statistically significant at the 5% level, the smaller model, M7, is preferred. For models M9 and M10, two special cases of model M7,

$$G^2(\mathrm{M9}) - G^2(\mathrm{M7}) = 6.729 \quad \text{and} \quad G^2(\mathrm{M10}) - G^2(\mathrm{M7}) = 4.886,$$

each with 1 degree of freedom; both are statistically significant at the 5% level. Thus, further reduction from model M7 is not desirable. The logit model corresponding to M7 is

$$\operatorname{logit}^{D|ABC}_{ijk} = w^{D|ABC} + w^{D|ABC}_{2(j)} + w^{D|ABC}_{3(k)}, \tag{A.4}$$

with estimated effect parameters $\hat{w}^{D|ABC}_{2(1)} = 0.33$ and $\hat{w}^{D|ABC}_{3(1)} = -0.38$. The results are summarized by the path diagram in Figure 14.

Table X

Goodness-of-fit statistics for loglinear models (younger women)

Model                         d.f.     X^2       G^2
M1    [AB][AC][BC]             1       0.215     0.215
M2    [AB][AC]                 2       3.325     3.302
M3    [AB][BC]                 2      32.029    32.887
M4    [ABC][AD][BD][CD]        4       2.830     2.896
M5    [ABC][AD][BD]            5       8.766     8.063
M6    [ABC][AD][CD]            5       6.629     7.051
M7    [ABC][BD][CD]            5       3.012     3.112
M8    [ABC][AD]                6      13.740    13.252
M9    [ABC][BD]                6       9.781     9.841
M10   [ABC][CD]                6       7.716     7.998

where X^2 is the Pearson chi-square statistic, and G^2 is the likelihood ratio statistic.

Appendix III

Modified Path Analysis - Model Selection (Older Women)

Using the method proposed by Goodman as described in Section 4.2, the relationship between variables A and B is investigated through the logit model

$$\operatorname{logit}^{B|A}_{i} = w^{B|A} + w^{B|A}_{1(i)}, \tag{A.5}$$

with estimated effect parameter $\hat{w}^{B|A}_{1(1)} = -0.73$. By examining the results of fitting the three unsaturated loglinear models corresponding to the logit model with C as the response variable, and A and B as the explanatory variables (models M1-M3 in Table XI), we see that models [AB][AC][BC] (M1) and [AB][BC] (M3) provide reasonable fits to the data. That is, their goodness-of-fit statistics (either $X^2$ or $G^2$) are not statistically significant. Since M3 is a special case of M1, and $G^2(\mathrm{M3}) - G^2(\mathrm{M1}) = 0.609$ with 1 degree of freedom is not statistically significant at the 5% level, the smaller model, M3, is preferred. The corresponding logit model is

$$\operatorname{logit}^{C|AB}_{ij} = w^{C|AB} + w^{C|AB}_{2(j)}, \tag{A.6}$$

with estimated effect parameter $\hat{w}^{C|AB}_{2(1)} = -0.41$. By examining the results of fitting the seven unsaturated loglinear models corresponding to the logit model with D as the response variable, and A, B and C as the explanatory variables (M4-M10 in Table XI), we see that [ABC][AD] (M8) is the smallest model that fits the data well.
Since adding more interaction terms to model M8 does not significantly improve the fit, the most parsimonious model is M8. Thus, the corresponding logit model is given by

$$\operatorname{logit}^{D|ABC}_{ijk} = w^{D|ABC} + w^{D|ABC}_{1(i)}, \tag{A.7}$$

with estimated effect parameter $\hat{w}^{D|ABC}_{1(1)} = -0.59$. The results are summarized by the path diagram in Figure 15.

Table XI

Goodness-of-fit statistics for loglinear models (older women)

Model                         d.f.     X^2       G^2
M1    [AB][AC][BC]             1       0.106     0.105
M2    [AB][AC]                 2       4.023     4.058
M3    [AB][BC]                 2       0.724     0.715
M4    [ABC][AD][BD][CD]        4       1.776     1.682
M5    [ABC][AD][BD]            5       2.505     2.514
M6    [ABC][AD][CD]            5       1.771     1.714
M7    [ABC][BD][CD]            5      12.187    12.130
M8    [ABC][AD]                6       2.503     2.515
M9    [ABC][BD]                6      13.335    13.304
M10   [ABC][CD]                6      12.805    12.917

where X^2 is the Pearson chi-square statistic, and G^2 is the likelihood ratio statistic.
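The nested-model comparisons used for model selection in the appendices can be rechecked from the tabulated likelihood ratio statistics. The Python sketch below is illustrative only and is not part of the original analysis. The names `chi2_sf`, `lr_test` and `TABLE_X` are hypothetical, the degrees of freedom and G^2 values are copied from Table X (younger women), and the chi-square tail probability is computed with a standard closed-form recurrence for integer degrees of freedom, which is an assumption about method rather than a description of the software used in the study.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variate with integer df >= 1."""
    z = x / 2.0
    # Start from Q(1/2, z) = erfc(sqrt(z)) for odd df, or Q(1, z) = exp(-z) for even df,
    # then apply the recurrence Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1).
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


# (degrees of freedom, G^2) copied from Table X; only the models compared in the text.
TABLE_X = {
    "M1": (1, 0.215),
    "M2": (2, 3.302),
    "M4": (4, 2.896),
    "M7": (5, 3.112),
    "M9": (6, 9.841),
    "M10": (6, 7.998),
}


def lr_test(table, smaller, larger):
    """Compare a smaller loglinear model with a larger model that contains it.

    Returns the G^2 difference, its degrees of freedom, and the chi-square
    tail probability of that difference.
    """
    df_small, g2_small = table[smaller]
    df_large, g2_large = table[larger]
    diff = g2_small - g2_large
    df = df_small - df_large
    return diff, df, chi2_sf(diff, df)


if __name__ == "__main__":
    for smaller, larger in [("M2", "M1"), ("M7", "M4"), ("M9", "M7"), ("M10", "M7")]:
        diff, df, p = lr_test(TABLE_X, smaller, larger)
        print(f"{smaller} vs {larger}: G2 difference = {diff:.3f}, d.f. = {df}, p = {p:.3f}")
```

For M2 against M1 the difference is 3.087 on 1 degree of freedom, with a tail probability between 0.05 and 0.10, in line with the 10% significance reported in Appendix II. The same function can be applied to the G^2 column of Table XI.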
