A Model for Multivariate Binary Data with Covariates Based on Compatible Conditionally Specified Logistic Regressions by Ying Liu B.Sc, Hangzhou Normal University, 1985 M.Sc, Shanghai University of Finance and Economics, 1991 A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in THE FACULTY OF GRADUATE STUDIES Department of Statistics We accept this thesis as conforming to the required standard THE UNIVERSITY OF BRITISH COLUMBIA August 1994 ©Ying Liu, 1994 In presenting this thesis in partial fulfillment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission. Department of J> la>t< 5 ^< e s The University of British Columbia Vancouver, Canada Date Auj. 30 . /9?^: Abstract Rather than construction of a multivariate distribution from given univariate or bivari-ate margins, recently several papers seek to promote the development and usage of a simple but relatively unknown approach to the specification of models for dependent binary outcomes through conditional probabilities, each of which is assumed to be lo-gistic. These recent proposals were all offered as heuristic approaches to specifying a multivariate distribution capable of representing the dependence of binary outcomes. However, they are limited in scope, for they all describe some special patterns of depen-dence. This thesis is concerned with a model for a multivariate binary response with covariates based on compatible conditionally specified logistic regressions. With this model, we allow for a general dependence structure for the binary outcomes. Three likelihood-based computing methods are introduced to estimate the parameters in our model. An example on the coronary bypass surgery is presented for illustration. ii Contents Abstract ii Table of Contents iii List of Tables v Acknowledgement vii 1 Introduction 1 2 Compatibility Conditions for Conditionally Specified Multivariate Dis-tributions 5 2.1 Compatibility in 2 Dimensions 6 2.2 Compatibility in 3 Dimensions 7 3 Models for Multivariate Binary Responses with Covariates 9 3.1 Conditions for Compatibility in 2 or 3 Dimensions 10 3.1.1 2 Dimensional Case 10 3.1.2 3 Dimensional Case 14 3.2 General Case 20 3.3 Further Extension 23 4 Comparisons with Other Models 26 4.1 Regressive Logistic Models 26 in 4.2 Conditional Logistic Regression Models for Exchangeable Binary Data . . 28 5 Data Analysis and Computing 30 5.1 The Data Source and Description 30 5.2 Estimation Procedure 31 5.3 Comparison of Three Methods 35 5.4 Selection of Covariates 37 5.4.1 Odds Ratio Analysis 45 5.4.2 Univariate Analysis and Comparison of Models 46 5.5 Other Issues 50 5.5.1 General 50 5.6 Examples 52 5.6.1 Example 1 52 5.6.2 Example 2 57 Appendix: C Program for Applying the Newton-Raphson Method 62 IV List of Tables 5.1 Description for the Subset of MCR Data 32 5.2 Description for the Subset of MCR Data (continued) 33 5.3 Results from Separate Logistic Regressions with Response Variables REM, NEM 34 5.4 Method Comparison under Case 1 37 5.5 Method Comparison under Case 2 38 5.6 Method Comparison under Case 3 38 5.7 Method Comparison under Case 4 39 5.8 Method Comparison under Case 5 40 5.9 Method Comparison under Case 5 (continued) 41 5.10 Method Comparison under Case 6 42 5.11 Method Comparison under Case 6 (continued) 43 5.12 Method Comparison under Case 6 (continued) 44 5.13 Odds Ratio for REM 47 5.14 Odds Ratio for NEM 48 5.15 Odds Ratio for PUM 49 5.16 The Association of REM (V21) and NEM (V22) 53 5.17 CI of Odds Ratio for REM (V21) and NEM (V22) 53 5.18 The Estimates of the Coefficients under All Five Cases 55 5.19 The Estimates of the Coefficients under All Five Cases (continued) . . . 56 v 5.20 The Log-likelihood Functions with Different Number of Covariates . . . 57 5.21 The Log-likelihood Functions of Three Response Variables with Different Number of Covariates 58 5.22 The Estimates of the Coefficients under the Third Case 59 5.23 The Marginal Odds Ratio for REM and NEM 60 VI Acknowledgements I would like to thank my supervisor, Dr. Harry Joe for his guidance, help and support throughout the development of this thesis. I would also like to thank Dr. Martin Puterman for his careful reading of the manuscript and valuable comments. Finally, I must acknowledge the support of UBC Department of Statistics during my Masters program. vii Chapter 1 Introduction This thesis is concerned with a model for a multivariate binary response with covariates. The model is based on compatible logistic regressions. Association of outcomes may be a factor of primary or secondary interest in models for binary data. When measures of association are of interest, models for the marginal expectation may not be appropriate. Rather than construction of a multivariate distribution from given univariate or bivariate margins, recently several papers seek to promote the development and usage of a simple but relatively unknown approach to the specification of models for dependent binary outcomes through conditional probabilities. For binary response data, logistic regression models are useful for modelling the conditional probabilities. Practically speaking, we frequently encounter the variety of circumstances in which sev-eral binary response variables are measured on associated explanatory variables. In such situations, interest may centre on modelling the multivariate response for analy-sis. Take two examples for illustration. The first example on coronary bypass surgery will be used in the thesis later. This example provides the pre-operation information such as patient 's age, gender, etc. and the post-operation information such as patient 's complications and survival status which are all given in the form of binary data. Since 1 we are interested in those outcome variables indicating the patient's complications and survival status after the surgery, we have a situation in which the analysis of a multivari-ate binary response with covariates is needed. A second example concerns the analysis of responses, college plans (Yi) and parental encouragement (Y2) and the covariates, IQ (Xi), sex (X2), and socioeconomic status (X3), where Yi is 1 for yes, and 0 for no, i = 1,2 (see Bonney, 1987). For this example, building a model for a multivariate binary response with covariates might be of interest. Recently, regressive logistic models have been proposed when the data comprise se-quences of binary outcomes possibly with measures on associated explanatory variables (Bonney, 1987). Regressive logistic models are based on the elementary decomposition of the joint probability so that the likelihood of a set of binary dependent outcomes, with or without explanatory variables, is expressed as a product of successive conditional probabilities each of which is assumed to be univariate logistic. In Bonney's models, each binary response variable is an explanatory variable for later response variables in the sequence. Another proposal (Rosner, 1984) is the polychotomous logistic regression model. In this model, Rosner converts the problem into a 2n polychotomous logistic distribution for the number of successes when explanatory variables are zero. As a mo-tivation for Rosner's model, Connolly and Liang (1988) describe a class of conditional logistic regression models for correlated binary data. This includes the polychotomous logistic model of Rosner as a special case. These recent proposals were all offered as heuristic approaches for specifying a multivari-ate distribution capable of representing the dependence of binary outcomes. However, they are limited in scope, for they all describe some special patterns of dependence. Regressive logistic models proposed by Bonney deal with sequential binary outcomes, whereas Connolly and Liang's models might be useful when the dependent binary data 2 are almost exchangeable such as in the familial data situation. In order to allow for a general dependence structure for the binary outcomes, we intro-duce a model for multivariate binary data with covariates based on compatible condi-tionally specified logistic regressions. Conditionally, each binary response variable is a covariate for the logistic regressions for the other binary response variables. Moreover, there are conditions on the regression parameters in order that these conditional dis-tributions are compatible. Furthermore, we can initially use logistic regressions to fit the conditional probabilities for each binary outcome, given the remaining binary out-comes in addition to covariates, and use this to assess whether the multivariate model is reasonable. Two different types of compatibility conditions are introduced in Chapter 2. One nec-essary and sufficient compatibility condition is proposed by Arnold and Strauss (1988); another compatibility condition comes from extending a result from Gelman and Speed (1993). The focus of Chapter 3 is to discuss general conditions under which there exists a joint probability distribution with the given specified conditional probability distributions and to develop a model for multivariate binary data with covariates based on compatible conditionally specified logistic regression. Starting with 2 and 3 dimensions enlightens us on the conclusion of the general case (from Section 3.1 to Section 3.2). In Section 3.3, we briefly discuss what are conditions for compatibility when the conditional distributions are based on a distribution that is not logistic. In Chapter 4, we compare our model with regressive logistic models proposed by Bon-ney and conditional logistic regression models for correlated binary data proposed by Connolly and Liang. 3 In Chapter 5, an example with the coronary bypass surgery data is used for illustration of our model. In Section 5.1, we describe the data source in the example. In Section 5.2, we introduce three computing methods which are all likelihood-based and are all used for estimating the parameters in our model. For our example, comparison of three methods is done in Section 5.3. While doing data analysis and computing among other things, answers to the following questions are sought (Section 5.4 and Section 5.5): 1. For our example, how do we choose the covariates to use in our model? 2. How much does the association of the response variables depend on the covariates? 4 Chapter 2 Compatibility Conditions for Conditionally Specified Multivariate Distributions The goal of this chapter is to study conditions for compatibility of conditional prob-abilities; that is, conditions for which a given set of conditional probability functions leads to a proper joint multivariate distribution. This will be made clear through some examples, starting with the two dimensional case. Let (X, Y) be a two dimensional random vector. Clearly, its probabilistic behavior for most purposes is adequately specified by knowledge of its joint cumulative distribution function FX,Y(X, y) = P r ( X < x, Y < y); x, y e R or joint density function fx,y{x,y). A variety of transforms can be used to characterize -Fy.y- It is clear that one marginal distribution and the family of corresponding conditional distributions, i.e., knowledge of Fy(y) and FX\Y{X\V) — Pr(-ST < x\Y = y), for every y, will completely determine the 5 joint distribution of (X, Y). On the other hand, knowledge of the marginal distributions Fx(x) and Fy(y) has long been known to be inadequate to determine FX,Y(X, y). What if we are given both families of conditional distributions, FX\Y(x\y) f ° r every possible value y of Y and Fy\x{y \x) f ° r every possible value x of XI Provided the families of conditional distributions are compatible, in a sense that there exists a valid joint distribution with the given conditional distributions, then indeed these families of conditional distributions will determine the joint distribution of (X, Y). 2.1 Compatibility in 2 Dimensions The joint, marginal and conditional densities of X and Y will be denoted by fx,y(x, y), fx(x), /r(y)> fx\y(x\y) and fy\x(y\x). One compatibility condition proposed by Arnold and Strauss (1988) is that conditional densities fx\y{x\y) a n d fY\x{y\x) a r e compatible if and only if there exist nonnegative functions a(x) and b(y) such that a{x)h\x{y\x) = b(y)fX\Y(x\y) for all (x,y). Another compatibility condition is given in Gelman and Speed (1993). Suppose the conditional densities fx\y('\y) a r e given for all y and fY\x('\xo) is given for a particular value xo- Assume that fx\y(xo\y) > 0 f ° r au< J/- Then for the joint density fx,y{xTy)i jx\Y{xo\y) with the proportionality constant equal to J J fx\Y{xo\y) That is, for two random variables, the set of conditional densities given one variable plus one conditional density given the other variable determine the joint bivariate dis-tribution. If the conditional densities fx\Y('\y) a r e given for all y and /Y\X('\X) a r e given for all a:, then, Gelman and Speed stated that a condition for compatibility, derived from (2.1) is that: fx[y{x\y)fY\x(y\xi),fx\Y(x\y)fY\x(y\x2) = fx\Y(x2\y)fY\X(y\xi) .^ g . fx\Y(xi\y) fx\Y(x2\y) fx\Y(xi\y)fY\x(y\x2) does not depend on x, y for all choices of x\ ^ x2. 2.2 Compatibility in 3 Dimensions Similarly, a three dimensional random vector (X, Y, Z) might be specified by giving the distributions of X given (Y,Z), Y given (X,Z) and Z given (X, Y). Such conditional specifications must be checked for compatibility in a manner analogous to that described in section 2.1. On the basis of the idea provided by Arnold and Strauss, the compatibility condition for a three dimensional random vector {X, Y, Z) is that conditional densities fx\Y,z(x\y, z), fY\x,z(y\x,z) and /Z\X,Y(Z\X, y) a r e compatible if and only if there exist nonnegative functions a(y,z), b(x,z) and c(x,y) such that <*(v,z)fx\r,z(x\y>z) = Kx,z)fY\x,z(y\x>z) = c(x,y)fz\xAz\x,y) for all (x,y,z) (see Arnold, Castillo and Sarabia (1992), Chapter 8). The trivariate extension of Gelman and Speed's compatibility condition is as follows. Suppose the conditional densities fx\Y,z{'\y,z) are given for all y, z, fY\x,z('\xo,z) are given for all z and a fixed x0, and fz\x,Y('\xo^yo) 1S given for some fixed yQ and the same xQ as for fY\x,z('\xo,z). Assume that fx\Y,z(xo\y,z) > 0 for all y, z and fY\x,z(yo\xo,z) > 0 for all z, then for the joint density fxy,z{x,y,z), f u „ x „ fx\Y,z(x\v, z)fY\x,z(y\x<>, z)fz\x,Y(z\xo, y0) . v JX,Y,Z{X> y, z) a T 7 j w 7 — j x V"*) JX\Y,zKxO\y, z)jY\X,Z\yo\XQ, Z) 7 with the proportionality constant equal to fx,Y{xo,yo)- ft the conditional densities fx\Y,z(-\y>z) a r e S i v e n for a 1 12/ ' ZJ fy\x,z(-\xiz) a r e 8 i v e n f o r a11 £> zi a n d fz\xy(-\x,y) are given for all x, y, then a condition for compatibility is that fx\Y,z(X2\yi Z)fY\X,z(yi\x2, z)fY\X,z(y\xli Z)fz\X,Y(z\xi,yi) , g ^ fx\Y,z(Xi |y» •2)/y|A-,z(yikl, «)/K|X,z(y|*2, z)fz\xAz\x2, J/2) does not depend on x, y, z for all choices of {x\,y\) ^ (a^Ste)-Cases involving random variables of dimension greater than three can, of course, be considered. The general ideas, however, are clearly visible in the 2 and 3 dimensional settings and both concepts for the compatibility condition can be extended to higher dimensions. 8 Chapter 3 Models for Multivariate Binary Responses with Covariates The ideas in the previous chapter will be applied to get a model of compatible logistic regressions for a multivariate binary response. Consider a set of n correlated binary variables Y = (Yi, Y2,..., Yn), where Yi is coded 1 and 0, and m covariate variables X = ( X i , X 2 , . . . ,Xm). Suppose that we are given the conditional probability distributions Pr(Yi\Yj,j ^ «,X) for i = 1 ,2, . . . ,n . From previous discussions, the joint probability distribution pr(Y|x) = Pr(y1,r2,...,yn|x) will be determined if these conditional probability distributions are compatible. With binary responses, the logit link function is a natural choice although, in principle, any link function could be chosen. In our study, we assume that all conditional distri-bution of Yi given Yj, j ^ i, and X are logistic. That is, for a given binary response variable, the remaining binary response variables are used as covariates in addition to X . Now our goal is to discuss general conditions under which there exists a joint proba-bility distribution P r ( Y | X ) with the given conditional probability distributions. At the 9 same time, a model for multivariate binary data with covariates based on compatible conditionally specified logistic regressions can be constructed. 3.1 Conditions for Compatibility in 2 or 3 Dimen-sions The two approaches of Arnold &; Strauss and Gelman &; Speed are illustrated in this section. 3.1.1 2 Dimensional Case For ease of exposition, consider first the case of two binary response variables Yi and Y2, and a covariate vector x. Suppose that Yi conditional on x and Y2 = t/2 is logit and that Yi conditional on x and Y\ = y\ is logit, that is, «^PMK=II«..X)] = ^ f f ; : f c g = C*i + ftx + 7122/2, = a 2 + &X + 721J/1, with parameters <*i, /?i, or2? #2? 712, and 721. After some rearrangement, Pr(5-, = *|»,x) = f ^ V + V T ^ ' i 1 + exp(o:i + /?ix -I- 712J/2) and Pr(K2 = ^b1,x) = * ± ^ ^ * M . (3.!) 1 + exp(o:2 + P2X + 721 J/i) (Note that for convenience, the following shorthand notation Pr(r/i|j/2,x) is used for Pr(yi = yi\Y2 = t/2)X = x) , etc; when the context is clear.) 10 To ensure compatibility for the conditional probability distributions Pr(t/i|?/2,x) and Pr(y 2 | j / i ,x) , we have the following theorem. T h e o r e m 3.1 The conditional probability distributions Pr(i / i |y2 ,x) and Pr(i/2 |2/i,x) in (3.1) are compatible if and only i/712 = 721, in which case, the joint probability distri-bution is: Pr(yx , y2 |x) = - exp[(o;i + ftx)^ + (a2 + ftxjjfo + 7122/12/2], (3.2) where c = 1 + exp(c*i + ftx) + exp(a2 + &X) + exp[(o;i + ct2) + {Pi + /?2)x + lu]-Proof 1: (using Arnold and Strauss's compatibility condition) Pr(j/i |y2 ,x) and Pr(j/2|2/i,x) are compatible if and only if there exist probabilities Pr(j/i, r/2 |x), Pr(y2 |x) and Pr(yi |x) such that: Pr(2/i,2/2|x) = Pr(y1 | j /2 ,x)Pr(2/2 |x) = Pr(ya |y i ,x)Pr(y i |x) . (see Arnold, Castillo and Sarabia (1992)). Equivalently: P r ( r i = 1|F2 = l , x ) _ Pr (y t = l |x) pr(y2 = i|y1 = i,x) Pr(y2 = i |x) ' pr(y1 = i|y2 = o,x)_Pr(y1 = i|x) pr(y2 = o|yx = 1, x) Pr(y2 = o|x)' pr(yi = o|y2 = i,x) pr(yi = o|x) Pr(y2 = i|y1 = o,x)~Pr(y2 = i|x)' Pr(yi = o|y2 = 0, x) Pr(yi = 0|x) Pr(y 2 = 0|ya = 0,x) ~ Pr(y 2 = 0 |x ) ' (3.3) (3.4) (3.5) (3.6) For example, dividing (3.3) by (3.4), and dividing (3.5) by (3.6), we have following equations respectively: pr(y2 = o|x) = Pr(y1 = i|y2 = i,x) pr(y2 = o|y1 = i,x) pr(y2 = i|x) pr(y2 = i|yx = i ,x) ' pr(ya = i|y2 = o,x) 11 exp(a! + ftx + 712) 1 + exp(a2 + /?2x + 721) 1 + exp(«i + fax. + 712) exp(a2 + ftx + 721) 1 1 + exp(c*i + ffix) 1 + exp(a2 + /?2X + 721) exp(Qa + ftx) exp(7i2) • [1 + exp(a! + ftx)] exp(72i) • exp(a2 + /fcx) • [1 + exp(a1 + ftx + 712)]' pr(r2 = o|x) _ Pr(yri = o|y2 = i,x) Pr(y2 = o|rx = o,x) Pr(r2 = l|x) ~ Pr(r2 = l | n = 0 , x ) ' P r ( F 1 = 0 | y 2 = 0,x) _ 1 1 + exp(a2 + /32x) 1 + exp(c*i + ftx + 712) exp(a2 + #2x) 1 + exp(ai + ftgx) 1 + exp(a2 + /?2x) 1 + exp(ai + /?ix) (3.7) exp(a2 + /?2x) • [1 + exp(<*i + ftx + 712)]' (3.8) Comparing (3.7) with (3.8), only the condition 712=721 makes them consistent. Other cases yield the same result. Therefore, we conclude that a necessary and sufficient condition for Pr(j/i|j/2)x) and Pr(y2|2/i,x) to be compatible conditional distributions is 712=721-Since Pr(F2 = 0|x) = 1 - Pr(F2 = l |x), solving (3.8) we have Pr(F = llx) = exp(a2 + ftx) • [1 + exp(o;1 + ftx)] 2 1 + exp(ai + Ax) + exp(a2 + /?2x) + exp[(ai + a2) + (ft + ft)x + 7 l 2 ] ' Similarly, we also can get exp(oi + &X) • [1 + exp(a2 + /?2x)] Pr(y! = l |x) 1 + exp(c*i + Pix) + exp(a2 + /?2x) + exp[(ai + a2) + (ft + ftjx + 712]' Summarily, Pr( i i =y i | x ) = -exp[(ai + /3ix)j/i][l + exp(a3 + i82X + 7i2y1)], c Pr(F2 = 2/2 |x) = -exp[(a2 + fax)y2][l + exp(an + ftx + Twife)]. (3.9) c 12 where c = 1 + exp(ai + ftx) + exp(a2 + fax) + exp[(<*i + a2) + (/?i + fa)x + 712]. Using (3.1) and (3.9), it is easy to prove that the joint probability distribution is (3.2). Proof 2: (using Gelman and Speed's compatibility condition) Assume that Pr(y1 |y2)X), Pr(y a |y2,x) > 0 for all y2, x and two fixed values y1? yi. Implied by (2.2), a compatibility condition for Pr(yi |y 2 ,x) and Pr(3/2|3/i>x) is that P r (y i |y 2 , x )Pr (y 2 | y l ,x ) Pr (y i |y 2 ,x )Pr (y 2 | y" ,x ) P r ( ^ ' | y 2 , x ) P r ( t / 2 l ^ , x ) Pr(yi |y 2 ,x) I Pr(yi ' |y2 ,x) P r ( y ; | y 2 , x ) Pr(y2 | t/i ' ,x) ^ UJ does not depend on y\, y2 for all choices of yx ^ yx. Using (3.1), (3.10) can be rewritten as follows: Pr(yil2/2,x)Pr(t/2lyl,x) _ exp[(o?i + fax + 7123/2)3/1] exp[(a2 + fax + 7213/^ )3/2] w Pr(yi |y2,x)Pr( j / 2 |y i ,x) 1 + exp(«i + ftx + 7i2y2) 1 + exp(a2 + fax + 721ft) exp[(a1 + ftx + 7123/2)3/1] exp[(q2 + /?2x + 7213/1 )y2] x 1 + exp(ai + /9ix + 7123/2) 1 + exp(a2 + fax. + 721J/1) 1 + exp(a2 + fax + 72iy") r , v » /., = T T 7 — T f l — 7 7T • exp (a i + PixKVx - 3/1) 1 + exp(c*2 + fax + 7213/1) • exp[y2(yi' - yi)(7i2 - 721)] (3.11) With y1 = 1, yl — 0, (3.11) does not depend on yi and y2 if and only if 712 = 721- That is, a necessary condition for Pr(yi |y 2 ,x) and Pr(y 2 |y i ,x) to be compatible is 712 = 721. Letting yx = 0, from (2.1), (2.2) and (3.1), we get that the joint probability distribution Pr(3/i>3/2|x) has the following property: Pr(2/i|2/2,x)Pr(y2|yi = 0,x) Pr(j/i,2/2 |x) oc Pr(yi = 0|y2 ,x) exp[(a1 + fkx)V\ + (<*2 + ^ x ) y 2 + 7123/13/2] 1 + exp(a2 + /?2x) 13 with the proportionality constant equal to 1 + exp(a2 + /?2x) 1 + exp(ax + /?ix) + exp(a2 + ftx) + exp[(ai + a2) + (ft + /?2)x + j12]' In other words, the joint probability distribution is proved as (3.2). It is easy to prove that the condition 7i2 = 72i is sufficient. 3.1.2 3 Dimensional Case Next, in order to get clearer ideas about the general multivariate case, it is necessary to further discuss the 3 dimensional case with three binary response variables Y\, Y2 and I3 , and a covariate vector x. By assuming the linear logistic models for the conditional prob-ability distributions Pr(Fx = y i |y 2 ,y 3 ,x ) , Pr(F2 = t /2 |yi ,y3 ,x) and Pr(F 3 = 2/3|2/i,2/2,x), we have r, / 1 x exp[(ai + ftx + 712y2 + 713^3)2/1] Pr(yi y2 ,y3,x) = — -.—— • r, 1 + exp(ai + ftx + 7122/2 + 713J/3) exp[(a2 + ftx + j2iyi + 7232/3)2/2] Pr(y2 |yi ,y3,x) = 1 + exp(a2 + ftx + 72ij/a + 723?/3)' and T W 1 ^ exp (0:3 + ftx + 731yi + 732^2)2/3 ,„ 10x P % 3 2/1,2/2, x) = — -.—— • r, (3.12) 1 + exp(a 3 + ftx + 731yx + 732y2) where a 's , fts and 7's are parameters. (Note again that we are using shorthand notation, that is clear from its context.) For a 3 dimensional case, an analog of Theorem 3.1 to guarantee compatibility for (3.12) is given in Theorem 3.2. T h e o r e m 3.2 The conditional probability distributions Pr(yi | j /2 , t /3 ,x) , Pr(y2 | t / i ,y3 ,x) and Pr(j/3 | j / i ,y2 ,x) in (3.12) are compatible if and only 1/71:2=721,713=731, and 7 2 3 = 14 732 • Furthermore, the joint probability distribution is of the form Pr(2/i, 2/2,2/3 |x) = - exp[(aa + /?ix)yi + (a2 + My2 + (<*3 + /?3x)y3 +7122/12/2 + 7132/12/3 + 7232/22/3] 1 3 = - e x p [ £ ( a , - + #x)y , -+ £ 7.i2/.2/j] C »'=1 l<«'<j<3 (3.13) where c = 1 + £ ? = 1 exp(a,- + # x ) + Ei<,<i<3 exp[(a;,- + a,-) + (A + ^ ) x + 7,^] + exp[(a1 + a2 + a3) + (ft + ft + ft)x + (7l2 + 7l3 + 723)]. Proof 1: (using the extension of Arnold and Strauss's compatibility condition) Pr(2/i|2/2,2/3,x), Pr(j/2|2/i,2/3,x) and Pr(2/3|2/i,2/2,x) are compatible if and only if there are probabilities Pr(j/i,t/2,2/3|x), Pr(2/2,2/3|x),Pr(t/1, y3 |x) and Pr(j/!,2/2|x) such that pr(2/i>2/2,2/3|x) = Pr(2/i|2/2,2/3,x)-Pr(T/2,2/3|x) = Pr(y2 |y i ,y3 ,x) -Pr(y i ,y 3 | x ) = P r (y 3 | y i , y 2 , x ) -P r (y i , y 2 | x ) for all 2/1,2/2, 2/3, x. Equivalently, Pr(2/i|2/2,y3,x Pr(2/2|2/i,y3,x Pr(2/i|2/2,2/3,x Pr(y3|2/i,y2,x Pr(2/2|2/i,2/3,x Pr(2/3|2/i,y2,x Pr(2/i,2/3|x) Pr(2/2,2/3|x)' Pr(2/i,2/2[x) Pr (^ ,2 /3 |x) ' Pr(2/i,y2|x) P r (y i ,y 3 | x ) ' Special cases of (3.14) are Pr(r1^o,r3 = oix) Pr(y2 = o,r3 = o|x) P r ( F 1 = 0 | y 2 = 0 , r 3 = 0,x) Pr(F2 = 0111=0 , F3 = 0,x) 1 + exp(Q2 + ftx) 1 + exp(ai + ftx)' (3.14) (3.15) 15 Pr(y1 = i,y3 = o|x) = Pr(r1 = i|r2 = o,y3 = o,x) Pr(F 2 = 0, Y3 = 0|x) Pr(F 2 = 0 | y = 1, Y3 = 0, x) 1 + exp(c*2 + fox + 721) 1 + exp(a1 + fox.) exp(c*i + fox), (3.16) Pr(y1 = o,y3 = Q|x) = Pr(y1 = o|ra = i,y3 = o,x) pr(y2 = i,r3 = o|x) Pr(y2 = i|Ki = o,y3 = o,x) 1 + exp(a2 + fox) 1 + exp(ai + fox + 712) exp(e*2 + fox)' (3.17) Pr(y1 = i,y3 = o|x) = Pr(y1 = i|y2 = i,y3 = o,x) Pr(y2 = i,y3 = o|x) Pr(y2 = i|ya = i,y3 = o,x) _ l + e x p ( a 2 + /?2x + 72i) 1 + exp(ai + fox + 712) • exp[(a! - a2) + (fo - fo)x + ( 7 l 2 - 7 2 1 ) ] . (3.18) Dividing (3.15) by (3.16) and dividing (3.17) by (3.18), we have Pr(y 1 = 0,y3 = 0lx) 1 + exp(a2 + fox) 1 Pr(yx = 1, y 3 = 0|x) 1 + exp(a2 + fox + 721) exp(ai + fox)' Pr(yx = 0, y 3 = 0|x) 1 + exp(a2 + fox) 1 Pr(y a = l , y 3 = 0 | x ) l + exp(c*2 + /?2X + 721) exp(ai + fox + 7 l 2 - 7 2 i ) ' (3.19) (3.20) Comparing (3.19) with (3.20), we conclude that one of the necessary and sufficient condi-tions for Pr(yi \y2, J/3, x ) , Pr(j/2|j/i, t/3, x) and Pr(t/3|?/i, y2, x) to be compatible conditional distributions is that 712 = 721. By symmetry, we conclude that other necessary and sufficient conditions for compati-bility are 71 3 = 731 and 72 3 = 732 . Using the same procedure which has resulted in (3.19) and (3.20), we have Pr(yx = o,y3 = i|x) = Pr((yx = o,y3 = iix) pr(y2 = o,y3 = i|x) pr(ya = i,y3 = i|x) pr(y2 = o,y3 = i|x) ' pr(yx = i,y3 = i|x) Pr(y1 = o|y2 = o,y3 = i,x) Pr(y2 = o|y1 = i,y3 = i,x) pr(y2 = oiyj = o,y3 = i , x ) ' pr(ya = i|y2 = o,y3 = i,x) 1 + exp(a2 + fox + 723) 1 1 + exp(a2 + fox -(- 712 + 723) exp(ai + fox + 713) 16 (3.21) and Pr(y1 = i>y3 = o|x) = Pr(y1 = i,y3 = o|x) pr(y1 = i,y2 = o|x) Pr(yx = i,y3 = i|x) pr(yx = i,r2 = o|x)" Pr(n = i,y3 = i|x) pr(y3 = o|y1 = i,y2 = o,x) Pr(ya = o|yi = i,y3 = i,x) pr(y2 = o|Fx = i,y3 = o,x) * pr(y3 = m = i,y2 = o,x) 1 + exp(a2 + fax. + 7 l 2 ) 1 1 + exp(a2 + fax + 712 + 723) exp(a3 + fax + 713)' From (3.21) and (3.22), Pr(yx = 0,y3 = l|x) and P r ^ = l ,y 3 = 0|x) can be expressed in terms of Pr(yx = 1, y3 = l |x), respectively: Pry, - 0, li - 1 W - , ' + TP(0%+ A X + ™] , • VT(Y; = \Ys = 1|X ' (3.23) 1 + exp(a2 + fax + 712 + 723) exp(a1 + fax + 713) and Pr(K, = 1. y3 = 0W = l + e x p ( a i + / ? 2 x + 7 , 2) Pr(r , = 1 ,Y, = l|x) 1 + exp(a2 + fax + 712 + 723) exp(«3 + fax + 713) Substituting (3.24) for (3.20), Pr(yx = 0, Y\ = 0|x) also can be expressed in terms of Pr(y1 = l ,y 3 = l |x): P r ( y = i y = 1 ] x ) = 1 + exp(Q2 + fax) Pr(y1 = l ,y 3 = l |x) ' 3 1 + exp(a2 + fax + 712 + 723) exp[(tt! + a3) + (fa + fa)x + 7 l3] (3.25) Under the constraints of Pr(Yi = 0, Y3 = 0|x) + Pr(yi = 0, Y3 = l |x) + Pr(yx = 1, Y3 = 0|x) + Pr(ya = l ,y 3 = l|x) = 1, a bit of calculation by substituting (3.23), (3.24) and (3.25) will result in following equation: c-Pr(y1 = i ,y 3 = i |x) = 1 [1 + exp(a2 + fax + 712 + 723)] exp[(ai + a3) + (fa + fa)x + 713] where c = 1 + £? = 1 exp(a,- + fax) + £i<«j<3 exp[(a,- + a,) + (fa + fa)x + 7^ -] + exp[(ai + a2 + a3) + (fa + fa + fa)x + (712 + 713 + 723)] • 17 Thus, Pr(rx = l , y 3 = l |x) = - • e x p [ ( a 1 + a3) + (A + & ) x + 713] c •[1 + exp(a2 + /?2x + 712 + 723)] Similarly, we can obtain other probabilities Pr(y,-,y_,|x). In brief, we can write Pr(y,-,yy |x) as: Pr(y,-, yj |x) = - • exp[(a,- + #x)y,- + (a,- + fijx)yj + myiVj] c •[1 + exp(afc + /?*x + 7,fcj/i + 7*iW)] (3-26) where i < j , k ^ i, k ^ j and i, j , k = 1,2,3. Finally, using (3.12) and (3.26), we can obtain the joint probability distribution as (3.13). P r o o f 2: (using the extension of Gelman and Speed's compatibility condition) From (2.4), a condition for Pr (y i |y 2 ,y 3 ,x ) , Pr (y 2 | y i ,y 3 ,x ) and Pr(y 3 | y i ,y 2 ,x ) to be compatible is that Pr(Vi lya, 3/3,x) Pr(y2 |y|', y3, x) Pr(y2 |yj , 2/3, x) Pr(y3 |yj , y2 >x) , ^ Pr(yi|y2, ya, x) Pr(y2 |yi , ys, x) Pr(y2|yi'» y3, x) Pr(y3 |yi', y2 , x) does not depend on j/i , y2 and y3 for all choices of (y1,y2) 7^ ( y i ^ ) -Applying (3.12), in this case we rewrite (3.27) as: Pr(yr iy2,y3,x)Pr(y 2 , | y i ,y 3 ,x)Pr(y 2 | y l ,y3 ,x)Pr(y 3 | t / i ,y 2 ,x) ^ Pr(yi|y2, V3, x) Pr(y2 |yi , y3, x) Pr(y2 |yi', y3, x) Pr(y3 |yi', y2 , x) 1 + exp(a 3 +/5 3x + 73^1 + 732y2') r, fl w » /. , fl w » #v, — ) — — r— rf • exp (a i + ftx)^ - yx) + (a 2 + /?2x)(y2 - y2) 1 + exp(a 3 + /?3x + 7 3 ^ ! + 732y2) • exp[72i(yiy2 - y'iy'2) • exp[y2(y" - yj)(7i2 ~ 721)] • exp[y3(yi' - yj)(7i3 - 731)] • exp[y3(y2' - y2)(72 3 - 732)] (3.28) 18 Let yx = 1, ya = 0, y2 = 1 and y2 = 0, then (3.28) does not depend on yi, y2 and y3 if and only if 712 = 721, 713 = 731 and 723 = 732. This indicates that necessary conditions for the trivariate case are 712 = 721, 713 = 731 and 723 = 732-Applying (2.3), let both fixed values of y\ and t/2 be zeros, then we know for the joint probability distribution Pr(yi , t/25£/3> |x) , D / 1 x Pr(yily2,y3,x)Pr(y2|yi = 0,y3,x)Pr(y3|yi = 0,3/2 = 0,x) Pr(yi , 2/2, 2/3, X) OC — -: —— r Pr(yi = 0|y2 ,J/3,x)Pr(y2 = 0|yx = 0 ,y 3 ,x) _ exp[(Qi + /?ix)yi + (0:2 + /?2x)y2 + (Q3 + ^3x)y3 + 7122/13/2 + 7i32/i2/3 + 723^ /23/3] 1 + exp(a 3 + /33x) with the proportionality constant equal to [1 + exp(ai3 + /?3x)]/c, where c is given in (3.13). Now we have proved that the joint probability distribution is the form of (3.13), and sufficiency of the conditions is easily proved. In both Theorem 3.1 and Theorem 3.2, we use two kinds of techniques for the proof. The first technique uses Arnold and Strauss's compatibility conditions; the second uses Gelman and Speed's idea for compatibility conditions. It is obvious, from the lengths of the proofs in the 3 dimensional case, that the extension of Gelman and Speed's compatibility conditions is easier to use than the extension of Arnold and Strauss's. In fact, using the extension of Gelman and Speed's compatibility conditions, it is possible to prove an analogous theorem to Theorem 3.1 and Theorem 3.2 for the general multivariate case, but using the extension of Arnold and Strauss's compatibility conditions, the necessity of the condition might be too tedious to prove. 19 3.2 General Case Our ultimate aim is to develop a conditionally specified logistic regression model for multivariate binary responses with covariates and to discuss necessary and sufficient conditions for compatibility under general dimensions. For notation, we change /?jX above to j3j\Xi + • • • + f3jmxm. Assume Pr(yn | y i , . . . , y ^ , x) - 1 + ^ ^ + ^ ^ + ^ ^ } . (3.29) The above discussions in both 2 and 3 dimensional cases provide immediate generaliza-tion of the results for the n dimensional case. For the reason that we have mentioned at the end of last section, however, we prove the necessary conditions for conditional distributions (3.29) to be compatible only using the extension of Gelman and Speed's compatibility condition. T h e o r e m 3.3 The conditional probability distributions Pr(y, |yj, j ^ i ,x ) for i = 1 , . . . , n, in (3.29) are compatible if "fij=fji for all i ^ j . Under conditions for compatibility, a joint probability distribution is as follows: i n tn P r ( y i , . . . , y n |x) = -exp{]T[(a,- + £ PijXj)yi] + J2 7 v W > (3-3 0) C ,=1 j = l l< .< i<n where c is the normalizing constant which involves a sum of2n exponential terms and is the form of 1 1 n m c = Yl • • • J2 exp{J2(ai + J2 Pnxj)yi + Yl lavivi) (3-31) J/l=0 j / n =0 t'=l j=l i<j Proof: By extending (2.4) and (3.27) to the n dimensional case, we conclude that a condition for Pi(yi\yj,j ^ i,x) to be compatible is that FEUPr(y,-|y|,• • •,y'j-i,yi+i,•. •, yW) x) • n?=i^(y"\y'u••••>y"-nVt+u• • •»Vn,x) ,g ^ n?=i Pr(y,-|yi', • • •,y"_i, y i+i , . . . , y», x) • n?=i Pr(y,:|2/i, • • •,y,'_i,y.-+i,...,y„,x) 20 does not depend on y,- ( i — 1 , . . . , n) for all choices of (y1,..., yn) ^ (y1,..., yn). Given (3.29), we can write (3.32) in detail, that is: nr= i Pr(y,[yi,. • •, y,-_i, yi+i, • • •, yn, x) • nr=i Pr(y,"|yi'> • • •»y"-n yi+i. • • •. yn, x) nr=i Pr(y.|yi, • • •, y,"_i, y.+i, . . . , y», x) • n,"=i Pr(y-|yi, •. •, y,'_i, y,+i , . . . , y„, x) _ n"=i exp[(a,- + J2T=i Paxi + £ j< i 7«i2/j + Ej>.- 1fijVj)yi] ll"=i exp[(a,- + E7=i Paxi + Ei<i 7.j'yi + £;>.- lavM n"=i exP[(o!,- + Y%LI PiM + Ej<,- 7«jyj + Zj>i mjyjWi] n?=i exp[(a,- + ££=i AjXj + Ej<,- luv'j + Ej>.- 7yw)yi'l n m = I I exp[(««- + £ Pijxi)(y" - y'd + £ 7« (yj y" - yj-y,')] i=l j= l :>j n I I e x p E 7j.-yj(y,' - y'i)] • I I e x p E 7o-yj(y" - y.')] (3-33) «=i «<j t = i i'<j If (y[,..., yn) = 1 and ( y " , . . . , y„) = 0, (3.33) becomes n,"=i Pr(yi\y'i, •••, y,--i, y.-+i, • • -, y», x) • n L i Pr(y"lyi, • • •» y"-n y«+i, • • •, yw, x) nr=i Pr(y,-|yi', • • •,y-'-i,y.+i,...,yn ,x) • n?=i Pr(y,'|yi,...,y,'_i,y,-+i,...,y„, x) n m n = H exp[-(a t- + Y, PaxJ + S Ty)] • I I e x P E ( 7 ; . ' - 7«i)yj]- (3-34) t=l j = l «>i i=l i<j Therefore, (3.32) does not depend on yt- ( i — 1 , . . . , n) if and only if 7,j = 7,-,- for all i ^ j in (3.34). Now we get the conclusion that the necessary conditions for Pr(y, |yj , j 7^ z,x) to be compatible conditional distributions are 7,j = 7,-,- for all i ^ j . Again extending (2.1) to the n dimensional case, for the joint probability distribution P r ( y i , . . . , y B | x ) , Pr^,...^*^*?^---'^1 V-X\ (3.35) n"=i Pr(y,-|yi, • • •, y,-_i, y.+i, • • •, y», x) In this case, if yt = 0 for all i, we can write (3.35) as follows: n m P r ( y i , . . . , y„|X) oc I I exp[(a,- + ^ A ^ i + X}7ijy;)y.] 1=1 i = i i< i 21 with the proportionality constant equal to 1/c, where c is defined as (3.31). That is, the model for multivariate binary data with covariates based on compatible conditionally specified logistic regressions is: i n TO Pr(yi, • • •, y„|x) = - exp{£[(a , - + ] T flyx,-)^ + Y, -yijViVj} C , = 1 j = l l < t ' < i < n where c = £ * i = 0 • • • Ej„=o exP{E,"=i(«.- + EjLi PijXj)yi + E ,< ; lijViVj}-Finally, we prove the sufficient conditions for conditional distributions (3.29) to be compatible. T h e o r e m 3.4 The conditional probability distributions of (3.30) are linear logistic dis-tributions. Proof: Given the model (3.30) with 7^ = 7,,-, i ^ j , we have Pr(y,-,i ± k\X) = Vv(yu...,i/*-i,y* = l ,y*+i,-• -,Vn\X) + Pr(?/i,.. . ,t/ f c_i,y f c = 0,yk+1,...,yn\X) •j m m = -{exp[(a* + ^ flya:,-) + X ( a » ' + H Pnxi)Vi c 3=1 i^k 3=1 + D 7*;yj + E To-yt-yj] Jj4A l < i < i < n , ' i i # m + expE(a,- + ]T #,•&,•)# + 2 HjViyj]} 1 m = -e x P E ( a ' ' + E ^ I i ) j ' ' + X) HiViVj] C i^k i=l l<i<j<n,i,j^k m •{1 + expK + ] £ fly*,- + XI 7fcj2/j]}-3=1 jjtk Further, we obtain Pr(yi>S/2,--.,yn|A') Pr(y* |y; , j ? * * , * ) = Prfo.j^*!*) 22 exp{E"=i [(<*«• + E?Li PijXj)vi] + Ei<i<j<n amy5} exp[E,^fc(a.- + E^zi fcjxfivi + J2i<i<j<n,i,j*k amy]] 1 {1 + exp[a* + E?Li PkjXj + Ej*k 7kjVj]} exP[(o!fc + E ^ i PkjXj)yk + (Ej^fe 7kjyj)yk] 1 + exp[«* + E^=i PkjXj + Ej^jt 1kjVj\ This equation indicates that conditional probability distributions Pv(yk\yj,j ^ k,X) for k = 1 , . . . , n are linear logistic distributions. • 3.3 Further Extension The discussion so far assumes implicitly that the conditional distributions Pr(i/j|?/,-, i ^ j,x) for j = l , . . . , n , are logistic, so we say that the model (3.30) for multivariate binary data with covariates we have constructed is based on compatible conditionally specified logistic regressions. In fact, as mentioned before, with binary responses, any other link function could be chosen instead of the logit link function. From this point of view, a question to be addressed here is what are conditions for compatibility when the conditional distributions are based on a distribution F that is not logistic, i.e., Pr(5"j = 1 lift, i ± j , x) = F{ai + pjX + J2 ljiyi),3 = 1, • • •, n. In this section, we intend to motivate the development of conditions for compatibility when F is not logistic. For this reason, let us simply consider the case of two binary response variables Y\ and Y2, and a covariate vector x . Assume that the conditional distribution of Yj is Pi(Yj = l|y,-,x) = F{ctj + 0jX + rijiyi) f o r ; = 1,2,» = 3 - j . 23 Here we use Arnold and Strauss's compatibility condition instead of Gelman and Speed's. Consequently, the conditional distributions Pr(yi\y2,x.) and Pr(y2|j/i,x) are compatible if and only if there exist probabilities Pr(yi,y2 |x), Pr(j/i|x) and Pr(i/2|x) such that: Pr(2/i,3/2|x) = Pr(yi|y2,x)Pr(y2 |x) = Pr(j/2|yi,x)Pr(y!|x). In detail, we write the above form as: Pr(y1 = i|r2 = i,x) Pr(y! = i|x) Pr(y2 = l|yx = l ,x) Pr(y2 = l | x ) ' Pr(yx = i|y2 = o,x) _ Pr(yx = i|x) Pr(y2 = o|ya = i,x) " pr(y2 = o|x)' Pr(yx = o|y2 = l ,x) Pr(y t = 0|x) Pr(y2 = \\YX = 0,x) ~ Pr(y2 = l | x ) ' Pr(y t = o|y2 = 0,x) Pr(yi = 0|x) p r(y2 = o|yx = 0,x) ~ Pr(y2 = 0|x)" In particular, if we divide (3.36) by (3.37) and divide (3.38) by (3.39), we have pr(y2 = o|x) = pr(y1 = i|y2 = i,x) pr(y2 = o|y1 = i,x) pr(y2 = i|x) pr(y2 = i|ya = i , x ) ' Pr(yx = i|y2 = o,x) F{ax + A x + 712) 1 - F(a2 + A x + 7 a i ) F(ai + ^x) * F(a2 + Ax + 721) pr(y2 = o|x) = Pr(y1 = o|y2 = i,x) Pr(y2 = o[y1 = o,x) pr(y2 = i|x) ~ Pr(y2 = i|y1 = o,x)'Pr(y1 = o|y2 = o,x) 1 - F(ai + fax. + 712) 1 - F(a2 + ^ x ) (3.36) (3.37) (3.38) (3.39) (3.40) (3.41) 1 - F(ai + Ax) F{a2 + Ax) Obviously, (3.40) and (3.41) must be equal. In other words, a necessary and sufficient condition for Pr(j/i|t/2,x) and Pr(y2 |yx,x) to be compatible conditional distributions is F (a i + A x + 712) F(a2 + Ax) _ F(a2 + A x + 721) Fj^ + Ax) 1 - F(ai + Ax + 712) ' 1 - ^(02 + Ax) 1 - F(a2 + Ax + 721) ' 1 - F(ai + Ax) (3.42) After analysing (3.42), we find that both of the following conditions make (3.42) hold: 24 1. ai = a2, fix = #2, and 712 = 721. 2. 712 = 721 = 0, i.e., the two binary response variables Y\ and Y2 are independent. It is not obvious that general conditions can be obtained. There may be conditions that depend on F. As for higher dimensional cases, the general analysis is quite difficult. 25 Chapter 4 Comparisons with Other Models Logistic regression is widely used to study the effects of explanatory variables on binary response variables. Several different specifications have been considered for dependent binary response variables. By means of logistic regression, several useful approaches to specifying a multivariate distribution capable of representing the dependence of binary response variables have recently been proposed. However, in contrast to model (3.30) which allows for a general dependence structure for binary response variables, these other models are limited in scope. Here we discuss two kinds of these models and compare them with model (3.30). 4.1 Regressive Logistic Models To describe sequentially dependent binary response variables, Bonney (1987) proposed regressive logistic models in which the probability of an observation is conditioned on all preceding observations. The basic theory of Bonney's models is the follow. Since there exists a natural ordering to the indexing of all dependent binary response variables, it is reasonable to decompose the probability of Y given X into a product of n probabilities 26 such as: Pr(Y|x) = pr(y1,y2,...,y„|x) = Pr(r1|x)Pr(y2|r1,x)--.Pr(rn|r1,r2,...,n_1,x). (4.i) Correspondingly, he defines the ith logit as Pr(Yi = l\Y1,...,Yi.1,X) 6i = ^[ P r ( ^ = 0 | y 1 , . . . , « . 1 > X ) 1 ' and assumes that 0,- is a linear function of Yi,..., Yi-i,X. and can be expressed by 0i = a + 7a Zx H h 7n_iZn_1 + /?X, where £; = < 2 1 ^ - 1 , i f j < i 0, if j>i. Therefore, the joint probability is decomposed into a product of successive conditional probabilities each of which is assumed to be univariate logistic: P '(Y|X) = ft ~ ^ - (4-2) Model (4.2) provided by Bonney has the theoretical and practical advantage that it can be analyzed and fitted in the same way as the logistic regression model for independent binary response variables, and with the same computer programs. However, there exist differences between model (3.30) and model (4.2). Model (3.30) is based on compatible conditionally specified logistic regressions and deals with a general dependence structure for the Yi's, but model(4.2) is based on a product of a sequence of logistic regressions and only should be used if the binary variables are observed sequentially in time. The decomposition (4.1) is applicable only when a natural ordering of the dependent obser-vations exists. Note that a different order generally implies a different model, and the joint probabilities (4.2) are not necessarily the same for different orders. 27 4.2 Conditional Logistic Regression Models for Ex-changeable Binary Data Rosner (1984) proposed a polychotomous logistic regression model which reduces to the beta-binomial distribution for the number of successes when explanatory variables are zero. Based on Rosner's idea, Connolly and Liang (1988) suggested a general model in which Rosner's model is considered as a special case: P r ( Y | X a , . . . , Xn) = c(0,0) exp{ £ ) Fn(k, 0) + £ Xj/3Yj} (4.3) where c is a normalizing constant chosen to make the probabilities, P r ( Y | X ) , sum to unity and Y. is the sum of the Yj's. It follows from (4.3) that the logit conditional probability of Yj; = 1 given Y__,- = {Y\, • • •, Yj-_i, Y j + i , . . . , Yn) and Xj is logitVxiXj = l\Y-i,Xi) = Fn(wj;0) + / ?%• , (4.4) where Wj = Y—Yj is the sum of the Y's excluding Yj. Clearly, the conditional probability for Yj = 1 in (4.4) depends on YLj only through the sum. In other words, this indicates a very important feature of the logistic representation in (4.4): when there are no covariate effects or when X\ = • • • = Xn, then the models are such that the n Yj's are exchangeable. Compared with model (3.30), model (4.3) introduced by Connolly and Liang has some limitations due to the restricted dependence structure. That is, model (4.3) might only be reasonable when the dependence among observations is approximately exchangeable such as in familial data. On the other hand, in model (3.30), the logit conditional probability of Yj = 1 given ?/_, and X is m logitVx(Yi = l|y_,-,X) = a,- + ^ Nxi + X}7«j2/j-28 This logistic representation means that the dependence of each single y on the rest is only through the sum when all 7,-j's are equal. Thus, we conclude that model (3.30) is a special case of Connolly and Liang's model (4.3) when jij = 7 for all i ^ j and /?,_,- = flj for all i,j, since then £ i ? y 7 ^ = q/(y. - y{). 29 Chapter 5 Data Analysis and Computing An example with some coronary bypass surgery data will now be presented to illustrate the application of the model for multivariate binary responses with covariates that we have discussed in Chapter 3. From a clinical point of view, it is important to determine risk factors for different complications. Using multivariate binary model (3.30), we can determine risk factors for several response variables simultaneously and account for dependence between the binary response variables. Zhang (1993) did the risk factor analysis separately for each binary response variable (or complication variable). In order to fit model (3.30), three likelihood-based computing methods will be intro-duced. After that , several topics will be discussed, including choice of covariates and inferences from the model. 5.1 The Data Source and Description MCR (Merged, Multi-Center, Multi-Specialty Clinical Registries) is an international data-base developed by Health Data Research Institute in which information about pa-tients who had heart-related surgery was recorded. When doing statistical analysis, instead of using the entire data set, we choose a random sample subset which contains 30 880 patients information (this is done partly in order to reduce computational time). The information available both pre and post operation is provided in Table 5.1 and Ta-ble 5.2. The pre-operation information includes patient's age, gender, prior myocardial infarction, existence or non-existence of other diseases, body surface area, etc. (the first 19 variables: VI to V19). The post-operation information includes the patient 's status during and after the bypass surgery; for example, complications, such as renal or neuro-logical problems, and survival status to 30 days following surgery (the last 7 variables: V20 to V26). The variables of primary interest are those outcome variables indicating the patient 's complications and survival status after the surgery. Here, the complication variables are related to the quality of life after surgery. For more documentation of the data set, see Zhang (1993). 5.2 Estimation Procedure Our aim in this section is to discuss from a practical viewpoint, how to apply model (3.30) and how to estimate the parameters in model (3.37). Firstly, a simple, indirect, yet important way is to use the existing computer programs such as the glm() function in Splus which fits the coefficients of the linear logistic model using maximum likelihood in the binomial family. Using the glm() function in Splus to fit the conditional probabilities Pr(y,|yj, j ^ i,x) separately, not only can we estimate the parameters, but also we can assess the compatibility conditions for the conditionally specified distributions. For example, after using the glm() function twice to fit model (3.30) for two binary response variables REM (V21), NEM (V22) and one covariate AGE (VI) in the MCR data set, we have the results in Table 5.3. From Table 5.3, in addition to obtaining the 31 Table 5.1: Description for the Subset of MCR Data Variable VI V2 V3 V4 V5 V6 V7 V8 V9 V10 Vl l V12 V13 V14 Name Age Sex Prior Myocardial Infarction Obesity Chronic Obstructive Pulmonary Disease Diabetes Renal Disease Hypertension Alcohol Abuse Cancer Liver Disease Central Nervous System Disease Prior Cerebrovascular Accident Rheumatic Heart Disease Codes/Values Years 0=Male 1=Female 0=No l=Yes 0=No l=Yes 0=No l=Yes 0=No l=Yes 0=No l=Yes 0=No l=Yes 0=No l=Yes 0=No l=Yes 0=No l=Yes 0=No l=Yes 0=No l=Yes 0=No l=Yes Abbreviation AGE SEX PMI OBE COP DIA REN HTN ETO CA LIV CNS PCA RHE 32 Table 5.2: Description for the Subset of MCR Data (continued) Variable V15 V16 V17 V18 V19 V20 V21 V22 V23 V24 V25 V26 Name Other Surgery No Surgery Valve Replacement Left Ventricular Dysfunction Body Surface Area Discharge/30 Day Status Renal Complication (Mild or Severe) Neurological Complication (Mild or Severe) Pulmonary (Mild or Severe) Myocardial Infarction Low Out Syndrome (Mild or Severe) Sepsis Codes/Values 0=No l=Yes 0=No l=Yes 0=No l=Yes 0=Normal 1= 40-49% 2= 30-39% 3= 20-29% 4= < 20% Square meters 0=Dead 1=Alive 0=No l=Yes 0=No l=Yes 0=No l=Yes 0=No l=Yes 0=No l=Yes 0=No l=Yes Abbreviation OTH NON VAL LVD BSA STA REM NEM PUM MI LOM SEP 33 Table 5.3: Results from Separate Logistic Regressions with Response Variables REM, NEM Parameter 712 a2 721 Value Std. Error t value -8.9190 1.5952 -5.5912 0.0818 0.0229 3.5700 1.4760 0.3914 3.7712 -7.1316 1.0204 -6.9900 0.0705 0.0149 4.7296 1.4521 0.3959 3.6676 estimate of the parameters, more important, we find that 71 2 is quite close to 721 and this gives an indication of the adequacy of fit of model (3.30) for these data. Secondly, model (3.30) can be viewed as an exponential family model c _ 1 exp{A - 1 s} with "sufficient" statistic vector s = (su...,sM) = (yi, • • • ,yn,yiy2,ym, • • • ,yn-iyn,xiyj,i = l,...,m,j = l, . . . ,n). Given data of the form (yn,..., i/,n, xa,..., £4 m) , i = 1,...,N, the sufficient statistic vector is X2,=i s«- Using the Newton-Raphson procedure for an exponential family log-likelihood, we can obtain the maximum likelihood estimate of the parameters in model (3.30) by an iterative process as: A r+1 = A r - H " 1 ( A r ) u ( A r ) for r = 0 , 1 , 2 , . . . , where u(A) is a vector whose j t h component is the first derivative of the log-likelihood with respect to the jth parameter and the matrix H(A) (sometimes called the Hessian matrix) is the matrix of second partial derivatives of log-likelihood function. A0, a vector of initial estimates of A, can come from the results of using the glm() function, separately for each response variable. 34 In order to use the Newton-Raphson method to estimate parameters in model (3.30), A C program (see Appendix) was written for analyzing the MCR data set. The Newton-Raphson procedure means that M(M + 1 ) / 2 second derivative evaluations, M first derivative evaluations and a matrix inverse are needed even before the linear search is attempted. In order to simplify these problems, the quasi-Newton method is suggested as the third choice in obtaining the estimate of the parameters in model (3.30) since in the quasi-Newton method, a method is used to approximate H _ 1 directly from the first derivative information available at each step of the iteration (sometimes this approximation can be inaccurate) (see Nash 1990). To use the quasi-Newton method for analyzing our data, another C program was written. As in the first C program, we get initial values from the separate logistic regressions. 5.3 Comparison of Three Methods All three methods (the glm() function, the Newton-Raphson method and the quasi-Newton method) we have introduced can be used to fit model (3.30), but what are computational differences among them? By means of the MCR data set, we use the three methods to estimate the parameters in model (3.30) under different cases, for example: • Case 1: one binary variable - REM (V21), one covariate - LIV (VI1). (See Table 5.4). • Case 2: one binary variable - REM (V21), two covariates - AGE (VI) and LIV ( V l l ) . (See Table 5.5). • Case 3: two binary variables - REM (V21) and NEM (V22), one covariate - SEX (V2). (See Table 5.6). 35 • Case 4: two binary variables - REM (V21) and NEM (V22), two covariates - SEX (V2) and COP (V5). (See Table 5.7). • Case 5: three binary variables - REM (V21), NEM (V22) and PUM (V23), two covariates - SEX (V2) and COP (V5). (See Table 5.8 and Table 5.9). • Case 6: four binary variables - REM (V21), NEM (V22),PUM (V23) and MI (V24), three covariates - AGE (VI), SEX (V2) and PMI (V3). (see Table 5.10, Table 5.11 and Table 5.12). From the computations for the MCR data set with three methods, we find that: 1. The glm() function in Splus is an approximate method, though from the standard errors in Table 5.4 to Table 5.12, it appears that both the glm() function and the Newton-Raphson method generate rather similar estimates of the parameters. It is less convenient than other two methods since the number of many times we need to run the glm() function depends on the number of binary response variables. An advantage of the glm() function is that one can assess the goodness of fit by a rough comparison of 7^ and 7,-,- relative to their standard errors from the glm() function before correctly applying model (3.30) to a data set. 2. Sometimes the quasi-Newton method generates inaccurate estimated standard er-rors. For example, in Table 5.10, the standard error of ax is 1.09 when using the quasi-Newton method, but when using the glm() function or the Newton-Raphson method, the standard errors of ct\ are 1.71 and 1.64, respectively. Also in Table 5.10, using the quasi-Newton method, the standard error of 0:3 is 0.94, but using the glm() function or the Newton-Raphson method, the standard errors of 03 are 0.52 and 0.52, respectively. 36 Table 5.4: Method Comparison under Case 1 oil value Std. Error /?ii value Std. Error loglikelihood glm() function -3.2058 0.1745 1.4141 1.0941 Newton-Raphson -3.2059 0.1749 1.4141 1.0942 -146.5499 quasi-Newton -3.2059 0.1748 1.4140 1.0108 -146.5495 3. The computational time for using the Newton-Raphson method to fit model (3.30) is much less than that for using the quasi-Newton method, but it requires more programming effort to compute second derivatives, first derivatives and a matrix inverse, etc. 4. The model (3.30) appears to fit adequately in all cases considered here, based on a check of the requirement 7,-j = 7,-,- using glm() function. One could do other standard checks for the adequacy of the logistic regressions (eg. Hosmer and Lemeshow, 1989). 5.4 Selection of Covariates Another point worth mentioning is that , as stated before, the clinical researchers or surgeons are only interested in those polychotomous responses indicating the patient's survival status and some complication variables which are related to the quality of life after surgery, and there is the question of how to choose the covariates to use. Generally speaking, only those covariates which show potential relation with the response variable are selected. In this section, two methods for selection of covariates are suggested. 37 Table 5.5: Method Comparison under Case 2 ai value Std. Error flu value Std. Error /?12 value Std. Error loglikelihood glm() function -9.6722 1.5862 0.0966 0.0224 1.8047 1.1188 Newton- Raphson -9.6723 1.5900 0.0966 0.0225 1.8048 1.1192 -135.1319 quasi-Newton -9.3933 1.5782 0.0926 0.0223 1.7787 1.0257 -135.1475 Table 5.6: Method Comparison under Case 3 «i value Std. Error a2 value Std. Error /?n value Std. Error #21 value Std. Error 712 value Std. Error 721 value Std. Error loglikelihood glm() function -3.7141 0.2509 -2.5473 0.1504 0.5764 0.3586 0.1043 0.2669 1.8547 0.3797 1.8546 0.3806 Newton- Raphson -3.7140 0.2536 -2.5474 0.1515 0.5756 0.3603 0.1043 0.2683 1.8547 0.3810 -392.3369 quasi-Newton -3.7140 0.2431 -2.5475 0.1409 0.5756 0.3335 0.1043 0.2587 1.8548 0.3542 -392.3369 38 Table 5.7: Method Comparison under Case 4 c*i value Std. Error a.?, value Std. Error /?n value Std. Error fi\2 value Std. Error #21 value Std. Error #22 value Std. Error 7i2 value Std. Error 721 value Std. Error loglikelihood glm() function -3.9967 0.2835 -2.5258 0.1530 0.5212 0.3667 1.5225 0.3997 0.1155 0.2672 -0.2900 0.4156 1.9178 0.3922 1.9173 0.3914 Newton- Raphson -3.9947 0.2825 -2.5251 0.1543 0.5175 0.3651 1.5214 0.3997 0.1116 0.2686 -0.2863 0.4186 1.9166 0.3921 -386.2089 quasi-Newton -3.9947 0.2721 -2.5252 0.1273 0.5175 0.3402 1.5214 0.3580 0.1116 0.1796 -0.2863 0.3217 1.9167 0.3641 -386.2089 39 Table 5.8: Method Comparison under Case 5 «i value Std. Error «2 value Std. Error «3 value Std. Error /?u value Std. Error /?i2 value Std. Error /?2i value Std. Error /?22 value Std. Error fci value Std. Error /332 value Std. Error glm() function -4.9749 0.4487 -3.0887 0.2194 -0.8458 0.0921 0.5686 0.3718 1.1336 0.4100 0.1149 0.2729 -0.5904 0.4215 -0.0081 0.1646 1.2974 0.2408 Newton- Raphson -4.9244 0.4439 -3.0889 0.2198 -0.8435 0.0920 0.5204 0.3693 1.0935 0.4104 0.1141 0.2723 -0.5917 0.4208 -0.0086 0.1646 1.2883 0.2409 quasi-Newton -4.9245 0.4485 -3.0890 0.2222 -0.8436 0.0924 0.5204 0.3714 1.0934 0.4129 0.1141 0.2731 -0.5917 0.4223 -0.0086 0.1669 1.2883 0.2420 40 Table 5.9: Method Comparison under Case 5 (continued) 712 value Std. Error 713 value Std. Error 723 value Std. Error 721 value Std. Error 73i value Std. Error 732 value Std. Error loglikelihood glm() function 1.5391 0.4038 1.8039 0.4652 1.2493 0.2660 1.4963 0.4039 1.8062 0.4655 1.2615 0.2661 Newton- Raphson 1.4974 0.4040 1.7854 0.4712 1.2503 0.2662 -924.4926 quasi-Newton 1.4974 0.4079 1.7855 0.4767 1.2503 0.2692 -924.4912 41 Table 5.10: Method Comparison under Case 6 ai value Std. Error a.1 value Std. Error 0:3 value Std. Error c*4 value Std. Error /?n value Std. Error /?i2 value Std. Error /?i3 value Std. Error /?2i value Std. Error /?22 value Std. Error /?23 value Std. Error glm() function -10.3369 1.7149 -8.0159 1.0736 -1.0627 0.5241 -3.2307 0.8946 0.0800 0.0242 0.2847 0.3843 0.6403 0.2373 0.0770 0.01554 -0.1451 0.2807 -0.3320 0.1887 Newton- Raphson -10.1998 1.6367 -7.9365 1.0620 -1.0352 0.5186 -3.1701 0.8842 0.0752 0.0230 0.2913 0.3753 0.9493 0.2790 0.0758 0.0154 -0.1271 0.2793 -0.3429 0.2012 quasi-Newton -10.1091 1.0864 -7.7579 1.0129 -0.9465 0.9363 -3.0885 0.9967 0.0739 0.0181 0.3136 0.7905 0.9395 0.3493 0.0732 0.0157 -0.1166 0.3682 -0.3420 0.3460 42 Table 5.11: Method Comparison under Case 6 (continued) #31 value Std. Error #32 value Std. Error /333 value Std. Error #41 value Std. Error #42 value Std. Error #43 value Std. Error 712 value Std. Error 713 value Std. Error 714 value Std. Error glm() function 0.0021 0.0084 0.1598 0.1890 -1.6840 0.1331 -0.0008 0.0141 -0.2366 0.3350 -0.4165 0.2195 1.2000 0.4157 2.3946 0.4760 0.1531 0.6662 Newton-Raphson 0.0014 0.0083 0.1603 0.1888 -1.6642 0.1313 -0.0018 0.0139 -0.2292 0.3342 -0.4903 0.2291 1.1181 0.4216 2.7162 0.5087 0.0198 0.6542 quasi-Newton 0.0001 0.0148 0.1629 0.2271 -1.6608 0.1411 -0.0027 0.0165 -0.2357 0.9272 -0.5051 0.3361 1.1284 0.9704 2.7072 0.5792 0.0403 0.9719 43 Table 5.12: Method Comparison under Case 6 (continued) 723 value Std. Error 724 value Std. Error 734 value Std. Error 721 value Std. Error 73i value Std. Error 732 value Std. Error 741 value Std. Error 742 value Std. Error 743 value Std. Error loglikelihood glm() function 1.0296 0.3040 -0.2330 0.4875 1.1814 0.3862 1.0650 0.4304 2.8826 0.5216 1.0161 0.3135 -0.0134 0.6564 -0.170 0.4762 1.1795 0.3533 Newton-Raphson 0.9935 0.3076 -0.1803 0.4765 1.1522 0.3446 -994.3497 quasi-Newton 0.9925 0.6812 -0.1831 0.9951 1.114 0.6171 -994.4018 44 5.4.1 Odds Ratio Analysis When the variables are binary, a natural way to measure the association is the odds ratio (as well as its logarithm). Consider two binary variables Y\ and Y2, where Y{ is coded 1 and 0. Let pij = Pr(Yi = i,Y2 — j), i,j = 0 ,1 , so that the odds ratio, ip, is defined as: , PooPn v = PoiPw From a random sample of size N, we generate the following 2 x 2 contingency table with r2 = o r2 = i Y1 = O r1 = i a b c d The estimated success probabilities in the two data sets are poo = a/N, plo = b/N, Poi — c/N, and p\\ = d/N, and so the estimated odds ratio, /^>, is given by 1 _ PooP\\ _ ad PwPoi be An interval estimate or confidence interval (CI) of the odds ratio is more useful than a point estimate for measuring the association. Also we note that the logarithm of the estimated odds ratio is better approximated by a normal distribution. The approximate standard error of the estimated log odds ratio, log 0 , can be shown to be given by s.e.(logip) » i / ( - + - + - + - ) V a b c a An outline proof of this result is given in Schlesselman (1982). An approximate 100(1 — a)% confidence interval for xjj is exp[log tj> ± za/2s.e.(log 0)] where Zp is the upper /3 quantile of the standard normal distribution (see Hosmer and Lemeshow 1989). 45 Table 5.13, Table 5.14 and Table 5.15 provide the odds ratios for response variable REM (V21), NEM (V22) and PUM (V23) respectively. For more detailed information, see Zhang (1993). Statistically, only those 95% CI not containing 1 are more strongly related with the variable. Using odds ratio analysis, we conclude that covariates REN (V7), LIV (VI1), CNS (V12), COP (V5), DIA (V6), and HTN (V8) are strongly related with the variable REM (V21); covariates CNS (V12), CA (V10), OBE (V4), REN (V7), and HTN (V8) are strongly related with the variable NEM (V22); covariates COP (V5), OBE (V4), REN (V7), ETO (V9), CNS (V12), DIA (V6), and HTN (V8) are strongly related with the variable PUM (V23). 5.4.2 Univariate Analysis and Comparison of Models An alternative for deciding the covariates to use is based on the differences of log-likelihoods. The difference in the log-likelihoods of two nested models measures the extent to which the additional terms improve the fit of the model to the observed re-sponse variables. Take model (3.30) which has n binary response variables, for example. Suppose that two linear logistic models, model (1) and model (2), say, are to be com-pared, where the two models are as follows: • Model (1): / o ^ P r ( y ; = % - , j ^ »,*,-*, 1 < k < m - 1) = a, + E?J? PijXik + • Model (2): logit?i{Yi = l\yjtj ^ i,xik,l < k < m) = a,- + E 7 = i ^ a + 46 Table 5.13: Odds Ratio for REM Variable V2 V4 V5 V6 V7 V8 V9 Vl l V12 V13 V14 V15 V17 Heading SEX OBE COP DIA REN HTN ETO LIV CNS PCA RHE OTH VAL $ 1.48 1.13 1.86 1.80 9.78 1.75 1.34 3.58 3.40 1.33 0.88 0.90 0.64 s.e.(log^) 0.21 0.37 0.26 0.24 0.22 0.20 0.47 0.62 0.33 0.60 0.59 0.52 1.01 95 % CI of ^ (0.97, 2.26) (0.54, 2.37) (1.11,3.11) (1.11, 2.91) (6.30, 15.2) (1.18, 2.62) (0.53, 3.95) (1.04, 12.3) (1.75, 6.61) (0.40, 4.34) (0.27, 2.85) (0.32, 2.50) (0.08, 4.76) Remark positively related positively related positively related positively related positively related positively related 47 Table 5.14: Odds Ratio for NEM Variable V2 V4 V5 V6 V7 V8 V9 V10 Vl l V12 V13 V14 V15 V17 Heading SEX OBE COP DIA REN HTN ETO CA LIV CNS PCA RHE OTH VAL rP 1.34 2.11 1.22 1.38 1.62 1.49 1.40 2.96 2.20 3.92 2.02 0.81 0.94 0.58 s.e.(log^) 0.15 0.21 0.21 0.18 0.23 0.14 0.32 0.43 0.55 0.24 0.37 0.43 0.35 0.73 95 % CI of $ (0.99, 1.81) (1.37, 3.25) (0.80, 1.84) (0.96, 1.98) (1.02, 2.57) (1.12, 1.97) (0.73, 2.68) (1.26, 6.93) (0.74, 6.53) (2.41, 6.39) (0.97, 4.19) (0.34, 1.88) (0.47, 1.90) (0.14, 2.46) Remark positively related positively related positively related positively related positively related 48 Table 5.15: Odds Ratio for PUM Variable V2 V4 V5 V6 V7 V8 V9 V10 V12 V13 V14 V15 V17 Heading SEX OBE COP DIA REN HTN ETO CA CNS PCA RHE OTH VAL I 1.08 3.38 3.45 1.91 3.32 1.86 2.21 0.56 2.12 0.67 0.97 0.58 1.03 s.e.(log^) 0.09 0.16 0.13 0.11 0.16 0.08 0.21 0.41 0.21 0.29 0.23 0.22 0.34 95 % CI of $ (0.90, 1.30) (2.45, 4.66) (2.66, 4.48) (1.52, 2.40) (2.41, 4.58) (1.57, 2.20) (1.46, 3.36) (0.25, 1.25) (1.39, 3.21) (0.37, 0.91) (0.61, 1.54) (0.37, 0.91) (0.52, 2.01) Remark positively related positively related positively related positively related positively related positively related positively related 49 Denote the maximized log-likelihoods under model (1) and model (2) by L\ and Li, respectively, so that twice the difference in the log-likelihoods of two models is 2[L2 - Lx] Asymptotically (large sample size N), the distribution of this statistic has a x2 distri-bution with degrees of freedom v\ — v-i equal to the difference in the number of the parameters between two models (see Cox and Hinkley (1974)). In our example, since we have n binary response variables, that is, v-± — v-j, = n when adding an extra covariate xm to model (1) which already has covariates x\, x2,..., arm-i- Using the difference between the log-likehood functions of two models, we can find out how much the covariate xm improves the fit of the model to the observed response variables. Other methods for selection of covariates will be introduced in the next section. 5.5 Other Issues Before concluding this chapter, another question to discuss is how much the association of the response variables depends on the covariates. This is one inference of interest. Another reason that this issue arises is that some methods, for example, Generalized Estimating Equations, seem to assume the correlation of two response variables does not depend on covariates. What is the case for our data? 5.5.1 General As we know, with binary variables, the odds ratio (as well as its logarithm) is a natural measure of association. Considering the 2 dimensional case for model (3.30) where there exists only two binary response variables, the odds ratio is P r ( l , l [x) Pr(0, Q|x) _ exp[(Q l + ftx) + (a 2 + ftx) + 712] _ ( . P r ( l , 0|x) Pr(0, l | x ) e x p ( a i + frx) exp(a2 + ftx) e x P ^ 1 2 ^ 50 Hence, the logarithm of the odds ratio is 712 and it does not depend on the covariates. However, it is not true that the odds ratio does not depend on the covariates when the number of binary response variables is greater than 2. Take the 3 dimensional case in model (3.30), for a particular instance. Based on model (3.30), the joint distribution of Yi and Yj is Pr(y«>yj>y*lx) Pr(y,-,yj|x) = Pr(yk\yi,yj,x) c'1 exp{Ef = i (a , + E t e i Pitxt)yj + Ei<i<j<3 ajViVj} exp{(afc + Efc i PktXt + HkVi + 1jkyj)yk}/l + exp(afc + E£Li PktXt + 7,-*y,- + Jjkyj) m = c _ 1 [1 + exp(afc + J2 PktXt + HkVi + IjkVj)] t=\ m m • exp{(o!,- + ^2PitXt)yi + ( a , + J2Pjtxt)Vj + 7.j2/»2/j} t=i t=i for (t, j , k), where c = 1 + E L i exp(a,- + E £ i #***) + E i < , ^ < 3 exp[(at- + a,-) + ET=i{Pit + Pit)xt + tij] + exp[E-= 1 « . + ES=i Ef=i Pitxt + Ef=i 7tf]-As a result, the odds ratio for Yi and 1} is Pr(Yj = 0,Yj= 0|x) Pr(y;- = 1, Y3•, = l |x ) Fv(Yi = 0,Yj = l | x ) P r ( X = 1,^- = 0|x) [1 + exp(ak + E*=i &<**)] [1 + exp(afc + E £ i #w*t + 7*i + 7*i)] exp(7.i)(5.1) [1 + exp(a* + E £ i PktXt + 7 H ) ] [ 1 + e x P ( « * + E £ i PktXt + Jkj)] This form implies that when there exist three binary response variables in model (3.30), the odds ratio for arbitrary two binary response variables indeed depends on the covari-ates x. Using similar steps as above, we can get the same conclusions for the case of more than three binary response variables in model (3.30). Next, we take two examples to illustrate how the association of binary response variables varies with covariates for our data. In each example, we will separately consider model (3.30) with 51 • one covariate: COP (V5); • two covariates: COP (V5), and CNS (V12); • three covariates: COP (V5), CNS (V12), and REN (V7); • four covariates: COP (V5), CNS (V12), REN (V7), and HTN (V8); • five covariates: COP (V5), CNS (V12), REN (V7), HTN (V8), and OBE (V4). 5.6 Examples 5.6.1 Example 1 In our first example, only two complication variables, REM (V21) and NEM (V22) are considered as the response variables in model (3.30). After using the Newton-Raphson method to fit model (3.30), the log-odds ratio for REM and NEM is simply equal to the estimate value of the parameter 712. Table 5.16 presents the log-odds ratio for REM, NEM and its standard error (showed in parenthesis) under different number of covariates. From Table 5.16, we find that the log-odds ratio for REM and NEM decreases when we add an extra covariate. As a matter of fact, we say the association of two binary response variables varies with covariates, but it is not significantly so when considering the standard errors. Considering the confidence interval of the odds ratio for REM and NEM, we find that all 95% confidence intervals under the above five cases do not contain 1 (see Table 5.17). This implies that variable REM and variable NEM are strongly related each other whatever covariates we choose. Now let us return to the topic of Section 5.4. We mentioned that the odds ratio analysis and another approach based on the differences of log-likehhoods, can suggest which 52 Table 5.16: The Association of REM (V21) and NEM (V22) Covariates C0P(V5) COP CNS(V12) COP CNS REN(V7) COP CNS REN HTN(V8) COP CNS REN HTN OBE(V4) Log-Odds Ratio 1.9296 1.7422 1.7005 1.6938 1.6842 Std. Error 0.3909 0.4070 0.4180 0.4184 0.4219 Table 5.17: CI of Odds Ratio for REM (V21) and NEM (V22) Covariates COP(V5) COP CNS(V12) COP CNS REN(V7) COP CNS REN HTN(V8) COP CNS REN HTN OBE(V4) 95% CI of odds ratio (3.2010, 14.8166) (2.5715, 12.6785) (2.4139, 12.4257) (2.3959, 12.3524) (2.3568, 12.3186) 53 covariates to use. Here introduce two related methods: the first method is to use the ratios of the estimates of the coefficients for the covariates to their standard errors: (3/s.e.(p); the second uses differences in the values of log-likelihood function to assess covariates. For different number of covariates, Table 5.18 and Table 5.19 list the estimates of the coefficients for the covariates (/?'s), corresponding standard errors and their ratios when applying the Newton-Raphson method to fit model (3.30). Statistically, if the absolute value of a ratio is large than 1.96, this might suggest that we need to consider the variable of this coefficient as our covariate in model (3.30). After checking the ratios in Table 5.18 and Table 5.19, we conclude 1. Among the five covariates: COP (V5), CNS (V12), REN (V7), HTN (V8) and OBE (V4), covariate COP and covariate REN are most important for response variable REM. 2. For response variable NEM, covariate CNS and covariate OBE are more important than other three covariates: COP, REN and HTN. 3. Covariate HTN seems to be less important than other covariates for both response variable REM and response variable NEM. HTN appears not be significant when several covariates are included, possibly because it is strongly associated with the other covariates. In last section, we know, from Table 5.13 and Table 5.14, that of the five covariates, four of them: COP, CNS, REN and HTN are individually strongly related with the response variable REN; four of them: CNS, REN, HTN and OBE are individually strongly related with the response variable NEM. Comparing with the conclusion from checking the ratios 54 Table 5.18: The Estimates of the Coefficients under All Five Cases Fi=REM COP (V5) /?u Std. Error Ratio CNS (V12) fa Std. Error Ratio REN (V7) 0i3 Std. Error Ratio HTN (V8) 0i4 Std. Error Ratio OBE (V4) 0i5 Std. Error Ratio Number of Covariates 1 2 3 4 5 1.5516 1.5344 1.4579 1.4532 1.4644 0.3982 0.4021 0.4128 0.4119 0.4110 3.8965 3.8160 3.5317 3.5280 3.5630 1.2536 0.9634 0.9432 0.9569 0.5372 0.5666 0.5668 0.5637 2.3336 1.7003 1.6641 1.6975 1.5753 1.5471 1.5515 0.4242 0.4286 0.4272 3.7136 3.6097 3.6318 0.1658 0.1567 0.3705 0.3722 0.4475 0.4210 0.1593 0.5358 0.2973 55 Table 5.19: The Estimates of the Coefficients under All Five Cases (continued) F2=NEM COP (V5) 02 i Std. Error Ratio CNS (V12) 022 Std. Error Ratio REN (V7) 023 Std. Error Ratio HTN (V8) 024 Std. Error Ratio OBE (V4) 025 Std. Error Ratio Number of Covariates 1 2 3 4 5 -0.2800 -0.3023 -0.3144 -0.3228 -0.3278 0.4183 0.4251 0.4270 0.4258 0.4240 0.6694 -0.7111 -0.7363 -0.7581 -0.7731 1.5045 1.4802 1.4662 1.4477 0.4082 0.4121 0.4130 0.4161 3.6857 3.5918 3.5501 3.4792 0.1909 0.1489 0.1158 0.4117 0.4154 0.4155 0.4637 0.3584 0.2787 0.2050 0.1244 0.2545 0.2582 0.8055 0.4818 0.9050 0.3432 2.6369 56 Table 5.20: The Log-likelihood Functions with Different Number of Covariates Covariates COP(V5) COP CNS(V12) COP CNS REN(V7) COP CNS REN HTN(V8) COP CNS REN HTN OBE(V4) Log-likelihood -387.4225 -377.1679 -370.5885 -370.0956 -366.7212 in Table 5.18 and Table 5.19, partly we can say that to choose covariates, checking the ratios of the estimates of the coefficients for the covariates to their standard errors is an additional method to the odds ratio analysis. In the meanwhile, we also calculate the log-likelihood function when response variables are two complication variables, REM (V21) and NEM (V22), and covariates are one of the above five cases (see Table 5.20). Table 5.20 shows that the log-likelihood function increases when we add an extra covariate. Especially, we find that when we add covariate HTN to covariates COP, CNS and REN, the log-likelihood function only increases by 0.4929, less than other increase. Again, this may suggest that covariate HTN is less important than other covariates for both response variables. To conclude, using multivariate binary model (3.30), we can do the risk factor analysis for several response variables (or complication variables) simultaneously. 5.6.2 Example 2 Now, we briefly discuss another example for checking the 3 dimensional case where there exist three binary response variables, REM (V21), NEM (V22), and PUM (V23) in model (3.30). 57 Table 5.21: The Log-likelihood Functions of Three Response Variables with Different Number of Covariates Covariates C0P(V5) C0P(V5) CNS(V12) C0P(V5) CNS(V12) REN(V7) C0P(V5) CNS(V12) REN(V7) HTN(V8) C0P(V5) CNS(V12) REN(V7) HTN(V8) 0BE(V4) Log-likelihood -925.7063 -910.6644 -889.7837 -882.5742 -873.7382 Difference 15.0419 20.8807 7.2095 8.8360 In Table 5.21, we present the values of log-likelihoods under different number of co-variates when using the Newton-Raphson method and the differences of log-likelihoods when adding another covariate to previous case. Clearly, as previous example the log-likelihood function increases when we add an extra covariate. Considering differences, it seems that the last two covariates HTN and OBE are not more important than other covariates, but comparing with x\i they are still statistically significant. We choose the third case: three covariates COP, CNS and REN for our final analysis. Similar to Table 5.18 and Table 5.19, Table 5.22 deals with three binary response variable REM, NEM, PUM and three covariates COP, CNS, REN in model (3.30). Checking the ratios in Table 5.22, we say that among these three covariates, CNS is not significant for binary response variable REM; COP and REN is not significant for binary response variable NEM. After using the Newton-Raphson method to fit model (3.30) for the third case, in ad-dition to the estimates of the coefficients listed in Table 5.22, we also obtained other estimates of the parameters in model (3.30). Therefore, through equation (5.1) the 58 Table 5.22: The Estimates of the Coefficients under the Third Case COP (V5) CNS (V12) REN (V7) Coefficient Std. Error Ratio Coefficient Std. Error Ratio Coefficient Std. Error Ratio REM (V21) 1.1318 0.4189 2.7018 0.7580 0.5554 1.3648 1.2210 0.4280 2.8528 NEM (V22) -0.5786 0.4254 -1.3601 1.2370 0.4180 2.9593 -0.1104 0.4110 0.2686 PUM (V23) 1.3013 0.2447 5.3179 1.1185 0.4131 2.7076 1.5234 0.2988 5.0984 marginal odds ratio for REM and NEM is the form of [1 + exp(c*3 + E L i fcxj)][l + exp(a 3 + £ L i fax, + 7w + 723)] exp(7i2) [1 + exp(a 3 + Ef=i fax; + 713)] [1 + exp(a 3 + £?=i fax, + 723)] 1 + exp(-0.9899 + 1.3013si + 1.1185s2 + l-5234x3) 6 X P^ * ' ' 1 + exp(-0.9899 + 1.3013*1 + 1.1185x2 + 1.5234x3 + 1.5186) 1 + exp(-0.9899 + 1.3013si + 1.1185x2 + 1.5234x3 + 1.5186 + 1.1775) 1 + exp(-0.9899 + 1.3013a?! + 1.1185x2 + 1.5234x3 + 1.1775) 1 + exp(-0.9899 + 1.3013ai + 1.1185x2 + 1.5234x3) 1 + exp(1.7062 + 1.3013X! + 1.1185x2 + 1.5234x3) 1 + exp(0.1687 -f 1.3013X! + 1.1185x2 + 1.5234x3) " 1 + exp(0.1876 + 1.3013X! + 1.1185x2 + 1.5234ar3) ' ' (5.2) As a function of the covariates COP, CNS and REN, the marginal odds ratio for REM and NEM varies with the covariates for model (3.30). In fact, the association of the response variables depends strongly on the covariates in this 3 dimensional case. From Table 5.1 and Table 5.2, we know all covariates COP, CNS and REN are binary variables. Substituting different values of the covariates for equation (5.2), we give Table 5.23 to show the changes of the marginal odds ratio for REM and NEM. From 59 Table 5.23: The Marginal Odds Ratio for REM and NEM COP (V5) CNS (V12) REN (V7) Xx = 0 X2 = 0 x 3 = 0 Xx = 1 X2 = 0 x 3 = 0 Xx = 0 X2 = 1 x 3 = 0 xx = 0 x2 = 0 x3 = 1 xx = 1 x2 = 1 x3 = 0 Xx = 1 X2 = 0 x 3 = 1 xx = 0 x2 = 1 x3 = 1 Xx = 1 X2 = 1 X3 = 1 Marginal Odds Ratio 0.8490 0.4464 0.4800 0.4124 0.3291 0.3094 0.3173 0.2826 these marginal odds ratios in Table 5.23, we conclude that of three covariates, the most important covariate for the changes of the marginal odds ratio for REM and NEM is REN, next is COP. 60 Bibliography [1] Arnold, Barry C , Castillo, Enrique and Sarabia, Jose-Maria (1992). Conditionally Specified Distributions. Springer-Verlag. [2] Arnold, B.C. and Strauss, D. (1988). Bivariate distributions with exponential con-ditionals. JASA 83, 522-527. [3] Bonney, George Ebow (1987). Logistic regression for dependent binary observations. Biometrics 43, 951-973. [4] Chambers, John M., Hastie, Trevor J. and AT&T Bell Laboratories (1992). Statis-tical Models in S. Wadsworth & Brooks / Cole Advanced Books & Software Pacific Grove, California. [5] Collett, D. (1994). Modelling Binary Data. Chapman & Hall. [6] Nash, J C (1990). Compact Numerical Methods For Computers Linear Algebra and Function Minimisation. Second Edition. Adam Hipger, Bristol and New York. [7] Connolly, Margaret A. and Liang, Kung-Yee (1988). Conditional logistic regression models for correlated binary data. Biometrika 75, 3, 501-506. [8] Cox, D.R. and Hinkley, D.V. (1974). Theoretical Statistics. Chapman Hall, London. [9] Gelman, A. and Speed, T.P. (1993). Characterizing a joint probability distribution by conditionals. J. Roy. Statist. Soc. B 55 , 185-188. 61 [10] Hosmer, D.W. and Lemeshow, S. (1989). Applied Logistic Regression. Wiley, New York. [11] McCullagh, P., and Nelder, J.A. (1989). Generalized Linear Models. Second Edi-tion. Chapman Hall, London. [12] Press, W.H., Flannery, B.P., Teukolsky, S.A. and Vetterling, W.T. (1988). Nu-merical Recipes in C: the Art of Scientific Computing. Cambridge University Press, Cambridge, Eng. [13] Schlesselman, J.J. (1982). Case-Control Studies: Design, Conduct, Analysis. Oxford University Press, New York. [14] Zhang, Hongbin (1993). Risk prediction models for binary response variable for the coronary bypass operation (M.Sc. Thesis). Dept. of Statistics, University of British Columbia. 62 Appendix C Program for Applying the Newton-Raphson Method # i n c l u d e < s t d i o . h > # i n c l u d e <math.h> #de f ine N 900 # d e f i n e NP 150 # d e f i n e NC 20 ma inO { double t [NP] [NP] ,h[NP] [NP] , s [N] [NP] ,x[N] [NC] ,y[N] [NC] ; double c [N] , lm[NP] , suf [NP] ; double dif,mx,prod,sum,lik,logc; int iter,i,j,k,d,n,m,nn,kf,ii,M,Ml,jj[6]; double **a,**b,**dmatrix(); FILE *fp; void d2b(),gaussj(); if ((fp = fopen("name,,,"r")) »- NULL) { printf("file not found\n"); exit(); } scanf("y.d */.d '/.d", &n, &m, &nn) ; /* n is dimension of binary response variables m is dimension of covariates nn is the number of observation M=n+n*m+n*(n-l)/2; M1=M+1; /* M is the number of parameters Next step is to get initial values of the parameters 63 and read values for binary response variables and covariates from a file "name" f or(i=l; i<=M; i++) f scanf (fp, '7.F" ,&lm[i] ) ; for(i=l;i<=M;i++) printf ('7.6.3f", lm[i]); printf ("\n") ; for(k=l;k<=nn;k++) { for(i=l;i<=m;i++) f scanf (fp,ny.F",&x[k] [i]) ; for(j=l;j<=n;j++) fscanf (fp,"'/.F",&y[k] [j]); } printf("\n"); for(i=l,kf=l;i<=n;i++) kf*=2; printf ("n=y.d, m=y.d, nn=y.d, M=*/,d, kf=y.d\n", n,m,nn,M,kf) ; for(i=l;i<=M;i++) suf[i]=0.; /* Next, calculate sufficient statistics — suf */ for(k=l;k<=nn;k++) { for(i=l;i<=n;i++) { s[k][(i-l)*m+i]=y[k][i]; for(j=l;j<=m;j++) s[k] [(i-l)*m+i+j]=y[k] [i]*x[k] [j] ; } fo r ( i= l ; i<=n- l ; i++) { for( j=i+l ; j<=n;j++) s [k] [n+n*m+n*( i - l ) - i* ( i - l ) /2+j - i ] - y [ k ] [ i ] * y [ k ] [ j ] ; } } for(i=l;i<=M;i++) { for(k=l;k<=nn;k++) su f [ i ]+=s[k] [ i ] ; pr in t f ( '7 .8 .4f" , s u f [ i ] ) ; if(i*/.10==0) p r in t f ( " \n" ) ; 64 } p r i n t f ( " \ n " ) ; * / d i f = l . ; i t e r=0 ; a = dmatrix(l ,M,l ,M); b = d m a t r i x ( l , M , l , l ) ; while( i ter<30 && d i f> l . e -4 ) { for(i=l;i<=M;i++) { for(j=l;j<=Ml;j++) h [ i ] [ j ] = 0 . 0 ; } for(k=l;k<=nn;k++) { c[k]=0. ; for(i=l;i<=M;i++) { for(j=l;j<=Ml;j++) t [ i ] [ j ]=0.0; } /* c a l l s rou t ine , d 2 b ( n , i i , j j ) : r e tu rns poss ib le binary vector j j of dimension n fo r ( i i= l ; i i<=kf ; i i++ ) { d 2 b ( n , i i , j j ) ; for( i=l ; i<=n; i++) •C s [k] [ i+( i - l )*m] = ( d o u b l e ) j j [ i ] ; for(j=l;j<=m;j++) s [k][ ( i - l )*m+i+j] = ( d o u b l e ) ( j j [ i ] ) * x [ k ] [ j ] ; } fo r ( i= l ; i<=n- l ; i++) { for( j=i+l ; j<=n;j++) s [k] [n+n*m+n*( i - l ) - i* ( i - l ) /2+j - i ] = ( d o u b l e ) j j [ i ] * j j [ j ] ; } for(i=l,sum=0.;i<=M;i++) sum+=lm[i]*s[k][i]; prod=exp(sum); c[k]+=prod; for(i=l;i<=M;i++) t [ i ] [Ml]+=s[k] [i]*prod; for(i=l;i<=M;i++) { for(j=l;j<=M;j++) t [ i ] [j]+=prod*s[k] [ i ]*s[k] [ j ] ; } 65 } for(i=l;i<=M;i++) t [ i ] [Ml]/=c[k] ; for(i=l;i<=M;i++) { for(j=l;j<=M;j++) t [ i ] [ j ] - t [ i ] [Ml]*t [ j] [Ml]-t [ i ] [ j ] / c [k ] ; } for(i=l;i<=M;i++) { for( j - i ; j<-M+l; j++) h [ i ] [ j ] + - t [ i ] [j] ; } } for(i=l;i<=M;i++) { for( j«l ; j<-M;j++) a [ i ] [ j ] - h [ i ] [ j] ; b [ i ] [ l ] = s u f [ i ] - h [ i ] [ M l ] ; } /* c a l l s rou t ine , gaussj (a ,M,b, l ) for solving a l i n e a r system */ gauss j ( a ,M,b , l ) ; for(i=l ,dif=0.; i<=M;i++) { mx=fabs(b[ i ] [ l ] ) ; if(mx>dif) dif=mx; } for(i=l;i<=M;i++) l m [ i ] - = b [ i ] [ 1 ] ; i te r++; if (iter'/.l==0) { for(i=l;i<=M;i++) { printf("lm= ,/.8.4f", l m [ i ] ) ; if(i*/.10==0) p r in t f ( " \n" ) ; } p r i n t f ("iter=*/.4d\n", i t e r ) ; } } i f ( i te r>=30) p r i n t f ( " d i d not converge\n"); for(i=l;i<=M;i++) { p r in t f ('7.8.4f */.8.4f 7.8.4f \n" , lm[ i ] , s q r t ( - a [ i ] [ i ] ) , l m [ i ] / s q r t ( - a [ i ] [ i ] ) ) ; 66 } printf ("iter='/.4d\n",iter); for(lik=0.,i=l;i<=M;i++) lik+=suf[i]*lm[i]; for(i=l,logc=0.;i<=nn;i++) logc+=log(c[i]); lik-=logc; printf("loglikelihood= y.9.4f\n", lik); } void d2b(n,ii,jj) int n,ii,jj[] ; { int i,d; d=ii-l; for(i=n;i>=l;i—) { jj[i]=d'/.2; d=d/2; } 67
- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Theses and Dissertations /
- A model for multivariate binary data with covariates...
Open Collections
UBC Theses and Dissertations
Featured Collection
UBC Theses and Dissertations
A model for multivariate binary data with covariates based on compatible conditionally specified logistic… Liu, Ying 1994
pdf
Page Metadata
Item Metadata
Title | A model for multivariate binary data with covariates based on compatible conditionally specified logistic regressions |
Creator |
Liu, Ying |
Date Issued | 1994 |
Description | Rather than construction of a multivariate distribution from given univariate or bivariate margins, recently several papers seek to promote the development and usage of a simple but relatively unknown approach to the specification of models for dependent binary outcomes through conditional probabilities, each of which is assumed to be logistic. These recent proposals were all offered as heuristic approaches to specifying a multivariate distribution capable of representing the dependence of binary outcomes. However, they are limited in scope, for they all describe some special patterns of dependence. This thesis is concerned with a model for a multivariate binary response with covariates based on compatible conditionally specified logistic regressions. With this model, we allow for a general dependence structure for the binary outcomes. Three likelihood-based computing methods are introduced to estimate the parameters in our model. An example on the coronary bypass surgery is presented for illustration. |
Extent | 2135026 bytes |
Genre |
Thesis/Dissertation |
Type |
Text |
FileFormat | application/pdf |
Language | eng |
Date Available | 2009-03-03 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
DOI | 10.14288/1.0087512 |
URI | http://hdl.handle.net/2429/5380 |
Degree |
Master of Science - MSc |
Program |
Statistics |
Affiliation |
Science, Faculty of Statistics, Department of |
Degree Grantor | University of British Columbia |
GraduationDate | 1994-11 |
Campus |
UBCV |
Scholarly Level | Graduate |
AggregatedSourceRepository | DSpace |
Download
- Media
- 831-ubc_1994-0492.pdf [ 2.04MB ]
- Metadata
- JSON: 831-1.0087512.json
- JSON-LD: 831-1.0087512-ld.json
- RDF/XML (Pretty): 831-1.0087512-rdf.xml
- RDF/JSON: 831-1.0087512-rdf.json
- Turtle: 831-1.0087512-turtle.txt
- N-Triples: 831-1.0087512-rdf-ntriples.txt
- Original Record: 831-1.0087512-source.json
- Full Text
- 831-1.0087512-fulltext.txt
- Citation
- 831-1.0087512.ris
Full Text
Cite
Citation Scheme:
Usage Statistics
Share
Embed
Customize your widget with the following options, then copy and paste the code below into the HTML
of your page to embed this item in your website.
<div id="ubcOpenCollectionsWidgetDisplay">
<script id="ubcOpenCollectionsWidget"
src="{[{embed.src}]}"
data-item="{[{embed.item}]}"
data-collection="{[{embed.collection}]}"
data-metadata="{[{embed.showMetadata}]}"
data-width="{[{embed.width}]}"
async >
</script>
</div>
Our image viewer uses the IIIF 2.0 standard.
To load this item in other compatible viewers, use this url:
http://iiif.library.ubc.ca/presentation/dsp.831.1-0087512/manifest