A Model for Multivariate Binary Data with Covariates Based on Compatible Conditionally Specified Logistic Regressions by Ying Liu B.Sc, Hangzhou Normal University, 1985 M.Sc, Shanghai University of Finance and Economics, 1991 A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in THE FACULTY OF GRADUATE STUDIES Department of Statistics We accept this thesis as conforming to the required standard THE UNIVERSITY OF BRITISH COLUMBIA August 1994 ©Ying Liu, 1994 In presenting this thesis in partial fulfillment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission. Department of J> la>t< 5 ^< e s The University of British Columbia Vancouver, Canada Date Auj. 30 . /9?^: Abstract Rather than construction of a multivariate distribution from given univariate or bivariate margins, recently several papers seek to promote the development and usage of a simple but relatively unknown approach to the specification of models for dependent binary outcomes through conditional probabilities, each of which is assumed to be logistic. These recent proposals were all offered as heuristic approaches to specifying a multivariate distribution capable of representing the dependence of binary outcomes. However, they are limited in scope, for they all describe some special patterns of dependence. This thesis is concerned with a model for a multivariate binary response with covariates based on compatible conditionally specified logistic regressions. With this model, we allow for a general dependence structure for the binary outcomes. Three likelihood-based computing methods are introduced to estimate the parameters in our model. An example on the coronary bypass surgery is presented for illustration. ii Contents Abstract ii Table of Contents iii List of Tables v Acknowledgement vii 1 Introduction 2 1 Compatibility Conditions for Conditionally Specified Multivariate Distributions 5 2.1 Compatibility in 2 Dimensions 6 2.2 Compatibility in 3 Dimensions 7 3 Models for Multivariate Binary Responses with Covariates 3.1 4 9 Conditions for Compatibility in 2 or 3 Dimensions 10 3.1.1 2 Dimensional Case 10 3.1.2 3 Dimensional Case 14 3.2 General Case 20 3.3 Further Extension 23 Comparisons with Other Models 26 4.1 26 Regressive Logistic Models in 4.2 Conditional Logistic Regression Models for Exchangeable Binary Data . . 28 5 Data Analysis and Computing 30 5.1 The Data Source and Description 30 5.2 Estimation Procedure 31 5.3 Comparison of Three Methods 35 5.4 Selection of Covariates 37 5.4.1 Odds Ratio Analysis 45 5.4.2 Univariate Analysis and Comparison of Models 46 5.5 5.6 Other Issues 50 5.5.1 50 General Examples 52 5.6.1 Example 1 52 5.6.2 Example 2 57 Appendix: C Program for Applying the Newton-Raphson Method IV 62 List of Tables 5.1 Description for the Subset of MCR Data 32 5.2 Description for the Subset of MCR Data (continued) 33 5.3 Results from Separate Logistic Regressions with Response Variables REM, NEM 34 5.4 Method Comparison under Case 1 37 5.5 Method Comparison under Case 2 38 5.6 Method Comparison under Case 3 38 5.7 Method Comparison under Case 4 39 5.8 Method Comparison under Case 5 40 5.9 Method Comparison under Case 5 (continued) 41 5.10 Method Comparison under Case 6 42 5.11 Method Comparison under Case 6 (continued) 43 5.12 Method Comparison under Case 6 (continued) 44 5.13 Odds Ratio for REM 47 5.14 Odds Ratio for NEM 48 5.15 Odds Ratio for PUM 49 5.16 The Association of REM (V21) and NEM (V22) 53 5.17 CI of Odds Ratio for REM (V21) and NEM (V22) 53 5.18 The Estimates of the Coefficients under All Five Cases 55 5.19 The Estimates of the Coefficients under All Five Cases (continued) v ... 56 5.20 The Log-likelihood Functions with Different Number of Covariates ... 57 5.21 The Log-likelihood Functions of Three Response Variables with Different Number of Covariates 58 5.22 The Estimates of the Coefficients under the Third Case 59 5.23 T h e Marginal Odds Ratio for REM and NEM 60 VI Acknowledgements I would like to thank my supervisor, Dr. Harry Joe for his guidance, help and support throughout the development of this thesis. I would also like to thank Dr. Martin Puterman for his careful reading of the manuscript and valuable comments. Finally, I must acknowledge the support of UBC Department of Statistics during my Masters program. vii Chapter 1 Introduction This thesis is concerned with a model for a multivariate binary response with covariates. The model is based on compatible logistic regressions. Association of outcomes may be a factor of primary or secondary interest in models for binary data. When measures of association are of interest, models for the marginal expectation may not be appropriate. Rather than construction of a multivariate distribution from given univariate or bivariate margins, recently several papers seek to promote the development and usage of a simple but relatively unknown approach to the specification of models for dependent binary outcomes through conditional probabilities. For binary response data, logistic regression models are useful for modelling the conditional probabilities. Practically speaking, we frequently encounter the variety of circumstances in which several binary response variables are measured on associated explanatory variables. In such situations, interest may centre on modelling the multivariate response for analysis. Take two examples for illustration. The first example on coronary bypass surgery will be used in the thesis later. This example provides the pre-operation information such as patient's age, gender, etc. and the post-operation information such as patient's complications and survival status which are all given in the form of binary data. Since 1 we are interested in those outcome variables indicating the patient's complications and survival status after the surgery, we have a situation in which the analysis of a multivariate binary response with covariates is needed. A second example concerns the analysis of responses, college plans (Yi) and parental encouragement (Y2) and the covariates, IQ (Xi), sex (X2), and socioeconomic status (X3), where Yi is 1 for yes, and 0 for no, i = 1,2 (see Bonney, 1987). For this example, building a model for a multivariate binary response with covariates might be of interest. Recently, regressive logistic models have been proposed when the data comprise sequences of binary outcomes possibly with measures on associated explanatory variables (Bonney, 1987). Regressive logistic models are based on the elementary decomposition of the joint probability so that the likelihood of a set of binary dependent outcomes, with or without explanatory variables, is expressed as a product of successive conditional probabilities each of which is assumed to be univariate logistic. In Bonney's models, each binary response variable is an explanatory variable for later response variables in the sequence. Another proposal (Rosner, 1984) is the polychotomous logistic regression model. In this model, Rosner converts the problem into a 2 n polychotomous logistic distribution for the number of successes when explanatory variables are zero. As a motivation for Rosner's model, Connolly and Liang (1988) describe a class of conditional logistic regression models for correlated binary data. This includes the polychotomous logistic model of Rosner as a special case. These recent proposals were all offered as heuristic approaches for specifying a multivariate distribution capable of representing the dependence of binary outcomes. However, they are limited in scope, for they all describe some special patterns of dependence. Regressive logistic models proposed by Bonney deal with sequential binary outcomes, whereas Connolly and Liang's models might be useful when the dependent binary data 2 are almost exchangeable such as in the familial data situation. In order to allow for a general dependence structure for the binary outcomes, we introduce a model for multivariate binary data with covariates based on compatible conditionally specified logistic regressions. Conditionally, each binary response variable is a covariate for the logistic regressions for the other binary response variables. Moreover, there are conditions on the regression parameters in order that these conditional distributions are compatible. Furthermore, we can initially use logistic regressions to fit the conditional probabilities for each binary outcome, given the remaining binary outcomes in addition to covariates, and use this to assess whether the multivariate model is reasonable. Two different types of compatibility conditions are introduced in Chapter 2. One necessary and sufficient compatibility condition is proposed by Arnold and Strauss (1988); another compatibility condition comes from extending a result from Gelman and Speed (1993). The focus of Chapter 3 is to discuss general conditions under which there exists a joint probability distribution with the given specified conditional probability distributions and to develop a model for multivariate binary data with covariates based on compatible conditionally specified logistic regression. Starting with 2 and 3 dimensions enlightens us on the conclusion of the general case (from Section 3.1 to Section 3.2). In Section 3.3, we briefly discuss what are conditions for compatibility when the conditional distributions are based on a distribution that is not logistic. In Chapter 4, we compare our model with regressive logistic models proposed by Bonney and conditional logistic regression models for correlated binary data proposed by Connolly and Liang. 3 In Chapter 5, an example with the coronary bypass surgery data is used for illustration of our model. In Section 5.1, we describe the data source in the example. In Section 5.2, we introduce three computing methods which are all likelihood-based and are all used for estimating the parameters in our model. For our example, comparison of three methods is done in Section 5.3. While doing data analysis and computing among other things, answers to the following questions are sought (Section 5.4 and Section 5.5): 1. For our example, how do we choose the covariates to use in our model? 2. How much does the association of the response variables depend on the covariates? 4 Chapter 2 Compatibility Conditions for Conditionally Specified Multivariate Distributions The goal of this chapter is to study conditions for compatibility of conditional probabilities; that is, conditions for which a given set of conditional probability functions leads to a proper joint multivariate distribution. This will be made clear through some examples, starting with the two dimensional case. Let (X, Y) be a two dimensional random vector. Clearly, its probabilistic behavior for most purposes is adequately specified by knowledge of its joint cumulative distribution function FX,Y(X, or joint density function y) = P r ( X < x, Y < y); x, y e R fx,y{x,y). A variety of transforms can be used to characterize -Fy.y- It is clear that one marginal distribution and the family of corresponding conditional distributions, i.e., knowledge of Fy(y) and FX\Y{X\V) — Pr(-ST < x\Y = y), for every y, will completely determine the 5 joint distribution of (X, Y). On the other hand, knowledge of the marginal distributions Fx(x) and Fy(y) has long been known to be inadequate t o determine FX,Y(X, we are given both families of conditional distributions, FX\Y(x\y) y). What if f ° r every possible value y of Y and Fy\x{y \x) f ° r every possible value x of XI Provided the families of conditional distributions are compatible, in a sense that there exists a valid joint distribution with the given conditional distributions, then indeed these families of conditional distributions will determine the joint distribution of (X, Y). 2.1 Compatibility in 2 Dimensions The joint, marginal and conditional densities of X and Y will be denoted by fx,y(x, fx(x), /r(y)> fx\y(x\y) and fy\x(y\x). y), One compatibility condition proposed by Arnold and Strauss (1988) is that conditional densities fx\y{x\y) an d fY\x{y\x) are compatible if and only if there exist nonnegative functions a(x) and b(y) such that a{x)h\x{y\x) for all = b(y)fX\Y(x\y) (x,y). Another compatibility condition is given in Gelman and Speed (1993). Suppose the conditional densities fx\y('\y) are value xo- Assume that fx\y(xo\y) given for all y and fY\x('\xo) > 0 f°r au< is given for a particular J/- Then for the joint density fx,y{xTy)i jx\Y{xo\y) with t h e proportionality constant equal to J J fx\Y{xo\y) T h a t is, for two random variables, the set of conditional densities given one variable plus one conditional density given the other variable determine the joint bivariate distribution. If the conditional densities fx\Y('\y) are are given for all y and /Y\X('\X) given for all a:, then, Gelman and Speed stated that a condition for compatibility, derived from (2.1) is that: fx[y{x\y)fY\x(y\xi),fx\Y(x\y)fY\x(y\x2) fx\Y(xi\y) = fx\Y(x2\y)fY\X(y\xi) fx\Y(x2\y) .^ g . fx\Y(xi\y)fY\x(y\x2) does not depend on x, y for all choices of x\ ^ x2. 2.2 Compatibility in 3 Dimensions Similarly, a three dimensional random vector (X, Y, Z) might be specified by giving the distributions of X given (Y,Z), Y given (X,Z) and Z given (X, Y). Such conditional specifications must be checked for compatibility in a manner analogous to that described in section 2.1. On the basis of the idea provided by Arnold and Strauss, the compatibility condition for a three dimensional random vector {X, Y, Z) is that conditional densities fx\Y,z(x\y, y) a r e compatible if and only if there exist nonnegative and /Z\X,Y(Z\X, fY\x,z(y\x,z) functions a(y,z), b(x,z) and c(x,y) such that <*(v,z)fx\r,z(x\y>z) = Kx,z)fY\x,z(y\x>z) = for all (x,y,z) z), c(x,y)fz\xAz\x,y) (see Arnold, Castillo and Sarabia (1992), Chapter 8). The trivariate extension of Gelman and Speed's compatibility condition is as follows. Suppose the conditional densities fx\Y,z{'\y,z) are given for all y, z, are given for all z and a fixed x0, and fz\x,Y('\xo^yo) the same xQ as for fY\x,z('\xo,z). fY\x,z(yo\xo,z) 1S u x „ fx\Y,z(x\v, „ X z JX,Y,Z{ > y, ) a given for some fixed yQ and Assume that fx\Y,z(xo\y,z) z )fY\x,z(y\x<>, z )fz\x,Y(z\xo, T 7 j w 7—j JX\Y,zKxO\y, z)jY\X,Z\yo\XQ, 7 > 0 for all y, z and fxy,z{x,y,z), > 0 for all z, then for the joint density f fY\x,z('\xo,z) x Z) y0) . v V"*) with t h e proportionality constant equal to fx,Y{xo,yo)fx\Y,z(-\y>z) a r e Siven for a 1 1 2/' ZJ fy\x,z(-\xiz) are 8iven ft t h e conditional densities f o r a11 £> zi a n d fz\xy(-\x,y) are given for all x, y, then a condition for compatibility is that fx\Y,z(X2\yi X fx\Y,z( i Z )fY\X,z(yi\x2, z )fY\X,z(y\xli 2 Z )fz\X,Y(z\xi,yi) z x |y» • )/y|A-,z(yikl, «)/K|X,z(y|*2, z)fz\xA \ 2, does not depend on x, y, z for all choices of {x\,y\) ,g ^ J/2) ^ (a^Ste)- Cases involving random variables of dimension greater than three can, of course, be considered. The general ideas, however, are clearly visible in the 2 and 3 dimensional settings and both concepts for the compatibility condition can b e extended to higher dimensions. 8 Chapter 3 Models for Multivariate Binary Responses with Covariates T h e ideas in the previous chapter will be applied to get a model of compatible logistic regressions for a multivariate binary response. Consider a set of n correlated binary variables Y = (Yi, Y2,..., ( X i , X 2 , . . . ,Xm). Pr(Yi\Yj,j Yn), where Yi is coded 1 and 0, and m covariate variables X = Suppose that we are given the conditional probability distributions ^ «,X) for i = 1 , 2 , . . . , n . From previous discussions, the joint probability distribution pr(Y|x) = Pr(y1,r2,...,yn|x) will be determined if these conditional probability distributions are compatible. With binary responses, the logit link function is a natural choice although, in principle, any link function could be chosen. In our study, we assume that all conditional distribution of Yi given Yj, j ^ i, and X are logistic. That is, for a given binary response variable, the remaining binary response variables are used as covariates in addition to X . Now our goal is to discuss general conditions under which there exists a joint probability distribution P r ( Y | X ) with the given conditional probability distributions. At the 9 same time, a model for multivariate binary data with covariates based on compatible conditionally specified logistic regressions can be constructed. 3.1 Conditions for Compatibility in 2 or 3 Dimensions The two approaches of Arnold &; Strauss and Gelman &; Speed are illustrated in this section. 3.1.1 2 Dimensional Case For ease of exposition, consider first the case of two binary response variables Yi and Y2, and a covariate vector x. Suppose that Yi conditional on x and Y2 = t/2 is logit and that Yi conditional on x and Y\ = y\ is logit, that is, «^PMK=II«..X)] = ^ff;:fcg = C*i + ftx + 7122/2, = a 2 + &X + 721J/1, with parameters <*i, /?i, or2? #2? 712, and 721. After some rearrangement, Pr(5-, = *|»,x) = f ^ V + V T ^ ' i 1 + exp(o:i + /?ix -I- 712J/2) and Pr(K2 = ^b 1 ,x) = * ± ^ ^ * M . (3.!) 1 + exp(o: 2 + P2X + 721 J/i) (Note that for convenience, the following shorthand notation Pr(r/i|j/2,x) is used for Pr(yi = yi\Y2 = t/2)X = x ) , etc; when the context is clear.) 10 To ensure compatibility for the conditional probability distributions Pr(t/i|?/2,x) and Pr(y 2 |j/i,x), we have the following theorem. T h e o r e m 3.1 The conditional probability distributions Pr(i/i|y 2 ,x) and Pr(i/ 2 |2/i,x) in (3.1) are compatible if and only i/712 = 721, in which case, the joint probability distri- bution is: Pr(y x , y 2 |x) = - exp[(o;i + ftx)^ + (a2 + ftxjjfo + 7122/12/2], (3.2) where c = 1 + exp(c*i + ftx) + exp(a 2 + &X) + exp[(o;i + ct2) + {Pi + /? 2 )x + lu]P r o o f 1: (using Arnold and Strauss's compatibility condition) Pr(j/i|y 2 ,x) and Pr(j/ 2 |2/i,x) are compatible if and only if there exist probabilities Pr(j/i, r/ 2 |x), Pr(y 2 |x) and Pr(yi|x) such that: Pr(2/i,2/2|x) = Pr(y 1 |j/ 2 ,x)Pr(2/ 2 |x) = Pr(ya|yi,x)Pr(yi|x). (see Arnold, Castillo and Sarabia (1992)). Equivalently: P r ( r i = 1|F2 = l , x ) _ P r ( y t = l | x ) p r (y 2 = i|y 1 = i,x) Pr(y2 = i|x)' pr(y1 = i|y2 = o,x)_Pr(y 1 = i|x) pr(y2 = o|yx = 1, x) Pr(y2 = o|x)' pr(yi = o|y2 = i,x) pr(yi = o|x) Pr(y2 = i|y1 = o,x)~Pr(y 2 = i|x)' Pr(yi = o|y2 = 0, x) Pr(yi = 0|x) Pr(y 2 = 0|y a = 0,x) ~ Pr(y 2 = 0 | x ) ' (3.3) (3.4) (3.5) (3.6) For example, dividing (3.3) by (3.4), and dividing (3.5) by (3.6), we have following equations respectively: pr(y2 = o|x) pr(y2 = i|x) = Pr(y1 = i|y2 = i,x) pr(y2 = o|y1 = i,x) pr(y2 = i|yx = i,x)' pr(ya = i|y2 = o,x) 11 exp(a! + ftx + 712) 1 + exp(a 2 + /?2x + 721) 1 + exp(«i + fax. + 712) exp(a 2 + ftx + 721) 1 1 + exp(c*i + ffix) 1 + exp(a 2 + /?2X + 721) exp(Qa + ftx) exp(7i 2 ) • [1 + exp(a! + ftx)] exp(72i) • exp(a 2 + /fcx) • [1 + exp(a 1 + ftx + 712)]' (3.7) p r (r 2 = o|x) _ Pr(yri = o|y2 = i,x) Pr(y2 = o|rx = o,x) Pr(r 2 = l|x) ~ _ Pr(r 2 = l | n = 0 , x ) ' P r ( F 1 = 0 | y 2 = 0,x) 1 1 + exp(a 2 + /32x) 1 + exp(c*i + ftx + 712) exp(a 2 + #2x) 1 + exp(ai + ftgx) 1 + exp(a 2 + /?2x) 1 + exp(ai + /?ix) exp(a 2 + /?2x) • [1 + exp(<*i + ftx + 712)]' (3.8) Comparing (3.7) with (3.8), only the condition 712=721 makes them consistent. Other cases yield the same result. Therefore, we conclude that a necessary and sufficient condition for Pr(j/i|j/ 2) x) and Pr(y 2 |2/i,x) to be compatible conditional distributions is 712=721- Since Pr(F 2 = 0|x) = 1 - Pr(F 2 = l|x), solving (3.8) we have P r ( F = llx) = 2 exp(a 2 + ftx) • [1 + exp(o;1 + ftx)] 1 + exp(ai + A x ) + exp(a 2 + /?2x) + exp[(ai + a2) + (ft + ft)x + 7l2]' Similarly, we also can get Pr(y! = l|x) exp(oi + &X) • [1 + exp(a 2 + /?2x)] 1 + exp(c*i + Pix) + exp(a 2 + /?2x) + exp[(ai + a2) + (ft + ftjx + 712]' Summarily, P r ( i i = y i | x ) = -exp[(ai + /3ix)j/i][l + exp(a 3 + i82X + 7i2y1)], c Pr(F 2 = 2/2 |x) = -exp[(a 2 + fax)y2][l + exp(an + ftx + Twife)]. c 12 (3.9) where c = 1 + exp(ai + ftx) + exp(a 2 + fax) + exp[(<*i + a2) + (/?i + fa)x + 712]. Using (3.1) and (3.9), it is easy to prove that the joint probability distribution is (3.2). P r o o f 2: (using Gelman and Speed's compatibility condition) Assume that Pr(y 1 |y 2 )X), Pr(y a |y2,x) > 0 for all y 2 , x and two fixed values y 1? yi. Implied by (2.2), a compatibility condition for P r ( y i | y 2 , x ) and Pr(3/2|3/i>x) is that Pr(yi|y2,x)Pr(y2|yl,x) Pr(yi|y2,x)Pr(y2|y",x) Pr(yi|y2,x) I Pr(yi'|y 2 ,x) does not depend on y\, y2 for all choices of yx ^ Pr(^'|y2,x)Pr(t/2l^,x) P r ( y ; | y 2 , x ) Pr(y 2 |t/i',x) ^ UJ yx. Using (3.1), (3.10) can be rewritten as follows: Pr(yil2/2,x)Pr(t/ 2 lyl,x) Pr(yi|y2,x)Pr(j/2|yi,x) _ = exp[(o?i + fax + 7123/2)3/1] 1 + exp(«i + ftx + 7i 2 y 2 ) exp[(a 2 + fax + 7213/^)3/2] w 1 + exp(a 2 + fax + 721ft) exp[(a 1 + ftx + 7123/2)3/1] exp[(q 2 + /? 2 x + 7213/1 )y 2 ] x 1 + exp(ai + /9ix + 7123/2) 1 + exp(a 2 + fax. + 721J/1) 1 + exp(a 2 + fax + 72iy") » /., r, v TT 7—Tfl—7 7T • exp ( a i + PixKVx - 3/1) 1 + exp(c*2 + fax + 7213/1) • exp[y 2 (yi' - yi)(7i2 - 721)] (3.11) With y1 = 1, yl — 0, (3.11) does not depend on yi and y2 if and only if 712 = 721- That is, a necessary condition for Pr(yi|y 2 ,x) and P r ( y 2 | y i , x ) to be compatible is 712 = 721. Letting yx = 0, from (2.1), (2.2) and (3.1), we get that the joint probability distribution Pr(3/i>3/2|x) has the following property: Pr(j/i,2/ 2 |x) oc Pr(2/i|2/2,x)Pr(y 2 |yi = 0,x) Pr(yi = 0|y 2 ,x) exp[(a 1 + fkx)V\ + (<*2 + ^ x ) y 2 + 7123/13/2] 1 + exp(a 2 + /?2x) 13 with the proportionality constant equal to 1 + exp(a 2 + /? 2 x) 1 + exp(ax + /?ix) + exp(a 2 + ftx) + exp[(ai + a2) + (ft + /? 2 )x + j12]' In other words, the joint probability distribution is proved as (3.2). It is easy to prove that the condition 7i 2 = 7 2 i is sufficient. 3.1.2 3 Dimensional Case Next, in order to get clearer ideas about the general multivariate case, it is necessary to further discuss the 3 dimensional case with three binary response variables Y\, Y2 and I 3 , and a covariate vector x. By assuming the linear logistic models for the conditional probability distributions Pr(F x = y i | y 2 , y 3 , x ) , Pr(F 2 = t/ 2 |yi,y 3 ,x) and P r ( F 3 = 2/3|2/i,2/2,x), we have r, / 1 x exp[(ai + ftx Pr(yi y 2 ,y3,x) = — -.—— 1 + exp(ai + exp[(a 2 + ftx Pr(y 2 |yi,y3,x) = 1 + exp(a 2 + + 7 12 y 2 + 713^3)2/1] • r, ftx + 7122/2 + 713J/3) + j2iyi + 7232/3)2/2] ftx + 7 2 ij/ a + 7 23 ?/ 3 )' and TW 1 ^ exp (0:3 + ftx + 7 31 yi + 732^2)2/3 P % 3 2/1,2/2, x) = — -.—— • r, 1 + exp(a 3 + ftx + 7 31 y x + 7 32 y 2 ) ,„ 10 x (3.12) where a's, fts and 7's are parameters. (Note again that we are using shorthand notation, that is clear from its context.) For a 3 dimensional case, an analog of Theorem 3.1 to guarantee compatibility for (3.12) is given in Theorem 3.2. T h e o r e m 3.2 The conditional probability distributions Pr(yi|j/ 2 ,t/ 3 ,x), Pr(y 2 |t/i,y 3 ,x) and Pr(j/ 3 |j/i,y 2 ,x) in (3.12) are compatible if and only 1/71:2=721,713=731, and 7 2 3 = 14 732 • Furthermore, the joint probability distribution Pr(2/i, 2/2,2/3 |x) = is of the form - exp[(a a + /?ix)yi + (a2 + My2 + (<*3 + /? 3 x)y 3 +7122/12/2 + 7132/12/3 + 7232/22/3] = 3 1 - exp[£(a,-+ #x)y,-+ C »'=1 £ 7.i2/.2/j] (3.13) l<«'<j<3 where c = 1 + £ ? = 1 exp(a,- + # x ) + Ei<,<i<3 exp[(a;,- + a,-) + (A + ^ ) x + 7,^] + exp[(a 1 + a2 + a3) + (ft + ft + ft)x + (7l2 + 7l3 + 723)]. P r o o f 1: (using the extension of Arnold and Strauss's compatibility condition) Pr (2/i|2/2,2/3,x), Pr(j/ 2 |2/i,2/3,x) and Pr(2/3|2/i,2/2,x) are compatible if and only if there are probabilities Pr(j/i,t/ 2 ,2/3|x), Pr(2/ 2 ,2/3|x),Pr(t/ 1 , y 3 |x) and Pr(j/!,2/ 2 |x) such that p r(2/i>2/2,2/3|x) = Pr(2/i|2/2,2/3,x)-Pr(T/ 2 ,2/3|x) = Pr(y2|yi,y3,x)-Pr(yi,y3|x) = Pr(y3|yi,y2,x)-Pr(yi,y2|x) for all 2/1,2/2, 2/3, x. Equivalently, Pr(2/i|2/2,y3,x Pr(2/2|2/i,y3,x Pr(2/i,2/3|x) Pr(2/2,2/ 3 |x)' Pr(2/i|2/2,2/3,x Pr(y3|2/i,y2,x Pr(2/i,2/2[x) Pr(^,2/3|x)' Pr(2/2|2/i,2/3,x Pr(2/3|2/i,y2,x Pr(2/i,y2|x) Pr(yi,y3|x)' (3.14) Special cases of (3.14) are Pr(r 1 ^o,r 3 = oix) Pr(y2 = o,r3 = o|x) P r ( F 1 = 0 | y 2 = 0 , r 3 = 0,x) Pr(F 2 = 0 1 1 1 = 0 , F 3 = 0,x) 1 + exp(Q 2 + ftx) 1 + exp(ai + ftx)' 15 (3.15) Pr(y1 = i,y 3 = o|x) = P r ( F 2 = 0, Y3 = 0|x) Pr(y1 = o,y3 = Q|x) p r (y 2 = i,r 3 = o|x) Pr(r1 = i|r 2 = o,y3 = o,x) Pr(F 2 = 0 | y = 1, Y3 = 0, x ) 1 + exp(c*2 + fox + 721) exp(c*i + 1 + exp(a 1 + fox.) = = _ (3.16) fox)' (3.17) Pr(y1 = o|ra = i,y 3 = o,x) Pr(y2 = i|Ki = o,y3 = o,x) 1 + exp(a 2 + fox) 1 + exp(ai + fox + 7 12 ) Pr(y1 = i,y 3 = o|x) Pr(y2 = i,y 3 = o|x) fox), exp(e*2 + Pr(y1 = i|y2 = i,y 3 = o,x) Pr(y2 = i|ya = i,y 3 = o,x) l + e x p ( a 2 + /?2x + 72i) 1 + exp(ai + fox + 712) • exp[(a! - a2) + (fo - fo)x + ( 7 l 2 - 721)]. (3.18) Dividing (3.15) by (3.16) and dividing (3.17) by (3.18), we have P r ( y 1 = 0,y 3 = 0lx) Pr(y x = 1, y 3 = 0|x) Pr(y x = 0, y 3 = 0|x) Pr(y a = l , y 3 = 0 | x ) 1 + exp(a 2 + fox) 1 + exp(a 2 + fox + 721) 1 exp(ai + 1 + exp(a 2 + fox) 1 l + exp(c*2 + /?2X + 7 21 ) exp(ai + fox + (3.19) fox)' 7l2 - 72i)' (3.20) Comparing (3.19) with (3.20), we conclude that one of the necessary and sufficient conditions for Pr(yi \y2, J/3, x ) , Pr(j/ 2 |j/i, t/3, x) and Pr(t/ 3 |?/i, y2, x ) to be compatible conditional distributions is that 712 = 721. By symmetry, we conclude that other necessary and sufficient conditions for compatibility are 7 1 3 = 731 and 7 2 3 = 7 3 2 . Using the same procedure which has resulted in (3.19) and (3.20), we have Pr(yx = o,y3 = i|x) pr(ya = i,y 3 = i|x) = Pr((yx = o,y3 = iix) pr(y2 = o,y3 = i|x) pr(y2 = o,y3 = i|x) ' pr(yx = i,y 3 = i|x) Pr(y1 = o|y2 = o,y3 = i,x) Pr(y2 = o|y1 = i,y 3 = i,x) pr(y2 = oiyj = o,y3 = i , x ) ' pr(ya = i|y2 = o,y3 = i,x) 1 + exp(a 2 + fox + 7 23 ) 1 1 + exp(a 2 + fox -(- 712 + 723) exp(ai + fox + 713) 16 (3.21) and Pr(y1 = i>y3 = o|x) Pr(yx = i,y 3 = i|x) = Pr(y1 = i,y 3 = o|x) pr(y1 = i,y2 = o|x) pr(yx = i,r 2 = o|x)" Pr(n = i,y 3 = i|x) pr(y3 = o|y1 = i,y2 = o,x) Pr(ya = o|yi = i,y 3 = i,x) pr(y2 = o|Fx = i,y 3 = o,x) * pr(y3 = m = i,y2 = o,x) 1 + exp(a 2 + fax. + 7 l 2 ) 1 1 + exp(a 2 + fax + 7 12 + 723) exp(a 3 + fax + 713)' From (3.21) and (3.22), Pr(y x = 0,y 3 = l|x) and P r ^ = l,y 3 = 0|x) can be expressed in terms of Pr(y x = 1, y 3 = l|x), respectively: Pry, - 0, li - 1 W - , ' + T P(0 % + A X + ™] , • VT(Y; = \Ys = 1|X ' (3.23) 1 + exp(a 2 + fax + 712 + 723) exp(a 1 + fax + 713) and Pr(K, = 1. y 3 = 0 W = l + e x p ( a i + /?2x+ 7,2) Pr(r, = 1 ,Y, = l|x) 1 + exp(a 2 + fax + 712 + 723) exp(« 3 + fax + 713) Substituting (3.24) for (3.20), Pr(y x = 0, Y\ = 0|x) also can be expressed in terms of Pr(y 1 = l , y 3 = l|x): Pr( y = i y ' = 3 1]x) = 1 + exp( Q2 + fax) Pr(y 1 = l , y 3 = l|x) 1 + exp(a 2 + fax + 712 + 723) exp[(tt! + a 3 ) + (fa + fa)x + 7 l 3 ] (3.25) Under the constraints of Pr(Yi = 0, Y3 = 0|x) + Pr(yi = 0, Y3 = l|x) + Pr(y x = 1, Y3 = 0|x) + Pr(y a = l,y 3 = l|x) = 1, a bit of calculation by substituting (3.23), (3.24) and (3.25) will result in following equation: c-Pr(y 1 = i , y 3 = i|x) = 1 [1 + exp(a 2 + fax + 712 + 723)] exp[(ai + a3) + (fa + fa)x + 713] where c = 1 + £ ? = 1 exp(a,- + fax) + £i<«j<3 exp[(a,- + a,) + (fa + fa)x + 7^-] + exp[(ai + a2 + a3) + (fa + fa + fa)x + (7 12 + 713 + 723)] • 17 Thus, Pr(rx = l , y 3 = l | x ) = - • e x p [ ( a 1 + a3) + ( A + & ) x + 7 13 ] c •[1 + exp(a 2 + /?2x + 712 + 723)] Similarly, we can obtain other probabilities Pr(y,-,y_,|x). In brief, we can write Pr(y,-,yy |x) as: Pr(y,-, yj |x) = - • exp[(a,- + #x)y,- + (a,- + fijx)yj + myiVj] c (3-26) •[1 + exp(a fc + /?*x + 7,fcj/i + 7*iW)] where i < j , k ^ i, k ^ j and i, j , k = 1,2,3. Finally, using (3.12) and (3.26), we can obtain the joint probability distribution as (3.13). P r o o f 2: (using the extension of Gelman and Speed's compatibility condition) From (2.4), a condition for P r ( y i | y 2 , y 3 , x ) , P r ( y 2 | y i , y 3 , x ) and P r ( y 3 | y i , y 2 , x ) to be compatible is that Pr(Vi lya, 3/3,x) Pr(y 2 |y|', y 3 , x) Pr(y 2 |yj, 2/3, x) Pr(y 3 |yj, y 2 > x) , ^ Pr(yi|y2, ya, x ) Pr(y 2 |yi, ys, x ) Pr(y 2 |yi'» y3, x ) Pr(y 3 |yi', y 2 , x ) does not depend on j/i, y2 and y 3 for all choices of (y1,y2) 7^ ( y i ^ ) - Applying (3.12), in this case we rewrite (3.27) as: Pr(yriy2,y3,x)Pr(y2,|yi,y3,x)Pr(y2|yl,y3,x)Pr(y3|t/i,y2,x) ^ Pr(yi|y2, V3, x) Pr(y 2 |yi, y 3 , x) Pr(y 2 |yi', y 3 , x) Pr(y 3 |yi', y 2 , x ) 1 + e x p ( a 3 +/5 3 x + 73^1 + 732y2') w » /. , #v, r, fl fl w » — ) — — r— rf • exp ( a i + ftx)^ - y x ) + ( a 2 + /? 2 x)(y 2 - y 2 ) 1 + e x p ( a 3 + /?3x + 7 3 ^ ! + 7 32 y 2 ) • exp[72i(yiy 2 - y'iy'2) • exp[y 2 (y" - yj)(7i2 ~ 721)] • exp[y 3 (yi' - yj)(7i 3 - 731)] • exp[y 3 (y 2 ' - y 2 )(7 2 3 - 732)] (3.28) 18 Let yx = 1, y a = 0, y 2 = 1 and y2 = 0, then (3.28) does not depend on yi, y2 and y3 if and only if 712 = 721, 713 = 731 and 723 = 732. This indicates that necessary conditions for the trivariate case are 712 = 721, 713 = 731 and 723 = 732- Applying (2.3), let both fixed values of y\ and t/2 be zeros, then we know for the joint probability distribution Pr(yi, t/25£/3> |x), D / 1x Pr(yily2,y3,x)Pr(y2|yi = 0,y3,x)Pr(y3|yi = 0,3/2 = 0,x) Pr(yi, 2/2, 2/3, X) OC _ — -: —— r Pr(yi = 0|y 2 ,J/3,x)Pr(y 2 = 0|yx = 0 , y 3 , x ) exp[(Qi + /?ix)yi + (0:2 + /?2x)y2 + (Q3 + ^3x)y 3 + 7122/13/2 + 7i32/i2/3 + 723^/23/3] 1 + exp(a 3 + /33x) with the proportionality constant equal to [1 + exp(ai3 + /?3x)]/c, where c is given in (3.13). Now we have proved that the joint probability distribution is the form of (3.13), and sufficiency of the conditions is easily proved. In both Theorem 3.1 and Theorem 3.2, we use two kinds of techniques for the proof. The first technique uses Arnold and Strauss's compatibility conditions; the second uses Gelman and Speed's idea for compatibility conditions. It is obvious, from the lengths of the proofs in the 3 dimensional case, that the extension of Gelman and Speed's compatibility conditions is easier to use than the extension of Arnold and Strauss's. In fact, using the extension of Gelman and Speed's compatibility conditions, it is possible to prove an analogous theorem to Theorem 3.1 and Theorem 3.2 for the general multivariate case, but using the extension of Arnold and Strauss's compatibility conditions, the necessity of the condition might be too tedious to prove. 19 3.2 General Case Our ultimate aim is to develop a conditionally specified logistic regression model for multivariate binary responses with covariates and to discuss necessary and sufficient conditions for compatibility under general dimensions. For notation, we change /?jX above to j3j\Xi + • • • + f3jmxm. Assume Pr(y n | y i , . . . , y ^ , x ) - 1 + ^ ^ ^ + ^ + ^ ^ } . (3.29) The above discussions in both 2 and 3 dimensional cases provide immediate generalization of the results for the n dimensional case. For the reason that we have mentioned at the end of last section, however, we prove the necessary conditions for conditional distributions (3.29) to be compatible only using the extension of Gelman and Speed's compatibility condition. T h e o r e m 3.3 The conditional probability distributions Pr(y,|yj, j ^ i , x ) for i = 1 , . . . , n, in (3.29) are compatible if "fij=fji for all i ^ j . Under conditions for a joint probability distribution is as follows: i n tn P r ( y i , . . . , y n |x) = -exp{]T[(a,- + £ PijXj)yi] + C where c is the normalizing compatibility, ,=1 j=l J2 7vW> (3-30) l<.<i<n constant which involves a sum of2n exponential terms and is the form of c 1 1 J/l=0 j/n=0 ex n a m = Yl • • • J2 p{J2( i + J2 Pnxj)yi + Yl lavivi) t'=l j=l (3-31) i<j Proof: By extending (2.4) and (3.27) to the n dimensional case, we conclude that a condition for Pi(yi\yj,j ^ i,x) to be compatible is that FEUPr(y,-|y|,• • •,y'j-i,yi+i,•. •, yW) x) • n?=i^(y"\y'u••••>y"-nVt+u• • •»Vn,x) , g ^ n?=i Pr(y,-|yi', • • •,y"_i, yi+i,..., y», x) • n?=i Pr(y,:|2/i, • • •,y,'_i,y.-+i,...,y„,x) 20 does not depend on y,- ( i — 1 , . . . , n) for all choices of (y1,..., y n ) ^ (y1,..., yn). Given (3.29), we can write (3.32) in detail, that is: nr = i Pr(y,[yi,. • •, y,-_i, yi+i, • • •, yn, x) • nr=i Pr(y,"|yi'> • • •»y"-n yi+i. • • •. yn, x) nr=i Pr(y.|yi, • • •, y,"_i, y.+i,..., y», x) • n,"=i Pr(y-|yi, •. •, y,'_i, y,+i,..., y„, x) _ n"=i exp[(a,- + J2T=i Paxi + £ j < i 7«i2/j + Ej>.- 1fijVj)yi] ll"=i exp[(a,- + E7=i Paxi + E i < i 7.j'yi + £;>.- lavM n"=i exP[(o!,- + Y%LI PiM + Ej<,- 7«jyj + Zj>i mjyjWi] n?=i exp[(a,- + ££=i AjXj + Ej<,- luv'j + Ej>.- 7yw)yi'l n m ex = I I p[(««- + £ Pijxi)(y" - y'd + £ 7« (yj y" - yj-y,')] i=l j=l :>j n I I e x p E 7j.-yj(y,' - y'i)] • I I e x p E 7o-yj(y" - y.')] «=i If (y[,..., «<j t=i (3-33) i'<j y n ) = 1 and ( y " , . . . , y„) = 0, (3.33) becomes n,"=i Pr(yi\y'i, •••, y,--i, y.-+i, • • -, y», x) • n L i Pr(y"lyi, • • •» y"-n y«+i, • • •, yw, x) nr=i Pr(y,-|yi', • • •,y-'-i,y.+i,...,y n ,x) • n?=i Pr(y,'|yi,...,y,'_i,y,-+i,...,y„, x) n m n x = H exp[-(a t - + Y, Pa J + S Ty)] • I I t=l j=l «>i ex i=l P E ( 7 ; . ' - 7«i)yj]- (3-34) i<j Therefore, (3.32) does not depend on yt- ( i — 1 , . . . , n) if and only if 7,j = 7,-,- for all i ^ j in (3.34). Now we get the conclusion that the necessary conditions for Pr(y,|yj,j 7^ z,x) to be compatible conditional distributions are 7,j = 7,-,- for all i ^ j . Again extending (2.1) to the n dimensional case, for the joint probability distribution Pr(yi,...,yB|x), V X Pr^,...^*^*?^---'^1 - \ n"=i Pr(y,-|yi, • • •, y,-_i, y.+i, • • •, y», x ) In this case, if yt = 0 for all i, we can write (3.35) as follows: n m P r ( y i , . . . , y„|X) oc I I exp[(a,- + ^ A ^ i + X}7ijy;)y.] 1=1 i=i 21 i<i (3.35) with the proportionality constant equal to 1/c, where c is defined as (3.31). That is, the model for multivariate binary data with covariates based on compatible conditionally specified logistic regressions is: i n TO Pr(yi, • • •, y„|x) = - e x p { £ [ ( a , - + ] T flyx,-)^ + C ,=1 j=l Y, -yijViVj} l<t'<i<n where c = £ * i = 0 • • • Ej„=o ex P{E,"=i(«.- + EjLi PijXj)yi + E , < ; lijViVj}- Finally, we prove the sufficient conditions for conditional distributions (3.29) to be compatible. T h e o r e m 3.4 The conditional probability distributions of (3.30) are linear logistic dis- tributions. Proof: Given the model (3.30) with 7^ = 7,,-, i ^ j , we have Pr(y,-,i ± k\X) = Vv(yu...,i/*-i,y* = l,y*+i,-• + Pr(?/i,...,t/ f c _i,y f c = 0,yk+1,...,yn\X) m •j = -,Vn\X) m a -{exp[(a* + ^ flya:,-) + X ( » ' + H c 3=1 + D 7*;yj + Jj4A i^k E 3=1 To-yt-yj] l<i<i<n,'ii# m + expE(a,- + ]T #,•&,•)# + 1 = C 2 HjViyj]} X) HiViVj] m ex a PE( '' + E^Ii)j''+ i^k i=l m l<i<j<n,i,j^k •{1 + expK + ] £ fly*,- + XI 7fcj2/j]}3=1 Further, we obtain Pr(y*|y;,j ? * * , * ) = Pnxi)Vi Pr(yi>S/2,--.,yn|A') Prfo.j^*!*) 22 jjtk exp{E"=i [(<*«• + E?Li PijXj)vi] + Ei<i<j<n amy5} xp[E,^fc(a.- + E^zi fcjxfivi + J2i<i<j<n,i,j*k amy]] e 1 {1 + exp[a* + E?Li PkjXj + Ej*k 7kjVj]} exP[(o!fc + E ^ i PkjXj)yk + (Ej^fe 7kjyj)yk] 1 + exp[«* + E^=i PkjXj + Ej^jt 1kjVj\ This equation indicates that conditional probability distributions Pv(yk\yj,j ^ k,X) for k = 1 , . . . , n are linear logistic distributions. • 3.3 Further Extension The discussion so far assumes implicitly that the conditional distributions Pr(i/j|?/,-, i ^ j,x) for j = l , . . . , n , are logistic, so we say that the model (3.30) for multivariate binary data with covariates we have constructed is based on compatible conditionally specified logistic regressions. In fact, as mentioned before, with binary responses, any other link function could be chosen instead of the logit link function. From this point of view, a question to be addressed here is what are conditions for compatibility when the conditional distributions are based on a distribution F that is not logistic, i.e., Pr(5"j = 1 lift, i ± j , x) = F{ai + pjX + J2 ljiyi),3 = 1, • • •, n. In this section, we intend to motivate the development of conditions for compatibility when F is not logistic. For this reason, let us simply consider the case of two binary response variables Y\ and Y2, and a covariate vector x . Assume that the conditional distribution of Yj is Pi(Yj = l|y,-,x) = F{ctj + 0jX + rijiyi) f o r ; = 1,2,» = 3 - j . 23 Here we use Arnold and Strauss's compatibility condition instead of Gelman and Speed's. Consequently, the conditional distributions Pr(yi\y2,x.) and Pr(y 2 |j/i,x) are compatible if and only if there exist probabilities Pr(yi,y 2 |x), Pr(j/i|x) and Pr(i/ 2 |x) such that: Pr(2/i,3/2|x) = Pr(yi|y 2 ,x)Pr(y 2 |x) = Pr(j/2|yi,x)Pr(y!|x). In detail, we write the above form as: Pr(y1 = i|r 2 = i,x) Pr(y! = i|x) Pr(y 2 = l|yx = l,x) Pr(y 2 = l | x ) ' (3.36) Pr(yx = i|y2 = o,x) _ Pr(yx = i|x) Pr(y2 = o|ya = i,x) " pr(y2 = o|x)' Pr(yx Pr(y 2 Pr(y t p r (y 2 = = = = o|y2 \\YX o|y2 o|yx = = = = l,x) Pr(y t = 0,x) ~ Pr(y 2 = 0,x) Pr(yi = 0,x) ~ Pr(y 2 = (3.37) 0|x) l|x)' 0|x) 0|x)" (3.38) (3.39) In particular, if we divide (3.36) by (3.37) and divide (3.38) by (3.39), we have pr(y2 = o|x) pr(y2 = i|x) = pr(y1 = i|y2 = i,x) pr(y2 = o|y1 = i,x) pr(y2 = i|ya = i , x ) ' Pr(yx = i|y2 = o,x) F{ax + A x + 712) 1 - F(a2 + A x + F(ai + ^ x ) 7ai) * F(a2 + Ax + 721) (3.40) pr(y2 = o|x) = Pr(y1 = o|y2 = i,x) Pr(y2 = o[y1 = o,x) pr(y2 = i|x) ~ Pr(y2 = i|y1 = o,x)'Pr(y 1 = o|y2 = o,x) 1 - F(ai + fax. + 712) 1 - F(a2 + ^ x ) 1 - F(ai + Ax) F{a2 + Ax) (3.41) Obviously, (3.40) and (3.41) must be equal. In other words, a necessary and sufficient condition for Pr(j/i|t/ 2 ,x) and Pr(y 2 |y x ,x) to be compatible conditional distributions is F ( a i + A x + 712) F(a2 + A x ) _ 1 - F(ai + Ax + 712) ' 1 - ^(02 + Ax) F(a2 + A x + 721) Fj^ + Ax) 1 - F(a2 + Ax + 721) ' 1 - F(ai + Ax) (3.42) After analysing (3.42), we find that both of the following conditions make (3.42) hold: 24 1. ai = a2, fix = #2, and 712 = 721. 2. 712 = 721 = 0, i.e., the two binary response variables Y\ and Y2 are independent. It is not obvious that general conditions can be obtained. There may be conditions that depend on F. As for higher dimensional cases, the general analysis is quite difficult. 25 Chapter 4 Comparisons with Other Models Logistic regression is widely used to study the effects of explanatory variables on binary response variables. Several different specifications have been considered for dependent binary response variables. By means of logistic regression, several useful approaches to specifying a multivariate distribution capable of representing the dependence of binary response variables have recently been proposed. However, in contrast to model (3.30) which allows for a general dependence structure for binary response variables, these other models are limited in scope. Here we discuss two kinds of these models and compare them with model (3.30). 4.1 Regressive Logistic Models To describe sequentially dependent binary response variables, Bonney (1987) proposed regressive logistic models in which the probability of an observation is conditioned on all preceding observations. The basic theory of Bonney's models is the follow. Since there exists a natural ordering to the indexing of all dependent binary response variables, it is reasonable to decompose the probability of Y given X into a product of n probabilities 26 such as: Pr(Y|x) = p r (y 1 ,y 2 ,...,y„|x) = Pr(r1|x)Pr(y2|r1,x)--.Pr(rn|r1,r2,...,n_1,x). (4.i) Correspondingly, he defines the ith logit as 6i Pr(Yi = l\Y1,...,Yi.1,X) = ^ P r ( ^ = 0|y1,...,«.1>X)1' [ and assumes that 0,- is a linear function of Yi,..., 0i = a + 7a Zx H where Yi-i,X. and can be expressed by h 7 n _iZ n _ 1 + /?X, 21^-1, ifj<i £; = < 0, if j>i. Therefore, the joint probability is decomposed into a product of successive conditional probabilities each of which is assumed to be univariate logistic: P'(Y|X) = ft ~ ^ - (4-2) Model (4.2) provided by Bonney has the theoretical and practical advantage that it can be analyzed and fitted in the same way as the logistic regression model for independent binary response variables, and with the same computer programs. However, there exist differences between model (3.30) and model (4.2). Model (3.30) is based on compatible conditionally specified logistic regressions and deals with a general dependence structure for the Yi's, but model(4.2) is based on a product of a sequence of logistic regressions and only should be used if the binary variables are observed sequentially in time. The decomposition (4.1) is applicable only when a natural ordering of the dependent observations exists. Note that a different order generally implies a different model, and the joint probabilities (4.2) are not necessarily the same for different orders. 27 4.2 Conditional Logistic Regression Models for Exchangeable Binary D a t a Rosner (1984) proposed a polychotomous logistic regression model which reduces to the beta-binomial distribution for the number of successes when explanatory variables are zero. Based on Rosner's idea, Connolly and Liang (1988) suggested a general model in which Rosner's model is considered as a special case: P r ( Y | X a , . . . , Xn) = c(0,0) exp{ £ ) Fn(k, 0) + £ Xj/3Yj} (4.3) where c is a normalizing constant chosen to make the probabilities, P r ( Y | X ) , sum to unity and Y. is the sum of the Yj's. It follows from (4.3) that the logit conditional probability of Yj; = 1 given Y__,- = {Y\, • • •, Yj-_i, Y j + i , . . . , Yn) and Xj is logitVxiXj = l\Y-i,Xi) = Fn(wj;0) + /?%•, (4.4) where Wj = Y—Yj is the sum of the Y's excluding Yj. Clearly, the conditional probability for Yj = 1 in (4.4) depends on YLj only through the sum. In other words, this indicates a very important feature of the logistic representation in (4.4): when there are no covariate effects or when X\ = • • • = Xn, then the models are such that the n Yj's are exchangeable. Compared with model (3.30), model (4.3) introduced by Connolly and Liang has some limitations due to the restricted dependence structure. That is, model (4.3) might only be reasonable when the dependence among observations is approximately exchangeable such as in familial data. On the other hand, in model (3.30), the logit conditional probability of Yj = 1 given ?/_, and X is m logitVx(Yi = l|y_,-,X) = a,- + ^ 28 Nxi + X}7«j2/j- This logistic representation means that the dependence of each single y on the rest is only through the sum when all 7,-j's are equal. Thus, we conclude that model (3.30) is a special case of Connolly and Liang's model (4.3) when jij = 7 for all i ^ j and /?,_,- = flj for all i,j, since then £ i ? y 7 ^ = q/(y. - y{). 29 Chapter 5 D a t a Analysis and Computing An example with some coronary bypass surgery data will now be presented to illustrate the application of the model for multivariate binary responses with covariates that we have discussed in Chapter 3. From a clinical point of view, it is important to determine risk factors for different complications. Using multivariate binary model (3.30), we can determine risk factors for several response variables simultaneously and account for dependence between the binary response variables. Zhang (1993) did the risk factor analysis separately for each binary response variable (or complication variable). In order to fit model (3.30), three likelihood-based computing methods will be introduced. After that, several topics will be discussed, including choice of covariates and inferences from the model. 5.1 T h e D a t a Source and Description MCR (Merged, Multi-Center, Multi-Specialty Clinical Registries) is an international data-base developed by Health Data Research Institute in which information about patients who had heart-related surgery was recorded. When doing statistical analysis, instead of using the entire data set, we choose a random sample subset which contains 30 880 patients information (this is done partly in order to reduce computational time). The information available both pre and post operation is provided in Table 5.1 and Table 5.2. The pre-operation information includes patient's age, gender, prior myocardial infarction, existence or non-existence of other diseases, body surface area, etc. (the first 19 variables: VI to V19). The post-operation information includes the patient's status during and after the bypass surgery; for example, complications, such as renal or neurological problems, and survival status to 30 days following surgery (the last 7 variables: V20 to V26). The variables of primary interest are those outcome variables indicating the patient's complications and survival status after the surgery. Here, the complication variables are related to the quality of life after surgery. For more documentation of the data set, see Zhang (1993). 5.2 Estimation Procedure Our aim in this section is to discuss from a practical viewpoint, how to apply model (3.30) and how to estimate the parameters in model (3.37). Firstly, a simple, indirect, yet important way is to use the existing computer programs such as the glm() function in Splus which fits the coefficients of the linear logistic model using maximum likelihood in the binomial family. Using the glm() function in Splus to fit the conditional probabilities Pr(y,|yj, j ^ i,x) separately, not only can we estimate the parameters, but also we can assess the compatibility conditions for the conditionally specified distributions. For example, after using the glm() function twice to fit model (3.30) for two binary response variables REM (V21), NEM (V22) and one covariate AGE (VI) in the MCR data set, we have the results in Table 5.3. From Table 5.3, in addition to obtaining the 31 Table 5.1: Description for the Subset of MCR Data Variable Name Codes/Values Abbreviation VI Age Years AGE V2 Sex 0=Male SEX 1=Female V3 Prior Myocardial Infarction 0=No l=Yes PMI V4 Obesity 0=No l=Yes OBE V5 Chronic Obstructive Pulmonary Disease 0=No l=Yes COP V6 Diabetes 0=No l=Yes DIA V7 Renal Disease 0=No l=Yes REN V8 Hypertension 0=No l=Yes HTN V9 Alcohol Abuse 0=No l=Yes ETO V10 Cancer 0=No l=Yes CA Vll Liver Disease 0=No l=Yes LIV V12 Central Nervous System Disease 0=No l=Yes CNS V13 Prior Cerebrovascular Accident 0=No l=Yes PCA V14 Rheumatic Heart Disease 0=No l=Yes RHE 32 Table 5.2: Description for the Subset of MCR Data (continued) Variable Name Codes/Values Abbreviation V15 Other Surgery 0=No l=Yes OTH V16 No Surgery 0=No l=Yes NON V17 Valve Replacement 0=No l=Yes VAL V18 Left Ventricular Dysfunction 0=Normal LVD 1= 40-49% 2= 30-39% 3= 20-29% 4= < 20% V19 Body Surface Area Square meters BSA V20 Discharge/30 Day Status 0=Dead STA 1=Alive V21 Renal Complication (Mild or Severe) 0=No l=Yes REM V22 Neurological Complication 0=No l=Yes NEM (Mild or Severe) V23 Pulmonary (Mild or Severe) 0=No l=Yes PUM V24 Myocardial Infarction 0=No l=Yes MI V25 Low Out Syndrome (Mild or Severe) 0=No l=Yes LOM V26 Sepsis 0=No l=Yes SEP 33 Table 5.3: Results from Separate Logistic Regressions with Response Variables REM, NEM Parameter Value Std. Error t value -8.9190 1.5952 -5.5912 0.0818 0.0229 3.5700 712 1.4760 0.3914 3.7712 a2 -7.1316 1.0204 -6.9900 0.0705 0.0149 4.7296 1.4521 0.3959 3.6676 721 estimate of the parameters, more important, we find that 7 1 2 is quite close to 721 and this gives an indication of the adequacy of fit of model (3.30) for these data. Secondly, model (3.30) can be viewed as an exponential family model c _ 1 e x p { A - 1 s } with "sufficient" statistic vector s = (su...,sM) = (yi, • • • ,yn,yiy2,ym, • • • ,yn-iyn,xiyj,i = l,...,m,j Given data of the form (yn,..., i/, n , xa,..., £ 4 m ), i = 1,...,N, = l,...,n). the sufficient statistic vector is X2,=i s«- Using the Newton-Raphson procedure for an exponential family loglikelihood, we can obtain the maximum likelihood estimate of the parameters in model (3.30) by an iterative process as: Ar+1 = A r - H " 1 ( A r ) u ( A r ) for r = 0 , 1 , 2 , . . . , where u(A) is a vector whose j t h component is the first derivative of the log-likelihood with respect to the jth parameter and the matrix H(A) (sometimes called the Hessian matrix) is the matrix of second partial derivatives of log-likelihood function. A0, a vector of initial estimates of A, can come from the results of using the glm() function, separately for each response variable. 34 In order to use the Newton-Raphson method to estimate parameters in model (3.30), A C program (see Appendix) was written for analyzing the MCR data set. The Newton-Raphson procedure means that M(M + 1 ) / 2 second derivative evaluations, M first derivative evaluations and a matrix inverse are needed even before the linear search is attempted. In order to simplify these problems, the quasi-Newton method is suggested as the third choice in obtaining the estimate of the parameters in model (3.30) since in the quasi-Newton method, a method is used to approximate H _ 1 directly from the first derivative information available at each step of the iteration (sometimes this approximation can be inaccurate) (see Nash 1990). To use the quasi-Newton method for analyzing our data, another C program was written. As in the first C program, we get initial values from the separate logistic regressions. 5.3 Comparison of Three M e t h o d s All three methods (the glm() function, the Newton-Raphson method and the quasiNewton method) we have introduced can be used to fit model (3.30), but what are computational differences among them? By means of the MCR data set, we use the three methods to estimate the parameters in model (3.30) under different cases, for example: • Case 1: one binary variable - REM (V21), one covariate - LIV (VI1). (See Table 5.4). • Case 2: one binary variable - REM (V21), two covariates - AGE (VI) and LIV ( V l l ) . (See Table 5.5). • Case 3: two binary variables - REM (V21) and NEM (V22), one covariate - SEX (V2). (See Table 5.6). 35 • Case 4: two binary variables - REM (V21) and NEM (V22), two covariates - SEX (V2) and COP (V5). (See Table 5.7). • Case 5: three binary variables - REM (V21), NEM (V22) and PUM (V23), two covariates - SEX (V2) and COP (V5). (See Table 5.8 and Table 5.9). • Case 6: four binary variables - REM (V21), NEM (V22),PUM (V23) and MI (V24), three covariates - AGE (VI), SEX (V2) and PMI (V3). (see Table 5.10, Table 5.11 and Table 5.12). From the computations for the MCR data set with three methods, we find that: 1. The glm() function in Splus is an approximate method, though from the standard errors in Table 5.4 to Table 5.12, it appears that both the glm() function and the Newton-Raphson method generate rather similar estimates of the parameters. It is less convenient than other two methods since the number of many times we need to run the glm() function depends on the number of binary response variables. An advantage of the glm() function is that one can assess the goodness of fit by a rough comparison of 7 ^ and 7,-,- relative to their standard errors from the glm() function before correctly applying model (3.30) to a data set. 2. Sometimes the quasi-Newton method generates inaccurate estimated standard errors. For example, in Table 5.10, the standard error of ax is 1.09 when using the quasi-Newton method, but when using the glm() function or the Newton-Raphson method, the standard errors of ct\ are 1.71 and 1.64, respectively. Also in Table 5.10, using the quasi-Newton method, the standard error of 0:3 is 0.94, but using the glm() function or the Newton-Raphson method, the standard errors of 03 are 0.52 and 0.52, respectively. 36 Table 5.4: Method Comparison under Case 1 glm() function Newton-Raphson quasi-Newton -3.2058 -3.2059 -3.2059 Std. Error 0.1745 0.1749 0.1748 value 1.4141 1.4141 1.4140 Std. Error 1.0941 1.0942 1.0108 -146.5499 -146.5495 oil value /?ii loglikelihood 3. T h e computational time for using the Newton-Raphson method to fit model (3.30) is much less than that for using the quasi-Newton method, but it requires more programming effort to compute second derivatives, first derivatives and a matrix inverse, etc. 4. T h e model (3.30) appears to fit adequately in all cases considered here, based on a check of the requirement 7,-j = 7,-,- using glm() function. One could do other standard checks for the adequacy of the logistic regressions (eg. Hosmer and Lemeshow, 1989). 5.4 Selection of Covariates Another point worth mentioning is that, as stated before, the clinical researchers or surgeons are only interested in those polychotomous responses indicating the patient's survival status and some complication variables which are related to the quality of life after surgery, and there is the question of how to choose the covariates to use. Generally speaking, only those covariates which show potential relation with the response variable are selected. In this section, two methods for selection of covariates are suggested. 37 Table 5.5: Method Comparison under Case 2 glm() function ai flu /?12 value -9.6722 Newton- Raphson quasi-Newton -9.6723 -9.3933 Std. Error 1.5862 1.5900 1.5782 value 0.0966 0.0966 0.0926 Std. Error 0.0224 0.0225 0.0223 value 1.8047 1.8048 1.7787 Std. Error 1.1188 1.1192 1.0257 -135.1319 -135.1475 loglikelihood Table 5.6: Method Comparison under Case 3 glm() function «i value -3.7141 Std. Error a2 /?n #21 712 721 0.2509 value -2.5473 Newton- Raphson quasi-Newton -3.7140 -3.7140 0.2536 0.2431 -2.5474 -2.5475 Std. Error 0.1504 0.1515 0.1409 value 0.5764 0.5756 0.5756 Std. Error 0.3586 0.3603 0.3335 value 0.1043 0.1043 0.1043 Std. Error 0.2669 0.2683 0.2587 value 1.8547 1.8547 1.8548 Std. Error 0.3797 0.3810 0.3542 value 1.8546 Std. Error 0.3806 -392.3369 -392.3369 loglikelihood 38 Table 5.7: Method Comparison under Case 4 Newton- Raphson quasi-Newton -3.9947 -3.9947 0.2835 0.2825 0.2721 value -2.5258 -2.5251 -2.5252 0.1530 0.1543 0.1273 value 0.5212 0.5175 0.5175 glm() function c*i value -3.9967 Std. Error a.?, Std. Error /?n fi\2 #21 Std. Error 0.3667 0.3651 0.3402 value 1.5225 1.5214 1.5214 Std. Error 0.3997 0.3997 0.3580 value 0.1155 0.1116 0.1116 0.2686 0.1796 -0.2863 -0.2863 Std. Error #22 7i2 721 0.2672 value -0.2900 Std. Error 0.4156 0.4186 0.3217 value 1.9178 1.9166 1.9167 Std. Error 0.3922 0.3921 0.3641 value 1.9173 Std. Error 0.3914 -386.2089 -386.2089 loglikelihood 39 Table 5.8: Method Comparison under Case 5 «i glm() function Newton- Raphson quasi-Newton -4.9749 -4.9244 -4.9245 0.4487 0.4439 0.4485 -3.0887 -3.0889 -3.0890 0.2194 0.2198 0.2222 value -0.8458 -0.8435 -0.8436 Std. Error 0.0921 0.0920 0.0924 value 0.5686 0.5204 0.5204 Std. Error 0.3718 0.3693 0.3714 value 1.1336 1.0935 1.0934 Std. Error 0.4100 0.4104 0.4129 value 0.1149 0.1141 0.1141 Std. Error 0.2729 0.2723 0.2731 value -0.5904 -0.5917 -0.5917 Std. Error 0.4215 0.4208 0.4223 -0.0081 -0.0086 -0.0086 Std. Error 0.1646 0.1646 0.1669 value 1.2974 1.2883 1.2883 Std. Error 0.2408 0.2409 0.2420 value Std. Error «2 value Std. Error «3 /?u /?i2 /?2i /?22 fci /3 32 value 40 Table 5.9: Method Comparison under Case 5 (continued) 712 7 13 723 721 73i 732 glm() function Newton- Raphson quasi-Newton value 1.5391 1.4974 1.4974 Std. Error 0.4038 0.4040 0.4079 value 1.8039 1.7854 1.7855 Std. Error 0.4652 0.4712 0.4767 value 1.2493 1.2503 1.2503 Std. Error 0.2660 0.2662 0.2692 value 1.4963 Std. Error 0.4039 value 1.8062 Std. Error 0.4655 value 1.2615 Std. Error 0.2661 -924.4926 -924.4912 loglikelihood 41 Table 5.10: Method Comparison under Case 6 glm() function ai Newton- Raphson quasi-Newton -10.1998 -10.1091 1.7149 1.6367 1.0864 value -8.0159 -7.9365 -7.7579 1.0736 1.0620 1.0129 value -1.0627 -1.0352 -0.9465 0.5186 0.9363 -3.1701 -3.0885 0.8946 0.8842 0.9967 value 0.0800 0.0752 0.0739 0.0242 0.0230 0.0181 value 0.2847 0.2913 0.3136 value -10.3369 Std. Error a.1 Std. Error 0:3 Std. Error c*4 value -3.2307 Std. Error /?n Std. Error /?i2 /?i3 /?2i Std. Error 0.3843 0.3753 0.7905 value 0.6403 0.9493 0.9395 Std. Error 0.2373 0.2790 0.3493 value 0.0770 0.0758 0.0732 0.0154 0.0157 -0.1271 -0.1166 0.2793 0.3682 -0.3429 -0.3420 0.2012 0.3460 Std. Error /?22 0.01554 value -0.1451 Std. Error /?23 0.5241 0.2807 value -0.3320 Std. Error 0.1887 42 Table 5.11: Method Comparison under Case 6 (continued) Newton-Raphson quasi-Newton 0.0014 0.0001 0.0084 0.0083 0.0148 value 0.1598 0.1603 0.1629 0.1888 0.2271 -1.6642 -1.6608 0.1313 0.1411 -0.0018 -0.0027 0.0139 0.0165 -0.2292 -0.2357 0.3342 0.9272 -0.4903 -0.5051 glm() function #31 value 0.0021 Std. Error # 32 Std. Error /333 value -1.6840 Std. Error #41 712 713 714 0.0141 value -0.2366 Std. Error #43 0.1331 value -0.0008 Std. Error #42 0.1890 0.3350 value -0.4165 Std. Error 0.2195 0.2291 0.3361 value 1.2000 1.1181 1.1284 Std. Error 0.4157 0.4216 0.9704 value 2.3946 2.7162 2.7072 Std. Error 0.4760 0.5087 0.5792 value 0.1531 0.0198 0.0403 0.6542 0.9719 Std. Error 0.6662 43 Table 5.12: Method Comparison under Case 6 (continued) 723 724 734 721 73i 732 741 glm() function Newton-Raphson quasi-Newton value 1.0296 0.9935 0.9925 Std. Error 0.3040 0.3076 0.6812 -0.1803 -0.1831 value -0.2330 Std. Error 0.4875 0.4765 0.9951 value 1.1814 1.1522 1.114 Std. Error 0.3862 0.3446 0.6171 value 1.0650 Std. Error 0.4304 -994.3497 -994.4018 value 2.8826 Std. Error 0.5216 value 1.0161 Std. Error 0.3135 value -0.0134 Std. Error 742 743 0.6564 value -0.170 Std. Error 0.4762 value 1.1795 Std. Error 0.3533 loglikelihood 44 5.4.1 Odds Ratio Analysis When the variables are binary, a natural way to measure the association is the odds ratio (as well as its logarithm). Consider two binary variables Y\ and Y2, where Y{ is coded 1 and 0. Let pij = Pr(Yi = i,Y2 — j), i,j = 0 , 1 , so that the odds ratio, ip, is defined as: , PooPn v= PoiPw From a random sample of size N, we generate the following 2 x 2 contingency table with Y1 r2 = o = O r1 = i a b r2 = i c d T h e estimated success probabilities in the two data sets are poo = a/N, plo = b/N, Poi — c/N, and p\\ = d/N, and so the estimated odds ratio, ^/>, is given by 1 _ PooP\\ _ ad PwPoi be An interval estimate or confidence interval (CI) of the odds ratio is more useful than a point estimate for measuring the association. Also we note that the logarithm of the estimated odds ratio is better approximated by a normal distribution. The approximate standard error of the estimated log odds ratio, log 0 , can be shown to be given by s.e.(logip) » i/(- + - + - + -) V a b c a An outline proof of this result is given in Schlesselman (1982). An approximate 100(1 — a)% confidence interval for xjj is exp[log tj> ± za/2s.e.(log 0)] where Zp is the upper /3 quantile of the standard normal distribution (see Hosmer and Lemeshow 1989). 45 Table 5.13, Table 5.14 and Table 5.15 provide the odds ratios for response variable REM (V21), NEM (V22) and PUM (V23) respectively. For more detailed information, see Zhang (1993). Statistically, only those 95% CI not containing 1 are more strongly related with the variable. Using odds ratio analysis, we conclude that covariates REN (V7), LIV (VI1), CNS (V12), COP (V5), DIA (V6), and HTN (V8) are strongly related with the variable REM (V21); covariates CNS (V12), CA (V10), OBE (V4), REN (V7), and HTN (V8) are strongly related with the variable NEM (V22); covariates COP (V5), OBE (V4), REN (V7), ETO (V9), CNS (V12), DIA (V6), and HTN (V8) are strongly related with the variable PUM (V23). 5.4.2 Univariate Analysis and Comparison of Models An alternative for deciding the covariates to use is based on the differences of loglikelihoods. The difference in the log-likelihoods of two nested models measures the extent to which the additional terms improve the fit of the model to the observed response variables. Take model (3.30) which has n binary response variables, for example. Suppose that two linear logistic models, model (1) and model (2), say, are to be compared, where the two models are as follows: • Model (1): / o ^ P r ( y ; = % - , j ^ »,*,-*, 1 < k < m - 1) = a, + E?J? PijXik + • Model (2): logit?i{Yi = l\yjtj ^ i,xik,l 46 < k < m) = a,- + E 7 = i^a + Table 5.13: Odds Ratio for REM Variable Heading $ s.e.(log^) 95 % CI of ^ V2 SEX 1.48 0.21 (0.97, 2.26) V4 OBE 1.13 0.37 (0.54, 2.37) V5 COP 1.86 0.26 (1.11,3.11) positively related V6 DIA 1.80 0.24 (1.11, 2.91) positively related V7 REN 9.78 0.22 (6.30, 15.2) positively related V8 HTN 1.75 0.20 (1.18, 2.62) positively related V9 ETO 1.34 0.47 (0.53, 3.95) Vll LIV 3.58 0.62 (1.04, 12.3) positively related V12 CNS 3.40 0.33 (1.75, 6.61) positively related V13 PCA 1.33 0.60 (0.40, 4.34) V14 RHE 0.88 0.59 (0.27, 2.85) V15 OTH 0.90 0.52 (0.32, 2.50) V17 VAL 0.64 1.01 (0.08, 4.76) 47 Remark Table 5.14: Odds Ratio for NEM Variable Heading rP s.e.(log^) 95 % CI of $ Remark V2 SEX 1.34 0.15 (0.99, 1.81) V4 OBE 2.11 0.21 (1.37, 3.25) V5 COP 1.22 0.21 (0.80, 1.84) V6 DIA 1.38 0.18 (0.96, 1.98) V7 REN 1.62 0.23 (1.02, 2.57) positively related V8 HTN 1.49 0.14 (1.12, 1.97) positively related V9 ETO 1.40 0.32 (0.73, 2.68) V10 CA 2.96 0.43 (1.26, 6.93) Vll LIV 2.20 0.55 (0.74, 6.53) V12 CNS 3.92 0.24 (2.41, 6.39) V13 PCA 2.02 0.37 (0.97, 4.19) V14 RHE 0.81 0.43 (0.34, 1.88) V15 OTH 0.94 0.35 (0.47, 1.90) V17 VAL 0.58 0.73 (0.14, 2.46) 48 positively related positively related positively related Table 5.15: Odds Ratio for PUM Variable Heading I s.e.(log^) 95 % CI of $ Remark V2 SEX 1.08 0.09 (0.90, 1.30) V4 OBE 3.38 0.16 (2.45, 4.66) positively related V5 COP 3.45 0.13 (2.66, 4.48) positively related V6 DIA 1.91 0.11 (1.52, 2.40) positively related V7 REN 3.32 0.16 (2.41, 4.58) positively related V8 HTN 1.86 0.08 (1.57, 2.20) positively related V9 ETO 2.21 0.21 (1.46, 3.36) positively related V10 CA 0.56 0.41 (0.25, 1.25) V12 CNS 2.12 0.21 (1.39, 3.21) V13 PCA 0.67 0.29 (0.37, 0.91) V14 RHE 0.97 0.23 (0.61, 1.54) V15 OTH 0.58 0.22 (0.37, 0.91) V17 VAL 1.03 0.34 (0.52, 2.01) 49 positively related Denote the maximized log-likelihoods under model (1) and model (2) by L\ and Li, respectively, so that twice the difference in the log-likelihoods of two models is 2[L2 - Lx] Asymptotically (large sample size N), the distribution of this statistic has a x2 distribution with degrees of freedom v\ — v-i equal to the difference in the number of the parameters between two models (see Cox and Hinkley (1974)). In our example, since we have n binary response variables, that is, v-± — v-j, = n when adding an extra covariate xm to model (1) which already has covariates x\, x2,..., ar m -i- Using the difference between the log-likehood functions of two models, we can find out how much the covariate xm improves the fit of the model to the observed response variables. Other methods for selection of covariates will be introduced in the next section. 5.5 Other Issues Before concluding this chapter, another question to discuss is how much the association of the response variables depends on the covariates. This is one inference of interest. Another reason that this issue arises is that some methods, for example, Generalized Estimating Equations, seem to assume the correlation of two response variables does not depend on covariates. What is the case for our data? 5.5.1 General As we know, with binary variables, the odds ratio (as well as its logarithm) is a natural measure of association. Considering the 2 dimensional case for model (3.30) where there exists only two binary response variables, the odds ratio is P r ( l , l[x) Pr(0, Q|x) _ exp[( Q l + ftx) + ( a 2 + ftx) + 712] _ P r ( l , 0|x) Pr(0, l | x ) e x p ( a i + frx) exp(a 2 + ftx) 50 ( ex P^ 12 . ^ Hence, the logarithm of the odds ratio is 712 and it does not depend on the covariates. However, it is not true that the odds ratio does not depend on the covariates when the number of binary response variables is greater than 2. Take the 3 dimensional case in model (3.30), for a particular instance. Based on model (3.30), the joint distribution of Yi and Yj is Pr(y,-,yj|x) = exp{(a fc Pr(y«>yj>y*l x ) Pr(yk\yi,yj,x) c'1 e x p { E f = i ( a , + E t e i Pitxt)yj + Ei<i<j<3 ajViVj} + E f c i PktXt + HkVi + 1jkyj)yk}/l + exp(a fc + E£Li PktXt + 7,-*y,- + Jjkyj) m = c _1 [1 + exp(a fc + J2 PktXt + HkVi + IjkVj)] t=\ m • exp{(o!,- + ^2PitXt)yi t=i m + ( a , + J2Pjtxt)Vj + 7.j2/»2/j} t=i for (t, j , k), where c = 1 + E L i exp(a,- + E £ i #***) + E i < , ^ < 3 exp[(a t - + a,-) + ET=i{Pit + Pit)xt + tij] + exp[E- = 1 «. + ES=i Ef=i Pitxt + Ef=i 7tf]As a result, the odds ratio for Yi and 1} is Pr(Yj = 0,Yj= 0|x) Pr(y;- = 1, Y3•, = l | x ) Fv(Yi = 0,Yj = l | x ) P r ( X = 1,^- = 0|x) [1 + exp(ak + E*=i &<**)] [1 + exp(a fc + E £ i #w*t + 7*i + 7*i)] exp(7.i)(5.1) [1 + exp(a* + E £ i PktXt + 7 H ) ] [ 1 + e x P ( « * + E £ i PktXt + Jkj)] This form implies that when there exist three binary response variables in model (3.30), the odds ratio for arbitrary two binary response variables indeed depends on the covariates x. Using similar steps as above, we can get the same conclusions for the case of more than three binary response variables in model (3.30). Next, we take two examples to illustrate how the association of binary response variables varies with covariates for our data. In each example, we will separately consider model (3.30) with 51 • one covariate: COP (V5); • two covariates: C O P (V5), and CNS (V12); • three covariates: COP (V5), CNS (V12), and REN (V7); • four covariates: C O P (V5), CNS (V12), REN (V7), and HTN (V8); • five covariates: C O P (V5), CNS (V12), REN (V7), HTN (V8), and OBE (V4). 5.6 Examples 5.6.1 Example 1 In our first example, only two complication variables, REM (V21) and NEM (V22) are considered as the response variables in model (3.30). After using the Newton-Raphson method to fit model (3.30), the log-odds ratio for REM and NEM is simply equal to the estimate value of the parameter 712. Table 5.16 presents the log-odds ratio for REM, NEM and its standard error (showed in parenthesis) under different number of covariates. From Table 5.16, we find that the log-odds ratio for REM and NEM decreases when we add an extra covariate. As a matter of fact, we say the association of two binary response variables varies with covariates, but it is not significantly so when considering the standard errors. Considering the confidence interval of the odds ratio for REM and NEM, we find that all 95% confidence intervals under the above five cases do not contain 1 (see Table 5.17). This implies that variable REM and variable NEM are strongly related each other whatever covariates we choose. Now let us return to the topic of Section 5.4. We mentioned that the odds ratio analysis and another approach based on the differences of log-likehhoods, can suggest which 52 Table 5.16: The Association of REM (V21) and NEM (V22) Covariates Log-Odds Ratio Std. Error C0P(V5) 1.9296 0.3909 COP CNS(V12) 1.7422 0.4070 COP CNS REN(V7) 1.7005 0.4180 COP CNS REN HTN(V8) 1.6938 0.4184 COP CNS REN HTN OBE(V4) 1.6842 0.4219 Table 5.17: CI of Odds Ratio for REM (V21) and NEM (V22) Covariates 95% CI of odds ratio COP(V5) (3.2010, 14.8166) COP CNS(V12) (2.5715, 12.6785) COP CNS REN(V7) (2.4139, 12.4257) COP CNS REN HTN(V8) (2.3959, 12.3524) COP CNS REN HTN OBE(V4) (2.3568, 12.3186) 53 covariates to use. Here introduce two related methods: the first method is to use the ratios of the estimates of the coefficients for the covariates to their standard errors: (3/s.e.(p); the second uses differences in the values of log-likelihood function to assess covariates. For different number of covariates, Table 5.18 and Table 5.19 list the estimates of the coefficients for the covariates (/?'s), corresponding standard errors and their ratios when applying the Newton-Raphson method to fit model (3.30). Statistically, if the absolute value of a ratio is large than 1.96, this might suggest that we need to consider the variable of this coefficient as our covariate in model (3.30). After checking the ratios in Table 5.18 and Table 5.19, we conclude 1. Among the five covariates: COP (V5), CNS (V12), REN (V7), HTN (V8) and OBE (V4), covariate COP and covariate REN are most important for response variable REM. 2. For response variable NEM, covariate CNS and covariate O B E are more important than other three covariates: COP, REN and HTN. 3. Covariate HTN seems to be less important than other covariates for both response variable REM and response variable NEM. HTN appears not be significant when several covariates are included, possibly because it is strongly associated with the other covariates. In last section, we know, from Table 5.13 and Table 5.14, that of the five covariates, four of them: COP, CNS, REN and HTN are individually strongly related with the response variable REN; four of them: CNS, REN, HTN and OBE are individually strongly related with the response variable NEM. Comparing with the conclusion from checking the ratios 54 Table 5.18: The Estimates of the Coefficients under All Five Cases Number of Covariates Fi=REM COP (V5) CNS (V12) REN (V7) HTN (V8) OBE (V4) 1 2 3 4 5 /? u 1.5516 1.5344 1.4579 1.4532 1.4644 Std. Error 0.3982 0.4021 0.4128 0.4119 0.4110 Ratio 3.8965 3.8160 3.5317 3.5280 3.5630 fa 1.2536 0.9634 0.9432 0.9569 Std. Error 0.5372 0.5666 0.5668 0.5637 Ratio 2.3336 1.7003 1.6641 1.6975 0i3 1.5753 1.5471 1.5515 Std. Error 0.4242 0.4286 0.4272 Ratio 3.7136 3.6097 3.6318 0i4 0.1658 0.1567 Std. Error 0.3705 0.3722 Ratio 0.4475 0.4210 0i5 0.1593 Std. Error 0.5358 Ratio 0.2973 55 Table 5.19: The Estimates of the Coefficients under All Five Cases (continued) Number of Covariates F 2 =NEM COP (V5) CNS (V12) REN (V7) HTN (V8) OBE (V4) 1 2 3 4 5 -0.2800 -0.3023 -0.3144 -0.3228 -0.3278 Std. Error 0.4183 0.4251 0.4270 0.4258 0.4240 Ratio 0.6694 -0.7111 -0.7363 -0.7581 -0.7731 02i 0 22 1.5045 1.4802 1.4662 1.4477 Std. Error 0.4082 0.4121 0.4130 0.4161 Ratio 3.6857 3.5918 3.5501 3.4792 023 0.1909 0.1489 0.1158 Std. Error 0.4117 0.4154 0.4155 Ratio 0.4637 0.3584 0.2787 024 0.2050 0.1244 Std. Error 0.2545 0.2582 Ratio 0.8055 0.4818 025 0.9050 Std. Error 0.3432 Ratio 2.6369 56 Table 5.20: The Log-likelihood Functions with Different Number of Covariates Covariates Log-likelihood COP(V5) -387.4225 COP CNS(V12) -377.1679 COP CNS REN(V7) -370.5885 COP CNS REN HTN(V8) -370.0956 COP CNS REN HTN OBE(V4) -366.7212 in Table 5.18 and Table 5.19, partly we can say that to choose covariates, checking the ratios of the estimates of the coefficients for the covariates to their standard errors is an additional method to the odds ratio analysis. In the meanwhile, we also calculate the log-likelihood function when response variables are two complication variables, REM (V21) and NEM (V22), and covariates are one of the above five cases (see Table 5.20). Table 5.20 shows that the log-likelihood function increases when we add an extra covariate. Especially, we find that when we add covariate H T N to covariates COP, CNS and REN, the log-likelihood function only increases by 0.4929, less than other increase. Again, this may suggest that covariate HTN is less important than other covariates for both response variables. To conclude, using multivariate binary model (3.30), we can do the risk factor analysis for several response variables (or complication variables) simultaneously. 5.6.2 Example 2 Now, we briefly discuss another example for checking the 3 dimensional case where there exist three binary response variables, REM (V21), NEM (V22), and P U M (V23) in model (3.30). 57 Table 5.21: T h e Log-likelihood Functions of Three Response Variables with Different Number of Covariates Covariates Log-likelihood C0P(V5) -925.7063 C0P(V5) CNS(V12) -910.6644 15.0419 C0P(V5) CNS(V12) REN(V7) -889.7837 20.8807 C0P(V5) CNS(V12) REN(V7) HTN(V8) -882.5742 7.2095 C0P(V5) CNS(V12) REN(V7) HTN(V8) 0BE(V4) -873.7382 8.8360 Difference In Table 5.21, we present the values of log-likelihoods under different number of covariates when using the Newton-Raphson method and the differences of log-likelihoods when adding another covariate to previous case. Clearly, as previous example the loglikelihood function increases when we add an extra covariate. Considering differences, it seems that t h e last two covariates H T N and O B E are not more important than other covariates, but comparing with x\i they are still statistically significant. We choose the third case: three covariates COP, CNS and REN for our final analysis. Similar to Table 5.18 and Table 5.19, Table 5.22 deals with three binary response variable REM, NEM, P U M and three covariates COP, CNS, REN in model (3.30). Checking the ratios in Table 5.22, we say that among these three covariates, CNS is not significant for binary response variable REM; C O P and REN is not significant for binary response variable NEM. After using the Newton-Raphson method to fit model (3.30) for the third case, in addition to t h e estimates of t h e coefficients listed in Table 5.22, we also obtained other estimates of the parameters in model (3.30). Therefore, through equation (5.1) the 58 Table 5.22: The Estimates of t h e Coefficients under t h e Third Case COP (V5) CNS (V12) REN (V7) REM (V21) NEM (V22) PUM (V23) Coefficient 1.1318 -0.5786 1.3013 Std. Error 0.4189 0.4254 0.2447 Ratio 2.7018 -1.3601 5.3179 Coefficient 0.7580 1.2370 1.1185 Std. Error 0.5554 0.4180 0.4131 Ratio 1.3648 2.9593 2.7076 Coefficient 1.2210 -0.1104 1.5234 Std. Error 0.4280 0.4110 0.2988 Ratio 2.8528 0.2686 5.0984 marginal odds ratio for R E M and NEM is the form of [1 + exp(c*3 + E L i fcxj)][l + e x p ( a 3 + £ L i fax, + 7w + 723)] [1 + e x p ( a 3 + Ef=i fax; + 713)] [1 + e x p ( a 3 + £ ? = i fax, + 723)] exp(7i 2 ) 1 + exp(-0.9899 + 1.3013si + 1.1185s 2 + l-5234x 3 ) ^ * ' ' 1 + exp(-0.9899 + 1.3013*1 + 1.1185x2 + 1.5234x3 + 1.5186) 1 + exp(-0.9899 + 1.3013si + 1.1185x2 + 1.5234x3 + 1.5186 + 1.1775) 6XP 1 + exp(-0.9899 + 1.3013a?! + 1.1185x2 + 1.5234x3 + 1.1775) 1 + exp(-0.9899 + 1.3013ai + 1.1185x2 + 1.5234x3) 1 + exp(1.7062 + 1.3013X! + 1.1185x2 + 1.5234x3) 1 + exp(0.1687 -f 1.3013X! + 1.1185x2 + 1.5234x 3 ) " 1 + exp(0.1876 + 1.3013X! + 1.1185x2 + 1.5234ar3) ' ' (5.2) As a function of t h e covariates COP, CNS and REN, the marginal odds ratio for REM and NEM varies with t h e covariates for model (3.30). In fact, t h e association of the response variables depends strongly on the covariates in this 3 dimensional case. From Table 5.1 and Table 5.2, we know all covariates COP, CNS and REN are binary variables. Substituting different values of the covariates for equation (5.2), we give Table 5.23 to show t h e changes of t h e marginal odds ratio for R E M and NEM. From 59 Table 5.23: The Marginal Odds Ratio for REM and NEM Marginal Odds Ratio COP (V5) CNS (V12) REN (V7) Xx = 0 X2 = 0 x3 = 0 0.8490 Xx = 1 X2 = 0 x3 = 0 0.4464 Xx = 0 X2 = 1 x3 = 0 0.4800 xx = 0 x2 = 0 x3 = 1 0.4124 xx = 1 x2 = 1 x3 = 0 0.3291 Xx = 1 X2 = 0 x3 = 1 0.3094 xx = 0 x2 = 1 x3 = 1 0.3173 Xx = 1 X2 = 1 X3 = 1 0.2826 these marginal odds ratios in Table 5.23, we conclude that of three covariates, the most important covariate for the changes of the marginal odds ratio for REM and NEM is REN, next is COP. 60 Bibliography [1] Arnold, Barry C , Castillo, Enrique and Sarabia, Jose-Maria (1992). Conditionally Specified Distributions. Springer-Verlag. [2] Arnold, B.C. and Strauss, D. (1988). Bivariate distributions with exponential conditionals. JASA 83, 522-527. [3] Bonney, George Ebow (1987). Logistic regression for dependent binary observations. Biometrics 43, 951-973. [4] Chambers, John M., Hastie, Trevor J. and AT&T Bell Laboratories (1992). Statistical Models in S. Wadsworth & Brooks / Cole Advanced Books & Software Pacific Grove, California. [5] Collett, D. (1994). Modelling Binary Data. Chapman & Hall. [6] Nash, J C (1990). Compact Numerical Methods For Computers Linear Algebra and Function Minimisation. Second Edition. Adam Hipger, Bristol and New York. [7] Connolly, Margaret A. and Liang, Kung-Yee (1988). Conditional logistic regression models for correlated binary data. Biometrika 75, 3, 501-506. [8] Cox, D.R. and Hinkley, D.V. (1974). Theoretical Statistics. Chapman Hall, London. [9] Gelman, A. and Speed, T.P. (1993). Characterizing a joint probability distribution by conditionals. J. Roy. Statist. Soc. B 55 , 185-188. 61 [10] Hosmer, D.W. and Lemeshow, S. (1989). Applied Logistic Regression. Wiley, New York. [11] McCullagh, P., and Nelder, J.A. (1989). Generalized Linear Models. Second Edition. Chapman Hall, London. [12] Press, W.H., Flannery, B.P., Teukolsky, S.A. and Vetterling, W . T . (1988). Numerical Recipes in C: the Art of Scientific Computing. Cambridge University Press, Cambridge, Eng. [13] Schlesselman, J.J. (1982). Case-Control Studies: Design, Conduct, Analysis. Oxford University Press, New York. [14] Zhang, Hongbin (1993). Risk prediction models for binary response variable for the coronary bypass operation (M.Sc. Thesis). Dept. of Statistics, University of British Columbia. 62 Appendix C Program for Applying the Newton-Raphson Method #include <stdio.h> # i n c l u d e <math.h> # d e f i n e N 900 # d e f i n e NP 150 # d e f i n e NC 20 mainO { d o u b l e t [NP] [NP] ,h[NP] [NP] , s [N] [NP] ,x[N] [NC] ,y[N] [NC] ; d o u b l e c [ N ] , l m [ N P ] , s u f [NP] ; double dif,mx,prod,sum,lik,logc; int iter,i,j,k,d,n,m,nn,kf,ii,M,Ml,jj[6]; double **a,**b,**dmatrix(); FILE *fp; void d2b(),gaussj(); if ((fp = fopen("name,,,"r")) »- NULL) { printf("file not found\n"); exit(); } scanf("y.d */.d '/.d", &n, &m, &nn) ; /* n is dimension of binary response variables m is dimension of covariates nn is the number of observation M=n+n*m+n*(n-l)/2; M1=M+1; /* M is the number of parameters Next step is to get initial values of the parameters 63 and read values for binary response variables and covariates from a file "name" f or(i=l; i<=M; i++) f scanf (fp, '7.F" ,&lm[i] ) ; for(i=l;i<=M;i++) printf ('7.6.3f", lm[i]); printf ("\n") ; for(k=l;k<=nn;k++) { for(i=l;i<=m;i++) f scanf (fp,ny.F",&x[k] [i]) ; for(j=l;j<=n;j++) fscanf (fp,"'/.F",&y[k] [j]); } printf("\n"); for(i=l,kf=l;i<=n;i++) kf*=2; printf ("n=y.d, m=y.d, nn=y.d, M=*/,d, kf=y.d\n", n,m,nn,M,kf) ; for(i=l;i<=M;i++) suf[i]=0.; /* Next, calculate sufficient statistics — suf */ for(k=l;k<=nn;k++) { for(i=l;i<=n;i++) { s[k][(i-l)*m+i]=y[k][i]; for(j=l;j<=m;j++) s[k] [(i-l)*m+i+j]=y[k] [i]*x[k] [j] ; } for(i=l;i<=n-l;i++) { for(j=i+l;j<=n;j++) s[k][n+n*m+n*(i-l)-i*(i-l)/2+j-i] -y[k][i]*y[k][j]; } } for(i=l;i<=M;i++) { for(k=l;k<=nn;k++) s u f [ i ] + = s [ k ] [ i ] ; printf('7.8.4f", suf[i]); if(i*/.10==0) p r i n t f ( " \ n " ) ; 64 } printf("\n"); dif=l.; iter=0; a = dmatrix(l,M,l,M); b = d m a t r i x ( l , M , l , l ) ; while(iter<30 && d i f > l . e - 4 ) { for(i=l;i<=M;i++) { for(j=l;j<=Ml;j++) h [ i ] [ j ] = 0 . 0 ; } for(k=l;k<=nn;k++) { c[k]=0.; for(i=l;i<=M;i++) { for(j=l;j<=Ml;j++) t [ i ] [j]=0.0; } /* c a l l s r o u t i n e , d 2 b ( n , i i , j j ) : r e t u r n s p o s s i b l e binary vector j j of dimension n */ for(ii=l;ii<=kf;ii++) { d2b(n,ii,jj); for(i=l;i<=n;i++) •C s [ k ] [ i + ( i - l ) * m ] = ( d o u b l e ) j j [ i ] ; for(j=l;j<=m;j++) s [ k ] [ ( i - l ) * m + i + j ] = (double)(jj[i])*x[k][j]; } for(i=l;i<=n-l;i++) { for(j=i+l;j<=n;j++) s[k][n+n*m+n*(i-l)-i*(i-l)/2+j-i] = (double)jj[i]*jj[j]; } for(i=l,sum=0.;i<=M;i++) sum+=lm[i]*s[k][i]; prod=exp(sum); c[k]+=prod; for(i=l;i<=M;i++) t [ i ] [Ml]+=s[k] [i]*prod; for(i=l;i<=M;i++) { for(j=l;j<=M;j++) t [ i ] [j]+=prod*s[k] [ i ] * s [ k ] [j] ; } 65 } for(i=l;i<=M;i++) t [ i ] [Ml]/=c[k] ; for(i=l;i<=M;i++) { for(j=l;j<=M;j++) t [ i ] [ j ] - t [ i ] [Ml]*t [j] [Ml]-t [ i ] [ j ] / c [ k ] ; } for(i=l;i<=M;i++) { f o r ( j - i ; j < - M + l ; j + + ) h [ i ] [ j ] + - t [ i ] [j] ; } } for(i=l;i<=M;i++) { for(j«l;j<-M;j++) a [ i ] [ j ] - h [ i ] [j] ; b[i][l]=suf[i]-h[i][Ml]; } /* c a l l s routine, gaussj(a,M,b,l) f o r solving a l i n e a r system */ gaussj(a,M,b,l); for(i=l,dif=0.;i<=M;i++) { m x = f a b s ( b [ i ] [ l ] ) ; if(mx>dif) dif=mx; } for(i=l;i<=M;i++) l m [ i ] - = b [ i ] [ 1 ] ; iter++; if (iter'/.l==0) { for(i=l;i<=M;i++) { printf("lm= , /.8.4f", l m [ i ] ) ; if(i*/.10==0) p r i n t f ( " \ n " ) ; } p r i n t f ("iter=*/.4d\n", i t e r ) ; } } i f ( i t e r > = 3 0 ) p r i n t f ( " d i d not converge\n"); for(i=l;i<=M;i++) { p r i n t f ('7.8.4f */.8.4f 7.8.4f \ n " , l m [ i ] , s q r t ( - a [ i ] [ i ] ) , lm[i]/sqrt(-a[i][i])); 66 } printf ("iter='/.4d\n",iter); for(lik=0.,i=l;i<=M;i++) lik+=suf[i]*lm[i]; for(i=l,logc=0.;i<=nn;i++) logc+=log(c[i]); lik-=logc; printf("loglikelihood= y.9.4f\n", lik); } void d2b(n,ii,jj) int n,ii,jj[] ; { int i,d; d=ii-l; for(i=n;i>=l;i—) { jj[i]=d'/.2; d=d/2; } 67
- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Theses and Dissertations /
- A model for multivariate binary data with covariates...
Open Collections
UBC Theses and Dissertations
Featured Collection
UBC Theses and Dissertations
A model for multivariate binary data with covariates based on compatible conditionally specified logistic… Liu, Ying 1994
pdf
Page Metadata
Item Metadata
Title | A model for multivariate binary data with covariates based on compatible conditionally specified logistic regressions |
Creator |
Liu, Ying |
Date Issued | 1994 |
Description | Rather than construction of a multivariate distribution from given univariate or bivariate margins, recently several papers seek to promote the development and usage of a simple but relatively unknown approach to the specification of models for dependent binary outcomes through conditional probabilities, each of which is assumed to be logistic. These recent proposals were all offered as heuristic approaches to specifying a multivariate distribution capable of representing the dependence of binary outcomes. However, they are limited in scope, for they all describe some special patterns of dependence. This thesis is concerned with a model for a multivariate binary response with covariates based on compatible conditionally specified logistic regressions. With this model, we allow for a general dependence structure for the binary outcomes. Three likelihood-based computing methods are introduced to estimate the parameters in our model. An example on the coronary bypass surgery is presented for illustration. |
Extent | 2135026 bytes |
Genre |
Thesis/Dissertation |
Type |
Text |
File Format | application/pdf |
Language | eng |
Date Available | 2009-03-03 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
DOI | 10.14288/1.0087512 |
URI | http://hdl.handle.net/2429/5380 |
Degree |
Master of Science - MSc |
Program |
Statistics |
Affiliation |
Science, Faculty of Statistics, Department of |
Degree Grantor | University of British Columbia |
Graduation Date | 1994-11 |
Campus |
UBCV |
Scholarly Level | Graduate |
Aggregated Source Repository | DSpace |
Download
- Media
- 831-ubc_1994-0492.pdf [ 2.04MB ]
- Metadata
- JSON: 831-1.0087512.json
- JSON-LD: 831-1.0087512-ld.json
- RDF/XML (Pretty): 831-1.0087512-rdf.xml
- RDF/JSON: 831-1.0087512-rdf.json
- Turtle: 831-1.0087512-turtle.txt
- N-Triples: 831-1.0087512-rdf-ntriples.txt
- Original Record: 831-1.0087512-source.json
- Full Text
- 831-1.0087512-fulltext.txt
- Citation
- 831-1.0087512.ris
Full Text
Cite
Citation Scheme:
Usage Statistics
Share
Embed
Customize your widget with the following options, then copy and paste the code below into the HTML
of your page to embed this item in your website.
<div id="ubcOpenCollectionsWidgetDisplay">
<script id="ubcOpenCollectionsWidget"
src="{[{embed.src}]}"
data-item="{[{embed.item}]}"
data-collection="{[{embed.collection}]}"
data-metadata="{[{embed.showMetadata}]}"
data-width="{[{embed.width}]}"
async >
</script>
</div>
Our image viewer uses the IIIF 2.0 standard.
To load this item in other compatible viewers, use this url:
http://iiif.library.ubc.ca/presentation/dsp.831.1-0087512/manifest