{"http:\/\/dx.doi.org\/10.14288\/1.0091594":{"http:\/\/vivoweb.org\/ontology\/core#departmentOrSchool":[{"value":"Science, Faculty of","type":"literal","lang":"en"},{"value":"Statistics, Department of","type":"literal","lang":"en"}],"http:\/\/www.europeana.eu\/schemas\/edm\/dataProvider":[{"value":"DSpace","type":"literal","lang":"en"}],"https:\/\/open.library.ubc.ca\/terms#degreeCampus":[{"value":"UBCV","type":"literal","lang":"en"}],"http:\/\/purl.org\/dc\/terms\/creator":[{"value":"Wu, Kunling","type":"literal","lang":"en"}],"http:\/\/purl.org\/dc\/terms\/issued":[{"value":"2009-11-20T01:27:04Z","type":"literal","lang":"en"},{"value":"2003","type":"literal","lang":"en"}],"http:\/\/vivoweb.org\/ontology\/core#relatedDegree":[{"value":"Master of Science - MSc","type":"literal","lang":"en"}],"https:\/\/open.library.ubc.ca\/terms#degreeGrantor":[{"value":"University of British Columbia","type":"literal","lang":"en"}],"http:\/\/purl.org\/dc\/terms\/description":[{"value":"Generalized linear mixed effects models (GLMMs) are popular in many longitudinal\r\nstudies. In these studies, however, missing data problems arise frequently, which makes\r\nstatistical analyses more complicated. In this thesis, we propose an exact method and an\r\napproximate method for GLMMs with informative dropouts and missing covariates, and\r\nprovide a unified approach for simultaneous inference. Both methods are implemented\r\nby Monte Carlo EM algorithms. The approximate method is based on Taylor series expansion,\r\nand it avoids sampling the random effects in the E-step. Thus, the approximate\r\nmethod may be computationally more efficient when the dimension of random effects is\r\nnot small. We also briefly discuss other methods for accelerating the EM algorithms.\r\nTo illustrate the proposed methods, we analyze two real datasets, a AIDS 315 dataset\r\nand a dataset from a parent bereavement project, using these methods. 
A simulation\r\nstudy is conducted to evaluate the performance of the proposed methods under various\r\nsituations.","type":"literal","lang":"en"}],"http:\/\/www.europeana.eu\/schemas\/edm\/aggregatedCHO":[{"value":"https:\/\/circle.library.ubc.ca\/rest\/handle\/2429\/15346?expand=metadata","type":"literal","lang":"en"}],"http:\/\/purl.org\/dc\/terms\/extent":[{"value":"3704414 bytes","type":"literal","lang":"en"}],"http:\/\/purl.org\/dc\/elements\/1.1\/format":[{"value":"application\/pdf","type":"literal","lang":"en"}],"http:\/\/www.w3.org\/2009\/08\/skos-reference\/skos.html#note":[{"value":"Simultaneous Inference for Generalized Linear Mixed Models with Informative Dropout and Missing Covariates by Kunling Wu M.Sc, Beijing Polytechnic University, China, 1999 A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF Master of Science in THE FACULTY OF GRADUATE STUDIES (Department of Statistics) We accept this thesis as conforming to the required standard The University of British Columbia December 2003 \u00a9 Kunling Wu, 2003 Library Authorization In presenting this thesis in partial fulfillment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission. 
Kunling Wu 19\/12\/2003 Name of Author (please print) Date (dd\/mm\/yyyy) Title of Thesis: Simultaneous Inference for Generalized Linear Mixed Models with Informative Dropout and Missing Covariates Degree: Master of Science Department of Statistics The University of British Columbia Vancouver, BC Canada Year: 2003 Abstract Generalized linear mixed effects models (GLMMs) are popular in many longitudinal studies. In these studies, however, missing data problems arise frequently, which makes statistical analyses more complicated. In this thesis, we propose an exact method and an approximate method for GLMMs with informative dropouts and missing covariates, and provide a unified approach for simultaneous inference. Both methods are implemented by Monte Carlo EM algorithms. The approximate method is based on Taylor series expansion, and it avoids sampling the random effects in the E-step. Thus, the approximate method may be computationally more efficient when the dimension of random effects is not small. We also briefly discuss other methods for accelerating the EM algorithms. To illustrate the proposed methods, we analyze two real datasets, an AIDS 315 dataset and a dataset from a parent bereavement project. A simulation study is conducted to evaluate the performance of the proposed methods under various situations. 
Contents Abstract ii Contents iii List of Tables vii List of Figures viii Acknowledgements ix Dedication x 1 Introduction 1 1.1 Generalized Linear Mixed Effect Models 1 1.2 Missing Data Problems 3 1.3 Motivating Examples 5 1.3.1 Example 1 5 1.3.2 Example 2 5 1.4 Objectives and Outline 6 2 Generalized Linear Mixed Models and Missing Data 8 2.1 Introduction 8 2.2 Generalized Linear Models 8 2.2.1 Model Specification 9 2.2.2 Maximum Likelihood Estimation in GLMs 10 2.2.3 Quasi-Likelihood Approach 13 2.3 Generalized Linear Mixed Models 14 2.3.1 Generalized Linear Mixed Models 14 2.3.2 Maximum Likelihood Estimation 15 2.3.3 Literature for Generalized Linear Mixed Models 16 2.4 Literature for Missing Data 17 2.4.1 Literature of Informative Dropout 17 2.4.2 Literature of Missing Covariates 18 3 Exact Inference for GLMMs with Informative Dropout and Missing Covariates 20 3.1 Introduction 20 3.2 Models and Likelihood 21 3.3 Monte Carlo EM Algorithm 23 3.3.1 E-step 24 3.3.2 M-step 26 3.3.3 Variance Estimation 27 3.4 Sampling Methods 28 3.4.1 Gibbs Sampler 28 3.4.2 Adaptive Rejection Algorithm 29 3.4.3 Rejection Sampling 30 3.4.4 Sampling Method for Binary Variables 31 3.5 PX-EM Algorithm 31 3.6 Convergence 34 4 Approximate Inference for GLMMs with Informative Dropout and Missing Covariates 35 4.1 Introduction 35 4.2 Approximate Inference without Missing Values 36 4.3 Approximate Inference with Missing Values 39 4.4 Strategies for Sampling the Missing Values 42 4.5 PX-EM 43 5 Covariate Models and Dropout Models 46 5.1 Introduction 46 5.2 Dropout Models 46 5.3 Covariate Models 48 5.4 Sensitivity Analyses 49 6 Real Data Examples 50 6.1 Introduction 50 6.2 Example 1 51 6.2.1 Data Description 51 6.2.2 Models 52 6.2.3 Analysis and Results 54 6.2.4 Sensitivity Analysis 56 6.2.5 Conclusion 58 6.3 Example 2 
59 6.3.1 Data Description 59 6.3.2 Models 60 6.3.3 Analysis and Results 62 6.3.4 Sensitivity Analysis 63 6.3.5 Conclusion 65 6.4 Computation Issues 65 7 Simulation Study 71 7.1 Introduction 71 7.2 Description of the Simulation Study 72 7.2.1 Models 72 7.2.2 Bias and Mean-Squared Error 73 7.3 Simulation Results 74 7.3.1 Comparison of Methods with Varying Missing Rates 74 7.3.2 Comparison of Methods with Different Variances 75 7.3.3 Comparison of Methods with Different Sample Sizes 76 7.3.4 Comparison of Methods with Varying Intra-individual Measurements 76 7.3.5 Conclusion 77 8 Conclusion and Discussion 81 Bibliography 84 List of Tables 6.1 Summary statistics 52 6.2 Estimates for the AIDS data 55 6.3 Sensitivity analysis for covariate models 57 6.4 Sensitivity analysis for dropout models 58 6.5 Summary statistics 60 6.6 Estimates for the Parent Bereavement data 62 6.7 Sensitivity analysis for covariate models 63 6.8 Sensitivity analysis for dropout models 64 7.1 Simulation results with varying missing rates 75 7.2 Simulation results with varying variances 76 7.3 Simulation results with varying sample sizes 77 7.4 Simulation results with varying intra-individual measurements 78 List of Figures 6.1 Viral loads (log10 scale) for six randomly selected patients. The open dots are the observed values and the dashed line indicates the detection limit of viral loads. The viral loads below the detection limit are substituted with log10(50) 67 6.2 GSI scores for six randomly selected parents. 
The open dots are the observed values and the GSI scores at time 0 are the baseline values 68 6.3 (a) Time series and (b) autocorrelation function plots for CH50 69 6.4 (a) Time series and (b) autocorrelation function plots for b_46 associated with patient 46 70 7.1 (a) Time series and (b) autocorrelation function plots for z_2 79 7.2 (a) Time series and (b) autocorrelation function plots for b_18 associated with individual 18 80 Acknowledgements First and foremost, I would like to thank my supervisor, Dr. Lang Wu, for his excellent guidance and immense help during my study at UBC. Without his support, expertise and patience, this thesis would not have been completed. Also, I would like to thank my second reader, Dr. Paul Gustafson, for his invaluable comments and suggestions on this thesis. Furthermore, I thank Dr. Nancy Heckman and Dr. Bertrand Clarke for their invaluable advice on my consulting projects, which will benefit me very much in the future. I thank all the faculty and staff in the Department of Statistics at UBC for providing such a nice academic environment. I also thank all the graduate students in the Department of Statistics for making my study at UBC so enjoyable. Most importantly, I would like to thank my parents for loving me and believing in me. My big thanks go to my beloved husband, Weiliang Qiu, for his love, his constant support and encouragement, which push me to be the best at everything I do. KUNLING WU The University of British Columbia December 2003 To my parents and husband. Chapter 1 Introduction 1.1 Generalized Linear Mixed Effect Models Longitudinal data or repeated measurement data occur frequently in many applications where repeated measurements are obtained for each individual. Statistical analyses of longitudinal data are reviewed in Diggle, Liang and Zeger (1994). 
One of the key advantages of a longitudinal study over a cross-sectional study is that it can separate variation over time within an individual from differences among individuals, whereas a cross-sectional study cannot do this because it records only one measurement for each individual. The analysis of cross-sectional data may therefore confound time effects and give misleading results. For longitudinal data, it is important to recognize two sources of variation: intra-individual variation produced by different measurements within a given individual, and inter-individual variation among different individuals. Generalized linear models (GLMs), such as logistic regression models, extend normal linear models to allow non-normal error distributions in the natural exponential family, such as Poisson and binomial distributions. GLMs can handle not only continuous variables but also discrete variables, as long as the distribution of the variable belongs to the natural exponential family. Therefore, GLMs provide a unified approach for continuous and discrete responses and have wide applicability in practice. For example, in Agresti (1990), a sample of male residents of Framingham, Massachusetts, was collected according to their blood pressures. During a follow-up period, whether or not these residents developed coronary heart disease was recorded and viewed as the response, so the response variable is binary. To investigate the relationship between blood pressure and coronary heart disease, we can build a logistic regression model and then make statistical inferences based on this GLM. Generalized linear mixed models (GLMMs) extend GLMs by introducing random effects to account for correlation among the repeated measurements for a given individual, and are therefore useful for longitudinal studies. Such models can separate the two kinds of variation, borrow information across individuals, and allow discrete and continuous responses. 
Therefore, GLMMs are very popular in the analysis of longitudinal data. A GLMM may be written as a hierarchical two-stage model. At the first stage, intra-individual variation is characterized by a GLM. At the second stage, inter-individual variation is represented through individual-specific regression parameters. Covariates are often introduced at the second stage to partially explain systematic variation. There are two main approaches to estimating parameters in GLMMs: (i) exact likelihood inference based on numerical integration (Booth and Hobert (1999)), and (ii) approximate inference based on linearization procedures via Taylor series expansion (Breslow and Clayton (1993); Vonesh et al. (2002)). In exact inference, the marginal likelihood is obtained by integrating out the random effects from the joint distribution of the response and the random effects. By maximizing the marginal likelihood, we obtain estimates of the parameters of interest. However, the integration is usually intractable, so one may use Monte Carlo approximations to evaluate it (Wei and Tanner (1990)). Exact likelihood inference works well when the dimension of the random effects is small. However, computation may become quite demanding or unstable as the dimension of the random effects increases. In such cases, we may consider approximate inference, which avoids this computational difficulty by integrating out the random effects. The strategy for approximate inference is to iteratively solve linear mixed effects (LME) models based on a second-order Taylor series expansion around the current estimates. If the number of measurements for each individual is large enough, approximate methods may give reasonable estimates for parameters. Otherwise, approximate maximum likelihood estimates may be inconsistent. 1.2 Missing Data Problems Missing data are a serious problem in many applications and arise frequently in longitudinal studies. 
Two kinds of missing data often occur in a longitudinal study: (i) missing covariates due to different measurement schedules for covariates and response or other problems; and (ii) missing responses due to dropout or missed visits. For example, individuals may withdraw or die before the end of the study, or may not come to the study center for measurements at scheduled times for various reasons. Missing data problems make statistical analysis in longitudinal studies much more complicated, since standard complete-data methods are not directly applicable. Commonly used naive methods for missing data include the complete-case method, which deletes all incomplete observations; the mean imputation method, which substitutes missing values with the mean of the observed data; and the last-value-carried-forward method, which imputes a missing value with the immediately preceding observed value. In the presence of missing data, the missing data mechanism must be taken into account in order to obtain valid statistical inference. Little and Rubin (1987) define three types of missing data mechanisms. Missing data are missing completely at random (MCAR) if the probability of missingness is independent of both observed and unobserved data. For example, patients do not come to the study center for reasons unrelated to the treatment, such as simply forgetting the appointment. Missing data are missing at random (MAR) if the probability of missingness depends only on observed data, but not on unobserved data. For example, a patient may occasionally fail to visit the clinic because he\/she is too old. Missing data are nonignorable or informative missing data (NIM) if the probability of missingness depends on unobserved data. For example, a patient fails to visit the clinic because he\/she is too sick. If missing values are MCAR, the complete-case method will give unbiased, but inefficient estimates. 
If the missing data are not MCAR, the naive methods may give biased or even misleading results because they do not take missing data information into consideration. MCAR and MAR are called ignorably missing. We can ignore the missing data mechanism in likelihood inference when missing values are ignorably missing (Little and Rubin (1987)). Little (1992, 1995) reviewed missing covariates in regression and dropout in repeated-measures studies. Ibrahim, Lipsitz, and Chen (1999) proposed a Monte Carlo EM method for estimating parameters in GLMs with nonignorable missing covariates. Wu and Wu (2001) estimated parameters in nonlinear mixed effects models with missing covariates (MAR) by a three-step multiple imputation method. Wu and Carroll (1988) considered linear mixed effect models with informative dropout and assumed that missingness depends on the random effects. Ibrahim, Chen and Lipsitz (2001) developed a Monte Carlo EM algorithm for estimating parameters in GLMMs with informative dropout. However, little of the literature considers parameter estimation in GLMMs with informative dropout and missing covariates simultaneously. 1.3 Motivating Examples 1.3.1 Example 1 Our research is motivated by a longitudinal study from the AIDS Clinical Trial Group (ACTG) Protocol 315 (Wu and Ding (1999)). In this study, 46 HIV-infected patients were treated with a potent antiviral drug. Plasma HIV-1 RNA levels (viral loads) were repeatedly quantified on days 2, 7, 10, 14, 21, 28, and weeks 8, 12, and 24 after initiation of treatment. After the antiviral treatment, the patients' viral loads will decay, and the decay rate may reflect the efficacy of the treatment. The Nucleic Acid Sequence-Based Amplification assay that is used to measure the viral load has a detection limit. If the viral load drops below the detection limit after the treatment, the viral load cannot be measured, which may indicate that the treatment is successful. 
To investigate the treatment effect, one approach is to define the response as whether or not the viral load is below the detection limit, which is thus a binary variable. In this study, some patients drop out before the end of the study, and the dropout may be informative. Thus, the response contains nonignorable missing values. Preliminary studies show that some baseline covariates, such as CD4 cell counts, tumor necrosis factor (measured by TNF levels) and total complement levels (measured by CH50), may partially explain variation in the viral load trajectory. However, some of these covariates are also missing. Our objectives are to model the viral load trajectory and to identify covariates that may partially predict changes in viral loads, in the presence of informative dropouts and missing covariates. 1.3.2 Example 2 Our second example involves a longitudinal study from a parent bereavement project, which investigates the long-term mental outcomes of parents whose children died by accident, suicide, or homicide. After their children's death, the parents usually experience a high level of mental distress. In this study, the mental distress of 239 parents was measured at baseline (i.e. 4 to 6 weeks after their children's death), and then at 4, 12, 24 and 60 months post-death. The Global Severity Index (GSI), which is the most sensitive indicator of mental distress, is used to measure the parents' distress levels. A high GSI score indicates a high level of mental distress. If the parents' adjustment to their children's death goes well, their GSI scores will decrease over time, or will at least fall below their baseline GSI scores. To examine how the parents' mental distress changes over time after their children's death, we treat whether or not a parent's GSI score after baseline is lower than his\/her baseline value as the response. 
Several other relevant factors were also obtained at baseline, including the parents' gender, marital status, age, education, and annual income, the cause of death, and the age and gender of the deceased child. These baseline factors may be important predictors of parents' distress and thus are viewed as covariates. Note that some baseline covariates, such as income, contain missing values, and some responses are also missing. Our objectives are to investigate the change in parents' distress levels over time and to determine which covariates affect the parents' mental distress. 1.4 Objectives and Outline In this thesis, we develop an exact inference method, implemented by a Monte Carlo EM algorithm, to make simultaneous inferences for GLMMs with informative dropout and missing covariates. To avoid computational difficulties when the dimension of random effects is not small, we propose an approximate inference method, which integrates out the random effects for more efficient computation. The remainder of this thesis is organized as follows. Chapter 2 introduces GLMs and GLMMs and reviews the literature about informative dropout and missing covariates. Chapter 3 discusses the exact inference method for estimation of GLMMs with informative dropout and missing covariates. The approximate inference method based on linearization is presented in Chapter 4. We discuss dropout models and covariate models in Chapter 5. In Chapter 6, we apply our methods to two real data examples. Chapter 7 presents our simulation study. We conclude the thesis with a discussion in Chapter 8. Chapter 2 Generalized Linear Mixed Models and Missing Data 2.1 Introduction Before we present our methods for estimating parameters in GLMMs with informative dropout and missing covariates, we give a brief introduction to GLMs, GLMMs, and methods for missing data problems in this chapter. In Section 2.2, we introduce GLMs and the methods of estimation for parameters in GLMs. 
Section 2.3 describes GLMMs, briefly discusses two main methods for estimating parameters in GLMMs and reviews the literature for GLMMs. In Section 2.4, we review the literature on methods for handling informative dropout and missing covariates, respectively. 2.2 Generalized Linear Models A classical linear model is useful for modeling a continuous response under the assumption that the response follows a normal distribution and a linear relationship exists between the mean of the response and the covariates. However, in practice, non-normal distributions such as the binomial and Poisson distributions may be better assumptions for some response variables, such as discrete variables. For example, we may want to study whether developing heart disease relates to the blood pressure level. Here, we treat the health status of a patient's heart as our response. The response is thus a binary variable which takes values of 0 or 1, where 0 means that a patient has a heart disease and 1 means that a patient has no heart disease. Obviously, the assumption of normality is completely unrealistic here. Moreover, the mean of the response frequently cannot be expressed as a linear form of the covariates. In those situations, we cannot use standard linear models. Generalized linear models (GLMs), which are an extension of classical linear models, can not only deal with variables whose distributions come from the exponential family but also allow nonlinear forms between the mean of the response and the covariates. Distributions in the exponential family include continuous distributions such as the normal and exponential, and discrete distributions such as the binomial and Poisson. Due to the capability to handle continuous data as well as discrete data, GLMs unify different methodologies and thus have wide applicability in practice. 2.2.1 Model Specification GLMs are specified by three components: a random component, a systematic component, and a link function. 
Let y \u2014 (yi,y2) \u2022 \u2022 \u2022 IVN)T be a vector of independent and identically distributed (i.i.d) observations whose distribution belongs to the natural exponential family. Then the density function of each observation y* can be expressed in the form f(yu A) = exp{[yi6\u00bbi - \u00a5>(0i)]\/a(0) +
) is a constant in the log-likelihood function about 9 and thus is not ignored in the following log-likelihood function. For N independent observations, the log-likelihood function is N l(0\\y) = Y^m\\yi) i=l N i=l N S o m e u s e f u l E q u a t i o n s Now we will derive some useful identities used in maximizing the likelihood function. The derivation of (2.5) with respect to 0$ gives dl 1 \/ d
) 12 MLEs can be obtained by solving the following score equation S(f3) = -^XWA(Y-\u00bb((3)) = Q- (2.15) The solution to the above equation (2.15) can be performed by Fisher scoring algorithm or Gauss-Newton algorithm. In the case of canonical links, both Fisher scoring and Newton-Raphson reduce to the iteratively re-weighted least squares algorithm. Under the regularity condition, MLEs of parameters in GLMs have the asymptotic normality property P^N((3,a( o, (f>i, \u00a72, z)T\u2022 We assume that the r^-'s are independent for all i and j. Note that the covariates in model (6.6) are selected based on the likelihood ratio test. 61 Table 6.6: Estimates for the Parent Bereavement data Methods Parameters ft ft ft ft Exact Estimate -1.882 0.182 0.083 0.345 method SE 0.966 0.070 0.165 . 0.258 p- value 0.051 \u2022 0.010 0.612 0.181 Approximate Estimate -1.579 0.139 0.058 -0.239 method SE 0.898 0.065 0.152 0.193 p- value 0.079 0.033 0.704 0.216 :\" SE refers to the standard error. 6:3.3 Analysis and Results We consider the following methods to estimate the parameters in models (6.4)-(6.6). (i) the exact method using the Monte Carlo E M algorithm, (ii) the approximate method using the Monte Carlo E M algorithm. Estimates of (3, along with their standard errors and p-values, are shown in Table 6.6. Compared with the exact method, the approximate method resulted in smaller absolute values of estimates and smaller standard errors. Especially for the estimate of fa, the exact and approximate methods gave opposite results. As discussed in previous chapters, the approximate method should have a faster convergence rate, since it avoids sampling the random effect in each E M iteration. However, for this example, the number of iterations to convergence for the approximate method is 24, larger than the number of iterations to convergence for the exact method, which is 13. The P X - E M algorithm improved the convergence speed a bit in this example. 
The number of iterations to convergence for the exact method based on PX-EM is 9, smaller than 13. Table 6.6 shows that education is significant based on both the exact method and the approximate method. The estimate for education \u03b2_1 based on the exact method is 0.182, which suggests that the estimated odds of having a lower distress than the baseline value are multiplied by exp(0.182), approximately 1.2, when parents increase their education level by one unit. Based on both the exact method and the approximate method, income and time do not have significant effects on the change in parents' mental distress. 6.3.4 Sensitivity Analysis To check the sensitivity of the above results to the covariate models, we consider the following alternative covariate model. (i) Alternative Covariate Model 1 (CM1): Model (6.5) with \u03b1_2 = 0. That is, x_i2 | x_i1 ~ N(\u03b1_1, \u03b1_3), i.e., x_i2 is independent of x_i1. Table 6.7 shows that results based on the original covariate model and the alternative covariate model are quite similar. This suggests that the results may be robust to the covariate models. Table 6.7: Sensitivity analysis for covariate models. Parameters: \u03b2_0, \u03b2_1, \u03b2_2, \u03b2_3. Original model (6.5): Estimate -1.882, 0.182, 0.083, 0.345; SE 0.966, 0.070, 0.165, 0.258; p-value 0.051, 0.010, 0.612, 0.181. CM1: Estimate -1.969, 0.189, 0.107, 0.312; SE 1.043, 0.076, 0.175, 0.255; p-value 0.059, 0.013, 0.542, 0.222. SE refers to the standard error. We also check the sensitivity of our results to the dropout models. We consider the following alternative dropout models. 
63 Table 6.8: Sensitivity analysis for dropout models Dropout \u2022 Parameters Model ft ft ft ft Original Estimate -1.882 0.182 0.083 0.345 model SE 0.966 0.070 0.165 0.258 (6.6) p- value 0.051 0.010 0.612 0.181 Estimate -1.592 0.161 0.063 0.545 D M I SE 0.808 0.059 0.136 0.273 p- value 0.049 0.006 0.644 0.046 Estimate -1.958 0.193 0.087 0.239 DM2 SE 1.019 0.074 0.169 0.254 p- value 0.055 0.010 0.604 0.347 Estimate -1.460 0.167 0.094 0.939 DM3 SE 1.006 0.073 0.172 0.282 p- value 0.146 0.022 0.585 < 0.001 * SE refers to the standard error. (i) Alternative Dropout Model 1 (DMI): logit {Pr(ry = l\\ = (