Regression Approaches to Estimation of Relative Risk: Application to Multiple Sclerosis Studies by Shing Fu B.Sc., The University of Hong Kong, 2008 A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in The Faculty of Graduate Studies (Statistics) THE UNIVERSITY OF BRITISH COLUMBIA (Vancouver) October 2010 c© Shing Fu 2010 Abstract Using a log link for binary response in generalized linear mixed-effects models (GLMM) allows direct estimation of the relative risk. If a random intercept is the only random effect in the conditional mean structure, the marginal mean has the same form. The fixed effects, representing the log relative risks, have the same interpretation in both the mixed-effects model and the marginal model. This leads to two approaches to estimate the relative risks, 1) maximum likelihood for the mixed-effects models and 2) the generalized estimating equations (GEE) approach for the marginal models. In our study, we apply such log-linear models to assess the effects of neutralizing antibodies on interferon beta-1b in relapsing-remitting multiple sclerosis. The results obtained by the two approaches are compared. The relative efficiency of the GEE approach and the robustness of the GLMM approach to some forms of misspecification of the model for the random effects are studied by simulations. ii Table of Contents Abstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ii Table of Contents . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii List of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . vi List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viii Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . . ix Dedication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . x 1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 1.1 Multiple Sclerosis . . . . . . . . . . . . . . . . . . . . . . . . 1 1.2 The Interferon beta-1b Pivotal Trial . . . . . . . . . . . . . . 2 1.3 Neutralizing Antibodies in the IFNB Pivotal Trial . . . . . . 4 1.3.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . 4 1.3.2 Review of Cross-sectional Analysis . . . . . . . . . . . 4 1.3.3 Review of Longitudinal Analysis . . . . . . . . . . . . 5 1.4 The BEYOND Trial . . . . . . . . . . . . . . . . . . . . . . . 8 1.5 Objectives and Outline of the Report . . . . . . . . . . . . . 9 2 Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 2.1 Generalized Linear Models . . . . . . . . . . . . . . . . . . . 11 2.2 The Quasi-likelihood Approach . . . . . . . . . . . . . . . . . 16 2.3 Marginal Models for Longitudinal Data . . . . . . . . . . . . 17 iii Table of Contents 2.3.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . 17 2.3.2 Inferential Approach: Generalized Estimating Equa- tions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20 2.4 Generalized Linear Mixed-effect Models . . . . . . . . . . . . 26 2.4.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . 26 2.4.2 Inferential Approach: Maximum Likelihood . . . . . . 29 2.5 Log-linear Models . . . . . . . . . . . . . . . . . . . . . . . . 32 2.5.1 Mixed-effects Models . . . . . . . . . . . . . . . . . . . 32 2.5.2 Marginalized Models . . . . . . . . . . . . . . . . . . . 33 3 Analysis of BEYOND Trial Data . . . . . . . . . . . . . . . . 40 3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . 40 3.2 Basic Model Specification . . . . . . . . . . . . . . . . . . . . 46 3.3 Computational Issues . . . . . . . . . . . . . . . . . . . . . . 49 3.4 Time Trend . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50 3.5 Neutralizing Antibody Effects . . . . . . . . . . . . . . . . . . 55 3.5.1 NAB Status . . . . . . . . . . . . . . . . . . . . . . . 55 3.5.2 NAB Titer . . . . . . . . . . . . . . . . . . . . . . . . 56 3.6 Relative Efficiency of GEE to GLMM Approach on Estimating NAB Effects . . . . . . . . . . . . . . . . . . . . . . . . . . . 64 3.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65 4 Analysis Based on Collapsed Data . . . . . . . . . . . . . . . 67 4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . 67 4.2 Model Specification . . . . . . . . . . . . . . . . . . . . . . . 69 4.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70 4.3.1 NAB Status . . . . . . . . . . . . . . . . . . . . . . . 70 4.3.2 NAB Titer . . . . . . . . . . . . . . . . . . . . . . . . 72 4.4 Impact of Exceptionally Short Intervals . . . . . . . . . . . . 75 4.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78 iv Table of Contents 5 Simulation Study . . . . . . . . . . . . . . . . . . . . . . . . . . 83 5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . 83 5.2 Simulation I: Efficiency of GEE . . . . . . . . . . . . . . . . . 85 5.3 Simulation II: Misspecification of Random Effects . . . . . . . 100 5.4 Simulation III: Misspecification of the Random Effect Distri- bution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115 5.5 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119 6 Conclusions and Discussion . . . . . . . . . . . . . . . . . . . . 129 Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136 Appendices A Details of Model Fits for the Daily Relapse Data . . . . . . 139 B Details of Model Fits for the Collapsed Data . . . . . . . . . 146 v List of Tables 3.1 Time of the first available NAB titer . . . . . . . . . . . . . . 42 3.2 Time to the first confirmed NAB+ . . . . . . . . . . . . . . . 43 3.3 NAB status dynamics . . . . . . . . . . . . . . . . . . . . . . . 43 3.4 Number of relapses per eventually NAB+ patients . . . . . . . 44 3.5 p-values for likelihood ratio tests comparing GLMM fits with different time trends: eventually NAB+ patients only. . . . . . 54 3.6 Estimated relative risks of NAB+ to NAB− from GLMM fits with various time trends: eventually NAB+ patients only. . . 55 3.7 Estimated relative risks of NAB+ and sublevels to NAB−: eventually NAB+ patients only. . . . . . . . . . . . . . . . . . 57 3.8 Estimated relative risk of NAB titers: eventually NAB+ pa- tients only. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62 4.1 Number of NAB titer measures . . . . . . . . . . . . . . . . . 67 4.2 Number of relapse onsets per interval . . . . . . . . . . . . . . 68 4.3 Number of days at risk between consecutive collections of NAB serum specimens . . . . . . . . . . . . . . . . . . . . . . . . . 69 4.4 Estimated relative risks of NAB+ and sublevels to NAB−: eventually NAB+ patients only, collapsed data. . . . . . . . . 73 4.5 Estimated relative risk of NAB titers: eventually NAB+ pa- tients only, collapsed data. . . . . . . . . . . . . . . . . . . . . 76 4.6 Impact of 1-day intervals on estimates of log relative rates, β. 79 5.1 Results of Simulation I–1 . . . . . . . . . . . . . . . . . . . . . 91 vi List of Tables 5.2 Results of Simulation I–2: More Clusters . . . . . . . . . . . . 94 5.3 Results of Simulation I–3: Larger Cluster Size . . . . . . . . . 97 5.4 Results of Simulation II–1 . . . . . . . . . . . . . . . . . . . . 104 5.5 Results of Simulation II–2: More Clusters . . . . . . . . . . . 108 5.6 Results of Simulation II–3: Larger Cluster Size . . . . . . . . . 112 5.7 Results of Simulation III–1: Beta-Binomial Setup . . . . . . . 122 5.8 Results of Simulation III–2: Triangular Distribution . . . . . . 126 A.1 Model fit based on eventually NAB+ patients only: linear time trend, NAB status. . . . . . . . . . . . . . . . . . . . . . . . . 140 A.2 Model fit based on eventually NAB+ patients only: linear time trend, refined NAB status. . . . . . . . . . . . . . . . . . . . . 141 A.3 Model fits based on 250 mcg arm, eventually NAB+ patients only: linear time trend, log NAB titers. . . . . . . . . . . . . . 142 A.4 Model fits based on 500 mcg arm, eventually NAB+ patients only: linear time trend, log NAB titers. . . . . . . . . . . . . . 144 B.1 Model fit based on collapsed data, eventually NAB+ patients only: linear time trend, NAB status. . . . . . . . . . . . . . . 147 B.2 Model fit based on collapsed data, eventually NAB+ patients only: linear time trend, refined NAB status. . . . . . . . . . . 148 B.3 Model fits based on collapsed data, 250 mcg arm, eventually NAB+ patients only: linear time trend, log NAB titers. . . . . 149 B.4 Model fits based on collapsed data, 500 mcg arm, eventually NAB+ patients only: linear time trend, log NAB titers. . . . . 151 vii List of Figures 1.1 Disease progression of each MS type. . . . . . . . . . . . . . . 2 2.1 Conditional and implied marginal mean structures of a log- linear model with random intercept and slope: random effects are evaluated at the 10th, 20th, . . . , 90th percentiles. . . . . . . . 39 3.1 Proportion of time on study as NAB− for eventually NAB+ patients. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45 3.2 Empirical annualized relapse rates for preceding 3-month pe- riods and the corresponding loess fits for eventually NAB+ patients. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47 3.3 Annualized fitted relapse rate of GLMM for NAB− periods, natural regression splines for time trend, eventually NAB+ patients only. . . . . . . . . . . . . . . . . . . . . . . . . . . . 52 3.4 Fitted natural cubic spline (df = 2) on log NAB titers of GLMM: eventually NAB+ patients only. . . . . . . . . . . . . 59 5.1 Probability density functions of random intercept bi with mean −0.003 and variance σ2b = 0.3162 when 1) normally distributed and 2) baseline conditional probability follows beta a distri- bution. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121 5.2 Probability density functions of random intercept bi with mean 0 and variance σ2b = 0.3 2 when 1) normally distributed and 2) triangularly distributed. . . . . . . . . . . . . . . . . . . . . . 125 viii Acknowledgements I would like to express my deepest gratitude to my supervisor, Professor John Petkau for his insightful guidance and constant encouragement throughout the course of my study in UBC. It is a rewarding experience for me to work under his supervision. I am also grateful to Professor Rollin Brant for being my second reader and providing valuable comments. I would like to thank all my professors from whom I learnt a lot, in partic- ular, Professor Harry Joe with whom I always enjoy working. Special thanks to Peggy Ng, Viena Tran, Elaine Salameh and Andrea Sollberger for their efficient assistance, and to my colleagues in the Department of Statistics for their support. Last but not least, I am indebted to Jeanne and Ian Chen and their family for taking care of me and helping me adapt to the new environment when I first arrived in Vancouver. ix To my family. x Chapter 1 Introduction 1.1 Multiple Sclerosis In this report, we compare two regression approaches to estimation of rela- tive risk with application to studies of neutralizing antibodies in relapsing- remitting multiple sclerosis patients treated with interferon beta-1b. Multiple sclerosis (MS) is a chronic and often disabling disease of the central nervous system. MS is thought to be an autoimmune disease where a patient’s own immune system attacks myelin in the brain and spinal cord. Myelin forms the insulating protective myelin sheaths of nerve fibres. Damage to the myelin sheaths or nerve fibres affects the transmission of nerve signals resulting in symptoms including difficulties in moving and coordination, deterioration of sensory functions, problems in bowel and bladder, among many others. Le- sions in the brain and spinal cord formed at the site of inflammation where the myelin is damaged are usually monitored by magnetic resonance imaging (MRI) scans. Specific symptoms and severity of MS vary considerably across patients. Currently there are four types of MS characterized by disease progression in term of severity of disability: 1) Relapsing-Remitting MS (RRMS), 2) Pri- mary Progressive MS (PPMS), 3) Secondary Progressive MS (SPMS) and 4) Progressive Relapsing MS (PRMS). Figure 1.1 shows the typical progression profiles of each type of MS. RRMS is the most common type of MS. According to the MS Society 1 1.2. The Interferon beta-1b Pivotal Trial Figure 1.1: Disease progression of each MS type. of Canada, about 80-85% of the patients are first diagnosed with RRMS. RRMS is characterized by recurrent cycles of acute attack, also called relapse or exacerbation, followed by complete or partial recovery with absence of disease progression between relapses. 1.2 The Interferon beta-1b Pivotal Trial There is no cure to MS to date, but disease modifying therapies exist to re- duce the exacerbation rate and severity of relapse. In 1995 interferon beta-1b 2 1.2. The Interferon beta-1b Pivotal Trial (IFNB) became the first disease modifying therapy approved for RRMS in Canada. The efficacy of IFNB in reducing exacerbation rate in RRMS was first demonstrated by a multicenter, randomized, double blind, placebo-controlled trial conducted by the IFNB Multiple Sclerosis Study Group[1–3]. From June 1988 to May 1990, 372 MS patients with disease history longer than 1 year were enrolled in this pivotal trial: 123 patients were assigned to a placebo arm, 125 to a low dose arm of 1.6 million international units (MIU) and 124 to a high dose arm of 8 MIU of IFNB, self-administered by subcutaneous injections every other day. The primary clinical outcome was annualized ex- acerbation rate. An exacerbation was defined as appearance of a new symp- tom or worsening of an existing symptom, attributable to MS, accompanied by an appropriate new neurologic abnormality. An exacerbation should also be preceded by stability or improvement for a minimum of 30 days and last for at least 24 hours in the absence of fever. The radiological outcome was cranial MRI scans for brain lesions. All patients had a baseline MRI scan and subsequent annual MRI scans. A cohort of 52 patients at the Univer- sity of British Columbia, one of the participating sites, had MRI scans every 6 weeks for 2 years. The trial was terminated with all patients converted to high dose (8 MIU) treatment in January 1993. Total time-on-study for individual patients ranged from 3.5 to 5 years. Based on 2-year data, the annual exacerba- tion rate was 1.27 for the placebo group, 1.17 for the 1.6 MIU group and 0.84 for the 8 MIU group. The reduction in exacerbation rate in the 8 MIU group relative to the placebo group was 33% in the first year of study and 28% in the second year of study, both with statistical significance at the 0.05 level. For the final three years, the reduction still ranged from 24% to 30%. However, these reductions did not achieve statistical significance. A decreas- 3 1.3. Neutralizing Antibodies in the IFNB Pivotal Trial ing trend in the annual exacerbation rate was observed over the 5-year course of study in all groups with the placebo group always having the highest rate whereas the 8 MIU group always had the lowest rate. 1.3 Neutralizing Antibodies in the IFNB Pivotal Trial 1.3.1 Overview The presence of neutralizing antibodies (NABs) to IFNB in the serum spec- imens of some patients in the IFNB pivotal trial was well documented[1, 3]. Serum specimens were collected quarterly during the trial and tested for NABs by a cytopathic effect (CPE) assay. Development of NABs began mostly during the first year on study. In some patients, NABs appeared inconsistently and intermittently while in others, NABs were detected per- sistently subsequent to their first appearance. The minimum detection limit of the CPE assay for NAB titers is 20 neutralizing units (NU/mL). A patient was identified as NAB+ if at least two consecutive NAB titers were at least 20 NU/mL; otherwise the patient was considered NAB−. Under this ad hoc rule, 2 of 123 placebo patients were eventually identified as NAB+, whereas in both the 1.6 MIU and 8 MIU arm about 35% of the patients were identified as NAB+ by the end of the trial. 1.3.2 Review of Cross-sectional Analysis NABs were expected to diminish the efficacy of IFNB in reducing MS dis- ease activity. To assess the impact of NABs on exacerbation rate, the an- nualized exacerbation rate of the NAB+ patients was compared to that of the NAB− patients on the same arm for every half-year period in a cross- 4 1.3. Neutralizing Antibodies in the IFNB Pivotal Trial sectional analysis[4]. In the 8 MIU group, the NAB+ patients had higher estimated exacerbation rates than the NAB− patients for periods beyond 13 months on study and these exacerbation rates resembled the rates of the placebo patients in that time period. The increases were significant over the half-year periods of 19 – 24 and 25 – 30 months on study and over the longer periods of 13 – 36 and 19 – 36 months on study. In the 1.6 MIU group, there was no consistent pattern of increased exacerbation rates in the NAB+ group compared to the NAB− group. For the half-year period of 19 – 24 months on study, the exacerbation rate in the NAB+ group is even significantly lower than the NAB− group. The study claimed to have shown significant evidence of diminished treatment effect in the 8 MIU group in terms of increased risk of exacerbation associated with development of NABs. 1.3.3 Review of Longitudinal Analysis Petkau et al.[5] pointed out that the cross-sectional analyses were inadequate as the baseline covariates, which might be predictive of exacerbation rates, were not taken into account. Also, the analyses were based on each half-year period separately and therefore were not very efficient. They suggested a longitudinal approach to analyze the effects of NABs on exacerbation rate to allow each patient to act as his own control. In their analysis, all patients were initially considered as NAB−. The subsequent NAB status for each day on study was then determined by one of several proposed ad hoc rules. The relapse rates during NAB+ periods of individual patients were compared to their rates during NAB− periods. This within subject comparison eliminates the main effects of baseline covariates which are potential confounders of the NAB effect. The longitudinal approach also allows straightforward incorpo- ration of a time trend to adjust for the naturally changing exacerbation rates. The first rule of classification of NAB status, referred to as “once positive, always positive”, defines the time of switching from NAB− to NAB+ by the 5 1.3. Neutralizing Antibodies in the IFNB Pivotal Trial time of the first of two consecutive NAB titers of at least 20 NU/mL. Once switched to NAB+, the patient is considered to remain NAB+ until the end of the study irrespective of any of the following NAB titers. The requirement of two consecutive NAB titers higher than a cut-off value is referred to as “confirmation required”. The second rule, referred to as “all switches considered”, allows switch- ing back of the NAB status from NAB+ to NAB−. The definition for the time of switching from NAB− to NAB+ is the same as the first rule, which is the first time of two consecutive titers of at least 20 NU/mL. The time of switching from NAB+ back to NAB− is the time of the first subsequent NAB titer below 20 NU/mL with no confirmation required. The patient’s NAB status can switch back and forth for any number of times. Under either rule of classification, a patient is identified as “eventually NAB+” if he has at least one NAB+ period; otherwise the patient is identi- fied as “never NAB+”. Refining modifications applicable to both rules were also considered to take some account of the titer levels. NAB+ is divided into two sublevels, Low NAB+ and High NAB+, to examine the relative effect of different levels of NAB titers. Two modifications were considered: confirmation required or not required for switching from a lower level to a higher level. In both cases, confirmation is not required for switching from a higher level to a lower level. The longitudinal approach facilitates direct comparison of exacerbation frequencies during NAB− periods and NAB+ periods of individuals who had at least one NAB+ period. Specifically Petkau et al.[5] considered the gener- alized estimating equations approach which directly estimates the effects of NABs at the population average level in terms of the relative risk of exacer- bations while accounting for the correlation in the longitudinal observations. 6 1.3. Neutralizing Antibodies in the IFNB Pivotal Trial Their analyses also adjusted for the natural trend in exacerbation rates over time by incorporating a linear effect of time. By the time the longitudinal analysis was conducted, serum specimens were tested by a new assay, namely the MxA assay, which has the same detection limit as the CPE assay. NAB status determined by both assays was considered but in separate analyses. With the MxA assay and the “once positive, always positive” definition, comparing NAB+ periods to NAB− pe- riods, the exacerbation rate increased by 28% (95% CI: −15%, 92%) in the 1.6 MIU group and by −2% (95% CI: −21%, 21%) in the 8 MIU group. With the “all switches considered” definition, the increase in exacerbation rate was 29% (95% CI: 0%, 67%) in the 1.6 MIU group and 18% (95% CI: 0%, 40%) in the 8 MIU group. With either definition, the increase was higher for the low dose (1.6 MIU) group than the high dose (8 MIU) group. Using the titer values obtained by the CPE assay, similar results were obtained, however, with lower statistical sensitivity. In additional analyses, no evidence indicat- ing a carryover of NAB+ effect was demonstrated. After adjustment for the overall time trend, the relapse rates during the first NAB− period were not statistically different from the rates during subsequent NAB− periods which were preceded by at least one NAB+ period. Petkau et al.[5] also estimated the power of the longitudinal analyses based on the estimated correlation in the repeated observations on individ- ual patients over time. Treatment with 8 MIU IFNB reduces the exacerbation rate to about two-thirds of the placebo rate. A 50% increase in exacerbation rate associated with the switch from NAB− to NAB+ is an effect of impor- tance as it implies that the rate during NAB+ period resembles the placebo rate. With the MxA assay, the power to detect a 50% increase in exacerba- tion rate when NAB status switches from negative to positive was estimated to be 96% for the “once positive, always positive” classification and 99.7% for 7 1.4. The BEYOND Trial the “all switches considered” classification. These longitudinal analyses with results from the MxA assay had adequate statistical sensitivity for detecting an effect of this magnitude. However, the NAB effects on exacerbation rates were not precisely esti- mated with the pivotal trial data, as demonstrated by the wide confidence intervals obtained. The estimation of NAB effects are based directly on the eventually NAB+ patients only. The low sensitivity is due in part to the rel- atively small fraction of patients (about 35%) ever switching to NAB+ over the course of trial. A larger sample of eventually NAB+ patients is required for more precise estimation of the NAB effects. 1.4 The BEYOND Trial The Betaseron Efficacy Yielding Outcomes of a New Dose (BEYOND) trial was the largest MS phase III trial ever, involving a total of 2244 patients with RRMS[6]. Patients enrolled were followed for 2 to 3.5 years. The original objective of the BEYOND trial was to compare the efficacy of the standard dose of IFNB (250 mcg, i.e. 8 MIU) with an experimental, higher dose of IFNB (500 mcg, i.e. 16 MIU) and with 20 mg glatiramer acetate (GA), an alternative treatment to IFNB. A total of 2244 enrolled patients were randomized to the 250 mcg IFNB group, the 500 mcg IFNB group and the GA group in the ratio 2:2:1. The two IFNB groups were given the specified dose of IFNB injected subcuta- neously every other day while the GA group was given a 20 mg GA injection subcutaneously every day. Serum specimens were collected at baseline and every 6 months for patients in the two IFNB arms and serum NAB titers were measured by the MxA assay. The BEYOND trial provides, for the first time, a potentially large number of NAB+ patients for assessment of NAB 8 1.5. Objectives and Outline of the Report effects on RRMS disease activity so greater statistical sensitivity should be expected relative to previous studies. 1.5 Objectives and Outline of the Report The primary objective of our study is to contrast two longitudinal approaches for comparing exacerbation rates during NAB− and NAB+ periods. The first, the generalized estimating equations (GEE) approach, has been ap- plied in some previous longitudinal studies of NAB effects. The second is the full likelihood approach of generalized linear mixed-effects models (GLMM), which has not been utilized previously in this context due to computational difficulties. The efficiency of GEE relative to the GLMM approach is exam- ined. An accurate model to reflect the natural diminishing relapse rates is crucial for estimating NAB effects unbiasedly. The adequacy and appropri- ateness of a linear time trend will be assessed. The impact of improving the fit of natural time trends by regression splines will also be examined. When the clinical outcome is observed on a daily basis, the huge number of responses for each patient inevitably hinders the use of some standard sta- tistical software for conducting the longitudinal analysis. The daily outcome is collapsed to a less frequently observed outcome to attempt to circumvent these computational difficulties. The loss of sensitivity due to such reductions in the resolution of the relapse outcome is also investigated. For coherence and consistency, the same set of analyses is performed separately on the 250 mcg group and the 500 mcg group. This chapter has provided some background information on the scientific problem and described some related statistical issues. Chapter 2 provides a general review of the two longitudinal approaches to be utilized. A specific class of GLMM with log link and random intercept is described in detail 9 1.5. Objectives and Outline of the Report with specific features highlighted. The regression parameters of this log- linear model can be estimated by both longitudinal approaches. Chapter 3 presents the results of the longitudinal analyses based on daily clinical outcomes. The log-linear model described in Chapter 2 is utilized. Computational issues that arose when fitting this model to our data are dis- cussed. The adequacy and appropriateness of a linear time trend is first assessed. When the form of the time trend is decided, the relative efficiency of the GEE approach in estimating the NAB effects is examined. The pri- mary assessment of NAB effects is based on comparison of NAB+ periods to NAB− periods. Association between the relapse rate and the NAB titers is estimated to reveal any biological gradients of relapse rate with titer value. Chapter 4 considers collapsing the daily relapse response to a less fre- quent response. The collapsed response is the number of relapses between consecutive acquisitions of serum specimens. The log-linear model discussed in Chapter 2 is applied to the collapsed count data. The loss in efficiency due to reduction in resolution of the response, as well as the efficiency of the GEE approach relative to the GLMM approach is examined. Chapter 5 describes and presents the results of simulation studies inves- tigating the finite sample properties of the GLMM and the GEE approaches for the type of models utilized in Chapter 3. We focus on the relative ef- ficiency of GEE when GLMM is correctly specified and the robustness of GLMM under some form of misspecification of the random intercept. Chap- ter 6 summarizes the overall findings and discusses problems that remain to be investigated in the future. 10 Chapter 2 Methodology 2.1 Generalized Linear Models The class of generalized linear models (GLMs) was first developed by Nelder and Wedderburn[7] to unify and extend various regression models, such as classical linear models, logistic and probit regression models for binary obser- vations and log-linear models for contingency tables. The following notations are established for the discussion of cross-sectional analysis in Sections 2.1 and 2.2. y = (y1, . . . , yn) T is a n×1 vector of observations or responses which is a realization of a n×1 vector of random variables Y = (Y1, . . . , Yn)T . Each response yi is associated with a p×1 vector of covariates xi. These vectors of covariates xi can be grouped into a n×p covariate matrix X = (x1, · · · ,xn)T . The GLM model assumption has two components, a stochastic compo- nent and a systematic component. The stochastic component assumes that each component of Y independently follows a distribution in the exponential family with density function fY (·) given by fYi(yi; θi, φ) = exp { yiθi − b(θi) a(φ) + c(yi, φ) } , (2.1.1) where θi is a location parameter and φ is a dispersion parameter. The disper- sion parameter is known for some distributions. For example, φ = 1 for the binomial and Poisson distributions, the most commonly used distributions in GLMs for binary and count responses respectively. For our scope of discus- sion, we assume a common constant dispersion a(φ) = φ across observations. 11 2.1. Generalized Linear Models The mean and variance of Yi are given by E(Yi) = b ′(θi) = µi (2.1.2) and Var(Yi) = φ · b′′(θi) = φ · v(µi). (2.1.3) The variance is determined by the mean through the known variance func- tion v(·), up to a scalar multiple represented by the dispersion parameter φ. The systematic component assumes that the effects of the covariates can be represented by a linear predictor, ηi = x T i β, (2.1.4) where β = (β1, . . . , βp) T is a p× 1 vector of unknown regression coefficients. In matrix notation, η = Xβ, where η = (η1, . . . , ηn) T . A known link function g(·) then relates the linear predictor to the corresponding mean, g(µi) = ηi, (2.1.5) which completes the specification of the mean structure. The link func- tion g(·) can be any monotone differentiable function but there are common choices for specific types of response variable. The log likelihood function of a GLM has a general form, l(θ, φ; y) = n∑ i=1 li(θi, φ; yi) = n∑ i=1 { yiθi − b(θi) φ + c(yi, φ) } . (2.1.6) 12 2.1. Generalized Linear Models By recognizing from (2.1.1) and (2.1.2) that ∂li ∂θi = yi − b′(θi) φ = yi − µi φ , and from (2.1.3) that dθi dµi = dθi db′(θi) = 1 b′′(θi) = 1 v(µi) , the score function, the first derivative of the log likelihood function (2.1.6), is given by s(β;φ,y) = ∂l ∂βT = n∑ i=1 ∂li ∂θi dθi dµi ∂µi ∂βT = n∑ i=1 ∂µi ∂βT 1 φ · v(µi) (yi − µi) . (2.1.7) From (2.1.4), we have ∂µi ∂βT = dµi dηi ∂ηi ∂βT = dµi dηi xi, so the score function can be re-expressed as s(β;φ,y) = 1 φ n∑ i=1 xiwi dηi dµi (yi − µi), where wi = ( dµi dηi )2 1 v(µi) . If the maximum likelihood estimate (MLE) of β is an interior point in the parameter space, it is a solution to the estimating equations n∑ i=1 xiwi dηi dµi (yi − µi) = 0, 13 2.1. Generalized Linear Models which can be expressed in matrix notation as XTW∆(y − µ) = 0, (2.1.8) with W = diag(w1, . . . , wn), derivative matrix ∆ = diag ( dη1 dµ1 , . . . , dηn dµn ) and µ = (µ1, . . . , µn) T . Under the assumption of constant dispersion, solving for the MLE of β does not involve the dispersion parameter φ. Nelder and Wedderburn[7, 8] provided a generic model fitting algorithm, the iteratively reweighted least squares (IRWLS) method on transformed dependent variables, for solving the GLM maximum likelihood estimating equations (2.1.8). IRWLS is computationally simple and is equivalent to Fisher’s scoring method first proposed by Fisher in the context of probit analysis. Fisher’s scoring method is a variant of Newton’s method where the Hes- sian matrix is replaced by its expected value. Omitting the dispersion pa- rameter, the expected value of the Hessian matrix is given by E [ d2l dβdβT ] = n∑ i=1 E [ (yi − µi) d dβ ( xiwi dηi dµi ) + xiwi dηi dµi d dβ (yi − µi) ] = n∑ i=1 [ d dβ ( xiwi dηi dµi ) E (yi − µi)− xiwi dηi dµi dµi dβ ] = − n∑ i=1 xiwi dηi dµi dµi dηi dηi dβ = − n∑ i=1 xiwix T i , which can be expressed as E [ d2l dβdβT ] = −XTWX. 14 2.1. Generalized Linear Models Given the current estimates at the kth iteration of Fisher’s scoring method, β̂ (k) , µ̂(k) η̂(k), W(k) and ∆(k), the updated β̂ (k+1) is obtained by β̂ (k+1) = β̂ (k) + ( XTW(k)X )−1 XTW(k)∆(k)(y − µ̂(k)). (2.1.9) Fisher’s scoring method can be shown to be equivalent to IRWLS by rear- ranging (2.1.9) as, XTW(k)X ( β̂ (k+1)− β̂(k) ) = XTW(k)∆(k)(y − µ̂(k)) ⇒ β̂(k+1) = (XTW(k)X)−1 XTW(k)(Xβ̂(k)+ ∆(k)(y − µ̂(k))) = ( XTW(k)X )−1 XTW(k) (̂ η(k)+ ∆(k)(y − µ̂(k)) ) . (2.1.10) This updating is thus equivalent to solving weighted least squares equations with the transformed dependent variable z (k) i = η̂ (k) i + dηi dµi (yi − µ̂i(k)) and weights w (k) i = ( dµi dηi )2 1 v(µ̂i (k)) . In accordance with standard likelihood theory, √ n ( β̂ − β ) converges in distribution to normal with mean 0 and covariance matrix φ · (XTWX)−1, as n tends to infinity. As already noted, the MLE of β does not depend on the dispersion pa- rameter φ. However, as Var(β̂) depends on φ, the dispersion parameter needs to be estimated for any statistical inference. In case of a common dispersion 15 2.2. The Quasi-likelihood Approach parameter, φ can be consistently estimated by the moment estimator φ̃ = 1 n− p n∑ i=1 (yi − µ̂i)2 v(µ̂i) , (2.1.11) where µ̂i = g −1 ( xTi β̂ ) are the MLEs of the means. 2.2 The Quasi-likelihood Approach In practice, we may be able to justify a specified mean structure and variance function for observed responses. However, there may not be any distribution in the exponential family or any actual probability distribution which has the desired mean-variance structure and range of possible response values. Wedderburn[9] proposed the quasi-likelihood approach as an alternative to the GLM approach when inference on the regression parameters is the main interest. In the quasi-likelihood approach, the responses Yi’s are as- sumed independent with mean µi and variance φ · v(µi), where v(·) is a known variance function and the dispersion parameter φ may be unknown but is assumed to be constant. The mean µi depends on covariates xi as in (2.1.4) and (2.1.5) for GLM. No further assumption is required regarding the distribution of Yi. A quasi-likelihood function, if it exists, is defined as Q(µ, φ; y) = n∑ i=1 ∫ µi yi yi − t φ · v(t)dt, which is anticipated to exhibit similar behaviour to a log-likelihood func- tion under mild assumptions. In some cases, the quasi-likelihood function is the GLM log-likelihood function with the corresponding mean-variance 16 2.3. Marginal Models for Longitudinal Data structure. The maximum quasi-likelihood estimate β̃ of the regression pa- rameters maximizes the quasi-likelihood function. When this estimate is an interior point solution, it solves the quasi-likelihood estimating equations which take the same form as (2.1.7) and thus can be solved by Fisher’s algo- rithm (2.1.9) or IRWLS (2.1.10) as for GLM. The resulting quasi-likelihood estimator β̃ is consistent and √ n ( β̃ − β ) converges in distribution to nor- mal with mean 0 and covariance matrix φ ·(XTWX)−1 as n tends to infinity. In general, the dispersion parameter can be consistently estimated by the same moment estimator in (2.1.11), given the quasi-likelihood estimates of the means, µ̃i = g −1 ( xTi β̃ ) . 2.3 Marginal Models for Longitudinal Data 2.3.1 Overview For longitudinal data, observations on a response are taken repeatedly on each subject over a period of time. The repeated measures on the same subject are likely to be statistically dependent due to unobservable or un- measured characteristics of the subjects. An assumption of independence of the repeated measures is usually not justified. Direct application of GLM to longitudinal data is thus inappropriate. Alternative approaches which account for the dependence are needed in order to draw valid statistical in- ferences. Two extensions to GLM applicable to longitudinal data will be discussed and compared. First, we focus on a class of marginal models which is a straightforward extension of GLM to longitudinal data. In contrast to the class of mixed-effects models which will be discussed in the next section, this class of marginal models enable separate construction of the mean struc- ture and the dependence structure. 17 2.3. Marginal Models for Longitudinal Data Suppose m subjects are followed over time with ni repeated measures on the ith subject at times tij, j = 1, . . . , ni. A total of N = ∑m i=1 ni mea- sures are recorded. Let observation yij be the realization of Yij, the response variable of the jth measure on the ith subject at time tij. Each response is associated with a p × 1 vector of covariates xij. These covariates can be invariant or varying over time. The covariate vectors for the same individual can be grouped in a ni × p matrix, Xi = (xi1, · · · ,xini)T . The vectors of responses Yi = (Yi1, . . . , Yini) T are independent across subjects. A marginal model has a three-fold specification. Firstly, the marginal mean E(Yij) = µij depends on the covariates through a known link function g(·) as g(µij) = ηij = x T ijβ, where β = (β1, . . . , βp) T is a vector of p regression coefficients. Secondly, the marginal variance Var(Yij) depends on the marginal mean through a known variance function v(·) that is scaled by a dispersion parameter, Var(Yij) = φ · v(µij). The dispersion parameter can be time dependent or modelled as a function of covariates, but for our scope of discussion, we assume a constant dis- persion across subjects and times. The first two components of the model specification are the same as in GLM, whereas the third component is the extension to accommodate longitudinal data by specification of the pairwise dependence of responses of the same individual. We depict the dependence structure by pairwise correlations where the ni×ni correlation matrix Ri(α) of Yi is completely specified by a s × 1 vector of association parameters α. The variance-covariance matrix of Yi is thus given by Vi = φ ·A1/2i Ri(α)A1/2i , (2.3.1) 18 2.3. Marginal Models for Longitudinal Data where Ai = diag(v(µi1), . . . , v(µini)). With only the mean structure and the variance-covariance structure spec- ified, the joint distribution of Yi is not completely determined in general. For continuous data, the joint distribution of the longitudinal observations are often assumed to be multivariate normal which is completely specified by the first two moments. For discrete data, there is no simple analogue to the multivariate normal distribution. Higher order moment assumptions are involved in constructing a joint distribution for non-normal data which be- comes impractical. For example, Bahadur[10] proposed a parameterization for multivariate binary responses Y = (y1, . . . , yn) T , P(Y = y) = n∏ i=1 µyii (1− µi)1−yi×( 1 + ∑ i<j ρijrirj + ∑ i<j<k ρijkrirjrk + · · ·+ ρ1,...,nr1 . . . rn ) , where ri is the realization of Ri = Yi−µi√ µi(1−µi) , ρij = Corr(Yi, Yj) = E(RiRj), ρijk = E(RiRjRk), . . . , and ρ1,...,n = E(R1 . . . Rn). Diggle et al.[11] and Chaganty and Joe[12] pointed out that the correlations among the binary responses are constrained by the marginal means in complicated ways with the upper and lower bounds determined by the Fréchet bounds. When the marginal mean depends on some covariates, the convenient assumption that the pairwise and higher-order correlations are independent of the covariates is impossible. When a full likelihood is not available, alternative approaches to max- imum likelihood estimation must be employed for inference on regression parameters. One alternative is the generalized estimating equations (GEE) approach. 19 2.3. Marginal Models for Longitudinal Data 2.3.2 Inferential Approach: Generalized Estimating Equations Liang and Zeger[13, 14] proposed the generalized estimating equations (GEE) approach as an extension of generalized linear models for longitudinal data to obtain consistent estimates of regression coefficients and their standard errors. The GEE approach does not require the assumption of an exponen- tial family distribution for the marginal model. Therefore, it can be regarded more generally as an extension of the quasi-likelihood approach for clustered data. The GEE approach provides a framework for estimation for the class of marginal models in the previous section. Due to the often complicated underlying stochastic process that gener- ates the longitudinal data, the true correlation structure is usually difficult to determine. When the dependence of the mean on the covariates is of pri- mary interest, the regression parameters β are the main target of inference and the association parameters α can be treated as nuisance parameters. In these circumstances, it would be desirable if valid inference for the regres- sion parameters could be achieved without having to specify the correlation structure correctly. The GEE approach gives a consistent estimate of β and the asymptotic variance-covariance matrix of this estimate regardless of the true correlation structure. The specific estimating equations of the GEE approach depend upon a so-called “working” correlation matrix, also denoted Ri(α), but this need not be the true correlation matrix. Common choices of “working” cor- relation structure include, 1. Independence where Ri(α) = Ini , the ni × ni identity matrix; 20 2.3. Marginal Models for Longitudinal Data 2. Exchangeable where Corr(Yit, Yit′) = α, (2.3.2) for all t 6= t′; 3. AR−1, a continuous time analogue of the first-order autoregressive process, where Corr(Yit, Yit′) = α |t−t′| for all t 6= t′; and 4. m-dependence where Corr(Yit, Yit+l) = { αk, for l = 1, . . . ,m 0, for l = m+ 1, . . . , ni . GEE can be viewed as quasi-likelihood estimating equations for longitu- dinal data taking the form m∑ i=1 DTi V −1 i (yi − µi) = 0, (2.3.3) where Di = dµi dβ is a ni × p matrix, Di = ∂µi1/∂β1 · · · ∂µi1/∂βp ... . . . ... ∂µini/∂β1 · · · ∂µini/∂βp , and Vi = φ ·A1/2i Ri(α)A1/2i as specified in (2.3.1) but with Ri(α) being the “working” correlation matrix. Since Di and Vi do not depend on yi, the left- hand side of (2.3.3) has expectation equal to 0, so the estimating equations are unbiased irrespective of the choice of “working” correlation matrix. The 21 2.3. Marginal Models for Longitudinal Data roots of the estimating equations are thus consistent if the mean structure is correctly specified so that E (yi − µi) = 0. Estimating equations (2.3.3) can be solved iteratively by a modified Fisher’s scoring algorithm augmented with moment estimation of α and φ. Regard- ing the left hand side of (2.3.3) as the estimating function, the expected value of its first derivative matrix is given by −∑mi=1 DTi V−1i Di. Hence, given the kth step estimates β̂ (k) , α̂(k) and φ̂(k), the updated β̂ (k+1) , α̂(k+1) and φ̂(k+1) can be obtained by a two-stage iterative procedure: 1. Compute µ̂(k), D (k) i and V (k) i based on the current estimates β̂ (k) , α̂(k) and φ̂(k). Then, update β̂ (k+1) by the modified Fisher’s scoring algo- rithm, β̂ (k+1) = β̂ (k) + { m∑ i=1 D (k)T i V (k)−1 i D (k) i }−1{ m∑ i=1 D (k)T i V (k)−1 i ( yi − µ̂(k)i )} , or equivalently by β̂ (k+1) = { m∑ i=1 D (k)T i V (k)−1 i D (k) i }−1{ m∑ i=1 D (k)T i V (k)−1 i zi } , (2.3.4) where zi = D (k) i β̂ (k) + ( yi − µ̂(k)i ) . Step (2.3.4) is equivalent to weighted least squares with transformed dependent variable zi, covariate matrix Di and weight matrix V −1 i . 2. After obtaining the fitted mean µ̂(k+1) from the updated regression coefficients β̂ (k+1) , the standardized residuals r (k+1) ij = ( yij − µ̂(k+1)ij ) √ v(µ̂ (k+1) ij ) , (2.3.5) 22 2.3. Marginal Models for Longitudinal Data form the basis for estimation of φ and α. The association parameter, α, is estimated by a √ m-consistent estimator, α̂(β, φ), which is a function of the data, β and φ. The unknown dispersion parameter, φ, in α̂(β, φ) is estimated by a √ m-consistent estimator, φ̂(β), which is a function of the data and β. The unknown β in both α̂(β, φ) and φ̂(β) is replaced by β̂ (k+1) evaluated in step 1. The two steps: the updating to β̂ (k+1) and the estimation of φ and α are iterated until convergence. An initial value for β is usually obtained from a GLM fit pretending the observations are independent. Under mild regularity conditions, as the number of subjects m tends to infinity, √ m(β̂ − β) is asymptotically p-variate normal with mean 0 and covariance matrix ΣGEE given by ΣGEE = (2.3.6) lim m→∞ m· ( m∑ i=1 DTi V −1 i Di )−1( m∑ i=1 DTi V −1 i Cov(Yi)V −1 i Di )( m∑ i=1 DTi V −1 i Di )−1 . For finite samples, the covariance matrix of β̂ can be approximated by the so-called “sandwich estimate”, Σ̂β̂ ≈ (2.3.7)( m∑ i=1 D̂Ti V̂ −1 i D̂i )−1( m∑ i=1 D̂Ti V̂ −1 i (yi − µ̂i) (yi − µ̂i)T V̂−1i D̂i )( m∑ i=1 D̂Ti V̂ −1 i D̂i )−1 , where the term Cov(Yi) in the asymptotic covariance matrix (2.3.7) is re- placed by a simple moment estimate (yi − µ̂i) (yi − µ̂i)T , while Di and Vi are also replaced by their corresponding estimates. The consistency of β̂ and the covariance matrix follows provided the mean structure is correctly speci- fied regardless of the choice of “working” correlation structure. The validity 23 2.3. Marginal Models for Longitudinal Data of these asymptotic results does not depend on the specific choice of estima- tors of α and φ as long as they are √ m-consistent. Thus, the GEE approach is robust to the “working” correlation assumption. However, higher statis- tical efficiency is achieved if the “working” correlation structure is closer to the true correlation structure[13]. Liang and Zeger[13, 14] also proposed the following moment estimators of α and φ. Given the standardized residuals (2.3.5) at a certain iteration, the moment estimator of φ is given by φ̂ = ∑m i=1 ∑ni j=1 r 2 ij N − p . The estimator of α depends on the specific structure of R(α). For example, when α represents the pairwise correlation in an exchangeable correlation structure as defined in (2.3.2), for given φ, it can be estimated by α̂ = φ ·∑mi=1∑j>k rijrik∑m i=1 1 2 ni(ni − 1)− p. The sandwich estimator of variance may exhibit bias when the number of clusters is small, e.g. m < 30. Paik[15] proposed the jackknife variance estimator as an improvement to the sandwich estimator. This jackknife es- timator is defined as Σ̃β̂ = m− p m m∑ i=1 ( β̂−i − β̂ )( β̂−i − β̂ )T , (2.3.8) where β̂−i is the estimate obtained for β when the i th subject is removed from the dataset. In a simulation study, the jackknife variance estimator was shown to have superior properties over the sandwich variance estima- tor. This jackknife estimator is sometimes referred to as the fully iterated jackknife estimator. In each iteration, the model is fitted by fully iterated 24 2.3. Marginal Models for Longitudinal Data Newton method to the dataset with one cluster removed. Since the model needs to be fitted as many times as the number of clusters, this jackknife estimator is computationally intensive when the number of clusters is mod- erate or large. Halekoh et al.[16] implemented the sandwich estimator, the fully iterated jackknife estimator as well as an approximate jackknife esti- mator and an one-step jackknife estimator in the R package geepack. For the one-step jackknife estimator, each β̂−i is obtained by one-step Newton iteration. Their simulations and other previous simulations indicate that the latter two computationally less demanding versions of jackknife estimates are in many cases accurate approximates to the fully iterated jackknife variance estimate. This class of marginal models is a suitable modelling approach in many clinical trials and observational studies when the average effects of treat- ments or exposures in the population of interest are the primary target of inference. The regression parameters are directly interpretable as the popu- lation average effects. This interpretation is invariant to the specification of the correlation structure, or more generally the association structure. GEE permits consistent point estimation and interval estimation under weak assumptions. Hypothesis tests for contrasts of β are based on the resulting Wald-type statistics. Suppose we are interested in testing H0 : Cβ = 0 against H0 : Cβ 6= 0, with C being a contrast matrix of rank q. The Wald-type statistic W 2 = β̂ T CT ( CΣ̂β̂C T )−1 Cβ̂ (2.3.9) asymptotically follows a χ2-distribution with q degrees of freedom. The Wald-type test can be used to compare nested models if the reduced model can be obtained by setting some contrasts in the full model to zero. However, likelihood-based tools such as likelihood ratio tests are not available. There 25 2.4. Generalized Linear Mixed-effect Models are also no widely accepted criteria, such as the Akaike Information Crite- rion (AIC), for comparing non-nested models. Thus, model comparisons are sometimes difficult, especially for non-nested models. 2.4 Generalized Linear Mixed-effect Models 2.4.1 Overview Generalized linear mixed-effect models (GLMM) is another extension of gen- eralized linear models for repeated measures and longitudinal data. In GLMM, some of the regression coefficients are assumed to vary across subjects to reflect natural between-subject heterogeneity due to unmeasured and unob- servable factors. Given a q × 1 vector of random effects bi, Yij is assumed to have a con- ditional distribution in the exponential family with density function, fY (·), as (2.1.1). Conditional on the random effects, the Yij’s are usually assumed to be independent of one another. This is sometimes referred to as the con- ditional independence assumption. Though the conditional independence assumption is not necessarily required, it is invariably adopted for discrete responses since direct modelling of correlation structure is difficult. The conditional mean of Yij is modelled as g (E[Yij|bi]) = ηij = xTijβ + uTijbi, where g(·) is a known link function and the q × 1 vector uij is a subset of the p× 1 vector of covariates xij. The random effects bi are assumed to be independently and identically distributed with density fb(·), having mean E(bi) = 0 and depending on a set of parameters Vb. The random effects distribution is often assumed to be multivariate normal for mathematical 26 2.4. Generalized Linear Mixed-effect Models convenience. In this case, Vb represents the covariance matrix of the ran- dom effects. When the conditional independence assumption is made, association among the repeated measures is not modelled directly. Instead, within subject cor- relation is entirely induced by the random effects. In the case of linear mixed-effects models where the continuous responses are assumed to be con- ditionally independent and normal with common variance σ2 and an identity link g(t) = t is used, the induced marginal covariance structure is given by Cov(Yi) = UiVbU T i + σ 2Ii, where Ui = (ui1 · · ·uini)T is the ni × q matrix of covariates corresponding to the random effects and Ii is the ni × ni. identity matrix. For discrete responses, a nonlinear link would usually be used and the induced covariance matrix is often mathematically intractable. The regression coefficients β are usually referred to as the fixed effects. In the general case, they can be interpreted as the effects of the covariates on an individual whose random effects bi are equal to 0. In some cases the random effects are assumed only for the time independent covariates, repre- senting heterogeneity at baseline levels. In that case, the fixed effects for the time dependent covariates are interpreted as the effects of the covariates at the subject level. These common effects across individuals do not necessarily represent the effects at population level. Therefore, mixed-effects models are sometimes called subject-specific models, in contrast the marginal models in Section 2.3 which are referred to as population-averaged models. When an identity link is used, the conditional mean structure of GLMM, E[Yij|bi] = xTijβ + uTijbi, 27 2.4. Generalized Linear Mixed-effect Models and its implied marginal mean structure, E[Yij] = x T ijβ, happen to have the same functional form for β. In general when nonlinear links are adopted, the implied marginal mean structure, E[Yij] = Eb[g −1(xTijβ + u T ijbi)], and the conditional mean structure have different functional forms. The functional form of the implied marginal mean structure depends on the dis- tribution of the random effects and the link function. In many cases, such as when a logit link and normal random effects are assumed, an explicit an- alytic form may not exist. As a consequence, the regression coefficients β may not have a straightforward interpretation in terms of the marginal mean structure. This issue can be illustrated by an example. Suppose given a random intercept bi0 and a random slope bi1, the responses Yij are independent with conditional means µ (C) ij . which depend on the conditional linear predictor η (C) ij through a log link, log ( µ (C) ij ) = η(C) = (β0 + bi0) + (β1 + bi1)xij. (2.4.1) The marginal mean µ (M) ij can be obtained by taking expectation over the distribution of the random effects, µ (M) ij = E [ µ (C) ij ] = E [exp {(β0 + bi0) + (β1 + bi1)xij}] = exp {β0 + β1xij} · E [exp {bi0 + bi1xij}] = exp {β0 + β1xij} ·Mb0,b1 (1, xij) , 28 2.4. Generalized Linear Mixed-effect Models where Mb0,b1 (·, ·) is the moment-generating function (MGF) of the joint dis- tribution of b0 and b1. If b0 and b1 are statistically independent, the joint MGF is the product of the marginal MGFs Mb0(·) and Mb1(·). If the ran- dom effects are assumed to follow independent normal distributions with mean 0 and variance σ20 and σ 2 1 respectively, the log transformed marginal mean happens to have a simple analytic form log ( µ (M) ij ) = ( β0 + 1 2 σ20 ) + β1xij + 1 2 σ21x 2 ij, (2.4.2) which is no longer a linear but a quadratic function in xij. The impact of xij on the marginal mean µ (M) ij depends not only on β1 but also on the variability of the random slope σ21. The regression coefficient β1 does not have a simple interpretation in terms of the population averaged effect of the predictor x. Figure 2.1 shows the conditional means as well as the implied marginal means for parameter values β0 = −5, σ0 = 2, β1 = 1.5, σ1 = 1. For the conditional means, the random intercept and random slope are evaluated at the 10th, 20th, . . . , 80th and 90th percentiles. 2.4.2 Inferential Approach: Maximum Likelihood Maximum likelihood estimation is the standard inferential approach for GLMM. The marginal likelihood function for the parameters β, φ and Vb is L(β,Vb, φ; y1, . . . ,ym) = m∏ i=1 ∫ { ni∏ j=1 fY (yij|bi; xi,β, φ) } fb(bi; Vb)dbi. The likelihood function does not have a closed form in general. Numeri- cal methods are required to evaluate the integral that yields the marginal likelihood of yi Li(β,Vb, φ; yi) = ∫ { ni∏ j=1 fY (yij|bi; xi,β, φ) } fb(bi; Vb)dbi, (2.4.3) 29 2.4. Generalized Linear Mixed-effect Models which can also be expressed as an expectation of the conditional likelihood over the distribution of the random effects, E [∏ni j=1 fY (yij|bi; xi,β, φ) ] . If the random effects are assumed to be normally distributed and the number of random effects q is small, for example fewer than 5, the marginal likeli- hood (2.4.3) is usually approximated by Gauss-Hermite quadrature. Using a fixed set of K quadrature points and weights {(z1, w1), . . . , (zK , wK)}, the numerical approximation is simply a weighted sum of the conditional likeli- hood evaluated at scaled quadrature points, bi = V 1/2 b zk, Li(β,Vb, φ; yi) ≈ K∑ k=1 wk · { ni∏ j=1 fY (yij|bi = V1/2b zk;β) } . The accuracy of the approximation is higher with more quadrature points. Adaptive Gaussian quadrature is an alternative numerical approximation which shifts the centers of the Gaussian approximation from 0 to the pos- terior mode of the random effects. Higher accuracy is achieved with the same number of quadrature points than with Gauss-Hermite quadrature. Numerical optimization, such as the quasi-Newton method with numerical derivatives, is then applied to locate the maximum of the marginal likelihood function. Given the MLEs of β, Vb and φ, inference on the fixed effects β follows standard likelihood theory. Hypothesis tests for nested models can be carried out by the likelihood ratio (LR) test, in addition to the Wald-type test (2.3.9) which is the sole option for the GEE approach. The LR test statistic, G2 = 2 (lfull − lreduced) , (2.4.4) follows asymptotically a χ2-distribution with s degrees of freedom, where lfull and lreduced are the maximum log likelihood under the full model and the reduced model respectively and s is the number of restrictions on the 30 2.4. Generalized Linear Mixed-effect Models parameters of the full model for obtaining the reduced model. The LR test is sometimes preferred to the Wald-type test for higher statistical sensitivity and its property of invariance to parameterization. Testing of variance components by the LR test are non-standard prob- lems. Under the null hypothesis, some variance components are putatively zero, falling on the boundary of the parameter space. In general, the asymp- totic distributions of the test statistics (2.4.4) are mixtures of χ2-distributions[17]. The form of mixture depends on the number of parameters of interest on the boundary and in the interior space under null hypothesis, and the number of nuisance parameters on the boundary and in the interior space. For the sim- plest case when a single variance component is the only parameter of interest and all other nuisance parameters lie in the interior space, the asymptotic dis- tribution of the LR test statistic is a 50:50 mixture of χ20 and χ 2 1 distributions. Random effects bi of individual subjects can be predicted by the empirical Bayes estimates, b̂i = E [ bi|Yi = yi;β = β̂,Vb = V̂b, φ = φ̂ ] , with the posterior mean of the random effects evaluated as if all the parame- ters conditioned on are known with values equal to the corresponding plug-in estimates β̂, V̂b and φ̂. The integration yielding the conditional expectation typically needs to be evaluated by numerical methods. The maximum likelihood estimates of the variance components of re- gression models are well-known to be underestimates in small samples. For example, in simple linear models, the maximum likelihood estimate of the variance is obtained by dividing the residual sum of squares by n, the num- ber of observations. An unbiased estimate of this variance is obtained by replacing n in the denominator of the maximum likelihood estimator by the 31 2.5. Log-linear Models residual degrees of freedom. For general linear models or linear mixed-effects models, the bias adjustment is more than a simple adjustment to the denom- inator. This motivated a variant of maximum likelihood estimation, namely restricted maximum likelihood (REML) estimation, first introduced by Pat- terson and Thompson[18]. One explanation of the underestimation is that maximum likelihood does not take into account the fact that β is also esti- mated, while the estimator of the variance components is not orthogonal to the estimator of β. The main idea of REML is therefore to estimate the vari- ance components by maximum likelihood upon a partition of the data whose distribution does not depend on β but only on the variance components. In the case of general linear model, one way to partition the data is to use the ordinary least squares residuals for estimation of the variance components. After obtaining the REML estimates of the variance components, β is esti- mated by the usual general least squares approach. The notion of REML generalizes to GLMM, but the estimation algorithm is less straightforward than for linear models. For example, McCulloch con- sidered a class of probit normal models for binary data and described a version of REML implemented with the EM algorithm[19]. As the bias of maximum likelihood for variance components diminishes when the sample size is substantially larger than the dimension of β, the difference between maximum likelihood and REML becomes less important[20]. 2.5 Log-linear Models 2.5.1 Mixed-effects Models In this section, we describe a class of log-linear models with random intercept which will be utilized for data analysis in the following chapters. Given a univariate random effect bi, the responses Yij are assumed to be conditionally independent, following some distribution in the exponential family (2.1.1) 32 2.5. Log-linear Models with conditional mean E(Yij|bi) = µ(C)ij . The conditional mean depends on the random intercept and a p× 1 vector of covariates xij through a log link, log ( µ (C) ij ) = bi + ηij = bi + β0 + x T ijβ, where β0 is the overall intercept and β is a p× 1 vector of fixed effects. The random intercept is assumed to follow some distribution with density func- tion fb(·) having mean E(bi) = 0 and variance Var(bi) = σ2b . Conditional on the random effect, the regression coefficient correspond- ing to a continuous covariate can be interpreted as the log relative risk for a binary response or the log relative rate for a count response for every unit increment of the covariate, adjusted for all other covariates. When the co- variate is binary, representing two exposures or treatments, the regression coefficient can be interpreted as the adjusted log relative risk or relative rate comparing the two exposures or treatments. The likelihood-based approach for GLMM described in Section 2.4 can be applied to analyze data using this log-linear random intercept model. 2.5.2 Marginalized Models The marginal mean structure for the log-linear mixed-effects model can be obtained by taking the expectation of the conditional mean over the distri- bution of the random intercept. Denoting the marginal mean as µ (M) ij = E(Yij) = E[E(Yij|bi)] = E(µ(C)ij ), 33 2.5. Log-linear Models the marginal mean structure is given by log ( µ (M) ij ) = η (M) ij = log ( E [ exp(bi + β0 + x T ijβ) ]) = log ( E [ ebi ]) + β0 + x T ijβ = log (Mb(1)) + β0 + x T ijβ = log (Mb(1)) + ηij (2.5.1) where Mb(·) is the MGF of the random intercept bi. The constant term log (Mb(1)) can be absorbed into the intercept, yielding log ( µ (M) ij ) = β∗0 + x T ijβ. (2.5.2) In this model, the marginal mean structure has the same functional form as the conditional mean structure. It follows that the regression parameters β have the same interpretation at the individual level as well as at the popu- lation averaged level regardless of the distribution of the random intercept. The only exception is the intercept term which depends on the distribution of the random intercept as the MGF of the random effect enters the intercept term in the implied marginal mean structure. In the special case of a normal random effect distribution with variance σ2b , β∗0 = β0 + 1 2 σ2b . (2.5.3) Binary response When the response Yij is binary, the conditional and marginal distribution are both Bernoulli with mean µ (C) ij and µ (M) ij respectively. The marginal variance is simply Var(Yij) = µ (M) ij (1− µ(M)ij ). 34 2.5. Log-linear Models The marginal covariance of two distinct responses on the same individual, Yij and Yik where j 6= k, is given by Cov(Yij, Yik) = Cov(E[Yij|bi],E[Yik|bi]) + E(Cov[Yij, Yik|bi]) = Cov(µ (C) ij , µ (C) ik ) = E(µ (C) ij µ (C) ik )− µ(M)ij µ(M)ik = E (exp{2bi + ηij + ηik})− µ(M)ij µ(M)ik = Mb(2) M2b (1) µ (M) ij µ (M) ik − µ(M)ij µ(M)ik = ( Mb(2) M2b (1) − 1 ) µ (M) ij µ (M) ik . (2.5.4) The marginal correlation is thus Corr(Yij, Yik) = Cov(Yij, Yik)√ Var(Yij)Var(Yik) = ( Mb(2) M2b (1) − 1 ) µ (M) ij µ (M) ik√ µ (M) ij (1− µ(M)ij )µ(M)ik (1− µ(M)ik ) , which depends upon the random effect distribution and the marginal means in an obscure way. In general, the pairwise correlation increases with the marginal mean. When the µ (M) ij ’s are small, the marginal correlation is ap- proximately Corr(Yij, Yik) ≈ ( Mb(2) M2b (1) − 1 )√ µ (M) ij µ (M) ik . For a fixed individual i, if the marginal mean µ (M) it does not vary much over time so that √ µ (M) ij µ (M) ik ≈ µ(M)i , an exchangeable correlation structure is a close approximation to the above implied marginal correlation. If no baseline covariates are included in the model such that µ (M) i does not vary across individuals, a common exchangeable correlation structure is a reasonable 35 2.5. Log-linear Models choice of “working” correlation structure. In the special case when there is no covariate effect, the correlation matrix has exactly an exchangeable structure. Count response When the response Yij is a count, the conditional distribution of Yij given bi is usually assumed to be Poisson, with mean µ (C) ij and variance equal to the mean. By observing in (2.5.1) that exp(ηij) = µ (M) ij Mb(1) , the variance of the marginal distribution is Var(Yij) = Var(E[Yij|bi]) + E(Var[Yij|bi]) = Var(µ (C) ij ) + E(µ (C) ij ) = Var (exp{bi + ηij}) + µ(M)ij = exp{2ηij} · Var (exp{bi}) + µ(M)ij = ( E(exp{2bi})− E(exp{bi})2 ) [ µ(M)ij Mb(1) ]2 + µ (M) ij = ( Mb(2) M2b (1) − 1 ) µ (M)2 ij + µ (M) ij . When the random intercept follows normal distribution with variance σ2b , the marginal variance, Var(Yij) = ( eσ 2 b − 1 ) µ (M)2 ij + µ (M) ij , depends on the marginal mean µ (M) ij and the variance component σ 2 b . For σ2b > 0, the marginal variance is greater than the marginal mean. The overdispersion of Yij with respect to Poisson is due to the extra variation in- 36 2.5. Log-linear Models troduced by the random intercept representing baseline heterogeneity which persists throughout the course of the longitudinal study. Since the constant dispersion parameter φ can be eliminated in the estimating equations (2.3.3) and the sandwich estimator of variance (2.3.8), when an independence or a fixed “working” correlation matrix is used, the GEE estimator of β and the sandwich estimator of variance are robust to overdispersion. However, when a more complicated “working” correlation structure is used, some associa- tion parameters α need to be estimated. The consistent estimator of α may involve φ. In this case, we have to allow for and estimate the overdispersion φ when GEE with Poisson variance function is used for this marginal model. The marginal covariance in (2.5.4) does not depend the distributional form of the response but only the marginal means of the responses and the distributional form of the random effect. The marginal correlation of the counts Yij and Yik is thus Corr (Yij, Yik) = Cov (Yij, Yik)√ Var(Yij)Var(Yik) = ( Mb(2) M2b (1) − 1 ) µ (M) ij µ (M) ik√([ Mb(2) M2b (1) − 1 ] µ (M)2 ij + µ (M) ij )([ Mb(2) M2b (1) − 1 ] µ (M)2 ik + µ (M) ik ) = 1√( 1 + 1 /[ Mb(2) M2b (1) − 1 ] µ (M) ij )( 1 + 1 /[ Mb(2) M2b (1) − 1 ] µ (M) ik ) , which depends on the random effect distribution and the marginal means in a complicated way. When the within subject covariate effects are small such that µ (M) ij ≈ µ(M)ik for j 6= k, an exchangeable structure may provide a good approximation to the true correlation structure. When the between subject covariate effects are also small such that the µ (M) i ’s are similar across subjects, a common exchangeable structure with a common pairwise correlation of ob- 37 2.5. Log-linear Models servations across subjects and time may be a reasonable approximation. In the extreme case when there are no covariate effects at all, the marginal cor- relation matrix has exactly an exchangeable structure. In general, direct uti- lization of the exact marginal correlation structure involves some difficulties. The estimation of the component ξ = Mb(2) M2b (1) is non-standard and most generic GEE model-fitting algorithms do not support the use of such a customized “working” correlation. When a simple “working” correlation structure better than the independence structure is sought, the exchangeable structure may be applied as an effort to partially account for the within-subject correlation. The GLMM log-linear random intercept model and its derived marginal model are applied to analyze the BEYOND trial data. The GLMMs are fitted by MLE, whereas the marginal models are fitted by GEE with an exchangeable correlation structure. Chapter 3 presents the results when the response is a binary indicator of relapse onset on the corresponding day on study. Chapter 4 considers an alternative form of response obtained by collapsing the daily binary responses to counts over time intervals. 38 2.5. Log-linear Models Figure 2.1: Conditional and implied marginal mean structures of a log-linear model with random intercept and slope: random effects are evaluated at the 10th, 20th, . . . , 90th percentiles. −1.0 −0.5 0.0 0.5 1.0 1.5 2.0 − 8 − 6 − 4 − 2 0 2 4 xij lo g ( µ ij ) conditional marginal GLMM : log(µij(C)) = (β0 + bi0) + (β1 + bi1)xij Parameters : β0 = − 5, β1 = 1.5, σ0 = 2, σ1 = 1 39 Chapter 3 Analysis of BEYOND Trial Data 3.1 Introduction In the BEYOND trial, 897 patients were randomized to the 250 mcg IFNB group, 899 patients were randomized to the 500 mcg IFNB group and 448 patients randomized to the 20 mg GA group. Serum samples were collected on most of the patients at baseline and semiannually in the two IFNB groups. Serum samples were tested for NABs to IFNB. NAB titers were measured by the MxA assay. Among the 1773 patients who had at least one NAB titer measure, 320 out of 888 in the 250 mcg IFNB arm and 340 out of 885 in the 500 mcg IFNB arm had at least one NAB titer of 20 NU/mL or higher. Variables of Interest Onset of relapse. The clinical outcomes of interest are the dates of onset of new relapses. Whether a new relapse started or not was recorded for every day on study for each individual patient. By definition, a new relapse cannot start before the patient recovers from an ongoing relapse. Therefore, during the periods of ongoing relapses patients were not subjected to the risk of new relapses. We call the time between the second day and the last day of a relapse inclusively days-not-at-risk and other time during the trial days-at- risk. Inference on the probability of a new relapse starting on a particular day should be based exclusively on observations on days-at-risk. Unfortunately, 40 3.1. Introduction the ending dates of relapses were not recorded in the BEYOND trial. Hence, we adopt an ad hoc rule which assumes that all relapses last for exactly 30 days. The 29 days following the start of each relapse are excluded from all our reported analyses. NAB titers. The NAB titers of serum samples collected at baseline and every 6 months afterward were measured by the MxA assay. The MxA assay has a detection limit of 20 NU/mL. NABs to IFNB are regarded as absent if the NAB titer is less than the detection limit. In the BEYOND dataset, some NAB measures are missing at baseline. In all our analyses in this and the next chapter, we assume the titer values corresponding to any missing NAB measures at baseline are less than 20 NU/mL. NAB status. In our analyses, the NAB status for each day is determined by the “all switches considered” rule with “confirmation required” for both switching from NAB− to NAB+ as well as from NAB+ back to NAB−. The requirement of confirmation is due a belief that the reliability of the NAB titers is low. Detailed description of this ad hoc classification rule is given in Section 1.3.3. The cutoff for switching between NAB− and NAB+ is chosen to be 20 NU/mL, the detection limit of the MxA assay. Refined NAB status. A refinement of the NAB status classification is also considered. The NAB+ periods are further divided into periods of Low NAB+, Medium NAB+ and High NAB+ to explore the usefulness of levels of NAB titer in predicting strength of NAB effects. Confirmation is not required for switchings between sublevels of NAB+. The NAB+ periods are classi- fied as Low NAB+ for NAB titers in the interval [20NU/mL, 100NU/mL), Medium NAB+ for NAB titers in the interval [100NU/mL, 400NU/mL) and High NAB+ for NAB titers equal to or higher than 400 NU/mL. 41 3.1. Introduction Table 3.1 summarizes the scheduled clinical visits where the first NAB serum titer became available for individual patients. Some patients did not have their first NAB titer available until some time after baseline. Since NABs are not expected to develop in the very early part of treatment, or to be present at baseline before the treatment is administered, we assume that the NAB titers for these patients are 0 NU/mL for all visits before NAB titers are available. Table 3.1: Time of the first available NAB titer Groups Baseline/Day 1 Week 26 Week 52 Week 78 End of study 250 mcg 863 24 1 0 0 500 mcg 863 19 1 1 1 Using the above NAB classification rule, 248 of the 897 (28%) patients in the 250 mcg group have at least one NAB+ period while the remaining 639 patients never have any NAB+ periods. In the 500 mcg group, 262 of the 899 (29%) patients have at least one NAB+ period while the remaining 623 patients have no NAB+ periods. Table 3.2 shows the distribution of the time to the first confirmed NAB+ among eventually NAB+ patients. In both treatment arms, the majority of the eventually NAB+ became NAB+ in the first one and a half years on study. More than half of the eventually NAB+ patients were already NAB+ by around 6 months on study. Table 3.3 shows the distribution of the number of NAB+ periods for each patient. All eventually NAB+ patients had only one NAB+ period. About 85% of the eventually NAB+ patients in both groups, had an initial NAB− period follow by a NAB+ period which extended to the end of their follow- 42 3.1. Introduction Table 3.2: Time to the first confirmed NAB+ Time† (Year) Groups 0.0 0.5 1.0 1.5 2.0 2.5 250 mcg 2 144 72 18 7 0 500 mcg 1 173 56 27 4 1 †Rounded to the nearest 0.5 year. up. The remaining 15% first had a NAB− period, followed by a NAB+, then by a conversion back to NAB−. Two exceptional cases in the 250 mcg group and one in the 500 mcg group had NAB+ periods beginning from baseline. Table 3.3: NAB status dynamics All Switches considered ; confirmed NAB+ and confirmed NAB- NAB Dynamics 250 mcg 500 mcg Eventually NAB+ − + 215 229 − + − 31 32 + 1 1 + − 1 0 subtotal 248 262 Never NAB+ − 639 623 Total 887 885 The total number of days at risk for the eventually NAB+ patients in the 250 mcg group ranges from 254 to 1267 days. The never NAB+ patients in the 250 mcg group have days at risk ranging from 15 to 1280 days. For 43 3.1. Introduction the 500 mcg group, the range for eventually NAB+ patients is 378 to 1242 days, whereas the range for never NAB+ patients is 7 to 1197 days. In both IFNB treated groups, the average proportion of time on study as NAB− for individual eventually NAB+ patients is about 36%. Figures 3.1a and 3.1b show the distribution of the proportion of NAB− periods The total number of relapses during the course of trial for individuals in the 250 mcg group ranges from 0 to 6 with an average of 0.86 for eventually NAB+ patients and ranges from 0 to 7 with an average of 0.78 for never NAB+ patients . The annualized crude relapse rate is 0.362 for the even- tually NAB+ group and 0.356 for the never NAB+ group. In the 500 mcg group, the total number of relapses ranges from 0 to 5 with an average of 0.77 for eventually NAB+ patients and from 0 to 8 with an average of 0.72 for never NAB+ patients. The annualized crude relapse rate is 0.330 for the eventually NAB+ group and 0.334 for the never NAB+ group. Table 3.4 is a frequency table of the total number of relapses per patient throughout the course of study for the eventually NAB+ patients. A similar distribution for the number of relapses is observed for both IFNB treated groups. For the 250 mcg and the 500 mcg groups respectively, 51% and 56% of the patients did not experience any relapses during the study; 26% and 23% had one relapse; and 23% and 21% had two or more relapses. Table 3.4: Number of relapses per eventually NAB+ patients Number of relapses Groups 0 1 2 3 4 5 6 250 mcg 126 65 36 11 7 2 1 500 mcg 147 60 32 15 7 1 0 44 3.1. In tro d u ction Figure 3.1: Proportion of time on study as NAB− for eventually NAB+ patients. (a) The 250 mcg group Proportion of time on study as NAB− N u m b e r o f P a t i e n t s 0.0 0.2 0.4 0.6 0.8 1.0 0 1 0 2 0 3 0 4 0 5 0 6 0 (b) The 500 mcg group Proportion of time on study as NAB− N u m b e r o f P a t i e n t s 0.0 0.2 0.4 0.6 0.8 1.0 0 2 0 4 0 6 0 8 0 45 3.2. Basic Model Specification The NAB effect in our report refers to the relative risk of relapse onset if a patient is NAB+ instead of NAB− on that day. The NAB effect is assessed by comparing the relapse risk during the NAB+ periods to the risk during the NAB− periods within individuals. Therefore only eventually NAB+ patients provide the relevant information for estimating NAB effects. All our analyses are based only on the eventually NAB+ groups. 3.2 Basic Model Specification We construct a mixed effect log-binomial regression model for the daily prob- ability of a relapse onset. The response Yit is a binary indicator of whether a relapse started at time t for the ith patient. Conditional on the random effects, Yit is assumed to follow a Bernoulli distribution with probability µ (C) it . The conditional independence assumption is adopted. The primary target of inference is the NAB effect quantified as the per- centage increase in relapse risk on any given day on study during NAB+ periods relative to NAB− periods. This succinct summary of relative risk can be directly estimated by relating the daily relapse probability µ (C) it to a linear predictor through a log link. In general, we consider linear predictors for the daily relapse probability which includes additive components of 1) a NAB effect, 2) the time trend of the natural progression of relapse frequency over the course of trial, and 3) a random intercept representing inherent be- tween subject variation in MS disease activity. Figure 3.2 shows the log empirical annualized relapse rates for preceding 3-month periods separately for the two IFNB treated groups. For month 36 to 39, the empirical relapse rate is zero in both groups; whereas for month 39 to 42, the empirical relapse rate is zero for the 500 mcg group. Since relapse rates are presented on log scale, these two periods are not included in Fig- 46 3.2. B asic M o d el S p ecifi cation Figure 3.2: Empirical annualized relapse rates for preceding 3-month periods and the corresponding loess fits for eventually NAB+ patients. Month−on−Study L o g E m p i r i c a l A n n u a l i z e d R e l a p s e R a t e 0 3 6 9 12 15 18 21 24 27 30 33 36 39 −2.5 −2.0 −1.5 −1.0 −0.5 0.0 0.08 0.10 0.20 0.40 0.80 1.00 E m p i r i c a l A n n u a l i z e d R e l a p s e R a t e l l l l l l l l l l l l l 250 mcg group 500 mcg group 47 3.2. Basic Model Specification ure 3.2. The loess fit to the log empirical relapse rate illustrates a decreasing time trend for the 250 mcg group which is reasonably linear. On the other hand, no distinguishable time trend is suggested in the 500 mcg group. We consider an initial model with a simple mean structure, log ( µ (C) it ) = bi + β0 + β1t+ β2NABit, (3.2.1) where bi is a random intercept from normal(0, σ 2 b ), β0 is the intercept; β1 is the slope of the linear time trend; and β2 represents the NAB effect with NABit being the binary covariate of NAB status corresponding to Yit coded as 0 for NAB− and 1 for NAB+. The model assumes that the log daily relapse risk is changed by a constant level during the NAB+ period. When the NAB status switches from NAB+ back to NAB−, the log daily relapse risk returns to the initial NAB− level adjusted for the time trend without any NAB effects carried over. As described in Section 2.5.2, the marginalized model derived from the initial mixed-effects model 3.2.1 has mean structure, log ( µ (M) it ) = β∗0 + β1t+ β2NABit, (3.2.2) where the intercept β∗0 = β0 + 1 2 σ2b as mentioned in (2.5.3). The mixed-effects model and the marginalized model share the same regression coefficients β1 and β2. For analysis of the daily relapse data presented in this chapter, the log- binomial random intercept models are fitted by maximum likelihood, whereas the marginalized models are fitted by GEE with an exchangeable “working” correlation structure (GEE-EXCH) with variance Var(Yij) = φµ (M) ij ( 1− µ(M)ij ) for binary responses. Given the estimated regression coefficients from either approach, the relative risk of NAB+ periods versus NAB− periods can be estimated by R̂R = exp ( β̂2 ) . The standard error of the estimated relative 48 3.3. Computational Issues risk can be approximated by the delta method, SE(R̂R) ≈ R̂R · SE(β̂2). An approximate (1 − α) × 100% confidence interval for the relative risk can be obtained by transforming a symmetric confidence interval for β2, R̂R · exp(±z1−α/2SE(β̂2)), where z1−α/2 is the (1 − α/2)th upper quantile of a standard normal distribution. 3.3 Computational Issues Various R packages and routines are available for fitting GLMMs. Packages we are aware of up to the time this report was written include 1) function glmmPQL() in package MASS, 2) package glmmADMB, 3) package MCMCglmm, 4) package glmmML, 5) package glmmAK, 6) function glmer() in package lme4, and 7) function glmm() in package repeated. Not all of these packages suit our purposes: 1 – 3) do not perform maxi- mum likelihood estimation and 4 – 5) do not have the log link available for the binomial family. Only glmer(lme4) and glmm(repeated) perform log- binomial regression with adaptive Gauss quadrature implemented. Among the two model fitting routines, glmer(lme4) can handle most specifications of random effects, while glmm(repeated) can only accommodate a random intercept. Neither of these routines provides an option of REML as an al- ternative to maximum likelihood for estimation of the variance components 49 3.4. Time Trend except for the case of linear mixed-effects models. Although patients in the BEYOND trial were only followed for a mod- erate length of 2 to 3.5 years, since relapse onset was monitored on a daily basis, each patient has a long vector of binary responses. The average num- ber of days at risk for eventually NAB+ patients is 871 in the 250 mcg group and 854 in the 500 mcg group. The size of our dataset also poses a challenge for the memory management of the model fitting routines and resources of the platform R runs on. The presumably more powerful glmer(lme4) can- not handle our large dataset, while glmm(repeated) still manages to fit the models, despite taking a substantial amount of time. The MLE of GLMMs in this chapter and Chapter 4 are obtained by glmm(repeated) using 15 quadrature points with default convergence criteria and control parameters. The same problem is encountered for the GEE approach. The two stan- dard of excellence R packages for GEE, gee and geepack, were unable to handle our data. With the permission of Mr. Rick White, we used his GEE model fitting C routine, which can handle our large dataset, for all our re- ported GEE model fitting in this chapter, as well as in Chapter 4. 3.4 Time Trend The proposed longitudinal analysis of NAB effects automatically takes into account the confounding effects of baseline covariates by making the compari- son of NAB− time periods to NAB+ time periods within individual patients. The natural trend of relapse risk over time becomes the most important fac- tor remain to be adjusted for. Accuracy in modelling this time trend is essential to the soundness of the estimation of the NAB effect. Based on the patterns in Figure 3.2, we propose an initial linear time 50 3.4. Time Trend trend in the log transformed daily relapse risk. The linear trend on the log scale implies that the relapse risk changes exponentially over time. Adequacy of the linearity assumption is assessed by comparing GLMM fits with alter- native time trends modelled by natural cubic splines with degrees of freedom equals to 2, 3 or 4 to the linear time trend. The linear time trend can be regarded as a natural cubic spline with 1 degree of freedom. NAB status is included as a covariate in all the models so that the impact of including a more complicated time trend on the estimation of NAB effects can be as- sessed. Fitted time trends are visualized in Figures 3.3a and 3.3b for the 250 mcg group and 500 mcg group respectively. The “annualized” fitted relapse rates are obtained by multiplying the fitted daily relapse probabilities by a factor of 365.25. For the 250 mcg group, the 2 and 3 degrees of freedom splines seem to suggest more curvature for the diminishing trend in relapse risk than the exponential decay implied by linear time trend in log relapse risk. The wigglyness of the fitted curve of the 4 degrees of freedom spline suggests over- fitting. In contrast, the linearity assumption in the 500 mcg group appears to be representing the trend of relapse risk over time well as the higher degree of freedom splines almost coincide with the linear trend. Signs of overfitting are observed for the 3 and 4 degrees of freedom splines. All the fitted time trends for the 500 mcg group suggest that the change over the 3.5 year period is minimal. Comparing the two IFNB treated groups, the “annualized” fitted relapse rates in the 250 mcg group are higher than the rates in the 500 mcg group during the 3.5 years on study. Analyses are expected to have lower sensitivity for the 500 mcg group than the 250 mcg group due to the lower relapse risk. Results of likelihood ratio tests comparing the fitted splines and linear time trends are presented in Table 3.5. Among the natural cubic splines we 51 3.4. T im e T ren d Figure 3.3: Annualized fitted relapse rate of GLMM for NAB− periods, natural regression splines for time trend, eventually NAB+ patients only. (a) The 250 mcg group Year−on−Study L o g A n n u a l i z e d R e l a p s e R a t e A n n u a l i z e d R e l a p s e R a t e 0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 −2.5 −2.0 −1.5 −1.0 −0.5 0.0 0.08 0.10 0.20 0.40 0.80 1.00 l l l l l l Linear Time Trend Natural Cubic Spline, df=2 Natural Cubic Spline, df=3 Natural Cubic Spline, df=4 52 3.4. T im e T ren d Figure 3.3: – Continued (b) The 500 mcg group Year−on−Study L o g A n n u a l i z e d R e l a p s e R a t e A n n u a l i z e d R e l a p s e R a t e 0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 −2.5 −2.0 −1.5 −1.0 −0.5 0.0 0.08 0.10 0.20 0.40 0.80 1.00 l l l l l l Linear Time Trend Natural Cubic Spline, df=2 Natural Cubic Spline, df=3 Natural Cubic Spline, df=4 53 3.4. Time Trend considered, none fits the data significantly better than the linear trend in either IFNB treated group. Table 3.5: p-values for likelihood ratio tests comparing GLMM fits with different time trends: eventually NAB+ patients only. All Switches considered; confirmed NAB+, confirmed NAB− Alternative hypothesis, H1 250 mcg group 500 mcg group Natural Splines Natural Splines Null hypothesis, H0 df = 2 df = 3 df = 4 df = 2 df = 3 df = 4 Natural Splines df = 1† 0.40 0.69 0.55 0.97 0.65 0.72 df = 2 − 0.81 0.49 − 0.36 0.52 df = 3 − − 0.24 − − 0.49 † natural spline with 1 degree of freedom is equivalent to linear. The estimated relative risks of NAB+ to NAB− time periods under each of the time trend models considered are presented in Table 3.6. With the linear time trend, the estimated increase in relapse risk of NAB+ periods is 10% in the 250 mcg group, but this doubles to about 20% when the time trends are fitted with splines. The increase in relapse risk is about 30% in the 500 mcg group under all the different models considered for the time trend. The estimated NAB effects are always stronger in the 500 mcg group, though the effects are statistically insignificant in both groups under all the time trends considered. The standard errors of the estimated relative risks in the 500 mcg group are larger than those in the 250 mcg group, as expected given the lower relapse risks in the 500 mcg group revealed in Figures 3.3a and 3.3b. For the 500 mcg group, as revealed by the fitted splines and the likelihood 54 3.5. Neutralizing Antibody Effects Table 3.6: Estimated relative risks of NAB+ to NAB− from GLMM fits with various time trends: eventually NAB+ patients only. All Switches considered ; confirmed NAB+, confirmed NAB− 250 mcg group 500 mcg group R̂R SE 95% CI R̂R SE 95% CI Linear 1.10 0.19 (0.79, 1.54) 1.29 0.24 (0.90, 1.84) Natural Spline, df=2 1.21 0.24 (0.81, 1.79) 1.29 0.27 (0.86, 1.94) Natural Spline, df=3 1.21 0.25 (0.81, 1.81) 1.34 0.29 (0.87, 2.05) Natural Spline, df=4 1.24 0.25 (0.83, 1.84) 1.33 0.29 (0.87, 2.03) R̂R: estimated relative risk. ratio tests, the assumption of a linear time trend is apparently reasonable and adequately represents the natural change of relapse risks over time. For the 250 mcg group, the fitted splines suggest that a quadratic time trend might be more appropriate. However, no substantial evidence of lack of fit is indicated by the likelihood ratio tests for either group. In both groups, the standard error of the estimated relative risk increases with the degrees of freedom of the regression splines, illustrating a tradeoff between bias and variance. Based on the results of the likelihood ratio tests, we decide to adopt the parsimonious model of a linear trend in log relapse risk over time for the subsequent analyses. 3.5 Neutralizing Antibody Effects 3.5.1 NAB Status Tables 3.7a and 3.7b present the estimates and standard errors of the rela- tive risks obtained by GEE-EXCH and GLMM based on the initial specifi- cation of mean structure (3.2.1). Results based on the refined NAB status of Low/Medium/High NAB+ are also included. 55 3.5. Neutralizing Antibody Effects Similar patterns and levels of estimated NAB effects are given by GEE- EXCH and GLMM for both IFNB treated groups. For the 250 mcg group, no significant NAB effect is detected. The estimated increase in relapse risk is about 10% comparing NAB+ to NAB−. The NAB effects do not differ substantially across the three sublevels of NAB+. Comparing to NAB−, the estimated increases in relapse risk are less than 10% for Low NAB+, about 15% for Medium NAB+ and less than 5% for High NAB+. A different pattern is observed for the 500 mcg group. The time adjusted increase in relapse risk of NAB+ period relative to NAB− period is estimated to be about 30%. The increases in relapse risk are about 10%, 55% and 60% for Low, Medium and High NAB+ periods respectively. The NAB effects are statistically significant at the 0.05 level for Medium and High NAB+ based on the GEE-EXCH model fit. Although the results are not quite statistically significant (Low: p = 0.65, Medium: p = 0.05, High: p = 0.05) using the GLMM approach, they still provide evidence suggesting presence of NAB effects in the 500 mcg group. The suggested NAB effects also appear to increase when NAB titer raises from low to medium level, but the relative risks are similar for medium and high levels. 3.5.2 NAB Titer The direct use of NAB titer in estimating NAB effect on relapse risk is also explored by replacing the NAB status predictor NABit in model (3.2.1) by a log titer term log ( Titerit 20 ) · I{Titerit ≥ 20}, where Titerit is the latest NAB titer available as of time t for the ith patient. Natural regression splines are fitted on the log titers, serving to discover the form of association between relapse risk and NAB titers. Figures 3.4a and 3.4b show the fitted natural cubic splines with 2 degrees of freedom and histograms of the titer values. As illustrated by the fitted splines, there is little suggestion of an NAB effect in the 250 mcg group, whereas in the 500 mcg group, the relapse risks appear 56 3.5. N eu tralizin g A n tib o d y E ff ects Table 3.7: Estimated relative risks of NAB+ and sublevels to NAB−: eventually NAB+ patients only. (a) The 250 mcg group All Switches considered ; confirmed NAB+, unconfirmed L/M/H and confirmed NAB− GEE-EXCH GLMM R̂R SE 95% CI R̂R SE 95% CI NAB+ vs. NAB− 1.092 0.172 (0.802, 1.487) 1.102 0.188 (0.788, 1.541) Low NAB+ vs. NAB− 1.075 0.181 (0.774, 1.495) 1.094 0.199 (0.766, 1.564) Med NAB+ vs. NAB− 1.158 0.256 (0.751, 1.786) 1.150 0.254 (0.747, 1.772) High NAB+ vs. NAB− 1.035 0.270 (0.621, 1.725) 1.023 0.277 (0.602, 1.738) (b) The 500 mcg group All Switches considered ; confirmed NAB+, unconfirmed L/M/H and confirmed NAB− GEE-EXCH GLMM R̂R SE 95% CI R̂R SE 95% CI NAB+ vs. NAB− 1.293 0.222 (0.923, 1.811) 1.287 0.236 (0.898, 1.843) Low NAB+ vs. NAB− 1.111 0.223 (0.749, 1.647) 1.099 0.228 (0.732, 1.650) Med NAB+ vs. NAB− 1.541 0.339 (1.001, 2.370) 1.547 0.345 (1.000, 2.394) High NAB+ vs. NAB− 1.602 0.379 (1.007, 2.548) 1.593 0.384 (0.993, 2.557) R̂R: estimated relative risk. Titers for NAB status — Low: [20, 100), Med: [100, 400), High: [400,∞). 57 3.5. Neutralizing Antibody Effects to increase with NAB titers until reaching a certain maximum limit. As revealed by the plots of the fitted splines in Figures 3.4a and 3.4b, a quadratic trend in log titer value might represents the form association between relapse risk and NAB titer more accurately than a linear trend. However, a quadratic fit would complicate the interpretation A model which captures the basic shape of this exposure-response relationship but retains simple interpretation of NAB effects is desired. Therefore, a piecewise linear model on log titers which implies a constant relative risk for doubled titers over a range of titers from 20 NU/mL to a cutoff k NU/mL is proposed for more formal analysis. The conditional mean structure of the piecewise linear model log ( µ (C) it ) = bi+β0+β1t+β2 log ( min {Titerit, k} 20 ) ·I{Titerit ≥ 20}, (3.5.1) marginalizes to log ( µ (M) it ) = β∗0 + β1t+ β2 log ( min {Titerit, k} 20 ) · I{Titerit ≥ 20}, (3.5.2) where k is a prespecified titer cutoff level. Presuming that β2 is positive, the time adjusted relapse risk reaches its maximum at the cutoff level. The relative risk for an arbitrary titer value compared to any titer ≤ 20 NU/mL is ( min{Titerit,k} 20 )β2 . When the titer values are between 20 and k NU/mL, the relative risk corresponding to doubling the NAB titer is 2β2 . Given the estimated regression coefficient β̂2, for titers less than cutoff, the standard error of R̂R = ( Titerit 20 )β̂2 can be approximated by log ( Titerit 20 ) R̂R · SE(β̂2). Four values for the cutoff k are considered: 100 NU/mL, 400 NU/mL, 1000 NU/mL and ∞ NU/mL, where the last case reduces to a linear trend in log titers. Estimated relative risks based on log titers are presented in 58 3.5. N eu tralizin g A n tib o d y E ff ects Figure 3.4: Fitted natural cubic spline (df = 2) on log NAB titers of GLMM: eventually NAB+ patients only. (a) The 250 mcg group − 0 . 2 0 . 0 0 . 2 0 . 4 0 . 6 0 . 8 1 1 . 2 1 . 5 1 . 8 0 1 2 3 4 5 6 7 Log(NAB Titer / 20) L o g R e l a t i v e R i s k , ( v s . 2 0 N U / m L ) R e l a t i v e R i s k , ( v s . 2 0 N U / m L ) 250 mcg arm NAB Titer, (NU/mL) % o f T o t a l P a t i e n t − D a y s 0 . 0 0 . 3 20 55 150 400 1100 3000 8000 22000 59 3.5. N eu tralizin g A n tib o d y E ff ects Figure 3.4: – Continued (b) The 500 mcg group − 0 . 2 0 . 0 0 . 2 0 . 4 0 . 6 0 . 8 1 1 . 2 1 . 5 1 . 8 0 1 2 3 4 5 6 7 Log(NAB Titer / 20) L o g R e l a t i v e R i s k , ( v s . 2 0 N U / m L ) R e l a t i v e R i s k , ( v s . 2 0 N U / m L ) 500 mcg arm NAB Titer, (NU/mL) % o f T o t a l P a t i e n t − D a y s 0 . 0 0 . 2 0 . 4 20 55 150 400 1100 3000 8000 22000 60 3.5. Neutralizing Antibody Effects Tables 3.8a and 3.8b. The pattern of the results given by GEE-EXCH and GLMM resemble each other closely in both IFNB treated groups. Estimated relative risks from the two approaches are effectively the same. For the 250 mcg group, no significant associations between NAB titers and relapse risks are found regardless of the titer cutoff. For the model with no cutoff (k = ∞), β̂2 is −0.002. For models with finite cutoff, β̂2’s are positive ranging from 0.002 to 0.028. The estimated increase in relapse risk for doubled titers are less than 2% in all cases, and the maximum increases in relapse risk achievable at the cutoff are less than 5%. For the 500 mcg group, for all four models, β̂2 is positive and the asso- ciation between NAB titers and relapse risk is statistically significant. For cutoffs at 100, 400 and 1000 NU/mL, the estimated increases in relapse risks for doubled titers are about 18%, 12% and 10% respectively, whereas the es- timated increases in relapse risk at cutoff are 48%, 63% and 71% respectively. With no cutoff, the estimated relative risks for doubled titers is about 8%. Tables A.1a to A.4 in Appendix A show complete results of the model fits for all the models in this section. For GEE-EXCH, the estimated correlation α is 0.001 in every model for both the 250 mcg and 500 mcg groups. With essentially zero estimated correlation, the same model fits would be obtained with an independent “working” correlation structure. The estimated marginal intercept β̂∗0 obtained from the GEE-EXCH fit is compared to the estimated conditional intercept β̂0 and variance components σ̂2b obtained from the GLMM fit with reference to their theoretical relation- ship (2.5.3). The derived marginal intercepts β̂0 + 1 2 σ̂2b are consistently larger than the estimated marginal intercepts β̂∗0 in all the fitted models. The dif- ference ranges from 0.042 to 0.055. 61 3.6. R elative E ffi cien cy of G E E to G L M M A p p roach on E stim atin g N A B E ff ects Table 3.8: Estimated relative risk of NAB titers: eventually NAB+ patients only. (a) The 250 mcg group GEE-EXCH GLMM R̂R SE 95% CI R̂R SE 95% CI cutoff=∞ Doubled titer before cutoff 0.999 0.038 ( 0.928, 1.075 ) 0.994 0.038 ( 0.922, 1.072 ) cutoff=1000 Doubled titer before cutoff 1.004 0.043 ( 0.923, 1.092 ) 0.999 0.043 ( 0.918, 1.088 ) At cutoff 1000 NU/mL vs. 20 NU/mL 1.022 0.248 ( 0.635, 1.644 ) 0.996 0.243 ( 0.617, 1.608 ) cutoff=400 Doubled titer before cutoff 1.002 0.049 ( 0.910, 1.103 ) 0.996 0.050 ( 0.904, 1.098 ) At cutoff 400 NU/mL vs. 20 NU/mL 1.007 0.214 ( 0.664, 1.528 ) 0.984 0.212 ( 0.646, 1.500 ) cutoff=100 Doubled titer before cutoff 1.019 0.077 ( 0.879, 1.183 ) 1.014 0.079 ( 0.870, 1.181 ) At cutoff 100 NU/mL vs. 20 NU/mL 1.045 0.184 ( 0.740, 1.476 ) 1.032 0.187 ( 0.724, 1.472 ) 62 3.6. R elative E ffi cien cy of G E E to G L M M A p p roach on E stim atin g N A B E ff ects Table 3.8: – Continued (b) The 500 mcg group GEE-EXCH GLMM R̂R SE 95% CI R̂R SE 95% CI cutoff=∞ Doubled titer before cutoff 1.078 0.038 ( 1.005, 1.156 ) 1.079 0.037 ( 1.009, 1.153 ) cutoff=1000 Doubled titer before cutoff 1.100 0.047 ( 1.012, 1.196 ) 1.100 0.043 ( 1.018, 1.189 ) At cutoff 1000 NU/mL vs. 20 NU/mL 1.712 0.411 ( 1.069, 2.741 ) 1.714 0.382 ( 1.108, 2.652 ) cutoff=400 Doubled titer before cutoff 1.119 0.057 ( 1.014, 1.235 ) 1.120 0.053 ( 1.020, 1.229 ) At cutoff 400 NU/mL vs. 20 NU/mL 1.626 0.355 ( 1.060, 2.493 ) 1.630 0.334 ( 1.091, 2.435 ) cutoff=100 Doubled titer before cutoff 1.181 0.096 ( 1.006, 1.385 ) 1.182 0.093 ( 1.013, 1.380 ) At cutoff 100 NU/mL vs. 20 NU/mL 1.470 0.278 ( 1.015, 2.131 ) 1.475 0.270 ( 1.029, 2.112 ) 63 3.6. Relative Efficiency of GEE to GLMM Approach on Estimating NAB Effects 3.6 Relative Efficiency of GEE to GLMM Approach on Estimating NAB Effects Comparing the two longitudinal approaches applied, when the conditional models are the correct models, the GLMM approach has the highest asymp- totic efficiency. However, the GEE approach, as a method of moments ap- proach, is more robust to model misspecification as it only requires correct specification of the mean structure and independence between subjects. The efficiency of the GEE approach relative to the GLMM approach in our finite sample case is investigated. Specifically the efficiency in estimating NAB ef- fects is our primary interest. We define relative efficiency of GEE to GLMM as the ratio of the variance of GLMM to the variance of GEE estimated rel- ative risks. The pattern observed based on the estimates in Tables 3.7a, 3.7b, 3.8a and 3.8b is not consistent. Based on NAB status, GEE-EXCH demonstrated a higher efficiency; the relative efficiencies are 119.5% and 113.0% for the 250 mcg and the 500 mcg groups respectively. For the 250 mcg group, the relative efficiency of GEE-EXCH is 120.9%, 98.4% and 105.3% for estimating the effects of Low NAB+, Medium NAB+ and High NAB+ versus NAB− respectively. For the 500 mcg group, the relative efficiency of GEE-EXCH is 104.5%, 103.6% and 102.7% for estimating the effects of Low NAB+, Medium NAB+ and High NAB+ versus NAB− respectively. Based on NAB titers, comparisons are made on all the estimated relative risks presented in Ta- bles 3.8a and 3.8b. In the 250 mcg group, the relative efficiencies are close to 100%, ranging from 96.0% to 105.3%. However, in the 500 mcg group, the estimates given by GEE-EXCH are less precise with relative efficiencies ranging from 83.7% to 94.8% Comparison based on the variance of the estimated regression coefficients 64 3.7. Summary representing the NAB effects results in similar conclusions. The relative ef- ficiencies are 113.0% and 113.7% for the 250 mcg and the 500 mcg groups respectively for estimating an overall effects associated with the binary NAB status. For the 250 mcg group, the relative efficiencies are 104.5% for Low, 103.6% for Medium and 102.7% for High NAB+ versus NAB−. For the 500 mcg group, the relative efficiencies are 106.1% for Low, 102.8% for Medium and 103.7% for High NAB+ versus NAB−. GEE-EXCH returned more precise estimates of NAB effects when they are based on NAB status. While based on NAB titers, the relative efficiency of GEE-EXCH is close to 100%, ranging from 101.4% to 105.8% for the 250 mcg group. Considerable loss of efficiency is noticed, however, for the 500 mcg group with relative efficiency of GEE-EXCH ranging from 86.0% to 91.3%. The average loss of efficiency in this group with the direct use of NAB titer is about 10%. 3.7 Summary The analysis of the BEYOND trial suggests some effects of NABs to IFNB on RRMS patients’ relapse risks. The additional analysis based on NAB titers sheds some light on how the level of NABs is associated with an increase in the risk of relapse. Some positive associations between NAB titers and frequency of relapse are suggested. Regarding the efficiency of the GEE and GLMM approaches, no consistent evidence of one being superior to the other is found. Recall that the model for this daily relapse data cannot be fitted with most generic GEE or GLMM routines in R. An alternative to avoid this computational difficulty is to use another configuration of the relapse data. Since NAB titers were measured every 6 months, we may consider collapsing the daily relapse data and performing our analysis based on the number of relapses between consecutive NAB titer measures. This reduction in the 65 3.7. Summary resolution of the relapse data and the potentially less precise estimation of the time trend may affect the statistical power in detecting NAB effects. The next chapter presents analyses based on collapsed data and attempts to answer the questions concerning impacts on the sensitivity for detecting NAB effects. 66 Chapter 4 Analysis Based on Collapsed Data 4.1 Introduction In the BEYOND trial data, each patient has about 800 longitudinal binary indicators of relapse onset. The dimension of this response data causes some computational difficulties in fitting GLMMs and marginal models. For most model fitting procedures in R to be applicable, the response vectors need to be reduced to a more manageable size. Since NAB serum specimens were mea- sured normally every half-year, much less frequently than the daily record of relapse onset, we consider collapsing the daily relapse data to the number of relapse onsets between consecutive NAB measures. This collapsing reduces the average number of longitudinal observations to about 6. Table 4.1 shows the distribution of the number of NAB serum specimens collected, which is equivalent to the number of collapsed intervals. In both the 250 mcg and 500 mcg groups, most patients had 5 to 7 collapsed intervals. Table 4.1: Number of NAB titer measures 2 3 4 5 6 7 8 250 mcg group 1 1 15 75 84 58 14 500 mcg group 0 5 24 80 89 57 7 67 4.1. Introduction Table 4.2 shows the number of relapse onsets per interval. For the 250 mcg group, about 86% of the intervals had no relapse onsets, about 13% had one relapse onset and the remaining 1% had two relapse onsets. For the 500 mcg group, about 88% of the intervals had no relapse onsets, about 11% had one relapse onset and the remaining 1% had two relapse onsets. Only about 1% of the intervals in each group has two relapse onsets, which was the largest number of relapse onsets observed in a single interval, the aggregation of the events from the daily data is minimal. Some loss of precision in estimation due to collapsing is anticipated but not to a large extent. Table 4.2: Number of relapse onsets per interval 0 1 2 Total 250 mcg group 1261 188 13 1462 500 mcg group 1315 168 17 1500 Table 4.3 shows the distribution of the lengths of the intervals at risk corresponding to the collapsed observations. For the 250 mcg group, the lengths of the intervals at risk range from 1 to 544 days, whereas for the 500 mcg group, the lengths range from 1 to 675 days. In both groups, the number of days at risk clustered at around 1 day and 180 days. In the 250 mcg group, 200 out of 1462 are 1-day intervals, whereas in the 500 mcg group, 205 out of 1500 are 1-day intervals. Only one of these 1-day intervals had a relapse onset in each group. These exceptionally short intervals correspond to the end of study visit for some of the patients when the protocol specified that a final NAB specimen be taken. 68 4.2. Model Specification Table 4.3: Number of days at risk between consecutive collections of NAB serum specimens Days at risk 0 – 49 50 – 99 100 – 149 150 – 199 200+ 250 mcg group 215 82 64 1063 38 500 mcg group 225 72 60 1100 43 4.2 Model Specification For the collapsed data, the response Yij of the i th patient is the number of relapse onsets during the period starting from tj, the day the j th NAB titer was measured, to tj+1 − 1, the day before the next NAB titer measurement inclusively. Given a random effect bi, the counts Yij are assumed to be conditionally independent following a Poisson distribution with mean µ (C) ij = λijsij, where sij is the length of the period at risk expressed in years. The conditional mean depends on the covariates through a log link, log(µ (C) ij ) = β0 + bi + β1tj + β2NABij + log(sij). The relative relapse rate for NAB+ time periods versus NAB− time periods is represented by exp(β2). Since the NAB status remains unchanged between consecutive NAB titer measures, the NAB status NABij corresponding to Yij is readily obtainable from the daily data in Chapter 3. The random intercept bi is assumed to be a random sample from a normal distribution with mean 0 and variance σ2b . The marginal mean structure of this GLMM is only a shift of the intercept from the conditional mean structure, log(µ (M) ij ) = β ∗ 0 + bi + β1tj + β2NABij + log(sij), 69 4.3. Results as shown in (2.5.2). The Poisson regression model with random intercept for count data is fitted by maximum likelihood, whereas the marginalized model is fitted by the GEE approach with an exchangeable “working” corre- lation structure (GEE-EXCH). Although the marginal distribution of Yij is not Poisson, we assume the marginal variance is proportional to the mean, Var(Yij) = φµ (M) ij , in our GEE analyses. The dispersion parameter φ allows for anticipated overdispersion with respect to Poisson. With the dimension of the data reduced as indicated, both glmer() of the lme4 package and glmm() of the repeated package can fit the GLMM model. For the GEE approach, both gee() of the gee package and geeglm() of the geepack package can handle the collapsed data. For coherence and ease of comparison of results based on the daily relapse data and the collapsed data, the same model fitting routines in R as in Chapter 3 are implemented for the collapsed data. 4.3 Results 4.3.1 NAB Status Tables 4.4a and 4.4b show the estimated relative rates obtained by the GLMM and the GEE approach. Complete results of these model fits are given in Tables B.1a to B.2b, in Appendix B. In the 250 mcg group, es- timated NAB effects given by the two approaches based on the collapsed relapse data are very similar. The time adjusted increase in annualized re- lapse rate associated with switching from NAB− to NAB+ is about 15%. For NAB+ sublevels, the increases with respect to NAB− are about 13% for Low NAB+, 22% for Medium NAB+ and 10% for High NAB+. We define the relative efficiency of GEE-EXCH by the ratio of the variance of the GLMM to that of the GEE estimate. Based on the estimates of the log relative rates, β̂2, the relative efficiency of GEE-EXCH ranges from 93.2% to 109.6% in the 250 mcg group and from 81.3% to 94.8% in the 500 mcg group. Based on estimates of relative rates in Table 4.4a, the relative efficiency of 70 4.3. Results GEE-EXCH ranges from 91.7% to 107.4% for the 250 mcg group and from 79.4% to 90.4% for the 500 mcg group. The estimated relative rates obtained by either approach have similar precision for the 250 mcg group but some noticeable loss of efficiencies for GEE-EXCH is observed for the 500 group. In the 250 mcg group, no convincing NAB effects is demonstrated by ei- ther approach, just as in the analyses based on the daily data. The estimated relative rates based on the collapsed data are higher than the corresponding estimates based on the daily data. The pattern of the relative rates with respect to low, medium and high levels of NABs is similar. In the 500 mcg group, estimated increases in relapse rate given by the two approaches are similar, with those given by GLMM always slightly smaller. The estimated increases in relapse rate associated with switching from NAB− to NAB+ are about 29% and 29% from GEE-EXCH and GLMM, respectively. For the NAB sublevels, the estimated increases in relapse rates relative to NAB− are 11% and 10% for Low NAB+, 54% and 55% for Medium NAB+ and 60% and 59% for High NAB+. GLMM has a higher efficiency in this case with the relative efficiency of GEE-EXCH ranging from 79.4% to 90.4%. None of the estimated NAB effects are statistically significant at the 0.05 level. The estimated relative rates given by both approaches closely resemble the corresponding estimated relative risks based on the daily relapse data. From the model fits in Tables B.1a to B.2b, the derived marginal inter- cepts β̂0 + 1 2 σ̂2b estimated from the GLMM fits are larger than the estimated marginal intercepts β̂∗0 by about 0.06 for the 250 mcg group and about 0.11 to for the 500 mcg group. For the GEE approach, the estimated exchangeable correlation α̂ ranges from 0.050 to 0.054 for the 250 mcg group and from 0.105 to 0.107 for the 500 mcg group. The precision in estimating the log relative rate β2 is similar for the col- 71 4.3. Results lapsed and the daily data for GLMM. In the 250 mcg group, the variances of the estimated log relative rates (β̂2) based on the collapsed data are about 3% to 4% larger than the variances based on the daily relapse data. In the 500 mcg group, the variances of β̂2 based on the collapsed data are about −1% to 2% larger than those based on the daily relapse data. However, a more severe loss of efficiency due to collapsing is observed for GEE-EXCH. In the 250 mcg group, the variances based on the collapsed data are about 10% to 16% larger than the variances based on the daily relapse data. In the 500 mcg group, the variances are inflated by about 17% to 20%. The comparison of precision based on estimated relative rates R̂R = exp(β̂2) shows similar but amplified patterns. For GLMM, the inflation of variance due to collapsing is about 10% to 20% in the 250 mcg group and about −3% to 3% in the 500 mcg group. For GEE-EXCH, the variances are inflated by about 23% to 28% in the 250 mcg group and about 17% to 29% in the 500 mcg group. 4.3.2 NAB Titer The NAB effect based on NAB titers is also examined by replacing the NAB status covariate, NABij, by log NAB titers with a cutoff at k NU/mL, log ( min{Titerij ,k} 20 ) . Results based on NAB titers are presented in Tables 4.5a and 4.5b. Similar patterns as with the daily relapse data are observed with the collapsed data. For the 250 mcg group, similar results are given by GLMM and GEE- EXCH with the estimates obtained by GLMM always slightly larger. The estimated increases in relapse rate for doubled titers are about 3% for GEE- EXCH and 4% for GLMM when a cutoff at 100 NU/mL is used. For higher cutoffs, the estimates are less than 1.5%. The estimated increases in relapse rate at the cutoffs relative to titer value ≤ 20 NU/mL are less than or equal 72 4.3. R esu lts Table 4.4: Estimated relative risks of NAB+ and sublevels to NAB−: eventually NAB+ patients only, collapsed data. (a) The 250 mcg group All Switches considered ; confirmed NAB+, unconfirmed L/M/H and confirmed NAB− GEE-EXCH GLMM R̂R SE 95% CI R̂R SE 95% CI NAB+ vs. NAB− 1.092 0.172 (0.802, 1.487) 1.102 0.188 (0.788, 1.541) Low NAB+ vs. NAB− 1.075 0.181 (0.774, 1.495) 1.094 0.199 (0.766, 1.564) Med NAB+ vs. NAB− 1.158 0.256 (0.751, 1.786) 1.150 0.254 (0.747, 1.772) High NAB+ vs. NAB− 1.035 0.270 (0.621, 1.725) 1.023 0.277 (0.602, 1.738) (b) The 500 mcg group All Switches considered ; confirmed NAB+, unconfirmed L/M/H and confirmed NAB− GEE-EXCH GLMM R̂R SE 95% CI R̂R SE 95% CI NAB+ vs. NAB− 1.293 0.222 (0.923, 1.811) 1.287 0.236 (0.898, 1.843) Low NAB+ vs. NAB− 1.111 0.223 (0.749, 1.647) 1.099 0.228 (0.732, 1.650) Med NAB+ vs. NAB− 1.541 0.339 (1.001, 2.370) 1.547 0.345 (1.000, 2.394) High NAB+ vs. NAB− 1.602 0.379 (1.007, 2.548) 1.593 0.384 (0.993, 2.557) R̂R: estimated relative risk. Titers for NAB status — Low: [20, 100), Med: [100, 400), High: [400,∞). 73 4.3. Results 8.5%. No convincing association between NAB titers and relapse rates is demonstrated; none of the increases are statistically significant at the 0.05 level. Compared to results based on the daily relapse data, estimated relative rates based on the collapsed data are always slightly higher. For the 500 mcg group, results based on the two approaches also resem- ble each other but with the estimates based on the GEE approach always slightly larger. For doubled titers, the estimated increases in relapse rate are about 20% for GEE-EXCH and 18% for GLMM for a cutoff at 100 NU/mL, about 13% and 12% for a cutoff at 400 NU/mL, about 11% and 10% for a cutoff at 1000 NU/mL and about 9% and 8% for no cutoff. The maximum increases achievable at cutoff with respect to titer ≤ 20 NU/mL are about 54% for GEE-EXCH and 47% for GLMM for cutoff at 100 NU/mL, about 70% and 64% for cutoff at 400 NU/mL and about 81% and 74% for cutoff at 1000 NU/mL. Regardless of the cutoff level chosen, the NAB effects are sta- tistically significant at the 0.05 level, indicating some evidence of a positive association between NAB titers and relapse rate. The efficiency in estimating the relative rates are similar for the two approaches in the 250 mcg group, whereas in the 500 mcg group, the GLMM approach gives somewhat more precise estimates in terms of the relative rates. Complete results of the model fits are listed in Tables B.3 and B.4 in Appendix B. Comparing the estimated marginal intercept β̂∗0 to the derived marginal intercept β̂0 + 1 2 σ̂2b , the latter is larger by about 0.06 for the 250 mcg group and about 0.10 to 0.11 for the 500 mcg group. For the GEE approach, the estimated exchangeable correlation α̂ ranges from 0.051to 0.052 for the 250 mcg group and from 0.097 to 0.098 for the 500 mcg group. The loss of efficiency due to collapsing is assessed by comparing the vari- ances of the estimated log relative risks (β̂2) based on the collapsed data to 74 4.4. Impact of Exceptionally Short Intervals the variances of the estimates based on the daily relapse data. For GLMM, the variances of the estimates based on the collapsed data are about 0% to 3% larger than the variances based on the daily relapse data. For GEE-EXCH, the variances of the estimates based on the collapsed data are about 5% to 10% larger than the variances of the estimates based on the daily relapse data. A consistently larger loss of efficiency due to collapsing is observed for GEE-EXCH compared to GLMM. 4.4 Impact of Exceptionally Short Intervals The analyses of the collapsed data presented in the previous section are based on all intervals. In the 250 mcg group, 1262 of the 1462 intervals have length longer than 1 day. Among the 200 1-day intervals, 87% are NAB+ and 1 of the NAB− 1-day intervals had a relapse onset. In the 500 mcg group, 1295 of the 1500 intervals have length longer than 1 day. Among the 205 1-day intervals, 86% are NAB+ and 1 of the NAB+ 1-day intervals had a relapse onset. Since the empirical relapse rate in these 2 exceptionally short intervals with relapse onset is 1 per day, we are concerned that they may have excessive impact on the estimated rates. On the other hand, negligible effect on the standard error is expected if these 1-day intervals are excluded from the dataset as they contribute little to the total exposure time in the complete dataset. The impacts of these 1-day intervals are investigated in this section. Tables 4.6a and 4.6b show the fitted log relative rates with and without the 1-day intervals for both IFNB treatment groups. For the 250 mcg group, the estimates based on all intervals are consistently smaller than the esti- mates based only on intervals longer than one day for both GEE-EXCH and GLMM. The increase in β̂ due to exclusion of the 1-day intervals ranges from 17% to 65%. The standard errors of the estimates are almost the same under 75 4.4. Im p act of E x cep tion ally S h ort In tervals Table 4.5: Estimated relative risk of NAB titers: eventually NAB+ patients only, collapsed data. (a) The 250 mcg group GEE-EXCH GLMM R̂R SE 95% CI R̂R SE 95% CI cutoff=∞ Doubled titer before cutoff 1.004 0.039 (0.929, 1.084) 1.006 0.040 (0.931, 1.086) cutoff=1000 Doubled titer before cutoff 1.010 0.046 (0.925, 1.104) 1.012 0.045 (0.929, 1.103) At cutoff 1000 NU/mL vs. 20 NU/mL 1.058 0.270 (0.642, 1.743) 1.071 0.266 (0.659, 1.742) cutoff=400 Doubled titer before cutoff 1.008 0.052 (0.911, 1.116) 1.011 0.051 (0.916, 1.116) At cutoff 400 NU/mL vs. 20 NU/mL 1.037 0.233 (0.668, 1.610) 1.048 0.229 (0.683, 1.608) cutoff=100 Doubled titer before cutoff 1.031 0.082 (0.881, 1.206) 1.036 0.082 (0.887, 1.210) At cutoff 100 NU/mL vs. 20 NU/mL 1.073 0.199 (0.746, 1.544) 1.085 0.199 (0.757, 1.556) 76 4.4. Im p act of E x cep tion ally S h ort In tervals Table 4.5: – Continued (b) The 500 mcg group GEE-EXCH GLMM R̂R SE 95% CI R̂R SE 95% CI cutoff=∞ Doubled titer before cutoff 1.085 0.041 (1.008, 1.168) 1.082 0.037 (1.012, 1.158) cutoff=1000 Doubled titer before cutoff 1.110 0.051 (1.015, 1.214) 1.103 0.044 (1.021, 1.192) At cutoff 1000 NU/mL vs. 20 NU/mL 1.805 0.465 (1.089, 2.991) 1.740 0.390 (1.121, 2.699) cutoff=400 Doubled titer before cutoff 1.133 0.062 (1.017, 1.261) 1.122 0.053 (1.022, 1.231) At cutoff 400 NU/mL vs. 20 NU/mL 1.714 0.407 (1.076, 2.728) 1.644 0.338 (1.099, 2.459) cutoff=100 Doubled titer before cutoff 1.204 0.109 (1.009, 1.438) 1.181 0.093 (1.012, 1.377) At cutoff 100 NU/mL vs. 20 NU/mL 1.539 0.323 (1.020, 2.323) 1.470 0.269 (1.028, 2.103) 77 4.5. Summary both the complete and the reduced data set, with those based on the latter being very slightly larger in general. The conclusion on the NAB effects is not affected by exclusion of 1-day intervals as the p-values are still quite high, larger than about 0.3. For the 500 mcg group, the estimates and their standard errors are very similar for model fits based on all intervals and those based only on intervals longer than one day for both GEE-EXCH and GLMM. For GEE-EXCH, the standard errors are always slightly smaller based on intervals longer than one day, whereas for GLMM, the standard errors are almost identical. It is worth noting that, when the 1-day intervals are excluded, the corre- lation estimated by GEE-EXCH increases from 0.05 to 0.15 for the 250 mcg group and from 0.11 to about 0.12 to 0.13 for the 500 mcg group. The higher correlation may partly compensate for the increase in variance due to reduced total follow-up time. On the other hand, for GLMM, the estimated variance components are similar for both versions of the data. Although exclusion of the 1-day intervals from the model fits results in increases in estimated log relative rates in the 250 mcg group, the general conclusions about the NAB effects are not altered. On the contrary, the model fits in the 500 mcg group are not noticeably affected. 4.5 Summary Collapsing the daily relapse data to less frequent responses greatly reduces the computational burden of model estimation at a potential cost of reduced statistical efficiency. For our particular cases, the loss of efficiency for esti- mating β2 is small for GLMM; the variances of β̂2 increased by less than 5%. It appears that collapsing the sparse daily response vector, consisting mostly 78 4.5. S u m m ary Table 4.6: Impact of 1-day intervals on estimates of log relative rates, β. (a) The 250 mcg group All Switches considered ; confirmed NAB+, unconfirmed L/M/H and confirmed NAB− All Intervals Intervals > 1 day β̂ SE z-score P value β̂ SE z-score P value GEE-EXCH NAB+ vs. NAB− 0.142 0.167 0.85 0.40 0.171 0.167 1.02 0.31 Low NAB+ vs. NAB− 0.125 0.181 0.70 0.49 0.146 0.179 0.81 0.42 Med NAB+ vs. NAB− 0.201 0.232 0.87 0.39 0.249 0.233 1.07 0.29 High NAB+ vs. NAB− 0.094 0.274 0.34 0.73 0.155 0.275 0.56 0.57 GLMM NAB+ vs. NAB− 0.138 0.174 0.79 0.43 0.182 0.177 1.03 0.30 Low NAB+ vs. NAB− 0.122 0.185 0.66 0.51 0.164 0.188 0.87 0.38 Med NAB+ vs. NAB− 0.195 0.224 0.87 0.38 0.241 0.226 1.07 0.29 High NAB+ vs. NAB− 0.097 0.275 0.35 0.72 0.156 0.279 0.56 0.58 79 4.5. S u m m ary Table 4.6: – Continued (b) The 500 mcg group All Switches considered ; confirmed NAB+, unconfirmed L/M/H and confirmed NAB− All Intervals Intervals > 1 day β̂ SE z-score P value β̂ SE z-score P value GEE-EXCH NAB+ vs. NAB− 0.269 0.188 1.43 0.15 0.263 0.183 1.44 0.15 Low NAB+ vs. NAB− 0.103 0.218 0.48 0.63 0.105 0.212 0.49 0.62 Med NAB+ vs. NAB− 0.476 0.239 1.99 0.05 0.461 0.236 1.95 0.05 High NAB+ vs. NAB− 0.502 0.256 1.96 0.05 0.496 0.252 1.97 0.05 GLMM NAB+ vs. NAB− 0.240 0.183 1.31 0.19 0.239 0.185 1.30 0.20 Low NAB+ vs. NAB− 0.084 0.207 0.41 0.68 0.074 0.209 0.36 0.72 Med NAB+ vs. NAB− 0.433 0.222 1.95 0.05 0.438 0.223 1.96 0.05 High NAB+ vs. NAB− 0.469 0.243 1.93 0.05 0.480 0.245 1.96 0.05 Titers for NAB status — Low: [20, 100), Med: [100, 400), High: [400,∞). 80 4.5. Summary of long strings of 0’s, does not result in much loss of resolution. Thus the efficiencies for estimating the NAB effect are similar in both configurations of data. For the GEE approach, greater loss of efficiency due to collapsing is re- ported; the variances of β̂2 increased by 10% to 20%. The variable length of intervals adds further variation in the mean of the responses which determine the strength of the pairwise correlations of the responses. An exchangeable “working” correlation structure approximates the true correlation structure less accurately in the case of the collapsed data than in the case of the daily relapse data. This is likely the cause of the higher loss of efficiency for GEE- EXCH than for GLMM. Overall, the pattern of estimated relative rates based on the collapsed data resembles that based on the daily relapse data. Analyses based on the two configurations of data arrived at the same overall conclusions for the NAB effects. For the daily relapse data in Chapter 3, the relative efficiency of GEE- EXCH for estimating the log relative risks β2 in terms of NAB status is 100% to 108% for the 250 mcg group and 101% to 106% for the 500 mcg group. The relative efficiency demonstrated is slightly lower when the relapse rate is modelled as a function of NAB titers: 100% to 104% for the 250 mcg group and 93% to 96% for the 500 mcg group. For the collapsed data considered in this chapter, the relative efficiency of GEE-EXCH for estimating the log relative rates β2 is 97% to 104% for the 250 mcg group and 93% to 98% for the 500 mcg group, when NAB status is utilized. With NAB titers, the relative efficiency for estimating β2 is 97% to 100% for the 250 mcg group and 86% to 93% for the 500 mcg group. 81 4.5. Summary The estimated relative efficiency is always higher for the 250 mcg group compared to the 500 mcg group, for daily relapse data compared to collapsed data and for using NAB status compared to using NAB titers. However, the reported figures do not provide a clear picture of whether one of the two estimation approaches is generally more efficient than the other. As an attempt to clarify this issue, in Chapter 5, simulation studies con- trasting the GEE approach to the GLMM approach are reported. In particu- lar, scenarios with binary and sparse responses are considered. The efficiency of GEE approaches with independent (GEE-IND) and exchangeable (GEE- EXCH) correlation structure are compared to GLMM when both GEE and GLMM are correctly specified. Situations when GEE is correctly specified but GLMM is misspecified are also considered. 82 Chapter 5 Simulation Study 5.1 Introduction In the BEYOND data, eventually NAB+ patients had an average of 0.7 to 0.8 relapse onsets during the course of the study. The empirical annualized re- lapse risk was about 0.33 to 0.36. The average proportion of time on study as NAB− was about 36%. Assuming a random intercept model, the estimated variance component was about 0.82. In this chapter, we consider simulation studies comparing the small sample properties of the GEE and GLMM ap- proaches used in Chapter 3 to analyze the BEYOND data. The simulations are designed such that the generated data capture some salient features of the daily relapse data. Thus the results of the simulation are more relevant to our data analyses in Chapter 3. The number of daily relapse observations for each patient in the BEYOND trial is about 800. Conducting simulations with this cluster size is not feasible with the computing resources currently available to us. Thus, much smaller cluster sizes are considered in our sim- ulations. A typical simulated cluster will have roughly one event, as for the BEYOND dataset. The simulations do not fully represent the scenario of the BEYOND data, nevertheless, they shed some light on the performance of the GLMM and GEE approaches for clustered binary data when the event is rare. Specifically, the efficiency of GEE-EXCH and GEE-IND are compared to GLMM when (1) the observations follow the distribution modelled by GLMM and (2) when the marginal and conditional mean structures are the same but some aspect of the distribution of the random effect is misspecified 83 5.1. Introduction in GLMM. Three simulation studies are conducted. The first study exam- ines primarily the finite sample efficiency of GEE when the GLMM is the true model. The relative efficiency is defined as the reciprocal of the ratio of empirical variances. The finite sample biases are also examined. The second and third studies examine the impact of two forms of model misspecification on GEE and GLMM. In all the simulations, clustered binary responses are generated from the log-linear models for binary response described in Section 2.51. The Bernoulli probability is associated with a binary covariate while no time trend is con- sidered. As shown previously in Section 2.5, the only difference between the marginal and conditional mean structure is the intercept term. The es- timated regression coefficients for the binary covariate obtained by GLMM and GEE are thus directly comparable in all three simulation studies. The model fitting routine for GLMM implemented for the simulations is glmer() of the R package lme4, which is computationally faster than glmm() in the R package repeated. In glmer(), the random effect is assumed to be normal, the variance component is estimated by maximum likelihood and fifteen quadrature points are used for the adaptive Gaussian quadrature. For the GEE approach, the function gee() in the R package gee is utilized. The standard errors are estimated by the robust sandwich estimator of variance. The moment estimates for the correlation (when an exchangeable “working” correlation is used) and the sandwich estimator of variance proposed by Liang and Zeger[13] are implemented in gee(). 1The mixed-effects models for the second simulation study in Section 5.3 are excep- tions with some changes to the random intercept which does not alter the marginal mean structure. 84 5.2. Simulation I: Efficiency of GEE 5.2 Simulation I: Efficiency of GEE Assume there are M independent subjects, each with ni repeated observa- tions. Given a random intercept bi, the binary responses Yij are independent Bernoullis with conditional mean pi (C) ij modelled as, log ( pi (C) ij ) = β0 + bi + β1xij, where xij is a binary covariate. The random intercept bi follows a normal distribution with mean 0 and standard deviation σb = 0.3. Since a log link and a normal random effect are used, the simulated Bernoulli probabilities are not bounded above by 1. This smaller variance component than that estimated for the BEYOND data (≈ 0.8) is therefore chosen, so that the simulated probabilities would very seldom exceed one. In fact, for all three simulation studies, all our simulated probabilities are within 0 and 1. Combinations of the following covariate and parameter settings are con- sidered in all simulations unless otherwise specified: A Number of observations per subject: 1. ni = 15, for all i. 2. ni uniformly distributed over the interval [14, 15]. 3. ni uniformly distributed over the interval [12, 15]. B Baseline probability: β0 = log(1/15), so that a typical patient (with bi ≈ 0) has roughly one event on average among the repeated observa- tions in the absence of the covariate effect. C Effect of the binary covariate x: 1. β1 = log(1.0) = 0. 2. β1 = log(1.3) ≈ 0.26. 85 5.2. Simulation I: Efficiency of GEE 3. β1 = log(1.5) ≈ 0.41. D Distribution of covariate x for individual subject: For each sub- ject, the covariate xij starts with a sequence of 0’s followed by a se- quence of 1’s: 1. The first 30% of xij values are 0 and the subsequent 70% are 1. 2. The proportion of xij values that equal 0 is distributed uniformly from 20% to 50%. When β1 = 0 and σb = 0.3, P ( 0.04 < pi (C) ij < 0.18 ) ≈ 0.95 and P ( 0.03 < pi (C) ij < 0.22 ) ≈ 0.99. For β1 = 0, the within-subject correlation is about 0.007. For β1 = log(1.5), the correlation ranges from 0.007 when both x = 0 to 0.011 when both x = 1. The number of simulated replicates is 1000 for each setting. The model fitting algorithms used are glmer(lme4) for GLMM and gee(gee) for GEE- exchangeable (GEE-EXCH) and GEE-independent (GEE-IND). Sometimes the GLMM model fitting algorithm, glmer(), complains about boundary problems. The actual cause of the error is still unknown. The algorithm basically complains that some fitted probabilities are numerically zero. Oc- casionally, this error can be solved by reducing the number of quadrature points to as low as 3. For the same data sets, this problem disappears when the log link is replaced by the logit link. Simulated replicates with this type of error are replaced until no errors are reported. The number of replicates replaced is also recorded. No errors were encountered with gee() in any of our reported simulation studies even for the replicates causing a problem for glmer(). 86 5.2. Simulation I: Efficiency of GEE Results Tables 5.1a to 5.1c show the results of this simulation. Table 5.1a shows the estimated biases, the empirical standard errors and the average standard errors: the average robust standard errors for GEE estimates and the av- erage model-based standard errors for GLMM. Estimates obtained by GEE with exchangeable and independent “working” correlation structures are al- most identical. Similar patterns of estimated bias in β1 are demonstrated by GLMM and GEE. The estimated bias is consistently small, usually within ±0.015. Exceptions include the three cases where the proportion of xij = 0 within cluster is 30% and β1 = 0 (RR = 1.0), where the estimated biases range from 0.020 to 0.026 and the last of the cases where the proportion of xij = 0 within cluster ranges from 20% to 50% and β1 = log(1.5), where the estimated biases are about −0.044. For GEE, no specific patterns of discrepancy between the average robust standard errors and the empirical standard errors are observed; differences are within about −4% to 6%. For GLMM, the average model-based standard errors are consistently larger than the empirical standard errors by up to about 8% except in one case where the average model-based standard error is smaller by about 2%. As shown in Table 5.1b, the efficiencies of GEE-EXCH and GEE-IND relative to GLMM are consistently larger than 1 by a slight margin across all 18 cases considered. The GEE approaches are as efficient as GLMM in this simulation. Table 5.1c reports the average estimated variance component and the numbers of simulated replicates replaced due to errors reported by glmer. The variance component is underestimated by about 10% to 30% as about 14% to 23% of the estimated variance components are numerically zero. The proportion of numerically zero estimated variance components increases with the variability of cluster size in general. Moderate to large number of sim- 87 5.2. Simulation I: Efficiency of GEE ulated replicates are replaced, with the number increasing with the relative risk and the variability in cluster size. For the cases with relative risk 1.5 and cluster size ranging from 12 to 15, more than 1000 replicates were replaced, inducing substantial bias in the types of samples included in the simulation. More Clusters Since the responses and covariates are binary, the pattern of observations on each cluster can be represented by a 2 × 2 table. In our simulation, the Bernoulli probabilities are very low and the vector of covariate values are similar across clusters, or identical in some cases. Hence, there may be very little variation in terms of the pattern of observations across clusters. The predicted random effects for clusters sharing the same pattern should intu- itively be identical, so the estimation of the variance component is essentially based on a few groups of predicted random effects taking the same values. This partly explains the reason for poor estimation of the variance compo- nent. The same simulations were repeated with the number of subjects M in- creased from 250 to 1000 to see if this leads to better estimation of the variance components. Results are reported in Tables 5.2a to 5.2c. GEE- EXCH and GEE-IND again give essentially identical results. GLMM and GEE show similar patterns of estimated biases, which are small, less than 0.012. The only exceptions are the two cases where β1 = log(1.5) and ni is uniformly distributed from 12 to 15, where the estimated bias is about −0.031 for the case where the proportion of xij = 0 is 30% and about −0.036 for the case where the proportion of xij = 0 ranges from 20% to 50%. For GLMM, the differences of the model-based standard errors from the empir- ical standard errors range from about −2% to 7%, whereas those for GEE range from about −4% to 5%. The estimated relative efficiencies of GEE- EXCH are consistently greater than 1 but by less than 1%. Similar relative 88 5.2. Simulation I: Efficiency of GEE efficiencies are observed for GEE-IND. For the variance component, less than 4% of the estimates are numerically 0. A decreasing trend of the average estimated variance component with respect to increasing relative risk is observed, which was also observed in the previous simulation with M = 250. The numbers of replicates replaced are small for relative risk 1.0. However, the numbers of replacements are excessive when the relative risk and the variation in cluster size increases. The number of replicates replaced is larger than 10000 for the case with RR = 1.5 and cluster size varying from 12 to 15. The simulation results may not be reliable for these cases. Larger Cluster Size A third simulation under the scenario of this section was conducted, where the efficiency of GEE is compared to GLMM hopefully in the absence of possible effects from poor estimation of variance components. The number of clusters, M , is 250. The cluster sizes are distributed uniformly from 80 to 100, and β0 = log(0.05) so that about 5 events are expected in a typical cluster with size 100 in the absence of covariate effect (β1 = 0). The larger β0 leads to higher within cluster correlation. In the absence of the covariate effect, the within cluster correlation is about 0.005. Results are presented in Tables 5.3a to 5.3c. In all cases for both GLMM and GEE, the estimated biases are negligible, less than 0.01, and the average standard errors and the empirical standard errors are in general agreement. The relative efficiency of GEE-EXCH is consistently slightly larger than 1, but by less then 1%. The relative efficiency of GEE-IND ranges from about 98% to 100%. Table 5.3c shows that the variance component is accurately estimated and no estimates are numerically 0. However, a modest number of simulated replicates are still replaced due 89 5.2. Simulation I: Efficiency of GEE to the boundary problem in glmer(). Summary In this section, we considered a scenario of clustered data with sparse binary events and a binary covariate which varies within cluster. Clustered data is a more general type of data which encompasses longitudinal data, repeated measurement and familial data. In contrast to longitudinal data, a natural chronological ordering of the observations may not necessarily be relevant to clustered data. In this setting, GEE-EXCH demonstrates the same efficiency as GLMM in estimating the log relative risk β1. The GEE-IND approach also demonstrates high efficiency. The GEE-IND and the GEE-EXCH approaches are almost equivalent in the first two simulation settings. In the third setting where a larger cluster size is considered, the relative efficiency of GEE-IND is still close to 100%, although clearly less than that of GEE-EXCH. 90 5.2. S im u lation I: E ffi cien cy of G E E Table 5.1: Results of Simulation I–1 (a) Comparison of estimated log relative risk, β1 Number of subjects, M = 250 β0 = log(1/15), σb = 0.3 GLMM-MLE GEE-EXCH GEE-IND Prop. xij = 0 RR ni Bias SE ∗ (M) SE ‡ (E) Bias SE † (R) SE ‡ (E) Bias SE † (R) SE ‡ (E) 30% 1.0 15 0.022 0.165 0.159 0.022 0.161 0.159 0.022 0.161 0.159 14–15 0.026 0.166 0.158 0.026 0.162 0.158 0.026 0.162 0.158 12–15 0.020 0.167 0.157 0.020 0.163 0.157 0.020 0.163 0.157 1.3 15 0.009 0.161 0.165 0.009 0.157 0.164 0.009 0.157 0.165 14–15 0.006 0.161 0.159 0.005 0.158 0.158 0.005 0.158 0.158 12–15 0.007 0.162 0.160 0.007 0.159 0.160 0.007 0.159 0.160 1.5 15 0.005 0.159 0.156 0.004 0.156 0.155 0.004 0.156 0.155 14–15 -0.001 0.159 0.156 -0.002 0.156 0.156 -0.002 0.156 0.156 12–15 -0.012 0.160 0.157 -0.013 0.158 0.156 -0.013 0.158 0.156 20%–50% 1.0 15 0.003 0.127 0.127 0.004 0.124 0.127 0.004 0.124 0.127 14–15 0.006 0.128 0.119 0.006 0.125 0.119 0.006 0.125 0.119 12–15 -0.002 0.131 0.123 -0.002 0.128 0.122 -0.002 0.128 0.122 1.3 15 0.003 0.121 0.114 0.003 0.118 0.114 0.003 0.118 0.114 14–15 -0.004 0.122 0.119 -0.005 0.119 0.119 -0.005 0.119 0.119 12–15 -0.015 0.124 0.119 -0.016 0.121 0.119 -0.016 0.121 0.119 1.5 15 -0.001 0.118 0.114 -0.002 0.116 0.114 -0.002 0.116 0.114 14–15 -0.013 0.119 0.118 -0.014 0.116 0.118 -0.014 0.116 0.118 12–15 -0.044 0.121 0.113 -0.045 0.119 0.112 -0.045 0.119 0.112 ∗ Averaged model-based standard error. † Averaged robust standard error. ‡ Empirical standard error, i.e. sample standard deviation of simulation replicates. 91 5.2. Simulation I: Efficiency of GEE Table 5.1: – Continued (b) Relative efficiency† to maximum liklihood for estimating β1 Number of subjects, M = 250 β0 = log(1/15), σb = 0.3 Prop. xij = 0 RR ni GEE-EXCH GEE-IND 30% 1.0 15 1.005 1.004 14–15 1.005 1.004 12–15 1.002 1.004 1.3 15 1.004 1.001 14–15 1.004 1.004 12–15 1.004 1.005 1.5 15 1.005 1.004 14–15 1.003 1.004 12–15 1.004 1.004 20%–50% 1.0 15 1.005 1.005 14–15 1.006 1.006 12–15 1.006 1.007 1.3 15 1.005 1.006 14–15 1.004 1.001 12–15 1.005 1.001 1.5 15 1.004 1.005 14–15 1.004 1.004 12–15 1.003 1.003 †Relative efficiency defined as reciprocal of the ratio of empirical variances. 92 5.2. Simulation I: Efficiency of GEE Table 5.1: – Continued (c) Estimation of variance component in GLMM and computation issues of glmer() Number of subjects, M = 250 β0 = log(1/15), σb = 0.3 Prop. xij = 0 RR ni Average σ̂b Prop. σ̂b < 0.01 Replicates replaced 30% 1.0 15 0.271 18.7% 112 14–15 0.268 21.0% 108 12–15 0.255 22.8% 160 1.3 15 0.261 14.0% 158 14–15 0.260 13.7% 206 12–15 0.248 17.7% 320 1.5 15 0.251 11.2% 300 14–15 0.257 11.7% 433 12–15 0.234 16.1% 1169 20%–50% 1.0 15 0.270 19.6% 78 14–15 0.272 18.3% 102 12–15 0.261 22.0% 150 1.3 15 0.269 14.9% 158 14–15 0.265 13.9% 208 12–15 0.250 19.5% 358 1.5 15 0.254 13.0% 365 14–15 0.243 15.6% 565 12–15 0.216 23.2% 1744 93 5.2. S im u lation I: E ffi cien cy of G E E Table 5.2: Results of Simulation I–2: More Clusters (a) Comparison of estimated log relative risk, β1 Number of subjects, M = 1000 β0 = log(1/15), σb = 0.3 GLMM-MLE GEE-EXCH GEE-IND Prop. xij = 0 RR ni Bias SE ∗ (M) SE ‡ (E) Bias SE † (R) SE ‡ (E) Bias SE † (R) SE ‡ (E) 30% 1.0 15 0.003 0.082 0.079 0.003 0.080 0.078 0.003 0.080 0.078 14–15 0.004 0.082 0.080 0.004 0.080 0.080 0.004 0.080 0.080 12–15 0.006 0.083 0.078 0.006 0.081 0.077 0.006 0.081 0.077 1.3 15 0.003 0.080 0.075 0.002 0.078 0.075 0.002 0.078 0.075 14–15 0.005 0.080 0.082 0.004 0.079 0.082 0.004 0.079 0.082 12–15 -0.001 0.081 0.081 -0.001 0.079 0.080 -0.001 0.079 0.080 1.5 15 -0.003 0.079 0.077 -0.004 0.078 0.077 -0.004 0.078 0.077 14–15 -0.007 0.079 0.078 -0.008 0.078 0.078 -0.008 0.078 0.078 12–15 -0.031 0.080 0.075 -0.032 0.078 0.075 -0.031 0.078 0.075 20%–50% 1.0 15 0.002 0.063 0.061 0.002 0.062 0.061 0.002 0.062 0.061 14–15 0.003 0.064 0.062 0.003 0.062 0.062 0.003 0.063 0.062 12–15 0.000 0.066 0.062 0.000 0.064 0.062 0.000 0.064 0.062 1.3 15 -0.002 0.060 0.059 -0.003 0.059 0.059 -0.003 0.059 0.059 14–15 -0.002 0.061 0.061 -0.003 0.060 0.061 -0.003 0.060 0.061 12–15 -0.010 0.062 0.058 -0.010 0.061 0.058 -0.010 0.061 0.058 1.5 15 -0.003 0.059 0.055 -0.004 0.058 0.055 -0.004 0.058 0.055 14–15 -0.011 0.059 0.057 -0.012 0.058 0.057 -0.012 0.058 0.057 12–15 -0.036 0.061 0.057 -0.037 0.059 0.057 -0.037 0.059 0.057 ∗ Averaged model-based standard error. † Averaged robust standard error. ‡ Empirical standard error, i.e. sample standard deviation of simulation replicates. 94 5.2. Simulation I: Efficiency of GEE Table 5.2: – Continued (b) Relative efficiency† to maximum liklihood for estimating β1 Number of subjects, M = 1000 β0 = log(1/15), σb = 0.3 Prop. xij = 0 RR ni GEE-EXCH GEE-IND 30% 1.0 15 1.007 1.009 14–15 1.006 1.005 12–15 1.005 1.007 1.3 15 1.004 1.005 14–15 1.006 1.007 12–15 1.005 1.005 1.5 15 1.006 1.006 14–15 1.005 1.006 12–15 1.004 1.002 20%–50% 1.0 15 1.004 1.002 14–15 1.006 1.003 12–15 1.006 1.007 1.3 15 1.006 1.007 14–15 1.006 1.004 12–15 1.004 1.001 1.5 15 1.005 1.003 14–15 1.004 1.000 12–15 1.002 0.999 †Relative efficiency defined as reciprocal of the ratio of empirical variances. 95 5.2. Simulation I: Efficiency of GEE Table 5.2: – Continued (c) Estimation of variance component in GLMM and computation issues of glmer() Number of subjects, M = 1000 β0 = log(1/15), σb = 0.3 Prop. xij = 0 RR ni Average σ̂b Prop. σ̂b < 0.01 Replicates replaced 30% 1.0 15 0.324 1.7% 36 14–15 0.323 2.8% 36 12–15 0.323 3.6% 67 1.3 15 0.316 0.6% 102 14–15 0.313 0.9% 162 12–15 0.309 0.8% 640 1.5 15 0.302 0.2% 612 14–15 0.300 0.4% 1258 12–15 0.280 1.6% 11624 20%–50% 1.0 15 0.328 2.3% 16 14–15 0.330 2.2% 36 12–15 0.325 3.6% 63 1.3 15 0.318 0.7% 141 14–15 0.313 1.2% 234 12–15 0.305 1.9% 908 1.5 15 0.302 0.5% 1008 14–15 0.296 0.9% 2234 12–15 0.267 3.7% 24633 96 5.2. S im u lation I: E ffi cien cy of G E E Table 5.3: Results of Simulation I–3: Larger Cluster Size (a) Comparison of estimated log relative risk, β1 Number of subjects, M = 250 β0 = log(5/100), σb = 0.3 GLMM-MLE GEE-EXCH GEE-IND Prop. xij = 0 RR ni Bias SE ∗ (M) SE ‡ (E) Bias SE † (R) SE ‡ (E) Bias SE † (R) SE ‡ (E) 30% 1.0 100 0.007 0.076 0.073 0.007 0.075 0.073 0.007 0.075 0.073 90–100 0.009 0.076 0.077 0.009 0.075 0.077 0.009 0.075 0.077 80–100 0.003 0.077 0.074 0.003 0.076 0.074 0.003 0.076 0.074 1.3 100 0.001 0.074 0.072 0.001 0.073 0.072 0.001 0.074 0.073 90–100 0.003 0.075 0.072 0.003 0.074 0.072 0.004 0.074 0.072 80–100 0.003 0.075 0.073 0.002 0.074 0.073 0.002 0.074 0.073 20%–50% 1.0 100 0.004 0.057 0.057 0.004 0.056 0.057 0.004 0.057 0.057 90–100 -0.002 0.058 0.057 -0.001 0.057 0.057 -0.002 0.058 0.057 80–100 0.002 0.059 0.058 0.002 0.058 0.058 0.002 0.059 0.058 1.3 100 -0.001 0.055 0.053 -0.001 0.054 0.053 -0.001 0.054 0.053 90–100 0.000 0.055 0.054 0.000 0.055 0.054 0.000 0.055 0.054 80–100 -0.001 0.056 0.056 -0.001 0.055 0.056 -0.002 0.056 0.056 ∗ Averaged model-based standard error. † Averaged robust standard error. ‡ Empirical standard error, i.e. sample standard deviation of simulation replicates. 97 5.2. Simulation I: Efficiency of GEE Table 5.3: – Continued (b) Relative efficiency† to maximum liklihood for estimating β1 Number of subjects, M = 250 β0 = log(5/100), σb = 0.3 Prop. xij = 0 RR ni GEE-EXCH GEE-IND 30% 1.0 100 1.002 0.997 90–100 1.003 0.995 80–100 1.002 0.988 1.3 100 1.002 0.984 90–100 1.001 0.994 80–100 1.001 0.988 20%–50% 1.0 100 1.002 0.999 90–100 1.003 0.994 80–100 1.001 0.996 1.3 100 1.001 1.000 90–100 1.004 1.003 80–100 1.002 0.995 †Relative efficiency defined as reciprocal of the ratio of empirical variances. 98 5.2. Simulation I: Efficiency of GEE Table 5.3: – Continued (c) Estimation of variance component in GLMM and computation issues of glmer() Number of subjects, M = 250 β0 = log(5/100), σb = 0.3 Prop. xij = 0 RR ni Average σ̂b Prop. σ̂b < 0.01 Replicates replaced 30% 1.0 100 0.300 0% 272 90–100 0.299 0% 362 80–100 0.297 0% 440 1.3 100 0.296 0% 159 90–100 0.297 0% 221 80–100 0.299 0% 334 20%–50% 1.0 100 0.296 0% 127 90–100 0.297 0% 155 80–100 0.296 0% 164 1.3 100 0.297 0% 103 90–100 0.295 0% 104 80–100 0.298 0% 161 99 5.3. Simulation II: Misspecification of Random Effects 5.3 Simulation II: Misspecification of Random Effects In our second scenario, we examine the impact of a form of misspecification of the random effects. All the features of this simulation study are the same as in Section 5.2 except the assumptions concerning the random effects. The conditional mean Yij is now modelled as log ( pi (C) ij ) = β0 + bij + β1xij. The random effects bij are independent across subjects. Within a subject, bij follows an AR-1 process, bij = ρbij−1 + √ 1− ρ2ij, for j = 2, . . . , ni, where bi1 and ij are i.i.d. N(0, σ 2 b ). The former conditional independence assumption is violated in the sense that within a cluster, the events tend to be clustered together, although the effect may not be apparent for binary data even with a substantially long sequence of observations. The marginal distribution of any individual Yij does not depend on the latent stochastic process of the random effects. Since bij follows a normal distribution with zero mean and variance σ2b , the marginal distribution of Yij is the same as in Section 5.2. The marginal correlation of Yij and Yik is Corr (Yij, Yik) = [ exp ( ρ|j−k|σ2b )− 1]µ(M)ij µ(M)ik√ µ (M) ij ( 1− µ(M)ij ) µ (M) ik ( 1− µ(M)ik ) , so the within-subject observations have a banded correlation structure for β1 = 0. 100 5.3. Simulation II: Misspecification of Random Effects We consider two choices of ρ, 0.1 and 0.5. For ρ = 0.1, and β1 = log(1.5), the lag-1 correlation of Yij and Yij+1 ranges from 0.0007 when both x’s are 0 to 0.0011 when both x’s are 1. The lag-2 and lag-3 correlation are about 0.0001 and less than 0.0001 respectively. For ρ = 0.5, and β1 = log(1.5), the lag-1 correlation ranges from 0.0035 when both x’s are 0 to 0.0054 when both x’s are 1. The lag-2 correlation ranges from 0.0017 to 0.0027 and the lag-3 correlation ranges from 0.0008 to 0.0013. Results Tables 5.4a to 5.4d present the results for the sets of simulations with 250 subjects in each of the 1000 simulated replicates. The results given by GEE- EXCH and GEE-IND are almost identical which is in coherence with the negligible true marginal correlations. In fact, the estimated biases, the av- erage standard errors and the empirical standard errors from all three ap- proaches are very close to each other. The estimated biases for estimation of β1 lie in the range of −0.031 to 0.022, which coverts to biases of less than 3% in estimation of the relative risk, exp(β1). The average standard errors are in general agreement with the empirical standard errors with no specific pattern of discrepancy. The largest disagreements are observed in the case where β1 = log(1.5), the proportion of the binary covariate x being zero ranges from 20% to 50% within cluster and the cluster sizes ni range from 12 to 15; for both ρ = 0.1 and 0.5, for all three approaches, the av- erage standard error is larger then the empirical standard error by about 7%. The simulation results show that the GEE approaches are as efficient as the GLMM approach. For both GEE-EXCH and GEE-IND, the estimated relative efficiencies are consistently larger than 1 by a slight margin of less than 0.005 in all settings considered, except one instance for GEE-IND in which the estimated relative efficiency is 0.999. 101 5.3. Simulation II: Misspecification of Random Effects For GLMM, about half of the estimated variance components are numeri- cally zero, though the interpretation of this estimated variance component in the context of the true model is unclear. The number of simulated replicates replaced ranged from 4 to 763 times across the 24 cases considered. The frequency increases with β1 as well as with variation in cluster size and the variation in the within cluster proportion of x taking the value zero. More Clusters Tables 5.5a to 5.5d show the results of the same set of simulations with the number of subjects increased from 250 to 1000. The bias for estimation of β1 is smaller than 0.01 in general except in four cases where RR = 1.5 and clus- ter size ranges from 12; to 15. In these cases, the biases range from −0.010 to −0.030, however, the number of replicates replaced ranges from about 2400 to 5469. For the standard errors, the percentage differences between the average standard error and the empirical standard error are within 6.5%. The estimated relative efficiency of GEE-EXCH and GEE-IND is consis- tently slightly larger than 100% but by no more than 0.3%. For the settings with relative risk equal to 1.0, no more than 5 replicates are replaced in each case. For the settings with relative risk equal to 1.5, the number of replicates replaced increases with the variation in cluster size, ranging from about 100 to about 5000. Regarding the estimated variance component, 37.5% to 56.1% are numerically zero for low autocorrelation ρ = 0.1 and 48.1% to 69.4% are numerically zero for high autocorrelation ρ = 0.5. Larger Cluster Size Tables 5.6a to 5.6c present results for selected cases with 250 subjects and larger and constant cluster size of ni = 100 in each simulated replicate. We considered combinations of 2 levels of log relative risk β1 = 0 and log(1.3) and 102 5.3. Simulation II: Misspecification of Random Effects 2 levels of autocorrelation ρ = 0.1 and 0.5. The same conclusions concerning the pattern of bias, the validity of the robust standard error for the GEE approaches and the model-based standard error for GLMM, and the relative efficiency of GEE are apparent. For this setting with cluster size ni = 100, the numbers of replicates replaced are negligible. The observed proportions of estimated variance components numerically zero are still around 50%. The average estimated variance component ranges from 0.044 to 0.065. However, the estimation of the fixed effect is not affected by the estimated variance components being severely underestimated. Summary The GEE and GLMM approaches are in general robust to this investigated form of misspecification for the latent stochastic process of the random effect. The percentage of estimated variance component numerically equal to zero is higher, about 50%, compared to about 0% to 23% in Section 5.2 when the model is correctly specified. The number of replicates replaced is also much smaller in this section. Compared to Section 5.2, more or less the same patterns are revealed for bias and the accuracy of the standard error. The efficiency of GEE-EXCH and GEE-IND are almost the same as GLMM probably because the correlation within cluster is almost zero. 103 5.3. S im u lation II: M issp ecifi cation of R an d om E ff ects Table 5.4: Results of Simulation II–1 (a) Comparison of estimated log relative risk, β1, when the time-dependent random effect follows an AR-1 process, ρ = 0.1. Autocorrelation, ρ = 0.1 Number of subjects, M = 250 β0 = log(1/15), σb = 0.3 GLMM GEE-EXCH GEE-IND Prop. xij = 0 RR ni Bias SE ∗ (M) SE ‡ (E) Bias SE † (R) SE ‡ (E) Bias SE † (R) SE ‡ (E) 30% 1.0 15 0.013 0.163 0.161 0.013 0.161 0.161 0.013 0.161 0.161 14–15 0.019 0.164 0.161 0.019 0.162 0.161 0.019 0.162 0.161 12–15 0.022 0.165 0.163 0.022 0.163 0.163 0.022 0.163 0.163 1.5 15 0.015 0.158 0.164 0.014 0.157 0.164 0.014 0.157 0.164 14–15 0.009 0.158 0.158 0.009 0.157 0.158 0.009 0.157 0.157 12–15 -0.010 0.158 0.157 -0.010 0.157 0.156 -0.010 0.157 0.156 20%–50% 1.0 15 -0.003 0.125 0.124 -0.003 0.123 0.124 -0.003 0.123 0.124 14–15 0.003 0.127 0.130 0.003 0.126 0.129 0.003 0.126 0.129 12–15 0.002 0.129 0.126 0.002 0.128 0.126 0.002 0.128 0.126 1.5 15 0.003 0.116 0.113 0.003 0.116 0.113 0.003 0.116 0.113 14–15 -0.007 0.117 0.115 -0.007 0.116 0.115 -0.007 0.116 0.115 12–15 -0.025 0.120 0.113 -0.025 0.119 0.113 -0.025 0.119 0.113 Continued on Next Page. . . 104 5.3. S im u lation II: M issp ecifi cation of R an d om E ff ects Table 5.4: – Continued (b) Comparison of estimated log relative risk, β1, when the time-dependent random effect follows an AR-1 process, ρ = 0.5. Autocorrelation, ρ = 0.5 GLMM GEE-EXCH GEE-IND Prop. xij = 0 RR ni Bias SE ∗ (M) SE ‡ (E) Bias SE † (R) SE ‡ (E) Bias SE † (R) SE ‡ (E) 30% 1.0 15 0.010 0.163 0.156 0.010 0.161 0.156 0.010 0.161 0.156 14–15 0.011 0.164 0.157 0.011 0.163 0.156 0.011 0.163 0.156 12–15 0.005 0.164 0.153 0.005 0.162 0.153 0.005 0.163 0.153 1.5 15 0.005 0.157 0.157 0.005 0.156 0.157 0.005 0.156 0.157 14–15 0.009 0.158 0.156 0.009 0.157 0.156 0.008 0.157 0.156 12–15 -0.015 0.158 0.164 -0.015 0.158 0.164 -0.015 0.158 0.164 20%–50% 1.0 15 0.001 0.125 0.124 0.001 0.124 0.123 0.001 0.124 0.123 14–15 -0.002 0.127 0.123 -0.002 0.125 0.123 -0.002 0.126 0.123 12–15 0.000 0.130 0.130 0.000 0.128 0.130 0.000 0.128 0.130 1.5 15 -0.004 0.117 0.115 -0.005 0.116 0.115 -0.005 0.116 0.115 14–15 -0.007 0.117 0.116 -0.007 0.117 0.116 -0.007 0.117 0.116 12–15 -0.030 0.120 0.112 -0.030 0.119 0.112 -0.031 0.119 0.112 ∗ Averaged model-based standard error. † Averaged robust standard error. ‡ Empirical standard error, i.e. sample standard deviation of simulation replicates. 105 5.3. Simulation II: Misspecification of Random Effects Table 5.4: – Continued (c) Relative efficiency† to maximum liklihood for estimating β1, when the time-dependent random effect follows an AR-1 process. Autocorrelation, ρ = 0.1 Number of subjects, M = 250 β0 = log(1/15), σb = 0.3 Prop. xij = 0 RR ni GEE-EXCH GEE-IND 30% 1.0 15 1.002 1.003 14–15 1.001 1.003 12–15 1.002 1.001 1.5 15 1.000 1.001 14–15 1.001 1.003 12–15 1.002 1.001 20%–50% 1.0 15 1.002 1.004 14–15 1.002 1.003 12–15 1.002 1.001 1.5 15 1.000 1.001 14–15 1.002 1.001 12–15 1.000 1.001 Autocorrelation, ρ = 0.5 Prop. xij = 0 RR ni GEE-EXCH GEE-IND 30% 1.0 15 1.002 1.002 14–15 1.000 1.002 12–15 1.003 1.002 1.5 15 1.002 1.000 14–15 1.001 1.000 12–15 1.001 0.999 20%–50% 1.0 15 1.002 1.001 14–15 1.002 1.001 12–15 1.003 1.004 1.5 15 1.000 1.002 14–15 1.004 1.001 12–15 1.002 1.001 †Relative efficiency defined as reciprocal of the ratio of empirical variances. 106 5.3. Simulation II: Misspecification of Random Effects Table 5.4: – Continued (d) Estimation of variance component in GLMM and computation issues of glmer(), when the time-dependent random effect follows an AR-1 process. Autocorrelation, ρ = 0.1 Number of subjects, M = 250 β0 = log(1/15), σb = 0.3 Prop. xij = 0 RR ni Average σ̂b Prop. σ̂b < 0.01 Replicates replaced 30% 1.0 15 0.127 54.7% 9 14–15 0.130 50.7% 11 12–15 0.134 52.0% 34 1.5 15 0.100 53.0% 47 14–15 0.096 56.4% 72 12–15 0.084 62.2% 425 20%–50% 1.0 15 0.117 55.3% 14 14–15 0.122 55.2% 4 12–15 0.134 54.2% 14 1.5 15 0.096 55.1% 79 14–15 0.099 56.0% 151 12–15 0.079 65.5% 712 Autocorrelation, ρ = 0.5 Prop. xij = 0 RR ni Average σ̂b Prop. σ̂b < 0.01 Replicates replaced 30% 1.0 15 0.145 48.3% 13 14–15 0.151 48.5% 19 12–15 0.134 52.4% 37 1.5 15 0.115 47.2% 47 14–15 0.120 47.5% 96 12–15 0.100 54.7% 470 20%–50% 1.0 15 0.144 48.0% 10 14–15 0.144 48.1% 12 12–15 0.150 49.3% 21 1.5 15 0.118 47.7% 89 14–15 0.111 50.7% 206 12–15 0.104 55.3% 763 107 5.3. S im u lation II: M issp ecifi cation of R an d om E ff ects Table 5.5: Results of Simulation II–2: More Clusters (a) Comparison of estimated log relative risk, β1, when the time-dependent random effect follows an AR-1 process, ρ = 0.1. Autocorrelation, ρ = 0.1 Number of subjects, M = 1000 β0 = log(1/15), σb = 0.3 GLMM GEE-EXCH GEE-IND Prop. xij = 0 RR ni Bias SE ∗ (M) SE ‡ (E) Bias SE † (R) SE ‡ (E) Bias SE † (R) SE ‡ (E) 30% 1.0 15 0.005 0.081 0.078 0.005 0.080 0.078 0.005 0.080 0.078 14–15 0.003 0.081 0.084 0.003 0.081 0.084 0.003 0.081 0.084 12–15 0.004 0.082 0.080 0.004 0.081 0.080 0.004 0.081 0.080 1.5 15 0.001 0.078 0.081 0.001 0.078 0.081 0.001 0.078 0.081 14–15 -0.001 0.078 0.081 -0.001 0.078 0.081 -0.001 0.078 0.081 12–15 -0.015 0.079 0.078 -0.015 0.078 0.078 -0.015 0.078 0.078 20%–50% 1.0 15 0.005 0.062 0.064 0.005 0.062 0.064 0.005 0.062 0.064 14–15 0.001 0.063 0.062 0.001 0.063 0.062 0.001 0.063 0.062 12–15 -0.002 0.064 0.065 -0.002 0.064 0.065 -0.002 0.064 0.065 1.5 15 -0.004 0.058 0.056 -0.004 0.058 0.056 -0.004 0.058 0.056 14–15 -0.006 0.058 0.057 -0.006 0.058 0.057 -0.006 0.058 0.057 12–15 -0.028 0.059 0.056 -0.028 0.059 0.056 -0.028 0.059 0.056 Continued on Next Page. . . 108 5.3. S im u lation II: M issp ecifi cation of R an d om E ff ects Table 5.5: – Continued (b) Comparison of estimated log relative risk, β1, when the time-dependent random effect follows an AR-1 process, ρ = 0.5. Autocorrelation, ρ = 0.5 GLMM GEE-EXCH GEE-IND Prop. xij = 0 RR ni Bias SE ∗ (M) SE ‡ (E) Bias SE † (R) SE ‡ (E) Bias SE † (R) SE ‡ (E) 30% 1.0 15 0.001 0.081 0.080 0.001 0.080 0.080 0.001 0.080 0.080 14–15 0.000 0.081 0.078 0.000 0.081 0.078 0.000 0.081 0.078 12–15 0.004 0.082 0.079 0.004 0.081 0.079 0.004 0.081 0.079 1.5 15 -0.003 0.078 0.075 -0.003 0.078 0.075 -0.003 0.078 0.075 14–15 -0.004 0.078 0.075 -0.005 0.078 0.075 -0.005 0.078 0.075 12–15 -0.021 0.078 0.079 -0.021 0.078 0.079 -0.021 0.078 0.079 20%–50% 1.0 15 0.001 0.062 0.064 0.001 0.062 0.064 0.001 0.062 0.064 14–15 -0.003 0.063 0.061 -0.003 0.063 0.061 -0.003 0.063 0.061 12–15 -0.001 0.065 0.065 -0.001 0.064 0.065 -0.001 0.064 0.065 1.5 15 -0.003 0.058 0.057 -0.004 0.058 0.057 -0.004 0.058 0.057 14–15 -0.005 0.059 0.058 -0.006 0.058 0.058 -0.006 0.058 0.058 12–15 -0.026 0.060 0.058 -0.027 0.060 0.058 -0.027 0.060 0.058 ∗ Averaged model-based standard error. † Averaged robust standard error. ‡ Empirical standard error, i.e. sample standard deviation of simulation replicates. 109 5.3. Simulation II: Misspecification of Random Effects Table 5.5: – Continued (c) Relative efficiency† to maximum liklihood for estimating β1, when the time-dependent random effect follows an AR-1 process. Autocorrelation, ρ = 0.1 Number of subjects, M = 1000 β0 = log(1/15), σb = 0.3 Prop. xij = 0 RR ni GEE-EXCH GEE-IND 30% 1.0 15 1.000 1.001 14–15 1.001 1.003 12–15 1.002 1.001 1.5 15 1.001 1.000 14–15 1.000 1.002 12–15 1.000 1.000 20%–50% 1.0 15 1.002 1.002 14–15 1.001 1.001 12–15 1.001 1.003 1.5 15 1.001 1.002 14–15 1.000 1.000 12–15 1.000 1.000 Autocorrelation, ρ = 0.5 Prop. xij = 0 RR ni GEE-EXCH GEE-IND 30% 1.0 15 1.001 1.002 14–15 1.002 1.001 12–15 1.002 1.002 1.5 15 1.002 1.002 14–15 1.001 1.000 12–15 1.000 1.000 20%–50% 1.0 15 1.002 1.002 14–15 1.001 1.001 12–15 1.002 1.003 1.5 15 1.001 1.000 14–15 1.001 1.002 12–15 1.001 1.001 †Relative efficiency defined as reciprocal of the ratio of empirical variances. 110 5.3. Simulation II: Misspecification of Random Effects Table 5.5: – Continued (d) Estimation of variance component in GLMM and computation issues of glmer(), when the time-dependent random effect follows an AR-1 process. Autocorrelation, ρ = 0.1 Number of subjects, M = 1000 β0 = log(1/15), σb = 0.3 Prop. xij = 0 RR ni Average σ̂b Prop. σ̂b < 0.01 Replicates replaced 30% 1.0 15 0.098 51.7% 2 14–15 0.108 49.9% 0 12–15 0.101 53.9% 4 1.5 15 0.075 52.7% 106 14–15 0.072 52.9% 294 12–15 0.056 64.1% 2429 20%–50% 1.0 15 0.109 48.1% 0 14–15 0.108 49.6% 0 12–15 0.109 51.2% 5 1.5 15 0.078 53.3% 224 14–15 0.067 59.0% 552 12–15 0.049 69.4% 4581 Autocorrelation, ρ = 0.5 Prop. xij = 0 RR ni Average σ̂b Prop. σ̂b < 0.01 Replicates replaced 30% 1.0 15 0.129 39.3% 0 14–15 0.128 42.0% 2 12–15 0.138 41.3% 4 1.5 15 0.107 37.5% 144 14–15 0.107 38.6% 297 12–15 0.079 51.5% 2904 20%–50% 1.0 15 0.131 41.1% 0 14–15 0.125 43.4% 0 12–15 0.140 40.1% 3 1.5 15 0.111 39.0% 219 14–15 0.097 44.4% 630 12–15 0.077 56.1% 5469 111 5.3. S im u lation II: M issp ecifi cation of R an d om E ff ects Table 5.6: Results of Simulation II–3: Larger Cluster Size (a) Comparison of estimated log relative risk, β1, when the time-dependent random effect follows an AR-1 process. Autocorrelation, ρ = 0.1 Number of subjects, M = 250 β0 = log(5/100), σb = 0.3 GLMM GEE-EXCH GEE-IND Prop. xij = 0 RR ni Bias SE ∗ (M) SE ‡ (E) Bias SE † (R) SE ‡ (E) Bias SE † (R) SE ‡ (E) 30% 1.0 100 0.002 0.075 0.075 0.002 0.074 0.075 0.002 0.074 0.075 1.3 100 0.002 0.073 0.073 0.002 0.073 0.073 0.002 0.073 0.073 20%–50% 1.0 100 0.002 0.057 0.056 0.002 0.056 0.056 0.002 0.056 0.056 1.3 100 0.003 0.054 0.054 0.003 0.054 0.054 0.003 0.054 0.054 Autocorrelation, ρ = 0.5 GLMM GEE-EXCH GEE-IND Prop. xij = 0 RR ni Bias SE ∗ (M) SE ‡ (E) Bias SE † (R) SE ‡ (E) Bias SE † (R) SE ‡ (E) 30% 1.0 100 -0.001 0.075 0.078 -0.001 0.075 0.078 -0.001 0.075 0.078 1.3 100 0.000 0.073 0.078 0.000 0.073 0.077 0.000 0.073 0.077 20%–50% 1.0 100 0.001 0.057 0.057 0.001 0.057 0.057 0.001 0.057 0.057 1.3 100 0.001 0.054 0.054 0.001 0.054 0.054 0.001 0.054 0.054 ∗ Averaged model-based standard error. † Averaged robust standard error. ‡ Empirical standard error, i.e. sample standard deviation of simulation replicates.112 5.3. Simulation II: Misspecification of Random Effects Table 5.6: – Continued (b) Relative efficiency† to maximum liklihood for estimating β1, when the time-dependent random effect follows an AR-1 process. Autocorrelation, ρ = 0.1 Number of subjects, M = 250 β0 = log(5/100), σb = 0.3 Prop. xij = 0 RR ni GEE-EXCH GEE-IND 30% 1.0 100 1.001 1.001 1.3 100 1.001 1.000 20%–50% 1.0 100 1.000 1.001 1.3 100 1.001 1.001 Autocorrelation, ρ = 0.5 Prop. xij = 0 RR ni GEE-EXCH GEE-IND 30% 1.0 100 1.001 1.002 1.3 100 1.001 1.001 20%–50% 1.0 100 0.999 1.001 1.3 100 1.000 1.000 †Relative efficiency defined as reciprocal of the ratio of empirical variances. 113 5.3. Simulation II: Misspecification of Random Effects Table 5.6: – Continued (c) Estimation of variance component in GLMM and computation issues of glmer(), when the time-dependent random effect follows an AR-1 process. Autocorrelation, ρ = 0.1 Number of subjects, M = 250 β0 = log(5/100), σb = 0.3 Prop. xij = 0 RR ni Average σ̂b Prop. σ̂b < 0.01 Replicates replaced 30% 1.0 100 0.049 55.4% 7 1.3 100 0.044 55.8% 1 20%–50% 1.0 100 0.053 52.0% 2 1.3 100 0.048 51.6% 2 Autocorrelation, ρ = 0.5 Prop. xij = 0 RR ni Average σ̂b Prop. σ̂b < 0.01 Replicates replaced 30% 1.0 100 0.057 48.8% 3 1.3 100 0.054 45.8% 2 20%–50% 1.0 100 0.060 45.4% 2 1.3 100 0.055 45.4% 0 114 5.4. Simulation III: Misspecification of the Random Effect Distribution 5.4 Simulation III: Misspecification of the Random Effect Distribution In the third scenario, we study the impact of a form of misspecification of the random effect distribution. All the features of the simulation study in Section 5.2 are retained in this section except for this departure of the random effect distribution. Beta Distribution for the Baseline Conditional Probability Firstly, a beta-binomial setup is considered. The baseline conditional prob- ability, P (Yij = 1|bi, xij = 0) = exp (β0 + bi), as a function of bi, is assumed to follow a beta distribution with mean pi0 = exp ( β0 + σ2b 2 ) and variance σ2pi = (exp(σ 2 b )− 1) · pi20. The equivalent parameterization in the usual shape parameters α and β of the beta distribution is α = pi0 ( pi0(1− pi0) σ2pi − 1 ) and β = (1− pi0) ( pi0(1− pi0) σ2pi − 1 ) . For β0 = log(5/100) and σb = 0.3, pi0 lies between 0.026 and 0.088 with probability 0.95 and between 0.020 to 0.010 with probability 0.99. The beta distribution for pi0 is analogous to implementing a slightly asymmetric dis- tribution for bi, with approximately, mean −0.003, standard deviation 0.316 and skewness −0.35. The probabilities corresponding to x = 1 are obtained by multiplying the baseline probabilities by the relative risk exp(β1). Figure 5.1 shows the probability density function of the random intercept when the baseline conditional probability follows the specified beta distri- 115 5.4. Simulation III: Misspecification of the Random Effect Distribution bution and compares it to a normal distribution to illustrate the extent it deviates from normal. The same first two moments of the normal distribution are matched to have mean −0.003 and standard deviation 0.316. Under this beta-binomial setting, the random intercept distribution does not deviate much from a normal distribution and is only slightly left-skewed. Results Table 5.7a shows the estimated biases, the average standard error and the empirical standard error for the three estimation approaches. Results ob- tained by the three approaches are very similar, with numerical values within a margin of 0.001. The estimated bias is small, less than 0.010 in absolute value in all cases. The average standard error and the empirical standard error closely resemble each other in general. The ratio of the grand average of the standard error to the grand average of the empirical standard error is 101.3% for GLMM and 100.0% for GEE-EXCH and GEE-IND. Table 5.7b shows the estimated efficiencies of GEE-EXCH and GEE-IND relative to GLMM. Among the 18 cases we considered, GEE-EXCH is as efficient as GLMM with relative efficiency ranging from 99.9% to 100.4%. GEE-IND is in general less efficient in this setup as the estimated relative efficiencies mostly range from 98.2% to 100.0% with three cases where the relative efficiencies are larger than 100.0% (100.3%, 100.3% and 100.8%). The average estimated variance components reported in Table 5.7c are close to true value, σb = 0.3. None of the estimated variance components is numerically 0. A considerable number of replicates, ranging from 48 to 311, are replaced. This very mild misspecification of the random effect distribution does not appear to influence the bias and accuracy of the model-based standard 116 5.4. Simulation III: Misspecification of the Random Effect Distribution error of GLMM. In this particular setup, GEE-EXCH and GLMM have very similar efficiency, while GEE-IND has a slightly lower efficiency. Triangular Distribution for the Random Intercept A second setup considers a triangular distribution for the random effect. The random intercept bi is generated from a distribution with density function fb(x) = 2 · (2δ − x) (3δ)2 for x ∈ [ − δ, 2δ ]. The cumulative distribution function (CDF) is Fb(x) = (x+ δ) · (5δ − x) (3δ)2 for x ∈ [−δ, 2δ], and the inverse CDF is F−1b (u) = (2− 3 √ 1− u) · δ for u ∈ [0, 1] . This triangular distribution has mean 0 and variance δ2 2 . In this subsection, We choose a scale parameter δ = 0.3 √ 2 so that the standard deviation of bi is the same as in our earlier simulations (σb = 0.3). Figure 5.2 compares the density of this triangular distribution with δ = 0.3 √ 2 to a normal distribution with zero mean and variance σ2b = 0.3 2. This density function is perpendicular triangular in shape with skewness approximately 0.57. It imitates a situation when the majority of patients have relatively low event rates while a few exceptional patients have comparatively high event rates. The 18 basic settings listed in Section 5.2 are considered, with cluster sizes uniformly distribution from 80 to 100 and β0 = log(5/100). This implies a baseline probability in the range from 0.03 to 0.12. Results for these settings are directly comparable to those in Tables 5.3a to 5.3c. 117 5.4. Simulation III: Misspecification of the Random Effect Distribution Results Table 5.8a shows the estimated biases, average standard errors and the em- pirical standard errors. The estimated biases of all three approaches are small, less than 0.005 in absolute value in all cases. The average standard errors are in good agreement with the empirical standard errors in general. For empirical standard errors around 0.055 to 0.075, the discrepancies of the average standard errors are within a margin of 0.002. The ratio of the grand average of the standard errors to the grand average of the empirical standard errors reported in Table 5.8a is 101.4% for GLMM, 100.0% for GEE-EXCH and 100.2% for GEE-IND. For GLMM, the ratios of the standard errors are consistently larger than 100% ranging from about 100% to 104%. Excep- tions include three cases with the ratios about 99.0% to 99.5%. In general, for all three approaches, there is essentially no difference between the robust standard errors or the model-based standard errors from their corresponding empirical standard errors. Table 5.8b presents the estimated relative efficiencies. GEE-EXCH is ba- sically as efficient as GLMM in this setting as well, with estimated relative efficiencies ranging from 99.9% to 100.4%; whereas GEE-IND tends to be slightly less efficient, with estimated relative efficiencies ranging from 98.3% to 100.1%. Table 5.8c reports on the estimation of the variance component and num- ber of simulated replicates replaced. The variance component is overesti- mated by 4.0% to 6.3%. The number of simulated replicates replaced is comparable to the previous setup with the beta distribution. Even with this more extreme distribution for the random effect, regarding estimation of the regression coefficients, GLMM is still quite robust to this form of model misspecification. The variance component is however overes- 118 5.5. Conclusion timated on average leading to potentially slight inflation of the model-based standard error, although the effect is very limited in our cases. The bias and validity of the sandwich estimator of variance of the GEE approaches are not affected by the distribution of the random effect, as far as we observed. For the two setups considered in this section, GEE-EXCH has the same efficiency as GLMM while GEE-IND is slightly less efficient by 1% to 2%. Summary Our limited simulation studies in this section show that GLMM estimation of the regression coefficients is quite robust to moderate departures of the true random effect distribution from the normal distribution when the latter is conventionally assumed. Comparing the two settings of model misspeci- fication in this section to the correct model specification setting in Section 5.2, similar biases, average standard errors and empirical standard errors are demonstrated. The number of replicates replaced are consistently smaller in the two model misspecification settings than in the correct model specifica- tion case. The number of replacements is however similar between these two settings. For the average estimates of the variance component, those in the cor- rect model specification setting are always the smallest, while the triangular random effect distribution setting always results in the largest variance com- ponent estimates among the three simulation settings. The efficiencies of GEE-EXCH and GEE-IND relative to GLMM are also similar in all three settings. 5.5 Conclusion In the simulation studies for clustered binary responses and a binary covari- ate in this chapter, we have only considered circumstances with very low 119 5.5. Conclusion marginal probability of events. Simulated response vectors usually consist of long strings of zeros and very rare events. In general, the pairwise correla- tion is bound to be low for small marginal probability, which is presumably the reason no loss of efficiency for the GEE approaches is observed in the simulations even for the independence “working” correlation structure. The GLMM estimates of the regression coefficients are robust to depar- tures of the random effect distribution from the conventional normal assump- tion even if the true underlying distribution is moderately skewed. Misspeci- fication to the underlying stochastic process of the random effect of the type investigated also has no impact on the quality and efficiency for estimating the regression coefficients. However, it is worth noting that the GLMM log binomial model seems to be a difficult model to fit when the cluster size is small (n = 15) compared to fitting the corresponding marginal model by the GEE approach. 120 5.5. Conclusion Figure 5.1: Probability density functions of random intercept bi with mean −0.003 and variance σ2b = 0.3162 when 1) normally distributed and 2) base- line conditional probability follows beta a distribution. −2 −1 0 1 2 0. 0 0. 2 0. 4 0. 6 0. 8 1. 0 1. 2 Quantile D en si ty Log−beta Normal 121 5.5. C on clu sion Table 5.7: Results of Simulation III–1: Beta-Binomial Setup (a) Comparison of estimated log relative risk, β1 when the baseline probability follows a beta distribution. Number of subjects, M = 250 β0 = log(5/100), σb = 0.3 GLMM GEE-EXCH GEE-IND Prop. xij = 0 RR ni Bias SE ∗ (M) SE ‡ (E) Bias SE † (R) SE ‡ (E) Bias SE † (R) SE ‡ (E) 30% 1.0 100 0.006 0.076 0.077 0.006 0.075 0.077 0.006 0.075 0.078 90–100 0.006 0.076 0.075 0.006 0.075 0.075 0.006 0.076 0.076 80–100 0.008 0.077 0.074 0.008 0.076 0.074 0.007 0.076 0.074 1.3 100 -0.001 0.074 0.075 -0.002 0.073 0.074 -0.001 0.074 0.075 90–100 0.002 0.075 0.075 0.002 0.074 0.074 0.002 0.074 0.075 80–100 0.001 0.075 0.075 0.001 0.074 0.075 0.000 0.074 0.074 1.5 100 0.001 0.074 0.073 0.000 0.073 0.073 0.001 0.073 0.073 80–100 0.004 0.074 0.075 0.004 0.073 0.075 0.003 0.074 0.076 90–100 0.003 0.074 0.072 0.002 0.073 0.072 0.002 0.074 0.072 20%–50% 1.0 100 0.002 0.057 0.057 0.002 0.056 0.057 0.002 0.057 0.057 90–100 0.002 0.058 0.055 0.002 0.057 0.055 0.002 0.057 0.055 80–100 -0.005 0.059 0.057 -0.005 0.058 0.057 -0.005 0.058 0.057 1.3 100 0.000 0.055 0.055 0.000 0.054 0.055 0.000 0.054 0.055 90–100 0.003 0.055 0.054 0.003 0.055 0.054 0.003 0.055 0.054 80–100 0.002 0.056 0.054 0.001 0.055 0.054 0.001 0.056 0.055 1.5 100 0.001 0.054 0.054 0.001 0.053 0.054 0.001 0.053 0.054 90–100 0.001 0.054 0.054 0.001 0.053 0.053 0.001 0.054 0.054 80–100 0.002 0.055 0.054 0.001 0.054 0.054 0.001 0.055 0.054 ∗ Averaged model-based standard error. † Averaged robust standard error. ‡ Empirical standard error, i.e. sample standard deviation of simulation replicates.122 5.5. Conclusion Table 5.7: – Continued (b) Relative efficiency† to maximum liklihood for estimating β1 when the baseline probability follows a beta distribution. Number of subjects, M = 250 β0 = log(5/100), σb = 0.3 Prop. xij = 0 RR ni GEE-EXCH GEE-IND 30% 1.0 100 1.003 0.982 90–100 1.002 0.986 80–100 1.001 0.999 1.3 100 1.003 0.986 90–100 1.001 0.986 80–100 0.999 1.003 1.5 100 1.004 0.990 80–100 1.004 0.983 90–100 1.000 0.995 20%–50% 1.0 100 1.002 0.990 90–100 1.002 0.985 80–100 1.003 0.995 1.3 100 1.004 0.997 90–100 1.004 1.008 80–100 1.001 0.987 1.5 100 1.002 1.003 90–100 1.004 0.994 80–100 1.002 1.000 †Relative efficiency defined as reciprocal of the ratio of empirical variances. 123 5.5. Conclusion Table 5.7: – Continued (c) Estimation of variance component in GLMM and computation issues of glmer() when the baseline probability follows a beta distribution. Number of subjects, M = 250 β0 = log(5/100), σb = 0.3 Prop. xij = 0 RR ni Average σ̂b Prop. σ̂b < 0.01 Replicates replaced 30% 1.0 100 0.302 0% 164 90–100 0.302 0% 235 80–100 0.301 0% 311 1.3 100 0.301 0% 80 90–100 0.301 0% 115 80–100 0.302 0% 199 1.5 100 0.298 0% 83 90–100 0.299 0% 128 80–100 0.300 0% 185 20%–50% 1.0 100 0.297 0% 48 90–100 0.300 0% 83 80–100 0.299 0% 102 1.3 100 0.301 0% 48 90–100 0.300 0% 75 80–100 0.299 0% 97 1.5 100 0.297 0% 53 90–100 0.299 0% 76 80–100 0.299 0% 103 124 5.5. Conclusion Figure 5.2: Probability density functions of random intercept bi with mean 0 and variance σ2b = 0.3 2 when 1) normally distributed and 2) triangularly distributed. −1.5 −1.0 −0.5 0.0 0.5 1.0 1.5 0. 0 0. 5 1. 0 1. 5 Quantile D en si ty Triangular Normal 125 5.5. C on clu sion Table 5.8: Results of Simulation III–2: Triangular Distribution (a) Comparison of estimated log relative risk, β1 when the random effect follows a triangular distribution. Number of subjects, M = 250 β0 = log(5/100), σb = 0.3 GLMM GEE-EXCH GEE-IND Prop. xij = 0 RR ni Bias SE ∗ (M) SE ‡ (E) Bias SE † (R) SE ‡ (E) Bias SE † (R) SE ‡ (E) 30% 1.0 100 0.002 0.076 0.075 0.002 0.075 0.075 0.002 0.075 0.076 90–100 0.001 0.076 0.076 0.001 0.075 0.075 0.001 0.076 0.076 80–100 0.004 0.077 0.075 0.004 0.076 0.075 0.003 0.076 0.075 1.3 100 0.000 0.074 0.074 0.000 0.073 0.074 0.000 0.074 0.074 90–100 0.002 0.075 0.074 0.002 0.074 0.074 0.002 0.074 0.074 80–100 0.004 0.075 0.073 0.003 0.074 0.073 0.003 0.074 0.073 1.5 100 0.002 0.074 0.072 0.002 0.073 0.072 0.002 0.073 0.073 80–100 0.004 0.074 0.074 0.004 0.073 0.074 0.004 0.073 0.074 90–100 0.004 0.074 0.073 0.003 0.073 0.073 0.003 0.074 0.073 20%–50% 1.0 100 0.001 0.057 0.056 0.001 0.056 0.056 0.000 0.057 0.056 90–100 0.002 0.058 0.058 -0.002 0.057 0.058 -0.002 0.057 0.059 80–100 0.003 0.059 0.060 0.003 0.058 0.060 0.003 0.058 0.060 1.3 100 0.001 0.055 0.054 0.000 0.054 0.054 0.000 0.054 0.055 90–100 0.000 0.055 0.054 0.000 0.054 0.054 0.000 0.055 0.054 80–100 0.002 0.056 0.055 0.002 0.055 0.055 0.002 0.056 0.055 1.5 100 0.001 0.053 0.054 0.000 0.053 0.054 0.000 0.053 0.054 90–100 0.003 0.054 0.053 0.002 0.053 0.053 0.001 0.054 0.053 80–100 0.000 0.055 0.053 -0.001 0.054 0.053 -0.001 0.054 0.053 ∗ Averaged model-based standard error. † Averaged robust standard error. ‡ Empirical standard error, i.e. sample standard deviation of simulation replicates.126 5.5. Conclusion Table 5.8: – Continued (b) Relative efficiency† to maximum liklihood for estimating β1, when the random effect follows a triangular distribution. Number of subjects, M = 250 β0 = log(5/100), σb = 0.3 Prop. xij = 0 RR ni GEE-EXCH GEE-IND 30% 1.0 100 1.002 0.995 90–100 1.004 1.000 80–100 1.002 0.995 1.3 100 1.000 1.001 90–100 1.000 0.996 80–100 1.002 0.996 1.5 100 1.002 0.987 80–100 1.003 0.984 90–100 0.999 1.000 20%–50% 1.0 100 1.003 0.998 90–100 1.002 0.996 80–100 1.002 1.000 1.3 100 1.002 0.984 90–100 1.003 0.989 80–100 1.002 1.001 1.5 100 1.002 0.990 90–100 1.001 0.983 80–100 1.001 0.987 †Relative efficiency defined as reciprocal of the ratio of empirical variances. 127 5.5. Conclusion Table 5.8: – Continued (c) Estimation of variance component in GLMM and computation issues of glmer(), when the random effect follows a triangular distribution. Number of subjects, M = 250 β0 = log(5/100), σb = 0.3 Prop. xij = 0 RR ni Average σ̂b Prop. σ̂b < 0.01 Replicates replaced 30% 1.0 100 0.316 0% 161 90–100 0.319 0% 235 80–100 0.318 0% 334 1.3 100 0.313 0% 80 90–100 0.314 0% 102 80–100 0.318 0% 185 1.5 100 0.313 0% 70 90–100 0.315 0% 94 80–100 0.315 0% 169 20%–50% 1.0 100 0.317 0% 66 90–100 0.314 0% 83 80–100 0.316 0% 108 1.3 100 0.315 0% 38 90–100 0.314 0% 70 80–100 0.315 0% 104 1.5 100 0.312 0% 60 90–100 0.314 0% 88 80–100 0.314 0% 127 128 Chapter 6 Conclusions and Discussion A log-linear random intercept model for estimating relative risk for clustered or longitudinal data is described in this report. This model implies a com- mon mean structure at the population average level and the subject specific level, which leads to two approaches for estimation of the same regression coefficient. In the case of a binary response, the corresponding regression coefficient represents the log relative risk. The first model fitting approach, maximum likelihood, is based on the integrated likelihood of the GLMM model; the second, the GEE approach, is an estimating equations approach based on the marginalized mean structure of this GLMM model. This regression model was applied to the BEYOND trial data to assess the effects of neutralizing antibodies on the efficacy of interferon beta-1b in relapsing-remitting multiple sclerosis. Analyses based on daily relapse data showed that the model fits obtained by GLMM and GEE-EXCH are very similar. The estimated correlations by GEE-EXCH were very low, about 0.001. The BEYOND trial data suggested some NAB effect for medium and high level NAB titer values in the 500 mcg group, but this pattern was barely significant for GEE-EXCH and not significant for GLMM. For the 250 mcg group, no evidence of NAB effect was suggested based on NAB status. When using the NAB titers directly, no evidence of NAB effects was sug- gested, again, in the 250 mcg group. On the other hand, in the 500 mcg group, a positive association between the titer value and the risk of relapse onset was indicated for both approaches. 129 Chapter 6. Conclusions and Discussion Another version of the data was considered by collapsing the binary indi- cators of daily relapse onset to the number of relapse onsets within intervals between two consecutive serum sample collections. Analyses based on the collapsed data resulted in generally the same conclusions on NAB effect. The estimated relative risks obtained by GEE-EXCH were slightly larger than those obtained by GLMM. The correlation estimated by GEE-EXCH was about 0.05 for the 250 mcg group and 0.11 for the 500 mcg group. Com- paring the standard errors of the estimated log relative risk based on the daily relapse data and the collapsed data, there was little loss of efficiency for GLMM due to collapsing. For GEE-EXCH, mild to moderate loss of efficiency was observed. The variance of β̂ corresponding to NAB status was inflated by up to 20% in the 500 mcg group. To assess the efficiency of GEE-EXCH and GLMM, the variances of β̂ were compared. Based on the daily relapse data, GEE-EXCH was slightly more efficient except in the 500 mcg group with direct use of NAB titer when GLMM was more efficient. Based on the collapsed data, in the 250 mcg group, the efficiency of the two approaches were similar. In the 500 mcg group, since a substantial loss of efficiency was observed for GEE-EXCH due to collapsing, it was less efficient compared to GLMM with both NAB status and NAB titers. When the relapse onsets are very sparse among the binary responses, the within-subject correlation is usually very low. Our simulations showed, not surprisingly, that GEE-EXCH was as efficient as GLMM when the events were sparse and the within cluster correlation was smaller than 0.1. We also looked at the impact of model misspecification under the scenario of low event probability and low correlation. When the random effects follow a latent AR-1 process, our simulations showed that for estimation of the fixed effect, GLMM was robust to this form of model misspecification, even 130 Chapter 6. Conclusions and Discussion though the variance component was poorly estimated. Similar robustness of GLMM to departures from the conventionally assumed normal form of the random effect distribution was also observed. In the situation when the random effects follow a triangular distribution with moderate skewness, the estimated regression coefficients were essentially unaffected and their model- based standard errors are accurate, although the variance component was slightly overestimated. In these cases of misspecification of the random effect distribution, GEE-EXCH has the same efficiency as GLMM while GEE-IND is almost as efficient with a loss of less than 2%. Various studies of the efficiency of GEE estimators can be found in the literature. Neuhaus[21] compared a mixed-effects logistic model, a marginal logistic model and Rosner’s model for clustered binary data. Although the regression coefficients in the three models do not have the same interpreta- tion, they estimate the same null covariate effect when the true regression coefficients are zero. The mixed-effects model and Rosner’s model can be fit- ted by maximum likelihood whereas the marginal logistic model can be fitted by GEE. He considered a setting where a single covariate has exchangeable correlation structure within cluster. Assuming the mixed-effects model holds with no actual covariate effect, approximate analytic results relating the re- gression coefficients as well as the standard errors obtained by the three approaches were derived asymptotically. Based on those results, the asymp- totic relative efficiencies (AREs) of the derived Wald-type tests of the three approaches were compared. It was concluded that when the covariate only varies at the cluster level, all three approaches have similar power in de- tecting covariate effects. However, when the covariate varies within cluster, the mixed-effects model and GEE-EXCH are similar and more efficient than GEE-IND and Rosner’s model. The gain in efficiency increases with the within cluster correlation of the response. 131 Chapter 6. Conclusions and Discussion Fitzmaurice[22] compared the efficiency of GEE-EXCH and GEE-IND to maximum likelihood when the binary responses follow a general multivariate distribution with mean structure related to covariates through a logit link. The longitudinal data are measured at the same time points across clusters. A binary covariate, either cluster-level or within-cluster, was considered. The AREs of the estimated time effects were close to 1 so the focus was on the binary covariate. For cluster-level covariates, the efficiency of GEE-EXCH and GEE-IND are almost the same. Their AREs decrease with increasing correlation of the responses but are quite high (> 0.9) provided the corre- lation is modest (< 0.3). For within-cluster covariates, GEE-EXCH retains a similar pattern of ARE. However, GEE-IND results in a considerable loss of efficiency, as its ARE drops more sharply than that of GEE-EXCH as the correlation increases. GEE-IND was also compared to GEE-EXCH un- der the null covariate effect when both the response and the covariate has an exchangeable correlation structure within cluster. Conclusions similar to those of Neuhaus[21] were reached: loss of efficiency is associated with cor- relation of both the covariates and the responses and the two approaches are equally efficient when the covariate is constant within cluster. Fitzmaurice’s results[22] explained the paradoxical findings in some previous studies on relative efficiency of GEE-IND to GEE-EXCH where either a cluster-level or a within-cluster covariate was considered, but not both. Mancl and Leroux[23] investigated extensively the loss of efficiency due to the use of an independence “working” correlation structure when the true correlation structure is exchangeable. Asymptotic efficiency comparing GEE- IND to GEE-EXCH was based on the analytic form of the covariance matri- ces of the estimated regression coefficients for the case of linear models. The ARE was shown to depend on the covariate distribution, the cluster size, the within cluster correlations of the responses and the regression coefficients. The distribution of the covariates was described in terms of the proportion 132 Chapter 6. Conclusions and Discussion of covariate variation attributable to between-cluster variation. Mancl and Leroux[23] identified that GEE-IND is fully efficient relative to GEE-EXCH in two situations: 1) when the covariate is cluster-level (i.e. covariate varia- tion is entirely between-cluster); or 2) when cluster size is constant and the within-cluster covariate is mean-balanced across clusters (i.e. covariate vari- ation is entirely within-cluster). The latter situation had been overlooked in previous studies but actually had been demonstrated by Fitzmaurice’s time covariate, which is a special case of a mean-balanced covariate as it is identi- cal across clusters. The ARE decreases as the covariate distribution deviates from these two extremes. Larger cluster size was shown to aggravate the loss of efficiency. When the cluster size is large (e.g. 100), even when the response correlation is low (e.g. ρy = 0.1), the ARE can be as low as about 30%. They noted that the results may give good approximations for more general situations with other link functions and variance functions, especially if the resulting weights do not vary to a great extent. In an example for bi- nary response and binary covariate with ρy = 0.2 and using a logit link, the ARE is about 0.9 for a constant cluster size of 4 and drops to about 0.5 for a constant cluster size of 20. The ARE is apparently unrelated to the baseline probability and increases only slightly with the magnitude of the covariate effect. Compared to our simulation study where a binary covariate varying within cluster was considered, we did not see any loss of efficiency due to variations in covariates, cluster size or covariate effects. This was most likely due to the low correlation in our simulated responses. Chaganty and Joe[12] pointed out that the usual GEE “working” cor- relation matrix, assumed to be constant over possible values of covariates, cannot be the true correlation structure in general for non-normal, especially binary responses. For binary responses, the correlations are constrained by Fréchet bounds which are determined by the marginal means. The bounds could be quite narrow when the range of the covariate is wide. Also, strong 133 Chapter 6. Conclusions and Discussion positive dependence is possible without high correlation. They suggested to regard the “working” correlation matrix merely as a weight matrix. A set of guidelines for selecting the optimal weight matrix was described. Since the optimal weight matrix is fixed, no association parameters are involved so the weight matrix can be ensured to be positive definite. By simulation, the efficiency of GEE is shown to be high relative to maximum likelihood when data is generated from a multivariate probit model. In this case, the optimal weight matrix turns out to be a rough approximate to the average true correlation matrix. Chaganty and Joe[12] also argued that some studies of the relative effi- ciency of GEE in the literature are invalid. For example, they pointed out that Fitzmaurice’s[22] comparison of the GEE approach to maximum likeli- hood was invalid due to the questionable likelihood used. The correlations between the binary responses were assumed constant across possible values of the covariate, which was generally impossible. In contrast, our comparisons of the GEE approach to GLMM were based on a correct likelihood in which the correlations between the responses actually depended on the covariates2. To conclude, when the marginal probability of the binary response is low, the within cluster correlation is usually low. In this case, the GEE approach is almost as efficient as GLMM even when an independence “working” corre- lation matrix is used. Our simulations also indicated that GLMM was robust to misspecification of the random effect distribution as well as when the ran- dom intercept followed a latent AR-1 process within cluster. The estimation of regression coefficients was not sensitive to the normality assumption of the random intercept in general. Overall, the GEE approach demonstrated high efficiency. Fitting the marginal model was also computationally easier 2Although one may argue that a normal random effect distribution is theoretically impossible when a log link is applied to binary responses, we think this is not an issue provided the variance component is small enough. 134 Chapter 6. Conclusions and Discussion and less demanding than fitting the GLMM. All in all, the GEE approach is recommended over GLMM for estimation of relative risk when the relative frequency of the event is very low for the correlated binary response. In our analyses of the BEYOND data, the time trend of relapse risk was modelled as linear based on the conclusion of the likelihood ratio tests. Confirmatory analyses fitting a second degree spline or a quadratic time trend can be performed for both daily and collapsed data to see if more comparable NAB effects are estimated for both IFNB treatment groups. Our study only compared the efficiency of the GEE approach to GLMM when the response probability is low and the within cluster correlation is low. Further research could be done to study the relative efficiency for higher marginal probability and within cluster correlation. Also, the robustness of GLMM under more extreme departure from the normal random effect distribution could be examined. Since with the log link, the mean is not bounded above by 1, problems may arise when fitting the log binomial models to data with higher marginal probability. Some of these problems are documented in the literature in the context of cross-sectional studies. These issues of fitting the log-binomial model in the context of longitudinal studies and possible alternatives to circumvent the difficulties could be topics for future studies. 135 Bibliography [1] IFNB Multiple Sclerosis Group (1993). Interferon beta-1b is effective in relapsing-remitting multiple sclerosis. I. Clinical results of a multicenter, randomized double-blind, placebo-controlled trial. Neurology , 43, 655– 661. [2] Paty, D. W., Li, D. K. B., UBC MS/MRI Study Group, and IFNB Multiple Sclerosis Study Group (1993). Interferon beta-1b is effective in relapsing-remitting multiple sclerosis. II. MRI analysis results of a mul- ticenter, randomized double-blind, placebo-controlled trial. Neurology , 43, 662–667. [3] IFNB Multiple Sclerosis Study Group and UBC MS/MRI Analysis Group (1995). Interferon beta-1b in the treatment of multiple sclero- sis: Final outcome of the randomized controlled trial. Neurology , 45, 1277–1285. [4] IFNB Multiple Sclerosis Study Group and UBC MS/MRI Analysis Group (1996). Neutralizing antibodies during treatment of multiple scle- rosis with interferon beta-1b: Experience during the first three years. Neurology , 47, 889–894. [5] Petkau, A. J., White, R. A., Ebers, G. C., Reder, A. T., Sibley, W. A., Lublin, F. D., Paty, D. W., and the IFNB Multiple Sclerosis Study Group (2004). Longitudinal analyses of the effects of neurtalizing an- tibodies on interferon beta-1b in relapsing-remitting multiple sclerosis. Multiple Sclerosis , 10, 126–138. 136 Bibliography [6] O’Conner, P., Filippi, M., Arnason, B., Comi, G., Cook, S., Goodin, D., Hartung, H.-P., Jeffery, D., Kappos, L., Boateng, F., Filippov, V., Groth, M., Knappertz, V., Kraus, C., Sandbrink, R., Pohlt, C., Bogu- milt, T. (2009). 250 µg or 500 µg interferon beta-1b versus 20 mg glati- ramer acetate in relapsing-remitting multiple sclerosis: a prospective, randomised, multicentre study. Lancet Neurology , 8, 889–897. [7] Nelder, J. A. and Wedderburn, R. W. M. (1972). Generalized linear models. Journal of the Royal Statistical Society, Series A, 135, 370– 384. [8] McCullagh, P. and Nelder, J. A. (1989). Generalized Linear Models . Chapman and Hall/CRC Press, 2nd edition. [9] Wedderburn, R. W. M. (1974). Quasi-likelihood functions, generalized linear models, and the Gauss-Newton method. Biometrika, 61, 439–447. [10] Bahadur, R. R. (1961). A representation of the joint distribution of responses to n dichotomous items. Solomon, H. (ed.), Studies on item analysis and prediction, pp. 158–68, Stanford Mathematical Studies in the Sociak Science VI, Stanford University Press, Stanford, California. [11] Diggle, P., Heagerty, P., Liang, K.-Y., and Zeger, S. (2002). Analysis of Longitudinal Data. Oxford University Press, 2nd edition. [12] Chaganty, N. R. and Joe, H. (2004). Efficiency of generalized estimating equations for binary responses. Journal of the Royal Statistical Society, Series B , 66, 851–860. [13] Liang, K.-Y. and Zeger, S. (1986). Longitudinal data analysis using gen- eralized linear models. Biometrika, 73, 13–22. [14] Zeger, S. and Liang, K.-Y. (1986). Longitudinal data analysis for discrete and continuous outcomes. Biometrics , 42, 121–130. 137 [15] Paik, M. C. (1988). Repeated measurement analysis for nonnormal data in small samples. Communications in Statistics - Simulation and Com- putation, 17, 1155–1171. [16] Halekoh, U., Højsgaard, S., and Yan, J. (2006). The R package geepack for generalized estimating equations. Journal of Statistical Software, 15, 1–11. [17] Self, S. G. and Liang, K.-Y. (1987). Asymptotic properties of maximum likelihood estimators and likelihood ratio tests under nonstandard con- ditions. Journal of the American Statistical Association, 82, 605–610. [18] Patterson, H. D. and Thompson, R. (1971). Recovery of inter-block information when block sizes are unequal. Biometrika, 58, 545–554. [19] McCulloch, C. E. (1994). Maximum likelihood variance components es- timation for binary data. Journal of American Statistical Association, 89, 330–335. [20] Fitzmaurice, G. M., Laird, N. M., and Ware, J. H. (2004). Applied Longitudinal Analysis . John Wiley and Sons. [21] Neuhaus, J. M. (1993). Estimation efficiency and tests of covariate effects with clustered binary data. Biometrics , 49, 989–996. [22] Fitzmaurice, G. M. (1995). A caveat concerning independence estimat- ing equations with multivariate binary data. Biometrics , 51, 309–317. [23] Mancl, L. A. and Leroux, B. G. (1996). Efficiency of regression estimates for clustered data. Biometrics , 52, 500–511. 138 Appendix A Details of Model Fits for the Daily Relapse Data 139 A p p en d ix A . D etails of M o d el F its for th e D aily R elap se D ata Table A.1: Model fit based on eventually NAB+ patients only: linear time trend, NAB status. (a) The 250 mcg group All switches considered ; confirmed NAB+, confirmed NAB− GEE-EXCH GLMM estimate SE Z-score p-value estimate SE Z-score p-value (Intercept) -6.720 0.139 -48.46 <0.01 -6.943 0.137 -50.52 <0.01 Time -0.185 0.104 -1.78 0.08 -0.192 0.110 -1.74 0.08 NAB+ vs. NAB− 0.088 0.158 0.56 0.58 0.097 0.171 0.57 0.57 α(GEE-EXCH) 0.001 − − − − − − − φ(GEE-EXCH) 0.973 − − − − − − − σb(GLMM) − − − − 0.729 0.069 10.60 <0.01 (b) The 500 mcg group All switches considered ; confirmed NAB+, confirmed NAB− GEE-EXCH GLMM estimate SE Z-score p-value estimate SE Z-score p-value (Intercept) -7.087 0.151 -47.02 <0.01 -7.354 0.156 -47.22 <0.01 Time -0.052 0.113 -0.46 0.64 -0.050 0.116 -0.43 0.66 NAB+ vs. NAB− 0.257 0.172 1.49 0.14 0.252 0.183 1.38 0.17 α(GEE-EXCH) 0.001 − − − − − − − φ(GEE-EXCH) 0.970 − − − − − − − σb(GLMM) − − − − 0.802 0.073 10.94 <0.01 140 A p p en d ix A . D etails of M o d el F its for th e D aily R elap se D ata Table A.2: Model fit based on eventually NAB+ patients only: linear time trend, refined NAB status. (a) The 250 mcg group All Switches considered ; confirmed NAB+, unconfirmed L/M/H and confirmed NAB− GEE-EXCH GLMM estimate SE Z-score p-value estimate SE Z-score p-value (Intercept) -6.722 0.140 -48.11 <0.01 -6.946 0.138 -50.28 <0.01 Time -0.182 0.104 -1.75 0.08 -0.187 0.113 -1.66 0.10 Low NAB+ vs. NAB− 0.073 0.168 0.43 0.67 0.090 0.182 0.49 0.62 Med NAB+ vs. NAB− 0.147 0.221 0.66 0.51 0.140 0.221 0.63 0.53 High NAB+ vs. NAB− 0.034 0.261 0.13 0.90 0.023 0.270 0.09 0.93 α(GEE-EXCH) 0.001 − − − − − − − φ(GEE-EXCH) 0.973 − − − − − − − σb(GLMM) − − − − 0.729 0.069 10.59 <0.01 (b) The 500 mcg group All Switches considered ; confirmed NAB+, unconfirmed L/M/H and confirmed NAB− GEE-EXCH GLMM estimate SE Z-score p-value estimate SE Z-score p-value (Intercept) -7.062 0.150 -47.02 <0.01 -7.334 0.157 -46.80 <0.01 Time -0.100 0.121 -0.83 0.41 -0.096 0.121 -0.80 0.43 Low NAB+ vs. NAB− 0.105 0.201 0.52 0.60 0.095 0.207 0.46 0.65 Med NAB+ vs. NAB− 0.432 0.220 1.97 0.05 0.436 0.223 1.96 0.05 High NAB+ vs. NAB− 0.471 0.237 1.99 0.05 0.466 0.241 1.93 0.05 α(GEE-EXCH) 0.001 − − − − − − − φ(GEE-EXCH) 0.971 − − − − − − − σb(GLMM) − − − − 0.809 0.074 11.00 <0.01 ∗Titers for NAB status — Low : [20, 100), Med : [100, 400), High : [400,∞).141 A p p en d ix A . D etails of M o d el F its for th e D aily R elap se D ata Table A.3: Model fits based on 250 mcg arm, eventually NAB+ patients only: linear time trend, log NAB titers. GEE-EXCH GLMM estimate SE Z-score p-value estimate SE Z-score p-value cutoff = ∞ (Intercept) -6.703 0.135 -49.59 <0.01 -6.924 0.132 -52.32 <0.01 Time -0.151 0.092 -1.63 0.10 -0.148 0.102 -1.45 0.15 log (NAB Titer/20) -0.002 0.054 -0.03 0.97 -0.009 0.056 -0.16 0.87 α(GEE-EXCH) 0.001 − − − − − − − φ(GEE-EXCH) 0.972 − − − − − − − σb(GLMM) − − − − 0.728 0.069 10.57 <0.01 cutoff = 1000 NU/mL (Intercept) -6.704 0.135 -49.54 <0.01 -6.925 0.133 -52.26 <0.01 Time -0.157 0.093 -1.69 0.09 -0.155 0.103 -1.51 0.13 log (min {NAB Titer, 1000} /20) 0.006 0.062 0.09 0.93 -0.001 0.062 -0.02 0.98 α(GEE-EXCH) 0.001 − − − − − − − φ(GEE-EXCH) 0.972 − − − − − − − σb(GLMM) − − − − 0.729 0.069 10.59 <0.01 Continued on Next Page. . . 142 A p p en d ix A . D etails of M o d el F its for th e D aily R elap se D ata Table A.3 – Continued GEE-EXCH GLMM estimate SE Z-score p-value estimate SE Z-score p-value cutoff = 400 NU/mL (Intercept) -6.703 0.135 -49.58 <0.01 -6.924 0.133 -52.22 <0.01 Time -0.154 0.094 -1.64 0.10 -0.152 0.103 -1.47 0.14 log (min {NAB Titer, 400} /20) 0.002 0.071 0.03 0.97 -0.005 0.072 -0.07 0.94 α(GEE-EXCH) 0.001 − − − − − − − φ(GEE-EXCH) 0.972 − − − − − − − σb(GLMM) − − − − 0.728 0.069 10.58 <0.01 cutoff = 100 NU/mL (Intercept) -6.707 0.136 -49.18 <0.01 -6.916 0.133 -51.92 <0.01 Time -0.166 0.095 -1.75 0.08 -0.134 0.102 -1.31 0.19 log (min {NAB Titer, 100} /20) 0.028 0.109 0.25 0.80 -0.050 0.111 -0.45 0.65 α(GEE-EXCH) 0.001 − − − − − − − φ(GEE-EXCH) 0.972 − − − − − − − σb(GLMM) − − − − 0.727 0.069 10.56 <0.01 143 A p p en d ix A . D etails of M o d el F its for th e D aily R elap se D ata Table A.4: Model fits based on 500 mcg arm, eventually NAB+ patients only: linear time trend, log NAB titers. GEE-EXCH GLMM estimate SE Z-score p-value estimate SE Z-score p-value cutoff = ∞ (Intercept) -7.032 0.146 -48.18 <0.01 -7.309 0.148 -49.30 <0.01 Time -0.086 0.117 -0.74 0.46 -0.085 0.113 -0.75 0.45 log (NAB Titer/20) 0.108 0.051 2.10 0.04 0.110 0.049 2.23 0.03 α(GEE-EXCH) 0.001 − − − − − − − φ(GEE-EXCH) 0.970 − − − − − − − σb(GLMM) − − − − 0.814 0.074 11.04 <0.01 cutoff = 1000 NU/mL (Intercept) -7.044 0.148 -47.76 <0.01 -7.319 0.149 -49.14 <0.01 Time -0.102 0.119 -0.86 0.39 -0.100 0.114 -0.87 0.38 log (min {NAB Titer, 1000} /20) 0.137 0.061 2.24 0.03 0.138 0.057 2.42 0.02 α(GEE-EXCH) 0.001 − − − − − − − φ(GEE-EXCH) 0.971 − − − − − − − σb(GLMM) − − − − 0.812 0.074 11.02 <0.01 Continued on Next Page. . . 144 A p p en d ix A . D etails of M o d el F its for th e D aily R elap se D ata Table A.4 – Continued GEE-EXCH GLMM estimate SE Z-score p-value estimate SE Z-score p-value cutoff = 400 NU/mL (Intercept) -7.056 0.148 -47.55 <0.01 -7.331 0.150 -48.96 <0.01 Time -0.101 0.119 -0.85 0.40 -0.100 0.115 -0.87 0.38 log (min {NAB Titer, 400} /20) 0.162 0.073 2.23 0.03 0.163 0.068 2.39 0.02 α(GEE-EXCH) 0.001 − − − − − − − φ(GEE-EXCH) 0.971 − − − − − − − σb(GLMM) − − − − 0.811 0.074 11.01 <0.01 cutoff = 100 NU/mL (Intercept) -7.079 0.150 -47.23 <0.01 -7.354 0.152 -48.28 <0.01 Time -0.084 0.116 -0.72 0.47 -0.082 0.112 -0.73 0.47 log (min {NAB Titer, 100} /20) 0.240 0.118 2.04 0.04 0.244 0.113 2.17 0.03 α(GEE-EXCH) 0.001 − − − − − − − φ(GEE-EXCH) 0.970 − − − − − − − σb(GLMM) − − − − 0.797 0.073 10.91 <0.01 145 Appendix B Details of Model Fits for the Collapsed Data 146 A p p en d ix B . D etails of M o d el F its for th e C ollap sed D ata Table B.1: Model fit based on collapsed data, eventually NAB+ patients only: linear time trend, NAB status. (a) The 250 mcg group All switches considered ; confirmed NAB+, confirmed NAB− GEE-EXCH GLMM estimate SE Z-score p-value estimate SE Z-score p-value (Intercept) -0.904 0.134 -6.76 <0.01 -1.177 0.131 -8.98 <0.01 Time -0.245 0.115 -2.14 0.03 -0.235 0.114 -2.05 0.04 NAB+ vs. NAB− 0.142 0.167 0.85 0.40 0.138 0.174 0.79 0.43 α(GEE-EXCH) 0.050 − − − − − − − φ(GEE-EXCH) 2.242 − − − − − − − σb(GLMM) − − − − 0.818 0.071 11.51 <0.01 (b) The 500 mcg group All switches considered ; confirmed NAB+, confirmed NAB− GEE-EXCH GLMM estimate SE Z-score p-value estimate SE Z-score p-value (Intercept) -1.304 0.154 -8.48 <0.01 -1.600 0.149 -10.72 <0.01 Time -0.040 0.130 -0.31 0.76 -0.030 0.117 -0.25 0.80 NAB+ vs. NAB− 0.269 0.188 1.43 0.15 0.240 0.183 1.31 0.19 α(GEE-EXCH) 0.105 − − − − − − − φ(GEE-EXCH) 1.934 − − − − − − − σb(GLMM) − − − − 0.905 0.075 12.00 <0.01 147 A p p en d ix B . D etails of M o d el F its for th e C ollap sed D ata Table B.2: Model fit based on collapsed data, eventually NAB+ patients only: linear time trend, refined NAB status. (a) The 250 mcg group All Switches considered ; confirmed NAB+, unconfirmed L/M/H and confirmed NAB− GEE-EXCH GLMM estimate SE Z-score p-value estimate SE Z-score p-value (Intercept) -0.905 0.134 -6.75 <0.01 -1.178 0.131 -8.97 <0.01 Time -0.243 0.114 -2.13 0.03 -0.234 0.117 -2.00 0.05 Low NAB+ vs. NAB− 0.125 0.181 0.70 0.49 0.122 0.185 0.66 0.51 Med NAB+ vs. NAB− 0.201 0.232 0.87 0.39 0.195 0.224 0.87 0.38 High NAB+ vs. NAB− 0.094 0.274 0.34 0.73 0.097 0.275 0.35 0.72 α(GEE-EXCH) 0.050 − − − − − − − φ(GEE-EXCH) 2.239 − − − − − − − σb(GLMM) − − − − 0.819 0.071 11.51 <0.01 (b) The 500 mcg group All Switches considered ; confirmed NAB+, unconfirmed L/M/H and confirmed NAB− GEE-EXCH GLMM estimate SE Z-score p-value estimate SE Z-score p-value (Intercept) -1.296 0.156 -8.31 <0.01 -1.593 0.150 -10.63 <0.01 Time -0.095 0.141 -0.68 0.50 -0.082 0.123 -0.66 0.51 Low NAB+ vs. NAB− 0.103 0.218 0.48 0.63 0.084 0.207 0.41 0.68 Med NAB+ vs. NAB− 0.476 0.239 1.99 0.05 0.433 0.222 1.95 0.05 High NAB+ vs. NAB− 0.502 0.256 1.96 0.05 0.469 0.243 1.93 0.05 α(GEE-EXCH) 0.105 − − − − − − − φ(GEE-EXCH) 2.183 − − − − − − − σb(GLMM) − − − − 0.913 0.076 12.07 <0.01 ∗Titers for NAB status — Low : [20, 100), Med : [100, 400), High : [400,∞). 148 A p p en d ix B . D etails of M o d el F its for th e C ollap sed D ata Table B.3: Model fits based on collapsed data, 250 mcg arm, eventually NAB+ patients only: linear time trend, log NAB titers. GEE-EXCH GLMM estimate SE Z-score p-value estimate SE Z-score p-value cutoff = ∞ (Intercept) -0.868 0.126 -6.86 <0.01 -1.142 0.122 -9.37 <0.01 Time -0.196 0.100 -1.95 0.05 -0.189 0.106 -1.79 0.07 log (NAB Titer/20) 0.005 0.057 0.09 0.93 0.008 0.057 0.14 0.89 α(GEE-EXCH) 0.054 − − − − − − − φ(GEE-EXCH) 2.054 − − − − − − − σb(GLMM) − − − − 0.819 0.071 11.49 <0.01 cutoff = 1000 NU/mL (Intercept) -0.870 0.127 -6.85 <0.01 -1.145 0.122 -9.36 <0.01 Time -0.200 0.102 -2.00 0.05 -0.196 0.106 -1.84 0.07 log (min {NAB Titer, 1000} /20) 0.012 0.065 0.22 0.82 0.018 0.063 0.28 0.78 α(GEE-EXCH) 0.054 − − − − − − − φ(GEE-EXCH) 2.068 − − − − − − − σb(GLMM) − − − − 0.819 0.071 11.51 <0.01 Continued on Next Page. . . 149 A p p en d ix B . D etails of M o d el F its for th e C ollap sed D ata Table B.3 – Continued GEE-EXCH GLMM estimate SE Z-score p-value estimate SE Z-score p-value cutoff = 400 NU/mL (Intercept) -0.870 0.127 -6.85 <0.01 -1.145 0.123 -9.32 <0.01 Time -0.200 0.103 -1.95 0.05 -0.193 0.107 -1.81 0.07 log (min {NAB Titer, 400} /20) 0.012 0.075 0.16 0.87 0.016 0.073 0.21 0.83 α(GEE-EXCH) 0.054 − − − − − − − φ(GEE-EXCH) 2.068 − − − − − − − σb(GLMM) − − − − 0.819 0.071 11.50 <0.01 cutoff = 100 NU/mL (Intercept) -0.877 0.129 -6.80 <0.01 -1.154 0.125 -9.25 <0.01 Time -0.213 0.104 -2.04 0.04 -0.207 0.108 -1.91 0.06 log (min {NAB Titer, 100} /20) 0.044 0.115 0.38 0.70 0.051 0.114 0.45 0.66 α(GEE-EXCH) 0.053 − − − − − − − φ(GEE-EXCH) 2.111 − − − − − − − σb(GLMM) − − − − 0.820 0.071 11.52 <0.01 150 A p p en d ix B . D etails of M o d el F its for th e C ollap sed D ata Table B.4: Model fits based on collapsed data, 500 mcg arm, eventually NAB+ patients only: linear time trend, log NAB titers. GEE-EXCH GLMM estimate SE Z-score p-value estimate SE Z-score p-value cutoff = ∞ (Intercept) -1.257 0.145 -8.66 <0.01 -1.571 0.138 -11.42 <0.01 Time -0.085 0.135 -0.63 0.53 -0.076 0.116 -0.65 0.51 log (NAB Titer/20) 0.118 0.054 2.17 0.03 0.114 0.050 2.30 0.02 α(GEE-EXCH) 0.106 − − − − − − − φ(GEE-EXCH) 2.078 − − − − − − − σb(GLMM) − − − − 0.920 0.076 12.11 <0.01 cutoff = 1000 NU/mL (Intercept) -1.276 0.148 -8.64 <0.01 -1.584 0.139 -11.43 <0.01 Time -0.102 0.138 -0.74 0.46 -0.089 0.117 -0.76 0.45 log (min {NAB Titer, 1000} /20) 0.151 0.066 2.29 0.02 0.142 0.057 2.47 0.01 α(GEE-EXCH) 0.106 − − − − − − − φ(GEE-EXCH) 2.100 − − − − − − − σb(GLMM) − − − − 0.918 0.076 12.09 <0.01 Continued on Next Page. . . 151 A p p en d ix B . D etails of M o d el F its for th e C ollap sed D ata Table B.4 – Continued GEE-EXCH GLMM estimate SE Z-score p-value estimate SE Z-score p-value cutoff = 400 NU/mL (Intercept) -1.291 0.149 -8.64 <0.01 -1.596 0.140 -11.39 <0.01 Time -0.103 0.139 -0.74 0.46 -0.087 0.117 -0.75 0.45 log (min {NAB Titer, 400} /20) 0.180 0.079 2.27 0.02 0.166 0.069 2.42 0.02 α(GEE-EXCH) 0.107 − − − − − − − φ(GEE-EXCH) 2.086 − − − − − − − σb(GLMM) − − − − 0.916 0.076 12.08 <0.01 cutoff = 100 NU/mL (Intercept) -1.312 0.153 -8.57 <0.01 -1.609 0.143 -11.22 <0.01 Time -0.083 0.135 -0.62 0.54 -0.066 0.115 -0.58 0.56 log (min {NAB Titer, 100} /20) 0.268 0.131 2.05 0.04 0.240 0.114 2.11 0.03 α(GEE-EXCH) 0.107 − − − − − − − φ(GEE-EXCH) 1.978 − − − − − − − σb(GLMM) − − − − 0.909 0.076 12.03 <0.01 152
- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Theses and Dissertations /
- Regression approaches to estimation of relative risk...
Open Collections
UBC Theses and Dissertations
Featured Collection
UBC Theses and Dissertations
Regression approaches to estimation of relative risk : application to multiple sclerosis studies Fu, Shing 2010
pdf
Notice for Google Chrome users:
If you are having trouble viewing or searching the PDF with Google Chrome, please download it here instead.
If you are having trouble viewing or searching the PDF with Google Chrome, please download it here instead.
Page Metadata
Item Metadata
Title | Regression approaches to estimation of relative risk : application to multiple sclerosis studies |
Creator |
Fu, Shing |
Publisher | University of British Columbia |
Date Issued | 2010 |
Description | Using a log link for binary response in generalized linear mixed-effects models (GLMM) allows direct estimation of the relative risk. If a random intercept is the only random effect in the conditional mean structure, the marginal mean has the same form. The fixed effects, representing the log relative risks, have the same interpretation in both the mixed-effects model and the marginal model. This leads to two approaches to estimate the relative risks, 1) maximum likelihood for the mixed-effects models and 2) the generalized estimating equations (GEE) approach for the marginal models. In our study, we apply such log-linear models to assess the effects of neutralizing antibodies on interferon beta-1b in relapsing-remitting multiple sclerosis. The results obtained by the two approaches are compared. The relative efficiency of the GEE approach and the robustness of the GLMM approach to some forms of misspecification of the model for the random effects are studied by simulations. |
Genre |
Thesis/Dissertation |
Type |
Text |
Language | eng |
Date Available | 2010-10-25 |
Provider | Vancouver : University of British Columbia Library |
Rights | Attribution-NonCommercial-NoDerivatives 4.0 International |
DOI | 10.14288/1.0071408 |
URI | http://hdl.handle.net/2429/29503 |
Degree |
Master of Science - MSc |
Program |
Statistics |
Affiliation |
Science, Faculty of Statistics, Department of |
Degree Grantor | University of British Columbia |
GraduationDate | 2010-11 |
Campus |
UBCV |
Scholarly Level | Graduate |
Rights URI | http://creativecommons.org/licenses/by-nc-nd/4.0/ |
AggregatedSourceRepository | DSpace |
Download
- Media
- 24-ubc_2010_fall_fu_shing.pdf [ 1.08MB ]
- Metadata
- JSON: 24-1.0071408.json
- JSON-LD: 24-1.0071408-ld.json
- RDF/XML (Pretty): 24-1.0071408-rdf.xml
- RDF/JSON: 24-1.0071408-rdf.json
- Turtle: 24-1.0071408-turtle.txt
- N-Triples: 24-1.0071408-rdf-ntriples.txt
- Original Record: 24-1.0071408-source.json
- Full Text
- 24-1.0071408-fulltext.txt
- Citation
- 24-1.0071408.ris
Full Text
Cite
Citation Scheme:
Usage Statistics
Share
Embed
Customize your widget with the following options, then copy and paste the code below into the HTML
of your page to embed this item in your website.
<div id="ubcOpenCollectionsWidgetDisplay">
<script id="ubcOpenCollectionsWidget"
src="{[{embed.src}]}"
data-item="{[{embed.item}]}"
data-collection="{[{embed.collection}]}"
data-metadata="{[{embed.showMetadata}]}"
data-width="{[{embed.width}]}"
data-media="{[{embed.selectedMedia}]}"
async >
</script>
</div>
Our image viewer uses the IIIF 2.0 standard.
To load this item in other compatible viewers, use this url:
https://iiif.library.ubc.ca/presentation/dsp.24.1-0071408/manifest