Analyses of Longitudinal and Time-to-Event Data in a Randomized Clinical Trial in the Presence of a Lag Time in the Stabilization of Treatment By Eugenia Yu Hoi Y i n B. Sc. University of British Columbia 2000 A THESIS SUBMITTED IN PARTIAL F U L F I L L M E N T O F T H E REQUIREMENTS FOR T H E D E G R E E O F M A S T E R OF SCIENCE in T H E FACULTY OF G R A D U A T E STUDIES D E P A R T M E N T O F STATISTICS We accept this thesis as conforming to the required standard T H E UNIVERSITY O F BRITISH COLUMBIA February 2003 © Eugenia Yu Hoi Y i n , 2003 In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission. Department of Statistics The University of British Columbia 6356 Agricultural Road Vancouver, Canada V 6 T 1Z2 Date: Apr.'I IT, 2 O 0 3 Abstract Randomized controlled clinical trials (RCTs) are generally considered to be the best experimental setting for assessing new medical therapies. In medical research, the evaluation of RCTs is often based on two approaches: the commonly recommended intent-to-treat (ITT) analysis, and the more controversial per-protocol (PP) approach, which respectively attempt to assess the clinical effectiveness and the efficacy of a therapy. In the presence of a variable lag time in treatment stabilization following randomization, the two approaches may differ not only in their patient inclusion and exclusion criteria, but also in their definitions of the baseline time, from which follow-up is to be measured. In this work, ITT and PP analyses are applied to the evaluation of an eye pressure lowering therapy, in data from the Collaborative Normal Tension Glaucoma Study. In this study, the therapeutic intervention consisted of achieving a 30% reduction in intra-ocular pressure, and necessitated a lag time before the lowered pressure level became stable. This thesis includes longitudinal and survival analyses, based on measurements taken on some of the main variables in this study. In this case, the PP approach defines baseline time in the treated group as the time at which treatment stabilization has been achieved. It thus loses some of the advantages of randomization, and may suffer potential bias in parameter estimation as well as diminishing statistical power in testing the treatment effect. We investigate these potential problems through some simulation work. While the ITT and the P P approaches fail to account for the delay in treatment stabilization, we also develop a multistate model for survival analysis and a piecewise linear mixed effects (LME) model for longitudinal analysis, both of which address the lag time problem in assessing the effectiveness of the therapy. Finally, we consider a baseline-adjustment approach to match the control group to the delayed treatment group for an efficacy assessment of the therapy. These methods that account for the lag time are compared to the ITT and the PP approaches, and recommendations based on their performance in our study and their general applicability are given. ii Table of Contents Abstract ii List of Tables vi List of Figures viii Acknowledgements x 1 Introduction 1 2 The Collaborative Normal Tension Glaucoma Study 4 2.1 Normal Tension Glaucoma 4 2.1.1 Introduction 4 2.1.2 Epidemiology 5 2.1.3 Diagnosis and Management 6 2.1.4 Visual Field Measurements 7 3 2.2 Motivation and Design of the Study 2.3 Description of the Data 10 2.3.1 10 Definition of the Progression End Point Evaluation of Randomized Clinical Trials 12 3.1 Intent-to-Treat versus Per-Protocol Principles 12 3.2 Intent-to-Treat and Per-Protocol Analyses of the Normal Tension Glaucoma Data 4 8 14 Linear Mixed Effects Models for the Longitudinal Mean Defect Data 17 4.1 18 The Linear Mixed Effects Model iii 4.2 5 24 5.1 Stochastic Multistate Models for Survival Time Data 24 5.1.1 The Disability Model 25 5.1.2 Application to the Normal Tension Glaucoma Data 27 Baseline-Adjustment Analysis of the Time to Progression Data 31 Results 6.1 6.2 33 Analysis of the Mean Defect Data 6.1.1 7 19 Multistate Models for the Time to Event Data 5.2 6 Application to the Mean Defect Data . . . 33 Piecewise Linear Modelling Approach Analysis of the Time to Event Data 36 : . . 46 6.2.1 Analysis of the Time to IOP Stabilization Data 47 6.2.2 Analysis of the Time to Progression Data . . 52 6.2.3 Analysis Using the Disability Model 62 6.2.4 The Baseline-Adjustment Analysis • • • Simulation 66 71 7.1 The Objectives 7.2 Generating Mean Defect Data ; 72 7.2.1 Generation of the Mean Structure Specified by the Fixed Effects . 73 7.2.2 Generation of the Random Effects and the Within-Patient Errors 73 7.2.3 Different Combinations of the Decay Rates of the Mean Defect '. . 75 7.3 Simulation Procedures 7.4 Results 71 ; 78 . 79 7.4.1 Comparison of the I T T and the P P approaches 79 7.4.2 Comparison of the P P and the P C L I N approaches 84 8 Discussion 9 Conclusions 96 111 iv 10 Future Research Bibliography List of Tables 6.1 R E M L coefficient estimates of the fixed effects in the L M E model for the ITT and P P approaches. 6.2 . . 34 R E M L coefficient estimates of the fixed effects in the L M E model for the . Three Different Approaches (ITT, PP, PCLIN) 6.3 41 Table of AICs, BICs, observed log likelihoods and within-patient mean square errors (a ) from fitting the L M E models for the three different 2 approaches 6.4 46 Estimates of the log relative hazards ratio (ft) and relative hazards ratio (exp(/5)) for the gender and treatment type covariates from the Cox regression analysis of the time to IOP stabilization data 6.5 50 Estimates of the log relative hazards ratio (/5) and relative hazards ratio (exp(/3)) for the group covariate from the gender-specific Cox regression analysis of the time to progression data (M for males, F for females). 6.6 . . 58 Estimates of the log relative hazards ratio (ft) and relative hazards ratio (exp(/3)) from the stratified (by gender) Cox regression analysis of the time to progression data. . 6.7 . 60 Estimates of the log relative hazards ratio (/3) and relative hazards ratio (exp(/3)) for the time-dependent IOP stabilization covariate within the treated patients from the gender-specific time-dependent Cox regression analysis (M for males, F for females) 6.8 66 Estimates of the log relative hazards ratio (/3) and relative hazards ratio (exp(/3)) for the group covariate from the gender-specific Cox regression analysis under the baseline-adjustment approach (M for males, F for females) 68 vi 7.9 The different decay rates of the M D for the control and treated groups used in the simulation. . 76 7.10 Proportions of rejecting the null hypothesis of no time by group interaction effect in the I T T and the P P analyses, for an M D decay rate of the control group = -0.001721 and different sets of M D decay rates of the treated group. 80 7.11 Proportions of rejecting the null hypothesis of no time by group interaction effect in the I T T and the P P analyses, for an M D decay rate of the control group = -0.003442 and different sets of M D decay rates of the treated group. 81 7.12 Proportions of rejecting the null hypothesis of no difference in the M D decay rates between the two treatment groups after IOP stabilization in the P P and the P C L I N analyses, for an M D decay rate of the control group = -0.001721 and different sets of M D decay rates of the treated group. . . 86 7.13 Proportions of rejecting the null hypothesis of no difference in the M D decay rates between the two treatment groups after IOP stabilization in the P P and the P C L I N analyses, for an M D decay rate of the control group = -0.003442 and different sets of M D decay rates of the treated group. . . 87 7.14 Sample mean of the 500 estimated regression coefficients and standard errors in the P P and the P C L I N analyses, for an M D decay rate of the control group = -0.001721 and different sets of M D decay rates of the treated group 89 7.15 Sample mean of the 500 estimated regression coefficients arid standard errors in the P P and the P C L I N analyses, for an M D decay rate of the control group = -0.003442 and different sets of M D decay rates of the treated group. . . 90 vii List of Figures 4.1 The observed M D trajectories over time from randomization (in days) for the 53 control patients in the data set. 4.2 20 The observed M D trajectories over time from randomization (in days) for the 44 treated patients in the data set 21 5.3 The Disability Model 26 5.4 The Disability Model for the Glaucoma Data 6.5 The average M D level observed over time from randomization for the con- . trol and the treated groups 6.6 38 Individual observed M D trajectories and the mean fitted trajectory from the P C L I N approach from time of randomization for the control group. . 6.7 44 A random sample of 24 individual fitted M D trajectories from the P C L I N approach from time of randomization 6.9 43 Individual observed M D trajectories and the mean fitted trajectory from the P C L I N approach from time of randomization for the treatment group. 6.8 28 45 (a) Estimated Kaplan-Meier survivor functions for the time to IOP stabilization of the four gender by treatment groups, (b) Estimated KaplanMeier survivor functions for the time to IOP stabilization of the surgical and non-surgical groups 49 6.10 (a) Estimated Kaplan-Meier survivor functions for the time to progression of the four treatment-gender groups based on the intent-to-treat (ITT) approach, (b) Estimated Kaplan-Meier survivor functions for the time to progression of the four treatment-gender groups based on the per-protocol (PP) approach 56 viii 6.11 (a) Estimated Kaplan-Meier survivor functions for the time to progression of the two treatment groups based on the intent-to-treat (ITT) approach. (b) Estimated Kaplan-Meier survivor functions for the time to progression of the two treatment groups based on the per-protocol (PP) approach. . 6.12 (a) Estimated Kaplan-Meier survivor functions for the time to progression of the four gender by treatment groups based on the baseline-adjustment approach, (b) Estimated Kaplan-Meier survivor functions for the time to progression of the treated and the control groups based on the baselineadjustment approach 7.13 Histograms of estimated coefficients and /?4 + p 5 (PCPTUPTT) (3 T2 used in the simulation, (a) p c c c (from the P P analysis) (from the P C L I N analysis) for randomly selected sets of = -0.0015 (b) B = -0.001721, p (c) p = -0.003442, p Tl = -0.005953, p T2 T ix = -0.001721, p = -0.004232, Tl = -0.002977, PTI = -0.005953 Tl (3 = -0.004698, p 2 = -0.003442 T1 Ptime-x.gr mip =-0.0025 (d) p c = -0.003442, Acknowledgements I would like to thank my supervisor Dr. Michael Schulzer for his continuous guidance and patience without which the development of this thesis would not have been possible. His expertise in both statistical and medical fields has enlightened me and motivated my interest in biostatisics. He has inspired me a lot in biostatistical research, and broadened my knowledge beyond textbooks and examples. Also, thank you to Dr. Jim Zidek for his support and confidence in me throughout my stay in U B C as both an undergraduate and a graduate student. M y decision to pursue further studies in statistics was undoubtedly due to Jim's enthusiasm and encouragement. As well, I am grateful to Dr. Nancy Heckman and Dr. Richard Mathias (from the Department of Health Care and Epidemiology) for being my second readers and for their precious advice on my thesis. Needless to say, thanks. should also go to all the graduate students and faculty of the department, and to all my friends. Vivien and Lisa:- many thanks for walking with me through these years, especially for all the mental and intellectual support which bonds us closely as the millennium Head-Line Team!! Mandy and Stephanie: thanks for lending an ear to me during my hard times!! Last but not least, I am most thankful for the love and support from my parents and my aunts who are always beside me. EUGENIA Y U The University of British Columbia April 2003 Chapter 1 Introduction Randomized controlled clinical trials ( R C T s ) are generally recognized as the best experimental setting for assessing new medical therapies. Randomization of patients to different treatments promotes comparability of the treatment groups and minimizes potential selection biases w i t h respect to unmeasured characteristics of the patients. Differences i n the response between the control and the treatment groups may then be attributed to the treatment itself rather than to some confounding factors. W h e n there is a lag time i n the stabilization of treatment following randomization, the definition of baseline from which patients are to be followed is particularly crucial i n the clinical comparisons between treatment groups. Randomization alone may not be sufficient to validate the results of treatment comparisons if the baseline is defined inappropriately. In general, when a method of comparison is inappropriate, or when the assumptions underlying a correct approach are not satisfied, the results w i l l be erroneous and may lead one to type I or type II errors. F i n d i n g an appropriate method of comparison i n the presence of a lag time i n the stabilization of treatment following randomization is often difficult. T h e possibility of a lag time i n full treatment effect was first noted by Halperin et a l . [1]. Such a lag time arises, for example, when the treatment under study does not take immediate effect upon its administration to treated patients at the time of randomization. There is a delay before the intended treatment effect is achieved, and this makes comparisons between the control and the treatment groups difficult. A l t h o u g h randomization of patients guards 1 Chapter 1. Introduction 2 against bias in the treatment assignment and in subsequent data analyses, this Tag time is likely to affect the results of comparisons but often cannot be specified precisely before the trials are conducted. The effect of such a lag time on statistical comparison procedures has been receiving great attention by the medical and statistical communities in the past few decades. The presence of a lag time in the treatment is mostly seen in long-term treatments. Wellknown examples include the Lipid Research Clinics Coronary Primary Prevention Trial ( C P P T ) [2], the Women's Health Trial [3] and the Physicians' Health Study [4]. In the C P P T , the treatment was a cholesterol lowering therapy. The therapy was expected to gradually reduce the amount of plaque in blood vessels and hence the risk of coronary disease. In the Women's Health Trial, a randomized controlled trial was initiated to determine if a low fat diet effectively reduced the incidence of breast cancer among high-risk group of women. The cholesterol lowering therapy and the diet intervention introduced some sort of linear lag phase where the effect of the treatment gradually increased with time. On the other hand, a different model for the lag time was used in the Physicians' Health Study where the effect of beta-carotene on cancer incidence was investigated. The experimenters believed that the drug did not affect pre-existing tumors and time was needed for new tumors to develop and become detectable. Hence a threshold lag time was assumed: the effect of the drug was not associated with tumors detected within the first two years since administration at the time of patient randomization. In all the above cases, the treatment effect was not immediate and thus introduced a lag time from the time of randomization before the treatment reached its full effect. Several authors proposed new statistical procedures to take into account the lag time in analyzing survival data. Zucker and Lakatos [5] considered a linear and a threshold lag model, and presented two weighted log rank type statistics for comparing survival curves in a non-parametric setting. Luo [6] extended their ideas to the Cox proportional hazards regression model to include lagged effects of some of the covariates. Nevertheless, little discussion in the literature has been devoted to the consequences of applying ordinary Chapter 1. Introduction 3 comparison procedures without a careful adjustment for the delay. The lag time present in non-survival type data has not been widely addressed either. It is important to make clinical practitioners and designers of clinical studies aware of the problem and this motivated this study of the statistical issues related to treatment lag times. For this work, I have focused on investigating and discussing the effect posed by a delay in treatment stabilization on the results of treatment comparisons. The comparison procedures are based on the intent-to-treat and the per-protocol principles which are commonly adopted in the evaluation of clinical trials. Longitudinal and survival data collected from the Collaborative Normal Tension Glaucoma Study [7] - [10] are used for analyses throughout this work. Approaches that take into account the lag time present in our data have been considered, in addition to some classical methods of comparison for longitudinal and survival analyses. Furthermore, a demonstration of the potential problems including bias and diminishing statistical power in applying the per-protocol approach will be given through some simulation work. The remainder of this thesis is organized as follows. The details of the Collaborative Normal Tension Glaucoma Study and an introduction to normal tension glaucoma are first provided in Chapter 2. Chapter 3 gives a full description of the intent-to-treat and per-protocol principles and an application to the evaluation of the glaucoma study. Methodologies of modelling the longitudinal and survival data from the glaucoma study are detailed in Chapters 4 and 5, and the results of analyses are presented in Chapter 6. Simulation of longitudinal data for demonstrating the performance of the intent-to-treat and per-protocol approaches follows in Chapter 7. A general discussion and conclusions are given in Chapters 8 and 9. The thesis ends with recommendations and suggestions for future work in Chapter 10. Chapter 2 The Collaborative Normal Tension Glaucoma Study The Collaborative Normal Tension Glaucoma Study (CNTGS) [7]-[10] is a prospective multi-center study for investigating the effects of intra-ocular pressure (IOP) reduction on disease progression in normal tension glaucoma (NTG). Before giving the details of the design of the study in Section 2.2 and a description of the data in Section 2.3, we familiarize the readers with some basic information on the nature of the disease, its diagnosis and management to enhance their understanding of the purpose of the trial and subsequently, the methodologies and analyses which are presented in this thesis. 2.1 2.1.1 Normal Tension Glaucoma Introduction Glaucoma has been one of the leading causes of blindness among adults and the elderly in particular. Its definition varies across the ophthalmic community, but the disease can be referred to as a chronic ophthalmic condition characterized by optic nerve head damage, a characteristic loss of visual field and an elevated IOP. In 1857, Von Graefe [11] described a group of patients having cupping of the optic nerve head and visual field defects but with IOP levels that remained within the statistically normal range. The term "normal tension glaucoma" was then coined to describe this particular group of glaucomatous conditions. 4 Chapter 2. The Collaborative Normal Tension Glaucoma Study 2.1.2 5 Epidemiology The prevalence of normal tension glaucoma was estimated in a number of studies. The estimates range from 0.3 to 4% among patients in their mid-60s [12]. The wide range is the result of the many different definitions of N T G being employed. Similar to OpenAngle Glaucoma, which comprises the largest group of patients suffering from glaucoma with elevated IOP, normal tension glaucoma is mostly asymptomatic at the early stage. There is no associated visual field loss and therefore most patients are unaware of the disease. When untreated, N T G patients will gradually lose their peripheral vision and eventually may suffer total blindness. The Glaucoma Research Foundation [13] reported a rate of blindness from glaucoma between 93 and 126 per 100,000 people over the age of 40. In particular, Open-Angle Glaucoma accounts for 19% of all blindness among African-Americans compared to 6% in Caucasians. As has been discussed by Sassani [14], results from previous research on the risk factors for normal tension glaucoma showed that age, gender, race, diseases including migraine and diabetes, and genetic factors are associated with the development of the disease. More specifically, the prevalence increases with age; females, Asians and African Americans, and people with migraine, diabetes or family history of glaucoma are more susceptible to developing normal tension glaucoma. The most recent study of the natural history of the disease, as conducted by the Collaborative Normal Tension Glaucoma Study Group, investigated the risk factors for the progression of visual field abnormalities in N T G [10]. It was found that the female gender, the presence of migraine and disk hemorrhage contribute separately to a higher risk of progression. Asian patients have a slower rate of progression despite a high prevalence of N T G within the race, while black patients show a faster rate of progression. Moreover, age, the untreated level of IOP and self-declared family history of glaucoma were found to have no effect on the progression rate in this study. Chapter 2. The Collaborative Normal Tension Glaucoma Study 2.1.3 6 Diagnosis and Management The diagnosis of normal tension glaucoma is often made by a diagnosis of exclusion. The determination of the nature of the disease is based upon the elimination of other diseases that share similar symptoms and characteristics. A l l other causes leading to damage of the optic nerve and visual field loss, for example, cardiovascular abnormalities, must be eliminated, and the IOP level has to be shown repeatedly not to exceed the normal statistical upper bound (21 mm Hg) before normal tension glaucoma can be diagnosed [12]- To measure intra-ocular pressure, tonometry is used, but it has rather poor sensitivity and specificity in detecting glaucoma if used alone [15]. Therefore in practice, it is used in combination with ophthalmoscopy, which examines the appearance of the optic nerve, for early detection of the disease. Moreover, gonioscopy helps to examine the structure of the anterior chamber angle for determining whether a patient suffers from Open-Angle or Angle-Closure Glaucoma. Normal tension glaucoma shares many clinical features with Open-Angle Glaucoma, but we do not plan to discuss their similarities here. Readers can refer to the literature on glaucoma for more details. Sassani [14] provides a comprehensive reference on the subject. Despite the fact that an IOP outside the normal range has not been documented, patients with normal tension glaucoma tend to have a wider diurnal IOP fluctuation, which might account for the glaucomatous features in the absence of a consistent elevated IOP level [12]. Furthermore, studies have shown that asymmetric normal tension glaucoma is associated with an asymmetric IOP [16]-[18], so IOP is believed to play a role in the underlying mechanism causing the disease. A n d with this belief, treatments for normal tension glaucoma aim at reducing IOP levels. Patients diagnosed to have an early stage of the disease are usually treated with medications. When patients show progression on medication or experience considerable visual field loss, they are treated Chapter 2. The Collaborative Normal Tension Glaucoma Study 7 with laser surgery. Filtering or incisional surgery is applied upon failure of the previously mentioned treatments or upon persistence of the progression. The medical and surgical treatments all attempt to lower IOP in hope of preventing further progressive damage. 2.1.4 Visual Field Measurements To monitor disease progression, both the optic nerve head and the visual field need to be assessed regularly. Nowadays, automated static perimetry is used to quantify visual field loss, based on the linear relationship between visual perception and the change in stimulus intensity which is measured in the logarithm scale of decibels (dB) [15]. The Humphrey Field Analyzer (HFA), which was used in the C N T G S , is one of the most commonly used automatic perimeters. In essence, perimetry based on the H F A entails estimating threshold values at each test location in the central 30 degrees of the visual field, where a threshold can be described as the minimum brightness of a stimulus a patient perceives at a particular test location. To estimate the threshold value at each location tested, stimuli are presented at that location and the intensities are decreased in AdB steps until reversal, i.e., from perceived to not perceived or vice versa. The test process then reverses and the intensities increases in 2dB steps until the second reversal occurs, at which time the threshold determination is stopped. The last seen stimulus intensity will be used as the threshold estimate. Detailed description of the H F A and the threshold estimation procedure can be found in [19]. From these threshold values at different locations summarize the four global indices that quantify visual field loss: the mean deviation or mean defect (MD), the pattern standard deviation (PSD), the short-term fluctuation (SF) and the corrected pattern standard deviation (CPSD). The mean defect, which is an important outcome variable in our data, is a variance weighted average departure of each test.location from the agecorrected value, where a threshold of stimulus intensity is observed at every test location in the visual field: Chapter 2. The Collaborative Normal Tension Glaucoma Study MD = f1 y t 8 -n where Ui is the observed threshold Vi is the normal age-corrected reference threshold sf is the between-patient variance of normal field measurements at the ith of the n test locations. The M D measures the overall sensitivity of the retina to light. A large negative M D is suggestive of a serious overall abnormality of the visual field. On the other hand, P S D and C P S D are more effective indices for quantifying localized visual field defects. In the presence of a cataract, M D tends to have reduced specificity because cataracts are characterized by a generalized depression of thresholds over the entire field, thus leading to a decreased M D level. Monitoring the rate of decay of M D is useful for assessing the rate of progression of normal tension glaucoma, as M D is reflective of the overall degree of visual field defects. 2.2 Motivation and Design of the Study For all forms of glaucoma that are associated with an elevated intra-ocular pressure (IOP), treatments always involve the lowering of the IOP, and a reduced IOP has known beneficial effect on the natural history of the disease. However, for N T G patients whose IOP stays inside the statistically normal range, the usefulness of having an IOP reduction is unknown. Clinical findings suggest asymmetric N T G is often associated with asymmetric IOP levels [16]-[18]. One of the main objectives of the C N T G S was to ascertain Chapter 2. The Collaborative Normal Tension Glaucoma Study 9 the role IOP reduction plays in normal tension glaucoma. The C N T G S Group compared the time-to-progression experience of an untreated group of N T G patients to a treated group in which patients received medical, laser and/or surgical treatment(s) to achieve a 30% reduction from the mean of the last three prerandomization pressure readings. The effectiveness and the efficacy of the IOP lowering strategy were assessed by using an intent-to-treat approach and a per-protocol approach, respectively. The principles underlying the two approaches are discussed in Chapter 3. Two hundred and thirty patients from twenty-four centers were enrolled in the study. To be eligible for the study, the patients all had unilateral or bilateral normal tension glaucoma and other ophthalmic characteristics which met the criteria as described in [7]. Upon entry into the study, patients remained unrandomized until a fixation threat or progression of the study eye(s) occurred. A fixation threat can be described as having visual field defects at the point of fixation, which is the area of maximum visual acuity in human visual field. The eligible eye of each patient was then randomized to either the control group, in which the eye remained untreated, or the treatment group, in which a 30% reduction in IOP was achieved by means of medical, laser and/or surgical interventions. Most treated patients were first placed on topical medication or laser treatment. When either or both failed to reduce the IOP to the desired level, patients underwent filtering surgery. There were also cases where treated patients were given the surgical treatment immediately after randomization. Once stabilization of the treatment effect was achieved, the patients were followed regularly until their study eyes reached the progression end point (which is defined in Section 2.3.1) or until their lifetime in the study was censored. Meanwhile, patients had their mean defect (MD) measured repeatedly and regularly, at each of their clinical visits. The time to IOP stabilization after a 30% reduction for the treated patients, the time to the progression end point and the mean defect values comprised the three main outcomes of the study. Covariate information on demographics, medical history Chapter 2. The Collaborative Normal Tension Glaucoma Study 10 and baseline ophthalmic characteristics was also collected. 2.3 Description of the Data Due to confidentiality concerns, a subset of the data analyzed by the C N T G S Group was used for this thesis. Hereafter, I will refer to this subset as the data unless stated otherwise. The data were obtained by sampling at random 97 patients from the 145 who enrolled in the study and whose study eyes met the criteria for randomization. Among the selected 97 patients, 44 were in the treated group and 53 were in the control group. Longitudinal data of the mean defect measurements (in decibels) and survival data of the time to IOP stabilization (in days) and the time to progression (in years) were available for analyses. Besides the group variable, the effects of gender, type of therapy that treated patients received, as well as age, IOP and M D levels at baseline were studied in my thesis. 2.3.1 Definition of the Progression End Point The Collaborative Normal Tension Glaucoma Study adopted two definitions of the progression end point: the protocol definition and a definition based on the so-called fourof-five criteria. The former ensured identification of minimal visual field alterations to minimize any risk to eyes of untreated patients in the study, and the details of this definition can be found in [7]. The latter was used for the purpose of analyzing study outcomes. A computer algorithm was developed for the identification of the progression end point. In essence, progression was considered confirmed when four of five consecutive follow-up fields showed progression in a cluster of test locations relative to baseline visual fields, with at least one non-peripheral progression point (test location) common to all four fields. A progression relative to baseline fields was defined as having two or more adjacent points (they could not all be peripheral) whose M D values decreased by at least Chapter 2. The Collaborative Normal Tension Glaucoma Study 11 lOdB, relative to the average of baseline values at each of these points taken at the time of randomization. A complete description of the four-of-five criteria can be found in [7]. Chapter 3 Evaluation of Randomized Clinical Trials 3.1 Intent-to-Treat versus Per-Protocol Principles In conducting clinical trials, treatment assignment is ideally done through randomization because randomization tends to give an unbiased comparison of the different treatment groups. In practice, however, clinicians often encounter problems of patient drop-outs, non-compliance and missing observations. Some patients do not ultimately receive the treatments to which they are preassigned. T h i s then leads to concern about how one should analyze clinical trials i n order to have a proper comparison between the treatment groups. There have been two principles that are adopted i n the evaluation of clinical trials: the intent-to-treat ( I T T ) principle and the per-protocol ( P P ) principle. The former is based on the idea that all patients who are randomized should be included i n the final analysis of the trial irrespective of the presence of drop-outs, cross-overs and non-compliance. Patients are assumed to remain i n the'treatment groups to which they have been randomized even if they switch to another treatment during the period they are followed. According to Lachin [20], the intent-to-treat principle refers to a set of criteria for the evaluation of the benefits and risks of a new therapy that essentially calls for the complete inclusion of all data from all patients randomized i n the final analyses. T h e intent-to-treat principle is contrasted w i t h the per-protocol principle i n which the m a i n purpose of the analysis lies i n the assessment of the efficacy of a treatment. W i t h this principle, the evaluation of a clinical trial is based only on patients who actually 12 Chapter 3. Evaluation of Randomized Clinical Trials 13 adhere to the treatment preassigned. Observations on dropped-out and non-compliant patients are excluded from the analysis. Lachin [20] described the principle as a strategy to select and to examine the experience of a subset of patients that meet the desired efficacy criteria for inclusion in the analysis. It is important to distinguish between efficacy and effectiveness because their assessment entails different strategies and clinical implications. Efficacy refers to the effects of an intervention, such as a medication under ideal conditions, while effectiveness refers to how successful an intervention is in clinical practice whose conditions often deviate from the controlled conditions in efficacy studies. A s described by Indrayan and Sarmukaddam [21], efficacy is evaluated under controlled conditions, whereas effectiveness is determined not only by efficacy but also by coverage, compliance, provider performance, etc. There has been intense controversy over which of the two principles should be employed i n m a k i n g treatment comparisons. Advocates for the intent-to-treat approach argue that it not only provides a means of treatment comparison i n an unbiased fashion, but also realistically assesses the usefulness of a treatment i n clinical practice where it is infeasible to track how patients are receiving their prescribed treatments. T h e other group advocating the per-protocol approach argue that an intent-to-treat analysis is less powerful i n detecting the presence of a treatment effect because a treatment effect is possibly diluted by the inclusion of patients who do not adhere to their treatments. Those who follow the per-protocol principle believe that by studying the subset of patients who do receive the treatments exactly as described in the protocol, the treatment effect can be truly assessed and estimated. B u t they have overlooked some potential problems of this approach. L a c h i n [20] described some of the statistical considerations in the intent-to-treat design and in the analyses of clinical trials. He also discussed potential bias and statistical power issues related to the per-protocol analysis which he referred to as the efficacy subset analysis. T h e dropped-out and non-compliant patients possibly comprise a group Chapter 3. Evaluation of Randomized Clinical Trials 14 with demographic characteristics and health conditions substantially different from the rest of the patients who are included in the per-protocol analysis. This subset of patients being included is not identified at the time of randomization, and hence randomization of patients does not help ensure an unbiased comparison. The validity of the results from the per-protocol analysis becomes questionable; even if the results are valid, they may not be applicable to the study population in general. In many recent studies, clinical trials are evaluated using both approaches. Results from both are reported and compared. There seems to be a belief that when a perprotocol analysis identifies a significant treatment effect, the same result obtained in the intent-to-treat analysis further demonstrates the usefulness of the treatment. A n d for this reason, the intent-to-treat analysis tends to serve as a confirmation tool of a treatment assessment, rather than a means of unbiased treatment assessment. It is unclear how the results obtained from the two approaches are related and can be compared as they involve essentially two different sets of patients (one set is the subset of the other). Regardless, one should be aware of the potential problems of the per-protocol approach. Moreover, because of its difference in clinical interpretation and implication from the intent-to-treat approach, one needs to be cautious when drawing conclusions from either approach. 3.2 Intent-to-Treat and Per-Protocol Analyses of the Normal Tension Glaucoma Data In the evaluation of the Collaborative Normal Tension Glaucoma study, both the intentto-treat and the per-protocol approaches were used for treatment comparisons. In particular, the survival experience of the treated patients who had their IOP levels reduced by 30% from the prerandomization values was compared to that of the patients in the control group who remained untreated until they reached the progression end point or their follow-up times in the trial were censored. In the intent-to-treat analysis, all the Chapter 3. Evaluation of Randomized Clinical Trials 15 patients who were randomized were included for treatment comparison, regardless of the fact that there were problems such as inability to achieve the desired reduced level of IOP, progression prior to IOP stabilization, treatment complications that affected visual acuity of the treated patients, and non-compliance. In contrast, patients having any of the above problems were excluded from the per-protocol analysis. As the two approaches differ in the inclusion of patients for analyses, a substantial difference in sample size is expected when the degree of drop-outs and non-compliance is high. However, among the 145 patients who were randomized in the Collaborative Normal Tension Glaucoma Study, only five treated patients withdrew from the study before the stabilization of IOPs took place and did not meet the efficacy requirement to be included in the per-protocol analysis. In defining the intent-to-treat and the per-protocol approaches that were adopted by the C N T G S group, not only the original criteria of inclusion of patients were used, but also two different baselines were defined. In the intent-to-treat analysis, the baseline was taken to be the time of randomization for all the patients. Even though a 30% reduction in IOP was not immediately achieved upon medical, laser or surgical intervention for the treated patients, measuring the patients starting from the time of randomization gave a. reasonable assessment of an overall clinical effectiveness of the treatment as the treated patients began with the IOP-lowering therapy at randomization. On the other hand, in the per-protocol analysis, treated patients had their follow-up times measured from the baseline time at which their IOPs stabilized after a 30% reduction. Equivalently, the baseline for the control group remained at the time of randomization while that for the treatment group was shifted to the patients' individual times of IOP stabilization. This new baseline was chosen for the treatment group because having a 30% reduction in I O P was the desired treatment criterion. The treatment with the I O P reduction was supposed to have taken its full effect at the time of IOP stabilization. Due to the small number of patients who withdrew from the study, the I T T and P P Chapter 3. Evaluation of Randomized Clinical Trials 16 approaches did not differ much in terms of sample size. Rather, the baseline definition was the distinguishing factor between the two approaches. Throughout this thesis, we will follow the same principles defining the intent-to-treat and the per-protocol approaches as were adopted in the evaluation of the Collaborative Normal Tension Glaucoma Study. The results of our analyses based on the two approaches will be presented in Chapter 6, and the reliability of the results from the two approaches will be discussed in Chapter 8. Chapter 4 Linear Mixed Effects Models for the Longitudinal Mean Defect Data To model repeated measurements collected over time, the multivariate normal distribution ( M V N ) , generalized estimating equations ( G E E ) and the linear mixed effects ( L M E ) model are amongst the popular choices. T h e M V N approach may not be applicable to our case because it generally works best only when the subjects have observations taken at a common set of times. Moreover, its application puts quite a strong distributional assumption on the data. In order to have more relaxed assumptions to work w i t h , and to incorporate irregular follow-up measurements for patients enrolled i n the Collaborative N o r m a l Tension G l a u c o m a Study, the G E E [22] and the L M E model were considered. F i t t i n g the mean defect data using the two methods gave similar results marginally, but the L M E model was chosen for further analysis of the data because it allows for random effects for covariates which vary substantially between subjects. Also, the L M E model automatically takes care of problems of missing response and has the flexibility of fitting a wide variety of correlation structures for the within-patient errors and for the random effects. In particular, the data show a large between-patient variation i n the baseline mean defect ( M D ) and i n the decay pattern of M D over time. Using the L M E model, we can better model the M D data by allowing different intercepts and decay rates for different patients. 17 18 Chapter 4. Linear Mixed Effects Models for the Longitudinal Mean Defect Data T h e Linear M i x e d Effects M o d e l 4.1 T h e linear mixed effects model that was fitted to the data was described i n L a i r d and Ware [23]. For individual i, the model has the form Yi = Xi/3 + + e { h ~ N(0, D), a ~ N(0, lb), bi 1 (4.1) a where Yj is the vector of responses, /3 is the vector of fixed effects, which are constant across subjects, bi is the vector of random effects, and is independent of bj for i ^ j, Xj and Zj are the design matrices for the fixed and random effects, respectively, et is the vector of within-subject errors, and is independent of e,- for i ^ j. The model i n Equation (4.1) has two major components: the mean structure Xj/3 and the covariance structure defined i n terms of the distribution of the random effects and the within-subject errors. T h e vectors of random effects, bj's, are each assumed to be normally distributed with mean 0 and covariance matrix D, and are mutually independent of the e;'s. T h e vector of within-subject errors, e;, also follows the normal distribution w i t h mean 0 and covariance matrix R; of dimension ri; x n;, where n» is the number of observations for individual i. T h e unknown parameters i n Rj do not depend upon i. It can be shown that marginally, Yj is independently normally distributed w i t h mean Xj/3 and covariance matrix V(Yj) = Rj + ZjDZf. T h e random effects introduce an extra component of variation ZjDZf to the response variable. Furthermore, i n the simplest case where Rj = <r I, i.e., the within-subject errors are mutually independent, the 2 Chapter 4. Linear Mixed Effects Models for the Longitudinal Mean Defect Data 19 i n d i v i d u a l response components within the same subject, (Yu, Y^, •••> Y ) are correlated ini i n the presence of random effects. E s t i m a t i o n of the fixed and random effects, and parameters of the covariance structures can be based on least squares and m a x i m u m likelihood methods, or an empirical Bayes methodology. Details on the estimation procedure were discussed i n L a i r d and Ware [23]. T h e m a x i m u m likelihood ( M L ) and the restricted m a x i m u m likelihood ( R E M L ) methods are by far the most popular choices of estimation procedures. R E M L estimates are often more efficient as R E M L estimation adjusts for the degrees of freedom used i n the estimation of the parameters. However, when the total number of observations 2~2iLi i ( n m is the number of subjects) is much larger than the number of unknown parameters which are to be estimated, the M L and the R E M L methods give very close parameter estimates. 4.2 Application to the Mean Defect Data One of the major study questions which w i l l be addressed i n this thesis is whether a 30% reduction of I O P from the baseline value successfully slows down the rate of generalized visual field loss as measured by the rate of decay of the mean defect. Moreover, the mean defect level of the control and the treated groups would also be our study interest because a more negative level is indicative of a higher severity of normal tension glaucoma. P r e l i m i n a r y plots of the individual patients' mean defect trajectories for the control and treated groups separately (see Figures 4.1 and 4.2) suggest a general trend of depression of the M D over time/ A l t h o u g h the M D level within each patient tends to show moderate fluctuation, it does not seem too unreasonable to assume a linear model for the M D data of each of the individuals. Furthermore, we observed a large between-patient variation i n the baseline M D readings at randomization and i n the rate of change of M D over time. T o take into account the highly variant information across patients, we fitted 20 Chapter 4. Linear Mixed Effects Models for the Longitudinal Mean Defect Data Figure 4.1: T h e observed M D trajectories over time from randomization (in days) for the 53 control patients i n the data set. ••\/ 0 500 1500 0 time 0 500 1500 0 0 500 200 600 0 400 800 0 200 600 0 time 0 400 200 400 0 600 0 1500 0 1500 500 400 400 0 time 1500 0 200 2500 150 250 time 200 time 1000 0 50 • time 500 time 0 time 200 time ' 500 800 time time time 200 1500 time time time 0 2000 time time 0 1000 400 800 time 600 0 time 200 600 time i •? 0 200 600 0 time . 400 800 0 400 800 time 0 time 400 800 • \f •*! 0 400 1000 0 time 400 800 0 400 time 0 time 1000 0 600 time *— time 200 ft. \ 1 400 800 0 time 400 800 time S,/' 5 2 0 500 1500 0 time 400 800 0 200 time 400 0 time 200 500 0 time 400 800 time \/1 0 500 1500 0 time 400 1000 0 200 time 600 0 time 400 1000 0 time 1000 2000 time <yvv 0 400 1000 0 time 0 500 1500 0 2500 1500 0 500 time 600 0 0 400 V \ 800 1500 0 r 400 time 400 1000 0 time time " \ v time 1000 200 time time V\r' ' 500 0 time time 0 400 800 0 500 time 800 1000 2500 time . 1500 0 500 time 1500 Chapter 4. Linear Mixed Effects Models for the Longitudinal Mean Defect Data 21 Figure 4.2: The observed M D trajectories over time from randomization (in days) for the 44 treated patients in the data set. \f" 0 500 1500 0 500 time 1500 500 0 2500 0 400 time 1000 2000 0 1000 time 1000 0 time 2000 time 1000 2500 0 200 time 600 time \ . \ 1000 0 2500 1000 2000 0 500 100 time X. 1500 0 400 time 200 800 0 \ / 500 time 1500 0 0 1500 0 400 1500 1000 time ••'•\/ 0 time 1000 2000 time 500 time 400 1000 0 time V 500 0 time l\ 0 0 • . time 0 2000 time 1p00 time 0 1000 ••v 1500 • •••.'\ 0 time v 0 ' '.'•••it?-': •«: •••v-\ 500 1500 time ^ 200 400 0 time time 1000 2000 0 time 400 800 time ••\ :,:\'.0 400 800 0 50 time 0 500 1500 0 time 500 500 500 1500 1500 time 500 1500 0 time 1500 0 1000 500 1500 0 time 2000 0 time 1000 500 1500 time 2500 0 time 1000 2500 time T 0 400 800 time 0 0 timeS. 0 150 time 0 500 time 0 500 1500 time • 1500 0 time 0 500 1500 time 500 1500 time 0 500 1000 2000 time 1 500 time 0 Chapter 4. Linear Mixed Effects Models for the Longitudinal Mean Defect Data 22 the Laird and Ware linear mixed effects (LME) model given in Equation (4.1) to the mean defect data and included as random effects the intercept and the time-slope covariate. The response variable was the M D measured over time. Covariates that comprised the fixed effects were • the times of repeated M D measurements • gender • group membership • age at baseline • baseline M D • baseline IOP In both the intent-to-treat and the per-protocol analyses, the interaction between time and group was also included in order to assess any difference in the decay rate of the M D between the two treatment groups. A significant time x group interaction effect would imply a significantly different rate of change in M D and hence a different rate of disease progression between the control and the treated groups. Various covariance structures including the independence model, the first-order continuous autoregressive model (CAR1, which is equivalent to the exponential model) and the Gaussian model were fitted to the within-patient errors in the preliminary analysis of the data. The first-order continuous autoregressive model was found to give the best fit. For the random effects, an unstructured covariance matrix was assumed. Moreover, there is a lower level below which the M D seldom attains in normal tension glaucoma. Patients whose M D levels at baseline are close to the lower limit might show a different trend of depression and in particular a slower decay rate than those who have Chapter 4. Linear Mixed Effects Models for the Longitudinal Mean Defect Data 23 less negative M D at baseline, because the closer the M D is to the lower limit, the less room is left for depression of the generalized visual field. This phenomenon is referred to as the "floor effect". In the case where the treated patients and the control patients in our study began with rather different initial M D levels, a difference between the mean M D decay rates of the two groups of patients might be the result of a potential floor effect rather than of a significant treatment effect. We checked for the presence of such a floor effect in our data by including an additional covariate into the L M E model in both the I T T and the P P analyses. The covariate was an indicator variable for whether a baseline M D level was above or below -YldB. This particular value was chosen because a patient with an M D less than -12<iB is generally regarded as having an advanced stage of normal tension glaucoma. The interaction between time and this indicator variable can be tested to investigate whether the decay pattern of the M D depends on the baseline M D level. If such an interaction exists, it would imply patients who have advanced normal tension glaucoma have a different M D decay rate from patients with an early or a moderate stage of the disease, regardless of whether they are treated or not. , Results from fitting the L M E model under the intent-to-treat and the per-protocol approaches are given in Chapter 6, and the appropriateness of the two approaches to analyzing the M D data are discussed in Chapter 8. Chapter 5 Multistate Models for the Time to Event Data Classical methods of survival analysis such as the Kaplan-Meier survivor function [24] estimation, Cox regression [25] and parametric modelling have been extensively used for analyzing time to event data. While these methods can easily be applied to our study of the time to progression, we have not taken into account the possible linkage between the two events: IOP stabilization and progression in the intent-to-treat and the per-protocol analyses of the time to progression data. If we are to focus on the treated group only, the time to IOP stabilization may provide valuable information to our understanding of the hazard of progression. B y including the time to IOP stabilization as a covariate, we can assess its effect on the time to progression within the treated group. However, since the time to IOP stabilization was irrelevant and thus unobserved in patients in the control group, taking into account this time information is not directly feasible when comparing the survival experience to progression between the two treatment groups. Here we consider a multistate modelling approach. By means of a multistate model, we can model the events of IOP stabilization and progression simultaneously, and flexibly deal with the partially-observed times to IOP stabilization. 5.1 Stochastic Multistate Models for Survival Time Data In ordinary survival analysis, time to failure is the only outcome of interest. Individuals are followed until they reach the failure end point or until their lifetime is censored. 24 Chapter 5. Multistate Models for the Time to Event Data 25 Within the framework of stochastic processes, the occurrence of failure can be thought of as a transition from an "alive" state to an absorbing state of failure, and the force of transition is equivalent to the hazard function for the survival time. A multistate model is an extension of the simple two-state Markov model in ordinary survival analysis. It can flexibly accommodate intermediate events that happen before failure and whose effects on the risk of failure are of study interest. It can also be used to model multiple failure events in situations where there is more than one cause of failure and the risks of failure due to different causes are to be compared. By definition, a multistate model is a model for a stochastic process which occupies one of a set of discrete states at any time point. A transition is defined as a change in state. The state structure specifies the states and which transitions are possible. In order to define the full statistical model, one must specify the state structure as well as the form taken by the hazard functions for each possible transition. There are many different types of multistate models and we will focus on the disability model which is described in the following section. Readers are referred to Hougaard [26] for more information on the types of multistate models and their applications. 5.1.1 The Disability Model Before failure occurs, one might observe other events which possibly affect the risk of failure. For example, in a clinical trial studying whether a heart, transplantation is beneficial to patients suffering from heart diseases, patients who are randomized to the treatment group may receive heart transplantation surgery during their follow-up periods, and some may die before a suitable heart is available for transplantation. Moreover, the time of transplantation from randomization is variable across the treated patients. In order to assess whether having a transplanted heart reduces the risk of death caused by heart attacks, it would be useful to model receiving a transplantation as a separate event. Another motivating example is the development of non-fatal complications during the Chapter 5. Multistate Models for the Time to Event Data 26 Figure 5.3: The Disability Model 0: Alive h (t) \ 02 1: Disabled _ 2: Failure /h (t) 12 course of a disease. Patients infected with a certain disease may die with or without a complication. However, the complication which only occurs in some but not all of the patients at a variable time may be assumed to alter the risk of death. In the above two examples, the events of receiving a transplantation and the development of a complication are likely to influence the risk of failure which is to be assessed. In such cases, it is useful to model the two events: transplantation/complication and failure simultaneously. A special type of multistate model that fits in with the context similar to the examples is the disability model. As shown in Figure 5.3, the disability model consists of three states: state 0 usually refers to the "alive" state without the "disability" which can be the heart transplantation or a disease complication; state 1 is the "disabled" state in which individuals are "disabled" but have not yet failed, and state 2 always refers to the state of failure. The hazard function for a transition from state i to state j at time t is represented by hij(t). Transitions that are possible are indicated by the arrows. The disability model depicted in Figure 5.3 describes a situation where a patient can fail with and without going through the "disabled" state and all the transitions are irreversible. For instance, a patient who has made a transition from state 0 to state 1 cannot return to state 0. In other words, once a patient becomes "disabled", he/she will remain "disabled" for the rest of his/her lifetime. Also, state 2 is an absorbing state: one cannot leave state 2 once the person enters the state. Chapter 5. Multistate Models for the Time to Event Data 27 A common approach to analyzing the disability model involves separate analyses of the times to different states. The individual hazards for transitions from state i to state j, hij(t) = 0,1, 2, i < j), can be modelled separately by means of semi-parametric and parametric models. The modelling of an overall hazard function for the disability model has also been discussed in the literature. Andersen [27] proposed the use of transition probabilities between states to define an overall hazard function as a linear combination of the individual hazard functions. The author gave an illustration of the approach using the Steno Memorial Hospital Diabetes survival data. Hougaard [26] discussed the possibility of applying a single Cox proportional hazards model with time-dependent covariates in analyzing the disability model and exemplified an application using the well-known Stanford heart transplant data [28]. 5.1.2 Application to the N o r m a l Tension Glaucoma D a t a In the Collaborative Normal Tension Glaucoma Study, patients who were randomized were followed until they reached the progression end point or their lifetimes in the trial were censored. The treatment of lowering the IOP by 30% of the pre-randomization levels for the treated group did not take immediate effect. In particular, a variable time elapsed after randomization before the desired reduction of the IOP was achieved. Upon a successful reduction, we can regard the treated patients as making a transition to the state where their IOPs remained stable but the progression end point had not yet been reached. They remained in this state until a transition to the progression state occurred. A few treated patients who were randomized in the study but not included in our data indeed failed to achieve the desired level of reduced IOP. They made transitions directly to the progression end point as in the case of the control patients. Furthermore, a 30% reduction of IOP was permanent upon IOP stabilization and the transition to the progression state was irreversible. Thus, we see that the disability model provides a reasonable description of the sequence of events observed in the study. We could define Chapter 5. Multistate Models for the Time to Event Data 28 Figure 5.4: The Disability Model for the Glaucoma Data 0: Alive \ 1: IOP stabilization 2: Progression the three states in the disability model, as shown in Figure 5.4 as follows: • State 0: Alive, in which state a patient did not have a stable 30% reduction of IOP and had not yet reached the progression end point. • State 1: IOP stabilization, in which state a patient achieved a stable IOP after an intended reduction but had not yet reached the progression end point. • State 2: Progression, in which state a patient reached the progression end point. There was a need to include the state of IOP stabilization in the model because a 30% IOP reduction was fully achieved upon transition to this state and the effect of the stabilization of the treatment on the risk of progression was to be assessed. Using the same notation in defining the individual hazards in the disability model, let h i(t) represent the hazard function for the time taken to achieve a stable 30% re0 duction in IOP as a result of medical or surgical intervention, hnit) and h {t) represent 02 the hazard functions for the time to progression with and without IOP stabilization, respectively. When we model the transition hazards separately, we can incorporate different covariates that are relevant to the different transitions. For example, when modelling / i i (t) 0 within the treated group, we can include the type of IOP lowering therapy as a covariate Chapter 5. Multistate Models for the Time to Event Data 29 and study if the hazard of reaching I O P stabilization differed between patients who received surgical and non-surgical treatments. W h e n modelling h (t) among the treated i2 patients, the effect of the length of the time period awaited for I O P stabilization on the progression hazard after establishing a stable 30% I O P reduction might be of interest. In addition to the waiting time for I O P stabilization, the I O P and M D levels at stabilization can provide valuable information to be included as covariates i n the model. Since by definition all patients i n the control group reached the progression directly without going through the state of I O P stabilization, the modelling of the progression hazard without a stable I O P reduction, h (t), involves initially all the treated and control patients. More02 over, we can add the treatment group covariate to the hazard model and assess whether the untreated and treated group had a different risk of progression without a stable I O P reduction. Treated patients who had their I O P s successfully reduced before progression w i l l have their lifetimes censored at the time of I O P stabilization for the estimation of h (t). 02 W i t h i n the family of C o x models [29]-[30], i f the length of the waiting period to achieve I O P stabilization does not have an effect on the risk of progression w i t h a stable I O P reduction, and if we have proportionality between the two progression hazards at a l l times t: hi (t) = c x h (t), for some unknown c, then the three states i n the disability 2 02 model can be analyzed simultaneously using a single C o x model [26]. More specifically, we can fit a single C o x model h(t) w i t h a time-dependent covariate I(t) as an indicator function, where I(t) — 0 during the period when no stabilization has been achieved, and I(t) = 1 during the period after the stabilization is established. We can also include other covariates such as M D and I O P at randomization, age, group membership and gender. T h e C o x model w i t h the time-dependent covariate I(t) is given as follows: hi(t) = h (t)exp{aIi(t) *x Q i grimPt + /3 Xj} T (5.2) where for the zth individual, hi(t) is the hazard function, h (t) is the baseline hazard 0 function and Ii(t) is the time-dependent indicator variable for I O P stabilization. T h e Chapter 5. Multistate Models for the Time to Event Data 30 time t is measured from the time of randomization. The vector of covariates including the group covariate x i gr0UPt is represented by X j , and j3 is the vector of regression coefficients representing the log relative hazard ratios. Since is relevant to the treated patients only, we make use of an interaction term between and the group variable (x gr0UPii = 0 for the control group, 1 for the treated group) so that /;(£) only enters the above Cox model if patient i is treated. The significance of the coefficient for the interaction term, a, can be tested to assess the effect of IOP stabilization on the progression hazard within the treated patients. Also note that by using an indicator function for IOP stabilization, we are assuming a threshold treatment effect, i.e., at times when a treated patient had not yet shown a desired stable IOP reduction and Ii(t) = 0, we will assume the treatment has not taken any effect. As mentioned earlier, some treated patients in the Collaborative Normal Tension Glaucoma Study reached the progression end point before IOP stabilization was achieved (they made direct transition from state 0 to state 2 in Figure 5.4) though they constituted only a small proportion of the treated patients. However, the data we analyzed in this thesis did not include any of these patients. Consequently, when we model the transition hazard ho2(t), all the treated patients had censored times to progression without a stable IOP reduction. We therefore have difficulty in comparing the hazards of progression without a stable IOP reduction between the control and the treated groups. Adjusting for the group covariate in modelling the hazard ho^it), which could have been done if the data from treated patients who reached the progression end point prior to IOP stabilization were available, may not provide useful information to our understanding of the time-to-progression experience without IOP stabilization in our case. We might also encounter computational problems in the estimation of the group covariate coefficient. In seeing these potential problems, we will focus on the control patients only and leave out the group covariate in modelling h 2(t) in this thesis. 0 Chapter 5. Multistate Models for the Time to Event Data 31 Besides adopting the multistate modelling approach by means of a disability model, other non-parametric, semi-parametric and fully parametric methods will be used in each of the intent-to-treat and the per-protocol approaches to analyze the time to progression data in our study. 5.2 Baseline-Adjustment Analysis of the Time to Progression Data While the disability model accounts for the lag time in the stabilization of the treatment by including the IOP stabilization as an intermediate event before progression occurs and by modelling the lag time as the time to IOP stabilization from randomization, this multistate modelling approach does not deliberately adjust the baseline for the control group to correspond to the delay in the treatment stabilization in the treated group. One possible method of correspondence is the use of "controls matched by covariates". In essence, the method involves matching each control with a treated patient with similar covariate such as age, gender or other baseline information. Each control will have a baseline shifted from randomization to the time of IOP stabilization of the treated patient being matched. However, in our data set, the numbers of treated and control patients are different. There are not enough treated patients to perform a one-on-one matching with the control patients and thus the method may not be applicable. Another way of adjusting the baseline can be based on the distribution which the time to IOP stabilization tends to follow. The best fitted distribution can be found in the parametric modelling of the time to IOP stabilization data. We can then shift the baseline for each control patient from the time of randomization by a time randomly generated from the best fitted parametric distribution. The lifetime of the control patients will be measured from the new baseline and that of the treated patients will be measured from their individual times of IOP stabilization. Ordinary survival analysis of the adjusted data can be performed to compare the time-to-progression experience of the two groups of patients after the stabilization of the treatment is established. Chapter 5. Multistate Models for the Time to Event Data Results of analyses using the classical methods, the multistate modelling approach and the baseline adjustment approach are presented in Chapter 6. A full discussion of the performance of the different approaches can be found in Chapter 8. 32 Chapter 6 Results 6.1 Analysis of the Mean Defect Data The rationale for using the linear mixed effects (LME) model to fit the mean defect data was given in Chapter 4. A full model with all the covariates listed in Section 4.2 and a time x group interaction as fixed effects was initially fitted to the M D data using both the intent-to-treat (ITT) and the per-protocol (PP) approaches. A random intercept term and a random time slope were fitted to account for the apparently different baseline M D levels and decay pattern of M D over time across patients. When we switched from the I T T to the P P analysis, the times of repeated M D measurements and the values of the baseline covariates were adjusted according to the new baseline time. The group and gender variables were parametrized in the following way: • group: 0 for control group, 1 for treated group ; • gender: 0 for female, 1 for male Furthermore, we assumed the first-order continuous autoregressive (CAR1) model for the within-patient errors and an arbitrary covariance matrix for the random effects. The parameter estimation was based on the Restricted Maximum Likelihood ( R E M L ) method. 33 Chapter 6. Results 34 Table 6.1: R E M L coefficient estimates of the fixed effects in the L M E model for the I T T and P P approaches. ITT Fixed Effects Estimate P-value -0.00160 <0.0001 j3\ - time -0.993 0.005 p\ - group 0.834 <0.0001 /3 - baseline M D 0.0005 0.059 /5 - time x group 3 4 PP Estimate P-value -0.00157 <0.0001 -0.002 0.995 0.913 <0.0001 0.0007 0.021 The M D data for all the 97 patients were included in the I T T analysis, while the data from one patient who dropped out of the study were excluded from the P P analysis. We tested the significance of each covariate in the full model, and both analyses showed an insignificant effect of gender, age at baseline and IOP at baseline at a 5% level. We then fitted the following reduced model: Yij = 0oi + /?ij(time) + /3 (group) + /? (baseline MD) + /5 (time x group) + 2 3 4 (6.3) In Equation (6.3), i is the patient index and j is the index of repeated measurements on the same individual. is the jth. M D measurement of the ith patient. Note that the intercept and the slope coefficients are different for different patients. We can write /3 i 0 = A) + and Pu = /5i + bu, where / V s are the fixed effects and b^s are the random effects. For both the I T T and the P P analyses, model diagnostics were performed and showed that the reduced model gave a reasonable fit to the data. The normality of the random effects and the within-patient errors, and the C A R 1 model for the latter seemed to be a valid assumption. The I T T and the P P estimated autocorrelation coefficients were respectively 0.21 and 0.16. This implies the correlation between the errors present in two M D observations measured one day apart was 0.21 and 0.16 on average. Chapter 6. Results 35 With the same model being fitted, the two approaches gave rather different results, which are presented in Table 6.1. After adjusting for the baseline M D value, the I T T analysis gave a slightly more negative overall rate of change of M D than the P P analysis. At a 5% level, the group coefficient was found to be significantly less than zero in the I T T analysis, indicating that the mean M D level was lower in the treatment arm than in the control arm after adjusting for the baseline levels. However, the P P analysis showed no evidence of a difference between the mean M D levels of the two groups after the baseline levels were adjusted, at the same significance level. Moreover, we obtained a positive coefficient estimate for the group and time interaction term in both the I T T and P P analyses. The positive estimate suggested a slower decay rate of M D for the treated patients on average than the patients in the control group. In particular, the P P analysis showed a significant and stronger interaction (a more positive coefficient); the I T T analysis gave a marginally significant result (p=0.059). A t a 5% level, the I T T analysis did not show a significantly slower depression rate of M D for the treated group, while the P P analysis demonstrated a superior treatment effect in slowing down the deterioration of the visual field. In order to test for the presence of a floor effect, we included in the model in Equation (6.3) an extra covariate as a fixed effect: Xfi oor = /(baseline M D < -12dB) and an interaction term between time and Xfi oor in both the I T T and the P P analyses. The interaction between the indicator variable and time was tested and was found to be insignificant (p=0.4885 for I T T and p=0.6420 for P P ) . The decay pattern of the M D appeared to be independent of the baseline M D level regardless of the approach of analysis. Chapter 6. Results 6.1.1 36 Piecewise Linear Modelling Approach To understand what actually led to the difference in the results of the I T T and the P P analyses and consequently, the conclusions made on the usefulness of an IOP reduction strategy, we looked closely at the data in hope to reveal any hidden pattern which was not observed in the individual M D trajectories. Upon examination of the average M D level over time across patients in the control and the treatment groups separately, we suspected that the M D of the treated group had rather different decay patterns before and after IOP stabilization. In Figure 6.5 where the average M D level for the two groups were plotted, we observed the following: • Over the period (0 days, 1750 days), the average M D level for the control group appears to decrease linearly with time in spite of its mild fluctuation. • The treated group shows a sharp monotonic decline in the average M D over the period (0 days, 170 days), which is followed by a considerably slower decay rate until about 1750 days. Notice that the median time to IOP stabilization among treated patients is 189 days. The rapid decay of M D in the beginning might possibly be the result of the IOP lowering therapy, which could have caused further depression of the generalized visual field in addition to the natural depression as observed within the untreated patients. Nevertheless, upon the stabilization of the IOP, the treated group seems to have a slower decay rate than the control group, and the average M D level curves of the two groups merge at about 1750 days. It appears that the average M D level curve for the treated group can be well approximated by a segmented linear model with a change point close to the median time of IOP stabilization. Such an observation is not apparent at the individual level, but it does suggest a potential need for fitting a piecewise linear model to the individual treated patients and a pure linear model to the individual control patients. • There is substantial overlapping of the two average M D level curves beyond 1750 Chapter 6. Results 37 days. Moreover, the curves have a much wider fluctuation than was the case before 1750 days. A possible explanation for the large fluctuation is the loss of patients who had reached the progression end point. The mean (median) times to progression observed in our data are 1190.4 days (1122 days) and 1714.0 days (1889.5 days) for the control and the treated groups, respectively. With the increasing number of patients who left the study either because of drop-outs or reaching the progression end point as time elapses, the average M D levels of the remaining patients are subject to a larger variation and hence, a rightward funnel shape is seen in the plot. • The average M D level curve for the treated group does not show any consistent superiority over that of the control group over the entire follow-up period. The treated group has an average M D level below the control group most of the time, and the treatment does not seem to result in a less negative M D level. In particular, the treated patients were at a disadvantage because of a dramatic decrease in M D during the period awaited for IOP reduction and stabilization even though the mean time to progression was about 500 days longer among the treated patients. We have indeed looked to see if the piecewise linear model was necessary for the control group. Analyses of the M D data performed separately on the two groups showed that a linear model adequately described the M D trend over time for the control group while a segmented trajectory was needed for the treated group. As the I T T approach assumed a linear M D trajectory for each treated patient and the P P approach modelled the post-IOP-stabilization data, the two approaches did not explicitly take into account the rapid decay of the M D during the period awaited for IOP stabilization within the treated group. To account for the difference in the average decay rate of M D before and after IOP stabilization among the treated patients, here we considered a linear mixed effects model which allows for a piecewise linear trajectory for the treated group but a linear trajectory for the control group. Chapter 6. Results 38 Figure 6.5: T h e average M D level observed over time from randomization for the control and the treated groups. CO oo DO > Q CD 2 CD > cc CVJ NC = # controls surviving NT = # treated surviving N C = 53 NC = 50 NC = 28 N C = 14 NT = 44 NT = 42 NT = 38 NT = 29 0 500 1000 I 1500 2000 Time from randomization (days) 2500 3000 Chapter 6. Results 39 The model we fitted for this piecewise linear ( P C L I N ) approach was parameterized to have a mean structure (6.4) Poi + PiXu + • • • + PpXpi + /3p+ix i i p+ for the zth patient. The component (f3 + 0iXu H 0i t h /3 Xpi) is the same mean structure p of the linear model under the I T T and the P P approaches. The extra covariate x i p+ ti added to the structure is defined as Xp+ij = (time — T s i ) / ( t i m e > Tsi) x /(treatment) where Tsi is the time taken to I O P stabilization i f patient i was treated, and time is measured from the time of randomization. The covariate represented by x ij p+ only enters the model for the treated group and when the time of M D measurement happens after patient % had the I O P reduced and stabilized. The L M E model w i t h a mean structure given i n E q u a t i o n (6.4) thus fits a two-segment linear trajectory to the treated group w i t h a change point located at the treated individuals' times of I O P stabilization. However, a linear model is fitted for the control group. Gender, age at baseline and baseline IOP, included as fixed effects, were again found to be insignificant at a 5% level in the P C L I N modelling approach, and the final model w i t h same covariates as in Equation (6.3) was fitted to the data: (6.5) where for patient i XUJ is the time of the jth M D measurement from the time of randomization, Chapter 6. Results 40 X21 is the group membership indicator (0 for control and 1 for treatment), x i is the MD level at randomization, 3 x^j = XUJ x X2i defines the time by group interaction, defined as (xUj - TSi)I(xuj > TSi) x x2i, eij is the random error for the jth MD measurement. Also note that the intercept, the time of repeated measurements and the extra covariate x j were included as both fixed and random effects. We were allowing a different post5i IOP-stabilization MD decay rate for each treated patient. And for the within-patient errors and the random effects, the continuousfirst-orderautoregressive model and an arbitrary covariance matrix were assumed, respectively. Again, the REML method was used for parameter estimation. The results from this final model (Table 6.2) confirmed the need for two separate mean time slopes for the treated group. Model diagnostics also showed including the covariate x&j as random effect significantly improved the modelling of the individuals' MD data. The control group had a more negative estimated mean decay rate of the MD as compared to the results obtained from the ITT and the PP analyses. Before stabilization of the IOP, a treated patient was expected to have a faster decay rate than a control patient, as indicated by the negative estimate of the time by group interaction term. After stabilization, however, the decay rate slowed down considerably. The coefficient estimate for the time from IOP stabilization was not only positive, but also its magnitude was larger than the coefficient for the time by group interaction. As a result, there appeared to be a slower decline of the MD upon IOP stabilization among the treated patients, at a rate slower than the control group. To compare statistically the rate of change of the MD level between the controls (Pc) and that of the treated patients after IOP stabilization (PTI), we tested the hypotheses: Chapter 6. Results 41 Table 6.2: R E M L coefficient estimates of the fixed effects in the L M E model for the Three Different Approaches (ITT, PP, P C L I N ) . Pi p\ P 8 As 3 4 ITT Estimate P-value -0.00160 <0.0001 -0.993 0.005 0.834 <0.0001 0.0005 . 0.059 Fixed Effects - time - group - baseline M D - time x group - post IOP stabilization PP Estimate P-value -0.00157 <0.0001 -0.002 0.995 0.913 <0.0001 0.0007 0.021 0 : PC = PT2 OR a : PC^PT2 O R fa + fcjLO H vs. H PCLIN Estimate P-value -0.00172 <0.0001 -0.564 0.136 0.831 <0.0001 -0.0022 0.025 0.00307 0.002 8 + B = 0 (6.6) 5 4 The corresponding test statistics is T = ft + ft ^ P - - iv(o, l ) under H y/Var(fa + &) 0 Var(B + 8 ) can be estimated by Var(6 + 8 ) which is given by A Var0 4 5 4 + fa) = Var0 ) + Var0 )+2xCov0 J ) i = 5 5 I 9.85 x 1 0 +9.86 x 1 0 -0.0022 + 0.00307 _ 7 = 2 g _ 7 5 + 2 x (-9.4 x l 0 ~ ) = 9.1 x 10" 7 3 x/9.1 x 102 x (1 - $(|T|)) 8 p-value = 2 x (1 - $(2.83)) = 0.0046 $ is the standard normal cumulative distribution function. The small p-value (0.0046) showed that B + f3 was significantly different from 0. In 4 5 particular, while the treated group had a significantly faster decay rate of M D during the Chapter 6. Results 42 period awaited for IOP stabilization (/3 < 0, p=0.025 in Table 6.2), the deterioration 4 slowed down to a rate significantly slower than the control rate afterwards. The individual M D trajectories observed in the data and the mean fitted trajectory from the P C L I N approach for the control and treated groups were plotted, respectively, on the same scale in Figures 6.6 and 6.7. Each thin solid path corresponds to the observed M D level of a patient over the time period he or she was followed, and the thick solid line corresponds to the mean fitted trajectory. In Figure 6.7, the change point of the mean fitted trajectory was chosen to be the mean time to IOP stabilization. From these two plots, we saw a large between-patient variation in the M D readings at randomization as well as in the temporal trend of the M D levels. Although it is not apparent from the individual paths that the treated patients had a different M D slope before and after the IOP stabilization, the mean fitted trajectories for both groups captured the overall trend that was observed in the average M D level curves in Figure 6.5. We also plotted in Figure 6.8 the individual fitted M D trajectories from the P C L I N approach for a random sample of 24 patients in the data. The plots with a linear trajectory correspond to control patients and those with a broken trajectory correspond to treated patients. The change point of each segmented line is the time of IOP stabilization specific to a treated patient. A n d the scattered points are the observed M D values. The individual fitted trajectories seem to give a reasonable fit to the data. In order to assess the performance of the three approaches used to model the M D data, we compared the AIC, BIC indices and the log likelihood values (presented in Table 6.3) from fitting the model given in Equation (6.3) for the I T T and the P P analyses and the one given in Equation (6.5) for the P C L I N analysis. Indeed, there is no direct method to compare the P P approach with the other two approaches based on the indices and the log likelihood because they essentially involved two different sets of patients. However, we were able to compare the I T T and the P C L I N Chapter 6. Results 43 Figure 6.6: Individual observed M D trajectories and the mean fitted trajectory from the P C L I N approach from time of randomization for the control group. 0 500 1000 1500 2000 time (days since randomization) 2500 3000 Chapter 6. Results 44 Figure 6.7: Individual observed M D trajectories and the mean fitted trajectory from the P C L I N approach from time of randomization for the treatment group. time (days since randomization) Chapter 6. Results 45 Figure 6.8: A random sample of 24 individual fitted MD trajectories from the PCLIN approach from time of randomization. 0 500 1000 1500 0 500 time (days) 0 500 1000 1500 0 500 time (days) 0 500 200 1000 400 0 600 200 400 600 200 600 time (days) 500 2000 0 600 800 _ 0 400 0 50 1000 1200 200 0 , 0 200 400 600 time (days) 200 300 500 400 600 0 , 800 500 1500 0 500 0 500 1000 time (days) 1500 200 2500 400 600 time (days) 1000 0 500 1500 time (days) 2500 0 200 time (days) 0 1500 time (days) 1500 200 1000 time (days) time (days) time (days) 0 500 time (days) 800 100 0 time (days) time (days) 800 1500 100 time (days) time (days) 0 0 time (days) 1000 200 time (days) 0 , time (days) time (days) 0 1500 time (days) 600 1000 time (days) 1500 0 200 600 time (days) 1000 Chapter 6. Results 46 Table 6.3: Table of AICs, BICs, observed log likelihoods and within-patient mean square errors (a ) from fitting the L M E models for the three different approaches. 2 Approach ITT PP PCLIN Number of Patients 97 96 97 Number of M D Observations 1493 1351 1493 AIC BIC 5883.6 5290.4 5876.6 5936.7 5342.1 5950.9 Log Likelihood -2931.8 -2635.0 -2924.3 a 2 2.57 2.38 2.52 models because the former is nested within the latter, and both models were fitted to the same set of M D data. The log likelihood ratio test gave a x statistic of 14.99 (= 2 —2 x {—2931.8 — (—2924.3)}) with 4 degrees of freedom, and a corresponding p-value of 0.0047 which suggested P C L I N modelling approach outperformed the I T T approach in terms of goodness of fit. Besides, the P C L I N approach resulted in a sightly smaller mean square error than the I T T approach. 6.2 Analysis of the Time to Event Data In Chapter 5, we have described the disability multistate model and its potential application in modelling the survival data for the two events: IOP stabilization and progression simultaneously. In this section, we will present results from the separate analyses of the time to IOP stabilization data and the time to progression data using some classical methods of survival analysis. Later in Section 6.2.3, we will show the results from modelling the individual transition hazards (hi (t) and ho2(t) in Figure 5.4) and from the 2 Cox analysis with time-dependent covariates under the multistate modelling approach. Among the 44 treated patients in our data, the time to progression information was missing for one patient and was therefore excluded from the progression survival analysis. The same patient was excluded from the analysis of the time to IOP stabilization data in order to have the same set of treated patients in both analyses. Chapter 6. Results 6.2.1 47 Analysis of the Time to IOP Stabilization Data In managing normal tension glaucoma, patients are placed on non-surgical therapy or surgical therapy to lower the IOP level depending on the severity of the disease. It would be useful to have some knowledge about the expected waiting times for the different therapies to effect IOP reduction. This motivated our interest to study the time to IOP stabilization among the treated patients. Note that modelling the hazard for the time to IOP stabilization is essentially the same as modelling the transition hazard hoi (t) in the disability model (see Figure 5.4) which was discussed earlier in Chapter 5. We analyzed the time to IOP stabilization data using non-parametric, semi-parametric and parametric methods. The key covariate in the analyses was the type of treatment received. In the Collaborative Normal Tension Glaucoma Study, most patients who underwent the surgical treatment experienced failure of achieving the IOP reduction by means of an initial non-surgical treatment. Only two out of the 43 patients received only the surgical therapy. We thus categorized the treated patients into two groups according to whether they had ever undergone a filtering surgery, and we did not consider having a separate group for those two patients. Among the 43 treated patients included for analysis, 24 received non-surgical treatment only and 19 had undergone a filtering surgery. In regard to the gender composition, the female group had 27 patients and the male group had 16 patients. Notation The end point of interest here is the stabilization of the IOP, and for all the treated patients included in our analysis, the time to this event was observed, so there were no censored observations and the analysis was free of concerns about the censoring pattern of the data. For individual i, we defined T$i to be the time taken to IOP stabilization (measured in days) and X ; to be the vector of covariates. We parameterized the two Chapter 6. Results 48 categorical variables: gender and the type of treatment received variables in the following way: • Gender (Xi): 0 for female, 1 for male • Type of treatment received ( X ) : 0 for receiving a non-surgical treatment only, 1 2 for ever receiving a surgical treatment Kaplan-Meier Survival Analysis (Non-parametric) The Kaplan-Meier survivor functions were estimated for and compared between the four different gender by treatment groups ((Xi,X ) 2 = (0,0),(0,1),(1,0),(1,1)). The log rank test showed that the four groups had significantly different survival experiences (p=0.00813). However, we observed in Figure 6.9(a) substantial overlapping of the estimated survival curves for the two genders within each of the two treatment groups, which is suggestive of the absence of a gender effect. Individual analyses were carried out to compare the two treatment groups and the two gender groups separately. The results did not show a significant difference in survival experience between the opposite genders (p=0.273). The patients undergoing the filtering surgery had a significantly longer waiting period for IOP reduction and stabilization than those who just received medical or laser treatment (p=0.0054), as can be seen in Figure 6.9(b). Cox Regression Analysis (Semi-parametric) Despite the advantage of the Kaplan-Meier analysis which makes no assumption on the distribution of the time to IOP stabilization, the associated log rank test can only assess the effects of categorical covariates. To incorporate other continuous covariates into the modelling of the hazard of IOP stabilization, we made use of the Cox proportional hazards Chapter 6. Results 49 Figure 6.9: (a) Estimated Kaplan-Meier survivor functions for the time to I O P stabilization of the four gender by treatment groups, (b) Estimated K a p l a n - M e i e r survivor functions for the time to I O P stabilization of the surgical and non-surgical groups. (a) T i m e to I O P s t a b i l i z a t i o n ( d a y s ) (b) CO o Non-surgical Surgical CO o Nn = # surviving in non-surgical group Ns = # surviving in surgical group CM O O o Nn=24 Nn=0 Nn=0 Ns=19 I Ns=3 Ns=2 200 400 T i m e to I O P stabilization (days) 600 Chapter 6. Results 50 Table 6.4: Estimates of the log relative hazards ratio (8) and relative hazards ratio (exp(/3)) for the gender and treatment type covariates from the Cox regression analysis of the time to IOP stabilization data. Covariate Gender {X ) Treatment type (X ) x 2 P 0.253 -0.984 exp(/3) 1.288 0.374 SE(/3) 0.167 0.343 P-value 0.1300 0.0041 model: hi(t) = ho(t)exp{B Xi} T (6.7) where hi(t) is the hazard function for the ith individual at time t, X j is the vector of covariates, h (t) is the baseline hazard function, and (3 is the vector of parameters 0 representing the log relative hazard ratios. The Cox regression analysis was initially performed to adjust for all the covariates including gender, type of treatment, baseline IOP, M D and age at randomization. The baseline covariates did not have a significant contribution to modelling the hazard at a 5% level. The group and gender covariates were found to be significant, but there was no suggestion of an interaction effect between the two covariates. We then fitted a reduced model excluding the insignificant covariates, and the results are presented in Table 6.4. Interestingly, the gender effect in the reduced model was deemed insignificant as opposed to the case when fitting the full hazard model at the same significance level. Nevertheless, the positive gender coefficient indicated a higher hazard of IOP stabilization for the male patients, i.e., a female patient who belonged to the same treatment group as another male patient will be expected to wait longer for her IOP level to reduce and stabilize. And again, the surgical group was shown to have a significantly smaller hazard of IOP stabilization. According to the results based on the model with only the type of treatment covariate, the relative hazard of two patients who received and did not receive the surgical therapy was estimated to be 0.406. The Chapter 6. Results 51 surgical therapy appeared to incur a longer waiting period before IOP stabilized after being reduced by 30%. We also investigated the presence of a possible gender-treatment interaction, which was found to be insignificant at a 5% level. To check the validity of using the Cox model in Equation (6.7), we tested for any time dependence of the coefficient B for the covariates (individually and globally) in both the full and the reduced models, and we found no evidence of such a dependence. This implies the assumption of proportional hazards was valid for all the covariates used to model the hazard of IOP stabilization. Parametric Modelling As one of the classical approaches, parametric models were fitted to complete a full analysis of the time to IOP stabilization data. In seeing that only the type of treatment received and possibly gender played a significant role in determining the length of period taken to reach IOP reduction and stabilization, we included only these two covariates in the parametric models. Among the many distributions commonly used for modelling survival data, we found that the log-normal, Weibull and log-logistic distributions gave satisfactory fits (the probability plots for the three distributions resembled a straight line). Because of the simplicity of the Weibull distribution over the other two, we chose to report the results based on the Weibull hazard model, which can be parameterized as follows: h{t) = A ^- (6.8) x 7 {exp(B K)}^T (6.9) 1 where h(t) is the hazard function at time t, A = exp(3 -x) depends on the vector of T covariates x through an exponential function and linear coefficients in 8. The scale parameter A and the shape parameter 7 uniquely define a Weibull distribution. Chapter 6. Results 52 The gender effect was again shown to be insignificant (p=0.24). Refitting the same model by excluding the gender component yielded an estimated relative hazard between a surgically treated patient over a non-surgically treated patient of exp(-0.581) =0.56 (SE=0.157, p=2.2xl0~ ). 4 Regardless of the method used to compare the survival experience of the surgical and the non-surgical groups, patients who ever underwent the filtering surgery to achieve a 30% reduction of IOP had a significantly longer waiting time to IOP stabilization. The presence of a gender effect is ambiguous, although all the three methods of analysis suggested the IOP of the male patients tend to reduce and stabilize faster than the female patients. 6.2.2 Analysis of the Time to Progression Data Similar to the analysis performed in the previous section, non-parametric, semi-parametric and parametric methods were used to analyze the data for the time to progression. To investigate the effect of the lag time (brought by the delay awaited for IOP reduction and stabilization within the treated group) on the results based on the two different definitions of baseline time, the three methods were applied in both the intent-to-treat and the per-protocol analyses. In regard to the number of patients to be included in the I T T and the P P analyses, the patient whose time to progression was missing was dropped out of the study. As a result, both analyses excluded this single patient and had a sample size of 96 patients among which 53 belonged to the control group and 43 belonged to the treated group. Chapter 6. Results 53 Notation Suppose we define and Y^ to be the actual times to progression for the ith. pa- tient according to definitions of time zero under the intent-to-treat and the per-protocol approaches, respectively. Also let c\ ^ and Cf^ be, respectively, the censoring times to 1 progression for the two approaches. The control group will have YP=Yl 2) The treated group will have Y} =Y^ - T 2) 1) and Cf =C\ ] Si l) and - T , where T Si Si C\ =C\ . 1] 2) is the time to IOP stabilization for the i t h patient. The time to progression is either observed or censored, and we will define it by T^—m\n{Y^\C^) for .7=1,2. Again, the following relationships hold: • T^=T^ for the control group and • r/ =7; - T 2) (1) Si for the treated group. A l l the time data including Y^\c\^ and C\^ (z=l,2,...,96, .7=1,2) were measured in years. The time to IOP stabilization T i was similarly scaled. S We also denote the vector of covariates by X;. In particular, we define the two categorical variables, gender and the group membership by X\ (0 for female, 1 for male) and X 2 (0 for control, 1 for treatment), respectively. The other covariates which might have an effect on the time to progression and will be studied include the M D level, the IOP level and the age at baseline. Kaplan-Meier Survival Analysis (Non-parametric) For each of the I T T and P P approaches, the Kaplan-Meier survivor functions were estimated for and compared between the four groups of patients of different genders and Chapter 6. Results 54 group membership ( ( X i , X ) = (0,0),(0,1),(1,0),(1,1)). The log rank test found a signifi2 cant difference in the survival experience to progression among the four groups (p=0.0006 for I T T and 0.005 for P P ) . The plots of the estimated survivor functions (Figure 6.10) showed substantial overlapping of the two gender curves within the same treatment or control group. The gender effect did not seem to explain the difference in the survival experience of the four groups. When we compared only the two treatment groups, both the I T T and the P P approaches resulted in a significant group effect. The plots of the estimated Kaplan-Meier survival curves (Figure 6.11) showed a superior survival experience of the treated group, and the two curves did not intersect over the whole course when the patients were followed. Upon comparison of the results of the I T T and the P P approaches, the log rank test demonstrated a slightly more significant difference in survival between the two treatment groups in the I T T analysis (p=0.0001 for I T T versus p=0.0009 for P P ) . This can also be seen from the larger separation between the two estimated Kaplan-Meier survivor functions in Figure 6.11(a) as compared to the two functions in Figure 6.11(b). In fact, the estimated Kaplan-Meier survivor function for the control group was the same under the two approaches because the baseline defined for the controls was always the time of randomization. The baseline for the treated group was shifted from the time of randomization to the individuals' times of IOP stabilization when we took the P P approach. The time to progression as defined in the P P approach was shorter than its I T T counterpart within the treated group, and hence a diminished difference between the two groups was demonstrated in the P P analysis . Although the treatment effect was found to be significant in both analyses, the P P analysis gave a slightly weaker evidence against the null hypothesis of no treatment difference. Unlike the case for the control group, the difference in the baseline definition adopted by the two approaches led to different Kaplan-Meier estimates for the treated group. The KaplanMeier estimate of the survivor function for the treated group did not simply undergo a rank invariant transformation when switching between the I T T and the P P approaches, because a different time awaited for IOP stabilization, Tsi, was subtracted from T/ ^ to 1 obtain T/ ^. The ranks of the set of T/^'s are not preserved under the transformation 2 Chapter 6. Results 55 As mentioned earlier, the overlapping of the estimated survivor functions for the two gender groups within the same treatment or control group (Figure 6.10) suggested that there was no gender effect. Nevertheless, we carried out separate analyses to test for the treatment effect within each gender group and to see if they gave different results for the two genders. We compared the treated female group to the control female group ((Xi, X ) 2 — (0,0) and (0,1)), and the treated male group to the control male group ((Xi,X ) — 2 (1,0) and (1,1)), using both the I T T and the P P approaches. Interestingly, the I T T analysis revealed a difference in the treatment effect on the two genders. The log rank test demonstrated a significant difference in the time-to-progression experience between the treated female and control female patients (p=0.00064), but there was no evidence of a treatment effect within the male patients (p=0.137). On the other hand, the P P analysis showed no treatment effect in either gender group at a 5% level (p=0.471 for males and p=0.0583 for females) although the treatment effect for the female group was marginally significant. This contradicts with the results obtained from the P P analysis when we did the treatment comparison on all patients together. Cox Regression Analysis (Semi-parametric) Besides obtaining the non-parametric Kaplan-Meier estimates of the survivor functions, we modelled directly the hazard of progression using the Cox proportional hazards model (Equation (6.7)). Later, we repeated the Cox regression analysis within each of the two gender groups separately. We also considered the stratified Cox model with gender being the strata variable to model the progression hazard. When we first modelled the progression hazard for all the 96 patients using the Cox proportional hazards model (Equation (6.7)), the patients' age, M D level and IOP level at baseline, gender and group membership were included as covariates. In this full model, Chapter 6. Results 56 Figure 6.10: (a) Estimated Kaplan-Meier survivor functions for the time to progression of the four treatment-gender groups based on the intent-to-treat (ITT) approach, (b) Estimated Kaplan-Meier survivor functions for the time to progression. of the four treatment-gender groups based on the per-protocol (PP) approach. (a) CO c "> "£ 13 CO c o Yr o CO O CO O •+ ^1" CL o o CM Control,Female Control,Male Treatment.Female Treatment, Male H h o o 4 6 8 Time to progression (years) (b) -H-t-hTJ CO CD O 1 •> > =3 CO •Io Q. I TH—r! CO O o 5 Control,Female Control,Male Treatment, Female Treatment, Male o Q. » CM O O O 0 4 6 Time to progression (years) 8 Chapter 6. Results 57 Figure 6.11: (a) Estimated Kaplan-Meier survivor functions for the time to progression of the two treatment groups based on the intent-to-treat (ITT) approach, (b) Estimated Kaplan-Meier survivor functions for the time to progression of the two treatment groups based on the per-protocol (PP) approach. (a) CO 4 f • I I I I I I I; o CD O I I II o NC = # controls surviving H—h NT = # treated surviving • CNJ o o o NC=53 NC=37. NC=14 NC=5 NC=1 NT=43 NT=39 NT=29 NT=12 NT=0 4 6 8 Time to progression (years) (b) •H h-1-+ oo o 4t • 4H—H—I—I—H ft- CO o I I II o Control Treatment NC = # controls surviving NT = # treated surviving CNJ O NC=53 O NT=43 o NC=37 (VT=38 NC=14 NC=5 NC=1 NT=22 NT=7 NT=0 i— 0 4 Time to progression (years) 8 Chapter 6. Results 58 Table 6.5: Estimates of the log relative hazards ratio (8) and relative hazards ratio (exp(/3)) for the group covariate from the gender-specific Cox regression analysis of the time to progression data (M for males, F for females). ' B Covariate Group I T T PP M -1.12 -0.98 F -1.83 -1.43 exp(B) M F 0.33 0.16 0.38 0.24 SE(8) M F 0.70 0.56 0.69 0.51 P-value M F 0.11 0.0011 0.16 0.0049 the baseline values of age, M D level and IOP level were adjusted according to the I T T or the P P approach we used to analyze the data. Upon fitting the full model, we found that the group covariate satisfied the proportional hazards assumption in the P P analysis but not in the I T T analysis. It can be that the two treatment groups had proportional hazards within each gender group while the two genders had different baseline hazards. We explored this possibility by analyzing the time to progression data separately for the two genders. If indeed the assumption for the group covariate is valid within each gender group, it is also legitimate to fit a Cox model stratified by gender. For each of the I T T and the P P approaches, we fitted the full model separately to the time to progression data specific to the two genders. In all the resulting four analyses, the proportional hazards assumption held for all of the covariates individually as well as globally. Similar to the case in the gender-specific Kaplan-Meier survival analyses presented earlier, we obtained different results for the two gender groups. Both the I T T and the P P approaches revealed a difference in the treatment effect on the two genders. More specifically, there did not seem to be a difference in the hazard of progression between the male control patients and the male treated patients. However, in the female group, the treated patients had a significantly smaller risk of progression than the control patients. The age covariate, IOP level and M D level at baseline were not associated with the hazard of progression for either gender group. We repeated the gender-specific analyses, fitted a reduced model involving only the group covariate and obtained the results in Table 6.5. Chapter 6. Results 59 Again, the assumption of proportional hazards between the control and the treated groups was satisfied within each gender group under both the ITT and the PP approaches. Neither the ITT nor the PP analyses showed enough evidence against the null hypothesis of no treatment effect among the male patients. In contrast, the female patients who received the treatment had a significantly smaller risk of progression than those who remained untreated, as shown in both the ITT and the PP analyses. The ITT analysis gave a relative hazard of 0.16 for a treated female patient versus an untreated female patient, while the PP analysis gave a relative hazard of 0.24. In other words, within the female group, receiving an IOP reduction treatment is associated with about a 6-fold decrease in the risk of progression, as suggested by the ITT analysis, and roughly a 4fold decrease in the PP analysis. Furthermore, the ITT approach gave a more significant result for the treatment effect on the female patients than the PP approach did. The same argument given in the Kaplan-Meier survival analysis (p.54) in this section accounted for this difference in significance. However, it is not clear how the shift in baseline within the treated group affects the assessment of a treatment effect when indeed such an effect does not exist. As the proportional hazards assumption for the group covariate was valid within patients of the same gender, modelling the hazard of progression using the Cox model stratified by gender was justified. In essence, we were assuming two different baseline hazards for the two genders while the two genders share the same coefficient for each of the other covariates. The stratified Cox model has the form M * ) = h {t)exp{a Xi}, T bj j = l,2' • (6.10) where hij(t) is the hazard at time t for the ith individual of gender group j, hbj(t) is the baseline hazard of the jth gender group, x; is the vector of covariates at time t not including gender, and 3 is the vector of parameters representing the log relative hazards ratios. Infittingthe stratified Cox model, we included the group covariate and its interaction Chapter 6. Results 60 Table 6.6: Estimates of the log relative hazards ratio (6) and relative hazards ratio (exp(/?)) from the stratified (by gender) Cox regression analysis of the time to progression data. Covariate Group Group x Strata(gender) ITT PP ITT PP -1.472 -1.203 0.356 0.227 exp(/3) 0.229 0.300 1.428 1.255 SE(/?) 0.448 0.43 0.448 0.43 P-value 0.001 0.005 0.43 0.60 with the strata covariate, i.e., gender, and the results are presented in Table 6.6. It can be shown that by adding the group by strata interaction term to the hazard model, the result is identical to performing separate fits with only the group covariate for each stratum. For instance, the I T T log relative hazards ratio estimate (B) for the group covariate within the male patients, which is -1.12 (Table 6.5), can be obtained by summing the I T T estimates of the log relative hazards ratios of the group covariate and the interaction term (-1.472+0.356) presented in Table 6.6. For the female patients, taking the difference of the group covariate estimate and the interaction estimate obtained by the stratified Cox regression would give the log relative hazards ratio estimate for the group covariate in the analysis focusing on only the female patients. Despite the fact that the gender-specific I T T and P P analyses showed a significant treatment effect in the female group but not in the male group, both the I T T and the P P stratified Cox regression analyses did not suggest the presence of an interaction effect between gender and the treatment group membership (see Table 6.6). We then refitted the stratified Cox model leaving out the interaction term. The I T T approach gave a group coefficient estimate of-1.58 (p=0.0003) while the P P approach gave an estimate of -1.28 (p=0.0017). The negative coefficient implied a smaller risk of progression for the treated group and hence a favourable treatment effect. Based on the I T T analysis, the relative hazard of progression of a treated patient versus an untreated of the same gender was estimated to be exp(-1.58) = 0.206. Similarly, the P P analysis gave a relative hazard Chapter 6. Results 61 of exp(-1.58) = 0.278. While the stratified Cox model has the flexibility of incorporating into the hazard covariates which only satisfy the proportional hazards assumption within certain strata, the approach has certain limitations in our application. First, we cannot assess the gender effect with the stratified model because no direct estimate of the gender coefficient is available. In fact, because a different baseline hazard function is used for each gender, the relative hazard between the two genders will be time-dependent and cannot be determined unless the exact form of the baseline hazards is known. Comparison of the survival experience between two patients of opposite genders thus cannot be easily made. Second, the same coefficient for group, /3 up, is assumed for both genders. The value of gr0 exp(/3 flroup ) gives an "overall" measure of the relative risk of a treated patient versus an untreated patient of the same gender. As a result, the estimated relative risk of a treated patient versus an untreated patient within the female group is the same as that within the male group. Assuming the same relative risk for the two genders might be unrealistic in our situation where the two genders have essentially different baseline hazards. Parametric Modelling Various distributions common for modelling survival times were fitted to the data. The Weibull, log normal and log-logistic distributions seemed to fit the data satisfactorily and gave small and comparable deviance values. Here, we present the results from the Weibull fit because it is simpler to interpret than the other two fits. In both the I T T and the P P approaches, only the group covariate was fitted in the hazard model given by Equation (6.9), and the two corresponding group coefficients were both shown to be significantly different from zero. A coefficient estimate of -0.878 from the I T T analysis implied that the treated group and the control group had a relative hazard of exp(-0.878) = 0.416 and thus a favourable treatment effect. Similarly, the P P analysis showed a superior effect for the treated group whose hazard was estimated to be exp(-0.754) = 0.47 times of the Chapter 6. Results 62 hazard for the control group. The results were highly significant in both the I T T and P P analyses although the former analysis gave slightly smaller p-value (p=4.96xl0~ for 4 I T T and p = 2 . 8 7 x l 0 - for P P ) . , 3 A l l three methods of analysis showed the treatment significantly prolonged the time to progression, regardless of which definition of the baseline was used. Other covariates did not contribute to the modelling of the survival time or the hazard. 6.2.3 Analysis Using the Disability Model In Section 6.2.1, we have presented the results of modelling the transition hazard h i(t) 0 in the disability model (see Figure 5.4), i.e., the hazard of IOP stabilization among the treated patients. For the transition hazard hi (t) in the model, we could think of it as 2 the post-IOP-stabilization hazard of progression within the treated patients. Treated patients were at risk for the transition from state 1 to state 2 starting from the time of IOP stabilization until they reached the progression end point. Thus, the time t in hi (t) 2 was measured from the time of IOP stabilization. A n d for h (t), we could interpret it Q2 as the progression hazard among patients in the control group and patients who were treated but progressed before a stable IOP reduction was achieved. Because data from the latter group were unavailable for our analysis, we focused on the control patients only in modelling h (t). Control patients were at risk for the transition from state 0 to state 02 2 starting from the time of randomization until they reached the progression end point. The time t in h (t) was measured from the time of randomization. 02 For both the modelling of the transition hazards h {t) and ho (t), we have applied x2 2 non-parametric, semi-parametric and fully parametric methods, and similar results were obtained from the different approaches. In the following two sections, we will focus on the results from fitting the Cox proportional hazards model given in Equation (6.7). We checked that the proportional hazards assumption was satisfied, and therefore modelling Chapter 6. Results 63 hi2(t) and ho2(t) using the Cox model seemed'appropriate. Modelling of hi2(t) within the Treated Group A n initial full Cox model with covariates including gender, age, M D and IOP levels at IOP stabilization, the type of IOP lowering therapy and the time awaited for IOP stabilization from the time of randomization was fitted to the 43 treated patients. Although the data did not show a significant effect of any of the covariates at a 5% level, we obtained a negative coefficient for both the time taken to reach IOP stabilization (/3=-0.26, p=0.85) and the type of IOP lowering therapy (/3=-1.46, p=0.24). The coefficient estimates suggested that a treated patient who took a longer time to achieve a stable IOP reduction had a smaller risk of progression, and the same applied to a patient who received the surgical treatment (type of therapy was parameterized as 0 for receiving only non-surgical treatment and 1 for ever receiving a surgical treatment). The effect of IOP at stabilization (p=0.11) may have been diluted by the other insignificant covariates included in the model, so we fitted a reduced model with only the IOP at stabilization covariate. The fit gave a coefficient estimate of 0.37 (p=0.074) and a relative hazard ratio of exp(0.37)=1.44. Though only marginally significant, the*coefficient with a positive estimated value implied a higher IOP at stabilization was associated with a higher.risk of progression among the treated patients. We also tested for the presence of an interaction between various pairs of covariates, but no significant results were obtained. Modelling of ho (t) within the Control Group 2 A full Cox model with covariates including gender, age, M D and IOP levels at randomization was fitted to the 53 control patients, and no significant covariates were identified at a 5% level. Unlike in the analysis of the post-IOP-stabilization survival experience Chapter 6. Results 64 within the treated patients where IOP at stabilization had a marginally significant effect on the post-stabilization progression hazard, IOP at randomization did not seem to affect the hazard of progression within the control group (p=0.25). There was also no evidence of a significant interaction effect between any pairs of covariates included in the full model. Cox analysis with time-dependent IOP-stabilization indicator variable We explored the possibility of fitting a single Cox model with time-dependent covariates for the analysis of the disability model as described in Chapter 5. Recall that if the length of the waiting period to achieve IOP stabilization does not have an effect on the risk of progression with a stable IOP reduction, and if we have proportionality between the two progression hazards hi (t) and h (t) at all times, it would be legitimate to model 2 02 an overall hazard by means of a Cox model with a time-dependent indicator variable for IOP stabilization [26]. The above analysis of the post-stabilization times to progression within the treated patients showed that the length of the time period awaited for IOP stabilization had no significant effect on the post-stabilization time to progression at a 5% level (p=0.85) and thus validated the first condition required for fitting a single Cox model. This first condition also held within each gender group (p=0.18 for male patients and p=0.69 for female patients). Since our data did not include any.treated patients who reached the progression end point before IOP stabilization was established, we did not have a proper modelling of the transition hazard h (t) which, in principle, should involve Q2 complete uncensored time to progression without IOP stabilization from some patients in both the untreated and treated groups. We were therefore unable to check the second condition (proportionality of the progression hazards h\ (t) and h (t)). We proceeded 2 02 by assuming the second condition held in our case and fitted the Cox model (in Equation (5.2)) with covariates including gender, age, M D and IOP at randomization, as well as the interaction term between the group covariate x gr0UP}i (0 for control group and 1 for treated Chapter 6. Results 65 group) and the time-dependent indicator for IOP stabilization Ii(t). As all the treated patients included in our data underwent IOP stabilization before reaching the progression end point, they had x = 1 by definition, and /;(£) = 1 during their observed post- gr0UPii stabilization periods. The two variables xgr0UPti and Ii(t) became linearly dependent, and can lead to computational problems in the estimation of the regression coefficients for the two variables. To avoid the problem, the group main effect was excluded from the model, and consequently, we could only assess the effect of IOP stabilization on the progression hazard within the treated patients, and we were unable to assess whether the treated and untreated groups had different risks of progression. At a 5% significance level, gender, age, M D and IOP at randomization were found to have no effect on the hazard of progression. Among the treated patients, the IOPstabilization covariate was found to be significant ( p = 7 . 6 x l 0 ) , and the relative risk -4 of having versus not having a stable 30% IOP reduction was estimated to be exp(1.41)=0.24. This implied about a four-fold decrease in risk of progression when a stable IOP reduction was achieved. Moreover, the proportional hazards assumption held for all the time-independent covariates, so the appropriateness of using a Cox model was justified. Checking for the assumption for the IOP-stabilization indicator variable Ii(t) was unnecessary because the hazard ratio involving the Ii(t) term changes with time. Indeed, the property of proportional hazards is not required for time-dependent covariates for the validity of applying the Cox model in Equation (5.2). We also fitted the same Cox model for the two genders separately. The analyses showed no evidence of a significant effect of any of the baseline covariates at a 5% level within each gender group. We then fitted a reduced model with only the IOP-stabilization covariate and the results are presented in Table 6.7. Among the male treated patients, the effect of having a stable IOP reduction on the progression hazard was insignificant although a negative estimate of /3(IOP stabilization) suggested that a stabilized 30% reduction of the IOP was associated with a smaller risk of progression. On the other hand, Chapter 6. Results 66 Table 6.7: Estimates of the log relative hazards ratio (j3) and relative hazards ratio (exp(/3)) for the time-dependent IOP stabilization covariate within the treated patients from the gender-specific time-dependent Cox regression analysis (M for males, F for females). Covariate IOP Stabilization M -1.06 7? F -1.79 exp(P) M 0.35 F 0.17 SE{B) M 0.70 F 0.56 P-value M F 0.13 0.0014 a stable reduction of the IOP had a significant effect on the progression hazard among the female treated patients. Within the treated group, female patients who attained IOP stabilization had a hazard of about 6 times smaller than those who did not. A Cox model stratified by gender was also fitted, and we again found that the baseline covariates were all insignificant at a 5% level. Refitting the stratified Cox model with only the IOP-stabilization indicator variable gave an estimate of /5 (IOP stabilization) of -1.54 (p=0.0004) which indicated a favourable effect of achieving a stable 30% reduction of the IOP from the prerandomization readings within the treated group. Based on this estimate, the relative risk of progression of two treated patients (of the same gender) with and without IOP stabilization was exp(-1.54) = 0.215, which in turn implied an almost five-fold decrease in the risk of progression after treatment stabilization. 6.2.4 The Baseline-Adjustment Analysis In Section 6.2.1, the Weibull distribution was found to model the time to IOP stabilization (Tsi) within the treated group satisfactorily. To correspond to the lag time in treatment stabilization in the treated group, an adjustment was made to the baseline for the control group. Our baseline-adjustment analysis introduced a random Weibull time shift from the time of randomization for each of the control patients in the data. Let Tci represent the random time shift for the ith patient if he or she happened to be untreated. The Chapter 6. Results variable 67 Ta follows the Weibull distribution (see parameterization in Equation with estimated parameters A=0.572 and 7=1.68 for the time (6.9)) t measured in years. The Weibull parameters were estimated from fitting the Weibull distribution to the times to IOP stabilization observed within the treated patients in our data without adjusting for any covariate information. The therapy type was the only significant covariate identified in the analysis of the time to IOP stabilization data but was not adjusted in our baselineadjustment approach because the therapy type was irrelevant for the control patients. The Weibull distribution with no covariate adjustment was shown to also provide a reasonable fit to the time to IOP stabilization data. For each control patient, a random Ta was generated and was subtracted from the patient's original time to progression to obtain the adjusted time to progression. The values of the baseline covariates including M D , IOP and age were changed accordingly to match the shifted baseline. The time to progression for each treated patient will be the same as defined under the P P approach, i.e., the treated patients were followed from their individual time of IOP stabilization. We then carried out classical survival analysis on the new set of data with the adjustment made to the control group. Kaplan-Meier Survival Analysis (Non-parametric) Kaplan-Meier estimates of the survivor functions were obtained for and the time-toprogression experience was compared between the four treatment-gender groups. The log rank test showed a significant difference between the survivor functions of the four groups, but the difference did not seem to be explained by the gender effect because the estimated Kaplan-Meier survival curves for the two genders (Figure 6.12(a)) crossed within each of the two treatment groups. Especially within the treated group, the curves of the opposite genders overlapped considerably. The log rank test comparing the survivor functions for the two genders only did not find a gender effect (p=0.335). We then focused on just the two treatment groups and plotted the estimated Kaplan-Meier survivor curves Chapter 6. Results 68 Table 6.8: Estimates of the log relative hazards ratio (/?) and relative hazards ratio (exp(/3)) for the group covariate from the gender-specific Cox regression analysis under the baseline-adjustment approach (M for males, F for females). Covariate Group M -1.2 P F -1.81 exp{p) M • F. 0.3 0.16 SE(P) U F 0.70 0.56 P-value M F 0.087 0.0012 in Figure 6.12(b). From the plot, we observed a prolonged time to progression for the treated group, and such a favourable treatment effect was confirmed by the log rank test (p=9.4xl0~ ). 5 Cox Regression Analysis (Semi-parametric) A full Cox proportional hazards model (given in Equation (6.7)) with all the covariates including gender, treatment group membership and baseline IOP, MD and age was first fitted to the adjusted data. All the covariates except the group variable were found to be insignificant at a 5% level. There was also no evidence of a group by gender interaction effect. Refitting the Cox model with only the group covariate again showed a significant treatment effect, but the assumption of the control and the treated groups having proportional hazards was invalid. We proceeded with performing separate Cox regression analyses for the two genders, and the proportional hazards assumption for every covariate that was considered in the full and the reduced models, as described below, was satisfied. A full model with all the relevant covariates including treatment group membership, baseline IOP, MD and age was fitted to the data from each gender group, and all the baseline covariates remained insignificant at a 5% level. The results fromfittingthe reduced model with only the group covariate for each gender are presented in Table 6.8. Similar to the results obtained in the gender-specific analyses under the ITT and Chapter 6. Results 69 Figure 6.12: (a) Estimated Kaplan-Meier survivor functions for the time to progression of the four gender by treatment groups based on the baseline-adjustment approach, (b) E s t i m a t e d Kaplan-Meier survivor functions for the time to progression of the treated and the control groups based on the baseline-adjustment approach. (a) -HH 1—I n 1 ' r—Mr i+T! r- ---^=H--H- •H 1 1 H- •+¥- -F- •+• — - Mc-Jr4H~ • -+1 H—I •— 441—I -+»• h . Control, Female Control, Male Treatment, Female Treatment, Male H h -I h T i m e to progression (years) (b) Control Treatment H—I 1—I 1—HH—h NC = # controls surviving NT = # treated surviving 0 NC=53 NC=.27 NC=12 NC=3 NT=43 NT=38 NT=.22 NT=7 2 4 6 T i m e to p r o g r e s s i o n ( y e a r s ) Chapter 6. Results 70 the P P approaches, the adjusted data showed a significant favourable effect of a 30% IOP reduction within the female group. The male gender seemed to benefit from the treatment, but the evidence was not strong enough to be deemed significant at a 5% level. Based on the results presented, the relative hazard of a treated female patient versus an untreated was estimated to be 0.163, which implied an approximately six-fold decrease in the risk of progression for having a 30% IOP reduction. We also carried out a stratified Cox regression analysis with gender as the strata variable. The stratified Cox model (in Equation (6.10)) did not identify a significant effect of any of the baseline covariates and the strata by group interaction. The group covariate remained significant ( p = 2 . 2 x l 0 ) , and the relative hazard of a treated patient versus an untreated patient -4 was estimated to be exp(-1.6) = 0.202. Parametric Modelling Different parametric models were fitted to the adjusted data, and the Weibull, log-normal and the log-logistic distributions provided satisfactory fits. We again chose to report the results based on the Weibull fit. After adjusting for the group membership covariate, the Weibull hazard model (given in Equation (6.9)) fitted had an estimated group coefficient of -1.13 and hence a relative hazard of exp(-1.13) = 0.32 for a treated patient versus an untreated patient. The group coefficient was significant (p=6.35xl0~ ) and in particular, 4 the risk of progression was reduced by two thirds for having a 30% IOP reduction as compared to the case when untreated. The different methods of analysis of the mean defect data and the time to event data will be compared, and the results from each will be further discussed in Chapter 8. Chapter 7 Simulation 7.1 The Objectives In Chapter 6, we have seen that the intent-to-treat ( I T T ) and the per-protocol ( P P ) analyses of the mean defect ( M D ) data led to quite different results. Such differences can possibly be explained by the rapid decay of M D upon receiving the treatments by the treated patients, which had not been accounted for in both the I T T and the P P analyses. A n immediate concern would be how much of an impact this first phase of change w i t h i n the treated group prior to I O P stabilization has on the estimates of the decay rates, the overall level of the M D and the significance of the time by group interaction effect in the I T T analysis. Consequently, the I T T and the P P analyses might lead to different results. One of the objectives of carrying out a simulation experiment here is to investigate how the results from an I T T analysis and those from a P P analysis compare i n the presence of an initial phase of deterioration. Another objective is to study how the P P and the P C L I N modelling approaches perform in comparing the post-treatmentstabilization rates of progression between the treated and the untreated groups. We simulated M D data for ninety-seven patients to comprise our sample which was the same size as in the original data. T h e data were simulated based on the linear mixed effects ( L M E ) model fitted under the piecewise linear ( P C L I N ) modelling approach, as given in E q u a t i o n (6.5), using different sets of values for the decay rates of the M D for the control and the treated groups. We chose this particular L M E model to be the base 71 Chapter 7. Simulation 72 model for the simulation because it gave the best fit to the original M D data among the three different approaches that were presented in Chapter 6. 7.2 Generating Mean Defect Data The data generation involved two.major steps: the generation of the mean structure of M D specified by the fixed effects, and the generation of the random effects and the within-patient errors. The model for simulating the, data has been given by Equation (6.5): Yij = Poi + PliXuj + P X 2 + 2I PzX + PiXuj + P iX + dj Zi 5 5ij (7.11) Here, i is the patient index and j is the index for the repeated measurements within each patient. The response represents the M D measured at time XUJ. Besides the time covariate, the other four covariates are the group membership (x ), the M D at 2i randomization (a^i); the time by group interaction (x^j = XUJ x x j) and the piecewise 2 component {xuj) which enables the fitting of a segmented M D trajectory with a change point at the treated patients' individual times of IOP stabilization for the treated group and a linear trajectory for the control group. The piecewise component has been defined as %5ij = (XUJ ~ T i)I{xiij > T i) x x S S 2i where Tsi denotes the time taken to IOP stabilization if the ith patient belongs to the treated group; otherwise, Ts% is unobserved. A n d for k = 0,1,5, Pki can be expressed as the sum of the fixed effect term Pk and the random effect term 6^. For patient i (i = 1, 2,...,97), the random effects follow the normal distribution with a zero mean and a covariance matrix D common to all of the patients. The within-patient errors, e^'s are also normally distributed with a zero mean and a covariance structure R j . Hereafter, we Chapter 7. Simulation 73 w i l l refer to the fit to the original data based on the model i n E q u a t i o n (7.11) as "the best fit" unless stated otherwise. 7.2.1 Generation of the Mean Structure Specified by the Fixed Effects For the simulation of the mean structure of the M D data for the iih patient, we specified the part of the model (in Equation (7.11)) involving only the fixed effects, i.e., X;/3 whose j t h component is: A) + Pix + 8 x + B x + piX^j + B x Uj 2 2i 3 3i 5 5ij T h e same design m a t r i x X j and the time to I O P stabilization T i as observed i n the S original d a t a were used i n the simulation. T h e times of repeated M D measurements, the group membership, the initial M D reading at randomization i n the simulated data were thus exactly the same as i n the original data, and the total number of M D observations from a l l the patients was also the same. Moreover, the estimated values of the parameters including 8 , 8 and 8 obtained from the best fit to the original data were used for the 0 2 3 simulation. T h e set of parameters (81,84,85) specifying the decay rates of the M D for the treated and control groups had values altered i n a systematic way for different sets of simulations to reflect different deterioration schemes. Further discussion on the choice of parameter values is given i n Section 7.2.3. 7.2.2 Generation of the Random Effects and the Within-Patient Errors For patient i, we generated a random effects vector bj = (b , bn, b ) from the multivariate 0i 5i normal distribution w i t h mean 0 and a covariance matrix D . T h e m a t r i x D was the estimated covariance structure of the individual patients' random effects from the best fit and it was given by Chapter 7. Simulation 74 2.03 D = 9.57 x 10- 4 -9.69 x 10- 4 9.57xl0- 4 -9.69 x 10~ 4 1.43 x l O - 6 -8.98 x 1 0 - 7 -8.98 x 10~ 7 6.8 x 10~ 7 Because the Laird and Ware L M E model in Equation (4.1) assumes no correlation between the random effects of any two different patients, and that the same covariance structure D is used for all the patients, the covariance matrix for the vector of random effects for all of the patients, (^01, ^11, hi, bo2, &12, &52, • •&0.97, • , ^1,97, &5,97, ) will be block diagonal with the matrix D. For the within-patient error vector for the ith patient, e i} we assumed the indepen- dence covariance structure, i.e., R j = <r I, where a was estimated from the best fit to 2 2 the original data. Although the results from fitting the original data showed that the within-patient errors tend to follow the continuous first-order autoregressive model, the correlation between the errors present in two M D observations of the same patient measured one day apart was found to be about 0.2 on average. As the times of successive measurements as observed in the original data were at least a week apart and in the general case, several months apart, the correlation between the errors present in two consecutive M D levels would be close to 0. It was therefore reasonable to assume independence of the within-patient errors. We consequently simulated within-patient errors as independent observations from the normal distribution with mean 0 and a variance of The random component in the zth patient's M D value would then be defined as Zj&j+ej, where Zj, the design matrix for the random effects to be used for our simulation, was the same design matrix used to fit the original data. Chapter 7. Simulation 7.2.3 75 Different Combinations of the Decay Rates of the Mean Defect In studying the effect of the first phase of rapid decline of the M D within the treated patients on the results of the I T T and the P P analyses of the M D data, one can imagine not only the absolute decay rate but also the rate relative to both the post-IOP-stabilization rate and the rate of the control group will be a determining factor that helps to explain the difference in the results of the two types of analyses. This motivated us to experiment with different combinations of the parameters (Bi,B±,B$) in order to study the effect under different schemes of generalized visual field loss which was measured by the M D . These parameter combinations gave rise to the sets of (PC,PTI,PT2) in Table 7.9 where • B = Pi represents the mean decay rate of the M D for the control group c • PTI = Pi + Pi represents the mean decay rate of the M D for the treated group before IOP stabilization • PT2 = Pi + Pi + Pb represents the mean decay rate of M D for the treated group after IOP stabilization The rationale behind the chosen values of the three parameters is given as follows. The natural history of normal tension glaucoma showed a gradual decay in the M D level, as was observed within the control group in our original data. We therefore assumed a negative mean time slope for the control group, i.e, Pc < 0 in our simulation. In particular, here we considered two different mean depression rates: 0.001721 and 0.003442. The former was the estimate of Pc from the best fit to the original data, and the latter was double the former. The time by group interaction coefficient, as represented by /? in 4 Equation (6.5), must be non-positive so as to give a PTI steeper than Pc- We saw from the original data that the treatment caused an initial decay in addition to the natural depression of the M D prior to IOP stabilization. The simulated data must also reflect such a trend, and therefore for each of the two Pc values, we tried four different PTIS: Chapter 7. Simulation 76 Table 7.9: T h e different decay rates of the M D for the control and treated groups used in the simulation. B (Intercept) = -1.1149, B (Group) = -0.5642, B ( M D at randomization) = 0.8309 0 2 3 PTI = PC Pri = Pc - 0.001256 p = T 1 Pn p c 2 x 0.001256 = Pc - 4 x 0.001256 Pc = -0.001721 -0.001 o -0.0014 ~ -0.001721 P t 2 o P t 2 ~ o P t 2 ~ o P t 2 ~ -0.001 -0.00125 -0.0015 -0.001721 -0.002 -0.00225 -0.0005 -0.001 -0.0015 -0.001721 -0.002 -0.0025 -0.0007 -0.00105 -0.0014 -0.001721 -0.0021 -0.00245 Pc = -0.003442 -0.002 o -0.0025 ~ -0.003 -0.003442 -0.0025 -0.003 o -0.003442 ~ -0.004 P T 2 2 o 2 ~ o P T 2 ~ -0.002 -0.0025 -0.003 -0.003442 -0.004 -0.002 -0.0025 -0.003 -0.003442 -0.004 Chapter 7. Simulation • • PTI = 77 Pc PTI = PC~ «A, for K = 1,2,4 The increment A was chosen to be 0.001256 which was about half size of the estimated value of P4 from the best fit to the original data. So we always had Pc > PTI- The first case assumed no harm was done by the treatment during the time the treated patients were waiting for their IOPs to reduce and stabilize, and that their M D declined at the same rate as the control patients did. The other three cases showed a steeper decay in the treated group as compared to the control group before the stabilization of the treatment. Finally, for every (PCPTI) combination, various PTI values were used to simulate the M D data. The size of increments in PT2 varied depending on the values of Pc and PTI • The PT2 values were chosen to reflect the following three scenarios: 1. p2 T > PC> PTI- the treatment results in a slower mean decay rate of the M D than the rate of the control group after the stabilization of the IOP is achieved. Situations where the mean M D trajectories of the control and the treated groups intersect after IOP stabilization will be considered. The intersection can occur within or outside the time range the patients were followed in the study. 2. P2 = T PC- the treated patients share the same mean decay rate of the M D as the control patients after IOP stabilization. 3. Pc > PT2 > PTI: the mean post-IOP-stabilization decay rate of the M D is slower than before IOP stabilization, but is still faster than the rate of the control group. This implies a definite unfavourable effect of the treatment. Chapter 7. Simulation 7.3 78 Simulation Procedures We followed the steps below in our simulation experiment: 1. Specify the values of (Pi,$4,^5) to be used in simulation. 2. Generate the mean structure for 97 patients as described in Section 7.2.1, using the specified values of (/5i,^4,/? ). 5 3. Generate the random effects and the within-patient errors for 97 patients to obtain the random component of the M D values, as described in Section 7.2.2. 4. Sum the mean structure and the random component to obtain the M D values for the 97 patients. The vector Y of all the M D values from the 97 patients comprises one sample of the simulated mean defect data. 5. Repeat steps 3 and 4 to obtain a total of 500 samples. Denote the Ith. sample by Y W , Z=l,2,...,500. 6. For each of the 500 samples, fit the simulated data Y ' ' using all three approaches: ! intent-to-treat, per-protocol and piecewise linear modelling approaches. Specifi- cally, the first two approaches fit the model given in Equation (6.3). The I T T analysis models the data simulated for all 97 patients, while the P P analysis models the same set of data with the exclusion of one patient who did not meet the criterion for the P P analysis of the original data. The P C L I N modelling approach, on the other hand, fits the model in Equation (6.5) to the simulated data for all 97 patients. The change points of the segmented M D trajectories to be fitted for the treated patients remain the same as observed in the original data, i.e., the originally observed times of IOP stabilization. 7. For each of the 500 samples, obtain parameter estimates of the fixed effects and their standard errors from fitting the simulated data through each of the three Chapter 7. Simulation. 79 approaches. 7.4 Results 7.4.1 C o m p a r i s o n of the I T T a n d t h e P P approaches Because the I T T and the P P approaches fit a linear trajectory to both the control and the treated groups, questions of interest might include "Is the decay rate of the M D the same for the two treatment groups?" and "How does the overall M D level compare between the two treatment groups?" T o answer the first question, we can test the parallelism of the two linear trajectories corresponding to the two treatment groups by testing for the presence of a time by group interaction effect. L e t Btimexgroup be the regression coefficient for the interaction. W e would be testing the following hypotheses: H '• P'c = 0 PT OR Ptimexgroup = 0 . ' (7.12) VS. H : P'Q ^ P a T OR Ptimexgroup 7^ 0 where P' and PT are the time-slopes of the M D for the control and the treated groups, c respectively. T h e null hypothesis H states that there is no interaction effect, a n d the 0 alternative hypothesis H states the presence of an interaction effect. a For each combination of (PC,PTI,PTI) used for simulating the M D data, we computed the p-value for the above hypothesis test for each of the 500 samples and obtained the proportion of rejecting the null hypothesis at a 5% significance level under the I T T a n d the P P approaches, which is presented i n Tables 7.10 and 7.11. W e also used the paired t-test to compare the proportions of rejecting H from the I T T and the P P analyses 0 for each combination of {Pc,PTI,PT2)- T h e proportion of rejection can be treated as the mean of 500 Bernoulli random variables, each of which takes on a value of 1 i f H is a rejected and a value of 0 if H is not rejected. B y the result of the Central L i m i t Theorem, 0 the normality of the proportions of rejection can be approximated as the sample size of Chapter 7. Simulation 80 Table 7.10: Proportions of rejecting the null hypothesis of no time by group interaction effect i n the I T T and the P P analyses, for an M D decay rate of the control group = -0.001721 and different sets of M D decay rates of the treated group. Pc = -0.001721 Parameter values used for simulation Proportion of rejecting H ITT PP PTI PT2 -0.001 0.706 0.732 -0.001721 -0.0014 0.210 0.254 -0.001721 0.072 0.076 -0.001 0.644 0.737 0.302 -0.00125 0.426 -0.0015 0.074 0.112 -0.002977 -0.001721 0.070 0.064 -0.002 0.236 0.138 -0.00225 . 0.600 0.426 -0.0005 0.974 0.990 -0.001 0.588 0.768 -0.0015 0.104 0.162 -0.004232 -0.001721 0.108 0.078 -0.002 0.134 0.296 0.874 -0.0025 0.728 -0.0007 0.756 0.950 -0.00105 0.314 0.688 -0.0014 0.064 0.246 -0.006743 -0.001721 0.154 0.050 -0.0021 0.640 0.279 -0.00245 0.914 0.690 D 500 is reasonably large. The use of the paired t-test is thus validated. T h e values of the time-slope parameter for the control group estimated i n either the I T T or the P P analysis, P' , c were very close to Pc used for simulating the data. T h i s suggested E(P'C) = Pc, where E(.) represents the expected value, and that P' c or p. c is an unbiased estimator Chapter 7. Simulation 81 Table 7.11: Proportions of rejecting the null hypothesis of no time by group interaction effect i n the I T T a n d the P P analyses, for an M D decay rate of the control group = -0.003442 and different sets of M D decay rates of the treated group. Pc = -0.003442 Parameter values used for simulation Proportion of rejecting H ITT PP PTI PTI -0.002 0.998 0.998 -0.0025 0.924 0.918 -0.003442 -0.003 0.336 0.356 -0.003442 0.056 0.050 -0.0025 0.862 0.918 -0.003 0.274 0.360 -0.004698 -0.003442 0.080 0.058 -0.004 0.598 0.496 -0.002 1.000 1.000 -0.0025 0.914 0.818 -0.005953 -0.003 0.218 0.376 -0.003442 0.100 0.078 -0.004 0.622 0.466 -0.002 0.986 0.998 -0.0025 0.700 0.936 -0.008464 -0.003 0.110 0.360 -0.003442 0.152 • 0.048 -0.004 0.818 0.490 0 ' Chapter 7. Simulation 82 Certain trends were noted in the rejection rates of the null hypothesis: • For each fixed pair of (Pc, PTI) used for the simulation, the paired t-test comparing the proportions of rejection showed that the I T T analysis always gave a significantly larger proportion of rejecting H than the P P analysis for cases where B T2 D (i.e., PTI being more negative than B ). Whereas for B c T2 > B (i.e., B c T2 < B c being less steep than B ), the opposite trend occurred, and for most cases, the difference in c the rejection proportions was significant. When j3 T2 > Pc, the slope of the treated group prior to IOP stabilization, P \, T which was always more negative than PT , averaged out with PTI to give a PT 2 that satisfied p \ < PT < PT2 under the ITT.approach, and the two slopes P' T C and PT became quite comparable. However, the P P approach compared the postIOP-stabilization slope of the treated group with the slope of the control group. The difference between P' and PT under the P P approach will be more distinct and c easier to detect, thus resulting in a larger proportion of rejecting the null hypothesis. On the other hand, when PT2 < Pc, the PTI which was more negative than P 2 T gave a slope PT that differed more from B' under the I T T approach. In fact, one c had PT < PT2 < Pc- The P P approach, however, ignored the first portion of data for the treated group before IOP stabilization, so the slope p under this approach T was comparable to PT2 used for simulating the data. Consequently, a more distinct difference between p T and B' was observed in the I T T analysis than was the case c in the P P analysis, and a higher proportion of rejecting the null hypothesis was obtained. • For each fixed Pc and all cases where PT2 = Pc, the data simulated can be treated as if they were generated under the null hypothesis of no time by group interaction under the P P approach because this approach compared the slopes of the control and the treated groups after IOP stabilization was achieved. Indeed, our assumption seemed reasonable as for a 5% test, a rejection rate close to 0.05 was obtained Chapter 7. Simulation 83 regardless of the values of Pc and p \T However, from the I T T analysis, we did hot obtain a proportion close to 0.05 when PT2 = Pc and PT2 PTI- For each fixed Pc and PT2 — Pc, the rejection rate increased with the magnitude of PTI- This agreed with our intuition: the more negative the PTI, i.e, the sharper the decline of the M D during the period awaited for IOP stabilization, the greater the extent to which PT\ will pull down the overall slope PT for the treated group in the I T T analysis. The difference between p T and P' increased and thus led to a stronger evidence against the null hypothesis of no c time by group interaction and a greater rejection rate than in the case of the P P analysis. • For all the special cases where Pc = PTI = PT2, the I T T and the P P approaches resulted in similar rejection rates of the null hypothesis. The paired t-test for each of these special cases did not show a significant difference in the two rejection rates. W i t h Pc = PTI — PT2, the data simulated would be under the null hypothesis of no time by group interaction in both the I T T and the P P analysis, and this explained for having rejection rates close to 0.05 for our test at a 5% level. • For each fixed Pc, the discrepancy in the proportions of rejecting the null hypothesis under the I T T and the P P approaches increased in size as PTI increased. Summary The simulation experiment helped to explain why different results were obtained from analyzing the mean defect data using the I T T and the P P approaches. It also gave insights into the two approaches to the evaluation of general clinical trials that study treatments entailing a lag time for their stabilization and treating diseases whose progress is monitored by following the change of a certain prognostic characteristic over time. In situations where the treatment speeds up the progression rate of a disease before its Chapter 7. Simulation 84 stabilization while later inducing a favourable effect such as slowing down the progression rate, the P P analysis fails to account for the initial phase of rapid progression among the treated patients. Moreover, the approach tends to exaggerate the difference between the progression rates of the treated and the control groups. On the other hand, when the treatment causes a rapid progression not only before but also after the full effect is achieved upon stabilization, the P P analysis tends to diminish the time-group interaction effect and makes it more difficult to detect a difference between the progression rates of the treated and the control groups. Nevertheless, this alone does not necessarily mean that an I T T analysis is preferable to a P P analysis for the evaluation of a clinical trial. It is important to make it clear that an I T T analysis aims to assess the effectiveness of a therapy while a P P analysis attempts to assess the efficacy of the therapy. The P P approach adopted throughout this thesis has a number of pitfalls, which will be further discussed in Chapter 8. However, from the statistical point of view, the most appropriate approach to assessing the treatment effect would be one that accounts for the different patterns of progression before and after the full treatment effect is reached. 7.4.2 Comparison of the P P and the P C L I N approaches Suppose we are interested in knowing how the rate of progression of a disease compares between the treated group and the untreated group after the stabilization of treatment is achieved. We will be testing the following two sets of hypotheses for the P P and the P C L I N analyses, respectively: 0 7^0 (7.13) Chapter 7. Simulation 85 where P' and pr are the time-slopes of the M D for the control and for the treated groups, c respectively under the P P approach. H :PC = PT2 O R & + & = 0 0 / (7.14) vs. H : PC^PT2 OR A & + & ? 0 where Pc and PT2 are the time-slopes of the M D for the control group and for the treated group after IOP stabilization, respectively under the P C L I N approach. The parameters Ptimexgroup represents the time by group interaction under the P P approach, and /? + p 4 represents the true difference between the M D decay rates of the control and the treated groups under the P C L I N approach based on which the data were simulated. The test statistics for the above hypothesis tests will be P P analysis : T PP = ^r S E(Pti x group) u (7.15) X9roup me . P C L I N analysis : T LIN PC (7.16) = SE(p + p ) 4 5 The p-value for testing the hypotheses under the P C L I N approach can be obtained in the same way as presented in Section 6.1 (p.41). Again, for every set of (PC,PT\,PT2) used for the simulation, the proportions of rejecting the null hypotheses in Equations (7.13) and (7.14) at a 5% significance level were computed and are summarized in Tables 7.12 and 7.13. We also used the paired t-test to compare the proportions of rejecting H from 0 the P P and the P C L I N analyses for each combination of {Pc, PT\, PT2)- The proportion of rejection can be treated as the mean of 500 Bernoulli random variables, each of which takes on a value of 1 if H is rejected and a value of 0 if H is not rejected. 0 0 Observe that for all cases where P 2 = Pc, the P P and the P C L I N approaches gave T very close rejection rates in the respective tests. Especially, the P P and the P C L I N rates were close to 0.05 for a significance level of 5%. Also, the paired t-test for all these cases did not show a significant difference in the two rejection rates. It is therefore reasonable to assume the two hypothesis tests were at the same significance level, and it is legitimate 5 Chapter 7. Simulation 86 Table 7.12: Proportions of rejecting the null hypothesis of no difference i n the M D decay rates between the two treatment groups after I O P stabilization in the P P and the P C L I N analyses, for an M D decay rate of the control group = -0.001721 and different sets of M D decay rates of the treated group. Pc = -0.001721 Parameter values used for simulation Proportion of rejecting H PP PCLIN PTI PT2 -0.001 0.732 0.732 -0.001721 -0.0014 0.254 0.226 -0.001721 0.076 0.078 -0.001 0.737 0.734 -0.00125 0.426 0.418 -0.0015 0.112 0.110 -0.002977 -0.001721 0.064 0.070 -0.002 0.138 0.204 -0.00225 0.426 0.538 -0.0005 0.990 0.998 -0.001 0.768 0.766 0.162 -0.0015 0.180 -0.004232 -0.001721 0.072 0.078 -0.002 0.134 0.194 -0.0025 0.728 0.790 -0.0007 0.950 0.956 -0.00105 0.688 0.714 -0.0014 0.246 0.248 -0.006743 -0.001721 0.050 0.058 -0.0021 0.279 0.328 -0.00245 0.690 0.744 0 Chapter 7. Simulation 87 Table 7.13: Proportions of rejecting the null hypothesis of no difference i n the M D decay rates between the two treatment groups after I O P stabilization in the P P and the P C L I N analyses, for an M D decay rate of the control group = -0.003442 and different sets of M D decay rates of the treated group. Pc = -0.003442 Parameter values used for simulation Proportion of rejecting H PP PCLIN PTI . PTI -0.002 0.998 1.000 -0.0025 0.924 0.922 -0.003442 -0.003 0.356 0.358 -0.003442 0.050 0.052 -0.0025 0.918 0.928 -0.003 0.364 0.360 -0.004698 -0.003442 0.064 0.058 -0.004 0.564 0.496 -0.002 1.000 1.000 -0.0025 0.914 0.928 -0.005953 0.392 -0.003 0.376 -0.003442 0.082 0.078 -0.004 0.502 0.466 -0.002 0.998 0.998 -0.0025 0.936 0.948 -0.008464 -0.003 0.360 0.380 -0.003442 0.042 0.048 -0.004 0.538 0.490 0 Chapter 7. Simulation 88 to compare their statistical power. Intuitively, the comparison of the slopes B' and p c under the P P approach and the comparison of the slopes B and B c T2 T under the P C L I N approach should yield very similar results as both are comparing the slopes of the controls and of the treated patients after treatment stabilization is established, but the results from the simulation experiment tell a slightly different story. From the rejection rates presented in Tables 7.12 and 7.13, we noted the following: . • For the difference sets of slopes (PC,PT\,PT2) used for simulating the data, when PT2 > Pc, the P P and the P C L I N approaches rejected the null hypothesis in their respective tests at about the same rate, regardless of the values of pc and p \. T In all but two cases where PTI > Pc, the paired t-test did not show a significant difference in the two rejection rates. This implies that the two hypothesis tests are relatively equally powerful in detecting a difference between the slopes describing the post-IOP-stabilization decay of the M D for the treated and the control groups at a 5% level. • For all cases where PT < Pc, the P C L I N approach gave a higher rejection rate 2 than the P P approach. The paired t-test also showed a significant difference in the two rejection rates for all such cases. The hypothesis test in the P C L I N analysis appears to be more powerful in detecting a difference between the decay rates of the M D after IOP stabilization for the treated and the control groups at a 5% level. In order to explain the above trends, we examined the parameter estimates obtained from the two approaches. The estimated values of the time-slope coefficient, or equivalently, the slope coefficient for the control group (P' from the P P analysis and Pc from c the P C L I N analysis), were very close and they will not be presented here. There was also no systematic discrepancy between the coefficient estimates P' and Pc across the c different sets of (PC,PTI,PT2)', there did not seem to be a bias in P' as an estimator of Pcc (Pc is an unbiased estimator of pc-) We also looked at the following two parameters: Chapter 7. Simulation 89 Table 7.14: Sample mean of the 500 estimated regression coefficients and standard errors in the P P and the P C L I N analyses, for an M D decay rate of the control group = -0.001721 and different sets of M D decay rates of the treated group. Pc = -0.001721 PP Parameter values used for simulation PTI -0.001721 -0.002977 -0.004232 -0.006743 PCLIN Ptimex group(SE) (values in I O " ) 7.71 2.93 3.56 2.84 2.82 0.45 7.42 2.83 4.96 2.86 . 2.39 2.87 0.22 2.77 -2.41 2.77 -4.99 2.75 12.5 2.90 7.50 2.79 2.53 2.88 0.23 2.79 -2.61 2.88 2.84 -7.49 10.3 2.87 6.88 2.85 2.82 3.56 0.13 2.79 -3.72 2.77 -7.09 2.85 4 PT2 -0.001 -0.0014 -0.001721 -0.001 -0.00125 -0.0015 -0.001721 -0.002 -0.00225 -0.0005 -0.001 -0.0015 -0.001721 -0.002 -0.0025 -0.0007 -0.00105 -0.0014. -0.001721 -0.0021 -0.00245 P4 + P 5 . (SE) (values i n 7.38 3.21 0.07 7.12 4.68 2.15 -0.08 -2.70 -5.31 12.3 7.21 2.30 0.001 -2.83 -7.71 10.2 6.67 3.21 -0.02 -3.80 -7.24 io- ) 4 2.78 2.73 2.70 2.70 2.73 2.72 2.64 2.64 2.63 2.76 2.67 2.75 2.65 2.73 2.70 2.72 2.71 2.73 2.65 2.62 2.70 • Ptimexgroupfromthe P P analysis • PA + Ps from the P C L I N analysis For each set of (PC,PTI,PT2) used for simulation, the sample mean of the estimates of the above two parameters and their standard errors across the 500 samples were computed and are tabulated in Tables 7.14 and 7.15. The histograms of Ptimexgroup and /3 + P5 for four randomly selected combinations 4 Chapter 7. Simulation 90 Table 7.15: Sample mean of the 500 estimated regression coefficients and standard errors i n the P P and the P C L I N analyses, for an M D decay rate of the control group = -0.003442 and different sets of M D decay rates of the treated group. /3 = -0.003442 C PP Parameter values used for simulation Ari -0.003442 -0.004698 -0.005953 -0.008464 PT2 -0.002 -0.0025 -0.003 -0.003442 -0.0025 -0.003 -0.003442 -0.004 -0.002 -0.0025 -0.003 . -0.003442 -0.004 -0.002 . -0.0025 -0.003 -0.003442 -0.004 PCLIN Ptime x group(SE) (values i n 1 0 - ) 14.7 2.76 9.80 2.89. 4.54 2.83 0.19 2.78 9.56 2.81 4.67 2.76 0.30 2.81 -5.43 2.80 14.5 2.75 9.53 2.90 4.45 2.72 0.25 2.78 -5.26 2.83 14.5 2.80 9.57 2.78 4.48 2.79 0.05 2.85 -5.47 2.87 4 fa + (3 (SE) (values i n 14.5 9.56 4.30 -0.06 9.37 4.43 0.06 -5.69 14.3 9.36 4.30 0.05 -5.44 14.4 9.48 4.37 -0.10 -5.54 10 ) 2.64 2.75 2.70 2.66 2.68 2.61 2.67 2.67 2.61 2.74 2.58 2.66 2.70 2.65 2.63 2.64 2.71 2.72 5 4 Chapter 7. Simulation 91 of (PC,PTI,PT2) were also plotted in Figure 7.13, where in each histogram plotted the 500 parameter estimates obtained from the 500 samples. From the plots, we saw that the histograms of (3 timexgroup had slightly heavier tails (greater dispersion) than the his- tograms of /?4 + fa. This agreed with what we observed from the standard errors of the estimatesfaimexgroupandfa+fa'-SE0 ) > SE(fa +fa)for all cases. The mean timexgroup SE0 group) timex and SE(fa +fa)were compared using the t-test, and the difference was shown to be significant for every combination of (Pc, PTI, PT2) used in simulation. From Tables 7.14 and 7.15, we also noticed that thefaimexgroup ^always had a larger 1 sample mean than the fa+fa's regardless of the values of Pc, PTI and (3T2, although such a trend was not apparent in the histograms. Moreover, all the histograms resembled the normal distribution, indicating that the normality assumption of Ptimexgroup and fa + fa used in constructing the test-statistics Tpp and TPCLIN for the hypothesis tests was reasonably approximated. Sincefa+fais an unbiased estimator of /3 + p , i.e., E(fa + fa) = P4 + P5, a property 4 5 of the L M E model which gives unbiased estimates of the fixed effects, the systematically larger sample means of the Ptimexgroup^ than those of the fa +fa'ssuggested that E(fai exgroup) > E ( ^ + fa) — fa + fa. To ascertain such a difference in the expected m 4 values, we used the paired t-test to compare the mean values of Ptimexgroup and fa + fa for each combination of (pc, PT\PT2), and a significant difference in the two mean values was demonstrated for all combinations of (PC,PT\PT2)- In other words, based on our assumption that the M D data followed the piecewise L M E model, Ptimexgroup under the P P approach seemed to be a biased estimator of the true difference between the post-IOPstabilization decay rates of the M D of the control and the treated groups. Such a bias may be explained by the difference in the baseline adopted by the P P and the P C L I N approaches. The times of repeated M D measurements that entered the L M E model under the two approaches were measured from two different baseline times. However, it was not clear what actually led to the discrepancy in the mean values of the faimexgroup^ 92 Chapter 7. Simulation Figure 7.13: Histograms of estimated coefficients Pumex group (from the P P analysis) a n d PA + Ph (from the P C L I N . a n a l y s i s ) for randomly selected sets of (PC,PTI,PT2) used i n the simulation, (a) B = -0.001721, B = -0.004232, B = - 0 . 0 0 1 5 (b) B = -0.001721, p = -0.002977, B = -0.005953 (c) B = -0.003442, B = -0.005953, 0 . 0 0 2 5 (d) p = -0.003442, 8 = -0.004698, B -0.003442 JT2 c Tl T1 T2 T2 c c c Tl T2 Tl (a):PP (b):PP S S • IliSS ft 0.0 8 0.0005 0.0 0.0005 0.0010 estimated value of beta(tjme'group) estimated value of Deta{time*group) (a):PCLIN (b): PCLIN i s i! JI_JULJLJ;IJ_ .-IEZ3 estimated value of beta_4+beta_5 estimated value of beta_4+beta_5 (c):PP (d):PP I • §• 8 • CO | 8 S • 0.0005 0.0010 0.0015 -0.0005 0.0 0.0005 estimated value of beta(iime*group) estimated value of beta(time'group) (c):PCLIN (d):PCLIN 8J & s • 1 s 9• 8 • O 0.0005 0.0010 - i. 0.0015 estimated value of beta_4+beta_5 estimated value of beta_4+beta_5 JL Chapter 7. Simulation 93 and the /3 + /3 's. The greater standard error of the Ptimexgroup's as compared to that 4 5 of the /3 + /5 's is possibly the result of a smaller number of M D observations used for 4 5 parameter estimation in the P P analysis. We have seen that in the estimation of the true difference between the slopes of the control and the treated group after IOP stabilization, the estimator Pumexgroup under the P P approach seemed to be biased and less efficient (i.e., having a larger standard error) than the estimator /3 + p under the P C L I N approach. With this knowledge, the reasons 4 5 behind the trends observed in the rejection rates of the null hypotheses in Equations (7.13) and (7.14) become apparent and are given as follows. Suppose B t i m e x g r o u p = C\ and B + B = C for some real constants C\ and C . As Pumexgroup and /3 + J3 are 4 5 2 2 unbiased estimators of B t i m e x g r o u p and 8 + 8 , we have E0 4 5 4 5 ) = C\ and E(/3 +/3 ) timexgroup 4 5 = C . Also notice that the standard errors for the Pumexgroup's and the /3 + /3 's stayed 2 4 5 relatively constant irrespective of the estimated values of the parameters. Hence, it was reasonable to assume the standard errors were independent of the values of the parameter estimates. When /3 -f- ^5 > 0, or 8 T2 4 > B , i.e., in the presence of a favourable c treatment effect in slowing down the M D decay rate after achieving IOP stabilization, earlier results from comparing the mean values of Pumexgroup's and /3 + /5 's suggested 4 that Cx = E(P ) 5 > E04 + p ) - C . A n d since C = /3 + /5 > 0, we have timexgroup 5 2 2 4 5 C\> C > 0. However, SE(Pumexgroup) > SE(/3 + P ). Upon comparison of the relative 2 4 5 size of the parameter estimates and their SE's, we found that Ptimex group SE(Pti ) mexgroup or P Ptimex group SE (Ptimex group) SE(P + p ) 4 5 (See Equations (7.15) and (7.16)). We thus expect to have Chapter 7. Simulation 94 E(T ) = PP „ C ~ L SE (Ptimexgroup) ft , = E(T CLIN) P SE(P4 + PQ) It follows that on average, the p-values for the two. hypothesis tests i n the P P a n d the P C L I N analyses w i l l be very similar and about the same rejection rates w i l l be obtained, as observed i n our simulation results. W h e n P4 + p 5 < 0, or PT2 < Pc, Le., i n the presence of an unfavourable treatment effect (the treatment speeds up the M D decay rate after achieving I O P stabilization), we again expect to have C implies d 2 = E(/3 + J3 ) < E(Ptimexgroup) = C i - Because C 4 5 2 < 0, C 2 < C x is closer t o or above 0. A n d w i t h SE(Ptimexgroup) > S E ( 5 + ^5), the following ( 4 is likely to hold: E(\T \) PP = SE (Ptimexgroup) < Co SE(p 4 + p) E(\T CLIN\) P 5 It follows that on average, the p-value obtained from the P P analysis w i l l be larger than that from the P C L I N analysis for the two-sided hypothesis tests. T h i s explains the smaller rejection rates for the P P analysis. Summary W e have seen from the simulation results that when the d a t a were generated from the L M E model under the P C L I N modelling approach, the P P analysis slightly overestimated and produced less efficient estimates of the difference between the slopes of the treated and the control groups after I O P stabilization. In particular, for cases where the postI O P - s t a b i l i z a t i o n M D w i t h i n the treated patients decayed at a slower rate than the untreated patients (PT2 >.Pc), although the P P and the P C L I N approach gave very similar rejection rates of the null hypotheses stating no difference between the p o s t - I O P stabilization slopes of the two groups, the hypothesis tests corresponding to the two Chapter 7. Simulation 95 approaches seeming equally powerful was simply a coincidence resulted from the biased and the less efficient estimates of the difference. W h e n the mean M D decay rate for the treated group after I O P stabilization was faster than that of the control group (/3 r2 < Pc), which was not the situation present i n our original data, the simulation results suggested that the hypothesis test under the P C L I N approach was statistically more powerful t h a n the counterpart under the P P approach. In summary, the results from the comparison between the P P and the P C L I N approaches implied that the negligence of the lag time i n the P P analysis may give rise to biased parameter estimates and a decreased statistical power of hypothesis tests for treatment comparisons. Chapter 8 Discussion In this thesis, we investigated two-(different approaches to evaluate a clinical trial: the intent-to-treat (ITT) approach and the per-protocol (PP) approach. The former calls for inclusion of all the patients who are randomized in the final analysis of the trial irrespective of problems of missing data, cross-overs, drop-outs and non-compliance, while the latter analyzes a subset of patients who meet the treatment efficacy criteria. We applied the two approaches to the data from the Collaborative Normal Tension Glaucoma Study in which a delay in treatment stabilization was observed in the treated group. Besides the difference in the inclusion of patients, the two approaches we adopted differed in the definition of the baseline time. The ITT approach assumed all the patients were followed starting at the time of randomization, whereas the baseline for the treated group was shifted to the individuals' times of IOP stabilization under the PP approach. Upon IOP stabilization, the effect of having a 30% IOP reduction was regarded as fully achieved. The two baseline definitions were chosen in accordance with the principles of assessing the treatment effectiveness and the treatment efficacy in the ITT and the PP analyses, respectively. In modelling the mean defect (MD) data, a linear mixed effects (LME) model was fitted in both the ITT and the PP analyses. In particular, the MD was assumed to decay linearly with time in each of the treated and the control groups. A random intercept and time-slope were included in the model to allow for different initial MD levels at baseline and different MD decay rates across patients. At a 5% level, gender and baseline 96 Chapter 8. Discussion 97 information including age and I O P did not explain the M D trend over time i n either the I T T or the P P analyses. A significantly slower average M D decay rate for the treated group than the control group as a result of lowering the I O P was demonstrated i n the P P analysis but not in the I T T analysis. Furthermore, the I T T analysis showed that the treated group had a more negative overall M D level than the control group, while the P P analysis d i d not pick out such a difference. Based on these results, quite different conclusions were reached on the usefulness of an I O P reduction strategy i n treating normal tension glaucoma. The P P analysis suggested a beneficial treatment effect, but the I T T analysis concluded an ineffective or even a harmful treatment because the I O P lowering therapy not only failed to slow down the natural decay of the M D but also depressed the overall M D level. A n additional approach to analyzing the M D data, the piecewise linear ( P C L I N ) m o d elling approach, revealed that during the period awaiting the I O P to reduce and stabilize, the treated patients had on average a faster M D decay rate than both the rate after I O P stabilization and the rate of the control group. In light of this finding, the difference in the results of the I T T and the P P analyses can possibly be explained by the fact that the M D d a t a prior to I O P stabilization of the treated patients were excluded from the P P analysis. T h e first phase of sharp decline of the M D pulled down the mean M D level for the treated group, which also happened to be more negative at randomization t h a n the mean level for the control group in the I T T analysis. Besides, the slower decay rate of the M D w i t h i n the treated group that followed after I O P stabilization averaged out w i t h the i n i t i a l rapid decay and resulted in an overall rate that was comparable w i t h that of the controls in the I T T analysis. Therefore, there was weaker evidence to declare a significant time by group interaction than in the case of a P P analysis. T h e one patient who was included in the I T T analysis but not in the the P P analysis only had five M D observations measured from the time of randomization. The excluded patient, who belonged to the treated group, had a similar early M D trend as observed i n the other Chapter 8. Discussion 98 treated patients: there appeared to be a larger rate of decrease in M D before I O P stabilization than the rate afterwards. It was unlikely that the exclusion of this particular patient had led to the difference i n the results from the two approaches. Because the time to I O P stabilization was relatively short compared to the treated patients' follow-up periods, and the M D trend over time was highly variable across patients, any difference between the M D trend before and after I O P stabilization w i t h i n each of the treated individuals was not easily noticeable. A linear model appeared to provide an adequate fit to the data of the treated group. However, upon fitting a segmented linear trajectory for the treated group in the P C L I N modelling approach, the modelling of the M D data was significantly improved marginally. T h e i n d i v i d u a l fitted trajectories of the treated patients also seemed to provide a reasonably good description of the M D observations, especially for patients whose M D patterns differed greatly before and after I O P stabilization. A l t h o u g h the P C L I N approach was data-driven, it was shown to outperform the I T T and the P P approaches we have used in modelling the M D data in our study. The difference i n the conclusions drawn from the I T T and the P P analyses motivated our simulation experiment which aimed to study how a different decay pattern of the M D before and after I O P stabilization within the treated group affected the results of the two approaches. We generated M D data using the best fitted L M E model found from the P C L I N modelling approach. The analysis of the simulated data showed that the P P analysis tends to magnify the difference between the decay rates of the M D between the control and the treated groups in the case where the treated slope became less steep than the control slope after I O P stabilization, but the difference was diminished when the treated slope after I O P stabilization, while not as steep as it had been before the stabilization, remained steeper than the control slope. The poor performance of the P P analysis can be attributed to its exclusion of the data from the treated patients prior to I O P stabilization. A n d in any case where the treatment effect differed before and after Chapter 8. Discussion 99 the 30% I O P reduction was attained, the P P analysis failed to account for the trend in the first phase. Besides, we simulated M D data to compare the performance of the P P and the P C L I N approaches in terms of parameter estimation and statistical power of detecting a difference between the post-IOP-stabilization decay rates of the M D of the treated and the untreated groups. Summaries of the simulation results were already given i n Chapter 7. In applying the I T T and the P P approaches to model the M D data in our study, we also looked into the floor effect issue by including in the linear mixed effects model an indicator variable classifying patients as having or not having advanced normal tension glaucoma at baseline ( A n M D level below -12dB is generally regarded as having an advanced stage of the disease). The interaction between the indicator variable and the time-slope covariate was tested and was found to be insignificant at a 5% level i n both the I T T and the P P analyses. The decay pattern of the M D appeared to be independent of the baseline M D level, as analyzed using either approach. We also compared the mean baseline M D of the treated and the control groups under the I T T and the P P approaches. W h e n the baseline was taken to be the time of randomization for all the patients in the I T T analysis, the two-sample t-test did not suggest a significant difference between the mean M D levels at randomization of the two groups (the observed means were -7.39tiB for the treated group and -6.72<£B for the control group, p=0.49). However, in the P P analysis, the two-sample t-test showed that the two groups had significantly different baseline M D levels (the observed means for the treated and the control groups were -8.93ciB and -6.72dB, respectively, p=0.014). Recall that the P P analysis demonstrated a significantly slower linear decay rate of the M D for the treated group than the control group, and it also showed no evidence of a dependence between the M D decay rate and the i n i t i a l M D level. However, we cannot rule out the possibility that the significant treatment effect was an artifact of the floor effect because as time elapsed since randomization, more and more patients reached the progression end point and the two groups of untreated and treated patients who remained i n the study may have become Chapter 8. Discussion 100 more and more unbalanced i n terms of sample size, mean M D level and variation of the M D measurements. The floor effect may eventually be influential and result i n a fallacy of a beneficial treatment effect. Moreover, the P C L I N analysis showed that the treated patients on average experienced a sharp decline i n M D that was followed by a decay at an even slower rate than the control group. It remained uncertain if the slower decay rate was the result of a beneficial effect of having a 30% reduction of the I O P or, perhaps in part, of the floor effect after the large drop of the M D caused by the I O P lowering therapy. We d i d not analyze the floor effect i n the P C L I N analysis because of the complexity of the L M E model which involved many time related terms (XUJ, x^j and xj 5i in E q u a t i o n (6.5)). We could take the same approach as in the I T T and the P P analyses, i.e., we tested for an interaction effect between each of the above time related terms and the classification variable based on the initial M D level. The significance of these interaction effects may help us study whether the time slopes (fa, fa + fa and fa + fa + fa) are affected by the baseline M D level, but such an approach does not seem to provide an adequate and effective assessment of the floor effect. Besides, the interpretation of these interaction effects can be difficult because x^j and x&j are already interaction terms. A p a r t from the floor effect, a statistical phenomenon called "regression to the mean", also known as the regression effect, might account for the slower M D depression rate observed among patients whose baseline M D levels were highly negative as compared to those who started w i t h less negative M D levels even in the absence of a real treatment effect. T h e regression effect is usually present between any two variables which are correlated. A s an example, suppose we randomly select two patients from the population of normal tension glaucoma patients and whose M D levels at baseline are both less than or greater than the baseline mean. The patient w i t h a baseline M D level farther from the baseline mean w i l l on average have an M D level, relative to the other patient, less far from the mean M D level at a later time, where the i n i t i a l and the subsequent M D levels are likely to be correlated. T h i s illustrates a property of the regression effect: the more extreme the starting value of the M D , the greater the regression to the mean. Patients Chapter 8. Discussion 101 w i t h more extreme negative M D levels initially w i l l on average decrease to a lesser extent than patients w i t h i n i t i a l M D levels closer to the mean. Therefore, at any fixed time since randomization, patients w i t h highly negative M D at baseline w i l l be expected to show a slower M D depression than those w i t h less negative baseline M D levels, irrespective of the existence of a treatment effect. Moreover, the degree of the regression effect depends on the strength of correlation between the two variables. A weaker correlation w i l l lead to a stronger regression effect. W h e n the time between two M D measurements becomes longer, the measurements w i l l be less correlated, resulting in a greater regression to the mean. T h e decay rate of the M D which measures the decrease i n M D relative to time w i l l be more influenced by the regression effect as a longer time elapses from randomization. So far our discussion of the "regression to the mean" has been focused on the M D levels at baseline and at a later time. T h i s statistical phenomenon can also be observed between the two populations of time slopes before and after I O P stabilization w i t h i n the treated group. Intuitively, the two time slopes are correlated because they measure the rate of change in M D on the same patient over the pre-stabilization and post-stabilization periods. A regression effect between the two time slopes w i t h i n the treated groups is likely to be present, and this helps to explain why a steep i n i t i a l M D decay rate might have been followed by a flatter post-stabilization rate than that preceded by a less steep i n i t i a l time slope. In the analysis of the time to I O P stabilization data w i t h i n the treated group, we applied classical methods including the non-parametric K a p l a n - M e i e r method, the semiparametric C o x regression and parametric modelling. A l l three methods d i d not show a significant effect of the age, I O P and M D level at randomization at a 5% level. There was also no strong evidence of a gender effect, although male patients appeared to take a shorter time to achieve I O P reduction and stabilization on average. Nevertheless, it is important to use caution when interpreting the results based on any of the classical methods. T h e three types of therapy (medical, laser and surgical) were not randomly assigned to the treated patients; the investigators decided the type of treatment the patients were Chapter 8. Discussion 102 to receive. T h e study design was therefore observational and not randomized, and the comparison between the surgical and non-surgical groups might be subject to selection bias. Patients who had a filtering surgery likely differed from patients who received only non-surgical treatments with respect to some uncontrolled factors which affect the ease to achieve I O P reduction and stabilization. We looked to see if the M D and the I O P levels at baseline differed significantly between the two groups of treated patients as this baseline information might account for the investigators' decision, and for the failure of the non-surgical treatments i n reducing the I O P to the desired level. T h e surgical group had an average baseline M D level of -8.84dB (SD=4.88dB) and I O P level of 17.ObmmHg (SD=l.55mmHg). T h e corresponding figures for the non-surgical group were -6A2dB (SD=5.49tiB) and 16.83mmHg (SD=2.53mmHg). T h e non-surgical group appeared to have a less negative M D and a lower I O P at the time of randomization, but the evidence was not strong enough to demonstrate a significant difference i n the mean level of either variable using the two-sample t-test at a 5% level (p=0.14 for M D and p=0.74 for I O P ) . However, we d i d not have information on other demographics and ophthalmic conditions which were the basis of the choice of treatment, and thus we cannot identify any source of selection bias. Moreover, some patients who were originally assigned the medical or laser therapy switched to the surgical treatment due to failure to reduce the I O P level using the non-surgical treatment. A m o n g the 43 treated patients, only two were initially given the surgical therapy. B y counting patients who had the medical or laser treatment i n the first place and switched to surgical treatment later i n the surgical group, the extra waiting time before the change of treatment was included as part of the time to I O P stabilization, thus lengthening the observed time to I O P stabilization w i t h i n the surgically treated group. O u r results which showed a longer waiting period for I O P stabilization for the surgical group might be attributable solely to the additional time prior to the change of treatment rather than a true difference between the effects of the non-surgical (medical or laser) treatments and the filtering surgery. B u t because the time at which the patients switched to the surgical treatment was not available, we were unable to make appropriate adjustments to the observed time to I O P stabilization Chapter 8. Discussion 103 when comparing the IOP-stabilization experiences from receiving the different types of therapy. In analyzing the time to progression data, both the I T T and the P P analyses were carried out. T h e y both showed no evidence of a significant effect of the age at baseline, I O P and M D at baseline, using any of the three standard methods. W h e n analyzing the d a t a from all the patients using the Kaplan-Meier and the parametric modelling methods, the treated group was found to have a significantly prolonged time to progression relative to the untreated group under both the I T T and the P P approaches. In the C o x regression analysis under either the I T T or the P P approach, the treatment of having a 30% reduction i n the I O P seemed to have no effect on the male patients, while female patients benefited from the treatment which significantly reduced the risk of progression. Despite the difference i n the treatment effect observed between genders, the treatment by gender interaction was found to be insignificant in the C o x regression. It is worth noting that among the 96 patients included in both the I T T and the P P analyses, 34 were male and 62 were female. The number of patients who reached the progression end point was 10 and 23 w i t h i n the male and the female groups, respectively. W i t h the small number of male patients and a smaller proportion of uncensored times to progression (29.4% for males versus 37.1% for females), the test for the presence of a treatment effect w i t h i n the male group w i l l be less powerful than the same test w i t h i n the female group. T h e treatment might indeed reduce the progression risk for the male patients, but there was not enough power to detect a treatment effect. In the case where the treatment m a i n effect is deemed insignificant, an interaction effect between treatment group and gender w i l l be even harder to detect. W h e n we applied the non-parametric method to analyze the time to I O P stabilization d a t a and the time to progression data, the log rank test was used to compare the survival experiences of two or more populations. A n alternative to the log rank t e s t i s the W i l c o x o n test [31]. Readers are referred to Collett [31] for details on the statistical Chapter 8. Discussion 104 properties of and the difference between the two tests. In general application, the choice of test depends on the alternative hypothesis that is to be tested. Collett discussed that when the alternative to the null hypothesis of no difference between the survival time distributions of two groups is that the hazard for an individual in one group is proportional to the hazard of a similar individual in the other group at any given time, the log rank test is more appropriate. For other types of departure from the equality of survival time distributions between two groups, the Wilcoxon test w i l l be more suitable. O u r analyses showed that the proportional hazards model fitted reasonably well to the time-to-event data. A s the proportional hazards assumption held in our case, to test for a difference between the survival experiences of two different groups in our data, which can be described by an alternative hypothesis of a non-unity proportional hazards between the two groups, a log rank test seemed appropriate. C o m p a r i n g the results from the I T T and the P P analyses, we noticed that for all of the log rank test and the hypothesis tests used in the C o x regression analysis and the parametric method, the P P approach demonstrated a smaller reduced risk of progression for the treated group than the I T T approach did. The baseline set at the time of I O P stabilization for the treated group under the P P . approach shortened the time to progression as compared to its I T T counterpart. Therefore, when the 30% reduction of I O P successfully slowed down the hazard to progression as suggested by the significant treatment effect, the shorter time to progression for the treated group i n the P P analysis led to a smaller difference i n the hazards or time to progression between the two treatment groups, thus explaining the less prominent reduction in the risk of progression for the treated group. Indeed, the group coefficient in the C o x model as estimated by the I T T and the P P approach had very close standard errors. Therefore, w i t h a group coefficient estimate farther from the 0 value in the I T T analysis, a smaller p-value and hence a more significant result were obtained. The same was observed in the results from fitting the W e i b u l l hazard models. Chapter 8. Discussion 105 We have seen in our study that both the I T T and the P P analyses showed a beneficial treatment effect in reducing the risk of progression. However, in general applications of the two approaches to clinical trials involving a delay in treatment stabilization, two other possible scenarios may arise: • W h e n the treatment effect does exist but is not very strong, the I T T approach might show a significant result for the treatment effect while the P P approach does not. T h e time to progression w i t h i n the treated group, which is measured starting from the time at which the treatment stabilization is established, is shorter than its I T T counterpart, and hence shows a weaker evidence against the null hypothesis stating no treatment effect. The argument of a P P approach being able to detect a treatment effect better than an I T T approach is not justifiable. Moreover, delaying the baseline time to the time of achieving treatment stabilization for the treated patients often results in a smaller sample size due to potential drop-outs during the pre-stabilization period. A s a result, statistical tests performed in the P P analysis are prone to have a lower power. • W h e n the treatment effect does not exist, i.e., the 30% lowering of the eye pressure does not alter the risk of progression before or after the treatment stabilization, the I T T analysis should not reject the null hypothesis of no treatment effect, subject to a type I error of size equal to the significance level of the hypothesis test. However, the time to progression for the treated patients analyzed i n the P P analysis is its I T T counterpart shortened by an amount equivalent to the waiting time for treatment stabilization. If such a waiting time is sufficiently long, the d a t a might show a significant difference in the time-to-progression experience between the treated and the control groups, and subsequently result i n a type II error. Even worse, one might conclude an unfavourable treatment effect based on the results from the P P analysis when indeed the treatment is neither beneficial nor harmful. W h i l e the I T T and the P P approaches can lead to different results and conclusions on Chapter 8. Discussion 106 the usefulness of a treatment due to the difference in their baseline definitions, analyzing d a t a w i t h a delay in treatment stabilization by means of a multistate model avoids the problem of defining a reasonable baseline for the control group to correspond to the lag time observed in the treatment group. A multistate model describes treatment stabilization as a separate event which occurs between the time of randomization and the time of reaching an end point of interest. For our study, we applied the disability model and performed separate analyses of the times of transition to the two states: the state of I O P stabilization and the state of disease progression. A n advantage of modelling hazards for the times to different states separately is the flexibility of incorporating different covariates that are relevant to the various transitions between states. In addition to modelling the individual transition hazards, we fitted C o x and stratified C o x models w i t h a time-dependent indicator variable for treatment stabilization. If the d a t a from treated patients who reached the progression end point before I O P stabilization were available, fitting the C o x model (Equation (5.2)) w i t h the group covariate and the timedependent IOP-stabilization indicator variable would allow us to assess both the effect of a stable I O P reduction on the progression hazard w i t h i n the treated patients, and the difference between the time-to-progression experiences of the treated and the control groups. However, such data were unavailable for our analysis, and to avoid computational problems (p.65), we did not include the group covariate in the C o x model and thus could not assess whether the treated and untreated groups had different risks of progression. W h e n we analyzed the data from patients of both genders, we obtained a highly significant favourable result for the effect of a stable I O P reduction on the progression hazard w i t h i n the treated group ( p = 7 . 6 x l 0 ~ ) . The result showed about a four-fold decrease in risk 4 of progression after a stable I O P reduction was achieved. T h e gender-specific analyses, on the other hand, gave rather different results for the opposite genders. T h e effect of having a stable reduced I O P was found to be insignificant among the male treated patients, while w i t h i n the female treated patients, a stable reduced I O P had a favourable effect of reducing the risk of progression to about one-sixth of the same risk i n the case of not having a stabilized I O P reduction. Chapter 8. Discussion 107 A s discussed earlier, the presence of a m i n i m u m M D level that can be attained in the course of normal tension glaucoma poses a potential floor effect which complicates the comparison of the M D decay rates between the control and the treated groups and may lead one to a false impression of the existence of a treatment effect. Such a floor effect might also affect the comparison of the time-to-progression experiences between treatment groups. A s the progression end point in our study was defined as having a decrease of at least l O t i B in M D relative to the level at the time of randomization, patients who began w i t h an M D level close to the lower bound had limited room to achieve the required decrease and hence took a longer time to reach the progression end point. These patients tend to show a smaller risk of progression not as a result of a treatment effect but rather a floor effect. Indeed, when the treated and the control groups have significantly different mean M D levels at baseline or at a later time point during the course of the disease, the floor effect may come into play and affect treatment comparisons. In the presence of both a floor effect and a beneficial treatment effect which prolongs the survival time to progression, if the treated group happens to have a more negative mean M D level than the untreated group on average, the treatment effect w i l l be biased i n favour of the treatment (an overestimation of a beneficial treatment effect). O n the contrary, if the control group had a mean M D level closer to the lower limit, the beneficial treatment effect w i l l be masked by the floor effect. T h e pseudo-reduction i n the risk of progression caused by the floor effect within the control group may be about the same extent as the true reduction caused by the treatment itself w i t h i n the treated group. A l t h o u g h we did not explicitly address the floor effect issue i n the analysis of our time-to-event data, it is important to be aware of this common phenomenon and its potential impact on the results of treatment group comparisons. T h e P P approach adopted throughout this thesis compared a control group from the time of randomization to a treated group from the time at which the stabilization of the treatment was established. It attempted to assess the efficacy of a 30% I O P reduction i n slowing down the progression of normal tension glaucoma, but the approach was Chapter 8. Discussion 108 unsatisfactory because of its two major pitfalls: firstly, the baseline was defined at the time of randomization for the control group and at the time of individuals' times of I O P stabilization for the treated group. The two-sample t-test showed that the mean baseline M D level was significantly lower within the treated group (p=0.014). Indeed, during the period awaited for I O P reduction and stabilization, the M D of the treated patients dropped considerably. The paired t-test comparing the mean M D levels at randomization and at the time of I O P stabilization w i t h i n the treated group gave a significant result ( p = 2 . 0 x l 0 ) . B y comparing the two groups from two different baselines without t a k i n g - 4 into consideration the treatment lag time, the desirable properties of randomization of treatment assignment w i l l be lost. The difference between the mean baseline M D levels of the two groups in the P P analysis illustrates the lack of homogeneity of different treatment groups. Unlike the experimental approach of an I T T analysis, a P P analysis is essentially observational because it analyzes only a subset of patients who are originally randomized and meet the efficacy criteria. A m o n g patients who qualify for inclusion for the P P analysis, the control group w i l l most likely differ from the treated group w i t h respect to other baseline and demographic characteristics besides the treatment assignment. Problems of confounding and bias in results may then arise: any difference between the two groups may then be attributed to some uncontrolled factors other than the treatment effect. A n earlier discussion of the floor effect already pointed out that w i t h different baseline mean M D levels of the two groups, the assessment of the treatment effect can be confounded by the floor effect. Here we see that randomization alone does not guard against bias in treatment group comparison if an inappropriate approach such as the P P approach is used. Secondly, the P P approach seemed to give biased estimates of the time by group interaction effect in the analysis of the longitudinal M D data, as suggested by the simulation results. T h e simulation study also found that the P P approach was less powerful than the P C L I N approach i n detecting a difference between the decay rates of the treated and the control groups under some circumstances. A l t h o u g h in our study the number of patients Chapter 8. Discussion 109 who were originally randomized but were excluded from the P P analysis was small, one can imagine i n cases where a large number of patients does not meet the efficacy criteria, especially when the treatment lag time is long compared to the patients' follow-up period, the sample size in the P P analysis w i l l be reduced greatly. T h e number of patients or observations included in the P P analysis is usually smaller than the planned sample size at the design stage, but how much of a loss in sample size due to the exclusion of patients from analysis is often difficult to predict before a clinical trial is conducted. A s a result, the desired power of statistical tests can never be achieved. Recruiting more patients to compensate for the loss i n sample size is not only uneconomical but also infeasible i n most of the cases. In seeing the potential problems of bias and diminishing power of the P P approach i n the analysis of the glaucoma data and in the simulation experiment, i n order to have a valid efficacy assessment in the presence of a lag time in treatment stabilization, we must make appropriate adjustments to the control group to correspond to the delayed treated group. We exploited our findings about the distribution that the time to I O P stabilization d a t a followed in our baseline-adjustment approach. The approach adjusted the baseline for the control group to correspond to the lag time i n treatment stabilization i n the treated group. We simply shifted the baseline for each control patient from the time of randomization by a time randomly generated from the W e i b u l l distribution originally fitted to the times to I O P stabilization for the treated patients in our data. B y adjusting the baseline in this way, we were assuming that in the absence of a treatment effect, the time taken to reach I O P stabilization followed the W e i b u l l distribution. In comparison to the I T T and the P P approaches, our baseline-adjustment approach demonstrated a stronger treatment effect, i.e., the treated group showed a larger reduction in the risk of progression relative to the control group than in the I T T and the P P analyses. Such a trend was not readily observed in the plot of Kaplan-Meier estimates of survivor functions (Figure 6.12(b)). T h e gender-specific C o x regression analyses gave an estimate of the relative hazard of a treated versus an untreated patient of 0.3 for the male gender Chapter 8. Discussion 110 and an estimate of 0.16 for the female gender. Correspondingly, the I T T and the P P estimates were respectively 0.33 and 0.38 for the male gender, and 0.16 and 0.24 for the female gender. In the parametric analysis, a Weibull fit adjusted for the group membership covariate gave a coefficient estimate of-1.13 whereas the I T T and the P P analyses estimated the group coefficient to be -0.878 and -0.754, respectively. A stronger treatment effect shown by the baseline-adjustment approach as compared to the P P approach was indeed expected because the seemingly shorter time to progression among the untreated patients demonstrated in the P P analysis was truncated as a result of the baseline shift, and the difference between the time-to-progression experience between the treated and the untreated groups became even more evident. W h i l e the adjustment made to the baseline of the control group helped to reduce the imbalance of the two treatment groups at their respective baselines induced by the lag time i n treatment stabilization, any approaches involving baseline matching between treatment groups do not guarantee comparable groups at the adjusted baselines. Furthermore, these approaches often fail to make use of all the available observations originally recorded. For example, repeated measurements taken at times before the adjusted baseline may have to be discarded because they do not have meaningful contribution to a longitudinal or a time-series analysis. In summary, the piecewise linear model gave the most reasonable fit to the M D data as it accounted for the sharp M D decline before the stabilization of the I O P w i t h i n the treated group, while the I T T and the P P approach fitted a linear model and the lag time was ignored. For the survival data, the multistate model provided an appropriate means of assessing the effect of I O P stabilization on the time-to-progression experience i n normal tension glaucoma. It was particularly useful because it can flexibly accommodate the event of I O P stabilization whose time varied across the treated patients. Chapter 9 Conclusions For the analysis of clinical trials i n the presence of a lag time i n the stabilization of treatment, an efficacy-assessment approach entails comparisons between treatment groups after the establishment of treatment stabilization. In evaluating the efficacy of a 30% I O P lowering therapy for normal tension glaucoma i n our study, we adopted a per-protocol ( P P ) approach that compared a control group from the time of randomization and a treated group from the patients' individual times of treatment stabilization ( I O P stabilization) , and a l l the data from the treated patients collected prior to I O P stabilization were excluded from the analysis. A s treated patients generally experience changes i n disease state during the period awaited for treatment stabilization, neglecting the lag time led to the invalidity of our P P approach for efficacy assessment. Based on the results from our simulation study of the M D data, we saw that failure to appropriately account for the lag time i n the P P analysis can result i n bias i n parameter estimation and d i m i n ishing power of statistical tests. Therefore, a plausible method for assessing treatment effects after the stabilization of treatment should take the lag time into consideration and make proper baseline time matching between the treatment groups. In any case, it is important to be aware of the fact that an adjustment made to the baseline times may produce heterogeneous treatment groups that differ w i t h respect to certain baseline characteristics. O u r proposed P C L I N approach i n the longitudinal analysis and the multistate modelling approach w i t h the use of a single C o x model w i t h time-dependent treatement stabilization covariate i n the time-to-progression survival analysis not only incorporated information on I O P stabilization into the modelling of the data, but also 111 Chapter 9. Conclusions 112 followed the intent-to-treat ( I T T ) principle. B y including all patients who were randomized i n the final analysis of the trial as in the case of an I T T analysis, both approaches tend to preserve homogeneity of the treatment groups and hence provide an unbiased means of treatment group comparisons. O n the contrary, in using the P P approach, which analyzed a subset of patients/data which fulfilled the treatment efficacy criteria, randomization of the patients was lost and a reduced sample size resulted. Subsequent analyses of this subset of data was prone to selection bias and loss i n statistical power. In conclusion, when there is a delay in treatment stabilization, a n , I T T analysis that takes into account the lag time remains an ideal approach to the evaluation of a clinical trial. In analyzing the time-to-progression data, the multistate modelling approach is preferred to the baseline-adjustment approach because the former evaluates the overall clinical effectiveness of the I O P reduction therapy in an unbiased fashion. A l t h o u g h the latter seemed to provide a reasonable baseline time correspondence between the treated and the control groups, and showed a better efficacy assessment than our P P approach, the assessment was likely to be biased. Chapter 10 Future Research The m a i n focus of this thesis has been on interpreting the difference i n the results and comparing the performance of the intent-to-treat and the per-protocol analyses i n evaluating clinical trials i n the presence of a delay i n treatment stabilization. T h e intentto-treat ( I T T ) approach included all the patients who were randomized into the final analysis and patients were followed starting from the time of randomization. W h i l e the criteria according to the I T T principle as previously described compose the only definition of the I T T approach, there can be different ways of defining a P P approach because the treatment efficacy criteria which a P P approach is based on vary according to how experimenters view a therapy as efficacious. In the context of our study, a P P analysis aims to assess a treatment effect after the stabilization of treatment is established. We could have defined a different set of efficacy criteria for our P P analysis which assumes a different baseline other than the time of randomization for the control patients or includes a different subset of patients for analysis. O u r baseline-adjustment approach used i n analyzing the time-to-progression data adjusted the baseline times for the control patients by m a k i n g use of the distribution which the time to I O P stabilization w i t h i n the treated group followed. E x p l o r i n g other possible methods of baseline adjustment for better time correspondence between treatment groups remains a future research topic on efficacy assessment. In the simulation experiment, the mean defect data were generated based on the piecewise linear mixed effects model, and no missing data problem due to drop-outs was 113 Chapter 10. Future Research 114 addressed. In light of the presence of a possible floor effect imposed by the lower l i m i t of the M D level that can be attained, all the simulated M D values that fall below the lower bound should be excluded from the analysis. We can treat patients as if they drop out of the study at the time when their M D reaches the lower limit, and consequently, in the simulation experiment, we could regard the M D data that fall outside the realistic range as non-ignorable missing data. In particular, a more negative decay rate w i l l result in a higher proportion of missing data, and the impact the missing observations have on the results of the I T T and the P P analysis requires further research. Investigating such an impact through simulations can lead to a better understanding of the floor effect. Moreover, patients whose M D levels at baseline are close to the lower l i m i t probably show a different trend of depression and especially a slower M D decay rate than those who have less negative M D at baseline. We might simulate different M D slopes for patients w i t h a wide range of different i n i t i a l M D levels and study the floor effect on the outcome. To account for the delay i n the treatment effect present i n our data, we fitted a piecewise linear mixed effects model to the mean defect data w i t h i n the treated group, and fitted the multistate disability model to the time-to-progression d a t a by means of a time-dependent C o x model w i t h an indicator variable for I O P stabilization. In both the longitudinal and the survival analyses, we assumed a threshold treatment effect. Different decay rates of the M D before and after I O P stabilization were assumed for the treated patients. The decay rate did not change gradually in the vicinity of the time of I O P stabilization; the treated patients followed a different decay pattern after I O P stabilization. Similarly, by using a time-dependent indicator variable for I O P stabilization i n the C o x model, the treatment of having a 30% I O P reduction w i l l be treated as not having taken any effect before the stabilization and as having reached its full effect afterwards. T h e abrupt change from no treatment effect to a full effect at the time of I O P stabilization might not be realistic and does not seem to describe accurately the effect of the I O P lowering treatment. We can consider a linear lag model or other non-threshold types of lag models for the treatment effect to better capture a gradual achievement of Chapter 10. Future Research 115 a full effect over time. The use of lagged models was mostly discussed i n the context of survival analysis. Implementing the same idea in longitudinal analysis is worthy of future work, and success in the implementation w i l l enhance our understanding of the effect a treatment has on the course of a disease in the presence of a lag time in treatment stabilization . O n the other hand, when analyzing the multistate disability model, we can apply a different approach proposed by Andersen [27], who used transition probabilities in the parameter estimation procedure. T h i s approach involves fewer assumptions than the time-dependent C o x regression analysis that was applied in this thesis. It is also more applicable in general cases where the treatment groups do not have proportional hazards. However, the parameter estimation procedure is far more complex and computationally intensive. Throughout this thesis, the M D data and the time-to-event data were analyzed separately. T h e M D level was correlated w i t h the time to progression because the M D is a measure of the generalized visual field loss and the progression end point was determined by a persistent presence of a cluster of depression points in the visual field based on the four-of-five criteria adopted by the Collaborative N o r m a l Tension G l a u c o m a Study. B e i n g able to incorporate the information on the degree of generalized visual field loss into the analysis of the time-to-progression data w i l l most likely improve the modelling of the hazard of progression and therefore provide insights into the prediction of the survival rates (without reaching the progression end point) for patients w i t h normal tension glaucoma. In particular, as repeated measurements of the M D are available in our data, one can consider modelling the longitudinal data and the survival data simultaneously. Wulfsohn and Tsiatis [32] proposed a joint modelling approach which made use of a joint likelihood of the covariate repeatedly measured and the survival process for parameter estimation. T h e y considered the linear mixed effects model for the covariate process and the C o x proportional hazards model for the survival process. The two models were used for the separate analyses of the M D data and the time to event data i n our study and they were found to provide reasonably good fits to the data. Therefore, the joint Chapter 10. Future Research 116 modelling approach seems to be readily applicable to our case. A l l of the above recommendations for future research deal w i t h the issue of a lag time in treatment stabilization in the evaluation of clinical trials. Some of the suggested analytical approaches such as adjusting baseline times for the control group are ad-hoc and do not have a general application i n practice. If clinical researchers foresee a lag time for the achievement of a full treatment effect, it would be useful to come up w i t h remedial actions at the design stage of a clinical trial. For example, problems of patient drop-outs before treatment stabilization and inability to achieve full treatment effect are associated w i t h reduced sample sizes. Measures that correct for patient loss i n the design of clinical trials w i l l free researchers from future problems regarding insufficient statistical power of treatment comparison procedures. The corrective measures are especially advantageous as problems brought about by the treatment delay are often difficult to rectify after the trial is carried out. In any case, it Is equally important to address the treatment lag issue at the design stage and in the analysis of a clinical trial. Bibliography Halperin, M., Rogot, E. G u r i a n , J . and Ederer, F. Sample Sizes for M e d i c a l Trials w i t h Special Reference to Long-term Therapy. Journal of the Chronic Diseases. 1968;21:13-24. L i p i d Research Clinics Program. T h e Coronary P r i m a r y Prevention T r i a l : Design and Implementation. Journal of the Chronic Diseases. 1979;32:609-31. Self, S., Prentice, R., et al. Statistical design of the Women's Health T r i a l . Controlled Clinical Trials. 1988;9:119-36. Physicians' Health Study Steering Committee. Physicians' Health Study Protocol. Brookline Mass: Harvard Medical School, Department of Medicine; 1983. Zucker, D. and Lakatos, E. Weight L o g R a n k T y p e Statistics for C o m p a r i n g Survival Curves W h e n There Is a T i m e L a g i n the Effectiveness of Treatment. Biometrika. 1990;77:853-864. Luo, X . , Turnbull, B. W . , et a l . Regression for Censored Survival D a t a w i t h L a g Effects. Communications in Statistics. Theory and Methods. 1994;23:3417-3438. Collaborative N o r m a l Tension G l a u c o m a Study Group. Comparison of G l a u c o m a tous Progression between Untreated Patients w i t h N o r m a l Tension G l a u c o m a and Patients w i t h Therapeutically Reduced Intraocular Pressure. American Journal of Ophthalmology. 1998;126:487-497. Collaborative N o r m a l Tension Glaucoma Study Group. T h e Effectiveness of Intraocular Pressure Reduction i n the Treatment of N o r m a l Tension G l a u c o m a . American Journal of Ophthalmology. 1998;126:498-505. Collaborative N o r m a l Tension G l a u c o m a Study Group. N a t u r a l History of N o r m a l Tension Glaucoma. Ophthalmology. 2001;108:247-253. Collaborative N o r m a l Tension Glaucoma Study Group. Risk Factors of Progression of V i s u a l F i e l d Abnormalities i n N o r m a l Tension Glaucoma. American Journal of Ophthalmology. 2001;131:699-708. Graefe, V . Uber die Iridectomie bei Glaucom und uber den glaucomatosen process. Graefe Arch Ophthalmol. 1857;3:456. Wilensky, J . T . and Gieser, D. K. Low-Tension Glaucoma. In: Weinstein G . W . ed. Open-Angle Glaucoma. New York: Churchill Livingstone; 1986:58. 117 • Bibliography 118 T h e G l a u c o m a Research Foundation. Gleams. 2001;18(3):4. Sassani, J . W., ed. Ophthalmic Fundamentals: Glaucoma. New Jersey: Slack;1999. Andersen, K. E. and Jeppesen, O. K. Classifying Visual Field Data. Master's Thesis. Department of M a t h e m a t i c a l Sciences, A a l b o r g University; 1998:9. Cartwright, M . J . and Anderson, D. R. Correlation of A s y m m e t r i c Damage w i t h A s y m m e t r i c Intraocular Pressure in N o r m a l Tension Glaucoma. Arch Ophthalmol. 1998;106:888-890. Crichton, A., Drance, S. M., Douglas, G . R. and Schulzer, M . Unequal Intraocular Pressure and Its Relation to Asymmetric V i s u a l F i e l d Defects i n Low-Tension Glaucoma. Ophthalmology. 1989;96:1312-1314. Haefliger, I. O. and Hitchings, R. A . Relationship between A s y m m e t r y i n V i s u a l F i e l d Defects and Intraocular Pressure Difference i n an Untreated N o r m a l Tension G l a u c o m a Population. Arch Ophthalmol. 1990;68:564-567. Heijl, A . Humphrey F i e l d Analyzer. In: Drance S. and Anderson D. ed. Automatic Perimetry in Glaucoma. A Practical Guide. Orlando: Grune & Stratton;1985:132. Lachin, J . M . Statistical Considerations in the Intent-to-Treat Principle. Controlled Clinical Trials. 2000;21:167-189. Indrayan, A . and Sarmukaddam, S. B. Medical Bio statistics. New York: Dekker;2001:53. Marcel B u r t o n , P., G u r r i n , L, and Sly, P. Tutorial i n Biostatistics: E x t e n d i n g the Simple Linear Regression M o d e l to Account for Correlated Responses: A n Introduction to Generalized E s t i m a t i n g Equations and Multi-level M i x e d M o d e l l i n g . Statistics in Medicine. 1998;17:1261-1291. L a i r d , N. M . and Ware, J . H. Random-Effects Models for L o n g i t u d i n a l D a t a . Biometrics. 1982;38:963-974. K a p l a n , E. L. and Meier, P. Nonparametric E s t i m a t i o n from Incomplete Observations. Journal of the American Statistical Association. 1958;53:457-481. Cox, D. R. Regression Models and Life Tables (with Discussion). Journal of the Royal Statistical Society. B. 1972;34:187-202. Hougaard, P. Analysis of Multivariate Survival Data. New York: Springer;2000. Andersen, P. K. Multistate Models i n Survival Analysis: A Study of Nephropathy and M o r t a l i t y i n Diabetes. Statistics in Medicine. 1988;7:661-670. [28] Crowley J . and H u , M . Covariance Analysis of Heart Transplant Survival D a t a . In: Hougaard, P. Analysis of Multivariate Survival Data. New York: Springer;2000. Bibliography [29] Cox, D. R. and Oakes, D. Analysis of Survival Data. New York: H a l l ; 1984. Chapter 8. 119 C h a p m a n and [30] Lawless, J . F. Statistical Models and Methods for Lifetime Data. New York: J o h n W i l e y & Sons;1982. 344-345,365,392-394. [31] Collett, D. Modelling Survival Data in Medical Research. New York: C h a p m a n and H a l l ; 1994. 43-45. [32] Wulfsohn, M . S. and Tsiatis, A . A . A Joint M o d e l for Survival and L o n g i t u d i n a l D a t a Measured w i t h Error. Biometrics. 1994;53:330-339.
- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Theses and Dissertations /
- Analyses of longitudinal and time-to-event data in...
Open Collections
UBC Theses and Dissertations
Featured Collection
UBC Theses and Dissertations
Analyses of longitudinal and time-to-event data in a reandomized clinical trial in the presence of a.. Yin, Eugenia Yu Hoi 2003
pdf
Page Metadata
Item Metadata
Title | Analyses of longitudinal and time-to-event data in a reandomized clinical trial in the presence of a lag time in the stablization of treatment |
Creator |
Yin, Eugenia Yu Hoi |
Date | 2003 |
Date Issued | 2009-10-29T19:10:48Z |
Description | Randomized controlled clinical trials (RCTs) are generally considered to be the best experimental setting for assessing new medical therapies. In medical research, the evaluation of RCTs is often based on two approaches: the commonly recommended intent-to-treat (ITT) analysis, and the more controversial per-protocol (PP) approach, which respectively attempt to assess the clinical effectiveness and the efficacy of a therapy. In the presence of a variable lag time in treatment stabilization following randomization, the two approaches may differ not only in their patient inclusion and exclusion criteria, but also in their definitions of the baseline time, from which follow-up is to be measured. In this work, ITT and PP analyses are applied to the evaluation of an eye pressure lowering therapy, in data from the Collaborative Normal Tension Glaucoma Study. In this study, the therapeutic intervention consisted of achieving a 30% reduction in intra-ocular pressure, and necessitated a lag time before the lowered pressure level became stable. This thesis includes longitudinal and survival analyses, based on measurements taken on some of the main variables in this study. In this case, the PP approach defines baseline time in the treated group as the time at which treatment stabilization has been achieved. It thus loses some of the advantages of randomization, and may suffer potential bias in parameter estimation as well as diminishing statistical power in testing the treatment effect. We investigate these potential problems through some simulation work. While the ITT and the PP approaches fail to account for the delay in treatment stabilization, we also develop a multistate model for survival analysis and a piecewise linear mixed effects (LME) model for longitudinal analysis, both of which address the lag time problem in assessing the effectiveness of the therapy. Finally, we consider a baseline-adjustment approach to match the control group to the delayed treatment group for an efficacy assessment of the therapy. These methods that account for the lag time are compared to the ITT and the PP approaches, and recommendations based on their performance in our study and their general applicability are given. |
Extent | 6609116 bytes |
Genre |
Thesis/Dissertation |
Type |
Text |
File Format | application/pdf |
Language | eng |
Collection |
Retrospective Theses and Dissertations, 1919-2007 |
Series | UBC Retrospective Theses Digitization Project |
Date Available | 2009-10-29 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
DOI | 10.14288/1.0090966 |
URI | http://hdl.handle.net/2429/14377 |
Degree |
Master of Science - MSc |
Program |
Statistics |
Affiliation |
Science, Faculty of Statistics, Department of |
Degree Grantor | University of British Columbia |
Graduation Date | 2003-05 |
Campus |
UBCV |
Scholarly Level | Graduate |
Aggregated Source Repository | DSpace |
Download
- Media
- [if-you-see-this-DO-NOT-CLICK]
- ubc_2003-0335.pdf [ 6.3MB ]
- Metadata
- JSON: 1.0090966.json
- JSON-LD: 1.0090966+ld.json
- RDF/XML (Pretty): 1.0090966.xml
- RDF/JSON: 1.0090966+rdf.json
- Turtle: 1.0090966+rdf-turtle.txt
- N-Triples: 1.0090966+rdf-ntriples.txt
- Original Record: 1.0090966 +original-record.json
- Full Text
- 1.0090966.txt
- Citation
- 1.0090966.ris
Full Text
Cite
Citation Scheme:
Usage Statistics
Country | Views | Downloads |
---|---|---|
China | 13 | 12 |
United States | 9 | 0 |
Japan | 4 | 0 |
France | 3 | 0 |
United Kingdom | 2 | 0 |
Sweden | 1 | 0 |
City | Views | Downloads |
---|---|---|
Beijing | 9 | 0 |
Ashburn | 6 | 0 |
Unknown | 6 | 0 |
Tokyo | 4 | 0 |
Shenzhen | 4 | 12 |
Mountain View | 2 | 0 |
Absecon | 1 | 0 |
{[{ mDataHeader[type] }]} | {[{ month[type] }]} | {[{ tData[type] }]} |
Share
Embed
Customize your widget with the following options, then copy and paste the code below into the HTML
of your page to embed this item in your website.
<div id="ubcOpenCollectionsWidgetDisplay">
<script id="ubcOpenCollectionsWidget"
src="{[{embed.src}]}"
data-item="{[{embed.item}]}"
data-collection="{[{embed.collection}]}"
data-metadata="{[{embed.showMetadata}]}"
data-width="{[{embed.width}]}"
async >
</script>
</div>
Our image viewer uses the IIIF 2.0 standard.
To load this item in other compatible viewers, use this url:
http://iiif.library.ubc.ca/presentation/dsp.831.1-0090966/manifest