LOG HAZARD REGRESSION

by

Huiying Sun

Ph.D., Harbin Institute of Technology, Harbin, China, 1991

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in THE FACULTY OF GRADUATE STUDIES, Department of Statistics

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
September, 1999
© Huiying Sun, 1999

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of Statistics
The University of British Columbia
Vancouver, Canada

Abstract

We propose using regression splines to estimate the two log marginal hazard functions of bivariate survival times, where each time could be censored. The method is a modified version of Kooperberg, Stone and Truong's (JASA, 1995) hazard regression for estimating a univariate survival time. We derive an approach to find standard errors for estimates of the difference of the log hazard functions. The approach is inspired by Wei, Lin, and Weissfeld (JASA, 1989). We also propose procedures for testing the four hypotheses that the marginals follow an exponential or Weibull distribution and that the two failure times have the same distribution or have proportional hazards. A simulation study is conducted to assess the performance of our estimates and test procedures. We study the effects of the censoring rates, correlation levels, and number of knots. The regression is applied to the data set of the Diabetic Retinopathy Study (Diabetic Retinopathy Study Research Group, 1981). Our analysis of the data set matches the study results of Huster, Brookmeyer, and Self (1989).

Contents

Abstract
Table of Contents
List of Tables
List of Figures
Acknowledgments
1 Introduction
2 Hazard Regression
  2.1 Definitions and Notation
    2.1.1 Failure Time Distribution
    2.1.2 Censored Survival Data
  2.2 Log Hazard Regression
    2.2.1 Model and Estimates
    2.2.2 Conditions on $b$ and $\beta$
  2.3 Consistency and Normality of $\hat\beta$
  2.4 Proofs of the Lemmas
  2.5 Log Hazard Regression of Paired Failure Times
    2.5.1 Model and Notation
    2.5.2 Consistency and Normality of $\hat\beta$
3 Regression Space
  3.1 Cubic Splines
  3.2 The Regression Space B
  3.3 Numerical Implementation
  3.4 Knot Selection
  3.5 Hypothesis Testing
    3.5.1 Univariate Case
    3.5.2 Bivariate Case
4 Application to the Diabetic Retinopathy Study
  4.1 Data Description
  4.2 Data Analysis
    4.2.1 Model 1
    4.2.2 Model 2
  4.3 Other Models Used to Fit the Eye Data
    4.3.1 Exponential Model
    4.3.2 Weibull Model
    4.3.3 Cox Proportional Hazard Model
5 Simulation
  5.1 Description of the Simulation Study
  5.2 The Univariate Case
    5.2.1 Exponential Model
    5.2.2 Weibull Model
    5.2.3 The B-spline Model
  5.3 The Bivariate Case
    5.3.1 Generation of Paired Dependent Data
    5.3.2 Proportional Hazards Model
    5.3.3 Non-proportional Hazards Model
    5.3.4 Effect of the Number of Knots
  5.4 Summary of Simulations
6 Conclusion
Bibliography
Appendix
A Simulation Results for Bivariate Data with 50% Censoring

List of Tables

4.1 Test statistic and p-value for testing that the failure times of the treated eyes and the untreated eyes have the same distribution.
4.2 $\chi^2$ test statistics and p-values for testing that failure times are exponentially distributed.
4.3 z statistics and p-values for testing that failure times have Weibull distributions.
4.4 Test statistic and p-value for testing that the two hazards of the treated eyes and the untreated eyes are proportional.
4.5 Test results from the Cox proportional hazards model for the eye data.

List of Figures

4.1 Histograms of observed times of the eye data.
4.2 Scatter plots of the observations of the eye data.
4.3 The estimated density functions of the eye data for Model 1.
4.4 The estimated survival functions of the eye data for Model 1.
4.5 The estimated hazard functions of the eye data for Model 1 with pointwise 95% confidence intervals.
4.6 The estimated log ratio of the hazard functions of the eye data with pointwise 95% confidence intervals for Model 1.
4.7 The estimated density functions of the eye data for Model 2.
4.8 The estimated survival functions of the eye data for Model 2.
4.9 The estimated hazard function of the untreated eye for Model 2.
4.10 The estimated log ratio of the hazard functions of the eye data with pointwise 95% confidence intervals for Model 2.
4.11 The estimated density functions of the failure times of the untreated eyes.
4.12 The estimated survival functions of the untreated eyes.
4.13 The estimated hazard functions of the failure times of the untreated eyes.
4.14 The estimated survival curves of the treated eye.
4.15 The estimated survival curves of the untreated eye.
4.16 The estimated log hazards ratio from Model 1 and the Cox proportional hazards model with the pointwise 95% confidence intervals from the Cox proportional hazards model.
4.17 The estimated log hazards ratio from Model 1 and the Cox proportional hazards model with the pointwise 95% confidence intervals from Model 1.
4.18 The estimated log hazards ratio from Model 2 and the Cox proportional hazards model with the pointwise 95% confidence intervals from the Cox proportional hazards model.
4.19 The estimated log hazards ratio from Model 2 and the Cox proportional hazards model with the pointwise 95% confidence intervals from Model 2.
5.1 The histograms of the simulated failure times, censoring times, non-censored failure times, and observed censoring times for exponential data as in model (5.2).
5.2 Log hazard of the exponential distribution (5.2) and the pointwise quartiles and empirical mean of the 500 estimated log hazards.
5.3 The pointwise standard deviations of the estimated log hazard for the exponential model (5.2) and the pointwise quartiles and empirical mean of the 500 estimated standard errors.
5.4 The quantiles of the pointwise z values of the estimated log hazards for the exponential distribution (5.2).
5.5 Histograms and qq-plots of the normalized estimate of $\beta$ from the Quantile Knots method with three knots for the exponential model (5.2).
5.6 Empirical distribution functions of the p-values for testing that the failure times are exponentially distributed.
5.7 The histograms of the simulated failure times, censoring times, non-censored failure times, and observed censoring times for Weibull data as in model (5.3).
5.8 Log hazard of the Weibull distribution (5.3) and the pointwise quartiles and empirical mean of the estimated log hazards.
5.9 The pointwise standard deviations of the estimated log hazards for the Weibull distribution (5.3) and the pointwise quartiles and empirical mean of the 500 estimated standard errors.
5.10 The quantiles of the pointwise z values of the estimated log hazards for the Weibull model (5.3).
5.11 The histograms and the qq-plots of the standardized estimates of the parameter $\beta$ from the method 5.1 with six knots for the Weibull model (5.3).
5.12 Empirical distribution functions of the p-values for testing the hypothesis that the failure time distribution is exponential.
5.13 Empirical distribution functions of the p-values for testing that the failure times have a Weibull distribution.
5.14 The histograms of the simulated failure times, censoring times, non-censored failure times, and observed censoring times for the data as in the B-spline model (5.4).
5.15 Log hazard of the B-spline data as in model (5.4) and the pointwise quartiles and empirical mean of the estimated log hazards.
5.16 The pointwise standard deviations of the estimated log hazards for the B-spline model (5.4) and the pointwise quartiles and empirical mean of the estimated standard errors.
5.17 The quantiles of the pointwise z values of the estimated log hazards for the B-spline model (5.4).
5.18 The histograms and the qq-plots of the standardized estimates of the parameter $\beta$ from the True Knots method for the B-spline model (5.4).
5.19 Empirical distribution functions of the p-values for testing the hypothesis that the failure times have a Weibull distribution.
5.20 The plots of the correlation coefficients versus $\theta$ for the data as in the proportional hazards and non-proportional hazards models.
5.21 The histograms of the marginal failure times and non-censored marginal failure times for the data generated according to the proportional hazards model in Section 5.1. The censoring rate for the "treatment" is 25%.
5.22 The true log hazards ratio of the data from the proportional hazards model defined in Section 5.1 and the pointwise quartiles and empirical mean of the 500 estimated log hazards ratios.
5.23 The standard deviations of the estimated log hazards ratio for data generated according to the proportional hazards model. The censoring rate is 25% for the "treatment".
5.24 The quantiles of the pointwise z values of the estimated log hazards ratio for the data generated according to the proportional hazards model. The censoring rate is 25% for the "treatment".
5.25 Empirical distribution functions of the p-values for testing the hypothesis that the two failure times of the data generated as in the proportional hazards model in Section 5.1 have proportional hazards.
5.26 The histograms of the marginal failure times and the non-censored marginal failure times for the data from the non-proportional hazards model in Section 5.1. The censoring rate for the "treatment" is 25%.
5.27 The true log hazards ratio of the data from the non-proportional hazards model defined in Section 5.1 and the pointwise quartiles and empirical mean of the estimated log hazards ratios.
5.28 The standard deviations of the estimated log hazards ratio for data generated according to the non-proportional hazards model. The censoring rate is 25% for the "treatment".
5.29 The quantiles of the pointwise z values of the estimated log hazards ratio for the data generated according to the non-proportional hazards model. The censoring rate is 25% for the "treatment".
5.30 Empirical distribution functions of the p-values for testing the hypothesis that the two failure times of the data generated as in the non-proportional hazards model in Section 5.1 have proportional hazards.
5.31 The true log hazards ratio of the data from the proportional hazards model defined in Section 5.1 and the pointwise quartiles and empirical mean of the estimated log hazards ratios by the Same Knots method.
5.32 The true log hazards ratio of the data from the proportional hazards model defined in Section 5.1 and the pointwise quartiles and empirical mean of the estimated log hazards ratios by the Different Knots method.
5.33 The true log hazards ratio of the data from the non-proportional hazards model defined in Section 5.1 and the pointwise quartiles and empirical mean of the estimated log hazards ratios by the Same Knots method.
5.34 The true log hazards ratio of the data from the non-proportional hazards model defined in Section 5.1 and the pointwise quartiles and empirical mean of the estimated log hazards ratios by the Different Knots method.
5.35 The standard deviations of the estimated log hazards ratio by the Same Knots method for data generated according to the proportional hazards model. The censoring rate is 25% for the "treatment".
5.36 The standard deviations of the estimated log hazards ratio by the Different Knots method for data generated according to the proportional hazards model. The censoring rate is 25% for the "treatment".
5.37 The standard deviations of the estimated log hazards ratio by the Same Knots method for data generated according to the non-proportional hazards model. The censoring rate is 25% for the "treatment".
5.38 The standard deviations of the estimated log hazards ratio by the Different Knots method for data generated according to the non-proportional hazards model. The censoring rate is 25% for the "treatment".
5.39 The quantiles of the pointwise z values of the estimated log hazards ratio by the Same Knots method for the data generated according to the proportional hazards model. The censoring rate is 25% for the "treatment".
5.40 The quantiles of the pointwise z values of the estimated log hazards ratio by the Different Knots method for the data generated according to the proportional hazards model. The censoring rate is 25% for the "treatment".
5.41 The quantiles of the pointwise z values of the estimated log hazards ratio by the Same Knots method for the data generated according to the non-proportional hazards model. The censoring rate is 25% for the "treatment".
5.42 The quantiles of the pointwise z values of the estimated log hazards ratio by the Different Knots method for the data generated according to the non-proportional hazards model. The censoring rate is 25% for the "treatment".
5.43 Empirical distribution functions, by the Same Knots method, of the p-values for testing the hypothesis that the two failure times of the data generated as in the proportional hazards model in Section 5.1 have proportional hazards.
5.44 Empirical distribution functions, by the Same Knots method, of the p-values for testing the hypothesis that the two failure times of the data generated as in the non-proportional hazards model in Section 5.1 have proportional hazards.
A.1 The histograms of the marginal failure times and non-censored marginal failure times for the data generated according to the proportional hazards model in Section 5.1. The censoring rate for the "treatment" is 50%.
A.2 The true log hazards ratio of the data as in the proportional hazards model defined in Section 5.1 and the pointwise quartiles and empirical mean of the 500 estimated log hazards ratios.
A.3 The standard deviations of the estimated log hazards ratio for data generated according to the proportional hazards model. The censoring rate is 50% for the "treatment".
A.4 The quantiles of the pointwise z values of the estimated log hazards ratio for the data generated according to the proportional hazards model. The censoring rate is 50% for the "treatment".
A.5 Empirical distribution functions of the p-values for testing the hypothesis that the two failure times of the data generated as in the proportional hazards model in Section 5.1 have proportional hazards.
A.6 The histograms of the marginal failure times and the non-censored marginal failure times for the data from the non-proportional hazards model in Section 5.1. The censoring rate for the "treatment" is 50%.
A.7 The true log hazards ratio of the data as in the non-proportional hazards model defined in Section 5.1 and the pointwise quartiles and empirical mean of the 500 estimated log hazards ratios.
A.8 The standard deviations of the estimated log hazards ratio for data generated according to the non-proportional hazards model. The censoring rate is 50% for the "treatment".
A.9 The quantiles of the pointwise z values of the estimated log hazards ratio for the data generated according to the non-proportional hazards model. The censoring rate is 50% for the "treatment".
A.10 Empirical distribution functions of the p-values for testing the hypothesis that the two failure times of the data generated as in the non-proportional hazards model in Section 5.1 have proportional hazards.

Acknowledgments

Foremost, I would like to express my sincere thanks to my supervisor, Nancy Heckman, for her guidance, support, and patience throughout the development of this thesis. Her abundant advice will be eternally appreciated. I would also like to thank James Zidek for his thoughtful comments on this manuscript.
Additionally, I am grateful to all faculty members and office staff for being kind and helpful throughout my Master's program. Finally, I would like to thank all graduate students for letting me share their knowledge and expertise and for making my two years at UBC such a pleasant experience.

Chapter 1

Introduction

Bivariate failure time data arise when study units are paired. Examples of paired units include eyes or ears from the same person, twins, and fathers and sons. Since there are natural relationships between the two subjects in a pair, the two failure times within the same pair might be correlated. Moreover, in such studies, either or both failure times might not be observed because of censoring. A well-known study involving paired failure times is the Diabetic Retinopathy Study (Diabetic Retinopathy Study Research Group, 1981). The dependence, along with the censoring, greatly complicates the analysis of the data. This data set will be discussed in Chapter 4.

In univariate survival analysis, classical methods such as the Kaplan-Meier estimator and the Cox proportional regression model are based on the hazard function

$$h(t) = \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t} = \frac{f(t)}{S(t)},$$

where $S$ is the survivor function and $f$ is the density function; see Section 2.1. If $T$ is a discrete random variable taking values $0 < t_1 < t_2 < \cdots$, then the hazard function is

$$h(t_j) = P(T = t_j \mid T \ge t_j), \qquad j = 1, 2, \ldots,$$

and thus the survival function can be written as

$$S(t) = \prod_{j:\, t_j < t} \big(1 - h(t_j)\big);$$

see Lawless (1982). An analogous formula relating $S$ and $h$ exists for $T$ continuous; see Gill (1992). But in the bivariate case, as Gill discussed, we do not have a nice formula relating the hazard and survivor functions, since there is no canonical way to define past, present, and future at "time" $t$.

Traditionally, there are two main approaches to analyzing censored paired data. One approach is non-parametric estimation of the bivariate survivor function. Typical works on this approach are Dabrowska (1988) and Prentice and Cai (1992). Dabrowska extended the univariate Kaplan-Meier approach by defining a bivariate hazard and using it to estimate the joint survival function. The marginals of Dabrowska's estimate are given by the univariate Kaplan-Meier estimates. She proved the consistency of the estimates but did not give estimates for the covariance of the estimates. Prentice and Cai (1992) did not define a joint hazard. Instead, they gave a representation of the bivariate survivor function in terms of the marginal survivor functions and a covariance function, where $T_1$ and $T_2$ are the paired failure times and $h_k$ is the hazard of $T_k$, $k = 1, 2$. They proposed an estimate of this bivariate survivor function and proved the consistency of their estimate, but they did not give estimates of standard errors. Lin and Ying (1993) presented a simple estimator of the bivariate survival function and also estimated the covariance function of their estimates. They assumed univariate censoring, that is, that there is one censoring time and it affects both failure times.

The other approach to bivariate survival analysis is to extend the Cox proportional hazards model, Cox (1972), to paired data. Recall that the Cox proportional model has hazard function

$$h(t \mid z) = h_0(t) \exp(\beta z),$$

where $h_0$ is the baseline hazard, $z$ is a covariate vector, and $\beta$ is unknown. The Cox proportional model assumes that all observations are independent.
Holt and Prentice (1974) assumed that within a pair, hazards are proportional according to a covariate, but the baseline hazard can depend on the pair. The effect of the covariate, that is, the value of $\beta$, is the same across pairs. Clayton (1978) and Oakes (1982) presented a fully parametric model for paired survival data. This model assumes that each marginal follows the Cox proportional model with respect to some covariates. They used an additional parameter to describe the association within a pair. Huster, Brookmeyer, and Self (1989) extended the Clayton-Oakes model to allow censoring. They also discussed another approach for inducing the correlation within a pair. They obtained the parameter estimates from an independence working model, that is, univariate estimates are used in the model to get the parameter estimates, and then they estimated the variance robustly.

Wei, Lin, and Weissfeld (1989) proposed a model for multivariate failure times. They assumed each marginal distribution of the failure times follows a Cox proportional hazards model with respect to some covariates. They showed that the resulting estimators of the parameters for covariate effects are asymptotically jointly normal and gave a consistent estimate of the covariance matrix.

Lee, Wei, and Amato (1991) used a different way to extend the Cox proportional model to paired data. For paired data $(T_{1i}, T_{2i})$ with treatment indicator $Z_i$, they showed the consistency and asymptotic normality of the estimated regression coefficient in the Cox proportional model. In this case the $Z_i$'s might be dependent, but then the usual variance-covariance estimate may no longer be valid due to the dependence between the members of a pair. They proposed a corrected variance-covariance estimate taking account of the correlation within a pair. None of the proposed models estimates the baseline hazard function, treating it as a nuisance parameter.

With the development of smoothing theory and methods, many people have used splines in survival analysis, that is, to approximate density functions, survival functions, hazard functions, or baseline hazard functions, in the presence of censored data. For example, Abrahamowicz, Ciampi, and Ramsay (1992) used B-splines to estimate the density, and O'Sullivan (1988) used smoothing splines to estimate log hazard functions. Kooperberg, Stone, and Truong (1995) used linear B-splines and their tensor products to estimate the conditional log hazard function as a function of $t$ and a covariate vector $z$. Their model contains the Cox proportional hazards model as a submodel. Since cubic splines can provide a better approximation than linear splines, Kooperberg et al. also proposed a model in which the log hazard function is estimated with cubic B-splines. Unfortunately, because of the complications in estimation and model selection, they did not consider covariates in this model. Nor did they provide standard errors for the estimated log hazard function.

In this thesis, we use cubic B-splines to estimate log hazards for univariate survival data and log hazard ratios for bivariate data. We use Kooperberg's approach to get estimates, but we provide estimates of standard errors. In Chapter 2, we define the log hazard regression model and the estimates of the log hazard functions and ratios, and we prove the asymptotic properties of the estimates. In Chapter 3, we discuss the properties of cubic B-spline functions and how to choose the knots which define these functions.
In Chapter 4, we use our model to analyze the data set from the Diabetic Retinopathy Study. We present our simulation study in Chapter 5.

Chapter 2

Hazard Regression

In this chapter we define a parametric regression model for the log hazard functions of censored failure times. We start by introducing definitions and notation in the first section. The regression model and the estimates are defined in Section 2.2. We show the consistency and normality of the estimates in Section 2.3 and leave the proofs of two required lemmas to Section 2.4. Finally, in Section 2.5, we define the regression model for bivariate failure times and discuss inference based on the estimates.

2.1 Definitions and Notation

2.1.1 Failure Time Distribution

All functions, unless stated otherwise, are defined over the interval $R^+ = [0, \infty)$. Let $T$ be a nonnegative continuous random variable representing the failure time of individuals in some population. Let $f$ denote the density function of $T$ and let the distribution function be

$$F(t) = P(T \le t) = \int_0^t f(u)\,du.$$

The probability of an individual surviving until time $t$ is given by the survivor function

$$S(t) = \int_t^\infty f(u)\,du = 1 - F(t).$$

The hazard function is defined as

$$h(t) = \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t} = \frac{f(t)}{S(t)},$$

which specifies the instantaneous rate of failure at time $t$, given that the individual survives until at least time $t$. The functions $f$, $F$, $S$, and $h$ give mathematically equivalent specifications of the distribution of $T$. Since $f(t) = -S'(t)$, by the definition of $h$,

$$h(t) = -\frac{d}{dt} \log S(t).$$

Thus $\log S(t) - \log S(0) = -\int_0^t h(u)\,du$ and, since $S(0) = 1$,

$$S(t) = \exp\Big(-\int_0^t h(u)\,du\Big) \qquad (2.1)$$

and

$$f(t) = h(t) \exp\Big(-\int_0^t h(u)\,du\Big). \qquad (2.2)$$

For our log hazard regression, we denote the log hazard function by $a$, that is,

$$a(t) = \log h(t). \qquad (2.3)$$

It is easy to derive expressions for $S(t)$ and $f(t)$ in terms of $a(t)$.
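For instance, the following sketch (ours, not part of the thesis) recovers $h$, $S$, and $f$ numerically from a given log hazard $a$ via (2.1)-(2.3). The Weibull form of $a$ and the parameter values are hypothetical choices, used only because the results can then be checked in closed form.

```python
import numpy as np
from scipy.integrate import quad

# Illustrative sketch: recover h, S, and f from a log hazard a(t) using
# (2.1)-(2.3).  The Weibull log hazard and parameters below are assumptions.
lam, gam = 0.5, 2.0

def a(t):
    # Weibull log hazard: a(t) = log(gam * lam**gam) + (gam - 1) * log(t)
    return np.log(gam * lam**gam) + (gam - 1.0) * np.log(t)

def h(t):
    return np.exp(a(t))                      # h(t) = exp(a(t)), equation (2.3)

def S(t):
    H, _ = quad(h, 0.0, t)                   # cumulative hazard
    return np.exp(-H)                        # S(t), equation (2.1)

def f(t):
    return h(t) * S(t)                       # f(t), equation (2.2)

t = 1.3
print(S(t), np.exp(-(lam * t) ** gam))       # both ~ the Weibull survivor fn
print(f(t), gam * lam**gam * t**(gam - 1) * np.exp(-(lam * t) ** gam))
```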
2.2.1  Model and Estimates  Let G be a p-dimensional linear space of real continuous functions defined on |0, oo), and let {bj,j = 1, • • • ,p} be a basis of G. We call G the regression space. The log hazard regression is a model for the log hazard function, that is,  = P 1>(t),  a(t\B) = J2PMt) 3=1  where (3 = (Pi,P )  (2-5)  T  is unknown and b(t) = (h(t), • • •, b (t)) .  T  T  P  p  Given (3, by Equations (2.3), (2.1), and (2.2), we have the hazard function, survivor function, and density function h(t\3)  =  exp  S(t\/3) =  (2.6)  exp(- / exp(^^( ))du), W  j=l  J 0  f(t\0)  (2.7)  t  = exp(2/3 -6 (t))exp{- f e  i=i  J  i  W  J o  (j^ PM^)du},  j=i  (2.8)  respectively. Note that to ensure that (2.7) defines a non-degenerate survivor function, it is necessary and sufficient that rt  P  / exp ( 53 Pjbj(u))du  < oo  (2.9)  for some t > 0 and I  exp exp ((  P  Pjbj(u))du  = +oo.  (2.10)  So we need some conditions on b and (3. For now we assume that (2.9) and (2.10) hold, and we will postpone discussing the conditions until Section 2.2.2. For n observations (Xi,5i),i  = l,---,n,  we make the following assumptions, in  addition to Assumptions (1) and (2). Assumptions (3) the log hazard function a of Tj satisfies model (2.5); (4) the distribution of the censoring times Ci does not involve (3 and the density fc of C is bounded. Then by (2.4), the density function of (X{,5i) is  /(*, si® = nm'sitw-'Mty-'Sctt)'.  (2.H)  We denote the joint distribution function of (Xi, S{) by L(X\(3), that is, L(X\(3)  6,1(3),  = f[f(Xi, i=i  and we choose 0 to maximize L. Let f (t,5\(3)  = f(t\!3) S(t\(3f5  q  (2.12)  &  and L ((3\X)  f[f (X ,S \(3).  =  q  q  l  i  i=i Then L(X\(3)  = =  fiMXiJifflMXtf-'iSciXi)*'  i=l  L (3\X)f[f (X ) - *S (X ) *. 1  q  c  i  s  6  e  i  i=i Since S and f do not involve j3, we estimate (3 by J3, the maximizer of the function c  c  We call the partial derivative of log L with respect to j3 the score function, denoted q  by U~( \j3\X), which is a vector with p components n  uM(P\X)  = ^-logL (f3\X)  = J2^-loef (X ,S \0),  q  m  q  i  (2-13)  i  m = 1,2, •••,£>. The observed information matrix, denoted by I^((3\X), is a p x p matrix with (mZ)-th entry  ''"'ci*'- =  = "gaSft  -a£w,  ]06LMX)  w  (214)  We denote the information matrix, in the usual way, by I((3), which is a p x p matrix with the (ml)-th entry I(f3)  ml  = -E | ^ £ ^ - log f {X , 5,|/3)|. q  (2.15)  x  By definition of / , and Equations (2.8) and (2.7), we can write f , log q  U^ \ I^ \ n  n  and I ((3) in terms of the basis functions b\, • • •, b . We have p  f (t,5\(3) = [ e x p ( £ / ^ ( i ) ) ] e x p { - /*exp ( £ P M u ) ) d u \ ; 5  q  °  3=1  (2.16)  3=1  and log/ (M|/3) =5J2(3Mt)  ~ [ ex (J2p b (u))du. t  g  V  j  3  Thus, under regularity conditions, log f {X 5 \f3) q  d(3  u  t  m  =  I  =  5j6 (^i) - /  h  EM  M- /*  fXi m  (E  P  6 ( « ) exp  Pjbj(u))du;  m  J0  P  e x  • i J'=I  and b {u)bi(u) exp  - ^ - ^ log f {Xi,8i\(3) = ~ J q  m  o  10  (2.17)  Hence U^(f3\X)  = Y\s bm(X )1=1  m  i  [  i  ,  6 (u)exp(^/3 6 (u))d« m  i  J 0  .7  =1  (2.18)  i  and /W(/3|X)  M/  = £  bmiuMu) exp (J2PMu))du.  (2.19)  Since for any vector v = (vi, • • •, v ) ^ 0, T  p  v I^((3\X)v T  = £ r\v b(u)) e (YPMu))du i=i i=i T  > 0,  2  W  1/0  I^ \p3\X) n  is positive definite (i.e., the second partial derivative of logL is negative g  definite). Therefore, if (3 exists, it is the unique solution of the p equations U \f3\X) {n  = Q.  (2.20)  In the next chapter, we will show how to numerically solve (2.20) for (3. 
Note: By the relationship between $L$ and $L_q$,

$$\frac{\partial}{\partial \beta_m} \log L(\beta|X) = \frac{\partial}{\partial \beta_m} \log L_q(\beta|X).$$

Therefore,

$$U^{(n)}(\beta|X)_m = \sum_{i=1}^n \frac{\partial}{\partial \beta_m} \log f(X_i, \delta_i|\beta)$$

and

$$I^{(n)}(\beta|X)_{ml} = -\sum_{i=1}^n \frac{\partial^2}{\partial \beta_m \partial \beta_l} \log f(X_i, \delta_i|\beta),$$

which are the usual definitions of the score function and the observed information matrix.

2.2.2 Conditions on b and β

As we mentioned in the last section, we need some conditions on the basis $b$ or on $\beta$ such that (2.9) and (2.10) are true. A sufficient condition for (2.9) and (2.10) is that $b$ is bounded. We say a function $u: R^+ \to R^p$ is bounded if there exists $0 < M < \infty$ such that

$$\|u(t)\|^2 = \sum_j |u_j(t)|^2 \le M$$

for all $t \in R^+$.

Proposition 2.2.1 If $b$ is bounded, then (i) (2.9) and (2.10) hold for all $t$ and $\beta$, and (ii) there exists a number $M(\beta, p)$ such that

$$f(t|\beta) \le M(\beta, p) \exp\Big(-\frac{t}{M(\beta, p)}\Big). \qquad (2.21)$$

Proof. Since $b$ is bounded, there exists a number $0 < M < \infty$, dependent on $\beta$ and $p$, such that $|\sum_{j=1}^p \beta_j b_j(u)| \le M$. Hence

$$\exp(-M) \le \exp\Big(\sum_{j=1}^p \beta_j b_j(u)\Big) \le \exp(M). \qquad (2.22)$$

It follows that for any $t$

$$\int_0^t \exp\Big(\sum_{j=1}^p \beta_j b_j(u)\Big)\,du \le \int_0^t \exp(M)\,du < \infty$$

and

$$\int_0^\infty \exp\Big(\sum_{j=1}^p \beta_j b_j(u)\Big)\,du \ge \int_0^\infty \frac{1}{\exp(M)}\,du = \infty.$$

So we have proved that (2.9) and (2.10) are true. Next, let $M(\beta, p) = \exp(M)$. Then, by the definition of $f$ from (2.8) and from (2.22),

$$f(t|\beta) \le M(\beta, p) \exp\Big(-\frac{t}{M(\beta, p)}\Big),$$

which completes the proof of (ii). □

We would like the regression space $G$ to contain the log hazard of the commonly used Weibull distribution. Recall that the density function of a Weibull distribution is

$$f_w(t) = \gamma \lambda^\gamma t^{\gamma - 1} \exp(-(\lambda t)^\gamma), \qquad t > 0,\ \lambda > 0,\ \gamma \ge 1.$$

The corresponding log hazard function is

$$a_w(t) = \log(\gamma \lambda^\gamma) + (\gamma - 1) \log(t).$$

Thus, if $b_p(t) = \log(t)$ and $b_{p-1} = 1$, then model (2.5) includes the Weibull distributions. More generally, we consider bases with $b_1, \ldots, b_{p-1}$ bounded and $b_p(t) = \log(t + c)$, where $c > 0$ is a constant. We call $b_p$ a log tail and $\{b_j, j = 1, \ldots, p\}$ bounded plus log tail. This basis does not satisfy the assumption of Proposition 2.2.1, but we have the following result.

Proposition 2.2.2 If $b$ is bounded plus log tail and $\beta_p > -1$, then (i) (2.9) and (2.10) hold for all $t$, and (ii) there exists a number $M(\beta, p)$ such that

$$f(t|\beta) \le M(\beta, p)\,(t + c)^{\beta_p} \exp\Big(-\frac{(t + c)^{\beta_p + 1}}{M(\beta, p)}\Big). \qquad (2.23)$$

Proof. Since $b$ is bounded plus log tail, there exists $M$ such that $|\sum_{j=1}^{p-1} \beta_j b_j(u)| \le \log(M)$. Then

$$\frac{(u + c)^{\beta_p}}{M} \le \exp\Big(\sum_{j=1}^p \beta_j b_j(u)\Big) \le M (u + c)^{\beta_p}. \qquad (2.24)$$

Therefore,

$$\int_0^t \exp\Big(\sum_{j=1}^p \beta_j b_j(u)\Big)\,du \le M \int_0^t (u + c)^{\beta_p}\,du < \infty$$

for any $t$, and

$$\int_0^\infty \exp\Big(\sum_{j=1}^p \beta_j b_j(u)\Big)\,du \ge \int_0^\infty \frac{(u + c)^{\beta_p}}{M}\,du = \infty,$$

as $\beta_p > -1$. Then (i) is proved. To prove (ii), using the definition of $f$ from (2.8) and using (2.24), we can find constants $C_1$ and $C_2$, dependent on $\beta$ and $p$, such that

$$f(t|\beta) \le C_1 (t + c)^{\beta_p} \exp\Big(-\frac{(t + c)^{\beta_p + 1}}{C_2}\Big),$$

and we take $M(\beta, p) = \max\{C_1, C_2\}$. So (ii) is true. □

From now on, unless stated otherwise, we assume the $n$ observations $(X_i, \delta_i)$, $i = 1, \ldots, n$, satisfy Assumptions (1)-(4) defined in the previous sections and either Assumption

(5a) $b$ is bounded and $\beta \in R^p$, or

(5b) $b$ is bounded plus log tail and $\beta \in R^{p-1} \times (-1, \infty)$.

We call $R^p$ the parameter space if (5a) holds and $R^{p-1} \times (-1, \infty)$ the parameter space if (5b) holds.

Note 1: Since (2.9) holds for all $t > 0$, $f(t) > 0$ for all $t > 0$. So the support of $f$ is $R^+$, which will be used in the proof of Theorem 2.3.1 in the next section.

Note 2: It is easy to see that if $b$ is bounded, then for any compact set $B$ in the parameter space, there exists $M$, dependent on $B$ and $p$, such that

$$f(t|\beta) \le M \exp\Big(-\frac{t}{M}\Big)$$

for all $\beta \in B$. If $b$ is bounded plus log tail, then for any compact set $B$ in the parameter space, there exist $M$ and $\beta_p' > -1$, dependent on $B$ and $p$, such that

$$f(t|\beta) \le M (t + c)^{\beta_p'} \exp\Big(-\frac{(t + c)^{\beta_p' + 1}}{M}\Big)$$

for any $\beta \in B$. That is, on a compact set $B$ in the parameter space, the density function $f(t|\beta)$ of $T$ can be dominated by a function of $t$.
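As a quick check of the Weibull example above (a sketch with hypothetical parameter values), taking $b_{p-1}(t) = 1$, $b_p(t) = \log(t)$, $\beta_{p-1} = \log(\gamma \lambda^\gamma)$, and $\beta_p = \gamma - 1$ reproduces the Weibull hazard exactly:

```python
import numpy as np

# Sketch: the Weibull log hazard lies in model (2.5) with b_{p-1}(t) = 1 and
# b_p(t) = log(t), taking beta_{p-1} = log(gam * lam**gam), beta_p = gam - 1.
# The parameter values are hypothetical.
lam, gam = 0.5, 1.8
beta_pm1 = np.log(gam * lam**gam)
beta_p = gam - 1.0                 # note beta_p > -1, as Assumption (5b) requires

t = np.linspace(0.1, 5.0, 50)
h_model = np.exp(beta_pm1 + beta_p * np.log(t))         # h(t | beta), (2.6)
h_weibull = gam * lam**gam * t ** (gam - 1.0)           # Weibull hazard
assert np.allclose(h_model, h_weibull)
```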
2.3 Consistency and Normality of β̂

In this section we use Theorem 5.1 of Lehmann and Casella (1998) to show the following theorem.

Theorem 2.3.1 Let $(X_1, \delta_1), \ldots, (X_n, \delta_n)$ satisfy Assumptions (1)-(5). Then, for $\beta_0$ the true parameter,

(i) $\hat\beta \to \beta_0$ in probability;

(ii) $\sqrt{n}(\hat\beta - \beta_0) \Rightarrow N(0, [I(\beta_0)]^{-1})$;

(iii) $\frac{1}{n} I^{(n)}(\tilde\beta|X) \to I(\beta_0)$ in probability, provided $\tilde\beta \to \beta_0$ in probability.

We first state the conditions of their theorem.

Conditions

A. For different values of $\beta$, the distributions $P_\beta$ of the observations are distinct.

B. The distributions $P_\beta$ have common support.

C. The observations $(X_1, \delta_1), \ldots, (X_n, \delta_n)$ are i.i.d. with probability density $f(t, \delta|\beta)$ with respect to a $\sigma$-finite measure $\mu$.

D. There exists an open set $\omega$ of the parameter space $\Omega$ containing the true parameter point $\beta_0$ such that for almost all $(X, \delta)$, the density $f(X, \delta|\beta)$ admits all third derivatives $(\partial^3/\partial\beta_j \partial\beta_k \partial\beta_l) f(X, \delta|\beta)$ for all $\beta \in \omega$.

E. The first and second logarithmic derivatives of $f$ satisfy the equations

$$E\Big\{\frac{\partial}{\partial \beta_l} \log f(X, \delta|\beta)\Big\} = 0$$

and

$$I(\beta)_{ml} = E\Big\{-\frac{\partial^2}{\partial \beta_m \partial \beta_l} \log f(X, \delta|\beta)\Big\} = E\Big\{\frac{\partial}{\partial \beta_m} \log f(X, \delta|\beta) \cdot \frac{\partial}{\partial \beta_l} \log f(X, \delta|\beta)\Big\},$$

$l, m \in \{1, \ldots, p\}$.

F. $I(\beta)_{jk}$ is finite, and the matrix $I(\beta)$, the $p \times p$ matrix with $(ml)$-th entry $I(\beta)_{ml}$, is positive definite for all $\beta \in \omega$.

G. There exist functions $M_{jkl}$ such that

$$\Big|\frac{\partial^3}{\partial \beta_j \partial \beta_k \partial \beta_l} \log f(x, \delta|\beta)\Big| \le M_{jkl}(x)$$

for all $\beta \in \omega$, where $E_{\beta_0}(M_{jkl}(X)) < \infty$ for all $j, k, l \in \{1, \ldots, p\}$.

Theorem 2.3.2 (Theorem 5.1 in Lehmann, E.L. and Casella, G., 1998) Let $(X_1, \delta_1), \ldots, (X_n, \delta_n)$ be i.i.d., each with a density $f(t, \delta|\beta)$ which satisfies the above conditions. Then, with probability tending to 1 as $n \to \infty$, there exists a solution $\hat\beta$ of the likelihood equation

$$\sum_{i=1}^n \frac{\partial \log f(X_i, \delta_i|\beta)}{\partial \beta} = 0$$

such that

(a) $\hat\beta_j$ is consistent for estimating $\beta_j$,

(b) $\sqrt{n}(\hat\beta - \beta)$ is asymptotically normal with mean zero and covariance matrix $[I(\beta)]^{-1}$, and

(c) $\hat\beta_j$ is asymptotically efficient in the sense that $\sqrt{n}(\hat\beta_j - \beta_j)$ converges in distribution to a normal random variable with mean zero and variance $([I(\beta)]^{-1})_{jj}$.
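Before turning to the proof, a Monte Carlo sketch (ours, with illustrative sample sizes and censoring rates) of the asymptotics in the simplest submodel: with $p = 1$ and $b_1(t) = 1$, $h(t|\beta) = e^\beta$, $T$ is exponential, (2.20) reduces to $\sum_i \delta_i - e^\beta \sum_i X_i = 0$ with the closed form $\hat\beta = \log(\sum_i \delta_i / \sum_i X_i)$, and $I^{(n)}(\beta|X) = e^\beta \sum_i X_i$ by (2.19). The studentized estimate (the univariate form given in Corollary 2.3.3 below) should then be approximately standard normal.

```python
import numpy as np

# Monte Carlo sketch of the asymptotics in the p = 1, b_1(t) = 1 submodel,
# where h(t|beta) = exp(beta) and beta_hat = log(sum(delta) / sum(X)).
rng = np.random.default_rng(0)
beta0, n, reps = np.log(1.5), 200, 2000
z = np.empty(reps)
for r in range(reps):
    T = rng.exponential(1.0 / np.exp(beta0), n)    # failure times
    C = rng.exponential(2.0, n)                    # independent censoring
    X, delta = np.minimum(T, C), (T <= C)
    beta_hat = np.log(delta.sum() / X.sum())
    info = np.exp(beta_hat) * X.sum()              # I^(n)(beta_hat|X), (2.19)
    z[r] = np.sqrt(info) * (beta_hat - beta0)      # studentized estimate
print(z.mean(), z.std())                           # should be close to 0 and 1
```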
Proof of Theorem 2.3.1. Suppose $\hat\beta$ is the solution of the equation

$$0 = \sum_{i=1}^n \frac{\partial \log f_q(X_i, \delta_i|\beta)}{\partial \beta} = \sum_{i=1}^n \frac{\partial \log f(X_i, \delta_i|\beta)}{\partial \beta}.$$

To show (i) and (ii), we need only check the conditions of Theorem 2.3.2.

First, by Assumptions 1 and 2 and the definition of $f(\cdot,\cdot|\beta)$ from (2.11), $(X_1, \delta_1), \ldots, (X_n, \delta_n)$ and $f(\cdot,\cdot|\beta)$ satisfy condition C.

Second, since $\{b_1, \ldots, b_p\}$ is a basis, from (2.16), the definition of $f_q$ in terms of the basis functions, $f_q$ is distinct for different values of $\beta$. From Note 1 after Propositions 2.2.1 and 2.2.2, the support of $f_q$ is $[0, \infty)$. By the relationship between $f(\cdot,\cdot|\beta)$ and $f_q$ from (2.11) and (2.12), conditions A and B hold.

Then, from (2.11) and (2.12),

$$\frac{\partial \log f(X_1, \delta_1|\beta)}{\partial \beta_l} = \frac{\partial \log f_q(X_1, \delta_1|\beta)}{\partial \beta_l}, \qquad (2.25)$$

$l = 1, \ldots, p$, since $S_c$ and $f_c$ do not involve $\beta$. Hence condition D follows from a direct calculation of the derivatives of $f_q$ from the definition (2.16).

Next, for condition F, the finiteness of $I(\beta)_{jk}$ follows from Lemma 2.3.4 below. As shown in the calculations after (2.19), $I(\beta)$ is positive definite. Finally, conditions E and G follow from Lemmas 2.3.5 and 2.3.4 below, respectively. Therefore, (a) and (b) in Theorem 2.3.2 yield (i) and (ii), respectively.

To prove (iii), we need to use (i) and the bounds in Lemma 2.3.4 below. The proof is as in Theorem 3.10 of Lehmann and Casella (1998). Thus, we have finished proving Theorem 2.3.1. □

Using (ii) and (iii) in the above theorem and Slutsky's theorem, we have the following corollary.

Corollary 2.3.3 Let $(X_1, \delta_1), \ldots, (X_n, \delta_n)$ satisfy Assumptions (1)-(5). Then

$$[I^{(n)}(\hat\beta|X)]^{1/2} (\hat\beta - \beta) \Rightarrow N(0, I),$$

where $I$ is the $p \times p$ identity matrix.

The following two lemmas are used in the proof of Theorem 2.3.1.

Lemma 2.3.4 Suppose that $b$ satisfies Assumption (5a) (or Assumption (5b)) and $B$ is a compact set in $R^p$ (or in $R^{p-1} \times (-1, \infty)$). Then there exists $0 < M < \infty$, dependent on $B$ and $p$ (or on $B$, $p$, and $c$), such that for any $\beta \in B$, $s_k \in \{1, \ldots, p\}$, $k = 1, \ldots, m$; $m = 0, 1, 2, 3$,

$$\Big|\frac{\partial^m}{\partial \beta_{s_1} \cdots \partial \beta_{s_m}} \log f_q(X_1, \delta_1|\beta)\Big| \le M K(X_1), \qquad (2.26)$$

where

$$K(X_1) = \begin{cases} 1 + X_1 & \text{if } b \text{ is bounded} \\ 1 + |\log(X_1 + c)| + (X_1 + c)^{\beta_p + 4} & \text{if } b \text{ is bounded plus log tail.} \end{cases} \qquad (2.27)$$

Furthermore, $E\{K(X_1)\} < \infty$.

Lemma 2.3.5 Let $\beta_0$ be the true value of the parameter in model (2.5). Then

i) $E\Big\{\dfrac{\partial \log f_q(X_1, \delta_1|\beta)}{\partial \beta_l}\Big|_{\beta_0}\Big\} = 0$;

ii) $E\Big\{\dfrac{\partial \log f_q(X_1, \delta_1|\beta)}{\partial \beta_m}\Big|_{\beta_0} \cdot \dfrac{\partial \log f_q(X_1, \delta_1|\beta)}{\partial \beta_l}\Big|_{\beta_0}\Big\} = -E\Big\{\dfrac{\partial^2 \log f_q(X_1, \delta_1|\beta)}{\partial \beta_m \partial \beta_l}\Big|_{\beta_0}\Big\}.$

We will give the proofs of the above lemmas in the next section.

2.4 Proofs of the Lemmas

Now we prove Lemmas 2.3.4 and 2.3.5. For convenience, we first calculate the partial derivatives of $\log f_q$ with respect to $\beta$ in terms of the basis functions. By the definition of $\log f_q$ from (2.17), for $s_k \in \{1, \ldots, p\}$, $k = 1, \ldots, m$,

$$\frac{\partial^m}{\partial \beta_{s_1} \cdots \partial \beta_{s_m}} \log f_q(X_1, \delta_1|\beta) = \begin{cases} \delta_1 \sum_{j=1}^p \beta_j b_j(X_1) - \int_0^{X_1} \exp\big(\sum_{j=1}^p \beta_j b_j(u)\big)\,du, & m = 0; \\ \delta_1 b_{s_1}(X_1) - \int_0^{X_1} b_{s_1}(u) \exp\big(\sum_{j=1}^p \beta_j b_j(u)\big)\,du, & m = 1; \\ -\int_0^{X_1} b_{s_1}(u) \cdots b_{s_m}(u) \exp\big(\sum_{j=1}^p \beta_j b_j(u)\big)\,du, & m = 2, 3. \end{cases} \qquad (2.28)$$

Proof of Lemma 2.3.4. If $b$ is bounded, then since $B$ is a compact set, $\sum_{j=1}^p |\beta_j b_j(u)|$ is bounded on $B$, and so is $\exp\big(\sum_{j=1}^p \beta_j b_j(u)\big)$. Hence, by (2.28), we can find $0 < M < \infty$, dependent on $B$ and $p$, such that (2.26) holds.

If $b$ is bounded plus log tail, then $b_1, \ldots, b_{p-1}$ are bounded. Hence there exists $1 \le M_1 < \infty$ such that for $\beta \in B$

$$\Big|\sum_{j=1}^{p-1} \beta_j b_j(u)\Big| \le M_1, \qquad (2.29)$$

which implies

$$\exp\Big(\sum_{j=1}^p \beta_j b_j(u)\Big) \le e^{M_1} (u + c)^{\beta_p}. \qquad (2.30)$$

Then for $m = 0$, by (2.28) and (2.30), (2.26) is true. To prove (2.26) for $m = 1, 2, 3$, it suffices to show that there exists $M < \infty$ such that

$$\int_0^{X_1} |b_{s_1}(u) \cdots b_{s_m}(u)| \exp\Big(\sum_{j=1}^p \beta_j b_j(u)\Big)\,du \le M\big(1 + (X_1 + c)^{\beta_p + 4}\big), \qquad (2.31)$$

$s_k \in \{1, \ldots, p\}$, $k = 1, \ldots, m$. This clearly holds when all $s_k \in \{1, \ldots, p-1\}$, since $b_j$ is bounded, $j = 1, \ldots, p-1$. But the log tail $\log(t + c)$ is unbounded, which makes the proof of (2.31) more complicated. So we need to investigate the properties of the integral

$$\int_0^{X_1} |\log(u + c)|^m (u + c)^{\beta_p}\,du,$$

$m = 1, 2, 3$. Noting $\beta_p > -1$ and $c > 0$, we have the following easily proven facts.

Fact 1: For any compact set $B \subset (-1, \infty)$, there exists $M_L < \infty$ such that

$$\sup_{u \in (0,1],\, \beta_p \in B} |\log(u + c)|^m (u + c)^{\beta_p / 2} \le M_L, \qquad m = 1, 2, 3;$$

Fact 2: for any compact set $B \subset (-1, \infty)$,

$$\sup_{\beta_p \in B} \int_0^1 (u + c)^{\beta_p / 2}\,du < \infty;$$

Fact 3: $\log(u + c) \le u + c$ for $u \in [1, \infty)$;

Fact 4: for any $\beta' > 0$, $\beta'' > -1$, and $M' > 0$,

$$\int_0^\infty (u + c)^{\beta''} \exp\big\{-(u + c)^{\beta'} / M'\big\}\,du < \infty.$$
We will use Facts 1-3 to prove that

i) there exists $M_2 < \infty$ such that for $\beta \in B$,

$$\int_0^1 |\log(u + c)|^m (u + c)^{\beta_p}\,du \le M_2, \qquad m = 1, 2, 3;$$

ii) there exists $M_3 < \infty$ such that for $X_1 \ge 1$ and $\beta \in B$,

$$\int_1^{X_1} |\log(u + c)|^m (u + c)^{\beta_p}\,du \le M_3 (X_1 + c)^{\beta_p + 4}, \qquad m = 1, 2, 3.$$

If i) and ii) hold, then, when $X_1 < 1$,

$$\int_0^{X_1} |\log(u + c)|^m (u + c)^{\beta_p}\,du \le \int_0^1 |\log(u + c)|^m (u + c)^{\beta_p}\,du \le M_2,$$

and when $X_1 \ge 1$,

$$\int_0^{X_1} |\log(u + c)|^m (u + c)^{\beta_p}\,du = \int_0^1 + \int_1^{X_1} \le M_2 + M_3 (X_1 + c)^{\beta_p + 4} \le M\big(1 + (X_1 + c)^{\beta_p + 4}\big)$$

for $M = \max\{M_2, M_3\}$. So (2.31) would hold, which implies that (2.26) would be true.

Proof of i): Since $B$ is compact, using Facts 1 and 2, we can find $M_2', M_2'' < \infty$ such that for $\beta \in B$,

$$|\log(u + c)|^m (u + c)^{\beta_p / 2} \le M_2', \qquad m = 1, 2, 3,$$

for $u \in (0, 1)$, and

$$\int_0^1 (u + c)^{\beta_p / 2}\,du \le M_2''.$$

Then for $\beta \in B$,

$$\int_0^1 |\log(u + c)|^m (u + c)^{\beta_p}\,du \le \int_0^1 |\log(u + c)|^m (u + c)^{\beta_p / 2} (u + c)^{\beta_p / 2}\,du \le M_2' M_2'' = M_2 < \infty,$$

$m = 1, 2, 3$.

Proof of ii): By Fact 3, $|\log(u + c)|^m \le (u + c)^3$ for $u \ge 1$ and $m = 1, 2, 3$. Hence

$$\int_1^{X_1} |\log(u + c)|^m (u + c)^{\beta_p}\,du \le \int_1^{X_1} (u + c)^{\beta_p + 3}\,du = \frac{1}{\beta_p + 4}\big((X_1 + c)^{\beta_p + 4} - (1 + c)^{\beta_p + 4}\big) \le M_3 (X_1 + c)^{\beta_p + 4},$$

$m = 1, 2, 3$, for some $M_3 > 0$.

Now that we have proven that (2.26) is true, we show that $E\{K(X_1)\} < \infty$. First suppose that $b$ is bounded. Then $E\{K(X_1)\} = E\{1 + X_1\} \le E\{1 + T_1\}$, since, by definition, $X_1 \le T_1$. By Proposition 2.4.1 below, $E\{T_1\} < \infty$.

Now consider $b$ bounded plus log tail. Write $K(X_1) = K^*(X_1) + K^{**}(X_1)$, where

$$K^*(X_1) = 1 + (X_1 + c)^{\beta_p + 4} + |\log(X_1 + c)|\, I\{X_1 + c \ge 1\}$$

and

$$K^{**}(X_1) = |\log(X_1 + c)|\, I\{X_1 + c < 1\}.$$

Since $K^*$ is a non-decreasing function,

$$E\{K^*(X_1)\} \le E\{K^*(T_1)\} \le E\{K(T_1)\},$$

which is finite by Proposition 2.4.2 below. To show $E(K^{**})$ is bounded, note that, by Assumption 4, $f_c$ is bounded, and by (ii) in Proposition 2.2.2, $f(t|\beta)$ is bounded on $[0, 1]$. Thus, the marginal density of $X_1$,

$$f_{X_1}(x) = f(x|\beta) S_c(x) + S(x|\beta) f_c(x),$$

is bounded. Then we have

$$E\{K^{**}(X_1)\} \le \int_0^1 |\log(x + c)|\, f_{X_1}(x)\,dx < \infty,$$

since $\int_0^1 |\log(x + c)|\,dx < \infty$. Therefore,

$$E\{K(X_1)\} = E\{K^*(X_1)\} + E\{K^{**}(X_1)\} < \infty,$$

which completes the proof of Lemma 2.3.4. □

The following propositions were used in the proof of Lemma 2.3.4.

Proposition 2.4.1 If $b$ is bounded, then for any $a > 0$, $E(T^a) < \infty$.
Proof. Since $b$ is bounded, by (ii) in Proposition 2.2.1, there exists $0 < M(\beta, p) < \infty$ such that

$$f(t|\beta) \le M(\beta, p) \exp\Big(-\frac{t}{M(\beta, p)}\Big).$$

From this and Fact 4 in the proof of Lemma 2.3.4, we have

$$E(T^a) = \int_0^\infty t^a f(t|\beta)\,dt \le \int_0^\infty M(\beta, p)\, t^a \exp\Big(-\frac{t}{M(\beta, p)}\Big)\,dt < \infty. \qquad \square$$

Proposition 2.4.2 If $b$ is bounded plus log tail and $\beta_p > -1$, then

(i) for any $a > 0$, $E(T^a) < \infty$, and

(ii) $E(|\log(T + c)|) < \infty$.

Proof. Since $b_1, \ldots, b_{p-1}$ are bounded, by (ii) in Proposition 2.2.2, there exists $0 < M(\beta, p) < \infty$ such that (2.23) holds. Let

$$g(t, \beta) = \exp\Big(-\frac{(t + c)^{\beta_p + 1}}{M(\beta, p)}\Big).$$

Then from (2.23) and Fact 4 in the proof of Lemma 2.3.4,

$$E(T^a) = \int_0^\infty t^a f(t|\beta)\,dt \le M(\beta, p) \int_0^\infty t^a (t + c)^{\beta_p} g(t, \beta)\,dt < \infty,$$

as $\beta_p > -1$. Thus (i) is proved.

To prove (ii), using (2.23) and Facts 1-4 in the proof of Lemma 2.3.4, we have

$$E(|\log(T + c)|) = \int_0^\infty |\log(t + c)|\, f(t|\beta)\,dt \le M(\beta, p)\Big\{\int_0^1 |\log(t + c)|\,(t + c)^{\beta_p} g(t, \beta)\,dt + \int_1^\infty (t + c)^{\beta_p + 1} g(t, \beta)\,dt\Big\} < \infty,$$

which finishes the proof of (ii). □

Remark: The following proposition will be used in the proof of Lemma 2.3.5 and in Section 2.5. The proof is omitted, since it uses the same procedures and arguments as in the proofs of Propositions 2.4.1 and 2.4.2.

Proposition 2.4.3 Suppose that $b$ satisfies the assumptions of Lemma 2.3.4. Then $E\{[K(X_1)]^2\} < \infty$.

The proof of Lemma 2.3.5. To prove i), write

$$E\Big\{\frac{\partial \log f(X_1, \delta_1|\beta)}{\partial \beta_l}\Big|_{\beta_0}\Big\} = \int \frac{\partial f(X_1, \delta_1|\beta)}{\partial \beta_l}\Big|_{\beta_0}\,d\mu = \frac{\partial E(1)}{\partial \beta_l} = 0.$$

To justify the interchange of differentiation and integration, we must prove that there exists a neighborhood $B$ of $\beta_0$ such that, uniformly on $B$, the partial derivatives of $f(X_1, \delta_1|\beta)$ with respect to $\beta$ can be dominated by an integrable function of $(X_1, \delta)$. Then by the Dominated Convergence Theorem, we can exchange the order of derivative and integral.

To do this, we choose $a > 0$ such that $B = \{\beta : \|\beta - \beta_0\| \le a\}$ is compact in the parameter space. By Lemma 2.3.4, there exists $M$ such that for $\beta \in B$,

$$\Big|\frac{\partial \log f_q(X_1, \delta_1|\beta)}{\partial \beta_l}\Big| \le M K(X_1), \qquad (2.32)$$

$l = 1, \ldots, p$, and $E(K(X_1)) < \infty$. Hence, from (2.25) and (2.32), for $\beta \in B$,

$$\Big|\frac{\partial f(X_1, \delta_1|\beta)}{\partial \beta_l}\Big| = \Big|\frac{\partial \log f(X_1, \delta_1|\beta)}{\partial \beta_l}\Big|\, f(X_1, \delta_1|\beta) \le M K(X_1) f(X_1, \delta_1|\beta).$$

Thus it suffices to find an integrable function of $(X_1, \delta)$ to dominate $K(X_1) f(X_1, \delta_1|\beta)$. By the definition of $f(\cdot,\cdot|\beta)$ from (2.11),

$$K(t) f(t, 1|\beta) = K(t) f(t|\beta) S_c(t) \le K(t) f(t|\beta) \le K(t) f^*(t),$$

where $f^*$ is the bound on $f$ given in Note 2 after Proposition 2.2.2. It is easy to show that $K f^*$ is integrable. To bound $K(X_1) f(X_1, 0|\beta) = K(X_1) S(X_1|\beta) f_c(X_1)$, it suffices to bound $K(X_1) S(X_1|\beta)$, since, by Assumption 4, $f_c$ is bounded. By the definition of $S(t|\beta)$ from (2.7) and by the proofs of Propositions 2.2.1 and 2.2.2, there exists $M > 0$ such that, for $\beta \in B$,

$$S(t|\beta) \le \exp\Big(-\frac{t}{M}\Big) = S^*(t)$$

if $b$ is bounded, and

$$S(t|\beta) \le M \exp\Big(-\frac{(t + c)^{\beta_p' + 1}}{M}\Big) = S^*(t)$$

if $b$ is bounded plus log tail, where $\beta_p'$ is the lower bound of $\beta_p$ for $\beta \in B$. One easily shows that $K S^*$ is integrable.

To prove ii), provided integration and differentiation can be interchanged, we can write

$$0 = \frac{\partial}{\partial \beta_m} E\Big\{\frac{\partial \log f(X_1, \delta_1|\beta)}{\partial \beta_l}\Big\} = \frac{\partial}{\partial \beta_m} \int \frac{\partial \log f(X_1, \delta_1|\beta)}{\partial \beta_l}\, f(X_1, \delta_1|\beta)\,d\mu$$

$$= \int \Big\{\frac{\partial^2 \log f(X_1, \delta_1|\beta)}{\partial \beta_m \partial \beta_l} + \frac{\partial \log f(X_1, \delta_1|\beta)}{\partial \beta_m} \cdot \frac{\partial \log f(X_1, \delta_1|\beta)}{\partial \beta_l}\Big\}\, f(X_1, \delta_1|\beta)\,d\mu.$$

Therefore,

$$E\Big\{\frac{\partial \log f_q(X_1, \delta_1|\beta)}{\partial \beta_m} \cdot \frac{\partial \log f_q(X_1, \delta_1|\beta)}{\partial \beta_l}\Big\} = -E\Big\{\frac{\partial^2 \log f_q(X_1, \delta_1|\beta)}{\partial \beta_m \partial \beta_l}\Big\}.$$

To exchange integration and differentiation, we need to bound $\partial^2 f(X_1, \delta_1|\beta)/\partial \beta_m \partial \beta_l$. By i) and Proposition 2.4.3, this can be done with an argument similar to that in the proof of i). We omit the calculations here. □
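A quick Monte Carlo check of Lemma 2.3.5 (ours, with illustrative rates) in the $p = 1$, $b_1(t) = 1$ submodel, where $\partial \log f_q / \partial \beta = \delta - e^\beta X$ and $-\partial^2 \log f_q / \partial \beta^2 = e^\beta X$:

```python
import numpy as np

# Sketch: at the true beta_0, the score has mean zero (Lemma 2.3.5 i) and
# E[(score)^2] = E[exp(beta_0) X] (Lemma 2.3.5 ii).  Rates are illustrative.
rng = np.random.default_rng(1)
beta0, n = np.log(2.0), 200_000
T = rng.exponential(1.0 / np.exp(beta0), n)
C = rng.exponential(1.0, n)
X, delta = np.minimum(T, C), (T <= C)
sc = delta - np.exp(beta0) * X                      # score contributions
print(sc.mean())                                    # ~ 0        (part i)
print((sc**2).mean(), (np.exp(beta0) * X).mean())   # ~ equal    (part ii)
```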
2.5 Log Hazard Regression of Paired Failure Times

Now we consider individuals with paired failure times. We first introduce the notation and define the log hazard regression model and estimates for paired failure data. Then, in Section 2.5.2, we show the consistency and normality of our estimates and give a consistent estimate of the covariance matrix of the estimates.

2.5.1 Model and Notation

Consider $n$ subjects, each with paired failure times. For $k = 1, 2$ and $i = 1, \ldots, n$, let $T_{ki}$ and $C_{ki}$ be the $k$th failure time and censoring time of the $i$th individual, respectively. Let $X_{ki} = \min(T_{ki}, C_{ki})$ and

$$\delta_{ki} = \begin{cases} 1 & \text{if } T_{ki} \le C_{ki} \\ 0 & \text{if } T_{ki} > C_{ki}. \end{cases}$$

Then $(X_{ki}, \delta_{ki})$ is the observation of the $k$th failure of the $i$th individual. Note: the censoring times may be the same, i.e., $C_{1i} = C_{2i}$.

We refer to $f_k, F_k, S_k, h_k$, and $a_k$ as the corresponding density, distribution, survivor, hazard, and log hazard functions, respectively. We fit a log hazard regression model for the two log hazard functions separately. That is, we suppose that

$$a_1(t|\beta_1) = \sum_j \beta_{1j} b_{1j}(t) \qquad \text{and} \qquad a_2(t|\beta_2) = \sum_j \beta_{2j} b_{2j}(t).$$

For $k = 1, 2$, we estimate $\beta_k$ by $\hat\beta_k$ as defined in Section 2.2.1. Referring to the univariate case, we denote the regression basis by $b_k$. As in (2.16), we let

$$f_{kq}(t, \delta|\beta_k) = f_k(t|\beta_k)^\delta S_k(t|\beta_k)^{1-\delta}.$$

We also let $U_k^{(n)}(\beta_k|X_k)$, $I_k^{(n)}(\beta_k|X_k)$, and $I_k(\beta_k)$ be the corresponding score function, observed information matrix, and information matrix, respectively; see (2.13), (2.14), and (2.15). Similarly, we denote by $K_k$ the corresponding functions defined in (2.27). Let

$$\beta = \begin{pmatrix} \beta_1 \\ \beta_2 \end{pmatrix}, \qquad \hat\beta = \begin{pmatrix} \hat\beta_1 \\ \hat\beta_2 \end{pmatrix}. \qquad (2.33)$$

We denote the $i$th summand in $U_k^{(n)}$ by $u_{ki}(\beta_k)$, that is,

$$u_{ki}(\beta_k) = \frac{\partial}{\partial \beta_k} \log f_{kq}(X_{ki}, \delta_{ki}|\beta_k),$$

$k = 1, 2$. Let

$$U^{(n)}(\beta_1, \beta_2|X_1, X_2) = \begin{pmatrix} U_1^{(n)}(\beta_1|X_1) \\ U_2^{(n)}(\beta_2|X_2) \end{pmatrix} = \sum_{i=1}^n \begin{pmatrix} u_{1i}(\beta_1) \\ u_{2i}(\beta_2) \end{pmatrix}$$

and define

$$\Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix} = \begin{pmatrix} \mathrm{cov}(u_{11}(\beta_1), u_{11}(\beta_1)) & \mathrm{cov}(u_{11}(\beta_1), u_{21}(\beta_2)) \\ \mathrm{cov}(u_{21}(\beta_2), u_{11}(\beta_1)) & \mathrm{cov}(u_{21}(\beta_2), u_{21}(\beta_2)) \end{pmatrix}. \qquad (2.34)$$

2.5.2 Consistency and Normality of β̂

In this section, we show the consistency and the normality of the estimate $\hat\beta$. The main result, given in Theorem 2.5.2, relies on the asymptotic normality of $U^{(n)}(\beta_1, \beta_2|X_1, X_2)$ given in the following.

Proposition 2.5.1 Suppose, marginally, each of the failure times satisfies Assumptions (1) through (5). Then

$$\frac{1}{\sqrt{n}}\, U^{(n)}(\beta_1, \beta_2|X_1, X_2) \Rightarrow N(0, \Sigma).$$

Proof. Since the $u_{ki}(\beta_k)$, $i = 1, \ldots, n$, are i.i.d. and, by i) in Lemma 2.3.5, $E(u_{ki}(\beta_k)) = 0$, using the multivariate central limit theorem, we have

$$\frac{1}{\sqrt{n}} \sum_{i=1}^n \begin{pmatrix} u_{1i}(\beta_1) \\ u_{2i}(\beta_2) \end{pmatrix} \Rightarrow N(0, \Sigma). \qquad \square$$

Theorem 2.5.2 Suppose, marginally, each of the failure times satisfies Assumptions (1) through (5). Then

(i) $\hat\beta \to \beta$ in probability;

(ii) $\sqrt{n}(\hat\beta - \beta) \Rightarrow N(0, Q)$, where

$$Q = \begin{pmatrix} Q_{11} & Q_{12} \\ Q_{21} & Q_{22} \end{pmatrix}, \qquad Q_{kl} = I_k(\beta_k)^{-1}\, \Sigma_{kl}\, I_l(\beta_l)^{-1}. \qquad (2.35)$$

Proof. (i) For any $\epsilon > 0$, by Theorem 2.3.1,

$$P(\|\hat\beta - \beta\| > \epsilon) \le P(\|\hat\beta_1 - \beta_1\| > \epsilon) + P(\|\hat\beta_2 - \beta_2\| > \epsilon) \to 0$$

as $n \to \infty$.

(ii) By expanding $U^{(n)}(\hat\beta_1, \hat\beta_2|X_1, X_2)$ in a Taylor series about $\beta$, we have

$$\begin{pmatrix} U_1^{(n)}(\hat\beta_1|X_1) \\ U_2^{(n)}(\hat\beta_2|X_2) \end{pmatrix} = U^{(n)}(\beta_1, \beta_2|X_1, X_2) - \begin{pmatrix} I_1^{(n)}(\beta_1^*|X_1)(\hat\beta_1 - \beta_1) \\ I_2^{(n)}(\beta_2^*|X_2)(\hat\beta_2 - \beta_2) \end{pmatrix},$$

where $\beta^* = (\beta_1^*, \beta_2^*)^T$ is on a line segment between $\beta$ and $\hat\beta$. Since $U^{(n)}(\hat\beta_1, \hat\beta_2|X_1, X_2) = 0$,

$$\sqrt{n}(\hat\beta - \beta) = \begin{pmatrix} \big[\frac{1}{n} I_1^{(n)}(\beta_1^*|X_1)\big]^{-1} & 0 \\ 0 & \big[\frac{1}{n} I_2^{(n)}(\beta_2^*|X_2)\big]^{-1} \end{pmatrix} \frac{1}{\sqrt{n}}\, U^{(n)}(\beta_1, \beta_2|X_1, X_2).$$

By (iii) in Theorem 2.3.1,

$$\Big[\frac{1}{n} I_k^{(n)}(\beta_k^*|X_k)\Big]^{-1} \to I_k(\beta_k)^{-1}$$

in probability, since $\beta_k^* \to \beta_k$ in probability, $k = 1, 2$. Using this and Proposition 2.5.1, by Slutsky's theorem, we have

$$\sqrt{n}(\hat\beta - \beta) \Rightarrow N(0, Q). \qquad \square$$
To estimate $Q$, let

$$\hat{Q}_{kl} = n \big[I_k^{(n)}(\hat\beta_k|X_k)\big]^{-1} \Big(\sum_{i=1}^n u_{ki}(\hat\beta_k)\, u_{li}(\hat\beta_l)^T\Big) \big[I_l^{(n)}(\hat\beta_l|X_l)\big]^{-1}, \qquad (2.36)$$

$k, l = 1, 2$, and

$$\hat{Q} = \begin{pmatrix} \hat{Q}_{11} & \hat{Q}_{12} \\ \hat{Q}_{21} & \hat{Q}_{22} \end{pmatrix}.$$

In the next theorem we show that $\hat{Q}$ is a consistent estimate of $Q$.

Theorem 2.5.3 Suppose, marginally, each of the failure times satisfies Assumptions (1)-(5) for $k = 1, 2$. Then $\hat{Q}$ is a consistent estimate of $Q$.

Proof. Let $\beta_0$ be the true value of the parameter. From Theorem 2.3.1,

$$\frac{1}{n} I_k^{(n)}(\hat\beta_k|X_k) \to I_k(\beta_{k0})$$

in probability, $k = 1, 2$. Therefore, we need only prove that

$$\frac{1}{n} \sum_{i=1}^n u_{ki}(\hat\beta_k)\, u_{li}(\hat\beta_l)^T \to \Sigma_{kl} \qquad (2.37)$$

in probability, $k, l = 1, 2$. By the law of large numbers,

$$\frac{1}{n} \sum_{i=1}^n u_{ki}(\beta_{k0})\, u_{li}(\beta_{l0})^T \to E\big(u_{k1}(\beta_{k0})\, u_{l1}(\beta_{l0})^T\big) = \Sigma_{kl}$$

in probability. Hence, to prove (2.37), it is sufficient to prove that

$$\frac{1}{n} \sum_{i=1}^n u_{ki}(\hat\beta_k)\, u_{li}(\hat\beta_l)^T - \frac{1}{n} \sum_{i=1}^n u_{ki}(\beta_{k0})\, u_{li}(\beta_{l0})^T \to 0$$

in probability. Let $u_{kis}(\beta_k)$ be the $s$th component of $u_{ki}(\beta_k)$, i.e.,

$$u_{kis}(\beta_k) = \frac{\partial}{\partial \beta_{ks}} \log f_{kq}(X_{ki}, \delta_{ki}|\beta_k),$$

$s = 1, \ldots, p_k$; $k = 1, 2$; $i = 1, \ldots, n$. We need to prove

$$\frac{1}{n} \sum_{i=1}^n u_{kis}(\hat\beta_k)\, u_{lij}(\hat\beta_l) - \frac{1}{n} \sum_{i=1}^n u_{kis}(\beta_{k0})\, u_{lij}(\beta_{l0}) \to 0 \qquad (2.38)$$

in probability, for $s = 1, \ldots, p_k$; $j = 1, \ldots, p_l$; $k, l = 1, 2$. Applying the mean value theorem to $\sum_{i=1}^n u_{kis}(\beta_k) u_{lij}(\beta_l)$, we have

$$\Big|\frac{1}{n} \sum_{i=1}^n u_{kis}(\hat\beta_k) u_{lij}(\hat\beta_l) - \frac{1}{n} \sum_{i=1}^n u_{kis}(\beta_{k0}) u_{lij}(\beta_{l0})\Big| \le \frac{1}{n} \sum_{i=1}^n \sum_{m=1}^{p_k} \Big|\frac{\partial}{\partial \beta_{km}}\big(u_{kis}(\beta_k) u_{lij}(\beta_l)\big)\Big|_{\beta^*}\, |\hat\beta_{km} - \beta_{k0m}| + \frac{1}{n} \sum_{i=1}^n \sum_{m=1}^{p_l} \Big|\frac{\partial}{\partial \beta_{lm}}\big(u_{kis}(\beta_k) u_{lij}(\beta_l)\big)\Big|_{\beta^*}\, |\hat\beta_{lm} - \beta_{l0m}|,$$

where $\beta^*$ is on a line segment between $\hat\beta$ and $\beta_0$. Thus, since $\hat\beta \to \beta_0$ in probability, to prove (2.38) it suffices to show that $\frac{1}{n} \sum_{i=1}^n \sum_{m=1}^{p_k} |\partial\big(u_{kis}(\beta_k) u_{lij}(\beta_l)\big)/\partial \beta_{km}|$ is bounded in probability around $\beta_0$. That means we need to find $M < \infty$ and a neighborhood of $\beta_0$ such that

$$P\Big\{\sup_{\beta \in B} \frac{1}{n} \sum_{i=1}^n \sum_{m=1}^{p_k} \Big|\frac{\partial}{\partial \beta_{km}}\big(u_{kis}(\beta_k) u_{lij}(\beta_l)\big)\Big| \le M\Big\} \to 1$$

as $n \to \infty$, $k, l = 1, 2$. We choose a number $a > 0$ such that $B = \{\beta : \|\beta - \beta_0\| \le a\}$ is a compact set in the parameter space. By Lemma 2.3.4, there exist $M_k$ and $M_l$ such that for $\beta \in B$, $k, l = 1, 2$, $k \ne l$, $m = 1, \ldots, p_k$,

$$\Big|\frac{\partial}{\partial \beta_{km}}\big(u_{kis}(\beta_k) u_{lij}(\beta_l)\big)\Big| = \Big|u_{lij}(\beta_l)\, \frac{\partial^2}{\partial \beta_{km} \partial \beta_{ks}} \log f_{kq}(X_{ki}, \delta_{ki}|\beta_k)\Big| \le M_k M_l K_k(X_{ki}) K_l(X_{li}),$$

and, for $k = l$,

$$\Big|\frac{\partial}{\partial \beta_{km}}\big(u_{kis}(\beta_k) u_{kij}(\beta_k)\big)\Big| \le 2 M_k^2\, [K_k(X_{ki})]^2.$$

Thus,

$$P\Big\{\sup_{\beta \in B} \frac{1}{n} \sum_{i=1}^n \sum_{m=1}^{p_k} \Big|\frac{\partial}{\partial \beta_{km}}\big(u_{kis}(\beta_k) u_{lij}(\beta_l)\big)\Big| \le M\Big\} \ge P\Big\{M_k M_l\, p_k\, \frac{1}{n} \sum_{i=1}^n K_k(X_{ki}) K_l(X_{li}) \le M\Big\},$$

which converges to 1 for $M$ sufficiently large, since

$$\frac{1}{n} \sum_{i=1}^n K_k(X_{ki}) K_l(X_{li}) \to E\big(K_k(X_{k1}) K_l(X_{l1})\big) < \infty$$

by Proposition 2.4.3. □

As in the univariate case, we have the following corollary.

Corollary 2.5.4 If $b_1$ and $b_2$ satisfy Assumptions (1) through (5), then

$$\sqrt{n}\, \hat{Q}^{-1/2} (\hat\beta - \beta) \Rightarrow N(0, I),$$

where $I$ is the $(p_1 + p_2) \times (p_1 + p_2)$ identity matrix.
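To make (2.36) concrete, here is a minimal sketch (ours) assembling the sandwich estimate $\hat{Q}$ from per-subject score contributions and observed information matrices. In practice these inputs would come from a fitted model (e.g., the score and information sketches in Section 2.2); the random arrays below are shape-only placeholders.

```python
import numpy as np

# Sketch of the sandwich estimate (2.36):
#   Q_hat_kl = n * inv(Info_k) (sum_i u_ki u_li^T) inv(Info_l),
# assembled into the full (p1 + p2) x (p1 + p2) matrix.  u[k] is an (n, p_k)
# array of score contributions u_ki(beta_hat_k); info[k] is I_k^(n).
def sandwich_Q(u, info):
    n = u[0].shape[0]
    inv = [np.linalg.inv(I) for I in info]
    # u[k].T @ u[l] equals sum_i u_ki u_li^T when rows are the contributions.
    return np.block([[n * inv[k] @ (u[k].T @ u[l]) @ inv[l] for l in range(2)]
                     for k in range(2)])

rng = np.random.default_rng(2)
n, p1, p2 = 50, 2, 2
u = [rng.normal(size=(n, p1)), rng.normal(size=(n, p2))]   # placeholders
info = [np.eye(p1) * n, np.eye(p2) * n]                    # placeholders
Q_hat = sandwich_Q(u, info)          # shape (p1 + p2, p1 + p2)
print(Q_hat.shape)
```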
Chapter 3

Regression Space

In the previous chapter we discussed the regression model for log hazard functions for a general regression space $G$. Here we will use a family of extremely flexible functions, cubic splines, as the regression space. This family was used by Kooperberg, Stone, and Truong in 1995 for univariate log hazard regression. In this chapter we first give a brief introduction to cubic splines. Then we give the definition of a restricted cubic spline regression space $B$ and a method to construct a cubic B-spline basis of $B$. In Section 3.3 we introduce the numerical computation method that we use to calculate the estimates of the log hazard regression model for paired failure data. In Section 3.4 we explain our methods for choosing knots for the log hazard regression model. Finally, in Section 3.5, we show that the log hazard regression model with the regression space $B$ can be used for hypothesis testing. In the univariate case, we can test the hypotheses that the failure times have an exponential or a Weibull distribution. In the bivariate case, we can test the hypotheses that the two failure times have the same distribution or that the two failure times have proportional hazards.

3.1 Cubic Splines

Definition 3.1.1 The function $\phi$ is a cubic spline on $[a, b]$ with knots $t_1, \ldots, t_K$ ($a < t_1 < \cdots < t_K < b$) if $\phi$ is a cubic polynomial on the subintervals $[a, t_1], [t_1, t_2], \ldots, [t_K, b]$ and $\phi$ has two continuous derivatives on $[a, b]$. Denote the collection of these splines by $Sp(t_1, \ldots, t_K)$.

By De Boor (1978), $Sp(t_1, \ldots, t_K)$ is a linear space of dimension $K + 4$. The power basis $\{p_k, k = -3, \ldots, K\}$ of $Sp(t_1, \ldots, t_K)$ is defined as

$$p_0(t) = 1, \quad p_{-1}(t) = t, \quad p_{-2}(t) = t^2, \quad p_{-3}(t) = t^3, \quad p_1(t) = (t - t_1)_+^3, \quad \ldots, \quad p_K(t) = (t - t_K)_+^3,$$

where

$$(t - t')_+^3 = \begin{cases} (t - t')^3 & \text{for } t \ge t' \\ 0 & \text{for } t < t'. \end{cases}$$

Therefore, a power basis representation of a cubic spline $\phi \in Sp(t_1, \ldots, t_K)$ is

$$\phi(t) = \sum_{k=-3}^K \theta_k p_k(t) = \sum_{k=-3}^0 \theta_k p_k(t) + \sum_{k=1}^K \theta_k (t - t_k)_+^3. \qquad (3.1)$$

The power basis is very easy to understand but isn't used in computation, since it has bad numerical properties (see De Boor, 1978). A numerically much better basis is the B-spline basis, which consists of functions that are zero except on a few consecutive subintervals. For more details about the B-spline basis, see De Boor (1978) and Shikin (1995). We will use a restricted B-spline basis, defined in Section 3.2, for our regression.
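A short sketch (ours, with illustrative knots and coefficients) of evaluating a cubic spline through the power-basis representation (3.1):

```python
import numpy as np

# Sketch of the power-basis representation (3.1); knots and coefficients
# below are illustrative.
knots = np.array([1.0, 2.0, 3.5])                    # t_1 < t_2 < t_3

def power_basis(t, knots):
    # Columns: p_0 = 1, p_{-1} = t, p_{-2} = t^2, p_{-3} = t^3,
    # then p_k = (t - t_k)_+^3 for each knot t_k.
    t = np.atleast_1d(np.asarray(t, dtype=float))
    cols = [np.ones_like(t), t, t**2, t**3]
    cols += [np.clip(t - tk, 0.0, None) ** 3 for tk in knots]
    return np.column_stack(cols)

theta = np.array([0.5, -1.0, 0.3, 0.1, 0.2, -0.4, 0.25])   # length K + 4 = 7
t = np.linspace(0.0, 5.0, 201)
phi = power_basis(t, knots) @ theta                  # a cubic spline phi(t)
```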
3.2 The Regression Space B

In this section we introduce the cubic spline regression spaces defined by Kooperberg et al. (1995), and we give an algorithm to construct restricted B-spline bases in those spaces.

First, we assume the log hazard function α(t|β) of a failure time to be a cubic spline from Sp(t_1, ..., t_K) satisfying

(**) α is linear on [0, t_1] and is constant on [t_K, ∞).

If we use the power basis representation α = Σ_{k=−3}^{K} θ_k p_k, then (**) implies that θ_{−3} and θ_{−2} equal 0, and it places three constraints on θ_1, ..., θ_K. This leaves a K + 4 − 2 − 3 = K − 1 dimensional space. More precisely, define

  V = span{p_{−1}, p_0, p_1, ..., p_K}

and

  B = {φ ∈ V : φ is linear on [0, t_1] and constant on [t_K, ∞)}.

Then dim(V) = K + 2 and dim(B) = K − 1.

Next, we give the definition of a restricted B-spline basis of B. We assume that K ≥ 3, and place restrictions on b_1, b_{K−1} and, if K > 3, on b_2, ..., b_{K−2}.

Definition 3.2.1 The set of functions {b_1, ..., b_{K−1}} in B is called a restricted B-spline basis of B if it has the following properties:

1. b_1 is linear but not constant on [0, t_1] and is zero on [t_3, ∞);
2. if K > 3, for 1 < j < K − 2, b_j is zero on [0, t_{j−1}) and [t_{j+3}, ∞) but not zero on (t_{j−1}, t_j);
3. if K > 3, b_{K−2} is zero on [0, t_{K−3}), not zero on (t_{K−3}, t_{K−2}), and a non-zero constant on [t_K, ∞);
4. b_{K−1} is a non-zero constant on [0, ∞).

Clearly, any b_j's satisfying 1-4 are bounded, a condition required in Section 2.2.1. Now we need to show that the restricted B-spline basis is well defined. That means we need to prove that b_j's with the above properties exist and that they form a basis of B. We will give an algorithm for constructing the b_j's (this implies existence), but first we will show that any b_j's satisfying 1-4 are a basis.

Theorem 3.2.2 If b_1, ..., b_{K−1}, K ≥ 3, in B have properties 1-4 of Definition 3.2.1, then they are a basis of B.

Proof. Since B is a (K − 1)-dimensional space, to show that {b_1, ..., b_{K−1}} is a basis of B, we need only prove that b_1, ..., b_{K−1} are linearly independent. Let

  φ(t) = Σ_{j=1}^{K−1} α_j b_j(t).

Suppose that φ(t) = 0 for all t. We are going to show that this implies that α_j = 0, j = 1, ..., K − 1. Consider t ∈ [0, t_1). If K > 3, by properties 2 and 3, b_j(t) = 0 for j = 2, ..., K − 2. If K = 3, then φ(t) = α_1 b_1(t) + α_2 b_2(t). Therefore, for K ≥ 3 and t ∈ [0, t_1),

  φ(t) = α_1 b_1(t) + α_{K−1} b_{K−1}(t) = 0.

By property 4, b_{K−1} is a non-zero constant, and by property 1, b_1 is linear but not constant. Hence we have

  α_1 = α_{K−1} = 0.   (3.2)

Hence, if K = 3, then b_1 and b_2 are a basis of B. Now we assume that K > 3. Consider t ∈ (t_1, t_2). If K > 4, by properties 2 and 3, b_j(t) = 0, j = 3, ..., K − 2. If K = 4, then φ(t) = α_1 b_1(t) + α_2 b_2(t) + α_3 b_3(t). Hence, for K ≥ 4 and t ∈ (t_1, t_2), using (3.2), we have

  φ(t) = α_1 b_1(t) + α_2 b_2(t) + α_{K−1} b_{K−1}(t) = α_2 b_2(t) = 0.

So α_2 = 0, since b_2(t) ≠ 0 by property 2. By induction on j, using a similar argument, for t ∈ (t_{j−1}, t_j) we get φ(t) = α_j b_j(t) = 0, which implies that α_j = 0 for j = 3, ..., K − 2. Hence α_j = 0, j = 1, ..., K − 1, which shows that {b_1, ..., b_{K−1}} is a basis of B. □

For the convenience of constructing a restricted B-spline basis, we give the following result as a lemma.

Lemma 3.2.3 Fix J = 2, ..., K − 2, and let

  b(t) = Σ_{k=−3}^{0} θ_k p_k(t) + Σ_{k=J−1}^{J+2} θ_k p_k(t).

Then b = c on [t_{J+2}, ∞) if and only if θ_{−3}, θ_{−2}, θ_{−1}, θ_0, θ_{J−1}, θ_J, θ_{J+1}, θ_{J+2} solve the linear system

  θ_{−3} + θ_{J−1} + θ_J + θ_{J+1} + θ_{J+2} = 0
  θ_{−2} − 3( t_{J−1}θ_{J−1} + t_Jθ_J + t_{J+1}θ_{J+1} + t_{J+2}θ_{J+2} ) = 0
  θ_{−1} + 3( t_{J−1}²θ_{J−1} + t_J²θ_J + t_{J+1}²θ_{J+1} + t_{J+2}²θ_{J+2} ) = 0
  θ_0 − ( t_{J−1}³θ_{J−1} + t_J³θ_J + t_{J+1}³θ_{J+1} + t_{J+2}³θ_{J+2} ) = c.

Proof. First regroup the terms of b(t) by powers of t, so that

  b(t) = a_0 + a_1 t + a_2 t² + a_3 t³,

with the a's depending on indicator functions involving the θ_k's and the t_j's. On [t_{J+2}, ∞) all the truncated power functions are active, and expanding each (t − t_k)³ shows that b = c on [t_{J+2}, ∞) is equivalent to a_0 = c, a_1 = a_2 = a_3 = 0, which is equivalent to the above equations. □
Now we start our construction of a restricted B-spline basis of B. Let b_j = Σ_{k=−1}^{K} θ_{kj} p_k, j = 1, ..., K − 1. We will find θ_{kj}'s so that the b_j's satisfy properties 1-4 of Definition 3.2.1.

Step 1: Let b_{K−1} = p_0, i.e., θ_{0(K−1)} = 1 and θ_{k(K−1)} = 0 for k = −1, 1, ..., K.

Step 2: To define b_1 = Σ_{k=−1}^{3} θ_{k1} p_k, let θ_{k1} = 0 if k > 3. As in the proof of Lemma 3.2.3, we regroup the terms of b_1 by powers of t. Then b_1 = 0 on [t_3, ∞) implies that θ_{01}, θ_{(−1)1}, θ_{11}, θ_{21}, θ_{31} solve the following linear system:

  θ_{11} + θ_{21} + θ_{31} = 0
  t_1θ_{11} + t_2θ_{21} + t_3θ_{31} = 0
  θ_{(−1)1} + 3( t_1²θ_{11} + t_2²θ_{21} + t_3²θ_{31} ) = 0
  θ_{01} − ( t_1³θ_{11} + t_2³θ_{21} + t_3³θ_{31} ) = 0.

Since the t_k's are distinct, the linear system has a unique solution once one of the θ_{k1}, k = −1, ..., 3, is fixed. We let θ_{11} = 1 and solve the linear system. Note that θ_{(−1)1} ≠ 0, for otherwise the linear system would have only the zero solution, which would contradict θ_{11} = 1. Thus b_1 is linear but not constant on [0, t_1) and b_1 = 0 on [t_3, ∞).

To construct b_j, 1 < j < K − 1, we first construct b*_j, which is zero on [0, t_{j−1}) and a non-zero constant on [t_{j+2}, ∞). We will then define the b_j's as linear combinations of the b*_j's such that b_j is zero on [t_{j+3}, ∞), 1 < j < K − 2.

Step 3: For 1 < j < K − 1, let

  b*_j = Σ_{k=j−1}^{j+2} θ'_{kj} p_k.

Then for any constant c, by Lemma 3.2.3 (with θ_{−3} = θ_{−2} = θ_{−1} = θ_0 = 0), b*_j = c on [t_{j+2}, ∞) implies that θ'_{(j−1)j}, θ'_{jj}, θ'_{(j+1)j}, θ'_{(j+2)j} satisfy the linear system

  θ'_{(j−1)j} + θ'_{jj} + θ'_{(j+1)j} + θ'_{(j+2)j} = 0
  t_{j−1}θ'_{(j−1)j} + t_jθ'_{jj} + t_{j+1}θ'_{(j+1)j} + t_{j+2}θ'_{(j+2)j} = 0
  t_{j−1}²θ'_{(j−1)j} + t_j²θ'_{jj} + t_{j+1}²θ'_{(j+1)j} + t_{j+2}²θ'_{(j+2)j} = 0
  −( t_{j−1}³θ'_{(j−1)j} + t_j³θ'_{jj} + t_{j+1}³θ'_{(j+1)j} + t_{j+2}³θ'_{(j+2)j} ) = c.

Let θ'_{(j−1)j} = 1. Then we can solve the first three equations for θ'_{jj}, θ'_{(j+1)j}, θ'_{(j+2)j}, and then set c to satisfy the fourth equation. We must show that c ≠ 0. It is easy to see that, if c = 0, then the linear system in Lemma 3.2.3 has only the zero solution; hence θ'_{(j−1)j} = 1 implies that c ≠ 0. Then b*_j is zero on [0, t_{j−1}), non-zero on (t_{j−1}, t_j), and a non-zero constant on [t_{j+2}, ∞).

Step 4: Let b_{K−2} = b*_{K−2}. Then b_{K−2} satisfies property 3.

Step 5: Recall that b*_j is a non-zero constant on [t_{j+2}, ∞). For 1 < j < K − 1, let d_j be that constant. For 1 < j < K − 2, if we let

  b_j = b*_j − (d_j / d_{j+1}) b*_{j+1},

then b_j is zero on [t_{j+3}, ∞) and also on [0, t_{j−1}), since b*_j and b*_{j+1} are both zero on [0, t_{j−1}). Also, b_j is not zero on (t_{j−1}, t_j), since b*_{j+1} is zero but b*_j is not zero on (t_{j−1}, t_j). Thus b_1, ..., b_{K−1} satisfy properties 1-4 and are a restricted B-spline basis of B.

From now on, for given knots, we use B as our regression space with the restricted B-spline basis b_1, ..., b_{K−1} defined as above. By a bounded basis plus log tail, we mean that the basis is {b_1, ..., b_{K−1}, b_K} and the regression space is span{b_1, ..., b_K}, where b_K(t) = log(t + c) and c ≥ 0 is a constant. Since the basis b is bounded or bounded plus log tail, all relevant results discussed in Chapter 2 can be used. To simplify notation, we will denote both regression spaces by B.
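The construction in Steps 1-5 amounts to solving small Vandermonde-type linear systems. The following R sketch is our own (it assumes K ≥ 4 distinct, increasing knots); it returns the coefficients of b_1, ..., b_{K−1} in the basis {p_{−1}, p_0, p_1, ..., p_K} of V.

  restricted.basis <- function(knots) {
    K <- length(knots)
    stopifnot(K >= 4, !is.unsorted(knots))
    ## rows: coefficients of p_{-1}, p_0, p_1, ..., p_K; columns: b_1, ..., b_{K-1}
    B <- matrix(0, K + 2, K - 1)
    ## Step 1: b_{K-1} = p_0, the constant function
    B[2, K - 1] <- 1
    ## Step 2: b_1, taking theta_{11} = 1 and solving the system above
    th <- c(1, solve(rbind(c(1, 1), knots[2:3]), -c(1, knots[1])))
    B[2 + (1:3), 1] <- th
    B[1, 1] <- -3 * sum(knots[1:3]^2 * th)      # theta_{(-1)1}
    B[2, 1] <- sum(knots[1:3]^3 * th)           # theta_{01}
    ## Step 3: b*_j on p_{j-1}, ..., p_{j+2}, constant d_j on [t_{j+2}, Inf)
    bstar <- matrix(0, K, K); d <- numeric(K)
    for (j in 2:(K - 2)) {
      tk <- knots[(j - 1):(j + 2)]
      th <- c(1, solve(rbind(rep(1, 3), tk[-1], tk[-1]^2),
                       -c(1, tk[1], tk[1]^2)))
      bstar[(j - 1):(j + 2), j] <- th
      d[j] <- -sum(tk^3 * th)                   # the constant c of Lemma 3.2.3
    }
    ## Step 4: b_{K-2} = b*_{K-2}; Step 5: cancel the right tails of the rest
    B[2 + (1:K), K - 2] <- bstar[, K - 2]
    if (K > 4) for (j in 2:(K - 3))
      B[2 + (1:K), j] <- bstar[, j] - (d[j] / d[j + 1]) * bstar[, j + 1]
    B
  }

Combined with power.spline above, the columns of restricted.basis(knots) can be checked numerically to satisfy properties 1-4.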
3.3 Numerical Implementation

In this section we introduce the algorithm used to calculate the estimates β̂ and Q̂ in the log hazard regression model of paired failure times. By the definition of β̂ from (2.33), the estimates β̂_1 and β̂_2 are calculated separately, so we need only discuss the calculation of β̂_1. We can use Kooperberg's heft algorithm (see Kooperberg, Stone, and Truong, 1995) to find β̂_1. It uses the Newton-Raphson method for computing β̂_1, since β̂_1 is the unique solution of the equations (2.20). Specifically, it starts with an initial guess β^{(0)}, then iteratively determines β^{(k+1)} from β^{(k)} according to the formula

  β^{(k+1)} = β^{(k)} + [I_n^{(1)}(β^{(k)} | X_1)]^{-1} U_n^{(1)}(β^{(k)} | X_1),

and stops the iterations when log L_n^{(1)}(β^{(k+1)} | X_1) − log L_n^{(1)}(β^{(k)} | X_1) < ε, where ε is a given positive number chosen so that estimates with the desired accuracy are obtained.

Thus the main numerical task in calculating β̂_1 is the computation of the log likelihood log L_n^{(1)}(β | X_1), score function U_n^{(1)}(β | X_1), and observed information matrix I_n^{(1)}(β | X_1) for various values of β. By the definitions of these quantities, see (2.17), (2.18), and (2.19), this computation involves the numerical approximation of

  Σ_{i=1}^{n} ∫_0^{X_{1i}} ψ(t) dt,   (3.3)

for ψ of the form

  ψ(t) = b_{1l}(t) b_{1m}(t) exp( Σ_j β_{1j} b_{1j}(t) ),  l, m ∈ {1, ..., p_1}.

Kooperberg et al. do not calculate ∫_0^{X_{1i}} ψ(t) dt for each i. Rather, they rewrite (3.3) as

  ∫_0^∞ N(t) ψ(t) dt,   (3.4)

where N(t) is the number of i satisfying X_{1i} > t. The function N(·) is piecewise constant, has jumps at the observations X_{11}, ..., X_{1n}, and equals zero to the right of the maximum observation. They then rewrite (3.4) as

  Σ_ν ∫_{a_{ν−1}}^{a_ν} N(t) ψ(t) dt,

where {a_ν} is a finite grid of points containing all knots, and calculate each ∫_{a_{ν−1}}^{a_ν} N(t) ψ(t) dt numerically.

Now consider the calculation of Q̂. By the definition of Q̂ from (2.36), the calculation of Q̂ involves the calculation of β̂_k, I_n^{(1)}(β̂_k | X_k), and the summands u_{ki}(β̂_k), k = 1, 2, i = 1, ..., n. As we mentioned above, we can get β̂_k from the heft algorithm directly. By slightly modifying the heft code, we can obtain I_n^{(1)}(β̂_k | X_k) from heft as well. But to evaluate u_{ki}(β̂_k), we need to calculate ∫_0^{X_{ki}} ψ(t) dt separately, and we cannot get these integrals from heft without rewriting its entire integration routine. Instead, we use Gaussian quadrature (see Abramowitz and Stegun, 1964, p. 916) for the integrals ∫_0^{X_{ki}} ψ(t) dt needed to calculate u_{ki}(β̂_k), k = 1, 2.

3.4 Knot Selection

In Section 3.2 we gave the method to define the regression space B for given knots. In this section we introduce the methods that we use to choose the knots. For the univariate case, we use the following two methods.

• Choose knots by Kooperberg's heft algorithm. For a given data set, the heft algorithm can choose the knots for the model fitting, using a stepwise addition and stepwise deletion procedure; see Kooperberg et al. (1995) for details. We hoped that heft would select knots well, but our simulation study shows that the knots chosen by heft can cause numerical problems: if the ratio of the maximum of the knots to the minimum of the knots is too big, then computing the log hazard estimate and the estimated standard errors may be impossible; see Section 5.2.2. So we should not use the knots from heft if we receive a warning message from the heft code or if the calculation of the estimated standard error is not possible.

• Choose the knots by quantiles. We choose the quantiles of the non-censored observations as the knots defining the regression space B. This procedure is based on ideas in Kooperberg et al.'s knot selection for their stepwise addition method and on the knot selection method of Abrahamowicz, Ciampi, and Ramsay (1992). Choosing knots equal to quantiles may also result in a large ratio of the maximum of the knots to the minimum of the knots, which, as noted above, causes numerical problems. We solve this problem by truncating the data; that is, if there are any warnings, we use the quantiles of a truncated data set. In our simulations we truncate the observations which are greater than 80.

For paired data we use the following methods:

• Use different knots for modeling the two log hazard functions. Choose knots for each failure time by the above two methods. We then use the two sets of knots to define the two regression spaces and marginally fit the log hazard regression models for the paired data.

• Use the same knots for modeling the two log hazard functions. In this method we need to choose one set of knots which defines one regression space for both log hazard functions.
If we denote the ranges of the non-censored observations {X_{ki} : δ_{ki} = 1} by R_k, k = 1, 2, then our knots must lie in R_1 ∩ R_2. There are two ways to choose this set of knots.

a) Use the quantiles of the non-censored observations which lie in R_1 ∩ R_2.

b) Use the union of the knots selected for each marginal log hazard estimate. Denote the sets of knots selected for the two log hazards by K_1 and K_2, and use (K_1 ∪ K_2) ∩ (R_1 ∩ R_2) as the set of knots for the regression space.
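The quantile rules above are easy to implement. A small R sketch, with our own function name and the truncation point 80 used in our simulations:

  ## Knots at equally spaced quantiles of the non-censored observations,
  ## after truncating very large values to avoid the numerical problems
  ## described above.  K = 3 gives the quartiles; K = 6 gives the
  ## 1/7, ..., 6/7 quantiles.
  quantile.knots <- function(x, delta, K = 3, trunc = 80) {
    obs <- x[delta == 1 & x <= trunc]
    unique(quantile(obs, probs = (1:K) / (K + 1)))
  }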
3.5 Hypothesis Testing

In this section we show how to use the restricted B-spline basis, the regression space B, and given data to test hypotheses.

3.5.1 Univariate Case

In the univariate case, we can test:

• H_0^e: the failure time has an exponential distribution;
• H_0^w: the failure time has a Weibull distribution.

By the definition of the log hazard regression model in (2.5), the log hazard function is

  α(t|β) = Σ_{j=1}^{p} β_j b_j(t).

If the failure time is exponentially distributed, then α(t|β) is constant. Therefore, for the restricted B-spline basis defined in Section 3.2 ({b_j, j = 1, ..., K−1} if b is bounded, or {b_j, j = 1, ..., K} if b is bounded plus log tail, where K is the number of knots), we can rewrite H_0^e as

  H_0^e: β_j = 0 for j ≠ K − 1,

since b_{K−1} is the constant basis function. Then we can write H_0^e as

  H_0^e: X_e^T β = 0

for an appropriate matrix X_e. By Corollary 2.3.3, with Î = n^{-1} I_n^{(1)}(β̂ | X),

  √n Î^{1/2} (β̂ − β) ⇒ N(0, I).

Hence, under H_0^e,

  n (X_e^T β̂)^T ( X_e^T Î^{-1} X_e )^{-1} (X_e^T β̂) ⇒ χ²_{K−2} if b is bounded; χ²_{K−1} if b is bounded plus log tail.

Therefore, n (X_e^T β̂)^T ( X_e^T Î^{-1} X_e )^{-1} (X_e^T β̂) can be used as a test statistic.

To test H_0^w, we need to use the restricted B-spline basis plus the log tail b_K(t) = log(t). Thus, if the failure time has a Weibull distribution, then

  α(t|β) = β_{K−1} b_{K−1}(t) + β_K b_K(t).

Hence, testing H_0^w is equivalent to testing β_j = 0 for j < K − 1. By a procedure similar to that for testing H_0^e, we can find a matrix X_w such that

  n (X_w^T β̂)^T ( X_w^T Î^{-1} X_w )^{-1} (X_w^T β̂) ⇒ χ²_{K−2},

and we can use n (X_w^T β̂)^T ( X_w^T Î^{-1} X_w )^{-1} (X_w^T β̂) as our test statistic.

3.5.2 Bivariate Case

In the bivariate case, we can test:

• H_0^s: the two failure times have the same distribution;
• H_0^p: the two failure times have proportional hazards.

For given paired data (X_{1i}, δ_{1i}, X_{2i}, δ_{2i}), i = 1, ..., n, to test H_0^s and H_0^p we model the log hazard functions of the two failure times with the same regression space B. Then the log hazard regressions are

  α_1(t|β_1) = Σ_{j=1}^{p} β_{1j} b_j(t),  α_2(t|β_2) = Σ_{j=1}^{p} β_{2j} b_j(t).

If the two failure times have the same distribution, then

  α_1(t|β_1) − α_2(t|β_2) = Σ_{j=1}^{p} (β_{1j} − β_{2j}) b_j(t) = 0.

Therefore, testing H_0^s is equivalent to testing

  H_0^s: β_{1j} − β_{2j} = 0 for j = 1, ..., p.

Then we can rewrite H_0^s as

  H_0^s: X^T B = 0

for an appropriate matrix X, where B = (β_1^T, β_2^T)^T. For bases described in Section 3.2, by Corollary 2.5.4,

  √n Q̂^{-1/2} (B̂ − B) ⇒ N(0, I).

Then under H_0^s,

  n (X^T B̂)^T ( X^T Q̂ X )^{-1} (X^T B̂) ⇒ χ²_{K−1} if b is bounded; χ²_K if b is bounded plus log tail.

So we use n (X^T B̂)^T ( X^T Q̂ X )^{-1} (X^T B̂) as the test statistic.

If the two hazards are proportional, then

  α_1(t|β_1) − α_2(t|β_2) = Σ_{j=1}^{p} (β_{1j} − β_{2j}) b_j(t) = constant.

Hence, for the regression spaces and b_j's defined in Section 3.2, the test for H_0^p is equivalent to the test for β_{1j} − β_{2j} = 0 for j ≠ K − 1. Then we can rewrite H_0^p as

  H_0^p: X^T B = 0

for an appropriate matrix X. By Corollary 2.5.4, √n Q̂^{-1/2}(B̂ − B) ⇒ N(0, I). Then under H_0^p,

  n (X^T B̂)^T ( X^T Q̂ X )^{-1} (X^T B̂) ⇒ χ²_{K−2} if b is bounded; χ²_{K−1} if b is bounded plus log tail.

Therefore, n (X^T B̂)^T ( X^T Q̂ X )^{-1} (X^T B̂) can be used as the test statistic.

Note: When the T_{1i}'s and T_{2i}'s are independent, the Cox proportional hazards model

  h_C(t) = h_T(t) exp(γ)   (3.5)

provides an estimate of the log relative risk γ. Thus, under the independence assumption, we can also test H_0^s by testing H_0: γ = 0. To our knowledge, there is no non-parametric test for proportional hazards for dependent paired data as we defined in Chapter 2. We might use the usual Cox proportional hazards model, assuming the two failure times are independent; however, while the usual estimate of γ may be good, the standard errors are probably biased.
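Once β̂ (or B̂) and the covariance estimate are available, the Wald statistics of this section are short matrix computations. An R sketch of the bivariate version, with our own hypothetical argument names:

  ## Wald statistic n (X^T b)^T (X^T Qhat X)^{-1} (X^T b), where b is the
  ## stacked estimate (bhat1, bhat2) and X encodes the null hypothesis.
  wald.test <- function(b, Qhat, X, n) {
    v <- crossprod(X, b)                                  # X^T b
    stat <- n * drop(t(v) %*% solve(crossprod(X, Qhat %*% X), v))
    c(stat = stat, df = ncol(X), p.value = 1 - pchisq(stat, ncol(X)))
  }
  ## e.g. testing H0s (same distribution), with p basis functions:
  ## wald.test(c(bhat1, bhat2), Qhat, rbind(diag(p), -diag(p)), n)

The univariate tests have the same form, with Q̂ replaced by Î^{-1}.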
Chapter 4

Application to the Diabetic Retinopathy Study

In this chapter we apply the log hazard regression model to data collected to study the effect of laser treatment on diabetic retinopathy (see Diabetic Retinopathy Study Research Group, 1981). We give a description of this study in Section 4.1 and analyze the data in Section 4.2. Finally, in Section 4.3, we discuss analysis of this data set using other models.

4.1 Data Description

Diabetic retinopathy is a complication associated with diabetes mellitus, which consists of abnormalities in the microvasculature within the retina of the eye. It is the major cause of visual loss in many industrialized countries (Murphy and Patz, 1978). The Diabetic Retinopathy Study (DRS) was funded by the National Eye Institute in 1971 to investigate the effectiveness of laser photocoagulation in delaying the onset of blindness for diabetic retinopathy patients. One eye of each patient was randomly chosen to receive photocoagulation and the other eye was observed without treatment. A total of 1,742 patients was followed over several years. The endpoint used to assess the treatment effect was the occurrence of visual acuity less than 5/200 at two consecutively completed 4-month follow-ups. The only data available are from the 197 patients defined as high-risk by DRS criteria. Of the 197 pairs of observations, approximately one-half (101/197) of the untreated eyes and one-quarter (54/197) of the treated eyes achieved the outcome after 5 years of follow-up.

The histograms of the censored and uncensored data for treatment and control (Figure 4.1) show that more untreated eyes than treated eyes failed during the study, and that many patients left the study about 3 years (36 months) or more after the start time. The correlation between the uncensored observations of the treatment group and the control group is 0.28, which indicates possible dependence between the two failure times. We show the scatter plots in Figure 4.2. For this data set, the two censoring times are identical, that is, C_{Ti} = C_{Ci}.

The primary goal of the DRS study was to assess the effectiveness of the laser photocoagulation treatment. A secondary goal was to assess the relative risk of blindness of the untreated and the treated eyes as a function of time.

Figure 4.1: Histograms of observed times of the eye data (uncensored and censored times, treatment and control).

Figure 4.2: Scatter plots of the observations of the eye data.

4.2 Data Analysis

In this section we will address the following questions:

1. Do the failure times of the treated and the untreated eyes have the same distribution? That is, is there a treatment effect?

2. What is the log hazards ratio? That is, what is the relative risk of blindness of the untreated eyes versus the treated eyes?

We carry out two analyses of the eye data. First, as described in Section 3.4 on knot selection, we let Kooperberg's heft algorithm choose the knots for the log hazard regression model (Model 1). This allows different knots to be used for the log hazard estimates of the failure times of the treated and the untreated eyes. In our second analysis (Model 2), we use the same regression space for the log hazards of the two failure times; the set of knots used is the union of the knots chosen by the heft algorithm for each failure time. With both models we can get an estimate of the log hazards ratio with pointwise 95% confidence intervals. We then test if there is a treatment effect, and if the two hazards are proportional, with Model 2, as described in Section 3.5.

4.2.1 Model 1

To estimate each failure time's log hazard, we use Kooperberg's heft algorithm with log tail b_p(t) = log(t) (c = 0), and without specifying knots. The heft algorithm selects 1.5, 6.17, and 63.33 as knots for the regression model of the treated eyes and chooses no knots for the untreated eyes, which means that heft chooses a Weibull model for the failure time of the untreated eye. The log hazard regression model, Model 1, is then:

  α_T(t|β_T) = β_{T1} b_{T1}(t) + β_{T2} b_{T2}(t) + β_{T3} b_{T3}(t),   (4.1)
  α_C(t|β_C) = β_{C1} b_{C1}(t) + β_{C2} b_{C2}(t).   (4.2)

As defined in Section 3.2, b_{T1} is linear on [0, 1.5), cubic on [1.5, 6.17) and [6.17, 63.33), and constant on [63.33, ∞); b_{T2} and b_{C1} are constant functions, and b_{T3}(t) = b_{C2}(t) = log(t). The estimates of β_T and β_C from heft are

  β̂_T = (0.00017, −8.13, 0.69)^T and β̂_C = (−3.68, −0.18)^T,

respectively. The estimates of the marginal densities, survival functions, hazard functions, and the log hazards ratio are shown in Figures 4.3 to 4.6.

Figure 4.3: The estimated density functions of the eye data for Model 1.

From the plot in Figure 4.3, we find that both estimated densities have high values in the first twenty months, but the density corresponding to the untreated eye is higher than that of the treated eye throughout the observed time period. The estimated density of the failure time of the untreated eyes has its maximum at time zero, while the other density achieves its maximum at about month six. Since the regression model is for the log hazard, we calculate the estimated densities from the estimated log hazards as in equation (2.8). A pointwise confidence interval for the estimated densities would require integration of the upper and lower bounds of the confidence intervals for the hazard function. This is difficult, so these confidence intervals have not been calculated.

Figure 4.4: The estimated survival functions of the eye data for Model 1.

The estimated survival curves, in Figure 4.4, show that the estimated survival function of the treated eyes is always greater than that of the untreated eyes. Therefore, there appears to be a large treatment effect.
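Concretely, the survival and density curves plotted here can be recovered from a fitted log hazard by numerical integration, as in (2.8). The following R sketch is ours (the function name and the trapezoid rule on a grid starting at 0 are our own choices):

  ## S(t) = exp(-H(t)) and f(t) = h(t) S(t), with H the cumulative
  ## hazard; 'log.haz' is any vectorized fitted log hazard alpha(t|bhat).
  surv.and.dens <- function(log.haz, tgrid) {
    h <- exp(log.haz(tgrid))
    H <- c(0, cumsum(diff(tgrid) * (head(h, -1) + tail(h, -1)) / 2))
    S <- exp(-H)
    list(time = tgrid, surv = S, dens = h * S)
  }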
From the plots of the estimated hazard functions in Figure 4.5, we see that the risk of blindness in the untreated eye is much higher than that of the treated eye. The risk of blindness in the untreated eyes decreases with time, while the treated eyes may have a maximal risk at t = 6. Based on the estimated hazard functions, we can see that the treatment may have delayed the onset of blindness for the first couple of months after the operation.

Figure 4.5: The estimated hazard functions of the eye data for Model 1 with pointwise 95% confidence intervals.

Figure 4.6 shows the estimated log hazards ratio with pointwise 95% confidence intervals. Values greater than zero indicate a higher risk for the control group, and so it seems that the treatment has a beneficial effect. The estimate of the log relative risk of blindness of the untreated eye to the treated eye is equal to 1.15 at t = 1, decreases to 0.34 at about month six, then increases smoothly. After attaining its maximum value of 1.186 at month 42, it decreases slowly. Thus, based on Figure 4.6, the estimated log ratio of the two hazards gives an answer to questions 1 and 2 asked at the beginning of Section 4.2.

Figure 4.6: The estimated log ratio of the hazard functions of the eye data with pointwise 95% confidence intervals for Model 1.

4.2.2 Model 2

To test if there is a treatment effect, we assume the log hazard functions of the treated eye and the untreated eye are from the same regression space B, so we use the same restricted B-spline basis plus a log tail for the two regressions. To choose knots that define B, we refer to the knots used for Model 1. As heft uses 1.5, 6.17, and 63.33 for the treatment group and no knots for the controls, we use 1.5, 6.17, and 63.33, the union of the knots used for the two groups, as the knots defining our regression space B. Then B is a three-dimensional space and the log hazard regression, Model 2, is:

  α_T(t|β_T) = β_{T1} b_1(t) + β_{T2} b_2(t) + β_{T3} b_3(t),   (4.3)
  α_C(t|β_C) = β_{C1} b_1(t) + β_{C2} b_2(t) + β_{C3} b_3(t),   (4.4)

where b_1 = b_{T1}, b_2 = b_{T2}, and b_3 = b_{T3} as defined in Section 4.2.1. Using the heft algorithm with fixed knots, we get the estimates

  β̂_T = (0.00017, −8.13, 0.69)^T,  β̂_C = (0.00007, −5.15, 0.16)^T.

As in Section 4.2.1, we calculate the estimates of the densities, survival functions, hazard functions, and the log hazards ratio. We give the plots in Figures 4.7 to 4.10. The results are similar to those from Model 1.

Figure 4.7: The estimated density functions of the eye data for Model 2.

Figure 4.8: The estimated survival functions of the eye data for Model 2.

Figure 4.9: The estimated hazard function of the untreated eye for Model 2.

Figure 4.10: The estimated log ratio of the hazard functions of the eye data with pointwise 95% confidence intervals for Model 2.

Since Model 1 and Model 2 use the same regression space for the log hazard of the failure times of the treated eyes, the two models produce the same estimates for the failure times of the treated eyes, so we do not repeat the plots of these estimates. We compare the estimates from Model 1 and Model 2 for the untreated eyes in Figures 4.11 to 4.13.
From Figures 4.11 to 4.13 we see that Model 1 and Model 2 give almost the same estimates of the survival function of the untreated eyes, but the estimated densities and hazards look a little different.

Figure 4.11: The estimated density functions of the failure times of the untreated eyes.

Figure 4.12: The estimated survival functions of the untreated eyes.

Figure 4.13: The estimated hazard functions of the failure times of the untreated eyes.

Next we use Model 2 to test the hypothesis that the failure times of the treated eye and the untreated eye have the same distribution. By the definition of Model 2 from (4.3) and (4.4), this hypothesis is equivalent to

  H_0: β_{Tj} − β_{Cj} = 0, j = 1, 2, 3.

Using the test described in Section 3.5, we obtain the test statistic and p-value shown in Table 4.1.

Table 4.1: Test statistic and p-value for testing that the failure times of the treated eyes and the untreated eyes have the same distribution.

  Test                                   χ²     p-value
  β_{Tj} − β_{Cj} = 0, j = 1, 2, 3       48.56  0

Hence we may reject H_0 and conclude that the two distributions are different. From our plots of the log hazards ratios in Figures 4.6 and 4.10, we conclude that the laser photocoagulation treatment had a significant effect in delaying the onset of blindness in patients with diabetic retinopathy.

4.3 Other Models Used to Fit the Eye Data

Huster et al. (1989) considered modeling the marginal distributions for the eye data as exponential and as Weibull, or modeling the data via the Cox proportional hazards model. We show the estimated marginal survival curves from fitting the different models in Figures 4.14 and 4.15; the Kaplan-Meier estimate is also given. Note that for the treatment Model 1 and Model 2 are the same, and for the control Model 1 is the Weibull. We see that all fits lie inside the confidence intervals of the Kaplan-Meier estimates except the estimates from the exponential model, and that the estimated survival functions from our log hazard regression models are closer to the Kaplan-Meier estimates than those from the parametric models.

In this section we use Model 2, defined in Section 4.2.2, to decide which model(s) are appropriate for the eye data. Then, based on the test results, if we can choose a standard parametric model, say exponential, Weibull, or proportional hazards, we will use the selected model to test if there is a treatment effect.

4.3.1 Exponential Model

Since Model 2 from (4.3) and (4.4) includes the exponential distribution, we can use the model to test if the marginal distributions are exponential. As we discussed in Section 3.5, we test the two null hypotheses:

  β_{Tj} = 0, j = 1, 3,  and  β_{Cj} = 0, j = 1, 3.

Table 4.2 presents our test results. Hence we reject the hypotheses that the failure times of the treated eye and the untreated eye are exponentially distributed.

Figure 4.14: The estimated survival curves of the treated eye (Kaplan-Meier with 95% confidence intervals, Model 1 = Model 2, exponential, and Weibull fits).

Figure 4.15: The estimated survival curves of the untreated eye (Kaplan-Meier with 95% confidence intervals, Model 2, exponential, and Weibull = Model 1 fits).
Table 4.2: χ² test statistics and p-values for testing that the failure times are exponentially distributed.

  Test                      χ²₂     p-value
  β_{Tj} = 0, j = 1, 3      11.26   0.004
  β_{Cj} = 0, j = 1, 3      8.14    0.017

4.3.2 Weibull Model

To test the hypotheses that the marginal distributions are Weibull, we test the two hypotheses β_{T1} = 0 and β_{C1} = 0. We show the test results in Table 4.3.

Table 4.3: z statistics and p-values for testing that the failure times have Weibull distributions.

  Test           estimate       se             z      p-value
  β_{T1} = 0     1.73 × 10⁻⁴    6.28 × 10⁻⁵    2.75   0.006
  β_{C1} = 0     6.94 × 10⁻⁵    3.59 × 10⁻⁵    1.93   0.053

Thus we reject the hypothesis that the failure time of the treated eye has a Weibull distribution, and the test for the untreated eye is borderline, so we cannot comfortably accept the hypothesis that the failure time of the untreated eye has a Weibull distribution either.

4.3.3 Cox Proportional Hazards Model

Now we use Model 2 to test the hypothesis that the two hazards are proportional. As mentioned in Section 3.5, we need to test

  H_0: β_{Tj} − β_{Cj} = 0, j = 1, 3,

since b_2 is constant. The test statistic and p-value are shown in Table 4.4.

Table 4.4: Test statistic and p-value for testing that the hazards of the treated eyes and the untreated eyes are proportional.

  Test                                χ²₂    p-value
  β_{Tj} − β_{Cj} = 0, j = 1, 3       2.81   0.25

Thus we may assume that the two groups have proportional hazards; that is, we may assume that the relative risk of blindness of the untreated eye versus the treated eye is a constant. Now we assume that the two failure times are independent and fit the Cox proportional hazards model (3.5) to the eye data. To test the treatment effect, we test γ = 0. The results are given in Table 4.5.

Table 4.5: Test results from the Cox proportional hazards model for the eye data.

  Test     γ̂       se      z-value   p-value
  γ = 0    0.777   0.169   4.6       4.2 × 10⁻⁶

The Cox proportional hazards model also indicates that there is a treatment effect. Figures 4.16 to 4.19 compare the pointwise 95% confidence intervals of the log hazards ratio from Models 1 and 2 and from the Cox proportional hazards model, which assumes the two failure times are independent. We can see that the confidence interval from Model 2 is narrower than the confidence interval from the Cox proportional hazards model during months 4 through 23, when most of the failures in both groups occurred.

Figure 4.16: The estimated log hazards ratio from Model 1 and the Cox proportional hazards model, with the pointwise 95% confidence intervals from the Cox proportional hazards model.

Figure 4.17: The estimated log hazards ratio from Model 1 and the Cox proportional hazards model, with the pointwise 95% confidence intervals from Model 1.

Figure 4.18: The estimated log hazards ratio from Model 2 and the Cox proportional hazards model, with the pointwise 95% confidence intervals from the Cox proportional hazards model.

Figure 4.19: The estimated log hazards ratio from Model 2 and the Cox proportional hazards model, with the pointwise 95% confidence intervals from Model 2.

Chapter 5

Simulation

This chapter discusses a simulation study of the estimates in the log hazard regression model. Our main aim is to check the bias and variability of our estimates for the log hazard regression model. We consider different censoring rates and different correlation levels of the paired failure times. We also investigate the three test procedures: for exponential or Weibull marginal distributions and for proportional hazards; see Section 3.5.
Our study consists of two parts: the univariate case and the bivariate case. In the univariate case, presented in Section 5.2, we investigate the estimates of the marginal log hazard functions and the pointwise standard errors. We also test that the marginals follow exponential or Weibull distributions. In the bivariate case, presented in Section 5.3, we examine the estimated log hazards ratios and their estimated standard errors. We also test that the paired failure times follow the proportional hazards model. In Section 5.1 we give a brief description of our data generation and model fitting. We present the simulation results in Sections 5.2 and 5.3 for the marginal and bivariate data, respectively.

In this simulation study we find that, in general, the estimates of the log hazards perform well except in the tails of the failure time distribution or when the censoring is high. The estimated standard errors slightly underestimate the true variability, which is fairly small. The estimated standard errors for the log hazards ratios do not depend much on the correlation level. All of the estimates depend on the censoring rate: the lower the censoring, the better the results. We give a summary of this study in Section 5.4.

5.1 Description of the Simulation Study

In this study, for each distribution of T considered, we generate 200 pairs of T's and C's in each simulation, where the T's are failure times, the C's are censoring times, and the T's and C's are independent. We choose two censoring rates, c = 0.25 or 0.50, i.e.,

  P(T > C) = 0.25  or  P(T > C) = 0.50.

We generate the failure time T from distributions with log hazards of the form

  α(t|β) = β_1 b_1(t) + ... + β_p b_p(t),   (5.1)

where {b_1, ..., b_{p−1}} is a B-spline basis and b_p(t) = log(t). We use three distributions for the marginal failure time T:

Exponential:

  α_T(t) = −log(20).   (5.2)

In this case, we choose the censoring time C with α_C(t) = −log(20) for a 50% censoring rate, or α_C(t) = −log(60) for 25% censoring.

Weibull:

  α_T(t) = −3.68 − 0.18 log(t).   (5.3)

The corresponding censoring time C has the log hazard α_C(t) = −3.68 − 0.18 log(t) for 50% censoring, or α_C(t) = −4.69 − 0.18 log(t) for 25% censoring.

B-spline:

  α_T(t) = 0.00017 b_1(t) − 8.13 + 0.69 log(t),   (5.4)

where b_1 is linear on [0, 1.5), cubic on [1.5, 6.17) and [6.17, 63.33), and zero on [63.33, ∞). See Sections 3.2 and 4.2 for the exact definition of b_1. The censoring time follows a Weibull with log hazard α_C(t) = −13.39 + 2 log(t) for 50% censoring, or α_C(t) = −25.24 + 4 log(t) for a 25% censoring rate.

In all three cases, we choose the parameters of T's distribution to fit the eye data of Chapter 4. We choose the exponential distribution with mean close to the empirical mean of the non-censored failure times of the treated eyes; the Weibull is the estimated distribution of the failure time of the untreated eye based on the given data; and the B-spline model (5.4) is the estimated distribution of the failure times of the treated eye as calculated in Section 4.2.

We use S-plus to generate data from an exponential or Weibull distribution. For data as in the B-spline model (5.4), we first generate a random variable u ~ U(0, 1) and then solve S(t) = u for t numerically. Since t is the (1 − u)th quantile of T's distribution, t can be calculated via heft's quantile function.
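To make the generation step concrete, here is an R sketch (R being the successor of S-plus). The function rsurv and the bound upper are our own hypothetical names; the last lines use the standard fact that for independent exponentials P(T > C) = λ_C / (λ_T + λ_C) to check the censoring rates of the exponential model (5.2).

  ## Generate failure times by inverting a survival function, as done
  ## for the B-spline model (5.4): draw u ~ U(0,1) and solve S(t) = u.
  ## 'surv' must be decreasing with surv(0) = 1 and surv(upper) near 0.
  rsurv <- function(n, surv, upper = 1e4) {
    u <- runif(n)
    sapply(u, function(ui)
      uniroot(function(t) surv(t) - ui, lower = 0, upper = upper)$root)
  }
  ## Check of the exponential censoring rates: hazard 1/20 for C gives
  ## P(T > C) = 1/2, hazard 1/60 gives 1/4.
  Tm <- rexp(1e5, 1/20)
  mean(Tm > rexp(1e5, 1/20))   # about 0.50
  mean(Tm > rexp(1e5, 1/60))   # about 0.25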
We use the following three methods to estimate the log hazard function.

True Model: Fit the data assuming it follows the true model. That is, estimate the parameters in (5.1) using the true b_j's. In this case both the number and the locations of the knots do not depend on the data set.

Quantile Knots: Use the quantiles of the non-censored observations as the knots that define the b_j's in (5.1). We consider two cases: three knots and six knots. In the three-knot case, our knots are the quartiles; in the six-knot case, our knots are the 1/7, ..., 6/7 quantiles. Thus the locations of the knots depend on the data set, but the number of knots does not.

Flexible Knots: Choose the knots that define the b_j's in (5.1) by Kooperberg's heft algorithm. In this case both the number and the locations of the knots depend on the data set.

Thus we have six models for (T, C), from the three distributions for the failure times and the two censoring times for each distribution. For each of these six models we run 500 simulations and calculate the above four estimates of the log hazard functions.

In the bivariate case, we use the Clayton method to generate (T_1, T_2) from marginals for T_1 and T_2 and with a parameter θ (see Section 5.3.1). We consider two types of paired data:

Proportional Hazards: The failure time T_1 and censoring time C_1, say for the treatment, are as in the univariate B-spline model, so the censoring rates for T_1 are 0.5 and 0.25. The failure time T_2, say for the control, is from the distribution with the log hazard

  α_2(t) = 0.00017 b_1(t) − 7.13 + 0.69 log(t).   (5.5)

Then T_1 and T_2 have proportional hazards. We assume that the treatment and control have the same censoring time C = C_1 = C_2.

Non-proportional Hazards: The failure time T_1 and censoring time C_1 of the treatment are as in the univariate B-spline model. The failure time T_2 of the control is from the exponential model (5.2). Thus T_1 and T_2 do not have proportional hazards. Again we assume that the treatment and the control have the same censoring time C, so the censoring rates of T_1 are 0.5 and 0.25.

The correlation level of T_1 and T_2 is determined by a parameter θ; we give more details about θ in Section 5.3.1. In this study we consider three correlation levels between T_1 and T_2: θ = 1, in which case T_1 and T_2 are independent; θ = 1.5; and θ = 2.5. The bigger θ, the more T_1 and T_2 are correlated.

We use the following three methods to estimate the log hazards ratios of the paired data.

Cox: Assume T_1 and T_2 are independent and have proportional hazards. Use the Cox proportional hazards model, as implemented in S-plus's coxph; a sketch is given at the end of this section.
In all of these simulations, we will look at: • plots of the pointwise averages and quantiles of the estimates to study the bias and variability of the estimates. • plots of the pointwise standard deviations of the estimates to study the variability of the estimates. • plots of the quantiles of the pointwise z values of the estimates to assess the reliability of the estimates. The pointwise z values are constructed as estimate - true estimated SE • plots of the empirical distribution of the p-values for testing that the failure times have exponential or Weibull marginal distributions and for testing that the two failure times have proportional hazards. Plots appear at the end of this chapter and in the appendix. 73  5.2  The Univariate Case  We present our simulation results for the models (5.2), (5.3), and (5.4) below in Sections 5.2.1, 5.2.2, and 5.2.3, respectively.  5.2.1  Exponential Model  In this section we apply the True Model, Quantile Knots, and Flexible Knots estimation methods to the data generated from the exponential distribution (5.2). Figure 5.1 shows the histograms of the failure times, the censoring times, the non-censored data and the censored data for those distributions. Since 60 is the 98th percentile of noncensored observations under 25% censoring and the 99.7th percentile of the non-censored observations under 50% censoring, we only plot estimates for values of t between 0 and 60. See Figures 5.2 to 5.4. Note that in the Flexible Knots estimation method, if no knots are chosen and the coefficient of the log tail is zero, then an exponential model is fit. This occurs 468 out of 500 times for 25% censoring and 481 out of 500 times for 50% censoring. Hence the estimates from the Flexible Knots method are usually the same as from the True Model. So we do not include the results from the Flexible Knots estimation method in these figures. Figure 5.2 shows the quartiles and empirical mean of the estimated log hazards from the True Model, Quantile Knots, and Flexible Knots methods and the true log hazard function for the exponential model (5.2). As expected, we find that the estimates from the True Model method are the closest to the true log hazard. For the Quantile Knots method, using three knots seems better than using six knots, which is somewhat surprising. In general, we expect that the more knots are used, the smaller the bias of the estimates but the larger the variability. The six knots estimates are more variable  74  but they are also more biased. This may be since the true distribution is exponential and so no knots are needed. Using more knots would not reduce the bias but it would increase the variability of the estimates. It is not surprising that the higher the censoring rate, the higher the bias of the estimates. Figure 5.3 shows the pointwise standard deviations of the estimated log hazards and the quartiles and the empirical mean of the estimated standard errors of the estimated log hazards.  It is normal that the True Model method has the smallest pointwise  standard deviations. The standard deviations using six knots are bigger than when using three knots. The higher the censoring rate, the bigger the standard deviations. Comparing the the pointwise standard deviations of the estimated log hazards with the pointwise quantiles of the estimated standard errors, we find that the bias of the estimated standard errors is the smallest with True Model method and the largest when using six knots. 
The higher censoring rate causes a bigger bias of the estimated standard errors except with the True Model method. We give the 97.5 and 2.5 percentiles and the quartiles of the pointwise z-values of the 500 estimated log hazards in Figure 5.4. We expect that they are between —2 and 2. We find that they are not out of this range too much for all estimates. Thus, pointwise confidence intervals based on the estimates would be reliable. Figure 5.5 presents the histograms and the qq-plots of Zi =  Pi- Pi  —  se  ,  ft  i = 1,2,3, the standardized estimates of 3 with the Quantile method using three knots. Those figures show that the estimated 3s are approximately normal. For the method using six knots, the graphs look the same. We do not show them here. We also check three test procedures that the failure times follow an exponential distribution. Each procedure involves fitting the log hazard by same regression model and then using a x test that some regression parameters are equal to zero. First we fit 75 2  the data from model 5.2 to a Weibull model, that is, a{t\d) = falogit) + fo. Then we use (Pi / se(Px)) as the test statistic to test Pi = 0. This is the usual para2  metric test of exponential versus Weibull. When using the Quantile Knots method, equation (5.1) can be written as a{t\B) = p\h (<) + fohit) +  PM*)  when using three knots and a(t\0) = PMt) + ••• + PM*) + Peh(t) when using six knots. As discussed in Section 3.5.1, the test is equivalent to test that  Pi = P = 0 3  when using three knots and  Pi = • • • = P = Pe = 0 4  when using six knots. Using the x statistics discussed in Section 3.5.1, we can calculate 2  p-values of the test for each simulation. In Figure 5.6, we give plots of the empirical distributions of the p-value to test the hypothesis that the failure times have an exponential distribution. The plots in the top row in Figure 5.6 show the p-values for the test of exponential versus Weibull, and those in the second and third rows are the p-values for the test of exponential versus B-spline model as in (5.4) when using three and six knots, respectively. Since the null hypothesis is true, the p-value should be uniformly distributed. Thus we expect that each plot would be a straight line. We find that the test works well and it does not depend on the knot selection or the censoring rates too much. However, the test rejects a little too often when the censoring rate is high. 76  5.2.2  Weibull Model  As in Section 5.2.1, we first present the histograms of the data in Figure 5.7. From the histograms we can see that, for both censoring rates, most non-censored failure times were less than 150. In fact 150 is about the 98th percentile for 50% censoring and the 95th percentile for 25% censoring of the observations. So in Figures 5.8 to 5.10, we end our plots at time 150. As we mentioned in Section 3.4, the large range of the distribution of non-censored failure times may cause numerical problems for the Quantiles Knots and the Flexible Knots methods. This happens in our simulations for model (5.3) when we use the Flexible Knots method. Using the knots chosen by the heft algorithm we get four warning messages under 25% censoring and ten warning messages under 50% censoring. Moreover, we can not carry out the calculation for the estimated standard error for any simulated data sets due to the big ranges of the knots. Hence with the Flexible Knots methods we can only get the results of the estimated log hazards for simulated data sets that did not produce warning messages. 
We can not get any results related to the estimated standard errors. However, the Quantile Knots Method works well for the data as in model (5.3). Figure 5.8 shows summary plots of the estimated log hazards compared to the true log hazard of the model (5.3). We find that under 25% censoring the True Model and Quantile Knots estimation methods perform well. The estimates with the Flexible Knots method are highly biased when time is large, especially for high censoring. When we look at the pointwise standard deviations in Figure 5.9, as expected, the True Model method gives the smallest standard deviations. The Quantile Knots method using six knots has the largest standard deviations. When we compare the pointwise quartiles of our estimated standard error with the pointwise standard deviations of the estimated log hazards, we find that all the estimates 77  look fine, except that the estimates are a little bit too small after time 40. The censoring rate effect is the same as in the exponential model: the higher the censoring rate, the bigger the bias and variability of the estimates of the standard errors, particularly for small values of time. Figure 5.10 shows the pointwise 2.5 and 97.5 percentiles and quartiles of the empirical z values of the estimated log hazards. They are as expected for standard normal random variables. In Figure 5.11, we give the histograms and qq-plots of the estimates of 3 from the Quantile Knots method with six knots under 50% censoring. From the plots we can see that the estimates of 3 are approximately normal. For the True Model and three knots estimates, the graphs look the same. We do not present them here. For test procedures, we check not only the tests that data follow an exponential distribution as in Section 5.2.1, but also test that data follow a Weibull distribution. For the Weibull test we fit the data using bfs from the Quantile Knots estimation method. We use the test statistics discussed in Section 3.5.1 to test the hypothesis that the data follow a Weibull distribution. The test is equivalent to testing that /?i = 0 when using three knots and that (5\ = • • • = (3$ = 0 when using six knots. Figures 5.12 and 5.13 show the plots of the empirical distribution functions of the p-values for testing the null hypotheses that the distribution is exponential and that the distribution is Weibull, respectively. We expect a high proportion of small p-values in Figure 5.12 and a straight line in Figure 5.13. From Figure 5.12, we see that under 25% censoring the True Model method and the Quantile Knots method with six knots has the lowest power. Increasing censoring to 50% results in lower power for all methods. From Figure 5.13, we find that the test for Weibull works very well and it depends on neither the number of the knots nor the censoring rates.  78  5.2.3  The B-spline Model  We first look at the histograms of the failure and censoring distributions, the censored failure times, and the non-censored failure times presented in Figure 5.14. From the histograms for the non-censored failure times, we see a high proportion of times on the interval [0,20], with the remaining times almost uniformly distributed on [20,150] for censoring rate 25% and on [20,100] for censoring rate 50%. As in the Weibull model (5.3) case, the large range of the distribution of non-censored failure times causes some numerical problems for the B-spline model when we use the Quantiles Knots and the Flexible Knots methods. 
Using the Flexible Knots method, that is, using the knots chosen by the heft algorithm, we can calculate only 113 estimated standard errors out of 500 simulations under 25% censoring and 306 under 50% censoring. The same problem occurs when we use six knots with the Quantile Knots method. To avoid those problems, when we use six knots we choose the knots which are the quantiles of the non-censored failure times less than 80. We present the simulation results in Figures 5.2.15 to 5.2.19. The results for the Flexible Knots method are based on the estimates that we can get. By comparing the estimated log hazards with the true hazard function in Figure 5.15, we fined that the biases of all estimates are fairly small when time is less than 70 under 25% censoring and when time is less than 40 under 50% censoring. We also find that those two numbers, 70 and 40, are close to the medians, 70.82 and 36.89, of the noncensored failure times in the two models. The estimates with the Flexible method perform well until time 100 under 25% censoring. We expect that using larger knots would get better estimates at tail part if there were not numerical calculation problems. Figure 5.16 gives the pointwise standard deviations of the estimated log hazards and the quantiles and the empirical mean of estimated standard errors. From those plots we find that the True Model method and using three knots give the similar pointwise 79  standard deviations which are smaller than the pointwise standard deviations with using six knots. But the the difference between using six knots and using three knots is no as big as with data generated according to the Exponential and Weibull models. All of the estimates have the biggest variability at time zero. The censoring rate effect is as the same as for data generated as in models (5.2) and (5.3). The biases of the estimated standard error are very small except with the Flexible Knots method. Next we look at the plot of the 2.5 and 97.5 quantiles and the quartiles of the pointwise z values of the estimated log hazards from the True Model and Quantile Knots methods in Figure 5.17. We find that under 25% censoring the estimates look reliable for time is less than 70. Under 50% censoring the reliability of the estimates is not too bad as the time is less than 40, which match what we see from Figure 5.2.14. Different from in the Exponential model and Weibull model cases, for the B-spline model, the estimates with using six knots look more reliable than those with three knots. It does not surprise us since we expect using six knots would obtain better estimates than using three knots. To see if we can rely on the number of knots used by heft algorithm, we also check the numbers of knots used by heft algorithm. The average of the numbers of knots used by heft in the 500 simulations is 5.18, which is close to 6. The histograms and the qq-plots of the estimated parameters fa, fa and fa, in Figure 5.18, show that these estimates look normally distributed but not with mean zero. We also test the hypothesis that the data follow a Weibull distribution. For the Quantile Knots estimation method, the test procedures are the same to those we used for the data from the Weibull distribution (5.3) in Section 5.2.2. For the True Model method, we test fa = 0. For the Flexible Knots method we test fa = • • • = (3 _ = 0, K  2  where K is the number of the knots chosen by the heft for each simulation. The test statistics are the same as we discussed in Section 3.5.1. 
Figure 5.19 shows that, under 25% censoring, by all the four estimation methods the test procedures for that the failure  80  times have a Weibull distribution have a high power except that by using six knots.  5.3  The Bivariate Case  In this section we generate the paired data (T\, C\, T , C ). T\ and T are from (5.2), 2  2  2  (5.3) or (5.4). We assume C\ = C , and the censoring times are defined according to 2  the distribution of Ti from (5.2), (5.3) or (5.4). We are interested in investigating our estimated log hazards ratios and the estimates of their standard errors. We will also test for proportional hazards. In Section 5.3.1 we introduce the method we use to generate dependent data for given marginal distributions. Then we present the simulation results in Sections 5.3.2 to 5.3.4.  5.3.1  Generation of Paired Dependent Data  Clayton (1978) proposed a family of bivariate distributions for survival times. Let S\ and S denote the marginal survivor functions for each member of a pair of failure times 2  (Ti,T ). The joint survivor function for the Clayton model with parameter 9 is 2  i  " e-i  S(t t ,B) u  tt u  2  > 0, 9 > 1  ={  2  ti,t > 0.9 = 1  Si(*i)S (t ), 2  (5.6)  2  2  When 9 = 1, the failure times are independent. The parameter 9 can be written in terms of two conditional hazard functions. If we denote the hazard for the conditional distribution of T\ given T = t and the hazard 2  for Ti given T > t by h \ 2  2  h \T =t {ti)/h \T >M Tl  2  2  Tl  2  Tl  (ti)  T2=lt2  and h \ (ti), Tl  T2>t2  = 9. See Clayton (1978).  81  2  respectively, then in this model  For given Si,S  and 9 we generate bivariate data (Ti,T ) satisfying the Clayton 2  2  model using the following: (A) Generate Ti = ti according to Si, (B) Generate T = t according to the conditional distribution of T given Ti = t . 2  2  2  x  (A) can be carried out as in the univariate case. For (B), when 9 > 1, the conditional probability is (abusing notation slightly), pfrp ^ ,  , x  P(Ti>t \Ti-ti)  P(T > t and Tj = 2  -  2  fr)  -eh  [Sijti) -  dS(t  -  2  u  + S (t y~  1 9  t, 2  9)/dh  - l]"^"  d  2  ^  2  1  (1 -  9)Si(ti)- S[(ti) e  S[{ti) =  [Siiti) '  =  1-  + s (t ) -  1 9  l  2  - l] ^  e  1  2  =  T2 Tl=tl  2  i*T | 7 i = t i ( £ 2 ) 2  f°  r  ^2:  l-P(T >t |Ti=i ) 1  2  1  1-[5 (t ) - + 5 (t ) - -l] ^5 (t )- . 1  1  Solving for S (t ), 2  e  1  1  2  e  T  e  2  1  1  we have  2  5 (i ) = {[(1 - u) ^ 1  2  e  F \ (t ).  We will generate U = u, a uniform [0,1], and solve u = u =  Si(ti)-  2  + l} ^.  - l]Si{ti) l  (5.7)  1  e  Now solve Equation 5.7 for t . Let p denote 1 minus the right part of Equation 5.7 2  2  equal p . Then t is the p th quantile of the distribution function 1 — 5 . If S has 2  2  2  2  2  a standard distribution say mean one exponential, then t =qexp (p ) in Splus. For 2  2  nonstandard distributions, solve S (t ) = 1 — p for t numerically. 2  2  2  2  Thus, for given marginal survivor functions Si and S and the parameter 9, we can 2  generate a random bivariate variable (Ti,T ) = (ti,t ) satisfying the Clayton model by 2  the following procedure. 82  2  1. Generate a random variable Ti = t\ according to Si, 2. Calculate Si(ti); 3. Generate a random variable U = it ~ f/(0,1) ; 4. Calculate P2 = 1 - {[(1 - U) -^ - l]Si{tif1  6  + l}^'j  5. Solve S (t ) = 1 — p for t . 2  2  2  2  Our choices of marginal distributions Si and S follow the proportional hazards and 2  non-proportional hazards model as defined in Section 5.1. 
Figure 5.20 presents the plots of the correlation coefficients of Ti and T vs 9, where T has survivor function S , 2  k  k  k = 1,2 In our simulation studies, we consider these models with 9 = 1,9 = 1.5, and 2.5. When 9 = 1, Ti and T are independent. 2  5.3.2  Proportional Hazards Model  We present our simulation results for the proportional hazards model in this section. We address the effect of the correlation level and the comparisons of the Cox, Same Knots, and Different Knots estimates of the log hazards ratio. Recall that the Cox estimate assumes that 9 = 1. Since we had studied the effect of the censoring rate on the estimates of log hazards in the univariate case, we only present the results for 25% censoring here and give the results for 50% censoring in the Appendix. Figure 5.21 shows the histograms of the two marginals and the non-censored failure times. We can see that there are very few non-censored observations larger than 150. For the same reason as in the univariate case, we only plot results in the the Proportional Hazards model up to time 150.  83  First we look at the plots of the pointwise quartiles and average of the 500 estimated log hazards ratios in Figure 5.22. Comparing them with the true log ratio, we find that the estimates from the Cox method have the smallest bias, as we expected. The plots show that at time values within the range of knots, the biases of the estimates from the Same Knots method and the Different Knots method are almost the same except that the estimates from the Different Knots method have bigger bias at time zero. It looks like the correlation level does not affect the Same Knots and Different Knots estimates too much. But it is interesting to see that the estimates from the Cox method have the smallest bias when 9 = 1.5. Then we look at the plots of the pointwise standard deviations of the 500 estimated log hazards ratios in Figure 5.23. We see that the estimates of the log hazards ratio from the Cox method have the smallest variability. There is no big difference between the variabilities of the estimated log hazards ratio from the other two methods. We compare Figure 5.23 with Figure 5.16, which shows the variability of the marginal log hazards. It seems that near time 0 the variabilities of the estimated log hazards ratio with the Same Knots and Different Knots methods are similar to the variabilities of the estimated marginal log hazards. When we compare the estimated standard errors with the pointwise standard deviations in Figure 5.23, we find that the estimates of the standard error from the Cox methods have the smallest bias when 9 is 1 and 1.5, that is when the paired data are independent or only slightly correlated. But when 9 is 2.5, the estimated standard errors from the other two methods have smaller bias than the Cox method. The reason for the increasing bias of the estimated standard errors from the Cox method might be the violation of the assumption of independence of the two failure times. Over all three correlation levels, the biases of the estimated standard errors from the Same Knots method and the Different Knots method are very close to each other.  84  When we look at the plots of the quantiles of the pointwise z values of the estimated log hazards ratios, Figure 5.24, we note that the estimated log hazards ratios from the Cox method are more reliable than the estimates from the other two methods, and the best case for the Cox method is when 9 = 1.5. 
For the Same Knots and Different Knots methods, the reliability of the estimates of the log hazards ratio is fairly good when time $t$ is between 5 and 70, which is about the range of the knots. Once more, the reliability of the estimates from the Same Knots and Different Knots methods does not depend on the correlation level, which does not surprise us because our estimated standard errors take the correlation into account.

Finally, we test the hypothesis that the failure times have proportional hazards by first estimating the two log hazards using the Same Knots method. We then test $\beta_{1j} - \beta_{2j} = 0$, $j = 1, 2, 3, 4, 6$, as indicated in Section 3.5.2. Using the $\chi^2$ test statistics discussed in Section 3.5.2, we can calculate p-values for the test. We look at Figure 5.25, the plots of the empirical distributions of the p-values for testing that the failure times have proportional hazards. The null hypothesis is rejected far more often than we expected. We expected each plot to be a straight line, since the null hypothesis is true and the p-values should be uniformly distributed. An interesting result is that the distribution curve is closest to the line y = x when the data are highly correlated ($\theta = 2.5$).

From these simulation results, we conclude that for data as in the proportional hazards model, the Cox method works well. The performance of the Same Knots and Different Knots methods is reasonable. The estimated standard errors from all three methods are good, but those from the Cox model deteriorate as $\theta$ increases.

5.3.3 Non-proportional Hazards Model

Figure 5.26 shows the histograms of the two marginals and the non-censored failure times for the data from the non-proportional hazards model. As with the proportional hazards model, we only plot results in the non-proportional hazards model up to time 150.

First we look at Figure 5.27, the plots of the true log ratio and the pointwise quantiles and average of the 500 estimated log hazards ratios. As we expected, the Cox method does not work. The estimates from the other two methods are very close. These estimates have the smallest bias when the data are independent, but it is hard to see any difference between $\theta = 1.5$ and $\theta = 2.5$.

Next we look at Figure 5.28, the plot of the pointwise standard deviations of the 500 estimated log hazards ratios and the pointwise quartiles and mean of the 500 estimated standard errors. Since the Cox method does not work for estimating log hazards ratios for the non-proportional hazards data, we only discuss the results from the other two methods here. We find that the pointwise standard deviations from the Same Knots method are a little bigger than those from the Different Knots method, but the standard deviation from the Different Knots method has a bigger jump at $t = 0$. When we look at the variability of the estimated standard errors, we find that the difference in bias between the Same Knots method and the Different Knots method is very small within the range of knots. The standard errors from the Same Knots and Different Knots methods are too small for $t > 70$; the Different Knots method performs slightly better in this range. The bias and the variability of the estimated standard errors do not depend on the correlation level of the paired data.

We also check the plots of the quantiles of the pointwise z values of the estimated log hazards ratios from the three methods, Figure 5.29.
Those plots show that the quantile curves for the Same Knots and Different Knots methods mainly lie in the range $(-2, 2)$, with the best performance when the data are independent. It is not surprising that the Cox method performs poorly, since the failure times do not have proportional hazards.

Now we consider testing the hypothesis that the failure times have proportional hazards. The test procedures are the same as those we used for the data sets from the proportional hazards model in Section 5.3.2. When we look at the plot of the empirical distributions of the p-values, Figure 5.30, we find that the power of the test is not high, and it is lowest when $\theta = 1$.

5.3.4 Effect of the Number of Knots

In this section we study how the number of knots used in each simulation affects the estimates of the log hazards ratios and the standard errors. The data sets are generated as in the proportional hazards model and the non-proportional hazards model. For the data sets from the proportional hazards model, we use three, six, and nine knots to estimate the marginal hazards of $T_1$ and $T_2$. For the data generated from the non-proportional hazards model, we use three, six, and nine knots for failure time $T_1$ but only three knots for $T_2$. For the data from the proportional hazards model, the true hazard of $T_2$ has three knots; see (5.5). In contrast, for the data from the non-proportional hazards model, $T_2$ is exponentially distributed. As we found in the univariate simulation (Section 5.2), heft will rarely choose more than three knots to fit this exponential. Therefore, we do not need to study fits with more than three knots.

The simulation results are as we expected: the more knots, the smaller the bias of the estimates but the bigger their variability. We show the results in Figures 5.31 to 5.37. From those graphs we find that the estimates of the log hazards ratio using three knots have the smallest variability but the biggest bias. There is no big difference in bias between using six knots and using nine knots, but the estimates using nine knots have the biggest variability.

When we look at the estimated standard errors, we find that the estimates using six knots have the smallest bias and variability. Figures 5.39 to 5.42 show the plots of the quantiles of the pointwise z values of the estimated log hazards ratios from the three methods. It can be seen that the performance of the estimates using six knots is similar to that with nine knots and superior to that using three knots.

Figures 5.43 and 5.44 present the p-values for testing that the two failure times have proportional hazards for the data as in the proportional and non-proportional hazards models, respectively. We find that, for both data sets, the number of knots does not affect the test too much, but the test performs better when $T_1$ and $T_2$ are dependent than when they are independent.

Overall, for the data generated as in the proportional hazards and non-proportional hazards models, using six knots seems better than using three or nine knots. Of course, the value six may also depend on the sample size and the true data distribution.

5.4 Summary of Simulations

Through this simulation study, we find the following. In the univariate case:

1. With all the estimation methods, the estimates of the log hazard perform well within the range of knots used.
However, if the range of knots is large, the Flexible Knots method might cause numerical problems in estimating the log hazards and in calculating standard errors.

2. With the Quantile Knots method, the estimates of standard errors perform well, except that they slightly underestimate the variability of the estimated log hazards when the censoring rate is high.

3. The censoring rate affects the bias and variability of the estimated log hazards. The smaller the censoring rate, the better the estimates.

4. For all marginal models, the True Model method gives the best estimates of the log hazards.

5. The estimates using three knots look a little better than those using six knots for the data generated from the exponential distribution (5.2) and the Weibull distribution (5.3). But for the data generated from the B-spline model (5.4), the estimates with six knots look more reliable.

6. The Flexible Knots method gives better estimated log hazards than the other methods when the ratio of the largest knot to the smallest knot is not too big. Otherwise, calculation of the estimated standard errors and the estimated log hazards causes numerical problems.

7. The test procedures for testing that the failure times follow an exponential or Weibull distribution perform well.

In the bivariate case:

1. With the Same Knots and Different Knots estimation methods, the estimates of the log hazards ratio perform well within the range of knots used. There is not a big difference between the estimates from the two methods.

2. For the data with proportional hazards, the Cox method gives the least-biased estimates of the log hazards ratio when the failure times are independent. For the data from the non-proportional hazards model, the Cox proportional hazards model does not work at all.

3. The test that the failure times have proportional hazards does not perform well. We expect that a bigger sample size might improve the test procedure.

4. The correlation levels do not affect the estimated log hazards ratios too much, but they do affect the test that the failure times have proportional hazards; the test has the lowest power when the two failure times are independent. Further simulations are required to better understand the effects of correlation.

In this simulation we also checked the test procedures for the hypothesis that the two failure times have the same distribution, but there were some numerical problems when we calculated p-values, so we do not present those results here.

One more thing we would like to point out is that our formulae for the estimated standard errors assume non-random knots. The procedures used in our simulations have random knots, except for the True Model method in the univariate case. This might be a reason that the estimated standard errors are a bit small.

Figure 5.1: The histograms of the simulated failure times, censoring times, non-censored failure times, and observed censoring times for exponential data as in model (5.2).
Figure 5.2: Log hazard of the exponential distribution (5.2) and the pointwise quartiles and empirical mean of the 500 estimated log hazards.

Figure 5.3: The pointwise standard deviations of the estimated log hazard for the exponential model (5.2) and the pointwise quartiles and empirical mean of the 500 estimated standard errors.

Figure 5.4: The quantiles of the pointwise z values of the estimated log hazards for the exponential distribution (5.2).

Figure 5.5: Histograms and qq-plots of the normalized estimate of $\beta$ from the Quantile Knots method with three knots for the exponential model (5.2).

Figure 5.6: Empirical distribution functions of the p-values for testing that the failure times are exponentially distributed.

Figure 5.7: The histograms of the simulated failure times, censoring times, non-censored failure times, and observed censoring times for Weibull data as in model (5.3).

Figure 5.8: Log hazard of the Weibull distribution (5.3) and the pointwise quartiles and empirical mean of the estimated log hazards.
Figure 5.9: The pointwise standard deviations of the estimated log hazards for the Weibull distribution (5.3) and the pointwise quartiles and empirical mean of the 500 estimated standard errors.

Figure 5.10: The quantiles of the pointwise z values of the estimated log hazards for the Weibull model (5.3).

Figure 5.11: The histograms and the qq-plots of the standardized estimates of the parameter $\beta$ from the method with six knots for the Weibull model (5.3).

Figure 5.12: Empirical distribution functions of the p-values for testing the hypothesis that the failure time distribution is exponential.

Figure 5.13: Empirical distribution functions of the p-values for testing that the failure times have a Weibull distribution.

Figure 5.14: The histograms of the simulated failure times, censoring times, non-censored failure times, and observed censoring times for the data as in the B-spline model (5.4).

Figure 5.15: Log hazard of the B-spline data as in model (5.4) and the pointwise quartiles and empirical mean of the estimated log hazards.

Figure 5.16: The pointwise standard deviations of the estimated log hazards for the B-spline model (5.4) and the pointwise quartiles and empirical mean of the estimated standard errors.

Figure 5.17: The quantiles of the pointwise z values of the estimated log hazards for the B-spline model (5.4).
Figure 5.19: Empirical distribution functions of the p-values for testing the hypothesis that the failure times have a Weibull distribution.

Figure 5.20: The plots of the correlation coefficients versus $\theta$ for the data as in the proportional hazards and non-proportional hazards models.

Figure 5.21: The histograms of the marginal failure times and non-censored marginal failure times for the data generated according to the proportional hazards model in Section 5.1. The censoring rate for the "treatment" is 25%.

Figure 5.22: The true log hazards ratio of the data from the proportional hazards model defined in Section 5.1 and the pointwise quartiles and empirical mean of the 500 estimated log hazards ratios.

Figure 5.23: The standard deviations of the estimated log hazards ratio for data generated according to the proportional hazards model. The censoring rate is 25% for the "treatment".

Figure 5.24: The quantiles of the pointwise z values of the estimated log hazards ratio for the data generated according to the proportional hazards model. The censoring rate is 25% for the "treatment".

Figure 5.25: Empirical distribution functions of the p-values for testing the hypothesis that the two failure times of the data generated as in the proportional hazards model in Section 5.1 have proportional hazards.

Figure 5.26: The histograms of the marginal failure times and the non-censored marginal failure times for the data from the non-proportional hazards model in Section 5.1. The censoring rate for the "treatment" is 25%.

Figure 5.27: The true log hazards ratio of the data from the non-proportional hazards model defined in Section 5.1 and the pointwise quartiles and empirical mean of the estimated log hazards ratios.
Figure 5.28: The standard deviations of the estimated log hazards ratio for data generated according to the non-proportional hazards model. The censoring rate is 25% for the "treatment".

Figure 5.29: The quantiles of the pointwise z values of the estimated log hazards ratio for the data generated according to the non-proportional hazards model. The censoring rate is 25% for the "treatment".

Figure 5.30: Empirical distribution functions of the p-values for testing the hypothesis that the two failure times of the data generated as in the non-proportional hazards model in Section 5.1 have proportional hazards.

Figure 5.31: The true log hazards ratio of the data from the proportional hazards model defined in Section 5.1 and the pointwise quartiles and empirical mean of the estimated log hazards ratios by the Same Knots method.

Figure 5.32: The true log hazards ratio of the data from the proportional hazards model defined in Section 5.1 and the pointwise quartiles and empirical mean of the estimated log hazards ratios by the Different Knots method.

Figure 5.33: The true log hazards ratio of the data from the non-proportional hazards model defined in Section 5.1 and the pointwise quartiles and empirical mean of the estimated log hazards ratios by the Same Knots method.

Figure 5.34: The true log hazards ratio of the data from the non-proportional hazards model defined in Section 5.1 and the pointwise quartiles and empirical mean of the estimated log hazards ratios by the Different Knots method.
Figure 5.35: The standard deviations of the estimated log hazards ratio by the Same Knots method for data generated according to the proportional hazards model. The censoring rate is 25% for the "treatment".

Figure 5.36: The standard deviations of the estimated log hazards ratio by the Different Knots method for data generated according to the proportional hazards model. The censoring rate is 25% for the "treatment".

Figure 5.37: The standard deviations of the estimated log hazards ratio by the Same Knots method for data generated according to the non-proportional hazards model. The censoring rate is 25% for the "treatment".

Figure 5.38: The standard deviations of the estimated log hazards ratio by the Different Knots method for data generated according to the non-proportional hazards model. The censoring rate is 25% for the "treatment".

Figure 5.39: The quantiles of the pointwise z values of the estimated log hazards ratio by the Same Knots method for the data generated according to the proportional hazards model. The censoring rate is 25% for the "treatment".

Figure 5.40: The quantiles of the pointwise z values of the estimated log hazards ratio by the Different Knots method for the data generated according to the proportional hazards model. The censoring rate is 25% for the "treatment".

Figure 5.41: The quantiles of the pointwise z values of the estimated log hazards ratio by the Same Knots method for the data generated according to the non-proportional hazards model. The censoring rate is 25% for the "treatment".
Figure 5.42: The quantiles of the pointwise z values of the estimated log hazards ratio by the Different Knots method for the data generated according to the non-proportional hazards model. The censoring rate is 25% for the "treatment".

Figure 5.43: Empirical distribution functions, by the Same Knots method, of the p-values for testing the hypothesis that the two failure times of the data generated as in the proportional hazards model in Section 5.1 have proportional hazards.

Figure 5.44: Empirical distribution functions, by the Same Knots method, of the p-values for testing the hypothesis that the two failure times of the data generated as in the non-proportional hazards model in Section 5.1 have proportional hazards.

Chapter 6

Conclusion

In this thesis a flexible parametric model, the log hazard regression model for paired censored failure times, is proposed. In this model B-splines are used to estimate the log hazards of the marginal failure times and the log hazards ratio of the two failure times. Consistent estimates of the standard errors for the estimated marginal log hazards and the log hazards ratio are presented as well. Based on this model we propose test procedures for the four hypotheses that the marginals follow an exponential or Weibull distribution and that the two failure times have the same distribution or have proportional hazards.

A simulation study indicates that when the censoring rate is not too high, the estimates of the log hazards and the log hazards ratio perform well within the range of the knots used, and the estimates of the standard errors for the estimated log hazards ratio are good, though a little small. This underestimation is bigger at time zero. The simulation study also shows that the tests that the marginal failure times follow an exponential or Weibull distribution perform very well. But the tests for the hypotheses that the two failure times have proportional hazards or have the same distribution tend to over-reject the null hypothesis when the null hypothesis is true. Also, the test is not powerful.

The model was applied to the Diabetic Retinopathy Study data to assess the effectiveness of laser photocoagulation in delaying the onset of blindness for diabetic retinopathy patients. The conclusion of the analysis with the log hazard regression model agrees with those from the standard parametric models fitted by Huster et al. (1989); that is, there is a significant laser photocoagulation treatment effect. We use the log hazard model to test the hypothesis that the failure times of the treated eye and untreated eye have an exponential or a Weibull distribution. The test results indicate that there is significant evidence to reject the hypotheses that the failure time of the treated eye follows either an exponential or a Weibull distribution. The test results also show that there is significant evidence to reject that the failure time of the untreated eye follows an exponential distribution. But the test that the failure time of the untreated eye follows a Weibull distribution yields a marginal result: the p-value is 0.053.
The p-value for testing the hypothesis that the failure times of the treated eye and the untreated eye have proportional hazards is very high, 0.25. This, along with the tendency of the test to over-reject (see Section 5.3), indicates that it is reasonable to assume that the failure times of the treated eye and the untreated eye have proportional hazards. This result supports the analysis of Huster et al. (1989).

We note that knot selection is a basic step in using the log hazard regression model. The simulation study shows that our Quantile Knots procedure performs well when the censoring rate is not too high. Even when we restrict the range of the knots, the estimates are fairly reliable within the range of the knots used.

We cannot always choose knots using Kooperberg et al.'s stepwise addition and stepwise deletion method, since the chosen knots often cause numerical problems (see Section 5.2). However, the knots they select give some indication of the number and location of the knots that should be used. From the experience of our simulation, we give the following suggestions for using the log hazard regression models to analyze paired data:

1. Use the heft algorithm to choose a set of knots for each failure time, as mentioned in Section 3.4. If there is no warning message from the heft code and the calculation of estimated standard errors can be performed, then use the Different Knots method to estimate the log hazards ratio with the two sets of knots. If there is any warning message, or the calculation of the estimated standard errors is impossible for one or both failure times, choose the knots for the failure time(s) in question by the quantiles method discussed in Section 3.4, using the number of knots chosen by heft.

2. Test for exponential or Weibull marginals. Fit the data with the log hazards model determined by the knots selected above. Then use the methods mentioned in Section 3.5 to test whether the marginals have exponential or Weibull distributions. If they do have those standard parametric distributions, use those distributions for modeling the marginal failure times.

3. To test proportional hazards, choose one set of knots for the two failure times, using the quantiles of the non-censored failure times as discussed in Section 3.4. The number of knots should be between $\max\{K_1, K_2\}$ and $K_1 + K_2$, where $K_1$ and $K_2$ are the numbers of knots selected by heft for $T_1$ and $T_2$, respectively. Then use the method given in Section 3.5 to test whether the two failure times have proportional hazards.

From the definition of our log hazard regression model, we can generalize the model to multivariate censored survival data easily. We can also include covariates in this model. If we assume that the effects of the covariates are independent of time, then the calculation of the estimates is straightforward, based on the heft algorithm: we can model the log hazard as a function of time plus a function of the covariate. If the effects of the covariates are dependent on time, then the log hazard would be modeled as a bivariate function of time and the covariate. However, the calculations of the estimates would be more complicated. To simplify calculations somewhat, Kooperberg et al. modeled this bivariate function using linear B-splines and their tensor products.

There are some unanswered questions. First, why is the variability of the estimate of the log hazard or the log hazards ratio big at time zero?
We might reduce the problem if we restrict the B-spline functions to be constant between zero and the smallest knot. Second, how do we choose the range of knots? As we see from our simulations, the estimates perform better within the range of the knots than outside of it. We need more simulations to find a better way to determine the range of knots. Third, why do the test procedures for testing that the failure times have proportional hazards or have the same distribution work poorly, while the test procedures for testing that the marginal failure times follow an exponential or Weibull distribution work well? We expect that a bigger sample size might improve the behavior of the test procedures. Finally, why does a high correlation level yield a better test for testing that the failure times have proportional hazards? We need more simulations to understand the effect of the correlation level.

Bibliography

[1] Abrahamowicz, M., Ciampi, A., and Ramsay, J. O. (1992). Nonparametric Density Estimation for Censored Survival Data: Regression-Spline Approach. The Canadian Journal of Statistics 20, 171 - 185.

[2] Andersen, P. K. and Gill, R. D. (1982). Cox's Regression Model for Counting Processes: a Large Sample Study. The Annals of Statistics 10, 1100 - 1120.

[3] De Boor, C. (1978). A Practical Guide to Splines. Applied Mathematical Sciences 27, Springer-Verlag, New York.

[4] Clayton, D. G. (1978). A Model for Association in Bivariate Life Tables and Its Application in Epidemiological Studies of Familial Tendency in Chronic Disease Incidence. Biometrika 65, 141 - 151.

[5] Dabrowska, D. M. (1988). Kaplan-Meier Estimate on the Plane. The Annals of Statistics 16, 1475 - 1489.

[6] Diabetic Retinopathy Study Research Group (1981). Diabetic retinopathy study. Investigative Ophthalmology and Visual Science 21, 149 - 226.

[7] Diggle, P. J., Liang, K. Y., and Zeger, S. L. (1996). Analysis of Longitudinal Data. Clarendon Press, Oxford.

[8] Eilers, P. H. C. and Marx, B. D. (1996). Flexible Smoothing with B-splines and Penalties. Statistical Science 11, 89 - 102.

[9] Eubank, R. L. (1988). Spline Smoothing and Nonparametric Regression. M. Dekker, New York.

[10] Ferguson, T. S. (1996). A Course in Large Sample Theory. Chapman and Hall, London.

[11] Fleming, T. R. and Harrington, D. P. (1991). Counting Processes and Survival Analysis. John Wiley & Sons, Inc., New York.

[12] Gill, R. D. (1992). Multivariate Survival Analysis. Theory Probab. Appl. 37, 18 - 31.

[13] Holt, J. D. and Prentice, R. L. (1974). Survival Analysis in Twin Studies and Matched Pair Experiments. Biometrika 61, 17 - 30.

[14] Huster, W. J., Brookmeyer, R., and Self, S. G. (1989). Modeling Paired Survival Data with Covariates. Biometrics 45, 145 - 156.

[15] Kooperberg, C., Stone, C. J., and Truong, Y. K. (1995). Hazard Regression. Journal of the American Statistical Association 90, 78 - 94.

[16] Kooperberg, C. (1998). Bivariate Density Estimation with an Application to Survival Analysis. Journal of Computational and Graphical Statistics 7, 322 - 341.

[17] Lawless, J. F. (1982). Statistical Models and Methods for Lifetime Data. Wiley, New York.

[18] Murphy, R. P. and Patz, A. (1978). New Concepts in Management of Retinal Vascular Disorder. Ophthalmology Update, International Congress Series No. 508, K. Mizuno and Y. Mitsui (eds), 115 - 125, Amsterdam: Excerpta Medica.

[19] Oakes, D. (1982). A Model for Association in Bivariate Survival Data.
Journal of the Royal Statistical Society, Series B 44, 414 - 428.

[20] Sen, P. K. and Singer, J. M. (1993). Large Sample Methods in Statistics. Chapman and Hall, New York.

[21] Shikin, E. V. (1995). Handbook on Splines for the User. CRC Press, Boca Raton.

[22] Stone, C. J., Hansen, M. H., Kooperberg, C., and Truong, Y. K. (1997). Polynomial Splines and their Tensor Products in Extended Linear Modeling. The Annals of Statistics 25, 1371 - 1470.

[23] Wei, L. J., Lin, D. Y., and Weissfeld, L. (1989). Regression Analysis of Multivariate Incomplete Failure Time Data by Modeling Marginal Distributions. Journal of the American Statistical Association 84, 1065 - 1073.

[24] Wei, L. J. and Lachin, J. M. (1984). Two-Sample Asymptotically Distribution-Free Tests for Incomplete Multivariate Observations. Journal of the American Statistical Association 79, 653 - 661.

Appendix A

Simulation Results for Bivariate Data with 50% Censoring

Figures A.1 to A.10 show the simulation results for the proportional and non-proportional hazards data generated as in Section 5.1 with 50% censoring.

Figure A.1: The histograms of the marginal failure times and non-censored marginal failure times for the data generated according to the proportional hazards model in Section 5.1. The censoring rate for the "treatment" is 50%.

Figure A.2: The true log hazards ratio of the data as in the proportional hazards model defined in Section 5.1 and the pointwise quartiles and empirical mean of the 500 estimated log hazards ratios.

Figure A.3: The standard deviations of the estimated log hazards ratio for data generated according to the proportional hazards model. The censoring rate is 50% for the "treatment".

Figure A.4: The quantiles of the pointwise z values of the estimated log hazards ratio for the data generated according to the proportional hazards model. The censoring rate is 50% for the "treatment".

Figure A.5: Empirical distribution functions of the p-values for testing the hypothesis that the two failure times of the data generated as in the proportional hazards model in Section 5.1 have proportional hazards.
6 : T h e histograms of the marginal failure times, and the non-censored marginal failure times for the data from the non-proportional hazards model in Section 5.1. T h e censoring rate for the "treatment" is 50%.  148  Cox  50  100 e=i  S a m e Knots  150  50  100 6= 1  150  Different Knots  100  50  150  6=1  //.,--  50 100 e = i.5  150  50 100 6 = 2.5  150  50 100 6 = 2.5  150  50 100 e = i.5  150  50 100 6 = 2.5  150  Log hazards ratio Mean of the estimated log hazards ratios Median of the estimated log hazards Upper and lower quartiles of the estimated log hazards ratios  Figure A.7: The true log hazards ratio of the data as in the non-proportional hazards model defined in Section 5.1 and the pointwise quartiles and empirical mean of the 500 estimated log hazards ratios.  149  Cox  0  50 100 6 = 2.5  S a m e Knots  150  0  50 100 6 = 2.5  150  Different Knots  0  50 100 6 = 2.5  150  Standard Deviation Mean of the estimated standard errors Median of the estimated standard errors Upper and lower quartiles of the estimated standard errors  Figure A.8: The standard deviations of the estimated log hazards ratio for data generated according to the non-proportional hazards model. The censoring rate is 50% for the "treatment".  150  Cox  S a m e Knots  Different Knots  0  50 100 0 = 1.5  150  0  50 100 6 = 1.5  150  0  50 100 6 = 1.5  150  0  50 100 6 = 2.5  150  0  50 100 6 = 2.5  150  0  50 100 6 = 2.5  150  Median Upper and lower quartiles 2.5 and 97.5 quantiles  Figure A.9: The quantiles of the pointwise z values of the estimated log hazards ratio for the data generated according to the non-proportional hazards model. The censoring rate is 50% for the "treatment".  151  O.O  0.05  O.IO 9=1  0.15  0.20  0.0  0.05  0.10 8 = 1.5  0.15  0.20  O.O  0.05  0.10 9 = 2.5  0.15  0.20  Figure A. 10: Empirical distribution functions of the p-values for testing the hypothesis that the two failure times of the data generated as in the non-proportional hazards model in Section 5.1 have proportional hazards.  152  
