Approximate Methods for JointModels in Longitudinal StudiesbyLibo LuB.Sc., Shandong University, 2002M.Sc., University of Science and Technology of China, 2005A THESIS SUBMITTED IN PARTIAL FULFILLMENT OFTHE REQUIREMENTS FOR THE DEGREE OFMASTER OF SCIENCEinThe Faculty of Graduate Studies(Statistics)THE UNIVERSITY OF BRITISH COLUMBIA(Vancouver)August 2010c⃝ Libo Lu 2010AbstractLongitudinal studies often contain several statistical issues, such as longitudinalprocess and time-to-event process, the association among which requires jointmodeling strategy.We ﬁrstly review the recent researches on the joint modeling topic. After that,four popular inference methods are introduced for jointly analyzing longitudinaldata and time-to-event data based on a combination of typical parametric models.However, some of them may suﬀer from non-ignorable bias of the estimators.Others may be computationally intensive or even lead to convergence problems.In this thesis, we propose an approximate likelihood-based simultaneous in-ference method for jointly modeling longitudinal process and time-to-event pro-cess with covariate measurement errors problem. By linearizing the joint model,we design a strategy for updating the random eﬀects that connect the two pro-cesses, and propose two algorithm frameworks for diﬀerent scenarios of joint like-lihood function. Both frameworks approximate the multidimensional integral inthe observed-data joint likelihood by analytic expressions, which greatly reducethe computational intensity of the complex joint modeling problem.Weapplythisnew methodtoa realdatasetalong withsomeavailablemethods.The inference result provided by our new method agrees with those from otherpopular methods, and makes sensible biological interpretation. We also conduct asimulation study for comparing these methods. Our new method looks promisingin terms of estimation precision, as well as computation eﬃciency, especially wheniiAbstractmore subjects are given. Conclusions and discussions for future research are listedin the end.iiiTable of ContentsAbstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iiTable of Contents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ivList of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viiList of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viiiAcknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11.1 Some Issues in Longitudinal Studies . . . . . . . . . . . . . . . . . 11.1.1 Longitudinal Data . . . . . . . . . . . . . . . . . . . . . . . 11.1.2 Incomplete Data . . . . . . . . . . . . . . . . . . . . . . . . 41.1.3 Time-to-Event Data . . . . . . . . . . . . . . . . . . . . . . 61.2 Joint Modeling Longitudinal Data and Time-to-Event Data . . . . 71.2.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . 71.2.2 Statistical Models . . . . . . . . . . . . . . . . . . . . . . . 91.2.3 Inference Approaches . . . . . . . . . . . . . . . . . . . . . 101.3 Motivating Example . . . . . . . . . . . . . . . . . . . . . . . . . . 121.3.1 Brief Biological Background . . . . . . . . . . . . . . . . . 121.3.2 Statistical Question . . . . . . . . . . . . . . . . . . . . . . 131.4 Objective and Outline . . . . . . . . . . . . . . . . . . . . . . . . . 14ivTable of Contents2 Statistical Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162.1 Nonlinear Mixed Eﬀects (NLME) Model for Longitudinal Process 162.2 Nonparametric Model for Covariate Measurement Errors Problem 192.3 AFT Model for Time-to-Event Data . . . . . . . . . . . . . . . . . 222.4 Models for Nonignorable Missing Data . . . . . . . . . . . . . . . 263 Simultaneous Inference . . . . . . . . . . . . . . . . . . . . . . . . . 293.1 Two-Step Method . . . . . . . . . . . . . . . . . . . . . . . . . . . 313.1.1 Naive Two-Step Method . . . . . . . . . . . . . . . . . . . 313.1.2 Modiﬁed Two-Step Method . . . . . . . . . . . . . . . . . . 323.2 EM Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 333.2.1 E-Step . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 343.2.2 M-Step . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 363.3 Laplace Approximation . . . . . . . . . . . . . . . . . . . . . . . . 373.4 Linearization Method . . . . . . . . . . . . . . . . . . . . . . . . . 393.4.1 Framework I . . . . . . . . . . . . . . . . . . . . . . . . . . 433.4.2 Framework II . . . . . . . . . . . . . . . . . . . . . . . . . 434 Data Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 454.1 Data Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . 454.2 Model Speciﬁcation . . . . . . . . . . . . . . . . . . . . . . . . . . 464.3 Analysis and Results . . . . . . . . . . . . . . . . . . . . . . . . . 485 Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 535.1 Simulation Design . . . . . . . . . . . . . . . . . . . . . . . . . . . 535.2 Simulation Results . . . . . . . . . . . . . . . . . . . . . . . . . . . 545.2.1 Number of Subjects . . . . . . . . . . . . . . . . . . . . . . 545.2.2 Numbers of Repeated Measurements . . . . . . . . . . . . 565.2.3 Between-Subject Variance . . . . . . . . . . . . . . . . . . 57vTable of Contents5.2.4 Within-Subject Variance . . . . . . . . . . . . . . . . . . . 586 Conclusion and Discussion . . . . . . . . . . . . . . . . . . . . . . . 61Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64viList of Tables4.1 Estimation Results . . . . . . . . . . . . . . . . . . . . . . . . . . . 515.1 Simulation Results: n = 60 . . . . . . . . . . . . . . . . . . . . . . 545.2 Simulation Results: n = 120 . . . . . . . . . . . . . . . . . . . . . . 555.3 Simulation Results: ni = 12 . . . . . . . . . . . . . . . . . . . . . . 565.4 Simulation Results: A2,B2 . . . . . . . . . . . . . . . . . . . . . . 585.5 Simulation Results: µ2,λ2,σ2 . . . . . . . . . . . . . . . . . . . . . 59viiList of Figures1.1 Longitudinal Data of 6 Subjects . . . . . . . . . . . . . . . . . . . . 21.2 Plots of Longitudinal Data of Subject 10 . . . . . . . . . . . . . . . 131.3 Plots of Longitudinal Data of Subject 18 . . . . . . . . . . . . . . . 141.4 Plots of Longitudinal Data of Subject 33 . . . . . . . . . . . . . . . 144.1 Longitudinal Data Trajectories . . . . . . . . . . . . . . . . . . . . 464.2 Estimate of Survivor Function . . . . . . . . . . . . . . . . . . . . . 474.3 Residuals Plots I of NLME Model . . . . . . . . . . . . . . . . . . 494.4 Residuals Plots II of NLME Model . . . . . . . . . . . . . . . . . . 494.5 Residuals Plots I of Covariate Measurement Errors Model . . . . . 504.6 Residuals Plots II of Covariate Measurement Errors Model . . . . . 50viiiAcknowledgmentsFirst and foremost, I would like to express my gratitude to my supervisors, Dr.Lang Wu and Dr. Jiahua Chen for their support, encouragement and patiencethroughout my study at the University of British Columbia. Their world-leadingexpertise and excellent guidance deﬁned a role model for my career. This thesiswould not have been completed without their inspiring comments and constructivesuggestions.I would also like to thank everyone in the Department of Statistics at theUniversity of British Columbia. In the past two years, they have kindly left meinvaluable knowledge, along with wonderful memories. I had a great time withtheir company.Many many many thanks to my parents and Miss. Marie Li for their un-conditional love and support, and to my lifelong friends for their lifelong loyalty.And ﬁnally, I must thank Buddha for always being with me no matter what hap-pens, dispelling my fear, healing my pain, comforting my soul and guiding my lifeforward. It is lucky to be his follower.ixChapter 1Introduction1.1 Some Issues in Longitudinal StudiesWiththedevelopmentofmodernstatisticalmethodology, longitudinal studies playan increasingly important role in health science, medical research, social scienceand economics. This type of study obviously diﬀers from the classical cross-sectional studies in terms of the times of measurement of the response. Thatis to say, in a longitudinal study, some variables are measured more than onceat diﬀerent time points for each subject. Longitudinal studies, which enable theresearchers to separate the cohort and time eﬀects, are valuable because the eﬀectof time could be a serious confounding factor to other covariates of interest.1.1.1 Longitudinal DataSuppose n subjects are included a longitudinal study. The value of the responsevariable y for subject i at time tij is denoted as yij, where i = 1,...,n,j =1,...,ni. This type of data are called longitudinal data. The graphical displayof longitudinal data requires attention. A sample plot of longitudinal data of 6subjects is on the top of Figure 1.1, which is quite diﬃcult for discovering growthpatterns. It is often helpful to distinguish subjects with diﬀerent types of pointsand connect the observations of each subject with lines, such as the bottom plotof Figure 1.1.Comparing to the data obtained from a cross-sectional study, the data col-11.1. Some Issues in Longitudinal StudiesFigure 1.1: Longitudinal Data of 6 Subjects0.0 0.2 0.4 0.6 0.8 1.0123456TimeResponse0.0 0.2 0.4 0.6 0.8 1.0123456TimeResponse21.1. Some Issues in Longitudinal Studieslected from a longitudinal study are naturally correlated. This feature requiresa statistical methodology to take the correlation into account. For a longitudi-nal data analysis, a primary interest is to estimate the population mean and itsrelationship with covariates. A popular method for this purpose is called themarginal model, which separately models the mean structure and the variance-covariance structure of the response variable. Its underlying idea is close to thequasi-likelihood approach that only requires the ﬁrst and second moment informa-tion of the response variable without speciﬁc distributional assumption. Anotherapproach is the transitional model, which assumes a Markov structure for thelongitudinal process to model the correlation among the repeated measurements.This approach is appealing if the current observation strongly depends on previousones.Marginal models and transitional models consider the between-subject varia-tion and within-subject variation separately. A mixed eﬀects model considers thetwo sources of variation simultaneously by incorporating random eﬀects to rep-resent the between-subject variation and within-subject correlations. Laird andWare (1982) proposed a general linear mixed eﬀects (LME) model:yi = Xiβ+Zibi +ei (1.1)bi ∼ N(0,D), ei|bi ∼N(0,Ri), (1.2)where β are the ﬁxed eﬀects (population parameters), bi = (bi1,...,biq)T are therandom eﬀects (subject-speciﬁc parameters), Xi is a ni × (p+ 1) design matrixcontaining the covariates of subject i, Zi is a ni ×q design matrix (Zi is oftena submatrix of Xi) for the random eﬀects, ei = (ei1,...,eini)T are the randomerrors of those within-subject measurements, Ri is a ni ×ni matrix characteriz-ing variance-covariance of within-subject measurements, and D is the variance-31.1. Some Issues in Longitudinal Studiescovariance matrix of the random eﬀects.The inference procedure of mixed eﬀects models is a natural extension of classicregression methods for cross-sectional models, and most ideas and results, such asthe maximum likelihood estimation (MLE) and asymptotic normality, also applyto the estimation of LME models.Diggle et al. (2002) provided a comprehensive overview of the above threemethods. Verbeke and Molenberghs (2001) and Davidian and Giltinan (1995)gave more discussions on linear and nonlinear mixed eﬀects models. In practice,the longitudinal trajectories could be so complex that parametric models arenot ﬂexible enough. For those cases, Wu and Zhang (2006) introduced somesemiparametric or nonparametric models.1.1.2 Incomplete DataThe data observed in longitudinal studies are very likely to be incomplete, suchas measurement errors and missing data. In the following, we brieﬂy introducethese problems.Measurement Errors ProblemIt is ideal that we can get precise values of the selected variables, such as genderand age. However, in practice, the measurements of some continuous variables,such as blood pressure and CD4, are often imprecise due to various reasons.We call this issue as measurement error problem, the distinguishing feature ofwhich is that we can only observe a variable z that are related to z . A majorgoal of measurement error modeling is to obtain nearly unbiased estimates of theparameters in the main model by ﬁtting a sub-model based on z. This objectiverequires careful analysis, because simply substituting z for z without properadjustments may lead to biased estimates in some cases. Carroll et al. (2006)41.1. Some Issues in Longitudinal Studiesreviewed general methods for measurement errors problems.Missing Data ProblemMissing data problems mainly encompass missing responses, missing covariates,and dropouts. A main reason is that not all subjects are available at every mea-surement time throughout the entire study, especially if the study lasts for a longperiod.In the cases of missing responses, missing covariates, and dropouts, part ofthe data are completely missing. The missingness can be classiﬁed into threemechanisms:• missing completely at random (MCAR): missingness depends neither on theobserved data nor on the unobserved data.• missing at random (MAR): missingness may depend on the observed databut not on the unobserved data.• missing not at random (MNAR): missingness may depend on both the ob-served data and the unobserved data.For instance, suppose a student has not submitted his assignment in classon the due day, then his grade for that assignment was missing. If he left hisassignment at home, the missing grade is MCAR. If he missed that class, thenthe missing grade is MAR, since the missingness is determined by the observeddata “attendance”. If he has not ﬁnished his assignment, then the missing gradeis MNAR.There are enormous valuable literatures on this topic, since the missing dataare often informative and nonignorable. Statistical inference without consideringthis issue might result in biased estimation. Little and Rubin (2002) provided asystematic introduction to major missing data methods. Schafer (1997) gave a51.1. Some Issues in Longitudinal Studiesgeneral introduction for incomplete multivariate data analysis. Wu (2009) oﬀeredan overview of analyzing incomplete data problems with mixed eﬀects models.1.1.3 Time-to-Event DataIn many longitudinal studies, the time to certain events happen, such as timeof patient’s death, are of great interest. It is conventional to talk about survivaldata and survival analysis, regardless of the nature of the events. Often, statisticalmodels are built for studying the relationship between the event time and othercovariates.The distinguishing feature of time-to-event data is censoring, which meansthat the event time may not be observed exactly. For example, suppose thata subject is lost to follow-up after 5 years of observation. According to thesecomplex reasons, censoring issues can be classiﬁed into three types:• Right-censoring. The time of event is not observed because it happened tothe right of the 5th year. This case is called right-censoring.• Left-censoring. Suppose a subject had an event before the 5th year but theexact time of the event is unknown. This case is called left-censoring.• Interval-censoring. Suppose a subject had an event between the 4th yearand 5th year but the exact time of the event is unknown. This case is calledinterval-censoring.Here are some possible reasons why censoring may occur:• a subject does not experience the event before the study end;• a person is lost to follow-up during the study period;• a person withdraws from the study because of a death (if death is not theevent of interest) or some other reasons.61.2. Joint Modeling Longitudinal Data and Time-to-Event DataThe censoring feature, which is sometimes referred as a special missing data issue,determines that speciﬁc statistical methodology is necessary.Common approaches in time-to-event data analysis include Cox ProportionalHazards model (Cox model), which assumes that the eﬀect of a covariate is tomultiply the hazard by some constant, and Accelerated Failure Time model (AFTmodel), which assumes that the eﬀect of a covariate is to multiply the predictedevent time by some constant. Time-to-event data have been extensively studied.Lawless (2003) provided comprehensive introduction to the relevant statisticalmodels and methods.1.2 Joint Modeling Longitudinal Data andTime-to-Event Data1.2.1 MotivationThe statistical issues mentioned in section 1.1 often occur simultaneously but usedto be analyzed separately in longitudinal studies. Previous relevant researchesreported that the intrinsic relationships among longitudinal process, time-to-eventprocess and incomplete data mechanism can be greatly inﬂuential to the statisticalinference for diverse scientiﬁc objectives.• Wu and Carroll (1988) reported that analyzing longitudinal data withoutmodeling informative censoring issue might lead to biased results. Theirwork was generalized by DeGruttola and Tu (1994). Little (1995) pointedout the possible estimation bias for analyzing longitudinal data due to in-gnoring the missing data problems. Wu (2002) and Liu and Wu (2007)revealed the eﬀects of measurement errors problems and missing data prob-lems on the inference of longitudinal process. These researches imply thatclassical statistical methods for longitudinal data analysis may need to be71.2. Joint Modeling Longitudinal Data and Time-to-Event Dataadjusted for incorporating certain critical events in longitudinal studies, suchas outcome-dependent drop-out.• Faucett and Thomas (1996) and Wulfshon and Tsiatis (1997) modeled time-to-event data with time-varying covariates measured with errors, where thejoint modeling approach and the separate modeling approach provided dif-ferent estimates. These researches indicated that valid modeling of time-to-event data might be conditional on relevant longitudinal processes, forinstance, the time-varying covariates. Their researches were generalized re-spectively by Xu and Zeger (2001a,b) and Song et al. (2002a,b).• Henderson et al. (2000) jointly analyzed longitudinal data and time-to-eventdata with equal interests using the likelihood-based approaches. Wang andTaylor (2001) and Brown and Ibrahim (2003) further generalized the model-ing assumptions on longitudinal process. Joint modeling of both longitudi-nal process and time-to-event process is particularly essential if the associa-tion between the event times and growth trends of the longitudinal processis of interest.Generally speaking, joint modeling links the longitudinal process and the time-to-event process via a common set of random eﬀects. Under the joint modelingframework, the main focus could be just on the longitudinal process or the time-to-event process, or it could be on both processes. Since this joint modeling approachcontains at least two processes, the parameters to be estimated are usually morethan a single inference. If only one speciﬁc process is of interest, then the otherprocess would better be modeled as simple as possible to reduce the nuisanceparameters.Tsiatis and Davidian (2004) provided an excellent overview on the motivationsof joint modeling and relevant preceding literature. The rest of this section brieﬂy81.2. Joint Modeling Longitudinal Data and Time-to-Event Datareviews both statistical model and inference approach that have been adopted forrecent researches.1.2.2 Statistical ModelsModels for Longitudinal DataMany researches adopt mixed eﬀects models for modeling longitudinal data. Onetypical model, adopted by Ratcliﬀe et al. (2004), Zeng and Cai (2005a,b), Li et al.(2009) and Huang et al. (2009), is parametric linear mixed eﬀects models withcovariates as predictors. Other researchers, such as Brown et al. (2005), Dingand Wang (2008), Ye et al. (2008) and Zhang et al. (2010), used nonparametricmodels for more ﬂexible descriptions of the longitudinal processes. Vonesh et al.(2006) and Yu et al. (2008) adopted nonlinear mixed eﬀects models according tothe corresponding scientiﬁc backgrounds. Rizopoulos et al. (2008), Liu (2009) andLi et al. (2010) established their approaches based on generalized mixed eﬀectsmodels for analyzing categorical longitudinal data.Models for Time-to-event DataFor modeling time-to-event data, some joint modeling studies, including Ratcliﬀeet al. (2004), Song and Huang (2006), Ding and Wang (2008), Li et al. (2009)and Zhang et al. (2010), were based on Cox proportional hazard model, partiallybecause Cox model is widely used in practice. Recently, according to the scientiﬁcbackground of the data, more and more attention have been turned to acceleratedfailure time model, such as Tseng et al. (2005) and Rizopoulos et al. (2008).Particularly, Vonesh et al. (2006) and Rizopoulos et al. (2010) adopted Weibullmodel, which bears the characteristics of both Cox model and AFT model.91.2. Joint Modeling Longitudinal Data and Time-to-Event Data1.2.3 Inference ApproachesTwo-stage ApproachThe idea of two-stage approach is to ﬁrstly estimate the shared parameters bifrom either the longitudinal process or the time-to-event process, such as Zhanget al. (2008) and Ye et al. (2008), with the corresponding parameters of the ﬁrstmodel being estimated. Then the estimated shared parameters ˆbi are used for theestimation of the parameters of the second model. Although two-stage approach iseasy to implement, there are several drawbacks (Wulfshon and Tsiatis, 1997). Forexample, the estimated parameters of one process from the ﬁrst step are regardedas ﬁxed in the modeling of the other process in the second step, which does notpropagate uncertainty from step one to step two.Likelihood-Based ApproachA more uniﬁed alternative of statistical inference is based on the joint likelihoodfunction given both longitudinal data and time-to-event data. The joint modelingapproaches simultaneously estimate the parameters that describe the longitudinalprocess, as well as those that describe the time-to-event process. Besides enhanc-ing the eﬃciency of the inference, joint modeling is also expected to provide moreprecise estimation of the relationship between the two processes.Tseng et al. (2005) considered a LME model for longitudinal data along witha general AFT model for time-to-event model, and developed an EM algorithmto maximize the resulting log-likelihood, which involves intractable integrals overthe distribution of ﬁxed eﬀects. The similar EM algorithms have been applied tomany other researches based on Cox models, such as Ratcliﬀe et al. (2004), Wuet al. (2008), Rizopoulos et al. (2009) and Li et al. (2010). Vonesh et al. (2006)and Rizopoulos et al. (2009) eﬀectively reduced the computational diﬃculties in101.2. Joint Modeling Longitudinal Data and Time-to-Event Datathe E-step by using Laplace approximation.Zeng and Cai (2005a) adopted a LME model and a Cox model for the lon-gitudinal process and the time-to-event process, and discussed the asymptoticbehavior of the MLE obtained from EM algorithm. Hsieh et al. (2006) reinforcedthe merit of the joint modeling approach in Wulfshon and Tsiatis (1997) by provid-ing a theoretical explanation of the robustness features observed in the literature.They suggested that the likelihood-based procedure with normal random eﬀectscan be very eﬃcient and robust as long as there is rich enough information avail-able from the longitudinal data. However, if the longitudinal data are too sparseor carry too large measurement errors, the eﬃciency loss of joint modeling can bequite substantial. Hsieh et al. (2006) also recommended to use bootstrap methodto get more reliable estimates of the standard errors of the MLEs.Brown et al. (2005), Yu et al. (2008) and Zhang et al. (2010) took Bayesianapproaches to joint model two processes and developed Markov Chain MonteCarlo (MCMC) implementations, which were also based on likelihood. Ibrahimet al. (2001b, chap. 7) provided a detailed discussion on joint modeling from aBayesian perspective. Yu et al. (2004) compared the inference results for Bayesianapproach and EM approach, both of which rely on the speciﬁcation of likelihoodfunctions.Semiparametric ApproachSong and Huang (2006) and Song and Wang (2008) focused on estimating theﬁxed eﬀects in the joint models of longitudinal process and time-to-event processby developing a set of unbiased estimating equations (conditional and correctedscore approaches), which yield consistent and asymptotically normal estimatorswith no assumptions on the random eﬀects. Their approaches reduce reliance onthe parametric modeling assumptions.111.3. Motivating Example1.3 Motivating Example1.3.1 Brief Biological BackgroundHuman Immunodeﬁciency Virus (HIV) is a virus that directly attacks certainhuman organs, such as the brain, heart, and kidneys, as well as the human im-mune system. Infection with HIV occurs by the transfer of blood, semen, vaginalﬂuid, pre-ejaculate, or breast milk. Within these bodily ﬂuids, HIV is present asboth free virus particles and virus within infected immune cells. The four ma-jor routes of transmission are unsafe sex, contaminated needles, breast milk, andtransmission from an infected mother to her baby at birth (Vertical transmission).Screening of blood products for HIV has largely eliminated transmission throughblood transfusions or infected blood products in the developed world.The primary cells attacked by HIV are the CD4 lymphocytes, which helpdirect immune function in the body. HIV infection leads to low levels of CD4 cellsthrough three main mechanisms: direct viral killing of infected cells; increasedrates of apoptosis in infected cells; and killing of infected CD4 cells by CD8cytotoxic lymphocytes that recognize infected cells. Since CD4 cells are requiredfor proper immune system function, when enough CD4 lymphocytes have beendestroyed by HIV, the immune system barely works. Many problems experiencedby people infected with HIV result from a failure of the immune system to protectthem from certain opportunistic infections (OIs) and cancers.Most people infected with HIV eventually develop Acquired ImmunodeﬁciencySyndrome (AIDS). These individuals mostly die from opportunistic infections ormalignancies associated with the progressive failure of the immune system. HIVprogresses to AIDS at a variable rate aﬀected by viral, host, and environmentalfactors. Most individuals will progress to AIDS within 10 years of HIV infection,sooner or later. HIV treatment with anti-retrovirals increases the life expectancy121.3. Motivating Exampleof people infected with HIV. Even after HIV has progressed to diagnosable AIDS,the average survival time with antiretroviral therapy was estimated to be morethan 5 years as of 2005. Without antiretroviral therapy, someone who has AIDStypically dies within a year.1.3.2 Statistical QuestionAccording to the clinical experiences, HIV infection often occurs along with thevariation of CD4 cells, CD8 cells and their ratio. Recently, the CD4/CD8 ratiohas become a new biomarker for assessing the relative condition of HIV subjects,and further more, for predicting the progression from HIV to AIDS.After having certain anti-HIV treatments, the viral loads of HIV-infected sub-jects are expected to decline, roughly with biphasic exponential decay. Meanwhile,we might observe a corresponding increase of CD4 and a decrease of CD8, and con-sequently, an increase of CD4/CD8 ratio. However, the viral load might reboundafter a period of treatment due to various reasons. The relationship between HIVviral suppression and immune restoration is of great attention in AIDS research.Figure 1.2: Plots of Longitudinal Data of Subject 100 50 100 1503.03.54.04.55.0Subject 10Timelog10(RNA) CD4100150200250300350400450log10(RNA)CD40 50 100 1503.03.54.04.55.0Subject 10Timelog10(RNA) CD4/CD80.350.400.450.500.550.600.65log10(RNA)CD4/CD80 50 100 150100150200250300350400450Subject 10TimeCD4 CD4/CD80.350.400.450.500.550.600.65CD4CD4/CD8Our research was motivated by an AIDS Clinical Trials Group (ACTG) study(Wu and Ding, 1999) for demonstrating that the initial viral decay rate reﬂects131.4. Objective and OutlineFigure 1.3: Plots of Longitudinal Data of Subject 180 50 100 1502.53.03.54.0Subject 18Timelog10(RNA) CD4250300350log10(RNA)CD40 50 100 1502.53.03.54.0Subject 18Timelog10(RNA) CD4/CD80.400.420.440.460.480.50log10(RNA)CD4/CD80 50 100 150250300350Subject 18TimeCD4 CD4/CD80.400.420.440.460.480.50CD4CD4/CD8Figure 1.4: Plots of Longitudinal Data of Subject 330 50 100 1503.03.54.04.55.05.56.0Subject 33Timelog10(RNA) CD4100150200250log10(RNA)CD40 50 100 1503.03.54.04.55.05.56.0Subject 33Timelog10(RNA) CD4/CD80.180.200.220.240.26log10(RNA)CD4/CD80 50 100 150100150200250Subject 33TimeCD4 CD4/CD80.180.200.220.240.26CD4CD4/CD8the eﬃcacy of an anti-HIV treatment. Figure 1.2, 1.3 and 1.4 are the trajectoriesof HIV viral load, CD4 count, and CD4/CD8 ratio of three randomly selectedsubjects. HIV viral load appears to be negatively correlated with both CD4 countand CD4/CD8 ratio. This article studies the potential underlying associationbetween the viral load decay and the CD4/CD8 time trend by jointly modelingthe HIV viral load dynamics and the time of the ﬁrst decrease of CD4/CD8.1.4 Objective and OutlineIn this thesis, we develop a new approximate statistical inference method forjointly modeling longitudinal process and time-to-event process with covariate141.4. Objective and Outlinemeasurement errors. Comparing to other existing inference methods, the methodsthat we proposed are easy to implement and much more computationally eﬃcient.This thesis is organized as follows. Chapter 2 introduces the statistical modelsthat will be included in our joint modeling. Chapter 3 discusses currently availableinference methods and proposes our new methods with procedure details. Thesemethods are to be applied to a real dataset from the motivating example inChapter 4 for data analysis. The simulation results are displayed in Chapter 5.We ﬁnally make the conclusion and future research discussion in Chapter 6.15Chapter 2Statistical ModelsThe joint modeling of longitudinal process and time-to-event process is basicallycomposed of several simpler models which represent respectively diﬀerent speciﬁcprocesses. In this chapter, we introduce the statistical models for each process inour longitudinal study.2.1 Nonlinear Mixed Eﬀects (NLME) Model forLongitudinal ProcessThe general LME model (1.1) has achieved numerous successes in both theoreticaland applied research for its simplicity in modeling, computation and interpreta-tion. However, linear structure only provides local linear approximation to thegrowth trend of the response variable and thus limits our vision within the rangeof the observed data. In many longitudinal studies, the developments of responsevariables have certain deterministic mechanisms underlying, which can often berepresented by nonlinear mathematical formulations of physically meaningful pa-rameters. The nonlinear formulations, if being correctly speciﬁed, support moreprecise estimation for the statistical model, both inside and outside of the range ofthe observed data, with less parameters than linear models. Therefore, nonlinearmixed eﬀects models (NLME) are becoming increasingly favorable in representingthe longitudinal process with known developing mechanism.Supposensubjects areincluded in alongitudinal study. Letyij be the response162.1. Nonlinear Mixed Eﬀects (NLME) Model for Longitudinal Processvalue of subject i at time tij, and let zij be the corresponding covariate value.We can write the NLME model in the following two-stage hierarchical nonlinearmodel:yij = g(tij,βi)+eij (2.1)βi = h(zi,β,bi) (2.2)bi ∼ N(0,B), ei|bi ∼N(0,Ri), (2.3)where g(·) and h(·) are given nonlinear functions, βi are the subject-speciﬁc pa-rameters and β are the ﬁxed eﬀects (population parameters), zi contain the co-variates of subject i, bi = (bi1,...,biq)T are the random eﬀects distinguishingsubjects, ei = (ei1,...,eini)T are the random errors of within-subject measure-ments,B is the variance-covariance matrix of the random eﬀects, andRi is ani×nimatrix characterizing the variance-covariance of within-subject measurements.In practice, the function g(·) is determined by the scientiﬁc problem and thedata background. The function h(·) is often a linear combination of the ﬁxedeﬀects and the random eﬀects, such asβi = Aiβ+Bibi, (2.4)where Ai is a design matrix depending on the elements of zi, and Bi is a de-sign matrix, which typically involve only zeros and ones, allowing some elementsof βi to have no associated random eﬀects. The random eﬀects bi representbetween-subject variance that is not explained by covariatezi. Thus, the variance-covariance matrix B is usually unstructured. The error term ei represent the lackof ﬁt of the models and possible measurement errors of the data. The choice ofRi should be guided under practical background, on which Davidian and Giltinan(2003) gave detailed discussions.172.1. Nonlinear Mixed Eﬀects (NLME) Model for Longitudinal ProcessA widely applicable approach for estimating parameters of NLME is based onlikelihood. Let θ = (β,κ,B) denotes all parameters, where κ is the collectionof parameters in Ri,i = 1,...,n. The marginal distribution of the response yi isgiven byf(yi|θ) =∫f(yi|zi,θ,bi)f(bi|B)dbi, (2.5)and the likelihood isL(θ|y) =n∏i=1L(θ|yi) =n∏i=1∫f(yi|zi,θ,bi)f(bi|B)dbi, (2.6)The statistical inference is then based on maximum likelihood estimation (MLE)approach, which means that the classic large-sample asymptotic result is promis-ing to hold in this situation.In general, the marginal distribution (2.5) or the likelihood function (2.6) hasno closed-form expression, which greatly distinguishes NLME model from LMEmodel. This characteristic implies that a major challenge of likelihood-basedapproach for NLME model is the maximization of the likelihood function (2.6).Commonly applicable methods are:• Numerical method or Monte Carlo method, which uses diﬀerent integrationtechniques to approximate the integral of (2.6) and then maximizes theapproximate likelihood.• EM algorithm, which iteratively maximizes the likelihood function (2.6).• Laplace approximation method, which approximates the integral of (2.6) bya simple form and then maximizes the approximate likelihood.• Taylor expansion method, which linearizes the NLME into a sequence ofLMEs and approximate the maximum point of the likelihood function (2.6).182.2. Nonparametric Model for Covariate Measurement Errors ProblemDavidian and Giltinan (1995) and Wu (2009) provided comprehensive discussionson the asymptotic results and implementational details.2.2 Nonparametric Model for CovariateMeasurement Errors ProblemIn longitudinal studies, the covariates are usually recorded and analyzed to par-tially explain the between-subject variance. However, the covariates may be mea-sured with substantial errors. It is well known that ignoring covariates measure-ment errors might lead to severe bias in statistical inference (see, e.g., Carroll et al.(2006)). Therefore, the covariate measurement errors problem should be incorpo-rated into the analysis of longitudinal data. Generally, the choice of appropriateanalysis method for measurement errors problem depends on the following twoconsiderations.• The choice of statistical model depends on the type and amount of dataavailable for analysis. A sophisticated error model may be less reliablewithout suﬃcient data available.• The measurement error models can mainly be classiﬁed into functional mod-els and structural models. Comparatively speaking, functional models treatz as a sequence of unknown ﬁxed constants or a random variable with min-imal distributional assumption. It leads to estimation procedures that arerobust to mis-speciﬁcation of the distribution of z . On the other hand,structural models assume z follow speciﬁc parametric distributions, theestimates based on which are usually more eﬃcient if the parametric distri-butions are correctly speciﬁed.From the viewpoint of functional data analysis, measurement errors problemsin longitudinal studies are closely related to ﬁtting several smooth curves. Thanks192.2. Nonparametric Model for Covariate Measurement Errors Problemto the magniﬁcent developments of functional data analysis and nonparametricstatistical analysis in recent years, the researchers now have more ﬂexible tech-niques to handle complex data with high precision. Rice and Wu (2001) proposeda linear mixed eﬀects regression spline model for making subject-speciﬁc inferencefor the covariates with measurement errors and distinguishing the between-subjectand within-subject variance. Liu and Muller (2009) developed a general andmodel-free dynamic equation for obtaining the derivatives of the growth curvesfor sparse data. This empirical diﬀerential equation has potential to provide valu-able insights into the time-dynamics of random trajectories.In our study, we adopt the approach of Rice and Wu (2001) for modelingthe time-varying covariate process to incorporate the measurement errors. Sincethe covariate process could be quite complex, a ﬂexible empirical nonparametriclinear mixed eﬀects model is considered.zi(t) = r(t)+hi(t)+ξi(t) ≡z i (t)+ξi(t), i = 1,...,n. (2.7)wherez i (tik) = r(tik)+hi(tik) are the true but unobserved covariate values, r(tik)and hi(tik) are unknown nonparametric smooth functions representing the pop-ulation means and the subject-speciﬁc variation respectively, and ξi(tik) are theerror terms followingN(0,λ2). The random smooth functionhi(·) is introduced toincorporate the large between-subject variation in the covariate process, while theﬁxed smooth function r(·) represents population average of the covariate process.We assume that hi(·) is the realization of a zero-mean stochastic process.As in Rice and Wu (2001), we can approximate the nonparametric func-tions r(t) and hi(t) by linear combinations of some basis functions Ψp(t) =202.2. Nonparametric Model for Covariate Measurement Errors Problem[ψ0(t),ψ1(t),...,ψp 1(t)]T and Φq(t) = [ϕ0(t),ϕ1(t), ...,ϕq 1(t)]T as follows:r(t) ≈ rp(t) =p 1∑k=0αkψk(t) = Ψp(t)Tα, (2.8)hi(t) ≈ hiq(t) =q 1∑k=0aikϕk(t) = Φq(t)Tai, i = 1,...,n. (2.9)whereα= (α0,...,αp 1)T isap×1vectorofﬁxedeﬀectsandai = (ai0,...,ai,q 1)Tis a q×1 vector of random eﬀects, and we assume ai are i.i.d. ∼ N(0,A). Thefunctions Ψ(·) and Φ(·) can theoretically be taken as any basis functions, includ-ing various types of splines, such as B-splines (de Boor, 1978), smoothing splines(Chambers, 1991) and P-splines (Ramsay and Silverman, 2005), or local polyno-mials (Fan and Gijbels, 1996). The above approximations of r(t) and hi(t) can bearbitrarily accurate as the values of p and q increase. However, too large valuesof p and q may lead to overﬁtting problem. Appropriate values of p and q can bedetermined by standard model selection criteria such as AIC or BIC values (Riceand Wu, 2001). By approximatingr(t) andhi(t) withrp(t) andhiq(t) respectively,the covariate model (2.7) can be approximated by the following LME modelzi = z i +ξi = ΨTpα+ΦTq ai +ξi (2.10)ai ∼ N(0,A), ξi|ai ∼N(0,λ2Ii). (2.11)Model (2.10) can be used to model covariate measurement errors problems(Carroll et al., 2006). For multiple covariates, we can model each covariate sepa-rately using (2.10). However, a more reasonable approach is to consider a multi-variate version of model (2.10), such as a multivariate LME model (Shah et al.,1997).212.3. AFT Model for Time-to-Event Data2.3 AFT Model for Time-to-Event DataIn the analysis of time-to-event data, the survivor function and the hazard func-tion, which depend on time, are of particular interest. Denote f(t) as the prob-ability density function (p.d.f.) of event time T. The survivor function S(t) isdeﬁned as the probability of surviving to time tS(t) = Pr(T ≥t) =∫ 1tf(x)dx. (2.12)The graph of S(t) against t is called the survival curve. The hazard function h(t)is deﬁned as the event rate at time t conditional on no event occurs until time tor later.h(t) = lim∆t!0Pr(t≤T <t+∆t|T ≥t)∆t= f(t)S(t) = −ddt logS(t). (2.13)Many statistical analysis methods for analyzing time-to-event data, no mat-ter they are parametric, semiparametric or nonparametric, are closely related tothe likelihood function given the data, which is determined by the existence andspeciﬁc type of censoring issue. Let’s consider a simple right-censoring case forexample. Let Ti be the event time for subject i,i = 1,...,N. Let Ci be the cen-soring time for subjecti. Obviously, if no censoring occurs, Ti is observed and lessthan the potential censoring time (Ti ≤ Ci). If censoring occurs, we only knowthat Ti >Ci. In general, we denoteρi = min(Ti,Ci), δi = I(Ti ≤Ci). (2.14)222.3. AFT Model for Time-to-Event DataBoth ρi and δi are random variables and their joint p.d.f. isf(ρi)δiS(ρi)1 δi. (2.15)Suppose T1,...,Tn are independent, we may obtain the likelihood function giventhe right-censored data asL =N∏i=1f(ρi)δiS(ρi)1 δi. (2.16)The Kaplan-Meier method, the nonparametric MLE of the survivor functionS(t), can be used to estimate this curve from the observed survival times withoutassumption on distribution. However, in many cases, the survival function S(t)is assumed following certain distributions. For instance, if S(t) can be written asS(T|u,σ) = S0(logT −uσ), (2.17)where u∈R is a location parameter and σ> 0 is a scale parameter, we say T hasa log-location-scale distribution, which is a widely used parametric distributionfor time-to-event data. If we generalize u to u(z), we may have more covariates zinto account, that isS(T|z,u,σ) = S0(logT −u(z)σ). (2.18)Suppose ϵ is a random variable with survivor function S0(ϵ) and is independentfrom z, then (2.18) can be rewritten aslogT = u(z)+σϵ. (2.19)Here the survivor function is represented in a regression setting.232.3. AFT Model for Time-to-Event DataFormula (2.19) is a general semiparametric AFT model. In practice, u(·) canbe taken as a linear function and we have the following parametric AFT modellogT = γ0 +γT1 z+σϵ, (2.20)where ϵ is independent from z. In this thesis, we consider the random eﬀects aiand bi as covariates and build the following AFT frailty model for time-to-eventdata.logTi = γ0 +γT1 ai +γT2 bi +σϵi, i = 1,...,n. (2.21)where γ = (γ0,γ1,γ2) are the unknown parameters and ϵi are i.i.d. randomvariables. Note that the NLME longitudinal model (2.1) and the time-to-eventmodel (2.21) are connected through the random eﬀects bi.There are three major parametric AFT models based on diﬀerent distributionsof ϵ.• Weibull AFT model. Assume the event times follow a Weibull distributionwith scale parameter exp(−(γ0+γT1 ai+γT2 bi)/σ) and shape parameter 1/σ,the survivor function isS(t) = exp{−exp(logt−γ0 −γT1 ai −γT2 biσ)}, (2.22)Because the Weibull distribution has both the proportional hazards prop-erty and the accelerated failure time property, this model is particularlyattractive.• The log-logistic AFT model. Assume the event times follow a log-logisticdistribution with parameters (γ0 +γT1 ai +γT2 bi)/σ and 1/σ, the survivor242.3. AFT Model for Time-to-Event Datafunction isS(t) ={1+exp(logt−γ0 −γT1 ai −γT2 biσ)} 1. (2.23)• The log-normal AFT model. Assume the event times follow a log-normaldistribution with parameters γ0 +γT1 ai +γT2 bi and σ, the survivor functionisS(t) = 1−Φ(logt−γ0 −γT1 ai −γT2 biσ). (2.24)Once the distribution is assumed, one may ﬁnd the corresponding likelihood func-tion by specifying thef(t) andS(t) in (2.16). Suppose the survival timeϵi in AFTmodel follows Weibull distribution, the pdf and survivor function of ϵi isf(ϵ) = exp(ϵ−eϵ), S(ϵ) = exp(−eϵ), (2.25)that isf(t) = exp[σ 1(logt−γ0 −γT1 ai −γT2 bi) (2.26)− exp(σ 1(logt−γ0 −γT1 ai −γT2 bi))], (2.27)andS(t) = exp[−exp(σ 1(logt−γ0 −γT1 ai −γT2 bi))]. (2.28)The log-likelihood function of the AFT model for subject i isℓ(γ|ai,bi) =n∑i=1(δiϵi −eϵi)=n∑i=1[δiσ 1(logTi −γ0 −γT1 ai −γT2 bi)−exp(σ 1(logTi −γ0 −γT1 ai −γT2 bi))].252.4. Models for Nonignorable Missing DataBy integrating out the random eﬀects ai,bi, we can obtain the log-likelihoodfunction of γ.ℓ(γ) =n∑i=1∫∫ℓ(γ|ai,bi)f(ai|A)f(bi|B)daidbi (2.29)=n∑i=1∫∫ (δiσ 1(logTi −γ0 −γT1 ai −γT2 bi))f(ai|A)f(bi|B)daidbi−n∑i=1∫∫ [exp(σ 1(logTi −γ0 −γT1 ai −γT2 bi))]f(ai|A)f(bi|B)daidbi= σ 1n∑i=1δi[logTi −γ0 −γT1E(ai)−γT2E(bi)]−n∑i=1exp(σ 1(logTi −γ0))∫exp(−σ 1γT1 ai)f(ai|A)dai×∫exp(−σ 1γT2 bi)f(bi|B)dbi= σ 1n∑i=1δi[logTi −γ0 −γT1E(ai)−γT2E(bi)]−n∑i=1[exp(logTi −γ0σ)× exp(−γT1E(ai)σ +γT1 Σ(ai)γ1σ2 −γT2E(bi)σ +γT2 Σ(bi)γ2σ2)]Then the statistical inference could be carried out based on the log-likelihoodfunction (2.29) and the MLE theory applies.2.4 Models for Nonignorable Missing DataLots of generally applicable statistical inference methods for incomplete data is-sues are likelihood-based, the idea of which is illustrated by a missing responsecase in the following. Suppose some values of the response variable y are missing.Let ri = 1 if yi is missing and ri = 0 if yi is observed. Denote the latent variablethat carries the information of the missingness as ω, and the joint distributionof response y and missingness indicator r, condition on covariates x, is given by262.4. Models for Nonignorable Missing DataPr(y,r|x,ω). Since this distribution can be rewritten asPr(y,r|x,ω) = Pr(y|x,θ(ω))Pr(r|y,x,ψ(ω)), (2.30)whereθ(ω),ψ(ω) are subsets or transformation ofω, the likelihood function can beobtained if the full-data response modelPr(y|x,θ(ω)) and the missing mechanismPr(r|y,x,ψ(ω)) are speciﬁed. The statistical inference can therefore be carriedout based on by maximum likelihood theory.Little (1992) reviewed the statistical methods of estimation in regression mod-els with missing covariates. Six methods dealing with missing covariates are com-pared: complete-case methods, available-case methods, least squares on imputeddata, maximum likelihood methods, Bayesian methods and multiple imputation.He suggested that the maximum likelihood method, Bayesian methods, and mul-tiple imputation method perform well, and the maximum likelihood method ispreferred in a large samples scenario and Bayesian methods or multiple imputa-tion method are preferred in a small samples scenario.In the speciﬁcation of NLME model (2.1), all covariate values should be avail-able, either observed or estimated, at the response measurement times tij. How-ever, in practice this may not be the case since covariates may be measured atdiﬀerent time points or may be missing at tij. Model (2.10) automatically incor-porates missing covariates when the missing data mechanism is ignorable (e.g.,MAR or MCAR), since model (2.10) is ﬁtted to the observed covariates whichmay be measured at diﬀerent time points than the responses. When the missingcovariates are nonignorable (i.e., the missingness may be related to the missingvalues), the missing data mechanism should be modeled.Little (1995) gave an overview on modeling dropout mechanism in longitudinalstudies. For example, we may assume that the missingness is related to the true272.4. Models for Nonignorable Missing Databut unobserved covariate value and the previous missing status. That is, we mayassume thatlogit(Pr(rij = 1)) = η0 +η1z i (tij)+η2ri,j 1, (2.31)wherez i (tij) = Ψp(tij)Tα+Φq(tij)Tai. We can then incorporate the missing datamodel into the joint likelihood for inference. Alternative missing data models mayalso be considered.The statistical analysis in this thesis is mainly based on likelihood. When themissing covariates are nonignorable, we only need to add an additional model forthe missing mechanism to the joint likelihood and then proceed the analysis. Forsimplicity of presentation, we only focus on ignorable missing covariates in thefollowing sections.28Chapter 3Simultaneous InferenceHaving introduced several models in previous chapter for the statistical prob-lems discussed in Section 1.1, the attentions are turned to how to analyze thedata based on these models. As we emphasized in Section 1.2.1, the longitudinaldata and the survival data are associated and the models share the same ran-dom eﬀects. Thus, separate analysis of the longitudinal data and survival datamay lead to biased results. This essential feature requires a simultaneous infer-ence for those statistical models. In this chapter, we ﬁrstly introduce a simplemethod that can be implemented easily by standard softwares. Then we discusstwo likelihood-based simultaneous inference methods, EM algorithm and Laplaceapproximation, which both have their own advantages and are widely applica-ble. Finally, we linearize the joint models and propose a new likelihood-basedapproximate simultaneous inference approach. The purpose of Laplace approx-imation and the liearization method is to reduce computation burden since theEM algorithm is computationally intensive.A brieﬂy review below is to remind the readers of the statistical models thatare adopted in this thesis. Suppose n subjects are included a longitudinal study.Let yij be the response value for subject i at time tij, zij be the correspondingcovariate, and Ti be the event time of subject i, i = 1,...,n. The statisticalmodels that we introduced in Chapter 2 are reviewed as follows:29Chapter 3. Simultaneous Inference• Parametric NLME model for the longitudinal process:yij = g(tij,βi)+eij, i = 1,...,n, j = 1,...,ni, (3.1)βi = h(zi,β,bi) (3.2)bi ∼ N(0,B), ei|bi ∼N(0,µ2Ii), (3.3)where g(·) and h(·) are given nonlinear functions,βi are the subject-speciﬁcparameters and β are the ﬁxed eﬀects (population parameters), zi containthe covariates for subject i, bi = (bi1,...,biq)T are the random eﬀects be-tween subjects, ei = (ei1,...,eini)T are the random errors of within-subjectmeasurements, B is the variance-covariance matrix of the random eﬀects,and µ characterizes variance of within-subject measurements.• Semiparametric LME model for the covariate measurement errors problem:zi = z i +ξi = ΨTpα+ΦTq ai +ξi, i = 1,...,n, (3.4)ai ∼ N(0,A), ξi|ai ∼N(0,λ2Ii), (3.5)where Ψ(·) and Φ(·) are some basis functions, α are the ﬁxed eﬀects andai are the random eﬀects between subjects, ξi = (ξi1,...,ξini)T are therandom errors of within-subject measurements, A is the variance-covariancematrix of the random eﬀects, and λ characterizes variance of within-subjectmeasurements.• Parametric AFT model for the time-to-event process:logTi = γ0 +γT1 ai +γT2 bi +σϵi, i = 1,...,n. (3.6)where ϵi follows a standard extreme value distribution.303.1. Two-Step Method3.1 Two-Step MethodMost joint modelings of longitudinal process and time-to-event process are basedon the assumption that two processes are linked with some shared parameters,which are usually the random eﬀects representing variations between subjects.A simple statistical inference method is to make independent inference for oneprocess in the ﬁrst step to estimate the shared parameters, and then incorporatethe estimated parameters into the inference for the other process in the secondstep. We call this method Two-Step Method.3.1.1 Naive Two-Step MethodSince both the longitudinal process and the time-to-event process depend on thecovariate z, which were measured with errors, we need to solve the measurementerrors problem (3.4) at the beginning. Laird and Ware (1982) gave the LMEmodel formulation and proposed an EM algorithm to ﬁt the model by treating therandom eﬀects as missing data. Lindstrom and Bates (1988) employed the linearfeature of LME model and obtained the ﬁrst two marginal moments for yi. Theﬁxed eﬀects β can be estimated by generalized least squares, and the variancecomponents can be estimated by normal-theory pseudo-likelihood or restrictedmaximum pseudo-likelihood using a general framework based on Newton-Raphsonmethod. Pinheiro and Bates (2002) described the covariance parametrizations ofA, which could be more diverse and complex than our case. By ﬁtting the LMEmodel (3.4), we have the “true” covariate z i for modeling the longitudinal processand the estimate of random eﬀects ai for modeling the time-to-event process.For the NLME model (3.1), Lindstrom and Bates (1990) alternated between apenalized nonlinear least squares (PNLS) step and a linear mixed eﬀects (LME)step. In the PNLS step, the current estimate of precision factor is held ﬁxed,and the conditional estimates of the random eﬀects bi and the ﬁxed eﬀects β are313.1. Two-Step Methodobtained by minimizing a penalized nonlinear least squares objective function.The LME step then updates the estimate of the precision factor based on the ﬁrst-order Taylor expansion of the likelihood function around the current estimates ofβ and bi respectively. Using this method to ﬁt the longitudinal model (3.1), wecan obtain the estimate of the ﬁxed eﬀectsβ and the random eﬀects bi for ﬁttingthe time-to-event model.The time-to-event process is analyzed by the AFT regression model. Havingthe random eﬀects ai and bi estimated from above two models, we can ﬁt theAFT regression model by standard statistical packages.Thus, the naive two-step approach can be summarized as below.• Covariate model inference. For a time-varying covariate, assume a ﬂexibleand robust model to estimate the true covariate values and the randomeﬀects between subjects.• Nonlinear mixed eﬀects model inference. Estimate the ﬁxed eﬀects, the ran-dom eﬀects and the variance components with the covariate values measuredwith errors replaced by the true values.• Time-to-event model inference. Fit the time-to-event model based on theestimated random eﬀects from above two estimations.3.1.2 Modiﬁed Two-Step MethodThenaivetwo-stepmethodiseasilyunderstandableandexecutableusingstandardstatistical software. However, as pointed out by Ye et al. (2008) and Albert andShih (2009), the naive two-step method may lead to two types of bias.• The covariate trajectories may be diﬀerent according to the censoring statusof the event times. Thus, applying a single covariate model to all covariatedata may lead to biased estimation for the ﬁrst model.323.2. EM Algorithm• Inference for the second model that ignores the estimation uncertainty inthe ﬁrst model may also lead to biased results (e.g., under-estimating thestandard errors).The ﬁrst bias, called bias from informative dropouts, may depend on the strengthof the association between the longitudinal process and the time-to-event process.The second bias may depend on the magnitude of measurement errors in covariate.Therefore, we need to modify the naive two-step method to reduce these biases.To incorporate the estimation uncertainty in the ﬁrst step, we could adjustthe standard errors of the parameters’ estimates using bootstrap method by re-sampling subjects and keeping the repeated measurements of resampled subjects.After repeating above procedure B times, the standard errors for the estimates ofthe ﬁxed parameters can be estimated by the sample variance-covariance matrixacross theB bootstrap datasets, which is expected be more reliable than the naivetwo-step method results, if the assumed models are correct.However, the modiﬁed two-step method can not completely reduce the biasesin the naive two-step method, because it still treats the two processes separately.In order to make reliable inference, simultaneous inference based on a joint modelmay be more appropriate. In order to provide an entirely decent inference method,making sure that the method considers two processes simultaneously is the ﬁrstand most fundamental step to do. In the following sections, we consider someuniﬁed approaches based on the joint likelihood function given all observed lon-gitudinal data and time-to-event data.3.2 EM AlgorithmThe two-step methods introduced in the previous section, naive or modiﬁed, ﬁtthe models separately with a simple “plug-in” strategy. EM Algorithm can be333.2. EM Algorithmused to estimate all parameters in the longitudinal model, the covariate modeland the time-to-event model simultaneously based on the joint likelihood withthe bias of the naive two-step method avoided. The joint likelihood approachis quite general and can be extended to statistical inference for more than tworelated models. Such a joint likelihood method provides eﬃcient estimation if theassumed models are correct.Speciﬁcally, let θ = (α,β,γ,µ,λ,σ,A,B) be the parameters’ collection ofabove three models, and let f(·) denotes the generic density functions and F(·)be the corresponding cumulative distribution functions. The joint log-likelihoodgiven all the observed data can be written as:ℓo(θ|y,z,ρ,δ) =n∑i=1ℓ(i)o (θ|yi,zi,ρi,δi)=n∑i=1∫∫log[f(yi|ai,bi;α,β,µ)f(bi|B)f(zi|ai;α,β,λ)f(ai|A)×f(ρi|ai,bi;γ,σ)δi(1−F(ρi|ai,bi;γ,σ))1 δi]daidbi. (3.7)The EM algorithm (Dempster et al., 1977) is a powerful iterative algorithm tocompute MLEs in a wide variety of situations, including missing data problemsand random eﬀects models. To get the MLE of θ in our case, we may consideran EM algorithm by treating the unobservable random eﬀects (ai,bi) as “missingdata”. EM algorithm contains an E-step, which computes the conditional expecta-tion of the “complete data” log-likelihood given the observed data and parameterestimates, and an M-step, which maximizes the conditional expectation in theE-step to update the parameters’ estimates.3.2.1 E-StepThe “complete data” are {yi,zi,ρi,δi,ai,bi},i = 1,2,··· ,n, and the parameterestimates at the tth iteration is denoted as θ(t). The E-step at the (t + 1)th343.2. EM Algorithmiteration can be written as:Q(θ|θ(t)) =n∑i=1∫∫[logf(bi|B)+logf(yi|ai,bi;α,β,µ)+logf(ai|A)+logf(zi|ai;α,β,λ)+log(1−F(ρi|ai,bi;γ,σ))1 δi+logf(ρi|ai,bi;γ,σ)δi]f(ai,bi|yi,zi,ρi,δi;θ(t))daidbi, (3.8)The computation of the conditional expectation Q(θ|θ(t)) in the E-step is usu-ally non-trivial for high-dimensional integrals. However, since the integral is anexpectation with respect to f(ai,bi|yi,zi,ρi,δi;θ(t)), it can be evaluated basedon the Monte Carlo EM of Wei and Tanner (1990) and Ibrahim et al. (2001a).That is to generate mt samples from f(ai,bi|yi,zi,ρi,δi;θ(t)) and approximatethe expectation Q(θ|θ(t)) by its empirical mean, with the random eﬀects ai,bireplaced by the simulated values a(j)i ,b(j)i ,j = 1,...,mt.Q(θ|θ(t)) = 1mtmt∑j=1n∑i=1[logf(yi|a(j)i ,b(j)i ;α,β,µ)+logf(b(j)i |B)+logf(zi|a(j)i ;α,β,λ)+logf(a(j)i |A)+δi logf(ρi|a(j)i ,b(j)i ;γ,σ)+(1−δi)log(1−F(ρi|a(j)i ,b(j)i ;γ,σ))]= Q(1)(α,β,µ)+Q(2)(B)+Q(3)(α,λ)+Q(4)(A)+Q(5)(γ,σ). (3.9)We may choose m0 as a large number and mt = mt 1 + mt 1/k,(k ≥ 1) inthe tth iteration. Increasing mt at each iteration may speed up the algorithm’sconvergence (Booth and Hobert, 1999).To generate independent samples from f(ai,bi|yi,zi,ρi,δi;θ(t)), we may useGibbs sampler (Gelfand and Smith, 1990) by sampling from the following full353.2. EM Algorithmconditionals iteratively:f(ai|yi,zi,ρi,δi,bi;θ(t))∝ f(ai;θ(t))f(yi|ai,bi;θ(t))f(zi|ai;θ(t))f (ρi,δi|ai,bi;θ(t)),f(bi|yi,zi,ρi,δi,ai;θ(t))∝ f(bi;θ(t))f(yi|ai,bi;θ(t))f (ρi,δi|ai,bi;θ(t)),where f (ρi,δi|ai,bi;θ(t)) = f(ρi|ai,bi;θ(t))δi(1−F(ρi|ai,bi;θ(t)))1 δi. Thus, wecan use the Gibbs sampling method to iteratively sample from the above fullconditionals. After a burn-in period, we can obtain a sample from the conditionaldistribution f(ai,bi|yi,zi,ρi,δi;θ(t)).3.2.2 M-StepThe parameter θ at the (t + 1)th iteration can be estimated by maximizingQ(θ|θ(t)). Generally, Q(θ|θ(t)) is a nonlinear function and there is no closed-form expression for the estimate ˆθ(t+1). The maximizers could be obtainedvia standard numerical optimization procedures for complete data, such as theNewton-Raphson method or more advanced algorithms (Fletcher, 1987; Nocedaland Wright, 2006).EM algorithm iterates between the two steps until convergence and the ﬁ-nal estimates of θ are regarded as a local maximum. The asymptotic variance-covariance matrix of ˆθ can be obtained using well-known complete-data formulas(Louis, 1982), where the expectations in those formulas can be approximatedby Monte-Carlo means. Speciﬁcally, note that the observed information matrixequals the expected complete information minus the missing informationIobs(ˆθ) = Icom(ˆθ)−Imis(ˆθ)363.3. Laplace ApproximationDeﬁne˙Q(θ|ˆθ) =n∑i=1˙Qi(θ|ˆθ) =mt∑j=1n∑i=11mtSij(θ)¨Q(θ|ˆθ) = ∂2Qi(θ|ˆθ)∂θ∂θT =mt∑j=1n∑i=11mt∂Sij(θ)∂θ .Since the parameters are all distinct, above matrices are block diagonal. Thenthe asymptotic observed information matrix isIobs(ˆθ) = −¨Q(θ|ˆθ)−mt∑j=1n∑i=11mtSij(ˆθ)STij(ˆθ)−n∑i=11mt˙Q(θ|ˆθ) ˙QT(θ|ˆθ). (3.10)The asymptotic variance-covariance matrix of ˆθ can be approximated byCov(ˆθ) = I 1obs(ˆθ).3.3 Laplace ApproximationThe convergence of the foregoing Monte-Carlo EM algorithm depends on the di-mension of the random eﬀects (ai,bi). If the dimension of the random eﬀects(ai,bi) is not small, for instance larger than 2, the computation of Monte-CarloEM method can be extremely intensive since the method involves simulatinglarge samples at each iteration. Moveover, the algorithm may not converge insome cases. Therefore, it is valuable to approximate the observed-data joint log-likelihood ℓo(θ|y,z,ρ,δ) using the ﬁrst-order Laplace approximation, which com-pletely avoids integration and fastens the entire computation speed signiﬁcantly.373.3. Laplace ApproximationThe “complete data” log-likelihood can be written as:ℓc(θ|a,b) =n∑i=1ℓ(i)c (θ|yi,zi,ρi,δi)=n∑i=1log[f(yi|ai,bi;α,β,µ)f(bi|B)f(zi|ai;α,β,λ)f(ai|A)×f(ρi|ai,bi;γ,σ)δi(1−F(ρi|ai,bi;γ,σ))1 δi] (3.11)Vonesh et al. (2002) and Lee et al. (2006) showed that the observed-data log-likelihood ℓo(θ|y,z,ρ,δ) can be approximated by its ﬁrst-order Laplace approxi-mation˜ℓo(θ|~a,~b) = ℓc(θ|~a,~b)− 12 log − 12π∂2ℓc(θ|a,b)∂(a,b)2 (a,b=~a,~b), (3.12)where (~a,~b) = {(~ai,~bi),i = 1,2,...,n} solve the equations∂ℓ(i)c (θ|ai,bi)/∂(ai,bi) = 0, i = 1,2,...,n. (3.13)Thus, an approximate estimate of θ can be obtained by solving the equation∂˜ℓo(θ|~ai,~bi)/∂θ = 0. (3.14)Given starting values θ(0) and (a(0),b(0)), we iterate the above procedures untilconvergence and obtain the approximate estimate ˆθ of the MLE. To be speciﬁc,we carry out the following two steps at the tth iteration.• Estimate the random eﬀects (˜a(t),˜b(t)) by solving equations (3.13),• Update the ﬁxed eﬀect θ(t+1) by solving equation (3.14).Laplace approximation method completely avoids integration, and thus is com-putationally attractive. The estimated random eﬀects obtained this way can be383.4. Linearization Methodinterpreted as the empirical Bayes estimates. The standard errors of the approx-imate estimates can be calculated based on the approximate formula for theirvariance-covariance matrix, which can be obtained in a similar way by eliminatingthe mean parameters and the random eﬀects using a similar Laplace approxima-tion. Speciﬁcally,Cov(ˆθ) = −[∂2ℓo(θ|˜a ,˜b )∂θ∂θT] 1 =ˆ (3.15)where ˆθ are the estimated ﬁxed eﬀects, and ˜a ,˜b are the estimated randomeﬀects at the ﬁnal iteration.Vonesh et al. (2002) showed that the approximate estimate ˆθ is consistent andasymptotically normally distributed when both the sample size and the numberof measurements within each individual go to inﬁnity(ˆθ−θ) = Op[max{(n) 12,(minni) 1}], (3.16)and√n(ˆθ−θ) →N(0, ¯I(θ) 1) (3.17)where ¯I(θ) = limn∑iIi(θ)/n and Ii(θ) is the information matrix of subject i.3.4 Linearization MethodInspired by a popular approximate method for solving NLME model, we propose alinearization method for our joint models. Following the similar idea of Lindstromand Bates (1990), we ﬁrstly linearize the NLME (3.1). Given the estimates ofparameters ˜α,˜β,˜ai,˜bi, we have the ﬁrst-order Taylor expansion ofgij = g(tij,βi),which is a LME model˜yi = Wiα+Xiβ+Uiai +Vibi +ei, (3.18)393.4. Linearization MethodwhereWi = (wi1,··· ,wini)T, wij = ∂gij/∂α,Xi = (xi1,··· ,xini)T, xij = ∂gij/∂β,Ui = (ui1,··· ,uini)T, uij = ∂gij/∂ai,Vi = (vi1,··· ,vini)T, vij = ∂gij/∂bi,˜yi = yi −gi(˜α,˜β,˜ai,˜bi)+Wi˜α+Xi˜β+Ui˜ai +Vi˜bi,with all the partial derivatives being evaluated at (˜α,˜β,˜ai,˜bi). Now we have twoLMEs and a time-to-event model, and we adopt the strategy of EM algorithm toestimate the parameters.To calculate (3.8), we need to estimate the random eﬀects ai,bi at ﬁrst. Thegeneral EM approach samples ai,bi from the conditional distribution of the ran-dom eﬀectsf(ai,bi|yi,zi,ρi,δi;θ(t)) by iteratively updating their full conditionals:f(ai|yi,zi,ρi,δi,bi;θ(t))∝ f(ai;θ(t))f(yi|ai,bi;θ(t))f(zi|ai;θ(t))f (ρi,δi|ai,bi;θ(t)),f(bi|yi,zi,ρi,δi,ai;θ(t))∝ f(bi;θ(t))f(yi|ai,bi;θ(t))f (ρi,δi|ai,bi;θ(t)),where f (ρi,δi|ai,bi;θ(t)) = f(ρi|ai,bi;θ(t))δi(1−F(ρi|ai,bi;θ(t)))1 δi.403.4. Linearization MethodSince we havef(ai;θ(t)) ∝ | ait| ni2 exp[−12(ai −νait)T 1ait(ai −νait)]∝ | ait| ni2 exp[−12aTi 1aitai +νTait 1aitai − 12νTait 1aitνait],f(bi;θ(t)) ∝ | bit| ni2 exp[−12(bi −νbit)T 1bit(bi −νbit)]∝ | bit| ni2 exp[−12bTi 1bitbi +νTbit 1bitbi − 12νTbit 1bitνbit],f(zi|ai,θ(t)) ∝ λ niit exp[− 12λ2it(zi −ψTi αt −ϕTi ai)T(zi −ψTi αt −ϕTi ai)]∝ λ niit exp[− 12λ2itaTi ϕiϕTi ai + 1λ2it(zi −ψTi αt)TϕTi ai− 12λ2it(zi −ψTi αt)T(zi −ψTi αt)],f(yi|ai,bi,θ(t)) ∝ µ niit exp[− 12µ2it(˜yit −Witαt −Xitβt −Uitai −Vitbi)T×(˜yit −Witαt −Xitβt −Uitai −Vitbi)]∝ µ niit exp[− 12µ2itaTi UTitUitai − 12µ2itbTi VTit Vitbi+ 1µ2it(˜yit −Witαt −Xitβt −Vitbi)TUitai+ 1µ2it(˜yit −Witαt −Xitβt −Uitai)TVitbi− 12µ2it(˜yit −Witαt −Xitβt)T(˜yit −Witαt −Xitβt)],The likelihood function given the event time data is linearized as follows:f (ρi,δi|ai,bi;θ(t)) ∼=∝ exp[− 12σ2taTi (γ1tγT1t)ai + 1σ2t(logTi −γ0t +δi)γT1tai− 12σ2tbTi (γ2tγT2t)bi + 1σ2t(logTi −γ0t +δi)γT2tbi− 12σ2t(logTi −γ0t +δi)2].Then we may obtain the approximate explicit form of the distribution of the ran-413.4. Linearization Methoddom eﬀects, which follows a multivariate normal distribution. Therefore, the fullconditionals of the random eﬀects can be approximated by the following multi-variate normal distributions:f(ai|yi,zi,ρi,δi,bi;θ(t))∝ f(ai;θ(t))f(yi|ai,bi;θ(t))f(zi|ai;θ(t))f (ρi,δi|ai,bi;θ(t))∝ exp[−12(ai −νai(t+1))T 1ai(t+1)(ai −νai(t+1))]∼ MVN(νai(t+1), ai(t+1)), (3.19)f(bi|yi,zi,ρi,δi,ai;θ(t))∝ f(bi;θ(t))f(yi|ai,bi;θ(t))f (ρi,δi|ai,bi;θ(t))∝ exp[−12(bi −νbi(t+1))T 1bi(t+1)(bi −νbi(t+1))]∼ MVN(νbi(t+1), bi(t+1)), (3.20)where 1ai(t+1) = 1ait + ϕiϕTiλ2it +UTitUitµ2it +γ1tγT1tσ2t ,νTai(t+1) 1ai(t+1) = νTait 1ait + 1λ2it(zi −ψTi αt)TϕTi + 1σ2t(logTi −γ0t +δi)γT1t+ 1µ2it(˜yit −Witαt −Xitβt −Vitνbit)TUit, 1bi(t+1) = 1bit + VTit Vitµ2it +γ2tγT2tσ2t ,νTbi(t+1) 1bi(t+1) = νTbit 1bit + 1σ2t(logTi −γ0t +δi)γT2t+ 1µ2it(˜yit −Witαt −Xitβt −Uitνait)TVit.Thus, we may have an estimate of the distribution of the random eﬀects ai andbi at the (t+1)th iteration.423.4. Linearization Method3.4.1 Framework IWith the estimated distributions of ai and bi, we may return to Section 3.2.1and work on the log-likelihood function (3.8). If the random eﬀects ai and bican be integrated out, we may simply adopt the EM framework to complete thelinearization method. To be speciﬁc, at the tth iteration, we need to:• Linearize the joint model at the current steps ˜ai(t), ˜bi(t) and ˆθ(t);• Estimate the random eﬀects (˜ai(t+1), ˜bi(t+1)) as (3.19);• Obtain the expectation of the joint log-likelihood function;• Update θ(t+1) by maximizing the expectation of the observed data jointlog-likelihood.We repeat the above steps until the algorithm converges. The standard errors ofthe MLE of θ can be calculated based on the asymptotic observed informationmatrix (3.10).3.4.2 Framework IIIf we can not obtain an explicit form of the integral (3.8), we may simply considerthe means of the estimated distributions of the random eﬀects ai and bi, νai(t+1)and νbi(t+1), as their empirical Bayesian estimates. Then the ﬁxed eﬀects can beupdated in the same way as Laplace approximation method. That is to say, wecarry out the following steps at the tth iteration.• Linearize the joint model at current iterate points ˜ai(t), ˜bi(t) and ˆθ(t);• Estimate the random eﬀects (˜ai(t+1), ˜bi(t+1)) by (3.19),• Update θ(t+1) by solving equations (3.14) with νai(t+1) and νbi(t+1) beingestimates of ˜ai(t) and ˜bi(t).433.4. Linearization MethodSimilar to Laplace approximation method, the standard errors of the approximateMLE of θ can be approximated by (3.15).44Chapter 4Data Analysis4.1 Data DescriptionWe consider an AIDS clinical trial, which included 46 HIV infected patients beingtreated with an anti-HIV treatment. The viral loads were measured on days 0, 2,7, 10, 14, 21, 28 and weeks 8, 12, 24, and 48 after initiation of the treatment. TheCD4 and CD8 cell counts and other variables were also recorded throughout thestudy. The lower bound of the detectable limit of the viral load is 100 copies/ml,and 40 out of all 361 viral load measurements (11%) are below the detectable limit.For simplicity, we imputed the viral loads below the assay’s lower detection limit100 copies/ml with a half of the limit. The number of viral load measurementsfor each individual varies from 4 to 10. Figure 4.1 shows the trajectories of viralload, the CD4 and the CD4/CD8 of six randomly selected patients.One of the primary objectives of this study is to assess the antiviral activity ofthe treatment in the ﬁrst 12 weeks. Meanwhile, the association between the viralload and the time to the ﬁrst decline of the ratio of CD4 and CD8 is of interest.454.2. Model Speciﬁcation4.2 Model SpeciﬁcationBased on previous researches (Wu and Ding, 1999; Wu, 2002; Wu et al., 2010),we consider the following HIV viral load dynamic modelyij = log10(P1ie λ1itij +P2ie λ2ijtij)+eij (4.1)log(P1i) = β1 +bi1, λ1i = β2 +bi2,log(P2i) = β3 +bi3, λ2ij = β4 +β5z ij +bi4.where yij is the log10-transformation of the viral load of patient i at time tij, P1iand P2i are subject-speciﬁc viral load baselines, λ1i and λ2ij subject-speciﬁc ﬁrst-phase and second-phase viral load decay rates, z ij is the “true” (but unobserved)CD4 count. It is known that CD4 counts are usually measured with substantialerrors. Thus we assume that λ2ij is related to the true CD4 value z ij rather thanthe observed CD4 value zij. To avoid very small (large) estimates, which may beunstable, we standardize the CD4 counts and rescale the original time t (in days)so that the new time scale is between 0 and 1.Figure 4.1: Longitudinal Data Trajectories0 20 40 60 80123456TimeViral Load0 20 40 60 800100200300400500TimeCD420 40 60 800.10.20.30.40.50.60.7TimeCD4/CD8As we can see from Figure 4.1, the CD4 trajectories are complicated and thebetween-patient variations in CD4 trajectories are quite large, thus we consider a464.2. Model Speciﬁcationnonparametric linear mixed eﬀects model to empirically describe the CD4 trajec-tories. We use linear combinations of natural cubic splines with percentile-basedknots to approximate r(t) and hi(t). We set ψ0(t) = ϕ0(t) = 1 and take the samenatural cubic splines in the approximations with q ≤ p in order to decrease thedimension of the random eﬀects. AIC and BIC criteria are used to determine thatp = q = 2, which leads to the following model for the CD4 process.zij = (α1 +ai1)ϕ0(tij)+(α2 +ai2)ϕ1(tij)+(α3 +ai3)ϕ2(tij)+ξij. (4.2)where zij is the observed CD4 value at time tij, ϕ1(·) and ϕ2(·) are the basisfunctions, α= (α1,α2,α3)T are the ﬁxed parameters, and ai = (ai1,ai2,ai3)T arethe random eﬀects.Let Ti be the time to the ﬁrst decline of CD4/CD8 ratio of subject i. In AIDSFigure 4.2: Estimate of Survivor Function0.0 0.2 0.4 0.6 0.8 1.00.00.20.40.60.81.0Time of First Decline of CD4:CD8Survivor Functionstudies, it is of great interest to see if the time of the ﬁrst decline of CD4/CD8 isrelated to the subject characteristics of the viral load and the CD4 trajectories,474.3. Analysis and Resultsrepresented by the random eﬀects in the viral load model and the CD4 model.Therefore, we consider the following time-to-event model for TilogTi = γ0 +γ1ai1 +γ2ai2 +γ3bi2 +γ4bi4 +σϵi. (4.3)where ϵi follows standard extreme value distribution. The random eﬀects bi2and bi4, which represent between-subject variations of viral decay rates, might bepredictive for event times. The random eﬀects ai1 and ai2, which bear the mainfeatures of between-subject variations of CD4 trajectories, are also included inthe parametric AFT model (4.3).4.3 Analysis and ResultsWe estimate the models’ parameters using two-step method (TS), modiﬁed two-step method (MTS), Laplace approximation method (LAP) and linearizationmethod(LIN).TSandMTSarebasedonstandard R algorithmsin library(nlme)and library(survival). The estimation results of TS and MTS are listed alongwith those of LAP and LIN in Table 4.1. Figure 4.3 and 4.4 check the normalityassumptions of the residual and random eﬀects of the NLME model. Figure 4.5and 4.6 check the normality assumptions of the residual and random eﬀects of themeasurement error model. It seems that no assumption is severely violated.LAP and LIN are iterative algorithms with convergence being achieved whenthe maximum percentage change of all estimates is less than 5% in two consecutiveiterations. The parameters’ estimates of TS method are taken as the initial valuesfor two iterative algorithms LAP and LIN.Based on the analysis results listed in Table 4.1, we have the following obser-vations.• Four inference methods provide similar parameter estimates for the NLME484.3. Analysis and ResultsFigure 4.3: Residuals Plots I of NLME ModelFitted valuesStandardized residuals−20242 3 4 5 6 Standardized residualsQuantiles of standard normal−3−2−10123−2 0 2 4Figure 4.4: Residuals Plots II of NLME ModelRandom effectsQuantiles of standard normal−2−1012−3 −2 −1 0 1 211960250405610282610330b1−5 0 511960b2−4 −2 0 211960250405250708b3−5 0−2−1012250736250911271773610500b4.(Intercept)model of the viral load data. LAP and LIN agree that all parameters of βare signiﬁcant to the viral load, while TS and MTS suggest that β5 mightbe insigniﬁcant.494.3. Analysis and ResultsFigure 4.5: Residuals Plots I of Covariate Measurement Errors ModelFitted valuesStandardized residuals−3−2−1012−4 −3 −2 −1 0 1 Standardized residualsQuantiles of standard normal−3−2−10123−3 −2 −1 0 1 2Figure 4.6: Residuals Plots II of Covariate Measurement Errors ModelRandom effectsQuantiles of standard normal−2−1012−3 −2 −1 0 1250009250723250867610429(Intercept)−1 0 1 2 3 4250009 271764610429z1−0.4−0.20.0 0.2 0.4 0.6−2−1012250009250867271764610429z2• The estimation results of the time-to-event process obtained from four meth-ods diﬀer from each others. Generally speaking, TS and MTS suggest thatnone random eﬀect is signiﬁcant to the event time, while two simultaneous504.3. Analysis and ResultsTable 4.1: Estimation ResultsParameter TS MTS LAP LINEst S.E. Est S.E. Est S.E. Est S.E.β1 11.75 0.19 11.76 0.19 11.78 0.09 11.80 0.06β2 29.84 1.47 30.22 1.69 29.25 1.31 30.03 0.87β3 7.48 0.29 7.43 0.32 7.22 0.15 7.40 0.10β4 1.34 0.51 1.31 0.64 1.43 0.23 1.18 0.15β5 0.63 0.46 0.52 0.70 0.42 0.16 0.55 0.10µ 0.29 NA 0.29 NA 0.29 NA 0.22 NAγ0 -0.81 0.12 -0.82 0.01 -0.97 0.11 -0.71 0.20γ1 0.19 0.24 0.16 0.52 0.55 0.37 0.22 0.19γ2 -0.03 0.21 -0.01 0.52 -1.03 0.52 0.01 0.13γ3 -0.03 0.05 -0.02 0.11 -0.05 0.02 -0.03 0.04γ4 0.08 0.05 0.04 0.10 0.12 0.03 0.05 0.02σ 0.75 NA 0.76 NA 0.31 NA 0.75 NAapproaches indicate that bi4 might be signiﬁcant. This discovery distin-guished the simultaneous approaches from the TS based methods.• MTS provides very close parameter estimates as TS for both the NLMEmodel and the measurement errors model, but with greater standard errors.However, by inferring those three models simultaneously, LAP and LINprovided parameter estimates with smaller standard errors.Based on this data analysis result, we may conclude that:• For the analysis of the longitudinal process based on NLME model, the re-sults obtained from two simultaneous inference methods agree with previousresearches (Ding and Wu, 2001; Wu et al., 2008, 2010) in terms of the sig-niﬁcance conclusion of the parameters. However, some estimated values ofour analysis are diﬀerent from previous researches, since we only consideredthe observations in the ﬁrst three months in our study.• Two simultaneous inference methods suggest that CD4 is signiﬁcant to the514.3. Analysis and Resultsviral load, while TS and MTS miss this reasonable and important property.According to the conclusions from LAP and LIN, the higher the CD4 is, thelower viral load the subject might have.• Two simultaneous inference methods also indicate that the viral decay rateis signiﬁcant to the ﬁrst decline time of the CD4/CD8 ratio. The lower theviral load is, the later the CD4/CD8 may ﬁrstly decline. TS based methodsfail again in detecting this important feature.The discrepancies between these data analysis results obtained from diﬀerentmethods are important in terms of both biological interpretation and statisticalmethodology development. To study if the performance of the simultaneous in-ference methods exceed the TS based methods where longitudinal process andtime-to-event process are truly connected, we need a simulation study to comparethese methods.52Chapter 5SimulationA simulation study is conducted to evaluate the performance of the three methods:two-step method (TS), Laplace approximation method (LAP) and linearizationmethod (LIN). Modiﬁed two-step method (MTS) is omitted since MTS performsquite similar to TS only expect that MTS generally yields in greater variances ofthe estimates.5.1 Simulation DesignWe simulate the longitudinal data and covariate data based on model (4.1) and(4.2). The event times are simulated based on (4.3) with 20% randomly censoredsubject. While having the parameterβ,αandγ ﬁxed, we compare these inferencemethods based on their estimation performances in four scenarios with diﬀerentsettings of• Number of subjects,• Numbers of repeated measurements,• Between-subject variance,• Within-subject variance.ThesimulationswereimplementedonDell R900serverwith16 CPUsand 16-32GbRAM. The estimation performance includes the standard errors of the parameters’estimates and the corresponding bias and MSE, along with the average running535.2. Simulation Resultstime. Since we are more interested in the NLME model and the time-to-eventmodel estimation, the estimation results of the covariate model are omitted in thefollowing section.5.2 Simulation Results5.2.1 Number of SubjectsTo examine how these estimation methods perform with diﬀerent numbers ofsubjects, we simulate datasets with n = 60 and n = 120. The estimation resultsbased on n = 60 are reported in Table 5.1. The estimation results based onn = 120 are reported in Table 5.2.Table 5.1: Simulation Results: n = 60Parameter TS LAP LINBias SE MSE Bias SE MSE Bias SE MSEβ1 = 12 0.02 0.18 0.03 -0.04 0.20 0.04 -0.09 0.06 0.01β2 = 30 -1.13 2.16 5.95 -1.39 1.69 4.79 -0.93 1.25 2.41β3 = 7 -0.02 0.20 0.04 0.12 0.23 0.07 -0.14 0.17 0.05β4 = −2 -0.05 0.26 0.07 0.15 0.37 0.16 -0.01 0.23 0.05β5 = 3 -0.00 0.13 0.02 -0.30 0.21 0.13 -0.48 0.08 0.23µ = 0.3 -0.00 0.01 0.00 0.12 0.04 0.02 -0.01 0.02 0.00γ0 = −1 2.19 0.40 4.95 0.20 0.38 0.19 0.15 0.10 0.03γ1 = 1 0.01 0.43 0.18 -0.80 1.53 2.97 -0.91 0.18 0.87γ2 = 1 -0.01 0.48 0.23 0.52 1.10 1.47 -0.88 0.24 0.84γ3 = 1 -1.00 0.28 1.08 0.14 0.36 0.15 -0.07 0.21 0.05γ4 = 1 -0.02 0.40 0.16 -0.08 0.36 0.14 -0.10 0.21 0.05σ = 0.2 2.19 0.30 4.88 0.06 0.23 0.06 0.08 0.03 0.01Time (Sec) 680.42 180.47 13.39For Table 5.1, we have the following observations.• Three methods provided generally decent estimations forβ, except that theestimate of β5 provided by LIN seems to be biased.545.2. Simulation Results• Though TS estimated ˆγ1,ˆγ2 and ˆγ4 with less biases, its performance in theestimation of γ0,γ3 and σ is poor. Comparatively, LAP and LIN are morelikely to provide better estimation of γ, at least for γ0.The results in Table 5.1 are regarded as the baseline for the following compar-isons.Table 5.2: Simulation Results: n = 120Parameter TS LAP LINBias SE MSE Bias SE MSE Bias SE MSEβ1 = 12 0.05 0.14 0.02 0.00 0.11 0.01 -0.10 0.04 0.01β2 = 30 -0.94 1.55 3.28 -1.14 0.96 2.20 -1.09 0.92 2.03β3 = 7 -0.01 0.18 0.03 0.14 0.14 0.04 -0.12 0.16 0.04β4 = −2 -0.04 0.19 0.04 0.11 0.26 0.08 0.00 0.22 0.05β5 = 3 0.00 0.11 0.01 -0.19 0.13 0.05 -0.36 0.07 0.13µ = 0.3 0.00 0.01 0.00 0.09 0.03 0.01 -0.00 0.01 0.00γ0 = −1 2.01 0.29 4.13 -0.07 0.06 0.01 -0.05 0.09 0.01γ1 = 1 0.04 0.37 0.14 0.00 0.14 0.02 -0.01 0.19 0.04γ2 = 1 0.00 0.42 0.18 -0.00 0.11 0.01 -0.06 0.25 0.07γ3 = 1 -0.97 0.31 1.03 0.03 0.04 0.00 -0.15 0.24 0.08γ4 = 1 -0.01 0.30 0.09 -0.01 0.04 0.00 -0.15 0.23 0.08σ = 0.2 2.26 0.25 5.18 -0.06 0.15 0.03 0.02 0.03 0.00Time (Sec) 1146.46 360.46 28.83Similarly, the observations for Table 5.2, which are based on more subjects,are listed below.• Overall, if more subjects are available, the estimation results of three meth-ods can be eﬀectively improved in terms of smaller SE, bias and MSE.• The estimation result oﬀered by LAP is overall better than the results fromtwo methods.• LIN gave satisfactory estimation for β and γ, except for β5. This methodmight be even more attractive for its extraordinarily fast speed.555.2. Simulation Results• TS still provided severely biased estimates for γ0,γ3 and σ.5.2.2 Numbers of Repeated MeasurementsTo examine how these estimation methods perform with diﬀerent numbers ofrepeated measurements, we simulate datasets with each subject being repeatedlymeasured for 6 times and 12 times. Table 5.1 contains the simulation results basedon ni = 6. Here we just list the estimation results of ni = 12 in Table 5.3.Table 5.3: Simulation Results: ni = 12Parameter TS LAP LINBias SE MSE Bias SE MSE Bias SE MSEβ1 = 12 0.02 0.20 0.04 -0.02 0.15 0.02 -0.15 0.06 0.03β2 = 30 -0.52 1.38 2.18 -1.76 1.07 4.24 -0.94 1.23 2.39β3 = 7 -0.01 0.19 0.04 0.18 0.19 0.07 -0.08 0.13 0.02β4 = −2 0.01 0.24 0.06 0.53 0.33 0.38 0.05 0.17 0.03β5 = 3 -0.00 0.10 0.01 -0.45 0.22 0.25 -0.28 0.07 0.08µ = 0.3 -0.00 0.01 0.00 0.09 0.03 0.01 0.01 0.01 0.00γ0 = −1 2.16 0.43 4.87 -0.05 0.20 0.04 -0.06 0.12 0.02γ1 = 1 0.00 0.47 0.22 0.16 0.58 0.36 -0.96 0.19 0.95γ2 = 1 0.03 0.47 0.22 -0.16 0.53 0.31 -0.94 0.23 0.93γ3 = 1 -0.85 0.46 0.93 0.12 0.21 0.06 -0.05 0.21 0.05γ4 = 1 0.04 0.45 0.20 0.01 0.16 0.03 -0.09 0.21 0.05σ = 0.2 2.20 0.31 4.92 -0.02 0.22 0.05 0.08 0.05 0.01Time (Sec) 183.73 187.58 15.60The observations for the results based on more measurements, Table 5.3, are• Overall, it might be hard to reduce the bias or SE of the estimates by havingmore measurements for each subject.• Speciﬁcally, LAP led the performance of estimating β and γ, though thebias of the estimates are greater than Table 5.2.• LIN also provided good estimation for β and γ except for γ1 and γ2, whichare associated with the random eﬀects of the measurement error model.565.2. Simulation Results• With more repeated measurements ensured, the computation time of TSis greatly reduce. However, TS still provided severely biased estimates forγ0,γ3 and σ.5.2.3 Between-Subject VarianceToexaminehowtheseestimationmethodsperformwithdiﬀerentvariance-covariancematrices, we simulate datasets with diﬀerent settings of A and B follows:A1 =1 0 00 1 00 0 1, B1 =2 0 0 00 4 0 00 0 2 00 0 0 3,andA2 =2 0 00 2 00 0 2B2 =4 0 0 00 8 0 00 0 4 00 0 0 6.Table 5.2 is based on A1 and B1. Table 5.4 is based on A2 and B2.For Table 5.4,• Surprisingly, the estimation result that LAP oﬀered in the greater between-subject variance scenario is better than the less between-subject variancescenario (Table 5.1). It is almost as good as the more sample size scenario(Table 5.2). However, LIN is not that lucky, at least in terms of the bias ofestimates of β5 and γ2.• The SE of estimates are reasonably greater than previous setting of A1 andB1.575.2. Simulation ResultsTable 5.4: Simulation Results: A2,B2Parameter TS LAP LINBias SE MSE Bias SE MSE Bias SE MSEβ1 = 12 0.10 0.25 0.07 -0.03 0.16 0.03 -0.15 0.07 0.03β2 = 30 -0.87 2.18 5.52 -1.17 0.93 2.23 -0.32 2.14 4.69β3 = 7 -0.07 0.29 0.09 0.10 0.23 0.06 -0.24 0.23 0.11β4 = −2 -0.02 0.34 0.11 0.15 0.39 0.18 -0.12 0.31 0.11β5 = 3 0.00 0.11 0.01 -0.17 0.10 0.04 -0.56 0.10 0.32µ = 0.3 -0.00 0.01 0.00 0.14 0.04 0.02 0.03 0.05 0.00γ0 = −1 2.61 0.51 7.10 -0.02 0.08 0.01 -0.14 0.37 0.15γ1 = 1 0.06 0.36 0.14 -0.05 0.25 0.06 -0.03 0.32 0.10γ2 = 1 -0.01 0.38 0.15 0.04 0.21 0.05 -0.22 0.41 0.22γ3 = 1 -0.91 0.34 0.94 0.01 0.02 0.00 -0.09 0.36 0.13γ4 = 1 0.02 0.31 0.09 0.00 0.03 0.00 -0.09 0.35 0.13σ = 0.2 2.84 0.37 8.19 -0.02 0.18 0.03 0.80 5.52 31.07Time (Sec) 908.11 172.65 14.22• TS still provided severely biased estimates for γ0,γ3 and σ, while takingmuch longer time on computation.5.2.4 Within-Subject VarianceTo examine how these estimation methods perform with diﬀerent within-subjectvariance, we simulate datasets with diﬀerent settings of µ,λ and σ as follows:µ1 = 0.3,λ1 = 0.2,σ1 = 0.2, and µ2 = 0.6,λ2 = 0.5,σ1 = 0.4.Table 5.1 is based on µ1,λ1 and σ1. Table 5.5 is based on µ2,λ2 and σ2.• When within-subject variances become greater, the estimation results be-come worse in terms of both SE and bias.• No method provides satisfactory estimation for the simulated time-to-eventdata with greater within-subject variance, indicating both the TS based585.2. Simulation ResultsTable 5.5: Simulation Results: µ2,λ2,σ2Parameter TS LAP LINBias SE MSE Bias SE MSE Bias SE MSEβ1 = 12 0.05 0.20 0.04 0.04 0.23 0.06 -0.05 0.11 0.01β2 = 30 -2.68 2.43 13.09 -1.97 2.36 9.43 -1.31 2.15 6.31β3 = 7 -0.12 0.24 0.07 0.25 0.28 0.14 -0.19 0.34 0.16β4 = −2 -0.08 0.31 0.10 0.69 0.47 0.70 -0.11 0.51 0.28β5 = 3 0.02 0.16 0.03 -0.22 0.22 0.10 -1.20 0.15 1.47µ = 0.6 0.20 0.02 0.04 0.35 0.08 0.13 0.14 0.07 0.02γ0 = −1 2.28 0.41 5.38 0.36 0.85 0.86 -0.28 0.35 0.20γ1 = 1 0.01 0.52 0.27 -2.29 4.54 25.84 -0.97 0.23 1.00γ2 = 1 0.05 0.51 0.26 1.83 4.35 22.30 -0.94 0.21 0.92γ3 = 1 -0.99 0.26 1.05 1.78 3.05 12.47 -0.10 0.22 0.06γ4 = 1 0.05 0.67 0.46 -0.11 1.49 2.24 -0.10 0.23 0.06σ = 0.5 2.25 0.32 5.15 0.92 1.16 2.21 0.34 0.13 0.13Time (Sec) 1644.74 299.28 16.48method and the approximate methods might yield in greatly biased esti-mation when variance is too high. In this situation, some “exact” method,such as EM algorithm, might be necessary.In general, we may conclude that• TS is risky for unbiased estimation of joint modeling of longitudinal pro-cess and time-to-event process, especially when strong association exists inbetween.• Approximate simultaneous inference methods are promising to provide sat-isfactory estimation for both longitudinal processes and time-to-event pro-cesses when the measurement errors problem or the model mis-speciﬁcationproblem is tolerable.• LAP generally performs more stably than LIN. In fact, when more subjectsare given, or the between-subject variance falls in an appropriate range,LAP is likely to provide quite reliable estimation.595.2. Simulation Results• LIN can be considered as a valuable alternative of LAP when more subjectsare involved, especially considering its fast speed. However, this algorithmrequires further study before being widely promoted, since its estimate forsome parameters, for example β5, might be biased.60Chapter 6Conclusion and DiscussionThe computation associated with the joint modeling of longitudinal process andsurvival process can be extremely intensive and may lead to convergence prob-lems (Tsiatis and Davidian, 2004; Wu et al., 2008). The situation can be evensevere when several nonlinear and semiparametric/nonparametric models are in-volved in dealing with complex data, such as measurement errors or missing dataare present. Previous researches have mainly used EM algorithm or Laplace ap-proximation for making inference of several simultaneously developing processes.Besides the technical details of these two popular methods, we were more im-pressed by these two diﬀerent approaches of handling the sharing parameters ofthe longitudinal studies.In this thesis, following those two distinct computational frameworks, we pro-posed an approximate statistical inference method for jointly modeling longitu-dinal process and time-to-event process based on a NLME model and a para-metric AFT model. By linearizing the joint model, we designed a new strategyof updating the random eﬀects that connect two processes, and proposed twoapproaches for diﬀerent scenarios of likelihood function. Roughly speaking, oneapproach is widely applicable and easily implementable when the likelihood func-tion is comparatively complex. The other approach utilizes more information ofthe estimated random eﬀects when the likelihood function is comparatively sim-ple, and therefore is expected to be more eﬃcient. Both approaches approximatethe multidimensional integral in the observed-data joint likelihood by an analytic61Chapter 6. Conclusion and Discussionexpression, which greatly reduce the computational intensity of the complex jointmodeling problem.The proposed inference method, LIN, was applied to an HIV study along withthree other methods, TS, MTS and LAP. The inference results obtained fromdiﬀerent methods gave signiﬁcant diﬀerent conclusions towards the HIV study.Speciﬁcally,• Two simultaneous inference methods suggested that CD4 is signiﬁcant tothe viral load, which was missed by the TS based methods (TS and MTS).Based on the simultaneous inference results, the lower the CD4 is, the higherviral load the patient might have.• Two simultaneous inference methods also indicated that the viral decay rateis signiﬁcant to the ﬁrst decline time of the CD4/CD8 ratio. The lower theviral load is, the later the CD4/CD8 may ﬁrstly decline. TS based methodfailed again in detecting this important feature.We also compared these inference methods based on simulation results. Gen-erally speaking,• It is risky to use TS method for unbiased estimation of the joint modelingof longitudinal process and time-to-event process.• LAP is generally more reliable than the other two methods in providing avalid estimation of the joint models.• LIN might become a valuable alternative to LAP in terms of estimationresults and computation time, especially when more subjects are given.Linearization methods have achieved numerous successes for solving nonlin-ear problems in real world applications, according to which we believe that thelinearization strategy may also help us in jointly modeling longitudinal data and62Chapter 6. Conclusion and Discussiontime-to-event data. Our initial trial in this area seems promising and we believethat the following research directions might be interesting in the future.• Our proposed method is based on a model linearization and a likelihoodfunction approximation. The asymptotic property of the estimator is nat-urally of interest. That is to say, we need to search for some theoreticaljustiﬁcations for our approach.• The model that we adopted in our modeling for time-to-event process,Weibull model, is a speciﬁc choice for event time analysis. As we introducedin Section 1.1.3, Cox models and general AFT models are also extremelypopular in practice. Thus, one might be interested to generalize the lin-earization strategy to those situations where Cox models or general AFTmodels are involved in the joint modeling.• In many practical longitudinal studies, the response variables are recordedas various types of category. Thus, the NLME in our modeling might needto be replaced by generalized mixed eﬀects model (GLMM). Generalizingthe linearization strategy to joint modeling with GLMM may be valuableto practical application.• Without a doubt, the missing data problem is always one of the cores oflongitudinal studies. Our approach is based on likelihood, which determinesthat the missing data problems could be naturally incorporated into ourframework. However, as more statistical problems are taken into account,the joint modeling itself is getting more and more complex in terms of boththeoretical justiﬁcation and computation. These could be the focuses of ourfuture research.63BibliographyAlbert, P. S. and Shih, J. H. (2009). On estimating the relationship betweenlongitudinal measurements and time-to-event data using a simple two-stageprocedure. Biometrics, pages DOI 10.1111/j.1541–0420.2009.01324.x.Booth, J. G. and Hobert, J. P. (1999). Maximizing generalized linear mixed modellikelihoods with an automated Monte Carlo EM algorithm. Journal of the RoyalStatistical Society, Ser: B, 61:265 – 285.Brown, E. R. and Ibrahim, J. G. (2003). A bayesian semiparametric joint hierar-chical model for longitudinal and survival data. Biometrics, 59:221 – 228.Brown, E. R., Ibrahim, J. G., and DeGruttola, V. (2005). A ﬂexible b-splinemodel for multiple longitudinal biomarkers and survival. Biometrics, 61:64 –73.Carroll, R. J., Ruppert, D., and Stefanski, L. A. (2006). Measurement Error inNonlinear Models. Chapman & Hall/CRC, London, 2nd edition edition.Chambers, J. M. (1991). Statistical Models in S. Duxbury Press.Davidian, M. and Giltinan, D. M. (1995). Nonlinear Models for Repeated Mea-surements Data. Chapman & Hall/CRC, London.Davidian, M. and Giltinan, D. M. (2003). Nonlinear models for repeated mea-surements data: an overview and update. Journal of Agricultural, Biologicaland Environmental Statistics, 8:387 – 419.64Bibliographyde Boor, C. (1978). A Practical Guide to Splines. Springer-Verlag, New York.DeGruttola, V. and Tu, X. M. (1994). Modeling progression of cd-4 lymphocytecount and its relationship to survival time. Biometrics, 50:1003 – 1014.Dempster, A. P., Laird, N. M., and Rubin, D. B. (1977). Maximum likelihoodestimation from incomplete data via the em algorithm. Journal of the RoyalStatistical Society, Ser: B, 39:1 – 38.Diggle, P., Heagerty, P., Liang, K. Y., and Zeger, S. (2002). Analysis of Longitu-dinal Data. Oxford University Press, 2nd edition.Ding, A. and Wu, H. (2001). Assessing antiviral potency of anti-hiv therapies invivo by comparing viral decay rates in viral dynamic models. Biostatistics, 2:13– 29.Ding, J. M. and Wang, J. L. (2008). Modeling longitudinal data with nonparamet-ric multiplicative random eﬀects jointly with survival data. Biometrics, 64:546– 556.Fan, J. and Gijbels, I. (1996). Local Polynomial Modelling and Its Applications.Chapman & Hall/CRC.Faucett, C. J. and Thomas, D. C. (1996). Simultaneously modeling censoredsurvival data and repeatedly measured covariates: A gibbs sampling approach.Statistics in Medicine, 15:1663 – 1685.Fletcher, R. (1987). Practical Methods of Optimization. John Wiley & Sons, 2ndedition.Gelfand, A. E. and Smith, A. F. M. (1990). Sampling-based approaches to calcu-lating marginal desities. Journal of the American Statistical Association, 85:398– 409.65BibliographyHenderson, R., Diggle, P., and Dobson, A. (2000). Joint modelling of longitudinalmeasurements and event time data. Biostatistics, 1, 4:465 – 480.Hsieh, F. S., Tseng, Y. K., and Wang, J. L. (2006). Joint modeling of survival andlongitudinal data: likelihood approach revisited. Biometrics, 62:1037 – 1043.Huang, X. Z., Stefanski, L. A., and Davidian, M. (2009). Latent-model robustnessin joint models for a primary endpoint and a longitudinal process. Biometrics,65:719 – 727.Ibrahim, J. G., Chen, M. H., and Lipsitz, S. R. (2001a). Missing responses in gen-eralized linear mixed models when the missing data mechanism is nonignorable.Biometrika, 88:551 – 564.Ibrahim, J. G., Chen, M. H., and Sinha, D. (2001b). Bayesian Survival Analysis.Springer, New York.Laird, N. M. and Ware, J. H. (1982). Random-eﬀects models for longitudinaldata. Biometrics, 38:963 – 974.Lawless, J. F. (2003). Statistical Models and Methods for Lifetime Data. WileySeries in Probability and Statistics. Wiley Interscience, 2nd edition.Lee, Y., Nelder, J. A., and Pawitan, Y. (2006). Generalized Linear Models withRandom Eﬀects: Uniﬁed Analysis via H-likelihood. Chapman & Hall/CRC.Li, L., Hu, B., and Greene, T. (2009). A semiparametric joint model for longi-tudinal and survival data with application to hemodialysis study. Biometrics,65:737 – 745.Li, N., Elashoﬀ, R. M., Li, G., and Saverc, J. (2010). Joint modeling of longitu-dinal ordinal data and competing risks survival times and analysis of the nindsrt-pa stroke trial. Statistics in Medicine, 29:546 – 557.66BibliographyLindstrom, M. J. and Bates, D. M. (1988). Newton-raphson and em algorithms forlinear mixed-eﬀects models for repeated-measures data. Journal of the Ameri-can Statistical Association, 83:1014 – 1022.Lindstrom, M. J. and Bates, D. M. (1990). Nonlinear mixed eﬀects models forrepeated measures data. Biometrics, 46:673 – 687.Little, R. J. A. (1992). Regression with missing x’s: a review. Journal of theAmerican Statistical Association, 87:1227 – 1237.Little, R. J. A. (1995). Modeling the drop-out mechanism in repeated measuresstudies. Journal of the American Statistical Association, 90:1112 – 1121.Little, R. J. A. and Rubin, D. B. (2002). Statistical Analysis with Missing Values.Wiley, New York, 2nd edition.Liu, B. and Muller, H. G. (2009). Estimating derivatives for samples of sparselyobserved functions, with application to online auction dynamics. Journal of theAmerican Statistical Association, 104:704 – 717.Liu, L. (2009). Joint modeling longitudinal semi-continuous data and survival,with application to longitudinal medical cost data. Statistics in Medicine,28:972 – 986.Liu, W. and Wu, L. (2007). Simultaneous inference for semiparametric nonlinearmixed-eﬀects models with covariate measurement errors and missing responses.Biometrics, 63:342 – 350.Louis, T. A. (1982). Finding the observed information matrix when using the emalgorithm. Journal of the Royal Statistical Society, Series: B, 44:226 – 233.Nocedal, J. and Wright, S. J. (2006). Numerical Optimization. Springer, NewYork, 2nd edition.67BibliographyPinheiro, J. C. and Bates, D. M. (2002). Mixed-Eﬀects Models in S and SPLUS.Springer, New York.Ramsay, J. O. and Silverman, B. W. (2005). Functional Data Analysis. Springer,New York, 2nd edition.Ratcliﬀe, S. J., Guo, W. S., and Have, T. R. T. (2004). Joint modeling of longi-tudinal and survival data via a common frailty. Biometrics, 60:892 – 899.Rice, J. A. and Wu, C. O. (2001). Nonparametric mixed-eﬀects models for un-equally sampled noisy curves. Biometrics, 57:253 – 259.Rizopoulos, D., Verbeke, G., and Lesaﬀre, E. (2009). Fully exponential laplace ap-proximations for the joint modelling of survival and longitudinal data. Journalof the Royal Statistical Society, Series: B, 71:637 – 654.Rizopoulos, D., Verbeke, G., Lesaﬀre, E., and Vanrenterghem, Y. (2008). A two-part joint model for the analysis of survival and longitudinal binary data withexcess zeros. Biometrics, 64:611C619.Rizopoulos, D., Verbeke, G., and Molenberghs, G. (2010). Multiple-imputation-based residuals and diagnostic plots for joint models of longitudinal and survivaloutcomes. Biometrics, 66:20 – 729.Schafer, J. L. (1997). Analysis of Incomplete Multivariate Data. Chapman &Hall/CRC.Shah, A., Laird, N., and Schoenfeld, D. (1997). A random-eﬀects model formultiple characteristics with possibly missing data. Journal of the AmericanStatistical Association, 92:775 – 779.Song, X., Davidian, M., and Tsiatis, A. A. (2002a). An estimator for the pro-68Bibliographyportional hazards model with multiple longitudinal covariates measured witherror. Biostatistics, 3, 4:511 – 528.Song, X., Davidian, M., and Tsiatis, A. A. (2002b). A semiparametric likelihoodapproach to joint modeling of longitudinal and time-to-event data. Biometrics,58:742 – 753.Song, X. and Huang, Y. J. (2006). A corrected pseudo-score approach for additivehazards model with longitudinal covariates measured with error. Lifetime DataAnalysis, 12:97 – 110.Song, X. and Wang, C. Y. (2008). Semiparametric approaches for joint modelingof longitudinal and survival data with time-varying coeﬃcients. Biometrics,64:557 – 566.Tseng, Y. K., Hsieh, F. S., and Wang, J. L. (2005). Joint modelling of acceleratedfailure time and longitudinal data. Biometrika, 92:587 – 603.Tsiatis, A. A. and Davidian, M. (2004). An overview of joint modeling of longi-tudinal and time-to-event data. Statistica Sinica, 14:793 – 818.Verbeke, G. and Molenberghs, G. (2001). Linear Mixed Models for LongitudinalData. Springer, New York.Vonesh, E. F., Greene, T., and Schluchter, M. D. (2006). Shared parametermodels for the joint analysis of longitudinal data and event times. Statistics inMedicine, 25:143 – 163.Vonesh, E. F., Wang, H., Nie, L., and Majumdar, D. (2002). Conditional sec-ond order generalized estimating equations for generalized linear and nonlinearmixed-eﬀects models. Journal of the American Statistical Association, 97:271– 283.69BibliographyWang, Y. and Taylor, J. M. G. (2001). Jointly modeling longitudinal and eventtime data with application to acquired immunodeﬁciency syndrome. Journalof the American Statistical Association, 96:895 – 905.Wei, G. C. and Tanner, M. A. (1990). A monte-carlo implementation of the emalgorithm and the poor man’s data augmentation algorithm. Journal of theAmerican Statistical Association, 85:699 – 704.Wu, H. and Ding, A. (1999). Population hiv-1 dynamics in vivo: applicable modelsand inferential tools for virological data from aids clinical trials. Biometrics,55:410 – 418.Wu, H. and Zhang, J. (2006). Nonparametric Regression Methods for LongitudinalData Analysis: Mixed-Eﬀects Modeling Approaches. Wiley-Interscience.Wu, L. (2002). A joint model for nonlinear mixed-eﬀects models with censoringand covariates measured with error, with application to aids studies. Journalof the American Statistical Association, 97:955 – 964.Wu, L. (2009). Mixed Eﬀects Models for Complex Data. Chapman & Hall/CRC.Wu, L., Hu, X. J., and Wu, H. (2008). Joint inference for nonlinear mixed-eﬀectsmodels and time-to-event at the presence of missing data. Biostatistics, 9:308– 320.Wu, L., Liu, W., and Hu, X. J. (2010). Joint inference on hiv viral dynamics andimmune suppression in presence of measurement errors. Biometrics, 66:327 –335.Wu, M. C. and Carroll, R. J. (1988). Estimation and comparison of changes inthe presence of informative right censoring by modeling the censoring process.Biometrics, 44:175 – 188.70BibliographyWulfshon, M. S. and Tsiatis, A. A. (1997). A joint model for survival and longi-tudinal data measured with error. Biometrics, 53:300 – 309.Xu, J. and Zeger, S. L. (2001a). The evaluation of multiple surrogate endpoints.Biometrics, 57:81 – 87.Xu, J. and Zeger, S. L. (2001b). Joint analysis of longitudinal data comprisingrepeated measures and times to events. Applied Statistics, 50:375 – 387.Ye, W., Lin, X., and Taylor, J. M. G. (2008). Semiparametric modeling of longitu-dinal measurements and time-to-event data: A two stage regression calibrationapproach. Biometrics.Yu, M. G., Law, N. J., Taylor, J. M. G., and Sandler, H. M. (2004). Jointlongitudinal-survival-cure models and their application to prostate cancer. Sta-tistica Sinica, 14:835 – 862.Yu, M. G., Taylor, J. M. G., and Sandler, H. M. (2008). Individual prediction inprostate cancer studies using a joint longitudinal survivalccure model. Journalof the American Statistical Association, 103:178 – 187.Zeng, D. L. and Cai, J. W. (2005a). Asymptotic results for maximum likelihoodestimators in joint analysis of repeated measurements and survival time. Annalsof Statistics, 33:2132 – 2163.Zeng, D. L. and Cai, J. W. (2005b). Simultaneous modelling of survival and lon-gitudinal data with an application to repeated quality of life measures. LifetimeData Analysis, 11:151 – 174.Zhang, H. P., Ye, Y. Q., Diggle, P., and Shi, J. (2008). Joint modeling of timeseries measures and recurrent events and analysis of the eﬀects of air quality onrespiratory symptoms. Journal of the American Statistical Association, 103:48– 60.71BibliographyZhang, S., Muller, P., and Do, K. A. (2010). A bayesian semiparametric survivalmodel with longitudinal markers. Biometrics, 66:719 – 727.72
- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Theses and Dissertations /
- Approximate methods for joint models in longitudinal...
Open Collections
UBC Theses and Dissertations
Featured Collection
UBC Theses and Dissertations
Approximate methods for joint models in longitudinal studies Lu, Libo 2010-12-31
pdf
Page Metadata
Item Metadata
Title | Approximate methods for joint models in longitudinal studies |
Creator |
Lu, Libo |
Publisher | University of British Columbia |
Date | 2010 |
Date Issued | 2010-08-30T15:30:26Z |
Description | Longitudinal studies often contain several statistical issues, suchas longitudinal process and time-to-event process, the associationamong which requires joint modeling strategy. We firstly review the recent researches on the joint modeling topic. After that, four popular inference methods are introduced for jointly analyzing longitudinal data and time-to-event data based on a combination of typical parametric models. However, some of them may suffer from non-ignorable bias of the estimators. Others may be computationally intensive or even lead to convergence problems. In this thesis, we propose an approximate likelihood-based simultaneous inference method for jointly modeling longitudinal process and time-to-event process with covariate measurement errors problem. By linearizing the joint model, we design a strategy for updating the random effects that connect the two processes, and propose two algorithm frameworks for different scenarios of joint likelihood function. Both frameworks approximate the multidimensional integral in the observed-data joint likelihood by analytic expressions, which greatly reduce the computational intensity of the complex joint modeling problem. We apply this new method to a real dataset along with some available methods. The inference result provided by our new method agrees with those from other popular methods, and makes sensible biological interpretation. We also conduct a simulation study for comparing these methods. Our new method looks promising in terms of estimation precision, as well as computation efficiency, especially when more subjects are given. Conclusions and discussions for future research are listed in the end. |
Genre |
Thesis/Dissertation |
Type |
Text |
Language | eng |
Collection |
Electronic Theses and Dissertations (ETDs) 2008+ |
Date Available | 2010-08-30 |
Provider | Vancouver : University of British Columbia Library |
DOI | 10.14288/1.0071254 |
URI | http://hdl.handle.net/2429/27909 |
Degree |
Master of Science - MSc |
Program |
Statistics |
Affiliation |
Science, Faculty of Statistics, Department of |
Degree Grantor | University of British Columbia |
Graduation Date | 2010-11 |
Campus |
UBCV |
Scholarly Level | Graduate |
Aggregated Source Repository | DSpace |
Download
- Media
- [if-you-see-this-DO-NOT-CLICK]
- ubc_2010_fall_lu_libo.pdf [ 386.24kB ]
- Metadata
- JSON: 1.0071254.json
- JSON-LD: 1.0071254+ld.json
- RDF/XML (Pretty): 1.0071254.xml
- RDF/JSON: 1.0071254+rdf.json
- Turtle: 1.0071254+rdf-turtle.txt
- N-Triples: 1.0071254+rdf-ntriples.txt
- Original Record: 1.0071254 +original-record.json
- Full Text
- 1.0071254.txt
- Citation
- 1.0071254.ris
Full Text
Cite
Citation Scheme:
Usage Statistics
Country | Views | Downloads |
---|---|---|
United States | 19 | 8 |
China | 16 | 44 |
France | 5 | 0 |
Russia | 4 | 0 |
Canada | 3 | 1 |
Nigeria | 2 | 1 |
Colombia | 2 | 2 |
Germany | 1 | 0 |
Turkey | 1 | 0 |
Japan | 1 | 0 |
City | Views | Downloads |
---|---|---|
Unknown | 11 | 12 |
Shenzhen | 8 | 23 |
Greensboro | 5 | 0 |
Minneapolis | 5 | 0 |
Dalian | 4 | 4 |
Ashburn | 4 | 0 |
Beijing | 4 | 17 |
Vancouver | 3 | 1 |
Bogotá | 2 | 2 |
Sunnyvale | 2 | 0 |
San Jose | 1 | 0 |
Dachau | 1 | 0 |
Mountain View | 1 | 0 |
{[{ mDataHeader[type] }]} | {[{ month[type] }]} | {[{ tData[type] }]} |
Share
Embed
Customize your widget with the following options, then copy and paste the code below into the HTML
of your page to embed this item in your website.
<div id="ubcOpenCollectionsWidgetDisplay">
<script id="ubcOpenCollectionsWidget"
src="{[{embed.src}]}"
data-item="{[{embed.item}]}"
data-collection="{[{embed.collection}]}"
data-metadata="{[{embed.showMetadata}]}"
data-width="{[{embed.width}]}"
async >
</script>
</div>
Our image viewer uses the IIIF 2.0 standard.
To load this item in other compatible viewers, use this url:
http://iiif.library.ubc.ca/presentation/dsp.24.1-0071254/manifest