Linear Mixed E ects Models inFunctional Data AnalysisbyWei WangB.Sc., University of Science and Technology of China, 2001M.Sc., National University of Singapore, 2003A THESIS SUBMITTED IN PARTIAL FULFILMENT OFTHE REQUIREMENTS FOR THE DEGREE OFDoctor of PhilosophyinThe Faculty of Graduate Studies(Statistics)The University Of British ColumbiaJanuary, 2008c Wei Wang 2008AbstractRegression models with a scalar response and a functional predictor havebeen extensively studied. One approach is to approximate the functionalpredictor using basis function or eigenfunction expansions. In the expansion,the coe cient vector can either be xed or random. The random coe cientvector is also known as random e ects and thus the regression models arein a mixed e ects framework.The random e ects provide a model for the within individual covarianceof the observations. But it also introduces an additional parameter intothe model, the covariance matrix of the random e ects. This additionalparameter complicates the covariance matrix of the observations. Possibly,the covariance parameters of the model are not identi able.We study identi ability in normal linear mixed e ects models. We derivenecessary and su cient conditions of identi ability, particularly, conditionsof identi ability for the regression models with a scalar response and a func-tional predictor using random e ects.We study the regression model using the eigenfunction expansion ap-proach with random e ects. We assume the random e ects have a generalcovariance matrix and the observed values of the predictor are contaminatedwith measurement error. We propose methods of inference for the regressionmodel's functional coe cient.As an application of the model, we analyze a biological data set to in-vestigate the dependence of a mouse's wheel running distance on its bodymass trajectory.iiTable of ContentsAbstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iiTable of Contents . . . . . . . . . . . . . . . . . . . . . . . . . . . . iiiList of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viList of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viiAcknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . . xStatement of Co-Authorship . . . . . . . . . . . . . . . . . . . . . xi1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11.1 Summary of the biology experiment . . . . . . . . . . . . . . 11.2 Review of smoothing methods . . . . . . . . . . . . . . . . . 21.3 Functional regression models . . . . . . . . . . . . . . . . . . 51.4 Introduction to the thesis work . . . . . . . . . . . . . . . . . 9Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112 Identi ability discussion of linear mixed e ects models . 132.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . 132.2 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132.3 Simple su cient conditions of identi ability . . . . . . . . . 162.4 Su cient conditions of identi ability for a structured . . 172.5 Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25iiiTable of Contents3 Linear mixed models for measurement error in functionalregression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 263.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . 263.2 Notation and Preliminaries . . . . . . . . . . . . . . . . . . . 293.3 Parameter estimation . . . . . . . . . . . . . . . . . . . . . . 313.3.1 Initial estimates of parameters other than and 0 . 323.3.2 Updating (t)x . . . . . . . . . . . . . . . . . . . . . . 323.3.3 Updating 2(t) . . . . . . . . . . . . . . . . . . . . . . 343.3.4 Updating (t) and 2(t) . . . . . . . . . . . . . . . . . 363.4 Inference for . . . . . . . . . . . . . . . . . . . . . . . . . . 383.4.1 Hypothesis testing for . . . . . . . . . . . . . . . . 383.5 Model assumption checking . . . . . . . . . . . . . . . . . . . 433.6 Model application . . . . . . . . . . . . . . . . . . . . . . . . 443.6.1 Choosing the basis functions . . . . . . . . . . . . . . 443.6.2 Data description . . . . . . . . . . . . . . . . . . . . . 453.6.3 Choice of basis functions . . . . . . . . . . . . . . . . 453.6.4 Estimation and residual analysis . . . . . . . . . . . . 463.6.5 Inference for ( ) . . . . . . . . . . . . . . . . . . . . 473.6.6 Inference for s c . . . . . . . . . . . . . . . . . . 483.7 Simulation study . . . . . . . . . . . . . . . . . . . . . . . . . 483.7.1 Two stage estimate . . . . . . . . . . . . . . . . . . . 493.7.2 One sample comparison . . . . . . . . . . . . . . . . . 503.7.3 Two sample comparison . . . . . . . . . . . . . . . . . 513.7.4 Edge e ect discussion in one-sample MSE comparison 52Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 734 Dependence of average lifetime wheel-running on body massontogeny . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 754.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . 754.2 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 784.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 824.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85ivTable of ContentsBibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 955 Conclusions and future plan . . . . . . . . . . . . . . . . . . . 985.1 Models with correlated individuals . . . . . . . . . . . . . . . 985.2 Proposing models with a functional response . . . . . . . . . 1015.2.1 Models of a functional response . . . . . . . . . . . . 1015.2.2 Outline of the model study . . . . . . . . . . . . . . . 1035.2.3 Literature review . . . . . . . . . . . . . . . . . . . . 104Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106AppendicesA Appendix to Chapter 2 . . . . . . . . . . . . . . . . . . . . . . 107A.1 Proof of Corollary 2.4.1 . . . . . . . . . . . . . . . . . . . . . 107A.2 Proof of Corollary 2.4.2 . . . . . . . . . . . . . . . . . . . . . 108B Appendix to Chapter 3 . . . . . . . . . . . . . . . . . . . . . . 110B.1 De nition of the rst di erential . . . . . . . . . . . . . . . . 110B.2 De nition of the second di erential . . . . . . . . . . . . . . 113B.3 Matrix algebraic and di erential rules . . . . . . . . . . . . . 115B.4 Calculations in Section 3.3.2 . . . . . . . . . . . . . . . . . . 116B.5 Calculations in Section 3.3.4 . . . . . . . . . . . . . . . . . . 118vList of Tables3.1 P-values of the test Ho : (t) = 0; for all t 2 [ 1;60] in eachgroup. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 543.2 P-values of the test Ho : s(t) = c(t); for all t 2 [ 1;60],within each gender. . . . . . . . . . . . . . . . . . . . . . . . . 544.1 P-values of the test Ho : (t) = 0; for all t 2 [ 1;60], usingtest statistic Uf. . . . . . . . . . . . . . . . . . . . . . . . . . 874.2 P-values of the test Ho : s(t) = c(t); for all t 2 [ 1;60],using test statistic Uf. . . . . . . . . . . . . . . . . . . . . . . 87viList of Figures3.1 Plots of log body mass versus week for the four groups oflaboratory mice. . . . . . . . . . . . . . . . . . . . . . . . . . 553.2 Histogram of the response: averaged wheel running from weeks5 to 60 for the four groups of laboratory mice. . . . . . . . . 563.3 Plots of the proportion of cumulative variance of the centeredlog body mass explained by the rst ten principal componentsof the four groups of mice, after the individual average logbody mass has been removed. . . . . . . . . . . . . . . . . . 573.4 The constant function and the rst three smoothed eigenfunc-tions of the covariance of centered log body mass of the fourgroups of mice, calculated as described in Section 3.6.3. . . . 583.5 Residual plots of the t of the Yi = averaged wheel runningof the four groups of mice. . . . . . . . . . . . . . . . . . . . 593.6 Plots of residuals of Yi = averaged wheel runining in the malecontrol group of mice before and after removing the outlier. 603.7 Plots of ^ and its standard errors computed from the Hessianmatrix (solid) and from the bootstrap (dash-dot) for the fourgroups of mice, as described in Section 3.4. . . . . . . . . . . 613.8 Comparing the permuted values of the generalized likelihoodratio statistic Ul with the Wald statistic Uw for four groupsof mice. The equivalence of Ul and Uw were observed in Ta-ble 3.1. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 623.9 Plots of ^s ^c and the standard errors of the di erence com-puted from the Hessian matrix (solid) and from the bootstrap(dash-dot) for both genders. . . . . . . . . . . . . . . . . . . 63viiList of Figures3.10 MSE of the estimate of for each value in one samplesimulation as described in Section 3.7.2. Compare the ECMEmethod with the two stage method. . . . . . . . . . . . . . . 643.11 Proportion of times Ho is rejected using level = 0:01, whereHo : (t) = 0; for all t 2 [ 1;60] in one sample simulation asdescribed in Section 3.7.2. Two test statistics are considered,Uw and Uf. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 653.12 Proportion of times Ho is rejected using level = 0:05, whereHo : (t) = 0; for all t 2 [ 1;60] in one sample simulation asdescribed in Section 3.7.2. Two test statistics are considered,Uw and Uf. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 663.13 MSE of the estimate of s for each value in two samplesimulation as described in Section 3.7.3. Compare the ECMEmethod with the two stage method. . . . . . . . . . . . . . . 673.14 MSE of the estimate of c for each value in two samplesimulation as described in Section 3.7.3. Compare the ECMEmethod with the two stage method. . . . . . . . . . . . . . . 683.15 MSE of the estimate of s c for each value in two samplesimulation as described in Section 3.7.3. Compare the ECMEmethod with the two stage method. . . . . . . . . . . . . . . 693.16 Proportion of times Ho is rejected using level = 0:01, whereHo : s = c; for all t 2 [ 1;60] in two sample simulation asdescribed in Section 3.7.3. Four test statistics are consideredUl, Uw, Ue and Uf. . . . . . . . . . . . . . . . . . . . . . . . 703.17 Proportion of times Ho is rejected using level = 0:05, whereHo : s = c; for all t 2 [ 1;60] in two sample simulation asdescribed in Section 3.7.3. Four test statistics are consideredUl, Uw, Ue and Uf. . . . . . . . . . . . . . . . . . . . . . . . 713.18 MSE of the estimate of for each value using truncated logbody mass as described in Section 3.7.4. Compare the ECMEmethod with the two stage method. . . . . . . . . . . . . . . 72viiiList of Figures4.1 Plots of log body mass versus week for the four groups oflaboratory mice. . . . . . . . . . . . . . . . . . . . . . . . . . 884.2 Histogram of the response: averaged wheel running from weeks5 to 60 for the four groups of laboratory mice. . . . . . . . . 894.3 Plots of the proportion of cumulative variance of the centeredlog body mass explained by the rst ten principal componentsfor four groups of mice, after the individual average log bodymass has been removed. . . . . . . . . . . . . . . . . . . . . . 904.4 The constant function and the rst three smoothed eigen-functions of the covariance of centered log body mass of fourgroup. The four groups' functions are together in one panel.The eigenfunctions are calculated as described in Section 4.3. 914.5 Plots of ^ and its standard errors computed from the Hes-sian matrix (solid) and from the bootstrap (dash-dot) for fourgroups of mice. Standard errors are calculated as describedin Section 4.2. . . . . . . . . . . . . . . . . . . . . . . . . . . 924.6 Plots of the covariances between the average wheel runningand the log body mass trajectory at week j, j = 1;::: ;60,for four groups of mice. Solid lines are the covariance from themodel and dash-dot lines are the pointwise sample covariancesbetween average wheel running and log body mass. For thecalculation, refer to Heckmand and Wang (2007). . . . . . . 934.7 Plots of ^s ^c, the di erence between the selection ^ and thecontrol ^, and the standard errors of the di erence computedfrom the Hessian matrix (solid) and from the bootstrap (dash-dot) for both genders. . . . . . . . . . . . . . . . . . . . . . . 94ixAcknowledgementsFor the completion of this thesis, I would like very much to express myheartfelt gratitude to my supervisor Professor Nancy Heckman for all herinvaluable advice and guidance, endless patience, kindness and encourage-ment during the mentor period in the Department of Statistics of Universityof British Columbia. I have learned many things from her, particularly re-garding academic research and analytical writing. I truly appreciate all thetime and e ort she has spent in helping me to solve the problems encoun-tered even when she is in the midst of her work.I would also like to express my sincere gratitude and appreciation tomy committee members, namely Professors Gustafson, Paul and Wu, Lang(alphabetically) for their precious advice and help in my thesis.Special thanks to Professor Patrick Carter for providing us the biologicaldata and the biological background of the thesis work.Thank you to all faculty, sta and graduate students in the Departmentof Statistics, University of British Columbia, for making my stay here suchan enriching experience.xStatement of Co-AuthorshipThe thesis is nished under the supervision of my supervisor, ProfessorNancy Heckman.Chapter 2 is completed with the help of Professor Heckman.Chapter 3 is co-authored with Professor Nancy Heckman. My maincontributions are the derivation of the model parameter estimation in Sec-tion 3.3 and the model diagnostics in Section 3.5. I conduct the programmingwork of the data analysis and the simulation studies.Chapter 4 is co-authored with Professor Patrick Carter, School of Bio-logical Sciences, Washington State University. Professor Carter conductedthe biological experiment and provided us the data. Professor Carter alsogives a detailed description of the experiment and the biological background.We describes the statistical method and carry out the data analysis.xiChapter 1IntroductionThis thesis work is motivated by a biology experiment involving data onindividuals, data that can be considered as scalar responses depending onfunction-valued predictors. In the thesis, we provide a functional regressionapproach to study the biology problem of interest. In this introductorysection, we rst give a brief summary of the experiment and then a reviewof smoothing methods in functional data analysis. We give a literaturereview of the functional regression model followed by a description of ourapproach.1.1 Summary of the biology experimentProfessor Patrick Carter of the School of Biological Sciences of WashingtonState University conducted an experiment on house mice (Mus domesticus)selectively bred for increased voluntary wheel-running exercise. Details ofthe experiment are described in Morgan, Garland and Carter, 2003. Insummary, mice were divided into four groups according to gender and thetwo treatments \selection" and \control". The selection group mice werefrom lines bred with selection being on high wheel-running activity at ageeight weeks. Control mice were from lines bred at random. Body mass andwheel running activity were recorded for each mouse for sixty two consec-utive weeks, indexed from 1 to 60, except for weeks 34, 38, 39, 50. Theresearch interest is to know how body mass and wheel running are relatedand if the relationship depends on the treatment.Unfortunately, the wheel running distance data have many missing val-ues and are very noisy. In addition, the wheels were cleaned every fourweeks, which resulted in spikes in wheel-running activity every fourth week.1Chapter 1. IntroductionOne way to eliminate the problem is to average the wheel running distanceover weeks and use the averaged value in the analysis. After this averaging, ascalar is obtained. We treat the averaged wheel running as the response andthe body mass trajectory as the predictor. We can build a regression modelto study the dependence of the averaged wheel running on the body mass.To make use of the functional form of the body mass, we use a functionalregression model approach. More details of this model are in Section 1.3. Inthe next section, we give a review of smoothing methods which can be usedto study the body mass trajectories. Some of these methods will be used inSection 1.3.1.2 Review of smoothing methodsOne can view a body mass trajectory as a process which can be describedas a continuous function of time. Though the measurements of the trajec-tory are made at a nite discrete set of points, the trajectory lies in anin nite dimensional parameter space. This functional type of data is thestudy object of a statistics branch, namely functional data analysis. For areference, please see Ramsay and Silverman (2005). In this section, we givea brief summary of some popular smoothing methods to study functionaldata. Ruppert, Wand and Carroll (2003) give a detailed summary of thesemethods. These methods can be applied to study the body mass trajectoriesof the mice. To further investigate the relationship between the body massand the averaged wheel running, an additional method needs to be used.First, we consider analyzing data on one individual. Fix the ith individ-ual and let Zi( ) denote its trajectory. Let tij denote Zi's jth observationalpoint and let zij denote its observed value at that time. Data collected onthis individual are the tij's and the zij's, j = 1;::: ;ni. The dependence ofzij on tij through the function Zi is written aszij = Zi(tij) + ij; j = 1;::: ;ni: (1.1)where ij represents an error, either measurement error or modelling error,2Chapter 1. Introductionwith mean zero. Estimating Zi from the zij's is often refered to as nonpara-metric regression, since the function Zi is only assumed to be \smooth".One may view zij as the perturbed Zi at time tij.One of the popular methods to estimate Zi is local polynomial smooth-ing. See Fan and Gijbels (1996) for details. Let K be a symmetric positivefunction with K(t) decreasing as jtj increases. Let h > 0 be the smoothingparameter which is usually refered to as the \bandwidth". To estimate Zi(t)by a local polynomial of degree p, we minimizenXj=1[zj 0 1(tj t) ::: p(tj t)p]2 K tj th to obtain (^0; ^1;:::; ^p). The estimate of Zi(t) is then ^i(t) = ^0. Com-pared with other smoothing methods, one advantage of local polynomialsmoothing is its simpler theoretical analysis which has allowed greater in-sight into its asymptotic properties.Another popular smoothing method uses spline basis functions to modelthe function Zi. A spline function with degree q is de ned on an interval.One rst divides the interval into subintervals by breakpoints, called knots.On each subinterval, the spline function is a polynomial of degree q. On theentire interval, the function has q 1 continous derivatives. For a discussionof spline functions, please see Ramsay and Silverman (2005) p. 47. Apopular choice of spline basis functions is the B-spline basis. In general, let 1,:::, K be the spline basis functions and assume that Zi can be modelledasZi(t) KXk=1xik k(t) (t)0xi; xi (xi1;:::;xiK)0: (1.2)It thus follows from (1.1) thatzij =KXk=1xik k(tij) + ij (tij)0xi + ij; j = 1;::: ;ni: (1.3)3Chapter 1. IntroductionOne way to estimate the unknown coe cient vector xi is via minimizingniXj=1 zij (tij)0xi 2 : (1.4)To use the spline basis functions, one must specify the number and thelocation of \knots". This is not a simple task. Improper choice of knotsmay result in over tting or under tting the data. One strategy to overcomethis problem is to use many knots, but to penalize to prevent over tting. Inthis approach, called penalized splines, we minimize a penalized version of(1.4) to estimate xi. That is, we minimizeniXj=1 zij (tij)0xi 2 + x0iPxi; (1.5)where is a smoothing parameter and P is a penalty matrix. For examplesof P and how to choose using the method of cross validation, please seeRamsay and Silverman (2005) p. 94.Interestingly, there is a close connection between the penalized smooth-ing estimate of Zi(t) using basis functions and the best linear unbiased pre-dictor (BLUP) of Zi(t) in a random e ects model framework. Intuitively,consider the vector coe cient xi in (1.3) being random and having a covari-ance matrix x. Under a normality assumption of the xi and the ij's, thelog likelihood function of the zij's and xi can be written in a form similar to(1.5), up to constant. The smoothing parameter is related to the varianceof ij and the penalty matrix P is in the same position as the inverse of x.For more discussion on the BLUP, see Robinson (1991).The random e ects view of penalized splines is particularly useful whenanalyzing data from N individuals: the zij's, j = 1;::: ;ni, i = 1;::: ;N. Wemodel each zij as in (1.3), with the random e ects xi's assumed indepen-dently and identically distributed with covariance matrix x. The randome ects induce a covariance structure on the zij's that provides a reasonable,yet exible model for within subject dependence. The parameter x is tobe estimated using all the zij's. Standard methods to t random e ects or4Chapter 1. Introductionmixed models (containing both random and xed e ects) can thus be used.For a reference, see McCulloch and Searle (2001). Rice and Wu (2001) usea B-spline basis in (1.3) and use the EM algorithm to estimate the modelparameters. The authors also calculate the best linear unbiased predictor ofeach individual trajectory.1.3 Functional regression modelsSome ideas and methods in functional data analysis can be viewed as exten-sions of those in classical statistics. One of the extensions is the functionalregression model with a scalar response and a functional predictor. Thismodel can be used to study the dependence of the averaged wheel runningon the body mass trajectory. In this section, we introduce the functionalregression model and sketch the methods to study this model. Among therich literature on this topic, we select those papers most relevant to ourapproach.Let Yi be the averaged wheel running of the ith individual. Referringto (1.1), we view Zi as the predictor while zij is the perturbed value of Ziat time tij. Data collected on N individuals are Yi and zij;j = 1;::: ;ni,i = 1;::: ;N. The functional regression model isYi = 0 +Z (t)Zi(t)dt + ei: (1.6)The unknown functional coe cient ( ) measures the dependence of thescalar response Yi on the predictor Zi and is of primary interest. Usually itis assumed that the function is smooth.The functional regression model (1.6) is loosely related to ordinary mul-tiple regression in that the integral replaces the summation. The ordinarymultiple linear regression model for a xed individual uses \covariates"Zi(ti1);::: ;Zi(tini). and models Yi asYi = 0 + Xj jZi(tij) + ei:5Chapter 1. IntroductionIn our case, where the Zi(tij)'s are unobserved, we are essentially carryingout multiple regression with covariates measured with error. Note that the j's do not depend on i. Thus, to use all individuals in this model, the Zi'shave to be measured at the same time points to obtain the zij's, i.e. tij = tjand ni = n. When individuals are measured at di erent sets of time pointsor have di erent number of observations, it is not clear how to apply themultiple regression model. Another problem with multiple regression is thatfor large values of n, the high-dimensional vector of parameters, , may behard to estimate.The functional model (1.6) can easily accommodate the case when dif-ferent time points are observed for di erent individuals. And exploiting thecontinuity of and the Zi's allows us to reduce the number of parametersused. In the literature, there are several approaches to calculate the integralR (t)Zi(t)dt and to estimate . In the remainder of this section, we give asummary of some of these approaches and focus on those closest to ours.One approach uses a discretized version of the integral and estimates by minimizing a penalized version of the residual sum of squares. Cardot,Crambes, Kneip and Sarda (2007) assumed that data were observed at thesame n equally spaced time points, i.e. ni n, tij = tj and tj tj 1 = 1=nfor all j = 2;::: ;n. Cardot et al. considered model (1.6) but without theintercept 0, i.e. the modelYi =Z (t)Zi(t)dt + ei:The authors discretized the integral R (t)Zi(t)dt using a grid of t valuesequal to the tj's and wroteYi = 1nnXj=1 (tj)Zi(tj) + ei:Let Zi = (Zi1;:::;Zin)0, zi = (zi1;::: ;zin)0 and = ( (t1);::: ; (tn))0. Theauthors used a Total Least Squares method to estimate and predict Zi as6Chapter 1. Introductionthe solutions of the minimization problemmin 2<n;Zi2<n(1NNXi=1" Yi 1nZ0i 2+ 1njjzi Zijj2#+ n 0P );where is a smoothing parameter and P is a penalty matrix. Unfortunately,it is not clear how to apply this approach directly when individuals aremeasured at di erent sets of time points.Another approach uses basis functions to approximate the Zi's and .In this basis expansion approach, individuals can be measured at di erenttime points and they don't need to have the same number of observations.This approach approximates the Zi's as in (1.2) and similarly approximates using basis functions, 1;:::; J, as (t) =JXj=1 j j(t) 0 (t); = ( 1;::: ; J)0: (1.7)An advantage of expansions (1.2) and (1.7) is that they reduce the in nitedimensional functional uncertainty into nite dimensional unknowns whichare conveniently handled by vectors. Estimating ( ) thus reduces to esti-mating the coe cient vector . We also notice that using (1.2) and (1.7),the integral R (t)Zi(t)dt in (1.6) can be written asZ (t)Zi(t)dt = 0Txi; where T[j;k] =Z j(t) k(t)dt: (1.8)In this approach, the xi's in (1.2) can be assumed to be xed or random.When the xi's are assumed random, they are usually modelled as describedat the end of Section 1.2. That is, they are assumed independently andidentically distributed with a normal distribution with covariance matrix x.This assumption contributes to the construction of the likelihood functionof the data, the (Yi;zij)'s. The vector is estimated by maximizing the loglikelihood or a penalized log likelihood.In the following, we give brief summaries of three statistical papers and abook chapter that use a basis expansion approach. In expansions (1.2) and7Chapter 1. Introduction(1.7), the k's and j's are usually chosen to be splines, although this is notnecessary. For instance, the k's can be equal to the principal componentfunctions of the Zi's.James (2002), James and Silverman (2005) and M uller (2005) consideredmore general models. In the following, we describe their approaches withan emphasis on studying model (1.6).James (2002) used k's in (1.3) equal to a basis for natural cubic splinesand also considered = in (1.7). The Zi's were subject to error, and sothe data were the perturbed zij's, given in (1.3). The xi's were multivariatenormal and was estimated by maximizing the log likelihood. With xi'sas the missing data and the f(zij;j = 1;::: ;ni);Yig's as the observed data,the author used the EM algorithm to estimate unknown parameters.James and Silverman (2005) used k's in (1.3) equal to the estimatedprincipal component functions of the Zij's and used j's equal to cubicspline basis functions. The coe cient vector's, the xi's, were taken to be xed. When the Zi's were observed without error, the authors estimated by maximizing the log likelihood of the (Yi;Zij)'s subject to a penaltyterm which penalized 's that have R (t)f(t)dt big when Var(R Z(t)f(t)dt)is small. When the zij's were observed, the log likelihood was modi ed toinclude the measurement error and a penalty term was added to ensure thesmooth ts of the Zi's.M uller (2005) used the idea of Karhunen-Lo eve expansion of the Zi pro-cess to estimate the k's and the corresponding xi's. In this approach,the k's in (1.3) are equal to the rst K estimated eigenfunctions of theZi process. The xi's are random and have diagonal covariance matrix, andM uller assumes that the estimated xi's have diagonal covariance matrix too.M uller's method of estimating can be separated into two parts where thexi's are estimated in the rst part and is estimated in the second part. Adetailed description of M uller's method is provided in Chapter 3.Ramsay and Silverman (2005, Chapter 15) rst calculated ^i( ) by ttingthe zij's using least squares and a fairly large number of basis functions.They then estimated by minimizing a penalized residual sum of squares8Chapter 1. Introductionwhich was de ned asNXi=1 Yi 0 Z (t)Zi(t)dt 2+ 0P :This method is implemented in the R library fda (Ramsay, Wickham andGraves, 2007).1.4 Introduction to the thesis workIn (1.3), when the xi's are random, the distribution of the observed vector(zi1;::: ;zini) will not, in general, be identi able. This concern motivatesour work in Chapter 2. Assuming the xi's random introduces an additionalparameter x into the model. This additional parameter complicates thecovariance matrix of the zij's and the variance of the Yi's. Possibly, thecovariance/variance parameters of the model are not identi able.In Chapter 2, we study identi ability in normal linear mixed e ects mod-els. We derive necessary and su cient conditions of identi ability, includingconditions of identi ability for the scalar response and functional predictormixed e ects model.In Chapter 3 of the thesis, we study model (1.6) assuming the observedZi's are contaminated with measurement error. We use principal componentfunctions to expand Zi in (1.2) and use the same functions to expand in(1.7). This choice of basis functions can t the Zi's well with K fairlysmall. An added bene t of this eigenfunction approach is that we focus on\estimable directions" of . We assume that xi in (1.3) is a normal randomvector with a general covariance matrix x.In Chapter 3, we implement the ECME (Expectation/Conditional Max-imization Either) algorithm of Liu and Rubin (1994) to estimate the modelparameters. We consider Hessian matrix based and boot-strapped standarderrors of the estimate of . We propose a new integrated t-statistic for hy-pothesis testing involving ( ). Expressions of the residuals of the model t are derived and we discuss model diagnostics based on the analysis ofresiduals.9Chapter 1. IntroductionIn Chapter 4, we apply our model to analyze a biological data set, pro-viding a detailed biological background.Chapter 5 summarizes the thesis work and discusses possible future re-search directions.10Bibliography[1] Cardot, H., Crambes, C., Kneip, A. and Sarda, P. (2007) Smooth-ing splines estimators in functional linear regression with errors-in-variables. Computational Statistics & Data Analysis 51, 10, 4832-4848.[2] James, G. (2002) Generalized linear models with functional predictors.Journal of the Royal Statistical Society B 64, 3, 411-432.[3] James, G. M. and Silverman, B. W. (2005) Functional adaptive modelestimation. Journal of the American Statistical Association 100, 470,565-576.[4] Fan, J., and Gijbels, I. (1996) Local Polynomial Modelling and Its Ap-plications. London: Chapman & Hall.[5] Liu, C. H. and Rubin, D. B. (1994) The ECME Algorithm - a simple ex-tension of EM and ECM with faster monotone convergence. Biometrika81 4, 633-648.[6] McCulloch, C. E. and Searle, S. R. (2001) Generalized, Linear andMixed Models. New York: Wiley.[7] Morgan, T.J., T. Garland, Jr., and Carter, P. A. (2003) Ontogenetictrajectories in mice selected for high wheel- running activity. I. Meanontogenetic trajectories. Evolution 57, 646-657.[8] M uller, H. G. (2005) Functional modeling and classi cation of longitu-dinal data. Scandinavian Journal Statistics 32, 223-240.[9] Ramsay, J. O. and Silverman, B. W. (2005) Functional Data Analysis.2nd edition, Springer Series in Statistics.11Bibliography[10] Ramsay, J. O., Wickham, H. and Graves, S. (2007) R fda Library.[11] Rice, J. A. and Wu, C. O. (2001) Nonparametric mixed e ects modelsfor unequally sampled noisy curves. Biometrics 57, 253-259.[12] Robinson, G.K. (1991) That BLUP is a good thing: the estimation ofrandom e ects. Statistical Science 6, 15-32.[13] Ruppert, D., Wand, M. P. and Carroll, R. J. (2003) SemiparametricRegression. Cambridge University Press.12Chapter 2Identi ability discussion oflinear mixed e ects models2.1 IntroductionMixed e ects models have moved beyond the original simple models, becom-ing more complicated, containing more parameters. However, two di erentsets of parameters producing the same covariance matrix for the observa-tions may cause problems for parameter estimation and for inference. Inprinciple, parameter identi ability is the rst thing that should be veri edwhen building a model. As Demidenko (2004, p.118) states, \identi abil-ity may be viewed as a necessary property for the adequacy of a statisticalmodel".We study identi ability in normal linear mixed e ects models. In thenext section, we de ne the classical mixed e ects model and show that inan unrestricted form and a speci c restricted form, the model is not identi -able. Sections 2.3 and 2.4 give su cient conditions to identify parameters invarious models. In Section 2.5 we discuss identi ability of extended models.2.2 MotivationIn this section, we give the de nition of nonidenti ability. Then we introducethe general unrestricted classical linear mixed e ects model and give twomodels that are nonidenti able.De nition 2.2.1 Let y be the vector of observable random variables with0A version of this chapter will be submitted for publication. Author: Wang, W.13Chapter 2. Identi ability discussion of linear mixed e ects modelsdistribution function P where is in the parameter space . This proba-bility model for y is not identi able if and only if there exist ; 2 , with 6= and P = P .Throughout the paper, we assume all random variables follow normal dis-tributions. As the normal distribution is uniquely characterized by its rsttwo moments, identi ability of a normal distribution function then reducesto the identi ability of its mean vector and covariance matrix (Demidenko,2004, Proposition 10, p.118).In the standard linear mixed e ects model, y is the observable randomvector of length n and X and Z are known, non-random design matriceswith dimensions n p; n > p and n q; n > q respectively. We assumethroughout that both X and Z have full column rank. Theny = X + Zu + ;u N(0; u); N(0; ); u independent of : (2.1)The random e ects vector u and the error vector are unobservable. Thismodel has been studied and applied by, for instance, McCulloch and Searle(2001).Unknown parameters in the model are ( ; ), where 2 B <p; =( ; u) 2 ~ = the set of all ( ; u) with , n n, and u,q q, both symmetric and positive de nite. Throughout the paper, weassume that and do not have common elements, i.e. we assume that( ; ) 2 B ~ . We also assume that and u do not have commonelements. That is, we sometimes assume that ~ = ~ ~ u where ~ and ~ u u, and contains all n n positive de nite symmetricmatrices and u contains all q q positive de nite symmetric matrices. Wetake as our operating de nition of identi ability the ability to identify theparameters , and u.The linear mixed e ects model has become popular in the analysis oflongitudinal and functional data. For instance, the response of individuali at time t can be modelled as P j j(t) + Puik k(t) plus error, where14Chapter 2. Identi ability discussion of linear mixed e ects modelsthe j's are xed population e ects and the uik's are individual-speci crandom e ects. The inclusion of random e ects allows us to realisticallymodel covariance within an individual. Taking j's and k's equal to B-splines allows us to exibly model a wide variety of responses. In addition,if we allow some of the j's to be random but with a very speci c covariancestructure, then the resulting best linear unbiased predictors of the j's arecomputationally equivalent to the parameter estimates in a smoothing spline t or a penalized B-spline t. See Verbyla (1999) or Ruppert, Wand andCarroll (2003).The parameter vector is always identi able since Ey = X and X is offull column rank. So to study identi ability in the normal model, it su cesto study the covariance matrix of y, y = Z uZ0 + , to see when and u are identi able. Thus, nonidenti ability of model (2.1) with parameterspace B ~ is equivalent to the existence of ; 2 ~ , with 6= givingthe same y, i.e.Z uZ0 + = Z uZ0 + : (2.2)In Example 2.2.1 below, we show that the unrestricted model where~ = is nonidenti able. A similar argument shows that the restrictedmodel in Example 2.2.2 below is not identi able. Example 2.2.2 uses thecovariance structure assumed for the random e ects in the penalized Bsplinemethod. Thus, when using penalized Bsplines, one cannot assume a generalform for .Example 2.2.1Suppose we place no further restrictions on . Then the model is notidenti able. To see this, let ( ; u) 2 and 0 < a < 1. Then u =(1 a) u and = + aZ uZ0. So and u satisfy (2.2) and( ; u) 2 .Example 2.2.2Suppose that = ~ u where ~ u contains all matrices of the form 2R where R is positive de nite and known, and 2 > 0. To see that this15Chapter 2. Identi ability discussion of linear mixed e ects modelsmodel is not identi able, use the same argument as in Example 2.2.1 andnote that the constructed u is in ~ u if u is in ~ u.In practice, one usually assumes a more speci c structure for , suchas = 2 I. Restrictions may lead to identi ability, and such restrictionsand their e ects on identi ability will be discussed in the next two sections.2.3 Simple su cient conditions of identi abilityIn this section, we nd su cient conditions of identi ability of model (2.1)assuming ~ = .A further examination of (2.2) gives us the following su cient conditions.Clearly, if u is known, then Z uZ0 is known, and so is completelydetermined.If is known, then the model is identi able. To see this, consider (2.2)with = . It follows that Z uZ0 = Z uZ0 and so u = u since Z isof full column rank.If Z uZ0 1 = K, where K is known and K+I is of full column rank,then the model is identi able. Suppose by way of contradiction the model isnot identi able. Then (2.2) holds for ( u; ) 6= ( u; ) both in ~ with,by assumption, Z uZ0 = K and Z uZ0 = K . Substituting theseexpressions into (2.2) yieldsK + = K + ;that is,(K + I)( ) = 0:Since K + I is of full rank, we must have = . But, as shown in theprevious paragraph, this implies that u = u.The last condition is similar to a common condition for identi abililtyin simple linear regression models with measurement errors. The modelassumesyi = 0 + 1xi + i; i (0; 2 );16Chapter 2. Identi ability discussion of linear mixed e ects modelswhere xi is observed with error having variance 2u. The response yi thenhas variance 2u + 2 . One of the common conditions of model identi abilityis to assume the ratio 2u= 2 is known. The inverse 1 appearing in ourlast condition could be viewed as multivariate version of \denominator".If there are any supplementary data, we may then be able to nd anestimate of u, or K and we can treat this estimate as the true value.The su cient conditions for identi ability can then be satis ed.2.4 Su cient conditions of identi ability for astructured As we observed from Examples 2.2.1 and 2.2.2, the model is not identi ableeven if we restrict u to be a scalar multiple of a known matrix. In thissection, we study the e ect of putting restrictions on . In Theorem 2.4.1below, we give a necessary and su cient condition of nonidenti ability, acondition that relies mainly on the design matrix Z via HZ = Z(Z0Z) 1Z0.The theorem leads to four corollaries: Corollaries 2.4.1 and 2.4.2 give nec-essary and su cient conditions for identi ability when arises from anexchangeable covariance structure or is diagonal. Corollary 2.4.3 states aneasily checked condition on that guarantees identi ability of the model.That corollary is then applied to two commonly used error structures. Us-ing Corollary 2.4.4, we can generalize a known identi ability result, givinga shorter proof under weaker conditions.Theorem 2.4.1 Let ~ and de ne HZ = Z(Z0Z) 1Z0. Then model(2.1) with parameter space B ~ is nonidenti able if and only if there exist( ; u) 2 ~ and ( ; u) 2 ~ , with 6= such thatHZ [ ] = ; (2.3)and u = u + (Z0Z) 1Z0 [ ]Z(Z0Z) 1: (2.4)17Chapter 2. Identi ability discussion of linear mixed e ects modelsProof:Nonidenti ability of the model is equivalent to the existence of ( ; u) and( ; u) in ~ , not equal, satisfying (2.2). Note that this is equivalent tohaving ( ; u) and ( ; u) in ~ with 6= satisfyingZ( u u)Z0 = : (2.5)Suppose the model is nonidenti able. We premultiply (2.5) by Z0, post-multiply it by Z and then pre- and postmultiply by (Z0Z) 1 to get u u = (Z0Z) 1Z0 [ ]Z(Z0Z) 1: (2.6)This gives (2.4). To derive (2.3), premultiply (2.6) by Z, postmultiply (2.6)by Z0 to getZ( u u)Z0 = HZ [ ]HZ (2.7)which, by (2.5), is the same as = HZ [ ]HZ: (2.8)Premultiplying (2.8) by the idempotent matrix HZ givesHZ [ ] = HZ [ ] HZ:Substituting (2.8) into the right side of the above yields (2.3).To prove the converse, we want to show that (2.3) and (2.4) lead to (2.5).It is clear from (2.4) that (2.7) holds. If we can show that (2.8) holds thenwe are done since substituting (2.8) into the right side of (2.7) yields (2.5).To show (2.8), from (2.3) and the symmetry of , we see thatHZ[ ] = [ ]HZ:Premultiplying the above identity by the idempotent matrix HZ gives HZ[ ] = HZ[ ]HZ: Substituting (2.3) for the left side of the equation,we see that (2.8) holds. 218Chapter 2. Identi ability discussion of linear mixed e ects modelsThe proofs of the next two corollaries are in Appendix A.Corollary 2.4.1 Let 1 be an n-vector with each element being one. Supposethat the distribution of 1;::: ; n is exchangeable, that is, the covariancematrix of is of the form 2 [(1 )I + J] where J = 110. Let ~ =f = 2 [(1 )I + J]; 2 > 0; 1=(n 1) < < 1g. Suppose thematrix Z satis es 10Z 6= 0 and rank(Z) = q with 1 q < n 1. Supposethe parameter space is B ~ u. Then model (2.1) is identi able if andonly if HZJ 6= J.Comments. The condition HZJ = J means the sum of the elements ofeach row of HZ is equal to one, and this is an easy condition to check. Forthe case that q = 1, i.e. Z is a column vector (z1;::: ;zn)0 where Pzi 6= 0,HZ =0BBB@z21s2zz1z2s2z z1zns2z... ..znz1s2zznz2s2z z2ns2z1CCCA; where s2z =Xz2i :The model is identi able if and only if Z is not a constant vector.When q = 2, suppose we have the usual simple linear regression modelwith centered covariates:Z =0BBB@1 z1... ...1 zn1CCCA; withXzi = 0: (2.9)ThenHZ =0BBB@1n +z21s2z1n +z1z2s2z 1n +z1zns2z... ..1n +znz1s2z1n +znz2s2z 1n +z2ns2z1CCCA:and each row of HZ sums to one. Thus unfortunately, the model is not iden-ti able under this Z combined with the exchangable covariance structure.Corollary 2.4.2 Suppose that ~ equals the collection of all diagonal pos-itive de nite n n matrices. Then model (2.1) with parameter space B 19Chapter 2. Identi ability discussion of linear mixed e ects models~ u is identi able if and only none of the diagonal elements of HZ isequal to one.Comments. Again, the condition on HZ is easy to check. Consider thecase q = 1. As we have seen, diagonal elements of HZ equal z2i =Pz2j ,i = 1;::: ;n. Therefore, the model is identi able if and only if Z does nothave n 1 zero elements. Consider q = 2 with Z as in (2.9). The model isidenti able provided, for all i, (1=n)+z2i =Pz2j doesn't equal 1. So typically,the model is identi able.The following corollary provides a su cient condition for identi ability,a condition that can sometimes be easily checked. Consider (2.3). Note thatthe rank of HZ( ) is at most q, since the rank of HZ is q. Thus,for (2:3) to hold, we must be able to nd some and with the rank of less than or equal to q. This proves the following.Corollary 2.4.3 Suppose that ~ . Then model (2.1) with parameterspace B ~ is identi able if rank( ) > q for all ; in theparameter space.Now we apply Corollary 2.4.3 to show model identi ability under the\multiple of a known positive de nite matrix" and the \MA(1)" covariancestructures respectively in next two examples.Example 2.4.1 Multiple of a known positive de nite matrixFix R, symmetric and positive de nite, and suppose ( ; u) 2 ~ impliesthat = 2R; some 2 > 0. Consider = 2R and = 2R, 2 6= 2. Clearly = ( 2 2)R is invertible, and so is of rank n,which we have assumed is greater than q. Thus, the model is identi able.To show the model in Example 2.4.2 below is identi able, we need thefollowing lemma which is a result in (Graybill, 1983, p.285)Lemma 2.4.1 Let T be the n n Toeplitz matrix with ones on the twoparallel subdiagonals and zeroes elsewhere. Given two scalars a0 and a1, the20Chapter 2. Identi ability discussion of linear mixed e ects modelseigenvalues of the n n matrix C = a0I + a1T are i = a0 + 2ja1j cos i n + 1; i = 1;::: ;n:Example 2.4.2 MA(1)Suppose that n 1 > q: Let the components of have the MA(1) covariancestructure, i.e. of the form 2(I + T). Let ~ = f = 2(I + T); 2 >0; j j < 1=2g and suppose ( ; u) 2 ~ implies that 2 ~ .Let and 2 ~ . By Lemma 2.4.1, the eigenvalues of the di erencematrix = ( 2 2)I + ( 2 2 )T are i = ( 2 2) + 2 2 2 cos i n + 1;i = 1;::: ;n:Given any ( 2; ) and ( 2; ), with ( 2; ) 6= ( 2; ), the number of zero i's is at most one. Hence, the rank of the di erence matrix is greater thanor equal to n 1. Therefore, model (2.1) is identi able under this MA(1)covariance structure.In longitudinal or functional data analysis, usually there are N individ-uals with the ith individual modelled as in (2.1):yi = Xi + Ziui + i;ui N(0; u); i N(0; i);( u; i) 2 u ~ i ; ui and i independent: (2.10)Statistical inference is normally based on the joint model, the model ofthese N individuals. The following corollary gives su cient conditions foridenti ability of the joint model. The intuition behind the result is that, ifwe can identify u from one individual, then we can identify all of the i's.Corollary 2.4.4 If an individual model (2.10) is identi able, then the jointmodel is identi able.21Chapter 2. Identi ability discussion of linear mixed e ects modelsProof:We notice each individual model (2.10) shares a common parameter, thecovariance matrix u. If one individual model uniquely determines u andits i, the identi ed u will then yield identi ability of all the individual i's since, if Zi uZ0i + i = Zi uZ0i + i , clearly i = i . Therefore, thejoint model is identi able. 2Corollary 2.4.4 reduces the veri cation of a joint model's identi abilityto the individuals'. For instance, if the i-th individual model has Zi offull column rank and i = 2 Ini, where ni is the length of yi, then thisindividual model is identi able by Corollary 2.4.1 and thus so is the jointmodel. Note that the other individual models can still have their Zj's notof full column rank.Demidenko (2004, Chapters 2 & 3) studies the joint model but assumesthe covariance matrix of i is 2Ini. Suppose each Zi is of dimension ni q.Demidenko shows that the joint model is identi able if at least one matrixZi is of full column rank and PNi=1(ni q) > 0. Using our argument inthe previous paragraph, the condition PNi=1(ni q) > 0 can be dropped.Furthermore, our result can be applied to more general 's.2.5 ExtensionsIn this section, we discuss identi ability of a model in functional regressionfor a functional predictor y( ) and a scalar response w. We derive a necessaryand su cient condition of nonidenti ability for this model.We model y(t) as Pj j j(t) + Pk uk k(t) plus error, with the j'sand k's known, the j's unknown and the uk's unknown and random.The dependence of the response w on y is modelled through an unknownfunctional coe cient, : w = 0+R (t) [y(t) E(y(t))] dt+ where is mean0 normal noise. Thus, for appropriately de ned and with u = (u1;::: ;uq)0,we can writew = 0 + 0u + ; 0 2 <; 2 <q unknown ; (0; 2 ) independent of u: (2.11)22Chapter 2. Identi ability discussion of linear mixed e ects modelsThe predictor y is observed at a sequence of discretized points and theobserved values are contained in the vector y, which then follows model(2.1). James (2002), M uller (2005) and Heckman and Wang (2007) considerthis modelling of y and w and propose di erent approaches to estimate thefunctional coe cient, .We assume model (2.1) and (2.11). For our purpose of identi abilitydiscussion here, we consider the unknown parameters to be ( 0; ) and =( ; u; 2 ; ). We suppose that ( 0; ) 2 B < <p and that 2 ~ = u <+ <q where and u are as before and <+ is the setof positive real numbers.To study identi ability, we must study the distribution of the randomvector (y0;w). We see that E(y) = X , E(w) = 0 and (y0;w)0 has covari-ance matrix = Z uZ0 + Z u 0 uZ0 0 u + 2 !:We know the parameter is identi able if the matrix X is of full columnrank. The identi ability of 0 is also clear. So we focus on identifying thecovariance parameters .Our discussion in Section 2.2 suggests the unrestricted model won't beidenti able. In fact, we can construct an example following Example 2.2.1to show the existence of nonidentical and both in such that (2.2)holds andZ u = Z u ; (2.12) 0 u + 2 = 0 u + 2: (2.13)Example 2.5.1 Example 2.2.1 continued.Let 0 < a < 1, u and be as in Example 2.2.1, and let = =(1 a)and 2 = 2 a 0 u =(1 a). It is not hard to see (2.12) and (2.13) aresatis ed. If, in addition, we restrict a < 2 =( 2 + 0 u ), we see that 2is positive.The following theorem gives a necessary and su cient condition of non-23Chapter 2. Identi ability discussion of linear mixed e ects modelsidenti ability.Theorem 2.5.1 Let ~ , let ~ = ~ u <+ <q: and de neHZ = Z(Z0Z) 1Z0. Then model (2.1) and (2.11) with parameter space B ~is nonidenti able if and only if there exist ( ; u; 2 ; ) and ( ; u; 2; )both in ~ , with 6= such that the following hold(a) HZ [ ] = ,(b) u = u + (Z0Z) 1Z0 [ ]Z(Z0Z) 1,(c) = u 1 u ,(d) 2 = 2 + 0 u( 1u u 1) u .Proof:Nonidenti ability of the model is equivalent to the existence of nonidentical and in ~ , satisfying (2.2), (2.12) and (2.13).First suppose that (2.2), (2.12) and (2.13) hold for some 6= . Bythe argument given in Theorem 2.4.1, (2.2) implies that (a) and (b) hold.Since Z is of full column rank, (2.12) yields (c). Substituting (c) into (2.13)yields (d). We easily see that, if 6= and if (a) through (d) hold, then 6= .Now suppose that conditions (a) through (d) hold for some 6= .Again, by the argument given in Theorem 2.4.1, (2.2) holds. We easily seethat (2.12) and (2.13) also hold. 224Bibliography[1] Demidenko, E. (2004) Mixed Models: Theory and Applications, JohnWiley & Sons.[2] Graybill, F. A. (1983) Matrices with Applications in Statistics, SecondEdition, Wadsorth, Inc.[3] Heckman, N. and Wang, W. (2007) Linear mixed models for measure-ment error in functional regression. manuscript, Department of Statis-tics, University of British Columbia.[4] James, G. (2002) Generalized linear models with functional predictors.Journal of the Royal Statistical Society B 64, 3, 411-432.[5] McCulloch, C. & Searle, S. R. (2001) Generalized, Linear and MixedModels, John Wiley & Sons.[6] M uller, H.G. (2005) Functional modeling and classi cation of longitu-dinal data. Scandinavian Journal of Statistics 32, 223-240.[7] Ruppert, D., Wand, M.P. & Carroll, R.J. (2003). Semiparametric Re-gression, Cambridge University Press.[8] Verbyla, A. P., Cullis, B. R., Kenward, M. G., & Welham, S. J. (1999).The analysis of designed experiments and longitudinal data by usingsmoothing splines. Applied Statistics 48, 269-311.25Chapter 3Linear mixed models formeasurement error infunctional regression3.1 IntroductionRegression models with a functional predictor Z( ) and a scalar response Yhave been extensively studied (see, eg, Ramsay and Silverman, 2005, andreferences therein). For individual i, the dependence of Yi on Zi is modelledasYi = 0 +Z ba (t) Zi(t) E(Zi(t)) dt + ei: (3.1)The goal is to estimate .Data are collected from N independent individuals, with data on in-dividual i, i = 1;::: ;N, being Yi;Zij Zi(tij), j = 1;::: ;ni. If the Ziprocesses are observed with error, then our data on individual i are Yi andzij;j = 1;::: ;ni, withzij = Zi(tij) + ij; Cov( i1;:::; ini) = i: (3.2)Here, we consider data where the Zij's are observed with error whichcould be measurement error or modelling error, and we model the functionZi using random e ects with a set of basis functions 1;::: ; K. In our0A version of this chapter will be submitted for publication. Authors: Heckman, N.and Wang, W.26Chapter 3. Linear mixed models for measurement error in functional regressionestimation process, we approximate Zi( ) asZi(t) = (t) +KXk=1xik k(t); (3.3)where is an arbitrary function and xi (xi1;:::;xiK)0 are independentand normally distributed with mean vector 0 and covariance matrix x.This approach was taken by James (2002), M uller (2005) and Jamesand Silverman (2005) for data with and without measurement error. Jamesused k's equal to a basis for natural cubic splines and also used this ba-sis for modelling . James's approach is similar to that described in Sec-tion 3.3. However, we implement a faster algorithm, the ECME (Expecta-tion/Conditional Maximization Either) algorithm in Liu and Rubin (1994),and consider Hessian matrix based and boot-strapped standard errors.M uller used k's equal to the rst K estimated eigenfunctions of theZi process. M uller's approach can be separated into two parts: the rstpart uses only the zij's to determine , the k's and the xij's. The secondpart incorporates the Yi's to estimate 0 and . The rst part uses thePACE method (principal analysis through conditional expectation) of Yao,M uller and Wang (2005). In PACE, Yao et al. smooth the observed zij's toobtain an estimate ^ of and then centre the data by subtracting ^. Nextthe authors smooth the centred data to estimate the covariance functionof the Zi's and then estimate the rst K eigenfunctions and eigenvaluesof the estimated covariance function. Denote these estimates by 1;:::; Kand ^1;::: ; ^K. Let ^ik be the best linear unbiased estimate of individual i'skth PC score, ^ik = E(xikjzi1;::: ;zini), calculated assuming normality, (3.2)with i = 2I and (3.3) with x = diag(^1;::: ; ^K). The second part ofM uller's methodology involves regressing Yi on ^i1;::: ; ^iK to estimate k =R (t) k(t) dt and then setting ^(t) = P ^k k(t). M uller justi es the useof this regression by showing that E(Yijzij;j = 1;::: ;ni) = 0 + P11 kxik,for the true PC scores xik calculated with the true eigenfunctions.M uller's approach has a big computational advantage over James's (2002),in that it can t the Zi's well with K fairly small. An added bene t of27Chapter 3. Linear mixed models for measurement error in functional regressionthis eigenfunction approach is that we focus on \estimable directions" of .For instance, consider the extreme case where Zi can be written exactly asZi(t) = (t)+PK1 xik k(t). So Zi has no variability in directions orthogonalto the k's. Since R (t)[Zi(t) E(Zi(t))] dt = PK1 xik R (t) k(t) dt, wecan only hope to estimate R (t) k(t) dt, k = 1;::: ;K. We cannot esti-mate R (t) f(t) dt for f orthogonal to the k's. This issue was noted byJames and Silverman (2005) who handled it by adding a penalty term whichpenalizes 's that have R (t)f(t)dt big when Var(R Z(t)f(t)dt) is small.M uller's approach has the disadvantage that it does not fully use the Yi's:the xik's in (3.3) are estimated using only the zij's. Clearly, the Yi's alsoprovide information about the xik's if there is a relationship between Yi andZi, that is, if 6= 0. As James and Silverman (2005) note \It is an interestingfeature of this problem that the responses provide additional informationin the estimation of the Zi's". Also, M uller's calculation, that E(Yijzij;j =1;::: ;ni) = 0 +P11 kxik does not hold if the eigenvalues or eigenfunctionsare incorrect. In particular, the calculation relies on Cov(xij;xik) = 0 forj 6= k.We consider a hybrid approach. Like M uller, we use 1;::: ; K equalto the rst K estimated eigenfunctions of the Zi process. Thus we not onlyimprove on James's choice of k's but also focus on \estimable directions" of . We then treat these k's as xed and known in our analysis. We use allof the data, the Yi's and the zij's, to estimate the xik's and we do not placeany restrictions on x. Thus we improve on M uller's procedure, where thexik's are estimated using only the zij's, and x is assumed diagonal and isestimated completely by the eigenanalysis of the zij's.Our detailed parameter estimation procedure using the ECME algorithmis in Section 3.3. In this work, we also propose test statistics for hypothesistesting of ( ). In Section 3.4.1, we consider testing the nullity of the function , i.e. testing Ho : (t) = 0; for all t 2 [a;b]. In the two sample situation,we consider testing the equality of a selection s and a control c, i.e. testingHo : s(t) = c(t); for all t 2 [a;b]. We propose a new integrated t-statisticand three test statistics that are more standard. In Section 3.5, we deriveexpressions of the residuals of the model t and discuss model diagnostics28Chapter 3. Linear mixed models for measurement error in functional regressionbased on the analysis of residuals. In Section 3.6.2, we apply the modelto analyze a biological data set. In Section 3.7, via a simulation study,we compare our ECME estimate of to a modi cation of M uller's (2005)two-stage estimate and we also study the performance of the di erent teststatistics. Our detailed calculations to derive the ECME estimates are inAppendix B.3.2 Notation and PreliminariesBefore we t the model, we introduce some notation and carry out prelim-inary calculations. In this section and the next, we suppose that the k'sare xed and known. In practice, however, we will estimate them from aneigenanalysis of the zij's.For ease, assume ni n and tij = tj. Suppose that (3.1), (3.2) and(3.3) above hold. Let Zi = (Zi1;::: ;Zin)0, = ( (t1);::: ; (tn))0, i =( i1;::: ; in)0 represent errors and zi = (zi1;::: ;zin)0 be the observed values.Writezi = Zi + i + Axi + i; (3.4)where xi i:i:d: N(0; x); Ajk = k(tj); j = 1;::: ;n; k = 1;::: ;K; i i:i:d: N(0; 2 R); R known, symmetric and positive de nite;fx1;::: ;xng independent of f 1;::: ; ng:If R is known, then the model is identi able (Wang, 2007). However, if 2 Ris unknown, then the model is not identi able (Wang, 2007). Wang alsodiscusses model identi ability under other assumptions on the covariancematrix of i.We choose a basis for and write (t) =JXj=1 j j(t) 0 (t):29Chapter 3. Linear mixed models for measurement error in functional regressionTypically, we will take J = K and j = j, but will write for the generalcase. Thus we can writeYi = 0 + 0Txi + ei; where Tjk =Z ba j(t) k(t) dt and ei N(0; 2):(3.5)We easily see that zi;xi and Yi are jointly multivariate normal withE(zi) = ; E(Yi) = 0; E(xi) = 0;andVar(Yi) 2Y = 0T xT0 + 2; Cov(zi) z = A xA0 + 2 R;Cov(zi;Yi) z;Y = A xT0 ; Cov zi;x0i z;x = A xand Cov Yi;x0i Y;x = 0T x: (3.6)Let Wi = (z0i;Yi)0 be the ith subject's observation and let W denote E(Wi).LetC = A 0T!(3.7)and d be a block diagonal matrix as d = diag( 2 R; 2): (3.8)ThenCov(Wi) W = C xC0 + d (3.9)and the log-likelihood of the observed data is N = N2 lndet C xC0 + d 12NXi=1(Wi W )0 C xC0 + d 1 (Wi W )up to an additive constant term. Unknown model parameters are =( ; 0; x; 2 ; ; 2). Directly maximizing N over does not give us closedforms of the parameter estimates except for and 0. So we need to rely30Chapter 3. Linear mixed models for measurement error in functional regressionon iterative numerical methods to nd the estimates. We will elaborate onthese methods in the next section.3.3 Parameter estimationIn this section, we use the ECME (Expectation/Conditional MaximizationEither) algorithm (Liu and Rubin, 1994) to estimate the model parametersin .In this balanced design, the MLEs of and 0 are z and Y respectively.So throughout we take (t) = ^ = z and (t)0 = ^0 = Y and so (t)W = ; t = 0;1;::: :We estimate the other parameters iteratively and sequentially. Given (t), the parameter estimates at iteration t, we update one component of (t) at a time, holding the other components xed. We treat (zi;Yi;xi); i =1;::: ;N; as the complete data. We update 2(t) by nding its EM estimate.That is, we nd its estimate by maximizing the conditional expected com-plete data log-likelihood function, where we condition on the observed data.The other components of (t) are updated by maximizing N directly.Throughout our ECME procedure in Sections 3.3.2-3.3.4, we make thefollowing assumptions.(a) A is of full column rank;(b) T is of full row rank;(c) there exists no u and v such that, for all i = 1;::: ;n, zi = u + v0xi;(d) there exists no v such, for all i = 1;::: ;n, Yi = Y + v0(zi z).The restrictions on A and T are easily satis ed. Assumption (b) requiresJ, the number of the j basis functions to be no larger than K, the numberof the k basis functions. Typically, we will take J = K and j = j.Assumptions (c) and (d) are common for data where there is noise. We31Chapter 3. Linear mixed models for measurement error in functional regressionnotice assumption (a) implies that the matrix C de ned in (3.7) is also offull column rank.3.3.1 Initial estimates of parameters other than and 0We choose the initial estimates as follows.We take (0) to be a vector of zeroes. If this were the true value of ,then we could simply estimate 2 by the sample variance of the Yi's. Toaccount for the fact that may not be zero and thus the sample variance ofthe Yi's would overestimate 2, we take 2(0) equal to 0:6 times the samplevariance of the Yi's.The values of (0)x and 2(0) are based on the penalized eigenanalysis ofthe zi's sample covariance matrix described in Section 3.6.1. These initialestimates are sensible if R is the identity matrix, but can still be used ifR is not the identity. Roughly, the eigenanalysis in Section 3.6.1 partitionsthe variability of the zi's into two parts: variability from the Zi process andvariability from the noise. Let 1;::: ; n denote the calculated eigenvaluesand suppose the largest K su ciently describe the variability in the zi's.So we will use K eigenfunctions. A reasonable rst estimate of x is (0)xdiagonal with entries 1;::: ; K. We take 2(0) equal to PnK+1 k=(n K),explaining the remaining variability in the zij's.Clearly, under assumptions (a)-(d), we can force 2(0) and 2(0) to bepositive and (0)x > 0. Given (t), we update x, 2 ; ; and 2; as describedbelow.3.3.2 Updating (t)xWe update (t)x by maximizing N over x while keeping the other pa-rameters xed. Let SW = PNi=1(Wi )(Wi )0=N: We show that if 2(t) and 2(t) are positive and if SW (t)d > 0, then our update (t+1)x ispositive de nite and using it in the log likelihood instead of (t)x increasesthe log likelihood.With detailed derivation in Section B.4, di erentiating N with respect32Chapter 3. Linear mixed models for measurement error in functional regressionto x and equating to zero yields the rst order conditionC0 1W C = C0 1W SW 1W C: (3.10)Here, C depends on (t) and W = C xC0 + (t)d from (3.9). Equation(3.10) holds provided W is invertible at the critical value of x. Since weassume that 2(t) and 2(t) are positive, (t)d is positive de nite. So W willbe invertible provided x is non-negative de nite.We now solve (3.10) for x, rst deriving two useful identities, (3.11)and (3.12). For ease, we drop the hats and superscript t's on the parameterestimates that are being held xed, that is, on ^W ; 2(t) ; (t); and 2(t).Direct multiplication and some manipulation of the left hand side of thefollowing shows that C0 1d C C0 1d C 1+ x C0 1W = C0 1d :Solving this for C 1W yieldsC0 1W = C0 1d C 1+ x 1 C0 1d C 1C0 1d : (3.11)Postmultiplying both sides of identity (3.11) by C yieldsC0 1W C = C0 1d C 1+ x 1: (3.12)Substituting (3.11) into the right side of (3.10) and (3.12) into the left sideof (3.10) yields C0 1d C 1+ x = F SW F0;where F = C0 1d C 1C0 1d . Note that F is of full row rank. Thus, thecritical point is^ x = F SW F0 C0 1d C 1 = F(SW d)F0; (3.13)which is strictly positive de nite. And so, clearly we have W invertible at33Chapter 3. Linear mixed models for measurement error in functional regressionthe critical point.To see that the updated ^ x leads to an increase in N, we show thatthe Hessian matrix, H( x) evaluated at ^ x is negative de nite. The ijthelement of H( x) is the second order partial derivative of N with respectto the ith and jth elements of the vectorized x. From calculations inSection B.4, we haveH(^x) = (N=2) ^D ^ ; where ^ = C0 ^ 1W C; (3.14)which is clearly negative de nite.3.3.3 Updating 2(t) We update 2(t) , holding all other parameter estimates xed, using one E-step and one M-step of the EM algorithm. We show that if 2(t) and 2(t) are positive and if (t)x > 0, then our update 2(t+1) is positive. Increaseof the log likelihood after updating 2(t) by 2(t+1) is guaranteed by theproperty of the EM algorithm.Recall (zi;Yi;xi); i = 1;::: ;N, are our complete data and Wi (z0i;Yi)0,i = 1;::: ;N, are the observed data. In conditional expectations, we let \ "stand for the observed data. Abusing notation slightly, we let f denote ageneric density function with the exact meaning clear from the arguments.The E-Step of the EM algorithm calculates E (t) PNi=1 lnf(zi;Yi;xi) j and the M-step maximizes this conditional expectation over 2 to obtain 2(t+1) . By the conditional independence of zi and Yi given xi,lnf(zi;Yi;xi) lnf(zijxi) + lnf(yijxi) + lnf(xi):Since only lnf(zijxi) contains 2 , we can ignore the last two terms andobtain 2(t+1) via maximizing E (t) PNi=1 lnf(zijxi) j over 2 .From (3.4), we rst getNXi=1lnf(zijxi) = N2 ln(det 2 R) 12 2 NXi=1(zi Axi)0R 1(zi Axi):34Chapter 3. Linear mixed models for measurement error in functional regressionFollowing (3.6), we haveCov(Wi;xi) = C x (3.15)which then leads to the conditional mean and covariance of xi given Wi asE[xijWi] xijWi = xC0 1W (Wi W ); (3.16)Cov[xijWi] xjW = x xC0 1W C x: (3.17)Let~ =NXi=1(zi ^ A (t)xijWi)0R 1(zi ^ A (t)xijWi):Routine calculations yieldE (t) NXi=1lnf(zijxi)j != N2 ln(det R) nN2 ln 2 12 2 h~+ Ntr(R 1A (t)xjW A0)i:Di erentiating this conditional mean with respect to 2 and equating thederivative to zero yields 2(t+1) = 1nN ~+ 1ntr[R 1A (t)xjW A0]: (3.18)We show the update 2(t+1) is positive in the following. The rst term in 2(t+1) is positive, by assumption (c) and the fact that R is positive de nite.The second term is nonnegative by the following argument.Using the famous matrix identity(V V0 + 0) 1 = 10 10 V 1 + V0 10 V 1V0 10provided the matrix orders properly de ned, we see that (t)xjW = (t)x 1 + C(t)0 (t)d 1C(t) 1which is positive de nite. Given (t)xjW > 0, assumption (a) then implies35Chapter 3. Linear mixed models for measurement error in functional regressionthat A (t)xjW A0 0. Together with the fact that R > 0, the second term in(3.18) is thus nonnegative.3.3.4 Updating (t) and 2(t)The updates of (t) and 2(t) maximize N over and 2, holding the otherparameters xed. Suppose that 2(t) > 0 and (t)x > 0. We nd uniquecritical points, ^ and ^2, and show that they increase the log likelihoodprovided ^2 > 0.Note that log f(yi;zi) = log f(yijzi) + log f(zi), that log f(zi) doesn'tdepend on or 2. We also note given zi, yi is normal with meanE(Yijzi) 0 + 0G(zi );and variance 2Y jz Var(Yijzi) = 0K + 2 (3.19)whereG = T xA0 1z (3.20)andK = T xT0 T xA0 1z A xT0:Therefore, to maximize N with respect to and 2, we maximize~ N = N2 ln( 0K + 2) 12( 0K + 2)NXi=1 Yi 0 0G(zi ) 2 :(3.21)With detailed derivation in Section B.5, equating @ ~N=@ and @~N=@ 2 tozero yields respectively1 0K + 2NXi=1 Yi 0 0G(zi ) G(zi ) = 0 (3.22)36Chapter 3. Linear mixed models for measurement error in functional regression1( 0K + 2)2" NXi=1 Yi 0 0G(zi ) 2 N( 0K + 2)#= 0:(3.23)Note that G is of full row rank because of the following two observations.First, T is of full row rank by assumption (b). Second, the matrix z = (t)x + 2(t) R is invertible since it is positive de nite.LetM = GNXi=1(zi )(zi )0G0:Then, by assumption (c), M is positive de nite.Solving (3.22) for and (3.23) for 2 gives (t+1) = ^ = M 1GNXi=1(zi )(Yi 0) 2(t+1) = ^2 = 1NNXi=1 Yi 0 0G(zi ) 2 0K : (3.24)Unfortunately, we are not guaranteed that ^2 is positive. However, inall of our data analyses and simulation studies, the nal estimate of 2 wasalways positive.Again, to check if the update increases ~N, we show that the Hessianmatrix is negative de nite. We notice that (3.24) implies^2Y jz ^0K^ + ^2 = 1NNXi=1 Yi 0 ^0G(zi ) 2; (3.25)which is positive by assumption (d). With detailed calculation in Sec-tion B.5, the Hessian matrix H~ ; 2) when evaluated at ^ and ^2 equalsH~ ^; ^2) = N(^ 2Y jz)20@ 2K^ ^ 0K + ^2Y jzN M K^ ^ 0K 121A: (3.26)It follows that H~ ^; ^2) < 0 by the following argument. Let x1 2 <J and37Chapter 3. Linear mixed models for measurement error in functional regressionx2 2 < with at least one of x1 or x2 non-zero. Direct calculation yields(x01;x2) H N(^; ^2) x1x2!= N(^ 2Y jz)2" x2p2 +p2x01K^ 2+ ^ 2Y jzN x01Mx1#< 0:3.4 Inference for Given the estimate ^ of , we estimate the function by ^ = ^0 . If thecovariance matrix of ^ is , then the covariance function of ^, denoted V ,isV (s;t) = Cov(^0 (s); ^0 (t)) = (s)0 (t):We base inference for on ^ and an estimate of .We estimate in two ways, using bootstrap by resampling the observedWi's with replacement or using the observed Hessian matrix H~N(^; ^2)de ned in (3.26). We take ^ as the K K upper corner of the inverse of H~N(^; ^2). In doing this, we are treating the other parameters as known,ignoring the variability introduced by estimating them. Thus, we expect thatwe may underestimate V (t;t), while we don't anticipate underestimationwith the bootstrap estimate. However, the Hessian-based estimate is veryfast to compute.3.4.1 Hypothesis testing for Testing that 0To determine if Yi depends on Zi( ) we testHo : (t) = 0; for all t 2 [a;b]:We consider three test statistics.38Chapter 3. Linear mixed models for measurement error in functional regressionThe rst test statistic is the generalized likelihood ratio statisticUl = sup = 0 N sup N:The unrestricted supremum of N is achieved at the ECME estimates de-scribed in Section 3.3. We can also use the ECME procedure to calculatethe rst supremum. To obtain the rst supremum, we observe under therestriction = 0, zi and Yi are independent, with zi N( ;A xA0 + 2 R)and Yi N( 0; 2). Thus ^ = z, ^0 = Y and ^2 = P(Yi Y )2=N. Wethen calculate (t)x and 2(t) by an ECME method treating these estimatesof , ^ and 2 as known. We update x by maximizing N directly whileholding 2 xed. We update 2 by nding its EM estimate 2(t+1) whileholding x xed. We iterate untill convergence occurs.The second statistic considered is Wald's test statistic using ^, the vectorof estimated basis coe cients:Uw = ^0 ^ 1 ^:It is interesting to note that this test statistic can be re-written in terms ofa vector of function evaluations of ^. To see this, let t i ;i = 1;::: ;n , be asequence of time points, let ~ be the vector containing the values of ^ atthe t i 's, and let ~ be ~'s covariance matrix. The Wald test statistic basedon ~ is ~0 ^ + ~, where ^+ is the Moore-Penrose inverse of an estimate of ~ . We now argue that, under mild conditions on the t i 's, ~0 ^+ ~ = Uw.De ne the n J matrix as ij = j(t i ) and suppose that is of fullcolumn rank. Since ~ = ^, ~ = 0 and thus it is natural to take^ ~ = ^ 0. Since ^ +~ = +0 ^ 1 + and + = I, ~ 0 ^ +~ ~ = Uw.The third statistic is the integrated t-statisticUf =Z ba^ 2(t)^V (t;t)dt:To calculate the null distribution of a test statistic, and thus calcu-late a p-value, we use a permutation type method, one that does not rely39Chapter 3. Linear mixed models for measurement error in functional regressionon distributional assumptions. Under the null hypothesis that (t) 0for all t 2 [a;b], Yi and zi are independent. Therefore the joint distribu-tion of (zi;Yi) is the same as that of (zi;Yi0 ) where i0 is chosen at randomfrom f1;::: ;Ng. To simulate the null distribution of a test statistic, wemake Q new data sets via permuting the Yi's: the qth data set is simplyfz1;Yi1 ;:::;zN;YiN g where i1;::: ;iN is a random permutation of 1;::: ;N.The p-value is then calculated as the proportion of the resulting Q statisticvalues larger than the original observed value.Testing equality of two 'sOften we want to know if the 's governing Y 's and Z's in two di erentgroups are equal. To denote group membership, we use the superscript \s"or \c" to indicate the \selection" or \control" group respectively. We havedata collected independently from the two groups and we want to testHo : s(t) = c(t); for all t 2 [a;b]:We consider four test statistics. We assume that the selection and controllog-likelihoods, sNs and cNc, each have the same expressions as in (3.21)but with possibly di erent parameter values, superscripted by s or c.The rst test statistic is from a likelihood ratio testUl = sup s = cf sNs + cNcg supf sNs + cNcg= sup s = cf sNs + cNcg sup sNs sup cNc:Each of the last two suprema is calculated separately, using the ECMEestimates from Section 3.3.We can also apply this ECME procedure to calculate the rst supremum.Under the restriction s = c, sNs and cNc have a common parameter s = c. Following the same argument as in Section 3.3, we have c(t) = zc; s(t) = zs; c(t)0 = Y c and s(t)0 = Y s:40Chapter 3. Linear mixed models for measurement error in functional regressionTo update each of c(t)x and s(t)x , we follow the steps outlined in Sec-tion 3.3.2. To update each of c2(t) and s2(t) , we follow the steps outlinedin Section 3.3.3.To update (t), s2(t) and c2(t), we proceed as in Section 3.3.4, but withsome modi cation for our updating of (t).To describe the procedure to update (t), s2(t) and c2(t), de ne ~sNsand ~cNc in a manner analagous to ~N in (3.21). By an argument similar tothat in Section 3.3.4, we must nd , s2 and c2 to maximize ~sNs + ~cNc.Di erentiating ~sNs + ~cNc with respect to s2 and c2 and setting equal tozero yields equations analagous to the equation for 2(t+1) in (3.24).Unfortunately, di erentiating ~sNs + ~cNc with respect to and settingequal to zero yields an intractable equation. So we modify our calculationof , but retain the above-described updates for s2 and c2. Instead ofmaximizing ~sNs + ~cNc with respect to we maximize ~~sNs + ~~cNc withrespect to , where the ~~ are de ned as follows. First consider the term 0Ks + s2. Calculate the matrix Ks using s(t+1)x and s2(t+1) . Take = (t) and s2 = s2(t). Then the resulting expression for 0Ks + s2,which we denote s2(t)Y jz , no longer contains the parameters and s2. Let~~ sNs = Ns2 ln( s2(t)Y jz ) 12 s2(t)Y jzX Y si s0 0Gs(zsi s) 2 :De ne c2(t)Y jz and ~~cNc similarly.Di erentiating ~~sNs + ~~cNc with respect to and setting equal to zeroyields the update (t+1) =0@ 1 s2(t)Y jzMs + 1 c2(t)Y jzMc1A 10@ 1 s2(t)Y jzGs X(zsi s)(Y si s0 ) + 1 c2(t)Y jzGc X(zci c)(Y ci c0)1A:41Chapter 3. Linear mixed models for measurement error in functional regressionwhereMs = Gs X(zsi s)(zsi s)0Gs0and Mc is de ned similarly.To de ne the two sample Wald's statistic Uw, let ^s be the estimate of s with estimated covariance matrix ^ s and de ne ^c and ^c similarly.The two sample statistic isUw = (^s ^c)0 ^ s + ^ c 1(^s ^c):We consider a two sample statistic based on function evaluations, ratherthan on basis coe cients, recalling the notation in Section 3.4.1. Let ~s = s^s be the vector of function evaluations of ^s at a sequence of time points.Let ^s = s ^s s0 be the estimate of ~s's covariance matrix. Similarlyde ne ~c and ^c . The two sample Wald test statistic based on ~s and ~cisUe = (~s ~c)0 ^ ~ s + ^ ~ c +(~s ~c):In Section 3.4.1, the one sample situation, we argued that we needn't use thefunction evaluation statistic Ue as it is equivalent to the Wald test statisticUw under mild conditions. In the two sample situation here, however, thetwo sample Uw and Ue may not agree unless s = c and both of them areof full column rank.To de ne the two sample integrated t-statistic Uf, let ^ s(s;t) = s(s)0 ^ s s(t)be the estimate of the covariance function of ^s and de ne ^ c(s;t) similarly.The two sample Uf isUf =Z [^ s(t) ^ c(t)]2^V s(t;t) + ^V c(t;t) dt:Again, we use the permutation method to calculate the null distributionof the test statistics and thus the p-values. Under the null hypothesis that s(t) = c(t) for all t 2 [a;b], the dependence of Yi on zi is identical inboth groups. We generate Q \data sets" from the original data set, datasets that follow the null hypothesis. To construct the qth \data set", we42Chapter 3. Linear mixed models for measurement error in functional regressionrandomly split the Ns + Nc individuals into two groups of size Ns and Ncand calculate the resulting test statistic. We use the empirical distributionof the obtained Q test statistic values to approximate the null distributionof our statistic. The p-value is then calculated as the proportion of the Qstatistic values larger than the original observed value.3.5 Model assumption checkingAfter tting the model, we need to check if the model assumptions are satis- ed. Our model diagnostics rely on the analysis of residuals. In this section,we derive expressions of the tted values for Wi and for the residuals. Fittedvalues and residuals for zi and Yi are then obtained as components. We canthen plot the residuals to check model assumptions and to look for outliersand in uential points.To simplify notation, unknown parameters below stand for their esti-mates. Using model (3.4) and (3.5), we base our tted values ^ i on theBLUP of the random e ects xi, xijWi in (3.16),^Wi = W + C xijWi= W + C xC0 1W (Wi W )Recall the expression of W in (3.9). We getC xC0 1W = C xC0 + d d 1W = I d 1Wand thus^Wi = I d 1W Wi + d 1W W :It then followsri = Wi ^ i = d 1W (Wi W ):The last element of ^ i gives the tted value of Yi and the last element ofri gives the Yi residual. As we focus on modelling the dependence of Yi on43Chapter 3. Linear mixed models for measurement error in functional regressionZi, plotting residuals against tted values of Yi is useful to detect outliersor in uential points of the model t.3.6 Model applicationIn this section, we analyze a biological data set using our model. First wegive a general description of how to choose the basis functions 1;::: ; K.Then we conduct the data analysis.3.6.1 Choosing the basis functionsWe choose the basis functions, the k's, as functions that give good approxi-mations to Zi( ): we choose them as the rst few estimated eigenfunctions ofthe Zi process. To do so, we apply a smoothed principle component analysisto the observed zi's, penalizing an approximation of the second derivativeof the eigenfunction. Let ^ be the sample covariance matrix of the zi's. We nd a sequence of orthogonal vectors ~j 2 Rn to maximizev0j ^ jv0jvj + v0jD0Dvj ;where is the smoothing parameter and Dvj calculates second divideddi erences of the vector vj. The (n 2) n matrix D depends on t1;::: ;tnand is de ned to di erentiate quadratic functions exactly. That is, if vj[i] =a+bti +ct2i , then (Dvj)[i] = 2c. Given , the vectors ~j are eigenvectors ofthe matrix G 1=2 ^ 1=2, where G = I+ D0D. The approach is similar toRamsay and Silverman's (2005) but we don't use basis function expansionsof the Zi.Choice of can be done by cross-validation, but for simplicity here, weselect by examining the smoothness of the resulting ~j's. In the dataanalysis, we chose = 100.44Chapter 3. Linear mixed models for measurement error in functional regression3.6.2 Data descriptionData were provided by Patrick Carter, School of Biological Sciences, Wash-ington State University, and are described in Morgan et al, 2003. Dataare from mice divided into four groups according to gender and the twotreatments \selection" and \control". The selection group mice were bredover 16 generations, with selection being on high wheel-running activity atage eight weeks. Control mice were bred at random. In the nal generation,body mass and wheel running activity were recorded for each mouse for sixtytwo consecutive weeks, indexed from 1 to 60, except for weeks 34, 38, 39,50. The research interest is to know how body mass and wheel running arerelated and if the relationship depends on the treatment.The wheel running distance data have many missing values and are verynoisy. In addition, the wheels were cleaned every four weeks, and so we seespikes in wheel-running activity every fourth week. So in our analysis, weaverage the nonmissing wheel running distances over weeks 5 to 60 and usethis average as the response Y . The predictor Z( ) is the log transformedbody mass. We want to know if any of the groups have non-zero and ifthere is any di erence between the selection s and the control c withineach gender.Plots of the observed zi's and histograms of the Yi's in each group arein Figures 3.1 and 3.2 respectively. We see that log body mass is roughlymonotone with a high rate of increase in weeks 1 to 4. The log body massin the males is more variable than that in the females.3.6.3 Choice of basis functionsA smoothed eigenanalysis in each of the four groups yielded a rst eigen-function that was close to constant, indicating that the biggest source ofvariability in log body mass in each group was overall size of the mouse.Since a constant eigenfunction is biologically meaningful, we forced our rstbasis function to be constant and, within each group, calculated the remain-ing functions via a smoothed eigenanalysis on the centered log body mass45Chapter 3. Linear mixed models for measurement error in functional regressionas follows. We let zi = 15860Xk= 1k6=34;38;39;50zikbe the ith mouse's average log body mass andzij zi j = 1;::: ;60;j 6= 34;38;39;50the ith mouse's centered log body masses. Within each group, we calcu-lated the sample covariance matrix of the centered log body mass vectors.We then applied the smoothed eigenanalysis to this covariance matrix. Fig-ure 3.3 shows the proportion of cumulative variance explained by the rstten principal components of this analysis in each group. The variance ex-plained by the constant function was 85% in the male selection group, 72%in the male control group, 70% in the female selection group and 63% in thefemale control group. Figure 3.4 shows the constant function and the rstthree eigenfunctions for each group. They are displayed one by one withthe four groups' functions together in one panel. There we see the threesmoothed eigenfunctions are very similar in the four groups.Figure 3.3 suggests that, if we choose the rst three smoothed eigen-functions in each group, we will capture about 90% of the variability of thelog body mass trajectories, beyond the variability captured by the constantfunction. We will use a constant function plus these three eigenfunctions inour analysis.3.6.4 Estimation and residual analysisBefore proceeding with inference, we study residuals of our ts to see if thereare any outliers.We t the model (3.4) and (3.5) within each group, using R = I, andchoose the same basis functions for as for the log body mass, calculated inSection 3.6.3. Within each group, we plot the Yi residuals against the ttedYi's to detect outliers as outlined in Section 3.5. Residual plots of eachgroup are in Figure 3.5. In the control males, we see an outlier at the left46Chapter 3. Linear mixed models for measurement error in functional regressionside. This outlier may be in uential in the estimation of . So we removethis outlier and re t the model to the male control group. The rst panel ofFigure 3.6 is the same as the plot in the second panel of Figure 3.5, showingthe outlier in the control males. The second panel of Figure 3.6 shows theresidual plot of the t after removing the outlier. There we see no newoutliers. We remove the one outlier in the control males in our subsequentanalyses.3.6.5 Inference for ( )In each of the four groups, we estimated and calculated standard errorsfrom both the Hessian matrix and the bootstrap, as described in Section 3.4.These are shown in Figure 3.7.To determine if (t) = 0 for all t 2 [ 1;60], we can study the plots inFigure 3.7 and we can conduct a formal hypothesis test. From the plots inFigure 3.7, we see that, except for the female control group, all groups showa region where the zero line is not within one standard error of ^. Thissuggests that, within these groups, perhaps there may be a dependence ofaveraged wheel running on log body mass.We conduct a hypothesis test of Ho : (t) = 0 for all t 2 [ 1;60] in eachgroup using the test statistics in Section 3.4.1. We compute the standarderrors using the Hessian matrix (3.26). The results are in Table 3.1 below.The last three columns contain the permutation p-values of Ul, Uw andUf, computed by using 500 permutations. The second column gives theobserved value of the test statistic Ul and the next column gives the p-valueof Ul based on the fact that negative twice the log-likehood ratio statistic isasymptotically chi-squared distributed with 4 degrees of freedom.From the Table, we see that, in selected and control males, averagewheel-running depends on the log body mass trajectories. However, thereis insu cient evidence to make this claim in the other three groups.We observe that the p-values of Ul and Uw are identical. We plot thepermuted values of Ul and Uw in Figure 3.8 and see that Ul is an increasingfunction of Uw. This may be due to the normality assumption in the model.47Chapter 3. Linear mixed models for measurement error in functional regression3.6.6 Inference for s cWithin each gender, we want to compare the selection ( ) with the control ( ). Figure 3.9 shows the pointwise di erence between the selection ^ andthe control ^ together with the standard errors computed from the Hessianmatrix and from the bootstrap. For both males and females, the regionwithin one standard error of ^s ^c contains much of the zero line, whichsuggests we can't distinguish between the selection and the control .Table 3.2 gives results of the test Ho : s(t) = c(t); for all t 2 [ 1;60]in each gender. The last three columns contain the permutation p-values ofUl, Uw and Uf, computed by using 500 permutations. The second columngives the observed value of the test statistic Ul and the next column givesthe p-value of Ul based on the fact that negative twice the log-likehood ratiostatistic is asymptotically chi-squared distributed with 4 degrees of freedom.Given the results, we can not reject Ho in either gender. That is, withineach gender, there is no evidence of a di erence between the selected groupand the control group in terms of average wheel running's dependence onlog body mass.In conclusion, we nd a strong dependence of average wheel running onthe log body mass in the selected males and control males. We don't haveenough evidence to distinguish the di erence between the selected groupand the control group in terms of average wheel running's dependence onlog body mass in either gender.3.7 Simulation studyIn the simulation study, we compare the pointwise mean squared errorsof our ECME estimate of with those of a modi ed version of the twostage estimate proposed by M uller (2005). We also compare the power ofthe test statistics proposed in Section 3.4.1 for one-sample and two-samplecomparisons.We will see that, in terms of mean squared error, in the one-sample case,both the ECME estimate and the two stage estimate su er from an edge48Chapter 3. Linear mixed models for measurement error in functional regressione ect. In Section 3.7.4, we look into this edge e ect further. The ECMEestimate typically has slightly smaller MSE than the two stage estimate,with the improvement in MSE being more noticeable as the dependence ofY on Z increases. In the two-sample case, the two methods are comparable.For testing, in the one-sample case, there is little di erence in powerbetween the two statistics considered. In the two sample case, the integratedt-statistic has more power to distinguish the di erence between s and c.In the one-sample comparison in Section 3.7.2, we simulate data usingparameter values estimated from the male selection group data analyzed inSection 3.6.2. In the two-sample comparison in Section 3.7.3, we simulateselection group data using estimates from the male selection group data andthe control group data using estimates from the male control group datawithout the outlier.3.7.1 Two stage estimateRecall from Section 3.1 that the calculation of M uller's (2005) estimate of had two parts. The rst part ignored the yi's and only used the zi's topredict the underlying xi's. The second part of the analysis was essentiallya linear regression of the yi's on the predicted xi's. We would like to studythis procedure's ability to estimate . We suspect that ignoring the yi's inthe rst part of the procedure will lead to poorer estimation of .To study this in a way that is comparable to our over-all methodology, weslightly modify the M uller's method. We use linear mixed e ects methodol-ogy to predict the xi's from the zi's. Speci cally, we rst use our smoothedprincipal components analysis of the zi's to determine basis functions, the k's, see (3.3). We then construct the matrix T and t model (3.4) using theEM algorithm. Laird (1982) gave an elaboration on the EM computationin linear mixed models. We use the resulting estimated covariances of thexi's and i's to calculate our predictors, the BLUPs of xi given zi, that is,to calculate E(xijzi). For the second part of the analysis, we nd the leastsquares estimate of according to the regression modelYi = 0 + E(x0ijzi)T0 + ei:49Chapter 3. Linear mixed models for measurement error in functional regressionIn our t of (3.4), we force x to be diagonal. This is in keeping withM uller, and stems from the fact that the components of the xi's are principalcomponent scores.The main di erence between our approach and M uller's is in the way weestimate the required variances and covariances for calculating the BLUPs.We use a linear mixed e ects model based on eigenfunctions. M uller useda smoothing method to directly estimate the components of the covariancestructure.3.7.2 One sample comparisonWe simulate data based on parameter and eigenfunction estimates from thedata analysis of the male selection group in Section 3.6.2. Let s; sx; s2 ; s2; 0s, s and s(t) = (t)0 s denote these estimates and let As be the matrix con-structed from the basis functions, evaluated at the same tj's as in the dataanalysis.We simulate the unperturbed predictor Zi according to a multivariatenormal N( s;As sxAs0) and let the observed zi be Zi plus N(0; s2 I) noise.We consider four possible functions to describe the relationship betweenYi and Zi:Yi = 0s +Z 60 1 (t) Zi(t) Zi(t) dt + ei:The integral is calculated using the R function \sintegral" and we simulateei from N(0; s2). We take = s with = 0;2=3;4=3 or 2. When = 0,Yi does not depend on Zi. When = 2, the dependence is large. Recallin our data analysis in Section 3.6.2, we found that s was signi cantlydi erent from zero. Thus, for each value of , we can generate an observeddata set (zi;Y i ); i = 1;::: ;39.We rst compare our estimate of with the two stage estimate. We run100 simulations and the MSE of the estimate ^ is calculated as100X1 ^ (t) (t))2 =10050Chapter 3. Linear mixed models for measurement error in functional regressionat each observation point t. The results are in Figure 3.10 where the rstpanel shows the pattern of the true 's, the next two panels show the point-wise MSE's of our two estimates of and the last panel shows the di erencesof these pointwise MSE's. In the plots, the MSE's increase as increasesand both estimates of have high MSE's for t's at the edges. The MSE'sof both methods are much worse at the left hand edge, probably due to thesharp decrease of the true and the sharp increase in log body mass beforeweek 10. The ECME method seems to be more a ected by this edge. Wewill give a further study of this edge e ect in Section 3.7.4. However, in thelast panel we see that overall, the ECME estimate has a smaller MSE, withthe superiority of ECME increasing as increases.Therefore, if there is a signi cant dependence of Yi on zi through , theECME method of estimate is preferred.To test Ho : (t) = 0; for all t 2 [ 1;60], we use the test statisticsUw and Uf with standard errors calculated using the Hessian matrix (3.26)as described in Section 3.4. For each data set, we run 300 permutations tocalculate the p-values. We simulate 100 data sets for each value of andchoose levels = 0:01 and = 0:05. Figure 3.11 and Figure 3.12 summarizethe proportion of times Ho was rejected using Uw and Uf but with di erentlevels: = 0:01 and = 0:05 respectively. As expected, the powers increaseas increases. There is little di erence in power between the two statistics.3.7.3 Two sample comparisonTo simulate data from two independent samples, we choose model param-eters using the male selection group data and the male control group dataanalysed in Section 3.6.2. We simulate data for the selection group and, sep-arately, for the control group using the same methodology as in Section 3.7.2but with di erent true 's.Let s and c be the estimates of from the original male selection groupand male control group data. Let = ( s + c)=2 and = ( s c)=2.In the simulation study we set the of the selection group to + andthat of the control group to with = (0;2=3;4=3;2). Thus, the51Chapter 3. Linear mixed models for measurement error in functional regressiondi erence between the selection and control is 2 = ( s c).To estimate s and c, we t the selection data and control data sepa-rately. For each value of we simulate 100 data sets and calculate the MSEas in Section 3.7.2 in each group. Figures 3.13 and 3.14 show the MSE's ofthe two estimates in each group respectively. The pattern of the two MSE'sare similar to their counterparts in Figure 3.10.We calculate the MSE of the estimate s c as100X1 ^ s(t) ^c(t) s(t) + c(t))2 =100at each observation point t. Figure 3.15 shows the results. The patterns ofthe two MSE's are similar to their counterparts in Figure 3.10. The MSE'sof the two estimates are comparable but in general the MSE of the ECMEestimate is smaller.To test Ho : s(t) = c(t); for all t 2 [ 1;60], we compare the fourtest statistics, Ul, Uw, Ue and Uf, with standard errors calculated usingthe Hessian matrix (3.26). For each data set, we run 300 permutations tocalculate the p-values. We simulate 100 data sets for each value of andchoose levels = 0:01 and = 0:05. Figures 3.16 and 3.17 show the powerof the four test statistics with levels 0:01 and 0:05 respectively. The statisticUf is the most powerful, especially when is large, but Uf and Uw arecomparable.3.7.4 Edge e ect discussion in one-sample MSE comparisonIn Figure 3.10, we see that the MSE's of the ECME estimate of are muchworse at the left hand edge than the MSE's of the two stage estimate. Wesuspect this is probably due to the sharp decrease of the true before week10 as shown at the rst panel of Figure 3.10, and the large increase in logbody mass in that same period. In this section, we exclude the early weeks'data from our analysis.To determine the various parameter values to use in a new simulation,we re-analyze the male selection group data but with zi containing log body52Chapter 3. Linear mixed models for measurement error in functional regressionmass values a week 5 rather than at week 1. After obtaining the newparameter estimates, we simulate data in the same way as in Section 3.7.2.The simulation analysis result is in Figure 3.18. The edge e ect still existsbut this time the MSE's of ECME and the two stage method are comparableat the edges.53Chapter 3. Linear mixed models for measurement error in functional regressionGroup Observed Ul Asymptotic p-Ul p-Ul p-Uw p-UfMale selected 13:723 0:008 0:008 0:008 0:008Male control 10:940 0:027 0:044 0:044 0:034Female selected 9:095 0:059 0:082 0:082 0:264Female control 3:280 0:512 0:568 0:568 0:656Table 3.1: P-values of the test Ho : (t) = 0; for all t 2 [ 1;60] in eachgroup.Gender Observed Ul Asymptotic p-Ul p-Ul p-Uw p-Ue p-UfMale 11:919 0:018 0:364 0:356 0:532 0:110Female 5:852 0:210 0:564 0:568 0:884 0:444Table 3.2: P-values of the test Ho : s(t) = c(t); for all t 2 [ 1;60], withineach gender.54Chapter 3. Linear mixed models for measurement error in functional regression0 10 20 30 40 50 602.02.53.03.5Male SelectedWeekLog body Mass0 10 20 30 40 50 602.02.53.03.5Male ControlWeekLog body Mass0 10 20 30 40 50 602.02.53.03.5Female SelectedWeekLog body Mass0 10 20 30 40 50 602.02.53.03.5Female ControlWeekLog body MassFigure 3.1: Plots of log body mass versus week for the four groups of labo-ratory mice.55Chapter 3. Linear mixed models for measurement error in functional regressionMale SelectedWheel Running (km/week)Density20 40 60 80 1200.0000.0100.0200.030Male ControlWheel Running (km/week)Density20 40 60 80 1200.0000.0100.0200.030Female SelectedWheel Running (km/week)Density20 40 60 80 1200.0000.0100.0200.030Female ControlWheel Running (km/week)Density20 40 60 80 1200.0000.0100.0200.030Figure 3.2: Histogram of the response: averaged wheel running from weeks5 to 60 for the four groups of laboratory mice.56Chapter 3. Linear mixed models for measurement error in functional regression2 4 6 8 10Male SelectedEigenvalue IndexVariance Explained0.50.60.80.91.02 4 6 8 10Male ControlEigenvalue IndexVariance Explained0.50.60.80.91.02 4 6 8 10Female SelectedEigenvalue IndexVariance Explained0.50.60.80.91.02 4 6 8 10Female ControlEigenvalue IndexVariance Explained0.50.60.80.91.0Figure 3.3: Plots of the proportion of cumulative variance of the centered logbody mass explained by the rst ten principal components of the four groupsof mice, after the individual average log body mass has been removed.57Chapter 3. Linear mixed models for measurement error in functional regression0 10 20 30 40 50 600.60.81.01.21.4WeekPC Function 10 10 20 30 40 50 60-0.5-0.3-0.10.1WeekPC Function 20 10 20 30 40 50 60-0.20.00.10.2WeekPC Function 30 10 20 30 40 50 60-0.20.00.10.20.3WeekPC Function 4Figure 3.4: The constant function and the rst three smoothed eigenfunc-tions of the covariance of centered log body mass of the four groups of mice,calculated as described in Section 3.6.3.58Chapter 3. Linear mixed models for measurement error in functional regression40 50 60 70-30-10102030Male SelectedFittedObserved - Fitted30 35 40 45 50 55-30-10102030Male ControlFittedObserved - Fitted75 80 85 90 95-30-10102030Female SelectedFittedObserved - Fitted45 50 55 60-30-10102030Female ControlFittedObserved - FittedFigure 3.5: Residual plots of the t of the Yi = averaged wheel running ofthe four groups of mice.59Chapter 3. Linear mixed models for measurement error in functional regression30 35 40 45 50 55-2001030The outlierFittedObserved - Fitted30 35 40 45 50 55 60-2001030No outlierFittedObserved - FittedFigure 3.6: Plots of residuals of Yi = averaged wheel runining in the malecontrol group of mice before and after removing the outlier.60Chapter 3. Linear mixed models for measurement error in functional regression0 10 20 30 40 50 60-60-2002040Male SelectedWeekbeta^(t)0 10 20 30 40 50 60-60-2002040Male ControlWeekbeta^(t)0 10 20 30 40 50 60-60-2002040Female SelectedWeekbeta^(t)0 10 20 30 40 50 60-60-2002040Female ControlWeekbeta^(t)Figure 3.7: Plots of ^ and its standard errors computed from the Hessianmatrix (solid) and from the bootstrap (dash-dot) for the four groups of mice,as described in Section 3.4.61Chapter 3. Linear mixed models for measurement error in functional regression0 5 10 15 20 2505101520Male Selectedpermuted U_wpermuted log-likelyhood ratio0 5 10 15 20051015Male Controlpermuted U_wpermuted log-likelyhood ratio0 5 10 15 20 2505101520Female Selectedpermuted U_wpermuted log-likelyhood ratio0 5 10 15 20051015Female Controlpermuted U_wpermuted log-likelyhood ratioFigure 3.8: Comparing the permuted values of the generalized likelihoodratio statistic Ul with the Wald statistic Uw for four groups of mice. Theequivalence of Ul and Uw were observed in Table 3.1.62Chapter 3. Linear mixed models for measurement error in functional regression0 10 20 30 40 50 60-4002060MaleWeekbetas(t)-betac(t)0 10 20 30 40 50 60-4002060FemaleWeekbetas(t)-betac(t)Figure 3.9: Plots of ^s ^c and the standard errors of the di erence com-puted from the Hessian matrix (solid) and from the bootstrap (dash-dot)for both genders.63Chapter 3. Linear mixed models for measurement error in functional regression0 10 20 30 40 50 60-200204060True betaWeekbetagamma = 0gamma = 2/3gamma = 4/3gamma = 20 10 20 30 40 50 60050100200MSE of the ECME estimateWeekMSEgamma = 0gamma = 2/3gamma = 4/3gamma = 20 10 20 30 40 50 60050100200MSE of the two stage estimateWeekMSEgamma = 0gamma = 2/3gamma = 4/3gamma = 20 10 20 30 40 50 60-20-100510Difference between the MSE?sWeekTwo stage - ECMEgamma = 0gamma = 2/3gamma = 4/3gamma = 2Figure 3.10: MSE of the estimate of for each value in one samplesimulation as described in Section 3.7.2. Compare the ECME method withthe two stage method.64Chapter 3. Linear mixed models for measurement error in functional regression0.00.20.40.60.81.0alpha = 0.01gammaProportion of times rejecting Ho: beta = 00.000 0.667 1.333 2.00000UwUfFigure 3.11: Proportion of times Ho is rejected using level = 0:01, whereHo : (t) = 0; for all t 2 [ 1;60] in one sample simulation as described inSection 3.7.2. Two test statistics are considered, Uw and Uf.65Chapter 3. Linear mixed models for measurement error in functional regression0.00.20.40.60.81.0alpha = 0.05gammaProportion of times rejecting Ho: beta = 00.000 0.667 1.333 2.0000.040.02UwUfFigure 3.12: Proportion of times Ho is rejected using level = 0:05, whereHo : (t) = 0; for all t 2 [ 1;60] in one sample simulation as described inSection 3.7.2. Two test statistics are considered, Uw and Uf.66Chapter 3. Linear mixed models for measurement error in functional regression0 10 20 30 40 50 60-2001030True betasWeekbetasgamma = 0gamma = 2/3gamma = 4/3gamma = 20 10 20 30 40 50 6004080120MSE of the ECME estimateWeekMSEgamma = 0gamma = 2/3gamma = 4/3gamma = 20 10 20 30 40 50 6004080120MSE of the two stage estimateWeekMSEgamma = 0gamma = 2/3gamma = 4/3gamma = 20 10 20 30 40 50 60-4-20246Difference between the MSE?sWeekTwo stage - ECMEgamma = 0gamma = 2/3gamma = 4/3gamma = 2Figure 3.13: MSE of the estimate of s for each value in two samplesimulation as described in Section 3.7.3. Compare the ECME method withthe two stage method.67Chapter 3. Linear mixed models for measurement error in functional regression0 10 20 30 40 50 60-40-20020True betacWeekbetacgamma = 0gamma = 2/3gamma = 4/3gamma = 20 10 20 30 40 50 6050150250MSE of the ECME estimateWeekMSEgamma = 0gamma = 2/3gamma = 4/3gamma = 20 10 20 30 40 50 6050150250MSE of the two stage estimateWeekMSEgamma = 0gamma = 2/3gamma = 4/3gamma = 20 10 20 30 40 50 60-50510Difference between the MSE?sWeekTwo stage - ECMEgamma = 0gamma = 2/3gamma = 4/3gamma = 2Figure 3.14: MSE of the estimate of c for each value in two samplesimulation as described in Section 3.7.3. Compare the ECME method withthe two stage method.68Chapter 3. Linear mixed models for measurement error in functional regression0 10 20 30 40 50 60-2002040True betas - betacWeekbetas-betacgamma = 0gamma = 2/3gamma = 4/3gamma = 20 10 20 30 40 50 60100300500MSE of the ECME estimateWeekMSEgamma = 0gamma = 2/3gamma = 4/3gamma = 20 10 20 30 40 50 60100300500MSE of the two stage estimateWeekMSEgamma = 0gamma = 2/3gamma = 4/3gamma = 20 10 20 30 40 50 60051015Difference between the MSE?sWeekTwo stage - ECMEgamma = 0gamma = 2/3gamma = 4/3gamma = 2Figure 3.15: MSE of the estimate of s c for each value in two samplesimulation as described in Section 3.7.3. Compare the ECME method withthe two stage method.69Chapter 3. Linear mixed models for measurement error in functional regression0.00.20.40.60.8alpha = 0.01gammaProportion of times rejecting Ho: betas = betac0.000 0.667 1.333 2.00000.02UlUwUeUfFigure 3.16: Proportion of times Ho is rejected using level = 0:01, whereHo : s = c; for all t 2 [ 1;60] in two sample simulation as described inSection 3.7.3. Four test statistics are considered Ul, Uw, Ue and Uf.70Chapter 3. Linear mixed models for measurement error in functional regression0.00.20.40.60.8alpha = 0.05gammaProportion of times rejecting Ho: betas = betac0.000 0.667 1.333 2.0000.020.05UlUwUeUfFigure 3.17: Proportion of times Ho is rejected using level = 0:05, whereHo : s = c; for all t 2 [ 1;60] in two sample simulation as described inSection 3.7.3. Four test statistics are considered Ul, Uw, Ue and Uf.71Chapter 3. Linear mixed models for measurement error in functional regression10 20 30 40 50 60-40-20020True betaWeekbetagamma = 0gamma = 2/3gamma = 4/3gamma = 210 20 30 40 50 600100200300400MSE of the ECME estimateWeekMSEgamma = 0gamma = 2/3gamma = 4/3gamma = 210 20 30 40 50 600100200300400MSE of the two stage estimateWeekMSEgamma = 0gamma = 2/3gamma = 4/3gamma = 210 20 30 40 50 6005152535Difference between the MSE?sWeekTwo stage - ECMEgamma = 0gamma = 2/3gamma = 4/3gamma = 2Figure 3.18: MSE of the estimate of for each value using truncated logbody mass as described in Section 3.7.4. Compare the ECME method withthe two stage method.72Bibliography[1] James, G. (2002) Generalized linear models with functional predictors.Journal of the Royal Statistical Society B 64, 3, 411-432.[2] James, G. M. and Silverman, B. W. (2005) Functional adaptive modelestimation. Journal of the American Statistical Association 100, 470,565-576.[3] Laird, N. M. (1982) Computation of variance components using theE-M Algorithm. Journal of Statistical Computation and Simulation 14,295-303.[4] Liu, C. H. and Rubin, D. B. (1994) The ECME Algorithm - a simple ex-tension of EM and ECM with faster monotone convergence. Biometrika81 4, 633-648.[5] Magnus, J. R. and Neudecker, H. (1988) Matrix Di erential Calculuswith Applications in Statistics and Econometrics. John Wiley and Sons.[6] Morgan, T.J., T. Garland, Jr., and Carter, P. A. (2003) Ontogenetictrajectories in mice selected for high wheel- running activity. I. Meanontogenetic trajectories. Evolution 57, 646-657.[7] M uller, H.G. (2005) Functional modeling and classi cation of longitu-dinal data. Scandinavian Journal of Statistics 32, 223-240.[8] Ramsay, J. O. and Silverman, B. W. (2005) Functional Data Analysis.2nd edition, Springer Series in Statistics.[9] Wang, W. (2007) Identi ability in linear mixed e ects models.Manuscript, Department of Statistics, University of British Columbia.73Bibliography[10] Yao, F., M uller, H.G., Wang, J.L. (2005) Functional data analysis forsparse longitudinal data. Journal of American Statistical Association100, 577-590.74Chapter 4Dependence of averagelifetime wheel-running onbody mass ontogeny4.1 IntroductionIn natural populations of animals, strong correlations between large bodysize and increased Darwinian tness are frequently measured: large bodysize is related to increased fecundity, reduced predation and, in general, anincreased ability to survive in a given environment (Sogard 1997, Hone andBenton 2005). Variation in body mass is also associated with variation ina number of physiological traits, including metabolic rate and heart mass,and much e ort has been expended understanding allometric relationshipsin animals (Garland and Carter, 1994, and references therein). Body masshas even been shown to impact mutation rates in nuclear and mitochondrialDNA (Gillooly et al., 2005; Estabrook et al., 2007). Body mass is alsoassociated with locomotor performance; however, these relationships arecomplex. Body mass is positively correlated with home range area anddaily movement distances in wild populations of mammals (Garland 1983),but negatively correlated with laboratory measures of voluntary activity(Dewsbery 1980).Replicate lines of mice which have been selected for high levels of vol-untary wheel running activity are a powerful model system with which to0A version of this chapter will be submitted for publication. Authors: Carter, P. A.,Heckman, N. and Wang, W.75Chapter 4. Dependence of average lifetime wheel-running on body mass ontogenyinvestigate the relationships between body mass and voluntary locomotoractivity (Garland 2003). After 10 generations of selection, mice selectedfor high voluntary wheel running activity at 8 weeks of age ran signi cantlymore than their unselected controls; however, body mass at the age at whichwheel running was measured and selected was not a signi cant predictor ofthe amount of wheel running nor was it signi cantly di erent between se-lected and control mice (Swallow et al. 1998). A subsequent study, 14generations after selection, examined wheel running and body mass in thissystem of mice after 8 weeks of access to running wheels (Swallow et al.,1999). Three major results emerged from this study. First, body mass andwheel running share a negative genetic correlation of :50; selected micewere signi cantly lighter at maturity than control mice. Second, this resultis not mediated solely through wheel running activity; the di erence in bodymass persisted even in mice without wheel access, and in mice with wheel ac-cess even when total wheel running was statistically removed. Third, accessto running wheels did result in a decrease in body mass, but this decreasewas only about a third of the decrease caused by selection.A detailed study on body mass and wheel running ontogenies in the 16thgeneration of this system of mice was conducted by Morgan et al (2003).This experiment utilized 320 mice, half from the selected lines and half fromthe control lines; half of each of these groups were given lifetime access torunning wheels while the other half did not have wheel access. Morgan et al(2003) compared the ontogenetic trajectories of the di erent experimentalgroups using repeated measures analysis and identi ed several interestingresults. First, both the position and shape of the wheel running trajectoryhad responded to selection on wheel running at 8 weeks of age; selected miceran more across the trajectory and showed a steeper decline in activity atolder ages than control mice. This correlated response of the wheel run-ning ontogeny to early-age selection on wheel running indicates a positivegenetic covariance between wheel running at younger ages and wheel run-ning at older ages, with the covariance declining as age increases. Second,the position but not the shape of the body mass trajectory had respondedto selection, regardless of access to running wheels; selected mice with and76Chapter 4. Dependence of average lifetime wheel-running on body mass ontogenywithout wheel access were signi cantly lighter than their counterpart con-trol mice across all ages. This result indicates a negative response of thebody mass ontogeny to selection on early-age wheel running and hence anegative genetic covariance between wheel running at 8 weeks of age andbody mass across ontogeny; in addition, it indicates strong positive geneticcovariances among body masses at di erent ages. Finally, the analysis ofthe wheel running trajectory revealed that age-speci c body mass was asigni cant covariate for age-speci c wheel running; smaller individuals ranmore at each age than did larger individuals.The fact that Swallow et al. (1999) measured a negative genetic covari-ance between age-speci c wheel running and body mass and that Morganet al. (2003) showed that body mass is a signi cant covariate for wheel run-ning at each age, with smaller mice running more than larger mice, raisesthe question of whether wheel running through ontogeny explicitly dependson the body mass trajectory. Given that body mass at each age was a sig-ni cant covariate for wheel running at that age, we hypothesized that wheelrunning is signi cantly dependent on the body mass trajectory. In addi-tion, given the age-speci c negative genetic covariance between body massand wheel running, we can ask whether this dependence has evolved in re-sponse to early-age selection on wheel running. We hypothesize that thisdependence has evolved such that we will measure quantitative di erencesbetween the selected and control mice in the dependence of wheel runningon the body mass trajectory.To test these hypotheses, we will generally take a function-valued ap-proach. In essence, we regress average wheel running on the log body masstrajectory. We then test if the \regression coe cients" are zero within eachgroup and if they are equal between two groups. To model the log bodymass trajectory, we use basis functions with expansion coe cients modeledas random e ects.77Chapter 4. Dependence of average lifetime wheel-running on body mass ontogeny4.2 MethodsExperiments were performed on house mice (Mus domesticus) from the 16thgeneration of a selective-breeding experiment for increased voluntary wheel-running exercise (Swallow et al. 1998; Garland 2003). The selection experi-ment comprises four selection and four control lines; each line is propagatedwith 10 breeding pairs that contribute one male and one female to the nextgeneration, with the condition that siblings are not mated. Breeders arechosen either randomly with respect to wheel running (control lines) or se-lectively, within family, as the mice that run the most revolutions on days ve and six of six days of wheel exposure administered at eight weeks ofage (selection lines). The individuals in this report are the o spring of thegeneration 15 breeders for generation 16 of the selection experiment. Fivebreeding pairs from each of the eight lines were remated to produce secondlitters at Washington State University. Details of this design are presentedin Morgan et al (2003). Brie y, pups were weaned at 21 days of age andplaced in treatment groups at 28 + = 3 days of age in the animal carefacility at Washington State University. Four males and four females fromeach family were used, with half designated for the active treatment andhalf designated for the sedentary treatment. Each activity group thus con-tained two females and two males from each of the ve families within eachof the eight lines, for a total of 160 individuals per activity group, i.e., 320individuals in the aging colony. Mice in the active treatment group wereplaced individually in cages with a 11:75 cm radius running wheel and elec-tronic wheel-revolution counter built into the cage-top. The mouse thus hadthe option of voluntarily getting into the wheel and running, or of remain-ing in the cage and not running. On the same day, sedentary mice wereplaced individually in standard rodent cages. Photoperiod was 12 : 12 h,and water was available ad libitum. Mice were provided excess food weekly(Harland Teklad rodent diet W8604) and apparent food consumption wasdetermined by weighing food hoppers. This measure does not account forpossible variation in food wasting, as when mice shred food pellets and allowfragments to drop in the litter. A study of food wasting at generation 1078Chapter 4. Dependence of average lifetime wheel-running on body mass ontogenyfound no signi cant di erences between selection and control lines, but sig-ni cant variation among the replicate lines within selection group (Kotejaet al. 2003). Additionally, once a week each animal was weighed and itsweekly wheel revolutions downloaded from the counter device. Cages werecleaned weekly and running wheels were cleaned monthly. Extra sibs fromall 40 families were placed in similar housing and were used as sentinels tomonitor the colony monthly for the presence of speci c pathogen exposure.At 24 months post start date, a sentinel tested positive for MHV exposure,presumably from a barrier breakdown. No other sentinels tested positive.No treatment was initiated for the virus. Subsequent necropsied mice weremonitored speci cally for the presence of diseased liver tissue. None wereobserved to have contracted hepatitis (necropsies performed by LAR veteri-narians at WSU).Before carrying out any analysis, we eliminate line e ects in log bodymass and wheel-running by subtracting out line means within each group(selection and control) and gender. Suppose individual i is in line k ofthe selection group and is female. We replace individual i's log body masswith original log body mass minus line k's average plus the selection groupaverage. The line k average and the selection group average only involvefemales. Similarly, we do this for the wheel-running.Let Zi( ) be the ith mouse's log body mass trajectory. We study howZi( ) a ects Yi, the averaged wheel running response over weeks 5 to 60. Weuse average wheel-running since the wheel running distance data have manymissing values and are very noisy. In addition, since the running wheelswere cleaned every four weeks, we see spikes in wheel-running activity everyfourth week.In the following, we describe our model for the dependence of wheel-running on log body mass. This dependence is parameterized through anunknown function . We estimate and provide standard errors. We alsotest the null hypothesis that = 0 (no dependence of wheel-running on logbody mass) and test that two groups have the same 's. We use permutationtype methods to calculate the p-values. For more details of the statisticalmethodology, see Heckman and Wang (2007).79Chapter 4. Dependence of average lifetime wheel-running on body mass ontogenyWe model Zi using random e ects with a set of basis functions 1;:::; KasZi(t) = (t) +KXk=1xik k(t);where is a smooth unknown function. The random e ects xi (xi1;::: ;xiK)0are assumed normally distributed as N(0; x). The log body mass trajec-tory Zi is observed with error at a sequence of weeks, with the observedvalue of Zi at week j equal to zij Zi(j) + ij. The ij's are independentand normally distributed with mean 0 and variance 2 : Let zi contain theobserved values of the zij's. We can then writezi = + Axi + i; = ( ( 1);::: ; (60))0 ; Ajk = k(j); i = ( i1;:::; in)0:The dependence of Yi on Zi( ) is re ected through the unknown function ( ) asYi = 0 +Z 60 1 (t) Zi(t) E(Zi(t)) dt + ei;where ei is an error term distributed as N(0; 2). The functional parameter is of primary interest and can be viewed as a generalization of the con-ventional multiple regression coe cient vector. Choosing a basis for (forease, the same as for the Zi( )), we approximate as (t) =KXk=1 k k(t):Together with our modelling of Zi( ), we can write the model of Yi asYi = 0 + 0Txi + ei; = ( 1;::: ; K)0; Tjk =Z 60 1 j(t) k(t) dt:The goal is to use information of the observed data, the zi's and Yi's, toestimate the coe cient vector .80Chapter 4. Dependence of average lifetime wheel-running on body mass ontogenyBefore estimating along with the other unknown parameters in themodel, we must rst estimate the basis functions, the k's. We apply asmoothed eigenanalysis to the zij's, penalizing an approximation of the sec-ond derivative of the eigenfunction. We choose the amount of smoothing byeye. A detailed description of choosing the eigenfunctions is provided in Sec-tion 4.3. Treating the estimated k's as xed and known, we then estimatethe model parameters via maximum likelihood. We use an iterative numer-ical method, an implementation of the ECME (Expectation/ConditionalMaximization Either) algorithm of Liu and Rubin (1994). Obtaining theestimate ^, our estimate of (t) is ^(t) = PKk=1 ^k k(t). We estimate thecovariance matrix of ^ by both a Hessian matrix based method and a boot-strapped method. Using the estimated covariance matrix, we can estimatethe variance of ^(t), and thus make inference for .To determine if Yi depends on Zi( ), we testHo : (t) = 0; for all t 2 [ 1;60]via an integrated t-statisticUf =Z 60 1^ 2(t)d ^(t)) dt:To calculate the null distribution of Uf, and thus calculate a p-value, weuse a permutation type method, one that does not rely on distributionalassumptions. Under the null hypothesis that (t) 0 for all t 2 [ 1;60],Yi and zi are independent. Therefore the joint distribution of (zi;Yi) is thesame as that of (zi;Yi0 ) where i0 is chosen at random from f1;::: ;Ng andN is the number of individuals. To simulate the null distribution of Uf,we make Q = 500 new data sets via permuting the Yi's: the qth data setis simply fz1;Yi1 ;:::;zN;YiN g where i1;:::;iN is a random permutation of1;::: ;N. The p-value is then calculated as the proportion of the resultingQ statistic values larger than the original observed value.Consider the two-sample problem where there are observations indepen-dently from a selection group and from a control group. To see if the 's81Chapter 4. Dependence of average lifetime wheel-running on body mass ontogenygoverning the Y 's and Z's in the selection group and the control are equal,we testHo : s(t) = c(t); for all t 2 [ 1;60];where the superscript \s" or \c" indicates the group membership, \selection"or \control" respectively. To test Ho, we use the two sample test statisticUf =Z [^ s(t) ^ c(t)]2d ^s(t)) + d ^c(t)) dt:Again, we use a permutation method to calculate the null distribution ofthe two sample Uf and thus the p-values. Under the null hypothesis that s(t) = c(t) for all t 2 [ 1;60], the dependence of Yi on zi is identical inboth groups. We generate Q = 500 \data sets" from the original data set,data sets that follow the null hypothesis. To construct the qth \data set",we randomly split the Ns + Nc individuals into two groups of size Ns andNc and calculate the resulting Uf. We use the empirical distribution of theobtained Q test statistic values to approximate the null distribution of ourstatistic. The p-value is then calculated as the proportion of the Q statisticvalues larger than the original observed value.The use of function-valued traits in evolutionary biology was rst pro-posed by Kirkpatrick and Heckman (1989). The methodology has foundapplications in animal breeding, via basis function expansions with randomcoe cients. See Meyer (2005) and references therein.4.3 ResultsThe body mass trajectories for each experimental group are presented inFigure 4.1. For all four groups, body mass is a roughly monotone increasingfunction of age, with the highest rate of increase from ages 1 to 4 weeks.The body mass trajectories of males, both selected and control, are morevariable than those of the females. Histograms of average wheel runningfrom ages 5 to 60 are presented in Figure 4.2; within each sex, selected miceran more than control mice and females ran more than males (Morgan et82Chapter 4. Dependence of average lifetime wheel-running on body mass ontogenyal., 2003). In order to determine a few functions which adequately describethe body mass trajectories, a smoothed eigenanalysis of the trajectorieswas conducted. This smoothed eigenanalysis yielded a rst eigenfunctionthat was close constant, indicating that the biggest source of variability inbody mass was overall size of the mouse. Since a constant eigenfunction isbiologically meaningful, the rst basis function was forced to be constantand the remaining functions calculated via a smoothed eigenanalysis on thetrajectories, centered as follows. Let zi = 15860Xk= 1k6=34;38;39;50zikbe the ith mouses average body mass. We apply a smoothed eigenanalysisto the sample covariance matrix of the centered body masszij zi j = 1;:::;60;j 6= 34;38;39;50:Using the constant as the rst eigenfunction, we found that the corre-sponding rst principal component, overall size, accounts for 85% of thevariability in the male selection group, 72% in the male control group, 70%in the female selection group and 63% in the female control group. Fig-ure 4.3 shows the proportion of cumulative variance explained by the nextten principal components in each group after removing the variability ofthe rst principal component. The rst three smoothed eigenfunctions ineach group capture about 90% of the remaining variability of the body masstrajectories so these three eigenfunctions plus the constant function will beused in the subsequent analyses. Figure 4.4 shows each of these four leadingeigenfunctions; the functions of the four experimental groups are presentedtogether in each panel. Clearly the smoothed eigenfunctions are very similaramong the four experimental groups.We model the mean of the averaged wheel running asE(Y ) = 0 +Z (t)[Z(t) EZ(t)]dt;83Chapter 4. Dependence of average lifetime wheel-running on body mass ontogenywhere Z(t) is the logarithm of body mass at age t. Since the observed bodymass curves are somewhat rougher than one would expect, we model theobserved log body mass at age t as Z(t) plus error. To estimate ( ), weapproximate both ( ) and Z( ) using basis function expansions. The basisfunctions used are the estimated eigenfunctions. To model within individualvariability in body mass, we use a random e ects model, that is, we assumethat the coe cients in each individuals Z( ) expansion are random. We thenestimate all parameters by maximum likelihood.The ^ is estimated for each group from the smoothed eigenfunctions.These ^'s are presented in Figure 4.5. Both the male selection and controlgroups show regions of ^ that are non-overlapping with the 0 line and for-mal testing shows that ^ is signi cantly di erent from 0 for both selectedand control males (Figure 4.5; Table 4.1; p = 0:008 and 0:034, respectively).Hence in male selection and control mice average wheel running signi cantlydepends on body mass trajectory. Interestingly, ^ for the selection males ispositive at very young ages, then negative just prior to 10 weeks, positiveagain at middle age and then negative or not di erent from 0 at older ages.For females, ^ overlaps 0 and the results of the formal testing are not signif-icant for either selection or control groups (Figure 4.5; p = 0:264 and 0:656,respectively). Hence for all females, there is no evidence that average wheelrunning depends on the body mass trajectory.Figure 4.6 shows both the pointwise estimates and the model estimatesof the covariance between average wheel running and body mass trajectoryfor all four experimental groups. The general shapes of both estimates aresimilar to each other and they are similar to the shapes of the curves seenfor the ^ estimates in Figure 4.5. For instance, in each group except for thefemale control, at the rst few ages the covariance between average wheel-running and log body mass is large. Correspondingly, at these ages each^ is also large, where we believe that average wheel running is positivelydependent on log body mass.The ^ for selection and control mice were compared within each genderto test for the evolution of the relationship of body mass trajectory andaverage wheel running (Figure 4.7). No signi cant di erences were detected84Chapter 4. Dependence of average lifetime wheel-running on body mass ontogenybetween selected and control mice in either gender (Table 4.2); no evolutionof the dependence of wheel running on body mass trajectory in response toselection on age-speci c wheel running can be detected.4.4 DiscussionAs hypothesized, both male selected and control mice showed a signi cantdependence of wheel running on the body mass ontogeny. This result gen-erally ts those of Swallow et al. (1999), who measured a negative geneticcorrelation of 0:5 between wheel running and body mass, and Morgan etal. (2003) who measured a negative correlated response of the body mass on-togeny to selection on early-age wheel running. However, the analyses usedherein have revealed complexity that was not identi ed in the two previousstudies: both previous studies showed clear negative relationships betweenbody mass and wheel running, whereas we have shown that the relationshipis sometimes negative and sometimes positive, depending on age. As can beseen in Figures 4.5 and 4.6, body mass has a positive e ect on wheel runningat young and some middle ages, with a negative e ect at some younger agesand at older ages. Interestingly Swallow et al. (1999) conducted their studyat the younger ages (~ weeks 8 to 15) when the relationship is negative, andalthough Morgan et al. (2003) clearly showed a negative genetic covariancebetween the body mass ontogeny and early-age wheel running, that does notpreclude the phenotypic covariance from being at times positive. Hence ourresults do not disagree with those of the earlier studies but rather elucidatecomplexity that the earlier studies were not able to identify.A signi cant dependence of wheel running on the body mass trajectorywas not identi ed in females. Di erent results in males and females is notsurprising in the context of this model system of mice, in which males andfemales have di ered in traits as diverse as wheel running itself (Swallow etal., 1998) and longevity (Bronikowski et al., 2006). On the other hand, thelack of a measurable di erence in females may be caused by a lack of power.Given that Morgan et al. (2003) showed a negative correlated responseof the wheel running trajectory to selection on early age body mass, we85Chapter 4. Dependence of average lifetime wheel-running on body mass ontogenyhad hypothesized that the dependence of body mass on the wheel runningtrajectory would have evolved and hence be di erent between selected andcontrol mice. However, no such di erences were identi ed. Hence we can notdemonstrate any evolution of the dependence of wheel running on body masstrajectory. Several reasons may explain this. First, it is possible that thedependence is evolving, just slowly, and that a similar study conducted on alater generation would reveal di erences between selected and control mice.Second, the dependence of wheel running on body mass may be constrainedby genetic covariances between wheel running and body mass, e ectivelymaking it di cult to alter the dependence. Such a constraint could beidenti ed by detailed quantitative genetic analyses of the wheel runningand body mass trajectories (Kingsolver et al., 2001). Third, the inabilityto demonstrate any evolution dependence of wheel running on body masstrajectory could simply be because of a lack of power.Developing powerful yet exible statistical techniques to analyze datasets with a large number of observations on each individual is a challenge.We often do not have enough degrees of freedom to carry out a \usual"regression analysis, modelling Yi = 0 +P jZ(tj)+eij: for our data, withinone group we have around 40 individuals and 58 values of j. The function-valued approach e ectively reduces the number of parameters in the model.Using eigenfunctions as basis functions has allowed us to t our data usingonly 0 and 4 j's. Griswold, Gomulkiewicz and Heckman (2007) have foundvia simulation that basis-function type techniques are more powerful thantraditional ones. Although they have considered a slightly di erent problem,applicable to a comparision of log body mass trajectories of two groups, onewould expect their results to hold qualitatively for the problem consideredhere. The statistics methodology was developed by Heckman and Wang(2007), with this its rst application. Here we see its bene ts: the methodhas exposed patterns in dependence that were previously unknown. Themethod allows full use of information from all individuals, through the useof random e ects modelling. And by using a function-valued approach, wecan handle large number of observations per individual.86Chapter 4. Dependence of average lifetime wheel-running on body mass ontogenyGroup p-valueMale selected 0:008Male control 0:034Female selected 0:264Female control 0:656Table 4.1: P-values of the test Ho : (t) = 0; for all t 2 [ 1;60], using teststatistic Uf.Gender p-valueMale 0:110Female 0:444Table 4.2: P-values of the test Ho : s(t) = c(t); for all t 2 [ 1;60], usingtest statistic Uf.87Chapter 4. Dependence of average lifetime wheel-running on body mass ontogeny0 10 20 30 40 50 602.02.53.03.5Male SelectedWeekLog body Mass0 10 20 30 40 50 602.02.53.03.5Male ControlWeekLog body Mass0 10 20 30 40 50 602.02.53.03.5Female SelectedWeekLog body Mass0 10 20 30 40 50 602.02.53.03.5Female ControlWeekLog body MassFigure 4.1: Plots of log body mass versus week for the four groups of labo-ratory mice.88Chapter 4. Dependence of average lifetime wheel-running on body mass ontogenyMale SelectedWheel Running (km/week)Density20 40 60 80 1200.0000.0100.0200.030Male ControlWheel Running (km/week)Density20 40 60 80 1200.0000.0100.0200.030Female SelectedWheel Running (km/week)Density20 40 60 80 1200.0000.0100.0200.030Female ControlWheel Running (km/week)Density20 40 60 80 1200.0000.0100.0200.030Figure 4.2: Histogram of the response: averaged wheel running from weeks5 to 60 for the four groups of laboratory mice.89Chapter 4. Dependence of average lifetime wheel-running on body mass ontogeny2 4 6 8 10Male SelectedEigenvalue IndexVariance Explained0.50.60.80.91.02 4 6 8 10Male ControlEigenvalue IndexVariance Explained0.50.60.80.91.02 4 6 8 10Female SelectedEigenvalue IndexVariance Explained0.50.60.80.91.02 4 6 8 10Female ControlEigenvalue IndexVariance Explained0.50.60.80.91.0Figure 4.3: Plots of the proportion of cumulative variance of the centered logbody mass explained by the rst ten principal components for four groupsof mice, after the individual average log body mass has been removed.90Chapter 4. Dependence of average lifetime wheel-running on body mass ontogeny0 10 20 30 40 50 600.60.81.01.21.4WeekPC Function 10 10 20 30 40 50 60-0.5-0.3-0.10.1WeekPC Function 20 10 20 30 40 50 60-0.20.00.10.2WeekPC Function 30 10 20 30 40 50 60-0.20.00.10.20.3WeekPC Function 4Figure 4.4: The constant function and the rst three smoothed eigenfunc-tions of the covariance of centered log body mass of four group. The fourgroups' functions are together in one panel. The eigenfunctions are calcu-lated as described in Section 4.3.91Chapter 4. Dependence of average lifetime wheel-running on body mass ontogeny0 10 20 30 40 50 60-60-2002040Male SelectedWeekbeta^(t)0 10 20 30 40 50 60-60-2002040Male ControlWeekbeta^(t)0 10 20 30 40 50 60-60-2002040Female SelectedWeekbeta^(t)0 10 20 30 40 50 60-60-2002040Female ControlWeekbeta^(t)Figure 4.5: Plots of ^ and its standard errors computed from the Hessianmatrix (solid) and from the bootstrap (dash-dot) for four groups of mice.Standard errors are calculated as described in Section 4.2.92Chapter 4. Dependence of average lifetime wheel-running on body mass ontogeny0 10 20 30 40 50 60-0.20.20.40.60.8Male SelectedWeekCov0 10 20 30 40 50 60-0.20.20.40.60.8Male ControlWeekCov0 10 20 30 40 50 60-0.20.20.40.60.8Female SelectedWeekCov0 10 20 30 40 50 60-0.20.20.40.60.8Female ControlWeekCovFigure 4.6: Plots of the covariances between the average wheel running andthe log body mass trajectory at week j, j = 1;::: ;60, for four groups ofmice. Solid lines are the covariance from the model and dash-dot lines arethe pointwise sample covariances between average wheel running and logbody mass. For the calculation, refer to Heckmand and Wang (2007).93Chapter 4. Dependence of average lifetime wheel-running on body mass ontogeny0 10 20 30 40 50 60-4002060MaleWeekbetas(t)-betac(t)0 10 20 30 40 50 60-4002060FemaleWeekbetas(t)-betac(t)Figure 4.7: Plots of ^s ^c, the di erence between the selection ^ andthe control ^, and the standard errors of the di erence computed from theHessian matrix (solid) and from the bootstrap (dash-dot) for both genders.94Bibliography[1] Bronikowski, A. M., Morgan, T. J., Garland, T., Jr. and Carter, P. A.(2006) The evolution of aging and age-related physical decline in miceselectively bred for high voluntary exercise. Evolution 60, 1494-1508.[2] Dewsbury, D. A. (1980) Wheel-running behavior in 12 species of muroidrodents. Behav. Process 6, 271280.[3] Estabrook, G. F., Smith, G. R. and Dowling, T. E. (2007) Body massand temperature in uence rates of mitochondrial DNA evolution inNorth American Cyprinid sh. Evolution 61, 1176-1187.[4] Garland, T., Jr. (1983) Scaling the ecological cost of transport to bodymass in terrestrial mammals. Am. Nat 121, 571-587.[5] Garland, T., Jr. (2003) Selection experiments: an under-utilized tool inbiomechanics and organismal biology. Pages 23-56 in Bels, V. L., Gasc,J. P., Casinos, A. (2003) Vertebrate biomechanics and evolution. BIOSScienti c Publishers, Oxford, U.K.[6] Garland, T., Jr and Carter, P. A. (1994) Evolutionary physiology.Annu. Rev. Physiol 56, 579621.[7] Gillooly, J. F., Allen, A. P., West, G. B. and Brown, J.H. (2005) Therate of DNA evolution: e ects of body size and temperature on themolecular clock. PNAS, 102, 140-145.[8] Gomulkiewicz, R. and Kirkpatrick, M. (1992) Quantitative genetics andthe evolution of reaction norms. Evolution 46, 390-411.95Bibliography[9] Gomulkiewicz, R. and Beder, J. H. (1996) The selection gradient of anin nite-dimensional trait. SIAM Journal of Applied Mathematics 56,509-523.[10] Griswold, C. K., Gomulkiewicz, R. and Heckman, N. (2007) Statisticalpower and hypothesis testing with functional data. manuscript.[11] Heckman, N. and Wang, W. (2007) Linear mixed models for measure-ment error in functional regression. manuscript, Department of Statis-tics, University of British Columbia.[12] Hone, D. W. E. and Benton, B. J. (2005) The evolution of large size:how does Cope's Rule work? Trends in Ecology and Evolution 20, 1,4-6.[13] Kingsolver, J.G., R.S. Gomulkiewicz, and Carter, P. A. (2001) Varia-tion, selection, and evolution of function-valued traits. Genetica 112,87-104.[14] Kirkpatrick, M. and Heckman, N. (1989) A quantitative-genetic modelfor growth, shape, reaction norms, and other in nite-dimensional char-acters. Journal of Mathematical Biology 27, 429-450.[15] Koteja, P., Carter, P. A. and Swallow, J. G. and Garland, T., Jr. (2003)Food wasting in house mice: variation among individuals, families, andgenetic lines. Physiology & Behavior 80, 375-383.[16] Meyer, K. (2005) Advances in methodology for random regression anal-yses. Australian Journal of Experimental Agriculture 45, 847-859.[17] Morgan, T. J., Garland, T., Jr. and Carter, P. A. (2003) Ontogenies inmice selected for high voluntary wheel-running activity. I. Mean onto-genies. Evolution 57, 646-657.[18] Sogard, S. M. (1997) Size-selective mortality in the juvenile stage ofteleost shes: A review. Bulletin of Marine Science 60, 1129-115796Bibliography[19] Swallow, J. G., Carter, P. A. and Garland, T., Jr. (1998) Arti cialselection for increased wheel-running behavior in house mice. BehaviorGenetics 28, 227-237.[20] Swallow, J. G., Koteja, P. and Carter, P. A. and Garland, T., Jr. (1999)Arti cial selection for increased wheel-running activity in house miceresults in decreased body mass at maturity. Journal of ExperimentalBiology 202, 2513-2520.97Chapter 5Conclusions and future planIn the thesis, we study identi ability of linear mixed e ects models in Chap-ter 2. Particularly in Section 2.5, we give necessary and su cient conditionsfor identi ability of a model used in functional data analysis. The model isour study object in Chapter 3. There we show how to use the ECME al-gorithm to estimate the model parameters, develop statistics for hypothesistesting of the functional coe cient and derive expressions of the resid-uals to check for outliers or in uential points. We also apply the modelto analyze a biological data set in Section 3.6. Chapter 4 provides a de-tailed background of the biological experiment, a description of the data,and biological interpretations of our analysis.In this chapter, we give possible directions for future research. In Sec-tion 5.1, we adjust our Chapter 3 to account for correlated individuals. InSection 5.2, we consider generalizing our Chapter 3 model from a scalarresponse to a functional response.5.1 Models with correlated individualsIn Chapter 4, we apply our model (3.4) and (3.5) to analyze a biological dataset. Speci cally, the observed ith mouse's log body mass, zi, is modelled aszi = + Axi + i (5.1)and the ith mouse's average wheel running, Yi, is modelled asYi = 0 + 0Txi + ei: (5.2)98Chapter 5. Conclusions and future planWe assume the random e ects, the xi's, are independent across individuals.However, in the biological experiment, mice may come from the same fam-ily and thus traits may be genetically correlated. This fact would lead tocorrelations among the random e ects, the xi's. Clearly, our independencemodel assumption is then violated and we should take the dependence ofindividuals into consideration. In this section, we show how to handle thisfamily correlation more directly, by adjusting our model to account for thisdependence.In the experiment, two mice came from the same family. Let the sub-script i index the family and the subscript k = 1 or 2 di erentiate the twofamily members. Models of the ikth mouse's log body mass and averagewheel running are respectivelyzik = + Axik + ik; (5.3)Yik = 0 + 0Txik + eik; k = 1;2: (5.4)We still assume that Cov(xi1) = x = Cov(xi2) but, in the ith family, xi1and xi2 are correlated asCov(xi1;xi2) = x; (5.5)where is a known constant. The constant, , comes from genetic theory(see, for instance, Heckman, 2003). We still assume the measurement errorsare independent of the random e ects and are independent across individualsas ik i:i:d: N(0; 2 R); eik i:i:d: N(0; 2);(x11;x12);(x21;x22);::: ;(xN1;xN2); 11; 12;::: ; N1; N2;e11;e12;:::;eN1;eN2 are mutually independent:The model (5.3) and (5.4) now includes the dependence between the twofamily members. In the following, we give another expression of the model,dropping the subscript k.99Chapter 5. Conclusions and future planWe rst group each family's information together. Letzi = zi1zi2!; yi = Yi1Yi2!; xi = xi1xi2!; i = i1 i2!; and ei = ei1ei2!:To write a model for the ith family, let~ = !; ~0 = 0 0!; ~ = A 00 A!; and g0T = 0T 00 0T!:From (5.3) and (5.4), our model for the ith family iszi = ~ + ~ i + i; (5.6)yi = ~0 + g0Txi + ei: (5.7)To derive the covariances of the (z0i;y0i), we let Ax = A xA0, A; =A xT0 and ;x = 0T xT0 . From (5.5), it is not hard to see thefollowing holds,Cov(zi) = Ax + Ax Ax Ax + !;Cov(zi;yi) = A; A; A; A; !;Cov(yi) = ;x + 2 ;x ;x ;x + 2!:Model (5.6) and (5.7) uses individuals' family information together andthe correlation between family members is re ected in the model's covari-ance structure. To estimate the model parameters, we can use the ECMEalgorithm. But we may need to modify the algorithm as the model covari-ance structure is not in a simple form.100Chapter 5. Conclusions and future plan5.2 Proposing models with a functional responseIn Chapter 3, we applied our model to analyze the mouse wheel runningdata set. In the experiment, the wheel running distances were observedevery week for 60 consecutive weeks. We took the averaged wheel runningdistance as a scalar response, averaged to reduce nuisance e ects in thewheel running. However, we would like to consider the entire wheel-runningtrajectory as a response and to study its dependence on the log body masstrajectory. In this section, we introduce a model which generalizes a scalarresponse to a functional response. We give an outline of the plan to study themodel. Finally, we give a literature review of current work on this functionalresponse and predictor model.5.2.1 Models of a functional responseRecall our modelling of the scalar response Yi isYi = 0 +Z ba (s)[Zi(s) E(Zi(s))] ds + ei:Now we have a functional response Yi indexed by t. Replacing the univariate ( ) function with a bivariate ( ; ), we consider generalizing the scalarresponse model to accomodate the functional response asYi(t) = (t) +Z ba (t;s) Zi(s) E(Zi(s)) ds + ei(t): (5.8)We consider a tensor-product expansion of ( ; ) using basis functions, '1,:::, 'J and 1, :::, L, as (t;s) =JXj=1LXl=1bjl'j(t) l(s) '(t)0B (s); B[j;l] = bjl: (5.9)The modelling of the predictor Zi remains the same asZi(s) = (s) +KXk=1xik k(s) (s) + (s)0xi;101Chapter 5. Conclusions and future plan smooth; xi (xi1;::: ;xiK)0 i:i:d: N(0; x):Thus we can writeYi(t) = (t) + '(t)0BTxi + ei(t);Tlk =Z ba l(s) k(s) ds Z (s) (s)0ds: (5.10)Both the Zi and Yi are observed at a sequence of discrete time points.Typically, we assume the observed Zi are perturbed with measurement er-rors. Let ij represent the measurement error of Zi at time sij, j = 1;::: ;ni.Let zij be the contaminated observed value aszij = Zi(sij) + ij (sij) + (sij)0xi + ij: (5.11)Let = ( (si1);::: ; (sini))0, i = ( i1;::: ; ini)0 represent errors and zi =(zi1;::: ;zini)0 be the observed values. Writezi = + Axi + i;Ajk = k(sj); j = 1;::: ;ni; k = 1;::: ;K; i N(0; i);x1;:::;xn; 1;:::; n are mutually independent: (5.12)Suppose the process Yi is observed at a sequence of discrete time points(t1;::: ;tmi). Let = ( (t1);::: ; (tmi))0, ei = (ei(t1);::: ;ei(tmi))0 repre-sent errors and yi = (yi(t1);::: ;yi(tmi))0 (yi1;:::;yimi)0 be the observedvalues. We then writeyi = + BTxi + ei; ij = 'j(ti);ei independent of both xi and i;ei N(0; ei);x1;::: ;xn; 1;::: ; n;e1;:::;en are mutually independent: (5.13)102Chapter 5. Conclusions and future planThe data available are the observed yi's and the zi's and the unknownparameter of most interest is B. We then use (5.12) and (5.13) as our modelof the functional response and the functional predictor for data analysis.5.2.2 Outline of the model studyWe initiate our plan to study the model (5.12) and (5.13) in this section. Thesteps basically follow those in Chapter 3: model identi ability, parameterestimation, hypothesis testing and model diagnostics. Throughout, the basisfunctions are assumed known and xed. We give a discussion of the choiceof basis functions at the end.In Section 2.5 we gave conditions of model identi ability for the scalarresponse model. We can use similar methods to study identi ability of thisfunctional response model with covariance parameters now f x; i; ei;i =1;::: ;Ng. New results will be needed to guarantee the model identi abilitywith restrictions on the i's and the ei's.After identifying the model parameters, we can proceed to estimate theunknown parameters using an ECME algorithm assuming the basis functionsknown and xed. The parameter of most interest is the matrix B in (5.9).After obtaining the estimate, similar approaches using the Hessian matrixor bootstrap can be applied to get the standard errors of ^. For hypothesistesting of , it may be of interest to know if, at certain xed t point, (t; ) iszero or not. So the statistics developed in Section 4.4 can also be used to testthe hypothesis Ho : (t;s) = 0 for all s 2 [a;b] when t is xed. After ttingthe model, residuals of the t of yi can be obtained based on the BLUP ofthe random e ects xi. However, for each individual, the tted value and theresidual of yi now are vectors, not scalars. Appropriate methods for residualanalysis need to be proposed.The 'j's and l's basis functions for and the k's for Zi can be chosenusing an eigenanalysis approach. As in the scalar response case, we applyan eigenanalysis to the observed zij's, and take 1;:::; K as the rst Kestimated eigenfunctions. We take L = K and k = k to determine oneset of the basis functions of ( ; ) in (5.9), the set of basis functions that103Chapter 5. Conclusions and future planappear in the integral T in (5.10). The other set of basis functions, the 'j's,can be chosen as the estimated eigenfunctions of the Yi process.An alternative to the use of a tensor-product expansion of is to considerusing bivariate basis function expansions. Let the l's be bivariate basisfunctions. We can approximate as (s;t) =LXl=1 l l(s;t) (s;t)0 :But we also need to consider what basis functions to choose.5.2.3 Literature reviewThere are two research results closest to our work on the functional responseand functional predictor model.Ramsay and Silverman (2005, chapter 16) study a slight modi cation ofmodel (5.8)yi(t) = (t) +Z (t;s) Zi(s) ds + ei(t):Besides using the tensor-product expansion of as in (5.9), the authorsexpand using the 'j's as (t) =J1Xj=1 l'j(t) '(t)0 :Assuming the Zi's are observed without error, the authors estimate andB by minimizing the integrated residual sum of squares, which isNXi=1Z yi(t) '(t)0 ZZi(s) (s)0B'(t)ds 2dt:Yao, M uller and Wang (2005) consider (5.8) with the Zi process observedwith error. They use k's equal to the eigenfunctions of the Zi covariancefunction in (5.11). The random e ects xi has a diagonal covariance matrixwith elements 2k's being the eigenvalues of the Zi covariance function. Theerrors ij are assumed to be i.i.d. with mean zero and variance 2 .104Chapter 5. Conclusions and future planYao et al. carry out an eigenanalysis of the Yi's covariance function,with resulting eigenfunctions equal to the l's and corresponding eigenval-ues equal to the 2l 's. They expand the observed yik, contaminated withmeasurement errors, asyik = Y (tik) +LXl=1uil l(tik) + eik; Y smooth; Euil = 0; Eu2il = 2l ; Euiluil0 = 0; l 6= l0:eik i:i:d: (0; 2e ); independent of the u0ils: (5.14)Let C(s;t) denote the two-dimensional scatterplot smoothed cross-covariancesurface Cov(Z(s);Y (t)) of the zij's and yik's. The authors estimate as^ (s;t) =KXk=1LXl=1 kl 2k k(s) l(t);where kl is calculated as kl =Z Z k(s) C(s;t) l(t) ds dt:Similarities between our modelling and Yao et al's are that the coe cientvector xi in the expansion of the Zi is random and the Zi's are observedwith measurement error. However, Yao et al's xi has a diagonal covariancestructure while our covariance matrix of the xi, x, can be more general. Weuse both the zi's and the yi's information to estimate x and other modelparameters. In Yao et al., variance elements 2k's of the xi and variance ofthe error ij, 2 , are estimated completely by the zi's. Similarly, the 2l 's ofthe ui and 2e of the eik are estimated completely using the yi's.105Bibliography[1] Heckman, N. (2003) Functional data analysis in evolutionary biology.Recent Advances and Trends in Nonparametric Statistics, 49-60, Akri-tas, M. G. and Politis, D. N. (editors).[2] Ramsay, J. O. and Silverman, B. W. (2005) Functional Data Analysis.2nd edition, Springer Series in Statistics.[3] Yao, F., M uller, H.G., Wang, J.L. (2005) Functional linear regressionanalysis for longitudinal data. Annals of Statistics 33, 6, 2873-2903.106Appendix AAppendix to Chapter 2A.1 Proof of Corollary 2.4.1To prove it, we use the following result from Graybill, 1983, p.206.Lemma A.1.1 Given two scalars a and b, the characteristic equation of thematrix C = (a b)I + bJ in is(a + (n 1)b ) (a b )n 1;and hence n 1 characteristic roots are equal to a b and one root is equalto a + (n 1)b.Proof:To prove the corollary, we use Theorem 2.4.1 and a proof by contradiction.First suppose that the model is identi able and suppose, by way ofcontradiction, HZJ = J. Fix 2 ~ . Let s > 1, 2 = s 2 and = ( 1)=s + 1. It is not hard to check 1=(n 1) < < 1. De- ne = 2 [(1 )I + J]. Then = ( 2 2)J and, sinceHZJ = J, it is clear that (2.3) is satis ed. We now show that, for any u 2 u, there exists s > 1 so that u de ned as in (2.4) is positive de -nite whenever 1 < s < s . This will show that the model is not identi able,which contradicts our assumption. Plugging = ( 2 2)J into(2.4) yields u = u + 2(1 s)(Z0Z) 1Z0JZ(Z0Z) 1: (A.1)By assumption 10Z 6= 0 and Z is of full column rank, the matrix (Z0Z) 1Z0JZ(Z0Z) 1is non-negative de nite and of rank one since J = 110. Let be its non-zero and thus the largest eigenvalue of (Z0Z) 1Z0JZ(Z0Z) 1. Let m be the107Appendix A. Appendix to Chapter 2smallest eigenvalue of the matrix u, and let s = m=( 2) + 1. For anyx 2 <q; x 6= 0, x0 ux > 0 by the following argument.x0 ux = x0 ux + 2(1 s)x0(Z0Z) 1Z0JZ(Z0Z) 1x mx0x + 2(1 s) x0x> 0Now suppose that HZJ 6= J and suppose, by contradiction, that themodel is not identi able. Then, by Theorem 2.4.1, there exist nonidentical and satisfying (2.3) and, since the rank of HZ is q, the rank of is at most q. We have =h( 2 2) ( 2 2 )iI + ( 2 2 )J:By Lemma A.1.1, the eigenvalues of are ( 2 2) ( 2 2 ),which is of multiplicity n 1 and ( 2 2) + (n 1)( 2 2 ), ofmultiplicity 1. Since is not the zero matrix, all of the eigenvaluescannot be equal to 0: we must either have no eigenvalues equal to 0, oneeigenvalue equal to 0, or n 1 eigenvalues equal to 0. In order to haverank( ) q, the n 1 multiple eigenvalues have to be zero since1 q < n 1 by assumption. That is, 2 2 = 2 2 and so = ( 2 2)J. But plugging this into (2.3) yields HZJ = J,contradicting our assumption.2A.2 Proof of Corollary 2.4.2Proof:We rst note a fact about the matrix HZ. Since HZ is symmetric andidempotent,HZ[k;k] = Xl(HZ[k;l])2 = (HZ[k;k])2 + Xl6=k(HZ[k;l])2 :108Appendix A. Appendix to Chapter 2Thus, if HZ[k;k] = 1, then HZ[k;i] = HZ[i;k] = 0 for all i 6= k.To prove the corollary, we use Theorem 2.4.1 and a proof by contradic-tion.First suppose that the model is identi able and suppose, by way ofcontradiction, a diagonal element of HZ is equal to 1. Without loss ofgenerality, we assume HZ[1;1] = 1. Then by our observation, HZ[1;i] =HZ[i;1] = 0 for all i 6= 1. Fix = diagf 21 ;:::; 2ng 2 ~ . Let 21 satisfy0 < 1 2 < 21 and de ne = diagf 1 2; 22 ;::: ; 2ng. Then =diagf 21 1 2;0;::: ;0g: It is not hard to check that (2.3) is satis ed. Clearly,for any u 2 u, u de ned as in (2.4) is also in u. Thus, the model isnot identi able, which contradicts our assumption.Now suppose that no diagonal element of HZ is equal to one and sup-pose, by contradiction, that the model is not identi able. Then there existsnonidentical diagonal matrices, and , satisfying (2.3). As 6= ,at least one of the diagonal elements of is not zero. Suppose thek-th diagonal element is not zero. By (2.3), the k-th diagonal element ofHZ must be one, contradicting our assumption. 2109Appendix BAppendix to Chapter 3In this appendix, we provide the calculations in Sections 3.3.2 and 3.3.4where we nd the updates of x and f ; 2g in the ECME procedure.In Section B.4, we derive the rst order condition (3.10) and the Hessianmatrix (3.14) of Section 3.3.2 where we maximize the log-likelihood N over x while holding the other parameters xed.. In Section B.5, we derive the rst order conditions (3.22) and (3.23), and the Hessian matrix (3.26) of Sec-tion 3.3.4 where we maximize ~N over f ; 2g holding the other parameters xed.We use the tool of matrix di erential calculus, calculating rst di eren-tials to obtain the rst order conditions and second di erentials to obtainthe Hessian matrices. The book by Magnus and Neudecker (1988) givesan elegant description on this subject. In Sections B.1-B.3, we follow thebook to introduce some de nitions and provide some background, mainlyfrom Part Two of the book. We keep the same notation as in the book.Throughout this section, chapters and page numbers all refer to (Magnusand Neudecker, 1988).B.1 De nition of the rst di erentialWe rst give the de nition of the rst di erential for a vector function (avector valued function with a vector argument). We show that the function's rst di erential is connected with its Jacobian matrix. We then give anextension of the de nition to a matrix function (a matrix valued functionwith a matrix argument) and show how to identify the Jacobian matrix fromthe rst di erential.110Appendix B. Appendix to Chapter 3De nition B.1.1 Let f : S ! <m be a function de ned on a set S in <n.Let c be an interior point of S, and let B(c;r) be an n-ball lying in S centredat c of radius r. If there exists a real m n matrix A, depending on c butnot on u, such thatf(c + u) = f(c) + A(c)u + rc(u)for all u 2 <n with jjujj < r andlimu!0 rc(u)jjujj = 0;then the function f is said to be di erentiable at c; the m n matrix A(c)is then called the rst derivative of f at c, and the m 1 vectordf(c;u) = A(c)u;which is a linear function of u, is called the rst di erential of f at c (withincrement u). If f is di erentiable at every point of an open subset E of S,we say f is di erentiable on E.After calculating the rst di erential, we identify the Jacobian matrixas follows. Let Df be the m n Jacobian matrix of f whose ijth elementis Djfi: the partial derivative of the ith component function fi of f withrespect to the jth coordinate. The First Identi cation Theorem (p.87) statesthat the rst derivative A(c) is Df(c) when f is di erentiable at c.To extend the de nition of a vector function to a matrix function with amatrix argument is straightforward using the vec operator. The vec operatortransforms a matrix into a vector by stacking the columns of the matrix oneunderneath the other.Recall the norm of a real matrix X is de ned byjjXjj = (trX0X)1=2:111Appendix B. Appendix to Chapter 3Let <n q contains all the real n q matrices. De ne a ball B(C;r) withcenter C and radius r in <n q byB(C;r) = fX : X 2 <n q;jjX Cjj < rg:Let F : S ! <m p be a matrix function de ned on a set S in <n q. Thatis, F maps an n q matrix into an m p matrix F(X). We consider thethe vector function f : vec S !<m p de ned byf(vec X) = vec F(X)and the following gives the de nition of the rst di erential of F.De nition B.1.2 Let F : S !<m p be a matrix function de ned on a setS in <n q. Let C be an interior point of S and let B(C;r) S be a ballwith center C and radius r. If there exists RC and a real (mp) (nq) matrixA, depending on C but not on U, such thatvecF(C + U) = vecF(C) + A(C)vecU + vecRC(U)for all U 2 <n q with jjUjj < r andlimU!0 RC(U)jjUjj = 0;then the function F is said to be di erentiable at C. LetdF(C;U) = A(C)vecU:Although this is a vector of length (mp), it can be formed into a matrix ofdimension m p, in the usual natural way. This m p matrix dF(C;U) iscalled the rst di erential of F at C with increment U and the (mp) (nq)matrix A(C) is called the rst derivative of F at C.From the de nition, it is clear that the di erential of F and f are relatedbyvec dF(C;U) = df(vecC;vecU):112Appendix B. Appendix to Chapter 3The Jacobian matrix of F at C is de ned asDF(C) = Df(vecC):This is an (mp) (nq) matrix, whose ijth element is the partial derivativeof the ith component of vecF(X) with respect to the jth element of vecX,evaluated at X = C. The First Identi cation Theorem for matrix functions(p.96) states that if F is di erentiable at C, thenvec dF(C;U) = DF(C)vecU:So we can calculate the di erential of F to identify its Jacobian matrix.B.2 De nition of the second di erentialWe rst introduce the de nition of twice di erentiable on which the de ni-tion second di erential is based. The de nitions are restricted to real valuedfunctions as in our calculations we only need to consider second di erentialsof real valued functions. Then we connect the Hessian matrix with the sec-ond di erential. As in the rst di erential case, we give an extension of thede nition to a real valued function with a matrix argument and also showhow to identify the Hessian matrix from the second di erential.De nition B.2.1 Let f : S !< be a real valued function de ned on a setS in <n, and let c be an interior point of S. If f is di erentiable in somen-ball B(c) and each of the partial derivatives Djf is di erentiable at c,then we say that f is twice di erentiable at c. If f is twice di erentiable atevery point of an open subset E of S, we say f is twice di erentiable on E.The following is the de nition of the second di erential.De nition B.2.2 Let f : S !< be twice di erentiable at an interior pointc of S <n. Let B(c) be an n-ball lying in S such that f is di erentiableat every point in B(c), and let g : B(c) !< be de ned by the equationg(x) = df(x;u):113Appendix B. Appendix to Chapter 3Then the di erential of g at c with increment u, i.e. dg(c;u), is called thesecond di erential of f at c (with increment u), and is denoted by d2f(c;u).To calculate the second di erential of f, by the de nition, we just need tocalculate the di erential of the rst di erential of f, i.e.d2f = d(df):We have seen that the Jacobian matrix can be identi ed from the rstdi erential. Similarly, we can identify the Hessian matrix from the seconddi erential. The Second Identi cation Theorem (p.107) states that if f istwice di erentiable at c, thend2f(c;u) = u0 (Hf(c)) u;where Hf(c) is the n n symmetric Hessian matrix of f at c with (i;j)entry equal to @2f(c)=(@ci@cj). Therefore, once we have calculated thesecond di erential, the Hessian matrix is obtainable.Similarly as in Section B.1, the extension of the second di erential fromvector functions to matrix functions is straightforward using the vec opera-tor. As we only consider real valued functions, we restrict the extension toa real valued function with a matrix argument.We follow the notation in the de nition of the rst di erential of matrixfunctions. Let the domain of f be S <n q and let C be an interior pointof S. Let B(C;r) S be a ball with center C and radius r and let Ube a point in <n q with jjUjj < r, so that C + U 2 B(C;r). The seconddi erential of f at C is then de ned asd2f(C;U) = d2f(vecC;vecU):The Second Identi cation Theorem for matrix functions (p.115) says if f istwice di erentiable at C, thend2f(C;U) = (vecU)0 Hf(C) vecU;114Appendix B. Appendix to Chapter 3where Hf(C) is the nq nq symmetric Hessian matrix of f at C de ned asHf(C) Hf(vecC):That is, the ijth element of Hf(C) is the second order derivative of f withrespect to the ith and jth element of vecX where X 2 S, evaluated at C.B.3 Matrix algebraic and di erential rulesIn this section, we list the matrix algebraic and di erential rules (chap.8)which will be used without speci c reference in our derivations. In thefollowing, we let A, B, C and D denote constant matrices, u denote avector function and U and V denote matrix functions. We let stand forthe Kronecker product. The rules are the following. tr(AB) = tr(BA), provided AB is square. tr(A0B) = (vecA)0vecB. tr(ABCD) = (vecD)0(A C0)(vecB0), provided ABCD is de nedand square. dA = 0. dAU = AdU. d(U + V) = dU + dV. d(UV) = (dU)V + U(dV). d(U0) = (dU)0. d (ln detU) = trU 1(dU). d (U 1) = U 1(dU)U 1. d (trU) = tr (dU). d(vecU) = vec (dU). d(u0Au) = u0(A + A0)du = (du)0(A + A0)u.115Appendix B. Appendix to Chapter 3B.4 Calculations in Section 3.3.2Recall the observed data log-likehood has the expression N = N2 lndet C xC0 + d 12NXi=1(Wi W )0 C xC0 + d 1 (Wi W ):We want to maximize it over x while holding the other parameters xed.In this section, holding the other parameters xed, we calculate the rstdi erential of N to obtain the rst order condition (3.10) and calculate thesecond di erential to obtain the Hessian matrix in (3.14).As we treat x as the only unknown parameter, it immediately followsfrom the expression of W in (3.9)d W = Cd xC0: (B.1)In our derivation, we will use the shorter notation d W before we reach(3.10) or (3.14). We haved N = N2 d(ln W ) 12NXi=1(Wi W )0 d 1W (Wi W )= N2 trh 1W d Wi+ 12NXi=1(Wi W )0 1W (d W ) 1W (Wi W )= N2 trh 1W d Wi+ 12tr" 1WNXi=1(Wi W )(Wi W )0 1W (d W )#= N2 trh 1W ( W SW ) 1W (d W )i(B.2)Recalling (B.1), we now haved N = N2 trhC0 1W ( W SW ) 1W Cd xi:By the First Identi cation Theorem for matrix functions mentioned in Sec-116Appendix B. Appendix to Chapter 3tion B.1, we obtain the Jacobian matrix of N at x asD N( x) = vecnC0 1W ( W SW ) 1W Co:Equating it to zero yieldsC0 1W ( W SW ) 1W C = 0which is equivalent to the rst order condition (3.10).Next we calculate d2 N to identify the Hessian matrix in (3.14). Startingfrom (B.2), we haved2 N = N2 trh(d 1W )( W SW ) 1W (d W )i N2 trh 1W d( W SW ) 1W (d W )i N2 trh 1W ( W SW )d 1W (d W )i= N2 trh 1W (d W ) 1W ( W SW ) 1W (d W )i N2 trh 1W (d W ) 1W (d W )i+N2 trh 1W ( W SW ) 1W (d W ) 1W (d W )i:The rst term and the last term at the right hand side are equal, and sothey can be combined into one termNtrh 1W (d W ) 1W ( W SW ) 1W (d W )i:Thend2 N = Ntr 1W (d W ) 1W (12 W SW ) 1W (d W ) Recall (B.1). Right hand side of the above= Ntr 1W Cd xC0 1W (12 W SW ) 1W Cd xC0 = Ntr C0 1W Cd xC0 1W (12 W SW ) 1W Cd x = N(vec d x)0 C0 1W C C0 1W (12 W SW ) 1W C vec(d x):117Appendix B. Appendix to Chapter 3When evaluated at the critical point ^ x which satis es the rst order con-dition (3.10), d2 N is then N2 (vec d x)0hC0 ^ 1W C C0 ^ 1W Civec(d x):By the Second Identi cation Theorem for matrix functions mentioned inSection B.2, we have at ^x the Hessian matrix is equal toH(^x) = (N=2) ^D ^ ; where ^ = C0 ^ 1W C:This is the matrix we saw in (3.14).B.5 Calculations in Section 3.3.4Recall we want to maximize the log-likelihood~ N = N2 ln( 0K + 2) 12( 0K + 2)NXi=1 Yi 0 0G(zi ) 2over f ; 2g to nd the update while xing the other parameters. In thissection, we derive the rst order conditions (3.22) and (3.23) via calculatingthe rst di erential of ~N with respect to f ; 2g. Calculating the seconddi erential of ~N then gives us the Hessian matrix (3.26).The following two di erentials will facilitate our calculation in d~N andd2 ~N. Recall the expression of 2Y jz in (3.19). We haved 2Y jz d 0K + 2 = 2 0Kd + d 2: (B.3)Letg( ) =NXi=1 Yi 0 0G(zi ) (zi )0G0: (B.4)We then obtaind" NXi=1 Yi 0 0G(zi ) 2# = 2 NXi=1 Yi 0 0G(zi ) (d )0G(zi )118Appendix B. Appendix to Chapter 3= 2NXi=1 Yi 0 0G(zi ) (zi )0G0(d )= 2g( )d (B.5)To calculate d~N, we use the terms 2Y jz and 0K + 2 interchangeably.d~N = N2 dhln( 0K + 2)i d 12( 0K + 2) NXi=1 Yi 0 0G(zi ) 2 12( 0K + 2)d" NXi=1 Yi 0 0G(zi ) 2#By (B.3) and (B.5), the right hand side above is equal to N2 2 0Kd + d 2 2Y jz +2 0Kd + d 22 4Y jzNXi=1 Yi 0 0G(zi ) 2+ 1 2Y jz g( )d :Letc( ; 2) =" NXi=1 Yi 0 0G(zi ) 2 N 2Y jz#: (B.6)Then d~N is equal tod 22 4Y jz c( ; 2) + 1 4Y jzh 0K c( ; 2) + 2Y jzg( )id : (B.7)By the First Identi cation Theorem mentioned in Section B.1, we obtainthe rst order conditions1 4Y jzh 0K c( ; 2) + 2Y jzg( )i= 0;12 4Y jz c( ; 2) = 0which lead to (3.22) and (3.23).Calculating d2 ~N to identify the Hessian matrix is lengthy and tedious.In fact, we don't need the closed form of the Hessian matrix but the Hes-119Appendix B. Appendix to Chapter 3sian matrix evaluated at the critical points f^; ^2g given in (3.26). So inour derivation, we will make use of the rst order conditions to simplifycalculation.We notice, equivalently, the critical points f^; ^2g only need to satisfyg(^) = 0 (B.8)c(^; ^2) = 0: (B.9)From (B.6), using (B.3) and (B.5) we havedc( ; 2) = 2NXi=1 Yi 0 0G(zi ) (zi )0G0d 2N 0Kd Nd 2= 2g( )d 2N 0Kd Nd 2;which is a function of . By (B.8),dc(^; 2) = 2N(d )0K Nd 2: (B.10)Now we calculate d2 ~N starting from (B.7). We rst calculated"d 22 4Y jz c( ; 2)#which isd 12 4Y jz!d 2 c( ; 2) + d 22 4Y jz d c( ; 2):When at f^; ^2g, by (B.9) and (B.10), it is equal to Nd 22^4Y jz 2(d )0K^ + d 2 = N^ 4Y jz d 0 d 2 K^ 1=2!d 2: (B.11)Then we calculated"1 4Y jzh 0K c( ; 2) + 2Y jzg( )id #120Appendix B. Appendix to Chapter 3which isd 1 4Y jz!h 0K c( ; 2) + 2Y jzg( )id + 1 4Y jzdh 0K c( ; 2) + 2Y jzg( )id :(B.12)At f^; ^2g, the term in the rst square brackets vanishes by (B.8) and (B.9).Thus, at f^; ^2g, (B.12) is equal to1^4Y jzh(d )0Kc(^; ^2) + 0Kdc(^; 2) + d 2Y jzg(^) + ^2Y jzdg(^)id :From (B.4), we havedg( ) = (d )0NXi=1G(zi )(zi )0G0: (B.13)Again by (B.8)-(B.10) and that dc(^; 2) is a scalar, (B.12) is equal to1^4Y jz" 2N(d )0K^ Nd 2 ^ 0K ^2Y jz(d )0NXi=1G(zi )(zi )0G0#d = N^ 4Y jz d 0 d 2 0@ 2K^ ^ 0K + ^2Y jzNPNi=1 G(zi )(zi )0G0^ 0K1Ad (B.14)Combining (B.11) and (B.14), eventually we get, at (^; ^2),d2 ~N(^; ^2) = d 0 d 2 H ~ ^; ^2) d d 2!;whereH ~ ^; ^2) = N^ 4Y jz0@ 2K^ ^ 0K + ^2Y jzNPNi=1 G(zi )(zi )0G0 K^ ^ 0K 1=21A:By the Second Identi cation Theorem mentioned in Section B.2, H ~ ^; ^2)is the Hessian matrix and we have seen it in (3.26).121
- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Theses and Dissertations /
- Linear mixed effects models in functional data analysis
Open Collections
UBC Theses and Dissertations
Featured Collection
UBC Theses and Dissertations
Linear mixed effects models in functional data analysis Wang, Wei 2008-12-31
pdf
Page Metadata
Item Metadata
Title | Linear mixed effects models in functional data analysis |
Creator |
Wang, Wei |
Publisher | University of British Columbia |
Date | 2008 |
Date Issued | 2008-01-03T18:12:16Z |
Description | Regression models with a scalar response and a functional predictor have been extensively studied. One approach is to approximate the functional predictor using basis function or eigenfunction expansions. In the expansion, the coefficient vector can either be fixed or random. The random coefficient vector is also known as random effects and thus the regression models are in a mixed effects framework. The random effects provide a model for the within individual covariance of the observations. But it also introduces an additional parameter into the model, the covariance matrix of the random effects. This additional parameter complicates the covariance matrix of the observations. Possibly, the covariance parameters of the model are not identifiable. We study identifiability in normal linear mixed effects models. We derive necessary and sufficient conditions of identifiability, particularly, conditions of identifiability for the regression models with a scalar response and a functional predictor using random effects. We study the regression model using the eigenfunction expansion approach with random effects. We assume the random effects have a general covariance matrix and the observed values of the predictor are contaminated with measurement error. We propose methods of inference for the regression model's functional coefficient. As an application of the model, we analyze a biological data set to investigate the dependence of a mouse's wheel running distance on its body mass trajectory. |
Extent | 680020 bytes |
Subject |
functional regression mixed models |
Genre |
Thesis/Dissertation |
Type |
Text |
File Format | application/pdf |
Language | eng |
Collection |
Electronic Theses and Dissertations (ETDs) 2008+ |
Date Available | 2008-01-03 |
Provider | Vancouver : University of British Columbia Library |
DOI | 10.14288/1.0066202 |
URI | http://hdl.handle.net/2429/253 |
Degree |
Doctor of Philosophy - PhD |
Program |
Statistics |
Affiliation |
Science, Faculty of Statistics, Department of |
Degree Grantor | University of British Columbia |
Graduation Date | 2008-05 |
Campus |
UBCV |
Scholarly Level | Graduate |
Aggregated Source Repository | DSpace |
Download
- Media
- [if-you-see-this-DO-NOT-CLICK]
- ubc_2008_spring_wang_wei.pdf [ 664.08kB ]
- Metadata
- JSON: 1.0066202.json
- JSON-LD: 1.0066202+ld.json
- RDF/XML (Pretty): 1.0066202.xml
- RDF/JSON: 1.0066202+rdf.json
- Turtle: 1.0066202+rdf-turtle.txt
- N-Triples: 1.0066202+rdf-ntriples.txt
- Original Record: 1.0066202 +original-record.json
- Full Text
- 1.0066202.txt
- Citation
- 1.0066202.ris
Full Text
Cite
Citation Scheme:
Usage Statistics
Country | Views | Downloads |
---|---|---|
United States | 15 | 2 |
China | 13 | 25 |
Brazil | 9 | 0 |
Canada | 6 | 1 |
Russia | 2 | 0 |
Romania | 1 | 1 |
Italy | 1 | 0 |
France | 1 | 1 |
Belgium | 1 | 0 |
City | Views | Downloads |
---|---|---|
Unknown | 13 | 1 |
Ashburn | 12 | 0 |
Beijing | 7 | 1 |
Shenzhen | 6 | 24 |
Toronto | 3 | 0 |
Vancouver | 2 | 1 |
Pasadena | 2 | 0 |
Bucharest | 1 | 1 |
Wilmington | 1 | 0 |
Civitavecchia | 1 | 0 |
Leuven | 1 | 0 |
{[{ mDataHeader[type] }]} | {[{ month[type] }]} | {[{ tData[type] }]} |
Share
Embed
Customize your widget with the following options, then copy and paste the code below into the HTML
of your page to embed this item in your website.
<div id="ubcOpenCollectionsWidgetDisplay">
<script id="ubcOpenCollectionsWidget"
src="{[{embed.src}]}"
data-item="{[{embed.item}]}"
data-collection="{[{embed.collection}]}"
data-metadata="{[{embed.showMetadata}]}"
data-width="{[{embed.width}]}"
async >
</script>
</div>
Our image viewer uses the IIIF 2.0 standard.
To load this item in other compatible viewers, use this url:
http://iiif.library.ubc.ca/presentation/dsp.24.1-0066202/manifest