Instantaneous Dynamics of FunctionalDatabyJeffrey BoneB.Sc., The University of Victoria, 2014A THESIS SUBMITTED IN PARTIAL FULFILLMENT OFTHE REQUIREMENTS FOR THE DEGREE OFMASTER OF SCIENCEinThe Faculty of Graduate Studies(Statistics)THE UNIVERSITY OF BRITISH COLUMBIA(Vancouver)October 2016c© Jeffrey Bone 2016AbstractTime dynamic systems can be used in many applications to data modeling.In the case of longitudinal data, the dynamics of the underlying differentialequation can often be inferred under minimal assumptions via smoothingbased procedures. This is in contrast to the common technique of assuminga prespecified differential equation, and estimating it’s parameters.In many cases, one wants to learn the dynamics of a differential equationthat incorporates more than just one stochastic process. In the following, wepropose extensions to existing two-step smoothing methods that allow forthe presence of additional functional data arising from a second stochasticprocess. We further introduce model comparison techniques to assess thehypothesis that there is a significant change in fit provided by this additionalprocess. These techniques are applied to the instantaneous dynamics ofmouse growth data and allow us to make comparisons between mice whohave been assigned different genetic and physical conditions. Finally, tostudy the statistical properties of our proposed techniques, we carry out asimulation study based on the mouse growth data.iiPrefaceThis thesis is an original and unpublished work of the author, Jeffrey Bone,under the supervision of Dr. Nancy Heckman.The research question is an extension of the work done by Nicolas Verze-len, Wenwen Tao and Hans-Georg Mu¨ller on stochastic dynamics of func-tional data. Namely, we propose techniques to include additional stochasticprocesses in the data-driven differential equation framework and to assesstheir impact in describing functional data.iiiTable of ContentsAbstract . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iiPreface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iiiTable of Contents . . . . . . . . . . . . . . . . . . . . . . . . . . . . ivList of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viList of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . viiList of Supplementary Materials . . . . . . . . . . . . . . . . . . xiAcknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . xiiDedication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 The Data Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52.1 Breeding Design . . . . . . . . . . . . . . . . . . . . . . . . . 52.2 Experimental Design . . . . . . . . . . . . . . . . . . . . . . 62.3 Data Summary . . . . . . . . . . . . . . . . . . . . . . . . . . 72.4 Previous Research and Research Objectives . . . . . . . . . . 83 Smoothing Techniques . . . . . . . . . . . . . . . . . . . . . . . 163.1 Smoothing Splines . . . . . . . . . . . . . . . . . . . . . . . . 163.1.1 Fitting Smoothing Splines . . . . . . . . . . . . . . . 173.2 Kernel Smoothing . . . . . . . . . . . . . . . . . . . . . . . . 183.3 Choice of Smoothing Parameter . . . . . . . . . . . . . . . . 204 Differential Equation Models . . . . . . . . . . . . . . . . . . 264.1 Parametric Models . . . . . . . . . . . . . . . . . . . . . . . . 264.2 Modeling Dynamics . . . . . . . . . . . . . . . . . . . . . . . 27ivTable of Contents4.3 Instantaneous Dynamics . . . . . . . . . . . . . . . . . . . . 294.4 Two Step Estimation Procedure . . . . . . . . . . . . . . . . 304.5 Model Comparisons . . . . . . . . . . . . . . . . . . . . . . . 325 Dynamics of Mouse Growth Data . . . . . . . . . . . . . . . 365.1 Model Fitting . . . . . . . . . . . . . . . . . . . . . . . . . . 365.1.1 Growth Rate Depending on Body Mass . . . . . . . . 365.1.2 Including Amount Eaten . . . . . . . . . . . . . . . . 405.2 Model Comparisons . . . . . . . . . . . . . . . . . . . . . . . 416 Simulation Study . . . . . . . . . . . . . . . . . . . . . . . . . . 546.1 Models for Simulated Data . . . . . . . . . . . . . . . . . . . 546.2 Simulation Results . . . . . . . . . . . . . . . . . . . . . . . . 567 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71AppendicesA Calculation of Conditional Expectation for Simulations . . 75B Calculation of Covariance Structure for Simulations . . . . 77vList of Tables5.1 The general trends in the estimated instantaneous relation-ship between body mass and growth rate for each of the eightgroups as seen in the subdirectory /1D Plots of the digitalappendix. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 395.2 The p-values for each of the eight groups, resulting from thepermutation test from Section 4.5 to test the null hypoth-esis that the amount eaten at week t does not significantlyeffect the instantaneous relationship between body mass andgrowth rate. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 436.1 The proportion of the null hypotheses rejected based on Sˆwhen the correlation between a and α1 is set to 0 as well asa 95% confidence interval of the expected proportion. . . . . . 576.2 The proportion of null hypotheses rejected based on Sˆ, atsignificance level of 0.05, their standard errors and a 95%confidence interval for the expected proportion rejected. . . . 586.3 The total number (across all tj) of times the null hypothesesare rejected for a significance level of 0.05 at each correlationlevel. The linear and nonparametric columns correspond totest (6.6) and (6.4) respectively. . . . . . . . . . . . . . . . . . 60viList of Figures2.1 A schematic depicting the initial breeding and formation ofthe eight genetic lines. . . . . . . . . . . . . . . . . . . . . . . 92.2 A schematic depicting the formation of generation i+ 1 fromgeneration i. This figure corresponds to one of the four se-lected lines. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102.3 The body mass (grams) of all 292 mice as a function of age(weeks). The missing slices correspond to weeks when thebody mass was not recorded. . . . . . . . . . . . . . . . . . . 112.4 The body masses (grams) of the 292 mice as a function of age(weeks), organized into each of the eight groups. The missingslices correspond to weeks when the mass was not recorded. . 112.5 The point-wise standard deviations of the body masses (grams)as a function of age (weeks). . . . . . . . . . . . . . . . . . . . 122.6 The point-wise standard deviations of the body masses (grams)for each of the eight groups as a function of age (weeks). . . . 122.7 The amount eaten (grams) by the 292 mice as a function ofage (weeks). . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132.8 The amount eaten (grams) as a function of age (weeks), foreach of the eight groups. . . . . . . . . . . . . . . . . . . . . . 142.9 The point-wise standard deviations of the amount eaten (grams)for the 292 mice as a function of age (weeks). . . . . . . . . . 142.10 The point-wise standard deviations of the weekly amounteaten (grams) for each of the eight groups as a function ofage (weeks). . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153.1 An example of a cubic smoothing spline fit to 46 data points,where the smoothing parameter has been selected by leave-one-out cross-validation. . . . . . . . . . . . . . . . . . . . . . 22viiList of Figures3.2 An example of two cubic smoothing splines fit to 46 datapoints, with different degrees of freedom (smoothing param-eters). The dashed line corresponds to using 40 degrees offreedom, while the solid line corresponds to using 10. . . . . . 233.3 An example of Nadaraya-Watson (left) and Local Linear (right)estimates of a regression function, based on 81 data points.A standard normal kernel with bandwidth 5 is used. . . . . . 243.4 An example of a contour plots of a bivariate local linear esti-mate of m using a multiplicative Gaussian kernel with stan-dard deviations 1.3 and 3.75. Plot (a) shows the estimate ofm evaluated at the data points. Plot (b) shows the estimateof m evaluated on a 20× 20 grid of evaluation points derivedfrom data (bottom). . . . . . . . . . . . . . . . . . . . . . . . 255.1 The smoothed body masses (grams) for each of the eightgroups of mice as a function of age (weeks). . . . . . . . . . . 445.2 The estimated growth rates (grams/week) for each of theeight groups of mice as a function of age (weeks). . . . . . . . 455.3 An example of the estimated deterministic component, fˆ(t, ·),as a function of body mass (grams) for the active males fromthe control group at weeks 5, 30, 55, and 80. . . . . . . . . . 465.4 Contour plot of the nonparametric estimate, fˆ(t, x), for theactive male mice from the control group. The x-axis corre-sponds to age (weeks), while the y-axis is body mass (grams). 475.5 The estimated relationship, given by fˆ(t, x), between bodymass (grams) and growth rate (grams/week) for each of theeight groups. The increase in darkness of the curves indicatesan increase in age. . . . . . . . . . . . . . . . . . . . . . . . . 485.6 The amount eaten (grams) as a function of age (weeks), or-ganized into each of the eight groups. This is the same asFigure 2.8, with the exception that here the outliers havebeen removed. . . . . . . . . . . . . . . . . . . . . . . . . . . . 495.7 A sample of four contour plots, at weeks 5, 30, 55 and 80,from the control males who were active. The x-axis indi-cates body mass (grams), while the y-axis corresponds to theweekly amount eaten (grams). The coloring is based on thevalue of fˆ(t, x, w), the conditional expected growth rate. . . . 505.8 A comparison of R2(t) (dashed) and R2W (t) (solid) as a func-tion of age (weeks) for each of the eight groups. . . . . . . . . 51viiiList of Figures5.9 A 95% bootstrap confidence interval (dashed,red) for R2W (t)for the active males in the control group. . . . . . . . . . . . . 515.10 The approximated density of Sˆ for the sedentary females fromthe control group. The red line corresponds to the observedvalue from the data, while the blue line indicates the 5th per-centile of the approximated null density of Sˆ. . . . . . . . . . 525.11 The ratio rˆ2 as a function of age (weeks) for each of the 8groups. The dashed lines represent the 5th and 95th per-centiles resulting from the permutation method described inSection 4.5. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 525.12 The point-wise p-values for each of the eight groups resultingfrom the permutation test in Section 4.5 of the null hypoth-esis that the amount eaten at week t does not significantlyeffect the instantaneous relationship between body mass andgrowth rate. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 536.1 The first eigenvector (red), and second eigenvector (blue)from the principal component analysis of the data vectors(xi1, . . . , xiJ)t for i = 1, . . . , n. These are used as ϕ1 and ϕ2in the simulation study. . . . . . . . . . . . . . . . . . . . . . 616.2 The mean body mass (grams) of the male active control mousegroup, used as µX(t) in the simulation study. . . . . . . . . . 626.3 A simulated data set of body mass (grams) is given in theleft pane, while a simulated data set of the amount eaten(grams) is given in the right. The correlation between thetwo processes has been set to 0. . . . . . . . . . . . . . . . . . 636.4 Histograms of the 500 p-values resulting from the nonpara-metric hypothesis test from Section 4.5. The correlation be-tween α1 and a is set to 0, 0.2, 0.4, 0.6 and 0.8. . . . . . . . . 646.5 The power of the test statistic, Sˆ, as a function of the corre-lation between α1 and a. Each power curve corresponds to afixed rejection level for the test. . . . . . . . . . . . . . . . . . 656.6 The point-wise power of the hypothesis test in (6.6) using astandard linear model (left) compared to the point-wise powerof the test in (6.4) using our rˆ(t)2 statistic, at a significancelevel of 5%. The x-axis indicates each of the five fixed cor-relations between a and α1. The darker the color of a point,the greater the value of tj . . . . . . . . . . . . . . . . . . . . . 66ixList of Figures6.7 Point-wise power curves of the hypothesis test in (6.6) (left)using a standard linear model, and of the test in (6.4), usingour rˆ(t)2 statistic (right) as functions of the correlation be-tween a and α1. The significance level is 5% and the darkerthe color of a line, the greater the value of tj . . . . . . . . . . 676.8 The proportion of times the null hypothesis in (6.4) is rejectedminus the proportion of times the null hypothesis in (6.6) isrejected, as a function of t. The different colors represent thedifferent correlation levels between a and α1. . . . . . . . . . 68xList of SupplementaryMaterials• 1D Plots: Scatterplot and animations of estimated values of f(t, x) forthe eight mouse groups at each t.• 2D Plots: Contour plot and animations of estimated values of f(t, x, w)for the eight mouse groups at each t.• Bootstrap: Bootstrap confidence intervals for the observed value ofR2W (t) for each of the eight mouse groups.• Contours 1D Fit: Contour plots of the estimated value of f(t, x) foreach of the eight mouse groups.• Density Curves: The estimated densities and corresponding observedvalues of Sˆ for each of the eight mouse groups.xiAcknowledgmentsFirstly, I wish to thank NSERC for providing me with partial financialsupport during my studies and for the NSERC Discovery Grant 7969.I’d like to extend my gratitude to the Department of Statistics at Uni-versity of British Columbia for their admittance and support. To the manyexcellent professors and administrative staff, without you, the opportunitiessuch as the ones presented to myself would not be possible.To Patrick Carter, thank you for allowing me to use your dataset andtaking time to discuss your research and its relation to my work.To my second reader, Ruben Zamar, I appreciate the time and effort youhave spent looking over and critiquing my works. Your valuable feedbackhas improved the quality of this thesis.Lastly, but most importantly, I want to extend a huge thank you tomy supervisor, Nancy Heckman. Without your support, encouragement,and willingness to engage in this research, it certainly would not have beenpossible. I feel fortunate to have worked with you over the last two yearsand I am very grateful for your time and effort.xiiDedicationThese works are dedicated to my Opa, Mr. Frank Fiederer. There is nogreater gift than that of an education.xiiiChapter 1IntroductionIn functional data analysis (FDA), one generally has data samples consistingof points that are assumed to come from a smooth curve or other infinitedimensional object. This can be thought of as data on N different scatter-plots, each corresponding to a curve. Data of this form arise in many ap-plications such as genetic trait modeling (Kirkpatrick and Heckman, 1989),online auction dynamics (Liu and Mu¨ller, 2009) and growth studies (Gasseret al., 1984; Verzelen et al., 2012). The standard introduction to FDA isprovided in an accessible manner by Ramsay and Silverman (2002, 2005).A common approach in FDA is to use a prespecified differential equationto model each of the curves (Cao and Ramsay 2007; Ramsay et al. 2007;Liang and Wu 2008; Cao et al. 2012). The parameters of this prespecifieddifferential equation are then estimated for each of the curves. This approachrelies on the ability to identify a differential equation that is appropriate forthe data before doing any fitting. The curve by curve method is a powerfulone in situations when the data are densely observed or when one is inter-ested in the exact dynamics of individual observations. On the other hand,in longitudinal studies where data are repeatedly observed for many sub-jects, pooling information across curves may improve estimation. Moreover,in situations where the underlying processes are stochastic, the presumedunderlying dynamics may not be well understood and thus prescribing adifferential equation can lead to poor fits. Examples of such processes canbe seen in subject specific studies of viral levels in HIV patients (Miao et al.,2009) and in auction price trajectories (Reddy and Dass, 2006). In thesesituations, alternative ways of viewing and modeling the data may improvethe analysis.This alternative is to view each curve as a realization of some unknownunderlying stochastic process (Yao et al. 2005; Liu and Mu¨ller 2009; Mu¨llerand Yang 2010; Mu¨ller and Yao 2010; Verzelen et al. 2012). From this view-point, there are no prespecified dynamics, but rather one tries to learn theunderlying dynamics from the data. This approach does not require strongassumptions on the data, often just that the underlying stochastic processadmits a Karhunen-Loe`ve expansion (Ash and Gardner, 1975). Much work1Chapter 1. Introductionhas been done in developing techniques for learning the dynamics of theseunderlying processes. Usually these techniques borrow information acrosscurves to better estimate the underlying process. When the data are sparsebut the pooled data are sufficiently dense, Yao et al. (2005) have proposedPrincipal Component Analysis through Conditional Expectation (PACE).This method provides a structure for estimation of covariances relating tothe underlying process, X, for instance, for estimating the covariance be-tween X(t) and X ′(s) or between X(t) and X(s). It should also be notedthat the PACE method is in contrast to approaches such as functional re-gression (Ramsay and Dalzell, 1991), where the entire process is includedin the modeling of the entire response. The applications are plentiful andinclude areas such as yeast cell cycles (Yao et al., 2005) and online auctiondynamics (Liu and Mu¨ller, 2009).Often, a relationship of particular interest is that between X ′(t) andX(t). For Gaussian processes, the idea of a dynamic transfer function hasbeen developed to understand this relationship and to estimate E[X ′(t)|X(t)](Liu and Mu¨ller, 2009; Mu¨ller and Yang, 2010). This transfer function isdefined as β in the equationX ′(t) = µX′(t) + β(t)[X(t)− µX(t)] + Z(t) (1.1)E[X ′(t)|X(t)] = µX′(t) + β(t)[X(t)− µX(t)], (1.2)where X(t) and Z(t) are independent. Mu¨ller and Yao (2010) show that theGaussianity assumption guarantees the existence of such a transfer function,and of a first order linear differential equation satisfied by each observedtrajectory. Verzelen et al. (2012) have extended this work to non-Gaussianprocesses. They show that each trajectory of a smooth stochastic processsatisfies a first order nonlinear differential equation given byX ′(t) = f(t,X(t)) + Z(t) (1.3)E{X ′(t)|X(t)} = f(t,X(t))and provide a two-step smoothing procedure to estimate f . In the Gaussiancase, f(t,X(t)) reduces to µX′(t) + β(t)[X(t) − µX(t)] from (1.1). Thatbeing said, as Heckman (2010) has pointed out, this conditional expectationdoes not give us the exact underlying differential equation nor the behaviorof the process, X, it only provides a way to study the relationship betweenX ′(t) and X(s).To date, the work on instantaneous dynamics has only included a singleprocess in the conditional expectation given in (1.3). We propose an addition2Chapter 1. Introductionto the model in (1.3) that allows for a second process, W (·), where W (t) isthought to have an influence on X ′(t). We then extend the work of Mu¨llerand Yao (2010) and Verzelen et al. (2012) for determining the domains whereone model explains the variation in X ′(t) significantly better than another.In our case, we compare the model that includes W (t) to that which doesnot. To determine this, we formulate a hypothesis test and a permutationapproach to calculating its significance level.As mentioned previously, FDA, and in particular the approach of Yaoet al. (2005) for longitudinal data, lends itself well to growth studies (Gasseret al., 1984; Verzelen et al., 2012). In continuing with this theme, we applythe two-step smoothing procedure of Verzelen et al. (2012) and our subse-quent extensions of this to data comprised of observations on eight distinctgroups of mice. These observations include the weekly body masses of themice, as well as their weekly amounts of food eaten. The eight distinctgroups are characterized by selective breeding (yes/no), access to an exer-cise wheel (yes/no) and gender (male/female). It is of biological and evolu-tionary interest as to how the instantaneous dynamics between body massand growth rate differ between each of these groups. With this in mind,we first estimate the growth rates of each mouse and then use the approachof Verzelen et al. (2012) to estimate the relationship between body mass atweek t and growth rate at week t. Further, we apply our new methods forincluding the additional stochastic process, W , corresponding to the weeklyamount eaten, in model (1.3). This allows us to determine for which of theeight groups and during which weeks, the amount eaten in a week signifi-cantly effects the instantaneous relationship between body mass and growthrate.The remainder of the thesis is organized as follows. In Chapter 2, wegive a detailed description of the mouse data set, including the breeding andexperimental design as well as some exploratory analysis and observations.Chapter 3 reviews foundational smoothing techniques such as smoothingsplines and kernel smoothing, as these are essential for the subsequent anal-yses. We also address some standard techniques for choosing the smoothingparameters. The topic of FDA and in particular the contrasts between thecurve by curve approach and that of learning the differential equation fromthe observed data is discussed in greater detail in Sections 4.1 and 4.2. Inthe remainder of Chapter 4 we formulate the models, model fitting proce-dures and model comparison techniques proposed by Mu¨ller and Yao (2010)and Verzelen et al. (2012), as well as our extensions. The analysis of themouse data introduced in Chapter 2 is described in Chapter 5, along withthe presentation and discussion of the results. Finally, Chapter 6 provides3Chapter 1. Introductionthe method and results from a simulation study to determine the statisticalproperties of our nonparametric testing method for determining the signifi-cance of W (t) in the relationship between X ′(t) and X(t), while Chapter 7provides concluding remarks.4Chapter 2The Data SetThe data set, first described in Swallow et al. (1998), was provided by Pro-fessor Patrick Carter, School of Biological Sciences, Washington State Uni-versity. The data are comprised of weekly measurements taken from 320house mice (Mus domesticus), who have been housed individually in a labo-ratory setting over a period of 80 weeks. Only 292 of these mice are includedin the subsequent analysis, as the remaining 28 died early.2.1 Breeding DesignThe lines of house mice used here are from replicate lines selected for 16generations. The breeding resulted in eight closed genetic lines. These lineswere established as follows (Swallow et al., 1998). A set of 224 mice (112 ofeach male and female) were purchased from Harlan Sprague Dawley, Indi-anapolis. This initial group of mice was paired for breeding, with the excep-tion that sibling mating was disallowed. This resulted in approximately 112litters. From each of these litters, one male and one female were randomlyselected, thus resulting in approximately 224 mice, referred to as generation−1. The mice from generation -1 were then randomly paired, again withsibling pairings disallowed. From these generation −1 pairs, eight lines wereformed by randomly selecting 10 pairs for each line. The eight lines wererandomly split into two groups of four (selection and control). The offspringof the chosen generation −1 pairs were designated generation 0. Figure 2.1gives a schematic of the above description of breeding up to generation 0.Within each line, from each of generations 0-9, 13 males and 13 femaleswere chosen to produce the next generation. At each generation, these 26breeder mice were selected at age 10 weeks. The 13 males and 13 femaleswere randomly paired to breed, with the condition that no pair were siblings.The first 10 litters with two pups of each sex were used to maintain the line.The 13 pairs, rather than just 10, were used to ensure that there would beat least 10 litters with two pups of each sex.In each generation, the 13 pairs of breeding mice in a selection line werechosen based on the average number of wheel revolutions run on days 5 and52.2. Experimental Design6 on an activity wheel. From each of the 10 families, the highest runningmale and female were selected to breed. To make up the remaining 3 pairsof the 13 required, three additional males and females were chosen. Thesesix mice were chosen based on being the second highest runners from thefamilies with the highest running totals, with the condition that no two ofthe six additional mice were siblings. These 26 mice were randomly paired tobreed the next generation as described in the preceding paragraph. Figure2.2 provides a schematic of how (for the selected lines) generation i producesgeneration i+ 1.In each generation, the 13 pairs of breeding mice each control line werechosen randomly as follows. One male and one female mouse were randomlyselected from each of the 10 families. Then an additional 3 males and 3females were randomly chosen, with the condition that no two of the sixadditional mice were siblings. These 13 males and 13 females were thenrandomly paired for breeding, again with the condition of no pair beingmade up of siblings.2.2 Experimental DesignOur analysis is based on observations made from generation 15 of the abovebreeding design. As described in the preceding section, of the eight geneticlines, four are control and four are selected for wheel running. From theseeight lines, 5 pairs of breeding mice were selected to produce the next gen-eration. For each of the five families within each line, two mice of each sexwere assigned to be active, i.e, have access to a running wheel, and two miceof each sex were to be sedentary (Theodore J. Morgan, 2003). This dividesthe resulting 320 mice into two groups of 160: active and sedentary. Thesetwo groups can be further partitioned by the sex and line type (1-8) of themice. The mice are thus divided into 32 balanced categories. In our analysis,when only 292 mice are used, these 32 categories have sizes of approximately8-12 mice.Weekly measurements were taken for each mouse over a period of 80weeks. These measurements include the body mass of the mouse (grams),the amount eaten in the last week (grams) and (for the active group) thenumber of revolutions ran in the last week. At the end of each week, eachmouse was weighed and the amount of food left in the bowl was measured.The amount eaten was calculated as the difference between the food in thebowl at the beginning of the week and the amount remaining in the bowl atthe end. This could be subject to error, as some mice would bury or dispose62.3. Data Summaryof food without actually eating it.For the purpose of our analysis, we ignore the dependence within the 8lines, thus treating the mice as independent. Therefore, we consider cate-gories formed by sex, the presence of an exercise wheel and whether or not amouse is selectively bred or not. This results in eight groups, each comprisedof 30-40 mice, where each group has roughly comparable size.2.3 Data SummaryFigures 2.3 and 2.4 show the body masses over the 80 weeks for the 292mice, and for each of the eight groups, respectively. Figure 2.5 shows thepoint-wise standard deviations of the body masses of the 292 mice. Clearly,as the mice age there is greater variation in the body masses. This patternin the variation is also evident within each group, as shown in Figure 2.6.There are some weeks when the body mass was not observed, typicallycorresponding to holidays. This can be seen by the missing slices in Figures2.3 and 2.4. In the original data set, the researchers who collected the datahad imputed these data points with the proceeding week’s observations.With the permission of researcher Patrick Carter, we treat the first k − 1of k consecutive identical values as missing. On average, it appears thatthe sedentary mice are heavier for both males and females. Also, for bothgenders, those mice that were not selectively bred seem to be heavier.Figures 2.7 and 2.8 show the amount eaten over the 80 weeks for the 292mice, and for each of the eight groups, respectively. Each of the eight groupsappears to show similar patterns, although the female sedentary mice exhibitgreater variation. Figure 2.9 shows that the point-wise standard deviationsof the amount eaten are fairly uniform, particularly after the twentieth week.This pattern is similar within each group, as seen in Figure 2.10. There arealso some substantial outliers evident in Figures 2.7 and 2.8. For example,in the lower left panel of Figure 2.8, there is a single point at 100 grams,while every other measurement is below 75 grams. These outliers explainthe spikes in the point-wise standard deviations, seen in Figure 2.10. Afterdiscussion with Professor Carter, this and three other similar points weredetermined to be inaccurate measurements or were due to food being wastedwithout the researcher who collected the data knowing and thus we treatedthem as missing values. There was also a negative measurement on an activemale, in the selected group, which we also replaced by a missing value.72.4. Previous Research and Research Objectives2.4 Previous Research and Research ObjectivesThe data collected from the mice of generation 10 from the breeding designdescribed in Section 2.1 have been used to study a variety of genetic andevolutionary traits. Natural questions concern things such as how the bodymass, amount of energy used and food consumption vary between groups(Koteja et al. 1999; Koteja et al. 2001). For example, it was found that theselected mice from generation 10 ran 70% more total revolutions per daythan their control counterparts and that overall, males ran less than females(Koteja et al., 1999). Other work has addressed topics such as the variationin the amount of food wasted between the groups (Koteja et al., 2003), whereit was found that there were significant differences in the amount of foodwasted between replicate lines, but not between the selected and controlgroups. In most of these studies, it is of interest to compare the selectedmice to those that are randomly bred and the active mice to the sedentary.In general, the primary focus is not on between gender comparisons as thesedo not provide as many conclusions about the evolutionary process of themice.Our objective for the data is two-fold. We treat both the body mass andthe amount eaten as being governed by underlying stochastic processes. Wefirst aim to explore the relationship between growth rate at a given age, t,and body mass at t for each of the eight groups. This is in contrast to usinga more complex historical model that includes all the information about thebody mass up to age t in order to study the growth rate. After estimatingthis relationship, we try to draw some general conclusions as well as makecomparisons between the groups. The second objective is to explore how therelationship between body mass at age t and growth rate at age t is changedby including the amount eaten at t in the model. This allows us to determinein which of the groups the amount eaten at t contributes significantly toexplaining the growth rate at t, providing a better understanding of theunderlying biological traits.82.4. Previous Research and Research ObjectivesFigure 2.1: A schematic depicting the initial breeding and formation of theeight genetic lines.92.4. Previous Research and Research ObjectivesFigure 2.2: A schematic depicting the formation of generation i + 1 fromgeneration i. This figure corresponds to one of the four selected lines.102.4. Previous Research and Research ObjectivesFigure 2.3: The body mass (grams) of all 292 mice as a function of age(weeks). The missing slices correspond to weeks when the body mass wasnot recorded.Figure 2.4: The body masses (grams) of the 292 mice as a function ofage (weeks), organized into each of the eight groups. The missing slicescorrespond to weeks when the mass was not recorded.112.4. Previous Research and Research ObjectivesFigure 2.5: The point-wise standard deviations of the body masses (grams)as a function of age (weeks).Figure 2.6: The point-wise standard deviations of the body masses (grams)for each of the eight groups as a function of age (weeks).122.4. Previous Research and Research ObjectivesFigure 2.7: The amount eaten (grams) by the 292 mice as a function of age(weeks).132.4. Previous Research and Research ObjectivesFigure 2.8: The amount eaten (grams) as a function of age (weeks), for eachof the eight groups.Figure 2.9: The point-wise standard deviations of the amount eaten (grams)for the 292 mice as a function of age (weeks).142.4. Previous Research and Research ObjectivesFigure 2.10: The point-wise standard deviations of the weekly amount eaten(grams) for each of the eight groups as a function of age (weeks).15Chapter 3Smoothing TechniquesConsider a set of regression points (x1, y1), (x2, y2), . . . , (xn, yn) such thatxi ∈ [a, b] for all i and suppose thatyi = m(xi) + i,where i are i.i.d with E(i) = 0, Var(i) = σ2.If instead we have regression points, (x1, w1, y1), . . . , (xn, wn, yn), then theprevious model is adapted toyi = m(xi, wi) + i,where i are i.i.d with E(i) = 0, Var(i) = σ2.This can be further generalized to include more covariates, but we restrictour focus to the univariate and bivariate cases.Smoothing techniques attempt to capture trends in the data by providingan estimate of the function m. These estimates of m are obtained in such away that the noise in the data is reduced. Smoothing techniques are typicallyused to extract information from the data, while providing flexibility androbustness.In contrast to parametric regression, smoothing does not require anypredetermined assumptions about the form of the relationship between theresponse and the explanatory variables. Instead, this relationship is deter-mined completely by the information from the data. For this reason, largersample sizes are typically needed for smoothing, when compared with thosefor parametric regression. In this section we will discuss two specific typesof smoothing: smoothing splines and kernel smoothing.3.1 Smoothing SplinesDefinition: φ : [a, b] → R is a spline of degree p, with interior knots t1 <· · · < tn, where a < t1 and tn < b, if:(1) the restrictions of φ to the intervals [a, t1], [ti, ti+1] and [tn, b] are poly-nomials of degree p for i = 1, . . . , n,163.1. Smoothing Splines(2) φ(·) is a p − 1 continuously differentiable function at the points ti, fori = 1, . . . , n.In the context of regression data, (x1, y1), (x2, y2), . . . , (xn, yn) such thatxi ∈ [a, b] and xi < xi+1 for all i, we are usually interested in minimizingthe mean squared error (MSE),1nn∑i=1(yi −m(xi))2, (3.1)with respect to m. Now, suppose we want to penalize m based on certaincharacteristics. Then we can include a penalty term in (3.1) and minimize:G(m,λ) = 1nn∑i=1(yi −m(xi))2 + λP (m), (3.2)for a fixed λ. We call λ the smoothing parameter and P the penalty function.This criterion indicates that we are willing to accept an estimate of m thatincreases the MSE if it also reduces the penalty. A common choice is topenalize m based on its curvature; that is, P (m) =∫ ba [m′′(x)]2dx (Hastieand Tibshirani, 1990). In this case, it can be shown that the minimizer ofG is a natural cubic spline with interior knots x1, . . . , xn. These naturalcubic splines are cubic splines that are subject to the condition∫ x1a m′′(x) =∫ bxnm′′(x) = 0 and are thus linear on the intervals [a, x1] and [xn, b]. SeeFigures 3.1 and 3.2 for examples of cubic splines fit to data.To better understand the penalized optimization problem in (3.2), weconsider the extreme cases of λ. One can show that, when λ → ∞, theminimizer of (3.2), mˆλ, approaches the least squares line. Heuristically, forincreasing λ we must have P (mˆλ) tending to 0 and the problem reducesto minimizing (3.1) with mˆλ restricted to having P (mˆλ) = 0; that is, mˆλrestricted to a line. On the other hand, if λ = 0, then we can choose mˆλso that the first term in (3.2) is 0 by setting mˆλ to be the interpolatingspline that passes through each data point. For intermediate values of λ,the minimizing function mˆλ is a compromise between the least squares lineand the interpolating function.3.1.1 Fitting Smoothing SplinesWe now focus on how a smoothing spline can be fit to regression data. Con-sider the case where the penalty function in (3.2) is given by the curvatureof m. Then, as stated before, the minimizer in (3.2) is a natural cubic173.2. Kernel Smoothingspline. Our input data, x1, x2, . . . , xn, are the interior knots of the cubicspline and give us n + 1 segments of the interval [a, b]. Therefore, it ap-pears we need to determine four coefficients for each segment, for a total of4(n+ 1) coefficients. Fortunately, our natural cubic splines are smooth andtwice continuously differentiable on [a, b] and linear on the intervals [a, x1]and [xn, b]. This places restrictions on the coefficients and thus reduces thenumber of basis functions needed to n. Thus, the dimension of the space ofnatural cubic splines with interior knots x1, . . . , xn is equal to n. Denote aset of basis functions as {Bi, i = 1, . . . , n}. We can now write our naturalcubic spline asm(x) =n∑i=1αiBi(x).Typical choices for the Bi’s include the truncated power basis or theB-spline basis. Notice that we have reduced our infinite dimensional class ofm’s to an n-dimensional model that is linear in the Bi’s. We can thus rewriteour objective function from (3.2) in matrix form. Setting (B)ij = Bj(xi),this yields:G(m,λ) = 1n(y −Bα)T (y −Bα) + λαTCα, (3.3)where y = (y1, . . . , yn)t, α = (α1, . . . , αn)t and (C)ij =∫ ba B′′i (x)B′′j (x)dxcontains the curvature information of the basis functions. Setting the deriva-tive of G with respect to α equal to zero and solving yieldsαˆ = (BTB + nλC)−1BTy,which is the unique minimizer of G providedBTB+nλC is positive definite.This minimizing αˆ yields the following fitted values for yi,yˆ = B(BTB + nλC)−1BTy = Sλy,where Sλ is called the smoothing or hat matrix. There are various ways tochoose the smoothing parameter, λ, that will be discussed in Section 3.3.Finally, the above discussion is restricted to univariate splines, as these areall that is required in our application. In more complex situations, one mayrequire, for example, bivariate splines.3.2 Kernel SmoothingAn alternative method to smoothing splines is to smooth the data via Kernelsmoothing. Kernel smoothing is typically used in two different applications.183.2. Kernel SmoothingThe first is for obtaining a density estimate from a sample. The second isfor investigating a regression relationship, such as the one outlined at thebeginning of this chapter. We will focus our proceeding discussion on thelatter case.For both density estimation and regression, the smoothing is done via akernel function, K(·;h) (Wand and Jones, 1995). We restrict K(·;h) to be asymmetric density function with mean 0. The scale parameter, h, controlsthe degree of smoothing and can sometimes be thought of as the standarddeviation of the random variable with density equal to the kernel function.Some typical choices for K are the Gaussian, the Epanechnikov and the boxkernels.The simplest approach to nonparametric kernel regression is to adopt alocal mean approach. The estimate at an evaluation point z is given bym̂h(z) =∑ni=1K(xi − z;h)yi∑ni=1K(xi − z;h).This is referred to as the Nadaraya-Watson estimate (Nadaraya, 1964; Wat-son, 1964). More generally, for a fixed p, the βˆ0, βˆ1, . . . , βˆp that minimizen∑i=1[yi − β0 + β1(xi − z) + · · ·+ βp(xi − z)p]2K(xi − z;h)provide the pth degree polynomial estimates of m and its derivatives eval-uated at z. Namely, the estimates are given by m̂(i)h(z) = i!βˆi(z) fori = 0, . . . , p. Note that when p = 0 the Nadaraya-Watson estimate is recov-ered. Setting p = 1 is a common choice, and is referred to as local-linearestimation. The local linear approach is good when used to estimate m,but higher order polynomials are recommended for obtaining estimates ofm’s derivatives (Wand and Jones, 1995). Figure 3.3 gives an example ofa Nadaraya-Watson (left pane) and local linear (right pane) estimate of mbased on simulated regression data.If we have two real-valued covariates, x and w, and data (xi, wi) fori = 1, . . . , n, then the local mean estimator of m is given bym̂h1,h2(z1, z2) =∑ni=1K(xi − z1, wi − z2;h1, h2)yi∑ni=1K(xi − z1, wi − z2;h1, h2)and the local linear estimate of m(z1, z2) is the value of β0 gotten from theleast squares criterionminβ0,β1,γ1n∑i=1[yi − β0 + β1(xi − z1) + γ1(wi − z2)]2K(xi − z1, wi − z2;h1, h2).193.3. Choice of Smoothing ParameterThe multivariate kernel, K, can take many forms. Here, we restrict ourmultivariate kernels to those of the multiplicative formK(xi − z1, wi − z2;h1, h2) = K1(xi − z1;h1)K2(wi − z2;h2), (3.4)where K1 and K2 are univariate kernels. See Figure 3.4 for an example of alocal linear bivariate smooth.In both the univariate and bivariate case, m̂ can be seen as a weightedaverage of the yi’s. These weights depend on the choice of the kernel func-tion, bandwidth and the proximity of the evaluation points to the data.Furthermore, as the minimizing criterion is a least squares problem andthus quadratic, we have that yˆ is linear in y = (y1, . . . , yn)t. Therefore, asin the case of smoothing splines, there exists a smoothing (hat) matrix, Shsuch that yˆ = Shy.3.3 Choice of Smoothing ParameterIn both of the previous sections our estimates of m depend on a value whichdetermines the degree to which our estimating function smooths the data. Inthe case of smoothing splines, the smoothing parameter, λ, controls smooth-ing. Likewise, for kernel smoothing, the bandwidth, h, controls smoothing.Choosing these parameters is important as a poor choice can lead to over-or under-smoothing of the data.In both smoothing splines and kernel smoothing, leave-one-out cross-validation (Hastie et al., 2001) is often used to determine λ, h or (h1, h2).This algorithm can be described as follows for choosing λ. Choosing h or(h1, h2) is similar. For each value of i = 1, . . . , n, the ith data point is leftout of the fitting and is then predicted with the resulting natural spline. Theprediction error, e∗i (λ) = yi − yˆ(−i)i , for this data point is then computed.We define the cross-validation function as the sum of the squared predictionerrors, CV (λ) =∑ni=1[e∗i (λ)]2. The λ with the lowest CV is chosen as thesmoothing parameter.Fixing the trace of the smoothing matrix is another method to specifythe smoothing parameter or bandwidth. Recall, that in both smoothingsplines and kernel smoothing, the fitted values could be written as yˆ = Sy,for some matrix S that depends on λ or h. Notice that this is a similarframework to that of linear regression, where the hat matrix, H, plays therole of S in determining yˆ. In this case, the trace of H is called the degreesof freedom and is equal to the number of parameters in the regression model.Similarly, for smoothing splines and kernel smoothing, we define the degrees203.3. Choice of Smoothing Parameterof freedom to be the trace of S. Since the smoothing matrix S dependson the smoothing parameter, by fixing trace(S) =∑ni=1 Sii to some desiredvalue, the smoothing parameter can then be computed numerically. AsSλ is an n × n matrix, the degrees of freedom are equal to the sum ofSλ’s eigenvalues. In general, the lower the degrees of freedom the greaterthe degree of smoothing. For an example of this, see the two splines inFigure 3.2. Finally one possible advantage to tuning the degrees of freedomas opposed to for example, cross validation, is that one may get betterperformance for approximating derivatives. Moreover, if one wishes to havethe same amount of smoothing over many curves (for example, mice bodymasses), tuning via the degrees of freedom provides a way to ensure this.There are other methods, such as ones based on the Akaike InformationCriterion (Sakamoto et al., 1986), that can also be used for determining thesmoothing parameters. In particular, for local polynomial kernel regressionthe so called “plug in” method is often used (Wand and Jones, 1995). Theplug in method uses the idea of plugging in estimates of the unknown pa-rameters in the expression of the asymptotically optimal bandwidth. Thisasymptotically optimal bandwidth is the one which optimizes the asymptoticmean square error. Finding estimates for the unknown parameters in theformula for the optimal bandwidth can be challenging. These estimates areoften found using kernel smoothing, thus requiring their own bandwidths,which must be chosen. This hierarchal structure and its properties havebeen explored in detail by Hall et al. (1992).213.3. Choice of Smoothing ParameterFigure 3.1: An example of a cubic smoothing spline fit to 46 data points,where the smoothing parameter has been selected by leave-one-out cross-validation.223.3. Choice of Smoothing ParameterFigure 3.2: An example of two cubic smoothing splines fit to 46 data points,with different degrees of freedom (smoothing parameters). The dashed linecorresponds to using 40 degrees of freedom, while the solid line correspondsto using 10.233.3. Choice of Smoothing ParameterFigure 3.3: An example of Nadaraya-Watson (left) and Local Linear (right)estimates of a regression function, based on 81 data points. A standardnormal kernel with bandwidth 5 is used.243.3. Choice of Smoothing Parameter(a)(b)Figure 3.4: An example of a contour plots of a bivariate local linear estimateof m using a multiplicative Gaussian kernel with standard deviations 1.3 and3.75. Plot (a) shows the estimate of m evaluated at the data points. Plot(b) shows the estimate of m evaluated on a 20×20 grid of evaluation pointsderived from data (bottom).25Chapter 4Differential Equation ModelsDifferential equations can be used to model processes encountered in a vari-ety of disciplines. The objective is typically either to estimate the trajectoryspecific parameters of the differential equation or to understand the govern-ing dynamics of the underlying process from the observed data.4.1 Parametric ModelsIn many previous works, (Ramsay and Silverman 2005; Cao and Ramsay2007; Cao et al. 2012) a previously specified nonrandom differential equa-tion is assumed to describe the underlying process, X(·). The goal is thento estimate the parameters of this differential equation from the observeddata, denoted (tj , Yj) for j = 1, . . . , J . The Yj ’s are assumed to be a noisyrealization of the underlying processYj = Y (tj) = X(tj) + j , (4.1)where j ’s are mean zero independent identically distributed random vari-ables with Var(j) = σ2. Formally, the aim is to minimize the followingpenalized least squares problemJ (c|β, λ) =n∑i=1[Y (tj)−X(tj)]2 + λ∫[LβX(t)]2dt,where c is the vector of coefficients from a basis function expansion of X(·)and Lβ is a differential operator depending on the unknown parameter vec-tor, β. We define a differential operator as in Ramsay et al. (1997) to beLβX(t) =m−1∑j=0βjDjX(t) +DmX(t),where the notation DjX(t) indicates the jth derivative of X(t). One reasonto use a general linear operator, instead of a more specific term such as the264.2. Modeling Dynamicscurvature, is to be able to penalize X in more accurately. For example,if we know that locally X satisfies some differential equation, then we maywish to use a differential operator that penalizes departure from this specificequation.Much previous work has been done to estimate the parameters of theabove minimization problem. Heckman and Ramsay (2000) jointly estimatec and β but find the estimators to be unsatisfactory. Cao and Ramsay (2007)modify this criterion and propose a two step estimation method. They firstminimize J , for a fixed β, with respect to the coefficient vector, c, yieldingcˆ(β). An un-penalized criterion involving cˆ(β) is then used to determinethe optimum β. It is important to note that this method is not iterative.The trajectory-wise approach described above can be generalized to sit-uations where there is more than one observed trajectory of X(·). In thesecases, often each trajectory is modeled via the same parametric differentialequation but with a different parameter vector, β, which is treated as a ran-dom effect. Alternatively, one can estimate the parameter vector separatelyfor each observed trajectory, as described above.4.2 Modeling DynamicsAs described in the preceding section, the trajectory-wise approach to fittinga differential equation model relies on the ability to specify a differentialequation that describes the underlying process. If the underlying processis not well understood then this may not always be practical. Conversely,if many realizations of the underlying process are available, one can tryto learn the differential equation directly from these trajectories (Liu andMu¨ller, 2009; Mu¨ller and Yang, 2010; Verzelen et al., 2012). Specifically,given n realizations, X1, . . . , Xn, of a stochastic process X on a domainT , we assume that we observe the noisy measurements of the process, Yi,according toYi(tij) = Xi(tij) + ij , j = 1, . . . , Ji, (4.2)where the ij ’s are mean zero, independent identically distributed randomvariables with Var(ij) = σ2.The method of attempting to learn the differential equation from thedata uses observations from all n realizations together to estimate the under-lying dynamics, rather than estimating the parameter of the aforementionedprespecified differential operator on a path by path basis. This method isparticularly powerful in situations when data are sparsely observed for eachrealization of X. In these cases, estimation of these individual trajectories274.2. Modeling Dynamicsis difficult, and more so for their derivatives. The borrowing of informationfrom each trajectory can often aid in this estimation. For such sparse situa-tions, Yao et al. (2005) propose the use of Functional Principle ComponentAnalysis through Conditional Expectation (PACE). The PACE method hasbeen applied to Gaussian processes (Mu¨ller and Yang, 2010) and to onlineauction dynamics (Mu¨ller and Yao, 2010).In longitudinal studies, one may be interested in relating X(t) or X ′(t)to X(s) for a fixed s ≤ t. Studying these relationships can provide insightinto typically unknown mechanisms governing the observations. For Gaus-sian processes, Liu and Mu¨ller (2009) have proposed the use of dynamictransfer functions to provide an estimate of the influence that X(t) has onX(ν)(t) for ν > 0. These transfer functions can be related to the conditionalexpectation of X(ν)(t) given X(t). Mu¨ller and Yang (2010) have extendedthe idea of transfer functions for Gaussian processes, to allow one to predicttrends in, for example, X ′(t), based on previous levels of the process X ats. The instantaneous dynamics between X(t) and X ′(t) will be the focusof Sections 4.3-4.5 but before proceeding, we reiterate the methodologicaldifferences between the trajectory wise approach described in Section 4.1and the alternative of learning the underlying process from the data, asdescribed at the start of this section.The two predominant differences between these two approaches are inhow the data are used. When the governing differential equation can bereasonably assumed, each of the observed trajectories can be analyzed sep-arately (Ramsay and Silverman 2005; Cao and Ramsay 2007; Cao et al.2012). In contrast, when the underlying dynamics are to be inferred fromthe data, then for each t, all observations from the n realizations of X canbe used to estimate the underlying process at t (Yao et al. 2005; Liu andMu¨ller 2009; Mu¨ller and Yang 2010; Mu¨ller and Yao 2010; Verzelen et al.2012). Essentially, for the approach described in Section 4.1, an equationis first specified and then the unknown parameters are estimated. On theother hand, when trying to learn the differential equation from the data,one estimates the overlying equations from the observations. Furthermore,when each trajectory is analyzed individually, typically the entire trajectoryis to be estimated (Ramsay and Silverman 2005; Cao and Ramsay 2007; Caoet al. 2012), while when information is borrowed from across the observedtrajectories, the estimation is “local” in t. Indeed, with this local approach,the focus may be on studying specific time intervals (Mu¨ller and Yao, 2010;Verzelen et al., 2012).284.3. Instantaneous Dynamics4.3 Instantaneous DynamicsIn many applications, estimation of the instantaneous relationship betweenX(t) and X ′(t) is of interest. More precisely, for a fixed time, determininghow the value of the stochastic process is effecting its rate of change canbe informative. For example, in growth studies, one may wish to infer howthe current body mass of an individual at age t is related to the individual’sgrowth rate at age t. Likewise, in finance, one may wish to infer the directionof a stock price based on its current valuation. As mentioned in Section4.2, for Gaussian processes, the use of transfer functions to describe theserelationships has been explored (Liu and Mu¨ller, 2009; Mu¨ller and Yang,2010). Verzelen et al. (2012) extended this work to non-Gaussian processes,proposing a two step kernel estimation procedure. To our knowledge, afurther generalization to include the effect of additional stochastic processeson these instantaneous dynamics has not yet been studied.This section is comprised by first reviewing how the rate of change canbe partitioned into deterministic and random components and by examiningsome specific cases of this decomposition, as described in Verzelen et al.(2012). We then conclude by discussing a similar decomposition to accountfor the influence of an additional stochastic process.Consider a differentiable stochastic process, X, on a domain, T , suchthat X and X ′ have finite variance. Provided E[X ′(t)] exists, we can de-compose X ′(t) asX ′(t) = E{X ′(t)|X(t)}+ Z(t) where Z(t) = X ′(t)− E{X ′(t)|X(t)}.Note that we always have E{Z(t)|X(t)} = 0 and Cov(Z(t), E{X ′(t)|X(t)}) =0 almost surely. Moreover, we can write E{X ′(t)|X(t)} = f(t,X(t)) forsome function f . This allows us to write the nonlinear differential equationX ′(t) = f(t,X(t)) + Z(t) (4.3)f(t,X(t)) = E{X ′(t)|X(t)}We refer to f as the deterministic part of the equation and Z as the stochasticpart. When f is unknown and must be estimated from the data, we referto the first equation in (4.3) as a data-driven differential equation. In someapplications, f may be time independent, in which case (4.3) is referred toas an autonomous system (Verzelen et al., 2012).The simplest case of (4.3) is when f is just the mean function of X ′ andso X ′ and X are uncorrelated. In this case, (4.3) reduces toX ′(t) = µX′(t) + Z(t). (4.4)294.4. Two Step Estimation ProcedureIf X is a Gaussian process, then it can be shown that f(t,X(t)) is of theform µX′(t) + β(t)[X(t)− µX(t)] and thus only µX′(t) and β(t) need to beestimated. In this case, β is referred to as the transfer function (Liu andMu¨ller, 2009; Mu¨ller and Yang, 2010). Here, (4.3) becomesX ′(t) = µX′(t) + β(t)[X(t)− µX(t)] + Z(t) (4.5)≡ µ∗(t) + β(t)X(t) + Z(t)This linear relationship may also hold for non-Gaussian processes in somecases. However, many processes are more complex than what can be cap-tured by the linear dynamics in (4.5). In these cases, Verzelen et al. (2012)propose estimating f(t,X(t)) with a two step smoothing procedure, whichis described in Section 4.4. We now conclude this section by considering anatural extension to that of the system in (4.3).While the system in (4.3) can be used to describe various types of re-lationships between X(t) and X ′(t), it leaves no room for modeling thedependence of X ′(t) on additional processes. For example, in the case ofhow growth rate is effected by an individual’s body mass, one could easilyhypothesize that the amount eaten at age t−∆, for some ∆ ≥ 0, also playsa role in the growth rate at t. In the preceding, we consider only ∆ = 0.For applications requiring the modeling of X ′(t)’s dependence on a pro-cess W (·), (4.3) can be generalized to:X ′(t) = f(t,X(t),W (t)) + ζ(t) (4.6)f(t,X(t),W (t)) = E{X ′(t)|X(t),W (t)}where ζ(t) = X ′(t)−E[X ′(t)|X(t),W (t)]. The framework that includes theprocess W (·) can be used when one suspects that a significant amount ofthe variation in X ′(t) can be explained by W (t).4.4 Two Step Estimation ProcedureRecall that in Section 4.3, we formulated three possible models to describethe instantaneous relationship between X ′(t) and X(t) (and possibly W (t)).These were given in (4.3), (4.5), and (4.6), where the notation of f and Zis reused. The linear model, (4.5), is a simple, specific case of (4.3) wheref(t,X(t)) = µ∗(t) + β(t)X(t). Thus, these models can be seen as ascendingin complexity from (4.5), to (4.3), up to the nonlinear model, (4.6), whereX ′(t) depends on X(t) and W (t) as opposed to just X(t). We now describehow each of these models can be fit, before proceeding to comparing fits inSection 4.5.304.4. Two Step Estimation ProcedureFor each of the three models, the first step is to estimate the trajectoriesXi and X′i from the raw observations, Yij , j = 1, . . . , Ji, as modeled in(4.2). For i = 1, . . . , n, we estimate the trajectory, Xi, and subsequently, itsderivative, X ′i, by the method of smoothing splines, as discussed in Section3.1. We call these estimates Xˆi and Xˆ′i, respectively. This is in slightcontrast to previous work, where convolution kernel smoothing estimatesare used (Verzelen et al., 2012).For the linear model in (4.5), one can view (Xˆi(t), Xˆ′i(t)) as regressiondata for each fixed t. Thus, a natural estimate of β(t) is the slope estimatefrom least squares linear regression at fixed t. Alternatively, one can useinformation from the whole process to estimate the best linear unbiasedpredictor (BLUP) of X ′(t) given X(t) of the form α(t) + β(t)X(t). ThisBLUP is given byβ(t) =Cov[X ′(t), X(t)]Var[X(t)].If X(t) and X ′(t) are jointly bivariate normal, then the above is the exactsolution to (4.5) (Mu¨ller and Yao, 2010). Therefore to obtain an estimate,βˆ(t), for β(t), one can estimate Cov[X ′(t), X(t)] and Var[X(t)]. One way todo this is by estimating the covariance function of X(·) in order to estimatethe eigenfunctions from X’s Karhunen-Loe´ve expansion (Rice and Silver-man, 1991). These estimated eigenfunctions then allow for the estimationof Cov[X ′(t), X(t)] and Var[X(t)] (Liu and Mu¨ller, 2009).For the nonlinear models (4.3) and (4.6), we use a two step smoothingprocedure to estimate f . As mentioned above, the first step is to smooththe trajectories. The second step is to obtain an estimate of f . This twostep procedure proceeds from the same idea used by Ellner et al. (2002) forautonomous systems and Verzelen et al. (2012) for systems as in (4.3).In the univariate case (4.3), at each fixed time point, t, our data are givenby (Xˆi(t), Xˆ′i(t)) for i = 1, . . . , n. We obtain the estimate of f(t, ·), denotedfˆ(t, ·), by using a local linear kernel smooth (Wand, 2015) of Xˆ ′i(t) on Xˆi(t),as described in Section 3.2. The corresponding bandwidth is chosen bycross-validation, as outlined in Section 3.3. Doing this for each t gives theestimate of the smooth function f . This approach is in small contrast tothe work Verzelen et al. (2012), where a Nadaraya-Watson estimate is usedinstead of a local linear estimate.To estimate f(t,X(t),W (t)) as in (4.6), we simply extend the abovemethod for (4.3) to a bivariate local linear smooth, as described in Section3.2. For a fixed t, our data are given by (Xˆi(t),Wi(t), Xˆ′i(t)) for i = 1, . . . , n.To obtain fˆ we smooth Xˆ ′i(t) on (Xˆi(t),Wi(t)). Again, cross validation is314.5. Model Comparisonsused to select the bandwidths. It is important to note that we do notalways need to preprocess the trajectories of W by smoothing. Rather,whether or not to smooth these trajectories relies on the assumed natureof the underlying process W . In Chapter 5, we do not smooth the amounteaten in our application to the mouse data set described in Chapter 2.4.5 Model ComparisonsGiven observations from n realizations, X1, . . . , Xn, it is natural to want todetermine which of models (4.3) and (4.5) best describes the instantaneousrelationship between X(t) and X ′(t). This section will first discuss the tech-niques for assessing and comparing the fits of these two models, as developedby Verzelen et al. (2012). Then, assuming the presence of observations fromn realizations, W1, . . . ,Wn, of W , we propose a statistic to test if includingboth W (t) and X(t) leads to a significant increase in explaining the varia-tion in X ′(t) when compared to X(t) alone. In other words, we propose atest to assess whether the model given in (4.6) provides a superior fit to thedata than that in (4.3).We begin our discussion by reviewing the method of Verzelen et al. (2012)for assessing the fit of the model given in (4.3) as well as the specific casein (4.5). When the fit is good, X(·) may be close to the solution of theequation X ′(t) = f(t,X(t)). To assess whether X(·) can be viewed this way,we determine the relative size of Z(·) using the decomposition of varianceVar[X ′(t)] = Var[f(t,X(t))] + Var[Z(t)]. This decomposition allows us toassess the fraction of variance of X ′(t) explained by f(t,X(t)) using thecoefficient of determination (Mu¨ller and Yao, 2010; Verzelen et al., 2012):R2(t) =Var[f(t,X(t))]Var[X ′(t)]= 1− Var[X′(t)− f(t,X(t))]Var[X ′(t)]. (4.7)On sub-domains of T when R2(t) is close to 1 then X(·) may be closeto the solution of the equation X ′(t) = f(t,X(t)). Given n realizations,X1, . . . , Xn, a natural estimate of R2(t) isRˆ2(t) = 1−∑ni=1[Xˆ′i(t)− fˆ(t, Xˆi(t))]2∑ni=1[Xˆ′i(t)− ¯ˆX ′(t)]2. (4.8)The numerator of this estimate is the sum of the squared residuals from theestimate of f , while the denominator is (n − 1) times the sample varianceof the Xˆ ′i(t)’s.324.5. Model ComparisonsWhen one wishes to assess the fit of the simpler, linear model in (4.5)then (4.8) reduces toRˆ2L(t) = 1−∑ni=1[Xˆ′i(t)− µˆ∗(t)− βˆ(t)Xˆi(t)]2∑ni=1[Xˆ′i(t)− ¯ˆX ′(t)]2. (4.9)Since for a given set of trajectories of X, one does not know whetheror not the linear form of f(t,X(t)) suffices, one may wish to compare thefit of this linear form in (4.5) to the fit of the nonlinear in (4.3). For this,one can compare the two corresponding R2 values. When the ratio betweenRˆ2L(t) in (4.9) to Rˆ2(t) in (4.8) is small, then the nonlinear model provides asignificantly better fit to the data. Conversely, when this ratio is large, thesimpler linear model may be acceptable.The above comparison is valid when one is interested in what type ofeffect X(t) has on X ′(t). To study the improvement in the fit given byincluding W (t), as in model (4.6) of Section 4.3, we propose modifying (4.7)to the coefficient of determination given byR2W (t) =Var[f(t,X(t),W (t))]Var[X ′(t)]= 1− Var[X′(t)− f(t,X(t),W (t))]Var[X ′(t)](4.10)and estimated byRˆ2W (t) = 1−∑ni=1[Xˆ′i(t)− fˆ(t, Xˆi(t),Wi(t))]2∑ni=1[Xˆ′i(t)− ¯ˆX ′(t)]2. (4.11)To determine whether or not W (t) provides a substantially better fit to thedata than the model only including X(t), we compare R2W (t) with R2(t) viar2(t) ≡ R2W (t)R2(t)=Var[X ′(t)− f(t,X(t),W (t))]Var[X ′(t)− f(t,X(t)] . (4.12)When r2(t) is large, we have a comparable amount of variation of X ′(t)explained by models (4.3) and (4.6), indicating that including W (t) doesnot provide a significantly better fit. On the other hand, when this ratio issmall, it indicates that the fit with W (t) included is better to that withoutW (t) and that model (4.3) is more appropriate. Following from (4.8) and(4.11), we estimate r2(t) byrˆ2(t) ≡ Rˆ2W (t)Rˆ2(t)=∑ni=1[Xˆ′i(t)− fˆ(t, Xˆi(t),Wi(t))]2∑ni=1[Xˆ′i(t)− fˆ(t, Xˆi(t))]2. (4.13)334.5. Model ComparisonsTo test the null hypothesis that, for all t, the overall instantaneous rela-tionship between X and X ′ at t is unaffected by W (t) we propose the teststatisticS =∫t∈Trˆ2(t)dt. (4.14)Since in practice, one does not have data at all t ∈ T , we replace S bySˆ =J∑k=1rˆ2(t∗k), (4.15)for some evenly spaced grid of points, t∗1 < · · · < t∗k. When Sˆ falls belowa given quantile of its null distribution, we reject the null hypothesis andconclude that (4.6) provides a better overall fit then model (4.3).The null distribution of S is unknown and therefore must be approxi-mated. To estimate this null distribution and thus the corresponding p-valueof the test statistic, a permutation test with the Wi’s is used. That is, fromour derived observations {(Xˆi,Wi), i = 1, . . . , n} we obtain N new datasets. The kth data set is {(Xi,Wγ(i)), i = 1, . . . , n}, where γ(1), . . . , γ(n) isa random permutation of the ordered set, {1, . . . , n}. Via a permutation,each of the observed trajectories Wj is randomly assigned to a new “part-ner” Xˆi. Heuristically, in the new data set, there will be no dependencebetween X ′ and W as the trajectories Xi and Wj are paired together atrandom. Note that this permutation induces a distribution that is a specialcase of the null distribution. The null distribution does not require indepen-dence of X and W . Rather, the assumption of the null distribution is thatE[X ′(t)|X(t),W (t)] does not depend on W (t), a weaker assumption.For each of these N data sets, the smoothing described in Section 4.4 isthen carried out for each of the two models (4.3) and (4.6) and the sampleratios, rˆ21(t), rˆ22(t), . . . , rˆ2N (t), for t ∈ {t∗1, · · · , tk} along with the test statis-tics Sˆ1, Sˆ2, . . . , SˆN are computed. We then use the empirical distributionof Sˆ1, Sˆ2, . . . , SˆN to provide an approximation the null distribution of S.When the observed ratio falls below a given quantile of this approximatenull distribution, we conclude that, in terms of instantaneous dynamics, thetwo processes, X and W , account for a significantly higher portion of thevariation in X ′, when compared to X alone.Admittedly, the test statistic S combines information across t. In manyapplications, W (t) might only be significant to the relationship betweenX(t) and X ′(t) on certain sub-domains of T , while not significant overall.The nonparametric method, described above, also allows us to explore thispossibility by approximating the null distribution of rˆ2(t) for each fixed t.344.5. Model ComparisonsFor a fixed t, the observed value of rˆ2(t) can be compared to the distri-bution of rˆ21(t), rˆ22(t), . . . , rˆ2N (t). When rˆ2(t) is below a given quantile, weconclude that W (t) together with X(t) provides a significantly better fit toX ′(t), when compared to the fit provided by X(t) alone. Of course, if thiscomparison of the observed value, rˆ2(t) is done for many t’s, one should beaware of multiple testing issues and a correction factor to the rejection levelcould be applied.35Chapter 5Dynamics of Mouse GrowthDataThe methods described in Chapter 4 can be used to estimate the dynamicsof the mouse growth data outlined in Chapter 2. We first briefly describe thesmoothing of the data from each of the eight groups of mice. We then usethe model fitting procedure outlined in Section 4.4, both including and notincluding the amount eaten. These models provide insight into not only theindividual groups, but also the differences between them. Specifically, weattempt to understand the growth processes of the mice in the eight groupsand how these processes vary. Finally, for each group, we compare the fitsof these models and study if and when the amount eaten plays a significantrole in the relationship between body mass and growth rate.5.1 Model Fitting5.1.1 Growth Rate Depending on Body MassAs outlined in Section 4.4, before fitting a nonparametric regression model,the raw data are smoothed to approximate the underlying process and itsderivative. For each mouse, we use smoothing splines using the statisticalprogramming language R (R Core Team, 2015) to obtain the estimatedbody mass curve and the subsequent growth rate as a function of age. Weset the smoothing parameter by setting the degrees of freedom, as outlinedin Section 3.3. For the smoothing of the body masses we set the degrees offreedom to 10, while for the estimated growth rates they are set to 5. Here wefix the degrees of freedom, rather than using cross-validation to ensure thateach mouse has a comparable amount of smoothing. Moreover, in the caseof derivative estimation, cross-validation does not perform reliably. Figure5.1 shows the smoothed body masses, Xˆi(t), of mice in each of the eightgroups. In general, the male mice are heavier than their female counterparts.Likewise, the sedentary mice appear heavier than the active. For both malesand females, the heaviest individual mice belong to the sedentary control365.1. Model Fittinggroups.Figure 5.2 shows the estimated growth rates, Xˆ ′i(t), of mice in each ofthe eight groups. As one would expect, the growth rates at early weeksare much higher than those in the middle or later weeks. For some miceestimated growth rates are negative during later weeks. This characteristicis most prevalent in the male sedentary groups. Furthermore, the flatteningout of the curves in many of the panes in Figure 5.2 indicates that, for mostmice, the growth rate is relatively constant after the younger ages.As outlined in Section 4.4, to estimate the deterministic part, f(t, ·),of the nonlinear differential equation in equation (4.3), we use a local lin-ear estimate (Bowman and Azzalini, 2014) where for each t, the estimatedgrowth rate, Xˆ ′i(t), is regressed on the smoothed body mass, Xˆi(t). Thekernel is chosen to be Gaussian and the smoothing parameter is selected bycross validation. Figure 5.3 shows this estimate, fˆ(t, x), at weeks t = 5, 30,55 and 80 from the active male mice from the control group. As we cansee, the nonparametric approach picks up on some of the nonlinear trendsin the data. At t = 5 the relationship between body mass and growth rateis monotone increasing. At t = 30, 55, and 80 the relationship appears tobe relatively constant with noise. However, for t = 30 and 80, there maybe slight increases in growth rate for increase in body mass for mice withinthe middle weights. For each of the eight groups and each age, plots such asthose in Figure 5.3 can be found in the digital appendix in the subdirectory/1D Plots.We can further examine the active male mice from the control groupthrough the contour plot in Figure 5.4, which displays the entire estimatefˆ(t, x). The values of fˆ range from around 0.05 to 0.1 at the early ages, whileat the older ages they are nearly all below 0.025. Examining the relationshipbetween body mass and expected growth rate given in Figure 5.4 we see thatall young mice have high growth rates, regardless of body mass. From ages10 to 35, growth rate is high for both light and heavy mice. Further, fromages 35 to 60, growth rate decreases with increasing body mass. Lastly, forages beyond 60 weeks, the pattern is unclear. Similar plots for the other 7groups show this pattern as well and can be found in the digital appendixin the subdirectory /Contour 1D Fits.Figure 5.5 provides a comparison of the estimated relationship betweenbody mass and growth rate for each of the eight groups. Within each panelof Figure 5.5 we see 80 curves, each representing fˆ(t, ·) for a fixed age t. Thedarker the color of the curve, the larger the value of t. Although one canattempt to draw some general conclusions from Figure 5.5, the best way todetermine how the relationship between body mass and growth rate changes375.1. Model Fittingover time is to examine the data and its corresponding curve at each t. Table5.1 provides a description of this relationship for each of the eight groups.As previously mentioned, the individual scatter plots can be found in thesubdirectory /1D Plots.Mouse Group Description of fˆ(t, x)F.Act.Ctrl • Relatively flat between weeks 1 and 15, with a slightuptrend and one exceptionally heavy mouse.• Monotone increasing between weeks 16 and 50, withthe exception of two heavy mice.• For weeks 50 and 80, there are several heavy mice whosegrowth rates don’t fit the pattern. Other then thesemice, the estimate still appears to be monotone increas-ing.F.Act.Sel • Relatively flat between weeks 1 and 15 except for thelightest/heaviest mice.• Monotone increasing between weeks 16 and 45, with theexception of the heaviest mice, which have low growthrates.• Relatively flat with a slight uptrend from weeks 45 and80.F.Sed.Ctrl • Monotone increasing between weeks 1 and 9, with theexception of the heaviest mice, which have low growthrates and the lightest mice, which have high growthrates.• Monotone increasing between weeks 10 and 40.• For weeks 41 and 49 the fits are poor due to a numberof heavy mice who have very different growth rates.• Flat between weeks 50 and 60.• For weeks 60 and 80, the fit is again poor due to theheavy mice with different trends.385.1. Model FittingF.Sed.Sel • Noisy upward trend from weeks 1 and 15, with theexception of two heavy mice.• Monotone increasing from weeks 16 and 40.• Flat from weeks 40 and 59, with the exception of oneheavy mouse whose growth rate is much lower thanothers.• Flat from weeks 60 and 80, with the exception of aslight uptrend from weeks 75 and 80.M.Act.Ctrl • Monotone increasing from weeks 1 and 30.• Flat with a lot of noise from weeks 30 and 80.M.Act.Sel • Monotone increasing from weeks 1 and 30, with theexception of two light mice.• Flat with noise from weeks 30 and 80. From weeks 30to 75 there is one exceptionally heavy mouse with ahigh growth rate. From weeks 65 t 80 there are twolight mice with very low growth rates.M.Sed.Ctrl • Increasing overall trend (with lots of noise) from weeks1 and 50.• Relatively flat with noise from weeks 50 and 80. Thereare several mice with negative growth rates that leadto poor fits at some weeks.M.Sed.Sel • Monotone increasing between weeks 1 and 40.• Relatively flat (with noise) from weeks 40 and 80.Table 5.1: The general trends in the estimated instantaneous relationshipbetween body mass and growth rate for each of the eight groups as seen inthe subdirectory /1D Plots of the digital appendix.Table 5.1 allows us to make some general conclusions about the groups.In both the female active groups there appears to be an increase in estimated395.1. Model Fittinggrowth rate with an increase in body mass, with the exception of the earlyweeks. On the other hand, the female sedentary mice show this increase atthe young and middle ages but not at the older ages. In all four of the malegroups we see a similar trend of an increase in estimated growth rate withan increase in body mass up to week 30-50.5.1.2 Including Amount EatenThe above analysis focuses strictly on the instantaneous relationship be-tween the body mass and growth rate in each of the eight groups. We nowfit the model given in equation (4.6), where W (t) is the amount eaten inweek t. The weekly amount eaten is displayed for each group in Figure 5.6.As mentioned at the end of Section 4.4, we choose not to smooth W , asbiologically there is no reason to assume the process is intrinsically smooth.The model is fit via local linear bivariate kernel smoothing (Bowman andAzzalini, 2014), as outlined in Section 3.2, using the multiplicative kernelin equation (3.4), where K1 = K2 are Gaussian. The bandwidths, h1 andh2, are chosen by cross validation. For each t, the evaluation points for theestimate, fˆ(t, x, w), form an equally spaced grid on [mini Xˆi(t),maxi Xˆi(t)]×[miniWi(t),maxiWi(t)].Including the process W in the model makes visual representations anal-ogous to Figures 5.4 and 5.5 difficult. Rather, the best we can do is to makecontour plots for the expected growth rate as a function of body mass andamount eaten at each t. Figure 5.7 gives an example of these contour plotsat t = 5, 30, 55 and 80 for the active males from the control group. Theseplots are at the same ages and in the same group as the plots in Figure5.3, which do not include W . In Figure 5.3, at t = 5, expected growth rateincreases with body mass. In Figure 5.7, we see a similar trend: at t = 5,for a fixed amount eaten there appears to be an increase in expected growthrate with an increase in body mass, with the exception of when the amounteaten is high. At t = 30 the expected growth rate is relatively constantas in Figure 5.3, with the exception of one heavy mouse, who can be seenin the upper right pane of Figure 5.3. For t = 55, for lighter mice thereis an increase in expected growth rate as the amount eaten increases. Formoderate and heavy body masses, the relationship is constant. Finally, att = 80, the growth rate is constant with the exception of the heavier mice.For these mice, it appears that an increase in the amount eaten increasesthe expected growth rate.Similar plots from 80 times points and 8 different groups, are includedin the subdirectory /2D Plots in the digital appendix. This subdirectory405.2. Model Comparisonsalso contains animations of the 80 contour plots for each of the 8 groupsin .gif files. These allow for some interpretation of how the amount eatenand body mass effect the expected growth rate. These interpretations arediscussed at the end of the proceeding section.5.2 Model ComparisonsFor each of the eight groups, we now compare the models in equations(4.3) and (4.6) using the techniques outlined in Section 4.5. Recall thatwe quantify the extent to which the deterministic part of models (4.3) and(4.6) explains the variation in X ′(t) with the ratios R2(t) and R2W (t), givenin equations (4.8) and (4.11), respectively. Figure 5.8 shows a comparisonbetween R2(t) and R2W (t) for each of the eight groups, while Figure 5.9 showsa 95% bootstrap confidence interval for R2W (t) for the male active controlgroup. These point-wise confidence intervals were obtained by computingR2W (t) for each of 200 samples (with replacement) of the data. The standarderrors of the resulting 200 R2W (t)’s were then computed at each t and usedwith standard normal quantiles to construct the confidence bands. Similarplots for each of the eight groups can be found in the digital appendix inthe /Bootstrap subdirectory. We now make some observations about themodel fits and some comparisons between groups. These observations arestrictly exploratory as there are no standard values of R2W (t) and R2(t) tobe compared to.Firstly, for all eight groups and all time points, we have R2W (t) ≥ R2(t),thus indicating that the amount of variation in the growth rate explainedby model (4.6) is uniformly higher than model (4.3). This is expected, asmodel (4.6) is richer than (4.3). In general, the values of R2W and R2 arefairly high for ages 10-30, while for many of the groups, we see a drop inthe values of R2W and R2 between weeks 40 and 50. This drop is generallyfollowed by an uptrend for the later ages, especially for model (4.6) .Secondly, we examine the differences between groups. In both the maleand female sedentary control groups, at young ages, the values of R2W andR2 are large. In contrast, for older ages, only R2W is large. The sedentaryselected males and females are similar to the sedentary control groups, withthe exception that R2W is small for the later ages in the male group. Inthe active selected groups, R2 has uniformly low values for both the maleand females. Model (4.6) is slightly better, although R2W is still relativelylow, especially in the females. In general, both models seem to fit the activecontrol groups better, in particular at the younger and older ages.415.2. Model ComparisonsWe now formally test whether the fit from model (4.6) is significantlybetter than that from model (4.3). For this test, we carry out the hypothesistesting procedure based on the test statistic Sˆ, outlined in Section 4.5. Foreach of the eight groups we use 500 permutations of the amounts eaten.Figure 5.10 gives the approximated null density of Sˆ for the sedentaryfemale group as well as the observed value of Sˆ. Recall that when Sˆ is small,we reject the null hypothesis and conclude that the model in (4.6) provides asignificantly better fit to that in (4.3). In Figure 5.10 the observed value of Sˆis less than the 5th percentile of the approximated null density and thereforethe model in (4.6) provides a significantly better fit. This indicates that theamount eaten in week t significantly effects the relationship between bodymass and growth rate at week t, at least for some values of t. The densitiesfor the other seven groups are similar and can be seen in the digital appendixunder /Density Curves. The resulting p-values are displayed in Table 5.2.At a rejection level of 5%, only the null hypothesis for the sedentary femalegroups (both control and selected) is rejected. That being said, the p-valuesare all lower for the sedentary groups versus their active counterparts. Thisindicates that there is more evidence of an effect of the weekly amount eatenon the instantaneous relationship between body mass and growth rate forthe sedentary mice.Figure 5.11 gives the ratios rˆ2(·) in equation (4.13) of Section 4.5 foreach of the eight groups. The dashed lines correspond to the point-wise 5thand 95th percentiles resulting from the permutation testing method givenin Section 4.5. When rˆ2(t) falls below the 5th percentile, this indicates thatthe value is unusually low, providing evidence that the amount eaten issignificant for explaining the growth rate at time t, and thus in favor of thealternative hypothesis given in model (4.6). Conversely when the value liessignificantly above the 5th percentile, there is no evidence that the amounteaten is significant in the instantaneous relationship between body mass andgrowth rate.Given the p-values in Table 5.2 it is natural to want to determine forwhich weeks and in what way the weekly amount eaten is significant inthe instantaneous relationship between body mass and growth rate. Toanswer these questions, we examine Figure 5.11, the point-wise p-valuesobtained from the permutation method, provided in Figure 5.12, as wellas the individual contour plots in /2D /Plots of the digital appendix. Thecontour plots are challenging to interpret, particularly since it is difficult totell if results are driven by a few unusual mice. Note that a contour plotindicates that W is not important in determining expected growth rate ifeach vertical line on the contour plot is of one colour/growth rate. We focus425.2. Model ComparisonsActive SedentaryControl Selected Control SelectedFemale 0.752 0.330 0.002 0.000Male 0.330 0.114 0.150 0.060Table 5.2: The p-values for each of the eight groups, resulting from thepermutation test from Section 4.5 to test the null hypothesis that the amounteaten at week t does not significantly effect the instantaneous relationshipbetween body mass and growth rate.our discussion on the sedentary females groups as they were the only twogroups with overall p-values less than 0.05 in Table 5.2.In the selected females who were sedentary, we have an overall p-valueof 0.000. The top right panel of Figure 5.12 indicates for this group thatthe weekly amount eaten is significant during weeks 3 and 4, 15 and 16,21 to 33, 38, 44 and 45, 58 to 65 and 71 to 74. To determine how theamount eaten effects the expected growth rate, we examine the contours in/2D Plots /F.Sed.Sel of the digital appendix. During weeks 3, 4, 15 and 16the growth rate is mostly constant, with the exception of the heavy micewhere the expected growth rate is decreasing as the amount eaten increases.For weeks 21 to 33 the growth rate is constant for the lighter mice. For theheavier mice, expected growth rate initially decreases with an increase inamount eaten and then increases. Weeks 38, 44 and 45 are similar, but thepattern is less clear. During weeks 58 to 65 and 71 to 74, the heaviest micehave by far the fastest growth rate, making it difficult to see other patterns.That being said, in some of these weeks it appears that an increase in foodconsumption leads to an increase in expected growth rate.For the sedentary females from the control group, whose overall p-value is0.002, it appears that for many ages past week 40, the weekly amount eatenis significant in explaining the growth rate. Specifically, weeks 40 to 42, 47to 52, 60 to 62 and 68 to 75 all have p-values less than 0.05. During weeks40 to 42 and 47 to 52, for the heavier mice, the growth rate is largest foran intermediate value of amount eaten. In addition, during weeks 47 to 52,for the lighter mice, the expected growth rate increases as the amount eatenincreases. For weeks 60 to 62 there appears to be an increase in expectedgrowth rate for an increase in amount eaten, although one fast growing andone shrinking mouse at weeks 61 and 62 make these observations slightlymore difficult to see. We see a similar, but small, increase for moderate and435.2. Model Comparisonsheavy mice during weeks 68 to 75. During these weeks there are some lightmice with negative growth rates that do not show this pattern.We conclude this section by pointing out some important facts. Firstly,since for each group we are testing 80 different hypotheses (one for eachweek), we must be aware of multiple comparison issues and therefore exercisecaution over significant results. Secondly, as will be discussed in Chapter 6,the point-wise power of the above test is often low. Therefore, it is possiblethat some of the p-values are higher than they would be under a superiortesting method. As discussed, individual analysis of the contour plots canprovide insight into whether this is the case. That being said, complexnonlinear relationships are not always obvious from such plots and thereforethe possibility exists that some of our conclusions could be further refined.Finally, we point out that each group is made up of 30-40 mice and that thisis a somewhat small number for fitting nonparametric models. An increasein sample size may provide greater clarity in the individual contour plots ascurrently a small number of mice who exhibit different trends from the trueunderlying relationship may cloud the figures.Figure 5.1: The smoothed body masses (grams) for each of the eight groupsof mice as a function of age (weeks).445.2. Model ComparisonsFigure 5.2: The estimated growth rates (grams/week) for each of the eightgroups of mice as a function of age (weeks).455.2. Model ComparisonsFigure 5.3: An example of the estimated deterministic component, fˆ(t, ·),as a function of body mass (grams) for the active males from the controlgroup at weeks 5, 30, 55, and 80.465.2. Model ComparisonsFigure 5.4: Contour plot of the nonparametric estimate, fˆ(t, x), for theactive male mice from the control group. The x-axis corresponds to age(weeks), while the y-axis is body mass (grams).475.2. Model ComparisonsFigure 5.5: The estimated relationship, given by fˆ(t, x), between body mass(grams) and growth rate (grams/week) for each of the eight groups. Theincrease in darkness of the curves indicates an increase in age.485.2. Model ComparisonsFigure 5.6: The amount eaten (grams) as a function of age (weeks), orga-nized into each of the eight groups. This is the same as Figure 2.8, with theexception that here the outliers have been removed.495.2. Model ComparisonsFigure 5.7: A sample of four contour plots, at weeks 5, 30, 55 and 80, fromthe control males who were active. The x-axis indicates body mass (grams),while the y-axis corresponds to the weekly amount eaten (grams). Thecoloring is based on the value of fˆ(t, x, w), the conditional expected growthrate.505.2. Model ComparisonsFigure 5.8: A comparison of R2(t) (dashed) and R2W (t) (solid) as a functionof age (weeks) for each of the eight groups.Figure 5.9: A 95% bootstrap confidence interval (dashed,red) for R2W (t) forthe active males in the control group.515.2. Model ComparisonsFigure 5.10: The approximated density of Sˆ for the sedentary females fromthe control group. The red line corresponds to the observed value from thedata, while the blue line indicates the 5th percentile of the approximatednull density of Sˆ.Figure 5.11: The ratio rˆ2 as a function of age (weeks) for each of the 8groups. The dashed lines represent the 5th and 95th percentiles resultingfrom the permutation method described in Section 4.5.525.2. Model ComparisonsFigure 5.12: The point-wise p-values for each of the eight groups result-ing from the permutation test in Section 4.5 of the null hypothesis thatthe amount eaten at week t does not significantly effect the instantaneousrelationship between body mass and growth rate.53Chapter 6Simulation StudyTo assess the statistical properties of our methods for the model comparisonsproposed in Section 4.5, we carry out a series of simulations based on thedata collected on mice, as described in Chapter 2.Recall that we are interested in how body mass, X, and the amounteaten, W , at a given age, t, effect the growth rate in the eight differentgroups of mice. Our findings for each of the eight groups were outlined inChapter 5. Specifically, we tested the hypothesis that the amount eaten atage t has a significant effect on the relationship between body mass andgrowth rate at t. To assess the statistical properties of our methods for thishypothesis test, we simulate data sets which are similar to those in Chapter2 and carry out the analysis described in Chapter 5. This is repeated forvarious levels of correlation between the body mass and the amount eaten.For simplicity, we only simulate data sets based on one of the eight groups.6.1 Models for Simulated DataWe generate data, Y˜i, W˜i ∈ RJ , for i = 1, . . . , n, that are independent iden-tically distributed as (Y˜ , W˜ ), where Y˜ = (Y1, . . . , YJ), W˜ = (W1, . . . ,WJ),withY˜j = X(tj) + j (6.1)W˜j = a+ γe(tj) (6.2)for j = 1, . . . , J , whereX(tj) = µX(tj) +K∑k=1αkϕk(tj),µX(tj) = E[X(tj)],αk ∼ N (0, λk) and Cov[αk, αl] = 0 for all k 6= l,j ∼ N (0, σ2j ), and Cov[k, l] = 0 for all k 6= l,Cov[αk, l] = 0 for all k and l546.1. Models for Simulated Dataanda ∼ N (µa, σ2a),E(e(t)) = 0,Cov(e(s), e(t)) = I(s = t),Cov[a, e(t)] = 0 for all t,Cov[αk, a] = ck for all k,Cov[j , a] = 0 for all j,Cov[j , e(t)] = 0 for all j and t.One can think of the Y˜i and W˜i as the simulated noisy body masses andamounts eaten, respectively, of the ith mouse.We now discuss how the values of the above parameters are chosen inorder to generate Y˜i and W˜i. To choose these parameters we use estimatesbased on the data from the group of male mice who were active on wheels andwere from the control breeding group. We denote these data as (tj , yij , wij)for i = 1, . . . , n = 38 and j = 1, . . . , J = 79, where yij denotes the bodymass and wij denotes the amount eaten for mouse i at time tj . In addition,we denote the smoothed body mass of mouse i at time tj as xij .To generate the Xi’s, that is, to determine values for K,λ1, . . . , λK andϕk(t1), . . . , ϕk(tJ), we perform a principal component analysis on the datavectors (xi1, . . . , xiJ)t for i = 1, . . . , n. From this PCA, we choose K = 2,as the first two principal components explain 95.56% of the variation inthe data vectors. We then set (ϕ1(t1), . . . , ϕ1(tJ)) and (ϕ2(t1), . . . , ϕ2(tJ))to be the eigenvectors from the PCA, shown in Figure 6.1, and (λ1, λ2) =(639.89, 24.71) to be the corresponding eigenvalues. Further, we setµX(tj) =1nn∑i=1xij .This function is displayed in Figure 6.2. For the error variances of the j ’swe setσ2j =n∑i=1(yij − xij)2/n.For the simulated amounts eaten, we set γ2, µa and σ2a asγ2 =1n(J − 1)n∑i=1J∑j=1(Wij − W¯i·)2 = 11.64 gm2,556.2. Simulation Resultswhere W¯i· =∑Jj=1Wij/J andµa = W¯·· =∑i,jWij/(nJ) = 41.18 gmσ2a =1n− 1n∑i=1(W¯i· − W¯··)2 − γ2J= 17.09 gm2.To simplify the model we choose Cov(α2, a) = 0. This assumption isnot unreasonable as the variance of α2 is much smaller than that of α1and therefore ϕ2 could possibly be omitted all together from modeling thedependence between X and W . Thus, as shown in Appendix (2), to have aproper covariance structure between Y˜i and W˜i it suffices that:Corr2(α1, a) < 1.We simulate data using varying levels of correlation between α1 and a.These levels are 0, 0.2, 0.4, 0.6, and 0.8. For each of these correlations, 500data sets are generated and analyzed with the method of Chapter 4. Anexample of one of these data sets can be found in Figure 6.3. The proceedingsection outlines the corresponding results.6.2 Simulation ResultsIn this section we present the results from the simulation study describedin Section 6.1. All simulations were carried out in R (R Core Team, 2015)and the results are displayed using ggplot2 (Wickham, 2009). As men-tioned in Section 6.1, we simulate 500 data sets for each of Corr(α1, a) =0, 0.2, 0.4, 0.6, 0.8. For each of these data sets, a p-value based on the teststatisticSˆ =J∑j=1rˆ2(tj) (6.3)is calculated via the permutation method described in Section 4.5 to testthe hypothesis that, for all t, the overall instantaneous relationship betweenX and X ′ at t is unaffected by W (t). In the following, we have used 200permutations for each data set.Figure 6.4 shows the histograms of the 500 p-values obtained from thepermutation method, for each of the five correlation levels. We first notethat when the correlation between α1 and a is 0, the distribution of the566.2. Simulation Resultsp-values is approximately uniform. Moreover, as expected, when the corre-lation between α1 and a and hence X and W is high, we observe that theresulting p-values are much lower than when the correlation is low. Thisindicates that when the correlation between the two processes is high, Sˆ canbe reasonably expected to indicate whether or not the additional process,W , effects the overall instantaneous relationship between X and X ′. Whenthe correlation between the two is lower, we may not expect the test toadequately inform of the relationship. This is further illustrated in Figure6.5 where the power curves of the test statistic, Sˆ, are shown. For low cor-relation levels the test has little power to detect the significance of W onthe instantaneous relationship between X and X ′, while for higher levelsof correlation (and higher rejection levels), we observe substantially higherpower. Tables 6.1 and 6.2 provide more detailed information about the val-ues plotted in Figure 6.1, in particular, about the variability of the valuesdisplayed.In Table 6.1 we display 95% confidence intervals for the achieved signifi-cance level of our test, that is, for the proportion of times the null hypothesisis rejected at various rejection levels, when the correlation between a and α1is 0. We expect that these confidence intervals should contain the nominalrejection level, given in the first column of Table 6.1. This is the case foreach of the first four rejection levels (as well as nearly true for α = 0.2),thus illustrating the viability of the test.Further standard errors for the power of our 5% level test at variouscorrelation levels can be seen in Table 6.2. The standard errors for theother rejection levels shown in Figure 6.1 are similar and are thus not shownhere.Alpha Proportion Rejected 95% Confidence Interval0.01 0.01 (0.00,0.02)0.05 0.05 (0.03,0.07)0.1 0.09 (0.07,0.12)0.15 0.13 (0.10,0.16)0.2 0.17 (0.13,0.20)Table 6.1: The proportion of the null hypotheses rejected based on Sˆ whenthe correlation between a and α1 is set to 0 as well as a 95% confidenceinterval of the expected proportion.As mentioned in Section 4.5, as the test statistic Sˆ is a sum over all tj ,it cannot identify at which time points W (t) and X(t) better explain X ′(t)576.2. Simulation ResultsCorrelation Proportion Rejected SE 95% Confidence Interval0.0 0.050 0.010 (0.030,0.070)0.2 0.070 0.011 (0.048,0.092)0.4 0.080 0.012 (0.565,0.104)0.6 0.140 0.015 (0.112,0.169)0.8 0.390 0.022 (0.345,0.433)Table 6.2: The proportion of null hypotheses rejected based on Sˆ, at signif-icance level of 0.05, their standard errors and a 95% confidence interval forthe expected proportion rejected.when compared to just X(t). To determine this, for each fixed tj , we testthe hypothesisH0 : E[X′(tj)|X(tj),W (tj)] = f0(tj , X(tj)) (6.4)H1 : E[X′(tj)|X(tj),W (tj)] = f1(tj , X(tj),W (tj))For this test, we obtain a p-value by using the empirical distribution ofthe 200 values of rˆ2(tj) which result from the permutations of the data.Thereby, for each fixed tj , we have 500 p-values (one for each simulateddata set). To determine the point-wise power of our test, these 500 p-valuesare compared to a chosen rejection level to obtain the proportion of timesthe null hypothesis is rejected.To give ourselves a measure for comparison, we fit for each fixed tj ,j = 1, . . . , J , the linear modelX ′i(tj) = β0 + β1(tj)Xi(tj) + β2(tj)Wi(tj) + τi (6.5)τi ∼ N (0, σ2τ )and test the hypothesisH0 : β2(tj) = 0 (6.6)H1 : β2(tj) 6= 0.Testing this for each of the 500 data sets, using the normal linear modelapproach to computing p-values, allows us to estimate the power of thishypothesis test for each tj . We can then compare the point wise power ofour test, described in the preceding paragraph, with that obtained from thelinear models test. Since we have generated data according to Gaussiandistributions, we would anticipate that (6.5) holds (at least when error is586.2. Simulation Resultsnot included), and thus the standard linear model test of (6.6) would be a“gold standard”. Of course, if (6.6) does not hold, then we should use thenonparametric approach outlined in Sections 4.4 and 4.5.Figures 6.6 and 6.7 provide a comparison of the power of the tests ofthe hypotheses (6.4) and (6.6) as a function of the correlation between aand α1 at a significance level of 5%. Predictably, for both tests, we ob-serve much better performance for higher correlation between the generatedobservations.For both the hypothesis test of the linear model and that of the non-parametric model we observe much higher power for younger ages whencompared to older ages. While the joint distribution of X ′(t), X(t) andW (t) does depend on t in a complicated way, examining the contour plotsfor various ages, there does not appear to be an obvious reason for thishigher power at younger ages.Figure 6.8 shows the proportion of times the null hypothesis in (6.4) isrejected minus the proportion of times the null hypothesis in (6.6) is rejected,as a function of t. As seen in Figures 6.6 and 6.7, when the correlationbetween a and α1 is low, neither test of the point-wise relationship has anysizable power and indeed, the tests are comparable, as can be seen in Figure6.8. For high correlations, the linear model test has much higher power atyoung ages, but significantly lower power for t ∈ (10, 60). This implies thatfor most ages, our method is an acceptable alternative to simply fitting linearmodels to the data. Moreover, the nonparametric method likely allows forthe possibility of detecting nonlinear relationships in the data that a simpleregression cannot.In Table 6.3 we display, for each of the five correlation levels and a sig-nificance level of 0.05, the total number of times that each test correctlyrejected, while the other test failed to reject, as well as when both or neithertest rejected. As we can see, the majority of the time, neither of the testsreject. This indicates that to determine the significance of any relationshipat a particular tj , more work should be done to develop a more powerfulalternative. That being said, we see that the test of (6.4) rejects a greaternumber of times than the test of (6.6). In particular, when the correlation is0.8 there is a large difference in the total number of times the null hypothe-ses are rejected. It should be noted that, as mentioned above, the “goldstandard” applies to the unsmoothed data with no error. Therefore, theactual linearity of the relationship may not perfectly hold after smoothingthe noisy simulated data.We conclude this section by describing some the potential issues withour study, as well as some ways it could be improved. Firstly, since each596.2. Simulation Resultssimulated data set contains only 38 mice, we may observe greater powerif the sample size were increased. Secondly, as mentioned in Section 4.5,our permutation method to approximate the null distribution of Sˆ onlycaptures a subset of this null distribution. Using a different approach toapproximating this underlying distribution may also lead to greater power.Further, due to computational restraints, we use only 200 permutations perdata set when calculating the null distribution of Sˆ. Increasing the number ofpermutations would increase the accuracy of the reported p-values. Finally,in Section 6.1 we specify a probability model for W . It is possible thatthis model is not the best representation of this process and therefore if adifferent model were used, it may make the test more powerful.Null Hypothesis RejectedCorrelation Linear Only Nonparametric Only Both Neither Total0.2 1781 1923 387 35409 395000.4 2236 2638 517 34109 395000.6 3046 3400 1438 31616 395000.8 3344 5447 3799 26910 39500Table 6.3: The total number (across all tj) of times the null hypotheses arerejected for a significance level of 0.05 at each correlation level. The linearand nonparametric columns correspond to test (6.6) and (6.4) respectively.606.2. Simulation ResultsFigure 6.1: The first eigenvector (red), and second eigenvector (blue) fromthe principal component analysis of the data vectors (xi1, . . . , xiJ)t for i =1, . . . , n. These are used as ϕ1 and ϕ2 in the simulation study.616.2. Simulation ResultsFigure 6.2: The mean body mass (grams) of the male active control mousegroup, used as µX(t) in the simulation study.626.2. Simulation ResultsFigure 6.3: A simulated data set of body mass (grams) is given in the leftpane, while a simulated data set of the amount eaten (grams) is given in theright. The correlation between the two processes has been set to 0.636.2. Simulation ResultsFigure 6.4: Histograms of the 500 p-values resulting from the nonparametrichypothesis test from Section 4.5. The correlation between α1 and a is setto 0, 0.2, 0.4, 0.6 and 0.8.646.2. Simulation ResultsFigure 6.5: The power of the test statistic, Sˆ, as a function of the correlationbetween α1 and a. Each power curve corresponds to a fixed rejection levelfor the test.656.2. Simulation ResultsFigure 6.6: The point-wise power of the hypothesis test in (6.6) using astandard linear model (left) compared to the point-wise power of the testin (6.4) using our rˆ(t)2 statistic, at a significance level of 5%. The x-axisindicates each of the five fixed correlations between a and α1. The darkerthe color of a point, the greater the value of tj .666.2. Simulation ResultsFigure 6.7: Point-wise power curves of the hypothesis test in (6.6) (left) usinga standard linear model, and of the test in (6.4), using our rˆ(t)2 statistic(right) as functions of the correlation between a and α1. The significancelevel is 5% and the darker the color of a line, the greater the value of tj .676.2. Simulation ResultsFigure 6.8: The proportion of times the null hypothesis in (6.4) is rejectedminus the proportion of times the null hypothesis in (6.6) is rejected, as afunction of t. The different colors represent the different correlation levelsbetween a and α1.68Chapter 7ConclusionIn this thesis, we have proposed extensions to existing techniques for study-ing the instantaneous relationship between a process and its derivative. Inmany applications, this instantaneous relationship may be significantly in-fluenced by an additional, related stochastic processes. To include an ad-ditional process in estimating the relationship between X(t) and X ′(t), weuse a two step smoothing procedure in the mold of Verzelen et al. (2012)as follows (Section 4.4). First, we smooth the data to obtain estimated tra-jectories of X and of its derivative, X ′. Secondly, we fit a nonparametricregression model via bivariate kernel smoothing, where for each t, X ′(t) isregressed on X(t) and W (t).Furthermore, we have developed a test statistic, Sˆ, to determine whetherthe addition of W (t) in the nonparametric regression provides a significantlybetter fit to using just X(t) alone. Under the null hypothesis that W (t) doesnot provide a significantly improved fit, we approximate the distributionof Sˆ through a permutation method (Section 4.5). We further use thispermutation approach to attempt to determine at which specific time pointsor intervals the inclusion of W (t) improves the fit significantly.These techniques are applied to the data set described in Chapter 2,comprised of mouse growth data for eight distinct groups. These groups arecharacterized by gender (male/female), breeding design (selected/control)and access to an exercise wheel (sedentary/active). We carry out the two-step smoothing procedure with the estimated growth rate regressed on bodymass only, and then on body mass and the previous weeks’ amount eaten.(Chapter 5). These models are fit for each of the eight groups and we notea variety of observations and comparisons.Paramount to these comparisons is that, based on our testing methods,only the two sedentary female groups had a relationship between growthrates at age t and body mass at age t that was explained significantly betterby including the amount eaten in week t. In future work, it would be inter-esting to carry out similar analyses while also taking into account the geneticdependence that results from the eight genetic lines of mice, as described inChapter 2.69Chapter 7. ConclusionTo improve our understanding of the statistical properties of our testingapproach, a simulation study based on the mouse growth data was carriedout in Chapter 6. The point-power of our method was compared to that ofthe standard test for coefficient significance in a linear model. Surprisingly,our method resulted in greater power, except when testing at smaller valuesof t.As final remarks, it is important to discuss the limitations and possiblecontinuations of this work. Firstly, our simulations and data analysis makeclear that Sˆ works well for determining whether or not W (t) is significantoverall in explaining X ′(t), but that determining what values of t contributeto a significant value of Sˆ is challenging. Trying to refine the method tomore clearly determine these values of t is a natural first step in any sub-sequent research. Further, although our testing method outlined in Section4.5 is flexible, the permutation method only approximates a subset of theSˆ’s null distribution. A further understanding of this underlying distribu-tion and better ability to approximate it may result in a more powerfultest. Moreover, we have not developed any asymptotic properties for thecomponents of our test statistic. Certainly, properties analogous to thosein Verzelen et al. (2012) would be desirable and provide an exciting oppor-tunity for future work. Another possible extension would be to explore theeffects of additional processes on the relationship between X(t) and X ′(t).Our current approach is restricted to the addition of just one process butcould be extended to include more.70BibliographyR. B. Ash and M. F. Gardner. Topics in Stochastic Processes. AcademicPress New York, 1975.A. W. Bowman and A. Azzalini. R package sm: nonparametric smoothingmethods (version 2.2-5.4). University of Glasgow, UK and Universita` diPadova, Italia, 2014. URL URLhttp://www.stats.gla.ac.uk/~adrian/sm,http://azzalini.stat.unipd.it/Book_sm.J. Cao and J. O. Ramsay. Parameter cascades and profiling in functionaldata analysis. Computational Statistics, 22:335–351, 2007.J. Cao, J. Huang, and H. Wu. Penalized nonlinear least squares estimationof time-varying parameters in ordinary differential equation. Journal ofComputational and Graphical Statistics, 21:42–56, 2012.S. P. Ellner, Y. Seifu, and R. H. Smith. Fitting population dynamics modelsto time series data by gradient matching. Ecology, 83:22562270, 2002.T. Gasser, H. G. Mu¨ller, W. Kohler, L. Molinari, and A. Prader. Nonpara-metric regression analysis of growth curves. The Annals of Statistics, 12(1):210–229, 1984.P. Hall, J.S. Marron, and B. Park. Smoothed cross validation. ProbabilityTheory and Related Fields, 92(1):1–20, 1992.T. J. Hastie and R. J. Tibshirani. Generalized Additive Models. London:Chapman & Hall, 1990. ISBN 0412343908.T. J. Hastie, R. J. Tibshirani, and J. Friedman. The Elements of StatisticalLearning. Springer New York Inc., New York, NY, USA, 2001.N. Heckman. Comments on: Dynamic relations for sparsely sampled Gaus-sian processes. Test, 19(1):46–49, 2010.N. Heckman and J. O. Ramsay. Penalized regression with model-basedpenalties. The Canadian Journal of Statistics, 28:241–258, 2000.71BibliographyM. Kirkpatrick and N. Heckman. A quantitative genetric model for growth,shape, reaction norms, and other infinite dimensional characters. Journalof Mathematical Biology, 27(4):429–450, 1989.P. Koteja, J. G. Swallow, P. A. Carter, and T. Garland. Energy cost of wheelrunning in house mice: Implications for coadaptation of locomotion andenergy budgets. Physiological and Biochemical Zoology, 72(2):238–249,1999.P. Koteja, J.G. Swallow, P.A. Carter, and T. Garland. Maximum cold-induced food consumption in mice selected for high locomotor activity:Implications for the evolution of endotherm energy budgets. Journal ofExperimental Biology, 204(6):1177–1190, 2001.P. A. Koteja, P.and Carter, J. G. Swallow, and T. Garland. Food wastingby house mice: Variation among individuals, families, and genetic lines.Physiology and Behavior, 80(2):375–383, 2003.H. Liang and H. Wu. Parameter estimation for differential equation modelsusing a framework of measurement error in regression models. Journal ofthe American Statistical Association, 103(484):1570–1583, 2008.B. Liu and H. G. Mu¨ller. Estimating derivatives for samples of sparselyobserved functions, with application to online auction dynamics. Journalof the American Statistical Association, 104:704–717, 2009.H. Miao, C. Dykes, L. M. Demeter, and H. Wu. Differential equation model-ing of HIV viral fitness experiments: model identification, model selection,and multimodel inference. Biometrics, 65(1):292–300, 2009.H. G Mu¨ller and W. J. Yang. Dynamic relations for sparsely sampled Gaus-sian processes. Test, 19:1–29, 2010.H. G. Mu¨ller and F. Yao. Empirical dynamics for longitudinal data. TheAnnals of Statistics, 38:3458–3486, 2010.E. A. Nadaraya. On estimating regression. Theory of Probability and itsApplications, 9:141–142, 1964.R Core Team. R: A language and environment for statistical computing.R Foundation for Statistical Computing, Vienna, Austria, 2015. URLhttps://www.R-project.org/.72BibliographyJ. O. Ramsay and C. J. Dalzell. Some tools for functional data analysis.Journal of the Royal Statistical Society. Series B (Methodological), 53(3):539–572, 1991.J. O. Ramsay and B. W. Silverman. Applied Functional Data Analysis.SpringerLink ebook, New York City, NY, 2002.J. O. Ramsay and B. W. Silverman. Functional Data Analysis. SpringerLinkebook, New York City, NY, 2005.J. O. Ramsay, N. Heckman, and B. W. Silverman. Spline smoothing withmodel-based penalties. Behavior Research Methods, Instruments, & Com-puters, 29(1):99–106, 1997.J. O. Ramsay, G. Hooker, D. Campbell, and J. Cao. Parameter estimationfor differential equations: a generalized smoothing approach. Journal ofthe Royal Statistical Society: Series B (Statistical Methodology), 69(5):741–796, 2007.S. K. Reddy and M. Dass. Modeling online art auction dynamics usingfunctional data analysis. Statistical Science, 21(2):179–193, 2006.J. A. Rice and B. W. Silverman. Estimating the mean and covariance struc-ture nonparametrically when the data are curves. Journal of the RoyalStatistical Society. Series B (Methodological), pages 233–243, 1991.Y. Sakamoto, M. Ishiguro, and G. Kitagawa. Akaike Information CriterionStatistics. Tokyo : KTK Scientific Publishers, 1986.J. G. Swallow, P. A. Carter, and T. Garland. Artificial selection for increasedwheel-running behavior in house mice. Behavior Genetics, 28(3):227–237,1998.Patrick A. Carter Theodore J. Morgan, Theodore Garland. Ontogenies inmice selected for high voluntary wheel-running activity. i. mean ontoge-nies. Evolution, 57(3):646–657, 2003.N. Verzelen, W. Tao, and H. G. Mu¨ller. Inferring stochastic dynamics fromfunctional data. Biometrika, 99:533–550, 2012.M. P. Wand. KernSmooth: Functions for Kernel Smoothing, 2015. URLhttp://CRAN.R-project.org/package=KernSmooth. R package version2.23-15.73M.P. Wand and M. C. Jones. Kernel Smoothing. Monographs on statis-tics and applied probability. Chapman & Hall/CRC, Boca Raton (Fla.),London, New York, 1995.G. S. Watson. Smooth regression analysis. Sankhya¯ Ser., 26:359–372, 1964.H. Wickham. ggplot2: elegant graphics for data analysis. Springer New York,2009. ISBN 978-0-387-98140-6. URL http://had.co.nz/ggplot2/book.F. Yao, H. G. Mu¨ller, and J. L. Wang. Functional data analysis for sparselongitudinal data. Journal of the American Statistical Association, 100(470):577–590, 2005.74Appendix ACalculation of ConditionalExpectation for SimulationsLet X(·),W (·) be Gaussian processes with mean and covariance functionsgiven by µX(·), CX(·, ·) and µW (·), CW (·, ·), respectively.For a fixed t, X(t),W (t) andX ′(t) are jointly normal, (X(t),W (t), X ′(t))T ∼N (µ(t),Σ(t)), where µ(t) = (µX(t), µW (t), µX′(t))T and Σ(t) is the covari-ance matrix of X(t),W (t) and X ′(t). We will first state a useful result forcomputing the conditional expectation of X ′(t) given X(t),W (t).Proposition 1. Let Z = (Z1, Z2, . . . , Zn)T and Y = (Y1, Y2, . . . , Ym)T betwo multivariate normal distributions of size n and m, respectively, suchthat Z and Y are jointly normal with covariance matrix partitioned asΣ = Cov[(ZY);(ZY)]=(ΣZZ ΣZYΣZYT ΣY Y).Then the conditional expectation of Y on Z is given byE[Y |Z] = µY + ΣZY TΣZZ−1(Z − µZ),where µZ and µY are the means of Z and Y , respectively.Proof. Appendix A in Statistics for High-Dimensional Data. Methods, The-ory and Applications, P. Buhlmann and S. van de Geer, Springer 2011.”For a fixed t, we can now apply this result with Z = (X(t),W (t)) andY = X ′(t), to find the conditional expectation of X ′(t) on X(t) and W (t).After some simple matrix multiplication we obtain:E[X ′(t)|X(t),W (t)] = µX′(t)+1D(t){Var(W (t))Cov(X(t), X′(t))− Cov(W (t), X ′(t))Cov(X(t),W (t))}{X(t)− µX(t)}+1D(t){Var(X(t))Cov(W (t), X′(t))− Cov(X(t), X ′(t))Cov(X(t),W (t))}{W (t)− µW (t)},75Appendix A. Calculation of Conditional Expectation for SimulationswhereD(t) = det[Var(X(t)) Cov(X(t),W (t))Cov(X(t),W (t)) Var(W (t))].We can calculateC(W (s), X ′(t)) =∂∂tC(W (s), X(t)),C(X(s), X ′(t)) =∂∂tC(X(s), X(t)) andCX′(s, t) =∂2CX∂s∂t.If we want E[X ′(t)|X(t),W (t)] to have no dependence on W (t) then werequire Var(X(t))Cov(W (t), X ′(t))−Cov(X(t), X ′(t))Cov(X(t),W (t)) = 0.76Appendix BCalculation of CovarianceStructure for SimulationsWe wish to simulate two correlated stochastic processes, X and W . In thefollowing, we assume that we can write X asX(t) =J∑j=1αjϕj(t), (B.1)where the αj ’s are uncorrelated, mean 0 random variables with the varianceof αj equal to λj , and W asW (t) = a+ σ(t), (B.2)where a is a mean zero random variable with variance σ2a, uncorrelated with(t) for all t, and E((t)) = 0, Cov((s), (t)) = I(s = t), the indicator of sequal to t. With these assumed decompositions and some additional statedassumptions, we will define the proper covariance structure for the processesobserved at a fixed set of time points.For a fixed set of time points, t1, . . . , tk, we will generate observations ofX and W at each ti. Let t = (t1, . . . , tk)t. Then we can writeX(t) = (X(t1), . . . , X(tk))t = Φα (B.3)where Φ = (ϕ1(t), . . . , ϕJ(t)) is a k × J matrix and α = (α1, . . . , αJ)t.Similarly, we can writeW (t) = (W (t1), . . . ,W (tk))t = a1k + σ, (B.4)where = ((t1), . . . , (tk))t. Therefore, to simulate X(t) and W (t) we willgenerate values of α, a, and . Having a proper covariance structure forX(t) and W (t) is equivalent to the J + 1 + k dimensional covariance matrixof α, a, and being positive definite. We further assume thatCov(α, ) = 077Appendix B. Calculation of Covariance Structure for SimulationsandCov(a, ) = 0.It follows from these assumptions that to have the covariance matrix ofα, a, and positive definite, we need only have the covariance matrix of αand a positive definite; that is, for an arbitrary v ∈ RJ , and z ∈ R such that(v, z) 6= 0, we must have(vt, z)Σ(vt, z)t > 0, (B.5)where Σ is the covariance matrix of (αt, a). A simple calculation yields thefollowing equivalent condition for positive definiteness:J∑j=1λjv2j + 2zJ∑j=1vjcj + σ2az2 > 0, (B.6)where cj = Cov(αj , a). A sufficient condition for (B.6) is to havec2jλjσ2a<1Jor equivalently Corr2(αj , a) <1J. (B.7)Given values of λj and σ2a, one can thus set the cj ’s according to the desiredstrength of covariance between X and W . If a subset of size I of the cj ’s isset to zero, then condition (B.7) can be replaced byCorr2(αj , a) <1J − I , (B.8)for each j such that cj 6= 0.78
- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Theses and Dissertations /
- Instantaneous dynamics of functional data
Open Collections
UBC Theses and Dissertations
Featured Collection
UBC Theses and Dissertations
Instantaneous dynamics of functional data Bone, Jeffrey 2016
pdf
Notice for Google Chrome users:
If you are having trouble viewing or searching the PDF with Google Chrome, please download it here instead.
If you are having trouble viewing or searching the PDF with Google Chrome, please download it here instead.
Page Metadata
Item Metadata
Title | Instantaneous dynamics of functional data |
Creator |
Bone, Jeffrey |
Publisher | University of British Columbia |
Date Issued | 2016 |
Description | Time dynamic systems can be used in many applications to data modeling. In the case of longitudinal data, the dynamics of the underlying differential equation can often be inferred under minimal assumptions via smoothing based procedures. This is in contrast to the common technique of assuming a prespecified differential equation, and estimating it's parameters. In many cases, one wants to learn the dynamics of a differential equation that incorporates more than just one stochastic process. In the following, we propose extensions to existing two-step smoothing methods that allow for the presence of additional functional data arising from a second stochastic process. We further introduce model comparison techniques to assess the hypothesis that there is a significant change in fit provided by this additional process. These techniques are applied to the instantaneous dynamics of mouse growth data and allow us to make comparisons between mice who have been assigned different genetic and physical conditions. Finally, to study the statistical properties of our proposed techniques, we carry out a simulation study based on the mouse growth data. Supplementary material : http://hdl.handle.net/2429/59574 |
Genre |
Thesis/Dissertation |
Type |
Text |
Language | eng |
Date Available | 2016-10-19 |
Provider | Vancouver : University of British Columbia Library |
Rights | Attribution-NonCommercial-NoDerivatives 4.0 International |
DOI | 10.14288/1.0319180 |
URI | http://hdl.handle.net/2429/59491 |
Degree |
Master of Science - MSc |
Program |
Statistics |
Affiliation |
Science, Faculty of Statistics, Department of |
Degree Grantor | University of British Columbia |
GraduationDate | 2016-11 |
Campus |
UBCV |
Scholarly Level | Graduate |
Rights URI | http://creativecommons.org/licenses/by-nc-nd/4.0/ |
AggregatedSourceRepository | DSpace |
Download
- Media
- 24-ubc_2016_november_bone_jeffrey.pdf [ 13.37MB ]
- Metadata
- JSON: 24-1.0319180.json
- JSON-LD: 24-1.0319180-ld.json
- RDF/XML (Pretty): 24-1.0319180-rdf.xml
- RDF/JSON: 24-1.0319180-rdf.json
- Turtle: 24-1.0319180-turtle.txt
- N-Triples: 24-1.0319180-rdf-ntriples.txt
- Original Record: 24-1.0319180-source.json
- Full Text
- 24-1.0319180-fulltext.txt
- Citation
- 24-1.0319180.ris
Full Text
Cite
Citation Scheme:
Usage Statistics
Share
Embed
Customize your widget with the following options, then copy and paste the code below into the HTML
of your page to embed this item in your website.
<div id="ubcOpenCollectionsWidgetDisplay">
<script id="ubcOpenCollectionsWidget"
src="{[{embed.src}]}"
data-item="{[{embed.item}]}"
data-collection="{[{embed.collection}]}"
data-metadata="{[{embed.showMetadata}]}"
data-width="{[{embed.width}]}"
data-media="{[{embed.selectedMedia}]}"
async >
</script>
</div>
Our image viewer uses the IIIF 2.0 standard.
To load this item in other compatible viewers, use this url:
https://iiif.library.ubc.ca/presentation/dsp.24.1-0319180/manifest