G R O W T H PREDICTION OF R E C E N T P E R M A N E N T S A M P L E PLOTS FOR FOREST INVENTORY PROJECTION By Guillaume Therien B. Sc. App. (Foresterie) Universite Laval A THESIS S U B M I T T E D IN PARTIAL F U L F I L L M E N T THE REQUIREMENTS FOR T H E D E G R E E OF D O C T O R OF P H I L O S O P H Y in T H E F A C U L T Y OF G R A D U A T E STUDIES FORESTRY We accept this thesis as conforming to the required standard T H E U N I V E R S I T Y OF BRITISH COLUMBIA April 1990 © Guillaume Therien, 1990 OF In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further copying agree that permission for extensive of this thesis for scholarly purposes may be granted department or by his or her representatives. It is by the head of my understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission. Department of FORbST^V The University of British Columbia Vancouver, Canada Date DE-6 (2/88) \\fie. 21s, mo Abstract Permanent sample plots have become the main source of information for estimating models which quantify the dynamic processes of a forest. Fitted models allow for projecting inventories, used to determine timber production and many forest management decisions. The quality of these models is largely dependent on the quality of the information provided by the permanent sample plots. However, the pool of information contained in recent permanent sample plots is limited. Efficient estimation techniques must use all the information available from such plots. Current estimation techniques can be improved. Existing techniques employed in forestry have failed to recognize the random nature of the individual model characterizing each plot. On the other hand, techniques designed for remeasured entities in other scientific fields do not address particular forestry situations such as the small number of remeasurements or the irregularity of remeasurements. A framework for estimating forestry growth models which recognizes the individuality of each plot and special forestry situations is presented in this dissertation. The proposed framework is a two-stage estimation technique, in which the growth rate of a permanent sample plot is considered analogous to the interest rate on a bank account. The first stage estimates the growth rate after removing the time effect. The second stage, based on Von Bertalanffy's growth curve, relates growth rate to site index and volume at the beginning of the growing season. The proposed predictor of future growth rates, the "weighted predictor," is a weighted average between the growth rate observed on a plot and the growth rate predicted from the second-stage model. The ii weighted predictor is then used to compound the current volume of a plot. A n estimate of the variance of the prediction can also be computed. m Table of Contents Abstract ii List of Tables vii List of Figures viii Acknowledgement ix 1 Introduction 1 2 Literature Review 7 2.1 Forestry Growth and Yield Literature 7 2.2 The Econometrics Literature 14 2.2.1 Estimated Generalized Least Squares 14 2.2.2 A l l Parameters Are Constant 16 2.2.3 Only the Intercepts Vary 17 2.2.3.1 The General Model 17 2.2.3.2 The Intercepts Are Fixed 18 2.2.3.3 The Intercepts Are Random 19 2.2.3.4 Recapitulation 19 2.2.4 A l l Parameters Vary 20 2.2.4.1 The General Model 20 2.2.4.2 The Parameters Are Fixed 21 2.2.4.3 The Parameters Are Random 22 iv 2.2.4.4 2.2.5 Recapitulation 25 Summary of the Econometrics Literature 26 2.3 The Growth Curve Literature 26 2.4 Conditional Densities and Expectations 28 2.5 Overview 31 3 Statistical Methods 33 3.1 3.2 The Timber Capital 33 3.1.1 The Interest Rate Analogy 34 3.1.2 Uncertainty and Growth Rate 37 The Statistical Model 38 3.2.1 Notational Conventions 39 3.2.2 First-Stage Model 40 3.2.2.1 Model Definition 40 3.2.2.2 Error Components 42 3.2.2.3 Assumptions 43 3.2.2.4 Matrix Notation 3.2.3 3.2.4 3.2.5 '. . 47 Second-Stage Model 49 3.2.3.1 Background 50 3.2.3.2 Model Definition 52 3.2.3.3 Assumptions 53 3.2.3.4 Matrix Notation 54 Combined Model 55 3.2.4.1 Assumption . . . 55 3.2.4.2 Matrix Notation 55 Summary 56 v 3.3 Parameter Estimation 56 3.3.1 Estimation When Covariance Matrices Are Known 57 3.3.1.1 Parameter Estimators 57 3.3.1.2 Estimation of the Variance of the Predictor 61 3.3.2 3.3.3 3.4 4 5 6 Covariance Matrix Estimators 64 3.3.2.1 First-Stage Estimators 64 3.3.2.2 Second-Stage Estimators 67 The Best Linear Unbiased Predictor and Its Variance 69 Model Predictions 69 3.4.1 Volume Predictor 70 3.4.2 Confidence Intervals Estimator 72 Numerical Example 75 4.1 Description of the Data Base 75 4.2 Goodness of Fit Criteria 79 4.3 Results 81 Discussion 89 5.1 The Basic Concept 89 5.2 The Statistical Model 91 5.3 Results 96 Conclusions 99 References Cited 102 vi L i s t of Tables 4.1 Time Intervals 77 4.2 Frequency of Volumes by Initial Age and Projection Time Class 78 4.3 Frequency of Volumes by Initial Age and Total Lag Class 79 4.4 Estimation of Observed Growth Rates 82 4.5 Overall Results from the First- and the Second-Stage Estimators and from the Weighted Predictor 84 4.6 Bias and (Relative Bias) by Initial Age and Projection Time Class . . . . 86 4.7 Bias and (Relative Bias) by Initial Age and Total Lag Class 86 4.8 Root Mean Squared Error and (Relative Root Mean Squared Error) by Initial Age and Projection Time Class 4.9 87 Root Mean Squared Error and (Relative Root Mean Squared Error) by Initial Age and Total Lag Class 87 4.10 Mean Absolute Deviation and (Relative Mean Absolute Deviation) by Initial Age and Projection Time Class 88 4.11 Mean Absolute Deviation and (Relative Mean Absolute Deviation) by Initial Age and Total Lag Class 88 vii L i s t of Figures 3.1 Growth Rate over Time for a Sigmoidal Yield Curve 41 4.2 Distribution of Site Index 76 4.3 Individual Plot Volume versus Age 76 viii Acknowledgement I would like to thank my supervisor Dr. A. Kozak as well as the rest of my supervisory committee, Dr. P. Marshall, Dr. K. lies, Dr. P. de Jong, and Dr. D. Tait for their direction and guidance. Their contributions to my thesis and forest biometrics education were enormous, as was their patience. I extend my appreciation to Dr. N. E. Heckman and Dr. J. G. Cragg for their short but invaluable contributions to my work. Funding for this research was supplied by the University of British Columbia through numerous scholarships, fellowships, teaching assistantships, research assistantships, etc. Special thanks to MacMillan Bloedel Ltd. for allowing me access to the best data base in the country. Finally, I would like to thank my peers of the first hour: Mrs Karen Jahraus, Dr. Margaret Penner, Dr. Valerie LeMay, and Dr. Jim Thrower, for their constant encouragement and support. They initiated an instructive forum where ideas and opinions were debated freely. Many thanks to Valerie who also commented on zillions of earlier drafts, and to Margaret whose own acknowledgement "inspired" this one. ix Chapter 1 Introduction The main purpose of forest management is to maintain or increase the yield of the different forest resources. This goal requires a knowledge of current inventory and growth potential. Growth potential is the change in the current inventory that would occur without harvesting. However, estimating growth potential is a difficult task because the forest is a dynamic system. Quantifying the dynamic processes in a forest is essential for forecasting the future inventory. One way of quantifying the timber aspect of the dynamic processes of a forest is to use permanent sample plots. Monitoring the same piece of land over time represents a logical way to assess timber growth. In British Columbia, permanent sample plots were first established for growth and yield purposes in the late twenties and the early thirties. They became more popular in the sixties and seventies because of an increased interest in predicting yield from managed stands (Marshall and Jahraus 1987). Today, many agencies have a network of recent permanent sample plots, which have had only a few measurements. Timber dynamics can be represented by a mathematical model that ideally has the following properties: representativity, tractability and simplicity. This model is called a growth model. The model should be representative of the biological behaviour and of the stochastic nature of this behaviour; it should be tractable so that estimation is possible, and simple in its parameterization. 1 2 Chapter 1. Introduction Modeling growth of permanent sample plots allows for the projection of sample plot inventories. Projections of temporary sample plot inventories can also be obtained. This information will be needed to estimate the timber production of a management unit and to determine an appropriate level of harvesting for this management unit. Most techniques used today to quantify inventory changes do not take advantage of all the information available from permanent sample plots. Techniques especially designed for remeasured entities are required. As most forest agencies have a large proportion of recent permanent sample plots, these techniques must be compatible with sample plots that have had only a few measurements. The objective of this dissertation is to define a modeling framework that will make use of all the information available from recent permanent sample plots in order to project an inventory with precision and accuracy. Estimates of the variability of the projections is also of special interest. For simplicity, only linear models are considered. The general linear statistical model of permanent sample plot collections can be expressed by: Yij — floij + PlijXuj + . . . + 0Kij^Kij + e;j where Yij is the dependent variable measured on individual i [i = 1, . . . ,N) at time j (j = 1,... , T); Xpij is the variable p (p = 1,..., K) measured on individual i at time j; 0pij is the parameter associated with X^j, fioij is the intercept, and e^- is the error term on individual i at time j. Judge et al. (1985, p. 515-516) classified the parameter estimation techniques of such models in 5 broad categories: 1. all parameters are constant over individuals and time; 2. the intercept varies over individuals only, the other parameters are constant over individuals and time; Chapter 1. Introduction 3 3. t h e intercept varies over i n d i v i d u a l s a n d t i m e , t h e o t h e r p a r a m e t e r s are constant over i n d i v i d u a l s a n d t i m e ; 4. a l l p a r a m e t e r s v a r y over i n d i v i d u a l s o n l y ; 5. a l l p a r a m e t e r s v a r y over i n d i v i d u a l s a n d t i m e . I n forestry, m o s t techniques d e v e l o p e d for f i t t i n g m o d e l s o n p e r m a n e n t s a m p l e plots have a s s u m e d c o n s t a n t p a r a m e t e r s over t i m e a n d i n d i v i d u a l s . T h i s fails t o recognize t h e i n d i v i d u a l i t y o f each s a m p l e p l o t . O t h e r scientific fields d e a l i n g w i t h collections o f remeasured i n d i v i d u a l s have u s e d m o d e l s w i t h at least o n e p a r a m e t e r v a r y i n g over i n d i v i d u a l s . T h e s e m o d e l s c a n " i n d i v i d u a l i z e " t h e g e n e r a l m o d e l . H o w e v e r , recent p e r m a n e n t sample plots have some p a r t i c u l a r p r o b l e m s t h a t m o d e l s u s e d i n o t h e r fields d o n o t address. These include: 1. t h e p l o t s are n o t t h e same age at t h e first measurement; 2. t h e t i m e i n t e r v a l b e t w e e n m e a s u r e m e n t s is n o t a l w a y s t h e s a m e w i t h i n a plot; 3. plots are n o t r e m e a s u r e d at t h e s a m e t i m e ; 4. t h e n u m b e r o f r e m e a s u r e m e n t s are s m a l l . A s a general r u l e , s p e c i a l a t t e n t i o n m u s t b e g i v e n t o t h e error s t r u c t u r e o f t h e m o d e l a n d t h e a s s u m p t i o n s a b o u t t h e c o v a r i a n c e m a t r i x o f t h e r a n d o m terms i f efficient e s t i m a tors are r e q u i r e d . T h e u s u a l a s s u m p t i o n s r e q u i r e d for s i m p l e o r m u l t i p l e l i n e a r regression are frequently n o t v a l i d w i t h r e m e a s u r e d i n d i v i d u a l s . B e c a u s e each p l o t is remeasured at different t i m e s , t h e p l o t s are often referred t o as a time-series. Observations from a time-series are often correlated; t h i s t y p e o f c o r r e l a t i o n is c a l l e d s e r i a l c o r r e l a t i o n . O b servations m e a s u r e d at t h e same t i m e c a n also b e c o r r e l a t e d ; t h i s t y p e o f c o r r e l a t i o n is 4 Chapter 1. Introduction called contemporaneous correlation. Both types of correlation provide information about the remeasured entities needed to estimate model parameters efficiently. In this dissertation, a new framework for projecting recent permanent sample plot inventories is proposed. This new framework addresses the complexities of remeasured entities particular to forestry. The variable of interest is growth rate, that is, the change in volume per unit of initial volume. 1 For short projections, an intuitive technique to predict future volumes on a given plot is to compound the current volume with the growth rate recently observed on this plot. Another technique, requiring a collection of permanent sample plots, is to regress the observed growth rates on stand attributes. The volume is then updated with the growth rate predicted from the regression. Another option, the one favoured in this thesis, is to use an average of both techniques, with each technique's growth rate estimates weighted proportionally to its reliability. The framework employed a two-stage model. In the first stage, average recent growth rates are obtained for individual plots. In the second stage, model parameters are estimated assuming growth rates to be a function of site index and initial volume. The predictor for a sample plot is a weighted average between the growth rate estimated in the first stage and the growth rate predicted from the second-stage model. Statistically, the framework can be explained in the following way. A general model exists, explaining the growth rate from two stand attributes: site index and initial volume. However, each individual permanent sample plot follows its own model; these individual models vary randomly around the general model. The individual model can be expressed as the general model plus an error term specific to a particular plot. The actual growth This variable is also sometimes referred to as relative growth rate. In this thesis the term growth rate will be used. 1 Chapter 1. Introduction 5 rate observations also vary, but around the individual model. The framework can therefore be considered a hierarchical model: the observations are random variations of the general model's random variation. Repeated observations on the same individual and an estimate of the general model can yield a prediction of the random error term specific to a particular plot. Adding this term to the estimated general model gives a prediction of the individual model. The individual model is "predicted" and not "estimated" because the individual model is a random variation of the general model. The predicted individual model is then used to provide a prediction of the growth rate used to update the current volume. There are many reasons to prefer an estimate of the individual growth rate rather than a global estimate. The individual estimate can be used as a decision aid for plot remeasurement scheduling. It can also be employed in sampling with partial replacement to update the plots that have not been remeasured. A third possible use of individual growth rates is the ordination of the sample plots in groups to study the causes of the differences among groups. Most techniques used in forestry have failed to recognize the stochastic nature of this individual relationship. They also have failed to take into account that many permanent sample plots in North America are less than 25 years of age and have been measured less than 5 times. The proposed framework will provide a better understanding of the information given by permanent sample plots, especially recent ones, and will result in more efficient fitting techniques. The dissertation has been organized according to the following format: a brief presentation of the past work in growth and yield and in other fields using data sets containing remeasured individuals is discussed in Chapter 2; the statistical model, the various assumptions and the estimators are discussed in Chapter 3; a numerical example and results Chapter 1. Introduction 6 are presented in Chapter 4; these results, along with the statistical model, are discussed in Chapter 5, and, finally, conclusions are drawn in the last chapter. Chapter 2 Literature Review Permanent s a m p l e p l o t d a t a bases are a c o l l e c t i o n of plots r e m e a s u r e d a n u m b e r of t i m e s . R e m e a s u r e d d a t a bases are not u n i q u e to forestry. R e s e a r c h scientists i n business, e c o n o m i c s , b i o l o g y a n d m e d i c i n e have d e v e l o p e d techniques c u s t o m i z e d to t h e i r specific p r o b l e m s i n v o l v i n g entities w i t h r e p e a t e d m e a s u r e m e n t s over t i m e . A n o v e r v i e w of these techniques is p r e s e n t e d i n t h e f o l l o w i n g pages. A b r i e f h i s t o r y of forestry g r o w t h a n d y i e l d c a n be f o u n d i n t h e first s e c t i o n . T h e s e c o n d section groups the techniques developed i n econometrics. M u c h a t t e n t i o n has been g i v e n i n business e c o n o m i c s research t o d a t a bases s i m i l a r to p e r m a n e n t and s a m p l e p l o t d a t a bases. R e s e a r c h o n b i o l o g i c a l g r o w t h curves is r e v i e w e d i n t h e t h i r d s e c t i o n . B e c a u s e trees are b i o l o g i c a l organisms, growth. general k n o w l e d g e o f b i o l o g i c a l g r o w t h c a n help u n d e r s t a n d F i n a l l y , t h e last s e c t i o n is a b r i e f p r e s e n t a t i o n expectations. tree or s t a n d of c o n d i t i o n a l densities and U s e f u l m a t r i x i n v e r s i o n rules are also g i v e n . T h e s e b a s i c rules w i l l help i n u n d e r s t a n d i n g the s t a t i s t i c a l m o d e l suggested i n C h a p t e r 3 to project i n v e n t o r y . 2.1 Forestry Growth and Yield Literature P r e d i c t i n g y i e l d , a l o n g w i t h d e t e r m i n i n g t h e c u r r e n t i n v e n t o r y , is a c o m m o n forest m a n agement a c t i v i t y . A v a i l a b i l i t y o f r e l i a b l e i n f o r m a t i o n , c o m p u t i n g t e c h n o l o g y a n d inform a t i o n requirements inventory projections. h a v e i n f l u e n c e d the q u a l i t y of the i n f o r m a t i o n desired f r o m forest Permanent s a m p l e p l o t d a t a bases have b e c o m e the m a i n source 7 Chapter 2. Literature Review 8 of information for these projections. A historical perspective of the uses of permanent sample plot data as a source of information is presented in the following pages. Yield is highly correlated with a few well-known variables: species, age, ecological association, climate, edaphic conditions, among others. Nineteenth century German foresters used the variables age, species, and site class to create their normal yield curves. "Normal" in this case merely meant that the variability in stocking was omitted by using only fully-stocked, even-aged, monospecific temporary sample plots. At any point in time, these curves represented the upper bound or potential yield of a stand. The interpretation of predictions from these tables was left to the manager's judgment and experience. To make these tables applicable to understocked forests, Gerhardt (1930, in Davis and Johnson 1987; p. 106) developed a simple correction factor. Growth was estimated from the percentage of stocking compared to that of a fully-stocked forest, the predicted growth for a fully-stocked forest and the species. With the correction factor, it was assumed that all stands were moving toward full stocking at a rate depending on the species and the stocking. Schumacher (1939) published a simple empirical yield function where the logarithm of volume was a function of site index (S) and the reciprocal of age (1/A): InV = 3 + 0 Pi{l/A) + 8 {S) + 3 (S/A) 2 3 No statistical assumptions were made to develop this model; no statistical inference could be drawn from this model. The parameters were fitted using least squared error as the criterion of optimality. Schumacher's model was among the first of the mathematical expressions for yield. Buckman (1962) recognized that serially correlated data from permanent plots were providing misleading significance levels and confidence intervals when analyzed as independent observations. 9 Chapter 2. Literature Review Clutter (1963) recognized the need for compatible growth and yield models, where predicted growth is the derivative of predicted yield. He justified the use of volume on a logarithmic scale to fit his growth and yield equations by assuming a better compatibility in general with "the statistical assumptions customarily made in regression analysis (linearity, normality, additivity and homogeneity of variance)." Leak (1966) developed a technique for providing a large-sample estimator of the overall slope of a random coefficient model. He first assumed a linear relationship between the dependent and the independent variable for each sample plot, with the error terms independently distributed as N(0,crf), where af is the variance of an error term observed on sample plot i. These individual regressions were solved using the least squares technique. On each plot, he assumed the same number of observations and a similar distribution of the independent variable, X, around the average, X (i.e., the corrected sums of squares of the variable X were approximately equal). The expected value of any individual slope was the overall slope. This was estimated as the simple average of the individual slopes. Assuming that a? was equal among plots, the standard error of the estimated overall slope was the standard error of the mean slope. Curtis (1967) developed a method to estimate the gross yield of Douglas-fir. He recognized that the usual assumption of independence was probably violated when fitting equations to permanent sample plots. To handle the problem, he first fitted linear growth equations on each plot without regard to autocorrelation using the least squares technique. Then, the coefficient of determination was computed for u^_i and ui, the consecutive estimated residuals on a plot. If the coefficient of determination r was small, 2 autocorrelation was considered negligible; otherwise, the analysis was repeated using a single randomly selected point per plot. Swindel (1968) showed the theoretical bounds on the bias of the variance of parameter estimates when serially correlated data were assumed independent for multiple linear 10 Chapter 2. Literature Review models estimated with least squares. Sullivan and Reynolds (1976), using Swindel's work, provided an example for data with two measurements. Sullivan and Clutter (1972) developed growth and yield models using permanent sample plots with two measurements per plot. They assumed common parameters over plots and time, homogeneous variance within a measurement period, serial correlation p between the first and the second measurements, and no contemporaneous correlation, in that the volumes measured in a given year were uncorrelated. If = (Vn V{2) is the vector containing the first and second measurements of volume on plot i, then, for any i 7^ j , V ; and Vj are uncorrelated and where p is the correlation between the first and the second measurement. They derived the maximum likelihood estimates of multiple linear models by an iterative procedure. Seegrist and Arner (1978) and Amer and Seegrist (1979, 1980) extended this method to allow multiple and varying number of remeasurements. 1 Ferguson and Leech (1978), fitted a two-stage model drawing on the theory relating to random coefficient modeling. In the first stage, they fitted a quadratic model to each plot with the reciprocal of age as the independent variable using ordinary least squares. They assumed that the coefficients were different among plots: ln V {j = Ao +ftiUMy)+ ft2(l/Ay ) 2 + where V{j is the volume measured on plot i at time j, /3,-j. is the coefficient k on plot i, Aij is the age of plot i at time j, and e^- is an error term. Using matrix notation, V is the column-vector of the logarithmic volumes; 6 is the vector of parameters to be estimated; X is the matrix of independent variables, and e is the vector of error terms. The first 1 In econometrics nomenclature, this type of estimation is called classical pooling. Chapter 2. Literature Review 11 stage can be written V = X8 + e. With the assumptions that e is distributed with mean vector zero and variance ft and is uncorrelated with /? or X , the least squares estimate of 8, 8, is unbiased with variance [x'rr'x]- . 1 They then considered 8 as a random variable that could be predicted from certain independent variables (second-stage models). The set of independent variables to predict 8 do not need to be related to the set of independent variables used to predict V . They fitted three different models: to predict the intercept, the coefficient of the linear term and the coefficient of the quadratic term. The second-stage model structure was where d^i is the parameter associated with the variable I for the coefficient k\ Ziki is the corresponding explanatory variable on plot i, and u;*. is the error term. Using matrix notation, the second stage can be written a = ze + u. Assuming that u is distributed with mean vector zero and variance S and is uncorrelated with 8 and Z, the random variable 8 is distributed with mean vector Z9 and variance S . Because 8 was considered a random variable, the distribution of 8 in the first stage was actually the conditional distribution of 8 given 8 with E\j3\8] = 8, and vai0\8) = [X'f2 X]_1 - l Combining the conditional distribution of 8 given 8 with the distribution of 8, the unconditional distribution of 8 will have ze, and Chapter 2. Literature Review 12 var(/3) Because Vt and £ = [X'fr'X]" 1 + E. are unknown, they must be replaced by estimates. In the first- stage model, Ferguson and Leech assumed homogeneous variance within plots, no serial correlation, and heterogeneous variance among plots. In the second-stage, they assumed a homogeneous covariance matrix for the coefficients and independence between sets of coefficients measured on different plots. W i t h these assumptions they developed a feasible generalized least squares ( F G L S ) algorithm. They also investigated more restrictive assumptions about the error structure. Commenting on this study, Davis and West (1981) claimed that no proper confidence intervals nor test of significance can be computed with this approach. Ferguson and Leech (1981) replied that asymptotically normal statistics have proved adequate for most practical purposes using large samples. Garcia (1983) fitted a height growth model with a stochastic differential equation. Height growth was assumed to be a function of height (if): dH/dt = f(H) T h e function of height was assumed to be (^if* 2 — d> H, where </>i, d>2, and <f>3 are 3 the function parameters. This function integrates to the well-known Chapman-Richards function (Chapman 1961, Richards 1959). T h e model included an error term to account for the random environmental variation in growth and the measurement error. The environmental error was assumed to be a Wiener process. This means that the variation in H accumulated over a short time interval is normally distributed with zero mean and a variance increasing with the interval length, and that errors for nonoverlapping time intervals are independent. of observed height. T h e measurement error was assumed to be a function T h e plots were assumed statistically independent. T h e parameters were simultaneously estimated with an iterative maximum likelihood procedure. Some parameters were global, that is identical across plots; some were local, that is site specific. 13 Chapter 2. Literature Review Gregoire (1987) applied error component modeling to permanent sample plot data. The basic model was Yij — ^^dpXpij -f- Ujj p = e; + ij + Vij with XUJ defined as 1. The error was divided into three components: one for plotdependent variation (e,), one for time-dependent variation (rj) and one, variation unexplained by plot-dependent or time-dependent effects; (V;J), for the was the dependent variable measured on plot i at time j, and X^j was the variable p measured on plot i at time j. The expected value of each component of the error term was assumed to be zero. The error term components were also assumed to be uncorrelated with the independent variables and uncorrelated with each other. Alternative error covariance structures were analyzed. When compared to the least squares technique, with the undivided error terms assumed independently and identically distributed, the results were not conclusive. In general, the error component models performed better than the least squares for the likelihood criterion but worse for the minimum prediction error criterion. Gregoire also pointed out the need to find good statistics to compare alternative models. Biging (1985) derived a site index model using random coefficient modeling theory in order to account for between-tree differences in individual tree height growth. The error structure was identical to Swamy's (1970) model, and the parameters were estimated using Maddala's (1977) estimators. Finally, Tait et al. (1988) developed a simple, two-equation growth model where the growth of the average tree was a function of the average tree volume, site quality and the number of trees per hectare, and the mortality was a function of an index of the rate of area expansion of a tree. They simultaneously estimated the best model parameters (i.e., the global parameters) with the best initial values for each plot (the local parameters). 14 Chapter 2. Literature Review This type of analysis is close to an analysis of covariance where each plot is considered a different treatment with two covariates, the initial stand density and volume. This brief historical report illustrates the change of attitude toward the information pool contained in permanent sample plot data bases. It is now recognized that an observation at a point in time on a given individual provides partial information, on one hand, on the same individual at other points in time and, on the other hand, on other individuals at the same point in time. Efficient parameter estimation techniques must exploit this information. 2.2 The Econometrics Literature In econometrics literature, a data base that provides repeated measurement for each of a number of entities is called a pooled cross-sectional and time-series data base (Dielman 1983). The term panel data is also often used. This type of data base is very common in econometrics. The abundance of information has led econometricians to develop techniques that take advantage of this information to explain and predict economic phenomena. Because of the similarities with forestry permanent sample plots, there has been a growing interest in forestry to adapt techniques developed in econometrics (e.g., Furnival and Wilson 1971, Ferguson and Leech 1978, Gregoire 1987, LeMay 1988). The literature about pooled cross-sectional and time-series data is extensive and only a brief overview of the major developments will be presented here. 2.2.1 Estimated Generalized Least Squares When a simple or a multiple linear model is fitted to predict or to explain the outcome of a dependent variable, the most common assumptions about the error term are zero expectation and spherical disturbance, (i.e., the covariance matrix of the error term is Chapter 2. 15 Literature Review <J I). These models are easily estimated with the ordinary least squares (OLS) technique. 2 However, the assumption of spherical disturbance is often not met. Another technique, called estimated generalized least squares, allows for estimating efficiently the parameters of such models. The general linear model can be expressed as Y = X/3 + u where Y is the column-vector of dependent variables, X is a matrix of independent variables, 8 is the column-vector of model parameters, and u is the column-vector of random error terms. Aitken (1935) showed that if the covariance matrix of u is fl, then the best linear unbiased estimate (BLUE) of the parameters is P = [X'fT'X^X'fT'Y. This estimate has been called the generalized least squares (GLS) estimate or Aitken estimate. If the error terms are assumed to be normally distributed, the GLS estimate is also the maximum likelihood estimate. Unfortunately, most of the time, fl is unknown. In this case, a two-step technique is required to obtain what has been called the estimated generalized least square (EGLS) estimate or the feasible generalized least square (FGLS) estimate. The first step of the E G L S technique is to estimate fl. In the most general case, fl will contain M ( M + l ) / 2 unknowns where M is the total number of observations. Because the number of unknowns is larger than the number of observations, assumptions about the structure of fl must be made to restrict the number of unknown elements to a manageable number. In the second step, fl, the estimate of fl, is simply substituted into the Aitken estimate. Consequently, the F G L S estimate of 8 is (3 = [ X ' f V ' x ^ X ' f V ' Y 16 Chapter 2. Literature Review which is the same as the GLS estimate except for 17 which was replaced by an estimate. The double hat above 0 emphasizes that it is estimated from a two-step technique. If Y is replaced by X/3 + u in the EGLS estimate, then it is easily shown that /3 =/3 + (x'rT'x^x'fVV Because ti and u are usually correlated, inferences about 0 need to be based on the asymptotic distribution of 0 (Judge et al. 1985; p. 175). Generally, 0 will be asymptotically normally distributed if u is normally distributed and ti is a consistent estimator for il. An estimator d is a consistent estimator for a if, for any £ > 0, Hm P[\a - a | > (] = 0. The EGLS technique is the general approach for obtaining the BLUE of the parameters of any linear model. The OLS technique is a special case, particularly restrictive, of the EGLS technique. With pooled cross-sectional and time-series data, the assumed structure of 0 will have a supreme importance on the structure of the covariance matrix of the error terms. This is why the EGLS technique for pooled cross-sectional and timeseries data has been organized in three different sections corresponding to the general assumptions about the structure of the model parameters: all parameters are constant, only the intercepts vary, and all parameters vary. 2.2.2 All Parameters Are Constant Each individual measured over time can be represented by its own model Yi = Xifr + Y 1 {1 • XKH U i 0iO u; = Xi = Y IT Uil 1 X L I T . • X K i T 0iK MiT 17 Chapter 2. Literature Review When it is assumed that all individuals follow a similar behaviour, the model parameters are assumed to be equal over individuals. The variation between individuals is then explained by the stochastic error term only. This is called classical pooling (CP). If the error terms on the observations from a single individual were uncorrelated, the observations from this individual would oscillate around their expected values. However, because of serial correlation, a given individual can perform constantly above or below the expected behaviour. Classical pooling is used when there is no reason to believe that the different behaviour between individuals can be explained other than by serial correlation and chance. 2.2.3 O n l y the Intercepts Vary Sometimes, it is possible to explain partially the difference between individuals by their initial states. In these cases, it is assumed that the partial derivatives, that is the slopes of each independent variable, are constant over individuals and time, and that the intercepts will vary over individual and time. In this section, the general model describing such a situation is presented. Alternative approaches to parameter estimation depend on the assumptions about the nature of the intercepts. The intercepts can be assumed to be fixed or random. Both alternatives lead to different estimation techniques that are discussed separately. Finally, the model and the estimation techniques are summarized in a short recapitulation. 2.2.3.1 T h e General Model In the most general case, the error term of the general linear model can be divided into three components: one to account for the difference between individuals, one to account for the difference between initial time periods, and a third for the variables unaccounted Chapter 2. Literature Review 18 for in the model that are not individual-dependent or time-dependent. This gives K Yij = B + ]T BpXpij + e,- + rj + Vjj 0 P =i where Yij is measurement j on plot i , /3 is the average intercept, Xpij is the variable p 0 measured on plot i at time j, ej is the individual error term, ij is the temporal error term, and Vij is the unexplained variation. T h e individual intercept is defined by combining ei and Tj with the average intercept. These different intercepts will likely arise from omission of important unobservable or unmeasurable variables in the model (Dielman 1989, p. 49). 2.2.3.2 The Intercepts Are Fixed In this case, the individual error component and the temporal error component are assumed to be fixed with mean zero (i.e., ^ i e = 0 a r m 2~Z j r 0) = component is random with mean zero and variance a . 2 a i m the third unexplained T h e estimation technique is called analysis of covariance ( A N C O V A ) or least squares with dummy variables. M u n d lak (1961) was among the first to use the A N C O V A for analyzing pooled cross-sectional and time-series data. There are a few disadvantages of A N C O V A . Since the different intercepts are considered fixed, they are using up as many degrees of freedom. Also, the different components of the error term represent some ignorance; the model does not explain the differences between initial states. Maddala (1971) suggested that this specific ignorance was not really different from the general ignorance un and should be considered in a similar way. This would also decrease the number of fixed parameters to be estimated, increasing therefore the number of degrees of freedom. This led to the assumption of random intercepts. Chapter 2. Literature Review 2.2.3.3 19 The Intercepts Are Random In this case, it is assumed that the various components of the error term are random effects with mean zero and specific variance: E[ei] = E[xA = E[v } = 0 {j m E[vl] = *j = 4. It is also usually assumed that e;, ij and V J J are mutually independent. T h e estimation technique based on these assumptions is called error component modeling ( E C M ) . Wallace and Hussain (1969) suggested estimating the variance components using the residuals obtained from the O L S estimate of the individual models. These estimators are consistent when the independent variables are assumed nonstochastic. Other estimators of the error components have been developed (e.g., Amemiya 1971, Swamy and Arora 1972, Rao 1972, Fuller and Battese 1974). A l l these estimators have similar asymptotic properties. Consequently, the choice of an estimator cannot be based on its asymptotic properties. One must consider the simplicity of the estimator, finite sample results and Monte Carlo studies of small sample performance (Baltagi 1981). T h e finite sample properties of some estimators have been investigated analytically by Swamy and Arora (1972) , Swamy and Mehta (1979) and Taylor (1980). Monte Carlo studies were published by Arora (1973), Maddala and Mount (1973) and Baltagi (1981). Swamy and Mehta (1973) analyzed the error component model in a Bayesian framework. 2.2.3.4 Recapitulation T h e difference between individuals can sometimes be explained by different initial states (or model intercepts). These different intercepts come from partitioning the error term Chapter 2. Literature Review 20 into generally three components: an individual error term, a temporal error term, and an unexplained error term, and from the combination of the common intercept with the individual and the temporal error terms. If the individual and the temporal error terms are assumed to be fixed, then the ANCOVA is the appropriate EGLS technique. If the individual and temporal error terms are assumed to be random, then ECM must be used. 2.2.4 All Parameters Vary- Some time-series observations of different individuals are highly correlated with the same set of independent variables but with slopes varying over individuals. Not only are their initial states different but also the increment in the dependent variable for one unit of change in one of the independent variables. This will occur if an important variable for all the individuals is omitted from the model or if a variable important for some individuals is omitted. This section follows the same format as the previous section. The general model describing the situation is presented first. The parameter estimation technique when the parameters are assumed to be fixed comes next, followed by the technique for random parameters. Last, the model and the techniques are briefly recapitulated. 2.2.4.1 The General Model The individual model is Yi = Xidi + Ui where Yn 1 Uil Ui = 0i = Y iT 1 X liT • • x KiT BiK 21 Chapter 2. Literature Review Then, the full model can be written: Y = XB + u where ' Pi Y = X = XJV 2.2.4.2 u = P = _0N UJV The Parameters Are Fixed With fixed parameters, efficient parameter estimates can be obtained using OLS if there is no contemporaneous correlation. When contemporaneous correlation is present, Zellner (1962) developed a technique using this supplementary information. He called the technique seemingly unrelated regressions (SUR). Estimators obtained with this technique are more efficient than simply applying least squares to each individual independently. The main assumptions about the error structure are: £[U;J] = 0, -Efujuf] = O^IT, and X,- is a fixed matrix in repeated samples. TV is the number of individuals, K is the number of independent variables in the model, and T is the number of observations on each individual. Based on the assumptions about the error terms, the covariance matrix of the error terms can be written il = S <g> I Oil where S = T &12 2N °~21 °~22 0~N1 0~N2 ••• °~NN CT When the contemporaneous correlation between individuals i and I is equal to zero, an = 0 for i / /, the estimates are easily shown to be identical to the OLS estimates Chapter 2. Literature Review 22 computed independently on each individual. The SUR estimator will also be the same as the OLS estimator if the X; are identical for all individuals. If u is normally distributed, and h m T _ [XTi X] _1 >00 _1 is a finite nonsingular matrix, the EGLS estimate is a minimum variance unbiased estimator, and is asymptotically efficient. To estimate ft, Zellner (1962) suggested using the residuals from the regression of Y{ on either the independent variables in the i th equation alone or all the independent variables in the system. Writing the residuals from either method as hij, the elements of S (or ft) are estimated by o-n = ijUij/{T u - K). Zellner (1962) also suggested iterating the estimator by recomputing the residuals using 8 to get a new estimate of ft, and so on until convergence occurred. Dhrymes (1971) showed that the iterative estimator is equivalent to the maximum likelihood estimator. Parks (1967) and Kmenta and Gilbert (1970) extended the theory to include the case where the disturbances are both serially and contemporaneously correlated. Schmidt (1977) examined the SUR model with missing observations. Swamy and Mehta (1975) also studied SUR models with missing observations but in a Bayesian framework. Phillips (1985) derived the exact finite sample properties of the SUR estimator. Zellner (1987, pp. 240-256), Srivastava (1973), and Zellner and Vandaele (1975) analyzed the SUR model in a Bayesian framework. 2.2.4.3 The Parameters Are Random The sets of different parameters can also be assumed to be random. The EGLS technique for such models has been called random coefficient modelling (RCM; Swamy 1970, 1971, 1974). Because the model parameters link the same independent variables to the dependent variable, they could very well be related. If it can be assumed that many Chapter 2. Literature Review 23 independent unobservable or unmeasurable factors can influence a model parameter, this model parameter can be seen as a random variable. These excluded variables can be assumed to counteract each other, so that their overall effect is negligible. Consequently, the expected value of the model parameter is the mean parameter, and its variance is the variance of the aggregate effect of these excluded variables. The parameter 8{ can be rewritten Pi = P + e; where Po e i0 and 0 = PK Replacing this new expression for Pi in the original individual model yields Y i = Xi{p + e;) + Ui = Xij3 + X ; e ; + U i = Xi/3 + W i The error term w; is then the sum of two components, X i e i and Ui. The full model can be written Y = X ^+ w where (Xjei + Y = w = X = XJV (Xjv-Civ Swamy (1970, 1971) made the following assumptions: the number of individuals and the number of observations on each individual are larger than the number of independent variables; the U, are independently and identically distributed with -E[uij] = 0 and .E[uiU^] = <THIT] the e, are independently and identically distributed with -E[ej ] = 0 and p 24 Chapter 2. Literature Review i£[ejej] = A, and the u; and 0\ are independent for every i and /. The covariance matrix of w, CI, is the block diagonal matrix with The GLS estimate for /3 is Ex^^r^x^Y; p = rx'n^x^x'n-'Y= i i =£ w ^ i where hi W, =[X& ]- X Y 1 i i i = {£[A + ^ ( X j X i ) - 1 ] " 1 } - 1 ^ + ^(Xpt;)- ]- . 1 1 i The GLS estimate can be seen as a weighted average of the OLS estimate applied to each individual model. Mundlak (1978) noted that 0 can be written as a weighted average of a "between estimator" and a "within estimator." Rao (1982) gave the conditions (unrestrictive) under which 0 is an unbiased estimator of 0. In general, 0{ is the parameter of interest. Lee and Griffiths (1979, in Judge et. al. 1985; p. 541) showed that the best linear unbiased predictor (BLUP) of 0{ is ft = [A + *rA(X' X )}-^7.i(X' X )b -1 i i i i l + [A-i + ^(XjXOr A-\§. Note that the term "predictor" was used instead of "estimator" because 0i is a random variable. The BLUP for 0{ is a weighted average of the OLS estimate of 0{ and the overall estimate of 0 with the weights inversely proportional to their respective variances. In a Bayesian analysis of a random coefficient model, Smith (1973) found a numerically identical estimator. As would be expected, the simple average of the individual BLUP's gives the overall BLUE, that is ( £ f 0i)/N = fi. 25 Chapter 2. Literature Review As usual, the unknown A and an must be replaced by consistent estimators to obtain the FGLS estimate. If u^ is the OLS residual for individual i, Swamy (1970) suggested using: U-U; T —K ^ = ^E^bj-^Eb.bji-iEMxjx,)- 1 With these estimates, Swamy showed that /3 is asymptotically efficient as T — > oo. Carter and Yang (1986) showed that this result was also valid when N —> oo or T —> oo if an = a (i.e., if all the variances of the disturbance terms are equal). 2 Amemiya (1978) investigated the case where the individual parameters are a linear function of some explanatory variables Pi = Z;0 + . e i Swamy (1974) extended the theory to include the case with contemporaneous correlation and serial correlation. Finally, there are also techniques to estimate models with different sets of parameters varying across individuals and time. These models were not considered relevant to this dissertation and are not discussed (for information concerning these models, see Judge et. al. 1985; p. 545-550). 2.2.4.4 Recapitulation The difference between individuals can sometimes be explained by different sets of model parameters. These parameters can be fixed in which case the SUR technique is the appropriate EGLS technique if contemporaneous correlation is present. If the parameters Chapter 2. Literature Review 26 are f i x e d , a n d there is n o c o n t e m p o r a n e o u s propriate. T h e parameters c o r r e l a t i o n , t h e n the O L S t e c h n i q u e is ap- c a n also be a s s u m e d to v a r y r a n d o m l y a r o u n d a c o m m o n average. I n t h i s case, R C M is the a p p r o p r i a t e E G L S t e c h n i q u e . 2.2.5 Summary of t h e E c o n o m e t r i c s Literature T h e best, l i n e a r , u n b i a s e d e s t i m a t e ( B L U E ) of the p a r a m e t e r s of a l i n e a r m o d e l is g i v e n i n its g e n e r a l f o r m b y the e s t i m a t e d generalized least squares ( E G L S ) e s t i m a t e . When t h e error t e r m s are a s s u m e d to be i n d e p e n d e n t l y a n d i d e n t i c a l l y d i s t r i b u t e d , the EGLS e s t i m a t e is e q u i v a l e n t to the o r d i n a r y least squares ( O L S ) e s t i m a t e . However, this rarely is the case w i t h m o d e l s f i t t e d o n p o o l e d cross-sectional a n d time-series d a t a . T h e struct u r e o f t h e c o v a r i a n c e m a t r i x of the error t e r m is h i g h l y i n f l u e n c e d b y t h e a s s u m p t i o n a b o u t the i n d i v i d u a l m o d e l p a r a m e t e r s . m o d e l s s h a r e t h e same p a r a m e t e r s , T h e r e are three g e n e r a l cases: the i n d i v i d u a l t h e i n d i v i d u a l m o d e l s h a v e different i n t e r c e p t s c o m m o n slopes, or the i n d i v i d u a l m o d e l s have different sets of p a r a m e t e r s T h e s e different p a r a m e t e r s (CP) c a n b e a s s u m e d to be fixed or r a n d o m . is the a p p r o p r i a t e t e c h n i q u e i f a l l sets of p a r a m e t e r s but altogether. Classical pooling are a s s u m e d to be i d e n t i c a l across i n d i v i d u a l s a n d t i m e . A n a l y s i s o f covariance ( A N C O V A ) is u s e d w h e n the intercepts are a s s u m e d t o b e different b u t f i x e d , w h i l e error c o m p o n e n t m o d e l l i n g ( E C M ) is a p p l i c a b l e for different r a n d o m i n t e r c e p t s . T h e s e e m i n g l y u n r e l a t e d regression ( S U R ) t e c h n i q u e is u s e d to e s t i m a t e different sets of fixed p a r a m e t e r s , w h i l e e s t i m a t i n g different sets o f r a n d o m p a r a m e t e r s 2.3 requires u s i n g r a n d o m coefficient m o d e l l i n g ( R C M ) . The Growth Curve Literature I n d i v i d u a l m o n i t o r i n g is a c o m m o n p r o b l e m i n b i o l o g i c a l research. Therefore, various m o d e l s a n d t e c h n i q u e s h a v e been i n v e s t i g a t e d to a n a l y z e g r o w t h curves. I n the b i o l o g i c a l Chapter 2. Literature Review 27 l i t e r a t u r e , time-series d a t a are u s u a l l y c a l l e d l o n g i t u d i n a l d a t a or repeated measurement data. P o t h o f f a n d R o y (1964) suggested t h e m u l t i v a r i a t e a n a l y s i s o f v a r i a n c e m o d e l to estimate g r o w t h curves. The M A N O V A (MANOVA) m o d e l requires t h a t a l l i n d i v i d u a l s be m e a s u r e d at t h e s a m e age. T h i s m o d e l a l l o w e d confidence i n t e r v a l e s t i m a t i o n a n d hypothesis testing. R a o (1965) e x t e n d e d P o t h o f f a n d R o y ' s m o d e l , t r a n s f o r m i n g i t to a r a n d o m coefficient m o d e l w h e r e t h e m a t r i x o f i n d e p e n d e n t across i n d i v i d u a l s (i.e., X; = X for a n y i). variables were i d e n t i c a l G r i z z l e a n d A l l e n (1969) u s e d a n a n a l y s i s of covariance t e c h n i q u e w h e r e t h e w e i g h t i n g w a s o b t a i n e d f r o m a subset o f covariates. Geisser (1970) p r o v i d e d a B a y e s i a n a n a l y s i s o f t h e M A N O V A m o d e l . F e a r n (1975) a p p l i e d t h e general B a y e s i a n l i n e a r m o d e l p r o p o s e d b y L i n d l e y a n d S m i t h (1972) to g r o w t h curves. H e also s t u d i e d p r e d i c t i o n s g i v e n a s a m p l e f r o m this m o d e l . R a o (1975) gave a n e m p i r i c a l B a y e s s o l u t i o n t o a r a n d o m coefficient m o d e l w h e r e the p a r a m e t e r s o f t h e i n d i v i d u a l m o d e l s were a s s u m e d r a n d o m l y d r a w n f r o m a c o m m o n d i s t r i b u t i o n w h o s e e x p e c t e d v a l u e was t h e average o f t h e i n d i v i d u a l p a r a m e t e r s . Laird a n d W a r e (1982), c o n s i d e r i n g t h e i n d i v i d u a l p a r a m e t e r s as m i s s i n g d a t a , u s e d t h e E M algorithm (Dempster et al. 1977) to e s t i m a t e them. W i t h t h e E M a l g o r i t h m , slow convergence or convergence to a l o c a l r a t h e r t h a n g l o b a l m a x i m u m are m a j o r concerns. B e r k e y (1982) e x t e n d e d R a o ' s (1975) t e c h n i q u e t o n o n l i n e a r m o d e l s . H u i a n d Berger (1983) also used a n e m p i r i c a l B a y e s i a n a p p r o a c h t o g r o w t h curves. T h e i r technique c o u l d b e c a l l e d a n e m p i r i c a l B a y e s e s t i m a t i o n o f a r a n d o m coefficient m o d e l . R a o (1987) d i s c u s s e d t h e p r o b l e m o f p r e d i c t i n g future o b s e r v a t i o n s w i t h g r o w t h m o d e l s . H e a n a l y z e d p a r a m e t r i c a n d n o n p a r a m e t r i c m o d e l s . F o r t h e p a r a m e t r i c m o d e l s , he c o n s i d e r e d b o t h B a y e s i a n a n d e m p i r i c a l B a y e s i a n techniques t o d e a l w i t h u n k n o w n p a r a m e t e r s . B a y e s i a n a n d e m p i r i c a l B a y e s i a n techniques curve literature. have predominated biological growth W i t h o u t s t a r t i n g a p h i l o s o p h i c a l d e b a t e o n h o w t h e y differ a n d w h i c h Chapter 2. Literature Review 28 one, if any, is more appropriate, one can notice that both are partially based on some auxiliary information. This auxiliary information, particular to an individual, individualizes a general model and its prediction of future observations. The prediction of future observations is dependent on the outcome of this individual-specific auxiliary information and on an estimate of the general model. 2.4 Conditional Densities and Expectations Many techniques for analyzing remeasured individuals and for predicting future observations described so far assume that future observations are conditional on past, random, observations. The modeling process, therefore, involves conditional densities and expectations. A conditional density is a probability density function (pdf) where the outcome of a random variable X depends on the outcome of another random variable Y. The variable of interest is thus X given Y, or X | Y, and its pdf is noted p(X \Y). Most of the theory concerning conditional pdf's was developed by Bayes (1763), and a brief summary is presented here. Some basic matrix inversion rules are useful for deriving conditional densities and expectations. These rules can be found in certain texts (see Smith 1973, for instance). Rule (1) (D + E F E ' ) - 1 = D 1 - D E(E'D~ E + F ^ ^ E ' D " 1 - D ^ D 1 Rule (2) (D+B)- 1 = D Rule (3) (D + B ) - ^ = I-(D 1 1 + B 1 ) - ^ 1 1 + B)- D 1 Conditional densities can be defined for univariate or multivariate variables. The multivariate case is more general. Let X and Y be two random vectors of n elements with pdf p(X) and p(Y), respectively. Their joint pdf is p(X,Y). Using Bayes' theorem on conditional probabilities, the joint pdf can be rewritten: p(X,Y) = p(X|Y)p(Y) Chapter 2. Literature Review 29 p(Y|X)p(X) From this theorem, the conditional pdf of X given Y can be defined as: P(X|Y) = p(Y|X)p(X) P(Y) When Y is given, the denominator p ( Y ) contains no unknown. It can be seen as a proportionality constant, so that the multiple integral of p ( X | Y ) over Xi, ... ,X n yields one. This constant will be equal to the inverse of the integral of the numerator over the domain of X : p(Y) = l / / / .../ P (Y\X)p(X)dX dX ...dX 1 2 n For instance, if Y given X is distributed as a i V ( X , f i ) , p(Y|X) = 1 (2TT)"/ \fl\^ 2 V - - ( Y - x y n - ^ Y - x ) and X is distributed as a N(p, A ) , P(X) = 1 (2TT)"/ 2 IA I/ -exp 1 2 -(X-p)'A-\X-p) the unconditional variable Y will be distributed as a N(p,A + ft) P(Y) = (2TT)"/ 2 IA + fl\V : ; exp hY-p)'(A + ftr^Y-p) To show this, rather than using the integral of the product of p ( Y | X ) and p ( X ) , an easy alternative can be used. The variable Y given X can be seen as the sum of the variable X , normally distributed, plus an error term, e, uncorrelated with X , normally distributed too, with expected value zero and variance ft. Obviously, Y will be normally distributed since the sum of normal variables yields a normal variable. The expected value and the variance of this normal variable can be found easily: Chapter 2. Literature Review 30 Y = X + e, E[Y] = E[X] + E[e] = fi, var(Y) = var(X) + var(e) = A + ft. Using the definition of a conditional density, it can be shown that the variable X given Y will also be normally distributed: -H(Y - xyn-^Y - x) + (x - ^A-^X -)> (27r)™|A| /2|f2|l/2 P 1 [X|Y] = eX M (i^/'lW'"* [-H( - ^)'( + ")"(Y - /*)}] A+ n Y P i 1 / A 1 2 X exp •-{(Y - xyn-^Y - x) + (x - U)'A-\X -(Y-py(A 1 + ny\Y-p)}} (A- + 0 - ) - ! / exp [-1 [(X - E[X\Y})'(A(2TT)"/ 2 I - p) 1 1 1 1 1 2 + fl^fX - £[X| Y]) where £[x|Y] = (A^ + n-^^n^Y + ^ ^ + n- )-^- /*1 1 The expected value of the conditional pdf X given Y is a weighted average between the expected values of the variables Y given X and X, where the weights are inversely proportional to their respective variance. The variance of X given Y is the inverse of the sum of the inverses of the respective variances. The expected value i£[X|Y] can be seen to be a random variable since it depends on the random outcome of Y. It is the sum of a constant, (A -1 + fi ) -1 -1 A /i, and a random vector normally distributed, Y, _1 multiplied by a vector of constants. Therefore, it will be normally distributed, with the Chapter 2. Literature Review 31 following expected value and variance: E[E[X\Y}} var(£[X|Y]) E[X]=p var(X)-£[var(X|Y)] = A - ( A - 1 + ft ) 1 -1 - If it is assumed that A and fi are two positive definite matrices, then it can be shown that var(X) is always larger than va,i(E[X | Y ] ) . A matrix A is larger than another matrix B if the matrix A — B is positive definite. Using matrix inversion rules (2) and (3) defined above, the expected value of the conditional density is equivalent to E[X\Y] = A(A + + ft(A + ft)~V- As demonstrated, statistical computations with conditional variables follow the same general rules as with unconditional variables, despite the more elaborate algebra. It is important to be famihar with the algebra of conditional variables to understand models making use of past observations to predict future observations. 2.5 Overview Permanent sample plot attributes can be assumed to be explained by some general biological phenomenon. Observations of these attributes on a given plot at a point in time will also be influenced by the individual character of the plot and time-related variables, such as climate. Repeated measurements on a group of plots provide information on both the influence of the individual character and the time period effect. This suggests that permanent sample plot observations can be explained by a general model plus random terms specific to the individual and the time period. Since future observations must be predicted on a given plot, a prediction of the individual character of the plot as well as an estimate of the general model are needed Chapter 2. Literature Review 32 to individualize the volume prediction. This approach recognizes the random nature of the relationship between observations coming from a sample plot. Future observations depend on the general model, and the outcome of the individual plot character. The individual plot character can be predicted from past observations. Therefore, predictions of future observations are dependent on past observations. Many approaches can be used to predict future observations based on past observations. The approach suggested in this dissertation was inspired by the empirical Bayes models described by Rao (1975, 1987) and Hui and Berger (1983), the random coefficient model (Swamy 1970, 1971, 1974; Amemyia 1978; Ferguson and Leech 1978; Biging 1985), as well as the theory related to error component modeling. Chapter 3 Statistical Methods Future volumes can be expressed as the current volume plus a volume increment. The analogy with the interest on the money in a bank account is often used to explain forestry ideas such as volume growth and allowable annual cut to non-foresters. Going from interest to interest rate to biological growth rate are steps easy to understand, and the mathematics relevant to interest rates can simply be adapted to a forestry context. Consequently, the parameter of interest in the proposed model is the growth rate. A biologically-reasonable model for predicting volume growth rate is defined and estimated. This predicted volume growth rate is used to compound the current volume of a plot, the same way money can be compounded with a certain interest rate. This chapter is divided into four parts. In the first section, the similarities between biological growth rate and interest rate are analyzed and explained. In the second section, a two-stage statistical model based on interest rate mathematics and fundamental principles of biological growth is suggested to predict volume growth rate. Parameter estimation is discussed and estimators are proposed in the third section. Finally, algorithms for compounding the current volume of a sample plot and for computing a confidence interval around the predicted volumes are presented in the last section. 3.1 The Timber Capital The trees on a permanent sample plot can be compared to a timber capital destined for yielding interest. The mathematics relevant to interest rates are simple and well-known; 33 34 Chapter 3. Statistical Methods converting them to growth rates is straightforward as it is shown in section 3.1.1 The main difference between money and trees is the uncertainty coming with observations on biological organisms such as trees. This uncertainty must be recognized and incorporated into the mathematical expression of the compounded capital. In section 3.1.2, the problem of uncertainty attached to volume observations is addressed. 3.1.1 The Interest Rate Analogy The growth in volume for a tree in a particular year is the amount of volume the tree gained between the beginning and the end of the growing season. When the growth is divided by the initial volume, it can be seen as a growth rate (i.e., the amount of volume per unit of initial volume gained in a year). The growth rate is the annual "interest rate" on the initial volume of a particular tree in a given year. This image can be extended to a plot or a stand without problem. In this section, the mathematics relevant to interest rates and their similarities with growth rates are explained. The interest rate at a particular bank will likely fluctuate over time. It will also likely differ between banks; two different banks can have different interest rates. In the same way, the growth rate of a tree (or a plot or stand) will be specific to a given individual and time. Mathematically, this gives V-y + 1 ) = ^(1+^.) (3.1) where Vij = volume measured on individual i at time j, Pij = annual growth rate on individual i at time j, (1 + pij) and = annual multiplicative rate. In forestry, permanent sample plots are commonly used to monitor growth. However, annual measurements on permanent sample plots are rarely available. The sample plots Chapter 3. Statistical Methods 35 are remeasured at a regular frequency, usually between three and ten years. When the time interval between measurements j and j + 1 is more than one growing season, periodic growth is then observed. In this case, the ratio between the volume at the end of the period and the initial volume is the periodic multiplicative rate. The periodic multiplicative rate is the product of the annual multiplicative rates. Since the annual multiplicative rates are not observed, the periodic multiplicative rate can be considered as the geometric mean of the annual multiplicative rates raised to a power, Lj, equal to the time lag between measurements j and j + 1: V where Pij = [H§ = 1 ) i{j+1) = V k ( l ( 3 . 2 ) ( 1 + )}^ - 1. Pij The periodic multiplicative rate is the percentage of initial volume obtained after a certain period of time. Therefore, the notation used for annual remeasurements is somewhat inadequate for periodic remeasurements since the subscript j in refers to the period between measurements j and j + 1 while in Vij it refers to the instantaneous moment at measurement j. Model (3.2) can be rewritten using the subscript p to refer to the time interval between measurements j and j + 1 with plot i and the initial volume on the volume at the end of the period: = V°(l + p ) ». L ip (3-3) With this notation, the growth rate is located in space and time. The subscript i indicates a geographic space, (delimited by some longitude and latitude) and the subscript p is temporal space (delimited by an initial year and a final year). This approach to growth modeling could be taken for other stand attributes, such as diameter, height, basal area or number of stems per hectare. For illustrative purposes, the approach is shown for plot volume. For the remainder of this thesis, the term volume will refer to plot volume. Chapter 3. Statistical Methods The 36 periodic multiplicative rate in model (3.3) can also be expressed using the Napierian base (e = 2.7182818 ...) K = V° x exp[8 L tp where 9 ip = ln(l + (3.4) f p ). ip The parameter &i is easily isolated p 1 0* = — I n V'P 1 v (3.5) ip Using the logarithm rules and the definition of a derivative, it can be shown that, as the time period goes to zero, 6{ becomes the instantaneous growth rate p lim 9j„ = lim *0 Lr In V v° v — 1 ip lim hm At—o Aln[V] At d\n[V] dt 1 dV (3.6) V~dt where t is time. Similarly, 0 L ip p can be shown to be the integral of the initial and the final measurements. with respect to time between T i m e can be expressed on a relative scale. T h e initial measurement is taken at time 0, and the final one is taken at time L . p di dt p This yields (3.7) which is the periodic multiplicative rate expressed in a logarithmic scale. The concept of growth as the derivative of volume with respect to time (or volume as the integral of growth) was recognized in the early sixties (Buckman 1962, Clutter 37 Chapter 3. Statistical Methods 1963). This is a simplification of reality; it assumes that growth is a constant process, which is not exactly true since growth stops in winter. A discrete analog would be to take the annual interest rate analogy. This approach has some shortcomings too; there is no growth assumed before the end of the growing season, similar to the annual interest on a bank account that is paid once a year. Another discrete approach would be to follow the modern bank system and use daily interest rate, but this would demand a tremendous data set. Assuming that growth is a continuous process leads to relatively easy mathematical computations while being close to reality. 3.1.2 Uncertainty and Growth Rate The deterministic model (3.4) could be used if the true initial and final volumes were observed. However, these quantities will likely be measured with some error. The observed volumes are random variables. The variability in observed volumes will introduce some variation into the observed growth rate. Because the growth rate is essentially the ratio of two variables, it becomes a random variable itself. A stochastic term must be incorporated in model (3.4) to account for this variability. A priori assumptions are then made about this error term in order to draw inferences using statistical theory after mathematically fitting the model. This stochastic term can be included in different ways. Three common ones are iVi/VP) = exp[0 L ] + u iVi/VP) = exp[(0 L ) + u ] (3.9) {V£/V?) = exp[(0 + u )L }. (3.10) ipJ p ip ip p (3.8) tp ip ip p The choice of a model depends on its mathematical tractability and its biological interpretation. Model (3.8) is a nonlinear model and is particularly difficult to fit to recent permanent sample plots. Model (3.9) is linear after a logarithmic transformation. 38 Chapter 3. Statistical Methods It adds an error term to the periodic growth rate. The error term will undoubtedly depend on the length of the time period even if not explicitly expressed in the model. Model (3.10) is also log-linear. The error term is included with the instantaneous growth rate. Its dependency on the time lag is therefore implicitly expressed in the model. This model has the advantages of being simple to fit and of corresponding to the interpretation of a growth rate whose variation is dependent on the period length. For these reasons, model (3.10) was considered the best approach. Bank account and trees have this point in common: they become bigger as time passes. However, while knowing the exact amount of money in a bank account is no problem, direct observations of plot volume is impossible. A n error term can be included in the mathematical expression of periodic growth rate to account for this uncertainty. Model (3.10) is the basic expression of growth rate with uncertainty. Within this basic expression, it is possible to assume that the growth rate, &{ , can be explained by some v biological phenomena. A relationship explaining #; from some variables can be substip tuted in the basic expression. This framework leads to the statistical model to predict future growth rates, which is discussed in the next section. 3.2 The Statistical Model The framework proposed for estimating the statistical model is a two-stage procedure. The first stage is the basic expression describing the mathematics of growth rate, and the second stage is a biological explanation of the true growth rate. The framework was designed specifically for the irregular remeasurements of permanent sample plots. These irregular remeasurements, along with overlapping time periods will create notation difficulty. Conventions must be defined to avoid confused notation. Once the notation is defined, each stage of the procedure can be discussed. The first Chapter 3. Statistical Methods 39 stage deals with the problem of a small number of observations on each plot and the error components. T h e second stage makes use of Von Bertalanffy's growth curve to estimate the growth rate from site index and volume. Finally, both stages are joined in the combined model. 3.2.1 Notational Conventions It will be assumed that there are N plots covering M intervals. Some time intervals may be overlapping. For instance, if some plots were measured in 1970, 1975 and 1980, and the other plots were measured in 1973, 1978, and 1983, there would be four time intervals: (1970,1975], (1973,1978], (1975,1980], and (1978,1983] . 1 observation in each time interval. Furthermore, Not every plot will have an the number of plots measured each year will vary with the money budgeted annually for the remeasurement of permanent sample plots. Consequently, the number of remeasurements will differ among plots and among time periods. It will be assumed that there are T; growth observations on plot i [i = 1, . . . , N), S p observations in time interval p ( p = l , . . . , M ) , and W observations in total where N W = Y,Ti i=i M = Y,S ; P P with W > N and S p > 1. = i Because time intervals may be overlapping, consecutive measurements on a plot do not lead to observations in consecutive time intervals. For the example mentioned above, the first set of plots would have observations in time intervals (1970,1975] and (1975,1980], while the second set of plots would have observations in time intervals (1973, 1978] and (1978, 1983]. T h e time interval numbering and the plot numbering are arbitrary. They are only coordinates to link an observation with geographic and time locations. Also, it is assumed that all plots used in the analysis come from the same geographic region, and 1 This notation supposes that the plots were measured at the end of the growing season. Chapter 3. Statistical Methods 40 are dominated by the same species. No notation is used to identify these two qualitative variables. 3.2.2 First-Stage M o d e l T h e first-stage model is the basic expression for periodic growth rate, with uncertainty included with the instantaneous growth rate. Recent permanent sample plots only have a few growth observations. Using an approximation, the proposed first-stage model can be estimated from plots that have had only two growth observations. This approximation is explained in the model definition section. Next, the error term is decomposed to take advantage of the measurements of many individuals in a given time interval. Assumptions about the error term and their justifications follow. Finally, the matrix notation of the model is given to simplify the notation and derivation of the parameter estimators. 3.2.2.1 M o d e l Definition T h e observed periodic multiplicative rate is the ratio V /V . ip ip T h e average annual mul- tiplicative rate observed during this time period is the observed periodic multiplicative rate raised to the power l/L . p Because the model (3.10) is in exponential form, it will be more convenient to work with the natural logarithm of this ratio. So let (3.11) It will be assumed that the variable Y{ growth rate, 8i , since L p p p represents an estimate of the instantaneous is usually small. It is the sum of the true instantaneous growth rate plus an error term: (3.12) Chapter 3. Statistical 41 Methods p l o t oge Figure 3.1: Growth Rate over Time for a Sigmoidal Yield Curve which corresponds to the logarithmic transformation of model (3.10). Recent permanent sample plots cover only a few time intervals, and if these time intervals are short, the whole period covered by the sample plot can still be considered close to zero, especially when, it is compared to the lifespan of the stand. Therefore^ the instantaneous growth rate could also be estimated from the ratio between the last and the first measurements on a sample plot. This corresponds to assuming a constant growth rate on a plot. Hence, the growth rates estimated in the intervals can be considered as repeated observations of the same parameter, the constant growth rate. If the yield is assumed to follow a sigmoidal shape over time, the growth rate in the early ages declines rapidly to level off as the stand gets older (Figure 3.1). Therefore, the assumption of constant growth rate seems reasonable for a large part of the plot's life. Obviously, this assumption is a compromise to approximate the trend of growth rate over time for a sigmoidal yield curve and to allow for parameter estimation. 42 Chapter 3. Statistical Methods This approximation will simplify the model, but also bring some limitations. It likely could cause some bias. Assuming a constant growth rate on a plot will tend to underestimate the growth rate at the first measurement and overestimate it at the last measurement. This bias should be negligible, or at least reasonably small, if the whole period covered by the plot is short enough. The period of time where the approximation is reasonable depends on the species and the age of the plot. The bias should be more pronounced on young plots. Also, the rate of change in the instantaneous growth rate is species-dependent; fast growing species will level off faster than slow growing species. Considering these limitations, the approximation can still prove useful in many cases such as older plots and fast growing species. Model (3.12) can be approximated by Yi p — Oi (3.13) + U;. This model can be estimated with as few as two growth observations on a permanent sample plot. That is, the model requires a minimum of three measurements on a permanent sample plot since the number of measurements is equal to the number of growth observations plus one. Also, measurements on different individuals in the same time period will allow decomposing u ; , the error term. p 3.2.2.2 Error Components Following the principles of error component modeling, when measured on a given individual (i.e., the individual is fixed), the error term u; can be divided into two components: p the error due to the time-dependent variables unaccounted for in the model and a term for the unaccounted variables that are not time-dependent. This gives u where ip = r + Vj. ip p (3.14) 43 Chapter 3. Statistical Methods r = error term explained by the time-dependent variables unaccounted for in the model. p It remains constant for all individuals in a time period but varies among time periods. V i = unexplained residual noise. It accounts for the observed difference between indip viduals with the same true growth rate in a given time period. Some unobservable or unmeasurable variables will have a similar effect on geographically close plots in a time period (Gregoire 1987). For example, climatic variables, and major insect and disease epidemics will uniformly affect a small region in a given time period. Total number of days in the growing season, rainfall, quantity and quality of sunlight, etc. should be roughly similar for all plots in a time period. However, it would be impossible to include all these variables in the model. Their cumulative effect will vary from time period to time period. With a two-way lay-out such as a panel data set, it becomes possible to estimate the variation of the random effect caused by these variables. It is therefore possible to distinguish the time-dependent ignorance from the unexplained ignorance. The error term in model (3.13) was decomposed to represent this knowledge about the stochastic variation in the model. 3.2.2.3 Assumptions Combining equations (3.13) and (3.14), the model for estimating the growth rate on plot i becomes: Y = 6i + T + v . ip p Assumption 1. E[T ] — E\v{ \ = 0 Assumption 2. E[T i ]' P p p q = 6 2 = Spq iip = q iipf^q ip (3.15) 44 Chapter 3. Statistical Methods Assumption 3. Assumption 4. E[vi Vj ] p E[T Vi ] p = cr if i = j and p = q = 0 otherwise 2 q 0 — q Assumption 5. r and v; are normally distributed. p p These assumptions imply that u; is normally distributed and p E[u ] ip E[u u ] ip =0 = o + 5 2 jq = S 2 = Spg 2 if i = j and p = q ii i ^ j and p = q Hpf^q Assumption 1 means that the expected value of the observed growth rate on a plot is assumed to be the true growth rate for this plot. Since Y± is the logarithm of a ratio of v random variables raised to a certain power, this assumption will make sense if and only if the expected value of the logarithm of the plot volume measured on plot i at time j is the logarithm of the true volume E[Y ] = 6i t=^> E[ln Vn] = l n ip where Vij is the true volume. The error of the logarithm of the observed volume has often been assumed to have zero mean in growth and yield literature (Schumacher 1939, Clutter 1963, Curtis 1967, Ferguson and Leech 1978, Gregoire 1987), and the same assumption is made here. The bias created by the approximation of a constant growth rate is assumed to be negligible. Assumption 2 concerns the time-dependent error terms affecting the growth rate. The effects of these error terms should be correlated with each other since the time Chapter 3. Statistical Methods 45 periods are overlapping. Following Garcia (1983), it can be assumed that the correlation is proportional to the overlap between two time periods. Nonoverlapping time periods are considered independent; the weather varies randomly from one period to another. Overlapping time periods share the same climatic conditions for a certain number of years which should indicate the presence of correlation. This assumption also implies that consecutive growth observations on a given individual are considered independent because they are always taken in nonoverlapping time periods. Let Cpq be the number of overlapping years between periods p and q, then the covariance between r and r , 8 , can be assumed to be p q m 2 *» ~ C p q ( L + p 2L L 5 p L) q ( q 3 - 1 6 ) where L and L are the respective lengths of periods p and q. p q To explain this estimator, let r be divided into two parts: r p temporal error term for the first L — C p remaining C m error term for the first Cpq years, and r p2 p2 where r pi is the p2 g q and r years, and r is the temporal error term for the m years. Similarly, let r be divided into r L — Cpq years. Both r pl gl and r g2 where r is the temporal gl is the temporal error term for the remaining g2 and r i are error terms for the same time period. Within a time g period, the variance is assumed constant. This means that the variance of r p2 is Cpq8 /L 2 p when estimated from time interval p. Similarly, the variance of the error explained by the same time interval but estimated from time interval q, i i, would be q correlation between r the Cpq p2 Cpq8 /L . 2 q Also, the and r is really the variance of the time-dependent error term for gl overlapping years. Therefore, of the correlation between r p2 C 8 /L 2 pq and C 8 /L 2 p pq q are two different estimates and r i . Averaging out both estimates will yield a single g estimate of the correlation between the two error terms, providing in the same time an estimate of the variance of the time-dependent error term for these C m overlapping years. Chapter 3. Statistical Methods 46 Mathematically, this yields + p 2 ) ( r i + r )] E[(i i r P E[i i ] pl g2 q + ql + E[i i } E[T i \ pl p2 q2 ql + £[r r ] p 2 9 2 £[r r i] p 2 g \{E[rl ] + E[? ]) 2 ql -(var(r ) + var(r )) p2 2 = 8 gl L L p q 2 (3.17) 2L L p q Two nonoverlapping time periods have Cpq — 0 satisfying the assumption of independence; two completely overlapping time periods have L = L — C^ which would yield p a covariance of 8 . Assuming a known form for 8 2 pq q is convenient since it decreases the number of unknown parameters to be estimated in the model. Assumption 3 implies that the unexplained error terms are independently distributed with common variance within a time period. This error term represents the stochastic variation between individuals observed in given time period after correction for the true growth rate. It accounts for the general ignorance in the model. For various reasons, some plots can be more sensitive than others in unusual but not extreme climatic conditions. For example, rainfalls slightly below the average can have serious effects on some plots and leave other plots almost unaffected. Other special conditions can prevail on a particular stand in a time period. A local insect infestation, for example, could affect only a small number of plots. For extreme conditions, it can effect all plots similarly. For instance, a major insect infestation would probably have the same kind of impact on all plots. The variance of this error term is assumed to differ Chapter 3. Statistical Methods 47 among time periods. The variation among plots within a time period will differ with the magnitude of the effect of climatic conditions on each plot. Some climatic conditions will cause a wide range of impact on individual plot growth rates while other conditions will affect all plots in a similar fashion. This error term can be seen as the interaction between time periods and plots, and it is assumed that the variance of this interaction varies with time periods. It is also assumed that the interaction between a given plot and a time period is uncorrelated with the interaction between the same plot and another time period, or with the interaction between another plot and the same time period. Assumption 4 implies that the time-dependent error term is uncorrelated with the unexplained error term. The individual plot behaviour in certain climatic conditions is unrelated with the average behaviour of all plots in the same conditions. Assumption 5 was justified by Aitchison and Brown (1957; p. 1-2) who mentioned that a variable which is the product of numerous independent random variables tends to be lognormally distributed around its true value. If the growth rate at any time is considered a random proportion of the initial volume, and if the growth rates are considered independent for a given individual, then the volume tends to be lognormally distributed. Therefore, the observed growth rate tends to be normally distributed. 3.2.2.4 Matrix Notation Model (3.15) and its assumptions can be expressed with matrices. To define the model using matrix notation, the observations are sorted initially by time periods and secondly by plots. The first observation is then the observation measured in period 1 on the plot with the smallest plot number from all plots measured in period 1, and is denoted Y„\. Chapter 3. Statistical Methods 48 The last observation is the observation measured in period M on the plot with the largest plot number from all plots measured in period M ( Y i ) . There are W observations in the M matrix containing the growth observations. The first-stage model is conditional on fl. The second-stage model will express fl; as a function of site index and initial plot volume. This approach will make the estimation of the covariance matrix of the random terms easier. The first-stage model is: Y = F9 + Gr + v (3.18) with V.l Y= r= 9 = 8 YiM N v= M T F, F = Fa ... F„ Fiw Gp i G = Gi G p = GpW where Fik — 1 if the k th element of Y was measured on plot i = 0 otherwise and Gk p = 1 if the k th element of Y was measured in time period p = 0 otherwise. Chapter 3. Statistical Methods 49 T h e m a t r i x of t h e s u m of b o t h error c o m p o n e n t s assumptions a b o u t the error t e r m s l e a d to t h e f o l l o w i n g E[w'] = E[rr'] where w o u l d be u = Gr + v . T h e various matrices S = diag (3.19) SA (3.20) 2 1 A12 A i 1 AMl A2 2 A = ... A i ... vq m 2 M M _ C (L and A M ~ v + L) q 2L L p q £ [ u u ' ] = t o G ' + S = n. (3.21) T h i s c o n c l u d e s the first-stage m o d e l . I n t h i s m o d e l , a c o n s t a n t g r o w t h r a t e is a s s u m e d for e a c h p l o t . S i n c e the i n d i v i d u a l is f i x e d , t h e first-stage m o d e l yields a n e s t i m a t e of the a p p r o x i m a t e d c o n s t a n t g r o w t h r a t e of the p l o t . T h i s e s t i m a t e comes f r o m the i n d i v i d u a l g r o w t h o b s e r v a t i o n s a n d c a n be c a l l e d the o b s e r v e d i n s t a n t a n e o u s g r o w t h rate. U s i n g a g r o u p of i n d i v i d u a l s , the g r o w t h r a t e c a n also be p r e d i c t e d f r o m i n d e p e n d e n t variables. T h i s is the second-stage m o d e l . 3.2.3 Second-Stage M o d e l T h e second-stage m o d e l p r o v i d e s a b i o l o g i c a l r e l a t i o n s h i p between the t r u e i n s t a n t a n e o u s g r o w t h r a t e a n d t h e site i n d e x a n d the v o l u m e . T h i s r e l a t i o n s h i p is d e r i v e d f r o m V o n Bertalanffy's growth curve. T h e b a c k g r o u n d e x p l a i n i n g the r e l a t i o n s h i p is p r e s e n t e d first. T h e s t a t i s t i c a l m o d e l a n d its a s s u m p t i o n s follow. F i n a l l y , the m a t r i x n o t a t i o n o f the m o d e l is g i v e n . 50 Chapter 3. Statistical Methods 3.2.3.1 Background It is reasonable to assume that the individual growth rate can be explained by some variables plus an individual-specific random term. For example, it can be assumed that the individual growth rate is a function of site index (SI) and initial volume (V). Any other factor that could have an influence on the individual growth rate can be included in a stochastic term. This function can be assumed to be identical for all individual growth rates, with fixed parameters and an additive error term 6 =f(SI ,V ) i i i (3.22) + e. i Fitting the relationship expressed in (3.22) would provide an estimate of the growth rate of a plot as explained by its site index and its initial volume. Also, the stochastic nature of the individual growth rate is recognized by the additive error term. The structure of the deterministic part of equation (3.22) can be derived from the biological interpretation leading to Von Bertalanffy's growth curve. The fundamental assumption underlying this curve is as follows. The potential growth is the difference between anabolic gains and catabolic losses. Anabolic gains are assumed to be proportional to the absorptive surface. The absorptive surface is assumed to be proportional to the volume raised to the power 2/3. Catabolic losses are assumed to be proportional to the volume. This gives Potential growth = K^V ^ 2 3 — K V. 2 (3.23) The potential growth is the production of surplus metabolic products (Pienaar and Turnbull 1973), which is the gain in biomass. The gain in biomass is distributed among various components of the trees: fine roots, large roots, stem, branches, leaves and bark. Therefore, the actual volume growth depends on the percentage of the total gain in biomass going to stem biomass. This percentage, « 3 , is an efficiency coefficient relating volume 51 Chapter 3. Statistical Methods production to production of surplus metabolic products: Actual growth — = K (KIV ^ = K 2 — K V) 3 3 3 K I V 2 - K K V. 2 / 3 3 2 (3.24) Keyes and Grier (1981) showed that the efficiency coefficient was correlated with site quality. Richer sites have a better availability of nutrients to put on stem growth. Poorer sites are poorer in nutrients, and more energy must be dedicated to fine roots production in order to get the nutrients. Kurz and Kimmins (1987) similarly found that the percentage of total biomass going into below-ground production is inversely proportional to site quality. As site quality decreases, a larger proportion of total biomass goes below ground, and consequently, a smaller proportion goes above ground. If the percentage of above- ground biomass is proportional to site quality, it can be expected that the percentage of biomass going into stem biomass will be correlated to site quality as well. Axelsson and Axelsson (1986) showed that the efficiency coefficient increases with fertilization. Richer sites do not need as developed of fine root system as the poorer sites, and are therefore more efficient at transforming their production of surplus metabolic products into growth. Site index can be assumed to be highly correlated with site quality; therefore, it is reasonable to assume that the efficiency coefficient is proportional to site index. Let K 3 be K4SI. Equation (3.24) can be rewritten ~ ot = K SI A K I V = fa SI V 2 / 3 2/Z The - K SIK V A - 0 SI V. 2 2 (3.25) growth rate is the growth divided by the volume. Dividing both sides of equation (3.25) by V yields 52 Chapter 3. Statistical Methods 3.2.3.2 Model Definition The left-hand side of model (3.26) is the instantaneous growth rate, which is approximated by fl;. Consequently, model (3.26) represents the deterministic part of model (3.22). The complete model (3.22) can be rewritten: SI9 i = h y ^ i - h M i + * - (3-27) The parameter fl; is assumed to be constant during the time periods covered by plot i. However, this does not correspond to the concept of instantaneous growth rate. This concept can be retrieved in the second-stage model if fl; is assumed to be measured at a single point, within the interval between the first measurement and the last measurement on plot i. The volume at this point can be used as the initial volume. This definition of initial volume will allow for simulating an instantaneous observation of the growth rate. Hui and Berger (1983) adopted a comparable strategy in a similar situation, mentioning that it should bring some robustness to the model. The constant growth rate is an average of the instantaneous (but unknown) growth rates. Therefore, its value is the value of an instantaneous growth rate located between the first and the last measurements. This suggests that an "average" initial volume would lead to a more appropriate definition of initial volume for an instantaneous growth rate. There are many possible definitions of this "average" initial volume, and since the real growth rate is unknown, there seems to be no definite rule to define this initial volume. One possible estimator is to take the mean volume, that is the arithmetic average of the observed volumes on a plot. V° = (3.28) where Vij is the volume observed on plot i at measurement j. When the growth rate is assumed to be estimated at this point, it better corresponds to the concept of instantaneous growth rate represented by fl;. 53 Chapter 3. Statistical Methods 3.2.3.3 Assumptions Once an appropriate definition of the initial volume is accepted, stating the assumptions about the error terms is the next step. These assumptions are: Assumption 6. Sli and (V^ ) / 0 Assumption 7. Efe] = 0 Assumption8. E[e ] iej 1 3 are fixed and measured without error = <t> /(V?) if i = j = 0 otherwise 2 1/3 Assumption 9. e; is normally distributed. Assumption 6 is a common assumption in regression. As a consequence, the error term e; is assumed to be uncorrelated with both independent variables, Sli and V®. Assumption 7 implies that the expected growth rate on a plot is assumed to be a function of site index and initial volume. Assumption 8 has two implications. First, it implies that the growth rates are inde- pendent between plots. Knowing the true growth rate on a plot gives no idea about the true growth rate on another plot. The eight assumption also states that the variance of the individual growth rates is inversely proportional to initial volume raised to a power of 1/3. The variance of the individual growth rate should decrease as the plot gets older. Variation in growth rate is more important when the growth rate is large (i.e., when the plots are young). If it is assumed that the variance is proportional to the expected individual growth rate var(0{) oc &i 54 Chapter 3. Statistical Methods (i.e., i f t h e v a r i a n c e expressed as a p r o p o r t i o n o f t h e e x p e c t e d g r o w t h r a t e is a s s u m e d to b e c o n s t a n t ) , i t w i l l also b e i n v e r s e l y p r o p o r t i o n a l t o t h e i n i t i a l v o l u m e r a i s e d t o a p o w e r o f 1/3 since t h e e x p e c t e d v a l u e o f t h e g r o w t h r a t e is itself i n v e r s e l y p r o p o r t i o n a l to i/(v°y/\ Assumption 9 can be justified b y i n v o k i n g the C e n t r a l L i m i t Theorem. T h e numerous variables i n f l u e n c i n g t h e g r o w t h r a t e u n a c c o u n t e d for i n t h e second-stage m o d e l are a s s u m e d t o b e a d d i t i v e w i t h t h e i r average effect b e l i e v e d t o b e n e g l i g i b l e . 3.2.3.4 Matrix Notation M o d e l (3.27) a n d i t s a s s u m p t i o n s c a n b e w r i t t e n u s i n g m a t r i x n o t a t i o n as: 6 = X0 + e w i t h the following (3.29) matrices: X " Ol" X21 11 X = 0 = 9 X\ N where Xn = Sli/'(V®) ? 1 3 e = N a n d X i = —Sli. F r o m t h e a s s u m p t i o n s a b o u t t h e error t e r m , 2 the d i s t r i b u t i o n o f 9 is easily f o u n d : 9 ~ N(X0,(f> V) (3.30) 2 where V is a d i a g o n a l m a t r i x w h o s e i t h d i a g o n a l element is l / ( V ° ) ' . 1 / 3 i T h i s concludes t h e second-stage m o d e l . T h i s stage p r o v i d e s a b i o l o g i c a l l y - r e a s o n a b l e m o d e l t o p r e d i c t t h e g r o w t h rate. T h e e s t i m a t e g i v e n b y t h e second-stage m o d e l c a n b e seen as t h e p r e d i c t e d g r o w t h r a t e o f a g i v e n p l o t . 55 Chapter 3. Statistical Methods 3.2.4 Combined Model The second-stage model (3.27) can be substituted into the first-stage model (3.15). Y 3.2.4.1 - 8 Sli + e; + r + v;. = Q ip 2 x p (3.31) Assumption A new assumption is needed. Assumption 10. 22[v{ |ej] = -E[r |e;] == 0 p Assumption 10 p implies that the error terms from the first-stage model are uncorre- lated with the error terms from the second-stage model. This means that the individualdependent variables unaccounted for in the model are assumed to be uncorrelated with the unaccounted time-dependent variables or the residual error term. It should be noted that the error terms from the first-stage model are also uncorrelated with the independent variables since the independent variables are assumed to be fixed. 3.2.4.2 Matrix Notation The combined model can be written with matrix notation as: Y = FX/3 + Fe + G r + v (3.32) Based on the assumptions about the error terms, the distribution of Y is: Y ~ where \E = $> FVF' + n. r 2 N(FX3,V). (3.33) 56 Chapter 3. Statistical Methods 3.2.5 Summary Growth rate and interest rate are very similar in nature, and the mathematics relevant to interest rates can be adapted to permanent sample plot growth rates. On recent permanent sample plots, the growth rate can be approximated by a constant. When the time period effect is removed, an estimate of the constant growth rate can be obtained. This can be called the observed instantaneous growth rate. The instantaneous growth rate can also be predicted from independent variables using a relationship derived from Von Bertalanffy's growth curve. This relationship presents the growth rate as a function of site index and volume at the beginning of the growing season. The second-stage estimate of growth rate can be called the predicted instantaneous growth rate. Both models can be joined to express the growth rate observations as a function of initial volume, site index, an individual effect, a time effect and an unexplained error term. For predictions of future volumes on a permanent sample plot, the parameter of interest is the random variable 3.3 and its prediction is discussed in the next section. Parameter Estimation Traditionally, the parameter of interest has been 0. However, this parameter does not recognize the individual character of each plot, represented by e;. The parameter 6i does recognize this individual character. The group of permanent sample plots provides the information to estimate the global parameter 0. Repeated observations on a given plot yield the information to predict the individual term e; of this plot. The estimate of 0 and the predictor e; can be joined to predict the actual parameter of interest. It is a common technique in parameter estimation to assume that the covariance matrix of the model random terms are known in order to derive the parameter estimators. Chapter 3. Statistical Methods 57 This simplification often allows a better understanding of the estimators. The estimators of the parameters composing 6 are first derived for the case where the covariance matrix of the model random terms are assumed to be known. Second, estimators for the unknown covariance matrices are defined. Finally, the estimated covariance matrices are substituted in the parameter estimators to yield estimates of the best linear unbiased predictor of 0 and of its variance. 3.3.1 Estimation When Covariance Matrices Are Known When the covariance matrix of the various random terms in the combined model (3.32) are assumed known, only the parameters 8 and e need to be estimated. These parameter estimates are random variables, and their variances can also be estimated. The parameter estimates are discussed first followed by a discussion of the estimate of the variance of the predictor of 0;. Deriving and simplifying the estimators involved a lot of matrix algebra. Most of the algebraic steps were carried out to provide a proof of the estimators. While complex at first glance, these steps are easy to follow using the matrix inversion rules given in section 2.4. Of course, the first and the final lines of any proof are sufficient to understand the text. The intermediate lines are complimentary, for the pleasure of the mathematically-inclined reader. 3.3.1.1 Parameter Estimators Estimators for 8 and e are first derived from a theorem given by Harville (1976), and combined to form a predictor of 6. The simplification and interpretation of this predictor using matrix algebra leads to an easily interpretable form. The suggested predictor is shown to be a weighted average between the observed instantaneous growth rate and Chapter 3. Statistical Methods 58 the predicted instantaneous growth rate. Interpretation of 9 is discussed following the mathematical derivation of 3 and e. Harville (1976) gave an extension of the Gauss-Markov theorem for prediction of random terms. Using theorem 1 from Harville (1976), the best Hnear unbiased predictor of e can be found. This leads to 9, the individual growth rate. The random parameter 9 is the sum of a fixed parameter, X8 and a random term, e (see (3.29)). The B L U P of 9 will be the sum of the B L U E of X3 and the B L U P of e. From the full model (3.32), the B L U E of 3 is the GLS estimate $ = [X'F'^FXJ^X'F'tf^Y (3.34) which leads to X/3, the B L U E of X8. FoUowing Harville (1976), the B L U P of e is e = </> VF'* (Y 2 _1 FX/3) (3.35) which can be developed further using the definition of ^ and matrix inversion rules e -0 = ^VF'^Y = ^ v F ' j n - - n-^KF'n^F) 2 VF'¥ FX/? - 1 1 -^ VF'{n 2 _ 1 + ^V-^^F'O-^Y - n^FKF'n^F) + ^ v ^ ^ F ' n - ^ F x / ? = 0 vF'n Y - ^ v F ' n ^ F K F ' n ^ F ) + ^-'v-^^F'n^Y 2 _1 -^VF'n^FX/S + ^VF'n-^KF'n^F) + = t-'v-^F'n^Fxp < 6 V{I-(F'n F)[(F'fl" F) + ^ - V - ] - } F n Y 2 _1 1 2 1 1 , - 1 ? -4> v{i 2 (F'n^FjKF'n^F) + ^v-^-^F'n^FX/? = ^vKF'n^Fj-^^vi-^F'n"^)- ^-^ 1 Chapter 3. Statistical Methods 59 - ( / ^ [ ( F ' f t - ' F ) - + 0 V]- X/3 1 2 (3.36) 1 This predictor is unbiased in the sense that E[e] — E[e] = 0 where 0 is a columnvector of zeros. It is best in the sense that E[{e - e)'(e - e)] < E[{e - e)'(e - e)] (3.37) where e is any other predictor of e. The BLUP of 6 can now be defined as 9 = X/3 + e = X/^ + ^ V ^ F ' O - ' F ) - (3.38) -^[(F'fi-'F)= ^VKF'O-'F)- 1 2 1 1 l </> \]- (F'n- F)- F'ii- Y 1 2 1 1 +(F'n F)- [(F'fi" F)- + ^ V ] X / 3 1 1 + ^vj-^x/S 1 ^VKF'JT'F)- + _1 1 + ^vj-^F'n^F^F'n^Y +{I - ^ [ ( F ' O ^ ' F ) = 2 </> V]- Xp + 1 +^ V]- (F'ft- F)- F'£r Y 1 1 1 2 _1 1 1 (3.39) This predictor of 9 can further be simplified; however, the terms it includes first must be recognized. The term ( F ' f i ^ F ^ F ' f i ^ Y is the GLS estimate of 9, 9, from the firststage model, model (3.18). Its distribution is conditional on 9 since 9, a random variable, was assumed fixed in the first stage. This distribution is 9\9 ~ N(9, [F'fi^F]- ). 1 (3.40) Therefore, the term [ F ' f i ^ F ] - is the variance of the variable 9 \ 9. From the distribution 1 of 8, shown in (3.30), the variance of 9 is </>V, and its expected value is X/3. Combining 2 Chapter 3. Statistical Methods 60 this information with the distribution shown in (3.40), the unconditional distribution of 9 is: 0~iV(X/3,$) (3.41) where # = [F'ft^F]- + cfV. 1 The GLS estimate of 3 from the full model, defined in (3.34), is equivalent to the GLS estimate of 3 from the unconditional distribution of 9, shown in (3.41) (Amemiya 1971). This can be shown using the matrix inversion rules. J3 = = [X'F'S^PX^X'P'*"^ (3.42) [X'F'JT FX - X ' F f i " F ( F n " F + ^- v- )- F'fi~ FX]X 1 / 1 / 1 2 ( 1 1 1 •[X'F'ft Y - x ' F ' n ~ F ( F ' n F + ^ - ' v - ^ ^ F ' n ^ Y _1 = 1 - 1 [X'{I- ( F ' n F ) ( F ' f i F + 0 - v - ) - } ( F ' n F ) X ] _ 1 _ 1 1 2 1 _ 1 1 •[X'{I - ( F ' f i - F ) ( F ' n F + ^ V - ^ - ^ F ' I T ' Y ] 1 = _ 1 [X'^F'fi-'F)- +</. V]- X]1 2 •X'^F'fl-'F)- + 1 <f> V]-\F'fl- 'E)- F'Q - Y 1 = 1 r 1 1 1 l [X'S^Xj^X*-^ (3.43) This demonstrates that X/3 in 9 (see model (3.39)) is equivalent to the GLS estimate of the expected value of the unconditional distribution of 9. The BLUP of 9, equation (3.39), can therefore be rewritten: 9 = var(0)[var(0 \6) + vai(0)] 0 + var(0 \ 9)[vav(9 \ 9) + var(0)] £[0] _1 _1 (3.44) 61 Chapter 3. Statistical Methods It is almost identical to the expected value of the conditional variable 9 given 9. Using Bayes' rule, the expected value of 9 \ 9 is shown to be: E[9\6) = var(0)[var(0 19) + v a r ^ ) ] " ^ + var(0 | 0)[vai(01 9) + v a r ^ ) ] " ^ ] 1 (3.45) The only difference is that the unknown expected value of the unconditional variable 9 is replaced by an estimate in the B L U P . The B L U P can be interpreted as a weighted average between the growth rate observed on a plot and the growth rate predicted from some plot attributes. This can easily be understood. The individual growth rates can be predicted from some independent variables with some error, and it can also be observed with some error. The best predictor will be an average between the prediction and the observation with the weights proportional to the confidence in each estimate. This confidence is inversely proportional to the respective variance of each estimate. The B L U P is this weighted average. As mentioned before, the B L U P is a random variable. The precision of the B L U P can be obtained by estimating the variance of the predictor. This variance estimator follows. 3.3.1.2 Estimation of the Variance of the Predictor The B L U P is a random variable, and estimation of its variance will be needed to build a confidence interval around this predictor. The individual B L U P of 9{ can be defined by: §i = Xi/3 + e; where X; = [Sli/(V®) / — Sli]. Because 9i is a predictor of 9 the variance of 6{ can be 1 3 it expressed as: EiiBi-Bif] (3.46) = E[(ynl3 + ei - Xi3 - erf] = EidxiP-Xi^ + ei-ei) ] 2 Chapter 3. Statistical Methods = 62 E[(xi$ - /3) ] + E[el\ + E[4\ + 2E[(x /3 2 Xi i -2E[(xiJ3 - Xif3)ei] - 2E[eiei] = var(x;/3) + var(ej) + var(ei) + 2cov(xi/3, hi) — 2cov(0;, e^) (3.47) The various components can be estimated separately, using the mathematics of linear combinations. First, it can be shown that the covariance between 3 and e is zero which means that cov(x;/3, §;) is zero. cov(/3,e) = c o v ( [ X ' $ - X ] - X ' $ - ^ , 0 V * [ 0 - X/3]) 1 1 1 2 _ 1 = cov([x $- x]- x'$- ^, = [X'#- X]- X #- var(fl){^ V*- [I - X p C ' ^ X l ^ X ' * - ] } ' = [X'S^XI^X'S-^IJI - ^-^[x'*- ^-^']*- ^} = [X's-'x]- ^*-V v - s^xpc's^xi^x's-V'v] = [x'*- ^- ^*- ^ - = 0 , 1 1 1 1 , 1 {^vs-^i-xpr^x^x's- ]}^ 1 2 1 1 1 1 1 1 1 2 1 [X'#- x]- x *-V v 1 1 1 2 , (3.48) The variance of the B L U P of e; can be derived from the definition of e in (3.36). Let {0 V*I> }; be the i 2 -1 var(8i) t h row of the matrix defined by ^> V$ . Then the variance of e; is: 2 -1 = var({^ V*- } [fl-X S]) 1 2 i = = = / {(^ V#- } var(^-X/3){^ V$- K 2 1 2 1 i .{0 V*- } var([I-X[X *--- X]- X *- ]fl){^ V*- K 2 , 1 1 1 , { ^ V * " } ^ - X[X'*- X]- X #- ]var(tf) 1 1 1 , 1 .[i - x t x ' * - ^ - ^ * - ] ^ ^ * - } ; 1 = 1 i 1 1 1 { ^ v * - } ^ - x[x #-' x]- x'$- ]$ 1 , 1 1 1 2 1 Chapter 3. Statistical Methods 63 •[i - $ - x [ x ' # - x ] - x ] { ^ v # - } } ; 1 = 1 1 , 2 1 i {^VS-^i^-XpC'^^Xj^X'l^V*- }; (3.49) 1 To define the covariance between 0; and e;, it will be useful to express in terms of Y . From equation (3.38), 9{ can be written as: = [^-{^V^-^X^ + ^V*- }^ = [xj - { ^ V ^ - ^ i X K X ' * - ^ - ^ * - ^ + {^V*- }^ = ([ = TiY 1 1 Xi 1 1 1 - {^ v#- } x][x'$- x]- x'*- + {^V*" },) 2 1 1 i 1 1 (F'n~ F)- F n Y 1 1 1 , _ 1 (3.50) where Ti = ([x; - { ^ V * - } ^ ] ^ ' * - ^ - ^ ' * " 1 1 Therefore, the covariance of 1 + {^V*" };) ( F ' n ^ F J ^ F ' n . 1 - 1 and e; is cov(0;,e;) = = cov(riY,ei) cov(r (FX/5 + Fe + G r + v),e ) i i = cov(riFe,ej) = cov(riFjej, e,-) = r F var(e ) i (3.51) i i The variance of e{ is given by var( ) = 0 / ( ^ ° ) 2 (3.52) 1 / 3 ei The term var(Xj/3) is easily obtained from the variance of 0: var(x;/3) = x;var(/3)x- = XifX'S^X]- ^ 1 (3.53) Chapter 3. Statistical Methods 64 Using these different estimators, the variance of 8{ can then be defined as - Qif] E[{§ { = xjx's^x]- ^ + o'v*- }^* 1 xpc's^xi^x'K^v*- };. 1 1 +(1 - 2T F )<b /{V Y 2 i i (3.54) 3 i This estimator yields an estimate of the variance of the predictor of 6{. Both, the predictor of 8{ and the estimator of the predictor's variance are based on the assumption that the covariance matrix of the error terms e, r, and v are known. These unknown matrices must now be replaced by their estimates. 3.3.2 Covariance Matrix Estimators Knowing the predictor when the covariance matrices are known "may provide some insights into what to do when [they are] not" (Harville 1976). A common approach is to replace the unknown covariance matrices by their estimates and to assume that they are the true values. T h e various assumptions about the error terms have drastically reduced the number of unknown parameters. Estimates of 6 , cr , . . . , o- , and <f) are sufficient to 2 2 2 2 M obtain estimates of the matrices fi and </>V. 2 T h e best strategy for estimating the unknown variances is to use the two-stage structure of the model; each stage can be dealt with independently. In the first stage, estimates of 8, 8 2 and cr ,... 2 based on the observed growth rates are obtained. In the second stage, the variable 8 is unobservable, but the unconditional distribution of 8, an estimate of 8 from the first stage, can be derived. From this distribution, estimates of 8 and <f> 2 are computed. 3.3.2.1 First-Stage Estimators Different estimators for error component models have been suggested in the literature. However, none of these estimators were found adequate for the assumptions made in the 65 Chapter 3. Statistical Methods first stage. Instead, theoretical estimators had to be derived based on moment estimators of the unobserved residuals. It is a common practice to replace the unobserved residuals by their OLS estimates in theoretical estimators (Wallace and Hussain 1969, Zellner 1962, Swamy 1971, among others). For small sample sizes, when replacing the unobserved residuals by their OLS estimates in an estimator, it is probably worthwhile to adjust the degrees of freedom. A) Estimator of o~ 2 Following Wallace and Hussain (1969), the estimator for a can be derived using the 2 squared residuals from period p with the following reasoning: E £ ip E S n i p r (5 r + E i v ) £ p p ip i E E ip S V i p r = E = E^^p-v f iip» ~ 'pJ ^ E V Vv 2 (3.55) Therefore, the theoretical estimator for a can be 2 1 E i »p u Hp Sr> p i J (3.56) which, after correcting for the degrees of freedom and replacing the unknown residuals by their OLS estimates ri; , becomes p 1 B) Estimator of 8 2 E u.tp E t ^tp 'V J (3.57) Chapter 3. Statistical Methods 66 U s i n g a s i m i l a r s t r a t e g y as for a , a n e s t i m a t o r for 8 c a n b e d e r i v e d u s i n g t h e squared 2 r e s i d u a l s , t h e estimates o f <T (p = 1 , . . . , M ) , t h e c o m m o n variance o f t i m e - d e p e n d e n t 2 error terms c o m i n g f r o m t h e s a m e t i m e p e r i o d a n d t h e covariance b e t w e e n error terms from overlapping time periods. Y(Y<- E I V sy ) YYYY i (3.58) 2 i Y E W8 = p p Y =* ip jp u u *p u = u 2 £ ^ Y (s - s ) 2 P P Y s (3.59) (3.60) PQ S X p p>q j <t>p F r o m t h e s u m o f these three e s t i m a t o r s , t h e e s t i m a t o r o f 8 is easily o b t a i n e d 2 1 v where u = W + E < p &=fi + E 1 P CY Y Y Y ir j* - Y u i E q>P P J u PD s a (3.61) 9 SSX . p q m A f t e r c o r r e c t i n g for t h e degrees o f freedom a n d r e p l a c i n g t h e u n k n o w n residuals b y t h e i r O L S residuals, t h e e s t i m a t o r becomes 1 (3.62) ( n )#o q P C) Estimator of 9 T h e estimates o f a 2 and 8 2 are r e p l a c e d i n il t o get t h e e s t i m a t e il. The FGLS e s t i m a t e o f 9 becomes: 9 = [F'n^FJ^F'n^Y (3.63) A g a i n , t h e d o u b l e h a t a b o v e 9 i m p l i e s t h a t i t was o b t a i n e d after first e s t i m a t i n g il. T h e F G L S e s t i m a t e is a s y m p t o t i c a l l y n o r m a l l y d i s t r i b u t e d . W h e n W is large, t h e n o r m a l 67 Chapter 3. Statistical Methods distribution should be a relatively good approximation. The approximate distribution of §\6,ia: l\8 ~ N(8, [F'fWF]- ) (3-64) 1 The inversion of matrix fi, a W X W matrix, can be simplified using matrix inversion rules. The inversion of fi is equivalent to: fT 1 = [^GAG' + S ] . -1 = I T - S ^ G 'G'S G + 1 l/S A~ 1 2 1 G'S (3.65) 1 which requires the inversion of S, a diagonal matrix, and A and G'S G + 1/S A' 2 two M x M matrices. 3.3.2.2 Second-Stage Estimators In the second stage, an estimate of the unconditional distribution of 6, the GLS estimate of 6 is needed. Unfortunately, the GLS estimate of 8 cannot be obtained because the covariance matrix of the error terms is unknown. This difficulty can be overcome using the unconditional distribution of 8, a FGLS estimate of 9. From the conditional distribution of 8 given 8, shown in (3.64), and the distribution of 8, shown in (3.30), the unconditional distribution of 8 is easily found: 8 ~ N(X0, [F'fV'F]- + <f> V) 1 (3.66) 2 In the second-stage, estimates of 8 and (f) are needed. GLS estimates are not possible 2 because they would depend on the unknown value of each other. An iterative FGLS estimate is then required. Theoretically, the GLS estimate of 0 would be J3 = [X'[</>V + ( F ' n ~ F ) - ] - X ] - X V V + (F'fV'F)- ]- ^ 2 1 1 1 1 2 1 1 (3.67) Chapter 3. Statistical Methods 68 which is also the maximum likelihood estimate since the error terms are assumed to be normally distributed. Similarly, a possible estimate of c6 is its maximum likelihood estimate. However, 2 equating the differential of the pdf of 6 with respect to <f> 2 with zero does not yield a simple solution. No satisfactory estimate was found in the literature. An alternative estimate had to be derived. Based on the following reasoning: E[(6 -X0)'(6 = tr[cA V -X0)] 2 = + (F'fi-'F)- ] 1 c6 tr[V] + t r K F ' n ^ F ) " ] 2 1 (9 - X0)'(e - X0) - trKF'SWF)" ]" 1 E (3.68) tr[V] the following estimate was found adequate N 12 ^ = [6 - X0)'{6 - X/3) - trKF'iWF)-- ] 1 iriVJ (iV^2) ( 3 ' 6 9 ) Because the unknown 0 is replaced by an estimate, 2 degrees of freedom are lost for the two coefficients 0\ and 0 2 Since both estimates, <^ and fi, depend on the unknown value of the other parameter, 2 FGLS estimates are needed. To find these FGLS estimates, an iterative solution must be used. The algorithm stops when both $ and <f) converge. The iterative estimate of 2 <f> 2 is used to get the FGLS estimate of 0: $ = [x'[0 v + ( F ' n F)- ]- x]- x [^ v 2 1 1 1 1 , 2 + [F'n F]~ ]- e 1 1 1 (3.70) 69 Chapter 3. Statistical Methods 3.3.3 The Best Linear Unbiased Predictor and Its Variance The estimates of o~ , S , (f> , and 0 can now be replaced in the BLUP, defined in equation 2 2 2 (3.39), to obtain what could be called an "estimated best linear unbiased predictor". Equation (3.39) becomes: § = ^ v K F ' f t ^ F ) - + 4> v\-H + 1 2 (F'n^F^KF'n^F) -1 + ^V^X/fl (3.71) Similarly, an estimate of the variance of the BLUP of di is obtained by replacing the unknown parameters in (3.54) by their estimates. This yields: E[(§i - 6i) ] 2 = Xipc'*- ^- ^ 1 1 + {^v*- }^* - x t x ' ^ x j ^ x ' K ^ v * - } ; 1 +(1 - 2 f F ) ^ / ( ^ ° ) 2 i 1 (3-72) 1 / 3 i where $ = (F'fi <j> V ti = ({^-{^V^-^iX^X'^X]-^'^- + 2 {4> V^' }i)(Fn 1 2 1 1 F)- F'n 1 1 The estimate of the BLUP of 9i can now be used to predict future growth rates and compound the current volume of a permanent sample plot. Confidence intervals around the prediction can be built with the variance estimate. 3.4 Model Predictions The annual growth rate on a given plot can be predicted from model (3.46) 6i=Xii3 + -ei (3.73) Once the unknown parameters 0 and e; have been estimated, the instantaneous growth rate can be predicted from two plot attributes: the site index and the initial Chapter 3. Statistical Methods 70 volume. The growth rate can be assumed to be constant in a given year and to be changing from year to year. The volume can thus be updated annually. Each updated volume becomes the new initial volume, [i.e. the volume at the beginning of the next growing season). The growth rate that will occur in this particular year is then predicted leading to a new updated volume. This process is repeated, say, K times to get the volume K years from now. This recursive system offers more flexibility to predict future volumes because the growth rate is updated annually. Assuming a constant instantaneous growth rate over a period of one year is close to reality. The annual volume update involves computing the expected value of the antilog of the predictor 9{. The technique for estimating this expected value is discussed in section 3.4.1. If a confidence interval is desired around the predicted volume, the confidence interval around each intermediate volume will be needed too since the volume at the end of the growing season is a function of the predicted volume at the beginning of the season. In section 3.4.2, the problem of variance estimation is analyzed, and a solution is suggested. 3.4.1 Volume Predictor Various techniques exist to estimate the expected value of the antilog of an estimate (Ung and Vegiard 1988). However, no technique was found to estimate the expected value of the antilog of a predictor. The subtle difference can be explained by recalling that 0i is a predictor of 9{, a random variable, while the expected value of an estimate is a constant. The technique used by Baskerville (1972) was simple enough to be adapted for estimating the expected value of the antilog of a predictor and was, consequently, the chosen technique. Future volume prediction is driven by the initial volume in the first iteration. The last volume measurement observed on the sample plot can serve as the initial volume since 71 Chapter 3. Statistical Methods this volume represents the best knowledge of initial volume to predict future growth rates. A possible estimator of the updated volume could be the volume at the beginning of the growing season multiplied by the antilog of the predicted growth rate: V/ (3.74) = V? x e x p $ ] +l where V? is the predicted volume on plot i at iteration j and Q\ is the predicted growth rate. However, this estimator is biased. Based on the moment generating function, it can be shown that E[X] = exp E[\nX] + var(ln X) (3.75) which can be approximated by E[X] = exp E[\nX] + vari (lnX) (3.76) Goldberger (1968) pointed out that this result was asymptotically unbiased. 2 The true model expressed in a logarithmic scale is lnV?' +1 =lnV? +61 + uf (3.77) where u\ is the error term on plot i at iteration j. The first term on the right-hand side of the equation can also be rewritten in a similar form. Hence, the true model really is l n y / = l n V f - f E ^ + Eu* +1 fc=0 (3.78) k=0 Given that the volume is estimated on plot i and that annual observations are used, if the first volume is assumed to be measured without error, then i E[\nV }=\xvV? 0+1 i + Y< i ek (3.79) k=0 Goldberger (1968) also mentioned that, instead of using the corrected expected value of InX, one can use the uncorrected median. However, the median might be more difficult to obtain than the corrected mean. 2 72 Chapter 3. Statistical Methods The variance of In V? +1 is the variance of the sum of the error terms var(lnV7 ) = +1 v a r ( £ u*) fc=o = E v a r K ) f c (3.80) k=0 The moment generating function can be used to rewrite the expected value of exp ElWV^} E[V/ ] +1 = + T a r ( ' n (3.81) exp fc=0 Baskerville (1972) replaced the theoretical moment generating function by an estimate. The same strategy can be followed. At this point, no estimate of var(u*) exists because the time period which u^ represents is located in the future. A possible estimate is to take the average variance estimate of the past time periods. This estimate can be found using , s (Y - F8)'(Y - F6) =- — — - ( - ) 3 82 where 2 degrees of freedom are lost for the estimates of 0x and B and one is lost for 2 E e; = 0. Any error terms located in the future will be assumed to have this estimated variance. The estimate of E[V? ] +l •j+ii will then be (3.83) exp fc=0 z This estimator is the closest form to the estimator used by Baskerville (1972). Its main advantages are that it is simple to compute and asymptotically unbiased. Estimation of a confidence interval of this predicted volume is discussed in the next section. 3.4.2 Confidence Intervals Estimator Confidence interval estimation is also a problem when a scale transformation is needed. Ung and Vegiard (1988) suggested the following confidence interval for ^[t^ ] when the J+1 Chapter 3. Statistical Methods 73 estimated moment generating function is used [exp a + vaf(ln X) var(ln A") , exp b + (3.84) where = £ [ l n X ] - * ( 1 - f ;i/)var(E[lnX]) b = E[\nX] + t(l - f ;i/)var(£[lnX]) a = type I error allowed v = number of degrees of freedom a A possible estimator for the variance of the estimated expected value of ln V / var(E[lnV / ]) r +1 = vSr(ln + 1 is +£ 6 ) k { k=0 = ££<£*(#,*!) (3-85) k=01=0 where the estimated covariance between 6\ and 6\ is approximated by the following {^v*- }^* - xrx's^xj^x'K^v*- }; 1 1 +i - f?p^ /(Vi ) 2 f c 1 / 3 r!F^ /(Vi') 2 (3-86) 1 / 3 where x f and r * are the respective estimates of x^ and Ti after the k iteration. til Following Ung and Vegiard (1988) and assuming a large number of plots, the confidence interval can be estimated approximately with the following bounds j lower bound = exp In V° + £ 61 - z , . c6v(0?, §[) + ^ ± - ^ f AJ=0 j upper bound (3.87) = exp k=Q Chapter 3. Statistical Methods 74 This estimator is an approximation of the true variance of the predicted volume due to the extra variation created by replacing the fixed x*, xj, rf, and T[ with estimates. Since the true variances of the various random terms in the model were also replaced by estimates, the true variance of the predicted volume will likely be underestimated. Chapter 4 Numerical Example 4.1 Description of the Data Base The data base was kindly provided by MacMillan Bloedel Ltd. It consisted of 68 permanent sample plots of coastal Douglas-fir. The sample plots were fixed plots located in central Vancouver Island. These were pure, even-aged, and undisturbed plots. A sample plot was considered pure Douglas-fir when Douglas-fir represented more than 80% of the basal area. Undisturbed meant that there had been no human intervention to modify the growth rate, nor catastrophic losses due to biotic or climatic agents. All sample plots were measured after the growing season. They contained between 3 and 8 measurements covering a wide range of site index (Figure 4.2). Site index was estimated with King's (1966) site index equation. To protect the confidentiality of the data, the actual volumes were multiplied by a constant factor. The modified volumes were assumed to be the measured volumes, and no further reference in the text will be made to the unmodified volumes. At first measurement, volume varied between 10.4 and 939.5 m / h a , and 3 breast height age varied between 8 and 90 years. At last measurement, volume varied between 95.8 and 1594.4 m / h a , and breast height age varied between 24 and 115 years 3 (Figure 4.3). The initial data base was split into two distinct groups: an experimental set and a prediction set. The model was calibrated with the experimental set in order to predict the volumes in the prediction set. A l l measurements prior to or taken in 1972 were 75 76 Chapter 4. Numerical Example 20-i 10-15 15-20 20-25 25-30 30-35 35-40 si-te cndex (m) Figure 4.2: Distribution of Site Index 1600 -) 1200- 400- 0-0 AGE ( y r s ) Figure 4.3: Individual Plot Volume versus Age Chapter 4. 77 Numerical Example included in the experimental set, and the measurements taken after 1972 were put into the prediction set. In the experimental data set, 42 plots were measured three times, and 26 plots had four measurements. In this data set, the maximum age was 100 years, and the maximum volume was 1224.6 m / h a . The mean volume was 393.7 m / h a and the standard deviation 3 244.3 m / h a . 3 3 The number of growth observations on a given plot was equal to the number of volume measurements on this particular plot minus one. For example, three measurements on a plot will yield two growth observations. Consequently, 42 plots had 2 growth observations (Ti = 2) and 26 plots had 3 (Ti = 3) for a total of 162 growth observations (W — 162). The growth observations in the experimental set covered 15 time intervals (M = 15) of 5 years on average, a minimum of 2 years and a maximum of 6 years. The number of observations in a time period varied between 2 and 26 (Table 4.1). Table 4.1: Time Intervals p Interval 1 2 3 4 5 6 7 8 1955-1961 1956-1962 1959-1962 1959-1964 1960-1965 1961-1966 1962-1965 1962-1966 Sp P Interval 17 7 2 14 14 17 3 9 9 10 11 12 13 14 15 1964-1969 1965-1970 1966-1969 1966-1971 1967-1970 1969-1972 1970-1972 14 17 9 26 2 9 2 In the prediction set, there were a total of 189 volumes to be predicted. The av- erage volume in the prediction set was 555.4 m /ha, and the standard deviation was 3 289.6 m /ha. The minimum age was 19 years and the minimum volume was 57.2 m / h a 3 3 78 Chapter 4. Numerical Example The time lag between a predicted volume and the last volume from the same plot used in the experimental set was called projection time. The model performances for volume projection were evaluated with respect to this projection time. The projection time varied between 4 and 16 years. It was divided into three classes: 5 years or less, between 6 and 10 years, and more than 10 years. The first class contained 68 volumes, the second one 62 and the last one 59 (Table 4.2). Another criterion of interest was the initial age. The initial ages were grouped by decades, except for the first category that included the first two decades. Predictive ability was estimated for each projection time-initial age class. Table 4.2: Frequency of Volumes by Initial Age and Projection Time Class Projection Class (yrs) Initial Age (yrs) Total 0-20 21-30 31-40 41-50 51-60 61-70 71-80 81-90 0-5 6-10 11+ 12 12 8 11 11 13 14 8 7 4 4 4 3 3 3 12 12 12 9 9 9 3 3 3 68 62 59 Total 32 35 29 12 9 36 27 9 189 It was also interesting to look at the predictive ability of the model as a function of the time lag between the predicted volume and the first observed volume on the same plot. This was called total lag. The total lag varied between 10 and 31 years. It was grouped in 4 categories: 10-15 years, 16-20 years, 21-25 years, and 26 years or more. The first class had 51 observations, the second one 49, the third one 66, and the last Chapter 4. 79 Numerical Example one 23 (Table 4.3). The interaction initial age-total lag was also considered important to analyze, hence, the total lag classes were subdivided into age classes. Table 4.3: Frequency of Volumes by Initial Age and Total Lag Class Total Lag (yrs) 4.2 Initial Age (yrs) Total 0-20 21-30 31-40 41-50 51-60 61-70 71-80 81-90 10-15 16-20 21-25 26+ 20 10 2 - 6 7 12 10 1 1 21 6 4 4 4 3 3 3 10 12 12 2 4 9 9 5 3 3 3 - 51 49 66 23 Total 32 35 29 12 36 27 9 189 - 9 Goodness of Fit Criteria A model can be assessed with many techniques and statistics. Generally these statistics are computed for the original model, but sometimes the actual variable of interest is a transformation of the original model. In these cases, the model can be assessed for either the original or the transformed model. Growth rate was the estimated variable in the original model. When variances are known, the suggested predictor is B L U P for the true growth rate. Best in this case meant that it minimizes the total sum of squared error of the growth rates. However, because the variable of interest was volume, it was considered more important to assess the capability of the model to predict volume rather than growth rate, even if the estimator was not minimum for the volume squared error. Therefore, it would theoretically be possible to Chapter 4. Numerical Example 80 find another predictor of the vector of growth rates that would reduce the total sum of squared error on volume. The model was tested for its ability to predict future volumes. It was fitted using the experimental data set. First, the observed growth rate was computed for each growth period by taking the difference in the natural logarithmic scale between the final volume and the initial volume and dividing this difference by the length of the period in years. T h e n , the first-stage model was fitted using P R O C M A T R I X in S A S (Sas Institute Inc., 1982) with the F U Z Z option on the mainframe at the University of British Columbia. The F U Z Z option reassigns any value less than 1 x 1 0 - 8 to zero. Before the second-stage, the average of the initial volumes of all growth periods on each plot were computed. This defined V^°. T h e second-stage model was fitted using the estimates of the growth rates from the first-stage, the plot site index and V®. T h e convergence criterion was ^fc+i _ and pk 0.0000001 < 0.0000001 -P\< where fi\ and fi\ are the estimates of /3 and /3 after the k X 2 iteration. Estimates of the unknown variances and of 3 and e were replaced in the theoretical predictor. T h e last volume on a plot in the experimental data set was used as the initial volume for all the projections on this plot. T h e predicted volumes were then compared with the corresponding observed volumes in the prediction set. The statistics used to assess the performance of the model were the bias, the mean absolute deviation and the root mean squared error, in both actual and relative scales. Also, these statistics were computed for both projection lag-inital age classes and total lag-initial age classes. Mathematically, these statistics were: bias (4.88) Chapter 4. 81 Numerical Example bias% 100 x 1 mad N mad% nv-v) (4.89) Y,\v-v\ IOO x — y (4.90) -V V n 0.5 Uv-vy raise (4.92) NE(V-V) ) 2 rmse% = 100 x (4.91) V 0.5 (4.93) EV where bias% = relative bias mad = mean absolute deviation relative mean absolute deviation mad% 4.3 rinse = root mean squared error rmse% = relative root mean squared error V = observed volume V = predicted volume N = number of predicted volumes in the given class Results The estimated B L U P of 0 is a weighted average between the first- and the second-stage model and is referred to as the weighted predictor. The overall performances of models from both stages individually and the weighted predictor as estimators of the observed growth rates and of the future growth rates are presented. Even if the predictions of future growth rates were the main interest, knowing the model behaviour for estimating the observed growth rates was also of interest. Chapter 4. 82 Numerical Example Estimation of the observed growth rates were compared for the three "estimators" (the first-stage model, the second-stage model, and the weighted predictor). Points of comparison included estimation of the mean value, prediction of the minimum and maximum growth rates and estimation of the standard deviation of the growth rates. Table 4.4: Estimation of Observed Growth Rates observed rates first-stage estimator second-stage estimator weighted predictor N Mean Minimum Maximum Std. Dev. 162 162 162 162 0.03732 0.03962 0.04370 0.04246 -0.08542 -0.04393 0.00084 -0.02738 0.27720 0.23615 0.21485 0.23047 0.04470 0.04441 0.03547 0.04347 As expected, the first-stage estimation of the mean growth rate was not equal to the mean observed growth rate since the unknown variance structure of the error term was estimated (Table 4.4). The observed growth rates were overestimated on average by about 6%. Since the predicted rates were an average of the observed rates on a plot, it is normal that the extreme predicted values be closer to the mean rate. However, the first-stage estimates were as variable as the observed rates. The second-stage estimator was even more biased. It is a FGLS estimator of the first-stage estimates. F G L S estimators are biased in finite samples which explains the difference between first-stage and second-stage estimates. The mean second-stage estimate is 10% larger than the first-stage estimate. When compared to the observed rates, the difference jumped to 17%. The second-stage estimator did not predict negative growth rates or extreme growth rates which is intuitively reasonable. Estimates were 20.6% less variable than the observed growth rates. 83 Chapter 4. Numerical Example When the weighted average of the first-stage and the second-stage estimator were used to predict the observed growth rates, the mean growth rate was overestimated by 13.8%. The weighted predictor tended to push the individual estimator toward the mean. Thus, the minimum and maximum growth rates predicted with this predictor were closer to the average prediction than the observed minimum and maximum. The mean, minimum and maximum estimates from the weighted predictor were between the respective estimates from the first-stage and second-stage models. The predicted estimates were slightly less variable than the observed growth rates. The three estimators were also compared on the basis of their respective efficiency to predict future volumes. The overall results showed that the weighted predictor performed better than the two other estimators (Table 4.5). Assuming a constant growth rate on a plot for a short period of time was a convenient approximation in the first-stage model. However, to assume that this estimate will stay constant in the future will tend to overestimate future growth rates. The overall results of the first-stage estimator showed this overestimation. The first-stage model also gave highly variable estimates. The relative root mean squared error was 30% (for an average volume of 555.4 m /ha). The 3 predicted volume were off by 20% on average. In the second-stage model, the choice of the initial volume was arbitrary and caused some uncertainty in the model. This was reflected in the results. The second-stage estimator underestimated the actual volumes by 3.2%. This suggests the choice of the initial volume to fit the second-stage model was probably underestimating the true initial volume. The root mean squared error was much smaller than with the first-stage model. The mean absolute deviation followed the same trend and was to 7.1%. With the weighted predictor, the bias was reduced to 1.1%. The variance was smaller with the estimate of the second-stage model than with the estimate of the first-stage model, which pulled the weighted predictions closer to the second-stage estimates than 84 Chapter 4. Numerical Example Table 4.5: Overall Results from the First- and the Second-Stage Estimators and from the Weighted Predictor Predictor N bias bias% (m /ha) 189 189 189 55.7 -17.6 -6.0 rmse% -1-1 166.6 71.9 51.4 mad% 3 3 10.0 -3.2 mad (m /ha) (m /ha) 3 First-stage Estimator Second-stage Estimator Weighted Predictor rmse 30.0 12.9 9.3 83.2 42.5 35.9 20.1 7.1 7.5 to the first-stage estimates. The weighted predictor was both more precise and more accurate than the second-stage model. The root mean squared error was reduced to about 9% and the mean absolute deviation was slightly more than 7%. Further details are only presented for the weighted predictor as this predictor performed better than the two other alternatives. Except for the youngest plots, the model consistently underestimated the actual volumes. The overestimation for the young plots is caused by the first-stage estimate, which assumed that the growth rate on a plot is constant. The growth rate is rapidly decreasing in a young plot and projecting the observed growth rate will cause a strong bias. The growth rate levels off as the plot gets older, which explains why the bias is much smaller on plots that had their first observations at 20 years of age or later. As a general trend, it can be noticed that the bias increased with projection time. (Table 4.6). This trend of increasing bias is not as clear when the total lag is taken into consideration (Table 4.7). Projecting a plot more than 25 years after the first measurement yielded seriously underestimated volumes for plots over 30 years of age. Chapter 4. Numerical Example 85 The relative variability of the predictions was more important for the young plots. For plots with an initial age less than 20 years, the root mean squared error reached about 20% for projections over 6 years. On the other hand, the variation of the predictions for plots with initial age over 40 years were always less than 10%, generally less than 5% (Table 4.8). The mean absolute deviations were fairly small for plots whose initial age were greater than 20 (Table 4.10). When the total lag was considered, the predictions were more variable as the total lag increased, usually greater than 10% when the total lag was longer than 26 years. The mean absolute deviations followed the same trends as the root mean square errors. 86 Chapter 4. Numerical Example Table 4.6: Bias and (Relative Bias) by Initial Age and Projection Time Class Projection Time (yrs) Initial Age (yrs) 0-20 21-30 31-40 41-50 51-60 61-70 71-80 81-90 0-5 21.8 (9.4) 4.3 (1.3) -27.2 (-5.6) 8.6 (1.3) -7.2 (-1.5) -11.4 (-2.1) -15.0 (-1.9) -5.1 (-0.6) 6-10 48.5 (15.6) -1.2 (-0.3) -45.8 (-9.9) 6.4 (0-9) -12.7 (-2.4) -24.4 (-4.2) -28.1 (-3.3) 0.5 (0.1) 11+ 92.7 (22.1) -4.3 (-0.9) -44.1 (-7.2) 5.4 (0-7) -30.5 (-5.2) -33.0 (-5.2) -38.3 (-4.1) 37.8 (4.1) Table 4.7: Bias and (Relative Bias) by Initial Age and Total Lag Class Total Lag (yrs) Initial Age (yrs) 0-20 21-30 31-40 41-50 51-60 61-70 71-80 81-90 10-15 29.7 (11.3) 6.8 (2.9) -1.4 (-0.6) 8.6 (1.3) -7.2 (-1.5) -9.7 (-1.9) -17.5 (-3.5) -5.1 (-0.6) 16-20 66.4 (18.9) 0.7 (0.2) -3.5 (-1.3) 6.4 (0.9) -12.7 (-2.4) -22.7 (-3.9) -21.7 (-2.6) 0.5 (o.i) 21-25 163.5 (30.7) -6.8 (-1.4) -36.1 (-7.3) 5.4 (0.7) -30.5 (-5.2) -27.8 (-4.4) -38.0 (-4.3) 37.8 (4.1) 26+ - 1.4 (0.3) -48.7 (-7.4) - - - -25.7 (-2.1) - - -61.4 (-8.1) - - Chapter 4. Numerical Example 87 Table 4.8: Root Mean Squared Error and (Relative Root Mean Squared Error) by Initial Age and Projection Time Class Projection Time (yrs) Initial Age (yrs) 0-20 21-30 31-40 41-50 51-60 61-70 71-80 81-90 0-5 25.3 (11.0) 23.0 (6.8) 36.5 (7.5) 14.6 (2.2) 8.7 (1.8) 22.8 (4.3) 20.9 (2.6) 12.9 (1.5) 6-10 61.3 (19.7) 41.6 (10.4) 60.3 (13.0) 38.2 (5.2) 16.6 (3.2) 38.5 (6.6) 37.1 (4.3) 9.9 (1.1) 11+ 123.1 68.7 (29.4). (13.9) 81.0 (13.2) 51.5 (6.4) 36.3 (6.2) 60.6 (9.5) 53.3 (5.8) 65.6 (7.1) Table 4.9: Root Mean Squared Error and (Relative Root Mean Squared Error) by Initial Age and Total Lag Class Total Lag (yrs) Initial Age (yrs) 0-20 21-30 31-40 41-50 51-60 61-70 71-80 81-90 10-15 40.3 (15.3) 8.7 (3.6) 1.4 (0.6) 14.6 (2.2) 8.7 (1.8) 20.8 (4.1) 24.2 (4.8) 12.9 (1.5) 16-20 93.7 (26.6) 21.5 (5.4) 3.5 (1.3) 38.2 (5.2) 16.6 (3.2) 36.5 (6.3) 26.0 (3.2) 9.9 (1.1) 21-25 164.2 (30.8) 56.2 (11.7) 47.8 (9.7) 51.5 (6.4) 36.3 (6.2) 45.7 (7.3) 50.2 (5.7) 65.6 (7.1) 26+ - 66.8 (14.7) 87.2 (13.3) - - 106.5 (14.1) 48.2 (3.9) - - - - 88 Chapter 4. Numerical Example Table 4.10: Mean Absolute Deviation and (Relative Mean Absolute Deviation) by Initial Age and Projection Time Class Projection Time (yrs) Initial Age (yrs) 0-20 21-30 31-40 41-50 51-60 61-70 71-80 81-90 0-5 21.8 (10.2) 15.3 (4.8) 28.7 (6.1) 9.7 (2.2) 7.2 (1.4) 16.7 (3.2) 17.3 (2.6) 11.9 (2.0) 6-10 48.8 (15.1) 28.8 (7.0) 45.8 (10.3) 28.5 (5.7) 12.7 (2.6) 34.3 (6.0) 33.8 (4.5) 9.9 (1.3) 11+ 97.9 (20.6) 51.1 (11.1) 67.1 (10.8) 40.6 (7.5) 30.5 (4.9) 47.5 (7.2) 44.8 (5.7) 53.6 (5.0) Table 4.11: Mean Absolute Deviation and (Relative Mean Absolute Deviation) by Initial Age and Total Lag Class Total Lag (yrs) Initial Age (yrs) 0-20 21-30 31-40 41-50 51-60 61-70 71-80 81-90 10-15 29.9 (10.8) 6.8 (3.9) 1.4 (0.6) 9.7 (2.2) 7.2 (1.4) 15.3 (3.0) 19.6 (3.9) 11.9 (2.0) 16-20 70.5 (19.2) 14.7 (3.4) 3.5 (1.3) 28.5 (5.7) 12.7 (2.6) 33.2 (5.9) 23.0 (3.6) 9.9 (1.3) 21-25 163.5 (30.7) 42.5 (8.6) 37.1 (8.1) 40.6 (7.5) 30.5 (4.9) 38.1 (6.0) 43.8 (5.9) 53.6 (5.0) 49.5 (12.3) 75.5 (11.7) 87.1 (12.3) 38.8 (3.0) 26+ Chapter 5 Discussion 5.1 T h e Basic Concept Obtaining an intuitive approximation of future volumes on permanent sample plots is not very difficult when the projection time is short. When a few volume observations from a plot are available, the most intuitive technique for a non-forester would be to figure out an average current annual increment from the past periodic increments, multiply this estimate by the number of years, and add this number up the current volume. Another technique is the "interest rate" technique; that is, instead of working with the absolute value of growth, it is also possible to work with the relative growth. The average interest rate is estimated from past measurements, and the current volume is compounded for the required number of years. These techniques have some drawbacks, though. The current annual increment and the growth rate are not really constant over time. For instance, projecting any estimated past growth rate will overestimate future volumes. To make up for the drop in future growth rate, some factor must be introduced to reduce the estimated growth rate before compounding the current volume. This compensatory factor should be correlated with time as the decrease in growth rate is more important in young plots than in old ones. An alternative estimate is to use the information given by the relationship between observed growth rates and some plot attributes. This relationship must be estimated from a collection of individuals. This technique, called regression, has become common with 89 Chapter 5. Discussion 90 the proliferation of statistical software. The growth rate of an individual is assumed to be the expected growth rate of all individuals with similar plot attributes. The difficulty with this technique is that the growth rate estimation is not plot specific but is based on some expected value from a collection of individuals. The regression technique is theoretically unbiased but rather imprecise when applied to individual plots as it does not take into account the specific performance of the individual plot compared to the average plot. This is of little importance if it is assumed that the individual relative performance in the past is unrelated to the relative performance in the near future. However, if it is assumed that the past individual relative performance is a good indicator of future individual performances, than the regression technique omits useful information: the individual plot performance. The weighted predictor combines the information that can be drawn from the regression technique with the average growth technique. In a sense, the second-stage model is the correction factor for the projection of the average growth rate, obtained from the first stage. It makes intuitive sense that future growth rates be highly correlated with both previous growth rates and future plot attributes. The weighted predictor recognizes this fact by using both types of information. The two-stage model and the resulting weighted predictor have some limitations, however. Knowing these limitations is essential to avoid misuse or misunderstanding. Theoretically, the weighted predictor will always be biased because of the approximation in the first-stage model. This absolute bias should decrease as the plot gets older since the growth rate levels off with time. When few measurements are available on each individual, it might be dangerous to try to estimate the individual component of the weighted predictor. Past individual performances based on 2 or 3 observations might not be a good indicator of future individual 91 Chapter 5. Discussion performances. Harville (1976) did not suggest a minimum number of observations to estimate the B L U P of a parameter. Therefore, there is no guarantee that short time-series are a valuable source of information for individual performances. Disturbed plots were not included in this study. This is a limitation of the second.stage model. At this stage, with our current knowledge, it is difficult to explain catastrophic disturbances and even more difficult to predict them; the decision was made not to include the disturbed sample plots in the study. Therefore, the capability of the model to predict future growth rates in a disturbed environment was not tested. 5.2 T h e Statistical M o d e l A statistical model is a simplification of reality. It represents the experimenter's imperfect understanding of the system. Furthermore, this understanding is limited in its expression by mathematical tractability. A model must be identifiable to be estimated. This becomes a serious constraint when only minimal information is available to fit the model. Assumptions have to be made; these assumptions can be seen as a tractable expression of the modeler's partial knowledge about the system. Thus, by definition, a statistical model is a compromise between reality and tractability. The first-stage model represented this kind of compromise. First, for each plot one observation was lost by using growth rate instead of volume, diminishing an already small number of observations. That brought the number of observations down to two for 42 plots out of 68. These two observations were covering ten years on average since the mean period interval was five years. The constant growth rate assumption was made because obtaining repeated measurements of the same parameter is essential for parameter estimation. It was considered reasonable since for most plots, ten years is still a short period of time. Of course, had all plots had at least 4 volume measurements (3 92 Chapter 5. Discussion growth observations), a first-stage model predicting a decreasing growth rate over time could have been proposed, and better results probably be achieved. The assumption of constant growth rate was not a constraint due to the technique used but due to the nature of the data set. Defining the structure of the random term in the first-stage model (and in the secondstage model, for that matter) was certainly a major problem. Very little attention has been paid in forestry to error structure in model-building. Due to the absence of previous work, models from other fields had to be investigated. Most structures suggested in the literature on time-series require an extensive data set to be estimated with a reasonable confidence. North American foresters usually do not have access to such data sets. Estimating short time-series is a difficult task. Lack-of-fit and serial correlation are not easily differentiated when individuals have been remeasured only a few times. Commonsense assumptions must be made to overcome this limitation. The assumptions about the error terms in the first-stage model reflect this absence of knowledge about the error structure. Two overlapping periods should intuitively be correlated. It is less certain, though, that two nonoverlapping periods can be automatically considered independent. For instance, two periods that had similar extreme climatic conditions could well be correlated. Nevertheless, the model neglects this information. Another consequence of the simplification of the covariance matrix of the time-dependent error terms is the independence between consecutive growth observations on an individual. One can probably argue that the temporal or the unexplained error terms might be correlated between periods. Considering this correlation as negligible is probably oversimplifying. However, for short-time series, it is reasonable to neglect some information rather than to try to extract it by making dubious assumptions. The second-stage model includes a biologically-reasonable estimate within the weighted predictor. Von Bertalanffy's curve, or its more general form, the Chapman-Richards 93 Chapter 5. Discussion curve, is probably the most used growth model in biological sciences to model growth. The yield model, given by the integration of the growth model can be written: V = bj(l - b 2 e ^- b 3 )) 3 where t is time. The parameter bi is the asymptotic volume of the plot. All curves with common asymptotic volumes will move to this volume at a pace depending on the parameter b . 2 The parameter b is a location parameter. Growth models based on Von Bertalanffy's 3 curve suggest some trends toward full-stocking since an asymptotic volume is assumed. This trend can be noticed in the second-stage model. Two plots with identical site indices but different volumes will have different growth rates; the plot with the smaller volume will increase its volume at a faster rate than the other plot, with both plots ending up with the same volume. The growth rate is assumed to be a function of two variables: site index and standing volume. Site index is a function of age and height, so these variables can be considered as being indirectly included in the model. Neither site index nor initial volume are ideal variables to be included in a model. While site index is generally accepted as a good index of site quality, its estimation on young plots might not be the best method for assessing the potential of a site. The definition of standing volume is also subject to some variation. The definition used offered satisfactory results. With a similar problem, Hui and Berger (1983) used for each individual the point where a constant growth rate and a linear growth rate fitted individually would have given an identical estimate. This technique was tried but it sometimes yielded volume estimates that were outside the range of observed volumes on a plot, even giving a negative estimate for one plot. These problems ruled out this definition. The chosen definition for initial volume was reasonable and sufficiently efficient. Chapter 5. Discussion 94 Age, a common variable in growth and yield modeling, was not directly included in the second-stage model. Growth was assumed to be a result of photosynthetic activity. This activity is executed by cells that are never more than a few year old at any point in time in the life of a plot. Growth is undoubtedly correlated with age, but it is not caused by age. Age is simply correlated with the real cause of growth, the standing volume. Density, another variable commonly used to predict growth, is not included in the model either. The most common forms of expressing density are probably the number of trees per hectare, basal area, and percentage of crown closure. It was believed that the volume per hectare could compensate for this variable since it can also be considered as an expression of density. When the second-stage model is combined with the first-stage model, individual growth rates can be projected into the future. While the error structure of the combined model was believed to be satisfactory for short projections, it might be inadequate for long projections. In the weighted predictor, the inverse of the variance of an estimate from one stage is a measure of the confidence in this stage's estimate. As the projection lag increases, the modeler's confidence in the first-stage model should diminish; the first-stage estimate should be closer to the actual growth rate in the first projection year than in the last one. Therefore, the variance of the estimate should reflect this change in confidence. In the weighted predictor, the relative confidence of the first-stage estimate (compared to the second-stage estimate) decreases as the confidence in the second-stage estimate improves with increasing volume, but it does not change in absolute value. However, this oddity was not important enough on short time-series to be corrected. If longer projection lags had been available, this problem would have been addressed. The purpose of the weighted predictor was to find a reasonable compromise between accuracy, precision, and mathematical tractability for projecting permanent sample plots 95 Chapter 5. Discussion when minimal information is available to the modeler. The model was not intended to predict long-term yield or predict yield from an extensive data set. The model uses as much information as can be used and gives limited but useful information. Fortunately, the concept of a two-stage framework can be extended easily to plots with more measurements. Another advantage of the two-stage framework is that the second stage constitutes the best option to project temporary sample plots, or permanent sample plots with less than three measurements. The combined model can be considered a random coefficients model. A random coefficients model recognizes that observations from an individual are random variations around the true individual relationship and that individual relationships are random variations around a general model. Permanent sample plots allow modeling these random relationships. The difference between random observations and random relationships has rarely been made in forestry. Growth is not an instantaneous variable such as volume or basal area, for instance. It is a linear variable, and it must be estimated as a hne. Geometrically speaking, instantaneous variables are dimensionless, while linear variables have one dimension. Regression through a series of dimensionless points does not yield a hne but an infinite series of points. The expected value of a dimensionless variable is a dimensionless value. This can easily be illustrated by an example of a simple linear regression where the only possible x-values are discrete. Then, if the scale of the x-axis is large enough, there will be gaps between predicted points since prediction for values in between discrete x-values are meaningless. The scale of the x-axis can also be reduced in such a way that the predicted points form a Hne if the range of x-values is large enough. However, regression through a series of Hnes does yield a Hne. The expected value of a one-dimensional variable is a one-dimensional value. This is a major difference when deahng with permanent sample plots. 96 Chapter 5. Discussion Recognizing the random nature of the relationship between observations on a plot, and that this relationship is the parameter of interest, are essential steps in modeling with permanent sample plots. This motivates the use of a random coefficient model. Unfortunately, the random coefficient model has not been studied as thoroughly as the components of variance model or the seemingly unrelated regressions model due to the extra complexity of the model (Dielman 1989, p. 95), but there has been a renewed interest (Gumpertz and Pantula 1989). Meanwhile, the lack of theoretical knowledge can be partially compensated by simulation or empirical evidence. Hopefully, more frequent use of random coefficient models will stimulate more research to improve our understanding of these models. 5.3 Results Fitting the first-stage model was relatively fast and required a little more than 1 Mb of memory. The two design matrices (F and G) with the covariance matrix of the estimated rates ([F'fi F] ) used most of the memory. Results from the first-stage estimators were _1 -1 highly predictable. It was the best estimator of the observed growth rates but the poorest for predicting future rates. This estimator could adequately report what happened in the past, but any extrapolation to the future would be dangerous. This is best illustrated with negative growth rates. The first-stage estimator will yield negative estimates for plots undergoing a reduction in volume between the first and the last measurement. However, these negative growth rates can be considered short-term adjustments, and are not an accurate indicator of future growth rate. The first-stage estimate is merely an observation rather than an explanation of the growth rate. Convergence of the second-stage model was also fast; only six iterations were needed before convergence. The initial value used for <f> 2 was 0. The second-stage was the poorest 97 Chapter 5. Discussion estimator of observed rates. To make comparison possible with the first-stage model, the second-stage model was used to predict not the individual observations. Better results could have probably been achieved by using the observed initial volumes and by trying to predict the observed rates. Another reason for the weak results in the second stage is that it could not predict extreme growth rates. This poor performance was also expected since the number of plots was not large and the estimators of 3 and, especially, (j) were based 2 on large sample theory. Surprisingly, the second-stage model could predict future growth rates with respectable precision but a rather large bias. Overall, the second-stage model underestimated the volume, suggesting that the initial volume used to fit the model was possibly underestimated. Writing the code for the prediction of future volumes was simple using a matrix algebra language. When no variance estimates are computed, the predictions are rapidly obtained. However, computing the confidence intervals of the predictions can be cumbersome. The overall results of the weighted predictor were always between the overall results from the first-stage and the second-stage models. The weighted predictor also did poorly in estimating the observed mean growth rates. This is not surprising since it is derived from the best estimate for observed rates, the first-stage estimate, to which a certain quantity was added. Combining the first- and the second-stage was of no use for fitting the observed growth rates. If fitting past observations had been the only purpose, the first-stage estimator would have been sufficient. When future volumes were considered, the weighted predictor was more accurate than either of the individual estimates. The variation of the prediction errors was considerably better than the first-stage estimate and slightly smaller than the second-stage model. The weighted predictor was by far the best option for predicting future growth rates. Analyzing the detailed results, it is seen that the relative bias was more important for predictions on plots less than 20 years of age initially. For projection time less than Chapter 5. Discussion 98 10 years, on plots with initial ages less than 20 years, the bias was close to the negative value of the mean absolute deviation, suggesting that the weighted predictor consistently overestimated the actual volumes. However, because these plots have less volume, they carry only a small weight when the overall bias is computed. For plots with initial age between 31 and 40 years, when only the projection time was analyzed, there was a strong bias. This probably occurred by chance, no rationale could be found to explain such a behaviour. Curiously, the length of the total lag did not seem to have an effect on the accuracy of the predictions. The projection time had a stronger influence on accuracy. The precision of the estimates was dependent on both the projection time and the total lag. There was a major difference in precision between plots with initial age less than 40 years and plots with initial age over 40 years. This probably indicated where the curve of growth rate over time for Douglas-fir in the data set began to level off. For plots with initial age over 40, the variability was less than 10% most of the time. For these plots, the first-stage estimate yielded precise estimates which explains the good precision of the weighted predictor. The mean absolute deviation is the arithmetic average of the error terms in absolute value. For plots over 40 years of initial age, it was usually less than 5%, even for projection times greater than 11 years. It was also relatively larger on young plots. This lack of precision was mainly caused by the first-stage estimates, as confirmed by looking at the results in detail. The precision of the second-stage estimator was relatively constant. Chapter 6 Conclusions Statistical models based on a collection of remeasured entities are common in a few scientific fields. However, permanent sample plots in forestry have particular complexities that models developed in other fields do not address. Adequate techniques must be developed to predict growth in forestry permanent sample plots. These techniques must be adaptable to plots with only a few measurements since this is the situation most often encountered in Canada. Above all, these techniques must rely on a clear understanding of the purposes of permanent sample plots, and their possibilities and limitations as sources of information. The framework presented in this dissertation reflects this understanding. Growth rates were explained by site index and initial volume in a model inspired from Von Bertalanffy's growth curve. The individual relationship characterizing a sample plot was expressed as a random variation around a general model. Observations on a plot were assumed to be random outcomes of the plot's individual model. The current volume of a plot was updated using the growth rate predicted from the individual model. This technique was shown to be more precise than using the general model common to all plots. The weighted predictor also emphasizes the importance of the error structure in forestry statistical models, as pointed out by Gregoire (1987). Forestry information is not equally valuable; this fact must be recognized in a model. Separating the error term into components facilitates assigning a specific weight to each piece of information. 99 Chapter 6. Conclusions 100 A greater weight must be given to reliable information when computing an estimator. Precision of an estimator is as important as its bias. However, too often foresters have not paid attention to precision. Accurate mathematical models are sought to the detriment of statistical models. The power of statistics lies in its recognition of the difference between a variable and a constant. Statistical models will always be more precise than mathematical models when the error structure is correctly specified. Because models are used to make decisions, precise models are needed to make well-informed decisions. The results of the numeric example illustrated the advantages of the weighted predictor. Despite the approximations made in both stages, the weighted predictor was reasonably accurate and precise. It is probable that using a larger number of plots would have resulted in more precision with the second-stage model and consequently to the weighted predictor as well. Permanent sample plots with a longer history would also have provided more precise results. Fortunately, the weighted predictor can be adapted to data sets where all sample plots have more than three measurements. The weighted predictor is certainly not perfect. A major problem with the weighted predictor is the absence of statistical theory dealing with random coefficient modeling in finite samples. The estimator is approximately precise, but how approximate is unknown. The first-stage approximation can be justified for a short period of time only. No stochastic term is included for predicting catastrophic losses. Despite these weaknesses, the weighted predictor can provide useful information for solving problems facing many forest agencies in Canada. The proposed framework, which led to the weighted predictor, opens new horizons for modeling using permanent sample plots. Developments in the theory related to random coefficient models in finite samples can be expected in the near future and should answer many important questions. New computer technology will facilitate large simulation problems. Research on techniques to include probabilities of catastrophic losses in Chapter 6. Conclusions 101 prediction could improve the decision-making process based on growth models. Foresters will benefit from these improvements if they adopt the right framework. Perfect models might still be far ahead, but we are definitely getting closer. References Cited Aitchison, J. and Brown, J. A . C. 1957. The lognormal distribution. Cambridge Univ. Press, Cambridge, England, 176 p. Aitken, A . C. 1935. On least squares and linear combinations of observations. Proceedings of the Royal Society of Edinburgh 55:42-48. Amemiya, T. 1978. A note on random coefficient model. Inter. Econom. Rev. 19:793— 796. 1971. The estimation of the variances in a variance-component model. Internat. Econom. Rev. 12:1-13. Arner, S. L. and Seegrist D. W . 1980. Missing: a computer program for the maximum likelihood estimates of the parameters of the multivariate hnear model with incomplete measurements. USDA For. Serv. Gen. Tech. Pap. NE-56, 19 p. 1979. A computer program for the maximum likelihood estimator of the general multivariate Hnear model with correlated errors. USDA For. Serv. Gen. Tech. Pap. NE-51, 10 p. Arora, S. S. 1973. Error components regression models and their apphcations. Econom. Social Measurement 2:451-461. 102 Ann. References Cited 103 Axelsson, E . and Axelsson, B. 1986. Changes in carbon allocation patterns in spruce and pine trees following irrigation and fertilization. Tree Physiology 2:189-204. Baltagi, B . H., 1981. Pooling: an experimental study of alternative testing and estimation procedures in a two-way error component model. J . Econometrics 17:21-49. Baskerville, G. L. 1972. Use of logarithmic regression in the estimation of plant biomass. Can. J . For. Res. 2:49-53. Bayes, T. R. 1763. A n essay towards solving a problem in the doctrine of chances. Philo. Trans. Roy. Soc. 53:370-418 (reprinted in Biometrika 45:296-315). Berkey, C. S. 1982. Bayesian approach for a nonlinear growth model. Biometrics 38:953961. Biging, G . S. 1985. Improved estimates of site index curves using a varying-parameter model. For. Sci. 31:248-259. Buckman, R. R. 1962. Growth and yield of red pine in Minnesota. USDA For. Serv. Tech. Bull. 1272, 50 p. Carter, R. L . and Yang, M . C. K . 1986. Large sample inference in random coefficient regression models. Comm. Statist. A:Theory-Methods 15:2507-2525. Chapman, D . G. 1961. Statistical problems in population dynamics, In Proc. Fourth Berkeley Symp. Math. Stat, and Prob., Univ. Calif. Press, Berkeley, p. 153168. References Cited 104 Clutter, J . L . 1963. Compatible growth and yield models for loblolly pine. For. Sci. 9:354-371. Curtis, R. 0. 1967. A method of estimation of gross yield of Douglas-fir For. Sci. Monogr. 13, 24 p. Davis, L. S. and Johnson, K . N . 1987. Forest management. Third edition. MacGraw-Hill Book Company, New York, 790 p. Davis, A . W . and West, P. W . 1981. Remarks on "Generalized least squares estimation of yield functions" by I. S. Ferguson and J . W . Leech. For. Sci. 27:233-239. Dempster, A . P., Laird, N . M . , and Rubin, D . B . 1977. Maximum likelihood from incomplete data via the E M algorithm.(with discussion) Jour. Roy. Stat. Soc. series B 39:1-38. Dhrymes, P. J . 1971. Equivalence of iterative Aitken and maximum likelihood estimators for a system of regression equations. Austral. Econom. Papers 10:20-24. Dielman, T. E . 1989. Pooled cross-sectional and time-series data. Marcel Dekker Inc., New York, 249 p. _ 1983. Pooled cross-sectional and time-series data: a survey of current statistical methodology. Am. Stat. 37:111-122. Fearn, T. 1975. A Bayesian approach to growth curves. Biometrika 62:89-100. Ferguson, I. S. and Leech, J . W . 1978. Generalized least squares estimation of yield functions. For. Sci. 24:27-42. References Cited 105 1981. Reply to remarks by A . W . Davis and P. W . West on "Generalized least squares of yield functions." For. Sci. 27:589-591. Fuller, W . A . and Battese, G . E . 1974. Estimation of linear models with crossed-error structure. J . Econometrics. 2:67-78. Furnival, G . M . and Wilson, R. W . Jr. 1971. Systems of equations for predicting forest growth and yield. In Statistical Ecol. 3:43-57. G . P. Patil, E . C. Pielou, W . E. Waters, eds. Penn State Univ. Press, University Park. Garcia, 0. 1983. A stochastic differential equation model for the height growth of forest stands. Biometrics 39:1059-1072. Geisser, S. 1970. Bayesian analysis of growth curves. Sankhya A 32:53-64. Gerhardt, E . 1930. Ertragstafeln fur reine und gleichartige hochwaldbestande von eiche, buche, tanne, fichte, kiefer, gruner Douglasie und larche. Second edition. Springer-Verlag, Berlin. Goldberger, A . S. 1968. The interpretation and estimation of Cobb-Douglas functions. Econometrica 36:464-472. Gregoire, T. G . 1987. Generalized error structure for forestry yield models. For. Sci. 33:423-444. Grizzle, J . E . and Allen, D . M . 1969. Analysis of growth and dose response curves. Biometrics 25:357-381. 106 References Cited Gumpertz M . and Pantula, S. G . 1989. A simple approach to inference in random coefficient models. A m . Stat. 43:211-215. Harville, D . 1976. Extension of the Gauss-Markov theorem to include the estimation of random effects. Ann. Stat. 4:384-395. Hui, S. L . and Berger, J . 0 . 1983. Empirical Bayes estimation of rates in longitudinal studies. J . A m . Stat. Ass. 78:753-760. Judge, G. G., Griffiths, W.E., Hill, R . C . , Liitkepohl, H . , and Lee, T. C. 1985. The theory and practice of econometrics. Second edition. John Wiley and Sons, New York, 1019 p. Keyes, M . R. and Grier, C. C. 1981. Above- and below-ground net production in 40-year old Douglas-fir stands on low and high productivity sites. Can. J . For. Res. 11:599-605. King, J . E. 1966. Site index curves for Douglas-fir in the Pacific Northwest. Weyerhaeuser Co. For. Res. Center, For. Paper 8. 50 p. Kmenta, J . and Gilbert R. F . 1970. Estimation of seemingly unrelated regressions with autoregressive disturbances. J . A m . Stat. Ass. 65:186-197. Kurz, W . A . and Kimmins, J . P. 1987. The influence of site quality on tree resource allocation to fine roots and its effect on harvestable productivity of coastal Douglas-fir stands. Can. For. Service, Victoria, B . C . F R D A report 034, 102 p. (+ appendices) 107 References Cited Laird, N . M . and Ware, J . H . 1982. Random-effects models for longitudinal data. Biometrics 38:963-974. Leak W . 1966. Analysis of multiple systematic remeasurements. For. Sci. 12:69-73. Lee, L. F. and Griffiths, W. E. 1979. The prior likelihood and best linear unbiased prediction in stochastic coefficient linear models. Univ. of New England Working Papers in Econometrics and Applied Statistics No. 1, Armidale, Australia. LeMay, V . 1988. Comparison of fitting techniques for systems of forestry equations. Unpublished Ph.D. dissertation, Univ. of B . C , 172 p. Lindley, D. V . and Smith, A. F. M . 1972. Bayes estimates for the linear model. J . Roy. Stat. Soc. B 34:1-41. Maddala, G. S. 1977. Econometrics. McGraw-Hill, New York, 516 p. 1971. The use of variance components models in pooling cross-section and time-series data. Econometrica 39:341-358. Maddala, G . S. and Mount, T. D. 1973. A comparative study of alternative estimators for variance component models used in econometric applications. J . A m . Stat. Ass. 68:324-328. Marshall, P. L . and Jahraus, K . V . 1987. Growth and yield availability and use in British Columbia. Unpublished report, 96 p. 108 References Cited Mundlak, Y . 1978. Models with variable coefficients: integration and extension. Ann. de 1'INSEE 30-31:483-509. _ 1961. Empirical production functions free of management bias. J . Farm Econom. 43:44-56. Parks, R. W . 1967. Efficient estimation of a system of regression equations when disturbances are both serially and contemporaneously correlated. J . A m . Stat. Ass. 62:500-509. Phillips, P. C. B . 1985. The exact distribution of the SUR estimator. Econometrica 53:745-756. Pienaar, L . V . and Turnbull, K . J . 1973. The Chapman-Richards generalization of Van Bertalanffy's growth model for basal area growth and yield in even-aged stands. For. Sci. 19:2-22. Pothoff, R. F. and Roy, S. N . 1964. A generalized multivariate analysis of variance model useful especially for growth curve problems. Biometrika 51:313-326. Rao, C. R. 1987. Prediction of future observations in growth curve models. Stat. Sci. 2:434-471. . 1975. Simultaneous estimation of parameters in different hnear models and applications to biometric problems. Biometrics 31:545-554. 1972. Estimation of variance and covariance components in hnear models. J. A m . Stat. Ass. 67:112-115. References Cited 109 _ 1965. The theory of least squares when the parameters are stochastic and its application to the analysis of growth curves. Biometrika 52:447-458. Rao, U . L . G . 1982. A note on the unbiasedness of Swamy's estimator for the random coefficient regression model. J . Econometrics 18:395-401. Richards, F . J . 1959. A flexible growth function for empirical uses. J . Exp. Bot. 10:290— 300. SAS Institute Inc. 1982. SAS user's guide: statistics, 1982 edition. SAS Institute Inc., Cary, N C , 584 p. Schmidt, P. 1977. Estimation of seemingly unrelated regressions with unequal number of observations. J . Econometrics 5:365-377. Schumacher, F . 1939. A new growth curve and its applications to timber-yield studies. J. For. 37:819-820. Seegrist, D . W . and Amer, S. L . 1978. Statistical analysis of linear growth and yield models with correlated observations from permanent plots remeasured at fixed intervals, p 209-233 In Growth models for long-term forecasting of timber yields, (eds. J . Fries , H . Burkhart, T. A . Max), Sch. For. Wild. Res. V P I S U , Publ. FWS 1-78. Smith, A . F . M . 1973. A general Bayesian linear model. J . Roy. Stat. Soc. B 35:67-75. Srivastava, V . K . 1973. The efficiency of an improved method of estimating seemingly unrelated regression equations. J . Econometrics 1:341-350. References Cited 110 Sullivan A . D. and Clutter, J . L. 1972. A simultaneous growth and yield model for loblolly pine. For. Sci. 18:76-86. Sullivan A . D. and Reynolds M . R. Jr. 1976. Regression problems from repeated measurements. For. Sci. 22:382-385. Swamy, P. A . V . B . 1974. Linear models with random coefficients, In Frontiers in econometrics (ed. P. Zarembka), New York, Academic Press, p. 143-168. . 1971. Statistical inference in random coefficient regression models. Springer- Verlag, New York, 209 p. . 1970. Efficient inference in a random coefficient regression model. Econometrica 38:311-323. Swamy, P. A . V . B. and Arora, S. S. 1972. The exact finite sample properties of the estimators of coefficients in the error components regression models. Econometrica 40:261-275. Swamy P. A . V . B. and Metha J . S. 1979. Estimation of common coefficients in two regression equations. J . Econometrics 10:1-14. _ 1975. On Bayesian estimation of seemingly unrelated regressions when some observations are missing. J . Econometrics 3:157-169. 1973. Bayesian analysis of error components regression. J . A m . Stat. Ass. 68:648-658. References Cited 111 Swindel, B. F. 1968. On the bias of some least squares estimators of variance in a general Hnear model. Biometrika 55:313-316. Tait, D. E . N . , Cieszewski, C. J . , and Bella, I. E . 1988. The stand dynamics of lodgepole pine. Can. J . For. Res. 18:1225-1260. Taylor, W . E . 1980. Small sample considerations in estimation from panel data. J . Econometrics 13:203-223. Ung, C. H . and Vegiard, S. 1988. Problemes d'inference statistique rehes a la transformation logarithmique en regression. Can. J . For. Res. 18:733-738. Wallace, T. D. and Hussain, A . 1969. The use of error component models in combining cross-section with time-series data. Econometrica 37:55-72. ZeUner, A . 1987. An introduction to Bayesian inference in econometrics. R. E . Krieger Publishing Co., Malabar, F L , 431 p. . 1962. A n efficient method for estimating seemingly unrelated regressions and tests for aggregation bias. J . A m . Stat. Ass. 57:38-368. Zellner, A . and Vandaele, W . 1975. Bayes-Stein estimators for k-means, regressions, and simultaneous equation models In Studies in Bayesian Econometrics and Statistics (ed. S. E . Fienberg and A . Zellner), Amsterdam, North HoUand, p. 62 7-653.
- Library Home /
- Search Collections /
- Open Collections /
- Browse Collections /
- UBC Theses and Dissertations /
- Growth prediction of recent permanent sample plots...
Open Collections
UBC Theses and Dissertations
Featured Collection
UBC Theses and Dissertations
Growth prediction of recent permanent sample plots for forest inventory projection Thérien, Guillaume 1990
pdf
Page Metadata
Item Metadata
Title | Growth prediction of recent permanent sample plots for forest inventory projection |
Creator |
Thérien, Guillaume |
Publisher | University of British Columbia |
Date | 1990 |
Date Issued | 2011-02-10 |
Description | Permanent sample plots have become the main source of information for estimating models which quantify the dynamic processes of a forest. Fitted models allow for projecting inventories, used to determine timber production and many forest management decisions. The quality of these models is largely dependent on the quality of the information provided by the permanent sample plots. However, the pool of information contained in recent permanent sample plots is limited. Efficient estimation techniques must use all the information available from such plots. Current estimation techniques can be improved. Existing techniques employed in forestry have failed to recognize the random nature of the individual model characterizing each plot. On the other hand, techniques designed for remeasured entities in other scientific fields do not address particular forestry situations such as the small number of remeasurements or the irregularity of remeasurements. A framework for estimating forestry growth models which recognizes the individuality of each plot and special forestry situations is presented in this dissertation. The proposed framework is a two-stage estimation technique, in which the growth rate of a permanent sample plot is considered analogous to the interest rate on a bank account. The first stage estimates the growth rate after removing the time effect. The second stage, based on Von Bertalanffy's growth curve, relates growth rate to site index and volume at the beginning of the growing season. The proposed predictor of future growth rates, the "weighted predictor," is a weighted average between the growth rate observed on a plot and the growth rate predicted from the second-stage model. The weighted predictor is then used to compound the current volume of a plot. An estimate of the variance of the prediction can also be computed. |
Subject |
Forest Management -- Canada |
Genre |
Thesis/Dissertation |
Type |
Text |
Language | Eng |
Collection |
Retrospective Theses and Dissertations, 1919-2007 |
Series | UBC Retrospective Theses Digitization Project |
Date Available | 2011-02-10 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
DOI | 10.14288/1.0075366 |
Degree |
Doctor of Philosophy - PhD |
Program |
Forestry |
Affiliation |
Forestry, Faculty of |
Degree Grantor | University of British Columbia |
Campus |
UBCV |
Scholarly Level | Graduate |
URI | http://hdl.handle.net/2429/31159 |
Aggregated Source Repository | DSpace |
Download
- Media
- [if-you-see-this-DO-NOT-CLICK]
- UBC_1990_A1 T43.pdf [ 5.78MB ]
- Metadata
- JSON: 1.0075366.json
- JSON-LD: 1.0075366+ld.json
- RDF/XML (Pretty): 1.0075366.xml
- RDF/JSON: 1.0075366+rdf.json
- Turtle: 1.0075366+rdf-turtle.txt
- N-Triples: 1.0075366+rdf-ntriples.txt
- Original Record: 1.0075366 +original-record.json
- Full Text
- 1.0075366.txt
- Citation
- 1.0075366.ris
Full Text
Cite
Citation Scheme:
Usage Statistics
Country | Views | Downloads |
---|---|---|
Russia | 7 | 0 |
United States | 7 | 0 |
Mauritius | 4 | 0 |
France | 2 | 0 |
China | 2 | 32 |
Brazil | 2 | 0 |
Japan | 2 | 0 |
Canada | 1 | 1 |
City | Views | Downloads |
---|---|---|
Unknown | 11 | 2 |
Floreal | 4 | 0 |
Penza | 3 | 0 |
Ashburn | 2 | 0 |
Beijing | 2 | 0 |
Tokyo | 2 | 0 |
Edmundston | 1 | 0 |
Saint Petersburg | 1 | 0 |
Phoenix | 1 | 0 |
{[{ mDataHeader[type] }]} | {[{ month[type] }]} | {[{ tData[type] }]} |
Share
Embed
Customize your widget with the following options, then copy and paste the code below into the HTML
of your page to embed this item in your website.
<div id="ubcOpenCollectionsWidgetDisplay">
<script id="ubcOpenCollectionsWidget"
src="{[{embed.src}]}"
data-item="{[{embed.item}]}"
data-collection="{[{embed.collection}]}"
data-metadata="{[{embed.showMetadata}]}"
data-width="{[{embed.width}]}"
async >
</script>
</div>
Our image viewer uses the IIIF 2.0 standard.
To load this item in other compatible viewers, use this url:
http://iiif.library.ubc.ca/presentation/dsp.831.1-0075366/manifest