A STUDY ON THE EFFECTS OF MULTICOLLINEARITY, AUTOCORRELATION AND FOUR SAMPLING DESIGNS ON THE PREDICTIVE ABILITY OF THE 1994 AND 1995 VARIABLE-EXPONENT TAPER FUNCTIONS

by

Joseph A. Bartel
B.Sc. (Hons), University of Juba, 1991

A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE
in
THE FACULTY OF GRADUATE STUDIES
DEPARTMENT OF FOREST RESOURCES MANAGEMENT

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
February 1999
© Joseph A. Bartel, 1999

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

The University of British Columbia
Vancouver, Canada

Abstract

In British Columbia, government, industry and consulting firms have used taper functions since the late sixties. Kozak's (1988) variable-exponent model has been in use since 1989. One practical problem with the model is that it does not estimate total or merchantable volume without bias. These biases were found to be more pronounced for red cedar (Thuja plicata Donn ex D. Don) and western hemlock (Tsuga heterophylla (Raf.) Sarg.). Because of this problem, a second equation, known as the 1994 equation, was developed. However, reviewers identified some theoretical problems concerning multicollinearity and autocorrelation in the 1994 equation.
These problems prompted the development of a third equation with less multicollinearity, referred to as the 1995 equation.

The three principal objectives of this research were: (1) to study the effects of multicollinearity and autocorrelation on the predictive ability of the 1994 and 1995 variable-exponent taper functions; (2) to study the effects of four sampling strategies on the predictive ability of the 1994 and 1995 taper equations; and (3) to examine the possibility of localizing the 1994 taper equation. The effects of multicollinearity, autocorrelation and the four sampling designs were studied using Monte Carlo simulations.

The results of the study indicated that the presence of severe multicollinearity and autocorrelation in the data did not seriously affect the predictive ability of the equations. Stratified random sampling with equal allocation of observations to each stratum gave the smallest variability of the estimated coefficients, compared to simple random sampling and to stratified random sampling with the number of samples proportional to the size of the strata. However, the average estimated regression coefficients were somewhat different from the population parameters. Therefore, simple random sampling is recommended for selecting trees from the population if the main objective is the estimation of the population parameters. If the equations are to be used for prediction, then a wider range of the data (stratified sampling) should be used. The results also indicated that no adjustment or scaling is required for the western hemlock equation for the two subzones studied.
Table of Contents

Abstract
List of Tables
Acknowledgement
1 Introduction
2 Literature Review
  2.1 Tree Form and Taper Estimation
    2.1.1 Single Equation Taper Models
    2.1.2 The Segmented Models
    2.1.3 Simultaneous Equations
    2.1.4 Variable-exponent Taper Functions
  2.2 Multicollinearity
    2.2.1 Diagnostics of Multicollinearity
  2.3 Autocorrelation
3 Methods and Procedures
  3.1 Data and Models
  3.2 Procedures
    3.2.1 Simulations
    3.2.2 Calibration (localization) of the 1994 equation
4 Results
  4.1 Variance Inflation Factors and Correlation Matrix
  4.2 Simulation 1
    4.2.1 Western Hemlock
    4.2.2 Red Cedar
  4.3 Simulation 2
    4.3.1 Western Hemlock
    4.3.2 Red Cedar
    4.3.3 Comparison of species
  4.4 Simulation 3
    4.4.1 Western Hemlock
    4.4.2 Red Cedar
  4.5 Simulation 4
    4.5.1 Western Hemlock
    4.5.2 Red Cedar
  4.6 Estimation of Total Tree Volume, Merchantable Height and Inside Bark Diameter (at 0.3 m, 10%, 30%, 50%, 70%, and 90%)
    4.6.1 Volume Estimation
    4.6.2 Merchantable Height Estimation
    4.6.3 Inside Bark Diameter Estimation
  4.7 Comparing the Equations and the Simulations
  4.8 Goodness of fit
  4.9 Localizing the 1994 equation
5 Discussion
  5.1 Variance Inflation Factors and Correlation Matrix
  5.2 Simulations and predictions
  5.3 Localizing the 1994 Equation
6 Conclusion
7 Literature Cited
Appendices

List of Tables

Table 1. Correlation Matrix and Variance Inflation Factor (VIF) for the 1994 Equation for Western hemlock
Table 2. Correlation Matrix and Variance Inflation Factor (VIF) for the 1995 Equation for Western hemlock
Table 3. Correlation Matrix and Variance Inflation Factor (VIF) for the 1994 Equation for Red cedar
Table 4. Correlation Matrix and Variance Inflation Factor (VIF) for the 1995 Equation for Red cedar
Table 5a.
Average regression coefficients and their S and CV% obtained from the 4 simulations for Western hemlock (1994 Equation)
Table 5b. Average regression coefficients and their S and CV% obtained from the 4 simulations for Western hemlock (1995 Equation)
Table 6a. Average regression coefficients and their S and CV% obtained from the 4 simulations for Red cedar (1994 Equation)
Table 6b. Average regression coefficients and their S and CV% obtained from the 4 simulations for Red cedar (1995 Equation)
Table 7. Volume, merchantable height, and inside bark diameter (at 0.3 m and 10% of the tree height) estimation for three (small, medium, and large) trees using the population equation and the four Monte Carlo simulations for Western hemlock
Table 8. Volume, merchantable height, and inside bark diameter (at 0.3 m and 10% of the tree height) estimation for three (small, medium, and large) trees using the population equation and the four Monte Carlo simulations for Red cedar
Table 9. Inside bark diameter estimation for three (small, medium, and large) trees at 30%, 50%, 70%, and 90% of tree height for Western hemlock
Table 10. Inside bark diameter estimation for three (small, medium, and large) trees at 30%, 50%, 70%, and 90% of tree height for Red cedar
Table 11. Errors in volume by DBH classes for the 1994 equation for vm (very moist) subzone
Table 12. Errors in volume by DBH classes, Exponential correction
Table 13. Errors in volume by DBH classes for the 1994 equation for xm (xeric) subzone
Table 14. Errors in volume by DBH classes, Exponential correction
Appendix 1. Biases and standard errors of estimate for inside bark diameter estimation at different heights from the ground for Western hemlock
Appendix 2. Biases and standard errors of estimate for estimating total tree volume by DBH class for Western hemlock
Appendix 3. Biases and standard errors of estimate for estimating total tree volume by height class for Western hemlock
Appendix 4.
Biases and standard errors of estimate for estimating merchantable height by DBH class for Western hemlock
Appendix 5. Biases and standard errors of estimate for inside bark diameter estimation at different heights from the ground for Red cedar
Appendix 6. Biases and standard errors of estimate for estimating total tree volume by DBH class for Red cedar
Appendix 7. Biases and standard errors of estimate for estimating total tree volume by height class for Red cedar
Appendix 8. Biases and standard errors of estimate for estimating merchantable height by DBH class for Red cedar

Acknowledgment

I would like to thank all those who contributed to this study, especially my supervisor, Dr. Tony Kozak, for his expert advice and comments. I am also indebted to my thesis committee, Drs. Valerie LeMay, Peter Marshall and Sam Otukol, for their valuable contributions to the study. My thanks also go to MacMillan Bloedel for awarding me a fellowship. My final and most deeply felt gratitude is reserved for my family, without whom I would not have succeeded. Maggie, Bartel, and Bryan, I love you so much. For you all, this thesis is now done.

Chapter 1 Introduction

Total tree volumes are usually estimated using volume equations. These equations customarily predict tree volumes from diameter at breast height (D) and either total or merchantable height. However, foresters are often more interested in estimating merchantable volumes, that is, the content of tree boles from stump height to some fixed top diameter or height limit. When equations for different merchantable volumes are fitted independently, they often have the undesirable characteristic of producing volume surfaces that cross illogically within the range of the data. Consequently, inconsistent estimates are produced for different merchantable volumes of a single stem.
Several forest mensurationists have attempted to introduce measures of form into volume tables (e.g., Behre, 1927; Naslund, 1980; Avery and Burkhart, 1983). The actual measurement of form in routine volume estimation is time consuming and prohibitively expensive (Loetsh, 1973). Furthermore, some researchers have shown that no practical advantage in estimating volume can be gained from any measurement of form in addition to the diameter at breast height and the total height (Kozak et al., 1969; Smith et al., 1961). The limits to these approaches lie in the fact that form cannot be reduced to a single parameter. Because of this, several scientists conducted studies on stem profile and tree taper, from which taper equations were developed. When these equations are integrated, they can be used to predict merchantable volume from any stump height to any point of the tree bole. Another advantage of taper equations compared to volume equations is that the volume of any log within a stem can be calculated.

In British Columbia, taper functions have been used by government, industry and consulting firms since the late sixties with good success. The most recent model, recommended by Kozak (1988), has been in use since 1989 for 16 species in three Forest Inventory Zones (Kozak, 1997). One practical problem with Kozak's (1988) equation is that it does not estimate total or merchantable tree volume without bias. These biases were found to be more pronounced for red cedar (Thuja plicata Donn ex D. Don) and western hemlock (Tsuga heterophylla (Raf.) Sarg.) in the wet belt region of British Columbia (Kozak, 1997). Because of this, in 1994, the Resources Inventory Branch of the Ministry of Forests requested Kozak to develop, fit and test new equations for 16 British Columbia species by biogeoclimatic zones. The resulting model came to be known as the 1994 equation (Kozak, 1997). However, the 1994 equation was not free of problems either.
Reviewers of the equation identified some theoretical problems concerning multicollinearity and autocorrelation. Multicollinearity is a condition that occurs when there are near dependencies among the regressor variables. It can cause bias in the magnitude of the estimated regression coefficients, coefficient instability, and problems with prediction (Myers, 1986). Since the 1994 equation has several polynomial terms and other transformations of the same regressor variables, its predictive ability could be affected by multicollinearity. Because of these possible effects, Kozak developed a third equation with less multicollinearity, referred to as the 1995 equation (Kozak, 1997).

Autocorrelation occurs when several measurements are taken from a single specimen or object. With taper equations, several measurements (inside bark diameters at several heights from the ground) are recorded from a single tree. These measurements are said to be serially correlated or autocorrelated, and the use of ordinary least squares regression under such conditions has a number of important consequences, which are summarized in the next chapter.

In developing taper equations, the method of data collection (i.e., the sampling design) involves other theoretical problems that require attention. How should trees be selected in order to develop equations for specified species in certain areas? To answer this question, several sampling designs should be studied so that some conclusions can be reached regarding the effectiveness of the various methods.

Finally, it is well known that equations developed for a given inventory zone or biogeoclimatic zone may not perform equally well in much smaller management units or subzones. Usually they will either overestimate or underestimate volume and other characteristics of interest, because tree growth and profile (form) are not uniform for all locations.
Variations exist due to species, site characteristics, stand density, silvicultural operations, and other factors. The 1994 and 1995 equations were developed using data from large geographic regions. In practice, they are typically applied to much smaller, local sites (e.g., a timber sale or inventory compartment). The mean errors averaged across many sites are expected to be very close to zero. However, the mean error for any one site can be considerably larger because of local differences in tree form or size distribution. This problem occurs with many regional equations, including volume equations. Variations in characteristics among sites can cause systematic differences between sites, even when sites are located near each other. Therefore, a regional equation that is unbiased when averaged over many sites can be biased when applied to a given local site. To improve the accuracy of estimation, regional equations should be localized with some type of scaling or adjustment to reduce the amount of bias due to the factors mentioned above.

The three principal objectives of this research were: (1) to study the effects of multicollinearity and autocorrelation on the predictive ability of the 1994 and 1995 taper equations; (2) to study the effects of four sampling strategies on the predictive ability of the 1994 and 1995 taper equations; and (3) to examine the possibility of localizing the 1994 taper equation.

This thesis is divided into five subsequent chapters. Chapter 2 presents a review of related literature, Chapter 3 describes the data and methods used in the study, Chapter 4 presents the results, Chapter 5 contains a discussion and Chapter 6 contains the conclusions.

Chapter 2 Literature Review

2.1 Tree Form and Taper Estimation

Many attempts have been made to explain the form of the taper curve using several biological and physical factors.
In practice, it is not possible to include all of these factors in the taper curve model because they are either difficult or impossible to measure. For these reasons, relatively simple mathematical models have been presented for tree taper. Several taper functions have been proposed in the literature. For a detailed review of the models developed from 1903 to 1993, the reader is referred to Muhairwe (1993).

Taper models may be classified as static or dynamic. A static taper model predicts the diameter along the tree stem at a particular time, whereas a dynamic model predicts the changing diameter along the tree stem over time. Static models are further subdivided into single, segmented, simultaneous, and variable-exponent models.

2.1.1 Single Equation Taper Models

Single equation models describe diameter change from ground to top. These models are easy to fit and do not require extensive computing capabilities. The following equations fall under this category: Kozak et al. (1969); Ormerod (1973); and Hilt (1980).

2.1.2 Segmented Models

The description of tree taper by means of a single equation does not give satisfactory results in all practical applications (Max and Burkhart, 1976; Demaerschalk and Kozak, 1977). Therefore, the fit of the taper model was improved by calculating individual functions for different segments along the stem. The segmented models require special approaches to fitting each segment to the data and to joining the segments together. Identification of the inflection points is pivotal for ensuring the continuity of the equation. The Max and Burkhart (1976) segmented, three-part quadratic-quadratic-quadratic model is a complex approach to taper. In this model, three separate submodels are used to describe the neiloid frustum of the lower bole, the paraboloid frustum of the middle bole, and the conical shape of the upper portion.

2.1.3 Simultaneous Equations

Kilkki et al.
(1979) presented a method for determining tree taper using a number of diameters at different relative heights along the stem, predicted by means of a linear simultaneous equation model. In further studies of this method, non-linear simultaneous equation models and linear models of the logarithms of diameters have also been tested (Kilkki and Varmola, 1979).

2.1.4 Variable-Exponent Taper Functions

The variation in tree form makes it difficult to formulate general guidelines readily applicable to a single species, or even to all the stems in a single stand (Larson, 1963). Newnham (1988) and Kozak (1988) introduced a "variable-form" taper function that describes tree taper with a continuous function, using a changing exponent to compensate for the change in form between different tree sections. Kozak developed the first variable-exponent taper equation for the Resources Inventory Branch, Ministry of Forests of British Columbia, in 1988 (Kozak, 1988). To improve volume prediction, Kozak developed a new equation known as the 1994 equation (Kozak, 1997). Since the 1994 equation has very few independent variables, each entering through several transformations, the chances of multicollinearity are very high. To eliminate or reduce multicollinearity, a third equation, known as the 1995 equation, was developed.

Appendices 1 - 8 in Kozak (1997) show the predictive ability of the 1994 and 1995 equations for both western hemlock and western red cedar. They include biases and standard errors of estimate for inside bark diameter estimation at several heights above ground, total tree volume estimation by D (diameter at breast height) class, total volume estimation by height class, and merchantable height estimation by D class. A review of Appendices 1 - 8 shows that both equations predict well. However, the 1995 equation is the better of the two for inside bark diameter estimation. Both the biases and standard errors of estimate are small and do not show much of a trend for either species.
For total tree volume estimation by D and height class, the 1994 equation outperformed the 1995 equation for western hemlock, giving smaller biases and standard errors of estimate, while the 1995 equation outperformed the 1994 equation for red cedar. The trends and sizes of biases and standard errors of estimate were similar for both equations and species in estimating merchantable heights.

2.2 Multicollinearity

Multicollinearity can be described as the presence of strong intercorrelation among the independent variables (Myers, 1986). The following problems may result from multicollinearity among the independent variables: (1) instability of the regression coefficients (coefficients vary greatly over different sample sets); (2) the coefficients may be large and/or carry the wrong sign; and (3) the standard errors of the estimated regression coefficients may be too large relative to the population standard error. The use of overcomplicated models that include several polynomial and cross-product terms, as is the case for the 1994 variable-exponent taper equation, may be one source of multicollinearity (Kozak, 1997a).

2.2.1 Diagnostics for Multicollinearity

Diagnostic tools that can be used for identifying multicollinearity include: (1) the coefficients of simple correlation between pairs of the independent variables (correlation matrix); (2) variance inflation factors; and (3) the condition number (Myers, 1986). The correlation matrix is the easiest way to diagnose multicollinearity. It is available in most statistical programs and requires little effort on the part of the analyst. If the correlation matrix shows that any two variables are substantially related (Myers, 1986, suggests values over 0.7), further analysis should be done to determine the extent of multicollinearity. Variance inflation factors (VIF) are the diagonal elements of the inverse of the correlation matrix.
The variance inflation factors represent the increase in the variance of the coefficients over the population variance. It is generally believed that there is reason for some concern with regard to multicollinearity for any VIF over 10 (Myers, 1986). The condition number is given by the formula $\phi = \lambda_{\max} / \lambda_{\min}$. It represents the largest eigenvalue divided by the smallest eigenvalue of the correlation matrix $X^{T}X$, which is a square matrix. When the condition number exceeds 1000, there should be some concern with regard to multicollinearity (Myers, 1986).

2.3 Autocorrelation

Stem analysis data are said to be serially correlated (autocorrelated) because several diameter measurements are taken from the same tree at different heights. Therefore, the error terms are not independent. The following are some of the problems that may result when measurements are autocorrelated (Kmenta, 1986): (1) the estimators of the regression coefficients are unbiased and consistent but no longer have the minimum variance property; (2) the calculated mean squared error (MSE) may underestimate the real variance of the error terms (residuals), while the standard errors of the regression coefficients may seriously underestimate the true standard deviation of the regression coefficients; and (3) statistical tests using the t or F distributions and the confidence intervals are not reliable. Autocorrelation can be diagnosed by means of the Durbin-Watson statistic or the Q test.

Chapter 3 Methods and Procedures

3.1 Data and Models

Because data collection is costly and destructive, the data for this study were provided by the Resources Inventory Branch of the Ministry of Forests of British Columbia. There were 5716 western hemlock trees collected from the CWH (Coastal Western Hemlock) Biogeoclimatic Zone and 3954 red cedar trees from the ICH (Interior Cedar Hemlock) Biogeoclimatic Zone.
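The multicollinearity diagnostics reviewed in Section 2.2.1 (variance inflation factors as the diagonal of the inverted correlation matrix, and the condition number as the ratio of the largest to the smallest eigenvalue) can be illustrated with a minimal sketch. The function name `collinearity_diagnostics` and the synthetic regressors are assumptions made for illustration only; they are not the thesis data, and the powers of a single variable merely mimic the kind of transformed terms found in the 1994 equation.

```python
import numpy as np

def collinearity_diagnostics(X):
    """Return (VIFs, condition number) for a regressor matrix X (n rows, k columns)."""
    X = np.asarray(X, dtype=float)
    R = np.corrcoef(X, rowvar=False)      # k x k correlation matrix of the regressors
    vif = np.diag(np.linalg.inv(R))       # VIFs are the diagonal of the inverse
    eig = np.linalg.eigvalsh(R)           # eigenvalues in ascending order
    return vif, eig[-1] / eig[0]          # largest eigenvalue / smallest eigenvalue

rng = np.random.default_rng(0)

# Hypothetical relative heights and three transformations of the same variable
z = rng.uniform(0.01, 1.0, size=500)
X_powers = np.column_stack([z ** 0.25, z ** (1 / 3), z ** 0.5])

# Hypothetical unrelated regressors for comparison
X_indep = rng.normal(size=(500, 3))

vif_p, cond_p = collinearity_diagnostics(X_powers)
vif_i, cond_i = collinearity_diagnostics(X_indep)
print("powers of z :", vif_p.round(1), round(cond_p, 1))
print("independent :", vif_i.round(3), round(cond_i, 3))
```

Under these assumptions, the transformed regressors exceed both thresholds cited from Myers (1986), a VIF over 10 and a condition number over 1000, while the unrelated regressors stay close to 1 on both measures.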
Several measurements were taken for every tree, but for this study, only the following were used: inside bark diameters at 0.30, 0.46, 0.61 and 1.3 m above ground and at subsequent tenths of the height above breast height; total tree height; outside bark diameter at breast height (1.3 m); and total tree volume, calculated using Smalian's equation for different sections of the tree stem.

The following two equations were studied:

$$ d_i = a_0 D^{a_1} a_2^{D} X_i^{\,b_0 + b_1 z_i^{1/4} + b_2 z_i^{1/3} + b_3 z_i^{1/2} + b_4 \operatorname{asin}(Q_i) + b_5 \frac{1}{D/H + z_i} + b_6 H} \tag{1} $$

$$ d_i = c_0 D^{c_1} H^{c_2} X_i^{\,m_1 z_i^{1/10} + m_2 z_i^{4} + m_3 \operatorname{asin}(Q_i) + m_4 \frac{1}{e^{D/H}} + m_5 D^{X_i}} \tag{2} $$

where:
$D$ is the diameter outside bark at breast height (1.3 m) in cm;
$H$ is the total height in m;
$h_i$ is the height above ground in m;
$d_i$ is the estimated stem inside bark diameter at height $h_i$ above ground in cm;
asin is the arc sine;
$e$ is the base of the natural logarithm;
$p$ is the relative height constraint, $p = 0.01$ for equation (1) and $p = 1.3/H$ for equation (2);
$a_0, a_1, a_2, b_0, b_1, \ldots, b_6$ and $c_0, c_1, c_2, m_1, \ldots, m_5$ are the regression coefficients;
$z_i$ is the relative height above ground, defined as $h_i/H$;
$X_i = (1 - \sqrt{h_i/H}) / (1 - \sqrt{p})$; and
$Q_i = 1 - \sqrt{h_i/H}$.

To estimate the regression coefficients, the equations were linearized by logarithmic transformations and their parameters were estimated using ordinary least squares regression. The 1995 equation (equation (2)) differs from the 1994 equation (equation (1)) in three major ways (Kozak, 1997a): (1) instead of a constant relative height constraint (p = 0.01), the relative height of a constant height, p = 1.3/H (breast height of 1.3 m relative to total height), was used; (2) in addition to D, total tree height was included as a multiplier of X; and (3) the variables in the exponent were chosen to reduce the correlation among the independent variables, thus reducing multicollinearity.

3.2 Procedures

3.2.1 Simulations

Four sets of Monte Carlo simulations were carried out for both species to evaluate the effects of multicollinearity, autocorrelation, and four sampling designs.
The 5716 western hemlock and 3954 red cedar trees constituted the populations for this study. To fulfill the first two objectives, sampling was carried out in the two populations and repeated 1000 times for each of the four sampling designs, similar to Kozak's (1997) methods.

In simulation (1), 200 trees were randomly selected from each population to fit both equations. The main objective of this simulation was to study the effects of multicollinearity on the predictive ability of the 1994 and 1995 equations. For both equations and populations, regression coefficients, mean squared errors, and estimates of total tree volumes, merchantable heights to a 10 cm inside bark diameter, and inside bark diameters at 0.3 m and at 10, 30, 50, 70, and 90% of the total tree height were calculated. From each of the 1000 independent samples, the estimates of total volume, merchantable height and inside bark diameter were computed for small, medium and large trees for both species. The heights and diameters at breast height of the three tree sizes were: small: 12.95 m, 15.5 cm; medium: 38.40 m, 55.5 cm; and large: 63.34 m, 156.2 cm. It should be noted that, in this thesis, the predicted values (volume, merchantable height and inside bark diameter) of the 1000 equations were compared to the predicted values of the population equations and not to actual values.

In simulation (2), 2000 trees were randomly selected from each population, but only one randomly selected measurement per tree was used, giving a number of observations similar to that of the 200 trees in simulation (1). This strategy eliminated repeated measurements within a tree (autocorrelation). As in simulation (1), sampling was repeated 1000 times and the same statistics were calculated.
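The contrast between simulations (1) and (2) can be sketched in a few lines. Everything in the block is a hypothetical stand-in: the population is synthetic (a straight-line model with a shared per-tree error term that makes measurements within a tree correlated), the names `sample_all_measurements`, `sample_one_per_tree` and `monte_carlo` are invented for illustration, and only 300 replications are run instead of the 1000 used in the study. It is not the taper model or the data of this thesis.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical population: N_TREES trees with M measurements each. The shared
# per-tree error term makes measurements within a tree correlated.
N_TREES, M = 5716, 10
x = rng.uniform(0.0, 1.0, size=(N_TREES, M))
tree_effect = rng.normal(0.0, 0.3, size=(N_TREES, 1))
y = 1.0 + 2.0 * x + tree_effect + rng.normal(0.0, 0.2, size=(N_TREES, M))

def ols(xv, yv):
    """Ordinary least squares fit of y = b0 + b1 * x; returns [b0, b1]."""
    A = np.column_stack([np.ones_like(xv), xv])
    coef, *_ = np.linalg.lstsq(A, yv, rcond=None)
    return coef

def sample_all_measurements(n_trees):
    """Simulation (1) analogue: sample trees, keep every measurement."""
    idx = rng.choice(N_TREES, size=n_trees, replace=False)
    return x[idx].ravel(), y[idx].ravel()

def sample_one_per_tree(n_trees):
    """Simulation (2) analogue: sample trees, keep one random measurement each."""
    idx = rng.choice(N_TREES, size=n_trees, replace=False)
    col = rng.integers(0, M, size=n_trees)
    return x[idx, col], y[idx, col]

def monte_carlo(sampler, n_trees, reps=300):
    """Mean, standard deviation (S) and CV% of the coefficients over repeated samples."""
    coefs = np.array([ols(*sampler(n_trees)) for _ in range(reps)])
    mean = coefs.mean(axis=0)
    sd = coefs.std(axis=0, ddof=1)
    return mean, sd, 100.0 * sd / np.abs(mean)

population_coef = ols(x.ravel(), y.ravel())
mean1, sd1, cv1 = monte_carlo(sample_all_measurements, 200)   # 200 trees x 10 obs
mean2, sd2, cv2 = monte_carlo(sample_one_per_tree, 2000)      # 2000 trees x 1 obs
print("population  :", population_coef.round(4))
print("simulation 1:", mean1.round(4), sd1.round(4), cv1.round(2))
print("simulation 2:", mean2.round(4), sd2.round(4), cv2.round(2))
```

Both designs use about 2000 observations per sample, so differences in S and CV% between the two runs reflect the within-tree correlation rather than the sample size.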
The four sampling designs used were: (1) simple random sampling of trees; (2) stratified random sampling of trees with an equal allocation of observations to each stratum; (3) stratified sampling of trees with the number of observations proportional to the size of the stratum; and (4) simple random sampling with one observation per tree.

To fulfill the second objective of this study, both populations were stratified into 5 m height and 10-20 cm diameter classes. Five height and five diameter classes were identified for both species, which resulted in 25 strata. In simulation (3), 14 trees were randomly selected without replacement from each stratum to constitute 200 trees as in simulation (1). Finally, in simulation (4), the number of trees selected from each stratum was proportional to the size of the stratum. Results from simulations (1) and (2) were compared to evaluate the effects of autocorrelation. Results from simulations (1), (3) and (4) were compared to evaluate the effects of the three sampling designs on the predictive abilities of the equations in the study.

3.2.2 Calibration (Localization) of the 1994 Equation

To study the possibility of localizing the 1994 taper equation, two subzones, vm (very moist) and xm (xeric), were chosen from the western hemlock population to develop a zonal equation. A sample of 169 trees from subzone vm and 236 trees from subzone xm were randomly chosen to constitute the fit data. The 1994 variable-exponent taper equation, fitted for all trees, was used for each recorded segment to predict inside bark diameter at 0.3, 0.46, 0.61 and 1.3 m and at each tenth of the total height above breast height. For each segment, volume was calculated from these predicted diameters. The volume equation for a cylinder was used to calculate volume for the bottom section (from ground to 0.3 m). Smalian's equation was used for the mid-sections because these sections generally have a paraboloid frustum shape.
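The sectional volume calculation of this subsection can be sketched as below, covering the cylinder for the bottom section, Smalian's formula for the mid-sections, and the cone that the text uses for the top section. The function name `stem_volume`, the use of the 0.3 m diameter for the bottom cylinder, and the example measurements are assumptions for illustration; the thesis does not give its implementation.

```python
import math

def stem_volume(heights_m, diameters_cm, total_height_m):
    """Total stem volume (m^3) from inside bark diameters (cm) at given heights (m).

    heights_m must be ascending, start at the top of the bottom section
    (0.3 m in the text), and stay below total_height_m.
    """
    # Cross-sectional areas in m^2 (diameter in cm -> radius in m is d / 200)
    areas = [math.pi * (d / 200.0) ** 2 for d in diameters_cm]

    # Bottom section: cylinder from the ground to the first height
    # (assumes the diameter measured at that height applies down to the ground)
    volume = areas[0] * heights_m[0]

    # Mid-sections: Smalian's formula, mean of the two end areas times length
    for i in range(len(heights_m) - 1):
        length = heights_m[i + 1] - heights_m[i]
        volume += 0.5 * (areas[i] + areas[i + 1]) * length

    # Top section: cone from the last measurement to the tip of the tree
    volume += areas[-1] * (total_height_m - heights_m[-1]) / 3.0
    return volume

# Hypothetical example: a stem with the same 20 cm diameter at every measurement
example = stem_volume([0.3, 1.3, 5.0], [20.0, 20.0, 20.0], 8.0)
print(round(example, 5))
```

With equal diameters the result reduces to the cross-sectional area times (0.3 + 4.7 + 3.0/3) m, which gives a quick hand check of the three section formulas.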
Finally, the equation for a cone was used for the top section because the tree top is approximately cone shaped. The sum of the predicted segment volumes gave the predicted total tree volume. The actual volume from the data and the predicted volume were then used to develop the calibration equation.

A linear model of the form $Y = b_0 + b_1 X$ (where Y = actual volume, X = predicted volume, and $b_0$ and $b_1$ are the intercept and slope, respectively) and a curvilinear model of the form $Y = a X^{b} c^{X}$ (where Y = actual volume, X = predicted volume, and a, b and c are regression coefficients) were fitted to the data. From the fit statistics, the curvilinear models were found to be the better equations, with $R^2$ values of 0.999 and 0.998 for the vm and xm subzones, compared to 0.995 and 0.996 for the linear models. These models were then used to adjust the predicted volumes. The curvilinear equation was chosen because it is very flexible in shape, it passes through the origin, and it is fitted using a logarithmic transformation that usually equalizes the variances of Y within the range of X.

The subzone data were classified into 15 cm diameter classes for the examination of bias. For each D class, the bias, standard error of estimate (SEE), bias percent, and standard error of estimate percent were calculated for volume using the following equations:

$$ B = \frac{\sum_{i=1}^{n} (Y_i - \hat{Y}_i)}{n} \tag{3} $$

$$ SEE = \sqrt{\frac{\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2}{n - m}} \tag{4} $$

where:
$Y_i$ is the actual volume;
$\hat{Y}_i$ is the predicted volume from the calibrated equation;
$n$ is the number of observations in a diameter class; and
$m$ is the number of estimated parameters used in estimation.

Finally, the biases and standard errors of estimate were calculated for each diameter class and for all the trees within the sample.

Chapter 4 Results

4.1 Variance Inflation Factors and Correlation Matrix

The correlation matrices and VIFs obtained by Kozak (1997) for western hemlock for the 1994 and 1995 equations are shown in Tables 1 and 2.
Tables 3 and 4 give the correlation matrices and the variance inflation factors (VIF) for each equation fitted with the red cedar data. For western hemlock, the correlation matrices and VIFs showed that most of the independent variables of the 1994 equation are strongly correlated with each other, unlike those of the 1995 equation, which have low correlations (Table 1 vs Table 2). Similarly, the correlation matrices and the VIFs for red cedar indicate that the regressor variables of the 1994 equation have stronger linear dependencies than the regressor variables of the 1995 equation (Table 3 vs Table 4).

4.2 Simulation 1

4.2.1 Western hemlock

In simulation (1), where 200 trees of each species were sampled (without replacement) to fit both equations, the averages of the estimated regression coefficients are similar to the population parameters for both equations (Tables 5a and 5b). However, some of the estimated coefficients were highly variable, as indicated by the coefficients of variation for the 1994 equation: b0 (47.62%), b1 (44.06%), b2 (44.69%), b3 (50.07%), b4 (52.24%) and b6 (97.06%). These variations may be attributed to the strong linear dependencies that exist among these variables and to the presence of severe multicollinearity. The estimated regression coefficients for the 1995 equation are much less variable, except for m4, with a coefficient of variation of 45.1%. This may be attributed to the moderate multicollinearity that the 1995 equation possesses. A slight underestimation of the MSE can be observed for both equations; this may be due to autocorrelation.

Table 1. Correlation matrix and variance inflation factors (VIF) for the 1994 Equation, western hemlock. (Extracted from Kozak (1997) with permission of the author).
1/4 h\X l 1/2 1 / 3 l t Z Z , i Z a sin gj l/(D/H.+ z) : H InD D lndi 1.000 0.302 -0.309 0.388 8.72 1.000 0.941 0.703 9.73 1.000 0.646 8.98 1.000 1.000 lnX, 1/4 0.998 1.000 1/3 0.996 0.999 1.000 1/2 0.992 0.998 0.999 1.000 a sin ft 0.409 0.345 0.325 0.290 1,000 V(D/H+z,) H InD D lndi VIF 0.952 0.942 0.884 0.003 -0.003 0.683 * 0.938 0.929 0.883 0.000 -0.015 0.682 * 0.881 -0.004 -0.019 0.681 * 0.497 0.338 0.122 0.099 0.278 302,114. i Z z. i Z 0.885 0.012 -0.005 0.685 * = greater 5 x 10 1.000 0.786 0.192 0.169 0.758 20.17 7 Table 2. Correlation matrix and variance inflation factors (VIF) for the 1995 Equation, western hemlock. (ExtractedfromKozak (1997) with permission of the author). 1/10 i Z 4. . a sin Q \le D I » D"' n 0.816 1.000 a sin Q, 0.669 0.185 1.000 \le 0.893 0.760 0.551 1.000 0.588 0.244 0.881 0.465 1.000 -0.046 -0.082 -0.052 -0.124 0.609 , , .0.270 15.59 .20.53 0.172 0.034 0.709 9.62 -0.050 -0.091 0.295 6.92 4 Z, D> InD InH lndi VIF InH lndi 1.000 x> ° x InD D I H -0.054 -0.080 0.623 29.94 1.000 0.890 0.703 8.57 1.000 0.631 6.79 15 1.000 Table 3. Correlation matrix and variance inflation factors (VIF) for the 1994 Equation, red cedar. \nX i Z 1.000 lnX, z V' 1/4 t r 1/2 3 i a sing, l/(D/H + z,) H InD D lndi 1.000 -0.309 -0.314 0.459 10.55 1.000 . 0.936 0.658 9.24 1.000 0.597 8.33 1.000 Z 0.998 1.000 1/3 0.996 0.999 1.000 1/2 0.869 0.834 0.823 1.000 0.413 0.349 0.330 0.792 1.000- 0.949 0.887 0.010 -0.009 0.735 A 0.940 0.887 0.001 -0.017 0.725 B 0.936 0.886 -0.002 -0.019 0.722 C 0.879 0.748 0.065 0.042 0.677 D 0.488 0.336 0.126 0.099 0.408 E i Z i Z a sing l / ( D / / / + z,) H InD D lndi VIF 1.000 0.751 0.221 0.195 0.822 24.49 A = 6378289.3258 B = 761858.18257 C = 5976957.2279 D = 1094.8373975 E = 63541.561880 Table 4. Correlation matrix and variance inflation factors (VIF) for the 1995 Equation, red cedar. 
4 1/10 l Z a sin Q \le DIH D> x InD InH 1.000 0.932 0.658 10.11 1.000 0.621 7.88 lndi 1.000 4 0.823 1.000 0.378 0.015 -0.130 0.005 1.000 -0.001 1.000 -0.136 -0.146 0.211 -0.042 1.000 -0.015 -0.021 0.713 16.65 -0.042 .-0.041 0.617 13.24 -0.027 0.378 0.232 5.89 0.115 0.005 0.093 1.03 -0.711 -0.642 -0.567 3.19 i Z a sin Q, \le D"' DIH InD InH lndi VIF 1.000

Table 5a. Average regression coefficients and associated S and CV% obtained from the four simulations for western hemlock.

1994 Equation
Coefficient  Statistic  Population   Simulation 1  Simulation 2  Simulation 3  Simulation 4
a0           Mean       1.072655     1.074869      1.087627      1.02442       1.09379
             S                       0.093429      0.054516      0.075778      0.080598
             CV%                     8.71          5.01          7.39          7.37
a1           Mean       1.01551      1.016294      1.010726      1.028975      1.010549
             S                       0.032211      0.018076      0.026614      0.027954
             CV%                     3.17          1.79          2.59          2.77
a2           Mean       0.998825     0.998803      0.998863      0.998822      0.999152
             S                       0.001064      0.00135       0.000791      0.001139
b0           Mean       81.6472      83.4906       80.8532       96.3647       53.6021
             S                       38.8784       37.8059       34.8598       27.0128
             CV%                     47.62         46.75         36.17         50.39
b1           Mean       -238.305     -243.375      -238.259      -275.189      -156.592
             S                       105.001       109.9212      93.56383      73.88222
             CV%                     44.06         46.13         33.99         47.18
b2           Mean       234.102      239.173       235.0459      269.9268      150.5629
             S                       104.613       116.1083      93.06731      73.93027
             CV%                     44.69         49.39         34.59         49.1
b3           Mean       -76.5239     -78.3659      -76.7318      -90.1722      -46.6297
             S                       38.3191       36.90174      34.18039      26.93643
             CV%                     50.07         48.09         37.91         57.73
b4           Mean       -41.2654     -42.2775      -40.78162     -49.59729     -25.98751
             S                       21.5566       18.45587      19.37159      14.92038
             CV%                     52.24         46.05         39.06         57.41
b5           Mean       -0.316966    -0.318428     -0.299803     -0.361717     -0.334733
             S                       0.08606       0.053096      0.065593      0.045066
             CV%                     27.15         18.0          27.03         13.46
b6           Mean       -0.000919    -0.000936     0.000942      -0.000267     -0.000389
             S                       0.000892      0.000695      0.000738      0.000563
             CV%                     97.06         73.78         276.4         144.73
MSE          Mean       0.018791     0.018472      0.018528      0.01799       0.017455
             S                       0.002165      0.002002      0.002186      0.001444
             CV%                     11.52         10.8          12.15         8.27

Table 5b. Average regression coefficients and associated S and CV% obtained from the four simulations for western hemlock.
1995 Equation
Coefficient  Statistic  Population   Simulation 1  Simulation 2  Simulation 3  Simulation 4
c0           Mean       0.779855     0.780658      0.765891      0.78877       0.781593
             S                       0.04273       0.037744      0.031313      0.027117
             CV%                     5.48          4.93          3.97          3.46
c1           Mean       0.924613     0.923061      0.941351      0.939772      0.927114
             S                       0.017093      0.013051      0.008219      0.011935
             CV%                     1.58          1.38          0.87          1.29
c2           Mean       0.134179     0.135999      0.121682      0.114953      0.131199
             S                       0.029391      0.016196      0.01574       0.017914
             CV%                     21.6          13.31         13.69         13.65
m1           Mean       0.677818     0.678888      0.629506      0.723646      0.681869
             S                       0.040765      0.032397      0.024082      0.042
             CV%                     6.01          5.15          3.33          6.16
m2           Mean       0.322517     0.321019      0.376176      0.291062      0.315592
             S                       0.048981      0.046496      0.032054      0.055175
             CV%                     15.19         12.36         11.01         17.5
m3           Mean       -0.81981     -0.825401     -0.5896       -0.828604     -0.815086
             S                       0.110511      0.103533      0.072779      0.122704
             CV%                     13.48         17.55         8.78          15.05
m4           Mean       -0.19758     -0.195518     -0.220805     -0.238356     -0.197289
             S                       0.089103      0.065627      0.044356      0.070142
             CV%                     45.1          29.72         18.61         35.5
m5           Mean       0.017564     0.0017665     0.013434      0.01751       0.017103
             S                       0.001684      0.001724      0.001177      0.001729
             CV%                     9.59          12.83         6.72          10.1
MSE          Mean       0.019008     0.01888       0.016944      0.018653      0.018803
             S                       0.002326      0.001984      0.00146       0.002181
             CV%                     12.24         11.71         7.83          11.59

Table 6a. Average regression coefficients and associated S and CV% obtained from the four simulations for red cedar.

1994 Equation
Coefficient  Statistic  Population   Simulation 1  Simulation 2  Simulation 3  Simulation 4
a0           Mean       1.735756     1.738227      1.839389      1.968011      1.822585
             S                       0.159356      0.110104      0.035719      0.075491
             CV%                     9.17          5.98          1.81          4.14
a1           Mean       0.940212     0.941266      0.91958       0.891846      0.925163
             S                       0.030596      0.02041       0.006644      0.013342
             CV%                     3.25          2.22          0.74          1.44
a2           Mean       0.999218     0.999204      0.999603      1.000281      0.99939
             S                       0.000193      0.000835      0.001811      0.000853
             CV%                     0.02          0.08          0.18          0.08
b0           Mean       28.6529      29.72229      25.52636      63.10847      34.00028
             S                       39.22315      27.78372      9.163154      36.08668
             CV%                     131.96        108.84        14.52         106.14
b1           Mean       -91.1786     -93.96521     -76.58339     -195.5055     -110.5883
             S                       107.8425      75.79272      25.65334      99.30624
             CV%                     114.76        98.97         13.12         89.79
b2           Mean       83.057       85.78316      66.56506      190.6651      104.2264
             S                       108.1421      75.83416      25.85987      99.61195
             CV%                     126.06        113.92        13.56         95.57
b3           Mean       -19.5378     -20.54797     -14.52839     -57.35989     -26.35989
             S                       39.31982      27.67569      9.282277      36.18278
             CV%                     191.36        190.49        16.18         135.7
b4           Mean       -10.236      -10.83282     -8.949319     -28.7718      -12.94788
             S                       21.64177      15.37463      5.021749      19.9025
             CV%                     199.8         171.79        17.45         153.71
b5           Mean       -0.552567    -0.549837     -0.555894     -0.391382     0.5187
             S                       0.13495       0.087799      0.027936      0.064552
             CV%                     24.54         15.79         7.14          12.44
b6           Mean       0.0004391    0.000468      0.000893      0.001296      0.000547
             S                       0.001135      0.000723      0.000302      0.000642
             CV%                     242.52        80.96         23.3          117.37
MSE          Mean       0.014123     0.014233      0.015534      0.019661      0.014192
             S                       0.001877      0.001444      0.000597      0.001258
             CV%                     13.18         9.29          3.04          8.86

Table 6b. Average regression coefficients and associated S and CV% obtained from the four simulations for red cedar.
1995 Equation
Coefficient  Statistic  Population   Simulation 1  Simulation 2  Simulation 3  Simulation 4
c0           Mean       0.931686     0.93526       0.948009      0.964395      0.922077
             S                       0.055141      0.032237      0.024414      0.023319
             CV%                     5.89          3.4           2.53          2.53
c1           Mean       0.897235     0.897705      0.90571       0.921981      0.897107
             S                       0.027572      0.018231      0.011019      0.014716
             CV%                     3.07          2.01          1.19          1.64
c2           Mean       0.125128     0.124005      0.110554      0.084332      0.127759
             S                       0.044941      0.027965      0.016853      0.020973
             CV%                     36.24         25.29         19.98         16.42
m1           Mean       0.742048     0.741799      0.745185      0.707484      0.750796
             S                       0.04052       0.028558      0.023062      0.033322
             CV%                     5.46          3.83          3.26          4.44
m2           Mean       0.358176     0.358448      0.367978      0.361429      0.344151
             S                       0.044472      0.032298      0.029782      0.040765
             CV%                     12.41         8.78          8.24          11.84
m3           Mean       -0.23451     -0.23485      -0.214727     -0.311074     -0.282165
             S                       0.111451      0.081968      0.074357      0.093591
             CV%                     47.45         38.17         23.9          33.17
m4           Mean       0.63226      -0.631711     -0.679784     -0.524093     -0.632655
             S                       0.115892      0.079528      0.053855      0.070033
             CV%                     18.35         11.69         10.27         11.07
m5           Mean       0.019811     0.019838      0.018876      0.021232      0.020533
             S                       0.001754      0.001233      0.001108      0.001223
             CV%                     8.84          6.53          5.22          5.95
MSE          Mean       0.014439     0.014449      0.015054      0.017383      0.014147
             S                       0.002034      0.001515      0.001147      0.001245
             CV%                     14.08         10.06         6.59          8.8

4.2.2 Red Cedar

A similar result occurred for red cedar. For both equations, the averages of the estimated coefficients were close to the population parameters (Tables 6a and 6b). However, the coefficients of variation for the 1994 equation were quite high for the following coefficients: b0 (131.96%), b1 (114.76%), b2 (126.06%), b3 (191.36%), b4 (199.8%) and b6 (242.52%). The highest CV% for the 1995 equation was for m3 (47.45%).

4.3 Simulation 2

4.3.1 Western Hemlock

In simulation (2), where only one observation per tree was sampled to eliminate autocorrelation, the averages of the estimated regression coefficients for both equations were similar to the population parameters (Tables 5a and 5b).
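The contrast between simulations (1) and (2) can be illustrated with a small Monte Carlo sketch. It is not the thesis procedure: the population is synthetic, a straight line (`fit_line`) stands in for the taper equations, and the function names, the number of replications and the noise levels are all assumptions made for illustration. Both designs here draw the same number of trees, so the sample sizes differ, whereas the thesis matched them.

```python
import random
import statistics


def fit_line(points):
    """Ordinary least squares for y = b0 + b1*x (a stand-in for the taper fit)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1


def summarize(values):
    """Return the mean, standard deviation S and CV% of a list of estimates."""
    mean = statistics.mean(values)
    s = statistics.stdev(values)
    return mean, s, 100.0 * s / abs(mean)


def monte_carlo(population, n_trees, n_reps, one_obs_per_tree, rng):
    """Repeatedly sample trees without replacement and refit the slope."""
    slopes = []
    for _ in range(n_reps):
        trees = rng.sample(population, n_trees)  # sampling without replacement
        if one_obs_per_tree:
            # One observation per tree removes within-tree correlation.
            points = [rng.choice(tree) for tree in trees]
        else:
            # All observations of each sampled tree are used.
            points = [p for tree in trees for p in tree]
        slopes.append(fit_line(points)[1])
    return summarize(slopes)


# Hypothetical population: 500 trees, 9 observations each, with a shared
# per-tree deviation so that observations within a tree are correlated.
rng = random.Random(1)
population = []
for _ in range(500):
    tree_effect = rng.gauss(0, 0.05)
    population.append(
        [(k / 10, 1.0 - 0.8 * (k / 10) + tree_effect + rng.gauss(0, 0.02))
         for k in range(1, 10)]
    )

sim1 = monte_carlo(population, 200, 100, False, rng)  # all observations per tree
sim2 = monte_carlo(population, 200, 100, True, rng)   # one observation per tree
print("all observations:   mean=%.4f S=%.4f CV%%=%.2f" % sim1)
print("one per tree:       mean=%.4f S=%.4f CV%%=%.2f" % sim2)
```

The three numbers returned for each design correspond to the Mean, S and CV% rows reported for each coefficient in Tables 5a to 6b.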
The associated standard deviations and coefficients of variation were smaller than those obtained in simulation (1), except for b1 and b2 for the 1994 equation and for m3 for the 1995 equation, with coefficients of variation of 46.13, 49.39 and 17.55%, respectively. This may be due to the elimination of autocorrelation from the data set. The underestimation of the MSE by both equations is less pronounced in simulation (2) than in simulation (1), due to the removal of autocorrelation from the data.

4.3.2 Red Cedar

The averages of the estimated regression coefficients for the 1994 equation were also slightly different from the population parameters (Tables 6a and 6b). The standard deviations and the coefficients of variation were also smaller than in simulation (1), due to the elimination of autocorrelation. The variation of the estimated regression coefficients was smaller for the 1995 equation than for the 1994 equation for both species. This may be attributed to the elimination of autocorrelation.

Table 7. Average volume, merchantable height, and inside bark diameter estimation for three (small, medium and large) trees using the population equation and the four Monte Carlo simulations for western hemlock.
                       1994 Equation                  1995 Equation
                 small    medium     large      small    medium     large
Volume
Population      0.1016    3.6799   37.3437     0.1025    3.6483   39.1256
Simulation 1    0.1023    3.7213   38.1063     0.1029    3.6861   40.4720
  S             0.0024    0.0508    3.3140     0.0026    0.0497    1.2015
  CV%           2.36      1.50      8.87       2.54      1.36      3.07
Simulation 2    0.1012    3.6991   37.7064     0.1010    3.6582   41.5576
  S             0.0023    0.0505    2.2298     0.0023    0.0507    1.4777
  CV%           2.27      1.36      5.91       2.28      1.38      3.56
Simulation 3    0.1005    3.6296   38.5505     0.1015    3.5896   39.5992
  S             0.0019    0.0336    2.9691     0.0019    0.0338    0.8867
  CV%           1.89      0.93      7.70       1.87      0.94      2.24
Simulation 4    0.1029    3.7141   38.1238     0.1028    3.6848   40.6634
  S             0.0013    0.0324    1.8648     0.0015    0.0323    0.8044
  CV%           1.26      0.87      4.89       1.46      0.87      1.98
Merchantable ht.
Population      6.41      33.92     60.57      6.48      33.80     60.68
Simulation 1    6.41      33.91     60.55      6.47      33.80     60.69
  S             0.16      0.14      0.35       0.14      0.14      0.29
  CV%           2.50      0.41      0.58       2.16      0.41      0.48
Simulation 2    6.40      33.81     60.21      6.35      33.84     60.61
  S             0.15      0.11      0.28       0.14      0.13      0.28
  CV%           2.34      0.33      0.46       2.20      0.38      0.46
Simulation 3    6.31      33.75     60.39      6.30      33.72     60.69
  S             0.12      0.09      0.29       0.11      0.10      0.21
  CV%           1.90      0.27      0.48       1.75      0.29      0.35
Simulation 4    6.41      33.94     60.64      6.46      33.83     60.74
  S             0.09      0.13      0.26       0.08      0.12      0.28
  CV%           1.40      0.38      0.43       1.23      0.35      0.46
Inside bark diameter at 0.3 m
Population      15.50     62.18     171.92     15.03     60.55     203.59
Simulation 1    15.49     62.20     172.02     15.03     60.57     203.81
  S             0.19      0.72      8.02       0.18      0.58      4.80
  CV%           1.23      1.16      4.66       1.20      0.96      2.36
Simulation 2    15.31     62.55     176.16     15.11     59.58     193.29
  S             0.17      0.68      5.88       0.16      0.57      4.86
  CV%           1.11      1.09      3.34       1.06      0.95      2.51
Simulation 3    15.60     62.95     177.87     15.15     60.77     205.43
  S             0.14      0.55      7.14       0.13      0.11      3.71
  CV%           0.89      0.87      4.01       0.86      0.72      1.81
Simulation 4    15.53     61.86     170.73     15.03     60.38     202.28
  S             0.12      0.62      5.80       0.13      0.45      4.57
  CV%           0.77      1.00      3.40       0.86      0.74      2.26
Inside bark diameter at 10%
Population      13.90     48.22     120.56     14.00     48.12     116.54
Simulation 1    13.90     48.23     120.52     14.00     48.11     116.40
  S             0.15      0.37      5.21       0.16      0.39      2.49
  CV%           1.08      0.77      4.32       1.14      0.81      2.14
Simulation 2    13.77     48.31     121.53     13.91     48.18     120.96
  S             0.12      0.38      3.53       0.14      0.40      3.06
  CV%           0.87      0.78      2.90       1.01      0.83      2.53
Simulation 3    13.74     47.78     122.30     14.04     47.94     116.56
  S             0.11      0.27      4.63       0.12      0.27      1.84
  CV%           0.80      0.56      3.80       0.85      0.56      1.58
Simulation 4    13.97     48.29     120.79     14.01     48.20     117.26
  S             0.11      0.31      3.27       0.10      0.28      2.09
  CV%           0.79      0.64      2.71       0.71      0.58      1.78

Table 8. Average volume, merchantable height and inside bark diameter estimation for three (small, medium and large) trees using the population equations and the four Monte Carlo simulations for red cedar.

                       1994 Equation                  1995 Equation
                 small    medium     large      small    medium     large
Volume
Population      0.1124    3.3189   29.5237     0.1124    3.2543   30.5824
Simulation 1    0.1135    3.3489   29.6259     0.1133    3.2745   30.7236
  S             0.0031    0.0823    1.7887     0.0025    0.0780    0.9126
  CV%           2.73      2.46      6.04       2.21      2.38      2.97
Simulation 2    0.1173    3.3281   30.0687     0.1144    3.2817   30.7121
  S             0.0020    0.0491    1.2380     0.0015    0.0491    0.5693
  CV%           1.71      1.47      4.12       1.31      1.49      1.85
Simulation 3    0.1187    3.3355   33.4276     0.1141    3.2636   31.3890
  S             0.0006    0.0216    0.5442     0.0011    0.0320    0.5027
  CV%           0.50      0.65      1.63       0.96      0.98      1.60
Simulation 4    0.1154    3.3222   29.6180     0.1125    3.2821   30.8161
  S             0.0013    0.0425    0.8320     0.0010    0.0352    0.4228
  CV%           1.13      1.28      2.80       0.88      0.11      1.37
Merchantable ht.
Population      6.48      33.66     59.57      6.31      33.89     59.98
Simulation 1    6.49      33.67     59.59      6.31      33.87     59.97
  S             0.19      0.22      0.46       0.16      0.17      0.26
  CV%           2.93      0.65      0.77       2.53      0.50      0.43
Simulation 2    6.73      33.64     59.48      6.40      33.90     59.90
  S             0.12      0.13      0.29       0.10      0.11      0.16
  CV%           1.78      0.38      0.49       1.56      0.32      0.27
Simulation 3    6.77      33.60     59.69      6.44      33.82     60.12
  S             0.04      0.04      0.13       0.07      0.07      0.17
  CV%           0.59      0.12      0.22       1.09      0.21      0.28
Simulation 4    6.54      33.65     59.63      6.30      33.92     60.05
  S             0.09      0.13      0.25       0.07      0.11      0.20
  CV%           1.37      0.38      0.42       1.11      0.32      0.33
Inside bark diameter at 0.3 m
Population      19.01     76.92     220.95     18.46     68.56     226.55
Simulation 1    19.01     77.17     211.25     18.47     68.62     226.47
  S             0.34      1.08      7.10       0.25      1.07      5.91
  CV%           1.79      1.39      3.36       1.35      1.55      2.61
Simulation 2    19.21     76.63     212.65     18.44     67.96     222.43
  S             0.23      0.81      4.73       0.17      0.72      3.93
  CV%           1.19      1.06      2.22       0.92      1.06      1.77
Simulation 3    19.18     76.94     211.45     18.31     68.63     228.63
  S             0.06      0.37      1.95       0.12      0.47      3.46
  CV%           0.31      0.48      0.86       0.66      0.69      1.51
Simulation 4    19.18     76.94     221.45     18.31     68.63     228.63
  S             0.17      0.82      4.75       0.16      0.56      3.81
  CV%           0.88      1.06      2.25       0.87      0.82      1.67
Inside bark diameter at 10%
Population      14.53     45.68     107.18     15.13     45.67     101.88
Simulation 1    14.53     45.74     107.24     15.14     45.78     101.91
  S             0.22      0.50      3.15       1.05      1.37      2.24
  CV%           1.51      1.09      2.94       1.05      1.37      2.24
Simulation 2    14.70     45.60     108.45     15.18     45.85     102.77
  S             0.13      0.35      2.23       0.10      0.39      1.48
  CV%           0.88      0.77      2.05       0.66      0.85      1.44
Simulation 3    14.90     45.96     114.73     15.12     45.35     101.11
  S             0.03      0.11      0.95       0.07      0.28      1.33
  CV%           0.20      0.24      0.83       0.46      0.62      1.32
Simulation 4    14.71     45.74     107.60     15.06     45.75     101.57
  S             0.11      0.32      1.62       0.009     0.31      1.27
  CV%           0.75      0.69      1.51       0.59      0.68      1.25

4.3.3 Comparison of Species

The variation of the estimated regression coefficients for both equations was higher for red cedar than for western hemlock. The polymorphism of cedar may be the reason behind these differences.

4.4 Simulation 3

4.4.1 Western Hemlock

The averages of the estimated regression coefficients for the 1994 equation were different from the population parameters (Table 5a). The variations of the estimated regression coefficients were lower for simulation (3) than for simulations (1) and (2), except for b6, with a coefficient of variation of 276.4%.
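The very large CV% for b6 mainly reflects an average coefficient that is close to zero rather than a large standard deviation. A short check using values read from Table 5a makes this concrete; the helper name `cv_percent` is illustrative, and taking the absolute value of the mean is an assumption consistent with the positive CV% values the tables report for negative coefficients.

```python
def cv_percent(mean, s):
    """Coefficient of variation in percent: 100 * S / |mean|."""
    return 100.0 * s / abs(mean)


# Values read from Table 5a (1994 equation, western hemlock).
cv_b6_sim3 = cv_percent(-0.000267, 0.000738)  # b6, simulation 3: mean near zero
cv_a0_sim2 = cv_percent(1.087627, 0.054516)   # a0, simulation 2: mean far from zero

print(round(cv_b6_sim3, 1))  # 276.4
print(round(cv_a0_sim2, 2))  # 5.01
```

Although the standard deviation of a0 is roughly 70 times larger than that of b6 in absolute terms, its CV% is far smaller because its mean is far from zero.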
In contrast to the 1994 equation, the averages of the estimated regression coefficients for the 1995 equation were similar to the population parameters (Table 5b). The variation of the estimated coefficients was smaller than in simulations (1) and (2). This may be attributed to the wider and more uniform coverage of the range of tree sizes. Both equations underestimated the MSE, but the amount was negligible.

4.4.2 Red Cedar

The averages of the estimated regression coefficients also differed from the population parameters for the 1994 equation, but were similar for the 1995 equation (Tables 6a and 6b). However, the variations of the estimated coefficients were lower than in simulations (1) and (2) for both equations. Since cedar has several shape patterns (Silviculture in Canada, 1989), the wider and more uniform coverage of the range of tree sizes might be the reason behind the lower CV% compared to western hemlock. The MSE was underestimated by both equations.

4.5 Simulation 4

4.5.1 Western Hemlock

In simulation (4), the observations used to fit the equations were selected with probability proportional to the size of the strata. The averages of the estimated regression coefficients for the 1994 equation differed from the population parameters, and the associated coefficients of variation were similar to those in simulation (1) (Table 5a). However, the estimated regression coefficients for the 1995 equation were in agreement with the population parameters (Table 5b). The variability of the estimated regression coefficients was slightly higher than in simulation (3) but similar to that in simulation (1). The MSE was slightly underestimated by both equations.

4.5.2 Red Cedar

The averages of the estimated regression coefficients for the 1994 equation were different from the population parameters (Table 6a). However, the variability of the regression coefficients was smaller than in simulation (1) and larger than in simulation (3).
This may be attributed to the elimination of the variability due to stratification. The estimated regression coefficients for the 1995 equation were similar to the population parameters. The coefficients of variation were smaller than in simulation (1) but larger than in simulation (3) (Table 6b). This may be attributed to the moderate multicollinearity the 1995 equation possesses.

4.6 Estimation of Total Tree Volume, Merchantable Heights and Inside Bark Diameters

4.6.1 Volume Estimation

A review of Tables 7 and 8 indicates that both equations predicted total tree volume with a fair amount of accuracy for both species. For western hemlock, the 1994 equation overestimated total tree volume for small trees in simulations (1) and (2). The same is true for the 1995 equation, except that total tree volume for the medium-sized trees was overestimated in simulation (3) (Table 7). For red cedar, both equations overestimated total tree volume for all of the tree sizes in all the simulations (Table 8). The population values for the 1995 equation are slightly larger than the 1994 values for the small and large trees (Tables 7 and 8).

4.6.2 Merchantable Height Estimation

The predicted merchantable heights are very similar to the population values for both equations and species (Tables 7 and 8). There was a negligible difference between the equations and the simulations.

4.6.3 Inside Bark Diameter Estimation

The equations gave different inside bark diameter estimates at 0.3 m. The estimates from the four simulations for both equations and species were very variable (Tables 7 and 8). This may be attributed to the different stump shapes. For hemlock, the estimates obtained by the 1994 equation in simulations (2) and (3) overestimated the population values for the medium and large trees, whereas the 1995 equation underestimated the population values. For red cedar, the predictions of the average inside bark diameters at 0.3 m for both equations were very variable.
There was no definite pattern or consistency, as the four simulations gave different estimates (Table 8). The predictions of the average inside bark diameter for both equations and species at 10% (Tables 7 and 8) and at 30, 50 and 70% of tree height (Tables 9 and 10) are more consistent. Both equations provide nearly the same predictions along the stem. The predicted average inside bark diameters are less variable than at 0.3 m and are similar to the population values. This may be attributed to the paraboloid shape of the stem between 10 and 70% of the tree height. However, at 90% of tree height the estimated average inside bark diameters were variable compared to the population values. This may be attributed to the smaller diameters (tapering of the tip of the tree) and the larger coefficients of variation. Both equations underestimated the average inside bark diameters at 0.3 m and overestimated them at 90% of the total tree height.

Despite these shortcomings, the variable-form taper equations proved to be a very useful method for predicting diameter inside bark along the stem.

4.7 Comparing the Equations and the Simulations

The coefficients of variation for the 1994 equation were larger than those for the 1995 equation, mostly for medium and large trees. This may be attributed to multicollinearity (Tables 9, 10, 11 and 12). The variations of the predicted values for simulation (1) are higher than those for simulation (2). This may be attributed to the elimination of autocorrelation and the wider range of tree sizes sampled in simulation (2). As in simulation (2), the variations of the predicted values in simulation (3) are smaller than in simulation (1). This may also be attributed to the wider range of tree sizes sampled. Hence, more stable equations and uniform predictions were obtained (Tables 9, 10, 11 and 12). The above is also true for simulation (4).
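The two stratified designs compared above differ only in how the sample is allocated to the size strata. The sketch below contrasts equal allocation, as in simulation (3), with proportional allocation, as in simulation (4). The stratum sizes, the total sample size and the function names are hypothetical, and the largest-remainder rounding is one reasonable choice rather than the procedure used in the thesis.

```python
import random


def equal_allocation(strata_sizes, n):
    """Split n as evenly as possible across strata, capped at each stratum size."""
    base, extra = divmod(n, len(strata_sizes))
    return [min(size, base + (1 if i < extra else 0))
            for i, size in enumerate(strata_sizes)]


def proportional_allocation(strata_sizes, n):
    """Allocate n in proportion to stratum size, using largest-remainder rounding."""
    total = sum(strata_sizes)
    raw = [n * size / total for size in strata_sizes]
    alloc = [int(r) for r in raw]
    # Hand out the units lost to truncation to the largest fractional parts.
    order = sorted(range(len(raw)), key=lambda i: raw[i] - alloc[i], reverse=True)
    for i in order[: n - sum(alloc)]:
        alloc[i] += 1
    return alloc


def stratified_sample(strata, allocation, rng):
    """Draw a simple random sample without replacement inside each stratum."""
    return [rng.sample(stratum, k) for stratum, k in zip(strata, allocation)]


# Hypothetical population: many small trees, fewer medium, few large.
strata = [list(range(0, 600)), list(range(600, 900)), list(range(900, 1000))]
sizes = [len(s) for s in strata]

equal = equal_allocation(sizes, 200)
proportional = proportional_allocation(sizes, 200)
print(equal)         # [67, 67, 66]
print(proportional)  # [120, 60, 20]

sample = stratified_sample(strata, equal, random.Random(0))
```

With equal allocation the large trees make up about a third of the sample although they are only a tenth of this hypothetical population, which is the sense in which that design covers the range of tree sizes more uniformly.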
Therefore, if the main objective is the prediction of total tree volume, merchantable height and inside bark diameter, then a wide range of tree sizes should be sampled to produce relatively stable equations and uniform predictions.

4.8 Goodness of Fit

The chi-square test performed on the frequency distributions of the 1000 predicted values (total tree volume, merchantable height, and inside bark diameter at 0.3 m, 10%, 30%, 50%, 70% and 90% of tree height) and of the regression coefficients for both equations and species indicated that most of them were not significantly different (at α = 0.05) from a normal distribution. As a result, it can be stated that the frequency distributions of the predicted characteristics and the estimated regression coefficients are symmetrical, approaching the normal distribution.

4.9 Possibility of Localizing the 1994 Equation

From the analysis carried out on the possibility of localizing the 1994 variable-exponent taper equation, the total average bias percent was less than 5% for both subzones (Tables 11 and 13). Even though the adjustment improved the overall bias % and reduced the variation around the regression surface (SEE%), the magnitude of the reduction was negligible (Tables 12 and 14). According to Kozak and Smith (1993): "to evaluate equations, a decision can be made by visual comparison. If the biases show some kind of trend by tree sizes, or if some of the biases are extremely high for some size classes, the calibration is not acceptable. A good equation would show less than 5% biases for almost all DBH classes containing at least 5 trees". The biases for all but two DBH classes were below 5% in the xm subzone, and for all but one in the vm subzone. The biases in those classes remained over 5% after the adjustment because the calibration equations have approximately a perfect fit with a zero intercept and a slope close to 1. Hence, not much can be expected from the adjustments.
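The calibration step can be sketched as follows, assuming the curvilinear form Y = a·X^b·c^X that appears in the captions of Tables 12 and 14 and fitting it through its logarithm, ln Y = ln a + b ln X + X ln c. The data are synthetic and the function names are illustrative. Bias is taken as the mean of actual minus predicted and SEE uses an n − m denominator; the percent versions divide by the mean actual volume, which is an assumption, since that definition is not shown in the text.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_calibration(pred, actual):
    """Fit Y = a * X**b * c**X by least squares on ln Y = ln a + b ln X + X ln c."""
    rows = [[1.0, math.log(x), x] for x in pred]
    y = [math.log(v) for v in actual]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * v for r, v in zip(rows, y)) for i in range(3)]
    ln_a, b, ln_c = solve(xtx, xty)
    return math.exp(ln_a), b, math.exp(ln_c)


def bias_and_see(actual, pred, m):
    """Return bias, SEE, bias% and SEE% (percentages relative to mean actual)."""
    n = len(actual)
    resid = [y - p for y, p in zip(actual, pred)]
    bias = sum(resid) / n
    see = math.sqrt(sum(e * e for e in resid) / (n - m))
    mean_actual = sum(actual) / n
    return bias, see, 100.0 * bias / mean_actual, 100.0 * see / mean_actual


# Synthetic check: volumes generated exactly from known coefficients.
predicted = [0.1, 0.5, 1.0, 2.0, 4.0, 8.0]
actual = [0.95 * x ** 0.97 * 1.01 ** x for x in predicted]

a_hat, b_hat, c_hat = fit_calibration(predicted, actual)
calibrated = [a_hat * x ** b_hat * c_hat ** x for x in predicted]
stats_before = bias_and_see(actual, predicted, m=3)
stats_after = bias_and_see(actual, calibrated, m=3)
```

When the fitted a, b and c are all close to 1, the calibrated volumes barely differ from the uncalibrated ones, which is the situation described above for the two subzones.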
Therefore, it can be concluded that the 1994 variable-exponent taper equation predicted total volume for the subzones with a fair amount of accuracy, and no adjustments are required for these particular subzones.

Table 9. Inside bark diameter estimation for three (small, medium and large) trees at 30%, 50%, 70% and 90% of the tree height for western hemlock.

                       1994 Equation                  1995 Equation
                 small    medium     large      small    medium     large
Inside bark diameter at 30%
Population      12.27     41.83     104.61     12.36     42.11     107.25
Simulation 1    12.24     42.46     104.64     12.28     42.26     107.16
  S             0.15      0.38      4.65       0.18      0.33      1.65
  CV%           1.23      0.89      4.44       1.46      0.78      1.54
Simulation 2    12.28     42.45     104.85     12.28     42.25     105.49
  S             0.10      0.29      2.59       0.12      0.23      1.08
  CV%           0.81      0.68      2.72       0.98      0.54      1.00
Simulation 3    12.14     42.21     103.66     12.06     41.84     108.47
  S             0.14      0.36      3.16       0.16      0.31      1.97
  CV%           1.15      0.85      3.05       1.32      0.74      1.82
Simulation 4    12.10     41.83     104.85     12.48     41.60     105.67
  S             0.12      0.26      4.13       0.13      0.21      1.23
  CV%           0.99      0.62      3.94       1.06      0.50      1.16
Inside bark diameter at 50%
Population      9.95      34.58     84.65      10.00     34.46     88.95
Simulation 1    9.93      34.59     84.67      10.00     34.38     88.74
  S             0.16      0.36      4.15       0.15      0.31      1.71
  CV%           1.61      1.04      4.90       1.5       0.90      1.93
Simulation 2    9.93      34.49     84.71      9.99      34.32     88.78
  S             1.10      0.28      2.31       0.08      0.24      1.24
  CV%           1.01      0.81      2.73       0.80      0.69      1.36
Simulation 3    9.92      33.87     83.03      9.88      34.25     89.61
  S             0.13      0.34      2.90       0.13      0.30      1.74
  CV%           1.31      1.00      3.49       1.31      0.87      1.94
Simulation 4    9.82      33.87     83.76      9.86      33.49     86.35
  S             0.13      0.25      3.54       0.11      0.21      1.12
  CV%           1.32      0.74      4.23       1.11      0.63      1.29
Inside bark diameter at 70%
Population      6.64      23.20     56.76      6.81      23.20     58.92
Simulation 1    6.64      23.30     56.77      6.80      23.07     58.49
  S             0.18      0.36      3.31       0.12      0.29      0.29
  CV%           2.71      1.54      5.83       1.76      1.26      3.41
Simulation 2    6.64      23.24     56.88      6.76      23.05     59.27
  S             0.12      0.33      2.08       0.07      0.23      1.11
  CV%           1.80      1.42      3.65       1.03      0.99      1.87
Simulation 3    6.70      23.14     54.98      6.84      23.31     60.34
  S             0.12      0.33      2.38       0.11      0.27      1.43
  CV%           1.79      0.01      4.32       1.61      1.16      2.37
Simulation 4    6.55      22.56     55.04      6.64      22.37     57.17
  S             0.12      0.23      2.62       0.08      0.19      0.96
  CV%           1.83      1.02      4.76       1.20      0.84      1.67
Inside bark diameter at 90%
Population      2.46      8.58      20.55      2.56      8.60      20.99
Simulation 1    2.44      8.68      20.98      2.57      8.49      21.16
  S             0.10      0.28      1.75       0.10      0.29      1.09
  CV%           4.09      3.22      8.34       3.89      3.41      5.15
Simulation 2    2.46      8.74      21.26      2.59      8.55      21.31
  S             0.13      0.27      1.11       0.09      0.27      0.87
  CV%           5.28      3.09      5.22       3.47      3.15      4.08
Simulation 3    2.47      8.47      19.47      2.57      8.52      21.24
  S             0.14      0.24      1.23       0.09      0.28      1.04
  CV%           5.66      2.83      6.32       3.50      3.28      4.89
Simulation 4    2.45      8.41      20.00      2.57      8.41      20.71
  S             0.10      0.18      1.30       0.07      0.19      0.68
  CV%           4.08      2.14      6.50       0.83      2.26      3.28

Table 10. Inside bark diameter estimation for three (small, medium and large) trees at 30%, 50%, 70% and 90% of tree height for red cedar.
                       1994 Equation                  1995 Equation
                 small    medium     large      small    medium     large
Inside bark diameter at 30%
Population      12.01     37.49     85.92      12.10     37.56     86.99
Simulation 1    12.18     37.50     83.96      12.13     37.59     86.79
  S             0.18      0.56      2.76       0.16      0.48      1.43
  CV%           1.48      1.49      3.29       3.32      1.27      1.64
Simulation 2    12.28     37.34     84.04      12.12     37.69     86.97
  S             0.09      0.30      1.31       0.08      0.24      0.66
  CV%           0.73      0.80      1.56       0.66      0.63      0.75
Simulation 3    12.28     37.46     89.60      12.25     37.73     88.21
  S             0.12      0.33      1.90       0.10      0.31      0.91
  CV%           0.98      0.88      2.24       0.82      0.82      1.04
Simulation 4    12.46     37.46     89.60      12.25     37.73     88.21
  S             0.03      0.14      0.83       0.08      0.21      0.80
  CV%           0.24      0.37      0.92       0.65      0.55      0.91
Inside bark diameter at 50%
Population      10.02     30.42     66.52      9.87      30.72     70.99
Simulation 1    10.06     30.50     66.00      9.84      30.74     70.90
  S             0.18      0.63      2.80       0.15      0.41      1.39
  CV%           1.79      2.06      4.24       1.52      1.33      1.96
Simulation 2    10.11     30.25     65.84      9.83      30.82     71.09
  S             0.08      0.34      1.42       0.06      0.21      0.73
  CV%           0.79      1.24      2.15       0.61      0.68      1.02
Simulation 3    10.29     30.45     66.20      9.93      30.87     70.85
  S             0.12      0.36      1.88       0.09      0.28      0.91
  CV%           1.17      1.18      2.84       0.90      0.91      1.28
Simulation 4    10.32     30.36     70.13      9.97      31.02     73.09
  S             0.04      0.16      0.85       0.06      0.18      0.76
  CV%           0.38      0.53      1.21       0.60      0.58      1.04
Inside bark diameter at 70%
Population      7.03      20.95     43.75      7.02      21.60     47.96
Simulation 1    7.03      20.94     43.62      7.01      21.59     47.95
  S             0.18      0.61      2.58       0.16      0.38      1.95
  CV%           2.56      2.91      5.91       2.28      1.76      4.07
Simulation 2    7.07      20.77     43.53      7.00      21.63     48.07
  S             0.09      0.35      1.42       0.08      0.22      0.66
  CV%           1.27      1.68      3.26       1.14      1.02      1.37
Simulation 3    7.23      20.89     43.42      7.11      21.72     47.78
  S             0.13      0.36      1.70       0.11      0.26      0.81
  CV%           1.79      1.72      3.91       1.55      1.19      1.69
Simulation 4    7.30      20.89     46.31      7.06      21.73     49.80
  S             0.05      0.16      0.80       0.07      0.18      0.68
  CV%           0.68      0.76      1.73       0.99      0.83      1.36
Inside bark diameter at 90%
Population      2.82      8.25      16.11      2.95      8.70      17.42
Simulation 1    2.92      8.38      16.22      2.97      8.71      17.46
  S             0.17      0.37      1.47       0.15      0.32      0.71
  CV%           5.84      4.41      9.06       5.05      3.67      4.07
Simulation 2    2.95      8.36      16.26      2.99      8.80      17.65
  S             0.12      0.21      0.79       0.10      0.23      0.47
  CV%           4.05      2.51      4.86       3.34      2.61      2.66
Simulation 3    3.03      8.33      15.85      3.02      8.76      17.22
  S             0.12      0.22      0.92       0.10      0.21      0.49
  CV%           3.96      2.64      5.80       3.31      2.39      2.84
Simulation 4    3.03      8.23      16.81      2.92      8.61      18.14
  S             0.05      0.05      0.45       0.06      0.15      0.46
  CV%           1.65      0.61      2.67       2.05      1.74      2.53

Table 11. Errors in volume by DBH classes for the 1994 equation for the vm (very moist) subzone.

DBH CLASS (cm)   FREQ   BIAS (m3)   SEE (m3)   BIAS%   SEE%
0.0-15.0         15     -0.01       0.07       -2.06   11.81
15.1-30.0        73     -0.01       0.17       -1.02   15.99
30.1-45.0        47     -0.07       0.21       -7.91   21.47
45.1-60.0        20     0.06        0.31       3.53    16.69
60.1-90.0        13     0.02        0.40       1.55    25.48
TOTAL            169    -0.02       0.20       -2.28   17.86

Table 12. Errors in volume by DBH classes, after calibration using exponential correction of Y=0.934649206*X**0.9548228*1.027429447**X for the vm subzone.

DBH CLASS (cm)   FREQ   BIAS (m3)   SEE (m3)   BIAS%   SEE%
0.0-15.0         15     0.01        0.04       1.45    7.00
15.1-30.0        73     0.02        0.17       0.23    15.89
30.1-45.0        47     -0.05       0.18       -5.39   18.10
45.1-60.0        20     0.06        0.24       3.32    12.98
60.1-90.0        13     0.03        0.20       1.95    12.61
TOTAL            169    0.01        0.17       1.02    15.02

Table 13. Errors in volume by DBH classes for the 1994 equation for the xm (Xeric) subzone.

DBH CLASS (cm)   FREQ   BIAS (m3)   SEE (m3)   BIAS%   SEE%
0.0-15.0         23     0.04        0.19       5.91    27.07
15.1-30.0        96     -0.00       0.20       -0.46   18.07
30.1-45.0        66     -0.08       0.23       -7.52   20.49
45.1-60.0        37     0.03        0.16       2.53    12.79
60.1-105.0       14     0.08        0.47       4.02    22.57
TOTAL            236    -0.01       0.22       -1.07   19.06

Table 14. Errors in volume by DBH classes, after calibration using exponential correction of Y=1.00278*X**0.972589*1.0007393**X for the xm subzone.
DBH CLASS (cm)   FREQ   BIAS (m3)   SEE (m3)   BIAS%   SEE%
0.0-15.0         23     0.04        0.18       5.71    26.83
15.1-30.0        96     0.49        0.21       0.43    18.39
30.1-45.0        66     -0.06       0.19       -5.74   16.37
45.1-60.0        37     0.02        0.13       1.74    10.53
60.1-105.0       14     0.07        0.35       3.49    17.00
TOTAL            236    -0.004      0.20       -0.39   17.33

Chapter 5 Discussion

5.1 Variance Inflation Factors and Correlation Matrix

When significant correlations exist among the regressors, the regression coefficients from ordinary least squares procedures may not be precisely estimated, thus producing areas in the regressor space where prediction could be poor (Myers, 1986). The variance inflation factors for the 1994 equation were extremely large for both species compared to those for the 1995 equation. When a variance inflation factor exceeds 10, there is reason for concern about multicollinearity (Myers, 1986). Therefore, it can be concluded that the 1994 equation possesses severe multicollinearity, while the 1995 equation possesses moderate or less multicollinearity.

5.2 Simulation and Predictions

The VIFs indicate the presence of severe multicollinearity for the 1994 equation and moderate or less multicollinearity for the 1995 equation. Accordingly, in simulation (1), the variability of the regression coefficients was more pronounced for the 1994 equation than for the 1995 equation, as suggested by theory. However, despite the presence of multicollinearity, both equations predicted total tree volume, merchantable height, and inside bark diameter with a reasonable amount of accuracy for both species. This could be attributed to the fact that the fit of a model is unaffected by severe multicollinearity when the prediction of the response is within the range of the data (Myers, 1986).

There are three independent variables (D, H and h_i/H) for both the 1994 and 1995 taper equations. The VIF values for D and H for both equations were less than 10. Hence, they were not affected by multicollinearity and their range is of no concern in this case.
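For reference, the VIF of each regressor is the corresponding diagonal element of the inverse of the regressors' correlation matrix, which makes the threshold of 10 easy to relate to the correlations in Tables 1 to 4. The sketch below is a minimal pure-Python version; the function names are illustrative, and the two-variable example uses a single pairwise correlation of 0.998 (the value reported in Table 1 between the first two regressors) in isolation, so it does not reproduce the VIFs of the full equations.

```python
def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def variance_inflation_factors(corr):
    """VIF_j is the j-th diagonal element of the inverse correlation matrix."""
    inverse = invert(corr)
    return [inverse[i][i] for i in range(len(corr))]


# Two regressors with correlation r give VIF = 1 / (1 - r**2) for both.
vif_pair = variance_inflation_factors([[1.0, 0.998], [0.998, 1.0]])
print([round(v, 1) for v in vif_pair])  # roughly 250 for each regressor

# Uncorrelated regressors give VIF = 1.
vif_independent = variance_inflation_factors([[1.0, 0.0], [0.0, 1.0]])
```

A single pairwise correlation of 0.998 already pushes the VIF far above 10, so the much larger values reported for the 1994 equation are not surprising once several such regressors are combined.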
The possible range of the third independent variable, h_i/H, is between 0 and 1. Since the data sets used in this study range from 0.3/H (which is close to zero) to H/H (which is 1.0), the data almost completely and uniformly cover that range. Therefore, multicollinearity should not adversely affect the predictive ability of the equations, because prediction is within the data range. This was confirmed by simulation (1), as both equations gave very reasonable, unbiased predictions within the range of the data set. However, as expected, the estimates from the 1994 equation were more variable than those from the 1995 equation because the 1995 equation possesses less multicollinearity.

In simulation (2), in order to obtain the same sample size as in simulation (1), approximately half of the trees in the population were sampled for each equation. This led to a wider range of tree sizes being sampled compared to simulation (1). Because of this, it is not clear whether the smaller variations of the regression coefficients and of the estimated values (volume, merchantable height and inside bark diameter) were due to the elimination of autocorrelation or to samples covering a wider range of tree sizes. The removal of autocorrelation usually results in a loss of information from the data set. Because sampling for taper equations is destructive in nature, and the practical improvement from using only one observation per tree is not very significant, it would be a waste of resources (time, money and trees) to use only one observation. However, if a technique or procedure that permits the measurement of inside bark diameter on standing trees existed, the elimination of autocorrelation should be considered.

In simulation (3), both equations gave similar results for both species. Variations of the regression coefficients and the predicted values were considerably lower than in simulation (1). This is clearly due to a wider range of tree sizes sampled.
However, the regression coefficients obtained in simulation (3) were considerably different from the actual population parameters. Since the variable-exponent taper equations are empirical equations, their coefficients have no interpretive value. The problem of unstable coefficients and coefficients that are too large in magnitude, while important, does not hinder the taper equations' ability to accurately predict the inside bark diameter of the tree. The volumes and inside bark diameters of medium and large trees of both species were generally well predicted, and overall the variability of prediction in simulation (3) was the lowest among the simulations run. It is clear that the only advantage of stratification was to reduce the variability of prediction. Since the estimated regression coefficients in simulation (3) were biased, stratified random sampling with equal allocation of trees to each stratum should not be used if the objective of fitting the taper equations is to estimate population parameters. However, if the equations are intended for the estimation of volume, merchantable height, and inside bark diameter, this sampling procedure could be considered because of the low variation of these predicted values.

In simulation (4), the variations of the regression coefficients and the predicted values were higher than those in simulation (3) for both equations and species, but lower than those obtained in simulations (1) and (2). However, the coefficients of variation were larger for the 1994 equation. This disparity may be due to the distribution of trees in the two populations and the severe multicollinearity of the 1994 equation. Although the red cedar population had a roughly uniform number of small, medium and large trees, the presence of several shape patterns and severe multicollinearity affected the performance of the 1994 equation in terms of estimating coefficients.
However, both equations predicted volume, merchantable height and inside bark diameter well.

5.3 Localizing the 1994 Equation

In the process of localizing the 1994 equation, the total bias for volume estimation was less than 5% for both subzones. These results indicate that the 1994 equation predicts total tree volume with a fair amount of accuracy. Therefore, no adjustment or scaling is required for these subzones, in contrast with the study conducted by Kozak (1997b) for the Queen Charlotte Islands, where there was a significant difference between the actual and predicted volumes. However, additional study including several subzones and species is recommended before final conclusions are drawn.

Chapter 6 Conclusions

It would be desirable to fit taper equations without multicollinearity and autocorrelation. However, this study indicated that the presence of multicollinearity in the third independent variable (hi/H) and autocorrelation in the data set did not affect the predictive ability of the 1994 and 1995 variable-exponent taper equations, because the prediction of the response was within the range of the data. The four simulations demonstrated the equations' ability to accurately predict the inside bark diameter of trees regardless of the severe multicollinearity.

Both equations predicted volume, merchantable height and inside bark diameter with a fair amount of accuracy. Considering the % bias and % SE, the 1995 equation outperformed the 1994 equation in terms of inside bark diameter estimation for both species. Both the biases and the standard errors of estimate were small, and the biases did not indicate as much of a trend. For total tree volume estimation by DBH and height class, the 1994 equation had smaller biases than the 1995 equation for western hemlock, while the 1995 equation yielded smaller biases and standard errors of estimate for red cedar. In estimating merchantable heights, both equations had similar biases and standard errors.
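The bias and standard error statistics used in these comparisons can be illustrated with a short sketch. The definitions below are assumptions, since this section does not state them: bias is taken as the mean of observed minus predicted values, the standard error of estimate as the root mean squared residual with divisor n, and both percentages as relative to the mean observed value. The function name and the toy numbers are hypothetical.

```python
import math


def prediction_stats(observed, predicted):
    """Bias, % bias, SEE and % SEE under the assumed definitions above."""
    n = len(observed)
    residuals = [o - p for o, p in zip(observed, predicted)]
    mean_observed = sum(observed) / n
    bias = sum(residuals) / n                          # assumed sign: observed - predicted
    see = math.sqrt(sum(r * r for r in residuals) / n)  # assumed divisor: n
    return {
        "bias": bias,
        "pct_bias": 100.0 * bias / mean_observed,
        "see": see,
        "pct_see": 100.0 * see / mean_observed,
    }


# Toy volumes (m3), not taken from the thesis tables.
stats = prediction_stats([1.0, 2.0, 3.0], [1.1, 1.9, 3.3])
```

Reporting the percentage forms alongside the absolute ones, as the appendix tables do, makes classes with very different mean tree sizes comparable.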
If the main objective is the prediction of total tree volume, the 1994 equation is recommended for western hemlock based on its superiority over the 1995 equation in terms of minimum bias and standard error of estimate. Since the 1995 equation possesses less multicollinearity and outperforms the 1994 equation in terms of volume and inside bark diameter estimation for red cedar, it is recommended for red cedar, although there are no clear-cut differences between the two in terms of prediction.

Stratified random sampling with equal allocation gave the smallest variability of the estimated average regression coefficients compared to simple random sampling and to stratified random sampling with the number of samples proportional to the size of the strata. However, the average estimated regression coefficients were different from the population parameters. Therefore, simple random sampling is recommended for selecting trees from the population if the main objective is estimating the population parameters. If the equations are to be used for prediction, then a wider range of the data (stratified sampling) should be used, because it will account for the total variation within the population.

If a technique or procedure existed that would permit the measurement of inside bark diameter on standing trees, then autocorrelation could be eliminated. Otherwise, it is a waste of resources (time, money and trees) to use only one observation per tree.

Finally, based on the study conducted on the two subzones considered from the coastal western hemlock (CWH) zone, and on procedures recommended by Kozak and Smith (1993), the 1994 variable-exponent taper equation requires no adjustment or scaling. However, additional studies including several subzones and species are recommended before final conclusions can be drawn.

Chapter 7 Literature Cited

Avery, T. E., and H. E. Burkhart. 1983. Forest measurement. McGraw-Hill, New York. 331 p.
Behre, C. E. 1927.
Form-class taper curves and volume tables and their application. J. Agric. Res. 35(8): 673-744.
Cochran, W. G. 1977. Sampling techniques. John Wiley and Sons, Inc., New York. 428 p.
Demaerschalk, J. P., and Kozak, A. 1977. The whole-bole system: a conditioned dual-equation system for precise prediction of tree profile. Can. J. For. Res. 7: 488-497.
Hilt, D. E. 1980. Taper-based system for estimating stem volume of upland oaks. USDA For. Serv. Res. Pap. NE-458. 11 pp.
Judge, G. G., W. E. Griffiths, R. C. Hill, H. Lutkepohl, and T. Lee. 1985. The theory and practice of econometrics. John Wiley and Sons, New York. 1019 p.
Kmenta, J. 1986. Elements of econometrics. Macmillan Publ. Co., New York. 655 p.
Kilkki, P., and M. Varmola. 1979. A nonlinear simultaneous equation model to determine taper curves. Silva Fenn. 13-10pp.
Kilkki, P., M. Saramaki, and M. Varmola. 1978. A simultaneous equation model to determine taper curve. Silva Fenn. 12(2): 120-125.
Kozak, A., D. D. Munro, and J. H. G. Smith. 1969. Taper functions and their application in forest inventory. For. Chron. 45: 278-283.
Kozak, A. 1988. A variable-exponent taper equation. Can. J. For. Res. 18: 1363-1368.
Kozak, A., and Smith, J. H. G. 1993. Standards for evaluating taper estimating systems. For. Chron. 64: 438-444.
Kozak, A. 1997a. Effects of multicollinearity and autocorrelation on the variable-exponent taper functions. Can. J. For. Res. 27: 619-629.
Kozak, A. 1997b. Calibration of volume and taper equations using regression techniques. (Internal report). For the Resources Inventory Branch, B.C. MOF.
Larson, P. 1963. Stem form development of forest trees. For. Sci. 5: 48 pp.
LeMay, V. M., Kozak, A., Muhairwe, C. K., and Kozak, R. A. 1993. Factors affecting the performance of Kozak's (1988) variable-exponent taper function. In Proceedings of Modern methods of estimating tree and log volume, IUFRO conference, June 14-16, 1993, Morgantown, West Virginia. G. B. Wood and H. V. Wiant, Jr., editors. Pp. 34-53.
Loetsch, Fritz. 1973. Forest inventory. BLV Verlagsgesellschaft, Munich. 469 p.
Max, T. A., and H. E. Burkhart. 1976. Segmented polynomial regression applied to taper equations. For. Sci. 22: 283-289.
Muhairwe, Charles K. 1993. Examination of tree form and taper over time for interior Lodgepole pine. Univ. of British Columbia.
Myers, R. H. 1986. Classical and modern regression with applications. Duxbury Press, Duxbury, Mass. 360 pp.
Naslund, M. 1980. Stem form studies of pine in northern Sweden. Swed. Univ. Agric. Sci., Rep. 8, 86 p.
Neter, J., and Wasserman, W. 1990. Statistical models. Richard D. Irwin, Inc., Homewood, Ill.
Newnham, R. M. 1988. A variable-form taper function. For. Can. Petawawa Natl. For. Inst. Information Rep. PI-X-83. 33 p.
Ormerod, D. W. 1973. A simple bole model. For. Chron. 49: 136-138.
Anonymous. 1989. Silviculture in Canada. Canadian forestry Canada. Pamphlet. MCVF. 15 p.
Smith, J. H. G., Ker, J. W., and J. Csizmazia. 1961. Economics of reforestation of Douglas fir, western hemlock, and western red cedar in the Vancouver District. Univ. B.C., Fac. For., Bull. No. 3, 144 p.

Appendix 1. Biases and standard errors of estimate for inside bark diameter estimation at different heights from ground for western hemlock.

Ht from ground | No. | 1994 Bias (cm) | 1994 % bias | 1994 SE (cm) | 1994 % SE | 1995 Bias (cm) | 1995 % bias | 1995 SE (cm) | 1995 % SE
0.3 m | 5716 | 0.32 | 0.69 | 5.93 | 12.86 | 0.31 | 0.67 | 5.74 | 12.44
0.3-1.3 | 9456 | 0.62 | 1.39 | 3.67 | 8.25 | -0.18 | -0.41 | 3.01 | 6.76
1.3 m | 2851 | 0.61 | 1.68 | 2.12 | 5.87 | 0.11 | 0.30 | 1.46 | 4.02
10.0% | 7797 | -0.11 | -0.29 | 2.24 | 5.84 | -0.05 | -0.14 | 2.23 | 5.82
20.0% | 4232 | -0.20 | -0.59 | 2.71 | 8.08 | 0.04 | 0.12 | 2.70 | 8.04
30.0% | 4433 | -0.28 | -0.87 | 2.81 | 8.91 | -0.13 | -0.40 | 2.84 | 9.00
40.0% | 3088 | -0.15 | -0.47 | 3.02 | 9.69 | -0.06 | -0.18 | 3.09 | 9.93
50.0% | 4644 | -0.02 | -0.07 | 2.88 | 11.19 | 0.03 | 0.11 | 2.92 | 11.36
60.0% | 4511 | 0.17 | 0.76 | 3.08 | 13.79 | 0.18 | 0.79 | 3.11 | 13.95
70.0% | 4478 | 0.25 | 1.39 | 3.06 | 17.11 | 0.25 | 1.40 | 3.08 | 17.22
80.0% | 4720 | 0.21 | 1.67 | 2.67 | 21.23 | 0.26 | 2.09 | 2.67 | 21.28
90.0% | 3977 | 0.20 | 2.50 | 2.26 | 28.88 | 0.26 | 3.27 | 2.25 | 28.77
100.0% | 5758 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
Total | 65661 | 0.15 | 0.52 | 3.15 | 11.01 | 0.06 | 0.20 | 3.01 | 10.72

Appendix 2. Biases and standard errors of estimate for estimating total tree volume by DBH class for western hemlock.

DBH class (cm) | No. | 1994 Bias (m3) | 1994 % bias | 1994 SE (m3) | 1994 % SE | 1995 Bias (m3) | 1995 % bias | 1995 SE (m3) | 1995 % SE
0.1-15.0 | 393 | 0.001 | 1.93 | 0.009 | 13.60 | 0.001 | 0.89 | 0.009 | 13.13
15.1-25.0 | 997 | 0.008 | 3.08 | 0.030 | 12.09 | 0.008 | 3.19 | 0.030 | 12.10
25.1-35.0 | 1100 | 0.000 | 0.03 | 0.081 | 10.74 | 0.006 | 0.73 | 0.081 | 10.73
35.1-45.0 | 1065 | -0.013 | -0.88 | 0.147 | 9.98 | 0.000 | 0.03 | 0.146 | 9.96
45.1-55.0 | 773 | -0.022 | -0.89 | 0.254 | 10.17 | 0.002 | 0.06 | 0.254 | 10.15
55.1-65.0 | 532 | -0.021 | -0.53 | 0.436 | 10.98 | 0.012 | 0.30 | 0.438 | 11.03
65.1-75.0 | 356 | 0.008 | 0.15 | 0.626 | 11.11 | 0.041 | 0.73 | 0.630 | 11.18
75.1-85.0 | 188 | 0.077 | 0.99 | 0.949 | 12.24 | 0.093 | 1.21 | 0.953 | 12.30
85.1-95.0 | 127 | 0.034 | 0.33 | 1.060 | 10.27 | -0.001 | -0.01 | 1.068 | 10.36
95.1-105.0 | 94 | 0.102 | 0.77 | 1.574 | 11.97 | -0.019 | -0.14 | 1.573 | 11.96
105.1-125.0 | 62 | 0.032 | 0.18 | 2.169 | 12.58 | -0.291 | -1.69 | 2.209 | 12.81
125.1-165.0 | 29 | 0.160 | 0.55 | 4.056 | 14.02 | -1.366 | -4.72 | 4.258 | 14.72
Total | 5716 | 0.001 | 0.03 | 0.535 | 11.29 | -0.001 | -0.03 | 0.545 | 11.51

Appendix 3. Biases and standard errors of estimate for estimating total tree volume by height class for western hemlock.

Height class (m) | No. | 1994 Bias (m3) | 1994 % bias | 1994 SE (m3) | 1994 % SE | 1995 Bias (m3) | 1995 % bias | 1995 SE (m3) | 1995 % SE
0.01-16.00 | 847 | 0.007 | 5.27 | 0.025 | 18.16 | 0.006 | 4.38 | 0.024 | 17.41
16.01-20.00 | 516 | 0.009 | 2.38 | 0.052 | 14.25 | 0.011 | 2.94 | 0.052 | 14.38
20.01-24.00 | 651 | 0.019 | 2.75 | 0.098 | 14.32 | 0.025 | 3.68 | 0.101 | 14.68
24.01-28.00 | 814 | 0.000 | 0.01 | 0.141 | 12.39 | 0.012 | 1.07 | 0.141 | 12.42
28.01-32.00 | 803 | -0.005 | -0.26 | 0.202 | 11.54 | 0.013 | 0.75 | 0.204 | 11.67
32.01-36.00 | 728 | 0.032 | 1.14 | 0.358 | 12.90 | 0.056 | 2.02 | 0.361 | 12.99
36.01-40.00 | 533 | 0.048 | 1.19 | 0.526 | 13.00 | 0.074 | 1.83 | 0.524 | 12.95
40.01-44.00 | 385 | -0.037 | -0.62 | 0.710 | 12.02 | -0.020 | -0.34 | 0.707 | 11.97
44.01-48.00 | 212 | -0.041 | -0.52 | 0.999 | 12.44 | -0.051 | -0.64 | 0.984 | 12.27
48.01-52.00 | 135 | -0.277 | -2.36 | 1.704 | 14.52 | -0.408 | -3.47 | 1.735 | 14.78
52.01-56.00 | 56 | -0.100 | -0.60 | 1.778 | 10.61 | -0.488 | -2.92 | 1.963 | 11.72
56.01-64.00 | 37 | 0.053 | 0.22 | 3.132 | 13.27 | -0.859 | -3.64 | 3.210 | 13.60
Total | 5716 | 0.001 | 0.03 | 0.535 | 11.29 | -0.001 | -0.03 | 0.545 | 11.51

Appendix 4. Biases and standard errors of estimate for estimating merchantable height by DBH class for western hemlock.

DBH class (cm) | No. | 1994 Bias (m) | 1994 % bias | 1994 SE (m) | 1994 % SE | 1995 Bias (m) | 1995 % bias | 1995 SE (m) | 1995 % SE
0.1-15.0 | 393 | 0.02 | 0.20 | 0.79 | 8.05 | -0.08 | -0.78 | 0.77 | 7.93
15.1-25.0 | 997 | -0.19 | -0.19 | 0.97 | 6.73 | -0.08 | -0.58 | 0.97 | 6.79
25.1-35.0 | 1100 | -0.02 | -0.12 | 1.04 | 5.14 | -0.02 | -0.10 | 1.05 | 5.21
35.1-45.0 | 1065 | 0.03 | 0.12 | 1.01 | 4.08 | 0.07 | 0.30 | 1.03 | 4.14
45.1-55.0 | 773 | 0.03 | 0.12 | 0.97 | 3.36 | 0.10 | 0.36 | 0.98 | 3.40
55.1-65.0 | 532 | -0.17 | -0.52 | 1.05 | 3.22 | -0.09 | -0.28 | 1.04 | 3.19
65.1-75.0 | 356 | -0.16 | -0.45 | 1.32 | 3.73 | -0.09 | -0.25 | 1.32 | 3.71
75.1-85.0 | 188 | -0.30 | -0.78 | 1.25 | 3.26 | -0.24 | -0.62 | 1.24 | 3.23
85.1-95.0 | 127 | -0.37 | -0.90 | 1.23 | 2.95 | -0.32 | -0.77 | 1.22 | 2.94
95.1-105.0 | 94 | -0.33 | -0.75 | 1.34 | 3.04 | -0.29 | -0.66 | 1.33 | 3.01
105.1-125.0 | 62 | -0.38 | -0.80 | 1.23 | 2.61 | -0.37 | -0.79 | 1.22 | 2.60
125.1-165.0 | 29 | -0.36 | -0.71 | 1.33 | 2.58 | -0.47 | -0.92 | 1.32 | 2.56
Total | 5716 | -0.05 | -0.22 | 1.04 | 4.73 | -0.04 | -0.15 | 1.04 | 4.75

Appendix 5. Biases and standard errors of estimate for inside bark diameter estimation at different heights from ground for red cedar.

Ht from ground | No. | 1994 Bias (cm) | 1994 % bias | 1994 SE (cm) | 1994 % SE | 1995 Bias (cm) | 1995 % bias | 1995 SE (cm) | 1995 % SE
0.3 m | 3593 | 0.40 | 0.61 | 7.12 | 10.80 | -0.32 | -0.49 | 10.16 | 15.41
0.3-1.3 | 6200 | 1.05 | 1.76 | 4.91 | 8.36 | 0.66 | 1.12 | 4.07 | 6.94
1.3 m | 1105 | -0.03 | -0.10 | 1.54 | 5.04 | -0.46 | -1.50 | 1.26 | 4.12
10.0% | 5298 | -0.16 | -0.35 | 3.43 | 7.32 | 0.11 | 0.24 | 3.49 | 7.47
20.0% | 2524 | 0.02 | 0.06 | 3.48 | 10.06 | 0.16 | 0.45 | 3.50 | 10.12
30.0% | 2480 | 0.02 | 0.08 | 3.31 | 10.17 | 0.16 | 0.50 | 3.36 | 10.30
40.0% | 1635 | -0.07 | -0.21 | 3.17 | 10.09 | 0.03 | 0.08 | 3.24 | 11.32
50.0% | 2683 | 0.02 | 0.07 | 2.92 | 11.36 | 0.02 | 0.06 | 3.06 | 11.89
60.0% | 2618 | 0.10 | 0.44 | 2.95 | 13.09 | -0.07 | -0.32 | 3.09 | 13.71
70.0% | 2603 | 0.16 | 0.87 | 2.82 | 15.62 | -0.07 | -0.41 | 2.94 | 16.29
80.0% | 2726 | 0.12 | 0.93 | 2.36 | 18.75 | 0.03 | 0.20 | 2.41 | 19.15
90.0% | 2370 | 0.13 | 1.61 | 1.87 | 23.01 | 0.16 | 1.96 | 1.92 | 23.56
100.0% | 3630 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00
Total | 39465 | 0.21 | 0.63 | 3.77 | 10.47 | 0.10 | 0.30 | 4.25 | 10.86

Appendix 6. Biases and standard errors of estimate for estimating total tree volume by DBH class for red cedar.

DBH class (cm) | No. | 1994 Bias (m3) | 1994 % bias | 1994 SE (m3) | 1994 % SE | 1995 Bias (m3) | 1995 % bias | 1995 SE (m3) | 1995 % SE
0.1-10.0 | 65 | 0.001 | 6.77 | 0.002 | 15.65 | -0.001 | -8.49 | 0.002 | 11.43
10.1-15.0 | 149 | -0.001 | -1.03 | 0.010 | 12.57 | -0.002 | -3.06 | 0.008 | 10.39
15.1-20.0 | 231 | 0.001 | 0.91 | 0.016 | 10.57 | 0.000 | -0.03 | 0.014 | 9.41
20.1-25.0 | 253 | 0.005 | 1.69 | 0.030 | 10.69 | 0.004 | 1.56 | 0.028 | 10.02
25.1-30.0 | 258 | 0.007 | 1.47 | 0.048 | 9.95 | 0.009 | 1.81 | 0.043 | 8.94
30.1-35.0 | 331 | 0.002 | 0.22 | 0.067 | 9.49 | 0.008 | 1.09 | 0.062 | 8.84
35.1-40.0 | 321 | -0.008 | -0.82 | 0.098 | 9.93 | 0.004 | 0.36 | 0.094 | 9.46
40.1-45.0 | 287 | -0.019 | -1.45 | 0.121 | 9.25 | -0.002 | -0.18 | 0.115 | 8.74
45.1-50.0 | 224 | -0.005 | -0.29 | 0.161 | 9.27 | 0.018 | 1.06 | 0.157 | 9.01
50.1-60.0 | 396 | -0.015 | -0.62 | 0.231 | 9.75 | 0.014 | 0.61 | 0.225 | 9.49
60.1-70.0 | 290 | -0.019 | -0.55 | 0.332 | 9.55 | 0.020 | 0.58 | 0.325 | 9.33
70.1-80.0 | 214 | 0.055 | 1.14 | 0.497 | 10.34 | 0.091 | 1.89 | 0.493 | 10.26
80.1-90.0 | 176 | 0.050 | 0.79 | 0.652 | 10.30 | 0.075 | 1.19 | 0.652 | 10.30
90.1-200.0 | 399 | 0.236 | 2.03 | 1.549 | 13.31 | -0.022 | -0.19 | 1.537 | 13.21
Total | 3594 | 0.027 | 0.95 | 0.566 | 10.47 | 0.013 | 0.44 | 0.562 | 9.89

Appendix 7. Biases and standard errors of estimate for estimating total tree volume by height class for red cedar.

Height class (m) | No. | 1994 Bias (m3) | 1994 % bias | 1994 SE (m3) | 1994 % SE | 1995 Bias (m3) | 1995 % bias | 1995 SE (m3) | 1995 % SE
0.1-15.00 | 488 | 0.010 | 7.07 | 0.026 | 19.37 | 0.003 | 1.97 | 0.018 | 13.44
15.01-18.00 | 266 | 0.018 | 5.09 | 0.054 | 15.32 | 0.012 | 3.50 | 0.044 | 12.68
18.01-21.00 | 332 | 0.026 | 4.23 | 0.097 | 15.83 | 0.023 | 3.71 | 0.083 | 13.53
21.01-24.00 | 371 | 0.004 | 0.43 | 0.094 | 10.89 | 0.009 | 1.00 | 0.092 | 10.58
24.01-27.00 | 403 | 0.005 | 0.38 | 0.140 | 11.03 | 0.018 | 1.40 | 0.134 | 10.56
27.01-30.00 | 412 | 0.006 | 0.32 | 0.212 | 11.37 | 0.029 | 1.56 | 0.209 | 11.18
30.01-33.00 | 330 | 0.015 | 0.54 | 0.306 | 10.94 | 0.045 | 1.59 | 0.306 | 10.92
33.01-36.00 | 293 | -0.024 | -0.59 | 0.432 | 10.60 | 0.000 | -0.00 | 0.432 | 10.60
36.01-39.00 | 249 | -0.045 | -0.79 | 0.653 | 11.46 | -0.049 | -0.86 | 0.634 | 11.12
39.01-42.00 | 191 | 0.107 | 1.35 | 1.178 | 14.82 | 0.034 | 0.43 | 1.118 | 14.07
42.01-45.00 | 149 | 0.274 | 2.58 | 1.489 | 14.00 | 0.123 | 1.16 | 1.504 | 14.14
45.01-48.00 | 60 | -0.126 | -0.96 | 1.595 | 12.12 | -0.424 | -3.22 | 1.795 | 13.65
48.01-51.00 | 34 | 0.292 | 1.81 | 1.419 | 8.82 | -0.083 | -0.52 | 1.385 | 8.61
51.01-99.00 | 16 | 1.468 | 6.76 | 2.631 | 12.12 | 0.733 | 3.38 | 2.272 | 10.47
Total | 3594 | 0.027 | 0.95 | 0.566 | 10.47 | 0.013 | 0.44 | 0.562 | 9.89

Appendix 8. Biases and standard errors of estimate for estimating merchantable height by DBH class for red cedar.

DBH class (cm) | No. | 1994 Bias (m) | 1994 % bias | 1994 SE (m) | 1994 % SE | 1995 Bias (m) | 1995 % bias | 1995 SE (m) | 1995 % SE
0.1-10.0 | 65 | 0.03 | 0.53 | 0.27 | 5.48 | -0.09 | -1.78 | 0.25 | 5.20
10.1-15.0 | 149 | -0.07 | -0.71 | 0.51 | 5.32 | -0.11 | -1.12 | 0.49 | 5.20
15.1-20.0 | 231 | -0.02 | -0.14 | 0.69 | 6.05 | -0.02 | -0.14 | 0.71 | 6.22
20.1-25.0 | 253 | 0.04 | 0.29 | 0.76 | 5.72 | 0.06 | 0.45 | 0.78 | 5.90
25.1-30.0 | 258 | 0.11 | 0.73 | 0.86 | 5.53 | 0.13 | 0.82 | 0.88 | 5.64
30.1-35.0 | 331 | 0.15 | 0.87 | 0.88 | 4.93 | 0.17 | 0.96 | 0.89 | 5.00
35.1-40.0 | 321 | 0.14 | 0.70 | 0.93 | 4.66 | 0.16 | 0.78 | 0.93 | 4.67
40.1-45.0 | 287 | 0.01 | 0.07 | 0.93 | 4.23 | 0.04 | 0.20 | 0.93 | 4.23
45.1-50.0 | 224 | 0.01 | 0.04 | 0.89 | 3.69 | 0.05 | 0.20 | 0.89 | 3.71
50.1-60.0 | 396 | 0.01 | 0.04 | 0.91 | 3.50 | 0.07 | 0.26 | 0.92 | 3.53
60.1-70.0 | 290 | -0.10 | -0.33 | 0.88 | 2.99 | -0.05 | -0.16 | 0.89 | 2.99
70.1-80.0 | 214 | -0.09 | -0.28 | 0.86 | 2.70 | -0.03 | -0.10 | 0.87 | 2.72
80.1-90.0 | 176 | -0.13 | -0.37 | 0.94 | 2.73 | -0.09 | -0.25 | 0.92 | 2.69
90.1-200.0 | 399 | -0.23 | -0.60 | 0.86 | 2.21 | -0.26 | -0.67 | 0.87 | 2.24
Total | 3594 | -0.01 | -0.04 | 0.85 | 4.11 | 0.01 | 0.05 | 0.86 | 4.15
Item Metadata
Title | A study on the effects of multicollinearity, autocorrelation and four sampling designs on the predictive ability of the 1994 and 1995 variable-exponent taper functions |
Creator |
Bartel, Joseph |
Date Issued | 1999 |
Extent | 2340485 bytes |
Genre |
Thesis/Dissertation |
Type |
Text |
FileFormat | application/pdf |
Language | eng |
Date Available | 2009-06-11 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
IsShownAt | 10.14288/1.0088932 |
URI | http://hdl.handle.net/2429/9003 |
Degree |
Master of Science - MSc |
Program |
Forestry |
Affiliation |
Forestry, Faculty of |
Degree Grantor | University of British Columbia |
GraduationDate | 1999-05 |
Campus |
UBCV |
Scholarly Level | Graduate |
AggregatedSourceRepository | DSpace |