MODEL-DEPENDENT SAMPLING FOR TIMBER VALUE IN OLD-GROWTH FORESTS OF COASTAL BRITISH COLUMBIA

By
James S. Thrower
M.Sc.F. Lakehead University, 1986
B.Sc.F. Lakehead University, 1984

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY
in
THE FACULTY OF GRADUATE STUDIES
FORESTRY

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
December 1989
© James S. Thrower, 1989

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of [illegible]
The University of British Columbia
Vancouver, Canada
Date [illegible]

Abstract

The procedure used to sample crown timber before harvesting in B.C. is designed to estimate net volume per ha using systematically located angle-count plots where trees are selected with probability proportional to basal area. The primary purpose of the sample is to provide information for timber valuation and stumpage appraisal. Timber value is the most important population parameter for stumpage calculation, but it is not explicitly considered in the sampling design. The objective of this study was to modify the current sampling method to increase the efficiency for estimating value using model-dependent sampling theory.

Eighteen model-dependent sampling strategies were developed from six subsampling methods using three estimators. The six subsampling methods were used to select trees from angle-count plots to estimate the relationship between cruiser-called and estimated tree value. Three subsampling methods used probability-based selection of trees and three methods used purposive-based selection of trees. Ratio, average ratio, and regression estimators were used with each method.

The 18 strategies were tested using Monte Carlo simulation with 2000 samples at each of nine sample sizes in three test populations. The test populations were created by grouping angle-count plot data into mutually exclusive sets reflecting different stand characteristics. The sample sizes were n = 20, 40, and 60 plots with m = n, 3n, and 5n subsampled trees. Individual tree value was estimated with regression equations that used variables closely related to the value of each species. The sampling strategies were evaluated for bias, sample variance, achieved subsample size, sampling cost, confidence interval coverage, and relative advantage against the current sampling method.

The model-dependent subsampling methods using purposive selection of trees were more efficient than the current sampling method considering cost and variance. The purposive-based methods were biased up to about 5%; the probability-based methods were slightly less biased. The two most efficient methods were: i) purposive selection of trees with the highest estimated values in a plot; and ii) purposive selection of trees with estimated values within a given range to give a second-stage sample balanced on the auxiliary variable. The greatest efficiency was always achieved with one sample tree per plot. The current sampling method was unbiased for estimating value but required approximately twice as many plots to estimate value to the same level of precision as net volume.
Table of Contents

Abstract
List of Tables
List of Figures
Acknowledgements
1 INTRODUCTION
  1.1 The Problem
  1.2 Objectives
  1.3 Approach to the Problem
2 BACKGROUND
  2.1 Design-Dependent Sampling
  2.2 Model-Dependent Sampling
    2.2.1 Linear Least-Squares
    2.2.2 Robustness
  2.3 Model-Dependent Sampling in Forestry
  2.4 Optimal Design and Cost Functions
  2.5 Point-Sampling
    2.5.1 Two-Stage Point-Sampling
    2.5.2 Two-Phase Point-Sampling
    2.5.3 Point-3P Sampling
  2.6 Forest Service Operational Cruise Procedure
  2.7 Forest Service Log Grades
  2.8 Vancouver Log Market
  2.9 Estimating Tree Value
3 METHODS
  3.1 Overview
  3.2 Test Populations
  3.3 Tree Value
  3.4 Sampling Strategies
    3.4.1 Forest Service Sampling Strategy
    3.4.2 Model-Dependent Sampling Strategies
      3.4.2.1 Probability-Based Subsampling Methods
      3.4.2.2 Purposive-Based Subsampling Methods
  3.5 Strategy Evaluation
    3.5.1 Data Assumptions
    3.5.2 Sample Size
    3.5.3 Cost Functions
    3.5.4 Simulation Program Design
    3.5.5 Evaluation Criteria
      3.5.5.1 Bias
      3.5.5.2 Variance
      3.5.5.3 Achieved Subsample Size
      3.5.5.4 Cost
      3.5.5.5 Relative Advantage
      3.5.5.6 Confidence Interval Coverage
4 RESULTS
  4.1 Tree Value
  4.2 Forest Service Operational Cruise Procedure
  4.3 Model-Dependent Sampling Strategies
    4.3.1 Bias
    4.3.2 Sample Variance
    4.3.3 Confidence Interval Coverage
    4.3.4 Achieved Subsample Size
    4.3.5 Cost
    4.3.6 Relative Advantage
5 DISCUSSION
  5.1 Tree Value
    5.1.1 Predictor Variables
    5.1.2 Regression Approach to Estimating Value
    5.1.3 Tree Value and Basal Area
  5.2 Forest Service Operational Cruise Procedure
    5.2.1 Precision of Sampling for Value
      5.2.1.1 Single-Stage Point Sampling
      5.2.1.2 Two-Stage Point Sampling
  5.3 Model-Dependent Subsampling Methods
    5.3.1 Bias
    5.3.2 Sample Variance
    5.3.3 Confidence Interval Coverage
    5.3.4 Achieved Subsample Size
    5.3.5 Relative Advantage
  5.4 The Simulation Process
6 CONCLUSIONS
  6.1 Model-Dependent Subsampling Methods
  6.2 Forest Service Operational Cruise Procedure
  6.3 Tree Value
  6.4 Contributions
  6.5 Recommendations for Future Research
7 LITERATURE CITED
A GLOSSARY

List of Tables

3.1 Number of stands, plots, and trees in the test populations
3.2 Number of point-selected trees by species and test population
3.3 Statistics for the 1667 sample plots in TP1
3.4 Statistics for the 1236 sample plots in TP2
3.5 Statistics for the 1410 sample plots in TP3
3.6 Three month average log prices in the Vancouver log market for major coastal loggers (15 October 1987)
3.7 Equations for estimating cruiser-called tree value ($)
3.8 Estimators for the 19 sampling strategies
3.9 Desired total number of subsampled trees
3.10 KZ values ($) for 3P subsampling
3.11 Target and deviation values ($) for target subsampling
3.12 Threshold values ($) for threshold subsampling
3.13 Time (minutes) required for a two-man crew to collect individual tree data by species for various activities of the sampling strategies
4.14 Regression statistics and variables for estimating cruiser-called tree value
4.15 Mean and coefficient of variation for the ratio of cruiser-called dollar value to basal area ($BAR) and the ratio of net volume to basal area (VBAR) for individual trees by species and test population
4.16 Statistics for the 2000 estimates of value/ha from the Forest Service sampling strategy with TP1
4.17 Statistics for the 2000 estimates of value/ha from the Forest Service sampling strategy with TP2
4.18 Statistics for the 2000 estimates of value/ha from the Forest Service sampling strategy with TP3
4.19 Percent bias by sample size and strategy for TP1
4.20 Percent bias by sample size and strategy for TP2
4.21 Percent bias by sample size and strategy for TP3
4.22 Variance ratios by sample size and strategy for TP1
4.23 Variance ratios by sample size and strategy for TP2
4.24 Variance ratios by sample size and strategy for TP3
4.25 Confidence interval coverage using 20 plots with TP1
4.26 Confidence interval coverage using 40 plots with TP1
4.27 Confidence interval coverage using 60 plots with TP1
4.28 Confidence interval coverage using 20 plots with TP2
4.29 Confidence interval coverage using 40 plots with TP2
4.30 Confidence interval coverage using 60 plots with TP2
4.31 Confidence interval coverage using 20 plots with TP3
4.32 Confidence interval coverage using 40 plots with TP3
4.33 Confidence interval coverage using 60 plots with TP3
4.34 Mean and standard deviation of achieved subsample sizes by sample size and strategy for TP1
4.35 Mean and standard deviation of achieved subsample sizes by sample size and strategy for TP2
4.36 Mean and standard deviation of achieved subsample sizes by sample size and strategy for TP3
4.37 Cost ratios by sample size and strategy for TP1
4.38 Cost ratios by sample size and strategy for TP2
4.39 Cost ratios by sample size and strategy for TP3
4.40 Relative advantage by sample size and strategy for TP1
4.41 Relative advantage by sample size and strategy for TP2
4.42 Relative advantage by sample size and strategy for TP3
5.43 Approximate number of plots needed to estimate value/ha and net volume/ha to within ±5, 10, and 15% at the 95% confidence level with the Forest Service sampling strategy
5.44 Theoretical KZ values for 3P subsampling
5.45 Mean and standard deviation of achieved subsample size by sample size for the 3P method using theoretical KZ values

List of Figures

3.1 Distribution of value/ha for the 1667 plots in TP1
3.2 Distribution of value/ha for the 1236 plots in TP2
3.3 Distribution of value/ha for the 1410 plots in TP3
4.4 Cruiser-called versus estimated value for the 11623 trees in TP1
4.5 Cruiser-called versus estimated value for the 12191 trees in TP2
4.6 Cruiser-called versus estimated value for the 9250 trees in TP3
4.7 Distribution of the 2000 estimates of value/ha using the Forest Service sampling strategy with 20 plots for TP1
4.8 Distribution of the 2000 estimates of value/ha using the Forest Service sampling strategy with 40 plots for TP1
4.9 Distribution of the 2000 estimates of value/ha using the Forest Service sampling strategy with 60 plots for TP1
4.10 Cumulative distribution of sampling errors for the Forest Service sampling strategy for TP1
4.11 Distribution of the 2000 estimates of value/ha using the Forest Service sampling strategy with 20 plots for TP2
4.12 Distribution of the 2000 estimates of value/ha using the Forest Service sampling strategy with 40 plots for TP2
4.13 Distribution of the 2000 estimates of value/ha using the Forest Service sampling strategy with 60 plots for TP2
4.14 Cumulative distribution of sampling errors for the Forest Service sampling strategy for TP2
4.15 Distribution of the 2000 estimates of value/ha using the Forest Service sampling strategy with 20 plots for TP3
4.16 Distribution of the 2000 estimates of value/ha using the Forest Service sampling strategy with 40 plots for TP3
4.17 Distribution of the 2000 estimates of value/ha using the Forest Service sampling strategy with 60 plots for TP3
4.18 Cumulative distribution of sampling errors for the Forest Service sampling strategy for TP3
5.19 Relationship between percent sampling error of estimating value/ha and net volume/ha with the Forest Service sampling strategy for TP1, TP2, and TP3
5.20 Number of BA plots and $BAR trees to estimate value/ha with 10, 15, and 20% sampling error at 95% confidence in TP1
5.21 Number of BA plots and $BAR trees to estimate value/ha with 10, 15, and 20% sampling error at 95% confidence in TP2
5.22 Number of BA plots and $BAR trees to estimate value/ha with 10, 15, and 20% sampling error at 95% confidence in TP3

Acknowledgements

I am grateful to my major advisor, Dr. Peter Marshall, who accepted me as a PhD candidate upon the early retirement of my former major advisor, Dr. Julien Demaerschalk. I thank Peter for his constant assistance, support, and encouragement. I also thank Dr. Kim Iles of MacMillan Bloedel, who arranged for me to obtain data from MacMillan Bloedel, and who provided assistance and advice throughout all phases of my work and was always keen to discuss any ideas about sampling. I thank Dr. Antal (Tony) Kozak for his help and especially for securing financial support to run my simulations on the UBC mainframe computer. I thank the other members of my committee, Dr. Andrew Howard and Dr. Harry Joe, for their advice, assistance, and willingness to help at all times. I thank Dr. Ken Mitchell of the B.C.
Ministry of Forests, Research Branch, for always providing encouragement and support to complete this study.

Many other people provided assistance in this study. The inventory staff of MacMillan Bloedel (Pat MacDonnel, Bert Vink, John Ahokas, Geoff Childs, Ken Epps, and Mark Weeks) provided advice and assistance in developing cost functions for point-sampling. Barb Helmer of MacMillan Bloedel was very helpful in providing the data for the simulations. Mark Leja of Fletcher Challenge provided data from a study of visually estimating tree diameters conducted by the Coastal Cruising Supervisors Task Force. Dermot McCarthy of UBC was very helpful in designing the data structures for the sampling simulation program.

Chapter 1

INTRODUCTION

1.1 The Problem

Timber cut from crown land in British Columbia (B.C.) must be cruised before it is harvested. From the Government perspective, the primary purpose of the cruise is to provide information for timber valuation and stumpage appraisal. Stumpage is calculated as a function of the potential revenue from the harvested timber and the estimated cost of extraction. Both variables are estimated from the cruise; however, timber value is much more difficult to estimate than logging cost. Timber value is a function of species, volume, grade, and market prices. Timber attributes that affect logging costs are primarily related to volume and decay.

Crown timber must be cruised according to methods accepted by the B.C. Forest Service (FS). The objective stated in the FS cruising manual (Ministry of Forests 1980) is to estimate the net volume of a cutting permit area to a specified degree of precision. Virtually all timber in B.C. is cruised using systematically located angle-count sample plots where trees are selected with probability proportional to basal area. All trees in the sample plots are measured to estimate volume and quality (grade on the coast and lumber recovery factor in the interior).
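To make the angle-count expansion concrete, the following Python sketch shows how per-hectare estimates are commonly computed from such plots. Because trees are tallied with probability proportional to basal area, each tallied tree represents BAF / (tree basal area) trees per hectare, so a plot total is the basal area factor times the sum of each tree's attribute divided by its basal area. The basal area factor, the tree records, and the function names are hypothetical, and the standard error treats the systematically located plots as if they were a simple random sample; that is an assumption of this sketch, not a description of the FS compilation procedure.

```python
import math
from statistics import mean, stdev


def basal_area_m2(dbh_cm):
    """Cross-sectional area (m^2) of a stem with the given diameter (cm)."""
    return math.pi * (dbh_cm / 200.0) ** 2


def plot_per_ha(trees, baf):
    """Per-hectare total of an attribute for one angle-count plot.

    trees: list of (dbh_cm, attribute) pairs for the tallied trees, where
           attribute is, e.g., net volume (m^3) or value ($) of the tree.
    baf:   basal area factor (m^2/ha per tallied tree); hypothetical here.
    """
    return baf * sum(attr / basal_area_m2(dbh) for dbh, attr in trees)


def cruise_estimate(plots, baf):
    """Mean per-hectare estimate and its standard error across plots.

    Assumes at least two plots and treats them as a simple random sample.
    """
    per_ha = [plot_per_ha(trees, baf) for trees in plots]
    return mean(per_ha), stdev(per_ha) / math.sqrt(len(per_ha))


# Hypothetical cruise: three plots, trees given as (dbh in cm, net volume in m^3).
plots = [
    [(85.0, 6.2), (120.0, 14.8), (64.0, 3.1)],
    [(140.0, 21.5), (72.0, 4.0)],
    [(95.0, 8.3), (58.0, 2.4), (110.0, 12.0), (66.0, 3.5)],
]
estimate, std_error = cruise_estimate(plots, baf=8.0)
print(f"net volume: {estimate:.0f} m^3/ha, standard error: {std_error:.0f} m^3/ha")
```

Replacing the volume attribute with a dollar value per tree gives value/ha from the same tally, so the same plots can yield quite different precision for the two attributes.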
The precision of the cruise is based on the variation in estimated net volume among sample plots; the precision of the cruise for estimated timber value is not calculated. For coastal timber, value is estimated from the estimated merchantable volume by log grade from the cruise and current log market prices. The cruise is designed to estimate timber volume and is efficient for this purpose; however, the cruise may not be efficient for estimating value.

Many other sampling methods could be used to estimate timber value. The most efficient methods would use knowledge of tree and stand structure as auxiliary information. Model-dependent sampling techniques, where highly efficient strategies can be formulated under a superpopulation model, have the most potential to use auxiliary information. However, the cost of the potential increase in efficiency is a loss of robustness. Biased and misleading estimates can result from model-dependent sampling methods if assumptions about the relationships are not true.

Thus there were two problems with the FS cruising method: i) although the primary objective of the cruise is to estimate timber value, it is calculated only as a secondary statistic and the variation of the estimate is not controlled; and ii) the FS cruise procedure may not be the most efficient method to estimate timber value. The purpose of this study was to address the second problem.

1.2 Objectives

The overall objective of this study was to determine if modifying the FS cruise procedure using model-dependent sampling theory would increase efficiency for estimating timber value. The specific objectives were:

1. To develop a method to obtain quick (inexpensive) and precise estimates of individual tree value. This method was to be used to generate the covariate in a functional relationship with actual tree value in the sampling designs.
The basic concept in all the model-dependent sampling designs was to spend time (money) on the trees that provided the most information for estimating value. However, preliminary estimates of individual tree value were needed to select the appropriate trees. The trade-off was between the precision of the initial estimate and the time needed to obtain the estimate. A regression approach was taken to estimate tree value. This provided a relatively precise, quantitative method that could be simulated in the computer.

2. To modify the FS cruise procedure using model-dependent sampling theory to increase the efficiency for estimating timber value. There are many methods to increase the efficiency of sampling in theory. However, the link between theoretical and practical sampling is often problematic. Operational cruising is restricted by many physical limitations and theoretical improvements may not be appropriate under field conditions. For example, preliminary estimates of the population are usually not available. Cruising is often done in remote locations where travel costs are very high and only one visit to the population is feasible. Hence efficiencies provided by sampling methods that use prior estimates cannot be achieved. Defining a first stage sample of trees in a forest is also problematic. Ideally, the first stage units should be related to timber value, but estimating value is very expensive. Angle-count sampling selects trees with probability proportional to basal area, and basal area is roughly related to value for a given species. Hence angle-count sample plots were used as first-stage units.

3. To evaluate the relative advantage of the modified cruise methods against the FS method. Relative advantage was computed as a function of precision and cost. Two-stage sampling methods are usually less precise than single-stage methods using the same number of primary units.
However, two-stage methods are usually less expensive because fewer trees are measured. Hence cost must be included to determine which methods are more efficient.

The major contributions from meeting these objectives are in forest sampling; however, the results of this study also contribute to the general practical application of model-dependent sampling. The major contributions were:

1. Evaluation of the FS cruise procedure for estimating timber value.
2. Demonstration that not all the pathological and quality related indicators collected by the FS are needed to estimate value.
3. Identification of tree characteristics that were related to value of different species.
4. Increased understanding of the distribution of tree values in old-growth forests.
5. Development and testing of new model-dependent methods for subsampling within angle-count sample plots.
6. Identification of sampling techniques that may increase the efficiency of the FS cruising method for estimating value.

1.3 Approach to the Problem

The FS cruise procedure was modified using model-dependent sampling theory. Six subsampling methods were tested with three estimators at nine sample sizes using Monte Carlo simulation with actual cruise data in three test populations. This was a cost effective method to evaluate the sampling strategies with real data. The study methodology was separated into four major steps.

1. Creation of test populations. Three mutually exclusive test populations were created by grouping many cruise (angle-count) plots. This provided a range of realistic forest conditions under which to simulate the sampling methods. The advantages of using real cruise data were: i) the data represented actual forest conditions; ii) the data were readily available; and iii) the data were collected with accuracy and high precision by professional cruisers.

2. Estimation of individual tree values.
Multiple linear regression equations were developed for each species to estimate the value of individual trees. Predictor variables were tree characteristics that are routinely collected in the FS cruise procedure. Estimated tree value was then used as the covariate in the sampling designs for selecting sample trees for detailed measurement of value.

3. Formulation of sampling strategies. The FS cruise procedure was modified using model-dependent sampling theory. Six subsampling methods were used to select sample trees from the point-selected trees. Three methods used some form of randomization to select the sample trees and three methods used purposive selection. Three estimators were used with each of the six subsampling methods. All sampling strategies used double sampling to define the relationship between actual and estimated tree value.

4. Evaluation of the sampling strategies. The modified sampling techniques were evaluated on their relative advantage (considering cost and variance) over the FS sampling strategy. The sampling variance of the strategies was approximated using Monte Carlo simulation. The objective was to repeatedly sample each test population with new randomly selected plots to approximate the sampling distribution of the estimated value. Each strategy was evaluated on 2000 samples at each of nine sample sizes. Sampling cost was approximated with cost functions developed from consultation with field experts.

Chapter 2

BACKGROUND

2.1 Design-Dependent Sampling

Finite population sampling theory addresses how to select and relate a sample to the population. The process requires principles and procedures for statistical inference. The stochastic structure introduced in the sampling design is the link between the sample and the population. Cassel et al. (1977) discussed two sources of randomization that contribute to the inference. First, stochastic structure is introduced by the methods used to select the units.
This is the basis for conventional finite population sampling theory (probability or design-dependent sampling). Second, stochastic structure is introduced by knowledge of some process that generates the true measurement for a given unit. This is the model-dependent approach, superpopulation approach, or the prediction approach to finite population sampling. The two philosophies differ primarily in the element of randomness used to give stochastic structure to the inference (Royall and Herson 1973a, Sarndal 1978).

The sampling design introduces randomization in design-dependent sampling. This provides for inference with no assumptions about the distribution of the population. The sampling design is central to inference in classical survey sampling theory. Probability sampling is robust in that there are no assumptions to violate. The Central Limit Theorem provides accurate confidence intervals when sample sizes are large, regardless of the distribution of the population. However, the robustness of probability sampling results in reduced efficiency. Probability sampling theory is considered to give mathematically rigorous, objective, probabilistic inferences without the need for unverified assumptions about probability distributions (Royall 1976a).

Hansen et al. (1983) defined a probability-sampling design to consist of: i) a sampling plan where each unit has a known, non-zero probability of selection; and ii) procedures such that inference does not depend on an assumed model. They interpret ii) as requiring randomization-consistent estimators, where an estimator is consistent if it converges in probability to the parameter it is estimating as \(n \to \infty\).

Survey sampling inference has been compared with traditional statistical inference where the problem is often to estimate an unknown population parameter.
These are not opposing theories; however, basic concepts such as parameter, sample, data, and estimator have special meaning in survey sampling (Cassel et al. 1977 p.3). The ability to identify units is essential in survey sampling but is usually absent from traditional statistical inference. Identifiability means the unique label of each unit is known. The statistician creates the sampling distribution of an estimator; thus survey sampling is not restricted to independent and identically distributed observations.

The basic model for conventional finite population sampling theory (Godambe 1955) is discussed by Cassel et al. (1977), Godambe (1969), Rao (1975), and Sarndal (1978). The properties of interest for an estimator under a given sampling design include unbiasedness, variance, and mean square error (MSE). Since the design affects the variance, both the design and the estimator must be considered together. A strategy is the pair of a design and an estimator, and is unbiased if the estimator is unbiased with the given design. The variance and MSE of a strategy are the variance and MSE of the estimator under the design.

Survey sampling distinguishes two types of inference problems: i) the search for an optimal estimator under a given design; and ii) the search for an optimal strategy. For i), the design is predetermined by practical or other considerations. For ii), there is no prior commitment to a given design. In general, an efficient strategy is closely related to the distribution of the population values. For example, a stratified design is efficient if the stratification is related to the population values. A very efficient strategy results if both the estimator and the design are related to the population values. Smith (1976a) noted that a good estimator can compensate for a poor design, but a good design cannot overcome a poor estimator.

Inference criteria are widely researched in the foundations of survey sampling.
The criteria are mostly from traditional statistical inference, but some are with specific reference to survey sampling. These efforts give substantial insight into the structure of survey sampling inference. The results show that survey sampling inference is not a straightforward application of traditional statistical inference. Cassel et al. (1977) reviewed several criteria of inference such as likelihood, sufficiency, admissibility, uniformly minimum variance, minimax, and invariance principles.

2.2 Model-Dependent Sampling

Model-dependent sampling is based on assumptions about the population. Hansen et al. (1983) described a model-dependent design to consist of a sampling plan and estimators where either is chosen because it has desirable properties under an assumed model. They defined a model-dependent sampling design to consist of: i) a sampling plan that may or may not require randomization in sample selection; ii) estimators that are model-unbiased or model-consistent; and iii) procedures for inference such that the correctness of the inference depends on the assumed model. Highly efficient designs result if the model accurately represents the true relationship in nature. However, biased and misleading estimates may occur if the assumed model is not appropriate. In other words, there is a trade-off between robustness and efficiency.

Model-dependent sampling depends on accurate prior knowledge of relationships in the population. This does not imply a Bayesian approach; however, Bayes methods can be used in model-dependent sampling (Cassel et al. 1977, Ericson 1969, Scott and Smith 1969).

There are two sources of randomness in model-dependent sampling: i) randomness associated with the random variables \(Y_1, \ldots, Y_N\); and ii) randomness associated with a probability sampling design that generates the observations. Bias is considered important when a probability sampling design is used under a superpopulation model.
However, bias is disregarded and is unnecessary when the superpopulation model is appropriate (Cassel et al. 1977).

A stochastic structure is specified for each population unit in model-dependent sampling. The vector of population values \(y = (y_1, \ldots, y_N)\) is the realized outcome of a vector random variable \(Y = (Y_1, \ldots, Y_N)\). In other words, each finite population is a random sample from an infinite superpopulation. The stochastic structure for inference is provided by the joint distribution of \(Y_1, \ldots, Y_N\), called the superpopulation distribution (\(\xi\)). If the superpopulation model is specified by a conventional parametric form, e.g., a normal or gamma distribution, \(\xi\) is used to relate the values in the sample with those in the population. The superpopulation model provides the link between the observed and unobserved values. The values of \(y\) not in the sample are predicted from \(\xi\), which is inferred from the values of \(y\) in the sample. Thus finite population sampling is in the area of predictive statistical inference.

A superpopulation model can play a role in conventional finite population sampling theory, but is not used for inference. Rao (1975) stated that Cochran (1946) originally introduced superpopulation models for efficiency comparisons between conventional sampling methods.

Cassel et al. (1977 p.81) gave four main interpretations of the superpopulation concept:

1. The finite population is drawn from a larger universe (the pure form of the superpopulation idea).
2. A random mechanism or process in the real world is modeled from the distribution \(\xi\) (an analytical study).
3. The distribution \(\xi\) is a prior distribution reflecting subjective belief (the Bayesian approach).
4. The distribution \(\xi\) is used as a mathematical device to make explicit theoretical derivations (e.g., Cochran 1977 p.158, Sukhatme et al. 1984 p.194, 241).

The terms model-based and model-dependent may not imply the same situation. Hansen et al.
(1983) discussed the terms and noted that a probability sampling design may be model-based, e.g., when using a ratio estimator, but probability designs do not depend on a model for inference. Model-dependent sampling is when the inference depends on an assumed model, not where the design or estimator is based on a model. Some authors do not make the distinction between model-based and model-dependent sampling (Cassel et al. 1977, Sarndal 1978).

Smith (1976a) suggested that conventional theories of inference would apply to finite population sampling if survey statisticians provided stochastic models like statisticians in experimental sciences. Kalton (1976) gave three reasons why surveys are treated differently from experiments: i) surveys use larger samples; ii) surveys are usually multi-purpose and gather data on a variety of variables; and iii) surveys are descriptive where experiments are usually concerned with contrasts among treatments. Randomization protects experimenters against subjective bias; however, the linear model is the basis for the analysis and the randomization is not specifically considered (Smith 1976a). Kish (1965 p.595) stated: "the separation of sample surveys from experimental designs is an accident in the recent history of science, and it will diminish." Smith (1983) noted that stochastic models represent uncertainty in most applied sciences and that randomization does not play a role in the inference. The problems of statistical inference are testing alternate models and inferring about parameters.

Cochran (1976) noted that model-dependent methods have made only a small impact on survey sampling. Sample survey theorists usually write for other theorists and are seldom engaged in survey practice. The vast experience in applied work is the reason for success of design-dependent sampling (Little 1983).
Most accounts of model-dependent sampling are theoretical exercises that have little to do with complex real-world sampling problems. Little stated that modeling is the future of survey research, but cautioned that the accumulated wisdom of current practice must not be destroyed.

2.2.1 Linear Least-Squares

Each school of statistical thought has a technique for analysis under a superpopulation model (Royall 1976a). However, the method of selecting sample units has little or no significance for any of the methods. Fiducial and classical prediction techniques are used for finite population problems (Kalbfleisch and Sprott 1969) as well as Bayesian techniques (Cassel et al. 1977, Ericson 1969, Scott and Smith 1969). However, linear least-squares methods are most common (e.g., Cassel et al. 1977, Hansen et al. 1983, Royall 1970, 1971, 1976a, 1976b, Royall and Cumberland 1981a, 1981b, Royall and Herson 1973a, Sarndal 1978, Smith 1981).

Linear regression techniques are widely used in finite population sampling theory. The precision of the estimate is increased by using the relationship between the variable of interest and one or more auxiliary variables. Typically there is a known auxiliary variable $x$ and an unknown variable of interest $y$ for each of the $N$ units. Cassel et al. (1977 p.126) discussed model-dependent sampling under a regression model for more than one auxiliary variable.

In model-dependent sampling, the sample values $y_1, \ldots, y_N$ are realized values of the independent random variables $Y_1, \ldots, Y_N$. Royall and Herson (1973a) expressed the expected value $[h(x_k)]$ and variance $[\sigma^2 v(x_k)]$ of $Y_k$ as

$$Y_k = h(x_k) + \epsilon_k \left[ v(x_k) \right]^{1/2}, \qquad k = 1, \ldots, N \tag{2.1}$$

where $\epsilon_1, \ldots, \epsilon_N$ are independent random variables with mean zero and variance $\sigma^2$.
The function $h$ is generalized as

$$h(x) = \delta_0 \beta_0 + \delta_1 \beta_1 x + \delta_2 \beta_2 x^2 + \cdots + \delta_J \beta_J x^J \tag{2.2}$$

where the indicator variable $\delta_j = 1$ if $\beta_j x^j$ is in the regression function, and $\delta_j = 0$ if $\beta_j x^j$ is not in the function. The probability model for the regression function $h$ is

$$\xi[\delta_0, \delta_1, \ldots, \delta_J : v(x)] \tag{2.3}$$

where $\delta_j$ are indicator variables associated with each regression term and $v(x)$ is a variance function that depends on $x$. For example, $\xi[0, 1 : x]$ refers to the model

$$Y_k = \beta_1 x_k + \epsilon_k x_k^{1/2} \tag{2.4}$$

where both the expected value and variance of $Y_k$ are proportional to $x_k$.

Under the superpopulation model $\xi[0, 1 : v(x)]$, Royall (1971) gave the estimator $\hat{T}$ of the population total as

$$\hat{T} = \sum_{s} y_k + \hat{\beta} \sum_{\bar{s}} x_k \tag{2.5}$$

where $\sum_s$ denotes summation over the population units in the sample and $\sum_{\bar{s}}$ denotes summation over the population units not in the sample. The best linear unbiased (BLU) estimator of $\beta$ is the weighted least squares estimator

$$\hat{\beta} = \frac{\sum_s x_k y_k / v(x_k)}{\sum_s x_k^2 / v(x_k)}. \tag{2.6}$$

Hence a best linear model-unbiased estimator of $T$ for any sampling plan is

$$\hat{T}[0, 1 : v(x)] = \sum_s y_k + \frac{\sum_s x_k y_k / v(x_k)}{\sum_s x_k^2 / v(x_k)} \sum_{\bar{s}} x_k. \tag{2.7}$$

Specifically, the BLU estimators for $T$ when $v(x) = 1$, $x$, and $x^2$, respectively, are

$$\hat{T}[0, 1 : 1] = \sum_s y_k + \left( \frac{\sum_s x_k y_k}{\sum_s x_k^2} \right) \sum_{\bar{s}} x_k \tag{2.8}$$

$$\hat{T}[0, 1 : x] = \sum_s y_k + \left( \frac{\sum_s y_k}{\sum_s x_k} \right) \sum_{\bar{s}} x_k = \left( \frac{\sum_s y_k}{\sum_s x_k} \right) \sum_{k=1}^{N} x_k \tag{2.9}$$

$$\hat{T}[0, 1 : x^2] = \sum_s y_k + \left( n^{-1} \sum_s \frac{y_k}{x_k} \right) \sum_{\bar{s}} x_k. \tag{2.10}$$

These are the prediction forms of the usual regression estimator (without intercept), ratio estimator, and average ratio estimator, and are best for any sampling plan given the stated variances. The usual forms are $\hat{\beta} \sum_{k=1}^{N} x_k$ with the corresponding $\hat{\beta}$; the two forms coincide exactly only for the ratio estimator, where $\sum_s (y_k - \hat{\beta} x_k) = 0$. Furthermore, the ratio estimator $\hat{T}[0, 1 : x]$ is $\xi$-unbiased, i.e., $\xi(\hat{T} - T) = 0$ for all $s$ and any variance function $v$.

The uncertainty of $\hat{T}$ is the uncertainty in the unknown quantity $\sum_{\bar{s}} y_k$. Hence the error in $\hat{T}$ is

$$T - \hat{T} = \sum_{\bar{s}} (y_k - \hat{\beta} x_k) \tag{2.11}$$

which is the uncertainty in $\hat{\beta} \sum_{\bar{s}} x_k$ as an estimate of $\sum_{\bar{s}} y_k$. Royall (1971) gave the MSE with respect to $\xi$ for any sampling plan as

$$\mathrm{MSE} = \xi \left( \hat{T}[0, 1 : v(x)] - T \right)^2 = \sigma^2 \left[ \sum_{\bar{s}} v(x_k) + \frac{\left( \sum_{\bar{s}} x_k \right)^2}{\sum_s x_k^2 / v(x_k)} \right]. \tag{2.12}$$

The $\xi$-unbiased estimate of $\sigma^2$ from least squares theory is

$$\hat{\sigma}^2 = \frac{\sum_s (y_k - \hat{\beta} x_k)^2 / v(x_k)}{n - 1}. \tag{2.13}$$

The MSE is minimized when $\sum_s x_k$ is maximized for any variance function $v$. Hence the optimal sample is the $n$ largest values of $x_k$ (Bellhouse 1984, Cassel et al. 1977, Royall 1970, 1971, Royall and Cumberland 1981a, Royall and Herson 1973a, Sarndal 1978). Model-dependent sampling often leads to conventional estimators such as the ratio and average ratio estimators, but may lead to unconventional sampling plans.

The Horvitz-Thompson estimator $\hat{T}_{HT}$ (Horvitz and Thompson 1952) is design-unbiased under any PPS sampling design if $v(x) = x^2$. Furthermore,

$$\hat{T}_{HT} = n^{-1} \left( \sum_s \frac{y_k}{x_k} \right) \sum_{k=1}^{N} x_k \tag{2.14}$$

is optimal under the superpopulation model $\xi[0, 1 : x^2]$ (Godambe 1955, Godambe and Joshi 1965), i.e.,

$$\mathrm{MSE}(p_{pps}, \hat{T}_{HT}) \le \mathrm{MSE}(p, \hat{T}) \tag{2.15}$$

for any sampling design $p$ with fixed sample size and any design-unbiased estimator.

Bellhouse (1984) reviewed optimal designs for five areas of survey sampling including linear regression under a superpopulation model. He noted that the quantity to minimize depends on the approach to survey sampling inference. If design-unbiasedness is important, the quantity

$$E(\hat{T} - T)^2 \tag{2.16}$$

is minimized, where $E$ is the sampling design expectation and the estimator $\hat{T}$ is design-unbiased. Where a model is assumed, the finite population MSE averaged over the superpopulation model

$$\xi E(\hat{T} - Y)^2 \tag{2.17}$$

is minimized, where $\xi$ is the model expectation, or the squared error between the estimate and finite population parameter averaged over the superpopulation

$$\xi(\hat{T} - Y)^2 \tag{2.18}$$

is minimized. These quantities are minimized with constraints such as design-unbiasedness or $\xi$-unbiasedness. Rao (1979) stated that minimizing $\xi E(\hat{T} - Y)^2$ subject to design-unbiasedness leads to conventional sampling strategies. Minimizing $\xi(\hat{T} - Y)^2$ subject to $\xi$-unbiasedness leads to purposive selection.
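The estimators in eqs. 2.6 to 2.10 can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function names, the dictionary of variance functions, and the toy data are hypothetical and are not taken from the cited authors. It computes the weighted least-squares slope through the origin for a chosen variance function $v(x)$, predicts the total of the non-sample units, and selects the $n$ largest $x$ values purposively.

```python
import numpy as np


def beta_hat(x_s, y_s, v):
    """Weighted least-squares slope through the origin (eq. 2.6)."""
    x_s = np.asarray(x_s, dtype=float)
    y_s = np.asarray(y_s, dtype=float)
    w = 1.0 / v(x_s)  # weights are the inverse of the variance function
    return float(np.sum(w * x_s * y_s) / np.sum(w * x_s ** 2))


def predict_total(x_all, y_s, sample_idx, v):
    """Prediction-form estimator of the total (eqs. 2.5 and 2.7).

    x_all      : auxiliary variable for all N units (assumed known)
    y_s        : observed y for the sampled units, in the order of sample_idx
    sample_idx : indices of the sampled units
    v          : variance function v(x), hypothetical callable
    """
    x_all = np.asarray(x_all, dtype=float)
    idx = np.asarray(sample_idx)
    x_s = x_all[idx]
    b = beta_hat(x_s, y_s, v)
    x_rest = x_all.sum() - x_s.sum()  # sum of x over the non-sample units
    return float(np.sum(y_s) + b * x_rest)


# The three variance functions discussed in the text.
VARIANCE_FUNCTIONS = {
    "regression": lambda x: np.ones_like(x),  # v(x) = 1   (eq. 2.8)
    "ratio": lambda x: x,                     # v(x) = x   (eq. 2.9)
    "average_ratio": lambda x: x ** 2,        # v(x) = x^2 (eq. 2.10)
}


def largest_x_sample(x_all, n):
    """Purposive sample: indices of the n largest x values."""
    return np.argsort(np.asarray(x_all, dtype=float))[-n:]


if __name__ == "__main__":
    x_all = [1.0, 2.0, 3.0, 4.0]       # toy auxiliary values
    idx = largest_x_sample(x_all, 2)   # units with x = 3 and x = 4
    y_s = [6.0, 9.0]                   # toy observations for those units
    for name, v in VARIANCE_FUNCTIONS.items():
        print(name, predict_total(x_all, y_s, idx, v))
```

With $v(x) = x$ the prediction form equals the sample ratio times the population total of $x$, which is the identity stated for the ratio estimator.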
Royall (1970) stated that a sampling plan should minimize

$$\xi(\hat{T} - T)^2 = \left( \sum_{\bar{s}} x_k \right)^2 \xi(\hat{\beta} - \beta)^2 + \sigma^2 \sum_{\bar{s}} v(x_k). \tag{2.19}$$

The sample should also provide a good estimate of the expected value of the non-sample units, i.e., so $\left( \sum_{\bar{s}} x_k \right)^2 \xi(\hat{\beta} - \beta)^2$ is small. The sample should include the $y_k$ with the largest variance. Thus only the least variable $y_k$ values are predicted, i.e., so $\sigma^2 \sum_{\bar{s}} v(x_k)$ is small. He noted that sampling all units in the largest size stratum is a step toward the sampling plan where the $n$ largest values of $x$ are purposively selected. He also noted that PPS designs are a step toward the purposive selection of the $n$ largest values of $x$. Rao (1975) noted that the purposive selection of the $n$ largest values of $x$ may be feasible in specialized surveys; however, this may lead to inefficient estimators in large surveys where many items are of interest. He noted that this criticism is also valid for conventional designs such as PPS and 100% sampling in the stratum of the largest $x$ values.

2.2.2 Robustness

Robustness is the primary concern in model-dependent sampling. The objective of robust estimation is to find an estimator or strategy that performs well allowing for uncertainty about the real world (Cassel et al. 1977 p.149). Hansen et al. (1983) described robust inference as insensitive to violations of the assumptions. Most authors acknowledge that model-dependent sampling is more efficient than probability sampling if the model is accurate. However, model-dependent sampling gives misleading results when the model does not reflect the state of nature (Hansen et al. 1983). Smith (1984) asked: "How many statisticians manage to create a system which is apparently independent of any assumptions?" He noted that a bias is introduced in each sample if the design is misspecified in the randomization approach. The bias is averaged over all samples and becomes a component of variance.
He then asked: "Does this statistical dodge of converting biases to variances make scientific sense?" Hansen et al. (1983) stated that in principle, robustness in probability sampling results from known probabilities, consistent estimators, and large samples so the Central Limit Theorem applies. Smith (1976b) stated that robustness is an important problem facing statisticians; the question is how far to proceed along the spectrum from fully specified models to randomization tests.

Royall and Herson (1973a) studied the ratio estimator $\hat{T}[0, 1 : x]$ when the model $\xi[0, 1 : x]$ was not correct. Royall and Herson (1973b) used stratification as a safeguard against model failure. Scott et al. (1978) extended these approaches to a more general regression model. Brewer (1979) used the same concepts to present a class of robust designs for large scale surveys. Nathan and Holt (1980), Holt et al. (1980), and Pfeffermann and Holmes (1985) also studied model-dependent regression analysis under violations of model assumptions.

Royall and Herson (1973a) stated that a fundamental role of randomization is to provide samples that are balanced on the moments of $x$. The expansion and ratio estimators are equivalent for a balanced sample. However, random sampling does not always give a balanced sample; the ratio estimator is then biased. They commented that it is little consolation to know that random sampling produces balanced samples only on the average. They suggested deliberate balancing to protect against bias. Royall and Herson (1973b) suggested stratified sampling with separate ratio estimates to protect against model failure.

2.3 Model-Dependent Sampling in Forestry

Rennolls (1981) discussed superpopulation models for estimating forest area. He later (Rennolls 1982) recommended that inventory statisticians should abandon the idea that randomization provides the only method for inference.
He suggested using model-dependent sampling when a reliable model is available or when sample size is small. However, he recommended that model-dependent inventory surveys should be designed to validate the model.

Schreuder (1984) compared several model-dependent designs with 3P sampling. He used five sampling designs and five estimators to estimate total volume of two hypothetical populations consisting of 1084 white oak trees and 4438 loblolly pine trees. The simulations used 500 samples of 10 and 30 trees. The model was

$$Y = \alpha + \beta x + \epsilon \tag{2.20}$$

where $Y$ is total tree volume, $x$ is squared diameter times height ($d^2h$), $\alpha$ and $\beta$ are regression coefficients, and the variance $v(\epsilon_i, x_i)$ is $x_i^k \sigma^2$ where $1 \le k \le 2$. The five sampling designs were:

1. Purposively select the $n$ trees with the largest $x = d^2h$, where all $x_i$ were known.
2. Purposively select 20% of the $n$ trees with the smallest $x_i$ and 80% of the $n$ trees with the largest $x_i$.
3. Purposively select $n$ trees from strata created from cumulated values of $x_i$ (pscX sampling).
4. Randomly select one tree from each of the $n$ strata described above.
5. 3P sampling where the expected sample size was $n_e = X_T / L$.

The estimators were the average ratio (Grosenbaugh's adjusted estimator), simple regression, and weighted regression estimators with $k = 1$, 1.5, and 2. Schreuder used preliminary estimates of $N$ and $X_T$ to examine the practical situation where the parameters are unknown. The most efficient estimators were the average ratio and weighted regression estimator with $k = 2$. The simple linear regression estimator was not appropriate. The average ratio estimator is the BLU estimator if the variation in tree volume is proportional to the square of $x = d^2h$. This confirms the result of $k = 2$ for the weighted regression estimator. The pscX, stratified random sampling, and 3P techniques gave equally good results for bias and MSE. However, preliminary estimates did not increase the efficiency of 3P sampling.
The purposive sampling procedures using the $n$ largest trees, and the $0.2n$ smallest and $0.8n$ largest trees gave erratic results and performed poorly. Schreuder explained the large MSE as resulting from the small sample size giving some estimates that were far from the true value.

Schreuder et al. (1984b) compared pscX sampling with point-3P sampling (Grosenbaugh 1971, 1979) in a 26 ha forest. They compared the sample estimates of total volume with (assumed) true volume from a larger sample. Both methods used jackknife variance estimators and point-samples for the first-stage selection of trees. The sampling designs differed only in the selection of individual trees for detailed estimation of volume. Both methods were accurate and equally efficient with the same MSE. They noted that the purposive pscX design may be easier and safer to carry out in similar situations for sampling dead and fallen trees. However, they noted that point-3P sampling is very efficient and does not rely on an underlying model. They noted that the pscX design is very efficient but may give erroneous results if the model is not correct.

Wood et al. (1985) reported a similar study comparing model-based and point-3P sampling. The objective was to estimate total volume from the auxiliary variable $d^2h$. Both methods used a large sample estimate as the true population parameter and used jackknife variance estimators. The strategies used point-sampling as first-stage clusters and differed only in the method of selecting trees within point-samples. The model-based sampling plan was designed to distribute $n_2 = 23$ second-stage sample trees over the range of $d^2h$ (0-2.4 m$^3$) from $n_1 = 16$ first-stage point-sample plots. The strategies performed equally well and gave similar point and standard error estimates. The model-based method provided a viable alternative to point-3P sampling and controlled the size of the second-stage sample.
The designs were equally efficient because costs were the same.

Schreuder and Thomas (1985) compared model-dependent with PPS sampling for updating timber sales and inventories, where the auxiliary variable was $d^2h$ and plot volume from a previous survey, respectively. They tested restricted simple random sampling (Rsrs), sampling from the range of $x$ (psR), pscX, stratified pscX (spscX), and stratified PPS (sPPS) sampling. The simulations used 1000 samples from three populations. The pscX and psR procedures were as given in Schreuder (1984) and Wood et al. (1985), respectively. The spscX procedure randomly selected one value from each of the $n$ strata. The sPPS procedure selected one sample with PPS from each of the $n$ strata created for the pscX procedure. The Rsrs procedure, suggested by Royall and Cumberland (1981a,b), selected samples that were approximately balanced on the first two moments of $x$. Samples were rejected when the mean differed by 10% or the variance differed by 50% from the population values. The psR method was biased, gave large estimates of the variance, and was the least reliable procedure. The spscX procedure had the smallest error variance followed by the pscX procedure. The performance of the other procedures varied with the population. The spscX and pscX sampling plans were consistently the most accurate. There was no superior procedure in confidence interval coverage.

Schreuder and Wood (1986) compared spscX with sPPS sampling. The methods were tested using 500 samples of 10, 20, and 30 with seven populations. The populations were individual tree and inventory plot data showing linear relationships, curvilinear relationships, and no relationship between $y$ and $x$. Both procedures were unbiased. The spscX procedure was more precise for the linear relationships and sPPS sampling was more precise for the curvilinear populations.
They confirmed the generally accepted notion that design-dependent sampling is best if the model is in question and model-dependent sampling is more efficient when the model is valid.

Van Deusen (1987) interpreted 3P sampling through model-dependent theory and suggested reasons why 3P sampling is near optimal. He also discussed robustness of some variance estimators for 3P sampling.

Rennie (1989) presented detailed methodology for planning and conducting point-model dependent sampling for timber inventory. The methods included procedures for estimating the number of point-sample plots and second-phase trees, where the sample trees were spread across the range of $d^2h$.

2.4 Optimal Design and Cost Functions

An optimal design achieves the lowest cost for a given precision or the highest precision for a given cost (Cochran 1977 p.96, Hansen et al. 1953 p.34, Kish 1965 p.263). The most effective forest inventory achieves a desired level of precision at the lowest cost (Loetsch et al. 1973 p.345). Most sampling text books give formulae for estimating the optimal sample size for various designs with fixed cost or fixed precision.

The procedure for determining an optimal sample usually involves the Cauchy-Schwarz inequality or Lagrangian multipliers. Expressions for the sample variance and cost are needed for both methods. Hansen et al. (1953 p.287) noted that it is not always possible to obtain explicit mathematical solutions for complicated variance and cost functions. They gave an example of successive substitution in the cost-variance expression to approximate an optimal solution. They noted that mathematical solutions aid in thinking about good sample design and show the factors that are effective in determining the information per unit cost. They recommended that both successive substitution and analytical methods be used to estimate the optimal solution.
This is similar to a sensitivity analysis and shows the region of optimal sample combinations. They emphasized that successive substitution seeks the region of the optimum, not the exact optimum given by the mathematical expression.

Kish (1965 p.264) stated that a single mathematical expression would ideally lead to an optimum among all possible sampling designs. However, this is virtually impossible because the costs and variances are needed for every alternative. Practitioners often reject sampling designs because of practical difficulties in implementation. Kish noted that obviously poor designs can be eliminated by using educated guesses for costs when precise data are not available. He noted that moderate errors in estimating costs often result in only small departures from the optimal design. He stated that the real value of a good cost model is that it helps to ask the right questions and to make good guesses.

Sampling strategies are often compared by variance alone. Strategy A is said to be more efficient if it has a smaller variance than strategy B. This is referred to as relative gain (Hansen et al. 1953 p.200) and relative precision¹ (Cochran 1977 p.103). Kish (1965 p.266) gave a formula for comparing sampling designs that includes cost. He defined the relative advantage (RA) of sampling design A to design B as:

$$RA = \frac{\mathrm{Cost}_B \times \mathrm{Variance}_B}{\mathrm{Cost}_A \times \mathrm{Variance}_A} \tag{2.21}$$

where a design is preferred if it has a smaller cost per unit variance or a smaller variance per unit cost.

¹Hansen et al. (1953 p.124) used the term relative precision to denote relative variance.

Costs are needed to determine the optimum among sampling designs. Fixed costs that are common among designs are not needed to determine optimal strategies. However, Kish (1965 p.266) noted that variable costs become less important as fixed costs increase. Cost functions in forestry may be peculiar to the design, application, or population (Murchison 1984, Ware 1964).
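As a small illustration of Kish's relative advantage (eq. 2.21), the following Python sketch computes RA for two designs. The function name and the cost and variance figures are hypothetical and serve only to show the arithmetic; an RA above 1 favours design A because its cost-variance product is smaller.

```python
def relative_advantage(cost_a, var_a, cost_b, var_b):
    """Relative advantage of design A to design B (eq. 2.21).

    RA = (Cost_B * Variance_B) / (Cost_A * Variance_A).
    Values above 1 mean design A has the smaller cost-variance product.
    """
    if cost_a <= 0 or var_a <= 0:
        raise ValueError("cost and variance of design A must be positive")
    return (cost_b * var_b) / (cost_a * var_a)


if __name__ == "__main__":
    # Hypothetical figures: A costs 100 units with variance 4,
    # B costs 120 units with variance 5.
    print(relative_advantage(100.0, 4.0, 120.0, 5.0))
```

Comparing by variance alone corresponds to setting the two costs equal, in which case RA reduces to the ratio of the variances.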
Scott (1981) noted that specific cost functions are required to determine optimal designs in forest inventories. Hamilton (1978) stated that cost functions cannot determine optimal designs because they do not consider the value of information. He suggested using a function that describes the losses for incorrect inventory information. However, he noted the difficulty of specifying an adequate loss function.

Murchison (1984) noted that the designation of fixed and variable costs differs among researchers. For example, most researchers consider travel to the field as a fixed cost (e.g., Arvanitis and O'Regan 1967, Bonner 1972, Gross et al. 1980). However, Scott (1981) noted that travel time to the field is a function of the total number of samples and the number that can be enumerated per day.

Sampling cost can be measured in units of time or money. Hamilton (1978), Murchison (1984), and Scott (1979, 1981) used the dollar cost of sampling. Arvanitis and O'Regan (1967) and Gross et al. (1980) used time. Murchison (1984) noted that time is temporally and spatially consistent and eliminates the effect of currency exchange; however, vehicle and equipment costs are difficult to quantify in units of time. Bonner (1972) and Scott (1981) noted that personnel time is the major component of inventory costs.

Travel cost among sample units is a function of the square root of the length times the width of the survey area. The average distance between units in an area of size $A$ (assuming straight line travel) is approximately $\sqrt{A/n}$. Thus total travel distance is $d = n\sqrt{A/n} = \sqrt{An}$. Jessen (1978) gave several approximations for more intricate movement between sample units. Nyyssonen et al. (1971) included travel distance between lines and return to the starting point where $d = \sqrt{An} + 2\sqrt{A}$.

A general cost function for two-stage sampling given in most sampling text books (e.g., Cochran 1977 p.280, Hansen et al. 1953 p.272, Kish 1965 p.268, Sukhatme et al.
1984 p.309) is

$$C = c_0 + c_1 n + c_2 n m \tag{2.22}$$

where $C$ is the total survey cost, $c_0$ is the fixed cost of sampling, $c_1$ is the cost of sampling first-stage units, $c_2$ is the cost of sampling second-stage units, and $n$ and $m$ are the number of first- and second-stage sampling units, respectively.

Many variations of this cost function have been used in forest inventories. Arvanitis and O'Regan (1967), O'Regan and Arvanitis (1966), Hamilton (1978), and Gross et al. (1980) estimated variable costs as

$$C = c_w \sqrt{nA} + c_e n + c_m n M_0 \tag{2.23}$$

where $c_w$ is the cost of walking a unit distance, $c_e$ is the cost of establishing a plot, $c_m$ is the cost of measuring a single tree, $M_0$ is the total number of trees in a plot, and the other terms are as previously defined. The cost of point-sampling is a linear function of the number of sampled trees (Sayn-Wittgenstein 1963, Shirley 1960). Murchison (1984) and Scott (1981) used more detailed cost functions including both fixed and variable costs.

2.5 Point-Sampling

Point-sampling is also known as Bitterlich sampling, angle-count sampling, variable plot sampling, polyareal sampling, plotless sampling, and other names. Bitterlich (1947, 1948) pioneered the concept and prefers the term angle-count sampling because it emphasizes the essential feature of the method (Bitterlich 1984). The term point-sampling is due to Grosenbaugh (1952) and is commonly used in North America.

Point-sampling is used throughout the world and there is a voluminous literature on theory and practice. Bibliographies of point-sampling literature were given for 1947 to 1959 by Thomson and Dietschman (1959) and for 1959 to 1965 by Labau (1967). Bitterlich (1984) described the history, theory, and various modifications of point-sampling. Most forest mensuration texts discuss the fundamentals of point-sampling (e.g., Avery 1975, Husch et al. 1982, Loetsch et al. 1973).
Each point-sample plot contains zero or more trees selected with probability proportional to BA. Point-sampling can be interpreted as cluster sampling and has been modified to subsample the point-selected trees. Subsampling from point-samples can be interpreted as either two-phase (double) sampling or as two-stage sampling (Schreuder 1970).

2.5.1 Two-Stage Point-Sampling

Conventional two-stage sampling is not commonly used with point-sampling. Schreuder (1970) gave unbiased estimators for two-stage point-sampling. Yandle and White (1977) gave examples and derived estimators for two-stage point-sampling.² The probability of inclusion was proportional to the ratio of tree height to the total height of all point-sampled trees. Their procedures were for estimation when two visits to an area were necessary.

²The procedure was actually double sampling.

2.5.2 Two-Phase Point-Sampling

Bell and Alexander (1957) first proposed two-phase point-sampling. First-phase point-samples are called count plots where only the number of point-selected trees are tallied to estimate stand basal area. Second-phase samples estimate the ratio of $y$ (usually net volume) to BA (yBAR) that is used to adjust the estimate of stand BA from the larger sample. The rationale of two-phase point-sampling is illustrated by multiplying the top and bottom of the standard estimator for point-sampling

$$\bar{y} = BAF \, \frac{\sum_{i=1}^{n} \sum_{j=1}^{M_i} \left( \dfrac{y_{ij}}{b_{ij}} \right)}{n} \tag{2.24}$$

by the total number of the point-selected trees ($M = \sum_{i=1}^{n} M_i$) so

$$\bar{y} = BAF \left[ \frac{\sum_{i=1}^{n} \sum_{j=1}^{M_i} \left( \dfrac{y_{ij}}{b_{ij}} \right)}{\sum_{i=1}^{n} M_i} \right] \left[ \frac{\sum_{i=1}^{n} M_i}{n} \right] \tag{2.25}$$

where $\bar{y}$ is the per ha quantity of the attribute $y$, $BAF$ is the basal area factor (m$^2$/ha), $n$ is the number of point-sample plots, $M_i$ is the number of point-selected trees in the $i$th plot, $y_{ij}$ is the quantity of the $j$th tree in the $i$th plot, and $b_{ij}$ is the basal area of the $j$th tree in the $i$th plot.
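The equivalence of eqs. 2.24 and 2.25 can be checked numerically. The Python sketch below is illustrative only: the function names, the plot data structure (a list of plots, each a list of (y, b) pairs), and the toy values are assumptions made for this example.

```python
import math


def per_ha_standard(plots, baf):
    """Standard point-sampling estimator (eq. 2.24).

    plots : list of plots; each plot is a list of (y, b) pairs, where y is
            the tree attribute and b is the tree basal area (hypothetical layout)
    baf   : basal area factor (m^2/ha)
    """
    n = len(plots)
    ratio_sum = sum(y / b for plot in plots for (y, b) in plot)
    return baf * ratio_sum / n


def per_ha_factored(plots, baf):
    """Same estimator written as mean ratio times mean tree count (eq. 2.25)."""
    n = len(plots)
    m_total = sum(len(plot) for plot in plots)  # M, all point-selected trees
    if m_total == 0:
        return 0.0
    mean_ratio = sum(y / b for plot in plots for (y, b) in plot) / m_total
    mean_count = m_total / n
    return baf * mean_ratio * mean_count


if __name__ == "__main__":
    # Toy data: three plots, one of them empty.
    plots = [[(1.0, 0.1), (2.0, 0.25)], [], [(3.0, 0.3)]]
    a = per_ha_standard(plots, baf=4.0)
    b = per_ha_factored(plots, baf=4.0)
    print(a, b, math.isclose(a, b))
```

Writing the estimator as a product makes explicit that the mean ratio and the mean tree count are separate quantities, which is what allows them to be estimated from samples of different sizes.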
Now, the per ha quantity of $y$ is given by the average ratio of $y/b$ (yBAR; VBAR in the case of volume) for the $M$ point-selected trees, multiplied by the average per ha estimate of BA for the $n$ point-sample plots. This shows that the two parameters can be estimated separately. BA/ha is easy and inexpensive to estimate from the average number of point-selected trees from count plots and sample plots. Average yBAR is more expensive to estimate. For volume, VBAR requires the measurement or estimation of height and dbh of each tree. The less expensive BA/ha estimates are considerably more variable than VBARs. Hence the rationale for taking a larger sample for BA/ha than for VBARs.

The usual estimator for two-phase point-sampling is

$$\bar{y} = BAF \left[ \frac{\sum_{i=1}^{n} M_i}{n} \right] \left[ \frac{\sum_{i=1}^{n'} \sum_{j=1}^{M_i} \left( \dfrac{y_{ij}}{b_{ij}} \right)}{\sum_{i=1}^{n'} M_i} \right] \tag{2.26}$$

where $n$ is the total number of point-sample plots and $n'$ is the number of plots where volume (or some other attribute) is measured on all point-selected trees. Schreuder (1970) noted that Don Bruce first proposed this estimator. Bell and Alexander (1957) and Johnson (1961) also discussed this estimator. Palley and Horwitz (1961) showed that this estimator is biased because it is a ratio estimator and gave an approximate expression for the MSE. Promnitz (n.d.) stated that the bias of two-phase point-sampling is small and depends on population characteristics. Schreuder (1970) gave four biased double sampling estimators including the ratio and average ratio estimators.

The standard error (SE) of a two-phase point-sample is a function of the SE of BA/ha and the SE of average yBAR. The SE for two-phase point-sampling can be approximated by Bruce's (1961) method as

$$\%SE = \left[ \left( \%SE_{TC} \right)^2 + \left( \%SE_{yBAR} \right)^2 \right]^{1/2} \tag{2.27}$$

where $\%SE_{TC}$ is the SE of the average tree count among plots expressed as a percentage of the average tree count. $\%SE_{TC}$ is equivalent to the SE of estimated BA/ha among plots. For a subsample of plots, $\%SE_{yBAR}$ is the %SE of the average yBARs among plots.
For a subsample of trees, $\%SE_{yBAR}$ is the %SE of the average yBARs among individual trees. Bell et al. (1983) noted that the combined SE should include a term for the covariance; however, the covariance between VBARs and tree counts is usually very small. Johnson (1961) gave an approximation for the combined SE that included a covariance term.

Bell et al. (1983) gave a method for balancing the ratio of count plots to yBAR using the cost and variance of each component. They showed how the number of count plots and yBARs vary to give the same precision. They also gave an expression for the lowest cost solution and discussed methods for selecting yBAR trees such as using a larger diopter prism and systematically selecting every $k$th point-selected tree.

Lies and Bell (1983) discussed two-phase point-sampling for estimating grade and dollar values. The procedure was the same as for VBAR and tree counts, but the characteristic of interest was the percent grade per tree and the dollar value per tree. They noted that dollar value could be total log value, profit, stumpage, and so on. They showed how statistics can be computed with common cruise compilation programs and how to examine the effect of different combinations of yBAR to tree counts. Lies and Bell noted that examining the ratios of dollar values to BA quantifies the effect of grade on the number of sample trees measured. This avoids the difficult problem of confidence intervals for correlated and constrained variables as for grade percent.
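The two-phase estimator (eq. 2.26) and the combined standard error (eq. 2.27) can be sketched in Python as follows. This is a minimal sketch under assumptions: the function names and toy data are hypothetical, the combined %SE is taken to be the square root of the sum of the squared component %SEs, and the covariance term is ignored, as the text notes it is usually very small.

```python
import math


def two_phase_estimate(tree_counts, measured_plots, baf):
    """Two-phase point-sampling estimator (eq. 2.26).

    tree_counts    : number of point-selected trees on each of the n plots
                     (count plots and measured plots together)
    measured_plots : for the n' measured plots, a list of (y, b) pairs per plot
    baf            : basal area factor (m^2/ha)
    """
    n = len(tree_counts)
    mean_count = sum(tree_counts) / n
    m_measured = sum(len(plot) for plot in measured_plots)
    if m_measured == 0:
        raise ValueError("at least one measured tree is required")
    mean_ratio = sum(y / b for plot in measured_plots for (y, b) in plot) / m_measured
    return baf * mean_count * mean_ratio


def combined_percent_se(pct_se_tree_count, pct_se_ybar):
    """Combined %SE without the covariance term (assumed form of eq. 2.27)."""
    return math.sqrt(pct_se_tree_count ** 2 + pct_se_ybar ** 2)


if __name__ == "__main__":
    # Toy data: four plots tallied, two of them measured in detail.
    counts = [4, 6, 5, 5]
    measured = [[(1.0, 0.1), (2.0, 0.2)], [(3.0, 0.25)]]
    print(two_phase_estimate(counts, measured, baf=4.0))
    print(combined_percent_se(3.0, 4.0))
```

Because the two components enter the combined %SE separately, the number of count plots and the number of yBAR trees can be traded against each other to reach the same precision.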
2.5.3 Point-3P Sampling

Grosenbaugh recognized that some forest areas were too large to visit each tree and use his very precise 3P sampling method (Grosenbaugh 1964, 1979). This led to the development of his very efficient point-3P sampling method (Grosenbaugh 1971, 1979, Rennie 1976, Wiant 1974, 1976, 1977). There are two commonly used methods of point-3P sampling: i) to estimate the quantity of $y$ per tree for all point-selected trees and subsample individual trees; and ii) to estimate the total of $y$ for all the trees in each plot and subsample entire plots.

In point-3P sampling, point-samples are located as usual where trees are selected with probability proportional to BA. Individual trees are then subsampled from the point-selected trees using the 3P method. Where volume is of interest, height (total or merchantable) is estimated for each tree and compared to a pregenerated random number. Volume is measured for trees with estimated heights greater than or equal to the random number. Hence trees are selected with probability proportional to diameter squared times height ($d^2h$), which is strongly related to volume. Grosenbaugh (1971) suggested that $d^2h$ could be multiplied by value, thus trees are selected with the combined probability proportional to $d^2h$ times value. Kasile (1983) used conventional 3P sampling to estimate the dollar value of stands.
The point-3P estimator of $y$/ha is (Grosenbaugh 1971, 1979)

$$\bar{y} = \frac{BAF}{n} \left( \sum_{i=1}^{n} \sum_{j=1}^{M_i} x_{ij} \right) \left[ \frac{1}{s} \sum_{k=1}^{s} \frac{y_k}{b_k x_k} \right] \tag{2.28}$$

where $\bar{y}$ is average $y$/ha, $y$ is the attribute of interest (usually volume), $x$ is the covariate (usually height), and $s$ is the total number of 3P-selected trees. When $y$ is volume and $x$ is height, the estimator is the average cylindrical form factor of the 3P-selected trees, multiplied by the product of estimated total height and BA per unit area. Yandle and White (1977) used the same estimator for subsampling point-selected trees with probability proportional to tree height. Their application was subsampling from a list of point-selected trees from a completed survey. This was a priori list sampling where the sum of the tree heights for all point-selected trees was known before sampling.

Grosenbaugh (1976, 1979) noted that list sampling with PPS without replacement and fixed sample size is preferable to 3P or point-3P sampling when a complete a priori list is available and selected trees can be relocated. However, 3P or point-3P sampling is preferable to a posteriori list sampling when the cost of obtaining a complete a priori list is prohibitive, or where trees cannot be relocated.

Grosenbaugh (1971, 1979) gave a good approximation of the relative variance of estimated $y$/ha ($CV^2_{\bar{y}}$) for point-3P sampling as

$$CV^2_{\bar{y}} = CV^2_{x} + CV^2_{R} + 2\,CV^2_{xR} \tag{2.29}$$

where $CV^2_{x}$ is the relative variance of the sum of the $x$'s per plot, $CV^2_{R}$ is the relative variance of the average ratios, and $CV^2_{xR}$ is the relative covariance. He noted that the major component of variance is the variation among the sum of the $x$'s per point. This is intuitively reasonable because estimates of the plot average ratios should be similar whereas the number and height of trees in a plot can vary greatly. Furthermore, the average ratios should not be related to the sum of $x$'s, thus giving a small covariance term.
Bell and Dilworth (1988 p.217) suggested the same estimator but did not include the covariance because it is usually very small.

Schreuder et al. (1984a) compared several point-3P variance estimators. The simulation used estimates from 2000 samples from a hypothetical forest. The variance estimators included one suggested by Grosenbaugh (a more sophisticated approximation of eq. 2.29), Yandle and White's (1977) estimator, a jackknife estimator, and others. They concluded that Grosenbaugh's variance estimator and the jackknife variance estimator were reliable and did not give negative values as did some others. The variances of Grosenbaugh's and the jackknife estimators were the smallest. Yandle and White's estimator did not give a useful estimate of variance.

2.6 Forest Service Operational Cruise Procedure

The objective of an operational cruise in B.C. is to estimate the average net volume/ha to a specified degree of precision (Ministry of Forests 1980). The precision requirements are ± 15% of the estimated net volume/ha for scale-based sales and ± 10% or ± 5% for cruise-based sales, at 95% confidence. Stumpage is computed using the estimated quantity and quality of timber from the cruise for both cruise-based and scale-based sales. Most timber cut from Crown land in B.C. is from scale-based sales where billing is based on the net volume and grade of logs scaled at the log dump or mill yard. For cruise-based sales, billing is based on the cruise estimates of volume by grade.

Virtually all timber in B.C. is cruised using systematically located point-samples. Distinct forest types are delineated from the cruise area on aerial photographs. Evenly spaced transects are located so at least four plots are in each type.
The total number of plots required to meet the given precision is estimated as $n = t^2 CV^2 / E^2$, where t is Student's t, CV is the coefficient of variation of point-sample estimates of net volume/ha, and E is the desired precision. The CV is usually estimated from previous cruises or experience in similar forest types.

There are two types of point-sample plots used in operational cruising: sample plots that include detailed measurements for each tree; and count plots that are only a numerical count of the point-selected trees. The Ministry of Forests (1980) suggests locating three count plots for each sample plot to achieve the increased precision required for cruise-based sales.

Information recorded for each point-selected tree includes: species, dbh, tree class, crown class, presence and position of eight pathological indicators, and presence and degree of seven quality indicators. Height is measured for two or three trees in each plot, so at least 30 trees are measured for each major species. The data are used to construct height-diameter curves. On the coast, the heights of two or three trees are measured on each plot and the remainder are visually estimated. Age is estimated from increment cores from two or three codominant sample trees per plot. This is often not possible in old-growth forests because of the large diameter of trees and frequent occurrence of decay in some species.

Pathological indicators are used to determine risk groups and assign decay, waste, and breakage factors (British Columbia Forest Service 1976). The indicators are also used with the quality remarks to assign log grades to 10 m tree sections. In the interior, the pathological indicators are used to determine risk groups and to compute the lumber recovery factor.

2.7 Forest Service Log Grades

Crown timber is appraised on the coast using FS letter log grades.
Grade is determined by factors that affect the end use of the timber such as species, size, and various quality related indicators (Ministry of Forests 1986). The Task Force on Crown Timber Disposal (Pearse et al. 1974) recommended that the statutory log grades be replaced by new log grades. The new grades were to incorporate industrial grades that had evolved from market transactions. The new letter log grades became official for use in coastal appraisals in January, 1982.

Middleton and Munro (1985) studied product outturn values by log grade for Douglas-fir, hemlock, cedar, and amabilis fir. They found that the new letter log grades were closely related to log value; however, they did not give a good indication of lumber recovery. They noted that the old statutory log grades were not strongly related to log value.

2.8 Vancouver Log Market

The Vancouver log market is officially an administrative region set up to monitor log prices for timber appraisal. The term is also used unofficially for the communication networks that have evolved between log brokers, company log buyers, and other timber brokers. The Task Force on Crown Timber Disposal (Pearse et al. 1974) gave a history of the Vancouver log market.

The Forest Act requires sellers and buyers of timber to keep accurate records of log sale transactions. The FS Valuation Branch collects, audits, and reports the data. Prices do not include logs sold in inter-divisional transfer, remanufacture, export, poles and pilings, or beachcombed logs. Average log market values are reported each month for major and non-major coastal loggers. Six-month average prices are used for timber appraisal. The log market values are supposed to reflect free-on-board Vancouver prices for free market transactions. However, Pearse (1989) suggested that current log market prices reflect relative value but probably do not reflect actual value.
2.9 Estimating Tree Value

There are two basic methods of estimating individual tree value. Whole-tree methods value standing trees by grade or size class. Log-based methods value standing trees by the length, diameter, and grade of the individual logs in the tree. Davis (1966 p.420) and Davis and Johnson (1987 p.384) discussed and gave examples of the two methods. Loetsch et al. (1973 p.190) discussed the assessment of timber quality and value in forest inventories.

Objectives must be clearly stated before tree quality or value can be quantified (Ware 1964). Ware identified three main sources of variation in predicting tree value: i) variation in end products; ii) variation in volume by end product quality class; and iii) variation in price by end product quality class. End products must be clearly defined in order to examine the various related characteristics. Ware stated that the main source of error is variation in volume by end product quality class.

Selling prices and costs for whole-tree methods of valuation are given for the entire tree. However, individual logs are identified at some point before the tree enters the mill. It is easier to recognize classes of standing trees than to estimate log grades, diameters, and lengths; however, estimating tree value by log value, size, and grade is more comprehensive, flexible, and widely applicable (Davis and Johnson 1987 p.386).

Individual tree valuation has traditionally used the log-based approach where value is computed from recoverable products. This is probably the result of the close relationship to estimating lumber recovery of individual logs with log rules (Bruce 1970). This method uses tables showing the board foot lumber recovery by species, diameter class, and log grade. Bruce (1970) discussed the development of log rules and their relationship to product recovery using tree volume, surface area, and length.
More recently, product recovery has been calculated as the yield of lumber, veneer, and chippable waste predicted by equations developed from empirical observation. Current methods of log-based product recovery predict lumber yield by grade from standard log measurements (e.g., Howard and Yaussy 1986, Yaussy 1986).

Whole-tree valuation was successfully used to estimate the value of trees in the Pacific Northwest. The methods predicted tree value as a continuous function, not by discrete tree classes as discussed by Davis (1966). Lane et al. (1970) directly estimated the dollar value of Douglas-fir sawtimber. They used multiple regression to predict dollar value and volume of Standard and Better grade lumber for each tree. The dependent variables were obtained from mill studies of 1099 trees. Predictor variables for dollar value included dbh, height, basal scar length, diameter of the largest limb (or stub) in the 16 foot butt-log, and estimated percent defect. A test showed that the system accurately estimated the value of lumber produced from milling harvested trees from three timber sales.

Snellgrove et al. (1973) estimated the selling price and volume of lumber for white pine sawtimber. The predictor variables were dbh, height, height to the first live limb, diameter of the largest limb in the 16 foot butt-log, the number of limb-free and defect-free faces on the 16 foot butt-log, and the estimated percent defect. The same procedures were used for western larch (Plank and Snellgrove 1978) and ponderosa pine sawtimber (Plank 1981). The predictor variables for larch value were dbh, height, number of limb- and defect-free faces on the 16 foot butt-log, and estimated percent defect. The value of ponderosa pine trees was predicted using dbh, height, height to the first live limb, the number of limb- and defect-free faces on the 32 foot butt-log, and estimated total defect.

Chapter 3

METHODS

3.1 Overview

The project methodology is presented in four major sections:

1. Creation of test populations.
2. Estimation of individual tree values.
3. Formulation of the model-dependent sampling strategies.
4. Evaluation of the model-dependent sampling strategies.

1. Creation of test populations. Three populations were created to provide different stand conditions to test the sampling strategies. The test populations were created by separating an aggregate data base of point-sample plots into mutually exclusive data sets. The criteria for separating the aggregate data were chosen to provide forest conditions of differing density, volume, and value characteristics.

2. Estimation of individual tree values. Individual tree values were computed and each tree was assigned a cruiser-called value and an estimated value. The cruiser-called value was assumed to approximate the value of the felled and bucked tree in the Vancouver log market. The cruiser-called value was computed as the sum of the value of the individual logs in the tree as given by prices for the Vancouver log market. This was done by conceptually bucking each tree into 5 m logs and assigning the cruiser-called grade and net factor for decay to each log. Estimated tree values were computed from regression equations developed from the aggregate data using predictor variables that were related to cruiser-called value.

3. Formulation of the model-dependent sampling strategies. The FS sampling method was modified to subsample trees from point-sample plots. Three probability-based methods and three purposive-based methods were used to select subsampled trees. The assumed superpopulation model provided the inference for all the subsampling methods. The subsampling methods were constrained to consider the practical difficulties of forest sampling.
These practical considerations had three major effects on the designs: i) point-sampling was considered to be the most efficient method to establish first-stage clusters for value in the field and was used for all strategies; ii) only one visit to the sample area was assumed; and iii) methods for estimating individual tree values were restricted to using only variables that could be easily measured or observed in the field.

4. Evaluation of the model-dependent sampling strategies. The model-dependent sampling strategies were evaluated on their relative advantage over the FS sampling strategy using the same number of first-stage sample plots. The variance of the strategies was approximated from the sampling distribution from Monte Carlo simulation. Analytical expressions of the variance for the strategies would be very difficult to derive and may not provide a realistic representation of the achieved sample variance. Sampling cost was approximated with cost functions developed from consultation with field experts. The Monte Carlo simulation offered a cost-effective and efficient method to test the designs using actual forest data.

3.2 Test Populations

The point-sample plots used to test the sampling strategies were obtained from MacMillan Bloedel's computer data base of operational cruise plots. The plots were from cruises of mature and old-growth timber (greater than 120 years of age) in District 1 of Tree Farm 19 on southern Vancouver Island. Only living and dead potential trees (Ministry of Forests 1980) were included in the data. Tree species were Douglas-fir (Pseudotsuga menziesii (Mirb.) Franco var. menziesii), western red cedar (Thuja plicata Donn), yellow cedar (cypress) (Chamaecyparis nootkatensis (D. Don) Spach), western white pine (Pinus monticola Dougl.), lodgepole pine (P. contorta var. contorta), Sitka spruce (Picea sitchensis (Bong.) Carr.), and two minor hardwood species, maple (Acer spp.) and alder (Alnus spp.).
Western hemlock (Tsuga heterophylla (Raf.) Sarg.) and mountain hemlock (T. mertensiana (Bong.) Carr.) were considered a single species, and grand fir (Abies grandis (Dougl.) Lindl.) and amabilis fir (A. amabilis (Dougl.) Forbes) were grouped together and called balsam.

The tree data included measurements for dbh, measured or estimated height, the eight FS pathological indicators, and the seven FS quality indicators (Ministry of Forests 1980). The tree data also included cruiser-called MacMillan Bloedel inventory grade,[1] grade length, and percentage net factor defect reduction. The six inventory grades were designed to reflect company requirements for various end products. Cruiser-called log lengths varied to reflect grade breaks. Net factors were called for defect only and did not include deductions for waste and breakage.

The data were arbitrarily separated into three mutually exclusive test populations. The objective was to provide test populations with different stand conditions to test the sampling strategies. Test population 1 (TP1) and test population 2 (TP2) contained the plots from stands in the aggregate data that were sampled with a 6.06 diopter prism (BAF 9.2 m²/ha). TP1 contained the plots from stands with average plot BA < 80 m²/ha, and TP2 contained the plots from stands with average plot BA > 80 m²/ha. Test population 3 (TP3) contained the plots from stands in the aggregate data that were sampled with a 7.43 diopter prism (BAF 13.8 m²/ha). Table 3.1 gives the number of stands, plots, trees, diopter, and BA limit of the plots for the three test populations. Table 3.2 gives the number of point-selected trees by species for the three test populations and the aggregate data.

[1] MacMillan Bloedel now uses the FS letter grades for internal inventory; personal communication with Mr. Bert Vink, MacMillan Bloedel Ltd., Nanaimo B.C., March 1988.
Table 3.1: Number of stands, plots, and trees in the test populations

| Test Population | Number of Stands | Number of Plots | Number of Trees | Prism Diopter | Average Plot BA/ha |
|---|---|---|---|---|---|
| TP1 | 327 | 1667 | 11623 | 6.06 | < 80 m²/ha |
| TP2 | 234 | 1236 | 12191 | 6.06 | > 80 m²/ha |
| TP3 | 286 | 1410 | 9250 | 7.43 | |
| Total | 847 | 4313 | 33064 | - | - |

Table 3.2: Number of point-selected trees by species and test population

| Species | TP1 Number | TP1 % | TP2 Number | TP2 % | TP3 Number | TP3 % | Aggregate Number | Aggregate % |
|---|---|---|---|---|---|---|---|---|
| Hemlock | 4222 | 36.3 | 4102 | 33.6 | 2945 | 31.8 | 11269 | 34.1 |
| Douglas-fir | 3641 | 31.3 | 4160 | 34.2 | 3129 | 33.9 | 10930 | 33.1 |
| Balsam | 1555 | 13.4 | 1741 | 14.3 | 1543 | 16.7 | 4839 | 14.6 |
| Cedar | 1382 | 11.9 | 1347 | 11.0 | 1122 | 12.1 | 3851 | 11.7 |
| Cypress | 652 | 5.6 | 751 | 6.2 | 462 | 5.0 | 1865 | 5.6 |
| White Pine | 91 | 0.8 | 85 | 0.7 | 46 | 0.5 | 222 | 0.7 |
| Alder | 30 | 0.3 | 2 | 0 | 0 | 0 | 32 | 0.1 |
| Sitka Spruce | 27 | 0.2 | 1 | 0 | 0 | 0 | 28 | 0 |
| Maple | 12 | 0.1 | 2 | 0 | 2 | 0 | 16 | 0 |
| Lodgepole Pine | 11 | 0.1 | 0 | 0 | 1 | 0 | 12 | 0 |
| Total | 11623 | 100 | 12191 | 100 | 9250 | 100 | 33064 | 100 |

The arbitrary separation of the aggregate data into three mutually exclusive data sets resulted in different test population characteristics. The stands sampled with the 6.06 diopter prism (TP1 and TP2) had lower density than the stands sampled with the 7.43 diopter prism (TP3). The split into TP1 and TP2 by average plot BA/ha further separated the stands into high and low density. This also resulted in different value and volume characteristics (Tables 3.3, 3.4, and 3.5). Figures 3.1, 3.2, and 3.3 show the frequency distribution of value/ha for plots in the three test populations.

Table 3.3: Statistics for the 1667 sample plots in TP1

| | average | std. dev. | skewness | kurtosis | min | max |
|---|---|---|---|---|---|---|
| value/ha ($) | 30717.04 | 23641.75 | 2.06 | 7.83 | 0 | 225973.96 |
| est. value/ha ($) | 31851.18 | 23862.39 | 1.73 | 5.86 | 0 | 222283.81 |
| volume/ha (m³) | 613.36 | 321.64 | 0.67 | 0.49 | 0 | 2068.14 |
| net volume/ha (m³) | 578.02 | 303.41 | 0.71 | 0.61 | 0 | 1943.64 |
| basal area/ha (m²) | 64.01 | 25.37 | 0.25 | -0.07 | 9.18 | 165.26 |
| trees/ha | 422.60 | 282.78 | 1.12 | 1.42 | 12.82 | 1728.32 |
| trees/plot | 6.97 | 2.76 | 0.25 | -0.07 | 1 | 18 |

Table 3.4: Statistics for the 1236 sample plots in TP2

| | average | std. dev. | skewness | kurtosis | min | max |
|---|---|---|---|---|---|---|
| value/ha ($) | 48483.83 | 30615.33 | 2.02 | 7.73 | 2280.22 | 301422.07 |
| est. value/ha ($) | 50895.58 | 30414.01 | 1.57 | 4.68 | 0 | 247764.07 |
| volume/ha (m³) | 912.62 | 363.43 | 0.62 | 0.61 | 102.90 | 2459.18 |
| net volume/ha (m³) | 859.52 | 349.33 | 0.69 | 0.96 | 93.40 | 2418.52 |
| basal area/ha (m²) | 90.55 | 27.25 | 0.39 | 0.55 | 9.18 | 192.80 |
| trees/ha | 510.01 | 306.23 | 1.26 | 2.19 | 13.96 | 2061.09 |
| trees/plot | 9.86 | 2.97 | 0.39 | 0.55 | 1 | 21 |

Table 3.5: Statistics for the 1410 sample plots in TP3

| | average | std. dev. | skewness | kurtosis | min | max |
|---|---|---|---|---|---|---|
| value/ha ($) | 56678.63 | 40030.87 | 1.80 | 4.71 | 2531.74 | 280992.94 |
| est. value/ha ($) | 56327.60 | 38034.12 | 1.71 | 4.63 | 0 | 319645.62 |
| volume/ha (m³) | 985.23 | 474.65 | 0.95 | 1.49 | 88.63 | 3228.70 |
| net volume/ha (m³) | 923.24 | 456.76 | 1.02 | 1.67 | 88.63 | 3219.73 |
| basal area/ha (m²) | 90.54 | 36.42 | 0.63 | 0.46 | 13.80 | 234.62 |
| trees/ha | 421.18 | 302.43 | 1.60 | 3.94 | 10.36 | 2467.12 |
| trees/plot | 6.56 | 2.64 | 0.63 | 0.46 | 1 | 17 |

Figure 3.1: Distribution of value/ha for the 1667 plots in TP1 (axes: Percent against Value/ha ($ 000))

Figure 3.2: Distribution of value/ha for the 1236 plots in TP2 (axes: Percent against Value/ha ($ 000))

Figure 3.3: Distribution of value/ha for the 1410 plots in TP3 (axes: Percent against Value/ha ($ 000))

3.3 Tree Value

Each tree was conceptually divided into 5 m logs starting at a stump height of 45 cm. When the first grade section was cruiser-called as cull, the first 5 m log began at the top of the cull grade section. The top log was of variable length to a 17.5 cm top diameter (inside bark). Each log was assigned the cruiser-called MacMillan Bloedel inventory grades and net factors if at least 4 m of the conceptual 5 m log contained the call grade. The inventory grades were converted to FS letter grades using MacMillan Bloedel specifications.[2] Quality characteristics of the MacMillan Bloedel grades roughly corresponded to the FS grades and conversion was based primarily on log top diameter. Log volume and top diameter (inside bark) were computed with the subroutine LOG[3] from the FS cruise compilation program that is based on the whole-bole taper equations of Demaerschalk and Kozak (1977).

The whole-tree approach to valuation was used to estimate the value of individual trees. The methods were similar to those used by Lane et al. (1970), Plank (1981), Plank and Snellgrove (1978), and Snellgrove et al. (1973). The cruiser-called value of each tree was computed as the sum of the values of the conceptual logs in each tree. The value of each log was computed with prices from the Vancouver log market for the three months ending 15 October 1987 (Table 3.6) and the net volume and FS grades converted from the MacMillan Bloedel cruiser-called grades and net factors.

Multiple linear regression equations were developed to predict the value of individual trees by species. The equations were developed from trees systematically selected from the aggregate data. When possible, 500 trees were selected for each species.
All white pine, Sitka spruce, and lodgepole pine in the aggregate data were used because fewer than 500 trees of these species were available (Table 3.2). Equations were not developed for alder and maple because they were considered non-commercial.

[2] Personal communication with Mr. Bert Vink, MacMillan Bloedel Ltd., Nanaimo B.C., September 1987.
[3] Written by Dr. A. Kozak, Faculty of Forestry, University of British Columbia.

Table 3.6: Three month average log prices in the Vancouver log market for major coastal loggers (15 October 1987)

Value/m³ ($) by grade. Column order: Douglas-fir, Cedar, Balsam (a), Spruce, Cypress, Pine (b)

Peelers
A  257.90  -  -
B  134.04  -  -
C  76.59  -  68.39  116.63
Lumber
D  189.54  146.15  89.35  352.19  658.81  74.83
E  -  -  -  382.77  682.73
F  -  136.07  -  242.24  486.80
G  -  153.21  396.20
Sawlogs
H  65.14  77.29  61.51  104.94  362.44  44.32
I  52.32  59.29  52.74  69.57  257.04  29.12
J  34.12  45.53  34.41  32.48  97.03  14.98
Shingles
K  -  69.17  -  -  -  -
L  -  57.71  -  -  -
M  -  40.96  -  -  -
Pulp
X  23.22  16.71  24.22  25.69  81.34  18.63
Y  13.42  0.12  13.54  1.20  31.73  6.33
Z  0.00  0.00  0.00  0.00  0.00  0.00

(a) includes Hemlock
(b) includes White and Lodgepole pine

The variables considered as predictors of cruiser-called value were: dbh, dbh², the eight FS pathological indicators, the seven FS quality indicators, and various transformations and interactions of the variables. Height was not considered as a predictor variable because it is expensive to measure and is often difficult to estimate visually in old-growth forests. Likewise, pathological risk group was not considered because it is a function of several of the pathological indicators. Predictor variables were selected using ordinary least squares, stepwise multiple linear regression with inclusion probability of partial F = 0.15. All variables included in the final equations were significant with probability of a larger t of less than 1%.
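The cruiser-called value computation described in this section (the sum over conceptual logs of net volume times the market price for the log's grade) can be sketched as follows. This is an illustration only: the function name, the representation of the net factor as a fraction between 0 and 1, and the example logs are assumptions made here; the prices are the Douglas-fir entries for grades D, H, I, J, and X in Table 3.6.

```python
# Douglas-fir prices ($/m3) for a few grades, taken from Table 3.6.
DOUGLAS_FIR_PRICES = {"D": 189.54, "H": 65.14, "I": 52.32, "J": 34.12, "X": 23.22}


def cruiser_called_value(logs, prices):
    """Sum of log values for one tree.

    logs   -- list of (gross_volume_m3, net_factor, grade); net_factor
              is assumed to be the sound fraction of the log (0..1)
    prices -- mapping from FS letter grade to price per cubic metre
    """
    return sum(volume * net_factor * prices[grade]
               for volume, net_factor, grade in logs)


# A hypothetical tree bucked into three conceptual logs.
example_logs = [(2.0, 1.0, "H"), (1.5, 0.9, "I"), (0.8, 1.0, "J")]
tree_value = cruiser_called_value(example_logs, DOUGLAS_FIR_PRICES)
print(round(tree_value, 2))
```

In the study the log volumes came from taper equations and the grades from the converted cruiser calls; here they are simply typed in to show the arithmetic.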
Table 3.7 gives the equations used to predict the cruiser-called value of individual trees. The variables were coded as specified by the Ministry of Forests (1980) and are defined as: D - diameter at breast height (m); P1 - occurrence of conks; P3 - occurrence of scars; P4 - occurrence of fork or crook; P8 - occurrence of dead top; Q1 - degree of spiral grain; Q2 - degree of sweep; Q3 - degree of lean; Q4 - number of first 5 m log with live limbs; Q5 - number of first 5 m log with stubs; Q7 - number of quarters free of knots in the second 5 m log.

Table 3.7: Equations for estimating cruiser-called tree value ($)

| Species | N | R² | Equation |
|---|---|---|---|
| Hemlock | 500 | 0.85 | Value = -55.42 + 0.07639844 D² - 0.006948383 D²P8 - 0.005149949 D²Q7 - 0.006341321 D²P1 |
| Douglas-fir | 500 | 0.85 | Value = -47.74 + 0.01296585 D²Q4 + 0.001599439 D²Q5² - 0.02769457 D²Q1 - 0.01035615 D²P3 |
| Balsam | 500 | 0.92 | Value = 0.03534327 D² + 0.009230732 D²Q4 - 0.02693984 D²Q1 - 18.2349479 Q4 |
| Cedar | 500 | 0.73 | Value = -191.48 + 5.20317012 D - 0.003132438 D²P3 + 0.00494992 D²Q5 + 0.001897799 D²Q4 |
| Cypress | 500 | 0.84 | Value = 0.02396627 D²Q5 - 0.02974097 D²P3 + 0.15342899 D² - 0.03404306 D²P4 - 0.03690323 D²Q1² + 0.006073705 D²Q4 - 27.28539515 Q4² |
| White Pine | 91 | 0.75 | Value = -97.33 + 2.81186196 D + 0.005177247 D²Q4 - 0.004884537 D²Q3 |
| Spruce | 27 | 0.89 | Value = 1149.39 - 451.41278 Q7 - 0.005896187 D²Q4² - 0.02561579 D²P4 + 0.0413201 D²Q4 + 38.28417072 Q7² |
| Lodgepole Pine | 11 | 0.94 | Value = 0.004822402 D²Q4 - 3.36711508 Q2² |

3.4 Sampling Strategies

3.4.1 Forest Service Sampling Strategy

The FS sampling strategy was the basis for comparing the model-dependent sampling strategies. For the FS strategy, the simulations assumed that the eight pathological and seven quality indicators were recorded for every point-selected tree and height was measured for three trees in each plot and visually estimated for the remaining trees. Value/ha ($\bar{y}_1$) was estimated as

$$\bar{y}_1 = \frac{BAF}{n}\sum_{i=1}^{n}\sum_{j=1}^{M_i}\frac{y_{ij}}{b_{ij}} \tag{3.30}$$

where $y_{ij}$ is the cruiser-called value of tree j in plot i, $b_{ij}$ is the basal area of tree j in plot i, n is the number of point-sample plots, and $M_i$ is the number of point-selected trees in plot i. The sample variance of value/ha ($s^2_{\bar{y}_1}$) was estimated as

$$s^2_{\bar{y}_1} = \frac{BAF^2}{n(n-1)}\left[\sum_{i=1}^{n}\left(\sum_{j=1}^{M_i}\frac{y_{ij}}{b_{ij}}\right)^2 - \frac{1}{n}\left(\sum_{i=1}^{n}\sum_{j=1}^{M_i}\frac{y_{ij}}{b_{ij}}\right)^2\right] \tag{3.31}$$

This is the usual procedure and formulae for estimating volume, except value is the characteristic of interest. The FS cruise procedure was reviewed in Chapter 2 (section 2.6) and was described in detail by the Ministry of Forests (1980).

3.4.2 Model-Dependent Sampling Strategies

Eighteen model-dependent sampling strategies were developed using six subsampling methods and three estimators. The six subsampling methods included three probability-based and three purposive-based methods. Each method used the ratio, average ratio, and regression estimators. The subsampling methods differed only in how individual trees were selected from within the point-sample plots. The first-stage sample was randomly selected point-sample plots for all strategies and estimated the sum of the covariate (average estimated value/ha). The second-stage sample defined the relationship between cruiser-called value and estimated value from individual trees.

The three probability-based subsampling methods included selecting trees from within point-sample plots: i) at random; ii) with probability proportional to estimated value; and iii) using 3P sampling (point-3P sampling). Trees had a non-zero chance of inclusion with the probability-based subsampling methods and a zero or one probability with the purposive-based methods. The three purposive-based subsampling methods included selecting trees from within point-sample plots: i) with the highest estimated value; ii) with estimated values within a given range; and iii) with estimated values above a given threshold value.
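As a concrete reference point for the comparisons that follow, the FS expansion estimator of value/ha and the usual sample variance of the plot mean (section 3.4.1) can be sketched as below. The function name and the small data set are hypothetical; the calculation is the standard point-sampling expansion in which each tallied tree contributes its value divided by its basal area.

```python
def fs_value_per_ha(baf, plots):
    """Expansion estimate of value/ha and the variance of that mean.

    baf   -- basal area factor (m2/ha)
    plots -- one list per plot of (cruiser_called_value, basal_area)
             tuples for every point-selected tree
    """
    n = len(plots)
    # Sum of value / basal area for the trees tallied in each plot.
    plot_sums = [sum(y / b for y, b in plot) for plot in plots]
    mean_value = baf * sum(plot_sums) / n
    # Corrected sum of squares of the plot sums.
    ss = sum(z * z for z in plot_sums) - sum(plot_sums) ** 2 / n
    variance_of_mean = baf ** 2 * ss / (n * (n - 1))
    return mean_value, variance_of_mean


# Three plots with assumed tree values ($) and basal areas (m2).
plots = [[(100.0, 0.5)], [(300.0, 0.5), (50.0, 0.25)], [(80.0, 0.2)]]
mean_value, variance_of_mean = fs_value_per_ha(10.0, plots)
print(round(mean_value, 2), round(variance_of_mean, 2))
```

The square root of the returned variance is the standard error that would be compared with the precision targets described in section 2.6.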
The eighteen sampling strategies were considered to be model-dependent because they are model-unbiased under their respective superpopulation models and they do not consider the probability of selection in the estimators. The three probability-based subsampling methods would be design-dependent (conventional probability designs) if the selection probability was considered in the estimators.

Prior estimates of population characteristics were not used in the model-dependent sampling strategies because of the restriction of only one visit to the forest area. A preliminary sample of the forest can be used to compute sample size which usually results in a more efficient survey. However, this is usually not economically feasible in coastal B.C. where forest sampling is often in remote areas. Furthermore, preliminary sample plots must be located throughout the entire forest area to provide meaningful estimates. Thus the tract must be fully traversed on two occasions which results in greatly increased travel costs.

The first-stage selection of trees was point-sample plots for all sampling strategies. Ideally, first-stage selection should explicitly consider tree value. However, it is time consuming and expensive to estimate tree value. Tree value is roughly related to BA for a given species, hence point-samples should be roughly related to value.

Scatterplots of all trees in each test population showed that cruiser-called tree value (y) was roughly related to estimated tree value (x) by a straight line passing through or near the origin. Hence the finite test populations created for this study could have been generated by a hypothetical superpopulation described as

$$y_k = \beta_1 x_k + \epsilon_k [v(x_k)]^{1/2} \tag{3.32}$$

where $\epsilon_k$ are independent and identically distributed random variables with mean zero and variance $\sigma^2$. However, the scatterplots showed that the variance was not a simple function of $x_k$ and appeared to differ among test populations. The variance of y appeared to be an increasing function of x with a possible constant component. Consequently, the model-dependent sampling strategies were designed under three superpopulation models: $\xi[0,1{:}x]$, $\xi[0,1{:}x^2]$, and $\xi[0,1{:}1]$, where

$$y_k = \beta_1 x_k + \epsilon_k x_k^{1/2} \tag{3.33}$$

$$y_k = \beta_1 x_k + \epsilon_k x_k \tag{3.34}$$

$$y_k = \beta_1 x_k + \epsilon_k \tag{3.35}$$

These models describe the variance as proportional to x, x², and a constant, respectively. The optimal estimators under these models are $T[0,1{:}x]$, $T[0,1{:}x^2]$, and $T[0,1{:}1]$; the ratio, average ratio, and regression estimator (without intercept), respectively. Accordingly, the estimators used in this study were the ratio estimator

$$\bar{y}_2 = \left[\frac{BAF}{n}\sum_{i=1}^{n}\sum_{j=1}^{M_i}\frac{x_{ij}}{b_{ij}}\right]\left[\frac{\sum_{i=1}^{n}\sum_{j=1}^{m_i} y_{ij}}{\sum_{i=1}^{n}\sum_{j=1}^{m_i} x_{ij}}\right] \tag{3.36}$$

the average ratio estimator

$$\bar{y}_3 = \left[\frac{BAF}{n}\sum_{i=1}^{n}\sum_{j=1}^{M_i}\frac{x_{ij}}{b_{ij}}\right]\left[\frac{1}{m}\sum_{i=1}^{n}\sum_{j=1}^{m_i}\frac{y_{ij}}{x_{ij}}\right] \tag{3.37}$$

and the regression estimator (without intercept)

$$\bar{y}_4 = \left[\frac{BAF}{n}\sum_{i=1}^{n}\sum_{j=1}^{M_i}\frac{x_{ij}}{b_{ij}}\right]\left[\frac{\sum_{i=1}^{n}\sum_{j=1}^{m_i} x_{ij}y_{ij}}{\sum_{i=1}^{n}\sum_{j=1}^{m_i} x_{ij}^2}\right] \tag{3.38}$$

where $x_{ij}$ is the estimated value of tree j in plot i, $m_i$ is the number of subsampled trees in plot i, and the other terms are as previously defined.

Each of the three estimators has two terms. The first term is common to all three and gives the average estimated value/ha for the n point-sample plots. The second term is different for each and estimates the relationship between cruiser-called and estimated tree value. The ratio estimator (eq. 3.36) is the average plot estimate of estimated value/ha, multiplied by the ratio of the total of y to the total of x for all the subsampled trees. The average ratio estimator (eq. 3.37) is analogous to the ratio estimator except the ratio is computed as the average of the ratios of y/x for all the subsampled trees. The ratio estimator is also known as the ratio of means estimator and the average ratio estimator is also known as the means of ratios estimator.
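The two-term structure of the three estimators can be made concrete with a short sketch. The function names and the toy data are hypothetical; each "slope" function corresponds to the second term of one estimator (ratio of totals, mean of ratios, and least-squares slope through the origin), and each is multiplied by the same first-stage average of estimated value/ha.

```python
def first_stage_mean(baf, plots):
    """Average estimated value/ha over the point-sample plots.

    plots -- one list per plot of (estimated_value, basal_area) tuples
             for every point-selected tree
    """
    n = len(plots)
    return baf * sum(x / b for plot in plots for x, b in plot) / n


def ratio_slope(pairs):
    """Ratio of means: total y over total x (variance proportional to x)."""
    return sum(y for _, y in pairs) / sum(x for x, _ in pairs)


def average_ratio_slope(pairs):
    """Mean of ratios y/x (variance proportional to x squared); needs x > 0."""
    return sum(y / x for x, y in pairs) / len(pairs)


def regression_slope(pairs):
    """Least-squares slope through the origin (constant variance)."""
    return sum(x * y for x, y in pairs) / sum(x * x for x, _ in pairs)


# Assumed first-stage tally and (estimated, cruiser-called) subsample pairs.
plots = [[(100.0, 0.5), (400.0, 1.0)], [(200.0, 0.8)]]
subsample = [(100.0, 110.0), (200.0, 180.0), (400.0, 440.0)]

x_bar = first_stage_mean(9.2, plots)
estimates = {
    "ratio": x_bar * ratio_slope(subsample),
    "average ratio": x_bar * average_ratio_slope(subsample),
    "regression": x_bar * regression_slope(subsample),
}
for name, value in estimates.items():
    print(name, round(value, 2))
```

When every subsampled tree has exactly the same y/x ratio the three slopes coincide; they differ only in how much weight trees with large estimated values receive.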
The variance of the sample estimates from the randomly selected point-sample plots was computed using either the jackknife or combined SE methods. The jackknife was used to estimate the sample variance of the random, PPS, and purposive subsampling methods (Table 3.8) where all plots contained at least one sample tree. The jackknife was not used for bias reduction. The combined SE method was used to estimate the sampling variance for the 3P, target, and threshold strategies where some plots might not contain sample trees.

Table 3.8: Estimators for the 19 sampling strategies

| Strategy | Subsampling Method | Point Estimator | Variance Estimator |
|---|---|---|---|
| Forest Service | | | |
| 1 | none | $\bar{y}_1$ expansion | sample variance |
| Probability-Based Subsampling Methods | | | |
| 2 | random | $\bar{y}_2$ ratio | jackknife |
| 3 | random | $\bar{y}_3$ average ratio | jackknife |
| 4 | random | $\bar{y}_4$ regression | jackknife |
| 5 | PPS | $\bar{y}_2$ ratio | jackknife |
| 6 | PPS | $\bar{y}_3$ average ratio | jackknife |
| 7 | PPS | $\bar{y}_4$ regression | jackknife |
| 8 | 3P | $\bar{y}_2$ ratio | combined SE |
| 9 | 3P | $\bar{y}_3$ average ratio | combined SE |
| 10 | 3P | $\bar{y}_4$ regression | combined SE |
| Purposive-Based Subsampling Methods | | | |
| 11 | purposive | $\bar{y}_2$ ratio | jackknife |
| 12 | purposive | $\bar{y}_3$ average ratio | jackknife |
| 13 | purposive | $\bar{y}_4$ regression | jackknife |
| 14 | target | $\bar{y}_2$ ratio | combined SE |
| 15 | target | $\bar{y}_3$ average ratio | combined SE |
| 16 | target | $\bar{y}_4$ regression | combined SE |
| 17 | threshold | $\bar{y}_2$ ratio | combined SE |
| 18 | threshold | $\bar{y}_3$ average ratio | combined SE |
| 19 | threshold | $\bar{y}_4$ regression | combined SE |

3.4.2.1 Probability-Based Subsampling Methods

The three probability-based methods for subsampling trees within point-sample plots included: i) random selection of trees (with replacement); ii) selecting trees with probability proportional to estimated value (with replacement); and iii) 3P subsampling of trees from all point-selected trees across the n plots (point-3P sampling). The ratio, average ratio, and regression estimator were used with all three subsampling methods. The superpopulation models were used to select the estimators, but were not considered in the subsampling method. The probability of selection was not considered in the estimators.

Random Subsampling

The random subsampling method selected trees with equal probability with replacement from within point-sample plots. All trees in the plot were subsampled when the plot contained fewer than the desired total number of subsample trees. Value/ha was estimated with the ratio estimator $\bar{y}_2$ (eq. 3.36), the average ratio estimator $\bar{y}_3$ (eq. 3.37), and the regression estimator $\bar{y}_4$ (eq. 3.38). Sample variance was estimated with the jackknife

$$v_J = BAF^2\,\frac{n-1}{n}\sum_{i=1}^{n}\left(\hat{\theta}_{(i)} - \bar{\theta}\right)^2 \tag{3.39}$$

where $\hat{\theta}_{(i)}$ is the average estimated value/ha for the n − 1 plots where the ith plot is omitted, and $\bar{\theta} = \sum_{i=1}^{n}\hat{\theta}_{(i)}/n$ is the average estimated value/ha of the n pseudoestimates where the ith plot is omitted (Miller 1974). This technique assumes that the plots were randomly selected and were independent and identically distributed (Bissell 1975).

PPS Subsampling

The PPS subsampling method selected trees with probability proportional to estimated tree value (PPS) with replacement within plots. This was PPS list sampling within plots. Value/ha was estimated with the ratio estimator $\bar{y}_2$ (eq. 3.36), the average ratio estimator $\bar{y}_3$ (eq. 3.37), and the regression estimator $\bar{y}_4$ (eq. 3.38). The variance of the PPS estimates was computed with the jackknife (eq. 3.39).
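The delete-one-plot jackknife used for these strategies can be sketched generically. The code below is an illustration under stated assumptions: the function names, the dictionary layout of a plot, and the data are hypothetical, and the leave-one-out estimates are computed directly in value/ha (so the BAF is applied inside the estimator rather than as a separate multiplier).

```python
def jackknife_variance(plots, estimator):
    """Delete-one jackknife variance of estimator(plots).

    plots     -- list of sampling units (one entry per plot)
    estimator -- function mapping a list of plots to a single estimate
    """
    n = len(plots)
    # Recompute the estimate n times, each time omitting one plot.
    leave_one_out = [estimator(plots[:i] + plots[i + 1:]) for i in range(n)]
    centre = sum(leave_one_out) / n
    return (n - 1) / n * sum((t - centre) ** 2 for t in leave_one_out)


def ratio_estimate(plots, baf=9.2):
    """Ratio-estimator value/ha from first-stage and subsample data."""
    n = len(plots)
    x_per_ha = baf * sum(x / b for p in plots for x, b in p["trees"]) / n
    pairs = [pair for p in plots for pair in p["subsample"]]
    return x_per_ha * sum(y for _, y in pairs) / sum(x for x, _ in pairs)


# Assumed plots: "trees" holds (estimated value, basal area) for every
# tallied tree; "subsample" holds (estimated, cruiser-called) value pairs.
plots = [
    {"trees": [(100.0, 0.5), (400.0, 1.0)], "subsample": [(400.0, 440.0)]},
    {"trees": [(200.0, 0.8)], "subsample": [(200.0, 180.0)]},
    {"trees": [(300.0, 0.6), (150.0, 0.3)], "subsample": [(300.0, 310.0)]},
]
variance = jackknife_variance(plots, ratio_estimate)
print(round(variance, 2))
```

For a simple mean the same function reproduces the familiar sample variance divided by n, which is a convenient check on the implementation.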
As for the random subsampling method, all trees in the plot were subsampled when the plot contained fewer than the desired number of subsample trees.

3P Subsampling

The 3P subsampling method selected individual trees with probability proportional to estimated value. The probability of selection was considered over the predicted total estimated value of all point-selected trees in the n plots (point-3P sampling). Value/ha was estimated with the ratio estimator (eq. 3.36), the average ratio estimator (eq. 3.37), and the regression estimator (eq. 3.38). This was not conventional point-3P sampling because the estimator differed from that given by Grosenbaugh (1971, 1979) (eq. 2.28).

The relative variance of the estimators was approximated by combining relative components of variance

\[ CV_T^2 = CV_1^2 + CV_R^2 \tag{3.40} \]

and

\[ CV_1^2 = \frac{1}{n-1} \left[ \frac{ n \sum_{i=1}^{n} \left( \sum_{j=1}^{M_i} x_{ij}/b_{ij} \right)^2 }{ \left( \sum_{i=1}^{n} \sum_{j=1}^{M_i} x_{ij}/b_{ij} \right)^2 } - 1 \right] \tag{3.41} \]

\[ CV_R^2 = \frac{1}{m-1} \left[ \frac{ m \sum_{i=1}^{n} \sum_{j=1}^{m_i} \left( y_{ij}/x_{ij} \right)^2 }{ \left( \sum_{i=1}^{n} \sum_{j=1}^{m_i} y_{ij}/x_{ij} \right)^2 } - 1 \right] \tag{3.42} \]

where \(CV_1^2\) is the relative variance of estimated value/ha over the n sample plots and \(CV_R^2\) is the relative variance of the ratio of true to estimated value (y/x) over the m subsampled trees. This is another expression for Bruce's (1961) method of combining SEs.

3.4.2.2 Purposive-Based Subsampling Methods

The three purposive-based subsampling methods explicitly considered the superpopulation models in the subsampling methods and in the estimators. As with the probability-based subsampling methods, the probability of selection was not considered in the estimators. The optimal sample (lowest sample variance) under the hypothetical superpopulation model (eq. 3.32) that could have generated the three test populations was to select the m trees with the largest values of x. Hence, \(x_k\) must be known for all k = 1, ..., N trees in the population. The optimal sample cannot be observed without visiting every tree prior to selecting the sample.
This is usually not possible in operational cruising except for very small areas and under special circumstances. However, three subsampling methods were formulated to take advantage of the hypothetical superpopulation models while considering the practical nature of operational cruising.

Purposive Subsampling

The purposive subsampling method selected trees with the highest estimated value in each plot. All the point-selected trees were sampled when fewer than the desired number of trees were in the plot. When the total estimated plot value was zero, the subsample trees were randomly selected from trees in the plot. The objective of this subsampling method was to approximate the optimal sample under the superpopulation model by selecting trees with the highest estimated value, but also to provide a simple method of selecting trees and to ensure the subsample was distributed throughout the population by taking trees from all plots.

Value/ha was estimated with the ratio estimator \(\bar{y}_2\) (eq. 3.36), the average ratio estimator \(\bar{y}_3\) (eq. 3.37), and the regression estimator \(\bar{y}_4\) (eq. 3.38). The sample variance of the estimators was approximated with the jackknife (eq. 3.39).

Target Subsampling

The target subsampling method purposively selected all point-selected trees within a given range of estimated value. Trees were selected when the estimated value was within ± δ dollars of an arbitrary target value. Hence the number of trees selected in a plot and the total number of subsampled trees over the n plots were random variables. The target value was the known average estimated tree value for the test population (\(\mu_x\)). The objective of this subsampling method was to provide a sample balanced on the first moment of x (estimated tree value). This method was to define the ratio between cruiser-called value and estimated value from trees restricted to the narrowest possible range of estimated tree value centered at \(\mu_x\).

Value/ha was estimated with the ratio estimator \(\bar{y}_2\) (eq. 3.36), the average ratio estimator \(\bar{y}_3\) (eq. 3.37), and the regression estimator \(\bar{y}_4\) (eq. 3.38). The sample variance was approximated by combining relative variances (eq. 3.40).

Threshold Subsampling

The threshold subsampling method purposively selected all point-selected trees greater than an arbitrary threshold value. The objective was to approximate the optimal sample (selecting the m trees with the highest estimated value) without prior knowledge of the population. This method was similar to the purposive subsampling method, but tree selection was not restricted within plots. This method can be thought of as the a priori equivalent to sampling all trees in the stratum containing the m trees with highest estimated value.

Value/ha was estimated with the ratio estimator \(\bar{y}_2\) (eq. 3.36), the average ratio estimator \(\bar{y}_3\) (eq. 3.37), and the regression estimator \(\bar{y}_4\) (eq. 3.38). The sample variance was approximated by combining relative variances (eq. 3.40).

3.5 Strategy Evaluation

3.5.1 Data Assumptions

The test populations comprised data from point-sample plots taken in operational cruises. However, some assumptions must be accepted so the results of testing the sampling designs in these artificial populations can be extended to the real world. The assumptions are:

1. The test populations provided a distribution of point-sample plots that was representative of randomly located plots in a forest containing the same trees.

2. The individual tree data were measured without error.

3. The value of a felled and bucked tree (cruiser-called value) was the sum of the log values from the MacMillan Bloedel cruiser-called grades (converted to FS grades) given in the Vancouver log market.

4. The cruiser-called value of each tree was observed in the field when height was measured and the necessary time was taken to call grade the tree, or when height was measured and all the FS pathological and quality indicators were taken.

3.5.2 Sample Size

The model-dependent subsampling methods were two-stage designs. First-stage sample sizes were arbitrarily chosen as n = 20, 40, and 60 plots. Twenty plots is a small sample and 60 plots is a more realistic sample size. The mid-point of 40 plots was chosen to reveal nonlinear response to sample size. The desired total number of second-stage sample trees (m) for each sample of n plots was arbitrarily chosen as m = n, 3n, and 5n. These sample sizes provided reasonable numbers of trees at the upper and lower range of the first-stage sample sizes. Table 3.9 gives the desired total number of sample trees for the nine combinations of the three first-stage and three second-stage sample sizes.

Table 3.9: Desired total number of subsampled trees

First-stage Sample Size    Second-stage Sample Size
(Number of plots)          (Number of subsampled trees)
        20                    20      60     100
        40                    40     120     200
        60                    60     180     300

The desired number of subsampled trees for the random, PPS, and purposive subsampling methods (strategies 2-7 and 11-13, Table 3.8) was \(m_i\) = 1, 3, and 5 trees/plot, i.e., the desired subsample size (\(m_i\)) was constant for each plot. When a plot contained fewer than the desired \(m_i\) trees, all \(M_i\) trees in the plot were subsampled. Subsample size was a random variable for the 3P method (strategies 8-10), the target method (strategies 14-16), and the threshold method (strategies 17-19). The desired total number of subsampled trees for 3P subsampling was achieved by determining the appropriate KZ values by trial and error (Table 3.10).
KZ is usually computed as \(m = \sum x_i / KZ\) so that one sample unit is taken on the average for every unit of KZ. Deviation (δ) from the target value in the target subsampling method was determined by trial and error to give the desired total number of second-stage samples (m) on the average (Table 3.11). Threshold values for the threshold subsampling method also were determined by trial and error to give the desired total number of second-stage samples on the average (Table 3.12).

Table 3.10: KZ values ($) for 3P subsampling

Desired Total Number of        Test Population
Subsampled Trees (m)         TP1      TP2      TP3
        n                   1800     3100     2800
        3n                   390      875      650
        5n                    75      375      155

Table 3.11: Target and deviation values ($) for target subsampling

Desired Total Number of              Test Population
Subsampled Trees (m)         TP1          TP2          TP3
        n                 280 ± 75     325 ± 55     430 ± 105
        3n                280 ± 200    325 ± 155    430 ± 280
        5n                280 ± 280    325 ± 240    430 ± 420

Table 3.12: Threshold values ($) for threshold subsampling

Desired Total Number of        Test Population
Subsampled Trees (m)         TP1      TP2      TP3
        n                    525      750      800
        3n                   165      325      270
        5n                    35      160       70

3.5.3 Cost Functions

Cost functions were developed to estimate the total variable cost of sampling for each strategy. The average time to complete the various point-sampling activities was estimated in consultation with MacMillan Bloedel field inventory staff.⁴ Estimated times were for a two-man crew and average coastal forest conditions, i.e., average slope, weather, brush, and old-growth stand conditions. Sampling cost was computed using a crew cost of $500.00/day multiplied by the estimated sampling time for a given sampling strategy. The estimated daily crew cost included wages and payroll loading for a two-man crew for an eight-hour day.
Sampling cost (C) was estimated as

\[ C = 1.04 \left[ T_{tr} + T_{st} + T_{ret} \right] \tag{3.43} \]

where the multiplier 1.04 was crew cost (dollars/minute); \(T_{tr}\) was the total time (minutes) to travel between the n plots; \(T_{st}\) was the total sampling time in the n sample plots; and \(T_{ret}\) was the additional travel time to return from the last plot of the day to the point of commencement, and then return to that plot at the beginning of the next day (when more than one day was required to complete the survey).

⁴ Messrs. Bert Vink, John Ahokas, Geoff Childs, Ken Epps, and Mark Weeks.

Total travel time between the n plots within the sample area (\(T_{tr}\)) was estimated by

\[ T_{tr} = 13.3^{-1} \left[ \sqrt{An} + 2\sqrt{A} \right]. \tag{3.44} \]

The multiplier \(13.3^{-1}\) converted distance (m) to time (minutes) based on a crew speed of 200 m in 15 minutes for chaining and compassing. The bracketed term estimated distance travelled between plots within the sample area, where A is sample area (m²). The term \(\sqrt{An}\) approximated the travel distance between the n plots. The term \(2\sqrt{A}\) approximated the distance between lines in the sample transect grid, and return from the point of termination to the point of commencement.

Total sampling time within plots was estimated with two functions. For the FS sampling strategy, total sampling time (\(T_{st(1)}\)) was estimated as

\[ T_{st(1)} = \sum_{i=1}^{n} \left[ t_{est} + \sum_{k=1}^{8} a_k \sum_{j=1}^{r_k} t_{(1)ikj} + 3\,t_{ht} \right] = 20n + \sum_{i=1}^{n} \sum_{k=1}^{8} a_k \sum_{j=1}^{r_k} t_{(1)ikj} \tag{3.45} \]

The coefficient \(t_{est}\) was a fixed time of 5 minutes to establish the plot, including unpacking equipment, establishing plot centre, marking bearing trees, and so on. The indicator variable \(a_k\) = 1 if species k was present in the plot and \(a_k\) = 0 if the species was not present. The coefficient \(t_{(1)ikj}\) was the average time to measure characteristics for the jth tree of the kth species in the ith plot. This included measured dbh, visually estimated height, the eight FS pathological indicators, and the seven FS quality indicators as specified by the Ministry of Forests (1980).
The time required to sample each tree was estimated as 5 minutes for cedar and spruce, and 4 minutes for Douglas-fir, hemlock, balsam, cypress, white pine, and lodgepole pine. The average time to measure the height of one tree (\(t_{ht}\)) was estimated as 5 minutes. The multiplier 3 was for an average of three tree heights per plot.

Total sampling time for the model-dependent methods (\(T_{st(2)}\)) where trees were subsampled was estimated as

\[ T_{st(2)} = \sum_{i=1}^{n} \left[ t_{est} + \sum_{k=1}^{8} a_k \sum_{j=1}^{r_k} t_{(2)ikj} + m_i\,t_{true} \right]. \tag{3.46} \]

The coefficient \(t_{est}\) and the indicator variable \(a_k\) are as previously defined. The coefficient \(t_{(2)ikj}\) was the time to estimate the value of the jth tree of the kth species in the ith plot. This was estimated as 4 minutes for cedar, spruce, and cypress, 3 minutes for Douglas-fir and hemlock, and 2 minutes for balsam, white pine, and lodgepole pine. These times included measuring dbh and recording the variables required to estimate the value of the given species (Table 3.7). The time required to select the appropriate subsample trees was assumed to be included. The coefficient \(t_{true}\) was the time to supplement previously gathered information to call grade a subsampled tree. This was estimated as 5 minutes/tree for all species, and included time to measure tree height. \(m_i\) was the number of subsampled trees in plot i. Table 3.13 gives the times required to collect individual tree data for the various strategies and activities by species.

Table 3.13: Time (minutes) required for a two-man crew to collect individual tree data by species for various activities of the sampling strategies

                     FS         Estimating              Supplement
Species              Strategy   Value        Height     for value(a)
Hemlock                 4          3            5            5
Douglas-fir             4          3            5            5
Balsam                  4          2            5            5
Cedar                   5          4            5            5
Cypress                 4          4            5            5
White Pine              4          2            5            5
Sitka Spruce            5          4            5            5
Lodgepole Pine          4          2            5            5

(a) Crew time required to supplement data collected to estimate value to determine true value (includes measuring tree height).
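The travel-time and within-plot sampling-time functions (eqs. 3.44 to 3.46) lend themselves to a short worked sketch. The code below is illustrative only: the function names, the dictionary keys, and the representation of a plot as a list of species names are assumptions made for this example, while the per-tree minutes are taken from Table 3.13.

```python
import math

T_EST = 5    # minutes to establish a plot
T_HT = 5     # minutes to measure one tree height
T_TRUE = 5   # minutes to supplement data and call grade one subsampled tree

# Minutes per tree by species (Table 3.13); the keys are hypothetical labels.
FS_MINUTES = {"hemlock": 4, "douglas-fir": 4, "balsam": 4, "cedar": 5,
              "cypress": 4, "white pine": 4, "spruce": 5, "lodgepole pine": 4}
EST_MINUTES = {"hemlock": 3, "douglas-fir": 3, "balsam": 2, "cedar": 4,
               "cypress": 4, "white pine": 2, "spruce": 4, "lodgepole pine": 2}


def travel_time(n_plots, area_m2):
    """Eq. 3.44: minutes to travel between plots at 13.3 m/minute."""
    return (math.sqrt(area_m2 * n_plots) + 2 * math.sqrt(area_m2)) / 13.3


def fs_sampling_time(plots):
    """Eq. 3.45: every tree fully measured, plus three heights per plot."""
    return sum(T_EST + sum(FS_MINUTES[sp] for sp in plot) + 3 * T_HT
               for plot in plots)


def md_sampling_time(plots, subsampled):
    """Eq. 3.46: value estimated on every tree, m_i trees call graded."""
    return sum(T_EST + sum(EST_MINUTES[sp] for sp in plot) + m_i * T_TRUE
               for plot, m_i in zip(plots, subsampled))


# Two hypothetical plots, one subsampled tree in each.
plots = [["cedar", "hemlock", "hemlock"], ["douglas-fir"]]
fs_minutes = fs_sampling_time(plots)
md_minutes = md_sampling_time(plots, [1, 1])
```

With one subsampled tree per plot, the model-dependent time is lower than the FS time in this toy case because the fixed three-height allowance is replaced by a single supplemented tree.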
Additional time was included to return from the last sample plot of the day to the point of commencement (\(T_{ret}\)), and then return to that plot the next day. This was estimated as

\[ T_{ret} = 45\,(\mathrm{int}) \left[ \frac{T_{tr} + T_{st}}{480} \right]. \tag{3.47} \]

The multiplier 45 was the estimated time (minutes) to relocate the last plot of the previous day, and to return to the point of commencement at the end of the day. The expression (int) indicates conversion to integer format by rounding the coefficient down to the nearest integer. Thus 45 minutes were added for each additional day required to complete the survey. The final return to the point of commencement was the term \(2\sqrt{A}\) in \(T_{tr}\). The denominator 480 was the number of minutes in an eight-hour work day.

3.5.4 Simulation Program Design

The Monte Carlo simulation program was written in Turbo C,⁵ version 1.5. The program was developed and tested on an IBM PS/2 Model 50 under IBM DOS 3.0. The code was recompiled with the ANSI compatible C87⁶ compiler on the Amdahl 5850 mainframe computer at the University of B.C. The program was tested on the mainframe to ensure that results were the same as in the micro environment.

The simulation program randomly accessed data for individual plots in binary files for each test population. The random numbers were produced with a multiplicative congruential generator. A unique random number seed was used for each simulation. The seed was the elapsed time in seconds since 00:00:00 Greenwich Mean Time, 1 January 1970.

⁵ Borland International Inc., 4585 Scotts Valley Dr., Scotts Valley, California, 95066.
⁶ MTS C Staff, University of Michigan Computing Center, Ann Arbor, Michigan. Computing Center Memo 481, 48 pp.

The simulation program used two binary data files for each test population. One file contained tree data grouped by plot. This included height, dbh, volume, net volume, cruiser-called value, and estimated value for each tree.
The other binary file contained the number of trees and the byte-offset for the first tree in each plot in the plot data file. Plots were located by loading the binary index data into a vector of structures containing the number of trees and byte-offset of each plot. The vector index number corresponded to the plot number.

The simulations used 2000 samples of n randomly selected plots for each of the 19 strategies. Sample sizes of n = 20, 40, and 60 plots with desired total subsample sizes of m = n, 3n, and 5n trees were used with the 18 model-dependent strategies in each of the three test populations. Thus 39.6 million plots were analyzed from 495 sets of 2000 samples with an average of 40 plots per sample. Each of the 2000 samples used a new random set of n sample plots. The data were loaded using dynamic allocation of memory, which was freed at the end of each simulation. Statistics and a summary for each simulation were printed to the screen and to disk files.

3.5.5 Evaluation Criteria

The 2000 estimates of value/ha for each simulation were evaluated at each sample size for bias, sample variance, cost, and confidence interval coverage. Achieved subsample size was evaluated for the 18 model-dependent sampling strategies that used subsampling. The efficiency of each model-dependent strategy was evaluated on the relative advantage of the strategy in relation to the FS strategy using the same number of plots.

3.5.5.1 Bias

The bias of the 2000 samples of each simulation was computed as

\[ \text{bias} = \frac{ \sum_{j=1}^{2000} \left( \bar{y}_j - \bar{Y} \right) }{2000} \tag{3.48} \]

where \(\bar{y}_j\) was the estimated average cruiser-called dollar value/ha for the jth sample, and \(\bar{Y}\) was the known test population average value/ha (Tables 3.3, 3.4, and 3.5). Bias was expressed as a percentage of \(\bar{Y}\).
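The bias calculation of eq. 3.48 can be illustrated with a small Monte Carlo sketch. This is not the thesis program: the population below is synthetic, the function names are hypothetical, plots are assumed to be drawn without replacement, and a fixed seed replaces the time-based seed so the example is reproducible.

```python
import random


def percent_bias(estimates, true_mean):
    """Average deviation of the estimates from the known mean, as % of the mean."""
    total_deviation = sum(e - true_mean for e in estimates)
    return 100.0 * total_deviation / (len(estimates) * true_mean)


def simulate_means(plot_values, n, reps=2000, seed=1989):
    """Draw `reps` samples of n plots and return the mean of each sample."""
    rng = random.Random(seed)  # fixed seed: an assumption for reproducibility
    return [sum(rng.sample(plot_values, n)) / n for _ in range(reps)]


# Synthetic plot values ($/ha), purely illustrative.
population = [float(v) for v in range(10_000, 60_000, 250)]
true_mean = sum(population) / len(population)
estimates = simulate_means(population, n=20)
bias_pct = percent_bias(estimates, true_mean)
```

Because the plain sample mean is unbiased under simple random sampling, `bias_pct` should be close to zero here; a ratio-type estimator could be substituted for the mean to study its bias in the same way.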
3.5.5.2 Variance

The variance of the estimates for the 2000 samples of each simulation was computed as

\[ \text{variance} = \frac{ \sum_{j=1}^{2000} \left( \bar{y}_j - \bar{y} \right)^2 }{2000 - 1} \tag{3.49} \]

where \(\bar{y}_j\) was the estimated average cruiser-called dollar value/ha for the jth sample, and \(\bar{y} = \sum_{j=1}^{2000} \bar{y}_j / 2000\) is the average of the 2000 estimates of value/ha. The variance was expressed as the ratio of the sample variance for the given strategy to the sample variance of the FS strategy using the same number of plots. Thus, variance ratios greater than unity indicated that the model-dependent strategy was more variable than the FS strategy using the same number of plots.

3.5.5.3 Achieved Subsample Size

The six subsampling methods were evaluated on the average and variability of the achieved subsample size. Subsample size was independent of the estimator, hence results for the six methods were compared using only the ratio estimator. This was expressed as the average and standard deviation of the achieved subsample size for the 2000 samples of each subsampling method and sample size.

3.5.5.4 Cost

Sampling cost was computed with the equations given in section 3.5.3. Sampling cost was also independent of the estimator, hence costs were compared using only the ratio estimator for each subsampling method and sample size. Cost was expressed as the ratio of the average total cost of sampling for a given subsampling method to the average total cost of sampling for the FS strategy using the same number of plots.

3.5.5.5 Relative Advantage

The relative advantage (RA) of model-dependent sampling strategy X in relation to the FS sampling strategy using the same number of plots was computed as

\[ RA_{X:FS} = \frac{ \text{Cost}_{FS} \times \text{Variance}_{FS} }{ \text{Cost}_{X} \times \text{Variance}_{X} } \tag{3.50} \]

Model-dependent strategies with RAs greater than unity were considered to be more efficient than the FS strategy, i.e., the given strategy provided more information for a given cost than the FS strategy.
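The variance ratio and relative advantage criteria amount to two short ratios, sketched below. The function names and the numbers in the usage lines are hypothetical; `statistics.variance` uses an n - 1 divisor, which cancels in the ratio when both strategies have the same number of simulated samples.

```python
from statistics import variance


def variance_ratio(strategy_estimates, fs_estimates):
    """Variance of a strategy's estimates relative to the FS strategy."""
    return variance(strategy_estimates) / variance(fs_estimates)


def relative_advantage(cost_fs, var_fs, cost_x, var_x):
    """RA of strategy X against the FS strategy; values above 1 favour X."""
    return (cost_fs * var_fs) / (cost_x * var_x)


# Hypothetical figures: X costs 20% less but its estimates are 50% more variable.
ra = relative_advantage(cost_fs=1000.0, var_fs=4.0, cost_x=800.0, var_x=6.0)
vr = variance_ratio([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

In this made-up case `ra` falls below unity, so the cheaper strategy would still be judged less efficient than the FS strategy because its variance penalty outweighs the cost saving.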
3.5.5.6 Confidence Interval Coverage

A 95% nominal confidence interval (CI) was computed for each of the 2000 samples assuming independent and normally distributed estimates. The CI was computed as

\[ \bar{y}_j \pm t\,s_{\bar{y}_j} \tag{3.51} \]

where \(\bar{y}_j\) is the jth estimate of value/ha for j = 1, ..., 2000, t is Student's t with n - 1 degrees of freedom at probability 0.025, and \(s_{\bar{y}_j}\) is the SE of the jth estimate computed from either the jackknife or combined SE methods. This was to examine the effect of the variance estimators and sample size. Each CI was evaluated as containing \(\bar{Y}\) (the true value/ha), below \(\bar{Y}\), or above \(\bar{Y}\). The results were given as the percentage of the 2000 CIs that were below \(\bar{Y}\), contained \(\bar{Y}\), or were above \(\bar{Y}\).

Chapter 4

RESULTS

4.1 Tree Value

The variables selected to estimate the cruiser-called value of individual trees were: D (dbh), D², six quality indicators, four pathological indicators, and various interactions of these variables (Table 4.14). The quality indicators were: Q1 - degree of spiral grain; Q2 - degree of sweep; Q3 - degree of lean; Q4 - number of the first 5 m log with live limbs; Q5 - number of the first 5 m log with stubs; Q7 - quarters of the second 5 m log free of knots. The pathological indicators were: P1 - occurrence of conks; P3 - occurrence of scars; P4 - occurrence of fork or crooks; and P8 - occurrence of dead tops. The equations to predict cruiser-called tree value were given in section 3.7. The variables were coded as specified by the Ministry of Forests (1980).

The relationship of cruiser-called value to estimated tree value was roughly linear, passing through the origin, and was highly variable at low estimated values (Figures 4.4, 4.5, and 4.6). Cruiser-called tree value also was linearly related to BA but was highly variable. Table 4.15 gives the average and CV for $BARs and VBARs (from net volume) by species and test population. The average $BAR and VBAR increased from TP1 to TP2 to TP3 for all species except balsam.
This reflects the larger tree size and higher values of TP2 and TP3 (Tables 3.3, 3.4, and 3.5). The $BARs varied greatly among species, which reflected relative value. In contrast, VBARs were relatively constant among species. The CVs of the $BARs and VBARs varied from about 60 to 100%, and 25 to 45% among species, respectively. The CV for very high valued cypress $BAR was approximately three times larger than for VBAR. CVs for $BARs were approximately double the VBAR CV for high valued Douglas-fir, cedar, and spruce, and approximately one and one-half times the VBAR CVs for low valued hemlock, balsam, and the pines.

Table 4.14: Regression statistics and variables for estimating cruiser-called tree value

           Douglas-                                                  White   Lodgepole
Statistic  fir       Cedar   Hemlock   Balsam   Spruce   Cypress    Pine    Pine
N          500       500     500       500      27       500        91      11
R²         0.85      0.73    0.85      0.92     0.89     0.84       0.75    0.94

Predictor terms (grouped by common predictor variables): Intercept; D; D²; D²Q1, D²Q1²; Q2²; D²Q3; Q4, Q4², D²Q4, D²Q4²; D²Q5, D²Q5²; Q7, Q7²; D²Q7; D²P1; D²P3; D²P4; D²P8.

Figure 4.4: Cruiser-called versus estimated value for the 11623 trees in TP1 [axes: cruiser-called value ($ '000) versus estimated value ($ '000)]

Figure 4.5: Cruiser-called versus estimated value for the 12 191 trees in TP2 [axes: cruiser-called value ($ '000) versus estimated value ($ '000)]

Table 4.15: Mean and coefficient of variation for the ratio of cruiser-called dollar value to basal area ($BAR) and the ratio of net volume to basal area (VBAR) for individual trees

                                    $BAR                          VBAR
Species          Statistic   TP1      TP2      TP3        TP1     TP2     TP3
Douglas-fir      Mean        523.5    593.4    774.9      8.9     9.4     10.3
                 CV          69.7     63.0     61.6       28.1    25.2    24.9
                 N           3641     4160     3129       3641    4160    3129
Cedar            Mean        336.3    381.7    419.8      6.9     7.0     7.1
                 CV          71.6     67.4     66.6       33.8    33.4    33.0
                 N           1382     1347     1122       1382    1347    1122
Hemlock          Mean        392.4    423.8    491.7      9.6     10.1    11.0
                 CV          69.8     60.0     61.5       42.2    36.2    36.0
                 N           4222     4102     2945       4222    4102    2945
Balsam           Mean        468.7    499.9    300.6      10.7    11.1    4.0
                 CV          64.8     57.5     58.2       40.5    35.2    35.5
                 N           1555     1741     1543       1555    1741    1543
Spruce           Mean        551.2    -        -          10.1    -       -
                 CV          65.6     -        -          25.9    -       -
                 N           27       -        -          27      -       -
Cypress          Mean        1211.1   1215.9   1378.9     6.9     7.3     7.8
                 CV          99.1     98.7     87.8       36.7    37.6    31.9
                 N           652      751      462        652     751     462
White Pine       Mean        171.4    241.8    265.0      7.6     9.0     9.9
                 CV          83.8     67.8     83.3       47.9    40.1    40.6
                 N           91       85       46         91      85      46
Lodgepole Pine   Mean        103.8    -        -          6.1     -       -
                 CV          70.9     -        -          41.4    -       -
                 N           11       -        -          11      -       -
All Species      Mean        481.6    535.6    626.1      9.0     9.5     10.2
                 CV          94.4     86.5     80.3       39.7    35.4    35.2
                 N           11581    12186    9247       11581   12186   9247

Figure 4.6: Cruiser-called versus estimated value for the 9250 trees in TP3 [axes: cruiser-called value ($ '000) versus estimated value ($ '000)]

4.2 Forest Service Operational Cruise Procedure

Test Population 1

Estimates of value/ha from the FS operational cruise procedure were unbiased (Table 4.16).
The standard deviation, multiplied by \(\sqrt{n}\), was close to the population standard deviation of 23 642 for n = 20, 40, and 60 plots (Table 3.3). Less than 95% of the nominal confidence intervals (CIs) contained the known population value/ha (\(\bar{Y}\)). More CIs were below \(\bar{Y}\) than above \(\bar{Y}\) for all sample sizes.

Table 4.16: Statistics for the 2000 estimates of value/ha from the Forest Service sampling strategy

        Avg.               Std.    Avg.    95% CI Coverage (%)
        Estimate   Bias    Dev.    Cost    -------------------   Skew-   Kurt-
Plots   ($)        (%)     ($)     ($)     Low     In     High   ness    osis
20      30589      -0.4    5398    1423    8.0     91.0   1.0    0.48    0.66
40      30757       0.1    3743    2633    5.5     93.6   0.9    0.24    0.23
60      30780       0.2    3084    3828    5.7     93.3   1.0    0.31    0.28

Sample distributions were approximately normal, but were skewed to the right (Figures 4.7, 4.8, and 4.9). The skewness coefficients were significant at α = 0.05/2 for all sample sizes. The kurtosis coefficients were significant at α = 0.05/2, which indicates more clustering around the mean than a normal distribution. The Kolmogorov D and Shapiro-Wilk's statistics rejected the null hypothesis of a normal distribution for n = 20 and 60 plots; the null hypothesis was not rejected for n = 40 plots.

The cumulative percentage of the 2000 sampling errors (\(t_{(n-1)} s_{\bar{y}}\)) is shown in Figure 4.10. The proportion of errors below a given absolute value increased with sample size. Approximately 10.5%, 48.5%, and 81.7% of the errors were less than $7000/ha for n = 20, 40, and 60 plots, respectively. Approximately 50% of the errors were below $10000, $7000, and $6000 for n = 20, 40, and 60 plots, respectively.

Figure 4.7: Distribution of the 2000 estimates of value/ha using the Forest Service sampling strategy with 20 plots for TP1 [axes: Percent versus Value/ha ($ 000)]

Figure 4.8: Distribution of the 2000 estimates of value/ha using the Forest Service sampling strategy with 40 plots for TP1 [axes: Percent versus Value/ha ($ 000)]

Figure 4.9: Distribution of the 2000 estimates of value/ha using the Forest Service sampling strategy with 60 plots for TP1 [axes: Percent versus Value/ha ($ 000)]

Test Population 2

Estimates of value/ha were unbiased and trends in the results for TP2 (Table 4.17) were similar to TP1. The standard deviation multiplied by \(\sqrt{n}\) for n = 20, 40, and 60 plots was close to the population standard deviation of 30615 (Table 3.4). Similarly to TP1, less than 95% of the 2000 nominal CIs contained \(\bar{Y}\), and more CIs were below \(\bar{Y}\) than were above \(\bar{Y}\).

Table 4.17: Statistics for the 2000 estimates of value/ha from the Forest Service sampling strategy

        Avg.               Std.    Avg.    95% CI Coverage (%)
        Estimate   Bias    Dev.    Cost    -------------------   Skew-   Kurt-
Plots   ($)        (%)     ($)     ($)     Low     In     High   ness    osis
20      48321      -0.3    6728    1715    6.6     92.7   0.6    0.35     0.15
40      48513       0.1    4915    3177    6.1     93.1   0.7    0.30     0.28
60      48598       0.2    3902    4635    5.0     93.4   1.5    0.28    -0.05

The sample estimates were more normally distributed than TP1 and did not contain as many extreme values (Figures 4.11, 4.12, and 4.13). The skewness coefficients were significant at α = 0.05/2 for all sample sizes (Table 4.17) but were smaller than for TP1. The kurtosis coefficient for n = 40 plots was significant at α = 0.05/2. The null hypothesis of a normal distribution was rejected for n = 20 plots with the Kolmogorov D and the Shapiro-Wilk's test statistics at the 95% level of confidence.

The absolute errors for TP2 were larger than TP1. Approximately 50% of the errors were below $13000, $9000, and $8000 for n = 20, 40, and 60 plots, respectively (Figure 4.14).

Test Population 3

Estimates of value/ha were unbiased (Table 4.18) as was shown in TP1 and TP2.
CI coverage was skewed as in TP1 and TP2, but appeared to approach the expected frequency in a more consistent trend. Skewness and kurtosis coefficients were similar to TP1 and TP2, showing positive skewness and leptokurtic distributions (Figures 4.15, 4.16, and 4.17). The skewness coefficients were significant for all sample sizes at α = 0.05/2 as in TP1 and TP2. The kurtosis coefficients for n = 20 and 40 plots were significant at α = 0.05/2. The null hypothesis of a normal distribution was rejected with the Kolmogorov D statistic and the Shapiro-Wilk's test statistics for n = 20 and 40 plots at the 95% level of confidence, but was not rejected for n = 60 plots.

The absolute sampling errors were larger than with TP1 or TP2, reflecting the higher tree values. Approximately 50% of the errors were below $17000, $12000, and $10000 for n = 20, 40, and 60 plots, respectively (Figure 4.18).

Figure 4.10: Cumulative distribution of sampling errors for the Forest Service sampling strategy for TP1 [axes: Cumulative % versus Absolute Error ($)]

Table 4.18: Statistics for the 2000 estimates of value/ha from the Forest Service sampling strategy

        Avg.               Std.    Avg.    95% CI Coverage (%)
        Estimate   Bias    Dev.    Cost    -------------------   Skew-   Kurt-
Plots   ($)        (%)     ($)     ($)     Low     In     High   ness    osis
20      56835      0.3     8978    1393    6.4     92.9   0.6    0.46    0.29
40      56773      0.2     6202    2558    5.2     93.7   1.0    0.32    0.37
60      56881      0.4     5246    3711    4.9     93.8   1.3    0.20    0.16

Figure 4.11: Distribution of the 2000 estimates of value/ha using the Forest Service sampling strategy with 20 plots for TP2 [axes: Percent versus Value/ha ($ 000)]

4.3 Model-Dependent Sampling Strategies

4.3.1 Bias

Test Population 1

Estimates of value/ha from the ratio and regression estimators showed a negative bias of about 2-6% for all sample sizes (Table 4.19).
The random subsampling method with the ratio estimator was essentially unbiased. The average ratio estimator gave the smallest bias for most sample sizes of the 3P method and for m = n and 3n trees with the PPS method. However, the average ratio estimator gave large positive biases for all sample sizes with the random method, for m = 5n trees with the PPS and target methods, and m = 3n and 5n trees with the purposive method. The target method gave similar biases with all estimators and subsample sizes, except with the average ratio estimator with m = 5n trees.

The regression estimator gave biases similar to the ratio estimator for the PPS, purposive, target, and threshold methods. However, bias for the regression estimator with the random and 3P methods was slightly larger than with the ratio estimator.

Table 4.19: Percent bias by sample size and strategy for TP1

Subsample                        20 Plots           40 Plots           60 Plots
Method      Estimator         20    60   100     40   120   200     60   180   300
Random      Ratio              1     0     0      0     0     0     -1     0     0
            Average Ratio     57    62   116    115   108   108     96    99   116
            Regression        -1    -2    -3     -2    -3    -2     -3    -2    -3
PPS         Ratio             -4    -4    -3     -4    -4    -4     -4    -4    -3
            Average Ratio      0    -1    20     -1     0    14      0     0   1.9
            Regression        -3    -3    -4     -3    -3    -5     -5    -4    -4
3P          Ratio             -2    -1    -3     -1    -2    -2     -1    -2    -2
            Average Ratio     -1    -2     0     -1    -2     0     -1    -2     0
            Regression        -2    -2    -3     -3    -4    -4     -4    -3    -4
Purposive   Ratio             -6    -4    -3     -5    -4    -2     -5    -4    -2
            Average Ratio     -5    31    21     -5    25    29     -6    31    26
            Regression        -4    -2    -2     -4    -4    -4     -6    -5    -4
Target      Ratio             -5    -5    -3     -6    -5    -2     -6    -5    -3
            Average Ratio     -6    -5   160     -6    -5   160     -5    -5   153
            Regression        -7    -4    -5     -6    -5    -3     -6    -4    -3
Threshold   Ratio             -1    -2    -3      0    -2    -2     -2    -2    -2
            Average Ratio     -1    -3    -2      0    -4    -2      0    -3    -2
            Regression        -2    -2    -2     -3    -3    -3     -3    -3    -4

Figure 4.12: Distribution of the 2000 estimates of value/ha using the Forest Service sampling strategy with 40 plots for TP2 [axes: Percent versus Value/ha ($ 000)]
Test Population 2

The trends for bias in TP2 were similar to TP1, but were smaller and mostly positive in sign and usually in the order of 2-3% (Table 4.20). The average ratio estimator with random subsampling was highly biased as in TP1, but the bias was smaller by an order of magnitude. The ratio estimator with the random subsampling was unbiased in TP1 but showed a small bias in TP2. The average ratio estimator with the 3P method was virtually unbiased at larger sample sizes. The large bias shown in TP1 for the average ratio estimator with the purposive and target methods was not shown in TP2. The regression estimator gave the smallest bias for most sample sizes of the random and PPS methods.

Table 4.20: Percent bias by sample size and strategy for TP2

Subsample                        20 Plots           40 Plots           60 Plots
Method      Estimator         20    60   100     40   120   200     60   180   300
Random      Ratio              3     2     2      3     2     2      2     2     2
            Average Ratio     41    29    34     32    28    27     25    32    33
            Regression         1     1     1      1     1     3      2     2     3
PPS         Ratio             -2    -2    -1     -1    -2    -2     -2    -1    -2
            Average Ratio     -2    -3    -1     -3    -3     0     -3    -3     0
            Regression         0     1    -1      1    -1     1      0     0     1
3P          Ratio              2     1     2      2     1     2      1     1
            Average Ratio      2    -1    -1      1     0     0      0     0     0
            Regression        -2    -2     1      3     2     3      3     3     3
Purposive   Ratio             -3    -1    -1     -2    -1    -1     -3    -1     0
            Average Ratio     -8    -2     3     -7    -1     4     -7    -1     3
            Regression         1     1     2      0     1     2      0     1     3
Target      Ratio             -6    -4    -2     -6    -4    -3     -6    -4    -3
            Average Ratio     -5    -4    -3     -6    -4    -3     -6    -3    -3
            Regression        -5    -3    -2     -5    -3    -2     -5    -3    -2
Threshold   Ratio              3     1     0      3     1     0      3     1     0
            Average Ratio      2    -1    -2      2     0    -2      3    -1    -2
            Regression         4     1     2      3     3     3      4     3     3

Figure 4.13: Distribution of the 2000 estimates of value/ha using the Forest Service sampling strategy with 60 plots for TP2 [axes: Percent versus Value/ha ($ 000)]

Test Population 3

TP3 showed more positive and larger biases (Table 4.21) than TP1 and TP2. The target strategy gave the lowest bias of about 2-3%. Bias was similar among estimators for the target method as in TP1 and TP2.
The increase in bias for m = 5n was not as large as in TP1. The random method gave biases of 5-6% for the ratio estimator and 4-7% for the regression estimator. The average ratio estimator with the random method gave larger biases, as was shown in TP1 and TP2. The average ratio estimator with the PPS method gave the lowest biases for m = n and 3n trees as in TP1, but bias for m = 5n trees was similar to the ratio and regression estimators.

Bias for the 3P method decreased with subsample size for the ratio and average ratio estimators, but was constant with the regression estimator. Bias for the purposive method also increased with subsample size for the ratio and regression estimators. However, the average ratio estimator showed a small bias for m = n and 3n trees. The threshold method showed decreasing bias with subsample size for the ratio and average ratio estimators, and no bias for m = 5n trees with the average ratio estimator. Bias for the regression estimator was consistent at 7-8% for the threshold method.

Table 4.21: Percent bias by sample size and strategy for TP3

Subsample                        20 Plots       40 Plots       60 Plots
Method     Estimator         20   60  100   40  120  200   60  180  300
Random     Ratio              5    6    5    5    5    5    5    5    6
           Average Ratio    286  154  173   96  134  154  135  177  164
           Regression         7    4    7    7    7    7    6    6    6
PPS        Ratio              5    5    4    4    5    4    4    5    4
           Average Ratio      2    2    5    1    1    6    1    1    6
           Regression         5    5    7    4    6    6    5    7    6
3P         Ratio              7    6    5    7    6    5    7    6    5
           Average Ratio      5    4    1    4   -3    1    5    4    1
           Regression         5    7    7    7    8    7    6    8    7
Purposive  Ratio              1    4    6    3    5    6    2    5    6
           Average Ratio     -1    1   13   -3    1   12   -2    1   11
           Regression         4    6    7    4    6    7    4    6    7
Target     Ratio              2    1    1    2    0    2    2    0    1
           Average Ratio      1   -1    4    2   -1    3    1   -2    3
           Regression         3    2    3    2    2    3    2    2    3
Threshold  Ratio             10    6    4    9    6    5    9    7    5
           Average Ratio     10    3    0    9    4    0   10    4    0
           Regression         8    7    7    8    8    7    8    8    7

Figure 4.14: Cumulative distribution of sampling errors for the Forest Service sampling strategy for TP2 (axes: cumulative % against absolute error, $).
Figure 4.15: Distribution of the 2000 estimates of value/ha using the Forest Service sampling strategy with 20 plots for TP3 (x-axis: value/ha, $000; y-axis: percent).

4.3.2 Sample Variance

Test Population 1

Variance ratios (VRs) for the (approximately) unbiased strategies in TP1 ranged from 1 to 4 (Table 4.22), i.e., the variance of the sample estimates was up to about four times larger than that of the FS strategy using the same number of plots. Most VRs ranged from about one to two. The VRs decreased when more trees were subsampled for a given number of plots. The largest decrease was observed between m = n and 3n trees. The target method gave the lowest VRs except for m = 5n trees with 60 plots, where the threshold method was the lowest. The 3P and threshold subsampling methods gave the second lowest VRs at all sample sizes. The target method gave VRs slightly less than unity for m = 3n and 5n trees with n = 20 plots.

Table 4.22: Variance ratios by sample size and strategy for TP1.
Underline indicates smallest ratio for a given sample size

Subsample                              20 Plots             40 Plots             60 Plots
Method     Estimator           20     60    100     40    120    200     60    180    300
Random     Ratio             2.18   1.61   1.38   2.63   1.60   1.64   2.50   1.81   1.44
           Average Ratio     2485    979   2427   7284   3062   2084   5730   2410   2007
           Regression        3.56   3.15   2.49   6.00   4.26   3.12   6.87   4.96   3.87
PPS        Ratio             1.88   1.56   1.47   2.04   1.64   1.59   1.94   1.65   1.66
           Average Ratio     2.50   1.72   95.2   2.29   1.77   73.1   2.93   1.75   94.7
           Regression        3.57   2.86   3.02   4.93   3.67   3.90   5.44   3.99   4.23
3P         Ratio             1.67   1.30   1.19   1.90   1.37   1.20   1.92   1.34   1.25
           Average Ratio     1.48   1.17   1.22   1.42   1.15   1.19   1.46   1.21   1.20
           Regression        2.82   2.04   2.08   3.34   2.41   2.62   3.76   2.99   2.75
Purposive  Ratio             1.62   1.28   1.26   1.77   1.41   1.29   1.83   1.36   1.27
           Average Ratio     2.21    276   73.3   2.36    222    106   2.15    274   93.2
           Regression        3.02   2.29   2.19   4.16   2.84   2.57   3.71   3.18   2.90
Target     Ratio             1.34   1.02   0.99   1.36   1.12   1.06   1.21   1.17   1.09
           Average Ratio     1.26   0.98   1810   1.41   1.05   1915   1.36   1.01   1620
           Regression        1.21   1.12   1.20   1.23   1.18   1.28   1.27   1.17   1.22
Threshold  Ratio             1.75   1.30   1.13   1.81   1.33   1.18   1.72   1.23   1.21
           Average Ratio     1.48   1.06   1.08   1.57   1.14   1.07   1.58   1.13   1.04
           Regression        2.43   2.06   2.28   2.97   2.65   2.52   3.25   2.84   2.83

Figure 4.16: Distribution of the 2000 estimates of value/ha using the Forest Service sampling strategy with 40 plots for TP3 (x-axis: value/ha, $000; y-axis: percent).

VRs for the regression estimator were usually larger than for the ratio and average ratio estimators. However, VRs were similar among estimators with the target method. The regression estimator gave the lowest VR for m = n trees with n = 20 and 40 plots for the target method. VRs for the regression estimator increased marginally for m = 5n trees over m = 3n trees for the PPS and target methods.

The average ratio estimator gave the lowest VRs for most sample sizes for the threshold and 3P methods.
However, the average ratio estimator gave highly variable estimates and large VRs for all sample sizes of the random subsampling method. Large VRs were observed with the average ratio estimator and target method for m = 5n trees and for most sample sizes of the purposive method. The ratio estimator gave the lowest VRs for the random, PPS, and purposive subsampling methods.

Figure 4.17: Distribution of the 2000 estimates of value/ha using the Forest Service sampling strategy with 60 plots for TP3 (x-axis: value/ha, $000; y-axis: percent).

Test Population 2

The VRs for the (approximately) unbiased strategies in TP2 (Table 4.23) were slightly larger than for TP1. The lowest VRs were achieved with the purposive and target methods except with m = 5n trees for 40 plots with the threshold method. The lowest VRs for a given number of plots were always achieved with m = n trees. The variance decreased when more trees were subsampled for a given number of plots, as was shown in TP1.

The ratio estimator gave the lowest VRs for the random method and the average ratio estimator gave the lowest VRs for the threshold method. The ratio or average ratio estimators gave the lowest VRs for the PPS, purposive, and target methods. The regression estimator usually gave higher VRs than the ratio and average ratio estimators. The average ratio estimator with the random method gave the largest VRs, as with TP1.

Table 4.23: Variance ratios by sample size and strategy for TP2.
Underline indicates smallest ratio for a given sample size

Subsample                              20 Plots             40 Plots             60 Plots
Method     Estimator           20     60    100     40    120    200     60    180    300
Random     Ratio             3.06   1.81   1.76   3.10   1.83   1.73   2.82   1.89   1.72
           Average Ratio      567    203    129    359    129     67    226    141    116
           Regression        4.95   3.59   3.43   6.30   4.85   4.25   7.77   5.21   4.57
PPS        Ratio             2.38   1.97   1.86   2.29   1.97   1.87   2.40   1.82   1.92
           Average Ratio     1.70   1.47   2.74   2.06   1.40   2.22   1.63   2.42   2.52
           Regression        4.54   4.00   3.77   5.62   4.73   4.93   6.75   5.42   5.55
3P         Ratio             2.56   1.73   1.55   2.46   1.55   1.42   2.71   1.61   1.41
           Average Ratio     25.6   1.28   1.19   24.6   1.22   1.52   1.73   2.02   1.13
           Regression        4.37   2.91   2.71   4.72   3.15   2.87   4.99   3.14   2.96
Purposive  Ratio             2.14   1.68   1.46   2.06   1.51   1.39   1.96   1.46   1.39
           Average Ratio     1.28   1.93   2.46   1.20   1.99   2.54   1.28   2.14   2.39
           Regression        3.73   2.97   2.82   4.14   3.16   2.94   4.59   3.43   3.09
Target     Ratio             1.37   1.14   1.18   1.24   1.06   1.03   1.26   1.05   1.05
           Average Ratio     1.38   1.18   1.14   1.25   1.07   1.03   1.33   1.04   1.13
           Regression        1.38   1.18   1.32   1.24   1.09   1.12   1.27   1.11   1.19
Threshold  Ratio             2.41   1.70   1.42   2.33   1.48   1.45   2.20   1.60   1.39
           Average Ratio     2.01   1.36   1.15   1.86   1.28   1.01   1.90   1.31   1.09
           Regression        3.38   2.58   2.43   3.69   2.89   2.88   3.54   3.03   3.05

Figure 4.18: Cumulative distribution of sampling errors for the Forest Service sampling strategy for TP3 (axes: cumulative % against absolute error, $).

However, the large VRs in TP1 for m = 5n trees with the average ratio estimator and target method were not shown in TP2.

Test Population 3

The VRs for TP3 (Table 4.24) were similar to TP1 but smaller than TP2. The lowest VRs were achieved with the purposive and target methods as in TP1 and TP2. The pattern in TP3 was consistent, with the lowest VRs achieved for: m = n trees with the purposive method and average ratio estimator; m = 3n trees with the target method and average ratio estimator; and m = 5n trees with the target method and the ratio estimator. The second lowest VRs for a given sample size were achieved with the 3P, purposive, and threshold methods.
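The variance ratios discussed in this section can be illustrated with a short sketch. It assumes, following the description at the start of Section 4.3.2, that a VR is the variance of the simulated estimates for a model-dependent strategy divided by the variance of the FS estimates obtained with the same number of plots. The function name and the example values are hypothetical.

```python
from statistics import variance


def variance_ratio(strategy_estimates, fs_estimates):
    """Variance of a strategy's estimates relative to the FS strategy.

    Assumed definition: sample variance of the strategy's estimates divided
    by the sample variance of the FS estimates for the same number of plots.
    A value above 1 means the strategy is more variable than the FS strategy.
    """
    return variance(strategy_estimates) / variance(fs_estimates)


# Hypothetical estimates of value/ha ($000) from five simulated samples each.
strategy = [46.0, 54.0, 50.0, 42.0, 58.0]   # sample variance = 40
fs = [48.0, 52.0, 50.0, 46.0, 54.0]         # sample variance = 10
vr = variance_ratio(strategy, fs)
print(f"VR = {vr:.2f}")
```

With these illustrative numbers the strategy is four times as variable as the FS strategy, which is the upper end of the range reported for the approximately unbiased strategies in TP1.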
The VRs decreased as subsample size increased for a given number of plots, as was shown in TP1 and TP2. The average ratio estimator with the random method gave larger VRs than in TP1 and TP2.

Table 4.24: Variance ratios by sample size and strategy for TP3. Underline indicates smallest ratio for a given sample size

Subsample                              20 Plots             40 Plots             60 Plots
Method     Estimator           20     60    100     40    120    200     60    180    300
Random     Ratio             2.16   1.67   1.46   2.48   1.65   1.54   2.17   1.56   1.46
           Average Ratio     10e4   31e3   14e3   23e3   15e3   14e3   34e3   19e3   13e3
           Regression        3.67   2.62   2.37   4.81   3.92   3.40   5.52   4.25   3.46
PPS        Ratio             2.09   1.68   1.63   2.31   1.83   1.77   2.22   1.85   1.82
           Average Ratio     2.10   1.32   2.89   1.72   1.46   4.31   1.57   1.28   3.35
           Regression        3.55   3.22   3.17   5.03   4.71   4.63   6.20   5.48   5.91
3P         Ratio             2.11   1.41   1.36   2.10   1.53   1.41   1.99   1.45   1.26
           Average Ratio     1.84   1.22   1.15   1.72   1.37   1.29   1.62   1.28   1.19
           Regression        3.10   2.38   2.13   4.07   3.05   2.81   4.09   3.18   2.84
Purposive  Ratio             1.76   1.33   1.39   1.96   1.51   1.45   1.72   1.46   1.36
           Average Ratio     1.41   1.25   5.65   1.40   1.26   5.23   1.38   1.16   4.97
           Regression        3.20   2.40   2.21   3.96   3.03   2.97   5.14   3.22   3.04
Target     Ratio             1.63   1.19   1.12   1.65   1.25   1.26   1.58   1.17   1.18
           Average Ratio     1.62   1.13   1.33   1.74   1.22   1.30   1.51   1.12   1.26
           Regression        1.70   1.34   1.32   1.64   1.47   1.36   1.48   1.27   1.29
Threshold  Ratio             1.84   1.37   1.39   2.05   1.46   1.46   1.81   1.43   1.35
           Average Ratio     1.78   1.20   1.11   1.99   1.38   1.25   1.67   1.17   1.22
           Regression        2.67   2.33   2.21   3.37   2.90   2.80   3.60   2.79   2.85

4.3.3 Confidence Interval Coverage

Test Population 1

Less than 95% of the nominal confidence intervals (CIs) for TP1 contained Y and more than 2.5% of the CIs were below Y for most cases (Tables 4.25, 4.26, and 4.27). CI coverage did not show obvious trends and was not strongly affected by sample size, subsampling method, or estimator. Approximately 5-10% of the CIs were below Y, 90-93% of the CIs contained Y, and 0.1-1.0% were above Y.
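The three percentages reported for each strategy (below / contained / above) can be tallied as in the following sketch. It assumes that a CI is counted as "below" when its upper limit is less than Y, "above" when its lower limit exceeds Y, and "contained" otherwise. The function name and the intervals are hypothetical.

```python
def ci_coverage(intervals, true_value):
    """Return percentages of CIs that lie below, contain, or lie above true_value.

    `intervals` is a sequence of (lower, upper) limits, one per simulated sample.
    """
    n = len(intervals)
    below = sum(1 for lower, upper in intervals if upper < true_value)
    above = sum(1 for lower, upper in intervals if lower > true_value)
    contained = n - below - above
    return tuple(100.0 * count / n for count in (below, contained, above))


# Hypothetical 95% CIs for value/ha ($000) when the population value is 50.
intervals = [(40.0, 48.0), (45.0, 55.0), (47.0, 53.0), (52.0, 60.0)]
below, contained, above = ci_coverage(intervals, 50.0)
print(f"{below:.1f}/{contained:.1f}/{above:.1f}")
```

For a nominal 95% interval with symmetric errors, the expected split would be about 2.5/95/2.5, which is the benchmark against which the tabulated percentages are read.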
Approximately 95% of the CIs contained Y for the average ratio estimates with n = 20 and 40 plots for the random, PPS, and purposive methods. The sample variance for these methods was estimated with the jackknife. CI coverage was furthest from the nominal frequency with the regression estimator for the 3P and threshold methods that used the combined SE estimator.

Test Population 2

The CI coverage trend for TP2 was similar to TP1. Less than 95% of the nominal CIs contained Y, more than 2.5% were lower than Y, and there was no apparent effect of subsample method or estimator (Tables 4.28, 4.29, and 4.30). However, more CIs contained Y and fewer CIs were less than Y than for TP1. The average ratio with the jackknife variance estimator gave CIs that deviated further from the expected nominal frequency than for TP1. As with TP1, the CIs computed with the jackknife estimator did not differ substantially from those computed with combined SEs.

Test Population 3

The frequency of CI coverage for TP3 (Tables 4.31, 4.32, and 4.33) was similar to TP1 and TP2. However, the proportion containing Y was closer to the nominal 95% for most strategies and sample sizes than for TP1 or TP2. The proportion of CIs lower than Y was higher than the proportion greater than Y for n = 20 and 40 plots. However, more CIs were higher than Y with n = 60 plots. CI coverage for the 3P and threshold methods with the regression and the combined SE estimators was less than 90%, as in TP1 and TP2. However, CI coverage was usually closer to nominal coverage than for TP1 and TP2. In most cases the CIs contained Y for about 93-94% of the samples.

Table 4.25: Confidence interval coverage using 20 plots with TP1.
Numbers are the percentage of 95% nominal confidence intervals that were below, contained, or were above the population value/ha Subsample Method Desired Total Number of Subsampled Trees Estimator 20 60 100 Random PPS 3P Purposive Target Threshold Ratio Average Ratio Regression Ratio Average Ratio Regression Ratio Average Ratio Regression Ratio Average Ratio Regression Ratio Average Ratio Regression Ratio Average Ratio Regression 6.6/92.8/0.6 11.6/88.3/0.0 9.9/89.1/1.0 9.7/89.8/0.5 6.6/93.0/0.3 8.9/90.2/0.9 10.5/88.7/0.9 8.1/91.1/0.8 16.3/81.7/2.0 11.6/88.1/0.2 12.8/87.1/0.1 8.5/91.1/0.3 12.2/87.4/0.4 11.7/88.1/0.2 12.8/86.8/0.3 9.6/89.5/0.8 7.9/91.6/0.4 15.0/83.3/1.7 6.7/93.1/0.2 6.8/92.9/0.2 8.1/91.7/0.1 7.0/92.9/0.1 7.8/91.1/1.1 8.2/91.3/0.4 10.3/89.2/0.4 9.1/90.6/0.3 7.4/92.4/0.1 6.9/92.5/0.5 8.0/91.5/0.4 8.7/90.9/0.3 9.3/89.6/1.2 9.7/89.6/0.7 8.4/91.0/0.6 7.1/92.0/0.9 14.6/83.2/2.2 17.1/80.1/27 10.6/88.9/0.5 9.5/90.0/0.5 7.5/92.3/0.1 5.6/93.9/0.4 7.9/91.5/0.5 7.3/92.1/0.5 11.5/88.3/0.2 3.9/95.7/0.3 10.5/89.1/0.3 0.4/97.5/0.2 11.2/88.2/0.6 6.3/93.0/0.6 10.4/88.8/0.8 9.0/90.3/0.6 9.0/90.4/0.5 8.1/91.3/0.5 15.0/82.5/2.5 18.2/79.0/28 Chapter 4. RESULTS 90 Table 4.26: Confidence interval coverage using 40 plots with TPl . 
Numbers are the percentage of 95% nominal confidence intervals that were below, contained, or were above the population value/ha Subsample Method Desired Total Number of Subsampled Trees Estimator 40 120 200 Random PPS 3P Purposive Target Threshold Ratio Average Ratio Regression Ratio Average Ratio Regression Ratio Average Ratio Regression Ratio Average Ratio Regression Ratio Average Ratio Regression Ratio Average Ratio Regression 7.9/91.1/1.0 10.4/89.6/0.0 7.7/91.3/1.0 9.0/90.3/0.7 8.3/91.3/0.3 7.4/91.4/1.1 10.8/87.6/1.7 6.8/92.4/0.9 18.7/77.5/3.8 11.3/88.5/0.1 13.4/86.5/0.0 7.8/91.6/0.5 1-370/86.6/0.3 14.0/85.7/0.2 12.8/86.8/0.3 13.2/86.7/0.1 8.0/89.9/0.2 6.0/92.3/1.6 5.5/93.6/0.9 6.2/92.1/1.7 6.8/92.7/0.4 5.6/94.1/0.3 7.0/92.0/0.9 6.1/92.7/1.1 9.0/90.8/0.1 8.7/90.9/0.3 8.5/91.1/0.3 7.3/92.2/0.4 7.3/91.9/0.8 8.2/90.9/0.9 9.5/89.6/0.9 7.7/91.5/0.7 18.3/78.5/3.2 9.8/89.9/0.2 7.4/92.1/0.4 7.9/91.4/0.6 10.9/88.8/0.2 11.9/87.6/0.4 11.2/88.2/0.6 12.5/86.7/0.7 8.6/90.2/1.1 10.1/89.4/0.4 8.4/90.7/0.9 6.0/93.5/0.5 18.6/78.2/a2 8.0/91.3/0.6 5.0/94.1/0.8 6.9/92.2/0.8 2.4/97.6/0.0 0.1/88.7/1L1 6.3/93.0/0.6 4.5/95.4/0.0 8.6/90.4/0.9 8.5/90.9/0.5 Chapter 4. RESULTS 91 Table 4.27: Confidence interval coverage using 60 plots with TP1. 
Numbers are the percentages of 95% nominal confidence intervals that were below, contained, or were above the population value/ha

Subsample                  Desired Total Number of Subsampled Trees
Method     Estimator                     60             180             300
Random     Ratio               6.0/93.4/0.5    6.3/92.6/0.1    4.6/94.2/1.1
           Average Ratio       9.3/90.6/0.1    5.6/94.0/0.4    4.4/94.6/0.9
           Regression          6.6/91.6/1.7    4.8/93.2/1.9    6.1/93.2/0.6
PPS        Ratio               8.4/91.1/0.5    8.8/90.6/0.5    8.6/91.0/0.3
           Average Ratio       7.4/91.9/0.6    5.6/93.7/0.6    6.9/92.5/0.5
           Regression          6.6/92.2/1.2    7.1/92.0/0.9    7.3/91.9/0.8
3P         Ratio               9.7/88.1/2.3    9.3/89.5/1.3    7.9/90.4/1.7
           Average Ratio       6.9/92.2/0.9    7.7/91.1/1.2    5.7/93.6/0.7
           Regression         20.6/74.6/4.9   18.8/75.4/5.8   21.3/74.6/4.1
Purposive  Ratio              10.4/89.3/0.2    9.0/90.7/0.2    8.3/91.3/0.3
           Average Ratio      13.9/86.0/0.0    6.5/93.2/0.3    3.2/95.8/1.0
           Regression          9.6/89.9/0.6    8.1/91.1/0.7    6.5/92.1/0.7
Target     Ratio              12.8/87.0/0.2   13.9/85.7/0.4    2.4/97.6/0.0
           Average Ratio      13.2/86.4/0.3   11.9/87.8/0.2   0.0/77.3/22.6
           Regression         13.6/86.1/0.2   11.6/87.8/0.6    2.8/97.1/0.0
Threshold  Ratio               8.3/90.0/1.6    8.4/90.3/1.3    9.1/89.7/1.1
           Average Ratio       5.7/92.6/1.6   10.8/88.4/0.8    8.0/91.2/0.7
           Regression         17.6/77.6/4.7   21.0/73.8/5.1   20.7/74.6/4.7

Table 4.28: Confidence interval coverage using 20 plots with TP2.
Numbers are the percentages of 95% nominal confidence intervals that were below, contained, or were above the population value/ha

Subsample                  Desired Total Number of Subsampled Trees
Method     Estimator                     20              60             100
Random     Ratio               5.6/93.7/0.6    5.5/93.7/0.7    6.2/92.8/0.9
           Average Ratio       8.8/91.1/0.1    5.6/93.9/0.4    4.3/95.2/0.5
           Regression          7.4/91.4/1.1    6.9/91.9/1.1    5.7/93.0/1.2
PPS        Ratio               8.8/90.8/0.4    7.4/91.7/0.8    8.2/91.3/0.4
           Average Ratio       8.6/91.0/0.3    9.2/90.4/0.3    8.0/91.5/0.5
           Regression          7.6/90.6/1.7    7.3/91.1/1.5    8.2/90.8/1.0
3P         Ratio              11.5/85.5/3.0    9.0/89.0/2.0    9.9/87.8/2.3
           Average Ratio       6.8/92.4/0.8    7.1/91.8/1.2    8.0/91.4/0.6
           Regression         18.0/77.1/4.9   14.6/79.8/5.6   15.2/79.3/5.5
Purposive  Ratio               8.9/90.2/0.8    8.9/90.5/0.6    7.1/92.4/0.4
           Average Ratio      13.0/86.9/0.0    9.1/90.7/0.2    6.5/93.1/0.3
           Regression          7.6/91.3/1.0    6.8/92.1/1.1    7.3/91.0/2.8
Target     Ratio              10.9/88.9/0.1   11.2/88.4/0.3    8.9/90.6/0.4
           Average Ratio      11.8/88.0/0.2   11.6/87.8/0.6    9.9/89.5/0.6
           Regression         11.8/88.0/0.2    9.6/89.7/0.6    9.8/89.5/0.6
Threshold  Ratio               7.5/90.1/2.3   10.1/87.7/2.1    8.8/89.7/1.4
           Average Ratio       6.0/91.9/2.0    7.9/90.8/1.2    8.6/90.7/0.6
           Regression          5.6/91.4/2.9    7.3/91.2/1.4    7.2/91.8/1.0

Table 4.29: Confidence interval coverage using 40 plots with TP2.
Numbers are the percentages of 95% nominal confidence intervals that were below, contained, or were above the population value/ha

Subsample                  Desired Total Number of Subsampled Trees
Method     Estimator                     40             120             200
Random     Ratio               4.6/93.8/1.5    4.6/94.3/1.0    4.4/93.8/1.8
           Average Ratio       7.6/92.0/0.3    4.4/95.1/0.5    2.8/96.3/0.9
           Regression          6.6/91.3/2.0    6.3/91.6/2.1    4.6/92.9/2.4
PPS        Ratio               6.7/92.7/0.6    8.0/91.0/0.9    7.7/91.4/0.8
           Average Ratio       8.9/90.6/0.4    9.4/90.2/0.3    7.6/92.1/0.3
           Regression          6.3/92.1/1.5    5.7/92.0/2.3    6.1/91.2/2.6
3P         Ratio               8.6/86.7/4.8    6.6/90.4/3.0    6.7/90.7/2.6
           Average Ratio       6.4/92.2/1.4    7.2/91.5/1.3    7.6/91.2/1.2
           Regression        16.4/73.1/10.5   14.0/76.3/9.7  12.8/77.0/10.2
Purposive  Ratio               8.3/91.1/0.5    6.7/92.4/0.9    7.2/92.1/0.6
           Average Ratio      14.3/85.4/0.2   10.1/89.6/0.2    5.6/93.8/0.6
           Regression          6.3/92.4/1.2    5.1/93.1/1.7    5.4/92.4/2.2
Target     Ratio              14.2/85.5/0.2   10.0/89.5/0.4    9.5/89.7/0.7
           Average Ratio      14.5/85.3/0.2   11.6/87.9/0.4    9.1/90.5/0.3
           Regression         12.3/87.4/0.2   10.0/89.3/0.6    8.1/90.7/1.1
Threshold  Ratio               6.9/88.0/5.0    6.5/90.7/2.7    8.8/88.4/2.7
           Average Ratio       5.6/91.4/2.9    7.3/91.2/1.4    7.2/91.8/1.0
           Regression         13.2/78.7/8.1   13.0/77.3/9.6   13.2/77.1/9.6

Table 4.30: Confidence interval coverage using 60 plots with TP2.
Numbers are the percentages of 95% nominal confidence intervals that were below, contained, or were above the population value/ha

Subsample                  Desired Total Number of Subsampled Trees
Method     Estimator                     60             180             300
Random     Ratio               1.9/93.6/1.7    3.5/94.5/1.9    3.5/94.5/1.9
           Average Ratio       7.4/92.3/0.2    2.7/96.5/0.7    1.9/96.3/1.7
           Regression          5.0/92.6/2.3    5.6/91.7/2.6    3.9/93.2/2.8
PPS        Ratio               6.1/93.1/0.8    5.0/94.0/0.9    7.5/91.8/0.6
           Average Ratio       8.9/91.0/0.0    8.7/90.8/0.4    7.1/92.6/0.2
           Regression          5.8/91.8/2.4    5.9/91.6/2.4    5.5/92.5/1.9
3P         Ratio               8.6/84.5/6.8    7.1/89.5/3.4    6.4/90.9/2.7
           Average Ratio       5.8/91.9/2.3    6.5/91.9/1.5    5.4/93.3/1.3
           Regression        15.1/70.3/14.6  12.4/75.9/11.7  13.0/76.3/10.8
Purposive  Ratio               8.2/91.5/0.2    5.9/93.3/0.8    5.6/93.6/0.7
           Average Ratio      17.7/82.1/0.2   10.1/89.4/0.4    4.9/94.4/0.6
           Regression          5.7/92.4/1.9    4.6/93.2/2.1    4.3/93.0/2.6
Target     Ratio              13.3/86.2/0.4   11.3/88.3/0.3    9.3/90.0/0.7
           Average Ratio      15.4/84.1/0.4   11.0/88.4/0.5   10.5/88.8/0.6
           Regression         13.9/85.8/0.3   10.0/89.3/0.6    9.0/90.0/1.0
Threshold  Ratio               5.6/88.6/5.8    6.9/89.3/3.7    7.2/89.8/3.0
           Average Ratio       4.4/91.4/4.2    6.9/91.1/2.0    9.0/89.9/1.1
           Regression         9.0/80.1/10.8  11.8/76.4/11.8  13.3/73.5/13.1

Table 4.31: Confidence interval coverage using 20 plots with TP3.
Numbers are the percentages of 95% nominal confidence intervals that were below, contained, or were above the population value/ha

Subsample                  Desired Total Number of Subsampled Trees
Method     Estimator                     20              60             100
Random     Ratio               4.4/94.6/0.9    4.2/94.6/1.1    4.1/94.3/1.5
           Average Ratio       6.4/93.3/0.3    5.6/93.9/0.5    5.5/94.0/0.4
           Regression          4.4/93.3/2.3    4.7/93.0/2.2    3.8/93.9/2.2
PPS        Ratio               4.6/93.9/1.4    4.4/94.1/1.4    4.7/94.2/1.1
           Average Ratio       6.1/93.2/0.6    4.7/94.6/0.7    4.7/94.3/0.9
           Regression          4.1/93.8/2.0    3.8/94.1/2.1    3.5/94.3/2.1
3P         Ratio               6.2/90.5/3.2    3.8/93.8/2.3    5.2/92.8/2.1
           Average Ratio       4.9/93.7/1.4    4.5/94.2/1.3    6.1/92.8/1.1
           Regression         11.1/84.5/4.3    7.9/86.8/5.2    9.0/86.1/4.9
Purposive  Ratio               6.2/93.1/0.6    4.4/94.6/1.0    3.9/94.7/1.3
           Average Ratio       6.4/92.8/0.7    6.2/93.1/0.6    3.9/95.3/0.8
           Regression          4.3/93.8/1.8    3.4/94.4/2.1    3.0/94.9/2.0
Target     Ratio               6.0/93.3/0.7    7.1/92.1/0.7    5.3/93.8/0.8
           Average Ratio       5.9/93.5/0.5    7.9/91.6/0.4    4.4/94.9/0.6
           Regression          4.4/94.4/1.2    6.0/93.0/0.9    4.9/93.8/1.2
Threshold  Ratio               3.4/93.2/3.3    3.7/93.9/2.4    5.5/91.9/2.5
           Average Ratio       2.9/93.6/3.4    4.5/94.1/1.4    7.2/92.2/0.5
           Regression          6.7/88.5/4.7    8.0/86.2/5.7    8.7/86.0/5.2

Table 4.32: Confidence interval coverage using 40 plots with TP3.
Numbers are the percentages of 95% nominal confidence intervals that were below, contained, or were above the population value/ha

Subsample                  Desired Total Number of Subsampled Trees
Method     Estimator                     40             120             200
Random     Ratio               3.1/94.0/2.8    2.9/94.8/2.2    2.8/94.6/2.5
           Average Ratio       5.6/94.0/0.3    4.3/95.3/0.3    2.9/96.3/0.7
           Regression          3.4/93.0/3.5    2.2/93.3/4.4    1.9/94.5/3.5
PPS        Ratio               3.3/94.5/2.1    2.6/95.3/2.0    2.4/95.8/1.8
           Average Ratio       5.1/93.9/0.9    4.4/94.1/1.3    3.2/95.4/1.3
           Regression          3.5/93.8/2.7    2.5/94.1/3.3    2.4/93.9/3.6
3P         Ratio               3.7/91.3/5.1    3.0/92.5/4.5    3.2/92.8/4.1
           Average Ratio       3.0/94.5/2.5    4.0/93.7/2.3    4.8/93.4/1.8
           Regression         9.0/80.8/10.3   6.1/83.3/10.6   7.7/81.6/10.7
Purposive  Ratio               3.9/94.6/1.4    2.4/95.2/2.3    2.7/94.6/2.6
           Average Ratio       8.1/91.3/0.6    4.2/94.8/0.9    2.5/96.2/1.2
           Regression          2.6/94.2/3.1    2.3/94.4/3.2    1.8/95.4/2.8
Target     Ratio               4.2/94.5/1.2    6.1/92.7/1.1    3.8/94.8/1.3
           Average Ratio       4.0/94.9/1.0    8.1/91.1/0.7    3.8/95.3/0.8
           Regression          4.3/94.5/1.1    6.8/91.0/2.2    3.2/94.3/2.4
Threshold  Ratio               2.4/90.1/7.5    2.4/92.5/5.0    3.4/91.9/4.7
           Average Ratio       2.6/89.7/7.6    2.8/93.8/3.3    6.8/91.7/1.5
           Regression         6.4/83.4/10.1   6.2/81.9/11.9   7.3/81.7/11.0

Table 4.33: Confidence interval coverage using 60 plots with TP3.
Numbers are the percentages of 95% nominal confidence intervals that were below, contained, or were above the population value/ha

Subsample                  Desired Total Number of Subsampled Trees
Method     Estimator                     60             180             300
Random     Ratio               1.9/95.2/2.9    2.0/94.1/3.9    1.7/93.8/4.4
           Average Ratio       5.0/94.2/0.7    3.0/96.2/0.7    2.1/96.8/1.0
           Regression          2.5/92.3/5.1    2.2/92.6/5.1    2.0/93.1/4.8
PPS        Ratio               3.1/94.3/2.5    2.5/93.9/3.5    2.1/94.8/3.0
           Average Ratio       4.8/94.4/0.7    3.5/95.2/1.3    2.5/96.0/1.4
           Regression          2.3/93.1/4.5    1.9/93.4/4.6    2.0/94.1/3.8
3P         Ratio               3.2/88.8/8.0    1.6/91.2/7.2    1.8/92.4/5.9
           Average Ratio       1.7/93.4/4.9    2.5/93.3/4.2    3.8/94.1/2.1
           Regression         9.1/78.7/12.2   5.5/78.2/16.3   6.0/79.5/14.5
Purposive  Ratio               3.5/95.1/1.3    2.1/94.3/3.6    1.8/94.1/4.0
           Average Ratio       7.6/91.9/0.4    3.6/94.8/1.5    1.2/96.6/2.1
           Regression          3.1/93.6/3.2    1.5/94.2/4.2    1.7/93.7/4.5
Target     Ratio               3.4/95.0/1.5    5.4/93.1/1.4    3.8/94.4/1.7
           Average Ratio       4.9/93.9/1.1    7.5/91.8/0.6    2.7/94.9/2.4
           Regression          3.6/94.9/1.4    4.6/92.7/2.6    3.3/93.8/2.9
Threshold  Ratio              0.9/87.8/11.2    1.8/90.5/7.6    2.3/91.1/6.5
           Average Ratio      0.7/88.5/10.7    2.4/94.2/3.4    6.5/91.9/1.5
           Regression         6.5/78.2/15.2   5.3/79.3/15.3   6.0/79.6/14.3

4.3.4 Achieved Subsample Size

Test Population 1

The mean and standard deviation of the achieved sample sizes for TP1 were similar among the random, PPS, and purposive subsampling methods (Table 4.34).
The achieved subsample size was constant for m = n with the random, PPS, and purposive methods because each plot contained at least one tree. The variability and deviation from the desired subsample size increased for all methods as more trees were subsampled. The 3P, target, and threshold subsampling methods achieved the desired sample size on the average. The random, PPS, and purposive methods were less variable because of the fixed number of sample trees/plot. The achieved subsample size for the target method was less variable than for the 3P and threshold methods. The 3P method was slightly less variable for m = n and 3n trees than the threshold method.

Table 4.34: Mean and standard deviation of achieved subsample sizes by sample size and strategy for TP1

Subsample                       20 Plots                40 Plots                60 Plots
Method                    20      60     100      40     120     200      60     180     300
Random     Mean        20.00   58.74   92.82   40.00  117.56  185.48   60.00  176.34  278.55
           SD              0    1.35    3.86       0    1.93    5.37       0    2.28    6.39
PPS        Mean        20.00   58.78   92.87   40.00  117.56  185.38   60.00  176.21  278.66
           SD              0    1.32    3.80       0    1.84    5.22       0    2.31    6.45
3P         Mean        20.33   59.74   99.06   41.04  118.55  198.31   61.11  177.87  298.43
           SD           6.00   10.99   12.56    8.36   15.52   17.24   10.32   19.32   20.94
Purposive  Mean        20.00   58.80   92.84   40.00  117.54  185.73   60.00  176.28  278.57
           SD              0    1.31    3.72       0    1.86    5.23       0    2.28    6.45
Target     Mean        19.71   59.30   97.82   39.16  118.95  195.27   59.02  178.28  293.07
           SD           5.13    9.66   11.24    7.47   13.47   15.41    8.93   16.40   20.01
Threshold  Mean        20.37   59.80  100.21   40.96  119.21  200.41   60.59  178.40  300.48
           SD           7.37   12.01   12.38   10.20   17.47   16.82   12.70   20.38   21.48
Test Population 2

The average achieved subsample size for TP2 (Table 4.35) was more consistent among methods, but the pattern of variability was similar to TP1. The random, PPS, and purposive methods showed similar variability as in TP1. The 3P, target, and threshold methods were more variable. Subsample size for the target method was less variable than for the 3P and threshold methods, as was shown in TP1.

Table 4.35: Mean and standard deviation of achieved subsample sizes by sample size and strategy for TP2

Subsample                       20 Plots                40 Plots                60 Plots
Method                    20      60     100      40     120     200      60     180     300
Random     Mean        20.00   59.92   99.10   40.00  119.85  198.24   60.00  179.80  297.26
           SD              0    0.39    1.26       0    0.53    1.79       0    0.62    2.17
PPS        Mean        20.00   59.93   99.12   40.00  119.86  198.21   60.00  179.81  297.31
           SD              0    0.37    1.24       0    0.52    1.77       0    0.60    2.20
3P         Mean        20.62   59.61   98.31   41.23  119.21  198.84   61.69  177.44  297.03
           SD           5.83   11.35   14.56    8.24   15.81   20.00   10.11   19.31   24.02
Purposive  Mean        20.00   59.94   99.11   40.00  119.84  198.21   60.00  179.79  297.33
           SD              0    0.36    1.28       0    0.57    1.72       0    0.65    2.15
Target     Mean        19.57   59.49   98.85   39.68  118.10  197.48   59.34  176.94  298.00
           SD           5.12   10.15   12.49    7.25   13.91   17.20    8.93   16.46   21.84
Threshold  Mean        20.06   59.27  100.40   40.03  118.75  201.75   60.30  178.83  301.45
           SD           7.65   13.10   14.75   10.69   18.63   21.78   13.24   22.61   26.27

Test Population 3

Trends in achieved subsample size and variability for TP1 and TP2 were also shown in TP3 (Table 4.36). The random, PPS, and purposive methods gave similar results. The

Table 4.36: Mean and standard deviation of achieved subsample sizes by sample size and strategy for TP3
Subsample Method 20
Plots                  40 Plots                60 Plots
                          20      60     100      40     120     200      60     180     300
Random     Mean        20.00   59.20   92.65   40.00  118.26  185.20   60.00  177.38  277.88
           SD              0    0.96    3.57       0    1.36    4.94       0    1.64    6.35
PPS        Mean        20.00   59.13   92.67   40.00  118.33  185.29   60.00  177.48  277.99
           SD              0    0.97    3.54       0    1.40    4.90       0    1.65    6.22
3P         Mean        19.75   59.69   99.31   39.38  118.49  197.90   59.89  177.93  296.92
           SD           5.73   10.51   11.77    7.87   14.94   16.91    9.64   18.45   20.87
Purposive  Mean        20.00   59.14   92.70   40.00  118.31  185.38   60.00  177.40  277.85
           SD              0    0.96    3.51       0    1.37    5.04       0    1.66    6.03
Target     Mean        19.78   59.30   98.60   39.62  118.20  198.07   59.18  177.27  297.04
           SD           5.18    9.48   10.37    7.49   13.37   14.66    9.34   16.20   17.92
Threshold  Mean        19.36   60.28  100.66   38.41  120.11  200.96   58.13  180.93  302.33
           SD           7.19   11.63   12.33    9.86   16.60   17.24   12.68   20.17   21.20

3P, target, and threshold methods achieved the desired subsample size on the average, but were more variable than the other methods. The variability of the methods was similar to TP1 and higher than TP2. The achieved subsample size for the target method was less variable than for the 3P and threshold methods, as was shown in TP1 and TP2.

4.3.5 Cost

Test Population 1

Cost ratios (CRs) for TP1 were similar among methods for a given sample size (Table 4.37). The random and PPS methods gave approximately the same number of subsampled trees as other methods (Table 4.34), but the trees were subsampled with replacement, hence costs were reduced when the same tree was selected more than once.

Table 4.37: Cost ratios by sample size and strategy for TP1

Subsampling          20 Plots          40 Plots          60 Plots
Method           20    60   100    40   120   200    60   180   300
Random         0.72  0.86  0.96  0.70  0.84  0.95  0.68  0.82  0.94
PPS            0.72  0.84  0.92  0.70  0.80  0.90  0.68  0.80  0.90
3P             0.72  0.89  1.04  0.70  0.87  1.04  0.68  0.86  1.04
Purposive      0.72  0.89  1.01  0.70  0.87  1.01  0.68  0.86  1.01
Target         0.72  0.89  1.03  0.70  0.87  1.04  0.68  0.86  1.03
Threshold      0.72  0.89  1.04  0.70  0.87  1.05  0.68  0.86  1.04
This resulted in slightly lower CRs for the random and PPS methods for m = 3n and 5n trees, and the same CRs as other methods when m = n trees.

Test Population 2

The general trend for CRs in TP2 (Table 4.38) was the same as in TP1. CRs were similar for a given sample size among subsampling methods. The random and PPS methods gave lower CRs with m = 3n and 5n trees.

Table 4.38: Cost ratios by sample size and strategy for TP2

Subsample            20 Plots          40 Plots          60 Plots
Method           20    60   100    40   120   200    60   180   300
Random         0.72  0.83  0.92  0.71  0.82  0.92  0.69  0.82  0.92
PPS            0.72  0.81  0.88  0.71  0.80  0.89  0.69  0.80  0.88
3P             0.72  0.84  0.98  0.71  0.84  0.99  0.69  0.84  0.99
Purposive      0.72  0.85  0.99  0.71  0.85  0.98  0.69  0.84  0.99
Target         0.72  0.84  0.98  0.71  0.84  0.98  0.69  0.84  0.99
Threshold      0.72  0.84  0.99  0.71  0.85  0.99  0.69  0.84  0.99

Test Population 3

CRs for TP3 (Table 4.39) were similar to TP1 and TP2. CRs were similar among subsampling methods, except the random and PPS methods where again the CRs were lower for m = 3n and 5n trees.

Table 4.39: Cost ratios by sample size and strategy for TP3

Subsample            20 Plots          40 Plots          60 Plots
Method           20    60   100    40   120   200    60   180   300
Random         0.71  0.86  0.96  0.70  0.83  0.95  0.68  0.82  0.95
PPS            0.71  0.84  0.93  0.70  0.81  0.92  0.68  0.80  0.92
3P             0.71  0.89  1.04  0.69  0.87  1.04  0.68  0.86  1.05
Purposive      0.71  0.89  1.01  0.70  0.87  1.01  0.68  0.86  1.01
Target         0.71  0.89  1.03  0.70  0.87  1.05  0.68  0.86  1.05
Threshold      0.71  0.89  1.04  0.69  0.87  1.05  0.68  0.87  1.05

4.3.6 Relative Advantage

Test Population 1

The relative advantages (RAs) of the sampling strategies in TP1 (Table 4.40) were inversely proportional to the VRs (Table 4.22) for a given sample size. This was because costs were

Table 4.40: Relative advantage by sample size and strategy for TP1.
Underline indicates largest ratio for a given sample size
Columns: Subsample Method, Estimator, 20 Plots, 40 Plots, 60 Plots. Rows: Ratio, Average Ratio, and Regression estimators for each of the Random, PPS, 3P, Purposive, Target, and Threshold methods.

similar for a given sample size (Table 4.37). The target method was the most efficient, showing the highest RAs for all sample sizes except the threshold method with m = 5n trees for 60 plots. The highest RAs for a given number of subsampled trees ranged from 0.91 to 1.21. The target method was more efficient than the FS strategy for m = n and 3n trees where RAs exceeded unity, and was slightly less efficient for m = 5n trees.

The RAs for the target method with the ratio and regression estimators generally decreased as the number of subsampled trees increased for a given number of plots. The exception was the ratio estimator using 20 plots. The highest RAs were achieved with m = n trees for all n and decreased as subsample size increased. RAs for the average ratio estimator increased with the number of trees from m = n to 3n trees, but decreased to zero when m = 5n trees. This is the result of the highly variable estimates using the average ratio estimator (Tables 4.19 and 4.22).

The threshold and 3P methods consistently achieved the second largest RAs for a given sample size. RAs for three of the nine 3P sample sizes achieved unity and three of the nine threshold sample sizes exceeded unity. The largest RAs for the 3P and PPS methods were achieved with m = 3n trees. The exception was the 3P method with the average ratio estimator, where the largest RAs were with m = n trees with 60 plots.

The largest RAs for the purposive method were also achieved with m = 3n trees using the ratio and regression estimators. The exception was the regression estimator with m = n trees and n = 60 plots.
The purposive method was biased and highly variable with the average ratio estimator. The random, PPS, and purposive methods were less efficient than the FS procedure, with RAs less than unity. The RAs for the purposive method were slightly higher than those for the PPS method, which were slightly higher than those for the random method.

Test Population 2

The target method gave the highest RAs for most of the sample sizes in TP2 (Table 4.41), as was shown in TP1. However, the purposive method gave the highest RAs with m = n trees for 20 and 40 plots. The highest RAs for a given number of plots were always achieved with m = n trees. The threshold method gave the highest RA for m = 5n trees with 40 plots. The highest RAs with the target method were for m = n trees, as was shown with TP1, except for 20 plots with the ratio estimator. The increase in RA from m = n to m = 3n trees with the target method and average ratio estimator in TP1 was not shown in TP2. The highest RAs decreased as subsample size increased, as in TP1. RAs for the random and PPS methods were less than unity, as in TP1. The 3P method gave RAs much lower than unity, which was not shown in TP1. The threshold method also showed RAs lower than for TP1.

Table 4.41: Relative advantage by sample size and strategy for TP2. Underline indicates largest ratio for a given sample size. [Rows are the random, PPS, 3P, purposive, target, and threshold subsample methods, each with the ratio, average ratio, and regression estimators; columns are 20, 40, and 60 plots. Only the following values survive, in source order: 0.57 0.69 0.41 0.70 0.87 0.46 0.66 0.76 0.41 0.75 0.91 0.39 0.97 0.90 0.86 0.73 0.93 0.33]

Test Population 3

The RAs were generally lower in TP3 (Table 4.42) than in TP1 and TP2. The largest RAs were approximately unity for m = n and 3n trees and approximately 0.8-0.9 for m = 5n trees.
The largest RAs decreased as subsample size increased, as was shown in TP1 and TP2. The highest RAs were achieved with the purposive and target methods, as in TP1 and TP2. The pattern in TP3 was consistent, with the highest RAs achieved for: m = n trees with the purposive method and the average ratio estimator; m = 3n trees with the target method and the average ratio estimator; and m = 5n trees with the target method and the ratio estimator. The highest RAs for a given number of plots were always achieved with the purposive method using m = n trees (i.e., one tree/plot for this method). The target, 3P, and threshold methods gave the second highest RAs. The random method gave RAs less than unity, as in TP1 and TP2. The PPS method with the average ratio estimator gave RAs exceeding 0.90 for three of the nine sample sizes.

Table 4.42: Relative advantage by sample size and strategy for TP3. Underline indicates largest ratio for a given sample size

Subsample                   20 Plots            40 Plots            60 Plots
Method     Estimator        20    60    100     40    120   200     60    180   300
Random     Ratio            0.65  0.70  0.71    0.58  0.73  0.68    0.68  0.78  0.72
           Average Ratio    0.00  0.00  0.00    0.00  0.00  0.00    0.00  0.00  0.00
           Regression       0.38  0.45  0.44    0.30  0.31  0.31    0.27  0.29  0.30
PPS        Ratio            0.68  0.71  0.66    0.62  0.68  0.61    0.66  0.68  0.60
           Average Ratio    0.67  0.91  0.37    0.83  0.85  0.25    0.94  0.98  0.33
           Regression       0.40  0.37  0.34    0.28  0.26  0.23    0.24  0.23  0.18
3P         Ratio            0.67  0.80  0.71    0.68  0.75  0.68    0.74  0.80  0.76
           Average Ratio    0.77  0.92  0.83    0.84  0.84  0.74    0.91  0.91  0.80
           Regression       0.46  0.47  0.45    0.35  0.38  0.34    0.36  0.36  0.34
Purposive  Ratio            0.80  0.85  0.71    0.73  0.76  0.68    0.86  0.80  0.73
           Average Ratio    1.00  0.90  0.17    1.02  0.91  0.19    1.07  1.00  0.20
           Regression       0.44  0.47  0.45    0.36  0.38  0.33    0.29  0.36  0.32
Target     Ratio            0.86  0.94  0.86    0.87  0.92  0.76    0.93  0.99  0.81
           Average Ratio    0.87  1.00  0.73    0.82  0.94  0.74    0.97  1.04  0.76
           Regression       0.82  0.84  0.73    0.87  0.78  0.71    1.00  0.92  0.74
Threshold  Ratio            0.76  0.82  0.69    0.70  0.78  0.65    0.81  0.81  0.70
           Average Ratio    0.79  0.94  0.85    0.73  0.83  0.75    0.88  0.99  0.78
           Regression       0.53  0.48  0.43    0.43  0.39  0.34    0.41  0.41  0.33

Chapter 5

DISCUSSION

5.1 Tree Value

5.1.1 Predictor Variables

The variables selected to predict cruiser-called tree value reflected three major factors that determine tree value: i) the exponential increase in value with increased dbh (D); ii) the frequency of degrade quality and pathological indicators; and iii) the effect of the indicators in the grading rules.

The interaction of D² with the number of the first 5 m log with the first live limb (Q4) was generally the most important predictor variable for the high valued species (Douglas-fir, cedar, spruce, cypress, and hemlock). This reflected the degrade of peelers from live limbs in the butt-log. Spiral grain (Q1) was also important for Douglas-fir, balsam, and cypress, which again reflected degrade of valuable peeler logs.

The four interaction terms of D² in the Douglas-fir equation reflected the increasing importance of degrade indicators for large trees. The premium for peelers in larger trees is reflected in the interactions of D² with the number of the first 5 m log with live limbs (Q4) and the number of the first 5 m log with stubs (Q5). Lane et al. (1970) also found that butt-log condition was important for predicting the value of Douglas-fir. They used the diameter of the largest limb in the 16 foot butt-log and other predictor variables.

The linear relationship of value to dbh for cedar reflects the proportional increase in decay with increased dbh. Exponential increases in value are offset by exponential increases in decay. The interaction of D² with Q4 and Q5 (the number of the first 5 m log with live limbs and the number of the first 5 m log with stubs, respectively) for cedar reflected the condition of the butt-log, as was also shown for Douglas-fir.
The variables selected for predicting cruiser-called tree value showed the importance of both the relationship to tree value and the frequency of occurrence in the aggregate data. The equations to predict cruiser-called value included relatively few of the FS pathological and quality related indicators. This suggests that not all the FS indicators are needed to estimate cruiser-called value. More indicators should be collected for high valued species and fewer indicators for low valued species. For example, the condition of butt-logs should receive more attention for Douglas-fir and spruce than for balsam. Also, some indicators such as conks were not needed to estimate the cruiser-called value of cedar. The result of collecting all the FS indicators for all species is that it takes time (money) to obtain data that give very little information for estimating value.

5.1.2 Regression Approach to Estimating Value

The regression approach for estimating cruiser-called tree value had a significant effect on the comparison of efficiency among the model-dependent sampling methods. The regression method was easy to simulate and was relatively precise; however, value is expensive to estimate with this method. The concept for all sampling methods was to use a quick (thus relatively inexpensive) estimate of individual tree value for double sampling. The trade-off was between the precision and cost of the estimate. The RAs would differ with methods of estimating tree value, and this may result in a different ranking of the sampling methods.

Diameter, with its transformations and interactions with other indicators, was one of the most important predictors of cruiser-called value. Diameter is also relatively expensive to measure. However, dbh can be precisely estimated in a fraction of the measurement time.
It may take 2-3 minutes to measure dbh, especially for large trees and trees that are difficult to reach; however, dbh can be visually estimated in seconds. Visually estimated diameter could be used with other predictor variables to give precise, inexpensive estimates of tree value. For example, species, visually estimated dbh, and three tree quality classes should account for most of the variation in tree value. Estimates of value would be less precise than with the regression approach; however, this method would be much less expensive than using measured dbh and other more time consuming predictor variables.

The Coastal Cruising Supervisors Task Force conducted a study¹ to examine the precision of visually estimated dbh. Four crews from four companies estimated the dbh of point-selected trees in 70 plots. The CV of the error for 440 trees was about 5-10% and was relatively constant across dbh classes. The errors were normally distributed and about 92% were within ±5% of the measured dbh.

The regression approach used to provide a rough estimate of cruiser-called tree value in this study would not be useful in practice. The dependent variable in the regressions was cruiser-called value, which was a function of physical tree characteristics and log market prices. This approach would require that new equations be fitted each time the log market prices changed. A more practical approach would be to use multivariate regression equations to predict log grade percent. Seegrist (1975) and Howard and Yaussy (1986) used multivariate regression equations to estimate percent lumber recovery by grade. Tree value could be computed with the estimated percentage net volume by grade from the multivariate equations with current log market prices. This approach would also allow direct recalculation of tree values for re-appraisal of completed cruises.

¹ Personal communication with Mr. Mark Leja, Assistant Inventory Forester, Fletcher Challenge Canada Ltd. P.O. Box 2000, New Westminster, B.C. V3L-5A4.

5.1.3 Tree Value and Basal Area

The highly variable $BARs indicate that two to three times more sample trees are needed to estimate cruiser-called value to the same precision as volume. Very high valued cypress had $BAR CVs approximately three times the CV for VBAR. CVs for $BARs were approximately double the VBAR CV for Douglas-fir, cedar, and spruce, and approximately one and one-half times the VBAR CVs for low valued hemlock, balsam, and the pines. The magnitude of the CV for value is proportional to the grade differential in value. Iles and Bell (1983) reported CVs for $BARs of individual trees of approximately 1.0 to 1.4 times the CVs for VBAR.

5.2 Forest Service Operational Cruise Procedure

The unbiasedness of the FS strategy is consistent with point-sampling theory. The disproportionate coverage of 95% nominal CIs, where the frequency of \(Y > y + t s_y\) is greater than 2.5% and the frequency of \(Y < y - t s_y\) is less than 2.5%, is typical of positively skewed populations (Cochran 1977 p.41, Kish 1965 p.411).

The Central Limit Theorem states that sample means are asymptotically normally distributed, regardless of the distribution of the population. However, the rate at which the sample distribution approaches normality depends on the distribution of the population and the sample size. Cochran (1977 p.42) gave a rule of thumb for estimating the sample size required to give a normal distribution of estimates as \(n > 25 G_1^2\), where \(G_1\) is the third central moment (the skewness coefficient - Tables 3.3, 3.4, and 3.5 for TP1, TP2, and TP3, respectively). This rule estimated sample sizes of n > 106, 102, and 81 plots for TP1, TP2, and TP3, respectively. The null hypothesis of a normal distribution was not rejected with the Shapiro-Wilk's statistic for 2000 samples with n = 110 for TP1 and n = 85 for TP3. The null hypothesis was rejected for n = 105 plots with TP2 but could not be rejected with n = 110.
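As a rough illustration of the rule of thumb above, the following Python sketch computes the smallest whole number of plots satisfying \(n > 25 G_1^2\). The function name and the skewness values in the usage lines are hypothetical; the skewness coefficients actually used in this study are those of Tables 3.3 to 3.5, which are not reproduced here.

```python
import math


def cochran_min_sample_size(skewness: float) -> int:
    """Smallest integer n that satisfies the rule of thumb n > 25 * G1**2.

    `skewness` is the population skewness coefficient G1 (assumed known
    or estimated beforehand).
    """
    threshold = 25.0 * skewness ** 2
    # The inequality is strict, so step to the next integer above the threshold.
    return math.floor(threshold) + 1


# Hypothetical skewness coefficients, for illustration only.
for g1 in (0.5, 1.0, 2.0):
    print(f"G1 = {g1:.1f}: n >= {cochran_min_sample_size(g1)} plots")
```

Because the required sample size grows with the square of the skewness coefficient, a modest reduction in skewness (for example, by stratifying the highest valued trees) lowers the number of plots needed for approximate normality considerably.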
Kurtosis coefficients were not significant, but the skewness coefficients were significant at α = 0.05/2 for n = 110 plots with TP1, and for n = 105 and 110 plots with TP2.

Sampling in skewed populations is discussed in the literature (e.g., Hansen et al. 1953 p.102, Kish 1965 p.404). Stratification of the large elements is a practical solution for some situations. For operational cruising, the forest area is usually stratified by timber type from aerial photographs prior to sampling, and after sampling from ground information. This usually results in strata of similar density, species, and height. This should partially alleviate the problem of the highly skewed distribution of tree values; however, very high valued trees can occur in many timber types. Log market prices fluctuate, and species such as yellow cedar can be very valuable in relation to other species. Douglas-fir, cedar, and spruce that yield peelers and high quality sawlogs are usually much more valuable than balsam and hemlock that often yield only lower quality sawlogs and pulpwood (Table 3.6). A possible solution is to treat yellow cedar and other high valued species as separate strata. Sampling efficiency would also be increased by using enhanced count plots (point-sample plots where only the presence of high valued trees is of interest).

5.2.1 Precision of Sampling for Value

The precision of estimating value/ha with the FS cruise procedure is much lower than the precision of net volume/ha estimates. The results for $BARs and VBARs indicate that many more trees are needed to estimate cruiser-called value than are needed to estimate volume. In other words, an FS cruise that achieves a 10% error for volume does not achieve a 10% error for value.

The precision of estimating value/ha can be expressed in absolute or relative terms. The main justification for absolute precision is that it may be desirable to estimate the
value of any stand to within a given amount of money. For example, it might be desirable to estimate the value of any stand to within ±$5000/ha at 95% confidence, regardless of the total value/ha. However, a fixed dollar amount may be ±50% of the stand value for low valued stands, and the same amount may be only ±10% of the stand value for high valued stands. Also, an absolute precision requirement may require 100% sampling for very high valued stands and only a few plots for low valued stands.

The main justification for measuring sampling precision for value/ha in relative terms is that stand estimates should be of the same relative precision, regardless of their absolute value. Relative precision is currently used for estimating net volume/ha. However, it does not seem reasonable to spend a lot of time and money to obtain a precise estimate of low valued stands when the error is very small in absolute terms. With relative precision, estimates of value might be to within ±$10000/ha for a high valued stand and to within ±$2000/ha for a low valued stand.

5.2.1.1 Single-Stage Point Sampling

The precision of the conventional FS sampling strategy (single-stage) is simply a function of the variability of the estimated value/ha among sample plots. The CVs for estimated value/ha among sample plots were approximately 77%, 63%, and 71% for TP1, TP2, and TP3, respectively. Assuming simple random sampling, sample size can be estimated with the usual formula \(n = t^2 CV^2 / E^2\). If we assume that the t for the number of plots is the same for volume and value, the ratio of the number of plots needed to achieve the same precision for value estimation as for net volume estimation is

\[ \frac{n_{\$}}{n_v} = \frac{CV_{\$}^2}{CV_v^2} \qquad (5.52) \]

where \(n_{\$}\) is the number of plots for estimating value/ha to a given level of precision, \(n_v\) is the number of plots for estimating net volume/ha to the same level of precision, \(CV_{\$}\)
is the coefficient of variation of value/ha among sample plots, and \(CV_v\) is the coefficient of variation of net volume/ha among sample plots. Thus to estimate value to the same level of precision as net volume, approximately 2.2, 2.4, and 2.0 times as many plots are needed for TP1, TP2, and TP3, respectively.

Table 5.43 shows that the sample size needed to estimate value/ha to ±5% at 95% confidence is prohibitively large in these test populations. Sample sizes needed to estimate value/ha to ±15% could be used in practice. The CVs for value could be reduced in practice by further stratification of timber types.

Table 5.43: Approximate number of plots needed to estimate value/ha and net volume/ha to within ±5, 10, and 15% at the 95% confidence level with the Forest Service sampling

Test          Value ($/ha)          Net Volume (m3/ha)
Population    5%     10%    15%     5%     10%    15%
TP1           913    230    104     418    106    49
TP2           612    155    70      261    67     31
TP3           776    196    88      371    95     43

The appropriate level of precision for estimating value/ha should consider the overall monetary policy of the FS and the practical implications for sampling. The level of precision should also be based on an appropriate loss function that considers the value of information.

Figure 5.19 shows that for a given number of plots, the percent sampling error of the FS sampling strategy is much higher for estimating cruiser-called value than for estimating net volume in all three test populations.

Figure 5.19: Relationship between percent sampling error of estimating value/ha and net volume/ha with the Forest Service sampling strategy for TP1, TP2, and TP3. Lines are in the order of TP1, TP3, and TP2 from top to bottom for value (dashed lines) and for net volume (solid lines). [Axes: Error % against Number of Plots, 0 to 200.]

5.2.1.2 Two-Stage Point Sampling

The variation in estimated value/ha can be separated into two components in terms of BA: the variation in BA/ha among sample plots and the variation in $BAR among subsampled trees. This is analogous to separating the variation in estimated volume/ha into the variation in BA/ha among sample plots and the variation in VBARs (volume to BA ratio) among sample plots or sample trees (Bruce 1961). If we assume simple random sampling and that BA and $BAR are independent, the combined percent SE of estimated value/ha is

\[ SE\%_C = \left[ SE\%_{BA}^2 + SE\%_{\$BAR}^2 \right]^{1/2} \qquad (5.53) \]

where \(SE\%_{BA}\) is the percent standard error of BA/ha among sample plots and \(SE\%_{\$BAR}\) is the percent standard error of $BAR among sample trees. Thus there are many combinations of the number of BA plots and $BAR trees that will achieve a given level of precision. Figures 5.20, 5.21, and 5.22 show the combinations of the number of sample plots needed to estimate BA/ha and the number of subsampled trees needed to estimate $BAR to give 10, 15, and 20% sampling error at 95% confidence for TP1, TP2, and TP3, respectively.

5.3 Model-Dependent Subsampling Methods

5.3.1 Bias

The bias of estimated value/ha varied with estimator, subsampling method, and test population. Some estimators were more prone to bias under certain population conditions. For example, the average ratio estimator introduced serious bias in populations with trees of low estimated value. However, this was not a problem for subsampling methods that did not select these low valued trees.

The bias of the model-dependent sampling strategies is less important when non-sampling errors are considered. For example, a negative bias of $1464/ha is introduced if a 100 cm dbh tree valued at $5000 is missed in a 60 plot cruise using a 13.8 BAF
prism (a 100 cm tree has a basal area of 0.785 m², so it represents 13.8/0.785 = 17.6 trees/ha in its plot, and 17.6 × $5000 / 60 plots ≈ $1464/ha). This is a bias of -4.8% for TP1, -3.0% for TP2, and -2.6% for TP3. Iles and Bell (1983) reported that an average tree can represent $160000 for extensive inventory cruises and large trees can represent $300000.

Figure 5.20: Number of BA plots and $BAR trees to estimate value/ha with 10% (solid), 15% (long dashed), and 20% (short dashed) sampling error at 95% confidence in TP1. [Axes: Number of Trees (0 to 1000) against Number of Plots (50 to 400).]

The biased estimates of value/ha resulted from using ratio estimators of the form Y = XR. The random selection of plots ensured unbiased estimation of X; however, biased estimates of R resulted from not weighting individual trees by their selection probabilities. This supports the findings of Hansen et al. (1983), where model-dependent ratio estimators were biased despite samples that were balanced on X.

The ratio estimator is theoretically unbiased under the model \(\xi[0,1: v(x)]\), for any variance function. However, the finite test populations in this study had a small intercept term in the relationship of true value (y) to estimated value (x). Simple linear regression of y on x showed that for TP1, TP2, and TP3, respectively, y = 13.05 + 0.92x, y = -6.08 + 0.99x, and y = -11.80 + 1.098x.

Figure 5.21: Number of BA plots and $BAR trees to estimate value/ha with 10% (solid), 15% (long dashed), and 20% (short dashed) sampling error at 95% confidence in TP2. [Axes: Number of Trees against Number of Plots.]

Under the model \(\xi[0,1: x]\), the bias of the ratio estimator is \(N\beta_0(\mu_x - \bar{x})/\bar{x}\) if the real model is \(\xi[1,1: x]\) (Royall and Herson 1973a, Royall and Cumberland 1981a). The bias increases as \(\bar{x}\) departs from \(\mu_x\), and is zero when the sample is balanced on the first moment of x (i.e., \(\bar{x} = \mu_x\)).
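The bias expression above can be checked with a short Python sketch. It assumes a linear model with intercept, y = β0 + β1·x plus a zero-mean error, and compares the model expectation of the ratio estimator of the total with the model expectation of the true total. The TP1 coefficients (13.05 and 0.92) are taken from the regression reported above; the population size, population mean, and sample mean of x are hypothetical values chosen only for illustration.

```python
import math


def ratio_estimator_bias(N: float, beta0: float, mu_x: float, xbar_s: float) -> float:
    """Model bias of the ratio estimator of the total: N * beta0 * (mu_x - xbar_s) / xbar_s."""
    return N * beta0 * (mu_x - xbar_s) / xbar_s


def expected_ratio_estimate(N, beta0, beta1, mu_x, xbar_s):
    """Model expectation of N * mu_x * (ybar_s / xbar_s) when E[y] = beta0 + beta1 * x."""
    expected_ybar_s = beta0 + beta1 * xbar_s
    return N * mu_x * expected_ybar_s / xbar_s


def expected_true_total(N, beta0, beta1, mu_x):
    """Model expectation of the population total under the same model."""
    return N * (beta0 + beta1 * mu_x)


# beta0, beta1 from the TP1 regression in the text; N, mu_x, xbar_s are hypothetical.
N, beta0, beta1 = 1000, 13.05, 0.92
mu_x, xbar_s = 100.0, 80.0

direct = expected_ratio_estimate(N, beta0, beta1, mu_x, xbar_s) - expected_true_total(N, beta0, beta1, mu_x)
formula = ratio_estimator_bias(N, beta0, mu_x, xbar_s)
print(direct, formula)  # the two quantities agree

# A sample balanced on the first moment of x (xbar_s == mu_x) has zero model bias.
print(ratio_estimator_bias(N, beta0, mu_x, mu_x))
```

The sketch only restates the algebra; it does not simulate the sampling designs used in this study.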
The random selection of plots in this study gave samples that were balanced on X on the average. However, few individual samples would be balanced on X, especially at n = 20 plots. Hansen et al. (1983) showed that despite balanced samples, model-dependent estimators gave biased estimates. They noted that bias was the result of the estimators not considering the sampling design. The bias of the ratio estimators for these model-dependent sampling designs might be reduced by using a jackknife estimator (Cochran 1977 p.175, Gregoire 1983).

Under model-dependent sampling theory, the estimator does not need to consider the method used to select the sample if the model is correct. If the three test populations in this study are realizations of an unknown superpopulation, the true model is unknown. However, if the test populations are the superpopulations from which individual samples were generated, then the assumed models \(\xi[0,1: x]\), \(\xi[0,1: x^2]\), and \(\xi[0,1: 1]\) are incorrect for two reasons. First, the relationship of true value to estimated value includes a small, and probably unimportant, intercept. The intercept is 13.06, -6.08, and -11.80 for TP1, TP2, and TP3, respectively. Second, the assumed variance functions v(x) were \(e_k x_k^{1/2}\), \(e_k x_k\), and \(e_k\), where the \(e_k\) are independent and identically distributed random variables with mean zero and variance \(\sigma^2\). However, the variance of \(y_k\) for the three test populations was not a continuous function of \(x_k\) (Figures 4.4, 4.5, and 4.6).

Figure 5.22: Number of BA plots and $BAR trees to estimate value/ha with 10% (solid), 15% (long dashed), and 20% (short dashed) sampling error at 95% confidence in TP3. [Axes: Number of Trees against Number of Plots.]

The large bias with the average ratio estimator was the result of including sample trees with low estimated values that had much higher cruiser-called values. These trees gave very high ratios, resulting in a positive bias with the average ratio estimator.
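A small numerical sketch makes the sensitivity of the average ratio estimator concrete. The Python code below contrasts the ratio of means with the mean of ratios for a handful of trees, one of which has a very low estimated value and a high cruiser-called value. All tree values here are hypothetical and are not taken from the test populations.

```python
def ratio_of_means(x, y):
    """Ratio estimator: total cruiser-called value over total estimated value."""
    return sum(y) / sum(x)


def mean_of_ratios(x, y):
    """Average ratio estimator: every tree's ratio y/x receives equal weight."""
    return sum(yi / xi for xi, yi in zip(x, y)) / len(x)


# Hypothetical subsample: estimated value (x) and cruiser-called value (y), in dollars.
x = [200.0, 400.0, 600.0, 800.0]
y = [190.0, 410.0, 580.0, 820.0]
print(ratio_of_means(x, y), mean_of_ratios(x, y))   # both close to 1

# Add one tree with a very low estimated value and a high cruiser-called value.
x_extreme = x + [0.5]
y_extreme = y + [1000.0]
print(ratio_of_means(x_extreme, y_extreme))   # rises moderately
print(mean_of_ratios(x_extreme, y_extreme))   # dominated by the single ratio of 2000
```

The ratio of means weights each tree by its estimated value, so a tree with a near-zero estimate contributes little to the denominator; the mean of ratios gives that tree the same weight as any other, which is why a single such tree can dominate the estimate.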
Significant bias was introduced into a sample by including only one tree with a high ratio. For example, a yellow cedar tree in TP1 had an estimated value of $0.28 and a cruiser-called value of $1510.84, giving a ratio of 5396. A Douglas-fir tree in TP3 had an estimated value of $0.01 and a cruiser-called value of $169.35, giving a ratio of 16935. Each ratio is weighted equally with the average ratio estimator, hence including the large ratios significantly biased the estimate (Sarndal 1978).

The random subsampling method was prone to this bias because each point-selected tree in a plot had the same probability of selection. Consequently, there was no mechanism to favor trees with high estimated values, where the large ratios did not occur. A large bias also occurred with other subsampling methods when m_i > 1 tree/plot. Trees with large ratios occasionally occurred in plots with fewer than the desired number of subsampled trees. All trees in the plot were sampled in this event. Hence trees with large ratios were subsampled, despite a method that favored trees with high estimated values. This occurred for m_i = 5 trees/plot with the PPS and purposive methods with TP1 (Table 4.19) and TP3 (Table 4.21). A similar bias did not occur in TP2 because of the high average number of trees/plot. The average number of trees/plot was 9.86 for TP2 (Table 3.5), 6.97 for TP1 (Table 3.3), and 6.56 for TP3 (Table 3.4). Hence plots with fewer than 5 trees occurred less frequently in TP2.

The large bias for m = n trees with the target method in TP1 was the result of the deviation value being equal to the target value (Table 3.11). Hence all trees with estimated values between 0 and \(2\mu_x\) were subsampled, including trees with low estimated values and high ratios. This wide range was required to achieve the desired subsample size.

The low number of trees/plot caused the bias for m_i = 3 trees/plot with the purposive method in TP1.
This resulted in sampling all trees when plots contained three or fewer trees. This did not occur in TP2 and TP3. TP2 had a higher average number of trees/plot than TP1, and the occurrence of low value trees was less frequent in TP2 and TP3.

5.3.2 Sample Variance

The ranking of subsampling methods by VR (without regard for estimator) was consistent among test populations: target < purposive < threshold < 3P < PPS < random. The rank of the 3P and threshold methods was reversed for TP1. Thus the purposive-based subsampling methods were more precise than the probability-based methods. This supports model-dependent sampling theory; however, the theory suggests that the threshold method should have been the most precise.

Threshold sampling can be considered the model-dependent equivalent of 3P sampling. 3P sampling is simply a priori list sampling where selection probability is \(x / \sum KPI\), and \(\sum KPI\) is the a priori estimate of the population total for the covariate. Hence selection probability is considered over all trees expected to be sampled. Selection of individual trees in threshold sampling depends on an a priori estimate of the frequency distribution of estimated tree values. The 3P and threshold methods gave similar VRs, but neither was the lowest.

The threshold method did not give the lowest VRs because of random sample size and population characteristics. The desired subsample size was achieved on the average, thus approximately one-half of the samples contained fewer than the desired number of sample trees. These samples were less precise than samples with the desired number of trees or more. However, population characteristics had a greater effect on the variability of the threshold method. Only trees with the highest estimated value were sampled with the threshold method. The distribution of y at large values of x was sporadic and non-symmetric. High value trees occurred less frequently than lower valued trees. Consequently, variability was
Consequently, variability was Chapter 5. DISCUSSION 122 increased by including the high valued trees. This also introduced a bias that further increased the VR. The sporadic, non-symmetric distribution of y at large x also explains why the target method showed the lowest VRs. The distribution of y in the interval \ix Â± 8 was more consistent and symmetric than at high values of x. Hence, estimates from the target method were less variable than from the threshold method. The similarity in VRs of estimates among estimators with the target subsampling method was the result of re-stricting x to the range fix Â±8. As 8 decreased and the range of x narrowed to fix, the variability of the three estimators became equivalent. 5.3.3 Confidence Interval Coverage Population characteristics had more affect on CI coverage than sample size, estimator, or subsampling method. Sample distributions were not normally distributed, hence CI coverage was not as expected under the assumption of normality. CIs for TP3 were closer to the expected frequency of 2.5%, 95.0%, and 2.5% than T P l or TP2. Skewness and kurtosis coefficients for TP3 were also smaller than for T P l and TP2 (Tables 3.3, 3.4, and 3.5). It is not possible to separate the effect of the variance estimator from the subsampling method in this study. However, both the jackknife and the combined SE estimators underestimated the true variance and gave shorter CIs than expected. Thus confidence was overstated. The combined SE estimator is simple to use. The jackknife estimator is more difficult to implement, but is useful in complex designs where analytical estimators are not available. The jackknife variance estimator was found to be equivalent or better than complex analytical functions in many studies from a wide range of populations and sampling methods (Royall and Eberhardt 1975, Royall and Cumberland 1978, 1981a, 1981b, Schreuder and Anderson 1984, Schreuder et al. 1984a). Chapter 5. 
CI coverage is also affected by bias. Cochran (1977 p.15) noted that the effect of bias on CIs is modest when the ratio of bias to the standard deviation is less than 20%. CIs can be more accurately estimated with other methods. Stratifying the large value trees should reduce the skewness of the population so that estimates are more nearly normally distributed. Larger samples and better variance estimators will also improve CI coverage.

5.3.4 Achieved Subsample Size

Subsample size for the random, PPS, and purposive methods was a function of the number of trees/plot. There was no variation in subsample size if all plots contained m_i or more trees. This was observed when m_i = 1 tree/plot. The effect of the number of trees/plot was shown by the lower variability for m_i = 3 and 5 trees/plot in TP2 (Table 4.35) compared to TP1 (Table 4.34) and TP3 (Table 4.36), which had fewer trees/plot.

The actual subsample size was always less than the desired subsample size with the random, PPS, and purposive methods. The desired precision may not be achieved if subsample size is less than anticipated. This can be alleviated by using a larger number of subsample trees than needed to meet the precision requirements. Also, the distribution (known or estimated) of trees/plot can be used to estimate the number of plots that contain fewer than m_i trees. The desired subsample size can be more closely approximated by distributing the estimated shortfall over the sample plots. For example, an extra tree can be sampled at every third plot to overcome an estimated shortfall of 20 trees in 60 plots.

The random, PPS, and purposive methods were biased because each tree received the same weight in the estimator but had different selection probabilities. An unbiased method to select sample trees is to generate the desired number of random numbers between 1 and M, where M is the expected number of trees in the n sample plots. Trees are sampled if their tally
number corresponds to one of the random numbers (e.g., Yandle 1982). Trees can also be systematically sampled, e.g., sample every kth point-selected tree (Bell et al. 1983). Both methods are theoretically unbiased but also give random subsample sizes.

Subsample size of 3P sampling is widely discussed in the literature (Grosenbaugh 1976, 1979, Schreuder et al. 1968, 1971). On the average, one sample tree is selected for every KZ units of \(\sum KPI\). In this study, KZ was computed from the known \(\sum KPI\) for the average plot in the test populations (Table 5.44). However, the theoretical KZ values gave much lower sample sizes than expected (Table 5.45). This is explained by examining the tree value distributions. The 5% of trees with the highest value accounted for 39%, 36%, and 33% of the total value of all trees in TP1, TP2, and TP3, respectively. In 3P sampling, trees are sampled without replacement with probability x/KZ. However, x may be 40 times larger than KZ for high valued trees. Hence high valued trees should be sampled x/KZ times, but can only be sampled once. The resulting sample size is less than expected. Estimation of KZ for value is more problematic than for volume because value is much more variable than volume.

Table 5.44: Theoretical KZ values for 3P subsampling

m      TP1     TP2     TP3
n      1968    3188    2813
3n     656     1063    938
5n     394     638     562

Table 5.45: Mean and standard deviation of achieved subsample size by sample size for the 3P method using theoretical KZ values

              20 Plots                40 Plots                  60 Plots
              20     60     100      40     120     200       60     180     300
TP1   Mean    18.36  44.27  58.75    37.10  87.97   118.38    55.25  131.39  177.51
      SD      5.62   9.70   11.10    8.07   13.25   16.17     9.38   16.12   18.61
TP2   Mean    19.47  50.57  72.89    38.49  101.31  145.76    58.38  151.65  220.04
      SD      5.70   10.21  12.44    8.07   14.93   17.80     9.99   18.09   21.89
TP3   Mean    19.31  47.16  64.21    38.58  95.01   128.56    57.84  142.10  191.96
      SD      5.58   9.25   10.96    8.23   13.75   15.86     9.92   16.30   19.68
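The shortfall described above can be illustrated with a minimal Python sketch. It assumes that a tree with estimated value v is selected with probability min(1, v/KZ), so the expected subsample size is the sum of these capped probabilities, whereas the usual calculation (total estimated value divided by KZ) ignores the cap. The tree values and function names are hypothetical, and the bisection search is only one possible way of finding a KZ that gives the desired expected subsample size; it is not the procedure used in this study.

```python
def nominal_sample_size(values, kz):
    """Expected sample size from the usual formula: sum of estimated values / KZ."""
    return sum(values) / kz


def expected_sample_size(values, kz):
    """Expected sample size when each tree can be selected at most once."""
    return sum(min(1.0, v / kz) for v in values)


def solve_kz(values, target, iterations=200):
    """Bisection for the KZ whose capped expected sample size equals `target`.

    Requires 0 < target < number of trees with positive value. The capped
    expectation is non-increasing in KZ, so bisection is sufficient.
    """
    lo = 1e-12                      # expected size is about len(values) here
    hi = sum(values) / target       # capped expected size is <= target here
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if expected_sample_size(values, mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical skewed population: 95 low valued trees and 5 very high valued trees.
values = [1.0] * 95 + [40.0] * 5
target = 20

kz_usual = sum(values) / target
print(nominal_sample_size(values, kz_usual))    # 20 by construction
print(expected_sample_size(values, kz_usual))   # well below 20 because of the cap

kz_adjusted = solve_kz(values, target)
print(kz_adjusted, expected_sample_size(values, kz_adjusted))
```

In this hypothetical population the five high valued trees hold most of the total value but can contribute at most five sample trees, so KZ must be reduced well below the usual value before the expected subsample size reaches the target.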
The appropriate KZ might be more closely approximated using the distribution of values for point-selected trees. The KZ giving the desired total number of subsample trees can be estimated by substituting different KZ values into the following equation. When the distribution of tree values is expressed as a continuous function, the expected sample number (ESN) is

\[
ESN = M \int_{KZ}^{MAX} f(v)\,dv + \left[ M - M \int_{KZ}^{MAX} f(v)\,dv \right] \frac{1}{KZ} \int_{0}^{KZ} v\,f(v)\,dv \qquad (5.54)
\]

where M is the estimated total number of point-selected trees in n sample plots, MAX is the estimated maximum tree value, and f(v) is a pdf representing the probability of observing a tree with estimated value v. The first term on the right-hand side is the expected number of point-selected trees where estimated value exceeds KZ (sure-to-be-measured trees). The second term is the expected number of point-selected trees where the estimated value is less than KZ (in square brackets), multiplied by the sum of the probabilities of observing a tree with estimated value v, times the probability of that tree being 3P sampled.

This method was used with moderate success using a Weibull function to describe the distribution of estimated tree values in TP1. The estimated KZ values were closer to the values estimated by trial and error (Table 3.10) than the theoretical values estimated with the usual formula (Table 5.44). However, this method is probably very sensitive to the estimated pdf for tree value.

The deviation values (δ) in the target subsampling method are also difficult to estimate without a pdf of tree value. Small δ values give small samples and may not achieve the desired precision. Large δ values give large samples that may exceed the desired precision and are too costly. Subsample size for the threshold method is also controlled by an estimate based on prior knowledge of the distribution of tree values.
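The idea of substituting different KZ values until the expected sample number matches the desired one can be sketched as a bisection search. This sketch is not the continuous-pdf calculation of Equation 5.54: it assumes a known list of estimated tree values and uses the discrete expectation, the sum of min(1, x/KZ) over trees, as a stand-in for the pdf. The function names and the toy values are hypothetical.

```python
def expected_sample_number(values, kz):
    """Discrete ESN: trees with x >= KZ are certain, others have probability x/KZ."""
    return sum(min(1.0, v / kz) for v in values)


def solve_kz(values, desired_m, iterations=100):
    """Search for the KZ whose expected sample number equals desired_m."""
    if not values or min(values) <= 0:
        raise ValueError("all estimated values must be positive")
    if not 0 < desired_m <= len(values):
        raise ValueError("desired_m must be between 1 and the number of trees")
    lo = min(values)              # every tree is certain here, so ESN = M
    hi = sum(values) / desired_m  # usual formula, so ESN <= desired_m
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if expected_sample_number(values, mid) > desired_m:
            lo = mid  # too many trees expected: KZ must be larger
        else:
            hi = mid  # too few trees expected: KZ must be smaller
    return hi


# Hypothetical skewed population (assumption for illustration only).
values = [1.0] * 95 + [100.0] * 5
kz_usual = sum(values) / 20
kz_adjusted = solve_kz(values, 20)
print("usual KZ:", kz_usual, "ESN:", expected_sample_number(values, kz_usual))
print("adjusted KZ:", kz_adjusted, "ESN:", expected_sample_number(values, kz_adjusted))
```

The search relies on the expected sample number decreasing as KZ increases, so the adjusted KZ is always at or below the value from the usual formula. In practice the list of values would itself be an estimate, so the result inherits the sensitivity noted above.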
A modified form of sequential sampling might give better control of subsample size for the target and threshold methods. The number of subsampled trees and the precision could be monitored during sampling, and subsampling intensity could be increased or decreased based on the number of trees selected. True sequential sampling is not appropriate for systematic or purposive methods. This application of sequential sampling could be used to control subsampling intensity, but should not be used to stop sampling when a predetermined precision is reached, as is the usual case.

5.3.5 Relative Advantage

The highest RAs (greatest efficiency) were achieved with the target and purposive subsampling (purposive-based) methods in all three test populations. These methods were up to 20% more efficient than the FS method with m = n and 3n trees. The highest RA for a given number of plots was always achieved with m = n trees. For TP2 and TP3, where the purposive method had the highest RAs, m = n trees was achieved by purposively selecting one tree/plot. The random and PPS methods were clearly less efficient than the FS strategy and the other subsampling methods.

Strategies showing the lowest VRs also gave the highest RAs because the cost for a given sample size was fairly constant among strategies. However, cost varied with subsample size and method. Absolute RAs were not proportional to VRs; consequently, the target method gave the highest RAs for eight of the nine sample sizes for TP1 and for six of the nine sample sizes for TP2 and TP3.

There was a consistent trend among test populations for RAs to decrease as the subsample size increased for a given estimator and number of plots. This suggests that only one tree/plot is needed for estimating value. The increased precision (lower VR) from measuring more than one tree/plot was not justified by increased efficiency (higher RA).
5.4 The Simulation Process

Monte Carlo simulation is a cost-effective approach to evaluating sampling methods. Point-sampling can be simulated with a stem map where trees are selected with gauge angle and horizontal distance (e.g., Liu 1978, Murchison 1984, Schreuder et al. 1987). Schreuder et al. (1984a) selected trees from a list with probability proportional to basal area to give a desired average number of trees/plot. Yandle et al. (1983) simulated point-sampling by selecting individual plots from a list of many point-sample plots, as was done in this study. There were several major advantages to this approach.

1. Existing operational cruise data were available at no cost.
2. The data were collected with high precision by professional cruisers.
3. The data represented actual old-growth forest conditions.
4. The abundance of data made it possible to create several test populations, allowing the sampling methods to be evaluated over a range of conditions.

Monte Carlo simulation of point-sampling using a stem map is desirable because all possible plot clusters can be selected. However, detailed stem maps of sufficient size to give a comprehensive evaluation of sampling methods are rare. Compiling a stem map is expensive. This often results in using existing maps that may not represent the conditions of interest. The cost of mapping a large area is prohibitive and results in simulating methods on a relatively small area.

Selecting point-sample plots from a list is theoretically incorrect because each tree can be contained in only a single plot. If points are randomly located in the forest, a given tree can be contained in an infinite number of plots. However, there are a finite (although very large) number of point-sample clusters that are defined for a given area. Selecting whole plots may be more realistic because point-sample plots are systematically located in operational inventories, and individual trees are usually contained in only one plot.
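The list-based approach described above can be sketched as follows. This is a minimal illustration, not the simulation program used in this study: it assumes each plot on the list has already been reduced to a single value/ha figure, draws whole plots without replacement, and summarizes the bias and variance of the sample mean over 2000 replications. The plot values are generated from a lognormal distribution purely as a stand-in for real cruise data, and the function name is hypothetical.

```python
import random
import statistics


def simulate_plot_sampling(plot_values, n_plots, reps=2000, seed=1):
    """Repeatedly draw n_plots whole plots from the list and summarize the estimates."""
    rng = random.Random(seed)
    true_mean = statistics.fmean(plot_values)
    estimates = [
        statistics.fmean(rng.sample(plot_values, n_plots))  # sample without replacement
        for _ in range(reps)
    ]
    bias_pct = 100.0 * (statistics.fmean(estimates) - true_mean) / true_mean
    return {
        "true_mean": true_mean,
        "bias_pct": bias_pct,
        "variance": statistics.variance(estimates),
    }


# Hypothetical test population of 200 plots (assumption for illustration only).
population_rng = random.Random(42)
population = [population_rng.lognormvariate(10.0, 0.8) for _ in range(200)]

for n in (20, 40, 60):
    result = simulate_plot_sampling(population, n)
    print(n, round(result["bias_pct"], 2), round(result["variance"], 1))
```

A subsampling strategy would replace the simple plot mean with its own estimator inside the loop; the surrounding bookkeeping for bias and variance would stay the same.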
Monte Carlo simulation gives a general indication of the relative efficiency of alternative sampling methods. However, simulation of physical processes is only approximate, and field trials are needed to fully evaluate the effect of a design. This simulation did not incorporate non-sampling errors. This is reasonable for missed trees, but error in measurement of tree height may be a substantial component of variance. Cost functions were averages and may not provide realistic estimates for all conditions.

Point-sampling is a complex procedure when specific activity times are estimated for more than one person. For example, professional cruisers observe stand conditions as they walk along a cruise line. They mentally note the occurrence of pathological indicators. They observe trees in the area of a plot as they approach plot center and look for trees to measure for height, positions from which to take angle shots for tree heights, broken tops, and other information to supplement plot data. This information has a large effect on what happens in the plot and was not simulated in this study. Also, both persons in the cruising party collect visual information. One person stays at plot center, sights trees with the prism, and records data. The other person moves among the trees, measures diameters, and calls pathological and quality indicators. The person taking notes may have a view of the upper stem where the other person does not. These complex interactions are difficult to quantify and are usually not considered in sampling simulations.

The complex human interactions of point-sampling have a great effect on efficiency. Field evaluation of sampling methods must consider this supplementary information. Simulations evaluate scenarios where this information is not used. However, the use of supplementary information may show that some designs are clearly superior to others.
Chapter 6

CONCLUSIONS

6.1 Model-Dependent Subsampling Methods

The purposive-based subsampling methods (purposive, target, and threshold) were more efficient for estimating value than the probability-based methods (random, PPS, and 3P) and the FS strategy. However, the purposive-based subsampling methods were biased up to about 5%. The jackknife estimator should be tested for bias reduction with the purposive-based subsampling methods. The random and PPS subsampling methods were much less efficient than the FS strategy.

The target and purposive subsampling methods were the most efficient. The greatest efficiency was always achieved with one sample tree per plot. The purposive method was more efficient than the FS strategy in TP2 and TP3 and had the advantage of controlling sample size. The bias of the target method varied with test population, but was lowest with the ratio estimator. This bias might be reduced by weighting the ratio by the number of point-selected trees in the plot (this is not the same as weighting the ratio by the selection probability of the tree).

The purposive and target methods have potential for operational use and should be tested more intensively using computer simulation and field trials. The target method has two possible problems: i) subsample size is a random variable and is not precisely controlled; and ii) the method may be sensitive to target values. Both potential problems could reduce efficiency. Computer simulations involving smaller test populations could be used to more fully examine these problems.

6.2 Forest Service Operational Cruise Procedure

The FS operational cruise procedure gave unbiased estimates of value/ha that were much more variable than estimates of volume/ha. Approximately twice as many plots were needed to estimate value/ha to the same level of precision as net volume/ha in the three test populations.
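The link between variability and the number of plots can be made concrete with the common sample size approximation n = (t × CV / E)², where CV and E are both expressed as percentages. The sketch below assumes an infinite population, t of about 2, and hypothetical CVs of 60% for volume/ha and 85% for value/ha; these numbers are illustrative and are not results from this study. Because n grows with the square of CV, a CV that is roughly 1.4 times larger requires roughly twice as many plots.

```python
import math


def plots_needed(cv_pct, error_pct, t=2.0):
    """Approximate number of plots for a desired percent error (infinite population)."""
    return math.ceil((t * cv_pct / error_pct) ** 2)


# Hypothetical coefficients of variation (assumptions for illustration only).
n_volume = plots_needed(60.0, 10.0)
n_value = plots_needed(85.0, 10.0)
print(n_volume, n_value, round(n_value / n_volume, 2))
```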
The highly skewed distribution of tree values resulted in skewed sample mean distributions and confidence interval coverage that was less than the nominal 95%. Usually more than 100 plots were needed to give a normal distribution of mean value/ha using the FS strategy with the three test populations. Very high-valued trees should be treated as a separate stratum using many enhanced count plots (where only the high-valued trees are measured in detail). This would concentrate effort on the high-valued trees to give a more precise estimate of value and improved confidence interval coverage.

6.3 Tree Value

The distribution of tree values varied slightly with test population but was always very skewed to the right, with many low-valued trees and few very high-valued trees. This resulted in skewed distributions of mean value/ha estimates with the sample sizes used in this study (up to 60 plots). Tree value was linearly related to BA, but this relationship was much more variable than the relationship of volume to BA. The average $BAR varied greatly among species but was similar among test populations.

The eight pathological and seven quality-related indicators collected by the FS are not all needed to estimate tree value. More indicators that are related to value should be collected for high-valued trees, and fewer indicators should be collected for low-valued trees. The indicators should reflect the relationship to value (i.e., grade determination and potential products) and the frequency of occurrence in the species.

The regression approach used to predict cruiser-called tree value in this study is not appropriate for practical use. Multivariate regression should be used to estimate log volume percent or grade percent by species. The volume by grade could be multiplied by log market prices to obtain current value. This approach would allow recalculation of values for completed cruises as log prices change.
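The volume-by-grade calculation suggested above can be sketched in a few lines. The grade labels, grade percentages, and prices below are hypothetical placeholders rather than B.C. log grades or market prices; the point is only that stored grade percentages allow a tree or cruise to be revalued when prices change.

```python
def tree_value(net_volume_m3, grade_percent, price_per_m3):
    """Value = sum over grades of (volume in grade) x (price for that grade)."""
    if abs(sum(grade_percent.values()) - 100.0) > 1e-6:
        raise ValueError("grade percentages must sum to 100")
    return sum(
        net_volume_m3 * pct / 100.0 * price_per_m3[grade]
        for grade, pct in grade_percent.items()
    )


# Hypothetical tree and price lists (assumptions for illustration only).
grades = {"A": 20.0, "B": 50.0, "C": 30.0}       # predicted percent of net volume
prices_then = {"A": 150.0, "B": 90.0, "C": 40.0}  # $/m3 at the time of the cruise
prices_now = {"A": 180.0, "B": 90.0, "C": 30.0}   # $/m3 after a price change

value_then = tree_value(10.0, grades, prices_then)
value_now = tree_value(10.0, grades, prices_now)
print(value_then, value_now)
```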
Visually estimated diameters should be tested for predicting tree value. This is a quick method to precisely estimate dbh and may provide a very inexpensive method to obtain auxiliary information on tree value.

6.4 Contributions

The major contributions of this study are in forest sampling; however, the results also contribute to the general practical application of model-dependent sampling. The major contributions are:

1. A comprehensive evaluation of the FS cruise procedure for estimating timber value. This was the most detailed study of estimating the value of standing timber using the FS cruise procedure. This study will contribute to the current interest in estimating timber value by the FS Valuation Branch.

2. The demonstration that not all of the pathological and quality-related indicators collected by the FS are needed to estimate cruiser-called tree value. This should provide additional insight into the usefulness of the indicators and whether or not they need to be routinely collected for all species in operational cruising.

3. The identification of tree characteristics and interactions that are related to cruiser-called value of different species. This information will be useful in identifying characteristics that should be stressed in field estimation of timber value.

4. The demonstration that $BARs are about one and one-half to three times more variable than VBARs, depending on the species and log value. This information can be used in estimating preliminary sample sizes for further simulations or field testing.

5. The increased understanding of the distribution of cruiser-called tree values in old-growth forests. The distribution of tree values in old-growth forests is highly skewed. This study showed that the high-valued trees need to be treated as a separate stratum both for absolute value and for sampling efficiency.

6. The development and testing of new model-dependent methods for subsampling trees from point-sample plots.
The purposive-based subsampling methods were new and will contribute to the forest sampling literature.

7. The identification of subsampling methods that may increase the efficiency of the FS cruising method for estimating value. The purposive and target subsampling methods have potential for field use and should be further tested.

8. The demonstration that not all trees need to be measured in detail to precisely estimate value. The greatest efficiency was always achieved with subsampling only one tree/plot.

6.5 Recommendations for Future Research

This research has provided the basis for further study of sampling for timber value in old-growth forests of coastal B.C. However, sampling for timber value is complex and other questions remain to be addressed. The primary weakness of computer simulation is that it cannot accurately reflect the highly complex real-world conditions. The next logical step in this research is to narrow the scope of the methods, use more intensive computer simulation, and use field testing. There are also other theoretical questions that should be addressed. I have several major recommendations for future research on sampling methods for estimating value.

1. The purposive, target, and 3P methods should be more intensively compared with the FS strategy. The FS strategy should include the use of regular and enhanced count plots. $BARs are considerably more variable than VBARs; however, count plots are very inexpensive and should increase the efficiency of sampling for value as they do for volume. The bias of the purposive method might be reduced by using a jackknife estimator or by weighting the individual tree ratios by the number of point-selected trees in the plot. The target method should be examined for sensitivity to the target value and for the variability in achieved sample size. The point-3P method should be tested using the conventional design-consistent estimator.

2.
Visually estimated dbh should be tested in a system to estimate tree value. This is an inexpensive method to obtain precise information. The use of visually estimated dbh could be simulated on the computer using random errors.

3. Multivariate regression should be tested for estimating net volume, percent log grade, or percent lumber grade recovery for estimating value of standing trees in timber cruising.

4. Detailed time studies should be conducted to provide more accurate cost functions for point-sampling simulations. This would help quantify the many processes in point-sampling and would also give more precise estimates of activity times.

5. An attempt should be made to incorporate human interactions and supplementary information into the simulation process. This would be a complex task but might increase the understanding of the process and provide more realistic simulations.

Chapter 7

LITERATURE CITED

Arvanitis, L.G. and W.G. O'Regan. 1967. Computer simulation and economic efficiency in forest sampling. Hilgardia 38:133-164.
Avery, T.E. 1975. Natural Resource Measurements. Second edition. McGraw-Hill Book Co., New York. 339 pp.
Bell, J.F. and L.B. Alexander. 1957. Application of the variable plot method of sampling forest stands. Oregon State Bd. For., Res. Note No. 30, 22 pp.
Bell, J.F. and J.R. Dilworth. 1988. Log Scaling and Timber Cruising. Oregon State Univ. Book Stores, Inc., Corvallis, Oregon. 396 pp.
Bell, J.F., K. Iles and D.D. Marshall. 1983. Balancing the ratio of tree count-only sample points and VBAR measurements in variable plot sampling. Pp. 699-702 In Bell, J.F. and T. Atterbury (eds.) Renewable Resources Inventories for Monitoring Changes and Trends. Proc. of the Intern. Conf., Aug. 15-19, Corvallis, Oregon.
Bellhouse, D.R. 1984. A review of optimal designs in survey sampling. Can. J. Statist. 12:53-65.
Bissell, A.F. 1975. The jackknife - toy, tool, or two-edged weapon?
The Statist. 24:79-100.
Bitterlich, W. 1947. Die Winkelzahlmessung. Allgemeine Forst- und Holzwirtschaftliche Zeitung 58(11/12):94-96 (cited in Bitterlich 1984).
Bitterlich, W. 1948. Die Winkelzahlprobe. Allgemeine Forst- und Holzwirtschaftliche Zeitung 5(1/2):4-5 (cited in Bitterlich 1984).
Bitterlich, W. 1984. The Relascope Idea. Commonw. Agric. Bur., Farnham Royal Slough. 242 pp.
Bonner, G.M. 1972. Cost of a small forest inventory. Can. J. For. Res. 2:45-48.
Brewer, K.R.W. 1979. A class of robust sampling designs for large-scale surveys. J. Amer. Statist. Assoc. 74:911-915.
British Columbia Forest Service. 1976. Metric diameter class decay, waste and breakage factors. B.C. For. Serv., For. Inv. Div.
Bruce, Don. 1961. Prism cruising in the western United States and volume tables for use therewith. Mason, Bruce, and Girard. Portland, Oregon. 61 pp.
Bruce, David. 1970. Predicting product recovery from logs and trees. USDA For. Serv. Res. Pap. PNW-107, 15 pp.
Cassel, C.M., C.E. Sarndal and J.H. Wretman. 1977. Foundations of Inference in Survey Sampling. John Wiley and Sons, New York. 192 pp.
Cochran, W.G. 1946. Relative accuracy of systematic and stratified random samples for a certain class of populations. Ann. Math. Statist. 17:164-177.
Cochran, W.G. 1976. Discussion of paper by T.M.F. Smith. J. R. Statist. Soc. A. 139:201.
Cochran, W.G. 1977. Sampling Techniques. Third edition. John Wiley and Sons, New York. 428 pp.
Davis, K.P. 1966. Forest Management. Second edition. McGraw-Hill Book Co., New York. 519 pp.
Davis, L.S. and K.N. Johnson. 1987. Forest Management. Third edition. McGraw-Hill Book Co., New York. 790 pp.
Demaerschalk, J.P. and A. Kozak. 1977. The whole-bole system: a conditioned dual-equation system for precise prediction of tree profiles. Can. J. For. Res. 7:488-497.
Ericson, W.A. 1969. Subjective Bayesian models in sampling finite populations. J. R. Statist. Soc. B, 31:195-233.
Godambe, V.P. 1955. A unified theory of sampling from finite populations. J. R. Statist. Soc. B, 17:269-278.
Godambe, V.P. 1969. Some aspects of the theoretical developments in survey-sampling. Pp. 27-58 In Johnson, N.L. and H. Smith Jr. (eds.) New Developments in Survey Sampling. A Symp. on the Foundations of Survey Sampling. Univ. of N.C., Chapel Hill, N.C. April 22-26, 1968. John Wiley and Sons, New York. 732 pp.
Godambe, V.P. and V.M. Joshi. 1965. Admissibility and Bayes estimation in sampling finite populations. I. Ann. Math. Statist. 36:1707-1722.
Gregoire, T.G. 1983. The jackknife: an introduction with applications in forestry data analysis. Can. J. For. Res. 14:493-497.
Grosenbaugh, L.R. 1952. Plotless timber estimates - new, fast, easy. J. For. 50:32-37.
Grosenbaugh, L.R. 1958. Point-sampling and line sampling: probability theory, geometric implications, synthesis. USDA For. Serv., Occas. Pap. SO-160. 33 pp.
Grosenbaugh, L.R. 1964. Some suggestions for better sample-tree measurement. Soc. Amer. For. Proc. 1963, pp. 36-42.
Grosenbaugh, L.R. 1971. STX 1-11-71 for dendrometry of multistage 3P samples. USDA For. Serv. Publ. FS-277, 63 pp.
Grosenbaugh, L.R. 1976. Approximate sampling variance of adjusted 3P estimates. For. Sci. 22:173-176.
Grosenbaugh, L.R. 1979. 3P sampling theory, examples, and rationale. USDI Bur. Land Manage. Tech. Note 331, 18 pp.
Gross, H.L., A.R. Ek and R.F. Patton. 1980. Comparison of sample rules for estimating jack pine density, basal area, and percentage with sweetfern rust cankers. Can. J. For. Res. 10:190-198.
Hamilton, D.A. Jr. 1978. Specifying precision in natural resource inventories. Pp. 276-281 In Lund, H.G., V.J. Labau, P.F. Ffolliott and D.W. Robinson (eds.) Integrated Inventories of Renewable Resources: Proceedings of the Workshop. USDA For. Serv. Gen. Tech. Rep. RM-55, 482 pp.
Hansen, M.H., W.N. Hurwitz and W.G. Madow. 1953.
Sample Survey Methods and Theory. Vol. I. Methods and Applications. John Wiley and Sons, New York. 638 pp.
Hansen, M.H., W.G. Madow and B.J. Tepping. 1983. An evaluation of model-dependent and probability-sampling inferences in sample surveys. J. Amer. Statist. Assoc. 78:776-807.
Holt, D., T.M.F. Smith and P.D. Winter. 1980. Regression analysis of data from complex surveys. J. R. Statist. Soc. A, 143:474-487.
Horvitz, D.G. and D.J. Thompson. 1952. A generalization of sampling without replacement from a finite universe. J. Amer. Statist. Assoc. 47:663-685.
Howard, A.F. and D.A. Yaussy. 1986. Multivariate regression model for predicting yields of grade lumber from yellow birch sawlogs. For. Prod. J. 36(11/12):56-60.
Husch, B., C.I. Miller and T.W. Beers. 1982. Forest Mensuration. Third edition. John Wiley and Sons, New York. 402 pp.
Iles, K. and J.F. Bell. 1983. Grade assessment using variable plot sampling. Pp. 696-698 In Bell, J.F. and T. Atterbury (eds.) Renewable Resources Inventories for Monitoring Changes and Trends. Proc. of the Intern. Conf., Aug. 15-19, Corvallis, Oregon.
Jessen, R.J. 1978. Statistical Survey Techniques. John Wiley and Sons, New York. 520 pp.
Johnson, F.A. 1961. Standard error of estimating average timber volume per acre under point sampling when trees are measured for volume on a subsample of all points. USDA For. Serv., PNW For. Range Exp. Sta., Res. Note No. 201, 6 pp.
Kalbfleisch, J.D. and D.A. Sprott. 1969. Applications of likelihood and fiducial probability to sampling finite populations. Pp. 358-389 In Johnson, N.L. and H. Smith Jr. (eds.) New Developments in Survey Sampling. A Symp. on the Foundations of Survey Sampling. Univ. of N.C., Chapel Hill, N.C. April 22-26, 1968. John Wiley and Sons, New York. 732 pp.
Kalton, G. 1976. Discussion of paper by T.M.F. Smith. J. R. Statist. Soc. A. 139:196-197.
Kasile, J.D. 1983. 3-P sampling for dollars. Pp. 703-706 In Bell, J.F. and T.
Atterbury (eds.) Renewable Resources Inventories for Monitoring Changes and Trends. Proc. of the Intern. Conf., Aug. 15-19, Corvallis, Oregon.
Kish, L. 1965. Survey Sampling. John Wiley and Sons, New York. 643 pp.
Labau, V.J. 1967. Literature on the Bitterlich method of forest cruising. USDA For. Serv., Res. Pap. PNW-47, 20 pp.
Lane, P.H., M.E. Plank and J.W. Henley. 1970. A new and easier way to estimate the quality of inland Douglas-fir sawtimber. USDA For. Serv. Res. Pap. PNW-101. 9 pp.
Little, R.J.A. 1983. Discussion of paper by Hansen et al. (1983). J. Amer. Statist. Assoc. 78:797-799.
Liu, C.J. 1978. A comparison of forest inventory systems through computer simulations. Pp. 224-251 In Corcoran, T.J. and W. Heij (eds.) Proc. of Simulation Techniques in Forest Operational Planning Control. IUFRO Working Party S3.04.01.
Loetsch, F., F. Zohrer and K.E. Haller. 1973. Forest Inventory. Vol. II. BLV Verlagsgesellschaft mbH, Munchen. 469 pp.
Middleton, G.R. and B.D. Munro. 1985. Product outturn values for British Columbia coastal log grades: An information report. FORINTEK Can. Corp., Vancouver, B.C. 22 pp. (+ app.).
Miller, R.G. 1974. The jackknife - a review. Biometrika 61:1-15.
Ministry of Forests. 1980. Forest Service cruising procedures and cruise compilation. B.C. Min. For., Victoria, B.C.
Ministry of Forests. 1986. Forest Service scaling manual. B.C. Min. For., Victoria, B.C.
Murchison, H.G. 1984. Efficiency of multi-phase and multi-stage sampling for tree heights in forest inventory. Univ. Minn. Ph.D. Thesis, 158 pp.
Nathan, G. and D. Holt. 1980. The effect of survey design on regression analysis. J. R. Statist. Soc. B, 42:377-386.
Nyyssonen, A., P. Roiko-Jokela and P. Kilkki. 1971. Studies on improvement of the efficiency of systematic sampling in forest inventory. Acta. For. Fenn. 116, 26 pp.
O'Regan, W.G. and L.G. Arvanitis. 1966. Cost-effectiveness in forest sampling. For. Sci. 12:406-414.
Palley, M.N.
and L.G. Horwitz. 1961. Properties of some random and systematic point sampling estimators. For. Sci. 7:52-65.
Pearse, P.H. 1989. The Vancouver log market, no longer serving its function. For. Plann. Can. 5(5):14-17.
Pearse, P.H., A.V. Backman and E.L. Young. 1974. Timber Appraisal. Policies and Procedures for Evaluating Crown Timber in British Columbia. Second report of the Task Force on Crown Timber Disposal. B.C. For. Serv. Inv. Div., Victoria, B.C. 185 pp.
Pfeffermann, D. and D.J. Holmes. 1985. Robustness considerations in the choice of a method of inference for regression analysis of survey data. J. R. Statist. Soc. A, 148:268-278.
Plank, M.E. 1981. Estimating value and volume of ponderosa pine trees by equations. USDA For. Serv. Res. Pap. PNW-283. 13 pp.
Plank, M.E. and T.A. Snellgrove. 1978. An equation for estimating the value and volume of western larch trees. USDA For. Serv. Res. Pap. PNW-231, 29 pp.
Promnitz, L.C. n.d. Two phase sampling in operational cruising. Unpubl. Man., 18 pp. Fletcher Challenge Canada Ltd., P.O. Box 2000, New Westminster, B.C. V3L 5A4.
Rao, J.N.K. 1975. On the foundations of survey sampling. Pp. 489-505 In Srivastava, J.N. (ed.) A survey of statistical design and linear models. North-Holland Publ. Co., Amsterdam. 699 pp.
Rao, J.N.K. 1979. Optimization in the design of sample surveys. Pp. 419-434 In Rustagi, J.S. (ed.) Optimizing Methods in Statistics. Acad. Press, New York.
Rennie, J.C. 1976. Point-3P sampling: A useful timber inventory design. For. Chron. 52:145-146.
Rennie, J.C. 1989. Point-model dependent sampling for timber inventory. Dept. For., Wildlife and Fish., Univ. Tenn., Knoxville. 39 pp. + app.
Rennolls, K. 1981. The total area of woodland in Berkshire is .... The Statistician 30:275-287.
Rennolls, K. 1982. The use of superpopulation-prediction methods in survey analysis, with applications to the British National Census of Woodlands and Trees. Pp. 395-401 In Brann, T.B., L.O.
House and H.G. Lund (eds.) In-Place Resource Inventories: Principles and Practices. Proc. of a National Workshop, Aug. 9-14, 1981, Orono, Maine. Soc. Amer. For., 1101 pp.
Royall, R.M. 1970. On finite population sampling theory under certain linear regression models. Biometrika 57:377-387.
Royall, R.M. 1971. Linear regression models in finite population sampling theory. Pp. 259-279 In Godambe, V.P. and D.A. Sprott (eds.) Foundations of Statistical Inference. Proc. of the Symp., Univ. of Waterloo, March 31 to April 9, 1970. Holt, Rinehart and Winston, Toronto. 519 pp.
Royall, R.M. 1976a. Current advances in sampling theory: Implications for human observational studies. Amer. J. Epidemiology 104:463-474.
Royall, R.M. 1976b. The linear least-squares prediction approach to two-stage sampling. J. Amer. Statist. Assoc. 71:657-664.
Royall, R.M. and W.G. Cumberland. 1978. Variance estimation in finite population sampling. J. Amer. Statist. Assoc. 73:351-358.
Royall, R.M. and W.G. Cumberland. 1981a. An empirical study of the ratio estimator and estimators of its variance. J. Amer. Statist. Assoc. 76:66-88.
Royall, R.M. and W.G. Cumberland. 1981b. The finite-population linear regression estimator and estimators of its variance - An empirical study. J. Amer. Statist. Assoc. 76:924-930.
Royall, R.M. and K.R. Eberhardt. 1975. Variance estimates for the ratio estimator. Sankhya, Ser. C. 37:43-52.
Royall, R.M. and J. Herson. 1973a. Robust estimation in finite populations I. J. Amer. Statist. Assoc. 68:880-889.
Royall, R.M. and J. Herson. 1973b. Robust estimation in finite populations II. Stratification on a size variable. J. Amer. Statist. Assoc. 68:890-893.
Sarndal, C.E. 1978. Design-dependent and model-dependent inference in survey sampling. Scand. J. Statist. 5:27-52.
Sayn-Wittgenstein, L. 1963. An attempt to find the best basal area factor for point sampling. Can. Dept. For., For. Res. Br., Ottawa.
Inservice Rep., 17 pp.
Schreuder, H.T. 1970. Point sampling theory in the framework of equal-probability cluster sampling. For. Sci. 16:240-246.
Schreuder, H.T. 1984. Poisson sampling and some alternatives in timber sales sampling. For. Ecol. and Manage. 8:149-160.
Schreuder, H.T. and J. Anderson. 1984. Variance estimation for volume where D2H is the covariate in regression. Can. J. For. Res. 14:818-821.
Schreuder, H.T. and C.E. Thomas. 1985. Efficient sampling techniques for timber sale surveys and inventory updates. For. Sci. 31:857-866.
Schreuder, H.T. and G.B. Wood. 1986. The choice between design-dependent and model-dependent sampling. Can. J. For. Res. 16:260-265.
Schreuder, H.T., J. Sedransk and K.D. Ware. 1968. 3-P sampling and some alternatives, I. For. Sci. 14:429-454.
Schreuder, H.T., J. Sedransk, K.D. Ware and D.A. Hamilton. 1971. 3-P sampling and some alternatives, II. For. Sci. 17:103-118.
Schreuder, H.T., G.E. Brink and R.L. Wilson. 1984a. Alternative estimators for point-Poisson sampling. For. Sci. 30:803-812.
Schreuder, H.T., G.E. Brink, D.L. Schroeder and R. Dieckman. 1984b. Model-based sampling versus point-Poisson sampling on a timber sale in the Roosevelt National Forest in Colorado. For. Sci. 30:652-656.
Schreuder, H.T., S.G. Banyard and G.E. Brink. 1987. Comparison of three sampling methods in estimating stand parameters for a tropical forest. For. Ecol. Manage. 21:119-127.
Scott, A. and T.M.F. Smith. 1969. Estimation in multi-stage surveys. J. Amer. Statist. Assoc. 64:830-840.
Scott, C.T. 1979. Midcycle updating: Some practical suggestions. Pp. 362-370 In Frayer, W.E. (ed.) Forest Resource Inventories. Workshop Proc., Colorado State Univ., Ft. Collins, CO, July 23-26, 1979. Vol. 1, 513 pp.
Scott, C.T. 1981. Design of optimal two-stage multiresource surveys. Univ. Minn. Ph.D. Thesis, 138 pp.
Scott, A.J., K.R.W. Brewer and E.W.H. Ho. 1978. Finite population sampling and robust estimation. J. Amer.
Statist. Assoc. 73:359-361.
Seegrist, D.W. 1975. A multivariate model and statistical method for validating tree grade lumber yield equations. USDA For. Serv. Res. Pap. NE-320, 19 pp.
Shirley, F.C. 1960. A comparison of the amount of time needed to measure fixed and variable radius plots. State Univ. New York M.F. thesis, 68 pp. (cited in Scott 1981).
Snellgrove, T.A., M.E. Plank and P.H. Lane. 1973. An improved system for estimating the value of western white pine. USDA For. Serv. Res. Pap. PNW-166. 19 pp.
Smith, T.M.F. 1976a. The foundations of survey sampling: A review. J. R. Statist. Soc. A. 139:183-204.
Smith, T.M.F. 1976b. Reply to comments. J. R. Statist. Soc. A. 139:201-204.
Smith, T.M.F. 1981. Regression analysis for complex surveys. Pp. 267-292 In Krewski, D., R. Platek and J.N.K. Rao (eds.) Current Topics in Survey Sampling. Proc. of the Intern. Symp. on Survey Sampling. Carleton Univ., Ottawa. May 7-9, 1980. Academic Press, New York. 509 pp.
Smith, T.M.F. 1983. On the validity of inferences from non-random samples. J. R. Statist. Soc. A, 146:394-403.
Smith, T.M.F. 1984. Present position and potential developments: Some personal views. Sample surveys. J. R. Statist. Soc. A, 147:208-221.
Sukhatme, P.V., B.V. Sukhatme, S. Sukhatme and C. Asok. 1984. Sampling Theory of Surveys with Applications. Iowa State Univ. Press, Ames, Iowa. 526 pp.
Thomson, G.W. and G.H. Dietschman. 1959. Bibliography of world literature on the Bitterlich method of plotless cruising. Iowa State Univ. Agric. and Home Econ. Exp. Sta., 10 pp.
Van Deusen, P.C. 1987. 3-P sampling and design versus model-based estimates. Can. J. For. Res. 17:115-117.
Ware, K.D. 1964. Some problems in the quantification of tree quality. Proc. Soc. Amer. For. Pp. 211-217.
Wiant, H.V. Jr. 1974. Combine 3P and point sampling for efficient cruising. West Virginia For. Notes 2:12-15.
Wiant, H.V. Jr. 1976. Elementary 3P sampling.
West Virginia Univ. Agr. For. Exp. Sta., Bull. 650T, 31 pp. Wiant, H.V. Jr. 1977. Comparison of point-3P sampling designs. USDI Bur. Ld. Man-age., Res. Inv. Notes BLM 8, 5 pp. Wood, G .B . , H.T. Schreuder and G . E . Brink. 1985. Comparison of a model-based sampling strategy with point-Poisson sampling on a timber management area in the Arapahoe-Roosevelt National Forest in Colorado. Can. J. For. Res. 15:83-86. Yandle, D.O. 1982. Binomial sampling of forest populations. Pp. 362-370 In Brann, T.B., L.O. House and H.G. Lund. (eds). In-Place Resource Inventories: Principles and Practices. Proc. of a National Workshop, Aug. 9-14,1981, Orono, Maine. Soc. Amer. For., 1101 pp. Yandle, D.O. and F . M . White. 1977. An application of two-stage forest sampling. South. J. Appl. For. 1:27-32. Yandle, D.O., J.R. Myers and H.V. Wiant Jr. 1983. A comparison of some two-stage sampling designs. Pp. 645-647 In Bell, J.F. and T. Atterbury (eds.) Re-newable Resources Inventories for Monitoring Changes and Trends. Proc. of the Intern. Conf., Aug. 15-19, Corvallis, Oregon. Yaussy, D.A. 1986. Green lumber grade yields from factory grade logs of three oak species. For. Prod. J. 36(5):53-56. Appendix A GLOSSARY This list describes some of the commonly used notation. Less common and case specific terms are described in the text. %SE standard error expressed as a percentage of the estimate $ B A R dollar to basal area ratio b tree basal area BA basal area BAF basal area factor (m 2/ha) C R cost ratio C V coefficient of variation C V 2 relative variance D diameter at breast height (1.3 m) dbh diameter at breast height (1.3 m) E desired error for sample size calculations FS Forest Service i subscript denoting plot i j subscript denoting tree j k subscript denoting the kth. subsampled tree considered over the total of M trees m meters m actual total number of subsampled trees mt- actual number of subsampled trees in plot i 144 Appendix A. 
GLOSSARY m desired total number of subsampled trees rhi desired total number of subsampled trees in plot i M total number of point-selected trees in all plots Mi total number of point-selected trees in plot i n total number of point-sample plots PPS sampling with probabihty proportional to size RA relative advantage s total number of 3P-selected trees in point-3P samphng SE standard error t Student's t TC tree count, the number of point-selected trees VBAR volume to basal area ratio VR variance ratio x covariate of y, usually estimated tree dollar value y attribute of interest, usually cruiser-called tree dollar value Y total of y for the population Y estimated total of y for the population Y average of y for the population Y estimated average of y for the population y average y per ha yBAR ratio of the attribute y to basal area
Model-dependent sampling for timber value in old-growth forests of coastal British Columbia Thrower, James S. 1989
Item Metadata
Title | Model-dependent sampling for timber value in old-growth forests of coastal British Columbia |
Creator | Thrower, James S. |
Publisher | University of British Columbia |
Date Issued | 1989 |
Subject | Forests and forestry--Measurement; Forests and forestry--Valuation |
Genre | Thesis/Dissertation |
Type | Text |
Language | eng |
Date Available | 2011-02-15 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
DOI | 10.14288/1.0100626 |
URI | http://hdl.handle.net/2429/31308 |
Degree | Doctor of Philosophy - PhD |
Program | Forestry |
Affiliation | Forestry, Faculty of |
Degree Grantor | University of British Columbia |
Campus | UBCV |
Scholarly Level | Graduate |
AggregatedSourceRepository | DSpace |