QUALITY CONTROL METHODS FOR MONITORING THE VARIABILITY OF MOISTURE CONTENT IN KILN-DRIED LUMBER

by

CATALIN RISTEA

B.Sc., Transylvania University, Romania, 1994

A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE

in

THE FACULTY OF GRADUATE STUDIES
THE FACULTY OF FORESTRY
Department of Wood Science

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
October 2001
© Catalin Ristea, 2001

In presenting this thesis in partial fulfillment of the requirements for the degree of Master of Science at The University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of Wood Science
Faculty of Forestry
The University of British Columbia
Vancouver, Canada

ABSTRACT

Monitoring the variability of moisture content in lumber is a problem of utmost importance in kiln-drying. This thesis focused on determining the distribution of moisture content in lumber, and on the application of statistical process control principles to monitor the drying process through the use of quality control charts. Specific parameters of the charts for monitoring the process average and process dispersion were determined. The charts were developed on the assumption that the data under analysis has a Lognormal distribution.

To test whether this assumption is valid or not for moisture content, graphical and numerical tests were employed as goodness of fit methods. Graphical methods utilized in this research were symmetry and probability plots, and empirical cumulative distribution function graphs. Numerical methods employed in this study were goodness of fit tests based on empirical distribution function statistics. Procedures were developed for both graphical and numerical goodness-of-fit methods, and also for construction of control charts for Lognormal variables.

The methods presented were tested on a data set of actual Douglas-Fir (Pseudotsuga menziesii) lumber collected from a production facility in British Columbia, Canada. It was determined that the Lognormal distribution provided a good fit for the experimental data. Therefore, the proposed control charts for Lognormal data are to be used instead of the customary charts based on the Normality assumption.

TABLE OF CONTENTS

ABSTRACT
TABLE OF CONTENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF PROCEDURES
ACKNOWLEDGEMENTS
1 INTRODUCTION
  1.1 OVERVIEW
  1.2 PROBLEM DEFINITION
  1.3 RESEARCH OBJECTIVES
2 LITERATURE REVIEW AND BACKGROUND MATERIAL
  2.1 OVERVIEW
  2.2 MOISTURE CONTENT
    VARIATION IN MOISTURE CONTENT
    DISTRIBUTION OF MOISTURE CONTENT
  2.3 GOODNESS OF FIT METHODS
    GRAPHICAL ANALYSIS
    NUMERICAL TESTS
  2.4 PRINCIPLES OF STATISTICAL PROCESS CONTROL
    CONTROL CHARTS
    CONTROL CHARTS FOR LOGNORMAL VARIABLES
    SAMPLING AND SAMPLE SIZE
    STATISTICAL PROCESS CONTROL IN LUMBER DRYING
3 METHODS AND MATERIALS
  3.1 OVERVIEW
  3.2 DATA COLLECTION
    DATA FOR DISTRIBUTION ANALYSIS
    DATA FOR CONTROL CHARTS
    DETERMINATION OF MOISTURE CONTENT
  3.3 GRAPHICAL ANALYSIS OF DATA
    INVESTIGATION OF SYMMETRY
    PROBABILITY PLOTTING
      Normal Probability Plotting
      Lognormal Probability Plotting
      Weibull Probability Plotting
  3.4 GOODNESS OF FIT TESTS
    INTRODUCTION
    ECDF STATISTICS
    GENERAL PROCEDURE FOR TESTS OF HYPOTHESIS
    TESTS FOR NORMAL DISTRIBUTION
    TESTS FOR LOGNORMAL DISTRIBUTION
    TESTS FOR WEIBULL DISTRIBUTION
  3.5 CONTROL CHARTS FOR LOGNORMAL DATA
    THREE-PARAMETER LOGNORMAL DISTRIBUTION
    WHAT TO MONITOR ON CONTROL CHARTS
    PARAMETER ESTIMATION
    ARITHMETIC MEAN VS. GEOMETRIC MEAN
    CONTROL CHART FOR SCALE PARAMETER (y CHART)
    CONTROL CHART FOR SHAPE PARAMETER (S CHART)
4 RESULTS AND DISCUSSION
  4.1 OVERVIEW
  4.2 GRAPHICAL ANALYSIS
  4.3 GOODNESS OF FIT TESTS
  4.4 CONTROL CHARTS FOR LOGNORMAL VARIABLES
  4.5 LIMITATIONS OF THE STUDY
5 CONCLUSIONS
  5.1 OVERVIEW
  5.2 CONCLUSIONS
  5.3 RECOMMENDATIONS FOR FURTHER RESEARCH
6 LITERATURE CITED
7 APPENDIX A - KILN CHARACTERISTICS
8 APPENDIX B - DRYING SCHEDULE
9 APPENDIX C - DATA PLOTTED ON CONTROL CHARTS
10 APPENDIX D - TWO REPLICATIONS FOR GOODNESS OF FIT TESTS OF HYPOTHESIS

LIST OF TABLES

Table 1. Absolute values of distances from sample median to percentiles
Table 2. Observed statistics and critical values for ECDF statistics for hypothesis testing
Table 3. Parameters of preliminary data for construction of control charts based on the Normal distribution
Table 4. Control limits and center lines for control charts based on Normal assumption
Table 5. Parameters of preliminary data for construction of control charts based on the Lognormal distribution
Table 6. Control limits and center lines for control charts based on Lognormal assumption

LIST OF FIGURES

Figure 1. Symmetric and skewed distributions
Figure 2. Empirical Cumulative Distribution of MC for Douglas-Fir, and the relation of 10th and 90th percentiles about the median (50th percentile)
Figure 3. Plot of upper vs. lower observations for investigation of symmetry
Figure 4. Normal probability plot for the moisture content data
Figure 5. Lognormal probability plot for moisture content data
Figure 6. Weibull probability plot for moisture content data
Figure 7. Control chart for sample averages based on the Normality assumption
Figure 8. Control chart for sample standard deviations based on the Normality assumption
Figure 9. Control chart for sample averages (y chart) based on Lognormal assumption
Figure 10. Antilog y chart (in original scale) based on Lognormal assumption
Figure 11. Control chart for sample standard deviations (S chart) based on Lognormal assumption
Figure 12. Antilog S chart (in original scale) based on Lognormal assumption
Figure 13. Control chart for sample averages based on the Normality assumption
Figure 14. Antilog y chart (in original scale) based on Lognormal assumption
Figure 15. Control chart for sample standard deviations based on the Normality assumption
Figure 16. Antilog S chart (in original scale) based on Lognormal assumption

LIST OF PROCEDURES

Procedure 1: Constructing an ECDF plot
Procedure 2: Assessing the symmetry by the distance of percentiles to the sample median
Procedure 3: Assessing symmetry by plotting the upper half of the observations against the lower
Procedure 4: Normal probability plotting
Procedure 5: Lognormal probability plotting
Procedure 6: Weibull probability plotting
Procedure 7: Goodness-of-fit test of hypothesis for the Normal distribution
Procedure 8: Goodness-of-fit test of hypothesis for the Lognormal distribution
Procedure 9: Goodness-of-fit test of hypothesis for the Weibull distribution
Procedure 10: Construction of charts for sample means - or scale parameters - (y chart) and the corresponding Lognormal control charts in original scale of measurements (antilog y chart)
Procedure 11: Construction of charts for shape parameter - sample standard deviations - (S chart) and the corresponding Lognormal control charts in original scale of measurements (antilog S chart)

ACKNOWLEDGEMENTS

I would like to offer my sincere thanks to my advisor, Dr. Thomas C. Maness, for his constant encouragement and guidance throughout this research. Special thanks to my supervisory committee, Dr. Stavros Avramidis, Dr. Robert Kozak, and Dr. Ian Hartley, for many valuable suggestions and comments. I wish to express my gratitude to Dr. Michael Stephens who has made a substantial contribution to my understanding of goodness-of-fit testing. Last but not least, I would like to thank my wife Anca and my son Alex for their unconditional love and constant support, without which this work would have not been possible.

QUALITY CONTROL METHODS FOR MONITORING THE VARIABILITY OF MOISTURE CONTENT IN KILN-DRIED LUMBER

1 INTRODUCTION

1.1 OVERVIEW

The variability of moisture content (MC) in solid wood is a problem of utmost importance in kiln-drying processes.
Variation in final moisture content of lumber can cause serious problems in its subsequent processing and use. It is known that the moisture content of lumber varies considerably among boards in a kiln charge. This can happen because of natural variability in drying rate or initial moisture content, differences between sapwood and heartwood, wet pockets in the lumber, and/or variability in drying conditions in various parts of the kiln. It is also known that substantial MC variation can exist between package locations in the kiln, between row locations in the package, between kiln charges, and between different kilns at the same mill.

The moisture content variation needs to be estimated at all drying stages: before the actual kiln drying begins, to determine the setting of the kiln controls; during the kiln drying period, to monitor MC distribution and/or change drying schedules; and finally, at the end of a kiln charge, to determine whether the lumber has acceptable moisture variation or not. Minimizing lumber degrade during drying requires that the variation of moisture content be accurately monitored and controlled. This can be done using the quality tools of statistical process control (SPC).

The essential idea of SPC is to improve the quality of a process continuously through the constant application of statistical methods to process control. A major goal of SPC is to detect the occurrence of any assignable causes of disturbances in the process as soon as possible, so that investigation of the process and corrective action can be taken before many nonconforming products reach the final stage of the process. This is usually done through the use of parametric control charts for variables. The parametric control charts used in SPC are always based on knowledge of the underlying distribution of the data under analysis, and the conventional assumption is that the data is approximately Normally distributed.
If the process shows evidence of a significant departure from Normality, then the control limits calculated may be entirely inappropriate. The moisture content of kiln-dried lumber does not appear to be well modeled by a Normal distribution. It is, therefore, very important to determine the distribution of the moisture content of kiln-dried lumber.

There are a number of reasons why a process such as kiln drying, which is operating in a state of statistical control, yields non-normal distributions. To name a few:

a) Restraining variables at a fixed limit, e.g. preventing the kiln temperature from exceeding a certain set point, preventing the air relative humidity from exceeding set parameters, etc.;

b) Measurement of a characteristic that has zero as a natural limit, e.g. moisture content.

1.2 PROBLEM DEFINITION

Although variability is an important parameter to be measured and controlled, many quality control methods suggested in the literature are concerned with the average moisture content alone, such as sample estimation of average moisture content for a charge. Other methods are based on "go/no-go" decision criteria, such as acceptance sampling and conformance tests. These quality control methods help lumber producers to comply with lumber standards, but they do not help to improve the consistency of drying processes. Other methods for consistency, although concerned with the variation of moisture content, did not find practical applicability because they are based on the incorrect assumption that moisture content has a Normal probability distribution.

The variation of moisture content during and after drying could be monitored with the tools of statistical process control (SPC), such as control charts for process averages and process dispersion. One obstacle in the development and use of these tools is the insufficient knowledge of the distribution of moisture content in kiln-dried lumber, which is not well modeled by the customary Normal distribution.
Many authors suggest that the Lognormal distribution may offer a good model for the moisture content (McMahon 1961, Zwick and Cook 1985, and Maki 1991). However, suitable procedures for testing this assumption have not yet been documented and therefore are not used in industry.

This thesis presents a methodology for determining the distribution of moisture content by assessing how well a sample of data agrees with a given distribution as its population. A methodology is also developed for constructing control charts to monitor the moisture content of kiln-dried lumber when the Lognormal is the appropriate underlying distribution. The methods presented are tested on a data set of actual lumber collected from a production facility in British Columbia, Canada.

1.3 RESEARCH OBJECTIVES

Understanding how well a drying process is operating is very important to mill management. An ideal condition would be to know at all times what is happening in the kiln with respect to the correct moisture content, its distribution and its variability. If the process is stable or in-control, it should behave in a predictable fashion, and the distribution of moisture content should not change. If the process becomes unstable or out-of-control, the parameters of the distribution of moisture content will change, and this situation should be detected immediately because it will lead to unacceptable quality levels of process performance.

Two specific objectives were undertaken in this study, which will lead to methods that correctly determine the distribution of moisture content and properly interpret this information using statistical process control principles. These objectives are as follows:

1) Develop a methodology for determining the underlying distribution of moisture content in kiln-dried lumber, which is essential for making valid assumptions used in statistical process control.
Conventional control charts for variables are based on the assumption that the quality characteristic of interest is approximately Normally distributed. It has been shown that the moisture content in wood has a non-normal probability distribution. Therefore, the underlying distribution of moisture content in wood must be understood and properly modeled. Moisture content typically has a skewed distribution that could be modeled with the Lognormal or Weibull distributions, but this assumption has not yet been well justified statistically. The approach will be to fit the skewed distribution of MC to other known distributions, such as Lognormal or Weibull;

2) Develop quality control charts that accurately and realistically monitor the variability of moisture content in kiln-dried lumber. Using the moisture content distribution previously determined, appropriate control charts will be developed that monitor process average and dispersion.

2 LITERATURE REVIEW AND BACKGROUND MATERIAL

2.1 OVERVIEW

This literature review will examine the concept of moisture content in solid wood, and current methods for testing the goodness of fit of experimental distributions of MC against given probabilistic distributions. General issues concerning statistical process control principles and their application to lumber drying are also discussed.

2.2 MOISTURE CONTENT

The wood of freshly cut trees, also called 'green' wood, contains a considerable amount of water. Most of this water has to be removed before the wood is used for various purposes. Also, wood loses or gains moisture in an attempt to reach a state of balance with the surrounding air. Therefore, knowledge of the relations between wood and its moisture is very important for storage, fabrication and use (Simpson 1991).

The weight of moisture contained in a piece of wood is called the moisture content, and it is usually expressed as a percentage of its oven dry weight.
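In symbols, this definition can be written as below. The notation is an assumption introduced here for illustration only: W_g denotes the weight of the piece as measured and W_od its oven dry weight.

```latex
MC\,(\%) = \frac{W_{g} - W_{od}}{W_{od}} \times 100
```

For example, a piece weighing 1.15 kg that weighs 1.00 kg when oven dry has MC = (1.15 - 1.00) / 1.00 x 100 = 15 percent. Because the denominator is the oven dry weight rather than the total weight, MC can exceed 100 percent in very wet wood.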
By oven dry weight is meant the constant weight attained by wood samples dried at 103 ± 2 °C. It is advantageous to define moisture content in terms of the oven dry weight of wood since the oven dry weight is a constant value that may be determined at any time. Moisture in wood above the fiber saturation point is described by two types: free water, contained in the voids and lumens of the structure, and bound water, contained in the cell-wall material. In most applications, the free water has to be removed completely, along with some of the bound water, before the wood can be used for different products.

The water is usually removed by drying. Wood is typically dried in kilns, which are large chambers designed to provide and control the heat, humidity and air circulation necessary for proper drying. A regular kiln will dry most species of wood to any specified moisture content between 3 and 19 percent in a reasonably short time. One of the main objectives of kiln drying is to bring all the wood to about the same moisture content. This is a difficult task because wood dries at different rates: thicker lumber dries more slowly, sapwood dries faster than heartwood, quartersawn boards dry slower than flatsawn, etc.

VARIATION IN MOISTURE CONTENT

The variation of moisture content in kiln-dried lumber is due to three major factors: the inherent properties of wood as a non-homogenous material, the different rates at which wood dries, and the inconsistencies of drying conditions between different regions of the kiln.

Wood is a non-homogenous material, and the initial moisture content of green wood varies greatly and can range from over 200 percent to 30 percent. These variations occur mostly between different species, but also within the same species, and even within the same tree (Simpson 1997). In softwood species, sapwood usually contains more water than heartwood. In many species, the butt logs of trees may contain more water than the top logs.
Therefore, the wood enters the kiln with a certain amount of variability of MC. However, Culpepper and Wengert (1980) showed that there is a poor correlation between incoming (green) moisture content and final moisture content - the wettest boards when green were not the wettest boards at the end of drying.

The rate at which wood dries depends on the relative humidity of the surrounding air, the steepness of the moisture gradient, and the temperature of the wood. Drying rate is also influenced by lumber thickness and grain direction. The lower the relative humidity, the faster water moves through capillary cavities. The steeper the moisture gradient (the difference in MC between the inner and outer portions of a board), the faster the drying rate. The higher the temperature of the wood, the faster water will be removed from it. Drying time increases with thickness, at a rate that is more than proportional to thickness. Finally, flatsawn (also called flat grain) lumber dries faster than quartersawn (or vertical grain) lumber (from Simpson 1997).

The variation of MC in kiln-dried lumber is also caused by inconsistency of drying conditions within the kiln. Culpepper and Wengert (1980) showed that tremendous variation in final moisture content could result from the kiln itself. The identified causes were variation in air velocity from top to bottom of the kiln, and a heat supply partially blocked on one side of the kiln, causing over-drying on the other side.

The moisture content variation needs to be estimated at all drying stages (McMahon, 1961): before the actual kiln drying begins, to determine the setting of the kiln controls; during the kiln drying period, to monitor MC distribution and/or change drying schedules; and finally, at the end of a kiln charge, to determine whether the lumber has acceptable moisture variation or not.
Variation in final moisture content of lumber can cause serious problems in its subsequent processing and use (Simpson, 1991). Rice and Shepard (1993), and Culpepper and Wengert (1980) showed that substantial MC variation exists between package locations in the kiln, between row locations in the package, between kiln charges, and also between different kilns at the same mill. To correctly monitor the variation of moisture content it is necessary to know its underlying distribution, or at least the best approximation of it.

DISTRIBUTION OF MOISTURE CONTENT

Knowledge of the underlying distribution of moisture content would be an advantage for producers of kiln-dried lumber. Parameters such as average MC or standard deviation could be properly estimated from samples. Also, meaningful and reliable parametric control charts and other statistical process control techniques could be employed.

Many of the existing methods for parameter estimation, sampling, and control charting assume that the moisture content obeys the Normal law of probability. Many authors agree that the distribution of moisture content of kiln-dried lumber is not well modeled by the Normal distribution (McMahon 1961, Zwick and Cook 1985, Maki 1991, and Noghondarian 1997). The distribution of MC has a finite left tail, because the moisture content cannot have values less than zero percent. In addition, the distribution is positively skewed - the right-hand side tail is longer (McMahon 1961, Maki 1991). As the lumber becomes drier in a kiln, the skewness of the distribution generally increases (Zwick and Cook 1985). All these considerations suggest that the Lognormal or the Weibull distributions may be capable of modeling appropriately the moisture content in kiln-dried lumber. These assumptions should be rigorously checked with statistical methods, such as goodness of fit tests.
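The shape described above (bounded below by zero, with a longer right tail) can be illustrated with a small simulation. The following is a minimal sketch and not part of the thesis method: it draws hypothetical MC values from a Lognormal distribution with an assumed median of 15 percent and an assumed log-scale standard deviation of 0.3, then compares the sample skewness before and after a logarithmic transformation. The function name `sample_skewness` and all parameter values are illustrative assumptions.

```python
import math
import random
import statistics


def sample_skewness(values):
    """Return the moment-based sample skewness m3 / m2**1.5."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical parameters (assumptions, not thesis data):
# median MC of 15 percent, log-scale standard deviation of 0.3.
rng = random.Random(2001)
mc = [rng.lognormvariate(math.log(15.0), 0.3) for _ in range(5000)]
log_mc = [math.log(v) for v in mc]

skew_raw = sample_skewness(mc)      # positive for a right-skewed sample
skew_log = sample_skewness(log_mc)  # near zero if ln(MC) is Normal

print(f"skewness of MC:     {skew_raw:.2f}")
print(f"skewness of ln(MC): {skew_log:.2f}")
```

If a Lognormal model is appropriate, the skewness of ln(MC) should be close to zero while that of MC is clearly positive. A near-zero value alone does not establish Lognormality, so formal goodness of fit tests remain necessary.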
2.3 GOODNESS OF FIT METHODS

By goodness of fit techniques is meant the methods of examining how well a sample of data agrees with a given distribution as its population. A comprehensive review of these methods is given in D'Agostino and Stephens (1986). In the formal framework of hypothesis testing, the null hypothesis is that a given random variable follows a stated probability law (for example, the Normal distribution or the Lognormal distribution). The goodness of fit techniques applied to test the null hypothesis are based on measuring in some way the conformity of the experimental data to the hypothesized distribution. The techniques usually give formal statistical tests, and the measures of consistency are test statistics.

There are many goodness of fit methods, such as graphical analysis, chi-square-type tests, tests based on the empirical distribution function, tests based on regression and correlation, tests based on the third and fourth sample moments, etc. (from D'Agostino and Stephens 1986). However, not all these techniques are well developed for testing all three distributions (Normal, Lognormal, and Weibull), especially when the distribution parameters are not known a priori and have to be estimated from samples.

GRAPHICAL ANALYSIS

Graphical methods are exploratory techniques only and should always be used in conjunction with formal numerical techniques. Two graphical methods, which are based on the empirical cumulative distribution function (ECDF), are investigation of symmetry and probability plotting. D'Agostino and Stephens (1986) showed the procedure for investigating symmetry by means of an ECDF plot and by plotting a scatter diagram of the upper half of the ordered observations against the lower. Procedures for Normal and Lognormal probability plotting are given in D'Agostino and Stephens (1986). Weibull probability plotting is shown in Dodson (1994).
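The graphical ideas named above can be sketched numerically. The code below is an illustrative sketch only, not the procedures of the cited authors: it builds an ECDF, forms the upper-versus-lower distance pairs used to judge symmetry, and measures how straight a Normal or Lognormal probability plot is through a correlation coefficient. The function names, the simulated data, and the plotting position (i - 0.5)/n are all assumptions made for this example.

```python
import math
import random
from statistics import NormalDist, median


def ecdf(values):
    """Return the ordered observations and the ECDF height i/n at each one."""
    xs = sorted(values)
    n = len(xs)
    return xs, [(i + 1) / n for i in range(n)]


def symmetry_pairs(values):
    """Pair each lower observation with its mirror-image upper observation.

    Each pair is (median - x_(i), x_(n+1-i) - median). For a symmetric
    sample the two distances are about equal, so the points lie near a
    line of slope 1.
    """
    xs = sorted(values)
    n = len(xs)
    med = median(xs)
    return [(med - xs[i], xs[n - 1 - i] - med) for i in range(n // 2)]


def plot_correlation(values):
    """Correlate ordered observations with standard Normal quantiles.

    The plotting position (i - 0.5) / n is an assumption of this sketch.
    """
    xs = sorted(values)
    n = len(xs)
    zs = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
    mx, mz = sum(xs) / n, sum(zs) / n
    sxz = sum((x - mx) * (z - mz) for x, z in zip(xs, zs))
    sxx = sum((x - mx) ** 2 for x in xs)
    szz = sum((z - mz) ** 2 for z in zs)
    return sxz / math.sqrt(sxx * szz)


# Hypothetical MC readings (percent), simulated rather than measured.
rng = random.Random(42)
mc = [rng.lognormvariate(math.log(15.0), 0.5) for _ in range(200)]

ordered, heights = ecdf(mc)
pairs = symmetry_pairs(mc)
r_normal = plot_correlation(mc)
r_lognormal = plot_correlation([math.log(v) for v in mc])

print(f"Normal plot correlation:    {r_normal:.3f}")
print(f"Lognormal plot correlation: {r_lognormal:.3f}")
```

A Lognormal probability plot is simply a Normal probability plot of ln(MC), which is why the same helper serves both cases. A correlation closer to 1 indicates a straighter plot, but, as noted above, such graphical evidence is exploratory and should be backed by a formal test.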
NUMERICAL TESTS

The goodness of fit tests that are well developed for all three distributions under analysis are the tests based on ECDF statistics. They are more powerful than chi-square type tests (D'Agostino 1986), and their procedures and test statistics are better developed for the analysis of the three distributions under study (Normal, Lognormal, and Weibull), especially for the case when the parameters of the hypothesized distribution are not known and have to be estimated from the sample. The goodness of fit tests based on the ECDF are founded on various measures of the discrepancy between the empirical cumulative distribution function Fn(x) and the proposed theoretical cumulative distribution function F(x). The statistics measuring this discrepancy are called ECDF statistics.

To decide which statistics should be used, the following considerations were taken into account:

a) The parameters of the underlying distribution are not known. The distribution of moisture content in kiln-dried lumber is not well known, and it will have different parameters for a specific set of conditions (kiln, species, type of lumber, drying schedule, etc.). These parameters must be estimated from the sample, and the goodness of fit theory is not well developed for this situation, except for a few theoretical distributions;

b) Most of the procedures and test statistics for goodness of fit tests were developed for the following situations: F(x) is completely specified; F(x) is the Normal distribution with one or more parameters unknown; F(x) is the exponential distribution, with the scale parameter unknown; F(x) is the extreme value distribution, with one or both parameters unknown; F(x) is the Weibull distribution with all parameters unknown (Stephens 1974 and 1976, Chandra 1981, and Lockhart 1994);

c) The probability distributions that are similar to the distribution of moisture content data are the Lognormal and the Weibull.
The Normal distribution needs to be considered also, because of its wide acceptance in statistical process control.

Given the above considerations, the following ECDF statistics are used for goodness of fit hypothesis testing: the Kolmogorov-Smirnov D statistic, the Cramer-von Mises W² statistic, and the Anderson-Darling A statistic. Their definitions and formulas are given later in the 'Methods and Materials' section.

Once the underlying distribution is determined, control charts can be constructed according to statistical process control principles.

2.4 PRINCIPLES OF STATISTICAL PROCESS CONTROL

If a lumber product is to meet customer requirements from the moisture content point of view, generally it should be produced by a drying process that is stable or repeatable. More precisely, the drying process must be capable of operating with little variability around the target or nominal value of the final moisture content. Statistical Process Control (SPC) is a powerful collection of problem-solving tools useful in achieving process stability and improving capability through the reduction of variability (Montgomery 1997). One of the most technically sophisticated tools of SPC is the control chart.

CONTROL CHARTS

The theory of variability is the basis for the construction and use of control charts. In kiln-drying processes, regardless of how well designed or carefully maintained they are, a certain amount of inherent or natural variability exists. This natural variability, also called background noise, is a stable system of "common causes", which are an inherent part of the process. A process that is operating with only common causes of variation is said to be "in statistical control" (Montgomery 1997).

Other kinds of variability may occasionally be present in the output of a kiln-drying process, namely in the moisture content of lumber.
This variability in moisture content usually arises from three sources: improperly adjusted kiln equipment, operator errors, or defective raw material. Such variability is generally large when compared to the background noise, and it usually represents an unacceptable level of process performance. These sources of variability are called "assignable causes", and a process that is operating in the presence of such causes is said to be "out of control". A kiln can operate in the in-control state for a long time, producing acceptable moisture content levels in lumber. Eventually, however, assignable causes will occur, resulting in a "shift" to an out-of-control state, where a large proportion of the lumber does not conform to requirements.

A major objective of control charts is to quickly detect the occurrence of assignable causes of process shifts, so that investigation of the process and corrective action may be undertaken before many nonconforming units are produced. Control charts may also be used to estimate the parameters of the kiln-drying process and, through this information, to determine process capability. The control chart also provides useful information to improve the process. In general, the control chart is an effective tool in reducing variability as much as possible.

CONTROL CHARTS FOR LOGNORMAL VARIABLES

Conventional control charts for variables are based on the assumption that the underlying distribution of the data is Normal. As discussed earlier in the Distribution of Moisture Content section, the Lognormal may provide a good fit for the underlying distribution of moisture content. Lognormal control charts were discussed by Ferrell (1958), Morrison (1958) and Joffe and Sichel (1968).

Ferrell (1958) proposes a control chart for monitoring the "geometric midrange", which he defines as the Lognormal equivalent of the arithmetic midrange from a Normal distribution (half the sum of the largest and smallest data values).
However, he uses an equivalent to the '3-sigma' limits for the geometric midrange without showing what the distribution of this parameter actually is. This control chart is developed only for the two-parameter Lognormal. He also does not present any chart for monitoring process dispersion.

Morrison (1958) suggests two control charts for monitoring the sample mean and the sample ratio, respectively. He states that the control limits for a Normal distribution are calculated by processes of addition, subtraction, multiplication, and division. Hence, the corresponding processes for a Lognormal distribution will be multiplication, division, evolution (extraction of roots), and involution (raising to powers). The ratio of sample values in a Lognormal distribution corresponds to the range within samples in a Normal distribution. Therefore, the factors used as multipliers of the average range in calculating Normal distribution control limits should be used as indices of the average ratio in the Lognormal distribution. The formulas for calculating the Lognormal limits are given for charts that monitor the sample mean and the sample ratio. However, these limits apply only to the two-parameter Lognormal. Also, the measuring units of the control charts are not in the original scale of measurement of the data, so the interpretation of these charts by mill personnel would be rather difficult.

Joffe and Sichel (1968) used the arithmetic mean to sequentially test hypotheses about the mean of a Lognormal distribution. The procedure is essentially an acceptance sampling plan for attributes, and it can be used for inspection of the final product rather than for real-time monitoring. Also, the procedure applies only to the two-parameter Lognormal.

To use the control charts in practical applications, sample sizes must be established along with a method for taking the samples.

SAMPLING AND SAMPLE SIZE

In general, the parameters of a process are unknown; furthermore, they can change over time.
Therefore, estimates for these parameters are obtained by statistical inferences based on samples taken from that population. A sample is a portion or a subset of the entire population of measurements under consideration (Milton 1986). The method of sampling used in goodness-of-fit tests is random sampling (D'Agostino and Stephens 1986). Also, Montgomery (1997) clearly advises that samples be randomly selected for SPC procedures, otherwise false conclusions will be derived. Random samples can be taken either with replacement or without replacement. In practical applications, the sampling is typically done without replacement, because the ratio of sample size to population size is negligible.

From a sample or a collection of samples, inferences are made about the entire population. Therefore, large sample sizes would be ideal; however, the financial cost usually associated with sampling will favour small samples. Consequently, the practitioner will be concerned with the minimum sample size that will provide accurate information.

The minimum sample size for goodness-of-fit tests varies from one situation to another. For empirical distribution function plots, the typical sample size is at least 100 measurements (D'Agostino and Stephens 1986). For probability plots, the same authors recommend a sample size greater than 50. For a discussion about sample size in probability plotting, see also Daniel and Wood (1971). The minimum sample size for numerical goodness-of-fit tests is not well defined in the literature. However, Stephens (1974), Lockhart (1994), and Cohen et al. (1985) illustrated applications of goodness-of-fit tests with sample sizes ranging from 15 to 100.

Only small sample sizes are required for SPC charts (relative to the size of the population), so sampling without replacement is acceptable for the case of moisture content.
For example, a typical kiln (such as the one used to dry the lumber for this study) holds over 35,000 lumber pieces (with a nominal size of 50.8 mm x 101.6 mm x 1.83 m, or 2"x4"x6'). For control charts, sample sizes as small as 5 are often used, and therefore the ratio of the sample size to the population size is negligible (Montgomery 1997).

STATISTICAL PROCESS CONTROL IN LUMBER DRYING

Quality control methods have been successfully used and are now ordinary in most high-tech industries. However, the application of statistical process control in the forest products industry, specifically in lumber drying, lags well behind other industries. There are three primary reasons for this:

a. Technology development in the past has been based on achieving high production rates of low valued products. Product quality was not a market driver;
b. Wood is a non-homogenous material, and even when it is kiln-dried under controlled conditions, the moisture content does not stabilize or become constant throughout the wood pieces. Due to this large variability, Statistical Process Control (SPC) methods based on inspecting product do not work because of the large volume of product that must be inspected;
c. The mathematical distribution of the moisture content in dried wood is not well understood. Therefore, SPC methods borrowed from other industries do not work well.

Although variability is an important parameter to be measured and controlled, many quality control methods suggested in the literature are concerned with the average moisture content alone, such as sample estimation of average moisture content for a charge (Fell and Hill 1980, Rassmussen 1988, and Simpson 1991). Other methods are based on "go/no-go" decision criteria, such as acceptance sampling (Bramhall and Wellwood 1976, and Bramhall and Warren 1977), and conformance tests (Cheung 1994).
These quality control methods help lumber producers to comply with lumber standards such as NIST (1999), WWPA (1998), and NLGA (1998), but they do not help to improve the consistency of drying processes. Other methods for consistency, although concerned with the variation of moisture content, did not find practical applicability because they are based on the incorrect assumption that moisture content has a Normal probability distribution (Pratt 1953 and 1956, Bramhall 1975, and Maki and Milota 1993). This assumption considers the asymmetry and the positive skewness of the MC distribution as "defects" or out-of-control situations, when in fact these are typical outcomes of the drying process (Zwick and Cook 1985, and Maki 1991).

McMahon (1961) proposed Lognormal probability plotting in conjunction with frequency distribution analysis as quality control methods. These techniques were used to estimate the average moisture content and the percentage of MC above or below given limits. The methods are only approximate however, and the possible procedural errors are quite large. Also, the author fails to consider the third parameter of the Lognormal distribution - the threshold.

3 METHODS AND MATERIALS

3.1 OVERVIEW

The study begins with the collection of MC measurements from kiln-dried lumber, from a process that is thought to be in-control. Graphical and numerical methods are used to determine the underlying distribution of MC using goodness-of-fit analysis for three known distributions: Normal, Lognormal, and Weibull. The parameters of the distribution are estimated and appropriate control charts are constructed to monitor process average and dispersion.

The graphical methods employed in this paper are investigation of symmetry and probability plotting. The first method provides a general estimation with respect to the symmetry and skewness of the moisture content distribution.
The second method offers a visual determination of how well the data follow each of the three mathematical distributions. To complement the goodness-of-fit analysis, formal mathematical tests are carried out for each distribution using test statistics based on the empirical distribution function (ECDF).

Generic quality control charts are developed for Lognormal distributed data, which are designed to monitor the process average and dispersion. Specific control limits and charts are then constructed using the collected data. Subsequent samples, collected from later kiln charges, are plotted on the proposed charts, and a comparison is made with conventional charts based on the normality assumption.

3.2 DATA COLLECTION

The lumber used in this study was kiln-dried Douglas-Fir with a nominal section size of 50.8 mm x 101.6 mm (2"x4"), and a nominal length of 1.83 m (6 ft). The lumber came from two different kiln charges that had the same species, dimensions, and drying conditions. The lumber from the first charge was used for distribution analysis and determination of parameters for control charts, and the lumber from the second charge was used for plotting data on the control charts.

It is important to mention that the two charges were obtained from a kiln that was operating under normal conditions, using a typical drying schedule. Normal conditions here means the typical operating situation of the kiln, without any known errors in the drying schedule or in the level of the drying parameters. Technical details about the kiln are given in Appendix A, and the drying schedule is given in Appendix B.

DATA FOR DISTRIBUTION ANALYSIS

The first kiln charge contained approximately 36,480 boards. Two hundred and twenty-five of these boards were randomly selected without replacement from the first kiln charge and each board was cut up into sections of 1-foot length each, resulting in a total number of 1350 specimens.
The '1-foot' length was chosen with the practical application of the thesis results in mind. Specifically, '1-foot' is the longitudinal dimension of the measuring field of an in-line moisture meter, and the results of this research are intended to be applied to this type of instrument. The moisture content of these specimens was determined by oven-drying as described below. Six hundred and seventy-five measurements were selected from the total of 1350 by random sampling with replacement, and they constituted the experimental data for the analysis of the underlying distribution. The procedure of sampling the 675 MC measurements was replicated two more times, to verify the validity of the conclusions drawn from the first sample.

DATA FOR CONTROL CHARTS

To determine the control limits and centerlines of the control charts, 20 samples were successively selected from the 1350 measurements of the first charge. Each sample contained 5 randomly selected MC measurements, resulting in a total of 100 measurements. The minimum sample size and the minimum number of samples are recommended by Montgomery (1997). The sampling procedure was 'simple random sampling with replacement'.

To plot moisture content data on the control charts, 225 boards were randomly chosen from the second kiln charge. The charge contained approximately 36,480 boards, and the kiln was operating under typical conditions. The boards were cut into 1-ft long specimens, and their MC was determined by oven-drying. Twenty samples were successively selected from the total of 1350, each sample consisting of 5 MC measurements. The sampling procedure was again 'simple random sampling with replacement'.

DETERMINATION OF MOISTURE CONTENT

The moisture content was determined using the oven-dry method as described in ASTM Standard D 4442 - 92 (ASTM 1997). To prevent wood moisture loss or gain, the specimens were weighed within three days after the lumber exited the kiln.
The initial mass (A) of the specimens was determined by weighing them with a scale, which had a minimum readability of 1 mg. The specimens were oven-dried in a forced-convection oven, which maintained a temperature of 103 ± 2 °C. The specimens were considered completely dry when the mass loss in a 3-hr interval was equal to zero, as read by the scale. After oven-drying, the specimens were immediately weighed again, thus obtaining the oven-dry mass (B). The moisture content of each specimen was calculated using the formula:

$$\%MC = \frac{A - B}{B} \times 100 \tag{1}$$

3.3 GRAPHICAL ANALYSIS OF DATA

Two graphical analysis methods were used in this research: investigation of symmetry, and probability plotting. These methods can reveal relationships present in the data, and departures from the assumed models and statistical distributions. D'Agostino and Stephens (1986) recommend that graphical methods precede and complement the numerical methods that are presented in the following section.

As an exploratory technique, the objective of the graphical analysis is to uncover characteristics of the data such as symmetry, skewness, and outliers. This is accomplished by constructing different graphs based on the empirical cumulative distribution function (ECDF). The ECDF is defined for a set of n independent observations $X_1, \ldots, X_n$ with a common distribution function $F(x)$. The observations are ordered from smallest to largest as $X_{(1)}, \ldots, X_{(n)}$. The empirical cumulative distribution function, $F_n(x)$, is defined as

$$F_n(x) = \begin{cases} 0, & x < X_{(1)} \\ \dfrac{i}{n}, & X_{(i)} \le x < X_{(i+1)}, \quad i = 1, \ldots, n-1 \\ 1, & X_{(n)} \le x \end{cases} \tag{2}$$

$F_n(x)$ is a step function that takes a step of height $1/n$ at each observation. This function estimates the distribution function $F(x)$. At any value of x, $F_n(x)$ is the proportion of observations that is less than or equal to x, while $F(x)$ is the theoretical probability of an observation that is less than or equal to x.
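As an illustration of equation (2), the following minimal Python sketch evaluates the ECDF of a small sample. The function name `ecdf` and the moisture content readings are hypothetical and are not taken from the study data.

```python
from bisect import bisect_right


def ecdf(sample):
    """Return F_n as a function giving the proportion of observations <= x."""
    ordered = sorted(sample)
    n = len(ordered)

    def f_n(x):
        # bisect_right counts how many ordered observations are <= x
        return bisect_right(ordered, x) / n

    return f_n


# Hypothetical moisture content readings (%), for illustration only
mc = [14.2, 15.1, 13.8, 19.6, 16.4]
f_n = ecdf(mc)
for x in (13.0, 15.1, 25.0):
    print(x, f_n(x))
```

The step of height 1/n at each observation is visible when `f_n` is evaluated just below and exactly at an observed value.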
The use of the ECDF plot does not depend upon any assumptions concerning the underlying parametric distribution and it has some definite advantages over other statistical devices (D'Agostino and Stephens 1986):

a) It is invariant under monotone transformations with regard to quantiles;
b) Its complexity is independent of the number of observations;
c) It supplies immediate and direct information regarding the skewness and bimodality of the underlying distribution;
d) It is an effective indicator of outliers;
e) It supplies robust information on location and dispersion;
f) It does not involve grouping difficulties that arise in using, for example, a histogram;
g) It can be used effectively in censored samples.

There is, however, one serious potential drawback with the use of ECDF plots. They can be sensitive to random occurrences in the data and exclusive reliance on them can lead to false conclusions. This is especially true if the sample size is small (D'Agostino and Stephens 1986). Therefore, they should be used in combination with other methods, such as formal numerical tests.

INVESTIGATION OF SYMMETRY

Figure 1 contains plots of three distributions to illustrate different situations one can encounter in attempting to determine if a distribution is symmetric or skewed. The three distributions are the Normal (which is symmetric), the negative exponential (which is "positively skewed": its upper tail is longer than its lower tail, its upper percentage points are farther from the median than are the lower), and the Johnson unbounded curve (which is "negatively skewed": its lower tail is longer than its upper tail, its lower percentage points are farther from the median than are the upper).

[Figure 1. Symmetric and skewed distributions. Density and cumulative plots; panels labelled Normal (symmetric), Johnson Unbounded (negative skew), and Lognormal (positive skew).]

The ECDF plot is obtained by plotting $i/n$ as ordinate against the $i$th ordered value of the sample as abscissa.
The procedure for constructing the ECDF plot is as follows:

Procedure 1: Constructing an ECDF plot
1. Order the observations $X_{(i)}$ ($i = 1, \ldots, n$) from smallest to largest;
2. Calculate the ECDF: $F_n(x) = i/n$ for each $X_{(i)}$;
3. Plot $F_n(x)$ versus $X_{(i)}$.

The ECDF plot can be used to assess the skewness of the distribution. If the distribution has positive skewness, the portion of the ECDF for $i/n$ values close to 1.0 (i.e. greater than 0.9) will usually be longer and flatter (almost parallel to the horizontal axis) than the rest of the ECDF. Similarly, if the distribution has negative skewness the long flat portion will lie in the lower end of its ECDF ($i/n$ values lower than 0.1).

The symmetry of a distribution can be assessed by the distance of percentiles to the sample median, and by plotting the upper half of the ordered observations against the lower. If a distribution is symmetric, then the distance between the median (50th percentile) and any percentile P below the median (0 < P < 50) is equal to the distance from the median to the (100-P)th percentile. The procedure of assessing the symmetry by the distance of percentiles to the sample median is as follows:

Procedure 2: Assessing the symmetry by the distance of percentiles to the sample median
1. Order the observations $X_{(i)}$ ($i = 1, \ldots, n$) from smallest to largest;
2. Determine the values of $X_{(i)}$ corresponding to the following percentiles: 10th, 20th, 30th, 40th, 50th, 60th, 70th, 80th, and 90th. The 50th percentile is the median, denoted here $X_{\mathrm{med}}$;
3. Distances in absolute value from each percentile to the median are calculated by subtracting $X_{\mathrm{med}}$ from all the $X_{(i)}$ values obtained in step 2;
4. These distances are compared for the following pairs of percentiles: 10th and 90th, 20th and 80th, 30th and 70th, 40th and 60th.
For the distribution to be symmetrical, these distances have to be equal for each of the pairs considered.

Symmetry can also be assessed by a scatter diagram of the upper half of the ordered observations against the lower. This graph is obtained by plotting $X_{(n+1-i)}$ on the vertical axis versus $X_{(i)}$ on the horizontal axis, for $i \le n/2$. A "-1" slope indicates symmetry; a negative slope exceeding 1 in absolute value indicates positive skewness; and a negative slope less than unity in absolute value indicates negative skewness. The procedure for assessing symmetry by plotting the upper half of the observations against the lower is as follows:

Procedure 3: Assessing symmetry by plotting the upper half of the observations against the lower
1. If $X_{(1)}, X_{(2)}, \ldots, X_{(n)}$ represent the ordered observations, plot $X_{(n)}$ versus $X_{(1)}$, $X_{(n-1)}$ versus $X_{(2)}$, and in general $X_{(n+1-i)}$ versus $X_{(i)}$ for $i \le n/2$;
2. Calculate the slope of the plotted data. A slope equal to -1 indicates symmetry. A negative slope exceeding unity in absolute value indicates positive skewness, and a negative slope less than 1 in absolute value indicates negative skewness.

PROBABILITY PLOTTING

A major problem with the use of the ECDF plot in attempting to judge visually the correctness of a specific hypothesized distribution is due to the curvature of the ECDF and CDF plots. The CDF (Cumulative Distribution Function) plot is a cumulative plot of the hypothesized distribution function. It is usually very hard to judge visually the closeness of the curved ECDF plot to the curved CDF plot. It is much easier to determine if a set of points deviates from a straight line. A probability plot offers such an opportunity to reach a decision based on visual inspection, because if the hypothesized distribution is the true underlying distribution, then it will be a straight-line plot.

To construct a probability plot, the ordered observations are plotted against functions of the ranks.
In general, a probability plot is a plot of $z_i$ versus $x_{(i)}$, with

$$z_i = G^{-1}\left(F(x_{(i)})\right) \quad \text{for } i = 1, \ldots, n \tag{3}$$

where: $G^{-1}(\cdot)$ is the inverse transformation which here transforms $F(x)$ (the CDF of the hypothesized distribution) into the corresponding standardized value z; x represents the observed values of the random variable X; n is the sample size.

In equation (3) the standardized $F(x)$ can be calculated using automated functions in spreadsheets, or it can be substituted with $p_i$:

$$F(x_{(i)}) = p_i = \frac{i - 0.5}{n} \tag{4}$$

Once the points are plotted the major task is to judge whether the plotted data form a straight line. If they do not, the task is then to decide which properties of the underlying distribution or data cause this non-linearity.

Methods for probability plotting are constructed for three hypothesized distributions: Normal, Lognormal, and Weibull. The last two distributions were chosen because of their characteristics - finite left tail and a positive skew - which could mimic the moisture content data.

Normal Probability Plotting

Normal probability plotting is the plotting of data in order to investigate the goodness of fit of the data to the Normal distribution. The Normal distribution has the probability density function

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2\right] \quad \text{for } -\infty < x < \infty \tag{5}$$

where $\mu$ is the mean and $\sigma$ is the scale parameter. The cumulative distribution function is

$$F(x) = \Phi\left(\frac{x - \mu}{\sigma}\right) \tag{6}$$

where the function $\Phi$ is the cumulative distribution function of the standard normal variable:

$$\Phi(z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} \exp\left(-\frac{t^2}{2}\right) dt \tag{7}$$

The standardized value z corresponding to equation (4) is approximated by:

$$z = \operatorname{sign}\left(F_n(x) - 0.5\right)\left(1.238\,t\,(1 + 0.0262\,t)\right) \tag{8}$$

where:

$$t = \left\{-\ln\left[4 F_n(x)\left(1 - F_n(x)\right)\right]\right\}^{1/2} \tag{9}$$

and

$$\operatorname{sign}\left(F_n(x) - 0.5\right) = \begin{cases} +1 & \text{if } F_n(x) - 0.5 > 0 \\ -1 & \text{if } F_n(x) - 0.5 < 0 \end{cases} \tag{10}$$

where $F_n(x)$ is the empirical cumulative distribution function. The coefficient in equation (8) is 0.0262; with this value the approximation returns z of about 1.96 at $F_n(x) = 0.975$, as expected for the standard normal distribution.

This approximation is given in Hamaker (1978) and appears to be of sufficient accuracy for plotting.
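A short Python sketch can make the approximation of equations (8) through (10) concrete and check it against an exact inverse of the standard normal CDF. The function names are hypothetical. The coefficient is taken as 0.0262, on the assumption that this is the intended value, since it is the one that reproduces z of about 1.96 at p = 0.975; the comparison uses `statistics.NormalDist` from the Python standard library.

```python
import math
from statistics import NormalDist


def hamaker_z(p):
    """Approximate standard normal quantile for 0 < p < 1 (equations 8 to 10)."""
    # max() guards against a tiny negative argument caused by rounding
    t = math.sqrt(max(0.0, -math.log(4.0 * p * (1.0 - p))))
    sign = 1.0 if p > 0.5 else -1.0
    return sign * 1.238 * t * (1.0 + 0.0262 * t)  # 0.0262 is an assumed value


def plotting_positions(n):
    """Plotting positions p_i = (i - 0.5) / n of equation (4)."""
    return [(i - 0.5) / n for i in range(1, n + 1)]


# Compare the approximation with the exact quantile for a sample of size 10
exact = NormalDist()
for p in plotting_positions(10):
    print(f"p={p:.2f}  approx={hamaker_z(p):+.4f}  exact={exact.inv_cdf(p):+.4f}")
```

For plotting purposes the two columns should agree to roughly two decimal places over the usual range of plotting positions, which is consistent with the remark that the approximation is accurate enough for graphical work.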
Caution should be exercised when using normal probability plots for samples smaller than 25, for they can show substantial variation and nonlinearity even if the underlying distribution is normal (Daniel and Wood, 1971).

The procedure for Normal probability plotting is as follows:

Procedure 4: Normal probability plotting
1. Compute $p_i$ with equation (4), then obtain $z_i$, the inverse of the standard normal cumulative distribution, for each $i = 1, \ldots, n$. This can be done by using an automated function in spreadsheet software, or by using the approximation formulas (8) through (10);
2. Plot the values of $z_i$ versus $x_{(i)}$;
3. If the plotted points form a straight line, then the Normal distribution is indicated as the underlying distribution.

Lognormal Probability Plotting

The lognormal distribution usually has two parameters, scale and shape, and a lower bound of zero. However, a third parameter - threshold (or location) - has to be taken into consideration, because the moisture content in wood has a positive lower bound. For kiln-dried lumber, this lower bound is related to the lowest equilibrium moisture content (EMC), which depends on the relative humidity and temperature of the air in the kiln. The three-parameter lognormal distribution has the probability density function

$$f(x) = \frac{1}{(x - \theta)\,\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{\ln(x - \theta) - \mu}{\sigma}\right)^2\right] \quad \text{for } x > \theta \tag{11}$$

where: $\theta$ is the threshold parameter, $\mu$ is the scale parameter, and $\sigma$ is the shape parameter. The cumulative distribution function is

$$F(x) = \Phi\left(\frac{\ln(x - \theta) - \mu}{\sigma}\right) \quad \text{for } x > \theta \tag{12}$$

An important property of the lognormal distribution is that the random variable $Y = \ln(X - \theta)$ has a normal distribution with mean $\mu$ and standard deviation $\sigma$. Probability plots for this distribution can be constructed by plotting z from equation (8) on the vertical axis, against $\ln(x - \theta)$. It is important to note that in order to obtain a straight line, the threshold parameter (or an estimate of it, see Cohen 1951, Cohen et al.
1985, or Crow and Shimizu 1988 for estimation procedures) has to be subtracted from all the data. The subtraction of the threshold parameter is often ignored (Maki 1991, and Noghondarian 1997), thereby leading to incorrect conclusions about the underlying moisture content distribution. To draw correct conclusions from probability plots, they should always be complemented by numerical tests for goodness of fit.

The procedure for Lognormal probability plotting is as follows:

Procedure 5: Lognormal probability plotting
1. Estimate the value of the threshold parameter from the data set (estimation procedures are given in Cohen 1951, Cohen et al. 1985, and Crow and Shimizu 1988). Subtract this value from all the data, and then take the natural logarithm; the results will be plotted on the horizontal axis;
2. Calculate $z_i$ using equations (8) through (10);
3. Plot $z_i$ against $\ln(x_{(i)} - \theta)$;
4. If the plotted points form a straight line, then the Lognormal distribution is indicated as the underlying distribution.

Weibull Probability Plotting

The Weibull distribution has the probability density function

$$f(x) = \frac{c}{\sigma}\left(\frac{x - \theta}{\sigma}\right)^{c-1} \exp\left[-\left(\frac{x - \theta}{\sigma}\right)^{c}\right] \quad \text{for } x > \theta,\; c > 0 \tag{13}$$

where: $\theta$ is the threshold parameter, $\sigma$ is the scale parameter, and c is the shape parameter. The cumulative distribution function is

$$F(x) = 1 - \exp\left[-\left(\frac{x - \theta}{\sigma}\right)^{c}\right] \quad \text{for } x > \theta \tag{14}$$

Various methods are available for Weibull analysis (Nelson and Thompson, 1971). D'Agostino (1986) showed that Weibull probability plotting could be achieved simply by plotting z, as given by equation (15), on $\ln(x - \theta)$, where $F_n(x)$ in equation (15) is the Weibull cumulative distribution function. Before a probability plot can be constructed, an estimate of $F_n(x)$ is needed. The most common estimate is the median rank, a nonparametric estimate of the Weibull cumulative distribution function. Dodson (1994) shows that the median rank is well approximated by expression (16), where: $i = 1, \ldots, n$ is the rank of the observation in the ordered sample; n is the sample size.
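The Weibull plotting coordinates can be sketched in Python as follows, assuming the threshold has already been estimated. The function name `weibull_plot_points` and the parameter values are hypothetical; the median rank uses the approximation (i - 0.3)/(n + 0.4) and the vertical coordinate is z = ln(-ln(1 - F_n(x))). The demonstration builds artificial data lying exactly on Weibull quantiles, so the plotted points fall on a straight line whose slope equals the shape parameter.

```python
import math


def weibull_plot_points(sample, theta):
    """Return (ln(x_(i) - theta), z_i) pairs for a Weibull probability plot."""
    ordered = sorted(sample)
    n = len(ordered)
    points = []
    for i, x in enumerate(ordered, start=1):
        f = (i - 0.3) / (n + 0.4)          # median rank approximation
        z = math.log(-math.log(1.0 - f))   # vertical plotting coordinate
        points.append((math.log(x - theta), z))
    return points


# Hypothetical parameters, for illustration only (not estimates from the study)
theta, scale, shape = 8.0, 6.0, 2.5
n = 20
data = [theta + scale * (-math.log(1.0 - (i - 0.3) / (n + 0.4))) ** (1.0 / shape)
        for i in range(1, n + 1)]

pts = weibull_plot_points(data, theta)
slope = (pts[-1][1] - pts[0][1]) / (pts[-1][0] - pts[0][0])
print("slope of the plotted line:", round(slope, 6))
```

If the threshold is not subtracted (for example, calling the function with `theta=0.0` on the same data), the points no longer fall on a straight line, which illustrates why the threshold estimate matters.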
The plotting coordinate and the median rank approximation referred to above are

$$z = \ln\left(-\ln\left(1 - F_n(x)\right)\right) \tag{15}$$

$$F_n(x_{(i)}) \approx \frac{i - 0.3}{n + 0.4} \tag{16}$$

Similar to the three-parameter lognormal distribution, the threshold value must be subtracted from all the data before a Weibull plot will produce a straight line.

The procedure for Weibull probability plotting is as follows:

Procedure 6: Weibull probability plotting
1. Estimate the value of the threshold parameter from the data set (for estimation procedures see Cohen 1965 or Dodson 1994). Subtract this value from all the data, and then take the natural logarithm; the results will be plotted on the horizontal axis;
2. Calculate $z_i$ using equations (15) and (16);
3. Plot $z_i$ against $\ln(x_{(i)} - \theta)$;
4. If the plotted points form a straight line, then the Weibull distribution is indicated as the true underlying distribution.

3.4 GOODNESS OF FIT TESTS

INTRODUCTION

The general test of fit is a test of

$$H_0: \text{a random sample of } n \text{ X-values comes from } F(x; \omega) \tag{17}$$

where $F(x; \omega)$ is a continuous distribution and $\omega$ is a vector of parameters. In our case, $\omega$ is not specified and as a result the null is a composite hypothesis. The alternative hypothesis is also composite - it gives no information on the distribution of the data, and simply states that $H_0$ is false. The major focus is on the measure of agreement of the data with the null hypothesis; in fact, it is usually hoped to accept that $H_0$ is true (D'Agostino, 1986).

The fact that it is usually hoped to accept the null hypothesis and proceed with other analyses as if it were true sets goodness-of-fit testing apart from most statistical testing procedures. In usual tests of hypotheses, the null must be rejected in order to prove a point. Here, because the alternative is very vague, the appropriate statistical test will often be by no means clear and no general theory of Neyman-Pearson type (in which the two competing hypotheses have to be precisely defined, along with the exact power and significance levels) appears applicable.
Therefore, many different and elaborate procedures have been developed to test the same null hypothesis, without any one emerging as superior. To name a few: tests of chi-squared type, tests based on ECDF (empirical distribution function) statistics, tests based on regression and correlation, and moment techniques.

Given the above considerations, the following ECDF statistics will be used for goodness-of-fit hypothesis testing: the Kolmogorov-Smirnov $D$ statistic, the Cramer-von Mises $W^2$ statistic, and the Anderson-Darling $A^2$ statistic.

ECDF STATISTICS

The Kolmogorov-Smirnov statistic assesses the discrepancy between the empirical distribution $F_n(x)$ and the estimated hypothesized distribution $F(x)$. Specifically, it is based on the largest vertical difference between $F_n(x)$ and $F(x)$, and it is computed as the maximum of $D^+$ and $D^-$. $D^+$ is the largest vertical difference between the ECDF and the distribution function when the ECDF is greater than the distribution function. $D^-$ is the largest vertical distance when the ECDF is less than the distribution function.

$$D^+ = \max_i \left(\frac{i}{n} - U_{(i)}\right) \tag{18}$$

$$D^- = \max_i \left(U_{(i)} - \frac{i - 1}{n}\right) \tag{19}$$

$$D = \max\left(D^+, D^-\right) \tag{20}$$

where $U_{(i)} = F(x_{(i)})$ is the cumulative distribution function value at $x_{(i)}$, the $i$th ordered value.

The Cramer-von Mises statistic ($W^2$) is defined as

$$W^2 = n \int_{-\infty}^{\infty} \left[F_n(x) - F(x)\right]^2 dF(x) \tag{21}$$

and it is computed as

$$W^2 = \sum_{i=1}^{n} \left[U_{(i)} - \frac{2i - 1}{2n}\right]^2 + \frac{1}{12n} \tag{22}$$

The Anderson-Darling statistic ($A^2$) is defined as

$$A^2 = n \int_{-\infty}^{\infty} \left[F_n(x) - F(x)\right]^2 \left\{F(x)\left[1 - F(x)\right]\right\}^{-1} dF(x) \tag{23}$$

and it is computed as

$$A^2 = -n - \frac{1}{n} \sum_{i=1}^{n} (2i - 1)\left[\ln U_{(i)} + \ln\left(1 - U_{(n+1-i)}\right)\right] \tag{24}$$

GENERAL PROCEDURE FOR TESTS OF HYPOTHESIS

The general null hypothesis is that the input data values are a random sample from a specified distribution.
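The computational forms (18) through (20), (22), and (24) can be sketched in Python as follows. This is a minimal illustration, not the software used in the study: the function name `edf_statistics` is hypothetical, and the input is assumed to be the list of values $U_{(i)} = F(x_{(i)})$ already evaluated under the hypothesized distribution, each strictly between 0 and 1.

```python
import math


def edf_statistics(u):
    """Return (D, W2, A2) from the values U_(i) = F(x_(i)), with 0 < U < 1."""
    u = sorted(u)
    n = len(u)
    idx = range(1, n + 1)
    d_plus = max(i / n - u[i - 1] for i in idx)           # equation (18)
    d_minus = max(u[i - 1] - (i - 1) / n for i in idx)    # equation (19)
    d = max(d_plus, d_minus)                              # equation (20)
    w2 = sum((u[i - 1] - (2 * i - 1) / (2 * n)) ** 2 for i in idx) + 1 / (12 * n)  # (22)
    # u[n - i] is U_(n+1-i) in zero-based indexing
    a2 = -n - sum((2 * i - 1) * (math.log(u[i - 1]) + math.log(1 - u[n - i]))
                  for i in idx) / n                       # equation (24)
    return d, w2, a2


# Illustrative input: U values placed at the midpoints (2i - 1) / (2n) for n = 4
d, w2, a2 = edf_statistics([0.125, 0.375, 0.625, 0.875])
print(d, w2, a2)
```

With the midpoint values used here the squared terms in equation (22) vanish, so $W^2$ reduces to its constant term 1/(12n), which gives a quick sanity check on an implementation.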
To test whether the null hypothesis is true or false, first determine from appropriate tables the statistic ("critical" value) that will cause us to reject the null, then calculate the test statistic ("computed" value) using relations (18) - (20), (22) and (24). Note that modified forms of the statistics might have to be calculated; see the following section for details. If the "computed" value is greater than the "critical" value, then the null is rejected and the conclusion is that the MC data are not a random sample from the specified distribution.

TESTS FOR NORMAL DISTRIBUTION

For a test of Normality, the hypothesized distribution is a Normal distribution function with parameters $\mu$ and $\sigma$ estimated by the sample mean and standard deviation (see equations (4), (5), and (6)). The critical values for $D$, $W^2$, and $A^2$ are given by Stephens (1974). The author has shown how these goodness-of-fit statistics can be modified so that the critical values for $n \to \infty$ can be used for all sample sizes. The following modified, more powerful statistics for $D$, $W^2$, and $A^2$ are recommended:

$$D^* = D\left(\sqrt{n} - 0.01 + \frac{0.85}{\sqrt{n}}\right) \tag{25}$$

$$W^{2*} = W^2\left(1 + \frac{0.5}{n}\right) \tag{26}$$

$$A^{2*} = A^2\left(1 + \frac{4}{n} - \frac{25}{n^2}\right) \tag{27}$$

For different levels of significance, the "critical" values of the ECDF statistics are found in Stephens (1974), "table 1A.3: Modifications for a test of normality, $\mu$ and $\sigma$ unknown".

The procedure for tests of hypothesis for the Normal distribution is as follows:

Procedure 7: Goodness-of-fit test of hypothesis for the Normal distribution
1. Estimate $\mu$ and $\sigma$ from the sample of n observations using equations (45) and (46);
2. Order the observations, then calculate $F_n(x)$ and $F(x)$ for each observation;
3. Calculate $D$, $W^2$, and $A^2$ using equations (20), (22) and (24). Then compute the modified statistics using (25) through (27);
4. From Stephens (1974), find the critical values for each statistic, for the significance level desired;
5.
For each ECDF statistic, reject the null hypothesis if the "computed" statistic exceeds the "critical" value.

TESTS FOR LOGNORMAL DISTRIBUTION

For a test of whether the data are from a Lognormal distribution (given by (11) and (12)), the hypothesized distribution is a Lognormal distribution function with parameters $\theta$, $\mu$ and $\sigma$. First, the threshold parameter $\theta$ is estimated from the sample (see Cohen 1951, Cohen et al. 1985, or Crow and Shimizu 1988 for estimation procedures). Then, the scale parameter $\mu$ is estimated from the sample after the logarithmic transformation of the data, $\ln(x - \theta)$. The sample mean of the transformed sample is used as the estimate for the scale parameter. The shape parameter $\sigma$ can be calculated with the formula given by Cohen (1951). Other sources (SAS 1999) suggest that the shape parameter $\sigma$ can be estimated by the sample standard deviation of the logarithmic-transformed data. The test is therefore equivalent to the test of normality on the transformed sample, and the same modified statistics from formulas (25) through (27) can be used. The critical values for $D$, $W^2$, and $A^2$ are given by Stephens (1974) for different significance levels.

The procedure for tests of hypothesis for the Lognormal distribution is as follows:

Procedure 8: Goodness-of-fit test of hypothesis for the Lognormal distribution
1. Estimate the location parameter $\theta$ from the sample (for estimation procedures see Cohen 1951, Cohen et al. 1985, or Crow and Shimizu 1988). Transform the data by $\ln(x - \theta)$. Compute the sample mean and standard deviation of the transformed sample, which will give $\mu$ and $\sigma$ respectively;
2. Calculate $F_n(x)$ and $F(x)$ for each observation from the transformed data;
3. Calculate $D$, $W^2$, and $A^2$ using equations (20), (22) and (24). Then compute the modified statistics using (25) through (27);
4.
From Stephens (1974) find the critical values for each statistic, for the significance level desired;
5. For each ECDF statistic, reject the null hypothesis if the "computed" statistic exceeds the "critical" value.

TESTS FOR WEIBULL DISTRIBUTION

For a test of whether the data are from a Weibull distribution, the hypothesized distribution is a Weibull distribution function (see formulas (13) and (14)) with parameters $\theta$, $c$ and $\sigma$ estimated by the maximum-likelihood method (see Cohen 1965 or Dodson 1994 for estimation procedures). In this case, the Kolmogorov-Smirnov $D$ statistic is not as powerful as the other two statistics. Therefore, only the Cramer-von Mises $W^2$ statistic and the Anderson-Darling $A^2$ statistic will be used for the tests of hypothesis. The critical values for $W^2$ and $A^2$ are obtained by interpolation of simulated values given by Lockhart and Stephens (1994).

The procedure for tests of hypothesis for the Weibull distribution is as follows:

Procedure 9: Goodness-of-fit test of hypothesis for the Weibull distribution
1. Estimate the location, shape and scale parameters from the sample, using the methods presented in Cohen (1965) or Dodson (1994);
2. Calculate $F_n(x)$ and $F(x)$ for each observation for the transformed data;
3. Calculate $W^2$ and $A^2$ using equations (22) and (24);
4. From Lockhart and Stephens (1994) find the critical values for each statistic by interpolation, for the significance level desired;
5. For each ECDF statistic, reject the null hypothesis if the "computed" statistic exceeds the "critical" value.

3.5 CONTROL CHARTS FOR LOGNORMAL DATA

The methodology for estimating distribution parameters is presented for the three-parameter Lognormal distribution, and the control charts for Lognormal data are constructed. The proper variables to be monitored on control charts are also discussed.
THREE-PARAMETER LOGNORMAL DISTRIBUTION

The Lognormal distribution usually has two parameters, scale and shape, and a lower bound of zero. The probability density function for the two-parameter Lognormal distribution is

$$f(x) = \frac{1}{x\,\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{\ln(x)-\mu}{\sigma}\right)^{2}\right] \quad \text{for } x > 0 \tag{28}$$

where $\mu$ is the scale parameter and $\sigma$ is the shape parameter. However, a third parameter, the threshold or location, has to be taken into consideration, because the moisture content in wood has a positive lower bound, always greater than zero. For kiln-dried lumber, this lower bound is related to the lowest equilibrium moisture content (EMC), which depends on the relative humidity and temperature of the surrounding air in the kiln. The three-parameter Lognormal distribution has the probability density function

$$f(x) = \frac{1}{(x-\theta)\,\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{\ln(x-\theta)-\mu}{\sigma}\right)^{2}\right] \quad \text{for } x > \theta > 0 \tag{29}$$

where $\theta$ is the threshold parameter, $\mu$ is the scale parameter, and $\sigma$ is the shape parameter. The cumulative distribution function for the three-parameter case is

$$F(x) = \Phi\left(\frac{\ln(x-\theta)-\mu}{\sigma}\right) \quad \text{for } x > \theta \tag{30}$$

where $\Phi$ is the standard Normal cumulative distribution function. The three-parameter Lognormal distribution has the mean (from Aitchison and Brown 1957)

$$\alpha = \theta + \exp\left(\mu + \frac{\sigma^{2}}{2}\right) \tag{31}$$

and variance

$$\beta^{2} = \exp\left(2\mu + \sigma^{2}\right)\left(\exp(\sigma^{2}) - 1\right) \tag{32}$$

It is important to note that a change in the value of the parameter $\theta$ affects only the location of the distribution; it does not affect the variance or the shape (Johnson et al. 1994).

There is a very important connection between the Lognormal and Normal distributions. The Lognormal distribution in its simplest form may be defined as the distribution of a variable whose natural logarithm obeys the Normal law of probability. In other words, if a variable $X$ is distributed Lognormal with a threshold parameter $\theta$, a scale parameter $\mu$, and a shape parameter $\sigma$, then the variable $Y = \ln(X-\theta)$ has a Normal distribution with mean $\mu$ and standard deviation $\sigma$.
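The formulas above can be checked numerically with a minimal sketch. The function names are illustrative choices, not part of the thesis, and the parameter values in the usage lines are hypothetical.

```python
import math
from statistics import NormalDist


def lognormal3_pdf(x, theta, mu, sigma):
    """Density of the three-parameter Lognormal, equation (29)."""
    if x <= theta:
        return 0.0
    z = (math.log(x - theta) - mu) / sigma
    return math.exp(-0.5 * z * z) / ((x - theta) * sigma * math.sqrt(2 * math.pi))


def lognormal3_cdf(x, theta, mu, sigma):
    """Cumulative distribution function, equation (30)."""
    if x <= theta:
        return 0.0
    return NormalDist().cdf((math.log(x - theta) - mu) / sigma)


def lognormal3_mean(theta, mu, sigma):
    """Mean, equation (31): the threshold shifts the mean directly."""
    return theta + math.exp(mu + sigma ** 2 / 2)


def lognormal3_variance(mu, sigma):
    """Variance, equation (32): it does not involve the threshold."""
    return math.exp(2 * mu + sigma ** 2) * (math.exp(sigma ** 2) - 1)


# Hypothetical parameter values, for illustration only
theta, mu, sigma = 7.5, 1.9, 0.25
median = theta + math.exp(mu)        # Y = ln(X - theta) has median mu
print(lognormal3_cdf(median, theta, mu, sigma))
print(lognormal3_mean(theta, mu, sigma), lognormal3_variance(mu, sigma))
```

Because the variance function takes no threshold argument, shifting $\theta$ moves the mean by the same amount and leaves the spread unchanged, which is the property noted above.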
This log-transform property of the Lognormal is very useful in quality control work, because the methods of statistical process control are well known and widely applied for the Normal case.

Although many of the properties of the Lognormal may be derived immediately from those of the Normal distribution, certain features of the former differ from anything arising in Normal theory. One example is that the mean and the variance of the Lognormal distribution are not parameters of the distribution, in contrast with the Normal case, where the parameters of the distribution are the mean and the variance. Another example is that, unlike the Normal distribution, the Lognormal is not uniquely determined by its moments (Heyde 1963). Also, when the threshold value is unknown and has to be estimated from the sample, the estimation procedures developed for the two-parameter case become more complicated (Aitchison and Brown 1957).

WHAT TO MONITOR ON CONTROL CHARTS

Control charts usually monitor process average and process dispersion. The process average is typically checked with a control chart for means, and the dispersion is usually monitored with a control chart for the standard deviation or for the range. For Normally distributed quality characteristics, the mean and the standard deviation are also the parameters of the distribution, and are measures of central tendency and dispersion, respectively. For moisture content in kiln-dried lumber, whose distribution seems to be better modeled by the Lognormal, the mean, standard deviation and variance are not parameters of the distribution. The question that arises for Lognormal variables is whether the mean and standard deviation should be monitored, or the threshold, scale and shape parameters.

One approach would be to monitor the average and the standard deviation of the moisture content, which is a standard practice in quality control applications.
For large samples, conventional control charts for sample averages could be constructed regardless of the Lognormal assumption, because by the central limit theorem sample averages are approximately Normally distributed. However, the distributions of the standard deviation and the variance of a Lognormal variable are not well known, so conventional methods such as "3-sigma" or probability limits cannot be employed here. Another reason for not monitoring the average is that, for Lognormal distributions, a good measure of central tendency is not the arithmetic mean but rather the geometric mean. This subject is detailed in a later section.

This paper proposes a second approach, which is to monitor the parameters of the distribution directly. This approach is based on the close relationship between the Lognormal and the Normal distributions. As discussed earlier, if a variable $X$ is distributed Lognormal with a threshold parameter $\theta$, a scale parameter $\mu$, and a shape parameter $\sigma$, then the variable $Y = \ln(X - \theta)$ has a Normal distribution with mean $\mu$ and standard deviation $\sigma$. Monitoring the mean and standard deviation of the Normal variable $Y$ would therefore control the scale and shape parameters of the Lognormal variable $X$.

The threshold parameter would not be monitored directly with a specific chart. However, inferences about a change in the threshold value could be made from the other two charts. For example, if a shift in the scale parameter occurs, it could be an indication that the threshold value also changed, especially if the shape remains the same. Similarly, if a shift occurs in the shape parameter and the scale stays the same, it could imply that the threshold value has changed as well.

Suppose that the moisture content $X$ is distributed Lognormal with known threshold, scale, and shape parameters.
If $x_1, x_2, \ldots, x_n$ is a sample of size $n$, we transform it to normality by subtracting the value of $\theta$ from each observation and then taking the natural logarithm:

$$\ln(x_i - \theta) = y_i \tag{33}$$

According to the definition of the Lognormal distribution, the new variable $Y$ is distributed Normal with mean $\mu$ and standard deviation $\sigma$. The average of the transformed sample is:

$$\bar{y} = \frac{y_1 + y_2 + \ldots + y_n}{n} \tag{34}$$

It is known that $\bar{y}$ is normally distributed with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$, for moderately large sample sizes (Montgomery 1997). The conventional "3-sigma" control limits and the center line for a control chart for sample means $\bar{y}$ are:

$$\begin{aligned} UCL_N &= \mu + 3\frac{\sigma}{\sqrt{n}} \\ \text{Center Line}_N &= \mu \\ LCL_N &= \mu - 3\frac{\sigma}{\sqrt{n}} \end{aligned} \tag{35}$$

The subscript "N" in equations (35) refers to the fact that the control limits are developed for Normally distributed data.

In practice, we usually will not know $\theta$, $\mu$ and $\sigma$, the parameters of the Lognormal distribution, a priori. Therefore, they must be estimated from preliminary samples taken when the process is thought to be in control. The following section shows how to estimate the parameters of the underlying Lognormal distribution of moisture content.

PARAMETER ESTIMATION

Whether the threshold parameter $\theta$ is known is critical for the choice of methods used to estimate the other two Lognormal parameters. If the threshold is known, the estimation methods are well developed and straightforward. If the threshold is not known and has to be estimated from previous data, the methods involved in parameter estimation are much more complex.

The threshold is said to be "known" when it can be determined a priori by reference to the generating system. The minimum moisture content that the lumber can possibly have during drying is given by the equilibrium moisture content (EMC) set up in the kiln. This assumes, of course, that all the lumber entering the kiln has moisture content greater than the initial EMC.
The temperature and relative humidity of the air in the kiln determine the EMC. However, in order to force water out of the wood, the drying schedules are maintained in a way that does not allow the wood to attain the EMC set by the schedule. The threshold parameter is therefore related to the EMC, but this relationship is not known. As a result, the threshold parameter cannot be determined solely by reference to the generating system, and has to be estimated from samples.

The fact that $\theta$ is unknown creates complications in the estimation of Lognormal parameters. A great deal of research concerning the three-parameter Lognormal has been published (see Crow and Shimizu 1988, p. 113, for an exhaustive discussion). In general, global maxima of the likelihood function are used to estimate distribution parameters. However, Heyde (1963) demonstrated that the three-parameter Lognormal is not uniquely determined by its moments, which raised questions about moment estimators. Also, Hill (1963) has shown that global maximum likelihood estimators lead to inadmissible estimates regardless of the sample. Therefore, other estimators have to be used that at least produce reasonable estimates. As an alternative to global maximum likelihood estimators (MLE), local maximum likelihood estimators (LMLE) are generally accepted for the estimation of the threshold, scale, and shape parameters, especially for moderately large sample sizes (Cohen 1951, Cohen et al. 1985, and Crow and Shimizu 1988).

Given a sample of moisture content data, the first parameter that can be estimated is the threshold. For a sample ordered in ascending order, $x_1 \le x_2 \le \ldots \le x_n$, Cohen (1951) derived the following local maximum likelihood estimating equation for $\theta$:

$$F(\theta) = \left[\sum_{i=1}^{n}\frac{1}{x_i-\theta}\right]\left[n\sum_{i=1}^{n}\ln(x_i-\theta) - n\sum_{i=1}^{n}\ln^{2}(x_i-\theta) + \left(\sum_{i=1}^{n}\ln(x_i-\theta)\right)^{2}\right] - n^{2}\sum_{i=1}^{n}\frac{\ln(x_i-\theta)}{x_i-\theta} = 0 \tag{36}$$

It is convenient to solve this equation by trial and error with linear interpolation. A first approximation $\theta_1 < x_1$ is chosen and $F(\theta_1)$ is evaluated.
If $F(\theta_1)$ is zero, no further calculations are required. Otherwise the search continues until a pair of values $\theta_i$ and $\theta_j$ is found in a sufficiently narrow interval such that $F(\theta_i) > 0 > F(\theta_j)$ or $F(\theta_i) < 0 < F(\theta_j)$, and the final estimate $\hat{\theta}$ is found by interpolation. When solving (36) for $\theta$, only values for which $0 < \theta < x_1$ are accepted.

After obtaining the local maximum likelihood estimate $\hat{\theta}$ for the threshold, the estimates for the other two parameters, scale and shape, can be determined with the following relations (from Cohen 1951):

$$\hat{\mu} = \frac{1}{n}\sum_{i=1}^{n}\ln(x_i-\hat{\theta}) \tag{37}$$

$$\hat{\sigma}^{2} = \frac{1}{n}\sum_{i=1}^{n}\ln^{2}(x_i-\hat{\theta}) - \left[\frac{1}{n}\sum_{i=1}^{n}\ln(x_i-\hat{\theta})\right]^{2} \tag{38}$$

It can be seen from equation (37) that the scale parameter $\mu$ is estimated by the sample mean of the log-transformed data, $\ln(x-\hat{\theta})$.

Sometimes this estimation technique does not yield a valid result for the threshold, because local maxima of the likelihood function do not always exist, especially in small samples (Cohen et al. 1985). In that case, the parameters of the distribution can be estimated from the sample by an alternative technique based on modified moment estimators (MME). This methodology is given in Cohen et al. (1985).

In practical applications, once the threshold $\hat{\theta}$ is calculated for a kiln and a set of conditions, it is assumed to be constant until there is evidence that it has changed. The other two parameters are estimated from each sample using equations (37) and (38).

ARITHMETIC MEAN VS. GEOMETRIC MEAN

It is known that the arithmetic mean quantifies the central tendency of Normal variables. When the observations are not distributed Normal, but their natural logarithms are, the geometric mean should be used. This is true for the two-parameter Lognormal distribution. For the Lognormal variable $x_i$ ($i = 1, 2, \ldots, n$), $y_i = \ln(x_i)$ is a Normal variable.
With a few algebraic operations, the average of $y_i$ (called $\bar{y}$) becomes:

$$\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i = \frac{1}{n}\sum_{i=1}^{n}\ln(x_i) = \frac{1}{n}\ln\left(\prod_{i=1}^{n} x_i\right) = \ln\left[\left(\prod_{i=1}^{n} x_i\right)^{1/n}\right] \tag{39}$$

It is known that

$$\left(\prod_{i=1}^{n} x_i\right)^{1/n} \tag{40}$$

is the geometric mean of $x_i$ (further called "geomean($x_i$)"). Equation (39) becomes:

$$\bar{y} = \ln(\text{geomean}(x_i)) \tag{41}$$

If we calculate the average of $y_i$ and want to express it in the original units of $x_i$, then from $x_i = \exp(y_i)$ we have:

$$\text{antilog}(\bar{y}) = \exp(\bar{y}) = \exp[\ln(\text{geomean}(x_i))] = \text{geomean}(x_i) \tag{42}$$

In other words, the geometric mean of a Lognormal variable corresponds to the arithmetic mean of the log of that variable (which is a Normal variable). Hence, for the two-parameter Lognormal the geometric mean should be used as a measure of central tendency, rather than the average (arithmetic mean).

For the three-parameter case, $y_i = \ln(x_i - \theta)$ is a Normal variable, and the average $\bar{y}$ is given by:

$$\bar{y} = \ln(\text{geomean}(x_i - \theta)) \tag{43}$$

which is equivalent to:

$$\exp(\bar{y}) = \text{geomean}(x_i - \theta) \tag{44}$$

This means that the geometric mean of the moisture content less the threshold is the quantity that should actually be monitored on a chart in the original scale of measurements.

Once the parameters of the underlying distribution are estimated for the in-control process, the control limits and center lines of the control charts can be determined.

CONTROL CHART FOR SCALE PARAMETER ($\bar{y}$ CHART)

In the quality control literature the customary chart for monitoring sample averages is usually called the "$\bar{x}$ chart". To adopt this nomenclature for our case, we call "$\bar{y}$ chart" the chart that monitors sample averages of the transformed data, the $Y$ variable, which is distributed Normal. It was shown earlier that these sample averages are equivalent to the sample scale parameter of the Lognormal variable.

If $m$ samples, each of size $n$, are used as preliminary data, and $\bar{y}_1, \bar{y}_2, \ldots, \bar{y}_m$ are the averages of each sample (identical with the scale parameter of each sample, calculated with equation (37)), then the scale parameter of the Lognormal distribution is estimated by

$$\bar{\bar{y}} = (\bar{y}_1 + \bar{y}_2 + \ldots + \bar{y}_m)/m \tag{45}$$

Similarly, if $S_i$ is the standard deviation of the $i$th sample, then

$$\bar{S} = (S_1 + S_2 + \ldots + S_m)/m \tag{46}$$

The relationship between the standard deviation (of the Normal variable $Y$) and the shape parameter (of the Lognormal variable $X$) follows from the (biased) maximum likelihood estimator of the variance $\sigma^2$ for the Normal variable $Y$ (from Johnson et al. 1994, and Montgomery 1997):

$$S^{2} = \frac{1}{n}\sum_{i=1}^{n} y_i^{2} - \left[\frac{1}{n}\sum_{i=1}^{n} y_i\right]^{2} \tag{47}$$

If we substitute $y_i = \ln(x_i - \hat{\theta})$ in (47), the equation becomes:

$$S^{2} = \frac{1}{n}\sum_{i=1}^{n}\ln^{2}(x_i-\hat{\theta}) - \left[\frac{1}{n}\sum_{i=1}^{n}\ln(x_i-\hat{\theta})\right]^{2} \tag{48}$$

which is identical in expression with equation (38). However, the standard deviation of the sample, $S$, is not an unbiased estimator of $\sigma$. To obtain an unbiased estimator of $\sigma$, $S$ must be divided by $c_4$, a constant that depends on the sample size [1]. Therefore, $\bar{S}/c_4$ is to be used for estimating the standard deviation $\sigma$ of the Normal population, which is equivalent to estimating the shape parameter $\sigma$ of the Lognormal population.

Using the usual notation for control charts, the parameters of the control chart for sample means, or scale parameters, $\bar{y}$ (called the "$\bar{y}$ chart") become:

$$\begin{aligned} UCL_N &= \bar{\bar{y}} + 3\frac{\bar{S}}{c_4\sqrt{n}} \\ \text{Center Line}_N &= \bar{\bar{y}} \\ LCL_N &= \bar{\bar{y}} - 3\frac{\bar{S}}{c_4\sqrt{n}} \end{aligned} \tag{49}$$

where:
$n$ = the size of the samples that will be used for subsequent plotting;
$\bar{\bar{y}}$ = estimate of the Normal population mean, obtained from preliminary data, given by equation (45); it is also an estimate of the Lognormal population scale parameter $\mu$;
$\bar{S}$ = an estimate of the standard deviation of the Normal population, given by equation (46); it is also an estimate of the Lognormal population shape parameter $\sigma$;

[1] For a detailed discussion about $c_4$ and its values see Johnson et al. 1994, or Montgomery 1997.
$c_4$ = a constant that depends on the sample size; this constant is introduced because $\bar{S}/c_4$ is an unbiased estimator of the population standard deviation; for $n = 5$, $c_4 = 0.940$ (Johnson et al. 1994, or Montgomery 1997).

This chart could be used as it is. However, it would not provide readily meaningful information, because the plotted values are dimensionless as a result of the logarithmic transformation. To solve this problem, a chart can be constructed which plots the data in the original scale. The control limits and center line for the original scale are obtained by taking the antilog (by exponentiation) and then adding the threshold. From $Y = \ln(X - \theta)$ we have $X = \theta + \exp(Y)$, and we use this relationship to transform all the results back to the original scale of measurements:

$$\begin{aligned} UCL_{LN} &= \theta + \exp(UCL_N) \\ LCL_{LN} &= \theta + \exp(LCL_N) \end{aligned} \tag{50}$$

The subscript "LN" in equations (50) refers to the fact that the control limits are developed for the Lognormally distributed data, which is the moisture content. By substituting (49) in (50), the limits of the original-scale control chart (further called the "antilog $\bar{y}$ chart") become:

$$\begin{aligned} UCL_{LN} &= \theta + \exp\left(\bar{\bar{y}} + 3\frac{\bar{S}}{c_4\sqrt{n}}\right) \\ \text{Center Line}_{LN} &= \theta + \exp(\bar{\bar{y}}) \\ LCL_{LN} &= \theta + \exp\left(\bar{\bar{y}} - 3\frac{\bar{S}}{c_4\sqrt{n}}\right) \end{aligned} \tag{51}$$

Once these control limits are established, they can be used to monitor subsequent kiln charges that are obtained under the same drying conditions. To plot a point on the $\bar{y}$ chart, a sample $x_i$ ($i = 1, 2, \ldots, n$) of MC measurements is taken, and each $x_i$ is transformed to a Normal variable $y_i = \ln(x_i - \theta)$. The sample average of the $y_i$, $\bar{y}$, is then plotted on the $\bar{y}$ chart. To plot the corresponding point on the antilog $\bar{y}$ chart, $\bar{y}$ needs to be transformed to the original scale of measurements with the equation:

$$\text{antilog}(\bar{y}) = \theta + \exp(\bar{y}) \tag{52}$$

At first sight, plotting $\text{antilog}(\bar{y})$ on the antilog $\bar{y}$ chart seems to be just an algebraic manipulation. However, it was demonstrated earlier that $\exp(\bar{y})$ is in fact the geometric mean of $(x_i - \theta)$.
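As an illustration of equations (49) and (51), the following minimal sketch computes both sets of limits. The function name and the input values are hypothetical and serve only to show the arithmetic; $c_4 = 0.940$ is the value quoted above for $n = 5$.

```python
import math


def ybar_chart_limits(y_bar_bar, s_bar, n, c4, theta):
    """Limits of the y-bar chart (49) and of the antilog y-bar chart (51)."""
    half_width = 3 * s_bar / (c4 * math.sqrt(n))
    log_scale = {
        "LCL": y_bar_bar - half_width,
        "CL": y_bar_bar,
        "UCL": y_bar_bar + half_width,
    }
    # Back-transform: exponentiate first, then add the threshold
    original_scale = {k: theta + math.exp(v) for k, v in log_scale.items()}
    return log_scale, original_scale


# Hypothetical preliminary estimates, for illustration only
log_limits, mc_limits = ybar_chart_limits(
    y_bar_bar=1.9, s_bar=0.25, n=5, c4=0.940, theta=7.5
)
print(log_limits)
print(mc_limits)
```

The log-scale limits are symmetric about the center line, while the back-transformed limits are not: the upper limit sits further from the center line than the lower one, because exponentiation stretches the upper side.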
Since $\exp(\bar{y})$ equals the geometric mean of $(x_i - \theta)$ by equation (44), the point plotted on the antilog $\bar{y}$ chart is:

$$\text{antilog}(\bar{y}) = \theta + \text{geomean}(x_i - \theta) \tag{53}$$

This means that the geometric mean of the MC data less the threshold is actually monitored on the antilog $\bar{y}$ chart. The threshold is added back only to give the right scale to the chart.

The following is a procedure for constructing the charts for sample means (scale parameters) and the corresponding original-scale control charts:

Procedure 10: Construction of charts for sample means, or scale parameters ($\bar{y}$ chart), and the corresponding Lognormal control charts in the original scale of measurements (antilog $\bar{y}$ chart)
1. For each kiln and set of conditions (species, type of lumber, stacking method, drying schedule), collect $m$ random samples of $n$ moisture content measurements, taken when the process is thought to be in control (it is recommended that $m > 20$; $n$ can be as low as 5, Montgomery 1997);
2. Determine whether the moisture content is distributed Lognormal, using the methods described in the 'Graphical Analysis' and 'Goodness of Fit Tests' sections. If the Lognormal is not a good model for the data, then other appropriate methods should be used. This procedure assumes hereafter that the MC is distributed Lognormal;
3. From the preliminary data estimate the threshold parameter using equation (36). The threshold should be estimated from the entirety of the preliminary data, rather than as the average of per-sample estimates, because of the estimation problems explained above. With $\hat{\theta}$ known, the other two parameters are estimated from the transformed samples, using (45) and (46);
4. Establish a sample size $n$ to be used in subsequent plotting, then calculate the center lines and the control limits for both charts using relations (49) and (51). Equations (49) define the chart for the sample averages (or scale parameter), the "$\bar{y}$ chart", which is used to monitor the log-transformed data.
Equations (51) describe the corresponding chart that monitors the moisture content in the original scale of measurements, the "antilog $\bar{y}$ chart";
5. To plot the $\bar{y}$ value for a subsequent sample of size $n$, first transform the $x_i$ by subtracting $\hat{\theta}$ from each value and then taking the natural logarithm, $\ln(x_i - \hat{\theta})$. Calculate the mean of the transformed sample, $\bar{y}$, which is plotted on the "$\bar{y}$ chart";
6. To plot the corresponding value on the "antilog $\bar{y}$ chart", bring the result back to the original scale of measurements with the expression $\text{antilog}(\bar{y}) = \hat{\theta} + \exp(\bar{y})$.

CONTROL CHART FOR SHAPE PARAMETER (S CHART)

To monitor the Lognormal shape parameter (or the sample standard deviation of the transformed data), the "3-sigma" limits cannot be used, because the sample standard deviation is not Normally distributed, nor is the sample variance. Therefore, probability limits are proposed to construct control charts for the sample standard deviation. The construction of these charts is based on the fact that the scaled sample variance of the Normal variable $Y$ is chi-square distributed. If $Y \sim N(\mu, \sigma^2)$, then

$$\frac{(n-1)S^{2}}{\sigma^{2}} \sim \chi^{2}_{n-1} \tag{54}$$

where $Y$ is the Normal variable with mean $\mu$ and variance $\sigma^2$, $S^2$ is the sample variance, and $\chi^{2}_{n-1}$ is the chi-square distribution with $n-1$ degrees of freedom. A $100(1-\alpha)\%$ two-sided confidence interval on the variance is:

$$\frac{(n-1)S^{2}}{\chi^{2}_{\alpha/2,\,n-1}} \le \sigma^{2} \le \frac{(n-1)S^{2}}{\chi^{2}_{1-\alpha/2,\,n-1}} \tag{55}$$

where $\chi^{2}_{\alpha/2,\,n-1}$ denotes the percentage point of the chi-square distribution such that $P\left(\chi^{2}_{n-1} > \chi^{2}_{\alpha/2,\,n-1}\right) = \alpha/2$. The control limits for this chart, called the "S chart", are calculated as follows:

$$\begin{aligned} UCL_N &= \frac{\bar{S}}{c_4}\sqrt{\frac{\chi^{2}_{\alpha/2,\,n-1}}{n-1}} \\ \text{Center Line}_N &= \frac{\bar{S}}{c_4} \\ LCL_N &= \frac{\bar{S}}{c_4}\sqrt{\frac{\chi^{2}_{1-\alpha/2,\,n-1}}{n-1}} \end{aligned} \tag{56}$$

where $\bar{S}$ is an estimate of the standard deviation of the Normal population, given by equation (46), and $c_4$ is the constant that depends on the sample size, described earlier.

To construct this chart in the original scale of measurements, the above control limits just need to be exponentiated, without adding the threshold parameter.
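The probability limits can be sketched in code as follows. This is a minimal illustration under assumptions: it takes the S chart limits to be $\bar{S}/c_4$ multiplied by the square root of a chi-square percentage point over $n-1$, the significance level $\alpha$ is an input because no value is stated in this section, and the chi-square percentage points are obtained with a small self-written series and bisection routine rather than a statistics library. All function names are hypothetical.

```python
import math


def chi2_cdf(x, k):
    """Chi-square CDF with k degrees of freedom (incomplete-gamma series)."""
    if x <= 0:
        return 0.0
    a, t = k / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    i = 1
    while term > 1e-16 * total:
        term *= t / (a + i)
        total += term
        i += 1
    return total * math.exp(-t + a * math.log(t) - math.lgamma(a))


def chi2_upper_point(p, k):
    """Value exceeded with probability p by a chi-square variable (bisection)."""
    lo, hi = 0.0, 1.0
    while 1.0 - chi2_cdf(hi, k) > p:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if 1.0 - chi2_cdf(mid, k) > p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def s_chart_limits(s_bar, n, c4, alpha):
    """Probability limits for the S chart on the log-transformed scale."""
    sigma_hat = s_bar / c4
    return {
        "LCL": sigma_hat * math.sqrt(chi2_upper_point(1 - alpha / 2, n - 1) / (n - 1)),
        "CL": sigma_hat,
        "UCL": sigma_hat * math.sqrt(chi2_upper_point(alpha / 2, n - 1) / (n - 1)),
    }


def antilog_s_limits(limits):
    """Original-scale limits: exponentiate only, the threshold is not added."""
    return {name: math.exp(value) for name, value in limits.items()}


# alpha = 0.002 is an assumed value chosen for illustration
limits = s_chart_limits(s_bar=0.25, n=5, c4=0.940, alpha=0.002)
print(limits, antilog_s_limits(limits))
```

Because the chi-square distribution is not symmetric, the resulting limits are not equidistant from the center line, unlike the "3-sigma" limits of the $\bar{y}$ chart.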
As explained earlier, the Lognormal shape parameter (the standard deviation of the transformed data) does not change when the threshold value changes. The parameters of the original-scale control chart (called the "antilog S chart") become:

$$\begin{aligned} UCL_{LN} &= \exp(UCL_N) \\ \text{Center Line}_{LN} &= \exp(\text{Center Line}_N) \\ LCL_{LN} &= \exp(LCL_N) \end{aligned} \tag{57}$$

Once these control limits are established, they can be used to monitor subsequent kiln charges that are obtained under the same drying conditions. The sample collected earlier for the $\bar{y}$ chart is used, and the standard deviation $S$ of the transformed sample is plotted on the S chart. To plot the corresponding point on the antilog S chart, $S$ needs to be transformed to the original scale of measurements by exponentiation.

The following is a procedure for constructing the charts for the Lognormal shape parameter (or Normal sample standard deviation) and the corresponding original-scale control charts:

Procedure 11: Construction of charts for the shape parameter, or sample standard deviations (S chart), and the corresponding Lognormal control charts in the original scale of measurements (antilog S chart)
1. Using the results from steps 1 to 4 of Procedure 10, calculate the center lines and the control limits using relations (56) and (57). Equations (56) define the chart for the shape parameter (or sample standard deviation), called the "S chart", which is used to monitor the variation of the log-transformed data. Equations (57) describe the corresponding chart in the original scale of measurements, called the "antilog S chart";
2. To plot a subsequent sample of size $n$, calculate the standard deviation $S$ of the transformed sample, $\ln(x_i - \hat{\theta})$, and then plot this value on the S chart;
3. To plot the corresponding value on the antilog S chart, bring the result back to the original scale of measurements by exponentiating: $\text{antilog}(S) = \exp(S)$.

4 RESULTS AND DISCUSSION

4.1 OVERVIEW

The probabilistic distribution of the experimental moisture content data was analyzed both graphically and numerically.
Results of two graphical methods, analysis of symmetry and probability plotting, are presented in the form of ECDF plots, symmetry tables and charts, and probability plots for the three distributions investigated: Normal, Lognormal, and Weibull. Furthermore, results of the hypothesis tests are shown for the numerical goodness-of-fit methods, by comparing test statistics with critical values. Finally, the underlying distribution of moisture content is determined by consolidating the results of these methods.

4.2 GRAPHICAL ANALYSIS

The symmetry of the moisture content distribution was first assessed by a plot of the empirical cumulative distribution function (ECDF). The moisture content measurements were ordered from smallest to largest, and then the empirical cumulative distribution function was calculated for each moisture content value. The ECDF plot, presented in Figure 2, shows that the moisture content distribution has a positive skewness, because the portion of the ECDF for i/n values greater than 0.9 is longer and flatter (almost parallel to the horizontal axis) than the portion of the ECDF for i/n values smaller than 0.1.

[Figure 2. Empirical Cumulative Distribution of MC for Douglas-Fir, and the relation of the 10th and 90th percentiles about the median (50th percentile). Horizontal axis: Moisture Content [%].]

The symmetry of the empirical distribution of MC was also analyzed by comparing the absolute distances of percentiles from the median. The following pairs of percentiles were calculated: 10th and 90th, 20th and 80th, 25th and 75th, 40th and 60th. The distances of the percentiles from the median (the 50th percentile) are presented in Table 1 for all the pairs of percentiles. These distances are not equal within each pair, showing that the distribution of MC is not symmetric.
The table also shows that every upper percentile is further away from the median than its corresponding lower percentile. This supports the observation that the distribution is positively skewed.

| Lower percentile P | Absolute distance from median | Upper percentile 100-P | Absolute distance from median |
|---|---|---|---|
| 10 | 1.73 | 90 | 2.53 |
| 20 | 1.22 | 80 | 1.52 |
| 25 | 1.01 | 75 | 1.23 |
| 40 | 0.40 | 60 | 0.46 |

Table 1. Absolute values of distances from sample median to percentiles

The symmetry was finally assessed by a plot of upper vs. lower observations (Figure 3).

[Figure 3. Plot of upper vs. lower observations for investigation of symmetry. Horizontal axis: Lower observations MC [%].]

The slope of the plotted observations had a value of -1.67. Following the considerations presented in the 'Methods' section, a slope greater than 1 in absolute value also indicates positive skewness of the moisture content distribution.

The conclusion of the symmetry investigation of our experimental data is that the moisture content distribution is not symmetrical, and that it is positively skewed. This supports the idea that the Normal distribution is not a good model for the moisture content, which might be better modeled by the Lognormal and/or Weibull.

The next step of the graphical analysis was to construct probability plots to visually assess the correctness of these hypothesized distributions. The Normal probability plot in Figure 4 was constructed using equations (4) and (8)-(10), by plotting the inverse standard Normal cumulative distribution against each value of MC. Visual examination of this plot shows that the moisture content data does not follow a straight line, indicating that the Normal distribution does not provide a good fit.

A Lognormal probability plot is shown in Figure 5. It is important to note that, in order to obtain a straight line, the threshold (location) value of 7.661 was subtracted from the data.
The plot suggests that the Lognormal provides a good model for the distribution of the moisture content.

The Weibull probability plot shown in Figure 6 indicates that the hypothesized Weibull distribution does not appear to be the true underlying distribution for the moisture content data.

[Figure 5. Lognormal probability plot for moisture content data]

[Figure 6. Weibull probability plot for moisture content data. Horizontal axis: ln[x(i) - location].]

The symmetry analysis and the probability plots indicated that the distribution of the moisture content appears to be better modeled by the Lognormal distribution. To substantiate these findings, the graphical analysis was followed by more formal numerical tests for goodness of fit.

4.3 GOODNESS OF FIT TESTS

Three hypothesis tests were conducted for each of the distributions under investigation. The null hypothesis for the goodness of fit tests was that the sample of 675 moisture content values comes, respectively, from a Normal, Lognormal, or Weibull distribution. Test statistics based on the empirical distribution function were calculated for each hypothesized distribution, and then compared with the critical values.

To test whether our sample came from a Normal distribution or not, the values for $D$, $W^2$, and $A^2$ were calculated using equations (20), (22) and (24). Because the parameters of the Normal distribution had to be estimated from the sample, the modified statistics for $D$, $W^2$ and $A^2$ were computed using equations (25) through (27). The critical values of these statistics were obtained from Stephens (1974), for significance levels ($\alpha$) of 0.01 and 0.05. The parameters for the Normal distribution, estimated from the sample, were: mean = 14.58 and standard deviation = 1.645. The computed and critical values for each statistic are given in Table 2.
For all three goodness of fit statistics, the observed statistic was greater than the critical value; therefore the null hypothesis was rejected, and it was concluded that the MC data does not come from a Normal distribution. This result is significant at both the 1% and 5% levels. This conclusion is consistent with the findings from the graphical analysis.

| Test statistic | Normal test: Obs. | Normal test: Crit. (α = .01) | Normal test: Crit. (α = .05) | Lognormal test: Obs. | Lognormal test: Crit. (α = .01) | Lognormal test: Crit. (α = .05) | Weibull test: Obs. | Weibull test: Crit. (α = .01) | Weibull test: Crit. (α = .05) |
|---|---|---|---|---|---|---|---|---|---|
| Kolmogorov-Smirnov D | 1.566 | 1.035 | 0.895 | 0.447 | 1.035 | 0.895 | - | - | - |
| Cramer-von Mises W² | 0.629 | 0.178 | 0.126 | 0.034 | 0.178 | 0.126 | 0.161 | 0.1603 | 0.1137 |
| Anderson-Darling A² | 3.996 | 1.092 | 0.787 | 0.248 | 1.092 | 0.787 | 1.035 | 0.9428 | 0.6892 |

Table 2. Observed statistics and critical values for ECDF statistics for hypothesis testing.

To conduct the Lognormal test, the parameters of the Lognormal distribution were estimated from the sample: threshold = 7.661, scale = 1.907, and shape = 0.236. The modified test statistics given by equations (25) through (27) were used because of the parameter estimation. Table 2 shows the computed values for $D$, $W^2$ and $A^2$ and the critical values, from Stephens (1974), for significance levels ($\alpha$) of 0.01 and 0.05. For all three goodness of fit statistics, the observed statistic was smaller than the critical value, and therefore the null hypothesis is not rejected. Taking into account the considerations detailed in the Methods section, the null hypothesis was accepted. At the $\alpha = 0.01$ and $0.05$ significance levels, all three tests support the conclusion that the three-parameter Lognormal distribution with threshold parameter $\theta = 7.661$, scale parameter $\mu = 1.907$, and shape parameter $\sigma = 0.236$ provides a good model for the distribution of moisture content. This is consistent with the conclusion from the graphical analysis.

For the Weibull goodness of fit test, the Cramer-von Mises $W^2$ and the Anderson-Darling $A^2$ test statistics were calculated using equations (22) and (24).
The critical values were found by interpolation from Lockhart and Stephens (1994), for significance levels (α) of 0.01 and 0.05. These values are given in Table 2. The parameters used for the Weibull distribution, estimated from the sample, were: threshold = 10.867, scale = 4.197, and shape = 2.401. For both the W² and A² goodness of fit statistics, the observed statistic was greater than the critical value. Therefore, the null hypothesis was rejected, and it was concluded that the MC data do not come from a Weibull distribution. This result is significant at both the 1% and 5% levels. The conclusion is consistent with the findings from the graphical analysis.

All the above hypothesis tests were replicated two more times, on two other samples of 675 measurements, to confirm the validity of the results. The results of both replications, presented in Appendix D, confirmed the conclusions of the tests performed on the first sample of 675 measurements. The numerical goodness of fit tests confirmed that the Lognormal provides a good model for the distribution of moisture content data, whereas the Normal and Weibull do not.

4.4 CONTROL CHARTS FOR LOGNORMAL VARIABLES

The charts proposed in this paper are based on the assumption that the moisture content has a Lognormal distribution. In many practical applications, however, the moisture content is assumed to have a Normal distribution. To demonstrate the difference between these two approaches, the control charts are constructed first by assuming a Normal distribution, and then by assuming the Lognormal.

The preliminary data, consisting of 20 samples of 5 measurements each, were used to estimate the parameters of the distribution. These parameters are given in Table 3.

Parameter             Symbol    Value
Average               x̄         14.264
Standard deviation    S         1.626

Table 3. Parameters of preliminary data for construction of control charts based on the Normal distribution.

If the underlying distribution were assumed to be Normal, then the customary control charts based on this assumption would have the control limits given in Table 4.

               x̄ chart NORMAL [%]    S chart NORMAL [%]
UCL            16.585                3.717
Center line    14.264                1.7298
LCL            11.943                0.261

Table 4. Control limits and center lines for control charts based on the Normal assumption.

In the construction of these charts, 3-sigma limits were employed for the "x̄ chart NORMAL", and probability limits were used for the "S chart NORMAL". The charts based on the Normality assumption are shown in Figure 7 and Figure 8. The points plotted on these charts came from 20 samples randomly selected from a subsequent kiln charge with the same species and drying conditions.

[Plot "x bar chart NORMAL": sample average against Sample Number, 1 to 20.]
Figure 7. Control chart for sample averages based on the Normality assumption.

[Plot "S chart NORMAL": sample standard deviation against Sample Number, 1 to 20.]
Figure 8. Control chart for sample standard deviations based on the Normality assumption.

The x̄ chart in Figure 7 shows no out-of-control situations. However, at least two points are observed around or above the "2-sigma" region. The "S chart NORMAL" in Figure 8 also exhibits no out-of-control situation. The analysis of the above charts suggests that the drying process performs acceptably. This conclusion is based on the assumption that the moisture content is Normally distributed.

As was found earlier, the distribution of the experimental data is better modeled by a Lognormal distribution with threshold θ = 7.661, scale μ = 1.907, and shape σ = 0.236. These results are summarized in Table 5. It is interesting to note that the estimated threshold value (7.661) is very close to the lowest equilibrium moisture content controlled by the drying schedule (7.7).
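As a check on the Normal-based limits of Table 4, the sketch below recomputes them from the Table 3 estimates. It is a minimal illustration under assumptions that this section does not state explicitly: that the process standard deviation is estimated as S/c4, that the x̄ limits are x̄ ± 3σ̂/√n, and that the S chart probability limits use the 0.001 and 0.999 percentage points of the chi-square distribution with n − 1 = 4 degrees of freedom (0.0908 and 18.467, taken from standard tables). The function names are hypothetical.

```python
import math


def c4(n):
    """Bias-correction constant c4 for the standard deviation of n Normal observations."""
    return math.sqrt(2.0 / (n - 1)) * math.gamma(n / 2.0) / math.gamma((n - 1) / 2.0)


def normal_chart_limits(x_bar, s_bar, n, chi2_low, chi2_high):
    """Return (LCL, center line, UCL) for the x-bar chart and for the S chart.

    Assumes 3-sigma limits for the x-bar chart and chi-square probability
    limits for the S chart; chi2_low and chi2_high are percentage points of
    the chi-square distribution with n - 1 degrees of freedom.
    """
    sigma_hat = s_bar / c4(n)                    # assumed estimate of the process sigma
    half_width = 3.0 * sigma_hat / math.sqrt(n)  # 3-sigma half-width for sample averages
    x_limits = (x_bar - half_width, x_bar, x_bar + half_width)
    s_limits = (
        sigma_hat * math.sqrt(chi2_low / (n - 1)),
        sigma_hat,
        sigma_hat * math.sqrt(chi2_high / (n - 1)),
    )
    return x_limits, s_limits


# Table 3 estimates; chi-square points for 4 d.f. at the assumed 0.001 / 0.999 levels
x_limits, s_limits = normal_chart_limits(14.264, 1.626, 5, 0.0908, 18.467)
print("x-bar chart:", [round(v, 3) for v in x_limits])
print("S chart:    ", [round(v, 3) for v in s_limits])
```

Under these assumptions the computed values agree with Table 4 to about three decimal places, which suggests, but does not prove, that the table was built this way.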
Parameter                                                                   Symbol    Value
Lognormal variable X
  Threshold                                                                 θ         7.661
  Scale                                                                     μ         1.907
  Shape                                                                     σ         0.236
  Number of samples                                                         m         20
  Sample size                                                               n         5
Normal variable Y
  Mean of the Normal population, defined by equation (45)                   ȳ         1.907
  Standard deviation of the Normal population, defined by equation (46)     S         0.236

Table 5. Parameters of preliminary data for construction of control charts based on the Lognormal distribution.

Using the procedures described in the 'Control Charts for Lognormal Data' section, the control limits were calculated with equations (49), (51), (56) and (57), and the results are presented in Table 6.

               ȳ chart       antilog ȳ chart    S chart       antilog S chart
               [unitless]    [%]                [unitless]    [%]
UCL            2.233         16.813             0.562         1.754
Center line    1.882         14.052             0.262         1.299
LCL            1.532         12.108             0.039         1.040

Table 6. Control limits and center lines for control charts based on the Lognormal assumption.

Using these parameters, the Lognormal charts were constructed. Figure 9 shows the ȳ chart, and Figure 10 shows the antilog ȳ chart. The S chart is shown in Figure 11, and the antilog S chart is shown in Figure 12.

As expected, the control limits for the ȳ chart are symmetric, while for the other three charts they are asymmetric. The limits of the ȳ chart, as given by (49), are symmetric about ȳ with ± "3-sigma". For the S chart the limits are not symmetric because they include percentage points of the chi-square distribution, which is not a symmetrical distribution. For the other two charts, "antilog ȳ" and "antilog S", the limits are not symmetric because they also involve exponentiation.

The points plotted on the Lognormal charts came from the same samples used for the charts based on the Normality assumption. From the second kiln charge, a first sample of 5 random moisture content measurements was selected. After the log transformation to normality, the average of the transformed sample was plotted on the ȳ chart in Figure 9. This point was also plotted on the antilog ȳ chart (Figure 10), after transforming it to the original scale of measurements. The standard deviation of the transformed sample was plotted on the S chart (Figure 11), and its corresponding value in the original scale was also plotted on the antilog S chart in Figure 12. A total of 20 successive samples were selected and plotted in this manner in the charts from Figure 9 through Figure 12.

[Plot "y bar chart": transformed sample average against Sample Number, 1 to 20.]
Figure 9. Control chart for sample averages (ȳ chart) based on the Lognormal assumption.

[Plot "antilog y bar chart": back-transformed sample average against Sample Number, 1 to 20.]
Figure 10. Antilog ȳ chart (in original scale) based on the Lognormal assumption.

[Plot "S chart": transformed sample standard deviation against Sample Number, 1 to 20.]
Figure 11. Control chart for sample standard deviations (S chart) based on the Lognormal assumption.

[Plot "antilog S chart": back-transformed sample standard deviation against Sample Number, 1 to 20.]
Figure 12. Antilog S chart (in original scale) based on the Lognormal assumption.

It can be seen that the pair of charts "ȳ" and "antilog ȳ" have practically identical appearances, and the plots look the same although the scale of measurement is different. The same is true for the other pair of charts, "S" and "antilog S". This means that the analysis of patterns and the sensitizing rules, which are valid for the ȳ chart, can also be applied to the corresponding "antilog ȳ" chart. Therefore, only one chart from each pair is sufficient for practical applications.
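The plotting procedure described above can be sketched as follows. This is an illustrative reading, not the thesis equations themselves: it assumes that the transformation is y = ln(x − θ), that the antilog ȳ value is θ + exp(ȳ), and that the antilog S value is exp(S). The function name and the sample readings are hypothetical; the threshold is the estimate quoted in the text.

```python
import math
import statistics


def lognormal_chart_points(sample, threshold):
    """Return the four values plotted for one sample of moisture contents [%].

    Assumed transformations: y = ln(x - threshold); the antilog of the
    average adds the threshold back; the antilog of S is exp(S).
    """
    y = [math.log(x - threshold) for x in sample]  # every x must exceed the threshold
    y_bar = statistics.mean(y)                     # point for the y-bar chart
    s = statistics.stdev(y)                        # point for the S chart
    return {
        "y_bar": y_bar,
        "antilog_y_bar": threshold + math.exp(y_bar),
        "s": s,
        "antilog_s": math.exp(s),
    }


# Hypothetical sample of 5 readings [%]
sample = [12.9, 13.4, 14.1, 15.2, 16.8]
points = lognormal_chart_points(sample, threshold=7.661)
for name, value in points.items():
    print(f"{name}: {value:.3f}")
```

Because exponentiation is monotonic, a point that is high on the ȳ chart is also high on the antilog ȳ chart, which is why the two charts of each pair show the same pattern. The antilog S limits in Table 6 are consistent with this back-transformation (for example, exp(0.562) ≈ 1.754).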
It is recommended that the pair of charts "antilog ȳ" and "antilog S" be used because they show the control limits and plotted data in the original scale of measurements, that is, percentages of moisture content.

The Lognormal charts in Figure 9 and Figure 10 show no out-of-control conditions. This is in accordance with the conclusions drawn previously from the "x̄ chart NORMAL". The "x̄ chart NORMAL" is similar in appearance to the ȳ and antilog ȳ charts, but not identical. The Lognormal charts in Figure 11 and Figure 12 also show no out-of-control points. This is in accordance with the results from the chart in Figure 8, which was based on the Normality assumption.

The charts based on the Lognormal assumption are therefore broadly similar to the charts based on Normality. However, when these charts are compared more thoroughly, it becomes apparent that the Normal charts tend to overestimate the plotted parameters, and in the long run are more prone to falsely signal out-of-control conditions. The two types of charts are shown again in pairs, to better visualize and compare them, in Figure 13, Figure 14, Figure 15, and Figure 16.

From the comparative analysis of the charts in Figure 13 and Figure 14 it can be seen that the Normality assumption tends to overestimate the points plotted on the "x̄ chart NORMAL". This is most evident for the points corresponding to samples 3 and 6, which are closer to the upper limit of the "x̄ chart NORMAL". It is expected that in the long run this overestimation will create many false alarms (points that indicate out-of-control conditions when in fact the process is in control). Comparing the control limits of the two charts, it can be seen that the center line of the "antilog ȳ" chart is smaller in value (14.052 vs. 14.264). As was demonstrated earlier, this is because the parameter monitored on the "antilog ȳ" chart is the geometric mean of MC less the threshold, whereas the "x̄ chart NORMAL" monitors the arithmetic mean of MC.

[Plot "x bar chart NORMAL": sample average against Sample Number, 1 to 20.]
Figure 13. Control chart for sample averages based on the Normality assumption.

[Plot "antilog y bar chart": back-transformed sample average against Sample Number, 1 to 20.]
Figure 14. Antilog ȳ chart (in original scale) based on the Lognormal assumption.

It is known that the geometric mean is always less than or equal to the arithmetic mean for a set of data. The asymmetrical control limits of the "antilog ȳ" chart allow for larger variations of MC that are due to the positive skewness and not to outlier data. As was discussed earlier, MC observations in the upper tail of the distribution are a typical occurrence, and a properly chosen control chart should treat them as such. The "x̄ chart NORMAL", on the other hand, will consider these points as "outliers" and will falsely signal them as out-of-control conditions.

[Plot "S chart NORMAL": sample standard deviation against Sample Number, 1 to 20.]
Figure 15. Control chart for sample standard deviations based on the Normality assumption.

[Plot "antilog S chart": back-transformed sample standard deviation against Sample Number, 1 to 20.]
Figure 16. Antilog S chart (in original scale) based on the Lognormal assumption.

The analysis of the charts in Figure 15 and Figure 16 also reveals that the Normality assumption in the "S chart NORMAL" tends to overestimate the dispersion of MC. This is most evident in samples 3 and 6, but it can also be observed in sample 9.
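One way to put numbers on this comparison is to express each plotted point as a fraction of the distance from the center line to the upper control limit. The sketch below does so for samples 3, 6 and 9, using the values listed in Appendix C and the limits of Table 4 and Table 6. The measure itself and the names used are illustrative choices, not part of the original analysis.

```python
def relative_position(value, center, ucl):
    """Fraction of the center-line-to-UCL distance covered by a plotted point.

    A value of 1.0 lies on the UCL; a negative value lies below the center line.
    """
    return (value - center) / (ucl - center)


# (center line, UCL) from Table 4 (Normal charts) and Table 6 (antilog charts)
LIMITS = {
    "x_bar_normal": (14.264, 16.585),
    "antilog_y_bar": (14.052, 16.813),
    "s_normal": (1.7298, 3.717),
    "antilog_s": (1.299, 1.754),
}

# Plotted values from Appendix C for samples 3, 6 and 9
SAMPLES = {
    3: {"x_bar_normal": 15.866, "antilog_y_bar": 15.563, "s_normal": 2.565, "antilog_s": 1.3517},
    6: {"x_bar_normal": 16.064, "antilog_y_bar": 15.658, "s_normal": 2.734, "antilog_s": 1.4382},
    9: {"x_bar_normal": 14.957, "antilog_y_bar": 14.796, "s_normal": 1.865, "antilog_s": 1.2527},
}


def compare(samples=SAMPLES, limits=LIMITS):
    """Return the relative position of every listed point on its own chart."""
    return {
        number: {chart: relative_position(value, *limits[chart]) for chart, value in row.items()}
        for number, row in samples.items()
    }


for number, row in compare().items():
    print(number, {chart: round(fraction, 3) for chart, fraction in row.items()})
```

For each of the three samples, the point sits proportionally closer to the upper limit on the Normal chart than on the corresponding antilog chart, in line with the visual comparison of the figures.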
The use of the "S chart NORMAL" to monitor the experimental data would lead to incorrect conclusions about the performance of the process, especially about its dispersion.

4.5 LIMITATIONS OF THE STUDY

The specific results of the study can be applied only to Douglas-Fir lumber from the interior of British Columbia, Canada, for comparable kiln types, drying conditions, and lumber sizes. However, the methodology for determining the moisture content distribution and the methods for constructing control charts can be applied to many practical situations. For each kiln and set of conditions (species, lumber type and size, drying conditions), the distribution of moisture content could be assessed with the methods presented here. If the distribution of moisture content is shown to be well modeled by the Lognormal, control limits could then be established using the calculated distribution parameters, and the proposed charts should be used to monitor the process average and dispersion.

5 CONCLUSIONS

5.1 OVERVIEW

In this section the main findings of the research are summarized by drawing conclusions and recommending areas of interest for further investigation.

5.2 CONCLUSIONS

This study presented a methodology that uses graphical and numerical methods to assess the goodness of fit of an experimental distribution against the Normal, Lognormal, and Weibull distributions. Graphical methods rely on various approximations, and therefore they were used for informal and preliminary judgments only. It was found that the threshold (location) parameter must be estimated and subtracted from the data for correct Lognormal and Weibull probability plotting. The threshold parameter of the experimental data (θ = 7.661%) was very close to the value of EMC (equilibrium moisture content) given by the drying schedule (EMC = 7.700%).
When applied to Douglas-Fir kiln-dried lumber data, the symmetry plots suggested that the empirical distribution is not symmetric and is positively skewed, and the probability plots indicated that the Lognormal might be the true distribution. These findings were confirmed by formal numerical goodness-of-fit tests, and the overall conclusion of the goodness of fit tests was that the moisture content of the experimental data is well modeled by the three-parameter Lognormal distribution.

Control charts that monitor the parameters of a Lognormally distributed variable were proposed. When these Lognormal charts were compared to the customary charts based on the Normality assumption, it was found that the latter may wrongly imply that the process is out of control, even when the skewness of the distribution is a natural result of the drying process. Control charts based on the Lognormal assumption are more appropriate when the moisture content has such a skewed distribution.

It is recommended that a practical implementation of the proposed charts begin with the determination of the underlying distribution of moisture content. Graphical methods as well as formal goodness-of-fit tests will determine whether the moisture content is better modeled by a Normal, Lognormal, or Weibull distribution. The analysis should start with the investigation of symmetry. If the empirical distribution is asymmetric, then the Lognormal or Weibull may provide a better fit, and this should first be checked with probability plots. Care is required in constructing these plots because they can easily be misused by failing to recognize that the moisture content distribution is left-bounded not by zero, but by another positive value, the threshold or location parameter. To obtain a correct probability plot, the threshold must be subtracted from the data beforehand.
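The threshold correction can be illustrated with a short sketch that builds the coordinates of a Lognormal probability plot. It is a generic construction under stated assumptions, not the procedure of this study: the plotting position (i − 0.5)/n is one common convention, and the function name is hypothetical. Each ordered observation has the threshold subtracted before the logarithm is taken, and is paired with the corresponding standard Normal quantile; points that fall near a straight line support the three-parameter Lognormal model.

```python
import math
from statistics import NormalDist


def lognormal_probability_plot_points(data, threshold):
    """Return (normal quantile, ln(x - threshold)) pairs for a probability plot.

    Assumes the plotting position p_i = (i - 0.5) / n, a common convention.
    """
    shifted = sorted(x - threshold for x in data)
    if shifted[0] <= 0:
        raise ValueError("threshold must be smaller than every observation")
    n = len(shifted)
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    points = []
    for i, value in enumerate(shifted, start=1):
        p = (i - 0.5) / n                                    # plotting position
        points.append((standard_normal.inv_cdf(p), math.log(value)))
    return points


# Hypothetical moisture contents [%] with the threshold estimate quoted in the text
readings = [11.8, 12.6, 13.1, 13.9, 14.4, 15.7, 17.9]
for quantile, log_value in lognormal_probability_plot_points(readings, 7.661):
    print(f"{quantile:7.3f}  {log_value:6.3f}")
```

Omitting the subtraction (threshold = 0) compresses the lower tail on the log scale and can make a truly Lognormal sample appear curved on the plot.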
To reach rigorous conclusions about the underlying moisture content distribution, it is recommended that graphical methods always be accompanied by formal numerical tests, which should corroborate the findings. If it is concluded that a three-parameter Lognormal is a proper fit, then the central tendency and the variability of moisture content can be monitored with the charts proposed in this paper. The central tendency would be better modeled by the geometric mean than by the usual arithmetic mean, and it should be monitored with the proposed antilog ȳ chart. The variability between different samples of moisture content can be monitored with the proposed antilog S chart.

5.3 RECOMMENDATIONS FOR FURTHER RESEARCH

Considering the topics discussed and the findings of this research, several areas are recommended for further investigation.

One possible area of research is to determine how the distribution of moisture content changes during different stages of drying. Using the goodness of fit methods presented here, the initial distribution of MC (of the "green" wood) could be determined. Furthermore, the parameters of the distribution could be assessed during successive stages of the drying process, including the final MC. This research could lead to the development of proper parameter-testing methods for each stage.

Another area of investigation is to ascertain precisely the relationship between the equilibrium moisture content controlled by the drying schedule and the threshold (location) parameter of the MC distribution, for each drying stage.

An additional research topic could be to further develop the Lognormal control charts proposed here, by determining their associated operating-characteristic and average run length curves, and process capability analysis. Further research could determine the precise effect of non-normality on the design of control charts, specifically for a variable that has a three-parameter Lognormal distribution.
6 LITERATURE CITED

Aitchison, J. and Brown, J.A.C. (1957). The Lognormal Distribution. Cambridge University Press, London.

ASTM - American Society for Testing and Materials (1997). D 4442 - 92 (Reapproved 1997): Standard Test Methods for Direct Moisture Content Measurement of Wood and Wood-Base Materials.

Bramhall, G. (1975). Meeting New Kiln Drying Standards. Canadian Forest Industries, 95(9), pp.33-35.

Bramhall, G. and Warren, W.G. (1977). Moisture Content Control in Drying Dimension Lumber. Forest Products Journal, 27(7), pp.26-28.

Bramhall, G. and Wellwood, R.W. (1976). Kiln Drying of Western Canadian Lumber. Canadian Forestry Service, Western Forest Products Laboratory, Vancouver, Canada.

Chandra, M., Singpurwalla, N.D., and Stephens, M.A. (1981). Kolmogorov Statistics for Tests of Fit for the Extreme-value and Weibull Distributions. Journal of the American Statistical Association, 76(375), pp.729-731.

Cheung, K.C. (1994). Hand-held Moisture Meters in Lumber Production. In "ASTM Hand-held Moisture Meter Workshop", Forest Products Society, USA.

Cohen, A.C. (1951). Estimating Parameters of Logarithmic-Normal Distributions by Maximum Likelihood. Journal of the American Statistical Association, 46(254), pp.206-212.

Cohen, A.C. (1965). Maximum Likelihood Estimation in the Weibull Distribution Based on Complete and Censored Samples. Technometrics, 5, pp.579-588.

Cohen, A.C., Whitten, B.J., and Ding, Yihua (1985). Modified Moment Estimation for the Three-Parameter Lognormal Distribution. Journal of Quality Technology, 17(2), pp.92-99.

Crow, E.L., and Shimizu, K. (1988). Lognormal Distributions, Theory and Applications. Marcel Dekker, New York.

Culpepper, L., and Wengert, E.M. (1980). Looking for Causes of Moisture Content Variation in Kiln Drying Southern Pine 2 by 4's. Part 1 and 2. Lumber Drying Sourcebook - 40 Years of Practical Experience. Forest Products Society, Madison, WI, pp.87-93.

D'Agostino, R.B., Stephens, M.A. (1986). Goodness-Of-Fit Techniques. Marcel Dekker, New York.

Daniel, D. and Wood, F.S. (1971). Fitting equations to data. Wiley, New York.

Dodson, B. (1994). Weibull Analysis. ASQC Quality Press, Milwaukee, Wisconsin.

Fell, J.D., and Hill, J.L. (1980). Sampling Levels for Hardwood Kiln-Drying Control. Forest Products Journal, 30(3), pp.32-36.

Ferrell, E.B. (1958). Control Charts for Log-Normal Universes. Industrial Quality Control, 15(2), pp.4-6.

Hamaker, H.C. (1978). Approximating the cumulative normal distribution and its inverse. Applied Statistics, 27, pp.76-79.

Heyde, C.C. (1963). On a Property of the Lognormal Distribution. Journal of the Royal Statistical Society, Series B (Methodological), 25(2), pp.392-393.

Hill, B.M. (1963). The Three-Parameter Lognormal Distribution and Bayesian Analysis of a Point-Source Epidemic. Journal of the American Statistical Association, 58(301), pp.72-84.

Joffe, A.D., and Sichel, H.S. (1968). A Chart for Sequentially Testing Observed Arithmetic Means from Lognormal Populations Against a Given Standard. Technometrics, 10(3), pp.605-612.

Johnson, N.L. and Kotz, S. (1970). Continuous Univariate Distributions. Vol. 1. Wiley, New York.

Johnson, N.L., Kotz, S., and Balakrishnan, N. (1994). Continuous Univariate Distributions. Vol. 1, 2nd edition, John Wiley and Sons, New York.

Lockhart, R.A., Stephens, M.A. (1994). Estimation and Tests of Fit for the Three-Parameter Weibull Distribution. Journal of the Royal Statistical Society, 56(3), pp.491-500.

Maki, R.G. (1991). An Application of Statistical Process Control Measures for Maintaining Optimal Quality from Dry Kiln Operations. M.Sc. Thesis, Oregon State University, Corvallis, OR.

Maki, R.G., and Milota, M.R. (1993). Statistical Quality Control Applied to Lumber Drying. Quality Progress, 26(12), pp.75-80.

McMahon, E.P. (1961). Moisture Control During Kiln Drying. Forest Products Journal, 11(3), pp.133-138.
Milton, J.S., Corbet, J.J., and McTeer, P.M. (1986). Introduction to Statistics. D.C. Heath and Company, Lexington, Massachusetts.

Montgomery, D.C. (1997). Introduction to Statistical Quality Control. Third edition, John Wiley and Sons.

Morrison, J. (1958). The Lognormal Distribution in Quality Control. Applied Statistics, 7(3), pp.160-172.

Nelson, W. and Thompson, V.C. (1971). Weibull probability papers. Journal of Quality Technology, 3, pp.45-50.

NIST (National Institute of Standards and Technology) (1999). American Softwood Lumber Standard DOC PS 20-99, Washington.

NLGA (National Lumber Grades Authority) (1998). Standard Grading Rules for Canadian Lumber. New Westminster, Canada.

Noghondarian, K. (1997). Quality Control with Non-Normal, Censured and Truncated Data. Ph.D. Thesis, University of British Columbia, Dept. of Mechanical Engineering.

Pratt, W.E. (1953). Some Applications of Statistical Quality Control to the Drying of Lumber. Journal of FPRS, 3(5), pp.28-31.

Pratt, W.E. (1956). Estimating the Moisture Content of Lumber During the Drying Process. Forest Products Journal, 6(9), pp.333-337.

Rasmussen, E.F. (1988). Dry Kiln Operator's Manual. Forest Products Laboratory, Hardwood Research Council, Memphis, USA.

Rice, R.W., and Shepard, R.K. (1993). Moisture Content Variation in White Pine Lumber Dried at Seven Northeastern Mills. Forest Products Journal, 43(11/12), pp.77-81.

SAS Institute Inc. (1999). SAS OnlineDoc®, Version 8, Cary, NC: SAS Institute Inc., 1999.

Simpson, W.T. (1991). Dry Kiln Operator's Manual. U.S.D.A., Forest Products Laboratory. Printed by Forest Products Society.

Stephens, M.A. (1974). EDF Statistics for Goodness of Fit and Some Comparisons. Journal of the American Statistical Association, Theory and Methods Section, 69(347), pp.730-737.

Stephens, M.A. (1976). Asymptotic Results for Goodness of Fit Statistics with Unknown Parameters. The Annals of Statistics, 4(2), pp.357-369.

Zwick, R.L., and Cook, J.A. (1985). The Modeling of Moisture Content Distributions Based on Censored Readings from a Resistance Meter. Technical paper presented at Western Dry Kiln Association Meeting.

WWPA (Western Wood Products Association) (1998). Standard Grading Rules for Western Lumber. Portland, USA.

7 APPENDIX A - KILN CHARACTERISTICS

KILN
  Double track conventional
  Steam heated / steam injected
  6 cross shaft fans
  145,000 MBF capacity

KILN CAPACITY
  160 units (loads) of 6'
  Total of 36,480 pieces of 6' in a single charge

UNITS
  2"x4" - 12 pieces wide x 19 rows high (4' high x 4' wide)
  228 pieces / unit
  %" stickers every 2'

KILN LOADING
  60' long tracks, 4 units high, 2 units wide

8 APPENDIX B - DRYING SCHEDULE

Step No.    Time [h]    Dry-bulb      Wet-bulb      EMC [%]    RH [%]
                        temp. [°C]    temp. [°C]
1           0 to 8      26.7          26.7          -          -
2           8 to 16     65.6          62.8          15.4       87
3           16 to 24    71.1          67.2          13.4       83
4           24 to 32    76.7          71.7          11.8       80
5           32 to 40    82.2          75.6          10.1       75
6           40 to 48    87.8          79.4          8.9        71
7           48 to 56    93.3          83.3          7.7        67
8           56 to 80    93.3          90.6          14         90

9 APPENDIX C - DATA PLOTTED ON CONTROL CHARTS

Sample    x̄ chart    S chart    ȳ chart       antilog ȳ    S chart       antilog S
#         Normal     Normal     [unitless]    chart [%]    [unitless]    chart [%]
          [%]        [%]
1         13.767     0.934      1.829         13.711       0.151         1.1625
2         14.570     0.754      1.953         14.536       0.110         1.1161
3         15.866     2.565      2.089         15.563       0.301         1.3517
4         15.153     1.253      2.027         15.075       0.158         1.1713
5         14.218     1.806      1.879         14.033       0.261         1.2986
6         16.064     2.734      2.101         15.658       0.363         1.4382
7         13.778     0.900      1.831         13.725       0.147         1.1586
8         14.045     1.526      1.859         13.901       0.238         1.2684
9         14.957     1.865      1.990         14.796       0.225         1.2527
10        12.825     1.161      1.655         12.715       0.235         1.2643
11        14.577     2.313      1.916         14.280       0.328         1.3875
12        14.327     1.557      1.903         14.191       0.222         1.2491
13        13.968     0.810      1.863         13.926       0.129         1.1381
14        13.626     0.988      1.805         13.564       0.159         1.1727
15        14.228     2.068      1.869         13.965       0.322         1.3795
16        15.461     0.743      2.073         15.433       0.093         1.0977
17        14.016     0.767      1.871         13.981       0.115         1.1214
18        13.609     0.905      1.802         13.548       0.164         1.1782
19        13.989     1.389      1.857         13.884       0.196         1.2166
20        13.412     1.435      1.752         13.252       0.273         1.3133

10 APPENDIX D - TWO REPLICATIONS FOR GOODNESS OF FIT TESTS OF HYPOTHESIS

Replicate #1

                         Normal Test               Lognormal Test            Weibull Test
Test Statistic           Obs.     Crit.    Crit.   Obs.     Crit.    Crit.   Obs.     Crit.    Crit.
                                  α=.01    α=.05            α=.01    α=.05            α=.01    α=.05
Kolmogorov-Smirnov D     1.5855   1.035    0.895   0.5066   1.035    0.895   -        -        -
Cramer-von Mises W²      0.5717   0.178    0.126   0.0317   0.178    0.126   0.3026   0.1603   0.1137
Anderson-Darling A²      3.5904   1.092    0.787   0.2290   1.092    0.787   2.0760   0.9428   0.6892

Replicate #2

                         Normal Test               Lognormal Test            Weibull Test
Test Statistic           Obs.     Crit.    Crit.   Obs.     Crit.    Crit.   Obs.     Crit.    Crit.
                                  α=.01    α=.05            α=.01    α=.05            α=.01    α=.05
Kolmogorov-Smirnov D     1.6127   1.035    0.895   0.5625   1.035    0.895   -        -        -
Cramer-von Mises W²      0.5988   0.178    0.126   0.0316   0.178    0.126   0.2483   0.1603   0.1137
Anderson-Darling A²      3.6778   1.092    0.787   0.2702   1.092    0.787   1.6655   0.9428   0.6892
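The decisions implied by these tables follow from a simple rule: the null hypothesis is rejected when the observed statistic exceeds the critical value. The sketch below applies that rule to the Replicate #1 values listed above. The dictionary layout and function names are illustrative only, and no statistic is recomputed from raw data.

```python
# Critical values at alpha = 0.01 and alpha = 0.05, as listed in Appendix D
CRITICAL = {
    "Normal": {"D": (1.035, 0.895), "W2": (0.178, 0.126), "A2": (1.092, 0.787)},
    "Lognormal": {"D": (1.035, 0.895), "W2": (0.178, 0.126), "A2": (1.092, 0.787)},
    "Weibull": {"W2": (0.1603, 0.1137), "A2": (0.9428, 0.6892)},
}

# Observed statistics for Replicate #1
OBSERVED_REPLICATE_1 = {
    "Normal": {"D": 1.5855, "W2": 0.5717, "A2": 3.5904},
    "Lognormal": {"D": 0.5066, "W2": 0.0317, "A2": 0.2290},
    "Weibull": {"W2": 0.3026, "A2": 2.0760},
}


def rejected(observed, critical):
    """Reject the null hypothesis when the observed statistic exceeds the critical value."""
    return observed > critical


def summarize(observed_by_distribution):
    """Map each distribution and statistic to (rejected at 0.01, rejected at 0.05)."""
    summary = {}
    for distribution, statistics_ in observed_by_distribution.items():
        summary[distribution] = {
            name: tuple(rejected(value, c) for c in CRITICAL[distribution][name])
            for name, value in statistics_.items()
        }
    return summary


for distribution, decisions in summarize(OBSERVED_REPLICATE_1).items():
    print(distribution, decisions)
```

For this replicate, every statistic rejects the Normal and Weibull hypotheses at both levels, and none rejects the Lognormal hypothesis.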
Item Metadata
Title | Quality control methods for monitoring the variability of moisture content in kiln-dried lumber |
Creator |
Ristea, Catalin |
Date Issued | 2001 |
Extent | 3555300 bytes |
Genre |
Thesis/Dissertation |
Type |
Text |
FileFormat | application/pdf |
Language | eng |
Date Available | 2009-08-06 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
DOI | 10.14288/1.0090020 |
URI | http://hdl.handle.net/2429/11801 |
Degree |
Master of Science - MSc |
Program |
Forestry |
Affiliation |
Forestry, Faculty of |
Degree Grantor | University of British Columbia |
GraduationDate | 2001-11 |
Campus |
UBCV |
Scholarly Level | Graduate |
AggregatedSourceRepository | DSpace |