QUALITY CONTROL WITH NON-NORMAL, CENSORED AND TRUNCATED DATA

by

Kazem Noghondarian

B.Sc., University of Nevada, USA
M.Sc., Arizona State University, USA

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

in

THE FACULTY OF GRADUATE STUDIES
MECHANICAL ENGINEERING

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
October, 1997
© Kazem Noghondarian, 1997

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of Mechanical Engineering
The University of British Columbia
Vancouver, Canada
Date: Oct. 10, 1997

ABSTRACT

This research presents a new approach to the computation of control charts for non-Normal data and for those quality characteristics where the exact sampling distributions of statistics for the process mean and standard deviation are not known. We use a class of power transformations due to Box and Cox (1964) to produce data that conform best to the Normal distribution. A statistical test of significance to determine the presence of an additional between-sample variation is introduced, and an appropriate control chart to control this extra variation is developed. The Likelihood Ratio (LR) statistic, which has been found useful in areas such as hypothesis testing and the estimation of confidence intervals, is used to design the control charts in the original scale of measurements that is natural for the product.
The major advantage of the LR method is its relatively rapid convergence to its chi-square asymptote. We present a specific application in the wood industry by constructing appropriate control charts for the final Moisture Content (MC) of kiln-dried lumber. Comparison with a previous study, which used the original non-Normal MC data, showed the importance of an appropriate transformation and of including the additional between-sample variation in the calculation of the control chart limits. Without these necessary steps the control chart may lose its validity and falsely signal an out-of-control situation.

Confidence intervals and control charts for the process mean and standard deviation are developed based on the LR statistic for the Weibull and Gumbel distributions. A control chart for the percentile of strength data, intended to maintain a minimum strength at a desired level, is also presented.

Probability plots to check the Normality assumption of the censored and truncated data are presented. Appropriate control charts for the sample estimates of mean and standard deviation for the non-Normal censored and truncated data are developed. A procedure is given to re-express the control charts for the censored and truncated data in the original scale of measurements.

Complex calculations were performed with the Mathcad™ computer analysis package, without the need for programming. This is a highly desirable property for the non-statistically oriented user.
TABLE OF CONTENTS

Abstract
Table of Contents
List of Tables
List of Figures
Acknowledgment

Chapter 1 Introduction
    Overview
    Problem Definition
    Research Objective

Chapter 2 Literature Review and Background Material
    Overview
    Non-Normal Distributions
    Censored and Truncated Samples
    Likelihood Ratio Statistic

Chapter 3 Control Charts for Non-Normal Distributions
    Overview
    Final Moisture Content of Kiln-dried Lumber
    The Normality Assumption
    Data Transformation
    Control Chart for Sample Means
    Control Chart for Moving Standard Deviations
    Control Chart for Sample Standard Deviations
    Control Chart in the Original Scale of Measurements
    Summary

Chapter 4 Control Charts with Censored Data
    Overview
    The Normality Assumption
    Estimation of the Process Parameters
    Confidence Intervals for the Process Parameters
    Control Chart for the Sample ML Estimates of Mean
    Control Chart for the Sample ML Estimates of Standard Deviations
    Control Chart in the Original Scale of Measurements
    Summary

Chapter 5 Control Charts with Truncated Data
    Overview
    The Normality Assumption
    Confidence Intervals for the Process Parameters
    Control Chart for the Sample ML Estimates of Mean
    Control Chart for the Sample ML Estimates of Standard Deviations
    Control Chart in the Original Scale of Measurements
    Summary

Chapter 6 Control Charts with Material Strength Data
    Overview
    Estimation of the Process Parameters
    Weibull Probability Plot
    Data Transformation
    Confidence Intervals for the Process Parameters
    Control Chart for Sample ML Estimates of Scale Parameter
    Control Chart for the Sample ML Estimates of Location Parameter
    Control Chart in the Original Scale of Measurements
    Control Charts for the Percentiles of Strength Distribution
    Summary

Chapter 7 Conclusions and Recommendations
    Overview
    Conclusions
    Recommendations for Further Research

Bibliography

LIST OF TABLES

Table 3-1 Percentage moisture content of kiln dried 4/4 redwood uppers
Table 3-2 Sample means and moving standard deviations
Table 3-3 Sample means in the original scale of measurements
Table 4-1 Measures of Pollutants in Air Quality Samples (in PPM)
Table 4-2 Control limits for the sample ML estimates of means
Table 4-3 Control limits for the sample ML estimates of standard deviations
Table 4-4 Control limits for the sample MLE of mean in the original scale
Table 5-1 Data from a truncated non-Normal distribution
Table 6-1 Rupture strength of a new engineering material

LIST OF FIGURES

Figure 3-1 Probability plot for the data in Table 3-1
Figure 3-2 Probability plot for the log transformed data
Figure 3-3 Probability plot for the transformed data
Figure 3-4 The sample means chart for data in Table 3-1
Figure 3-5 Moving additional between-sample standard deviation chart
Figure 3-6 Standard deviation chart
Figure 3-7 LR curve and the line of chi-square
Figure 3-8 Control chart for sample means in the original scale of measurements
Figure 4-1 Probability plot for the data in Table 4-1
Figure 4-2 Normal probability plot for the transformed data
Figure 4-3 Likelihood ratio curve as a function of the process mean
Figure 4-4 Likelihood ratio curve as a function of the process standard deviation
Figure 4-5 Control chart for the sample ML estimates of mean
Figure 4-6 Control chart for the sample ML estimates of standard deviation
Figure 4-7 Likelihood ratio curve as a function of mean in the original scale
Figure 4-8 Control chart for the ML estimates of mean in the original scale
Figure 5-1 The probability plot of the original truncated data
Figure 5-2 Probability plot for the transformed data
Figure 5-3 Likelihood ratio curve as a function of the process mean
Figure 5-4 Likelihood ratio curve as a function of the process standard deviation
Figure 5-5 Control chart for the sample ML estimates of mean
Figure 5-6 Control chart for the sample ML estimates of standard deviation
Figure 5-7 Likelihood ratio curve as a function of the mean in the original scale
Figure 5-8 Control chart for the sample estimates of mean in the original scale
Figure 6-1 Plot of Weibull pdf
Figure 6-2 Probability plot for data in Table 6-1
Figure 6-3 Plot of Gumbel probability distribution function
Figure 6-4 LR curve as a function of the location parameter $\mu$
Figure 6-5 Sample ML estimates of standard deviations chart
Figure 6-6 LR curve for the process mean in the original scale
Figure 6-7 Control chart for the sample means in the original scale
Figure 6-8 Control chart for the sample tenth percentiles

Acknowledgment

I would like to offer my sincere thanks to my advisor, Dr. F. Sassani, for his constant supervision and guidance throughout this research. Special thanks to my supervisory committee, Dr. B. Dunwoody and Dr. T. Maness, for many valuable suggestions and comments. I wish to express my gratitude to Dr. Bury, who has made a substantial contribution to both my understanding and appreciation of statistics. I would also like to thank the Ministry of Higher Education, Islamic Republic of Iran, for their invaluable scholarship during my Ph.D. program. Last but not least, I would like to thank my wife Fatemeh Ashraf Julaei and my children Iman, Ehsun and Maryam for their constant support and encouragement.

DEDICATION

Dedicated to the memory of the late Ayatollah Imam Khomaini, the great leader of the Islamic revolution and the founder of the Islamic Republic of Iran. A great man who devoted his entire blissful life to chastity, to self-purification, nearness to God, acquisition of knowledge and struggle for liberation of the oppressed masses from the shackles of the arrogant. A great leader, who led the Iranian nation to end the tyrannical, corrupt, and deviated monarchical regime and replaced it with an Islamic system.
A great scholar, who presented the doctrine of wilayat-e fagih (The Guardianship of Jurisconsult) as the best form of government and set up a political system based on Divine foundation with Divine injunction. May Almighty God please his exalted soul and give us strength to follow his path.

K. Noghondarian
Oct. 10, 1997

CHAPTER 1
INTRODUCTION

Overview

In today's competitive global marketplace, the achievement of high-quality products has become the only way for manufacturers to survive and prosper. As a result, manufacturers must have an operational control system to assure the quality of their products. A manufacturer who only inspects the final products, without being concerned about the intermediate stages of the manufacturing process, is by no means enhancing the quality of his products. The final inspection may give an estimate of the performance of the process, but it does not prevent nonconforming products from reaching the customer. Studies have shown that only about 80% of nonconforming products are detected during any 100% inspection (see Ryan, 1989, p. 9). Therefore, to ensure quality, emphasis should be placed on the intermediate stages of the manufacturing process and on the use of a powerful tool known as Statistical Process Control (SPC).

SPC, which uses the concepts of statistics to control the manufacturing process, was introduced decades ago. Unfortunately, it did not receive wide attention in North America and Europe until recent years. The essential idea of SPC is to improve the quality of products continuously through constant application of statistical methods to process control. A major goal of SPC is to detect the occurrence of any assignable causes of disturbances in the process as soon as possible, so that the process can be investigated and corrective actions taken before many nonconforming products reach the final stage of the process.
In an ideal manufacturing world, any production would turn out perfect products. No quality control would then be necessary, because every item that comes off the production line would conform 100% with specification. Unfortunately, in the real world of manufacturing, many factors combine and interact to make each unit unique. Temperature, humidity, materials used, and machine settings all vary and affect the product. The parts of a machine are not fixed entities; they wear out, change dimensions, and lose their adjustment. The people who run the machines also differ in their behavior. A single operator forgets things over time and may fail to communicate with others, and when many operators are involved, opportunities to vary from the standard procedure are multiplied. Therefore, uniformity and stability over time is not a natural characteristic of a manufacturing process, but rather something that we must work hard to achieve.

The Shewhart control chart (Montgomery, 1991) is a valuable tool in this regard. Its importance lies in the simple fact that successive observations are plotted in time order, so that time patterns of normal and abnormal behavior of the process can be clearly seen and analyzed by a viewer who is familiar with the process. To make it easier to assess what is and what is not normal process behavior, control limits are often placed at $\pm 3\sigma$ about the process mean $\mu$. Since many processes tend to remain stable over short periods of time, a measure of this short-term standard deviation, $\sigma$, is usually used to judge normal behavior. This standard deviation is determined from short lengths, called rational subgroups, of normal operation of the process. Control limits are determined by allowing a process to run untouched and then analyzing the results using a set of mathematical formulas.

There are two kinds of variation.
The first is that which results from many small causes: minor variations in the worker's ability, the clarity of procedures, the capability of the machinery and equipment, and so forth. These are "common causes" and can often only be changed by management. The other form of variation is usually easier to eliminate. When a machine malfunctions, or an untrained worker is put on the job, or a defective material arrives from a vendor, corrective actions can be taken rather easily. Deming (1986) calls these "special causes." They show up on control charts as points outside the limits. The formula for the control limits is designed to provide an economic balance between searching too often for special causes where there are none (i.e. a false alarm, or Type I error) and not searching when a special cause may be present (i.e. a missed alarm, or Type II error).

A system can best be improved when special causes have been eliminated and it has been brought into statistical control. At that point, management can work effectively on the system, looking for ways to reduce variation. Once a system is in control, control charts can be used for monitoring so as to detect immediately when something goes wrong. Line operators can record the data and take action, shutting down the line if necessary. A point need not be outside the limits to call for action. Abrupt shifts or distinct trends within the limits are also signals for investigation.

Generally speaking, there are two kinds of control charts: charts for variable data and charts for attribute data. Variable data control charts are useful when the parameter of interest can be conveniently measured numerically, for example, the diameter of a cylindrical part. Attribute data control charts are useful when the parameter of interest cannot be conveniently measured numerically, for example, in the inspection of the finished surface of a cylindrical part.
$\bar{X}$ and R charts belong to the category of variable data charts, and p and c charts belong to the category of attribute data charts.

Problem definition

Conventional control charts for variables are based on the assumption that the quality characteristic of interest is approximately Normally distributed. If the process shows evidence of a significant departure from Normality, however, then the calculated control limits may be entirely inappropriate. Control chart techniques intended for Normally distributed processes can lead to serious errors when applied to data obtained from other processes. In such cases, it will usually be best to determine the control limits for the individual control chart based on the probabilities of the correct underlying distribution. These probabilities could be obtained from a probability distribution fitted to the data. Another approach would be to transform the original variable to a new variable that is approximately Normally distributed, and then apply control charts to the new variable. However, working with the transformed data presents a significant problem in quality control work: the units of the transformed data are not comprehensible to the production workers who should use control charts to monitor their work. Therefore, a procedure is needed to re-express the control chart in the original scale of measurements and to specify its control limits in the units that are natural for the product.

A control chart is a graphical display of a quality characteristic that has been obtained from a sample, plotted against the sample number or time. Usually, quantities computed from the observations in the sample, such as the sample mean and sample standard deviation, are used to monitor the process mean and process standard deviation respectively.
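As a rough illustration of the conventional chart for sample means described above, the following Python sketch places limits at three standard errors about the grand mean. It is only a sketch under stated assumptions: the function name `xbar_chart_limits` is hypothetical, the within-subgroup standard deviation is estimated by pooling the subgroup variances (one common choice, not necessarily the one used in this thesis), and the four subgroups are the first four rows of the moisture-content data tabulated later in Table 3-1.

```python
import math
from statistics import fmean, variance

def xbar_chart_limits(subgroups, k=3.0):
    """Conventional limits for a sample-means chart.

    Assumes equal subgroup sizes and a Normal quality characteristic;
    sigma is estimated by pooling the within-subgroup variances
    (an assumption made for this sketch).
    """
    n = len(subgroups[0])
    means = [fmean(g) for g in subgroups]
    center = fmean(means)                      # grand mean
    pooled_var = fmean([variance(g) for g in subgroups])
    half_width = k * math.sqrt(pooled_var) / math.sqrt(n)
    return center - half_width, center, center + half_width, means

# First four subgroups of Table 3-1 (percent moisture content)
subgroups = [
    [8.0, 7.5, 9.2, 8.2, 7.0],
    [8.5, 8.4, 9.5, 8.9, 7.8],
    [7.6, 7.6, 8.7, 8.3, 8.3],
    [7.3, 8.2, 8.8, 8.9, 8.1],
]
lcl, center, ucl, means = xbar_chart_limits(subgroups)
print(f"LCL={lcl:.3f}  center={center:.3f}  UCL={ucl:.3f}")
```

Limits of this kind rest on the Normality assumption, which is why they can mislead when the underlying distribution is skewed.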
Sample means obtained from Normal and Exponential distributions have known Normal and Gamma probability distributions respectively, and as a result, control charts for sample means can easily be developed to monitor the process parameters. Sample means obtained from other distributions do not have known exact probability distributions. Therefore, a method is needed to develop control charts to monitor the process parameters in such cases.

Research objective

The Likelihood Ratio (LR), a statistic which has been found useful in areas such as hypothesis testing and the estimation of confidence intervals, can also be used to establish rather accurate control chart limits for quality characteristics where the exact sampling distributions of statistics for the process parameters are not known. The major advantage of the LR method is its relatively rapid convergence to its chi-square asymptote. Its disadvantage is that it requires more complex computations.

The objective of this research is to present a new approach to the computation of control charts for non-Normal data and for those quality characteristics whose exact sampling distributions of statistics for the process mean and standard deviation are not known. This research uses a class of power transformations to produce data that conform best to the Normal distribution, and presents a method that enables the quality control engineer to design the control charts in the original units of measurement that are more natural for the product. A specific application in the wood industry, the construction of appropriate control charts for the moisture content data from kiln-dried wood, will also be presented. LR-based control charts for censored and truncated samples obtained from non-Normal distributions will also be dealt with. In both cases, the probability distributions of the sample parameters are not known and the conventional methods for developing control charts cannot be used.
The computations in this research are performed using the Mathcad (1996) computer analysis package, which is one of the most popular computational tools available today and allows complex calculations to be carried out without programming. This is a highly desirable property for non-statistically oriented users.

The chapters are organized as follows. Chapter 2 reviews some of the early works on the analysis of non-Normal distributions and of censored and truncated samples. Chapter 3 begins with a brief review of some of the basic concepts of testing the Normality assumption and of transformation, then presents control charts for the transformed data and a control chart for sample means based on the original units of measurement. Chapter 4 discusses control charts for sample maximum likelihood estimates of means and standard deviations for censored non-Normal data. Chapter 5 presents control charts based on the LR statistic for quality characteristics with truncated distributions. Chapter 6 discusses control charts for sample means and percentiles of material strength test data with a Weibull distribution. Chapter 7 summarizes the conclusions and research contributions and gives some directions for further research.

CHAPTER 2
LITERATURE REVIEW AND BACKGROUND MATERIAL

Overview

This chapter presents some of the early works on the analysis of non-Normal distributions, censored and truncated samples, and the Likelihood Ratio method.

Non-Normal distributions

The quality characteristics of products sometimes do not follow the Normal distribution, and assuming Normality will result in inappropriate control limits.
Several authors have studied the effects of non-Normality on the conventional $\bar{X}$ and R charts for different distributions, such as the Burr family of distributions discussed by Burr (1967); Gamma, uniform and Normal mixture distributions by Schilling and Nelson (1976); Normal mixture and double exponential distributions by Balakrishnan and Kocherlakota (1986); and Tukey's $\lambda$-family of symmetric distributions by Chan, Hapuarachchi and Macpherson (1988). Robust versions of $\bar{X}$ and R charts have been discussed by Langenberg and Iglewicz (1986) and Rocke (1989).

An alternative approach to those surveyed above is to use an appropriate transformation so that the transformed data are approximately Normally distributed, and then to use 3-sigma limits for the transformed variable. Various normalizing transformations have been suggested in the literature. Some of the better known are general transformations like the power transformations studied by Box and Cox (1964), or transformations that are particularly structured for given distributions, like the arc-sine transformation for binomially distributed data or certain root transformations for Poisson variates¹. Schneider, Hui, and Pruett (1992) used Box and Cox transformations in their study of control charts for environmental data.

Very few papers dealing with non-Normal distributions in quality control have appeared in the literature. Hosono, Ohta, and Kase (1981) developed single sampling plans for Weibull distributions. Hosono (1984) also introduced cumulative sum charts for the mean of extreme-value distributions, with the assumption that the shape parameter of the corresponding Weibull distribution was known and with the scale parameter as the specified target value of the process. Variable sampling plans based on the Weibull distribution with complete or Type I and II censoring were developed by Fertig and Mann (1980) and Nelson (1982).
Schneider (1989) also developed failure-censored variables sampling plans for Lognormal and Weibull distributions. Shewhart-type "percentile charts" for Weibull and Lognormal distributions based on Monte Carlo simulation were proposed by Padgett and Spurrier (1990).

¹ See Ryan (1989) for a discussion of some of these transformations.

Censored and truncated samples

Censoring is a well-known term in the statistical literature and refers to the situation in which the values of some of the observations of the sample are unknown. Consider, for instance, a fatigue test in which the experimenter decides to withdraw some of the test specimens prior to their failure. In this case some of the lifetimes are not known, i.e. they are censored. Truncated samples, on the other hand, are those from which certain population values are entirely excluded.

The estimation of the parameters of a censored sample taken from a Normal population, and the related statistical intervals, have previously been considered by many researchers using different methods. In addition to the method of least squares and the method of maximum likelihood, many other methods have been used to obtain simple but highly efficient estimators (for a survey of some of these methods see, for example, Schneider, 1986, pp. 3 and 57). Herd (1960) used the term "multicensored" samples and suggested a probability plotting of multiply censored data much the same as the Kaplan-Meier (1958) method. Mann (1971) introduced best linear invariant estimators (BLIEs) for Weibull parameters under progressive censoring. Cohen (1991) and Nelson (1982), in their respective books, presented the asymptotic (i.e. large-sample) theory for maximum likelihood estimators and confidence limits. Hahn and Meeker (1991) demonstrated the practical relevance and construction of confidence intervals and discussed several practical issues in the analysis of progressively censored data.
Viveros and Balakrishnan (1994) used a conditional method of inference to derive exact confidence intervals for several life characteristics, such as location, scale, quantiles, and reliability, when the data are Type II progressively censored.

Likelihood ratio statistic

The likelihood ratio is familiar in other branches of statistical analysis, but its use in quality control does not appear to have been exploited. The likelihood ratio statistic has been used by many researchers to test hypotheses (Brownless 1960; Kendall 1961; and Keeping 1962) and to construct confidence intervals (Mohan 1979, Owen 1988, Chang 1989, Baxter 1993, Qin 1993 & 1994 and Murphy 1995). Chang (1989) presented a methodology based on the likelihood-ratio test for the construction of confidence intervals for a Normal mean following a group sequential test. Owen (1988) introduced the empirical likelihood ratio method in nonparametric models. Qin (1994) used the likelihood ratio to construct confidence intervals in a semiparametric problem, in which one model is parametric and the other is nonparametric. Murphy (1995) considered binomial and Poisson extensions of the likelihood, in an attempt to find meaningful likelihood ratio hypothesis tests and subsequent confidence intervals in a semiparametric setting. That work defined confidence intervals for the survival function and the cumulative hazard function for failure time data.

Basic concepts

The likelihood ratio test is based on the ratio of the likelihood function for a sample of observations computed under the null hypothesis to the same likelihood function computed under the alternative hypothesis. This ratio cannot be greater than 1, and it is positive since it is the ratio of products of probability functions, which must always be positive (Brownless, 1960, p. 88). A small value of the likelihood ratio indicates that the observed sample is relatively unlikely under the null hypothesis, and so we should reject the null hypothesis.
Conversely, a value of the ratio close to 1 indicates that the null hypothesis is very plausible and should be accepted. Consider the following ratio of likelihoods for a sample of observations:

$$LR(\theta) = \frac{L(\theta)}{L(\hat{\theta})} \qquad (2\text{-}1)$$

The method of Likelihood Ratio (LR) is based on the fact that, under very general conditions as described by Kokoska and Nevison (1994), for large $n$, $-2\ln LR(\theta)$ has approximately a chi-square distribution with degrees of freedom equal to the number of unknown parameters in the likelihood function $L(\theta)$. According to Keeping (1962), when the parent population is Normal, the chi-square distribution of $-2\ln LR(\theta)$ holds exactly, even for sample size $n = 2$. $L(\hat{\theta})$ in (2-1) is the likelihood function with the unknown parameter replaced by its maximum likelihood estimator. See Kendall (1961) for the basic properties of the Likelihood Ratio statistic.

Lawless (1982) has shown that the distribution of the likelihood ratio statistic approaches its limiting chi-square distribution considerably more rapidly than the distribution of the maximum likelihood estimator approaches its limiting Normal distribution. Therefore, using the LR statistic we can find more accurate confidence limits for small to moderately sized samples.

Considering the likelihood function of an obtained sample and the maximum likelihood estimators $\hat{\mu}$ and $\hat{\sigma}$ of the parameters $\mu$ and $\sigma$, the endpoints of a $(1-\alpha)$-level confidence interval on the process parameter $\mu$ are obtained from the condition

$$-2\ln\frac{L(\mu, \hat{\sigma}(\mu))}{L(\hat{\mu}, \hat{\sigma})} = \chi^2_{1,\,1-\alpha} \qquad (2\text{-}2)$$

where $\chi^2_{1,\,1-\alpha}$ is the $(1-\alpha)$ percentile of the chi-squared distribution with $\nu = 1$ degree of freedom. $\hat{\sigma}(\mu)$ is the process standard deviation as a function of the process mean, and is defined by solving the maximum likelihood equation for $\sigma$ in terms of $\mu$. The upper and lower $(1-\alpha)$ percentile confidence limits for the parameter $\mu$ are then obtained from (2-2). A similar procedure can be used to find a $(1-\alpha)$-level confidence interval on the scale parameter $\sigma$.
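To make the condition for the mean concrete, the following Python sketch solves it numerically for a complete Normal sample. It is an illustration under stated assumptions rather than the procedure used in this thesis (which relies on Mathcad): the function names are hypothetical, a plain bisection replaces Mathcad's root finder, and the chi-square critical value with one degree of freedom is computed as the square of a standard Normal quantile. For a Normal sample the restricted ML variance is $\hat{\sigma}^2(\mu) = \frac{1}{n}\sum (x_i - \mu)^2$, so that $-2\ln LR(\mu) = n \ln\left(\hat{\sigma}^2(\mu)/\hat{\sigma}^2\right)$.

```python
import math
from statistics import NormalDist, fmean

def neg2_log_lr(mu, x):
    """-2 ln LR(mu) for a complete Normal sample (profile likelihood in mu)."""
    n = len(x)
    xbar = fmean(x)
    var_hat = sum((v - xbar) ** 2 for v in x) / n   # unrestricted ML variance
    var_mu = sum((v - mu) ** 2 for v in x) / n      # ML variance with mean fixed at mu
    return n * math.log(var_mu / var_hat)

def bisect(f, lo, hi, tol=1e-10):
    """Root of f on [lo, hi], assuming f changes sign on the interval."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def lr_interval_mean(x, alpha=0.05):
    """Endpoints where -2 ln LR(mu) equals the chi-square(1) critical value.

    Assumes the sample is not constant (otherwise the ML variance is zero).
    """
    xbar = fmean(x)
    # chi-square quantile with 1 d.f. = squared standard Normal quantile
    crit = NormalDist().inv_cdf(1 - alpha / 2) ** 2
    g = lambda mu: neg2_log_lr(mu, x) - crit
    step = max(x) - min(x)
    while g(xbar - step) < 0:          # widen until the bracket contains a root
        step *= 2
    return bisect(g, xbar - step, xbar), bisect(g, xbar, xbar + step)

sample = [8.0, 7.5, 9.2, 8.2, 7.0]     # first subgroup of Table 3-1
print(lr_interval_mean(sample))
```

In this Normal case the endpoints also have the closed form $\bar{x} \pm \hat{\sigma}\sqrt{e^{\chi^2/n} - 1}$, which gives a convenient check on the numerical solution; for censored, truncated, or Weibull data no such closed form is available and the root search is the practical route.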
Confidence interval endpoints on the parameter $\sigma$ are obtained from the condition

$$-2\ln\frac{L(\hat{\mu}(\sigma), \sigma)}{L(\hat{\mu}, \hat{\sigma})} = \chi^2_{1,\,1-\alpha} \qquad (2\text{-}3)$$

where $\hat{\mu}(\sigma)$ is the process mean as a function of the process standard deviation, and is defined by solving the maximum likelihood equation for $\mu$ in terms of $\sigma$. The endpoints of the $(1-\alpha)$-percentile confidence limits for the parameter $\sigma$ are then obtained from (2-3).

CHAPTER 3
CONTROL CHARTS FOR NON-NORMAL DISTRIBUTIONS

Overview

Conventional Shewhart control charts for variables are based on the assumption that the underlying distribution of the obtained data is Normal. However, there are numerous industrial situations where such an assumption is invalid and could lead to inaccurate quality control judgments. In this chapter, a specific quality characteristic with a non-Normal distribution will be investigated. The moisture content (MC) of kiln-dried lumber is a good example, as addressed in the following section.

Final moisture content of kiln-dried lumber

Variable control charts may be used as effective tools in the control of the final moisture content of kiln-dried lumber. The removal of moisture from lumber is usually accomplished by exposing the lumber to outdoor atmospheric conditions or to the higher temperatures of a dry kiln. There are two main reasons for kiln drying lumber:

1. To reduce its moisture content more rapidly than can be accomplished in air drying. While air drying usually requires several months or a season, kiln drying can be done in a few days. Rapid drying results in a more flexible operation and reduces capital tied up in yards. It also reduces insurance costs and taxes.

2. To reduce the moisture content of lumber below that attainable in air drying. Where wood is to be used under the driest conditions, lumber must be dried to a low moisture content, so that little or no shrinkage will take place.

Improperly dried lumber would cause problems for both the manufacturer and the user.
If lumber is under-dried, the end result can be warpage, shrinkage, cracks, splits, and related problems when the wood dries while in service. On the other hand, over-dried wood, which eventually equalizes to a higher moisture content, wastes energy (Rice and Shepard, 1993). Lumber must be dried to a prespecified, uniform moisture content to maintain dimensional stability in service and to improve machining operations. Drying lumber to a uniform moisture content will significantly increase its durability and retain its usefulness and value. However, lumber drying defects often occur whenever lumber is over-dried or under-dried. Application of control charts to the final moisture content of kiln-dried lumber can help prevent both over-dried and wet boards and maintain a uniform moisture content. In the following, we demonstrate the construction of the appropriate control charts using an illustrative example.

Illustrative example

The data set in Table 3-1 consists of twenty samples, each containing five measurements of the final MC of kiln-dried redwood lumber, expressed as a percentage of the oven-dry weight. For the purpose of drying lumber, MC is the mass of water contained in a sample of wood expressed as a percentage of the mass of the dry wood of the sample, that is, of the sample after all water has been removed (small wood samples must be dried in a small sample drying oven until they show no further loss of weight; they are then considered dry). For example, if the wet weight of a sample piece is 25 g and its dry weight is 20 g, the MC would be

$$\frac{25 - 20}{20} \times 100 = 25\%.$$

Sample boards were obtained randomly from each kiln charge. Samples were cut about 1 ft from the end of the board, as the very end may be drier than the remainder of the sample board. The location in the charge from which each sample was obtained and the identity of the kiln charges were recorded.
Oven sections were cut from the sample boards and weighed as soon as possible after the samples had been chosen, so as to minimize errors. The MC of each sample, expressed as a percentage of its oven-dry weight, is calculated to the nearest tenth of a percent. Enough samples to make two subgroups per kiln charge were obtained. The sample number, the kiln and charge number, and the MC of each sample were recorded.

Table 3-1. Percentage moisture content of kiln dried 4/4 redwood uppers.

Sample number   Kiln and charge number   Sample values
 1              1-1                       8.0    7.5    9.2    8.2    7.0
 2              1-1                       8.5    8.4    9.5    8.9    7.8
 3              2-1                       7.6    7.6    8.7    8.3    8.3
 4              2-1                       7.3    8.2    8.8    8.9    8.1
 5              1-2                       8.1    7.7    8.1    8.4    8.1
 6              1-2                       7.8    8.6    9.4    9.1    8.9
 7              2-2                       8.4   10.0    9.5   12.1   12.0
 8              2-2                       9.3    9.7   10.2   10.8   10.0
 9              1-3                       8.9    8.8   10.4    8.9    8.0
10              1-3                       8.5   10.6    9.0    8.8    9.9
11              2-3                       7.7    9.2   10.3    9.0   10.8
12              2-3                       8.5    9.7    7.2    8.6    8.3
13              1-4                       8.9    9.0    8.0    8.0    9.4
14              1-4                       8.2    8.3    7.7    8.7    8.1
15              2-4                       8.4   10.8    9.9   13.1    8.5
16              2-4                      11.1   12.2    8.8   10.1    9.8
17              1-5                       9.0   10.1    8.5    9.1    8.6
18              1-5                       7.6    8.8    8.5    8.3    8.4
19              2-5                       7.0    9.8    8.1    9.4    8.6
20              2-5                       7.8    7.7    9.2    8.7    8.2

These data were first reported by Pratt (1953), who used the conventional Shewhart control charts for variables without applying any transformation or checking the significance of the additional between-sample variation. The data are used here for the purpose of comparison and to show the value of a more thorough analysis of the final MC data.

The Normality assumption

The quality characteristic of interest sometimes does not follow the Normal distribution, and assuming Normality will then result in inappropriate control limits. It is therefore imperative to test this assumption before constructing the control charts. We can check the Normality assumption graphically with a probability plot, which indicates how the observations deviate from what would be expected if they were Normally distributed.
The Normal probability plot is the cumulative frequency distribution (cfd) constructed by ranking the observed sample values from smallest to largest, assigning each value a rank $i$, and calculating its plotting position on the probability scale. There are many choices of plotting position for the sample cfd. Blom (1958), Mandel (1964), and Press et al. (1986) showed that the plotting position

$$p_{(i)} = \frac{i - 0.375}{N + 0.25} \tag{3-1}$$

closely approximates the order statistics of the Normal distribution. Here $p_{(i)}$ is the proportion of samples that are less than or equal to $x_{(i)}$, with a slight "continuity" correction. The Normal cfd as a function of the standard Normal deviate can be obtained by linear interpolation between any two sets of extreme values of the standard Normal deviate and its corresponding cfd, $(z, \mathrm{cfd})$. For example, interpolation between the two points $(-2.5, 0.006)$ and $(2.5, 0.994)$ results in the following equation:

$$y_i = 0.5 + 0.1975\, z_i \tag{3-2}$$

Using $y_i$ for the ordinate of the model plot makes it appear as a straight line. The probabilities $p_{(i)}$ are converted to standard Normal order scores $z_i$, and the ordinate of the data plot would also produce a straight line if the data came from a Normal distribution. Using Mathcad notation and functions, the ordinates of the data and model plots are:

    N := 100    i := 1..N
    x_i := READ(data)                                              (3-3)
    x := sort(x)
    x̄ := (1/N)·Σ_i x_i
    s := sqrt[ (1/(N − 1))·Σ_i (x_i − x̄)² ]
    p_i := (i − 0.375)/(N + 0.25)
    Φ(t) := cnorm(t)
    Guess  t := 0
    z_i := root(Φ(t) − p_i, t)
    data_i := 0.5 + 0.1975·z_i
    model_i := 0.5 + 0.1975·(x_i − x̄)/s

The initial guess value assigned to t in (3-3) is a starting value that Mathcad uses in its search for a solution. The root function returns the value of the variable that solves the equation. The probability plot is shown in Figure 3-1. The data plot looks non-linear, and the Normality assumption is therefore rejected.

[Plot of the data and model ordinates against x.]

Figure 3-1. Probability plot for the data in Table 3-1.
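The plotting-position calculation in (3-1) to (3-3) can be sketched in Python as follows. This is a minimal illustration rather than the worksheet used in the thesis: the function name `probability_plot_ordinates` is hypothetical, `statistics.NormalDist().inv_cdf` stands in for Mathcad's `root(cnorm(t) − p, t)`, and the model ordinate is assumed to be the straight line (3-2) evaluated at the standardized observation.

```python
from statistics import NormalDist, mean, stdev


def probability_plot_ordinates(values):
    """Return (sorted x, data ordinates, model ordinates) for a Normal probability plot."""
    x = sorted(values)
    n = len(x)
    xbar, s = mean(x), stdev(x)          # sample mean and standard deviation
    std_normal = NormalDist()            # standard Normal, mu = 0, sigma = 1
    data, model = [], []
    for i, xi in enumerate(x, start=1):
        p = (i - 0.375) / (n + 0.25)     # plotting position, equation (3-1)
        z = std_normal.inv_cdf(p)        # standard Normal order score
        data.append(0.5 + 0.1975 * z)    # data ordinate, line (3-2) at the order score
        # Assumption: the model line is (3-2) evaluated at the standardized value.
        model.append(0.5 + 0.1975 * (xi - xbar) / s)
    return x, data, model


# First two samples of Table 3-1, used only to exercise the function.
sample = [8.0, 7.5, 9.2, 8.2, 7.0, 8.5, 8.4, 9.5, 8.9, 7.8]
x_sorted, data_ord, model_ord = probability_plot_ordinates(sample)
```

Plotting `data_ord` and `model_ord` against `x_sorted` gives the two curves of the probability plot; for Normal data the two sets of points should lie close together.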
Data transformation

When the Normality assumption is rejected, the usual procedure for computing the sample mean will not provide a good estimate of the process mean. Fortunately, we may transform the data so that they appear to be Normally distributed. Traditionally, a logarithmic transformation has been applied to MC data, leading to an estimation procedure based on the lognormal distribution; see Maki (1991). However, the probability plot of the log-transformed MC data of Table 3-1, shown in Figure 3-2, indicates that this transformation does not fit well either.

[Plot of the data and model ordinates against the log-transformed values.]

Figure 3-2. Probability plot for the log transformed data.

Here, we use a general class of power transformations due to Box and Cox (1964) to produce data that conform best to the Normal distribution. Then we present a procedure to re-express the control charts in the original scale of measurements.

Box and Cox transformation

An important contribution in the field of transformation was made by Box and Cox (1964). They considered the following parametric family of transformations:

$$y = \begin{cases} \dfrac{x^{\lambda} - 1}{\lambda}, & \lambda \neq 0, \\[2mm] \log x, & \lambda = 0, \end{cases} \qquad \text{for } x > 0 \tag{3-4}$$

Here, the data values $x$ are raised to some power $\lambda$, then 1 is subtracted from the result, which is finally divided by $\lambda$. These steps are fully explained in Box and Cox (1964), but two reasons for them are: (a) for negative values of $\lambda$ the transformed data $y$ do not reverse their order; and (b) the value $\lambda = 0$ then corresponds to a logarithmic transformation. The logarithm is, therefore, a power transformation that fits naturally into the transformation sequence. We assume that for some unknown $\lambda$ the transformed observations $(x_i^{\lambda} - 1)/\lambda$ satisfy the full Normal theory assumptions, i.e. they are independently Normally distributed with constant variance.
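The family (3-4) and its inverse can be written in a few lines of Python. This is a small sketch for illustration only; the function names `box_cox` and `inverse_box_cox` are hypothetical, and the value $\lambda = -2.168$ in the usage lines is the estimate reported for the MC data.

```python
import math


def box_cox(x, lam):
    """Box and Cox power transformation (3-4) of a positive value x."""
    if x <= 0:
        raise ValueError("the transformation is defined for x > 0 only")
    if lam == 0:
        return math.log(x)               # limiting case, lambda = 0
    return (x ** lam - 1.0) / lam


def inverse_box_cox(y, lam):
    """Back-transform y to the original scale, x = (lam*y + 1)^(1/lam)."""
    if lam == 0:
        return math.exp(y)
    return (lam * y + 1.0) ** (1.0 / lam)


# The order of the data is preserved even for a negative power.
lam = -2.168
low, high = box_cox(7.0, lam), box_cox(13.1, lam)
```

Because 1 is subtracted and the result divided by $\lambda$, `low < high` holds for negative $\lambda$ as well, which is reason (a) given above.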
The probability density of the original observations is obtained by multiplying the Normal density of the transformed variable by the Jacobian of the transformation. The Jacobian is obtained by differentiating the transformed variable with respect to the original variable:

$$\frac{d}{dx}\left(\frac{x^{\lambda} - 1}{\lambda}\right) = x^{\lambda - 1} \tag{3-5}$$

The likelihood function of the obtained sample is then

$$L = \frac{1}{(2\pi)^{N/2}\,\sigma^{N}} \exp\left[-\frac{1}{2\sigma^{2}} \sum_{i=1}^{N}\left(\frac{x_i^{\lambda} - 1}{\lambda} - \mu\right)^{2}\right] \prod_{i=1}^{N} x_i^{\lambda - 1} \tag{3-6}$$

It is usually more convenient to work with the log-likelihood function, $LL(\mu, \sigma)$:

$$LL(\mu, \sigma) = -\frac{N}{2}\ln(2\pi) - N\ln(\sigma) - \sum_{i=1}^{N} \frac{\left(\dfrac{x_i^{\lambda} - 1}{\lambda} - \mu\right)^{2}}{2\sigma^{2}} + (\lambda - 1)\sum_{i=1}^{N}\ln(x_i) \tag{3-7}$$

We use the maximum likelihood estimator (MLE) approach to find the estimate of the transformation parameter $\lambda$, and then find the mean $\mu$ and standard deviation $\sigma$ of the transformed data. For Normally distributed data, the ML estimators of $\mu$ and $\sigma$ (biased) are:

$$\hat{\mu} = \frac{1}{N}\sum_{i=1}^{N}\frac{x_i^{\lambda} - 1}{\lambda}, \qquad \hat{\sigma} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\frac{x_i^{\lambda} - 1}{\lambda} - \hat{\mu}\right)^{2}} \tag{3-8}$$

Substituting (3-8) into the log-likelihood function (3-7) results in an equation in terms of $\lambda$ only. Differentiating this equation with respect to $\lambda$ and equating to zero gives a maximum likelihood estimating equation for $\lambda$. The solution of this equation is straightforward with Mathcad, although at first sight the numerical calculations appear to be rather involved.

    Guess:  λ := 0.1                                               (3-9)
    λ := root( d/dλ [ −(N/2)·ln(2π) − N·ln(σ̂(λ)) − Σ_i (y_i(λ) − μ̂(λ))² / (2·σ̂(λ)²) + (λ − 1)·Σ_i ln(x_i) ], λ )
    λ = −2.168

where y_i(λ) = (x_i^λ − 1)/λ, and μ̂(λ) and σ̂(λ) are the expressions in (3-8).

Using the Box and Cox transformation (3-4) with $\lambda = -2.168$ resulted in a Normal distribution for the MC data in Table 3-1. The probability plot of the transformed data is shown in Figure 3-3. The estimates of $\mu$ and $\sigma$ (the latter with the unbiased divisor $N - 1$) are:

$$\hat{\mu} = \frac{1}{N}\sum_{i=1}^{N}\frac{x_i^{\lambda} - 1}{\lambda} = 0.457, \qquad \hat{\sigma} = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}\left(\frac{x_i^{\lambda} - 1}{\lambda} - \hat{\mu}\right)^{2}} = 0.001 \tag{3-10}$$

[Plot of the data and model ordinates against the transformed values.]

Figure 3-3. Probability plot for the transformed data.

The maximum likelihood estimator $\hat{\sigma}$ reflects both between-sample and within-sample variability. If the sample means differ significantly, $\hat{\sigma}$ will be too large and the process standard deviation will be overestimated. See Montgomery (1991, p. 215) for a discussion of these two sources of variation. In processes where only simple random variation exists, between-sample variation is not significant. The estimate of the within-sample variation alone should then be taken as the estimate of the process variation and used in calculating the control chart limits.

For some processes, the variation of the sample means will be greater than expected from random variation. The process mean may vary slightly, causing additional variation. In this case the between-sample variation becomes significant, and its extra value must be added to the within-sample variation before calculating the control chart limits.

The estimate of the within-sample standard deviation, $s_w$, is the average standard deviation within samples (see Wetherill and Brown, 1991, p. 57):

$$s_w = \sqrt{\frac{1}{m}\sum_{i=1}^{m} s_i^{2}} \tag{3-11}$$

where

$$s_i = \sqrt{\frac{1}{n-1}\sum_{j=1}^{n}\left(\frac{x_{ij}^{\lambda} - 1}{\lambda} - \bar{x}_{i\cdot}\right)^{2}} \qquad \text{and} \qquad \bar{x}_{i\cdot} = \frac{1}{n}\sum_{j=1}^{n}\frac{x_{ij}^{\lambda} - 1}{\lambda}$$

(the subscript dot denotes summation over the suffix $j$). The between-sample standard deviation, $s_b$, is simply the standard deviation of the $m$ sample means:

$$s_b = \sqrt{\frac{1}{m-1}\sum_{i=1}^{m}\left(\bar{x}_{i\cdot} - \bar{x}_{\cdot\cdot}\right)^{2}} \tag{3-12}$$

When a process is in control, $s_w$ estimates the process standard deviation $\sigma$, and $s_b$ estimates $\sigma/\sqrt{n}$. Therefore a formal F-test of whether or not there is a significant between-sample variation can be set up using the following ratio:

$$F = \frac{s_b^{2}}{s_w^{2}/n} \tag{3-13}$$

For the data in Table 3-1, using formulas (3-11) and (3-12), the values of $s_b$ and $s_w$ are

$$s_b = 6.966 \times 10^{-4}, \qquad s_w = 8.665 \times 10^{-4} \tag{3-14}$$

Using (3-13), the value of the F statistic is

$$F = 3.232 \tag{3-15}$$

This value is greater than 1, indicating that there is some additional systematic variation between sample means.
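The within-sample and between-sample calculation can be sketched in Python as follows. The function name `within_between` is hypothetical, and pooling the within-sample variances with divisor m is an assumption about how the within-sample estimate is formed; the thesis calculation may differ in detail. The last lines simply recompute the F ratio from the reported values of $s_b$ and $s_w$ with $n = 5$.

```python
import math
from statistics import mean, variance


def within_between(samples):
    """Return (s_w, s_b, F) for equal-size subgroups of transformed data."""
    n = len(samples[0])                               # subgroup size
    means = [mean(s) for s in samples]                # subgroup means
    # Assumption: within-sample variances are pooled with divisor m.
    s_w = math.sqrt(mean([variance(s) for s in samples]))
    s_b = math.sqrt(variance(means))                  # SD of the subgroup means
    f_ratio = s_b ** 2 / (s_w ** 2 / n)               # s_b estimates sigma/sqrt(n) in control
    return s_w, s_b, f_ratio


# Small made-up subgroups, used only to exercise the function.
toy = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
toy_s_w, toy_s_b, toy_f = within_between(toy)

# F ratio recomputed from the reported standard deviations (n = 5).
reported_f = (6.966e-4) ** 2 / ((8.665e-4) ** 2 / 5)
```

With the rounded inputs, `reported_f` agrees with the reported F statistic to about two decimal places.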
We can formally test the significance of this F value by comparing it with the critical F value at $\alpha = 0.05$ with numerator degrees of freedom $u = m - 1 = 19$ and denominator degrees of freedom $v = m(n - 1) = 80$. The 5% critical F value with 19 and 80 degrees of freedom is 1.718, computed as follows:

    u := 19    v := 80
    F(c) := [Γ((u + v)/2) / (Γ(u/2)·Γ(v/2))]·(u/v)^(u/2) · ∫_0^c t^(u/2 − 1)·(1 + (u/v)·t)^(−(u + v)/2) dt      (3-16)
    c := 1.7
    xc := root(F(c) − 0.95, c)
    xc = 1.718

Since the sample F value is larger than the critical F value xc, we conclude that the additional between-sample variation is significant. The standard deviation due to this variation is:

$$s_e = \sqrt{s_b^{2} - \frac{s_w^{2}}{n}} \tag{3-17}$$

The additional between-sample variation can be controlled by means of a moving standard deviation chart based on the sample means. This chart, together with the control chart for sample means, aids in detecting changes in the process mean.

Control chart for sample means

The sample mean $\bar{X}$ is Normally distributed with expectation equal to the process mean $\mu$ and standard deviation equal to the process standard deviation $\sigma$ divided by the square root of the sample size. When the between-sample variation proves to be significant, as in the case of the MC data given in Table 3-1, the extra between-sample variation should be added to the expected within-sample variation. Therefore the 3-sigma control limits for the sample means control chart become:

$$UCL_{\bar{x}} = \hat{\mu} + 3\sqrt{\frac{s_w^{2}}{n} + s_e^{2}}, \qquad LCL_{\bar{x}} = \hat{\mu} - 3\sqrt{\frac{s_w^{2}}{n} + s_e^{2}} \tag{3-18}$$

The sample means chart is shown in Figure 3-4. In this case, the process mean appears to be in control (as opposed to the control chart developed by Pratt, which showed points out of control), indicating the need for the transformation and for the inclusion of the additional between-sample variation.

[Plot of the sample means against sample number, with UCL = 0.459, CL = 0.4571 and LCL = 0.455.]

Figure 3-4. Sample means chart for data in Table 3-1.

Table 3-2. Sample means and moving additional between-sample standard deviations.
Sample mean, x̄     Moving additional between-sample standard deviation, s_e
0.45606            -
0.45693            -
0.45632            4.383 × 10^-4
0.45640            3.059 × 10^-4
0.45633            6.922 × 10^-5
0.45780            3.893 × 10^-4
0.45824            9.594 × 10^-4
0.45816            6.434 × 10^-4
0.45729            5.21 × 10^-4
0.45762            4.345 × 10^-4
0.45756            1.518 × 10^-4
0.45667            5.227 × 10^-4
0.45697            4.418 × 10^-4
0.45647            2.387 × 10^-4
0.45799            7.71 × 10^-4
0.45832            9.828 × 10^-4
0.45739            4.629 × 10^-4
0.45661            8.536 × 10^-4
0.45673            4.097 × 10^-4
0.45658            5.603 × 10^-4

Control chart for moving standard deviations

We can control the additional between-sample variability by constructing a moving between-sample standard deviation chart. We chose subgroups of size $k = 3$ to calculate the moving additional between-sample standard deviations. Table 3-2 shows the results of these calculations. The first entry in column two is the additional between-sample standard deviation (3-17) based on samples 1 to 3, the next entry is based on samples 2 to 4, and so on.

Since the sample means are Normally distributed, we can set up an $s_e$ chart with probability limits. Probability limits can be obtained using the chi-square distribution in conjunction with the following relation:

$$\frac{(k-1)\,s_e^{2}}{\sigma_e^{2}} \sim \chi^{2}_{k-1} \tag{3-19}$$

where $(k - 1)$ is the degrees of freedom. For a Type I error of $\alpha = 0.0027$, it follows from (3-19) that:

$$P\left\{\chi^{2}_{\alpha/2} \le \frac{(k-1)\,s_e^{2}}{\sigma_e^{2}} \le \chi^{2}_{1-\alpha/2}\right\} = 1 - \alpha \tag{3-20}$$

and so

$$P\left\{\frac{\sigma_e^{2}\,\chi^{2}_{\alpha/2}}{k-1} \le s_e^{2} \le \frac{\sigma_e^{2}\,\chi^{2}_{1-\alpha/2}}{k-1}\right\} = 1 - \alpha \tag{3-21}$$

By taking square roots we obtain:

$$P\left\{\sigma_e\sqrt{\frac{\chi^{2}_{\alpha/2}}{k-1}} \le s_e \le \sigma_e\sqrt{\frac{\chi^{2}_{1-\alpha/2}}{k-1}}\right\} = 1 - \alpha \tag{3-22}$$

Therefore, if the between-sample variability is in control at $\sigma_e$, then $100(1 - \alpha)\%$ of the time the moving standard deviations $s_e$ will fall between the endpoints of the interval. Using the estimate of the additional between-sample standard deviation (3-17), the control limits become:

$$LCL_{s_e} = s_e\sqrt{\frac{\chi^{2}_{k-1,\,\alpha/2}}{k-1}} = 0.00002, \qquad UCL_{s_e} = s_e\sqrt{\frac{\chi^{2}_{k-1,\,1-\alpha/2}}{k-1}} = 0.0015 \tag{3-23}$$

and the centerline is $\bar{s}_e$, the average of the $s_e$ values.
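As a hedged check of these probability limits, the sketch below recomputes them in Python from the reported values of $s_b$ and $s_w$. Two assumptions are made here: the additional between-sample standard deviation is taken as $\sqrt{s_b^2 - s_w^2/n}$, and, because $k - 1 = 2$, the chi-square quantile is obtained from the closed form of the 2-degree-of-freedom distribution function, $1 - e^{-x/2}$, instead of a numerical root search. The function name `chi2_quantile_2df` is hypothetical.

```python
import math


def chi2_quantile_2df(p):
    """Quantile of the chi-square distribution with 2 degrees of freedom."""
    # For 2 degrees of freedom the distribution function is 1 - exp(-x/2).
    return -2.0 * math.log(1.0 - p)


alpha, k, n = 0.0027, 3, 5
s_b, s_w = 6.966e-4, 8.665e-4                 # reported between- and within-sample SDs

# Assumed form of the additional between-sample standard deviation.
s_e = math.sqrt(s_b ** 2 - s_w ** 2 / n)

chi2_low = chi2_quantile_2df(alpha / 2)       # lower alpha/2 percentage point
chi2_high = chi2_quantile_2df(1 - alpha / 2)  # upper alpha/2 percentage point

lcl = s_e * math.sqrt(chi2_low / (k - 1))     # lower probability limit
ucl = s_e * math.sqrt(chi2_high / (k - 1))    # upper probability limit
```

Under these assumptions the limits come out close to the values 0.00002 and 0.0015 quoted above.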
For $\alpha = 0.0027$ (corresponding to the 3-sigma limits), the $\alpha/2$ percentage point of the chi-square distribution with $k - 1$ degrees of freedom is obtained using Mathcad as follows:

    α := 0.0027    k := 3    v := k − 1
    F(c) := [1 / (2^(v/2)·Γ(v/2))] · ∫_0^c t^(v/2 − 1)·e^(−t/2) dt          (3-24)
    c := 0.002
    chi_square := root(F(c) − α/2, c)
    chi_square = 0.003

Also, $\chi^{2}_{k-1,\,1-\alpha/2} = 13.215$. The moving additional between-sample standard deviations chart is given in Figure 3-5. In this case, this component of variability appears to be in control.

[Plot of the moving between-sample standard deviation against group number, with UCL = 0.0015 and LCL = 0.00002.]

Figure 3-5. Moving additional between-sample standard deviations chart.

Control chart for sample standard deviations (s chart)

An s chart can be used for controlling the process variability. Using steps similar to those given for the moving additional between-sample standard deviations chart, it can be shown that the control limits for the s chart are as follows:

$$UCL_{s} = s_w\sqrt{\frac{\chi^{2}_{n-1,\,1-\alpha/2}}{n-1}} = 0.0018, \qquad LCL_{s} = s_w\sqrt{\frac{\chi^{2}_{n-1,\,\alpha/2}}{n-1}} = 0.00014 \tag{3-25}$$

where $\alpha = 0.0027$, $n = 5$, $\chi^{2}_{n-1,\,\alpha/2} = 0.106$, and $\chi^{2}_{n-1,\,1-\alpha/2} = 17.8$. The s chart is given in Figure 3-6.

[Plot of the sample standard deviation against sample number, with UCL = 0.0018, CL ≈ 0.0008 and LCL = 0.00014.]

Figure 3-6. Standard deviations control chart.

The s chart should generally indicate control before the control chart for sample means is constructed. The reason is that unless the variability of the process is in a state of statistical control, we do not have a stable process with a single fixed mean.

The control charts shown in Figures 3-4, 3-5 and 3-6 clearly indicate an in-control state for the kiln-drying operation. Both the process mean and the process variation remained constant during the period of study. These results are in sharp contrast with those previously obtained by Pratt (1953).
In that study, as mentioned earlier, conventional Shewhart $\bar{X}$ and R control charts were applied to the original data, and on both charts some points were found beyond the control limits. Although Pratt collected his data from an ongoing operation, so that any assignable cause present could have been found and reported by the author, no such finding is given in the study. This leads us to believe that the out-of-control points were in fact false alarms and that no assignable cause had been present in the process. Had Pratt recognized the underlying distribution and made a proper transformation, he would have found the process to be under control. The conventional control charts developed by Pratt were based on the assumption that the distribution of sample means was Normal. The probability plot and the statistical test of fit indicated that this assumption was not met for the samples of MC data used in the study; as a result, the 3-sigma control limits could not have been the proper limits to use for the distribution of sample means. Nor could the lack of Normality in the original data have been the result of an assignable cause, since an assignable cause is related either to the process mean or to the process standard deviation, not to the shape of the distribution.

Before any data are collected for the purpose of control charts, the quality control engineer should take proper action to fix any recognizable problem. Once the process is judged to be operating under its best possible conditions, the task of collecting data begins. For the kiln-drying operation of this chapter, it is natural to assume that these basic steps have been taken. Our control charts clearly confirm this assumption, and their control limits could be used to keep the process under control in the future.

Control chart in the original scale of measurements

So far we have constructed control charts for transformed data.
However, working with transformed data presents a significant problem in process quality control: the units of the transformed data are not comprehensible to the production workers who should use the control charts to monitor their work. We can instead construct a control chart in the original scale of measurements and specify its control limits in the units that are natural for the product.

From the definition of the expected value of a variable, the mean in the original scale of measurements can be written as a function of the mean and variance in the transformed scale. Defining the transformed data as the variable $y$,

$$y = \frac{x^{\lambda} - 1}{\lambda} \tag{3-26}$$

and solving for $x$, we find $x = (\lambda y + 1)^{1/\lambda}$. The mean of the measurements in the original scale can then be defined as:

$$\mu_{os}(\mu, \sigma) = \int_{0.38}^{0.461} (\lambda y + 1)^{1/\lambda}\, \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{y - \mu}{\sigma}\right)^{2}\right] dy \tag{3-27}$$

$$\mu_{os}(\mu, \sigma) = 8.908$$

The values 0.38 and 0.461 in the integral bound the possible values of the transformed variable $y$. The expected value of MC in the original scale is then used as the centerline of the control chart for sample means in the original scale.

For each sample, we use (3-27) to find the mean in the original scale ($\bar{x}_{os}$). For the first sample this calculation is as follows:

    j := 1..5
    ȳ := (1/5)·Σ_j y_j
    x̄_os := ∫_0.38^0.461 (λ·y + 1)^(1/λ) · [1/(σ·√(2π))] · exp[ −(1/2)·((y − ȳ)/σ)² ] dy        (3-28)
    x̄_os = 7.991

For sample number 2 we only change the subscript j to 6..10, and so on. The sample means in the original scale of measurements for all 20 samples are listed in Table 3-3.

Table 3-3. Sample means in the original scale of measurements.

Sample number   Mean (original scale)
 1               7.991
 2               8.750
 3               8.195
 4               8.338
 5               8.202
 6               8.909
 7              10.510
 8              10.372
 9               9.138
10               9.560
11               9.476
12               8.499
13               8.794
14               8.323
15              10.104
16              10.641
17               9.261
18               8.443
19               8.553
20               8.414

The local standard error of the mean in the original scale ($LSE\mu_{os}$) can be found from the error propagation formula, evaluated at the ML estimates of the mean and standard deviation in the transformed scale and the local standard errors of these parameters. These calculations are shown in (3-29).

    LSEμ := σ/√N
    LSEσ := sqrt[ −1 / (∂²LL(μ,σ)/∂σ²) ]
    LVμ_os := [∂μ_os(μ,σ)/∂μ]²·LSEμ² + [∂μ_os(μ,σ)/∂σ]²·LSEσ²                     (3-29)
    LSEμ_os := √(LVμ_os)
    LSEμ_os = 0.115

Since the exact sampling distribution of the estimate of the mean in the original scale is not known (as opposed to the exact sampling distribution of the estimate of the mean in the transformed scale, which is Normal), we use the Likelihood Ratio (LR) statistic to find an approximate confidence interval on the process mean in the original scale, and subsequently we obtain the related control limits for the sample means chart. Considering the log-likelihood function (3-7) and the ML estimates (3-10) of the parameters $\mu$ and $\sigma$, the endpoints of a $(1-\alpha)$-level confidence interval on the process mean in the original scale ($\mu_{os}$) are obtained from the following condition:

$$-2\ln\frac{L(\mu, \sigma)}{L(\hat{\mu}, \hat{\sigma})} = \chi^{2}_{1-\alpha} \tag{3-30}$$

Both parameters in the numerator of (3-30) are expressed in terms of the mean in the original scale. This can be done by first defining the location parameter in terms of the mean in the original scale and the scale parameter:

    a := μ    b := σ
    l(μ_os, b) := root( ∫_0.38^0.461 (λ·y + 1)^(1/λ) · [1/(b·√(2π))] · exp[ −(1/2)·((y − a)/b)² ] dy − μ_os, a )        (3-31)

Substituting (3-31) into the log-likelihood function (3-7) and differentiating with respect to the scale parameter gives the ML equation for the scale parameter.
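The two numerical steps involved here, evaluating the mean in the original scale as an integral and inverting it for the location parameter, can be sketched in Python. This is an illustration only and not the Mathcad worksheet: the function names `mean_original_scale` and `location_for_mean` are hypothetical, Simpson's rule replaces Mathcad's integrator, and bisection replaces `root`. The check uses $\lambda = 0.5$, for which the integral has the closed form $(0.5\mu + 1)^2 + 0.25\sigma^2$ when the integration range covers essentially all of the Normal density.

```python
import math


def mean_original_scale(mu, sigma, lam, lo, hi, steps=2000):
    """Simpson's-rule value of the integral of (lam*y+1)^(1/lam) * Normal pdf over [lo, hi]."""
    if steps % 2:
        steps += 1                                     # Simpson's rule needs an even count
    h = (hi - lo) / steps

    def integrand(y):
        pdf = math.exp(-0.5 * ((y - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        return (lam * y + 1.0) ** (1.0 / lam) * pdf

    total = integrand(lo) + integrand(hi)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * integrand(lo + k * h)
    return total * h / 3.0


def location_for_mean(target, sigma, lam, lo, hi, a_lo, a_hi, tol=1e-9):
    """Bisection for the location a whose original-scale mean equals target."""
    f_lo = mean_original_scale(a_lo, sigma, lam, lo, hi) - target
    while a_hi - a_lo > tol:
        mid = 0.5 * (a_lo + a_hi)
        f_mid = mean_original_scale(mid, sigma, lam, lo, hi) - target
        if (f_mid > 0) == (f_lo > 0):
            a_lo, f_lo = mid, f_mid                    # root lies in the upper half
        else:
            a_hi = mid                                 # root lies in the lower half
    return 0.5 * (a_lo + a_hi)


# Check against the closed form for lam = 0.5: (0.5*2 + 1)^2 + 0.25*0.5^2 = 4.0625.
check_mean = mean_original_scale(2.0, 0.5, 0.5, -2.0, 6.0)
check_location = location_for_mean(4.0625, 0.5, 0.5, -2.0, 6.0, 1.0, 3.0)
```

For the MC data the same functions would be called with $\lambda = -2.168$ and the integration limits 0.38 and 0.461; the result is sensitive to the number of digits carried in $\mu$ and $\sigma$, so no value is quoted here.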
Solving this equation defines the scale parameter as a function of the mean in the original scale:

    ML(μ_os, b) := d/db LL[ l(μ_os, b), b ]                                    (3-32)
    s(μ_os) := root[ ML(μ_os, b), b ]

The LR statistic (3-30) can then be expressed as:

    LR(μ_os) := 2·{ LL(μ, σ) − LL[ l(μ_os, s(μ_os)), s(μ_os) ] }                 (3-33)

The upper and lower confidence limits of the mean in the original scale are those values of $\mu_{os}$ for which (3-33) equals 9.009:

    μ_os := 10
    U_μos := root(LR(μ_os) − 9.009, μ_os)        U_μos = 9.31
    μ_os := 8.5                                                                (3-34)
    L_μos := root(LR(μ_os) − 9.009, μ_os)        L_μos = 8.596

The above confidence limits fall at the following numbers of local standard errors of the mean in the original scale:

    ku_μos := (U_μos − μ_os)/LSEμ_os        kl_μos := (μ_os − L_μos)/LSEμ_os        (3-35)
    ku_μos = 3.494        kl_μos = 2.719

The LR curve and the chi-square line at 9.009 are plotted in Figure 3-7.

[Plot of the LR curve against the mean in the original scale, with the horizontal chi-square line at 9.009.]

Figure 3-7. LR curve and the line of chi-square.

To construct the control chart we first estimate the expected standard error of the mean in the original scale ($ESE\mu_{os}$) for samples of size 5, and then we set the upper and lower control limits at 3.494 and 2.719 expected standard errors respectively. Using the "error propagation" formula, the expected variance of the mean in the original scale and the control chart limits are:

    n := 5
    ESEμ := σ/√n
    ESEσ := σ/√(2·n)
    EVμ_os := [∂μ_os(μ,σ)/∂μ]²·ESEμ² + [∂μ_os(μ,σ)/∂σ]²·ESEσ²                    (3-36)
    ESEμ_os := √(EVμ_os)                         ESEμ_os = 0.519
    UCL_μos := μ_os + ku_μos·ESEμ_os             UCL_μos = 10.721
    LCL_μos := μ_os − kl_μos·ESEμ_os             LCL_μos = 7.498

The individual sample means in the original scale (Table 3-3) are then plotted on the control chart. The control chart is shown in Figure 3-8.

[Plot of the sample means in the original scale against sample number, with UCL = 10.721, CL = 8.908 and LCL = 7.498.]

Figure 3-8. Control chart for sample means in the original scale of measurements.

This control chart indicates that, under current operating conditions, the output of these dry kilns is such that the average MC of samples of size five may be expected to vary between 7.5 percent and 10.72 percent.
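The arithmetic that turns the likelihood ratio confidence limits into asymmetric control limits can be retraced with a short Python sketch. The function name `asymmetric_limits` is hypothetical, and the inputs are the rounded values reported in this section, so the results agree with the reported limits only to about two decimal places.

```python
def asymmetric_limits(center, lower_cl, upper_cl, local_se, expected_se):
    """Return (ku, kl, UCL, LCL) for a chart with unequal upper and lower multipliers."""
    ku = (upper_cl - center) / local_se      # upper distance in local standard errors
    kl = (center - lower_cl) / local_se      # lower distance in local standard errors
    ucl = center + ku * expected_se          # upper control limit for samples of size n
    lcl = center - kl * expected_se          # lower control limit for samples of size n
    return ku, kl, ucl, lcl


# Reported values: mean 8.908, confidence limits 8.596 and 9.31,
# local standard error 0.115, expected standard error 0.519.
ku, kl, ucl, lcl = asymmetric_limits(8.908, 8.596, 9.31, 0.115, 0.519)
```

Keeping the upper and lower multipliers separate is what preserves the skewness of the original-scale distribution in the control limits.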
Variations in excess of these limits are, in all probability, due to assignable causes which can be found and corrected. Problems such as a modification of the kiln schedules, erratic functioning of one or more of the heating coils in the kiln, excessive air leakage, and other possibilities will cause points to fall beyond the control limits. Finding the source of the trouble and taking corrective action can help prevent both over-dried and wet boards and maintain a uniform moisture content.

Summary

In this chapter, the application of control charts for variables to the MC of kiln-dried lumber has been discussed. The analysis showed that the MC data are not Normally distributed and that the traditionally applied logarithmic transformation does not fit well. A general class of power transformations due to Box and Cox produced data that conformed best to the Normal distribution. A statistical test of significance indicated the presence of an additional systematic between-sample variation and, as a result, the conventional sample means chart limits had to be modified to include this additional variation. Comparison with a previous study, which did not consider the Normality assumption or the presence of the additional between-sample variation, showed the importance of a more thorough analysis of the data before the development of control charts. The likelihood ratio statistic, with its approximate chi-square distribution, appeared to be a good choice for developing a control chart for sample means in the original scale of measurements. Modern computational tools such as Mathcad™ now make this method potentially feasible.

The control chart development procedure may be summarized in the following steps:

1. Obtain m samples of size n each.

2. Determine whether or not the underlying distribution is Normal and, if it is not, transform the data using the Box and Cox procedure.

3. Find the MLE of $\lambda$, the transformation parameter, using the procedure described previously, and then calculate the estimates of $\mu$ and $\sigma$.

4. Check whether or not the between-sample variation is significant and, if it is, calculate its extra value and add that to the within-sample variation before calculating the limits of the sample means control chart.

5. Construct a moving standard deviations control chart based on the sample means to control the additional between-sample variation.

6. Construct a control chart for the sample standard deviations (s chart).

7. Calculate the process mean in the original scale of measurements ($\mu_{os}$) and define it as a function of the mean and standard deviation in the transformed scale.

8. Find the 99.73% confidence interval on the process mean in the original scale using the likelihood ratio method.

9. Determine the local standard error of the mean in the original scale and express the distances of the upper and lower confidence limits found in step 8 in terms of this local standard error ($ku_{\mu os}$ and $kl_{\mu os}$ respectively).

10. Calculate the expected standard error of the mean in the original scale for samples of size 5 ($ESE\mu_{os}$).

11. Finally, using the $\mu_{os}$, $ESE\mu_{os}$, $ku_{\mu os}$ and $kl_{\mu os}$ found in the prior steps, compute the upper and lower control limits $UCL_{\mu os} := \mu_{os} + ku_{\mu os} \cdot ESE\mu_{os}$ and $LCL_{\mu os} := \mu_{os} - kl_{\mu os} \cdot ESE\mu_{os}$, and plot the sample estimates of the means in the original scale on the control chart.

CHAPTER 4

CONTROL CHARTS WITH CENSORED DATA

Overview

Censoring refers to the situation in which the values of some of the observations in the sample are unknown. Censored samples are those in which sample specimens with measurements that lie in the restricted areas of the sample space may be identified and thus counted, but are not otherwise measured (Cohen, 1991). It is assumed that the size of the sample and the number of censored observations are known. Censored data, in essence, are missing values.
Missing values are not always a serious problem in statistical process control. If, for example, we had collected twenty-five random samples of size five to construct a control chart and five samples contained damaged or lost data, we could discard these samples and construct the control chart with the remaining twenty samples. Alternatively, we could construct a control chart based on all twenty-five samples, using variable control limits for samples of different sizes. The difficulty with censored data is that they are not missing in a random pattern; they are often all missing at one end of the distribution. We cannot proceed as if they never existed, we cannot pretend they are zeros (if they are missing at the low end), and replacing the censored observations with arbitrary values is not entirely satisfactory.

Censored samples often result from life testing and reaction time experiments, where it is common practice to terminate observation before all sample specimens have failed or reacted. In addition to life testing of equipment and related aging problems, censored data are encountered in other situations, such as the determination of the moisture content of lumber, which is best done with electronic moisture meters. Meter readings are rapid and easy to take, and for these reasons make it possible to take large samples for moisture content determination. However, one disadvantage of moisture meters, particularly the resistance type, is that the moisture range in which the meter may be used with any degree of reliability extends only from about 5 percent to the fiber saturation point, which occurs at about 25 or 30 percent moisture content. In kiln-dried lumber, the accuracy limitation of resistance-type meters occurs in the low moisture content range, since at around 5 to 6 percent moisture content the electrical resistance of wood becomes so high that it is difficult to measure (McMahon, 1961).
Although the actual meter reading cannot be given when the meter reads over 25 percent or below 5 percent, the fact that a reading above the fiber saturation point or below the 5 percent detection limit was obtained can be recorded. In effect, the sample of moisture content data is often censored. The consideration of these censored data is essential for the correct solution of different aspects of the total moisture variation problem.

Censored data also arise in monitoring the level of toxic pollutants present in the environment. Instruments measuring toxic contaminants can only detect pollutants present above some detection limit; it is not possible to know the values of concentrations below the detection limit. For example, the instrument may show a reading of zero parts per million (PPM) when pollutants are actually present in the sample but below the instrument's ability to detect them. This censoring makes it difficult to obtain realistic estimates of the population parameters.

Another complicating factor in the analysis of censored data (at least for the cases discussed above) is that the distribution is not Normal. If the distribution of the data is not Normal, the usual procedures for estimating the process parameters and the control chart limits are not appropriate. Fortunately, the situation can be improved by transforming the data so that they appear to be Normally distributed.

In this research, using sample data on air pollutants, we present an approach for developing control charts for censored and non-Normally distributed data. We use the Box and Cox power transformation to produce data that conform best to the Normal distribution, and the likelihood ratio method to re-express the derived control chart in the original scale of measurement.

Illustrative Example

Table 4-1 shows sixteen samples, each containing five measurements of the pollutants present in air quality samples.
The measurements are recorded in parts per million (PPM). The instrument used to make the measurements had a detection limit of 8 PPM. Therefore, measurements below 8 are censored. They are shown in Table 4-1 as BDL (Below Detection Limit). We use these data to illustrate the necessary steps in the construction of control charts for censored data.

Table 4-1. Measures of Pollutants in Air Quality Samples (in PPM)

Sample number   Sample values
 1              29    30    12    15    28
 2              20    24     8    BDL   16
 3              22    20    21    BDL   11
 4              11    25    19    22    10
 5              10    13    BDL   19    12
 6              14     9    18    32    14
 7              BDL   BDL   13    12    25
 8              14    26     9    BDL    8
 9              22    10    17    14    13
10               8    23    11    35    39
11              13    14    14    33    BDL
12               9    14    13    11    13
13              17    13    BDL    8    10
14              18    35    14    27    13
15              20    10    BDL   14    BDL
16              16    12    16    12    21

Source: Schneider et al. (1992), p. 221.

The Normality Assumption

The first step in the analysis is to check whether or not the data are Normally distributed. We can check the Normality assumption graphically with a probability plot. The Normal probability plot, as previously described in Chapter 3, is constructed by ranking the observed sample values from smallest to largest, assigning each value a rank, and calculating its plotting position on the probability scale. The problem with censored data is that it is not easy to determine the quantiles of the data, and the estimation procedure previously described for complete data must be adjusted to take the censored data into account. To do this we use the Kaplan-Meier (1958) product limit estimator to estimate the cumulative frequency function $F(x_i) = 1 - R(x_i)$, where

$$R(x_i) = \big[R(x_i)\,|\,R(x_{i-1})\big]\cdot R(x_{i-1}) \tag{4-1}$$

For censored data, the notation $R(x_i)\,|\,R(x_{i-1})$ can be defined as

$$R(x_i)\,|\,R(x_{i-1}) = \frac{n - i}{n - i + 1} \tag{4-2}$$

Here $n$ is the total sample size (observations below the detection limit and above the detection limit) and $i$ is the order of the observed data in the total sample. The $R(x_i)$ estimates can be computed for the observed sample values above the detection limit only. No estimate can be computed for the censored observations.
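The product limit recursion can be sketched in Python as below. This is an illustrative reading of the conditional factor $(n - i)/(n - i + 1)$ applied at each observed rank, not the thesis worksheet; the function name `product_limit_cfd` is hypothetical, and the assignment of ranks to the observed values of a censored sample is left to the caller.

```python
def product_limit_cfd(ranks, n):
    """Return F(x_i) = 1 - R(x_i) at each observed rank, by the product limit recursion."""
    survival = 1.0                               # starting value, R = 1 before the first rank
    cfd = []
    for i in ranks:
        survival *= (n - i) / (n - i + 1)        # conditional factor at rank i
        cfd.append(1.0 - survival)               # cumulative frequency at x_i
    return cfd


# With a complete sample (every rank observed) the recursion telescopes to i/n.
complete = product_limit_cfd([1, 2, 3, 4], 4)
```

For a complete sample the estimate reduces to the ordinary empirical cumulative frequency, which is a convenient check on the recursion.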
Using the plotting position (3-1), the expression (4-2) can be adjusted to closely approximate the order statistics of the Normal distribution. The product limit estimate of the cdf, which gives the plotting positions for the censored sample data, then becomes:

$$m := 70 \qquad j := 1..m \qquad x_j := \mathrm{READ}(censored) \qquad x := \mathrm{sort}(x) \qquad r_j := \mathrm{READ}(cenrank) \qquad n := 80$$

$$R_0 = 1 \qquad R_j := \frac{n - r_j + 0.625}{n - r_j + 1.625}\, R_{j-1} \qquad p_j := 1 - \frac{n + 0.625}{n + 0.25}\, R_j \tag{4-3}$$

Converting the probabilities $p_j$ to standard Normal order scores $z_j$ gives the ordinates of the data plot:

$$t := 0 \qquad z_j := \mathrm{root}(\mathrm{cnorm}(t) - p_j,\, t) \qquad data_j := 0.5 + 0.1975\, z_j \tag{4-4}$$

The fitted Normal model is plotted on the same scale for comparison. The probability plot for the data in Table 4-1 is given in Figure 4-1. The plot of the original data is clearly non-linear, so the Normality assumption is questionable.

Figure 4-1. Probability plot for the data in Table 4-1.

Estimation of the process parameters

Once the Normality assumption for the censored data is rejected, as shown above, the usual procedure for estimating the sample mean will not provide a good estimate of the process mean. As we did for complete sample data in the previous chapter, we use the general class of power transformations due to Box and Cox (1964) to produce data that conform best to the Normal distribution. We use the maximum likelihood estimator (MLE) approach to find the values of $\lambda$ (the parameter defining a particular transformation), $\mu$ (the process mean), and $\sigma$ (the process standard deviation). We then determine the control limits for the transformed data, and finally present a procedure to re-express the control chart in the original scale of measurement.
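The plotting positions in (4-3) and the ordinates in (4-4) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the rank vector plays the role of the `cenrank` input (the text does not show how it is built), and the constant `c = 0.375` is inferred from the 0.625, 1.625 and 0.25 terms of (4-3).

```python
from statistics import NormalDist


def censored_plotting_positions(ranks, n, c=0.375):
    """Product-limit plotting positions for the observed values of a censored sample.

    ranks: ascending ranks (1..n) occupied by the observed values in the full
           ordered sample of size n (hypothetical input, cf. `cenrank`).
    c:     plotting constant; 0.375 reproduces the constants of (4-3).
    """
    survival = 1.0  # R_0 = 1
    positions = []
    for r in ranks:
        survival *= (n - r - c + 1) / (n - r - c + 2)      # R_j from R_{j-1}
        positions.append(1 - (n - c + 1) / (n - 2 * c + 1) * survival)
    return positions


def plot_ordinates(positions):
    """Map probabilities to the 0.5 + 0.1975 z scale used for the data plot."""
    return [0.5 + 0.1975 * NormalDist().inv_cdf(p) for p in positions]


n = 5
complete = censored_plotting_positions([1, 2, 3, 4, 5], n)   # no censoring
blom = [(i - 0.375) / (n + 0.25) for i in range(1, n + 1)]   # plotting position (i - 0.375)/(n + 0.25)
censored = censored_plotting_positions([1, 2, 4, 5], n)      # made-up case: rank 3 censored
print(complete)
print(censored, plot_ordinates(censored))
```

With no censored ranks the recursion collapses to $(i - 0.375)/(n + 0.25)$, which is a convenient check that the constants have been read correctly.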
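The likelihood function of an observed censored sample from a Normal process can be written as

$$L(\mu,\sigma) = \prod_{i=1}^{n_u} \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(x_i-\mu)^2}{2\sigma^2}\right] \cdot \left[\int_{-\infty}^{(D-\mu)/\sigma} \frac{1}{\sqrt{2\pi}}\, e^{-t^2/2}\, dt\right]^{n_l} \tag{4-5}$$

where $D$ is the detection limit, $n_u$ is the number of observations above the detection limit, and $n_l$ is the number of observations below the detection limit. Using the Box and Cox transformation (3-7), the log-likelihood function of the observed censored sample can be written as:

$$LL(\lambda,\mu,\sigma) = -\frac{n_u}{2}\ln(2\pi) - n_u \ln(\sigma) - \frac{1}{2\sigma^2}\sum_{i=1}^{n_u}\left[\frac{x_i^{\lambda}-1}{\lambda} - \mu\right]^2 + (\lambda - 1)\sum_{i=1}^{n_u}\ln(x_i) + n_l \ln\left[\int_{-\infty}^{\left(\frac{D^{\lambda}-1}{\lambda}-\mu\right)/\sigma} \frac{1}{\sqrt{2\pi}}\, e^{-t^2/2}\, dt\right] \tag{4-6}$$

Using the Mathcad function cnorm to calculate the cumulative distribution function of the Normal distribution, we have:

$$LL(\lambda,\mu,\sigma) = -\frac{n_u}{2}\ln(2\pi) - n_u \ln(\sigma) - \frac{1}{2\sigma^2}\sum_{i=1}^{n_u}\left[\frac{x_i^{\lambda}-1}{\lambda} - \mu\right]^2 + n_l \ln\left[\mathrm{cnorm}\left(\frac{\frac{D^{\lambda}-1}{\lambda}-\mu}{\sigma}\right)\right] + (\lambda - 1)\sum_{i=1}^{n_u}\ln(x_i) \tag{4-7}$$

Differentiating (4-7) with respect to $\lambda$, $\mu$ and $\sigma$ and equating to zero gives three maximum likelihood estimating equations. The simultaneous solution of these equations is straightforward with Mathcad, although good starting values are needed for the solution to converge. For a given value of $\lambda$, the mean and standard deviation of the observed data often provide adequate starting values.

Define

$$\kappa(\lambda,\mu,\sigma) := \frac{-D^{\lambda} + 1 + \mu\lambda}{\lambda\,\sigma} \qquad \varepsilon(\lambda,\mu,\sigma) := \mathrm{erf}\left(\frac{\kappa(\lambda,\mu,\sigma)}{\sqrt{2}}\right)$$

Guess

$$\lambda := 0.1 \qquad \mu := 3 \qquad \sigma := 0.6$$

Given

$$\frac{\partial}{\partial\lambda} LL(\lambda,\mu,\sigma) = 0 \qquad \frac{\partial}{\partial\mu} LL(\lambda,\mu,\sigma) = 0 \qquad \frac{\partial}{\partial\sigma} LL(\lambda,\mu,\sigma) = 0$$

$$\begin{pmatrix}\lambda\\ \mu\\ \sigma\end{pmatrix} := \mathrm{Find}(\lambda,\mu,\sigma) = \begin{pmatrix}0.106\\ 3.055\\ 0.657\end{pmatrix} \tag{4-8}$$

In the worksheet the three estimating equations are written out explicitly in terms of $\kappa$ and $\varepsilon$. The guess values assigned to $\lambda$, $\mu$ and $\sigma$ in (4-8) are the starting values that Mathcad uses in its search for a solution. The word Given tells Mathcad that what follows is a system of equations. The Find function returns the values of the variables that solve the system.

Applying the Box and Cox transformation defined in (3-7) with $\lambda = 0.106$ produced approximately Normally distributed data for the censored sample in Table 4-1. The probability plot for the transformed data is shown in Figure 4-2.

Figure 4-2. Normal probability plot for the transformed data.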
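The estimation step in (4-7) and (4-8) can be imitated outside Mathcad by maximizing the log-likelihood numerically. The sketch below is an assumption-laden illustration, not the thesis worksheet: it uses SciPy's Nelder-Mead search instead of solving the three score equations with Given/Find, the function names are hypothetical, and the data are the 70 observed readings of Table 4-1 with the 10 BDL readings treated as censored at $D = 8$.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

# Observed readings from Table 4-1; the 10 BDL readings enter only through n_censored.
observed = np.array([
    29, 30, 12, 15, 28, 20, 24, 8, 16, 22, 20, 21, 11, 11, 25, 19, 22, 10,
    10, 13, 19, 12, 14, 9, 18, 32, 14, 13, 12, 25, 14, 26, 9, 8,
    22, 10, 17, 14, 13, 8, 23, 11, 35, 39, 13, 14, 14, 33, 9, 14, 13, 11, 13,
    17, 13, 8, 10, 18, 35, 14, 27, 13, 20, 10, 14, 16, 12, 16, 12, 21,
], dtype=float)
n_censored = 10
D = 8.0  # detection limit in PPM


def boxcox(x, lam):
    """Box-Cox transform (x**lam - 1)/lam, with the log limit at lam = 0."""
    return np.log(x) if abs(lam) < 1e-8 else (x ** lam - 1.0) / lam


def log_lik(params):
    """Log-likelihood (4-7) for params = (lambda, mu, sigma)."""
    lam, mu, sigma = params
    if sigma <= 0:
        return -np.inf
    y = boxcox(observed, lam)
    d = boxcox(D, lam)
    n_u = observed.size
    return (-0.5 * n_u * np.log(2 * np.pi) - n_u * np.log(sigma)
            - np.sum((y - mu) ** 2) / (2 * sigma ** 2)
            + n_censored * norm.logcdf((d - mu) / sigma)
            + (lam - 1.0) * np.sum(np.log(observed)))


start = np.array([0.1, 3.0, 0.6])            # guess values quoted in the text
fit = minimize(lambda p: -log_lik(p), start, method="Nelder-Mead")
print("lambda, mu, sigma:", fit.x)
print("log-likelihood at the reported estimates:", log_lik([0.106, 3.055, 0.657]))
```

If the likelihood above matches the one used in the worksheet, the search should end near the reported estimates; that agreement has not been verified here, so the printed log-likelihood at (0.106, 3.055, 0.657) is included for comparison.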
The local standard errors of the estimates of $\mu$ and $\sigma$ can be found from the Local Information Matrix (LIM):

$$LIM = -\begin{pmatrix} \dfrac{\partial^2}{\partial\mu^2} LL(\mu,\sigma) & \dfrac{\partial}{\partial\sigma}\dfrac{\partial}{\partial\mu} LL(\mu,\sigma) \\[2ex] \dfrac{\partial}{\partial\mu}\dfrac{\partial}{\partial\sigma} LL(\mu,\sigma) & \dfrac{\partial^2}{\partial\sigma^2} LL(\mu,\sigma) \end{pmatrix} \tag{4-9}$$

Define:

$$d := \frac{D^{\lambda}-1}{\lambda} \qquad y_i := \frac{x_i^{\lambda}-1}{\lambda} \qquad s1 := 1 - \mathrm{erf}\left[\frac{\sqrt{2}\,(-d+\mu)}{2\sigma}\right] \qquad s2 := \exp\left[-\frac{(\mu-d)^2}{2\sigma^2}\right]$$

In the worksheet the elements of the LIM are assembled from auxiliary terms $t1$, $t2$ and $t3$, which are built from $n_l$, $n_u$, $s1$, $s2$, $(\mu - d)$ and $\sigma$.

The matrix of minimum variance bounds is then the inverse of the LIM, evaluated at the maximum likelihood estimates obtained in (4-8). The square roots of the variances give the standard errors of the sampling distributions of the parameter estimates of $\mu$ and $\sigma$:

$$LVM := LIM^{-1} \qquad LSE\mu := \sqrt{LVM_{0,0}} = 0.075 \qquad LSE\sigma := \sqrt{LVM_{1,1}} = 0.057 \qquad LSE\mu\sigma := \sqrt{LVM_{0,1}} \tag{4-10}$$

Confidence intervals for the process parameters $\mu$ and $\sigma$

When the exact sampling distribution of an estimator is not known, an approximate confidence interval can be established using the Likelihood Ratio statistic. Considering the likelihood function (4-5) and the maximum likelihood estimates of the parameters $\mu$ and $\sigma$, the endpoints of a $(1 - \alpha)$-level confidence interval on the process parameter $\mu$ are obtained from the condition:

$$-2 \ln\left[\frac{L(a, s(a))}{L(\mu,\sigma)}\right] \le \chi^2_{1,\,1-\alpha} \tag{4-11}$$

where $\chi^2_{1,\,1-\alpha}$ is the $(1 - \alpha)$ percentile of the chi-squared distribution with $\nu = 1$ degree of freedom, $a$ stands for a trial value of the process mean, and $s(a)$ is the process standard deviation as a function of the process mean. It is defined by solving the maximum likelihood equation for the standard deviation in terms of the mean:

$$s(a) := \mathrm{root}\left[\frac{\partial}{\partial\sigma} LL(\lambda, a, \sigma),\ \sigma\right] \tag{4-12}$$

The upper and lower $(1 - \alpha)$ confidence limits for the parameter $\mu$ are then obtained from (4-11):

$$LR(a) := 2\left[LL(\mu,\sigma) - LL(a, s(a))\right]$$

$$a := 3 \qquad U_\mu := \mathrm{root}(LR(a) - 9.009,\ a) \qquad U_\mu = 3.282$$

$$a := 2 \qquad L_\mu := \mathrm{root}(LR(a) - 9.009,\ a) \qquad L_\mu = 2.819 \tag{4-13}$$

The values assigned to the variable $a$ in (4-13) are the initial guess values.
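A minimal sketch of the likelihood ratio interval in (4-11) to (4-13) follows. It rests on several assumptions that are not in the text: $\lambda$ is held fixed at the reported 0.106, the profile over $\sigma$ is found with a bounded scalar search rather than the root of the score equation, the search brackets of one unit either side of the estimate are arbitrary, and all names are hypothetical. SciPy's 99.73 percentile of the chi-square distribution with one degree of freedom is about 9.00, slightly below the 9.009 used in the worksheet.

```python
import numpy as np
from scipy.optimize import brentq, minimize, minimize_scalar
from scipy.stats import chi2, norm

LAM, D, N_CENSORED = 0.106, 8.0, 10
observed = np.array([
    29, 30, 12, 15, 28, 20, 24, 8, 16, 22, 20, 21, 11, 11, 25, 19, 22, 10,
    10, 13, 19, 12, 14, 9, 18, 32, 14, 13, 12, 25, 14, 26, 9, 8,
    22, 10, 17, 14, 13, 8, 23, 11, 35, 39, 13, 14, 14, 33, 9, 14, 13, 11, 13,
    17, 13, 8, 10, 18, 35, 14, 27, 13, 20, 10, 14, 16, 12, 16, 12, 21,
], dtype=float)
y = (observed ** LAM - 1.0) / LAM      # transformed observed readings
d = (D ** LAM - 1.0) / LAM             # transformed detection limit


def log_lik(mu, sigma):
    """Censored-Normal log-likelihood on the transformed scale (constants dropped)."""
    return (-y.size * np.log(sigma)
            - np.sum((y - mu) ** 2) / (2.0 * sigma ** 2)
            + N_CENSORED * norm.logcdf((d - mu) / sigma))


# Joint ML estimates of (mu, sigma); sigma is searched on the log scale to stay positive.
fit = minimize(lambda p: -log_lik(p[0], np.exp(p[1])),
               x0=[y.mean(), np.log(y.std())], method="Nelder-Mead")
mu_hat, sigma_hat = float(fit.x[0]), float(np.exp(fit.x[1]))
ll_max = log_lik(mu_hat, sigma_hat)


def profile_ll(mu):
    """Maximize over sigma for a fixed mu: the role of s(a) in (4-12)."""
    res = minimize_scalar(lambda s: -log_lik(mu, s),
                          bounds=(0.05 * sigma_hat, 20.0 * sigma_hat), method="bounded")
    return -res.fun


crit = chi2.ppf(0.9973, df=1)


def lr_minus_crit(mu):
    return 2.0 * (ll_max - profile_ll(mu)) - crit


lower = brentq(lr_minus_crit, mu_hat - 1.0, mu_hat)
upper = brentq(lr_minus_crit, mu_hat, mu_hat + 1.0)
print(mu_hat, sigma_hat, lower, upper)
```

The two roots generally lie at unequal distances from the estimate, which is why the chapter later carries separate upper and lower multipliers.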
The value 9.009 in the root function is the 99.73 percentile of the chi-square distribution with $\nu = 1$ degree of freedom. Here, as with conventional $3\sigma$ control charts, the probability of a type I error is held at $\alpha = 0.27$ percent. The likelihood ratio curve as a function of the process mean and the chi-square line at 9.009 are shown in Figure 4-3.

Figure 4-3. Likelihood ratio curve as a function of the process mean and the line of chi-square.

A similar procedure can be used to find a confidence interval on the process parameter $\sigma$. In this case, the confidence interval endpoints are obtained from the following condition:

$$-2 \ln\left[\frac{L(g(b), b)}{L(\mu,\sigma)}\right] \le \chi^2_{1,\,1-\alpha} \tag{4-14}$$

where $b$ stands for a trial value of the process standard deviation and $g(b)$ is the process mean as a function of the process standard deviation. It is defined by solving the maximum likelihood equation for $\mu$ in terms of $\sigma$:

$$a := \mu \qquad g(b) := \mathrm{root}\left[\frac{\partial}{\partial a} LL(\lambda, a, b),\ a\right] \tag{4-15}$$

The confidence limit endpoints for the parameter $\sigma$ are then obtained from (4-14):

$$LR(b) := 2\left[LL(\mu,\sigma) - LL(g(b), b)\right]$$

$$b := 0.9 \qquad U_\sigma := \mathrm{root}(LR(b) - 9.009,\ b) \qquad U_\sigma = 0.875$$

$$b := 0.5 \qquad L_\sigma := \mathrm{root}(LR(b) - 9.009,\ b) \qquad L_\sigma = 0.516 \tag{4-16}$$

The lower limit quoted here is the value consistent with $\sigma = 0.657$, $LSE\sigma = 0.057$ and the multiplier $k_{l\sigma} = 2.458$ reported in (4-21). The likelihood ratio curve as a function of the process standard deviation and the chi-square line at 9.009 are shown in Figure 4-4.

Figure 4-4. Likelihood ratio curve as a function of the process standard deviation, and the line of chi-square.

Control chart for sample ML estimates of means

To construct the control chart for the sample ML estimates of means, we use the ML estimate $\mu = 3.055$ of the transformed data as the center line. We then compute the individual sample ML estimate of the mean for each sample of size 5. For example, the calculations for the first sample are:

$$n_l := 0 \qquad n_u := 5 \qquad i := 1..n_u \qquad u := \mathrm{root}\left[\frac{\partial}{\partial u} LL(\lambda, u, \sigma),\ u\right] \qquad u = 3.625 \tag{4-17}$$

ML estimates of the mean for all 16 samples are shown in Table 4-2.
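The per-sample calculation in (4-17) can be sketched as a one-dimensional root search on the score equation for the mean. The assumptions here are that $\lambda$ and $\sigma$ are held at the process estimates 0.106 and 0.657, that the score equation is the derivative of (4-7) with respect to $\mu$, and that the bracket of three units either side of the data is wide enough; the function names are hypothetical.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import norm

LAM, SIGMA, D = 0.106, 0.657, 8.0   # process estimates and detection limit from the text


def boxcox(x, lam=LAM):
    return (np.asarray(x, dtype=float) ** lam - 1.0) / lam


def sample_ml_mean(observed, n_censored, sigma=SIGMA):
    """Solve the score equation for mu with lambda and sigma held fixed."""
    y = boxcox(observed)
    d = float(boxcox(D))

    def score(mu):
        z = (d - mu) / sigma
        # phi(z)/Phi(z), computed on the log scale for numerical stability
        ratio = np.exp(norm.logpdf(z) - norm.logcdf(z))
        return float(np.sum(y - mu) - n_censored * sigma * ratio)

    return brentq(score, y.min() - 3.0, y.max() + 3.0)


m1 = sample_ml_mean([29, 30, 12, 15, 28], 0)   # sample 1: no censored reading
m2 = sample_ml_mean([20, 24, 8, 16], 1)        # sample 2: one BDL reading
print(m1, m2)
```

For a sample without censored readings the equation reduces to the plain average of the transformed values. Under these assumptions the first two samples should come out close to the tabulated 3.625 and 2.967, up to the rounding of $\lambda$ and $\sigma$.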
We set the control limits at the same distances from the center line, measured in local standard errors of the mean, as the 99.73% confidence limits for $\mu$ are from the process mean. The upper and lower confidence limits (4-13) lie at the following distances, in terms of $LSE\mu$, from the process mean:

$$k_{u\mu} := \frac{U_\mu - \mu}{LSE\mu} = 3.049 \qquad k_{l\mu} := \frac{\mu - L_\mu}{LSE\mu} = 3.169 \tag{4-18}$$

Control limits should be independent of the sample values and be defined as a function of the process parameters and the numbers of observed and censored units. This can be accomplished by computing the Expected Information Matrix (EIM), which is the expected value of the local information matrix (4-9), and then calculating the expected standard error of the mean ($ESE\mu$). The expected values depend only on the numbers of observations above and below the detection limit ($n_u$ and $n_l$ respectively) and must be calculated for each combination that occurs. Different upper and lower control limits are therefore needed for samples with different numbers of censored units. The calculations for samples with 4 observed ($n_u = 4$) and 1 censored unit ($n_l = 1$) are as follows:

$$d := \frac{D^{\lambda}-1}{\lambda} \qquad s1 := 1 - \mathrm{erf}\left[\frac{\sqrt{2}\,(-d+\mu)}{2\sigma}\right] \qquad s2 := \exp\left[-\frac{(\mu-d)^2}{2\sigma^2}\right] \qquad n_u := 4 \qquad n_l := 1 \tag{4-19}$$

The EIM is assembled from these quantities in the same way as the LIM, with the data-dependent sums replaced by their expected values.

$$EVM := EIM^{-1} \qquad ESE\mu := \sqrt{EVM_{0,0}} = 0.316 \qquad ESE\sigma := \sqrt{EVM_{1,1}} = 0.196 \qquad ESE\mu\sigma := \sqrt{EVM_{0,1}}$$

The upper and lower control limits for the chart of sample ML estimates of means are:

$$UCL_\mu := \mu + k_{u\mu}\, ESE\mu = 4.02 \qquad LCL_\mu := \mu - k_{l\mu}\, ESE\mu = 2.052 \tag{4-20}$$

Upper and lower control limits for all 16 samples are shown in Table 4-2 and the control chart is plotted in Figure 4-5. In this case, it appears that the process mean is in control.

Table 4-2. Control limits for the sample ML estimates of mean.
Sample Number   Sample ML estimate of mean   n_l   n_u   UCL     LCL
 1              3.625                        0     5     3.951   2.124
 2              2.967                        1     4     4.02    2.052
 3              3.103                        1     4     4.02    2.052
 4              3.258                        0     5     3.951   2.124
 5              2.758                        1     4     4.02    2.052
 6              3.228                        0     5     3.951   2.124
 7              2.692                        2     3     4.22    1.844
 8              2.733                        1     4     4.02    2.052
 9              3.113                        0     5     3.951   2.124
10              3.515                        0     5     3.951   2.124
11              3.054                        1     4     4.02    2.052
12              2.83                         0     5     3.951   2.124
13              2.617                        1     4     4.02    2.052
14              3.529                        0     5     3.951   2.124
15              2.591                        2     3     4.22    1.844
16              3.146                        0     5     3.951   2.124

Figure 4-5. Control chart for the sample ML estimates of mean.

Control chart for sample ML estimates of standard deviations

For this chart, we also set the control limits at the same distances from the center line as the 99.73% confidence limits for $\sigma$ are from the process standard deviation. The upper and lower confidence limits (4-16) lie at the following distances, in terms of $LSE\sigma$, from the process standard deviation:

$$k_{u\sigma} := \frac{U_\sigma - \sigma}{LSE\sigma} = 3.796 \qquad k_{l\sigma} := \frac{\sigma - L_\sigma}{LSE\sigma} = 2.458 \tag{4-21}$$

Control chart limits for the sample ML estimates of standard deviations, for samples with 4 observed ($n_u = 4$) and 1 censored unit ($n_l = 1$), are:

$$UCL_\sigma := \sigma + k_{u\sigma}\, ESE\sigma = 1.4 \qquad LCL_\sigma := \sigma - k_{l\sigma}\, ESE\sigma = 0.176 \tag{4-22}$$

We use the ML estimate $\sigma = 0.657$ of the transformed data as the center line on the control chart. Then we calculate the individual sample ML estimate of the standard deviation for each sample. For example, the calculations for the first sample are:

$$n_l := 0 \qquad n_u := 5 \qquad i := 1..n_u \qquad s := \mathrm{root}\left[\frac{\partial}{\partial s} LL(\lambda, \mu, s),\ s\right] \qquad s = 0.776 \tag{4-23}$$

Sample ML estimates of the standard deviation for all 16 samples, along with their control limits, are shown in Table 4-3. The control chart for the sample ML estimates of standard deviations is shown in Figure 4-6. In this case, it appears that the process standard deviation is in control.

Table 4-3. Control limits for sample ML estimates of standard deviations.

Sample Number   Sample ML estimate of standard deviation   n_l   n_u   UCL     LCL
 1              0.776                                      0     5     1.446   0.146
 2              0.727                                      1     4     1.400   0.176
 3              0.662                                      1     4     1.400   0.176
 4              0.537                                      0     5     1.446   0.146
 5              0.53                                       1     4     1.400   0.176
 6              0.588                                      0     5     1.446   0.146
 7              0.909                                      2     3     1.459   0.138
 8              0.775                                      1     4     1.400   0.176
 9              0.357                                      0     5     1.446   0.146
10              0.969                                      0     5     1.446   0.146
11              0.744                                      1     4     1.400   0.176
12              0.305                                      0     5     1.446   0.146
13              0.624                                      1     4     1.400   0.176
14              0.708                                      0     5     1.446   0.146
15              0.851                                      2     3     1.459   0.138
16              0.295                                      0     5     1.446   0.146

Figure 4-6. Control chart for the sample ML estimates of standard deviation.

Control chart in the original scale of measurements

Working with transformed data is difficult, since the units are not meaningful to the production worker who uses the control chart to monitor his or her work. We should therefore construct a control chart in the original scale of measurements and specify its control limits in the units that are natural for the product. Following the analysis given in the previous chapter, the process mean in the original scale can be calculated from the ML estimates of the mean and variance in the transformed scale:

$$\mu_{os} := \int (\lambda y + 1)^{1/\lambda}\, \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{y-\mu}{\sigma}\right)^2\right] dy \qquad \mu_{os} = 15.701 \tag{4-24}$$

where the integral is taken over the range of the transformed variable. The process mean in the original scale of measurement, $\mu_{os} = 15.701$, becomes the center line of the control chart in the original scale. For each sample, we use the ML estimate of the mean from that sample to find the expected value of that sample mean in the original scale. For example, the first sample estimate of the mean in the original scale is:

$$u := 3.625 \qquad \mu_{os} := \int (\lambda y + 1)^{1/\lambda}\, \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{y-u}{\sigma}\right)^2\right] dy \qquad \mu_{os} = 23.698 \tag{4-25}$$

The sample means in the original scale for all 16 samples are listed in Table 4-4.
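The back-transformation in (4-24) and (4-25) is a one-dimensional integral that can be evaluated numerically. The sketch below assumes that integrating over ten standard deviations either side of the mean (clipped so that $\lambda y + 1$ stays positive) is an adequate stand-in for the integration limits, which the text does not state; the function name is hypothetical.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

LAM = 0.106  # transformation parameter reported in (4-8)


def mean_original_scale(mu, sigma, lam=LAM):
    """E[(lam*Y + 1)**(1/lam)] for Y ~ Normal(mu, sigma), by numerical integration."""
    lo, hi = mu - 10.0 * sigma, mu + 10.0 * sigma   # assumed integration range
    if lam > 0:
        lo = max(lo, -1.0 / lam)    # keep lam*y + 1 >= 0
    elif lam < 0:
        hi = min(hi, -1.0 / lam)

    def integrand(y):
        return (lam * y + 1.0) ** (1.0 / lam) * norm.pdf(y, loc=mu, scale=sigma)

    value, _abs_err = quad(integrand, lo, hi)
    return value


process_mean_os = mean_original_scale(3.055, 0.657)   # center line, cf. (4-24)
sample1_mean_os = mean_original_scale(3.625, 0.657)   # first sample, cf. (4-25)
print(process_mean_os, sample1_mean_os)
```

Because the inverse transformation is convex for this $\lambda$, the mean in the original scale is larger than the back-transformed value of $\mu$ itself, which is why the integral is needed rather than a direct inversion.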
The Local Variance of the estimate of the mean in the original scale ($LV\mu_{os}$) and its Local Standard Error ($LSE\mu_{os}$) can be found from the "error propagation formula", evaluated at the ML estimates of the process mean and standard deviation:

$$\mu_{os}(\mu,\sigma) := \int (\lambda y + 1)^{1/\lambda}\, \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{y-\mu}{\sigma}\right)^2\right] dy$$

$$LV\mu_{os} := \left[\frac{\partial}{\partial\mu}\mu_{os}(\mu,\sigma)\right]^2 LSE\mu^2 + \left[\frac{\partial}{\partial\sigma}\mu_{os}(\mu,\sigma)\right]^2 LSE\sigma^2 + 2\left[\frac{\partial}{\partial\mu}\mu_{os}(\mu,\sigma)\right]\left[\frac{\partial}{\partial\sigma}\mu_{os}(\mu,\sigma)\right] LSE\mu\sigma^2 \tag{4-26}$$

$$LV\mu_{os} = 0.794 \qquad LSE\mu_{os} := \sqrt{LV\mu_{os}} \qquad LSE\mu_{os} = 0.891$$

Using the likelihood ratio method, the 99.73% confidence interval on the estimate of the mean in the original scale can be found from the following condition:

$$-2 \ln\left[\frac{L\big(I(\mu_{os}, J(\mu_{os})),\ J(\mu_{os})\big)}{L(\mu,\sigma)}\right] \le \chi^2_{1,\,1-\alpha} \tag{4-27}$$

For the numerator of (4-27), both the process mean and the process standard deviation need to be expressed in terms of the mean in the original scale. This can be done by first defining the process mean in terms of the mean in the original scale and the scale parameter:

$$b := \sigma \qquad I(\mu_{os}, b) := \mathrm{root}\left[\int (\lambda y + 1)^{1/\lambda}\, \frac{1}{b\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{y-a}{b}\right)^2\right] dy - \mu_{os},\ a\right] \tag{4-28}$$

Substituting $I(\mu_{os}, b)$ for $\mu$ in the log-likelihood function (4-7) and differentiating with respect to the scale parameter gives the ML equation for the scale parameter. Solving this equation defines the scale parameter as a function of the mean in the original scale:

$$ML\sigma(\mu_{os}, b) := \frac{d}{db} LL\left[I(\mu_{os}, b),\ b\right] \qquad J(\mu_{os}) := \mathrm{root}\left[ML\sigma(\mu_{os}, b),\ b\right] \tag{4-29}$$

The LR statistic as a function of the mean in the original scale can then be expressed as:

$$LR(\mu_{os}) := 2\, LL(\mu,\sigma) - 2\, LL\left[I(\mu_{os}, J(\mu_{os})),\ J(\mu_{os})\right] \tag{4-30}$$

The upper and lower confidence limits of the mean in the original scale are those values of $\mu_{os}$ for which (4-30) equals $\chi^2_{1,\,0.9973} = 9.009$:

$$\mu_{os} := 17 \qquad U_{\mu os} := \mathrm{root}\left[LR(\mu_{os}) - 9.009,\ \mu_{os}\right] \qquad U_{\mu os} = 18.942$$

$$\mu_{os} := 12 \qquad L_{\mu os} := \mathrm{root}\left[LR(\mu_{os}) - 9.009,\ \mu_{os}\right] \qquad L_{\mu os} = 13.295 \tag{4-31}$$

The LR curve and the chi-square line at 9.009 are plotted in Figure 4-7.

Figure 4-7. LR curve as a function of mean in the original scale and the line of chi-square.

The distances of the upper and lower confidence limits from the mean in the original scale ($\mu_{os}$), in terms of the local standard error ($LSE\mu_{os}$), are:

$$k_{u\mu os} := \frac{U_{\mu os} - \mu_{os}}{LSE\mu_{os}} = 3.636 \qquad k_{l\mu os} := \frac{\mu_{os} - L_{\mu os}}{LSE\mu_{os}} = 2.699 \tag{4-32}$$

These distances can be used in calculating the control limits on the chart for the sample ML estimates of the mean in the original scale. The expected standard error of the estimate of the mean in the original scale can be found from the "error propagation formula", using the expected standard errors of the estimates of the mean and standard deviation in the transformed scale. Since this expected standard error is a function of the numbers of observed and censored data, control limits must be calculated separately for each combination. The calculations for the case of 4 observed ($n_u := 4$) and 1 censored ($n_l := 1$) data in a sample of size five are as follows:

$$EV\mu_{os} := \left[\frac{\partial}{\partial\mu}\mu_{os}(\mu,\sigma)\right]^2 ESE\mu^2 + \left[\frac{\partial}{\partial\sigma}\mu_{os}(\mu,\sigma)\right]^2 ESE\sigma^2 + 2\left[\frac{\partial}{\partial\mu}\mu_{os}(\mu,\sigma)\right]\left[\frac{\partial}{\partial\sigma}\mu_{os}(\mu,\sigma)\right] ESE\mu\sigma^2 \tag{4-33}$$

$$EV\mu_{os} = 16.766 \qquad ESE\mu_{os} := \sqrt{EV\mu_{os}} \qquad ESE\mu_{os} = 4.095$$

$$UCL_{\mu os} := \mu_{os} + k_{u\mu os}\, ESE\mu_{os} \qquad UCL_{\mu os} = 29.219$$

$$LCL_{\mu os} := \mu_{os} - k_{l\mu os}\, ESE\mu_{os} \qquad LCL_{\mu os} = 5.669$$

The upper and lower control limits in the original scale for all 16 samples are shown in Table 4-4. The control chart for the sample estimates of the mean in the original scale is shown in Figure 4-8.

Table 4-4. Control limits for the sample estimates of mean in the original scale.

Sample Number   Sample ML estimate of mean (original scale)   n_l   n_u   UCL      LCL
 1              23.693                                        0     5     28.381   6.290
 2              14.711                                        1     4     29.219   5.669
 3              16.264                                        1     4     29.219   5.669
 4              18.221                                        0     5     28.381   6.290
 5              12.574                                        1     4     29.219   5.669
 6              17.820                                        0     5     28.381   6.290
 7              11.963                                        2     3     31.909   3.672
 8              12.343                                        1     4     29.219   5.669
 9              16.389                                        0     5     28.381   6.290
10              21.921                                        0     5     28.381   6.290
11              15.686                                        1     4     29.219   5.669
12              13.273                                        0     5     26.368   5.034
13              11.298                                        1     4     29.219   5.669
14              22.131                                        0     5     28.381   6.290
15              11.081                                        2     3     31.909   3.672
16              16.784                                        0     5     28.381   6.290

Figure 4-8. Control chart for the ML estimates of mean in the original scale.
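The "error propagation formula" used in (4-26) and (4-33) is a first-order (delta-method) variance approximation. The sketch below shows the mechanics only and rests on assumptions: the partial derivatives are taken by central differences, the integration range is ten standard deviations either side of the mean, and the covariance between the two estimates is set to zero because its value is not reported in the text. The result should therefore be of the same order as the reported $LSE\mu_{os} = 0.891$ without matching it exactly. All names are hypothetical.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

LAM = 0.106


def mean_original_scale(mu, sigma, lam=LAM):
    """Mean in the original scale implied by (mu, sigma) on the transformed scale."""
    lo = max(mu - 10.0 * sigma, -1.0 / lam)   # assumed range; valid for lam > 0
    hi = mu + 10.0 * sigma
    value, _ = quad(lambda y: (lam * y + 1.0) ** (1.0 / lam) * norm.pdf(y, mu, sigma), lo, hi)
    return value


def delta_method_se(mu, sigma, cov, h=1e-4):
    """sqrt(g' V g), with g the gradient of mean_original_scale at (mu, sigma)."""
    d_mu = (mean_original_scale(mu + h, sigma) - mean_original_scale(mu - h, sigma)) / (2 * h)
    d_sigma = (mean_original_scale(mu, sigma + h) - mean_original_scale(mu, sigma - h)) / (2 * h)
    grad = np.array([d_mu, d_sigma])
    return float(np.sqrt(grad @ cov @ grad))


# Local standard errors from (4-10); the off-diagonal term is an assumption (zero).
cov = np.diag([0.075 ** 2, 0.057 ** 2])
se_os = delta_method_se(3.055, 0.657, cov)
print(se_os)
```

Passing the expected variance matrix for a given $(n_u, n_l)$ instead of the local one gives the expected standard error in the same way.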
Summary

In this chapter, an approach for developing control charts for censored non-Normal data was presented. The following steps summarize the control chart development procedure:

1. Obtain k samples of size n each.
2. Determine whether or not the underlying distribution is Normal using the probability plot and, if it is not, transform the data using the Box and Cox procedure.
3. Find the ML estimates of $\lambda$, $\mu$, and $\sigma$ by simultaneously solving their three ML equations.
4. Calculate the local standard errors of the parameter estimates of $\mu$ and $\sigma$.
5. Find the 99.73% confidence interval for the parameter $\mu$ using the likelihood ratio method, and determine the distances of the upper and lower confidence limits in terms of the local standard error of $\mu$ ($k_{u\mu}$ and $k_{l\mu}$ respectively).
6. Calculate the 99.73% confidence interval for the parameter $\sigma$ using the likelihood ratio method, and determine the distances of the upper and lower confidence limits in terms of the local standard error of $\sigma$ ($k_{u\sigma}$ and $k_{l\sigma}$ respectively).
7. Calculate the expected standard errors for the parameters $\mu$ and $\sigma$ ($ESE\mu$ and $ESE\sigma$ respectively) for the different numbers of observed and censored data.
8. Find the ML estimates of the sample means and standard deviations for the k samples.
9. Using the ML estimate of $\mu$, $ESE\mu$, $k_{u\mu}$ and $k_{l\mu}$ found in prior steps, compute the control limits $UCL = \mu + k_{u\mu}\, ESE\mu$ and $LCL = \mu - k_{l\mu}\, ESE\mu$ and plot the sample ML estimates of means.
10. Develop the control chart for sample ML estimates of standard deviations: using the ML estimate of $\sigma$, $ESE\sigma$, $k_{u\sigma}$ and $k_{l\sigma}$ found in prior steps, compute the control limits $UCL = \sigma + k_{u\sigma}\, ESE\sigma$ and $LCL = \sigma - k_{l\sigma}\, ESE\sigma$ and plot the sample ML estimates of standard deviations.
11. Calculate the process mean in the original scale of measurements ($\mu_{os}$) and define it as a function of the mean and standard deviation in the transformed scale.
12. Find the 99.73% confidence interval on the process mean in the original scale using the likelihood ratio method.
13. Determine the local standard error of the mean in the original scale and find the distances of the upper and lower confidence limits in terms of the local standard error ($k_u$ and $k_l$ respectively).
14. Calculate the expected standard error of the mean in the original scale ($ESE\mu_{os}$) for the different numbers of observed and censored data.
15. Finally, using $\mu_{os}$, $ESE\mu_{os}$, $k_u$ and $k_l$ found in prior steps, compute the control limits $UCL = \mu_{os} + k_u \cdot ESE\mu_{os}$ and $LCL = \mu_{os} - k_l \cdot ESE\mu_{os}$ and plot the sample ML estimates of means in the original scale.

CHAPTER 5

CONTROL CHARTS WITH TRUNCATED DATA

Overview

When sample selection is possible over only a partial range of a variable, the sample obtained is said to be truncated. Although all sample specimens are available in this case, certain population values are entirely excluded from investigation. This situation often occurs in manufacturing when samples are selected from production processes that have previously been screened to remove items that are above and/or below acceptable specification limits. Samples taken from a screened process are still needed to control the process mean and standard deviation. Inspecting all items and removing all nonconforming units may prevent defective products from reaching the customers, but in no way prevents them from being manufactured again. The use of control charts enables the quality control engineer to detect undesirable changes in the process mean and variation well before these changes result in the production of nonconforming products. Therefore, even if a process is screened and no defective unit is allowed beyond the inspection point, appropriate control charts are still needed to maintain control of the process parameters and to prevent the costly production of defective products.

Samples from a truncated non-Normal distribution are shown in Table 5-1.
We use these data to develop a procedure for establishing control limits for the sample means and standard deviations charts.

Table 5-1. Data from a Truncated Non-Normal Distribution.

Sample Number    Sample Values (mm)
 1               28   15   12   30   29
 2               16   13    8   24   20
 3               11   29   21   20   22
 4               10   22   19   25   11
 5               12   19   14   13   10
 6               14   32   18    9   14
 7               25   12   13   13   36
 8                8   22    9   26   14
 9               13   14   17   10   22
10               39   35   11   23    8
11               35   33   14   13   14
12               13   11   13   14    9
13               10    8   11   13   17
14               13   27   14   35   18
15               33   14   14   20   10
16               21   12   16   12   16

Source: Frontiers in Statistical Quality Control, 1992, p. 221 (with some modifications).

The Normality assumption

Checking the Normality assumption is an important first step in the analysis of any data for the purpose of quality control. If the data are not Normally distributed and a known distribution has not been identified for the underlying distribution, then estimating the population parameters is not possible. We can check the Normality assumption of a truncated distribution with a Normal probability plot.

Suppose 16 samples of size 5 have been taken from a production process which has already been screened, so that all items below the lower specification limit of 8 have been removed. The likelihood function of an observed sample from a truncated Normal distribution, with truncation point $T$, can be written as:

$$L(\mu,\sigma) := \frac{1}{(2\pi)^{n/2}\,\sigma^{n}} \prod_{i=1}^{n} \exp\left[-\frac{(x_i-\mu)^2}{2\sigma^2}\right] \cdot \left[\int_{(T-\mu)/\sigma}^{\infty} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{t^2}{2}\right) dt\right]^{-n} \tag{5-1}$$

It is usually more convenient to work with the log-likelihood function:

$$LL(\mu,\sigma) := -\frac{n}{2}\ln(2\pi) - n\ln(\sigma) - \sum_{i=1}^{n}\frac{(x_i-\mu)^2}{2\sigma^2} - n\ln\left[\int_{(T-\mu)/\sigma}^{\infty} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{t^2}{2}\right) dt\right] \tag{5-2}$$

Differentiating (5-2) with respect to $\mu$ and $\sigma$ and equating to zero gives two maximum likelihood estimating equations. The simultaneous solution of these equations gives the ML estimates of $\mu$ and $\sigma$:

$$n := 80 \qquad i := 1..n \qquad T := 8 \qquad x_i := \mathrm{READ}(data)$$

Guess

$$\mu := 17 \qquad \sigma := 7$$

Given

$$\frac{\partial}{\partial\mu} LL(\mu,\sigma) = 0 \qquad \frac{\partial}{\partial\sigma} LL(\mu,\sigma) = 0$$

$$\begin{pmatrix}\mu\\ \sigma\end{pmatrix} := \mathrm{Find}(\mu,\sigma) = \begin{pmatrix}17.640\\ 7.912\end{pmatrix} \tag{5-3}$$

Using the plotting position (3-1), the ordinates of the data and model plots are:

$$x := \mathrm{sort}(x) \qquad p_i := \frac{i - 0.375}{n + 0.25} \qquad t := 0 \qquad z_i := \mathrm{root}(\mathrm{cnorm}(t) - p_i,\ t) \qquad data_i := 0.5 + 0.1975\, z_i \tag{5-4}$$

with the model ordinates $model_i$ computed on the same scale from the fitted distribution. The probability plot is shown in Figure 5-1. The plot of the original truncated data is clearly non-linear, so the Normality assumption is rejected.

Figure 5-1. Probability plot for the original data.

We use the Box and Cox transformation (3-7) to produce data that conform best to the Normal distribution. The log-likelihood function (5-2) for the transformed data becomes:

$$LL(\lambda,\mu,\sigma) := -\frac{n}{2}\ln(2\pi) - n\ln(\sigma) - \sum_{i=1}^{n}\frac{\left[\frac{x_i^{\lambda}-1}{\lambda}-\mu\right]^2}{2\sigma^2} - n\ln\left[\int_{\left(\frac{T^{\lambda}-1}{\lambda}-\mu\right)/\sigma}^{\infty} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{t^2}{2}\right) dt\right] + (\lambda-1)\sum_{i=1}^{n}\ln(x_i) \tag{5-5}$$

Differentiating (5-5) with respect to $\lambda$, $\mu$, and $\sigma$ and equating to zero gives three maximum likelihood estimating equations. The simultaneous solution of these equations gives the ML estimates of the transformation parameter $\lambda$, the process mean $\mu$, and the process standard deviation $\sigma$.

Define:

$$\kappa(\lambda,\mu,\sigma) := \frac{-T^{\lambda}+1+\mu\lambda}{\lambda\,\sigma} \qquad \varepsilon(\lambda,\mu,\sigma) := \mathrm{erf}\left[\frac{\sqrt{2}\,(-T^{\lambda}+1+\mu\lambda)}{2\,\lambda\,\sigma}\right] \qquad \theta(\lambda,\mu,\sigma) := \exp\left[-\frac{(-T^{\lambda}+1+\mu\lambda)^2}{2\,\lambda^2\sigma^2}\right]$$

Given

$$\frac{\partial}{\partial\lambda} LL(\lambda,\mu,\sigma) = 0 \qquad \frac{\partial}{\partial\mu} LL(\lambda,\mu,\sigma) = 0 \qquad \frac{\partial}{\partial\sigma} LL(\lambda,\mu,\sigma) = 0$$

$$\begin{pmatrix}\lambda\\ \mu\\ \sigma\end{pmatrix} := \mathrm{Find}(\lambda,\mu,\sigma) = \begin{pmatrix}-0.445\\ 1.584\\ 0.119\end{pmatrix} \tag{5-6}$$

In the worksheet the three estimating equations are written out explicitly in terms of $\kappa$, $\varepsilon$ and $\theta$. The probability plot for the transformed data is shown in Figure 5-2. The transformed data are now approximately Normally distributed.

Figure 5-2. Probability plot for the transformed data.
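The truncated-data log-likelihood (5-5) can be coded directly and handed to a general-purpose optimizer. The sketch below is a hedged illustration, not the worksheet: it uses a Nelder-Mead search started at the reported estimates rather than solving the score equations, the function names are hypothetical, and the start value for $\lambda$ is taken as negative ($-0.445$), which is the sign that places the transformed data on the scale of roughly 1.3 to 1.9 seen in the transformed-data plot. Whether the search stays near the reported solution has not been verified.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

# All 80 readings of Table 5-1 (process screened at T = 8).
data = np.array([
    28, 15, 12, 30, 29, 16, 13, 8, 24, 20, 11, 29, 21, 20, 22, 10, 22, 19, 25, 11,
    12, 19, 14, 13, 10, 14, 32, 18, 9, 14, 25, 12, 13, 13, 36, 8, 22, 9, 26, 14,
    13, 14, 17, 10, 22, 39, 35, 11, 23, 8, 35, 33, 14, 13, 14, 13, 11, 13, 14, 9,
    10, 8, 11, 13, 17, 13, 27, 14, 35, 18, 33, 14, 14, 20, 10, 21, 12, 16, 12, 16,
], dtype=float)
T = 8.0


def boxcox(x, lam):
    """Box-Cox transform (x**lam - 1)/lam, with the log limit at lam = 0."""
    return np.log(x) if abs(lam) < 1e-8 else (x ** lam - 1.0) / lam


def log_lik(params):
    """Log-likelihood (5-5) for params = (lambda, mu, sigma), left truncation at T."""
    lam, mu, sigma = params
    if sigma <= 0:
        return -np.inf
    y = boxcox(data, lam)
    t = boxcox(T, lam)
    n = data.size
    return (-0.5 * n * np.log(2 * np.pi) - n * np.log(sigma)
            - np.sum((y - mu) ** 2) / (2 * sigma ** 2)
            - n * norm.logsf((t - mu) / sigma)       # log of the upper-tail integral
            + (lam - 1.0) * np.sum(np.log(data)))


start = np.array([-0.445, 1.584, 0.119])   # reported estimates, sign of lambda assumed
fit = minimize(lambda p: -log_lik(p), start, method="Nelder-Mead")
print("start log-likelihood:", log_lik(start))
print("search result:", fit.x, log_lik(fit.x))
```

Truncated likelihoods can be flat in some directions, so starting values close to a plausible solution matter even more here than in the censored case.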
The local standard errors of the estimates of $\mu$ and $\sigma$ can be found from the Local Information Matrix (LIM) previously defined in (4-9):

Define:

$$s1 := \frac{1+\mu\lambda-T^{\lambda}}{\lambda\,\sigma} \qquad s2 := \mathrm{erf}\left[\frac{\sqrt{2}\,(1+\mu\lambda-T^{\lambda})}{2\,\lambda\,\sigma}\right] \qquad s3 := \exp\left[-\frac{(1+\mu\lambda-T^{\lambda})^2}{2\,\lambda^2\sigma^2}\right]$$

$$s4 := \sum_{i=1}^{n}\left[\frac{x_i^{\lambda}-1}{\lambda}-\mu\right] \qquad s5 := \sum_{i=1}^{n}\left[\frac{x_i^{\lambda}-1}{\lambda}-\mu\right]^2 \tag{5-7}$$

The elements of the LIM can be written as:

$$LIM_{0,0} := \frac{n}{\sigma^2} - \frac{\sqrt{2}\,n\,s1\,s3}{\sqrt{\pi}\,\sigma^2(1+s2)} - \frac{2\,n\,(s3)^2}{\pi\,\sigma^2(1+s2)^2}$$

$$LIM_{0,1} = LIM_{1,0} := \frac{2\,s4}{\sigma^3} - \frac{\sqrt{2}\,n\,s3\,(1-s1^2)}{\sqrt{\pi}\,\sigma^2(1+s2)} + \frac{2\,n\,s1\,(s3)^2}{\pi\,\sigma^2(1+s2)^2}$$

$$LIM_{1,1} := \frac{3\,s5}{\sigma^4} - \frac{n}{\sigma^2} - \frac{\sqrt{2}\,n\,s1\,s3\,(s1^2-2)}{\sqrt{\pi}\,\sigma^2(1+s2)} - \frac{2\,n\,s1^2\,(s3)^2}{\pi\,\sigma^2(1+s2)^2}$$

The matrix of minimum variance bounds is then the inverse of the LIM, evaluated at the maximum likelihood estimates obtained in (5-6). The square roots of the variances give the standard errors of the sampling distributions of the parameter estimates of $\mu$ and $\sigma$:

$$LVM := LIM^{-1} \qquad LSE\mu := \sqrt{LVM_{0,0}} = 0.013 \qquad LSE\sigma := \sqrt{LVM_{1,1}} = 0.009 \qquad LSE\mu\sigma := \sqrt{LVM_{0,1}} \tag{5-8}$$

Confidence intervals for the process parameters

Having found the ML estimates of the process mean and standard deviation, we can compute approximate confidence intervals on these parameters using the likelihood ratio statistic. Substituting $a$ for $\mu$, the likelihood ratio statistic as a function of the process mean can be written as:

$$LR(a) := 2\left[LL(\mu,\sigma) - LL(a, s(a))\right] \tag{5-9}$$

where $s(a)$ is the process standard deviation as a function of $\mu$, obtained by solving the maximum likelihood equation for the process standard deviation in (5-6) for $\sigma$:

$$b := \sigma \qquad s(a) := \mathrm{root}\left[\frac{\partial}{\partial b} LL(\lambda, a, b),\ b\right] \tag{5-10}$$

The upper and lower 99.73 percent confidence limits for the process mean are those values of the parameter $\mu$ that make the likelihood ratio (5-9) equal to the chi-square statistic $\chi^2_{1,\,0.9973} = 9.009$:

$$a := 1.7 \qquad U_\mu := \mathrm{root}(LR(a) - 9.009,\ a) \qquad U_\mu = 1.626$$

$$a := 1.5 \qquad L_\mu := \mathrm{root}(LR(a) - 9.009,\ a) \qquad L_\mu = 1.543 \tag{5-11}$$

The likelihood ratio curve as a function of the process mean and the chi-square line at 9.009 are shown in Figure 5-3.

Figure 5-3. Likelihood ratio curve as a function of the process mean.

A similar procedure can be used to find a confidence interval on the process parameter $\sigma$. In this case, the confidence interval endpoints are obtained from the following likelihood ratio:

$$LR(b) := 2\left[LL(\mu,\sigma) - LL(g(b), b)\right] \tag{5-12}$$

where $g(b)$ is the process mean as a function of the process standard deviation, defined as in (4-15) by solving the maximum likelihood equation for the process mean in (5-6) for $\mu$:

$$a := \mu \qquad g(b) := \mathrm{root}\left[\frac{\partial}{\partial a} LL(\lambda, a, b),\ a\right] \tag{5-13}$$

The confidence limit endpoints for the parameter $\sigma$ are those values of $\sigma$ that make the likelihood ratio (5-12) equal to the chi-square statistic $\chi^2_{1,\,0.9973} = 9.009$:

$$b := 0.2 \qquad U_\sigma := \mathrm{root}(LR(b) - 9.009,\ b) \qquad U_\sigma = 0.155$$

$$b := 0.08 \qquad L_\sigma := \mathrm{root}(LR(b) - 9.009,\ b) \qquad L_\sigma = 0.096 \tag{5-14}$$

The likelihood ratio curve as a function of the process standard deviation and the chi-square line at 9.009 are shown in Figure 5-4.

Figure 5-4. Likelihood ratio curve as a function of the process standard deviation.
Control chart for the sample ML estimates of means

From (5-11), it can be seen that the upper and lower confidence limits lie at 3.086 and 3.091 local standard errors ($LSE\mu$), respectively, from the process mean:

$$k_{u\mu} := \frac{U_\mu - \mu}{LSE\mu} = 3.086 \qquad k_{l\mu} := \frac{\mu - L_\mu}{LSE\mu} = 3.091 \tag{5-15}$$

Control limits for the sample means chart should fall at the same distances from the process mean, but measured with the expected standard error of the mean for the small samples that are plotted on the control chart. The expected standard errors are obtained from the expected information matrix:

Define:

$$s1 := \frac{1+\mu\lambda-T^{\lambda}}{\lambda\,\sigma} \qquad s2 := \mathrm{erf}\left[\frac{\sqrt{2}\,(1+\mu\lambda-T^{\lambda})}{2\,\lambda\,\sigma}\right] \qquad s3 := \exp\left[-\frac{(1+\mu\lambda-T^{\lambda})^2}{2\,\lambda^2\sigma^2}\right] \tag{5-16}$$

The EIM has the same form as the LIM in (5-7), with the sample size set to $n = 5$ and the data-dependent sums replaced by their expected values, so that its elements depend only on $n$, $\sigma$, $s1$, $s2$ and $s3$.

$$EVM := EIM^{-1} \qquad ESE\mu := \sqrt{EVM_{0,0}} = 0.055 \qquad ESE\sigma := \sqrt{EVM_{1,1}} = 0.039 \qquad ESE\mu\sigma := \sqrt{EVM_{0,1}}$$

The upper and lower control limits for the chart of sample estimates of means are:

$$UCL_\mu := \mu + k_{u\mu}\, ESE\mu = 1.76 \qquad LCL_\mu := \mu - k_{l\mu}\, ESE\mu = 1.41 \tag{5-17}$$

The individual sample ML estimates of the mean, from all samples of size 5, are then calculated and plotted on the control chart. For example, the calculations for the first sample are as follows:

$$n := 5 \qquad i := 1..n \qquad u := \mathrm{root}\left[\frac{\partial}{\partial u} LL(\lambda, u, \sigma),\ u\right] \qquad u = 1.66 \tag{5-18}$$

The control chart for the sample ML estimates of the mean is plotted in Figure 5-5.

Figure 5-5. Control chart for the sample estimates of mean.
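The per-sample calculation in (5-18) amounts to solving the score equation for the mean with $\lambda$ and $\sigma$ held at their process estimates. The sketch below assumes the score equation that follows from differentiating (5-5) with respect to $\mu$, takes $\lambda = -0.445$ (negative sign assumed, as implied by the transformed scale), and uses a bracket of one unit either side of the data; the function names are hypothetical.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import norm

LAM, SIGMA, T = -0.445, 0.119, 8.0   # process estimates (sign of lambda assumed)


def boxcox(x, lam=LAM):
    return (np.asarray(x, dtype=float) ** lam - 1.0) / lam


def truncated_sample_mean(sample, sigma=SIGMA):
    """ML estimate of mu for one left-truncated sample, sigma and lambda fixed."""
    y = boxcox(sample)
    t = float(boxcox(T))

    def score(mu):
        kappa = (mu - t) / sigma
        # phi(kappa)/Phi(kappa): shift of the truncated mean above mu, in sigma units
        ratio = np.exp(norm.logpdf(kappa) - norm.logcdf(kappa))
        return float(np.mean(y) - mu - sigma * ratio)

    return brentq(score, y.min() - 1.0, y.max() + 1.0)


u1 = truncated_sample_mean([28, 15, 12, 30, 29])   # first sample of Table 5-1
print(u1)
```

The estimate sits slightly below the plain average of the transformed values, because truncation from below pushes the observed average upward. Under these assumptions the first sample should come out close to the reported 1.66.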
Control chart for the sample ML estimates of standard deviations

The upper and lower confidence limits for the ML estimate of $\sigma$ (5-14) lie at 3.72 and 2.496 local standard errors ($LSE\sigma$), respectively, from the process standard deviation:

$$k_{u\sigma} := \frac{U_\sigma - \sigma}{LSE\sigma} = 3.72 \qquad k_{l\sigma} := \frac{\sigma - L_\sigma}{LSE\sigma} = 2.496 \tag{5-19}$$

We set the control limits for the chart of sample estimates of standard deviations at the same distances from the process standard deviation, using the expected standard error of the estimate of the standard deviation for samples of size five obtained in (5-16):

$$UCL_\sigma := \sigma + k_{u\sigma} \cdot ESE\sigma = 0.265 \qquad LCL_\sigma := \sigma - k_{l\sigma} \cdot ESE\sigma = 0.022 \tag{5-20}$$

The individual sample ML estimates of the standard deviation, from all samples of size 5, are then calculated and plotted on the chart. The calculations for sample number 1 are as follows:

$$n := 5 \qquad i := 1..5 \qquad s := \mathrm{root}\left[\frac{\partial}{\partial s} LL(\lambda, \mu, s),\ s\right] \qquad s = 0.132 \tag{5-21}$$

The control chart for the sample estimates of standard deviations is shown in Figure 5-6.

Figure 5-6. Control chart for the sample estimates of standard deviations.

Control chart in the original scale of measurements

It is always a good idea to design the control chart in the original scale of measurement, which is more natural for the product. Following the procedure given for the case of censored data, we can re-express the control chart for sample estimates of the mean in the original scale. First we express the process mean in the original scale as a function of the mean and variance in the transformed scale:
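$$\mu_{os}(\mu,\sigma) := \int_{1.359} (\lambda y + 1)^{1/\lambda}\, \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{y-\mu}{\sigma}\right)^2\right] dy \qquad \mu_{os} := \mu_{os}(\mu,\sigma) = 17.598 \tag{5-22}$$

where the lower limit of integration, 1.359, is the truncation point in the transformed scale. The local standard error of the mean in the original scale can then be obtained from the "error propagation formula", evaluated at the maximum likelihood estimates of the process mean and standard deviation:

$$i := 0..1 \qquad j := 0..1 \qquad d_0 := \frac{\partial}{\partial\mu}\mu_{os}(\mu,\sigma) \qquad d_1 := \frac{\partial}{\partial\sigma}\mu_{os}(\mu,\sigma)$$

$$LV\mu_{os} := \sum_i \sum_j d_i\, d_j\, LVM_{i,j} \qquad LSE\mu_{os} := \sqrt{LV\mu_{os}} \qquad LSE\mu_{os} = 1.049 \tag{5-23}$$

Using the likelihood ratio method, we can find the 99.73% confidence interval on the process mean in the original scale. The likelihood ratio statistic as a function of the process mean in the original scale can be written as:

$$LR(\mu_{os}) := 2\left[LL(\mu,\sigma) - LL\big(I(\mu_{os}, s(\mu_{os})),\ s(\mu_{os})\big)\right] \tag{5-24}$$

where $s(\mu_{os})$ is the process standard deviation as a function of the mean in the original scale and $I(\mu_{os}, s(\mu_{os}))$ is the process mean in the transformed scale as a function of the process mean in the original scale and $s(\mu_{os})$. The calculations required to obtain these functions are as follows:

$$a := \mu \qquad b := \sigma \qquad I(\mu_{os}, b) := \mathrm{root}\left[\int_{1.359} (\lambda y + 1)^{1/\lambda}\, \frac{1}{b\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{y-a}{b}\right)^2\right] dy - \mu_{os},\ a\right]$$

$$ML(\mu_{os}, b) := \frac{d}{db} LL\left[I(\mu_{os}, b),\ b\right] \qquad s(\mu_{os}) := \mathrm{root}\left[ML(\mu_{os}, b),\ b\right] \tag{5-25}$$

The 99.73 percent confidence limits for the process mean in the original scale are those values of $\mu_{os}$ that make the likelihood ratio (5-24) equal to the chi-square statistic $\chi^2_{1,\,0.9973} = 9.009$:

$$U_{\mu os} := \mathrm{root}\left[LR(\mu_{os}) - 9.009,\ \mu_{os}\right] \qquad L_{\mu os} := \mathrm{root}\left[LR(\mu_{os}) - 9.009,\ \mu_{os}\right] \tag{5-26}$$

each solved from a guess value on the appropriate side of $\mu_{os}$. With $\mu_{os} = 17.598$, $LSE\mu_{os} = 1.049$ and the multipliers reported in (5-27), the limits are approximately $U_{\mu os} = 21.95$ and $L_{\mu os} = 14.90$. The likelihood ratio curve as a function of the process mean in the original scale and the chi-square line at 9.009 are shown in Figure 5-7.

Figure 5-7. Likelihood ratio curve as a function of the process mean in the original scale.

The confidence limits (5-26) fall at the following numbers of local standard errors of $\mu_{os}$ from the process mean in the original scale:

$$k_{u\mu os} := \frac{U_{\mu os} - \mu_{os}}{LSE\mu_{os}} = 4.149 \qquad k_{l\mu os} := \frac{\mu_{os} - L_{\mu os}}{LSE\mu_{os}} = 2.568 \tag{5-27}$$

We use these distances in calculating the control limits on the chart for the estimates of the mean in the original scale. What is needed is the expected standard error of the estimates of the mean in the original scale when samples are of size five.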
This estimate can be found from the "error propagation formula":

$$i := 0..1, \quad j := 0..1, \qquad d_0 := \frac{\partial}{\partial\mu}\mu_{os}(\mu,\sigma), \quad d_1 := \frac{\partial}{\partial\sigma}\mu_{os}(\mu,\sigma)$$

$$EV_{\mu os} := \sum_i \sum_j d_i\, d_j\, EVM_{i,j}, \qquad ESE_{\mu os} := \sqrt{EV_{\mu os}} = 4.032 \tag{5-28}$$

Control chart limits for the sample estimates of means in the original scale are:

$$UCL_{\mu os} := \mu_{os} + k_{u,\mu os} \cdot ESE_{\mu os} = 34.32, \qquad LCL_{\mu os} := \mu_{os} - k_{l,\mu os} \cdot ESE_{\mu os} = 7.243 \tag{5-29}$$

The estimate of the mean in the original scale as defined in (5-22) can be used to calculate the sample estimate of mean for each individual sample. For $j := 1..16$, the value $\mu_{os_j}$ is obtained by evaluating the integral (5-22) with the process mean replaced by the sample estimate $\mu_j$ (5-30). The sample estimates of mean ($\mu_j$) are given in (5-16). Values of $\mu_{os_j}$ are then plotted on the control chart. This chart is shown in Figure 5-9.

Figure 5-8. Control chart for the sample estimates of mean in the original scale (UCL = 34.32, CL = 17.598, LCL = 7.243; horizontal axis: sample number).

Summary

In this chapter, an approach for developing control charts for non-Normal truncated data was presented. The following steps summarize the control chart development procedure:

1. Obtain k samples of size n each.
2. Test the Normality assumption of the underlying distribution using a probability plot and, if the data are not Normally distributed, transform them using the Box and Cox procedure.
3. Find ML estimates of $\lambda$, $\mu$ and $\sigma$ by simultaneously solving their three ML equations.
4. Calculate the local standard errors for the parameter estimates $\mu$ and $\sigma$.
5. Taking $\alpha = 0.0027$, calculate the 99.73% confidence interval for the parameter $\mu$ using the likelihood ratio method, then determine the distances of the upper and lower confidence limits in terms of the local standard error of $\mu$ ($k_{u\mu}$ and $k_{l\mu}$ respectively).
6. Determine the 99.73% confidence interval for the parameter $\sigma$ using the likelihood ratio method, and determine the distances of the upper and lower confidence limits in terms of the local standard error of $\sigma$ ($k_{u\sigma}$ and $k_{l\sigma}$ respectively).
7. Calculate the expected standard errors for the parameters $\mu$ and $\sigma$ ($ESE_\mu$ and $ESE_\sigma$ respectively) for sample size n = 5.
8. Compute the ML estimate of mean for each sample, using the standard deviation estimated in step 3. Also, using the process mean estimated in step 3, calculate ML estimates of standard deviation for all samples.
9. Using the ML estimate of $\mu$, $ESE_\mu$, $k_{u\mu}$ and $k_{l\mu}$ found in prior steps, compute the control limits $UCL = \mu + k_{u\mu} ESE_\mu$ and $LCL = \mu - k_{l\mu} ESE_\mu$ and plot the sample ML estimates of means obtained in step 8.
10. Develop the control chart for sample ML estimates of standard deviations: using the ML estimate of $\sigma$, $ESE_\sigma$, $k_{u\sigma}$ and $k_{l\sigma}$ found in prior steps, compute the control limits $UCL = \sigma + k_{u\sigma} ESE_\sigma$ and $LCL = \sigma - k_{l\sigma} ESE_\sigma$ and plot the sample ML estimates of standard deviations calculated in step 8.
11. Calculate the process mean in the original scale of measurements ($\mu_{os}$) and its local standard error using the error propagation formula.
12. Compute the 99.73% confidence interval on the process mean in the original scale of measurements using the likelihood ratio method.
13. Find the distances of the upper and lower confidence limits in terms of the local standard error calculated in step 11 ($k_{u,\mu os}$ and $k_{l,\mu os}$ respectively).
14. Find the expected standard error of the mean in the original scale of measurements for sample size n = 5 ($ESE_{\mu os}$).
15. Finally, using $\mu_{os}$, $ESE_{\mu os}$, $k_{u,\mu os}$ and $k_{l,\mu os}$ found in prior steps, calculate the control limits for the sample means chart in the original scale ($UCL = \mu_{os} + k_{u,\mu os} \cdot ESE_{\mu os}$ and $LCL = \mu_{os} - k_{l,\mu os} \cdot ESE_{\mu os}$ respectively). Plot the sample ML estimates of means in the original scale.
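As a quick arithmetic check of steps 9, 10 and 15 above, the asymmetric limits can be computed directly from the centre line, the two distance factors and the expected standard error. The sketch below is illustrative only: the function name `control_limits` is hypothetical, and the inputs are the original-scale values quoted in (5-22), (5-27) and (5-28).

```python
def control_limits(center, ku, kl, ese):
    """Return (UCL, LCL) placed ku and kl expected standard errors from the centre line."""
    ucl = center + ku * ese
    lcl = center - kl * ese
    return ucl, lcl


# Original-scale values from this chapter:
# process mean 17.598, ku = 4.149, kl = 2.568, expected standard error 4.032
ucl_os, lcl_os = control_limits(17.598, 4.149, 2.568, 4.032)
print(f"UCL = {ucl_os:.3f}, LCL = {lcl_os:.3f}")
```

With these inputs the limits come out near 34.33 and 7.24, which agrees with the limits drawn on the original-scale control chart.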
CHAPTER 6
CONTROL CHARTS WITH MATERIAL STRENGTH DATA

Overview

In this chapter, control charts applicable to material strength test data are presented. Material strength is an important characteristic of product quality, and should be maintained at a desired level with appropriate control charts on the manufacturing process.

We can think of a material containing flaws as made up of many small volumes of varying strength, the weakest of which determines the strength of the material. For example, the tensile strength of a piece of lumber is determined by the minimum strength of knots and other surface defects in the wood grain. As another example, suppose we have a chain of n links. Clearly the strength of the chain is equal to the strength of its weakest link. In these examples, the distribution of strength is the distribution of a minimum, which approaches a Weibull distribution as $n \to \infty$.

The Weibull distribution is quite popular as a material strength testing distribution and for many other applications where a skewed distribution is required. It was originally derived by Fisher and Tippett (1928) as an asymptotic distribution of extreme values and later on by Weibull (1939) in an analysis of breaking strengths. The probability density function (pdf) of a Weibull distribution has the form:

$$f(x) = \frac{\lambda}{\delta}\left(\frac{x}{\delta}\right)^{\lambda-1}\exp\left[-\left(\frac{x}{\delta}\right)^{\lambda}\right] \tag{6-1}$$

Here, $\delta$ is a scale parameter and $\lambda$ is a shape parameter. It is well known that a negative natural logarithmic transformation converts the pdf (6-1) to a location-scale-parameter family. Variates with this distribution are called Gumbel variates and have a pdf of the form:

$$f(x) = \frac{1}{\sigma}\exp\left[-\frac{x-\mu}{\sigma} - \exp\left(-\frac{x-\mu}{\sigma}\right)\right] \tag{6-2}$$

The Gumbel location and scale parameters $\mu$ and $\sigma$ are related to the Weibull parameters as follows:

$$\mu = -\ln(\delta), \qquad \sigma = \frac{1}{\lambda} \tag{6-3}$$

In the following, we use a simulated sample of size 50 from a Weibull distribution with $\delta = 55$ and $\lambda = 8$ to develop appropriate control charts for material strength test data.
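The relation between the Weibull and Gumbel parameters can be checked numerically. The following is a minimal sketch, not the procedure used to generate the data in this chapter: it draws 50 Weibull values with the stated scale 55 and shape 8 using Python's standard library, applies the transformation x = -ln(y), and confirms that the Weibull survival function at exp(-x) equals the Gumbel distribution function at x when the location is -ln(scale) and the scale is 1/shape. The seed and the helper names `weibull_sf` and `gumbel_cdf` are arbitrary choices for illustration.

```python
import math
import random

delta, lam = 55.0, 8.0                      # Weibull scale and shape used in the text
mu, sigma = -math.log(delta), 1.0 / lam     # Gumbel location and scale


def weibull_sf(y):
    """P(Y > y) for a Weibull variate with scale delta and shape lam."""
    return math.exp(-((y / delta) ** lam))


def gumbel_cdf(x):
    """P(X <= x) for the Gumbel variate X = -ln(Y)."""
    return math.exp(-math.exp(-(x - mu) / sigma))


random.seed(1)  # arbitrary seed, only for repeatability
strengths = [random.weibullvariate(delta, lam) for _ in range(50)]
transformed = [-math.log(y) for y in strengths]

# The event {-ln(Y) <= x} is the same as {Y >= exp(-x)}, so both probabilities must agree.
max_gap = max(abs(weibull_sf(math.exp(-x)) - gumbel_cdf(x)) for x in transformed)
print(f"mu = {mu:.4f}, sigma = {sigma:.4f}, largest discrepancy = {max_gap:.2e}")
```

The discrepancy is at the level of floating-point rounding, which is the numerical counterpart of the algebraic identity behind the transformation.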
Illustrative Example

Table 6-1 gives ten samples, each containing five measurements of the rupture strength of a new engineering material. These data will be used to illustrate the necessary steps in the construction of control charts for material strength data.

Table 6-1. Rupture strength of a new engineering material.

Sample number    Sample values (N/cm² × 10⁴)
 1               69.4   44.6   52     49.8   55.6
 2               59.3   47.4   55.6   51.6   56
 3               64.1   52.7   55.4   52.9   52.2
 4               53.6   47.7   48.8   35.3   38.9
 5               45.6   57.5   57     47.6   46.9
 6               54     43.3   58.1   39.2   59.7
 7               52.9   57.4   62.9   42.7   44.2
 8               62.6   52.3   51     51     55.6
 9               43     43.5   56.9   55.3   54
10               58.6   51.8   56.3   40.9   56.8

Estimation of the process parameters

We estimate the Weibull parameters by the maximum-likelihood procedure. The likelihood function of a sample of size n from a Weibull distribution is:

$$L(\delta,\lambda) = \prod_{i=1}^{n} \frac{\lambda}{\delta}\left(\frac{x_i}{\delta}\right)^{\lambda-1}\exp\left[-\left(\frac{x_i}{\delta}\right)^{\lambda}\right] \tag{6-4}$$

It is more convenient to work with the log-likelihood function:

$$LL(\delta,\lambda) = n\ln(\lambda) - n\lambda\ln(\delta) + (\lambda-1)\sum_i \ln(x_i) - \delta^{-\lambda}\sum_i x_i^{\lambda} \tag{6-5}$$

Differentiating (6-5) with respect to $\delta$ and $\lambda$ and equating to zero gives two Maximum Likelihood (ML) estimating equations. The simultaneous solution to these equations gives the ML estimates of $\delta$ and $\lambda$:

Guess: $\delta := 50$, $\lambda := 8$ (6-6)

Given

$$\frac{\lambda}{\delta}\,\delta^{-\lambda}\sum_i x_i^{\lambda} - \frac{n\lambda}{\delta} = 0$$

$$\frac{n}{\lambda} - n\ln(\delta) + \sum_i \ln(x_i) + \delta^{-\lambda}\ln(\delta)\sum_i x_i^{\lambda} - \delta^{-\lambda}\sum_i x_i^{\lambda}\ln(x_i) = 0$$

$$\mathrm{Find}(\delta,\lambda) = \begin{pmatrix} 55.05 \\ 8.28 \end{pmatrix}$$

The initial guess values assigned to $\delta$ and $\lambda$ in (6-6) are starting values that Mathcad uses in its search for a solution. The Find function returns the values of the variables that solve the equations. The ML estimate of $\lambda$ above is biased upward. Using the bias correction factor $b(n)$ of Thoman (1969), the unbiased estimator $\lambda_u$ is:

$$\lambda_u = b(n) \cdot \lambda, \quad \text{where for } n = 50,\ b(n) = 0.973 \tag{6-7}$$

The plot of the Weibull pdf is shown in Figure 6-1.

Figure 6-1. Plot of the Weibull probability distribution function.

For any model that is considered for process control, it is necessary to confirm that there is a single process yielding a single distribution.
Items turned out by different operators may differ to the extent that there are in fact two or more Weibull processes embedded in the overall process. Therefore, it is always a good idea to investigate whether the assumption of a single Weibull distribution is satisfied. Using data from mixtures of two or more different Weibull distributions to calculate control limits, under the assumption that the process is generating a single distribution, would yield incorrect results.

Probability plotting of both the model and the data is an effective tool for detecting the presence of mixtures and checking the model fit. Although we do not need to plot our data for this example, we explain a simple probability plotting method for the Weibull distribution to show how this important step could be accomplished.

Weibull probability plot

Taking the natural logarithm twice of the Weibull cdf

$$F(x_{(i)}) = 1 - \exp\left[-\left(\frac{x_{(i)}}{\delta}\right)^{\lambda}\right] \tag{6-8}$$

results in a linear relation in terms of $\ln(x_{(i)})$:

$$\ln\{-\ln[1 - F(x_{(i)})]\} = -\lambda\ln(\delta) + \lambda\ln(x_{(i)}) \tag{6-9}$$

Using the Press et al. (1986) plotting position

$$p_i = \frac{i - 0.375}{n + 0.25} \tag{6-10}$$

in place of $F(x_{(i)})$ gives the following ordinates for the data and the model respectively:

$$data_i = \ln[-\ln(1 - p_i)], \qquad model_i = \ln\{-\ln[1 - F(x_{(i)})]\} \tag{6-11}$$

The probability plot is shown in Figure 6-2.

Figure 6-2. Probability plot for data in Table 6-1.

Data transformation

It is well known that a natural logarithmic transformation converts the pdf (6-1) into a location-scale-parameter family. Specifically, transforming the Weibull variate $x$ to $-\ln(x)$ results in a Gumbel variate with probability density function:

$$f(x) = \frac{1}{\sigma}\exp\left[-\frac{x-\mu}{\sigma} - \exp\left(-\frac{x-\mu}{\sigma}\right)\right] \tag{6-12}$$

Here, $\mu$ and $\sigma$ are location and scale parameters respectively. In determining control limits for detecting shifts of location and dispersion in Weibull data, we will define the problem in terms of detecting shifts in the Gumbel distribution location and scale parameters $\mu$ and $\sigma$.
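The probability-plotting step described earlier in this section, equations (6-9) to (6-11), can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the worksheet used in this chapter: the function name `weibull_plot_points` is hypothetical, only the first sample of Table 6-1 is used to keep the example short, and the Weibull parameters are the ML estimates 55.05 and 8.28 from (6-6).

```python
import math


def weibull_plot_points(sample, delta, lam):
    """Return (ln x, data ordinate, model ordinate) for each ordered observation."""
    ordered = sorted(sample)
    n = len(ordered)
    points = []
    for i, x in enumerate(ordered, start=1):
        p = (i - 0.375) / (n + 0.25)                        # plotting position (6-10)
        data = math.log(-math.log(1.0 - p))                 # data ordinate (6-11)
        model = lam * math.log(x) - lam * math.log(delta)   # straight line (6-9)
        points.append((math.log(x), data, model))
    return points


sample_1 = [69.4, 44.6, 52.0, 49.8, 55.6]   # first row of Table 6-1
pts = weibull_plot_points(sample_1, delta=55.05, lam=8.28)
for ln_x, data, model in pts:
    print(f"ln(x) = {ln_x:.3f}   data = {data:7.3f}   model = {model:7.3f}")
```

Plotting both ordinates against ln(x) gives the probability plot: data points that scatter closely about the model line support the single-Weibull assumption, while a pronounced bend suggests a mixture.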
When Weibull parameters are not available, Gumbel parameters can be estimated by simultaneously solving their maximum likelihood equations, after transforming each Weibull measurement $y_i$ to the Gumbel scale:

$$x_i := -\ln(y_i) \tag{6-13}$$

Guess: $\mu := -4$, $\sigma := 0.1$

Given

$$\frac{n}{\sigma} - \frac{1}{\sigma}\sum_i \exp\left(\frac{\mu - x_i}{\sigma}\right) = 0$$

$$\sum_i \frac{x_i - \mu}{\sigma^2} + \sum_i \frac{\mu - x_i}{\sigma^2}\exp\left(\frac{\mu - x_i}{\sigma}\right) - \frac{n}{\sigma} = 0$$

$$\mathrm{Find}(\mu,\sigma) = \begin{pmatrix} -4.008 \\ 0.121 \end{pmatrix}$$

The scale parameter $\sigma$ is biased downward. Its unbiased value can be obtained by multiplying it by Thoman's bias correction factor given in (6-7). The Gumbel pdf is plotted in Figure 6-3.

Figure 6-3. Plot of the Gumbel probability distribution function.

Confidence intervals for the process parameters

Once the estimates of the process parameters $\mu$ and $\sigma$ are determined, it is possible to calculate ranges within which estimates from samples of fixed size from this process should lie with some specified probability. For quality control purposes, these ranges (confidence intervals) are usually calculated for the 99.73% probability level (corresponding to a false alarm probability of $\alpha = 0.0027$).

Since the sampling distributions of the ML estimates of $\mu$ and $\sigma$ are not known, we use the Likelihood Ratio (LR) statistic to obtain the confidence limits on these parameters. Consider the log-likelihood function of an obtained sample from the Gumbel distribution:

$$LL(\mu,\sigma) = -n\ln(\sigma) - \sum_i \frac{x_i - \mu}{\sigma} - \sum_i \exp\left(-\frac{x_i - \mu}{\sigma}\right) \tag{6-14}$$

The 99.73% confidence interval calculations for the location parameter $\mu$ are:

$$b := \sigma, \qquad s(a) := \mathrm{root}\left[\frac{\partial}{\partial b} LL(a,b),\ b\right], \qquad LR(a) := 2\,LL(\mu,\sigma) - 2\,LL(a, s(a)) \tag{6-15}$$

With guess values $a := -4.4$ and $a := -3.6$:

$$U_\mu := \mathrm{root}(LR(a) - 9.009,\ a) = -3.95, \qquad L_\mu := \mathrm{root}(LR(a) - 9.009,\ a) = -4.063$$

The value 9.009 in (6-15) is the chi-square 99.73 percentile at 1 degree of freedom. The first root function solves the likelihood equation of the scale parameter and returns the value of $\sigma(\mu)$. The second and third root functions return the values of $\mu$ for which the LR statistic equals the chi-square 99.73 percentile with 1 degree of freedom. The LR curve and the line of chi-square are shown in Figure 6-4.
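The likelihood-ratio interval for the location parameter can be reproduced outside Mathcad. The sketch below is one possible implementation under stated assumptions rather than the worksheet used here: it uses only the Python standard library, replaces Mathcad's Find and root with a plain bisection routine (the helper `bisect` and the bracketing intervals are choices made for this illustration), and takes the 50 values of Table 6-1 as transcribed.

```python
import math
from statistics import NormalDist

# Table 6-1 strengths, transformed to the Gumbel scale with x = -ln(y)
strength = [69.4, 44.6, 52.0, 49.8, 55.6, 59.3, 47.4, 55.6, 51.6, 56.0,
            64.1, 52.7, 55.4, 52.9, 52.2, 53.6, 47.7, 48.8, 35.3, 38.9,
            45.6, 57.5, 57.0, 47.6, 46.9, 54.0, 43.3, 58.1, 39.2, 59.7,
            52.9, 57.4, 62.9, 42.7, 44.2, 62.6, 52.3, 51.0, 51.0, 55.6,
            43.0, 43.5, 56.9, 55.3, 54.0, 58.6, 51.8, 56.3, 40.9, 56.8]
x = [-math.log(v) for v in strength]
n = len(x)


def loglik(mu, sigma):
    """Gumbel log-likelihood (6-14)."""
    z = [(xi - mu) / sigma for xi in x]
    return -n * math.log(sigma) - sum(z) - sum(math.exp(-zi) for zi in z)


def score_sigma(mu, sigma):
    """Derivative of the log-likelihood with respect to sigma."""
    z = [(xi - mu) / sigma for xi in x]
    return (-n + sum(z) - sum(zi * math.exp(-zi) for zi in z)) / sigma


def bisect(f, lo, hi, tol=1e-12):
    """Plain bisection; assumes f changes sign exactly once on [lo, hi]."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def mu_given_sigma(sigma):
    """Closed-form root of dLL/dmu = 0 for a fixed sigma."""
    return -sigma * math.log(sum(math.exp(-xi / sigma) for xi in x) / n)


# Joint ML estimates: zero of the sigma score along mu = mu_given_sigma(sigma)
sigma_hat = bisect(lambda s: score_sigma(mu_given_sigma(s), s), 0.03, 1.0)
mu_hat = mu_given_sigma(sigma_hat)


def profile_sigma(mu):
    """s(mu): the sigma that maximises the log-likelihood for a fixed mu."""
    return bisect(lambda s: score_sigma(mu, s), 0.03, 1.0)


def lr(mu):
    """Likelihood ratio statistic as a function of the location parameter."""
    return 2 * loglik(mu_hat, sigma_hat) - 2 * loglik(mu, profile_sigma(mu))


# Chi-square 99.73 percentile with 1 degree of freedom = squared Normal quantile
crit = NormalDist().inv_cdf(0.5 + 0.9973 / 2) ** 2
upper = bisect(lambda m: lr(m) - crit, mu_hat, mu_hat + 0.15)
lower = bisect(lambda m: lr(m) - crit, mu_hat - 0.15, mu_hat)
print(f"mu = {mu_hat:.3f}, sigma = {sigma_hat:.3f}, limits = ({lower:.3f}, {upper:.3f})")
```

The quantile computed this way is about 9.00, marginally below the 9.009 used in the text; the difference has little effect on the limits. If the data are transcribed correctly, the estimates and limits should land close to the values reported in (6-13) and (6-15).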
Similar confidence interval calculations for the scale parameter are:

$$a := \mu, \qquad s(b) := \mathrm{root}\left[\frac{\partial}{\partial a} LL(a,b),\ a\right], \qquad LR(b) := 2\,LL(\mu,\sigma) - 2\,LL(s(b), b) \tag{6-16}$$

With guess values $b := 0.2$ and $b := 0.1$:

$$U_\sigma := \mathrm{root}(LR(b) - 9.009,\ b) = 0.171, \qquad L_\sigma := \mathrm{root}(LR(b) - 9.009,\ b) = 0.09$$

Figure 6-4. LR curve as a function of the location parameter and the line of chi-square at 9.009.

To construct a control chart for the ML estimates of a parameter, we need to know at how many standard errors of the estimate of that parameter its confidence limits fall. Using the obtained sample data, the Local Standard Errors (LSE) of the estimates of the parameters $\mu$ and $\sigma$ are obtained from their Local Information Matrix (LIM):

$$LIM = -\begin{pmatrix} \dfrac{\partial^2}{\partial\mu^2}LL(\mu,\sigma) & \dfrac{\partial^2}{\partial\mu\,\partial\sigma}LL(\mu,\sigma) \\[2ex] \dfrac{\partial^2}{\partial\sigma\,\partial\mu}LL(\mu,\sigma) & \dfrac{\partial^2}{\partial\sigma^2}LL(\mu,\sigma) \end{pmatrix} \tag{6-17}$$

Using the log-likelihood function (6-14), and writing $z_i = (x_i - \mu)/\sigma$ for brevity, the elements of LIM are:

$$LI1 = -\frac{1}{\sigma^2}\sum_i e^{-z_i}$$

$$LI2 = -\frac{1}{\sigma^2}\left[n - \sum_i e^{-z_i} + \sum_i z_i e^{-z_i}\right], \qquad LI3 = LI2 \tag{6-18}$$

$$LI4 = \frac{1}{\sigma^2}\left[n - 2\sum_i z_i + 2\sum_i z_i e^{-z_i} - \sum_i z_i^2 e^{-z_i}\right]$$

$$LIM = \begin{pmatrix} -LI1 & -LI2 \\ -LI3 & -LI4 \end{pmatrix}$$

The Local Variance Matrix (LVM) and the Local Standard Errors of the estimates of $\mu$ and $\sigma$ are then:

$$LVM = LIM^{-1} \tag{6-19}$$

$$LSE_\mu = \sqrt{LVM_{00}} = 0.018, \qquad LSE_\sigma = 0.973\sqrt{LVM_{11}} = 0.012, \qquad LSE_{\mu\sigma} = \sqrt{0.973}\cdot\sqrt{LVM_{01}} = 0.008$$

The value 0.973 in (6-19) is the bias correction factor previously defined in (6-7). We apply this factor to the estimate of $\sigma$, which is biased downward.

The upper and lower confidence limits of the parameter $\mu$ fall respectively at 3.217 and 3.057 local standard errors from the process mean:

$$k_{u\mu} := \frac{U_\mu - \mu}{LSE_\mu} = 3.217, \qquad k_{l\mu} := \frac{\mu - L_\mu}{LSE_\mu} = 3.057 \tag{6-20}$$

For the scale parameter $\sigma$, we have:

$$k_{u\sigma} := \frac{U_\sigma - \sigma}{LSE_\sigma} = 4.071, \qquad k_{l\sigma} := \frac{\sigma - L_\sigma}{LSE_\sigma} = 2.45 \tag{6-21}$$

These values will be used in the construction of the following control charts.
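The local information matrix can also be approximated numerically, which offers an independent check on hand-derived second derivatives. The sketch below is an illustration under stated assumptions, not the calculation used in this chapter: it re-estimates the Gumbel parameters from Table 6-1 by bisection, forms the matrix of second derivatives by central finite differences (the helper `hessian` and its step size are choices made here), and inverts the 2 x 2 matrix by hand. The bias factor 0.973 is deliberately left out so that the raw local standard errors are visible.

```python
import math

strength = [69.4, 44.6, 52.0, 49.8, 55.6, 59.3, 47.4, 55.6, 51.6, 56.0,
            64.1, 52.7, 55.4, 52.9, 52.2, 53.6, 47.7, 48.8, 35.3, 38.9,
            45.6, 57.5, 57.0, 47.6, 46.9, 54.0, 43.3, 58.1, 39.2, 59.7,
            52.9, 57.4, 62.9, 42.7, 44.2, 62.6, 52.3, 51.0, 51.0, 55.6,
            43.0, 43.5, 56.9, 55.3, 54.0, 58.6, 51.8, 56.3, 40.9, 56.8]
x = [-math.log(v) for v in strength]
n = len(x)


def loglik(mu, sigma):
    """Gumbel log-likelihood (6-14)."""
    z = [(xi - mu) / sigma for xi in x]
    return -n * math.log(sigma) - sum(z) - sum(math.exp(-zi) for zi in z)


def mu_given_sigma(sigma):
    """Closed-form root of dLL/dmu = 0 for a fixed sigma."""
    return -sigma * math.log(sum(math.exp(-xi / sigma) for xi in x) / n)


def profile_score(sigma):
    """Sigma score evaluated along mu = mu_given_sigma(sigma)."""
    mu = mu_given_sigma(sigma)
    z = [(xi - mu) / sigma for xi in x]
    return (-n + sum(z) - sum(zi * math.exp(-zi) for zi in z)) / sigma


# Bisection for the ML estimate of sigma (score is positive at 0.03, negative at 1.0)
lo, hi = 0.03, 1.0
while hi - lo > 1e-12:
    mid = 0.5 * (lo + hi)
    if profile_score(mid) > 0:
        lo = mid
    else:
        hi = mid
sigma_hat = 0.5 * (lo + hi)
mu_hat = mu_given_sigma(sigma_hat)


def hessian(f, a, b, h=1e-4):
    """Central finite-difference second derivatives of f(a, b)."""
    f_aa = (f(a + h, b) - 2 * f(a, b) + f(a - h, b)) / h**2
    f_bb = (f(a, b + h) - 2 * f(a, b) + f(a, b - h)) / h**2
    f_ab = (f(a + h, b + h) - f(a + h, b - h)
            - f(a - h, b + h) + f(a - h, b - h)) / (4 * h**2)
    return f_aa, f_ab, f_bb


h_mm, h_ms, h_ss = hessian(loglik, mu_hat, sigma_hat)
i_mm, i_ms, i_ss = -h_mm, -h_ms, -h_ss          # local information = minus the Hessian
det = i_mm * i_ss - i_ms**2
var_mu, var_sigma = i_ss / det, i_mm / det      # diagonal of the inverse matrix
lse_mu, lse_sigma = math.sqrt(var_mu), math.sqrt(var_sigma)
print(f"LSE(mu) = {lse_mu:.4f}, LSE(sigma) before bias factor = {lse_sigma:.4f}")
```

At the ML estimates the first diagonal element of the information matrix reduces to n divided by the squared scale estimate, which gives a simple sanity check on the finite-difference result.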
Control chart for the sample ML estimates of the scale parameter

The control chart on the sample ML estimates of $\sigma$ detects shifts in the variability of the process, and its limits represent a range within which the sample ML estimates of $\sigma$ should lie with probability 0.9973. According to (6-21), the upper and lower confidence limits of the ML estimate of $\sigma$ fall respectively at 4.071 and 2.45 local standard errors of this parameter. Therefore, for the samples of size 5 that are used with the control chart, we first compute the expected standard error of $\sigma$ for samples of size 5, and then we set the upper and lower control limits at 4.071 and 2.45 expected standard errors respectively.

Elements of the Expected Information Matrix (EIM) for the estimates of $\mu$ and $\sigma$ are the expected values of the elements of the local information matrix (6-17). Given sample size $n = 5$, and writing $z = (x - \mu)/\sigma$ with $f(x)$ the Gumbel pdf (6-12), these expected values are:

$$EI1 = -\frac{n}{\sigma^2}\int e^{-z} f(x)\,dx$$

$$EI2 = -\frac{n}{\sigma^2}\left[1 - \int e^{-z} f(x)\,dx + \int z\,e^{-z} f(x)\,dx\right], \qquad EI3 = EI2 \tag{6-22}$$

$$EI4 = \frac{n}{\sigma^2}\left[1 - 2\int z\,f(x)\,dx + 2\int z\,e^{-z} f(x)\,dx - \int z^2 e^{-z} f(x)\,dx\right]$$

$$EIM = \begin{pmatrix} -EI1 & -EI2 \\ -EI3 & -EI4 \end{pmatrix}$$

The Expected Variance Matrix (EVM) and the Expected Standard Errors of the parameters $\mu$ and $\sigma$ are then obtained as follows:

$$EVM = EIM^{-1} \tag{6-23}$$

$$ESE_\mu = \sqrt{EVM_{00}} = 0.057, \qquad ESE_\sigma = 0.973\sqrt{EVM_{11}} = 0.041, \qquad ESE_{\mu\sigma} = \sqrt{0.973}\cdot\sqrt{EVM_{01}}$$

The upper and lower control limits for the sample ML estimates of $\mu$ are:

$$UCL_\mu = \mu + k_{u\mu} \cdot ESE_\mu = -3.825, \qquad LCL_\mu = \mu - k_{l\mu} \cdot ESE_\mu = -4.182 \tag{6-24}$$

Control limits for the sample ML estimates of $\sigma$ are:

$$UCL_\sigma = \sigma + k_{u\sigma} \cdot ESE_\sigma = 0.288, \qquad LCL_\sigma = \sigma - k_{l\sigma} \cdot ESE_\sigma = 0.02 \tag{6-25}$$

The individual sample ML estimates of the scale parameter from all samples of size 5 are then plotted on the control chart. The necessary calculations for the first sample are as follows:

$$i := 1..5, \qquad n := 5 \tag{6-26}$$

Given

$$\sum_i \frac{x_i - \mu}{\sigma^2} + \sum_i \frac{\mu - x_i}{\sigma^2}\exp\left(\frac{\mu - x_i}{\sigma}\right) - \frac{n}{\sigma} = 0$$

$$s := \mathrm{Find}(\sigma), \qquad s = 0.173$$

The control chart is shown in Figure 6-5.
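The expected standard errors and control limits above can be cross-checked without numerical integration. The sketch below swaps in the standard closed-form expected information matrix of the Gumbel distribution, which is an assumption of this illustration and not the integration route of (6-22). The function name `gumbel_expected_se` is hypothetical; the inputs are the rounded estimates -4.008 and 0.121, the distance factors from (6-20) and (6-21), and the bias factor 0.973.

```python
import math

EULER_GAMMA = 0.5772156649015329


def gumbel_expected_se(sigma, n):
    """Expected standard errors of the ML estimates of (mu, sigma) for sample size n.

    Uses the closed-form expected information of the Gumbel distribution:
    (n / sigma^2) * [[1, -(1 - g)], [-(1 - g), pi^2/6 + (1 - g)^2]], g = Euler's constant.
    """
    c = 1.0 - EULER_GAMMA
    scale = n / sigma**2
    e_mm = scale
    e_ms = -scale * c
    e_ss = scale * (math.pi**2 / 6 + c**2)
    det = e_mm * e_ss - e_ms**2
    return math.sqrt(e_ss / det), math.sqrt(e_mm / det)


mu, sigma = -4.008, 0.121                 # process estimates from (6-13)
ese_mu, ese_sigma = gumbel_expected_se(sigma, n=5)
ese_sigma *= 0.973                        # bias factor applied as in (6-23)

ucl_mu, lcl_mu = mu + 3.217 * ese_mu, mu - 3.057 * ese_mu
ucl_sigma, lcl_sigma = sigma + 4.071 * ese_sigma, sigma - 2.45 * ese_sigma

print(f"ESE(mu) = {ese_mu:.3f}, ESE(sigma) = {ese_sigma:.3f}")
print(f"mu chart:    UCL = {ucl_mu:.3f}, LCL = {lcl_mu:.3f}")
print(f"sigma chart: UCL = {ucl_sigma:.3f}, LCL = {lcl_sigma:.3f}")
```

With the rounded inputs the results agree with (6-23) to (6-25) to about three decimal places, which suggests the numerical integrals and the closed form describe the same quantities.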
Figure 6-5. Sample ML estimates of standard deviation control chart (UCL = 0.288, CL = 0.121, LCL = 0.02; horizontal axis: sample number).

Control chart for the sample ML estimates of the location parameter

Following the procedure described above, a control chart for sample ML estimates of the location parameter could also be developed. However, working with Gumbel data (transformed Weibull data) presents a problem, in that the units of the transformed data are not comprehensible to the production worker who should use the control chart to monitor the work. This problem can be solved by constructing the control chart in the original Weibull scale.

Control chart in the original scale of measurements

The Weibull variate $y$ is related to the Gumbel variate $x$ by $y = \exp(-x)$. Considering the definition of the expected value of a variable, the expected value of the observed data in the original Weibull scale can be expressed as a function of the location and scale parameters in the transformed Gumbel scale:

$$\mu_{os}(\mu,\sigma) := \int \exp(-x)\cdot\frac{1}{\sigma}\exp\left[-\frac{x-\mu}{\sigma} - \exp\left(-\frac{x-\mu}{\sigma}\right)\right]dx \tag{6-27}$$

The Local Standard Error of the expected value of measurements in the original Weibull scale ($LSE_{\mu os}$) can then be estimated from the error propagation formula:

$$LV_{\mu os} := \left[\frac{\partial}{\partial\mu}\mu_{os}(\mu,\sigma)\right]^2 LSE_\mu^2 + \left[\frac{\partial}{\partial\sigma}\mu_{os}(\mu,\sigma)\right]^2 LSE_\sigma^2 + 2\,\frac{\partial}{\partial\mu}\mu_{os}(\mu,\sigma)\,\frac{\partial}{\partial\sigma}\mu_{os}(\mu,\sigma)\,LSE_{\mu\sigma}^2 \tag{6-28}$$

$$LV_{\mu os} = 0.794, \qquad LSE_{\mu os} = 1.054$$

Next, we calculate the confidence interval for the expected value of measurements in the original Weibull scale. Since the exact sampling distribution of this statistic is not known, we use the likelihood ratio statistic to derive the appropriate confidence limits. For these calculations, both terms in the numerator of the LR statistic need to be expressed in terms of the mean of the measurements in the original scale.
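The mean in the original scale (6-27) and the error propagation step (6-28) can be sketched as follows. This is an illustration under stated assumptions, not the worksheet used here: the integral is evaluated with Simpson's rule over a finite range chosen to hold essentially all of the probability mass, and it is compared with the closed form exp(-mu) times Gamma(1 + sigma), which follows from the moment generating function of the Gumbel distribution. The variances and covariance fed to the error propagation formula are the large-sample Gumbel values for n = 50 without the bias factor, standing in for the local variance matrix; all function names are hypothetical.

```python
import math


def gumbel_pdf(x, mu, sigma):
    """Gumbel pdf (6-12)."""
    z = (x - mu) / sigma
    return math.exp(-z - math.exp(-z)) / sigma


def mean_os_numeric(mu, sigma, steps=4000):
    """Simpson's rule for the integral of exp(-x) * f(x), as in (6-27)."""
    lo, hi = mu - 8 * sigma, mu + 40 * sigma
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        xk = lo + k * h
        weight = 1 if k in (0, steps) else (4 if k % 2 else 2)
        total += weight * math.exp(-xk) * gumbel_pdf(xk, mu, sigma)
    return total * h / 3


def mean_os_closed(mu, sigma):
    """Closed form of the same expectation: exp(-mu) * Gamma(1 + sigma)."""
    return math.exp(-mu) * math.gamma(1 + sigma)


def delta_method_se(mu, sigma, var_mu, var_sigma, cov, h=1e-6):
    """Error propagation: standard error of mean_os from the (mu, sigma) covariance."""
    d_mu = (mean_os_closed(mu + h, sigma) - mean_os_closed(mu - h, sigma)) / (2 * h)
    d_sigma = (mean_os_closed(mu, sigma + h) - mean_os_closed(mu, sigma - h)) / (2 * h)
    var = d_mu**2 * var_mu + d_sigma**2 * var_sigma + 2 * d_mu * d_sigma * cov
    return math.sqrt(var)


mu, sigma, n = -4.008, 0.121, 50
c = 1 - 0.5772156649015329                       # one minus Euler's constant
var_mu = (1 + 6 * c**2 / math.pi**2) * sigma**2 / n
var_sigma = 6 / math.pi**2 * sigma**2 / n
cov = 6 * c / math.pi**2 * sigma**2 / n

m_num = mean_os_numeric(mu, sigma)
m_closed = mean_os_closed(mu, sigma)
se_os = delta_method_se(mu, sigma, var_mu, var_sigma, cov)
print(f"mean (numeric) = {m_num:.3f}, mean (closed form) = {m_closed:.3f}, SE = {se_os:.3f}")
```

With the rounded estimates the mean comes out near 51.91 and the standard error near 1.05, in line with the centre line of the original-scale chart and the local standard error reported in (6-28).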
Confidence interval endpoints are obtained from the condition:

$$-2\ln\left[\frac{L(I(\mu_{os}, s(\mu_{os})),\ s(\mu_{os}))}{L(\mu,\sigma)}\right] \le \chi^2_{1,\,0.9973} \tag{6-29}$$

Considering the process mean in the original scale as given in (6-27), we first define the location parameter in terms of the mean in the original scale and the scale parameter:

$$a := \mu, \quad b := \sigma, \qquad I(\mu_{os}, b) := \mathrm{root}\left[\int \exp(-x)\cdot\frac{1}{b}\exp\left[-\frac{x-a}{b} - \exp\left(-\frac{x-a}{b}\right)\right]dx - \mu_{os},\ a\right] \tag{6-30}$$

Substituting (6-30) in the log-likelihood function (6-14) and differentiating with respect to the scale parameter gives the ML equation for the scale parameter. Solving this equation defines the scale parameter as a function of the mean in the original scale:

$$ML(\mu_{os}, b) := \frac{\partial}{\partial b} LL[I(\mu_{os}, b),\ b], \qquad s(\mu_{os}) := \mathrm{root}[ML(\mu_{os}, b),\ b] \tag{6-31}$$

The LR statistic (6-29) can be expressed as:

$$LR(\mu_{os}) := 2\,LL(\mu,\sigma) - 2\,LL[I(\mu_{os}, s(\mu_{os})),\ s(\mu_{os})] \tag{6-32}$$

The upper and lower confidence limits of the mean in the original scale are those values of $\mu_{os}$ that make (6-32) equal to $\chi^2_{1,\,0.9973} = 9.009$. With guess values $\mu_{os} := 56$ and $\mu_{os} := 45$:

$$U_{\mu os} := \mathrm{root}(LR(\mu_{os}) - 9.009,\ \mu_{os}) = 54.988, \qquad L_{\mu os} := \mathrm{root}(LR(\mu_{os}) - 9.009,\ \mu_{os}) = 48.432 \tag{6-33}$$

The likelihood ratio curve and the line of chi-square at 9.009 are shown in Figure 6-6.

Figure 6-6. LR curve as a function of the process mean in the original scale.

The confidence limits (6-33) fall at the following numbers of local standard errors of the mean in the original scale:

$$k_{u,\mu os} := \frac{U_{\mu os} - \mu_{os}}{LSE_{\mu os}} = 2.915, \qquad k_{l,\mu os} := \frac{\mu_{os} - L_{\mu os}}{LSE_{\mu os}} = 3.305 \tag{6-34}$$

For the samples of size 5 that are plotted on the control chart, we first estimate the expected standard error of the mean in the original scale, and then we set the upper and lower control limits at 2.915 and 3.305 expected standard errors respectively. Using the error propagation formula, the expected variance of the mean in the original scale and the control chart limits are:

$$EV_{\mu os} := \left[\frac{\partial}{\partial\mu}\mu_{os}(\mu,\sigma)\right]^2 ESE_\mu^2 + \left[\frac{\partial}{\partial\sigma}\mu_{os}(\mu,\sigma)\right]^2 ESE_\sigma^2 + 2\,\frac{\partial}{\partial\mu}\mu_{os}(\mu,\sigma)\,\frac{\partial}{\partial\sigma}\mu_{os}(\mu,\sigma)\,ESE_{\mu\sigma}^2 \tag{6-35}$$

$$EV_{\mu os} = 11.157, \qquad ESE_{\mu os} := \sqrt{EV_{\mu os}} = 3.34$$

$$UCL_{\mu os} := \mu_{os} + k_{u,\mu os} \cdot ESE_{\mu os} = 61.651, \qquad LCL_{\mu os} := \mu_{os} - k_{l,\mu os} \cdot ESE_{\mu os} = 40.877$$

The individual sample ML estimates of the mean in the original Weibull scale are then plotted on the control chart. The required computations for the first sample are as follows:

$$i := 1..5, \qquad n := 5 \tag{6-36}$$

Given

$$\frac{n}{\sigma} - \frac{1}{\sigma}\sum_i \exp\left(-\frac{x_i - \mu}{\sigma}\right) = 0$$

$$\mu_1 := \mathrm{Find}(\mu), \qquad \mu_1 = -4.081$$

$$\mu_{os_1} := \int \exp(-x)\cdot\frac{1}{\sigma}\exp\left[-\frac{x-\mu_1}{\sigma} - \exp\left(-\frac{x-\mu_1}{\sigma}\right)\right]dx, \qquad \mu_{os_1} = 55.917$$

The control chart for the ML estimates of sample means in the original scale is shown in Figure 6-7.

Figure 6-7. Control chart for the sample means in the original scale (UCL = 61.65, CL = 51.915, LCL = 40.87; horizontal axis: sample number; vertical axis: sample means in the original Weibull scale).

Control charts for the percentiles of strength distribution

In monitoring the material strength of products, a minimum breaking strength is often desired and is considered an important quality characteristic that needs to be controlled. Detecting a downward shift in the lower percentiles of the breaking strength distribution becomes as important as detecting a shift of the mean. Control charts for percentiles of strength data can be developed using the likelihood ratio method. We use the strength data given in Table 6-1 to construct the appropriate control charts for percentiles.

Suppose the tenth percentile of the breaking strength distribution is of interest and a change in the tenth percentile is to be detected for the process. Assuming a Weibull distribution for the strength data, the tenth percentile ($X10$) is:

$$X10(\delta,\lambda) := \delta\left[\ln\left(\frac{1}{1 - 0.10}\right)\right]^{1/\lambda} \tag{6-37}$$

We first calculate the maximum likelihood estimates of the parameters $\delta$ and $\lambda$ using the log-likelihood function (6-5). The local standard errors of these parameters can then be found from their local information matrix. Using the LIM of (6-17) for the Weibull parameters $\delta$ and $\lambda$, the necessary calculations are:

$$LI1 = \frac{n\lambda}{\delta^2} - \delta^{-\lambda}\frac{\lambda^2}{\delta^2}\sum_i x_i^{\lambda} - \delta^{-\lambda}\frac{\lambda}{\delta^2}\sum_i x_i^{\lambda} \tag{6-38}$$

$$LI2 = -\frac{n}{\delta} + \frac{1}{\delta}\delta^{-\lambda}\sum_i x_i^{\lambda} - \frac{\lambda}{\delta}\delta^{-\lambda}\ln(\delta)\sum_i x_i^{\lambda} + \frac{\lambda}{\delta}\delta^{-\lambda}\sum_i x_i^{\lambda}\ln(x_i), \qquad LI3 = LI2$$

$$LI4 = -\frac{n}{\lambda^2} - \delta^{-\lambda}\ln(\delta)^2\sum_i x_i^{\lambda} + 2\,\delta^{-\lambda}\ln(\delta)\sum_i x_i^{\lambda}\ln(x_i) - \delta^{-\lambda}\sum_i x_i^{\lambda}\ln(x_i)^2$$

$$LIM = \begin{pmatrix} -LI1 & -LI2 \\ -LI3 & -LI4 \end{pmatrix}, \qquad LVM = LIM^{-1}$$

$$LSE_\delta = \sqrt{LVM_{00}} = 0.993, \qquad LSE_\lambda = 0.973\sqrt{LVM_{11}} = 0.849, \qquad LSE_{\delta\lambda} = \sqrt{0.973}\cdot\sqrt{LVM_{01}} = 0.522$$

The standard error of $\lambda$ is biased downward. The value 0.973 above is the bias correction factor previously defined in (6-7). The local standard error of the tenth percentile can then be obtained from the error propagation formula:

$$LV_{X10} := \left[\frac{\partial}{\partial\delta}X10(\delta,\lambda)\right]^2 LSE_\delta^2 + \left[\frac{\partial}{\partial\lambda}X10(\delta,\lambda)\right]^2 LSE_\lambda^2 + 2\,\frac{\partial}{\partial\delta}X10(\delta,\lambda)\,\frac{\partial}{\partial\lambda}X10(\delta,\lambda)\,LSE_{\delta\lambda}^2 \tag{6-39}$$

$$LSE_{X10} := \sqrt{LV_{X10}}, \qquad LSE_{X10} = 1.6303$$

Since the exact sampling distribution of the tenth percentile is not known, we use the likelihood ratio method to compute the confidence interval on the tenth percentile. Confidence interval endpoints are obtained from the condition:

$$-2\ln\left[\frac{L(I(X10, s(X10)),\ s(X10))}{L(\delta,\lambda)}\right] \le \chi^2_1 \tag{6-40}$$

The Weibull parameters $\delta$ and $\lambda$ in the numerator of (6-40) are expressed as functions of the tenth percentile. To define these functions, we first solve the tenth percentile (6-37) for $\delta$ and express the result as a function of the tenth percentile and $\lambda$:

$$a := \lambda, \quad d := \delta, \qquad I(X10, a) := \mathrm{root}\left[d\left[\ln\left(\frac{1}{1 - 0.10}\right)\right]^{1/a} - X10,\ d\right] \tag{6-41}$$

Substituting in the log-likelihood function (6-5) and differentiating with respect to $\lambda$ gives a maximum likelihood equation in terms of the tenth percentile and $\lambda$:

$$ML(X10, a) := \frac{\partial}{\partial a} LL(I(X10, a),\ a) \tag{6-42}$$

Solving (6-42) for $\lambda$ defines this parameter as a function of the tenth percentile.
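The percentile formula (6-37) and its inverse used in (6-41) are simple enough to check directly. The sketch below is illustrative only: the function names are hypothetical, and the inputs are the first-sample ML estimates 58.04 and 6.48 reported later in this chapter, with the shape multiplied by the bias factor 0.973 before the percentile is evaluated.

```python
import math

P = 0.10   # the tenth percentile


def weibull_percentile(delta, lam, p=P):
    """Percentile of a Weibull distribution with scale delta and shape lam, as in (6-37)."""
    return delta * math.log(1.0 / (1.0 - p)) ** (1.0 / lam)


def delta_from_percentile(x_p, lam, p=P):
    """Scale parameter implied by a given percentile and shape, the closed form of (6-41)."""
    return x_p / math.log(1.0 / (1.0 - p)) ** (1.0 / lam)


lam_1 = 0.973 * 6.48                      # bias-corrected shape for the first sample
x10_1 = weibull_percentile(58.04, lam_1)
delta_back = delta_from_percentile(x10_1, lam_1)
cdf_at_x10 = 1.0 - math.exp(-((x10_1 / 58.04) ** lam_1))

print(f"X10 = {x10_1:.3f}, recovered delta = {delta_back:.2f}, F(X10) = {cdf_at_x10:.3f}")
```

Because (6-37) can be solved for the scale parameter in closed form, the root search in (6-41) can be replaced by a single division, which may simplify an implementation of the profile likelihood in X10.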
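The likelihood ratio statistic (6-40) can then be written as:

$$LR(X10) := 2\,LL(\delta,\lambda) - 2\,LL[I(X10, s(X10)),\ s(X10)] \tag{6-43}$$

To find the upper and lower 99.73% confidence limits, we find those values of the tenth percentile that make (6-43) equal to $\chi^2_{1,\,0.9973} = 9.009$:

$$X10 := 45, \qquad U_{X10} := \mathrm{root}[LR(X10) - 9.009,\ X10], \qquad U_{X10} = 46.219 \tag{6-44}$$

$$X10 := 35, \qquad L_{X10} := \mathrm{root}[LR(X10) - 9.009,\ X10], \qquad L_{X10} = 36.313$$

The above confidence limits fall at the following numbers of local standard errors of the tenth percentile:

$$k_{u,X10} := \frac{U_{X10} - X10}{LSE_{X10}} = 2.617, \qquad k_{l,X10} := \frac{X10 - L_{X10}}{LSE_{X10}} = 3.459 \tag{6-45}$$

Therefore, for the percentile control chart, one would expect the upper and lower control limits to fall at 2.617 and 3.459 expected standard errors of the tenth percentile. For samples of size 5, the expected standard errors for estimates of the parameters $\delta$ and $\lambda$ can be found from the inverse of the expected information matrix, whose elements are the expected values of the elements of the LIM (6-38), with $f(x)$ the Weibull pdf (6-1):

$$n := 5 \tag{6-46}$$

$$EI1 := \frac{n\lambda}{\delta^2} - \frac{n\lambda^2}{\delta^2}\delta^{-\lambda}\int x^{\lambda} f(x)\,dx - \frac{n\lambda}{\delta^2}\delta^{-\lambda}\int x^{\lambda} f(x)\,dx$$

$$EI2 := -\frac{n}{\delta} + \frac{n}{\delta}\delta^{-\lambda}\int x^{\lambda} f(x)\,dx - \frac{n\lambda}{\delta}\delta^{-\lambda}\ln(\delta)\int x^{\lambda} f(x)\,dx + \frac{n\lambda}{\delta}\delta^{-\lambda}\int x^{\lambda}\ln(x) f(x)\,dx, \qquad EI3 := EI2$$

$$EI4 := -\frac{n}{\lambda^2} - \delta^{-\lambda}\ln(\delta)^2\, n\int x^{\lambda} f(x)\,dx + 2\,\delta^{-\lambda}\ln(\delta)\, n\int x^{\lambda}\ln(x) f(x)\,dx - n\,\delta^{-\lambda}\int x^{\lambda}\ln(x)^2 f(x)\,dx$$

$$EIM := \begin{pmatrix} -EI1 & -EI2 \\ -EI3 & -EI4 \end{pmatrix}, \qquad EVM := EIM^{-1}$$

$$ESE_\delta := \sqrt{EVM_{00}} = 3.139, \qquad ESE_\lambda := \sqrt{EVM_{11}} = 2.792, \qquad ESE_{\delta\lambda} := \sqrt{EVM_{01}} = 1.727$$

Using the error propagation formula, the expected standard error of the tenth percentile and the upper and lower control limits are:

$$EV_{X10} := \left[\frac{\partial}{\partial\delta}X10(\delta,\lambda)\right]^2 ESE_\delta^2 + \left[\frac{\partial}{\partial\lambda}X10(\delta,\lambda)\right]^2 ESE_\lambda^2 + 2\,\frac{\partial}{\partial\delta}X10(\delta,\lambda)\,\frac{\partial}{\partial\lambda}X10(\delta,\lambda)\,ESE_{\delta\lambda}^2 \tag{6-47}$$

$$ESE_{X10} := \sqrt{EV_{X10}} = 5.325$$

$$UCL_{X10} := X10 + k_{u,X10} \cdot ESE_{X10} = 55.575, \qquad LCL_{X10} := X10 - k_{l,X10} \cdot ESE_{X10} = 23.217$$

For each sample of size 5 given in Table 6-1, we find the sample maximum likelihood estimates of the parameters $\delta$ and $\lambda$ and then the sample tenth percentile.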
These calculations for the first sample are:

$$n := 5, \qquad i := 1..5 \tag{6-48}$$

Given

$$\frac{\lambda}{\delta}\,\delta^{-\lambda}\sum_i x_i^{\lambda} - \frac{n\lambda}{\delta} = 0$$

$$\frac{n}{\lambda} - n\ln(\delta) + \sum_i \ln(x_i) + \delta^{-\lambda}\ln(\delta)\sum_i x_i^{\lambda} - \delta^{-\lambda}\sum_i x_i^{\lambda}\ln(x_i) = 0$$

$$\begin{pmatrix}\delta \\ \lambda\end{pmatrix} := \mathrm{Find}(\delta,\lambda) = \begin{pmatrix} 58.04 \\ 6.48 \end{pmatrix}, \qquad \lambda := 0.973 \cdot \lambda$$

$$X10 := \delta\left[\ln\left(\frac{1}{1 - 0.10}\right)\right]^{1/\lambda}, \qquad X10 = 40.621$$

Values of the sample tenth percentiles are then plotted on the control chart. Figure 6-8 shows the control chart for sample tenth percentiles.

Summary

In this chapter, we have presented control charts applicable to material strength data with a Weibull distribution. Since the exact sampling distributions of statistics for the process parameters of the Weibull distribution are not known, sufficiently accurate control limits cannot be established by conventional methods. The likelihood ratio statistic, with its approximate chi-square distribution, appears to be a good choice for developing appropriate control charts for the process parameters and the percentiles of the Weibull distribution. The following steps summarize the control chart development procedure:

1. Obtain k samples of size n each.
2. Transform the Weibull variate $x$ to $-\ln(x)$ to obtain a Gumbel distribution with location and scale parameters $\mu$ and $\sigma$.
3. Find ML estimates of $\mu$ and $\sigma$ by simultaneously solving their ML equations.
4. Calculate the local standard errors for the parameter estimates $\mu$ and $\sigma$.
5. Calculate the 99.73% confidence interval for the parameter $\sigma$ using the likelihood ratio method (i.e. $\alpha = 0.0027$), then determine the distances of the upper and lower confidence limits in terms of the local standard error of $\sigma$ ($k_{u\sigma}$ and $k_{l\sigma}$ respectively).
6. Calculate the expected standard errors for the parameters $\mu$ and $\sigma$ ($ESE_\mu$ and $ESE_\sigma$ respectively) for sample size n = 5.
7. Compute the ML estimate of $\sigma$ for each sample, using the process mean estimated in step 3.
8. Using the ML estimate of $\sigma$, $ESE_\sigma$, $k_{u\sigma}$ and $k_{l\sigma}$ found in prior steps, compute the control limits $UCL_\sigma = \sigma + k_{u\sigma} ESE_\sigma$ and $LCL_\sigma = \sigma - k_{l\sigma} ESE_\sigma$ and plot the sample ML estimates of $\sigma$ calculated in step 7.
9. Calculate the process mean in the original Weibull scale of measurements ($\mu_{os}$) and its local standard error using the error propagation formula.
10. Compute the 99.73% confidence interval on the process mean in the original Weibull scale of measurements using the likelihood ratio method.
11. Calculate the local standard error of the mean in the original Weibull scale and find the distances of the upper and lower confidence limits in terms of the local standard error ($k_{u,\mu os}$ and $k_{l,\mu os}$ respectively).
12. Compute the expected standard error of the mean in the original Weibull scale ($ESE_{\mu os}$) for sample size n = 5.
13. Using $\mu_{os}$, $ESE_{\mu os}$, $k_{u,\mu os}$ and $k_{l,\mu os}$ found in prior steps, calculate the control limits for the sample means chart in the original scale of measurements: $UCL_{\mu os} = \mu_{os} + k_{u,\mu os} \cdot ESE_{\mu os}$ and $LCL_{\mu os} = \mu_{os} - k_{l,\mu os} \cdot ESE_{\mu os}$. Plot the sample ML estimates of means in the original Weibull scale.
14. Define the tenth percentile of the Weibull breaking strength distribution and compute its local standard error.
15. Using the likelihood ratio method, calculate the 99.73% confidence interval for the tenth percentile and find the distances of the upper and lower confidence limits in terms of the local standard error of the tenth percentile ($k_{u,X10}$ and $k_{l,X10}$ respectively).
16. Compute the expected standard error of the tenth percentile ($ESE_{X10}$) for sample size n = 5.
17. Using the information obtained in steps 14 to 16, calculate the control limits for the sample tenth percentile chart: $UCL_{X10} := X10 + k_{u,X10} \cdot ESE_{X10}$ and $LCL_{X10} := X10 - k_{l,X10} \cdot ESE_{X10}$.

CHAPTER 7
CONCLUSIONS AND RECOMMENDATIONS

Overview

In this chapter we summarize the main findings of this research by drawing conclusions and recommending areas of interest for further investigation.

Conclusions

The problem of quality characteristics with non-Normal distributions was discussed in chapter 3 with a specific example of the final moisture content of kiln-dried lumber.
It was found that when the assumption of Normality of the underlying distribution is violated, it is always a good idea to transform the data to a Normal distribution before calculating the estimates of the process parameters and proceeding with the design of the control charts for sample means and standard deviations. Since the ML estimate of the process standard deviation contains both within-sample and between-sample variation, a statistical test was made to determine whether the between-sample variation was significant. The test indicated that the extra between-sample variation was significant, and its value was subsequently calculated and added to the estimate of the within-sample variation before constructing the sample means control chart. We developed a moving standard deviation control chart based on the sample mean values to control the additional between-sample variation.

The control chart for the sample means was then re-expressed in the original scale of measurements. Since the distribution of the sample means in the original scale of measurements was not known, we used the likelihood ratio statistic to compute the 99.73% confidence limits (corresponding to a type I error of $\alpha = 0.0027$) on the estimate of the process mean in the original scale of measurements. We computed the local standard error of the estimate of mean in the original scale of measurements using the error propagation formula, and calculated the distances of the upper and lower confidence limits from the process mean in terms of the local standard error to determine how far from the process mean the control chart limits should fall. We calculated the expected standard error of the process mean in the original scale for samples of size n, and then we set the control chart limits in the original scale of measurements at the numbers of standard errors that were found for the confidence limits.

In chapter 4, we developed control charts for censored samples from a non-Normal distribution.
We first computed the ML estimates of the required power transformation $\lambda$ and the process mean and standard deviation. Since in this case the exact sampling distributions of statistics for the process mean and standard deviation were not known, we used the likelihood ratio statistic to compute the 99.73% confidence intervals for the estimates of the process mean and standard deviation. We then calculated the local standard errors for the estimates of the process parameters and determined the distances, in terms of the local standard errors, at which the upper and lower confidence limits fall from the corresponding process parameter. For the control chart of sample ML estimates of mean, we first computed the expected standard error of the mean when samples are of size n, then we set the upper and lower control limits at the distances, in terms of expected standard errors, which were found earlier for the confidence limits. The same procedure was used for the development of the control chart for the sample ML estimates of standard deviation.

To present the control chart for the sample ML estimates of mean in the original scale of measurements, we first obtained the 99.73% confidence interval for the estimate of the process mean in the original scale of measurements using the likelihood ratio method. Then we computed the local standard error of the mean in the original scale of measurements and determined the distances of the confidence limits from the process mean in the original scale in terms of the local standard error. We then used these distances in conjunction with the expected standard error of the mean in the original scale of measurements, computed for samples of size n, to set the control chart limits for the sample ML estimates of mean in the original scale of measurements.

In chapter 5, control charts for truncated samples from a non-Normal distribution were discussed.
Following the same basic approach as was used for the case of censored samples, we first computed the ML estimates of the power transformation and process parameters. Using the likelihood ratio method and the expected standard errors, we developed control charts for the sample ML estimates of means and standard deviations. Finally we re-expressed the control chart for sample ML estimates of means in the original scale of measurements.

In chapter 6, control charts for material strength test data with a Weibull distribution were introduced. For the Weibull distribution, where the exact sampling distributions of statistics for the process mean and standard deviation are not known, the likelihood ratio statistic can be used to compute the appropriate control chart limits. We first transformed the Weibull variate with shape and scale parameters to a location-scale-parameter family known as the Gumbel distribution. We then used the likelihood ratio method, following steps similar to those described in the previous chapters, and developed control charts for the sample ML estimates of the scale parameter and for the sample expected values in the original Weibull scale. Finally we developed a control chart for the sample tenth percentiles of the strength distribution.

The major conclusion drawn from this work is that, with the development of new procedures and proper manipulation of the existing data, significantly better process control schemes can be put in place in a variety of industrial settings.

Recommendations for further research

Considering the topics covered during the course of this research, the following areas are recommended for further investigation:

Development of control charts for progressively censored samples. If the underlying distribution is not Normal and the use of a power transformation is considered, attempts should be made to re-express the control chart in the original scale of measurements.
Development of control charts for centrally truncated distributions, transforming the data if necessary and presenting the final control charts in the original scale of measurements.

Although the need for control charts for other distributions such as the Lognormal, Gamma and Beta may seldom arise, attempts should nevertheless be made to develop appropriate control charts for these distributions.

When prespecified standards for the process mean and standard deviation are available and the process is Normally distributed, exact control limits can easily be established for a given sample size. In many situations, however, no standards are given and the Maximum Likelihood (ML) estimates of the process mean and standard deviation must be used instead. In this case, sufficiently accurate control limits cannot be established because the ML estimators converge relatively slowly to their Normal asymptote, which makes the ML estimates of the process mean and standard deviation inaccurate even for fairly large sample sizes. This has prevented the valid use of control charts for a process whose total output is not sufficiently large, or during the crucial stage of initiating a new process. Fortunately, the Likelihood Ratio (LR) statistic often appears to approach its limiting chi-square distribution considerably more rapidly than the distribution of the ML estimator approaches its limiting Normal distribution. LR-based control charts could therefore be considered as an alternative, and further research is needed on the possible use of the LR statistic in the development of control charts for low-volume production runs.
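The closing recommendation rests on the likelihood ratio statistic reaching its chi-square limit faster than the ML estimator reaches its Normal limit. As a hedged numerical illustration that is not drawn from the thesis itself, consider an exponential process mean estimated from n observations: the sample total then has an Erlang distribution, so the actual coverage of a nominal 99.73% interval can be computed exactly for both approaches. The exponential model, the sample size, the critical value 9, and all function names below are assumptions made for this sketch.

```python
import math


def erlang_cdf(x, n):
    """P(T <= x) for T the sum of n independent unit-mean exponentials."""
    if x <= 0.0:
        return 0.0
    term = math.exp(-x)
    total = term
    for k in range(1, n):
        term *= x / k
        total += term
    return 1.0 - total


def wald_coverage(n, z=3.0):
    """Exact coverage of the interval mean_hat * (1 +/- z / sqrt(n))."""
    half = z / math.sqrt(n)
    cover = 1.0 - erlang_cdf(n / (1.0 + half), n)
    if half < 1.0:  # otherwise the lower limit is negative and never excludes the mean
        cover -= 1.0 - erlang_cdf(n / (1.0 - half), n)
    return cover


def lr_coverage(n, crit=9.0):
    """Exact coverage of the interval {theta : LR statistic <= crit}.

    With r = mean_hat / theta the LR statistic is 2 n (r - 1 - ln r).
    """
    target = crit / (2.0 * n)

    def g(r):
        return r - 1.0 - math.log(r)

    lo, hi = 1e-12, 1.0          # g decreases on (0, 1)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) > target:
            lo = mid
        else:
            hi = mid
    r_low = 0.5 * (lo + hi)

    lo, hi = 1.0, 2.0            # g increases on (1, infinity)
    while g(hi) < target:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) < target:
            lo = mid
        else:
            hi = mid
    r_high = 0.5 * (lo + hi)

    return erlang_cdf(n * r_high, n) - erlang_cdf(n * r_low, n)


if __name__ == "__main__":
    for size in (5, 10, 20, 50):
        print(size, round(wald_coverage(size), 4), round(lr_coverage(size), 4))
```

For a sample of ten observations, the interval based on the Normal approximation covers the true mean noticeably less often than the nominal 99.73%, whereas the LR interval stays close to the nominal level.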
Item Metadata
Title | Quality control with non-normal, censured and truncated data |
Creator | Noghhondarian, Kazem |
Date Issued | 1997 |
Extent | 4301594 bytes |
Genre | Thesis/Dissertation |
Type | Text |
FileFormat | application/pdf |
Language | eng |
Date Available | 2009-04-20 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
IsShownAt | 10.14288/1.0088340 |
URI | http://hdl.handle.net/2429/7438 |
Degree | Doctor of Philosophy - PhD |
Program | Mechanical Engineering |
Affiliation | Applied Science, Faculty of; Mechanical Engineering, Department of |
Degree Grantor | University of British Columbia |
GraduationDate | 1997-11 |
Campus | UBCV |
Scholarly Level | Graduate |
AggregatedSourceRepository | DSpace |