Open Collections
UBC Theses and Dissertations
Robust tests on the equality of variances 1997
Item Metadata
Title | Robust tests on the equality of variances |
Creator |
Lai, Man-Po |
Date Created | 2009-03-25 |
Date Issued | 2009-03-25 |
Date | 1997 |
Description | The classic F test for the hypothesis concerning the equality of two population variances is known to be non-robust. When we apply the classical F test to the non-normal samples, the actual size of the test can be different from its nominal level. Therefore, several robust alternatives have been introduced in the literature. In this thesis, I will present some of these alternatives, and illustrate their application with some examples. A new approach will also be introduced. The best feature of this method is that it seems to be able to overcome the adverse effect of outliers. A Monte Carlo study is used to compare the new test with the F test and the other methods. The results of this study are encouraging for the new test. |
Extent | 1953671 bytes |
Genre |
Thesis/Dissertation |
Type |
Text |
File Format | application/pdf |
Language | Eng |
Collection |
Retrospective Theses and Dissertations, 1919-2007 |
Series | UBC Retrospective Theses Digitization Project |
Date Available | 2009-03-25 |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
DOI | 10.14288/1.0088196 |
Degree |
Master of Science - MSc |
Program |
Statistics |
Affiliation |
Science, Faculty of |
Degree Grantor | University of British Columbia |
Graduation Date | 1997-11 |
Campus |
UBCV |
Scholarly Level | Graduate |
URI | http://hdl.handle.net/2429/6535 |
Aggregated Source Repository | DSpace |
Digital Resource Original Record | https://open.library.ubc.ca/collections/831/items/1.0088196/source |
Full Text
Robust tests on the equality of variances

by Man-Po Lai

B.Sc., University of British Columbia, 1994

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in THE FACULTY OF GRADUATE STUDIES, DEPARTMENT OF STATISTICS

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA

1997

© Man-Po Lai, 1997

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of Statistics, The University of British Columbia, Vancouver, Canada

Abstract

The classic F test for the hypothesis concerning the equality of two population variances is known to be non-robust. When we apply the classical F test to non-normal samples, the actual size of the test can be different from its nominal level. Therefore, several robust alternatives have been introduced in the literature. In this thesis, I will present some of these alternatives, and illustrate their application with some examples. A new approach will also be introduced. The best feature of this method is that it seems to be able to overcome the adverse effect of outliers. A Monte Carlo study is used to compare the new test with the F test and the other methods. The results of this study are encouraging for the new test.
Contents

Abstract
Table of Contents
List of Tables
List of Figures
Acknowledgements
1 Introduction
  1.1 Non-normality
    1.1.1 Normal Case
    1.1.2 Non-normal Case
  1.2 Some Robust Methods
    1.2.1 The Levene test
    1.2.2 Modifications of Levene test
    1.2.3 The Jacknife test
    1.2.4 The Box test
    1.2.5 The Moses test
    1.2.6 The Layard $\chi^2$ test
    1.2.7 The Box-Andersen Test
  1.3 Example
    1.3.1 The first example: Cloud
    1.3.2 The second example: Michelson
2 A New Robust test
  2.1 Robust Dispersion Estimates
  2.2 Asymptotic Distribution of R
    2.2.1 Normal case
    2.2.2 Non-normal case
  2.3 Examples
3 Monte Carlo study
4 Conclusion
Bibliography

List of Tables

1 Actual asymptotic significance level of F test ($\alpha = 0.05$), with non-normal samples
2 Results of tests on variances for the Cloud data
3 Results of tests on variance for the Michelson data
4 Relation between c, a, b and EFF
5 Acceptance regions of $R = Sr_1/Sr_2$ with $\alpha = 0.05$ obtained from asymptotic distribution and simulation with 10,000 repetitions
6 Simulated actual significance level of the new test ($\alpha = 0.05$) from 10,000 generated data, with normal assumption for several non-normal distributions
7 Results of the new tests (c = 1.7, 2.07, 2.3765) on the Cloud example. If reject = 1, the test rejects $H_0$
8 Results of the new tests (c = 1.7, 2.07, 2.3765) on the Michelson example. If reject = 1, the test rejects $H_0$
9 Acceptance regions of $R = Sr_1/Sr_2$ with $\alpha = 0.05$ with sample sizes $n_1, n_2$ obtained from simulation with 1,000 repetitions
10 Monte Carlo Power Function for Tests on Variances for Normal distribution
11 Monte Carlo Power Functions for Tests on Variances for $\chi^2_5$ distribution
12 Monte Carlo Power Functions for Tests on Variances for $t_5$ distribution
13 Monte Carlo Power Functions for Tests on Variances for $t_{10}$ distribution
14 Monte Carlo Power Functions for Tests on Variances for $t_{20}$ distribution
15 Monte Carlo Power Functions for Tests on Variances based on two samples from the N(0,1) population with different number of outliers from the N(5,0.1) in the first sample
16 Monte Carlo Power Functions for Tests on Variances based on two samples from the N(0,1) and N(0,3) populations with different number of outliers from the N(5.5,0.1) in the first sample
17 Monte Carlo Power Functions for Tests on Variances based on two samples from the N(0,1) and N(0,3) populations with different number of outliers from the N(10,0.1) in the first sample

List of Figures

1 Side by Side Boxplots of the two logged variables in Cloud example
2 Side by Side Boxplots of the two logged variables with outliers in the seeded sample
3 Side by Side Boxplots of the measurements in the first and fifth trials in Michelson's example
4 Side by Side Boxplots of the two variables with outliers in the fifth sample
5 Histograms of R for different combinations of sample size n, and c

Acknowledgements

I would like to thank my supervisor, Dr. Ruben Zamer, for introducing me to such an interesting topic. Ruben was always willing to give his advice, ideas, and encouragement, which were much needed. Also, I would like to thank Dr. Paul Gustafson for his very useful comments and careful reading.
Thanks also go to Daniel Ng for his invaluable help with LaTeX, and various other problems along the way.

1 Introduction

It is well known that the two-sample t-test is a reliable method for testing differences between population means, because it is insensitive to departures from normality in the populations. On the other hand, when testing differences between population variances, the F test is known to be rather sensitive to the assumption of normality. As a result, the null hypothesis may be rejected because the random variables are not normally distributed rather than because the variances are unequal.

This chapter focuses on inferences about the variances of two populations. Section 1.1 investigates the influence of non-normality on comparing the variation in two samples. Section 1.2 describes alternative robust methods which have been proposed to deal with the non-normality problem.

1.1 Non-normality

The classic F test was first proposed by Bartlett [1]. Unfortunately, the F test is very sensitive to the assumption that the underlying populations have normal distributions. Box [2] showed that when the underlying distributions are non-normal, this test can have an actual size several times larger than its nominal level of significance.

To see the influence of non-normality on comparing the variation in two samples by a classical F test, we will look at the normally and non-normally distributed cases. Firstly, we will derive the asymptotic distribution of the classic F test statistic under the assumption of an underlying normal distribution. Secondly, we will investigate how this asymptotic distribution changes under departures from normality.

1.1.1 Normal Case

Let us consider a two sample problem. Let $y_{11}, \ldots, y_{1n_1}$ and $y_{21}, \ldots, y_{2n_2}$ be two independently distributed samples from the distributions $N(\mu_1, \sigma_1^2)$ and $N(\mu_2, \sigma_2^2)$ respectively.
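The asymptotic distribution of the test statistic in the classical F test will be derived, although the statistic has an exact F distribution under the null hypothesis and the normal assumption. We use the asymptotic distribution because the distribution of the test statistic is hard to obtain when the samples are non-normal. For simplicity, we assume first that $n_1 = n_2 = n$. The sample variances

$$S_i^2 = \frac{1}{n-1}\sum_{j=1}^{n}(y_{ij} - \bar{y}_i)^2, \quad i = 1, 2,$$

are unbiased estimators of the corresponding population variances $\sigma_i^2$, $i = 1, 2$, where the $\bar{y}_i$ are the corresponding sample means. By the Central Limit Theorem,

$$\sqrt{n}\left[\begin{pmatrix} S_1^2 \\ S_2^2 \end{pmatrix} - \begin{pmatrix} \sigma_1^2 \\ \sigma_2^2 \end{pmatrix}\right] \to N\left(\begin{pmatrix} 0 \\ 0 \end{pmatrix}, \Sigma\right), \quad (1)$$

where

$$\Sigma = \begin{pmatrix} 2\sigma_1^4 & 0 \\ 0 & 2\sigma_2^4 \end{pmatrix}. \quad (2)$$

By the Delta Method,

$$\sqrt{n}\left(\frac{S_1}{S_2} - \frac{\sigma_1}{\sigma_2}\right) \to N(0, \nabla g' \Sigma \nabla g), \quad (3)$$

where $g(x, y) = \sqrt{x/y}$ and $\nabla g$ is the gradient of $g(x, y)$ evaluated at $(\sigma_1^2, \sigma_2^2)$:

$$\nabla g = \left(\frac{\partial g}{\partial x}, \frac{\partial g}{\partial y}\right)' \quad (4)$$

$$= \left(\frac{1}{2\sqrt{xy}}, -\frac{\sqrt{x}}{2y^{3/2}}\right)' \quad (5)$$

$$= \left(\frac{1}{2\sigma_1\sigma_2}, -\frac{\sigma_1}{2\sigma_2^3}\right)'. \quad (6)$$

Therefore,

$$\nabla g' \Sigma \nabla g = \frac{\sigma_1^2}{\sigma_2^2}. \quad (7)$$

If the null hypothesis, $H_0: \sigma_1 = \sigma_2$, is true, then $\nabla g' \Sigma \nabla g = 1$ and, according to equation (3),

$$T = \sqrt{n}\left(\frac{S_1}{S_2} - 1\right) \to N(0, 1). \quad (8)$$

So, we can use T to test the two-sided $H_0$, and would reject $H_0$ when T exceeds the upper $100(\alpha/2)$ percentile or falls below the lower $100(\alpha/2)$ percentile of the N(0,1) distribution. Thus, $H_0$ is rejected when $|T| > z(1 - \alpha/2)$. For instance, if $\alpha = 0.05$, then $H_0$ is rejected when $|T|$ is greater than 1.96. For the unequal sample size case, if $n_1/n_2 \to d$, then

$$\sqrt{n_1 + n_2}\left(\frac{S_1}{S_2} - 1\right) \to N\left(0, \frac{(1 + d)^2}{2d}\right). \quad (9)$$

1.1.2 Non-normal Case

The method described in the last section is based on the assumption of normality. To see how sensitive this method is to departures from normality, we will look at cases in which the populations follow other distributions: double exponential, $t_5$, $t_{10}$, $\chi^2_5$, $\chi^2_{10}$ and uniform. In addition, we will calculate their actual asymptotic significance levels. Let us first look at the general case.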
If the observations $y_{11}, \ldots, y_{1n_1}$ and $y_{21}, \ldots, y_{2n_2}$ are independently distributed according to a general distribution F(y), then

$$E(S_i^2) = \sigma_i^2, \quad (10)$$

$$Var(S_i^2) = \sigma_i^4\left(\frac{2}{n - 1} + \frac{\gamma}{n}\right), \quad (11)$$

where

$$\gamma = \frac{E(y - \mu)^4}{[E(y - \mu)^2]^2} - 3. \quad (12)$$

$\gamma$ is called the coefficient of kurtosis and measures the peakedness or flatness of the probability distribution function (pdf). For the normal case, $\gamma = 0$ and $Var(S^2) = 2\sigma^4/(n - 1)$. By the CLT,

$$\sqrt{n}\left[\begin{pmatrix} S_1^2 \\ S_2^2 \end{pmatrix} - \begin{pmatrix} \sigma_1^2 \\ \sigma_2^2 \end{pmatrix}\right] \to N\left(\begin{pmatrix} 0 \\ 0 \end{pmatrix}, \Sigma\right), \quad (13)$$

where

$$\Sigma = \begin{pmatrix} (2 + \gamma)\sigma_1^4 & 0 \\ 0 & (2 + \gamma)\sigma_2^4 \end{pmatrix}. \quad (14)$$

According to (4) and (14),

$$\nabla g' \Sigma \nabla g = \frac{(2 + \gamma)\sigma_1^2}{2\sigma_2^2}. \quad (15)$$

By the Delta Method, if $H_0$ is true, we obtain

$$\sqrt{n}\left(\frac{S_1}{S_2} - 1\right) \to N\left(0, \frac{2 + \gamma}{2}\right), \quad (16)$$

and for the unequal sample size case,

$$\sqrt{n_1 + n_2}\left(\frac{S_1}{S_2} - 1\right) \to N\left(0, \frac{(2 + \gamma)(1 + d)^2}{4d}\right). \quad (17)$$

If the normality assumption is met, $\gamma = 0$, so that equation (16) is equivalent to equation (8). However, for non-normal cases such as $t_5$, $\gamma$ is not zero. So when we apply the classical F test to non-normal samples, the actual size of the test differs from its nominal level of significance, $\alpha$. Table 1 displays the value of $\gamma$ and the actual significance level of the F test ($\alpha = 0.05$) for several non-normal distributions: double exponential, $t_5$, $t_{10}$, $\chi^2_5$, $\chi^2_{10}$ and uniform(a, b).

Distribution | $\gamma$ | Actual significance level |
double exponential | 3 | 0.215 |
$t_5$ | 6 | 0.327 |
$t_{10}$ | 1 | 0.110 |
$\chi^2_5$ | 2.4 | 0.186 |
$\chi^2_{10}$ | 1.2 | 0.121 |
uniform(a, b) | -1.2 | 0.002 |

Table 1: Actual asymptotic significance level of F test ($\alpha = 0.05$), with non-normal samples

Note that the arguments of the uniform distribution do not affect the result, since in this case $\gamma$ is always equal to -1.2. Also, for a heavy-tailed distribution ($\gamma > 0$), the probability of rejecting $H_0$ exceeds 0.05; whereas, for a short-tailed distribution, the probability is less than 0.05.

1.2 Some Robust Methods

This section discusses alternatives to the test based on T defined in (8). The six robust methods considered here are the Levene test [6], the Jacknife test [7], the Box test [2], the Box-Andersen test [3], the Moses test [9], and the Layard $\chi^2$ test [5].
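The entries of Table 1 follow from the limiting variance $(2 + \gamma)/2$ of T under $H_0$: the test rejects when $|T| > z(1 - \alpha/2)$, so the actual asymptotic level is $2[1 - \Phi(z(1 - \alpha/2)/\sqrt{(2 + \gamma)/2})]$. The following is a minimal sketch of that arithmetic, not code from the thesis; the function name `actual_level` and the dictionary layout are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist


def actual_level(gamma, alpha=0.05):
    """Asymptotic rejection probability of the normal-theory test when the
    coefficient of kurtosis is gamma, i.e. T ~ N(0, (2 + gamma) / 2) under H0."""
    std_normal = NormalDist()
    z = std_normal.inv_cdf(1 - alpha / 2)   # nominal critical value (about 1.96)
    sd_t = sqrt((2 + gamma) / 2)            # actual asymptotic sd of T
    return 2 * (1 - std_normal.cdf(z / sd_t))


# Kurtosis values as listed in Table 1.
kurtosis = {
    "double exponential": 3.0,
    "t5": 6.0,
    "t10": 1.0,
    "chi-square(5)": 2.4,
    "chi-square(10)": 1.2,
    "uniform": -1.2,
}
levels = {name: actual_level(g) for name, g in kurtosis.items()}
for name, level in levels.items():
    print(f"{name:20s} gamma = {kurtosis[name]:5.1f}  level = {level:.3f}")
```

With $\gamma = 0$ the function returns the nominal level $\alpha$, which is a quick sanity check on the formula.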
1.2.1 The Levene test

The idea of the Levene test [6] is to transform the original data $y_{ij}$ into $z_{ij} = |y_{ij} - \bar{y}_i|$, $j = 1, \ldots, n_i$, for the two samples, $i = 1, 2$. Then we simply pretend that the $z_{ij}$ are independently, identically, normally distributed under $H_0$, and use the usual t test on the two transformed samples: $z_{11}, \ldots, z_{1n_1}$ and $z_{21}, \ldots, z_{2n_2}$. Obviously, the $z_{ij}$'s do not satisfy the above assumptions. Normality is not met because the $z_{ij}$'s are absolute values. Independence is violated because of the common term $\bar{y}_i$ in each $z_{ij}$; also, they are not identically distributed unless $n_1 = n_2$. However, as mentioned at the beginning of this chapter, the t test is a reliable method for checking differences between means because it is insensitive to non-normality. To apply the two sample t test we form a new statistic

$$T_l = \frac{\bar{z}_1 - \bar{z}_2}{s}$$

with

$$s = \left(\frac{var(z_1)}{n_1} + \frac{var(z_2)}{n_2}\right)^{1/2},$$

where $\bar{z}_1$, $\bar{z}_2$, $var(z_1)$, and $var(z_2)$ are the means and variances of the samples $z_1$ and $z_2$. Levene [6] showed that under the null hypothesis, the distribution of $T_l$ can be approximated by a t distribution with degrees of freedom

$$\nu = \left[\frac{c^2}{n_1 - 1} + \frac{(1 - c)^2}{n_2 - 1}\right]^{-1},$$

where

$$c = \frac{var(z_1)}{n_1 s^2}.$$

For the two-sided test, if $|T_l|$ is greater than $t_\nu(1 - \alpha/2)$, where $\nu$ is the degrees of freedom, $H_0$ would be rejected.

1.2.2 Modifications of Levene test

For skewed distributions, such as the $\chi^2$ with 4 degrees of freedom (df), and heavy-tailed distributions, such as the Cauchy, the Levene test usually rejects too often. That is, the actual rejection rate exceeds the nominal significance level. For these settings, improved Levene-type procedures have been proposed by Brown and Forsythe [4], which modify the test statistic by replacing the central location $\bar{y}_i$ with more robust versions, such as the medians and the 10% trimmed means of the two samples. Monte Carlo studies [4] show that all of these test statistics are robust for the very heavy-tailed Cauchy distribution.
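The Levene statistic and its median-centred modification differ only in the location estimate subtracted from each observation, so both can be sketched with one function. This is an illustrative sketch under the formulas given above, not the thesis's own code; the name `levene_type_test` and the `center` parameter are assumptions introduced here. The standard library has no t quantile function, so the sketch returns the statistic and the approximate degrees of freedom, to be compared with a t table or a statistics package.

```python
from math import sqrt
from statistics import mean, median, variance


def levene_type_test(y1, y2, center=mean):
    """Two-sample t statistic on absolute deviations from a location estimate.

    center=mean gives the Levene form, center=median the median-based
    modification. Returns (t_statistic, approximate_degrees_of_freedom).
    """
    c1, c2 = center(y1), center(y2)
    z1 = [abs(v - c1) for v in y1]
    z2 = [abs(v - c2) for v in y2]
    n1, n2 = len(z1), len(z2)
    v1 = variance(z1) / n1          # var(z1) / n1, sample variance with n - 1
    v2 = variance(z2) / n2
    s2 = v1 + v2                    # s squared
    t_stat = (mean(z1) - mean(z2)) / sqrt(s2)
    c = v1 / s2
    df = 1.0 / (c ** 2 / (n1 - 1) + (1 - c) ** 2 / (n2 - 1))
    return t_stat, df


# Small made-up samples, used only to exercise the function.
t_mean, df_mean = levene_type_test([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
t_med, df_med = levene_type_test([1, 2, 3, 4, 5], [2, 4, 6, 8, 10], center=median)
print(t_mean, df_mean)
```

In this toy example the mean and the median of each sample coincide, so both versions give the same statistic; on skewed data they generally differ.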
For the $\chi^2(4)$ distribution, the statistic based on the median is robust but the one based on the 10% trimmed mean rejects too often. Usually the version based on the sample mean has the greatest power in situations where all three statistics are robust.

1.2.3 The Jacknife test

In [7], Miller proposed a procedure based on the Jacknife technique to test $H_0$ in the two-sample case. Let us first review the idea of the jacknife technique. Let $\theta$ be an unknown parameter, and let $(y_1, \ldots, y_N)$ be a sample of N independent observations with cumulative distribution function (cdf) $G_\theta$. Suppose that we use $\hat\theta$ to estimate $\theta$, and that the data are divided into n groups of size k. Let $\hat\theta_{-i}$, $i = 1, \ldots, n$, denote the estimate of $\theta$ obtained by deleting the i-th group and estimating $\theta$ from the remaining $(n - 1)k$ observations. Define

$$\tilde\theta_i = n\hat\theta - (n - 1)\hat\theta_{-i}, \quad i = 1, \ldots, n, \quad \text{and} \quad \tilde\theta = \frac{1}{n}\sum_{i=1}^{n}\tilde\theta_i,$$

then the statistic

$$\frac{\tilde\theta - \theta}{\left[\frac{1}{n(n - 1)}\sum_{i=1}^{n}(\tilde\theta_i - \tilde\theta)^2\right]^{1/2}} \quad (18)$$

should be approximately distributed as t with $(n - 1)$ df. The statistic (18) can be used to perform an approximate significance test on $\theta$.

To apply the jacknife technique to test $H_0: \ln\sigma_x^2 = \ln\sigma_y^2$ in the two-sample case, we first define

$$\theta_x = \ln\sigma_x^2, \quad \theta_y = \ln\sigma_y^2, \quad \hat\theta_x = \ln S_x^2, \quad \hat\theta_y = \ln S_y^2,$$

$$_x\tilde\theta_i = n_1 \ln S_x^2 - (n_1 - 1)\ln {}_xS_{-i}^2, \quad _y\tilde\theta_i = n_2 \ln S_y^2 - (n_2 - 1)\ln {}_yS_{-i}^2,$$

where $n_i$ is the number of subsamples in the i-th sample. Since the $_x\tilde\theta_i$ and $_y\tilde\theta_i$ are approximately independently distributed, Miller proposed to test $H_0$ by using a two sample t-test on the two samples: $_x\tilde\theta_1, \ldots, {}_x\tilde\theta_{n_1}$ and $_y\tilde\theta_1, \ldots, {}_y\tilde\theta_{n_2}$. To apply the two sample t test we form a new statistic

$$T_\theta = \frac{_x\tilde\theta - {}_y\tilde\theta}{s}$$

with

$$s = \left(\frac{var(_x\tilde\theta)}{n_1} + \frac{var(_y\tilde\theta)}{n_2}\right)^{1/2},$$

where $_x\tilde\theta$, $_y\tilde\theta$, $var(_x\tilde\theta)$, and $var(_y\tilde\theta)$ are the sample means and variances of the two samples of pseudo-values. He showed that under the null hypothesis, the distribution of $T_\theta$ can be approximated by a t distribution with degrees of freedom

$$\nu = \left[\frac{c^2}{n_1 - 1} + \frac{(1 - c)^2}{n_2 - 1}\right]^{-1},$$

where

$$c = \frac{var(_x\tilde\theta)}{n_1 s^2}.$$

For the two-sided test, we first compute $|T_\theta|$.
If $|T_\theta|$ is greater than $t_\nu(1 - \alpha/2)$, we reject $H_0$ and conclude that the two variances are different.

1.2.4 The Box test

The Box test [2] is the earliest robust test for equality of variances. For the two sample case, as in the Jacknife test, each sample is divided into subsamples of size k (k > 1). So there are $n_1$ subsamples for the first sample $x_1, \ldots, x_{N_1}$, and $n_2$ subsamples for the second sample $y_1, \ldots, y_{N_2}$. Then $\ln S^2$ is obtained from each subsample. Let us define $G_{ij} = \ln S_{ij}^2$, $i = 1, 2$, and $j = 1, \ldots, n_i$. The $G_{ij}$ are approximately distributed as $N\left(\ln\sigma_i^2, \frac{2}{k - 1} + \frac{\gamma}{k}\right)$, and the Box procedure performs a two sample t test on the $G_{1j}$ and $G_{2j}$ to test $H_0: \ln\sigma_1^2 = \ln\sigma_2^2$. First, let us define $\bar{G}_1$, $\bar{G}_2$, $var(G_1)$, and $var(G_2)$ as the sample means and variances of the two samples $G_1$ and $G_2$, and

$$T_G = \frac{\bar{G}_1 - \bar{G}_2}{s}$$

with

$$s = \left(\frac{var(G_1)}{n_1} + \frac{var(G_2)}{n_2}\right)^{1/2}.$$

Under the null hypothesis, the distribution of $T_G$ can be approximated by a t distribution with degrees of freedom

$$\nu = \left[\frac{c^2}{n_1 - 1} + \frac{(1 - c)^2}{n_2 - 1}\right]^{-1},$$

where

$$c = \frac{var(G_1)}{n_1 s^2}.$$

For the two-sided test, if $|T_G|$ is greater than $t_\nu(1 - \alpha/2)$, where $\nu$ is the degrees of freedom, $H_0$ would be rejected.

Also, Box noted that the test statistic $T_G$ will not have exactly a t distribution, since $\ln S^2$ is not exactly normally distributed, but the level of significance should be closely approximated because of the robustness of the t statistic. The main disadvantage of the Box test is the loss of information in subdividing the samples; moreover, different groupings of the data within each sample can produce substantially different results.

1.2.5 The Moses test

The main idea of the Moses test [9] is to apply the Wilcoxon two sample rank test to the values $S^2$ obtained from the subsamples as in the Box test. This method was studied in detail by Shorack [10]. Besides $S^2$, other measures of dispersion (e.g., the range, or the mean deviation about the sample mean) were also considered for use in the subsamples.
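The subsampling step shared by the Box and Moses procedures can be sketched as follows. This is a minimal illustration, not the thesis's code: the function name `subsample_stats`, the random partition, and the choice to drop leftover observations when k does not divide the sample size are all assumptions. The default statistic is $\ln S^2$ as in the Box test; another dispersion measure can be passed through `stat`.

```python
import random
from math import log
from statistics import variance


def log_variance(group):
    """ln S^2 of one subsample (requires at least two distinct values)."""
    return log(variance(group))


def subsample_stats(sample, k, stat=log_variance, rng=None):
    """Randomly split `sample` into groups of size k and apply `stat` to each.

    Observations left over when k does not divide len(sample) are dropped,
    which is one source of the information loss mentioned in the text.
    """
    rng = rng or random.Random()
    shuffled = list(sample)
    rng.shuffle(shuffled)
    n_groups = len(shuffled) // k
    return [stat(shuffled[i * k:(i + 1) * k]) for i in range(n_groups)]


# Made-up data: 26 observations, groups of size 5, so one observation is dropped.
data = [float(i) for i in range(1, 27)]
g_values = subsample_stats(data, 5, rng=random.Random(0))
ranges = subsample_stats(data, 5, stat=lambda g: max(g) - min(g), rng=random.Random(0))
print(len(g_values), len(ranges))
```

Because the partition is random, repeated calls with different seeds give different subsample statistics, which is why the Box and Moses results can change from run to run on the same data.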
Moses pointed out the following properties: (a) this test yields an exact significance level, and (b) the two population means can be left completely unspecified. However, like the Box test, this test still suffers from the loss of information due to the sample subdivision.

1.2.6 The Layard $\chi^2$ test

Layard [5] suggested a $\chi^2$ test statistic which is a function of the kurtosis $\gamma$. For large sample size n, the statistic $\ln S^2$ is approximately normally distributed with mean $\ln\sigma^2$ and variance $\tau^2/(n - 1)$, where $\tau^2 = 2 + [1 - (1/n)]\gamma$, and $\gamma$ is the coefficient of kurtosis. Under $H_0$ the statistic

$$S = \sum_i (n_i - 1)\left[\ln S_i^2 - \frac{\sum_j (n_j - 1)\ln S_j^2}{\sum_j (n_j - 1)}\right]^2 \Big/ \tau^2$$

is asymptotically distributed like $\chi^2_1$, where $S_i^2$ is the sample variance of the i-th sample. However, $\gamma$ is unknown, so Layard suggested the use of

$$\hat\gamma = \frac{N\sum_i\sum_j (x_{ij} - \bar{x}_i)^4}{\left[\sum_i\sum_j (x_{ij} - \bar{x}_i)^2\right]^2} - 3, \quad N = n_1 + n_2,$$

to estimate the kurtosis. Hence, we can use the estimate $\hat\gamma$ and base a test on $\hat{S} = \tau^2 S/\hat\tau^2$, where $\hat\tau^2 = 2 + [1 - (1/n)]\hat\gamma$. If $\hat{S}$ exceeds the upper $100(\alpha/2)$ percentile or falls below the lower $100(\alpha/2)$ percentile of the $\chi^2$ distribution, the null hypothesis would be rejected.

Note that Layard [5] and Brown [4] have carried out simulated sampling experiments which suggest that the $\chi^2$ test compares favourably with the Box test. A difficulty with this procedure is that quite large samples are needed to get a reasonable estimate of $\gamma$.

1.2.7 The Box-Andersen Test

Box and Andersen [3] applied permutation theory to construct an approximate robust test. The idea of this test is to adjust the degrees of freedom for the statistic $S_x^2/S_y^2$, so that the mean and the variance of its distribution are equal to those under the permutation distribution.

Permutation theory assumes that the two samples have been randomly selected without replacement from $u_1 = y_{11}, \ldots, u_{n_1} = y_{1n_1}, u_{n_1+1} = y_{21}, \ldots, u_{n_1+n_2} = y_{2n_2}$, where $y_{ij} = x_{ij} - \mu_i$, and $\mu_i$ is the population mean of the i-th sample. For simplicity, the $\mu_i$'s are assumed to be known.
Each of the $\binom{n_1 + n_2}{n_1}$ possible combinations is equally likely. Let

$$B = \frac{\sum_{j=1}^{n_1} y_{1j}^2}{\sum_{i=1}^{2}\sum_{j=1}^{n_i} y_{ij}^2}.$$

The mean of B is the same under the normal and permutation distributions,

$$E_N(B) = E_P(B) = \frac{n_1}{N},$$

where $N = n_1 + n_2$. However, the variances differ. Under the normal distribution,

$$Var_N(B) = \frac{2n_1n_2}{N^2(N + 2)}.$$

Under the permutation distribution,

$$Var_P(B) = \frac{2n_1n_2}{N^2(N + 2)}\left[1 + \frac{1}{2}\left(\frac{N}{N - 1}\right)(b_2 - 3)\right],$$

where

$$b_2 = \frac{(N + 2)\sum_{i=1}^{2}\sum_{j=1}^{n_i} y_{ij}^4}{\left(\sum_{i=1}^{2}\sum_{j=1}^{n_i} y_{ij}^2\right)^2}.$$

By using new sample sizes, $\tilde{n}_1$ and $\tilde{n}_2$, we can make the two variances equal, where $\tilde{n}_1 = dn_1$, $\tilde{n}_2 = dn_2$, and

$$d = \left[1 + \frac{1}{2}\left(\frac{N + 2}{N - 1 - (b_2 - 3)}\right)(b_2 - 3)\right]^{-1}.$$

The mean of B is unchanged under this substitution. So, by redefining the sample sizes, the normal theory distribution for B can be made to approximate the permutation distribution for B.

According to the discussion above, Shorack [10] suggested the following approximate Box-Andersen test. The test approximates the distribution of the usual F by an F distribution on degrees of freedom $d_1, d_2$, where $d_1 = d(n_1 - 1)$ and $d_2 = d(n_2 - 1)$ with

$$b_2 = \frac{\left(\sum_{i=1}^{2} n_i\right)\sum_{i=1}^{2}\sum_{j=1}^{n_i}(x_{ij} - \bar{x}_i)^4}{\left[\sum_{i=1}^{2}\sum_{j=1}^{n_i}(x_{ij} - \bar{x}_i)^2\right]^2}$$

and

$$d = \left[1 + \frac{1}{2}(b_2 - 3)\right]^{-1}.$$

So, if the classic F statistic exceeds the upper $100(\alpha/2)$ percentile or falls below the lower $100(\alpha/2)$ percentile of the $F_{d_1,d_2}$ distribution, the null hypothesis would be rejected.

1.3 Example

This section contains two examples, which are available on the internet at the address http://lib.stat.cmu.edu/DASL/allmethods.html. The data file names are Clouds and Michelson.

1.3.1 The first example: Cloud

In the first example, clouds were randomly seeded or not with silver nitrate. Rainfall amounts were recorded from the clouds. The purpose of the experiment was to determine if cloud seeding increases rainfall. The side by side boxplots of the two logged variables (Fig 1) indicate that the variances of the two groups are very similar after a log transformation.
To compare the significance levels of these six tests, two outliers with the same value are added to the seeded sample, and the value of the outliers is increased until the results of these tests become steady. The side by side boxplots for each pair of samples are shown in Fig 2.

The results of these tests and the classic F test are displayed in Table 2. For the F, Levene, Layard, Jacknife, Box, Moses, and Box-Andersen tests, a test result of 1 in the table means that the test rejects $H_0$. For the Moses and Box tests, the test results may change with the choice of subsamples within each sample. To see how likely these two tests are to reject the null hypothesis, each of them is executed 100 times for each pair of samples. The entries are the proportions of rejections.

As expected, the F test is very non-robust. It rejects $H_0$ as soon as the two outliers with value 12 are added. In this example, of all the tests, the Moses and Box tests are the least affected by the outliers. They do not reject the null hypothesis even when the largest outliers, with value 100, are added. In addition, the performance of the Box-Andersen test is quite good. The Levene test is not as good as the Box-Andersen test, but is better than the Layard test, and the Jacknife test is the worst one.

1.3.2 The second example: Michelson

In Michelson's example, 100 determinations of the velocity of light in air were made using a modification of a method proposed by the French physicist Foucault. These measurements were grouped into five trials of 20 measurements each. The numbers are in km/sec, and have had 299,000 subtracted from them. The currently accepted 'true' velocity of light in vacuum is 299,792.5 km/sec. The side by side boxplots of the measurements in the first and fifth trials, Fig 3, reveal that their variances are very different.
To compare the power of the seven tests, one outlier is added to the sample with the smaller sample variance, and the value of the outlier is increased until none of these tests rejects $H_0$. The results of these tests and the side by side boxplots of each pair of samples are shown in Table 3 and Fig 4.

Value of two outliers | F test | Levene test | Layard test | Jacknife test | Box test | Moses test | Box-Andersen test |
no outlier | 0 | 0 | 0 | 0 | 0.03 | 0.04 | 0 |
12 | 1 | 0 | 0 | 0 | 0 | 0.01 | 0 |
14 | 1 | 0 | 0 | 1 | 0.01 | 0.04 | 0 |
25 | 1 | 0 | 1 | 1 | 0 | 0.01 | 0 |
28 | 1 | 1 | 1 | 1 | 0 | 0.01 | 0 |
30 | 1 | 1 | 1 | 1 | 0 | 0.04 | 1 |
100 | 1 | 1 | 1 | 1 | 0 | 0 | 1 |

Table 2: Results of tests on variances for the Cloud data.

Figure 1: Side by Side Boxplots of the two logged variables in Cloud example

Without the outlier, all tests except Box and Moses reject $H_0$, and these two tests reject $H_0$ in about 25% of their runs. Hence, these two tests are not powerful in this example. Surprisingly, the F test is not fooled by a large outlier in this example. The Levene test is also very powerful. The Layard test is the worst. The Jacknife and Box-Andersen tests are about equally powerful.

Value of outlier | F test | Levene test | Layard test | Jacknife test | Box test | Moses test | Box-Andersen test |
no outlier | 1 | 1 | 1 | 1 | 0.28 | 0.25 | 1 |
950 | 1 | 1 | 0 | 1 | 0.21 | 0.25 | 1 |
980 | 1 | 1 | 0 | 0 | 0.13 | 0.12 | 0 |
1000 | 1 | 1 | 0 | 0 | 0.13 | 0.03 | 0 |
1100 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |

Table 3: Results of tests on variance for the Michelson data.

According to the two examples, the power of the F test and of the Jacknife test is not much affected by the outliers, but their significance levels are very sensitive to the outliers. The Layard test is not so powerful but, in terms of the significance level, it is better than the Jacknife test. The Levene test is the most powerful test in Michelson's example, and its performance is better than that of the Layard test in the Cloud example. In addition, although the Moses and Box tests are not affected by the largest outlier in the first example, they are not robust.
They seem to be superior in the first example just because they are so conservative. Of all the tests, the Box-Andersen test is the best in these two examples.

2 A New Robust test

This chapter contains three sections. In the first section, a new robust method for testing the equality of variances between two populations is presented. In the second section, the asymptotic distribution of the new test statistic described in the first section is derived. In the last section, the new method is applied to the two examples mentioned in the first chapter.

2.1 Robust Dispersion Estimates

First, an alternative measure of dispersion that is more resistant to outliers is introduced. The best feature of this new method is that it has a superior ability to overcome the effect of outliers. This measure is insensitive to changes in the most extreme observations and therefore is resistant to outliers.

To start with, we consider just one sample, $x_1, \ldots, x_n$, with $x_i \sim N(\mu, \sigma^2)$ and the $x_i$ independent. The alternative measure of dispersion, based on the sample $x_1, \ldots, x_n$, is called $Sr$. Notice that $Sr$ satisfies the following equation:

$$\frac{1}{n}\sum_{i=1}^{n}\chi\left(\frac{x_i - T_n}{Sr}\right) = b, \quad (19)$$

where $T_n$ is the median of the sample and $\chi$ is defined as the function

$$\chi(z) = \begin{cases} z^2/c^2, & \text{if } |z| < c; \\ 1, & \text{otherwise,} \end{cases} \quad (21)$$

where c is arbitrary. The value of b depends on the choice of c. To ensure consistency of $Sr$ (i.e. $Sr \to \sigma$ as $n \to \infty$), we choose

$$b = E[\chi(z)] \quad (20)$$

with $z \sim N(0, 1)$. Observe that for $-c < z < c$, $\chi(z)$ equals, up to the constant factor $1/c^2$, the score function of the sample standard deviation. For the two sample case, $Sr_i$ denotes the new measure of dispersion in the i-th sample, $i = 1, 2$. The new test statistic for $H_0$ will be based on the ratio

$$R = \frac{Sr_1}{Sr_2}. \quad (23)$$

The asymptotic distribution of R is derived in the next section. In addition, Miller [8] also gave some references and mentioned the possibility of doing a test based on the ratio of MADs, which is a particular case of a robust scale estimate.
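A minimal sketch of how $Sr$ could be computed from the defining equation is given below. It is not the thesis's implementation (the thesis reports using S-PLUS and does not list code): the names `chi`, `consistency_constant` and `robust_scale`, the closed form used for $b = E[\chi(z)]$, and the bisection solver are all assumptions made for illustration. The sketch also assumes the sample is not degenerate (more than a fraction b of the observations differ from the median).

```python
from math import exp, pi, sqrt
from statistics import NormalDist, median


def chi(z, c):
    """chi(z) = z^2 / c^2 for |z| < c, and 1 otherwise."""
    return min((z / c) ** 2, 1.0)


def consistency_constant(c):
    """b = E[chi(Z)] for Z ~ N(0, 1), evaluated in closed form."""
    big_phi = NormalDist().cdf(c)
    small_phi = exp(-c * c / 2) / sqrt(2 * pi)
    inner = (2 * big_phi - 1) - 2 * c * small_phi   # E[Z^2 ; |Z| < c]
    return inner / c ** 2 + 2 * (1 - big_phi)


def robust_scale(x, c=1.7, iterations=200):
    """Solve (1/n) sum chi((x_i - median) / s) = b for s by bisection.

    The left-hand side is non-increasing in s, so a sign change is bracketed
    between a very small s and the s that would solve the equation if no
    observation were truncated.
    """
    b = consistency_constant(c)
    center = median(x)
    r = [v - center for v in x]
    n = len(r)

    def mean_chi(s):
        return sum(chi(ri / s, c) for ri in r) / n

    hi = sqrt(sum(ri * ri for ri in r) / (n * b * c * c))  # mean_chi(hi) <= b
    lo = hi * 1e-9
    for _ in range(iterations):
        mid = sqrt(lo * hi)          # bisect on the log scale
        if mean_chi(mid) > b:
            lo = mid
        else:
            hi = mid
    return sqrt(lo * hi)


# Made-up sample with one gross outlier whose size is varied.
base = [-2.1, -1.3, -0.8, -0.4, -0.1, 0.2, 0.5, 0.9, 1.4, 2.2]
s_50 = robust_scale(base + [50.0])
s_5000 = robust_scale(base + [5000.0])
print(s_50, s_5000)
```

In this toy sample the estimate is the same whether the outlier is 50 or 5000: once an observation lies beyond $c \cdot Sr$ from the median it contributes exactly 1 to the sum, whatever its size.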
2.2 Asymptotic Distribution of R

In this section, the asymptotic distributions of the test statistic R for the normal and non-normal cases are derived. To see the influence of non-normality when comparing the variation in two samples, we will look at the normally and non-normally distributed cases. Firstly, we will describe the statistical method based on the assumption of an underlying normal distribution. Secondly, we will investigate how sensitive this method is to departures from normality.

2.2.1 Normal case

First, we need to compute the asymptotic distribution of $\sqrt{n}(Sr - \sigma)$. Because R is location invariant, we can assume, without loss of generality, that $\mu = 0$. By the Taylor series expansion,

$$0 = \frac{1}{n}\sum_{i=1}^{n}\chi\left(\frac{x_i}{Sr}\right) - b \approx \frac{1}{n}\sum_{i=1}^{n}\chi\left(\frac{x_i}{\sigma}\right) - b - \left[\frac{1}{n}\sum_{i=1}^{n}\chi'\left(\frac{x_i}{\sigma}\right)\frac{x_i}{\sigma^2}\right](Sr - \sigma). \quad (24)$$

So,

$$\sqrt{n}(Sr - \sigma) \approx \frac{\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\left[\chi\left(\frac{x_i}{\sigma}\right) - b\right]}{\frac{1}{n}\sum_{i=1}^{n}\chi'\left(\frac{x_i}{\sigma}\right)\frac{x_i}{\sigma^2}}. \quad (25)$$

By the Law of Large Numbers,

$$\frac{1}{n}\sum_{i=1}^{n}\chi'\left(\frac{x_i}{\sigma}\right)\frac{x_i}{\sigma^2} \to q \quad (26)$$

with

$$q = E\left[\chi'\left(\frac{x}{\sigma}\right)\frac{x}{\sigma^2}\right] = \frac{1}{\sigma}E[\chi'(z)z].$$

Also, let

$$y_i = \chi\left(\frac{x_i}{\sigma}\right) - b,$$

where

$$E(y) = 0 \quad \text{and} \quad Var(y) = E\left\{\left[\chi\left(\frac{x}{\sigma}\right) - b\right]^2\right\} = \tau^2. \quad (27)$$

By the CLT,

$$\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\left[\chi\left(\frac{x_i}{\sigma}\right) - b\right] \to N(0, \tau^2). \quad (28)$$

Therefore, by Slutsky's Theorem,

$$\sqrt{n}(Sr - \sigma) \to \frac{N(0, \tau^2)}{q} = N(0, a\sigma^2), \quad (29)$$

with

$$a = \frac{\tau^2}{(E[\chi'(z)z])^2}.$$

The value of a depends on the choice of c. Table 4 shows how a, b and EFF, the relative efficiency of $Sr$ to the classic sample standard deviation SD, vary with the value of c.

c | a | b | EFF |
1.041 | 0.989 | 0.500 | 0.51 |
1.7 | 0.625 | 0.294 | 0.80 |
2.07 | 0.555 | 0.218 | 0.90 |
2.3765 | 0.526 | 0.172 | 0.95 |

Table 4: Relation between c, a, b and EFF

The table shows that the efficiency of the dispersion estimate increases with c. We do not use a larger c to obtain greater efficiency because, as c increases, b decreases, and the smaller the value of b, the less robust the test is. In the next chapter, we will find a value of c such that the test is both robust and efficient.

In the two sample case, suppose we have two independent samples, $x_{11}, \ldots, x_{1n_1}$ and $x_{21}, \ldots, x_{2n_2}$, from the populations $N(\mu_1, \sigma_1)$ and $N(\mu_2, \sigma_2)$.
Suppose the $x_{ij}$, $j = 1, \ldots, n_i$, are independent. For simplicity, we assume $n_1 = n_2 = n$. By the Central Limit Theorem,

$$\sqrt{n}\left[\begin{pmatrix} Sr_1 \\ Sr_2 \end{pmatrix} - \begin{pmatrix} \sigma_1 \\ \sigma_2 \end{pmatrix}\right] \to N\left(\begin{pmatrix} 0 \\ 0 \end{pmatrix}, \Sigma\right), \quad (30)$$

where

$$\Sigma = \begin{pmatrix} a\sigma_1^2 & 0 \\ 0 & a\sigma_2^2 \end{pmatrix}. \quad (31)$$

Let us define a function $g(x, y) = x/y$. Thus, we have by the Delta Method,

$$\sqrt{n}\left(\frac{Sr_1}{Sr_2} - \frac{\sigma_1}{\sigma_2}\right) \to N(0, \nabla g' \Sigma \nabla g), \quad (32)$$

where

$$\nabla g = \left(\frac{\partial}{\partial x}g(\sigma_1, \sigma_2), \frac{\partial}{\partial y}g(\sigma_1, \sigma_2)\right)' \quad (33)$$

$$= \left(\frac{1}{\sigma_2}, -\frac{\sigma_1}{\sigma_2^2}\right)' \quad (34)$$

and

$$\nabla g' \Sigma \nabla g = \frac{2a\sigma_1^2}{\sigma_2^2}. \quad (35)$$

If the null hypothesis, $H_0: \sigma_1 = \sigma_2$, is true, then $\nabla g' \Sigma \nabla g = 2a$ and

$$S = \sqrt{\frac{n}{2a}}(R - 1) \to N(0, 1). \quad (36)$$

So, we can use S to test the two-sided $H_0$, and would reject $H_0$ when S exceeds the upper $100(\alpha/2)$ percentile or falls below the lower $100(\alpha/2)$ percentile of the N(0,1) distribution. Thus, $H_0$ is rejected when $|S| > z(1 - \alpha/2)$. For instance, if $\alpha = 0.05$, then $H_0$ is rejected when $|S|$ is greater than 1.96.

Table 5 displays the upper and lower critical values (i.e. the acceptance regions) for the test statistic $R = Sr_1/Sr_2$, with $\alpha = 0.05$, based on both the asymptotic distribution and the generation of R from 10,000 random numbers in S-PLUS.

c | Method | n = 25 | n = 50 |
1.7 | Asymptotic distribution | (0.562, 1.438) | (0.690, 1.310) |
1.7 | Simulation | (0.631, 1.575) | (0.727, 1.383) |
2.07 | Asymptotic distribution | (0.587, 1.413) | (0.707, 1.292) |
2.07 | Simulation | (0.648, 1.538) | (0.736, 1.356) |
2.3765 | Asymptotic distribution | (0.598, 1.402) | (0.708, 1.292) |
2.3765 | Simulation | (0.654, 1.535) | (0.745, 1.341) |

Table 5: Acceptance regions of $R = Sr_1/Sr_2$ with $\alpha = 0.05$ obtained from asymptotic distribution and simulation with 10,000 repetitions.

The larger the sample size, the smaller the difference between the acceptance regions obtained from the two methods. Fig 5 shows the simulated distribution of R, with sample sizes 25 and 50 and c = 1.7, 2.07, 2.3765. The histogram for the smaller sample size is more skewed to the right, but as the sample size increases it becomes more symmetric. For the unequal sample size case, if $n_1/n_2 \to d$, then we obtain

$$\sqrt{n_1 + n_2}(R - 1) \to N\left(0, \frac{(1 + d)^2}{d}a\right). \quad (37)$$
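The constants a, b and EFF, and the asymptotic acceptance regions for R, can be evaluated numerically from the expressions derived in this section. The sketch below is an illustration under stated assumptions rather than the author's computation: it takes $\chi(z) = z^2/c^2$ for $|z| < c$ and 1 otherwise, uses closed-form truncated normal moments, and takes EFF as $0.5/a$ (the asymptotic variance of the sample standard deviation under normality is $\sigma^2/2$). The function names are hypothetical.

```python
from math import exp, pi, sqrt
from statistics import NormalDist


def scale_constants(c):
    """Return (b, a, EFF) for the truncated-square chi function with cutoff c."""
    big_phi = NormalDist().cdf(c)
    small_phi = exp(-c * c / 2) / sqrt(2 * pi)
    tail = 2 * (1 - big_phi)                                        # P(|Z| >= c)
    m2 = (2 * big_phi - 1) - 2 * c * small_phi                      # E[Z^2 ; |Z| < c]
    m4 = 3 * (2 * big_phi - 1) - 2 * small_phi * (c ** 3 + 3 * c)   # E[Z^4 ; |Z| < c]
    b = m2 / c ** 2 + tail               # E[chi(Z)]
    tau2 = m4 / c ** 4 + tail - b ** 2   # Var[chi(Z)]
    slope = 2 * m2 / c ** 2              # E[chi'(Z) Z]
    a = tau2 / slope ** 2
    return b, a, 0.5 / a


def acceptance_region(c, n, alpha=0.05):
    """Asymptotic acceptance region for R with equal sample sizes n,
    from sqrt(n / (2a)) (R - 1) -> N(0, 1)."""
    _, a, _ = scale_constants(c)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * sqrt(2 * a / n)
    return 1 - half_width, 1 + half_width


for c in (1.041, 1.7, 2.07, 2.3765):
    b, a, eff = scale_constants(c)
    print(f"c = {c:<7} a = {a:.3f}  b = {b:.3f}  EFF = {eff:.2f}")
print(acceptance_region(1.7, 25))
```

Under these assumptions, b for c = 1.7 evaluates to about 0.294; since $\chi \le 1$, b can never exceed 1.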
2.2.2 Non-normal case

To see how sensitive the new test is to departures from normality, we look at cases in which the population follows other distributions: $t_5$, $t_{10}$, $\chi^2_5$, $\chi^2_{10}$, uniform(0,1), and uniform(0,10). In addition, we estimate the actual significance levels by generating 10,000 numbers. Since we want to know whether the two arguments of the uniform distribution affect the results, the uniform distributions with arguments (0,1) and (0,10) are both investigated.

The simulated significance levels ($\alpha = 0.05$) for the non-normal distributions are displayed in Table 6. The normal case is included in the table to show how large the error due to the generation of data is. Note that the arguments of the uniform distribution do not affect the result. Also, for a heavy-tailed distribution, the probability of rejecting $H_0$ exceeds 0.05, whereas for a short-tailed distribution the probability is less than 0.05. In general, however, the results are closer to 0.05 than those from the classic F test. Also, the significance levels yielded by smaller c are closer to 0.05.

2.3 Examples

In this section, the new tests with c = 1.7, 2.07, 2.3765 are applied to the examples described in the first chapter. The test results and the test statistics R for each pair of samples are shown in Tables 7 and 8. Table 9 displays the acceptance regions of R with $\alpha = 0.05$ for sample sizes $n_1$, $n_2$. The acceptance regions shown in the table are obtained from simulation with 1,000 repetitions. In the Cloud example, when no outlier is added, $n_1 = n_2 = 26$, and with c = 1.7, R = 0.958.
Since R is within the acceptance region [0.689, 1.457] shown in Table 9, the new test with c = 1.7 does not reject the null hypothesis. For all three tests, no matter how large the two outliers are, the null hypothesis is still not rejected. This means that the tests are not affected by the extremely large observations. Also, for each test, the value of R does not vary with the value of the outliers.

Similarly, for Michelson's example, the size of the outliers has no influence on the results of the tests, and the value of R stays constant across the different outlier values. Based on these two examples, we can conclude that the new tests have a superior ability to overcome the effect of outliers.

Distribution | c = 1.7 | c = 2.07 | c = 2.3765 |
---|---|---|---|
N(0,1) | 0.053 | 0.052 | 0.051 |
$t_5$ | 0.087 | 0.105 | 0.123 |
$t_{10}$ | 0.073 | 0.079 | 0.080 |
$\chi^2_5$ | 0.074 | 0.127 | 0.150 |
$\chi^2_{10}$ | 0.052 | 0.080 | 0.098 |
Uniform(0,1) | 0.0005 | 0.001 | 0.002 |
Uniform(0,10) | 0.0005 | 0.001 | 0.002 |

Table 6: Simulated actual significance levels of the new test ($\alpha = 0.05$), computed under the normal assumption from 10,000 generated data, for several non-normal distributions.

Value of two outliers | R (c = 1.7) | Reject (c = 1.7) | R (c = 2.07) | Reject (c = 2.07) | R (c = 2.3765) | Reject (c = 2.3765) |
---|---|---|---|---|---|---|
no outlier | 0.958 | 0 | 0.953 | 0 | 0.969 | 0 |
12 | 0.958 | 0 | 0.953 | 0 | 0.969 | 0 |
14 | 0.958 | 0 | 0.953 | 0 | 0.969 | 0 |
25 | 0.958 | 0 | 0.953 | 0 | 0.969 | 0 |
28 | 0.958 | 0 | 0.953 | 0 | 0.969 | 0 |
30 | 0.958 | 0 | 0.953 | 0 | 0.969 | 0 |
100 | 0.958 | 0 | 0.953 | 0 | 0.969 | 0 |

Table 7: Results of the new tests (c = 1.7, 2.07, 2.3765) on the Cloud example. If reject = 1, the test rejects $H_0$.

Value of two outliers | R (c = 1.7) | Reject (c = 1.7) | R (c = 2.07) | Reject (c = 2.07) | R (c = 2.3765) | Reject (c = 2.3765) |
---|---|---|---|---|---|---|
no outlier | 1.841 | 1 | 1.882 | 1 | 1.781 | 1 |
950 | 1.841 | 1 | 1.882 | 1 | 1.589 | 1 |
980 | 1.841 | 1 | 1.882 | 1 | 1.589 | 1 |
1000 | 1.841 | 1 | 1.882 | 1 | 1.589 | 1 |
1100 | 1.841 | 1 | 1.882 | 1 | 1.589 | 1 |

Table 8: Results of the new tests (c = 1.7, 2.07, 2.3765) on the Michelson example. If reject = 1, the test rejects $H_0$.

$n_1$ | $n_2$ | c = 1.7 | c = 2.07 | c = 2.3765 |
---|---|---|---|---|
26 | 26 | [0.689, 1.457] | [0.716, 1.464] | [0.698, 1.413] |
26 | 28 | [0.688, 1.431] | [0.721, 1.407] | [0.710, 1.415] |
20 | 20 | [0.564, 1.774] | [0.599, 1.723] | [0.606, 1.679] |
20 | 21 | [0.654, 1.592] | [0.661, 1.490] | [0.672, 1.470] |

Table 9: Acceptance regions of $R = S_{r1}/S_{r2}$ with $\alpha = 0.05$ for sample sizes $n_1$, $n_2$, obtained from simulation with 1,000 repetitions.

3 Monte Carlo study

In this chapter, we compare the new tests with the F test and the six robust tests described in the first chapter. Two types of Monte Carlo studies are presented. First, we investigate the sensitivity of the tests to non-normality. Second, we investigate the influence of outliers on the power and the significance level of the tests.

The procedure for our first Monte Carlo study is the following:

(i) Generate one hundred and fifty pairs of samples of size 25; the pseudo-random numbers represent samples from a uniform distribution.

(ii) Transform the pseudo-random numbers to obtain samples from the $N(0,1)$, $\chi^2_{10}$, $t_5$, $t_{10}$, and $t_{20}$ distributions.

(iii) After the transformation, scale the second sample by the factor $\lambda$ so that the ratio of the two variances is $\lambda^2$ for each distribution. Different values of $\lambda$ are selected and applied to the samples.

(iv) Apply ten tests to each of the 150 pairs of samples. The ten tests are the F test, the Box-Andersen test, the Levene test, the Layard test, the Jackknife test with subsample size k = 1, the Box and Moses tests, both with subsample size k = 5, and the three new tests with c = 1.7, 2.07, and 2.3765.

(v) Repeat steps (i) to (iv) with sample size 50.

The entries in Tables 10 to 14 are the proportions of samples in 150 trials for which the tests reject the null hypothesis $\sigma_1^2 = \sigma_2^2$, for the various distributions and values of $\lambda$. For $\lambda = 1$ the proportions should be close to $\alpha = 0.05$.
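The simulation design described above can be sketched in Python. This is not the thesis code and it simplifies the design in several ways, all of which are assumptions made for illustration: it covers only the classic variance-ratio (F) statistic, it draws normal and $t_5$ samples directly rather than transforming uniform numbers, and it calibrates the two-sided critical values by simulation under normality instead of using F tables. The function names, the seed, and the repetition counts are likewise hypothetical.

```python
import random


def sample_variance(xs):
    """Unbiased sample variance."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


def draw_normal(rng):
    """One N(0, 1) draw."""
    return rng.gauss(0.0, 1.0)


def draw_t5(rng):
    """One Student t draw with 5 degrees of freedom: Z / sqrt(chi2_5 / 5)."""
    chi2 = rng.gammavariate(2.5, 2.0)  # chi-square with 5 df
    return rng.gauss(0.0, 1.0) / (chi2 / 5.0) ** 0.5


def variance_ratio(rng, draw, n, lam=1.0):
    """Ratio of sample variances; the second sample is scaled by lam."""
    x = [draw(rng) for _ in range(n)]
    y = [lam * draw(rng) for _ in range(n)]
    return sample_variance(x) / sample_variance(y)


def calibrate(rng, n, reps=10000, alpha=0.05):
    """Simulated two-sided critical values of the ratio under normality."""
    ratios = sorted(variance_ratio(rng, draw_normal, n) for _ in range(reps))
    return ratios[int(reps * alpha / 2)], ratios[int(reps * (1 - alpha / 2))]


def rejection_rate(rng, draw, n, lo, hi, lam=1.0, reps=3000):
    """Proportion of simulated pairs whose ratio falls outside [lo, hi]."""
    hits = 0
    for _ in range(reps):
        r = variance_ratio(rng, draw, n, lam)
        hits += (r < lo) or (r > hi)
    return hits / reps


if __name__ == "__main__":
    rng = random.Random(1997)  # arbitrary seed
    lo, hi = calibrate(rng, n=25)
    print("size, normal:", rejection_rate(rng, draw_normal, 25, lo, hi))
    print("size, t5:", rejection_rate(rng, draw_t5, 25, lo, hi))
    print("power, normal, lam = 2:",
          rejection_rate(rng, draw_normal, 25, lo, hi, lam=2.0))
```

With many more repetitions than the 150 trials used in the tables, the estimated proportions are less noisy, so differences between the normal and heavy-tailed cases are easier to see.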
For $\lambda > 1$ the proportions are Monte Carlo estimates of the power of the tests at the particular values of $\lambda$ for the various distributions. These tables lead to the following conclusions:

(i) The F test is extremely non-robust. It gives too many significant results for long-tailed distributions.

(ii) The three new tests have about the same power, and in general they are the most powerful tests in the group. When $\lambda = 1$, the three tests give more significant results than the other tests.

(iii) The new test with c = 1.7 is not as powerful as the new tests with c = 2.07, 2.3765, but its actual significance level is closer to 0.05.

(iv) The other tests are robust, but they are not as powerful as the new tests. In general, the Jackknife and Box-Andersen tests have about the same power. The Levene test is more powerful than these two tests.

(v) The Moses test is slightly less powerful than the Box test, and seems to be the least powerful of all the tests.

The second type of Monte Carlo study has two parts. The first part estimates the influence of outliers on the significance level of the tests, and the second part estimates the influence of outliers on the power of the tests. The procedure for the first part is the following:

(i) Transform the first sample of each of the one hundred and fifty pairs of pseudo-random samples to obtain samples from $N(0,1)$ with different numbers of outliers from $N(5, 0.1)$.

(ii) Transform the second sample of each of the one hundred and fifty pairs of pseudo-random samples to obtain samples from $N(0,1)$ without outliers.

(iii) Repeat steps (i) and (ii) with sample size 50.

The entries in Table 15 are the proportions of samples in 150 trials for which the tests reject the null hypothesis $\sigma_1^2 = \sigma_2^2$, for the various numbers of outliers. A test with smaller values is less affected by the outliers and seldom falsely rejects the null hypothesis. According to the results in the table, we have the following conclusions:

(i) The Moses test is less affected than the Box test. Both of these tests seem to be the least affected by the outliers. However, this is probably because they are very conservative, and the result is consistent with the one obtained by Miller [7].

(ii) The new test with c = 1.7 is the second least affected. When the sample size is 25 and fewer than 16% of the observations in the first sample are outliers, the Levene test is slightly better than the new test with c = 2.07; the new test with c = 2.3765 is almost the worst. Also, as the number of outliers increases, the new test with c = 2.07 becomes more affected by the outliers.

(iii) When the sample size is 50, the new test with c = 2.07 is better than the Levene test. In addition, the performance of the new test with c = 1.7 is almost the best in the group.

The second part tests the effect of outliers on the power of the tests. The procedure is the following:

(i) Transform the first sample of each of the one hundred and fifty pairs of pseudo-random samples to obtain samples from $N(0,1)$ with different numbers of outliers from $N(5.5, 0.1)$.

(ii) Transform the second sample of each of the one hundred and fifty pairs of pseudo-random samples to obtain samples from $N(0,3)$ without outliers.

(iii) Repeat steps (i) and (ii) with sample size 50.

The entries in Table 16 are the proportions of samples in 150 trials for which the tests reject the null hypothesis $\sigma_1^2 = \sigma_2^2$, for the various numbers of outliers. Tests with larger values are less affected by the outliers and seldom falsely accept the null hypothesis. To estimate the influence of larger outliers, we repeat the procedure with larger outliers from the $N(10, 0.1)$ distribution. The results are exhibited in Table 17. Based on these two tables, we have the following conclusions:

(i) The new test with c = 1.7 has the best performance.
(ii) When the sample contains fewer than 16% outliers, the new tests with c = 2.07, 2.3765 are the second best tests. However, as the number of outliers increases, the new tests with higher values of c become the worst of all.

Test | n = 25, 1:1 | n = 25, 1:1.5 | n = 25, 1:2 | n = 25, 1:2.5 | n = 25, 1:5 | n = 50, 1:1 | n = 50, 1:1.5 | n = 50, 1:2 | n = 50, 1:2.5 | n = 50, 1:5 |
---|---|---|---|---|---|---|---|---|---|---|
F-test | 0.053 | 0.513 | 0.927 | 0.987 | 1.000 | 0.053 | 0.767 | 1.000 | 1.000 | 1.000 |
Levene | 0.047 | 0.460 | 0.893 | 0.973 | 1.000 | 0.080 | 0.733 | 0.987 | 1.000 | 1.000 |
Layard | 0.040 | 0.407 | 0.827 | 0.953 | 1.000 | 0.067 | 0.707 | 0.980 | 1.000 | 1.000 |
Jackknife k = 1 | 0.027 | 0.493 | 0.900 | 0.973 | 1.000 | 0.040 | 0.740 | 1.000 | 1.000 | 1.000 |
Box k = 5 | 0.047 | 0.293 | 0.707 | 0.820 | 1.000 | 0.060 | 0.553 | 0.927 | 0.993 | 1.000 |
Moses k = 5 | 0.027 | 0.287 | 0.600 | 0.800 | 0.987 | 0.053 | 0.560 | 0.920 | 0.980 | 1.000 |
Box-Andersen | 0.033 | 0.487 | 0.913 | 0.973 | 1.000 | 0.053 | 0.747 | 0.993 | 1.000 | 1.000 |
New test c = 1.7 | 0.060 | 0.420 | 0.860 | 0.967 | 1.000 | 0.067 | 0.687 | 0.987 | 1.000 | 1.000 |
New test c = 2.07 | 0.047 | 0.453 | 0.893 | 0.980 | 1.000 | 0.080 | 0.740 | 0.993 | 1.000 | 1.000 |
New test c = 2.3765 | 0.060 | 0.453 | 0.900 | 0.973 | 1.000 | 0.067 | 0.760 | 1.000 | 1.000 | 1.000 |

Table 10: Monte Carlo power functions for tests on variances for the normal distribution. Column headings give the sample size ($n_1 = n_2 = n$) and the ratio of standard deviations.

Test | n = 25, 1:1 | n = 25, 1:1.5 | n = 25, 1:2 | n = 25, 1:2.5 | n = 25, 1:5 | n = 50, 1:1 | n = 50, 1:1.5 | n = 50, 1:2 | n = 50, 1:2.5 | n = 50, 1:5 |
---|---|---|---|---|---|---|---|---|---|---|
F-test | 0.153 | 0.540 | 0.860 | 0.960 | 1.000 | 0.093 | 0.707 | 0.987 | 1.000 | 1.000 |
Levene | 0.060 | 0.487 | 0.833 | 0.953 | 1.000 | 0.087 | 0.653 | 0.973 | 1.000 | 1.000 |
Layard | 0.040 | 0.353 | 0.740 | 0.900 | 1.000 | 0.027 | 0.507 | 0.913 | 1.000 | 1.000 |
Jackknife k = 1 | 0.080 | 0.440 | 0.760 | 0.900 | 0.993 | 0.053 | 0.600 | 0.940 | 1.000 | 1.000 |
Box k = 5 | 0.033 | 0.293 | 0.600 | 0.800 | 0.987 | 0.067 | 0.460 | 0.880 | 0.987 | 1.000 |
Moses k = 5 | 0.033 | 0.227 | 0.493 | 0.760 | 0.973 | 0.060 | 0.447 | 0.860 | 0.987 | 1.000 |
Box-Andersen | 0.053 | 0.407 | 0.780 | 0.913 | 1.000 | 0.047 | 0.600 | 0.933 | 1.000 | 1.000 |
New test c = 1.7 | 0.0737 | 0.427 | 0.847 | 0.967 | 1.000 | 0.073 | 0.653 | 0.980 | 1.000 | 1.000 |
New test c = 2.07 | 0.093 | 0.500 | 0.880 | 0.967 | 1.000 | 0.100 | 0.693 | 0.987 | 1.000 | 1.000 |
New test c = 2.3765 | 0.100 | 0.487 | 0.873 | 0.967 | 1.000 | 0.113 | 0.727 | 0.993 | 1.000 | 1.000 |

Table 11: Monte Carlo power functions for tests on variances for the $\chi^2_5$ distribution. Column headings give the sample size ($n_1 = n_2 = n$) and the ratio of standard deviations.

Test | n = 25, 1:1 | n = 25, 1:1.5 | n = 25, 1:2 | n = 25, 1:2.5 | n = 25, 1:5 | n = 50, 1:1 | n = 50, 1:1.5 | n = 50, 1:2 | n = 50, 1:2.5 | n = 50, 1:5 |
---|---|---|---|---|---|---|---|---|---|---|
F-test | 0.193 | 0.500 | 0.847 | 0.960 | 1.000 | 0.227 | 0.667 | 0.973 | 1.000 | 1.000 |
Levene | 0.040 | 0.353 | 0.727 | 0.940 | 0.993 | 0.073 | 0.593 | 0.940 | 1.000 | 1.000 |
Layard | 0.040 | 0.227 | 0.607 | 0.840 | 0.993 | 0.027 | 0.407 | 0.860 | 0.947 | 1.000 |
Jackknife k = 1 | 0.047 | 0.353 | 0.673 | 0.847 | 0.980 | 0.073 | 0.473 | 0.860 | 0.940 | 1.000 |
Box k = 5 | 0.040 | 0.240 | 0.547 | 0.780 | 0.980 | 0.053 | 0.433 | 0.860 | 0.967 | 1.000 |
Moses k = 5 | 0.020 | 0.200 | 0.467 | 0.660 | 0.980 | 0.060 | 0.427 | 0.847 | 0.960 | 1.000 |
Box-Andersen | 0.027 | 0.293 | 0.660 | 0.833 | 0.993 | 0.053 | 0.473 | 0.880 | 0.960 | 1.000 |
New test c = 1.7 | 0.087 | 0.427 | 0.840 | 0.953 | 1.000 | 0.120 | 0.667 | 0.973 | 1.000 | 1.000 |
New test c = 2.07 | 0.093 | 0.493 | 0.860 | 0.967 | 1.000 | 0.120 | 0.700 | 0.960 | 1.000 | 1.000 |
New test c = 2.3765 | 0.113 | 0.480 | 0.847 | 0.973 | 1.000 | 0.140 | 0.720 | 0.973 | 1.000 | 1.000 |

Table 12: Monte Carlo power functions for tests on variances for the $t_5$ distribution. Column headings give the sample size ($n_1 = n_2 = n$) and the ratio of standard deviations.

Test | n = 25, 1:1 | n = 25, 1:1.5 | n = 25, 1:2 | n = 25, 1:2.5 | n = 25, 1:5 | n = 50, 1:1 | n = 50, 1:1.5 | n = 50, 1:2 | n = 50, 1:2.5 | n = 50, 1:5 |
---|---|---|---|---|---|---|---|---|---|---|
F-test | 0.107 | 0.507 | 0.900 | 0.980 | 1.000 | 0.087 | 0.747 | 0.987 | 1.000 | 1.000 |
Levene | 0.047 | 0.427 | 0.840 | 0.967 | 1.000 | 0.080 | 0.667 | 0.967 | 1.000 | 1.000 |
Layard | 0.040 | 0.333 | 0.720 | 0.913 | 1.000 | 0.053 | 0.520 | 0.940 | 0.993 | 1.000 |
Jackknife k = 1 | 0.033 | 0.420 | 0.793 | 0.913 | 1.000 | 0.040 | 0.600 | 0.960 | 1.000 | 1.000 |
Box k = 5 | 0.033 | 0.260 | 0.613 | 0.807 | 1.000 | 0.073 | 0.507 | 0.920 | 0.993 | 1.000 |
Moses k = 5 | 0.020 | 0.200 | 0.560 | 0.700 | 0.973 | 0.053 | 0.480 | 0.860 | 0.980 | 1.000 |
Box-Andersen | 0.027 | 0.413 | 0.787 | 0.920 | 1.000 | 0.053 | 0.647 | 0.967 | 0.993 | 1.000 |
New test c = 1.7 | 0.067 | 0.427 | 0.847 | 0.967 | 1.000 | 0.093 | 0.673 | 0.987 | 1.000 | 1.000 |
New test c = 2.07 | 0.073 | 0.480 | 0.873 | 0.980 | 1.000 | 0.093 | 0.720 | 0.987 | 1.000 | 1.000 |
New test c = 2.3765 | 0.067 | 0.447 | 0.873 | 0.973 | 1.000 | 0.093 | 0.733 | 0.980 | 1.000 | 1.000 |

Table 13: Monte Carlo power functions for tests on variances for the $t_{10}$ distribution. Column headings give the sample size ($n_1 = n_2 = n$) and the ratio of standard deviations.

Test | n = 25, 1:1 | n = 25, 1:1.5 | n = 25, 1:2 | n = 25, 1:2.5 | n = 25, 1:5 | n = 50, 1:1 | n = 50, 1:1.5 | n = 50, 1:2 | n = 50, 1:2.5 | n = 50, 1:5 |
---|---|---|---|---|---|---|---|---|---|---|
F-test | 0.073 | 0.493 | 0.900 | 0.980 | 1.000 | 0.060 | 0.767 | 0.993 | 1.000 | 1.000 |
Levene | 0.047 | 0.433 | 0.873 | 0.967 | 1.000 | 0.080 | 0.720 | 0.973 | 1.000 | 1.000 |
Layard | 0.053 | 0.367 | 0.773 | 0.927 | 1.000 | 0.060 | 0.647 | 0.967 | 1.000 | 1.000 |
Jackknife k = 1 | 0.033 | 0.453 | 0.833 | 0.953 | 1.000 | 0.040 | 0.707 | 0.987 | 1.000 | 1.000 |
Box k = 5 | 0.033 | 0.287 | 0.653 | 0.853 | 1.000 | 0.047 | 0.607 | 0.880 | 0.993 | 1.000 |
Moses k = 5 | 0.020 | 0.207 | 0.553 | 0.760 | 0.993 | 0.067 | 0.567 | 0.900 | 0.980 | 1.000 |
Box-Andersen | 0.027 | 0.440 | 0.860 | 0.953 | 1.000 | 0.053 | 0.713 | 0.980 | 1.000 | 1.000 |
New test c = 1.7 | 0.067 | 0.420 | 0.860 | 0.967 | 1.000 | 0.087 | 0.673 | 0.987 | 1.000 | 1.000 |
New test c = 2.07 | 0.053 | 0.480 | 0.880 | 0.980 | 1.000 | 0.087 | 0.733 | 0.993 | 1.000 | 1.000 |
New test c = 2.3765 | 0.060 | 0.447 | 0.887 | 0.973 | 1.000 | 0.067 | 0.747 | 0.993 | 1.000 | 1.000 |

Table 14: Monte Carlo power functions for tests on variances for the $t_{20}$ distribution. Column headings give the sample size ($n_1 = n_2 = n$) and the ratio of standard deviations.

Test | n = 25, 1 outlier | n = 25, 2 outliers | n = 25, 3 outliers | n = 25, 4 outliers | n = 25, 5 outliers | n = 50, 2 outliers | n = 50, 4 outliers | n = 50, 6 outliers | n = 50, 8 outliers | n = 50, 10 outliers |
---|---|---|---|---|---|---|---|---|---|---|
F-test | 0.380 | 0.807 | 0.960 | 0.987 | 0.993 | 0.653 | 0.993 | 1.000 | 1.000 | 1.000 |
Levene | 0.040 | 0.153 | 0.460 | 0.820 | 0.987 | 0.140 | 0.473 | 0.873 | 1.000 | 1.000 |
Layard | 0.000 | 0.093 | 0.467 | 0.913 | 0.987 | 0.027 | 0.513 | 0.987 | 1.000 | 1.000 |
Jackknife k = 1 | 0.047 | 0.413 | 0.840 | 0.967 | 0.987 | 0.260 | 0.893 | 1.000 | 1.000 | 1.000 |
Box k = 5 | 0.013 | 0.073 | 0.233 | 0.400 | 0.547 | 0.100 | 0.207 | 0.560 | 0.740 | 0.907 |
Moses k = 5 | 0.047 | 0.033 | 0.173 | 0.267 | 0.400 | 0.067 | 0.227 | 0.420 | 0.653 | 0.840 |
Box-Andersen | 0.013 | 0.140 | 0.600 | 0.940 | 0.987 | 0.100 | 0.680 | 1.000 | 1.000 | 1.000 |
New test c = 1.7 | 0.073 | 0.100 | 0.240 | 0.400 | 0.700 | 0.080 | 0.187 | 0.407 | 0.727 | 0.933 |
New test c = 2.07 | 0.073 | 0.193 | 0.373 | 0.833 | 0.993 | 0.080 | 0.280 | 0.680 | 0.987 | 1.000 |
New test c = 2.3765 | 0.107 | 0.293 | 0.773 | 0.993 | 0.993 | 0.120 | 0.487 | 0.980 | 1.000 | 1.000 |

Table 15: Monte Carlo power functions for tests on variances based on two samples from the $N(0,1)$ population, with different numbers of outliers from $N(5, 0.1)$ in the first sample.

Test | n = 25, 1 outlier | n = 25, 2 outliers | n = 25, 3 outliers | n = 25, 4 outliers | n = 25, 5 outliers | n = 50, 2 outliers | n = 50, 4 outliers | n = 50, 6 outliers | n = 50, 8 outliers | n = 50, 10 outliers |
---|---|---|---|---|---|---|---|---|---|---|
F-test | 0.953 | 0.713 | 0.407 | 0.153 | 0.067 | 1.000 | 0.967 | 0.820 | 0.540 | 0.273 |
Levene | 0.947 | 0.760 | 0.473 | 0.240 | 0.087 | 1.000 | 0.987 | 0.867 | 0.613 | 0.273 |
Layard | 0.747 | 0.407 | 0.180 | 0.087 | 0.040 | 1.000 | 0.833 | 0.613 | 0.333 | 0.187 |
Jackknife k = 1 | 0.073 | 0.093 | 0.093 | 0.087 | 0.067 | 0.947 | 0.807 | 0.613 | 0.440 | 0.307 |
Box k = 5 | 0.767 | 0.240 | 0.060 | 0.013 | 0.013 | 0.980 | 0.867 | 0.573 | 0.313 | 0.120 |
Moses k = 5 | 0.527 | 0.233 | 0.073 | 0.053 | 0.033 | 0.987 | 0.813 | 0.467 | 0.207 | 0.133 |
Box-Andersen | 0.820 | 0.507 | 0.280 | 0.140 | 0.080 | 1.000 | 0.907 | 0.740 | 0.480 | 0.293 |
New test c = 1.7 | 0.993 | 0.980 | 0.940 | 0.853 | 0.573 | 1.000 | 1.000 | 0.993 | 0.987 | 0.840 |
New test c = 2.07 | 0.993 | 0.980 | 0.880 | 0.567 | 0.013 | 1.000 | 1.000 | 0.993 | 0.820 | 0.067 |
New test c = 2.3765 | 1.000 | 0.960 | 0.700 | 0.080 | 0.027 | 1.000 | 1.000 | 0.940 | 0.333 | 0.127 |

Table 16: Monte Carlo power functions for tests on variances based on two samples from the $N(0,1)$ and $N(0,3)$ populations, with different numbers of outliers from $N(5.5, 0.1)$ in the first sample.

Test | n = 25, 1 outlier | n = 25, 2 outliers | n = 25, 3 outliers | n = 25, 4 outliers | n = 25, 5 outliers | n = 50, 2 outliers | n = 50, 4 outliers | n = 50, 6 outliers | n = 50, 8 outliers | n = 50, 10 outliers |
---|---|---|---|---|---|---|---|---|---|---|
F-test | 0.173 | 0.007 | 0.073 | 0.207 | 0.320 | 0.580 | 0.000 | 0.107 | 0.380 | 0.673 |
Levene | 0.607 | 0.047 | 0.000 | 0.013 | 0.173 | 0.953 | 0.313 | 0.000 | 0.040 | 0.413 |
Layard | 0.027 | 0.073 | 0.027 | 0.020 | 0.127 | 0.047 | 0.033 | 0.027 | 0.073 | 0.420 |
Jackknife k = 1 | 0.000 | 0.000 | 0.000 | 0.100 | 0.307 | 0.000 | 0.000 | 0.013 | 0.220 | 0.600 |
Box k = 5 | 0.127 | 0.000 | 0.000 | 0.000 | 0.020 | 0.847 | 0.167 | 0.013 | 0.000 | 0.020 |
Moses k = 5 | 0.007 | 0.000 | 0.000 | 0.000 | 0.080 | 0.787 | 0.047 | 0.000 | 0.027 | 0.153 |
Box-Andersen | 0.013 | 0.000 | 0.000 | 0.020 | 0.227 | 0.140 | 0.000 | 0.000 | 0.140 | 0.567 |
New test c = 1.7 | 0.993 | 0.980 | 0.940 | 0.853 | 0.573 | 1.000 | 1.000 | 0.993 | 0.987 | 0.840 |
New test c = 2.07 | 0.993 | 0.980 | 0.880 | 0.567 | 0.100 | 1.000 | 1.000 | 0.993 | 0.820 | 0.133 |
New test c = 2.3765 | 1.000 | 0.960 | 0.700 | 0.120 | 0.500 | 1.000 | 1.000 | 0.933 | 0.227 | 0.853 |

Table 17: Monte Carlo power functions for tests on variances based on two samples from the $N(0,1)$ and $N(0,3)$ populations, with different numbers of outliers from $N(10, 0.1)$ in the first sample.

4 Conclusion

The classic F test for the hypothesis concerning the equality of two population variances is known to be non-robust. Let us consider a two-sample problem.
Suppose we have two samples, $y_{11}, \ldots, y_{1n_1}$ and $y_{21}, \ldots, y_{2n_2}$. Suppose the $y_{ij}$'s are independent and identically distributed with cdf $G((y - \mu_i)/\sigma_i)$. As $n \to \infty$, the limiting distribution of the classical test statistic involves $\gamma$, where $\gamma$ is the coefficient of kurtosis. If the normal assumption is met, $\gamma = 0$. However, for non-normal cases, such as $t_5$, $\gamma$ is not zero. So, when we apply the classical F test to non-normal samples, the actual size of the test differs from its nominal level of significance, $\alpha$. Therefore, several robust alternative procedures have been introduced in this century. This paper presents a new robust method. The best feature of this new method is its superior ability to overcome the effect of outliers.

First, an alternative measure of dispersion, $S_r$, that is more resistant to outliers was introduced. The new test statistic was then defined using these robust dispersion estimates. In Section 2.2.2, we estimated the actual significance levels of the new tests ($\alpha = 0.05$) for the non-normal case. We found that for a heavy-tailed distribution the probability of rejecting $H_0$ exceeds 0.05, whereas for a short-tailed distribution the probability is less than 0.05. In general, however, the results are closer to 0.05 than those from the classic F test. Also, the significance levels yielded by smaller c are closer to 0.05.

According to the two examples described in the first two chapters, the performance of the new tests is clearly better than that of the other tests discussed in the first chapter. In these two examples, no matter how large the outliers are, the new tests are not affected by them. This can be explained by the fact that the test statistic R is affected by the number of outliers but not by their size.

In addition, according to the first type of Monte Carlo study, the three new tests have about the same power. In general, the new tests are the most powerful in the group, although their true significance levels are slightly more sensitive to non-normality than those of the other tests. Also, the new test with c = 1.7 is not quite as powerful as the new tests with c = 2.07, 2.3765, but its actual significance level is closer to the nominal significance level 0.05. Based on the second type of Monte Carlo study, the new test with c = 1.7 seems to have the superior power to overcome the effect of outliers.

On the whole, this paper has demonstrated that although the new test with c = 1.7 is slightly less powerful than those with c = 2.07, 2.3765, of all the tests it has the superior ability to overcome the effect of outliers.

References

[1] M. S. Bartlett. Properties of sufficiency and statistical tests. Proceedings of the Royal Society A, 160:262-282, 1937.

[2] G. E. P. Box. Non-normality and tests on variances. Biometrika, 40:318-335, 1953.

[3] G. E. P. Box and S. L. Andersen. Permutation theory in the derivation of robust criteria and the study of departures from assumption. Journal of the Royal Statistical Society, B17:1-26, 1955.

[4] M. B. Brown and A. B. Forsythe. Robust tests for the equality of variances. Journal of the American Statistical Association, 69:364-367, 1974.

[5] M. W. J. Layard. Robust large-sample tests for homogeneity of variances. Journal of the American Statistical Association, 68:105-198, 1974.

[6] H. Levene. Robust tests for equality of variance. In Contributions to Probability and Statistics, pages 278-292. Stanford University Press, 1960.

[7] R. G. Miller, Jr. Jackknifing variances. Annals of Mathematical Statistics, 39:567-582, 1968.

[8] Rupert G. Miller. Beyond ANOVA, Basics of Applied Statistics.

[9] L. E. Moses. Rank tests of dispersion. Annals of Mathematical Statistics, 34:973-983, 1963.

[10] G. R. Shorack. Nonparametric tests and estimation of scale in two sample problem. Technical Report, 10.
Figure 2: Side-by-side boxplots of the two logged variables with outliers in the seeded sample. Panels: two outliers of value 12, 14, 25, 28, 30, and 100 added in the seeded sample.

Figure 3: Side-by-side boxplots of the measurements in the first and fifth trials, Michelson's example. Panel: no outlier added.

Figure 4: Side-by-side boxplots of the two variables with outliers in the fifth sample. Panels: outlier 950, 980, 1000, and 1100 added.

Figure 5: Histograms of R for different combinations of sample size n and c. Panels: n = 25 with c = 1.7, 2.07, and 2.3765.