Robust tests on the equality of variances

by

Man-Po Lai

B.Sc. University of British Columbia, 1994

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE
in
THE FACULTY OF GRADUATE STUDIES
DEPARTMENT OF STATISTICS

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
1997
© Man-Po Lai, 1997

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of Statistics
The University of British Columbia
Vancouver, Canada

Abstract

The classic F test for the hypothesis concerning the equality of two population variances is known to be non-robust. When the classical F test is applied to non-normal samples, the actual size of the test can differ from its nominal level. Therefore, several robust alternatives have been introduced in the literature. In this thesis, I present some of these alternatives and illustrate their application with examples. A new approach is also introduced. The best feature of this method is that it seems able to overcome the adverse effect of outliers. A Monte Carlo study is used to compare the new test with the F test and the other methods. The results of this study are encouraging for the new test.
Contents

Abstract
Table of Contents
List of Tables
List of Figures
Acknowledgements
1 Introduction
  1.1 Non-normality
    1.1.1 Normal Case
    1.1.2 Non-normal Case
  1.2 Some Robust Methods
    1.2.1 The Levene test
    1.2.2 Modifications of Levene test
    1.2.3 The Jackknife test
    1.2.4 The Box test
    1.2.5 The Moses test
    1.2.6 The Layard $\chi^2$ test
    1.2.7 The Box-Andersen Test
  1.3 Example
    1.3.1 The first example: Cloud
    1.3.2 The second example: Michelson
2 A New Robust test
  2.1 Robust Dispersion Estimates
  2.2 Asymptotic Distribution of R
    2.2.1 Normal case
    2.2.2 Non-normal case
  2.3 Examples
3 Monte Carlo study
4 Conclusion
Bibliography

List of Tables

1 Actual asymptotic significance level of F test ($\alpha = 0.05$), with non-normal samples
2 Results of tests on variances for the Cloud data
3 Results of tests on variance for the Michelson data
4 Relation between $c$, $a$, $b$ and EFF
5 Acceptance regions of $R = Sr_1/Sr_2$ with $\alpha = 0.05$ obtained from asymptotic distribution and simulation with 10,000 repetitions
6 Simulated actual significance level of the new test ($\alpha = 0.05$) from 10,000 generated data, with normal assumption for several non-normal distributions
7 Results of the new tests ($c = 1.7, 2.07, 2.3765$) on the Cloud example. If reject = 1, the test rejects $H_0$
8 Results of the new tests ($c = 1.7, 2.07, 2.3765$) on the Michelson example. If reject = 1, the test rejects $H_0$
9 Acceptance regions of $R = Sr_1/Sr_2$ with $\alpha = 0.05$ with sample sizes $n_1, n_2$ obtained from simulation with 1,000 repetitions
10 Monte Carlo Power Function for Tests on Variances for Normal distribution
11 Monte Carlo Power Functions for Tests on Variances for $\chi^2_{10}$ distribution
12 Monte Carlo Power Functions for Tests on Variances for $t_5$ distribution
13 Monte Carlo Power Functions for Tests on Variances for $t_{10}$ distribution
14 Monte Carlo Power Functions for Tests on Variances for $t_{20}$ distribution
15 Monte Carlo Power Functions for Tests on Variances based on two samples from the $N(0,1)$ population with different number of outliers from the $N(5, 0.1)$ in the first sample
16 Monte Carlo Power Functions for Tests on Variances based on two samples from the $N(0,1)$ and $N(0,3)$ populations with different number of outliers from the $N(5.5, 0.1)$ in the first sample
17 Monte Carlo Power Functions for Tests on Variances based on two samples from the $N(0,1)$ and $N(0,3)$ populations with different number of outliers from the $N(10, 0.1)$ in the first sample

List of Figures

1 Side by Side Boxplots of the two logged variables in Cloud example
2 Side by Side Boxplots of the two logged variables with outliers in the seeded sample
3 Side by Side Boxplots of the measurements in the first and fifth trials in Michelson's example
4 Side by Side Boxplots of the two variables with outliers in the fifth sample
5 Histograms of R for different combinations of sample size n, and c

Acknowledgements

I would like to thank my supervisor, Dr. Ruben Zamar, for introducing me to such an interesting topic. Ruben was always willing to give his advice, ideas, and encouragement, which were much needed. Also, I would like to thank Dr. Paul Gustafson for his very useful comments and careful reading.
Thanks also go to Daniel Ng for his invaluable help with LaTeX and various other problems along the way.

1 Introduction

It is well known that the two-sample t-test is a reliable method for testing differences between population means because it is insensitive to departures from normality in the populations. On the other hand, when testing differences between population variances, the F test is known to be rather sensitive to the assumption of normality. As a result, the null hypothesis may be rejected because the random variables are not normally distributed rather than because the variances are unequal.

This chapter focuses on inferences about the variances of two populations. Section 1.1 investigates the influence of non-normality on comparing the variation in two samples. Section 1.2 describes alternative robust methods which have been proposed to deal with the non-normality problem.

1.1 Non-normality

The classic F test was first proposed by Bartlett [1]. Unfortunately, the F test is very sensitive to the assumption that the underlying populations have normal distributions. Box [2] showed that when the underlying distributions are non-normal, this test can have an actual size several times larger than its nominal level of significance. To see the influence of non-normality on comparing the variation in two samples by a classical F test, we will look at the normally and non-normally distributed cases. Firstly, we will derive the asymptotic distribution of the classic F test statistic under the assumption of an underlying normal distribution. Secondly, we will investigate how this asymptotic distribution changes under departures from normality.

1.1.1 Normal Case

Let us consider a two-sample problem. Let $y_{11}, \ldots, y_{1n_1}$ and $y_{21}, \ldots, y_{2n_2}$ be two independently distributed samples from the distributions $N(\mu_1, \sigma_1^2)$ and $N(\mu_2, \sigma_2^2)$ respectively. The asymptotic distribution of the test statistic in the classical F test will be derived, although the statistic has an exact F distribution under the null hypothesis and the normal assumption. We use the asymptotic distribution because the exact distribution of the test statistic is hard to obtain when the samples are non-normal. For simplicity, we assume first that $n_1 = n_2 = n$. The sample variances

\[ S_i^2 = \frac{1}{n-1} \sum_{j=1}^{n} (y_{ij} - \bar{y}_i)^2, \quad i = 1, 2, \]

are unbiased estimators of the corresponding population variances $\sigma_i^2$, where $\bar{y}_i$ are the corresponding sample means. By the Central Limit Theorem,

\[ \sqrt{n} \left[ \begin{pmatrix} S_1^2 \\ S_2^2 \end{pmatrix} - \begin{pmatrix} \sigma_1^2 \\ \sigma_2^2 \end{pmatrix} \right] \to N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \Sigma \right) \tag{1} \]

where

\[ \Sigma = \begin{pmatrix} 2\sigma_1^4 & 0 \\ 0 & 2\sigma_2^4 \end{pmatrix}. \tag{2} \]

By the Delta Method,

\[ \sqrt{n} \left( \frac{S_1}{S_2} - \frac{\sigma_1}{\sigma_2} \right) \to N(0, \nabla g' \Sigma \nabla g) \tag{3} \]

where $g(x, y) = \sqrt{x/y}$ and $\nabla g$ is the gradient of $g(x, y)$ evaluated at $(\sigma_1^2, \sigma_2^2)$:

\[ \nabla g = \left( \frac{\partial g}{\partial x}, \frac{\partial g}{\partial y} \right)' \Big|_{(\sigma_1^2, \sigma_2^2)} \tag{4} \]

\[ = \left( \frac{1}{2\sqrt{xy}}, \; -\frac{\sqrt{x}}{2 y^{3/2}} \right)' \Big|_{(\sigma_1^2, \sigma_2^2)} \tag{5} \]

\[ = \left( \frac{1}{2\sigma_1 \sigma_2}, \; -\frac{\sigma_1}{2\sigma_2^3} \right)'. \tag{6} \]

Therefore,

\[ \nabla g' \Sigma \nabla g = \frac{\sigma_1^2}{\sigma_2^2}. \tag{7} \]

If the null hypothesis, $H_0 : \sigma_1^2 = \sigma_2^2$, is true, then according to equation (3) $\nabla g' \Sigma \nabla g = 1$ and

\[ T = \sqrt{n} \left( \frac{S_1}{S_2} - 1 \right) \to N(0, 1). \tag{8} \]

So, we can use $T$ to test the two-sided $H_0$, and would reject $H_0$ when $T$ exceeds the upper $100(\alpha/2)$ percentile or falls below the lower $100(\alpha/2)$ percentile of the $N(0,1)$ distribution. Thus, $H_0$ is rejected when $|T| > z(1 - \alpha/2)$. For instance, if $\alpha = 0.05$, then $H_0$ is rejected when $|T|$ is greater than 1.96. For the unequal sample size case, if $n_1/n_2 \to d$, then the same argument with $\mathrm{Var}(S_i^2) \approx 2\sigma_i^4/n_i$ gives

\[ \sqrt{n_2} \left( \frac{S_1}{S_2} - 1 \right) \to N\left( 0, \tfrac{1}{2}\left(1 + \tfrac{1}{d}\right) \right). \tag{9} \]

1.1.2 Non-normal Case

The method described in the last section is based on the assumption of normality. To see how sensitive this method is to departures from normality, we will look at cases where the populations follow other distributions: double exponential, $t_5$, $t_{10}$, $\chi^2_5$, $\chi^2_{10}$, and uniform. In addition, we will calculate their actual asymptotic significance levels. Let us first look at the general case.
If the observations $y_{11}, \ldots, y_{1n_1}$ and $y_{21}, \ldots, y_{2n_2}$ are independently distributed according to a general distribution $F(y)$, then

\[ E(S_i^2) = \sigma_i^2 \tag{10} \]

\[ \mathrm{Var}(S_i^2) = \sigma_i^4 \left( \frac{2}{n-1} + \frac{\gamma}{n} \right) \tag{11} \]

where

\[ \gamma = \frac{E(y - \mu)^4}{\sigma^4} - 3. \tag{12} \]

$\gamma$ is called the coefficient of kurtosis and measures the peakedness or flatness of the probability density function (pdf). For the normal case, $\gamma = 0$ and $\mathrm{Var}(S^2) = 2\sigma^4/(n-1)$. By the CLT,

\[ \sqrt{n} \left[ \begin{pmatrix} S_1^2 \\ S_2^2 \end{pmatrix} - \begin{pmatrix} \sigma_1^2 \\ \sigma_2^2 \end{pmatrix} \right] \to N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \Sigma \right) \tag{13} \]

where

\[ \Sigma = \begin{pmatrix} (2 + \gamma)\sigma_1^4 & 0 \\ 0 & (2 + \gamma)\sigma_2^4 \end{pmatrix}. \tag{14} \]

According to (4) and (14),

\[ \nabla g' \Sigma \nabla g = \frac{(2 + \gamma)\,\sigma_1^2}{2\sigma_2^2}. \tag{15} \]

By the Delta Method, if $H_0$ is true, we obtain

\[ \sqrt{n} \left( \frac{S_1}{S_2} - 1 \right) \to N\left( 0, \frac{2 + \gamma}{2} \right) \tag{16} \]

and for the unequal sample size case, with $n_1/n_2 \to d$,

\[ \sqrt{n_2} \left( \frac{S_1}{S_2} - 1 \right) \to N\left( 0, \frac{(2 + \gamma)(1 + \frac{1}{d})}{4} \right). \tag{17} \]

If the normality assumption is met, $\gamma = 0$, so that equation (16) is equivalent to equation (8). However, for non-normal cases, like $t_5$, $\gamma$ will not be zero. So when we apply the classical F test to non-normal samples, the actual size of the test will differ from its nominal level of significance, $\alpha$. Table 1 displays the value of $\gamma$ and the actual significance level of the F test ($\alpha = 0.05$) for several non-normal distributions: double exponential, $t_5$, $t_{10}$, $\chi^2_5$, $\chi^2_{10}$, and uniform$(a, b)$. Note that the arguments of the uniform distribution do not affect the result since in this case $\gamma$ is always equal to $-1.2$. Also, for a heavy-tailed distribution ($\gamma > 0$), the probability of rejecting $H_0$ exceeds 0.05; whereas, for a short-tailed distribution, the probability is less than 0.05.

distribution          $\gamma$    actual significance level
double exponential     3          0.215
$t_5$                  6          0.327
$t_{10}$               1          0.110
$\chi^2_5$             2.4        0.186
$\chi^2_{10}$          1.2        0.121
uniform$(a, b)$       -1.2        0.002

Table 1: Actual asymptotic significance level of F test ($\alpha = 0.05$), with non-normal samples

1.2 Some Robust Methods

This section discusses alternatives to the test based on $T$ defined in (8). The six robust methods considered here are the Levene test [6], the Jackknife test [7], the Box test [2], the Box-Andersen test [3], the Moses test [9], and the Layard $\chi^2$ test [5].
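The entries of Table 1 follow from equation (16): under $H_0$ the statistic $T$ is asymptotically $N(0, (2+\gamma)/2)$, while the test still compares $|T|$ with $z(1 - \alpha/2)$. The following Python sketch is not part of the original analysis; it assumes only that asymptotic result, and the function name `actual_asymptotic_level` is a hypothetical choice made for illustration.

```python
from statistics import NormalDist


def actual_asymptotic_level(gamma, alpha=0.05):
    """Asymptotic probability that |T| > z(1 - alpha/2) when T is
    N(0, (2 + gamma) / 2), as in equation (16)."""
    std_normal = NormalDist()
    critical = std_normal.inv_cdf(1 - alpha / 2)   # 1.96 for alpha = 0.05
    sd_of_t = ((2 + gamma) / 2) ** 0.5             # asymptotic sd of T
    return 2 * (1 - std_normal.cdf(critical / sd_of_t))


# Kurtosis coefficients as listed in Table 1.
kurtosis = {
    "double exponential": 3.0,
    "t_5": 6.0,
    "t_10": 1.0,
    "chi^2_5": 2.4,
    "chi^2_10": 1.2,
    "uniform(a, b)": -1.2,
}

for name, gamma in kurtosis.items():
    level = actual_asymptotic_level(gamma)
    print(f"{name:20s} gamma = {gamma:5.1f}   level = {level:.3f}")
```

With $\gamma = 0$ the function returns the nominal level, which is a quick check that the normal case is recovered.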
1.2.1 The Levene test

The idea of the Levene test [6] is to transform the original data $y_{ij}$ into $z_{ij} = |y_{ij} - \bar{y}_i|$, $j = 1, \ldots, n_i$, for the two samples, $i = 1, 2$. Then we simply pretend that they are independent and identically normally distributed under $H_0$, and use the usual t test on the two transformed samples: $z_{11}, \ldots, z_{1n_1}$ and $z_{21}, \ldots, z_{2n_2}$. Obviously, the $z_{ij}$'s do not satisfy the above assumptions. Normality is not met because the $z_{ij}$'s are each absolute values. Independence is violated because of the common term $\bar{y}_i$ in each $z_{ij}$; also, they are not identically distributed unless $n_1 = n_2$. However, as mentioned at the beginning of this chapter, the t test is a reliable method to check differences between means because it is insensitive to non-normality. To apply the two-sample t test we have a new statistic

\[ T_l = \frac{\bar{z}_1 - \bar{z}_2}{s} \quad \text{with} \quad s = \left( \frac{\mathrm{var}(z_1)}{n_1} + \frac{\mathrm{var}(z_2)}{n_2} \right)^{1/2}, \]

where $\bar{z}_1$, $\bar{z}_2$, $\mathrm{var}(z_1)$, and $\mathrm{var}(z_2)$ are the means and variances of the samples $z_1$ and $z_2$. Levene [6] showed that under the null hypothesis, the distribution of $T_l$ can be approximated by a t distribution with degrees of freedom $v$ given by

\[ \frac{1}{v} = \frac{c^2}{n_1 - 1} + \frac{(1 - c)^2}{n_2 - 1}, \quad \text{where} \quad c = \frac{\mathrm{var}(z_1)}{n_1 s^2}. \]

For a two-sided test, if $|T_l|$ is greater than $t_v(1 - \alpha/2)$, where $v$ is the degrees of freedom, $H_0$ would be rejected.

1.2.2 Modifications of Levene test

For skewed distributions, such as the $\chi^2$ with 4 degrees of freedom (df), and heavy-tailed distributions, such as the Cauchy, the Levene test usually has too many rejections. That is, the actual rejection rate exceeds the nominal significance level. For these settings, improved Levene-type procedures have been proposed by Brown and Forsythe [4], which modify the test statistic by replacing the central location $\bar{y}_i$ with more robust versions, such as the medians and the 10% trimmed means of the two samples. Monte Carlo studies [4] show that all of these test statistics are robust for the very heavy-tailed Cauchy distribution.
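For the $\chi^2(4)$ distribution, the statistic based on the median is robust but the one based on the 10% trimmed mean rejects too often. Usually the version based on the sample mean has the greatest power in situations where all three statistics are robust.

1.2.3 The Jackknife test

In [7], Miller proposed a procedure based on the jackknife technique to test $H_0$ in the two-sample case. Let us first review the idea of the jackknife technique. Let $\theta$ be an unknown parameter, and let $(y_1, \ldots, y_N)$ be a sample of $N$ independent observations with cumulative distribution function (cdf) $G_\theta$. Suppose that we use $\hat\theta$ to estimate $\theta$, and that the data are divided into $n$ groups of size $k$. Let $\hat\theta_{-i}$, $i = 1, \ldots, n$, denote the estimate of $\theta$ obtained by deleting the $i$-th group and estimating $\theta$ from the remaining $(n-1)k$ observations. Define $\tilde\theta_i = n\hat\theta - (n-1)\hat\theta_{-i}$, $i = 1, \ldots, n$, and $\tilde\theta = \frac{1}{n}\sum_{i=1}^{n} \tilde\theta_i$; then the statistic

\[ \frac{\sqrt{n}\,(\tilde\theta - \theta)}{\left[ \frac{1}{n-1} \sum_{i=1}^{n} (\tilde\theta_i - \tilde\theta)^2 \right]^{1/2}} \tag{18} \]

should be approximately distributed as t with $(n-1)$ df. The statistic (18) can be used to perform an approximate significance test on $\theta$.

To apply the jackknife technique to test $H_0 : \ln\sigma_x^2 = \ln\sigma_y^2$ in the two-sample case, we first define

\[ \theta_x = \ln\sigma_x^2, \quad \hat\theta_x = \ln S_x^2, \quad \theta_y = \ln\sigma_y^2, \quad \hat\theta_y = \ln S_y^2, \]

\[ \tilde\theta_{x,i} = n_1 \ln S_x^2 - (n_1 - 1) \ln S_{x,-i}^2, \qquad \tilde\theta_{y,i} = n_2 \ln S_y^2 - (n_2 - 1) \ln S_{y,-i}^2, \]

where $n_i$ is the number of subsamples in the $i$-th sample and $S_{x,-i}^2$, $S_{y,-i}^2$ are the sample variances computed with the $i$-th subsample deleted. Since $\tilde\theta_x$ and $\tilde\theta_y$ are approximately independently distributed, Miller proposed to test $H_0$ by using a two-sample t-test on the two samples $\tilde\theta_{x,1}, \ldots, \tilde\theta_{x,n_1}$ and $\tilde\theta_{y,1}, \ldots, \tilde\theta_{y,n_2}$. To apply the two-sample t test we have a new statistic

\[ T_\theta = \frac{\tilde\theta_x - \tilde\theta_y}{s} \quad \text{with} \quad s = \left( \frac{\mathrm{var}(\tilde\theta_x)}{n_1} + \frac{\mathrm{var}(\tilde\theta_y)}{n_2} \right)^{1/2}, \]

where $\tilde\theta_x$, $\tilde\theta_y$, $\mathrm{var}(\tilde\theta_x)$, and $\mathrm{var}(\tilde\theta_y)$ are the sample means and variances of the two samples of pseudo-values. He showed that under the null hypothesis, the distribution of $T_\theta$ can be approximated by a t distribution with degrees of freedom $v$ given by

\[ \frac{1}{v} = \frac{c^2}{n_1 - 1} + \frac{(1 - c)^2}{n_2 - 1}, \quad \text{where} \quad c = \frac{\mathrm{var}(\tilde\theta_x)}{n_1 s^2}. \]

For the two-sided test, we first compute $|T_\theta|$.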
If $|T_\theta|$ is greater than $t_v(1 - \alpha/2)$, we reject $H_0$ and conclude that the two variances are different.

1.2.4 The Box test

The Box test [2] is the earliest robust test for equality of variances. For the two-sample case, similar to the Jackknife test, each sample is divided into subsamples of size $k$ ($k > 1$). So there are $n_1$ subsamples for the first sample and $n_2$ subsamples for the second sample. Then $\ln S^2$ is obtained from each subsample. Let us define $G_{ij} = \ln S_{ij}^2$, $i = 1, 2$, and $j = 1, \ldots, n_i$. The $G_{ij}$ are approximately distributed as $N\left[\ln\sigma_i^2, \frac{2}{k-1} + \frac{\gamma}{k}\right]$, and the Box procedure performs a two-sample t test on $G_{1j}$ and $G_{2j}$ to test $H_0 : \ln\sigma_1^2 = \ln\sigma_2^2$. First, let us define $\bar{G}_1$, $\bar{G}_2$, $\mathrm{var}(G_1)$, and $\mathrm{var}(G_2)$ as the sample means and variances of the two samples $G_1$ and $G_2$, and

\[ T_G = \frac{\bar{G}_1 - \bar{G}_2}{s} \quad \text{with} \quad s = \left( \frac{\mathrm{var}(G_1)}{n_1} + \frac{\mathrm{var}(G_2)}{n_2} \right)^{1/2}. \]

Under the null hypothesis, the distribution of $T_G$ can be approximated by a t distribution with degrees of freedom $v$ given by

\[ \frac{1}{v} = \frac{c^2}{n_1 - 1} + \frac{(1 - c)^2}{n_2 - 1}, \quad \text{where} \quad c = \frac{\mathrm{var}(G_1)}{n_1 s^2}. \]

For a two-sided test, if $|T_G|$ is greater than $t_v(1 - \alpha/2)$, where $v$ is the degrees of freedom, $H_0$ would be rejected. Box also noted that the test statistic $T_G$ will not have exactly a t distribution since $\ln S^2$ is not exactly normally distributed, but the level of significance should be closely approximated because of the robustness of the t statistic. The main disadvantage of the Box test is the loss of information in subdividing the samples; moreover, different groupings of the data within each sample can produce substantially different results.

1.2.5 The Moses test

The main idea of the Moses test [9] is to apply the Wilcoxon two-sample rank test to the values of $S^2$ obtained from the subsamples as in the Box test. This method was studied in detail by Shorack [10]. Besides $S^2$, other measures of dispersion (e.g., the range, or the mean deviation about the sample mean) were also considered for use in the subsamples.
Moses pointed out the following properties: (a) this test yields an exact significance level, and (b) the two population means can be left completely unspecified. However, like the Box test, this test still suffers from the loss of information due to the sample subdivision.

1.2.6 The Layard $\chi^2$ test

Layard [5] suggested a $\chi^2$ test statistic which is a function of the kurtosis $\gamma$. For large sample size $n$, the statistic $\sqrt{n-1}\,\ln S^2$ approximately follows a $N[\sqrt{n-1}\,\ln\sigma^2, \tau^2]$ distribution, where $\tau^2 = 2 + [1 - (1/n)]\gamma$, and $\gamma$ is the coefficient of kurtosis. Under $H_0$ the statistic

\[ S = \frac{1}{\tau^2} \sum_i (n_i - 1) \left[ \ln S_i^2 - \frac{\sum_j (n_j - 1) \ln S_j^2}{\sum_j (n_j - 1)} \right]^2 \]

is asymptotically distributed like $\chi^2_1$, where $S_i^2$ is the sample variance of the $i$-th sample. However, $\gamma$ is unknown, so Layard suggested the use of

\[ \hat\gamma = \frac{N \sum_i \sum_j (x_{ij} - \bar{x}_i)^4}{\left[ \sum_i \sum_j (x_{ij} - \bar{x}_i)^2 \right]^2} - 3, \quad N = n_1 + n_2, \]

to estimate the kurtosis. Hence, we can use the estimate $\hat\gamma$ and base a test on $\tilde{S} = \tau^2 S / \hat\tau^2$, where $\hat\tau^2 = 2 + [1 - (1/n)]\hat\gamma$. If $\tilde{S}$ exceeds the upper $100(\alpha/2)$ percentile or falls below the lower $100(\alpha/2)$ percentile of the $\chi^2$ distribution, the null hypothesis would be rejected. Note that Layard [5] and Brown [4] have carried out simulated sampling experiments which suggest that the $\chi^2$ test compares favourably with the Box test. A difficulty with this procedure is that quite large samples are needed to get a reasonable estimate of $\gamma$.

1.2.7 The Box-Andersen Test

Box and Andersen [3] applied permutation theory to construct an approximate robust test. The idea of this test is to adjust the degrees of freedom for the statistic $S_x^2/S_y^2$, so that the mean and the variance of its distribution are equal to those under the permutation distribution. Permutation theory assumes that the two samples have been randomly selected without replacement from

\[ u_1 = y_{11}, \; \ldots, \; u_{n_1} = y_{1n_1}, \; u_{n_1+1} = y_{21}, \; \ldots, \; u_{n_1+n_2} = y_{2n_2}, \]

where $y_{ij} = x_{ij} - \mu_i$, and $\mu_i$ is the population mean of the $i$-th sample. For simplicity, the $\mu_i$'s are assumed to be known. Each of the $\binom{n_1 + n_2}{n_1}$ possible combinations is equally likely. Let

\[ B = \frac{\sum_{j=1}^{n_1} y_{1j}^2}{\sum_{i=1}^{2} \sum_{j=1}^{n_i} y_{ij}^2}. \]

The mean of $B$ is the same under the normal and permutation distributions,

\[ E_N(B) = E_P(B) = \frac{n_1}{N}, \]

where $N = n_1 + n_2$. However, the variances differ. Under the normal distribution,

\[ \mathrm{Var}_N(B) = \frac{2 n_1 n_2}{N^2 (N + 2)}. \]

Under the permutation distribution,

\[ \mathrm{Var}_P(B) = \frac{2 n_1 n_2}{N^2 (N + 2)} \left[ 1 + \frac{N}{2(N - 1)} (b_2 - 3) \right], \]

where

\[ b_2 = \frac{(N + 2) \sum_{i=1}^{2} \sum_{j=1}^{n_i} y_{ij}^4}{\left( \sum_{i=1}^{2} \sum_{j=1}^{n_i} y_{ij}^2 \right)^2}. \]

By using new sample sizes, $\tilde{n}_1$ and $\tilde{n}_2$, we can make the two variances equal, where $\tilde{n}_1 = d n_1$, $\tilde{n}_2 = d n_2$, and

\[ d = \left[ 1 + \frac{N + 2}{2(N + 2 - b_2)} (b_2 - 3) \right]^{-1}. \]

The mean of $B$ is unchanged under this substitution. So, by redefining the sample sizes, the normal theory distribution for $B$ can be made to approximate the permutation distribution for $B$. According to the discussion above, Shorack [10] suggested the following approximate Box-Andersen test. The test approximates the distribution of the usual F by an F distribution on degrees of freedom $d_1, d_2$, where $d_1 = d(n_1 - 1)$ and $d_2 = d(n_2 - 1)$ with

\[ d = \left[ 1 + \tfrac{1}{2}(b_2 - 3) \right]^{-1} \quad \text{and} \quad b_2 = \frac{N \sum_{i=1}^{2} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^4}{\left[ \sum_{i=1}^{2} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2 \right]^2}. \]

So, if the classic F statistic exceeds the upper $100(\alpha/2)$ percentile or falls below the lower $100(\alpha/2)$ percentile of the $F_{d_1, d_2}$ distribution, the null hypothesis would be rejected.

1.3 Example

This section contains two examples, whose data are available on the internet at the address http://lib.stat.cmu.edu/DASL/allmethods.html. The data file names are Clouds and Michelson.

1.3.1 The first example: Cloud

In the first example, clouds were randomly seeded or not with silver nitrate. Rainfall amounts were recorded from the clouds. The purpose of the experiment was to determine whether cloud seeding increases rainfall. The side-by-side boxplots of the two logged variables (Fig 1) indicate that the variances of the two groups are very similar after a log transformation.
To compare the significance levels of these six tests, two outliers with the same value are added to the seeded sample, and the value of the outliers is increased until the results of these tests become steady. The side-by-side boxplots for each pair of samples are shown in Fig 2. The results of these tests and the classic F test are displayed in Table 2. For the F, Levene, Layard, Jackknife, and Box-Andersen tests, an entry of 1 in the table means that the test rejects $H_0$. For the Moses and Box tests, the test results may change with different subsamples of the data within each sample. To see whether these two tests are likely to reject the null hypothesis, each of them is executed 100 times for each pair of samples, and the entries are the proportions of rejections.

As expected, the F test is very non-robust. It rejects $H_0$ as soon as the two outliers with value 12 are added. In this example, of all the tests, the Moses and Box tests are the least affected by the outliers. They do not reject the null hypothesis even when the largest outliers, with value 100, are added. In addition, the performance of the Box-Andersen test is quite good. The Levene test is not as good as the Box-Andersen test, but is better than the Layard test, and the Jackknife test is the worst one.

1.3.2 The second example: Michelson

In Michelson's example, 100 determinations of the velocity of light in air were made using a modification of a method proposed by the French physicist Foucault. These measurements were grouped into five trials of 20 measurements each. The numbers are in km/sec, and have had 299,000 subtracted from them. The currently accepted 'true' velocity of light in vacuum is 299,792.5 km/sec. The side-by-side boxplots of the measurements in the first and fifth trials, Fig 3, reveal that their variances are very different.
To compare the power of the seven tests, one outlier is added to the sample with the smaller sample variance, and the value of the outlier is increased until none of these tests rejects $H_0$. The results of these tests and the side-by-side boxplots of each pair of samples are shown in Table 3 and Fig 4.

value of two outliers   F test   Levene test   Layard test   Jackknife test   Box test   Moses test   Box-Andersen test
no outlier              0        0             0             0                0.03       0.04         0
12                      1        0             0             0                0          0.01         0
14                      1        0             0             1                0.01       0.04         0
25                      1        0             1             1                0          0.01         0
28                      1        1             1             1                0          0.01         0
30                      1        1             1             1                0          0.04         1
100                     1        1             1             1                0          0            1

Table 2: Results of tests on variances for the Cloud data.

Figure 1: Side by Side Boxplots of the two logged variables in Cloud example

Without the outlier, all tests except Box and Moses reject $H_0$, and these two tests reject $H_0$ in about 25% of their runs. Hence, these two tests are not powerful in this example. Surprisingly, the F test is not fooled by a large outlier in this example. The Levene test is also very powerful. The Layard test is the worst. The Jackknife and Box-Andersen tests are about equally powerful.

value of outlier   F test   Levene test   Layard test   Jackknife test   Box test   Moses test   Box-Andersen test
no outlier         1        1             1             1                0.28       0.25         1
950                1        1             0             1                0.21       0.25         1
980                1        1             0             0                0.13       0.12         0
1000               1        1             0             0                0.13       0.03         0
1100               0        0             0             0                0          0            0

Table 3: Results of tests on variance for the Michelson data.

According to the two examples, the power of the F test and the Jackknife test is not much affected by the outliers, but their significance levels are very sensitive to the outliers. The Layard test is not so powerful but, in terms of the significance level, it is better than the Jackknife test. The Levene test is the most powerful test in the Michelson example, and its performance is better than that of the Layard test in the Cloud example. In addition, although the Moses and Box tests are not affected by the largest outlier in the first example, they are not robust.
They seem to be superior in the first example only because they are so conservative. Of all the tests, the Box-Andersen test is the best in these two examples.

2 A New Robust test

This chapter contains three sections. In the first section, a new robust method for testing the equality of variances between two populations is presented. In the second section, the asymptotic distribution of the new test statistic described in the first section is derived. In the last section, the new method is applied to the two examples mentioned in the first chapter.

2.1 Robust Dispersion Estimates

First, an alternative measure of dispersion that is more resistant to outliers is introduced. The best feature of this new method is its superior ability to overcome the effect of outliers: the measure is insensitive to changes in the most extreme observations and is therefore resistant to outliers. To start with, we consider just one sample, $x_1, \ldots, x_n$, with $x_i \sim N(\mu, \sigma^2)$ and the $x_i$ independent. The alternative measure of dispersion, based on a sample $x_1, \ldots, x_n$, is called $Sr$. Notice that $Sr$ satisfies the following equation

\[ \frac{1}{n} \sum_{i=1}^{n} \chi\left( \frac{x_i - T_n}{Sr} \right) = b \tag{20} \]

where $T_n$ is the median of the sample and $\chi$ is defined as the function

\[ \chi(x) = \begin{cases} x^2/c^2, & \text{if } |x| < c; \\ 1, & \text{otherwise,} \end{cases} \tag{21} \]

where $c$ is arbitrary. The value of $b$ depends on the choice of $c$. To ensure consistency of $Sr$, we choose

\[ b = E[\chi(z)] \tag{22} \]

with $z \sim N(0, 1)$ (i.e. $Sr \to \sigma$ as $n \to \infty$). Observe that for $-c < z < c$, $\chi(z)$ equals, up to the factor $1/c^2$, the score function of the sample standard deviation. For the two-sample case, $Sr_i$ denotes the new measure of dispersion in the $i$-th sample, $i = 1, 2$. The new test statistic for $H_0$ will be based on the ratio

\[ R = \frac{Sr_1}{Sr_2}. \tag{23} \]

The asymptotic distribution of $R$ is derived in the next section. In addition, Miller [8] gave some references and mentioned the possibility of a test based on the ratio of MADs, which is a particular case of a robust scale estimate.
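The estimate $Sr$ is defined only implicitly by equation (20), so a numerical solver is needed to compute it. The Python sketch below is one possible way to do this and is not the thesis's own (Splus) implementation: it assumes $\chi(x) = x^2/c^2$ for $|x| < c$ and 1 otherwise, evaluates $b = E[\chi(z)]$ in closed form for $z \sim N(0,1)$, and solves (20) by bisection, which works because the left-hand side of (20) is non-increasing in the scale. The function names `consistency_constant` and `robust_scale` and the sample values are hypothetical.

```python
from statistics import NormalDist, median


def consistency_constant(c):
    """b = E[chi(z)] for z ~ N(0, 1), the choice made in equation (22)."""
    std = NormalDist()
    mass = 2 * std.cdf(c) - 1                    # P(|z| < c)
    second_moment = mass - 2 * c * std.pdf(c)    # E[z^2 ; |z| < c]
    return second_moment / c ** 2 + (1 - mass)


def chi(x, c):
    """Bounded score function of equation (21)."""
    return x * x / (c * c) if abs(x) < c else 1.0


def robust_scale(sample, c=1.7, iterations=200):
    """Solve (1/n) * sum(chi((x_i - median) / s)) = b for s by bisection."""
    b = consistency_constant(c)
    center = median(sample)
    residuals = [x - center for x in sample]
    largest = max(abs(r) for r in residuals)
    if largest == 0:
        return 0.0

    def mean_chi(scale):
        return sum(chi(r / scale, c) for r in residuals) / len(residuals)

    # At the upper bound every residual is inside (-c, c) and mean_chi <= b.
    lower, upper = 0.0, largest / (c * b ** 0.5)
    for _ in range(iterations):
        middle = (lower + upper) / 2
        if mean_chi(middle) > b:
            lower = middle      # scale too small: too much weight in chi
        else:
            upper = middle
    return (lower + upper) / 2


# Illustrative (made-up) samples; R is the ratio of equation (23).
first = [2.1, 3.4, 1.9, 5.6, 4.2, 3.3, 2.8, 4.9, 3.7, 2.5, 4.4]
second = [3.0, 3.2, 2.7, 3.9, 3.5, 2.9, 3.1, 3.6, 3.3, 2.8, 3.4]
print("R =", robust_scale(first) / robust_scale(second))
```

Because $\chi$ is capped at 1, an observation far from the median contributes the same amount to (20) however extreme it is, which is the property exploited in the examples of Section 2.3.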
2.2 Asymptotic Distribution of R

In this section, the asymptotic distributions of the test statistic $R$ for the normal and non-normal cases are derived. To see the influence of non-normality when comparing the variation in two samples, we will look at the normally and non-normally distributed cases. Firstly, we will describe the statistical method based on the assumption of an underlying normal distribution. Secondly, we will investigate how sensitive this method is to departures from normality.

2.2.1 Normal case

First, we need to compute the asymptotic distribution of $\sqrt{n}(Sr - \sigma)$. Because $R$ is location invariant, we can assume, without loss of generality, that $\mu = 0$. By the Taylor series expansion,

\[ b = \frac{1}{n} \sum_{i=1}^{n} \chi\left( \frac{x_i}{Sr} \right) \approx \frac{1}{n} \sum_{i=1}^{n} \chi\left( \frac{x_i}{\sigma} \right) - \frac{Sr - \sigma}{\sigma} \cdot \frac{1}{n} \sum_{i=1}^{n} \chi'\left( \frac{x_i}{\sigma} \right) \frac{x_i}{\sigma}. \tag{24} \]

So,

\[ \sqrt{n}(Sr - \sigma) \approx \sigma \, \frac{\frac{1}{\sqrt{n}} \sum_{i=1}^{n} \left[ \chi\left( \frac{x_i}{\sigma} \right) - b \right]}{\frac{1}{n} \sum_{i=1}^{n} \chi'\left( \frac{x_i}{\sigma} \right) \frac{x_i}{\sigma}}. \tag{25} \]

By the Law of Large Numbers,

\[ \frac{1}{n} \sum_{i=1}^{n} \chi'\left( \frac{x_i}{\sigma} \right) \frac{x_i}{\sigma} \to q, \quad \text{with} \quad q = E\left[ \chi'\left( \frac{x}{\sigma} \right) \frac{x}{\sigma} \right]. \tag{26} \]

Also,

\[ \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \left[ \chi\left( \frac{x_i}{\sigma} \right) - b \right] = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} y_i, \tag{27} \]

where $y_i = \chi(x_i/\sigma) - b$, with $E(y) = 0$ and $\mathrm{Var}(y) = E\{[\chi(x/\sigma) - b]^2\} = \tau^2$. By the CLT,

\[ \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \left[ \chi\left( \frac{x_i}{\sigma} \right) - b \right] \to N(0, \tau^2). \tag{28} \]

Therefore, by Slutsky's Theorem,

\[ \sqrt{n}(Sr - \sigma) \to \frac{\sigma}{q} N(0, \tau^2) = N(0, a\sigma^2), \tag{29} \]

with $a = \tau^2/q^2$. The value of $a$ depends on the choice of $c$. Table 4 shows how $a$, $b$ and EFF, the relative efficiency of $Sr$ to the classic sample standard deviation SD, vary with the value of $c$. The table shows that the efficiency of the dispersion estimate increases with $c$. We do not use a larger $c$ to obtain greater efficiency because as $c$ increases, $b$ decreases, and the smaller the value of $b$, the less robust the test. In the next chapter, we will find a value of $c$ such that the test is both robust and efficient.

c        a       b       EFF
1.041    0.989   0.500   0.51
1.7      0.625   0.294   0.80
2.07     0.555   0.218   0.90
2.3765   0.526   0.172   0.95

Table 4: Relation between $c$, $a$, $b$ and EFF

In the two-sample case, suppose we have two independent samples, $x_{11}, \ldots, x_{1n_1}$ and $x_{21}, \ldots, x_{2n_2}$, from the populations $N(\mu_1, \sigma_1^2)$ and $N(\mu_2, \sigma_2^2)$. Suppose the $x_{ij}$, $j = 1, \ldots, n_i$, are independent. For simplicity, we assume $n_1 = n_2 = n$. By the Central Limit Theorem,

\[ \sqrt{n} \left[ \begin{pmatrix} Sr_1 \\ Sr_2 \end{pmatrix} - \begin{pmatrix} \sigma_1 \\ \sigma_2 \end{pmatrix} \right] \to N\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \Sigma \right) \tag{30} \]

where

\[ \Sigma = \begin{pmatrix} a\sigma_1^2 & 0 \\ 0 & a\sigma_2^2 \end{pmatrix}. \tag{31} \]

Let us define a function $g(x, y) = x/y$. Thus, we have by the Delta Method,

\[ \sqrt{n} \left( \frac{Sr_1}{Sr_2} - \frac{\sigma_1}{\sigma_2} \right) \to N(0, \nabla g' \Sigma \nabla g) \tag{32} \]

where

\[ \nabla g = \left( \frac{\partial g}{\partial x}, \frac{\partial g}{\partial y} \right)' \Big|_{(\sigma_1, \sigma_2)} \tag{33} \]

\[ = \left( \frac{1}{\sigma_2}, \; -\frac{\sigma_1}{\sigma_2^2} \right)' \tag{34} \]

and

\[ \nabla g' \Sigma \nabla g = \frac{2a\sigma_1^2}{\sigma_2^2}. \tag{35} \]

If the null hypothesis, $H_0 : \sigma_1 = \sigma_2$, is true, then $\nabla g' \Sigma \nabla g = 2a$ and

\[ S = \sqrt{\frac{n}{2a}}\,(R - 1) \to N(0, 1). \tag{36} \]

So, we can use $S$ to test the two-sided $H_0$, and would reject $H_0$ when $S$ exceeds the upper $100(\alpha/2)$ percentile or falls below the lower $100(\alpha/2)$ percentile of the $N(0,1)$ distribution. Thus, $H_0$ is rejected when $|S| > z(1 - \alpha/2)$. For instance, if $\alpha = 0.05$, then $H_0$ is rejected when $|S|$ is greater than 1.96.

Table 5 displays the upper and lower critical values (i.e. the acceptance regions) for the test statistic $R = Sr_1/Sr_2$, with $\alpha = 0.05$, based on both the asymptotic distribution and the generation of $R$ from 10,000 random numbers in Splus. The larger the sample size, the smaller the difference between the acceptance regions obtained from the two methods. Fig 5 shows the simulated distribution of $R$, with sample sizes 25 and 50 and $c = 1.7, 2.07, 2.3765$. The histogram for the smaller sample size is more skewed to the right, but as the sample size increases it becomes more symmetric.

                                        n = 25            n = 50
c = 1.7      Asymptotic distribution    (0.562, 1.438)    (0.690, 1.310)
             simulation                 (0.631, 1.575)    (0.727, 1.383)
c = 2.07     Asymptotic distribution    (0.587, 1.413)    (0.707, 1.292)
             simulation                 (0.648, 1.538)    (0.736, 1.356)
c = 2.3765   Asymptotic distribution    (0.598, 1.402)    (0.708, 1.292)
             simulation                 (0.654, 1.535)    (0.745, 1.341)

Table 5: Acceptance regions of $R = Sr_1/Sr_2$ with $\alpha = 0.05$ obtained from asymptotic distribution and simulation with 10,000 repetitions.

For the unequal sample size case, if $n_1/n_2 \to d$, then we obtain

\[ \sqrt{n_2}\,(R - 1) \to N\left( 0, \left(1 + \tfrac{1}{d}\right) a \right). \tag{37} \]
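The constants in Table 4 and the asymptotic rows of Table 5 can be checked numerically. The following Python sketch is an illustration under stated assumptions rather than the computation used in the thesis: it takes $\chi(x) = x^2/c^2$ for $|x| < c$ and 1 otherwise, uses closed-form truncated normal moments for $b = E[\chi(z)]$, $q = E[\chi'(z)z]$ and $\tau^2 = \mathrm{Var}(\chi(z))$, and assumes that EFF is the ratio of the asymptotic variance of the sample standard deviation under normality, $\sigma^2/2$, to $a\sigma^2$. The function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def chi_constants(c):
    """Return (a, b, eff) for chi(x) = min(x^2 / c^2, 1) and z ~ N(0, 1)."""
    mass = 2 * STD_NORMAL.cdf(c) - 1                 # P(|z| < c)
    tail = 1 - mass                                  # P(|z| >= c)
    density = STD_NORMAL.pdf(c)
    m2 = mass - 2 * c * density                      # E[z^2 ; |z| < c]
    m4 = 3 * mass - 2 * density * (c ** 3 + 3 * c)   # E[z^4 ; |z| < c]
    b = m2 / c ** 2 + tail                           # E[chi(z)]
    q = 2 * m2 / c ** 2                              # E[chi'(z) z]
    tau2 = m4 / c ** 4 + tail - b ** 2               # Var(chi(z))
    a = tau2 / q ** 2                                # as in equation (29)
    eff = 0.5 / a                                    # assumed definition of EFF
    return a, b, eff


def asymptotic_acceptance_region(c, n, alpha=0.05):
    """Region for R implied by equation (36): 1 +/- z * sqrt(2a / n)."""
    a, _, _ = chi_constants(c)
    half_width = STD_NORMAL.inv_cdf(1 - alpha / 2) * sqrt(2 * a / n)
    return 1 - half_width, 1 + half_width


for c in (1.041, 1.7, 2.07, 2.3765):
    a, b, eff = chi_constants(c)
    print(f"c = {c:<7} a = {a:.3f}  b = {b:.3f}  EFF = {eff:.2f}")

low, high = asymptotic_acceptance_region(1.7, 25)
print(f"c = 1.7, n = 25: ({low:.3f}, {high:.3f})")
```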
2.2.2 Non-normal case

To see how sensitive this new test is to departures from normality, we will look at cases where the populations follow other distributions: $t_5$, $t_{10}$, $\chi^2_5$, $\chi^2_{10}$, uniform(0,1), and uniform(0,10). In addition, we will estimate their actual significance levels by generating 10,000 numbers. Since we want to know whether the two arguments of the uniform distribution affect the results, the uniform distributions with arguments (0,1) and (0,10) are both investigated. The simulated significance levels ($\alpha = 0.05$) for the non-normal distributions are displayed in Table 6. The normal case is included in the table because we want to see how large the error due to the generation of data is. Note that the arguments of the uniform distribution do not affect the result. Also, for a heavy-tailed distribution, the probability of rejecting $H_0$ exceeds 0.05; whereas for a short-tailed distribution, the probability is less than 0.05. In general, however, the results are closer to 0.05 than those from the classic F test. Also, the significance levels yielded by smaller $c$ are closer to 0.05.

2.3 Examples

In this section, the new tests with $c = 1.7, 2.07, 2.3765$ are applied to the examples described in the first chapter. The test results and the test statistics $R$ for each pair of samples are shown in Tables 7 and 8. Table 9 displays the acceptance regions of $R$ with $\alpha = 0.05$ for sample sizes $n_1, n_2$. The acceptance regions shown in the table are obtained from simulation with 1,000 repetitions. In the Cloud example, when no outlier is added, $n_1 = n_2 = 26$, and with $c = 1.7$, $R = 0.958$.
Since R is within the acceptance region, [0.689, 1.457], shown in Table 9, the new test with 24 Distribution c = 1.7 c = 2.07 c = 2.3765 iV(0,l) 0.053 0.052 0.051 u 0.087 0.105 0.123 0.073 0.079 0.080 x\ 0.074 0.127 0.150 X io 0.052 0.080 0.098 Uniform(0,1) 0.0005 0.001 0.002 Uniform(0,10) 0.0005 0.001 0.002 2 Table 6: Simulated actual significant level of the new test ( a = 0.05) from 10,000 generated data, with normal assumption for several non-normal distributions 25 c = 2.07 c= 1.7 c = 2.3765 value of R reject R reject R reject no outlier 0.958 0 0.953 0 0.969 0 12 0.958 0 0.953 0 0.969 0 14 0.958 0 0.953 0 0.969 0 25 0.958 0 0.953 0 0.969 0 28 0.958 0 0.953 0 0.969 0 30 0.958 0 0.953 0 0.969 0 100 0.958 0 0.953 0 0.969 0 two outlier Table 7: Results of the new tests (c = 1.7,2.07,2.3765) on the C l o u d example. If reject = 1, the test rejects Ho c = 1.7 does not reject the null hypothesis. For all of the three tests, no matter how large the two outliers are, they still do not reject the null hypothesis. It means that the tests are not affected by the extremely large observations. Also, the value of R does not vary with the value of outliers for each test. Similarly, for the Michelson's example, the size of outlier does not make any influence on the results of the tests, and the value of R keeps constant with different values of outliers. Based on these two examples, we can conclude that the new tests have superior ability to overcome the effect of outliers. 26 c = 2.07 c = 1.7 c = 2.3765 value of R reject R reject R reject no outlier 1.841 1 1.882 1 1.781 1 950 1.841 1 1.882 1 1.589 1 980 1.841 1 1.882 1 1.589 1 1000 1.841 1 1.882 1 1.589 1 1100 1.841 1 1.882 1 1.589 1 two outlier Table 8: Results of the new tests (c = 1.7,2.07,2.3765) on the Michelson example. 
If reject = 1, the test rejects $H_0$.

| $n_1$ | $n_2$ | c = 1.7 | c = 2.07 | c = 2.3765 |
|---|---|---|---|---|
| 26 | 26 | [0.689, 1.457] | [0.716, 1.464] | [0.698, 1.413] |
| 26 | 28 | [0.688, 1.431] | [0.721, 1.407] | [0.710, 1.415] |
| 20 | 20 | [0.564, 1.774] | [0.599, 1.723] | [0.606, 1.679] |
| 20 | 21 | [0.654, 1.592] | [0.661, 1.490] | [0.672, 1.470] |

Table 9: Acceptance regions of $R = S_{r1}/S_{r2}$ with $\alpha = 0.05$ for sample sizes $n_1$, $n_2$, obtained from simulation with 1,000 repetitions.

3 Monte Carlo study

In this chapter, we compare the new tests with the F test and the six robust tests described in the first chapter. Two types of Monte Carlo studies are presented. First, we investigate the sensitivity of the tests to non-normality. Second, we investigate the influence of outliers on the power and the significance level of the tests.

The procedure for the first Monte Carlo study is the following:

(i) Generate one hundred and fifty pairs of samples; the sample size is 25, and the pseudo-random numbers represent samples from a uniform distribution.

(ii) Transform the pseudo-random numbers to obtain samples from the $N(0,1)$, $\chi^2_{10}$, $t_5$, $t_{10}$, and $t_{20}$ distributions.

(iii) After the transformation, the second sample is scaled by the factor $A$ so that the ratio of the two variances is $A^2$ for each distribution. Different values of $A$ are selected and applied to the samples.

(iv) Ten tests are applied to each of the 150 pairs of samples. The ten tests are the F test, the Box-Andersen test, the Levene test, the Layard $\chi^2$ test, the Jackknife test with subsample size k = 1, the Box and Moses tests, both with subsample size k = 5, and the three new tests with c = 1.7, 2.07, and 2.3765.

(v) Repeat steps (i) to (iv) with sample size 50.

The entries in Tables 10 to 14 are the proportions of samples in 150 trials for which the tests reject the null hypothesis $\sigma_1^2 = \sigma_2^2$, for the various distributions and values of $A$. For $A = 1$ the proportions should be close to $\alpha = 0.05$.
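As a rough illustration of steps (i) to (iv), the sketch below runs the same kind of experiment for a single test, the plain variance ratio, on normal data with sample size 25. It is not the thesis's code. The acceptance region is calibrated by simulation (in the spirit of Table 9) rather than taken from F tables, and the function names, the seed, and the 1,000 calibration repetitions are assumptions made for the example.

```python
import random
from statistics import NormalDist, variance

STD_NORMAL = NormalDist()  # N(0, 1)


def normal_sample(n, rng):
    """Steps (i)-(ii): draw uniform numbers and map them to N(0,1) by the inverse CDF."""
    sample = []
    for _ in range(n):
        u = rng.random()
        # inv_cdf needs 0 < u < 1; clamp the (extremely rare) endpoint value.
        u = min(max(u, 1e-12), 1.0 - 1e-12)
        sample.append(STD_NORMAL.inv_cdf(u))
    return sample


def variance_ratio(x, y):
    """Ratio of the second sample variance to the first."""
    return variance(y) / variance(x)


def acceptance_region(n, reps, rng, alpha=0.05):
    """Empirical two-sided acceptance region of the ratio under equal variances."""
    ratios = sorted(
        variance_ratio(normal_sample(n, rng), normal_sample(n, rng))
        for _ in range(reps)
    )
    k = round(reps * alpha / 2)  # number of values cut off in each tail
    return ratios[k], ratios[reps - 1 - k]


def rejection_rate(n, scale, region, trials, rng):
    """Steps (iii)-(iv): scale the second sample and count rejections over the trials."""
    lower, upper = region
    rejections = 0
    for _ in range(trials):
        x = normal_sample(n, rng)
        y = [scale * v for v in normal_sample(n, rng)]  # variance ratio is scale**2
        r = variance_ratio(x, y)
        if r < lower or r > upper:
            rejections += 1
    return rejections / trials


rng = random.Random(1997)  # arbitrary seed, fixed only for reproducibility
region = acceptance_region(25, 1000, rng)
rates = {
    scale: rejection_rate(25, scale, region, 150, rng)
    for scale in (1.0, 1.5, 2.0, 2.5, 5.0)
}
for scale, rate in rates.items():
    print(f"sd ratio 1:{scale:g}  rejection proportion {rate:.3f}")
```

The entry for a standard deviation ratio of 1:1 estimates the significance level and the remaining entries estimate power. Step (v) would correspond to calling the same functions with n = 50.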
For $A > 1$ the proportions are Monte Carlo estimates of the power of the tests at the particular values of $A$ for the various distributions. The results in these tables lead to the following conclusions:

(i) The F test is extremely non-robust. It gives too many significant results for long-tailed distributions.

(ii) The three new tests have about the same power, and in general they are the most powerful tests in the group. When $A = 1$, the three tests give more significant results than the other tests.

(iii) The new test with c = 1.7 is not as powerful as the new tests with c = 2.07, 2.3765, but its actual significance level is closer to 0.05.

(iv) The other tests are robust, but they are not as powerful as the new tests. In general, the Jackknife and Box-Andersen tests have about the same power. The Levene test is more powerful than these two tests.

(v) The Moses test is slightly less powerful than the Box test, and seems to be the least powerful of all the tests.

The second type of Monte Carlo study includes two parts. The first part estimates the influence of outliers on the significance level of the tests, and the second part estimates the influence of outliers on the power of the tests. The procedure for the first part is the following:

(i) Transform the first sample of each of the one hundred and fifty pairs of pseudo-random samples to obtain samples from $N(0,1)$ with different numbers of outliers from $N(5, 0.1)$.

(ii) Transform the second sample of each of the one hundred and fifty pairs of pseudo-random samples to obtain samples from $N(0,1)$ without outliers.

(iii) Repeat steps (i) and (ii) with sample size 50.

The entries in Table 15 are the proportions of samples in 150 trials for which the tests reject the null hypothesis $\sigma_1^2 = \sigma_2^2$, for the various numbers of outliers. A test with smaller values is less affected by the outliers, and seldom falsely rejects the null hypothesis.
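The contamination scheme in steps (i) and (ii) can be sketched in the same spirit. The code below is an illustration under stated assumptions, not the thesis's implementation. The second parameter of $N(5, 0.1)$ is read as a standard deviation, outliers replace (rather than extend) observations in the first sample, and acceptance regions are calibrated by simulation on clean $N(0,1)$ data. A squared ratio of median absolute deviations (`mad_ratio`) stands in as a generic robust statistic; it is not the thesis's statistic R.

```python
import random
from statistics import median, variance


def first_sample(n, n_outliers, rng, out_mean=5.0, out_sd=0.1):
    """N(0,1) sample of size n in which n_outliers values come from N(out_mean, out_sd)."""
    clean = [rng.gauss(0.0, 1.0) for _ in range(n - n_outliers)]
    outliers = [rng.gauss(out_mean, out_sd) for _ in range(n_outliers)]
    return clean + outliers


def variance_ratio(x, y):
    """Classical statistic: ratio of the two sample variances."""
    return variance(x) / variance(y)


def mad(x):
    """Median absolute deviation about the median."""
    m = median(x)
    return median([abs(v - m) for v in x])


def mad_ratio(x, y):
    """Hypothetical robust stand-in: squared ratio of the two MADs."""
    return (mad(x) / mad(y)) ** 2


def acceptance_region(stat, n, reps, rng, alpha=0.05):
    """Empirical two-sided acceptance region of stat for two clean N(0,1) samples."""
    values = sorted(
        stat(first_sample(n, 0, rng), first_sample(n, 0, rng)) for _ in range(reps)
    )
    k = round(reps * alpha / 2)
    return values[k], values[reps - 1 - k]


def false_rejection_rate(stat, region, n, n_outliers, trials, rng):
    """Proportion of trials rejecting equal variances although both populations are N(0,1)."""
    lower, upper = region
    rejections = 0
    for _ in range(trials):
        x = first_sample(n, n_outliers, rng)  # contaminated first sample
        y = first_sample(n, 0, rng)           # clean second sample
        value = stat(x, y)
        if value < lower or value > upper:
            rejections += 1
    return rejections / trials


rng = random.Random(15)  # arbitrary seed, fixed only for reproducibility
results = {}
for name, stat in (("variance ratio", variance_ratio), ("MAD ratio", mad_ratio)):
    region = acceptance_region(stat, 25, 1000, rng)
    results[name] = [
        false_rejection_rate(stat, region, 25, k, 150, rng) for k in range(6)
    ]
    print(name, [f"{rate:.3f}" for rate in results[name]])
```

Each list holds the rejection proportions for 0 to 5 outliers at sample size 25, laid out like one row of the left half of Table 15. Calibrating the region by simulation keeps the sketch free of distribution tables, at the cost of some noise in the nominal level.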
According to the results in Table 15, we have the following conclusions:

(i) The Moses test is less affected than the Box test. Both of these tests seem to be the least affected by the outliers. However, this is probably because they are very conservative, and the result is consistent with the one obtained by Miller [7].

(ii) The new test with c = 1.7 is the second least affected. When the sample size is 25 and fewer than 16% of the observations in the first sample are outliers, the Levene test is slightly better than the new test with c = 2.07; the new test with c = 2.3765 is almost the worst. Also, as the number of outliers increases, the new test with c = 2.07 becomes more affected by the outliers.

(iii) When the sample size is 50, the new test with c = 2.07 is better than the Levene test. In addition, the performance of the new test with c = 1.7 is almost the best in the group.

The second part tests the effect of outliers on the power of the tests. The procedure is the following:

(i) Transform the first sample of each of the one hundred and fifty pairs of pseudo-random samples to obtain samples from $N(0,1)$ with different numbers of outliers from $N(5.5, 0.1)$.

(ii) Transform the second sample of each of the one hundred and fifty pairs of pseudo-random samples to obtain samples from $N(0,3)$ without outliers.

(iii) Repeat steps (i) and (ii) with sample size 50.

The entries in Table 16 are the proportions of samples in 150 trials for which the tests reject the null hypothesis $\sigma_1^2 = \sigma_2^2$, for the various numbers of outliers. Tests with larger values are less affected by the outliers, and seldom falsely accept the null hypothesis. To estimate the influence of larger outliers, we repeat the procedure with larger outliers from the $N(10, 0.1)$ distribution. The results are exhibited in Table 17. Based on these two tables, we have the following conclusions:

(i) The new test with c = 1.7 has the best performance.
(ii) When the sample contains fewer than 16% outliers, the new tests with c = 2.07, 2.3765 are the second-best tests. However, as the number of outliers increases, the new tests with higher values of c become the worst of all.

In Tables 10 to 14, the column headings give the sample size ($n = n_1 = n_2$) and the ratio of standard deviations.

| Test | n = 25, 1:1 | n = 25, 1:1.5 | n = 25, 1:2 | n = 25, 1:2.5 | n = 25, 1:5 | n = 50, 1:1 | n = 50, 1:1.5 | n = 50, 1:2 | n = 50, 1:2.5 | n = 50, 1:5 |
|---|---|---|---|---|---|---|---|---|---|---|
| F-test | 0.053 | 0.513 | 0.927 | 0.987 | 1.000 | 0.053 | 0.767 | 1.000 | 1.000 | 1.000 |
| Levene | 0.047 | 0.460 | 0.893 | 0.973 | 1.000 | 0.080 | 0.733 | 0.987 | 1.000 | 1.000 |
| Layard | 0.040 | 0.407 | 0.827 | 0.953 | 1.000 | 0.067 | 0.707 | 0.980 | 1.000 | 1.000 |
| Jackknife, k = 1 | 0.027 | 0.493 | 0.900 | 0.973 | 1.000 | 0.040 | 0.740 | 1.000 | 1.000 | 1.000 |
| Box, k = 5 | 0.047 | 0.293 | 0.707 | 0.820 | 1.000 | 0.060 | 0.553 | 0.927 | 0.993 | 1.000 |
| Moses, k = 5 | 0.027 | 0.287 | 0.600 | 0.800 | 0.987 | 0.053 | 0.560 | 0.920 | 0.980 | 1.000 |
| Box-Andersen | 0.033 | 0.487 | 0.913 | 0.973 | 1.000 | 0.053 | 0.747 | 0.993 | 1.000 | 1.000 |
| New test, c = 1.7 | 0.060 | 0.420 | 0.860 | 0.967 | 1.000 | 0.067 | 0.687 | 0.987 | 1.000 | 1.000 |
| New test, c = 2.07 | 0.047 | 0.453 | 0.893 | 0.980 | 1.000 | 0.080 | 0.740 | 0.993 | 1.000 | 1.000 |
| New test, c = 2.3765 | 0.060 | 0.453 | 0.900 | 0.973 | 1.000 | 0.067 | 0.760 | 1.000 | 1.000 | 1.000 |

Table 10: Monte Carlo power functions for tests on variances for the normal distribution

| Test | n = 25, 1:1 | n = 25, 1:1.5 | n = 25, 1:2 | n = 25, 1:2.5 | n = 25, 1:5 | n = 50, 1:1 | n = 50, 1:1.5 | n = 50, 1:2 | n = 50, 1:2.5 | n = 50, 1:5 |
|---|---|---|---|---|---|---|---|---|---|---|
| F-test | 0.153 | 0.540 | 0.860 | 0.960 | 1.000 | 0.093 | 0.707 | 0.987 | 1.000 | 1.000 |
| Levene | 0.060 | 0.487 | 0.833 | 0.953 | 1.000 | 0.087 | 0.653 | 0.973 | 1.000 | 1.000 |
| Layard | 0.040 | 0.353 | 0.740 | 0.900 | 1.000 | 0.027 | 0.507 | 0.913 | 1.000 | 1.000 |
| Jackknife, k = 1 | 0.080 | 0.440 | 0.760 | 0.900 | 0.993 | 0.053 | 0.600 | 0.940 | 1.000 | 1.000 |
| Box, k = 5 | 0.033 | 0.293 | 0.600 | 0.800 | 0.987 | 0.067 | 0.460 | 0.880 | 0.987 | 1.000 |
| Moses, k = 5 | 0.033 | 0.227 | 0.493 | 0.760 | 0.973 | 0.060 | 0.447 | 0.860 | 0.987 | 1.000 |
| Box-Andersen | 0.053 | 0.407 | 0.780 | 0.913 | 1.000 | 0.047 | 0.600 | 0.933 | 1.000 | 1.000 |
| New test, c = 1.7 | 0.0737 | 0.427 | 0.847 | 0.967 | 1.000 | 0.073 | 0.653 | 0.980 | 1.000 | 1.000 |
| New test, c = 2.07 | 0.093 | 0.500 | 0.880 | 0.967 | 1.000 | 0.100 | 0.693 | 0.987 | 1.000 | 1.000 |
| New test, c = 2.3765 | 0.100 | 0.487 | 0.873 | 0.967 | 1.000 | 0.113 | 0.727 | 0.993 | 1.000 | 1.000 |

Table 11: Monte Carlo power functions for tests on variances for the $\chi^2_5$ distribution

| Test | n = 25, 1:1 | n = 25, 1:1.5 | n = 25, 1:2 | n = 25, 1:2.5 | n = 25, 1:5 | n = 50, 1:1 | n = 50, 1:1.5 | n = 50, 1:2 | n = 50, 1:2.5 | n = 50, 1:5 |
|---|---|---|---|---|---|---|---|---|---|---|
| F-test | 0.193 | 0.500 | 0.847 | 0.960 | 1.000 | 0.227 | 0.667 | 0.973 | 1.000 | 1.000 |
| Levene | 0.040 | 0.353 | 0.727 | 0.940 | 0.993 | 0.073 | 0.593 | 0.940 | 1.000 | 1.000 |
| Layard | 0.040 | 0.227 | 0.607 | 0.840 | 0.993 | 0.027 | 0.407 | 0.860 | 0.947 | 1.000 |
| Jackknife, k = 1 | 0.047 | 0.353 | 0.673 | 0.847 | 0.980 | 0.073 | 0.473 | 0.860 | 0.940 | 1.000 |
| Box, k = 5 | 0.040 | 0.240 | 0.547 | 0.780 | 0.980 | 0.053 | 0.433 | 0.860 | 0.967 | 1.000 |
| Moses, k = 5 | 0.020 | 0.200 | 0.467 | 0.660 | 0.980 | 0.060 | 0.427 | 0.847 | 0.960 | 1.000 |
| Box-Andersen | 0.027 | 0.293 | 0.660 | 0.833 | 0.993 | 0.053 | 0.473 | 0.880 | 0.960 | 1.000 |
| New test, c = 1.7 | 0.087 | 0.427 | 0.840 | 0.953 | 1.000 | 0.120 | 0.667 | 0.973 | 1.000 | 1.000 |
| New test, c = 2.07 | 0.093 | 0.493 | 0.860 | 0.967 | 1.000 | 0.120 | 0.700 | 0.960 | 1.000 | 1.000 |
| New test, c = 2.3765 | 0.113 | 0.480 | 0.847 | 0.973 | 1.000 | 0.140 | 0.720 | 0.973 | 1.000 | 1.000 |

Table 12: Monte Carlo power functions for tests on variances for the $t_5$ distribution

| Test | n = 25, 1:1 | n = 25, 1:1.5 | n = 25, 1:2 | n = 25, 1:2.5 | n = 25, 1:5 | n = 50, 1:1 | n = 50, 1:1.5 | n = 50, 1:2 | n = 50, 1:2.5 | n = 50, 1:5 |
|---|---|---|---|---|---|---|---|---|---|---|
| F-test | 0.107 | 0.507 | 0.900 | 0.980 | 1.000 | 0.087 | 0.747 | 0.987 | 1.000 | 1.000 |
| Levene | 0.047 | 0.427 | 0.840 | 0.967 | 1.000 | 0.080 | 0.667 | 0.967 | 1.000 | 1.000 |
| Layard | 0.040 | 0.333 | 0.720 | 0.913 | 1.000 | 0.053 | 0.520 | 0.940 | 0.993 | 1.000 |
| Jackknife, k = 1 | 0.033 | 0.420 | 0.793 | 0.913 | 1.000 | 0.040 | 0.600 | 0.960 | 1.000 | 1.000 |
| Box, k = 5 | 0.033 | 0.260 | 0.613 | 0.807 | 1.000 | 0.073 | 0.507 | 0.920 | 0.993 | 1.000 |
| Moses, k = 5 | 0.020 | 0.200 | 0.560 | 0.700 | 0.973 | 0.053 | 0.480 | 0.860 | 0.980 | 1.000 |
| Box-Andersen | 0.027 | 0.413 | 0.787 | 0.920 | 1.000 | 0.053 | 0.647 | 0.967 | 0.993 | 1.000 |
| New test, c = 1.7 | 0.067 | 0.427 | 0.847 | 0.967 | 1.000 | 0.093 | 0.673 | 0.987 | 1.000 | 1.000 |
| New test, c = 2.07 | 0.073 | 0.480 | 0.873 | 0.980 | 1.000 | 0.093 | 0.720 | 0.987 | 1.000 | 1.000 |
| New test, c = 2.3765 | 0.067 | 0.447 | 0.873 | 0.973 | 1.000 | 0.093 | 0.733 | 0.980 | 1.000 | 1.000 |

Table 13: Monte Carlo power functions for tests on variances for the $t_{10}$ distribution

| Test | n = 25, 1:1 | n = 25, 1:1.5 | n = 25, 1:2 | n = 25, 1:2.5 | n = 25, 1:5 | n = 50, 1:1 | n = 50, 1:1.5 | n = 50, 1:2 | n = 50, 1:2.5 | n = 50, 1:5 |
|---|---|---|---|---|---|---|---|---|---|---|
| F-test | 0.073 | 0.493 | 0.900 | 0.980 | 1.000 | 0.060 | 0.767 | 0.993 | 1.000 | 1.000 |
| Levene | 0.047 | 0.433 | 0.873 | 0.967 | 1.000 | 0.080 | 0.720 | 0.973 | 1.000 | 1.000 |
| Layard | 0.053 | 0.367 | 0.773 | 0.927 | 1.000 | 0.060 | 0.647 | 0.967 | 1.000 | 1.000 |
| Jackknife, k = 1 | 0.033 | 0.453 | 0.833 | 0.953 | 1.000 | 0.040 | 0.707 | 0.987 | 1.000 | 1.000 |
| Box, k = 5 | 0.033 | 0.287 | 0.653 | 0.853 | 1.000 | 0.047 | 0.607 | 0.880 | 0.993 | 1.000 |
| Moses, k = 5 | 0.020 | 0.207 | 0.553 | 0.760 | 0.993 | 0.067 | 0.567 | 0.900 | 0.980 | 1.000 |
| Box-Andersen | 0.027 | 0.440 | 0.860 | 0.953 | 1.000 | 0.053 | 0.713 | 0.980 | 1.000 | 1.000 |
| New test, c = 1.7 | 0.067 | 0.420 | 0.860 | 0.967 | 1.000 | 0.087 | 0.673 | 0.987 | 1.000 | 1.000 |
| New test, c = 2.07 | 0.053 | 0.480 | 0.880 | 0.980 | 1.000 | 0.087 | 0.733 | 0.993 | 1.000 | 1.000 |
| New test, c = 2.3765 | 0.060 | 0.447 | 0.887 | 0.973 | 1.000 | 0.067 | 0.747 | 0.993 | 1.000 | 1.000 |

Table 14: Monte Carlo power functions for tests on variances for the $t_{20}$ distribution

In Tables 15 to 17, the column headings give the sample size ($n = n_1 = n_2$) and the number of outliers in the first sample.

| Test | n = 25, 1 | n = 25, 2 | n = 25, 3 | n = 25, 4 | n = 25, 5 | n = 50, 2 | n = 50, 4 | n = 50, 6 | n = 50, 8 | n = 50, 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| F-test | 0.380 | 0.807 | 0.960 | 0.987 | 0.993 | 0.653 | 0.993 | 1.000 | 1.000 | 1.000 |
| Levene | 0.040 | 0.153 | 0.460 | 0.820 | 0.987 | 0.140 | 0.473 | 0.873 | 1.000 | 1.000 |
| Layard | 0.000 | 0.093 | 0.467 | 0.913 | 0.987 | 0.027 | 0.513 | 0.987 | 1.000 | 1.000 |
| Jackknife, k = 1 | 0.047 | 0.413 | 0.840 | 0.967 | 0.987 | 0.260 | 0.893 | 1.000 | 1.000 | 1.000 |
| Box, k = 5 | 0.013 | 0.073 | 0.233 | 0.400 | 0.547 | 0.100 | 0.207 | 0.560 | 0.740 | 0.907 |
| Moses, k = 5 | 0.047 | 0.033 | 0.173 | 0.267 | 0.400 | 0.067 | 0.227 | 0.420 | 0.653 | 0.840 |
| Box-Andersen | 0.013 | 0.140 | 0.600 | 0.940 | 0.987 | 0.100 | 0.680 | 1.000 | 1.000 | 1.000 |
| New test, c = 1.7 | 0.073 | 0.100 | 0.240 | 0.400 | 0.700 | 0.080 | 0.187 | 0.407 | 0.727 | 0.933 |
| New test, c = 2.07 | 0.073 | 0.193 | 0.373 | 0.833 | 0.993 | 0.080 | 0.280 | 0.680 | 0.987 | 1.000 |
| New test, c = 2.3765 | 0.107 | 0.293 | 0.773 | 0.993 | 0.993 | 0.120 | 0.487 | 0.980 | 1.000 | 1.000 |

Table 15: Monte Carlo power functions for tests on variances, based on two samples from the $N(0,1)$ population with different numbers of outliers from $N(5, 0.1)$ in the first sample

| Test | n = 25, 1 | n = 25, 2 | n = 25, 3 | n = 25, 4 | n = 25, 5 | n = 50, 2 | n = 50, 4 | n = 50, 6 | n = 50, 8 | n = 50, 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| F-test | 0.953 | 0.713 | 0.407 | 0.153 | 0.067 | 1.000 | 0.967 | 0.820 | 0.540 | 0.273 |
| Levene | 0.947 | 0.760 | 0.473 | 0.240 | 0.087 | 1.000 | 0.987 | 0.867 | 0.613 | 0.273 |
| Layard | 0.747 | 0.407 | 0.180 | 0.087 | 0.040 | 1.000 | 0.833 | 0.613 | 0.333 | 0.187 |
| Jackknife, k = 1 | 0.073 | 0.093 | 0.093 | 0.087 | 0.067 | 0.947 | 0.807 | 0.613 | 0.440 | 0.307 |
| Box, k = 5 | 0.767 | 0.240 | 0.060 | 0.013 | 0.013 | 0.980 | 0.867 | 0.573 | 0.313 | 0.120 |
| Moses, k = 5 | 0.527 | 0.233 | 0.073 | 0.053 | 0.033 | 0.987 | 0.813 | 0.467 | 0.207 | 0.133 |
| Box-Andersen | 0.820 | 0.507 | 0.280 | 0.140 | 0.080 | 1.000 | 0.907 | 0.740 | 0.480 | 0.293 |
| New test, c = 1.7 | 0.993 | 0.980 | 0.940 | 0.853 | 0.573 | 1.000 | 1.000 | 0.993 | 0.987 | 0.840 |
| New test, c = 2.07 | 0.993 | 0.980 | 0.880 | 0.567 | 0.013 | 1.000 | 1.000 | 0.993 | 0.820 | 0.067 |
| New test, c = 2.3765 | 1.000 | 0.960 | 0.700 | 0.080 | 0.027 | 1.000 | 1.000 | 0.940 | 0.333 | 0.127 |

Table 16: Monte Carlo power functions for tests on variances, based on two samples from the $N(0,1)$ and $N(0,3)$ populations with different numbers of outliers from $N(5.5, 0.1)$ in the first sample

| Test | n = 25, 1 | n = 25, 2 | n = 25, 3 | n = 25, 4 | n = 25, 5 | n = 50, 2 | n = 50, 4 | n = 50, 6 | n = 50, 8 | n = 50, 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| F-test | 0.173 | 0.007 | 0.073 | 0.207 | 0.320 | 0.580 | 0.000 | 0.107 | 0.380 | 0.673 |
| Levene | 0.607 | 0.047 | 0.000 | 0.013 | 0.173 | 0.953 | 0.313 | 0.000 | 0.040 | 0.413 |
| Layard | 0.027 | 0.073 | 0.027 | 0.020 | 0.127 | 0.047 | 0.033 | 0.027 | 0.073 | 0.420 |
| Jackknife, k = 1 | 0.000 | 0.000 | 0.000 | 0.100 | 0.307 | 0.000 | 0.000 | 0.013 | 0.220 | 0.600 |
| Box, k = 5 | 0.127 | 0.000 | 0.000 | 0.000 | 0.020 | 0.847 | 0.167 | 0.013 | 0.000 | 0.020 |
| Moses, k = 5 | 0.007 | 0.000 | 0.000 | 0.000 | 0.080 | 0.787 | 0.047 | 0.000 | 0.027 | 0.153 |
| Box-Andersen | 0.013 | 0.000 | 0.000 | 0.020 | 0.227 | 0.140 | 0.000 | 0.000 | 0.140 | 0.567 |
| New test, c = 1.7 | 0.993 | 0.980 | 0.940 | 0.853 | 0.573 | 1.000 | 1.000 | 0.993 | 0.987 | 0.840 |
| New test, c = 2.07 | 0.993 | 0.980 | 0.880 | 0.567 | 0.100 | 1.000 | 1.000 | 0.993 | 0.820 | 0.133 |
| New test, c = 2.3765 | 1.000 | 0.960 | 0.700 | 0.120 | 0.500 | 1.000 | 1.000 | 0.933 | 0.227 | 0.853 |

Table 17: Monte Carlo power functions for tests on variances, based on two samples from the $N(0,1)$ and $N(0,3)$ populations with different numbers of outliers from $N(10, 0.1)$ in the first sample

4 Conclusion

The classic F test for the hypothesis concerning the equality of two population variances is known to be non-robust. Let us consider a two-sample problem.
Suppose we have two samples, $y_{11}, \ldots, y_{1n_1}$ and $y_{21}, \ldots, y_{2n_2}$. Suppose the $y_{ij}$'s are independent and identically distributed with cdf $G((y_i - \mu_i)/\sigma_i)$. As $n$ [...], where $\gamma$ is the coefficient of kurtosis. If the normal assumption is met, $\gamma = 0$. However, for non-normal cases, such as $t_5$, $\gamma$ is not zero. So, when we apply the classical F test to non-normal samples, the actual size of the test differs from its nominal level of significance, $\alpha$. Therefore, several robust alternative procedures have been introduced in this century.

This paper presents a new robust method. The best feature of this new method is that it has superior ability to overcome the effect of outliers. First, an alternative measure of dispersion, $S_r$, that is more resistant to outliers was introduced. The new test statistic was then defined using these robust dispersion estimates.

In Section 2.2.2, we estimated the actual significance levels of the new tests ($\alpha = 0.05$) for the non-normal case. We found that for a heavy-tailed distribution the probability of rejecting $H_0$ exceeds 0.05, whereas for a short-tailed distribution the probability is less than 0.05. In general, however, the results are closer to 0.05 than those from the classic F test. Also, the significance levels yielded by smaller c are closer to 0.05.

According to the two examples described in the first two chapters, the performance of the new tests is clearly better than that of the other tests discussed in the first chapter. In these two examples, we can see that no matter how large the outliers are, the new tests are not affected by them. This can be explained by the fact that the test statistic R is affected not by the size of the outliers but by their number.

In addition, according to the first type of Monte Carlo study, the three new tests have about the same power.
In general, the new tests are the most powerful in the group, although the true significance levels of the three tests are slightly more sensitive than those of the other tests. Also, the new test with c = 1.7 is not quite as powerful as the new tests with c = 2.07, 2.3765, but its actual significance level is closer to the nominal significance level 0.05. Based on the second type of Monte Carlo study, the new test with c = 1.7 seems to have superior power to overcome the effect of outliers.

On the whole, this paper has demonstrated that, although the new test with c = 1.7 is a little less powerful than those with c = 2.07, 2.3765, of all the tests it has superior ability to overcome the effect of outliers.

References

[1] M. S. Bartlett. Properties of sufficiency and statistical tests. Proceedings of the Royal Society A, 160:262-282, 1937.

[2] G. E. P. Box. Non-normality and tests on variances. Biometrika, 40:318-335, 1953.

[3] G. E. P. Box and S. L. Andersen. Permutation theory in the derivation of robust criteria and the study of departures from assumption. Journal of the Royal Statistical Society, B17:1-26, 1955.

[4] M. B. Brown and A. B. Forsythe. Robust test for the equality of variances. Journal of American Statistical Association, 69:364-367, 1974.

[5] M. W. J. Layard. Robust large-sample tests for homogeneity of variances. Journal of American Statistical Association, 68:105-198, 1974.

[6] H. Levene. Robust tests for equality of variance. Contributions to Probability and Statistics, pages 278-292. Stanford University Press, 1960.

[7] R. G. Jr Miller. Jackknifing variances. Annals of Mathematical Statistics, 39:567-582, 1968.

[8] Rupert G. Miller. Beyond ANOVA, basics of applied statistics.

[9] L. E. Moses. Rank tests of dispersion. Annals of Mathematical Statistics, 34:973-983, 1963.

[10] G. R. Shorack. Nonparametric tests and estimation of scale in two sample problem. Technical Report, 10.

[Figure 2: Side-by-side boxplots of the two logged variables with outliers in the seeded sample. Panels: two outliers of value 12, 14, 25, 28, 30, and 100 added in the seeded sample.]

[Figure 3: Side-by-side boxplots of the measurements in the first and fifth trials, Michelson's example. Panel: no outlier added.]

[Figure 4: Side-by-side boxplots of the two variables with outliers in the fifth sample. Panels: outlier 950, 980, 1000, and 1100 added.]

[Figure 5: Histograms of R for different combinations of sample size n and c. Panels: n = 25 with c = 1.7, c = 2.07, and c = 2.3765.]
Title | Robust tests on the equality of variances |
Creator |
Lai, Man-Po |
Date Issued | 1997 |
Description | The classic F test for the hypothesis concerning the equality of two population variances is known to be non-robust. When we apply the classical F test to the non-normal samples, the actual size of the test can be different from its nominal level. Therefore, several robust alternatives have been introduced in the literature. In this thesis, I will present some of these alternatives, and illustrate their application with some examples. A new approach will also be introduced. The best feature of this method is that it seems to be able to overcome the adverse effect of outliers. A Monte Carlo study is used to compare the new test with the F test and the other methods. The results of this study are encouraging for the new test. |
Extent | 1953671 bytes |
Genre |
Thesis/Dissertation |
Type |
Text |
File Format | application/pdf |
Language | eng |
Date Available | 2009-03-25 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
DOI | 10.14288/1.0088196 |
URI | http://hdl.handle.net/2429/6535 |
Degree |
Master of Science - MSc |
Program |
Statistics |
Affiliation |
Science, Faculty of Statistics, Department of |
Degree Grantor | University of British Columbia |
Graduation Date | 1997-11 |
Campus |
UBCV |
Scholarly Level | Graduate |
Aggregated Source Repository | DSpace |