UBC Theses and Dissertations

A comparison of nonparametric tests of independence Nemec, Amanda Frances 1978

A COMPARISON OF NONPARAMETRIC TESTS OF INDEPENDENCE

by

AMANDA FRANCES NEMEC
B.Sc., University of Victoria, 1975

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in THE FACULTY OF GRADUATE STUDIES, Department of Mathematics

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
August, 1978
© Amanda Frances Nemec, 1978

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the Head of my Department or by his representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of Mathematics
The University of British Columbia
2075 Wesbrook Place
Vancouver, Canada V6T 1W5

Abstract

The nonparametric tests of bivariate independence, based on Spearman's rho, Kendall's tau, the Blum-Kiefer-Rosenblatt statistic, the Fisher-Yates normal scores coefficient and the quadrant sum, are compared with the parametric test based on the ordinary correlation coefficient. The tests are compared in the bivariate normal case by recording the Pitman and Bahadur efficiencies of each test. The empirical powers resulting from a Monte Carlo study are also given for the tests. The components of the Blum-Kiefer-Rosenblatt statistic are derived and are related to linear rank statistics. The rank statistic associated with the first component is suggested as a new nonparametric test of independence.
This test is included in the comparison and is shown to perform reasonably well. As one sided tests of independence in bivariate populations the Fisher-Yates coefficient and the sample correlation coefficient are to be preferred over the other tests. As two sided tests the Blum-Kiefer-Rosenblatt statistic is the best test for alternatives near the null hypothesis. When the alternatives are distant from the null hypothesis the Fisher-Yates coefficient or the sample correlation coefficient should be used. The quadrant sum always performs poorly while the other nonparametric tests, including that based on the first component of the Blum-Kiefer-Rosenblatt statistic, are acceptable tests.

TABLE OF CONTENTS

Abstract
List of Tables
Acknowledgement
CHAPTER 1 INTRODUCTION
CHAPTER 2 CRITERIA FOR THE COMPARISON OF TESTS
  2.0 Introduction
  2.1 Pitman Asymptotic Relative Efficiency
  2.2 Bahadur Relative Efficiency
  2.3 A Monte Carlo Study
CHAPTER 3 TESTS TO BE COMPARED
  3.0 Introduction
  3.1 Spearman's Rho
  3.2 The Fisher-Yates Normal Scores Statistic
  3.3 The Quadrant Sum
  3.4 Kendall's Tau
  3.5 The Blum-Kiefer-Rosenblatt Statistic And Its Components
    1. The components
    2. The related linear rank statistics
CHAPTER 4 ALTERNATIVES TO INDEPENDENCE
CHAPTER 5 THE COMPARISON OF NONPARAMETRIC TESTS OF INDEPENDENCE
  5.0 Introduction
  5.1 Pitman Efficiencies
    1. Spearman's rho
    2. Fisher-Yates normal scores statistic
    3. Quadrant sum
    4. The components and related linear rank statistics of the Blum-Kiefer-Rosenblatt statistic
    5. Kendall's tau
  5.2 Bahadur Efficiencies
    1. Fisher-Yates normal scores statistic
    2. Quadrant sum
    3. Spearman's rho
    4. Kendall's tau
    5. The components and related linear rank statistics of the Blum-Kiefer-Rosenblatt statistic
  5.3 A Monte Carlo Comparison
CHAPTER 6 SUMMARY AND CONCLUSIONS
BIBLIOGRAPHY

LIST OF TABLES

Table I Critical values derived from a Monte Carlo experiment
Table II Power derived from a Monte Carlo experiment

ACKNOWLEDGEMENT

I wish to thank Dr. J. Koziol and Dr. J. Petkau for their help and advice while preparing this thesis. I would also like to thank my husband, Jim, and my family, for their great encouragement and support, Mr. M.E. Linnell for proof-reading the text, and Mrs. F.I. Linnell for typing the thesis.

The research and work for this thesis was supported by a National Research Council of Canada Postgraduate Scholarship.

CHAPTER 1 INTRODUCTION

Given a bivariate population $(X,Y)$ it is often necessary to test for the independence of $X$ and $Y$ using a sample of $n$ observations, $(X_i,Y_i)$, $i=1,2,\ldots,n$. The usual test of independence is to compute the sample correlation coefficient, $r_n$, directly from the observations and perform a t-test. The nonparametric tests based on Spearman's rho, Kendall's tau, the Fisher-Yates normal scores coefficient, the quadrant sum or the Blum-Kiefer-Rosenblatt statistic depend only on the ranks $(R_i,S_i)$, $i=1,2,\ldots,n$, corresponding to the observations. This thesis compares the performance of these nonparametric tests with the parametric test based on $r_n$.

The components of the Blum-Kiefer-Rosenblatt statistic are derived. These components may be related to linear rank statistics. The linear rank statistic associated with the first component is suggested as a test of independence and is compared with the other tests.

The tests are compared assuming $(X,Y)$ has a bivariate normal distribution. The Pitman and Bahadur efficiencies of each nonparametric test relative to the test based on $r_n$ are given. The empirical powers of the tests, derived from a Monte Carlo experiment, are also given. These three pieces of information provide a useful comparison of the nonparametric tests of independence.

In the one sided testing situation the tests based on the Fisher-Yates coefficient and on $r_n$ emerge as the best tests of independence. Both tests perform equally well and are to be preferred over all the other tests. The quadrant test, although simple to execute, lies at the other extreme, being by far the worst test. Slightly inferior to the two best tests are Spearman's and Kendall's tests, which behave similarly to each other, and the test based on the Blum-Kiefer-Rosenblatt first component. Kendall's and Spearman's tests have the advantage over the other nonparametric tests that, being well studied, tables of critical values are readily available.

In testing against a two sided alternative, the overall Blum-Kiefer-Rosenblatt test is the best test in regions near the null.
Moving away from the null a cross over takes place and an ordering of the tests similar to the one sided situation is assumed, with the Blum-Kiefer-Rosenblatt test slightly poorer than the $r_n$ and Fisher-Yates tests. The Blum-Kiefer-Rosenblatt test should be used when it is known that the alternative is close to the null hypothesis; otherwise the Fisher-Yates test or the correlation coefficient test should be selected.

CHAPTER 2 CRITERIA FOR THE COMPARISON OF TESTS

2.0 Introduction

In comparing two tests of the null hypothesis against the alternative, the obvious approach is an examination of the appropriate power functions. The power function is, however, a function of the sample size, $n$, the size of the test, $\alpha$, and the parameter, $\theta$, which "defines" the two hypotheses. Any statement of one test being "better" than the other must be accompanied by a statement of the conditions for which it holds true. For example, for fixed sample size and fixed $\alpha$, it may be said that one test is better than the other if it has greater power for all $\theta$ in the alternative. Or, if the sample size can vary, it may be said that one size $\alpha$ test is better than the other if it requires a smaller sample size to achieve the same power against the same alternative. The ratio of sample sizes is a measure of the relative efficiency of the two tests against that alternative. A complete comparison of two tests would be complicated; it is desirable to eliminate some of the variables (sample size, size, etc.) in order to make a more direct comparison possible.
To do this the "relative efficiency" of one test with respect to the other, in the "limiting" case only, is often used. Pitman and Bahadur efficiencies are two such "asymptotic relative efficiencies".

2.1 Pitman Asymptotic Relative Efficiency (or asymptotic relative efficiency or Pitman efficiency, for short)

Given two tests, both of the same size, $\alpha$, the Pitman efficiency is the limiting relative efficiency of the second test with respect to the first, as the sample size increases without bound and at the same time the alternative approaches the null hypothesis. The relative efficiency of the second test with respect to the first is the ratio of the sample sizes, $n_1/n_2$, required by the respective tests, to achieve the same power against the same alternative.

Although various authors have generalized the notion, Pitman efficiency will be defined here in the usual way (Noether 1955, Kendall and Stuart 1973). Consider a test based on the statistic $T_n$ which is consistent for testing $H_0: \theta=\theta_0$ against $H_1: \theta>\theta_0$, where $T_n$ is computed from $n$ observations. Suppose there exist functions of $\theta$, $\psi_n$ and $\sigma_n$ (usually $\psi_n(\theta)=E\,T_n$ and $\sigma_n^2(\theta)=\mathrm{Var}\,T_n$), such that:

A(i) $\psi_n'(\theta_0)=\psi_n''(\theta_0)=\cdots=\psi_n^{(m-1)}(\theta_0)=0$, $\psi_n^{(m)}(\theta_0)>0$

A(ii) $\lim_{n\to\infty} n^{-m\delta}\,\psi_n^{(m)}(\theta_0)/\sigma_n(\theta_0)=v>0$ for some $\delta>0$

(assuming the existence of these derivatives).

It is now possible to define the simple alternative

$$H_n: \theta=\theta_n=\theta_0+\frac{k}{n^{\delta}}$$

where $k$ is some positive constant. The following further assumptions are also made:

A(iii) $\lim_{n\to\infty} \psi_n^{(m)}(\theta_n)/\psi_n^{(m)}(\theta_0)=1$, $\lim_{n\to\infty} \sigma_n(\theta_n)/\sigma_n(\theta_0)=1$

A(iv) $\dfrac{T_n-\psi_n(\theta_n)}{\sigma_n(\theta_n)}$ is asymptotically normal with mean 0 and variance 1, under the null hypothesis $H_0: \theta_n=\theta_0$ and under the alternative hypothesis (although the asymptotic normality requirement is not necessary for the general definition of Pitman efficiency, this is the only case considered as all the statistics to be compared will satisfy the normality assumption).

For large samples, under the above assumptions, the size $\alpha$ test of $H_0$ versus $H_1$ may be written in the form:

Reject $H_0$ if $T_n > \psi_n(\theta_0)+\lambda_\alpha\,\sigma_n(\theta_0)$

where $\Phi(-\lambda_\alpha)=\alpha$ and $\Phi$ is the standard normal cumulative distribution function. The approximate power of this test against the alternative $H_n$ is

$$P_n(\theta_n)=\Phi\left\{\frac{-\psi_n(\theta_0)-\lambda_\alpha\,\sigma_n(\theta_0)+\psi_n(\theta_n)}{\sigma_n(\theta_n)}\right\}.$$

Expanding $\psi_n(\theta_n)-\psi_n(\theta_0)$ in a Taylor series about $\theta_0$, and under the regularity conditions of assumption A(iii), the argument of $\Phi$ above may be approximated by

$$\frac{k^m v}{m!}-\lambda_\alpha .$$

The approximate power of the test is, therefore,

$$P_n(\theta_n)\approx\Phi\left(\frac{k^m v}{m!}-\lambda_\alpha\right).$$

Assume the tests based on $T_{1n}$ and $T_{2n}$ are two consistent tests of $H_0$ versus $H_1$, and that $T_{in}$ satisfies A(i)-A(iv) with $\psi_n=\psi_{in}$, $\sigma_n=\sigma_{in}$, $m=m_i$, $\delta=\delta_i$, and $v=v_i$ for $i=1,2$. The tests will then have asymptotic powers

$$\Phi\left(\frac{k_i^{m_i} v_i}{m_i!}-\lambda_\alpha\right),\quad i=1,2,$$

against the respective alternatives

$$H_{in}: \theta=\theta_{in}=\theta_0+\frac{k_i}{n_i^{\delta_i}},\quad i=1,2.$$

As stated before, Pitman efficiency requires that the tests have the same power against the same alternative. This is achieved if

$$\frac{k_1}{n_1^{\delta_1}}=\frac{k_2}{n_2^{\delta_2}}\quad\text{and}\quad\frac{k_1^{m_1} v_1}{m_1!}=\frac{k_2^{m_2} v_2}{m_2!}.$$

Equivalently, the condition

$$\frac{n_1^{\delta_1}}{n_2^{\delta_2}}=\frac{k_1}{k_2}=\left(\frac{v_2\, m_1!}{v_1\, m_2!}\,k_2^{\,m_2-m_1}\right)^{1/m_1}$$

guarantees the same power against the same alternative. The Pitman efficiency of $T_{2n}$ relative to $T_{1n}$, $A_{21}$, is defined to be

$$A_{21}=\begin{cases}0 & \text{if } \delta_1>\delta_2\\ \infty & \text{if } \delta_1<\delta_2\\ \lim_{n_1,n_2\to\infty} n_1/n_2 & \text{if } \delta_1=\delta_2\end{cases}$$

In the case $\delta_1=\delta_2=\delta$ the expression for the Pitman efficiency is

$$\lim_{n_1,n_2\to\infty}\frac{n_1}{n_2}=\left(\frac{v_2\, m_1!}{v_1\, m_2!}\,k_2^{\,m_2-m_1}\right)^{1/(m_1\delta)}.$$

Frequently, $m_1=m_2=m$ as well as $\delta_1=\delta_2=\delta$, and $A_{21}$ simplifies to

$$A_{21}=\left(\frac{v_2}{v_1}\right)^{1/(m\delta)}=\lim_{n\to\infty}\left\{\frac{\psi_{2n}^{(m)}(\theta_0)/\sigma_{2n}(\theta_0)}{\psi_{1n}^{(m)}(\theta_0)/\sigma_{1n}(\theta_0)}\right\}^{1/(m\delta)}\qquad (2.1.1)$$

The last expression is often given as the definition of asymptotic relative efficiency. It should be noted that the Pitman efficiency may also be interpreted as a limiting ratio of the derivatives of the power functions of the two tests (Blomqvist 1950, Noether 1955, Kendall and Stuart 1973). If the Pitman efficiency turns out to be independent of $\alpha$, as it does for asymptotically normal statistics, it is a single summary measure of how well one test does compared with the other, in the region of the null hypothesis.

2.2 Bahadur Relative Efficiency (or Bahadur efficiency, for short)

Given two tests of the null versus the alternative hypothesis, the Bahadur efficiency is the limiting relative efficiency of the two tests as the sample size increases without bound and at the same time the size of each test approaches 0.
In order to compute the Bahadur efficiency it is necessary to define the "exact slope" associated with a sequence of test statistics, $\{T_n\}_{n=1}^{\infty}$, as in Bahadur (1967). The notation used here is Bahadur's.

Consider a test statistic, $T_n$, based on $n$ observations, for testing the null hypothesis, $H_0: \theta=\theta_0$, against the one sided alternative, $H_1: \theta>\theta_0$. The test rejects $H_0$ when $T_n$ is greater than some constant. Let $L_n(T_n)=1-F_n(T_n)$, where $F_n(t)=P_{\theta_0}(T_n < t)$, be the probability under the null hypothesis of obtaining $T_n$ greater than or equal to the observed value. Now, $L_n\to 0$ with probability (w.p.) 1†, under the alternative, typically exponentially fast. If, then, there exists a function of $\theta$, $c(\theta)$, defined over the alternative, $H_1$, such that $c$ is positive and finite and $n^{-1}\log L_n\to-\tfrac12 c(\theta)$ as $n\to\infty$, w.p. 1, when $\theta>\theta_0$ is the true parameter value, then $c(\theta)$ is called the exact slope of the sequence $\{T_n\}$.

† In the above discussion the convergence is w.p. 1. This is relaxed to convergence in probability under the alternative hypothesis by Woodworth (1970).

Bahadur defines for two test sequences $\{T_{in}\}_{n=1}^{\infty}$, $i=1,2$, with corresponding exact slopes $c_i(\theta)$, $i=1,2$, the ratio $B_{21}=c_2(\theta)/c_1(\theta)$, called the Bahadur efficiency, as a measure of the asymptotic relative efficiency of $T_{2n}$ with respect to $T_{1n}$ when $\theta$ is the parameter. (Here the asymptotic efficiency is in the sense described at the beginning of this section.) Bahadur efficiency assesses the asymptotic efficiency for all $\theta$ while Pitman efficiency has meaning only in the neighbourhood of the null hypothesis.

Bahadur (1967) outlines the following procedure for determining the existence and evaluation of the exact slope. Suppose $T_n/n^{1/2}\to b(\theta)$ w.p. 1 where $\theta>\theta_0$ and $b$ is a positive and finite function defined on $H_1$. Further, suppose

$$n^{-1}\log\left(1-F_n(n^{1/2}t)\right)\to -f(t)\quad\text{as } n\to\infty$$

for every $t$ in an open interval including each value of $b$, where $f$ is continuous on the interval, finite and positive. Then $c(\theta)$ exists for each $\theta\in H_1$ and equals $2f(b(\theta))$.

Very often the exact distribution, $F_n(t)$, is not known. In such a case it may still be possible to get an approximation to the exact slope, if the asymptotic distribution is known. Bahadur (1960) defines a "standard sequence", $\{T_n\}$, for which it is possible to compute the "approximate slope".

A sequence of test statistics $\{T_n\}_{n=1}^{\infty}$ is called a standard sequence if:

B(i) There exists a continuous probability distribution, $F$, such that $\lim_{n\to\infty}P_{\theta_0}(T_n < t)=F(t)$ $\forall t$

B(ii) There exists a constant, $a$, $0 < a < \infty$, such that $\log(1-F(t))=-\dfrac{at^2}{2}(1+o(1))$ as $t\to\infty$

B(iii) There exists a function, $b$, on $\theta>\theta_0$, with $0 < b < \infty$, such that $\forall\theta>\theta_0$, $\lim_{n\to\infty}P_\theta\left(|T_n/n^{1/2}-b(\theta)|>t\right)=0$ $\forall t>0$

For a standard sequence, Bahadur (1960) shows that $\log L_n^{(a)}/n\to-\tfrac12 c^{(a)}(\theta)$ in probability for $\theta>\theta_0$. The superscript "(a)" refers to approximate values.
$L_n$ is approximated by $L_n^{(a)}=1-F(T_n)$ and $c^{(a)}(\theta)=a(b(\theta))^2$ is the approximate slope (or inexact slope). Sometimes condition B(iii) above is replaced by the stronger condition

B(iii)* $P_\theta\left(\lim_{n\to\infty}T_n/n^{1/2}=b(\theta)\right)=1$ for $\theta>\theta_0$

in which case $\log L_n^{(a)}/n\to-\tfrac12 c^{(a)}(\theta)$ w.p. 1 $\forall\theta>\theta_0$.

The definition of the approximate slope parallels that of the exact slope with the exact distribution replaced by the asymptotic distribution as an approximation. Similarly, given two standard sequences, $\{T_{in}\}_{n=1}^{\infty}$, $i=1,2$, with approximate slopes $c_i^{(a)}(\theta)$, $i=1,2$, the ratio $B_{21}^{(a)}=c_2^{(a)}(\theta)/c_1^{(a)}(\theta)$, the approximate Bahadur efficiency, can be interpreted as an asymptotic measure of the relative efficiency of the two sequences. Bahadur (1960 and 1967) states that the use of the approximate slope is often informative but gives examples where it is a very bad approximation to the exact slope. In general, the inexact slope is reasonable in the neighbourhood of the null hypothesis.

The approximate Bahadur efficiency generally depends on the parameter, $\theta$. The limit, as $\theta$ approaches the null value, $\theta_0$, is called the limiting Bahadur efficiency. In the case of one-sided tests, the limiting Bahadur efficiency is equal to the Pitman efficiency (Bahadur 1960, Wieand 1976).

Pitman and Bahadur efficiencies serve as two means of comparing the selected nonparametric tests of independence with the parametric test based on the correlation coefficient.
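
As an illustration of how the standard-sequence conditions B(i)-B(iii) lead to an approximate slope, the following sketch works through one case. It is not taken from the derivations of this chapter: it assumes the one sided test based on $T_n=n^{1/2}r_n$, with $(X,Y)$ bivariate normal with correlation $\rho$, so that $\theta=\rho$ and $\theta_0=0$.

```latex
% Assumed setting (illustrative only): T_n = n^{1/2} r_n, theta = rho, theta_0 = 0.
\begin{align*}
\text{B(i):}\quad   & \lim_{n\to\infty} P_{0}(T_n < t) = \Phi(t), \text{ so } F = \Phi, \\
\text{B(ii):}\quad  & \log(1-\Phi(t)) = -\tfrac{t^2}{2}\,(1+o(1)) \text{ as } t\to\infty, \text{ so } a = 1, \\
\text{B(iii):}\quad & T_n/n^{1/2} = r_n \to \rho \text{ in probability, so } b(\rho) = \rho \quad (\rho > 0), \\
\text{hence}\quad   & c^{(a)}(\rho) = a\,(b(\rho))^2 = \rho^2 .
\end{align*}
```

Under these assumptions the approximate slope of the correlation coefficient test is $\rho^2$, and the approximate Bahadur efficiency of any other standard sequence relative to it is that sequence's approximate slope divided by $\rho^2$.
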
The main reason for comparing asymptotic behaviour is that the distribution functions, and therefore power functions, of some of the nonparametric statistics are unknown or are difficult to work with in the finite sample case. Where possible the exact slopes will be given to allow for comparison in regions distant from the null. Otherwise approximate slopes will be given. Pitman efficiencies and approximate Bahadur efficiencies provide comparison criteria in regions near the null.

2.3 A Monte Carlo Study

Pitman and Bahadur efficiencies provide comparison criteria of tests when the sample size is large. The comparison of the tests would not be complete without consideration of the performance of the tests when the sample is small. The easiest way to examine the small sample behaviour is through a Monte Carlo study. A large number of samples of given size are generated under a fixed alternative. The tests are carried out for each sample and the relative frequencies of making the correct decision (reject the null hypothesis) are recorded. These empirical powers provide a way to assess the performance of the tests as a function of sample size and the alternative, when the experiment is repeated for chosen values of $n$ and for different alternatives. The size of the test remains fixed at the same constant value for all tests.

CHAPTER 3 TESTS TO BE COMPARED

3.0 Introduction

Nonparametric tests for the independence of two random variables, like other nonparametric tests, require that only very weak assumptions be made.
In all that follows it is assumed that a random sample of $n$ independent bivariate observations is available, denoted by $(X_1,Y_1),(X_2,Y_2),\ldots,(X_n,Y_n)$. Further, it is assumed that the sample is drawn from a population with distribution function $F(x,y)=\mathrm{Prob}(X\le x, Y\le y)$. It is desired to test the null hypothesis, $H_0$, that $X$ and $Y$ are independent, or $H_0: F(x,y)=F(x,\infty)F(\infty,y)$. The tests to be discussed are all rank tests, that is, the test statistic, $T_n$, depends on the observations only through their ranks, $(R_i,S_i)$, $i=1,2,\ldots,n$, where $R_i$ ($S_i$) is the rank of $X_i$ ($Y_i$) in the joint ranking of the $n$ $X$'s ($Y$'s). Finally, it is assumed that $F$ is continuous, thereby ruling out the possibility of tied ranks.

The tests to be compared are those based on Kendall's tau, Spearman's rho, the Fisher-Yates normal scores statistic, the quadrant sum, and the Blum-Kiefer-Rosenblatt statistic and its components. All of these test statistics, with the exception of Kendall's tau and the Blum-Kiefer-Rosenblatt statistic, are linear rank statistics. A linear rank statistic in the bivariate case is a statistic of the form

$$T_n=n^{-1/2}\sum_{i=1}^{n}a_n(R_i)\,b_n(S_i)\qquad (3.0.1)$$

where $a_n$ and $b_n$ are the scores in the terminology of Hajek and Sidak (1967). Each statistic will now be defined in a way facilitating efficiency calculations.

3.1 Spearman's Rho

Spearman (1904) introduced a distribution-free statistic which may be used to test $H_0$.
The statistic $r_s$ is the sample correlation coefficient applied to the ranks, $(R_i,S_i)$, $i=1,2,\ldots,n$, and is given by

$$r_s=12\,n^{-1}(n^2-1)^{-1}\sum_{i=1}^{n}\left(R_i-\tfrac12(n+1)\right)\left(S_i-\tfrac12(n+1)\right).$$

It is clear that $r_s$ is a multiple of

$$T_{sn}=n^{-1/2}\sum_{i=1}^{n}12\left(\frac{R_i}{n+1}-\tfrac12\right)\left(\frac{S_i}{n+1}-\tfrac12\right)$$

and that $T_{sn}$ is a linear rank statistic. The scores are generated by the function $\xi(u)=\sqrt{12}\,(u-\tfrac12)$, $0\le u\le 1$, in the following way: $a_n(i)=b_n(i)=\xi(i/(n+1))$, $i=1,2,\ldots,n$.

It may be argued (Kruskal 1958) that $r_s$ is a reasonable estimate of $\rho_s=6\,\mathrm{Prob}\{(X-X')(Y-Y'')>0\}-3$, where $(X,Y)$, $(X',Y')$, $(X'',Y'')$ are arbitrary, independent observations from the distribution function $F$. If $X$ and $Y$ are independent $\rho_s$ is 0, although the converse is not true in general. $r_s$ may also be written as

$$r_s=1-6\,n^{-1}(n^2-1)^{-1}\sum_{i=1}^{n}(R_i-S_i)^2.$$

3.2 The Fisher-Yates Normal Scores Statistic

The Fisher-Yates normal scores statistic is given by

$$f=\sum_{i=1}^{n}a_n(R_i)\,b_n(S_i)$$

where $a_n(i)=b_n(i)=E\,V_n^{(i)}$ and $V_n^{(i)}$ is the $i$th largest observation of a sample of size $n$, drawn from a standard normal population. Apart from some constants, $f$ is the correlation coefficient of the expected normal order statistics corresponding to the observations. In certain circumstances the statistic arises when a locally most powerful rank test is desired (Hajek and Sidak 1967). For the purposes of this thesis the statistic

$$T_{fn}=n^{-1/2}f=n^{-1/2}\sum_{i=1}^{n}a_n(R_i)\,b_n(S_i)$$

will be used to correspond exactly to the linear rank statistic of 3.0.1. The scores may also be defined in terms of the generating function $\xi(u)=\Phi^{-1}(u)$, $0\le u\le 1$ ($\Phi$ is the cumulative distribution function of a standard normal random variable), by noting that $a_n(i)=b_n(i)=E\,\xi(U_n^{(i)})$, where $U_n^{(i)}$ is the $i$th largest observation in a sample of size $n$ from the Uniform(0,1) distribution.

3.3 The Quadrant Sum

Blomqvist (1950) studies the statistic, $q$, which he attributes to Mosteller (1946). $q$, as defined by Blomqvist, is

$$q=\frac{n_1-n_2}{n_1+n_2}$$

where $n_1$ = the number of points lying in the first or third quadrants, and $n_2$ = the number of points lying in the second or fourth quadrants. The lines $x=m_x$ and $y=m_y$ ($m_x$ and $m_y$ are the sample medians of the $n$ $X$ values and the $n$ $Y$ values, respectively) divide the $(x,y)$ plane into these four quadrants. For simplicity the sample size, $n$, is assumed to be even to avoid the situation where a point falls on one of the lines $x=m_x$ and $y=m_y$.

Blomqvist (1950) demonstrates that $q$ is a consistent estimate of $\eta=2\,\mathrm{Prob}\{(X-\mu_x)(Y-\mu_y)>0\}-1$, where $\mu_x$ and $\mu_y$ are the population medians. $\eta$ may be thought of as a measure of the correlation between $X$ and $Y$ which is 0 when $X$ and $Y$ are independent (the converse is not necessarily true).

In order to relate $q$ to 3.0.1, $n_1-n_2$ may be written as

$$n_1-n_2=\sum_{i=1}^{n}\mathrm{sign}\left(R_i-\tfrac12(n+1)\right)\mathrm{sign}\left(S_i-\tfrac12(n+1)\right)=\sum_{i=1}^{n}\mathrm{sign}\left((n+1)^{-1}R_i-\tfrac12\right)\mathrm{sign}\left((n+1)^{-1}S_i-\tfrac12\right).$$

$\mathrm{sign}(v)$ is defined in the usual way,

$$\mathrm{sign}(v)=\begin{cases}1 & \text{if } v>0\\ 0 & \text{if } v=0\\ -1 & \text{if } v<0\end{cases}$$

Now, $q=n^{-1/2}T_{qn}$ where

$$T_{qn}=n^{-1/2}\sum_{i=1}^{n}\mathrm{sign}\left((n+1)^{-1}R_i-\tfrac12\right)\mathrm{sign}\left((n+1)^{-1}S_i-\tfrac12\right).$$

The statistic $T_{qn}$ is in the desired form and the score functions, $a_n$ and $b_n$, of 3.0.1 are given by $a_n(i)=b_n(i)=\xi(i/(n+1))$. The score generating function, $\xi$, is defined to be $\xi(u)=\mathrm{sign}(u-\tfrac12)$, $0\le u\le 1$.
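
The rank statistics of sections 3.1 and 3.3 are simple enough to compute directly. The following is a minimal sketch, not part of the original thesis, of Spearman's $r_s$ (in both of its algebraic forms) and the quadrant statistic $q$ computed from the ranks. The function names `ranks`, `sign`, `spearman_rho` and `quadrant_q` are illustrative choices, and the code assumes no ties and, for $q$, an even sample size, as in the text.

```python
def ranks(values):
    """Return the ranks 1..n of a sample, assuming no ties (F continuous)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def sign(v):
    """Usual sign function: 1 if v > 0, 0 if v == 0, -1 if v < 0."""
    return (v > 0) - (v < 0)


def spearman_rho(x, y):
    """Return r_s in the product form and the squared-difference form of 3.1."""
    R, S = ranks(x), ranks(y)
    n = len(R)
    centre = (n + 1) / 2
    scale = n * (n * n - 1)
    product_form = 12 * sum((r - centre) * (s - centre) for r, s in zip(R, S)) / scale
    difference_form = 1 - 6 * sum((r - s) ** 2 for r, s in zip(R, S)) / scale
    return product_form, difference_form


def quadrant_q(x, y):
    """Return q = (n1 - n2) / n using the sign scores of 3.3 (n assumed even)."""
    R, S = ranks(x), ranks(y)
    n = len(R)
    centre = (n + 1) / 2
    return sum(sign(r - centre) * sign(s - centre) for r, s in zip(R, S)) / n


# Illustrative data only (hypothetical values, not from the thesis).
x = [0.3, 1.2, 2.5, 4.1]
y = [1.0, 0.2, 3.3, 2.7]
print(spearman_rho(x, y))
print(quadrant_q(x, y))
```

Both forms of $r_s$ agree whenever the ranks are untied, which gives a quick check on an implementation. The Fisher-Yates statistic is omitted from the sketch because its scores require expected normal order statistics.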
3.4 Kendall's Tau

Although the statistic appeared in the literature before Kendall's extensive work on its properties (Kruskal 1958), it is commonly known as Kendall's tau. Kendall's tau, $t_n$, is defined as

$$t_n=n^{-1}(n-1)^{-1}\sum_{i\ne j}\mathrm{sign}(R_i-R_j)\,\mathrm{sign}(S_i-S_j).$$

$t_n$ is not a linear rank statistic but Hajek and Sidak (1967) compute the projection, $\hat t_n$, of $t_n$ into the family of linear rank statistics, under the null hypothesis. Namely,

$$\hat t_n=8\,n^{-2}(n-1)^{-1}\sum_{i=1}^{n}\left(R_i-\tfrac12(n+1)\right)\left(S_i-\tfrac12(n+1)\right).$$

Comparing this with Spearman's rho of 3.1, it is apparent that $\hat t_n=\tfrac23\,(n+1)\,n^{-1}\,r_s$.

$t_n$ is an estimate of $\tau=2\,\mathrm{Prob}\{(X-X')(Y-Y')>0\}-1$, where $(X,Y)$ and $(X',Y')$ are independent observations from $F$ (Kruskal 1958). Under $H_0$, $\tau=0$, although $\tau=0$ does not necessarily imply that $H_0$ holds. $\tau$ can be interpreted as a measure of "agreement" or correlation between $X$ and $Y$ and therefore a test based on $t_n$ is a reasonable test of independence.

3.5 The Blum-Kiefer-Rosenblatt Statistic And Its Components

Let $F_n(x,\infty)$, $F_n(\infty,y)$ be the marginal distribution functions of the sample joint distribution function $F_n(x,y)$ where

$$F_n(x,y)=n^{-1}\,(\text{the number of pairs } (X_i,Y_i) \text{ with } X_i\le x \text{ and } Y_i\le y).$$

Define the random process $Q_n(x,y)$ such that

$$Q_n(x,y)=\sqrt{n}\,\left(F_n(x,y)-F_n(x,\infty)F_n(\infty,y)\right).$$

Blum, Kiefer, and Rosenblatt (1961) propose the statistic

$$B_n=n^{-1}\int Q_n^2(x,y)\,dF_n(x,y)$$

as the basis of a test for the independence of $X$ and $Y$. $B_n$ may also be written as the sum

$$B_n=n^{-1}\sum_{i=1}^{n}\left(n^{-1}\,\#(j: X_j\le X_i, Y_j\le Y_i)-n^{-2}\,\#(j: X_j\le X_i)\,\#(j: Y_j\le Y_i)\right)^2.$$

It may be seen that any test based on $B_n$ is nonparametric by writing $B_n$ in terms of the ranks.
Bn=n 1 J (n 1 J s i g n + ( R i - R j ) s i g n + ( S i - S j ) - n " 2 ( r s i g n + ( R i - R j ) ;i ^sign^(S i-S.) ) + where sign (v)= 1 i f v>0 0 i f v<0 Hoeffding (1948) introduced a s t a t i s t i c which i s asymptotically equivalent to B n. Hoeffding's s t a t i s t i c estimates A(F)=/ D 2(x,y) dF(x,y) where D (x,y)=F (x,y) -F (X,°P)F (oo,y) . This parameter has the desirable property that D(x,y)=0 for a l l (x,y) i f and only i f HQ i s true. 16 Assuming the n u l l hypothesis holds, i t i s of in t e r e s t to fi n d the orthogonal representation of nB n i n the asymptotic co eo O case, namely B=YY^"kz"k ' w n e r e t n e asymptotic d i s t r i b u t i o n of nB i s the same as that of B. The components z., , j,k=l,2...°° n j JS. are independent, i d e n t i c a l l y d i s t r i b u t e d normal random variables with mean 0 and variance 1. The procedure used to fi n d the representation i s completely analogous to that of Durbin and Knott (1972). For f i n i t e n, the corresponding "components" z n j k ' J' k = 1' 2'-'°° MAY b e computed which are asymptotically equivalent to the components , z ^ k , j ,k=l, 2 ...«>.. . The z n j k ' s m a Y be related to li n e a r rank s t a t i s t i c s which may themselves be considered for tes t i n g for independence. The components are computed i n section 3.5(1) and are then related to l i n e a r rank s t a t i s t i c s i n section 3.5(2). 1. The components. Assume that the n u l l hypothesis holds, i . e . F(x,y)=F(x,°°)F(°°,y) . Let Q n (x,y) =/n (F n (x,y) - F N (x,°°) F n (°° ,y) ) and consider the transformation (x,y)=(F" 1(u),F~ 1(v)) where F x(x)=F(x , oo ) and F y (y)=F(oo,y) . Now Q n (x,y) =Qn ( F " 1 (U) . 
$F_Y^{-1}(v))$ is asymptotically a Gaussian process, $Q(u,v)$, with mean and covariance given by

$$EQ(u,v)=0=EQ_n(F_X^{-1}(u),F_Y^{-1}(v)),$$

$$\mathrm{Cov}(Q(u,v),Q(r,s))=\{\min(u,r)-ur\}\{\min(v,s)-vs\}=\lim_{n\to\infty}\mathrm{Cov}\bigl(Q_n(F_X^{-1}(u),F_Y^{-1}(v)),\,Q_n(F_X^{-1}(r),F_Y^{-1}(s))\bigr),$$

where

$$\mathrm{Cov}\bigl(Q_n(F_X^{-1}(u),F_Y^{-1}(v)),\,Q_n(F_X^{-1}(r),F_Y^{-1}(s))\bigr)=\{n^{-1}(n-1)\min(u,r)-ur\}\{n^{-1}(n-1)\min(v,s)-vs\}-n^{-2}uvrs.$$

Define

$$z_{njk}=\lambda_{jk}^{-1/2}\int_0^1\!\!\int_0^1 \ell_{jk}(u,v)\,Q_n(F_X^{-1}(u),F_Y^{-1}(v))\,du\,dv,\qquad j,k=1,2,\ldots \tag{3.5.1.1}$$

where $\ell_{jk}(u,v)$ is the eigenfunction corresponding to the eigenvalue $\lambda_{jk}$ of the equation

$$\int_0^1\!\!\int_0^1\{\min(u,r)-ur\}\{\min(v,s)-vs\}\,\ell(r,s)\,dr\,ds=\lambda\,\ell(u,v). \tag{3.5.1.2}$$

It has been found (Blum, Kiefer, and Rosenblatt 1961) that the eigenfunctions and eigenvalues of 3.5.1.2 are $2\sin(\pi ju)\sin(\pi kv)$ and $1/(\pi^4j^2k^2)$, $j,k=1,2,\ldots$. Asymptotically, $z_{njk}$ has the same distribution as

$$z_{jk}=\lambda_{jk}^{-1/2}\int_0^1\!\!\int_0^1 \ell_{jk}(u,v)\,Q(u,v)\,du\,dv. \tag{3.5.1.3}$$

Since $Q(u,v)$ is Gaussian, $z_{jk}$ has a normal distribution (Ash and Gardner 1975). It is easy to see, as $\lambda_{jk}$ and $\ell_{jk}(u,v)$ satisfy 3.5.1.2, that the $z_{jk}$'s are uncorrelated and have 0 means and variances equal to 1. The $z_{njk}$'s are, therefore, asymptotically independent, identically distributed Normal(0,1) random variables. The inverse of the system 3.5.1.3 is

$$Q(u,v)=\sum_{j=1}^{\infty}\sum_{k=1}^{\infty}\lambda_{jk}^{1/2}\,\ell_{jk}(u,v)\,z_{jk}.$$

Now, $nB_n=\int Q_n^2(x,y)\,dF_n(x,y)=\int Q_n^2(F_X^{-1}(u),$
$F_Y^{-1}(v))\,dF_n(F_X^{-1}(u),F_Y^{-1}(v))$ and asymptotically, under $H_0$, $nB_n$ is distributed as

$$B=\int_0^1\!\!\int_0^1 Q^2(u,v)\,du\,dv=\sum_{j=1}^{\infty}\sum_{k=1}^{\infty}\lambda_{jk}z_{jk}^2.$$

Referring to the definition 3.5.1.1, $z_{njk}$ may be written in terms of the original data as follows,

$$z_{njk}=2\pi^2jk\int_0^1\!\!\int_0^1\sin(\pi ju)\sin(\pi kv)\,Q_n(F_X^{-1}(u),F_Y^{-1}(v))\,du\,dv
=2\pi^2jk\int_0^1\sin(\pi kv)\int_0^1\sin(\pi ju)\,Q_n(F_X^{-1}(u),F_Y^{-1}(v))\,du\,dv.$$

Integrating the inner integral by parts, with $x=Q_n(F_X^{-1}(u),F_Y^{-1}(v))$, $dy=\sin(\pi ju)\,du$ gives

$$z_{njk}=2\pi k\int_0^1\sin(\pi kv)\int_0^1\cos(\pi ju)\,\frac{\partial}{\partial u}\bigl\{Q_n(F_X^{-1}(u),F_Y^{-1}(v))\bigr\}\,du\,dv$$

$$=2\pi k\sqrt{n}\int_0^1\sin(\pi kv)\Bigl\{n^{-1}\sum_{i:\,F_Y(Y_i)\le v}\cos(\pi jF_X(X_i))-n^{-1}F_n(\infty,F_Y^{-1}(v))\sum_{i=1}^{n}\cos(\pi jF_X(X_i))\Bigr\}\,dv.$$

Integrating a second time by parts, with $x=\bigl\{n^{-1}\sum_{i:\,F_Y(Y_i)\le v}\cos(\pi jF_X(X_i))-n^{-1}F_n(\infty,F_Y^{-1}(v))\sum_{i}\cos(\pi jF_X(X_i))\bigr\}$, $dy=\sin(\pi kv)\,dv$ gives,

$$z_{njk}=2\sqrt{n}\int_0^1\cos(\pi kv)\,d\Bigl\{n^{-1}\sum_{i:\,F_Y(Y_i)\le v}\cos(\pi jF_X(X_i))-n^{-1}F_n(\infty,F_Y^{-1}(v))\sum_{i}\cos(\pi jF_X(X_i))\Bigr\}$$

$$=2\sqrt{n}\Bigl\{n^{-1}\sum_{i}\cos(\pi jF_X(X_i))\cos(\pi kF_Y(Y_i))-n^{-2}\sum_{i}\cos(\pi jF_X(X_i))\sum_{i}\cos(\pi kF_Y(Y_i))\Bigr\}.$$

$z_{njk}$ may now be related to a linear rank statistic.

2. The related linear rank statistics. It may be necessary to estimate $F_X$ and $F_Y$ by the sample distribution functions $F_n(\cdot,\infty)$ and $F_n(\infty,\cdot)$. Replacing $F_X$ and $F_Y$ by their estimates in the expression for $z_{njk}$ gives

$$\hat z_{njk}=2\sqrt{n}\Bigl\{n^{-1}\sum_{i}\cos(\pi jF_n(X_i,\infty))\cos(\pi kF_n(\infty,Y_i))-n^{-2}\sum_{i}\cos(\pi jF_n(X_i,\infty))\sum_{i}\cos(\pi kF_n(\infty,Y_i))\Bigr\}.$$

Or in terms of the rank statistics, $(R_i,S_i)$,

$$\hat z_{njk}=2\sqrt{n}\Bigl\{n^{-1}\sum_{i}\cos(\pi jn^{-1}R_i)\cos(\pi kn^{-1}S_i)-n^{-2}\sum_{i}\cos(\pi jn^{-1}R_i)\sum_{i}\cos(\pi kn^{-1}S_i)\Bigr\}.$$

The last term of this expression is constant, (i.e.
it is equal to $n^{-2}\sum_{i=1}^{n}\cos(\pi jn^{-1}i)\sum_{i=1}^{n}\cos(\pi kn^{-1}i)$). The rank statistic $\hat z_{njk}$ suggests the use of the linear rank statistic,

$$T_{njk}=n^{-1/2}\sum_{i=1}^{n}2\cos\bigl(\pi j(n+1)^{-1}R_i\bigr)\cos\bigl(\pi k(n+1)^{-1}S_i\bigr)$$

which is of the form 3.0.1. The scores are generated by the functions $\zeta_\ell(u)=\sqrt2\cos(\pi\ell u)$, $0\le u\le 1$, $\ell=1,2,\ldots$, where $a_n(i)=\zeta_j((n+1)^{-1}i)$, $b_n(i)=\zeta_k((n+1)^{-1}i)$. It will be shown that it is only necessary to consider the linear rank statistics $T_{njk}$, and not the $\hat z_{njk}$'s themselves, since $T_{njk}$ is asymptotically equivalent to $z_{njk}$.

CHAPTER 4

ALTERNATIVES TO INDEPENDENCE

Specification of the alternative to independence is important in making a power comparison of tests of independence. Any conclusions regarding the behaviour of the tests are valid only for the particular alternative under consideration. Usually, only the bivariate normal case is considered. The same will be done in this thesis, but not without mention of some other alternatives which have been investigated.

Konijn (1956) is, perhaps, the first to point out the significance and difficulties arising in defining a class of alternatives to independence. He derives the relative Pitman efficiencies of several nonparametric tests of independence (with respect to the test based on the sample correlation coefficient), when X and Y are given by $X=\lambda_1Z_1+\lambda_2Z_2$, $Y=\lambda_3Z_1+\lambda_4Z_2$, where $Z_1$ and $Z_2$ are independent. The hypothesis of independence is $\lambda_1=\lambda_4=1$, $\lambda_2=\lambda_3=0$.
This model incorporates the bivariate normal case and is therefore more general. Bhuchongkul (1964) considers a similar model and a general class of nonparametric tests, computing the Pitman efficiencies of some specific tests. Bhuchongkul's model is $X=(1-\theta)Z_1+\theta Z_2$, $Y=(1-\theta)Z_3+\theta Z_2$, where $0\le\theta<1$; $Z_1$, $Z_2$ and $Z_3$ are independent.

Dependence may be specified in terms of the distribution functions. This approach is taken by Farlie (1961) who computes Pitman efficiencies of generalized Daniel's correlation coefficients (which include the ordinary sample correlation coefficient, Spearman's rho and Kendall's tau), when the joint distribution function of X and Y has the form

$$F(x,y)=F(x,\infty)F(\infty,y)\bigl(1+\lambda A(x)B(y)\bigr).$$

Ghokale (1968) also specifies a general class of bivariate distributions, namely,

$$F(x,y)=(1-\theta)F(x,\infty)F(\infty,y)+\theta K(x,y),\qquad 0<\theta<1,$$

and considers a subclass of Bhuchongkul's class of rank statistics. Ghokale obtains the Pitman efficiencies of one rank test with respect to another as well as with respect to the test based on the sample correlation coefficient. He shows that there exist alternatives for which these efficiencies are 0 and $\infty$. This illustrates how crucial knowledge of the alternative is in choosing the best test.

A general notion of dependence, defined by Lehmann (1966), is positive quadrant dependence. The distribution function F is positively quadrant dependent if $F(x,y)\ge F(x,\infty)F(\infty,y)$ for every x and y.
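As a small worked illustration of this definition, consider Farlie's family quoted above. The conditions used here ($\lambda\ge 0$ and $A(x)B(y)\ge 0$ for all x and y) are assumptions made only for the illustration; they are not conditions stated in the text.

$$F(x,y)-F(x,\infty)F(\infty,y)=\lambda\,F(x,\infty)F(\infty,y)\,A(x)B(y)\;\ge\;0\qquad\text{whenever }\lambda\ge 0\text{ and }A(x)B(y)\ge 0,$$

so under these assumptions every member of the family is positively quadrant dependent, and $\lambda=0$ recovers independence.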
Behnen (1971) investigates the behaviour of linear rank statistics when testing independence against general contiguous positive quadrant dependence. (Contiguity is as defined by Hajek and Sidak 1967).

CHAPTER 5

THE COMPARISON OF NONPARAMETRIC TESTS OF INDEPENDENCE

5.0 Introduction

The comparison of the tests of independence is restricted to the case of bivariate normality. To be precise, it is assumed that (X,Y) has a bivariate normal distribution with correlation coefficient $\rho$ and that X and Y have means $\mu_x$, $\mu_y$ and variances $\sigma_x^2$, $\sigma_y^2$ respectively. It is desired to test $H_0:\rho=0$ (independence) versus the one sided alternative $H_1:\rho>0$.

To calculate the Pitman and Bahadur efficiencies of the tests based on the statistics of Chapter 3, relative to the test based on the sample correlation coefficient, two main tools will be used. First, the Pitman efficiencies of tests based on linear rank statistics can be easily derived by applying Behnen (1971), Theorem 1. This result implies that the statistics 3.0.1 are asymptotically Normal(0,1) under the null hypothesis, $H_0$, and under contiguous alternatives, $H_n$, are asymptotically Normal($\mu_n$,1), where an expression for $\mu_n$ is given. The Pitman efficiency is then a straightforward evaluation of the right hand side of 2.1.1. Second, the Bahadur efficiency of $T_n$ (given by 3.0.1) may be computed by combining Theorem 3 with 4.6, both of Woodworth (1970), to yield the exact slope of the sequence $\{T_n\}$.
If the exact slope is difficult to compute, the approximate slope will be given. The approximate slope requires knowledge of the distribution of $T_n$ under $H_0$, given by Behnen (1971), Theorem 1, and of $\lim T_n/\sqrt{n}$ under the alternative, given by 4.6 of Woodworth (1970).

For the small sample comparison, a Monte Carlo experiment was carried out for samples of sizes n=10, 24 and 50. One thousand samples were generated from a bivariate normal distribution for each value of n and for several positive $\rho$ values. An estimate of the power of each test is provided by the relative frequency of rejecting the null $H_0:\rho=0$ in favour of $H_1:\rho>0$.

5.1 Pitman Efficiencies

The nonparametric tests are to be compared with the parametric test based on the sample correlation coefficient,

$$r_n=\frac{\sum_{i=1}^{n}(X_i-\bar X)(Y_i-\bar Y)}{\sqrt{\sum(X_i-\bar X)^2\sum(Y_i-\bar Y)^2}}$$

where $\bar X$ and $\bar Y$ are the sample means of the X's and Y's respectively. The test rejects $H_0$ in favour of $H_1$ when $r_n$ is greater than some positive constant. It is known that when (X,Y) has a bivariate normal distribution, with correlation coefficient $\rho$, the test is consistent against $H_1:\rho>0$ and $r_n$ is asymptotically normal with mean $\rho$ and variance $(1-\rho^2)^2/n$. (See Kendall and Stuart 1973, for example). In the notation of Chapter 2, where $\theta=\rho$, $\theta_0=0$,

$$\psi_{r_n}(\rho)=\rho,\qquad \sigma_{r_n}(\rho)=(1-\rho^2)/\sqrt{n},\qquad \psi'_{r_n}(0)=1,\qquad m_r=1,$$

$$\lim_{n\to\infty}n^{-1/2}\,\frac{\psi'_{r_n}(0)}{\sigma_{r_n}(0)}=\lim_{n\to\infty}n^{-1/2}\,\frac{1}{n^{-1/2}}=1,\qquad \delta_r=\tfrac12,\qquad \nu_r=1.$$

Defining the simple alternative to independence, $H_n:\rho=\rho_n=b_r\,n^{-1/2}$, for $b_r$ some positive constant, assumptions A(i)-A(iv) of 2.1 are satisfied.
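The asymptotic normality of $r_n$ quoted above (mean $\rho$, variance $(1-\rho^2)^2/n$) can be checked by a small simulation. The following is only a sketch under stated assumptions: the function names, the seed, and the way the bivariate normal pairs are built from two independent normals are illustrative choices, not the routine used in the thesis.

```python
import math
import random


def sample_correlation(xs, ys):
    """Ordinary sample correlation coefficient r_n."""
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    syy = sum((y - ybar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def bivariate_normal_sample(n, rho, rng):
    """n pairs with N(0,1) marginals and correlation rho (hypothetical helper)."""
    xs, ys = [], []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        xs.append(z1)
        ys.append(rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
    return xs, ys


def simulate_r(n, rho, reps, seed=0):
    """reps simulated values of r_n for samples of size n."""
    rng = random.Random(seed)
    return [sample_correlation(*bivariate_normal_sample(n, rho, rng)) for _ in range(reps)]


if __name__ == "__main__":
    n, rho = 50, 0.5
    values = simulate_r(n, rho, reps=2000, seed=1)
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / (len(values) - 1)
    print("mean of r_n:", mean)
    print("n * var / (1 - rho^2)^2:", n * var / (1 - rho * rho) ** 2)
```

For moderate n the scaled variance printed on the last line should be close to 1, with a small upward finite-sample deviation.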
Now consider a consistent test based on the statistic

$$T_n=n^{-1/2}\sum_{i=1}^{n}a_n(R_i)\,b_n(S_i)$$

which rejects $H_0$, in favour of $H_1$, for large values of $T_n$. Suppose the score functions are defined by,

$$a_n(i)=E\,a(U_n^{(i)}),\qquad b_n(i)=E\,b(U_n^{(i)}),$$

where $U_n^{(i)}$ is the $i$th order statistic of a sample of n independent Uniform(0,1) random variables, or by,

$$a_n(i)=a(i/(n+1)),\qquad b_n(i)=b(i/(n+1)).$$

Further, suppose that the score generating functions, a and b, are real valued measurable functions defined on [0,1] which satisfy,

$$\int_0^1 b(z)\,dz=\int_0^1 a(z)\,dz=0$$

and

$$0<\int_0^1 a^2(z)\,dz=\sigma_a^2<\infty,\qquad 0<\int_0^1 b^2(z)\,dz=\sigma_b^2<\infty.$$

If these conditions hold, by Behnen (1971), Theorem 1 (a,c), $T_n$ is asymptotically normal with mean 0 and variance 1, under $H_0:\rho=0$, and is asymptotically normal with mean $\mu_n$ and variance 1, under the simple alternative, $H_n:\rho=\rho_n=b_T/n^{1/2}$, where $b_T$ is any positive constant. The mean $\mu_n$ is given by,

$$\mu_n=\frac{\sqrt{n}}{\sigma_a\sigma_b}\int\!\!\int a(F(x,\infty))\,b(F(\infty,y))\,dF_{\rho_n}(x,y)$$

where $F(x,\infty)=\Phi(x)$, $F(\infty,y)=\Phi(y)$ are the standard normal marginals of X and Y and $F_{\rho_n}(x,y)$ is the bivariate normal distribution function with $\rho=\rho_n$. As for $r_n$, these results can be written in the notation of Chapter 2, that is

$$\psi_{T_n}(\rho)=\frac{n^{1/2}}{\sigma_a\sigma_b}\int\!\!\int a(F(x,\infty))\,b(F(\infty,y))\,dF_\rho(x,y).$$

Without loss of generality it may be assumed that $\mu_x=\mu_y=0$ and $\sigma_x=\sigma_y=1$. Substituting for $F_\rho$ gives

$$\psi_{T_n}(\rho)=\frac{n^{1/2}}{\sigma_a\sigma_b}\int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty}a(\Phi(x))\,b(\Phi(y))\,\bigl(2\pi\sqrt{1-\rho^2}\bigr)^{-1}\exp\bigl\{-\tfrac12(1-\rho^2)^{-1}[x^2+y^2-2\rho xy]\bigr\}\,dx\,dy,$$

$$\sigma_{T_n}(\rho)=1,$$

$$\psi'_{T_n}(0)=\frac{n^{1/2}}{\sigma_a\sigma_b}\int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty}a(\Phi(x))\,b(\Phi(y))\,(2\pi)^{-1}\exp\{-\tfrac12(x^2+y^2)\}\,xy\,dx\,dy,\qquad m_T=1,$$
$$\lim_{n\to\infty}n^{-1/2}\,\frac{\psi'_{T_n}(0)}{\sigma_{T_n}(0)}=(\sigma_a\sigma_b)^{-1}\int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty}a(\Phi(x))\,b(\Phi(y))\,(2\pi)^{-1}\exp\{-\tfrac12(x^2+y^2)\}\,xy\,dx\,dy=\nu_T,\qquad \delta_T=\tfrac12.$$

Provided $\nu_T$ is positive, A(i)-A(iv) of 2.1 are satisfied. From 2.1.1 the Pitman efficiency of the test based on $T_n$, relative to that based on $r_n$, is given by,

$$\Lambda_{Tr}=\lim_{n\to\infty}\left[\frac{\psi'_{T_n}(0)/\sigma_{T_n}(0)}{\psi'_{r_n}(0)/\sigma_{r_n}(0)}\right]^2 \tag{5.1.1}$$

$$=\left\{\frac{\partial}{\partial\rho}\Bigl[(\sigma_a\sigma_b)^{-1}\int\!\!\int a(\Phi(x))\,b(\Phi(y))\,\bigl(2\pi(1-\rho^2)^{1/2}\bigr)^{-1}\exp\{-\tfrac12(1-\rho^2)^{-1}(x^2+y^2-2\rho xy)\}\,dx\,dy\Bigr]_{\rho=0}\right\}^2.$$

Or,

$$\Lambda_{Tr}=\Bigl\{(\sigma_a\sigma_b)^{-1}\int\!\!\int a(\Phi(x))\,b(\Phi(y))\,(2\pi)^{-1}\exp\{-\tfrac12(x^2+y^2)\}\,xy\,dx\,dy\Bigr\}^2
=\Bigl[\frac{1}{\sigma_a}\int_{-\infty}^{\infty}a(\Phi(x))\phi(x)x\,dx\Bigr]^2\Bigl[\frac{1}{\sigma_b}\int_{-\infty}^{\infty}b(\Phi(y))\phi(y)y\,dy\Bigr]^2 \tag{5.1.2}$$

where $\phi(x)=\Phi'(x)=(2\pi)^{-1/2}\exp(-\tfrac12x^2)$ (standard normal density). In the case where $a(\cdot)=b(\cdot)$, 5.1.2 reduces to

$$\Lambda_{Tr}=\Bigl[\frac{1}{\sigma_a}\int_{-\infty}^{\infty}a(\Phi(x))\phi(x)x\,dx\Bigr]^4. \tag{5.1.3}$$

It is now a simple matter to calculate the Pitman efficiencies of the tests based on the linear rank statistics of Chapter 3, relative to the test based on $r_n$. Although these efficiencies have already been derived in various ways by several authors (with the exception of those related to the components of the Blum-Kiefer-Rosenblatt statistic), they are once more computed here for completeness and convenience. The tests reject $H_0:\rho=0$ when the statistics are too large and are consistent for testing against $H_1:\rho>0$. From now on Pitman efficiency will be assumed to be taken relative to the test based on $r_n$.

1. Spearman's rho ($T_{sn}$). In this case $a(u)=b(u)=\sqrt{12}\,(u-\tfrac12)$ and $\sigma_a=\sigma_b=1$. From 5.1.3,

$$\Lambda_{sr}=\Bigl[\sqrt{12}\int_{-\infty}^{\infty}x\,(\Phi(x)-\tfrac12)\,\phi(x)\,dx\Bigr]^4.$$

Evaluating the integral by parts with $v=\Phi(x)-\tfrac12$ and $du=x\phi(x)\,dx$ gives the well known result,

$$\Lambda_{sr}=\Bigl[\sqrt{12}\int_{-\infty}^{\infty}(2\pi)^{-1}\exp(-x^2)\,dx\Bigr]^4=\bigl[(12)^{1/2}(2\pi)^{-1/2}(2)^{-1/2}\bigr]^4=9/\pi^2=.91.$$

2. Fisher-Yates normal scores statistic ($T_{fn}$). Here $a(u)=b(u)=\Phi^{-1}(u)$ and $\sigma_a=\sigma_b=1$. According to 5.1.3 the Pitman efficiency is

$$\Lambda_{fr}=\Bigl[\int_{-\infty}^{\infty}x^2\phi(x)\,dx\Bigr]^4=1.$$

3. Quadrant sum ($T_{qn}$). The statistic $T_{qn}$ of 3.3 corresponds to letting $a(u)=b(u)=\operatorname{sign}(u-\tfrac12)$. Obviously, $\sigma_a$ and $\sigma_b$ are both 1. From 5.1.3 the Pitman efficiency is,

$$\Lambda_{qr}=\Bigl[\int_{-\infty}^{\infty}x\,\operatorname{sign}(\Phi(x)-\tfrac12)\,\phi(x)\,dx\Bigr]^4
=\Bigl[\int_{-\infty}^{0}-x\phi(x)\,dx+\int_{0}^{\infty}x\phi(x)\,dx\Bigr]^4$$

$$=\Bigl[(2\pi)^{-1/2}\int_0^{\infty}x\exp(-\tfrac12x^2)\,dx+(2\pi)^{-1/2}\int_{-\infty}^{0}-x\exp(-\tfrac12x^2)\,dx\Bigr]^4=\bigl[2(2\pi)^{-1/2}\bigr]^4=4\pi^{-2}=.41.$$

4. The components and related linear rank statistics of the Blum-Kiefer-Rosenblatt statistic ($z_{njk}$ and $T_{njk}$). It has already been shown that $z_{njk}$ is asymptotically distributed as $z_{jk}$ when X and Y are independent. Since $T_{njk}$ is of the correct form, Behnen's theorem applies and $T_{njk}$ is also asymptotically Normal(0,1) under independence. $T_{njk}$ and $z_{njk}$ are, therefore, asymptotically equivalent when $H_0:\rho=0$ is true.

Consider the alternative $H_n^T:\rho=\rho_n=n^{-1/2}b_T$. Under $H_n^T$, letting $(x,y)=(F_X^{-1}(u),F_Y^{-1}(v))=(\Phi^{-1}(u),\Phi^{-1}(v))$, the expected value of $Q_n$ is

$$E\bigl(Q_n(\Phi^{-1}(u),\Phi^{-1}(v))\bigr)=\sqrt{n}\,\bigl[F_{\rho_n}(\Phi^{-1}(u),\Phi^{-1}(v))-n^{-1}F_{\rho_n}(\Phi^{-1}(u),\Phi^{-1}(v))-(1-n^{-1})uv\bigr],$$

where $F_{\rho_n}$ is the bivariate normal distribution function with $\rho=\rho_n$. If $F_{\rho_n}(\Phi^{-1}(u),\Phi^{-1}(v))$ is expanded about $\rho=0$, for large n

$$F_{\rho_n}(\Phi^{-1}(u),\Phi^{-1}(v))\approx\int_{-\infty}^{\Phi^{-1}(v)}\!\!\int_{-\infty}^{\Phi^{-1}(u)}(2\pi)^{-1}\exp(-\tfrac12(x^2+y^2))\,(1+\rho_nxy)\,dx\,dy.$$

Asymptotically, under the sequence of alternatives, $\{H_n^T\}$, $E\bigl(Q_n(\Phi^{-1}(u),\Phi^{-1}(v))\bigr)\to$
$$\sqrt{n}\,\Bigl[\int_{-\infty}^{\Phi^{-1}(v)}\!\!\int_{-\infty}^{\Phi^{-1}(u)}(2\pi)^{-1}\exp(-\tfrac12(x^2+y^2))\,(1+\rho_nxy)\,dx\,dy-uv\Bigr]
=\sqrt{n}\,\rho_n\int_{-\infty}^{\Phi^{-1}(v)}\!\!\int_{-\infty}^{\Phi^{-1}(u)}(2\pi)^{-1}\exp(-\tfrac12(x^2+y^2))\,xy\,dx\,dy.$$

Also,

$$\mathrm{Cov}\bigl(Q_n(\Phi^{-1}(u),\Phi^{-1}(v)),\,Q_n(\Phi^{-1}(r),\Phi^{-1}(s))\bigr)\to\{\min(u,r)-ur\}\{\min(v,s)-vs\}$$

as $n\to\infty$, as well as in the case of independence. Under the sequence $\{H_n^T\}$, $Q_n(\Phi^{-1}(u),\Phi^{-1}(v))$ is asymptotically a Gaussian process, $Q(u,v)$, with

$$E(Q(u,v))=\sqrt{n}\,\rho_n\int_{-\infty}^{\Phi^{-1}(v)}\!\!\int_{-\infty}^{\Phi^{-1}(u)}(2\pi)^{-1}\exp(-\tfrac12(x^2+y^2))\,xy\,dx\,dy$$

and

$$\mathrm{Cov}(Q(u,v),Q(r,s))=\{\min(u,r)-ur\}\{\min(v,s)-vs\}.$$

From this it may be concluded that under the sequence of alternatives, $\{H_n^T\}$, the component $z_{njk}$ is asymptotically normal with variance 1 and mean given by,

$$\sqrt{n}\,\rho_n\,\pi^2jk\int_0^1\!\!\int_0^1 2\sin(\pi ju)\sin(\pi kv)\int_{-\infty}^{\Phi^{-1}(v)}\!\!\int_{-\infty}^{\Phi^{-1}(u)}(2\pi)^{-1}\exp(-\tfrac12(x^2+y^2))\,xy\,dx\,dy\,du\,dv.$$

Integrating by parts, twice, this is equal to,

$$\sqrt{n}\,\rho_n\int_0^1\!\!\int_0^1 2\cos(\pi ju)\cos(\pi kv)\,\Phi^{-1}(u)\Phi^{-1}(v)\,du\,dv.$$

From Behnen's theorem, under the sequence $\{H_n^T\}$, $T_{njk}$ is asymptotically normal with variance 1 and mean given by,

$$\mu_n=\sqrt{n}\int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty}2\cos(\pi j\Phi(x))\cos(\pi k\Phi(y))\,dF_{\rho_n}(x,y).$$

Again, approximating $F_{\rho_n}(x,y)$ in the region of $\rho=0$,

$$\mu_n=\sqrt{n}\int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty}2\cos(\pi j\Phi(x))\cos(\pi k\Phi(y))\,(2\pi)^{-1}\exp(-\tfrac12(x^2+y^2))\,(1+\rho_nxy)\,dx\,dy.$$

Making the substitution, $u=\Phi(x)$ and $v=\Phi(y)$,

$$\mu_n=\sqrt{n}\int_0^1\!\!\int_0^1 2\cos(\pi ju)\cos(\pi kv)\,\bigl(1+\rho_n\Phi^{-1}(u)\Phi^{-1}(v)\bigr)\,du\,dv
=\sqrt{n}\,\rho_n\int_0^1\!\!\int_0^1 2\cos(\pi ju)\cos(\pi kv)\,\Phi^{-1}(u)\Phi^{-1}(v)\,du\,dv.$$

$T_{njk}$ and $z_{njk}$ are, therefore, asymptotically equivalent under both the null, $H_0$, and the sequence of alternatives $\{H_n^T\}$. Because of this equivalence the Pitman and approximate Bahadur efficiencies computed for $T_{njk}$ will be the same as those for $z_{njk}$.
For the statistic $T_{njk}$, $a(u)=\sqrt2\cos(\pi ju)$ and $b(u)=\sqrt2\cos(\pi ku)$, and $\sigma_a=\sigma_b=1$. The Pitman efficiency of the $T_{njk}$ test is, by 5.1.2,

$$\Lambda_{(jk)r}=\Bigl[\int_{-\infty}^{\infty}\sqrt2\cos(\pi j\Phi(x))\,\phi(x)\,x\,dx\Bigr]^2\Bigl[\int_{-\infty}^{\infty}\sqrt2\cos(\pi k\Phi(y))\,\phi(y)\,y\,dy\Bigr]^2.$$

Substituting $u=\Phi(x)$, $v=\Phi(y)$,

$$\Lambda_{(jk)r}=4\Bigl[\int_0^1\cos(\pi ju)\,\Phi^{-1}(u)\,du\Bigr]^2\Bigl[\int_0^1\cos(\pi kv)\,\Phi^{-1}(v)\,dv\Bigr]^2.$$

Consider the integral $I_\ell=\int_0^1\cos(\pi\ell u)\,\Phi^{-1}(u)\,du$. $I_\ell$ is equal to 0 when $\ell$ is an even integer. The evaluation of $I_\ell$, when $\ell$ is an odd integer, is not straightforward. The work of Beran (1975 a,b) suggests that the behaviour of the first component $z_{n11}$ determines the Pitman efficiency of the overall Blum-Kiefer-Rosenblatt statistic. For this reason the Pitman efficiency of the first component (or equivalently $T_{n11}$) is of interest. In order to compare this efficiency with the other efficiencies, $I_1$ (as well as $I_3$) has been numerically evaluated, using 8 point Gaussian quadrature. The results are reported here,

$$I_1=.67,\qquad I_3=.17.$$

The corresponding efficiencies are,

$$\Lambda_{(11)r}=.80,\qquad \Lambda_{(13)r}=4I_1^2I_3^2\approx .05,\qquad \Lambda_{(33)r}=.0033.$$

These are seen to decrease rapidly with increasing j or k.

5. Kendall's tau. Kendall's tau is asymptotically equivalent to Spearman's rho under both the null hypothesis (Hajek and Sidak 1967) and under the alternative where $\rho>0$ is close to 0 (Farlie 1961). This implies that the Pitman efficiency of Kendall's test is the same as that of Spearman's test. The Pitman efficiency is, therefore, $9/\pi^2=.91$.

5.2 Bahadur Efficiencies

As another means of comparing tests, the Bahadur efficiencies (exact or approximate) of the nonparametric tests relative to the test based on the sample correlation coefficient will be given.
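The Pitman efficiencies of section 5.1 can be checked numerically. The sketch below evaluates formula 5.1.2 after the substitution $u=\Phi(x)$, so that each factor becomes $\int_0^1 a(u)\Phi^{-1}(u)\,du$. It uses a plain midpoint rule rather than the 8 point Gaussian quadrature mentioned in the text; the function names and the number of grid points are illustrative assumptions, and the score functions are assumed to have unit variance as in the cases treated above.

```python
import math
from statistics import NormalDist

_INV_CDF = NormalDist().inv_cdf  # standard normal quantile function


def score_integral(a, m=20000):
    """Midpoint-rule approximation of the integral of a(u) * Phi^{-1}(u) over (0, 1)."""
    h = 1.0 / m
    return h * sum(a((i + 0.5) * h) * _INV_CDF((i + 0.5) * h) for i in range(m))


def pitman_efficiency(a, b=None):
    """Formula 5.1.2 for unit-variance score functions a and b (5.1.3 if b is omitted)."""
    ia = score_integral(a)
    ib = ia if b is None else score_integral(b)
    return (ia * ib) ** 2


def spearman(u):
    return math.sqrt(12.0) * (u - 0.5)


def quadrant(u):
    return 1.0 if u > 0.5 else -1.0


def cosine(ell):
    """Score function sqrt(2) cos(pi * ell * u) generating T_njk."""
    return lambda u: math.sqrt(2.0) * math.cos(math.pi * ell * u)


if __name__ == "__main__":
    print("Spearman  :", pitman_efficiency(spearman))
    print("Quadrant  :", pitman_efficiency(quadrant))
    print("T_n11     :", pitman_efficiency(cosine(1)))
    print("T_n13     :", pitman_efficiency(cosine(1), cosine(3)))
    print("I_1, I_3  :", score_integral(lambda u: math.cos(math.pi * u)),
          score_integral(lambda u: math.cos(3 * math.pi * u)))
```

With the orientation used here the integrals $I_1$ and $I_3$ come out negative; only their squares enter the efficiencies, so the magnitudes are what should be compared with the values reported above.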
To begin with, it is necessary to know the exact or approximate slope associated with $r_n$ when testing $H_0:\rho=0$ against $H_1:\rho>0$, in the bivariate normal case. The exact slope of $\{r_n\sqrt{n-2}/\sqrt{1-r_n^2}\}$ is $c_r(\rho)=-\log(1-\rho^2)$ (Woodworth 1970). The approximate slope is $c_r^{(a)}(\rho)=\rho^2/(1-\rho^2)$ (Abrahamson 1965).

Now consider the linear rank statistics of 3.0.1, $T_n=n^{-1/2}\sum a_n(R_i)b_n(S_i)$. The procedure for calculating the exact or approximate slope of the sequence $\{T_n\}_{n=1}^{\infty}$ requires evaluation of $\lim T_n/n^{1/2}$ when $\rho>0$. From 4.6 of Woodworth (1970)

$$T_n/n^{1/2}\to\int\!\!\int a(F)\,b(G)\,dF_\rho=b(\rho) \tag{5.2.1}$$

in probability as $n\to\infty$, where F and G are the Normal(0,1) marginals, $\Phi$, of the bivariate normal distribution $F_\rho$ and $a_n$ and $b_n$ converge in quadratic mean to the square integrable functions a and b, respectively. In addition to this limit, it is necessary to find the function $f(t)$ which satisfies,

$$n^{-1}\log\bigl(1-F_n(n^{1/2}t)\bigr)\to -f(t).$$

Employing theorem 3, Woodworth (1970), $f(t)$ may be found as follows. If there exists a constant $\lambda>0$ and a function $s(u)$, such that

$$\int_0^1\frac{\exp[\lambda(a(u)b(v)-s(v))]}{\int_0^1\exp[\lambda(a(u)b(v')-s(v'))]\,dv'}\,du=1,\qquad 0<v<1, \tag{5.2.2}$$

$$\int_0^1\frac{\int_0^1 a(u)b(v)\exp[\lambda(a(u)b(v)-s(v))]\,dv}{\int_0^1\exp[\lambda(a(u)b(v')-s(v'))]\,dv'}\,du=t, \tag{5.2.3}$$

then

$$f(t)=\lambda\Bigl(t-\int_0^1 s(v)\,dv\Bigr)-\int_0^1\log\Bigl\{\int_0^1\exp[\lambda(a(u)b(v)-s(v))]\,dv\Bigr\}\,du. \tag{5.2.4}$$

It is not always simple to find the solution $(\lambda,s)$ of 5.2.2 and 5.2.3. The approximate slope will have to suffice when it is difficult to solve for $f(t)$ by this method. If $f(t)$ can be found, the exact slope is $2f(b(\rho))$.
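Before these results are specialised, it may help to compare the two slopes of $r_n$ quoted at the start of this section. The expansions below are a worked illustration using only those two formulas.

$$c_r(\rho)=-\log(1-\rho^2)=\rho^2+\tfrac12\rho^4+\tfrac13\rho^6+\cdots,\qquad c_r^{(a)}(\rho)=\frac{\rho^2}{1-\rho^2}=\rho^2+\rho^4+\rho^6+\cdots.$$

Both slopes therefore behave like $\rho^2$ near $\rho=0$ and their ratio tends to 1, while the approximate slope is the larger of the two for every $0<\rho<1$. For example, at $\rho=.5$, $c_r=-\log(.75)=.288$ and $c_r^{(a)}=.25/.75=.333$.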
Under the null hypothesis $H_0:\rho=0$, $T_n$ is asymptotically Normal(0,1), as stated in section 5.1, and a=1 in B(ii) of 2.2. The approximate slope of $\{T_n\}_{n=1}^{\infty}$ is, therefore, $b^2(\rho)$ where $b(\rho)$ is given by 5.2.1.

The above results will now be used to compute the exact (where possible) or approximate slopes of the statistics given in Chapter 3.

1. Fisher-Yates normal scores statistic ($T_{fn}$). Woodworth (1970) solves 5.2.2 and 5.2.3 for the Fisher-Yates statistic and consequently finds $f(t)=-\tfrac12\log(1-t^2)$. The exact slope of $\{T_{fn}\}_{n=1}^{\infty}$ is, therefore, $c_f(\rho)=-\log(1-\rho^2)$ since $\lim_{n\to\infty}T_{fn}/n^{1/2}$ is

$$\int\!\!\int xy\,(2\pi)^{-1}(1-\rho^2)^{-1/2}\exp\bigl[-\tfrac12(1-\rho^2)^{-1}(x^2+y^2-2\rho xy)\bigr]\,dx\,dy=\rho.$$

2. Quadrant sum ($T_{qn}$). Applying 5.2.1 to the quadrant sum,

$$\lim_{n\to\infty}T_{qn}/n^{1/2}=\int\!\!\int\operatorname{sign}(\Phi(x)-\tfrac12)\operatorname{sign}(\Phi(y)-\tfrac12)\,(2\pi)^{-1}(1-\rho^2)^{-1/2}\exp\bigl(-\tfrac12(1-\rho^2)^{-1}(x^2+y^2-2\rho xy)\bigr)\,dx\,dy.$$

The right hand side of this equation has already been evaluated as $2\pi^{-1}\arcsin\rho$ (Woodworth 1970). The solution to 5.2.2 and 5.2.3 may be found by letting $s(u)=0$. 5.2.2 is then identically equal to 1 for all $\lambda>0$ and $0<v<1$. It remains to find $\lambda$ by solving 5.2.3. When $s(u)=0$, the left hand side of 5.2.3 becomes

$$\int_0^1\frac{\int_0^1\operatorname{sign}(u-\tfrac12)\operatorname{sign}(v-\tfrac12)\exp[\lambda\operatorname{sign}(u-\tfrac12)\operatorname{sign}(v-\tfrac12)]\,dv}{\int_0^1\exp[\lambda\operatorname{sign}(u-\tfrac12)\operatorname{sign}(v'-\tfrac12)]\,dv'}\,du=\frac{e^{\lambda}-e^{-\lambda}}{e^{\lambda}+e^{-\lambda}}=\tanh\lambda.$$

This gives $\lambda=\tanh^{-1}(t)=\tfrac12\log\bigl((1+t)/(1-t)\bigr)$. Finally,

$$f(t)=\tfrac12\,t\log\Bigl(\frac{1+t}{1-t}\Bigr)-\log\Bigl[\tfrac12\Bigl(\frac{1+t}{1-t}\Bigr)^{-1/2}+\tfrac12\Bigl(\frac{1+t}{1-t}\Bigr)^{1/2}\Bigr]$$

from 5.2.4.
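The quadrant sum calculation can be checked numerically. The sketch below evaluates $f(t)$ in two equivalent ways, as $\lambda t-\log\cosh\lambda$ with $\lambda=\tanh^{-1}(t)$ and in the form written above, and then forms the exact slope $2f(b(\rho))$ with $b(\rho)=2\pi^{-1}\arcsin\rho$ and its ratio to $c_r(\rho)=-\log(1-\rho^2)$. The function names are illustrative assumptions.

```python
import math


def f_quadrant(t):
    """f(t) = lambda * t - log(cosh(lambda)) with lambda = atanh(t), from 5.2.4 with s = 0."""
    lam = math.atanh(t)
    return lam * t - math.log(math.cosh(lam))


def f_quadrant_xi(t):
    """The same function written with xi = (1 + t) / (1 - t), as in the text."""
    xi = (1.0 + t) / (1.0 - t)
    return 0.5 * t * math.log(xi) - math.log(0.5 * xi ** -0.5 + 0.5 * xi ** 0.5)


def exact_slope_quadrant(rho):
    """Exact slope 2 f(b(rho)) with b(rho) = (2 / pi) arcsin(rho)."""
    return 2.0 * f_quadrant(2.0 / math.pi * math.asin(rho))


def exact_slope_r(rho):
    """Exact slope of the correlation coefficient test, -log(1 - rho^2)."""
    return -math.log(1.0 - rho * rho)


def bahadur_efficiency_quadrant(rho):
    """Exact Bahadur efficiency of the quadrant sum relative to r_n."""
    return exact_slope_quadrant(rho) / exact_slope_r(rho)


if __name__ == "__main__":
    for rho in (0.01, 0.25, 0.5, 0.9):
        print(rho, bahadur_efficiency_quadrant(rho))
```

As $\rho$ decreases to 0 the ratio approaches $4/\pi^2$, the Pitman efficiency of the quadrant sum found in section 5.1.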
Combining results, the exact slope for the quadrant sum is,

$$c_q(\rho)=2\pi^{-1}\arcsin\rho\,\log(\xi(\rho))-2\log\bigl[\tfrac12\xi^{-1/2}(\rho)+\tfrac12\xi^{1/2}(\rho)\bigr],\qquad\text{where }\xi(\rho)=\frac{1+2\pi^{-1}\arcsin\rho}{1-2\pi^{-1}\arcsin\rho}.$$

3. Spearman's rho ($T_{sn}$). Consider Spearman's test. With $u=\Phi(x)$, $v=\Phi(y)$,

$$\lim_{n\to\infty}T_{sn}/n^{1/2}=\int\!\!\int 12\,(u-\tfrac12)(v-\tfrac12)\,(2\pi)^{-1}(1-\rho^2)^{-1/2}\exp\bigl(-\tfrac12(1-\rho^2)^{-1}(x^2+y^2-2\rho xy)\bigr)\,dx\,dy=6\pi^{-1}\arctan\bigl(\rho(4-\rho^2)^{-1/2}\bigr)$$

(Woodworth 1970). Although 5.2.2 and 5.2.3 are intractable for Spearman's case, $12(u-\tfrac12)(v-\tfrac12)$ may be approximated by a function for which the equations can be solved, providing an approximation to $f(t)$ (Woodworth 1970). For t close to 1, $f(t)$ has a series expansion (Woodworth 1970) and may be approximated by the nonnegligible terms in this expansion. Woodworth (1970) presents a table of $f(t)$ for various values of t, making use of these approximations. Using this table the exact slope, $c_s(\rho)=2f\bigl(6\pi^{-1}\arctan(\rho(4-\rho^2)^{-1/2})\bigr)$, may be estimated for given values of $\rho$.

4. Kendall's tau ($t_n$). Since Kendall's tau is not linear, the methods of this section are not applicable. The exact slope, $c_t(\rho)$, can be found, however, as

$$c_t(\rho)=2f(b_t(\rho))$$

(Woodworth 1970), where $f(t)$ is the solution of

$$f(t)=\tfrac12\lambda t+\tfrac12\lambda+\log\bigl(\lambda/(e^{\lambda}-1)\bigr),\qquad t=1+4\Bigl[\int_0^{\lambda}x\,(e^x-1)^{-1}\,dx-\lambda\Bigr]\Big/\lambda^2,$$

and

$$b_t(\rho)=4\int\!\!\int F_\rho\,dF_\rho-1=2\pi^{-1}\arctan\bigl(\rho(1-\rho^2)^{-1/2}\bigr).$$

Woodworth (1970) has tabled $f(t)$ for $.020<t<.996$ and $c_t(\rho)$ may be computed for given values of $\rho$.

5. The components and related linear rank statistics of the Blum-Kiefer-Rosenblatt statistic ($z_{njk}$ and $T_{njk}$).
It has not been possible to find the solution to 5.2.2 and 5.2.3 in this case and, therefore, only the approximate slope will be given. From 5.2.1

$$\lim_{n\to\infty}T_{njk}/n^{1/2}=\int\!\!\int 2\cos(\pi j\Phi(x))\cos(\pi k\Phi(y))\,dF_\rho.$$

Expanding $F_\rho$ about $\rho=0$, to get an approximation to $F_\rho$ for small $\rho$,

$$\lim_{n\to\infty}T_{njk}/n^{1/2}=\int\!\!\int 2\cos(\pi j\Phi(x))\cos(\pi k\Phi(y))\,(2\pi)^{-1}\exp(-\tfrac12(x^2+y^2))\,(1+\rho xy)\,dx\,dy.$$

If the substitution $u=\Phi(x)$, $v=\Phi(y)$ is made in the expression above, $\lim_{n\to\infty}T_{njk}/n^{1/2}$ reduces to

$$2\rho\int_0^1\cos(\pi ju)\,\Phi^{-1}(u)\,du\int_0^1\cos(\pi kv)\,\Phi^{-1}(v)\,dv.$$

The approximate slope, $c^{(a)}_{(jk)}(\rho)$, is

$$c^{(a)}_{(jk)}(\rho)=4\rho^2\Bigl[\int_0^1\cos(\pi ju)\,\Phi^{-1}(u)\,du\Bigr]^2\Bigl[\int_0^1\cos(\pi kv)\,\Phi^{-1}(v)\,dv\Bigr]^2=4\rho^2I_j^2I_k^2$$

where $I_\ell$ was defined in section 5.1.

Abrahamson (1965) gives the approximate slope of the overall Blum-Kiefer-Rosenblatt statistic which is noted here.

$$c^{(a)}_B(\rho)=12^{-1}\pi^2\rho^2\quad\text{for }\rho\text{ near }0.$$

It should be pointed out that the test based on $B_n$ is two sided in contrast to the other tests previously discussed. A direct comparison of the approximate Bahadur efficiency of $B_n$ cannot be made with the other Bahadur efficiencies.

The Bahadur efficiencies of the nonparametric tests relative to the test based on $r_n$ may now be computed by dividing the slopes by $c_r(\rho)$. These are summarized below,

Fisher-Yates normal scores statistic: $B_{fr}(\rho)=1$

Quadrant sum: $B_{qr}(\rho)=-\log^{-1}(1-\rho^2)\,\bigl[2\pi^{-1}\arcsin\rho\,\log(\xi(\rho))-2\log[\tfrac12\xi^{-1/2}(\rho)+\tfrac12\xi^{1/2}(\rho)]\bigr]$

Spearman's rho: $B_{sr}(\rho)=-\log^{-1}(1-\rho^2)\,c_s(\rho)$

Kendall's tau: $B_{tr}(\rho)=-\log^{-1}(1-\rho^2)\,c_t(\rho)$

$T_{njk}$: $B^{(a)}_{(jk)r}(\rho)=4(1-\rho^2)\,I_j^2I_k^2$

Blum-Kiefer-Rosenblatt statistic (overall): $B^{(a)}_{Br}(\rho)=12^{-1}\pi^2(1-\rho^2)=.82\,(1-\rho^2)$

Here $\xi(\rho)=\dfrac{1+2\pi^{-1}\arcsin\rho}{1-2\pi^{-1}\arcsin\rho}$; $c_s$, $c_t$ are not given explicitly but are tabled by Woodworth (1970).

The Pitman and approximate Bahadur efficiencies of $T_{njk}$ rapidly decrease with j and k. For this reason it is suggested that $T_{n11}$ be used in testing for independence.

5.3 A Monte Carlo Comparison

The size of each test was fixed throughout the study, at $\alpha=.05$. The critical values were first estimated for each statistic by generating 1000 independent samples of size n. In each sample, the n observations $(X_i,Y_i)$, $i=1,2,\ldots,n$, were drawn from the null distribution, that is the X's and Y's were independent Normal(0,1). The statistics of Chapter 3: Spearman's rho, Kendall's tau, the Fisher-Yates normal scores coefficient, the quadrant sum, the Blum-Kiefer-Rosenblatt statistic and $T_{n11}$ (related to its first component), as well as the ordinary sample correlation coefficient, were all computed from the same n observations. The resulting 1000 values of each statistic were ordered. The critical value was then taken to be the smallest observed number such that the relative frequency of values greater than or equal to it was less than or equal to $\alpha$. Where tables are available the critical values obtained numerically can be compared with exact values.
For the Fisher-Yates normal scores statistic of section 3.2 the approximate scores, $\Phi^{-1}(i/(n+1))$, were used instead of the exact scores, $E(\Phi^{-1}(U_n^{(i)}))$. This eliminated the need for a table of normal scores. The approximation improves as $n\to\infty$ (Hajek and Sidak 1967).

The Blum-Kiefer-Rosenblatt test is inherently two sided. To make a fair comparison with the other tests it is necessary that they also be two sided. Since the alternative of interest is the one sided $H_1:\rho>0$, and only positive $\rho$'s were used in the power calculations, it will be assumed that the appropriate one sided tests with $\alpha=.025$ are good approximations to the corresponding two sided tests with $\alpha=.05$.

The Monte Carlo critical values as well as some exact values are given in Table I.

After the critical values were computed, a power study was carried out. Again, 1000 samples of each size, n=10, 24 and 50, were generated. This time the samples were generated from the bivariate normal distribution (with 0 means and unit variances) with $\rho=.1$, .25 and .5. The additional values $\rho=.75$ and .9 for n=10, $\rho=.4$ and .6 for n=24 and $\rho=.15$, .3 and .4 for n=50, were included to give a better picture of the power curve. All tests were performed for each sample and a running count of the number of times $H_0$ was rejected was kept for each test. The final count divided by 1000 estimates the power. The results of these computations are given in Table II. The powers for both one sided and two sided tests, with $\alpha=.05$, are recorded for all statistics except the Blum-Kiefer-Rosenblatt statistic.
Only the .05 level two sided test was performed for the latter.

Computations were done at the University of British Columbia Computing Centre on an IBM 360 computer. The normal observations were generated by a built-in routine which transforms independent normal observations, produced by Marsaglia's rectangle-wedge-tail method, into observations from the desired bivariate normal distribution. A different random starting value was used to generate the 1000 samples for each $n$ and $\rho$ combination.

Table I  Critical values derived from a Monte Carlo experiment

n=10

Test                              Critical value   Exact critical value   $\alpha$
Correlation coefficient           2.2660           2.3060                 .025
                                  1.8939           1.8595                 .05
Kendall's tau                     0.5111           0.5111                 .025
                                  0.4222           0.4667                 .05
Spearman's rho                    0.6242           0.6485                 .025
                                  0.5273           0.5636                 .05
Quadrant sum                      1.0000           1.0000                 .025
                                  1.0000           1.0000                 .05
Fisher-Yates coefficient          1.2069           -                      .025
                                  1.0661           -                      .05
$T_{n11}$                         1.7544           -                      .025
                                  1.5039           -                      .05
Blum-Kiefer-Rosenblatt statistic  0.00803          -                      .05

n=24

Test                              Critical value   Exact critical value   $\alpha$
Correlation coefficient           2.0350           2.0739                 .025
                                  1.6942           1.7171                 .05
Kendall's tau                     0.2826           0.2899                 .025
                                  0.2319           0.2464                 .05
Spearman's rho                    0.3948           -                      .025
                                  0.3313           -                      .05
Quadrant sum                      0.5000           0.5000                 .025
                                  0.5000           0.5000                 .05
Fisher-Yates coefficient          1.4778           -                      .025
                                  1.2831           -                      .05
$T_{n11}$                         1.9306           -                      .025
                                  1.5563           -                      .05
Blum-Kiefer-Rosenblatt statistic  0.00278          -                      .05

References for exact values: Blomquist (1950), Odeh et al (1977).

Because of the discrete nature of the statistic, the .05 and .025 critical values of the quadrant sum are the same.
This is also reflected in equal powers for the corresponding entries in Table II.

Table I (continued)

n=50

Test                              Critical value   Exact critical value   $\alpha$
Correlation coefficient           2.3116           2.0106                 .025
                                  1.7523           1.6772                 .05
Kendall's tau                     0.2033           -                      .025
                                  0.1657           -                      .05
Spearman's rho                    0.3007           -                      .025
                                  0.2400           -                      .05
Quadrant sum                      0.3600           0.3600                 .025
                                  0.2800           0.2800                 .05
Fisher-Yates coefficient          1.8486           -                      .025
                                  1.4817           -                      .05
$T_{n11}$                         1.9738           -                      .025
                                  1.7134           -                      .05
Blum-Kiefer-Rosenblatt statistic  0.00133          -                      .05

[Table II: the table body is not recoverable from the scanned text.]
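The counting procedure behind the power estimates of Section 5.3 can be sketched as follows. The sketch rests on several assumptions that are not taken from the thesis code: the function names and seed are hypothetical, Python's `random.gauss` stands in for Marsaglia's rectangle-wedge-tail generator, and the correlation test is assumed to use $t = r\sqrt{n-2}/\sqrt{1-r^2}$, since the exact critical values for the correlation coefficient in Table I agree with Student's t quantiles on $n-2$ degrees of freedom. The test rejects when the statistic is greater than or equal to the critical value.

```python
import math
import random


def bivariate_normal_sample(n, rho, rng):
    """n pairs (X, Y) with zero means, unit variances and correlation rho,
    built from two independent Normal(0,1) observations per pair."""
    pairs = []
    for _ in range(n):
        x = rng.gauss(0, 1)
        z = rng.gauss(0, 1)
        pairs.append((x, rho * x + math.sqrt(1 - rho * rho) * z))
    return pairs


def correlation_t(pairs):
    """Sample correlation r expressed as t = r * sqrt((n - 2) / (1 - r^2))."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    r = sxy / math.sqrt(sxx * syy)
    return r * math.sqrt((n - 2) / (1 - r * r))


def estimate_power(n, rho, critical_value, reps=1000, seed=1):
    """Fraction of `reps` samples in which the statistic is greater than
    or equal to the critical value, i.e. in which H0 is rejected."""
    rng = random.Random(seed)  # hypothetical seed, for reproducibility only
    rejections = 0
    for _ in range(reps):
        if correlation_t(bivariate_normal_sample(n, rho, rng)) >= critical_value:
            rejections += 1
    return rejections / reps
```

For example, `estimate_power(10, 0.5, 2.3060)` estimates the power of the one sided .025 level correlation test at $\rho = .5$ for $n = 10$, using the exact critical value from Table I. With $\rho = 0$ the same call estimates the size of the test.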
