UBC Theses and Dissertations

A comparison of nonparametric tests of independence Nemec, Amanda Frances 1978

A COMPARISON OF NONPARAMETRIC TESTS OF INDEPENDENCE

by

AMANDA FRANCES NEMEC
B.Sc., University of Victoria, 1975

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE

in

THE FACULTY OF GRADUATE STUDIES
Department of Mathematics

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
August, 1978

© Amanda Frances Nemec, 1978

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the Head of my Department or by his representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of Mathematics
The University of British Columbia
2075 Wesbrook Place
Vancouver, Canada V6T 1W5

Date: [illegible]

ABSTRACT

The nonparametric tests of bivariate independence, based on Spearman's rho, Kendall's tau, the Blum-Kiefer-Rosenblatt statistic, the Fisher-Yates normal scores coefficient and the quadrant sum are compared with the parametric test based on the ordinary correlation coefficient. The tests are compared in the bivariate normal case by recording the Pitman and Bahadur efficiencies of each test. The empirical powers resulting from a Monte Carlo study are also given for the tests. The
components of the Blum-Kiefer-Rosenblatt statistic are derived and are related to linear rank statistics. The rank statistic associated with the first component is suggested as a new nonparametric test of independence. This test is included in the comparison and is shown to perform reasonably well.

As one sided tests of independence in bivariate populations the Fisher-Yates coefficient and the sample correlation coefficient are to be preferred over the other tests. As two sided tests the Blum-Kiefer-Rosenblatt statistic is the best test for alternatives near the null hypothesis. When the alternatives are distant from the null hypothesis the Fisher-Yates coefficient or the sample correlation coefficient should be used. The quadrant sum always performs poorly while the other nonparametric tests, including that based on the first component of the Blum-Kiefer-Rosenblatt statistic, are acceptable tests.

TABLE OF CONTENTS

Abstract
List of Tables
Acknowledgement
CHAPTER 1  INTRODUCTION
CHAPTER 2  CRITERIA FOR THE COMPARISON OF TESTS
    2.0  Introduction
    2.1  Pitman Asymptotic Relative Efficiency
    2.2  Bahadur Relative Efficiency
    2.3  A Monte Carlo Study
CHAPTER 3  TESTS TO BE COMPARED
    3.0  Introduction
    3.1  Spearman's Rho
    3.2  The Fisher-Yates Normal Scores Statistic
    3.3  The Quadrant Sum
    3.4  Kendall's Tau
    3.5  The Blum-Kiefer-Rosenblatt Statistic And Its Components
        1.  The components
        2.  The related linear rank statistics
CHAPTER 4  ALTERNATIVES TO INDEPENDENCE
CHAPTER 5  THE COMPARISON OF NONPARAMETRIC TESTS OF INDEPENDENCE
    5.0  Introduction
    5.1  Pitman Efficiencies
        1.  Spearman's rho
        2.  Fisher-Yates normal scores statistic
        3.  Quadrant sum
        4.  The components and related linear rank statistics of the Blum-Kiefer-Rosenblatt statistic
        5.  Kendall's tau
    5.2  Bahadur Efficiencies
        1.  Fisher-Yates normal scores statistic
        2.  Quadrant sum
        3.  Spearman's rho
        4.  Kendall's tau
        5.  The components and related linear rank statistics of the Blum-Kiefer-Rosenblatt statistic
    5.3  A Monte Carlo Comparison
CHAPTER 6  SUMMARY AND CONCLUSIONS
BIBLIOGRAPHY

LIST OF TABLES

Table I   Critical values derived from a Monte Carlo experiment
Table II  Power derived from a Monte Carlo experiment

ACKNOWLEDGEMENT

I wish to thank Dr. J. Koziol and Dr. J. Petkau for their help and advice while preparing this thesis. I would also like to thank my husband, Jim, and my family, for their great encouragement and support, Mr. M.E. Linnell for proof-reading the text, and Mrs. F.I. Linnell for typing the thesis.

The research and work for this thesis was supported by a National Research Council of Canada Postgraduate Scholarship.

CHAPTER 1

INTRODUCTION

Given a bivariate population $(X,Y)$ it is often necessary to test for the independence of $X$ and $Y$ using a sample of $n$ observations, $(X_i,Y_i)$, $i=1,2,\ldots,n$. The usual test of independence is to compute the sample correlation coefficient, $r_n$, directly from the observations and perform a t-test. The nonparametric tests based on Spearman's rho, Kendall's tau, the Fisher-Yates normal scores coefficient, the quadrant sum or the Blum-Kiefer-Rosenblatt statistic depend only on the ranks, $(R_i,S_i)$, $i=1,2,\ldots,n$, corresponding to the observations. This thesis compares the performance of these nonparametric tests with the parametric test based on $r_n$.

The components of the Blum-Kiefer-Rosenblatt statistic are derived. These components may be related to linear rank statistics. The linear rank statistic associated with the first component is suggested as a test of independence and is compared with the other tests.

The tests are compared assuming $(X,Y)$ has a bivariate normal distribution. The Pitman and Bahadur efficiencies of each nonparametric test relative to the test based on $r_n$ are given. The empirical powers of the tests, derived from a Monte Carlo experiment, are also given. These three pieces of information provide a useful comparison of the nonparametric tests of independence.

In the one sided testing situation the tests based on the Fisher-Yates coefficient and on $r_n$ emerge as the best tests of independence. Both tests perform equally well and are to be preferred over all the other tests.
The quadrant test, although simple to execute, lies at the other extreme, being by far the worst test. Slightly inferior to the two best tests are Spearman's and Kendall's tests, which exhibit a similar behavior to each other, and the test based on the Blum-Kiefer-Rosenblatt first component. Kendall's and Spearman's tests have the advantage over the other nonparametric tests that, being well studied, tables of critical values are readily available.

In testing against a two sided alternative, the overall Blum-Kiefer-Rosenblatt test is the best test in regions near the null. Moving away from the null a cross over takes place and an ordering of the tests similar to the one sided situation is assumed, with the Blum-Kiefer-Rosenblatt test slightly poorer than the $r_n$ and Fisher-Yates tests. The Blum-Kiefer-Rosenblatt test should be used when it is known that the alternative is close to the null hypothesis, otherwise the Fisher-Yates test or the correlation coefficient test should be selected.

CHAPTER 2

CRITERIA FOR THE COMPARISON OF TESTS

2.0 Introduction

In comparing two tests of the null hypothesis against the alternative, the obvious approach is an examination of the appropriate power functions. The power function of a test is, however, a function of the sample size, $n$, the size of the test, $\alpha$, and the parameter, $\theta$, which "defines" the alternative. Any statement that one test is "better" than the other must be accompanied by a statement of the conditions for which it holds true. For example, for fixed sample size and fixed size, $\alpha$, one test may be said to be better than the other if it has greater power for all $\theta$ in the alternative. Or, if the sample sizes of the two tests can vary, it may be said that one test is better than the other if it requires a smaller sample size to achieve the same power against the same alternative. The ratio of the two sample sizes is the "relative efficiency" of one test with respect to the other. A complete comparison of this kind is often complicated; it is desirable to eliminate some of the variables (sample size, size, etc.) in order to make a more direct comparison possible. To do this, "limiting" or "asymptotic" relative efficiencies are used. The Pitman and Bahadur efficiencies are two such "limiting" efficiencies.

2.1 Pitman Asymptotic Relative Efficiency

Given two tests, both of the same size, $\alpha$, the Pitman asymptotic relative efficiency (or Pitman efficiency, for short) is the limiting relative efficiency of the second test with respect to the first, as the sample size increases without bound and at the same time the alternative approaches the null hypothesis. The relative efficiency of the second test with respect to the first is the ratio of the sample sizes, $n_1/n_2$, required by the respective tests to achieve the same power against the same alternative. Although various authors have generalized the notion, Pitman efficiency will be defined here in the usual way (Noether 1955, Kendall and Stuart 1973).

Consider a test based on the statistic $T_n$, which is consistent for testing $H_0:\theta=\theta_0$ against $H_1:\theta>\theta_0$, where $T_n$ is computed from $n$ observations. Suppose there exist functions of $\theta$, $\psi_n(\theta)$ and $\sigma_n(\theta)$ (usually $E\,T_n=\psi_n(\theta)$ and $\mathrm{Var}\,T_n=\sigma_n^2(\theta)$), such that:

A(i)   $\psi_n'(\theta_0)=\psi_n''(\theta_0)=\cdots=\psi_n^{(m-1)}(\theta_0)=0$ and $\psi_n^{(m)}(\theta_0)>0$,

A(ii)  $\lim_{n\to\infty} n^{-m\delta}\,\psi_n^{(m)}(\theta_0)/\sigma_n(\theta_0)=v>0$ for some $\delta>0$

(assuming the existence of these derivatives).

It is now possible to define the simple alternative

$$H_n:\ \theta=\theta_n=\theta_0+\frac{k}{n^{\delta}}$$

where $k$ is some positive constant. The following further assumptions are also made:

A(iii) $\lim_{n\to\infty}\psi_n^{(m)}(\theta_n)/\psi_n^{(m)}(\theta_0)=1$ and $\lim_{n\to\infty}\sigma_n(\theta_n)/\sigma_n(\theta_0)=1$,

A(iv)  $\{T_n-\psi_n(\theta)\}/\sigma_n(\theta)$ is asymptotically normal with mean 0 and variance 1, under the null hypothesis $H_0:\theta=\theta_0$ and under the alternative hypothesis $H_n:\theta=\theta_n$ (although the asymptotic normality requirement is not necessary for the general definition of Pitman efficiency, this is the only case considered, as all the statistics to be compared will satisfy the normality assumption).

For large samples, under the above assumptions, the size $\alpha$ test of $H_0$ versus $H_1$ may be written in the form:

Reject $H_0$ if $T_n>\psi_n(\theta_0)+\lambda_\alpha\sigma_n(\theta_0)$,

where $\Phi(\lambda_\alpha)=1-\alpha$ and $\Phi$ is the standard normal cumulative distribution function.
The approximate power of this test against the alternative $H_n$ is

$$P_n(\theta_n)=\Phi\left(\frac{\psi_n(\theta_n)-\psi_n(\theta_0)-\lambda_\alpha\sigma_n(\theta_0)}{\sigma_n(\theta_n)}\right).$$

Expanding $\psi_n(\theta_n)-\psi_n(\theta_0)$ in a Taylor series about $\theta_0$, and under the regularity conditions of assumption A(iii), the argument of $\Phi$ above may be approximated by

$$\frac{k^m v}{m!}-\lambda_\alpha .$$

The approximate power of the test is, therefore,

$$\Phi\left(\frac{k^m v}{m!}-\lambda_\alpha\right).$$

Assume the tests based on $T_{1n}$ and $T_{2n}$ are two consistent tests of $H_0$ versus $H_1$, and that $T_{in}$ satisfies A(i)-A(iv) with $\psi_n=\psi_{in}$, $\sigma_n=\sigma_{in}$, $m=m_i$, $\delta=\delta_i$, and $v=v_i$ for $i=1,2$. The tests will then have asymptotic powers

$$\Phi\left(\frac{k_i^{m_i}v_i}{m_i!}-\lambda_\alpha\right),\qquad i=1,2,$$

against the respective alternatives

$$H_{in}:\ \theta=\theta_{in}=\theta_0+\frac{k_i}{n_i^{\delta_i}},\qquad i=1,2.$$

As stated before, Pitman efficiency requires that the tests have the same power against the same alternative. This is achieved if

$$\frac{k_1^{m_1}v_1}{m_1!}=\frac{k_2^{m_2}v_2}{m_2!}\qquad\text{and}\qquad\frac{k_1}{n_1^{\delta_1}}=\frac{k_2}{n_2^{\delta_2}} .$$

Equivalently, the condition

$$\frac{n_1^{\delta_1}}{n_2^{\delta_2}}=\frac{1}{k_2}\left(\frac{m_1!\,v_2\,k_2^{m_2}}{m_2!\,v_1}\right)^{1/m_1}$$

guarantees the same power against the same alternative. The Pitman efficiency of $T_{2n}$ relative to $T_{1n}$, $A_{21}$, is defined to be

$$A_{21}=\lim\frac{n_1}{n_2}=\begin{cases}\infty & \text{if } \delta_1<\delta_2\\ \lim\, n_1/n_2 & \text{if } \delta_1=\delta_2\\ 0 & \text{if } \delta_1>\delta_2\end{cases}$$

where the limit is taken as $n_1,n_2\to\infty$ subject to the condition above. In the case $\delta_1=\delta_2=\delta$ the expression for the Pitman efficiency is

$$A_{21}=\left[\frac{1}{k_2}\left(\frac{m_1!\,v_2\,k_2^{m_2}}{m_2!\,v_1}\right)^{1/m_1}\right]^{1/\delta}.$$

Frequently, $m_1=m_2=m$ as well as $\delta_1=\delta_2=\delta$, and $A_{21}$ is simplified to be

$$A_{21}=\left(\frac{v_2}{v_1}\right)^{1/(m\delta)}=\lim_{n\to\infty}\left[\frac{\psi_{2n}^{(m)}(\theta_0)/\sigma_{2n}(\theta_0)}{\psi_{1n}^{(m)}(\theta_0)/\sigma_{1n}(\theta_0)}\right]^{1/(m\delta)}.\qquad (2.1.1)$$

The last expression is often given as the definition of asymptotic relative efficiency.
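The simplified expression 2.1.1 can be evaluated numerically. The following is a minimal sketch only: the function name `pitman_efficiency` is hypothetical, and the efficacy values used in the example ($v=1$ for the sample correlation coefficient, $3/\pi$ for Spearman's rho and $2/\pi$ for the quadrant sum, with $m=1$ and $\delta=1/2$) are the standard textbook values for the bivariate normal case with $\theta=\rho$, assumed here for illustration rather than taken from the derivation above.

```python
import math


def pitman_efficiency(v1, v2, m=1, delta=0.5):
    """Pitman efficiency A_21 of test 2 relative to test 1, as in 2.1.1.

    Assumes m_1 = m_2 = m and delta_1 = delta_2 = delta, so that
    A_21 = (v2 / v1) ** (1 / (m * delta)).
    """
    if v1 <= 0 or v2 <= 0:
        raise ValueError("v1 and v2 must be positive (assumption A(ii))")
    if m < 1 or delta <= 0:
        raise ValueError("m must be >= 1 and delta must be positive")
    return (v2 / v1) ** (1.0 / (m * delta))


# Assumed (textbook) limiting values v for the bivariate normal case.
v_correlation = 1.0
v_spearman = 3.0 / math.pi
v_quadrant = 2.0 / math.pi

are_spearman = pitman_efficiency(v_correlation, v_spearman)  # 9 / pi**2
are_quadrant = pitman_efficiency(v_correlation, v_quadrant)  # 4 / pi**2
print(round(are_spearman, 4), round(are_quadrant, 4))
```

With $m=1$ and $\delta=1/2$ the exponent $1/(m\delta)$ is 2, so the efficiency is simply the squared ratio of the two limiting values.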
It should be noted that the Pitman efficiency may also be interpreted as a limiting ratio of the derivatives of the power functions of the two tests (Blomqvist 1950, Noether 1955, Kendall and Stuart 1973). If the Pitman efficiency turns out to be independent of $\alpha$, as it does for asymptotically normal statistics, it is a single summary measure of how well one test does compared with the other, in the region of the null hypothesis.

2.2 Bahadur Relative Efficiency

Given two tests of the null versus the alternative hypothesis, the Bahadur relative efficiency (or Bahadur efficiency, for short) is the limiting relative efficiency of the two tests as the sample size increases without bound and at the same time the size of each test approaches 0. In order to compute the Bahadur efficiency it is necessary to define the "exact slope" associated with a sequence of test statistics, $\{T_n\}$, as in Bahadur (1967). The notation used here is Bahadur's.

Consider a test statistic, $T_n$, based on $n$ observations, for testing the null hypothesis, $H_0:\theta=\theta_0$, against the one sided alternative, $H_1:\theta>\theta_0$. The test rejects $H_0$ when $T_n$ is greater than some constant. Let $L_n=L_n(T_n)=1-F_n(T_n)$, where $F_n(t)=P_{\theta_0}(T_n<t)$, be the probability under the null hypothesis of obtaining $T_n$ greater than or equal to the observed value. Typically, under the alternative, $L_n\to 0$ exponentially fast, with probability (w.p.) 1. If, then, there exists a function of $\theta$, $c(\theta)$, defined over the alternative, $H_1$, such that $c$ is positive and finite and

$$n^{-1}\log L_n\to-\tfrac{1}{2}c(\theta)\quad\text{as } n\to\infty,\ \text{w.p. 1},$$

when $\theta>\theta_0$ is the true parameter value, then $c(\theta)$ is called the exact slope of the sequence $\{T_n\}$.

Bahadur defines, for two test sequences $\{T_{in}\}_{n=1}^{\infty}$, $i=1,2$, with corresponding exact slopes $c_i(\theta)$, $i=1,2$, the Bahadur efficiency, $B_{21}=c_2(\theta)/c_1(\theta)$, as a measure of the asymptotic relative efficiency of $T_{2n}$ with respect to $T_{1n}$ when $\theta$ is the true parameter. (Here the asymptotic efficiency is in the sense described at the beginning of this section.) Bahadur efficiency assesses the asymptotic efficiency for all $\theta$ while Pitman efficiency only has meaning in the neighbourhood of the null hypothesis.

In the above discussion the convergence is w.p. 1. This is relaxed to convergence in probability under the alternative hypothesis by Woodworth (1970).

Bahadur (1967) outlines the following procedure for determining the existence and evaluation of the exact slope. Suppose $T_n/n^{1/2}\to b(\theta)$ w.p. 1, where $\theta>\theta_0$ and $b$ is a positive finite function defined on $H_1$. Further, suppose

$$n^{-1}\log\{1-F_n(n^{1/2}t)\}\to-f(t)\quad\text{as } n\to\infty$$

for every $t$ in an open interval including each value of $b$, where $f$ is continuous on the interval, finite and positive. Then $c(\theta)$ exists for each $\theta\in H_1$ and equals $2f(b(\theta))$.

Very often the exact distribution, $F_n(t)$, is not known. In such a case it may still be possible to get an approximation to the exact slope, if the asymptotic distribution is known. Bahadur (1960) defines a "standard sequence", $\{T_n\}$, for which it is possible to compute the "approximate slope". A sequence of test statistics $\{T_n\}_{n=1}^{\infty}$ is called a standard sequence if:

B(i)   There exists a continuous probability distribution, $F$, such that $\lim_{n\to\infty}P_{\theta_0}(T_n<t)=F(t)$ for all $t$.

B(ii)  There exists a constant, $a$, $0<a<\infty$, such that $\log(1-F(t))=-\tfrac{1}{2}at^2(1+o(1))$ as $t\to\infty$.

B(iii) There exists a function, $b$, on $\theta>\theta_0$, with $0<b<\infty$, such that for all $\theta>\theta_0$, $\lim_{n\to\infty}P_\theta(|T_n/n^{1/2}-b(\theta)|>t)=0$ for all $t>0$.

For a standard sequence, Bahadur (1960) shows that $n^{-1}\log L_n^{(a)}\to-\tfrac{1}{2}c^{(a)}(\theta)$ in probability for $\theta>\theta_0$. The superscript "(a)" refers to approximate values. $L_n$ is approximated by $L_n^{(a)}=1-F(T_n)$, and $c^{(a)}(\theta)=a\,(b(\theta))^2$ is the approximate slope (or inexact slope). Sometimes condition B(iii) above is replaced by the stronger condition

B(iii)* $P_\theta(\lim_{n\to\infty}T_n/n^{1/2}=b(\theta))=1$ for $\theta>\theta_0$,

in which case $n^{-1}\log L_n^{(a)}\to-\tfrac{1}{2}c^{(a)}(\theta)$ w.p. 1.

The definition of the approximate slope parallels that of the exact slope, with the exact distribution replaced by the asymptotic distribution as an approximation. Similarly, given two standard sequences, $\{T_{in}\}$, $i=1,2$, with approximate slopes $c_i^{(a)}(\theta)$, $i=1,2$, the approximate Bahadur efficiency, $B_{21}^{(a)}=c_2^{(a)}(\theta)/c_1^{(a)}(\theta)$, can be interpreted as an asymptotic measure of the relative efficiency of the two sequences. Bahadur (1960, 1967) states that the use of the approximate slope is often informative but gives examples where it is a very bad approximation to the exact slope. In general, the inexact slope is reasonable in the neighbourhood of the null hypothesis. The approximate Bahadur efficiency generally depends on the parameter, $\theta$. The limit, as $\theta$ approaches the null value, $\theta_0$, is called the limiting Bahadur efficiency. In the case of one-sided tests, the limiting Bahadur efficiency is equal to the Pitman efficiency (Bahadur 1960, Wieand 1976).

The Pitman and Bahadur efficiencies serve as two means of comparing the selected nonparametric tests of independence with the parametric test based on the correlation coefficient. The main reason for comparing asymptotic behaviour is that the distribution functions, and therefore power functions, of some of the nonparametric statistics are unknown or are difficult to work with in the finite sample case. Where possible the exact slopes will be given for comparison in regions distant from the null. Otherwise approximate slopes will be given. Pitman efficiencies and approximate Bahadur efficiencies provide comparison criteria in regions near the null.

2.3 A Monte Carlo Study

Pitman and Bahadur efficiencies provide comparison criteria of tests when the sample size is large.
The comparison of the tests would not be complete without consideration of the performance of the tests when the sample size is small. The easiest way to examine the small sample behaviour is through a Monte Carlo study. A large number of samples of given size are generated under a fixed alternative. The tests are carried out for each sample and the relative frequencies of making the correct decision (reject the null hypothesis) are recorded. These empirical powers provide a way to assess the performance of the tests as a function of the sample size and the alternative, when the experiment is repeated for chosen values of $n$ and for different alternatives. The size of the test remains fixed at the same constant value for all tests.

CHAPTER 3

TESTS TO BE COMPARED

3.0 Introduction

Nonparametric tests for the independence of two random variables, like other nonparametric tests, require that only very weak assumptions be made. In all that follows it is assumed that a random sample of $n$ independent bivariate observations is available, denoted by $(X_1,Y_1),(X_2,Y_2),\ldots,(X_n,Y_n)$. Further, it is assumed that the sample is drawn from a population with the joint distribution function, $F(x,y)=\mathrm{Prob}(X\le x,\,Y\le y)$. It is desired to test the null hypothesis, $H_0$, that $X$ and $Y$ are independent, or $H_0:F(x,y)=F(x,\infty)F(\infty,y)$. Finally, it is assumed that $F$ is continuous, thereby ruling out the possibility of tied ranks. The tests to be discussed are all rank tests, that is, the test statistic, $T_n$, depends on the observations only through their ranks, $(R_i,S_i)$, $i=1,2,\ldots,n$, where $R_i$ ($S_i$) is the rank of $X_i$ ($Y_i$) in the ranking of the $n$ $X$'s ($Y$'s).

The tests to be compared are those based on Kendall's tau, Spearman's rho, the Fisher-Yates normal scores statistic, the quadrant sum, and the Blum-Kiefer-Rosenblatt statistic and its components. All of these test statistics, with the exception of Kendall's tau and the Blum-Kiefer-Rosenblatt statistic, are linear rank statistics. A linear rank statistic, in the bivariate case, is a statistic of the form

$$T_n=n^{-1/2}\sum_{i=1}^{n}a_n(R_i)\,b_n(S_i)\qquad (3.0.1)$$

where $a_n$ and $b_n$ are the scores in the terminology of Hajek and Sidak (1967). Each statistic will now be defined in a way facilitating efficiency calculations.

3.1 Spearman's Rho

Spearman (1904) introduced a distribution-free statistic which may be used to test $H_0$. The statistic, $r_s$, is the sample correlation coefficient applied to the ranks, $(R_i,S_i)$, $i=1,2,\ldots,n$, and is given by

$$r_s=12\,n^{-1}(n^2-1)^{-1}\sum_{i=1}^{n}\left(R_i-\tfrac{1}{2}(n+1)\right)\left(S_i-\tfrac{1}{2}(n+1)\right).$$

It is clear that $r_s$ is a multiple of

$$T_{sn}=n^{-1/2}\sum_{i=1}^{n}12\left(\frac{R_i}{n+1}-\tfrac{1}{2}\right)\left(\frac{S_i}{n+1}-\tfrac{1}{2}\right)$$

and that $T_{sn}$ is a linear rank statistic. The scores are generated by the function $\xi(u)=\sqrt{12}\,(u-\tfrac{1}{2})$, $0\le u\le 1$, in the following way:

$$a_n(i)=b_n(i)=\xi\left(\frac{i}{n+1}\right),\qquad i=1,2,\ldots,n.$$
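The definitions of this section can be checked numerically. The sketch below is illustrative only: the helper names (`ranks`, `spearman_rho`, `spearman_rho_from_differences`, `spearman_linear_rank`) and the small data set are hypothetical, and ties are assumed absent, in line with the continuity assumption on $F$. It computes $r_s$ from the centred ranks, compares it with the familiar equivalent form $1-6\sum(R_i-S_i)^2/\{n(n^2-1)\}$, and evaluates the linear rank statistic $T_{sn}$ built from the scores $\xi(i/(n+1))$.

```python
import math


def ranks(values):
    """Ranks 1..n of the values (no ties assumed, as F is continuous)."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(x, y):
    """r_s as the correlation coefficient of the ranks (centred-rank form)."""
    n = len(x)
    centre = 0.5 * (n + 1)
    total = sum((r - centre) * (s - centre) for r, s in zip(ranks(x), ranks(y)))
    return 12.0 * total / (n * (n * n - 1))


def spearman_rho_from_differences(x, y):
    """Equivalent form 1 - 6 * sum((R_i - S_i)^2) / (n (n^2 - 1))."""
    n = len(x)
    d2 = sum((r - s) ** 2 for r, s in zip(ranks(x), ranks(y)))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


def spearman_linear_rank(x, y):
    """T_sn = n^(-1/2) * sum xi(R_i/(n+1)) xi(S_i/(n+1)), xi(u) = sqrt(12)(u - 1/2)."""
    n = len(x)

    def xi(u):
        return math.sqrt(12.0) * (u - 0.5)

    total = sum(xi(r / (n + 1)) * xi(s / (n + 1)) for r, s in zip(ranks(x), ranks(y)))
    return total / math.sqrt(n)


# Hypothetical data, for illustration only.
x = [1.2, 3.4, 2.2, 5.0, 4.1]
y = [0.5, 2.9, 3.1, 4.4, 1.0]
print(spearman_rho(x, y), spearman_rho_from_differences(x, y), spearman_linear_rank(x, y))
```

For these definitions $T_{sn}=n^{1/2}(n-1)(n+1)^{-1}r_s$, which makes explicit the sense in which $r_s$ is a multiple of $T_{sn}$.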
It may be argued (Kruskal 1958) that $r_s$ is a reasonable estimate of $\rho_s=6\,\mathrm{Prob}\{(X-X')(Y-Y'')>0\}-3$, where $(X,Y)$, $(X',Y')$, $(X'',Y'')$ are arbitrary, independent observations from the distribution function $F$. If $X$ and $Y$ are independent $\rho_s$ is 0, although the converse is not true in general. $r_s$ may also be written as

$$r_s=1-6\,n^{-1}(n^2-1)^{-1}\sum_{i=1}^{n}(R_i-S_i)^2 .$$

3.2 The Fisher-Yates Normal Scores Statistic

The Fisher-Yates normal scores statistic is given by

$$f=\sum_{i=1}^{n}a_n(R_i)\,b_n(S_i)$$

where $a_n(i)=b_n(i)=E\,V_n^{(i)}$ and $V_n^{(i)}$ is the $i$th largest observation of a sample of size $n$, drawn from a standard normal population. Apart from some constants, $f$ is the correlation coefficient of the expected normal order statistics corresponding to the
q,  V 2 n  =  q  n  l  +  n =  n  statistic,  as d e f i n e d by  The  lines  2  n X v a l u e s and plane  into  size,  n,  point  falls  J  Blomqvist i s  2  n = t h e number o f p o i n t s l y i n g and  y=m  y  (m x  and  i n the  i n the  m y  first  second  are the  four quadrants.  i s assumed t o be on  one  of the  For  x=m  (1950) d e m o n s t r a t e s  o f n=2Prob{(X-y medians.  x  n may  between X and  )(Y-y be  y  and  thought  quadrants.  divide  the  (x,y)  sample  s i t u a t i o n where a  y=m  .  x  q i s a consistent and  y  y  Y are  estimate  are the p o p u l a t i o n  o f as a measure o f t h e  Y w h i c h i s 0 when X and  the  the  y  that  ) > 0 } - l , where y  or fourth  simplicity  x  Blomqvist  quadrants,  c  even t o a v o i d the lines  or t h i r d  sample m e d i a n s o f  the n Y v a l u e s , r e s p e c t i v e l y )  these  attributes  n  and  x  q, w h i c h he  l~ 2  whe r e n^=the number o f p o i n t s l y i n g  x=m  n  correlation  independent  (the  converse  i s not necessarily  In o r d e r  to relate  true).  q t o 3.0.1, n , - n  may be w r i t t e n as  9  n - n „ = £ s i g n (R.-Jg (n+1) ) s i g n (S.-h (n+1) ) n  X  Z  X  t»i  X  ~ S ~h)  = ^ s i g n ( (n+1) ~ R - J ) s i g n ( (n+1)  1  1  i  i  5  1=1  sign(v)  i s d e f i n e d i n t h e u s u a l way, s i g n ( v W 1 i f v>0 0 i f v=0 [ - l i i f  v<0  -h Now, q=n  T  where  T The a  n  q n  =n  statistic  T  _ 3 s  ^ s i g n ( (n+1) " R - % ) s i g n ( (n+1) 1  _ 1  i  S -J ). i  2  i s i n t h e d e s i r e d f o r m and t h e s c o r e f u n c t i o n s ,  qn  and b , o f 3.0.1 a r e g i v e n by n  a The  score  n  ( i ) = b ( i ) = c (i/n+1) . n ^  generating  f u n c t i o n , 5 , i s d e f i n e d t o be  £ (u)=sign(u-%)  O^Usjl.  3.4 K e n d a l l ' s T a u Although  the s t a t i s t i c  appeared  i n the l i t e r a t u r e  K e n d a l l ' s e x t e n s i v e work on i t s p r o p e r t i e s is  commonly known a s K e n d a l l ' s  tau.  
Kendall's tau, $t_n$, is defined as

$$t_n=n^{-1}(n-1)^{-1}\sum_{i\ne j}\mathrm{sign}(R_i-R_j)\,\mathrm{sign}(S_i-S_j).$$

$t_n$ is not a linear rank statistic but Hajek and Sidak (1967) compute the projection, $\hat{t}_n$, of $t_n$ into the family of linear rank statistics, under the null hypothesis. Namely,

$$\hat{t}_n=8\,n^{-2}(n-1)^{-1}\sum_{i=1}^{n}\left(R_i-\tfrac{1}{2}(n+1)\right)\left(S_i-\tfrac{1}{2}(n+1)\right).$$

Comparing this with Spearman's rho of 3.1, it is apparent that

$$\hat{t}_n=\tfrac{2}{3}(n+1)\,n^{-1}\,r_s .$$

$t_n$ is an estimate of $\tau=2\,\mathrm{Prob}\{(X-X')(Y-Y')>0\}-1$, where $(X,Y)$ and $(X',Y')$ are independent observations from $F$ (Kruskal 1958). Under $H_0$, $\tau=0$, although $\tau=0$ does not necessarily imply that $H_0$ holds. $\tau$ can be interpreted as a measure of "agreement" or correlation between $X$ and $Y$ and therefore a test based on $t_n$ is a reasonable test of independence.

3.5 The Blum-Kiefer-Rosenblatt Statistic And Its Components

Let $F_n(x,\infty)$, $F_n(\infty,y)$ be the marginal distribution functions of the sample joint distribution function $F_n(x,y)$ where

$$F_n(x,y)=n^{-1}\,(\text{the number of pairs } (X_i,Y_i) \text{ with } X_i\le x \text{ and } Y_i\le y).$$

Define the random process $Q_n(x,y)$ such that

$$Q_n(x,y)=\sqrt{n}\,\{F_n(x,y)-F_n(x,\infty)F_n(\infty,y)\}.$$

Blum, Kiefer, and Rosenblatt (1961) propose the statistic

$$B_n=n^{-1}\int Q_n^2(x,y)\,dF_n(x,y)$$

as the basis of a test for the independence of $X$ and $Y$. $B_n$ may also be written as the sum

$$B_n=n^{-1}\sum_{i=1}^{n}\left(n^{-1}\,\#(j: X_j\le X_i,\ Y_j\le Y_i)-n^{-2}\,\#(j: X_j\le X_i)\,\#(j: Y_j\le Y_i)\right)^2 .$$

It may be seen that any test based on $B_n$ is nonparametric by writing $B_n$ in terms of the ranks:
J (n J s i g n ( R - R ) s i g n ( S - S ) - n " r s i g n ( R - R ) ;i 1  +  +  i  j  2  i  j  +  (  i  j  ^sign^(S -S.) ) i  where s i g n +(v)= 1 i f v>0 0 i f v<0 Hoeffding  (1948) i n t r o d u c e d a s t a t i s t i c which i s a s y m p t o t i c a l l y  equivalent to B . n  Hoeffding's s t a t i s t i c  estimates  A(F)=/ D (x,y) dF(x,y) 2  where D (x,y)=F (x,y) -F (X,°P)F (oo,y) .  T h i s parameter has the  d e s i r a b l e p r o p e r t y t h a t D(x,y)=0 f o r a l l (x,y) i f and o n l y i f HQ  i s true.  16 Assuming the n u l l hypothesis h o l d s , i t i s of i n t e r e s t to f i n d the orthogonal r e p r e s e n t a t i o n o f n B co  eo  case, namely YY^"k "k ' of nB i s the same as t h a t of B. n z  B=  are independent,  w  n  e  r  e  t  n  e  asymptotic d i s t r i b u t i o n The components z., , j,k=l,2...°° j JS.  i d e n t i c a l l y d i s t r i b u t e d normal random v a r i a b l e s  w i t h mean 0 and v a r i a n c e 1.  The procedure  r e p r e s e n t a t i o n i s completely analogous Knott z  (1972).  n j k ' J'  k=1  i n the asymptotic  n  O  used to f i n d the  t o t h a t of Durbin and  For f i n i t e n, the corresponding  ' '-'°°  MA  2  Y  b  "components"  computed which are a s y m p t o t i c a l l y  e  e q u i v a l e n t t o the components , z ^ , j ,k=l, 2 ...«>.. . The z j '  s  be r e l a t e d to l i n e a r rank s t a t i s t i c s which may themselves  be  k  n  c o n s i d e r e d f o r t e s t i n g f o r independence.  1.  Y  The components are  computed i n s e c t i o n 3.5(1) and are then r e l a t e d t o l i n e a r s t a t i s t i c s i n section  m a  k  rank  3.5(2).  The components.  Assume t h a t the n u l l hypothesis h o l d s , L e t Q (x,y) =/n ( F (x,y) - F  i . e . F(x,y)=F(x,°°)F(°°,y) .  n  n  N  (x,°°) F  n  (°° ,y) )  and c o n s i d e r the t r a n s f o r m a t i o n ( x , y ) = ( F " ( u ) , F ~ ( v ) ) where 1  F (x)=F(x,oo)  and  x  F  y  (y)=F(oo,y) .  
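As a concrete aside to the definitions of section 3.5, the following is a minimal sketch of how $Q_n$ at the sample points and $B_n$ might be computed from a sample. The function name `bkr_statistic` and the use of NumPy are assumptions made for illustration only; they are not part of the thesis, whose computations were done on an IBM 360. The sketch assumes the sum form of $B_n$ with the weak inequalities used in the definition of $F_n$.

```python
import numpy as np

def bkr_statistic(x, y):
    """Illustrative computation of B_n = n^{-1} * integral of Q_n^2 dF_n.

    Hypothetical helper; evaluates F_n at the sample points only.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    # le_x[i, j] is True when X_j <= X_i (similarly for le_y).
    le_x = x[None, :] <= x[:, None]
    le_y = y[None, :] <= y[:, None]
    f_joint = (le_x & le_y).sum(axis=1) / n   # F_n(X_i, Y_i)
    f_x = le_x.sum(axis=1) / n                # F_n(X_i, infinity)
    f_y = le_y.sum(axis=1) / n                # F_n(infinity, Y_i)
    q = np.sqrt(n) * (f_joint - f_x * f_y)    # Q_n(X_i, Y_i)
    # The integral with respect to dF_n is the average over sample points.
    return float(np.mean(q ** 2) / n)
```

Because only the comparisons $X_j\le X_i$ and $Y_j\le Y_i$ enter, the value is unchanged by strictly increasing transformations of either coordinate, which is the sense in which a test based on $B_n$ is nonparametric.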
Now $Q_n(x,y) = Q_n(F_X^{-1}(u), F_Y^{-1}(v))$ is asymptotically a Gaussian process, $Q(u,v)$, with mean and covariance given by

$$EQ(u,v) = 0 = EQ_n(F_X^{-1}(u), F_Y^{-1}(v))$$

$$\mathrm{Cov}(Q(u,v),Q(r,s)) = \{\min(u,r)-ur\}\{\min(v,s)-vs\} = \lim_{n\to\infty}\mathrm{Cov}(Q_n(F_X^{-1}(u),F_Y^{-1}(v)),\,Q_n(F_X^{-1}(r),F_Y^{-1}(s)))$$

where

$$\mathrm{Cov}(Q_n(F_X^{-1}(u),F_Y^{-1}(v)),\,Q_n(F_X^{-1}(r),F_Y^{-1}(s))) = \{n^{-1}(n-1)\min(u,r)-ur\}\{n^{-1}(n-1)\min(v,s)-vs\} - n^{-2}uvrs.$$

Define

$$z_{njk} = \lambda_{jk}^{-1/2}\int_0^1\!\!\int_0^1 \ell_{jk}(u,v)\,Q_n(F_X^{-1}(u),F_Y^{-1}(v))\,du\,dv, \qquad j,k=1,2,\ldots \tag{3.5.1.1}$$

where $\ell_{jk}(u,v)$ is the eigenfunction corresponding to the eigenvalue $\lambda_{jk}$ of the equation

$$\int_0^1\!\!\int_0^1 \{\min(u,r)-ur\}\{\min(v,s)-vs\}\,\ell(r,s)\,dr\,ds = \lambda\,\ell(u,v). \tag{3.5.1.2}$$

It has been found (Blum, Kiefer, and Rosenblatt 1961) that the eigenfunctions and eigenvalues of 3.5.1.2 are $2\sin(\pi ju)\sin(\pi kv)$ and $1/\pi^4j^2k^2$, $j,k=1,2,\ldots$. Asymptotically, $z_{njk}$ has the same distribution as

$$z_{jk} = \lambda_{jk}^{-1/2}\int_0^1\!\!\int_0^1 \ell_{jk}(u,v)\,Q(u,v)\,du\,dv. \tag{3.5.1.3}$$

Since $Q(u,v)$ is Gaussian, $z_{jk}$ has a normal distribution (Ash and Gardner 1975). It is easy to see, as $\lambda_{jk}$ and $\ell_{jk}(u,v)$ satisfy 3.5.1.2, that the $z_{jk}$'s are uncorrelated and have 0 means and variances equal to 1. The $z_{jk}$'s are, therefore, asymptotically independent, identically distributed Normal (0,1) random variables. The inverse of 3.5.1.3 is

$$Q(u,v) = \sum_{j=1}^{\infty}\sum_{k=1}^{\infty}\lambda_{jk}^{1/2}\,\ell_{jk}(u,v)\,z_{jk}.$$

Now,

$$nB_n = \int Q_n^2(x,y)\,dF_n(x,y) = \int\!\!\int Q_n^2(F_X^{-1}(u),F_Y^{-1}(v))\,dF_n(F_X^{-1}(u),F_Y^{-1}(v))$$

and asymptotically, under $H_0$, $nB_n$ is distributed as

$$B = \int_0^1\!\!\int_0^1 Q^2(u,v)\,du\,dv = \sum_{j}\sum_{k}\lambda_{jk}z_{jk}^2.$$

Referring to the definition 3.5.1.1, $z_{njk}$ may be written in terms of the original data as follows,

$$z_{njk} = \pi^2jk\int_0^1\!\!\int_0^1 2\sin(\pi ju)\sin(\pi kv)\,Q_n(F_X^{-1}(u),F_Y^{-1}(v))\,du\,dv = 2\pi^2jk\int_0^1\sin(\pi kv)\int_0^1\sin(\pi ju)\,Q_n(F_X^{-1}(u),F_Y^{-1}(v))\,du\,dv.$$

Integrating the inner integral by parts, with $x = Q_n(F_X^{-1}(u),F_Y^{-1}(v))$ and $dy = \sin(\pi ju)\,du$, gives

$$z_{njk} = 2\pi k\int_0^1\sin(\pi kv)\left(\int_0^1\cos(\pi ju)\,\frac{\partial}{\partial u}\{Q_n(F_X^{-1}(u),F_Y^{-1}(v))\}\,du\right)dv$$

$$= 2\pi k\sqrt n\int_0^1\sin(\pi kv)\Big\{n^{-1}\sum_{i:\,Y_i\le F_Y^{-1}(v)}\cos(\pi jF_X(X_i)) - n^{-1}F_n(\infty,F_Y^{-1}(v))\sum_{i=1}^{n}\cos(\pi jF_X(X_i))\Big\}\,dv.$$

Integrating a second time by parts, with

$$x = \Big\{n^{-1}\sum_{i:\,Y_i\le F_Y^{-1}(v)}\cos(\pi jF_X(X_i)) - n^{-1}F_n(\infty,F_Y^{-1}(v))\sum_{i=1}^{n}\cos(\pi jF_X(X_i))\Big\}$$

and $dy = \sin(\pi kv)\,dv$, gives

$$z_{njk} = 2\sqrt n\int_0^1\cos(\pi kv)\,d\Big\{n^{-1}\sum_{i:\,Y_i\le F_Y^{-1}(v)}\cos(\pi jF_X(X_i)) - n^{-1}F_n(\infty,F_Y^{-1}(v))\sum_{i=1}^{n}\cos(\pi jF_X(X_i))\Big\}$$

$$= 2\sqrt n\Big\{n^{-1}\sum_{i=1}^{n}\cos(\pi jF_X(X_i))\cos(\pi kF_Y(Y_i)) - n^{-2}\sum_{i=1}^{n}\cos(\pi jF_X(X_i))\sum_{i=1}^{n}\cos(\pi kF_Y(Y_i))\Big\}.$$

2. The related linear rank statistics. $z_{njk}$ may now be related to a linear rank statistic. It may be necessary to estimate $F_X$ and $F_Y$ by the sample distribution functions $F_n(\cdot,\infty)$ and $F_n(\infty,\cdot)$. Replacing $F_X$ and $F_Y$ by their estimates in the expression for $z_{njk}$ gives

$$\hat z_{njk} = 2\sqrt n\Big\{n^{-1}\sum_{i=1}^{n}\cos(\pi jF_n(X_i,\infty))\cos(\pi kF_n(\infty,Y_i)) - n^{-2}\sum_{i=1}^{n}\cos(\pi jF_n(X_i,\infty))\sum_{i=1}^{n}\cos(\pi kF_n(\infty,Y_i))\Big\}.$$

Or in terms of the rank statistics $(R_i,S_i)$,

$$\hat z_{njk} = 2\sqrt n\Big\{n^{-1}\sum_{i=1}^{n}\cos(\pi jn^{-1}R_i)\cos(\pi kn^{-1}S_i) - n^{-2}\sum_{i=1}^{n}\cos(\pi jn^{-1}R_i)\sum_{i=1}^{n}\cos(\pi kn^{-1}S_i)\Big\}.$$

The last term of this expression is constant (i.e. it is equal to $n^{-2}\sum_{i=1}^{n}\cos(\pi jn^{-1}i)\sum_{i=1}^{n}\cos(\pi kn^{-1}i)$). The rank statistic $\hat z_{njk}$ suggests use of the linear rank statistic,

$$T_{njk} = n^{-1/2}\sum_{i=1}^{n}2\cos(\pi j(n+1)^{-1}R_i)\cos(\pi k(n+1)^{-1}S_i)$$

which is of the form 3.0.1. The scores are generated by the functions $\xi_\ell(u) = \sqrt2\cos(\pi\ell u)$, $0\le u\le 1$, $\ell=1,2,\ldots$ where $a_n(i)=\xi_j((n+1)^{-1}i)$, $b_n(i)=\xi_k((n+1)^{-1}i)$. It will be shown that it is only necessary to consider the linear rank statistics $T_{njk}$, and not the $z_{njk}$'s themselves, since $T_{njk}$ is asymptotically equivalent to $z_{njk}$.

CHAPTER 4

ALTERNATIVES TO INDEPENDENCE

Specification of the alternative to independence is important in making a power comparison of tests of independence. Any conclusions regarding the behaviour of the tests are valid only for the particular alternative under consideration. Usually, only the bivariate normal case is considered. The same will be done in this thesis, but not without mention of some other alternatives which have been investigated.

Konijn (1956) is, perhaps, the first to point out the significance and difficulties arising in defining a class of alternatives to independence. He derives the relative efficiencies of several nonparametric tests of independence (with respect to the test based on the sample correlation coefficient), when X and Y are given by $X=\lambda_1Z_1+\lambda_2Z_2$, $Y=\lambda_3Z_1+\lambda_4Z_2$, where $Z_1$ and $Z_2$ are independent. The hypothesis of independence is $\lambda_1=\lambda_4=1$, $\lambda_2=\lambda_3=0$. This model incorporates the bivariate normal case and is therefore more general. Bhuchongkul (1964) considers a similar model and a general class of nonparametric tests, computing the Pitman efficiencies of some specific tests. Bhuchongkul's model is $X=(1-\theta)Z_1+\theta Z_3$, $Y=(1-\theta)Z_2+\theta Z_3$, where $0\le\theta<1$; $Z_1$, $Z_2$ and $Z_3$ are independent.

Dependence may be specified in terms of the distribution functions. This approach is taken by Farlie (1961) who computes the Pitman efficiencies of Daniels' generalized correlation coefficients (which include the ordinary sample correlation coefficient, Spearman's rho and Kendall's tau), when the joint distribution function of X and Y has the form $F(x,y)=F(x,\infty)F(\infty,y)(1+\lambda A(x)B(y))$. Ghokale (1968) also specifies a general class of bivariate distributions, namely, $F(x,y)=(1-\theta)F(x,\infty)F(\infty,y)+\theta K(x,y)$, $0<\theta<1$, and considers a subclass of Bhuchongkul's class of rank statistics. Ghokale obtains the Pitman efficiencies of one rank test with respect to another as well as with respect to the test based on the sample correlation coefficient. He shows that there exist alternatives for which these efficiencies are 0 and $\infty$. This illustrates how crucial knowledge of the alternative is in choosing the best test.

A general notion of dependence, defined by Lehmann (1966), is positive quadrant dependence. The distribution function F is positively quadrant dependent if $F(x,y)\ge F(x,\infty)F(\infty,y)$ for every x and y. Behnen (1971) investigates the behaviour of linear rank statistics when testing independence against general contiguous positive quadrant dependence. (Contiguity is as defined by Hajek and Sidak 1967).

CHAPTER 5

THE COMPARISON OF NONPARAMETRIC TESTS OF INDEPENDENCE

5.0 Introduction

The comparison of the tests of independence is restricted to the case of bivariate normality. To be precise, it is assumed that (X,Y) has a bivariate normal distribution with correlation coefficient $\rho$ and that X and Y have means $\mu_x$, $\mu_y$ and variances $\sigma_x^2$, $\sigma_y^2$ respectively. It is desired to test $H_0:\rho=0$ (independence) versus the one sided alternative $H_1:\rho>0$.

To calculate the Pitman and Bahadur efficiencies of the tests based on the statistics of Chapter 3, relative to the test based on the sample correlation coefficient, two main tools will be used. First, the Pitman efficiencies of tests based on linear rank statistics can be easily derived by applying Behnen (1971), Theorem 1. This result implies that the statistics 3.0.1 are asymptotically Normal (0,1) under the null hypothesis, $H_0$, and under contiguous alternatives, $H_n$, are asymptotically Normal ($\mu_n$,1), where an expression for $\mu_n$ is given. The Pitman efficiency is then a straightforward evaluation of the right hand side of 2.1.1.
Second, the Bahadur efficiency of $T_n$ (given by 3.0.1) may be computed by combining Theorem 3 with 4.6, both of Woodworth (1970), to yield the exact slope of the sequence $\{T_n\}$. If the exact slope is difficult to compute, the approximate slope will be given. The approximate slope requires knowledge of the distribution of $T_n$ under $H_0$, given by Behnen (1971), Theorem 1, and of $\lim_{n\to\infty}T_n/\sqrt n$ under the alternative, given by 4.6 of Woodworth (1970).

For the small sample comparison, a Monte Carlo experiment was carried out for samples of sizes n=10, 24 and 50. One thousand samples were generated from a bivariate normal distribution for each value of n and for several positive $\rho$ values. An estimate of the power of each test is provided by the relative frequency of rejecting the null $H_0:\rho=0$ in favour of $H_1:\rho>0$.

5.1 Pitman Efficiencies

The nonparametric tests are to be compared with the parametric test based on the sample correlation coefficient,

$$r_n = \frac{\sum_{i=1}^{n}(X_i-\bar X)(Y_i-\bar Y)}{\sqrt{\sum_{i=1}^{n}(X_i-\bar X)^2\,\sum_{i=1}^{n}(Y_i-\bar Y)^2}}$$

where $\bar X$ and $\bar Y$ are the sample means of the X's and Y's respectively. The test rejects $H_0$ in favour of $H_1$ when $r_n$ is greater than some positive constant. It is known that when (X,Y) has a bivariate normal distribution, with correlation coefficient $\rho$, the test is consistent against $H_1:\rho>0$ and $r_n$ is asymptotically normal with mean $\rho$ and variance $(1-\rho^2)^2/n$. (See Kendall and Stuart 1973, for example). In the notation of Chapter 2, where $\theta=\rho$, $\theta_0=0$,

$$\psi_{r_n}(\rho)=\rho, \qquad \sigma_{r_n}(\rho)=(1-\rho^2)/\sqrt n, \qquad \psi'_{r_n}(0)=1, \qquad m_r=1,$$

$$\lim_{n\to\infty} n^{-1/2}\,\psi'_{r_n}(0)/\sigma_{r_n}(0) = 1, \qquad \delta_r=\tfrac12.$$

Defining the simple alternative to independence, $H_n:\rho=\rho_n=b_rn^{-1/2}$, for $b_r$ some positive constant, assumptions A(i)-A(iv) of 2.1 are satisfied.

Now consider a consistent test based on the statistic

$$T_n = n^{-1/2}\sum_{i=1}^{n}a_n(R_i)\,b_n(S_i)$$

which rejects $H_0$, in favour of $H_1$, for large values of $T_n$. Suppose the score functions are defined by

$$a_n(i) = E\,a(U_n^{(i)}), \qquad b_n(i) = E\,b(U_n^{(i)})$$

where $U_n^{(i)}$ is the i-th order statistic of a sample of n independent Uniform (0,1) random variables, or by

$$a_n(i) = a(i/(n+1)), \qquad b_n(i) = b(i/(n+1)).$$

Further, suppose that the score generating functions, a and b, are real valued measurable functions defined on [0,1] which satisfy

$$\int_0^1 a(z)\,dz = \int_0^1 b(z)\,dz = 0$$

and

$$0<\int_0^1 a^2(z)\,dz = \sigma_a^2<\infty, \qquad 0<\int_0^1 b^2(z)\,dz = \sigma_b^2<\infty.$$

If these conditions hold, by Behnen (1971), Theorem 1(a,c), $T_n$ is asymptotically normal with mean 0 and variance 1, under $H_0:\rho=0$, and is asymptotically normal with mean $\mu_n$ and variance 1, under the simple alternative, $H_n:\rho=\rho_n=b_Tn^{-1/2}$, $b_T$ any positive constant. The mean $\mu_n$ is given by

$$\mu_n = \frac{n^{1/2}}{\sigma_a\sigma_b}\int\!\!\int a(F(x,\infty))\,b(F(\infty,y))\,dF_{\rho_n}(x,y)$$

where $F(x,\infty)=\Phi(x)$, $F(\infty,y)=\Phi(y)$ are the standard normal marginals of X and Y and $F_{\rho_n}(x,y)$ is the bivariate normal distribution function with $\rho=\rho_n$. As for $r_n$, these results can be written in the notation of Chapter 2, that is

$$\psi_{T_n}(\rho) = \frac{n^{1/2}}{\sigma_a\sigma_b}\int\!\!\int a(F(x,\infty))\,b(F(\infty,y))\,dF_\rho(x,y).$$

Without loss of generality it may be assumed that $\mu_x=\mu_y=0$ and $\sigma_x=\sigma_y=1$. Substituting for $F_\rho$ gives

$$\psi_{T_n}(\rho) = \frac{n^{1/2}}{\sigma_a\sigma_b}\int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} a(\Phi(x))\,b(\Phi(y))\,(2\pi\sqrt{1-\rho^2})^{-1}\exp\{-\tfrac12(1-\rho^2)^{-1}[x^2+y^2-2\rho xy]\}\,dx\,dy,$$

$$\sigma_{T_n}(\rho)=1,$$

$$\psi'_{T_n}(0) = \frac{n^{1/2}}{\sigma_a\sigma_b}\int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} a(\Phi(x))\,b(\Phi(y))\,(2\pi)^{-1}\exp\{-\tfrac12(x^2+y^2)\}\,xy\,dx\,dy, \qquad m_T=1,$$

$$\lim_{n\to\infty} n^{-1/2}\,\psi'_{T_n}(0)/\sigma_{T_n}(0) = \frac{1}{\sigma_a\sigma_b}\int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} a(\Phi(x))\,b(\Phi(y))\,(2\pi)^{-1}\exp\{-\tfrac12(x^2+y^2)\}\,xy\,dx\,dy, \qquad \delta_T=\tfrac12.$$

Provided this limit is positive, A(i)-A(iv) of 2.1 are satisfied.

From 2.1.1 the Pitman efficiency of the test based on $T_n$, relative to that based on $r_n$, is given by

$$A_{Tr} = \lim_{n\to\infty}\left[\frac{\psi'_{T_n}(0)/\sigma_{T_n}(0)}{\psi'_{r_n}(0)/\sigma_{r_n}(0)}\right]^2 = \left[\frac{1}{\sigma_a\sigma_b}\,\frac{\partial}{\partial\rho}\int\!\!\int a(\Phi(x))\,b(\Phi(y))\,(2\pi(1-\rho^2)^{1/2})^{-1}\exp\{-\tfrac12(1-\rho^2)^{-1}(x^2+y^2-2\rho xy)\}\,dx\,dy\,\Big|_{\rho=0}\right]^2 \tag{5.1.1}$$

Or,

$$A_{Tr} = \left[\frac{1}{\sigma_a\sigma_b}\int\!\!\int a(\Phi(x))\,b(\Phi(y))\,(2\pi)^{-1}\exp\{-\tfrac12(x^2+y^2)\}\,xy\,dx\,dy\right]^2 = \left[\frac{1}{\sigma_a}\int_{-\infty}^{\infty}a(\Phi(x))\,\phi(x)\,x\,dx\right]^2\left[\frac{1}{\sigma_b}\int_{-\infty}^{\infty}b(\Phi(y))\,\phi(y)\,y\,dy\right]^2 \tag{5.1.2}$$

where $\phi(x)=\Phi'(x)=(2\pi)^{-1/2}\exp(-\tfrac12x^2)$ (standard normal density). In the case where $a(\cdot)=b(\cdot)$, 5.1.2 reduces to

$$A_{Tr} = \left[\frac{1}{\sigma_a}\int_{-\infty}^{\infty}a(\Phi(x))\,\phi(x)\,x\,dx\right]^4. \tag{5.1.3}$$

It is now a simple matter to calculate the Pitman efficiencies of the tests based on the linear rank statistics of Chapter 3, relative to the test based on $r_n$. Although these efficiencies have already been derived in various ways by several authors (with the exception of those related to the components of the Blum-Kiefer-Rosenblatt statistic), for completeness and convenience they are once more computed here. The tests reject when the statistics are too large and these tests are consistent for testing $H_0:\rho=0$ against $H_1:\rho>0$. From now on Pitman efficiency will be assumed to be taken relative to the test based on $r_n$.

1. Spearman's rho ($T_{sn}$). In this case $a(u)=b(u)=\sqrt{12}\,(u-\tfrac12)$ and $\sigma_a=\sigma_b=1$. From 5.1.3,

$$A_{sr} = \left[\sqrt{12}\int_{-\infty}^{\infty}x\,(\Phi(x)-\tfrac12)\,\phi(x)\,dx\right]^4.$$

Evaluating the integral by parts with $v=\Phi(x)-\tfrac12$ and $du=x\phi(x)\,dx$ gives the well known result,

$$A_{sr} = \left[\sqrt{12}\int_{-\infty}^{\infty}(2\pi)^{-1}\exp(-x^2)\,dx\right]^4 = \left[(12)^{1/2}(2\pi)^{-1/2}(2)^{-1/2}\right]^4 = 9/\pi^2 \approx .91.$$

2. Fisher-Yates normal scores statistic ($T_{fn}$). Here $a(u)=b(u)=\Phi^{-1}(u)$ and $\sigma_a=\sigma_b=1$. According to 5.1.3 the Pitman efficiency is

$$A_{fr} = \left[\int_{-\infty}^{\infty}x^2\phi(x)\,dx\right]^4 = 1.$$

3. Quadrant sum ($T_{qn}$). The statistic $T_{qn}$ of 3.3 corresponds to letting $a(u)=b(u)=\operatorname{sign}(u-\tfrac12)$. Obviously, $\sigma_a$ and $\sigma_b$ are both 1. From 5.1.3 the Pitman efficiency is

$$A_{qr} = \left[\int_{-\infty}^{\infty}x\,\operatorname{sign}(\Phi(x)-\tfrac12)\,\phi(x)\,dx\right]^4 = \left[\int_{-\infty}^{0}-x\phi(x)\,dx + \int_{0}^{\infty}x\phi(x)\,dx\right]^4$$

$$= \left[(2\pi)^{-1/2}\int_{-\infty}^{0}-x\exp(-\tfrac12x^2)\,dx + (2\pi)^{-1/2}\int_{0}^{\infty}x\exp(-\tfrac12x^2)\,dx\right]^4 = \left[2(2\pi)^{-1/2}\right]^4 = 4\pi^{-2} \approx .41.$$

4. The components and related linear rank statistics of the Blum-Kiefer-Rosenblatt statistic ($z_{njk}$ and $T_{njk}$). It has already been shown that $z_{njk}$ is asymptotically distributed as $z_{jk}$ when X and Y are independent. Since $T_{njk}$ is of the correct form, Behnen's theorem applies and $T_{njk}$ is also asymptotically Normal (0,1) under independence. $z_{njk}$ and $T_{njk}$ are, therefore, asymptotically equivalent when $H_0:\rho=0$ is true.

Consider the alternative $H_n:\rho=\rho_n=n^{-1/2}b_T$. Under $H_n$, letting $(x,y)=(F_X^{-1}(u),F_Y^{-1}(v))=(\Phi^{-1}(u),\Phi^{-1}(v))$, the expected value of $Q_n$ is

$$E(Q_n(\Phi^{-1}(u),\Phi^{-1}(v))) = \sqrt n\left[F_{\rho_n}(\Phi^{-1}(u),\Phi^{-1}(v)) - n^{-1}F_{\rho_n}(\Phi^{-1}(u),\Phi^{-1}(v)) - (1-n^{-1})uv\right],$$

where $F_{\rho_n}$ is the bivariate normal distribution function with $\rho=\rho_n$. If $F_{\rho_n}(\Phi^{-1}(u),\Phi^{-1}(v))$ is expanded about $\rho=0$, for large n

$$F_{\rho_n}(\Phi^{-1}(u),\Phi^{-1}(v)) \approx \int_{-\infty}^{\Phi^{-1}(v)}\!\!\int_{-\infty}^{\Phi^{-1}(u)}(2\pi)^{-1}\exp(-\tfrac12(x^2+y^2))\,(1+\rho_nxy)\,dx\,dy.$$

Asymptotically, under the sequence of alternatives $\{H_n\}$,

$$E(Q_n(\Phi^{-1}(u),\Phi^{-1}(v))) \to \sqrt n\left[\int_{-\infty}^{\Phi^{-1}(v)}\!\!\int_{-\infty}^{\Phi^{-1}(u)}(2\pi)^{-1}\exp(-\tfrac12(x^2+y^2))\,(1+\rho_nxy)\,dx\,dy - uv\right]$$

$$= \sqrt n\,\rho_n\int_{-\infty}^{\Phi^{-1}(v)}\!\!\int_{-\infty}^{\Phi^{-1}(u)}(2\pi)^{-1}\exp(-\tfrac12(x^2+y^2))\,xy\,dx\,dy.$$

Also,

$$\mathrm{Cov}(Q_n(\Phi^{-1}(u),\Phi^{-1}(v)),\,Q_n(\Phi^{-1}(r),\Phi^{-1}(s))) \to \{\min(u,r)-ur\}\{\min(v,s)-vs\}$$

as $n\to\infty$, as well as in the case of independence. Under the sequence $\{H_n\}$, $Q_n(\Phi^{-1}(u),\Phi^{-1}(v))$ is asymptotically a Gaussian process, $Q(u,v)$, with

$$E(Q(u,v)) = \sqrt n\,\rho_n\int_{-\infty}^{\Phi^{-1}(v)}\!\!\int_{-\infty}^{\Phi^{-1}(u)}(2\pi)^{-1}\exp(-\tfrac12(x^2+y^2))\,xy\,dx\,dy$$

and

$$\mathrm{Cov}(Q(u,v),Q(r,s)) = \{\min(u,r)-ur\}\{\min(v,s)-vs\}.$$
From this it may be concluded that under the sequence of alternatives, $\{H_n\}$, the component $z_{njk}$ is asymptotically normal with variance 1 and mean given by

$$\sqrt n\,\rho_n\,\pi^2jk\int_0^1\!\!\int_0^1 2\sin(\pi ju)\sin(\pi kv)\int_{-\infty}^{\Phi^{-1}(v)}\!\!\int_{-\infty}^{\Phi^{-1}(u)}(2\pi)^{-1}\exp(-\tfrac12(x^2+y^2))\,xy\,dx\,dy\,du\,dv.$$

Integrating by parts, twice, this is equal to

$$\sqrt n\,\rho_n\int_0^1\!\!\int_0^1 2\cos(\pi ju)\cos(\pi kv)\,\Phi^{-1}(u)\Phi^{-1}(v)\,du\,dv.$$

From Behnen's theorem, under the sequence $\{H_n\}$, $T_{njk}$ is asymptotically normal with variance 1 and mean given by

$$\mu_n = \sqrt n\int\!\!\int 2\cos(\pi j\Phi(x))\cos(\pi k\Phi(y))\,dF_{\rho_n}(x,y).$$

Again, approximating $F_{\rho_n}(x,y)$ in the region of $\rho=0$,

$$\mu_n \approx \sqrt n\int\!\!\int 2\cos(\pi j\Phi(x))\cos(\pi k\Phi(y))\,(2\pi)^{-1}\exp(-\tfrac12(x^2+y^2))\,(1+\rho_nxy)\,dx\,dy.$$

Making the substitution, $u=\Phi(x)$ and $v=\Phi(y)$,

$$\mu_n = \sqrt n\int_0^1\!\!\int_0^1 2\cos(\pi ju)\cos(\pi kv)\,(1+\rho_n\Phi^{-1}(u)\Phi^{-1}(v))\,du\,dv = \sqrt n\,\rho_n\int_0^1\!\!\int_0^1 2\cos(\pi ju)\cos(\pi kv)\,\Phi^{-1}(u)\Phi^{-1}(v)\,du\,dv.$$

$z_{njk}$ and $T_{njk}$ are, therefore, asymptotically equivalent under both the null, $H_0$, and the sequence of alternatives $\{H_n\}$. Because of this equivalence the Pitman efficiencies and approximate Bahadur efficiencies computed for $T_{njk}$ will be the same as those for $z_{njk}$.

For the statistic $T_{njk}$, $a(u)=\sqrt2\cos(\pi ju)$ and $b(u)=\sqrt2\cos(\pi ku)$, and $\sigma_a=\sigma_b=1$. The Pitman efficiency of the $T_{njk}$ test is, by 5.1.2,

$$A_{(jk)r} = \left[\int_{-\infty}^{\infty}\sqrt2\cos(\pi j\Phi(x))\,\phi(x)\,x\,dx\right]^2\left[\int_{-\infty}^{\infty}\sqrt2\cos(\pi k\Phi(y))\,\phi(y)\,y\,dy\right]^2.$$

Substituting $u=\Phi(x)$, $v=\Phi(y)$,

$$A_{(jk)r} = 4\left[\int_0^1\cos(\pi ju)\,\Phi^{-1}(u)\,du\right]^2\left[\int_0^1\cos(\pi kv)\,\Phi^{-1}(v)\,dv\right]^2.$$

Consider the integral

$$I_\ell = \int_0^1\cos(\pi\ell u)\,\Phi^{-1}(u)\,du.$$

$I_\ell$ is equal to 0 when $\ell$ is an even integer. The evaluation of $I_\ell$, when $\ell$ is an odd integer, is not straightforward. The work of Beran (1975 a,b) suggests that the behaviour of the first component $z_{n11}$ determines the Pitman efficiency of the overall Blum-Kiefer-Rosenblatt statistic. For this reason the Pitman efficiency of the first component (or equivalently $T_{n11}$) is of interest. In order to compare this efficiency with the other efficiencies, $I_1$ (as well as $I_3$) has been numerically evaluated, using 8 point Gaussian quadrature. The results are reported here,

$I_1 = .67$
$I_3 = .17$

The corresponding efficiencies are

$A_{(11)r} = .80$
$A_{(13)r} = 4I_1^2I_3^2$
$A_{(33)r} = .0033$

These are seen to decrease rapidly with increasing j or k.

5. Kendall's tau. Kendall's tau is asymptotically equivalent to Spearman's rho under both the null hypothesis (Hajek and Sidak 1967) and under the alternative where $\rho>0$ is close to 0 (Farlie 1961). This implies that the Pitman efficiency of Kendall's test is the same as that of Spearman's test. The Pitman efficiency is, therefore, $9/\pi^2 \approx .91$.

5.2 Bahadur Efficiencies

As another means of comparing tests, the Bahadur efficiencies (exact or approximate) of the nonparametric tests, relative to the test based on the sample correlation coefficient, will be given.
To begin w i t h , i t i s necessary t o know the exact o r approximate slope a s s o c i a t e d w i t h r  n  when t e s t i n g  H :p=0 Q  against  H^:p>0,  normal case. The exact s l o p e , o f [ r /v4i-2/l-r j, 2 i s c ( p ) = - l o g ( 1 - p ), (Woodworth 1970). The approximate s l o p e i s 2  i n the b i v a r i a t e  n  r  c ^  a  )  ( p ) = p  2  / l - p  2  ,  (Abrahamson 1965).  Now c o n s i d e r the l i n e a r rank s t a t i s t i c s o f 3.0.1,  30 T =n  2  >a  (R.)b  (S.).  The  procedure f o r c a l c u l a t i n g  or approximate slope of the  the  exact  ' t J _ ^ requires evaluation  sequence  T  n  n  i,  of l i m T / n  2  when p>0.  T /n +ff n  a ( F ) b ( G ) dF  2  in probability marginals,  From 4.6  as  p  o f Woodworth  (1970)  =b(p)  n-*-°°, where F and  5.2.1  G are the Normal  (0,1)  <3? , o f t h e b i v a r i a t e n o r m a l d i s t r i b u t i o n F  and  a n  p  and  b  n  c o n v e r g e i n q u a d r a t i c mean t o t h e  f u n c t i o n s a and  b,  it  to f i n d the  i s necessary  -1  respectively.  square i n t e g r a b l e  In a d d i t i o n t o t h i s  f u n c t i o n f (t) w h i c h  limit,  satisfies,  h  n  l o g ( 1 - F (n t))->--f ( t ) .  E m p l o y i n g t h e o r e m 3, Woodworth  2  n  f ( t ) may and  be  found  as  follows.  I f there e x i s t s  a f u n c t i o n s (u),isuch t h a t *ex CA(a(u)b(v)-s(v))]  _  P  /exp[A(a(u)b(v')-s(v'))]  d  u  =  1  Q  <  v  <  (1970), A>0  a constant  1  dv'  and ' fa (u)b (v) exp[ A (a (u)b (v) -s (v) )] dv  d  5.2.3  t  /expLA(a(u)b(v')-s(v'))]dv' then It  f ( t ) = A ( t - / s ( v ) d v ) - / l o g { / e x p [ A ( a ( u ) b ( v ) - s ( v ) ) ] dv}du.  i s not  5.2.3.  always simple The  difficult  exact  slope  Under t h e n u l l  The b(p)  (0,1), as  is  i s g i v e n by  statistics  exact given  5.2.2  and  have t o s u f f i c e when i t i s  t h i s method.  H :p=0, T Q  s t a t e d i n s e c t i o n 5.2  5.2.1.  of  I f f ( t ) can  be  2f(b(p)).  
hypothesis  approximate slope of  compute t h e  s o l u t i o n (A,s)  the  approximate slope w i l l  t o s o l v e f o r f ( t ) by  found, the  Normal  to find  5.2.4  ^ ^ n - 1 ^" ' T  S  n  The  i s asymptotically  n  and t  n  e  r  e  a=l f  o  r  e  i n B ( i i ) of '  above r e s u l t s w i l l  k  2  (p)  now  be  where used  (where p o s s i b l e ) o r a p p r o x i m a t e s l o p e s o f i n Chapter  3.  2.2.  to the  31 1.  Fisher-Yates  solves  normal s c o r e s  statistic  (T^ ).  Woodworth  n  5.2.2 a n d 5.2.3 f o r t h e F i s h e r - Y a t e s  s t a t i s t i c and  2 f ( t ) = - % l o g ( 1 - t ). The e x a c t s l o p e 2 5" therefore,c^(p)=-log(1-p ) since l i m f / is  consequently is,  finds  T  (1970)  n  CO  of { f } T  n  n  =  i  2  n  n->-°o //xy  (2TT)" ( l - p ) 1  2.  Quadrant  lim n^  2  T  0 0  exp[-35(l-p ) 2  sum  ( g )•  - 1  (x +y -2pxy)] dxdy=p. 2  2  Applying  T  n  5.2.1 t o t h e q u a d r a n t sum,  /n =ff  s i g n ($ ( x ) - % ) s i g n ($ ( y ) - % ) (2TT) (1-p ) , L ,i 2. - J 5 , 2^ 2 .., , exp(-%(l-p ) (x +y -2pxy) )dxdy. hand s i d e o f t h i s e q u a t i o n h a s a l r e a d y b e e n  h  q  n  _  1  2  0  The as  _ J 5  right  (2TT " S a r c s i n p . The  t o 5.2.2 and 5.2.3 may be f o u n d  5.2.2 i s t h e n  0<v<l. 5.2.3  (Woodworth 1970)  solution  s(u)=0.  evaluated  identically  I t remains t o f i n d  equal  A by s o l v i n g  by  letting  t o 1 f o r a l l A>0 a n d 5.2.3.  When  s(u)=0,  becomes ./ s i g n (u-h) s i g n (v-h) exp[ A ( s i g n (u-h) s i g n (v-h) )1 dv l  du  / exp[A ( s i g n (u-%) s i g n (v'-h) )~] dv' =(e -e~ )/(e +e~ )=tanhA A  This  gives  X  A  A  A=tanh (t)=%log(1+t/l-t). _ 1  Finally,  Jjtlogd+t/l-tJ-logf^d+t/l-tj'^+Jsd+t/l-t) Combining  results,  the exact  slope r  —1  a r c s i n p l o g (£ ( p ) )-21og[jJj5  E, ( p ) =1+2TT  ^arcsinp  1-2TT  ^"arcsinp  q  3. lim  T-  /n =//  5.2.4.  2  sum i s ,  J- - i  (p) +hK ( p ) j , where 2  .  
Spearman's r h o (T 3s  ] from  f o r the quadrant - J-  c (p)=2iT  3 5  f(t)=  ).  Consider  12(u-3s) (v-h) (2TT)  _  1  Spearman's  test.  ( 1 - p ) " ^ e x p (~h ( 1 - p ) " 2  2  %  (x +y -2pxy) ) 2  2  dxdy =6TT  Although,  - 1  arctan  (p  (4-p ) 2  - 3 s  ) .  (Woodworth 1970)  5.2.2 a n d 5.2.3 a r e i n t r a c t a b l e  f o r Spearman's  case,  1 2 ( v - h ) [v-h) may be a p p r o x i m a t e d by a f u n c t i o n f o r w h i c h t h e  32 equations  can  be  s o l v e d , p r o v i d i n g an  (Woodworth 1 9 7 0 ) . expansion  For  t close to  (Woodworth 1970)  nonnegligible  terms i n t h i s  presents  a t a b l e of  of  approximations.  these  c (p)=2f(6Tr  -1  s  of  p  and  may  approximation  1,  f ( t ) has  be  a p p r o x i m a t e d by  expansion.  this  2 -J-a r c t a n ( p ( 4 - p ) ) ), may  the  (1970)  o f t , making  t a b l e the  be  2  series  Woodworth  f ( t ) f o r various values Using  a  to f ( t )  exact  estimated  use  slope,  f o r given  values  .  4.  Kendall's  tau  (t ). n  t h e methods o f t h i s c (p),  can  t  be  s e c t i o n are  found,  Where f ( t ) i s t h e  Since Kendall's not  however, as  solution  tau i s not  applicable.  The  c (p)=2f(b (p)) f c  linear, exact  slope,  (Woodworth  f c  1970)  of  f (t)=%At+%A+logU/fe -l)) A  t and  = l + 4 r[ ^rX_., x ( e_x- l ),"x --1 dx A  b ( p ) = 4//F  dE  t  Woodworth  -1=2TT  p  (1970) has  for given values  5.  components and  Blum-Kiefer-Rosenblatt been p o s s i b l e t o f i n d case  and,  From  C r e t a n (p ( 1 - p )  of  linear  statistic  therefore, only  .020<t<.996 and  c ( p ) may f c  be  p.  related  the  .  2  tabled f(t) for  computed The  -A]/;  L  (z  solution the  ., njk to  rank s t a t i s t i c s and  T  5.2.2  approximate  ., ) . njk and  of  the  I t has  5.2.3  in  slope w i l l  be  not  this given.  
5.2.1 lim  T - / n = / / 2cos ( T T j $ ( x ) ) c o s (Trk<2> (y) )  dE  2  n  k  n-y  Expanding E  p  about  p=0,  to get  an  approximation  to F  for small  p  Pi  lim T n-»-«> If lim n  -)-oo  n i k  the T  /n^=//  2cos  (TTJ$  3  substitution . ,7/ n  nik J  2  reduces  (x) ) c o s (7rk$ (y) ) ( 2 7 T ) e x p (-J5 ( x + y ) ) (1+pxy) dxdy. - 1  u = $ ( x ) , v=$(y) i s made i n t h e to  2  2  expression  above,  33 1  2pf  The cfj  -1 -1 cos (TTJU) $ (u) du / c o s (irkv) $ 1  approximate ) :  3  slope,  (v) dv.  (a) c , (p), is  (p)=4p |7 cos (TTJU) $ 2 2 2 = 4 p ^ I I, j k 2  _ 1  (u) d u l f /  c o s (irkv) $  2  (v)  _ 1  dvl  2  Z  where 1^ was d e f i n e d Abrahamson  i n section  (1965) g i v e s  Blum-Kiefer-Rosenblatt (p)=12  c ^ It  s h o u l d be p o i n t e d  in  contrast  c  comparison  1  the approximate  statistic  Tr p 2  out that  o f the approximate  efficiencies  of the o v e r a l l  0.  b a s e d on B  previously Bahadur  slope  i s noted here.  the t e s t  t o the other t e s t s  Bahadur  which  f o r p near  2  be made w i t h t h e o t h e r Bahadur The  5.1.  i s two s i d e d  n  discussed.  efficiency  A  of B  direct  cannot  n  efficiencies. of the nonparametric t e s t s  relative  t o t h e t e s t b a s e d on r , may now be computed b y d i v i d i n g t h e s l o p e s by c^ip) Fisher-Yates statistic Quadrant  .  These  normal  a r e summarized  B^ ( p ) = l  scores  sum  below,  •  B  ( p ) = - l o g " ( (l l- -p p )) [[ 22 iT rr " < a r c s i n p  logXS Spearman's r h o  B  s  tau  B  t r  (p)=-log  -_ 11  Blum-Kiefer-Rosenblatt s t a t i s t i c (overall) " -1 £ J p J =1+2TT arcsinp  . . B^  a )  . . .  %  (( 11 -- pp ))  J « J s  (p)]  ]  c c(p)  22  s  - 1 , , 2, ( l - p ^ ) c (p) x  B^(p)=4(l-  njk  T  1  -h .  5  (p)=-log  r  ii  (p))-21og[j ?" (p)+%c;  3  Kendall's  2 2  1  a r  1  t  2 P  )I  2  9  I  2  9  ( p ) =12" TT ( 1 - p ) = . 
82 ( 1 - p )  2  3 C  s'  c  1-2TT t a  The rapidly that  T n  r  e  n  o  arcsinp 9 i t  Pitman  v  e  n  . explicitly  and a p p r o x i m a t e  d e c r e a s e w i t h j a n d k. be u s e d i n t e s t i n g  but are tabled Bahadur  by Woodworth  efficiencies  For this  of T ^ n  (1970) k  reason i t i s suggested  f o r independence.  34 5.3  A Monte C a r l o C o m p a r i s o n The  at  size  a=.05.  o f e a c h t e s t , was  The  each s t a t i s t i c size  n.  fixed  throughout  c r i t i c a l v a l u e s were f i r s t by  g e n e r a t i n g 1000  the n u l l  independent  were i n d e p e n d e n t  Normal  distribution, (0,1).  The  study,  estimated f o r samples  In e a c h s a m p l e , t h e n o b s e r v a t i o n s  were drawn f r o m  the  of  (X^,Y^), i = l , 2 . . . n ,  t h a t i s t h e X's  statistics  and  of Chapter  Spearman's r h o , K e n d a l l ' s t a u , t h e F i s h e r - Y a t e s N o r m a l coefficient, and  T n  j_j_  the quadrant  sum,  n observations. were o r d e r e d . smallest  The  observed  resulting  number s u c h or equal  Where t a b l e s a r e a v a i l a b l e numerically For 3.2  the  c a n be  were a l l computed 1000  t h a t the  then  taken  relative  t o i t , was  the  from  v a l u e s o f each  c r i t i c a l v a l u e was  values g r e a t e r than  less  compared w i t h e x a c t  (U  ^ ) ) .  the of a.  obtained  values. of  section  instead of  1  1  same  or equal to  a p p r o x i m a t e s c o r e s , $ ( i / ( n + 1 ) ) , were u s e d s c o r e s , E ($  statistic  statistic  t o be  than  c r i t i c a l values  the  frequency  the F i s h e r - Y a t e s normal s c o r e s s t a t i s t i c  the exact  scores  component) , as w e l l as t h e o r d i n a r y  coefficient The  3:  the Blum-Kiefer-Rosenblatt  (related to i t s f i r s t  sample c o r r e l a t i o n  Y's  T h i s e l i m i n a t e d the  need  n for  a t a b l e of normal s c o r e s .  as n+°°. 
The approximation of the exact scores by $\Phi^{-1}(i/(n+1))$ improves as $n\to\infty$ (Hajek and Sidak 1967).

The Blum-Kiefer-Rosenblatt test is inherently two sided. To make a fair comparison with the other tests it is necessary that they also be two sided. Since the alternative of interest is $H_1:\rho>0$, only positive $\rho$'s were used in the power calculations, and it will be assumed that the appropriate one sided tests with $\alpha=.025$ are good approximations to the corresponding two sided tests with $\alpha=.05$.

The Monte Carlo critical values as well as some exact values are given in Table I.

After the critical values were computed, a power study was carried out. Again, 1000 samples of each size, $n=10$, 24 and 50, were generated. This time the samples were generated from the bivariate normal distribution (with 0 means and unit variances) with $\rho=.1$, .25 and .5. The additional values $\rho=.75$ and .9 for $n=10$, $\rho=.4$ and .6 for $n=24$, and $\rho=.15$, .3 and .4 for $n=50$ were included to give a better picture of the power curve. All tests were performed for each sample and a running count of the number of times $H_0$ was rejected was kept for each test. The final count divided by 1000 estimates the power. The results of these computations are given in Table II. The powers for both one sided and two sided tests, with $\alpha=.05$, are recorded for all statistics except the Blum-Kiefer-Rosenblatt statistic. Only the .05 level two sided test was performed for the latter.

Computations were done at the University of British Columbia Computing Centre on an IBM 360 computer. The normal observations were generated by a built-in routine which transforms independent normal observations, produced by Marsaglia's rectangle-wedge-tail method, into observations from the desired bivariate normal distribution. A different random starting value was used to generate the 1000 samples for each $n$ and $\rho$ combination.

Table I
Critical values derived from a Monte Carlo experiment

n=10

Test                        Critical value    Exact critical value    α
                            (Monte Carlo)
Correlation coefficient     2.2660            2.3060                  .025
                            1.8939            1.8595                  .05
Kendall's tau               0.5111            0.5111                  .025
                            0.4222            0.4667                  .05
Spearman's rho              0.6242            0.6485                  .025
                            0.5273            0.5636                  .05
Quadrant sum*               1.0000            1.0000                  .025
                            1.0000            1.0000                  .05
Fisher-Yates coefficient    1.2069            -                       .025
                            1.0661            -                       .05
T_n11                       1.7544            -                       .025
                            1.5039            -                       .05
Blum-Kiefer-Rosenblatt      0.00803           -                       .05
statistic

n=24

Test                        Critical value    Exact critical value    α
                            (Monte Carlo)
Correlation coefficient     2.0350            2.0739                  .025
                            1.6942            1.7171                  .05
Kendall's tau               0.2826            0.2899                  .025
                            0.2319            0.2464                  .05
Spearman's rho              0.3948            -                       .025
                            0.3313            -                       .05
Quadrant sum*               0.5000            0.5000                  .025
                            0.5000            0.5000                  .05
Fisher-Yates coefficient    1.4778            -                       .025
                            1.2831            -                       .05
T_n11                       1.9306            -                       .025
                            1.5563            -                       .05
Blum-Kiefer-Rosenblatt      0.00278           -                       .05
statistic

References for exact values: Blomquist (1950), Odeh et al (1977).

* Because of the discrete nature of the statistic, the .05 and .025 critical values of the quadrant sum are the same. This is also reflected in equal powers for the corresponding entries in Table II.
Table I (continued)

n=50

Test                        Critical value    Exact critical value    α
                            (Monte Carlo)
Correlation coefficient     2.3116            2.0106                  .025
                            1.7523            1.6772                  .05
Kendall's tau               0.2033            -                       .025
                            0.1657            -                       .05
Spearman's rho              0.3007            -                       .025
                            0.2400            -                       .05
Quadrant sum                0.3600            0.3600                  .025
                            0.2800            0.2800                  .05
Fisher-Yates coefficient    1.8486            -                       .025
                            1.4817            -                       .05
T_n11                       1.9738            -                       .025
                            1.7134            -                       .05
Blum-Kiefer-Rosenblatt      0.00133           -                       .05
statistic
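The power study of Section 5.3 can be illustrated with a short Python sketch: draw samples from a standard bivariate normal distribution with correlation $\rho$, compute a test statistic, and record the fraction of samples in which it reaches the critical value. Several things here are assumptions made for illustration rather than details taken from the study: the function names are hypothetical, Python's `random.gauss` stands in for Marsaglia's rectangle-wedge-tail generator, the linear transform $Y=\rho Z_1+\sqrt{1-\rho^2}\,Z_2$ is one standard way of inducing the correlation and is not confirmed to be the routine that was used, and Kendall's tau is chosen only as an example statistic.

```python
import math
import random


def bivariate_normal_sample(n, rho, rng):
    """Draw n pairs (X, Y) with standard normal margins and correlation rho."""
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        # Linear transform of independent normals; gives Corr(X, Y) = rho.
        pairs.append((z1, rho * z1 + math.sqrt(1.0 - rho * rho) * z2))
    return pairs


def kendall_tau(pairs):
    """Kendall's tau: (concordant - discordant) / number of pairs, no ties assumed."""
    n = len(pairs)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (pairs[i][0] - pairs[j][0]) * (pairs[i][1] - pairs[j][1])
            s += (product > 0) - (product < 0)  # +1 concordant, -1 discordant
    return s / (n * (n - 1) / 2)


def estimate_power(statistic, critical_value, n, rho, replications=1000, seed=0):
    """Fraction of simulated samples in which statistic >= critical_value."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(replications):
        if statistic(bivariate_normal_sample(n, rho, rng)) >= critical_value:
            rejections += 1
    return rejections / replications
```

A call such as `estimate_power(kendall_tau, 19 / 45, n=10, rho=0.5)` uses the Monte Carlo critical value 0.4222 of Table I written as the fraction 19/45. Passing the unrounded fraction matters because tabled values are rounded: 21/45 rounds up to 0.4667, so comparing a computed tau of 21/45 against the rounded figure would wrongly fail to reject.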
