UBC Theses and Dissertations

Some optimum properties and applications of Stein's test Knight, William Rixford 1957

SOME OPTIMUM PROPERTIES AND APPLICATIONS OF STEIN'S TEST

by William Rixford Knight

A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF ARTS in the Department of MATHEMATICS

We accept this thesis as conforming to the standard required from candidates for the degree of MASTER OF ARTS.

Members of the Department of MATHEMATICS

THE UNIVERSITY OF BRITISH COLUMBIA
April, 1957

ABSTRACT

Stein's test tests the homogeneity of means of a set of normal distributions with the homoscedasticity property by means of a sequential sampling procedure in two stages, the results of the first sample determining the size of the second. By means of this procedure a test the power of which is independent of the variance is possible. Certain extensions of Stein's test are obtained. Some optimum properties for such two stage sequential procedures are proposed, and it is shown that the tests satisfying these optimum properties are essentially the same as Stein's test.

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the Head of my Department or by his representative. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of Mathematics
The University of British Columbia, Vancouver 8, Canada.
ACKNOWLEDGEMENT

I wish to express my thanks to Dr. Stanley W. Nash of the Department of Mathematics for suggesting this topic and for his advice during the preparation of this thesis. I also extend my thanks to Dr. B. Moyls for his assistance in preparation of the final manuscript.

TABLE OF CONTENTS

INTRODUCTION
SUMMARY
CHAPTER ONE: Variations on a Theme by Stein
CHAPTER TWO: Some Optimum Properties
CHAPTER THREE: A Multivariate Extension
BIBLIOGRAPHY

INTRODUCTION

HISTORY

One of the classic problems in statistics is the making of statistical statements about the mean of a normal distribution, the mean and variance of which are both unknown. In this connection there are two questions which are commonly asked: Given some set of observations from this population, what is a good confidence interval for the mean? Given a set of observations, how is some hypothesis, for example that the mean is zero, to be tested?

These questions were originally answered by Gosset, and his solution is now well known to statisticians as the familiar "Student's t-test". The "t-test" yields a confidence interval of the form

$$\bar{x} - s t_\alpha \le m \le \bar{x} + s t_\alpha,$$

where $m$ is the true mean, $\bar{x}$ the sample mean, $s$ the sample standard error of $\bar{x}$, and $t_\alpha$ a percentage point in the t distribution. The length of this interval is $2 s t_\alpha$. This is a random variable which may be small, this being usually regarded as desirable, or which may be large, this being usually regarded as undesirable. This means that whether a given sample will specify the mean with the needed accuracy is a random happening over which the experimenter has no control.
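The randomness of the length $2 s t_\alpha$ can be seen in a small simulation. The following is a minimal sketch and not part of the thesis: the function name `t_interval`, the sample size of 10, the value of sigma, and the seed are all assumptions chosen for illustration, and the percentage point 2.262 is the tabulated two-sided 95% value for 9 degrees of freedom.

```python
import random
import statistics

def t_interval(sample, t_crit):
    """Return (low, high) for the interval x_bar -/+ s * t_crit,
    where s is the estimated standard error of the sample mean."""
    n = len(sample)
    x_bar = statistics.fmean(sample)
    s = statistics.stdev(sample) / n ** 0.5  # standard error of x_bar
    return x_bar - s * t_crit, x_bar + s * t_crit

# Assumed illustration values: 10 observations, 95% two-sided interval.
T_975_9DF = 2.262  # tabulated t percentage point, 9 degrees of freedom

rng = random.Random(1957)  # arbitrary seed, for reproducibility only
lengths = []
for _ in range(5):
    sample = [rng.gauss(0.0, 2.0) for _ in range(10)]
    low, high = t_interval(sample, T_975_9DF)
    lengths.append(high - low)

# Sample size and sigma are fixed, yet the length changes with each sample.
print([round(length, 2) for length in lengths])
```

Each run of the loop uses the same population and the same sample size, so any difference between the printed lengths comes from the sample standard error alone.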
An analogous situation arises when an hypothesis is tested by means of the "t-test". If the null hypothesis is not true, the probability that the experimenter will discover this is dependent on $\sigma^2$, which is not known.

For these reasons, the need has been felt for a scheme which would yield confidence regions the size of which could be determined in advance, and tests of which the power against a given alternative could be determined in advance.

In 1940, Dantzig showed that no non-trivial test, the power of which was independent of $\sigma$, exists. (An example of a trivial test would be to decide by tossing a coin.) From this it follows that no scheme yielding a confidence interval with size independent of $\sigma$ exists. (An exception must again be made in the case of the trivial zero and infinite intervals.)

With schemes of fixed sample size eliminated, there remains the possibility of using a sequential scheme in which the sample size is random. Such a scheme was devised by Stein in 1945.

Stein's method works roughly as follows: Take an initial sample. Use this sample to estimate the variance of the population. Next, take a second sample, the size of which is determined by the estimated population variance: a large sample if the variance is estimated to be large, a small sample if the estimated variance is small. The confidence interval is centered at a weighted average of the means of the initial and second samples.

Stein also generalized this method to include a large class of tests of linear hypotheses, among them all block designs with an equal number of replications in each block.
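The two-stage idea outlined above can be sketched for a single group. This is a minimal sketch under stated assumptions, not Stein's or the author's own computation: it uses the sample-size rule and the weighting equation $a^2/n^{(0)} + (1-a)^2/n^{(1)} = z/s^2$ as they are stated in Chapter One of this thesis, while the function names, the `draw_more` callback, and the choice of the larger root for `a` are hypothetical choices made here (the thesis says either root will do). It assumes the first-sample variance is positive.

```python
import math
import statistics

def second_sample_size(s2, z, n0):
    """Least integer n1 >= 1 with 1/(n0 + n1) <= z/s2,
    i.e. the least integer not less than max(1, s2/z - n0)."""
    return max(1, math.ceil(s2 / z - n0))

def stein_weight(s2, z, n0, n1):
    """One real root a of a^2/n0 + (1 - a)^2/n1 = z/s2.
    Cleared of fractions: (n0 + n1) a^2 - 2 n0 a + n0 (1 - n1 z/s2) = 0."""
    disc = n0 * n0 - (n0 + n1) * n0 * (1.0 - n1 * z / s2)
    disc = max(disc, 0.0)  # guard against rounding just below zero
    return (n0 + math.sqrt(disc)) / (n0 + n1)  # the larger root, chosen arbitrarily

def stein_interval(first_sample, draw_more, z, t_crit):
    """Two-stage interval of fixed length 2 * sqrt(z) * t_crit.
    draw_more(n) is a hypothetical callback returning n new observations;
    t_crit is a t percentage point with len(first_sample) - 1 degrees of freedom."""
    n0 = len(first_sample)
    s2 = statistics.variance(first_sample)
    n1 = second_sample_size(s2, z, n0)
    second_sample = draw_more(n1)
    a = stein_weight(s2, z, n0, n1)
    x_a = a * statistics.fmean(first_sample) + (1.0 - a) * statistics.fmean(second_sample)
    half = math.sqrt(z) * t_crit
    return x_a - half, x_a + half, n1
```

The half-length `sqrt(z) * t_crit` does not involve the data, which is the point of the scheme: only the second sample size `n1` reacts to the estimated variance.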
The discussion herein covers certain extensions of Stein's test and certain optimum properties of tests similar to Stein's. A more complete summary follows.

SUMMARY

CHAPTER ONE

Chapter one mainly reiterates Stein's results. In addition, certain extensions are developed, among them Stein type tests in which different sized samples are taken from each group, Stein type tests in which the range rather than the standard deviation of the initial sample is used to estimate the population variance, and Stein type tests of multiple hypotheses comparable to, for example, Scheffe's test. A general method of converting "Studentized" tests into Stein type tests is described.

CHAPTER TWO

Chapter two is devoted to the following question: Are there other schemes for obtaining a confidence region of fixed size which do the job better than Stein's method? This is approached as a problem in estimation theory: how is the mean to be estimated with a minimum loss or error? A minimax criterion is devised; the set of two stage sampling estimation methods satisfying this criterion is defined, and Stein's method is found to be among them.

CHAPTER THREE

Chapter three extends some of Stein's results to the multivariate case. The method yields an ellipsoidal confidence region for the means the maximum axis of which can be determined in advance, or an ellipsoidal confidence region the volume of which can be determined in advance. It is shown that the distributions involved are the same as those which arise in connection with certain single sample tests in the multivariate case. Unfortunately, the distributions of the statistics used in these single sample tests are still unknown in many cases.
CHAPTER ONE: VARIATIONS ON A THEME BY STEIN

INTRODUCTION

The following model is used. A set of $\gamma$ independent normal populations with the same variance, $\sigma^2$, but different means, $m_i$, is given. From these populations observations, $x_{i,k}$, are taken; the $i$ subscripts run from 1 to $\gamma$ and denote the population from which the observation was drawn; the $k$ subscripts denote the replication and run from 1 to the total number of replications. These populations might be, for example, the groups in a block design.

Suppose that both the variance, $\sigma^2$, and the means, $m_i$, are unknown, and it is wished to obtain a confidence region for some of the $m_i$. When there is only one group ($\gamma = 1$) this can be done by means of the well known "t-test". When there is more than one group ($\gamma > 1$) a spherical confidence region for some or all of the $m_i$ may be obtained by use of the analysis of variance.

A related problem is to obtain a statistical test for some hypothesis about the $m_i$, for example, that a given subset of the means are equal, when both the population variance and the population means are unknown. The standard method of doing this is to use "Student's t-test" when $\gamma = 2$ and the analysis of variance when more groups are involved.

STEIN'S TEST

A test, similar to the analysis of variance, but involving two stages of sampling rather than one, has been devised by Stein.

The size of the confidence region obtained by use of the analysis of variance is a random variable over which the experimenter has little control.
By using a two stage sampling scheme Stein was able to obtain a confidence region for the $m_i$ the size of which could be determined in advance of the experiment even though $\sigma^2$ was unknown, or, to look at the other side of the coin, tests of hypotheses concerning the $m_i$ the power of which was independent of $\sigma^2$.

Stein's method will be briefly described. Select in advance of the experiment a real positive number, $z$; this number and the nature of the experimental design determine the size of the confidence region (or the power of the test) to be obtained. An initial sample of size $n^{(0)}$ is taken from each of the $\gamma$ groups, and its residual variance, $s^2$, is computed. A second sample of size $n^{(1)}$ is then taken from each of the $\gamma$ groups, where $n^{(1)}$ is determined as follows: Consider the equation in $a$,

$$a^2/n^{(0)} + (1-a)^2/n^{(1)} = z/s^2.$$

The second sample size, $n^{(1)}$, is taken to be the least natural number such that this equation permits a real root; one of these roots (it does not matter which) will be called "a". A little calculation will show that $n^{(1)}$ is always defined provided $n^{(0)}$ is not zero. In fact, $n^{(1)}$ is the least integer not less than $\mathrm{Max}(1,\ s^2/z - n^{(0)})$.

Next, compute the group means, $x^{(0)}_i$, for the initial sample, and the group means, $x^{(1)}_i$, for the second sample. Define $x^{(a)}_i$ by

$$x^{(a)}_i = a\,x^{(0)}_i + (1-a)\,x^{(1)}_i.$$

Stein showed that for any $i$

$$(x^{(a)}_i - m_i)/z^{1/2}$$

is distributed according to "Student's" t distribution with as many degrees of freedom as $s^2$, and that, when the $m_i$ are all equal,

$$\frac{1}{\gamma-1}\sum_i (x^{(a)}_i - x^{(a)}_\cdot)^2 / z$$

is F distributed with $\gamma-1$, $\gamma(n^{(0)}-1)$ degrees of freedom. $x^{(a)}_\cdot$ is the average of the $x^{(a)}_i$ over $i$.

"STEINIZATION"

In some sense, Stein's test is an extension of the t and F tests.
An extension of this sort will hereafter be spoken of as the "Steinized" version of the corresponding one sample test, and the process of obtaining such a test from a one sample test will be spoken of as "Steinization".

This chapter is devoted to the derivation of a general method of "Steinization" of a certain class of tests for the means of a set of homoscedastic normal distributions. With this method most "Studentized" tests may be further "Steinized". The method also covers tests in which the range, or any other linear unbiased order statistic estimator of the population standard deviation, is used in the denominator as the sample standard deviation is used in the "t-test". Tests of multiple hypotheses such as Scheffe's test can also be "Steinized" by this method.

NOTATION

The following supplements to the notation already developed will be used.

Since it is possible that the experimenter may wish for greater accuracy in the measurement of some groups than others, a separate $z$ is chosen for each group. The experimenter may, if so inclined, select all the $z$'s to be the same, of course. Each $z$ must now have an $i$ subscript, and therefore so must the $n$'s and the $a$'s. Thus we get $z_i$, $n^{(0)}_i$, $n^{(1)}_i$, and $a_i$.

In this chapter the symbol, $s$, is not necessarily used to represent the standard deviation, or the square root of the residual variance of the initial sample, but represents any random variable which is statistically independent of the $x^{(0)}_i$ and the $x_{i,k}$ for $k > n^{(0)}_i$, and the distribution of which, $w(s^2, \sigma^2)$, does not depend on the $m_i$ but may, and usually will, depend upon $\sigma^2$. In any practical application, $s$ will be a function of the observations in the initial sample.
Note that the requirements for $s^2$ are satisfied by the residual variance of the initial sample, the square of the range of the initial sample, etc., not to mention the toss of a coin.

The notation to be used is summarized below.

$\gamma$ : the number of groups ($i = 1, \dots, \gamma$)
$z_i$ : predetermined constants
$x_{i,k}$ : random variables distributed independently $N(m_i, \sigma^2)$
$n^{(0)}_i$ : size of the initial sample from the $i$-th group
$n^{(1)}_i$ : size of the second sample from the $i$-th group, being the least positive integer for which the equation in $a_i$, $z_i/s^2 = a_i^2/n^{(0)}_i + (1-a_i)^2/n^{(1)}_i$, permits a real root, which is called $a_i$
$a_i$ : as defined above
$n_i = n^{(0)}_i + n^{(1)}_i$
$x^{(0)}_i = \frac{1}{n^{(0)}_i} \sum_{k=1}^{n^{(0)}_i} x_{i,k}$
$x^{(1)}_i = \frac{1}{n^{(1)}_i} \sum_{k=n^{(0)}_i+1}^{n_i} x_{i,k}$
$x^{(a)}_i = a_i x^{(0)}_i + (1-a_i) x^{(1)}_i$
$s$ : a random variable distributed independently of the $x^{(0)}_i$, of the $x_{i,k}$ for $k > n^{(0)}_i$, and of the $m_i$, with probability measure $w(s^2, \sigma^2)$ as defined above

LEMMA 1.1

The conditional distribution of $x^{(a)}_i$ given $s$ is $N(m_i, \sigma^2 z_i/s^2)$; moreover the $x^{(a)}_i$ are distributed independently of each other.

THEOREM 1.2

The joint distribution of the $(x^{(a)}_i - m_i)/z_i^{1/2}$ is the same as the joint distribution of the $(x^{(0)}_i - m_i)(n^{(0)}_i)^{1/2}/s$.

Proof: There is no loss in generality in setting all the $m_i$ equal to zero. It will be shown that the characteristic functions of the two distributions are the same.

The characteristic function of the two sample distribution is

$$\int \prod_i \left[ \int \exp\!\left( i\,\tau_i x^{(a)}_i / z_i^{1/2} \right) \left( \frac{s^2}{2\pi\sigma^2 z_i} \right)^{1/2} \exp\!\left( -\frac{s^2 (x^{(a)}_i)^2}{2\sigma^2 z_i} \right) dx^{(a)}_i \right] dw(s^2, \sigma^2). \qquad \text{(i)}$$

(The variances of the $x^{(a)}_i$ are obtained from lemma 1.1.) Make the change of variable

$$dt_i = \left( s^2 / n^{(0)}_i z_i \right)^{1/2} dx^{(a)}_i$$

and (i) becomes

$$\int \prod_i \left[ \int \exp\!\left( i\,\tau_i t_i (n^{(0)}_i)^{1/2} / s \right) \left( \frac{n^{(0)}_i}{2\pi\sigma^2} \right)^{1/2} \exp\!\left( -\frac{n^{(0)}_i t_i^2}{2\sigma^2} \right) dt_i \right] dw(s^2, \sigma^2). \qquad \text{(ii)}$$

The characteristic function of the single sample distribution is

$$\int \prod_i \left[ \int \exp\!\left( i\,\tau_i x^{(0)}_i (n^{(0)}_i)^{1/2} / s \right) \left( \frac{n^{(0)}_i}{2\pi\sigma^2} \right)^{1/2} \exp\!\left( -\frac{n^{(0)}_i (x^{(0)}_i)^2}{2\sigma^2} \right) dx^{(0)}_i \right] dw(s^2, \sigma^2). \qquad \text{(iii)}$$

Equations (ii) and (iii) define the same function of the $\tau_i$.

COROLLARY

The joint distribution of any set of integrable functions of the $(x^{(a)}_i - m_i)/z_i^{1/2}$, say $\{g((x^{(a)}_i - m_i)/z_i^{1/2})\}$, is the same as that of $\{g((x^{(0)}_i - m_i)(n^{(0)}_i)^{1/2}/s)\}$.

THEOREM 1.3

If $s/\sigma$ is distributed independently of $\sigma$, then the joint distribution of the $x^{(a)}_i/z_i^{1/2}$ is independent of $\sigma$.

Proof: It will be shown that the characteristic function is independent of $\sigma$. The characteristic function is

$$\int \prod_i \exp\!\left( i\,\tau_i m_i / z_i^{1/2} - \frac{\tau_i^2 \sigma^2}{2 s^2} \right) dw(s^2, \sigma^2). \qquad \text{(iv)}$$

Make the transformation $r = s/\sigma$, and (iv) becomes

$$\int \prod_i \exp\!\left( i\,\tau_i m_i / z_i^{1/2} - \frac{\tau_i^2}{2 r^2} \right) dw'(r), \qquad \text{(v)}$$

where $w'$ is the probability measure of $s/\sigma$. The above expression does not contain $\sigma$.

"STEINIZATION" METHODS

General "Steinization" methods, based on theorems 1.2 and 1.3, will now be presented.

CONFIDENCE REGIONS

Any single stage sampling scheme for obtaining a confidence region for the means of a set of homoscedastic normal distributions with the following properties can be "Steinized" to obtain a confidence region the size of which can be determined in advance.

1) An observation can be drawn at will from any particular distribution in the above set that the experimenter wishes.

2) The statistics used are functions of

$$(x^{(0)}_i - m_i)(n^{(0)}_i)^{1/2}/s$$

where $s$ is statistically independent of all $x^{(0)}_i$ and all $x_{i,k}$ if $k > n^{(0)}_i$.

3) The confidence region obtained is invariant under translation. This means that for any vector, $\{v_i\}$, the observations $\{x_{i,k}\}$ and $\{x_{i,k} + v_i\}$ yield the same confidence region displaced by $\{v_i\}$.

Such a confidence region is "Steinized" as follows. In the single sample case, the vector $\{m_i\}$ is in the confidence region if and only if the vector

$$\{(x^{(0)}_i - m_i)(n^{(0)}_i)^{1/2}/s\} \in G$$

for some set $G$.
$G$ is determined by the nature of the problem. In the "Steinized" case, the vector $\{m_i\}$ is in the confidence region if and only if the vector

$$\{(x^{(a)}_i - m_i)/z_i^{1/2}\} \in G$$

for the same $G$ as above.

Due to the translation invariance of the single stage sampling confidence region, the "Steinized" confidence region is of fixed size. This size may be controlled by selection of the $z$'s. It follows immediately from theorem 1.2 that the confidence coefficients of the single sample confidence region and of the "Steinized" confidence region are the same.

EXAMPLE: THE "T-TEST"

The "t-test" is used when $\gamma = 1$, so the $i$ subscripts can be dropped. The confidence region obtained in the case of the two tailed test is the set of all $m$ satisfying

$$(x^{(0)} - m)(n^{(0)})^{1/2}/s \in [-t_\alpha, t_\alpha],$$

where $s$ is the sample standard deviation and $t_\alpha$ is a percentage point in the t distribution with $n^{(0)} - 1$ degrees of freedom.

The corresponding "Steinized" confidence region is the set of all $m$ such that

$$(x^{(a)} - m)/z^{1/2} \in [-t_\alpha, t_\alpha].$$

That is to say, the confidence region is the interval

$$[\,x^{(a)} - z^{1/2} t_\alpha,\ x^{(a)} + z^{1/2} t_\alpha\,].$$

The length of the interval, $2 z^{1/2} t_\alpha$, can easily be selected by a proper choice of $z$.

EXAMPLE: THE "U-TEST"

The "u-test" is a test similar to the "t-test" save that the range rather than the standard deviation is used. Since there is but one group, the $i$ subscripts may be dropped. The confidence region obtained from the "u-test" is the set of all $m$ satisfying

$$u = (x^{(0)} - m)(n^{(0)})^{1/2} d/s \in [-u_\alpha, u_\alpha],$$

where $s$ is the sample range, $d$ is the expected sample range if $\sigma = 1$, and $u_\alpha$ is a percentage point in the distribution of $u$; percentage points in this distribution have been tabulated by Lord.

The corresponding "Steinized" confidence region is the set of all $m$ such that

$$u' = (x^{(a)} - m)\,d/z^{1/2} \in [-u_\alpha, u_\alpha].$$
The length of the confidence interval, $2 z^{1/2} u_\alpha / d$, can easily be selected by proper choice of $z$.

TESTS OF HYPOTHESES

Any single stage sampling scheme for testing the hypothesis, $\{m_i\} \in H$, with the following properties can be "Steinized" to yield a test the power of which is independent of $\sigma$.

1) An observation can be drawn from the distribution corresponding to any $i$ subscript that the experimenter wishes.

2) The statistic used is a function of $\{x^{(0)}_i (n^{(0)}_i)^{1/2}/s\}$,

$$g(\{x^{(0)}_i (n^{(0)}_i)^{1/2}/s\}),$$

where $s$ is statistically independent of all $x^{(0)}_i$ and all $x_{i,k}$ if $k > n^{(0)}_i$, and $s/\sigma$ is distributed without reference to either the $m_i$ or to $\sigma$. In many practical cases the function $g$ will be an F ratio.

3) For any $s$, for any vector $\{v_i\}$, and for any $\{m_i\}$ such that the hypothesis is true, that is, $\{m_i\} \in H$,

$$g(\{(v_i - m_i)(n^{(0)}_i)^{1/2}/s\}) = g(\{v_i (n^{(0)}_i)^{1/2}/s\}).$$

The reader can see that this condition is met in many practical tests, for example the analysis of variance. The condition is an invariance condition under certain translations and corresponds to the invariance condition in the section on confidence regions.

Such a test can be "Steinized" as follows: Take any set of $z_i$'s such that for any vector $\{v_i\}$,

$$g(\{(v_i - m_i)/z_i^{1/2}\}) = g(\{v_i/z_i^{1/2}\})$$

provided that the hypothesis is true, that is, $\{m_i\} \in H$. To show that such $z$'s exist, note that an example is obtained by taking $z_i$ proportional to $1/n^{(0)}_i$. Construct the $x^{(a)}_i$'s in the usual way.

In the case of the one sample test the hypothesis is accepted if

$$g(\{x^{(0)}_i (n^{(0)}_i)^{1/2}/s\})$$

satisfies certain conditions. In the case of the "Steinized" test the hypothesis is accepted if

$$g(\{x^{(a)}_i / z_i^{1/2}\})$$

satisfies the same conditions.
The "Steinized" test will have the same type one error as the one sample test. The type two error of the "Steinized" test will not depend on $\sigma$.

Proof: For the type one error, note that the following four statistics have the same distribution if the hypothesis is true.

i) $g(\{x^{(a)}_i / z_i^{1/2}\})$
ii) $g(\{(x^{(a)}_i - m_i)/z_i^{1/2}\})$
iii) $g(\{(x^{(0)}_i - m_i)(n^{(0)}_i)^{1/2}/s\})$
iv) $g(\{x^{(0)}_i (n^{(0)}_i)^{1/2}/s\})$

For the type two error, use theorem 1.3.

EXAMPLE: THE ANALYSIS OF VARIANCE, FACTORIAL DESIGN

The 2x2 factorial which is presented here is chosen merely because it is simpler to present a small design; the general procedure is the same in any factorial design.

In the factorial design the same number of observations must be made in each group; hence, in the single sample test, all the $n^{(0)}_i$'s are the same. The corresponding condition in the "Steinized" test is that all the $z$'s must be the same. (To be fully general, it is only necessary that the $n^{(0)}_i$'s satisfy certain proportionality restrictions. "Steinization" in this case is perfectly straightforward, although an algebraic mess.) This restriction on the $z$'s is necessary in order that the translation invariance be preserved. It is true, although not immediately obvious, that although there are restrictions on the $n^{(0)}_i$'s in the single sample test, there are no such restrictions on the $n^{(0)}_i$'s in the "Steinized" test, the restrictions being only on the $z$'s. The interpretation of this is that data missing from the first sample can be made up in the second sample, a circumstance which might be useful in practice.
The groups of the 2x2 factorial design will be numbered as follows:

    1  2
    3  4

The usual single sample tests are:

1) The row effects are taken to be zero if

$$\tfrac12 n^{(0)} (x^{(0)}_1 + x^{(0)}_2 - 2x^{(0)}_\cdot)^2/s^2 + \tfrac12 n^{(0)} (x^{(0)}_3 + x^{(0)}_4 - 2x^{(0)}_\cdot)^2/s^2 = F_{row} < F_\alpha.$$

2) The column effects are taken to be zero if

$$\tfrac12 n^{(0)} (x^{(0)}_1 + x^{(0)}_3 - 2x^{(0)}_\cdot)^2/s^2 + \tfrac12 n^{(0)} (x^{(0)}_2 + x^{(0)}_4 - 2x^{(0)}_\cdot)^2/s^2 = F_{col} < F_\alpha.$$

3) The interaction effects are taken to be zero if

$$\Big( \sum_i n^{(0)} (x^{(0)}_i - x^{(0)}_\cdot)^2 / s^2 \Big) - F_{row} - F_{col} = F_{int} < F_\alpha.$$

In the above $x^{(0)}_\cdot$ is the grand mean, $s^2$ is the residual variance, and $F_\alpha$ a percentage point in the F distribution.

The corresponding "Steinized" tests are:

1) The row effects are taken to be zero if

$$\tfrac12 (x^{(a)}_1 + x^{(a)}_2 - 2x^{(a)}_\cdot)^2/z + \tfrac12 (x^{(a)}_3 + x^{(a)}_4 - 2x^{(a)}_\cdot)^2/z = F'_{row} < F_\alpha.$$

2) The column effects are taken to be zero if

$$\tfrac12 (x^{(a)}_1 + x^{(a)}_3 - 2x^{(a)}_\cdot)^2/z + \tfrac12 (x^{(a)}_2 + x^{(a)}_4 - 2x^{(a)}_\cdot)^2/z = F'_{col} < F_\alpha.$$

3) The interaction effects are taken to be zero if

$$\Big( \sum_i (x^{(a)}_i - x^{(a)}_\cdot)^2 / z \Big) - F'_{row} - F'_{col} = F'_{int} < F_\alpha.$$

In the above $x^{(a)}_\cdot$ is the grand mean and $F_\alpha$ is the same as in the single sample case, namely the $\alpha$-th percentage point in the F distribution with 1, $\nu$ degrees of freedom, where $\nu$ is the number of degrees of freedom of the residual variance of the initial sample.

TESTS OF MULTIPLE HYPOTHESES

Tests of multiple hypotheses such as Scheffe's test, the multiple range tests, and the multiple F tests are treated in the same manner as tests of single hypotheses. All type one errors of the "Steinized" test will be the same as those of the single sample test. All type two errors of the "Steinized" test will be independent of $\sigma$.

EXAMPLE: SCHEFFE'S TEST

The analysis of variance enables one to look at all the data and say, for example, that not all the means are the same.
In actual practice it is often wished to look at the groups individually and decide which particular means are different from certain others. One might wish to know if $m_1$ were different from $m_2$, that is, is $m_1 - m_2 = 0$? If $m_1$ and $m_2$ are found to be the same, it might be asked if they are the same as some third mean, $m_3$; in other words, is $m_1 + m_2 - 2m_3 = 0$? A contrast is defined as any $\gamma$ dimensional vector orthogonal to $(1, 1, 1, \dots, 1)$. The two questions just asked concern the contrasts $(1, -1, 0, 0, \dots, 0)$ and $(1, 1, -2, 0, 0, \dots, 0)$.

By means of Scheffe's test it is possible to make statements concerning any number of contrasts with probability $1 - \alpha$ that all of them are correct. Scheffe's test is as follows. Let $\{c_i\}$ be any contrast and $s_c$ the standard error of $\sum_i c_i x^{(0)}_i$ as estimated from the residual variance of the analysis of variance. The probability that there is any contrast, $\{c_i\}$, such that

$$\sum_i m_i c_i = 0$$

and

$$\Big| \sum_i c_i x^{(0)}_i \Big| \Big/ s_c > \big( (\gamma - 1) F_\alpha \big)^{1/2}$$

is not greater than $\alpha$. In the above $F_\alpha$ is the $\alpha$-th percentage point in the F distribution with $\gamma - 1$, $\nu$ degrees of freedom, where $s_c$ has $\nu$ degrees of freedom. Thus a contrast is considered significant if the inequality above holds.

To "Steinize" Scheffe's test, merely replace the single sample inequality by

$$\Big| \sum_i c_i x^{(a)}_i \Big| \Big/ \Big( \sum_i c_i^2 z_i \Big)^{1/2} > \big( (\gamma - 1) F_\alpha \big)^{1/2}.$$

IN BRIEF

The reader may find the following rule of thumb useful. Although not complete or altogether rigorous, it usually works; moreover, it is a good summary of "Steinization" in general.

To "Steinize" a test, merely replace

$$(x^{(0)}_i - m_i)(n^{(0)}_i)^{1/2}/s$$

wherever it occurs by

$$(x^{(a)}_i - m_i)/z_i^{1/2}$$

and continue as usual. Everything else save the type two error is obtained in the same manner as in the single stage sampling test.
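The rule of thumb can be checked numerically in the simplest case, the "t-test" with one group. The following is a minimal Monte Carlo sketch, not part of the thesis: the function name, the default values of `m`, `n0`, `z`, `reps`, and `seed` are assumptions made for illustration, 2.262 is the tabulated two-sided 95% point of t with 9 degrees of freedom, and the second-stage mean is drawn directly from its normal distribution as a shortcut rather than by averaging `n1` simulated observations.

```python
import math
import random
import statistics

def steinized_t_coverage(sigma, m=3.0, n0=10, z=0.25, t_crit=2.262,
                         reps=4000, seed=0):
    """Fraction of simulated two-stage experiments in which
    |x_a - m| / sqrt(z) <= t_crit, i.e. the fixed-length interval covers m."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        first = [rng.gauss(m, sigma) for _ in range(n0)]
        s2 = statistics.variance(first)
        # second sample size: least integer not less than max(1, s2/z - n0)
        n1 = max(1, math.ceil(s2 / z - n0))
        # a real root a of a^2/n0 + (1 - a)^2/n1 = z/s2
        disc = max(n0 * n0 - (n0 + n1) * n0 * (1.0 - n1 * z / s2), 0.0)
        a = (n0 + math.sqrt(disc)) / (n0 + n1)
        # shortcut: the mean of n1 further observations is N(m, sigma^2/n1)
        x1 = rng.gauss(m, sigma / math.sqrt(n1))
        x_a = a * statistics.fmean(first) + (1.0 - a) * x1
        if abs(x_a - m) / math.sqrt(z) <= t_crit:
            hits += 1
    return hits / reps

for sigma in (0.5, 5.0):
    print(sigma, steinized_t_coverage(sigma))
```

If the replacement in the rule of thumb behaves as theorem 1.2 says, the two printed fractions should both lie near 0.95 even though the two values of sigma differ by a factor of ten and the interval length $2 z^{1/2} t_\alpha$ is the same in both cases.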
IMPROVEMENT

A Stein type test can be improved if the total sample mean

$$\bar{x}_i = \Big( \sum_{k=1}^{n^{(0)}_i + n^{(1)}_i} x_{i,k} \Big) \Big/ \big( n^{(0)}_i + n^{(1)}_i \big)$$

is used in place of $x^{(a)}_i$. The distributions involved are unknown; however, since $\bar{x}_i$ is a better estimator of $m_i$ than $x^{(a)}_i$, the distribution for the $x^{(a)}_i$ can be used and the real error will be smaller than that computed. Alternatively the $z_i$ can always be selected so that $s^2/z_i$ is an integer.

CHAPTER TWO: SOME OPTIMUM PROPERTIES

INTRODUCTION

Chapter one tells how numerous two sample confidence regions for the means of a set of homoscedastic normal distributions of unknown variance may be built; many others, no doubt, exist. This chapter will be devoted to an attempt to find the "best" possible such confidence region.

As Stein himself has pointed out, one of the major parts of this problem is to decide what sort of properties a "best" scheme for finding confidence regions should have.

The only work done on this problem known to the writer is contained in a paper by Weiss in which certain results are obtained for the case where there is one group ($\gamma = 1$). Weiss confines himself to invariant estimates, as does this chapter also. An invariant (under translation) estimate has the property that adding a constant to all observations has the effect of adding that constant to the estimate of the mean. Weiss shows that within this class the "best" possible estimator of the population mean is the mean of the total sample; in this case "best" means that centering the confidence interval at the total sample mean maximizes the confidence coefficient.
Weiss also attempts to provide justification of Stein's choice in making the size of the second sample dependent on the variance of the initial sample alone by means of the following result. Given any invariant (under translation) scheme (1) for selecting the second sample size, there exists another scheme (2) yielding as large a confidence coefficient, with sample size dependent upon the initial sample variance alone, such that for every $\sigma$ there is an $n^*$ such that the probability that the second sample size exceeds $n^*$ given scheme (2) is less than or equal to the probability of the same event given scheme (1). Nothing seems to be known about the actual size of $n^*$.

This chapter uses a different approach. The problem is regarded as one in estimation theory. Viewed in those terms, Stein's scheme yields an expected loss which is constant no matter what $\sigma$ may be. The "best" scheme is one which yields in some sense a minimal expected loss for a given expected sample size. Of course, a scheme which is good for some $\sigma$ may be poor for some other $\sigma$. To circumvent this a minimax criterion is used, i.e. we try to minimize the worst that can happen.

Only invariant estimators are considered. An extension to the two sample case of the concept of an admissible minimax test is proposed, and necessary and sufficient conditions for the class of all such estimators are found.

EVALUATION

The work herein is broader than that of Weiss in that it considers the average sample size over all situations and not just those arbitrarily close to infinity. It also deals with more general loss functions. Both this discussion and that of Weiss are restricted in that they deal only with invariant estimators.
MODEL

The same model is used as in chapter one; thus, an optimum estimate of the means of a homoscedastic set of normal distributions is sought. It is assumed that no linear restrictions among the means exist; however, this assumption is not overly restrictive, since if $\gamma$ means with linear restrictions are to be estimated, the problem can always be reduced to the estimation of $\gamma' < \gamma$ means with no linear restrictions.

NOTATION

The notation of chapter one is carried over unless otherwise stated.

The relation $\equiv$ is taken to be functional identity with probability one.

The convention is adopted that, unless otherwise stated, all lower case letters represent real scalars and all upper case letters represent vectors or matrices, the components of which are the corresponding lower case letters. Either the matrix (vector) or the scalars, but usually not both, will be defined, and it will be assumed that the reader can infer the meaning of one given the other. Exceptions to this rule are the upper case pi and sigma, which are used for products and sums respectively, the upper case E, which is used as the expectation operator, the upper case gamma, which is used for the gamma function, and all German letters.

A two sample estimation scheme is completely determined by the functions $\hat{M}$ and $n_i$, where $\hat{M}$ is the estimator of $M$ and $n_i = n^{(0)}_i + n^{(1)}_i$. Hereafter a two sample estimation scheme will be denoted by $[\hat{M}, N]$ or $\{\hat{m}_i, n_i\}$.

LOSS FUNCTION

The discussion will be restricted to loss functions with the following properties. Let $\ell$ be a measurable function of the $\gamma$-dimensional vector, $\{\xi_i\}$, such that

1) $\ell$ is symmetric in $\xi_i$ for all $i$, which is to say that $\ell$ is invariant under reflection.
2) $\ell$ is non-decreasing in all $\xi_i$ over the range $\xi_i \in [0, \infty)$, that is, $\ell$ is non-decreasing along any line passing through the origin as one goes away from the origin.

3) $\ell$ is continuous at zero and $\ell(0) = 0$.

4) $E(\ell)$ exists.

5) Either a) $\partial \ell / \partial \xi_i$ exists and is continuous for all $i$, or b) there exists a sequence of functions satisfying 1), 2), 3), 4), and 5a) converging in probability to $\ell$.

USE OF LOSS FUNCTION

If the loss due to an incorrect estimate of $M$ be taken as $\ell(\hat{M} - M)$ (i.e. $\ell(\{\hat{m}_i - m_i\})$), a good estimator should yield a relatively small $E(\ell(\hat{M} - M))$ and $E(n_i)$.

It is to be expected that $E(n_i \mid \sigma^2)$ will be large if $\sigma^2$ is large. Comparisons between different $\sigma^2$ will be made on the basis of $E(n_i/\sigma^2 \mid \sigma^2)$ instead of simply $E(n_i \mid \sigma^2)$.

Two special cases should be noted. The first is when $\ell(\{\xi_i\}) = \sum_i \xi_i^2$. An estimate which minimizes this loss yields a minimal mean squared error, i.e., is the most efficient estimator.

The second special case is when $\ell$ is the characteristic function (in the sense of set functions) of the complement of some convex and symmetric region containing the origin. Minimizing this loss yields an estimator which yields a confidence region of the same size and shape as the given set and having a maximal confidence coefficient.

ORDERING

If one estimator is going to be preferred to another, some sort of ordering must be established. The following two orderings will be used herein.

DEFINITION

Weakly better than or as good as, $\{\hat{m}_i, n_i\} \preceq \{\hat{m}'_i, n'_i\}$, if

1) $\sup_{\sigma^2, M} E(\ell \mid \sigma^2, M, \{\hat{m}_i, n_i\}) \le \sup_{\sigma^2, M} E(\ell \mid \sigma^2, M, \{\hat{m}'_i, n'_i\})$

2) for all $i$, $\sup_{\sigma^2, M} E(n_i/\sigma^2 \mid \sigma^2, M) \le \sup_{\sigma^2, M} E(n'_i/\sigma^2 \mid \sigma^2, M)$

The strict ordering, weakly strictly better than, $\prec$, is obtained if, in addition to the above, one of the inequalities is strict for some $i$.
DEFINITION

Strongly better than or as good as, $\{\hat{m}_i, n_i\} \underline{\ll} \{\hat{m}'_i, n'_i\}$, if

1) for all $\sigma^2$, $M$: $E(\ell \mid \sigma^2, M, \{\hat{m}_i, n_i\}) \le E(\ell \mid \sigma^2, M, \{\hat{m}'_i, n'_i\})$

2) for all $i$, $\sigma^2$, $M$: $E(n_i/\sigma^2 \mid \sigma^2, M) \le E(n'_i/\sigma^2 \mid \sigma^2, M)$

The strict ordering, strongly strictly better than, $\ll$, is obtained if, in addition to the above, one of the inequalities is strict for some $i$, $\sigma^2$, and $M$.

REMARKS ON ORDERING

If these are really orderings they should have certain properties; for example, they should be reflexive and transitive, and the strong ordering should demand more than the weak. Some of the properties of these orderings are:

a) Both $\preceq$ and $\underline{\ll}$ are reflexive.

b) All four of the relations are transitive.

c) In addition to b), [relations illegible in the source].

d) A strictly better ordering implies the corresponding better or as good ordering.

e) The strongly better or as good ordering implies the weakly better or as good ordering. This is not necessarily true of the strict orderings; however, [relations illegible in the source].

DEFINITION OF DESIRABLE PROPERTIES

We are now in a position to define the properties which will be expected of a good estimator. These are:

1) Invariance (under translation). $\{\hat{m}_i, n_i\}$ is said to be invariant if for any vector $V$, $\hat{M}(X_k + V) = \hat{M}(X_k) + V$.

2) Minimax. An estimator is said to be minimax with respect to a set of estimators if no weakly strictly better estimator exists within the set.

3) Admissible. An estimator is said to be admissible with respect to a set of estimators if no strongly strictly better estimator exists within the set.

REMARKS

In the theory of single stage sampling estimation, properties similar to the three just defined are among those which it is generally accepted as desirable for an estimator to possess.
One of the standard approaches is to consider the admissible minimax estimators as the "best" estimators; a simpler alternative approach is to restrict consideration to some class of "nice" estimators and find admissible minimax estimators with respect to it; the class of invariant estimators is sometimes selected in this connection. Also, it seems intuitively likely that invariant estimators are about as good as any estimators.

When dealing with two sample estimates the usual concepts must be extended in some way to take the variable sample size into account. The question of what constitutes a "desirable property" ultimately is answered as much by the intuition of the statistician as anything else. It is hoped that the reader will find the proposed extensions of single sample properties to be reasonable requirements for a "good" estimator.

PROPERTIES OF THE DESIRABLE PROPERTIES

An estimator which is (minimax, admissible) with respect to a set of estimators, $\mathfrak{G}$, is also (minimax, admissible) with respect to any subset of $\mathfrak{G}$ in which it is contained.

If $\mathfrak{A}$ is the set of estimators (minimax, admissible) with respect to some set of estimators, $\mathfrak{G}$, then any estimator (admissible, minimax) with respect to $\mathfrak{A}$ is also (admissible, minimax) with respect to $\mathfrak{G}$.

PROSPECTUS

The remainder of this chapter will be devoted to proof of the following assertions.

1) Let $\mathfrak{G}$ be the set of invariant two sample estimators of the means of a set of normal distributions with equal variance. Let $\mathfrak{X}$ be the set of all such estimators for which $\hat{m}_i = \bar{x}_i$, where $\bar{x}_i$ is the total sample mean for the $i$-th group. For every estimator in $\mathfrak{G} - \mathfrak{X}$ there exists a strongly strictly better estimator in $\mathfrak{X}$.

2) Let $\mathfrak{N}$ be the set of estimators in $\mathfrak{X}$ for which $n_i$ is a function of the residual variance of the initial sample, $s^2$, and no other random variable.
For every estimator in $\mathfrak{X} - \mathfrak{N}$ there exists a strongly strictly better estimator in $\mathfrak{N}$.

3) Let $\mathfrak{A}$ be the set of estimators in $\mathfrak{N}$ for which $n_i = b_i s^2$ for some constants $b_i$. For every estimator in $\mathfrak{N} - \mathfrak{A}$ there exists a weakly strictly better estimator in $\mathfrak{A}$.

4) By 1), 2), and 3) an estimator is both minimax and admissible with respect to $\mathfrak{G}$ if and only if it is in $\mathfrak{A}$.

INVARIANT ESTIMATORS

Since the assertions to be proven concern only invariant estimators, it will hereafter be assumed that any estimator under consideration is invariant. Such estimators have the following properties:

1) $E(\ell(\hat{M} - M))$ does not depend on $M$.

2) The taking of sups over $M$ in defining the ordering property, weakly better than, is unnecessary and will hereafter be discontinued.

3) All $n_i$ are independent of $\bar{X}^{(0)}$. To show this, translate all observations by $V$ to get $X'_k = X_k + V$. If the $n_i$ are not all independent of $\bar{X}^{(0)}$ there is, for some $V$, a positive probability that a different sized sample will be taken in the primed and unprimed cases. Thus there is a positive probability that $\hat{M}'$ will not equal $\hat{M} + V$, which means that the estimator is not invariant. (Remark: "independent" in this case really means independent with probability one.)

4) There exists an orthogonal linear transformation mapping the $x_{i,k}$ ($k = 1, 2, \ldots, n_i$) to $n_i^{1/2}\bar{x}_i$, $y_{i,k}$ ($k = 2, 3, \ldots, n_i$). The assertion is $\hat{M} = \bar{X} + F(\{y_{i,k}\})$, where $F$ is some function of $\{y_{i,k}\}$.

Proof: Translation of each $x_{i,k}$ by some $v_i$ changes $\bar{x}_i$ to $\bar{x}_i + v_i$ and leaves all the $y_{i,k}$ unchanged. No other dependence of the $\hat{m}$'s on the observations has this property.

The next problem is to find the best such $F$; the following theorem shows this to be zero.
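Property 4) of invariant estimators can be illustrated numerically. The following is a minimal sketch in Python (NumPy assumed) for a single group. It uses the Helmert matrix as one possible choice of the orthogonal transformation; the thesis does not say which transformation is meant, and `helmert` is a hypothetical helper name introduced here. The first transformed coordinate is $n^{1/2}\bar{x}$, and the remaining coordinates are unchanged when every observation is translated by the same amount.

```python
import numpy as np

def helmert(n):
    """Return an n-by-n orthogonal matrix whose first row is constant."""
    h = np.zeros((n, n))
    h[0, :] = 1.0 / np.sqrt(n)          # first row picks out sqrt(n) * mean
    for k in range(1, n):
        scale = np.sqrt(k * (k + 1))
        h[k, :k] = 1.0 / scale          # k equal positive entries
        h[k, k] = -k / scale            # one negative entry; row sums to zero
    return h

rng = np.random.default_rng(1)
n = 6
x = rng.normal(10.0, 2.0, size=n)       # observations from one group
v = 3.7                                 # translation applied to every observation

h = helmert(n)
y = h @ x
y_shifted = h @ (x + v)

print(np.allclose(h @ h.T, np.eye(n)))                   # orthogonality
print(np.isclose(y[0], np.sqrt(n) * x.mean()))           # first coordinate
print(np.allclose(y[1:], y_shifted[1:]))                 # y_k unchanged, k >= 2
print(np.isclose(y_shifted[0] - y[0], np.sqrt(n) * v))   # mean shifts by v
```

All four checks should print True: the translation moves only the coordinate carrying the sample mean, which is why an invariant estimator can depend on the remaining coordinates only through an additive term.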
THEOREM 2.1

$F \neq 0 \Rightarrow \{\bar{X} + F, n_i\} \gg \{\bar{X}, n_i\}$

Proof:

1) Setting $M = 0$ results in no loss of generality.

2) $\ell(\bar{X} + F) - \ell(\bar{X})$ is skew symmetric about $-\tfrac{1}{2}F$ and is non-positive if the inner product of $\bar{X}$ and $F$ is less than $-\tfrac{1}{2}|F|^2$.

3) The distribution of $\bar{X}$ is symmetric about zero, and its density function decreases as one goes away from zero.

4) By 2) and 3), $E(\ell(\bar{X} + F) - \ell(\bar{X}) \mid \{y_{i,k}\}) \ge 0$, since every $\bar{X}$ such that $\ell(\bar{X} + F) - \ell(\bar{X}) < 0$ can be paired with a point $\bar{X}'$ such that $\ell(\bar{X} + F) - \ell(\bar{X}) = -(\ell(\bar{X}' + F) - \ell(\bar{X}'))$ and the probability density at $\bar{X}'$ is greater than at $\bar{X}$.

REMARKS

Useful inequalities can be obtained when the loss function is convex by shifting expectation operators across function brackets. The following material will enable us to force convexity on the situation and so to use such inequalities.

The rather obvious fact that a larger sample size yields a better estimate will also be proven.

DEFINITION

$$\varphi(\{\zeta_i\}) = \int \ell(X) \prod_i \left(\frac{\zeta_i}{2\pi}\right)^{1/2} e^{-\zeta_i x_i^2/2}\, dX$$

(This is defined only for $\zeta_i > 0$.)

LEMMA 2.2

$\varphi(\{n_i/\sigma^2\}) = E(\ell \mid M, \sigma^2, \{n_i\})$

To prove, simply substitute $n_i/\sigma^2$ into the definition. This lemma, of course, is the reason for defining $\varphi$ in the first place.

THEOREM 2.3

For all $\ell$, $\varphi$ is

1) strictly decreasing in all $\zeta_i$, and

2) convex in all $\zeta_i$ and strictly convex somewhere in all $\zeta_i$.

Proof: It will first be shown that $\varphi$ is non-increasing and convex if $\ell$ possesses first partials. Set [substitution and first derivative illegible in the source]. This is negative since $u_j$ and $\ell_j$ have the same sign. This shows that $\varphi$ is decreasing. To show convexity take the second derivative, which is [expression illegible in the source]. Since both [factors illegible in the source] are negative, the reader can easily see that the above expression is positive upon noting that $x_j$ and $\ell_j$ have the same sign.
If $\ell$ does not have partial derivatives, take a sequence $\{\ell^{(\nu)}\}$, having partial derivatives and converging in probability to $\ell$. Let $\varphi^{(\nu)}$ be the $\varphi$ defined by $\ell^{(\nu)}$. The limit of the $\varphi^{(\nu)}$ is $\varphi$. The limit of a sequence of non-increasing convex functions is non-increasing and convex.

To show that $\varphi$ is strictly decreasing, make use of the following:

1) $\lim_{\zeta \to \infty} \varphi(\zeta) \ge 0$

2) $\varphi$ is convex

3) $\varphi' \neq 0$ somewhere.

To show that $\varphi$ is strictly convex somewhere, note that in addition to these three conditions we have

4) $\lim \varphi(\zeta) > 0$ [limit point illegible in the source].

This would be impossible if $\varphi$ were a straight line.

RESTRICTION

The use of the mark $s^2$ is hereafter restricted to the residual variance of the initial sample,

$$\nu s^2 = \sum_{i} \sum_{k=1}^{n^{(0)}_i} \left(x_{i,k} - \bar{x}^{(0)}_i\right)^2$$

where $\nu$ is the number of degrees of freedom of $s^2$.

THEOREM 2.4

$\ell \neq 0 \Rightarrow \{\bar{X}, n_i\} \gg \{\bar{X}, E(n_i \mid s^2)\}$

(Note that $E(n_i \mid s^2)$ does not depend on $M$ or $\sigma^2$.)

Proof:

1) $E(n_i) = E(E(n_i \mid s^2))$

2) $E(E(\varphi(n_i/\sigma^2) \mid s^2)) \ge E(\varphi(E(n_i/\sigma^2 \mid s^2)))$ by convexity of $\varphi$. Moreover, the inequality is strict because $\varphi$ is strictly convex somewhere.

COROLLARY

Don't randomize the $n_i$.

REMARK

It has been shown so far that the optimum estimator of $M$ is $\bar{X}$ and that the $n$'s should be a function of $s^2$ alone. It remains to find what function of $s^2$ to use. This will be established in theorem 2.8. Several preliminary results will be needed.

DEFINITION

Given $n_i(s^2)$, [definition of $n_i^*(s^2)$ illegible in the source].

LEMMA 2.5

[Statement and opening of the proof illegible in the source.] This is the expectation of $n_i^*(\chi^2) - b_i \chi^2$, where $\chi^2$ is distributed chi square with $\nu$ degrees of freedom. $n_i^*(\chi^2) - b_i \chi^2 = 0$, since the chi square distribution has no non-trivial estimators of zero.

[A further statement, holding with probability one, and its proof are largely illegible in the source; the proof reverses the order of integration.]

[THEOREM 2.7: statement illegible in the source; it is established by convexity of $\varphi$.]

The foregoing results will now be used to determine the optimum scheme for selecting the sample size.

THEOREM 2.8

$n_i(s^2) \neq b_i s^2$ and $\ell \neq 0$ imply $\{\bar{X}, n_i(s^2)\} \succ \{\bar{X}, b_i s^2\}$.

Proof:

1) $\sup_{\sigma^2} E(n_i(s^2)/\sigma^2 \mid \sigma^2) = b_i = E(b_i s^2/\sigma^2 \mid \sigma^2)$

2) a) $n_i^*(s^2) \neq b_i s^2$ by lemma 2.4.

b) $n_i^*(s^2) \le b_i s^2$ with probability one by lemma 2.5.

c) $\mathrm{Prob}\{n_i^*(s^2) < b_i s^2\} > 0$ by a) and b).

d) $\varphi$ is strictly decreasing by theorem 2.3.

e) $\varphi(n_i^*(s^2)/\sigma^2) \ge \varphi(b_i s^2/\sigma^2)$ with probability one by b) and d).
f) By c) and d), $\mathrm{Prob}\{\varphi(n_i^*(s^2)/\sigma^2) > \varphi(b_i s^2/\sigma^2)\} > 0$.

g) By e) and f), for all $\sigma^2$, $E(\varphi(n_i^*(s^2)/\sigma^2) \mid \sigma^2) > E(\varphi(b_i s^2/\sigma^2) \mid \sigma^2)$.

h) $E(\varphi(b_i s^2/\sigma^2) \mid \sigma^2)$ is constant for all $\sigma^2$.

i) By g) and h), $\sup_{\sigma^2} E(\varphi(n_i^*(s^2)/\sigma^2) \mid \sigma^2) > \sup_{\sigma^2} E(\varphi(b_i s^2/\sigma^2) \mid \sigma^2)$.

j) By theorem 2.7, $\sup_{\sigma^2} E(\varphi(n_i(s^2)/\sigma^2) \mid \sigma^2) \ge \sup_{\sigma^2} E(\varphi(n_i^*(s^2)/\sigma^2) \mid \sigma^2) > \sup_{\sigma^2} E(\varphi(b_i s^2/\sigma^2) \mid \sigma^2)$.

The desired result follows from 1) and 2j).

CONCLUSION

An invariant two sample estimator of the means of a set of normal distributions with homoscedasticity is minimax and admissible, for any loss function in the closure of the space of differentiable functions symmetric about zero and increasing as one goes away from zero, if and only if it is in $\mathfrak{A}$, where $\mathfrak{A}$ is the set of all two sample estimates such that $\hat{m}_i = \bar{x}_i$ and $n_i = b_i s^2$ for some positive constants $b_i$, where $s^2$ is the residual variance of the initial sample.

The proof of this follows. First, any admissible estimator of $M$ must be $\bar{X}$ by theorem 2.1. Second, the sample size of any admissible two sample estimation scheme must depend on $s^2$ alone by theorem 2.4. Third, by theorem 2.8, if the sample size depends on $s^2$ alone then the sample size must be of the form $b_i s^2$ if the scheme is to be minimax.

PRACTICAL CONSIDERATIONS

The optimum estimator requires a sample size which is literally almost never an integer. In practice such samples cannot be taken. Some compromise must be made, and the obvious one is always to take a sample whose size is the least integer greater than or equal to the sample size which is "really" wanted. Moreover, if the initial sample is already greater than the sample which the theoretically best scheme demands, there is no sense in throwing away part of the sample.
Making these two compromises leads to the scheme: take $n_i$ as the least integer not less than $\max(n^{(0)}_i, b_i s^2)$. In the case when all the $b$'s are equal, this is simply the improved version of Stein's test.

CHAPTER THREE: A MULTIVARIATE EXTENSION

MODEL

Let $x_{p,i,k}$ be the observation of the $p$-th character of the $k$-th individual of the $i$-th group. The $p$ subscripts run from 1 to $r$, the $i$ subscripts from 1 to $\gamma$, and the $k$ subscripts from 1 to the number of replications. The $x$'s are taken to be distributed according to a multivariate normal distribution, independently over the $i$ and $k$ subscripts, and with parameters $M_i = (m_{p,i}) = (E(x_{p,i,k}))$, a row vector of $r$ dimensions, and $\Sigma = (\sigma_{p,q}) = (E((x_{p,i,k} - m_{p,i})(x_{q,i,k} - m_{q,i})))$, an $r$ by $r$ matrix.

Such models occur in biological sampling. An example would be when several kinds of measurements (the $p$ subscripts), for example head length, head width, etc., are made on numerous individuals (the $k$ subscripts) of several different races or subspecies (the $i$ subscripts). Models of this sort also appear in the field of psychological measurement when, for example, several different kinds of tests are given to numerous individuals in different occupation groups.

The problems which may arise are much the same as in the univariate case: where do the $m_{p,i}$ lie?

A SINGLE SAMPLE MULTIVARIATE STATISTIC

A statistic used in connection with such problems in the single sample case is the Mahalanobis' distance, a multivariate extension of the F ratio.

In the univariate case, a scalar observation, $x_{i,k}$, is made on each individual, and the measure of dispersion, $s^2 = \sum (x_{i,k} - \bar{x}_{i,\cdot})^2 / (\text{number of observations})$, is also a scalar.
In the multivariate case, a vector observation, $X_{i,k} = (x_{1,i,k}, x_{2,i,k}, \ldots)$, is made on each individual, and the measure of dispersion, $S = (s_{p,q})$ with

$$s_{p,q} = \frac{\sum (x_{p,i,k} - \bar{x}_{p,i,\cdot})(x_{q,i,k} - \bar{x}_{q,i,\cdot})}{\text{number of observations} - \gamma},$$

is a square matrix.

In the single variate case the homogeneity of scalar means is tested by the F ratio, which is proportional to $\sum (\bar{x}_{i,\cdot} - \bar{x}_{\cdot,\cdot})^2 / s^2$.

In the multivariate case the homogeneity of vector means is tested by the Mahalanobis' distance, which is proportional to $\sum (\bar{X}_{i,\cdot} - \bar{X}_{\cdot,\cdot}) S^{-1} (\bar{X}_{i,\cdot} - \bar{X}_{\cdot,\cdot})'$.

"STEINIZATION" OF THE MAHALANOBIS' DISTANCE

Select a set of positive numbers, $\{z_i\}$, and some function of the sample covariance matrix, $\lambda(S)$; in practice it seems likely that $\lambda$ will be either the determinant or the largest eigenvalue.

Take an initial sample of size $n^{(0)}_i$ from the $i$-th group. $S$ will be Wishart distributed, and the density function is denoted by $w(S, \Sigma, \nu)$.

A second sample of size $n^{(1)}_i$ is then taken. This sample size is determined as follows. Consider the equation in $a_i$,

$$\frac{a_i^2}{n^{(0)}_i} + \frac{(1 - a_i)^2}{n^{(1)}_i} = \frac{z_i}{\lambda(S)}.$$

Take $n^{(1)}_i$ to be the least natural number such that this equation permits a real root. One of the roots thus obtained (which one does not matter) will be called $a_i$.

Define $\bar{X}^{(a)}_i$ by

$$\bar{X}^{(a)}_i = a_i \bar{X}^{(0)}_i + (1 - a_i) \bar{X}^{(1)}_i$$

where $\bar{X}^{(0)}_i$ and $\bar{X}^{(1)}_i$ are the $i$-th group means for the first and second samples respectively. The distribution of

$$\sum_i (\bar{X}^{(a)}_i - M_i) S^{-1} (\bar{X}^{(a)}_i - M_i)' \,(\lambda(S)/z_i)$$

will be related to the distribution of the Mahalanobis' distance.

LEMMA 3.1

The conditional joint distribution of the $\bar{X}^{(a)}_i$ given $S$ is multivariate normal, independent over the $i$ subscripts, with means $M_i$ and covariance matrices $\Sigma z_i/\lambda(S)$.

THEOREM 3.2

The distribution of

$$\sum_i (\bar{X}^{(a)}_i - M_i) S^{-1} (\bar{X}^{(a)}_i - M_i)' \,(\lambda(S)/z_i)$$

is the same as that of

$$\sum_i (\bar{X}^{(0)}_i - M_i) S^{-1} (\bar{X}^{(0)}_i - M_i)' \, n^{(0)}_i.$$

Proof: The characteristic function of the first expression is

i) [integral expression illegible in the source].

Make the change of variable $(n^{(0)}_i)^{1/2} (T_i - M_i) = (\bar{X}^{(a)}_i - M_i)(\lambda(S)/z_i)^{1/2}$, and i) becomes

ii) [integral expression illegible in the source].

Upon replacing $T$ by $\bar{X}^{(0)}$ it can be seen that ii) is also the characteristic function of the distribution of the single sample statistic.

EXAMPLE: HOTELLING'S TEST

It has been shown by Hotelling that in the case of one group (or more generally, one contrast) the statistic

$$\frac{(n^{(0)} - r)\, n^{(0)}}{(n^{(0)} - 1)\, r} (\bar{X}^{(0)} - M) S^{-1} (\bar{X}^{(0)} - M)'$$

is distributed according to the F distribution with $r$, $n^{(0)} - r$ degrees of freedom. (The $i$ subscripts are dropped since $\gamma = 1$.) The confidence region for the means will thus be the set of all $M$ such that

$$n^{(0)} (\bar{X}^{(0)} - M) S^{-1} (\bar{X}^{(0)} - M)' \le \frac{(n^{(0)} - 1)\, r}{n^{(0)} - r} F_\alpha$$

where $F_\alpha$ is a percentage point of the F distribution with degrees of freedom as given above.

By theorem 3.2 the "Steinized" confidence region is the set of all $M$ such that

$$\frac{\lambda(S)}{z} (\bar{X}^{(a)} - M) S^{-1} (\bar{X}^{(a)} - M)' \le \frac{(n^{(0)} - 1)\, r}{n^{(0)} - r} F_\alpha.$$

If $\lambda$ be taken as the largest eigenvalue of $S$, the maximum radius of this confidence region,

$$\sqrt{z\, \frac{(n^{(0)} - 1)\, r}{n^{(0)} - r} F_\alpha},$$

can easily be controlled by proper choice of $z$. If $\lambda$ be taken as the determinant of $S$, a confidence region of fixed volume, [expression illegible in the source], is obtained.

IMPROVEMENT

It was pointed out in the univariate case that an improved confidence interval could be obtained by centering at the average of the entire sample rather than at the weighted average, $\bar{x}^{(a)}$. The same principle, of course, holds in the multivariate case.

BIBLIOGRAPHY

Dantzig, George, On the Nonexistence of Tests of "Student's" Hypothesis Having Power Functions Independent of $\sigma$. Annals of Mathematical Statistics, vol. XI (1940) pp. 186-192.

Hotelling, Harold, The Generalization of Student's Ratio. Annals of Mathematical Statistics, vol. II (1931) pp. 360-378.

Kendall, Maurice G., The Advanced Theory of Statistics, Volume II. London, Charles Griffin and Co., 194[?].

Lehmann, E. L., Notes on the Theory of Estimation. Mimeographed notes on lectures given at the University of California at Berkeley, 1949-50, recorded by Colin Blyth.

Lord, E., The Use of the Range in Place of the Standard Deviation in the t Test. Biometrika, vol. XXXIV (1947) pp. 41-67.

Neyman, Jerzy, Lectures and Conferences on Mathematical Statistics and Probability. Washington, D.C., Graduate School of the United States Department of Agriculture, 1952.

Scheffé, Henry, A Method for Judging All Contrasts in the Analysis of Variance. Biometrika, vol. XL (1953) pp. 87-104.

Seelbinder, B. M., On Stein's Two-Stage Sampling Scheme. Annals of Mathematical Statistics, vol. XXIV (1953) pp. 640-649.

Stein, C., A Two-Sample Test for a Linear Hypothesis Whose Power is Independent of the Variance. Annals of Mathematical Statistics, vol. XVI (1945) pp. 243-258.

Weiss, L., On Confidence Intervals of Given Length for the Mean of a Normal Distribution with Unknown Variance. Annals of Mathematical Statistics, vol. XXVI (1955) pp. 348-352.
