ON THE ADMISSIBILITY OF SCALE AND QUANTILE ESTIMATORS

BY
© JOHN FREDERICK BREWSTER
B.Sc., University of British Columbia, 1966
M.Sc., University of Toronto, 1967

A THESIS SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY
in the Department of MATHEMATICS

We accept this thesis as conforming to the required standard

The University of British Columbia
March 1972

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the Head of my Department or by his representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of Mathematics
The University of British Columbia
Vancouver 8, Canada
Date: April 28, 1972

Supervisor: James V. Zidek

ABSTRACT

The inadmissibility of the best affine-invariant estimators for the variance and noncentral quantiles of the normal law, when loss is squared error, has already been established. However, the proposed (minimax) alternatives to the usual (minimax, but inadmissible) estimators are themselves inadmissible. In our search for admissible alternatives in these problems, we first consider estimators which are formal Bayes within the class of scale-invariant procedures.
For such estimators, we present explicit conditions for admissibility within the class of scale-invariant procedures.

In the second chapter of the thesis, we consider the estimation of an arbitrary power of the scale parameter of a normal population. Under the assumption that the loss function satisfies certain reasonable conditions, an estimator is constructed which is (i) minimax, and (ii) formal Bayes within the class of scale-invariant procedures. The estimator obtained is a limit of a sequence of minimax, preliminary test estimators. Moreover, under squared error loss, and using the results of Chapter One, this estimator is shown to be scale-admissible. More generally, results are obtained for the estimation of powers of the scale parameter in the canonical form of the general linear model, and for the estimation of powers of the scale parameter of an exponential distribution with unknown location.

In Chapter Three, conditions are given for the minimaxity of best invariant procedures in general location-scale problems. Finally, by combining these results with those of the preceding chapter, the usual interval estimators for the variance of a normal population are shown to be minimax, but inadmissible. Superior, minimax procedures are suggested.
TABLE OF CONTENTS

INTRODUCTION
CHAPTER ONE: Scale-admissible, Invariant Estimators For Quantiles And Variance Of The Normal Law Under Quadratic Loss
  1.1 Introduction And Summary
  1.2 Definitions And Preliminary Results
  1.3 The Reduced Problem
  1.4 Quantile Estimation
  1.5 Scale Estimation
CHAPTER TWO: Minimax, Scale-admissible Estimators Of Scale Parameters
  2.1 Introduction
  2.2 Inadmissibility Of The Best G-invariant Estimator
  2.3 The Construction Of A Minimax, Formal Bayes Scale-Invariant Estimator
  2.4 Extension Of The Previous Results To G-invariant Loss Functions
  2.5 The Exponential Distribution With Unknown Location And Scale
CHAPTER THREE: Minimax, Inadmissible Interval Estimators Of Scale Parameters
  3.1 Minimax Estimators In Location-Scale Problems
  3.2 Interval Estimation Of Scale Parameters
BIBLIOGRAPHY

ACKNOWLEDGEMENT

I am deeply indebted to Professor James Zidek for his suggestion of the problems treated in this thesis and for his guidance during its preparation. I am particularly thankful to him for the knowledge and insight he has imparted to me during the course of my graduate program.

I would also like to extend my appreciation to Professors Lawrence Clevenson, Ned Glick, Stanley Nash, and Carl Sarndal for their careful reading of this dissertation and for their helpful comments concerning it. In particular, I would like to thank Professor Richard Shorrock for suggestions given during the preparation of the second chapter.

Finally, I would like to thank Eleanor Lannon for her care and patience in typing the manuscript.

The financial support of the University of British Columbia and the National Research Council of Canada is gratefully acknowledged.

Chapter 0: Introduction

From the point of view of decision theory, the statistician's job consists of the selection of a decision rule from the large class of rules available.
To make his task easier, the statistician often imposes restrictions which reduce, in size, the class of available rules. As many statistical problems possess certain natural symmetries, the "principle of invariance" is often employed in this manner. Although there is rarely a "best" decision rule in a particular problem, there may be a best rule among the invariant ones. In such cases, the effect of imposing the principle of invariance is to reduce the class of all possible rules to one.

It may have seemed, at one time, that the best invariant rule would, except in unusual circumstances, have most of the properties considered desirable in decision theory. Such a belief would have been supported by the work of Kiefer [22] and Kudo [23], who showed that the best invariant rule is minimax in a wide class of problems. Also, it is well known that the best invariant rule is admissible if the group acting on the problem is compact.

In 1956, however, Stein [36] showed that the usual (best affine-invariant) estimator for the multivariate normal mean is inadmissible under squared error loss if the dimension is greater than or equal to 3. Brown [4] extended Stein's result to a very general location parameter problem in which he found that the dichotomy between 3 or more dimensions and less than 3 dimensions persists.

Stein [37] also showed that the usual estimator (again, best affine-invariant) for the variance of a normal population is inadmissible under squared error loss if the mean is unknown. Brown [5] also extended this result to a wider class of distributions and loss functions. In addition to the above results, Zidek [46] has shown that the usual estimator for any non-central quantile of a normal distribution is inadmissible.

A common feature of the aforementioned papers is that the proposed (minimax) alternatives to the usual (minimax, but inadmissible) estimators are themselves inadmissible.
Strawderman [38] and Brown [6] have recently presented admissible, minimax estimators for the multivariate normal mean.

As the usual estimators for the variance and non-central quantiles of the normal law are inadmissible, it is natural to search for admissible alternatives in these problems, too. From many points of view it is reasonable to restrict this search to the class of formal Bayes estimators. For, like proper Bayes estimators, formal Bayes estimators are comparatively easy to obtain in an explicit form for many commonly used loss functions. Furthermore, it is well known that Bayes estimators and their limits, in an appropriate topology, constitute a complete class. And in many interesting statistical problems (see Sacks [32], Farrell [11], and Brown [6]) these limits are formal Bayes estimators.

In the first chapter of the thesis, explicit conditions are given for the scale-admissibility of formal Bayes, scale-invariant estimators for the variance and quantiles of the normal law, under squared error loss. These conditions are obtained by application of an extension of the results of Zidek [45].

The second chapter is concerned primarily with the estimation of an arbitrary power of the scale parameter of a normal population. Under the assumption that the loss function satisfies certain reasonable conditions, an estimator is constructed which is (i) minimax, and (ii) formal Bayes within the class of scale-invariant procedures. The estimator obtained is a limit of a sequence of minimax, preliminary test estimators, each constructed in the manner of Brown. Moreover, under squared error loss, and using the results of Chapter One, this estimator is shown to be scale-admissible. More generally, results are obtained for the estimation of powers of the scale parameter in the canonical form of the general linear model, and for the estimation of powers of the scale parameter of an exponential distribution with unknown location.
The remainder of the thesis is devoted primarily to interval estimation problems. First, conditions are given for the minimaxity of best invariant procedures in general location-scale problems. As applied to confidence intervals, the main theorem may be viewed as an extension of Valand [40]. Finally, the usual interval estimators for the variance of a normal population are shown to be minimax, but inadmissible. Superior, minimax procedures are suggested.

Chapter 1: Scale-admissible, Invariant Estimators For Quantiles And Variance Of The Normal Law Under Quadratic Loss

1.1 Introduction And Summary

The usual estimators for the variance and noncentral quantiles of the normal distribution are known to be inadmissible, under squared error loss ([37], [46]). However, no admissible alternatives have been suggested. This chapter gives explicit conditions for the scale-admissibility, under squared error loss, of formal Bayes, scale-invariant estimators $\phi_1$ and $\phi_2$ for $\mu + \eta\sigma$ and $\sigma^2$, respectively. If $\Theta^*$ represents the orbit space created by the action of the scale group on the parameter space, and if $\phi_i$ is formal Bayes with respect to a prior measure on $\Theta^*$ with density $\pi_i$ ($i = 1, 2$), then, under suitable regularity conditions, $\phi_i$ is scale-admissible if
\[ \int_{1}^{\infty} (t^2 \pi_i(t))^{-1}\,dt = \int_{-\infty}^{-1} (t^2 \pi_i(t))^{-1}\,dt = \infty . \]
These conditions are obtained by application of an extension of the results of Zidek [45].

1.2 Definitions And Preliminary Results

Let $X$ be a random variable taking its values in a measurable space $(\mathcal{X}, \mathcal{B})$. Assume that the distribution of $X$ is a unique but unknown member of a family of probability distributions indexed by a set $\Theta$, a (possibly unbounded) subinterval of the real line, with upper and lower endpoints $\theta_u$ and $\theta_\ell$, respectively. After observing $X$, a real-valued (measurable) function $w: \Theta \times \mathcal{X} \to \mathbb{R}$ is to be estimated, with a loss function, $L: A \times \Theta \times \mathcal{X} \to [0, \infty)$, of the
form
\[ L(a, \theta, x) = c(\theta, x)\,\|a - w(\theta, x)\|^2 , \tag{1.2.1} \]
where the action space, $A$, is a subset of the real line, $c: \Theta \times \mathcal{X} \to (0, \infty)$ is a measurable function, and $\|\cdot\|$ denotes the Euclidean norm.

Suppose $\mu$ is a $\sigma$-finite measure on $\mathcal{B}$ which dominates the family of underlying probability distributions. Let $p(\cdot\,|\,\theta)$, $\theta \in \Theta$, denote the density of the probability distribution corresponding to $\theta$. We assume $p(\cdot\,|\,\cdot)$ is jointly measurable in its arguments.

We restrict ourselves to nonrandomized decision rules, which are measurable functions $\phi: \mathcal{X} \to A$, and define the risk of $\phi$ to be
\[ r(\phi, \theta) = \int_{\mathcal{X}} L(\phi(x), \theta, x)\, p(x\,|\,\theta)\, d\mu(x) . \tag{1.2.2} \]
Suppose $\Pi$ is a probability measure on the Borel subsets of $\Theta$. A procedure $\phi_\Pi$ is said to be Bayes with respect to $\Pi$ if $R(\phi_\Pi) = \inf_{\phi} R(\phi) < \infty$, where
\[ R(\phi) = \int_{\Theta} r(\phi, \theta)\, d\Pi(\theta) . \tag{1.2.3} \]
$\phi_\Pi$ depends on $\Pi$ only through the posterior probability distribution, $P_\Pi$, defined by
\[ P_\Pi(B\,|\,X = x) = \int_{B} p(x\,|\,\theta)\, d\Pi(\theta) \quad \text{a.e. } [\mu] \tag{1.2.4} \]
for all Borel subsets, $B$, of $\Theta$. In fact,
\[ \phi_\Pi(x) = \frac{\int_{\Theta} w(\theta, x)\, c(\theta, x)\, P_\Pi(d\theta\,|\,X = x)}{\int_{\Theta} c(\theta, x)\, P_\Pi(d\theta\,|\,X = x)} , \tag{1.2.5} \]
providing its Bayes risk, $R(\phi_\Pi)$, is finite.

From some points of view it is reasonable to allow $\Pi$ to be a $\sigma$-finite measure. Providing $P_\Pi(\Theta\,|\,X = x) < \infty$, a.e. $[\mu]$, $\Pi$ is called a prior measure (improper if $\Pi(\Theta) = \infty$). We can define the formal posterior distribution on $\Theta$ using (1.2.4). A formal Bayes estimator is defined as any measurable function on $\mathcal{X}$ which, evaluated at $x$, minimizes and makes finite
\[ \int_{\Theta} L(t, \theta, x)\, P_\Pi(d\theta\,|\,X = x) \tag{1.2.6} \]
a.e. $[\mu]$ as a function of $t$. If such a procedure exists, it is unique, and given by (1.2.5), except, possibly, on a set $B$ for which
\[ \int_{B} d\mu(x) \int_{\Theta} c(\theta, x)\, P_\Pi(d\theta\,|\,X = x) = 0 . \tag{1.2.7} \]
The condition
\[ \int_{\Theta} (1 + w^2(\theta, x))\, c(\theta, x)\, P_\Pi(d\theta\,|\,X = x) < \infty \quad \text{a.e. } [\mu] \tag{1.2.8} \]
is sufficient to insure that $\phi_\Pi$ is a formal Bayes estimator with respect to $\Pi$.
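As an illustration of (1.2.5), the following minimal sketch approximates a formal Bayes estimator numerically. The concrete choices are assumptions made only for this example and are not taken from the thesis: $X \sim N(\theta, 1)$, $w(\theta, x) = \theta$, $c(\theta, x) = 1$, and the improper Lebesgue prior with density 1. The helper names `normal_density` and `formal_bayes` are hypothetical. Under these assumptions (1.2.8) holds and the ratio in (1.2.5) should reproduce the observation itself.

```python
import math

def normal_density(x, theta):
    """Density p(x | theta) of N(theta, 1) with respect to Lebesgue measure."""
    return math.exp(-0.5 * (x - theta) ** 2) / math.sqrt(2.0 * math.pi)

def formal_bayes(x, w, c, prior_density, lo, hi, steps=4000):
    """Approximate the ratio (1.2.5) by the trapezoidal rule on [lo, hi]."""
    h = (hi - lo) / steps
    num = 0.0
    den = 0.0
    for k in range(steps + 1):
        theta = lo + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        # Unnormalised formal posterior mass at theta, weighted by c.
        mass = weight * c(theta, x) * normal_density(x, theta) * prior_density(theta)
        num += w(theta, x) * mass
        den += mass
    return num / den

x_obs = 1.3
estimate = formal_bayes(
    x_obs,
    w=lambda theta, x: theta,           # quantity to be estimated (assumed)
    c=lambda theta, x: 1.0,             # weight in the loss (assumed)
    prior_density=lambda theta: 1.0,    # improper flat prior (assumed)
    lo=x_obs - 12.0,
    hi=x_obs + 12.0,
)
print(estimate)
```

The integration range is truncated at twelve standard deviations on either side of the observation, which is an arbitrary numerical choice; any prior density for which (1.2.8) holds could be substituted for the flat one.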
A rule $\phi$ is said to be admissible if there is no rule $\phi^*$ such that
\[ r(\phi^*, \theta) \le r(\phi, \theta) \quad \text{for all } \theta \in \Theta , \tag{1.2.9} \]
with strict inequality for some $\theta_0 \in \Theta$. If $\Pi$ is a measure on the Borel subsets of $\Theta$, then a rule $\phi$ is said to be almost admissible with respect to $\Pi$ if there is no rule $\phi^*$ for which (1.2.9) holds, with strict inequality on a set of positive $\Pi$ measure. Clearly, any Bayes (but not formal Bayes) rule is almost admissible with respect to the prior from which it is constructed.

In the following sections, we will be able to apply the following theorem, which gives conditions under which admissibility follows from almost admissibility. Although this theorem is well-known, we will present the short proof for completeness. It is important to observe that only the strict convexity of $L(\cdot, \theta, x)$, for all $\theta$ and $x$, is used in the proof.

Theorem 1.2.1: Suppose for every element $\theta_0 \in \Theta$ and every set $B \in \mathcal{B}$ for which $P_{\theta_0}(B) > 0$, $\Pi(\{\theta: P_\theta(B) > 0\}) > 0$. Then, if $\phi$ is almost admissible with respect to $\Pi$, it is admissible. In particular, admissibility follows from almost admissibility if $\{P_\theta: \theta \in \Theta\}$ is a family of mutually absolutely continuous probability measures.

Proof: Suppose that $\phi$ is not admissible, and that $r(\phi^*, \theta) \le r(\phi, \theta)$, with strict inequality at $\theta_0$. If $B = \{x \in \mathcal{X}: \phi^*(x) \ne \phi(x)\}$, it follows that $P_{\theta_0}(B) > 0$. Let $\psi = \tfrac{1}{2}\phi + \tfrac{1}{2}\phi^*$. Then, since
\[ L(\psi(x), \theta, x) \begin{cases} < \tfrac{1}{2} L(\phi(x), \theta, x) + \tfrac{1}{2} L(\phi^*(x), \theta, x), & x \in B, \\ = L(\phi(x), \theta, x) = L(\phi^*(x), \theta, x), & x \in B^c, \end{cases} \tag{1.2.10} \]
we have
\[ r(\psi, \theta) \le \tfrac{1}{2}\, r(\phi, \theta) + \tfrac{1}{2}\, r(\phi^*, \theta) , \tag{1.2.11} \]
with strict inequality if and only if $P_\theta(B) > 0$. Therefore, $r(\phi, \theta) - r(\psi, \theta) \ge 0$, with strict inequality if and only if $P_\theta(B) > 0$, and the conclusion of the theorem follows.

We are now in a position to state the main result of [45], which is central to this chapter.
Assume that $\Pi$ is absolutely continuous with respect to Lebesgue measure on $\Theta$ and denote its density by $\pi$. The sufficient conditions for almost admissibility involve a function $M: \mathcal{X} \times \Theta \to (-\infty, \infty)$ defined by
\[ M(x, \theta) = [c(\theta, x)\, p(x\,|\,\theta)\, \pi(\theta)]^{-1} \int_{\theta}^{\theta_u} [w(t, x) - \phi_\Pi(x)]\, c(t, x)\, p(x\,|\,t)\, \pi(t)\, dt . \tag{1.2.12} \]
Also let
\[ h(t) = \int_{\mathcal{X}} M^2(x, t)\, c(t, x)\, p(x\,|\,t)\, d\mu(x) , \tag{1.2.13} \]
and assume
(I) $\pi(t)\, h(t)$ is bounded away from 0 on compact subintervals of $\Theta$,
(II) $\{\theta: p(x\,|\,\theta) > 0\}$ is an interval a.e. $[\mu]$.

Theorem 1.2.2: Under assumptions (I) and (II), $\phi_\Pi$ is almost admissible with respect to $\Pi$ if, for all $c \in (\theta_\ell, \theta_u)$, when
(i) $\int_{c}^{\theta_u} \pi(t)\, r(\phi_\Pi, t)\, dt = \infty$, then
(ii) $\int_{c}^{\theta_u} (\pi(t)\, h(t))^{-1}\, dt = \infty$,
and when
(i)' $\int_{\theta_\ell}^{c} \pi(t)\, r(\phi_\Pi, t)\, dt = \infty$, then
(ii)' $\int_{\theta_\ell}^{c} (\pi(t)\, h(t))^{-1}\, dt = \infty$.

In [45], $c$ and $w$ did not depend on $x$, but as pointed out in [47], no difficulty is encountered in this case.

1.3 The Reduced Problem

Theorem 1.2.2 can not be applied directly in our problem because the mean and variance both are assumed unknown, and therefore the parameter space is not a subinterval of the real line. However, if we restrict ourselves to scale-invariant decision rules, then it is possible to obtain a reduced problem which can be formulated as in Section 1.2. In this section we shall consider a more general problem in which such a reduction is possible.

As in [47], we therefore consider a problem which remains invariant under a group $G$. (In our applications, $G$ will represent the scale group.) Suppose $G$ is a transformation group acting on the left of a given sample space, $(\mathcal{X}', \mathcal{B}')$, and that $X'$ is a random variable taking values in $\mathcal{X}'$. The distribution of $X'$ is unknown, but it is assumed to be a member of $\{P_{\theta'}: \theta' \in \Theta'\}$, a family of distributions indexed by a parameter set $\Theta'$. $\Theta'$ is also equipped with a $\sigma$-algebra $\mathcal{C}$.
Assume that $\mathcal{X}' = G/H \times \mathcal{X}'/G$, where $H$ is a subgroup of $G$, $G/H$ is the space of left cosets of $H$, and $\mathcal{X}'/G$ is the quotient (orbit) space under the equivalence, $x_1 \sim x_2$ if and only if $x_1 = g x_2$ for some $g \in G$. (Some situations in which such a decomposition exists may be found in Berk [2].) For simplicity, let $G^* = G/H$ and $\mathcal{X}^* = \mathcal{X}'/G$. Denote the canonical mapping of $G$ onto $G^*$ by $\tau^*$, and assume that $G$ acts on the first co-ordinate of $\mathcal{X}'$. That is, if $x' = (g^*, x^*) \in \mathcal{X}'$, and $g \in G$, then $g x' = (g g^*, x^*)$, where if $g^* = \tau^*(g_0)$, then $g g^*$ is well-defined by $g g^* = \tau^*(g g_0)$. The random variable, $X'$, is then of the form $X' = (G^*, X^*)$. Let $\mu^*$ be a $\sigma$-finite measure on a $\sigma$-algebra, $\mathcal{B}^*$, of subsets of $\mathcal{X}^*$, and assume that, for each $\theta' \in \Theta'$, the marginal distribution of $X^*$ given $\theta'$ is absolutely continuous with respect to $\mu^*$. Denote the corresponding density by $p(x^*\,|\,\theta')$, and assume it is jointly measurable in both arguments.

Assume that the action space, $A'$, is a subspace of the real line, and that the loss function, $L': A' \times \Theta' \to [0, \infty)$, is of the form
\[ L'(a', \theta') = \alpha(\theta')\,\|a' - \beta(\theta')\|^2 , \tag{1.3.1} \]
where $\alpha: \Theta' \to (0, \infty)$ and $\beta: \Theta' \to A'$ are measurable functions. The assumption that the problem remains invariant under $G$ entails the existence of transformation groups $\bar{G}$ and $\hat{G}$ acting on the left of $\Theta'$ and $A'$, respectively. These groups are required to act in such a way that if $\bar{g}$ and $\hat{g}$ denote the homomorphic images of $g \in G$ in $\bar{G}$ and $\hat{G}$, respectively, then $P_{\bar{g}\theta'}(g B') = P_{\theta'}(B')$ and $L'(\hat{g} a', \bar{g} \theta') = L'(a', \theta')$ for all $g \in G$, $a' \in A'$, $\theta' \in \Theta'$, and $B' \in \mathcal{B}'$. Assume $H$ acts trivially on $A'$.

A nonrandomized estimator $\delta: \mathcal{X}' \to A'$ is called invariant if $\hat{g}\,\delta(g^*, x^*) = \delta(g g^*, x^*)$ for all $g \in G$ and $(g^*, x^*) \in \mathcal{X}'$. If we let $g = g_0^{-1}$, where $g^* = \tau^*(g_0)$, it follows that any invariant estimator $\delta$ is equivalent to an estimator of the form $\hat{G}\, T(X^*)$, where $T: \mathcal{X}^* \to A'$ is a measurable function.
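The representation of invariant estimators as $\hat{G}\,T(X^*)$ can be checked numerically in a simple case. The sketch below assumes the scale group acting by $(x, s) \mapsto (cx, c^2 s)$ on the sample space and by $a \mapsto ca$ on the action space, with $G^* = s^{1/2}$ and $X^* = x s^{-1/2}$; this is the setting the thesis uses later for the normal law, adopted here only as an illustration. The function `T_example` is a hypothetical choice of $T$ and has no special role in the thesis.

```python
import math

def invariant_estimator(x, s, T):
    """Estimator of the form G* T(X*), with G* = s**0.5 and X* = x / s**0.5."""
    g_star = math.sqrt(s)
    x_star = x / g_star
    return g_star * T(x_star)

def T_example(x_star):
    # Hypothetical measurable function of the maximal invariant, for illustration only.
    return x_star + 0.5 * math.tanh(x_star)

x, s = 0.7, 3.2
for scale in (0.25, 1.0, 4.0):
    # Transform the data by the group element, then estimate ...
    lhs = invariant_estimator(scale * x, scale ** 2 * s, T_example)
    # ... and compare with transforming the estimate itself.
    rhs = scale * invariant_estimator(x, s, T_example)
    assert math.isclose(lhs, rhs, rel_tol=1e-12)
```

The check succeeds for any choice of `T_example`, since the maximal invariant $x s^{-1/2}$ is unchanged by the group action and only the factor $s^{1/2}$ picks up the scale.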
If $\delta(G^*, X^*) = \hat{G}\, T(X^*)$ is an arbitrary, invariant, nonrandomized estimator, then the risk of $\delta$, $r'(\delta, \theta')$, is
\[ E_{\theta'}\, \alpha(\theta')\,\|\hat{G}\, T(X^*) - \beta(\theta')\|^2 = E_{\theta'}\, \alpha(\bar{G}^{-1}\theta')\,\|T(X^*) - \beta(\bar{G}^{-1}\theta')\|^2 , \tag{1.3.2} \]
because of the assumed invariance of the loss. Here, and elsewhere, $E_{\theta'}$ will mean expectation when the true underlying distribution has parameter $\theta'$. It follows that
\[ r'(\delta, \theta') = E_{\theta'}\big( E_{\theta'}( \alpha(\bar{G}^{-1}\theta')\,\|T(X^*) - \beta(\bar{G}^{-1}\theta')\|^2 \,|\, X^*) \big) = E_{\theta'}\big( c(\theta', X^*)\,\|T(X^*) - w(\theta', X^*)\|^2 \big) + b(\theta') , \tag{1.3.3} \]
where
\[ c(\theta', X^*) = E_{\theta'}( \alpha(\bar{G}^{-1}\theta') \,|\, X^*) , \tag{1.3.4} \]
\[ w(\theta', X^*) = E_{\theta'}( \alpha(\bar{G}^{-1}\theta')\, \beta(\bar{G}^{-1}\theta') \,|\, X^*) / c(\theta', X^*) , \tag{1.3.5} \]
and
\[ b(\theta') = E_{\theta'}( \alpha(\bar{G}^{-1}\theta')\, \beta^2(\bar{G}^{-1}\theta') ) - E_{\theta'}( c(\theta', X^*)\, w^2(\theta', X^*) ) . \tag{1.3.6} \]

Let $\tau$ denote the canonical mapping of $\Theta'$ onto $\Theta^* = \Theta'/\bar{G}$, and assume that there exists a (1:1) measurable function $\phi: \Theta^* \to \Theta'$ such that $\tau \circ \phi: \Theta^* \to \Theta^*$ is the identity mapping. In other words, $\phi$ selects one element out of each equivalence class to represent that class. Wijsman ([41], [42]) has given conditions for the existence of such a measurable cross-section. From the invariance of the problem, it follows that
\[ r'(\delta, \theta') = r'(\delta, \phi \circ \tau(\theta')) , \tag{1.3.7} \]
for any invariant rule $\delta$, and for all $\theta' \in \Theta'$ (see, for example, [14], p. 149).

If the choice of decision rules is restricted to the class of nonrandomized, invariant procedures, $D_I$, a reduced problem is thus obtained which can be described as follows. A random variable $X^*$ with range $\mathcal{X}^*$ is observed. The density (with respect to $\mu^*$) of the distribution of $X^*$ is a unique, but unknown, member of $\{p(\cdot\,|\,\phi(\theta^*)): \theta^* \in \Theta^*\}$. The action space for the problem is $A'$, and loss is measured by
\[ L^*(a, \theta^*, X^*) = c(\phi(\theta^*), X^*)\,\|a - w(\phi(\theta^*), X^*)\|^2 + b(\phi(\theta^*)) . \tag{1.3.8} \]
The class of decision rules available, $D^*$, consists of all measurable functions, $T: \mathcal{X}^* \to A'$. The following obvious results are stated for future reference.

Lemma 1.3.1: The invariant procedure $\hat{G}\, T(X^*)$ is admissible in $D_I$ when loss is measured by $L'(\cdot, \cdot)$, if and only if, for the reduced problem, $T(X^*)$ is admissible in $D^*$, when loss is measured by $c(\cdot, \cdot)\,\|\cdot - w(\cdot, \cdot)\|^2$.

Lemma 1.3.2: The procedure $\hat{G}\, T(X^*)$ is (formal) Bayes in $D_I$ with respect to a prior measure, $\Pi'$, when loss is measured by $L'(\cdot, \cdot)$, if and only if, for the reduced problem, $T(X^*)$ is (formal) Bayes in $D^*$, with respect to $\Pi' \circ \tau^{-1}$, when loss is measured by $c(\cdot, \cdot)\,\|\cdot - w(\cdot, \cdot)\|^2$.

In order to apply the results of Section 1.2 and, in particular, Theorem 1.2.2, we identify $\mathcal{X}^*$ with $\mathcal{X}$ and $\Theta^*$ with $\Theta$. It follows that it is necessary to assume that $\Theta^* = (\theta^*_\ell, \theta^*_u)$, $-\infty \le \theta^*_\ell < \theta^*_u \le \infty$. If $\Pi'$ is a prior distribution on $\Theta'$, and $\Pi = \Pi' \circ \tau^{-1}$ is the induced distribution on $\Theta^*$, then we will assume that $\Pi$ has a density $\pi$ with respect to Lebesgue measure. Then, according to Lemma 1.3.2 and (1.2.5), $\hat{G}\, T_\Pi(X^*)$ is Bayes in $D_I$ with respect to $\Pi'$ if
\[ T_\Pi(x^*) = \frac{\int_{\Theta^*} w(\theta^*, x^*)\, c(\theta^*, x^*)\, p(x^*\,|\,\theta^*)\, \pi(\theta^*)\, d\theta^*}{\int_{\Theta^*} c(\theta^*, x^*)\, p(x^*\,|\,\theta^*)\, \pi(\theta^*)\, d\theta^*} , \tag{1.3.9} \]
except for values of $x^*$ in a measurable subset $A \subset \mathcal{X}^*$ for which
\[ \int_{A} d\mu^*(x^*) \int_{\Theta^*} c(\theta^*, x^*)\, p(x^*\,|\,\theta^*)\, \pi(\theta^*)\, d\theta^* = 0 , \tag{1.3.10} \]
and provided that
\[ \int_{\Theta^*} L^*(T_\Pi(x^*), \theta^*, x^*)\, p(x^*\,|\,\theta^*)\, \pi(\theta^*)\, d\theta^* < \infty \quad \text{a.e. } [\mu^*] . \tag{1.3.11} \]
Here, $\theta^*$ represents $\phi(\theta^*)$, and we shall continue to follow this practice when no confusion arises.

Let $M$ and $h$ be defined as in Theorem 1.2.2 (with $x^*$, $\theta^*$, $\theta^*_\ell$, $\theta^*_u$, $\mu^*$ replacing $x$, $\theta$, $\theta_\ell$, $\theta_u$, $\mu$, respectively), and for any $T \in D^*$, let
\[ r^*(T, \theta^*) = \int_{\mathcal{X}^*} L^*(T(x^*), \theta^*, x^*)\, p(x^*\,|\,\theta^*)\, d\mu^*(x^*) . \tag{1.3.12} \]

Theorem 1.3.1: Under assumptions (I) and (II), $\hat{G}\, T_\Pi(X^*)$ is almost admissible with respect to $\Pi'$ in $D_I$
if, for all $c \in (\theta^*_\ell, \theta^*_u)$, when
(i) $\int_{c}^{\theta^*_u} \pi(t)\, r^*(T_\Pi, t)\, dt = \infty$, then
(ii) $\int_{c}^{\theta^*_u} (\pi(t)\, h(t))^{-1}\, dt = \infty$,
and when
(i)' $\int_{\theta^*_\ell}^{c} \pi(t)\, r^*(T_\Pi, t)\, dt = \infty$, then
(ii)' $\int_{\theta^*_\ell}^{c} (\pi(t)\, h(t))^{-1}\, dt = \infty$.

In the following sections this theorem will be applied in two special cases involving the normal law.

1.4 Quantile Estimation

In this section we consider the problem in which we observe independent random variables $X$ and $Z = (Z_1, Z_2, \ldots, Z_n)'$, where $X: N(\mu, \sigma^2)$ and $Z: N_n(0, \sigma^2 I)$. In the usual situation, in which we have independent observations from a normal population, we obtain this model after applying a suitable orthogonal transformation. Here, $\sigma^2$ and $\mu$ are assumed to be unknown and we wish to estimate $\mu + \eta\sigma$, where $\eta$ is a known constant, and our loss function is of the form
\[ L(a; \mu, \sigma) = \sigma^{-2}\,\|a - \mu - \eta\sigma\|^2 . \]
A sufficient statistic in this problem is $(X, S)$, where $S = \sum_{i=1}^{n} Z_i^2$, and we need consider only estimators that are a function of this statistic. The problem remains invariant under the transformation group $G_1$ such that
\[ (X, S) \to (cX + d,\; c^2 S), \qquad (\mu, \sigma) \to (c\mu + d,\; c\sigma), \tag{1.4.1} \]
where $c > 0$ and $-\infty < d < \infty$. It follows that any nonrandomized $G_1$-invariant estimator of $\mu + \eta\sigma$ is of the form $X + cS^{1/2}$, $-\infty < c < \infty$. Since $G_1$ acts transitively on the parameter space, there exists a best choice of $c$. However, Zidek [46] has shown that the resulting estimator is inadmissible if $\eta \ne 0$. In fact, if we let $G_2$ denote the subgroup of $G_1$ obtained by putting $d = 0$ in (1.4.1), then there exists a $G_2$-invariant (i.e. scale-invariant) procedure having uniformly smaller risk.

In this example, it is easy to see that the results of the previous section apply with $G = \bar{G} = \hat{G} = G_2$, $H = \{e\}$ = the identity element of $G$, $X' = (G^*, X^*) = (S^{1/2}, XS^{-1/2})$, $\mathcal{X}^* = \Theta^* = (-\infty, \infty)$, $G^* = (0, \infty)$, $\theta' = (\mu, \sigma)$, $\tau(\theta') = \mu\sigma^{-1}$, and $\phi(\theta^*) = (\theta^*, 1)$.
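The best choice of $c$ among the $G_1$-invariant estimators $X + cS^{1/2}$ can be written down explicitly. The sketch below is a worked illustration rather than a formula quoted from the thesis: because the risk is constant over the parameter space, it is evaluated at $(\mu, \sigma) = (0, 1)$, where independence of $X$ and $S$ gives the quadratic $1 + c^2 n - 2c\eta\,E(S^{1/2}) + \eta^2$, minimized at $c = \eta\,E(S^{1/2})/n$ with $E(S^{1/2}) = \sqrt{2}\,\Gamma((n+1)/2)/\Gamma(n/2)$. The function names and the values of `eta` and `n` are hypothetical.

```python
import math

def mean_chi(n):
    """E[S**0.5] when S has a chi-square distribution with n degrees of freedom."""
    return math.sqrt(2.0) * math.exp(math.lgamma((n + 1) / 2.0) - math.lgamma(n / 2.0))

def risk(c, eta, n):
    """Risk of X + c*S**0.5 under the scaled squared error loss, at (mu, sigma) = (0, 1)."""
    return 1.0 + c * c * n - 2.0 * c * eta * mean_chi(n) + eta * eta

def best_c(eta, n):
    """Minimizer in c of the quadratic risk(c, eta, n)."""
    return eta * mean_chi(n) / n

eta, n = 1.645, 10   # illustrative values only
c_star = best_c(eta, n)
print(c_star, risk(c_star, eta, n))
```

This is the estimator that the chapter takes as its inadmissible starting point when $\eta \ne 0$; for $\eta = 0$ the formula gives $c = 0$ and the estimator reduces to $X$.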
For simplicity of notation we will denote $X^*$ by $Y$ and $\theta^*$ by $\lambda$. Let
\[ I_n(a, b) = \int_{0}^{\infty} t^n\, e^{-t^2/2 - (at - b)^2/2}\, dt . \tag{1.4.2} \]
Then, if $\mu^*$ is Lebesgue measure, it is easily shown that
\[ p(y\,|\,\lambda) = (\text{constant})\; I_n(y, \lambda) , \tag{1.4.3} \]
\[ c(\lambda, y) = E_\lambda(S \,|\, Y = y) = I_{n+2}(y, \lambda) / I_n(y, \lambda) , \tag{1.4.4} \]
and
\[ w(\lambda, y) = (\lambda + \eta)\, E_\lambda(S^{1/2} \,|\, Y = y) / c(\lambda, y) = (\lambda + \eta)\, I_{n+1}(y, \lambda) / I_{n+2}(y, \lambda) . \tag{1.4.5} \]
Finally, the formal Bayes estimator with respect to $\Pi$ (assuming one exists) is $S^{1/2}\, T_\Pi(Y)$, where
\[ T_\Pi(y) = \frac{\int_{-\infty}^{\infty} (\lambda + \eta)\, I_{n+1}(y, \lambda)\, \pi(\lambda)\, d\lambda}{\int_{-\infty}^{\infty} I_{n+2}(y, \lambda)\, \pi(\lambda)\, d\lambda} . \tag{1.4.6} \]
In order to obtain an applicable consequence of Theorem 1.3.1, assume
a1. $\pi(\lambda) > 0$, $-\infty < \lambda < \infty$,
a2. $\pi$ is non-increasing on $(0, \infty)$, and $\pi$ is non-decreasing on $(-\infty, 0)$,
a3. $\pi$ is bounded,
a4. $\pi(s\lambda)\, \pi^{-1}(\lambda) \le c_1 s^{-\beta}$, for all $\lambda$ and $0 < s < 1$, where $c_1 > 0$ and $\beta < n + 2$.

Theorem 1.4.1: Under assumptions a1, a2, a3, and a4, $S^{1/2}\, T_\Pi(Y)$ is admissible within the class of scale-invariant procedures if
(i) $T_\Pi(y) - y$ is bounded,
(ii) $\int_{1}^{\infty} (\lambda^2 \pi(\lambda))^{-1}\, d\lambda = \int_{-\infty}^{-1} (\lambda^2 \pi(\lambda))^{-1}\, d\lambda = \infty$.

Note: Explicit conditions on $\pi$ for the boundedness of $T_\Pi(y) - y$ will be presented in Theorem 1.4.2.

Proof: The result will follow from Theorem 1.3.1 if we can show that a1, a2, a3, a4, and (i) together imply
\[ h(\lambda) < K\lambda^2 , \qquad |\lambda| > 1 . \tag{1.4.7} \]
Note that here, and throughout the thesis, $K$ will be used to denote a generic constant whose precise value is of no relevance to the argument. Recalling equation (1.2.12) we see that
\[ M(y, \lambda)\, I_{n+2}(y, \lambda)\, \pi(\lambda) = \int_{\lambda}^{\infty} t\, I_{n+1}(y, t)\, \pi(t)\, dt + \eta \int_{\lambda}^{\infty} I_{n+1}(y, t)\, \pi(t)\, dt - T_\Pi(y) \int_{\lambda}^{\infty} I_{n+2}(y, t)\, \pi(t)\, dt . \]
But, after interchanging integrals, the first term is equal to
\[ \int_{0}^{\infty} x^{n+1} e^{-x^2/2} \int_{\lambda}^{\infty} t\, e^{-(xy - t)^2/2}\, \pi(t)\, dt\, dx = \int_{0}^{\infty} x^{n+1} e^{-x^2/2} \int_{\lambda}^{\infty} (t - xy)\, e^{-(xy - t)^2/2}\, \pi(t)\, dt\, dx + y \int_{0}^{\infty} x^{n+2} e^{-x^2/2} \int_{\lambda}^{\infty} e^{-(xy - t)^2/2}\, \pi(t)\, dt\, dx . \]
It follows that
\[ M(y, \lambda)\, I_{n+2}(y, \lambda)\, \pi(\lambda) = (y - T_\Pi(y)) \int_{\lambda}^{\infty} I_{n+2}(y, t)\, \pi(t)\, dt + \eta \int_{\lambda}^{\infty} I_{n+1}(y, t)\, \pi(t)\, dt + \int_{0}^{\infty} x^{n+1} e^{-x^2/2} \int_{\lambda}^{\infty} (t - xy)\, e^{-(xy - t)^2/2}\, \pi(t)\, dt\, dx . \tag{1.4.8} \]
Therefore, using a2 and (i), for $\lambda > 0$,
\[ M^2(y, \lambda)\, I^2_{n+2}(y, \lambda) \le K \big[ H^2_{n+2}(y, \lambda) + H^2_{n+1}(y, \lambda) + I^2_{n+1}(y, \lambda) + L^2_{n+1}(\lambda y^{-1})\, J_+(\operatorname{sgn} y) \big] , \tag{1.4.9} \]
where, if we let
\[ f(x) = (2\pi)^{-1/2} e^{-x^2/2} , \tag{1.4.10} \]
then
\[ H_n(a, b) = \int_{b}^{\infty} I_n(a, t)\, dt \tag{1.4.11} \]
and
\[ L_n(a) = \int_{a}^{\infty} x^n f(x)\, dx . \tag{1.4.12} \]
Here, $J_+(\operatorname{sgn} y)$ is equal to 1 if $y$ is positive, and is equal to 0 otherwise. The last term in (1.4.9) arises, since for $y > 0$ and $\lambda > 0$,
\[ \Big| \pi^{-1}(\lambda) \int_{0}^{\infty} x^{n+1} e^{-x^2/2} \int_{\lambda}^{\infty} (t - xy)\, e^{-(xy - t)^2/2}\, \pi(t)\, dt\, dx \Big| \le \int_{0}^{\infty} x^{n+1} e^{-x^2/2} \int_{\lambda}^{\infty} |t - xy|\, e^{-(xy - t)^2/2}\, dt\, dx \]
\[ = \int_{0}^{\infty} x^{n+1} e^{-x^2/2} \int_{\lambda - xy}^{\infty} |u|\, e^{-u^2/2}\, du\, dx \]
\[ = \int_{0}^{\lambda/y} x^{n+1} e^{-x^2/2} \int_{\lambda - xy}^{\infty} u\, e^{-u^2/2}\, du\, dx + \int_{\lambda/y}^{\infty} x^{n+1} e^{-x^2/2} \int_{\lambda - xy}^{\infty} |u|\, e^{-u^2/2}\, du\, dx \]
\[ \le I_{n+1}(y, \lambda) + 2(2\pi)^{1/2}\, L_{n+1}(\lambda y^{-1}) . \tag{1.4.13} \]
If $y < 0$, then $t - xy > 0$ for all $t > \lambda > 0$, and therefore the integral can be bounded directly, giving $I_{n+1}(y, \lambda)$ as the required upper bound.

Before continuing with the proof of the theorem, it will be useful to have available the following lemmas.

Lemma 1.4.1: ($-\infty < n < \infty$)
(a) If $K_1 < \pi^{-1/2}\, 2^{n/2 - 1} < K_2$, then there exists $a_0$ such that $a \ge a_0$ implies
\[ K_1 \le \frac{L_n(a)}{e^{-a^2/2} (1 + a^2/2)^{(n-1)/2}} \le K_2 . \]
(b) There exist $K_3, K_4 > 0$ such that $a \ge 0$ implies
\[ K_3 \le \frac{L_n(a)}{e^{-a^2/2} (1 + a^2/2)^{(n-1)/2}} \le K_4 . \]

Proof: By differentiating both of its sides, the following identity is proved:
\[ e^{-a^2/2} (1 + a^2/2)^{(n-1)/2} = \int_{a^2/2}^{\infty} e^{-y} \big[ -2^{-1}(n-1)(1 + y)^{(n-3)/2} + (1 + y)^{(n-1)/2} \big]\, dy . \]
Thus, $e^{-a^2/2} (1 + a^2/2)^{(n-1)/2}$ may be written as
\[ \int_{a^2/2}^{\infty} e^{-y}\, y^{(n-1)/2} \big[ -2^{-1}(n-3)\, y^{-(n-1)/2} (1 + y)^{(n-3)/2} + y^{-(n-3)/2} (1 + y)^{(n-3)/2} \big]\, dy . \]
To prove (a), note that
\[ \int_{a^2/2}^{\infty} e^{-y}\, y^{(n-1)/2}\, dy = 2^{1 - n/2}\, \pi^{1/2}\, L_n(a) , \qquad a \ge 0 . \]
Thus, since
\[ \lim_{y \to \infty} \big[ -2^{-1}(n-3)\, y^{-(n-1)/2} (1 + y)^{(n-3)/2} + y^{-(n-3)/2} (1 + y)^{(n-3)/2} \big] = 1 , \]
the limit of the ratio in (a) is $2^{n/2 - 1}\, \pi^{-1/2}$.
(b) follows from (a) and the continuity of $L_n(a)\, e^{a^2/2} (1 + a^2/2)^{(1-n)/2}$.

For convenience in the sequel, we set
\[ F(t) = \int_{-\infty}^{t} f(x)\, dx = (2\pi)^{-1/2} \int_{-\infty}^{t} e^{-x^2/2}\, dx , \tag{1.4.14} \]
\[ J_n(a) = \int_{0}^{\infty} x^n f(x - a)\, dx , \tag{1.4.15} \]
and
\[ R_n(a) = J_n(a)\, J^{-1}_{n+1}(a) . \tag{1.4.16} \]

Lemma 1.4.2: ($n = 0, 1, 2, \ldots$)
(a) $R_n(a) \le a^{-1}$, $a > 0$.
(b) There exists $K_5$ such that $R_n(a) \le K_5$, $a \ge 0$.
(c) If $a_0 < 0$, there exists $K_6$ such that $R_n(a) \le K_6 |a|$, $a \le a_0$.
(d) There exist $K_7$ and $K_8$ such that $R_n(a) \le K_7 |a| + K_8$, $a \le 0$.

Proof: Since $x^{-1}$ is convex for $x > 0$, it follows from Jensen's inequality that $R_n(a) \ge R_{n+1}(a)$. Therefore, $R_n(a) \le R_0(a)$. But
\[ R_0(a) = F(a) [f(a) + a F(a)]^{-1} \le a^{-1} , \qquad a > 0 . \]
The remaining parts of the lemma follow from the continuity of $R_n$ and/or from the observation that, by l'Hopital's rule,
\[ \lim_{a \to -\infty} F(a) [f(a) + a F(a)]^{-1} |a|^{-1} = \lim_{a \to -\infty} [2 |a| F(a) f^{-1}(a) - 1]^{-1} = 1 . \]
(For a proof of the latter equality refer to [12], p. 166.)

Lemma 1.4.3: ($n = 0, 1, 2, \ldots$)
(a) There exist $K_9, K_{10} > 0$ such that $K_9 a^n \le J_n(a) \le K_{10} (a + 1)^n$, $a \ge 0$.
(b) There exists $K_{11} > 0$ such that $J_n(a) \ge K_{11}$, $a \ge 0$.
(c) If $a_0 < 0$ there exist $K_{12}, K_{13} > 0$ such that
\[ K_{12} |a|^{-n-1} e^{-a^2/2} \le J_n(a) \le K_{13} |a|^{n-1} e^{-a^2/2} , \qquad a \le a_0 . \]

Proof: (a) From Lemma 1.4.2 (a),
\[ J_n(a) \ge a J_{n-1}(a) \ge \cdots \ge a^n J_0(a) = a^n F(a) \ge 2^{-1} a^n , \qquad a \ge 0 . \]
Also,
\[ J_n(a) = \int_{-a}^{\infty} (x + a)^n f(x)\, dx \le 2^{n-1} \Big( a^n F(a) + \int_{-a}^{\infty} |x|^n f(x)\, dx \Big) \le 2^{n-1} \Big( a^n + \int_{-\infty}^{\infty} |x|^n f(x)\, dx \Big) . \]
The result follows from continuity and the fact that $(a + 1)^n > 0$ for $a \ge 0$.
(b) This result is trivial.
(c) Since
\[ J_n(a) = \sum_{i=0}^{n} \binom{n}{i} a^i L_{n-i}(-a) , \]
it follows that, for $a \le a_0$,
\[ J_n(a) \le \sum_{i=0}^{n} \binom{n}{i} |a|^i L_{n-i}(-a) \le K_4 \sum_{i=0}^{n} \binom{n}{i} |a|^i e^{-a^2/2} (1 + a^2/2)^{(n-i-1)/2} , \]
where $K_4 = \max_{0 \le i \le n} K_4(i)$.
Here, for $-\infty < n < \infty$, $K_4(n)$ is used to accent the dependence on $n$ in Lemma 1.4.1. Therefore,
\[ J_n(a) \le K \sum_{i=0}^{n} \binom{n}{i} |a|^i |a|^{n-i-1} e^{-a^2/2} = K\, 2^n |a|^{n-1} e^{-a^2/2} , \qquad a \le a_0 . \]
Also, by Lemma 1.4.2 (c),
\[ J_n(a) \ge K |a|^{-1} J_{n-1}(a) \ge \cdots \ge K |a|^{-n} J_0(a) = K |a|^{-n} F(a) , \qquad a \le a_0 . \]
Therefore, since $\lim_{a \to -\infty} |a| F(a) f^{-1}(a) = 1$,
\[ J_n(a) \ge K |a|^{-n-1} f(a) , \qquad a \le a_0 . \]

Lemma 1.4.4: If $\lambda > 1$ and $y > 0$, then
\[ H_n(y, \lambda) \le K e^{-\lambda^2/2(1+y^2)} \big( 1 + \lambda^2/2(1+y^2) \big)^{(n-1)/2} , \]
where $H_n$ is as defined in (1.4.11).

Proof: Assume throughout the proof that $\lambda > 1$ and $y > 0$.
\[ I_n(y, \lambda) = (2\pi)^{1/2} (1 + y^2)^{-(n+1)/2} e^{-\lambda^2/2(1+y^2)} J_n\big( y\lambda (1 + y^2)^{-1/2} \big) , \tag{1.4.17} \]
and therefore, by Lemma 1.4.3,
\[ H_n(y, \lambda) \le (2\pi)^{1/2} (1 + y^2)^{-(n+1)/2} K_{10} \int_{\lambda}^{\infty} e^{-t^2/2(1+y^2)} \big( y t (1 + y^2)^{-1/2} + 1 \big)^n dt \]
\[ \le (2\pi)^{1/2} (1 + y^2)^{-(n+1)/2} K_{10} \int_{\lambda}^{\infty} e^{-t^2/2(1+y^2)} (t + 1)^n\, dt \]
\[ \le K L_n\big( \lambda (1 + y^2)^{-1/2} \big) \le K K_4\, e^{-\lambda^2/2(1+y^2)} \big( 1 + \lambda^2/2(1+y^2) \big)^{(n-1)/2} , \]
by Lemma 1.4.1.

We now return to the proof of Theorem 1.4.1. Assume, for the present, that $\lambda > 1$, and recall that $h(\lambda) = \int_{-\infty}^{\infty} M^2(y, \lambda)\, I_{n+2}(y, \lambda)\, dy$. Thus the proof in this case will be complete if we can show that
(I) $\int_{-\infty}^{-1} M^2(y, \lambda)\, I_{n+2}(y, \lambda)\, dy < K$,
(II) $\int_{-1}^{0} M^2(y, \lambda)\, I_{n+2}(y, \lambda)\, dy < K$,
(III) $\int_{0}^{1} M^2(y, \lambda)\, I_{n+2}(y, \lambda)\, dy < K$,
(IV) $\int_{1}^{\lambda} M^2(y, \lambda)\, I_{n+2}(y, \lambda)\, dy < K\lambda^2$, and
(V) $\int_{\lambda}^{\infty} M^2(y, \lambda)\, I_{n+2}(y, \lambda)\, dy < K\lambda^2$.

(I) ($-\infty < y < -1$): As a consequence of (1.4.17),
\[ |H_{n+2}(y, \lambda) / I_{n+2}(y, \lambda)| = e^{\lambda^2/2(1+y^2)} J^{-1}_{n+2}\big( y\lambda (1+y^2)^{-1/2} \big) \int_{\lambda}^{\infty} e^{-t^2/2(1+y^2)} J_{n+2}\big( y t (1+y^2)^{-1/2} \big)\, dt \]
\[ \le K K_{13} K_{12}^{-1} \lambda^{n+3} e^{\lambda^2/2} L_{n+1}(\lambda) \le K K_4 \lambda^{n+3} (1 + \lambda^2/2)^{n/2} \le K \lambda^{2n+3} . \]
Similarly,
\[ |H_{n+1}(y, \lambda) / I_{n+1}(y, \lambda)| \le K \lambda^{2n+1} . \]
Also,
\[ |I_{n+1}(y, \lambda) / I_{n+2}(y, \lambda)| = (1 + y^2)^{1/2} R_{n+1}\big( y\lambda (1+y^2)^{-1/2} \big) \le K_6 (1 + y^2)^{1/2} |y| (1 + y^2)^{-1/2} \lambda \le K \lambda (1 + y^2)^{1/2} . \]
Moreover,
\[ |H_{n+1}(y, \lambda) / I_{n+2}(y, \lambda)| = |H_{n+1}(y, \lambda) / I_{n+1}(y, \lambda)| \cdot |I_{n+1}(y, \lambda) / I_{n+2}(y, \lambda)| \le K \lambda^{2n+2} (1 + y^2)^{1/2} . \]
Therefore, for all $\lambda > 1$,
\[ \int_{-\infty}^{-1} M^2(y, \lambda)\, I_{n+2}(y, \lambda)\, dy \le K \int_{-\infty}^{-1} \lambda^{4n+6} (1 + y^2)^{-(n+1)/2} e^{-\lambda^2/2(1+y^2)} J_{n+2}\big( y\lambda (1+y^2)^{-1/2} \big)\, dy \]
\[ \le K K_{13} \lambda^{5n+7} e^{-\lambda^2/2} \int_{-\infty}^{-1} (1 + y^2)^{-(n+1)/2}\, dy < K . \]

(II) ($-1 < y < 0$): Since $J_{n+2}(\cdot)$ is increasing,
\[ |H_{n+2}(y, \lambda) / I_{n+2}(y, \lambda)| \le e^{\lambda^2/2(1+y^2)} \int_{\lambda}^{\infty} e^{-t^2/2(1+y^2)}\, dt = (1 + y^2)^{1/2} f^{-1}\big( \lambda (1+y^2)^{-1/2} \big) \big( 1 - F( \lambda (1+y^2)^{-1/2} ) \big) \le \lambda^{-1} (1 + y^2) \le 2\lambda^{-1} . \]
(The second last inequality is given in [12], p. 166.) Also,
\[ |I_{n+1}(y, \lambda) / I_{n+2}(y, \lambda)| = (1 + y^2)^{1/2} R_{n+1}\big( y\lambda (1+y^2)^{-1/2} \big) \le K (K_7 \lambda + K_8) . \]
Moreover, proceeding as in (I),
\[ |H_{n+1}(y, \lambda) / I_{n+2}(y, \lambda)| \le K (K_7 + K_8 \lambda^{-1}) . \]
Therefore, since $J_{n+2}(\cdot)$ is increasing,
\[ \int_{-1}^{0} M^2(y, \lambda)\, I_{n+2}(y, \lambda)\, dy \le K (K_7 \lambda + K_8)^2 e^{-\lambda^2/4} < K . \]

(III) ($0 < y < 1$):
\[ |H_{n+2}(y, \lambda) / I_{n+2}(y, \lambda)| \le K K_4 K_{11}^{-1} \big( 1 + \lambda^2/2(1+y^2) \big)^{(n+1)/2} \le K \lambda^{n+1} . \]
Similarly,
\[ |H_{n+1}(y, \lambda) / I_{n+2}(y, \lambda)| \le K \lambda^{n} . \]
Also,
\[ |I_{n+1}(y, \lambda) / I_{n+2}(y, \lambda)| \le K_5 (1 + y^2)^{1/2} \le K , \]
and
\[ |L_{n+1}(\lambda y^{-1}) / I_{n+2}(y, \lambda)| \le |L_{n+1}\big( \lambda (1+y^2)^{-1/2} \big) / I_{n+2}(y, \lambda)| \le (2\pi)^{-1/2} K_4 (1 + y^2)^{(n+3)/2} \big( 1 + \lambda^2/2(1+y^2) \big)^{n/2} J^{-1}_{n+2}\big( y\lambda (1+y^2)^{-1/2} \big) \le K \lambda^{n} . \]
Therefore, for all $\lambda > 1$,
\[ \int_{0}^{1} M^2(y, \lambda)\, I_{n+2}(y, \lambda)\, dy \le K \lambda^{2n+2} \int_{0}^{1} (1 + y^2)^{-(n+3)/2} e^{-\lambda^2/2(1+y^2)} J_{n+2}\big( y\lambda (1+y^2)^{-1/2} \big)\, dy \le K K_{10} \lambda^{2n+2} (\lambda + 1)^{n+2} e^{-\lambda^2/4} < K . \]

(IV) ($1 < y < \lambda$): Here, let $f(y, \lambda)$ be any function for which $|f(y, \lambda)| \le K \lambda^{m} (1 + y^2)^{k/2}$. Then, from (1.4.17), it follows that
\[ \int_{1}^{\lambda} f^2(y, \lambda)\, I_{n+2}(y, \lambda)\, dy \le K \lambda^{n+2+2m} \int_{1}^{\lambda} (1 + y^2)^{(2k-n-3)/2} e^{-\lambda^2/2(1+y^2)}\, dy \]
\[ = K \lambda^{n+2+2m} \int_{2^{1/2}}^{(1+\lambda^2)^{1/2}} u^{2k-n-3} \big[ (u^2 - 1)^{-1/2} u \big] e^{-\lambda^2/2u^2}\, du \le K \lambda^{n+2+2m} \int_{2^{1/2}}^{(1+\lambda^2)^{1/2}} u^{2k-n-3} e^{-\lambda^2/2u^2}\, du \]
\[ = K \lambda^{2(m+k)} \int_{\lambda (1+\lambda^2)^{-1/2}}^{\lambda 2^{-1/2}} v^{n+1-2k} e^{-v^2/2}\, dv \le K \lambda^2 , \]
providing that $m + k \le 1$.

For $y > 1$, Lemma 1.4.3 (a) gives $J_{n+2}\big( y\lambda (1+y^2)^{-1/2} \big) \ge K_9 \big( y\lambda (1+y^2)^{-1/2} \big)^{n+2} \ge K \lambda^{n+2}$. Therefore,
\[ |H_{n+2}(y, \lambda) / I_{n+2}(y, \lambda)| \le K K_4 \lambda^{-n-2} \big( 1 + \lambda^2/2(1+y^2) \big)^{(n+1)/2} (1 + y^2)^{(n+3)/2} \le K \lambda^{-1} (1 + y^2) , \qquad 1 < y < \lambda . \]
Similarly,
\[ |H_{n+1}(y, \lambda) / I_{n+2}(y, \lambda)| \le K \lambda^{-n-2} \big( 1 + \lambda^2/2(1+y^2) \big)^{n/2} (1 + y^2)^{(n+3)/2} \le K \lambda^{-2} (1 + y^2)^{3/2} . \]
Also,
\[ |L_{n+1}(\lambda y^{-1}) / I_{n+2}(y, \lambda)| \le L_{n+1}\big( \lambda (1+y^2)^{-1/2} \big) / I_{n+2}(y, \lambda) \le K K_4 \lambda^{-n-2} \big( 1 + \lambda^2/2(1+y^2) \big)^{n/2} (1 + y^2)^{(n+3)/2} \le K \lambda^{-2} (1 + y^2)^{3/2} . \]
Moreover,
\[ |I_{n+1}(y, \lambda) / I_{n+2}(y, \lambda)| = (1 + y^2)^{1/2} \big| R_{n+1}\big( y\lambda (1+y^2)^{-1/2} \big) \big| \le \lambda^{-1} (1 + y^2) y^{-1} \le K \lambda^{-1} (1 + y^2)^{1/2} . \]
In each case the bound is a function of the form $K \lambda^{m} (1 + y^2)^{k/2}$, where $m + k \le 1$.

(V) ($\lambda < y < \infty$): Using (1.4.6), we see that
\[ \int_{-\infty}^{\infty} \big[ T_\Pi(y)\, I_{n+2}(y, t) - (t + \eta)\, I_{n+1}(y, t) \big] \pi(t)\, dt = 0 , \tag{1.4.18} \]
and it follows from (1.4.8) that $M$ also satisfies
\[ M(y, \lambda)\, I_{n+2}(y, \lambda)\, \pi(\lambda) = (T_\Pi(y) - y) \int_{-\infty}^{\lambda} I_{n+2}(y, t)\, \pi(t)\, dt - \eta \int_{-\infty}^{\lambda} I_{n+1}(y, t)\, \pi(t)\, dt - \int_{0}^{\infty} x^{n+1} e^{-x^2/2} \int_{-\infty}^{\lambda} (t - xy)\, e^{-(xy - t)^2/2}\, \pi(t)\, dt\, dx . \tag{1.4.19} \]
Therefore, M 2(y>. A> In+2 ( y' < KtC / I n + 2<y» t ) i r(t)dt) 2, , w 2 , , % 2,,v . r , \ , % , v . v2 X 2 + .( / ^ ( y . 
t ) T r ( t ) d t ) ' (1.4.20) . T. n+1 -x2/2 \ . t . -(xy-t) 2/2 , X J J ,2, + (/ x e / (t-xy)e w Tr(t)dt dx) ] As in part IV, i f f(y, A) is any function for which |f(y, A) | <. KA m(l+y 2) k /' 2 , then / f 2 ( y , X ) I n + 2 ( y , A) X < „.2(m+k) A ^ 1 + a 2 ) n+l-2k -v2/2 , < KA f v e dv < KA2 , providing that m + k <_ 1 , and k < (n+2)/2 . Now, using (1.4.17), we see that for y > X , / ^ ( y . t ) T r ( t ) d t - K ( l + y 2 ) - < n + 3 ) / 2 } e - t 2 / 2 ( 1 + y 2 ) j n + 2 ( y ( l + y 2 ) - 1 / 2 t ) r r ( t ) d t < K ( l + y 2 ) - < n + 3 ) / 2 [K + / e - t 2 / 2 ( 1 + y 2 ) t n + 2 1 T ( t ) d t ] 1 < K ( l + y 2 ) - ( n + 3 ) / 2 [ , ( X ) X e + ,(X)X g / e - t 2 / 2 ( l + y 2 ) t n + 2 - g d t ] - K ( l + y 2 ) - < n + 3 > / 2 Tr(A)[X e +X n + 3 ) e - X 2 u 2 / 2 ( l + y 2 ) u n + 2 - B , u ] X"1 < K ( l + y 2 ) - ( n + 3 ) / 2 X N + 3 TT(X) In the second inequality, assumption a4 was applied twice. In the f i r s t case, s = X 1 , and i n the second case, s = tX 1 . Similarly, / i n + 1 ( y , t M O d t < K ( i + y 2 ) - ( n + 2 ) / 2 x n + 2 TT(X) F i r s t observe that ,\ " n+1 -x 2/2, . -(xy-t) 2/2 . , . . • \S I x e (xy-t)e J dx ir(t)dt| 1 o ) 7 x n + 1 e " x 2 / 2 | x y - t | e - ( x y " t ) 2 / 2 dx rr( t )dt 1 0 K ( i + y V M , 2 J " « ' , t l |wyd+y 2)- 1 / 2 - t | 1 0 - v 2/2 ( l + y 2 > - ( w y U + y 2 ) 1 / 2 - t) 2/2 , e dw i r ( t ) d t W r ^ 2 " 2 ) " " ^ 1 l»y(i+y 2)- 1 / 2-t| 1 0 - ( w y ( l + y 2 ) " 1 / 2 - t) 2/2 , e dw 7r(t)dt K ( l + y 2 ) - ( n + 2 ) / 2 / / ( v + t ) n + 1 |v| e " V 1 2 dv T r ( t ) d t 1 - t K ( l + y 2 ) - ( n + 2 ) / 2 / 7 (|v| n + 1 + t n + 1 ) | v | e " V 1 2 dv ,(t)dt 1 -co 2 K ( l + y 2 ) - ( n + 2 ) / 2 ir<A)AB / 7 ( | v | n + 1 + t n + 1 ) | v | e " V / 2 dv t " B 1 -00 K a + y 2 ) - ^ 7 2 - ^ ^ 3 ) t n + 1 ' P dt 1 K ( l + y 2 ) - < n + 2 ) / 2 X n + 2 „(X> 35 Finally, by treating the cases -» < t < 0 , and 0 < t < 1 separately, i t i s not hard to see that 1 oo 2 2 , n+1 -x 12. . -(xy-t) /2 , . 
N j , [ / / x e (xy-t)e w dx Tr(t)dt| - c o o <K(l+y 2)-< n + 2> / 2 < K ( l + y 2 ) - ( n + 2 ) / 2 X * v ( X ) But, for y > X , I n + 2 ( Y , X) > K(l+y 2)-< n + 3>/ 2 x n + 2 e - ^ / 2 ^ > K ( l + y 2 ) - ( n + 3 ) / 2 X N + 2 E - A 2 / 2 ( 1 + A 2 > > K ( l + y 2 ) - ( n + 3 ) / 2 X N + 2 Therefore / I t t f 2 ( y ' t ) i r ( t ) d t -CO i n + 2 ( y .. X ) T T ( X ) < K X , / i n + 1 ( y , t M t ) d t . - n + 1 _ < K ( l + y 2 ) 1 / 2 , i n + 2 ( y , X ) T T ( X ) 36 X °° 2 2 , r , , n+1 - x 12. . - ( x y - t ) /2 , S j , . a n d j / / x e ( x y - t ) e w i T ( t ) d x d t | -°° o __ I n + 2 ( y , A)T T ( X ) < K ( l + y 2 ) 1 / 2 In each case the bound is a function of the form KX m(l+y 2) k^ 2 , with k + m <^ 1 , and k < (n+2)/2 . The proof is complete for X > 1 . Now, using (1.4.19), we see that M also satisfies M(y, x ) I n + 2 ( y ' A ) 7 T ( A ) = ( T n ( y ) _ y ) / ^+2^' - t ^ ^ d t —A -n / x n + 1 ( y » -t ) i r(-t)dt (1.4.21) , " n+1 - x 2 / 2 " / - j . \ - ( x y + t ) 2 / 2 , . A t_ , + / x e / ( t + x y ) e J i r ( - t ) d t d x o -X Also I n(y, t) = I n(-y, -t) . But i f X < -1 , Tr(-t) <. TT(X) for t > - X . So we readily obtain the basic inequality corresponding to (1.4.9), and i t differs only i n as much as -y and |x| replace y and X , respectively. Therefore, the previous analysis allows us to conclude that / M2(y, A ) I n + 2 ( y , X)dy < KX2 , X < -1 . A 37 Similarly, using (1.4.8) M ( y » x ) I n + 2 ( y » X ) 7 T ( X ) = ( y - T n ( y ) > / 1T&2(-y> - O u C - O d t +n I x n + 1 ( y . - t ) i r ( - t ) d t (1.4.22) — CO 7 n+1 -x2/2 ~ X , , . -(xy+t) 2/2 , . , -fx e f (t+xy)e 3 Tr(-t)dt dx The analysis of part V now allows us to conclude that / M 2(y, A ) I n + 2 ( y , A)dA < KA2 , A < -1 . In this case assumption a4 is applied with s = -A ^ and s = -tA The proof of Theorem 1.4.1 is complete. Assume: -1 a5. ir(t)TT~ (A) i . c 2 + °3 lt_x|a f o r a 1 1 x a n d t » where c^, c^, a > 0 . 
Theorem 1.4.2: Under assumptions a2, a4, and a5, there exists K such that |T n(y)-y| < K , -co < y < 38 Proof: It can be shown that T n(y) = y + A + nB , (1.4.23) " n+1 -x2/2 " \ -(xy-X) 2/2 . , fx e f (X-xy)e J Tr(X)dX dx where A = ^ 2 ~ 2 (1.4.24) . n+2 -x 12 . -(xy-X) /2 /-.x,, , fx e f e J Tr(X)dX dx oo 2 0 0 \ 2 . , n+1 -x /2 , -(xy-X) 2 , / x e f e • 7r(X)dX dx and B = ^ — (1.4.25) CO / O J , n+2 -x 12 . -(xy-X) /2 ,. S A. , fx e / e ' Tr(X)dX dx Now, 00 2 , 0 0 2 , 1 t- n + l -x 2 , -u/2 / L , j I |/ x e / ue iT(u+xy)du dx| Al , n+2 -x2/2 . -u2/2 , . , j / x e f e TT (u+xy) du dx oo 2 co 2 / x 1 1 ^ e X ^ 2 fr(xy) / |u| e U ^ 2 Tr(u+xy)Tr "*"(xy)du dx 0 < n+2 -x2/2 , s • -u2/2 , , \ -1/ •> , 1 / x e it(xy) / e Tr(u+xy)Tr (xy)du dx <K*f(y) , . 2 / |u| e~ U / 2(c 2+c 3|u| a)du —CO where ¥> = ! /e (c 2+c 3|u| ) du 39 and f(y) -oo 2 . n+1 -x 12 , . -1. / x e ir(xy)Tr (y)dx o °° 2 . n+2 -x 2 , > - L >, / x e IT (xy) IT (y)dx Similarly, |B | <_ K**f (y) , J e- u 2 / 2(c 2+c 3|u| a)du where K** J e ^ ^ + C g l u l 0 1 ) - 1 du r v n + 1 Q ~ x 2 / 2 3 j . 7 n+1 -x 2 . c / x e x d x + / x e dx But f(y) < 5 3 » 7 n+2 -x 12 f x e. and the result follows. 1.5 Scale Estimation The structure i n this section is identical to that i n Section 1.4, ex-cept that the problem of interest i s the estimation of a 2 p , p > 0 . The / 2 2 loss function i s given by L(a; u, a) = a |a-a > and the action of on A=(0, oo) is such that a -> c 2 p a , c > 0 . Any 6^-invariant estimator i s of the form cS P , and, again, the best G^-invariant estimator i s inadmissible ([37], [5]). 
In this example, c(X, y) = E x(S 2 p|Y=y) = I n + 4 p ( y , A)/I n(y, X) (1.5.1) 40 and w(X, y) = E X(S P|y=y)/c(X, y) = W 7 ' X ) / I n + 4 P ( y ' X ) ' ( 1- 5' 2> The formal Bayes estimator with respect to n (assuming one exists) i s S PT n(Y) , where, CO / • I n + 2 p-(y» x)Tr(x)dx T n ( y ) = Z E • <1-5-3> Theorem 1.4.1 and Theorem 1.4.2 have analogues i n this example, and the methods of proof are similar to, and slightly less complicated than, those of Section 1.4. We therefore present only the results, noting that M(y,. x ) i n + 4 p ( y , A)TT.(A) = / i n + 2 p ( y , t )T r(t)dt X - T n(y) / I n + 4 p ( y , t ) T i ( t ) d t . (1.5.4) X Replacing 3 < n+2 by g < n+2p+l i n a4 , we have the following theorems: Theorem 1.5.1: Under assumptions a l , a2, a3, and a4, SPT^(Y) is admissible within the class of scale-invariant procedures i f (i) ^rj(y) i s bounded, CO -1 and ( i i ) / (X T TU))" 1 dX = / (X 2 T r(X)) _ 1 dX = Theorem 1.5.2: Under assumptions a2, a4, and a5, there exists K such that |T n(y) | <. K , -» < y < 0 0 • 42 Chapter 2: Minimax, Scale-admissible Estimators Of Scale Parameters 2.1 Introduction Consider the canonical form of the general linear model in which we observe independent random variables X and Z = (Z^, Z n ) ' , where 2 2 2 X: N (u, a I) and Z: N (0, o I) . Here, a and u are assumed to be k ^ unknown and we wish to estimate a 2 p(p>0) , where our loss function (for the present) is of the form L(a; u» a) = a (a-a ) . A sufficient n 2 s t a t i s t i c in this problem is (X, S) , where S = Z 2. , and we need i=l consider only estimators that are a function of this s t a t i s t i c . The problem remains invariant under the transformation group G , such that (X, S) _ (caX+d, c 2S) (u, a) *- (cau+d, ca) (2.1.1) 2p a *- c v a 1^ where c > 0 , d E R , and a is a k x k orthogonal matrix. 
It follows that any nonrandomized $\mathcal{G}$-invariant estimator of $\sigma^{2p}$ is of the form $cS^{p}$. Since $\mathcal{G}$ acts transitively on the parameter space, there exists a best choice of $c$, given by

$$c^\circ = \frac{E_{0,1}(S^{p})}{E_{0,1}(S^{2p})} = \frac{\Gamma(p+\frac{n}{2})}{2^{p}\,\Gamma(2p+\frac{n}{2})}, \tag{2.1.2}$$

where $\Gamma(x) = \int_0^\infty u^{x-1}e^{-u}\,du$.

Although $c^\circ S^{p}$ is minimax ([22], [23]), it is well known that this estimator is inadmissible ([37], [5]). In fact, if we let $\mathcal{H}$ denote the subgroup of $\mathcal{G}$ obtained by putting $d = 0$ in (2.1.1), then there exists an $\mathcal{H}$-invariant procedure having uniformly smaller risk. However, the estimators which have been chosen to dominate the usual procedure are themselves $\mathcal{H}$-inadmissible.

In this chapter, we construct an alternative, minimax estimator, which is formal Bayes within the class of scale-invariant procedures. If $k = 1$, then using the results of Chapter 1, we are able to show that this estimator is also scale-admissible. An extension of these results to the problem obtained by introducing a more general loss function is carried out in Section 4.

As a first step in the construction, we will rederive the estimators of Stein and Brown in a manner which will suggest the direction in which to proceed.

2.2 Inadmissibility of the best G-invariant estimator

Let $Y = S^{-1/2}X$, $W = \|Y\|^{2} = \sum_{i=1}^{k} Y_i^{2}$, $\lambda = \sigma^{-1}\mu$, and $\delta = \|\lambda\|^{2}$. Observe that for any nonrandomized, $\mathcal{H}$-invariant procedure, $\Psi(W)S^{p}$, we have

$$E_{\mu,\sigma}\big[\sigma^{-4p}(\Psi(W)S^{p}-\sigma^{2p})^{2}\big] = E_{(\delta^{1/2},0,\dots,0),1}\big[(\Psi(W)S^{p}-1)^{2}\big]. \tag{2.2.1}$$

Our goal in this section is that of finding a $\mathcal{B}_W$-measurable function, $\Psi$, such that, for all $\delta \ge 0$,

$$E_\delta\big[(\Psi(W)S^{p}-1)^{2}\big] \le E_0\big[(c^\circ S^{p}-1)^{2}\big], \tag{2.2.2}$$

with strict inequality for at least one $\delta_0$.

Example 2.2.1 (Stein)

Since

$$E_\delta\big[(c^\circ S^{p}-1)^{2} - (\Psi(W)S^{p}-1)^{2}\big] = E_\delta\big\{E_\delta\big[(c^\circ S^{p}-1)^{2} - (\Psi(W)S^{p}-1)^{2} \mid W\big]\big\}, \tag{2.2.3}$$

it is sufficient (but not necessary) to find a $\mathcal{B}_W$-measurable function, $\Psi$, such that, for all $\delta \ge 0$,

$$E_\delta\big[(c^\circ S^{p}-1)^{2} - (\Psi(W)S^{p}-1)^{2} \mid W=w\big] \ge 0 \quad \text{for almost all } w\ [P_\delta], \tag{2.2.4}$$

with strict inequality on a set of positive probability for at least one $\delta_0$.

For notational simplicity we introduce a random variable $T$, whose joint density, with $W$, is

$$f_{T,W}(t,w\mid\delta) = K\,t^{2p} f_{S,W}(t,w\mid\delta) = K\,t^{(n+k+4p-2)/2}\,w^{(k-2)/2}\,e^{-t(1+w)/2-\delta/2}\sum_{i=0}^{\infty}\frac{(\delta t w)^{i}}{i!\,[(2i+k-2)/2]!\,4^{i}}, \quad t, w > 0 \tag{2.2.5}$$

(with respect to Lebesgue measure on $\mathbb{R}^{2}$). Also, a function $f: (0,\infty) \to \mathbb{R}$ will be called (strictly) bowl-shaped if there exists $x_0 > 0$ such that $f$ is (strictly) decreasing on $(0, x_0]$ and (strictly) increasing on $[x_0, \infty)$.

Then, for fixed $\delta$ and $w$, $E_\delta[(cS^{p}-1)^{2}\mid W=w]$ is a strictly bowl-shaped function of $c$, which takes its minimum at

$$\frac{E_\delta[S^{p}\mid W=w]}{E_\delta[S^{2p}\mid W=w]} = E_\delta[T^{-p}\mid W=w]. \tag{2.2.6}$$

It is easy to show that, for each $w$ and $\delta$, $f_{T,W}(t,w\mid\delta)\,f_{T,W}^{-1}(t,w\mid 0)$ is a non-decreasing function of $t$, so that

$$\sup_{\delta\ge 0} E_\delta[T^{-p}\mid W=w] = E_0[T^{-p}\mid W=w] = \frac{(1+w)^{p}\,\Gamma[(n+k+2p)/2]}{2^{p}\,\Gamma[(n+k+4p)/2]}. \tag{2.2.7}$$

But $E_0[T^{-p}\mid W=w] < c^\circ$ in a neighbourhood of the origin, and therefore, letting

$$\Psi_S(w) = \min\{E_0[T^{-p}\mid W=w],\ c^\circ\}, \tag{2.2.8}$$

we obtain a procedure which dominates the usual one. Moreover, it is not difficult to show that, for each $w$, $\inf_{\delta\ge 0} E_\delta[T^{-p}\mid W=w] \le c^\circ$. Therefore, improvement cannot be achieved for any $w$ by replacing $c^\circ$ by this infimum.

Note that the method of proof used in this example may be viewed as a modification of that used in [46] and [47].

More generally, if $\mathcal{B}$ is any sub-$\sigma$-algebra of $\mathcal{B}_W$, then it is sufficient to find a $\mathcal{B}$-measurable function $\Psi$, such that, for all $\delta \ge 0$,

$$E_\delta\big[(c^\circ S^{p}-1)^{2} - (\Psi(W)S^{p}-1)^{2} \mid \mathcal{B}\big] \ge 0 \quad \text{a.e. } [P_\delta], \tag{2.2.9}$$

with strict inequality on a set of positive probability for at least one $\delta_0$. As in Example 2.2.1, for each atom $B$ of $\mathcal{B}$, and $\delta \ge 0$, $E_\delta[(cS^{p}-1)^{2}\mid B]$ is a strictly bowl-shaped function of $c$ which takes its minimum at $E_\delta[T^{-p}\mid B]$.
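As a numerical illustration of (2.1.2), (2.2.7) and (2.2.8), the following minimal Python sketch evaluates the best invariant multiplier $c^\circ$ and the Stein-type multiplier $\Psi_S(w)$. The function names `c_best` and `psi_stein` are hypothetical labels introduced here for illustration; only the formulas themselves are taken from the text.

```python
import math


def c_best(n, p):
    """Best invariant multiplier c° of (2.1.2):
    Gamma(p + n/2) / (2^p * Gamma(2p + n/2))."""
    return math.exp(math.lgamma(p + n / 2) - math.lgamma(2 * p + n / 2)) / 2 ** p


def psi_stein(w, n, k, p):
    """Stein-type multiplier of (2.2.8): min{E_0[T^-p | W = w], c°},
    with E_0[T^-p | W = w] taken from (2.2.7)."""
    a = (n + k + 2 * p) / 2
    cond_mean = (1 + w) ** p * math.exp(math.lgamma(a) - math.lgamma(a + p)) / 2 ** p
    return min(cond_mean, c_best(n, p))


if __name__ == "__main__":
    n, k, p = 10, 1, 1  # estimating the variance (p = 1) with a scalar mean
    for w in (0.0, 0.05, 0.5, 5.0):
        print(w, psi_stein(w, n, k, p), c_best(n, p))
```

With $p = 1$ the two multipliers reduce to $1/(n+2)$ and $\min\{(1+w)/(n+k+2),\ 1/(n+2)\}$, so the truncation is active only for small $w$.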
The construction of $\Psi$ now requires that we determine, for each atom $B$, whether $E_\delta[T^{-p}\mid B]$ is bounded away (in $\delta$) from $c^\circ$. For those $B$ for which the answer is affirmative, $\Psi$ is defined in the obvious way. Otherwise $\Psi$ is set equal to $c^\circ$.

Example 2.2.2 (Brown)

If we fix $r > 0$ and let $\mathcal{B}_{(r)} = \mathcal{B}(I_{[0,r]}(W))$, we see that

$$\sup_{\delta\ge 0} E_\delta[T^{-p}\mid W\in[0,r]] = E_0[T^{-p}\mid W\in[0,r]] < c^\circ. \tag{2.2.10}$$

We obtain the result by letting

$$\Psi_{(r)}(w) = \min\{c^\circ,\ E_0[T^{-p}\mid\mathcal{B}_{(r)}](w)\} = \begin{cases} \dfrac{\Gamma[(n+k+2p)/2]\displaystyle\int_0^{r} u^{(k-2)/2}(1+u)^{-(n+k+2p)/2}\,du}{2^{p}\,\Gamma[(n+k+4p)/2]\displaystyle\int_0^{r} u^{(k-2)/2}(1+u)^{-(n+k+4p)/2}\,du}, & w \le r, \\[2ex] c^\circ, & w > r. \end{cases} \tag{2.2.11}$$

(2.2.10) can be obtained most easily by noting that $f_{T\mid W\le r}(t\mid\delta)$ has non-decreasing likelihood ratio with respect to $f_{T\mid W\le r}(t\mid 0)$, which in turn has increasing likelihood ratio with respect to $f_T(t)$.

Since $\Psi_S$ and $\Psi_{(r)}$ are not analytic, we know that the corresponding estimators are inadmissible ([32], [6]). In fact, if we are searching for admissible procedures, it seems reasonable to ask 'a priori' why we should restrict our search to procedures which are measurable with respect to small $\sigma$-algebras (such as $\mathcal{B}_{(r)}$). The answer appears to lie in the method of proof implied by (2.2.9). Looking at each atom separately means that we have lost any desirable effects which might accrue from averaging over atoms (see "not necessary" in Example 2.2.1). This loss will probably be most significant if the number of atoms is large.

It is interesting to observe that the method may fail completely if the $\sigma$-algebra chosen is too large. For example, if we do not require the orthogonal invariance, then we might look for a scale-invariant procedure, $\Psi(Y)S^{p}$, where $\Psi$ is $\mathcal{B}_Y$-measurable. If we attempt to improve on separate atoms of $\mathcal{B}_Y$, then the proof fails to produce an improved estimate. At the same time, we can improve on certain atoms of $\mathcal{B}_{(|Y_1|,\dots,|Y_k|)}$.

In the next section we will construct a minimax, formal Bayes scale-invariant estimator, $\Psi^{*}(W)S^{p}$. The construction will be carried out roughly as follows:

a) select a sequence of $\sigma$-algebras $\mathcal{B}_m \uparrow \mathcal{B}_W$,
b) select a "good" $\mathcal{B}_m$-measurable function, $\Psi_m$, for each $m$, and
c) let $\Psi^{*}(W) = \lim_{m\to\infty}\Psi_m(W)$.

An obvious choice for $\Psi_m$ is given by $\Psi_m(W) = \min(c^\circ,\ E_0[T^{-p}\mid\mathcal{B}_m](W))$, but this approach fails because we see that $\lim_{m\to\infty}\Psi_m(W) = \Psi_S(W)$. In the next section we shall see that our choice of a "good" $\mathcal{B}_m$-measurable function is motivated by a monotonicity property of $\{\Psi_{(r)}\}_{r>0}$ and by the desire to avoid truncation by $c^\circ$ (and hence, possibly, to obtain analyticity).

2.3 The construction of a minimax, formal Bayes scale-invariant estimator

Although the inadmissibility of $\Psi_{(r)}(W)S^{p}$ is clear from analyticity considerations, a more direct proof can be given. In fact, if we select $0 < r' < r$, then $f_{T\mid W\le r'}(t\mid 0)$ has increasing likelihood ratio with respect to $f_{T\mid W\le r}(t\mid 0)$, so that $E_0[T^{-p}\mid W\in[0,r']] < E_0[T^{-p}\mid W\in[0,r]]$, and we can therefore repeat the argument in Example 2.2.2 to find a function $\Psi'$ which is better than $\Psi_{(r)}$, where

$$\Psi'(w) = \begin{cases} E_0[T^{-p}\mid W\in[0,r']], & 0 \le w \le r', \\ E_0[T^{-p}\mid W\in[0,r]], & r' < w \le r, \\ c^\circ, & w > r, \end{cases} \tag{2.3.1}$$

$$\phantom{\Psi'(w)} = \min\{\Psi_{(r')}(w),\ \Psi_{(r)}(w)\}.$$

More generally, let $\{r_{\alpha\beta}\}$, $\alpha = 1, 2, \dots$, $\beta = 1, \dots, n_\alpha$, be a double sequence of constants such that

a) $0 = r_{\alpha 0} < r_{\alpha 1} < \dots < r_{\alpha n_\alpha}$,
b) $\{r_{\alpha\beta} \mid \beta = 1, \dots, n_\alpha\} \subset \{r_{(\alpha+1)\beta} \mid \beta = 1, \dots, n_{\alpha+1}\}$ for all $\alpha$,
c) $r_{\alpha n_\alpha} \to \infty$, and
d) $\lim_{\alpha\to\infty}\max_{1\le\beta\le n_\alpha}|r_{\alpha\beta} - r_{\alpha(\beta-1)}| = 0$.

Now, for each $\alpha$, if we let $\Psi_\alpha(w) = \min_{1\le\beta\le n_\alpha}\Psi_{(r_{\alpha\beta})}(w)$, then $\Psi_\alpha$ is better than $c^\circ$. But it is easy to see, with the help of (2.2.11), that $\Psi_\alpha \to \Psi^{*}$, where

$$\Psi^{*}(w) = \Psi_{(w)}(w) \tag{2.3.2}$$

$$= \frac{\Gamma[(n+k+2p)/2]\displaystyle\int_0^{w} u^{(k-2)/2}(1+u)^{-(n+k+2p)/2}\,du}{2^{p}\,\Gamma[(n+k+4p)/2]\displaystyle\int_0^{w} u^{(k-2)/2}(1+u)^{-(n+k+4p)/2}\,du} = \frac{\Gamma[(n+k+2p)/2]\displaystyle\int_0^{w/(1+w)} u^{(k-2)/2}(1-u)^{(n+2p-2)/2}\,du}{2^{p}\,\Gamma[(n+k+4p)/2]\displaystyle\int_0^{w/(1+w)} u^{(k-2)/2}(1-u)^{(n+4p-2)/2}\,du}.$$

Fatou's Lemma enables us to conclude that $\Psi^{*}(W)S^{p}$ is minimax, and we will now show that it is also formal Bayes scale-invariant.

Consider an arbitrary nonrandomized scale-invariant procedure, $\Psi(Y)S^{p}$, with risk function equal to

$$\int_{\mathbb{R}^{+}}\int_{\mathbb{R}^{k}} (\Psi(y)s^{p}-1)^{2}\, f_{S,Y}(s,y\mid\lambda)\,dy\,ds.$$

Suppose that we are given a (possibly improper) prior measure, $\Pi$, on $\mathbb{R}^{k}$. Among rules of the form $\Psi(Y)S^{p}$, the (formal) Bayes rule with respect to $\Pi$ is $\Psi_\Pi(Y)S^{p}$, where, for each $y$, $\Psi_\Pi(y)$ is chosen to minimize, and make finite,

$$\int_{\mathbb{R}^{+}}\int_{\mathbb{R}^{k}} (\Psi(y)s^{p}-1)^{2}\, f_{S,Y}(s,y\mid\lambda)\,d\Pi(\lambda)\,ds. \tag{2.3.3}$$

Therefore, if

$$g_\Pi(s,y) = \int_{\mathbb{R}^{k}} f_{S,Y}(s,y\mid\lambda)\,d\Pi(\lambda), \tag{2.3.4}$$

$$\Psi_\Pi(y) = \Big[\int_0^\infty s^{p} g_\Pi(s,y)\,ds\Big]\Big[\int_0^\infty s^{2p} g_\Pi(s,y)\,ds\Big]^{-1}. \tag{2.3.5}$$

That $\Psi^{*}(W)S^{p}$ is formal Bayes will be established if we produce a prior, $\Pi$, such that

$$g_\Pi(s,y)K(y) = f_{S\mid W\le\|y\|^{2}}(s\mid 0) \tag{2.3.6}$$

$$= K s^{(n+k-2)/2} e^{-s/2}\int_0^{\|y\|^{2}} w^{k/2-1} e^{-sw/2}\,dw,$$

where $K: \mathbb{R}^{k} \to (0,\infty)$ is a measurable function. But,

$$g_\Pi(s,y) = K s^{(n+k-2)/2} e^{-s/2}\int_{\mathbb{R}^{k}} e^{-\|s^{1/2}y-\lambda\|^{2}/2}\,d\Pi(\lambda). \tag{2.3.7}$$

Therefore, letting $d\zeta(\lambda) = e^{-\|\lambda\|^{2}/2}\,d\Pi(\lambda)$, and $x = s^{1/2}y$, we are searching for $\zeta$ such that

$$\int_{\mathbb{R}^{k}} e^{\lambda' x}\,d\zeta(\lambda) = K\int_0^{1} (1-v)^{k/2-1} e^{\|x\|^{2}v/2}\,dv, \tag{2.3.8}$$

and it follows that

$$d\zeta(\lambda) = \int_0^{1} (1-v)^{k/2-1}\, v^{-k/2}\, e^{-\|\lambda\|^{2}/2v}\,dv\,d\lambda. \tag{2.3.9}$$

In other words, the density of $\Pi$, with respect to $k$-dimensional Lebesgue measure, is given by

$$\pi(\lambda) = \int_0^\infty u^{k/2-1}(u+1)^{-1} e^{-\|\lambda\|^{2}u/2}\,du. \tag{2.3.10}$$

If $k = 1$, then

$$\pi(\lambda) = |\lambda|^{-1}\int_0^\infty v^{-1/2}(1+\lambda^{-2}v)^{-1} e^{-v/2}\,dv, \tag{2.3.11}$$

so that assumption a4 of Section 1.5 is satisfied with $\beta = 1$. Moreover, a1, a2, and a3 are satisfied, and

$$\pi(\lambda) \le K|\lambda|^{-1}. \tag{2.3.12}$$

Therefore, as a consequence of Theorem 1.5.1, we have the following theorem:

Theorem 2.3.1: If $k = 1$, then $\Psi^{*}(W)S^{p}$ is a minimax estimator of $\sigma^{2p}$, which is admissible within the class of scale-invariant procedures.

It is interesting to observe that the method of proof used to construct $\Psi^{*}$ enables us to demonstrate the minimaxity of a number of other estimators. That is, if $\Psi$ is a non-decreasing function of $w$, and if $\Psi^{*}(w) \le \Psi(w) \le c^\circ$, for all $w$, then $\Psi(W)S^{p}$ is a minimax estimator of $\sigma^{2p}$. Moreover, if we wish to obtain a minimax, admissible estimator using this method, then we should also have $\Psi(w) \le \Psi_S(w)$, for all $w$. Otherwise, we could demonstrate the inadmissibility of $\Psi(W)S^{p}$ as in Example 2.2.1. Also notice that $\Psi^{*}(0) = \Psi_S(0)$, and $\Psi^{*}(w) < \Psi_S(w)$ for all $w \ne 0$.

Finally, it is clear that the origin is not important in the problem. Therefore, if $\xi$ is an arbitrary vector in $\mathbb{R}^{k}$, then an estimator possessing properties similar to $\Psi^{*}(W)S^{p}$ is given by $\Psi^{*}_\xi(W)S^{p}$, where

$$\Psi^{*}_\xi(W) = \Psi^{*}(W_\xi) \tag{2.3.13}$$

and $W_\xi = S^{-1}\|X-\xi\|^{2}$.

Moreover, $\Psi^{*}_\xi(W)S^{p}$ has an obvious interpretation. First notice that the natural (minimax and admissible) estimator for $\sigma^{2p}$ in the analogous problem with known mean $\xi$ is given by

$$\Psi(S, X; \xi) = \frac{\Gamma[(n+k+2p)/2]}{2^{p}\,\Gamma[(n+k+4p)/2]}\,(S+\|X-\xi\|^{2})^{p}. \tag{2.3.14}$$

In the problem with unknown mean $\mu$, we now let $\xi$ represent a preliminary estimate of $\mu$, and use estimator $\Psi^{*}_\xi(W)S^{p}$ to estimate $\sigma^{2p}$. If we find, in fact, that $X = \xi$ (supporting our prior suspicion), then our estimate will agree with that given by $\Psi(S, X; \xi)$. Otherwise, the estimate is modified, depending on the (normalized) distance, $S^{-1/2}\|X-\xi\|$, between $X$ and $\xi$. As this distance becomes infinite the estimate approaches $c^\circ S^{p}$.

This interpretation is similar to that of Stein's estimator. In this case we note that

$$\Psi_S(W_\xi)S^{p} = \min\{\Psi(S, X; \xi),\ c^\circ S^{p}\}. \tag{2.3.15}$$

$\Psi_S(W_\xi)S^{p}$ may thus be regarded as the result of first testing the hypothesis $\mu = \xi$ at a particular significance level, using $\Psi(S, X; \xi)$ if the hypothesis is accepted, and $c^\circ S^{p}$ if the hypothesis is rejected.
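The relations among $\Psi^{*}$, $\Psi_S$ and $c^\circ$ noted above can be checked numerically. The sketch below is only an illustration: it evaluates $\Psi^{*}(w)$ from the incomplete-beta form of (2.3.2) with a hand-written Simpson rule, and compares it with the Stein-type multiplier of (2.2.8). The function names, the substitution $u = s^{2}$ used to tame the endpoint singularity for $k = 1$, and the step count are assumptions made here, not part of the original construction.

```python
import math


def c_best(n, p):
    """c° of (2.1.2)."""
    return math.exp(math.lgamma(p + n / 2) - math.lgamma(2 * p + n / 2)) / 2 ** p


def psi_stein(w, n, k, p):
    """Stein-type multiplier (2.2.8)."""
    a = (n + k + 2 * p) / 2
    cond_mean = (1 + w) ** p * math.exp(math.lgamma(a) - math.lgamma(a + p)) / 2 ** p
    return min(cond_mean, c_best(n, p))


def _beta_piece(x, k, e, steps=2000):
    """Integral of u^{(k-2)/2} (1-u)^e over [0, x], 0 <= x <= 1.

    Substituting u = s^2 gives the integrand 2 s^{k-1} (1 - s^2)^e,
    which is bounded at s = 0; composite Simpson is then applied.
    """
    b = math.sqrt(x)

    def g(s):
        return 2.0 * s ** (k - 1) * max(0.0, 1.0 - s * s) ** e

    h = b / steps  # steps is even, as Simpson's rule requires
    total = g(0.0) + g(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(i * h)
    return total * h / 3.0


def psi_star(w, n, k, p):
    """Psi*(w) of (2.3.2), using the incomplete-beta form."""
    a = (n + k + 2 * p) / 2.0
    const = math.exp(math.lgamma(a) - math.lgamma(a + p)) / 2.0 ** p
    if w <= 0:
        return const  # limiting value as w -> 0
    x = 1.0 if math.isinf(w) else w / (1.0 + w)
    num = _beta_piece(x, k, (n + 2 * p - 2) / 2.0)
    den = _beta_piece(x, k, (n + 4 * p - 2) / 2.0)
    return const * num / den


if __name__ == "__main__":
    n, k, p = 10, 1, 1
    for w in (0.0, 0.1, 1.0, 10.0, math.inf):
        print(w, psi_star(w, n, k, p), psi_stein(w, n, k, p), c_best(n, p))
```

In this sketch $\Psi^{*}$ starts at $\Psi_S(0)$, increases smoothly, and approaches $c^\circ$ as $w \to \infty$ without ever being truncated, which is the qualitative behaviour described in the text.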
2.4 Extension of the previous results to G-invariant loss functions

Although we assumed squared error loss in the preceding sections, the specific form of the loss function played only a minor role in the proofs. The form was important only in so far as it produced the bowl-shaped nature of certain risk functions, and also because it enabled us to use the observation that a stochastically larger $S$ produces a smaller estimate for $c$. Thus, if we assume $L(a; \mu, \sigma) = L(a\sigma^{-2p})$ is an arbitrary, nonnegative, $\mathcal{G}$-invariant loss function, and if we let $\mathcal{F} = \{f_{S\mid W\le r}(s\mid\delta) : \delta \ge 0 \text{ and } 0 < r \le \infty\}$, then we have actually proved the following more general result:

Theorem 2.4.1: If

(i) for all $f \in \mathcal{F}$, $h_f(c) = \int_0^\infty L(cs^{p})f(s)\,ds$ is a strictly bowl-shaped function of $c$ which assumes its (finite) minimum at $c(f)$,
(ii) $f_1, f_2 \in \mathcal{F}$ and $f_1 f_2^{-1}$ increasing, implies $c(f_1) < c(f_2)$, and
(iii) $\Psi^\circ(w) = c(f_{S\mid W\le w}(\cdot\mid 0))$,

then $\Psi^\circ(W)S^{p}$ is an estimator of $\sigma^{2p}$ which is formal Bayes scale-invariant with respect to the prior given in (2.3.10), and which has a risk function which is uniformly no larger than that of the best $\mathcal{G}$-invariant estimator.

The minimaxity of the usual estimator, and hence of $\Psi^\circ(W)S^{p}$, will often follow from results of Kiefer and Kudo. Note that the application of Fatou's Lemma prevents the strengthening of the "no larger than" conclusion in the theorem, without a more careful analysis of the risk functions, or alternatively the demonstration of scale-admissibility. We would hope, of course, that the new estimator will often be scale-admissible, particularly if the tails of the loss function are steeper than those for squared error.

It is perhaps useful to observe that the assumptions of Theorem 2.4.1 are satisfied in a few important cases:

Example 2.4.1

If $L_1(x) = |x-1|$, then

$$\Psi^\circ(w) = \operatorname{median}_0(U^{-p}\mid W\le w) = \operatorname{median}_0^{-p}(U\mid W\le w), \tag{2.4.1}$$

where $f_{U,W}(u,w\mid\delta) = K u^{p} f_{S,W}(u,w\mid\delta)$.

Example 2.4.2

Brown has shown that the use of an unbiased estimator is essentially equivalent to the use of $L_2(x) = x - 1 - \ln(x)$. In this case

$$\Psi^\circ(w) = E_0^{-1}(S^{p}\mid W\le w). \tag{2.4.2}$$

Example 2.4.3

Another common loss is given by $L_3(x) = \ln^{2}(x)$, for which

$$\Psi^\circ(w) = \exp\{-E_0(\ln S^{p}\mid W\le w)\}. \tag{2.4.3}$$

In Chapter 3 we shall also apply Theorem 2.4.1 in an interval estimation problem.

2.5 The exponential distribution with unknown location and scale

It is evident that the techniques of the previous sections may be useful in many location-scale problems. In this section we shall examine the applicability of the method in a somewhat antithetical situation. Namely, we shall assume that $X_1, X_2, \dots, X_n$ are independent observations from an exponential distribution with density

$$f(x) = \begin{cases} \sigma^{-1} e^{-(x-\mu)/\sigma}, & x \ge \mu, \\ 0, & x < \mu. \end{cases} \tag{2.5.1}$$

The problem at hand is the estimation of $\sigma^{p}$ $(p > 0)$, where the loss function is given by $L(a; \mu, \sigma) = \sigma^{-2p}(a-\sigma^{p})^{2}$.

In this case, $(M, \bar{X})$ is a sufficient statistic, where $M = \min_{i=1,\dots,n} X_i$, and $n\bar{X} = X_1 + \dots + X_n$. For convenience, let $S = \bar{X} - M$. Again the problem remains invariant under the location-scale group. Here, the best invariant estimator coincides with the maximum likelihood estimator, and is given by $c^\circ S^{p}$, where

$$c^\circ = \frac{E_{0,1}(S^{p})}{E_{0,1}(S^{2p})} = \frac{n^{p}\,\Gamma(n+p-1)}{\Gamma(n+2p-1)}. \tag{2.5.2}$$

If we let $Y = MS^{-1}$, then any scale-invariant estimator is of the form $\Psi(Y)S^{p}$. In analogy with Stein's estimator for the variance of a normal population, Arnold [1] and Zidek [47] have produced a scale-invariant estimator, $\Psi_S(Y)S^{p}$, which dominates the usual one. It is given by

$$\Psi_S(y) = \begin{cases} \min\Big\{c^\circ,\ \dfrac{(1+y)^{p}\, n^{p}\,\Gamma(n+p)}{\Gamma(n+2p)}\Big\}, & y \ge 0, \\[2ex] c^\circ, & y < 0. \end{cases} \tag{2.5.3}$$
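The formulas (2.5.2) and (2.5.3) are simple enough to evaluate directly. The following minimal sketch does so in Python; the function names and the sample values are hypothetical, chosen only to illustrate the reduction of the data to $(S, Y)$ and the evaluation of $\Psi_S(Y)S^{p}$.

```python
import math


def c_best_exp(n, p):
    """c° of (2.5.2): n^p Gamma(n+p-1) / Gamma(n+2p-1)."""
    return n ** p * math.exp(math.lgamma(n + p - 1) - math.lgamma(n + 2 * p - 1))


def psi_stein_exp(y, n, p):
    """Multiplier of (2.5.3): truncated at c° for y >= 0, equal to c° for y < 0."""
    c0 = c_best_exp(n, p)
    if y < 0:
        return c0
    cond_mean = (n * (1 + y)) ** p * math.exp(math.lgamma(n + p) - math.lgamma(n + 2 * p))
    return min(c0, cond_mean)


def estimate_scale_power(xs, p):
    """Psi_S(Y) S^p with M = min X_i, S = mean(X) - M, Y = M / S.

    Assumes at least two distinct observations, so that S > 0.
    """
    n = len(xs)
    m = min(xs)
    s = sum(xs) / n - m
    return psi_stein_exp(m / s, n, p) * s ** p


if __name__ == "__main__":
    sample = [0.1, 2.1, 3.1, 4.1, 5.6]  # hypothetical data
    print(estimate_scale_power(sample, 1))
```

For $p = 1$ the multiplier $c^\circ$ equals 1, and whenever $0 \le Y < 1/n$ the estimate reduces to $n\bar{X}/(n+1)$, the best scale-invariant estimate of $\sigma$ when the location is known to be zero.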
To obtain this estimator using the method of Section 2.2, first observe that, for each $\lambda = \mu\sigma^{-1}$ and $y$, $E_\lambda[(cS^{p}-1)^{2}\mid Y=y]$ is a strictly bowl-shaped function of $c$, which takes its minimum at

$$\frac{E_\lambda[S^{p}\mid Y=y]}{E_\lambda[S^{2p}\mid Y=y]} = E_\lambda[T^{-p}\mid Y=y]. \tag{2.5.4}$$

Here, $T$ is a random variable, whose joint density, with $Y$, is

$$f_{T,Y}(t,y\mid\lambda) = K t^{2p} f_{S,Y}(t,y\mid\lambda) = K t^{2p+n-1} e^{-nt-n(ty-\lambda)}\, I_{(0,\infty)}(t)\, I_{(\lambda,\infty)}(ty). \tag{2.5.5}$$

Now, for $y \ge 0$, $f_{T,Y}(t,y\mid\lambda)\,f_{T,Y}^{-1}(t,y\mid 0)$ is a non-decreasing function of $t$, so that

$$\sup_{-\infty<\lambda<\infty} E_\lambda[T^{-p}\mid Y=y] = E_0[T^{-p}\mid Y=y] = \frac{(1+y)^{p}\, n^{p}\,\Gamma(n+p)}{\Gamma(n+2p)}. \tag{2.5.6}$$

But there exists $K$ such that $E_0[T^{-p}\mid Y=y] < c^\circ$, if $0 \le y \le K$, and the desired estimator is obtained.

It is natural to attempt to duplicate the procedure in Example 2.2.2. Unfortunately, if we let $\mathcal{B}_{(r)} = \mathcal{B}(I_{[-r,r]}(Y))$, then the desired monotonicity properties do not hold. However, if we let $\mathcal{B}'_{(r)} = \mathcal{B}(I_{[0,r]}(Y))$, then

$$\sup_{-\infty<\lambda<\infty} E_\lambda[T^{-p}\mid Y\in[0,r]] = E_0[T^{-p}\mid Y\in[0,r]] < c^\circ. \tag{2.5.7}$$

(2.5.7) can be seen most easily by noting that $f_{T\mid 0<Y\le r}(t\mid\lambda)$ has non-decreasing likelihood ratio with respect to $f_{T\mid 0<Y\le r}(t\mid 0)$, which in turn has increasing likelihood ratio with respect to $f_T(t)$. We obtain a dominating procedure by letting

$$\Psi'_{(r)}(Y) = \begin{cases} \min\{c^\circ,\ E_0[T^{-p}\mid\mathcal{B}'_{(r)}](Y)\}, & Y > 0, \\ c^\circ, & Y \le 0, \end{cases} \tag{2.5.8}$$

$$\phantom{\Psi'_{(r)}(Y)} = \begin{cases} c^\circ\,[1-(1+r)^{-n-p+1}]\,[1-(1+r)^{-n-2p+1}]^{-1}, & 0 < Y \le r, \\ c^\circ, & Y \le 0 \text{ or } Y > r. \end{cases}$$

Proceeding as in Section 2.3, we are led to

$$\Psi^{*}(Y) = \begin{cases} c^\circ\,\dfrac{1-(1+Y)^{-n-p+1}}{1-(1+Y)^{-n-2p+1}}, & Y > 0, \\[2ex] c^\circ, & Y \le 0. \end{cases} \tag{2.5.9}$$

Unfortunately, this procedure is not formal Bayes scale-invariant. In fact, we are no longer certain that it dominates $c^\circ S^{p}$, although it can be no worse. However, in view of an example of Sacks [32] regarding the exponential distribution, perhaps we are asking too much of a candidate for admissibility when we ask that it be formal Bayes.

Finally, with the aid of a computer, the risk function of $\Psi^{*}(Y)S^{p}$ has been plotted.
From the results, it is apparent that the procedure dominates the usual one and, in fact, does significantly better over a wide range of values for $\lambda$.

Chapter 3: Minimax, Inadmissible Interval Estimators Of Scale Parameters

3.1 Minimax Estimators In Location-Scale Problems

Valand [40] has given conditions under which a best location-invariant interval estimator of a location parameter will be minimax, when this is the only unknown parameter in the problem. By applying a suitable transformation, Valand easily extends these results to include the interval estimation of scale parameters. In this section we obtain similar results for problems involving both unknown location and scale parameters. In our formulation of the problem, the action space is unspecified (aside from invariant structure), and the results are therefore applicable in both point and interval estimation problems. In particular, in the next section Lemma 3.1.1 is used to prove the minimaxity of certain interval estimators of scale parameters, when the location is unknown.

As we have mentioned in previous chapters, Kiefer [22] and Kudo [23] have given sufficient conditions for the minimaxity of best invariant estimators in a wide range of problems. In particular, their results are applicable in many problems involving unknown location and scale parameters. In fact, applying Kiefer's Theorem in our problem, we obtain a result which is similar in form to Theorem 3.1.1, but in which the loss function (not the region of integration) has been truncated. Kudo's Theorem is widely applicable, but to apply it in particular problems is difficult. We, like Valand [40], obtain explicit conditions by a simple direct argument. More recently, Chen [8] has studied location-scale problems in which the action space and parameter space coincide.
In our formulation of the problem, we observe a random variable $X = (U, V, W)$, taking values in $\mathcal{X} = \mathcal{U}\times\mathcal{V}\times\mathcal{W}$, where $\mathcal{U} = (-\infty, \infty)$, $\mathcal{V} = (0, \infty)$, and $(\mathcal{W}, \mathcal{C})$ is an arbitrary measurable space. Assume that the conditional distribution of $(U, V)$, given $W = w$, is of the form $F\big(\frac{u-\mu}{\sigma}, \frac{v}{\sigma}\,\big|\,w\big)$, where $(\mu, \sigma)$ is an element of $\Theta = \{(\mu, \sigma): -\infty < \mu < \infty,\ \sigma > 0\}$. $F$ is assumed to be known, but $\mu$ and $\sigma$ are unknown. Also, assume that $W$ is distributed according to a known measure $G$ on $\mathcal{W}$.

If $(Y_1, Y_2, \dots, Y_n)$ is an observation from $H\big(\frac{y_1-\mu}{\sigma}, \frac{y_2-\mu}{\sigma}, \dots, \frac{y_n-\mu}{\sigma}\big)$, then a suitable transformation will yield the above model. Moreover, $W$ may not appear if the problem has been reduced by sufficiency.

Note that the results which follow can easily be extended to the higher dimensional location-scale problem in which $\mathcal{U} = \mathbb{R}^{k}$ and $\mu \in \mathbb{R}^{k}$. However, for simplicity of notation, we have treated the case $k = 1$.

The action space, $A$, and loss function, $L: A\times\Theta \to [0, \infty)$, are unspecified, but we assume that there exists a homomorphic image, $\tilde{\mathcal{G}}$, of the location-scale group, $\mathcal{G}$, which acts on $A$ in such a manner that the problem remains invariant under $\mathcal{G}$. Here, $A$ is equipped with a $\sigma$-algebra, $\mathcal{A}$, which contains all singletons. The class of rules available, $\mathcal{D}$, consists of all possible functions $\delta: \mathcal{X}\times\mathcal{A} \to [0, 1]$, such that (i) $\delta(\cdot, A')$ is measurable, for each $A' \in \mathcal{A}$, and (ii) $\delta(x, \cdot)$ is a probability distribution on $A$, for each $x \in \mathcal{X}$.

Denote a typical element of $\mathcal{G}$ by $[c, d]$, $c > 0$ and $-\infty < d < \infty$, and let $\overline{[c, d]}$ and $\widetilde{[c, d]}$ represent its homomorphic images in $\bar{\mathcal{G}}$ and $\tilde{\mathcal{G}}$, respectively. Then

$$[c, d](u, v, w) = (cu+d,\ cv,\ w),$$
$$\overline{[c, d]}(\mu, \sigma) = (c\mu+d,\ c\sigma), \tag{3.1.1}$$
$$[c, d]\,[c', d'] = [cc',\ cd'+d],$$

and $[c, d]^{-1} = [c^{-1},\ -dc^{-1}]$.

If $t: \mathcal{X} \to A$ is any nonrandomized invariant rule, then it is easy to see that

$$t(u, v, w) = \widetilde{[v, u]}\,\phi(w), \tag{3.1.2}$$

for some measurable function $\phi: \mathcal{W} \to A$.
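The group operations in (3.1.1) and the representation (3.1.2) can be made concrete with a small Python sketch. Since the action space is left unspecified in the text, the sketch assumes, purely for illustration, point estimation of the location, so that the image of $[c, d]$ acts on an action $a$ by $a \mapsto ca + d$; the class name `Affine` and the helper `invariant_rule` are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Affine:
    """Element [c, d] of the location-scale group, with c > 0."""
    c: float
    d: float

    def __matmul__(self, other):
        # Composition rule of (3.1.1): [c, d][c', d'] = [cc', cd' + d].
        return Affine(self.c * other.c, self.c * other.d + self.d)

    def inverse(self):
        # [c, d]^{-1} = [1/c, -d/c].
        return Affine(1.0 / self.c, -self.d / self.c)

    def on_sample(self, u, v, w):
        # Action on the sample space: (u, v, w) -> (cu + d, cv, w).
        return (self.c * u + self.d, self.c * v, w)

    def on_parameter(self, mu, sigma):
        # Action on the parameter space: (mu, sigma) -> (c mu + d, c sigma).
        return (self.c * mu + self.d, self.c * sigma)

    def on_action(self, a):
        # ASSUMPTION (not from the text): actions are point estimates of
        # the location, transformed as a -> ca + d.
        return self.c * a + self.d


def invariant_rule(phi):
    """Rule of the form (3.1.2): t(u, v, w) = [v, u]~ phi(w)."""
    def t(u, v, w):
        return Affine(v, u).on_action(phi(w))
    return t


if __name__ == "__main__":
    g, h = Affine(2.0, 3.0), Affine(0.5, -1.0)
    print(g @ h, g @ g.inverse())
    t = invariant_rule(lambda w: 0.25)
    x = (1.0, 4.0, None)
    print(t(*g.on_sample(*x)), g.on_action(t(*x)))  # equal, by invariance
```

Under this assumed action the rule is $t(u, v, w) = u + v\,\phi(w)$, and the last line checks the defining property $t(gx) = \tilde{g}\,t(x)$ for one group element.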
The risk function of any such procedure is constant, and is equal to

$$r(\phi; \mu, \sigma) = r(\phi; 0, 1) = \int_{\mathcal{W}} \iint_{\mathcal{U} \times \mathcal{V}} L(\widetilde{[v, u]}\,\phi(w); 0, 1)\, dF(u, v \mid w)\, dG(w). \qquad (3.1.3)$$

Let $D$ represent the set of all measurable functions $\phi: \mathcal{W} \to A$, and assume that there exists $\phi_0 \in D$, such that

$$r(\phi_0; 0, 1) = \inf_{\phi \in D} r(\phi; 0, 1) < \infty. \qquad (3.1.4)$$

Denote the quantity in equation (3.1.4) by $R_0$. Finally, let $\{K_N\}_{N=1}^{\infty}$ be an increasing sequence of compact subsets of $\mathcal{U} \times \mathcal{V}$, such that $\mathcal{U} \times \mathcal{V} = \bigcup_{N=1}^{\infty} K_N$, and define $g_N: D \to [0, \infty)$ by

$$g_N(\phi) = \int_{\mathcal{W}} \iint_{K_N} L(\widetilde{[v, u]}\,\phi(w); 0, 1)\, dF(u, v \mid w)\, dG(w). \qquad (3.1.5)$$

We are now in a position to state the main result of this section.

Theorem 3.1.1: The best invariant rule, $\widetilde{[V, U]}\,\phi_0(W)$, is minimax, providing that

$$\lim_{N \to \infty} \inf_{\phi \in D} g_N(\phi) = \inf_{\phi \in D} \lim_{N \to \infty} g_N(\phi). \qquad (3.1.6)$$

Note: The left-hand side is no greater than the right-hand side, and the right-hand side is equal to $R_0$ by the monotone convergence theorem. Therefore, (3.1.6) is equivalent to $\lim_{N \to \infty} \inf_{\phi \in D} g_N(\phi) \ge R_0$.

Proof: It is well-known that a rule with constant risk is minimax, providing that it is $\varepsilon$-Bayes, for all $\varepsilon > 0$ (see, for example, [14], p. 91). As the best invariant rule is formal Bayes with respect to the measure induced on $\Theta$ by right Haar measure on $\mathcal{G}$ (see, for example, [43]), it is natural to look at priors which approximate this measure. To this end, let

$$\tau_M(\mu, \sigma) = C_M\, \sigma^{-1} I_{\Lambda_M}(\mu, \sigma), \qquad (3.1.7)$$

where $\{\Lambda_M\}_{M=1}^{\infty}$ is a sequence of compact subsets of $\Theta$, and

$$C_M^{-1} = \iint_{\Lambda_M} \frac{d\mu\, d\sigma}{\sigma}. \qquad (3.1.8)$$

If $t: \mathcal{X} \to A$ is an arbitrary nonrandomized procedure, then its Bayes risk with respect to $\tau_M$ is given by

$$R(t, \tau_M) = C_M \iint_{\Lambda_M} \int_{\mathcal{W}} \iint_{\mathcal{U} \times \mathcal{V}} L[t(u, v, w); \mu, \sigma]\, dF\left(\tfrac{u-\mu}{\sigma}, \tfrac{v}{\sigma} \mid w\right) dG(w)\, \frac{d\mu\, d\sigma}{\sigma} \qquad (3.1.9)$$

$$= C_M \iint_{\Lambda_M} \int_{\mathcal{W}} \iint_{\mathcal{U} \times \mathcal{V}} L[t(\mu + \sigma u, \sigma v, w); \mu, \sigma]\, dF(u, v \mid w)\, dG(w)\, \frac{d\mu\, d\sigma}{\sigma} \qquad (3.1.10)$$

$$= C_M \int_{\mathcal{W}} \iint_{\mathcal{U} \times \mathcal{V}} \left\{ \iint_{\Lambda_M} L[t(\mu + \sigma u, \sigma v, w); \mu, \sigma]\, \frac{d\mu\, d\sigma}{\sigma} \right\} dF(u, v \mid w)\, dG(w). \qquad (3.1.11)$$

Now, for each $(u, v, w)$, let $(x, y) = [\sigma, \mu](u, v)$ in the integral in brackets, so that $R(t, \tau_M)$ is equal to

$$C_M \int_{\mathcal{W}} \iint_{\mathcal{U} \times \mathcal{V}} \left\{ \iint_{B_M(u, v)} L[t(x, y, w);\ x - v^{-1} y u,\ v^{-1} y]\, \frac{dx\, dy}{y} \right\} dF(u, v \mid w)\, dG(w), \qquad (3.1.12)$$

where $B_M(u, v) = \{[c, d](u, v) : (d, c) \in \Lambda_M\}$. But $\overline{[v, u]}\ \overline{[y, x]}^{-1} (x - v^{-1} y u,\ v^{-1} y) = (0, 1)$, and therefore, because of the assumed invariance of the loss,

$$R(t, \tau_M) = C_M \int_{\mathcal{W}} \iint_{\mathcal{U} \times \mathcal{V}} \left\{ \iint_{B_M(u, v)} L(\widetilde{[v, u]}\ \widetilde{[y, x]}^{-1} t(x, y, w); 0, 1)\, \frac{dx\, dy}{y} \right\} dF(u, v \mid w)\, dG(w) \qquad (3.1.13)$$

$$= C_M \iint_{\Theta} \left\{ \int_{\mathcal{W}} \iint_{E_M(x, y)} L(\widetilde{[v, u]}\ \widetilde{[y, x]}^{-1} t(x, y, w); 0, 1)\, dF(u, v \mid w)\, dG(w) \right\} \frac{dx\, dy}{y}, \qquad (3.1.14)$$

where $E_M(x, y) = \{[c, d]^{-1}(x, y) : (d, c) \in \Lambda_M\}$.

In order to prove the theorem, it is sufficient to show that, for all $\varepsilon > 0$, there exists $M$ such that $R(t, \tau_M) \ge R_0 - \varepsilon$, for all nonrandomized $t$.

Assume, for the present, that there exist subsets $\{R_{N,M}\}_{N,M=1}^{\infty}$ and $\{H_N\}_{N=1}^{\infty}$ of $(-\infty, \infty) \times (0, \infty)$ such that

(i) $H_N$ is open and $H_N \uparrow (-\infty, \infty) \times (0, \infty)$,
(ii) $E_M(x, y) \supset H_N$, for all $(x, y) \in R_{N,M}$, and
(iii) $\lim_{M \to \infty} C_M \iint_{R_{N,M}} \frac{dx\, dy}{y} = 1$, for all $N$.

Now, given $\varepsilon > 0$, choose $N_0$ such that $g_{N_0}(\phi) \ge R_0 - \varepsilon/2$, for all $\phi \in D$, and let $N$ be such that $K_{N_0} \subset H_N$. This latter step is justified by the compactness of $K_{N_0}$. Then, for all $M > N$,

$$R(t, \tau_M) \ge C_M \iint_{R_{N,M}} \left\{ \int_{\mathcal{W}} \iint_{K_{N_0}} L(\widetilde{[v, u]}\ \widetilde{[y, x]}^{-1} t(x, y, w); 0, 1)\, dF(u, v \mid w)\, dG(w) \right\} \frac{dx\, dy}{y} \qquad (3.1.15)$$

$$\ge (R_0 - \varepsilon/2)\, C_M \iint_{R_{N,M}} \frac{dx\, dy}{y}, \qquad (3.1.16)$$

and the result follows.

It remains to show that $\{\Lambda_M\}$, $\{H_N\}$, and $\{R_{N,M}\}$ exist, satisfying conditions (i), (ii), and (iii). Let

$$\Lambda_M = [-e^{M^2}, e^{M^2}] \times [e^{-M}, e^{M}],$$

so that $C_M^{-1} = 4 e^{M^2} M$, and

$$E_M(x, y) = \{(u, v): y e^{-M} < v < y e^{M},\ |u v^{-1} y - x| < e^{M^2}\}.$$

Also, let

$$R_{N,M} = (e^{M} - e^{M^2},\ e^{M^2} - e^{M}) \times (N e^{-M},\ N^{-1} e^{M}),$$

so that

$$\iint_{R_{N,M}} \frac{dx\, dy}{y} = 4\,[e^{M^2} - e^{M}]\,[M - \log(N)].$$

For $N \ge 2$, the conditions then hold with

$$H_N = \{(u, v): N^{-1} < v < N,\ -Nv < u < Nv\}.$$

Usually, $\phi_0$ is determined conditionally on $W$.
That is, we may take, for each $w$, $\phi_0(w)$ to be that value of $a$ which minimizes, and makes finite,

$$\iint_{\mathcal{U} \times \mathcal{V}} L(\widetilde{[v, u]}\,a; 0, 1)\, dF(u, v \mid w), \qquad (3.1.17)$$

providing that the $\phi_0$, so constructed, is measurable. Let $R_0(w)$ represent the minimum value in (3.1.17), and let $h_N: A \times \mathcal{W} \to [0, \infty)$ be defined by

$$h_N(a, w) = \iint_{K_N} L(\widetilde{[v, u]}\,a; 0, 1)\, dF(u, v \mid w). \qquad (3.1.18)$$

Theorem 3.1.2: If

(i) $\lim_{N \to \infty} \inf_{a \in A} h_N(a, w) = \inf_{a \in A} \lim_{N \to \infty} h_N(a, w)$, for all $w$, $\qquad$ (3.1.19)

and

(ii) $\inf_{a \in A} h_N(a, w)$ is $G$-measurable, for all $N$,

then (3.1.6) holds.

Proof: As in Theorem 3.1.1, we can prove using the hypotheses that $\inf_{a \in A} h_N(a, w)$ converges monotonically to $R_0(w)$, for all $w$. Therefore, by the monotone convergence theorem,

$$\int_{\mathcal{W}} \inf_{a \in A} h_N(a, w)\, dG(w) \to \int_{\mathcal{W}} R_0(w)\, dG(w) = R_0. \qquad (3.1.20)$$

Now, given $\varepsilon > 0$, if we choose $N$ such that

$$\int_{\mathcal{W}} \inf_{a \in A} h_N(a, w)\, dG(w) \ge R_0 - \varepsilon, \qquad (3.1.21)$$

then, for all $\phi \in D$,

$$\int_{\mathcal{W}} h_N(\phi(w), w)\, dG(w) \ge R_0 - \varepsilon. \qquad (3.1.22)$$

But $g_N(\phi) = \int_{\mathcal{W}} h_N(\phi(w), w)\, dG(w)$, and the proof is complete.

Now, assume that $\mathcal{Y}$ is an arbitrary topological space, and that $\{g_N\}_{N=1}^{\infty}$ is a non-decreasing sequence of non-negative functions on $\mathcal{Y}$. Also assume that

$$\inf_{y \in \mathcal{Y}} \lim_{N \to \infty} g_N(y) = S_0 < \infty. \qquad (3.1.23)$$

Whether we are looking at the unconditional problem, as in Theorem 3.1.1, or at a conditional problem, as in Theorem 3.1.2, we require conditions under which

$$\lim_{N \to \infty} \inf_{y \in \mathcal{Y}} g_N(y) = S_0. \qquad (3.1.24)$$

The following lemma is particularly useful to that end in interval estimation problems.

Lemma 3.1.1: If $g_N$ is lower semicontinuous for all $N$, then (3.1.24) holds, if and only if, for any $\varepsilon > 0$, there exists a compact set $K_0$, and $N_0$, such that $g_{N_0}(y) \ge S_0 - \varepsilon$, for all $y \notin K_0$.

Proof: The proof in one direction is trivial, and we shall use contradiction to prove the other direction. Therefore, assume that (3.1.24) does not hold, and let

$$S_0 - \lim_{N \to \infty} \inf_{y \in \mathcal{Y}} g_N(y) = \varepsilon > 0. \qquad (3.1.25)$$
Using this $\varepsilon$, choose $K_0$ and $N_0$ as above. Then, for all $N \ge N_0$, there exists $y_N \in K_0$ such that

$$g_N(y_N) = \inf_{y \in K_0} g_N(y) \qquad (3.1.26)$$
$$\le S_0 - \varepsilon. \qquad (3.1.27)$$

Since $K_0$ is compact, the sequence $\{y_N\}$ has an accumulation point $y_0$. If $N$ is fixed, it follows from the lower semicontinuity of $g_N$ that there exists a subsequence $\{y_{N_i}\}$ such that

$$g_N(y_0) \le \lim_{i \to \infty} g_N(y_{N_i}). \qquad (3.1.28)$$

But from the monotonicity of $\{g_N\}$, it follows that

$$g_N(y_{N_i}) \le g_{N_i}(y_{N_i}) \le S_0 - \varepsilon, \quad N_i \ge N. \qquad (3.1.29)$$

We arrive at a contradiction by taking the limit.

For completeness, the following easy lemma is stated without proof.

Lemma 3.1.2: If (i) $L(\cdot; 0, 1)$ is bounded, or if (ii) $F(\cdot, \cdot \mid w)$ has compact support, for all $w$, then (3.1.6) holds.

3.2 Interval Estimation Of Scale Parameters

Consider the problem of determining a confidence interval for $\sigma^{2p}$ ($p > 0$). Usually, a confidence coefficient $1 - \alpha$ is selected, $0 < \alpha < 1$, and the confidence interval is assumed to be of the form $(c_1 S^p, c_2 S^p)$. Then $c_i$, $i = 1, 2$, are chosen so that

$$\int_{d_1}^{d_2} f_S(s)\, ds = 1 - \alpha, \qquad (3.2.1)$$

where $d_1 = c_2^{-1/p}$, $d_2 = c_1^{-1/p}$, and $f_S$ denotes the density of $S$ when $\sigma = 1$. Equation (3.2.1) does not uniquely determine the $c_i$. So some additional criterion is introduced which, with (3.2.1), determines an interval. Common criteria, so introduced, are:

(i) shortest interval of coefficient $1 - \alpha$: minimize $c_2 - c_1$, subject to (3.2.1).
(ii) logarithmically shortest interval of coefficient $1 - \alpha$: minimize $\ln(c_2) - \ln(c_1)$, subject to (3.2.1).
(iii) equal-tailed interval of coefficient $1 - \alpha$: select $d_1$ and $d_2$ such that $\int_0^{d_1} f_S(s)\, ds = \int_{d_2}^{\infty} f_S(s)\, ds = \alpha/2$.

In order to obtain the form of the shortest or logarithmically shortest confidence intervals, it will be useful to present a simple measure-theoretic lemma.
Assume

(a) $(X, \mathcal{M})$ is a measurable space,
(b) $\mu$ and $\nu$ are $\sigma$-finite measures on $(X, \mathcal{M})$ such that $\mu$ is absolutely continuous with respect to $\nu$,
(c) $f$ is a non-negative measurable function on $(X, \mathcal{M})$,
(d) $g = f \cdot \frac{d\mu}{d\nu}$, where $\frac{d\mu}{d\nu}$ denotes the Radon-Nikodym derivative of $\mu$ with respect to $\nu$,
(e) $A_c \in \mathcal{M}$, and $g^{-1}[(c, \infty)] \subset A_c \subset g^{-1}[[c, \infty)]$, for some $c > 0$, and
(f) $h_k(A) = k\,\nu(A) - \int_A f\, d\mu$, $A \in \mathcal{M}$, where $k$ is any fixed scalar.

Lemma 3.2.1:
(a) If $\nu(A) \le \nu(A_c)$, $A \in \mathcal{M}$, then $\int_A f\, d\mu \le \int_{A_c} f\, d\mu$.
(b) If $\int_A f\, d\mu \ge \int_{A_c} f\, d\mu$, $A \in \mathcal{M}$, then $\nu(A) \ge \nu(A_c)$.
(c) $h_k(A_k) = \inf_{A \in \mathcal{M}} h_k(A)$.

Proof: (a) Since $\nu(A) \le \nu(A_c)$, $\nu(A \setminus A_c) \le \nu(A_c \setminus A)$. Therefore,

$$\int_{A \setminus A_c} f\, d\mu = \int_{A \setminus A_c} g\, d\nu \le c\,\nu(A \setminus A_c) \le c\,\nu(A_c \setminus A) \le \int_{A_c \setminus A} g\, d\nu = \int_{A_c \setminus A} f\, d\mu,$$

and the result follows. The proof of (b) is similar, and we obtain (c) by noting that $h_k(A) = \int_A [k - g]\, d\nu$.

It will be convenient to introduce some notation.

Definition:
(a) If $f: \mathbb{R} \to \mathbb{R}$, then $x_1 = x_2\ [f]$ if and only if $f(x_1) = f(x_2)$.
(b) For $x > 0$ and $-\infty < \alpha < \infty$, $Q_\alpha(x) = x^{\alpha}$.

Now, $c_2 - c_1 = p \int_{d_1}^{d_2} x^{-(1+p)}\, dx$. Therefore, if we let $X = (0, \infty)$, $\nu_1(A) = \int_A x^{-(1+p)}\, dx$, and $\mu$ = Lebesgue measure, we see that $\mu(A) = \int_A x^{1+p}\, d\nu_1(x)$, so that $\frac{d\mu}{d\nu_1} = Q_{1+p}$. Since $Q_{1+p} \cdot f_S$ is unimodal, it follows from Lemma 3.2.1 (a), (b), that the shortest confidence interval of size $1 - \alpha$ is given uniquely by

$$\int_{d_1}^{d_2} f_S(s)\, ds = 1 - \alpha \quad \text{and} \quad d_1 = d_2\ [Q_{1+p} \cdot f_S]. \qquad (3.2.2)$$

Solutions to (3.2.2) may be obtained from [24] and [39]. Similarly, letting $\nu_2(A) = \int_A x^{-1}\, dx$, we see that the logarithmically shortest interval of size $1 - \alpha$ is given uniquely by

$$\int_{d_1}^{d_2} f_S(s)\, ds = 1 - \alpha \quad \text{and} \quad d_1 = d_2\ [Q_1 \cdot f_S]. \qquad (3.2.3)$$

This latter interval is also "shortest unbiased" (see, for example, [33]), and solutions to (3.2.3) may be obtained from [24], [28], and [39].

From a decision-theoretic point of view, what properties do the above procedures possess?
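The endpoint conditions for the shortest, logarithmically shortest and equal-tailed intervals can be solved numerically. The sketch below is an illustration rather than the thesis's method: it assumes that $f_S$ is the chi-square density with $n$ degrees of freedom (the normal-variance case), it uses only the standard library, and the function names (`chi2_cdf`, `bisect`, `matched_endpoints`, `endpoints`) are hypothetical. The matched endpoints solve $d_1^q f_S(d_1) = d_2^q f_S(d_2)$ with $q = 1 + p$ for the shortest interval and $q = 1$ for the logarithmically shortest one.

```python
import math


def chi2_cdf(x, n):
    """Chi-square CDF via the power series of the regularized incomplete gamma."""
    if x <= 0:
        return 0.0
    a, z = n / 2, x / 2
    term = total = 1.0 / a
    k = 1
    while term > 1e-17 * total:
        term *= z / (a + k)
        total += term
        k += 1
    return total * math.exp(a * math.log(z) - z - math.lgamma(a))


def bisect(func, lo, hi, iters=200):
    """Root of func on [lo, hi], assuming func changes sign on the interval."""
    flo = func(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        fmid = func(mid)
        if (fmid > 0) == (flo > 0):
            lo, flo = mid, fmid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def chi2_quantile(prob, n):
    """Quantile of the chi-square distribution by bisection on the CDF."""
    hi = float(n)
    while chi2_cdf(hi, n) < prob:
        hi *= 2
    return bisect(lambda x: chi2_cdf(x, n) - prob, 0.0, hi)


def matched_endpoints(n, alpha, q):
    """Solve coverage = 1 - alpha together with d1^q f(d1) = d2^q f(d2)."""
    a = n / 2 - 1 + q          # x^q f(x) is proportional to x^a exp(-x/2)
    mode = 2 * a               # unique maximum of x^q f(x)

    def logg(x):
        return a * math.log(x) - x / 2

    def partner(d1):
        # The point d2 > mode with the same value of x^q f(x) as d1 < mode.
        target = logg(d1)
        hi = mode + 1.0
        while logg(hi) > target:
            hi *= 2
        return bisect(lambda x: logg(x) - target, mode, hi)

    def excess(d1):
        return chi2_cdf(partner(d1), n) - chi2_cdf(d1, n) - (1 - alpha)

    d1 = bisect(excess, 1e-9, mode * (1 - 1e-6))
    return d1, partner(d1)


def endpoints(n, alpha, p):
    """Endpoints (d1, d2) on the S scale for the three criteria."""
    return {
        "shortest": matched_endpoints(n, alpha, 1 + p),
        "log-shortest": matched_endpoints(n, alpha, 1),
        "equal-tailed": (chi2_quantile(alpha / 2, n),
                         chi2_quantile(1 - alpha / 2, n)),
    }


if __name__ == "__main__":
    n, alpha, p = 10, 0.05, 1.0
    for name, (d1, d2) in endpoints(n, alpha, p).items():
        c1, c2 = d2 ** (-p), d1 ** (-p)   # interval is (c1 S^p, c2 S^p)
        print(f"{name:13s} d1={d1:.4f} d2={d2:.4f} c1={c1:.5f} c2={c2:.5f}")
```

Working on the $d$ scale keeps all three criteria comparable; the multipliers follow from $c_1 = d_2^{-p}$ and $c_2 = d_1^{-p}$.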
In interval estimation problems, two quite different types of losses occur. One varies with the "size" of the interval. The other is a consequence of not covering the true parameter. Thus, a vector-valued loss function seems appropriate. Extended definitions of minimaxity and admissibility can then be developed. Alternatively, by taking a linear combination of the individual losses, a real-valued loss function may be constructed. For a discussion of the resulting implications, see, for example, Blyth [3].

Let $A = \{(a_1, a_2): 0 < a_1 < a_2 < \infty\}$ denote the action space, and assume that the loss function is of the form $L(a_1, a_2; \mu, \sigma) = L(a_1 \sigma^{-2p}, a_2 \sigma^{-2p})$. As in Section 2.1, the problem is then invariant under $\mathcal{G}$, and any $\mathcal{G}$-invariant procedure is of the form $(c_1 S^p, c_2 S^p)$. In particular, if

$$L_1(\alpha, \beta) = k^{*}(\beta - \alpha) + 1 - I_{(\alpha, \beta)}(1), \qquad (3.2.4)$$

then, from Lemma 3.2.1, the shortest interval of coefficient $1 - \alpha$ is the best $\mathcal{G}$-invariant procedure, for some $k^{*} > 0$. Similarly, if

$$L_2(\alpha, \beta) = k^{*} \ln(\beta \alpha^{-1}) + 1 - I_{(\alpha, \beta)}(1), \qquad (3.2.5)$$

the logarithmically shortest interval is best $\mathcal{G}$-invariant.

It does not appear to have been generally recognized that the equal-tails interval is also best $\mathcal{G}$-invariant, when [...] $+ \int^{\infty} \ln(c_1 s^p) f_S(s)\, ds$. Now, keeping $c_2 c_1^{-1}$ fixed, the derivative of $r$ with respect to $c_1$ is given by

$$c_1^{-1}\left[ -\int_0^{c_2^{-1/p}} f_S(s)\, ds + \int_{c_1^{-1/p}}^{\infty} f_S(s)\, ds \right], \qquad (3.2.7)$$

and the result is evident. With this smoother loss function, the statistician is not merely assessed a unit loss whenever $\sigma^{2p}$ is not contained in the interval. Instead, he is penalized according to the $\nu_2$-distance between $\sigma^{2p}$ and the interval.

We will now use the results of Section 3.1 to show that these best $\mathcal{G}$-invariant procedures are minimax. In the notation of the previous section, replace $U$ and $V$ by $X$ and $S^{1/2}$, respectively, and note that $D$ may be identified with $A$.
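The definition of the smoother loss used for the equal-tails interval did not survive in the text above. As a hedged illustration only, the following sketch writes down one loss that fits the surviving description (a penalty equal to the $\nu_2$-distance between $\sigma^{2p}$ and the interval) and reproduces the derivative in (3.2.7). The form of $L_3$, including the volume term with constant $k^{*}$, is an assumption and is not taken from the source.

```latex
% Assumed loss: nu_2-volume term plus nu_2-distance from the point 1 to (alpha, beta).
\begin{align*}
L_3(\alpha, \beta) &= k^{*} \ln(\beta \alpha^{-1})
   + \ln(\alpha)\, I_{(1, \infty)}(\alpha)
   - \ln(\beta)\, I_{(0, 1)}(\beta), \\
% Risk of (c_1 S^p, c_2 S^p) at sigma = 1 under the assumed loss.
r(c_1, c_2) &= k^{*} \ln(c_2 c_1^{-1})
   - \int_0^{c_2^{-1/p}} \ln(c_2 s^p)\, f_S(s)\, ds
   + \int_{c_1^{-1/p}}^{\infty} \ln(c_1 s^p)\, f_S(s)\, ds, \\
% With c_2 = gamma c_1 and gamma fixed, both integrands vanish at the moving
% endpoints, so only the derivative of the logarithms contributes.
\frac{\partial r}{\partial c_1} &= c_1^{-1} \left[
   - \int_0^{c_2^{-1/p}} f_S(s)\, ds
   + \int_{c_1^{-1/p}}^{\infty} f_S(s)\, ds \right].
\end{align*}
```

Under this assumed loss, setting the derivative to zero gives equal tail probabilities below $c_2^{-1/p}$ and above $c_1^{-1/p}$, which is the equal-tails condition.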
In order to apply Theorem 3.1.1, let $K_N = [-N, N]^k \times [N^{-1}, N]$, so that $g_N: A \to [0, \infty)$ is defined by

$$g_N(a_1, a_2) = [\Phi(N) - \Phi(-N)]^k \int_{N^{-1}}^{N} L(a_1 v^{2p}, a_2 v^{2p})\, f_V(v)\, dv.$$

Using loss function $L_1$, we see that

$$g_N(a_1, a_2) \cdot [\Phi(N) - \Phi(-N)]^{-k} = k^{*}(a_2 - a_1) \int_{N^{-1}}^{N} v^{2p} f_V(v)\, dv + \Pr(N^{-1} < V < N) - \int_{[N^{-1}, N] \cap [a_2^{-1/(2p)},\, a_1^{-1/(2p)}]} f_V(v)\, dv.$$

Moreover, it is evident that $R_0 < 1$. Therefore, using Lemma 3.1.1 and the continuity of $g_N$, it is sufficient to show that there exists a compact subset $K_0$ of $A$, and $N_0$, such that $g_{N_0}(a_1, a_2) \ge 1 - \varepsilon$, for all $(a_1, a_2) \notin K_0$. Let $N_0$ be such that $[\Phi(N_0) - \Phi(-N_0)]^k \ge (1 - \varepsilon)^{1/2}$. In order to find the form of the compact set, let $M_1 > 0$ be such that $\Pr(V > M_1^{-1/(2p)}) \ge (1 - \varepsilon)^{1/2}$. Moreover, let $M_2$ satisfy

$$k^{*} M_2 \int_{N_0^{-1}}^{N_0} v^{2p} f_V(v)\, dv \ge 1.$$

Finally, let $K_0 = \{(a_1, a_2): 0 < a_1 \le M_1,\ a_2 - a_1 \le M_2\}$. Then, for $(a_1, a_2) \notin K_0$, either $a_1 > M_1$, or $a_2 - a_1 > M_2$. In the first case, it follows that

$$g_{N_0}(a_1, a_2) \ge [\Phi(N_0) - \Phi(-N_0)]^k \left[ 1 - \int_0^{a_1^{-1/(2p)}} f_V(v)\, dv \right] \ge 1 - \varepsilon.$$

In the second case,

$$g_{N_0}(a_1, a_2) \ge [\Phi(N_0) - \Phi(-N_0)]^k\, k^{*} (a_2 - a_1) \int_{N_0^{-1}}^{N_0} v^{2p} f_V(v)\, dv \ge (1 - \varepsilon)^{1/2} \ge 1 - \varepsilon.$$

Using loss function $L_2$,

$$g_N(a_1, a_2) \cdot [\Phi(N) - \Phi(-N)]^{-k} = k^{*} \ln(a_2 a_1^{-1}) \int_{N^{-1}}^{N} f_V(v)\, dv + \Pr(N^{-1} < V < N) - \int_{[N^{-1}, N] \cap [a_2^{-1/(2p)},\, a_1^{-1/(2p)}]} f_V(v)\, dv.$$

Again, $R_0 < 1$. A similar argument will yield the result, where $M_2$ satisfies

$$k^{*} M_2 \int_{N_0^{-1}}^{N_0} f_V(v)\, dv \ge 1,$$

and $K_0 = \{(a_1, a_2): 0 < a_1 \le M_1,\ a_2 \le e^{M_2} a_1\}$.

The proof of the minimaxity of the equal-tails interval is easier, since $L_3(a_{n1}, a_{n2}) \to \infty$, whenever $(a_{n1}, a_{n2})$ approaches the one-point compactification of $A$.

Using the results of Chapter 2, it is now possible to demonstrate the inadmissibility of the logarithmically shortest and equal-tailed intervals.
One would suspect that the shortest interval would also be inadmissible, but the method seems hard to apply in this case.

First consider a logarithmically shortest interval $(c_1^0 S^p, c_2^0 S^p)$, where $(c_1^0)^{-1/p} = (c_2^0)^{-1/p}\ [Q_1 \cdot f_S]$. Let $\gamma = c_2^0 \cdot (c_1^0)^{-1}$. In this case, we shall show that there exists an $\mathcal{H}$-invariant procedure $(c_1^0 \Psi(W) S^p, c_2^0 \Psi(W) S^p)$ having uniformly higher probability of coverage (but the same $\nu_2$-volume). Note that an arbitrary $\mathcal{H}$-invariant procedure is of the form $(\Psi_1(W) S^p, \Psi_2(W) S^p)$. As we shall be restricting ourselves to intervals having $\nu_2$-volume $\ln(\gamma)$, let $A^{*} = \{(a c_1^0, a c_2^0): 0 < a < \infty\}$. Identifying $A^{*}$ with $(0, \infty)$, we are now in a position to apply Theorem 2.4.1.

But, in order to apply Theorem 2.4.1, a stronger version of Lemma 3.2.1 is needed. It is necessary to establish that the minimization in this lemma is "monotone", when the relevant integrand is unimodal. In particular, if $g(a_1) \le g(a_2)$, $g(b_1) \le g(b_2)$, $a_1 \le b_1$, and $\nu[(a_1, a_2)] = \nu[(b_1, b_2)]$, then $\int_{a_1}^{a_2} f\, d\mu \le \int_{b_1}^{b_2} f\, d\mu$. A simple geometric (or calculus) argument will show that this is the case, providing that there exists $x_0$ such that $g$ is strictly increasing on $(0, x_0]$, and strictly decreasing on $[x_0, \infty)$.

Therefore, using the notation of Theorem 2.4.1, for all $f \in \mathcal{F}$,

$$h_f(c) = \int_0^{\infty} L_2(c c_1^0 s^p, c c_2^0 s^p)\, f(s)\, ds$$

is a strictly bowl-shaped function on $A^{*}$ which takes its minimum when $(c c_1^0)^{-1/p} = (c c_2^0)^{-1/p}\ [Q_1 \cdot f]$. Also, if $f_1 f_2^{-1}$ is increasing, and $d_1 = d_2\ [Q_1 \cdot f_2]$, then

$$\frac{d_2 f_1(d_2)}{d_1 f_1(d_1)} = \frac{f_1(d_2) / f_2(d_2)}{f_1(d_1) / f_2(d_1)} \ge 1.$$

Therefore, using the stronger version of Lemma 3.2.1 again, condition (ii) of Theorem 2.4.1 holds. Finally, if

$$(c_1^0 \Psi^0(w))^{-1/p} = (c_2^0 \Psi^0(w))^{-1/p}\ [Q_1 \cdot f_{S \mid W \le w}(\cdot \mid 0)], \qquad (3.2.8)$$

then $(c_1^0 \Psi^0(W) S^p, c_2^0 \Psi^0(W) S^p)$ is a minimax interval estimator of $\sigma^{2p}$ with respect to $L_2$.
Its risk function is uniformly no larger than that of $(c_1^0 S^p, c_2^0 S^p)$, and it is formal Bayes within the class of scale-invariant procedures having $\nu_2$-volume $\ln(\gamma)$. Naturally, the actual inadmissibility of $(c_1^0 S^p, c_2^0 S^p)$ would be obtained prior to taking the limit. Incidentally, since $\Psi^0(W) \le 1$, the new procedure produces shorter intervals.

Now, consider an equal-tailed interval $(c_1^0 S^p, c_2^0 S^p)$. Again, we consider only procedures of the form $(c_1^0 \Psi(W) S^p, c_2^0 \Psi(W) S^p)$. Here, the strictly bowl-shaped nature of $h_f(c)$ is a consequence of (3.2.7), and the "dominating" procedure $(c_1^0 \Psi^0(W) S^p, c_2^0 \Psi^0(W) S^p)$ satisfies

$$\Pr_{\theta = 0}[S < (c_2^0 \Psi^0(w))^{-1/p} \mid W \le w] = \Pr_{\theta = 0}[S > (c_1^0 \Psi^0(w))^{-1/p} \mid W \le w].$$

Bibliography

[1] Arnold, B. C. "Inadmissibility of the usual scale estimate for a shifted exponential distribution." J. Amer. Statist. Assoc., Vol. 65 (1970), pp. 1260-1264.
[2] Berk, R. H. "A special group structure and equivariant estimation." Ann. Math. Statist., Vol. 38 (1967), pp. 1436-1445.
[3] Blyth, C. "Admissibility of minimax procedures." Ann. Math. Statist., Vol. 22 (1951), pp. 22-42.
[4] Brown, L. D. "On the admissibility of invariant estimators of one or more location parameters." Ann. Math. Statist., Vol. 37 (1966), pp. 1087-1135.
[5] Brown, L. D. "Inadmissibility of the usual estimators of scale parameters in problems with unknown location and scale parameters." Ann. Math. Statist., Vol. 39 (1968), pp. 29-48.
[6] Brown, L. D. "Admissible estimators, recurrent diffusions, and insoluble boundary value problems." Ann. Math. Statist., Vol. 42 (1971), pp. 855-903.
[7] Buehler, R. J. "Some validity criteria for statistical inferences." Ann. Math. Statist., Vol. 30 (1959), pp. 845-863.
[8] Chen, H. J. "On minimax invariant estimates of scale and location parameters." Sci. Sinica, Vol. 13 (1964), pp. 1569-1586.
[9] Cheng, P. "Minimax estimates of parameters of distributions belonging to the exponential family." Chinese Mathematics, Vol. 5 (1964), pp. 277-299.
[10] Dugundji, J. Topology. Boston: Allyn and Bacon, Inc., 1966.
[11] Farrell, R. H. "Weak limits of sequences of Bayes procedures in estimation theory." Proc. Fifth Berkeley Symp. Math. Statist. Prob., Vol. 1, pp. 83-111. Berkeley: University of California Press, 1966.
[12] Feller, W. An Introduction to Probability Theory and Its Applications, Vol. 1, 2nd ed. New York: John Wiley and Sons, Inc., 1957.
[13] Feller, W. An Introduction to Probability Theory and Its Applications, Vol. 2. New York: John Wiley and Sons, Inc., 1966.
[14] Ferguson, T. S. Mathematical Statistics: A Decision Theoretic Approach. New York: Academic Press, Inc., 1967.
[15] Guenther, W. C. "Shortest confidence intervals." The American Statistician, Vol. 23, No. 1 (1969), pp. 22-25.
[16] Guenther, W. C. "Unbiased confidence intervals." The American Statistician, Vol. 25, No. 1 (1971), pp. 51-53.
[17] James, W. and Stein, C. "Estimation with quadratic loss." Proc. Fourth Berkeley Symp. Math. Statist. Prob., Vol. 1, pp. 361-379. Berkeley: University of California Press, 1961.
[18] Joshi, V. M. "Admissibility of confidence intervals." Ann. Math. Statist., Vol. 37 (1966), pp. 629-638.
[19] Joshi, V. M. "Inadmissibility of the usual confidence sets for the mean of a multivariate normal population." Ann. Math. Statist., Vol. 38 (1967), pp. 1868-1875.
[20] Joshi, V. M. "Admissibility of the usual confidence sets for the mean of a univariate or bivariate population." Ann. Math. Statist., Vol. 40 (1969), pp. 1042-1067.
[21] Joshi, V. M. "Admissibility of invariant confidence procedures for estimating a location parameter." Ann. Math. Statist., Vol. 41 (1970), pp. 1568-1581.
[22] Kiefer, J. "Invariance, minimax sequential estimation and continuous time processes." Ann. Math. Statist., Vol. 28 (1957), pp. 573-601.
[23] Kudo, H. "On minimax invariant estimators of the transformation parameter." Natur. Sci. Rep. Ochanomizu Univ., Vol. 6 (1955), pp. 31-73.
[24] Lindley, D. V., East, D. A. and Hamilton, P. A. "Tables for making inferences about the variance of a normal distribution." Biometrika, Vol. 47 (1960), pp. 433-437.
[25] Loeve, M. Probability Theory, 3rd ed. Princeton, N. J.: D. Van Nostrand Co., Inc., 1963.
[26] Lukacs, E. Characteristic Functions, 2nd ed. London: Charles Griffin and Co. Ltd., 1970.
[27] Nachbin, L. The Haar Integral. Princeton, N. J.: D. Van Nostrand Co., Inc., 1965.
[28] Pachares, J. "Tables for unbiased tests on the variance of a normal population." Ann. Math. Statist., Vol. 32 (1961), pp. 84-87.
[29] Pearson, K. Tables of the Incomplete Beta Function. London: Cambridge Univ. Press, 1922.
[30] Portnoy, S. "Formal Bayes estimation with application to a random effects model." Ann. Math. Statist., Vol. 42 (1971), pp. 1379-1402.
[31] Rao, C. R. Linear Statistical Inference and Its Applications. New York: John Wiley and Sons, Inc., 1965.
[32] Sacks, J. "Generalized Bayes solutions in estimation problems." Ann. Math. Statist., Vol. 34 (1963), pp. 751-768.
[33] Scheffe, H. "On the ratio of the variances of two normal populations." Ann. Math. Statist., Vol. 13 (1942), pp. 371-388.
[34] Sion, M. Introduction to the Methods of Real Analysis. New York: Holt, Rinehart and Winston, Inc., 1968.
[35] Stein, C. "A necessary and sufficient condition for admissibility." Ann. Math. Statist., Vol. 26 (1955), pp. 518-522.
[36] Stein, C. "Inadmissibility of the usual estimator for the mean of a multivariate normal distribution." Proc. Third Berkeley Symp. Math. Statist. Prob., Vol. 1, pp. 197-206. Berkeley: University of California Press, 1956.
[37] Stein, C. "Inadmissibility of the usual estimator for the variance of a normal distribution with unknown mean." Ann. Inst. Statist. Math., Vol. 16 (1964), pp. 155-160.
[38] Strawderman, W. E. "Proper Bayes minimax estimators of the multivariate normal mean." Ann. Math. Statist., Vol. 42 (1971), pp. 385-388.
[39] Tate, R. F. and Klett, G. W. "Optimum confidence intervals for the variance of a normal distribution." J. Amer. Statist. Assoc., Vol. 54 (1959), pp. 674-682.
[40] Valand, R. S. "Invariant interval estimation of a location parameter." Ann. Math. Statist., Vol. 39 (1968), pp. 193-199.
[41] Wijsman, R. A. "Existence of local cross-sections in linear Cartan G-spaces under the action of non compact groups." Proc. Amer. Math. Soc., Vol. 17 (1966), pp. 295-301.
[42] Wijsman, R. A. "Cross-sections of orbits and their applications to densities of maximal invariants." Proc. Fifth Berkeley Symp. Math. Statist. Prob., Vol. 1, pp. 389-400. Berkeley: University of California Press, 1967.
[43] Zidek, J. V. "A representation of Bayes invariant procedures in terms of Haar measure." Ann. Inst. Statist. Math., Vol. 21 (1969), pp. 291-308.
[44] Zidek, J. V. "Inadmissibility of the best invariant estimator of extreme quantiles of the normal law under squared error loss." Ann. Math. Statist., Vol. 40 (1969), pp. 1801-1808.
[45] Zidek, J. V. "Sufficient conditions for the admissibility under squared error loss of formal Bayes estimators." Ann. Math. Statist., Vol. 41 (1970), pp. 446-456.
[46] Zidek, J. V. "Inadmissibility of a class of estimators of a normal quantile." Ann. Math. Statist., Vol. 42 (1971), pp. 1444-1447.
[47] Zidek, J. V. "Estimating the scale parameter of the exponential distribution with unknown location." To appear, Ann. Math. Statist.
On the admissibility of scale and quantile estimators Brewster, John Frederick 1972
Item Metadata
Title | On the admissibility of scale and quantile estimators |
Creator | Brewster, John Frederick |
Publisher | University of British Columbia |
Date Issued | 1972 |
Description | The inadmissibility of the best affine-invariant estimators for the variance and noncentral quantiles of the normal law, when loss is squared error, has already been established. However, the proposed (minimax) alternatives to the usual (minimax, but inadmissible) estimators are themselves inadmissible. In our search for admissible alternatives in these problems, we first consider estimators which are formal Bayes within the class of scale-invariant procedures. For such estimators, we present explicit conditions for admissibility within the class of scale-invariant procedures. In the second chapter of the thesis, we consider the estimation of an arbitrary power of the scale parameter of a normal population. Under the assumption that the loss function satisfies certain reasonable conditions, an estimator is constructed which is (i) minimax, and (ii) formal Bayes within the class of scale-invariant procedures. The estimator obtained is a limit of a sequence of minimax, preliminary test estimators. Moreover, under squared error loss, and using the results of Chapter One, this estimator is shown to be scale-admissible. More generally, results are obtained for the estimation of powers of the scale parameter in the canonical form of the general linear model, and for the estimation of powers of the scale parameter of an exponential distribution with unknown location. In Chapter Three, conditions are given for the minimaxity of best invariant procedures in general location-scale problems. Finally, by combining these results with those of the preceding chapter, the usual interval estimators for the variance of a normal population are shown to be minimax, but inadmissible. Superior, minimax procedures are suggested. |
Genre | Thesis/Dissertation |
Type | Text |
Language | eng |
Date Available | 2011-03-14 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
DOI | 10.14288/1.0080459 |
URI | http://hdl.handle.net/2429/32428 |
Degree | Doctor of Philosophy - PhD |
Program | Mathematics |
Affiliation | Science, Faculty of; Mathematics, Department of |
Degree Grantor | University of British Columbia |
Campus | UBCV |
Scholarly Level | Graduate |
AggregatedSourceRepository | DSpace |