A NEW MEASURE OF QUANTITATIVE ROBUSTNESS

by

SONIA V.T. MAZZI

Lic., Universidad Nacional de Córdoba, Argentina, 1989

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE

in

THE FACULTY OF GRADUATE STUDIES
DEPARTMENT OF STATISTICS

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
December 1991
© Sonia V.T. Mazzi, 1991

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Department of Statistics
The University of British Columbia
Vancouver, Canada
Date: December 1991

Abstract

The Gross-Error Sensitivity (GES) and the Breakdown Point (BP) are two measures of quantitative robustness which have played a key role in the development of the theory of robustness. Both can be derived from the maximum bias function $B(\varepsilon)$ and constitute a two-number summary of this function. The GES is the derivative of $B(\varepsilon)$ at the origin whereas the BP determines the asymptote of the curve $(\varepsilon, B(\varepsilon))$.

Since $\mathrm{GES}\cdot\varepsilon \approx B(\varepsilon)$ for $\varepsilon$ near zero, the GES summarizes the behavior of $B(\varepsilon)$ near the origin. On the other hand, the BP does not provide an approximation for $B(\varepsilon)$ for $\varepsilon$ large and, consequently, estimates with strikingly different bias performance when $\varepsilon$ is large may have the same BP.

A new robustness quantifier, the breakdown rate (BR), that summarizes the behavior of $B(\varepsilon)$ for $\varepsilon$ near BP will be introduced.
The BR for several families of robust estimates of regression will be presented and the increased usefulness of the three-number summary (GES, BP, BR) for comparing robust estimates will be illustrated by several examples.

Contents

Abstract
Table of Contents
List of Tables
List of Figures
1 Introduction
2 Quantitative Robustness
  2.1 Estimates Defined by Functionals
  2.2 $\varepsilon$-Neighborhoods
  2.3 Quantitative Robustness
    2.3.1 Asymptotic Bias and Asymptotic Variance
    2.3.2 The Influence Function and the Gross-Error-Sensitivity
3 Some Robust Estimates of Regression Coefficients
  3.1 The Regression Model
  3.2 S-Estimates
  3.3 $\tau$-Estimates
  3.4 MM-Estimates
4 The Relative Breakdown Rate
  4.1 The Relative Breakdown Rate of S-Estimates Based on $\chi$ Functions Strictly Convex on a Neighborhood of Zero
  4.2 The Relative Breakdown Rate of MM- and S-Estimates Based on $\chi$ Functions Strictly Convex on a Neighborhood of Zero
  4.3 The Relative Breakdown Rate of $\tau$- and S-Estimates Based on $\chi$ Functions Strictly Convex on a Neighborhood of Zero
5 The Breakdown Rate
  5.1 The Baseline Estimate
  5.2 The Definition of the Breakdown Rate
  5.3 Breakdown Rate of S-Estimates of Regression
  5.4 Breakdown Rate of $\tau$-Estimates of Regression
  5.5 Breakdown Rate of MM-Estimates of Regression
  5.6 Conclusions
Bibliography

List of Tables

4.1 Comparison of two S-estimates with the same BP
4.2 Comparison of an MM- and a $\tau$-estimate with the same BP and efficiency
4.3 Comparison of an MM- and a $\tau$-estimate with the same BP and SENS
5.1 Comparison of two S-estimates with the same BP but markedly different bias performance

List of Figures

1.1 Maximum bias curve, BP and GES of the sample median
1.2 Maximum bias curves of $S_b$ for $b = 0.85$ and $b = 0.15$

Chapter 1
Introduction

To quantify the large sample properties of an estimate representable as a functional $T$, the study of its asymptotic behavior is usually performed on some neighborhood of the model.

We will concentrate on the study of the asymptotic bias of $T$ and consider $\varepsilon$-contamination neighborhoods of a central or ideal model $F_0$. Following this criterion, robust estimates (in their asymptotic version) should change as little as possible, uniformly over some neighborhood of the model. An $\varepsilon$-neighborhood of $F_0$ is a set of distribution functions

$\mathcal{V}_\varepsilon(F_0) = \{F : F = (1-\varepsilon)F_0 + \varepsilon H;\ H \text{ is a cdf}\}.$

If $F \in \mathcal{V}_\varepsilon(F_0)$ then $F = (1-\varepsilon)F_0 + \varepsilon H$ for some cdf $H$, which can be interpreted as some unspecified distribution function generating outliers, and $\varepsilon$ can be viewed as the fraction of outliers.

The maximum asymptotic bias of an estimate $T$ over an $\varepsilon$-neighborhood, $B_T(\varepsilon)$, is an established concept and an important measure of the quantitative and global robustness of $T$ (see Section 2.3.1). $B_T(\varepsilon)$ measures the maximum possible perturbation of the value of $T(F)$ when $F$ ranges over $\mathcal{V}_\varepsilon(F_0)$.

Naturally, when the amount $\varepsilon$ of contamination increases so does $B_T(\varepsilon)$, and it eventually becomes infinite. The smallest value of $\varepsilon$ such that the maximum asymptotic bias is infinite is called the breakdown point of the estimate and indicates the amount of distortion in the model needed to make the estimate take on arbitrarily large aberrant values. The concept of breakdown point was first introduced by Hodges (1967) for one-dimensional estimates of location.
Hampel (1971) gave a much more general definition of an asymptotic nature and Donoho and Huber (1983) introduced a finite sample version of the breakdown point.

Hampel (1968, 1974a) introduced a robustness quantifier called the influence curve, which measures the speed of change of the value of an estimate when the central model is contaminated with a single observation (see Section 2.3.2). The maximum absolute value of the influence curve is called the gross-error sensitivity, and this single number summarizes the behavior of the maximum bias curve in a neighborhood of $\varepsilon = 0$. In many cases, as in the following example, the gross-error sensitivity is the derivative of the maximum bias curve at the origin.

The concepts of maximum bias curve, gross-error sensitivity and breakdown point are illustrated in Figure 1.1. In this case we consider the one-dimensional Gaussian location model and the sample median. It can be shown that the maximum bias of the median is

$B_m(\varepsilon) = \Phi^{-1}\!\left(\dfrac{1}{2(1-\varepsilon)}\right)$

and that its influence curve is

$IC_m(x) = \mathrm{sgn}(x)/[2\varphi(0)].$

It easily follows that the breakdown point of the sample median is $\varepsilon^* = 0.5$ and that the gross-error sensitivity is $\gamma^* = 1/[2\varphi(0)] \approx 1.253$ (see for instance Huber, 1981).

The breakdown point and the gross-error sensitivity are two "one-number summaries" of the maximum bias curve and they carry important information about this function. These two quantities are now routinely computed and characterize the performance of an estimate.

The breakdown point has proved to be very helpful for understanding the robustness properties of estimates.
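The median example can be checked numerically. The sketch below is only an illustration, not part of the thesis: it assumes the closed form of the maximum bias of the median quoted above, uses Python's standard-library `statistics.NormalDist` for $\Phi^{-1}$ and $\varphi$, and the function names `median_max_bias` and `median_ges` are hypothetical. It compares a finite-difference slope of the bias curve at the origin with $\gamma^* = 1/[2\varphi(0)]$.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # provides Phi (cdf), Phi^{-1} (inv_cdf) and phi (pdf)


def median_max_bias(eps: float) -> float:
    """Maximum asymptotic bias of the median at the Gaussian location model."""
    if not 0.0 <= eps < 0.5:
        raise ValueError("the bias is finite only for 0 <= eps < 0.5")
    return STD_NORMAL.inv_cdf(1.0 / (2.0 * (1.0 - eps)))


def median_ges() -> float:
    """Gross-error sensitivity of the median: 1 / (2 * phi(0))."""
    return 1.0 / (2.0 * STD_NORMAL.pdf(0.0))


ges = median_ges()

# Forward difference of the bias curve at eps = 0 (hypothetical step size).
step = 1e-6
slope_at_zero = (median_max_bias(step) - median_max_bias(0.0)) / step

print(f"GES = {ges:.4f}, slope of B at 0 = {slope_at_zero:.4f}")
for eps in (0.1, 0.3, 0.45, 0.49):
    print(f"eps = {eps:.2f}  B(eps) = {median_max_bias(eps):.3f}")
```

The two printed quantities on the first line should agree to several decimals, while the remaining lines show the bias growing without bound as $\varepsilon$ approaches the breakdown point 0.5.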
For example, Hampel (1974b, 1976) analyzed data from a Monte Carlo study of rejection rules followed by the sample mean, concluding that the performance of the different statistics considered could be ranked in terms of their breakdown points.

As another example, in the Princeton robustness study (Andrews et al., 1972, p. 253) two estimates of location with similar asymptotic properties for all symmetric distributions were studied, among others. These location estimates used auxiliary estimates of scale, and the difference in their performance was explained in terms of the breakdown points of their corresponding scale estimates.

[Figure 1.1: Maximum bias curve, BP and GES of the sample median. Horizontal axis: $\varepsilon$, from 0 to $\varepsilon^* = 0.5$.]

In the regression setup, the problem of constructing an estimate with non-null breakdown point, i.e. an estimate that can deal with a certain percentage of outliers and that is efficient for a model with Gaussian errors, was a serious concern for many statisticians.

Until 1984, several efforts were made towards obtaining an affine equivariant estimate with maximal breakdown point of 50%.

In 1984, Rousseeuw and Yohai introduced the S-estimates, which are defined implicitly by minimizing a robust M-estimate of the scale of the residuals (see Section 3.2). S-estimates can attain a 50% breakdown point, they are affine equivariant and asymptotically normal at the usual rate of $\sqrt{n}$.
But these estimates cannot combine the property of high breakdown point with high efficiency at the model with Gaussian errors.

Finally, the MM-estimates proposed by Yohai (1987) and the $\tau$-estimates proposed by Yohai and Zamar (1988) have the three desired properties: high breakdown point, affine equivariance and high efficiency at the Gaussian model (see Sections 3.3 and 3.4).

[Figure 1.2: Maximum bias curves of $S_b$ for $b = 0.85$ and $b = 0.15$. Horizontal axis: $\varepsilon$.]

We see how the concept of breakdown point (combined with other classical asymptotic concepts) inspired a fruitful search for estimates which are robust in a very precise way and also possess other desirable properties. However, the following example illustrates the fact that robust estimates with the same breakdown point can have strikingly different bias performances for large $\varepsilon$.

EXAMPLE: Let $b_1, b_2$ be such that $b_1 < 0.5$, $b_1 = 1 - b_2$. Consider the S-estimates of regression $S_{b_1}$ and $S_{b_2}$ based on jump functions (see Section 5.1). Since $b_1 = 1 - b_2$, these two estimates have the same breakdown point (see Section 3.2). By graphing their maximum bias functions (see Figure 1.2) we notice that $B_{S_{b_2}}$ diverges much more rapidly than $B_{S_{b_1}}$, where $B_{S_{b_i}}$ denotes the maximum bias curve of $S_{b_i}$. This indicates that $S_{b_2}$ is prone to take on large aberrant values much more rapidly than $S_{b_1}$, and this fact can be formalized by computing the following limit:

$RBR(S_{b_1}, S_{b_2}) = \lim_{\varepsilon \to BP} \dfrac{B^2_{S_{b_1}}(\varepsilon)}{B^2_{S_{b_2}}(\varepsilon)}.$

An easy calculation shows that $RBR(S_{b_1}, S_{b_2}) = 0$, providing a formal justification of what we inferred from Figure 1.2.

The reason why the breakdown point classification fails to distinguish between rather different estimates is that the breakdown point indicates only the location of the asymptote of the maximum bias curve but not how the curve actually behaves near this point. That is, the BP does not distinguish among estimates with maximum bias curves tending to infinity at different rates.

Therefore the gross-error sensitivity should be considered a more complete single-number description, since it tells us about the behavior of the maximum bias curve in a neighborhood of the origin.

In this thesis we introduce a new measure to quantify robustness in terms of the asymptotic bias, which is fairly easy to compute and to interpret and which, in conjunction with the GES and BP criteria, helps in classifying robust estimates. This quantity is called the Breakdown Rate (BR).

The breakdown rate is based on another newly introduced concept called the Relative Breakdown Rate (RBR). Given two estimates, say $T_1$ and $T_2$, with the same breakdown point $\varepsilon^*$, we compute their relative breakdown rate as the limit of the ratio of the squares of their maximum bias curves, $B_1(\varepsilon)$ and $B_2(\varepsilon)$, as $\varepsilon \to \varepsilon^*$. If $0 < RBR(T_1, T_2) < \infty$ then, for $\varepsilon$ near $\varepsilon^*$,

$B_1^2(\varepsilon) \approx RBR(T_1, T_2)\, B_2^2(\varepsilon).$

If $RBR(T_1, T_2) = 0$ then there is no doubt we would prefer $T_1$ to $T_2$, and if $RBR(T_1, T_2) = \infty$ then $T_1$ would be inadmissible from a robust point of view with respect to $T_2$.

We work with the specific model of linear regression. The estimates considered are Rousseeuw-Yohai's S-estimates, Yohai's MM-estimates and Yohai-Zamar's $\tau$-estimates.

The breakdown rate of an estimate in the just mentioned families is defined as the relative breakdown rate with respect to a baseline estimate, namely the min-max bias S-estimate among all S-estimates with the same breakdown point.

The breakdown rate together with the breakdown point concept gives a more complete description of the robustness properties of an estimate, because it not only points to the asymptote of the bias curve but also characterizes the way in which the curve goes to infinity.
Observe that the gross-error sensitivity and the breakdown rate describe the maximum bias curve near the boundary of its domain, $(0, BP)$.

We will show how the triplet (GES, BP, BR) allows a finer classification of robust estimates.

Chapter 2
Quantitative Robustness

2.1 Estimates Defined by Functionals

Hampel (1968) introduced a way to define an estimate which proved to be quite fruitful, since it enabled formalization of a very important aspect of robustness (qualitative robustness). It also made the study of the asymptotic properties of estimates easier, linking theoretical results of functional analysis with those of statistics.

To present Hampel's idea we need the concept of empirical distribution, which gives a way of linking a set of observations $y_1, \dots, y_n$ to a probability distribution on $R^k$, $k \geq 1$.

Definition: Given a set $\{y_1, \dots, y_n\}$, $y_i \in R^k$, the empirical distribution of $y_1, \dots, y_n$ is the probability measure on $R^k$, $\mu[y_1, \dots, y_n]$, defined by

$\mu[y_1, \dots, y_n](B) = \dfrac{1}{n} \sum_{i=1}^{n} I_B(y_i), \quad \forall B \in \mathcal{B}^k,$

where $I_B$ is the indicator function of the set $B$ and $\mathcal{B}^k$ is the family of Borel sets in $R^k$.

Let $Z(R^k)$ denote the set of all probability measures on $R^k$. For each $n$ let

$\{\mu[y_1, \dots, y_n] : y_1, \dots, y_n \in R^k\}$

be the set of all empirical distributions associated with samples of size $n$.

Definition: An estimate $T_n$ is given by a functional $T$ defined on $Z(R^k)$ if there exists a function $T$ defined on a subset $D(T) \subset Z(R^k)$ such that

$T_n(y_1, \dots, y_n) = T(\mu[y_1, \dots, y_n]),$

where $(y_1, \dots, y_n)$ is in the domain set of $T_n$ and $\mu[y_1, \dots, y_n] \in D(T)$.

We consider estimates which can be defined by functionals or that can be replaced by functionals. This means we assume that there exists a function $T : D(T) \to R^k$ such that

$T_n(y_1, \dots, y_n) \to T(F), \quad \text{as } n \to \infty,$

when the observations are i.i.d. according to the true distribution $F$.
We say that $T(F)$ is the asymptotic value of $T_n$ at $F$.

To illustrate the definitions, an example of how an estimate can be defined by a functional follows.

EXAMPLE: Sample mean defined by a functional. Let

$D(T) = \{F : F \text{ is a cdf on } R \text{ and } \int |x|\, dF(x) < \infty\}$

and

$T(F) = \int x\, dF(x) = E_F(X).$

Then

$T_n(y_1, \dots, y_n) = \dfrac{1}{n} \sum_{i=1}^{n} y_i = T(\mu[y_1, \dots, y_n]).$

2.2 $\varepsilon$-Neighborhoods

Given a functional $T$, we are interested in quantifying its robustness with respect to small changes in $F$. We want to measure the changes in $T(F)$ caused by "small" changes in $F$, in a sense that we will define.

We need the concept of an "ideal" distribution $F_0$ which obtains because of physical or other reasons and which is completely known. The real data we are able to obtain have a distribution $F$ distorted through gross errors, rounding errors or other factors beyond our control. To make a quantitative assessment of the effects of such distortions we employ a measure of such distortions in the ideal distribution, which can be a measure defined in the space of probability distributions or, more generally, just a discrepancy in the same space. We will work with the Huber contamination discrepancy, defined as

$\delta_{Huber}(F; F_0) = \inf\{\zeta : F(x) \geq (1-\zeta)F_0(x),\ \forall x\}.$

Note that this is not a distance.

Let

$\mathcal{V}_\varepsilon(F_0) = \{(1-\varepsilon)F_0 + \varepsilon H : H \text{ is a cdf}\};$

then

$\mathcal{V}_\varepsilon(F_0) = \{F : \delta_{Huber}(F, F_0) \leq \varepsilon\}$

is called the $\varepsilon$-contamination neighborhood of $F_0$.

$\varepsilon$-contamination neighborhoods were first introduced by Huber (1964) for the location model and they provide a simple way of modeling data contaminated by outliers.

If $F \in \mathcal{V}_\varepsilon$ then $F = (1-\varepsilon)F_0 + \varepsilon H$, where $H$ can be interpreted as some unspecified distribution function which generates the outliers and $\varepsilon$ can be viewed as the fraction of contamination.

2.3 Quantitative Robustness

For various reasons it may be useful to describe quantitatively how greatly a small change in the underlying distribution, $F$, changes the distribution, $\mathcal{L}_F(T_n)$, of an estimate $T_n = T_n(x_1, \dots, x_n)$, $x_i \in R^k$.
A description by means of a few numerical quantifiers might be more effective than a detailed characterization.

For the sake of simplicity, we will assume that $k = 1$.

2.3.1 Asymptotic Bias and Asymptotic Variance

Assume that $T_n$ is defined through a functional $T$, so that $T_n = T(F_n)$. In most cases of interest, $T_n$ is strongly consistent, i.e.,

$T_n \to T(F) \quad \text{a.s. } [F],$

and asymptotically normal,

$\mathcal{L}_F\{\sqrt{n}\,[T_n - T(F)]\} \to N(0, A(F, T))$

as the sample size, $n$, tends to infinity.

Quantitative large sample robustness is usually discussed in terms of the behavior of the asymptotic variance $A(F, T)$ and of the asymptotic bias, $T(F) - T(F_0)$, over some neighborhood $\mathcal{V}_\varepsilon(F_0)$ of the model distribution (e.g. $\mathcal{V}_\varepsilon(F_0)$ can be an $\varepsilon$-contamination neighborhood). In this sense, two important quantifiers are the maximum asymptotic bias

$B_T(\varepsilon) = \sup_{F \in \mathcal{V}_\varepsilon(F_0)} |T(F) - T(F_0)|$

and the maximum asymptotic variance

$V_T(\varepsilon) = \sup_{F \in \mathcal{V}_\varepsilon(F_0)} A(F, T).$

If we consider $\varepsilon$-contamination neighborhoods of $F_0$, then

$\mathcal{V}_\varepsilon(F_0) = \{F : F = (1-\varepsilon)F_0 + \varepsilon H, \text{ where } H \text{ is a cdf}\}.$

Therefore, $\mathcal{V}_1(F_0) = \{H : H \text{ is a cdf}\}$ is the set of all probability measures on the sample space, so that $\mathcal{V}_\varepsilon(F_0) \subset \mathcal{V}_1(F_0)$ for all $0 < \varepsilon < 1$ and so $B_T(\varepsilon) \leq B_T(1)$. Usually $B_T(1) = \infty$.

The asymptotic breakdown point of $T$ at $F_0$ is

$BP(T) = \sup\{\varepsilon : B_T(\varepsilon) < B_T(1)\}.$

2.3.2 The Influence Function and the Gross-Error-Sensitivity

Hampel (1968, 1974a) introduced a robustness quantifier called the influence curve (IC) or influence function, defined as

$IC(x; F, T) = \lim_{s \to 0} \dfrac{T((1-s)F + s\delta_x) - T(F)}{s},$

where $\delta_x$ denotes the point mass 1 at $x$, $x \in R$, when the limit exists.

This quantity can be viewed as the limiting influence on the value of $T(F_n)$ of a single observation $x$ added to the sample of size $n$.

The maximum absolute value of the influence curve,

$\gamma^* = \sup_x |IC(x; F, T)|,$

is called the gross-error sensitivity.

In most cases, when $\gamma^*$ and $B_T'(0)$ are finite, it can be seen that $\gamma^* = B_T'(0)$, and so the gross-error sensitivity gives us a linear approximation of the bias curve near 0.
Indeed, it can be shown that equality holds under mild regularity conditions.

Chapter 3
Some Robust Estimates of Regression Coefficients

3.1 The Regression Model

Assume the target model is given by

(3.1)  $y = x'\theta_0 + u,$

where $x = (x_1, \dots, x_p)'$ is a random vector in $R^p$, $\theta_0 = (\theta_{10}, \dots, \theta_{p0})'$ is the vector of true regression coefficients and the error, $u$, is a random variable independent of $x$. Let $F_0$ be the nominal distribution function of $u$ and $G_0$ the nominal distribution function of $x$. Then the nominal distribution function, $H_0$, of $(y, x)$ is

(3.2)  $H_0(y, x) = \int_{-\infty}^{x_1} \cdots \int_{-\infty}^{x_p} F_0(y - \theta_0' s)\, dG_0(s).$

Assume $G_0$ is elliptical about the origin with scatter matrix $A$. Correspondingly, we work with a zero intercept, although it can be shown that there is no loss of generality in this assumption.

Let $T$ be an $R^p$-valued functional defined on a ("large") subset of the space of distribution functions, $H$, on $R^{p+1}$. This subset is assumed to include all empirical distribution functions, $H_n$, corresponding to a sample, $(y_1, x_1), \dots, (y_n, x_n)$, of size $n$ from $H$. Then $T_n = T(H_n)$ is an estimate of $\theta_0$.

It is further assumed that $T$ is regression invariant, i.e., if $\tilde{y} = y + x'b$ and $\tilde{x} = C^T x$ for some full rank $p \times p$ matrix $C$, then $T(\tilde{H}) = C^{-1}[T(H) + b]$, where $\tilde{H}$ is the distribution of $(\tilde{y}, \tilde{x})$. Correspondingly, the transformed model parameter is $\tilde{\theta}_0 = C^{-1}[\theta_0 + b]$.

The asymptotic bias $b_A = b_A(H)$ of $T$ at $H$ is defined by

(3.3)  $b_A^2(H) = (T(H) - \theta_0)' A\, (T(H) - \theta_0).$

Therefore, we can assume without loss of generality that $G_0$ is spherical, i.e., $A$ is the identity matrix, and that $\theta_0 = 0$. Accordingly, the nominal model (3.2) becomes

(3.4)  $H_0(y, x) = \int_{-\infty}^{x_1} \cdots \int_{-\infty}^{x_p} F_0(y)\, dG_0(\|s\|)$

and, correspondingly, the squared asymptotic bias of $T$ at $H$ is given by the squared Euclidean norm of $T$,

(3.5)  $b_I^2(H) = \|T(H)\|^2.$

From now on we will write $b_T(H) = b_I(H)$.

If the functional $T$ is continuous at $H$, then $T(H)$ is the asymptotic value of the estimate when the underlying distribution of the sample is $H$.
It is assumed that $T$ is asymptotically unbiased at the nominal model, $H_0$, that is

$T(H_0) = 0.$

In this paper, we will assume that $(y, x) \sim N(0, I_{p+1})$, that is, $H_0$ is the $(p+1)$-dimensional multivariate standard normal distribution.

We will work with the $\varepsilon$-contamination neighborhood of the fixed nominal distribution $H_0$, $\mathcal{V}_\varepsilon(H_0) = \{(1-\varepsilon)H_0 + \varepsilon H^* : H^* \text{ is any arbitrary distribution on } R^{p+1}\}$. The maximum asymptotic bias of $T$ over $\mathcal{V}_\varepsilon(H_0)$ is defined as

(3.6)  $B_T(\varepsilon) = \sup\{\|T(H)\| : H \in \mathcal{V}_\varepsilon(H_0)\}.$

Finally, the asymptotic breakdown point of $T$ is defined as

(3.7)  $BP(T) = \inf\{\varepsilon : B_T(\varepsilon) = \infty\}.$

The estimates of regression coefficients considered in this paper have the characteristic that their influence curves are unbounded, and so their gross-error sensitivity is infinite. Moreover, the derivative of their maximum asymptotic bias function at 0 is infinite, but the derivative of the square of their maximum asymptotic bias function at 0 is finite. This fact, and the need for a linear approximation of the maximum bias function near the origin, lead us to use $B_T^2$ instead of $B_T$ as a measure of maximum possible departure from the central model. Note that the breakdown point remains unaffected. We define the sensitivity of $T$ as

(3.8)  $SENS(T) = \dfrac{d}{d\varepsilon} B_T^2(\varepsilon)\Big|_{\varepsilon=0}.$

In this way we can approximate $B_T^2(\varepsilon) \approx \varepsilon\, SENS(T)$ for $\varepsilon \approx 0$.

Remark.
Connected with the computation of the maximum asymptotic bias of the estimates considered in the next section, the following is a key result (Martin, Yohai and Zamar, 1989).

Let $\chi$ be a real-valued function on $R$ satisfying the following assumptions:

• symmetric and non-decreasing on $[0, \infty)$, with $\chi(0) = 0$;
• bounded, with $\lim_{x \to \infty} \chi(x) = 1$;
• $\chi$ has only a finite number of discontinuities.

Assume now that the target model $H_0$ is given by (3.4) and that

• $F_0$ is absolutely continuous with density $f_0$ which is symmetric, continuous and strictly decreasing for $u > 0$ and
• $G_0$ is spherical and $P_{G_0}(x'\theta = 0) = 0$, $\forall\, \theta \in R^p$ with $\theta \neq 0$.

Under the last assumption, it is easy to see that the distribution of $x'\theta$ depends only on $\|\theta\|$. Thus we set

$h(s, \|\theta\|) = E_{H_0}\, \chi\!\left(\dfrac{y - x'\theta}{s}\right).$

Martin, Yohai and Zamar (1989) show that, under the assumptions stated above on $\chi$, $F_0$ and $G_0$, $h$ is continuous, strictly increasing with respect to $\|\theta\|$ and strictly decreasing in $s$ for $s > 0$.

If $z = (y, x) \sim N(0, I_{p+1})$, then

$h(s, \gamma) = g_\chi\!\left(\dfrac{(1+\gamma^2)^{1/2}}{s}\right),$

where

$g_\chi(t) = E\{\chi(tZ)\} \quad \text{with } Z \sim N(0, 1).$

3.2 S-Estimates

S-estimates of regression coefficients were introduced by Rousseeuw and Yohai (1984).

Given a sample $u_1, \dots, u_n$, the M-estimate of scale of these numbers, $s_n$, is defined as the solution of

$\dfrac{1}{n} \sum_{i=1}^{n} \chi\!\left(\dfrac{u_i}{s_n}\right) = b,$

where $\chi$ is bounded, even and non-decreasing on $[0, \infty)$ and $b$ is usually taken equal to $E\{\chi(Z)\}$ with $Z \sim N(0, 1)$ (see Huber, 1964). We can assume with no loss of generality that $\chi(\infty) = 1$ and $\chi(0) = 0$.

Let $(y_i, x_i)$ be as in (3.1) and let $u_i(\theta) = y_i - \theta' x_i$, $\theta \in R^p$. The S-estimate of regression $\hat{\theta}_S$ is defined by the property of minimizing the M-estimate of scale of $\{u_i(\theta)\}_{i=1}^{n}$, that is

$\hat{\theta}_S = \arg\min_\theta s_n(\theta).$

The corresponding asymptotic version is

(3.9)  $\theta_S(H) = \arg\min_\theta S_H(\theta),$

where $S_H(\theta)$ satisfies the equation

$E_H\, \chi\!\left(\dfrac{y - \theta' x}{S_H(\theta)}\right) = b.$

As proved in Martin, Yohai and Zamar (1989), the maximum bias of S-estimates of regression when $H_0$ is Gaussian is given by

(3.10)  $B_S^2(\varepsilon) = \left[\dfrac{g^{-1}\!\left(\frac{b}{1-\varepsilon}\right)}{g^{-1}\!\left(\frac{b-\varepsilon}{1-\varepsilon}\right)}\right]^2 - 1 \quad \text{with } g(t) = E_\Phi\{\chi(tZ)\},$

where $\Phi$
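denotes the standard normal distribution function.

This formula can be derived in the following way. Let us consider two situations.

1. Residual M-Scale When the True Model $\theta = 0$ is Fitted. Let

$H_{(x,y)} = (1-\varepsilon)H_0 + \varepsilon\,\delta_{(x,y)} \in \mathcal{V}_\varepsilon(H_0).$

Suppose that $y$ is such that $\chi(y/\Delta_0) = 1$, where $\Delta_0$ is the residual scale M-estimate when we fit the true model (i.e. $\theta = 0$), so that

$(1-\varepsilon)\, E_{H_0}\, \chi\!\left(\dfrac{y}{\Delta_0}\right) + \varepsilon = b,$

or equivalently

$(1-\varepsilon)\, g\!\left(\dfrac{1}{\Delta_0}\right) + \varepsilon = b \quad \Longrightarrow \quad \Delta_0 = \dfrac{1}{g^{-1}\!\left(\frac{b-\varepsilon}{1-\varepsilon}\right)}.$

2. Residual M-Scale When the Outlier $(y, x)$ is Fitted. Let $\Delta(\|\theta\|)$ be defined by the equation

$(1-\varepsilon)\, E_{H_0}\, \chi\!\left(\dfrac{y - \theta' x}{\Delta(\|\theta\|)}\right) = b,$

that is

$(1-\varepsilon)\, g\!\left(\dfrac{\sqrt{1+\|\theta\|^2}}{\Delta(\|\theta\|)}\right) = b \quad \Longrightarrow \quad \Delta(\|\theta\|) = \dfrac{\sqrt{1+\|\theta\|^2}}{g^{-1}\!\left(\frac{b}{1-\varepsilon}\right)}.$

The maximum bias $B_S(\varepsilon)$ is determined by the condition

(3.11)  $\Delta(B_S(\varepsilon)) = \Delta_0.$

Observe that $S_H(\theta) \geq \Delta(\|\theta\|)$ and $S_H(0) \leq \Delta_0$ for all $\theta \in R^p$ and all $H \in \mathcal{V}_\varepsilon(H_0)$. Therefore, if $\|\theta\| > B_S(\varepsilon)$ then $S_H(\theta) > S_H(0)$ and $\theta \neq \arg\min S_H(\theta)$. Clearly then $B_S(\varepsilon) \leq \|\theta\|$.

On the other hand, following along the lines of Martin, Yohai and Zamar (1989) one can prove that, given $\theta^*$ with $\|\theta^*\| < B_S(\varepsilon)$, there exists $H \in \mathcal{V}_\varepsilon(H_0)$ such that $\theta^* = \arg\min S_H(\theta)$. Hence, $B_S(\varepsilon) \geq \|\theta^*\|$.

Therefore,

$B_S(\varepsilon) = \sup\{\|\theta\| : \Delta(\|\theta\|) < \Delta_0\}$

and so, by continuity of $\Delta(\cdot)$, $B_S(\varepsilon)$ must satisfy the equation $\Delta(B_S(\varepsilon)) = \Delta_0$, from which (3.10) directly follows.

BREAKDOWN POINT OF S-ESTIMATES

From (3.7) and (3.10) we see that the breakdown point of an S-estimate $S$ is

(3.12)  $BP(S) = \min\{b, 1-b\}.$

So, two distinct values of $b$ give rise to any specified breakdown point $\varepsilon^* \in (0, 0.5)$, namely $b = \varepsilon^*$ and $b = 1 - \varepsilon^*$.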
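Formula (3.10) and the two choices of $b$ can be explored numerically. The following sketch is an illustration under stated assumptions, not the thesis's own computation: it takes $\chi$ to be the jump function $\chi_a(x) = 1$ if $|x| > a$ and $0$ otherwise (the family used in the Introduction and in Section 5.1), for which $g(t) = 2[1 - \Phi(a/t)]$ and $g^{-1}(u) = a/\Phi^{-1}(1 - u/2)$, so the constant $a$ cancels in the ratio of (3.10). The function name `s_estimate_max_bias_jump` is hypothetical.

```python
import math
from statistics import NormalDist

PHI = NormalDist()  # standard normal: cdf, inv_cdf


def s_estimate_max_bias_jump(eps: float, b: float) -> float:
    """B_S(eps) from (3.10) when chi is a jump function with E chi(Z) = b.

    Assumption: g(t) = 2 * (1 - Phi(a / t)), hence
    g^{-1}(u) = a / Phi^{-1}(1 - u / 2) and `a` cancels in the ratio.
    """
    breakdown_point = min(b, 1.0 - b)
    if not 0.0 <= eps < breakdown_point:
        raise ValueError("need 0 <= eps < min(b, 1 - b)")
    upper = b / (1.0 - eps)          # argument of g^{-1} in the numerator
    lower = (b - eps) / (1.0 - eps)  # argument of g^{-1} in the denominator
    ratio = PHI.inv_cdf(1.0 - lower / 2.0) / PHI.inv_cdf(1.0 - upper / 2.0)
    return math.sqrt(max(ratio * ratio - 1.0, 0.0))


# Both choices of b give breakdown point 0.15 by (3.12).
for eps in (0.05, 0.10, 0.14, 0.149):
    bias_low = s_estimate_max_bias_jump(eps, b=0.15)
    bias_high = s_estimate_max_bias_jump(eps, b=0.85)
    squared_ratio = (bias_low / bias_high) ** 2
    print(f"eps={eps:.3f}  B(b=0.15)={bias_low:.3f}  "
          f"B(b=0.85)={bias_high:.3f}  ratio of squares={squared_ratio:.5f}")
```

Under these assumptions the printed ratio of squared biases shrinks as $\varepsilon$ approaches 0.15, which is the behavior that the relative breakdown rate of the Introduction formalizes.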
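It will be shown in chapter 4 that the S-estimates $S_b$ for two such values of $b$ have a strikingly different bias performance.

SENSITIVITY OF S-ESTIMATES

From (3.8) and (3.10), and if $g(t)$ is continuously differentiable in some neighborhood of $t = 1$, the sensitivity of an S-estimate, $SENS(S)$, is given by

(3.13)  $SENS(S) = \dfrac{2}{g'(1)}.$

More generally, suppose now that the estimate of regression coefficients, $\theta_J(H)$, is given by

(3.14)  $\theta_J(H) = \arg\min_\theta J(F_{H,\theta}),$

where $J$ is a functional defined on a subset of $Z(R)$ and $F_{H,\theta}$ is the distribution function under $H$ of the residual $r(\theta) = y - x'\theta$. Notice that in the case of S-estimates we take $J(F_{H,\theta}) = S(F_{H,\theta})$, with $S(F_{H,\theta})$ defined by the equation

(3.15)  $E_{F_{H,\theta}}\, \chi\!\left(\dfrac{r}{S(F_{H,\theta})}\right) = b.$

Under certain regularity conditions to be determined in future work, we conjecture that, following the lines of the argument given above, it can be shown that the maximum bias function for $\theta_J(H)$ satisfies the equation (3.11) with

$\Delta_0 = J(F_{\tilde{H},0}); \quad F_{\tilde{H},0}(x) = (1-\varepsilon)\Phi(x) + \varepsilon\,\delta_\infty(x)$

and

$\Delta(\|\theta\|) = J(F_{\tilde{H},\theta}); \quad F_{\tilde{H},\theta}(x) = (1-\varepsilon)\Phi\!\left(x/\sqrt{1+\|\theta\|^2}\right) + \varepsilon\,\delta_0(x),$

where $\delta_y(\cdot)$ is a point mass distribution at $y$.

3.3 $\tau$-Estimates

A $\tau$-estimate is given by (3.14) with $J(F_{H,\theta}) = \tau(F_{H,\theta})$, where

$\tau(F_{H,\theta}) = S^2(F_{H,\theta})\, E_{F_{H,\theta}}\, \chi_2\!\left(\dfrac{r}{S(F_{H,\theta})}\right)$

and $S(F_{H,\theta})$ is based on a function $\chi_1$ (see Yohai and Zamar, 1988).

Let $g_i(t) = g_{\chi_i}(t)$, $i = 1, 2$, and $b = E_\Phi \chi_1(Z)$. Since in this case

$\Delta_0 = \tau(F_{\tilde{H},0}) = \dfrac{1}{\left[g_1^{-1}\!\left(\frac{b-\varepsilon}{1-\varepsilon}\right)\right]^2} \left\{(1-\varepsilon)\, g_2\!\left(g_1^{-1}\!\left(\tfrac{b-\varepsilon}{1-\varepsilon}\right)\right) + \varepsilon\right\}$

and

$\Delta(\|\theta\|) = \dfrac{(1+\|\theta\|^2)(1-\varepsilon)\, g_2\!\left(g_1^{-1}\!\left(\frac{b}{1-\varepsilon}\right)\right)}{\left[g_1^{-1}\!\left(\frac{b}{1-\varepsilon}\right)\right]^2},$

we have from (3.11) that

(3.16)  $B_\tau^2(\varepsilon) = \left[\dfrac{g_1^{-1}\!\left(\frac{b}{1-\varepsilon}\right)}{g_1^{-1}\!\left(\frac{b-\varepsilon}{1-\varepsilon}\right)}\right]^2 \dfrac{g_2\!\left(g_1^{-1}\!\left(\frac{b-\varepsilon}{1-\varepsilon}\right)\right) + \frac{\varepsilon}{1-\varepsilon}}{g_2\!\left(g_1^{-1}\!\left(\frac{b}{1-\varepsilon}\right)\right)} - 1.$

BREAKDOWN POINT OF $\tau$-ESTIMATES

According to (3.16) we see that the breakdown point of a $\tau$-estimate of regression, $\tau$, is

(3.17)  $BP(\tau) = \min\{b, 1-b\}.$

Again, as in the case of S-estimates, two distinct values of $b$ give rise to a specified breakdown point $\varepsilon^* \in (0, 0.5)$.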
In chapter 5 the pronounced difference between these estimates with the same breakdown point will be shown.

SENSITIVITY OF $\tau$-ESTIMATES

If $g_i(t)$ is continuously differentiable in a neighborhood of $t = 1$, $i = 1, 2$, the sensitivity of a $\tau$-estimate, $SENS(\tau)$, is

(3.18)  $SENS(\tau) = \dfrac{2}{g_1'(1)} + \dfrac{1}{b_2}\left(1 - \dfrac{g_2'(1)}{g_1'(1)}\right),$

where $b_2 = E_\Phi \chi_2(Z)$.

3.4 MM-Estimates

Let

$s_1 = s_1(H) = \min_\theta S_1(F_{H,\theta}),$

where $S_1(F_{H,\theta})$ is as in (3.15) and is based on a function $\chi_1$. An MM-estimate is defined by (3.14) where the $J$-functional is in this case $M(F_{H,\theta}, s_1)$, with

$M(F_{H,\theta}, s_1) = E_{F_{H,\theta}}\, \chi_2\!\left(\dfrac{r}{s_1}\right),$

while $\chi_1$ and $\chi_2$ satisfy the conditions given in Yohai (1987), including the requirement that $\chi_1(x) \geq \chi_2(x)$, $\forall\, x \in R$.

In this case

$\Delta_0 = M(F_{\tilde{H},0}, S_1(F_{\tilde{H},0}))$

and

$\Delta(\|\theta\|) = M(F_{\tilde{H},\theta}, S_1(F_{\tilde{H},0})).$

Notice that

$\sup_{H \in \mathcal{V}_\varepsilon(H_0)} s_1(H) = S_1(F_{\tilde{H},0}).$

Let $g_i(t) = g_{\chi_i}(t)$, $i = 1, 2$, and $b = E_\Phi \chi_1(Z)$. It can be easily derived that

$\Delta_0 = (1-\varepsilon)\, g_2\!\left(g_1^{-1}\!\left(\tfrac{b-\varepsilon}{1-\varepsilon}\right)\right) + \varepsilon,$

$\Delta(\|\theta\|) = (1-\varepsilon)\, g_2\!\left(\sqrt{1+\|\theta\|^2}\; g_1^{-1}\!\left(\tfrac{b-\varepsilon}{1-\varepsilon}\right)\right),$

and therefore

(3.19)  $B_M^2(\varepsilon) = \left[\dfrac{g_2^{-1}\!\left(g_2\!\left(g_1^{-1}\!\left(\frac{b-\varepsilon}{1-\varepsilon}\right)\right) + \frac{\varepsilon}{1-\varepsilon}\right)}{g_1^{-1}\!\left(\frac{b-\varepsilon}{1-\varepsilon}\right)}\right]^2 - 1.$

BREAKDOWN POINT OF MM-ESTIMATES

According to (3.7) and (3.19) the breakdown point of an MM-estimate is

(3.20)  $BP(M) = b.$

SENSITIVITY OF MM-ESTIMATES

If $g_i(t)$ is continuously differentiable in a neighborhood of $t = 1$, $i = 1, 2$, the sensitivity of an MM-estimate is

(3.21)  $SENS(M) = \dfrac{2}{g_2'(1)}.$

Chapter 4
The Relative Breakdown Rate

Given two estimates $T$ and $T'$ with the same breakdown point $b$, we define the relative breakdown rate of $T$ with respect to $T'$ as

(4.1)  $RBR(T, T') = \lim_{\varepsilon \to b} \dfrac{B_T^2(\varepsilon)}{B_{T'}^2(\varepsilon)}.$

The concept of relative breakdown rate gives a more complete description of two estimates, because it not only points to the asymptote of the bias curves but also characterizes the relative speed of divergence to infinity.

In chapter 5 we will define the Breakdown Rate of certain types of S-, $\tau$- and MM-estimates. It will be the relative breakdown rate with respect to a baseline estimate, namely the min-max bias S-estimate of regression among all S-estimates with the same breakdown point.

We illustrate now how to compute and use the
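concept of the relative breakdown rate.

4.1 The Relative Breakdown Rate of S-Estimates Based on $\chi$ Functions Strictly Convex on a Neighborhood of Zero

Let $S_i$ be an S-estimate of regression based on $\chi_i$ such that $\chi_i$ is continuous, differentiable in all but a finite number of points with $0 < \int_0^\infty \chi_i'(y)\, y\, dy < \infty$, and three times differentiable in some neighborhood of zero with $\chi_i''(0) \neq 0$, $i = 1, 2$. Also suppose that

$E_\Phi \chi_1(Z) = E_\Phi \chi_2(Z) = b, \quad 0 < b \leq 0.5.$

According to (3.10) and (4.1), the relative breakdown rate of $S_1$ with respect to $S_2$ is

(4.2)  $RBR(S_1, S_2) = \lim_{\varepsilon \to b} \left[\dfrac{g_1^{-1}\!\left(\frac{b}{1-\varepsilon}\right)\, g_2^{-1}\!\left(\frac{b-\varepsilon}{1-\varepsilon}\right)}{g_2^{-1}\!\left(\frac{b}{1-\varepsilon}\right)\, g_1^{-1}\!\left(\frac{b-\varepsilon}{1-\varepsilon}\right)}\right]^2,$

where $g_i = g_{\chi_i}$, $i = 1, 2$.

Note that as $\varepsilon \to b$, $\frac{b-\varepsilon}{1-\varepsilon} \to 0$, $\frac{b}{1-\varepsilon} \to \frac{b}{1-b}$ if $b < 0.5$ and $\frac{b}{1-\varepsilon} \to 1$ if $b = 0.5$.

Therefore, as $\varepsilon \to b$, $g_i^{-1}\!\left(\frac{b-\varepsilon}{1-\varepsilon}\right) \to 0$, $g_i^{-1}\!\left(\frac{b}{1-\varepsilon}\right) \to g_i^{-1}\!\left(\frac{b}{1-b}\right)$ if $b < 0.5$ and $g_i^{-1}\!\left(\frac{b}{1-\varepsilon}\right) \to \infty$ if $b = 0.5$, $i = 1, 2$.

We will compute $L_1 = \lim_{t \to 0} \dfrac{g_2^{-1}(t)}{g_1^{-1}(t)}$ and $L_2 = \lim_{t \to 1} \dfrac{g_1^{-1}(t)}{g_2^{-1}(t)}$ using L'Hôpital's rule.

Computation of $L_1$. It is easier to compute

$L_1^2 = \lim_{t \to 0} \dfrac{[g_2^{-1}(t)]^2}{[g_1^{-1}(t)]^2}.$

Let

$L_1^* = \lim_{t \to 0} \dfrac{\frac{d}{dt}[g_2^{-1}(t)]^2}{\frac{d}{dt}[g_1^{-1}(t)]^2} = \lim_{t \to 0} \dfrac{g_1'(g_1^{-1}(t))\, g_2^{-1}(t)}{g_1^{-1}(t)\, g_2'(g_2^{-1}(t))}.$

Then, if $L_1^*$ exists (contemplating also the possibility of $L_1^*$ being infinity), $L_1^2 = L_1^*$.

Now, for $i = 1, 2$,

$g_i(t) = 2 \int_0^\infty \chi_i(ty)\, \varphi(y)\, dy$

and

$g_i'(t) = 2 \int_0^\infty \chi_i'(ty)\, y\, \varphi(y)\, dy,$

where $\varphi = \Phi'$.

Let $y > 0$; then a Taylor's series expansion of order 1 around 0 of $\chi_i'$ gives $\chi_i'(ty) = \chi_i''(0)\, ty + o(ty)$ as $t \to 0$, so that, using $2\int_0^\infty y^2 \varphi(y)\, dy = 1$,

$\dfrac{1}{t}\, g_i'(t) = \chi_i''(0) + 2 \int_0^\infty \dfrac{o(ty)}{t}\, y\, \varphi(y)\, dy, \quad t > 0.$

Hence,

(4.3)  $L_1^* = \lim_{t \to 0} \dfrac{g_1'(g_1^{-1}(t))\, g_2^{-1}(t)}{g_1^{-1}(t)\, g_2'(g_2^{-1}(t))}$

(4.4)  $\phantom{L_1^*} = \lim_{t \to 0} \dfrac{\chi_1''(0) + 2 \int_0^\infty \frac{o(g_1^{-1}(t)\, y)}{g_1^{-1}(t)}\, y\, \varphi(y)\, dy}{\chi_2''(0) + 2 \int_0^\infty \frac{o(g_2^{-1}(t)\, y)}{g_2^{-1}(t)}\, y\, \varphi(y)\, dy}$

(4.5)  $\phantom{L_1^*} = \dfrac{\chi_1''(0)}{\chi_2''(0)}.$

Computation of $L_2$. We have that

$L_2 = \lim_{t \to 1} \dfrac{g_1^{-1}(t)}{g_2^{-1}(t)} = \lim_{t \to 1} \dfrac{1/g_2^{-1}(t)}{1/g_1^{-1}(t)}.$

Let

$L_2^* = \lim_{t \to 1} \dfrac{\frac{d}{dt}\left[1/g_2^{-1}(t)\right]}{\frac{d}{dt}\left[1/g_1^{-1}(t)\right]} = \lim_{t \to 1} \dfrac{[g_1^{-1}(t)]^2\, g_1'(g_1^{-1}(t))}{[g_2^{-1}(t)]^2\, g_2'(g_2^{-1}(t))}.$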
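Then, if $L_2^*$ exists, $L_2 = L_2^*$.

Now, for $i = 1, 2$ and $t > 0$,

$t^2 g_i'(t) = 2 \int_0^\infty y\, \chi_i'(y)\, \varphi\!\left(\dfrac{y}{t}\right) dy.$

For each $y > 0$, a Taylor's series expansion of order 1 around 0 of $\varphi$ gives

$\varphi\!\left(\dfrac{y}{t}\right) = \varphi(0) + o\!\left(\dfrac{y}{t}\right) \quad \text{as } t \to \infty$

and

$t^2 g_i'(t) = 2\varphi(0) \int_0^\infty \chi_i'(y)\, y\, dy + 2 \int_0^\infty \chi_i'(y)\, y\; o\!\left(\dfrac{y}{t}\right) dy,$

so that

(4.6)  $L_2 = \dfrac{\int_0^\infty \chi_1'(y)\, y\, dy}{\int_0^\infty \chi_2'(y)\, y\, dy}.$

Then, by (4.2), (4.5) and (4.6), if $0 < b < 0.5$,

$RBR(S_1, S_2) = \left[\dfrac{g_1^{-1}\!\left(\frac{b}{1-b}\right)}{g_2^{-1}\!\left(\frac{b}{1-b}\right)}\right]^2 \dfrac{\chi_1''(0)}{\chi_2''(0)}$

and, if $b = 0.5$,

(4.7)  $RBR(S_1, S_2) = \left[\dfrac{\int_0^\infty \chi_1'(y)\, y\, dy}{\int_0^\infty \chi_2'(y)\, y\, dy}\right]^2 \dfrac{\chi_1''(0)}{\chi_2''(0)}.$

EXAMPLE 1: We will compute the RBR of two commonly used "smooth" initial S-estimates of regression with breakdown point equal to 0.5. The first one, $S_A$, is based on the function

(4.8)  $\chi_A(x) = \begin{cases} 1, & \text{if } |x| > A \\ x^2/A^2, & \text{if } 0 \leq |x| \leq A \end{cases}, \quad A > 0,$

which is a simple truncation of the classical square loss function. The choice $A = 1.041$ gives

$BP(S_A) = \int \chi_A(x)\, d\Phi(x) = 0.5.$

The second S-estimate, $S_B$, is based on the integrated Tukey's bisquare score function

(4.9)  $\chi_B(x) = \begin{cases} \dfrac{3x^2}{B^2} - \dfrac{3x^4}{B^4} + \dfrac{x^6}{B^6}, & \text{if } 0 \leq |x| \leq B \\ 1, & \text{if } |x| > B \end{cases}, \quad B > 0,$

which is three times continuously differentiable. The choice $B = 1.547$ gives $BP(S_B) = \int \chi_B(x)\, d\Phi(x) = 0.5$.

We have that

(4.10)  $\int_0^\infty \chi_A'(y)\, y\, dy = \dfrac{2A}{3}; \quad \int_0^\infty \chi_B'(y)\, y\, dy = \dfrac{16B}{35}$

and

(4.11)  $\chi_A''(0) = \dfrac{2}{A^2}; \quad \chi_B''(0) = \dfrac{6}{B^2},$

so that by (4.7)

$RBR(S_A, S_B) = 0.709.$

Also,

(4.12)  $g_A'(1) = \dfrac{4}{A^2}\left[\Phi(A) - A\varphi(A) - 0.5\right]$

and

(4.13)  $g_B'(1) = 12\,\varphi(B)\left(\dfrac{1}{B} - \dfrac{5}{B^3}\right) + 12\left[\Phi(B) - B\varphi(B) - 0.5\right]\left(\dfrac{1}{B^2} - \dfrac{6}{B^4} + \dfrac{15}{B^6}\right).$

For $A = 1.041$ and $B = 1.547$ we get

$g_A'(1) = 0.404; \quad g_B'(1) = 0.389$

and so from (3.13)

$SENS(S_A) = 4.950; \quad SENS(S_B) = 5.141.$

The asymptotic efficiency at the model with Gaussian errors of $S_A$ is given by

$e_A = 2\left[\Phi(A) - A\varphi(A) - 0.5\right]$

and for $A = 1.041$ we have that $e_A = 0.219$. The efficiency of $S_B$ for $B = 1.547$ is $e_B = 0.287$ (see Rousseeuw and Yohai, 1984).

All these computations are summarized in Table 4.1.

Table 4.1: Comparison of two S-estimates with the same BP

estimate   A, B    SENS    BP    efficiency   RBR($S_A$, $S_B$)
$S_A$      1.041   4.950   0.5   0.219        0.709
$S_B$      1.547   5.141   0.5   0.287

Based on these figures, the S-estimate based on $\chi_A$ can be expected to perform approximately the same as that based on $\chi_B$ for Gaussian or approximately Gaussian data and can be expected to perform better in the presence of a large fraction of outliers. However, this should be confirmed by extensive Monte Carlo simulation.

4.2 The Relative Breakdown Rate of MM- and S-Estimates Based on $\chi$ Functions Strictly Convex on a Neighborhood of Zero

Let $M$ be an MM-estimate of regression based on $\chi_1$ and $\chi_2$, with $\chi_1$ and $\chi_2$ three times continuously differentiable, $\chi_i''(0) \neq 0$ for $i = 1, 2$ and $E_\Phi \chi_1(Z) = b$, $0 < b \leq 0.5$.

Further, let $S$ be an S-estimate based on $\chi_1$. Then $BP(S) = BP(M)$ and we want to compute the relative breakdown rate of $M$ with respect to $S$. By (4.1) and (3.19) we have that

$RBR(M, S) = \lim_{\varepsilon \to b} \dfrac{B_M^2(\varepsilon)}{B_S^2(\varepsilon)} = \lim_{\varepsilon \to b} \left[\dfrac{g_2^{-1}\!\left(g_2\!\left(g_1^{-1}\!\left(\frac{b-\varepsilon}{1-\varepsilon}\right)\right) + \frac{\varepsilon}{1-\varepsilon}\right)}{g_1^{-1}\!\left(\frac{b}{1-\varepsilon}\right)}\right]^2.$

To compute the limit we can use L'Hôpital's rule. Following a similar reasoning as in Section 4.1 we see that

$RBR(M, S) = \begin{cases} \left[\dfrac{g_2^{-1}\!\left(\frac{b}{1-b}\right)}{g_1^{-1}\!\left(\frac{b}{1-b}\right)}\right]^2, & \text{if } 0 < b < 0.5 \\[2ex] \left[\dfrac{\int_0^\infty \chi_2'(y)\, y\, dy}{\int_0^\infty \chi_1'(y)\, y\, dy}\left(2 - \dfrac{\chi_2''(0)}{\chi_1''(0)}\right)^{-1}\right]^2, & \text{if } b = 0.5. \end{cases}$

4.3 The Relative Breakdown Rate of $\tau$- and S-Estimates Based on $\chi$ Functions Strictly Convex on a Neighborhood of Zero

Let $\tau$ be a $\tau$-estimate of regression based on $\chi_1$ and $\chi_2$ such that $E_\Phi \chi_1(Z) = b$, $0 < b \leq 0.5$. Further, let $S$ be the S-estimate of regression based on $\chi_1$. Then, by (4.1), (3.10) and (3.16),

$RBR(\tau, S) = \lim_{\varepsilon \to b} \dfrac{g_2\!\left(g_1^{-1}\!\left(\frac{b-\varepsilon}{1-\varepsilon}\right)\right) + \frac{\varepsilon}{1-\varepsilon}}{g_2\!\left(g_1^{-1}\!\left(\frac{b}{1-\varepsilon}\right)\right)} = \begin{cases} \dfrac{b}{1-b}\, \dfrac{1}{g_2\!\left(g_1^{-1}\!\left(\frac{b}{1-b}\right)\right)}, & \text{if } 0 < b < 0.5 \\[2ex] 1, & \text{if } b = 0.5. \end{cases}$

Note that given $\chi_1$, if $\tau$ and $S$ are based on $\chi_1$ and $b = 0.5$, then $RBR(\tau, S) = 1$, no matter how we choose $\chi_2$.

EXAMPLE 2: Let $M$ be an MM-estimate based on $\chi_1$ and $\chi_2$ such that $\chi_1$ and $\chi_2$ satisfy the assumptions made in example 3.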
Now let $\tau$ be based on $\chi_1$ and some other function $\chi_3$. Then, noting that
\[ RBR(M, \tau) = \frac{RBR(M, S)}{RBR(\tau, S)}, \]
where $S$ is the S-estimate based on $\chi_1$, we have that
\[ RBR(M, \tau) = \begin{cases} \left[ \dfrac{g_2^{-1}\left(\frac{b}{1-b}\right)}{g_1^{-1}\left(\frac{b}{1-b}\right)} \right]^2 \dfrac{1-b}{b}\, g_3\left( g_1^{-1}\left( \frac{b}{1-b} \right) \right), & \text{if } 0 < b < 0.5 \\[2ex] \left[ \dfrac{\int_0^\infty \chi_2'(y)\, y\, dy}{\int_0^\infty \chi_1'(y)\, y\, dy} \left( 2 - \dfrac{\chi_2''(0)}{\chi_1''(0)} \right)^{-1} \right]^2, & \text{if } b = 0.5. \end{cases} \tag{4.14} \]
Let $\mathcal{F}$ be the family of functions $\chi_B$, $B > 0$, where $\chi_B(x)$ is given by (4.9). $\mathcal{F}$ is known as Tukey's family of $\chi$ functions.

If we take $\chi_i = \chi_{B_i}$, $i = 1, 2, 3$, such that $E_\Phi \chi_1(Z) = 0.5$, by (4.14), (4.10) and (4.11) we have that
\[ RBR(M, \tau) = \left[ \frac{B_2}{B_1} \left( 2 - \frac{B_1^2}{B_2^2} \right)^{-1} \right]^2. \tag{4.15} \]
The choice of $B_1 = 1.56$ gives us two estimates with $BP = 0.5$ and if we choose $B_2 = 4.68$ and $B_3 = 6.08$, both estimates have 95% asymptotic efficiency at the model with Gaussian errors (see Yohai, 1985 and Yohai and Zamar, 1988).

With these values of $B_1$ and $B_2$ (note that $RBR(M, \tau)$ doesn't depend on $B_3$)
\[ RBR(M, \tau) = 2.58. \]
By (4.13) $g_1'(1) = 1.300$, $g_2'(1) = 0.207$ and $g_3'(1) = 0.138$ and the value of $b_3$ such that $b_3 = E_\Phi \chi_3(Z)$ is $b_3 = 0.075$.

Therefore,
\[ SENS(\tau) = 13.480; \qquad SENS(M) = 9.639. \]
We summarize the calculated quantifiers in Table 4.2.

estimate   B_1    B_2 (M) or B_3 (τ)   BP    efficiency   SENS     RBR(M, τ)
M          1.56   4.68                 0.5   0.95         9.639    2.58
τ          1.56   6.08                 0.5   0.95         13.480

Table 4.2: Comparison of an MM- and a τ-estimate with the same BP and efficiency

Since (4.15) does not depend on the choice of $B_3$, if we take $B_1 = 1.56$, $B_2 = 4.68$ and $B_3 = 5.025$, we have that $SENS(M) = SENS(\tau) = 9.639$ and RBR remains the same as the one calculated before, i.e. $RBR(M, \tau) = 2.58$.

estimate   B_1    B_2 (M) or B_3 (τ)   BP    SENS    RBR(M, τ)
M          1.56   4.680                0.5   9.639   2.58
τ          1.56   5.025                0.5   9.639

Table 4.3: Comparison of an MM- and a τ-estimate with the same BP and SENS

We can conclude from Table 4.3 that $\tau$-estimates can be expected to outperform comparable MM-estimates for a wide range of fractions of contamination. This should also be confirmed by extensive Monte Carlo studies.

Chapter 5

The Breakdown Rate

In this chapter we will define the breakdown rate for S-, $\tau$- and MM-estimates of regression.
The min-max asymptotic bias S-estimate (among all S-estimates with the same breakdown point) will be used as a baseline estimate. In Section 5.1 we justify the choice of this baseline estimate, in Section 5.2 we give the definition of the breakdown rate and in the subsequent sections we compute the breakdown rate for certain types of S-, $\tau$- and MM-estimates.

5.1 The Baseline Estimate

We will denote by $\chi_a$ the function
\[ \chi_a(x) = \begin{cases} 1, & \text{if } |x| > a \\ 0, & \text{otherwise.} \end{cases} \]
We call $\chi_a$ a "jump function" with jump constant $a$.

Let $C$ be the family of functions $\chi : \mathbb{R} \to \mathbb{R}$ such that:
• $\chi$ is even and nondecreasing in $[0, \infty)$;
• $\chi$ is either continuous or a jump function;
• $\chi$ is continuously differentiable in all but a finite number of points;
• $\chi(0) = 0$ and $\chi(x) \to 1$ as $x \to \infty$;
• $0 < E_\Phi\{\chi(X)\} < 1$.

For $\chi \in C$, let
\[ g_\chi(t) = E_\Phi\{\chi(tX)\}. \]
The following lemma was stated and proved by Martin and Zamar (1989).¹

¹ The result proved in the reference paper is more general than the one presented in Lemma 1. It is valid for any distribution function $F_0$ with a density $f_0$ symmetric about 0 and such that $f_0(tx)/f_0(x)$ is decreasing in $x$ for $t > 1$.

Lemma 1: Given $0 < b < 1$, let
\[ C_b = \{ \chi : \chi \in C \text{ and } E_\Phi\{\chi(X)\} = b \} \tag{5.1} \]
and $a$ satisfying $2[1 - \Phi(a)] = b$. Then, for all $\chi \in C_b$,
\[ g_{\chi_a}(t) \ge g_\chi(t), \quad \forall\, t > 1; \]
\[ g_{\chi_a}(t) \le g_\chi(t), \quad \forall\, t < 1. \]

Proof: Since $\chi, \chi_a \in C_b$ we have that
\[ \int_0^a \chi(y)\, \varphi(y)\, dy = \int_a^\infty [1 - \chi(y)]\, \varphi(y)\, dy. \]
Now, note that $\varphi(y/t)/\varphi(y)$ is an increasing function of $y$ if $t > 1$ and it is decreasing in $y$ if $0 < t < 1$.

Then, $\forall\, t > 1$,
\[ \frac{1}{t} \int_0^a \chi(y)\, \varphi(y/t)\, dy = \frac{1}{t} \int_0^a \chi(y)\, \frac{\varphi(y/t)}{\varphi(y)}\, \varphi(y)\, dy \le \frac{1}{t} \frac{\varphi(a/t)}{\varphi(a)} \int_0^a \chi(y)\, \varphi(y)\, dy \]
\[ = \frac{1}{t} \frac{\varphi(a/t)}{\varphi(a)} \int_a^\infty [1 - \chi(y)]\, \varphi(y)\, dy \le \frac{1}{t} \int_a^\infty [1 - \chi(y)]\, \varphi(y/t)\, dy. \]
Therefore,
\[ g_\chi(t) = \frac{2}{t} \int_0^\infty \chi(y)\, \varphi(y/t)\, dy = \frac{2}{t} \int_0^a \chi(y)\, \varphi(y/t)\, dy + \frac{2}{t} \int_a^\infty \chi(y)\, \varphi(y/t)\, dy \le \frac{2}{t} \int_a^\infty \varphi(y/t)\, dy = g_{\chi_a}(t). \]
For $t < 1$ the inequalities above are reversed and the result follows.
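The ordering stated in Lemma 1 can be illustrated numerically for one member of $C_b$. The sketch below is an illustration only, not part of the thesis: it takes Tukey's $\chi_B$ with $B = 1.547$ as the comparison function, computes $b = E_\Phi \chi_B(X)$ by a midpoint rule, picks the jump constant $a$ from $2[1 - \Phi(a)] = b$, and compares $g_{\chi_a}(t) = 2[1 - \Phi(a/t)]$ with $g_{\chi_B}(t)$ on a few arbitrarily chosen values of $t$.

```python
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution
B = 1.547           # Tukey constant assumed for the comparison function


def chi_tukey(x):
    """Integrated Tukey bisquare with constant B."""
    u = min(abs(x) / B, 1.0)
    return 3 * u ** 2 - 3 * u ** 4 + u ** 6


def g(chi, t, n=20000, upper=10.0):
    """g_chi(t) = E[chi(t X)], X ~ N(0, 1), by a midpoint rule on [0, upper]."""
    h = upper / n
    total = sum(chi(t * (k + 0.5) * h) * STD.pdf((k + 0.5) * h) for k in range(n))
    return 2.0 * h * total  # chi is even, so double the half-line integral


b = g(chi_tukey, 1.0)          # E chi(X), close to 0.5 for this B
a = STD.inv_cdf(1.0 - b / 2)   # jump constant solving 2[1 - Phi(a)] = b


def g_jump(t):
    """g for the jump function chi_a: P(|t X| > a)."""
    return 2.0 * (1.0 - STD.cdf(a / t))


for t in (0.25, 0.5, 0.8, 1.25, 2.0, 4.0):
    print(f"t={t:<5} g_jump={g_jump(t):.4f}  g_tukey={g(chi_tukey, t):.4f}")
```

For the values of $t$ above one, the jump function should give the larger value, and for those below one the smaller, which is the ordering Theorem 1 relies on.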
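The following theorem, which follows directly from Lemma 1, shows that the S-estimate of regression $S_b$ based on $\chi_a$ with $2[1 - \Phi(a)] = b$ is min-max bias over $C_b$, where $C_b$ is as in (5.1).

Theorem 1: For all $0 < b < 1$,
\[ B_S(\epsilon) \ge B_{S_b}(\epsilon), \quad 0 < \epsilon < b, \]
for all $S$ based on $\chi \in C_b$.

Proof: From (3.10), the maximum bias function of an S-estimate based on a function $\chi \in C_b$ is given by
\[ B_S^2(\epsilon) = \left[ \frac{g_\chi^{-1}\left( b/(1-\epsilon) \right)}{g_\chi^{-1}\left( (b-\epsilon)/(1-\epsilon) \right)} \right]^2 - 1, \quad 0 < \epsilon < b. \]
Since $\forall\, \epsilon > 0$, $b/(1-\epsilon) > b$ and $(b-\epsilon)/(1-\epsilon) < b$, it follows that
\[ g_\chi^{-1}\left( b/(1-\epsilon) \right) > 1 \quad \text{and} \quad g_\chi^{-1}\left( (b-\epsilon)/(1-\epsilon) \right) < 1. \]
By the preceding lemma, we have that
\[ g_{\chi_a}^{-1}\left( b/(1-\epsilon) \right) \le g_\chi^{-1}\left( b/(1-\epsilon) \right) \quad \text{and} \quad g_{\chi_a}^{-1}\left( (b-\epsilon)/(1-\epsilon) \right) \ge g_\chi^{-1}\left( (b-\epsilon)/(1-\epsilon) \right). \quad \Box \]

Proposition 1: If $S$ is an S-estimate of regression based on $\chi \in C_b$, $0 < b < 1$, then
\[ \lim_{\epsilon \to b} \frac{B_S^2(\epsilon)}{B_{S_b}^2(\epsilon)} \ge 1. \]

Proof: The result follows from last theorem, since $B_S(\epsilon) \ge B_{S_b}(\epsilon)$, $0 < \epsilon < b$. $\Box$

Proposition 2: Let $0 < b < 1$ and $\tau$ be a $\tau$-estimate of regression based on $\chi_1 \in C_b$ and $\chi_2 \in C$. Then, if either
• $b = 0.5$,
or
• $0 < \min\{b, 1-b\} < 0.5$ and $g_2\left( g_1^{-1}\left( \frac{b}{1-b} \right) \right) \le \frac{b}{1-b}$,²
\[ \lim_{\epsilon \to b} \frac{B_\tau^2(\epsilon)}{B_{S_b}^2(\epsilon)} \ge 1. \]

² If $\chi_1(x) \ge \chi_2(x)\ \forall x$, then $g_1(t) \ge g_2(t)\ \forall t$, and so $g_2(g_1^{-1}(t)) \le t$, $\forall t$. Usually, $\chi_1$ and $\chi_2$ are taken in the same family of functions (e.g. Tukey's family, see Section 4.1). In this case, since $\chi_1$ is chosen to attain a high breakdown point and $\chi_2$ to attain high efficiency, the choice $\chi_1 \ge \chi_2$ is the natural one to make.

Proof: Denote by $S_1$ the S-estimate of regression based on the function $\chi_1$ and let $g_i = g_{\chi_i}$, $i = 1, 2$. Then,
\[ \lim_{\epsilon \to b} \frac{B_\tau^2(\epsilon)}{B_{S_b}^2(\epsilon)} = \lim_{\epsilon \to b} \left\{ \frac{B_{S_1}^2(\epsilon)}{B_{S_b}^2(\epsilon)} \cdot \frac{g_2\left( g_1^{-1}\left( \frac{b-\epsilon}{1-\epsilon} \right) \right) + \frac{\epsilon}{1-\epsilon}}{g_2\left( g_1^{-1}\left( \frac{b}{1-\epsilon} \right) \right)} \right\}. \]
Now,
\[ \lim_{\epsilon \to b} \frac{g_2\left( g_1^{-1}\left( \frac{b-\epsilon}{1-\epsilon} \right) \right) + \frac{\epsilon}{1-\epsilon}}{g_2\left( g_1^{-1}\left( \frac{b}{1-\epsilon} \right) \right)} = \begin{cases} 1, & \text{if } b = 0.5 \\[1ex] \dfrac{b}{1-b} \left[ g_2\left( g_1^{-1}\left( \frac{b}{1-b} \right) \right) \right]^{-1}, & \text{if } 0 < \min\{b, 1-b\} < 0.5. \end{cases} \]
The hypothesis $g_2\left( g_1^{-1}\left( \frac{b}{1-b} \right) \right) \le \frac{b}{1-b}$ implies that the last limit above is greater than or equal to one $\forall\, 0 < b < 1$, and so by Proposition 1 the result follows. $\Box$

Proposition 3: Let $M$ be an MM-estimate of regression based on $\chi_1 \in C_b$ and $\chi_2 \in C$, $0 < b \le 0.5$. Then, if either
• $b = 0.5$
or
• $0 < b < 0.5$, $g_2\left( g_1^{-1}\left( \frac{b}{1-b} \right) \right) \le \frac{b}{1-b}$,³ and $\exists\, c \ge 0$ and $d > c$ such that $\chi_1(x) = 0$, $\forall\, |x| \le c$ and $\chi_1$ is strictly increasing and two times continuously differentiable on $(c, d)$,
\[ \lim_{\epsilon \to b} \frac{B_M^2(\epsilon)}{B_{S_b}^2(\epsilon)} \ge 1. \]

³ This condition will be automatically satisfied for the MM-estimates such as we have defined them in Section 3.4 since it is required that $\chi_2 \le \chi_1$.

Proof: Suppose first that $b = 0.5$.

For each $0 < \epsilon < 0.5$, let $f_\epsilon(b) = B_{S_b}^2(\epsilon)$ and $b(\epsilon) = \arg\min_{\epsilon < b < 1-\epsilon} f_\epsilon(b)$.

It was proved by Yohai and Zamar (1991) that if $T$ is an estimate of regression depending only on the residuals, then for each $0 < \epsilon < 0.5$, $B_{S_{b(\epsilon)}}(\epsilon) \le B_T(\epsilon)$.

This fact implies that
\[ \lim_{\epsilon \to 0.5} \frac{B_M^2(\epsilon)}{B_{S_{b(\epsilon)}}^2(\epsilon)} \ge 1. \]
Now, note that since $b(\epsilon) \to 0.5$ as $\epsilon \uparrow 0.5$,
\[ \lim_{\epsilon \to 0.5} \frac{B_{S_{b(\epsilon)}}^2(\epsilon)}{B_{S_{0.5}}^2(\epsilon)} = 1, \]
and since
\[ \lim_{\epsilon \to 0.5} \frac{B_M^2(\epsilon)}{B_{S_{0.5}}^2(\epsilon)} = \lim_{\epsilon \to 0.5} \frac{B_M^2(\epsilon)}{B_{S_{b(\epsilon)}}^2(\epsilon)} \cdot \frac{B_{S_{b(\epsilon)}}^2(\epsilon)}{B_{S_{0.5}}^2(\epsilon)} \]
the assertion follows.

Now suppose that $0 < b < 0.5$, then⁴
\[ \lim_{\epsilon \to b} \frac{B_M^2(\epsilon)}{B_{S_b}^2(\epsilon)} = \begin{cases} \infty, & \text{if } c = 0 \\[1ex] \dfrac{a^2}{c^2} \left[ \dfrac{g_2^{-1}\left( b/(1-b) \right)}{h^{-1}\left( b/(1-b) \right)} \right]^2, & \text{if } c > 0. \end{cases} \]

⁴ We delay the proof of this fact until Section 5.5.

Suppose that $c > 0$ and denote by $S_1$ the S-estimate based on $\chi_1$. Since $g_1^{-1}\left( \frac{b}{1-b} \right) \le g_2^{-1}\left( \frac{b}{1-b} \right)$,
\[ \lim_{\epsilon \to b} \frac{B_M^2(\epsilon)}{B_{S_b}^2(\epsilon)} \ge \lim_{\epsilon \to b} \frac{B_{S_1}^2(\epsilon)}{B_{S_b}^2(\epsilon)} \ge 1. \quad \Box \]

5.2 The Definition of the Breakdown Rate

Define $\mathcal{E}_b$ as the family of S-, $\tau$- and MM-estimates with breakdown point $b$.

The results of the previous section motivate us to define the Breakdown Rate (BR) of an estimate $T_b \in \mathcal{E}_b$ as:
\[ BR^2(T_b) = \lim_{\epsilon \to b} \frac{B_{T_b}^2(\epsilon)}{B_{S_b}^2(\epsilon)}. \]
Under the assumptions stated in Propositions 1, 2 and 3, the BR indicates the speed of divergence to infinity of the square of the maximum asymptotic bias function of an estimate $T_b$ with respect to that of a baseline function, namely the maximum asymptotic bias of $S_b$, $B_{S_b}(\epsilon)$.

The BR is a measure of global robustness which summarizes information contained in the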
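last portion of the maximum asymptotic bias function of $T_b$. It provides a simple way of comparing robust estimates with the same breakdown point.

Note that we are comparing all estimates of $\mathcal{E}_b$ with the same estimate, namely $S_b \in \mathcal{E}_b$, the min-max asymptotic bias S-estimate of regression.

5.3 Breakdown Rate of S-Estimates of Regression

Let $0 < b < 1$, $\chi \in C_b$ and $S$ be an S-estimate of regression based on $\chi$.

In this section we calculate the breakdown rate of $S$, that is
\[ BR^2(S) = \lim_{\epsilon \to b} \frac{B_S^2(\epsilon)}{B_{S_b}^2(\epsilon)} = \lim_{\epsilon \to b} \left[ \frac{g^{-1}\left( \frac{b}{1-\epsilon} \right)}{h^{-1}\left( \frac{b}{1-\epsilon} \right)} \cdot \frac{h^{-1}\left( \frac{b-\epsilon}{1-\epsilon} \right)}{g^{-1}\left( \frac{b-\epsilon}{1-\epsilon} \right)} \right]^2, \]
where
\[ h(t) = 2\left( 1 - \Phi\left( \frac{a}{t} \right) \right), \quad \text{with } a \text{ such that } b = 2(1 - \Phi(a)), \]
and
\[ g(t) = \int_{-\infty}^{\infty} \chi(tx)\, \varphi(x)\, dx. \]
The following two results show that we can restrict our attention to the case $0 < b \le 0.5$.

Lemma 2: Let $0.5 < b < 1$ and $\chi_1 \in C_b$, $\chi_2 \in C_{1-b}$. Further assume that either:
• $0 < \int_0^\infty \chi_i'(y)\, y\, dy < \infty$, $i = 1, 2$,
or
• $\chi_i = \chi_{a_i}$, $i = 1, 2$, where $2[1 - \Phi(a_1)] = b$ and $2[1 - \Phi(a_2)] = 1 - b$.
Denote by $S^b$ the S-estimate of regression based on $\chi_1$ and $S^{1-b}$ the S-estimate of regression based on $\chi_2$. Then,
\[ RBR(S^b, S^{1-b}) = \infty. \]

Proof: Assume that $0 < \int_0^\infty \chi_i'(y)\, y\, dy < \infty$. Let $g(t) = g_{\chi_1}(t)$ and $f(t) = g_{\chi_2}(t)$. Then
\[ RBR(S^b, S^{1-b}) = \lim_{\epsilon \to 1-b} \left[ \frac{g^{-1}\left( \frac{b}{1-\epsilon} \right) f^{-1}\left( \frac{1-b-\epsilon}{1-\epsilon} \right)}{g^{-1}\left( \frac{b-\epsilon}{1-\epsilon} \right) f^{-1}\left( \frac{1-b}{1-\epsilon} \right)} \right]^2. \]
We can apply L'Hôpital's rule to compute the limit:
\[ \lim_{t \to 1} g^{-1}(t)\, f^{-1}(1-t) = \lim_{t \to 1} \frac{f^{-1}(1-t)}{1/g^{-1}(t)} = \lim_{t \to 1} \frac{\left[ g^{-1}(t) \right]^2 g'\left( g^{-1}(t) \right)}{f'\left( f^{-1}(1-t) \right)}. \]
We can write,
\[ t^2 g'(t) = 2 \left[ \varphi(0) \int_0^\infty \chi_1'(x)\, x\, dx + \int_0^\infty o(x/t)\, \chi_1'(x)\, x\, dx \right], \]
\[ f'(t) = 2 \int_0^\infty \chi_2'(tx)\, x\, \varphi(x)\, dx. \]
Thus,
\[ \lim_{t \to 1} \frac{\left[ g^{-1}(t) \right]^2 g'\left( g^{-1}(t) \right)}{f'\left( f^{-1}(1-t) \right)} = \lim_{t \to 1} \frac{\varphi(0) \int_0^\infty \chi_1'(x)\, x\, dx + \int_0^\infty o\left( x/g^{-1}(t) \right) \chi_1'(x)\, x\, dx}{\int_0^\infty \chi_2'\left( f^{-1}(1-t)\, x \right) x\, \varphi(x)\, dx} = \infty. \]
Now suppose that $\chi_i$ is of the jump type with jump constant $a_i$, $i = 1, 2$, where $2[1 - \Phi(a_1)] = b$ and $2[1 - \Phi(a_2)] = 1 - b$. Then
\[ RBR(S^b, S^{1-b}) = \lim_{\epsilon \to 1-b} \left[ \frac{\Phi^{-1}\left( 1 - \frac{b-\epsilon}{2(1-\epsilon)} \right) \Phi^{-1}\left( 1 - \frac{1-b}{2(1-\epsilon)} \right)}{\Phi^{-1}\left( 1 - \frac{b}{2(1-\epsilon)} \right) \Phi^{-1}\left( 1 - \frac{1-b-\epsilon}{2(1-\epsilon)} \right)} \right]^2. \]
Now, by applying L'Hôpital's rule we get
\[ \lim_{\epsilon \to 1-b} \Phi^{-1}\left( 1 - \frac{1-b-\epsilon}{2(1-\epsilon)} \right) \Phi^{-1}\left( 1 - \frac{b}{2(1-\epsilon)} \right) = \lim_{\epsilon \to 1-b} \frac{\Phi^{-1}\left( 1 - \frac{b}{2(1-\epsilon)} \right)}{\left[ \Phi^{-1}\left( 1 - \frac{1-b-\epsilon}{2(1-\epsilon)} \right) \right]^{-1}} \]
\[ = \lim_{\epsilon \to 1-b} \left[ \Phi^{-1}\left( 1 - \frac{1-b-\epsilon}{2(1-\epsilon)} \right) \right]^2 \frac{\varphi\left( \Phi^{-1}\left( 1 - \frac{1-b-\epsilon}{2(1-\epsilon)} \right) \right)}{\varphi\left( \Phi^{-1}\left( 1 - \frac{b}{2(1-\epsilon)} \right) \right)} = \lim_{t \to \infty} \frac{1}{\varphi(0)}\, t^2 \varphi(t) = 0. \]
And so, $RBR(S^b, S^{1-b}) = \infty$. $\Box$

Let $\mathcal{S}_b$ be the family of S-estimates based on $\chi$ functions such that $\chi \in C_b$.

Proposition 4: If $0.5 < b < 1$ and $S \in \mathcal{S}_b$ then $BR(S) = \infty$.

Proof: Let $S^{1-b}$ be any S-estimate in $\mathcal{S}_{1-b}$ such that it is based on a function of the same type as the function on which $S$ is based (i.e. if $S$ is based on a jump type function then $S^{1-b}$ should be based on a jump type function as well, and similarly if $S$ is based on a function that is continuously differentiable in all but a finite number of points). Then, since
\[ BR^2(S) = RBR(S, S^{1-b})\, BR^2(S^{1-b}) \ge RBR(S, S^{1-b}), \]
the result follows as a consequence of the previous lemma. $\Box$

We will concentrate now on the case when $0 < b \le 0.5$.

Note that if
\[ L_1 = \lim_{\epsilon \to b} \frac{g^{-1}\left( \frac{b}{1-\epsilon} \right)}{h^{-1}\left( \frac{b}{1-\epsilon} \right)} \quad \text{and} \quad L_2 = \lim_{\epsilon \to b} \frac{h^{-1}\left( \frac{b-\epsilon}{1-\epsilon} \right)}{g^{-1}\left( \frac{b-\epsilon}{1-\epsilon} \right)} \]
both exist, then $BR^2(S) = (L_1 L_2)^2$.

Lemma 3: Suppose that $\chi \in C_b$ and $0 < \int_0^\infty \chi'(y)\, y\, dy < \infty$. Then
\[ \lim_{t \to 1} \frac{g^{-1}(t)}{h^{-1}(t)} = \frac{1}{a} \int_0^\infty \chi'(y)\, y\, dy. \]

Proof: Since $h^{-1}(t)$ and $g^{-1}(t)$ tend to infinity as $t$ tends to one, we can apply L'Hôpital's rule to compute $L_1$.

Let
\[ L_1' = \lim_{t \to 1} \frac{\frac{d}{dt} \left[ h^{-1}(t) \right]^{-1}}{\frac{d}{dt} \left[ g^{-1}(t) \right]^{-1}} = \lim_{t \to 1} \frac{\left[ g^{-1}(t) \right]^2 g'\left( g^{-1}(t) \right)}{\left[ h^{-1}(t) \right]^2 h'\left( h^{-1}(t) \right)}. \]
Then, if $L_1'$ exists, $L_1 = L_1'$.

Note that
\[ h'(t) = \frac{2a}{t^2}\, \varphi\left( \frac{a}{t} \right), \qquad g'(t) = \frac{2}{t^2} \int_0^\infty \chi'(y)\, y\, \varphi\left( \frac{y}{t} \right) dy, \]
and that a Taylor series expansion of order 1 around 0 gives
\[ \varphi\left( \frac{z}{t} \right) = \varphi(0) + o\left( \frac{z}{t} \right), \quad \text{as } t \to \infty, \]
so that we can write,
\[ h'(t) = \frac{2a}{t^2} \left[ \varphi(0) + o\left( \frac{a}{t} \right) \right], \quad \text{as } t \to \infty; \]
\[ g'(t) = \frac{2}{t^2} \left[ \varphi(0) \int_0^\infty \chi'(y)\, y\, dy + \int_0^\infty \chi'(y)\, y\, o\left( \frac{y}{t} \right) dy \right], \quad \text{as } t \to \infty. \]
Hence,
\[ L_1' = \lim_{t \to 1} \frac{\left( g^{-1}(t) \right)^2 g'\left( g^{-1}(t) \right)}{\left( h^{-1}(t) \right)^2 h'\left( h^{-1}(t) \right)} = \frac{1}{a} \int_0^\infty \chi'(y)\, y\, dy \]
and the result follows. $\Box$

Lemma 4: Let $\chi \in C_b$ and $L_2 = \lim_{t \to 0} \frac{h^{-1}(t)}{g^{-1}(t)}$. Suppose that $\exists\, c \ge 0$ and $d > c$ such that $\chi(y) = 0\ \forall y \in [-c, c]$ and $\chi$ is strictly increasing and two times continuously differentiable in $(c, d)$.

Then if $c = 0$, $L_2 = \infty$ and if $c > 0$,
\[ L_2 = \frac{a}{c}. \]

Proof: Since $h^{-1}(t)$, $g^{-1}(t)$, $h'(t)$, $g'(t) \to 0$ as $t \to 0$, we can apply L'Hôpital's rule (two times) to compute $L_2^2$. Let
\[ L_2' = \lim_{t \to 0} \frac{\frac{d}{dt} \left[ g^{-1}(t) \right]^{-2}}{\frac{d}{dt} \left[ h^{-1}(t) \right]^{-2}} = \lim_{t \to 0} \frac{\left[ h^{-1}(t) \right]^3 h'\left( h^{-1}(t) \right)}{\left[ g^{-1}(t) \right]^3 g'\left( g^{-1}(t) \right)} \]
and
\[ L_2'' = \lim_{t \to 0} \frac{\frac{d}{dt} \left\{ \left[ h^{-1}(t) \right]^3 h'\left( h^{-1}(t) \right) \right\}}{\frac{d}{dt} \left\{ \left[ g^{-1}(t) \right]^3 g'\left( g^{-1}(t) \right) \right\}} = \lim_{t \to 0} \frac{3 \left[ h^{-1}(t) \right]^2 + \left[ h^{-1}(t) \right]^3 \dfrac{h''\left( h^{-1}(t) \right)}{h'\left( h^{-1}(t) \right)}}{3 \left[ g^{-1}(t) \right]^2 + \left[ g^{-1}(t) \right]^3 \dfrac{g''\left( g^{-1}(t) \right)}{g'\left( g^{-1}(t) \right)}}. \]
If $L_2''$ exists, then $L_2' = L_2''$ and so $L_2^2 = L_2''$.

Now,
\[ h'(t) = \frac{2a}{t^2}\, \varphi\left( \frac{a}{t} \right), \qquad h''(t) = h'(t) \left[ \frac{a^2}{t^3} - \frac{2}{t} \right], \]
\[ g'(t) = \frac{2}{t^2} \int_0^\infty \chi'(y)\, y\, \varphi\left( \frac{y}{t} \right) dy, \qquad g''(t) = \frac{2}{t^5} \int_0^\infty \chi'(y)\, y^3 \varphi\left( \frac{y}{t} \right) dy - \frac{2}{t}\, g'(t), \]
so that
\[ \frac{g''(t)}{g'(t)} = \frac{1}{t^3} \frac{\int_0^\infty \chi'(y)\, y^3 \varphi\left( \frac{y}{t} \right) dy}{\int_0^\infty \chi'(y)\, y\, \varphi\left( \frac{y}{t} \right) dy} - \frac{2}{t}, \qquad \frac{h''(t)}{h'(t)} = \frac{1}{t^3} \left( a^2 - 2t^2 \right), \]
and so
\[ L_2'' = \lim_{t \to 0} \frac{a^2 + \left[ h^{-1}(t) \right]^2}{\dfrac{\int_0^\infty \chi'(y)\, y^3 \varphi\left( y/g^{-1}(t) \right) dy}{\int_0^\infty \chi'(y)\, y\, \varphi\left( y/g^{-1}(t) \right) dy} + \left[ g^{-1}(t) \right]^2}. \]
Under the stated hypothesis on $\chi$,
\[ \frac{\int_0^\infty \chi'(y)\, y^3 \varphi\left( \frac{y}{t} \right) dy}{\int_0^\infty \chi'(y)\, y\, \varphi\left( \frac{y}{t} \right) dy} = t^2\, \frac{\int_{c/t}^\infty \chi'(ty)\, y^3 \varphi(y)\, dy}{\int_{c/t}^\infty \chi'(ty)\, y\, \varphi(y)\, dy}. \]
Let
\[ f(t, h) = t^2\, \frac{\int_{(c+h)/t}^\infty \chi'(ty)\, y^3 \varphi(y)\, dy}{\int_{(c+h)/t}^\infty \chi'(ty)\, y\, \varphi(y)\, dy}, \quad t > 0,\ h > 0. \]
Let $t, h > 0$ and $y > (c+h)/t$. Then, by performing a Taylor series expansion of order 0 around $c + h$ we can write
\[ \chi'(ty) = \chi'(c+h) + R_0(ty, c+h), \]
where $R_0(ty, c+h) = \chi''(\xi)[ty - (c+h)]$ for some $\xi \in (c+h, ty)$.

Then,
\[ f(t, h) = t^2\, \frac{\chi'(c+h) \int_{(c+h)/t}^\infty y^3 \varphi(y)\, dy + \int_{(c+h)/t}^\infty R_0(ty, c+h)\, y^3 \varphi(y)\, dy}{\chi'(c+h) \int_{(c+h)/t}^\infty y\, \varphi(y)\, dy + \int_{(c+h)/t}^\infty R_0(ty, c+h)\, y\, \varphi(y)\, dy}. \]
But,
\[ \frac{t^2 \int_{(c+h)/t}^\infty y^3 \varphi(y)\, dy}{\int_{(c+h)/t}^\infty y\, \varphi(y)\, dy} = \frac{\varphi\left( \frac{c+h}{t} \right) \left[ (c+h)^2 + 2t^2 \right]}{\varphi\left( \frac{c+h}{t} \right)} = (c+h)^2 + 2t^2; \]
\[ \frac{t^2 \int_{(c+h)/t}^\infty R_0(ty, c+h)\, y^3 \varphi(y)\, dy}{\int_{(c+h)/t}^\infty y\, \varphi(y)\, dy} = \int_0^\infty R_0(tx + c + h, c+h)\, (tx + c + h)^3\, \frac{1}{t}\, \frac{\varphi\left( x + \frac{c+h}{t} \right)}{\varphi\left( \frac{c+h}{t} \right)}\, dx \]
and
\[ \frac{\int_{(c+h)/t}^\infty R_0(ty, c+h)\, y\, \varphi(y)\, dy}{\int_{(c+h)/t}^\infty y\, \varphi(y)\, dy} = \int_0^\infty R_0(tx + c + h, c+h)\, (tx + c + h)\, \frac{1}{t}\, \frac{\varphi\left( x + \frac{c+h}{t} \right)}{\varphi\left( \frac{c+h}{t} \right)}\, dx, \]
where $R_0(tx + c + h, c + h) = \chi''(\xi)\, tx$, for some $\xi \in (c+h, tx + c + h)$ for each $x > 0$.

Therefore, $f(t, h) \to c^2$ for $(t, h) \to (0, 0)^+$ and
\[ L_2^2 = \frac{a^2}{c^2}. \quad \Box \]

Theorem 2: Let $0 < b \le 0.5$ and $\chi \in C_b$. Suppose that $\exists\, c \ge 0$ and $d > c$ such that $\chi(y) = 0\ \forall y \in [-c, c]$ and $\chi$ is strictly increasing and two times continuously differentiable in $(c, d)$.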
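Let $S$ be the S-estimate based on $\chi$.
• If $c = 0$ and $0 < \int_0^\infty \chi'(y)\, y\, dy < \infty$, then $BR^2(S) = \infty\ \forall\, b$.
• If $c > 0$ and $0 < b < 0.5$, then
\[ BR^2(S) = \left[ \frac{1}{c}\, \Phi^{-1}\left( 1 - \frac{b}{2(1-b)} \right) g^{-1}\left( \frac{b}{1-b} \right) \right]^2. \]
• If $c > 0$, $0 < \int_0^\infty \chi'(y)\, y\, dy < \infty$ and $b = 0.5$, then
\[ BR^2(S) = \left[ \frac{1}{c} \int_0^\infty \chi'(y)\, y\, dy \right]^2. \tag{5.2} \]

Proof: It is a direct consequence of Lemmas 3 and 4. $\Box$

EXAMPLE 3: Let
\[ \chi_{C,A}(x) = \begin{cases} 0, & \text{if } 0 \le |x| \le C \\[0.5ex] \dfrac{x^2 - C^2}{A^2 - C^2}, & \text{if } C < |x| < A \\[1ex] 1, & \text{if } |x| \ge A \end{cases} \]
and let $S^{C,A}$ be the S-estimate based on $\chi_{C,A}$.

Let $0 < b \le 0.5$ be such that $E_\Phi \chi_{C,A}(Z) = b$, that is
\[ \frac{2}{A^2 - C^2} \left[ \Phi(A)(1 - A^2) - A\varphi(A) - \Phi(C)(1 - C^2) + C\varphi(C) \right] + 2 = b. \]
The choice $C = 0.202$ and $A = 1$ gives $BP(S^{C,A}) = 0.5$.

Since
\[ g_{C,A}'(1) = \frac{4}{A^2 - C^2} \left[ \Phi(A) - A\varphi(A) - \Phi(C) + C\varphi(C) \right] \]
we have that
\[ SENS(S^{C,A}) = \frac{A^2 - C^2}{2 \left[ \Phi(A) - A\varphi(A) - \Phi(C) + C\varphi(C) \right]}, \]
and the efficiency of $S^{C,A}$ at the Gaussian model is
\[ e(S^{C,A}) = 2 \left[ \Phi(A) - A\varphi(A) - \Phi(C) + C\varphi(C) \right]. \]
To compute the breakdown rate of $S^{C,A}$ note first that
\[ \int_0^\infty \chi_{C,A}'(x)\, x\, dx = \frac{2}{3}\, \frac{A^3 - C^3}{A^2 - C^2} \]
and by (5.2)
\[ BR^2(S^{C,A}) = \left[ \frac{2}{3}\, \frac{A^3 - C^3}{(A^2 - C^2)\, C} \right]^2. \]
Note that if $C = 0$, the estimate reduces to the one introduced in Section 4.1, Example 1, based on $\chi_A$ given by (4.8). We summarize the quantifiers calculated above for the specific values of $C$ and $A$ in Table 5.1 including $S_A$ as well.

estimate   C       A       BP    SENS    efficiency   BR
S^{C,A}    0.202   1       0.5   4.879   0.196        3.412
S_A        0       1.041   0.5   4.950   0.219        ∞

Table 5.1: Comparison of two S-estimates with the same BP but markedly different bias performance

Notice that these two estimates have very similar asymptotic properties such as the BP, SENS and efficiency. They can only be distinguished in terms of their BR.

5.4 Breakdown Rate of $\tau$-Estimates of Regression

Let $0 < b < 1$, $\chi_1 \in C_b$, $\chi_2 \in C$ and $\tau$ be a $\tau$-estimate of regression based on $\chi_1$ and $\chi_2$.

If $b \ne 0.5$, the breakdown rate of $\tau$ is
\[ BR^2(\tau) = \lim_{\epsilon \to b} \frac{B_\tau^2(\epsilon)}{B_{S_b}^2(\epsilon)} = \frac{b}{1-b} \left[ g_2\left( g_1^{-1}\left( \frac{b}{1-b} \right) \right) \right]^{-1} \lim_{\epsilon \to b} \left[ \frac{g_1^{-1}\left( \frac{b}{1-\epsilon} \right) h^{-1}\left( \frac{b-\epsilon}{1-\epsilon} \right)}{h^{-1}\left( \frac{b}{1-\epsilon} \right) g_1^{-1}\left( \frac{b-\epsilon}{1-\epsilon} \right)} \right]^2 \]
and if $b = 0.5$
\[ BR^2(\tau) = \lim_{\epsilon \to 0.5} \left[ \frac{g_1^{-1}\left( \frac{0.5}{1-\epsilon} \right) h^{-1}\left( \frac{0.5-\epsilon}{1-\epsilon} \right)}{h^{-1}\left( \frac{0.5}{1-\epsilon} \right) g_1^{-1}\left( \frac{0.5-\epsilon}{1-\epsilon} \right)} \right]^2, \]
where $g_i(t) = g_{\chi_i}(t)$, $i = 1, 2$ and $h$ is the same function defined in Section 5.3.

Lemma 5: Let $0.5 < b < 1$, $\chi_i, \tilde{\chi}_i \in C$, $i = 1, 2$, $\chi_1 \in C_b$ and $\tilde{\chi}_1 \in C_{1-b}$. Let $\tau_1$ be the $\tau$-estimate based on $\chi_1$ and $\chi_2$ and $\tau_2$ the $\tau$-estimate based on $\tilde{\chi}_1$ and $\tilde{\chi}_2$. Then
\[ RBR(\tau_1, \tau_2) = \infty. \]

Proof: Let $g_j(t) = g_{\chi_j}(t)$ and $f_j(t) = g_{\tilde{\chi}_j}(t)$, $j = 1, 2$. Then
\[ RBR(\tau_1, \tau_2) = \lim_{\epsilon \to 1-b} \frac{\left[ g_2\left( g_1^{-1}\left( \frac{b-\epsilon}{1-\epsilon} \right) \right) + \frac{\epsilon}{1-\epsilon} \right] \left[ g_2\left( g_1^{-1}\left( \frac{b}{1-\epsilon} \right) \right) \right]^{-1}}{\left[ f_2\left( f_1^{-1}\left( \frac{1-b-\epsilon}{1-\epsilon} \right) \right) + \frac{\epsilon}{1-\epsilon} \right] \left[ f_2\left( f_1^{-1}\left( \frac{1-b}{1-\epsilon} \right) \right) \right]^{-1}} \left[ \frac{g_1^{-1}\left( \frac{b}{1-\epsilon} \right) f_1^{-1}\left( \frac{1-b-\epsilon}{1-\epsilon} \right)}{g_1^{-1}\left( \frac{b-\epsilon}{1-\epsilon} \right) f_1^{-1}\left( \frac{1-b}{1-\epsilon} \right)} \right]^2 \]
and by Lemma 2, this limit is equal to infinity. $\Box$

Let $\mathcal{T}_b$ be the family of $\tau$-estimates based on functions $\chi_1 \in C_b$ and $\chi_2 \in C$.

The following result shows that we can restrict our attention to the case $0 < b \le 0.5$.

Proposition 5: If $0.5 < b < 1$, $\chi_1 \in C_b$, $\chi_2 \in C$ and $\tau$ is the $\tau$-estimate of regression based on $\chi_1$ and $\chi_2$, then
\[ BR(\tau) = \infty. \]

Proof: Let $\tau^{1-b}$ be any $\tau$-estimate in $\mathcal{T}_{1-b}$. Then since $BR^2(\tau) \ge RBR(\tau, \tau^{1-b})$, the result follows as a consequence of Lemma 5. $\Box$

Theorem 3: Let $0 < b \le 0.5$, $\chi_1 \in C_b$, $\chi_2 \in C$ and $\tau$ be the $\tau$-estimate of regression based on $\chi_1$ and $\chi_2$. Suppose that $\exists\, c \ge 0$ and $d > c$ such that $\chi_1(y) = 0\ \forall y \in [-c, c]$ and $\chi_1$ is strictly increasing and two times continuously differentiable in $(c, d)$. Then,
• if $c = 0$ and $0 < \int_0^\infty \chi_1'(y)\, y\, dy < \infty$, then $BR(\tau) = \infty$;
• if $c > 0$ and $b < 0.5$,
\[ BR^2(\tau) = \frac{b}{1-b} \left[ g_2\left( g_1^{-1}\left( \frac{b}{1-b} \right) \right) \right]^{-1} \left[ \frac{1}{c}\, \Phi^{-1}\left( 1 - \frac{b}{2(1-b)} \right) g_1^{-1}\left( \frac{b}{1-b} \right) \right]^2; \]
• if $c > 0$, $b = 0.5$ and $0 < \int_0^\infty \chi_1'(y)\, y\, dy < \infty$,
\[ BR^2(\tau) = \left[ \frac{1}{c} \int_0^\infty \chi_1'(y)\, y\, dy \right]^2. \tag{5.3} \]

Proof: It is also a consequence of Lemmas 3 and 4. $\Box$

Corollary 1: If $b = 0.5$, $S \in \mathcal{S}_b$ is based on some function $\chi_1$ and $\tau \in \mathcal{T}_b$ is based on $\chi_1$ and $\chi_2$, then $BR(S) = BR(\tau)$.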
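If $b < 0.5$ then $BR(S) \le BR(\tau)$.

Remark: In the case $c > 0$, $b = 0.5$, the BR of the $\tau$-estimate does not depend on the choice of $\chi_2$.

5.5 Breakdown Rate of MM-Estimates of Regression

Let $0 < b \le 0.5$, $\chi_1 \in C_b$, $\chi_2 \in C$ and $M$ be an MM-estimate of regression based on $\chi_1$ and $\chi_2$.

The breakdown rate of $M$ is
\[ BR^2(M) = \lim_{\epsilon \to b} \frac{B_M^2(\epsilon)}{B_{S_b}^2(\epsilon)} = \lim_{\epsilon \to b} \left[ \frac{g_2^{-1}\left[ g_2\left( g_1^{-1}\left( \frac{b-\epsilon}{1-\epsilon} \right) \right) + \frac{\epsilon}{1-\epsilon} \right]}{h^{-1}\left( \frac{b}{1-\epsilon} \right)} \cdot \frac{h^{-1}\left( \frac{b-\epsilon}{1-\epsilon} \right)}{g_1^{-1}\left( \frac{b-\epsilon}{1-\epsilon} \right)} \right]^2. \]

Lemma 6: Let $0 < b \le 0.5$ and $\chi_i \in C$, $i = 1, 2$. Then,
• if $b < 0.5$,
\[ \lim_{\epsilon \to b} \frac{g_2^{-1}\left[ g_2\left( g_1^{-1}\left( \frac{b-\epsilon}{1-\epsilon} \right) \right) + \frac{\epsilon}{1-\epsilon} \right]}{h^{-1}\left( \frac{b}{1-\epsilon} \right)} = \frac{g_2^{-1}\left( \frac{b}{1-b} \right)}{h^{-1}\left( \frac{b}{1-b} \right)}; \]
• if $b = 0.5$ and $0 < \int_0^\infty \chi_2'(y)\, y\, dy < \infty$,
\[ \lim_{\epsilon \to b} \frac{g_2^{-1}\left[ g_2\left( g_1^{-1}\left( \frac{b-\epsilon}{1-\epsilon} \right) \right) + \frac{\epsilon}{1-\epsilon} \right]}{h^{-1}\left( \frac{b}{1-\epsilon} \right)} = \frac{1}{a} \int_0^\infty \chi_2'(y)\, y\, dy \left( 2 - \lim_{t \to 0} \frac{g_2'\left( g_1^{-1}(t) \right)}{g_1'\left( g_1^{-1}(t) \right)} \right)^{-1}. \]

Proof: Assume that $b = 0.5$ and, to shorten the formulas, write $s_\epsilon = \frac{0.5-\epsilon}{1-\epsilon}$, $p_\epsilon = \frac{0.5}{1-\epsilon}$ and $q_\epsilon = g_2\left( g_1^{-1}(s_\epsilon) \right) + \frac{\epsilon}{1-\epsilon}$. We can apply L'Hôpital's rule to compute the limit of interest: if
\[ L = \lim_{\epsilon \to 0.5} \frac{\left[ h^{-1}(p_\epsilon) \right]^{-1}}{\left[ g_2^{-1}(q_\epsilon) \right]^{-1}} \]
and
\[ L' = \lim_{\epsilon \to 0.5} \frac{\frac{d}{d\epsilon} \left[ h^{-1}(p_\epsilon) \right]^{-1}}{\frac{d}{d\epsilon} \left[ g_2^{-1}(q_\epsilon) \right]^{-1}} = \lim_{\epsilon \to 0.5} \frac{\left[ g_2^{-1}(q_\epsilon) \right]^2 g_2'\left( g_2^{-1}(q_\epsilon) \right)}{\left[ h^{-1}(p_\epsilon) \right]^2 h'\left( h^{-1}(p_\epsilon) \right)} \left[ 2 - \frac{g_2'\left( g_1^{-1}(s_\epsilon) \right)}{g_1'\left( g_1^{-1}(s_\epsilon) \right)} \right]^{-1}. \]
Then, if $L'$ exists, $L = L'$.

Now, since
\[ h'(t) = \frac{2a}{t^2} \left[ \varphi(0) + o\left( \frac{a}{t} \right) \right] \quad \text{as } t \to \infty, \]
\[ g_2'(t) = \frac{2}{t^2} \left[ \varphi(0) \int_0^\infty \chi_2'(y)\, y\, dy + \int_0^\infty \chi_2'(y)\, y\, o\left( \frac{y}{t} \right) dy \right] \quad \text{as } t \to \infty; \]
then,
\[ \lim_{\epsilon \to 0.5} \frac{\left[ g_2^{-1}(q_\epsilon) \right]^2 g_2'\left( g_2^{-1}(q_\epsilon) \right)}{\left[ h^{-1}(p_\epsilon) \right]^2 h'\left( h^{-1}(p_\epsilon) \right)} = \frac{1}{a} \int_0^\infty \chi_2'(y)\, y\, dy. \quad \Box \]

Remark: Unfortunately, we were not able to compute $\lim_{t \to 0} \frac{g_2'\left( g_1^{-1}(t) \right)}{g_1'\left( g_1^{-1}(t) \right)}$ in general. However, we will compute it for one important special case.

Let $\chi_1, \chi_2 \in C$ be such that $\chi_2(x) \le \chi_1(x)$, $\forall x$.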
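Suppose that there exists $c_1 \ge 0$ such that $\chi_1(x) = 0$ if $x \in [0, c_1]$. Then there exists $c_2 \ge c_1$ such that $\chi_2(x) = 0$ if $x \in [0, c_2]$.

If we want to define an MM-estimate based on $\chi_1$ and $\chi_2$, the choice of $\chi_2$ should be made to obtain efficiency at the Gaussian model and so $\chi_2$ should be as close as possible to $\chi(x) = x^2$. For this reason we only consider the case $c_1 = c_2 = c$.

Now,
\[ \frac{g_2'(t)}{g_1'(t)} = \frac{\int_c^\infty \chi_2'(y)\, y\, \varphi\left( \frac{y}{t} \right) dy}{\int_c^\infty \chi_1'(y)\, y\, \varphi\left( \frac{y}{t} \right) dy} = \frac{\int_{c/t}^\infty \chi_2'(ty)\, y\, \varphi(y)\, dy}{\int_{c/t}^\infty \chi_1'(ty)\, y\, \varphi(y)\, dy}. \]
Let
\[ f(t, h) = \frac{\int_{(c+h)/t}^\infty \chi_2'(ty)\, y\, \varphi(y)\, dy}{\int_{(c+h)/t}^\infty \chi_1'(ty)\, y\, \varphi(y)\, dy}, \quad \text{for } t, h > 0, \]
and assume that there exists $d > c$ such that $\chi_i$ is strictly increasing and two times continuously differentiable in $(c, d)$, $i = 1, 2$. By continuity of $\chi_i'$ on $(c, d)$, for each $y > (c+h)/t$ we have
\[ \chi_2'(ty) = \chi_2'(c+h) + \tilde{R}_0(ty, c+h), \]
\[ \chi_1'(ty) = \chi_1'(c+h) + R_0(ty, c+h), \]
where $\tilde{R}_0(ty, c+h)$ and $R_0(ty, c+h)$ converge to zero as $ty$ tends to $c + h$.

Then,
\[ f(t, h) = \frac{\chi_2'(c+h)\, \varphi\left( \frac{c+h}{t} \right) + \int_{(c+h)/t}^\infty \tilde{R}_0(ty, c+h)\, y\, \varphi(y)\, dy}{\chi_1'(c+h)\, \varphi\left( \frac{c+h}{t} \right) + \int_{(c+h)/t}^\infty R_0(ty, c+h)\, y\, \varphi(y)\, dy} \]
\[ = \frac{\chi_2'(c+h)}{\chi_1'(c+h)} \cdot \frac{1 + \dfrac{1}{\chi_2'(c+h)} \int_{(c+h)/t}^\infty \tilde{R}_0(ty, c+h)\, y\, \dfrac{\varphi(y)}{\varphi((c+h)/t)}\, dy}{1 + \dfrac{1}{\chi_1'(c+h)} \int_{(c+h)/t}^\infty R_0(ty, c+h)\, y\, \dfrac{\varphi(y)}{\varphi((c+h)/t)}\, dy}. \]
Then, it's easy to see that if $\chi_i$ is not differentiable at $c$, then
\[ f(t, h) \to \frac{\chi_2'(c+)}{\chi_1'(c+)}, \quad \text{as } (t, h) \to (0, 0); \]
where $\chi_i'(c+)$ denotes the right lateral derivative of $\chi_i$ at $c$, $i = 1, 2$.

Theorem 4: Let $0 < b \le 0.5$, $\chi_1 \in C_b$, $\chi_2 \in C$, $\chi_2(x) \le \chi_1(x)$, $\forall x$ and $M$ be the MM-estimate of regression based on $\chi_1$ and $\chi_2$. Suppose that $\exists\, c \ge 0$ and $d > c$ such that $\chi_1(x) = 0\ \forall\, |x| \le c$ and $\chi_1$ is strictly increasing and two times continuously differentiable on $(c, d)$. Then,
• if $0 < b < 0.5$ and $c > 0$,
\[ BR^2(M) = \left[ \frac{1}{c}\, \Phi^{-1}\left( 1 - \frac{b}{2(1-b)} \right) g_2^{-1}\left( \frac{b}{1-b} \right) \right]^2. \]
Further assume that $0 < \int_0^\infty \chi_2'(y)\, y\, dy < \infty$.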
Then,
• if $c = 0$, $BR(M) = \infty$;
• if $b = 0.5$, $c > 0$ and $\lim_{t \to 0} \frac{g_2'\left( g_1^{-1}(t) \right)}{g_1'\left( g_1^{-1}(t) \right)}$ exists,
\[ BR^2(M) = \left\{ \frac{1}{c} \int_0^\infty \chi_2'(y)\, y\, dy \left[ 2 - \lim_{t \to 0} \frac{g_2'\left( g_1^{-1}(t) \right)}{g_1'\left( g_1^{-1}(t) \right)} \right]^{-1} \right\}^2. \tag{5.4} \]

Proof: Let
\[ f(\epsilon) = \frac{g_2^{-1}\left[ g_2\left( g_1^{-1}\left( \frac{0.5-\epsilon}{1-\epsilon} \right) \right) + \frac{\epsilon}{1-\epsilon} \right]}{h^{-1}\left( \frac{0.5}{1-\epsilon} \right)}. \]
Then, since $g_2\left( g_1^{-1}\left( \frac{0.5-\epsilon}{1-\epsilon} \right) \right) \ge 0$, $g_2(t) \le g_1(t)$ and $g_2$ is increasing,
\[ \frac{g_2^{-1}\left( \frac{\epsilon}{1-\epsilon} \right)}{h^{-1}\left( \frac{0.5}{1-\epsilon} \right)} \le f(\epsilon) \le \frac{g_2^{-1}\left( \frac{0.5}{1-\epsilon} \right)}{h^{-1}\left( \frac{0.5}{1-\epsilon} \right)}. \]
By Lemma 3 the right hand side of the above inequality converges to $\frac{1}{a} \int_0^\infty \chi_2'(y)\, y\, dy$ as $\epsilon \to 0.5$ and by a similar reasoning the left hand side tends to $\frac{1}{2a} \int_0^\infty \chi_2'(y)\, y\, dy$ as $\epsilon \to 0.5$. Since by hypothesis $0 < \int_0^\infty \chi_2'(y)\, y\, dy < \infty$, there exist $A_1, A_2$ such that $0 < A_1 < A_2 < \infty$ and
\[ A_1 \le f(\epsilon) \le A_2. \]
Hence, by Lemma 4, if $c = 0$, $BR(M) = \infty$. If $c > 0$ and $b = 0.5$, the result is a consequence of Lemma 6. $\Box$

5.6 Conclusions

Following are some conclusions obtained from the results proved in this chapter.

• The results of Chapter 5 can be used to choose the loss function, $\chi_1$, which determines the breakdown point of S-, $\tau$- and MM-estimates so that they have good bias-robustness properties. In particular, the fact that $\chi_1$ should be constant and equal to zero on a neighborhood of zero (among other regularity conditions) was first discovered here.

• MM- and $\tau$-estimates were developed for the purpose of achieving a high breakdown point and a high efficiency at the Gaussian model. The results in this chapter show that the breakdown rate of $\tau$-estimates with breakdown point equal to 0.5 does not depend on the choice of the "efficiency determining" loss function $\chi_2$. On the other hand, the condition that $\chi_2(x) \le \chi_1(x)$, $\forall x$ for MM-estimates, forces $\chi_2$ to be constant near the origin with ensuing loss of efficiency (see the remark to Lemma 6, Section 5.5).

• We think that the breakdown rate is a good criterion for defining optimality as in the following problem: "maximize the efficiency of an estimate subject to a constraint on its breakdown rate".
If we can find an estimate that solves such a problem, it will be an adaptive estimate in the sense that if the model is Gaussian or nearly Gaussian the estimate will perform well (because of its high efficiency) and if the fraction of contamination is high, the estimate will perform well compared to other estimates with the same breakdown point.

• The breakdown rate of an estimate is a robustness quantifier of an asymptotic nature. It remains to be determined whether the good breakdown rate properties of an estimate carry over to finite sample situations. A next step in this work will be to perform an extensive Monte Carlo study.

Bibliography

Andrews, D.F., Bickel, P.J., Hampel, F.R., Huber, P.J., Rogers, W.H. and Tukey, J.W. (1972), Robust Estimates of Location: Survey and Advances, Princeton University Press, Princeton, N.J.

Donoho, D.L. and Huber, P.J. (1983), The notion of breakdown point, in A Festschrift for Erich Lehmann, edited by P. Bickel, K. Doksum and J.L. Hodges, Jr., Wadsworth, Belmont, CA.

Hampel, F.R. (1968), Contributions to the theory of robust estimation, Ph.D. Thesis, University of California, Berkeley.

Hampel, F.R. (1971), A general qualitative definition of robustness, Ann. Math. Stat., 42, 1887-1896.

Hampel, F.R. (1974a), The influence curve and its role in robust estimation, J. Am. Stat. Assoc., 69, 389-393.

Hampel, F.R. (1974b), Rejection rules and robust estimates of location: an analysis of some Monte Carlo results, Proc. European Meeting of Statisticians and 7th Prague Conference on Information Theory, Statistical Decision Functions and Random Processes, Prague, 1974.

Hampel, F.R. (1976), On the breakdown point of some rejection rules with mean, Res. Rep. No. 11, Fachgruppe für Statistik, Eidgen. Tech. Hochschule, Zurich.

Hampel, F.R. (1978), Optimally bounding the gross-error sensitivity and the influence position in factor space, in Proceedings of the Statistical Computing Section of the American Statistical Association, ASA, Washington, D.C., 59-64.

Hodges, J.L. (1967), Efficiency in normal samples and tolerance of extreme values for some estimates of location, Proc. Fifth Berkeley Symp. Math. Stat. Probab., 1, 163-168.

Huber, P.J. (1964), Robust estimation of a location parameter, Ann. Math. Stat., 35, 73-101.

Huber, P.J. (1973), Robust regression: Asymptotics, conjectures and Monte Carlo, Ann. Stat., 1, 799-821.

Huber, P.J. (1981), Robust Statistics, John Wiley & Sons, New York.

Martin, R.D., Yohai, V.J. and Zamar, R.H. (1989), Min-max bias robust regression, Ann. Math. Stat., 4, 1608-1630.

Martin, R.D. and Zamar, R.H. (1989), Asymptotically min-max bias robust M-estimates of scale for positive random variables, J. Am. Stat. Assoc., 406, 494-501.

Rousseeuw, P.J. and Yohai, V.J. (1984), Robust regression by means of S-estimators, in Robust and Nonlinear Time Series Analysis, edited by J. Franke, W. Härdle and R.D. Martin, Lecture Notes in Statistics No. 26, Springer Verlag, New York, pp. 256-272.

Yohai, V.J. (1987), High breakdown point and high efficiency robust estimates for regression, Ann. Math. Stat., 15, 642-656.

Yohai, V.J. and Zamar, R.H. (1988), High breakdown point and high efficiency robust estimates for regression, Ann. Math. Stat., 83, 406-413.

Yohai, V.J. and Zamar, R.H. (1991), unpublished manuscript.
A new measure of quantitative robustness Mazzi, Sonia V. T 1992
Title | A new measure of quantitative robustness |
Creator |
Mazzi, Sonia V. T |
Date Issued | 1992 |
Description | The Gross-Error Sensitivity (GES) and the Breakdown Point (BP) are two measures of quantitative robustness which have played a key role in the development of the theory of robustness. Both can be derived from the maximum bias function B(ε) and constitute a two-number summary of this function. The GES is the derivative of B(ε) at the origin whereas the BP determines the asymptote of the curve (ε, B(ε)). Since GES·ε ≈ B(ε) for ε near zero, the GES summarizes the behavior of B(ε) near the origin. On the other hand, the BP does not provide an approximation for B(ε) for ε large and, consequently, estimates with strikingly different bias performance when ε is large may have the same BP. A new robustness quantifier, the breakdown rate (BR), that summarizes the behavior of B(ε) for ε near BP will be introduced. The BR for several families of robust estimates of regression will be presented and the increased usefulness of the three-number summary (GES, BP, BR) for comparing robust estimates will be illustrated by several examples. |
Extent | 1482597 bytes |
Genre |
Thesis/Dissertation |
Type |
Text |
FileFormat | application/pdf |
Language | eng |
Date Available | 2009-01-06 |
Provider | Vancouver : University of British Columbia Library |
Rights | For non-commercial purposes only, such as research, private study and education. Additional conditions apply, see Terms of Use https://open.library.ubc.ca/terms_of_use. |
DOI | 10.14288/1.0086725 |
URI | http://hdl.handle.net/2429/3379 |
Degree |
Master of Science - MSc |
Program |
Statistics |
Affiliation |
Science, Faculty of Statistics, Department of |
Degree Grantor | University of British Columbia |
GraduationDate | 1992-05 |
Campus |
UBCV |
Scholarly Level | Graduate |
AggregatedSourceRepository | DSpace |