ESTIMATION WITH MULTIVARIATE EXTREME VALUE DISTRIBUTIONS, WITH APPLICATIONS TO ENVIRONMENTAL DATA

By YOUSHI CHEN
B.Sc., Beijing Institute of Technology, 1983
M.Sc., Beijing Institute of Technology, 1986

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in THE FACULTY OF GRADUATE STUDIES, DEPARTMENT OF STATISTICS

We accept this thesis as conforming to the required standard

THE UNIVERSITY OF BRITISH COLUMBIA
September 1991
© Youshi Chen, 1991

In presenting this thesis in partial fulfilment of the requirements for an advanced degree at the University of British Columbia, I agree that the Library shall make it freely available for reference and study. I further agree that permission for extensive copying of this thesis for scholarly purposes may be granted by the head of my department or by his or her representatives. It is understood that copying or publication of this thesis for financial gain shall not be allowed without my written permission.

Abstract

Several parametric families of multivariate extreme value distributions (Hüsler and Reiss 1989; Tawn 1990; Joe 1990a, 1990b) have been proposed recently. Applications to multivariate extreme value data sets are needed to assess how adequately the known families fit data. The families are compared in their range of multivariate dependence and their ease of use for maximum likelihood estimation. Some useful conclusions are drawn from experience with several environmental data sets.

TABLE OF CONTENTS

Abstract
Table of Contents
List of Tables
Acknowledgement
Chapter 1. Introduction
Chapter 2. Multivariate Extreme Value Distributions
  2.1 Introduction
  2.2 Multidimensional Distributions
  2.3 Copulas
  2.4 Univariate Extremes and the Generalized Extreme Value Distribution
  2.5 Multivariate Extreme Value Distributions
  2.6 Some Relations Between Extreme Value Theory and Central Limit Theory
  2.7 Parametric Families of Multivariate Extreme Value Distributions
  2.8 Estimation for Multivariate Extreme Value Distributions
Chapter 3. Maximum Likelihood Estimation with the Hüsler-Reiss Model
  3.1 Bivariate Case
  3.2 Trivariate Case
  3.3 Four-Dimensional Case
  3.4 Computer Implementation of the Hüsler-Reiss Model
  3.5 Data Analysis
Chapter 4. Analysis with Missing Data
Chapter 5. Conclusion and Further Research
Bibliography

List of Tables

Table 1. Bay Area - Ozone (n = 120)
Table 2. GVRD - Sulphur Dioxide (n = 70)
Table 3. GVRD - Nitrogen Dioxide (n = 84)
Table 4. GVRD - Ozone (n = 84)
Table 5. GVRD - Station 5 (Confederation Park) (n = 64)
Table 6. GVRD - Station 7 (Anmore) (n = 76)
Table 7. GVRD - Station 9 (Rocky Point Park) (n = 63)
Table 8. GVRD - Sulphur Dioxide (n = 104)
Table 9. GVRD - Nitrogen Dioxide (n = 84)
Table 10. GVRD - Ozone (n = 84)

ACKNOWLEDGEMENT

I would like to express my deep gratitude to my supervisor, Professor Harry Joe, for his constant encouragement, guidance and assistance in my study and in the production of this thesis. Very special thanks are due to Professor Mohan Delampady for reading the manuscript and making constructive comments.
Finally, I would like to thank the Department of Statistics, The University of British Columbia, for supporting me with financial aid and encouragement.

Chapter 1
Introduction

Extreme value theory has a great number of applications. We list two cases where the largest or the smallest "measurement" is of interest.

1. Air pollution. Air pollution concentration is expressed in terms of the proportion of a specific pollutant in the air. Concentrations are recorded at equal time intervals, and it is required by law to keep the largest annual concentrations below given limits.

2. Natural disasters. Floods, heavy rains, extreme temperatures, extreme atmospheric pressures, winds and other phenomena can cause extensive human and material loss. Communities can take preventive action to minimize their effects even if such disasters cannot be completely avoided. In dams, dikes, canals, and other structures, the choice of building materials and methods of architecture can take some of these potential disasters into account. Engineering decisions that confront such problems should be based on a very accurate theory, because inaccuracies can be very expensive. For example, a dam built at huge expense may not last long before collapsing.

In example 1, at each time point there may be a vector X of measurements, consisting of concentrations of a pollutant at several air quality monitoring stations in a network, or of concentrations of several pollutants at a given monitoring station. In example 2, there may be a vector X of measurements at several positions in a spatial grid. In both examples, we have over time a number n of random measurements X_1, ..., X_n, and the behaviour of either Z_n = max(X_1, ..., X_n) or W_n = min(X_1, ..., X_n) can be of interest. Here max and min refer to componentwise maxima and minima; see Chapter 2 for details.

Statistical inferences about the upper or lower tail of the multivariate distribution of X can be made by using extreme value theory to model the multivariate distribution of Z_n or W_n. There has been much recent research in multivariate extreme value theory; see Galambos (1987, Chapter 5) and Resnick (1987, Chapter 5) for some general theory. The set of possible multivariate extreme value distributions has been shown to be infinite-dimensional. Smith, Tawn and Yuen (1990) study a density estimation based method for nonparametrically fitting a multivariate extreme value distribution. Except in the bivariate case, the method was not too promising. An alternative is to find a finite-dimensional parametric subfamily of the set of multivariate extreme value distributions. A good subfamily would be "dense" in the complete family. Parametric families have recently been proposed by Hüsler and Reiss (1989), Tawn (1990), Joe (1990a, b) and Coles and Tawn (1991). Experience is now needed to assess the adequacy of fit of these families to multivariate extreme value data sets. Joe (1990b) did some work on this but did not include the Hüsler and Reiss family in his comparisons.

One goal of this thesis is to do maximum likelihood estimation with the Hüsler and Reiss family. This turns out to be very difficult, and details are given in Chapter 3, where comparisons are also made with results in Joe (1990b). Another goal of the thesis is to deal with missing values in computing maxima. Joe (1990b) deleted cases where there were so many missing values that maxima could not be properly computed. These maxima were assumed to be missing at random.
Here, in Chapter 4, we treat these cases as right-censored maxima, that is the actual maxima is known to be above the maxima of all non-missing observations. The thesis proceeds next (in Chapter 2) with definitions and properties of general multivariate distributions and of multivariate extreme value distribution. The last chap-ter (Chapter 5) of this thesis makes some final conclusions and points out a direction for further research. 2 Chapter 2. Multivariate Extreme Value Distribution 2.1 Introduction In this chapter we describe some general properties of multidimensional distributions and copulas. Then we introduce some univariate extreme value theory and the general-ized extreme value distribution. Following this, we concentrate on multivariate extreme value distributions, the main topic of this chapter. 2.2 Multidimensional Distributions Let us define the p-dimensional random variable X as the vector X = (Xi,... ,XP) The distribution function -F(x) = F(xi,..., xp) is defined as F(x) = P ( X < x) = P(XX Zp). f = >0, dx\ • • • dx. v and / is the probability density function. P6 (Rectangles): p P ( a 1 < X l < b 1 , . . . , a p < X p < b p ) = F(b1,b2,...,bp)-J2Fi + i=l ^ F i j + ... + (-lYF(a1,...,ap)>0, where for a subset I of { 1 , . . . F j is the value of F ( s i , . . . , Sk) with s; = a; if i 6 / , and Si = b{ if i £ I. P7 (the Frechet Bounds): p max(0, J2 Fi(xi) ~ P + !) < F(x^ • • • xp) < min(Fi(a; i ) , . . . , Fp(xp)). If Fi,...,Fp are all continuous, then the upper bound corresponds to distribution of (Xi, F 2 ~ 1 F i ( X i ) , . . . , F ^ F ^ X i ) ) . In general the lower bound is not a multivariate dis-tribution function for p > 3. For example, suppose F is a three-dimensional distri-bution function with marginal functions satisfying Fi(xi) = 1/2, ^ 2 ( ^ 2 ) = 1/2 and F3(x3) = 1/2. Then F(+oo, + 0 0 , + 0 0 ) - Fi(ari) - F2(x2) - F2(x3) + max(0, F ^ ) + F2{x2) - 1) + max(0, F i (x i ) + F3{x3) - 1) + max(0, F2{x2) + F3(x3) - 1) - max(0, F f a ) + F2(x2) + F3(x3) - 2) = 1 - 3 / 2 = - 1 / 2 <0, which contradicts Property 6. P8 (Survival function): Let Bj be the event (Xj > Xj) and j(k) = (ji,... ,jk) be a subset of { 1 , . . . ,p}, then let G j ( k ) ( x h , . . . , X j k ) = P ( B h n • • • n Bjk), 5 0 (x) = 1, and Sk(x) = G h ( x h > • • • > x3k)> 1 < k < P -i 2, then p F(x1,...,xp) = J2(-l)ksk(xi,...,xp). k=0 In addition, for any integer 0 < q < (p — l ) /2 , E ( - i ) f c ^ W < f(x) < E ( - i ) f c ^ ) . fc=0 k-0 The proof of most of the above results is straightforward; for the proof of properties P7 and P8, readers are referred to Galambos (1987). 5 2.3 Copulas Assume that F(x) is a p-dimensional distribution function with univariate marginals Fi(x{), for 1 < i< p. Let C(y) be a p-dimensional function over the unit cube 0 < yi < 1, 1 < ^ < P, and such that it increases in each of its variables and F(x1,...,xp) = C[F1(x1),...,Fp(xp)]. Then the function C(y) is called a copula of JP(X). If C is a copula, then G(Vu•••,yP) = C[G1(y1),...,Gp(yp)] is a multivariate distribution with univariate margins G,-, for i — 1, ...,p, where Gi is an arbitrary univariate distribution function. When needed, we write CF = CF^Y) = C(y) for ^(x). Note that some people call C(y) a dependence function. If F(x\,..., xp) is a continuous p-variate cumulative distribution function with uni-variate margins Fj(xj),j = 1,... ,p, then the associated copula is C(UL,..., tip) = F(Fr1(u1),..., Fp-\up)), 0 < m < 1, i = 1,... ,p. This is a multivariate distribution with uniform (0,1) margins. 
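As a small numerical illustration of this construction (an illustration only, not part of the thesis; numpy is assumed), the following Python sketch forms the copula of a bivariate logistic extreme value distribution with standard Gumbel margins and checks that C(F_1(x_1), F_2(x_2)) recovers F(x_1, x_2). The closed form that results is the copula associated with the logistic family (2-7-1) of Section 2.7.

```python
import numpy as np

# Bivariate logistic extreme value distribution with standard Gumbel margins,
# F(x, y) = exp{-[exp(-delta*x) + exp(-delta*y)]^(1/delta)}, delta >= 1.
# Its copula is obtained by plugging the marginal quantile functions into F.

def F(x, y, delta):
    return np.exp(-(np.exp(-delta * x) + np.exp(-delta * y)) ** (1.0 / delta))

def F1(x):                      # common Gumbel margin F_i(x) = exp(-exp(-x))
    return np.exp(-np.exp(-x))

def F1_inv(u):                  # Gumbel quantile function
    return -np.log(-np.log(u))

def copula(u1, u2, delta):      # C(u1, u2) = F(F1^{-1}(u1), F2^{-1}(u2))
    return F(F1_inv(u1), F1_inv(u2), delta)

delta = 2.0
x, y = 0.3, -0.5
u1, u2 = F1(x), F1(y)

# The copula evaluated at the marginal probabilities recovers the joint cdf:
print(copula(u1, u2, delta), F(x, y, delta))

# Closed form: C(u1, u2) = exp{-[(-log u1)^d + (-log u2)^d]^(1/d)}, the copula
# of the bivariate logistic family (2-7-1) of Section 2.7.
print(np.exp(-((-np.log(u1)) ** delta + (-np.log(u2)) ** delta) ** (1 / delta)))
```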
Returning to extreme values, the ith. marginal of Fn(x) is F^Xi), and thus Fn(Xl,..., xp) = CFn[F?(Xl,..., F?(xp)]. On the other hand, we have Fn(x1,...,Xp) = CF[F(x1),...,F(xp)}. A comparison of these last two equations leads to CF4y) = c F [ y l / n , . . . , y l / n } . 6 2.4 Univariate Extremes and the Generalized Extreme Value Distribution Let Xi,...,Xn denote independent and identically distributed (iid) random vari-ables, with cumulative distribution function F. Let Zn = m a x ( X i , . . . , X n ) , Wn = m i n ( X a , . . . , X n ) . By the iid assumption, we have Hn(x) = P(Zn < x ) = Fn{x) and Ln(x) = P ( W n < x) = 1 - (1 - F(x))n. We seek conditions on F(x) to guarantee the existence of sequences {a n }, {&„}, {c n }, { d n } of constants such that, as n —> oo l im H J a n + bnx) = H(x) n—*oo and l im Ln(cn + dnx) = L(x) exist for all continuity points of H ( x ) and L(x) respectively, where H ( x ) and L(x) are nondegenerate distribution functions. Equivalently, {an}, {&„}, {cn} and { d n } are such that (Zn — an)/bn and (Wn — cn)/dn converge in distribution as n —> oo. A sufficient condition (see Galambos, 1987) is given next. Assume that there are sequences an, cn, b n > 0, and dn > 0, of real numbers such that, for all x and y, lim n[l — F(an + bnx)] = u(x), n—>-+oo l im nF(cn + dny) = w(y) n—*-+oo exist. Then lim P ( Z n < a n + bnx) = exp[—u(x)] and J i m P ( W n < cn + dny) - 1 - exp[-w(y)]. Example 1. ( The Exponential Distribution). Let Xi,... ,Xn be iid random variables with common distribution function F(x) = 1 - e~x, x > 0. Then 1 - F{an + bnx) = e - a n e - h n X . In order to satisfy lim n[l — F(an + bnx)\ — u(x), n—•+oo we can choose an = logn. and bn = 1. Hence, u(z) = e~z, and thus, l im P ( Z n < log re + z) = exp(—e~z). On the other hand, we can choose cn = 0 and dn = 1/re, such that lim nF(cn -f- c?ny) = lim nFiyln) = l im n ( l — e~y^n) = y. n—>+oo n—t-+oo n—t-+oo Hence, lim P ( W n < c n + d n y) = lim P { W n < y/n) = 1 - e~y. n-+oo . n—*+oo -Example 2. Let X i , . . . , X n be iid random variables with common distribution function F(x) = \ - x ~ 1 , x > l . To determine the limiting distribution of Zn, we note that with an = 0 and bn = re, we obtain lim nil - F(an + bny)\ = l im re — = -, y > 0. and thus lim P ( Z n < ny) = e x p ( - - ) . n-++oo y For maxima, it is well-known that there are only three types of nondegenerate distri-butions H(x) that can be univariate limiting extreme value distributions. These results are contained in books devoted to extreme value theory, such as Galambos (1987). The three types are: exp(—x~) x > 0 , 7 > 0 , 0 otherwise, Hlf1(x) = < H2n(x) = < exp(—(—x) i ) x < 0 , 7 < 0, 1 x > 0 , H3,o(x) - exp(—e x) — oo < x < + 0 0 , 7 = 0. After adding location and scale parameters the above three types of distributions can be summarized together as the generalized extreme value distribution (GEV): , H(x) = exp{-[max(l + 7 ^ ~ ^ , 0 ) ] ^ } - 00 < 7 < + 0 0 . ( 2 - 4 - 1 ) The parameter 7 can be interpreted as a tail index parameter of F. It measures the thickness of the tail of F; 7 is larger if F has a thicker tail, and vice versa. The tail thickness is related to the rate of convergence of 1 — F to 0 as x —> 00. For example, let 1 - F(x) =. [max(l + 7 s , 0 ) ] _ 1 / 7 . For 72 < 71 < 0 then — 1/71 > — 1 / 7 2 > 0 and [ m a x ( l + 7 1 x , 0 ) ] - 1 ^ lim ———;— = OO. i/T4 [max(l + 72X, O)]- 1 ^ 2 For 72 < 0 , 71 = 0 , then l i m 7 T, „x, ,,— = 00. 
^ - 1 / 7 2 [max(l + 7 2 x , 0 ) ] - 1 / i ' 2 9 For 72 = 0, 71 > 0, then r [max( l - f 7 l : r ,0 ) ]~ 1 / 7 1 r ex ex l im - hm . > l im r—,— = oo. x^oo e-x x^oo [max(l + 7ix ,0)] 1 / ' 7 1 (x7i) 1 /' y i For 71 > 72 > 0, then 0 > -I/71 > - I / 7 2 , I/72 — I/71 > 0 and ,. [max(l + 7 i ^ , 0 ) ] " 1 / 7 1 r (1 + 7 i * ) ~ 1 / 7 1 h m 7 T\ ,,— = lim — —-;— [max(l + 72Z, O)]- 1 /^ (1 + 72z)-1/'y2 > l im i1 + l x X ) X ' T = l im (1 + 7 l ; r ) 1 / 7 2 ~ 1 / 7 1 = 00. a;-+oo M +7lX) l^1 X—KX>K ' So, the tail is heavier as 7 increases. For 7 > 0, from Theorem 2.1.1 of Galambos (1987), we find an = 0 and bn = 7 - 1 [ ( l / r a ) " 7 - 1] so that F(bnx) = 1 - F(bnx) = [1 + (n 7 - l)x}-ih ~ n - x x - l h . and F n { b n x ) = (1 - F(bnx))n = (1 - n " ^ - 1 ^ ) " ™ exp(-a ; - 1 / 7 ) . For 7 < 0, from Theorem 2.1.2 of Galambos (1987), we find an = — 1 / 7 , bn = — 7 _ 1 n 7 so that F(x) = (1 + 7 * ) " 1 / 7 , + M = ( - n ^ x ) - 1 ^ = n^i-x)-1^ and F n ( a n + 6nx) = [1 - n - ^ - x ) - 1 ^ - expi-i-x)-1^), x < 0. For 7 = 0, from Theorem 2.1.3 of Galambos (1987), we find an — logn, bn = 1 so that F(x) = e~x F(an + bnx) = n ^ e - * and F n ( a n + bnx) = [1 - n - V * ) " = exp(-e- x ) . 10 The G E V distribution is useful for statistical inference when the distribution is un-known, and the tail index must be estimated. Since m i n ( X i , . . . , X n ) - - m a x ( - A V , . . . , - X n ) , then, similarly for minima there are also three types of nondegenerate distributions as the univariate limiting extreme value distribution which can be summarized together and can be expressed in terms of (2-4-1) L(x) = l - H ( - x ) = 1 - exp{-[max(l + ~ ^ , 0 ) ] - 1 / 7 }. Wl,...,Wnp>wp) = Gn(w), where G(w) = P(Xi > wu ..., Xp > wp). 11 Any problem on W is equivalent to one on Z by changing the basic vector X to (—X), therefore we concentrate on the vector Z of maxima. Next we list and discuss various properties which F(x.) or H(x.) may possess. P r o p e r t y 1. If F n ( x ) is a sequence of p-dimensional distributions, let the «th uni-variate marginal of F„(x) be Fjf>(xi). If Fn(x) converges to a nondegenerate continuous distribution function -F(x), then for each i with 1 < i < p, F ^ ( X J ) converges to the ith marginal F^(xi) of F(x). Later we shall see that all limiting distribution functions of multivariate extremes are continuous. Hence, Property 1 tells us that we can appeal to univariate case for determining the components of a n and b n whenever (2-5-1) holds. The following several properties are important in multivariate extreme value theory. P r o p e r t y 2. Let X i , . . . , X n be iid p-dimensional vectors with common distribution function F(x.). Then there are vectors a n and b n > 0 such that ( Z n — a n ) / b n converges to nondegenerate distribution function H(x.), if and only if, each marginal belongs to the G E V family, and if CnF[y\'\...,ylJn] -> CW(vi,• • • \yP), n^oo. Property 2 tells us that given a p-variate distribution function .F n (x) , we can check if its marginals belong to the G E V family, if so, then we use the methods of the uni-variate case to determine the components of the vectors a n and b n ; furthermore we can determine CV(y) by its definition and check if C ^ y 1 / " ) converges. Example 3. Let (X, Y ) have a bivariate exponential distribution F ( x , y ) = l-e-x-e-y + G(x,y), ( 2 - 5 - 2 ) where G(x,y) = P ( X > x , Y > y ) . 
12 If ( Z n - a n ) / b n converges to a distribution H(zi,z2), then we can choose a n (logn,logn), and b n = (1,1), and obtain e~Zl + e~Z2 F( logn + 2i, logn + z 2 ) = 1 h G(logn + zi,lo'gn + z2). n For the Marshall-Olkin distribution G(x,y) = exp[-rc — y — Xmax(x,y)], A > 0. The last term of (2-5-2) is exp{ — (2 + A)logn — zx — z2 — A max(zi, 22)} = 2 + A ex'p{—z\ — z2 — \ ma,x(z1,z2 1 ri1 = 0(n-"-A), and / e -2 i _(. e-^2 \ 71 l im F n ( l o g n + zi,logn + 22) = l im 1 n—>oo n—>oo V TI I = exp(—e-*1 — e - 2 2 ) . \ For another example, let G{x,y) = (e6x + e6y - l ) _ 1 / e , 6> > 0. Then G(log n + log n + z2) = ( n V 2 1 + n V 2 2 - l)~1/e „ I( e^i )"!/«, n and lim F n ( l o g n + 2 l , l o g n + * 2) = l i m ( l - ^ — ^ K- ^—L )» = {exp[-(e 2 1 + e 2 2)]}{exp[( e^ + e*22)"1/*]}. 13 In the second example, there is dependence in the limiting extremes. Property 3. A p-variate continuous distribution function H ( x ) is a limit distribu-tion in (2-5-1) if, and only if, its univariate marginals belong to the G E V family and if its copula CH satisfies CkH{y\'\ ..., yllk) = CH(yi, • • •, yP), k = 1,2, • • •. ( 2 - 5 - 3 ) Property 3 tells us that if the limit H ( x ) exists, then we check condition (2-5-3). If it holds and if C'H is a copula, then we have got the actual limit distribution. Example 4. Let H\(x),..., H p ( x ) belong to the G E V family, then H(x) = Hx{x)..-Hp{x) is a possible limit of (2-5-1). This limit consists of independent univariate marginals. By assumption, the condition of Property 2 on the marginals is satisfied. Furthermore, by definition, CW(y) = y\ • • • yP for which the condition of (2-5-3) is evident. A n appeal to Property 3 yields the claim. Example 5. The distribution function H(x!, ...,xp) = exp{exp[- m i n ( x i , . . . , xp)}} is a limit in (2-5-1). The marginal distributions are Hi(x{) = exp(—e~Xi) = ^ ^ ( x , ) . Therefore it remains to check the validity of the condition (2-5-3): exp{— exp[— m i n ( x i , . . . , xp)]} = min{exp[— exp(—x,-)] 1 < i < p}, and CH(yu • • •, yP) = min( j / i , . . . , yp). 14 Hence for k > 1, CkH(yl,k, yllk) = [mm(yl/k, vl/k)]k = m i n ( y i , . . . , yp). Example 6. The distribution H ( x u x 2 ) = #3,o(si)#3,o(>2)[l + ^(1 - #3,of>i))(l - #3,o(z2))] does not occur as a limit in (2-5-1). (See Galambos(1987).) Even though the marginals H\(xi) = H 3 f i ( x 1 ) and i/2^2) = -#3,0(22) the copula CH{yi, 2/2) = ym[l + ^ ( l - y i ) ( l - 2/2)] fails to satisfy (2-5-3). Let X = (Xi,... i X p ) be a vector with distribution function F (x ) . Let j(k) = (ji, • • • ,jk), 1 < k < p be a vector with components 1 < j i < J2 • • • < i t < P- The distribution function Fj(k)(xh, • • • ^ j*) is a A;-dimensional marginal distribution, which is obtained from F(x) by letting X{ —> +oo for all % £ {ji, • • • ,jk}-Let Gj{k){XJIT • • i x j k ) = F ( - ^ j i > x j n • • • •>Xjk > Xjk). Assume that F ( x ) is such that each of its univariate marginals belongs to the G E V family. Then, there are a, n and 6,n > 0 such that lim F t n (a ,„ + binx) = Hi(x), 1 < i < p, ( 2 - 5 - 4 ) where Hi(x) belongs to the G E V family. We assume that a, n and 6 t n > 0 have been determined and put a n = ( a i „ , . . ., apn) and b n = (bln,.. ., bpn). P r o p e r t y 4. ( Z n — a n ) / b n converges to a nondegenerate distribution i / (x) if and only if for each fixed vector j(k) and x for which Hi(xi), 1 < i < p, of (2-5-4) are positive, the limit nHS> nGj(k){ajin ~i~ bjinXjin, . . . , Ojfcn "H bjkn.Xjkn) — hjk(Xj1 , . . . 
, £j t) (2 5 5) 15 are finite, and the function H(x) = e x p { £ ( - l ) f c £ hih(xh,...lXjk)} ( 2 - 5 - 6 ) fc=l 1 0 be an integer, then #(x;2s + l) < H ( x ) < H(x;2s), ( 2 - 5 - 7 ) where i7(x, 0) = 1 and #(x,r) = e x p { £ ( - l ) f c £ hjk(xh,... ,xJk)}. ( 2 - 5-8) When s = 0 in (2-5-7), the inequality H ( x , 1) < #(x) < 1 obtains. This means that iJ(x) is never exceeded by the product of its univariate marginals, and H ( x ) never exceeds 1. Example 7. . Let F(x, y) = 1 — e~x — e~y + G(x, y) where G{x,y) = (e6x + e e y -I)-1'0. From the univariate case we know that lim i*7(logra + x) = exv(-e~x), x > 0, lim F2"(log n + t/) = exp(—e_2/), y > 0. Since G\(x) = e - x and G 2 ( y ) = e _ y War) = lim nG(logra + x) = lim n e ~ ^ n + x ) = e~x. n—»oo n—*oo Similarly, h 2 ( y ) = e _ J /, and h12(x,y) = Jim nG(logn + x,logn + y) 16 = lim n(nee6x + n e e 9 y — l ) ~ 1 / , f l n—*oo n—+00 JI<> = (ee* + e0y)-1'e. Therefore, the limiting extreme value distribution is H(x,y) = exp{-h1(x)-h2(y) + h12(x,y)} = e x p { - e - x - e~y + (e8x + eBy)~1/e}. The following definition is needed for the next property. Definition: The p-dimensional unit simplex S is the set of vectors q with nonnegative components qi such that YA=I Qi = 1-Property 5 (The Pickands Representation for a min-stable exponential distribu-tion). G(xi,..., xp), with univariate exponential margins, is a survival function satisfy-ing — \ogG(txi,... ,txp) = —t\ogG(x1,... ,xp) for all t > 0 if and only if G has the representation -logG?(a;i,...,a;p) = / [max. (qixi)]dU(q1,... ,qp), x, > 0,i = 1,... ,p, ( 2 - 5 - 9 ) Jsp 1<»

0, i = 1,... ,p, Yli qi — 1} is the p-dimensional unit simplex and U is a finite measure on Sp. Property 5 is a result of Pickands and an alternative statement of Theorem 5.4.5 of Galambos (1987). It is not an easy task to give the representation (2-5-9) for a given H(x), but one usually does not aim at giving another form of H(x) when it is already known. The value of (2-5-9) lies in its possibility of generating functions which are limits in (2-5-1). For applications of Property 5, see Joe (1990a). A n example from there is: exp{-A(Zl, . . . , z p ; \ B , B c ( l , . . . , p ) , 6 ) } = exp{ - £ A s ( £ x6t)l's},6 > 1. ( 2 - 5 - 1 0 ) 17 For each integer m > 2, it can be directly shown that x m )=exp{-(^ + --- + ^)1/5} ( 2 - 5 - 1 1 ) is a survival function over Xj > 0,j = 1,... ,m,for 1 < 8 < co. (2-5-11) satisfies the homogeneity of degree one condition of Property 5. Hence by Property 5 , if m < p, there is a measure Vm on Sm and a measure Um on Sp such that where Um is a representation of Vm in p dimensions which puts all mass on qi,... ,q, Therefore, where dU has the form J2B \BdUs- By Property 5, (2-5-10) is a distribution. 2.6 Some relation between extreme value theory and Central Limit theory -Since Central Limit theory is more familiar to statisticians, in this section we show some similarities and differences between Central Limit theory and Extreme Value the-ory. Consider the univariate case first. Suppose that X\,... ,Xn are iid with distribution £ M £ ^ ) 1 / f i B ^ 0 ieB has the representation F. Let Zn = max{Xi,. . . ,X n}, Wn = mm{Xi,.. .,Xn}. 18 We would like to know whether there are any sequences { a n } and {bn} such that Sn — a n Zn- a n Wn — a n — A ' A ' . ° r 1 converge in distribution. From Central Limit theory, if F has a finite second moment, then there exist { a n } and {bn} such that the limiting distribution (Sn — an)/bn has a normal distribution. Definition. Let H ( x ) be a nondegenerate distribution function, then F(x) is in the domain of attraction of H ( x ) if there are sequences { a n } and {bn} > 0 such that l im F n ( a n + bnx) = H(x). From extreme value theory, we know that the limiting distribution H ( x ) of ( Z n — o-n)/bn belongs to the G E V family if the distribution function F(x) is in the domain of attraction of H(x). For the multivariate case, assume that X i , . . . , X n are iid with distribution F. Let X; = (Xn,..., Xip), S n = X i + • • • + X n , Z n = (max(Xn , . . .,Xnl),... ,max(A"i p , . . .,Xnp)). Are there any {an} and {bn} > 0 such that converge in distribution? If the limit of (S n — a n ) /b n and the second order moments of F exist, then from Central Limit theory the limit of (S n — a n ) /b n has a multinormal distribution. 19 If the limit of (Z n — a n ) /b n exists, from extreme value theory, each univariate margin must be in the G E V family. An analogy between multinormal and multivariate extreme value distribution is: (1) Suppose (Z i , . . . , Zp) has a multinormal distribution, then linear combinations of univariate normal, i.e. a\Z\ + • • • + apZp has a univariate normal distribution. (2) Suppose Xj ~ H(-,~fj, fj,j,o-j). After transformation to an exponential survival function Zj = - ^og{H(Xf^hnhaj)) have an exponential distribution. If (Z\,..., Zp) has a min-stable multivariate exponen-tial distribution, then from Property 5, G(zi,...,zp) = exp{-A(z1,...,zp)} ( 2 - 6 - 1 ) and V =. m i n ( ^ - , . . . , ^ f-) has an exponential distribution e-vA(wi>->wp) ^ where Wj > 0. 
That is, weighted minima are univariate exponential. Note also the main difference: The multinormal distributions form a finite-dimensional parametric family, the multivariate extreme value distributions form an infinite-dimensional family (as given by Pickand's representation). Since parametric inference is easier than nonparametric inference, a "goal" is to find good parametric (finite dimensional) subfamilies of the infinite-dimensional family (see the next section). 2.7 Parametric families of multivariate extreme value distributions. A multivariate extreme value distribution with margins transformed to a survival function with exponential survival functions as univariate margin is a min-stable expo-nential distribution. Let G be a p-dimensional min-stable exponential distribution, from extreme value theory A = — log(G) satisfies A(tzi,...,tzp) = tA(zi,...,zp), Vt > 0. 20 Let GS be the marginal survival function of G, then G s ( z a ) = e x p { — A s(z s)} where A S is obtained from A by setting Zj — 0 for j $ s and s is a subset of {1,... ,p}. Here we will list some parametric families of multivariate min-stable exponential distributions in terms of A . These families can be found in Joe (1990a,b), Tawn (1989), Coles and Tawn (1991), Husler and Reiss (1989). The interpretations of the parameters are given after all the families are listed. The families are given here for p = 2,3,4, from which the general multidimensional form can be seen. A(zuz2;S) = (zs1 + z62)1's,- 8>1, ( 2 - 7 - 1 ) A(ZL, z2; r) = zx + z2 - (z7r + z 2 r ) ~ x l \ r > 0, ( 2 - 7 - 2 ) A ( z u z2, z3; 8, 83) = ((z* + zs2)s*'s + zi*)lls\ 6 > 8 3 > 1 , ( 2 - 7 - 3 ) A(ZI,Z2,Z3;T,T3) = z1 + z2 + z 3 - (z~T + z 2 T ) - 1 / T - (z7T3 + z ^ ) - 1 ^ - (z2^ + z ^ ) ' 1 ^ + ( ( ^ T + ^ T ) T 3 / T + ^ 3 T 3 ) " 1 / T 3 , r > r 3 > 0 , ( 2 - 7 - 4 ) A ( z u z2, z3, z4; 8,83,84) = ([(z5 + z62)s*l6 + zi*}s*'s> + zi*)1'8*, l < 8 4 < 8 3 < 8 , ( 2 - 7 - 5 ) A ( z x , z 2 , z 3 , z 4 ; 8 , 8 2 , 8 4 ) = {{z[ + zs2)Si'& + (zs3> + z ^ 6 * ) 1 ' 6 * , 1 < 62,84 < 8, ( 2 - 7 - 6 ) A(zi,z2,z3, z4; T , r 3 , T 4 ) 21 = z1 + z2 + z3 + z4- (z7r + z 2 T ) ~ 1 / T - (z~T3 + z ^ T 3 ) - 1 / T 3 - ( z 2 T 3 + z 7 T 3 ) ~ 1 / T 3 -(z7Ti + z 7 T i ) - 1 / T i - {z2T* + 2 4 " T 4 ) " 1 / T 4 ~ + ^ " T 4 ) _ 1 / T 4 + (c*rT + ^ T 3 / T+^- T 3)" 1 / T 3 + (c*r+*-2Tr' r+*r 4)~ 1 / T 4 + {{Zr3+*3 T 3r / T 3+*r 4)~ 1 / T 4 + ((*r 3+zrr i / T 3+*r 4)~ 1 / T 4 -( ( ( * r + ^ T ) T 3 / T + z3"T 3)T 4 / T 3 + ^ " T 4 )" 1 / T 4 , r > r 2 > r 4 > 0. ( 2 - 7 - 7 ) A(ZI,Z2,Z3,Z4;T,T2,T4) = z1 + z2 + z3 + z4- (z7T + 2 2 -T 1 / T - (sp + z 3 - T T 1 / T 4 - ( ^ + ^ T 4 ) - 1 / T 4 (zp + z r 4 ) _ 1 / T 4 - (^2"T4+*rT1 /T4 - + * r r 1 / r i + ( c * r + * n T 4 / T + * 3 - T 4 ) " 1 / T 4 + ( ( * r + * 2 - T ) T 4 / T + * r 4 ) ~ 1 / T 4 + (*r4 + (*3-T2 + ^ r2)T4/T2)"1/T4 + (*r4 + (*3_T2 + *r2)T4/T2)~1/T4 -O r + * 2 " T 4 / T + (*3"T2 + 2 4 - T 2 ) T 4 / T 2 ) _ 1 / T 4 , 1 < r 2 , r 4 < r. ( 2 - 7 - 8 ) Finally, the Hiisler and Reiss(1989) model with p = 2 is A(zx, A) = *(A + l 0 g Z l - l 0 g Z 2 ) , 2 + *(A + 1 ° g Z 2 - l 0 g ^ ) , 1 , A > 0. (2 - 7 -2A 2A For p > 2, let Xij € (0,oo) VI < i , j < p with i ^ Put Moreover , for 2 < k < p and m = ( m 1 ? . . . , m^) with 1 < mi < m 2 < . . . < < define rfc,m — ^ ( A 2 ^ . ^ + x2m.mk — \2m.^m.) i,j ) m j f - 1 | r f c , m 6~^> JVk for 2 < k < p, where J2k nieans summation over all p-vectors m = ( m i , . . . , m^) with 1 < mi < m 2 < . . . < mk < p, and /i i , m (y) = e~y for m = 1,... ,p. 
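To make the bivariate members of these families concrete, here is a short Python sketch (illustrative only, not thesis code; numpy and scipy are assumed) of the exponent functions A(z_1, z_2) for (2-7-1), (2-7-2) and (2-7-9). The Hüsler-Reiss exponent is written in the form with margins A(z_1, 0) = z_1 and A(0, z_2) = z_2, which matches the partial derivatives used in Chapter 3.

```python
import numpy as np
from scipy.stats import norm

# Bivariate exponent functions A(z1, z2); exp(-A) is the min-stable
# exponential survival function.  A = z1 + z2 corresponds to independence
# and A = max(z1, z2) to complete dependence.

def A_logistic(z1, z2, delta):                 # family (2-7-1), delta >= 1
    return (z1**delta + z2**delta) ** (1.0 / delta)

def A_neg_logistic(z1, z2, r):                 # family (2-7-2), r > 0
    return z1 + z2 - (z1**(-r) + z2**(-r)) ** (-1.0 / r)

def A_husler_reiss(z1, z2, lam):               # family (2-7-9), lambda > 0
    a = lam + (np.log(z1) - np.log(z2)) / (2.0 * lam)
    b = lam + (np.log(z2) - np.log(z1)) / (2.0 * lam)
    return z1 * norm.cdf(a) + z2 * norm.cdf(b)

z1, z2 = 0.7, 1.4
for lam in (0.1, 1.0, 10.0):                   # dependence decreases in lambda
    print(lam, A_husler_reiss(z1, z2, lam))
print("bounds:", max(z1, z2), z1 + z2)
```

For each family, A(z_1, z_2) lies between max(z_1, z_2) (complete dependence) and z_1 + z_2 (independence); this is the ordering of dependence referred to in the comparison that follows.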
The cases p = 3, 4 are given explicitly in Chapter 3. For comparison, note that the bivariate models (2-7-1), (2-7-2) and (2-7-9) each have a single dependence parameter. For (2-7-1) and (2-7-2) dependence increases as the parameter increases; for (2-7-9) dependence decreases as the parameter increases. The trivariate models (2-7-3) and (2-7-4) have respectively models (2-7-1) and (2-7-2) for each of the three bivariate margins. For the trivariate model (2-7-3) (respectively (2-7-4)), δ (respectively τ) is the dependence parameter for the (1,2) bivariate margin; δ_3 (respectively τ_3) is the dependence parameter for both the (1,3) and (2,3) bivariate margins. We have something similar for the 4-variate models (2-7-5) to (2-7-8). The 4-variate models (2-7-5) to (2-7-8) also have respectively models (2-7-1) and (2-7-2) for each of the six bivariate margins. For (2-7-5), δ is the dependence parameter for the (1,2) bivariate margin; δ_3 is the dependence parameter for the (1,3) and (2,3) bivariate margins; δ_4 is the dependence parameter for the (1,4), (2,4) and (3,4) margins. For (2-7-6), δ is the dependence parameter for the (1,2) bivariate margin; δ_2 is the dependence parameter for the (3,4) bivariate margin; δ_4 is the dependence parameter for the (1,3), (1,4), (2,3) and (2,4) margins. The model (2-7-11) has a dependence parameter for each of the p(p−1)/2 bivariate margins. For (2-7-11), the dependence parameter of the (i, j) bivariate margin, with i < j, is λ_ij, and that margin has the form (2-7-9). Models (2-7-3) to (2-7-8) and their extensions have p − 1 parameters in total; as noted above, some of the bivariate margins share the same dependence parameter. Hence these models do not have as much flexibility in the dependence pattern, but they have a simpler form than (2-7-11).

2.8 Estimation for a parametric family of multivariate extreme value distributions.

Joe (1990b) suggests the following procedure for fitting a parametric multivariate distribution to iid X_1, ..., X_n.

(1) Fit the p univariate margins separately by maximum likelihood using the GEV family.

(2) Transform each margin so that each transformed variable has exponential survival function G_i(z_i) = e^{−z_i}.

(3) Treating the transformed data as iid p-vectors, compare the fit of different families of min-stable multivariate exponential distributions. Check for consistency of the parameter estimates across bivariate and higher order margins.

(4) Go back to the original data and estimate both the multivariate parameters and the parameters of the univariate margins simultaneously, using maximum likelihood, once a copula has been decided on as being acceptable.

In (3), to check the fit of a model, we compare parametric estimates of A with a nonparametric estimate of A. Besides the consistency checking in (3), we know that if (Z_1, ..., Z_p) is a random vector with min-stable exponential survival function G = e^{−A}, then

  E( min[Z_1/w_1, ..., Z_p/w_p] ) = [A(w_1, ..., w_p)]^{−1}

for w_j > 0, j = 1, ..., p. So if Y_1, ..., Y_n are a transformed random sample, exponential probability plots of V_i = min_j Y_ij/w_j can be used to check the assumption of a min-stable multivariate exponential distribution. If the plots are adequate, a nonparametric estimate of A(w_1, ..., w_p) is given by n / Σ_{i=1}^n V_i, which can be compared with parametric estimates of A from different models.
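Step (3) of this procedure can be coded directly. The following sketch (illustrative only; numpy is assumed and a random placeholder stands in for the transformed maxima) computes the nonparametric estimate n / Σ V_i of A(w_1, ..., w_p) and a simple numerical version of the exponential probability plot check.

```python
import numpy as np

def A_nonparametric(Y, w):
    """Nonparametric estimate of A(w1, ..., wp) from an (n, p) array Y of
    maxima transformed to exponential margins as in step (2):
    V_i = min_j Y_ij / w_j is exponential with rate A(w), so A(w) is
    estimated by n / sum(V_i)."""
    V = np.min(Y / np.asarray(w), axis=1)
    return len(V) / V.sum(), V

# Placeholder data standing in for the real transformed weekly maxima.
rng = np.random.default_rng(0)
Y = rng.exponential(size=(120, 3))

A_hat, V = A_nonparametric(Y, w=[1.0, 1.0, 1.0])
print("estimated A(1,1,1):", A_hat)

# Exponential probability plot check of min-stability: the sorted V_i plotted
# against exponential quantiles should be roughly linear (correlation near 1).
q = -np.log(1.0 - (np.arange(1, len(V) + 1) - 0.5) / len(V))
print(np.corrcoef(np.sort(V), q)[0, 1])
```

The value returned can then be compared with A(w) evaluated at the parameter estimates of any of the families in Section 2.7.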
Chapter 3
Maximum Likelihood Estimation with the Hüsler-Reiss Model

Hüsler and Reiss (1989) obtained a new class of multivariate extreme value distributions by taking a non-standard extreme value limit. Let X and M_n be as defined in Chapter 2. For a multivariate normal random vector having all correlation coefficients smaller than one, Sibuya (1960) proved that the marginal maxima M_{n,1}, ..., M_{n,p} are asymptotically independent. The rate of convergence to asymptotic independence is slower as the correlation coefficients increase. Since in practice the sample size n never goes to infinity but is a fixed positive integer, is there another asymptotic formulation which may provide a better approximation? This motivated the work by Hüsler and Reiss.

For the bivariate case, Hüsler and Reiss let the correlation coefficient ρ = ρ_n increase towards 1 as the sample size increases. It is shown that the marginal maxima are neither asymptotically independent nor completely dependent if (1 − ρ_n) log n converges to a positive constant as n → ∞. This was extended to dimensions p > 2.

Hüsler and Reiss define their limiting distribution by the c.d.f. or survival function. To carry out maximum likelihood estimation, the density is needed, and to obtain the density the following result for the conditional distribution of a multivariate normal distribution is needed.

Result 1: Let U = (U_1, ..., U_a) and V = (V_1, ..., V_b) be random vectors such that

  (U, V) ~ MVN( 0, Σ ),   Σ = [ Σ_11  Σ_12 ; Σ_21  Σ_22 ],

where Σ_11, Σ_12, Σ_21, Σ_22 are a×a, a×b, b×a, b×b matrices respectively, and suppose Σ_11 and Σ_22 are nonsingular. Then

  U | V = v ~ MVN( Σ_12 Σ_22^{-1} v,  Σ_11 − Σ_12 Σ_22^{-1} Σ_21 ).

Let F_X and F̄_X denote the distribution function and survival function of a random vector X, and let f_X and f_{X|Y} denote the density function of X and the conditional density function of X given Y. Then the joint density of X and Y is

  f_{X,Y}(x, y) = f_{X|Y}(x | y) f_Y(y).

Let φ(x; Σ) and Φ̄(x; Σ) denote the multivariate normal density and survival function with zero mean vector and covariance matrix Σ. Then

  f_{U,V}(u, v) = φ(u − Σ_12 Σ_22^{-1} v; Σ_11 − Σ_12 Σ_22^{-1} Σ_21) φ(v; Σ_22)

and

  F̄_{U,V}(u, v) = Φ̄((u, v); Σ)
    = ∫_{v_b}^∞ ⋯ ∫_{v_1}^∞ ∫_{u_a}^∞ ⋯ ∫_{u_1}^∞ φ(s − Σ_12 Σ_22^{-1} t; Σ_11 − Σ_12 Σ_22^{-1} Σ_21) φ(t; Σ_22) ds dt
    = ∫_{v_b}^∞ ⋯ ∫_{v_1}^∞ Φ̄(u − Σ_12 Σ_22^{-1} t; Σ_11 − Σ_12 Σ_22^{-1} Σ_21) φ(t; Σ_22) dt.

Therefore,

  ∂^b Φ̄((u, v); Σ) / (∂v_1 ⋯ ∂v_b) = (−1)^b Φ̄(u − Σ_12 Σ_22^{-1} v; Σ_11 − Σ_12 Σ_22^{-1} Σ_21) φ(v; Σ_22).

Now we are ready to derive the density for the Hüsler-Reiss survival function when the univariate margins have been transformed to exponential distributions with mean 1. We provide details for dimensions p = 2, 3, 4, from which the general case will be apparent. The notation gets increasingly difficult as p increases.

3.1 Bivariate Case

From Section 2.7,

  G(w_1, w_2, λ) = exp{−A_2(w_1, w_2, λ)},

where

  A_2(w_1, w_2, λ) = w_1 Φ(λ + (1/(2λ)) log(w_1/w_2)) + w_2 Φ(λ + (1/(2λ)) log(w_2/w_1)),  λ > 0,

and Φ, φ are the standard univariate normal c.d.f. and density. Since ∂G/∂w_1 = −(∂A_2/∂w_1) G and ∂G/∂w_2 = −(∂A_2/∂w_2) G, the density function of G(w_1, w_2, λ) is

  g(w_1, w_2, λ) = ∂²G/(∂w_1 ∂w_2) = [ (∂A_2/∂w_1)(∂A_2/∂w_2) − ∂²A_2/(∂w_1 ∂w_2) ] G.

From φ(x) = (2π)^{−1/2} e^{−x²/2} we have φ′(x) = −x φ(x). Also

  −log[ φ(λ + (1/(2λ)) log(w_2/w_1)) / w_1 ]
    = λ²/2 + (log w_2 − log w_1)/2 + (1/(8λ²)) (log w_2 − log w_1)² + log w_1 + log √(2π)
    = λ²/2 + (log w_1 − log w_2)/2 + (1/(8λ²)) (log w_1 − log w_2)² + log w_2 + log √(2π)
    = −log[ φ(λ + (1/(2λ)) log(w_1/w_2)) / w_2 ].
Hence 2 M ) + ^ ^ ( A + ^ - l o g ( ^ i M ) ) ( ^ i / ^ 2 ) +$(A + ^\og(Wl/w2)) = ^(X + ^-\og(Wl/w2)). 2A Similarly, 3\A. 1 Finally, 28 d 1 o ^ 2 m X + 2 X l o g { w i / w 2 ) ) ^ ( A + ^ l o g K M ) ) . From Ait2, A2,i and A\2 we can get the density function g(w1,w2, A). 3.2 Tr ivar ia te Case Let f 9\2 \2 _i_ \2 13 1^3 ' ^23 "12 E = 2 2 \ \2 I \2 \2 y A13 "t" ^23 — A12 2A^3 / be the matrix in (2-7-10) and let P12 = P21 -P13 = psi = P23 = P32 = i.e., Pij = _ 1^3 + 2^3 ~~ \2 A\2 2A13A23 1 \2 , \2 A12 + A23 A23 2Aj2A23 1 \2 . \2 A12 + A13 — A23 2A12A13 2A,/;Ajfc for i,j, k distinct. Then h 3 < 1 2 3 in (2-7-11) can be written as C(wi,w2,w3, A12, A 1 3 , A 2 3 ) = / $(A 1 3 +—— ( logg- logtui ) , A JO ^A13 where with some abuse of notation $ ( • ; / > ) = ¥ ( • ; From Section 2.7, we have 23+777—(log g-log w2); p12)dq LA23 1 1 \ 1 P P 1 G(w1,w2,w3) - exp{-A3(w1,w2,w3, X12, X13, A 2 3 )} 29 where with simplification of (2-7-11), A3(w1,w2,w3,Xi2,Xi3,\23) ~w\ - w2 - w3 + A2(wx,w2, Ai 2 ) + A2(w1,w3, A13) + A2(w2,w3, X23) + C(wi,w2,w3). Constraints are 0 < A t J < 00 and 0 < pij < 1. By symmetry, equivalent forms for C can be obtained by permutations of subscript indices: C(iui , w2, w3, A 1 2 , A i 3 , A 2 3 ) rw\ 1 1 = / ^(Ais + 77T-(log q - log w3), A 1 2 + -r— (log q - log w2); p23)dq Jru>2 1 1 ' $(^12 + 7TT - ( l o S9 - logioi), A 2 3 + — — (log 9 - log w3);p13)dq. 0 6Ai2 4A23 The density function for G(wi,w2,w3) is: d3 g(w1,w2,w3) = ( - l ) 3 ^—7;—«—exp{-A 3 (w 1 ,w 2 ,w 3 , A i 2 , A i 3 , A 2 3 )} OWiOW2OW3 _ SMdMdM _ dA3 d2A3 _ 8A3 d2A3 _ dA3 d2A3 + d3A3 dwidw2dw2 dw\dw2dw3 dw2dw\dw3 dw3dw\dw2 dw1dw2dw3 From (xi,x2; p) = (f>(Xl)(f)(^=^) 00 f°° ±tzi ~ PZ~L 1 and Let ^ 1 , x 2 ; p ) = $ ( 7 f = ) ^ i ) . Cl,23 := = $(Ai3 + \og(Wi/w3), A12 + J-J—log(lWi/u72);P23)-atvi zAi3 z A 1 2 30 dC By symmetry, we have C2,i3 •-d2C and C3,i2 := Similarly, 12,3 dw\dw and QJ^Q^£QW w e have an expression for the density function <7(it>i, iw2> u>3). 3.3 Four-Dimensional Case Let E = 2 2A?4 \2 1 \2 _ \2 \2 1 \2 _ \ A14 + A24 A12 A14 + A34 A 2 13 A 2 4 +A 2 _ \2 24 A12 2A 2 4 \2 1 \2 _ \2 A24 + A34 A23 \2 1 \2 \2 \2 1 \2 \2 y A14 -r A34 — A13 A24 A 3 4 — A 2 3 2A| 4 be the matrix in (2-7-10) and let Pijk \2 1 \2 _ \2 A t j + A i f c Ajk ZAijAik where A,j = AJt- and, «, j , A; are distinct and between 1 and 4. Then / i 4 ) i 2 3 4 in (2-7-11) can be written as: 31 D(w1,W2,W3,W4) = rm l l / $(A14 + 7T\ ^g(q/w4), A 2 4 + p-r— log(q/w4), JO &A\4 ^^24 A 3 4 + 7 7 7 — \og(qjw4), R(p4i2, 0413, P423))dq, ZA34 where R(a, b, c) — By symmetry, equivalent forms for D are f 1 A X a 0 a l e 6 c 1 D(lWl,l02,f>3,tU4) y ^ i 1 1 = / $(Ai2 + 7 7 7 — log(?/w 2), A a 3 + 7 7 7 — log(o/w 3), JO A^Vl ZA13 A 1 4 + 2A 14 log(a/to4), R(p123, 0124, P\3A))dq rw2 1 1 / *(Ai2 + 7 7 7 — log(o/wi) ,A 2 3 + 777— l o g ( ? / w 3 ) , Jo £>A\2 ^A23 A24 + •^—\og(q/w4),R(p2i3,p2i4,p234))dq ZA24 ru>3 \ \ • / $(Ai3 + 7 7 7 — log(g/wi), A 2 3 + ^ 7 — log(?/u;2), Jo AA\3 ZA23 A 3 4 + 77T— I°g(?/W4), #(0312, 0314, P324))dq. ZA34 Constraints are: 0 < Xij < 00, i ^ j, 0 < /o.-jjk < 1, i , j , are distinct and R(P412, 0413, 0423), R(P\23i 0124, 0134), #(0213, 0214, 0234), #(0312, 0314, 0324) are positive definite. 32 Taking derivatives, we have -Dl,234 dD —.. 1 , , . . 
o ^ r * { X i 2 + 2 ^ l o g ( W i / W 2 ) ' i i A13 + TT—log(«;i/u;3),Ai4 + 7 7 7 — ^g(w1/w4), R(p123, 0124,0134)) ZA13 By interchange of 1 and 2, 1 and 3, 1 and 4 in 1)1,234 we can obtain 1)2,134 := g ~ 5 As,i24 : = | £ , -04,123 := f j - By making use of Result 1 with E n 3u>3 : / \ 0123 1 0134 ^ 0134 1 j J12 , E22 = 1 we have D 12,34 • — \ 0124 j d 2 D = dDlt234 dwidw2 dw2 — 1 1 1 $(A13 + 7 7 T — log(wi/w3) ~ 0123(Al2 + 7 - — \ 0 g ( w 1 / w 2 ) ) , \ l 4 + 777—log(tt>l/w4) -ZA13 ^ A i 2 ^ A i 4 1 1 1 012 4 (Al2 + 7 7 7 — log(wi/w2)),#34,12)134 ^ —Pl34 1 j -#123,4 = 1 — E i 2 S 2 2 E 2 i = 1 — (p 1 2 3 + p 1 2 4 — 2/)i23/)i24/)134)/(l — p 1 3 4 ) and D d 3 D 123,4 dw\dw2dw3 ${(Al4 + -^—logiwxIWi) - [(/9123 - />124Pl34)(Al2 + T"J— l o g ^ i / ^ ) ) + Z/\i4 ^^ 12 f>124 - /9l23/»134)(Al3 + ^ 7 log(li>i/u>3))]/(l ~ p\34))I\JR™A) 1 (Ai2 + -J—log(t«i/ti;2),Ai3 + r-J— l o g ^ / ^ a ) ; ^123)7. . ZA12 ZA13 4Ai2Ai3U;2^3 By symmetry and permution of indices we can easily get £*i24,3 D dwidufySwi ' 134,2 • dw^dw^dun 1 -^ 234,1 Finally, D dA D 1234 dwidw2dwzdw4 <^ (Ai4 + •^—log(w1/w4),X24 + -^—\og(w2/w4),X34 + -^—\og(w3/w4),) ZA\4 ZA24 ^A34 R{,P412, / > 4 1 3 , / > 4 2 3 ) ) g ^ ^ ^ ^ ^ ^ From Sections 2.7 we have G(w!,w2,w3, w4) = exp[—A4(iwi, w2, w3, w4, X12, X13, A i 4 , A 2 3 , A 2 3 , A 2 4 , A 3 4)], where A4(w1,w2,w3, w4, A 1 2 , A 1 3 , A 1 4 , A23, A23, A 2 4 , A 3 4 ) = i^(iwi) + h1(w2) + hi(w3) + fei(iw4) - h2{w1,w2) - h^W!, w3) - h2(wx,w4) -h2(w2,w3) - h2(w2,w4) - h2(w3,w4) + h3(u)!,w2,w3) + h3(w!,w2,w4) + h3(u)!,w3,w4) + h3(w2,w3,w4) + h4(w!,w2,w3,w4) 34 = IWi + w2 + w3 + w4 - [wi +w2- A2(w1,w2, A12)] -[W!,W3 - A 2(lUi, to3, Ai3)] - [wi + w4 - A2(wi, W4, A14)] -[w2 + w3- A2(w2, w3, A23)] - [w2 + w4 - A2(w2, w4, A24)] -[w3 + w4 - A2(w3,w4, A34)] + C(ioi,tw2,iW3, Ai 2 , A 1 3 ,A 2 3 ) + C(wi,w2,w4, X12, A 1 4 , A 2 4 C(wi,ti> 3,t«4,A i 3,A14, A 3 4) + C(w2,w3,w4, A 2 3 , A 2 4 ,A 3 4 ) - D(u>i, 1 ^ 2 , ^ 3 , ^ 4 ) = -2(wi + u>2 + w3 + w4) + A2(w1,w2,\12)] + A2(wi,w3,\13)] + A2(w!,w4, Ai4)] + A2(w2, iw3, A23)] + A2(u>2, u>4, A24)] + A2(u;3,u;4, A34)] + C(w1, w2, w3, A 1 2 , A13, A 2 3) + C(u>i, w2, w4, Ai 2 , A i 4 , A2 4) + C(iwi,io3,u;4, A13, Ai4 , A 3 4 ) + C(w2,w3,w4, A 2 3 , A 2 4, A 3 4) - D(u>i, w2, w3, w4) and g(wuW2,W3,w4) = ( - l ) 4 F A i 8 ^ > 4 ^ ^ ^ 3 , ^ ) aA4 aA4 <9A4 d A 4 a 2 A 4 a A 4 a ^ 4 dw2 dw3 dw4 dwidw2 dw3 dw4 d2A4 dA4dA4 d2A4 8A4dA4 d2A4 8A4dA4 dw\dw3 dw2 dw4 dw\dw4 dw2 dw3 dw2dw3 dw-^ dw4 d2A4 8A4dA4 d2A4 8A4dA4 | d2A4 d2A4 | dw2dw4 dw\ dw3 dw3dw4 dw\ dw2 dw\dw2 dw3dw4 d2A4 d2A4 d2A4 d2A4 d3A4 dA4 dw\dw3 dw2dw4 dw\dw4 dw2dw3 dw\dw2dw3 dw4 d3A4 dA4 d3A4 dA4 d3A4 dA4 dwidw2dw4 dw3 dw1dw3dw4 dw2 dw2dw3dw4 dwx dAA4 dw\dw2dw3dw4 Upon differentiation, dA4 -2 + A1>2 + A1<3 + A1A + Ci,23 + Ci,24 + Ci,34 - D i , 234-By symmetry,!^1,1^-, f^ 1 are like f^- with indices 1 and 2, 1 and 3, 1 and 4 inter-changed. Also, Pi1 A. A\2 + Cl2,3 + Cl2,4 — ^12,34 dw1dw2 35 and o Z 5 — W23 ~ -L/123,4 OW1OW2OW3 By the same reason as the above we can easily get d2A4 d2A4 d2A4 d2A4 d2A4 .Finally, dw\dw3 dwxdw4 dw2dw3 dw2dw4 dw3dw4 d3A4 d3A4 dzA4 dwidw2dw4' dwidw3dw4' dw2dw3dw4 d4A4 , Q Q Q Q — ~~ ^ 1234-C'Zi;iC'l«2C'W;3C'^ '4 3.4 Computer implementation of the Hiisler-Reiss model The parameter A,j's are estimated with a quasi-Newton routine. The multivari-ate normal survival function is computed with the routine of Schervish (1984). 
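As a concrete illustration of this estimation step (a sketch only, not the thesis's implementation; scipy's quasi-Newton routine and the placeholder data are assumptions), the single dependence parameter λ of the bivariate model (2-7-9) can be fitted by minimizing the negative log-likelihood built from the density of Section 3.1:

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize

def neg_log_lik(lam, w1, w2):
    """Negative log-likelihood of the bivariate Husler-Reiss dependence
    parameter lambda, for data already transformed to exponential margins."""
    lam = lam[0]
    a = lam + (np.log(w1) - np.log(w2)) / (2.0 * lam)
    b = lam + (np.log(w2) - np.log(w1)) / (2.0 * lam)
    A = w1 * norm.cdf(a) + w2 * norm.cdf(b)          # exponent A_2(w1, w2; lambda)
    # density g = exp(-A) * [Phi(a) Phi(b) + phi(a) / (2 lambda w2)]
    g = np.exp(-A) * (norm.cdf(a) * norm.cdf(b) + norm.pdf(a) / (2.0 * lam * w2))
    return -np.sum(np.log(g))

# Placeholder data: an (n, 2) array of weekly maxima transformed to
# exponential margins as in Section 2.8 (random numbers stand in here).
rng = np.random.default_rng(1991)
w = rng.exponential(size=(70, 2))

# Quasi-Newton minimization over lambda > 0.
fit = minimize(neg_log_lik, x0=[1.0], args=(w[:, 0], w[:, 1]),
               method="L-BFGS-B", bounds=[(1e-3, None)])
print("estimated lambda:", fit.x[0])
```

The trivariate and 4-variate cases additionally need the numerical integration and multivariate normal survival routines described next.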
Other routines that are needed are numerical integration routines and routines for the functions C_{1,23}, C_{12,3}, C_{123}, D_{1,234}, D_{12,34}, D_{123,4} and D_{1234}. Then C_{1,24}, C_{13,4}, etc., can be computed from these routines by changing the parameters of the routines.

3.5 Data Analysis

Joe (1990b) constructed some multivariate extreme value environmental data sets from the Bay Area and from the Greater Vancouver Regional District. The first data set consists of weekly maxima of ozone concentrations (in parts per hundred million) for several monitoring stations in the Bay Area; the weeks were for the months of April to October in 1983-1987. For comparison with Joe (1990b), the same subset of 5 stations is used. The second data set consists of weekly maxima of SO2, NO2 and ozone concentrations (in parts per hundred million) for several monitoring stations in the Greater Vancouver Regional District; the weeks were for the months of April to October, for July 1984 to October 1987. Joe (1990b) eliminated all weeks with too many missing values, and for comparison the same reduced data are used. In Chapter 4, weeks with missing values are used as censored data. Details of exploratory data analysis of both data sets are given in Joe (1990b).

Joe (1990b) compares the fit of the models (2-7-1) to (2-7-8) and their multivariate extensions, as well as other models in Joe (1990a). The models in Joe (1990a), which have some appealing theoretical properties, were found to fit worse even though they had more parameters. A partial explanation is that the parameters in these models are not as easily interpretable as those in (2-7-1) to (2-7-11). For comparison with a baseline, the likelihood of the multivariate normal copula with exponential margins was computed. The comparison of the Hüsler-Reiss model with the best fitting of (2-7-1) to (2-7-8) and with the multivariate normal copula is given in Tables 1 to 7. Only the parameter estimates for the Hüsler-Reiss model are given.

Note that the likelihoods given are for the dependence parameters, after the univariate margins have been transformed to exponential survival functions (see Section 2.8). Also, all likelihoods based on margins (or subsets of data from subsets of stations) are given. This is to check the consistency of the parameter estimates for multivariate margins of different orders; this check provides an indication of the goodness of fit of the multivariate models. A final note for the tables is that for 3 or more stations, labelled as 1, ..., p, the parameter estimates are given in lexicographical order, that is, λ_12, λ_13, ..., λ_1p, λ_23, ..., λ_2p, ..., λ_{p−1,p}. For example, consider the pair of stations PT, SJ in Table 1, where the other stations are abbreviated CC, VA, ST. In the trivariate subset (CC, PT, SJ), the λ parameter for (PT, SJ) is the third value (0.588); in the trivariate subset (PT, SJ, VA), it is the first value (0.595); in the 4-variate subset (CC, PT, SJ, VA), it is the fourth value (0.567), etc. Recall from Chapter 2 that a smaller value of λ means more dependence.

Some overall conclusions from the tables are:

(1) The parameter estimates for a particular λ are similar over different likelihoods.

(2) Of the 37 bivariate likelihoods, the likelihood of the Hüsler-Reiss model is higher than those of both models (2-7-1) and (2-7-2) in 13 cases, and it is higher than that of the multivariate normal copula in 18 cases.
(3) Of the 25 trivariate likelihoods, the likelihood of the Hiisler-Reiss model is higher than those of both models (2-7-3) and (2-7-4) in 11 cases. With the adjustment for the number of parameters ((2-7-3) and (2-7-4) have 2 parameters, the Hiisler-Reiss trivariate model has 3 parameters), through a penalty of half the number of parameters for the log likelihood( the Akaike information criterion), the Hiisler-Reiss model is only better for the first data set in Table 1. (4) Of the 9 four-variable likelihoods, the likelihood of the Hiisler-Reiss model is higher than those of the models (2-7-5) to (2-7-8) in 6 cases. This is true even with adjustment for the number of parameters ( 6 for the Hiisler-Reiss model and 3 for models (2-7-5) to (2-7-8) ). Overall, it appears that the Hiisler-Reiss model provides a better fit for Tables 1 and 3, and maybe Table 5. Therefore it is a useful model in addition to those in Joe (1990b). A n indication of whether the Hiisler-Reiss model is better can be seen from the estimates for the bivariate likelihoods. If the pattern of the A's does not fit the bivariate patterns that are possible with models (2-7-3) to (2-7-8), then the Hiisler-Reiss model should be better. ( The restrictions of the bivariate patterns for models (2-7-3) to (2-7-8) can be seen from taking the bivariate margins in these models.) 38 Table 1. Bay Area - Ozone(n=120) pay Notation: Concord(CC),Pittsburg(PT),Vallejo(VA), San Jose(SJ) Santa RosT). Tikelihood for dependence parameters subset CC, PT cd, SJ CC, VA CC, ST PT, SJ PT, VA PT, ST SJ, VA SJ, ST VA. ST C C J ^ S J CC,PT,VA CC,PT,ST CC,SJ,VA CC,SJ,ST CC,VA,ST PT,SJ,VA PT,SJ,ST PT,VA,ST SJ,VA,ST CC,PT,SJ VA CC,PT,SJ ST CC,PT,VA ST CC,SJ,VA ST PT,SJ,VA ST estimate T F T T negative log likeli. normal r^ s hood H-R model — O S 3 — 0.451 0.532 0.708 0.595 0.581 0.749 0.579 0.740 0.599 0.354,0.453 0.5^ 88 0.353,0.532 0.&0 0.352,0.704 0.750 0.451,0.536 0.576 0.450,0.716 0.741 0.533.0.708 0.606 0.595,0.583 0.579 0.594,0.755 0.?38 0.582.0.752 0.d06 0.580.0.747 0.607 0.357,0.441 0.539,0.567 0.581,0.574 0.355,0.441 0.70. 0.567 0.738,0.741 0.353,0.534 0.706,0.581 0.754,0.607 0.450,0.538 0.714,0.577 0.750,0.609 0.594,0.585 0.755,0.581 0.746,0.610 154.28 176.17 190.70 211.60 199.85 196.07 214.80 197.68 214.56 199.86 221.32 154.08 175.96 190.38 211.03 199.34 196.05 214.37 197.40 213.96 199.62 216.41 153.50 179.40 192.41 214.24 200.48 198.52 214.54 202.49 218.74 203.12 212.94 225.26 221.36 223.87 244.94 245.20 245.70 245.38 268.11 271.62 267.09 290.80 279.04 282.20 288.42 243.37 265.32 271.40 265.27 288.08 277.97 278.01 280.47 248.67 272.20 273.54 269.90 291.56 278.94 284.13 280.01 311.54 305.04 303.61 154.75 176.91 190.30 210.47 199.06 197.15 213.92 197.15 213.52 199.54 212.23 222.92 243.86 242.80 265.69 268.28 264.27 287.04 275.45 275.85 275.72 299.69 307.17 303.67 304.20 300.66 329.94 324.70 329.57 321.01 352.88 347.34 349.94 342.36 39 Table 2. GVRD-Sulphur Dioxide(n=70) Notation: Confederation Park(CP), Second Narrows(SN), Anmore(AN),Rocky Point Park(RP). Likelihood for dependence parameters. 
estimate negative log likelihood subset H-R 6's T ' S normal H-R model CP, SN 0.845 129.55 129.54 129.37 129.58 CP, A N 1.086 136.11 135.96 134.82 135.94 CP, R P 0.905 131.45 131.67 129.61 132.00 SN, A N 1.068 135.36 135.54 133.97 135.62 SN, R P 1.229 137.04 137.40 136.48 137.70 A N , R P 0.756 123.39 124.54 123.63 126.08 C P , S N , A N 0.840,1.068 193.83 193.82 192.16 193.44 1.062 CP,SN,RP 0.846,0.891 190.28 192.97 189.26 190.89 1.157 C P , A N , R P 1.045,0.909 183.89 188.01 182.94 187.05 0.750 S N , A N , R P 1.064,1.172 187.61 190.59 187.30 190.93 0.750 C P , S N , A N 0.840,1.060 241.86 247.15 240.07 244.05 R P 0.900,1.032 1.155,0.748 40 Table 3. GVRD-Nitrogen Dioxide(n=84) Notation: Kensington Park(KP), Confederation Park(CP), Anmore(AN), Rocky Point Park(RP). Likelihood for dependence parameters. estimate negative log likelihood subset H-R 6's r's normal H-R model K P , C P 0.378 112.97 113.10 120.15 114.06 K P , A N 0.415 119.26 119.31 124.01 119.59 K P , R P 0.362 110.64 .110.64 119.83 111.53 CP, A N 0.466 127.84 127.30 128.81 126.55 CP, R P 0.346 108.53 108.27 107.87 107.71 A N , R P 0.356 110.48 110.22 112.64 109.91 K P , C P , A N 0.379,0.417 147.96 146.20 154.95 146.27 0.463 K P , C P , R P 0.382,0.363 133.11 126.51 137.25 128.55 0.343 K P , A N , R P 0.416,0.363 136.99 .131.88 143.35 132.88 0.353 C P , A N , R P 0.463,0.345 139.29 137.49 135.81 132.40 0.356 K P , C P , A N 0.382,0.418 161.42 151.95 160.77 149.77 R P 0.363,0.457 0.342,0.352 41 Table 4. GVRD-Ozone(n=84) Notation: Marpole(MA), Confederation Park(CP), Anmore(AN), Rocky Point Park(RP). Likelihood for dependence parameters. estimate negative log likelihood subset H-R 8's r's normal H-R model M A , C P 0.521 89.17 89.23 88.00 89.69 M A , A N 0.829 101.63 102.47 101.80 104.54 M A , R P 0.763 99.94 100.50 100.88 102.21 CP, A N . 0.747 98.78 99.60 95.96 101.93 CP, R P 0.680 97.23 97.72 93.19 98.95 A N , R P 0.284 61.55 61.73 72.02 64.66 M A , C P , A N 0.526,0.803 129.02 131.08 126.47 133.31 0.741 M A , C P , R P 0.526,0.738 125.73 129.42 123.81 130.07 0.676 M A , A N , R P 0.816,0.747 103.97 105.08 115.26 108.96 0.284 C P , A N , R P 0.734,0.670 101.13 101.70 107.48 105.76 0.285 M A , C P , A N 0.505,0.763 130.01 133.28 137.97 136.71 R P 0.705,0.700 0.642,0.291 42 Table 5. GVRD-Station 5 (Confederation Park). (n=64) Likelihood for dependence parameters. estimate negative log likelihood subset H-R <5's r's normal H-R model S02,N02 1.359 127.33 126.92 125.49 126.77 so2,o3 1.033 122.83 122.88 122.10 123.01 N02,03 0.932 120.31 120.56 118.39 120.76 S02,N02,03 1.312,1.023 181.73 180.37 176.24 179.48 0.930 Table 6. GVRD-Station 7 (Anmore). (n=76) Likelihood for dependence parameters. estimate negative log likelihood subset H-R <5's T 'S normal H-R model S02,N02 1.167 148.68 148.81 146.10 148.84 so2,o3 1.134 148.21 148.22 145.17 148.25 N02,03 0.526 118.65 119.11 124.25 120.64 S02,N02,03 1.132,1.112 189.37 190.11 192.69 191.96 0.526 43 Table 7. GVRD-Station 9 (Rocky Point Park). (n=63) Likelihood for dependence parameters. estimate negative log likelihood subset H-R tf's r's normal H-R model S02,N02 1.065 120.87 121.48 118.51 121.99 so2,o3 1.063 120.38 121.10 118.57 122.28 N02,03 0.622 107.55 106.99 106.00 106.30 S02,N02,03 1.019,1.050 162.60 163.61 159.92 164.04 0.625 44 Chapter 4. Analysis With Missing Data In statistical analysis, it may happen by accident or some other reason that some of the observations are missing, cannot be collected, or in some other way are not obtainable. 
In such a situation it is obvious that the routine method is not appropriate. Now we would like to reanalyze the data from Greater Vancouver Regional Dis-trict with missing values among the hourly measurements. If there are missing hourly measurements, we treat the daily or weekly maxima as right-censored. For right censored data, let xt- be either the observed value or the right censored value, and let Si 1 if Xi is right-censored 0 if Xi is not right-censored If f(t; 0) is a parametric family of models for the iid data and if S(t; 9) is the survival function, then the likelihood function with right-censored data is L = n ^ J ( x i , 0 ) 1 - S i S ( t i , 9 ) 6 i , see, for example, L'awless(1980) for details. We will use the procedure in Section 2 proposed by Joe (1990b) for fitting of a para-metric multivariate distribution to independent and identically distributed p-vectors X i , . . . , X n . The first step is to fit generalized extreme value distribution to the uni-variate margin separatly by maximum likelihood. That is, the j t h univariate margin is: Fj{x^hHj,(Tj) = e x p { - ( l + 7 i[(x - fij)/ so that original random variables become exponential random variables. The y ^ s are not treated as left censored data. Now for multivariate parameters, we take the univariate margins to be exponential. For bivariate data, suppose we have iid random pairs {Y\\, Yi2),. • • i(Yni, ^12),and observe (Yu, Y12, Sn, 612),... ,{YnX, Yn2, Snl, Sn2),where 1 if Yij is left censored, Sij = { 0 if Y{j is not censored, i = 1 , . . . ,n and j = 1,2. The bivariate cdf is F ( y x , y2,6) = F = 1 - Fx - F2 + F, where F is the survival function and F\ and F2 are univariate margins of F. Therefore we can get the following contribution to the likelihood of the bivariate parameter 6: P ( Y 1 = y 1 , Y 2 = y2) P ( Y 1 < y 1 , Y 2 = y2) P ( Y 1 = y 1 , Y 2 < y 2 ) _ P ( Y 1 < y 1 . Y 2 < y 2 ) In the trivariate case, the generalization is straightforward and some details are given below. Suppose we have iid random variables (Y\\,Yi2,Y\3), ... ,(Yni, Yn2, Yn3), and observe ( Y n , Y 1 2 , Y 1 3 , 8 n , S 1 2 , S 1 3 ) , ( Y n l , Y n 2 , Y n 3 , 8 n l , 8 n 2 , 8 n 3 ) , where 1 if Yij is left censored, 8^ = < 0 if Yij is uncensored, 46 _ d2F dy1dy2 _ dF_ _ df_ _ d~F7 dy2 dy2 dy2 _ 8F_ _ 9 F _ dF, dyi dyi dyi = F ( y i , y 2 , 0 ) if ( £ i , M = (0,0) if (Siu6i2) = (1,0) if (^1,^2) = (0,1) if (6iuSi2) = (l,l) i — 1,. . . , n and j = 1,2,3. The trivariate cdf is F(yi, y 2 ,y 2 ,5 ) = 1 - F i (y i ) - F 2 ( y i ) - F 3 ( y i ) + F12(yi, y 2 ,5) + ^ i 3 ( y i , y 3 , 0) + ^ 23(2/2,2/3,0) - F(yi, y 2 , y2,9) where JF1 is survival function and F\, F2, F3, Fi2, F13, and F 2 3 are the univariate and bivariate margins of F. Let C i 2 3 = (Sn,Si2,Si3), then we have P(Y1=y1,Y2 = y2,Y3 = y3)=: d3F if C i 2 3 = (0,0,0) dyidy2dy2 P(Y1 3, time-consuming integrations and computa-tions of the multivariate normal cdf are needed. Also, unlike the other families, symbolic manipulation software cannot be used to obtain the density function from the survival function. The maximum likelihood estimation takes 10-20 hours on a Sun SPARCsta-tion 330 for p =4, and is expected to take weeks for p =5 ( the programming for this case has not been completed). If both the multivariate dependence parameters and the univariate marginal parameters are simultaneously estimated, the C P U time is much more than the preceding (this also has not been done). 
Fortunately, it seems possible to check from the maximum likelihood estimates of the bivariate likelihoods whether the simpler models in Joe (1990b) fit the data adequately; if so, the bivariate parameter estimates must follow certain patterns. If the simpler models appear not to be adequate, then the best model, among known ones, would likely be the Hüsler-Reiss model.

Further research is needed to derive families of multivariate extreme value distributions that have a dependence parameter for each bivariate margin and that have a simpler form than the Hüsler-Reiss model (simpler in the sense that maximum likelihood estimation becomes possible for p ≥ 5). Some positive work in this direction has been initiated.

BIBLIOGRAPHY

Char, B. W., Geddes, K. O., Gonnet, G. H., Monagan, M. B., and Watt, S. M. (1989). Maple. 5th ed. WATCOM Publications Limited.

Coles, S. G. and Tawn, J. A. (1991). Modelling extreme multivariate events. J. R. Statist. Soc. B 53, 377-392.

Galambos, J. (1987). The Asymptotic Theory of Extreme Order Statistics, 2nd ed. Krieger Publishing Co., Malabar, FL.

Hüsler, J. and Reiss, R.-D. (1989). Maxima of normal random vectors: between independence and complete dependence. Statistics & Probability Letters 7, 283-286.

Joe, H. (1990a). Families of min-stable multivariate exponential and multivariate extreme value distributions. Statistics & Probability Letters 9, 75-81.

Joe, H. (1990b). On fitting multivariate extreme value distributions, with applications to environmental data. Technical report, Department of Statistics, University of British Columbia.

Lawless, J. F. (1982). Statistical Models and Methods for Lifetime Data. John Wiley & Sons, New York.

Resnick, S. I. (1987). Extreme Values, Regular Variation and Point Processes. Springer-Verlag, New York.

Schervish, M. J. (1984). Multivariate normal probabilities with error bound. Applied Statistics 33, 81-94.

Tawn, J. A. (1988). Bivariate extreme value theory: models and estimation. Biometrika 75, 397-415.

Tawn, J. A. (1990). Modelling multivariate extreme value distributions. Biometrika 77, 245-253.