UBC Theses and Dissertations

Conditional extremes in asymmetric financial markets Zhang, Jinyuan 2015

Conditional Extremes in Asymmetric Financial Markets

by

Jinyuan Zhang

B.Sc., The Chinese University of Hong Kong, 2009

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE in The Faculty of Graduate and Postdoctoral Studies (Statistics)

THE UNIVERSITY OF BRITISH COLUMBIA (Vancouver)

July 2015

© Jinyuan Zhang 2015

Abstract

The project focuses on the estimation of the probability distribution of a bivariate random vector given that one of the components takes on a large value. These conditional probabilities can be used to quantify the effect of financial contagion when the random vector represents losses on financial assets, and as a stress-testing tool in financial risk management. However, it is difficult to quantify these conditional probabilities when the main interest lies in the tails of the underlying distribution. Specifically, empirical probabilities fail to provide adequate estimates, while fully parametric methods are subject to large model uncertainty as there is too little data to assess the model fit in the tails.

We propose a semi-parametric framework using asymptotic results in the spirit of extreme value theory. The main contributions include an extension of the limit theorem in Abdous et al. [Canad. J. Statist. 33 (2005)] to allow for asymmetry, frequently encountered in financial and insurance applications, and a new approach for inference. The results are illustrated using simulations and two applications in finance.

Preface

This work was prepared under the supervision of Professor Natalia Nolde. Chapter 3 and Section 4.1 were developed based on a draft by Professor Natalia Nolde. The remaining chapters were independently written by the author. Professor Natalia Nolde provided many valuable suggestions on improving the material in these chapters.

Table of Contents

Abstract
Preface
Table of Contents
List of Tables
List of Figures
List of Algorithms
Acknowledgements
Dedication
1 Introduction
2 Preliminary Theory
2.1 Extreme Value Theory
2.1.1 Univariate Case
2.1.2 Multivariate Case
2.2 Regular Variation
2.2.1 Definition
2.2.2 Link to MVEVD
2.3 Inference
2.3.1 Tail Index Estimators
2.3.2 Spectral Measure
3 Review of Skewed Distributions
3.1 Elliptical Distributions
3.2 Skew-symmetric Distributions
3.3 Skew-elliptical Distributions
3.3.1 Skew-normal Distribution
3.3.2 Skew-t Distribution
4 Theoretical Results
4.1 Limit Results
5 Inference
5.1 Tail Index Estimation
5.2 Parametric EVT Estimation
5.3 Iterative Parametric EVT Estimation
5.4 Estimation of the Scale Matrix
5.5 Simulation Studies
6 Financial Applications
6.1 Financial Contagion
6.2 CoVaR
7 Conclusion
Bibliography
Appendices
A Proofs
A.1 Proof of Lemma 3.3.2
A.2 Proof of Proposition 4.1.1
A.3 Proof of Lemma 4.1.2
A.4 Proof of Theorem 4.1.3
B Tables
B.1 Simulation Studies
B.2 CoVaR

List of Tables

5.1 Performance of estimator $\hat\nu$ computed using Program 1. The bivariate skew-t distribution with tail index ν = 2 is used in simulations to generate samples of four different sample sizes. Each cell presents the average value of estimates of ν based on 1000 simulated samples of a given size, and the corresponding standard error (in brackets). Two different parameter settings are considered. Case 1: α = (1, −3), ρ = 0.5, ξ = (0, 0); Case 2: α = (1, −3), ρ = 0.5, ξ = (3, 1).

5.2 Simulation results based on 1000 samples of size 1000 from a bivariate skew-t distribution with parameters ξ = (3, 1), α = (1, −3), ν = 2, ρ = 0.5 and a standardized scale matrix.
Each cell provides the average (standard deviation) of the estimates of η(x, y) under various methods; see Section 5.5 for details. For $\hat\eta_{\mathrm{AFG}}(x, y)$, $\hat\eta_1(x, y)$, $\hat\eta_2(x, y)$ and $\eta_{\lim}(x, y)$, we used z = y/x − ρ in the limit results. Values of x and y are chosen as the theoretical marginal quantiles with probability p, where p labels columns and rows.

5.3 Simulation results based on 1000 samples of size 1000 from a bivariate skew-t distribution with parameters ξ = (3, 1), α = (1, −3), ν = 2, ρ = 0.5 and ω = diag(2, 3) for the scale matrix. Each cell provides the average (standard deviation) of the estimates of η(x, y) under various methods; see Section 5.5 for details. For $\hat\eta_{\mathrm{AFG}}(x, y)$, $\hat\eta_1(x, y)$, $\hat\eta_2(x, y)$ and $\eta_{\lim}(x, y)$ we used $z = \omega_1 y/(\omega_2 x) - \rho$ in the limit results. Values of x and y are chosen as the theoretical marginal quantiles with probability p, where p labels columns and rows.

6.1 Point estimation of extreme conditional excess probability for 28 financial institutions. Daily losses are computed using log returns and are filtered by the AR(1)-GARCH(1,1) process. The threshold values of x and y are, respectively, the 99.9% quantile of losses on DJUSFN and the average of the 99% quantiles of losses on cross-sectional stocks. The sample period is from June 1, 2006 to May 31, 2008. The pre-crisis period is from June 1, 2006 to May 31, 2007, and the crisis period is from June 1, 2007 to May 31, 2008.

6.2 Summary statistics for cross-sectional $\Delta\mathrm{CoVaR}^{s|j}_{q_1,t}$ for all institutions during the sample period from June 1, 2006 to May 31, 2007. Level $q_1$ is set to be 5%, and $q_2$ is 5%, 1% or 0.01%. “EVT” and “skew-t”, respectively, refer to the use of the iterative EVT method and the bivariate skew-t distribution to model the sequence of standardized residuals $\{\hat z_t\}$. Estimation based on the empirical distribution is reported under “Empirical”.
Column “Std.TS” reports the average of the standard deviations of the individual $\Delta\mathrm{CoVaR}^{s|j}_{q_1,t}$ and column “Std.CS” reports the standard deviation of the means of the individual $\Delta\mathrm{CoVaR}^{s|j}_{q_1,t}$ measures.

B.1 Simulation results based on 1000 samples of size 1000 from a bivariate skew-t distribution with parameters ξ = (0, 0), α = (0, 0), ν = 2, ρ = 0.5 and ω = diag(1, 1) for the scale matrix. Each cell provides the average (standard deviation) of the estimates of η(x, y) under various methods; see Section 5.5 for details. For $\hat\eta_{\mathrm{AFG}}(x, y)$, $\hat\eta_1(x, y)$, $\hat\eta_2(x, y)$ and $\eta_{\lim}(x, y)$ we used $z = \omega_1 y/(\omega_2 x) - \rho$ in the limit results. Values of x and y are chosen as the theoretical marginal quantiles with probability p, where p labels columns and rows.

B.2 Simulation results based on 1000 samples of size 1000 from a bivariate skew-t distribution with parameters ξ = (3, 1), α = (1, −3), ν = 20, ρ = 0.5 and ω = diag(2, 3) for the scale matrix. Each cell provides the average (standard deviation) of the estimates of η(x, y) under various methods; see Section 5.5 for details. For $\hat\eta_{\mathrm{AFG}}(x, y)$, $\hat\eta_1(x, y)$, $\hat\eta_2(x, y)$ and $\eta_{\lim}(x, y)$ we used $z = \omega_1 y/(\omega_2 x) - \rho$ in the limit results. Values of x and y are chosen as the theoretical marginal quantiles with probability p, where p labels columns and rows.

B.3 Summary statistics for cross-sectional $\Delta\mathrm{CoVaR}^{s|j}_{q_1,t}$ for all institutions during the sample period from June 1, 2006 to May 31, 2007. $\mathrm{VaR}^{j}_{q_2,t}$ is estimated by assuming that $\hat Z^j_t$ follows the skew-t distribution, and $\mathrm{CoVaR}^{s|bj}_{q_1,t}$ is estimated by assuming that $\{\hat Z_t\}$ follows a bivariate skew-t distribution.

B.4 Summary statistics for cross-sectional $\Delta\mathrm{CoVaR}^{s|j}_{q_1,t}$ for all institutions during the sample period from June 1, 2006 to May 31, 2007. $\mathrm{VaR}^{j}_{q_2,t}$ is estimated using the EVT method described in McNeil and Frey [2000], and $\mathrm{CoVaR}^{s|bj}_{q_1,t}$ is estimated by assuming that $\{\hat Z_t\}$ follows a bivariate skew-t distribution.
B.5 Summary statistics for cross-sectional $\Delta\mathrm{CoVaR}^{s|j}_{q_1,t}$ for all institutions during the sample period from June 1, 2006 to May 31, 2007. $\mathrm{VaR}^{j}_{q_2,t}$ is estimated by assuming that $\hat Z^j_t$ follows the skew-t distribution, and $\mathrm{CoVaR}^{s|bj}_{q_1,t}$ is estimated empirically.

List of Figures

1.1 Scatter plot of the daily log losses (negative returns) on the stock of Bank of America Corp (BAC) and JP Morgan Chase & Co (JPM) over a period of 500 days from June 1, 2005 to May 31, 2007 (left panel) and a period of 501 days from June 1, 2007 to May 31, 2009 (right panel).

4.1 Plot of the values of $\lim_{x\to\infty} P(Y \le (z + \rho)x \mid X > x)$ in terms of z for a bivariate skew-t distribution with ξ = (5, −5), α = (1, −3), ρ = 0.5 and ν = 2: (a) limiting value from Theorem 4.1.3, and true value with x being (b) the 99% marginal quantile, (c) the 99.99% marginal quantile, and (d) 200.

5.1 Mean (left panel) and standard deviation (right panel) of absolute differences between the exact value of η(x, y) and the estimated probability $\hat\eta(x, y)$ using Theorem 4.1.3 for different thresholds. Samples are drawn from the skew-t distribution with ξ = (0, 0), α = (1, −3), ν = 2, ρ = 0.5. Values of x and y are chosen as the theoretical marginal quantiles with probability 99.99%.

5.2 The contour plots $\{z \in \mathbb{R}^2 \mid \psi_{st}(z) < 1\}$ for the spectral density $\psi_{st}$ in (5.1). The parameters are set to ν = 2, ρ = 0.5 and several values of α (left panel); α = (3, −1) and several values of ρ (right panel).

5.3 Comparison of the true level curve $\{\psi_{st}(z) < 1\}$ with the estimated contours $\{\hat\psi_{st}(z) < 1\}$ using the AFG method and the proposed iterative parametric EVT method. The data are generated from a bivariate skew-t distribution with parameters ξ = (3, 1), α = (1, −3), ρ = 0.5, ν = 2, and ω = (1, 1) (left panel) or ω = (2, 3) (right panel).
6.1 Estimated contours $\{\hat\psi_{st}(z) < 1\}$ for daily losses on DJUSFN versus Hudson City Bancorp (HCBK, left panel) and Peoples Bank Bridgeport (PBCT, right panel) between June 1, 2006 and May 31, 2007 using the AFG method and the proposed iterative parametric EVT method.

List of Algorithms

1 Estimation of tail index ν
2 Estimation using parametric EVT method
3 Estimation using iterative parametric EVT method

Acknowledgements

First and foremost I would like to thank my supervisor Natalia Nolde. It has been an honour to be her first master's student. I appreciate all her contributions of time, ideas, and funding to make my experience productive and stimulating. The joy and enthusiasm she has for her research inspire me. Natalia, many thanks for your kind support and guidance while encouraging me to move beyond my intellectual comfort zones.

My time at UBC was enjoyable thanks to my dear friends. I am grateful for time spent with my friends and flatmates. Big thanks to Celina Liu and Kun Wang for being true inspirations and such an integral part of my graduate school experience. Doing projects with Celina has kept my mind sharp. I will always remember the passionate discussions with Kun Wang, work related or not. Their enthusiasm, curiosity and encouragement brought much fun to the study and research process. I have been very fortunate to be friends with so many incredible people in my department. Thanks for all of the wonderful conversations. They have enriched my life in graduate school.

Most of all, I would love to thank my family for all their love and encouragement. I am grateful for my mom. Her unfailing love and unconditional support in all of my pursuits are invaluable.
I would like to thank Xi Kang for taking care of me when I suffer and for her understanding, faithful support, encouragement and patience.

Dedication

TO MY PARENTS AND GRANDPARENTS

Chapter 1
Introduction

In a wide range of applications, it is of interest to quantify how large movements in one variable affect another variable. In finance and insurance in particular, one is often concerned with the impact of a stock market crash on individual stocks, or with the impact of a large loss on one financial asset on other assets. One common approach to assessing such influences is to consider conditional probabilities in which the conditioning event is extreme. Let (X, Y) be a bivariate random vector, which can be interpreted as representing losses on two financial assets. The conditional probability

\[ \eta(x, y) := P(Y \le y \mid X > x), \tag{1.1} \]

when x is large, can be seen as a measure of the effect that the occurrence of a large loss on X exerts on the distribution of losses Y. When x and y in (1.1) are chosen as p-marginal quantiles, the limit of 1 − η(x, y), if it exists, as the quantile level p goes to one, is known as the (upper) tail dependence coefficient (Joe [1997]). It is a widely used way to quantify extremal dependence in the presence of tail dependence, i.e., when the coefficient is strictly positive (McNeil et al. [2005]). In finance, the tail dependence coefficient is often interpreted as a measure of financial contagion and plays a role in a variety of applications including risk management, derivative pricing and portfolio selection; see, e.g., Poon et al. [2004], Burtschell et al. [2009], Chan-Lau et al. [2004], Aloui et al. [2011] and DiTraglia and Gerlach [2013].

Figure 1.1 shows the scatter plot of the daily log losses (negative returns) on the stock of Bank of America Corp (BAC) and JP Morgan Chase & Co (JPM) over a period of 500 days from June 1, 2005 to May 31, 2007 and a period of 501 days from June 1, 2007 to May 31, 2009.
It is obvious that these two stocks exhibit weaker dependence in the extreme movements over the second period, which is widely recognized as the financial crisis period. Quantifying this extreme dependence precisely via statistical estimation is difficult for several reasons. Firstly, for large x it becomes exceedingly difficult to estimate η(x, y) empirically, as there are too few, if any, data points falling in the associated extremal region. On the other hand, parametric models fitted using the entire dataset are subject to undue influence of the central observations, possibly compromising the fit in the tail regions. Such a fully parametric approach naturally gives rise to significant model risk. An effective approach in situations dealing with extreme events is to rely on asymptotic models in the spirit of extreme value theory (EVT). The study of conditional extreme value models was initiated by Heffernan and Tawn [2004], followed by Heffernan and Resnick [2007] and Das and Resnick [2011], among others.

Figure 1.1: Scatter plot of the daily log losses (negative returns) on the stock of Bank of America Corp (BAC) and JP Morgan Chase & Co (JPM) over a period of 500 days from June 1, 2005 to May 31, 2007 (left panel) and a period of 501 days from June 1, 2007 to May 31, 2009 (right panel).

Similar to Abdous et al. [2005a], we focus on the estimation of η(x, y) for large x, but extend their methodology from the class of elliptical distributions to a more general framework of asymmetric models describing the stochastic behaviour of underlying tail risks.

Elliptical distributions are frequently used to model the stationary distribution of financial losses and returns; see, e.g., McNeil et al. [2005] for an overview. They are a natural generalization of the multivariate normal distribution that gives flexibility in modelling the tail behaviour.
Typically, financial data exhibit slower decay in the tails, which can be captured by the Student-t distribution. However, elliptical distributions fail to address the likely tail asymmetry in the distribution of financial returns, both for the marginals and the (tail) dependence. Empirical evidence of skewness of financial returns has been documented by Longin and Solnik [2001], Rockinger and Jondeau [2002] and Chang et al. [2013], among others. Applications of skewed distributions in finance and insurance, along with implications of asymmetry for asset pricing and risk management, are reviewed in Adcock et al. [2012]. One example is that a stock's co-skewness with the market could shed light on investors' behaviour, especially in a downside market (see, e.g., Harvey and Siddique [2000], Alles and Murray [2013], Conrad et al. [2013]).

Our aim here is to relax the assumption of elliptical symmetry while still retaining a certain parametric structure which can be used to develop asymptotic approximations as well as subsequent inference procedures using EVT. First, assuming that (X, Y) comes from a sub-family of the skew-elliptical distributions of Azzalini and Capitanio [2003], we derive a limit expression for η(x, y) as x → ∞, where y is chosen to grow at a suitable rate to ensure the existence of a non-degenerate limit. The limit expression is then used as an approximation for large but finite values of x. Estimation of unknown parameters is carried out using a semi-parametric procedure consisting of two main components. The tail behaviour is modelled under the assumption of regular variation, and standard EVT techniques are employed to estimate the tail index parameter. Dependence in the joint tail region is represented by the spectral density, for which we assume a specific parametric form based on the multivariate skew-t distribution.
There exist other more general (semi-)parametric modelling approaches for spectral densities as, for instance, proposed in Peiro [1999], Boldi and Davison [2007] and Beran and Mainik [2014]. However, they tend to come at significant computational or model-formulation costs, and will not be explored here.

The report is organized as follows. Chapter 2 gives a basic overview of extreme value theory and regular variation to be used in the sequel. It also includes some inference methodologies. In Chapter 3 we review a general class of skew-symmetric distributions, with special attention to its skew-elliptical sub-family. These skewed distributions will form the basis for modelling asymmetry in the extremal behaviour of multivariate random vectors. Section 4.1 summarizes a number of limit results, which will be used to approximate the conditional probability η(x, y) in (1.1) as well as to justify subsequent inference procedures. In Chapter 5, we give details on model fitting and estimation of η(x, y), and illustrate the performance of the proposed methods in several simulation studies. Chapter 6 gives an application of the developed methodology to financial data. Finally, Chapter 7 presents some concluding remarks and outlines directions for future research. Proofs and additional tables are relegated to the Appendix.

Chapter 2
Preliminary Theory

2.1 Extreme Value Theory

This section briefly reviews classical EVT, which mainly deals with the asymptotic distribution of sample maxima. The presentation is given for upside gains, but the same results hold for downside losses.

2.1.1 Univariate Case

The classical univariate EVT dates back to the work of Fisher and Tippett [1928] and Gnedenko [1943]. Let $Z_1, \ldots, Z_n$ be a sequence of independent and identically distributed (i.i.d.) random variables with population cumulative distribution function F, and let $M_n$ be the maximum of the sequence.
Then the distribution of $M_n$ is

\[ P(M_n \le x) = P(Z_1 \le x, \ldots, Z_n \le x) = (F(x))^n. \]

If the distribution F is known, then the distribution of $M_n$ can be determined; otherwise, it can be approximated through the asymptotic theory of $M_n$. As n → ∞, the distribution of $M_n$ degenerates to a point mass, but in most cases this problem can be eliminated by allowing a linear normalization of $M_n$.

Theorem 2.1.1. (Fisher-Tippett Theorem) If there exist normalization constants $a_n > 0$ and $b_n \in \mathbb{R}$ such that, as n → ∞,

\[ P\left(\frac{M_n - b_n}{a_n} \le x\right) \to H(x), \]

where H(x) is some non-degenerate distribution function, then H(x) belongs to one of the following three families of distributions:

• Gumbel: $H(x) = \exp\{-\exp(-x)\}$, $x \in \mathbb{R}$;
• Fréchet: $H(x) = 0$ for $x \le 0$, and $H(x) = \exp\{-x^{-\nu}\}$ for $x > 0$, with $\nu > 0$;
• Weibull: $H(x) = \exp\{-(-x)^{-\nu}\}$ for $x \le 0$, with $\nu < 0$, and $H(x) = 1$ for $x > 0$,

where ν is the shape parameter, which describes the fatness of the tail.

A rigorous proof of the theorem can be found in Gnedenko [1943]. Von Mises [1936] and Jenkinson [1955] derive a one-parameter representation of the three limit distributions above, which is known as the standard generalized extreme value (GEV) distribution.

Definition 2.1.1. The distribution function of the standard GEV distribution is given by

\[ H(x) = \begin{cases} \exp\{-(1 + \xi x)^{-1/\xi}\}, & \xi \ne 0, \\ \exp\{-\exp(-x)\}, & \xi = 0, \end{cases} \]

where $1 + \xi x > 0$.

The Gumbel distribution (ξ = 0) is related to light-tailed distributions such as the normal, log-normal or exponential; the Fréchet distribution (ξ > 0) is related to heavy-tailed distributions such as the Pareto, Cauchy or Student-t distribution; the Weibull distribution (ξ < 0) is associated with distributions with finite support, such as the uniform distribution. There are various ways to use EVT to perform inference for the tail index ν = 1/ξ; for details, see Section 2.3.1.

In practice, the above results can be applied to perform inference for block (e.g., annual) maxima data using the maximum-likelihood method.
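As a rough numerical illustration of Theorem 2.1.1 in the Fréchet case, the following Python sketch splits a simulated sample into blocks, takes block maxima, and compares the empirical distribution of the normalized maxima with the limit $\exp\{-x^{-\nu}\}$. The Pareto model $F(z) = 1 - z^{-\nu}$, the normalization $a_n = n^{1/\nu}$, $b_n = 0$, the block size and all function names are assumptions made for this illustration and do not come from the thesis.

```python
import numpy as np

def block_maxima(sample, block_size):
    """Split a 1-D sample into consecutive blocks and return each block's maximum.

    Observations that do not fill a complete final block are dropped.
    """
    sample = np.asarray(sample)
    n_blocks = len(sample) // block_size
    trimmed = sample[: n_blocks * block_size]
    return trimmed.reshape(n_blocks, block_size).max(axis=1)

def frechet_cdf(x, nu):
    """Frechet distribution function: exp(-x^(-nu)) for x > 0, and 0 otherwise."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    out = np.zeros_like(x)
    positive = x > 0
    out[positive] = np.exp(-x[positive] ** (-nu))
    return out

rng = np.random.default_rng(0)
nu = 2.0            # assumed tail index of the simulated data
block_size = 250    # n in the theorem (illustrative choice)
n_blocks = 2000

# Pareto sample with F(z) = 1 - z^(-nu), z >= 1, drawn by inverse transform.
sample = (1.0 - rng.random(block_size * n_blocks)) ** (-1.0 / nu)

maxima = block_maxima(sample, block_size)
# For this Pareto model, a_n = n^(1/nu) and b_n = 0 give the Frechet limit.
normalized = maxima / block_size ** (1.0 / nu)

grid = np.array([0.5, 1.0, 2.0, 4.0])
empirical = np.array([(normalized <= g).mean() for g in grid])
limit = frechet_cdf(grid, nu)
max_gap = np.abs(empirical - limit).max()
print("empirical:", np.round(empirical, 3))
print("limit:    ", np.round(limit, 3))
```

Only `n_blocks` maxima survive out of `block_size * n_blocks` observations, which makes the cost of working with block maxima visible.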
However, this method can be wasteful of data due to the trade-off between the size of the blocks and the number of blocks that can be constructed from a given dataset. In contrast, the peaks over threshold (POT) method, which uses the exceedances over a high threshold to estimate the tail of the distribution, is a more efficient approach.

The generalized Pareto distribution (GPD) is the pivotal distribution for modelling exceedances over a high threshold; see Embrechts et al. [1997] and McNeil et al. [2005], among others.

Definition 2.1.2. (Generalized Pareto Distribution). The distribution function of the GPD is given by

\[ H_{\xi,\beta}(x) = \begin{cases} 1 - \left(1 + \xi \dfrac{x}{\beta}\right)^{-1/\xi}, & \xi \ne 0, \\ 1 - \exp\left\{-\dfrac{x}{\beta}\right\}, & \xi = 0, \end{cases} \]

where ξ is the shape parameter and β is an additional scale parameter.

When ξ > 0, $H_{\xi,\beta}(x)$ is a re-parameterized version of a heavy-tailed, ordinary Pareto distribution; when ξ = 0, we have a light-tailed, exponential distribution; when ξ < 0, $H_{\xi,\beta}(x)$ corresponds to a bounded (i.e., short-tailed) Pareto type II distribution.

For a random variable Z with cumulative distribution function F, the conditional excess distribution of Z over a threshold u is given by

\[ F_u(x) = P(Z - u \le x \mid Z > u) = \frac{F(u + x) - F(u)}{1 - F(u)}. \]

A famous limit result by Balkema and De Haan [1974] and Pickands [1975] shows that the GPD is the limiting distribution for exceedances over a high threshold. This result allows us to perform inference for the exceedances over a high threshold (the tail observations).

2.1.2 Multivariate Case

Multivariate EVT studies the limiting distribution of a vector of appropriately normalized coordinate-wise maxima. Assume that $\mathbf{Z}_1, \ldots, \mathbf{Z}_n$ is a sequence of i.i.d. d-dimensional random vectors, and let

\[ \mathbf{M}_n = \left(\max_{i=1,\ldots,n} Z_{i,1}, \ldots, \max_{i=1,\ldots,n} Z_{i,d}\right)^T; \]

then $\mathbf{M}_n$ can be modelled by the multivariate extreme value distribution (MVEVD).

Theorem 2.1.2.
If there exist vectors of normalizing constants $\mathbf{a}_n > \mathbf{0}$ and $\mathbf{b}_n \in \mathbb{R}^d$ such that, as n → ∞,

\[ P\left(\frac{\mathbf{M}_n - \mathbf{b}_n}{\mathbf{a}_n} \le \mathbf{x}\right) \to H(\mathbf{x}), \tag{2.1} \]

where H(x) is some non-degenerate distribution function, then H(x) is called the multivariate extreme value distribution with univariate GEV marginals.

Details of the above theorem can be found in de Haan and Resnick [1977] and Resnick [1987]. A typical procedure to model multivariate extremes involves two steps: marginal estimation and dependence estimation. Marginal block maxima are first modelled using the GEV distribution and then transformed, based on the fitted distribution, to have a unit Fréchet distribution. The dependence structure can then be modelled via an existing parametric or non-parametric distribution; see, e.g., Coles and Tawn [1994].

The POT method can also be applied to perform inference for multivariate data; however, the definition of a threshold exceedance is not as obvious as in the univariate case. One approach is to employ a multivariate GPD (Rootzén and Tajvidi [2006]), which leads to the study of data exceeding marginal thresholds. Another is to define a multivariate threshold exceedance in terms of the norm of a random vector.

2.2 Regular Variation

It is widely accepted that high-frequency financial return data come from a distribution with heavy tails. We make this assumption throughout the project. Mathematically, heavy tails are often formalized by imposing the condition of regular variation on the tail of the underlying distribution. Multivariate regular variation provides a probabilistic framework for modelling the joint tail of a random vector with threshold-exceedance data. Generally speaking, it decomposes data into polar coordinates, and then characterizes tail dependence by a limiting angular measure on the unit sphere under a fixed norm.

In this section we review some results on regular variation, which will be used in the later parts of the project.
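The polar decomposition mentioned above can be sketched in Python as follows. The code keeps the k observations with the largest L2-norm and returns their radii and angles; a histogram of the angles then serves as a crude empirical picture of the angular (spectral) measure. The simulated bivariate Student-t data, the choice k = 200, the number of histogram bins and the function name `polar_tail_sample` are all assumptions made for this illustration, not part of the thesis.

```python
import numpy as np

def polar_tail_sample(data, k):
    """Return radii and angles of the k observations with the largest L2-norm.

    data: array-like of shape (n, 2). Angles are measured in radians in
    [-pi, pi] using arctan2(second coordinate, first coordinate).
    """
    data = np.asarray(data, dtype=float)
    radii = np.linalg.norm(data, axis=1)
    top = np.argsort(radii)[-k:]          # indices of the k largest radii
    angles = np.arctan2(data[top, 1], data[top, 0])
    return radii[top], angles

# Hypothetical heavy-tailed data: bivariate Student-t with nu degrees of
# freedom and correlation rho, built as a normal vector divided by
# sqrt(chi-square / nu).
rng = np.random.default_rng(1)
n, nu, rho = 5000, 2.0, 0.5
chol = np.linalg.cholesky(np.array([[1.0, rho], [rho, 1.0]]))
normals = rng.standard_normal((n, 2)) @ chol.T
mixing = np.sqrt(rng.chisquare(nu, size=n) / nu)
data = normals / mixing[:, None]

radii, angles = polar_tail_sample(data, k=200)

# Histogram of the angles of the extreme points: a rough proxy for how the
# spectral measure spreads its mass over the unit circle.
hist, edges = np.histogram(angles, bins=8, range=(-np.pi, np.pi))
print(hist)
```

The number k of retained points plays the role of the radial threshold, so it involves the same bias-variance trade-off as threshold selection in the univariate POT method.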
For more details, see, e.g., Resnick [2006].

2.2.1 Definition

Definition 2.2.1. A positive measurable function h on (0, ∞) is said to be regularly varying at infinity with index ν > 0 (written $h \in RV_{-\nu}$) if

\[ \lim_{t\to\infty} \frac{h(tx)}{h(t)} = x^{-\nu} \quad \text{for all } x > 0. \]

Definition 2.2.2. A non-negative random variable Z is said to be regularly varying with index ν > 0 if, for every x > 0,

\[ \lim_{t\to\infty} \frac{P(Z > tx)}{P(Z > t)} = x^{-\nu}. \]

Definition 2.2.3. A random vector $\mathbf{Z}$ on $\mathbb{R}^d$ and its distribution are said to be multivariate regularly varying with index ν > 0 if

\[ \lim_{t\to\infty} \frac{P(\|\mathbf{Z}\| \ge tx,\ \mathbf{Z}/\|\mathbf{Z}\| \in D)}{P(\|\mathbf{Z}\| \ge t)} = x^{-\nu}\,\Psi(D), \tag{2.2} \]

for every x > 0 and Borel set D in $S^{d-1} = \{\mathbf{z} \in \mathbb{R}^d : \|\mathbf{z}\| = 1\}$ with $\Psi(\partial D) = 0$, where Ψ is a spectral probability measure on $S^{d-1}$, and $\|\cdot\|$ denotes the L2-norm.

Condition (2.2) is equivalent to having

\[ \lim_{t\to\infty} \frac{P(\|\mathbf{Z}\| \ge tx)}{P(\|\mathbf{Z}\| \ge t)} = x^{-\nu}, \quad x > 0, \tag{2.3} \]

together with the existence of a measure μ such that

\[ \lim_{t\to\infty} \frac{P(\mathbf{Z} \in tE)}{P(\|\mathbf{Z}\| \ge t)} = \mu(E) < \infty \tag{2.4} \]

for every Borel set E in $\mathbb{R}^d$ that is bounded away from the origin and satisfies $\mu(\partial E) = 0$. The measures Ψ in (2.2) and μ in (2.4) are related via

\[ \Psi(D) = \mu(E_{1,D}) \quad \text{for } D \subset S^{d-1} \text{ and } E_{1,D} = \{\mathbf{z} \in \mathbb{R}^d : \|\mathbf{z}\| \ge 1,\ \mathbf{z}/\|\mathbf{z}\| \in D\}. \]

Definition 2.2.3 suggests the possibility of modelling multivariate tails by separating the (radial) tail behaviour from the extremal dependence structure, expressed in the form of the measure Ψ in (2.2). However, condition (2.2) may be cumbersome to work with, as multivariate distributions are often specified in terms of their densities. The following result, due to de Haan and Resnick [1987], gives sufficient conditions on the density for (2.2) to hold; see also Cai et al. [2011].

Theorem 2.2.1. Let f denote the density of a random vector $\mathbf{Z}$ on $\mathbb{R}^d$. Suppose f is positive and continuous, and for some regularly varying function V with index ν > 0 the following two limit conditions hold:

\[ \lim_{t\to\infty} \frac{f(t\mathbf{z})}{t^{-d}V(t)} = q(\mathbf{z}) > 0 \quad \text{for every } \mathbf{z} \ne \mathbf{0}, \tag{2.5} \]

and

\[ \lim_{t\to\infty} \sup_{\mathbf{z} \in S^{d-1}} \left| \frac{f(t\mathbf{z})}{t^{-d}V(t)} - q(\mathbf{z}) \right| = 0. \tag{2.6} \]

Then $\mathbf{Z}$ is multivariate regularly varying with index ν.
Necessarily, q is homogeneous: $q(t\mathbf{z}) = t^{-\nu-d}\, q(\mathbf{z})$ for $\mathbf{z} \ne \mathbf{0}$. Moreover, if $V(t) = P(\|\mathbf{Z}\| > t)$ then $\mu(E) = \int_E q(\mathbf{z})\,d\mathbf{z}$, where μ(·) is the measure in (2.4).

Remark 2.2.1. Condition (2.5) says that the density f is a multivariate regularly varying function with index (ν + d) and limit function q (Stam [1977]). However, this condition alone is not sufficient to ensure multivariate regular variation of the tail of the distribution, as it only controls the behaviour along rays. The uniformity condition in (2.6) then guarantees (2.2).

Existence of the density q of μ implies the existence of a density ψ of Ψ. In particular, one can show that

\[ q(\mathbf{z}) = q(r\mathbf{w}) = \nu\, r^{-\nu-d}\, \psi(\mathbf{w}), \quad \mathbf{w} = \mathbf{z}/\|\mathbf{z}\| \in S^{d-1},\ r = \|\mathbf{z}\|. \tag{2.7} \]

Proposition 2.2.2. (Karamata's Theorem) If ν > 1 (or if ν = 1 and $\int_t^\infty h(u)\,du < \infty$), then $h \in RV_{-\nu}$ implies that $\int_t^\infty h(u)\,du$ is finite, $\int_t^\infty h(u)\,du \in RV_{-\nu+1}$, and

\[ \lim_{t\to\infty} \frac{t\,h(t)}{\int_t^\infty h(u)\,du} = \nu - 1. \]

2.2.2 Link to MVEVD

Multivariate regular variation can be linked to the MVEVD discussed in Section 2.1.2 in the following way.

Proposition 2.2.3. (Resnick [1987], Corollary 5.18) Suppose that $\mathbf{Z}_1, \ldots, \mathbf{Z}_n$ is a sequence of i.i.d. d-dimensional random vectors that are regularly varying with index ν > 0, and that $\mathbf{M}_n = (\max_{i=1,\ldots,n} Z_{i,1}, \ldots, \max_{i=1,\ldots,n} Z_{i,d})^T$ is the vector of coordinate-wise maxima. Then there exist normalizing sequences of vectors $\mathbf{a}_n > \mathbf{0}$ and $\mathbf{b}_n \in \mathbb{R}^d$ such that (2.1) holds with

\[ H(\mathbf{x}) = \exp\{-\mu([\mathbf{0}, \mathbf{x}]^c)\}, \]

where the marginal distributions of H(x) are GEV with tail index ν.

The above proposition states that if $\mathbf{Z}$ is multivariate regularly varying, then its coordinate-wise maximum has an MVEVD with Fréchet marginal distributions and a dependence structure described by the measure μ.

2.3 Inference

Two important components in modelling the multivariate tail behaviour of regularly varying random vectors are the tail index ν of the radial component and the tail dependence structure, which can be modelled by the spectral measure Ψ.
In this section, we review several existing methods to perform inference.

2.3.1 Tail Index Estimators

As we assume heavy-tailed behaviour for the financial return data, we need to estimate the tail index ν. Under the assumption that each component of a random vector $\mathbf{Z}$ has the same tail index ν, its L2-norm $\|\mathbf{Z}\|$ has a regularly varying tail with index ν. This reduces the multivariate problem to a univariate problem. In this section, we briefly review a few methods that can be applied to estimate the tail index ν. This review is strongly influenced by the master's thesis of Trudel [2008], and I rephrased and reordered several parts to fit in this project.

Hill Estimator. For a random variable Z, Hill [1975] derived the maximum-likelihood estimator for the tail index ν given by

\[ \hat\nu = \left[ \frac{1}{k} \sum_{i=1}^{k} \log \frac{Z_{(i)}}{Z_{(k)}} \right]^{-1}, \tag{2.8} \]

where k is the threshold that defines the tail of the distribution, $Z_{(1)} \ge Z_{(2)} \ge \ldots \ge Z_{(n)}$ are the order statistics for the sample of i.i.d. copies of the random variable Z, and n is the number of observations. This estimator is asymptotically unbiased and easy to implement, but is biased for small samples.

CSN Estimator. Clauset et al. [2009] proposed another way to estimate the tail index, which relaxes the assumption in the maximum-likelihood method that observations should be i.i.d. This method relies on the minimization of the Kolmogorov-Smirnov statistic between the empirical distribution of the data and the assumed heavy-tailed distribution. Intuitively, the CSN estimator aims to find the point where the change between the empirical distribution and the assumed distribution will most likely happen.
The estimator is given by

$$\hat{\nu} = \arg\min_{\nu} D_\nu, \qquad \text{where } D_\nu = \max_x |P_{emp}(x) - P_\nu(x)|;$$

$P_{emp}(x)$ and $P_\nu(x)$ denote the empirical (cumulative) distribution function and the assumed heavy-tailed distribution function with tail index $\nu$, respectively.

HKKP Estimator. The drawback of the Hill estimator is the difficulty of choosing a suitable tail threshold number $k$ such that the MSE is minimized. In practice, the selection of $k$ normally depends on experience. Huisman et al. [2001] developed an ordinary least squares (OLS) regression tail estimator to correct the bias of the Hill estimator in small samples. This method does not rely on a single threshold $k$ to estimate the tail index, but exploits the information contained in a series of Hill estimators, each computed using a different tail threshold.

The HKKP estimator utilizes an important characteristic of the bias term; that is, when the threshold $k$ is small enough, the bias term can be approximated by a linear function of $k$. This suggests that a linear regression can help estimate the tail index. The linear model adopted by Huisman et al. [2001] is stated as follows:

$$\frac{1}{\nu(k)} = \beta_0 + \beta_1 k + \epsilon(k), \qquad k = 1, 2, \ldots, \kappa, \tag{2.9}$$

where $\kappa$ is chosen such that $1/\nu(k)$ is approximately linear in $k$ for $k = 1, 2, \ldots, \kappa$. Huisman et al. [2001] showed that the estimate of the tail index is robust to the choice of $\kappa$.

Furthermore, to correct for the heteroscedasticity of the error term $\epsilon(k)$, a weighted least squares (WLS) regression can be adopted here. Since the variance of the Hill estimator is inversely related to $k$, a $(\kappa \times \kappa)$ weighting matrix that has $\{\sqrt{1}, \sqrt{2}, \ldots, \sqrt{\kappa}\}$ as diagonal elements and zeros elsewhere can be chosen.
Since an unbiased estimator of $\nu$ can be attained for $k$ approaching 0, Equation (2.9) yields an estimate of $\nu$ equal to the inverse of the estimated intercept $\beta_0$.

BGST Estimator. Another popular method to estimate the tail index is to run the following OLS log-log rank regression with $\gamma = 0$ (see Beirlant et al. [2006]):

$$\log(k - \gamma) = \beta_0 - \beta_1 \log Z_{(k)}, \qquad k = 1, 2, \ldots, n. \tag{2.10}$$

The estimate of $\beta_1$ approximates the tail index $\nu$. This method is motivated by the approximately linear relationship that holds for a distribution with a heavy tail:

$$\log\left(\frac{k}{n}\right) \approx \log(C) - \nu \log Z_{(k)}, \qquad k = 1, 2, \ldots, n. \tag{2.11}$$

Gabaix and Ibragimov [2011] provide an improved tail index estimate for this method, which simply sets $\gamma$ to $1/2$ instead of 0. In this way, the bias due to small samples can be reduced substantially.

2.3.2 Spectral Measure

As discussed above, the spectral measure $\Psi$ in (2.2) can be used to describe the extreme dependence structure of a regularly varying random vector. A major issue here is the estimation of the spectral measure $\Psi$. Firstly, it is hard to tell where the tail begins in a given sample so that the asymptotic approximation is sufficiently accurate. Secondly, there are simply very few extreme observations. In this section, we summarize two non-parametric methods to estimate the spectral measure. To simplify notation, we only consider a random sample from a bivariate distribution. Note that if the distribution is known, the spectral measure can be obtained via Definition 2.2.3 or Theorem 2.2.1, at least numerically.

ES Estimator. Einmahl and Segers [2009] proposed a non-parametric maximum empirical likelihood estimator for the spectral measure $\Psi$. The estimator is based on the following moment condition:

$$\int_{[0,\pi/2]} f(\theta)\,\Psi(d\theta) = 0, \tag{2.12}$$

where

$$f(\theta) = \frac{\sin\theta - \cos\theta}{\|\sin\theta - \cos\theta\|}, \qquad \theta \in [0, \pi/2]. \tag{2.13}$$

Let $\{(Z_{i,1}, Z_{i,2});\ i = 1, \ldots, n\}$ be a bivariate random sample with $n$ observations.
The empirical marginal distribution function can be written as

$$\hat{F}_j(x) = \frac{1}{n}\sum_{i=1}^{n} \mathbf{1}(Z_{i,j} \le x), \qquad x \in \mathbb{R},\ j = 1, 2,$$

and $\hat{\bar{F}}_{i,j} = 1 - \hat{F}_j(Z_{i,j})$ is defined as the upper tail empirical probability. Then the consistent empirical spectral probability measure $\hat{\Psi}$ is

$$\hat{\Psi}(\cdot) = \sum_{i \in I_n} \tilde{p}_{i,n}\, \mathbf{1}(\theta_{i,n} \in \cdot),$$

where

$$\theta_{i,n} = \arctan\left(\frac{\hat{\bar{F}}_{i,2}}{\hat{\bar{F}}_{i,1}}\right), \quad i = 1, 2, \ldots, n, \qquad I_n = \left\{i = 1, 2, \ldots, n : \left\|\left(\hat{\bar{F}}_{i,1}^{-1}, \hat{\bar{F}}_{i,2}^{-1}\right)\right\| \ge n/k\right\},$$

and the weight vector $(\tilde{p}_{i,n} : i \in I_n)$ solves the following optimization problem:

maximize $\prod_i p_{i,n}$,
subject to $p_{i,n} \ge 0$ for all $i \in I_n$, $\sum_i p_{i,n} = 1$, $\sum_i p_{i,n} f(\theta_{i,n}) = 0$.

To prove asymptotic results for the estimator, $k = k(n)$ is required to be an intermediate sequence satisfying $k \to \infty$ and $k/n \to 0$ as $n \to \infty$.

This method makes no assumptions on the marginal distribution functions, and allows for arbitrary norms when defining the actual representation of the spectral measure. As the moment condition is imposed as a constraint, the estimator overcomes the shortcoming of the empirical spectral measure proposed in Einmahl et al. [2001], which is itself not a proper spectral measure because it violates the moment constraints.

NS Estimator. One drawback of the ES estimator is the difficulty of choosing the tail threshold $k$. Nguyen and Samorodnitsky [2013] proposed a method that allows for a systematic decision on what part of the sample corresponds to "tail observations". This method is based on ranks, where the rank statistics are defined as

$$r_i^{(j)} = \sum_{m=1}^{n} \mathbf{1}(Z_{m,j} \ge Z_{i,j}), \qquad 1 \le i \le n,\ j = 1, 2.$$

To estimate the spectral measure, the data are first transformed to polar coordinates as

$$(R_{i,k}, \theta_{i,k}) = \left(\left\|\left(\frac{k}{r_i^{(1)}}, \frac{k}{r_i^{(2)}}\right)\right\|,\ \frac{\left(\frac{k}{r_i^{(1)}}, \frac{k}{r_i^{(2)}}\right)}{\left\|\left(\frac{k}{r_i^{(1)}}, \frac{k}{r_i^{(2)}}\right)\right\|}\right), \qquad i = 1, 2, \ldots, n.$$

Then, the consistent estimator for $\Psi$ based on Equation (2.2) can be written as

$$\hat{\Psi}(\cdot) = \frac{\sum_{i=1}^{n} \mathbf{1}(R_{i,k} > 1,\ \theta_{i,k} \in \cdot)}{\sum_{i=1}^{n} \mathbf{1}(R_{i,k} > 1,\ \theta_{i,k} \in [0, \pi/2])}.$$

Their choice of $k$ used in tail estimation depends on a test for exponentiality of the "tail part" of the data.
Specifically, $k = \min(N_n^{(1)}, N_n^{(2)})$, where $N_n^{(j)}$ is the smallest $k$ such that the null hypothesis of exponentiality is rejected for the $j$th marginal variable. The statistic used to test for exponentiality is

$$Q_{k,j} = \frac{\sqrt{k}}{2}\left(\frac{\frac{1}{k}\sum_{i=0}^{k-1}\left(\log\frac{Z_{n-i,j}}{Z_{n-k,j}}\right)^2}{\left(\frac{1}{k}\sum_{i=0}^{k-1}\log\frac{Z_{n-i,j}}{Z_{n-k,j}}\right)^2} - 2\right), \qquad j = 1, 2,$$

which converges to the standard normal distribution under the null hypothesis that the variable is exponentially distributed (see Dahiya and Gurland [1972]). Their choice of the significance level of the test gives

$$N_n^{(j)} = \inf\left\{k : 1 \le k \le n,\ |Q_{k,j}| \ge \omega_j \sqrt{\frac{\theta_n^{(j)}}{k}}\right\}, \qquad j = 1, 2,$$

where $\omega_j > 0$ and $\theta_n^{(j)}$ is an increasing sequence such that $\theta_n^{(j)} = o\left(n^{2\nu/(1+2\nu)}\right)$. Their estimate is conservative in deciding where the "tail" starts. Other choices are possible to cater to different needs, for example, $k = \max(N_n^{(1)}, N_n^{(2)})$. This method extends easily to higher dimensions of the random vector $\mathbf{Z}$, and is fast and simple to automate.

Other methods to estimate the spectral measure include Guillotte et al. [2011] and Eastoe et al. [2014], among others. All these non-parametric methods are flexible enough to capture the behaviour of tail observations. However, they share a common disadvantage of non-parametric methods: there is no parameter with which to make quantitative statements about populations. In this project, the objective is to derive a semi-parametric estimator for the spectral measure that is flexible enough to describe different tail behaviour.

Chapter 3
Review of Skewed Distributions

3.1 Elliptical Distributions

Elliptical distributions are often used as a starting point for further generalizations beyond elliptical symmetry. Here we give a definition, and establish notation and some useful facts for future reference. A complete treatment of the topic may be found in Fang et al. [1990].

Definition 3.1.1.
A continuous random vector $\mathbf{Z}$ on $\mathbb{R}^d$ is said to have an elliptical distribution if its density is of the form

$$f(\mathbf{z}; \boldsymbol{\xi}, \Omega) = \frac{c_d}{|\Omega|^{1/2}}\, \tilde{f}\{(\mathbf{z}-\boldsymbol{\xi})^T \Omega^{-1} (\mathbf{z}-\boldsymbol{\xi})\}, \qquad \mathbf{z} \in \mathbb{R}^d, \tag{3.1}$$

where $\boldsymbol{\xi} \in \mathbb{R}^d$ is a location parameter, $\Omega \in \mathbb{R}^{d \times d}$ is a positive-definite scale matrix, $\tilde{f} : [0,\infty) \to [0,\infty)$ is a continuous and integrable decreasing function, known as the density generator, and $c_d$ is a normalizing constant. We write $\mathbf{Z} \sim Ell_d(\boldsymbol{\xi}, \Omega, \tilde{f})$.

The following stochastic representation for elliptically distributed random vectors is often useful. If $\mathbf{Z} \sim Ell_d(\boldsymbol{\xi}, \Omega, \tilde{f})$, then $\mathbf{Z}$ can be written as

$$\mathbf{Z} = \boldsymbol{\xi} + R L^T \mathbf{S}, \tag{3.2}$$

where $L^T L = \Omega$, $\mathbf{S}$ is a uniformly distributed random vector on $S^{d-1} = \{\mathbf{z} \in \mathbb{R}^d : \|\mathbf{z}\| = 1\}$, the unit sphere in $\mathbb{R}^d$, and $R$ is a positive scalar random variable independent of $\mathbf{S}$.

It will sometimes be convenient to work with a standardized form of the scale matrix. Given a scale matrix $\Omega = (\omega_{ij}) \in \mathbb{R}^{d \times d}$, define the associated standardized scale matrix $\bar{\Omega}$ as $\bar{\Omega} = \omega^{-1}\, \Omega\, \omega^{-1}$, where

$$\omega = \mathrm{diag}(\omega_1, \ldots, \omega_d) = \mathrm{diag}(\sqrt{\omega_{11}}, \ldots, \sqrt{\omega_{dd}}). \tag{3.3}$$

3.2 Skew-symmetric Distributions

In this section we describe a very general class of skewed distributions. The construction is simple and is summarized in Proposition 3.2.1 below. Let $\mathbf{Z}$ be a $d$-dimensional random vector which is centrally symmetric around a point $\boldsymbol{\xi} \in \mathbb{R}^d$; i.e. $\mathbf{Z} - \boldsymbol{\xi} \overset{d}{=} \boldsymbol{\xi} - \mathbf{Z}$. Assuming that $\mathbf{Z}$ is continuous with density $f$, the condition of central symmetry implies that $f(\mathbf{z} - \boldsymbol{\xi}) = f(\boldsymbol{\xi} - \mathbf{z})$ for all $\mathbf{z} \in \mathbb{R}^d$, up to a set of measure zero. Note that the property of central symmetry is satisfied by a wide variety of densities including, for example, the class of elliptical densities. Skew-symmetric distributions as introduced in Wang et al. [2004] are then constructed by perturbing the symmetry of the density $f$ of $\mathbf{Z}$ by a so-called skewing function. A function $\pi : \mathbb{R}^d \to [0, 1]$ is a skewing function if $\pi(-\mathbf{z}) = 1 - \pi(\mathbf{z})$ for $\mathbf{z} \in \mathbb{R}^d$.

Proposition 3.2.1.
Suppose $f$ is the density of a continuous random vector on $\mathbb{R}^d$ which is centrally symmetric around $\mathbf{0}$, and $\pi : \mathbb{R}^d \to [0, 1]$ is a skewing function. Let $G$ denote a scalar distribution function of a random variable that is symmetric about 0, so that $G(-s) = 1 - G(s)$ for all $s \in \mathbb{R}$, and let $w : \mathbb{R}^d \to \mathbb{R}$ be an odd function with $w(-\mathbf{x}) = -w(\mathbf{x})$ for all $\mathbf{x} \in \mathbb{R}^d$. Then the function from $\mathbb{R}^d \to \mathbb{R}_+$

$$2 f(\mathbf{z} - \boldsymbol{\xi})\, \pi(\mathbf{z} - \boldsymbol{\xi}), \qquad \mathbf{z}, \boldsymbol{\xi} \in \mathbb{R}^d, \tag{3.4}$$

or equivalently,

$$2 f(\mathbf{z} - \boldsymbol{\xi})\, G(w(\mathbf{z} - \boldsymbol{\xi})), \qquad \mathbf{z}, \boldsymbol{\xi} \in \mathbb{R}^d, \tag{3.5}$$

is a density.

Proof. A proof is given in Proposition 1 of Wang et al. [2004] and in Azzalini and Capitanio [2003]. Equivalence of the two formulations (3.4) and (3.5) is shown in Proposition 2 of Wang et al. [2004]. □

As remarked in Wang et al. [2004], representation (3.5) of a skewing function is not unique since for any strictly increasing distribution function $G$, one can find a suitable odd function $w$ to obtain a given skewing function $\pi$.

Definition 3.2.1. A random vector $\mathbf{Z}$ on $\mathbb{R}^d$ has a skew-symmetric distribution with location parameter $\boldsymbol{\xi} \in \mathbb{R}^d$ if its density is of the form (3.4) or (3.5).

For a skew-symmetric random vector $\mathbf{Z}$ with density (3.5), there exists a convenient stochastic representation:

$$\mathbf{Z} = \begin{cases} \mathbf{Y} + \boldsymbol{\xi} & \text{if } U < \pi(\mathbf{Y}) \text{ or } X < w(\mathbf{Y}), \\ -\mathbf{Y} + \boldsymbol{\xi} & \text{if } U > \pi(\mathbf{Y}) \text{ or } X > w(\mathbf{Y}), \end{cases} \tag{3.6}$$

where $\mathbf{Y}$ has density $f$, $U$ is uniformly distributed on $(0, 1)$, and $X$ is a random variable with distribution function $G$; $\mathbf{Y}$, $U$ and $X$ are mutually independent. Here the condition on $U$ corresponds to form (3.4) and the condition on $X$ to form (3.5). This representation gives a straightforward way to simulate from skew-symmetric distributions.

An interesting and in fact useful property of skew-symmetric distributions is their invariance under even functions, which is immediate from the stochastic representation in (3.6).

Proposition 3.2.2. Consider random vectors $\mathbf{Y}$ and $\mathbf{Z}$ on $\mathbb{R}^d$ with densities $f$ and of the form (3.4) with $\boldsymbol{\xi} = \mathbf{0}$, respectively, satisfying the conditions of Proposition 3.2.1.
If $\tau : \mathbb{R}^d \to \mathbb{R}^p$ for some $p > 0$ is an even function, i.e. $\tau(-\mathbf{x}) = \tau(\mathbf{x})$ for all $\mathbf{x} \in \mathbb{R}^d$, then

$$\tau(\mathbf{Y}) \overset{d}{=} \tau(\mathbf{Z}).$$

The complete generality of the class of skew-symmetric distributions is shown in Wang et al. [2004] by proving that in fact any density has a skew-symmetric representation and this representation is unique:

Proposition 3.2.3. For any density $g$ on $\mathbb{R}^d$ and any point $\boldsymbol{\xi} \in \mathbb{R}^d$, $g$ can be represented as

$$g(\mathbf{z}) = 2 f(\mathbf{z} - \boldsymbol{\xi})\, \pi(\mathbf{z} - \boldsymbol{\xi}), \tag{3.7}$$

where $f$ is a density, centrally symmetric around $\mathbf{0}$, and $\pi$ is a skewing function. This representation is unique for any $\boldsymbol{\xi}$, and

$$f(\mathbf{z}) = \frac{g(\boldsymbol{\xi} + \mathbf{z}) + g(\boldsymbol{\xi} - \mathbf{z})}{2} \qquad \text{and} \qquad \pi(\mathbf{z}) = \frac{g(\boldsymbol{\xi} + \mathbf{z})}{g(\boldsymbol{\xi} + \mathbf{z}) + g(\boldsymbol{\xi} - \mathbf{z})}.$$

Next we would like to take a closer look at several important examples of this very general family of skewed distributions.

3.3 Skew-elliptical Distributions

Skew-elliptical distributions constitute a fairly large subclass within the family of skew-symmetric distributions, and are obtained by replacing the centrally symmetric density $f$ in (3.5) with a density of an elliptical distribution (see Definition 3.1.1).

Definition 3.3.1. A random vector $\mathbf{Z}$ on $\mathbb{R}^d$ has a skew-elliptical distribution if its density is given by

$$h(\mathbf{z}) = 2\, f(\mathbf{z}; \boldsymbol{\xi}, \Omega)\, G(w(\mathbf{z} - \boldsymbol{\xi})), \qquad \mathbf{z} \in \mathbb{R}^d, \tag{3.8}$$

where:
• $\boldsymbol{\xi} \in \mathbb{R}^d$ is a location parameter and $\Omega \in \mathbb{R}^{d \times d}$ is a positive-definite scale matrix;
• $f$ is the elliptical density in (3.1) with density generator $\tilde{f}$;
• $G$ is a scalar distribution function such that $G(-x) = 1 - G(x)$ for all $x \in \mathbb{R}$;
• $w : \mathbb{R}^d \to \mathbb{R}$ is an odd function; i.e., $w(-\mathbf{x}) = -w(\mathbf{x})$ for all $\mathbf{x} \in \mathbb{R}^d$.

We write $\mathbf{Z} \sim SE_d(\boldsymbol{\xi}, \Omega, \tilde{f}, G \circ w)$.

Azzalini and Capitanio [2003] have derived a stochastic representation for skew-elliptical random vectors similar to the one for elliptical vectors presented in (3.2).

Proposition 3.3.1.
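If $\mathbf{Z} \sim SE_d(\boldsymbol{\xi}, \bar{\Omega}, \tilde{f}, G \circ w)$ as in Definition 3.3.1, then it admits the following stochastic representation

$$\mathbf{Z} = \boldsymbol{\xi} + R L^T \mathbf{S}', \tag{3.9}$$

where $\Omega = L^T L$; $R > 0$ has density

$$f_R(r) = \frac{2\pi^{d/2}}{\Gamma(d/2)}\, c_d\, \tilde{f}(r^2)\, r^{d-1}, \qquad r > 0, \tag{3.10}$$

and $\mathbf{S}'$ has a non-uniform distribution on the unit sphere in $\mathbb{R}^d$. The density of $\mathbf{S}'$ in spherical coordinates is given by

$$\frac{\Gamma(d/2)}{\pi^{d/2}} \prod_{k=1}^{d-2} (\sin\theta_k)^{d-k-1}\, P\{X \le w_L^*(\theta_1, \ldots, \theta_{d-1}, R)\}, \qquad \boldsymbol{\theta} = (\theta_1, \ldots, \theta_{d-1}) \in [0, \pi)^{d-2} \times [0, 2\pi),$$

where $w_L^*(\boldsymbol{\theta}, r) = w_L(r\cos\theta_1, r\sin\theta_1\cos\theta_2, \ldots, r\sin\theta_1 \cdots \sin\theta_{d-1})$, $w_L(\mathbf{x}) = w(L^T\mathbf{x})$, and $X$ is an independent random variable with distribution function $G$.

Lemma 3.3.2 (Linear transformation of SE). Let $\mathbf{Z} \sim SE_d(\boldsymbol{\xi}, \Omega, \tilde{f}, G \circ w)$. Then $C\mathbf{Z} \sim SE_d(C\boldsymbol{\xi}, C^T \Omega C, \tilde{f}, G \circ w_0)$ for any $d \times d$ non-singular matrix $C$, where $w_0(\mathbf{z} - C\boldsymbol{\xi}) = w(C^{-1}(\mathbf{z} - C\boldsymbol{\xi}))$.

We next look at the form of this representation for two specific examples of skew-elliptical distributions.

3.3.1 Skew-normal Distribution

The density of a random vector $\mathbf{Z}$ on $\mathbb{R}^d$ with a skew-normal distribution (denoted $\mathbf{Z} \sim SN_d(\boldsymbol{\xi}, \Omega, \boldsymbol{\alpha})$) is given by

$$f_{SN}(\mathbf{z}) = 2\phi_d(\mathbf{z} - \boldsymbol{\xi}; \Omega)\, \Phi(\boldsymbol{\alpha}^T(\mathbf{z} - \boldsymbol{\xi})), \qquad \mathbf{z} \in \mathbb{R}^d, \tag{3.11}$$

where $\boldsymbol{\alpha} \in \mathbb{R}^d$ is the shape parameter controlling the skewness of the distribution; $\phi_d(\cdot; \Omega)$ is the density of the centred $d$-variate normal distribution with scale matrix $\Omega$, and $\Phi(\cdot)$ is the distribution function of the standard normal random variable. Comparing $f_{SN}$ in (3.11) with representation (3.5), we see that $G(\cdot) = \Phi(\cdot)$ and $w(\mathbf{y}) = \boldsymbol{\alpha}^T\mathbf{y}$, $\mathbf{y} \in \mathbb{R}^d$. Hence, for $\mathbf{Z} \sim SN_d(\boldsymbol{\xi}, \Omega, \boldsymbol{\alpha})$, Proposition 3.3.1 says that $\mathbf{Z} = \boldsymbol{\xi} + R L^T \mathbf{S}'$ with $R^2 \sim \chi^2_d$ and the density of $\mathbf{S}'$ in spherical coordinates equal to

$$f_\Theta(\boldsymbol{\theta}) = \frac{\Gamma(d/2)}{\pi^{d/2}} \prod_{k=1}^{d-2} (\sin\theta_k)^{d-k-1}\, P\{X \le R\, (\boldsymbol{\alpha}^*)^T \boldsymbol{\beta}_\theta\}, \qquad \boldsymbol{\theta} = (\theta_1, \ldots, \theta_{d-1}) \in [0, \pi)^{d-2} \times [0, 2\pi), \tag{3.12}$$

where $\boldsymbol{\alpha}^* = L\boldsymbol{\alpha}$, $\boldsymbol{\beta}_\theta = (\cos\theta_1, \sin\theta_1\cos\theta_2, \ldots, \sin\theta_1 \cdots \sin\theta_{d-1})^T$ and $X \sim N(0, 1)$, independent of $R$. Using the fact that $X/\sqrt{R^2/d}$ has a Student $t$ distribution with $d$ degrees of freedom (since $R^2 \sim \chi^2_d$) and letting $T_1(\cdot; d)$ denote its distribution function, we obtain

$$f_\Theta(\boldsymbol{\theta}) = \frac{\Gamma(d/2)}{\pi^{d/2}} \prod_{k=1}^{d-2} (\sin\theta_k)^{d-k-1}\, T_1\left(\sqrt{d}\, (\boldsymbol{\alpha}^*)^T \boldsymbol{\beta}_\theta;\ d\right). \tag{3.13}$$

3.3.2 Skew-t Distribution

Following Azzalini and Capitanio [2003], we say that a $d$-dimensional vector $\mathbf{Z}$ has a skew-$t$ distribution with location parameter $\boldsymbol{\xi}$, scale matrix $\Omega$, shape parameter $\boldsymbol{\alpha}$ and $\nu$ degrees of freedom (written as $\mathbf{Z} \sim St_d(\boldsymbol{\xi}, \Omega, \boldsymbol{\alpha}, \nu)$) if its density is given by

$$f_{St}(\mathbf{z}) = 2\, t_d(\mathbf{z}; \Omega, \nu)\, T_1\left(\boldsymbol{\alpha}^T \omega^{-1}(\mathbf{z} - \boldsymbol{\xi}) \left(\frac{\nu + d}{Q(\mathbf{z}) + \nu}\right)^{1/2};\ \nu + d\right), \qquad \mathbf{z} \in \mathbb{R}^d, \tag{3.14}$$

where $Q(\mathbf{z}) = (\mathbf{z} - \boldsymbol{\xi})^T \Omega^{-1} (\mathbf{z} - \boldsymbol{\xi})$, and $t_d(\mathbf{z}; \Omega, \nu)$ is the density of a standardized $d$-dimensional $t$ distributed random vector with scale matrix $\Omega$ and $\nu$ degrees of freedom.

Application of Proposition 3.3.1 to $\mathbf{Z} \sim St_d(\boldsymbol{\xi}, \Omega, \boldsymbol{\alpha}, \nu)$ gives that in the stochastic representation (3.9), the density of $\mathbf{S}'$ in spherical coordinates is equal to

$$f_\Theta(\boldsymbol{\theta}) = \frac{\Gamma(d/2)}{\pi^{d/2}} \prod_{k=1}^{d-2} (\sin\theta_k)^{d-k-1}\, P\left\{X \le R \sqrt{\frac{\nu + d}{R^2 + \nu}}\, (\boldsymbol{\alpha}^*)^T \boldsymbol{\beta}_\theta\right\}, \tag{3.15}$$

where $\boldsymbol{\alpha}^*$ and $\boldsymbol{\beta}_\theta$ are as in (3.12), the density of $R$ is given by

$$f_R(r) \propto (1 + r^2/\nu)^{-(\nu+d)/2}\, r^{d-1}, \qquad r > 0,$$

and $X$ has a Student $t$ distribution with $\nu + d$ degrees of freedom and is independent of $R$. Let $Y = (X/R)\sqrt{(R^2 + \nu)/(\nu + d)}$. The density of $Y$ can be computed as follows:

$$f_Y(y) = \int_0^\infty f_{Y|R}(y \mid r)\, f_R(r)\, dr \propto \int_0^\infty r^d \left(\nu + r^2(1 + y^2)\right)^{-(\nu+d+1)/2} dr;$$

the change of variable $t = r\sqrt{1 + y^2}$ gives

$$f_Y(y) \propto (1 + y^2)^{-(d+1)/2} \int_0^\infty t^d (\nu + t^2)^{-(\nu+d+1)/2}\, dt \propto (1 + y^2)^{-(d+1)/2}, \qquad y \in \mathbb{R},$$

from which one recognizes $f_Y$ as the density of a $t$ distribution with $d$ degrees of freedom evaluated at the point $\sqrt{d}\,y$. So, $(X/R)\sqrt{d(R^2 + \nu)/(\nu + d)}$ has a $t$ distribution with $d$ degrees of freedom, which implies that the density of $\mathbf{S}'$ in (3.15) for the skew-$t$ distribution is exactly the same as in (3.13) for the skew-normal distribution.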
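The distributional identity derived above can be checked by simulation. The following is a minimal sketch, assuming NumPy is available; the function names are hypothetical and not part of the thesis. It relies on the standard fact that the radial density $f_R(r) \propto (1 + r^2/\nu)^{-(\nu+d)/2} r^{d-1}$ is that of a $d$-variate $t$ radius, for which $R^2/d$ follows an $F(d, \nu)$ distribution; this sampling shortcut is an assumption of the sketch rather than something stated in the text.

```python
import numpy as np

def simulate_scaled_y(d, nu, size, rng):
    """Draw sqrt(d) * Y, where Y = (X / R) * sqrt((R^2 + nu) / (nu + d))."""
    # R^2 / d ~ F(d, nu) yields f_R(r) proportional to (1 + r^2/nu)^(-(nu+d)/2) r^(d-1)
    r = np.sqrt(d * rng.f(d, nu, size=size))
    # X ~ Student t with nu + d degrees of freedom, independent of R
    x = rng.standard_t(nu + d, size=size)
    return np.sqrt(d) * (x / r) * np.sqrt((r**2 + nu) / (nu + d))

def ks_two_sample(a, b):
    """Two-sample Kolmogorov-Smirnov distance between two empirical cdfs."""
    a, b = np.sort(a), np.sort(b)
    grid = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return float(np.max(np.abs(cdf_a - cdf_b)))

rng = np.random.default_rng(0)
d, nu, n = 2, 2.0, 200_000
lhs = simulate_scaled_y(d, nu, n, rng)   # should behave like a t_d sample
rhs = rng.standard_t(d, size=n)          # reference draws from t_d
ks_distance = ks_two_sample(lhs, rhs)
print(f"KS distance between sqrt(d)*Y and t_d draws: {ks_distance:.4f}")
```

A small Kolmogorov-Smirnov distance (of the order $n^{-1/2}$) is consistent with $\sqrt{d}\,Y$ having a $t$ distribution with $d$ degrees of freedom, whatever the value of $\nu$.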
This result reveals the similarity between the skew-normal and skew-$t$ distributions via the stochastic behaviour of the angular component $\mathbf{S}'$; the differences arise from the tail behaviour as governed by the radial component $R$, and the dependence between $\mathbf{S}'$ and $R$.

In a similar way, it can be shown that the same result holds for a more general class of skew-elliptical densities (of which skew-$t$ is a special case) based on the multivariate Pearson type VII distributions, whose generator and normalizing constant are

$$\tilde{f}(x) = (1 + x/\nu)^{-M}, \qquad c_d = \frac{\Gamma(M)}{(\pi\nu)^{d/2}\, \Gamma(M - d/2)}, \qquad \nu > 0,\ M > d/2.$$

Proposition 4 in Azzalini and Capitanio [2003] says that the density of the above skew-elliptical distributions is of the type (3.5) and is given by

$$2 f(\mathbf{z}; \Omega)\, F_1\left(\boldsymbol{\alpha}^T\mathbf{z}\, (\nu + Q(\mathbf{z}))^{-1/2};\ M, 1\right), \qquad \mathbf{z} \in \mathbb{R}^d,$$

where $f$ is the density of the $PVII_d(\mathbf{0}, \Omega, M - 1/2, \nu)$ distribution and $F_1(\cdot; M, 1)$ is the distribution function of a $PVII_1(0, 1, M, 1)$ distributed random variable. One recovers the skew-$t$ case by setting $M = (\nu + d + 1)/2$.

An interesting question is whether there are other examples of skew-elliptical distributions for which the density of $\mathbf{S}'$ in (3.9) is equal to (3.13), and if so, what conditions on the family of skew-elliptical distributions will ensure this property.

Chapter 4
Theoretical Results

In this chapter we explore the asymptotic behaviour of skew-elliptical distributions under certain restrictions on the skewing function. Specifically, we derive a limit expression for the conditional probability given that one of the components of a bivariate random vector is extreme, assuming a sub-family of skew-elliptical distributions with regularly varying tails as the underlying model. This extends results in Abdous et al. [2005a] for elliptically symmetric vectors.
All the proofs can be found in Appendix A.

4.1 Limit Results

The following results on the asymptotic behaviour of the random vector $\mathbf{Z}$ will be used to motivate and justify the proposed inference procedure for estimation of the conditional probability $\eta(x, y)$ in (1.1).

Proposition 4.1.1. Let $\mathbf{Z} = (X, Y) \sim SE_2(\boldsymbol{\xi}, \bar{\Omega}, \tilde{f}, G \circ w)$, where $\bar{\Omega}_{ii} = 1$ and $\bar{\Omega}_{ij} = \rho \in (-1, 1)$ for $i \neq j$, $i, j \in \{1, 2\}$. Assume the following:
(i) The density generator $\tilde{f}$ of the underlying elliptical distribution varies regularly: $\tilde{f} \in RV_{-(\nu+2)/2}$ for some $\nu > 0$;
(ii) $\lim_{t\to\infty} w(t\mathbf{z}) =: w_\infty(\mathbf{z}) \in \mathbb{R}$ for all $\mathbf{z} \in \mathbb{R}^2$, and $\lim_{t\to\infty} \sup_{\mathbf{z} \in S^1} |G(w(t\mathbf{z})) - G(w_\infty(\mathbf{z}))| = 0$.
Then $\mathbf{Z}$ is a bivariate regularly varying random vector with index $\nu$. The density of the associated spectral measure $\Psi$ in (2.2) is given by

$$\psi(\mathbf{w}) = 2\, |\bar{\Omega}|^{-1/2}\, G(w_\infty(\mathbf{w}))\, Q^*(\mathbf{w})^{-(\nu+2)/2} \left[\int_0^{2\pi} A(\theta)^{\nu/2}\, d\theta\right]^{-1}, \qquad \mathbf{w} \in S^1, \tag{4.1}$$

where $A(\theta) = 1 + \rho\sqrt{1 - \rho^2}\, \sin(2\theta) + \rho^2 \cos(2\theta)$, and $Q^*(\mathbf{w}) = \mathbf{w}^T \bar{\Omega}^{-1} \mathbf{w}$.

Remark 4.1.1. Assumption (i) above is standard, and assumption (ii) holds for popular members of the skew-elliptical family of distributions, including the multivariate skew-$t$ distribution and more generally the skew-Pearson type VII distribution, both mentioned in Section 3.3. The first part of assumption (ii) indicates that the function $w$ is bounded in all directions, and the second part guarantees the uniformity condition (2.6) for the density function of $\mathbf{Z}$.

Lemma 4.1.2. Let $\mathbf{Z} = \boldsymbol{\xi} + R L^T \mathbf{S}' \sim SE_d(\boldsymbol{\xi}, \Omega, \tilde{f}, G)$. Then the following statements are equivalent:
(i) $\tilde{f} \in RV_{-(\nu+d)/2}$;
(ii) $R$ is regularly varying with index $\nu$;
(iii) $\|\mathbf{Z}\|$ is regularly varying with index $\nu$, where $\|\cdot\|$ is the $L_2$ norm.

The next result gives the limit expression for the conditional probability in (1.1).

Theorem 4.1.3. Let $\mathbf{Z}$ have a bivariate skew-elliptical distribution satisfying the assumptions of Proposition 4.1.1.
Suppose the marginal densities of $\mathbf{Z}$ have the form

$$h_i(z) = 2\, f_i(z - \xi_i)\, G_0(w_i(z - \xi_i)), \qquad i = 1, 2, \tag{4.2}$$

where $f_i(\cdot)$ is the symmetric density of the $i$th component of the elliptical vector $\tilde{\mathbf{Z}}$ and $G_0(w_i(\cdot))$ is a skewing function with existing limit $\lim_{t\to\infty} w_i(tx) = w_{i,\infty}(x) \in \mathbb{R}$ for all $x \in \mathbb{R}$, $i = 1, 2$. Then, for $y \sim (z + \rho)x$ as $x \to \infty$ and $z \in \mathbb{R}$,

$$\lim_{x, y \to \infty} P(Y \le y \mid X > x) = 1 - K(\rho + z)\, J(z), \tag{4.3}$$

where:
• $K(z) = \dfrac{\iint_{(1,z)}^{\infty} Q^*(\mathbf{u})^{-(\nu+2)/2}\, G(w_\infty(\mathbf{u}))\, d\mathbf{u}}{G_0(w_{1,\infty}(1)) \iint_{(1,z)}^{\infty} Q^*(\mathbf{u})^{-(\nu+2)/2}\, d\mathbf{u}}$ with $Q^*(\mathbf{x}) = \mathbf{x}^T \bar{\Omega}^{-1} \mathbf{x}$;
• $J(z) = \bar{T}_1\left(\dfrac{z\sqrt{\nu+1}}{\sqrt{1-\rho^2}};\ \nu+1\right) + \dfrac{\mathrm{sign}(\rho+z)}{|\rho+z|^{\nu}}\, \bar{T}_1\left(\mathrm{sign}(\rho+z)\, \dfrac{\sqrt{\nu+1}}{\sqrt{1-\rho^2}} \left(\dfrac{1}{\rho+z} - \rho\right);\ \nu+1\right)$;
• $T_1(x; \nu)$ is the cdf of a univariate Student $t$ random variable with $\nu$ degrees of freedom;
• $\bar{T}_1(x; \nu) = 1 - T_1(x; \nu)$.

Remark 4.1.2. When the skewing function is identity, the factor $K(z)$ in (A.9) is equal to one, and one recovers Theorem 1(i) in Abdous et al. [2005a] for elliptical random vectors.

Remark 4.1.3. The assumption of closure under marginalization is valid for some restricted generating functions and skewing functions; for example, the skew-elliptical distributions proposed by Branco and Dey [2001] and by Fang [2003] (footnote 1). Skew-normal, skew-$t$, and skewed Pearson Type II distributions are all included in this distribution family.

Footnote 1. When $\lambda = 0$, the skew-elliptical distribution in Fang [2003] reduces to the form in Branco and Dey [2001].

Example 1. Suppose $\mathbf{Z}$ has a bivariate skew-$t$ distribution, $\mathbf{Z} \sim St_2(\mathbf{0}, \Omega, \boldsymbol{\alpha}, \nu)$. The marginals satisfy

$$P(Z_i > z) \sim 2\, \bar{T}_1(z; \nu)\, T_1(\bar{\alpha}_i \sqrt{\nu+1};\ \nu+1), \qquad z \to \infty,\ i = 1, 2,$$

with

$$\bar{\alpha}_1 = \frac{\alpha_1 + \rho\alpha_2}{\sqrt{1 + \alpha_2^2(1 - \rho^2)}} \qquad \text{and} \qquad \bar{\alpha}_2 = \frac{\alpha_2 + \rho\alpha_1}{\sqrt{1 + \alpha_1^2(1 - \rho^2)}}. \tag{4.4}$$

For the skewing function we have $G(\cdot) = T_1(\cdot; \nu+2)$,

$$w(\mathbf{z}) = \boldsymbol{\alpha}^T\mathbf{z} \left(\frac{\nu+2}{Q(\mathbf{z}) + \nu}\right)^{1/2} \qquad \text{and} \qquad w_\infty(\mathbf{z}) = \boldsymbol{\alpha}^T\mathbf{z} \left(\frac{\nu+2}{Q^*(\mathbf{z})}\right)^{1/2},$$

and so the factor $K(z)$ in (A.9) is given by

$$K(z; \boldsymbol{\alpha}, \rho, \nu) = \frac{\iint_{(1,z)}^{\infty} Q^*(\mathbf{u})^{-(\nu+2)/2}\, T_1\left(\boldsymbol{\alpha}^T\mathbf{u} \sqrt{\frac{\nu+2}{Q^*(\mathbf{u})}};\ \nu+2\right) d\mathbf{u}}{T_1(\bar{\alpha}_1\sqrt{\nu+1};\ \nu+1) \iint_{(1,z)}^{\infty} Q^*(\mathbf{u})^{-(\nu+2)/2}\, d\mathbf{u}}. \tag{4.5}$$

In this case, the limit expression in (4.3) can be evaluated numerically.

Figure 4.1 plots the limiting behaviour described in Theorem 4.1.3 for a bivariate skew-$t$ distribution with shifted location. To check the correctness of the derived formula, we first calculate the limiting value obtained from Theorem 4.1.3 (see (a)), and then calculate true values by numerical integration with different thresholds for $x$ (see (b), (c), and (d)). As $x$ increases, the numerical values converge to the limiting value. Therefore, Theorem 4.1.3 works well even if the location parameter is not around 0.

Figure 4.1: Plot of the values of $\lim_{x\to\infty} P(Y \le (z + \rho)x \mid X > x)$ in terms of $z$ for a bivariate skew-$t$ distribution with $\boldsymbol{\xi} = (5, -5)$, $\boldsymbol{\alpha} = (1, -3)$, $\rho = 0.5$ and $\nu = 2$: (a) limiting value from Theorem 4.1.3, and true value with $x$ being (b) the 99% marginal quantile, (c) the 99.99% marginal quantile, and (d) 200. Horizontal axis: $z$; vertical axis: $\eta(x, y)$.

Chapter 5
Inference

In practice, one often has to evaluate the probability $\eta(x, y)$ in (1.1) when $x$ and $y$ are large but finite. For example, $\eta(x, y)$ can represent the probability that a stock or a portfolio experiences a large loss when the market crashes. However, the empirical estimate of this probability may take a degenerate value of 0 or 1 whenever the sample contains no observations in the region of interest. The inference procedure we propose makes use of the asymptotic result in Theorem 4.1.3 as a finite sample approximation of the conditional probability $\eta(x, y)$ for large values of $x$ and $y$. Evaluation of the limit in (4.3) requires estimation of the tail index $\nu$, the parameter $\rho$ of the standardized scale matrix, and (parameters of) the asymptotic skewing function $G(w_\infty(\cdot))$.

It is henceforth assumed that the bivariate sample $\{\mathbf{Z}_1, \ldots, \mathbf{Z}_n\}$ on which inference is based comprises independent and identically distributed copies of a random vector $\mathbf{Z}$ satisfying the assumptions of Theorem 4.1.3.

5.1 Tail Index Estimation

Under the assumptions of Theorem 4.1.3 on the random vector $\mathbf{Z}$, its $L_2$-norm $\|\mathbf{Z}\|$ has a regularly varying tail with index $\nu$; see Lemma 4.1.2. This reduces the problem of estimation of $\nu$ to a univariate setting, which has been widely studied in the literature. Similar to Abdous et al. [2005a], we adopt the method of Huisman et al. [2001], shown to have good performance in small samples. The estimation procedure is summarized in Algorithm 1.

Table 5.1 re-examines the performance of the above tail index estimator for two skewed distributions, in particular with the purpose of supplementing the symmetric cases considered in the original paper of Huisman et al. [2001].

Algorithm 1 Estimation of tail index $\nu$
(1) Let $\{\mathbf{z}_1, \mathbf{z}_2, \ldots, \mathbf{z}_n\}$ represent a bivariate sample with location $\boldsymbol{\xi}$;
(2) Re-center data $\tilde{\mathbf{z}}_j = \mathbf{z}_j - \hat{\boldsymbol{\xi}}$, $j = 1, \ldots, n$, where $\hat{\boldsymbol{\xi}}$ is a robust estimate of $\boldsymbol{\xi}$;
(3) Calculate $L_2$-norm distances $\|\tilde{\mathbf{z}}_j\|$, $j = 1, \ldots, n$;
(4) Compute Hill estimates $\hat{\nu}_H(k)$ of the tail index for $k = 1, \ldots, \frac{n}{2}$, where $k$ is the number of upper order statistics used in estimation (Hill [1975]);
(5) Fit a weighted linear regression $1/\hat{\nu}_H(k) = \beta_0 + \beta_1 k + \epsilon(k)$, $k = 1, \ldots, \frac{n}{2}$, with weights $\{\sqrt{1}, \sqrt{2}, \ldots, \sqrt{n/2}\}$;
(6) Set $\hat{\nu} = 1/\hat{\beta}_0$, where $\hat{\beta}_0$ is the estimate of $\beta_0$ obtained in Step (5).

The analysis is based on 1000 simulated samples of sizes 5000, 1000, 500 and 200 from a bivariate skew-$t$ distribution with different settings for the location parameter. As specified in Algorithm 1, the generated samples are first re-centered by a robust estimate of location $\boldsymbol{\xi}$ (see Rousseeuw and Driessen [1999] and Maronna and Zamar [2002]). Re-centering helps reduce the bias in the tail index estimate caused by a non-zero location.
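The steps of the tail index procedure above can be sketched in a few lines. This is a minimal illustration assuming NumPy, not the implementation used for Table 5.1: the function names are hypothetical, the coordinate-wise median stands in for the robust location estimate, the Hill estimate uses $Z_{(k+1)}$ as the threshold (so that the $k = 1$ term is finite), and a symmetric bivariate $t$ sample with $\nu = 2$ is used as test data.

```python
import numpy as np

def hill_estimates(radii, k_max):
    """Hill estimates nu_H(k), k = 1..k_max, with Z_(k+1) as threshold."""
    z = np.sort(radii)[::-1]                 # descending order statistics
    logs = np.log(z[:k_max + 1])
    k = np.arange(1, k_max + 1)
    # mean of log(Z_(i) / Z_(k+1)) over the k largest observations
    gamma = np.cumsum(logs[:-1]) / k - logs[1:]
    return 1.0 / gamma

def hkkp_tail_index(sample, kappa=None):
    """Weighted regression of 1/nu_H(k) on k; returns 1 / intercept."""
    centred = sample - np.median(sample, axis=0)   # assumed robust centre
    radii = np.linalg.norm(centred, axis=1)        # L2-norm distances
    kappa = radii.size // 2 if kappa is None else kappa
    k = np.arange(1, kappa + 1)
    y = 1.0 / hill_estimates(radii, kappa)
    w = np.sqrt(k)                                 # diagonal of the weighting matrix
    design = np.column_stack([np.ones(kappa), k])
    beta, *_ = np.linalg.lstsq(design * w[:, None], y * w, rcond=None)
    return 1.0 / beta[0]

rng = np.random.default_rng(0)
n, nu_true = 20_000, 2.0
# bivariate Student t: Gaussian vector divided by sqrt(chi2_nu / nu)
mixing = np.sqrt(rng.chisquare(nu_true, size=n) / nu_true)
sample = rng.standard_normal((n, 2)) / mixing[:, None] + np.array([3.0, 1.0])
nu_hat = hkkp_tail_index(sample)
print(f"estimated tail index: {nu_hat:.2f}")
```

Scaling the design matrix and the response by $\sqrt{k}$ before an ordinary least squares fit is one way to realise the weighted regression in Step (5); other weighting conventions would change the code accordingly.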
Note that in the symmetric case, the sample mean is an unbiased estimator of the location parameter $\boldsymbol{\xi}$; however, when the underlying distribution is skewed, estimation of $\boldsymbol{\xi}$ using the sample mean is subject to a bias. Based on Table 5.1, the performance of the estimator $\hat{\nu}$ seems to be satisfactory even when the sample size is relatively small.

Table 5.1: Performance of estimator $\hat{\nu}$ computed using Algorithm 1. The bivariate skew-$t$ distribution with tail index $\nu = 2$ is used in simulations to generate samples of four different sample sizes. Each cell presents the average value of estimates of $\nu$ based on 1000 simulated samples of a given size, and the corresponding standard error (in brackets). Two different parameter settings are considered. Case 1: $\boldsymbol{\alpha} = (1, -3)$, $\rho = 0.5$, $\boldsymbol{\xi} = (0, 0)$; Case 2: $\boldsymbol{\alpha} = (1, -3)$, $\rho = 0.5$, $\boldsymbol{\xi} = (3, 1)$.

Sample Size    Case 1         Case 2
5000           2.00 (0.01)    2.00 (0.09)
1000           2.04 (0.22)    2.05 (0.22)
500            2.08 (0.33)    2.08 (0.32)
200            2.25 (0.61)    2.23 (0.59)

5.2 Parametric EVT Estimation

Estimation of the remaining terms in the limit expression (4.3) for $\eta(x, y)$ constitutes a more complex problem than estimation of the tail index $\nu$, as the former requires inference about the multivariate tail behaviour. In the setting of multivariate regular variation, a standard procedure is to make use of the spectral measure $\Psi$ in (2.2).

Suppose $\mathbf{Z}_1, \ldots, \mathbf{Z}_n$ are independent and identically distributed random vectors from a bivariate regularly varying distribution; see Definition 2.2.3. Let $R_i = \|\mathbf{Z}_i\|$ and $\mathbf{W}_i = \mathbf{Z}_i/\|\mathbf{Z}_i\|$ denote the radius and direction of $\mathbf{Z}_i$, $i = 1, \ldots, n$, respectively. Let $R_{n-k:n}$ denote the $k$th largest observation in the sample of $R_i$'s; it is used to identify extreme or tail observations in the original bivariate sample of $\mathbf{Z}_i$'s. Then the limit in (2.2) suggests that, given $\mathbf{Z}_i$ with $R_i > R_{n-k:n}$, the sample of $\mathbf{W}_i$'s can be modelled by the measure $\Psi$.
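The radius-direction decomposition and the selection of tail observations described above can be illustrated as follows. This is a minimal sketch assuming NumPy; `tail_directions` is a hypothetical helper, and the threshold is taken as the $(k+1)$th largest radius so that exactly $k$ observations exceed it for continuous data, which is one possible reading of $R_{n-k:n}$.

```python
import numpy as np

def tail_directions(z, k):
    """Directions W_i = Z_i / ||Z_i|| of the k observations with largest radii."""
    radii = np.linalg.norm(z, axis=1)
    threshold = np.sort(radii)[-(k + 1)]     # (k+1)-th largest radius
    keep = radii > threshold                 # exactly k points when there are no ties
    return z[keep] / radii[keep, None], threshold

rng = np.random.default_rng(1)
n, k = 1000, 150
# illustrative heavy-tailed data: bivariate Student t with 2 degrees of freedom
z = rng.standard_normal((n, 2)) / np.sqrt(rng.chisquare(2, size=n) / 2)[:, None]
directions, threshold = tail_directions(z, k)
angles = np.arctan2(directions[:, 1], directions[:, 0])   # angles in (-pi, pi]
print(directions.shape, float(threshold))
```

The retained directions (or equivalently their angles) are the data to which a model for the spectral measure is fitted.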
While various parametric and non-parametric procedures are possible to make inference about $\Psi$, here we build upon the class of skew-elliptical distributions.

Given a particular skewing function $\pi(\cdot) = G(w(\cdot))$, which satisfies the assumptions of Theorem 4.1.3, the spectral density $\psi(\mathbf{w})$ can be written down explicitly using (4.1). One can then estimate $\rho$ as well as other parameters in $G(w_\infty(\cdot))$ by maximizing the log-likelihood based on $\psi(\mathbf{w})$ for the tail data projected on the unit sphere $S^1$, namely for $\mathbf{W}_i$'s with $R_i > R_{n-k:n}$.

In our implementation, we assume that the limiting skewing function $G(w_\infty(\cdot))$ has the same form as that for the skew-$t$ distribution. The spectral density in this case is given by

$$\psi_{st}(\mathbf{w}) = 2\, |\bar{\Omega}|^{-1/2}\, T_1\left(\boldsymbol{\alpha}^T\mathbf{w} \sqrt{\frac{\nu+2}{Q^*(\mathbf{w})}};\ \nu+2\right) Q^*(\mathbf{w})^{-(\nu+2)/2} \left[\int_0^{2\pi} A(\theta)^{\nu/2}\, d\theta\right]^{-1}, \qquad \mathbf{w} \in S^1; \tag{5.1}$$

cf. Example 1. The parameters to be estimated include the shape parameter $\boldsymbol{\alpha} = (\alpha_1, \alpha_2)$ and the parameter $\rho$ of the standardized scale matrix $\bar{\Omega}$. The details of the resulting inference procedure are summarized in Algorithm 2.

Algorithm 2 Estimation using parametric EVT method
(1) Let $\{\mathbf{z}_1, \mathbf{z}_2, \ldots, \mathbf{z}_n\}$ represent a bivariate sample with location $\boldsymbol{\xi}$;
(2) Re-center data $\tilde{\mathbf{z}}_j = \mathbf{z}_j - \hat{\boldsymbol{\xi}}$, $j = 1, \ldots, n$, where $\hat{\boldsymbol{\xi}}$ is a robust estimate of $\boldsymbol{\xi}$;
(3) Estimate tail index $\nu$ using Algorithm 1;
(4) Calculate $L_2$-norm distances $\|\tilde{\mathbf{z}}_j\|$ and set $\mathbf{w}_j = \tilde{\mathbf{z}}_j/\|\tilde{\mathbf{z}}_j\|$, $j = 1, 2, \ldots, n$;
    Keep $\mathbf{w}_j$ if the corresponding $\|\tilde{\mathbf{z}}_j\|$ exceeds a specified high quantile threshold;
    Estimate parameters $\rho$ and $\boldsymbol{\alpha}$ by fitting $\psi_{st}$ in (5.1) to the retained sample of $\mathbf{w}_j$'s using maximum likelihood.

In Step (4) of Algorithm 2, it is necessary to select a threshold to identify extreme observations based on their $L_2$-norm distances. Such a choice always entails a bias-variance trade-off due to balancing between model validity and estimation efficiency. To assess the stability of our parametric EVT method, we simulated 1000 skew-$t$ samples of size 1000.
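To make the likelihood step concrete, the spectral density in (5.1) and its negative log-likelihood can be coded directly. The sketch below assumes NumPy and SciPy; all function names are hypothetical, $\rho$ is re-parametrised through `tanh` to keep it in $(-1, 1)$, and the synthetic angles are drawn approximately from $\psi_{st}$ on a fine grid. None of these choices is taken from the thesis, and the optimiser settings are illustrative only.

```python
import numpy as np
from scipy import integrate, optimize, stats

def spectral_density_st(phi, alpha, rho, nu):
    """psi_st of (5.1) evaluated at w = (cos phi, sin phi)."""
    phi = np.asarray(phi, dtype=float)
    w1, w2 = np.cos(phi), np.sin(phi)
    det = 1.0 - rho**2                                  # |Omega_bar|
    q_star = (1.0 - 2.0 * rho * w1 * w2) / det          # w^T Omega_bar^{-1} w
    arg = (alpha[0] * w1 + alpha[1] * w2) * np.sqrt((nu + 2.0) / q_star)
    skew = stats.t.cdf(arg, df=nu + 2.0)                # limiting skewing function

    def a_pow(theta):
        a = 1.0 + rho * np.sqrt(det) * np.sin(2 * theta) + rho**2 * np.cos(2 * theta)
        return a ** (nu / 2.0)

    norm_const, _ = integrate.quad(a_pow, 0.0, 2.0 * np.pi)
    return 2.0 * skew * q_star ** (-(nu + 2.0) / 2.0) / (np.sqrt(det) * norm_const)

def neg_log_lik(params, phi, nu):
    """Negative log-likelihood in (alpha_1, alpha_2, atanh(rho))."""
    rho = np.tanh(params[2])
    with np.errstate(all="ignore"):
        value = -np.sum(np.log(spectral_density_st(phi, params[:2], rho, nu)))
    return value if np.isfinite(value) else np.inf

rng = np.random.default_rng(2)
alpha_true, rho_true, nu = (1.0, -3.0), 0.5, 2.0
grid = np.linspace(0.0, 2.0 * np.pi, 4000, endpoint=False)
dens = spectral_density_st(grid, alpha_true, rho_true, nu)
total_mass = dens.mean() * 2.0 * np.pi                  # should be close to 1
# approximate sampling of angles from psi_st via the grid
phi_sample = rng.choice(grid, size=500, p=dens / dens.sum())
fit = optimize.minimize(neg_log_lik, x0=np.zeros(3), args=(phi_sample, nu),
                        method="Nelder-Mead")
alpha_hat, rho_hat = fit.x[:2], float(np.tanh(fit.x[2]))
print(total_mass, alpha_hat, rho_hat)
```

The check that `total_mass` is close to one confirms that (5.1) is a proper density on the unit circle; in an application, `phi_sample` would be replaced by the angles of the retained tail directions and `nu` by its estimate.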
Absolute differences between the true value of $\eta(x, y)$ and the estimated probability $\hat{\eta}(x, y)$, obtained using (4.3) with Algorithm 2 for parameter estimation, were calculated for different thresholds, where $x$ and $y$ were taken as the theoretical 99.99% marginal quantiles. Figure 5.1 shows the mean and standard deviation of the absolute differences for different thresholds. Based on these plots, in the subsequent analyses, we set the threshold at the 85% quantile of the distance variates $\|\tilde{\mathbf{z}}_j\|$ ($j = 1, \ldots, n$).

Assuming a particular parametric form for the spectral density improves the efficiency of the estimation procedure in comparison to non-parametric approaches, especially when the sample size is small. While this entails a certain degree of model risk, the proposed model can capture asymmetric contour shapes well and thus gives a more flexible alternative to elliptically symmetric models. Figure 5.2 shows several contour plots that can be modelled by $\psi_{st}$ in (5.1). As only tail observations are used in the estimation, our approach is fundamentally different from modelling the data directly using the skew-$t$ distribution; that is, a skew-$t$ distribution fitted to the entire data set is subject to undue influence of the central observations, possibly compromising the fit in the tail regions.

Figure 5.1: Mean (left panel) and standard deviation (right panel) of absolute differences between the exact value of $\eta(x, y)$ and the estimated probability $\hat{\eta}(x, y)$ using Theorem 4.1.3 for different thresholds. Samples are drawn from the skew-$t$ distribution with $\boldsymbol{\xi} = (0, 0)$, $\boldsymbol{\alpha} = (1, -3)$, $\nu = 2$, $\rho = 0.5$. Values of $x$ and $y$ are chosen as the theoretical marginal quantiles with probability 99.99%.

Figure 5.2: The contour plots $\{\mathbf{z} \in \mathbb{R}^2 \mid \psi_{st}(\mathbf{z}) < 1\}$ for the spectral density $\psi_{st}$ in (5.1). The parameters are set to $\nu = 2$, $\rho = 0.5$ and several values of $\boldsymbol{\alpha}$ (left panel); $\boldsymbol{\alpha} = (3, -1)$ and several values of $\rho$ (right panel).

Our method shares a similar idea with Cai et al. [2011], where the authors propose an asymptotically motivated estimator to approximate an extreme risk region of the form $\{\mathbf{z} \in \mathbb{R}^d : h(\mathbf{z}) \le \beta\}$, where $h$ is the underlying density function and $\beta > 0$ is a small number. They argue that a (semi-)parametric method with a pre-specified form of the spectral density does not perform as well as when the spectral density is estimated non-parametrically. However, in their comparative analysis an elliptically contoured model is used, which indeed may not be flexible enough for the application at hand. Different non-parametric consistent estimators for the spectral density have been proposed recently (see Nguyen and Samorodnitsky [2013], Eastoe et al. [2014], among others); however, these methods do not fit in our framework.

While asymptotically a non-zero location $\boldsymbol{\xi}$ does not affect the results, in practice, when working with finite values, the location does influence the performance of the estimation procedure. So far we have been using a robust estimate of location $\boldsymbol{\xi}$ to re-centre the data. In the next section, we propose a refinement of Algorithm 2. Based on the subsequent simulation studies with the location parameter substantially different from zero, the proposed iterative procedure leads to more accurate estimation results.

5.3 Iterative Parametric EVT Estimation

To estimate the conditional probability $\eta(x, y)$ when the location of the sample is far from the origin, one needs a good estimator for the location parameter $\boldsymbol{\xi}$. Ma et al. [2005] propose a locally efficient semi-parametric method specifically for univariate skew-elliptical data. However, their method becomes tedious in the multivariate setting. Other methods, including the robust estimation used in the previous section, cannot handle skewed data very well.

Note that our estimation method relies on the tail data only, given an initial estimate of the location.
If the location of the shifted sample is close to zero, the accuracy of the parametric EVT method of the previous section should not be significantly affected by the location parameter. Tail data alone, however, are unlikely to provide an accurate estimate of the location parameter, which determines the centre of the entire data cloud. Based on these considerations, we introduce an iterative procedure that updates the location parameter using the entire sample, followed by re-estimation of the other model parameters with the parametric EVT method.

In particular, we assume that the angular component $S'$ in the stochastic representation (3.9) of skew-elliptical distributions for the bivariate case has the following density in polar coordinates:

\[ f_\Theta(\theta) = \frac{1}{\pi}\, T_1\!\left( \sqrt{2}\,(L\alpha)^T \begin{pmatrix} \cos\theta \\ \sin\theta \end{pmatrix};\ 2 \right), \quad \theta \in [0, 2\pi). \qquad (5.2) \]

This is precisely the form that was obtained in the examples considered in Section 3.3. Now, based on (3.9), write

\[ \tilde{Z} = (\tilde{X}, \tilde{Y}) = (L^{-1})^T (Z - \xi) = R(\cos\Theta, \sin\Theta), \]

which implies that $\Theta = \tan^{-1}(\tilde{Y}/\tilde{X})$. Hence, given an initial estimate $\hat{\xi}$ of $\xi$, such as the robust estimate used previously, and the estimate of $\rho$, the parameter of the standardized scale matrix $\bar{\Omega} = L^T L$, computed using the parametric EVT method, we can obtain the initial sample of angles $\{\Theta_i^{(0)};\ i = 1, \ldots, n\}$ from the original data $Z_1, \ldots, Z_n$. We can then update $\hat{\xi}$ by minimizing a distance between the theoretical density $f_\Theta(\theta; \xi \mid \hat{\alpha}, \hat{\rho})$ in (5.2) and the empirical density of the $\Theta_i$'s, denoted $\hat{f}_{\Theta,n}(\theta; \xi \mid \hat{\alpha}, \hat{\rho})$, over values of $\xi$ with the other parameters fixed at their estimates, i.e.,

\[ \hat{\xi} = \arg\min_{\xi} \left\| f_\Theta(\theta; \xi \mid \hat{\alpha}, \hat{\rho}) - \hat{f}_{\Theta,n}(\theta; \xi \mid \hat{\alpha}, \hat{\rho}) \right\|_2. \qquad (5.3) \]

To improve the stability of the estimation of $\hat{\xi}$, we also control the convergence of $\hat{\alpha}$ and $\hat{\rho}$ by updating their values for the fixed value of $\hat{\xi}$ from the previous iteration:

\[ (\hat{\alpha}, \hat{\rho}) = \arg\min_{\alpha, \rho} \left\| f_\Theta(\theta; \alpha, \rho \mid \hat{\xi}) - \hat{f}_{\Theta,n}(\theta; \alpha, \rho \mid \hat{\xi}) \right\|_2. \qquad (5.4) \]

Let $\|(\hat{\xi}, \hat{\alpha}, \hat{\rho})_j - (\hat{\xi}, \hat{\alpha}, \hat{\rho})_{j-1}\|_\infty$ denote the largest absolute difference between two consecutive estimates. Once $\|(\hat{\xi}, \hat{\alpha}, \hat{\rho})_j - (\hat{\xi}, \hat{\alpha}, \hat{\rho})_{j-1}\|_\infty < \epsilon$ for a chosen tolerance level $\epsilon > 0$, we stop the above iterative procedure of updating the parameter estimates. As a final step, given the value of $\hat{\xi}$ from the last iteration, the data are re-centred and $\alpha$ and $\rho$ are re-estimated using the parametric EVT method. The details of this procedure are summarized in Program 3.

Algorithm 3 Estimation using iterative parametric EVT method
(1) Let $\{z_1, z_2, \ldots, z_n\}$ represent a bivariate sample with location $\xi$;
(2) Re-centre the data: $\tilde{z}_j = z_j - \hat{\xi}$, $j = 1, \ldots, n$, where $\hat{\xi}$ is a robust estimate of $\xi$;
(3) Estimate the parameters $\nu$, $\alpha$ and $\rho$ using Programs 1 and 2;
(4) Set Diff = 1
    while Diff > $\epsilon$ do
        Compute $\hat{\xi}$ according to (5.3);
        Given $\hat{\xi}$, update $\hat{\alpha}$ and $\hat{\rho}$ according to (5.4);
        Set Diff = $\|(\hat{\xi}, \hat{\alpha}, \hat{\rho})_i - (\hat{\xi}, \hat{\alpha}, \hat{\rho})_{i-1}\|_\infty$
    end while
(5) With the updated $\hat{\xi}$ from the last iteration in Step (4), apply Program 2 to re-compute $\hat{\alpha}$ and $\hat{\rho}$.

As a final remark, we note that by assuming a particular form of the density of the angular component in (5.2), we implicitly restrict the behaviour of the underlying density generator $\tilde{f}$, owing to the dependence between the radial and angular components in representation (3.9). However, this restriction is used only to improve the estimation of the location parameter, while the other parameters are estimated as before using the parametric EVT method based on (re-centred) tail data.

5.4 Estimation of the Scale Matrix

So far, in Theorem 4.1.3 and the inference methods discussed above, it was assumed that the scale matrix is given in the standardized form $\bar{\Omega}$. However, estimating the scale matrix $\Omega$ when the underlying distribution is skewed is not as straightforward as in the case of elliptical symmetry.

To address this problem, first recall that $\Omega = \omega \bar{\Omega} \omega$, where $\omega$ is the diagonal matrix defined in (3.3).
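The decomposition $\Omega = \omega \bar{\Omega} \omega$ can be checked numerically. The sketch below assumes that $\omega$ is the diagonal matrix holding the square roots of the diagonal entries of $\Omega$, which is a common convention for skew-elliptical families but should be compared against (3.3); the function name `standardize_scale` is hypothetical.

```python
import numpy as np


def standardize_scale(scale):
    """Split a scale matrix Omega into (omega, Omega_bar).

    Assumption: omega = diag(sqrt(diag(Omega))), so that Omega_bar has a unit
    diagonal and Omega = omega @ Omega_bar @ omega.
    """
    scale = np.asarray(scale, dtype=float)
    omega = np.diag(np.sqrt(np.diag(scale)))
    omega_inv = np.diag(1.0 / np.diag(omega))
    omega_bar = omega_inv @ scale @ omega_inv
    return omega, omega_bar


# Values chosen to mirror a setting with rho = 0.5 and omega = diag(2, 3).
rho = 0.5
omega_bar_true = np.array([[1.0, rho], [rho, 1.0]])
omega_true = np.diag([2.0, 3.0])
Omega = omega_true @ omega_bar_true @ omega_true

omega, omega_bar = standardize_scale(Omega)
```

Under this convention the off-diagonal entry of the standardized matrix is the parameter $\rho$, so rescaling the data by $\omega^{-1}$ brings the problem back to the standardized setting treated so far.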
Note that if $Z \sim SE_d(\xi, \Omega, \tilde{f}, G \circ w)$, then from (3.9) it follows that $\omega^{-1} Z \sim SE_d(\omega^{-1}\xi, \bar{\Omega}, \tilde{f}, G \circ w)$. A simple approach is to first estimate $\omega$ by, say, one of the robust methods mentioned before (see Rousseeuw and Driessen [1999] and Maronna and Zamar [2002]), and then directly apply the previous inference results to the transformed vector $\omega^{-1} Z$. However, the performance of this approach is not satisfactory, especially when the data are skewed and $\omega$ is far from the identity matrix. Other approaches include:

M1 Estimate $\omega$ via MLE with density $\psi(w)$ in (4.1);
M2 Estimate $\omega$ together with $\xi$ using the iterative parametric EVT method, i.e.,

\[ (\hat{\xi}, \hat{\omega}) = \arg\min_{\xi, \omega} \left\| f_\Theta(\theta; \xi, \omega \mid \hat{\alpha}, \hat{\rho}) - \hat{f}_{\Theta,n}(\theta; \xi, \omega \mid \hat{\alpha}, \hat{\rho}) \right\|_2. \]

The above two approaches require only slight modifications of Programs 2 and 3. It is worth mentioning that method M1 requires simultaneous estimation of too many parameters, making the optimization process hard to control. In Section 5.5, we demonstrate that the second approach (M2) works better than the first one.

5.5 Simulation Studies

In this section we report the results of a simulation study that assesses the performance of the proposed methods. The assessment is based on 1000 samples of size 1000 simulated from two bivariate skew-t distributions, one with a standardized scale matrix $\bar{\Omega}$ and the other with scale matrix $\Omega = \omega \bar{\Omega} \omega$, where $\omega = \mathrm{diag}(2, 3)$. The other distribution parameters are set as $\xi = (3, 1)$, $\alpha = (1, -3)$, $\rho = 0.5$, and $\nu = 2$. In the first case, we do not estimate $\omega$; in other words, we take $\bar{\Omega}$ as given. In the second case, $\omega$ is estimated using the two approaches discussed in the previous section. For the estimation of the conditional probability $\eta(x, y)$, the values of $x$ and $y$ are taken as the theoretical 97.5%, 99.0%, 99.9% and 99.99% marginal quantiles. For each sample and the various values of $x$ and $y$, we evaluated $\hat{\eta}(x, y)$ using different methods:

• AFG ($\hat{\eta}_{AFG}(x, y)$): based on the limit result for elliptical distributions (see Abdous et al. [2005a]), with $\omega$ estimated using a robust method when necessary;
• Parametric EVT ($\hat{\eta}_1(x, y)$): based on the limit result in Theorem 4.1.3 and the parametric EVT method (Program 2), with $\omega$ estimated using M1 when necessary;
• Iterative parametric EVT ($\hat{\eta}_2(x, y)$): based on the limit result in Theorem 4.1.3 and the iterative parametric EVT method (Program 3), with $\omega$ estimated using M2 when necessary;
• Empirical ($\hat{\eta}_{emp}(x, y)$): based on the empirical distribution;
• True ($\eta(x, y)$): the true value computed by numerical integration using the preset parameters;
• Limiting ($\eta_{lim}(x, y)$): the true value of the limit stated in Theorem 4.1.3 using the preset parameters.

The simulation results are presented in Tables 5.2 and 5.3. They show that, under the considered simulation settings, the AFG method greatly underestimates the conditional probability $\eta(x, y)$. The empirical method provides good estimates of $\eta(x, y)$ for quantile levels that are moderate relative to the sample size, albeit with a much larger standard deviation than the other methods. For the very extreme quantile levels, no proper empirical estimates could be obtained. The two methods proposed in this project produce accurate estimates of $\eta(x, y)$ with reasonable standard errors, and clearly improve on the AFG method in terms of accuracy. The iterative parametric EVT estimator $\hat{\eta}_2(x, y)$ achieves gains in accuracy over the parametric EVT estimator $\hat{\eta}_1(x, y)$ owing to better estimation of the location parameter. A similar simulation for elliptically symmetric distributions was also conducted (see Table B.1), in which our estimator $\hat{\eta}_2(x, y)$ showed a performance similar to that of $\hat{\eta}_{AFG}(x, y)$. (For comparison, we also report simulation results for a bivariate skew-t distribution with a lighter tail ($\nu = 20$) in Table B.2. We can see that the limiting value $\eta_{lim}(x, y)$ is still close to the true value $\eta(x, y)$, but the estimate $\hat{\eta}_2(x, y)$ is not. This is due to poor estimation of the tail index when the tail decays fast. The method of Abdous et al. [2005a] suffers from the same problem.)

Figure 5.3 compares the true contour $\{z \in \mathbb{R}^2 \mid \psi_{st}(z) < 1\}$ with the estimated contours $\{z \in \mathbb{R}^2 \mid \hat{\psi}_{st}(z) < 1\}$ for one of the simulated scenarios using the iterative parametric EVT method and the method discussed in Abdous et al. [2005a]. It can easily be observed that the proposed method is more flexible in capturing asymmetric shapes of the data clouds.

Figure 5.3: Comparison of the true level curve $\{\psi_{st}(z) < 1\}$ with the estimated contours $\{\hat{\psi}_{st}(z) < 1\}$ using the AFG method and the proposed iterative parametric EVT method. The data are generated from a bivariate skew-t distribution with parameters $\xi = (3, 1)$, $\alpha = (1, -3)$, $\rho = 0.5$, $\nu = 2$, and $\omega = \mathrm{diag}(1, 1)$ (left panel) or $\omega = \mathrm{diag}(2, 3)$ (right panel).

Table 5.2: Simulation results based on 1000 samples of size 1000 from a bivariate skew-t distribution with parameters $\xi = (3, 1)$, $\alpha = (1, -3)$, $\nu = 2$, $\rho = 0.5$ and a standardized scale matrix. Each cell provides the average (standard deviation) of the estimates of $\eta(x, y)$ under various methods; see Section 5.5 for details. For $\hat{\eta}_{AFG}(x, y)$, $\hat{\eta}_1(x, y)$, $\hat{\eta}_2(x, y)$ and $\eta_{lim}(x, y)$, we used $z = y/x - \rho$ in the limit results.
Values of $x$ and $y$ are chosen as the theoretical marginal quantiles with probability $p$, where $p$ labels columns (quantile of $x$) and rows (quantile of $y$).

Quantile y  Estimator                        Quantile x
                                             97.5%           99.0%           99.9%           99.99%

97.5%       $\hat{\eta}_{AFG}(x, y)$         0.239 (0.019)   0.215 (0.019)   0.179 (0.018)   0.165 (0.017)
            $\hat{\eta}_1(x, y)$             0.605 (0.046)   0.562 (0.046)   0.490 (0.044)   0.460 (0.043)
            $\hat{\eta}_2(x, y)$             0.693 (0.048)   0.641 (0.048)   0.553 (0.046)   0.517 (0.045)
            $\hat{\eta}_{emp}(x, y)$         0.653 (0.098)   0.588 (0.168)   *               *
            $\eta(x, y)/\eta_{lim}(x, y)$    0.652/0.685     0.587/0.633     0.518/0.544     0.498/0.508

99.0%       $\hat{\eta}_{AFG}(x, y)$         0.273 (0.020)   0.238 (0.019)   0.186 (0.018)   0.167 (0.017)
            $\hat{\eta}_1(x, y)$             0.659 (0.045)   0.603 (0.046)   0.506 (0.044)   0.465 (0.043)
            $\hat{\eta}_2(x, y)$             0.755 (0.046)   0.691 (0.048)   0.573 (0.047)   0.523 (0.045)
            $\hat{\eta}_{emp}(x, y)$         0.761 (0.089)   0.658 (0.165)   *               *
            $\eta(x, y)/\eta_{lim}(x, y)$    0.761/0.749     0.661/0.683     0.540/0.564     0.505/0.514

99.9%       $\hat{\eta}_{AFG}(x, y)$         0.507 (0.024)   0.399 (0.022)   0.234 (0.019)   0.181 (0.018)
            $\hat{\eta}_1(x, y)$             0.863 (0.030)   0.794 (0.038)   0.596 (0.046)   0.495 (0.044)
            $\hat{\eta}_2(x, y)$             0.932 (0.022)   0.884 (0.032)   0.683 (0.048)   0.559 (0.046)
            $\hat{\eta}_{emp}(x, y)$         0.963 (0.039)   0.911 (0.095)   *               *
            $\eta(x, y)/\eta_{lim}(x, y)$    0.964/0.930     0.915/0.882     0.666/0.675     0.542/0.550

99.99%      $\hat{\eta}_{AFG}(x, y)$         0.880 (0.022)   0.802 (0.026)   0.439 (0.022)   0.231 (0.019)
            $\hat{\eta}_1(x, y)$             0.977 (0.008)   0.961 (0.012)   0.823 (0.035)   0.592 (0.046)
            $\hat{\eta}_2(x, y)$             0.991 (0.004)   0.984 (0.007)   0.905 (0.028)   0.677 (0.048)
            $\hat{\eta}_{emp}(x, y)$         0.996 (0.012)   0.991 (0.030)   *               *
            $\eta(x, y)/\eta_{lim}(x, y)$    0.996/0.990     0.991/0.983     0.916/0.903     0.667/0.670

Table 5.3: Simulation results based on 1000 samples of size 1000 from a bivariate skew-t distribution with parameters $\xi = (3, 1)$, $\alpha = (1, -3)$, $\nu = 2$, $\rho = 0.5$ and $\omega = \mathrm{diag}(2, 3)$ for the scale matrix. Each cell provides the average (standard deviation) of the estimates of $\eta(x, y)$ under various methods; see Section 5.5 for details. For $\hat{\eta}_{AFG}(x, y)$, $\hat{\eta}_1(x, y)$, $\hat{\eta}_2(x, y)$ and $\eta_{lim}(x, y)$ we used $z = \omega_1 y/(\omega_2 x) - \rho$ in the limit results.
Values of $x$ and $y$ are chosen as the theoretical marginal quantiles with probability $p$, where $p$ labels columns (quantile of $x$) and rows (quantile of $y$).

Quantile y  Estimator                        Quantile x
                                             97.5%           99.0%           99.9%           99.99%

97.5%       $\hat{\eta}_{AFG}(x, y)$         0.236 (0.035)   0.210 (0.027)   0.178 (0.018)   0.168 (0.017)
            $\hat{\eta}_1(x, y)$             0.540 (0.049)   0.491 (0.047)   0.425 (0.046)   0.402 (0.045)
            $\hat{\eta}_2(x, y)$             0.627 (0.050)   0.577 (0.050)   0.510 (0.049)   0.485 (0.048)
            $\hat{\eta}_{emp}(x, y)$         0.653 (0.098)   0.588 (0.168)   *               *
            $\eta(x, y)/\eta_{lim}(x, y)$    0.652/0.648     0.587/0.597     0.518/0.526     0.498/0.501

99.0%       $\hat{\eta}_{AFG}(x, y)$         0.286 (0.0491)  0.241 (0.037)   0.187 (0.020)   0.170 (0.017)
            $\hat{\eta}_1(x, y)$             0.623 (0.048)   0.549 (0.049)   0.444 (0.046)   0.408 (0.045)
            $\hat{\eta}_2(x, y)$             0.708 (0.048)   0.635 (0.049)   0.529 (0.049)   0.492 (0.048)
            $\hat{\eta}_{emp}(x, y)$         0.761 (0.089)   0.658 (0.165)   *               *
            $\eta(x, y)/\eta_{lim}(x, y)$    0.761/0.733     0.661/0.657     0.540/0.547     0.505/0.508

99.9%       $\hat{\eta}_{AFG}(x, y)$         0.598 (0.100)   0.457 (0.086)   0.245 (0.038)   0.186 (0.020)
            $\hat{\eta}_1(x, y)$             0.891 (0.026)   0.810 (0.037)   0.556 (0.049)   0.442 (0.046)
            $\hat{\eta}_2(x, y)$             0.929 (0.021)   0.870 (0.033)   0.642 (0.050)   0.527 (0.049)
            $\hat{\eta}_{emp}(x, y)$         0.963 (0.039)   0.911 (0.095)   *               *
            $\eta(x, y)/\eta_{lim}(x, y)$    0.964/0.945     0.915/0.893     0.666/0.665     0.542/0.544

99.99%      $\hat{\eta}_{AFG}(x, y)$         0.915 (0.055)   0.846 (0.078)   0.489 (0.090)   0.246 (0.038)
            $\hat{\eta}_1(x, y)$             0.985 (0.006)   0.970 (0.010)   0.832 (0.034)   0.557 (0.048)
            $\hat{\eta}_2(x, y)$             0.991 (0.004)   0.981 (0.007)   0.887 (0.030)   0.644 (0.049)
            $\hat{\eta}_{emp}(x, y)$         0.996 (0.012)   0.991 (0.030)   *               *
            $\eta(x, y)/\eta_{lim}(x, y)$    0.996/0.994     0.991/0.987     0.916/0.908     0.667/0.666

Chapter 6
Financial Applications

6.1 Financial Contagion

In this section, we illustrate a practical application in the context of financial contagion at the domestic level, for which estimation of the extreme conditional excess probability is needed. We investigate the extreme conditional excess probability $1 - \eta(x, y) = P(L^j > y \mid L^s > x)$, where $L^s$ is the daily loss on a portfolio which is a proxy for the aggregate financial system, and $L^j$ is the daily loss on a stock or a portfolio.
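For reference, the purely empirical counterpart of this probability can be sketched in a few lines of Python. The function name `empirical_cond_excess` and the toy numbers are hypothetical; the sketch only illustrates why an empirical estimate becomes unavailable once no observation of the system loss exceeds the threshold $x$.

```python
import numpy as np


def empirical_cond_excess(loss_s, loss_j, x, y):
    """Empirical estimate of P(L_j > y | L_s > x).

    Returns the estimate together with the number of conditioning
    observations; the estimate is NaN when no system loss exceeds x.
    """
    loss_s = np.asarray(loss_s, dtype=float)
    loss_j = np.asarray(loss_j, dtype=float)
    cond = loss_s > x                      # days on which the system loss is extreme
    n_cond = int(cond.sum())
    if n_cond == 0:
        return float("nan"), 0             # nothing to condition on
    return float(np.mean(loss_j[cond] > y)), n_cond


# Toy data, for illustration only.
system_losses = [1.0, 2.0, 3.0, 4.0, 5.0]
stock_losses = [0.0, 1.0, 5.0, 2.0, 6.0]
prob, n_cond = empirical_cond_excess(system_losses, stock_losses, x=2.5, y=3.0)
```

With one year of daily data, a threshold $x$ at the 99.9% quantile leaves essentially no conditioning observations, which motivates the asymptotic approach used in this section.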
This probability quantifies the tail risk of a stock when the overall financial market encounters a severe negative shock. It can also be interpreted as a measure of the extreme dependence between the stock and the aggregate financial system.

We consider a partial panel of the financial institutions studied in Acharya et al. [2010], with complete data between June 1, 2006 and May 31, 2008. The period from June 1, 2006 to May 31, 2007 is defined as the pre-crisis window (the same as in Girardi and Tolga [2013]), and the remaining year is the crisis window. The panel contains U.S. financial firms in the depositories industry with a market capitalization greater than 5 billion USD as of the end of June 2007. The Dow Jones US Financials Index (DJUSFN) is used as a proxy for the aggregate financial system. The daily prices and capitalization information are extracted from Yahoo Finance.

Daily losses are calculated as negative log returns. Owing to the presence of serial dependence and volatility clustering in daily financial returns, it is common to first filter the data using the AR(1)-GARCH(1,1) model. The residuals can then be treated as a sequence of independent and identically distributed random variables.

Table 6.1 reports estimation results for the filtered data. For each institution in the pre-crisis period, we estimate $1 - \eta(x, y)$ using the AFG method and the iterative parametric EVT method (denoted, as before, $1 - \hat{\eta}_{AFG}(x, y)$ and $1 - \hat{\eta}_2(x, y)$, respectively), together with the estimated marginal survival probability $\hat{P}(L^j > y)$. The latter is estimated by fitting the Generalized Pareto distribution to threshold excesses using the maximum-likelihood method (see, e.g., Coles [2001] for details). The values of $x$ and $y$ are, respectively, the 99.9% quantile of the losses on DJUSFN and the average of the 99% quantiles of the losses on the cross-sectional stocks (3.487% and 2.722% for the pre-crisis period, and 4.587% and 5.761% for the crisis period). The right columns in the panel "pre-crisis" present the parameter estimates used to evaluate $1 - \hat{\eta}_2(x, y)$. The "crisis" column reports the estimated extreme conditional excess probability during the crisis period.

As indicated by the estimates of the tail index $\nu$, the bivariate joint distributions of losses on the stocks and the DJUSFN index exhibit a fairly slow radial decay, with most values of $\hat{\nu}$ in the range from 2.5 to 4.5. (The Hill estimators (Hill [1975]) for linear combinations of losses on the stocks and the DJUSFN index have a similar tail index range; hence, the assumption of multivariate regular variation is reasonable in this example.) Based on the estimates of the shape parameter $\alpha = (\alpha_1, \alpha_2)$, there is evidence of skewness in the joint distribution of the index and stock returns over the considered one-year estimation window. It can be noted that the estimates of the skewness parameter $\alpha_1$, corresponding to the DJUSFN index, tend to fluctuate considerably, contrary to the intuition that these estimates should be fairly stable and reflect the skewness in the index data. Partly, this can be explained by the fact that $\alpha_1$ is not the marginal parameter. If the data on stock and index losses were to come from a bivariate skew-t distribution, one could convert the bivariate skewness parameter $\alpha = (\alpha_1, \alpha_2)$ to the marginal parameters $\bar{\alpha}_1$ and $\bar{\alpha}_2$ using Equation 4.4. Indeed, $\hat{\bar{\alpha}}_1$ should be fairly constant, as the marginal corresponding to the index is the same for all pairs of data; $\hat{\alpha}_1$ is determined by $\hat{\alpha}_2$ and the parameters of the scale matrix. Another reason for the large variation in the values of $\hat{\alpha}_1$ is the level of sampling variability when estimating the tail of the multivariate distribution. However, the majority of the estimates of $\alpha_1$ do tend to be positive, thus indicating that the losses on the index are likely to be right-skewed. This is in line with the skewness estimate calculated using the index data alone.
Finally, it is worth pointing out that a similar issue of large variability in the skewness parameter estimates was present in the simulation studies. At the same time, the estimates of the conditional probability, the ultimate quantity of interest here, were fairly stable and close to the true values.

Comparing $1 - \hat{\eta}_{AFG}$ and $1 - \hat{\eta}_2$, we see that under the assumption of elliptical symmetry, the estimates of the conditional exceedance probability are, in most cases, larger than those obtained when the skewness in the joint distribution of losses is accounted for. We select Hudson City Bancorp, which has a small difference between $1 - \hat{\eta}_{AFG}$ and $1 - \hat{\eta}_2$, and Peoples Bank Bridgeport, which has a large difference, and plot their fitted contours $\{z \in \mathbb{R}^2 \mid \hat{\psi}_{st}(z) < 1\}$ in Figure 6.1. It can be observed that the elliptical contour under the AFG method provides a poor fit to the data cloud. Hence, in this particular example, ignoring the asymmetry leads to inaccurate estimation of $\eta(x, y)$. We also note that, since the conditional exceedance probabilities are greater than the corresponding marginal exceedance probabilities for $Y$, the data clearly exhibit contagion from the DJUSFN index to each institution. Lastly, the results also show that extremal dependence cannot be captured by the AR(1)-GARCH(1,1) process. This is in line with the findings in Bae et al. [2003], where the counts of coincident extreme daily returns across international equity markets could not be explained by the AR(1)-GARCH(1,1) filter.

To compare financial contagion between the pre-crisis and crisis periods, we report the estimated extreme conditional excess probability in the crisis period using the iterative parametric EVT method (the last column of Table 6.1). It is interesting to observe that the extreme conditional excess probability drops substantially during the financial crisis, indicating that extreme dependence between the market and individual stocks decreases in a downside market.
One possible explanation is that firms start to watch their risk exposures when they realize they are in a dangerous position. This result is in accordance with the findings in Adrian and Brunnermeier [2011], who measure systemic risk using CoVaR, that contemporaneous risk measures are procyclical.

Figure 6.1: Estimated contours $\{\hat{\psi}_{st}(z) < 1\}$ for daily losses on DJUSFN versus Hudson City Bancorp (HCBK, left panel) and Peoples Bank Bridgeport (PBCT, right panel) between June 1, 2006 and May 31, 2007 using the AFG method and the proposed iterative parametric EVT method.

Table 6.1: Point estimation of the extreme conditional excess probability for 28 financial institutions. Daily losses are computed using log returns and are filtered by the AR(1)-GARCH(1,1) process. The threshold values of $x$ and $y$ are, respectively, the 99.9% quantile of losses on DJUSFN and the average of the 99% quantiles of losses on the cross-sectional stocks. The sample period is from June 1, 2006 to May 31, 2008. The pre-crisis period is from June 1, 2006 to May 31, 2007, and the crisis period is from June 1, 2007 to May 31, 2008. Columns 1-AFG to rho refer to the pre-crisis period: 1-AFG is $1 - \hat{\eta}_{AFG}$, 1-EVT is $1 - \hat{\eta}_2$, P(Lj>y) is $\hat{P}(L^j > y)$, and nu, alpha1, alpha2, rho are $\hat{\nu}$, $\hat{\alpha}_1$, $\hat{\alpha}_2$, $\hat{\rho}$. The last column is $1 - \hat{\eta}_2$ in the crisis period.

                                   Pre-crisis                                                  Crisis
No. Company                        1-AFG   1-EVT   P(Lj>y)  nu      alpha1   alpha2   rho     1-EVT
 1  BANK NEW YORK                  0.856   0.722   0.008    2.815    0.177   -0.385   0.790   0.224
 2  BANK OF AMERICA CORP           0.675   0.778   0.006    3.444   -0.448    0.346   0.785   0.198
 3  B B & T CORP                   0.703   0.859   0.003    3.113   -0.524    0.568   0.811   0.291
 4  CITIGROUP                      0.790   0.515   0.008    4.248    0.648   -0.770   0.604   0.437
 5  COMERICA                       0.781   0.508   0.012    4.042    0.052    0.056   0.509   0.330
 6  COMMERCE BANCORP               0.814   0.583   0.039    2.751    0.623   -0.500   0.564   0.042
 7  HUDSON CITY BANCORP            0.522   0.527   0.003    3.319    0.028    0.082   0.555   0.154
 8  HUNTINGTON BANCSHARES          0.718   0.678   0.008    2.850    0.136   -0.413   0.829   0.325
 9  J P MORGAN CHASE & CO          0.871   0.838   0.011    3.853    0.262   -0.280   0.809   0.096
10  KEYCORP                        0.777   0.649   0.010    3.513    0.413   -0.250   0.700   0.512
11  MARSHALL & ILSLEY CORP         0.749   0.737   0.007    2.809   -0.608    0.708   0.676   0.469
12  M & T BANK CORP                0.818   0.630   0.014    3.448    0.125    0.023   0.697   0.518
13  NATIONAL CITY CORP             0.728   0.649   0.018    3.161    0.030    0.049   0.596   0.020
14  NEW YORK COMMUNITY BANCORP     0.521   0.364   0.000    6.051   -0.681    0.344   0.357   0.094
15  NORTHERN TRUST CORP            0.827   0.733   0.014    3.360    0.179   -0.052   0.725   0.300
16  PEOPLES BANK BRIDGEPORT        0.815   0.530   0.028    2.469    0.548   -0.145   0.593   0.043
17  PNC FINANCIAL SERVICES GRP     0.643   0.428   0.010    4.459    0.332   -0.189   0.493   0.122
18  REGIONS FINANCIAL CORP         0.700   0.528   0.006    3.767   -0.294   -0.297   0.693   0.332
19  SOVEREIGN BANCORP              0.650   0.590   0.025    3.596   -0.044    0.300   0.461   0.365
20  STATE STREET CORP              0.846   0.718   0.023    2.827    0.735   -0.737   0.718   0.265
21  SUNTRUST BANKS                 0.629   0.448   0.008    3.943   -0.014   -0.000   0.507   0.372
22  SYNOVUS FINANCIAL CORP         0.693   0.583   0.001    3.522    0.009    0.008   0.683   0.324
23  UNIONBANCAL CORP               0.692   0.668   0.021    2.795    0.001    0.003   0.684   0.296
24  U S BANCORP DEL                0.435   0.468   0.000    3.182   -0.145   -0.157   0.735   0.190
25  WACHOVIA CORP                  0.720   0.813   0.006    4.373    0.005    0.007   0.773   0.457
26  WASHINGTON MUTUAL              0.830   0.730   0.022    3.380   -0.019    0.264   0.564   0.540
27  WELLS FARGO & CO               0.873   0.593   0.024    2.035    0.806   -1.059   0.734   0.270
28  ZIONS BANCORP                  0.652   0.410   0.010    3.859    0.206   -0.086   0.532   0.387

6.2 CoVaR

Conditional Value-at-Risk (CoVaR), introduced by Adrian and Brunnermeier [2011], is defined as the Value-at-Risk (VaR) of one portfolio conditional on one institution being in financial distress. Given the losses (negative returns) $L^j$ (or $L^s$) of an institution (or a portfolio) and the confidence level $q_1$, $\mathrm{VaR}^j_{q_1}$ is defined as the upper $q_1$-quantile of the loss distribution:

\[ P(L^j \ge \mathrm{VaR}^j_{q_1}) = q_1, \]

and $\mathrm{CoVaR}^{s|j}_{q_1}$ is defined as the upper $q_1$-quantile of the conditional loss distribution:

\[ P(L^s \ge \mathrm{CoVaR}^{s|j}_{q_1} \mid L^j \ge \mathrm{VaR}^j_{q_2}) = q_1. \qquad (6.1) \]

Unlike VaR, which only considers the individual risk faced by an institution, CoVaR accounts for the possible contribution of each institution to the overall system risk.
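To make definition (6.1) concrete, the following Python sketch computes purely empirical versions of VaR and CoVaR from paired loss samples. It is only an illustration of the definitions: the function names `empirical_var` and `empirical_covar` are hypothetical, upper-tail quantiles are taken with NumPy's default interpolation, and this is not the EVT-based estimator developed in this thesis.

```python
import numpy as np


def empirical_var(loss, q):
    """Upper-tail VaR: the level v such that P(loss >= v) is approximately q."""
    return float(np.quantile(np.asarray(loss, dtype=float), 1.0 - q))


def empirical_covar(loss_s, loss_j, q1, q2):
    """Empirical CoVaR of the system given that institution j is in distress.

    Distress is defined as loss_j >= VaR of institution j at level q2;
    CoVaR is the upper q1-quantile of the system losses on those days.
    """
    var_j = empirical_var(loss_j, q2)
    distressed = np.asarray(loss_s)[np.asarray(loss_j) >= var_j]
    return empirical_var(distressed, q1), var_j


# Toy comonotone example: system and institution losses move together.
toy_losses = list(range(100))
covar, var_j = empirical_covar(toy_losses, toy_losses, q1=0.5, q2=0.1)
```

In this toy example the conditional quantile of the system loss is far above its unconditional counterpart, which is the spillover effect that CoVaR is designed to capture. For very small $q_2$ the conditioning set becomes nearly empty, so an empirical estimate of this kind is no longer usable.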
Additionally, it provides a way to capture the risk spillovers among institutions.

Girardi and Tolga [2013] define the systemic risk of an institution using $\Delta\mathrm{CoVaR}^{s|j}_{q_1}$, the change between its CoVaR in the benchmark state (defined as a one-standard-deviation event) and its CoVaR under financial distress:

\[ \Delta\mathrm{CoVaR}^{s|j}_{q_1} = 100 \times \frac{\mathrm{CoVaR}^{s|j}_{q_1} - \mathrm{CoVaR}^{s|b^j}_{q_1}}{\mathrm{CoVaR}^{s|b^j}_{q_1}}, \qquad (6.2) \]

where $b^j$ is the benchmark state, which is one standard deviation about the mean: $L^j \in (\mu^j - \sigma^j, \mu^j + \sigma^j)$, where $\mu^j$ and $\sigma^j$ are, respectively, the conditional mean and the standard deviation of institution $j$'s losses. Specifically, $P(L^s \ge \mathrm{CoVaR}^{s|b^j}_{q_1} \mid \mu^j - \sigma^j \le L^j \le \mu^j + \sigma^j) = q_1$.

In their examples, $q_1$ and $q_2$ are set to 5%, and the joint dynamics of losses are estimated using a bivariate GARCH model with the DCC specification of Engle [2002]. To take skewness and kurtosis into consideration, they report results for both Gaussian and skew-t innovations. Despite that, their approach still suffers from the common shortcomings of estimating tail probabilities using parametric models. Specifically, when one is interested in CoVaR with a very small $q_2$, that is, in the impact of one institution's bankruptcy on the overall financial network, their specification cannot handle tail estimation well. However, the fact that the 2008 financial crisis was partially triggered by the bankruptcy of Lehman Brothers makes this question essential for macroprudential supervision and regulation.

Our method could help resolve this problem. We continue to consider the financial institutions listed in Table 6.1 (this group of institutions is the same as the "Depositories" group in Girardi and Tolga [2013]). The objective is to calculate the $\Delta$CoVaR of the DJUSFN index (with losses denoted $L^s$) when one of the institutions (with losses denoted $L^j$) is in financial distress. The sample period is from June 1, 2006 to May 31, 2007.

To fit in with our methodology, we adopt the estimation procedure described below. First, the losses $L^j_t$ are filtered by the AR(1)-GARCH(1,1) model:

\[ L^j_t = \mu^j_t + \sigma_{j,t} Z^j_t, \]

where $Z^j_t$ are i.i.d. random variables with a univariate standard Gaussian distribution, and $\mu^j_t$ and $\sigma_{j,t}$ are measurable with respect to the sigma algebra $\mathcal{F}_{t-1}$, representing the information about the process $\{L^j_t\}$ available up to time $t-1$. In order to capture the typical time dynamics of financial time series, one possibility is to assume that the conditional mean $\mu^j_t$ follows an AR(1) process

\[ \mu^j_t = \alpha_0 + \alpha_1 L^j_{t-1}, \]

while the conditional variance $\sigma^2_{j,t}$ evolves according to a GARCH(1,1) model specification

\[ \sigma^2_{j,t} = \beta^j_0 + \beta^j_1 (\sigma_{j,t-1} Z^j_{t-1})^2 + \beta^j_2 \sigma^2_{j,t-1}. \]

Since it is generally agreed that loss series exhibit both skewness and (excess) kurtosis, the (normalized) residuals $\hat{Z}^j_t = (L^j_t - \hat{\mu}^j_t)/\hat{\sigma}_{j,t}$, where $\hat{\mu}^j_t$ and $\hat{\sigma}_{j,t}$ are estimates of $\mu^j_t$ and $\sigma_{j,t}$, do not follow the Gaussian distribution perfectly. Hence, we compute $\mathrm{VaR}^j_{q_2,t}$ of each institution using the EVT approach introduced by McNeil and Frey [2000]. This approach is slightly different from that in Girardi and Tolga [2013], who directly adopted the skew-t distribution when estimating the parameters of the AR(1)-GARCH(1,1) filter and then took the quantile of the skew-t distribution as the estimate of $\mathrm{VaR}^j_{q_2,t}$. The rationale for separating the estimation process is that the conditional variance is a feature of the whole distribution, while $\mathrm{VaR}^j_{q_2,t}$ is related to tail observations. A similar two-stage estimation is adopted by McNeil and Frey [2000] and Diebold et al. [2000].

Next, instead of specifying a bivariate parametric GARCH model, we simply assume that the standardized residuals $\hat{Z}_t = (\hat{Z}^s_t, \hat{Z}^j_t)$ jointly follow a certain skew-elliptical distribution, and then model them using the iterative parametric EVT method. Specifically, $\mathrm{CoVaR}^{Z^s|Z^j}_{q_1,t}$ can be estimated by solving the equation below, to which Theorem 4.1.3 can be applied:

\[ P(\hat{Z}^s_t \ge \mathrm{CoVaR}^{Z^s|Z^j}_{q_1,t} \mid \hat{Z}^j_t \ge \mathrm{VaR}^{Z^j}_{q_2,t}) = q_1. \qquad (6.3) \]

Then the forecast of $\mathrm{CoVaR}^{s|j}_{q_1,t}$ is calculated with the predictions $\hat{\mu}^s_{t+1}$ and $\hat{\sigma}_{s,t+1}$ from the AR(1)-GARCH(1,1) model:

\[ \mathrm{CoVaR}^{s|j}_{q_1,t} = \hat{\mu}^s_{t+1} + \hat{\sigma}_{s,t+1}\, \widehat{\mathrm{CoVaR}}^{Z^s|Z^j}_{q_1,t}. \]

For the benchmark case, we adopt the empirical approach to compute $\widehat{\mathrm{CoVaR}}^{Z^s|Z^{b^j}}_{q_1,t}$ and forecast $\mathrm{CoVaR}^{s|b^j}_{q_1,t}$ using the above equation.

Table 6.2 reports the summary statistics for the cross-sectional daily conditional $\Delta\mathrm{CoVaR}^{s|j}_{q_1,t}$ over the sample period for $q_1 = 5\%$ and $q_2 = 5\%$, 1% and 0.01%. (The summary statistics are robust to the different methods adopted to model $\mathrm{VaR}^j_{q_2,t}$ and the benchmark case; other results can be found in Appendix B.2.) "EVT" indicates that the bivariate $\hat{Z}_t$ is modelled using the iterative parametric EVT method; for comparison, we also report results when $\mathrm{CoVaR}^{s|j}_{q_1,t}$ is estimated assuming that $\hat{Z}_t$ follows a bivariate skew-t distribution, as well as the empirical distribution. (When the tail index $\nu$ estimated by the EVT method is greater than 10, we drop the estimates of the "EVT" method, because the performance of the iterative parametric EVT method is not satisfactory when the tail decays fast. In total, 40 out of 7000 values are dropped.) The numbers in column "Mean" give the increase, on average across all the institutions, in the CoVaR at level $q_1$ of the aggregate financial system given that an institution experiences a loss in excess of its VaR at level $q_2$, in comparison to when the institution is in its benchmark state. For example, when institutions are subject to large losses exceeding their 5% VaR, the 5% CoVaR of the aggregate financial system increases on average by 139% over its 5% CoVaR when the institutions are in their benchmark state, based on the "EVT" method. "Std.TS", which gives the average of the standard deviations of the individual $\Delta\mathrm{CoVaR}^{s|j}_{q_1,t}$, is a proxy for the volatility of the systemic risk contribution over time.
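The summary columns of Table 6.2 can be sketched as follows for a hypothetical panel of daily $\Delta$CoVaR values with one row per day and one column per institution. The function name `summarize_delta_covar`, the use of the sample standard deviation (`ddof=1`), and taking "Max" and "Min" over the whole panel are assumptions made for this illustration; here the cross-sectional dispersion is computed as the standard deviation, across institutions, of the time-averaged values.

```python
import numpy as np


def summarize_delta_covar(panel):
    """Summary statistics for a (T, N) panel of Delta-CoVaR values in percent.

    Rows are days and columns are institutions. Conventions (ddof=1, overall
    max/min) are assumptions of this sketch.
    """
    panel = np.asarray(panel, dtype=float)
    mean_by_inst = panel.mean(axis=0)                        # time average per institution
    return {
        "Mean": float(mean_by_inst.mean()),                  # average across institutions
        "Std.TS": float(panel.std(axis=0, ddof=1).mean()),   # mean of time-series std devs
        "Std.CS": float(mean_by_inst.std(ddof=1)),           # dispersion of institution means
        "Max": float(panel.max()),
        "Min": float(panel.min()),
    }


# Toy panel: 3 days, 2 institutions (values in percent, illustration only).
toy_panel = [[100.0, 200.0], [120.0, 220.0], [140.0, 240.0]]
stats = summarize_delta_covar(toy_panel)
```

Separating the two standard deviations in this way distinguishes how much each institution's contribution moves over time from how much the average contribution differs between institutions.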
"Std.CS", which reports the standard deviation of the mean of each individual $\Delta\mathrm{CoVaR}^{s|j}_{q_1,t}$, is a proxy for the dispersion of the average systemic risk contribution.

Summary statistics for $q_2 = 5\%$ in row "skew-t" are quite similar to those reported in Table 6 of Girardi and Tolga [2013]. Comparing these with the results in row "EVT", it is clear that $\Delta\mathrm{CoVaR}^{s|j}_{q_1,t}$ estimated using the EVT method exhibits a much higher standard deviation and a much wider range of values; for instance, Std.TS goes from 31.1 to 41.9. This suggests that the impact of one financial institution's distress on the aggregate financial system is estimated more conservatively by the parametric (skew-t) model in comparison to the proposed EVT method. When $q_2$ decreases from 0.05 to 0.01, both Std.TS and Std.CS increase. This reveals that the uncertainty of the risk spillovers among institutions rises when one institution faces severe financial distress.

Table 6.2: Summary statistics for the cross-sectional $\Delta\mathrm{CoVaR}^{s|j}_{q_1,t}$ for all institutions during the sample period from June 1, 2006 to May 31, 2007. Level $q_1$ is set to 5%, and $q_2$ is 5%, 1% or 0.01%. "EVT" and "skew-t", respectively, refer to the use of the iterative EVT method and the bivariate skew-t distribution to model the sequence of standardized residuals $\{\hat{z}_t\}$. Estimation based on the empirical distribution is reported under "Empirical".
Column "Std.TS" reports the average of the standard deviations of the individual $\Delta\mathrm{CoVaR}^{s|j}_{q_1,t}$ and column "Std.CS" reports the standard deviation of the mean of each individual $\Delta\mathrm{CoVaR}^{s|j}_{q_1,t}$ measure.

q2      Method      Mean(%)   Std.TS   Std.CS   Max(%)     Min(%)
5%      Skew-t      178.3     31.1     36.5     416.7      70.0
        EVT         139.0     41.9     41.8     534.1      -21.4
        Empirical   161.4     79.4     40.4     491.5      -17.8
1%      Skew-t      303.3     59.7     70.2     748.5      109.0
        EVT         306.9     80.4     96.9     1172.5     48.3
0.01%   EVT         1409.1    897.6    1332.6   14333.6    89.4

Chapter 7
Conclusion

On the theoretical side, we derived a limit expression for the conditional probability given that one of the components of a bivariate random vector is extreme, assuming a sub-family of skew-elliptical distributions with regularly varying tails as the underlying model. This extends the results in Abdous et al. [2005a] for elliptically symmetric vectors. We also developed a semi-parametric EVT method to estimate this conditional probability using the above-mentioned asymptotic result. The main advantage of EVT-based estimators is that they preserve useful information in the tail data without restricting the behaviour of the central part. This methodology allows one to assess extreme risk in asymmetric financial markets. Through two financial applications, we demonstrated how our method can be applied flexibly in different contexts.

In this project, we assume that the underlying distribution has a multivariate regularly varying tail. It would be of interest to develop a methodology to handle light-tailed data with rapidly varying tails. For financial time series, this would apply to less frequently sampled data such as weekly or monthly returns. Additionally, one assumption of Theorem 4.1.3 is closure under marginalization.
However, it is still not clear what conditions onskew-elliptical distributions could guarantee this assumption.Regarding financial applications, several more topics can be explored.Firstly, DiTraglia and Gerlach [2013] claim that the extreme conditional ex-ceedance probability contains important information for risk-averse investorswhich can be a valuable tool to select portfolios. Our method provides amore accurate estimation of the extreme conditional exceedance probabil-ity, and hence can be used to further verify their conclusion. Secondly, itis interesting to investigate the financial contagion problem in international52Chapter 7. Conclusionequity markets, similar to Kenourgios et al. [2011]. Instead of adopting allobservations, our method focuses more on the tail data and is more suitablefor analysing extreme events such as financial crises.53BibliographyB. Abdous, A.L. Fouge`res, and K. Ghoudi. Extreme behaviour for bivariateelliptical distributions. Canadian Journal of Statistics, 33:317–334, 2005a.B. Abdous, C. Genest, and B. Re´millard. Dependence properties of meta-elliptical distributions. In P. Duchesne and B. RAMillard, editors, InStatistical Modeling and Analysis for Complex Data Problems, pages 1–15. Springer, 2005b.V.V. Acharya, L.H. Pedersen, T. Philippon, and M. Richardson. Measuringsystemic risk. NYU Working Paper, 2010.C. Adcock, M. Eling, and N. Loperfido. Skewed distributions in financeand actuarial science: a review. The European Journal of Finance, iFirst:1–29, 2012.T. Adrian and M.K. Brunnermeier. CoVaR. Technical report, NationalBureau of Economic Research, 2011.L. Alles and L. Murray. Rewards for downside risk in Asian markets. Journalof Banking & Finance, 37:2501–2509, 2013.R. Aloui, M.S.B.N. Aissa, and D.K. Nguyen. Global financial crisis, extremeinterdependences, and contagion effects: The role of economic structure?Journal of Banking & Finance, 35:130–141, 2011.A. Azzalini and A. Capitanio. 
Distributions generated by perturbation of symmetry with emphasis on a multivariate skew t distribution. Journal of the Royal Statistical Society: Series B, 65:367–389, 2003.

K.H. Bae, G.A. Karolyi, and R.M. Stulz. A new approach to measuring financial contagion. Review of Financial Studies, 16:717–763, 2003.

A.A. Balkema and L. De Haan. Residual life time at great age. Annals of Probability, 2:792–804, 1974.

B. Basrak, R.A. Davis, and T. Mikosch. A characterization of multivariate regular variation. Annals of Applied Probability, 12:908–920, 2002.

J. Beirlant, Y. Goegebeur, J. Segers, and J. Teugels. Statistics of Extremes: Theory and Applications. John Wiley & Sons, 2006.

J. Beran and G. Mainik. On estimating extremal dependence structures by parametric spectral measures. Statistical Methodology, 21:1–22, 2014.

N.H. Bingham, C.M. Goldie, and J.L. Teugels. Regular Variation (Encyclopedia of Mathematics and its Applications). Cambridge University Press, 1987.

M.O. Boldi and A.C. Davison. A mixture model for multivariate extremes. Journal of the Royal Statistical Society: Series B, 69:217–229, 2007.

M.D. Branco and D.K. Dey. A general class of multivariate skew-elliptical distributions. Journal of Multivariate Analysis, 79:99–113, 2001.

X. Burtschell, J. Gregory, and J.P. Laurent. A comparative analysis of CDO pricing models under the factor copula framework. Journal of Derivatives, 16:9–37, 2009.

J.J. Cai, J.H.J. Einmahl, and L. De Haan. Estimation of extreme risk regions under multivariate regular variation. The Annals of Statistics, 39:1803–1826, 2011.

J.A. Chan-Lau, D.J. Mathieson, and J.Y. Yao. Extreme contagion in equity markets. IMF Staff Papers, 51:386–408, 2004.

B.Y. Chang, P. Christoffersen, and K. Jacobs. Market skewness risk and the cross section of stock returns. Journal of Financial Economics, 107:46–68, 2013.

A. Clauset, C.R. Shalizi, and M.E.J. Newman. Power-law distributions in empirical data. SIAM Review, 51:661–703, 2009.

S.G. Coles. An Introduction to Statistical Modelling of Extreme Values. Springer Series in Statistics, 2001.

S.G. Coles and J. Tawn. Statistical methods for multivariate extremes: an application to structural design (with discussion). Journal of Applied Statistics, 43:1–48, 1994.

J. Conrad, R.F. Dittmar, and E. Ghysels. Ex ante skewness and expected stock returns. The Journal of Finance, 68:85–124, 2013.

R.C. Dahiya and J. Gurland. Goodness of fit tests for the gamma and exponential distributions. Technometrics, 14:791–801, 1972.

B. Das and S.I. Resnick. Conditioning on an extreme component: Model consistency and regular variation on cones. Bernoulli, 17:226–252, 2011.

L. de Haan and A. Ferreira. Extreme Value Theory. An Introduction. Springer-Verlag, 2006.

L. de Haan and E. Omey. Integrals and derivatives of regularly varying functions in $\mathbb{R}^d$ and domains of attraction of stable distributions II. Stochastic Processes and their Applications, 16:157–170, 1983.

L. de Haan and S.I. Resnick. Limit theory for multivariate sample extremes. Z. Wahrscheinlichkeitstheorie verw. Gebiete, 40:317–337, 1977.

L. de Haan and S.I. Resnick. On regular variation of probability densities. Stochastic Processes and their Applications, 25:83–93, 1987.

F.X. Diebold, T. Schuermann, and J.D. Stroughair. Pitfalls and opportunities in the use of extreme value theory in risk management. Journal of Risk Finance, 1:30–35, 2000.

F.J. DiTraglia and J.R. Gerlach. Portfolio selection: An extreme value approach. Journal of Banking & Finance, 37:305–323, 2013.

E.F. Eastoe, J.E. Heffernan, and J.A. Tawn. Nonparametric estimation of the spectral measure, and associated dependence measures, for multivariate extreme values using a limiting conditional representation. Extremes, 17:25–43, 2014.

J.H.J. Einmahl and J. Segers. Maximum empirical likelihood estimation of the spectral measure of an extreme-value distribution. The Annals of Statistics, 37:2953–2989, 2009.

J.H.J. Einmahl, L. de Haan, and V.I. Piterbarg.
Nonparametric estimation of the spectral measure of an extreme value distribution. The Annals of Statistics, 29:1401–1423, 2001.

P. Embrechts, C. Klüppelberg, and T. Mikosch. Modelling Extremal Events for Insurance and Finance. Springer, 1997.

R. Engle. Dynamic conditional correlation: A simple class of multivariate generalized autoregressive conditional heteroskedasticity models. Journal of Business & Economic Statistics, 20:339–350, 2002.

B.Q. Fang. The skew elliptical distributions and their quadratic forms. Journal of Multivariate Analysis, 87:298–314, 2003.

K.T. Fang, S. Kotz, and K.W. Ng. Symmetric Multivariate and Related Distributions. Chapman and Hall, 1990.

R.A. Fisher and L.H.C. Tippett. Limiting forms of the frequency distribution of the largest or smallest member of a sample. In Mathematical Proceedings of the Cambridge Philosophical Society, volume 24, pages 180–190. Cambridge Univ Press, 1928.

X. Gabaix and R. Ibragimov. Rank − 1/2: a simple way to improve the OLS estimation of tail exponents. Journal of Business & Economic Statistics, 29:24–39, 2011.

G. Girardi and E.A. Tolga. Systemic risk measurement: Multivariate GARCH estimation of CoVaR. Journal of Banking & Finance, 37:3169–3180, 2013.

B. Gnedenko. Sur la distribution limite du terme maximum d’une série aléatoire. Annals of Mathematics, 44:423–453, 1943.

S. Guillotte, F. Perron, and J. Segers. Non-parametric Bayesian inference on bivariate extremes. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 73:377–406, 2011.

C.R. Harvey and A. Siddique. Conditional skewness in asset pricing tests. The Journal of Finance, 55:1263–1295, 2000.

J. Heffernan and S.I. Resnick. Limit laws for random vectors with an extreme component. Annals of Applied Probability, 17:537–571, 2007.

J. Heffernan and J. Tawn. A conditional approach for multivariate extreme values. Journal of the Royal Statistical Society, Series B, 66:497–546, 2004.

B. Hill. A simple general approach to inference about the tail of a distribution. The Annals of Statistics, 3:1163–1174, 1975.

R. Huisman, K.G. Koedijk, C.L.M. Kool, and F. Palm. Tail-index estimates in small samples. Journal of Business and Economic Statistics, 19:208–216, 2001.

H. Hult and F. Lindskog. Multivariate extremes, aggregation and dependence in elliptical distributions. Advances in Applied Probability, 34:587–608, 2002.

A.F. Jenkinson. The frequency distribution of the annual maximum (or minimum) values of meteorological elements. Quarterly Journal of the Royal Meteorological Society, 81:158–171, 1955.

H. Joe. Multivariate Models and Dependence Concepts. Chapman & Hall, 1997.

D. Kenourgios, A. Samitas, and N. Paltalidis. Financial crises and stock market contagion in a multivariate time-varying asymmetric framework. Journal of International Financial Markets, Institutions and Money, 21:92–106, 2011.

F. Longin and B. Solnik. Extreme correlation of international equity markets. The Journal of Finance, 56:649–676, 2001.

Y. Ma, M.G. Genton, and A.A. Tsiatis. Locally efficient semiparametric estimators for generalized skew-elliptical distributions. Journal of the American Statistical Association, 100:980–989, 2005.

R.A. Maronna and R.H. Zamar. Robust estimates of location and dispersion for high-dimensional datasets. Technometrics, 44:307–317, 2002.

A.J. McNeil and R. Frey. Estimation of tail-related risk measures for heteroscedastic financial time series: an extreme value approach. Journal of Empirical Finance, 7:271–300, 2000.

A.J. McNeil, R. Frey, and P. Embrechts. Quantitative Risk Management: Concepts, Techniques, Tools. Princeton University Press, Princeton, 2005.

T. Nguyen and G. Samorodnitsky. Multivariate tail estimation with application to analysis of CoVaR. Astin Bulletin, 43:245–270, 2013.

A. Peiro. Skewness in financial returns. Journal of Banking & Finance, 23:847–862, 1999.

J. Pickands. Statistical inference using extreme order statistics.
The Annals of Statistics, 3:119–131, 1975.

S.H. Poon, M. Rockinger, and J. Tawn. Extreme value dependence in financial markets: diagnostics, models, and financial implications. Review of Financial Studies, 17:581–610, 2004.

S.I. Resnick. Extreme Values, Regular Variation, and Point Processes. Springer, 1987.

S.I. Resnick. Heavy-tail Phenomena. Probabilistic and Statistical Modeling. Springer, 2006.

M. Rockinger and E. Jondeau. Entropy densities with an application to autoregressive conditional skewness and kurtosis. Journal of Econometrics, 106:119–142, 2002.

H. Rootzén and N. Tajvidi. Multivariate generalized Pareto distributions. Bernoulli, 12:917–930, 2006.

P.J. Rousseeuw and K.V. Driessen. A fast algorithm for the minimum covariance determinant estimator. Technometrics, 41:212–223, 1999.

A.J. Stam. Regular variation in $\mathbb{R}^d$ and the Abel-Tauber theorem. Mathematisch Instituut, 1977.

D. Trudel. Tail dependence of hedge funds. Master’s thesis, Swiss Federal Institute of Technology Zurich, 2008.

R. Von Mises. La distribution de la plus grande de n valeurs. Reviews in Mathematical Union Interbalcanique, 1, 1936.

J. Wang, J. Boyer, and M.G. Genton. A skew-symmetric representation of multivariate distributions. Statistica Sinica, 14:1259–1270, 2004.

Appendix A
Proofs

A.1 Proof of Lemma 3.3.2

Proof. Denoting $\bar{\mathbf{Z}} = C^{-1}\mathbf{Z}$, the density function $\bar{h}$ of $\bar{\mathbf{Z}}$ can be derived by transformation of the density function $h$ of $\mathbf{Z}$:
\[
\begin{aligned}
\bar{h}(\mathbf{z}) &= 2 f(C^{-1}\mathbf{z}; \boldsymbol{\xi}, \Omega)\, G\big(\omega(C^{-1}\mathbf{z} - \boldsymbol{\xi})\big)\, |C|^{-1} \\
&= \frac{2 c_d}{|\Omega|^{1/2} |C|}\, \tilde{f}\big((C^{-1}\mathbf{z} - \boldsymbol{\xi})^T \Omega^{-1} (C^{-1}\mathbf{z} - \boldsymbol{\xi})\big)\, G\big(\omega(C^{-1}\mathbf{z} - \boldsymbol{\xi})\big) \\
&= \frac{2 c_d}{|C^T \Omega C|^{1/2}}\, \tilde{f}\big((\mathbf{z} - C\boldsymbol{\xi})^T (C^T \Omega C)^{-1} (\mathbf{z} - C\boldsymbol{\xi})\big)\, G\big(\omega_0(\mathbf{z} - C\boldsymbol{\xi})\big),
\end{aligned}
\]
where $\omega_0(\mathbf{z} - C\boldsymbol{\xi}) = \omega\big(C^{-1}(\mathbf{z} - C\boldsymbol{\xi})\big)$, and it is obvious that $\omega_0$ is also an odd function. □

A.2 Proof of Proposition 4.1.1

Proof. We begin by showing that condition (2.5) holds, and then derive the form of the limit function $q(\mathbf{z})$.
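The validity of the uniformity condition (2.6) is discussed at the end.

It suffices to consider the case $\boldsymbol{\xi} = \mathbf{0}$, as a shift in location does not affect the index and spectral measure of regularly varying random vectors; see Lemma 2.2 in Hult and Lindskog [2002]. Letting $\tilde{\mathbf{Z}} \sim \mathrm{Ell}_2(\mathbf{0}, \bar{\Omega}, \tilde{f})$ with density $f$, Proposition 3.2.2 says that $\|\tilde{\mathbf{Z}}\| \overset{d}{=} \|\mathbf{Z}\|$. From (3.2), $\tilde{\mathbf{Z}} = R L^T \mathbf{S}$, where $R$ has density (3.10), $L = \begin{pmatrix} 1 & \rho \\ 0 & \sqrt{1-\rho^2} \end{pmatrix}$ using the Cholesky decomposition, and $\mathbf{S}$ has a uniform distribution on $\mathbb{S}_1$. We then have
\[
\begin{aligned}
V(t) := P(\|\mathbf{Z}\| > t) &= P(\|\tilde{\mathbf{Z}}\| > t) = P\big(R\,(\mathbf{S}^T L L^T \mathbf{S})^{1/2} > t\big) \\
&= P\big(R\,(1 + \rho\sqrt{1-\rho^2}\,\sin(2\Theta) + \rho^2 \cos(2\Theta))^{1/2} > t\big), \quad \Theta \sim \mathrm{Unif}(0, 2\pi) \\
&= \int_0^{2\pi} \int_{t/\sqrt{A(\theta)}}^{\infty} \frac{1}{2\pi} f_R(r)\, dr\, d\theta, \quad \text{where } A(\theta) := 1 + \rho\sqrt{1-\rho^2}\,\sin(2\theta) + \rho^2 \cos(2\theta) \\
&= \int_0^{2\pi} \int_{t/\sqrt{A(\theta)}}^{\infty} c_d\, \tilde{f}(r^2)\, r\, dr\, d\theta \\
&= \int_0^{2\pi} \int_{(t/\sqrt{A(\theta)})^2}^{\infty} \frac{c_d}{2}\, \tilde{f}(s)\, ds\, d\theta \\
&= \int_0^{2\pi} \frac{c_d}{2}\, \tilde{F}\big(t^2/A(\theta)\big)\, d\theta, \quad \text{where } \tilde{F}(x) := \int_x^{\infty} \tilde{f}(u)\, du.
\end{aligned}
\]
With density $h$ of $\mathbf{Z}$ in (3.8), the limit in (2.5) can be computed as
\[
\begin{aligned}
q(\mathbf{z}) &= \lim_{t\to\infty} \frac{h(t\mathbf{z})}{t^{-2} V(t)} \\
&= \lim_{t\to\infty} \frac{2\, c_d\, |\Omega|^{-1/2}\, \tilde{f}(Q(t\mathbf{z}))\, G(w(t\mathbf{z}))}{t^{-2} \int_0^{2\pi} (c_d/2)\, \tilde{F}\big(t^2/A(\theta)\big)\, d\theta} \\
&= 4\, |\Omega|^{-1/2}\, G(w_\infty(\mathbf{z})) \lim_{t\to\infty} \frac{t^2\, \tilde{f}(t^2 Q(\mathbf{z}))}{\int_0^{2\pi} \tilde{F}\big(t^2/A(\theta)\big)\, d\theta} \\
&= 4\, |\Omega|^{-1/2}\, G(w_\infty(\mathbf{z}))\, Q^*(\mathbf{z})^{-(\nu+2)/2} \lim_{t\to\infty} \frac{t^2\, \tilde{f}(t^2)/\tilde{F}(t^2)}{\int_0^{2\pi} \tilde{F}\big(t^2/A(\theta)\big)/\tilde{F}(t^2)\, d\theta} \\
&= 2\nu\, |\Omega|^{-1/2}\, G(w_\infty(\mathbf{z}))\, Q^*(\mathbf{z})^{-(\nu+2)/2} \Big[\int_0^{2\pi} A(\theta)^{\nu/2}\, d\theta\Big]^{-1}.
\end{aligned}
\]
The final line above follows from the assumption $\tilde{f} \in \mathrm{RV}_{-(\nu+2)/2}$, which implies $\tilde{F} \in \mathrm{RV}_{-\nu/2}$ and $t \tilde{f}(t)/\tilde{F}(t) \to \nu/2$ as $t \to \infty$ by Karamata’s Theorem (see de Haan and Ferreira [2006]).

Based on Potter’s Theorem (see Theorem 1.5.6 in Bingham et al. [1987]), for any chosen $c > 1$ and $\epsilon > 0$, there exists $C = C(c, \epsilon)$ such that
\[
\frac{\tilde{F}\big(t^2/A(\theta)\big)}{\tilde{F}(t^2)} \le c \cdot \max\Big\{ \Big(\frac{1}{A(\theta)}\Big)^{-\nu/2+\epsilon}, \Big(\frac{1}{A(\theta)}\Big)^{-\nu/2-\epsilon} \Big\}
\]
when $t^2/A(\theta) > C$ and $t^2 > C$. The limit in the denominator then holds by the Dominated Convergence Theorem.

Finally, from (2.7) and using the homogeneity of the function $q$, we obtain
\[
\begin{aligned}
\psi(\mathbf{w}) &= \nu^{-1}\, q(\|\mathbf{z}\| \mathbf{w})\, \|\mathbf{z}\|^{\nu+2} = \nu^{-1}\, q(\mathbf{w}), \quad \mathbf{w} = \mathbf{z}/\|\mathbf{z}\| \in \mathbb{S}_1 \\
&= 2\, |\Omega|^{-1/2}\, G(w_\infty(\mathbf{w}))\, Q^*(\mathbf{w})^{-(\nu+2)/2} \Big[\int_0^{2\pi} A(\theta)^{\nu/2}\, d\theta\Big]^{-1}.
\end{aligned}
\]
It remains to show the validity of the uniformity condition (2.6) for density $h$. Following the arguments in de Haan and Resnick [1987] (Section 3), the uniformity condition is fulfilled by the elliptical density $f$ of $\tilde{\mathbf{Z}}$. In particular, we have for $\mathbf{z} \neq \mathbf{0}$
\[
\lim_{t\to\infty} \frac{f(t\mathbf{z})}{t^{-2} V(t)} = q_0(\mathbf{z}) > 0 \quad \text{and} \quad \lim_{t\to\infty} \sup_{\mathbf{z} \in \mathbb{S}_1} \Big| \frac{f(t\mathbf{z})}{t^{-2} V(t)} - q_0(\mathbf{z}) \Big| = 0. \tag{A.1}
\]
It is straightforward to show that $q(\mathbf{z}) = 2\, G(w_\infty(\mathbf{z}))\, q_0(\mathbf{z})$. Now write
\[
\begin{aligned}
\sup_{\mathbf{z} \in \mathbb{S}_1} \Big| \frac{h(t\mathbf{z})}{t^{-2} V(t)} - q(\mathbf{z}) \Big| &= \sup_{\mathbf{z} \in \mathbb{S}_1} \Big| \frac{2 f(t\mathbf{z})\, G(w(t\mathbf{z}))}{t^{-2} V(t)} - q(\mathbf{z}) \Big| \\
&\le \sup_{\mathbf{z} \in \mathbb{S}_1} \big| 2\, G(w(t\mathbf{z})) \big|\, \Big| \frac{f(t\mathbf{z})}{t^{-2} V(t)} - q_0(\mathbf{z}) \Big| + \sup_{\mathbf{z} \in \mathbb{S}_1} 2\, q_0(\mathbf{z})\, \big| G(w(t\mathbf{z})) - G(w_\infty(\mathbf{z})) \big|.
\end{aligned}
\]
Letting $t \to \infty$, the first term vanishes by (A.1) and because $G(\cdot)$ is bounded by one; Assumption (ii) gives convergence of the second term. □

A.3 Proof of Lemma 4.1.2

Proof. (i) $\Longrightarrow$ (ii). For any $z > 0$, if $\lim_{t\to\infty} \tilde{f}(tz)/\tilde{f}(t) = z^{-(\nu+d)/2}$, then, by L'Hôpital's rule,
\[
\lim_{t\to\infty} \frac{P(R > tz)}{P(R > t)} = \lim_{t\to\infty} \frac{\int_{tz}^{\infty} \tilde{f}(r^2)\, r^{d-1}\, dr}{\int_{t}^{\infty} \tilde{f}(r^2)\, r^{d-1}\, dr} = \lim_{t\to\infty} \frac{z\, \tilde{f}(t^2 z^2)\, (tz)^{d-1}}{\tilde{f}(t^2)\, t^{d-1}} = z^d z^{-(\nu+d)} = z^{-\nu}.
\]
(ii) $\Longrightarrow$ (i). Regular variation of $R$ implies that its density function satisfies $f_R \in \mathrm{RV}_{-(\nu+1)}$ by Karamata’s Theorem. For $z > 0$,
\[
\lim_{t\to\infty} \frac{\tilde{f}(t^2 z^2)}{\tilde{f}(t^2)} = \lim_{t\to\infty} \frac{f_R(tz)\, (tz)^{1-d}}{f_R(t)\, t^{1-d}} = z^{-\nu-1} z^{1-d} = z^{-(\nu+d)} \quad \Longrightarrow \quad \lim_{t\to\infty} \frac{\tilde{f}(tz)}{\tilde{f}(t)} = z^{-(\nu+d)/2}.
\]
(ii) $\Longleftrightarrow$ (iii). Let $\tilde{\mathbf{Z}} = R L^T \mathbf{S} \sim \mathrm{Ell}_2(\boldsymbol{\xi}, \bar{\Omega}, \tilde{f})$; then $\|\mathbf{Z}\| \overset{d}{=} \|\tilde{\mathbf{Z}}\|$. By Theorem 4.3 in Hult and Lindskog [2002], $R$ being regularly varying with index $\nu$ is equivalent to $\tilde{\mathbf{Z}}$ being regularly varying with index $\nu$. This directly implies that
\[
\lim_{t\to\infty} \frac{P(\|\mathbf{Z}\| > tz)}{P(\|\mathbf{Z}\| > t)} = \lim_{t\to\infty} \frac{P(\|\tilde{\mathbf{Z}}\| > tz)}{P(\|\tilde{\mathbf{Z}}\| > t)} = z^{-\nu}, \quad z > 0.
\]
□

A.4 Proof of Theorem 4.1.3

Proof.

1. (Multivariate regular variation of $h$) Let $h$ denote the density of $\mathbf{Z}$; see (3.8). Let $\mathbf{1} = (1, 1)$ denote a bivariate vector of ones. We have
\[
\begin{aligned}
\lim_{t\to\infty} \frac{h(t\mathbf{z})}{h(t\mathbf{1})} &= \lim_{t\to\infty} \frac{\tilde{f}(Q(t\mathbf{z}))\, G(w(t\mathbf{z} - \boldsymbol{\xi}))}{\tilde{f}(Q(t\mathbf{1}))\, G(w(t\mathbf{1} - \boldsymbol{\xi}))} && \text{(A.2)} \\
&= \Big(\frac{Q^*(\mathbf{z})}{Q^*(\mathbf{1})}\Big)^{-(\nu+2)/2} \frac{G(w_\infty(\mathbf{z}))}{G(w_\infty(\mathbf{1}))} =: \lambda(\mathbf{z}), \quad \mathbf{z} \neq \mathbf{0}. && \text{(A.3)}
\end{aligned}
\]
Note that $\lambda(\mathbf{z}) > 0$ for $\mathbf{z} \neq \mathbf{0}$ and $\lambda(a\mathbf{z}) = a^{-(\nu+2)} \lambda(\mathbf{z})$ for $a > 0$. Hence density $h$ is bivariate regularly varying with index $(\nu + 2) > 2$ and limit function $\lambda$. The first factor in the expression for $\lambda$ comes from the underlying elliptical density, whereas the second factor is due to the skewing function.

2. (Joint tail behaviour) We next relate the tail behaviour of the skew-elliptical random vector $\mathbf{Z}$ to that of the associated elliptical random vector $\tilde{\mathbf{Z}} = (\tilde{X}, \tilde{Y}) \sim \mathrm{Ell}_2(\boldsymbol{\xi}, \bar{\Omega}, \tilde{f})$ with density $f$. Let $V(t) = t^2 h(t\mathbf{1})$. Analogous to the arguments in Proposition 4.1.1, one can show that $h$ satisfies the conditions of Theorem 2.2.1 with limit function $\lambda$. Hence, Theorem 2.2.1 and (2.4) with $B = [\mathbf{z}, \infty)$ (cf. de Haan and Omey [1983], Theorem 1) imply
\[
\begin{aligned}
P(\mathbf{Z} > t\mathbf{z}) &\sim t^2\, h(t\mathbf{1}) \iint_{\mathbf{z}}^{\infty} \lambda(\mathbf{u})\, d\mathbf{u}, \quad t \to \infty, \ \mathbf{z} \ge \mathbf{0}, \ \mathbf{z} \neq \mathbf{0} \\
&= 2\, f(t\mathbf{1})\, G(w(t\mathbf{1} - \boldsymbol{\xi}))\, t^2 \iint_{\mathbf{z}}^{\infty} \lambda(\mathbf{u})\, d\mathbf{u}. && \text{(A.4)}
\end{aligned}
\]
Similarly, for the elliptical random vector, the limit function is given by $\lambda_0(\mathbf{z}) = \big(Q^*(\mathbf{z})/Q^*(\mathbf{1})\big)^{-(\nu+2)/2}$ and
\[
P(\tilde{\mathbf{Z}} > t\mathbf{z}) \sim t^2\, f(t\mathbf{1}) \iint_{\mathbf{z}}^{\infty} \lambda_0(\mathbf{u})\, d\mathbf{u}, \quad t \to \infty, \ \mathbf{z} \ge \mathbf{0}, \ \mathbf{z} \neq \mathbf{0}. \tag{A.5}
\]
Combining (A.4) and (A.5), and plugging in the expressions for the limit functions $\lambda$ and $\lambda_0$, gives
\[
\begin{aligned}
\frac{P(\mathbf{Z} > t\mathbf{z})}{P(\tilde{\mathbf{Z}} > t\mathbf{z})} &\sim \frac{2\, G(w(t\mathbf{1} - \boldsymbol{\xi})) \iint_{\mathbf{z}}^{\infty} \lambda(\mathbf{u})\, d\mathbf{u}}{\iint_{\mathbf{z}}^{\infty} \lambda_0(\mathbf{u})\, d\mathbf{u}}, \quad t \to \infty, \ \mathbf{z} \ge \mathbf{0}, \ \mathbf{z} \neq \mathbf{0} \\
&\to \frac{2 \iint_{\mathbf{z}}^{\infty} Q^*(\mathbf{u})^{-(\nu+2)/2}\, G(w_\infty(\mathbf{u}))\, d\mathbf{u}}{\iint_{\mathbf{z}}^{\infty} Q^*(\mathbf{u})^{-(\nu+2)/2}\, d\mathbf{u}}. && \text{(A.6)}
\end{aligned}
\]

3. (Marginals) Since $\mathbf{Z}$ is multivariate regularly varying in the sense of Definition 2.2.3, it follows from Theorem 1.1(i) in Basrak et al. [2002] that the marginal tails are regularly varying with index $\nu$. Under the assumption that the marginals of $\mathbf{Z}$ are also skew-elliptical with densities of the form (4.2), we have, by L'Hôpital's rule and (4.2),
\[
\lim_{x\to\infty} \frac{P(X > x)}{P(\tilde{X} > x)} = \lim_{x\to\infty} \frac{h_1(x)}{f_1(x - \xi_1)} = \lim_{x\to\infty} 2\, G_0(w_1(x - \xi_1)) = 2\, G_0(w_{1,\infty}(1)),
\]
and hence
\[
P(X > x) \sim 2\, P(\tilde{X} > x)\, G_0(w_{1,\infty}(1)), \quad x \to \infty. \tag{A.7}
\]

4. (Conditional probability) Consider $P(Y > y \mid X > x)$ for $y \sim zx$ as $x \to \infty$ and $z > 0$.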
Setting $\mathbf{z} = (1, z)$ in (A.6), and putting the joint and marginal tail behaviour together, gives
\[
\lim_{x,y\to\infty} P(Y > y \mid X > x) = K(z) \lim_{x,y\to\infty} P(\tilde{Y} > y \mid \tilde{X} > x), \tag{A.8}
\]
where
\[
K(z) = \frac{\iint_{(1,z)}^{\infty} Q^*(\mathbf{u})^{-(\nu+2)/2}\, G(w_\infty(\mathbf{u}))\, d\mathbf{u}}{G_0(w_{1,\infty}(1)) \iint_{(1,z)}^{\infty} Q^*(\mathbf{u})^{-(\nu+2)/2}\, d\mathbf{u}}, \tag{A.9}
\]
and $(\tilde{X}, \tilde{Y})$ follows a bivariate elliptical distribution with density $f$. The proof is completed by using Theorem 1(i) in Abdous et al. [2005a] for the limit of $P(\tilde{Y} > y \mid \tilde{X} > x)$. □

Appendix B
Tables

B.1 Simulation Studies

In this section, we report two further simulation results for different bivariate skew-t distributions, as in Section 5. The objective is to show that 1) our method produces results similar to those in Abdous et al. [2005a] for the elliptical distribution, and 2) our estimation method may not work well when the tail decays fast, owing to poor estimation of the tail index, which is the same problem as in Abdous et al. [2005a].

Table B.1: Simulation results based on 1000 samples of size 1000 from a bivariate skew-t distribution with parameters $\boldsymbol{\xi} = (0, 0)$, $\boldsymbol{\alpha} = (0, 0)$, $\nu = 2$, $\rho = 0.5$ and $\omega = \mathrm{diag}(1, 1)$ for the scale matrix. Each cell provides the average (standard deviation) of the estimates of $\eta(x, y)$ under various methods; see Section 5.5 for details. For $\hat{\eta}_{\mathrm{AFG}}(x, y)$, $\hat{\eta}_1(x, y)$, $\hat{\eta}_2(x, y)$ and $\eta_{\lim}(x, y)$ we used $z = \omega_1 y/(\omega_2 x) - \rho$ in the limit results.
Values of x and y are chosen as the theoretical marginal quantiles with probability p, where p labels columns and rows.

Quantile y   Estimator               Quantile x
                                     97.5%            99.0%            99.9%            99.99%
97.5%        ηˆAFG(x, y)             0.617 (0.089)    0.448 (0.072)    0.254 (0.028)    0.207 (0.021)
             ηˆ1(x, y)               0.641 (0.052)    0.472 (0.051)    0.278 (0.037)    0.231 (0.032)
             ηˆ2(x, y)               0.640 (0.057)    0.472 (0.057)    0.279 (0.043)    0.232 (0.038)
             ηemp(x, y)              0.599 (0.102)    0.437 (0.168)    *                *
             η(x, y)/ηlim(x, y)      0.598/0.609      0.438/0.440      0.257/0.257      0.213/0.213
99.0%        ηˆAFG(x, y)             0.790 (0.0801)   0.617 (0.089)    0.303 (0.040)    0.220 (0.022)
             ηˆ1(x, y)               0.810 (0.041)    0.641 (0.052)    0.328 (0.041)    0.243 (0.034)
             ηˆ2(x, y)               0.808 (0.044)    0.640 (0.057)    0.329 (0.048)    0.244 (0.040)
             ηemp(x, y)              0.774 (0.088)    0.602 (0.171)    *                *
             η(x, y)/ηlim(x, y)      0.775/0.786      0.605/0.609      0.302/0.302      0.225/0.225
99.9%        ηˆAFG(x, y)             0.973 (0.027)    0.933 (0.045)    0.617 (0.089)    0.304 (0.041)
             ηˆ1(x, y)               0.977 (0.010)    0.942 (0.019)    0.641 (0.052)    0.330 (0.042)
             ηˆ2(x, y)               0.977 (0.010)    0.942 (0.020)    0.640 (0.057)    0.330 (0.048)
             ηemp(x, y)              0.972 (0.034)    0.934 (0.082)    *                *
             η(x, y)/ηlim(x, y)      0.970/0.972      0.930/0.932      0.609/0.609      0.304/0.304
99.99%       ηˆAFG(x, y)             0.997 (0.007)    0.993 (0.013)    0.932 (0.045)    0.617 (0.089)
             ηˆ1(x, y)               0.998 (0.002)    0.994 (0.003)    0.941 (0.020)    0.641 (0.052)
             ηˆ2(x, y)               0.998 (0.002)    0.994 (0.004)    0.940 (0.020)    0.640 (0.057)
             ηemp(x, y)              0.997 (0.011)    0.993 (0.028)    *                *
             η(x, y)/ηlim(x, y)      0.997/0.997      0.992/0.992      0.930/0.931      0.609/0.609

Table B.2: Simulation results based on 1000 samples of size 1000 from a bivariate skew-t distribution with parameters ξ = (3, 1), α = (1, -3), ν = 20, ρ = 0.5 and ω = diag(2, 3) for the scale matrix. Each cell provides the average (standard deviation) of the estimates of η(x, y) under various methods; see Section 5.5 for details. For ηˆAFG(x, y), ηˆ1(x, y), ηˆ2(x, y) and ηlim(x, y) we used z = ω1 y/(ω2 x) − ρ in the limit results.
Values of x and y are chosen as the theoretical marginal quantiles with probability p, where p labels columns and rows.

Quantile y   Estimator               Quantile x
                                     97.5%            99.0%            99.9%            99.99%
97.5%        ηˆAFG(x, y)             0.206 (0.027)    0.178 (0.025)    0.139 (0.023)    0.119 (0.021)
             ηˆ1(x, y)               0.475 (0.056)    0.434 (0.054)    0.370 (0.049)    0.332 (0.045)
             ηˆ2(x, y)               0.568 (0.057)    0.527 (0.056)    0.460 (0.054)    0.419 (0.052)
             ηemp(x, y)              0.692 (0.094)    0.604 (0.162)    *                *
             η(x, y)/ηlim(x, y)      0.691/0.457      0.598/0.363      0.408/0.243      0.286/0.182
99.0%        ηˆAFG(x, y)             0.276 (0.029)    0.234 (0.028)    0.176 (0.025)    0.145 (0.023)
             ηˆ1(x, y)               0.563 (0.060)    0.513 (0.058)    0.431 (0.054)    0.380 (0.050)
             ηˆ2(x, y)               0.655 (0.057)    0.606 (0.058)    0.524 (0.056)    0.471 (0.054)
             ηemp(x, y)              0.821 (0.079)    0.741 (0.153)    *                *
             η(x, y)/ηlim(x, y)      0.821/0.684      0.739/0.553      0.535/0.355      0.381/0.260
99.9%        ηˆAFG(x, y)             0.469 (0.036)    0.394 (0.033)    0.281 (0.030)    0.220 (0.027)
             ηˆ1(x, y)               0.739 (0.056)    0.679 (0.059)    0.569 (0.060)    0.494 (0.057)
             ηˆ2(x, y)               0.812 (0.048)    0.761 (0.053)    0.660 (0.057)    0.587 (0.058)
             ηemp(x, y)              0.971 (0.036)    0.943 (0.080)    *                *
             η(x, y)/ηlim(x, y)      0.970/0.974      0.942/0.919      0.815/0.698      0.641/0.504
99.99%       ηˆAFG(x, y)             0.650 (0.039)    0.560 (0.038)    0.404 (0.034)    0.309 (0.031)
             ηˆ1(x, y)               0.852 (0.043)    0.800 (0.050)    0.687 (0.059)    0.600 (0.060)
             ηˆ2(x, y)               0.903 (0.034)    0.862 (0.041)    0.768 (0.052)    0.689 (0.056)
             ηemp(x, y)              0.997 (0.012)    0.993 (0.027)    *                *
             η(x, y)/ηlim(x, y)      0.996/0.999      0.992/0.995      0.955/0.929      0.857/0.772

B.2 CoVaR

In this section, we check the robustness of the summary statistics for the cross-sectional $\Delta\mathrm{CoVaR}^{s|j}_{q_1,t}$. Firstly, instead of adopting the EVT method to calculate $\mathrm{VaR}^{j}_{q_2,t}$ for each institution, we directly model $\hat{Z}^{j}_t$ using the skew-t distribution and take the corresponding skew-t quantile as the estimate of $\mathrm{VaR}^{j}_{q_2,t}$.
Secondly, we assume that $\{\hat{\mathbf{Z}}_t\}$ follows a bivariate skew-t distribution for the benchmark case, and then apply the same procedure to compute CoVaR for the innovations $\{\hat{\mathbf{Z}}_t\}$ using the fitted model.

Generally speaking, the summary statistics are similar under the different measurement approaches. A more detailed look reveals that $\mathrm{VaR}^{j}_{q_2,t}$ estimated by EVT methods leads to, on average, a higher $\Delta\mathrm{CoVaR}^{s|j}_{q_1,t}$, indicating that the parametric model (skew-t) is conservative in modelling tail events when the quantile $q_2$ is small.

Table B.3: Summary statistics for cross-sectional $\Delta\mathrm{CoVaR}^{s|j}_{q_1,t}$ for all institutions during the sample period from June 1, 2006 to May 31, 2007. $\mathrm{VaR}^{j}_{q_2,t}$ is estimated by assuming that $\hat{Z}^{j}_t$ follows the skew-t distribution, and $\mathrm{CoVaR}^{s|b^j}_{q_1,t}$ is estimated by assuming that $\{\hat{\mathbf{Z}}_t\}$ follows a bivariate skew-t distribution.

q2      Method      Mean(%)   Std.TS   Std.CS    Max(%)   Min(%)
5%      Skew-t        172.8     29.8     25.9     366.2     85.7
        EVT           133.0     39.5     39.9     510.1    -14.2
        Empirical     164.3     74.1     30.3     370.6     18.0
1%      Skew-t        285.6     61.7     55.5     717.2    126.8
        EVT           285.2     80.4     81.3    1182.3     26.4
0.01%   EVT          1110.3    457.8    573.9    8557.5    153.2

Table B.4: Summary statistics for cross-sectional $\Delta\mathrm{CoVaR}^{s|j}_{q_1,t}$ for all institutions during the sample period from June 1, 2006 to May 31, 2007. $\mathrm{VaR}^{j}_{q_2,t}$ is estimated using the EVT method described in McNeil and Frey [2000], and $\mathrm{CoVaR}^{s|b^j}_{q_1,t}$ is estimated by assuming that $\{\hat{\mathbf{Z}}_t\}$ follows a bivariate skew-t distribution.

q2      Method      Mean(%)   Std.TS   Std.CS    Max(%)   Min(%)
5%      Skew-t        178.7     25.6     28.9     360.0     87.5
        EVT           140.5     38.4     40.9     487.1     -9.9
        Empirical     161.1     76.1     32.0     370.6    -15.6
1%      Skew-t        304.1     54.6     63.7     681.8    120.2
        EVT           308.8     77.0     96.6    1278.3     53.8
0.01%   EVT          1437.0    918.7   1368.2   14297.4     91.7

Table B.5: Summary statistics for cross-sectional $\Delta\mathrm{CoVaR}^{s|j}_{q_1,t}$ for all institutions during the sample period from June 1, 2006 to May 31, 2007.
$\mathrm{VaR}^{j}_{q_2,t}$ is estimated by assuming that $\hat{Z}^{j}_t$ follows the skew-t distribution, and $\mathrm{CoVaR}^{s|b^j}_{q_1,t}$ is estimated empirically.

q2      Method      Mean(%)   Std.TS   Std.CS    Max(%)   Min(%)
5%      Skew-t        172.2     34.3     31.8     383.0     79.1
        EVT           131.9     42.3     39.4     559.0     -9.2
        Empirical     164.5     77.9     38.8     491.5     14.3
1%      Skew-t        284.5     66.4     60.1     709.6    120.3
        EVT           283.0     83.9     79.3    1285.0     33.9
0.01%   EVT          1099.5    458.5    558.4    9251.0    168.1
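
As a rough illustration of how the summary columns in Tables B.3 to B.5 relate to a panel of institution-level series, the sketch below computes them for a toy panel. It follows the stated definitions of "Std.TS" (average of the per-institution time-series standard deviations) and "Std.CS" (standard deviation of the per-institution means). The function name `summarize_panel`, the toy numbers, the use of the sample (n − 1) standard deviation, and the definitions of Mean (average of institution means) and Max/Min (taken over all entries of the panel) are assumptions made for illustration only, not taken from the thesis.

```python
from statistics import mean, stdev


def summarize_panel(panel):
    """Summarize a panel mapping institution label -> time series (in %).

    Hypothetical helper: the column definitions for Mean, Max and Min are
    assumptions; Std.TS and Std.CS follow the table caption.
    """
    series = list(panel.values())
    inst_means = [mean(s) for s in series]   # one time-series mean per institution
    inst_stds = [stdev(s) for s in series]   # one time-series std per institution
    all_values = [v for s in series for v in s]
    return {
        "Mean": mean(inst_means),            # assumed: average of institution means
        "Std.TS": mean(inst_stds),           # average of time-series standard deviations
        "Std.CS": stdev(inst_means),         # cross-sectional std of institution means
        "Max": max(all_values),              # assumed: maximum over the whole panel
        "Min": min(all_values),              # assumed: minimum over the whole panel
    }


# Toy panel with three hypothetical institutions and three dates each.
toy_panel = {
    "A": [100.0, 120.0, 140.0],
    "B": [200.0, 200.0, 230.0],
    "C": [150.0, 130.0, 170.0],
}
summary = summarize_panel(toy_panel)
for name, value in summary.items():
    print(f"{name}: {value:.1f}")
```

With real data, each list would hold one institution's daily series over the sample period, and a large gap between Std.TS and Std.CS would indicate whether variation over time or variation across institutions dominates.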
