UBC Theses and Dissertations

UBC Theses Logo

UBC Theses and Dissertations

Vine copulas : dependence structure learning, diagnostics, and applications to regression analysis Chang, Bo

Abstract

Copulas are widely used in high-dimensional multivariate applications where the assumption of Gaussian distributed variables does not hold. Vine copulas are a flexible family of copulas built from a sequence of bivariate copulas to represent bivariate dependence and bivariate conditional dependence. The vine structures consist of a hierarchy of trees to express conditional dependence. The contributions of this thesis are (a) improved methods for finding parsimonious truncated vine structures when the number of variables is moderate to large; (b) diagnostic methods to help in decisions for bivariate copulas in the vine; (c) applications to predictions based on conditional distributions of the vine copula. The vine structure learning problem has been challenging due to the large search space. Existing methods are based on greedy algorithms and do not in general produce a solution that is near the global optimum. It is an open problem to choose a good truncated vine structure when there are many variables. We propose a novel approach to learning truncated vine structures using Monte Carlo tree search, a method that has been widely adopted in game and planning problems. The proposed method has significantly better performance over the existing methods under various experimental setups. Moreover, diagnostic methods based on measures of dependence and tail asymmetry are proposed to guide the choice of parametric bivariate copula families assigned to the edges of the trees in the vine and to assess whether a copula is constant over the conditioning value(s) for trees 2 and higher. If the diagnostic methods suggest the existence of reflection asymmetry, permutation asymmetry, or asymmetric tail dependence, then three- or four-parameter bivariate copula families might be needed. If the conditional dependence measures or asymmetry measures in trees 2 and up are not constant over the conditioning value(s), then non-constant copulas with parameters varying over conditioning values should be considered. Finally, for data from an observational study, we propose a vine copula regression method that uses regular vines and handles mixed continuous and discrete variables. This method can efficiently compute the conditional distribution of the response variable given the explanatory variables.

Item Media

Item Citations and Data

Rights

Attribution-NonCommercial-NoDerivatives 4.0 International