UBC Theses and Dissertations

Ranking functions of data

Zhang, Cassie Yue

Abstract

This thesis addresses a problem that arises in many scientific fields: determining whether two samples come from the same distribution, a procedure known as a two-sample homogeneity test. Parametric methods such as t-tests and analysis of variance (ANOVA) are widely used for this purpose, but they rely heavily on specific assumptions about the data's distribution, and their results can be misleading when those assumptions are violated. Non-parametric methods relax these assumptions but have shortcomings of their own. For instance, the Wilcoxon rank-sum test applies directly only in the univariate case, because multivariate data lack a natural ranking, which limits the utility of such methods for multivariate data. In response to these challenges, this work ranks multivariate data through the concept of data depth. It presents a thorough comparison of eleven widely used depth functions and develops two novel two-sample homogeneity tests intended to improve the accuracy of homogeneity assessments in multivariate datasets. Further, inspired by the quality index [25], which is used in quality assurance contexts to measure one population's overall 'outlyingness' relative to another, the proposed test statistics are designed to have a χ²(1) asymptotic null distribution. Simulation studies indicate that these tests perform well, and the discussion extends to their potential application in multivariate multisample contexts, illustrating the versatility of depth-based methodologies in statistical analysis.
As the number of samples increases, so does the number of pairwise quality indices, an aspect explored as part of the study. Future work includes applications to high-dimensional data and change point analysis. With its focus on boosting the power of depth-based tests, this thesis contributes to depth-based methods for multivariate data analysis.
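
As a rough illustration of how a depth function can rank multivariate points, and how a quality index can compare two samples, the following Python sketch may help. It is not the thesis's test statistic. It assumes Mahalanobis depth as the depth function (the abstract does not say which of the eleven depth functions are used where), and it assumes the quality index takes the commonly cited form Q(F, G) = P(D(X; F) <= D(Y; F)) with X drawn from F and Y drawn from G; reference [25] is not reproduced here, so that form is an assumption. The function names `mahalanobis_depth` and `quality_index` are hypothetical.

```python
import numpy as np


def mahalanobis_depth(points, reference):
    """Mahalanobis depth of each row of `points` relative to the sample `reference`.

    Assumed depth function: D(x) = 1 / (1 + (x - mean)^T cov^{-1} (x - mean)),
    with mean and covariance estimated from `reference`.
    """
    mean = reference.mean(axis=0)
    cov = np.cov(reference, rowvar=False)
    diff = points - mean
    # Squared Mahalanobis distance of every row, without forming cov^{-1} explicitly.
    d2 = np.einsum("ij,ij->i", diff, np.linalg.solve(cov, diff.T).T)
    return 1.0 / (1.0 + d2)


def quality_index(x, y):
    """Empirical estimate of P(D(X; F) <= D(Y; F)), with both depths taken relative to x.

    Values near 0.5 are what one would expect when x and y share a distribution;
    values well below 0.5 suggest y is more outlying than x relative to x.
    """
    depth_x = mahalanobis_depth(x, x)
    depth_y = mahalanobis_depth(y, x)
    # Proportion of pairs (x_i, y_j) in which x_i is no deeper than y_j.
    return float(np.mean(depth_x[:, None] <= depth_y[None, :]))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(200, 2))
    y_same = rng.normal(size=(200, 2))      # same distribution as x
    y_shifted = y_same + 3.0                # location shift in both coordinates
    print("same distribution:", quality_index(x, y_same))
    print("shifted sample:   ", quality_index(x, y_shifted))
```

In this sketch the depth values supply the centre-outward ranking that multivariate data otherwise lack, and the index summarises how the second sample's depths compare with the first's. Turning such an index into a test with a χ²(1) null distribution requires the standardisation developed in the thesis itself, which is not attempted here.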

Rights

Attribution-NonCommercial-NoDerivatives 4.0 International