UBC Theses and Dissertations

On the improvement of density ratio estimation via probabilistic classifier: theoretical study and its applications

Yin, Jiayang

Abstract

Density ratio estimation is widely used in machine learning and data science, especially in transfer learning and contrastive learning. This work focuses on a particular type of density ratio estimation, based on probabilistic classification, from the perspective of statistical inference. We show how such density ratio estimation relates to a probabilistic classifier such as logistic regression. We analyze the potential causes of its inefficiency and inaccuracy when the two distributions differ greatly from each other. In contrast to the goal of probabilistic classification, a density ratio estimation task with a more efficient estimator corresponds to a harder classification task, that is, one in which it is more difficult to separate the two samples with a probabilistic classifier. We provide a theoretical explanation for this phenomenon from a mathematical and statistical standpoint. For basic density ratio estimation by probabilistic classification, we give a necessary and sufficient condition for the existence of the estimator at the sample level. We analyze the probability that this condition holds asymptotically when the supports of the two densities are the same. In addition, we explore the asymptotic properties of a recently proposed approach to improving density ratio estimation by probabilistic classification, Telescoping Density Ratio Estimation (TDRE) (B. Rhodes, K. Xu, and M. U. Gutmann. Telescoping density-ratio estimation. Advances in Neural Information Processing Systems, 33:4905–4916, 2020). Numerically, we compare the asymptotic variances of basic density ratio estimation and TDRE. We also explore some generalizations of TDRE to unbalanced data and to (partial) model misspecification through both theoretical discussion and empirical analysis. Some suggestions for future work on inference for un-normalized models are also provided.
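
As a rough illustration of the link between a probabilistic classifier and a density ratio, the sketch below fits a one-dimensional logistic regression to samples from two densities and reads the estimated log ratio off the fitted logit. Everything specific here is an assumption made for illustration and is not taken from the thesis: the two Gaussians N(0, 1) and N(1, 1), the sample sizes, the Newton solver, and the helper names `fit_logistic_1d` and `estimate_log_ratio`. For this pair of densities the true log ratio is linear in x (0.5 - x), so the linear logistic model is correctly specified.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic_1d(xs, ys, n_iter=25):
    """Fit logit P(y=1 | x) = b + w*x by Newton's method; returns (b, w)."""
    b, w = 0.0, 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = sigmoid(b + w * x)
            r = y - p            # residual: contribution to the score
            s = p * (1.0 - p)    # weight: contribution to the information matrix
            g0 += r
            g1 += r * x
            h00 += s
            h01 += s * x
            h11 += s * x * x
        det = h00 * h11 - h01 * h01
        # Newton step theta += H^{-1} g, with the 2x2 inverse written out.
        b += (h11 * g0 - h01 * g1) / det
        w += (-h01 * g0 + h00 * g1) / det
    return b, w


def estimate_log_ratio(sample_p, sample_q):
    """Return a function x -> estimated log p(x)/q(x).

    Points from p are labelled 1 and points from q are labelled 0. By Bayes'
    rule the classifier logit equals log p(x)/q(x) + log(n_p / n_q), so the
    log sample-size ratio is subtracted from the fitted logit.
    """
    xs = list(sample_p) + list(sample_q)
    ys = [1.0] * len(sample_p) + [0.0] * len(sample_q)
    b, w = fit_logistic_1d(xs, ys)
    offset = math.log(len(sample_p) / len(sample_q))
    return lambda x: b + w * x - offset


# Illustrative data (assumed): p = N(0, 1), q = N(1, 1), 2000 draws each.
rng = random.Random(0)
sample_p = [rng.gauss(0.0, 1.0) for _ in range(2000)]
sample_q = [rng.gauss(1.0, 1.0) for _ in range(2000)]
log_ratio = estimate_log_ratio(sample_p, sample_q)
```

Here `log_ratio(x)` should be close to 0.5 - x for these heavily overlapping densities. If the two samples were perfectly separable, the logistic regression maximum likelihood estimate would not exist and the Newton iteration above would fail to converge, which is one simple way to see why very different distributions are problematic for this kind of estimator.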

Rights

Attribution-NonCommercial-NoDerivatives 4.0 International