UBC Theses and Dissertations


Learning from imbalanced data: a geometric study on over-parameterized models
Behnia, Tina

Abstract

Machine learning has trended toward training over-parameterized networks that can exactly fit the training set. In this regime, when the training data is class-imbalanced, traditional methods for mitigating imbalance are ineffective and yield poor performance on minority classes. Tailored to the over-parameterized setting, one successful technique for combating imbalance adjusts the logits during training by introducing several hyper-parameters into the cross-entropy loss. The idea behind these adjustments is rooted in implicit-bias analysis, which, for linear models, explains why they successfully bias the optimization path towards solutions that favor minorities. However, their impact is not well understood for deep, non-linear models that simultaneously learn features from the data and fit classifiers on them. In this work, we take a step towards formalizing the impact of data imbalance and of the choice of loss function when learning in over-parameterized setups. At the core of our analysis is the unconstrained features model (UFM), which has recently provided partial theoretical justification for the empirical finding known as Neural Collapse. Limited to the balanced setting, Neural Collapse suggests that over-parameterized models learn features and classifiers arranged in a perfectly symmetric and maximally separated geometry. Leveraging the UFM, we analytically characterize how the learned geometry changes as the data becomes imbalanced or the loss function is modified. Our analysis characterizes the properties of the learned features and classifiers, justifying previously observed empirical findings on imbalanced training. To verify the accuracy of our theoretical predictions, we conduct experiments on benchmark networks and vision datasets. Additionally, we adopt the UFM for a preliminary study of the supervised contrastive loss, a recently proposed alternative to the cross-entropy. We observe that, under the UFM, these two losses exhibit certain similarities. Furthermore, with a slight modification to the model, we find that the training geometry can be made invariant to imbalance in this case. Overall, despite its simplicity, the UFM uncovers certain biases of over-parameterized models at the training stage. This simplification is, of course, not without its own limitations, which will become clear throughout our analysis.
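For readers unfamiliar with logit adjustment, a minimal sketch of the general family of adjusted cross-entropy losses referred to above is given here; the multiplicative weights \Delta_c, additive offsets \iota_c, temperature \tau, and class priors \pi_c are illustrative symbols, not notation taken from the thesis.

\[
\mathcal{L}(x, y) \;=\; -\log \frac{e^{\,\Delta_y f_y(x) + \iota_y}}{\sum_{c=1}^{k} e^{\,\Delta_c f_c(x) + \iota_c}},
\qquad \text{e.g. } \iota_c = \tau \log \pi_c ,
\]

where f_c(x) denotes the model's logit for class c. Setting all \Delta_c = 1 and \iota_c = 0 recovers the standard cross-entropy, while frequency-dependent choices of \Delta_c and \iota_c enlarge the effective margins of minority classes.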


Rights

Attribution-NonCommercial-NoDerivatives 4.0 International