UBC Theses and Dissertations


Learning from imbalanced data: a geometric study on over-parameterized models
Behnia, Tina

Abstract

Machine learning has trended toward training over-parameterized networks that can exactly fit the training set. In this regime, when the training data is class-imbalanced, traditional methods for mitigating imbalance are ineffective and yield poor performance on minority classes. Tailored to the over-parameterized setting, one successful technique for combating imbalance adjusts the logits during training by introducing several hyper-parameters into the cross-entropy loss. The idea behind these adjustments is rooted in implicit-bias analysis, which, for linear models, explains why they successfully bias the optimization path towards solutions that favor minorities. However, their impact is not well understood for deep, non-linear models that simultaneously learn features from the data and fit classifiers on them. In this work, we take a step towards formalizing the impact of data imbalance and of the choice of loss function when learning in over-parameterized setups. At the core of our analysis is the unconstrained features model (UFM), which has recently provided partial theoretical justification for the empirical finding known as Neural Collapse. Limited to the balanced setting, Neural Collapse suggests that over-parameterized models learn features and classifiers arranged in a perfectly symmetric and maximally separated geometry. Leveraging the UFM, we analytically characterize how the learned geometry changes as the data becomes imbalanced or the loss function is modified. Our analysis characterizes the properties of the learned features and classifiers, justifying previously observed empirical findings on imbalanced training. To verify the accuracy of our theoretical predictions, we conduct experiments on benchmark networks and vision datasets. Additionally, we adopt the UFM for a preliminary study of the supervised contrastive loss, a recently proposed alternative to the cross-entropy. We observe that, under the UFM, these two losses exhibit certain similarities. Furthermore, with a slight modification to the model, we find that the training geometry can be made invariant to imbalance in this case. Overall, despite its simplicity, the UFM uncovers certain biases of over-parameterized models at the training stage. This simplification is, of course, not without its own limitations, which will become clear throughout our analysis.
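For readers unfamiliar with logit adjustment, a minimal sketch of the general family of adjusted cross-entropy losses referred to above is given here; the multiplicative weights \Delta_c, additive offsets \iota_c, temperature \tau, and class priors \pi_c are illustrative symbols, not notation taken from the thesis.

\[
\mathcal{L}(x, y) \;=\; -\log \frac{e^{\,\Delta_y f_y(x) + \iota_y}}{\sum_{c=1}^{k} e^{\,\Delta_c f_c(x) + \iota_c}},
\qquad \text{e.g. } \iota_c = \tau \log \pi_c ,
\]

where f_c(x) denotes the model's logit for class c. Setting all \Delta_c = 1 and \iota_c = 0 recovers the standard cross-entropy, while frequency-dependent choices of \Delta_c and \iota_c enlarge the effective margins of minority classes.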


Rights

Attribution-NonCommercial-NoDerivatives 4.0 International