UBC Theses and Dissertations

Improving language models with novel contrastive learning objectives

Khondaker, Md Tawkat Islam

Abstract

Contrastive learning (CL) has recently emerged as an effective technique in natural language processing, especially in the important area of language modeling. In this work, we propose novel methods for deploying CL in both the pretraining and the finetuning of language models. First, we present PACT (Pretraining with Adversarial Contrastive Learning for Text Classification), a novel self-supervised framework for text classification. Instead of contrasting against in-batch negatives, a popular approach in the literature, PACT mines negatives closer to the anchor representation. PACT endows the standard pretraining mechanisms of BERT with adversarial contrastive learning objectives, allowing effective joint optimization of token-level and sentence-level pretraining of the BERT model. Our experiments on 13 diverse datasets spanning token-level, single-sentence, and sentence-pair text classification tasks show that PACT achieves consistent improvements over state-of-the-art baselines. We further show that PACT regularizes both the token-level and the sentence-level embedding spaces toward more uniform representations, thereby alleviating the undesirable anisotropy of language model representations.

Subsequently, in the context of finetuning, we apply CL to cross-platform abusive language detection. The prevalence of abusive language on different online platforms has been a major concern, raising the need for automated cross-platform abusive language detection. However, prior work focuses on concatenating data from multiple platforms, inherently adopting the Empirical Risk Minimization (ERM) method. We instead address the challenge from the perspective of a domain generalization objective. We design SCL-Fish, a meta-learning algorithm integrated with supervised contrastive learning, to detect abusive language on unseen platforms. Our experimental analysis shows that SCL-Fish outperforms ERM and existing state-of-the-art models. We also show that SCL-Fish is data-efficient and, when finetuned for the abusive language detection task, achieves performance comparable to large-scale pretrained models.
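The abstract describes PACT only at a high level. As a rough illustration of the underlying idea, contrasting an anchor against negatives that are adversarially nudged toward it rather than against plain in-batch negatives, the following PyTorch sketch may help. The function names (info_nce, adversarial_negatives), the FGSM-style perturbation, and the hyperparameters (temperature, epsilon) are illustrative assumptions, not the thesis's actual PACT objective.

import torch
import torch.nn.functional as F

def info_nce(anchor, positives, negatives, temperature=0.1):
    # InfoNCE: pull each anchor toward its positive, push it away from the negatives.
    anchor = F.normalize(anchor, dim=-1)
    positives = F.normalize(positives, dim=-1)
    negatives = F.normalize(negatives, dim=-1)
    pos_logits = (anchor * positives).sum(-1, keepdim=True) / temperature   # (B, 1)
    neg_logits = anchor @ negatives.t() / temperature                        # (B, B)
    logits = torch.cat([pos_logits, neg_logits], dim=1)
    labels = torch.zeros(anchor.size(0), dtype=torch.long, device=anchor.device)
    return F.cross_entropy(logits, labels)

def adversarial_negatives(anchor, positives, negatives, temperature=0.1, epsilon=1e-2):
    # FGSM-style step: perturb the in-batch negatives in the direction that increases
    # the contrastive loss, i.e. toward the anchor, yielding harder negatives.
    negatives = negatives.detach().clone().requires_grad_(True)
    loss = info_nce(anchor.detach(), positives.detach(), negatives, temperature)
    grad, = torch.autograd.grad(loss, negatives)
    return (negatives + epsilon * grad.sign()).detach()

# Hypothetical usage with encoder outputs h, h_pos, h_neg of shape (batch, dim):
# hard_negs = adversarial_negatives(h, h_pos, h_neg)
# loss = info_nce(h, h_pos, hard_negs)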

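SCL-Fish is likewise only sketched in the abstract. Below is a minimal, assumed illustration of its two ingredients as named there: a supervised contrastive (SupCon-style) loss, in which positives are in-batch examples sharing a label, wrapped in a Fish-style, Reptile-like gradient-alignment update over platform-specific batches. Function names, learning rates, and the encoder interface are hypothetical and not the thesis's implementation.

import copy
import torch
import torch.nn.functional as F

def supcon_loss(features, labels, temperature=0.1):
    # Supervised contrastive loss: for each anchor, positives are the other
    # in-batch examples that share its label.
    features = F.normalize(features, dim=-1)
    sim = features @ features.t() / temperature                      # (B, B)
    B = features.size(0)
    self_mask = torch.eye(B, dtype=torch.bool, device=features.device)
    sim = sim.masked_fill(self_mask, -1e9)                           # exclude self-pairs
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    per_anchor = -(log_prob * pos_mask).sum(1) / pos_mask.sum(1).clamp(min=1)
    return per_anchor[pos_mask.any(1)].mean()                        # anchors with >=1 positive

def fish_step(encoder, platform_batches, inner_lr=1e-3, meta_lr=0.5):
    # Fish-style update: take sequential inner steps on batches from different
    # platforms, then move the original parameters a fraction of the way toward
    # the adapted ones, which implicitly aligns gradients across platforms.
    inner = copy.deepcopy(encoder)
    opt = torch.optim.SGD(inner.parameters(), lr=inner_lr)
    for inputs, labels in platform_batches:                           # one batch per platform
        opt.zero_grad()
        supcon_loss(inner(inputs), labels).backward()
        opt.step()
    with torch.no_grad():
        for p, q in zip(encoder.parameters(), inner.parameters()):
            p.add_(meta_lr * (q - p))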
Rights

Attribution-NonCommercial-NoDerivatives 4.0 International