Most existing deep learning-based sentiment classification methods require large amounts of human-annotated data, but labeling large volumes of high-quality emotional text is labor-intensive. Users on social platforms generate massive amounts of tagged opinionated text (e.g., tweets, customer reviews), which provides a new resource for training deep models. However, some of the tagged instances carry sentiment tags that are diametrically opposed to their true semantics, so the tagged data cannot be used directly: the noisily labeled instances harm the training phase. In this paper, we present a novel Simple Weakly-supervised Contrastive Learning framework (SWCL). We use a contrastive learning strategy to pre-train the deep model on the large user-tagged data (referred to as weakly-labeled data), and then fine-tune the pre-trained model on the small human-annotated data. We refine the contrastive loss function to better exploit inter-class contrastive patterns, making contrastive learning more applicable to the weakly-supervised setting. In addition, sampling multiple times over different sentiment pairs reduces the negative impact of label noise. SWCL captures the diverse sentiment semantics of weakly-labeled data and improves their suitability for downstream sentiment classification tasks. Our method outperforms the baseline methods in experiments on the Amazon review, Twitter, and SST-5 datasets. Even when fine-tuned on 0.5 percent of the training data (i.e., 32 instances), our framework significantly boosts the deep models’ performance, demonstrating its robustness in a few-shot learning scenario.
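The pre-training idea described above, pulling together instances that share a weak sentiment tag and pushing apart those that do not, can be illustrated with a generic label-aware contrastive loss. This is only a minimal sketch: it is a standard supervised-contrastive formulation, not the refined SWCL loss, whose exact form is not given here. The function names, the `temperature` value, and the toy embeddings are all assumptions made for illustration.

```python
import math


def normalize(vector):
    """Scale a non-zero vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def weak_label_contrastive_loss(embeddings, weak_labels, temperature=0.1):
    """Generic label-aware contrastive loss (illustrative stand-in, not the SWCL loss).

    For each anchor, instances with the same weak label act as positives and
    all remaining instances in the batch act as negatives.
    """
    z = [normalize(v) for v in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        # Temperature-scaled cosine similarity between the anchor and every other instance.
        logits = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(n)
            if j != i
        }
        positives = [j for j in logits if weak_labels[j] == weak_labels[i]]
        if not positives:
            continue  # an anchor without any positive contributes nothing
        # Log-sum-exp over all other instances, computed stably.
        peak = max(logits.values())
        log_denominator = peak + math.log(
            sum(math.exp(s - peak) for s in logits.values())
        )
        # Average negative log-probability of the positives for this anchor.
        total += -sum(logits[j] - log_denominator for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


# Toy batch (hypothetical): two tight clusters in a 2-D embedding space.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [-1.0, 0.0], [-0.9, -0.1]]
consistent_tags = [1, 1, 0, 0]  # weak tags agree with the cluster structure
conflicting_tags = [1, 0, 1, 0]  # weak tags contradict the cluster structure

loss_consistent = weak_label_contrastive_loss(toy_embeddings, consistent_tags)
loss_conflicting = weak_label_contrastive_loss(toy_embeddings, conflicting_tags)
```

On this toy batch the loss is lower when the weak tags agree with the embedding clusters than when they contradict them, which is the signal that contrastive pre-training exploits. In practice the embeddings would come from the deep model being pre-trained, and the loss would be minimized by gradient descent over many sampled batches.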