2022
DOI: 10.48550/arxiv.2201.07604
Preprint

Semi-Supervised Clustering with Contrastive Learning for Discovering New Intents

Abstract: Most real-world dialogue systems rely on predefined intents and answers for QA service, so discovering potential intents from a large corpus beforehand is important for building such dialogue services. Considering that most scenarios have only a few known intents and many intents waiting to be discovered, we focus on semi-supervised text clustering and try to make the proposed method benefit from labeled samples for better overall clustering performance. In this paper, we propose Deep Contrastive Semi…

Cited by 4 publications (3 citation statements)
References 12 publications (33 reference statements)
“…In this learning, which is done through backpropagation, pairwise constraints are used to learn better document representations. Wei et al [22] present a method based on semi-supervised text clustering in which labeled samples are used for deep contrastive semi-supervised clustering (DCSC), which jointly optimizes clustering and representation learning. Vilhagra et al [23] likewise use deep clustering with a convolutional Siamese network to learn data representations under pairwise constraints, with the K-Means algorithm applied for unsupervised clustering.…”
Section: Related Work (mentioning)
confidence: 99%
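The pairwise-constraint training described in this statement can be sketched as a classic Siamese contrastive objective: must-link pairs are pulled together, cannot-link pairs are pushed apart by at least a margin, and K-Means is then run on the learned embeddings. This is a minimal illustration under those assumptions, not the exact loss of [22] or [23]; the margin value, toy tensors, and helper name are illustrative.

```python
import torch
import torch.nn.functional as F
from sklearn.cluster import KMeans

def pairwise_constraint_loss(z1, z2, must_link, margin=1.0):
    """Siamese-style contrastive loss over pairwise constraints.

    z1, z2    : (B, d) embeddings of the two sides of each pair
    must_link : (B,) float tensor, 1.0 for must-link, 0.0 for cannot-link
    """
    d = F.pairwise_distance(z1, z2)                     # Euclidean distance per pair
    pos = must_link * d.pow(2)                          # pull must-link pairs together
    neg = (1 - must_link) * F.relu(margin - d).pow(2)   # push cannot-link pairs apart
    return (pos + neg).mean()

# Toy usage: random embeddings stand in for encoder outputs.
z1, z2 = torch.randn(8, 32), torch.randn(8, 32)
must_link = torch.tensor([1., 0., 1., 0., 1., 0., 1., 0.])
loss = pairwise_constraint_loss(z1, z2, must_link)

# After the encoder is trained, cluster the learned representations.
embeddings = torch.randn(100, 32).numpy()               # placeholder for encoded corpus
labels = KMeans(n_clusters=5, n_init=10).fit_predict(embeddings)
```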
“…Most existing NID methods (Lin et al, 2020; Zhang et al, 2021; Wei et al, 2022; Zhang et al, 2022; An et al, 2023) adopt a two-stage training strategy: pre-training on labeled data, then learning clustering-friendly representations with pseudo supervisory signals. However, previous methods rely only on semantic similarities to generate supervisory signals, based on the assumption that samples within the feature hypersphere belong to the same category as the hypersphere anchor, e.g.…”
Section: Introduction (mentioning)
confidence: 99%
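The hypersphere assumption this statement criticizes can be made concrete: every sample whose embedding lies within a fixed similarity radius of an anchor is treated as sharing that anchor's category, and those pairs become pseudo supervisory signals. A minimal sketch, assuming L2-normalized embeddings and a hand-picked cosine threshold (both assumptions, not values from the cited papers):

```python
import torch
import torch.nn.functional as F

def hypersphere_pseudo_pairs(embeddings, threshold=0.8):
    """Generate pseudo positive pairs from semantic similarity alone.

    Any sample whose cosine similarity to an anchor exceeds `threshold`
    is assumed to share the anchor's (unknown) category -- the hypersphere
    assumption, which can fail for samples near category boundaries.
    """
    z = F.normalize(embeddings, dim=1)     # unit sphere, so dot product = cosine
    sim = z @ z.t()                        # (N, N) cosine similarity matrix
    sim.fill_diagonal_(-1.0)               # exclude trivial self-pairs
    anchor, positive = torch.nonzero(sim > threshold, as_tuple=True)
    return anchor, positive                # index pairs used as pseudo signals

emb = torch.randn(16, 64)                  # placeholder for sentence embeddings
a, p = hypersphere_pseudo_pairs(emb, threshold=0.8)
```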
“…Contrastive learning has shown impressive outcomes in unsupervised sentence representation learning (Tang et al, 2022; Wei et al, 2022). The fundamental concept entails generating positive and negative pairs via data augmentation (Wei and Zou, 2019), and feeding these pairs into a pre-trained model to minimize the distance between positive pairs while maximizing the distance between negative pairs.…”
(mentioning)
confidence: 99%
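The positive/negative pairing described in this statement is commonly implemented as an NT-Xent (normalized temperature-scaled cross-entropy) loss: each sentence and its augmented view form a positive pair, and all other samples in the batch serve as negatives. A minimal sketch under those assumptions; the temperature and toy batch are illustrative, not taken from the cited work.

```python
import torch
import torch.nn.functional as F

def nt_xent_loss(z1, z2, temperature=0.1):
    """NT-Xent over a batch: (z1[i], z2[i]) are positive pairs from two
    augmented views; every other sample in the 2B batch is a negative."""
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)   # (2B, d), unit norm
    sim = z @ z.t() / temperature                        # scaled cosine similarities
    sim.fill_diagonal_(float("-inf"))                    # mask self-similarity
    B = z1.size(0)
    # The positive for index i is i+B (first view) or i-B (second view).
    targets = torch.cat([torch.arange(B) + B, torch.arange(B)])
    return F.cross_entropy(sim, targets)

# Toy usage: two "views" of the same 8 sentences, e.g. from EDA-style augmentation.
z1, z2 = torch.randn(8, 32), torch.randn(8, 32)
loss = nt_xent_loss(z1, z2)
```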