Proceedings of the 28th International Conference on Computational Linguistics 2020
DOI: 10.18653/v1/2020.coling-main.479

Data Augmentation for Multiclass Utterance Classification – A Systematic Study

Abstract: Utterance classification is a key component in many conversational systems. However, classifying real-world user utterances is challenging, as people may express their ideas and thoughts in manifold ways, and the amount of training data for some categories may be fairly limited, resulting in imbalanced data distributions. To alleviate these issues, we conduct a comprehensive survey regarding data augmentation approaches for text classification, including simple random resampling, word-level transformations, an…

Cited by 9 publications (2 citation statements)
References 41 publications

“…Few-shot learning has become a hot spot [87,51,88,89,90] since current works focus more on high-frequency judgment results than few-shot judgment results to ensure enough training data.…”
Section: Few-shot Learning Framework (citation type: mentioning)
confidence: 99%
“…Columns: text classification | text generation | structure prediction
Paraphrasing
  Thesauruses: [5], [93], [49], [7], [42], [60], [44], [45], [98] | - | [42], [43]
  Embeddings: [8], [49] | - | -
  MLMs: [10], [51], [54] | [55] | -
  Rules: [10], [7], [11] | - | [99]
  MT: [42], [60], [10], [12], [59], [61], [63], [7], [19], [66], [100], [98] | [13], [58] | [42], [57], [15]
  Seq2Seq: [18], [68], [101] | [18], [102] | [18], [16], [67], [17], [103], [82]
Noising
  Swapping: [93], [60], [44], [61], [20], [19] | - | …”
Section: Text (citation type: mentioning)
confidence: 99%