2019
DOI: 10.3390/sym11111393

Self-Supervised Contextual Data Augmentation for Natural Language Processing

Abstract: In this paper, we propose a novel data augmentation method with respect to the target context of the data via self-supervised learning. Instead of looking for the exact synonyms of masked words, the proposed method finds words that can replace the original words considering the context. For self-supervised learning, we can employ the masked language model (MLM), which masks a specific word within a sentence and obtains the original word. The MLM learns the context of a sentence through asymmetrical inputs and …
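The idea of replacing a masked word with a context-appropriate substitute can be illustrated with a small sketch. This is not the authors' implementation: a toy neighbour-count table stands in for a trained masked language model, and the names `build_context_table`, `predict_masked`, and `augment` are hypothetical helpers introduced only for this illustration.

```python
from collections import Counter, defaultdict

# Sentence-boundary markers used by this toy model (an assumption of the sketch).
START, END = "<s>", "</s>"


def build_context_table(corpus):
    """Count which words occur between each (left neighbour, right neighbour) pair.

    This is a crude stand-in for a masked language model: it only looks at
    the two adjacent tokens rather than the whole sentence.
    """
    table = defaultdict(Counter)
    for sentence in corpus:
        tokens = [START] + sentence.split() + [END]
        for i in range(1, len(tokens) - 1):
            table[(tokens[i - 1], tokens[i + 1])][tokens[i]] += 1
    return table


def predict_masked(table, tokens, position, top_k=3):
    """Return up to top_k words seen in the same context as the masked position."""
    padded = [START] + tokens + [END]
    # tokens[position] sits at padded[position + 1]; take its two neighbours.
    context = (padded[position], padded[position + 2])
    counts = table.get(context, Counter())
    return [word for word, _ in counts.most_common(top_k)]


def augment(table, sentence, position, top_k=3):
    """Mask one word and build new sentences from context-based replacements."""
    tokens = sentence.split()
    original = tokens[position]
    candidates = [w for w in predict_masked(table, tokens, position, top_k)
                  if w != original]
    return [" ".join(tokens[:position] + [w] + tokens[position + 1:])
            for w in candidates]


if __name__ == "__main__":
    corpus = [
        "the movie was great",
        "the movie was terrible",
        "the film was great",
    ]
    table = build_context_table(corpus)
    print(augment(table, "the movie was great", 1))  # ['the film was great']
```

In practice the neighbour-count table would be replaced by a pretrained masked language model that scores candidates using the full sentence; the surrounding mask, predict, and substitute loop would stay the same.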

Cited by 14 publications (5 citation statements)
References 33 publications
“…Their study aimed to improve the training efficiency of this model to help handle the precise classification of NLP problems. Park and Ahn (2019) [100] inspected the vital gains of SSL to lead to efficient detection of NLP. Researchers proposed a new approach dedicated to data augmentation that considers the intended context of the data.…”
Section: Natural Language Processing
Citation type: mentioning
confidence: 99%
“…Some approaches address the issue by integrating a semantic search capability in the OGC Catalogue Service for Web (CSW) or extracting and validating new dependencies, and, in this way, allowing for a more semantically enriched geospatial data discovery through Web ontology Language (OWL) [5,23,24]. With the introduction of deep learning and language models, some works have applied these in Natural Language Processing to extract valuable insight from unstructured text [25,26]. By leveraging NLP, keywords, textual annotations, and geospatial information within the text can be identified and semantically enriched.…”
Section: Semantic Augmentation
Citation type: mentioning
confidence: 99%
“…For example, replacing the abbreviated form with the full written form of the phrase (Coulombe, 2018), such as "He's" and 'He is'. Word replacement method represents replacing some randomly selected words with their substitute words having similar context (Abdurrahman and Purwarianti, 2019; Park and Ahn, 2019; Rizos et al, 2019; Wei and Zou, 2019). With back-translation method, an original text is translated into other, intermediate languages, which are then re-translated into the original language (Li et al, 2018a).…”
Section: Textual Data Augmentation
Citation type: mentioning
confidence: 99%