2020
DOI: 10.1109/access.2020.2981361

Construction of Machine-Labeled Data for Improving Named Entity Recognition by Transfer Learning

Abstract: Deep neural networks (DNNs) require a large amount of manually labeled training data to make significant achievements. However, manual labeling is laborious and costly. In this study, we propose a method for automatically generating training data and effectively using the generated data to reduce the labeling cost. The generated data (called "machine-labeled data") is produced using a bagging-based bootstrapping approach. However, using the machine-labeled data does not guarantee high performance because of…
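
The abstract is truncated here and does not spell out how the bagging-based bootstrapping works, so the following is only a generic sketch of the idea, not the authors' method. It assumes that several taggers are trained on bootstrap resamples of a small hand-labeled set and that an unlabeled sentence is kept as machine-labeled data only when enough of the taggers agree on every token. The names `train_lookup_tagger`, `bagging_label`, `n_models`, and `min_agreement` are hypothetical, and the lookup-table "model" is a toy stand-in for a DNN tagger.

```python
import random
from collections import Counter, defaultdict


def train_lookup_tagger(sentences):
    """Toy stand-in for a tagger: remember the most frequent tag per token."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        for token, tag in sentence:
            counts[token][tag] += 1
    table = {tok: c.most_common(1)[0][0] for tok, c in counts.items()}
    # Unknown tokens fall back to the "outside" tag.
    return lambda tokens: [table.get(tok, "O") for tok in tokens]


def bagging_label(labeled, unlabeled, n_models=5, min_agreement=0.8, seed=0):
    """Label `unlabeled` sentences by voting over bootstrap-trained taggers.

    Hypothetical sketch: a sentence is kept only if, for every token, the
    winning tag receives at least `min_agreement` of the votes.
    """
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        # Bootstrap resample: draw len(labeled) sentences with replacement.
        sample = [rng.choice(labeled) for _ in labeled]
        models.append(train_lookup_tagger(sample))

    machine_labeled = []
    for tokens in unlabeled:
        predictions = [model(tokens) for model in models]
        voted = []
        for i in range(len(tokens)):
            tag, votes = Counter(p[i] for p in predictions).most_common(1)[0]
            if votes / n_models < min_agreement:
                break  # taggers disagree on this token: drop the sentence
            voted.append(tag)
        else:
            machine_labeled.append(list(zip(tokens, voted)))
    return machine_labeled


if __name__ == "__main__":
    seed_data = [[("Alice", "PER"), ("visited", "O"), ("Paris", "LOC")]] * 4
    print(bagging_label(seed_data, [["Paris", "visited", "Alice"]]))
```

In this sketch the agreement threshold is the only filter on label quality; the abstract's caution that machine-labeled data does not by itself guarantee high performance suggests that the paper adds further steps, which are not visible in this excerpt.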

Cited by 13 publications (9 citation statements)
References 35 publications
“…The protagonistTagger tool achieves high results on all the tested novels: precision and recall above 83% (see Table 2). The tool's performance on Test small names shows that it can be successfully used for new distinct novels to create a larger corpus of annotated texts (the adapted tool along with the new datasets: https://zenodo.org/record/5060232). The best proof of novels' diversity in the test sets is the tool's performance, whose precision varies from 79% to even 96% for different novels.…”
Section: Evaluation and Datasets (mentioning)
confidence: 99%
“…The novel is a particular type of text in terms of writing style, the links between sentences, the plot's complexity, the number of characters, etc. The NER phase of the linkage process in the literary domain uses a pretrained standard NER model fine-tuned with data from the literary domain annotated with the general tag person in a semi-automatic way [2]. It was necessary due to the relatively low performance of standard models trained primarily on web data such as blogs, news, and comments [1,4].…”
Section: Introduction (mentioning)
confidence: 99%
“…Researchers have also exploited other areas of code-switching schemes, like information retrieval systems, e.g., Automatic Aspect Extraction [48], Polarity Identification [49], sequence labelling tasks such as Named Entity Recognition [50], Automatic Speech Recognition [51], and Parsing [52]. From the related works we observed that many studies have been conducted on the language identification task in many language pairs, but none of them addresses Malayalam-English, due to the lack of a publicly available data set.…”
Section: Related Work (mentioning)
confidence: 99%
“…Transfer learning is a method that uses existing knowledge to solve problems in different but related fields. Transfer learning methods have successfully been applied in image recognition [5], speech recognition [6], text recognition [7], and other fields. Research on the combination of transfer learning and fault diagnosis has gradually become a research hotspot.…”
Section: Introduction (mentioning)
confidence: 99%