Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics 2019
DOI: 10.18653/v1/p19-1231

Distantly Supervised Named Entity Recognition using Positive-Unlabeled Learning

Abstract: In this work, we explore the way to perform named entity recognition (NER) using only unlabeled data and named entity dictionaries. To this end, we formulate the task as a positive-unlabeled (PU) learning problem and accordingly propose a novel PU learning algorithm to perform the task. We prove that the proposed algorithm can unbiasedly and consistently estimate the task loss as if there is fully labeled data. A key feature of the proposed method is that it does not require the dictionaries to label every ent…
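
To make the "unbiased estimate of the task loss" in the abstract more concrete, the sketch below implements the generic unbiased risk estimator from the PU learning literature. It is an illustration under stated assumptions, not necessarily the exact estimator of the paper: the function names, the bounded absolute-error loss, and the class prior `prior` (the assumed fraction of positives among unlabeled items) are all hypothetical choices made for this example.

```python
from typing import Callable, Sequence


def abs_loss(score: float, label: int) -> float:
    """Bounded loss |score - label| for a score in [0, 1] (an assumed choice)."""
    return abs(score - label)


def _mean(values: Sequence[float]) -> float:
    return sum(values) / len(values)


def pu_risk(
    pos_scores: Sequence[float],
    unl_scores: Sequence[float],
    prior: float,
    loss: Callable[[float, int], float] = abs_loss,
) -> float:
    """Generic unbiased PU risk estimate.

    pos_scores: classifier scores for dictionary-matched (positive) items.
    unl_scores: classifier scores for unlabeled items.
    prior:      assumed fraction of positives in the unlabeled data.
    """
    # Risk of positives when treated as positive.
    pos_as_pos = _mean([loss(s, 1) for s in pos_scores])
    # Risk of positives when treated as negative (used as a correction term).
    pos_as_neg = _mean([loss(s, 0) for s in pos_scores])
    # Risk of unlabeled items when treated as negative.
    unl_as_neg = _mean([loss(s, 0) for s in unl_scores])
    # The unlabeled set mixes positives and negatives, so subtracting the
    # positive share leaves an estimate of the negative-class risk.
    neg_risk = unl_as_neg - prior * pos_as_neg
    return prior * pos_as_pos + neg_risk


# A classifier that scores every true entity 1.0 and everything else 0.0
# has zero estimated risk when one in four unlabeled items is positive.
perfect = pu_risk([1.0, 1.0], [1.0, 0.0, 0.0, 0.0], prior=0.25)
# A classifier that always outputs 0.5 has estimated risk 0.5.
uninformative = pu_risk([0.5], [0.5, 0.5], prior=0.25)
```

The correction term is what lets the estimate behave as if negatives were labeled: no item is ever assumed to be a true negative, only "unlabeled".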

Cited by 87 publications (84 citation statements)
References 35 publications
“…There are also a lot of weak labels lying on the web or gazetteers, which have not been explored. Consequently, a number of works focus on distantly supervised methods, using anchors or gazetteers to generate data by distant supervision (Liu et al, 2015; Cao et al, 2019; Peng et al, 2019).…”
Section: Related Work (mentioning)
confidence: 99%
“…Then we step into the second phase (i.e., NEE) in which the model is trained to extract typed entities with gazetteer-labeled data. Peng et al, 2019). A standard strategy is to scan through the anchor text in D g using the gazetteer of a given entity type y and treat anchors matched with entries of the given gazetteer as the entities with type y.…”
Section: Named Entity Extraction (mentioning)
confidence: 99%
“…Distant-LSTM-CRF [3] propose for the distantly supervised aspect term extraction, which can be viewed as an entity recognition task of a single type for business reviews. AdaPU [16] propose algorithm using unlabeled data and dictionary to perform NER tasks while using AdaSampling way to expand the named entity recognition dictionary.…”
Section: Related Work (mentioning)
confidence: 99%
“…The method then uses the classifier to perform unsupervised aspect term extraction by training on the auto-tagged datasets obtained by the method. AdaPU [16] explored ways to perform NER using only unlabeled data and a dictionary of named entities. The method represents the task as a Positive Unlabeled (PU) learning problem, and proposes a PU learning algorithm to perform the task.…”
Section: Distant-LSTM-CRF [3] (mentioning)
confidence: 99%
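
The gazetteer-matching strategy quoted in the Named Entity Extraction statement above (treat text spans that match a dictionary entry as entities of that type) can be sketched roughly as follows. This is a simplified token-level illustration, not the procedure of any cited paper: the function name `dictionary_label`, the greedy longest-match rule, the `max_len` cap, and the `P`/`U` tags are all assumptions made for this example.

```python
from typing import Iterable, List


def dictionary_label(
    tokens: List[str], gazetteer: Iterable[str], max_len: int = 4
) -> List[str]:
    """Tag tokens covered by a gazetteer entry as 'P' (positive), others as 'U'.

    Unmatched tokens stay unlabeled rather than negative, because an
    incomplete dictionary cannot prove that a token is not an entity.
    """
    # Store entries as lowercase token tuples for case-insensitive lookup.
    entries = {tuple(entry.lower().split()) for entry in gazetteer}
    lowered = [token.lower() for token in tokens]
    labels = ["U"] * len(tokens)

    i = 0
    while i < len(tokens):
        matched = 0
        # Try the longest candidate span first (greedy longest match).
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            if tuple(lowered[i : i + n]) in entries:
                matched = n
                break
        if matched:
            for j in range(i, i + matched):
                labels[j] = "P"
            i += matched
        else:
            i += 1
    return labels


sentence = "Barack Obama visited New York City".split()
tags = dictionary_label(sentence, {"Barack Obama", "New York"})
```

Here "City" stays `U` even though it belongs to an entity, which is the kind of incomplete dictionary coverage that motivates treating unmatched tokens as unlabeled instead of negative.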