Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics 2019
DOI: 10.18653/v1/p19-1510

NNE: A Dataset for Nested Named Entity Recognition in English Newswire

Abstract: Named entity recognition (NER) is widely used in natural language processing applications and downstream tasks. However, most NER tools target flat annotation from popular datasets, eschewing the semantic information available in nested entity mentions. We describe NNE, a fine-grained, nested named entity dataset over the full Wall Street Journal portion of the Penn Treebank (PTB). Our annotation comprises 279,795 mentions of 114 entity types with up to 6 layers of nesting. We hope the public release of this la…

Cited by 49 publications (52 citation statements)
References: 22 publications
“…Our experiments were conducted on four nested named entity recognition datasets: GENIA [5] (biomedical domain), NNE [2] (news domain), PolEval [33] (mixed texts, a Polish corpus), and GermEval [34] (news and Wikipedia, a German corpus). Each dataset was split into three parts: training, validation, and test sets.…”
Section: Methods (mentioning)
confidence: 99%
“…It contains annotations of the Wall Street Journal portion of the Penn Treebank (PTB). The total number of named entities is 279 795, of which 118 525 […] [2] evaluated three variants of baseline model based on BiLSTM-CRF architecture: (a) a flat NER model detecting only the outermost entities; (b) a flat NER model detecting only the innermost entities; and (c) a combination of predictions of those two models. Our comparison also includes other known methods, retrained and evaluated using publicly available source codes.…”
Section: NNE (mentioning)
confidence: 99%
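
The three baseline variants named in the statement above (outermost only, innermost only, and their combination) can be illustrated on the span level. The following is a minimal sketch, not the cited authors' implementation: it works on plain span tuples rather than BiLSTM-CRF predictions, and the `Span` layout, the function names, and the example labels are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical span format: (start_token, end_token_exclusive, label).
Span = Tuple[int, int, str]


def contains(outer: Span, inner: Span) -> bool:
    """True if `outer` covers `inner` and their boundaries differ."""
    return (
        outer[0] <= inner[0]
        and inner[1] <= outer[1]
        and outer[:2] != inner[:2]
    )


def outermost(spans: List[Span]) -> List[Span]:
    """Keep spans that are not contained in any other span."""
    return [s for s in spans if not any(contains(o, s) for o in spans)]


def innermost(spans: List[Span]) -> List[Span]:
    """Keep spans that do not contain any other span."""
    return [s for s in spans if not any(contains(s, i) for i in spans)]


def combine(a: List[Span], b: List[Span]) -> List[Span]:
    """Union of two span lists, deduplicated and sorted."""
    return sorted(set(a) | set(b))


# Illustrative three-level nest over "University of New South Wales";
# the labels are made up and are not taken from the NNE type inventory.
spans = [(0, 5, "UNIVERSITY"), (2, 5, "STATE"), (4, 5, "COUNTRY")]

outer = outermost(spans)
inner = innermost(spans)
both = combine(outer, inner)
```

In this toy example the combined set holds only the outermost and innermost spans, so the middle-layer span `(2, 5, "STATE")` is not recovered. This shows why combining two flat views cannot cover mentions nested more than two layers deep.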