Joint Entity Recognition and Disambiguation

Luo, Gang; Huang, Xin; Lin, Chin-Yew; Nie, Zaiqing

doi:10.18653/v1/d15-1104

Cited by 257 publications

(175 citation statements)

References 15 publications

Supporting

Mentioning

164

Contrasting


“…Such named entities (including several polysemy) in the training set, development set, and test set reach relatively high percentage of respective 6.9%, 4.4%, and 6.5%. The inconsistent annotation and inconsistent tag assignment may be able to explain why most state-of-the-art NER methods achieve the F1 at around 94.5% on the development set and around 91.5% on the test set [12,20,24,25,29,33,46], and why more than 10 years' effort improves the F1 by only 0.8% on the development set (from 2003's 93.9% [16] to current 94.7% [25]) and by only 2.9% on the test set (from 2003's 88.7% [16] to current 91.6% [12]). The two inconsistency problems seem to limit the upper bound of the performance on development set at near 94.5% and the one on test set at near 91.5%.…”

Section: Discussion (mentioning)

confidence: 99%

Time Expression Recognition Using a Constituent-based Tagging Scheme

Zhong

Wang

2018

Proceedings of the 2018 World Wide Web Conference on World Wide Web - WWW '18


We find from four datasets that time expressions are formed by loose structure and the words used to express time information can differentiate time expressions from common text. The findings drive us to design a learning method named TOMN to model time expressions. TOMN defines a constituent-based tagging scheme named TOMN scheme with four tags, namely T, O, M, and N, indicating the constituents of time expression, namely Time token, Modifier, Numeral, and the words Outside time expression. In modeling, TOMN assigns a word with a TOMN tag under conditional random fields with minimal features. Essentially, our constituent-based TOMN scheme overcomes the problem of inconsistent tag assignment that is caused by the conventional position-based tagging schemes (e.g., BIO scheme and BILOU scheme). Experiments show that TOMN is equally or more effective than state-of-the-art methods on various datasets, and much more robust on cross-datasets. Moreover, our analysis can explain many empirical observations in other works about time expression recognition and named entity recognition.
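The contrast the abstract draws between constituent-based and position-based tagging can be illustrated with a minimal sketch. The mini-lexicons, function names, and example sentences below are assumptions made for illustration, and a plain dictionary lookup stands in for the conditional random field that TOMN uses, so this shows only the tag assignment, not the learning method.

```python
# Hypothetical mini-lexicons (assumptions, not the paper's resources).
TIME_TOKENS = {"september", "monday"}
MODIFIERS = {"last", "next"}
NUMERALS = {"10", "two"}


def tomn_tags(tokens):
    """Constituent-based tags: each word is tagged by what kind of
    constituent it is (T, M, N), or O if it is outside a time expression."""
    tags = []
    for token in tokens:
        word = token.lower()
        if word in TIME_TOKENS:
            tags.append("T")
        elif word in MODIFIERS:
            tags.append("M")
        elif word in NUMERALS:
            tags.append("N")
        else:
            tags.append("O")
    return tags


def bio_tags(tokens, spans):
    """Position-based tags: each word is tagged by where it sits in a
    labelled span, given as (start, end) with end exclusive."""
    tags = ["O"] * len(tokens)
    for start, end in spans:
        tags[start] = "B"
        for i in range(start + 1, end):
            tags[i] = "I"
    return tags


# Two sentences whose time expressions both contain "September".
sent_a = ["He", "left", "last", "September"]
sent_b = ["He", "left", "September", "10"]
span = [(2, 4)]  # tokens 2..3 form the time expression in both sentences

for sent in (sent_a, sent_b):
    print(sent, "BIO:", bio_tags(sent, span), "TOMN:", tomn_tags(sent))
```

Under BIO, "September" is tagged I in the first sentence and B in the second, because its tag depends on its position in the span. Under the constituent-based lookup it is tagged T both times, which is the consistency the abstract refers to.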



“…On the basis of the CoNLL-2003 [15] English dataset, we evaluated the effects of the character learning components of the model and compared them with those obtained by Chiu and Nichols [7], Luo et al [26], Lample et al [8], and Ma and Hovy [9]. Other models were also compared.…”

Section: Methods (mentioning)

confidence: 99%

Character Feature Learning for Named Entity Recognition

Zeng

Tan

Zhang

et al. 2018

IEICE Trans. Inf. & Syst.


SUMMARY: The deep neural named entity recognition model automatically learns and extracts the features of entities and solves the problem of the traditional model relying heavily on complex feature engineering and obscure professional knowledge. This issue has become a hot topic in recent years. Existing deep neural models only involve simple character learning and extraction methods, which limit their capability. To further explore the performance of deep neural models, we propose two character feature learning models based on convolution neural network and long short-term memory network. These two models consider the local semantic and position features of word characters. Experiments conducted on the CoNLL-2003 dataset show that the proposed models outperform traditional ones and demonstrate excellent performance.


“…This avenue has recently been explored by Durrett and Klein [2], Luo et al [14] and Nguyen et al [18]. Consistently, these approaches extend conditional random fields (CRF; [10]) which constitute the state-of-the-art in named entity recognition.…”

Section: Related Work (mentioning)

confidence: 99%

Joint Entity Recognition and Linking in Technical Domains Using Undirected Probabilistic Graphical Models

Horst

Hartung

Cimiano

2017

Lecture Notes in Computer Science


Abstract. The problems of recognizing mentions of entities in texts and linking them to unique knowledge base identifiers have received considerable attention in recent years. In this paper we present a probabilistic system based on undirected graphical models that jointly addresses both the entity recognition and the linking task. Our framework considers the span of mentions of entities as well as the corresponding knowledge base identifier as random variables and models the joint assignment using a factorized distribution. We show that our approach can be easily applied to different technical domains by merely exchanging the underlying ontology. On the task of recognizing and linking disease names, we show that our approach outperforms the state-of-the-art systems DNorm and TaggerOne, as well as two strong lexicon-based baselines. On the task of recognizing and linking chemical names, our system achieves comparable performance to the state-of-the-art.

