Proceedings of the 6th Conference on Message Understanding (MUC6 '95), 1995
DOI: 10.3115/1072399.1072405

A model-theoretic coreference scoring scheme

Abstract: This note describes a scoring scheme for the coreference task in MUC6. It improves on the original approach by: (1) grounding the scoring scheme in terms of a model; (2) producing more intuitive recall and precision scores; and (3) not requiring explicit computation of the transitive closure of coreference. The principal conceptual difference is that we have moved from a syntactic scoring model based on following coreference links to an approach defined by the model theory of those links.
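
The scheme the abstract describes is the link-based MUC score. The sketch below is a minimal Python illustration under a few assumptions: mentions are hashable identifiers, each entity (coreference chain) is given as a set of mentions, and the function names are illustrative, not from the paper. Recall accumulates, over each key entity S, the quantity |S| - |p(S)|, where p(S) is the partition of S induced by the response entities (unlinked mentions become singletons), against a ceiling of |S| - 1; precision swaps the roles of key and response. Because the counts come straight from the partitions, the transitive closure of the links never has to be materialized.

    def muc_score(key, response):
        # Link-based score of `key` against `response`:
        #   sum_S (|S| - |p(S)|)  /  sum_S (|S| - 1)
        # where p(S) partitions each key entity S by the response
        # entities; mentions absent from the response count as singletons.
        resp_of = {m: i for i, chain in enumerate(response) for m in chain}
        numerator = denominator = 0
        for chain in key:
            linked_parts = {resp_of[m] for m in chain if m in resp_of}
            singletons = sum(1 for m in chain if m not in resp_of)
            numerator += len(chain) - (len(linked_parts) + singletons)
            denominator += len(chain) - 1  # singleton key entities add 0
        return numerator / denominator if denominator else 0.0

    def muc_recall_precision_f1(key, response):
        r = muc_score(key, response)   # recall: partition key by response
        p = muc_score(response, key)   # precision: partition response by key
        f1 = 2 * p * r / (p + r) if p + r else 0.0
        return r, p, f1

For instance, with key = [{"A", "B", "C", "D"}] and response = [{"A", "B"}, {"C", "D"}], muc_recall_precision_f1 returns recall 2/3, precision 1.0, and F1 0.8, reflecting that the response merely failed to link the two halves of a single entity.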

Cited by 402 publications (299 citation statements) · References: 0 publications

“…There does not appear to be a single standard evaluation metric in the coreference resolution community. We opted to use the following three: MUC-6 [38], CEAF [23], and B-CUBED [1], which seem to be the most widely accepted metrics. All three metrics compute Recall, Precision and F-Scores on aligned gold-standard and resolver-tool coreference chains.…”
Section: Automatic Extrinsic Evaluation Of Clarity
confidence: 99%
“…Evaluation Metrics. We compute the three most popular performance metrics for coreference resolution: MUC (Vilain et al., 1995), B-Cubed (Bagga and Baldwin, 1998), and Entity-based CEAF (CEAFφ4) (Luo, 2005). As is commonly done in CoNLL shared tasks (Pradhan et al., 2012), we employ the average F1 score (CoNLL F1) of these three metrics for comparison purposes.…”
Section: Experiments and Results
confidence: 99%
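
As a concrete reading of the averaging step in the excerpt above, the CoNLL F1 is simply the unweighted mean of the three metrics' F1 scores; the variable names in this one-liner are illustrative:

    conll_f1 = (f1_muc + f1_b_cubed + f1_ceaf_e) / 3.0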
“…anaphora. We compared our system to this baseline using the unweighted average of F1-measure over B-CUBED (Bagga and Baldwin, 1998), MUC (Vilain et al., 1995), and CEAF (Luo, 2005) metrics, the standard evaluation metrics for coreference resolution. We used the scripts provided by the i2b2 shared task organizers for this purpose.…”
Section: Discussion
confidence: 99%