In this paper we discuss manual and automatic evaluations of summaries using data from the Document Understanding Conference 2001 (DUC-2001). We first show the instability of the manual evaluation. Specifically, the low inter-human agreement indicates that more reference summaries are needed. To investigate the feasibility of automated summary evaluation based on the recent BLEU method from machine translation, we use accumulative n-gram overlap scores between system and human summaries. The initial results show encouraging correlations with human judgments, based on the Spearman rank-order correlation coefficient. However, the relative ranking of systems needs to take this instability into account.
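The core of such an automatic evaluation is an n-gram overlap score between a system summary and one or more human references. Below is a minimal, recall-oriented sketch of that idea in Python; the clipping and pooling over references are illustrative assumptions, not the exact accumulation procedure used in the paper.

```python
from collections import Counter

def ngram_counts(tokens, n):
    """Multiset of n-grams in a token sequence."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def ngram_overlap(system, references, n=1):
    """Clipped n-gram matches pooled over references, normalized by the
    total reference n-gram count (a recall-oriented overlap score)."""
    sys_counts = ngram_counts(system, n)
    matched = total = 0
    for ref in references:
        ref_counts = ngram_counts(ref, n)
        matched += sum(min(c, sys_counts[g]) for g, c in ref_counts.items())
        total += sum(ref_counts.values())
    return matched / total if total else 0.0

# Toy usage with made-up summaries:
system = "the cat sat on the mat".split()
references = ["a cat sat on the mat".split(), "the cat was on the mat".split()]
print(ngram_overlap(system, references, n=1))
```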
We propose a hierarchical attention network for document classification. Our model has two distinctive characteristics: (i) it has a hierarchical structure that mirrors the hierarchical structure of documents; (ii) it has two levels of attention mechanisms applied at the word and sentence level, enabling it to attend differentially to more and less important content when constructing the document representation. Experiments conducted on six large-scale text classification tasks demonstrate that the proposed architecture outperforms previous methods by a substantial margin. Visualization of the attention layers illustrates that the model selects qualitatively informative words and sentences.
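To make the two-level design concrete, here is a minimal PyTorch sketch of the architecture: words in each sentence are encoded and attention-pooled into sentence vectors, which are in turn encoded and attention-pooled into a single document vector. The encoder choice (bidirectional GRUs), the dimensions, and the names (`attend`, `HAN`) are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

def attend(h, proj, context):
    # h: (n, steps, dim). Score each step against a learned context vector,
    # softmax over steps, and return the attention-weighted sum.
    u = torch.tanh(proj(h))                        # (n, steps, att_dim)
    weights = torch.softmax(u @ context, dim=1)    # (n, steps)
    return (weights.unsqueeze(-1) * h).sum(dim=1)  # (n, dim)

class HAN(nn.Module):
    def __init__(self, vocab_size, n_classes, emb=100, hid=50, att=100):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb)
        self.word_gru = nn.GRU(emb, hid, batch_first=True, bidirectional=True)
        self.sent_gru = nn.GRU(2 * hid, hid, batch_first=True, bidirectional=True)
        self.word_proj = nn.Linear(2 * hid, att)
        self.sent_proj = nn.Linear(2 * hid, att)
        self.word_ctx = nn.Parameter(torch.randn(att))
        self.sent_ctx = nn.Parameter(torch.randn(att))
        self.out = nn.Linear(2 * hid, n_classes)

    def forward(self, docs):
        # docs: (batch, n_sents, n_words) of token ids
        b, s, w = docs.shape
        h, _ = self.word_gru(self.emb(docs.view(b * s, w)))  # encode words
        sents = attend(h, self.word_proj, self.word_ctx)     # word-level attention
        h, _ = self.sent_gru(sents.view(b, s, -1))           # encode sentences
        doc = attend(h, self.sent_proj, self.sent_ctx)       # sentence-level attention
        return self.out(doc)                                 # class logits
```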
State-of-the-art sequence labeling systems traditionally require large amounts of task-specific knowledge in the form of handcrafted features and data pre-processing. In this paper, we introduce a novel neural network architecture that benefits from both word- and character-level representations automatically, by using a combination of bidirectional LSTM, CNN and CRF. Our system is truly end-to-end, requiring no feature engineering or data preprocessing, thus making it applicable to a wide range of sequence labeling tasks. We evaluate our system on two datasets for two sequence labeling tasks: the Penn Treebank WSJ corpus for part-of-speech (POS) tagging and the CoNLL 2003 corpus for named entity recognition (NER). We obtain state-of-the-art performance on both datasets: 97.55% accuracy for POS tagging and 91.21% F1 for NER.
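The following PyTorch sketch shows how the pieces compose: a character-level CNN builds a per-word representation, which is concatenated with a word embedding and passed through a bidirectional LSTM to produce per-tag emission scores. All hyperparameters here are illustrative assumptions, and the CRF layer (transition scores plus Viterbi decoding over the emissions) is omitted for brevity.

```python
import torch
import torch.nn as nn

class BiLSTMCNNTagger(nn.Module):
    def __init__(self, n_words, n_chars, n_tags,
                 word_dim=100, char_dim=30, char_filters=30, hidden=200):
        super().__init__()
        self.word_emb = nn.Embedding(n_words, word_dim)
        self.char_emb = nn.Embedding(n_chars, char_dim)
        # Character-level CNN: convolve over each word's character sequence,
        # then max-pool to a fixed-size word representation.
        self.char_cnn = nn.Conv1d(char_dim, char_filters, kernel_size=3, padding=1)
        self.lstm = nn.LSTM(word_dim + char_filters, hidden,
                            batch_first=True, bidirectional=True)
        # Per-tag emission scores; a CRF layer (omitted here) would add
        # transition scores and Viterbi decoding on top of these.
        self.emit = nn.Linear(2 * hidden, n_tags)

    def forward(self, words, chars):
        # words: (batch, seq_len); chars: (batch, seq_len, max_word_len)
        b, t, c = chars.shape
        ch = self.char_emb(chars.view(b * t, c)).transpose(1, 2)  # (b*t, char_dim, c)
        ch = torch.relu(self.char_cnn(ch)).max(dim=2).values      # (b*t, char_filters)
        x = torch.cat([self.word_emb(words), ch.view(b, t, -1)], dim=-1)
        h, _ = self.lstm(x)
        return self.emit(h)  # (batch, seq_len, n_tags) emission scores
```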
Following the recent adoption by the machine translation community of automatic evaluation using the BLEU/NIST scoring process, we conduct an in-depth study of a similar idea for evaluating summaries. The results show that automatic evaluation using unigram co-occurrences between summary pairs correlates surprisingly well with human evaluations, based on various statistical metrics, while direct application of the BLEU evaluation procedure does not always give good results.
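The correlation itself can be checked by scoring each system under both the human evaluation and the automatic metric and computing a rank-order correlation. A short sketch, using scipy and made-up per-system scores purely to show the computation:

```python
from scipy.stats import spearmanr

# Hypothetical per-system scores (illustrative values, not results from the
# paper): mean human evaluation score and the automatic unigram
# co-occurrence score for five summarization systems.
human_scores = [0.61, 0.48, 0.55, 0.39, 0.44]
auto_scores = [0.58, 0.45, 0.57, 0.35, 0.41]

# Spearman's rho measures how well the automatic metric preserves the
# human ranking of systems; rho near 1 means the rankings agree.
rho, p_value = spearmanr(human_scores, auto_scores)
print(f"Spearman rho = {rho:.2f} (p = {p_value:.3f})")
```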