Proceedings of the 5th Information Interaction in Context Symposium 2014
DOI: 10.1145/2637002.2637017
On choosing an effective automatic evaluation metric for microblog summarisation

Abstract: Popular microblogging services, such as Twitter, are engaging millions of users who constantly post and share information about news and current events each day, resulting in millions of messages discussing what is happening in the world. To help users obtain an overview of microblog content relating to topics and events that they are interested in, classical summarisation techniques from the newswire domain have been successfully applied and extended for use on microblogs. However, much of the current literat…

Cited by 10 publications (8 citation statements)
References 20 publications
“…Similarly, Giannakopoulos and Karkaletsis (2013) use machine learning to learn a linear combination of n-gram methods to evaluate summaries. Mackie et al. (2014), Giannakopoulos (2013), and Cohan and Goharian (2016) investigate evaluation for microblog, multilingual, and scientific summarization, respectively. Our evaluation, in contrast, uses newswire datasets, since this is the most prominent application domain for automatic summarization.…”

Section: Related Work
confidence: 99%
“…In this section, we examine whether the ROUGE-2 metric aligns with user judgements, reproducing and validating previous findings, but generalising to the context of crowd-sourcing. This provides a measure of confidence in using crowd-sourced evaluations of newswire summarisation, as has previously been demonstrated for microblog summarisation [11,12]. Our user study is conducted via CrowdFlower 1 , evaluating 5 baseline systems and 7 state-of-the-art systems over the DUC 2004 dataset, using summary texts from SumRepo.…”

Section: Crowd-sourced User Study To Validate That the Rouge-2 Metric
confidence: 98%
“…The key idea is that a summary should be similar to the original text with regard to characteristic criteria such as the word distribution. Mackie et al. (2014) find that topic words are a suitable metric to automatically evaluate microblog summaries.…”

Section: Text Summarization
confidence: 99%