Building a large-scale corpus for evaluating event detection on twitter

McMinn, Andrew James; Moshfeghi, Yashar; Jose, Joemon M.

doi:10.1145/2505515.2505695

Cited by 171 publications

(128 citation statements)

References 15 publications

Citation statement types: Supporting; Mentioning: 118; Contrasting; Unclassified

“…McMinn et al [55] propose a methodology for creating a corpus to evaluate event detection methods. They used two existing state-of-the art event detection approaches [28,54] together with Wikipedia to create a set of candidate events together with a list of associated tweets.…”

Section: Available Corpora For Evaluation (mentioning)

confidence: 99%

“…For example, the organizers of the 2014 SNOW challenge [12] could only crawl 1 106 712 of the original 3 630 816 tweets of the above-mentioned 2012 US Presidential Election data set [37]. In order to assess how useable these collections of tweet identifiers are, we attempted to download the corpus of McMinn et al [55]. The standard restriction of crawling tweets with the Twitter API 4 is set to 180 queries per 15 minute window.…”

Section: Available Corpora For Evaluation (mentioning)

confidence: 99%
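
To give a feel for the rate limit quoted in the statement above, the following minimal sketch estimates how long a crawl would take at 180 queries per 15-minute window. The tweet count used is the 3 630 816 figure quoted for the 2012 US Presidential Election data set, purely as an illustration. The function name and the one-tweet-per-query setting are assumptions, not something the quoted text specifies. Every started window is counted in full, so the result is a rough upper bound.

```python
import math

QUERIES_PER_WINDOW = 180  # limit quoted in the citation statement
WINDOW_MINUTES = 15       # window length quoted in the citation statement
TWEETS_PER_QUERY = 1      # ASSUMPTION: one tweet returned per query


def crawl_time_minutes(n_tweets, tweets_per_query=TWEETS_PER_QUERY):
    """Rough upper bound on crawl time (hypothetical helper)."""
    # Number of queries needed to fetch every tweet identifier.
    queries = math.ceil(n_tweets / tweets_per_query)
    # Number of rate-limit windows needed; each started window counts in full.
    windows = math.ceil(queries / QUERIES_PER_WINDOW)
    return windows * WINDOW_MINUTES


if __name__ == "__main__":
    minutes = crawl_time_minutes(3_630_816)
    print(f"{minutes} minutes, about {minutes / 60 / 24:.1f} days")
```

Raising `tweets_per_query` (for an endpoint that returns several tweets per request) shortens the estimate proportionally, which is why the assumption matters.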

Editorial: Survey and Experimental Analysis of Event Detection Techniques for Twitter

Weiler, Grossniklaus, Scholl (2016). The Computer Journal

Twitter's popularity as a source of up-to-date news and information is constantly increasing. In response to this trend, numerous event detection techniques have been proposed to cope with the rate and volume of Twitter data streams. Although most of these works conduct some evaluation of the proposed technique, a comparative study is often omitted. In this paper, we present a survey and experimental analysis of state-of-the-art event detection techniques for Twitter data streams. In order to conduct this study, we define a series of measures to support the quantitative and qualitative comparison. We demonstrate the effectiveness of these measures by applying them to event detection techniques as well as to baseline approaches using real-world Twitter streaming data.

“…On the other hand, Aggarwal, et al [3] states that a news event is "something that happens at a specific time and place, but it is also an object of interest to the news media". Similarly, McMinn et al [9] define an event as something significant happening in a specific time and place beside it lead to discussions by the news media. This event might be a political event, natural disaster, terror attack or a protest, etc.…”

Section: Event Definition (mentioning)

confidence: 99%

“…Thus, the methods and techniques used for these kinds of events should be evaluated in terms of how fast they can be identified rather than just evaluating based on precision and recall measurements [1]. Unfortunately, there are very few ED evaluation datasets [9]. The TDT5 dataset has been utilized by many studies [43], to evaluate precision.…”

Section: Evaluation Challenges (mentioning)

confidence: 99%

Challenges of event detection from social media streams

Al-Dyani, Hussein, Ahmad (2018). IJET

The area of Event Detection (ED) has attracted researchers' attention over the last few years because of the wide use of social media. Many studies have examined the problem of ED in various social media platforms, like Twitter, Facebook, YouTube, etc. The ED task for social networks involves many issues, including the processing of huge volumes of data with a high level of noise, data collection and privacy issues, etc. Hence, this article discusses and presents the wide range of challenges encountered in the ED process from unstructured text data for the most popular Social Networks (SNs), such as Facebook and Twitter. The main goal is to aid the researchers to understand the main challenges and to discuss the future directions in the ED area.

“…We test our approach on two gold standard corpora: the First Story Detection (FSD) corpus (Petrović et al, 2012) and the EVENT2012 corpus (McMinn et al, 2013).…”

Section: Dataset (mentioning)

confidence: 99%

Graph-based Event Extraction from Twitter

Edouard, Cabrio, Tonelli, et al. (2017). RANLP 2017 - Recent Advances in Natural Language Processing Meet Deep Learning

Detecting which tweets describe a specific event and clustering them is one of the main challenging tasks related to Social Media currently addressed in the NLP community. Existing approaches have mainly focused on detecting spikes in clusters around specific keywords or Named Entities (NE). However, one of the main drawbacks of such approaches is the difficulty in understanding when the same keywords describe different events. In this paper, we propose a novel approach that exploits NE mentions in tweets and their entity context to create a temporal event graph. Then, using simple graph theory techniques and a PageRank-like algorithm, we process the event graphs to detect clusters of tweets describing the same events. Experiments on two gold standard datasets show that our approach achieves state-of-the-art results both in terms of evaluation performances and the quality of the detected events.
