Open domain event extraction from twitter

Ritter, Alan; Etzioni, Oren; Clark, Sam

doi:10.1145/2339530.2339704

Cited by 470 publications

(371 citation statements)

References 25 publications

Citation types: Supporting, Mentioning (367), Contrasting, Unclassified


“…Ritter et al [22] presented a system called TwiCal to extract and categorize events from Twitter. The strength of association between each named entity and date based on the number of tweets they cooccur in is measured to determine whether the extracted event is significant.…”

Section: Event Extraction (mentioning)

confidence: 99%
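The association measure described in the statement above can be illustrated with a short sketch. The quoted text only says that association strength is based on the number of tweets in which an entity and a date co-occur; it does not name the statistic. The code below assumes a G² log-likelihood ratio over a 2x2 contingency table, which is one common choice for this kind of count data. The function names `g2` and `association_scores`, and the input format (one pair of entity set and date set per tweet), are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def g2(k_ed, k_e, k_d, n):
    """G^2 statistic for a 2x2 table built from tweet counts.

    k_ed: tweets mentioning both the entity and the date
    k_e:  tweets mentioning the entity
    k_d:  tweets mentioning the date
    n:    total number of tweets (must be > 0)
    """
    observed = [
        k_ed,                   # entity and date
        k_e - k_ed,             # entity, no date
        k_d - k_ed,             # date, no entity
        n - k_e - k_d + k_ed,   # neither
    ]
    # Expected counts if entity and date were independent.
    expected = [
        k_e * k_d / n,
        k_e * (n - k_d) / n,
        (n - k_e) * k_d / n,
        (n - k_e) * (n - k_d) / n,
    ]
    # Cells with an observed count of zero contribute nothing to the sum.
    return 2 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)


def association_scores(tweets):
    """Score every (entity, date) pair that co-occurs in at least one tweet.

    tweets: iterable of (entities, dates) pairs, one per tweet (assumed format).
    """
    tweets = list(tweets)
    n = len(tweets)
    entity_counts, date_counts, pair_counts = Counter(), Counter(), Counter()
    for entities, dates in tweets:
        entities, dates = set(entities), set(dates)
        entity_counts.update(entities)
        date_counts.update(dates)
        pair_counts.update((e, d) for e in entities for d in dates)
    return {
        (e, d): g2(k, entity_counts[e], date_counts[d], n)
        for (e, d), k in pair_counts.items()
    }
```

Pairs with a higher score co-occur more often than independence would predict, so a threshold or a ranking over these scores could be used to keep only the more significant events. Note that G² measures departure from independence in either direction, so a real system would also need to check that the association is positive.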

“…We argue that this is a reasonable choice since newsworthy events would be more interesting than others. In total, we have The baseline we chose is TwiCal [22], the state-of-the-art open event extraction system on tweets. Each event extracted in the baseline is represented as a 3-tuple <y, d, k>, where y stands for a non-location named entity, d for a date and k for an event phrase.…”

Section: Setup (mentioning)

confidence: 99%

“…We re-implemented the whole system and evaluated the performance of the baseline on the correctness of the three extracted elements only, excluding the location element. Moreover, the parameters of TwiCal are optimized based on the suggestions in [22].…”

Section: Setup (mentioning)

confidence: 99%


Unsupervised event exploration from social text streams

Zhou, Chen, Zhang, et al. 2017

IDA


Abstract. Social media provides unprecedented opportunities for people to disseminate information and share their opinions and views online. Extracting events from social media platforms such as Twitter could help in understanding what is being discussed. However, event extraction from social text streams poses huge challenges due to the noisy nature of social media posts and the dynamic evolution of language. We propose a generic unsupervised framework for exploring events on Twitter which consists of four major steps: filtering, pre-processing, extraction and categorization, and post-processing. Tweets published in a certain time period are aggregated, and noisy tweets which do not contain newsworthy events are removed in the filtering step. The remaining tweets are pre-processed by temporal resolution, part-of-speech tagging and named entity recognition in order to identify the key elements of events. An unsupervised Bayesian model is proposed to automatically extract the structured representations of events in the form of quadruples < entity, keyword, date, location > and further categorize the extracted events into event types. Finally, the categorized events are assigned event type labels without human intervention. The proposed framework has been evaluated on over 60 million tweets which were collected for one month in December 2010. A precision of 78.01% is achieved for event extraction using our proposed Bayesian model, outperforming a competitive baseline by nearly 13.6%. Moreover, events are also clustered into coherent groups with the automatically assigned event type labels with an accuracy of 42.57%.
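The quadruple representation and the precision figure mentioned in the abstract above can be made concrete with a minimal sketch. Everything in it is an assumption for illustration: the record type `EventQuad`, the `precision` helper, and the example events are hypothetical, and the abstract does not say how correctness was judged in the actual evaluation.

```python
from typing import NamedTuple, Optional


class EventQuad(NamedTuple):
    """One extracted event as an <entity, keyword, date, location> quadruple."""
    entity: str
    keyword: str
    date: str                 # resolved calendar date, e.g. "2010-12-05"
    location: Optional[str]   # None when no location was found


def precision(extracted, correct):
    """Fraction of extracted events that are also in the set judged correct."""
    extracted = set(extracted)
    if not extracted:
        return 0.0
    return len(extracted & set(correct)) / len(extracted)


# Hypothetical example data, not taken from the paper.
extracted_events = [
    EventQuad("Entity A", "concert", "2010-12-05", "London"),
    EventQuad("Entity B", "election", "2010-12-12", None),
    EventQuad("Entity C", "launch", "2010-12-20", "Tokyo"),
    EventQuad("Entity D", "match", "2010-12-24", None),
]
judged_correct = extracted_events[:3]
score = precision(extracted_events, judged_correct)
```

Using an immutable, hashable record makes it straightforward to deduplicate events and to compare a system's output against a judged set with ordinary set operations.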


“…They present a pipeline, somewhat similar to Raimond, which extracts named entities, event phrases, calendar dates and event types. Their pipeline combines custom NLP tools and unsupervised learning [18].…”

Section: Related Work (mentioning)

confidence: 99%

Raimond: Quantitative Data Extraction from Twitter to Describe Events

Sellam, Alonso 2015

Engineering the Web in the Big Data Era


Abstract. Social media play a decisive role in communicating and spreading information during global events. In particular, real-time microblogging platforms such as Twitter have become prevalent. Researchers have used microblogging for a number of tasks, including past events analysis, predictions, and information retrieval. Nevertheless, little attention has been given to quantitative data extraction. In this paper, we address two questions: can we develop a mechanism to extract quantitative data from a collection of tweets, and can we use the salient findings to describe an event? To answer the first question, we introduce Raimond, a virtual text curator, specialized in quantitative data extraction from Twitter. To address the second question, we use our system on three events and evaluate its output using a crowdsourcing strategy. We demonstrate the effectiveness of our approach with a number of real world examples.


“…(2011) redeveloped the taggers and segmenters of the Stanford NLP library. Ritter et al (2012), extending the above work, created an application, TwiCal, that extracted an open domain calendar for events that were shared on Twitter.…”

Section: Background and Related Work (mentioning)

confidence: 99%

Proceedings of the First AHA!-Workshop on Information Discovery in Text

Akbik, Visengeriyeva 2014


Introduction. Welcome to the First AHA!-Workshop on Information Discovery in Text! In this workshop, we are bringing together leading researchers in the emerging field of Information Discovery to discuss approaches for Information Extraction that are not bound by a pre-specified schema of information, but rather discover relational or categorial structure automatically from given unstructured data. This includes approaches that are based on unsupervised machine-learning over models of distributional semantics, as well as OpenIE methods that relax the definition of semantic relations in order to more openly extract structured information. Other approaches focus on inexpensively training information extractors to be used across different domains with minimal supervision, or on adapting existing IE systems to new domains and relations. We received 19 paper submissions of which the programme committee has accepted ten, six of which were chosen for oral presentation and four as posters. We look forward to a workshop full of interesting paper presentations, invited talks and lively discussion.

Abstract. Recent approaches to relation extraction following the distant supervision paradigm have focused on exploiting large knowledge bases, from which they extract substantial amount of supervision. However, for many relations in real-world applications, there are few instances available to seed the relation extraction process, and appropriate named entity recognizers which are necessary for pre-processing do not exist. To overcome this issue, we learn entity filters jointly with relation extraction using imitation learning. We evaluate our approach on architect names and building completion years, using only around 30 seed instances for each relation and show that the jointly learned entity filters improved the performance by 30 and 7 points in average precision.

