2016
DOI: 10.3233/sw-150188

Semantic Abstraction for generalization of tweet classification: An evaluation of incident-related tweets

Abstract: Social media is a rich source of up-to-date information about events such as incidents. The sheer amount of available information makes machine learning approaches a necessity to process this information further. This learning problem is often concerned with regionally restricted datasets such as data from only one city. Because social media data such as tweets varies considerably across different cities, the training of efficient models requires labeling data from each city of interest, which is costly and ti…

Cited by 27 publications (26 citation statements)
References 24 publications
“…For the embedding-based metric, we used both GloVe embeddings pre-trained on general Twitter (General Embd), and GloVe embeddings that we trained on a crisis-related Twitter corpus (Crisis Embd). The crisis-related embeddings were trained on three previously published crisis datasets, specifically, CrisisLexT6 [39], CrisisLexT26 [40] and 2CTweets [50], together with tweets collected using the Twitter Streaming API during Hurricanes Harvey, Irma, and Maria, and Mexico Earthquake. The total corpus contains approximately 5.8 million tweets.…”
Section: Evaluation Metrics
Citation type: mentioning
confidence: 99%
“…As a means to deal with the poor textual content of tweets, related work has suggested the use of external information to add context to tweet contents, in applications such as Event Classification (ABEL et al, 2012a; PACKER et al, 2012; PAULHEIM, 2013; GUCKELSBERGER; JANSSEN, 2015) and Sentiment Analysis (SAIF; HE; ALANI, 2012). As a general approach, the textual features to be enriched are selected according to some criterion, and mapped into the resources available in external knowledge source (e.g.…”
Section: List Of Figures
Citation type: mentioning
confidence: 99%
“…Schulz et al use the extension for developing an approach for semantic abstraction for generalization of tweets classification [32,33]. Shidik et al [34] make use of the extension for developing a machine learning approach for predicting forest fires using LOD as background knowledge.…”
Section: Further Use Cases
Citation type: mentioning
confidence: 99%
“…All factors not influenced by factors originating in online data access, such as local performance of data analysis, can be addressed by design in RapidMiner, since there, e.g., are cloud computing services 31 as well as the Radoop extension 32 for data analysis with Hadoop, which allow for scaling the analytics operations.…”
Section: Scientific and Technical Journal Articles
Citation type: mentioning
confidence: 99%