Improving Classification of Twitter Behavior During Hurricane Events

Stowe, Kevin; Anderson, Tony; Palmer, Martha; Palen, Leysia; Anderson, Kenneth M.

doi:10.18653/v1/w18-3512

Cited by 28 publications

(19 citation statements)

References 15 publications

“…Our infrastructure was recently used in one of our research studies [33] to help identify evacuation behavior to create a training set for automated detection of evacuation behavior based both on textual clues in a user's tweets as well as their physical movements. The tool is currently deployed for various research efforts associated with [1] and continues to prove itself a stable, scalable social media post collection infrastructure.…”

Section: Results (mentioning)

confidence: 99%

“…In practice, we have found that these locations may not be homes but instead gyms, work, or school. We found, however, that the accuracy of a home location is not as important as the identification of a location that represents normalcy during non-storm times [33]. Figure 3 shows the user's calculated home location as a transparent blue circle.…”

Section: Clustering Home Location and Movement (mentioning)

confidence: 94%

Incorporating Context and Location Into Social Media Analysis: A Scalable, Cloud-Based Approach for More Powerful Data Science

Anderson, Sáez, Anderson, et al. 2019

Proceedings of the Annual Hawaii International Conference on System Sciences

Self Cite

Dominated by quantitative data science techniques, social media data analysis often fails to incorporate the surrounding context, conversation, and metadata that allow for more complete, accurate, and informed analysis. Here we describe the development of a scalable data collection infrastructure to interrogate massive amounts of tweets, including complete user conversations, to perform contextualized social media analysis. Additionally, we discuss the nuances of location metadata and incorporate it when available to situate the user conversations within geographic context through an interactive map. The map also spatially clusters tweets to identify important locations and movement between them, illuminating specific behavior, like evacuating before a hurricane. We share performance details, the promising results of concurrent research utilizing this infrastructure, and discuss the challenges and ethics of using context-rich datasets.

“…We plan to further complement this research by analyzing tweets from Twitter communities that discuss different topics. In this regard, techniques such as domain adaptation, transfer learning, active and online learning could help models to adapt to new domains and address issues associated with the lack of training data (Johnson et al 2020; Kaufhold et al 2020; Stowe et al 2018).…”

Section: Discussion (mentioning)

confidence: 99%

Coding and Classifying Knowledge Exchange on Social Media: a Comparative Analysis of the #Twitterstorians and AskHistorians Communities

Gruzd, Kumar, Abul-Fottouh, et al. 2020

Comput Supported Coop Work

As social media become a staple for knowledge discovery and sharing, questions arise about how self-organizing communities manage learning outside the domain of organized, authority-led institutions. Yet examination of such communities is challenged by the quantity of posts and variety of media now used for learning. This paper addresses the challenges of identifying (1) what information, communication, and discursive practices support successful online communities, (2) whether such practices are similar on Twitter and Reddit, and (3) whether machine learning classifiers can be successfully used to analyze larger datasets of learning exchanges. This paper builds on earlier work that used manual coding of learning and exchange in Reddit ‘Ask’ communities to derive a coding schema we refer to as ‘learning in the wild’. This schema consists of eight categories: explanation with disagreement, agreement, or neutral presentation; socializing with negative or positive intent; information seeking; providing resources; and comments about forum rules and norms. To compare across media, results from coding Reddit’s AskHistorians are compared to results from coding a sample of #Twitterstorians tweets (n = 594). High agreement between coders affirmed the applicability of the coding schema to this different medium. LIWC lexicon-based text analysis was used to build machine learning classifiers and apply these to code a larger dataset of tweets (n = 69,101). This research shows that the ‘learning in the wild’ coding schema holds across at least two different platforms, and is partially scalable to study larger online learning communities.

“…Outside the social media domain, Velichkov et al (2019) investigate models to predict the outcome of sports events from interviews conducted shortly before the event. Within social media, Stowe et al (2018) present models to determine whether people evacuate during a hurricane event from their tweets. Finally, Swamy et al (2017) present a framework to forecast winners of events (e.g., sports events, elections, awards) by aggregating predictions made by individual users.…”

Section: Previous Work (mentioning)

confidence: 99%