Background Historic constraints on research dollars and reliable information have limited firearm research. At the same time, interest in the power and potential of social media analytics, particularly in health contexts, has surged. Objective The aim of this study is to help establish a foundation for how social media data may best be used, alone or in conjunction with other data resources, to improve the information base for firearm research. Methods We examined the value of social media data for estimating a firearm outcome for which robust benchmark data exist, specifically firearm mortality, which is captured in the National Vital Statistics System (NVSS). We hand-curated tweet data from the Twitter application programming interface spanning January 1, 2017, to December 31, 2018. We developed machine learning classifiers to identify tweets that pertain to firearm deaths and to estimate the volume of Twitter firearm discussion by month. We compared within-state variation over time in the volume of tweets pertaining to firearm deaths with within-state trends in NVSS-based estimates of firearm fatalities using Pearson linear correlations. Results The correlation between the monthly number of firearm fatalities measured by the NVSS and the monthly volume of tweets pertaining to firearm deaths was weak (median 0.081) and highly dispersed across states (range –0.31 to 0.535). The correlation between month-to-month changes in firearm fatalities in the NVSS and firearm deaths discussed in tweets was moderate (median 0.30) and exhibited less dispersion among states (range –0.06 to 0.69). Conclusions Our findings suggest that Twitter data may hold value for tracking dynamics in firearm-related outcomes, particularly for relatively populous cities that are identifiable through location mentions in tweet content.
The data are likely to be particularly valuable for understanding firearm outcomes not currently measured, not measured well, or not measurable through other available means. This research provides an important building block for future work that continues to develop the usefulness of social media data for firearm research.
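As a rough illustration of the comparison described in the Methods above, the sketch below computes a within-state Pearson correlation twice: once on monthly levels and once on month-to-month changes (first differences). It is a minimal sketch, not the authors' code. The function names, the dictionary layout, and the sample numbers are all hypothetical and invented for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson linear correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / sqrt(sxx * syy)


def diffs(series):
    """Month-to-month changes: value in month t+1 minus value in month t."""
    return [b - a for a, b in zip(series, series[1:])]


def within_state_correlations(deaths_by_state, tweets_by_state):
    """For each state, correlate the two monthly series on levels and on changes.

    Both arguments map a state label to a list of monthly counts
    (a hypothetical layout chosen for this sketch).
    """
    results = {}
    for state, deaths in deaths_by_state.items():
        tweets = tweets_by_state[state]
        results[state] = {
            "levels": pearson(deaths, tweets),
            "changes": pearson(diffs(deaths), diffs(tweets)),
        }
    return results


if __name__ == "__main__":
    # Made-up monthly counts for one fictional state; not real NVSS or Twitter data.
    deaths = {"State A": [10, 12, 11, 15, 14, 18]}
    tweets = {"State A": [100, 130, 105, 160, 150, 200]}
    for state, r in within_state_correlations(deaths, tweets).items():
        print(state, round(r["levels"], 3), round(r["changes"], 3))
```

Summarizing the per-state values with a median and a range, as the Results do, would then be a matter of collecting the `levels` and `changes` entries across states.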
Background Gun violence research is characterized by a dearth of data available for measuring key constructs. Social media data may offer a potential opportunity to significantly reduce that gap, but developing methods for deriving firearms-related constructs from social media data and understanding the measurement properties of such constructs are critical precursors to their broader use. Objective This study aimed to develop a machine learning model of individual-level firearm ownership from social media data and assess the criterion validity of a state-level construct of ownership. Methods We used survey responses to questions on firearm ownership linked with Twitter data to construct different machine learning models of firearm ownership. We externally validated these models using a set of firearm-related tweets hand-curated from the Twitter Streaming application programming interface and created state-level ownership estimates using a sample of users collected from the Twitter Decahose application programming interface. We assessed the criterion validity of state-level estimates by comparing their geographic variance to benchmark measures from the RAND State-Level Firearm Ownership Database. Results We found that the logistic regression classifier for gun ownership performs the best with an accuracy of 0.7 and an F1-score of 0.69. We also found a strong positive correlation between Twitter-based estimates of gun ownership and benchmark ownership estimates. For states meeting a threshold requirement of a minimum of 100 labeled Twitter users, the Pearson and Spearman correlation coefficients are 0.63 (P<.001) and 0.64 (P<.001), respectively. Conclusions Our success in developing a machine learning model of firearm ownership at the individual level with limited training data as well as a state-level construct that achieves a high level of criterion validity underscores the potential of social media data for advancing gun violence research. 
The ownership construct is an important precursor for understanding the representativeness of and variability in outcomes that have been the focus of social media analyses in gun violence research to date, such as attitudes, opinions, policy stances, sentiments, and perspectives on gun violence and gun policy. The high criterion validity we achieved for state-level gun ownership suggests that social media data may be a useful complement to traditional sources of information on gun ownership such as survey and administrative data, especially for identifying early signals of changes in geographic patterns of gun ownership, given the immediacy of the availability of social media data, their continuous generation, and their responsiveness. These results also lend support to the possibility that other computationally derived, social media–based constructs may be derivable, which could lend additional insight into firearm behaviors that are currently not well understood. More work is needed to develop other firearms-related constructs and to assess their measurement properties.
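The state-level validation step described above (keep states with at least 100 labeled users, then compare estimates with a benchmark using Pearson and Spearman correlations) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' pipeline. The function names and the input dictionaries are hypothetical, and the sample values are invented rather than taken from the RAND database or from Twitter.

```python
from math import sqrt


def pearson(x, y):
    """Pearson linear correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def compare_to_benchmark(estimates, benchmark, n_labeled, min_users=100):
    """Correlate state estimates with a benchmark for states above the user threshold.

    All three inputs map a state label to a number (hypothetical layout).
    """
    states = [s for s in estimates
              if s in benchmark and n_labeled.get(s, 0) >= min_users]
    x = [estimates[s] for s in states]
    y = [benchmark[s] for s in states]
    return states, pearson(x, y), spearman(x, y)


if __name__ == "__main__":
    # Invented values for four fictional states; "D" falls below the threshold.
    estimates = {"A": 0.2, "B": 0.4, "C": 0.6, "D": 0.9}
    benchmark = {"A": 0.25, "B": 0.35, "C": 0.55, "D": 0.1}
    n_labeled = {"A": 150, "B": 120, "C": 300, "D": 40}
    print(compare_to_benchmark(estimates, benchmark, n_labeled))
```

The threshold is exposed as the `min_users` parameter so that the sensitivity of the correlations to the cutoff can be checked; the abstract reports results only for a minimum of 100 labeled users.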
BACKGROUND Social media data represent a potentially valuable source of information to support gun violence research. Strengthening the empirical and methodological foundations for using social media data in this context is important for advancing their future application to gun violence research. OBJECTIVE We assess the extent to which social media–based estimates accurately capture geographic variability in firearms-related outcomes, using firearm ownership as a test. METHODS We use Twitter data from 2019 to 2021 and state-of-the-art computational methods to construct a machine learning model of firearm ownership. We create state-specific estimates of ownership and assess these estimates by comparing them to benchmark measures. RESULTS Methodologically, our study highlights the importance of large draws from social media data when location identification is paramount. Our analytic approach for modeling firearm ownership using machine learning and adjusting estimates using inferred demographics provides examples of how these techniques can be used and expanded in future gun violence research. Empirically, we find a strong positive correlation between Twitter-based estimates of gun ownership and benchmark ownership estimates. For states meeting a threshold requirement of a minimum of 100 labeled Twitter users, the Pearson and Spearman correlations are 0.63 (p<0.001) and 0.64 (p<0.001), respectively. CONCLUSIONS Our findings underscore the potential of social media data for providing new windows into firearm behavior and outcomes, especially when measures from traditional data sources are limited or unavailable. Social media data carry analytical challenges when used for research purposes. Careful attention to these challenges, as well as to ethical standards for use, is essential as the frontiers of social media data’s use in research are explored.