Twitter data is increasingly used to make predictions about real-world events. Recently, however, several studies have directly or indirectly questioned the proposed Twitter prediction procedures. In this paper, we conduct a literature review to investigate in detail the research processes adopted by previous Twitter prediction studies. We first identify the actors involved and then study how they influence the different phases of the research process. We find that in Twitter prediction research up to four actors make several sampling, filtering, classification, and assessment decisions throughout the development of prediction models. If these decisions and the reasons behind them are not sufficiently documented, the developed prediction methods cannot be reproduced in future research, and consequently their validity and reliability are hard to assess.