In recent years, online social media information has been the subject of study in several data science fields due to its impact on users as a communication and expression channel. Data gathered from online platforms such as Twitter has the potential to facilitate research over social phenomena based on sentiment analysis, which usually employs Natural Language Processing and Machine Learning techniques to interpret sentimental tendencies related to users’ opinions and make predictions about real events. Cyber-attacks are not isolated from opinion subjectivity on online social networks. Various security attacks are performed by hacker activists motivated by reactions from polemic social events. In this paper, a methodology for tracking social data that can trigger cyber-attacks is developed. Our main contribution lies in the monthly prediction of tweets with content related to security attacks and the incidents detected based on ℓ1 regularization.
Abstract:In recent years, online social media information has been subject of study in several data science fields due to its impact on users as a communication and expression channel. Data gathered from online platforms such as Twitter has the potential to facilitate research over social phenomena based on sentiment analysis, which usually employs Natural Language Processing and Machine Learning techniques to interpret sentimental tendencies related to users opinions and make predictions about real events. Cyber attacks are not isolated from opinion subjectivity on online social networks. Various security attacks are performed by hacker activists motivated by reactions from polemic social events. In this paper, a methodology for tracking social data that can trigger cyber attacks is developed. Our main contribution lies in the monthly prediction of tweets with content related to security attacks and the incidents detected based on 1 regularization.
Retrieving information from social networks is the first and primordial step many data analysis fields such as Natural Language Processing, Sentiment Analysis and Machine Learning. Important data science tasks relay on historical data gathering for further predictive results. Most of the recent works use Twitter API, a public platform for collecting public streams of information, which allows querying chronological tweets for no more than three weeks old. In this paper, we present a new methodology for collecting historical tweets within any date range using web scraping techniques bypassing for Twitter API restrictions.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.