Cryptocurrencies have recently emerged as financial assets that allow their users to execute transactions in a decentralized manner. Their popularity has led to the generation of huge amounts of data, specifically on social media networks such as Twitter. In this study, we propose an iterative kappa architecture that collects, processes, and temporarily stores data regarding transactions and tweets of two of the major cryptocurrencies according to their market capitalization: Bitcoin (BTC) and Ethereum (ETH). We applied a k-means clustering approach to group data according to their principal characteristics. Data are categorized into three groups: BTC typical data, ETH typical data, BTC and ETH atypical data. Findings show that activity on Twitter correlates to activity regarding the transactions of cryptocurrencies. It was also found that around 14% of data relate to extraordinary behaviors regarding cryptocurrencies. These data contain higher transaction volumes of both cryptocurrencies, and about 9.5% more social media publications in comparison with the rest of the data. The main advantages of the proposed architecture are its flexibility and its ability to relate data from various datasets.
Making predictions in the stock market is a challenging task. At the same time, several studies have focused on forecasting the future behavior of the market and classifying financial assets. A different approach is to classify correlated data to discover patterns and atypical behaviors in them. In this study, we propose applying unsupervised algorithms to process, model, and cluster related data from two different data sources, i.e., Google News and Yahoo Finance, to identify conditions in the stock market that might help to support the investment decision-making process. We applied principal component analysis (PCA) and a k-means clustering approach to group data according to their principal characteristics. We identified four conditions in the stock market, one comprising the least amount of data, characterized by high volatility. The main results show that, regularly, the stock market tends to have a steady performance. However, atypical conditions are conducive to higher volatility.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations鈥揷itations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.