In this paper, we propose a robust evaluation of the information content of microblogging data for forecasting useful stock market variables: returns, volatility and trading volume of a diverse set of indices and portfolios. We analyze a large Twitter dataset, from December 2012 to October 2015, with about 31 million messages related to 3,800 stocks traded in US markets. We also apply a sound prediction procedure (e.g., rolling window evaluation, four regression methods) along with a statistical test of predictive accuracy. Furthermore, we explore the diversity of traditional sentiment indicators and assess their complementary value with microblogging sentiment. A Kalman Filter (KF) procedure is applied to create a unique daily sentiment indicator from a Twitter indicator and four other sentiment indicators (created from surveys). We also predict two popular survey sentiment indicators using microblogging data. We found that Twitter sentiment and posting volume were particularly important for forecasting the returns of the S&P 500 index, portfolios of lower market capitalization and some industries. Additionally, KF sentiment was informative for forecasting returns. Furthermore, Twitter and KF sentiment indicators were useful for predicting some AAII and II survey sentiment indicators. These results show that microblogging data are relevant for forecasting stock market behavior and can provide a valuable alternative to existing measures (e.g., survey sentiment) with various advantages (e.g., fast and cheap creation, daily frequency).
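The rolling window evaluation mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes a single lagged sentiment predictor, a one-variable least-squares fit in place of the four regression methods, and a window-mean baseline. The names `ols_fit`, `rolling_window_errors` and `rmse` are hypothetical.

```python
import math


def ols_fit(x, y):
    """Least-squares intercept and slope for y = a + b * x (one predictor)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        # Constant predictor: fall back to the mean of y.
        return mean_y, 0.0
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    return mean_y - slope * mean_x, slope


def rolling_window_errors(sentiment, returns, window):
    """One-step-ahead forecast errors for returns[t] using sentiment[t - 1].

    At each step the model is refit on the most recent `window` pairs
    (sentiment[s - 1], returns[s]) and then used to forecast day t.
    The baseline (an assumption of this sketch) forecasts the window mean.
    """
    model_errors, baseline_errors = [], []
    for t in range(window + 1, len(returns)):
        x_train = sentiment[t - window - 1 : t - 1]
        y_train = returns[t - window : t]
        intercept, slope = ols_fit(x_train, y_train)
        forecast = intercept + slope * sentiment[t - 1]
        baseline = sum(y_train) / window
        model_errors.append(returns[t] - forecast)
        baseline_errors.append(returns[t] - baseline)
    return model_errors, baseline_errors


def rmse(errors):
    """Root mean squared error of a list of forecast errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))
```

In a setup like this, the two error series would then be passed to a test of predictive accuracy to judge whether the sentiment model improves on the baseline by more than chance.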
Lexicon acquisition is a key issue for sentiment analysis. This paper presents a novel and fast approach for creating stock market lexicons. The approach is based on statistical measures applied over a vast set of labeled messages from StockTwits, a specialized stock market microblog. We compare three adaptations of statistical measures, such as pointwise mutual information (PMI), two new complementary statistics and the use of sentiment scores for affirmative and negated contexts. Using StockTwits, we show that the new lexicons are competitive for measuring investor sentiment when compared with six popular lexicons. We also applied a lexicon to easily produce Twitter investor sentiment indicators and analyzed their correlation with survey sentiment indexes. The new microblogging indicators have a moderate correlation with the popular Investors Intelligence (II) and American Association of Individual Investors (AAII) indicators. Thus, the new microblogging approach can be used as an alternative to traditional survey indicators, with advantages such as cheaper creation and higher frequencies.
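A PMI-based lexicon of the kind described here could be sketched as follows. This is an illustration rather than the paper's method: it assumes messages labeled "bullish" or "bearish", whitespace tokenization, add-alpha smoothing, and a word score defined as PMI(word, bullish) minus PMI(word, bearish), which simplifies to the log ratio of the two conditional probabilities. The function name `pmi_lexicon` and its parameters are hypothetical.

```python
import math
from collections import Counter


def pmi_lexicon(messages, alpha=1.0, min_count=1):
    """Build a word -> sentiment score mapping from labeled messages.

    messages: iterable of (text, label) pairs, label "bullish" or "bearish"
              (an assumed labeling scheme for this sketch).
    score(w) = PMI(w, bullish) - PMI(w, bearish)
             = log2( P(w | bullish) / P(w | bearish) ),
    with add-alpha smoothing so unseen word/label pairs do not give log(0).
    """
    word_label_counts = Counter()
    word_counts = Counter()
    label_counts = Counter()
    for text, label in messages:
        label_counts[label] += 1
        # Count each word once per message (document frequency).
        for word in set(text.lower().split()):
            word_label_counts[(word, label)] += 1
            word_counts[word] += 1

    lexicon = {}
    for word, count in word_counts.items():
        if count < min_count:
            continue
        p_bull = (word_label_counts[(word, "bullish")] + alpha) / (
            label_counts["bullish"] + 2 * alpha
        )
        p_bear = (word_label_counts[(word, "bearish")] + alpha) / (
            label_counts["bearish"] + 2 * alpha
        )
        lexicon[word] = math.log2(p_bull / p_bear)
    return lexicon
```

Positive scores mark words that lean bullish and negative scores mark words that lean bearish. A daily indicator could then be formed, for instance, by averaging the scores of the words posted that day.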
In this study, we explored data from StockTwits, a microblogging platform exclusively dedicated to the stock market. We produced several indicators and analyzed their value when predicting three market variables: returns, volatility and trading volume. For six major stocks, we measured posting volume and sentiment indicators. We advance on previous studies of this subject by considering a large time period, using a robust forecasting exercise and performing a statistical test of forecasting ability. In contrast with previous studies, we find no evidence of return predictability using sentiment indicators, nor evidence that posting volume carries information for forecasting volatility. However, there is evidence that posting volume can improve forecasts of trading volume, which is useful for measuring stock liquidity (e.g., how easily assets can be sold).
Many domain-specific languages that try to offer feasible alternatives to existing solutions while simplifying programming work have appeared in recent years. Although these little languages seem easy to use, it remains an open question whether they bring advantages over application libraries, which are the most commonly used implementation approach. In this work, we present an experiment carried out to compare such a domain-specific language with a comparable application library. The experiment was conducted with 36 programmers, who answered a questionnaire on both implementation approaches. The questionnaire is more than 100 pages long. The same problem domain, the construction of graphical user interfaces, was used for both the domain-specific language and the application library. XAML was used as the domain-specific language and C# Forms as the application library. A cognitive dimensions framework was used to compare XAML and C# Forms.
The analysis of microblogging data related to stock markets can reveal relevant new signals of investor sentiment and attention. It may also provide sentiment and attention indicators more rapidly and cost-effectively than other sources. In this study, we created several indicators using Twitter data and investigated their value when modeling relevant stock market variables, namely returns, trading volume and volatility. We collected recent data for nine major technology companies. Several sentiment analysis methods were explored by comparing five popular lexical resources and two novel lexicons (one emoticon-based and one merging all six lexicons), with sentiment indicators produced using two strategies (based on daily words and on individual tweet classifications). We also measured the posting volume of tweets related to the analyzed companies. Although only a short time period is considered (32 days), we found scarce evidence that sentiment indicators can explain these stock returns. However, interesting results were obtained when measuring the value of posting volume for fitting trading volume and, in particular, volatility.