The paper presents the results of experiments designed to reveal the association between texts published in online environments (Yahoo! Finance, Facebook, and Twitter) and changes in the stock prices of the corresponding companies at a micro level. An association between lexicon-detected sentiment and stock price movements was not confirmed. It was, however, possible to reveal and quantify such an association by applying machine learning-based classification. The experiments showed that the data preparation procedure had a substantial impact on the results. Therefore, different forms of stock price smoothing, lags between the release of documents and the related stock price changes, five levels of minimal stock price change, three weighting schemes for structured document representation, and six classifiers were studied. The results show that at least part of the movement of stock prices is associated with the textual content, provided that a proper combination of processing parameters is selected.
Much research has focused on incorporating online data into models of various phenomena. The chapter addresses one specific problem from the domain of capital markets, where the information contained in online environments is highly topical. The presented experiments were designed to reveal the association between online texts (from Yahoo! Finance, Facebook, and Twitter) and changes in the stock prices of the corresponding companies. Machine learning-based classification was chosen as the method for quantifying the association. The experiments showed that the data preparation procedure had a substantial impact on the results. Therefore, different forms of stock price smoothing, lags between the release of documents and the related stock price changes, levels of minimal stock price change, weighting schemes for structured document representation, and classifiers were studied. The chapter also shows how to use currently available open source technologies to implement a system that accomplishes the task.
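The data preparation step described above (smoothing the price series, applying a lag between document release and price change, and discarding changes below a minimal level) can be illustrated with a minimal sketch. The abstract does not specify the smoothing method, the lag lengths, or the threshold values, so the trailing moving average, the function names `smooth` and `label_document`, and all numeric values below are assumptions chosen only for illustration, not the authors' procedure.

```python
from statistics import mean


def smooth(prices, window):
    """Trailing simple moving average (an assumed smoothing method).

    The first window-1 points are averaged over the values available so far.
    """
    return [mean(prices[max(0, i - window + 1): i + 1]) for i in range(len(prices))]


def label_document(prices, t, lag, min_change):
    """Assign a class to a document released at time index t.

    The price at t + lag is compared with the price at t. Relative changes
    smaller than min_change are treated as noise and yield None, as do
    documents whose lagged price lies beyond the end of the series.
    """
    if t + lag >= len(prices):
        return None
    change = (prices[t + lag] - prices[t]) / prices[t]
    if abs(change) < min_change:
        return None
    return "up" if change > 0 else "down"


# Hypothetical price series; one value per time step.
prices = [100.0, 101.0, 103.0, 102.0, 99.0, 98.0]
smoothed = smooth(prices, window=2)

# Label a document released at each time step, using a lag of 2 steps
# and a minimal relative change of 1 % (both values are assumptions).
labels = [label_document(smoothed, t, lag=2, min_change=0.01)
          for t in range(len(smoothed))]
print(labels)
```

Documents labeled this way could then be converted to weighted term vectors and passed to a classifier; varying `window`, `lag`, and `min_change` corresponds to the kind of parameter combinations the experiments compare.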