The goal of this article was to examine the relationship between the content of text documents published on the Internet and the direction of stock price movements on the Prague Stock Exchange. The relationship was modeled as a text classification task. The data consisted of news articles and discussion posts from Czech websites, together with the values of the PX stock index and the stock price of the company CEZ. Each document's class (plus, minus, or constant) was determined by the relative price change between the document's publication date and the next working day. We achieved a high accuracy of 75% for the classification of discussion posts, whereas the classification accuracy for news articles was only about 60%. We tried both binary classification (documents of the constant class were discarded) and ternary classification; the former was more successful in all cases.
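The labeling rule described above can be illustrated with a minimal Python sketch. The function names and the threshold value of 0.5% are assumptions made for illustration only; the abstract does not state what cutoff separated the constant class from the plus and minus classes.

```python
def label_document(price_at_publication: float,
                   price_next_day: float,
                   threshold: float = 0.005) -> str:
    """Assign a class from the relative price change.

    `threshold` is a hypothetical cutoff (0.5%); the source does not
    specify the value actually used.
    """
    change = (price_next_day - price_at_publication) / price_at_publication
    if change > threshold:
        return "plus"
    if change < -threshold:
        return "minus"
    return "constant"


def binary_subset(labels: list[str]) -> list[str]:
    """Keep only plus/minus labels, as in the binary setting where
    documents of the constant class are discarded."""
    return [label for label in labels if label != "constant"]


# Illustrative prices, not data from the study.
labels = [
    label_document(100.0, 101.0),
    label_document(100.0, 99.0),
    label_document(100.0, 100.2),
]
print(labels)
print(binary_subset(labels))
```

In the ternary setting all three labels would be passed to the classifier, while the binary setting would train only on the output of `binary_subset`.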