Submitted: July 14, 2014. 1st Revision: September 12, 2014. Accepted: September 15, 2014.
* Master's student, Graduate School of Software, Soongsil University
** Professor, Graduate School of Software, Soongsil University
*** Professor, Graduate School of Software, Soongsil University; corresponding author

As economic development has increased public interest in the stock market, many studies have attempted to predict fluctuations in stock prices. Among the various forecasting approaches, recent studies increasingly rely on scientific and technological methods, and the data used for such studies are becoming more diverse. In this paper, we propose stock price prediction models that use sentiment analysis and machine learning on news articles and SNS data to improve prediction accuracy. The proposed models are generated through a four-step process consisting of data collection, sentiment dictionary construction, sentiment analysis, and machine learning. News articles were collected from economy-related newspapers, and SNS data were collected from Twitter. The sentiment dictionary was built from the news articles in the collected data and was then used to perform sentiment analysis. In the machine learning phase, we generated prediction models by applying various classification techniques to the data produced by sentiment analysis. After generating the prediction models, we conducted 10-fold cross-validation to measure their performance. The experimental results showed accuracy above 80% for a number of the techniques and an F1 score close to 0.8. This is a significant improvement over conventional research using opinion mining or data mining techniques.
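The dictionary-based scoring and the 10-fold evaluation described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the dictionary entries, the function names `sentiment_score`, `k_fold_indices`, and `f1_score`, and the whitespace tokenization are all assumptions made for illustration, and the abstract does not say how the real dictionary or classifiers were built.

```python
# Hypothetical sentiment dictionary: word -> polarity (+1 positive, -1 negative).
# The real dictionary in the paper was built from news articles; these
# entries are placeholders.
SENTIMENT_DICT = {
    "surge": 1, "gain": 1, "profit": 1,
    "loss": -1, "drop": -1, "fall": -1,
}


def sentiment_score(text):
    """Sum the polarity of every dictionary word found in the text."""
    tokens = text.lower().split()  # naive whitespace tokenization (assumption)
    return sum(SENTIMENT_DICT.get(tok, 0) for tok in tokens)


def k_fold_indices(n, k=10):
    """Yield (train, test) index lists for k contiguous folds over n samples."""
    # Spread the remainder over the first n % k folds so sizes differ by at most 1.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

In a full pipeline, the sentiment scores would serve as features for a classifier that is trained on each `train` split and scored on the matching `test` split, with accuracy and F1 averaged over the ten folds.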
Deep learning is a rapidly growing technology that has produced repeated breakthroughs in voice, text, and image recognition. Its basic principle is to organize information so that patterns can be found through a neural network with many layers. Its technological core is prediction through classification. This thesis uses SNS data, web-scraped data, and GIS data to address consumer needs. Using a deep learning algorithm, the data are classified accurately and then extracted for use as spatial information. This requires loading shapefiles into R, improving the accessibility of the data, and cross-referencing one data set against the areas of another. This thesis analyzes data from various environments with the data analysis tool R and designs a process that combines spatial information data and visualizes it based on a deep learning algorithm.