Stock market forecasting is a challenging task because stock price time series are highly noisy, nonparametric, complex, and chaotic. Using a simple eight-trigram feature engineering scheme for inter-day candlestick patterns, we construct a novel ensemble machine learning framework for daily stock pattern prediction that combines traditional candlestick charting with recent artificial intelligence methods. Several machine learning techniques, including deep learning methods, are applied to stock data to predict the direction of the closing price. Based on the training results, the framework selects a suitable machine learning prediction method for each pattern, and an investment strategy is constructed from the resulting ensemble. Empirical results for China's stock market from 2000 to 2017 confirm that our feature engineering has effective predictive power, with a prediction accuracy of more than 60% for some trend patterns. Measures such as using big data, feature standardization, and elimination of abnormal data can effectively address the problem of data noise. In theory, an investment strategy based on our forecasting framework excels in both individual stock and portfolio performance. Transaction costs, however, have a significant impact on investment. Additional technical indicators, especially momentum indicators, improve forecasting accuracy to varying degrees in most cases.
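The abstract does not spell out how the eight-trigram features are built, so the following is only a minimal sketch under a stated assumption: each trading day is compared with the previous day on three binary questions (did the high rise, did the low rise, did the close rise), which yields 2^3 = 8 inter-day classes. The `Bar` layout, the function names, and the rule that ties count as "not up" are all hypothetical choices made for illustration, not the authors' definitions.

```python
from typing import List, Tuple

# Hypothetical bar layout: (high, low, close) for one trading day.
Bar = Tuple[float, float, float]


def trigram_code(prev: Bar, cur: Bar) -> int:
    """Encode three up/down comparisons with the previous day as an integer 0..7.

    Assumption: a value equal to the previous day's value counts as "not up".
    """
    high_up = cur[0] > prev[0]
    low_up = cur[1] > prev[1]
    close_up = cur[2] > prev[2]
    # Pack the three booleans into three bits: high, low, close.
    return (int(high_up) << 2) | (int(low_up) << 1) | int(close_up)


def encode_series(bars: List[Bar]) -> List[int]:
    """Return one pattern code per day, starting from the second day."""
    return [trigram_code(prev, cur) for prev, cur in zip(bars, bars[1:])]


def next_close_direction(bars: List[Bar]) -> List[int]:
    """Label each day 1 if the next day's close is higher, else 0.

    The last day has no following day and therefore gets no label.
    """
    return [int(nxt[2] > cur[2]) for cur, nxt in zip(bars, bars[1:])]
```

With codes like these, a separate classifier could be trained for each of the eight classes and the best-performing method kept per class, which is one way to read the per-pattern model selection described above.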
PRML, a novel candlestick pattern recognition model using machine learning methods, is proposed to improve stock trading decisions. Four popular machine learning methods and 11 different feature types are applied to all possible combinations of daily patterns to begin the pattern recognition process. Time windows from one to ten days are used to assess the prediction effect over different horizons. An investment strategy is constructed according to the identified candlestick patterns and the most suitable time window. We deploy PRML to forecast all Chinese market stocks from January 1, 2000 to October 30, 2020. Data from January 1, 2000 to December 31, 2014 are used as the training set, and data from January 1, 2015 to October 30, 2020 are used to verify the forecasting effect. Empirical results show that, after filtering, two-day candlestick patterns have the best prediction effect when forecasting one day ahead; these patterns obtain an average annual return, an annual Sharpe ratio, and an information ratio as high as 36.73%, 0.81, and 2.37, respectively. After screening, three-day candlestick patterns also perform well when forecasting one day ahead, in that these patterns show stable characteristics. Two other popular machine learning methods, multilayer perceptron networks and long short-term memory neural networks, are applied to the pattern recognition framework to evaluate how strongly the results depend on the prediction model. The two-day patterns predicting one day ahead remain profitable after a transaction cost of 0.2% is taken into account. Overall, the empirical results show that applying different machine learning methods to two-day and three-day patterns for one-day-ahead forecasts can be profitable.
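The performance measures named above can be illustrated with their textbook definitions. The sketch below is not the authors' evaluation code: it assumes daily simple returns, 252 trading days per year, and a proportional cost deducted on each day a trade occurs (the abstract does not say whether the 0.2% applies per trade or per round trip). All function names are hypothetical.

```python
import math
from statistics import mean, stdev
from typing import List, Sequence

TRADING_DAYS = 252  # assumption: trading days per year used for annualization


def net_returns(gross: Sequence[float], traded: Sequence[bool],
                cost: float = 0.002) -> List[float]:
    """Subtract a proportional cost (0.2% by default) on days with a trade."""
    return [r - cost if t else r for r, t in zip(gross, traded)]


def annualized_return(daily: Sequence[float]) -> float:
    """Compound the daily returns and rescale the growth to one year."""
    growth = math.prod(1.0 + r for r in daily)
    return growth ** (TRADING_DAYS / len(daily)) - 1.0


def annualized_sharpe(daily: Sequence[float],
                      risk_free_daily: float = 0.0) -> float:
    """Mean excess return over its sample standard deviation, annualized."""
    excess = [r - risk_free_daily for r in daily]
    return mean(excess) / stdev(excess) * math.sqrt(TRADING_DAYS)


def information_ratio(daily: Sequence[float],
                      benchmark: Sequence[float]) -> float:
    """Mean active return over tracking error, annualized."""
    active = [r - b for r, b in zip(daily, benchmark)]
    return mean(active) / stdev(active) * math.sqrt(TRADING_DAYS)
```

Comparing `annualized_return(gross)` with `annualized_return(net_returns(gross, traded))` gives a rough sense of how much of a strategy's return survives trading costs. Both ratio functions need at least two observations with non-zero variation.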