The results of recent replication studies suggest that false positive findings are a serious problem in empirical finance. We contribute to this debate by reviewing a sample of articles on the short-term directional forecasting of the prices of stocks, commodities, and currencies. Screening all relevant articles published in 2016 in one of the 96 journals covered by the Social Sciences Citation Index in the category "Business, Finance," we select only those studies that use easily accessible data of daily or higher frequency. We examine each study in detail, from the selection of the dataset to the interpretation of the results. We also include empirical analyses to illustrate the shortcomings of certain approaches. Our review yields three main findings. First, the number of selected papers is very low, which is surprising even when the strict selection criteria are taken into account. Second, hardly any relevant studies use high-frequency data, despite the hype about financial big data and machine learning. Third, the economic significance of the findings (for example, their usefulness for trading purposes) is questionable. In general, apparently good forecasting performance does not translate into profitability once realistic transaction costs and the effect of data snooping are taken into account. Other typical problems include unsuitable benchmarks, short evaluation periods, and nonoperational trading strategies.
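The gap between forecasting accuracy and profitability can be illustrated with a little arithmetic. The sketch below is not taken from the reviewed studies. It assumes a hypothetical strategy that trades once per period in the forecast direction, that a correct forecast is independent of the size of the price move, and that every trade incurs a fixed round-trip cost. The function names and all numbers are illustrative assumptions.

```python
def net_return_per_trade(hit_rate, mean_abs_return, round_trip_cost):
    """Expected net return per trade of a hypothetical sign-following strategy.

    Assumption: being right or wrong is independent of the size of the move,
    so the expected gross return is (2 * hit_rate - 1) * mean_abs_return.
    """
    gross = (2.0 * hit_rate - 1.0) * mean_abs_return
    return gross - round_trip_cost


def break_even_hit_rate(mean_abs_return, round_trip_cost):
    """Hit rate at which the expected net return per trade is exactly zero."""
    return 0.5 * (1.0 + round_trip_cost / mean_abs_return)


if __name__ == "__main__":
    # Illustrative values only: 53% directional accuracy, an average absolute
    # move of 10 basis points, and a round-trip cost of 2 basis points.
    net = net_return_per_trade(0.53, 0.0010, 0.0002)
    needed = break_even_hit_rate(0.0010, 0.0002)
    print(f"net return per trade: {net:.6f}")
    print(f"break-even hit rate:  {needed:.2f}")
```

Under these assumed numbers, a hit rate that looks clearly better than a coin flip still loses money per trade, because the cost is a fifth of the average move and the break-even accuracy rises to 60%.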
In this paper, we examine the usefulness of machine learning methods such as support vector machines, random forests, and bagging for extracting information from the limit order book that can be used for intraday trading. For our empirical analysis, we first extract 50 raw features from the LOBSTER message file and order book file of the iShares Core S&P 500 ETF for the period from 27.06.2007 to 30.04.2019 and then construct 18 higher-level features (aggregated to a 5-minute frequency), which serve as predictors. Using straightforward specifications for the machine learning procedures and thereby avoiding excessive data snooping, we find that these procedures are unable to find high-dimensional patterns in the order book that could be used for trading purposes. The observed significant predictability is mainly due to a single variable, namely the last price change, and is probably too small to ensure profitability once transaction costs are taken into account.
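The following is a minimal sketch of the kind of baseline this abstract points to: aggregating irregular prices to 5-minute bars and relating the next price change to the last one. It is not the authors' procedure and does not read LOBSTER files. The tick tuples, the function names, and the use of a plain least-squares slope are assumptions made for illustration, and bars without any tick are simply skipped.

```python
BAR_SECONDS = 300  # 5-minute bars, matching the aggregation frequency in the text


def last_price_per_bar(ticks, bar_seconds=BAR_SECONDS):
    """Return the last observed price in each bar, in chronological order.

    `ticks` is an iterable of (timestamp_in_seconds, price) pairs (a
    hypothetical input format). Bars that contain no tick are skipped.
    """
    bars = {}
    for ts, price in sorted(ticks, key=lambda t: t[0]):
        bars[int(ts // bar_seconds)] = price  # later ticks overwrite earlier ones
    return [bars[k] for k in sorted(bars)]


def ols_slope(x, y):
    """Slope of a least-squares regression of y on x with an intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx


def lagged_change_slope(prices):
    """Regress each bar-to-bar price change on the change before it."""
    changes = [b - a for a, b in zip(prices, prices[1:])]
    return ols_slope(changes[:-1], changes[1:])


if __name__ == "__main__":
    # Toy ticks invented for this example; not real market data.
    toy_ticks = [(0, 100.0), (120, 100.2), (299, 100.1),
                 (300, 100.3), (650, 100.2), (900, 100.4)]
    bar_prices = last_price_per_bar(toy_ticks)
    print(bar_prices)
    print(lagged_change_slope(bar_prices))
```

A single-predictor baseline like this is a useful yardstick: if a more elaborate model does not clearly beat it out of sample, the extra features are probably adding little.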