With the development of quantitative finance, machine learning methods in finance have attracted significant attention from researchers, investors, and traders. However, work on stock index spot–futures arbitrage remains rare, and existing studies mostly examine arbitrage opportunities in retrospect rather than anticipating them. To close this gap, this study applies machine learning to historical high-frequency data to forecast spot–futures arbitrage opportunities for the China Securities Index (CSI) 300. First, econometric models are used to establish that spot–futures arbitrage is possible. Then, exchange-traded fund (ETF)-based portfolios are built to track the movements of the CSI 300 with minimal tracking error. A strategy consisting of no-arbitrage intervals and unwinding timing indicators is derived and shown to be profitable in a back-test. For forecasting, four machine learning methods are used to predict the resulting indicator: Least Absolute Shrinkage and Selection Operator (LASSO), Extreme Gradient Boosting (XGBoost), Back Propagation Neural Network (BPNN), and Long Short-Term Memory neural network (LSTM). The algorithms are compared from two perspectives. The first is an error perspective, based on the Root-Mean-Squared Error (RMSE), Mean Absolute Percentage Error (MAPE), and goodness of fit (R²). The second is a return perspective, based on the trade yield and the number of arbitrage opportunities captured. Finally, a performance heterogeneity analysis is conducted by separating bull and bear markets. The results show that LSTM outperforms all other algorithms over the entire time period, with an RMSE of 0.00813, a MAPE of 0.70 percent, an R² of 92.09 percent, and an arbitrage return of 58.18 percent. However, when the bull and bear markets are evaluated separately over shorter periods, LASSO can outperform the other algorithms.
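For readers less familiar with the three error measures named above, the following is a minimal sketch using their standard textbook definitions. The function names and the toy values are illustrative assumptions, not taken from the study, and the exact definitions used in the study (in particular for R²) may differ.

```python
import math
from typing import Sequence


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root-mean-squared error: square root of the mean squared residual."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent.

    Assumes no actual value is zero, since each residual is divided by it.
    """
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


def r_squared(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Toy values for illustration only; these are not data from the study.
actual = [1.0, 2.0, 4.0]
predicted = [1.0, 2.0, 2.0]

print("RMSE:", rmse(actual, predicted))
print("MAPE (%):", mape(actual, predicted))
print("R^2:", r_squared(actual, predicted))
```

Lower RMSE and MAPE and higher R² indicate a closer fit. Because these measures score forecast accuracy only, the return perspective is reported alongside them.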