Background
Performance optimization is a major goal in sports science. However, it remains difficult because samples are typically small and individuals vary widely in physiology and training adaptations. Machine learning (ML) solutions seem promising, but their capability to predict performance in this setting has not been tested. The aim of this study was to predict 4-km cycling performance following a 12-week training intervention using ML models with predictors drawn from physiological profiling, individual training load and well-being, and to identify the most important predictors. Specific techniques were applied to reduce the risk of overfitting.
Results
Twenty-seven recreational cyclists completed the 4-km time trial with a mean power output of 4.1 ± 0.7 W/kg. Changes in time-trial performance after training did not differ between the moderate-intensity endurance training (n = 6), polarised endurance training (n = 8), concurrent polarised with concentric strength training (n = 7) and concurrent polarised with eccentric strength training (n = 6) groups (P > 0.05), but varied substantially between individuals. ML models predicted cycling performance accurately on unseen data both before training (R² = 0.923, mean absolute error (MAE) = 0.183 W/kg using a generalized linear model) and after training (R² = 0.758, MAE = 0.338 W/kg using a generalized linear model). Absolute changes in performance were more difficult to predict (R² = 0.483, MAE = 0.191 W/kg using a random forest model). Important predictors included power at V̇O2max, performance V̇O2, ventilatory thresholds and efficiency, but also parameters related to body composition, training impulse, sleep, sickness and well-being.
Conclusion
ML models allow accurate predictions of cycling performance based on physiological profiling, individual training load and well-being during a 12-week training intervention, even using small sample sizes, although changes in cycling performance were more difficult to predict.
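For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how R² and MAE could be computed from observed and predicted power outputs on held-out data. It assumes the coefficient-of-determination definition of R² (1 minus residual over total sum of squares); the abstract does not state which definition was used. The function names and the three example values are hypothetical and are not data from the study.

```python
def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical held-out values in W/kg, for illustration only.
observed = [3.0, 4.0, 5.0]
predicted = [3.5, 4.0, 4.5]

mae = mean_absolute_error(observed, predicted)
r2 = r_squared(observed, predicted)
print(f"MAE = {mae:.3f} W/kg, R2 = {r2:.3f}")
```

MAE is expressed in the units of the outcome (here W/kg), whereas R² is unitless and depends on the spread of the observed values. This is one possible reason why a small MAE can coexist with a modest R² when the predicted quantity, such as a change in performance, varies over a narrow range.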