Purpose
Achilles tendon ruptures (ATR) are career-threatening injuries in elite soccer players because they commonly impair sports performance. This study presents an exploratory data analysis of match participation before and after ATRs and evaluates the performance of a machine learning (ML) model that uses pre-injury features to predict whether a player will return to a previous level of match participation.
Methods
The website transfermarkt.com was mined, between January and March of 2021, for relevant entries regarding soccer players who suffered an ATR while playing in first or second leagues. The difference between average minutes played per match (MPM) 1 year before injury and between 1 and 2 years after the injury was used to identify patterns in match participation after injury. Clustering analysis was performed using k-means clustering. Predictions of post-injury match participation were made using the XGBoost classification algorithm. The performance of this model was evaluated using the area under the receiver operating characteristic curve (AUROC) and Brier score loss (BSL).
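As an illustration of the clustering step, the following is a minimal sketch that computes the MPM difference and groups the values with a plain one-dimensional k-means (Lloyd's algorithm). It is not the study's code: the function names, the quantile-based initialisation, and the toy values are assumptions made for this example only.

```python
from statistics import mean


def mpm(minutes_per_match):
    """Average minutes played per match (MPM) over a list of matches."""
    return mean(minutes_per_match) if minutes_per_match else 0.0


def mpm_difference(pre_minutes, post_minutes):
    """Post-injury MPM minus pre-injury MPM; negative means less playing time."""
    return mpm(post_minutes) - mpm(pre_minutes)


def assign(values, centroids):
    """Label each value with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda c: abs(v - centroids[c]))
        for v in values
    ]


def kmeans_1d(values, k, max_iter=100):
    """Lloyd's algorithm on scalars.

    Assumption: centroids start at evenly spaced quantiles of the data,
    which keeps the example deterministic.
    """
    if len(values) < k:
        raise ValueError("need at least k values")
    ordered = sorted(values)
    n = len(ordered)
    centroids = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    for _ in range(max_iter):
        labels = assign(values, centroids)
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            # Keep the old centroid if a cluster ends up empty.
            updated.append(mean(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return assign(values, centroids), centroids


# Toy MPM differences (invented for illustration, not study data).
toy_differences = [-60, -58, -55, -27, -25, -22, -2, 0, 3, 30, 33, 35]
labels, centroids = kmeans_1d(toy_differences, k=4)
```

With four clusters, the labels are ordered from the largest decrease to the largest increase because the initial centroids are taken from the sorted values.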
Results
Two hundred and nine players were included in the study. Data from 32,853 matches was analysed. Exploratory data analysis revealed that forwards, midfielders and defenders increased match participation during the first year after injury, with goalkeepers still improving at 2 years. Players were grouped into four clusters regarding the difference between MPMs 1 year before injury and between 1 and 2 years after the injury. These groups ranged from a severe decrease (n = 34; −59 ± 13 MPM), through a moderate decrease (n = 75; −25 ± 8 MPM) and maintenance (n = 70; 0 ± 8 MPM), to an increase (n = 30; 32 ± 13 MPM). Regarding the predictive model, the average AUROC after cross-validation was 0.81 ± 0.10, and the BSL was 0.12, with the most important features relating to pre-injury match participation.
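For readers unfamiliar with the two metrics, the sketch below computes them from their standard definitions for a binary outcome. The function names and toy inputs are assumptions for illustration, and this is not the study's evaluation code. AUROC is the probability that a randomly chosen positive case receives a higher predicted probability than a randomly chosen negative case (0.5 is chance, 1.0 is perfect). The Brier score is the mean squared difference between predicted probability and observed outcome (lower is better).

```python
def brier_score(y_true, y_prob):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auroc(y_true, y_prob):
    """Share of positive/negative pairs ranked correctly; ties count as 0.5."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))


# Toy labels and predicted probabilities (invented, not study data).
y_true = [1, 1, 0, 0]
y_prob = [0.9, 0.4, 0.6, 0.1]

toy_auroc = auroc(y_true, y_prob)        # 3 of 4 pairs ranked correctly
toy_brier = brier_score(y_true, y_prob)  # (0.01 + 0.36 + 0.36 + 0.01) / 4
```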
Conclusion
Most players take 1 year to reach peak match participation after an ATR. Good performance was attained using an ML classifier to predict the level of match participation following an ATR, with features related to pre-injury match participation displaying the highest importance.
Level of evidence
I.