Quantitative evaluation of a soccer player's contribution to team offensive performance is typically based on goals scored, assists made, and shots taken. In this paper, we describe a novel player ranking system based entirely on the value of passes completed. This value is derived from the relationship between pass locations in a possession and the shot opportunities generated. This relationship is learned by applying a supervised machine learning model to pass locations in event data from the 2012-2013 La Liga season. Interestingly, though this metric is based entirely on passes, the derived player rankings are largely consistent with general perceptions of offensive ability, e.g., Messi and Ronaldo are near the top. Additionally, when used to rank midfielders, it separates the more offensively minded players from the others.
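The ranking idea described above can be illustrated with a minimal sketch. It assumes that a model has already assigned a weight to each (origin zone, destination zone) pair, and it ranks players by the share of their passes whose weight exceeds a threshold. The zone labels, the weights, the threshold, and the function name `rank_players` are all hypothetical and chosen for illustration; this is not the authors' implementation.

```python
from collections import defaultdict


def rank_players(passes, zone_weight, threshold):
    """Rank players by the share of their passes valued above `threshold`.

    passes: iterable of (player, origin_zone, destination_zone) tuples.
    zone_weight: dict mapping (origin_zone, destination_zone) to a weight
        taken from a previously trained model (hypothetical values here).
    """
    total = defaultdict(int)
    valued = defaultdict(int)
    for player, origin, dest in passes:
        total[player] += 1
        # Passes through zone pairs the model never saw get weight 0.0.
        if zone_weight.get((origin, dest), 0.0) > threshold:
            valued[player] += 1
    share = {p: valued[p] / total[p] for p in total}
    # Highest share of highly valued passes first.
    return sorted(share.items(), key=lambda kv: kv[1], reverse=True)


# Toy data: zone names and weights are made up for illustration only.
weights = {("mid", "box"): 0.9, ("mid", "mid"): 0.1, ("def", "mid"): -0.2}
passes = [
    ("A", "mid", "box"), ("A", "mid", "box"), ("A", "mid", "mid"),
    ("B", "def", "mid"), ("B", "mid", "mid"), ("B", "mid", "box"),
]
ranking = rank_players(passes, weights, threshold=0.5)
```

In this toy example, player A has two of three passes above the threshold and player B has one of three, so A is ranked first. Ranking by share rather than by raw count keeps players with many passes from dominating purely through volume, although other normalisations would be equally plausible.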
In this paper, we present two approaches to analyzing pass event data to uncover sometimes-nonobvious insights into the game of soccer. We illustrate the utility of our methods by applying them to data from the 2012-2013 La Liga season. We first show that teams are characterized by where on the pitch they attempt passes, and can be identified by their passing styles. Using heatmaps of pass locations as features, we achieved a mean accuracy of 87% in a 20-team classification task. We also investigated using pass locations over the course of a possession to predict shots. For this task, we achieved an area under the receiver operating characteristic (AUROC) of 0.785. Finally, we used the weights of the predictive model to rank players by the value of their passes. Shockingly, Cristiano Ronaldo and Lionel Messi topped the rankings.

Despite this problem, we demonstrate in this paper that by using machine learning techniques on passing data from the 2012-2013 La Liga season, we could uncover relevant data-driven insights into soccer.

1. We show that heatmaps built using only the origins of passes provide fingerprints that can be used to identify teams with 87% accuracy.
2. We further show that even when we only consider passes originating from the midfield, the resulting heatmaps can still be used to identify teams.
3. We construct a model relating pass origins and destinations during a possession with the probability of a shot. The resulting weights offer insights into the offensive utility of passes.
4. We utilize this model to rank players by the frequency with which their passes are highly valued by the model.

The rest of the paper is organized as follows. In Section 2, we outline some previous related work on using machine learning for knowledge discovery in soccer and other sports. In Section 3, we describe the event-based dataset we used
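The heatmap features mentioned above can be sketched as follows. This is only an illustration of turning pass origins into a fixed-length feature vector. The pitch dimensions (105 by 68), the 6 by 4 grid, and the function name `pass_origin_heatmap` are assumptions, not details taken from the paper, and coordinates are assumed to lie within the pitch.

```python
def pass_origin_heatmap(origins, pitch_length=105.0, pitch_width=68.0,
                        nx=6, ny=4):
    """Return a flattened, normalised nx*ny histogram of pass origins.

    origins: iterable of (x, y) coordinates, assumed to satisfy
        0 <= x <= pitch_length and 0 <= y <= pitch_width.
    """
    counts = [0] * (nx * ny)
    for x, y in origins:
        # Clamp so that points on the far boundary fall in the last cell.
        ix = min(int(x / pitch_length * nx), nx - 1)
        iy = min(int(y / pitch_width * ny), ny - 1)
        counts[ix * ny + iy] += 1
    total = sum(counts)
    # Normalise so that teams with different pass volumes are comparable.
    return [c / total for c in counts] if total else [0.0] * (nx * ny)


# Toy pass origins (made up): two near one corner, two near the opposite one.
origins = [(10.0, 10.0), (10.0, 12.0), (100.0, 60.0), (105.0, 68.0)]
features = pass_origin_heatmap(origins)
```

One such vector per team per match could then be passed to any standard multi-class classifier. Which classifier and grid resolution are appropriate is left open here, since the text above does not specify them.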