Soccer is the biggest global sport and a fast-growing, multi-billion-dollar industry. Advanced data analytics are increasingly employed at both the club and national levels to improve performance, equipment, marketing, scouting, and more. Soccer therefore offers interesting challenges for the machine learning community. This special issue solicited articles on all aspects of data analysis and machine learning for soccer. As part of the special issue, we posed the 2017 Soccer Prediction Challenge, which revolved around predicting the outcomes of future soccer matches. This task is of interest to the general public, researchers, clubs, media, news and advertising companies, and professional odds setters. Soccer outcome prediction has been the subject of research since at least the 1960s (Reep and Benjamin 1968; Hill 1974; Maher 1982; Dixon and Coles 1997; Angelini and Angelis 2017). Various statistical techniques have been used for outcome prediction, including Poisson models (Karlis and Ntzoufras 2003), Bayesian models (Baio and Blangiardo 2010; Rue and Salvesen 2000), rating systems (Hvattum and Arntzen 2010), and, more recently, machine learning methods such as kernel-based relational learning (Van Haaren and Van den Broeck 2011). O'Donoghue et al. (2004) used machine learning and statistical methods to predict the results of the 2002 FIFA World Cup, but achieved the best prediction with a simulation on a commercial game console.