Train dispatching (TD) is at the forefront of all rail operations that transport passengers or goods. Recent technological advances and the explosion of digital data have brought data-driven methods (DDMs) into rail operations. This study briefly explores DDMs for the TD problem, focusing on relevant studies of delay distribution, delay propagation, and timetable rescheduling. Data-driven TD methods, including statistical methods (SM), graphical models (GM), and machine learning (ML) methods, are reviewed. Key issues in establishing different data-driven models for the TD problem are then addressed. ML methods, which rely on rich data obtained from train operations, are considered to be among the most promising DDMs for developing innovative TD methods. This study emphasizes the potential for designing new alternatives in the three key fields of interest and provides directions for further research on TD, including ML-driven TD and intelligent TD.

INDEX TERMS Data-driven, delay distribution, delay propagation, timetable rescheduling, train dispatching, machine learning.