Train delay prediction can improve the quality of train dispatching by helping dispatchers estimate the running state of trains more accurately and make reasonable dispatching decisions. The delay of a train is affected by many factors, such as passenger flow, faults, extreme weather, and dispatching strategy. The departure time of a train is generally determined by dispatchers and is therefore limited by their strategy and knowledge. Existing train delay prediction methods cannot comprehensively consider the temporal and spatial dependence among multiple trains and routes. In this paper, we do not try to predict the specific delay time of a single train; instead, we predict the collective cumulative effect of train delays over a certain period, represented by the total number of arrival delays at a station. We propose a deep learning framework, the train spatio-temporal graph convolutional network (TSTGCN), to predict this collective cumulative effect at a station in support of train dispatching and emergency planning. The proposed model is composed mainly of recent, daily, and weekly components. Each component contains two parts, a spatio-temporal attention mechanism and a spatio-temporal convolution, which together capture spatio-temporal characteristics effectively. A weighted fusion of the three components produces the final prediction. Experiments on train operation data from the China Railway Passenger Ticket System demonstrate that TSTGCN clearly outperforms existing advanced baselines in train delay prediction.
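The weighted fusion of the recent, daily, and weekly components can be illustrated with a minimal sketch. The abstract does not specify the form of the fusion, so the version below assumes an element-wise weighted sum with one weight per station and component; the function name `fuse_components`, the weight values, and the sample predictions are hypothetical and chosen only for illustration.

```python
from typing import List


def fuse_components(
    recent: List[float],
    daily: List[float],
    weekly: List[float],
    w_recent: List[float],
    w_daily: List[float],
    w_weekly: List[float],
) -> List[float]:
    """Element-wise weighted sum of three per-station predictions.

    Assumption: each component outputs one predicted delay count per
    station, and each weight list holds one (learned) weight per station.
    """
    n = len(recent)
    if not all(len(v) == n for v in (daily, weekly, w_recent, w_daily, w_weekly)):
        raise ValueError("all inputs must have one entry per station")
    return [
        wr * r + wd * d + ww * w
        for r, d, w, wr, wd, ww in zip(
            recent, daily, weekly, w_recent, w_daily, w_weekly
        )
    ]


# Hypothetical component outputs for three stations.
recent_pred = [4.0, 10.0, 0.0]
daily_pred = [2.0, 8.0, 1.0]
weekly_pred = [6.0, 12.0, 2.0]

# Hypothetical weights; in a trained model these would be learned.
fused = fuse_components(
    recent_pred,
    daily_pred,
    weekly_pred,
    w_recent=[0.5, 0.5, 0.5],
    w_daily=[0.25, 0.25, 0.25],
    w_weekly=[0.25, 0.25, 0.25],
)
print(fused)
```

Per-station weights let the fused prediction lean on different components at different stations, for example favoring the recent component where delays are driven by short-lived disruptions.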
High-speed train operation data are a reliable and rich resource for data-driven research. However, the data released by railway companies are poorly organized and not comprehensive enough to be applied directly and effectively, and a public high-speed railway network dataset suitable for research is still lacking. To support research on large-scale complex networks, complex dynamic systems, and intelligent transportation, we develop a high-speed railway network dataset containing train operation data in different directions from October 8, 2019 to January 27, 2020, train delay data for railway stations, junction station data, and mileage data for adjacent stations. The dataset includes weather, temperature, wind power, and major holidays as factors affecting train operation. Potential research uses of the dataset include, but are not limited to, complex dynamic system pattern mining, community detection and discovery, and train delay analysis. The dataset can also be used to address various railway operation and management problems, such as passenger service network improvement, real-time train dispatching, and intelligent driving assistance.
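As an example of the train delay analysis mentioned above, per-station delay records could be aggregated into the number of delayed arrivals per station and hour. The sketch below is illustrative only: the record layout (station, scheduled arrival, actual arrival), the station names, and the function `count_arrival_delays` are assumptions, not the dataset's actual schema.

```python
from collections import Counter
from datetime import datetime


def count_arrival_delays(records, threshold_minutes=0):
    """Count delayed arrivals per (station, hour of scheduled arrival).

    Assumption: each record is a (station, scheduled, actual) tuple, and an
    arrival is delayed when it is more than threshold_minutes late.
    """
    counts = Counter()
    for station, scheduled, actual in records:
        delay_minutes = (actual - scheduled).total_seconds() / 60
        if delay_minutes > threshold_minutes:
            # Bucket by the start of the hour of the scheduled arrival.
            hour_start = scheduled.replace(minute=0, second=0, microsecond=0)
            counts[(station, hour_start)] += 1
    return counts


# Hypothetical records for two stations, "A" and "B".
records = [
    ("A", datetime(2019, 10, 8, 8, 5), datetime(2019, 10, 8, 8, 12)),   # 7 min late
    ("A", datetime(2019, 10, 8, 8, 40), datetime(2019, 10, 8, 8, 40)),  # on time
    ("A", datetime(2019, 10, 8, 8, 50), datetime(2019, 10, 8, 9, 3)),   # 13 min late
    ("B", datetime(2019, 10, 8, 9, 10), datetime(2019, 10, 8, 9, 15)),  # 5 min late
    ("B", datetime(2019, 10, 8, 9, 30), datetime(2019, 10, 8, 9, 28)),  # early
]

delay_counts = count_arrival_delays(records)
for (station, hour_start), n in sorted(delay_counts.items()):
    print(station, hour_start.isoformat(), n)
```

Bucketing by scheduled rather than actual arrival time keeps each train in a fixed window regardless of how late it runs; bucketing by actual arrival would be an equally reasonable choice.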