Passenger train delays significantly influence riders' decisions to choose rail as their mode of travel. This article proposes real-time passenger train delay prediction (PTDP) models using the following machine learning techniques: random forest (RF), gradient boosting machine (GBM), and multi-layer perceptron (MLP). It also investigates how two data-frame structures, the Real-time based Data-frame Structure (RT-DFS) and the Real-time with Historical based Data-frame Structure (RWH-DFS), affect the PTDP models. The results show that PTDP models using MLP with RWH-DFS outperformed all other models. The influence of external variables such as historical delay profiles at the destination (HDPD), ridership, population, day of the week, geography, and weather information on the real-time PTDP models is also analyzed and discussed.
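The abstract does not spell out the columns of the two data-frame structures, so the following is only a minimal sketch of the general idea: a real-time feature row, and the same row extended with a historical delay profile at the destination. The field names (`current_delay`, `dow`, `dest_delay`), the sample records, and the choice of a per-destination mean as the HDPD stand-in are all assumptions made for illustration, not the authors' definitions.

```python
from collections import defaultdict
from statistics import mean

# Made-up past observations (all values are illustrative, not real data).
history = [
    {"train": "T1", "dest": "AAA", "dow": 0, "current_delay": 5.0, "dest_delay": 12.0},
    {"train": "T1", "dest": "AAA", "dow": 1, "current_delay": 0.0, "dest_delay": 8.0},
    {"train": "T2", "dest": "BBB", "dow": 0, "current_delay": 20.0, "dest_delay": 30.0},
]


def historical_delay_profile(records):
    """Mean past arrival delay per destination (assumed stand-in for HDPD)."""
    by_dest = defaultdict(list)
    for record in records:
        by_dest[record["dest"]].append(record["dest_delay"])
    return {dest: mean(delays) for dest, delays in by_dest.items()}


def rt_dfs_row(obs):
    """Real-time-only features: current delay and day of week."""
    return [obs["current_delay"], float(obs["dow"])]


def rwh_dfs_row(obs, hdpd, default=0.0):
    """Real-time features plus the historical profile for the destination."""
    return rt_dfs_row(obs) + [hdpd.get(obs["dest"], default)]


hdpd = historical_delay_profile(history)
new_obs = {"train": "T1", "dest": "AAA", "dow": 2, "current_delay": 7.0}

print(rt_dfs_row(new_obs))         # real-time features only
print(rwh_dfs_row(new_obs, hdpd))  # real-time features plus historical column
```

Either kind of row could then be passed to any regressor (RF, GBM, or MLP); the only difference between the two structures in this sketch is the extra historical column.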
This article proposes a Python-based Amtrak and Weather Underground (PAWU) tool to collect data on departure and arrival times for Amtrak (the main passenger train operator in the United States), together with weather information. In addition, this article offers a database, developed with PAWU, of the operating characteristics of 16 Amtrak routes from 2008 to 2019. More specifically, PAWU enables users to retrieve Amtrak departure and arrival times for any train number throughout the United States. It then automatically retrieves weather information from Weather Underground for each rail station and stores the collected data in a local MySQL database. Users can select any desired train number(s) and date range(s) without working directly with the code or with the raw data, which the two sources provide in different formats. The database itself can be used, in part, to develop, apply, and benchmark models that assess the performance of rail services such as those offered by Amtrak.
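To illustrate the kind of query such a database supports, here is a minimal sketch that stores stop events alongside weather observations and selects them by train number and date range. It does not reproduce PAWU: the table name `stop_events`, its columns, the helper `arrival_delays`, and the sample rows are hypothetical, and the standard-library `sqlite3` module stands in for the MySQL database the article describes so that the example is self-contained.

```python
import sqlite3

# In-memory SQLite database as a stand-in for a local MySQL instance.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE stop_events (
        train_number   TEXT,
        service_date   TEXT,  -- ISO date, e.g. 2019-01-05
        station        TEXT,
        sched_arrival  TEXT,  -- ISO datetime
        actual_arrival TEXT,  -- ISO datetime
        temperature_f  REAL,
        conditions     TEXT
    )
    """
)

# Made-up sample rows (hypothetical train numbers, stations, and weather).
rows = [
    ("T100", "2019-01-05", "AAA", "2019-01-05 07:15:00", "2019-01-05 07:40:00", 28.0, "Snow"),
    ("T100", "2019-01-06", "AAA", "2019-01-06 07:15:00", "2019-01-06 07:15:00", 35.0, "Clear"),
    ("T200", "2019-01-05", "AAA", "2019-01-05 09:00:00", "2019-01-05 09:10:00", 29.0, "Snow"),
    ("T100", "2019-02-01", "AAA", "2019-02-01 07:15:00", "2019-02-01 07:20:00", 40.0, "Rain"),
]
conn.executemany("INSERT INTO stop_events VALUES (?, ?, ?, ?, ?, ?, ?)", rows)


def arrival_delays(conn, train_number, start_date, end_date):
    """Return (date, station, delay in minutes, conditions) for one train."""
    query = """
        SELECT service_date,
               station,
               CAST(ROUND((julianday(actual_arrival)
                           - julianday(sched_arrival)) * 1440) AS INTEGER),
               conditions
        FROM stop_events
        WHERE train_number = ?
          AND service_date BETWEEN ? AND ?
        ORDER BY service_date, station
    """
    return conn.execute(query, (train_number, start_date, end_date)).fetchall()


january = arrival_delays(conn, "T100", "2019-01-01", "2019-01-31")
for record in january:
    print(record)
```

Keeping the weather fields in the same row as each stop event is one possible design; a separate weather table keyed by station and time would work equally well and avoids repeating observations shared by several trains.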