Accurate prediction of incident clearance time and reliable analysis of its significant factors are two main objectives of a traffic incident management (TIM) system, as they can help relieve the traffic congestion caused by traffic incidents. This study applies the extreme gradient boosting machine algorithm (XGBoost) to predict incident clearance time on freeways and to analyze the significant factors affecting clearance time. XGBoost combines the strengths of statistical and machine learning methods: it can flexibly handle nonlinear data in high-dimensional space and quantify the relative importance of the explanatory variables. The data used in this research were collected from the Washington Incident Tracking System in 2011. To investigate the underlying patterns in the data, K-means is used to group the data into two clusters, and an XGBoost model is built for each cluster. Bayesian optimization is used to tune the XGBoost parameters, and the mean absolute percentage error (MAPE) is used as the indicator of prediction performance. A comparative study confirms that XGBoost outperforms the other models considered. In addition, response time, AADT (annual average daily traffic), incident type, and lane closure type are identified as the significant explanatory variables for clearance time.
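The evaluation step described above, scoring predictions with MAPE separately for each cluster, can be sketched in plain Python. This is only an illustration under stated assumptions: the function names, the cluster labels, and the clearance times are hypothetical and do not come from the study, and the predictions are placeholders rather than XGBoost output.

```python
from collections import defaultdict


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes both sequences are non-empty, equally long, and that no
    actual value is zero (MAPE is undefined in that case).
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def mape_by_cluster(labels, actual, predicted):
    """Group observations by cluster label and compute MAPE per group."""
    groups = defaultdict(lambda: ([], []))
    for label, a, p in zip(labels, actual, predicted):
        groups[label][0].append(a)
        groups[label][1].append(p)
    return {label: mape(a, p) for label, (a, p) in groups.items()}


# Hypothetical clearance times in minutes; labels stand in for the
# two K-means clusters, predictions for the per-cluster model output.
labels = [0, 0, 1, 1]
actual = [20.0, 40.0, 100.0, 200.0]
predicted = [25.0, 30.0, 90.0, 220.0]

scores = mape_by_cluster(labels, actual, predicted)
for label in sorted(scores):
    print(f"cluster {label}: MAPE = {scores[label]:.1f}%")
```

Reporting the error per cluster rather than pooled makes it visible when one cluster is much harder to predict than the other.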
Studying the time interval between a first accident and the second accident it causes can give decision makers valuable information on how to deal effectively with high-risk second accidents. This paper aims to explore the potential factors influencing the interval between the two accidents and to predict its duration. First, the spatiotemporal definition method is applied to identify cascaded first and second accidents. Then, after the Kaiser-Meyer-Olkin (KMO) measure and Bartlett's test of sphericity are used to confirm that the data are suitable for factor analysis, the explanatory variables that significantly affect the interval duration are obtained through factor analysis. Finally, the random forest (RF) model, which combines the advantages of machine learning methods, is employed to predict the interval duration. A traffic accident data set collected in the city of Los Angeles from February 2016 to June 2020 is used to validate prediction performance. A Bayesian method is applied to optimize the hyperparameters of the RF, and three evaluation indicators, the Root Mean Squared Error (RMSE), Mean Squared Error (MSE), and Mean Absolute Percentage Error (MAPE), are used to assess prediction performance. The test results and comparative experiments confirm that the RF predicts the interval well and performs better than the compared models. These results are significant for predicting the interval between a first accident and the second accident it causes.
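The spatiotemporal identification step mentioned above can be illustrated with a short sketch. The abstract does not give the actual thresholds or data schema, so everything here is an assumption made for illustration: the `Accident` fields, the function name `find_cascaded_pairs`, the 120-minute and 2-mile limits, and the simplification that all accidents lie on one road and are located by milepost.

```python
from dataclasses import dataclass


@dataclass
class Accident:
    accident_id: str
    time_min: float   # minutes since an arbitrary reference time
    milepost: float   # position along a single road, in miles


def find_cascaded_pairs(accidents, max_gap_min=120.0, max_dist_mi=2.0):
    """Return (first_id, second_id, interval_min) for candidate pairs.

    A later accident is treated as a candidate second accident if it
    occurs within max_gap_min minutes and max_dist_mi miles of an
    earlier one. Both thresholds are hypothetical defaults.
    """
    ordered = sorted(accidents, key=lambda a: a.time_min)
    pairs = []
    for i, first in enumerate(ordered):
        for second in ordered[i + 1:]:
            gap = second.time_min - first.time_min
            if gap > max_gap_min:
                break  # sorted by time, so later ones are even further away
            close = abs(second.milepost - first.milepost) <= max_dist_mi
            if gap > 0 and close:
                pairs.append((first.accident_id, second.accident_id, gap))
    return pairs


# Hypothetical records, not taken from the Los Angeles data set.
records = [
    Accident("A", 0.0, 10.0),
    Accident("B", 45.0, 11.0),
    Accident("C", 300.0, 10.5),
    Accident("D", 50.0, 30.0),
]
pairs = find_cascaded_pairs(records)
print(pairs)
```

The third element of each tuple is the interval duration that a downstream model such as the RF would be trained to predict.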