Unobserved heterogeneity in crash data can bias estimated model parameters and lead to incorrect inferences. The research presented in this paper investigated the severity of crashes reported at highway–rail grade crossings by clustering the data to account for unobserved heterogeneity. A combination of data mining and statistical regression methods was used to cluster crash data into subsets and then to identify factors associated with crash injury severity levels. This research relied on highway–rail accident, incident, and crossing inventory databases for 2011 to 2015 obtained from the Federal Railroad Administration (FRA). Three clustering methods (K-means, traditional latent class cluster, and variational Bayesian latent class cluster) were considered, and the variational Bayesian latent class cluster method was chosen for partitioning the data set for model estimation. Unclustered data as well as the clustered subsets were used to estimate ordered logit models for crash injury severity. A comparison revealed that the cluster-based approach provided more relevant model parameters and identified factors relevant only to certain clusters of the data.
Analysis of traffic crash and associated data provides insights and assists with identification of cause-and-effect relationships with crash probabilities and outcomes. This study used a deep neural network (DNN) to model crash injury severity outcomes from eight years of police-reported Nebraska crash data. Prediction performance and model interpretability were examined. The developed DNN excelled in prediction accuracy, precision, and recall but was computationally intensive compared with a baseline multinomial logistic regression model. While the limited interpretability of deep learning models restricts their use, the adoption of SHapley Additive exPlanation (SHAP) values was an improvement. Conclusions drawn from the DNN model are generally consistent with the estimated baseline model. For instance, the variable total number of pedestrians was found significant in both scenarios of the multinomial logit model, indicating a strong relationship with more severe crash injury outcomes. It was also found important in all three sets of parameters in the DNN. SHAP values also allow in-depth analysis of prediction results on a single observation, such as the variable crash type (same direction sideswipe) contributing to classifying a single observation as property damage only. These findings are beneficial for making more informed transportation safety-related decisions.
Crashes at Highway–Rail Grade Crossings (HRGCs) that involve a truck or a train carrying hazardous materials (hazmat) expose people and the environment to potentially severe consequences of hazmat release. This research involved statistical modeling of the probability of hazmat release from trucks and/or trains in crashes at HRGCs to identify factors associated with hazmat release. The Federal Railroad Administration (FRA) HRGC crash dataset (2007–2016) yielded two subsets of crashes: 1) those involving hazmat-carrying trucks, and 2) those involving hazmat-carrying trains. Results from a logistic regression model using data subset 1 (crashes involving hazmat-carrying trucks) with hazmat release/no release as the response variable showed that standard flashing signal lights, railroad crossbucks, and railroad classes II and III (relative to railroad class I) were associated with lower hazmat release probability from hazmat-carrying trucks. Hazmat release probability from trucks was higher with freight train involvement. Results from a logistic regression model using data subset 2 (crashes involving hazmat-carrying trains) revealed that hazmat release probability from trains was lower with warmer temperatures. However, the probability of release from trains was greater with railroad class II (relative to railroad class I), type of highway user (different types of trucks and motorcycles, relative to automobiles), and weather conditions (fog, sleet, or snow, relative to clear). A comparison of the results from this study with HRGC crash severity studies highlighted the importance and usefulness of this study.