2015
DOI: 10.1002/atr.1358

A random forests approach to prioritize Highway Safety Manual (HSM) variables for data collection

Abstract: The Highway Safety Manual (HSM) recommends using the empirical Bayes method with locally derived calibration factors to predict an agency's safety performance. The data needs for deriving these local calibration factors are significant, requiring very detailed roadway characteristics information. Many of these data variables are currently unavailable in most agencies' databases. Furthermore, it is not economically feasible to collect and maintain all the HSM data variables. This study aims to pri…

Cited by 17 publications (8 citation statements)
References 42 publications
“…Tree-based ensemble methods, such as the stochastic gradient boosting (SGB) and AdaBoost, also demonstrated strong ability in enhancing reliability of the real-time risk assessment (Ahmed and Abdel-Aty, 2013a). Motivated by the fact that a large number of explanatory variables might induce overfitting issues, random forest was widely used for selecting important factors (Saha et al, 2015). Kwak and Kho (2016) used the conditional logistic regression analysis to remove confounding factors, and developed separate models to predict crash occurrence with the genetic programming technique.…”
Section: Literature Review (mentioning)
confidence: 99%
“…It has been shown that random forest performs equally well or better than other methods on a diverse set of problems. It has been widely used in classification problems as diverse as bioinformatics [16], medicine [17], transportation safety [18] and customer behavior [19].…”
Section: Random Forest (mentioning)
confidence: 99%
“…For instance, Random Forest Model is a promising data mining approach employed in many studies to prioritize the variables associated with crashes [6,13,14]. Researchers also found the better prediction performance of the support vector machine (SVM) in developing crash injury severity models compared to other parametric models [15–17]. Moreover, despite limited application in transportation sector, multivariate adaptive regression splines technique has outstanding predictive power in crash injury analysis.…”
Section: Introduction (mentioning)
confidence: 99%
“…Machine learning algorithm has been used extensively in transportation safety studies. For instance, Random Forest Model is a promising data mining approach employed in many studies to prioritize the variables associated with crashes [6,13,14]. Researchers also found the better prediction performance of the support vector machine (SVM) in developing crash injury severity models compared to other parametric models [15–17].…”
Section: Introduction (mentioning)
confidence: 99%