Access to unbiased, accurate models for predicting driver injury severity in collision events is vital for determining what safety measures should be implemented at intersections. Inadequate models can underestimate the potential for collision events to result in driver fatalities or injuries, which can lead to an improper assessment of the safety criteria of an intersection. This study investigates how injury severity differs between drivers of different age and gender groups using cost-sensitive data-mining models. Previous research efforts have used machine-learning methods to predict injury severity; however, these studies did not consider the consequences (cost) of incorrect predictions. This paper addresses this shortfall by considering the monetary cost of incorrect injury severity predictions when developing C4.5, instance-based (IB), and random forest (RF) machine-learning models. One model of each method was developed for each of four distinct cohorts of drivers (i.e., younger males, younger females, older males, and older females). Each model considered a selection of driver, vehicular, road/traffic, environmental, and crash parameters to determine whether they significantly influenced driver injury severity. Five years of two-vehicle crash data collected at signalized intersections in the metropolitan area of Miami, Florida, were used in the models. Results indicated that cost-sensitive learning classifiers were superior to regular classifiers at accurately predicting crash injuries and fatalities. Among the cost-sensitive models, RF outperformed the C4.5 and IB models in predicting driver injury severity for the four groups of drivers. The models displayed substantial differences in injury severity determinants across the age/gender cohorts.
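The idea of weighing incorrect predictions by their cost can be illustrated with a minimal sketch. One common way to make a classifier cost-sensitive is to predict the class with the lowest expected misclassification cost rather than the most probable class; whether the study applies exactly this rule is not stated here. The class names, the cost matrix, and the probabilities below are hypothetical values chosen for illustration and are not taken from the study.

```python
# Hypothetical severity classes and misclassification costs (illustrative only,
# not values from the study). COST[true][predicted] is the cost incurred when
# the actual outcome is `true` and the model predicts `predicted`.
CLASSES = ["no_injury", "injury", "fatality"]
COST = {
    "no_injury": {"no_injury": 0.0, "injury": 1.0, "fatality": 3.0},
    "injury": {"no_injury": 10.0, "injury": 0.0, "fatality": 1.0},
    "fatality": {"no_injury": 100.0, "injury": 20.0, "fatality": 0.0},
}


def expected_cost(probs, predicted):
    """Expected cost of predicting `predicted`, given class probabilities."""
    return sum(probs[true] * COST[true][predicted] for true in CLASSES)


def min_cost_prediction(probs):
    """Cost-sensitive rule: choose the class with the lowest expected cost."""
    return min(CLASSES, key=lambda c: expected_cost(probs, c))


def most_likely_prediction(probs):
    """Regular rule: choose the most probable class, ignoring costs."""
    return max(CLASSES, key=lambda c: probs[c])


# Hypothetical class probabilities produced by some classifier for one crash.
probs = {"no_injury": 0.80, "injury": 0.15, "fatality": 0.05}

print(most_likely_prediction(probs))
print(min_cost_prediction(probs))
```

With these assumed numbers, the regular rule selects the most probable class, while the cost-sensitive rule shifts toward a more severe class because missing an injury or fatality is penalized far more heavily than a false alarm.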