Predicting Future Driving Risk of Crash-Involved Drivers Based on a Systematic Machine Learning Framework

Wang, Chen; Liu, Lin; Xu, Chengcheng; Lv, Wei-Tao

doi:10.3390/ijerph16030334

Cited by 36 publications

(18 citation statements)

References 35 publications


“…In order to identify the HR e-bike riders based on the machine learning methods, a number of features were developed. As for the demographics, the age groups were divided into four categories based on exclusive class intervals, namely, teenagers (<18 years old), young-aged riders (18~35 years old), middle-aged riders (35~65 years old), and old-aged riders (>65 years old), according to previous literature [25].…”

Section: Methods (mentioning)

confidence: 99%

Identify Risk Pattern of E-Bike Riders in China Based on Machine Learning Framework

Wang

Kou

Song

2019

Entropy


In this paper, the risk pattern of e-bike riders in China was examined, based on tree-structured machine learning techniques. Three-year crash/violation data were acquired from the Kunshan traffic police department, China. Firstly, high-risk (HR) electric bicycle (e-bike) riders were defined as those with at-fault crash involvement, while others (i.e. non-at-fault or without crash involvement) were considered as non-high-risk (NHR) riders, based on quasi-induced exposure theory. Then, for e-bike riders, their demographics and previous violation-related features were developed based on the crash/violation records. After that, a systematic machine learning (ML) framework was proposed so as to capture the complex risk patterns of those e-bike riders. An ensemble sampling method was selected to deal with the imbalanced datasets. Four tree-structured machine learning methods were compared, and a gradient boost decision tree (GBDT) appeared to be the best. The feature importance and partial dependence were further examined. Interesting findings include the following: (1) tree-structured ML models are able to capture complex risk patterns and interpret them properly; (2) spatial-temporal violation features were found as important indicators of high-risk e-bike riders; and (3) violation behavior features appeared to be more effective than violation punishment-related features, in terms of identifying high-risk e-bike riders. In general, the proposed ML framework is able to identify the complex crash risk pattern of e-bike riders. This paper provides useful insights for policy-makers and traffic practitioners regarding e-bike safety improvement in China.


“…This includes combining stumps with an enhancement program [24]. The random forest (RF) of a boosting procedure to combine stumps of trees belongs to a “bagging” algorithm [25], which has already been widely used in biological medicine researches [26, 27], especially in the diagnosis of diabetes [11, 12]; AdaBoost with a decision tree (AdaBoost) [28] and an extreme gradient boosting decision tree (XGBoost) [29] belong to “boosting” algorithms, and they had better performance than a decision tree in the prediction and classification [30–32]. In this study, LR- and tree-based models were used.…”

Section: Introduction (mentioning)

confidence: 99%

Identification of Potential Type II Diabetes in a Large-Scale Chinese Population Using a Systematic Machine Learning Framework

Xue

et al. 2020

Journal of Diabetes Research


Background. An estimated 425 million people globally have diabetes, accounting for 12% of the world’s health expenditures, and the number continues to grow, placing a huge burden on the healthcare system, especially in those remote, underserved areas. Methods. A total of 584,168 adult subjects who have participated in the national physical examination were enrolled in this study. The risk factors for type II diabetes mellitus (T2DM) were identified by p values and odds ratio, using logistic regression (LR) based on variables of physical measurement and a questionnaire. Combined with the risk factors selected by LR, we used a decision tree, a random forest, AdaBoost with a decision tree (AdaBoost), and an extreme gradient boosting decision tree (XGBoost) to identify individuals with T2DM, compared the performance of the four machine learning classifiers, and used the best-performing classifier to output the degree of variables’ importance scores of T2DM. Results. The results indicated that XGBoost had the best performance (accuracy=0.906, precision=0.910, recall=0.902, F‐1=0.906, and AUC=0.968). The degree of variables’ importance scores in XGBoost showed that BMI was the most significant feature, followed by age, waist circumference, systolic pressure, ethnicity, smoking amount, fatty liver, hypertension, physical activity, drinking status, dietary ratio (meat to vegetables), drink amount, smoking status, and diet habit (oil loving). Conclusions. We proposed a classifier based on LR-XGBoost which used fourteen variables of patients which are easily obtained and noninvasive as predictor variables to identify potential incidents of T2DM. The classifier can accurately screen the risk of diabetes in the early phase, and the degree of variables’ importance scores gives a clue to prevent diabetes occurrence.


“…Lim et al [20] proposed online sequential ELM update algorithm based on recursive least squares (Online sequential ELM, OSELM). Inspired by this idea, we first train a set of KELM sub-learner models based on historical data sets.…”

Section: Kelm Classifier Online Updating Incrementally (mentioning)

confidence: 99%

A Bagging Strategy-Based Kernel Extreme Learning Machine for Complex Network Intrusion Detection

Yin

Li

Laghari

et al. 2021

ICST Transactions on Scalable Information Systems


Network intrusion can enter the network through informal channels. Some illegal users utilize Trojans and self-programmed attacks to change the network security system, so that the system loses its defense and alarm functions and the hacker can steal internal information. Network intrusion seriously harms the security of network information and the legitimate rights of users. Therefore, a bagging strategy-based kernel extreme learning machine for complex network intrusion detection is presented in this paper. This method adopts a bagging strategy to train several sub-kernel extreme learning machines independently. Then the integrated gain of these machines is measured based on the margin distance minimization (MDM) criterion. Machines with a high gain degree are selected for selective integration to obtain selective integrated learners with strong generalization ability and high efficiency. Then an improved universal gravitation search algorithm is used to optimize the kernel parameters. Meanwhile, a sub-kernel extreme learning machine online update strategy based on incremental learning of batch samples is introduced to realize the online update of the intrusion detection model, so that the proposed detection method can effectively adapt to changes in the complex network environment. Finally, experiments illustrate that the proposed method performs better on network intrusion detection in terms of detection accuracy and speed; especially for unknown network intrusion connection events, the response speed is fast and the false alarm rate is low.
